XmlReader now breaks trying to download DTDs from w3.org - by SteveNiemitz

Status: By Design

The product team believes this item works according to its intended design. A more detailed explanation for the resolution of this particular item may have been provided in the comments section.


ID: 618119
Status: Closed
Type: Bug
Repros: 6
Opened: 10/29/2010 9:57:47 AM
Access Restriction: Public

Description

Apparently w3.org now blocks requests that carry a .NET user agent, so attempts to download DTDs fail with a 403 Forbidden response.

For example:
XmlDocument xdoc = new XmlDocument();
xdoc.Load(@"e:\dbg\xml.txt");

Throws:
Unhandled Exception: System.Net.WebException: The remote server returned an error: (403) Forbidden.
   at System.Net.HttpWebRequest.GetResponse()
   at System.Xml.XmlDownloadManager.GetNonFileStream(Uri uri, ICredentials credentials, IWebProxy proxy, RequestCachePolicy cachePolicy)
   at System.Xml.XmlDownloadManager.GetStream(Uri uri, ICredentials credentials, IWebProxy proxy, RequestCachePolicy cachePolicy)
   at System.Xml.XmlUrlResolver.GetEntity(Uri absoluteUri, String role, Type ofObjectToReturn)
   at System.Xml.XmlTextReaderImpl.OpenAndPush(Uri uri)
   at System.Xml.XmlTextReaderImpl.PushExternalEntityOrSubset(String publicId, String systemId, String baseUriStr, Uri& baseUri, String entityName)
   at System.Xml.XmlTextReaderImpl.PushExternalEntity(IDtdEntityInfo entity)
   at System.Xml.XmlTextReaderImpl.DtdParserProxy_PushEntity(IDtdEntityInfo entity, Int32& entityId)
   at System.Xml.XmlTextReaderImpl.DtdParserProxy.System.Xml.IDtdParserAdapter.PushEntity(IDtdEntityInfo entity, Int32& entityId)
   at System.Xml.DtdParser.HandleEntityReference(XmlQualifiedName entityName, Boolean paramEntity, Boolean inLiteral, Boolean inAttribute)
   at System.Xml.DtdParser.GetToken(Boolean needWhiteSpace)
   at System.Xml.DtdParser.ParseSubset()
   at System.Xml.DtdParser.ParseInDocumentDtd(Boolean saveInternalSubset)
   at System.Xml.DtdParser.Parse(Boolean saveInternalSubset)
   at System.Xml.DtdParser.System.Xml.IDtdParser.ParseInternalDtd(IDtdParserAdapter adapter, Boolean saveInternalSubset)
   at System.Xml.XmlTextReaderImpl.ParseDtd()
   at System.Xml.XmlTextReaderImpl.ParseDoctypeDecl()
   at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
   at System.Xml.XmlTextReaderImpl.Read()
   at System.Xml.XmlLoader.Load(XmlDocument doc, XmlReader reader, Boolean preserveWhitespace)
   at System.Xml.XmlDocument.Load(XmlReader reader)
   at System.Xml.XmlDocument.Load(String filename)
   at W3Test.Program.Main(String[] args) in Program.cs:line 27

Using:
<!DOCTYPE Test [
<!ENTITY % HTMLlat1 PUBLIC
"-//W3C//ENTITIES Latin 1 for XHTML//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
%HTMLlat1;
<!ENTITY % HTMLsymbol PUBLIC
"-//W3C//ENTITIES Symbols for XHTML//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent">
%HTMLsymbol;
<!ENTITY % HTMLspecial PUBLIC
"-//W3C//ENTITIES Special for XHTML//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent">
%HTMLspecial;
]>
<Test>
</Test>

as a test XML file.
Posted by Arun [MSFT] on 6/9/2011 at 8:30 PM
Hi All,
     Thanks for the comments. Just to clarify, the original issue occurs because the W3C site blocks access, as mentioned previously. It is not a component-level issue in .NET. The lack of XmlPreloadedResolver before .NET 4.0 is an orthogonal issue.

Thanks,
Arun Chandrasekhar
Senior Program Manager
XML Team
Posted by Alex Trishin on 11/18/2010 at 2:15 PM
XmlPreloadedResolver is not available before .NET 4.0.
Posted by Saravanaa_bhc on 11/15/2010 at 8:36 PM
Moreover, I am using .NET Framework 1.1, and I couldn't find XmlPreloadedResolver there.
Posted by Saravanaa_bhc on 11/15/2010 at 8:34 PM
Hi, I am getting the same error. Interestingly, the above code was working well, but it has suddenly been behaving like this for the past 3 weeks. Please let me know if you find a workaround.

Thanks
Sarav
Posted by Microsoft on 11/15/2010 at 12:00 AM
The download from w3.org failed because the site blocks the access. However, in .NET the following six DTD and entity files are already cached in XmlPreloadedResolver:

http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd
http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent
http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent
http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent

XmlPreloadedResolver can be assigned to XmlReaderSettings.XmlResolver.
Sample code that works for your case:

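// 'file' is the path to the XML document.
XmlReaderSettings settings = new XmlReaderSettings();
// DTD processing must be enabled explicitly on XmlReaderSettings.
settings.DtdProcessing = DtdProcessing.Parse;
// Serve the known W3C DTDs and entity sets from the built-in cache instead of the network.
settings.XmlResolver = new System.Xml.Resolvers.XmlPreloadedResolver();

XmlDocument doc = new XmlDocument();
using (XmlReader reader = XmlReader.Create(new StreamReader(file), settings))
{
    doc.Load(reader);
}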
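
As a self-contained variant of the sample above, the following sketch loads a shortened copy of the test document from a string, so it can be tried without a file on disk. It assumes .NET 4.0 or later. The class name PreloadedResolverDemo and the added &nbsp; reference are illustrative and not part of the original report. Passing XmlKnownDtds.Xhtml10 limits the cache to the XHTML 1.0 files listed above, and no fallback resolver is supplied, so nothing is requested from w3.org.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Resolvers;

class PreloadedResolverDemo
{
    static void Main()
    {
        // Shortened copy of the test file from the report, plus one entity
        // reference (&nbsp;) that is declared in xhtml-lat1.ent.
        string xml =
            "<!DOCTYPE Test [\n" +
            "<!ENTITY % HTMLlat1 PUBLIC\n" +
            "\"-//W3C//ENTITIES Latin 1 for XHTML//EN\"\n" +
            "\"http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent\">\n" +
            "%HTMLlat1;\n" +
            "]>\n" +
            "<Test>&nbsp;</Test>";

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.DtdProcessing = DtdProcessing.Parse;
        // Only the XHTML 1.0 DTDs and entity sets are preloaded; there is no
        // fallback resolver, so unknown URIs are not fetched from the network.
        settings.XmlResolver = new XmlPreloadedResolver(XmlKnownDtds.Xhtml10);

        XmlDocument doc = new XmlDocument();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            doc.Load(reader);
        }

        // Prints whether &nbsp; was expanded to a no-break space (U+00A0).
        Console.WriteLine(doc.DocumentElement.InnerText == "\u00A0");
    }
}
```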



Posted by DeEchtePietjePuk on 11/8/2010 at 7:40 AM
For our company, it has nothing to do with an exceeded request limit. If I request the DTD in IE8, I get a 403 Forbidden. If I change the IE8 user agent string to that of a different browser, I can download the DTD with IE8. After changing it back again, I get a 403 Forbidden. We are seeing this behaviour on servers connected through several different ISPs, so an IP ban does not seem to be the cause either. This 403 breaks our application for all our customers, because the application attempts to load the XHTML 1.0 Strict DTD and that fails. Please put some pressure on the W3C to revert to the old behaviour. For a future release of the .NET Framework, it might be a good idea to include the most commonly requested DTDs as a static resource and make them available through an API (e.g. a W3DtdXmlResolver). That should improve performance for many applications as well.
Posted by ProNotion on 11/6/2010 at 7:01 AM
I am not sure this is caused by the user agent. It is more likely that the request limit has been exceeded; I ran into this recently myself. As a workaround, you can cache the DTDs and reference the local copies rather than hitting the W3C site for each request. See http://www.w3.org/Help/abuse-info/re-reqs.html for more related information.
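
The local-caching workaround described above can be sketched as a custom resolver. This approach does not depend on XmlPreloadedResolver, so it may also help on frameworks older than .NET 4.0. The sketch assumes .NET 2.0 or later (it uses generics; on .NET 1.1 a Hashtable could take the place of the Dictionary). The class name LocalDtdResolver, its Map method, and the C:\dtd paths are hypothetical, and the local files are assumed to have been downloaded once by hand.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper: serves selected remote URIs from local copies and
// defers to the default XmlUrlResolver behaviour for everything else.
public class LocalDtdResolver : XmlUrlResolver
{
    private readonly Dictionary<string, string> localCopies =
        new Dictionary<string, string>();

    // Registers a local file to use in place of a remote DTD or entity file.
    public void Map(string remoteUri, string localPath)
    {
        localCopies[remoteUri] = localPath;
    }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        string localPath;
        if (absoluteUri != null &&
            localCopies.TryGetValue(absoluteUri.AbsoluteUri, out localPath))
        {
            // Return the cached copy as a stream; no network request is made.
            return File.OpenRead(localPath);
        }
        return base.GetEntity(absoluteUri, role, ofObjectToReturn);
    }
}

public class Program
{
    public static void Main()
    {
        LocalDtdResolver resolver = new LocalDtdResolver();
        // The local paths below are placeholders for wherever the copies are kept.
        resolver.Map("http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent", @"C:\dtd\xhtml-lat1.ent");
        resolver.Map("http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent", @"C:\dtd\xhtml-symbol.ent");
        resolver.Map("http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent", @"C:\dtd\xhtml-special.ent");

        XmlDocument doc = new XmlDocument();
        doc.XmlResolver = resolver; // must be assigned before Load
        doc.Load(@"e:\dbg\xml.txt");
    }
}
```

Any URI that has not been mapped still goes through the base class, so an unmapped w3.org reference would continue to fail with the 403 described in this report.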
Posted by Microsoft on 10/29/2010 at 10:22 AM
Thank you for your feedback. We are currently reviewing the issue you have submitted. If this issue is urgent, please contact support directly (http://support.microsoft.com).