Unicode character outside of BMP treated incorrectly by schema validating parser - by Martin_Honnen

Status: Won't Fix

Due to several factors the product team decided to focus its efforts on other items.
A more detailed explanation for the resolution of this particular item may have been provided in the comments section.

ID: 767166
Status: Closed
Type: Bug
Repros: 0
Opened: 10/12/2012 3:51:01 AM
Access Restriction: Public


Under the XML specification (http://www.w3.org/TR/xml/#dt-character) and the XML Schema datatypes specification (http://www.w3.org/TR/xmlschema-2/#string), there is no difference between Unicode characters in the BMP (Basic Multilingual Plane) and characters outside it: each counts as a single character.

That means an element like <test>&#x10300;</test> has as its value a string with a single Unicode character and should be valid against a schema like

<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" name="test" type="one-char" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:simpleType name="one-char">
    <xs:restriction base="xs:string">
      <xs:length value="1"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>

which restricts the length to exactly one character.
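
As a rough illustration of the expected behaviour, the following Python sketch parses the element and checks the length by counting Unicode code points. It does not run a schema validator; the helper name satisfies_length is hypothetical and only mimics what xs:length asks for on an xs:string value.

```python
import xml.etree.ElementTree as ET


def satisfies_length(value: str, required: int) -> bool:
    # Hypothetical helper: Python 3 strings are sequences of code points,
    # so len() matches the XML Schema notion of length for xs:string.
    return len(value) == required


# The instance element from the report; &#x10300; lies outside the BMP.
element = ET.fromstring("<test>&#x10300;</test>")
value = element.text

print(hex(ord(value)))             # the single code point in the value
print(satisfies_length(value, 1))  # whether the value meets length = 1
```

Counting code points this way, the value meets the length-1 restriction, which is the result the report expects from a conforming validator.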

However, when I validate the input document http://home.arcor.de/martin.honnen/xml/oneCharInstance1.xml against the schema http://home.arcor.de/martin.honnen/xml/oneCharSchema1.xsd, .NET's validating parser reports a validation error "Error: The 'test' element is invalid - The value '??' is invalid according to its datatype 'one-char' - The actual length is not equal to the specified length.", so it incorrectly treats the single Unicode character as more than one character.
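
One plausible explanation, which the report itself does not confirm, is that the length is being measured in UTF-16 code units rather than in characters. .NET strings are sequences of UTF-16 code units, and a character outside the BMP such as U+10300 needs a surrogate pair, which would also fit the two question marks in the error message. The Python sketch below only illustrates the two ways of counting; the helper name utf16_code_units is hypothetical.

```python
def utf16_code_units(value: str) -> int:
    # Hypothetical helper: each UTF-16 code unit occupies two bytes,
    # so the unit count is half the encoded byte length.
    return len(value.encode("utf-16-le")) // 2


char = "\U00010300"  # the character from the report

code_points = len(char)              # what xs:length is defined to count
code_units = utf16_code_units(char)  # what a UTF-16 string length reports

print(code_points, code_units)
```

If the validator compares the code-unit count against the facet value, a single non-BMP character would fail a length-1 restriction even though it is one character in XML terms.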

Other validating parsers like Saxon 9.4 EE or XSV (http://www.w3.org/2001/03/webdata/xsv?docAddrs=http%3A%2F%2Fhome.arcor.de%2Fmartin.honnen%2Fxml%2FoneCharInstance1.xml+http%3A%2F%2Fhome.arcor.de%2Fmartin.honnen%2Fxml%2FoneCharSchema1.xsd&warnings=on&keepGoing=on&style=xsl#) don't report any validation error.
Posted by Microsoft on 4/10/2013 at 2:14 PM
Thanks for bringing up this interesting issue. We are always grateful when customers point out potential concerns; this helps us ensure the quality of the .NET Framework and drive the product in the right direction.

Indeed, you have discovered a genuine problem with the system.
Unfortunately, we cannot fix this issue because it may affect the behaviour of existing programs.
Posted by Microsoft on 10/14/2012 at 10:06 PM
Thanks for your feedback.

We are rerouting this issue to the appropriate group within the Visual Studio Product Team for triage and resolution. These specialized experts will follow up on your issue.
Posted by Microsoft on 10/12/2012 at 5:14 AM
Thank you for your feedback. We are currently reviewing the issue you have submitted. If this issue is urgent, please contact support directly (http://support.microsoft.com).