
Unicode character outside of BMP treated incorrectly by schema validating parser by Martin_Honnen


Status: Closed as Won't Fix


Type: Bug
ID: 767166
Opened: 10/12/2012 3:51:01 AM
Access Restriction: Public

Description

In terms of the XML specification http://www.w3.org/TR/xml/#dt-character and the XML Schema datatypes specification http://www.w3.org/TR/xmlschema-2/#string there is no difference between Unicode characters in the BMP (Basic Multilingual Plane) and characters outside of that plane; each counts as a single character.

That means an element like <test>&#x10300;</test> has as its value a string with a single Unicode character and should be valid against a schema like

<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="root">
    <xs:complexType>
     <xs:sequence>
        <xs:element maxOccurs="unbounded" name="test" type="one-char" />
     </xs:sequence>
    </xs:complexType>
</xs:element>
<xs:simpleType name="one-char">
    <xs:restriction base="xs:string">
     <xs:length value="1"/>
    </xs:restriction>
</xs:simpleType>
</xs:schema>

which restricts the length to exactly one character.
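As a quick sanity check of the claim that the element value is a single character, here is a minimal Python sketch (not part of the original report). It parses the same element with the standard library's xml.etree.ElementTree and counts code points; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Parse the element from the report; &#x10300; is a character reference
# to U+10300, which lies outside the Basic Multilingual Plane.
element = ET.fromstring("<test>&#x10300;</test>")
value = element.text

# Python 3 strings are sequences of Unicode code points, so len() counts
# characters in the sense used by the XML and XML Schema specifications.
code_point_count = len(value)

print(code_point_count)   # number of code points in the element value
print(hex(ord(value)))    # the code point itself
```

Counted this way, the value satisfies the xs:length facet of 1 in the schema above.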

However, when I validate the input document http://home.arcor.de/martin.honnen/xml/oneCharInstance1.xml against the schema http://home.arcor.de/martin.honnen/xml/oneCharSchema1.xsd, .NET's validating parser reports a validation error "Error: The 'test' element is invalid - The value '??' is invalid according to its datatype 'one-char' - The actual length is not equal to the specified length." So it incorrectly treats the single Unicode character as more than one character.
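A likely explanation, offered here as an assumption rather than something confirmed in this report, is that the length is being measured in UTF-16 code units instead of Unicode characters. .NET strings are UTF-16, and a character outside the BMP is stored as a surrogate pair, which is consistent with the two-character '??' in the error message. The Python sketch below illustrates the difference; the helper utf16_code_units is hypothetical and exists only for this example.

```python
import struct

def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-le")  # little-endian, no byte order mark
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

value = "\U00010300"  # the character from <test>&#x10300;</test>
units = utf16_code_units(value)

print(len(value))               # length in Unicode code points
print(len(units))               # length in UTF-16 code units
print([hex(u) for u in units])  # high and low surrogate
```

A length check based on code units sees two units for this value, whereas a check based on code points sees one, which is what the xs:length facet calls for.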

Other validating parsers like Saxon 9.4 EE or XSV (http://www.w3.org/2001/03/webdata/xsv?docAddrs=http%3A%2F%2Fhome.arcor.de%2Fmartin.honnen%2Fxml%2FoneCharInstance1.xml+http%3A%2F%2Fhome.arcor.de%2Fmartin.honnen%2Fxml%2FoneCharSchema1.xsd&warnings=on&keepGoing=on&style=xsl#) don't report any validation error.
Posted by Microsoft on 4/10/2013 at 2:14 PM
Thanks for bringing up this interesting issue. We are always grateful when customers point out potential concerns; this helps us ensure the quality of the .NET Framework and drive the product in the right direction.

Indeed, you have discovered a genuine problem with the system.
Unfortunately, we cannot fix this issue because it may affect the behaviour of existing programs.
Posted by Microsoft on 10/14/2012 at 10:06 PM
Thanks for your feedback.

We are rerouting this issue to the appropriate group within the Visual Studio Product Team for triage and resolution. These specialized experts will follow up on your issue.
Posted by Microsoft on 10/12/2012 at 5:14 AM
Thank you for your feedback. We are currently reviewing the issue you have submitted. If this issue is urgent, please contact support directly (http://support.microsoft.com).