In terms of the XML specification http://www.w3.org/TR/xml/#dt-character and the XML Schema data type specification http://www.w3.org/TR/xmlschema-2/#string there is no difference between Unicode characters in the BMP (Basic Multilingual Plane) and characters outside of that plane: each counts as a single character.

That means an element like <test>𐌀</test> has as its value a string with a single Unicode character and should be valid against a schema like

    <?xml version="1.0" encoding="utf-8"?>
    <xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="root">
        <xs:complexType>
          <xs:sequence>
            <xs:element maxOccurs="unbounded" name="test" type="one-char" />
          </xs:sequence>
        </xs:complexType>
      </xs:element>
      <xs:simpleType name="one-char">
        <xs:restriction base="xs:string">
          <xs:length value="1"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:schema>

which restricts the length to exactly one character.

However, when I validate the input document http://home.arcor.de/martin.honnen/xml/oneCharInstance1.xml against the schema http://home.arcor.de/martin.honnen/xml/oneCharSchema1.xsd, .NET's validating parser reports a validation error:

    Error: The 'test' element is invalid - The value '??' is invalid according to its datatype 'one-char' - The actual length is not equal to the specified length.

So it incorrectly treats the single Unicode character as more than one character.

Other validating parsers like Saxon 9.4 EE or XSV (http://www.w3.org/2001/03/webdata/xsv?docAddrs=http%3A%2F%2Fhome.arcor.de%2Fmartin.honnen%2Fxml%2FoneCharInstance1.xml+http%3A%2F%2Fhome.arcor.de%2Fmartin.honnen%2Fxml%2FoneCharSchema1.xsd&warnings=on&keepGoing=on&style=xsl#) don't report any validation error.
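The reported error would be consistent with the length being measured in UTF-16 code units instead of Unicode characters, since U+10300 (the character in the test element) needs a surrogate pair in UTF-16. That the .NET validator actually works this way is only an assumption here, not something confirmed from its source. A minimal sketch in Python of the two ways of counting, with hypothetical helper names:

```python
def code_point_length(value: str) -> int:
    """Length in Unicode code points, which is what xs:length counts for xs:string."""
    # Python 3 strings are sequences of code points, so len() is sufficient.
    return len(value)


def utf16_code_unit_length(value: str) -> int:
    """Length in UTF-16 code units (2 bytes each, no byte order mark)."""
    # Characters outside the BMP are encoded as a surrogate pair, i.e. two units.
    return len(value.encode("utf-16-le")) // 2


# U+10300 OLD ITALIC LETTER A, the character used in <test>𐌀</test>.
sample = "\U00010300"

print("code points:      ", code_point_length(sample))
print("UTF-16 code units:", utf16_code_unit_length(sample))
```

For this sample the first count is 1 and the second is 2, so a check based on UTF-16 code units would reject a value that the schema, read according to the specification, allows.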