Active
Type: Bug
ID: 409672
Opened: 2/1/2009 9:52:59 AM
Access Restriction: Public
Workaround(s): 0
User(s) can reproduce this bug: 0
An MGrammar-based parser fails to recognize any input containing non-Latin characters. See Repro Steps for details.
Details
What area of the product is your feedback for?
Oslo Modeling Language (Codename "M")

What distribution are you using?
Oslo SDK CTP
How important is this issue?
 
Repro Steps for Product Issue:
1. Create a UTF-8-encoded file "Ru.mg" containing:
    module Ru_Test
    {
        language Ru
        {
            syntax Main = "я";
        }
    }
2. Create a UTF-8-encoded file "test.ru" containing the single Russian character я.
3. Run "mg.exe Ru.mg" to compile the grammar.
4. Run "mgx.exe -reference:Ru.mgx test.ru" to parse the test file.
MGX returns unexpected errors:
    test.ru(1,1): error 5003: "я" is invalid for token "я". Character "я" was unexpected.
    test.ru(1,1): error 5007: Token Error with text "я" unexpected.
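For anyone trying to reproduce this, steps 1 and 2 can be scripted so that the exact bytes on disk are known. The following is only a sketch in Python, not part of the original report: the function name write_repro_files and the with_bom switch are hypothetical, and the report does not say whether the original files carried a UTF-8 byte order mark, so both variants are offered.

```python
from pathlib import Path

# Grammar text copied from step 1 of the repro steps.
GRAMMAR = '''module Ru_Test
{
    language Ru
    {
        syntax Main = "я";
    }
}
'''


def write_repro_files(directory, with_bom=False):
    """Write Ru.mg and test.ru as UTF-8 files (hypothetical helper).

    with_bom=True prepends a UTF-8 byte order mark. Whether the Oslo
    tools expect one is an open assumption; the report does not say.
    """
    encoding = "utf-8-sig" if with_bom else "utf-8"
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)

    grammar_path = target / "Ru.mg"
    input_path = target / "test.ru"
    grammar_path.write_text(GRAMMAR, encoding=encoding)
    input_path.write_text("я", encoding=encoding)  # U+044F, two bytes in UTF-8
    return grammar_path, input_path
```

Calling write_repro_files(".") and then running steps 3 and 4 against both the BOM and no-BOM variants would show whether the failure depends on how the encoding is signalled.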
If this issue occurred during a Hands-on-Lab, which one was it? 
 
File Attachments
0 attachments
Posted by Microsoft on 3/13/2009 at 9:37 AM
Enabling Unicode in MGrammar is a feature we are working on. Thanks for the feedback.
Posted by StonyUK on 5/10/2009 at 8:36 AM
Is Unicode still lacking in MGrammar? I see various whitespace rule examples where Unicode values are specified for line breaks, etc. I'm looking for a rule to allow arbitrary Unicode strings such as "Hello World" or "Eckernförde".