Compile error with source file containing UTF8 strings (in CJK system locale)
Status: Won't Fix
5/1/2008 5:19:19 AM
Compiling the MySQL source code fails on Japanese Windows, or whenever the system locale is set to one of the CJK locales.
The compile succeeds if the system locale is English. VS2003.NET also compiles the file without errors.
The file in question contains UTF-8 strings. Saving the file with a UTF-8 BOM works as a workaround, but it cannot be applied here because the same file is compiled with different compiler and OS combinations.
For more information refer to mysql bug description here:
I will attach the file in question shortly, together with the preprocessed source.
Visual Studio 2005 (All Products and Editions) Service Pack 1 w/ Windows Vista Update
Operating System Language
Steps to Reproduce
cl -c /TP sql_locale.i
.\sql_locale.cc(28) : error C2146: syntax error : missing '}' before identifier
.\sql_locale.cc(28) : error C2146: syntax error : missing ';' before identifier
.\sql_locale.cc(28) : error C4430: missing type specifier - int assumed. Note: C++ does not support default-int
.\sql_locale.cc(28) : error C2059: syntax error : 'string'
.\sql_locale.cc(28) : error C2143: syntax error : missing ';' before '}'
.\sql_locale.cc(28) : error C2059: syntax error : '}'
on 11/12/2013 at 12:34 PM
Why doesn't Microsoft add a compiler option "/encoding" to cl.exe? If Microsoft added it, we could type the compile command as "cl /encoding utf-8 filename.cpp" and compile a source file that uses UTF-8 encoding.
on 5/6/2008 at 11:17 AM
Hi: you are correct that the BOM is not part of the C++ Standard, but if you want non-ASCII characters then the "official" and portable way to get them is to use the \u (or \U) hex encoding (which is, I agree, just plain ugly and error prone).
When the compiler is faced with a source file that does not have a BOM, it reads ahead a certain distance into the file to see if it can detect any Unicode characters. It specifically looks for UTF-16 and UTF-16BE, and if it doesn't find either then it assumes that the file is MBCS. I suspect that in this case it falls back to MBCS, and that this is what is causing the problem.
Being explicit is really best and so while I know it is not a perfect solution I would suggest using the BOM.
Visual C++ Compiler Team
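To illustrate why a BOM removes the guesswork described above, here is a minimal sketch of BOM sniffing. It is not the compiler's actual detection code; the names `SourceEncoding` and `sniff_bom` are hypothetical, and UTF-32 BOMs are deliberately ignored to keep it short. A file without a BOM ends up as `Unknown`, which is the case where a tool has to fall back to a heuristic or to the system code page.

```cpp
#include <cstddef>
#include <string>

// Hypothetical names, for illustration only.
enum class SourceEncoding { Utf8Bom, Utf16LE, Utf16BE, Unknown };

// Inspect the first bytes of a file for a byte order mark.
// Assumption: UTF-32 BOMs are not handled in this sketch.
inline SourceEncoding sniff_bom(const std::string& bytes) {
    const auto at = [&bytes](std::size_t i) {
        return static_cast<unsigned char>(bytes[i]);
    };
    if (bytes.size() >= 3 && at(0) == 0xEF && at(1) == 0xBB && at(2) == 0xBF)
        return SourceEncoding::Utf8Bom;   // EF BB BF
    if (bytes.size() >= 2 && at(0) == 0xFF && at(1) == 0xFE)
        return SourceEncoding::Utf16LE;   // FF FE
    if (bytes.size() >= 2 && at(0) == 0xFE && at(1) == 0xFF)
        return SourceEncoding::Utf16BE;   // FE FF
    return SourceEncoding::Unknown;       // no BOM: the caller has to guess
}
```

A plain UTF-8 file without a BOM is indistinguishable from MBCS text by this check alone, which is why the reply above recommends being explicit.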
on 5/6/2008 at 10:33 AM
I set the status from "Won't Fix" back to "Active". Would appreciate a comment why you think it is not a compiler error and why "Won't fix". This issue is at least a regression. The file compiles ok on VS2003.
on 5/5/2008 at 4:21 PM
Thank you for the quick reply.
The workaround using a UTF-8 BOM works, as I wrote in the bug description, but unfortunately I cannot use it. As I also wrote in the bug description, this file is compiled on different platforms using different compilers.
I have not read the newest C++ specification, but my gut feeling is that the BOM is not part of it. Even if I can work around my local problem with this compiler or VS2008, it will fail on gcc and forte, and on VS2003 as well.
I'm trying to understand what the compiler needs to guess here. All non-ASCII characters in this file are inside string literals, and I would argue that char foo[] = "bar" in C or C++ denotes a null-terminated array of bytes (since 1970 or so).
In this case, no conversion to or from another encoding is desired. I also have no intention of outputting the strings with printf or the like. Why not just preserve the bytes "as is"?
Even if the compiler for some reason always needs to convert my source file, I could also live with another way to specify the encoding, without falling back to the BOM, for the reasons outlined above.
E.g. using #pragma setlocale("english.65001"), which also did not work for me in this case.
In the worst case, I know, I can still convert all bytes in this file to hex, but this is somewhat ugly. I would still like to see and edit the international strings in a UTF-8 capable editor, the Visual Studio IDE for example.
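For reference, the hex fallback mentioned above might look like the following sketch. The string and the names `kJapaneseUtf8`, `byte_length`, and `self_check` are hypothetical and not taken from sql_locale.cc. Because the source stays pure ASCII, the bytes in the literal should not depend on how the compiler guesses the file encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// UTF-8 bytes for the code points U+65E5 U+672C U+8A9E, written as hex
// escapes. The literal is split per character so each group is easy to read.
static const char kJapaneseUtf8[] = "\xE6\x97\xA5" "\xE6\x9C\xAC" "\xE8\xAA\x9E";

// Number of bytes before the terminating null.
inline std::size_t byte_length(const char* s) { return std::strlen(s); }

// Sanity check: three characters, three bytes each, plus the terminator.
inline void self_check() {
    assert(byte_length(kJapaneseUtf8) == 9);
    assert(sizeof(kJapaneseUtf8) == 10);
    assert(static_cast<unsigned char>(kJapaneseUtf8[0]) == 0xE6);
}
```

The cost is exactly the one raised above: the string is no longer readable or editable as text in a UTF-8 capable editor.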
on 5/5/2008 at 3:35 PM
Hi: our suggestion for fixing this issue would be to use a BOM. This unambiguously lets the compiler know the encoding of the file. Without it, the compiler has to resort to guesswork.
Visual C++ Compiler Team
on 5/2/2008 at 12:18 AM
Thanks for your feedback.
We are escalating this issue to the appropriate group within the Visual Studio Product Team for triage and resolution.
These specialized experts will follow-up with your issue.
Visual Studio Product Team