
BULK INSERT and BCP does not recognize codepage 65001 by Erland Sommarskog


Status: Closed as Fixed


Type: Bug
ID: 370419
Opened: 9/28/2008 4:07:18 AM
Access Restriction: Public
Workarounds: 0
Users who can reproduce this bug: 23

Description

In SQL 2005, you could use the options -C 65001 and CODEPAGE = 65001 to
load UTF-8 files with BCP and BULK INSERT respectively. In SQL 2008 this fails,
and the error message from BCP gives a clear hint that this is intentional.
I found https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=321839
which may give an explanation for this drastic step.

However, I can find no documentation about this in Books Online, and neither can I recall
having seen this in release notes or a Readme. It appears that this change was introduced in RC0.

While the bug in 321839 may not have passed the bar to be fixed in RTM, it seems quite
astonishing to defer the bug all the way to SQL 11 in the hope that you will have native UTF-8 support
in the engine then. After all, this was existing functionality in SQL 2005, with or without the bug.



Details
Posted by Bluezealot on 1/4/2013 at 10:19 PM
Has this already been fixed? It still happens on my server...
Posted by dchang92606 on 1/17/2012 at 5:13 PM
Did anyone try the -C RAW option for bcp to see if it allows UTF-8 characters into SQL 2008 R2? Or does anyone have a workaround for this problem that doesn't involve writing custom code to convert to UTF-16 before inserting it into SQL Server?
Posted by boomhauer on 8/7/2011 at 3:23 PM
someone want to explain why this was removed?

adding a note to the documentation does not explain why it was previously supported.

Posted by tanoshimi on 4/28/2011 at 4:52 AM
This "fixed" issue is still very much not fixed in SQL Server Denali CTP1.
Posted by Alex Rosa - DBA on 2/23/2011 at 8:57 AM
I got this issue today, working on Chinese servers.

I'm surprised with this statement: "I've discussed this with the development team, and they said that SQL Server never has supported code page 65001"

We can see in this link, that it's not totally true:

http://blogs.msdn.com/b/sqlserverfaq/archive/2009/06/03/bcp-command-using-code-page-65001-fails-if-both-sql-server-2000-and-sql-server-2005-tools-are-installed.aspx
Posted by async3 on 1/28/2011 at 1:06 PM
By closing this issue as "Fixed", Microsoft is only demonstrating that it has no interest in international standards, only its own "standards". And it DID work in SQL Server 2005. This is very disappointing. They should at least release a fix for this.
Posted by orhan mustafa on 12/7/2010 at 1:29 AM
any progress in this issue?

i am from germany. we're building a platform-independent database update process which has to be able to load flat files in whatever encoding into different sql servers. we simply cannot influence what encoding the users' flat files will be. and we're definitely not adding any unprofessional nonsense like "caution: utf-8 (most used encoding in the world) is not supported." to our requirements.

so any help will be highly appreciated.
p.s.: what pr is closing an unresolved issue, which makes users give up microsoft sql server?
Posted by DavidMann1 on 11/11/2010 at 5:48 PM
I would like to add my frustration and disappointment to the other comments. I live in Japan and generally build bilingual websites (Jp and En). UTF-8 is the encoding of choice and it's the only convenient way of working with html, js, css etc files for easy file sharing and exchange. My Japanese is very limited so I rely on Japanese staff to provide Japanese strings. All my IDEs and editing tools work beautifully with UTF-8. MS SQL is the one exception. The only word for this in 2010 is "unbelievable".
Posted by Bogdan Calmac on 7/7/2010 at 8:29 AM
I would like to add to the outrage of not supporting import from UTF-8. Shame.

But of course, MS is too big to care about their customers.
Posted by MSDNuser18 on 5/26/2010 at 6:57 AM
Totally agree, this issue shouldn't be closed.
please correct it.
We'd like to ask SQL Team to support code page 65001
Posted by harish kanyal on 5/23/2010 at 4:58 AM
Hi,

I am also facing this issue as we are upgrading from SQL Server 2005 to 2008.

The code page option does not work with UTF-8 files, but if we convert these files to Unicode (UTF-16), the load works. So we have to write an application to convert all files to Unicode, which I am still working on.

Please let me know if anyone has found a workaround for this issue, or whether Microsoft has provided a patch.

Thanks and Regards

Harish
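
Editorial note: the conversion step described in the comment above could look roughly like the following. This is only a minimal sketch, not the commenter's application. The function name utf8_to_utf16le and its parameters are hypothetical, and it assumes the target should be UTF-16 little-endian with a byte order mark, which is what SQL Server tooling usually means by "Unicode". Whether the mark is wanted for a particular load is also an assumption, so it can be switched off.

```python
def utf8_to_utf16le(src_path, dst_path, write_bom=True, chunk_chars=1 << 20):
    """Re-encode a UTF-8 text file as UTF-16 little-endian.

    Hypothetical helper for illustration only.
    """
    # "utf-8-sig" silently drops a leading UTF-8 byte order mark if one exists.
    # newline="" on both files disables newline translation, so the original
    # row terminators (for example \r\n) are copied through unchanged.
    with open(src_path, "r", encoding="utf-8-sig", newline="") as src, \
         open(dst_path, "w", encoding="utf-16-le", newline="") as dst:
        if write_bom:
            # U+FEFF encoded as UTF-16 LE yields the bytes FF FE.
            dst.write("\ufeff")
        # Copy in chunks so large data files are not read into memory at once.
        while True:
            chunk = src.read(chunk_chars)
            if not chunk:
                break
            dst.write(chunk)
```

The converted file would then be loaded as wide-character data, for example with the -w option of bcp or DATAFILETYPE = 'widechar' in BULK INSERT, instead of specifying code page 65001.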
Posted by Atradius on 11/30/2009 at 9:29 AM
I experience the same bug with UTF-8 not being supported. I would like to add that it is not supported in linked servers and OPENROWSET() either, and also the Microsoft OLE DB Text Driver is unable to read UTF-8 data properly.

It is a severe fault, looking at interoperability which in today's software environment is one of the most important issues.
Posted by Diode3000 on 9/16/2009 at 3:03 PM
I ran into this same issue today with SQL Server 2008. An application that worked with SQL Server 2005 stopped working because of the BCP issue and UTF-8.
Posted by Igor Kostin on 6/12/2009 at 11:57 AM
unbelievable!
Looks like an answer from some novice there! Looks like the MS dev team doesn't worry about international customers of SQL Server anymore. Damn! However, let's wait and see... I definitely will not get 2008 with such a problem!
Posted by marc_s on 6/7/2009 at 1:23 PM
I *CANNOT* believe a system like SQL Server would *NOT* support UTF-8 today - we're talking 2009 ! This is *THE* world standard for interoperable text files - finally you can forget and get rid of all the problems with local code pages - and now SQL Server of all systems doesn't support it ???? U-N-B-E-L-I-E-V-A-B-L-E !!!
Posted by Dylan Nicholson on 12/29/2008 at 2:12 AM
Absolutely agree that there's no excuse for this. We are basically telling all of our customers NOT to upgrade to SQL 2008 because of the performance loss associated with the lack of UTF-8 bulk insert (our app allows switching bulk insert off, but it runs about half the speed).
Posted by double digger on 11/1/2008 at 12:17 PM
Margi:

I cannot understand why this issue has been closed. I also don't understand why the developers that you have asked say that SQL Server never supported codepage 65001. I have used codepage 65001 / UTF-8 files daily for years to load data into SQL Server 2005. And it worked fine.

This response is very frustrating.

D.N.
Posted by Microsoft on 10/9/2008 at 7:31 PM
Hi Erland,

I've discussed this with the development team, and they said that SQL Server never has supported code page 65001 (UTF-8 encoding).

For the next web refresh of SQL Server 2008 Books Online, I've updated the description of code_page in the "bcp Utility," "BULK INSERT (Transact-SQL)," & "OPENROWSET (Transact-SQL)" topics to include the following note:

     Important: SQL Server does not support code page 65001 (UTF-8 encoding).

Regards,

Margi Showman,
Technical Writer
SQL Server Documentation Team




