'ß' and 'ss' are NOT equal - by BetterToday

ID: 341130
Status: Active
Type: Bug
Repros: 8
Opened: 4/30/2008 1:58:17 AM
Access Restriction: Public



When SQL Server is set to the standard Latin collation (Latin1_General_CI_AS), an NVARCHAR(x) UNIQUE column raises a unique constraint error when these two rows are added to the table:
Maße
Masse


This is not correct. Maße and Masse are two different words with different meanings. Since the German spelling reform in particular, 'ß' and 'ss' have become even more distinct. (However, 'SS' is currently still the replacement for 'ß' when converting a word to upper case, since 'ß' exists only as a lower-case letter.)

After discussing this online (http://groups.google.com/group/microsoft.public.de.sqlserver/browse_thread/thread/c2f2b1dd36ab439b),
I took the issue to DIN. They are currently discussing it internally with a view to updating the DIN/ISO standards.

Nonetheless, SQL2005/2008 should provide an option to *not* regard 'ß' and 'ss' as equal while still adhering to all other dictionary collating rules.

Axel Dahmen
Posted by Dennis Buchholz on 10/17/2017 at 12:40 AM

We have noticed that the upper-case ẞ can be added alongside the lower-case ß in a unique column with the Latin1_General_CI_AS collation.

So this generates an error:
-- UNIQUE is assumed here from the description of a unique column
CREATE TABLE #Test (Col1 NVARCHAR(100) COLLATE Latin1_General_CI_AS UNIQUE)

INSERT INTO #Test VALUES (N'Maße')
INSERT INTO #Test VALUES (N'Masse')

and this works - but shouldn't:
-- UNIQUE is assumed here from the description of a unique column
CREATE TABLE #Test (Col1 NVARCHAR(100) COLLATE Latin1_General_CI_AS UNIQUE)

INSERT INTO #Test VALUES (N'Maße')
INSERT INTO #Test VALUES (N'Maẞe')

And Windows actually knows that ẞ is the upper case form of ß:

CompareString(LOCALE_USER_DEFAULT, NORM_IGNORECASE, 'ß', 1, 'ẞ', 1) (Windows 10 1703)
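For comparison only, the relationship between ẞ and ß can also be inspected outside SQL Server and Windows. The following is a minimal sketch using Python's built-in Unicode case mapping; it is an assumption that this is a useful reference point, since it says nothing about how the Latin1_General_CI_AS collation or CompareString are implemented. The variable names are illustrative.

```python
# Python's str methods follow the Unicode case mapping data.
# This is NOT SQL Server's or Windows' collation logic; it is only a reference point.
lower_sharp_s = "\u00df"  # ß, LATIN SMALL LETTER SHARP S
upper_sharp_s = "\u1e9e"  # ẞ, LATIN CAPITAL LETTER SHARP S

# Lowercasing the capital letter yields the small letter.
pairs_by_lowercase = upper_sharp_s.lower() == lower_sharp_s

# Full case folding maps both letters to "ss", so all three words collapse.
folded = {word: word.casefold() for word in ("Maße", "Maẞe", "Masse")}

print(pairs_by_lowercase)
print(folded)
```

Under these Unicode rules a caseless comparison treats all three spellings alike, whereas the behaviour reported above treats only two of them alike.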

Posted by BetterToday on 11/24/2015 at 6:13 AM
So, basically, the Windows collation rules are wrong and need to be amended.
Posted by BetterToday on 11/24/2015 at 6:06 AM
Here's Keith's response:

If you read the entire answer, Umachandar says two things:

1. “Our current behavior follows the SQL/ISO standard” – this is a tangent for the issue you are describing because the SQL Standard does not specify how characters are compared. The standard specifies how collations are used and how to specify a collation.

2. “we rely on Windows for all of our windows collation sorting capabilities” – this is what I would expect. The SQL Server implementation uses external specifications for character sets and collations. So, you need to understand what collation is being used for the columns in question, and how that collation compares 'ß' and 'ss'.
Posted by BetterToday on 11/24/2015 at 3:14 AM
I initiated a discussion with the ANSI SQL group in 1993, but unfortunately I wasn't able to follow through on my request due to my busy schedule.

Here's what I wrote now:

Hello Keith,

I'm getting back to you after this long time about an issue which I raised with the Microsoft SQL Server team two years ago. Due to my busy schedule I wasn't able to reply back then, and I lost sight of the issue.

Here's the problem: In ANSI SQL, a comparison between "ss" and "ß" is defined as being equal. But that's wrong, particularly now, since Germany had a spelling reform in 1996.

Words such as "Masse" and "Maße" are different (=> "weight" and "measurement"), so an ANSI SQL comparison that treats them as equal returns wrong results.

In fact, there is a rule that when converting a word to uppercase letters, "ß" is supposed to become "SS", because "ß" was originally only a ligature of the lowercase letters "sz" (in a legacy German typeface) which over time became a ligature of "ss". So "Maße" becomes "MASSE", just as "Masse" becomes "MASSE", when the word is converted to uppercase.

Unfortunately, this is not a bijective mapping. You cannot determine exactly the lowercase equivalent of an uppercase word, because the uppercase "SS" might correspond to either "ss" or "ß". The Unicode group is aware of this problem and is discussing the creation of a capital-letter equivalent of "ß".
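The loss of information in the uppercase conversion can be illustrated with a minimal sketch. It uses Python's built-in `str.upper()`, which follows the Unicode rule that "ß" uppercases to "SS"; the word list and the helper `lowercase_candidates` are hypothetical names introduced only for this illustration.

```python
# Two distinct lowercase spellings taken from the discussion above.
words = ["Masse", "Maße"]

# Both words map to the same uppercase form, so the mapping is not injective.
upper_forms = {word: word.upper() for word in words}

def lowercase_candidates(upper_word):
    """Return every known word whose uppercase form equals upper_word."""
    return sorted(word for word in words if word.upper() == upper_word)

print(upper_forms)
print(lowercase_candidates("MASSE"))
```

Because "MASSE" has two candidates, the original spelling cannot be recovered from the uppercase form alone.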

My suggestion to them would be to introduce a capital letter "ß" which would just look like a capital double S: "SS".

Well, until this has been defined, we still face the current problem: "ß" does NOT equal "ss", yet the current behaviour treats them as equal and yields wrong and unexpected results.

I'd like to propose an update to the ANSI SQL spec so that "ß" is no longer regarded as equal to "ss".

Posted by Kk122015 on 11/23/2015 at 11:20 PM
Was there any update on this? How can this be solved if I cannot use the mentioned workaround? Is somebody from SQL team looking into this?
Posted by Mark Guinness on 5/20/2013 at 11:49 AM
This still appears to be an issue in SQL Server 2012. FWIW, the same problem occurs with æ and ae. Has anyone from the SQL team consulted with the Windows team to see if they are adopting the new DIN/ISO standards?
Posted by Microsoft on 6/21/2011 at 3:02 PM
Thanks for your feedback. Our current behavior follows the SQL/ISO standard, and unless those standards are updated with the latest changes we don't intend to change the behavior in SQL Server. Changing existing SQL Server behavior has a lot of implications, and today we rely on Windows for all of our Windows collation sorting capabilities. If and when Windows adopts these new rules / Unicode standard in the future, we will incorporate them in SQL Server. Until then, you will have to use the other workarounds suggested in the comments. Hope this helps.

Umachandar, SQL Programmability Team
Posted by YingXiao on 7/14/2010 at 6:58 PM
Hi there,

We have the same issue as Axel. We don't want to add an extra column to this table or change the collation of this column, as either would have a negative effect on the existing system.

Do you already have a fix for this issue? If not, when are you going to fix this?

Posted by BetterToday on 8/4/2008 at 3:45 AM
Hi Jim,

Sorry for taking so long to reply.

The corresponding DIN group is discussing this issue now. I can provide you with the personal contact at DIN if you provide me with your e-mail address.

I've played my part in this game so far by making DIN (ISO) aware of the problem and getting a discussion started to update the standards. This is where I get off the train.

Do you want me to provide you with the contact at DIN?

Best regards,
Axel Dahmen
Posted by Microsoft on 5/1/2008 at 8:13 AM
Hi Axel,

Thank you for this bug report. I've passed it over to the Developer Team to investigate; in particular, how much work it would take us to fix and/or provide an option to differentiate.

Please let us know how/when DIN decides on this issue, so we can adhere to their standard.

However, it's unlikely this change will make it into the current release. The 'bar' for checking-in fixes is now very high, as we enter the 'end-game' for Katmai.