Compare with Unicode constant doesn't use index if collation different than default - by Marco Russo

Status : 

  By Design<br /><br />
		The product team believes this item works according to its intended design.<br /><br />
		A more detailed explanation for the resolution of this particular item may have been provided in the comments section.


2
1
Sign in
to vote
ID 229661 Comments
Status Closed Workarounds
Type Bug Repros 2
Opened 10/19/2006 8:46:32 AM
Access Restriction Public

Description

A SQL query with a WHERE condition does not use the existing index if the comparison is made against a unicode constant string. It happens when the field has a collation setting different than the current one (i.e. field has SQL_Latin1_General_CP1_CI_AS where server default is now Latin1_General_CI_AS).
Query plan use Clustered Index Scan instead of Clustered Index Seek operation.
Performance become very slow when the query is made many times into a loop (not a good practice, but the program can't be changed).
Using ANSI string (8 bit) the problem disappear and query plan shows the use of a Clustered Index Seek Operation
Sign in to post a comment.
Posted by Matija Lah on 10/26/2006 at 7:45 AM
Apparently this seems to be a problem with SQL collations - Windows collations appear to produce the expected execution plan. Can anyone else confirm this?

Here's a changed repro:

SET STATISTICS IO OFF
GO
SET NOCOUNT ON
GO
CREATE TABLE #Test(
Name [varchar](50) COLLATE Estonian_CI_AS NOT NULL PRIMARY KEY)
DECLARE @I AS INT
SET @I = 0
WHILE @I < 5000
BEGIN
INSERT INTO #Test VALUES( '12345678901234567890' + CAST(@I AS VARCHAR) )
SET @I = @I + 1
END
INSERT INTO #Test VALUES( 'FOO' )
GO
SET STATISTICS IO ON
GO
-- This one is SLOW (use Clustered Index Scan)
SELECT * FROM #Test
WHERE Name = N'FOO'
-- This one is FAST (use Clustered Index Seek)
SELECT * FROM #Test
WHERE Name = 'FOO'
GO
DROP TABLE #Test
GO

Execution plan:
|--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1005], [Expr1006], [Expr1004]))
     |--Compute Scalar(DEFINE:(([Expr1005],[Expr1006],[Expr1004])=GetRangeThroughConvert(N'FOO',N'FOO',(62))))
     |    |--Constant Scan
     |--Clustered Index Seek(OBJECT:([tempdb].[dbo].[#Test]), SEEK:([tempdb].[dbo].[#Test].[Name] > [Expr1005] AND [tempdb].[dbo].[#Test].[Name] < [Expr1006]), WHERE:(CONVERT_IMPLICIT(nvarchar(50),[tempdb].[dbo].[#Test].[Name],0)=N'FOO') ORDERED FORWARD
Posted by Matija Lah on 10/26/2006 at 7:44 AM
Apparently this seems to be a problem with SQL collations - Windows collations appear to produce the expected execution plan. Can anyone else confirm this?

Here's a changed repro:

SET STATISTICS IO OFF
GO
SET NOCOUNT ON
GO
CREATE TABLE #Test(
Name [varchar](50) COLLATE Estonian_CI_AS NOT NULL PRIMARY KEY)
DECLARE @I AS INT
SET @I = 0
WHILE @I < 5000
BEGIN
INSERT INTO #Test VALUES( '12345678901234567890' + CAST(@I AS VARCHAR) )
SET @I = @I + 1
END
INSERT INTO #Test VALUES( 'FOO' )
GO
SET STATISTICS IO ON
GO
-- This one is SLOW (use Clustered Index Scan)
SELECT * FROM #Test
WHERE Name = N'FOO'
-- This one is FAST (use Clustered Index Seek)
SELECT * FROM #Test
WHERE Name = 'FOO'
GO
DROP TABLE #Test
GO

Execution plan:
|--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1005], [Expr1006], [Expr1004]))
     |--Compute Scalar(DEFINE:(([Expr1005],[Expr1006],[Expr1004])=GetRangeThroughConvert(N'FOO',N'FOO',(62))))
     |    |--Constant Scan
     |--Clustered Index Seek(OBJECT:([tempdb].[dbo].[#Test]), SEEK:([tempdb].[dbo].[#Test].[Name] > [Expr1005] AND [tempdb].[dbo].[#Test].[Name] < [Expr1006]), WHERE:(CONVERT_IMPLICIT(nvarchar(50),[tempdb].[dbo].[#Test].[Name],0)=N'FOO') ORDERED FORWARD
Posted by Microsoft on 10/20/2006 at 9:30 AM
This is not about the collation but about the difference between 8-bit and 16-bit character set.
In fact if you use the default collation (remove "COLLATE SQL_Latin1_General_CP1_CI_AS" from the column definition) then the plans do not change.

There is no generalized valid way to down-convert a 16-bit string to an 8-bit string but an 8-bit string is required to perform a seek because that is the type of the index.
The scan plan works because the 8-bit strings in the table are upconveted to 16-bit and compared to the 16-bit constant.

If you use nvarchar in the table then both forms of the predicate will support index seek (since the 8-bit constant in the predicate can be up-converted to 16-bit).

Thanks,
SQL Server dev.