Search

SCOPE_IDENTITY() sometimes returns incorrect value by dave_dave

Active

97
Sign in to vote
1
Sign in to vote
Sign in
to vote
Type: Bug
ID: 328811
Opened: 2/15/2008 4:20:09 PM
Access Restriction: Public
5
Workaround(s)
11
User(s) can reproduce this bug
We've caught both SCOPE_IDENTITY() and @@IDENTITY returning the incorrect value after doing an insert. It appears to be some combination of our database settings, the number of rows in the tables, and the particulars of the query plan. I've managed to write a script that will show the incorrect behavior. It will create a new database, create a table with 1 million rows, and then show the incorrect behavior.

@@Version =
Microsoft SQL Server 2005 - 9.00.3159.00 (Intel X86)
     Mar 23 2007 16:15:11
     Copyright (c) 1988-2005 Microsoft Corporation
     Enterprise Edition on Windows NT 5.2 (Build 3790: Service Pack 1)
Details (expand)
Product Language
English

Version

SQL Server 2005 - Enterprise Edition

Category

SQL Engine

Operating System

Win2003 Enterprise Server (SP1)
Operating System Language
US English
Steps to Reproduce
create a new database
create a table with 1 million rows
do a complicate INSERT of one additional row
watch SCOPE_IDENTITY() return the incorrect value

See attached script for details.
Actual Results
SCOPE_IDENTITY() and @@IDENTITY for the single-row inserts into _ddr_T return 1 and 1
Expected Results
1000001 and 1000002
Platform
32
File Attachments
6 attachments
broken identity.sql
broken identity 2.sql
broken query 2 - results.txt
IdentityBug_Data.zip
broken identity 3.sql
broken identity 3 - results.txt
Sign in to post a comment.
Posted by Microsoft on 2/28/2008 at 9:13 AM
Hi Dave,
Thanks for providing this feedback about a possible issue with the Scope_identity function. We are investigating this issue to see if this is a defect that needs to be fixed. We will keep you updated on the resolution to this issue as soon as it gets resolved. Thanks again for narrowing down this issue and providing the details.

Srini Acharya
Relational Engine
Posted by Microsoft on 3/10/2008 at 7:02 PM
Dave, I have run your repro script using both "SQL Management Studio" as well as "osql.exe" on build 9.00.3159.00 and it produced the result I would expect it to produce, added below. Thsi result is by design.

Are you seeing a different result?


*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

DBCC execution completed. If DBCC printed error messages, contact your system administrator.

(1000000 row(s) affected)

(1 row(s) affected)

(1 row(s) affected)
RowCount    @@IDENTITY                             SCOPE_IDENTITY
----------- --------------------------------------- ---------------------------------------
1         1                                     1

(1 row(s) affected)

(1 row(s) affected)

RowCount    @@IDENTITY                             SCOPE_IDENTITY                         IDENT_CURRENT
----------- --------------------------------------- --------------------------------------- ---------------------------------------
1         1000001                                 1000001                                 1000001

(1 row(s) affected)


(1 row(s) affected)

(1 row(s) affected)
RowCount    @@IDENTITY                             SCOPE_IDENTITY
----------- --------------------------------------- ---------------------------------------
1         1                                     1

(1 row(s) affected)

(1 row(s) affected)

RowCount    @@IDENTITY                             SCOPE_IDENTITY                         IDENT_CURRENT
----------- --------------------------------------- --------------------------------------- ---------------------------------------
1         1000002                                 1000002                                 1000002

(1 row(s) affected)
Posted by dave_dave on 3/11/2008 at 6:15 AM
When I run the script, it returns something different than when you ran it. I get:

DBCC execution completed. If DBCC printed error messages, contact your system administrator.

(1000000 row(s) affected)

(1 row(s) affected)

(1 row(s) affected)
RowCount    @@IDENTITY                             SCOPE_IDENTITY
----------- --------------------------------------- ---------------------------------------
1         1                                     1

(1 row(s) affected)

(1 row(s) affected)

RowCount    @@IDENTITY                             SCOPE_IDENTITY                         IDENT_CURRENT
----------- --------------------------------------- --------------------------------------- ---------------------------------------
1         1                                     1                                     1000001

(1 row(s) affected)


(1 row(s) affected)

(1 row(s) affected)
RowCount    @@IDENTITY                             SCOPE_IDENTITY
----------- --------------------------------------- ---------------------------------------
1         1                                     1

(1 row(s) affected)

(1 row(s) affected)

RowCount    @@IDENTITY                             SCOPE_IDENTITY                         IDENT_CURRENT
----------- --------------------------------------- --------------------------------------- ---------------------------------------
1         1                                     1                                     1000002

(1 row(s) affected)

Posted by dave_dave on 3/11/2008 at 6:17 AM
It also breaks when I run the script on another of our servers. This one with @@Version =
Microsoft SQL Server 2005 - 9.00.3152.00 (Intel X86) Mar 3 2007 03:17:37 Copyright (c) 1988-2005 Microsoft Corporation Enterprise Edition on Windows NT 5.2 (Build 3790: Service Pack 1)
Posted by dave_dave on 3/11/2008 at 6:20 AM
The script returns the correct results when run on a machine with @@Version =
Microsoft SQL Server 2005 - 9.00.3042.00 (Intel X86) Feb 9 2007 22:47:07 Copyright (c) 1988-2005 Microsoft Corporation Express Edition with Advanced Services on Windows NT 5.1 (Build 2600: Service Pack 2)
Posted by dave_dave on 3/11/2008 at 6:29 AM
I tried it on another client's system -- one that I was not involved in configuring -- and it returned the incorrect results. Their @@Version=
Microsoft SQL Server 2005 - 9.00.3042.00 (Intel X86)
    Feb 9 2007 22:47:07
    Copyright (c) 1988-2005 Microsoft Corporation
    Standard Edition on Windows NT 5.2 (Build 3790: Service Pack 2)
Posted by dave_dave on 3/11/2008 at 7:34 AM
If I change the AUTO_CREATE_STATISTICS setting, it works -- although I'm not convinced this is directly related to the problem. I suspect it is just because this causes a change in the query plan. I've attached a new script and the output that shows the change it causes in the query plan. My guess is that Parallelism is what is breaking the identity functions.

When the script is run using Express (where it always returns the correct results), I can see that parallelism is not being used.

It also returns the correct results on the Enterprise edition if I add either "OPTION (MAXDOP 1)" or "TOP 1" -- both of which cause parallelism to be dropped from the plan.
Posted by Microsoft on 3/11/2008 at 5:47 PM
Dave, thanks for your patience.

I have forced paralellism on my machine and re-run your query.
Now query plans looks nearly identical, yet the result is still different - same as I had last time.
See entire result attached as "Denis result - query 2.txt"

Maybe you could try to attach the database itself to this bug?

This seems the best way to make progress now.

Denis.
Posted by dave_dave on 3/11/2008 at 6:29 PM
I see that you ran your tests using the developer edition. Can you retry using Enterprise and Standard editions?

I'll attach the database tomorrow.
Posted by dave_dave on 3/12/2008 at 7:16 AM
I've attached IdentityBug_Data.zip that contains IdentityBug_Data.mdf

I did not include the log file because even when zipped, it is larger than 10MB. If you need it, let me know.

I preparted the database for uploading using the attached
Broken Identity 3.sql
which output
Broken Identity 3 - Results.txt
Posted by Microsoft on 3/18/2008 at 1:10 PM
Dave, thanks to your very detailed and dilligent report I was able to find the problem. Yes, it's a bug - whenever a parallel query plan is generated @@IDENTITY and SCOPE_IDENTITY() are not being updated consistenly and can't be relied upon. The few workarounds I can offer you for now:

1. Use MAX_DOP=1 as you are already using. This may hurt performance of the SELECT part of your query.
2. Read the value from SELECT part into a set of variables (or single tabel variable) and then insert into the target table with MAX_DOP=1. Since the INSERT plan will not be parallel you will get the right semantic, yet your SELECT will be parallel to achieve performance there if you really need it.
3. Use OUTPUT clause of INSERT to get the value you were looking for, as in the example I give further below. In fact I highly recomend using OUTPUT instead of @@IDENTITY in all cases. It's just the best way there is to read identity and timestamp.
4. Changing autostas is NOT a good workaround. It may hide the problem for a while but a prallel plan will get produced eventually.
5. Force serial plans for entire server via sp_configure 'max degree of parallelism' option.



Now about life beyond workaround. We will fix it in SQL 2008. If you need this to be fixed also in SQL 2005 you will have to request a QFE through official support channels. I can't promise anything either, as usual it's a factor of complexity of request, resource availability, quality of workarounds etc.


In the end, thank you once again for reporting this. You have saved plenty of headaches for many people.

Denis Altudov, Microsoft.


=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
             Example as promised:
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*

DECLARE @MyNewIdentityValues table(myidvalues int)
declare @A table (ID int primary key)
insert into @A values (1)
declare @B table (ID int primary key identity(1,1), B int not null)
insert into @B values (1)
select
    [RowCount] = @@RowCount,
    [@@IDENTITY] = @@IDENTITY,
    [SCOPE_IDENTITY] = SCOPE_IDENTITY()

set statistics profile on
insert into _ddr_T
output inserted.ID into @MyNewIdentityValues
    select
            b.ID
        from @A a
            left join @B b on b.ID = 1
            left join @B b2 on b2.B = -1

            left join _ddr_T t on t.T = -1

        where not exists (select * from _ddr_T t2 where t2.ID = -1)
--option (maxdop 1) --!!! If you uncomment this line, it works
set statistics profile off

select
    [RowCount] = @@RowCount,
    [@@IDENTITY] = @@IDENTITY,
    [SCOPE_IDENTITY] = SCOPE_IDENTITY(),
    [IDENT_CURRENT] = IDENT_CURRENT('_ddr_T')
select * from @MyNewIdentityValues
go
=/=/=/=/=/=/=/=/=/=/=/=/=/=/=/=/=/=/=/=/=/=/=/=/=/=/=/=/=/=/=/=/

Posted by Microsoft on 6/18/2008 at 11:55 AM
Unfortunately, after evaluating our fix options we have come to conclusion that we can't fix this for SQL 2008. We're not doing this lightly, I have spent close to a month trying to make a robust fix for this, but amount of required work did not fit into this release.

If you come across this issue you will have to resort to one of the workarounds I have put in my previous post.
Posted by dave_dave on 6/18/2008 at 12:06 PM
Because this can (and has) resulted in data loss, you should at least update the SQL 2005 and 2008 help files to indicate that we should not be using @@Identity or SCOPE_IDENTITY() without including "option (maxdop 1)".
Posted by Microsoft on 6/18/2008 at 12:13 PM
Fair enough. I have made a note of need for documentation in our bug tracking database.

Denis.
Posted by Mike C# on 3/19/2009 at 10:01 PM
This is not a bug to be taken lightly. This thing affects huge amounts of code already in place and could cause severe problems and as Denis pointed out it will cause "plenty of headaches for many people." Because of the amount of existing code out there using SCOPE_IDENTITY() and @@IDENTITY I would consider this an extremely high priority/high severity defect.
Posted by Brandon Reno on 3/20/2009 at 8:45 AM
Don't get me wrong, while I'm sure there is code out there that may need to beware of this, >99% of use cases for SCOPE_IDENTITY and @@IDENTITY fall into the pattern of insert a single value, and get the id. From what I can tell, inserts using the VALUES clause are not affected:

INSERT INTO Table (fld1, fld2) VALUES(@v1, @v2)
SET @id = SCOPE_IDENTITY()

this does not get a parallel plan, and would not be affected.

If you are doing a select, and expecting only one record, I hope that you are either checking against a PK or UNIQUE constraint or using TOP 1 (from what I can see, neither of which ever creates parallelized execution plans, though I could be wrong.)

I did a quick scan of our code, and could not find an example that could possibly be affected.
Posted by SQL Guru on 5/28/2009 at 10:12 AM
dave_dave: You should open a case with Microsoft on this. Cases for bugs are free and maybe they'll fix it then.

I really think this needs to be fixed in any case. This affects the data integrity by possibly having data point to the wrong row. If you don't think this is important, please ask some of the banks using SQL Server and see if having foreign keys point to the wrong row matters. If they find out that they have misplaced money because of this and MS won't fix it, be sure that they will be switching from SQL as fast as they can.

It needs to be fixed in both SQL 2005 and SQL 2008. Just because SQL 2005 is older, doesn't mean the data in those databases is less important. SQL 2005 is still in mainstream support until April, 2011. Until then it should still get these fixes, especially this type were it affects data integrity.
Posted by Paul White (NZ) on 9/12/2009 at 9:26 PM
Is IDENT_CURRENT affected in the same way?
Posted by Emtucifor on 9/29/2009 at 5:19 PM
Quote: >99% of use cases for SCOPE_IDENTITY and @@IDENTITY fall into the pattern of insert a single value, and get the id

Not for me. In one database I developed, the almost exclusive use of Scope_Identity() was in combination with @@RowCount to get the entire range of inserted rows and then perform some additional operation with that range. Just because the function returns the last row value does not mean that only the last row is of interest.

This seems like a really, really nasty bug and I think it should be addressed in some way as a high priority. Even if the problem can't be fixed so soon, how about a procedure-creation-time warning when any of these functions are in the script text, saying that they can return the wrong values if parallelism occurs?
Posted by Daniel Smith on 10/19/2009 at 1:59 AM
I realise this is mainly a corner case, but it puts a seed of doubt in people's minds. I don't want to have to always be checking my back to see if a feature I'm using in the product has a potentially horrendous data corrupting bug.

Please fix this before people loose all trust in the product.
Posted by JAlbanator on 10/26/2009 at 11:06 AM
I recently upgraded from 2000 to 2005 and, like others, SCOPE_IDENTITY() was not working appropriately as well. I found a workaround with our implementation by using the following:

SET NOCOUNT ON SELECT SCOPE_IDENTITY()

I hope this helps somebody.
Posted by john_s50 on 11/4/2009 at 2:07 PM
How can I get the attachments to test the issue on my system?
What is the latest on this? Is it fixed in SQL 2008? SP1?
Posted by whiskeypriest on 12/1/2009 at 5:24 PM
Could you post the repro script? I'd like to see if I can duplicate this. Might have the same problem.