Home Dashboard Directory Help
Search

Problem getting correct result by SwePeso


Status: 

Resolved
 as By Design Help for as By Design


1
2
Sign in
to vote
Type: Bug
ID: 781673
Opened: 3/19/2013 7:30:09 AM
Access Restriction: Public
0
Workaround(s)
view
0
User(s) can reproduce this bug

Description

I am aiding Kathi Kellenberger with her new article. I wrote a solution that work fine for the small sample set in the article. Today I added the sample set with 176,000 rows.

This gives me strange results! The table is a heap. The code is attached.
When I run the code without OPTION(MAXDOP 1), I get a count ranging from 16112 and 16160. I have no idea why.
When I turn on OPTION(MAXDOP 1), I get the correct result of 16008 in almost all cases. In some cases I get a count of 16009.

Why is the count not stable? It doesn't matter if I put a clustered index on the table.
Details
Sign in to post a comment.
Posted by Martin Smith on 3/19/2013 at 1:59 PM
Or using `DENSE_RANK` instead of `ROW_NUMBER` gives a consistent 16008
Posted by Martin Smith on 3/19/2013 at 1:49 PM
You have duplicate data.

e.g. 8 rows match StaffMember = A0207, EventDate = 07/01/2013, EventTime = 08:00:00, EventType = Enter.

Within that group of 8 rows it is undeterminsitic what the result of both

ROW_NUMBER() OVER (PARTITION BY StaffMember ORDER BY EventDate, EventTime) and

ROW_NUMBER() OVER (PARTITION BY StaffMember, EventType ORDER BY EventDate, EventTime)

should be.

I think you are assuming that the row that is assigned the earliest value from the first `ROW_NUMBER` expression will also be assigned the earliest value in the second `ROW_NUMBER` but this may or may not be the case.

If it is the case then these will all be given the same "IslandCandidate" value if it isn't then they won't.

If you add %%physloc%% in as a deterministic tie breaker (for demo purposes) then the issue disappears

        ROW_NUMBER() OVER (PARTITION BY StaffMember ORDER BY EventDate, EventTime,%%physloc%%) -
        ROW_NUMBER() OVER (PARTITION BY StaffMember, EventType ORDER BY EventDate, EventTime,%%physloc%%) AS IslandCandidate
    
            
Posted by SwePeso on 3/19/2013 at 7:36 AM
It doesn't matter if you remove the two where-clauses in the query. The count value still will vary.

However, if I first execute cteIsland and store the data in a temporary table, and use that temporary table in cteGrouped, I always get the correct result of 16008, no matter what.
Sign in to post a workaround.
File Name Submitted By Submitted On File Size  
Query.sql 3/19/2013 1 KB
Connect.7z 3/19/2013 93 KB