Home Dashboard Directory Help
Search

Incorrect execution behavior for MIN() and MAX() by John-Huang


Status: 

Active


6
0
Sign in
to vote
Type: Suggestion
ID: 772432
Opened: 11/27/2012 3:40:23 PM
Access Restriction: Public
0
Workaround(s)
view

Description

Performance of MIN() sometime is very slow even an index is used due to massive amount of NULLs in the index key see example below:

drop table test
go
create table test(id int, data char(7000) not null default(replicate('a', 7000)))
go
create clustered index id on test(id)
go
insert into test(id) values(null)
go 100
insert into test(id) values(1)
go
select * from test
-- return 101 rows
go
set statistics io on
go
select MAX(id) from test
select min(id) from test
go
(1 row(s) affected)
Table 'test'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)
Table 'test'. Scan count 1, logical reads 103, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


Details
Sign in to post a comment.
Posted by Paul White NZ on 12/23/2012 at 8:13 PM
Hi Martin,

No I didn't realize that. I confess I just posted a quick comment without really thinking too hard about it. Adding `WHERE id IS NOT NULL` is better, indeed. I could rescue the original workaround with:

-- MIN
SELECT (SELECT TOP (1) id FROM dbo.test AS t WHERE id IS NOT NULL ORDER BY id ASC);

-- MAX
SELECT (SELECT TOP (1) id FROM dbo.test AS t WHERE id IS NOT NULL ORDER BY id DESC);

...but that's not very nice :)
Posted by Martin Smith on 12/23/2012 at 11:11 AM
@Paul

As I'm sure you realise your proposed work around changes the semantics if all of the values are NULL as it returns zero rows.

select MAX(id) from test WHERE id IS NOT NULL
select MIN(id) from test WHERE id IS NOT NULL

both work for me though.

I guess the same complaint (and solution of seeking into the first NOT NULL value) could also apply to MERGE join

SELECT COUNT(*) FROM test t1 INNER MERGE JOIN test t2 ON t1.id= t2.id


SELECT COUNT(*) FROM test t1 INNER MERGE JOIN test t2 ON t1.id= t2.id WHERE t1.id IS NOT NULL AND t2.id IS NOT NULL
Posted by nicofer on 12/21/2012 at 12:40 AM
Hi Paul,

Really appreciate for your great help , I believe I understand.
Wish you a merry Christmas day!

Best Regards,
nicofer
Posted by Paul White NZ on 12/20/2012 at 9:21 PM
nicofer,

Execution plan operators request a row at a time from the operator to their right as described in Books Online (e.g. the Showplan Logical and Physical Operators Reference - http://msdn.microsoft.com/en-us/library/ms191158.aspx). A more entertaining explanation is at http://bradsruminations.blogspot.co.nz/2010/11/second-in-life-of-query-operator.html

Your query "select MAX(rv) from tb_001" has the following plan:

SELECT <- Stream Aggregate <- Top <- Index Scan

Execution starts at the root of the plan (at the SELECT), which requests a row from the Stream Aggregate, which requests a row from the Top, which requests a row from the Index Scan. The scan very quickly returns the first row from the index ordered descending (Ordered: true, Scan Direction: BACKWARD in the operator properties).

Only one row row is read from the scan. The Top is a Top (1), so after the first row, it stops asking for rows from the Index Scan and execution ends shortly afterward.

I hope that answers your question, but this isn't really the place for an extended discussion. If you need more details, ask a question on a site like http://dba.stackexchange.com/ and I will be happy to respond again.

Apologies to John for hijacking his Connect item.

Paul
Posted by nicofer on 12/20/2012 at 5:46 PM
Hi Paul White NZ,

Thanks for your response, my meaning is Why does not show index seek ?
if it was really index scan it shoud show many logical I/Os, but actually only show 2 logical reads , So i thought the excution plan should show INDEX SEEK instead of INDEX SCAN.
what do you think about it ?

Best Regards,
nicofer

Posted by Paul White NZ on 12/20/2012 at 2:04 AM
nicofer,

That's a completely separate issue. The scan you see is an ordered scan of the index, limited to the first qualifying row it sees by the Top operator. Finding the MIN of an ascending ordered index involves 'scanning' to find the fist non-NULL row ordered forward; finding the MAX 'scans' to find the first non-NULL row ordered backwards. The number of logical reads you see would be very much larger if the execution engine were scanning the whole index.

It's not clear what you mean by, "seeking from logical reads IO list" but index seeking requires a known value. If we knew the value, we wouldn't need to run the query. In any case, with your specific example, there are no NULLs to skip (the 'rv' column is NOT NULL by default) and a seek would require just as many logical I/Os. Note the STATISTICS IO output for your query shows 2 logical reads - hard to improve on that.

John is reporting a quite separate issue, which occurs when the ordered scan has to skip a large number of NULLs (MIN and MAX both ignore NULLs).
Posted by nicofer on 12/19/2012 at 11:56 PM
Hi,

Here is example:

use tempdb
go
--drop table tb_001
create table tb_001(id int primary key,rv [timestamp])
GO
declare @i int = 1
while @i<20000
begin
    insert into tb_001(id) select @i
    set @i+=1
    waitfor delay '00:00:00.001'
end
GO
create index ix_rv on tb_001(rv)

set statistics io on

select MAX(rv) from tb_001 -- it shows index scan, but actually should be seeking from logical reads IO list
--Table 'tb_001'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

select top 1 rv from tb_001 order by rv desc -- it shows index scan, but actually should be seeking from logical reads IO list.
--Table 'tb_001'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Best Regards,
nicofer
Posted by nicofer on 12/19/2012 at 11:45 PM
Hi,

Not only performance issues, but also always shows the "INDEX SCAN" on GUI execution plan when i use MAX or MIN. thanks.
Posted by Microsoft on 12/19/2012 at 2:46 PM
Thanks for the feedback John. We are looking into this request.
Posted by Paul White NZ on 12/1/2012 at 6:35 PM
Workaround:

-- MIN
SELECT TOP (1) id FROM dbo.test AS t WHERE id IS NOT NULL ORDER BY id ASC;

-- MAX
SELECT TOP (1) id FROM dbo.test AS t WHERE id IS NOT NULL ORDER BY id DESC;
Sign in to post a workaround.