The optimizer should be able to understand the invertability of built-in intrinsics on dates for the purposes of index sargability - by Simon Sabin

Status : 

 


49
0
Sign in
to vote
ID 653206 Comments
Status Active Workarounds
Type Suggestion Repros 0
Opened 3/23/2011 4:32:22 PM
Access Restriction Public

Description

Almost every client I go to I find queries that group by year and month to aggregate date. Even though there is an index on the appropriate date column the index isn't used by the query optimiser to efficiently build the results.

Whats more if this is done in a view/derived table and then filtered by year or year or month again the index isn't used to filter the data. This results in index scans with no seeks.

Often people do these types of queries in CTEs that is referenced more than once in the main query. This results in a large number of index scans, and sorts/hash aggregates.

i.e. 

Create view summaryView
as
Select year(datecol) yr, month(datecol) mnth, count(1)
From largeTable
Group by year(datecol), month(datecol)

Select *
 from summaryView where yr = 2001

http://sqlblogcasts.com/blogs/simons/archive/2011/03/24/the-optimiser-should-understand-that-year-datecolumn-is-correlated-to-the-datecolumn.aspx
Sign in to post a comment.
Posted by piers7 on 3/29/2011 at 1:53 AM
Joining against a date dimension table (and filtering on it) is only a viable strategy if:
- you're generating an aggregate for a specific range (a year), rather than across multiple years / months etc...
- your fact table's date column is at a granularity you can meaningfully join against a date dimension (for example, we store a datetime2(2) timestamp, and *derive* the date key as a scalar in the query, to save 4-6 bytes per row in the fact table, which is a big deal).

Any other workarounds available other than turning the whole aggregation into a cursor-driven loop over the table at the time aggregation granularity (ie one pass per month etc...)
Posted by SQLSaran on 3/28/2011 at 9:47 AM
Isn't aggregating and seggregating data a part of the data warehouse? I would like to see the Query Processor perform better with these kind of requests.
Posted by Microsoft on 3/25/2011 at 5:29 PM
Hi Simon,

Thanks for the feedback. We'll consider this for a future release, and I agree it would be a valuable addition. There are numerous workarounds for this though so I don't see it as high priority. E.g. use a Date dimension table and join it to the largeTable on the date column. Filter on date on the Date dimension before the join. That way you can get an index seek during the join to eliminate ranges of data.

Indexed views and columnstore indexes also may be useful for this scenario.

Do you know if the competition can do this particular query transformation you describe?

Best regards,
Eric
Posted by TheSQLGuru on 3/24/2011 at 7:32 AM
This is truly one of those things that the masses can benefit from and which will be important to driving Standard Edition version upgrades - something that has been APPALLINGLY LACKING in the last 5 years or so!!