Home Dashboard Directory Help
Search

Non-recursive CTEs inefficiently recompute self-joins and ROW_NUMBER instead of using worktable by justingrant


Status: 

Active


5
0
Sign in
to vote
Type: Bug
ID: 377976
Opened: 10/27/2008 2:47:34 PM
Access Restriction: Public
Primary Feedback Item: 218968
0
Workaround(s)
view
0
User(s) can reproduce this bug

Description

Non-recursive CTEs which contain self-joins or ROW_NUMBER() are unusuably slow for large data sets, beucase the CTE query is re-evaluated many times instead of being spooled into a worktable and only evaluated once. This makes CTEs useless for large data sets when self-joins or ROW_NUMBER is used.

Is this fixed in SQL 2008?

Tony Rogerson describes the problem thoroughly in these two blog posts.
#1 - http://sqlblogcasts.com/blogs/tonyrogerson/archive/2008/05/17/non-recursive-common-table-expressions-performance-sucks-1-cte-self-join-cte-sub-query-inline-expansion.aspx
#2 - http://sqlblogcasts.com/blogs/tonyrogerson/archive/2008/05/17/non-recursive-common-table-expressions-performance-sucks-2-row-number-is-executed-number-of-cte-references-x-number-of-rows-from-the-anchor.aspx
Details
Sign in to post a comment.
Posted by Microsoft on 10/31/2008 at 10:51 AM
Thanks again for your feedback. We are tracking variants of this DCR under "Feedback ID: 218968".

Campbell Fraser, SQL Development.
Posted by Microsoft on 10/30/2008 at 10:46 AM
Justin,

Thanks for your feedback. We haven't improved the situation in SQL Server 2008. Non-recursive CTEs are processed as inline view definitions. When used later in a query, they are expanded just like views. We'd like to improve this at some point and will consider your suggestion for a future release. For now, if performance is a problem for you, we recommend materializing intermediate results yourself using temp tables, scalar variables, or table variables. Temp tables give you statistics so query plans that use them will tend to be better than plans that use table variables. That's because table variables don't have statistics. If you use scalar variables in queries and compile time is not an issue for you, use OPTION(RECOMPILE) on the query. This does full contant folding of the expression containing the variable and will tend to give you a better query plan.

Eric
Sign in to post a workaround.