CASE / COALESCE won't always evaluate in textual order - by AaronBertrand

Status : 

  By Design<br /><br />
		The product team believes this item works according to its intended design.<br /><br />
		A more detailed explanation for the resolution of this particular item may have been provided in the comments section.


26
0
Sign in
to vote
ID 690017 Comments
Status Closed Workarounds
Type Bug Repros 4
Opened 9/21/2011 4:22:57 PM
Access Restriction Public

Description

The following CASE topic: 

http://msdn.microsoft.com/en-us/library/ms181765.aspx

Suggests that CASE "Evaluates, in the order specified, Boolean_expression for each WHEN clause."

While it doesn't explicitly say so, most users infer that this guarantees that CASE will stop evaluating when the first true Boolean_expression is reached.

However there are multiple cases where this short circuiting does not happen. For example, this scenario that Itzik Ben-Gan brought up:

    DECLARE @i AS INT = 1;
    SELECT CASE @i WHEN 1 THEN NULL ELSE MIN(1/0) END;

This fails with a division by zero error. Clearly the aggregate is being evaluated prior to and independently of the case branching.

This extends to similar COALESCE scenarios as well (most customers assume that COALESCE will short-circuit in a similar way, and while it usually does...). For example, the following works fine:

  DECLARE @i AS INT = 1; 
  SELECT COALESCE(NULLIF(@i, 0), 1/0);

But then if you use MIN again, it fails with the same division by zero error:

  DECLARE @i AS INT = 1;
  SELECT COALESCE(NULLIF(@i, 0), MIN(1/0));

Once again, I am not suggesting that the engine needs to fix this so that aggregates are never evaluated out of order (or evaluated even if an earlier expression yielded true). What I am suggesting is that the documentation should reflect 100% of reality, not just the 99% that most people are likely to hit. If there are exceptions to the rule, it should be CLEAR that it is not a hard-and-fast rule and it should state explicitly that users cannot rely on left-to-right evaluation or short-circuiting behavior.
Sign in to post a comment.
Posted by TechVsLife2 on 2/13/2012 at 10:17 PM
I thought I might as well add a note here pointing to another issue with the CASE documentation: it uses single quotation marks as identifiers (as do other several other BOL topics), against the SQL standard. The use of single quotation marks for identifiers is also already deprecated in Sql Server (at least in some forms):
https://connect.microsoft.com/SQLServer/feedback/details/724381/use-of-single-quotation-marks-as-column-identifier-in-help

'Price Range' =
     CASE
        WHEN ListPrice = 0 THEN 'Mfg item - not for resale'
etc.
Posted by TechVsLife2 on 2/12/2012 at 4:46 PM
Got bitten by it--could have used corrected documentation on it folks! especially since most of the books out there say CASE guarantees order of operations (short circuiting), so it's worth noting EMPHATICALLY.
Posted by Paul White NZ on 1/5/2012 at 8:16 AM
BOL will be updated with the text shown in the comments here: https://connect.microsoft.com/SQLServer/feedback/details/691535/aggregates-dont-follow-the-semantics-of-case

Just waiting for that to come through (it has been 3 months so far).
Posted by Vachhrajani, Nakul on 1/3/2012 at 11:44 PM
I agree - this should be a documentation change clearly documenting that CASE will not short-circuit in all cases.
Posted by Paul White NZ on 9/23/2011 at 5:38 AM
I agree - the response completely missed the point, and seems factually incorrect to boot.

If this item remains closed as By Design I'm going to open a new one. Hoping that won't be necessary.
Posted by AaronBertrand on 9/22/2011 at 10:34 AM
Also Eric your explanation does not cover why this works fine:

    DECLARE @i AS INT = 1;
    SELECT CASE @i WHEN 1 THEN NULL ELSE (SELECT MIN(1/0)) END;

By looking at the showplan_text I can see the difference but the casual observer is going to be confused by this behavior since it doesn't seem to agree with the documentation.
Posted by AaronBertrand on 9/22/2011 at 10:29 AM
Eric, you're brushing off the behavior and I think you're missing the point. The reason I filed the bug was not so that you would "correct" the behavior. It's so that you will correct the documentation, which currently leads a ***LOT*** of customers to believe that CASE (and COALESCE and IIF by extension) will ***ALWAYS*** evaluate left-to-right and short-circuit. Many of us feel it is important to explicitly state in the documentation that left-to-right evaluation and short-circuiting may often be observed but cannot be relied upon 100% of the time.
Posted by Microsoft on 9/22/2011 at 10:22 AM
Hi everybody,

For this example first submitted here:

DECLARE @i AS INT = 1;
SELECT CASE @i WHEN 1 THEN NULL ELSE MIN(1/0) END;

the MIN function is an aggregate. The query processor always aggregates a set of rows first before they flow into later stages of the query. In this case, the later stage is the CASE statement. Since there is no table specified in the query, a logical table with one row is created. While aggregating this rowset (a Constant Scan of a single row), the expression 1/0 must be evaluated. Here is the plan:

|--Compute Scalar(DEFINE:([Expr1001]=CASE WHEN [@i]=(1) THEN NULL ELSE [Expr1000] END))
     |--Stream Aggregate(DEFINE:([Expr1000]=MIN((1)/(0))))
            |--Constant Scan

The behavior is by design.

Best regards,
Eric Hanson
Program Manager
SQL Server Query Processing
Posted by K Christopher Cross, Sr on 9/22/2011 at 8:27 AM
Paul,

Exactly. And yes, I was trying to make a point that the 'bug' is a manifestation of a bigger issue or by design behavior, but as you said that is digressing from the very specific point on CASE. As I said in my very first post: "I agree." The part that escaped me is, "Note that the actual physical execution of the statement is determined by the query processor and the order may vary from this list" (http://msdn.microsoft.com/en-us/library/ms189499.aspx). I just need to RTFM better. :)

Posted by Paul White NZ on 9/22/2011 at 2:18 AM
Kevin,

In the example without the variables, constant-folding is applied to evaluate the expression at compile time. The first condition (1 <= 1) is always true, so the expression will always return 1:

SELECT CASE WHEN 1 <= 1 THEN 1 WHEN 1 <= 2 THEN MIN(1/0) END;

For that reason, the query plan for that statement is a simple Constant Scan containing the scalar value '1'. It is a little interesting that constant-folding can navigate the expression without evaluating MIN(1/0)...but that's getting away from the point.

In the situation where a variable is used, constant-folding obviously cannot be applied (the variable contains an unknown quantity) so the query plan is in the form that exposes the 'bug'. Constant folding: http://msdn.microsoft.com/en-us/library/ms175933.aspx
Posted by spaghettidba on 9/22/2011 at 1:33 AM
Definitely agree with Paul. The point is knowing how things really work and if the behaviour is consistent and reliable.
BOL clearly states that a CASE expression "Returns result_expression of the first Boolean_expression that evaluates to TRUE.", but doesn't say a word on the order of evaluation. The implementation could evaluate all the expressions starting from the cheapest one, spool the results and their ordinal position and then return the first one that evaluates to TRUE. It wouldn't be breaking the specs.

A query such as this one would greatly benefit from a cost-based implementation:

WITH tenColumns AS (
    SELECT 1 A, 2 B, 3 C, 4 D, 5 E, 6 F, 7 G, 8 H, 9 I, 10 J
),
tenRows AS (
    SELECT n
    FROM tenColumns AS tC
    UNPIVOT ( n FOR col IN (A,B,C,D,E,F,G,H,I,J) ) AS p
),
HundredMillionRows AS (
    SELECT n = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM tenRows AS A
    CROSS JOIN tenRows AS B
    CROSS JOIN tenRows AS C
    CROSS JOIN tenRows AS D
    CROSS JOIN tenRows AS E
    CROSS JOIN tenRows AS F
    CROSS JOIN tenRows AS G
    CROSS JOIN tenRows AS H
)
SELECT 1
WHERE 1 =
    CASE
        WHEN (SELECT COUNT(*) FROM master.dbo.spt_values) > 10000    THEN 1 -- FALSE
        WHEN (SELECT MAX(n) FROM HundredMillionRows)     > 99999999 THEN 1 -- FALSE
        WHEN (SELECT COUNT(*) FROM tempdb.sys.all_columns) > 5000     THEN 1 -- FALSE
        WHEN (SELECT 1/0     FROM master.sys.databases) > 1        THEN 1 -- TRUE
    END
Posted by K Christopher Cross, Sr on 9/21/2011 at 9:47 PM
Thanks! Hopefully, Microsoft can shed light on this case (okay, pun intended). ;)
If CASE behaved as Dave was suggesting like a SWITCH/CASE, then in some languages it falls through to the next case even if the first condition is met.

But results from statements like the below suggests that it does properly take the first condition met and exits.
SELECT CASE WHEN 1 <= 1 THEN 1 WHEN 1 <= 2 THEN MIN(1/0) END;

What I interpreted the bug to be at first was that when there was an aggregate involved, it was evaluating the aggregate first before even looking at the CASE conditions and not that the short-circuit was not working. Rather, the short-circuit was being circumvented by the query processor. The above working shows that the MIN(1/0) is not taking precedence, but as soon as you add in a variable you get the 'bug'.
DECLARE @i AS INT = 1;
SELECT CASE WHEN @i <= 1 THEN 1 WHEN @i <= 2 THEN MIN(1/0) END;

So what I was trying to show with this example:
DECLARE @i AS INT = 1;
SELECT CASE WHEN @i <= 1 THEN 1 WHEN @i <= 2 THEN 1/MIN(0) END;

... is that the short-circuit was working once processing got to it, but as Aaron put it the aggregate was working independently.
Posted by Paul White NZ on 9/21/2011 at 9:07 PM
Hi K Christopher Cross, Sr,

The logical processing order described at http://msdn.microsoft.com/en-us/library/ms189499.aspx) contains this statement:

"This order determines when the objects defined in one step are made available to the clauses in subsequent steps. For example, if the query processor can bind to (access) the tables or views defined in the FROM clause, these objects and their columns are made available to all subsequent steps. Conversely, because the SELECT clause is step 8, any column aliases or derived columns defined in that clause cannot be referenced by preceding clauses. However, they can be referenced by subsequent clauses such as the ORDER BY clause. Note that the actual physical execution of the statement is determined by the query processor and the order may vary from this list."

That entry is explaining binding steps, not saying anything about physical execution order (as it notes at the end). In general, as you have demonstrated, the query optimizer is free to produce any physical execution plan (with the operations in any order) that meets the semantic of the original query (taking the SQL as a whole, not in logical processing order).

The one and only tool we have (in T-SQL) to guarantee that one evaluation happens before another at run time is the CASE statement (it guarantees left-to-right order of evaluation). What is NOT clear from the documentation is whether it stops evaluating left-to-right as soon as the first test returns true. That's the point of this bug report.

The non-deterministic nature of code like the examples you posted is the reason we turn to CASE. Which of your examples happen to produce an error depends on query processor code paths and internal details. As you have seen, small tweaks to the SQL happen to produce different runtime behaviours (and they could all change with the next patch).
Posted by K Christopher Cross, Sr on 9/21/2011 at 6:15 PM
Further, if we accept that HAVING goes first, we should be okay with:


SELECT X, MIN(Y) AS MinY
FROM (
VALUES(1, 0),(2, 1)
) T(X, Y)
WHERE Y <> 0 AND 1/Y = 1
GROUP BY X
HAVING MIN(Y) = 1
;

Of course, it fails, but low and behold this works:


SELECT X, MIN(Y) AS MinY
FROM (
VALUES(1, 0),(2, 1)
) T(X, Y)
WHERE Y <> 0 AND 1/Y = 1
GROUP BY X
HAVING MIN(Y*Y) = 1
;

So any expression even a CAST() in the MIN makes it evaluate first. That might be clear to geniuses such as Paul and Aaron, but how are the rest of us to get by in this world. :) Especially when you then throw in the fact that an expression in another clause totally changes this order again.

-- Fail!
SELECT CONVERT(INT, X), MIN(Y) AS MinY
FROM (
VALUES(1, 0),(2, 1)
) T(X, Y)
WHERE Y <> 0 AND 1/Y = 1
GROUP BY CONVERT(INT, X)
HAVING MIN(CONVERT(INT, Y)) = 1
;

-- Joy!
SELECT X, MIN(Y) AS MinY
FROM (
VALUES(1, 0),(2, 1)
) T(X, Y)
WHERE Y <> 0 AND 1/Y = 1
GROUP BY X
HAVING MIN(CONVERT(INT, Y)) = 1
;

SELECT X, MIN(Y) AS MinY
FROM (
VALUES(1, 0),(2, 1)
) T(X, Y)
WHERE CONVERT(INT, Y) <> 0 AND 1/Y = 1
GROUP BY X
HAVING MIN(Y) = 1
;

Okay, I sufficiently beat that to death I think.
Posted by K Christopher Cross, Sr on 9/21/2011 at 5:56 PM
I agree. The issue is definitely the inconsistency and understanding. I know I for one was a firm believer in the order of evaluation within SELECT statements. In a conversation following a session in which Rob Farley showed a HAVING clause evaluating before GROUP BY happens, I was enlightened that I was sadly mistaken. Having documentation that identifies the fact that this may not be true as Aaron stated would be wonderful.

Expanding on the above:

SELECT X, MIN(Y) AS MinY
FROM (
VALUES(1, 0),(1, 1)
) T(X, Y)
WHERE Y <> 0
GROUP BY X
HAVING MIN(1/Y) = 1
;

By my traditional understanding that this statement evaluates as FROM, WHERE, GROUP BY, HAVING, etc. I would expect this statement to work because Y cannot equal 0 by time it gets to HAVING. However, instead:

Msg 8134, Level 16, State 1, Line 1
Divide by zero error encountered.

Kevin
Posted by AaronBertrand on 9/21/2011 at 5:19 PM
Yes, I agree with Paul... while it is great to expect a fix for this and the other case evaluation order bug (which has been fixed in 649957, but not publicly available yet), it is still hard to imagine that this is the last time this will come up. Folks like Itzik are bound to discover more cases (no pun intended) where this order of evaluation is not honored. It would be fantastic to see some clarification in Books Online that explicitly calls this out. Something like:

"Just because the *order* of evaluation follows an expected pattern, that does not mean that subsequent expressions following the first true expression won't be evaluated at parse/compile/run time."

This reminds me of the "order without ORDER BY" situation, where folks believe that something is true because they observe it "most of the time."
Posted by Paul White NZ on 9/21/2011 at 4:58 PM
I do hope this item gets the proper answer it deserves. For me, it's less about reporting the aggregate bug as it is about getting the answer to "most users infer that this guarantees that CASE will stop evaluating when the first true Boolean_expression is reached".

This is a very widely-held belief, so if we have been relying on a guarantee that does not exist (i.e. CASE might evaluate expressions after the first true one is encountered) we need to know!