Home Dashboard Directory Help
Search

Severe performance issue with persisted computed columns and joins by PotSushi


Status: 

Closed
 as Won't Fix Help for as Won't Fix


2
0
Sign in
to vote
Type: Bug
ID: 646700
Opened: 2/24/2011 3:09:49 AM
Access Restriction: Public
0
Workaround(s)
view
0
User(s) can reproduce this bug

Description

Hi, I'm hoping you can help with a performance issue with persisted computed columns. It would seem the the persisted value is not always used causing the computation to occur each time the row is read - which is bad

--DROP TABLE [dbo].[StatusCode];
--DROP TABLE [dbo].[Case];
--DROP FUNCTION [dbo].[GetId];
--DROP FUNCTION [dbo].[GetSummary];
--DROP FUNCTION [dbo].[GetReference];
--GO

CREATE FUNCTION [dbo].[GetId]
(@xml Xml)
RETURNS uniqueidentifier
WITH SCHEMABINDING
AS
BEGIN
    DECLARE @Id uniqueidentifier;
    WITH XMLNAMESPACES(DEFAULT 'http://www.tempuri.org')
    SELECT @Id = @xml.value('(/Header)[1]/@id','uniqueidentifier')
    RETURN @Id
END
GO

CREATE FUNCTION [dbo].[GetSummary]
(@xml Xml)
RETURNS nvarchar(250)
WITH SCHEMABINDING
AS
BEGIN
    DECLARE @Summary nvarchar(250);
    WITH XMLNAMESPACES(DEFAULT 'http://www.tempuri.org')
    SELECT @Summary = @xml.value('(/Header/Summary)[1]/text()[1]','nvarchar(250)');
    RETURN @Summary

END
GO

CREATE FUNCTION [dbo].[GetReference]
(@xml Xml)
RETURNS nvarchar(250)
WITH SCHEMABINDING
AS
BEGIN
    DECLARE @Reference nvarchar(250);
    WITH XMLNAMESPACES(DEFAULT 'http://www.tempuri.org')
    SELECT @Reference = @xml.value('(/Header/Reference)[1]/text()[1]','nvarchar(250)');
    RETURN @Reference
END
GO

CREATE TABLE [dbo].[Case](
    [Id] AS ([dbo].[GetId]([Xml])) PERSISTED NOT NULL
    , [Xml] xml NOT NULL
    , [StatusCode] nvarchar(50) NULL
    , [Reference] AS ([dbo].[GetReference]([Xml])) PERSISTED
    , [Summary] AS ([dbo].[GetSummary]([Xml])) PERSISTED
);
CREATE TABLE [dbo].[StatusCode](
    [Code] nvarchar(50) not null
    , [Description] nvarchar(250)
);
GO

INSERT INTO [dbo].[StatusCode] VALUES('Code 1','Code 1 Description');
INSERT INTO [dbo].[StatusCode] VALUES('Code 2','Code 2 Description');
GO

INSERT INTO [dbo].[Case] VALUES('<Header xmlns="http://www.tempuri.org" id="1e742554-34f0-42e0-a99d-67cb2b411383"><Summary>Record 1</Summary><Reference>Reference 1</Reference></Header>', 'Code 1');
INSERT INTO [dbo].[Case] VALUES('<Header xmlns="http://www.tempuri.org" id="cf3decf9-5980-48e9-a77d-4e9d76e8b189"><Summary>Record 2</Summary></Header>', 'Code 1');
INSERT INTO [dbo].[Case] VALUES('<Header xmlns="http://www.tempuri.org" id="1c6e1165-477f-49f3-bb72-9bb5e18e4b23"><Summary>Record 3</Summary><Reference>Reference 3</Reference></Header>', 'Code 2');
INSERT INTO [dbo].[Case] VALUES('<Header xmlns="http://www.tempuri.org" id="dec2ba8f-cb07-4644-8faa-11a94e39a88d"><Summary>Record 4</Summary></Header>', 'Code 2');
INSERT INTO [dbo].[Case] VALUES('<Header xmlns="http://www.tempuri.org" id="808554a5-2e51-41a9-ab41-105710a4a1cb"><Summary>Record 5</Summary><Reference>Reference 5</Reference></Header>', 'Code 2');
GO

DECLARE @Summary bit; SET @Summary = 0;
SELECT
    [Id]
    , CASE
         WHEN @Summary = 1 THEN C.[Summary]
         ELSE C.[Reference]
    END AS [DerivedValueColumn]
FROM [Case] C
INNER JOIN [dbo].[StatusCode] S ON C.[StatusCode] = S.[Code];

The above code including inner join causes an extra Compute Scalar to occur which the argument is DEFINE:([Expr1006]=CASE WHEN [@Summary]=(1) THEN [ComputedColumnProblem].[dbo].[GetSummary]([ComputedColumnProblem].[dbo].[Case].[Xml] as [C].[Xml]) ELSE [ComputedColumnProblem].[dbo].[GetReference]([ComputedColumnProblem].[dbo].[Case].[Xml] as [C].[Xml]) END). Using profiler capturing SP:StmtCompleted events clearly demonstrates the functions running once per returned row which is bad.

Now the interesting bit, removing the seemingly unrelated inner join line to [dbo].[StatusCode] uses the computed persisted columns correctly.

I know there are 101 ways to rewrite the query without the CASE but I really want to know why this happens and how to avoid it in the future as I don't believe I'm doing anything particularly wrong here. We use computed columns regularly to pluck information out of xml and persist it per books online article http://msdn.microsoft.com/en-us/library/bb510426.aspx

Kind regards, Adam
Details
Sign in to post a comment.
Posted by PotSushi on 3/10/2011 at 2:15 AM
Thank you, I accept the analysis and appreciate the time and effort. Kind Regards, Adam
Posted by Microsoft on 3/9/2011 at 5:19 PM
Hi Adam,

Here's the summary of investigations:

The computed columns within the case statement are not used by the query because the expression computing the case statement does not get pushed below the join. The reason for not getting pushed is the fact the expression mixes references from the table with references from outside. It also happens that the case statement does not allow individual sub-expression to be pushed in because it could lead to scenarios where certain sub-expression gets evaluated when it should not be evaluated. With this in mind and looking the shape of the query we can suggest the following rewrite:

DECLARE @Summary bit; SET @Summary = 0;
if (@Summary = 1)
     SELECT
             [Id],
             C.[Summary] AS [DerivedValueColumn]
     FROM
             [Case] C
             INNER JOIN [dbo].[StatusCode] S ON C.[StatusCode] = S.[Code]
else
     SELECT
             [Id],
             C.[Reference] AS [DerivedValueColumn]
     FROM
             [Case] C
             INNER JOIN [dbo].[StatusCode] S ON C.[StatusCode] = S.[Code]
go

This is the current behavior of the product and while we can improve it, it is unlikely we will fix that soon. Hence, I'm going to close this item as "won't fix".

Regards,
Boris.
Posted by PotSushi on 3/2/2011 at 3:30 PM
Thanks Boris
Posted by Microsoft on 3/1/2011 at 4:03 PM
Hi Adam,

Thank you for filing the feedback – we are currently investigating this. I will update the item as we know more.

Regards,
Boris Baryshnikov.
SQL Server Engine
Sign in to post a workaround.