String summary - slow statistics creation on VARBINARY(MAX) column - by stormcrow_au

Status : 

  Won't Fix<br /><br />
		Due to several factors the product team decided to focus its efforts on other items.<br /><br />
		A more detailed explanation for the resolution of this particular item may have been provided in the comments section.


7
0
Sign in
to vote
ID 683303 Comments
Status Closed Workarounds
Type Suggestion Repros 0
Opened 8/8/2011 12:06:08 AM
Access Restriction Public

Description

Our databases are configured with AUTO_CREATE_STATISTICS enabled, and AUTO_UPDATE_STATISTICS_ASYNC disabled. We're using SQL 2008 and 2008 R2 Developer and Standard editions in test, and SQL 2008 and 2008 R2 Standard Edition in production.

We've got a table that contains JPEG image data similar to the below:

CREATE TABLE Pictures ([Key] INT NOT NULL IDENTITY(1,1) PRIMARY KEY, [Image] VARBINARY(MAX))

The following query performs very slowly when a statistics creation or update is triggered. 

SELECT COUNT(*), SUM(CAST(DATALENGTH([Image]) AS bigint)) FROM Pictures WHERE [Image] IS NOT NULL

The same query without the WHERE predicate takes only a few seconds, even on a 20GB table, and the 'bytes read' performance counter indicates that sqlservr.exe is not reading all the blob data.

Examining sys.dm_exec_requests while the statistics update is occuring, we see:

SELECT StatMan([SC0], [LC0]) FROM (SELECT TOP 100 PERCENT CONVERT([varbinary](200), SUBSTRING ([Image], 1, 100)++substring([Image], case when LEN([Image])<=200 then 101 else LEN([Image])-99 end, 100)) AS [SC0], datalength([Image]) AS [LC0] FROM [dbo].[RequestFormImages] WITH (READUNCOMMITTED)  ORDER BY [SC0] ) AS _MS_UPDSTATS_TBL

Investigation shows this query creates the 'string summary' for the index, as described in http://msdn.microsoft.com/en-us/library/ms190397%28v=SQL.90%29.aspx. In this and other cases in our database, the string summary will never produce an improved plan. 

The article above states that string summaries will be discarded if they provide no useful information - is it possible to add an option to CREATE STATISTICS to control whether the string summary is computed in the first place?
Sign in to post a comment.
Posted by Microsoft on 8/10/2011 at 8:25 AM
Thanks for the feedback. We'll consider this for a future release. Please consider using manual statistics update on that column for now and disabling automatic updates. Contact Microsoft Customer Support Services if you can't come up with a workaround and this is a critical issue for you.

Best regards,
Eric Hanson
Program Manager
SQL Server Query Processing