SQL 2012 SSISDB Cleanup causes User Packages to fail - by SAinCA

Status : 

  Fixed<br /><br />
		This item has been fixed in the current or upcoming version of this product.<br /><br />
		A more detailed explanation for the resolution of this particular item may have been provided in the comments section.


37
0
Sign in
to vote
ID 749391 Comments
Status Closed Workarounds
Type Bug Repros 11
Opened 6/15/2012 2:30:17 PM
Access Restriction Public

Description

The SSISDB's Catalog Properties included the data retention set to the default 365 days.  There is only ONE Folder in this Catalog.  Changing the retention to 7 days and running the "SSISDB Server Maintenance Job" causes the only package scheduled to run every 15 minutes, that typically runs in 90 seconds, to run for 19 minutes and fail with no explanation at all.

I can see the query below executing for over 2 hours (17GB of SSISDB Data).  UPDATE STATISTICS is also running on a plethora of threads under the Maintenance Plan for the same DB:
DELETE TOP (@delete_batch_size)
  FROM [internal].[operations] 
OUTPUT DELETED.operation_id, DELETED.operation_type INTO @deleted_ops
 WHERE ( [end_time] <= @temp_date
        OR ([end_time] IS NULL AND [status] = 1 AND [created_time] <= @temp_date ))

I had hoped that the "Delete Batch Size" might be configurable, but no signs of it on the page dedicated to SSIDB at http://technet.microsoft.com/en-us/library/hh479588.aspx.

In running the User Package for over two months non-stop, there have been ZERO failures.  The "SSISDB Server Maintenance Job" is clearly impeding normal operations.  Given that this User process MUST run every 15 minutes, there is NO maintenance window in which the Maintenance Job could be scheduled so as to obviate this interfering behavior.

Please fix the delete process so it doesn't render the SSISDB inaccessible for NORMAL Production access.

(Stand-alone instance under VM-Ware)
Sign in to post a comment.
Posted by Thomas W Marshall on 3/17/2014 at 6:07 AM
http://support.microsoft.com/kb/2829948

Fixes for this issues are presented in SQL 2012 SP1 CU4 and SQL 2012 CU7.
Posted by Thomas W Marshall on 3/12/2014 at 7:33 AM
My colleague David just pointed out to me that the @deleted_ops variable does not get truncated with each iteration of the loop. This in turn leads to it acculumulating all of the deleted ids and with each iteration forcing scans, seeks, and lookups in support of the implied join the cascade delete forces.

This process is consistently blocking all package operations from completing. Changing the retries on the agent job that calls it to zero and killing it when it is found at the head of a blocking chain helps to alleviate the subsequent issues, but that only excaberates the fault in terms of the history cleanup.
Posted by Mark Tassin on 1/15/2013 at 1:01 PM
There are several problems here... No indexes on most of the involved tables causes table scans for one. The internal.cleanup_server_retention_window stored proc doesn't specify a hint to control locking so it takes table locks when it runs delete.

There's no way to control the size of the batches except to actually modify the stored proc.
Posted by Microsoft on 7/3/2012 at 1:53 AM
Hi SAinCA

Thank you for your feedback. We will research the issue and respond as soon as possible.
Posted by Ron Van Zanten on 6/27/2012 at 1:49 PM
This is an issue for us also. Is there anybody that know a way around this?
Posted by groenhar on 6/27/2012 at 1:36 PM
We are having the same issue within our production environment. What is the work around to avoid the interference with logging of the packages running in production constantly? I agree the behavior is interfering with production processes. I am not sure how to workaround this issue. The FK's to the operation id is set to cascade delete all the records in the child tables. This appears be blocking the inserts of the production packages doing the normal logging which is causing time outs and the packages to fail during the maintenance job running. Anyone have a workaround?
Posted by SAinCA on 6/15/2012 at 2:43 PM
ADDITIONAL INFORMATION:
==================

Maintenance Plan Job cancelled to reduce IO contention.

The following command is waiting on an LCK_M_IX Wait Type on the SSISDB on three threads for Application "SSIS ISServerExec" - each has been waiting 20 minutes, then disappears coincident with the latest User Package failure:

SSISDB.internal.append_event_message;1

The DELETE task details are : EXEC [internal].[cleanup_server_retention_window]