
Database with 64K filegroups takes unexpectedly long to start up (by xor88)


Status: Closed as Won't Fix
Type: Bug
ID: 778420
Opened: 2/3/2013 3:32:28 PM
Access Restriction: Public
Workaround(s): 0
User(s) can reproduce this bug: 1
Description

SQL Server burns 100% of a single CPU core for a *very* long time when starting up a database with 64K filegroups and files. It does almost no IO; it just burns CPU. This is unexpected, because startup should probably be IO bound.

Repro:

1. Create a database with 64K filegroups and one file each (on an SSD drive)
2. Kill SQL Server (one way to do steps 2 and 3 is sketched right after this list)
3. Start it
4. Observe high CPU for a long time, almost no IO, and the database being inaccessible due to X-locks (diagnostic queries for this are shown after the script below)
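
For steps 2 and 3, here is a minimal sketch of one way to do it (assumes a default instance named MSSQLSERVER and sysadmin rights; SHUTDOWN WITH NOWAIT stops the instance without running checkpoints, so a full recovery happens on the next start):

--from a sysadmin connection: stop SQL Server immediately, skipping checkpoints
SHUTDOWN WITH NOWAIT
GO
--then restart the service, e.g. from an elevated command prompt:
--    net start MSSQLSERVER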

Script:

--setup
CREATE DATABASE FgTest ON PRIMARY
( NAME = N'FgTest_Data', FILENAME = N'C:\FgTest.mdf', SIZE = 16MB, MAXSIZE = UNLIMITED, FILEGROWTH = 16MB)
LOG ON
( NAME = N'FgTest_Log', FILENAME = N'C:\FgTest.ldf', SIZE = 16MB, MAXSIZE = UNLIMITED, FILEGROWTH = 16MB)
GO
ALTER DATABASE FgTest SET RECOVERY SIMPLE
GO
USE FgTest
GO


--test limits
--note: the loop limit of 100 below is only a quick smoke test; the problem shows up with tens of thousands of filegroups/files
--note: the C:\FgTest\ directory must exist before running this loop
DECLARE @i INT = 0
WHILE @i < 100
BEGIN

    DECLARE @nextID INT = (SELECT MAX(file_id) FROM sys.database_files) + 2
    DECLARE @filegroupName NVARCHAR(MAX) = 'FG_' + CONVERT(NVARCHAR(MAX), @nextID)
    DECLARE @fileName NVARCHAR(MAX) = 'F_' + CONVERT(NVARCHAR(MAX), @nextID)
    DECLARE @sql NVARCHAR(MAX) = '
        ALTER DATABASE FgTest ADD FILEGROUP ' + @filegroupName + '
        ALTER DATABASE FgTest ADD FILE (NAME = N''' + @fileName + ''', FILENAME = N''C:\FgTest\FgTest_' + @fileName + '.ndf'', SIZE = 512KB) TO FILEGROUP ' + @filegroupName + '
    '
    EXEC sp_executesql @sql

    IF @i % 10 = 0
        PRINT 'Added filegroup ' + @filegroupName + ', file ' + @fileName

    SET @i = @i + 1

END
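
For reference, here is a sketch of what step 4 of the repro looks like from a second connection while the database is still starting up. The DMVs below are standard; the database name assumes the FgTest repro above, and the exclusive lock is assumed to appear on the DATABASE resource as the description suggests:

--the recovery task burning CPU shows up with command = 'DB STARTUP'
SELECT session_id, command, status, cpu_time, reads, writes, wait_type
FROM sys.dm_exec_requests
WHERE command = N'DB STARTUP'

--the exclusive database lock that keeps FgTest inaccessible during startup
SELECT resource_type, request_mode, request_status, request_session_id
FROM sys.dm_tran_locks
WHERE resource_database_id = DB_ID(N'FgTest')
  AND resource_type = N'DATABASE'
  AND request_mode = N'X'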



This is an availability problem because the database is unavailable for a really long time during startup.
Details
Posted by xor88 on 3/18/2014 at 5:30 AM
This issue was fixed by http://support.microsoft.com/kb/2932120 in CU9 for SQL Server 2012 SP1. Thank you!

Startup time was reduced from 1h45m to 2 minutes in my case, for 30K files and filegroups.
Posted by xor88 on 4/25/2013 at 3:54 PM
That's a meaningful reply, thank you.
Posted by Microsoft on 4/25/2013 at 12:43 PM
Hello,
Thank you for submitting this feedback. After carefully evaluating all of the bugs in our pipeline, we are closing this bug as the scenarios reported in the bug are not common enough.
Thanks again for reporting the product issue and continued support in improving our product.

Thanks,
Ajay
Senior Program Manager
Microsoft Corporation
Posted by xor88 on 2/5/2013 at 1:26 PM
The use case is a database of about 10TB split into a few thousand files for fragmentation-management reasons. Everybody knows that mixing partitions that have frequent inserts and deletes in the same file can cause a lot of interleaved extents. It is very hard to maintain high sequential read speeds under those conditions. Rebuilds often spread the new partition over small allocation holes, so they don't really get rid of the fragmentation: lots of disk seeks.

For that reason we periodically rebuild certain fragmented indexes/partitions into new filegroups to get them perfectly sequential.

Actually, project Madison (Parallel Data Warehouse) uses this approach: they have a filegroup and a file per partition.
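
For anyone unfamiliar with that pattern, here is a minimal sketch (the database, table, index, file, and filegroup names are all hypothetical):

--add a fresh filegroup with a single new file
ALTER DATABASE SalesDb ADD FILEGROUP FG_2013_Q1
ALTER DATABASE SalesDb ADD FILE (NAME = N'F_2013_Q1', FILENAME = N'E:\Data\SalesDb_F_2013_Q1.ndf', SIZE = 4GB) TO FILEGROUP FG_2013_Q1
GO
--rebuild the fragmented index onto the new filegroup so its pages end up laid out sequentially
USE SalesDb
GO
CREATE CLUSTERED INDEX CIX_Orders ON dbo.Orders (OrderDate)
WITH (DROP_EXISTING = ON, SORT_IN_TEMPDB = ON)
ON FG_2013_Q1
GO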
Posted by Allan Hirt on 2/4/2013 at 10:28 AM
It is totally expected. It has NEVER been a SQL Server best practice to have thousands of filegroups for an individual database for reasons you encountered - things like throttled CPU. Having many small files would also be very hard to manage in the real world - especially for recovery. Who has DBs that are 16MB for each file? None of my customers.

I'm curious why you did this, since it really isn't a bug IMO. To me, this is definitely a case of "just because SQL technically allows you to do this, why did you go there?" If you don't do this, your database won't take nearly 90 minutes to start.
Posted by xor88 on 2/4/2013 at 2:17 AM
Update: The database startup took 1h:23m:50s of constant CPU time as reported by Process Explorer. That's about 1.5 hours(!) of wall-clock startup time.

The database had no contents at all, and almost all files were 1MB or 0.5MB.