SQL Server and Fragmentation

There are several types of fragmentation that affect SQL Server, all of which have the potential to cause performance problems.

1) At the disk/OS level individual files can become fragmented due to growth over time. This can be addressed by a defragmentation tool at the SAN or Windows level. Ask your storage team about your options there.

2) Within a SQL Server database individual pages can become internally fragmented so that many pages are less full than expected. This means more pages are needed to store the same amount of data which means more reads to satisfy each query and less total data that can be kept in memory. This can decrease performance.

3) Within a SQL Server database the index pages and extents can become less sequential. This can make scans slower. Though hopefully you are doing more seeks than scans this can still be an important factor in performance.

Sharepoint:

If you are using Sharepoint, you need to be aware that you cannot change the schema at all, including adding or modifying indexes. Here is some information on Sharepoint support:

· 943345 How to defragment Windows SharePoint Services 3.0 databases and SharePoint Server 2007 databases http://support.microsoft.com/default.aspx?scid=kb;EN-US;943345

· 841057 Support for changes to the databases that are used by Office server products and by Windows SharePoint Services http://support.microsoft.com/default.aspx?scid=kb;EN-US;841057

· 968785 After you install Windows SharePoint Services 3.0 Service Pack 2, the Windows SharePoint Services 3.0 database can be auto-defragmented http://support.microsoft.com/default.aspx?scid=kb;EN-US;968785

· Note that with Sharepoint, the recommended break-point where you change from REORGANIZE to REBUILD is 70% instead of the more standard 30%. Your maintenance window may be a bit longer but concurrency will increase for your users on the tables with BLOBs.

Disk/OS level fragmentation:

In a SAN environment, sometimes you cannot use the windows tools to tell you how fragmented a volume is since some characteristics may be hidden from Windows and known only to the SAN. If your storage team has SAN tools to tell them about fragmentation, then use that. Windows can see a SAN volume as fragmented when the SAN has already taken care of the fragmentation.

· This SQL IO basics document has some details on disk defragmentation http://download.microsoft.com/download/4/7/a/47a548b9-249e-484c-abd7-29f31282b04d/SQLIOBasicsCh2.doc

Database level fragmentation:

· sys.dm_db_index_physical_stats (Transact-SQL) http://msdn.microsoft.com/en-us/library/ms188917.aspx

· Reorganizing and Rebuilding Indexes http://msdn.microsoft.com/en-us/library/ms189858.aspx

· Guidelines for Performing Online Index Operations http://msdn.microsoft.com/en-us/library/ms190981.aspx

· Rebuilding Indexes and Updating Statistics http://sqlskills.com/blogs/paul/post/Search-Engine-QA-10-Rebuilding-Indexes-and-Updating-Statistics.aspx

When you REORGANIZE (in older versions of SQL Server this was DBCC INDEXDEFRAG) it compares two pages at a time and works its way through the index. If it is killed before it stops no work done is lost. However, for a very defragmented index this operation will take much longer than REBUILD. The more fragmentation you have the longer this operation takes to complete. There may still be some fragmentation left after this operation completes.

When you REBUILD (DBCC DBREINDEX in earlier versions) a copy is made of the index and the new version is entirely sequential and compact. It is faster to use this method when the fragmentation level is high. REBUILD also does the equivalent of an UPDATE STATISTICS WITH FULLSCAN which gives very accurate statistics. The amount of time to complete has little to do with the amount of fragmentation and instead depends on the total amount of data, the data types, the index structure, etc. Internally this is pretty much equivalent to creating a new index (without the overhead of a sort) then dropping the old one within a transaction. For some indexes it is possible to do a minimally logged operation if you are not using the FULL recovery mode. Not using FULL recovery mode means you have a greater exposure to possible data loss in the case of certain types of system failures, this is not directly related to the REBUILD operation but only to the recovery mode. Extra precautions such as a log backup immediately before and after the change of recovery mode. For non-clustered indexes you can sometimes use the SORT_IN_TEMPDB option to avoid a sort. For tables with no XML or spatial indexes you may be able to set ONLINE=ON in Enterprise Edition so that users can continue to access the index while it is being rebuilt (there are several restrictions on when this is allowed). If there is little free space in the database the new index may not be completely defragmented.

FILLFACTOR:

The FILLFACTOR setting is related to fragmentation and may be changed during a REBUILD. You can set the FILLFACTOR to a higher or lower number to adjust how much free space is left on each page. The tradeoff is in space used vs. the cost of future page splits as data is added or rows get bigger (variable length or nullable columns get more data). The FILLFACTOR does not prevent a page from filling, it sets the amount of free space only at the time the index is created or rebuilt. By default the pages are created as full as possible so unless you have explicitly set your FILLFACTOR to a lower amount for one or more indexes then you won’t be able to save space that way. If you have to defragment an index more often than you would like, and the fragmentation is due mostly to page splits, then you might want to set a lower FILLFACTOR. If you rarely have to do defragmentation and you have a lot of empty space per page, you might want to increase the FILLFACTOR. If you want to guess at a “best” value on each index you need to know the average width of the rows, the clustered index key values (if you’re setting it for a non-clustered index), the expected amount of inserts and whether they will cause a lot of page splits, are inserts on a monotonically increasing key like IDENTITY, whether updates will cause a lot of row movement or changes in row size, whether deletes are likely to happen in random spots or at the “beginning” or “end”, the volume of data modifications compared to total rows and whether the data is compressed. All of which lead back to how much fragmentation do you expect (and does that fragmentation cause any perceivable performance problems) and how often can/will you defragment the indexes. So if you see a lot of fragmentation and when you defragment your performance improves (not always the case), have a lot of page splits (PerfMon: SQL Server Access Methods.Page Splits/sec), and have the space available to rebuild your index then a lower FILLFACTOR is one possible solution. When you rebuild the indexes it will often change the total space allocated to the table, partly due to FILLFACTOR and partly due to the fragmentation cleanup.

References:

· BOL Fill Factor http://msdn.microsoft.com/en-us/library/ms177459.aspx

· BOL Create Index http://msdn.microsoft.com/en-us/library/ms188783.aspx

· Database Maintenance Best Practices Part II – Setting FILLFACTOR

http://www.sqlskills.com/BLOGS/KIMBERLY/post/Database-Maintenance-Best-Practices-Part-II-e28093-the-most-important-setting-FILLFACTOR.aspx

· Who Cares about FillFactor? http://www.sqlservercentral.com/articles/Performance+Tuning/whocaresaboutfillfactor/1196/

· Kimberly Tripp on FILLFACTOR http://www.sqlskills.com/BLOGS/KIMBERLY/category/FILLFACTOR.aspx

· Top Tips for Effective Database Maintenance http://technet.microsoft.com/en-us/magazine/2008.08.database.aspx