Befriending Dragons

Transform Tech with Anti-bullying Cultures



PASS BAC PREVIEW SERIES: SQL Professionals and the World of Self-service BI and Big Data

Are you excited about the upcoming PASS Business Analytics Conference? You should be! This conference will offer a wide range of sessions about Microsoft’s End to End Business Intelligence (including Self-Service BI), Analytics, Big Data, Architecture, Reporting, Information Delivery, Data Management, and Visualization solutions. Whether you are an implementer, a planner, or a decision maker there is something here for you!


What makes this conference different? Why should you put in the effort to attend this conference in particular? We are seeing a paradigm shift focused on shorter time to decision, more data available than ever before, and the need for self-service BI. There are exciting technology solutions being presented to deal with these needs, and new architectural skills are needed to implement them properly. Self-Service BI and Big Data are very different in many ways, but both respond to the same problem – the need for additional insights and less time spent getting to those insights and the resulting impactful decisions. Self-Service BI via PowerPivot, Power View, Excel, and existing and new data sources including HDInsight/Hadoop (usually via Hive) offers fast time to decision, but you still sometimes need Enterprise BI to add additional value via services such as data curation, data stewardship, collaboration tools, additional security, training, and automation. Add in the powerful new data sources available with Big Data technologies such as HDInsight/Hadoop that can also reduce time to decision and open up all sorts of new opportunities for insight, and you have many powerful new areas to explore. Not to mention that Dr. Steven Levitt, author of Freakonomics and SuperFreakonomics, is one of the keynote speakers!

Read more about my thoughts on Self-Service BI and Big Data in this #PASSBAC guest blog published today: PASS BAC PREVIEW SERIES: SQL Professionals and the World of Self-service BI and Big Data

And sign up for the session I am co-presenting at #PASSBAC with Eduardo Gamez of Intel: How Intel Integrates Self-Service BI with IT for Better Business Results

Take a look at all the information tagged with #PASSBAC and tweeted by @PASSBAC; there are some good blogs, preview sessions, and tidbits being posted. Get your own Twibbon for Twitter, Facebook, or however you want to use it; the Twibbon site will add a ribbon to the picture of your choice.


If you’re going to be in Chicago anyway, you might as well stay a few extra days for two nearby SQL Saturdays. The weekend before the conference take a short hop over to Madison, WI for #SQLSAT206 on April 6, 2013 at the Madison Area Tech College. Then head over to the bacon, uhhh, PASS BA CONference April 10-12. Stay one more day in Chicago (technically Addison, IL) for the #SQLSAT211 sessions at Devry. This is a great opportunity for even more SQL Server immersion and networking!

See you at #PASSBAC in Chicago in April!

@SQLCindy

Small Bites of Big Data



Taming the Tempdb Tempest – WI SQL Server Virtual User Group, 22 Apr 2011

Thanks to the Wisconsin Virtual SQL Server User Group for letting me talk about tempdb today! The slides and demo queries are attached. Once the recording is available I will update this blog with a link to it.

Taming the Tempdb Tempest

Summary:

·         Multiple data files of the same size, one log file

·         Enough data files to avoid contention, but not so many that they cause other problems.

·         Presize for peak periods over the next X months, then re-evaluate periodically

·         Set autogrow to be rare but “big enough”

·         Instant File Initialization on (small security risk)

·         Fast IO subsystem

·         Change size/settings if you add new features that use tempdb

·         Monitor for files approaching full and for changes in activity/size

·         Performance tune user databases and applications

·         Limit use of versioning or temp objects
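To make the summary concrete, here is a minimal sketch of presizing and adding tempdb files. The logical names tempdev and templog are the SQL Server defaults, but the drive path, file count, sizes, and growth increments below are placeholders you would adjust for your own workload and IO subsystem.

-- Presize the existing data and log files for the expected peak (sizes are examples only)
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 4096MB, FILEGROWTH = 512MB);
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, SIZE = 2048MB, FILEGROWTH = 512MB);
-- Add more equally sized data files to reduce allocation contention (the path is a placeholder)
ALTER DATABASE tempdb ADD FILE (NAME = tempdev2, FILENAME = 'T:\TempDB\tempdev2.ndf', SIZE = 4096MB, FILEGROWTH = 512MB);
ALTER DATABASE tempdb ADD FILE (NAME = tempdev3, FILENAME = 'T:\TempDB\tempdev3.ndf', SIZE = 4096MB, FILEGROWTH = 512MB);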

 

The demo queries are:

·         sys.dm_db_file_space_usage.sql: How space is used inside tempdb

·         sys.dm_db_file_space_usage_companion1.sql: Show how different activities cause space to be used in tempdb

·         Autogrow.sql: Find autogrow settings for all dbs on an instance

·         TempdbContention.sql: Find contention on tempdb metadata

 

Also see my previous blog post with the same basic data in a different format: Compilation of SQL Server TempDB IO Best Practices http://blogs.msdn.com/b/cindygross/archive/2009/11/20/compilation-of-sql-server-tempdb-io-best-practices.aspx

I also delivered the talk to the Boise SQL Server User Group on 13 Sep 2011 and the updated queries and slide deck are attached below.

TempdbBoiseSQLSep2011.zip



New Releases – SQL Server Migration Assistant (SSMA)

We have shipped some updates to our migration tool – the SQL Server Migration Assistant (SSMA). It is now easier than ever to migrate to SQL Server 2005, 2008, and/or 2008 R2 from Access, Oracle, Sybase, and MySQL. You can even use SSMA to migrate from Access or MySQL to SQL Azure.

 

Customers and partners can provide SSMA feedback via ssmateam@microsoft.com.

 

·         SSMA team blog http://blogs.msdn.com/b/ssma/

·         Microsoft announces SQL Server Migration Assistant for MySQL http://blogs.msdn.com/b/ssma/archive/2010/08/12/microsoft-announces-sql-server-migration-assistant-for-mysql.aspx

·         Migrating to SQL Server 2008 http://www.microsoft.com/sqlserver/2008/en/us/migration.aspx

Links to downloads: http://blogs.msdn.com/b/ssma/archive/2010/08/12/microsoft-announces-sql-server-migration-assistant-for-mysql.aspx



Memory Trimming: A significant part of sql server process memory has been paged out.

I have written about this topic before (in my old SQLCommunity blog) but I still see lots of questions about it. So here is a simplified explanation of this memory trimming issue that is common with SQL Server on Windows 2003.

On Windows 2003 you may occasionally see memory trimming messages in your SQL Server error logs: “A significant part of sql server process memory has been paged out. This may result in a performance degradation.” Generally you will see this at the same time that the performance of your SQL Server becomes very bad. What happens is that Windows has to free up memory rapidly and cannot wait for the normal, polite request (“hey, I need some memory, are there any apps willing to give me some?”). This can happen due to severe memory pressure, when certain APIs are called, or due to certain Windows 2003 bugs (most of which are fixed in Windows 2003 SP2 or post-SP2 hotfixes). Windows 2003 responds to this pressure by trimming GBs of memory at once. Windows 2008 responds much better: it trims MBs at a time instead and typically does not cripple running applications such as SQL Server. Therefore it is expected that you will NOT see this scenario on Windows 2008 or later.

 

Ideally you would find and resolve the reason for the external memory pressure, but in many cases that is impractical and a workaround may alleviate the problem. On Windows 2003 you can try combining the “lock pages in memory” (LPIM) setting with a “max server memory” setting such that all instances of SQL added together still leave “enough” memory for everything else running on the box (SQL Agent, SSIS/DTS, RS, AS, anti-virus, backups, batch processes, services, etc.). LPIM keeps Windows from trimming SQL’s buffer pool. Do NOT ever enable LPIM without also setting “max server memory” for all instances to an appropriate value. There are rare cases where LPIM can cause more problems than it solves: if Windows still faces this type of memory pressure it has to trim SQL Server memory outside the buffer pool, which could mean trimming the mapped SQL exes and dlls and causing even bigger performance problems. So monitor the system carefully after you add LPIM and make sure you’ve properly set the “max server memory” setting. The currently recommended fix is to upgrade to Windows 2008 or Windows 2008 R2, where memory trimming is handled much differently.
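As a rough sketch of the “max server memory” half of that workaround: the value below is purely illustrative (it assumes a single instance on a box with considerably more RAM than that), and LPIM itself is granted to the SQL Server service account through the Windows Local Security Policy rather than through T-SQL.

-- Cap SQL Server memory so the OS and other processes keep enough (value is an example only)
EXEC sys.sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure 'max server memory (MB)', 28672;
RECONFIGURE;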

 

2001745 Working Set Trimming can negatively impact SQL, Exchange, and Operating System performance under Windows 2003

http://support.microsoft.com/kb/2001745

918483 How to reduce paging of buffer pool memory in the 64-bit version of SQL Server

http://support.microsoft.com/default.aspx?scid=kb;EN-US;918483

How to enable the Lock Page in Memory option (Windows)

http://msdn.microsoft.com/en-us/library/ms190730(SQL.90).aspx



CTP of SP2 for SQL Server 2008 is now available

SQL Server 2008 will soon have a new service pack – SP2. SQL Server service packs go through a CTP (aka beta) cycle prior to release. The CTP of SP2 is now available for you to download and test. This is only the pre-release version, so only use it for testing.

 

·         SQL Server 2008 Service Pack 2 CTP http://www.microsoft.com/downloads/details.aspx?FamilyID=65606fdd-093f-4c70-91f6-dc1f24520e8f&displaylang=en

·         SQL Server 2008 Express Edition Service Pack 2 – CTP http://www.microsoft.com/downloads/details.aspx?FamilyID=c56e753d-13ef-4b95-a5eb-0b470bf56341&displaylang=en

·         Feedback via Connect http://connect.microsoft.com/SQLServer/Feedback

·         SQL Server 2008 Service Pack 2 Customer Technology Preview Available http://blogs.msdn.com/b/sqlreleaseservices/archive/2010/07/07/sql-server-2008-service-pack-2-customer-technology-preview-available.aspx

“Today marks the release of the Customer Technology Preview (CTP) for Microsoft SQL Server 2008 Service Pack 2.  Microsoft SQL Server 2008 Service Pack 2 includes:

·         All hotfixes released up to Microsoft SQL Server 2008 SP1 CU 8.

·         Updates to the Reporting Services Add-in for Microsoft SharePoint Products 2007 allowing customer to connect to SQL Server 2008 R2 Report Servers.

·         Support for SQL Server 2008 R2 version of Management Studio to perform operations for data-tier applications (DAC) in SQL Server 2008 instances that have SP2 installed.  For more information please visit http://msdn.microsoft.com/en-us/library/ee240739.aspx.

·         Support for SQL Server 2008 R2 Utility Explorer and SQL Server Utility viewpoints in (SSMS) provide administrators a holistic view of resource health through an instance of SQL Server that serves as a utility control point (UCP).  For more information please visit http://msdn.microsoft.com/en-us/library/ee210557.aspx.

 We encourage customers to try Microsoft SQL Server 2008 SP2 CTP, and provide us your feedback.  We are actively reading all feedback and appreciate any thoughts that you might have to help us with this product update.  To provide feedback please visit the Microsoft Connect Feedback Center.”

·         List of the bugs that are fixed in SQL Server 2008 Service Pack 2 http://support.microsoft.com/kb/979450

 

SQL Server Servicing plans

http://blogs.msdn.com/sqlreleaseservices/archive/2010/02/12/sql-server-servicing-plans.aspx

 

·         SQL Server 2008 SP2 is targeted for the third quarter of calendar year 2010.

·         SQL Server 2005 SP4 is targeted for the fourth quarter of calendar year 2010. This is planned to be the last SQL Server 2005 service pack.

 

SQL Server Service Pack support (mostly) ends 12 months after the next service pack releases. So 1 year after SQL Server 2008 SP2 releases you need to have all SQL Server 2008 instances on SP2 (and optionally with SP2-based cumulative updates (CUs) applied). There is very limited support for older service packs under our new SP policy.
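If you need to take inventory first, here is a quick sketch of how to see what an instance is running (ProductLevel returns the service pack level, such as SP1):

-- Check the version, service pack level, and edition of the current instance
SELECT SERVERPROPERTY('ProductVersion') AS ProductVersion,
       SERVERPROPERTY('ProductLevel') AS ProductLevel,
       SERVERPROPERTY('Edition') AS Edition;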

 

·         Service Pack support (general) http://support.microsoft.com/?LN=en-us&scid=gp%3B%5Bln%5D%3Blifecycle&x=17&y=10#ServicePackSupport

·         SQL Server lifecycle http://support.microsoft.com/gp/lifeselectindex#S



Windows storport enhancement to help troubleshoot IO issues

For Windows 2008 and Windows 2008 R2 you can download a Windows storport enhancement (packaged as a hotfix). This enhancement can lead to faster root cause analysis for slow IO issues. Once you apply this Windows hotfix you can use Event Tracing for Windows (ETW) via perfmon or xperf to capture more detailed IO information that you can share with your storage team. This could be very useful for your SQL Server installations or any system where you suspect IO issues.

 

Enhancement information and background topics:

·         Storport ETW Logging to Measure Requests Made to a Disk Unit http://blogs.msdn.com/b/ntdebugging/archive/2010/04/22/etw-storport.aspx

·         Diagnose Slow I/O on Windows With Xperf https://www.youtube.com/watch?v=0h-i_EfzadA (from the MSFT guys who wrote it – based on a pre-release version but the basics still apply)

·         xPerf: Once I Was Blind and Now I See http://blogs.msdn.com/b/jimmymay/archive/2009/09/08/xperf-once-i-was-blind-and-now-i-see.aspx

·         Storport Driver http://msdn.microsoft.com/en-us/library/ff567541(VS.85).aspx

The hotfix downloads are available from the top of each article and instructions are included in the body of each KB:

·         KB 979764 (Windows Server 2008 hotfix)

·         KB 978000 (Windows Server 2008 R2 hotfix)

·         http://msdn.microsoft.com/en-us/performance/cc825801.aspx (xperf downloads)



DBCC CHECKDB / Database Integrity

It is very important to understand DBCC CHECKDB. Make sure you run DBCC CHECKDB periodically and that if you do find some sort of corruption you address it immediately – both the current problem in the SQL Server database and the underlying problem that caused it.

 

·         How often should we run CHECKDB?

o   There is no single best schedule for how often to run CHECKDB. You have to understand how much risk you are willing to take and balance that with the impact of running CHECKDB. Paul Randal’s blog says to run some sort of consistency check “at least once a week”.

o   On systems that are not used 24 hours a day where CHECKDB is able to complete during the low/no use period, maybe you run it every week or even every day. On systems where CHECKDB cannot complete within a downtime/slow period then either you take the performance hit or you run it less frequently.

o   If you have systems that have been up for years with no corruption, does that mean you run it less often? Maybe, but remember that the older the hardware is the more likely it is to begin to wear out or even fail, and bad hardware is the cause of almost all database corruption problems.

·         How can we make it run faster?

o   Indirectly, run it less often. 🙂

§  Instead of running it for every database every night, maybe run it for half the dbs one night and the other half the next night.

§  Do one database a night. Either round robin equally or do more important ones more frequently.

§  Check individual tables with CHECKTABLE and don’t do them all at once (and frequently do the other database level checks that are included in CHECKDB – CHECKALLOC and CHECKCATALOG).

§  Check one partition (assuming each is on a separate filegroup) at a time with CHECKFILEGROUP.

§  Run CHECKDB more frequently on data that changes frequently and less often on less frequently updated or less important data.

§  Create a program that runs one checkdb, then checks to see how much time is left and maybe runs another, looping through until it reaches the end of your batch window. Fancier versions might look at runtime history to estimate whether the “next” check can finish before the batch window ends, when each database was last checked, a priority you add to each database, etc.

§  Use whatever schedule balances your ability to handle the risk of having corruption without knowing vs. the overhead of checking for the corruption.

o   Make the IO faster and/or add more CPUs.

o   Reduce other activity on the system while CHECKDB is running, especially data modifications.

o   Reduce the amount of space used in the database.

§  Archive unused data.

§  Drop redundant, duplicate, or unused indexes.

o   Use different options such as PHYSICAL_ONLY combined with having CHECKSUM enabled. Run a full CHECKDB less often.

o   Restore the backup and run CHECKDB on the restored copy. This has the added benefit of testing the consistency of the log and testing the viability of the backup file itself (and the restore process).

o   Use the TABLOCK option for CHECKDB. This may make the CHECKDB itself run faster BUT can result in some blocking.

·         Options

o   Generally you will want WITH ALL_ERRORMSGS on. The default was OFF in older versions but is ON in newer versions, so just specify it explicitly to avoid a mistake. A caveat here is that SSMS has been written to limit the output of CHECKDB to the first 1000 rows, so you’re better off running CHECKDB from a tool like SQLCMD or OSQL. This has the added benefit of avoiding the overhead of writing all that output to the memory of the client box (or possibly the server if you run SSMS there). See the sample commands after this list.

o   NO_INFOMSGS is commonly used so you only have actual errors in the output, but sometimes the informational info can be useful to have.

o   PHYSICAL_ONLY changes the impact from a CPU bound process to an IO bound process.

·         Finding errors

o   If you have corruption, DBCC CHECKDB will NOT automatically abend! It will “successfully” complete. In SQL 2005 and later a “successful” CHECKDB will update the last known good run for that db.

o   If SQL Agent is used to schedule the CHECKDB then under some conditions it may abend when corruption is detected. This is a feature of SQL Agent, not a feature of DBCC. So if you use another scheduling tool that tool would have to detect @@error or use another method to look for CHECKDB failures. Within SQL Agent, you will probably want to change your job steps so that a failure in checking one database does NOT fail the entire job, but instead continues checking the next database in the next job step.

o   When CHECKDB detects a corruption error it writes a log SQLDUMPnnnn.txt to the SQL log directory.

o   Try/Catch only catches exceptions and DBCC CHECK* errors are not exceptions. Therefore Try/Catch will not detect DBCC errors. 953503 The DBCC CHECKDB statement does not raise exceptions in SQL Server when the DBCC CHECKDB statement detect a consistency error in the database http://support.microsoft.com/default.aspx?scid=kb;EN-US;953503

·         Checking the log

o   CHECKDB checks the data files. It does NOT check the log. The only way to ensure the log is not corrupted is to test a restore and see if it’s successful.

·         Suspect_pages table

o   In SQL 2005 and later a table in MSDB called suspect_pages tracks page IO errors 823 and 824. It does NOT track metadata, header, or linkage corruption errors/problems. It also does not track 825 errors which are early indicators that often later become 823/824 errors.

o   It is populated ONLY if a corrupted page is read into memory and if that page has an 823 or 824 problem.

o   Only the first 1000 rows are kept, so don’t rely on this table to have every error if you have massive or long-running corruption. You need to monitor and clean up this table.

o   These pages are “suspected” to have problems, but it is possible the problem was transient or has already been fixed.

·         Backups

o   The only way to verify a backup is good/usable is to restore it. Let me say it again – you don’t know you have a good backup until you have restored it!

o   You can check that the backup header is readable with RESTORE VERIFYONLY, but that only checks the header and checksums!

o   Enabling CHECKSUM on the database plus specifying WITH CHECKSUM on the backup will detect many types of problems, but not all.

o   Even if the file is good at the time you take it, there’s no guarantee it will remain good. The media it is on can potentially cause corruption of the backup file(s).

o   Did I mention that you don’t know you have a good backup until you have restored it?

·         Checksum database option

o   A SQL Server DBA myth a day: (27/30) use BACKUP WITH CHECKSUM to replace DBCC CHECKDB http://sqlskills.com/BLOGS/PAUL/post/A-SQL-Server-DBA-myth-a-day-(2730)-use-BACKUP-WITH-CHECKSUM-to-replace-DBCC-CHECKDB.aspx

o   Once CHECKSUM has been enabled at the database level (ALTER DATABASE … SET PAGE_VERIFY CHECKSUM) any data pages that are modified will have a checksum added. Until every page is modified there will be pages that have no checksum.

o   CHECKSUM is calculated for a data page the first time that page is modified after the ALTER DATABASE.

o   CHECKSUM values on pages are only checked when those pages are read into memory so rarely used pages can be corrupted for a long time without being detected! Common ways to read the pages and see if the checksum values are still valid: BACKUP WITH CHECKSUM and CHECKDB.

o   CHECKSUM causes an error to be reported; it cannot correct the problem.

o   CHECKSUM doesn’t do anything about in-memory corruptions or any corruption errors other than 823, 824, 825.

o   CHECKSUM does NOT apply to/report problems with logical meta data, page linkages, stale reads, or index ordering.

·         Avoid corruption

o   Make sure your hardware is sound.

o   Alert on hardware level errors and take action on them.

o   Keep your firmware and storport drivers updated.

o   Alert on 823, 824, and 825 errors. If you already alert on severity 19+ errors that will capture 823 and 824 errors but not 825.

o   Run SQLIOSim to check that the IO subsystem is still properly handling data correctness.

·         What should I do when there are errors?

o   Check the status of your backups and start thinking about the consequences of restoring them.

o   The REPAIR_ALLOW_DATA_LOSS option is your LAST resort. If you use it, you WILL lose data and you may very well never know how much or what was lost.

o   Fix the underlying problem! If you see corruption once it will almost certainly happen again on the same system. The vast majority of the time the problem is with the hardware layer. Sometimes it is that the actual disk failed, sometimes it’s something less direct like failed batteries on the write controller combined with a power outage. It could potentially be firmware, a filter driver (Diskeeper recently had a bug), or other problems. There was a rare bug (KB 954734) in SQL 2005 prior to SP3 (and SP3 is now the minimum supported version of 2005) where using the old DBREINDEX syntax instead of the new ALTER INDEX REBUILD/REORG syntax on a heap could corrupt the index and you would have to rebuild the index again. So you MUST find the source of the problem and increase monitoring on any system that has had a corruption problem.

o   Understand the errors. Corruption in a non-clustered index can usually be fixed by rebuilding the index. Other types of errors can require a restore of a backup. Read this blog from the guy who wrote CHECKDB: http://www.sqlskills.com/BLOGS/PAUL/post/CHECKDB-From-Every-Angle-Tips-and-tricks-for-interpreting-CHECKDB-output.aspx
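To tie several of the points above together, here is a minimal sketch of the kinds of commands discussed. AdventureWorks and the backup path are placeholders for your own database and file locations, and this is an illustration rather than a recommended schedule.

-- Full consistency check, suppress informational messages, show every error
DBCC CHECKDB (N'AdventureWorks') WITH NO_INFOMSGS, ALL_ERRORMSGS;
-- Lighter-weight physical-only check; pair it with page checksums on the database
ALTER DATABASE AdventureWorks SET PAGE_VERIFY CHECKSUM;
DBCC CHECKDB (N'AdventureWorks') WITH PHYSICAL_ONLY;
-- Back up with checksums and verify the backup header/checksums (only a restore truly proves the backup)
BACKUP DATABASE AdventureWorks TO DISK = N'B:\Backups\AdventureWorks.bak' WITH CHECKSUM;
RESTORE VERIFYONLY FROM DISK = N'B:\Backups\AdventureWorks.bak' WITH CHECKSUM;
-- Review pages flagged with 823/824 errors (remember to monitor and clean up this table)
SELECT database_id, file_id, page_id, event_type, error_count, last_update_date
FROM msdb.dbo.suspect_pages;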

 

General info:

·         Paul Randal’s collection of Corruption and CHECKDB blogs

o   http://www.sqlskills.com/BLOGS/PAUL/category/Corruption.aspx

o   http://www.sqlskills.com/BLOGS/PAUL/category/CHECKDB-From-Every-Angle.aspx

·         Sample corrupt databases to play with  

o   http://www.sqlskills.com/BLOGS/PAUL/post/Sample-corrupt-databases-to-play-with.aspx

o   http://www.sqlskills.com/BLOGS/PAUL/post/Conference-corruption-demo-scripts-and-example-corrupt-databases.aspx

o   http://www.sqlskills.com/pastConferences.asp

·         SQL Server: Is CHECKDB A Necessity?

http://blogs.msdn.com/psssql/archive/2009/02/20/sql-server-is-checkdb-a-necessity.aspx

·         Transaction Log (block) CHECKSUM
http://sqlserverpedia.com/blog/sql-server-2005/transaction-log-block-checksum/

·         Which part of ‘REPAIR_ALLOW_DATA_LOSS’ isn’t clear? http://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/16/633645.aspx

·         Misconceptions around database repair http://www.sqlskills.com/BLOGS/PAUL/post/Misconceptions-around-database-repair.aspx

·         SQL Server 2008 Database Checking  http://blogs.technet.com/josebda/archive/2009/03/20/sql-server-2008-checking-databases.aspx

 

Checking the output

·         Automate your Database Integrity checks using SMO http://www.simple-talk.com/sql/backup-and-recovery/automate-your-database-integrity-checks-using-smo/

·         Inserting DBCC CHECKDB Results Into A Table http://www.lockergnome.com/sqlsquirrel/2009/03/11/inserting-dbcc-checkdb-results-into-a-table/



SQL Server lifecycle and the compatibility level settings

I’ve had several people ask how the compatibility mode/level fits in with the SQL Server lifecycle. For example, the question might be: once we no longer support SQL Server 2000, does that mean we don’t support databases restored/attached to later versions of SQL and still set to 80 compatibility mode? The answer is that each version of SQL Server has a set of compatibility levels it supports. That support does not change based on the lifecycle. So for example, SQL Server 2008 R2 supports compatibility levels 80 (SQL 2000), 90 (SQL 2005), and 100 (SQL 2008). When you restore a SQL Server 2000 database to SQL Server 2008 R2 it remains in 80 compatibility mode until you issue ALTER DATABASE {dbname} SET COMPATIBILITY_LEVEL = 100 (in older versions use sp_dbcmptlevel). When extended support for SQL Server 2000 ends in 2013, Microsoft will not support SQL Server 2000 any more. But we will still support a database on SQL Server 2008 R2 that is in 80 compatibility mode because that is a supported level/mode for R2. So the lifecycle refers to the actual instance, not the database level.

I want to add some information here that a lot of people don’t understand. The compatibility level affects only some of the behavior of that database, mostly related to syntax and available features. A database set to 80 compatibility on SQL Server 2008 R2 is NOT the same as a database actually running on SQL Server 2000. For example, you can NOT restore the 2008 R2 80 compatibility mode database back to a 2000 instance. The optimizer is the 2008 R2 optimizer, not the 2000 optimizer. The compatibility mode/level is meant to be a temporary workaround while you fix syntax or keyword issues.
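As a quick sketch, you can check and change the level like this (AdventureWorks is just an example database name; sp_dbcmptlevel is the older equivalent for SQL Server 2005 and earlier):

-- See the current compatibility level of every database on the instance
SELECT name, compatibility_level FROM sys.databases;
-- SQL Server 2008 / 2008 R2 syntax
ALTER DATABASE AdventureWorks SET COMPATIBILITY_LEVEL = 100;
-- Older syntax for SQL Server 2005 and earlier
EXEC sp_dbcmptlevel 'AdventureWorks', 90;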



New SQL Server Setup Portal is Live!

There’s a new setup portal to help you out with any SQL Server 2008 or SQL Server 2008 R2 setup issues. You can search for setup content, get downloads, watch videos and more. Learn how to slipstream (new in SQL 2008 SP1) or sysprep (new in R2). Check it out and let us know what you think!

SQL Server Setup http://technet.microsoft.com/en-us/sqlserver/ff625277.aspx



DMV series: sys.dm_db_partition_stats

DMV: sys.dm_db_partition_stats

 

This DMV returns information for the current database context. It can tell you how many rows are in each table (per partition) and how much of the space is used and reserved for in-row, overflow, and LOB data (with or without the metadata overhead).

 

Remember that the stored count information was not always 100% accurate in SQL Server 2000. For a new table created on 2005 the counts will be accurate. But for a table that existed in 2000 and now resides on 2005 through a restore or upgrade, you need to run (only once after the move to 2005) either sp_spaceused @updateusage = N'true' or DBCC UPDATEUSAGE with the COUNT_ROWS option.

 

Regarding performance with LOB (text, image, XML, max) or overflow (variable length) data: You can end up with “extra” IOs and more page allocations/deallocations, which can potentially make performance worse. They are also processed synchronously which can slow down joins and sorts. If a column is indexed its data cannot be pushed offrow.

·         Row-Overflow Data Exceeding 8 KB http://msdn.microsoft.com/en-us/library/ms186981.aspx
Surpassing the 8,060-byte row-size limit might affect performance because SQL Server still maintains a limit of 8 KB per page. When a combination of varchar, nvarchar, varbinary, sql_variant, or CLR user-defined type columns exceeds this limit, the SQL Server Database Engine moves the record column with the largest width to another page in the ROW_OVERFLOW_DATA allocation unit, while maintaining a 24-byte pointer on the original page. Moving large records to another page occurs dynamically as records are lengthened based on update operations.

 

SQL 2000 did support storing LOBs (text, image) offrow, but it had no concept of overflow rows for variable length data types. If the actual length of a row with variable length data types would exceed 8060 bytes the insert or update would fail. In 2005+ the data over 8060 bytes is moved offrow, leaving behind a 24 byte pointer.

·         Row Size limitation in SQL 2000 and 2005 http://blogs.msdn.com/msdnts/archive/2006/12/01/row-size-limitation-in-sql-2000-and-2005.aspx

·         Stretching the 8K Row Limit  http://www.sqlmag.com/Articles/ArticleID/50490/50490.html?Ad=1

 

Sample Queries

-- DMV: sys.dm_db_partition_stats
-- Returns information for the current database
-- Key data: row count per partition, pages of in-row, overflow, and LOB data, used vs reserved pages
-- Common join objects: sys.allocation_units, sys.partitions, sys.indexes
-- A table with no explicit partitions has partition_number=1
-- Permission needed: VIEW DATABASE STATE

USE AdventureWorks

/*
-- For any databases upgraded or restored from SQL 2000, you have to update the count info once
EXEC sp_spaceused @updateusage = N'true'
DBCC UPDATEUSAGE (0) WITH COUNT_ROWS -- 0 = current db
*/

-- Run the CLEANUP section when you have finished exploring the DMV

/*
-- show that you have to account for the schema too (since table has same name as Person.Address)
-- show how to force a row into LOB_DATA
-- start with inserting just one row and run all the select queries
-- then insert the LOB row and run the selects
-- then run the overflow insert and run the selects
-- DROP TABLE dbo.Address
CREATE TABLE dbo.Address (col1 varchar(max), col2 varchar(max), col3 nvarchar(4000), col4 nvarchar(4000), col5 nvarchar(4000));
INSERT dbo.Address VALUES ('a', 'b', 'c', 'd', 'e'); -- no overflow
INSERT dbo.Address VALUES (REPLICATE ('a', 8000),REPLICATE ('b', 8000), NULL, NULL, NULL); -- LOB_DATA
INSERT dbo.Address VALUES (NULL, NULL, REPLICATE ('a', 4000),REPLICATE ('b', 4000),REPLICATE ('c', 4000)); -- overflow
--SELECT DATALENGTH(col1), DATALENGTH(col2), DATALENGTH(col3), DATALENGTH(col4), DATALENGTH(col5),* FROM dbo.Address

-- add a table with partitions
DROP TABLE a1
DROP PARTITION SCHEME myRangePS1
DROP PARTITION FUNCTION myRangePF1

CREATE PARTITION FUNCTION myRangePF1 (int)
AS RANGE LEFT FOR VALUES (1, 100, 1000);
GO
CREATE PARTITION SCHEME myRangePS1
AS PARTITION myRangePF1
TO ('primary', 'primary', 'primary', 'primary');
GO
CREATE TABLE a1
(col1 int) ON myRangePS1 (col1)
GO
INSERT a1 VALUES (1)
INSERT a1 VALUES (10)
INSERT a1 VALUES (101)
INSERT a1 VALUES (1001)
*/

-- Simple query with no joins or attempt to get the true count per table
-- Note that it returns multiple rows per table when there are multiple indexes, these row counts should NOT be added together
-- It also returns multiple rows per table when the table is partitioned, these DO need to be added together
SELECT ps.row_count, OBJECT_SCHEMA_NAME(ps.object_id) AS SchemaName, OBJECT_NAME(ps.object_id) AS ObjectName, ps.index_id, ps.partition_number
, *
FROM sys.dm_db_partition_stats ps
ORDER BY OBJECT_NAME(ps.object_id), OBJECT_SCHEMA_NAME(ps.object_id)

-- Still doesn't take into account how to SUM the rows
-- Shows the most interesting columns and calculates the total size
-- You could use the "used" or "data page" columns instead of the "reserved" columns, explore this on your own
SELECT ps.row_count, OBJECT_SCHEMA_NAME(ps.object_id) AS SchemaName, OBJECT_NAME(ps.object_id) AS ObjectName, ps.index_id, ps.partition_number
, au.type_desc, au.total_pages AS TotalPages, au.total_pages * 8 /*8k pages*/ /1024 /*1024k per mb*/ AS TotalMB
FROM sys.dm_db_partition_stats ps
JOIN sys.allocation_units au ON ps.partition_id = au.container_id
ORDER BY OBJECT_NAME(ps.object_id), OBJECT_SCHEMA_NAME(ps.object_id)
-- do not order by size or pages as we haven't done a group by to add partitions together

-- Query from http://sqlblog.com/blogs/louis_davidson/archive/2007/07/11/sys-dm-db-partition-stats.aspx
-- "Get rowcount of tables. Note that I grouped on the object_id, because for a partitioned table,
-- you need to add all of the rows in all partitions."
select object_name(dm_db_partition_stats.object_id),
sum(dm_db_partition_stats.row_count) as row_count
from sys.dm_db_partition_stats
join sys.indexes
on indexes.object_id = dm_db_partition_stats.object_id
and indexes.index_id = dm_db_partition_stats.index_id
where indexes.type_desc in ('CLUSTERED','HEAP')
group by dm_db_partition_stats.object_id

-- total rowcount as modified by cgross
SELECT OBJECT_SCHEMA_NAME(ps.object_id) AS SchemaName, OBJECT_NAME(ps.object_id) AS ObjectName,
SUM(ps.row_count) AS row_count
FROM sys.dm_db_partition_stats ps
JOIN sys.indexes i
ON i.object_id = ps.object_id
AND i.index_id = ps.index_id
WHERE i.type_desc IN ('CLUSTERED','HEAP')
AND OBJECT_SCHEMA_NAME(ps.object_id) <> 'sys'
GROUP BY ps.object_id
ORDER BY OBJECT_NAME(ps.object_id), OBJECT_SCHEMA_NAME(ps.object_id)

-- rowcount plus inrow, LOB, overflow info
-- reserved info includes allocated data, index data, metadata
SELECT OBJECT_SCHEMA_NAME(ps.object_id) AS SchemaName, OBJECT_NAME(ps.object_id) AS ObjectName,
SUM(ps.row_count) AS row_count,
SUM(ps.in_row_reserved_page_count) AS InRowReservedPgs,
SUM(ps.lob_reserved_page_count) AS LOBReservedPgs,
SUM(ps.row_overflow_reserved_page_count) AS OverflowReservedPgs,
SUM(ps.reserved_page_count) AS TotalReservedPgs
FROM sys.dm_db_partition_stats ps
JOIN sys.indexes i
ON i.object_id = ps.object_id
AND i.index_id = ps.index_id
WHERE i.type_desc IN ('CLUSTERED','HEAP')
AND OBJECT_SCHEMA_NAME(ps.object_id) <> 'sys'
GROUP BY ps.object_id
ORDER BY OBJECT_NAME(ps.object_id), OBJECT_SCHEMA_NAME(ps.object_id)

/*
-- Per BOL "The container ID has a one-to-one mapping to the partition_id in the sys.partitions catalog view that maintains the relationship between the table, index, or the indexed view data stored in a partition and the allocation units used to manage the data within the partition."
-- more info on sys.allocation_units in http://www.sqlmag.com/articles/index.cfm?articleid=50257&pg=2
SELECT * FROM sys.allocation_units
-- WHERE allocation_unit_id = 72057594042974208 OR container_id = 72057594042974208 -- partition_id of Address table
WHERE container_id IN (72057594043957248,72057594038321152) -- two partition_ids for databaselog

-- Per BOL "The sys.system_internals_allocation_units system view is reserved for Microsoft SQL Server internal use only. Future compatibility not guaranteed."
SELECT * FROM sys.system_internals_allocation_units
WHERE allocation_unit_id = 72057594042974208 OR container_id = 72057594042974208 -- partition_id of Address table
*/

/*
References and explanations:
sys.dm_db_partition_stats (Transact-SQL) http://msdn.microsoft.com/en-us/library/ms187737.aspx
Diving In to Object-Storage Metadata http://www.sqlmag.com/articles/index.cfm?articleid=50257&pg=2

-- UpdateUsage
Remember that the stored count information was not always 100% accurate in SQL Server 2000.
For a new table created on 2005 the counts will be accurate.
But for a table that existed in 2000 and now resides on 2005 through a restore or upgrade,
you need to run (only once after the move to 2005) either sp_spaceused @updateusage = N'true' or
DBCC UPDATEUSAGE with the COUNT_ROWS option.

-- Performance
Performance can potentially be worse with LOB (text, image, XML, max) or overflow (variable length) data.
You can end up with "extra" IOs and more page allocations/deallocations, which can potentially make
performance worse. They are also processed synchronously which can slow down joins and sorts.
If a column is indexed its data cannot be pushed offrow.
• Row-Overflow Data Exceeding 8 KB http://msdn.microsoft.com/en-us/library/ms186981.aspx
Surpassing the 8,060-byte row-size limit might affect performance because SQL Server still maintains a limit of 8 KB per page. When a combination of varchar, nvarchar, varbinary, sql_variant, or CLR user-defined type columns exceeds this limit, the SQL Server Database Engine moves the record column with the largest width to another page in the ROW_OVERFLOW_DATA allocation unit, while maintaining a 24-byte pointer on the original page. Moving large records to another page occurs dynamically as records are lengthened based on update operations.

-- SQL 2000
SQL 2000 did support storing LOBs (text, image) offrow, but it had no concept of overflow rows for variable length data types. If the actual length of a row with variable length data types would exceed 8060 bytes the insert or update would fail. In 2005+ the data over 8060 bytes is moved offrow, leaving behind a 24 byte pointer.
• Row Size limitation in SQL 2000 and 2005 http://blogs.msdn.com/msdnts/archive/2006/12/01/row-size-limitation-in-sql-2000-and-2005.aspx
• Stretching the 8K Row Limit  http://www.sqlmag.com/Articles/ArticleID/50490/50490.html?Ad=1
*/

/*
--CLEANUP
DROP TABLE a1
DROP PARTITION SCHEME myRangePS1
DROP PARTITION FUNCTION myRangePF1
DROP TABLE dbo.Address
*/