SQL Server 2005 - Query Rewriting for Performance Tuning
By: Kasim Wirama, MCITP, MCDBA
I am often asked to fix slow-running stored procedures. Interested in knowing how? If so, read on…
I was asked by a user to work on a mission-critical stored procedure that did not perform as expected: it ran for more than 1 minute to return nearly 10,000 rows. Common sense says 10,000 rows is not much, but the responsiveness of a query depends on how complex it is. In theory, I believed SQL Server could return the result in under 1 second. Initially, the facts said otherwise.
I was assigned to tune the stored procedure (the client expected a response below 1 second). Understanding the logic behind the SQL statements the developer had written took a little time. In practice, I first check whether indexes that support the table joins are in place, and I also check the fragmentation level of the tables involved. In SQL Server 2005/2008, you can see the fragmentation level by querying the Dynamic Management Function sys.dm_db_index_physical_stats. For example, if you would like to know the fragmentation severity of the Production.Product table in AdventureWorks, from the root of the index down to the leaf level, issue the query below:
SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('Production.Product'), NULL, NULL, 'DETAILED');
Check the value in the avg_fragmentation_in_percent column. If you find a value greater than 30 percent, rebuild the index, as simple as that. In my case, I found some tables with fragmentation levels of up to 99%. After I rebuilt the indexes, the stored procedure performed better, returning its result in 10 seconds. Still, that was far from the client's expectation.
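The rebuild itself is a one-line statement. A minimal sketch, using the Production.Product table from the example above:

-- Rebuild all indexes on the table; rebuilding also refreshes the
-- statistics behind those indexes with a full scan.
ALTER INDEX ALL ON Production.Product REBUILD;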
When I reviewed the stored procedure, it had one central query with 6 table joins and 2 subqueries in the SELECT list. Two of the 6 joined tables referred to the same table under different aliases, and the 2 subqueries joined 2 tables, one of which referred to that same table as well. The client assigned their database team leader to review the stored procedure with me. He split the big query into 2 subqueries joined with UNION ALL, and we identified that the second subquery ran slowly when returning large results, whereas the first subquery returned quickly because it produced fewer than 20 rows. We tuned the second query by replacing “!=” with “=” in the WHERE clause. You may say that doesn’t make sense; but after we examined the data, it turned out the stored procedure only needed to process rows holding 2 particular values. With that modification, the stored procedure returned in 4 seconds. He said that 4 seconds still didn’t meet the SLA (service level agreement), which defined 1 second or less, and the project was on the verge of end-user resistance. As he put it to me: “We need to make it run in 1 second or less within 2 days; otherwise we lose our window to win over potential end users.” Obviously, more work was required: rewriting the query. In most cases I find that query rewriting gives a huge performance boost, but it takes more effort and time, particularly when the query is complex.
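To illustrate the predicate rewrite: the table, column, and values below are hypothetical stand-ins, since the real schema isn’t shown; the point is that the exclusion test hid the fact that only two values ever qualified.

-- Before: the exclusion predicate the optimizer costed badly.
SELECT o.OrderID FROM dbo.Orders AS o WHERE o.StatusCode != 'X';

-- After: the data showed only rows with two particular values matter,
-- so each UNION ALL branch tests one value with a plain equality.
SELECT o.OrderID FROM dbo.Orders AS o WHERE o.StatusCode = 'A'
UNION ALL
SELECT o.OrderID FROM dbo.Orders AS o WHERE o.StatusCode = 'B';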
The first step was to look at the execution plan. Everything looked good, with no index scans, but there was a hint there: a nested loop running thousands of times against its inner reference. If both the inner and outer references carry a significant actual number of rows, the plan should use a merge join or hash match instead; for this case, I would prefer a merge join. When I hovered the mouse over the execution plan arrow on the inner reference, I found a huge difference between the actual and estimated number of rows. The actual number of rows was nearly 10,000, while the estimate was only 1; because the query engine estimated just 1 row, the most efficient operator for it was NESTED LOOP. This kind of discrepancy is called a cardinality estimation error. I feel comfortable when the ratio of actual to estimated rows is under 10; in this case, the ratio was 10,000!
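If you prefer a text view over hovering in the graphical plan, the same comparison is available in tabular form. A minimal sketch (the procedure name is hypothetical):

-- Returns one row per plan operator, with the EstimateRows and Rows
-- (actual) columns side by side, so large discrepancies stand out.
SET STATISTICS PROFILE ON;
EXEC dbo.usp_GetProductReport;  -- hypothetical slow procedure
SET STATISTICS PROFILE OFF;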
I isolated the tables involved into a table variable. I chose a table variable instead of a temporary table because I didn’t need SQL Server to maintain statistics on it, and only one DML statement (a single INSERT) would be issued, so a table variable is more lightweight than a temporary table. For performance, I applied a unique index to the table variable. As I mentioned earlier, 2 table aliases referred to the same table; that table has over 40 million rows and one big column (nvarchar(max)), while the query needed only around 3,000 of its rows. I isolated the required rows of the big table into a table variable with a unique index, and I replaced both physical table references with the table variable under 2 different aliases; the execution time dropped to 700 ms. The challenge was then the 2 subqueries. After investigating the data, I found the first subquery returned no result under any condition, for any type of product in the database, so I simply replaced it with a blank character. The second subquery did return data, 2 rows for each outer row, while the expected output column was COALESCE(VALUE1, VALUE2), which needs both values in one row rather than 2. My trick was to use OUTER APPLY with a PIVOT inside it, transforming the 2 rows into 1 row with 2 columns, so the COALESCE became possible.
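Both rewrites can be sketched; every table and column name below is a hypothetical stand-in for the real schema. First, isolating the big table into a table variable with a unique index (in SQL Server 2005 the index is declared inline as a constraint):

-- Hypothetical slice of the 40-million-row table: only the ~3,000
-- rows the query needs, and none of the nvarchar(max) payload.
DECLARE @BigSlice TABLE
(
    ProductID int NOT NULL UNIQUE,   -- the unique index on the table variable
    ProductCode nvarchar(50) NOT NULL
);

INSERT INTO @BigSlice (ProductID, ProductCode)
SELECT b.ProductID, b.ProductCode
FROM dbo.BigTable AS b               -- hypothetical source table
WHERE b.CategoryID = 2;              -- hypothetical filter down to ~3,000 rows

The main query then joins @BigSlice twice under different aliases in place of the physical table. Second, the OUTER APPLY with a PIVOT that folds the subquery’s 2 rows into 1 row with 2 columns so COALESCE can be applied:

SELECT s.ProductID,
       COALESCE(x.Value1, x.Value2) AS ResolvedValue
FROM @BigSlice AS s
OUTER APPLY
(
    SELECT pvt.[1] AS Value1, pvt.[2] AS Value2
    FROM (SELECT d.AttributeType, d.AttributeValue
          FROM dbo.ProductDetail AS d        -- hypothetical detail table
          WHERE d.ProductID = s.ProductID) AS src
    PIVOT (MAX(AttributeValue) FOR AttributeType IN ([1], [2])) AS pvt
) AS x;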
The end result ran in around 900 ms, within their SLA (Service Level Agreement) expectation. Some people say SQL Server can’t handle many table joins and will slow down; but there must be “something” causing the situation, and one possibility is a cardinality estimation error. If you find a cardinality issue, check your indexes and statistics before proceeding to identify the tables involved and isolating them into a table variable or temporary table. Also identify big tables with LOB column(s) that are referenced many times, because they are likely to cause performance issues when joined with other tables.
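Checking statistics first is cheap. A minimal sketch, again using the AdventureWorks table from earlier:

-- Refresh all statistics on the table with a full scan, so the
-- optimizer bases its row estimates on current data.
UPDATE STATISTICS Production.Product WITH FULLSCAN;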
SQL Server offers an excellent return on investment as long as you utilize it optimally and troubleshoot the performance problems that may arise from an application. If you don’t need advanced features (such as table partitioning, parallel index operations, indexed views, full-fledged business intelligence, etc.), Standard Edition is sufficient for the case I describe above.