April 2008 - Posts
Customize Black Block Trace
By : Kasim Wirama, MCDBA
SQL Server has background running trace by default that records some events. It is useful when your SQL Server or its hosting environment had crashed, so you can look some queries and investigate them right before the failure. It listens for SP:Starting, SQL:BatchStarting, Exception and Attention events, with some columns for each of the events.
You can enable black box trace by setting @options parameter = 8 from system stored procedure sp_trace_create. And you run the event by executing sp_trace_setstatus with status = 1 and passing @traceId value, here is the script :
DECLARE @traceid INT;
EXEC sp_trace_create @traceid OUTPUT, @options = 8;
EXEC sp_trace_setstatus @traceid, 1
By default, it generates 2 trace files, and flip back and forth between them. Size of each is 5 MB and located in default DATA folder of SQL Server installation directory. For example here I change default size 5 MB to 10 MB into @maxfilesize parameter.
DECLARE @traceid INT, @maxfilesize BIGINT;
SET @maxfilesize = 10
EXEC sp_trace_create @traceid OUTPUT, @options = 8, @tracefile = NULL, @maxfilesize = @maxfilesize;
EXEC sp_trace_setstatus @traceid, 1
When you specify @maxfilesize, you need to pass value for @tracefile even though the value is NULL.
You can make your black box trace start automatically when SQL Server start by wrapping it inside a stored procedure and create the stored procedure in master database. Then you execute sp_procoption system stored procedure below :
EXEC sp_procoption ‘your stored procedure’, ‘STARTUP’, ‘ON’
Detecting Blocking Problem
Detecting Blocking Problem
By : Kasim Wirama, MCDBA
SQL Server uses locks as ACID
implementation. Locks ensure data consistency in multiuser environment in
database. Locks become issue when several processes waiting for a resource that
is currently held by incompatible lock. In this situation, a process blocks
other processes to access shared resource. When your database had excessive
locking issue, you need to find out immediately and resolve it. I will tell you
how you monitor your server regarding to blocking with performance monitor and
DMV for more information about blocking.
You can have information from
these performance counter name below to get sense about current situation for
blocking. They are : SQLServer:Locks: Average Wait Time, SQLServer:Locks: Lock
Requests/sec, SQLServer:Locks: Lock Wait Time, SQLServer:Locks: Lock Waits/sec,
SQLServer:Locks: Number of Deadlocks/Sec, SQLServer:General Statistics:
Processes Blocked and SQLServer:Access Method: Table Lock Escalations/Sec.
Besides these counters above, you
can get history of shared lock and exclusive lock of pageiolatch by querying
DMV sys.dm_os_wait_stats
SELECT wait_type, waiting_tasks_count,
wait_time_ms_max_wait_time_ms, signal_wait_time_ms
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC
The wait type you need to notice
is LCK_M_S, PAGEIOLATCH_SH and PAGEIOLATCH_EX. Compare wait_time_ms and
max_wait of them, if percentage of max_wait againsts wait_time_ms is large, you
need monitor blocking events from profiler.
Another DMV that is useful to
detect current blocking is sys.dm_tran_locks. To get the query involved in
blocking, you need to join the DMV with other DMVs, i.e.
sys.dm_os_waiting_tasks and sys.dm_exec_requests.
Detect Memory Bottleneck in SQL Server
By : Kasim Wirama, MCDBA
This article, I would like to give information about memory bottleneck at SQL Server. The easiest way to know whether your SQL Server undergoes memory bottleneck is by looking at Task Manager, tab Performance, Physical Memory section. Compare the Total and Available value, as long as Available is over 100 MB, it is still okay. If Available value is under 10 MB, your database server box is definitely under physical memory pressure. Besides physical memory pressure, there is virtual memory pressure. Let’s take a look for these kinds of pressure and see what solutions addressed to the corresponding problems.
You can get information about physical memory pressure by looking at these performance monitor counters: Memory:Available bytes, SQL Server:Buffer Manager: Buffer Cache Hit Ratio, Page Life Expectancy, Checkpoint Pages/sec, and Lazywrites/sec. You have your system performing well if value of first two counters are high and last two counters are low.
You can check how many memories allocated for buffer pool by issuing DBCC MEMORYSTATUS query in SSMS. Compare value between Committed and Target, if Target value is low, it indicates external physical memory pressure. You need to find out whether other application that causes external memory pressure, besides SQL Server, by looking at maximum memory consumed in Process tab of Task Manager window. If other application causes external memory pressure, it is advisable that the application is separated from database server box, or you add more memory. You need to check internal memory pressure by looking at percentage of stolen page to total committed page from DBCC MEMORYSTATUS. Percentage value is greater than 75% or 80% indicating internal physical memory pressure. Interesting thing about internal memory pressure is that you need to find out what components causing so many stolen memory page by querying this DMV below :
SELECT type, SUM(single_pages_kb) FROM sys.dm_os_memory_clerks
GROUP BY type
ORDER BY SUM(single_pages_kb) DESC
From query above, single page allocation takes memory from SQL Server buffer pool, whereas multipage allocation allocates memory outside buffer pool. Multipage allocator possibly causes internal memory pressure. You can get information about memory allocated to multipage page allocation by querying DMV below :
SELECT type, SUM(multi_pages_kb) FROM sys.dm_os_memory_clerks
WHERE multi_pages_kb != 0
GROUP BY type
ORDER BY SUM(multi_pages_kb) DESC
To detect virtual memory pressure, you can look at performance counter name here : Paging File: %Usage and Memory:Commit Limit.
Some solutions addressed to virtual memory pressure. They are:
1. Increase size of page file
2. Use /3GB options
3. Switch to 64 bit machine that have 8 TB memory address space.
Detecting and Resolving I/O Bottleneck
By : Kasim Wirama, MCDBA
I/O subsystem is one of critical component in SQL Server. I/O subsystem is used when SQL Server moves page between memory and I/O subsystem. With intensive activity of DML and DDL, SQL Server generates significant log entries and to tempdb database if the activities are creation and operations of table variable, temporary table, sorting, create and rebuild indexes and row versioning technology. Let’s look how to detect I/O bottleneck and solution to this issue.
Here are performance counter for detecting I/O bottleneck :
1. PhysicalDisk: Avg. Disk Queue Length
If you encounters value 2 or more when SQL Server is under peak usage, you have I/O bottleneck.
2. PhysicalDisk: Avg. Disk Sec/Read and Avg. Disk Sec/Write
These counter names gives information about average value on how fast your disk operates under read and write activity. You need to pay attention for I/O subsystem when the value is more than 20 ms.
3. PhysicalDisk: Disk Reads/sec and Disk Writes/sec
These counter names gives rate of read and write operation. If the value is at least 85 percent of disk capacity, the I/O subsystem experiences bottleneck.
Unfortunately these counters above measures I/O subsystem on hard disk level not in file level. If you have several files in an I/O subsystem, you need to have information from DMV sys.dm_io_virtual_file_stats, looking at io_stall_read_ms and io_stall_write_ms. Run the DMV several times in intended duration to get delta of these values.
You might have 3 possibilities of I/O subsystem issue. It might be caused by inefficient queries that effects to I/O intensive operations, lack of indexes or the disk subsystem needed to be upgraded to accommodate anticipated workload. You can find out I/O intensive query by querying DMV sys.dm_exec_query_stats and sort descending order for sum of total_logical_reads and total_logical_writes. To find out lack of indexes in an underlying table you issue DMV query below :
SELECT t1.object_id, t2.user_seeks, t2.user_scans, t1.equality_columns, t1.inequality_columns
FROM sys.dm_db_missing_index_details AS t1, sys.dm_db_missing_index_group_stats AS t2, sys.dm_db_missing_index_groups AS t3
WHERE database_id = DB_ID(‘your database’) AND object_id = OBJECT_ID(‘your table’) AND t1.index_handle = t3.index_handle AND t2.group_handle = t3.index_group_handle
SQL Server 2005 exposes I/O performance information through least performance impact DMV, so that you can quickly spot and fix the I/O bottleneck issue.
Detect CPU Bottleneck in SQL Server
By : Kasim Wirama, MCDBA
When you have your database server experiencing a problem, there are many possibilities here, such as CPU, memory, hard disk or database configuration itself. There should be a systematic way to trouble shooting performance problem in SQL Server. This article, I would like to point out how to detect bottleneck in CPU with SQL Server 2005.
The straight forward way to detect CPU problem is to look at performance counter, with object : Processor, and counter name : %Processor Time. If it shows high percentage value, let’s say 80% or over during 15 to 20 minutes, you definitely have CPU bottleneck. Anyway, you need to establish your baseline for CPU threshold above. Another counter name that is useful is System:Processor Queue Length. This counter gives information how long a queue for each processor. If you see 2 or more value for most of the time, your processors are under pressure. When your server box has some applications running besides SQL Server, probably one of the application takes up significant CPU resource. To prove your suspicious thought, get information from Process:%Processor Time counter.
If you have your CPU bottleneck caused by SQL Server, you need to find out how many processes that are running, runnable, and suspended. An amount of runnable processes indicate that the CPU is busy serving other request, and an amount of suspended processes indicate that there is blocking issue. Here is the query to get the information.
SELECT COUNT(*) , t2.scheduler_id
From sys.dm_os_workers as t2, sys.dm_os_schedulers as t2
Where t1.state = ‘runnable/running/suspended’ and t1.scheduler_address = t2.scheduler_address and t2.scheduler_id < 255
Group by t2.scheduler_id
In general, there are 2 things that causes CPU bottleneck, they are :
1. Inefficient query plan.
If you want to associate the query with CPU bottleneck, you query it from DMV sys.dm_exec_query_stats and extract query text from sys.dm_exec_sql_text with parameter sql_handle. You sort the result based on most expensive average CPU cost that consists of division between total_worker_time and execution_count
2. Excessive compilation and recompilation.
If SQL Server needs some time to compile/recompile the query, it shows that your execution plan is not reusable. If your query is very complex, try to rewrite/adding some index that will make the compilation time run faster.
These are 3 performance counter relating to excessive compilation/recompilation issue :
a. SQL SERVER: SQL Statistics : Batch Requests/Sec
b. SQL SERVER: SQL Statistics : SQL Compilations/Sec
c. SQL SERVER: SQL Statistics : SQL Recompilations /Sec
With wealth information from DMV and performance monitor, you have a useful tool for troubleshooting CPU bottleneck right away.
Statistics time for Plan Reuse
By : Kasim Wirama, MCDBA
In my early days of query tuning during my IT career, I seldom used STATISTICS TIME for query tuning. But recently I just know that statistics time indeed gives useful information. This time statistics gives indication whether your query plan is reusable and how long it takes for compilation and optimization phase. I would tell you why I say it is valuable tool for query tuning.
If you execute a query after you turn on statistics time, you will get information how long the query is parsed-compiled and executed. Even, it gives granular information how long CPU gets involved in the process and the elapsed time.
For plan reuse, you will see the two sets of data resulted from turning on STATISTICS TIME. They show information the execution plan is added to plan cache for reuse to other similar query request. First set shows parse and compile activity, second set shows SQL Server retrieves plan cache. CPU time is not equal for first execution. If second execution results zero value for CPU time, it shows good indication that your query uses existing execution plan.
Information from statistics time should be combined with information from IO and Profile statistics to give precise information how exactly query processing happened. Statistics time with other statistics information (IO, PROFILE, XML) takes effect for only current session
Read Ahead Read and Scan Count
By : Kasim Wirama, MCDBA
If you turn STATISTICS IO on to get know access to underlying tables in a query, you have information logical read, physical read, read ahead read and scan count. Besides logical read and physical read, other interesting information are read ahead read and scan count. What information that they can give? Let take a look.
Read ahead reads indicates number of pages that are read into data cache using read ahead mechanism. Notice that if read ahead reads is high, probably the physical read is low and buffer cache hit ratio is high. In this situation, additional memory into SQL Server benefits for read ahead read. It is a good indication, but it is better that the data will be in cache from previous execution so no additional data pages requested for read ahead activity. Read ahead read is an asynchronous operation; it means that it doesn’t block the query to complete as long as the query has the pages when needed. Read ahead mechanism has relation to full or partial table scan. In the scan, the IAM table is accessed in disk order to determine which extents are belonged to underlying database object. If you put eight files in a filegroup, the read ahead reads will keeps those files busy instead of sequentially processing those files.
Scan count indicates the number of times a table is accessed. In nested loop, even though scan count of inner table is one but logical read of inner table is calculated as multiplication of number scan count of an outer table is accessed with number of logical read of corresponding inner table. For merge or hash join, the scan count is one, but memory for those joins are bigger than nested loop join. The amount of memory that is consumed can be inspected in column granted_query_memory in DMV sys.dm_exec_requests, but it is only current session, if you want to see all sessions, you inspect column memory_usage in DMV sys.dm_exec_sessions.
Logical and Physical Read in SQL Server Performance
By : Kasim Wirama, MCDBA
Every query has its execution plan before it gets executed. Execution plan is generated by optimizer, one of components in SQL Server database engine; it contains how to access data to satisfy the query request. You probably meet few queries having slow response time by looking into physical read and logical read. How do you get information about logical and physical for a query? I will tell you how and what it means to your query.
You open a connection in SSMS (SQL Server Management Studio) or Query Analyzer, then execute SET STATISTICS IO ON. After that you can execute query that is slow. And you will get information logical and physical read on a related table(s).
Logical read indicates total number of data pages needed to be accessed from data cache to process query. It is very possible that logical read will access same data pages many times, so count of logical read value may be higher than actual number of pages in a table. Usually the best way to reduce logical read is to apply correct index or to rewrite the query.
Physical read indicates total number of data pages that are read from disk. In case no data in data cache, the physical read will be equal to number of logical read. And usually it happens for first query request. And for subsequent same query request the number will be substantially decreased because the data pages have been in data cache.
Buffer hit ratio will be calculated based on these two kinds of read as the following formula: (logical reads – physical reads)/logical read * 100%. The high buffer hit ratio (if possible to near 100%) indicates good database performance on SQL Server level. So use information from physical read and buffer hit ratio to measure performance in server level and logical read to measure individual query level
Access Method in Index
By : Kasim Wirama, MCDBA
Every query has its execution plan. You can usually find these operators in execution plan, i.e. table scan, clustered index scan/seek, non clustered index seek. I would describe how they combined with table structure (heap or clustered index) access qualified rows. The way they access the rows will associate the query cost. Turn on Include Actual Execution Plan in SQL Server Management Studio (SSMS) to be able to see the operators. Let’s see them.
If you have heap, you probably have table scan or index seek operator. The access method for table scan is scanning whole data page. Provided that the table has thousands of data pages, it takes some time to return your query result. For index seek operator, it will scan the number of index level, leaf pages and a number of qualifying rows.
If you have clustered index on your table, you have clustered index seek or clustered index scan. For clustered index scan, it gives you information that it accesses the number of an index level and data pages. For clustered index seek, it gives you information that it access the number of an index level and qualifying pages. With nonclustered index you have options to have covered index. Covered index will direct SQL Server don’t have to access data pages and the response will be faster than accessing data pages.
The way of accessing data will describe the estimated cost in logical read, so it will give you illustration how efficient your query is.
Spot Area where Index Could be Potentially Used in a Query
By : Kasim Wirama, MCDBA
This article talks about where you need to implement index in a query. Index is used for fast retrieval of row(s). It looks good but it has overhead, the overhead lies on slow update, insert and delete operations, because SQL Server has to maintain not only data pages but also index page. But for read intensive operations such as select, index is necessary element to speed up retrieval time. If you have slow running query, here is some query spots that index might be beneficial to implement.
Join
Make sure you have index on join criteria. In simple example join involves 2 tables. Rows on first table are searched based on criteria defined in join predicate, and then they are matched with rows on second table. You can find performance degradation when the huge tables join each other without index on them.
Sort
SQL Server makes use of index for sorting operation. In clustered index, no explicit order by operation required because data pages are linked one another on certain order so SQL Server just follow the page link to go through from current page to next page. In clustered index, SQL Server can use for avoiding explicit sort operator in execution plan. Sort operator is bottleneck operator because it has stop-and-go nature.
Group
If you have group by clause in select query, index can speed up group operations because to be able to do group action, the rows are sorted first. Grouping to clustered index would be performed very quickly.
Unique index
Try to implement unique index where possible, to help SQL Server optimizer that it just needs to do few page reads, so seek operator is much more possibly chosen by optimizer.
Multiple Sessions One Transaction
By : Kasim Wirama, MCDBA
SQL Server is enterprise level database with multiple connections. When some of those connections have transaction, they have their own transaction id. SQL Server 2005 lets you see the transactions by providing related dynamic management view (DMV) called sys.dm_tran_session_transactions. The interesting columns that I see here is session_id, transaction_id and is_bound.
Open 2 windows. First and second window you initiate transaction, open third window, and query the DMV. You will get 2 session, each of them has different transaction_id and is_bound column value is zero. That’s the typical database environment.
You can have 2 or more session with same transaction_id by implementing bound connection with token-based SQL Server. You need to implement first connection with explicit connection and the subsequent connection get bound with transaction token from first connection. In third window, query the DMV, this time it gives the information two session_id but with same transaction_id, notice that the second connection has is_bound column value is one.
The lesson that I get from here is that not always one database connections related to connection from one user. The possibility so far are two, first is that the client opens the connection but not close it, second the client implements transaction that spans multiple connections. To make sure whether your guess is correct, you can infer the information from DMV.
If you are interested to have more information about bound connection, you can read from SQL Server Books On Line or MSDN website (http://msdn.microsoft.com/sql)
SQL Server 2008 Features
By : Kasim Wirama, MCDBA
SQL Server 2008 February CTP has a bunch of features that I have to be familiar with their use. Here I note some topic in SQL Server 2008 February CTP:
Business Intelligence
- Datetime series ARIMA algorithm
- AMO warning
- Tablix control in reporting service
- Partition processing enhancement in data warehouse
- Analyzing SSAS with Excel 2007
- Change Data Capture
- Managing resource governor
- Dynamic calculated member
Data Management
- Policy based management
- Plan freezing
- Performance Studio
High Availability
- Data backup and recovery
- Data snapshot
- Database mirroring enhancement
- Instant initialization
- Peer to peer replication enhancement
- Service broker enhancement
- Table and index partitioning
- Transparent data encryption
Development
- Spatial data type
- Datetime data type
- Hierarchy data type
- Table value data type
- GROUPING SET feature
- Multi row INSERT
- LINQ
- Occasionally connected application
For every new version of SQL Server, there always have almost everything exciting to look into.
SELECT SWITCHOFFSET(SYSDATETIMEOFFSET(), DATEPART(TZOFFSET,SYSDATETIMEOFFSET()))
Amazing. 
Data Modelling
By : Kasim Wirama, MCDBA
One of database discipline that I think the basic of database design is data modelling. Data modelling covers requirement gathering until logical design. The purpose of data modelling is to minimize impact change to application against dynamic change of business requirement. This knowledge is not new but still relevant up to now as long as relational database concept is widely implemented. The more reliable you model your data, the less effort to change application tier, it means your database and application is ready to answer business challenges in a short time and least change effort. That’s purpose of implementing good data modelling.
In database job market, it is preferable to have professionals who have knowledge of data modelling. I have tried 2 modelling tools, Visio and Sybase Power Designer. I found that Sybase Power Designer offers more modelling styles (IDEF1X, Chen, etc) and can generate script to multi database platforms.
Key and Covered Column in SQL Server Indexes
Key and Covered Column in SQL Server
Indexes
By : Kasim Wirama, MCDBA
I would share here about index
knowledge in SQL Server. Well tuned index will contribute superior database
performance besides other factors such as hardware, software and database environment,
etc. Choosing column that will be included in index is important. The thing
that should be kept in mind is that apply only necessary columns for an index
because index gives overhead in DML operation and it has only good for query
operation.
Many kinds of indexes in SQL
Server, they are nonclustered index, clustered index,
and unique index. According to table structure, there are heap and unique clustered
table. Let’s take a look possible index in heap structure, then unique
clustered table.
Here is the example for heap
table,
CREATE TABLE heap (a int, b int, c int);
If you apply nonclustered
or unique nonclustered index on column a, the key and
covered column is a. if you include b in nonclustered
index, the key column is a and covered column will be a and b. In heap you have
limited number of column in a key.
Here is the example for clustered
table,
CREATE TABLE clu (a int, b int, c int);
If you apply unique clustered
index on column a, the key column is a, and covered columns will be column a, b
and c. if you apply nonclustered index on column b,
the key and covered column will be b and a. and if you apply unique nonclustered index on column b, the key column is b and
covered columns are a and b. there is a slight difference in terms of number of
key columns between nonclustered index and unique nonclustered index.
With appropriate columns for
index in heap and clustered table structure, the index will be useful for
boosting particular query performance with less overhead in DML.
Scan and Seek in Table/Index Structure
Scan and Seek in Table/Index
Structure
By : Kasim Wirama, MCDBA
In my articles about query
execution plan, I have mentioned that seek operator is better than scan
operator in many cases. Table and index structure have influence on possibility
of index and seek operator. By understanding which table and index structure
will determine seek and scan, there will be room for performance improvement. Let’s
take a look the existing table and index structure in SQL Server.
If your table doesn’t have any
indexes on it, the table is called heap. It’s just a heap of rows with certain ordered
on physical location. The only possibility for heap structure is table scan. For
few numbers of pages and with read-only for those pages, it is quick to do
table scan. Performance will degrade will large number of pages for heap
structure.
Next structure is clustered
index. It has certain ordered on physical location of row. Scan and seek could probably occur on it. For
scan, it is called clustered index scan, and for seek, it is called clustered
index seek. Performance of clustered index scan is similar to table scan. The better
condition is clustered index seek.
Final structure is nonclustered index. it has certain
ordered on logical location of row, physically unordered. Seek and scan could
happen. For scan, it is called index scan and for seek it is called index seek.
Performance of index scan is better than that of clustered index scan. But index
seek is the best operator among all of them.
More Posts
Next page »