SQL Server knowledge center

everything about SQL Server
See also: Other Geeks@INDC
Variable Declaration Enhancement in SQL Server 2008

Variable Declaration Enhancement in SQL Server 2008

By : Kasim Wirama, MCDBA

 

In this posting, I would let you know how SQL Server 2008 introduces enhancement in variable declaration.

1.       Assign new value of newly declared variable

In previous SQL Server version up to SQL Server 2005, you need to write 2 lines of codes in declaring new variable and assign new value to it. For example :

 

DECLARE @var1 INT;

SET @var1 = 5;

 

SQL Server 2008 make code above is written only in one statement here :

DECLARE @var1 INT = 5;

 

2.       Compound assignment operator

When you need to apply some value to a variable and assign it to the variable itself you need to specify the variable in right expression explicitly.

For example :

 

SET @var1 = @var1 + 1;

 

SQL Server 2008 makes it brief by introducing compound assignment operator here :

 

SET @var1 += 1;

 

Here is the complete list of compound assignment operators:

+=

-=

/=

%=

&=

|=

^=

 

3.       Row constructor in insert statement.

In previous SQL Server version up to SQL Server 2005, when you need to insert more than one new rows, you need to specify explicitly INSERT statement for each new rows. For example :

INSERT INTO dbo.ATable (col1, col2) VALUES (1, ‘test1’);

INSERT INTO dbo.ATable (col1, col2) VALUES (2, ‘test2’);

 

SQL Server 2008 makes it shorter by introducing row constructor, so you just need to specify 1 insert statement only for more than one new rows here :

INSERT INTO dbo.ATable (col1, col2) VALUE (1, ‘test1’), (2,’test2’);

There are still a bunch of TSQL enchancements in SQL Server 2008. The items above are simple ones, and I hope you become aware the enhancements above so that you will get used to it and implement in your database development projects.

Share this post: | | | |
Introduction of XML Data Type in Practical Use (Part 2-finish)

Introduction of XML Data Type in Practical Use (Part 2-finish)

                                                                         By : Kasim Wirama, MCDBA        

 

This second part posting, I would like to share by example about XML query and data manipulation.

Query XML data

Several XML methods are used for query XML data. They are :

1.       Query

This method takes XQuery input parameter. For example : query to get all name of Job Candidate in Resume column, database Adventureworks.

 

with xmlnamespaces('http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume' as ns1)

select Resume.query('/ns1:Resume/ns1:Name')

from HumanResources.JobCandidate

 

results returned from above query are rows which contains candidate names extracted from Resume column (XML type)

 

2.       Value

To get scalar value of XML structure, you can use VALUE method of XML type. It requires 2 input parameters. First parameter is XQuery and second parameter is sql server native data type. For example here, I return list of last name for each job candidate.

 

with xmlnamespaces ('http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume' as ns1)

select Resume.value ('(/ns1:Resume/ns1:Name/ns1:Name.Last)[1]','nvarchar(50)') as [Candidate Last Name]

from HumanResources.JobCandidate

 

3.       Exist

To get filtered result based on XML criteria, you can use Exist with XQuery as input parameter. The return value is 1 when it is true or 0 when it is false.

Example for Exist method is to get list of candidate’s last name whose education level is bachelor in Resume column of HumanResources.JobCandidate table below :

 

with xmlnamespaces ('http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume' as ns1)

select Resume.value ('(/ns1:Resume/ns1:Name/ns1:Name.Last)[1]','nvarchar(50)') as [Candidate Last Name]

from HumanResources.JobCandidate

where Resume.exist ('/ns1:Resume/ns1:Education/ns1:Edu.Level [.="Bachelor"]') = 1

 

4.       Nodes method

This method will convert from XML format into relational format. Illustration for this method is to return list of bycicle model that contain material.

 

with xmlnamespaces ('http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelDescription' as ns1)

select ProductModelID, Name, material.query('.') as [material name]

from Production.ProductModel

cross apply catalogdescription.nodes ('/ns1:ProductDescription/ns1:Specifications/Material/text()') as cd(material)

XML Data Manipulation

For data manipulation, XML type provides modify method. This method covers manipulation for insert, update and delete.

Insert

The format of insert statement for XML is

                INSERT (                                                <tag1>…</tag1>, <tagn>…</tagn>                                )                 INTO (/parent node)[1]

For example, I insert new element (state, country, CEO) into existing XML data as shown below :

declare @var xml;

select @var = '<?xml version="1.0"?>

<company>

                <name>Microsoft</name>

                <address>One Microsoft Way</address>

                <city>Redmond</city>

</company>';

set @var.modify ('insert (<state>Washington</state>,<country>USA</country>,<CEO>Bill Gates</CEO>) into (/company)[1]');

select @var;

The select result is :

<company>  <name>Microsoft</name>  <address>One Microsoft Way</address>  <city>Redmond</city>  <state>Washington</state>  <country>USA</country>  <CEO>Bill Gates</CEO>

</company>

Update

The syntax for XML update statement is :

                Replace value of ( XQuery )[1] with “new value”

Note : XQuery for update statement must include text() keyword for target XML element.

The example here is to update old CEO element text into new one (Steve Ballmer):

declare @var xml;

select @var = '<?xml version="1.0"?>

<company>

                <name>Microsoft</name>

                <address>One Microsoft Way</address>

                <city>Redmond</city>

</company>';

set @var.modify ('insert (<state>Washington</state>,<country>USA</country>,<CEO>Bill Gates</CEO>) into (/company)[1]');

set @var.modify ('replace value of (/company/CEO/text())[1] with "Steve Ballmer"');

select @var;

the query result is :

<company>  <name>Microsoft</name>  <address>One Microsoft Way</address>  <city>Redmond</city>  <state>Washington</state>  <country>USA</country>  <CEO>Steve Ballmer</CEO>

</company>

Delete

Syntax for XML delete statement is :

DELETE XQuery

For example here, I delete node address.

declare @var xml;

select @var = '<?xml version="1.0"?>

<company>

                <name>Microsoft</name>

                <address>One Microsoft Way</address>

                <city>Redmond</city>

</company>';

set @var.modify ('insert (<state>Washington</state>,<country>USA</country>,<CEO>Bill Gates</CEO>) into (/company)[1]');

set @var.modify ('replace value of (/company/CEO/text())[1] with "Steve Ballmer"');

set @var.modify ('delete /company/city');

select @var;

the result for delete statement is :

<company>  <name>Microsoft</name>  <address>One Microsoft Way</address>  <state>Washington</state>  <country>USA</country>  <CEO>Steve Ballmer</CEO>

</company>

Share this post: | | | |
Posted: Aug 27 2008, 07:59 PM by Kasim.Wirama | with no comments
Filed under:
Tip Querying Using SQL Server Management Studio

Tip Querying Using SQL Server Management Studio

By : Kasim Wirama, MCDBA

 

Starting from SQL Server 2005, functionalities of Enterprise Manager and Query Analyzer are united into one application, which is called SQL Server Management Studio (SSMS).

 

For those who has used SSMS; it is inherited from Visual Studio 2005. You can do database management task and also database development task. Here are some points that makes your work more effectively with SSMS.

 
  1. If you type a query and execute it; the result set display below the query; if the resultset is large enough, you’d better to see the resultset in separate tab. What I do is to select Tools menu, Options, go to Query Results, then SQL Server Results to Grid/Results to Text, and check “display results in a separate tab”. And if you would like to switch to result tab, you can check “switch to results tab after the query executes”.
 
  1. after point 1 is done, you need to navigate between query tab and result tab, the shortcut key for this is by pressing F6 key.
  
  1. sometimes you want to copy the resultset into excel; but you get no column header copied at excel. To have your resultset copied to excel along with column header name, you need to go to Tools menu, Options, go to Query Results, then SQL Server, Result to Grid, and check “Include column headers when copying or saving the results”
 
  1. when you need to specify all column list (not typing ‘*’) in select statement, you just drag “Columns” folder of the table you want to query into query window. It will automatically specify all column names for you without having to type it one by one especially the table has many column list.
 
  1. if you want to do ad-hoc insert/update/delete/select statement, you just right click on a table, and choose “script table as” then there is sub context menu available for you to choose (SELECT To, INSERT To, UPDATE To, DELETE To).
Share this post: | | | |
Posted: Aug 20 2008, 06:34 AM by Kasim.Wirama | with no comments
Filed under:
Installing SQL Server 2008 (RTM) without any issue

Installing SQL Server 2008 (RTM) without any issue

By : Kasim Wirama, MCDBA

 

I have downloaded SQL Server 2008 (RTM), but when I installed the RTM, the installer didn’t permitted me to install successfully because my Visual Studio 2008 was not patched for service pack 1. I searched to Microsoft website, it only has service pack 1 beta version for visual studio 2008.

The first installation failed; I tried my workaround here by installing database core engine without installing Management studio; then the installation for core engine has completed successfully. Now how can I managed the database using Management Studio? I executed SQL Server 2008 RC0 installer and choose to install Management Studio. After that I can operate SQL Server 2008 (RTM) engine with Management Studio of SQL Server 2008 RC0 installer.

I think it is the workaround while waiting Visual Studio 2008 service pack 1 gets released!

Share this post: | | | |
Posted: Aug 10 2008, 07:19 AM by Kasim.Wirama | with no comments
Filed under:
Introduction of XML Data Type in Practical Use (Part 1)

Introduction of XML Data Type in Practical Use (Part 1)

By : Kasim Wirama, MCDBA

 

Beginning from SQL Server 2005, XML becomes first class citizen with any other native data type.  In this first part of XML posting, I would like to show by example how to declare XML and query XML SCHEMA COLLECTION as required element to define typed XML declaration.

You now can define XML either as data type of either variable or column, for example :

Declare @var_xml AS XML – – XML in variable declaration

CREATE TABLE dbo.xmlTable
(
                cola XML
);                                             – – XML in table declaration

XML use in table declaration at the above sample is called untyped XML, for typed XML you need to define XML SCHEMA COLLECTION and then use the XML SCHEMA COLLECTION in column declaration. For example

CREATE TABLE dbo.xmlTable
(
                cola XML (CONTENT dbo.myXMLSCHEMACOLLECTION)
);

To get definition of dbo. myXMLSCHEMACOLLECTION, issue the query here :

SELECT XML_SCHEMA_NAMESPACE (‘dbo’,’ myXMLSCHEMACOLLECTION’);

For existing example of xml schema collection, you can find in AdventureWorks database here :

SELECT XML_SCHEMA_NAMESPACE('production','manuinstructionsschemacollection');

You can create XML SCHEMA COLLECTION with this syntax here :

CREATE XML SCHEMA COLLECTION [schema name].collection_name
AS
‘xml schema definition string’

To drop a XML schema collection, issue this syntax :

DROP XML SCHEMA COLLECTION [schema name].collection_name

To alter a XML schema collection, issue this syntax :

ALTER XML SCHEMA COLLECTION [schema name].collection_name
ADD ‘xml schme definition string’

Share this post: | | | |
Execution Plan Optimality through Statistics Information

Execution Plan Optimality through Statistics Information

By : Kasim Wirama, MCDBA

 

As we know that index is used for fast query response time. What causes index could speed up fast query response time? It is because index helps optimizer to generate good execution plan. Is it only index that is used for the performance purpose? The answer is no. Statistics is another option for optimizer to generate good execution plan. So why we need index?

Index is used to locate qualified records whereas statistics doesn’t. Statistics only tell optimizer in what extent a column or column combination is unique. The uniqueness of column (a group of column) is called density. It is expressed as 1/number of unique value. The value is between 0 and 1 for the just mentioned expression. The bigger the number; the density is less; the more likely optimizer will decide to have index scanning rather than index seek.

In old thread of mailing list discussion in sqlserver-indo@yahoogroups.com, once a mailing list user asked what highly selective index meant? Here I say that a highly selective index is another word for high density statistics. An index must have statistics but statistics doesn’t necessarily have index. Optimizer will choose index seek most likely when density is 0.10 or less. To get density information, you can issue DBCC SHOW_STATISTICS transact-sql command.

For example usage of SHOW_STATISTICS, I create compound statistics on orders table and drop index on EmployeeID column:

CREATE STATISTICS st_EmployeeID_OrderDate ON dbo.orders (EmployeeID, OrderDate)

Now let’s see statistics information for each of them by running DBCC command here :

DBCC SHOW_STATISTICS ('dbo.orders','st_EmployeeID_OrderDate')

I am interested with All Density column information here:

All Density Average Length Column
0.1111111 4 EmployeeID
0.001302083 12 EmployeeID, OrderDate
 

If query predicate only have EmployeeID column, optimizer consider index scan because the density is not high enough (higher than 0.1) and when query predicate includes OrderDate and EmployeeID, optimizer will consider index seek because the density is high (less than 0.1). Another interesting point is that you can influence query optimizer decision from index scan to index seek by implementing index on employeeID column.

Density information from DBCC show statistics will help you determine whether you really need index or just statistics to achieve optimal execution plan.

Share this post: | | | |
Welcome to SQL Server Knowledge Centre

Welcome to SQL Server Knowledge Centre

 

I name my blog as knowledge source for SQL Server, hopefully someday it will become SQL Server knowledge portal, not just portal. So, I welcome you guys to visit SQL SERVER KNOWLEDGE CENTRE.

Share this post: | | | |
Posted: Jul 29 2008, 08:38 AM by Kasim.Wirama | with no comments
Filed under:
Common Mistakes in SQL Practice

Common Mistakes in SQL Practice

By : Kasim Wirama, MCDBA 

Some common mistakes are always found in all projects when I have ever involved. Here are points I would like to share with you.

 
  1. Data file and transaction log file put in same drive.

For development phase, it is fine, but for production, it is less optimal because of access method difference between transaction log (sequential) and data file (random). When you have multiple processors and one or more RAID controller, it is recommended you spread heavily used tables into multiple data files.

 
  1. Specify * (asterisk sign) instead of explicitly column names in query statement.

When * sign is issued, under the hood, SQL Server will query system tables to get column names, it is not good programming practice and it will prone to application error when the number of columns in database doesn’t match with those of application.

 
  1. Specify sp prefix in stored procedure name.

When you specify sp prefix name, SQL Server will look the stored procedure in master database then in current context database; so it is less efficient.

 
  1. Do not specify schema name when references database object.
 
  1. Specify unidentical column attributes when doing comparison.
    let’s say your query contains predicate firstname like @var1 + ‘%’, where firstname column is varchar(40) and @var1 is nvarchar(40). In this case SQL Server will convert one of them into another column type which has higher data type precedence. When the conversion happens on firstname; possible access method is index scan rather than index seek; which is less efficient.
 
  1. Wrap column name with a function
    for example : convert(varchar(8), createdDate,112) = ‘20080729’. If you find column name is wrapped with function; it is guaranteed that only index scan is option for getting the result; the more rows in tables, the longer the result will be returned.
 
  1. Don’t take benefit of full text indexing for % wildcard in front for variable.
    When you find this predicate : columnname like ‘%’ + @var1 + ‘%’; again the one and only access method is index scan provided an index is created on the columnname. The better solution is to create full text indexing upon columnname.
 
  1. Not specify SET NOCOUNT ON within database objects.
    It is advisable to put SET NOCOUNT ON within database objects to reduce network overhead, otherwise network bandwidth will be occupied with unnecessary chattiness communication between database tier and application tier.
 
  1. Don’t use efficient cursor

I always find developers use less efficient cursor. Cursor itself gives overhead to database and potentially causes concurrency problem. It is preferable to create set based solution (with query) rather than cursor, but in some business requirement; it is impossible to use query. If you really need to use cursor; try to lightweight one by declaring local, readonly and one direction cursor. For example :

DECLARE mycursor CURSOR LOCAL READ_ONLY FAST_FORWARD FOR select distinct city from dbo.Customers.

 
  1. Don’t implement clustered index

A table which doesn’t have clustered index will have forwarding records when it is frequently updated. It will lead to fragmentation sooner.

 
  1. Database is less normalized

Sign of less normalized database design is that need more business logic to get ad hoc information from database. Consider to review the database design when it happens.

 

By avoiding these common mistakes, you will have better performing database. If you have other points that will help to avoid database common mistakes, feel free to add it through comments.

Share this post: | | | |
Tips to Detect Suboptimal Execution Plan

Tips to Detect Suboptimal Execution Plan

By : Kasim Wirama, MCDBA

 

This posting, I would like to give important points that need to consider when reviewing an execution plan. With this information on this posting, you can assess whether your execution plan has potential performance problem. It answers your question why sometimes a query responds inconsistently.

1.       If you find nested loop operator, make sure outer query doesn’t give big input into nested loop operator because nested loop is less efficient for processing big input. When you find out inner query is in scan (clustered or nonclustered), try to add appropriate index on the inner query. When you find out bookmark lookup (key lookup), a query improvement chance is there to remove inefficient bookmark lookup by including columns on outer query index.

2.       Beware when your execution plan has sort, hash aggregate and hash join operator, and particularly the query is the most accessed queries. As I already discuss these operators on recent postings, these operators take memory. The more frequent it’s accessed, the more possible there will be insufficient memory that will lead to spill data into tempdb database which will degrade performance significantly.

a.       To detect memory problem with sort operator, in profiler select  Error and Warnings event group, Sort Warnings, choose EventSubClass. When you find out EventSubClass is 2, it indicates sort operator is inefficient.

b.      To detect memory problem with hash operator, in profiler select Error and Warnings event group, Hash Warnings, choose EventSubClass and IntegerData column. When you find out EventSubClass is 0, it indicates that there already happened normal spilling to tempdb disk; in addition IntegerData column shows how many times same data is spilled out to disk. In worst condition, EventSubClass is 1, it indicates that IntegerData has shown maximum number of times same data got spilled and normal spilling has turned out into bailout spilling.

3.       When your database undergoes concurrency issue (excessive lock, blocking or deadlock), consider figuring out efficient index to would give better access method to data for given query.

4.       In OLTP, avoid scan operator, it is much better to change scan operator to seek operator. It is useful when the operator actually needs a few row instead of a large number of rows in a table.

5.       In OLTP, parallelism is favored in order to reduce response time even though it takes more processor resource.

6.       In OLTP, avoid parallel merge join and reassess merging exchange; when you have parallel hash join, it would be scalable especially when you have multicore CPU or multi processor because it doesn’t claim more memory as degree of parallelism increases.

7.       When you have clustered and nonclustered index, it is best to put nonclustered index on separate physical drive; so it give less contention to your data file.

Share this post: | | | |
Examine Database Backup Size

Examine Database Backup Size

By : Kasim Wirama, MCDBA

 

I have seen Idera products several times intended for SQL Server use through several advertisements at SQL Server Magazine. One of its products is SQLSafe which has backup and restore functionality. It seems that Idera is generous to let SQL Server customers to try its core feature in freeware edition with same functionality as full feature edition does. You need full feature edition when have advanced backup/restore requirement such as backup encryption, table level recovery, stripped backup, policy-based backup, backup reporting, and integration with Tivoli Storage Manager. I am very interested to know what table level recovery feature can do precisely; I leave it for some time after I download free trial 14 days from its website. Back to topic, the core functionality that exists both SQLSafe freeware edition and SQLSafe (full feature) is that they are indeed high performance backup engine, it can be accessed through command line interface or extended stored procedures, and it is optimized in speed.

I would try to backup Adventureworks database (on SQL Server 2005). With SQLSafe freeware edition, I issue the following command line to backup adventureworks database to drive c:\. Note you can do backup not only by command line interface but also by extended stored procedures from SQL Safe so that you can schedule backup with SQLSafe in SQL Server job (one of SQL Server agent components). For more information about command line and its extended stored procedure, you can access its Command Line Interface documention from SQLSafe installation directory.

SQLSafeCmd backup AdventureWorks C:\BackupWithSQLSafe.bak

Backup file, BackupWithSQLSafe.bak, is created with size 46.25 MB.

Now I create backup using native SQL Server 2005 backup TSQL command here :

BACKUP DATABASE AdventureWorks TO DISK = ‘C:\ Conventional2005Backup.bak’

Backup file, Conventional2005Backup.bak, is created with size 171.09 MB. (3.7 times bigger than that of SQLSafe).

Now I restore AdventureWorks database from SQL Server 2005 to SQL Server 2008 RC0, and I change AdventureWorks database compatibility level from SQL Server 2005 (90) to SQL Server 2008 (100), and issue database backup with compression options (new feature in SQL Server 2008) with the following T-SQL command :

BACKUP DATABASE AdventureWorks
TO DISK ='C:\Conventional2008Backup_with_compression.bak'
WITH COMPRESSION

Backup file, Conventional2008Backup_with_compression.bak, is created with size 138.06 MB. Here is summary of Adventureworks database backup.

Backup tools Backup size (MB)
SQL Safe 46.25
SQL Server 2005 (no compression option available) 171.09
SQL Server 2008 RC0 (with compression option) 138.06
 

From the information above, space consumption saving is around 3 times between SQLSafe and SQL Server’s native backup. If you have other findings or you have tried backup using other products with backup size information, please kindly let me know. Feel free to try SQLSafe.

Share this post: | | | |
Dealing with Cardinality Estimation Issue

Dealing with Cardinality Estimation Issue

By : Kasim Wirama, MCDBA

 

This posting, I would discuss about cardinality estimation issue that leads to suboptimal execution plan in SQL Server. It applies to all SQL Server versions. Generally, SQL  Server always generate optimal execution plan, but sometimes it generates suboptimal one. Given a complex execution plan, how can I identify which operator is the root cause of the issue. You inspect value between Actual Number of Rows and Estimated Number of Rows in operator of the execution plan. If comparison between Actual and Estimated one is big enough, the operator has cardinality estimation issue. I will let you know how to deal with the issue here.

When you find out cardinality estimation issue, try to track down the rightmost operator that begins to generate cardinality estimation issue. The problem handling is different between scan-seek operator and join operator.

For scan-seek operator, possible cause is improper statistics, too complex predicate (in WHERE clause). Make sure statistics supports predicate. Make sure your predicate is in SARG form (search arguments)-search Kalen Delaney’s article in SQL Server Magazine (http://www.sqlmag.com) for comprehensive discussion about SARG form. When the predicate is in SARG form, make multicolumn statistics. Execute the query to see whether there is no more cardinality estimation issue on the operator. When it still exists, do bigger resample or full scan on the statistics. Sometimes full scan doesn’t solve this issue, other alternatives is to simplify predicate or if it cannot, consider creating computed column and applies statistics on it.

For cardinality issue on join operator, it might be caused by complex join or correlation between 2 tables. For example, join between customers and orders where predicate is on customers. Some customers might place more orders than others. In this case optimizer couldn’t generate accurate cardinality estimation. You have alternatives to put the join between customers and orders into temporary table so that optimizer could generate accurate cardinality estimation through compilation or you can use hints. SQL Server 2005 and later versions provides many hints. I have written about hints available in SQL Server 2005 in my blogs. You can use recompile option on stored procedure or individual statement in order to generate accurate cardinality estimation.

By recognizing cardinality estimation issue, you know effective solution to address it. You are on the way to exceptional performance expert by understanding SQL Server query processing engine.

Share this post: | | | |
Posted: Jul 26 2008, 06:21 PM by Kasim.Wirama | with no comments
Filed under:
2 Stored Procedures for generating index information

2 Stored Procedures for generating index information