SQL Server knowledge center

everything about SQL Server
See also: Other Geeks@INDC

Parallelism in Nested Loop

Parallelism in Nested Loop

By : Kasim Wirama, MCDBA

 

This posting, I would show behavior characteristic of parallelism in nested loop join.  Let’s build these 2 big tables first, with the following TSQL script :

CREATE TABLE bigtable

(

col1 INT PRIMARY KEY,

col2 INT,

col3 CHAR(2000)

);

GO

 

SET NOCOUNT ON;

DECLARE @i INT;

SET @i = 1;

WHILE @i <=300000

BEGIN

     INSERT INTO bigtable VALUES (@i , @i + 100, 'SOME VALUE');

     SET @i = @i + 1;

END

GO

 

SELECT *

INTO anotherbigtable

FROM bigtable;

GO

 

ALTER TABLE anotherbigtable

ADD CONSTRAINT pk_1 PRIMARY KEY (col1);

GO

 

I retrieve result from columns of both tables by joining them on col1 column with filter criteria col2 column less than 500. The sql query for achieve parallelism in nested loop join is :

 

SELECT

t1.*, t2.*

FROM bigtable AS t1 JOIN anotherbigtable AS t2

ON t1.col1 = t2.col1

WHERE t2.col2 < 500

 

The execution plan for the query (run first : SET STATISTICS PROFILE ON) is :

  |--Parallelism(Gather Streams)

       |--Nested Loops(Inner Join, OUTER REFERENCES:([t2].[col1], [Expr1004]) OPTIMIZED WITH UNORDERED PREFETCH)

            |--Clustered Index Scan(OBJECT:([tempdb].[dbo].[anotherbigtable].[pk_1] AS [t2]), WHERE:([tempdb].[dbo].[anotherbigtable].[col2] as [t2].[col2]<(500)))

            |--Clustered Index Seek(OBJECT:([tempdb].[dbo].[bigtable].[PK__bigtable__18EBB532] AS [t1]), SEEK:([t1].[col1]=[tempdb].[dbo].[anotherbigtable].[col1] as [t2].[col1]) ORDERED FORWARD)

 

With retrieve only 500 rows from 300,000 rows tables, optimizer need to use parallelism. Why?  It is because no index applied on col2 column, so optimizer will search from first until end of join predicate on clustered index col1 on both bigtable and anotherbigtable. To speed up the nested looping processing time, optimizer decides to apply parallelism on Nested Loops operator. With parallelism on nested loop, optimizer use more than one thread, and group of rows would be independent across threads.  Outer query (Clustered Index Scan operator) will be on each of threads. How about inner query (Clustered Index Seek operator)? Is it also separated on each threads? Here is something a bit tricky. The answer is no, inner query will be executed in serial mode because data on inner query (bigtable) is shared among threads on anotherbigtable (outer query) so parallelism on inner query couldn’t be applied whereas outer query could.

Share this post: | | | |

Comments

SQL Server knowledge center said:

Round Robin Algorithm in Distribute Stream Parallelism of SQL Server Execution Plan By : Kasim Wirama

# July 20, 2008 6:00 PM

SQL Server knowledge center said:

Broadcast Algorithm in Distribute Stream Parallelism of SQL Server Execution Plan By : Kasim Wirama,

# July 20, 2008 6:26 PM

SQL Server knowledge center said:

Bitmap Operator By : Kasim Wirama, MCDBA This posting, I would like to dissect other operator that might

# July 23, 2008 9:01 PM
Leave a Comment

(required) 

(required) 

(optional)

(required) 
Are you human?:  


Enter the numbers above: