Round Robin Algorithm in Distribute Stream Parallelism of SQL Server Execution Plan
Round Robin Algorithm in Distribute Stream Parallelism of SQL Server Execution Plan
By : Kasim Wirama, MCDBA
I would like to show round robin implementation in distribute stream parallelism. Before I talk about round robin specific in distribute stream parallelism, let revisit about distribute stream parallelism.
In internal execution plan generated by optimizer, parallelism contains two or more threads, and each threads could have one/more producers and one/more subscribers. In case of distribute stream parallelism, there is one producer thread and two or more consumer threads. So we move further to round robin discussion specific to distribute stream in parallelism.
I take script example from “Parallelism in Nested Loop” posting (http://geeks.netindonesia.net/blogs/kasim.wirama/archive/2008/07/12/parallelism-in-nested-loop.aspx). After that, turn on graphical execution plan toolbar, and execute the following query here :
select
t1.col1, t1.col2, t1.col3
from
(
select top 100
col1, col2, col3
from bigtable
order by col2
) as t1
join anotherbigtable as t2
on t1.col1 = t2.col1
Execution plan for the query is :
|--Parallelism(Gather Streams)
|--Nested Loops(Inner Join, OUTER REFERENCES:([tempdb].[dbo].[bigtable].[col1], [Expr1005]) WITH
|--Parallelism(Distribute Streams, RoundRobin Partitioning)
| |--Top(TOP EXPRESSION:((100)))
| |--Parallelism(Gather Streams, ORDER BY:([tempdb].[dbo].[bigtable].[col2] ASC))
| |--Sort(TOP 100, ORDER BY:([tempdb].[dbo].[bigtable].[col2] ASC))
| |--Clustered Index Scan(OBJECT:([tempdb].[dbo].[bigtable].[PK__bigtable__182C9B23]))
|--Clustered Index Seek(OBJECT:([tempdb].[dbo].[anotherbigtable].[PK__anotherbigtable__5EBF139D] AS [t2]), SEEK:([t2].[col1]=[tempdb].[dbo].[bigtable].[col1]) ORDERED FORWARD)
I find out there 3 parallelisms, the first parallelism comes immediately after SORT operator, this is parallel scan, and just because TOP (n) requires single thread execution, so first parallelism is gather stream parallelism. The second parallelism is round robin partitioning, it means that it distribute sets of rows for multi threads in Nested Loops operator. And the topmost is gather stream parallelism.
In performance perspective, round robin partitioning is less desired operator because it handles no load balancing, and it is execution will stop here to gather complete TOP (n) records.
Parallelism is not bad at all, but there is some kinds of parallelism you need to be aware about and take appropriate action to correct the execution plan either by indexing or rewriting a problematic query.