Nested Loop, Merge Join, and Hash Join algorithm
Nested Loop, Merge Join, and Hash
Join algorithm
By : Kasim Wirama, MCDBA
If you observe execution plan on your query using join
types, you will find out the following operators: nested loop, merge loop and
hash join. I would like to describe algorithm for each of them.
Nested loop
As its name implies, it will iterate each rows at outer
input set to query rows at inner input set. The cost will be multiplication
between inner input and outer input rows. The algorithm for nested loop is
shown below :
For each R1 in outer input set
For each R2 in inner input set
If R1 match with R2
Output (R1, R2)
R1 and R2 are rows for outer and inner input sets.
Merge Join
Merge join will output rows from both input sets that have their
matches by comparing 2 inputs sets that have been already sorted. Cost of merge
loop is sum of number of both inputs. Algorithm for merge join is shown below :
Read first R1 at first input set
Read first R2 at second input set
While not reach the end of either of the inputs
begin
if R1 match R2
begin
output (R1, R2)
get next R2
end
else if R1 less than R2
get next R1
else
get next R2
end
Hash join
Hash join will output qualified resultset by comparing hash
value of second input set to hash value of first input set. It will create hash
table (called build table) for first input set, then compare hash value of
second input set to the build table. It is undesirable operator in most OLTP
scenarios. The algorithm for hash join is shown below :
Create build table
For each R1
begin
generate hash value of R1 join key
insert into build table to appropriate hash bucket
end
for each R2
begin
generate hash value of R2 join key
for each R1 in corresponding hash bucket
if match R1 and R2
output (R1,R2)
end