Professional Documents
Culture Documents
18 Hashjoins
18 Hashjoins
ADVANCED
DATABASE
SYSTEMS
T O D AY ’ S A G E N D A
Background
Parallel Hash Join
Hash Functions
Hash Table Implementations
Evaluation
PA R A L L E L J O I N A L G O R I T H M S
O B S E R VAT I O N
1970s – Sorting
1980s – Hashing
1990s – Equivalent
2000s – Hashing
2010s – ???
PA R A L L E L J O I N A L G O R I T H M S
I M P R O V I N G C A C H E B E H AV I O R
PA R A L L E L H A S H J O I N S
C L O U D E R A I M PA L A
% of Total CPU Time Spent in Query Operators
Workload: TPC-H Benchmark
HASH JOIN
25.0% SEQ SCAN
49.6% 3.1% UNION
19.9% AGGREGATE
OTHER
2.4%
CMU 15-721 (Spring 2017)
11
PA R T I T I O N P H A S E
PA R T I T I O N P H A S E
N O N - B L O C K I N G PA R T I T I O N I N G
S H A R E D PA R T I T I O N S
Data Table
A B C
S H A R E D PA R T I T I O N S
Data Table
A B C
hashP(key)
#p
#p
#p
S H A R E D PA R T I T I O N S
Data Table Partitions
A B C
hashP(key)
P1
#p
P2
#p
⋮
#p Pn
S H A R E D PA R T I T I O N S
Data Table Partitions
A B C
hashP(key)
P1
#p
P2
#p
⋮
#p Pn
P R I VAT E PA R T I T I O N S
Data Table Partitions
A B C
hashP(key)
#p
#p
#p
P R I VAT E PA R T I T I O N S
Data Table Partitions
A B C
hashP(key)
#p
#p
#p
P R I VAT E PA R T I T I O N S
Data Table Partitions Combined
A B C
hashP(key)
P1
#p
P2
#p
⋮
#p Pn
R A D I X PA R T I T I O N I N G
RADIX
Input 89 12 23 08 41 64
RADIX
Input 89 12 23 08 41 64
Radix 9 2 3 8 1 4
CMU 15-721 (Spring 2017)
18
RADIX
Input 89 12 23 08 41 64
Radix 8 1 2 0 4 6
CMU 15-721 (Spring 2017)
19
PREFIX SUM
Input 1 2 3 4 5 6
PREFIX SUM
Input 1 2 3 4 5 6
Prefix Sum 1
CMU 15-721 (Spring 2017)
19
PREFIX SUM
Input 1 2 3 4 5 6
+
Prefix Sum 1 3
CMU 15-721 (Spring 2017)
19
PREFIX SUM
Input 1 2 3 4 5 6
+ + + + +
Prefix Sum 1 3 6 10 15 21
CMU 15-721 (Spring 2017)
20
R A D I X PA R T I T I O N S
Step #1: Inspect input,
create histograms
#p 07
#p 18
#p 19
0
hashP(key)
#p 07
#p 03
#p 11
#p 15
1
#p 10
Source: Spyros Blanas
CMU 15-721 (Spring 2017)
20
R A D I X PA R T I T I O N S
Step #1: Inspect input,
create histograms
#p 07
#p 18
#p 19
0
hashP(key)
#p 07
#p 03
#p 11
#p 15
1
#p 10
Source: Spyros Blanas
CMU 15-721 (Spring 2017)
20
R A D I X PA R T I T I O N S
Step #1: Inspect input,
create histograms
#p 07
#p 18
#p 19
0
hashP(key)
#p 07
#p 03
#p 11
#p 15
1
#p 10
Source: Spyros Blanas
CMU 15-721 (Spring 2017)
20
R A D I X PA R T I T I O N S
Step #1: Inspect input,
create histograms
#p 07
#p 18
#p 19
0
hashP(key)
#p 07 Partition 0: 2
Partition 1: 2
#p 03
#p 11
#p 15
1
#p 10 Partition 0: 1
Partition 1: 3
Source: Spyros Blanas
CMU 15-721 (Spring 2017)
20
R A D I X PA R T I T I O N S
Step #2: Compute output
offsets
#p 07 Partition 0 , CPU 0
#p 18
#p 19
0 Partition 0, CPU 1
hashP(key)
R A D I X PA R T I T I O N S
Step #3: Read input
and partition
#p 07 07 Partition 0 , CPU 0
#p 18
#p 19
0 03 Partition 0, CPU 1
hashP(key)
R A D I X PA R T I T I O N S
Step #3: Read input
and partition
#p 07 07 Partition 0 , CPU 0
#p 18 07
#p 19
0 03 Partition 0, CPU 1
hashP(key)
R A D I X PA R T I T I O N S
#p 07 07 Partition 0
#p 18 07
#p 19
0 03
hashP(key)
#p 07 Partition 0: 2 18 Partition 1
Partition 1: 2
#p 03 19
#p 11 11
#p 15
1 15
#p 10 Partition 0: 1 10
Partition 1: 3
Source: Spyros Blanas
CMU 15-721 (Spring 2017)
20
R A D I X PA R T I T I O N S
Recursively repeat until target number of
partitions have been created
#p 07 07 Partition 0
#p 18 07
#p 19
0 03
hashP(key)
#p 07 Partition 0: 2 18 Partition 1
Partition 1: 2
#p 03 19
#p 11 11
#p 15
1 15
#p 10 Partition 0: 1 10
Partition 1: 3
Source: Spyros Blanas
CMU 15-721 (Spring 2017)
20
R A D I X PA R T I T I O N S
Recursively repeat until target number of
partitions have been created
#p 07 07
#p 18 07
#p 19
0 03
0
hashP(key)
#p 07 Partition 0: 2 18
Partition 1: 2
#p 03 19
#p 11 11
#p 15
1 15
1
#p 10 Partition 0: 1 10
Partition 1: 3
Source: Spyros Blanas
CMU 15-721 (Spring 2017)
20
R A D I X PA R T I T I O N S
Recursively repeat until target number of
partitions have been created
#p 07 07
#p 18 07
#p 19
0 03
0
hashP(key)
#p 07 Partition 0: 2 18
Partition 1: 2
#p 03 19
#p 11 11
#p 15
1 15
1
#p 10 Partition 0: 1 10
Partition 1: 3
Source: Spyros Blanas
CMU 15-721 (Spring 2017)
21
BUILD PHASE
HASH FUNCTIONS
HASH FUNCTIONS
MurmurHash (2008)
→ Designed to a fast, general purpose hash function.
Google CityHash (2011)
→ Based on ideas from MurmurHash2
→ Designed to be faster for short keys (<64 bytes).
Google FarmHash (2014)
→ Newer version of CityHash with better collision rates.
CLHash (2016)
→ Fast hashing function based on carry-less multiplication.
192
128
8000
4000
0
1 51 101 151 201 251
Key Size (bytes) Source: Fredrik Widlund
CMU 15-721 (Spring 2017)
25
128
12000 64
32
6000
0
1 51 101 151 201 251
Key Size (bytes) Source: Fredrik Widlund
CMU 15-721 (Spring 2017)
26
H A S H TA B L E I M P L E M E N TAT I O N S
C H A I N E D H A S H TA B L E
C H A I N E D H A S H TA B L E
hashB(key)
⋮ ⋮
O P E N - A D D R E S S I N G H A S H TA B L E
O P E N - A D D R E S S I N G H A S H TA B L E
hashB(key)
X
Y
Z
hashB(X) | X
⋮
O P E N - A D D R E S S I N G H A S H TA B L E
hashB(key) hashB(Y) | Y
X
Y
Z
hashB(X) | X
⋮
O P E N - A D D R E S S I N G H A S H TA B L E
hashB(key) hashB(Y) | Y
X
Y
Z
hashB(X) | X
⋮
O P E N - A D D R E S S I N G H A S H TA B L E
hashB(key) hashB(Y) | Y
X
Y
Z
hashB(X) | X
⋮
hashB(Z) | Z
⋮
O B S E R VAT I O N
C U C KO O H A S H TA B L E
C U C KO O H A S H TA B L E
Hash Table #1 Hash Table #2
⋮ ⋮
C U C KO O H A S H TA B L E
Hash Table #1 Hash Table #2
Insert X
hashB1(X) hashB2(X)
⋮ ⋮
C U C KO O H A S H TA B L E
Hash Table #1 Hash Table #2
Insert X
hashB1(X) hashB2(X)
X
⋮ ⋮
C U C KO O H A S H TA B L E
Hash Table #1 Hash Table #2
Insert X
hashB1(X) hashB2(X)
X
Insert Y
hashB1(Y) hashB2(Y)
⋮ ⋮
C U C KO O H A S H TA B L E
Hash Table #1 Hash Table #2
Insert X
hashB1(X) hashB2(X) Y
X
Insert Y
hashB1(Y) hashB2(Y)
⋮ ⋮
C U C KO O H A S H TA B L E
Hash Table #1 Hash Table #2
Insert X
hashB1(X) hashB2(X) Y
X
Insert Y
hashB1(Y) hashB2(Y)
⋮ ⋮
Insert Z
hashB1(Z) hashB2(Z)
C U C KO O H A S H TA B L E
Hash Table #1 Hash Table #2
Insert X
hashB1(X) hashB2(X) Z
X
Insert Y
hashB1(Y) hashB2(Y)
⋮ ⋮
Insert Z
hashB1(Z) hashB2(Z)
C U C KO O H A S H TA B L E
Hash Table #1 Hash Table #2
Insert X
hashB1(X) hashB2(X) Z
Y
Insert Y
hashB1(Y) hashB2(Y)
⋮ ⋮
Insert Z
hashB1(Z) hashB2(Z)
hashB1(Y)
C U C KO O H A S H TA B L E
Hash Table #1 Hash Table #2
Insert X
hashB1(X) hashB2(X) Z
Y
Insert Y
hashB1(Y) hashB2(Y)
X
⋮ ⋮
Insert Z
hashB1(Z) hashB2(Z)
hashB1(Y)
hashB2(X)
CMU 15-721 (Spring 2017)
34
C U C KO O H A S H TA B L E
PROBE PHASE
H A S H J O I N VA R I A N T S
H A S H J O I N VA R I A N T S
No-P Shared-P Private-P Radix
Partitioning No Yes Yes Yes
Input scans 0 1 1 2
Sync during – Spinlock Barrier, Barrier,
partitioning per tuple once at end 4 * #passes
Hash table Shared Private Private Private
Sync during Yes No No No
build phase
Sync during No No No No
probe phase
BENCHMARKS
No output materialization
H A S H J O I N – U N I F O R M D ATA S E T
Intel Xeon CPU X5650 @ 2.66GHz
6 Cores with 2 Threads Per Core
Partition Build Probe
160
Cycles / Output Tuple
120
76.8
80 60.2 67.6
47.3
40
0
No Partitioning Shared Private Radix
Partitioning Partitioning Source: Spyros Blanas
CMU 15-721 (Spring 2017)
39
H A S H J O I N – U N I F O R M D ATA S E T
Intel Xeon CPU X5650 @ 2.66GHz
6 Cores with 2 Threads Per Core
Partition Build Probe
160
3.3x cache misses
24% faster than
Cycles / Output Tuple
76.8
80 60.2 67.6
47.3
40
0
No Partitioning Shared Private Radix
Partitioning Partitioning Source: Spyros Blanas
CMU 15-721 (Spring 2017)
40
H A S H J O I N – S K E W E D D ATA S E T
Intel Xeon CPU X5650 @ 2.66GHz
6 Cores with 2 Threads Per Core
Partition Build Probe
160 167.1
Cycles / Output Tuple
120
80
56.5 50.7
40 25.2
0
No Partitioning Shared Private Radix
Partitioning Partitioning Source: Spyros Blanas
CMU 15-721 (Spring 2017)
41
O B S E R VAT I O N
R A D I X H A S H J O I N – U N I F O R M D ATA S E T
Intel Xeon CPU X5650 @ 2.66GHz
Varying the # of Partitions
Partition Build Probe
120
Cycles / Output Tuple
80 +24% -5%
No Partitioning
40
0
32768
32768
131072
131072
64
4096
64
4096
256
512
1024
8192
256
512
1024
8192
Radix / 1-Pass Radix / 2-Pass Source: Spyros Blanas
CMU 15-721 (Spring 2017)
43
R A D I X H A S H J O I N – U N I F O R M D ATA S E T
Intel Xeon CPU X5650 @ 2.66GHz
Varying the # of Partitions
Partition Build Probe
120
Cycles / Output Tuple
80
No Partitioning
40
0
32768
32768
131072
131072
64
4096
64
4096
256
512
1024
8192
256
512
1024
8192
Radix / 1-Pass Radix / 2-Pass Source: Spyros Blanas
CMU 15-721 (Spring 2017)
44
E F F E C T S O F H Y P E R-T H R E A D I N G
Intel Xeon CPU X5650 @ 2.66GHz
Uniform Data Set
No Partitioning Radix Ideal
11
9
Speedup
7
5
3
Hyper-Threading
1
1 3 5 7 9 11
Source: Spyros Blanas
Threads CMU 15-721 (Spring 2017)
44
E F F E C T S O F H Y P E R-T H R E A D I N G
Intel Xeon CPU X5650 @ 2.66GHz
Uniform Data Set
No Partitioning Radix Ideal
11 Multi-threading hides cache &
TLB miss latency.
9
Speedup
7
5
3
Hyper-Threading
1
1 3 5 7 9 11
Source: Spyros Blanas
Threads CMU 15-721 (Spring 2017)
44
E F F E C T S O F H Y P E R-T H R E A D I N G
Intel Xeon CPU X5650 @ 2.66GHz
Uniform Data Set
No Partitioning Radix Ideal
11 Multi-threading hides cache &
TLB miss latency.
9
Radix join has fewer cache &
Speedup
7
TLB misses but this has
5 marginal benefit.
3
Hyper-Threading
1
1 3 5 7 9 11
Source: Spyros Blanas
Threads CMU 15-721 (Spring 2017)
44
E F F E C T S O F H Y P E R-T H R E A D I N G
Intel Xeon CPU X5650 @ 2.66GHz
Uniform Data Set
No Partitioning Radix Ideal
11 Multi-threading hides cache &
TLB miss latency.
9
Radix join has fewer cache &
Speedup
7
TLB misses but this has
5 marginal benefit.
3 Non-partitioned join relies on
Hyper-Threading
1 multi-threading for high
1 3 5 7 9 11 performance. Source: Spyros Blanas
Threads CMU 15-721 (Spring 2017)
45
PA R T I N G T H O U G H T S
NEXT CLASS