Professional Documents
Culture Documents
Peter Boncz
Senior Research Scientist @ CWI
Senior Lecturer @ Vrije Universiteit Amsterdam
Architect & Co-founder MonetDB
Architect & Co-founder VectorWise
2
The Story of VectorWise - Keynote BDA 25/10/2012, Rabat Morocco
Winning an Award
For research contribution on column stores
and architecture-conscious database architecture
3
The Story of VectorWise - Keynote BDA 25/10/2012, Rabat Morocco
Driving Some Fancy Cars
original
Time (sec)
C-store
enable
late
column-oriented enable materialization
join algorithm compression &
“Column-Stores vs Row-Stores: operate on compressed
How Different are They
Really?” Abadi,
The Story Hachem, and
of VectorWise - Keynote BDA 25/10/2012, Rabat Morocco
Madden. SIGMOD 2008. 8
Some Architectural Differences
storage system
read-optimized: dense-packed, compressed
batch & differential updates
multiple sort orders instead of secondary indexes
deep prefetching in scans
execution engine
vectorized operators
compressed execution
optimized relational operators
from 7x slower
Bit-vector Encoding
Quarter Product ID
1 2 3
… … …
ProductID, COUNT(*))
(Q1, 1, 300) 1 0 0
301-306 0 0 1 (1, 3)
(Q2, 301, 6) 0 1 0
(2, 1)
(Q3, 307, 500) 1 0 0
1 0 0 (3, 2)
(Q4, 807, 600) 0 0 1
0 1 0
SELECT ProductID, Count(*)
1 0 0 FROM table
0 0 1 WHERE (Quarter = Q2)
0 1 0 GROUP BY ProductID
Index Lookup + Offset jump 0 0 1
… … …
The Story of VectorWise - Keynote BDA 25/10/2012, Rabat Morocco
“Integrating Compression and Execution in Column-
Oriented Database Systems” Abadi et. al, SIGMOD ‟06
Selection getNext()
Operator decompressIntoArray() (Q1, 1, 300)
getValueAtPosition(pos)
(Q2, 301, 6)
getMin() Daniel Abadi
Compression- getMax() C-Store (Q3, 307, 500)
Aware Scan getSize() ACM Dissertation Award
(Q4, 807, 600) 2010
Operator founder Hadapt
The Story of VectorWise - Keynote BDA 25/10/2012, Rabat Morocco
The History Of
??
Simple, hard-
coded semantics
in operators
MATERIALIZED
SELECT id, name, (age-30)*50 as bonus intermediate
FROM people
WHERE age > 30
results
Plus:
use of cache-conscious algorithms for Sort/Aggr/Join
Run-time query optimization: recycling and cracking, etc
Liberal open-source license (monetdb.cwi.nl)
C program: 0.2s
MonetDB: 3.7s
MySQL: 26.2s
DBMS “X”: 28.1s
next() 7 *–50
37 30 SELECT id, name
(age-30)*50 AS bonus
102 ivan 37 7 350
FROM employee
WHERE age > 30
PROJECT
next() 22 > 30 ?
37
102
101 ivan
alice 37 FALSE
22 TRUE
SELECT
next()
102
101 ivan
alice 37
22 SCAN
A Look at the Query Pipeline
next() 7 *–50
37 30
Operators
PROJECT
Iterator interface
next() 22 > 30 ?
37 -open()
-next(): tuple
SELECT -close()
next()
SCAN
A Look at the Query Pipeline
next() 7 *–50
37 30
next()
PROJECT
next()
SELECT
next()
SCAN
next()
PROJECT
next()
SELECT
next()
101 alice 22
102 ivan 37
104 peggy 45
105 victor 25 SCAN
next()
PROJECT
> 30 ?
next() 101 alice 22
102 ivan 37
104 peggy 45
105 victor 25 SELECT
next()
101 alice 22
102 ivan 37
104 peggy 45
105 victor 25 SCAN
“Vectorized In Cache
Processing”
next()
vector = array of ~100
next()
101 alice 22
102 ivan 37
104 peggy 45
105 victor 25 SCAN
Observations:
next()
101 alice 22
102 ivan 37
104 peggy 45
105 victor 25 SCAN
next() res[i]
350 = (col[i] * x)
750 101 alice 22
102 ivan 37
104 peggy 45
105 victor 25 SCAN
next()
PROJECT
next() 101 alice 22 FALSE
102 ivan 37 TRUE
104 peggy 45 TRUE
105 victor 25 FALSESELECT
next()
101 alice 22
102 ivan 37
104 peggy 45
105 victor 25 SCAN
- 30 * 50
next()
102 ivan 37
104 peggy 45
PROJECT
next() 101 alice 22 FALSE
102 ivan 37 TRUE
104 peggy 45 TRUE
105 victor 25 FALSESELECT
next()
101 alice 22
102 ivan 37
104 peggy 45
105 victor 25 SCAN
next()
102 ivan 37 7 350
104 peggy 45 15 750
PROJECT
next() 101 alice 22 FALSE
102 ivan 37 TRUE
104 peggy 45 TRUE
105 victor 25 FALSESELECT
next()
101 alice 22
102 ivan 37
104 peggy 45
105 victor 25 SCAN
next()
102 ivan 37 7 350
104 peggy 45 15 750
PROJECT
next() 101 alice 22 FALSE
102 ivan 37 TRUE
104 peggy 45 TRUE
105 victor 25 FALSESELECT
next()
101 alice 22
102 ivan 37
104 peggy 45
105 victor 25 SCAN
Memory Hierarchy
VectorWise engine
CPU
cache
RAM ColumnBM
(buffer manager)
(raid)
The Story of VectorWise -Disk(s)
Keynote BDA 25/10/2012, Rabat Morocco
“MonetDB/X100: Hyper-Pipelining Query
Execution ” Boncz, Zukowski, Nes, CIDR‟05
Memory Hierarchy
VectorWise engine
RAM ColumnBM
(buffer manager)
(raid)
The Story of VectorWise -Disk(s)
Keynote BDA 25/10/2012, Rabat Morocco
“MonetDB/X100: Hyper-Pipelining Query
Execution ” Boncz, Zukowski, Nes, CIDR‟05
“MonetDB/X100” (VectorWise)
Both efficiency
Vectorized primitives
and scalability..
Pipelined query evaluation
C program: 0.2s
VectorWise: 0.6s
MonetDB: 3.7s
MySQL: 26.2s
DBMS “X”: 28.1s
46
The Story of VectorWise - Keynote BDA 25/10/2012, Rabat Morocco
OLAP: Scan Thrashing
47
The Story of VectorWise - Keynote BDA 25/10/2012, Rabat Morocco
Cooperative scans
TABLE0
SID STORE PROD NEW QTY RID
0 Berlin chair Y 20 0
0 Berlin cloth Y 5 1
0 Berlin table Y 10 2
0 London chair N 30 3
1 London stool N 10 4
2 London table N 20 5
3 Paris rug N 1 6
4 Paris stool N 5 7
TABLE1
TABLE0
SID 0
Insert Value Table
∆ 2 1
STORE PROD NEW QTY
i0 Berlin table Y 20
i1 Berlin cloth Y 5
SID 0 0 SID 0 i2
Berlin chair Y 10
type ins ins type ins
value i2 i1 value i0
2 London table N 20 5
3 Paris rug N 1 6
4 Paris stool N 5 7
TABLE1
SID 0
Insert Value Table
∆ 2 -1
STORE PROD NEW QTY
i0 Berlin table Y 20
i1 Berlin cloth Y 5
SID 0 0 SID 3 i2
Berlin chair Y 10
type ins ins type del
value i2 i1 value d0
4 Paris stool N 5 5
TABLE2
SID 0
Insert Value Table
∆ 2 -1
STORE PROD NEW QTY
i0 Berlin
Paris table
Rack Y 20
4
RID 5 > 0 + 2
i1 Berlin cloth Y 5
SID 0 0 SID 3 3 i2
Berlin chair Y 10
type ins ins type ins
del del
value i2 i1 value di00 d0
SID 0 SID 3
∆ 2 1 ∆ 1 0
∆ 2 ∆ 4
RID 2 RID 7
∆ 0 1 ∆ 2 ∆ 3 ∆ 4 5
RID 0 1 RID 2 RID 4 RID 7 8
PDT t2
t3
t0
vs PDT t1 are PDT
PDT
PDT
t2
PDT t3
t1
consecutive t2=t1 PDT t2
PDT
PDT
PDT
t0
PDT
PDT
PDT
PDT t1
Table
PDT t2
t3
t0
vs PDT t1 are
consecutive t2=t1
aligned t2=t0
“same base”
PDT
PDT
Table
PDT t2
t3
t0
vs PDT t1 are
Table
TABLEx
Write-PDT
Propagate()
Read-PDT
Stable Table
Copy
Write-PDT trans-PDT into write-PDT
TABLEx
Write-PDT
Read-PDT
Stable Table
Trans Trans
A Trans B Trans
PDT PDT
Copy Copy
Write-PDT Write-PDT
TABLEx
Write-PDT
Read-PDT
Stable Table
Copy Copy
Write-PDT Write-PDT
TABLEx
Write-PDT
Read-PDT
Stable Table
Write-PDT
Read-PDT
Stable Table
67
The Story of VectorWise - Keynote BDA 25/10/2012, Rabat Morocco
TPC-H
political situation in TPC
HDS transition
hardware companies rule
APIs (JDBC,ODBC)
SQL parser
Query optimizer
Ingres Execution X100 CrossComp
Ingres Storage Rewriter
Vectorized Execution
PAX/DSM Storage
Updates/Transactions
APIs (JDBC,ODBC)
SQL parser
Query optimizer
Ingres Execution X100 CrossComp
Ingres Storage Rewriter
Vectorized Execution
PAX/DSM Storage
X100 Table Type
DDL Updates/Transactions
APIs (JDBC,ODBC)
SQL parser
Query Optimizer
Ingres Execution X100 CrossComp
Ingres Storage Rewriter
Vectorized Execution
Still used to get the logical PAX/DSM Storage
plan order
Updates/Transactions
APIs (JDBC,ODBC)
SQL parser
Query Optimizer
Ingres Execution X100 CrossComp
Ingres Storage Rewriter
Vectorized Execution
PAX/DSM Storage
Translates optimized SQL
plans into VectorWise Updates/Transactions
Relational Algebra
APIs (JDBC,ODBC)
SQL parser
Query Optimizer
Ingres Execution X100 CrossComp
Ingres Storage Rewriter
Vectorized Execution
PAX/DSM Storage
Updates/Transactions
Binary Releases
Download from www.ingres.com/vectorwise
www.actian.com/vectorwise