Agenda
Introduction
Columnar Stores
Oracle In-Memory
Analytics
Loading
Conclusions
2014 sumIT AG
03/2012
sumIT AG
Consulting and implementation services in Switzerland
Experts for Data Warehousing and Business Intelligence solutions
Holger Friedrich
Computer Science diploma from Karlsruhe Institute of Technology (KIT)
Ph.D. in Robotics and Machine Learning
More than 16 years of experience with Oracle technology
Expert for Data Integration, Data Warehousing, Data Mining and Business Intelligence
Columnar Databases
Advantages
Best for queries that
- scan large quantities of data
- on a rather small set of columns
- compute aggregates on the results
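The access pattern above can be illustrated with a minimal sketch. The table, its column names, and all values are hypothetical; the sketch only shows that a column layout lets an aggregate read one column instead of every full row.

```python
# Hypothetical toy table: (sale_id, promo_id, amount, channel).
rows = [
    (1, 9999, 10.0, "web"),
    (2, 1234, 25.5, "store"),
    (3, 9999, 4.5, "web"),
]

# Column layout: one list per column, built here from the same rows.
columns = {
    "sale_id": [r[0] for r in rows],
    "promo_id": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
    "channel": [r[3] for r in rows],
}


def total_amount_row_store(rows):
    # Row layout: every complete row is visited to pick out one field.
    return sum(row[2] for row in rows)


def total_amount_column_store(columns):
    # Column layout: only the "amount" column is read.
    return sum(columns["amount"])
```

Both functions return the same total; the difference is how much data has to be touched to compute it.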
Competition
Niche vendors
- Exasol
- HP Vertica
- Infobright
- ParAccel
Oracle In-Memory
Technology Gems
1. In-memory storage index
2. Filtering on binary compressed data
3. Columnar storage of selected columns
4.
5.
6.
7.
8.
9.
[Figure: storage index example with Min 13, Max 15]
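How minimum and maximum values can be used to skip data is sketched below. This is a simplified illustration, not Oracle's implementation: the `ColumnUnit` class and the `scan_equal` function are hypothetical names, and the values are made up apart from the 13 to 15 range.

```python
from dataclasses import dataclass


@dataclass
class ColumnUnit:
    """A chunk of column values with its min/max recorded once."""
    values: list

    def __post_init__(self):
        self.min_value = min(self.values)
        self.max_value = max(self.values)


def scan_equal(units, target):
    """Return matching values and the number of units actually scanned."""
    hits = []
    scanned = 0
    for unit in units:
        # Skip the whole unit if the target cannot be inside it.
        if target < unit.min_value or target > unit.max_value:
            continue
        scanned += 1
        hits.extend(v for v in unit.values if v == target)
    return hits, scanned
```

For a unit holding values between 13 and 15, a search for 14 scans that unit only, while a search for 17 skips every unit whose range excludes it.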
Single Instruction processing Multiple Data values
- Evaluation of a set of column values in a single CPU instruction cycle
- Potential to speed up processing to billions of rows per second
Example: Find all sales with PROMO_ID 9999
[Figure: the CPU loads multiple PROMO_ID values from memory into a vector register and compares all values with 9999 in 1 cycle]
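The control flow of such a vector compare can be emulated in a few lines. This is only a conceptual sketch: plain Python does not issue SIMD instructions, the lane count of 4 is an assumption, and `vector_compare` and `simd_style_filter` are hypothetical names.

```python
LANES = 4  # assumed number of values that fit into one vector register


def vector_compare(register, target):
    # Stands in for one SIMD compare: one boolean result per lane.
    return [value == target for value in register]


def simd_style_filter(column, target, lanes=LANES):
    """Return the row positions whose value equals target."""
    matches = []
    for start in range(0, len(column), lanes):
        register = column[start:start + lanes]  # load multiple values
        mask = vector_compare(register, target)  # compare all lanes at once
        matches.extend(start + i for i, hit in enumerate(mask) if hit)
    return matches
```

Calling `simd_style_filter([9999, 17, 9999, 42, 5, 9999], 9999)` processes the column in two register loads instead of six single comparisons.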
In-Memory Aggregation
Phase 1
1. Scan dimensions
2. Build key vectors
3. Prepare accumulator
4. Build tmp-tables for dim select attributes
Phase 2 - computation
5. Scan facts w.r.t. key vectors
6. Join filtered facts with tmp-tables
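The steps above can be sketched for a single dimension as follows. This is a simplified illustration under stated assumptions, not Oracle's implementation: the function name, the data shapes, and the choice to aggregate into the accumulator during the fact scan are all assumptions.

```python
def vector_group_by(dimension, facts, wanted):
    """dimension: (join_key, attribute) pairs; facts: (join_key, amount) pairs."""
    # Scan the dimension and build the key vector: it maps each surviving
    # join key to a dense group id; filtered-out keys are simply absent.
    key_vector = {}
    tmp_table = []   # group id -> selected dimension attribute
    group_ids = {}
    for join_key, attribute in dimension:
        if attribute not in wanted:
            continue
        if attribute not in group_ids:
            group_ids[attribute] = len(tmp_table)
            tmp_table.append(attribute)
        key_vector[join_key] = group_ids[attribute]

    # Prepare the accumulator: one slot per group id.
    accumulator = [0.0] * len(tmp_table)

    # Scan the facts with respect to the key vector.
    for join_key, amount in facts:
        group_id = key_vector.get(join_key)
        if group_id is not None:
            accumulator[group_id] += amount

    # Join the aggregated result back to the tmp-table attributes.
    return dict(zip(tmp_table, accumulator))
```

The fact table is scanned once, and the join to the dimension attributes happens on the small aggregated result rather than on every fact row.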
Fault tolerance
(engineered systems only)
- DUPLICATE clause keeps redundant IMCU copies on other nodes
- DUPLICATE ALL = each IMCU copied to every node
Assessment
[Figure: timeline (t) comparing the Scan Data and Aggregate phases for Row Store and In-Memory]
Analytics
few joins
simple one-step aggregations (if at all)
lots of filtering
sometimes many rows to be displayed
Processing
-
In-Memory impact
high performance gains
some joins
complex analytic functions
lots of filtering
often many attributes to be displayed
Processing
-
In-Memory impact
performance gain, but smaller than for simpler reporting queries
Processing
-
In-Memory impact
the gain from using the In-Memory option depends on query complexity
the need for pre-computing (some) aggregates remains
Dimensional Queries
Characteristics
-
In-Memory impact
high performance gain
report type, no of rows, times X
simple: 400K, 35, (SGA), 10ms, 2ms
join: 2M, 25s, 25s, (bloom)
join, top10 (analytics): 10, 2s, 1s
dimensional (vector by): 88, 8s, 0.8s, 10
Loading
Challenge
-
Observation
- gain depends significantly on test complexity
[Figure: mapping table with columns src system, gender, DWH gender]
Challenge
Strategy
- populate only columns to be transformed into column store
- check population time vs. speed gain
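The check of population time against speed gain amounts to simple arithmetic, sketched below. The function name and all timings are hypothetical; real numbers have to be measured on the system in question.

```python
def population_pays_off(population_s, row_store_s, in_memory_s, runs=1):
    """True if populating the column store first is faster overall.

    population_s: one-off time to populate the column store (seconds)
    row_store_s:  time of one transformation run against the row store
    in_memory_s:  time of one transformation run against the column store
    runs:         how often the transformation runs on the populated data
    """
    return population_s + runs * in_memory_s < runs * row_store_s
```

With assumed timings of 8 s per row-store run and 0.8 s per in-memory run, a 5 s population pays off on the first run, whereas a 10 s population only pays off from the second run onward.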
Key Transformations
Typical scenario
- transformation of source-dependent natural/business keys into a DWH-owned surrogate representation
- reverse lookups for data mart loading
- multiple (outer) joins against target tables
- typical case of (outer) joins without aggregation
Challenge
- staging tables initially not in column store
Strategy
- populate only rows to be transformed into column store
- check population time vs. speed gain
- also works with lookup tables in columnar format and the staging table in row format
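The typical scenario, an outer join from staging rows to a key lookup without aggregation, can be sketched as follows. The function name, the field names `src_system`, `business_key` and `surrogate_key`, and the sample data are hypothetical.

```python
def assign_surrogate_keys(staging_rows, key_lookup):
    """Outer-join staging rows to a lookup of DWH surrogate keys.

    key_lookup maps (src_system, business_key) to a surrogate key.
    Rows without a match are kept, with surrogate_key set to None,
    which mirrors the NULL produced by an outer join.
    """
    result = []
    for row in staging_rows:
        enriched = dict(row)  # leave the staging row itself untouched
        lookup_key = (row["src_system"], row["business_key"])
        enriched["surrogate_key"] = key_lookup.get(lookup_key)
        result.append(enriched)
    return result
```

Every staging row appears exactly once in the output, which is why this kind of load is dominated by join cost rather than aggregation cost.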
[Code listing: query selecting s.cutoffdt from a staging subquery s (where rownum < 100000) with four (+) outer-join conditions]
4. hash-join bloom filter false positives
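Why a Bloom filter in front of a hash join produces false positives, and why they cost time but never change the result, can be shown with a small sketch. This is a generic textbook construction, not Oracle's implementation; the class, the function, the filter size, and the hash scheme are all assumptions.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=64, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit set stored in a single integer

    def _positions(self, key):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def bloom_hash_join(build_rows, probe_rows):
    """Join (key, payload) pairs; also count Bloom filter false positives."""
    bloom = BloomFilter()
    hash_table = {}
    for key, payload in build_rows:
        bloom.add(key)
        hash_table.setdefault(key, []).append(payload)

    joined = []
    false_positives = 0
    for key, payload in probe_rows:
        if not bloom.might_contain(key):
            continue  # filtered out cheaply before the hash probe
        if key not in hash_table:
            false_positives += 1  # passed the filter, failed the real probe
            continue
        for build_payload in hash_table[key]:
            joined.append((key, build_payload, payload))
    return joined, false_positives
```

A false positive only means that a probe row reached the hash table unnecessarily; the hash probe still rejects it, so the join result stays correct.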
Conclusions
Oracle In-Memory is a game changer in the DWH/BI market
in contrast to the niche players, it is absolutely enterprise ready
in contrast to the other big players, its use requires no modifications