[Figure: Number of Participants in Assignments; Exam Average Results]

Goals

Deep technical understanding of a column-oriented, dictionary-encoded in-memory database and its application in enterprise computing

- The future of enterprise computing
- Foundations of database storage techniques
- In-memory database operators
- Advanced database storage techniques
- Foundations for a new enterprise application development era

December 2006: The Basic Idea

Learning Map

- The Future of Enterprise Computing
- Foundations of Database Storage Techniques
- In-Memory Database Operators
- Advanced Database Storage Techniques
- Foundations for a New Enterprise Application Development Era

Chapter 1: The Future of Enterprise Computing
Dr.-Ing. Jürgen Müller

New Requirements for Enterprise Computing

Sensors

Tracing Pharmaceutical Packages in Europe
- 15 billion packages / 34 billion read events per year
- Distributed repositories for storing read events
- References to read events are stored in a central discovery service

Monitoring F1 Car Performance
- Between 300 and 600 sensors per car
- Multiple events per second per sensor
- Tracking every Grand Prix lap or test run

Combination of Structured and Unstructured Data

Airplane Maintenance at Boeing
- Maintenance workers at Boeing write reports after repairs
- Reports and mentioned part numbers get indexed
- Analytics reveal which parts in other planes may be defective

Medical Diagnosis at Charité
- Doctors write medical reports after every diagnosis
- Diagnoses and mentioned symptoms get indexed
- Comparison with similar cases for optimal treatment

Mobile

Mobile inverts traditional corporate structures
- Enables customer-facing personnel
- Response times < 1 second
- Example: Dunning

[Figure labels: Sales, Service; Operations; Controlling; Consolidation; Strategy; Consumers / Customers; CEO]

Enterprise Application Characteristics
Challenge: Diverse Applications

Transactional Data Entry
Sources: machines, transactional apps, user interaction, etc.

Sources: machines, sensors, high-volume systems

Real-time Analytics, Structured Data
Sources: reporting, classical analytics, planning, simulation

Data Management

OLTP vs. OLAP

- Modern enterprise resource planning (ERP) systems are challenged by mixed workloads, including OLAP-style queries. For example:
  - OLTP-style: create sales order, invoice, accounting documents; display customer master data or sales order
  - OLAP-style: dunning, available-to-promise, cross-selling, operational reporting (list open sales orders)
- But: today's data management systems are optimized either for daily transactional or for analytical workloads, storing their data along rows or columns

OLTP: Online Transaction Processing
OLAP: Online Analytical Processing

Drawbacks of the Separation

- The OLAP system does not have the latest data
- The OLAP system has only a predefined subset of the data
- A cost-intensive ETL process has to synchronize both systems
- There is a lot of redundancy
- Different data schemas introduce complexity for applications combining sources

OLTP Access Pattern Myth

- Workload analysis of enterprise systems shows: OLTP and OLAP workloads are not that different

Vision

Combine OLTP and OLAP data using modern hardware and database systems to create a single source of truth, enable real-time analytics, and simplify applications and database structures.

Additionally,
- extraction, transformation, and loading (ETL) processes
- pre-computed aggregates and materialized views
become obsolete.

Enterprise Data Characteristics

Standard enterprise software data is sparse and wide

- Many columns are not used even once
- Many columns have a low cardinality of values
- NULL values / default values are dominant
- Sparse distribution facilitates high compression

Low Cardinality of Values Within Many Columns

Results from analyzing financials: distinct values in accounting document headers (99 attributes)

[Figure: chart by industry: CPG, Logistics, High Tech, Discrete manufacturing, Banking]
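The link between low cardinality and compression can be sketched with a small dictionary-encoding example. This is a minimal illustration under stated assumptions, not the encoding of any particular database: the function names are hypothetical, the dictionary is assumed to be sorted, NULL handling is left out, and the sample rows are invented (only the industry labels come from the chart above).

```python
def dictionary_encode(column):
    """Return (dictionary, value_ids) for a list of non-NULL column values."""
    # Assumption: a sorted dictionary; a value's position is its value ID.
    dictionary = sorted(set(column))
    position = {value: value_id for value_id, value in enumerate(dictionary)}
    value_ids = [position[value] for value in column]
    return dictionary, value_ids


def bits_per_value_id(dictionary):
    """Bits needed to address every dictionary entry (at least 1)."""
    return max(1, (len(dictionary) - 1).bit_length())


def decode(dictionary, value_ids):
    """Reconstruct the original column from dictionary and value IDs."""
    return [dictionary[value_id] for value_id in value_ids]


# Hypothetical column: the industry labels appear in the slides, the rows do not.
industry = ["Banking", "Logistics", "Banking", "High Tech", "Banking", "Logistics"]
dictionary, value_ids = dictionary_encode(industry)

print(dictionary)                     # ['Banking', 'High Tech', 'Logistics']
print(value_ids)                      # [0, 2, 0, 1, 0, 2]
print(bits_per_value_id(dictionary))  # 2
```

Under this scheme a column with at most 32 distinct values needs at most 5 bits per stored value ID, regardless of how long the original strings are.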
Many Columns are not Used Even Once

- 55 unused columns per company on average
- 40 unused columns across all companies

[Figure: % of columns by number of distinct values]

Number of Distinct Values    Inventory Management    Financial Accounting
1 - 32                       78%                     64%
33 - 1023                    9%                      12%
1024 - 100000000             13%                     24%

Wide Tables

Analysis of width of 144 most used* tables
* Largest in terms of cardinality

[Figure: histogram of # tables (axis 0 to 30) over # columns; bins: 1-9, 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, 82, 99, 110-119, 120-129, 138, 140-149, 156, 180-189, 200-209, 230, 312, 399]

Changes in Hardware
Advances in Hardware

- 64-bit address space: 4 TB in current server boards
- 25 GB/s data throughput, CPU to DRAM
- Cost-performance ratio rapidly declining
- Multi-core architecture: 8 x (8-16) core CPUs per blade
- Parallel scaling across blades
- One blade = $50,000 = 1 Enterprise Class Server

Copy 50 GB data via InfiniBand: 10 s

Memory Hierarchy

[Figure: CPU Registers, CPU Caches, Main Memory, Flash, Hard Disk; higher performance toward the CPU registers, lower price / higher latency toward the hard disk]

Latency Numbers

Operation                                  Latency
L1 cache reference (cached data word)      0.5 ns
Branch mispredict                          5 ns
L2 cache reference                         7 ns
Mutex lock/unlock                          25 ns
Main memory reference                      100 ns = 0.1 µs
Send 2K bytes over 1 Gb/s network          20,000 ns = 20 µs
SSD random read                            150,000 ns = 150 µs
Read 1 MB sequentially from memory         250,000 ns = 250 µs
Disk seek                                  10,000,000 ns = 10 ms
Send packet CA to Netherlands to CA        150,000,000 ns = 150 ms

A Blueprint of SanssouciDB
SanssouciDB: An In-Memory Database for Enterprise Applications

[Figure: SanssouciDB architecture. Labels: Interface Services and Session Management; Query Execution; Metadata; TA Manager; Distribution Layer at Blade i / Server i; Main Memory at Blade i / Server i; Active Data; Main Store; Differential Store; Merge; Column; Combined Column; Indexes; Inverted; Object Data Guide; Non-Volatile Memory; Log; Snapshots; Passive Data (History); Recovery; Logging; Time travel; Data aging]

In-Memory Database (IMDB)

- Data resides permanently in main memory
- Main memory is the primary “persistence”
- Still: logging to disk / recovery from disk
- Main memory access is the new bottleneck
- Cache-conscious algorithms / data structures are crucial (locality is king)
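Why locality matters once main memory is the bottleneck can be sketched with a rough count of cache lines. The example below compares scanning one attribute in a row layout and in a column layout. It is a back-of-the-envelope model under loud assumptions: a 64-byte cache line, a hypothetical 400-byte row, a 4-byte attribute stored at the start of each row, and every touched cache line costing one main memory reference (100 ns, the figure from the latency table). Hardware prefetching and cache reuse are ignored, so the result is an upper bound, not a measurement.

```python
CACHE_LINE_BYTES = 64            # assumption: typical cache line size
MAIN_MEMORY_REFERENCE_NS = 100   # main memory reference from the latency table


def cache_lines_touched(num_rows, value_bytes, stride_bytes):
    """Count distinct cache lines read when scanning one attribute.

    stride_bytes is the distance between consecutive values of the attribute:
    the full row width in a row layout, value_bytes in a column layout.
    """
    lines = set()
    for row in range(num_rows):
        start = row * stride_bytes
        end = start + value_bytes - 1
        first_line = start // CACHE_LINE_BYTES
        last_line = end // CACHE_LINE_BYTES
        lines.update(range(first_line, last_line + 1))
    return len(lines)


NUM_ROWS = 100_000     # hypothetical table size
VALUE_BYTES = 4        # hypothetical attribute width
ROW_BYTES = 400        # hypothetical row width (e.g. 100 attributes of 4 bytes)

row_lines = cache_lines_touched(NUM_ROWS, VALUE_BYTES, ROW_BYTES)
column_lines = cache_lines_touched(NUM_ROWS, VALUE_BYTES, VALUE_BYTES)

for name, lines in (("row layout", row_lines), ("column layout", column_lines)):
    worst_case_ms = lines * MAIN_MEMORY_REFERENCE_NS / 1_000_000
    print(f"{name}: {lines} cache lines, up to {worst_case_ms} ms")
```

With these assumed sizes the row layout touches one cache line per row, while the column layout packs 16 values into each line, which is the effect the "locality is king" bullet points at.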