
Turbocharging MySQL Reporting and Data Warehousing

© 2008 Kickfire, Inc. All rights reserved.
Agenda 

MySQL intro 
Kickfire background 
Technology overview 
Demo 
Customer case studies 
Q&A

MySQL Data Warehousing Market 
Growing use for data warehousing and BI
• MySQL DW deployed at 28% of MySQL customers
• Strong DW ecosystem support

MySQL and Kickfire Break Records 

100 GB TPC-H™ records [1]
• #1 performance, non-clustered
• #1 price/performance

300 GB TPC-H™ records [2]
• #1 performance, non-clustered
• #1 price/performance

System tested is fully ACID compliant

[1] The Kickfire Database Appliance Series 2300 achieved 49,228 QphH@100GB. The price/performance is $0.70/QphH@100GB USD. The total Kickfire system price over three years is $34,425 USD.
[2] The Kickfire Database Appliance Series 2400 achieved 54,895 QphH@300GB. The price/performance is $0.89/QphH@300GB USD. The total Kickfire system price over three years is $48,790 USD.
The appliances will be available October 14, 2008. TPC-H is a registered trademark of the Transaction Processing Performance Council (TPC).
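
The price/performance figures in the notes can be rechecked from the quoted totals. This is a minimal sketch that assumes price/performance is simply the three-year system price divided by the QphH score, rounded to cents; the dictionary keys and the function name are illustrative, not taken from any Kickfire or TPC tooling.

```python
# Figures copied from the benchmark notes on this slide.
systems = {
    "Series 2300 @ 100GB": {"qphh": 49_228, "three_year_price_usd": 34_425},
    "Series 2400 @ 300GB": {"qphh": 54_895, "three_year_price_usd": 48_790},
}


def price_performance(price_usd: float, qphh: float) -> float:
    """Dollars per QphH, rounded to cents (the rounding rule is an assumption)."""
    return round(price_usd / qphh, 2)


results = {
    name: price_performance(s["three_year_price_usd"], s["qphh"])
    for name, s in systems.items()
}

for name, value in results.items():
    print(f"{name}: ${value:.2f}/QphH")
```

Under that assumption the computed values match the $0.70 and $0.89 quoted in the notes.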

Quick Poll #1 

What types of problems have you experienced with reporting and query performance?

• Can't support ad hoc or complex reports & queries
• Needed to restrict the number of users with reporting access
• Lots of tuning and maintenance necessary
• Needed to add more hardware
• No significant problems

DW and Reporting Problems We Hear About 

Technical issues
• Reports and queries take too long to run
• Lots of tuning and maintenance necessary
• Difficulty supporting ad-hoc queries or complex reports 

Key features needed
• Parallel query
• I/O subsystem performance
• Data warehouse-focused optimizer 

Business impact
• Difficulty in growing the system with the business
• Too much time, money, and effort spent on maintenance
• Value for business and for customers limited

Who is Kickfire?

World’s first high-performance appliance for MySQL

• Makes MySQL rock for reporting and queries
• Affordable, low-power, load-and-go appliance
• Scalable from GBs to TBs

Quick Poll #2 

Do you have a dedicated MySQL reporting and/or data warehousing server today?

• Yes
• No
• No, but plan to

The Technology (1/2)

Conventional systems
• SQL : assembly code = 1 : 1,000,000,000,000
• Parallel query = massive h/w buildout
• von Neumann bottleneck = CPUs idle
• I/O bottleneck = massive storage

With the SQC
• SQL : MOPS = 1 : 10
• Query parallelization done on a SINGLE SQC, drastically reducing h/w footprint
• SQC operates directly out of memory, NOT limited by registers
• SQC uses column store, compression, intelligent indexes, and pre-fetches from storage

The Technology (2/2)

[Diagram: host x86 processors with unified host memory and a storage array, alongside the Dataflow Engine (SQL Chip) with unified database memory]

Dataflow Pipelined Parallel Processing (SQL Chip)

What’s in the Box

MySQL
1. Connectivity
2. Security
3. Administration

KDB
1. Optimizer
2. Column store & cache
3. Transactional engine

SQL Chip
1. SQL execution
2. Memory management
3. Loader acceleration

Deploying the Box 

Setup
• Pre-configured appliance to deliver high performance on a broad set of schemas and queries

Data movement and loading
• Utility to migrate from existing MySQL data sources
• Fast Loader can load 100GB/hr; Incremental Loader for changes
• Works with ETL tools certified for MySQL 

Reporting and queries
• Supports full MySQL SQL syntax
• Works with business intelligence tools certified for MySQL

Product Line Overview

2000 Series
• Size: 2 RU
• Capacity: up to 0.8TB
• Power use: ~600 W
• Price: starting at ~$20k

3000 Series
• Size: 3 RU
• Capacity: up to 3TB
• Power use: ~700 W
• Price: starting at ~$65k

Quick Poll #3 

If you use MySQL for reporting or data warehousing, how much data are you reporting on today?

• 1GB – 49GB
• 50GB – 99GB
• 100GB – 999GB
• 1TB – 5TB
• > 5TB

Demo

Customer Case Study #1: Overview 

Customer
• Public company with large online communities 

Problems
• Canned reports are too slow, and report development is too slow because too many aggregate tables are needed
• Can’t create a new revenue service based on ad hoc queries 

Customer environment
• 1 TB MySQL DW with multiple customer DBs of clickstream data
• Two dual-CPU servers with 16GB RAM 

Kickfire system
• Kickfire Database Appliance 2200 with 32GB RAM

Customer Case Study #1: Results

Test platform
• One 45GB customer DB
• 50 million rows in fact table
• Sample query below

Results
• Average 35X improvement

Customer Case Study #2: Overview 

Customer
• Successful high-tech startup in network management domain 

Problems
• Can’t scale to larger volumes of network data
• Revenue tied to scaling of network monitoring service 

Customer environment
• Up to 1 TB DW of customer network data 

Kickfire system
• Kickfire Database Appliance 2300 with 64GB RAM

Customer Case Study #2: Results 

Test platform
• 450M rows in each fact table
• Over 125 tables
• 450GB total
• Sample query below

Results
• Average 600X improvement over current system results
• In the lab, a 60X improvement was achieved with partitions etc., but it required major app re-architecture

Q&A 

Additional information
• Product is currently in beta; program currently oversubscribed
• If interested in trials, please email sales@kickfire.com
• For more info, download our white paper and data sheet 

Stay in touch
• Read our blog: www.kickfire.com/blog
• Send us an email: info@kickfire.com
• Call us at 408.450.5400

Appendix

TPC-H Queries 

• Multiple aggregations over a large number of rows (Q1)
• Multiple tables in the query (Q2, Q5, Q7, Q8, Q9)
• LIKE predicate [over a large number of rows] and string functions (Q2, Q9, Q13, Q16, Q22)
• ORDER BY with LIMIT (top 10-100 rows) (Q2, Q3, Q10, Q18, Q21)
• Correlated subqueries (Q4, Q17, Q20, Q21, Q22)
• CASE expression (will highlight how well you can handle conditional expressions) (Q8, Q12, Q14)
• Large GROUP BY key (Q10, Q18)
• Left outer join (Q13)
• Queries over views (Q15)
• Complex filters involving AND-OR (Q19)

Randomized data and query parameters

100GB Performance: Kickfire vs. MyISAM

Query      Kickfire (secs)   MyISAM (secs)   Speedup
Query 1         34.1            3737.54         110
Query 2          2.2           10800           4909
Query 3          7.5           10800           1440
Query 4          3.6           10800           3000
Query 5          4.6           10800           2348
Query 6          2.3           10800           4696
Query 7          5.1            4328.22         849
Query 8          0.8            4122.62        5153
Query 9          9.9           10800           1091
Query 10         4             10800           2700
Query 11         2.7            2726.2         1010
Query 12         5             10800           2160
Query 13        36.8           10800            293
Query 14         1.5            2345.68        1564
Query 15         4.4           10800           2455
Query 16         5.9             693.56         118
Query 17         0.6            1895.55        3159
Query 18        26             10800            415
Query 19         2.2            4682.19        2128
Query 20         2.1            1450.42         691
Query 21        23.6           10800            458
Query 22         3.7             100.72          27

Ran all 22 queries to completion
• MyISAM timed out after three hours on 12 of the queries

Over 1000x faster on average
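
The speedup column and the summary claims can be rechecked from the raw timings. The sketch below copies the per-query seconds from this slide and assumes that speedup means MyISAM time divided by Kickfire time and that "on average" is the arithmetic mean of the per-query speedups; the variable names are illustrative.

```python
# (query number, Kickfire seconds, MyISAM seconds), copied from the slide.
TIMEOUT_SECS = 10_800  # three hours, the MyISAM cutoff stated on the slide

timings = [
    (1, 34.1, 3737.54), (2, 2.2, 10800), (3, 7.5, 10800), (4, 3.6, 10800),
    (5, 4.6, 10800), (6, 2.3, 10800), (7, 5.1, 4328.22), (8, 0.8, 4122.62),
    (9, 9.9, 10800), (10, 4, 10800), (11, 2.7, 2726.2), (12, 5, 10800),
    (13, 36.8, 10800), (14, 1.5, 2345.68), (15, 4.4, 10800), (16, 5.9, 693.56),
    (17, 0.6, 1895.55), (18, 26, 10800), (19, 2.2, 4682.19), (20, 2.1, 1450.42),
    (21, 23.6, 10800), (22, 3.7, 100.72),
]

# Speedup per query: MyISAM time / Kickfire time (assumed definition).
speedups = {q: myisam / kickfire for q, kickfire, myisam in timings}

# Queries where MyISAM hit the three-hour cutoff.
timed_out = [q for q, _, myisam in timings if myisam >= TIMEOUT_SECS]

# Arithmetic mean of the per-query speedups (assumed meaning of "average").
mean_speedup = sum(speedups.values()) / len(speedups)

print(f"MyISAM timeouts: {len(timed_out)} of {len(timings)} queries")
print(f"Mean speedup: {mean_speedup:.0f}x")
```

Because the timed-out MyISAM runs were stopped at 10,800 seconds, the speedups for those queries are lower bounds rather than measured ratios.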

Chip Exploits Fine-Grained Parallelism

[Figure: dataflow operator tree over columns T1.a, T2.b, T2.c, and T2.d, with replicated selection (σ), projection (π), and aggregation (γ f(.)) operators using predicates p0(a), p1(a,b), and p2(c,d)]

• Pipelined parallelism
• Data-partitioned parallelism
• Independent-operator parallelism
• Inter-query parallelism
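
Two of these ideas, pipelined and data-partitioned parallelism, can be illustrated in ordinary software. The sketch below is only an analogy and says nothing about how the SQL Chip is built: the operator functions, the sample rows, and the predicates (named after p0(a) and p1(a,b) in the figure) are all hypothetical. Python threads also do not give true CPU parallelism for this workload, so the example shows the structure of the plan, not a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def scan(rows):
    """Source operator: emit rows one at a time."""
    yield from rows


def select(rows, predicate):
    """Selection operator: pass through rows that satisfy the predicate."""
    return (row for row in rows if predicate(row))


def aggregate(rows, column):
    """Aggregation operator: sum one column."""
    return sum(row[column] for row in rows)


def run_partition(partition):
    # Pipelined: each row flows scan -> select -> select -> aggregate
    # without materializing intermediate results (generators are lazy).
    stage1 = select(scan(partition), lambda r: r["a"] > 0)       # p0(a)
    stage2 = select(stage1, lambda r: r["a"] < r["b"])           # p1(a, b)
    return aggregate(stage2, "b")


def run_query(rows, partitions=4):
    # Data-partitioned: the same pipeline runs on disjoint slices of the
    # input, and the partial aggregates are combined at the end.
    chunks = [rows[i::partitions] for i in range(partitions)]
    with ThreadPoolExecutor(max_workers=partitions) as pool:
        partials = list(pool.map(run_partition, chunks))
    return sum(partials)


rows = [{"a": i - 5, "b": i} for i in range(20)]
total = run_query(rows)
print(total)
```

The partitioned result equals what a single pipeline over all rows would return, which is the property that makes splitting the input safe for this kind of aggregate.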

KDB: Transactional Storage Engine 

Fully ACID compliant
• KDB supports serializable isolation level
• TPC-H benchmark requires full ACID 

Automatic crash recovery using write-ahead logs
• Automatic check-pointing for fast recovery from crashes 

Automatic deadlock detection and rollback 

Support for streaming DML for operational BI
• Future release to allow concurrent updates with long running queries
• Similar to MVCC but without the expensive overhead

Loading Utilities 

Fast loader for initial load
• Efficiently loads data using bulk processing (100GB in 1 hr)
• Leverages the SQL Chip to execute the bulk operations 

Incremental Loader for periodic loading
• Bulk loading of inserts, deletes, and updates, offloaded to the SQL Chip
• Row-store-like efficiency on column-store data

Data Migration tools utilize Fast and Incremental Loaders
• Schema migration + data type mapping + index migration
• Foreign key migration or declaration
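
For planning an initial load, the quoted Fast Loader rate gives a rough estimate. This sketch assumes the 100GB/hr figure is sustained and scales linearly with data size, which the slides do not state; the function name is hypothetical, and the 45GB and 450GB inputs are the data sizes from the two case studies.

```python
FAST_LOADER_GB_PER_HOUR = 100  # rate quoted on this slide


def estimated_load_hours(data_gb: float,
                         rate_gb_per_hour: float = FAST_LOADER_GB_PER_HOUR) -> float:
    """Rough initial-load time, assuming a constant, linear load rate."""
    return data_gb / rate_gb_per_hour


for size_gb in (45, 450):
    print(f"{size_gb} GB: about {estimated_load_hours(size_gb):.2f} hours")
```

Actual load times will depend on schema, data types, and source throughput, so treat the result as an order-of-magnitude guide.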

KDB supported Indexing 

Basic indexing is automatically generated
• All primary keys automatically get a B+ tree index
• All foreign keys automatically get an FK index

The following indices can be created/deleted by the user
• Any Date/Time column can have a Date-range index
• Any string column can have a Word index
• B+ tree index for any data type 

Any automatically created indices can be deleted

Leveraging MySQL SE Architecture

[Diagram: MySQL pluggable storage engine architecture with the KDB Storage Engine plugged in. Source: MySQL 5.0 Pluggable Storage Engine Architecture document]

• Parsed query is intercepted by KDB and executed in the Chip
• MySQL optimizer and execution are bypassed
• Answer is returned through MySQL

Typical Features in DW Queries
Select
• Many columns; complex expressions; aggregations
• Proj. optimization

From
• In-line views (subselect)

Where
• Complex selection conditions: large IN lists; CASE/IF; OR conditions; primary key condition; functions (EXTRACT, SUBSTRING, LIKE)
• Complex join conditions: FK joins; EQ joins
• EXISTS, NOT EXISTS, IN, NOT IN subselect (correlated)

Group By
• Multi-column, expressions
• Functions/HW limits

Having
• Condition on aggregations

Order By
• Order by on multiple columns and aggregations, with limit
• Order by "limited values"
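
A single query can combine most of the features listed above. The sketch below is illustrative only: it uses Python's built-in sqlite3 module as a stand-in engine (not MySQL or Kickfire), and the `customers` and `orders` tables and their contents are hypothetical. The SQL itself sticks to constructs named on this slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    status TEXT,
    amount REAL
);
INSERT INTO customers VALUES (1, 'EMEA'), (2, 'APAC'), (3, 'AMER');
INSERT INTO orders VALUES
    (1, 1, 'shipped', 100.0),
    (2, 1, 'returned', 40.0),
    (3, 2, 'shipped', 250.0),
    (4, 2, 'shipped', 50.0),
    (5, 3, 'returned', 10.0);
""")

query = """
SELECT c.region,
       COUNT(*) AS order_count,
       SUM(CASE WHEN o.status = 'shipped' THEN o.amount ELSE 0 END) AS shipped_total
FROM (SELECT * FROM orders WHERE amount > 0) AS o       -- in-line view
JOIN customers AS c ON c.id = o.customer_id             -- FK equi-join
WHERE c.region IN ('EMEA', 'APAC', 'AMER')              -- IN list
  AND EXISTS (SELECT 1 FROM orders AS o2                -- correlated subselect
              WHERE o2.customer_id = c.id AND o2.status = 'shipped')
GROUP BY c.region
HAVING SUM(o.amount) > 100                              -- condition on aggregation
ORDER BY shipped_total DESC                             -- order by aggregation
LIMIT 10                                                -- with limit
"""

rows = conn.execute(query).fetchall()
for region, order_count, shipped_total in rows:
    print(region, order_count, shipped_total)
```

The AMER customer has no shipped orders, so the correlated EXISTS filter removes that region before grouping.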