
Scaling MySQL to New Heights

ScaleDB Technical Presentation
Thursday April 17, 2008
ScaleDB for MySQL

[Diagram: the Database layer sits above the Storage Engine layer (InnoDB, MyISAM, Cluster, Falcon, BDB, Merge, etc.)]

ScaleDB
What Makes ScaleDB Better?
• ScaleDB Advantages:
  • Performance: New indexing delivers dramatic performance improvement
  • Scalability: Designed for clustering with Plug-and-Cluster™ Architecture

Improving Performance

ScaleDB Indexing
• Special-purpose index add-ons*: Hash, Bitmap, Aggregate, etc.
• General-purpose indexing: conventional (B-tree) indexing
• ScaleDB Index: A general purpose index that also delivers much of the functionality and performance of special-purpose index add-ons
*Only supported by high-end commercial databases

ScaleDB: Multi-Table Indexing
B-tree: Only indexes the data in tables
[Diagram: five separate indexes, Index #1 to Index #5, one per table #1 to #5]
ScaleDB: Indexes the data and relationships
[Diagram: one ScaleDB Index spanning tables #1 to #5]
Advantages:
• Faster
• Smaller
• Referential integrity
• More functionality

Describing Our Demo
• Scenario: Select information that is spread across 3 tables: Colleges, Students and Enrollment
• Relationships: Students are enrolled in courses within departments of colleges
• DDL Definitions

The Query
SELECT c1.CollName, s.StudName, c2.CourseName, e.Grade
FROM College AS c1
  STRAIGHT_JOIN Student AS s
  STRAIGHT_JOIN Enrollment AS e
  STRAIGHT_JOIN Course AS c2
    ON ( c1.CollNo = s.CollNo AND
         s.CollNo = e.CollNo AND
         s.StudentNo = e.StudentNo AND
         e.CollNo = c2.CollNo AND
         e.DeptNo = c2.DeptNo AND
         e.CourseNum = c2.CourseNum )
WHERE c1.CollNo = X
  AND s.StudentNo = Y;

A Sample Scenario
• Scenario: I need information that is spread across
3 tables: Colleges, Students and Enrollment
• Options:
• #1: Conventional Joins
• #2: Materialized View
• #3: ScaleDB
[Diagram: three tables side by side, labelled Colleges, Students and Enrollment. The column headings of the second and third tables read Dept_ID#, Dept_Name, Coll_ID#, Dept_Budget and Course_ID#, Course_Name, Coll_ID#, Dept_ID#.]

Colleges
Coll_ID#  Coll_Name   Coll_Budget  Coll_Description
0001      Amherst     $1,234,567   Nice place to visit
0002      Berkeley    $5,432,567   Sports not so good
0003      Harvard     $9,999,666   Cool logo
0004      Holy Cross  $3,234,567   Ugh Worcester
0005      MIT         $8,238,568   Serious work
0006      Cornell     $7,237,767   Jumpy students
0007      Stanford    $9,898,777   Pretty campus
0008      TCU         $5,987,004   In Texas

The Query
SELECT c1.CollName, s.StudName, c2.CourseName, e.Grade
FROM College AS c1
  STRAIGHT_JOIN Student AS s
  STRAIGHT_JOIN Enrollment AS e
  STRAIGHT_JOIN Course AS c2
    ON ( c1.CollNo = s.CollNo AND
         s.CollNo = e.CollNo AND
         s.StudentNo = e.StudentNo AND
         e.CollNo = c2.CollNo AND
         e.DeptNo = c2.DeptNo AND
         e.CourseNum = c2.CourseNum )
WHERE c1.CollNo = X
  AND s.StudentNo = Y;

Execution plan (EXPLAIN):
+-------+--------+----------------+---------+------+----------+
| table | type   | key            | key_len | rows | filtered |
+-------+--------+----------------+---------+------+----------+
| c1    | const  | PRIMARY        | 4       |    1 |   100.00 |
| s     | const  | PRIMARY        | 14      |    1 |   100.00 |
| e     | ref    | idx_EnrollStud | 4       |    3 |   100.00 |
| c2    | eq_ref | PRIMARY        | 17      |    1 |   100.00 |
+-------+--------+----------------+---------+------+----------+

Option #1: Conventional Joins

[Diagram: each table has its own index(es): Colleges Index(s), Students Index(s), Enrollment Index(s). The tables are combined with two joins: Colleges to Students, and Students to Enrollment.]

Colleges
Coll_ID#  Coll_Name     Coll_Budget  Coll_Description
001       Agriculture   $1,234,567   Nice place to visit
002       Arts          $5,432,567   Sports not so good
003       Business      $9,999,666   Cool logo
004       Education     $3,234,567   Ugh Worcester
005       Engineering   $8,238,568   Serious work
006       Law           $7,237,767   Jumpy students
007       Liberal Arts  $9,898,777   Pretty campus
008       Medicine      $5,987,004   In Texas

Students
Student_ID#  College_ID#  Student_Name   Student_Desc
56-8033      008          Mike Hogan     Caucasian
56-8045      008          Moshe Smith    Caucasian
56-8044      008          Sally Shadmon  Native American
56-8055      008          Billy Fleegle  African American
56-8037      008          Saul Goode     African American
56-8122      008          Tim Collins    Polynesian
56-8233      008          Sam Gee        Asian
56-8334      008          Rod Paulino    Asian

Enrollment
College_ID#  Dept_ID#  Student_ID#  Grade
008          4455      56-8037      B+
008          4455      56-8033      C
008          4455      56-8045      B+
008          4456      56-8044      A-
008          4456      56-8122      B-
008          4454      56-8233      C
008          4455      56-8334      F
008          4454      56-8055      D

Query Result:
008 Medicine $5,987,004 In Texas | 56-8037 Saul Goode African American | 4455 B+ |
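The step-wise behaviour of conventional joins can be sketched in a few lines of Python. This is a toy model under stated assumptions, not MySQL internals: the dictionaries stand in for per-table indexes, the function name `stepwise_join` is hypothetical, and only a few rows from the slide are loaded.

```python
# Toy model of Option #1: every table has its own index, and the join is
# resolved one table at a time, with one index lookup per step.

colleges = {  # stand-in for the Colleges primary-key index: Coll_ID# -> row
    "008": ("Medicine", "$5,987,004", "In Texas"),
}
students = {  # stand-in for the Students primary-key index: Student_ID# -> row
    "56-8037": ("008", "Saul Goode", "African American"),
    "56-8033": ("008", "Mike Hogan", "Caucasian"),
}
enrollment = [  # table rows: (College_ID#, Dept_ID#, Student_ID#, Grade)
    ("008", "4455", "56-8037", "B+"),
    ("008", "4455", "56-8033", "C"),
    ("008", "4456", "56-8044", "A-"),
]

# Separate secondary index on Enrollment.Student_ID# -> row positions.
enroll_by_student = {}
for position, row in enumerate(enrollment):
    enroll_by_student.setdefault(row[2], []).append(position)


def stepwise_join(coll_id, student_id):
    """Return (joined rows, number of separate index lookups performed)."""
    lookups = 0

    college = colleges.get(coll_id)          # step 1: Colleges index
    lookups += 1
    if college is None:
        return [], lookups

    student = students.get(student_id)       # step 2: Students index
    lookups += 1
    if student is None or student[0] != coll_id:
        return [], lookups

    positions = enroll_by_student.get(student_id, [])  # step 3: Enrollment index
    lookups += 1

    joined = []
    for position in positions:
        e_coll, dept_id, _, grade = enrollment[position]
        if e_coll == coll_id:
            joined.append((coll_id, *college, student_id, *student[1:], dept_id, grade))
    return joined, lookups


rows, lookups = stepwise_join("008", "56-8037")
```

Each table contributes its own lookup, so the work grows with the number of tables in the join.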

Option #2: Materialized View

Materialized View: Copies (and synchronizes) the data from individual tables into one massive view

[Diagram: the Colleges, Students and Enrollment tables are copied into a single Materialized View, which has its own Materialized View Indexes]

Materialized View
Coll_ID#  Coll_Name    Coll_Budget  Coll_Description     Student_ID#  Student_Name  Student_Desc  Dept_ID#  Grade
001       Agriculture  $1,234,567   Nice place to visit  56-8033      Mike Hogan    Caucasian     3345      A
001       Agriculture  $1,234,567   Nice place to visit  56-8033      Mike Hogan    Caucasian     3235      B+
001       Agriculture  $1,234,567   Nice place to visit  56-8033      Mike Hogan    Caucasian     3235      A+
001       Agriculture  $1,234,567   Nice place to visit  56-8034      Paul Martyn   Caucasian     3239      A-
…………
008       Medicine     $5,987,004   In Texas             56-8039      Paul Martyn   Caucasian     4454      A-
008       Medicine     $5,987,004   In Texas             56-8039      Paul Martyn   Caucasian     4454      B
008       Medicine     $5,987,004   In Texas             56-8039      Paul Martyn   Caucasian     4454      A+

Query Result:
008 Medicine $5,987,004 In Texas | 56-8037 Saul Goode African American | 4455 B+ |
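The copy-and-synchronize cost of a materialized view can be illustrated with a small emulation. This is a sketch under assumptions: it uses Python's built-in sqlite3 module so that it runs without a server, SQLite has no native materialized views so the view is emulated as a table rebuilt by a hypothetical `refresh_view` helper, and the table and column names are simplified stand-ins for the ones on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE college    (coll_id TEXT PRIMARY KEY, coll_name TEXT);
CREATE TABLE student    (student_id TEXT PRIMARY KEY, coll_id TEXT, student_name TEXT);
CREATE TABLE enrollment (coll_id TEXT, dept_id TEXT, student_id TEXT, grade TEXT);
INSERT INTO college    VALUES ('008', 'Medicine');
INSERT INTO student    VALUES ('56-8037', '008', 'Saul Goode');
INSERT INTO enrollment VALUES ('008', '4455', '56-8037', 'B+');
""")


def refresh_view(conn):
    """Rebuild the emulated materialized view by copying the joined data."""
    conn.executescript("""
    DROP TABLE IF EXISTS enrollment_mv;
    CREATE TABLE enrollment_mv AS
        SELECT c.coll_id, c.coll_name, s.student_id, s.student_name,
               e.dept_id, e.grade
        FROM college AS c
        JOIN student AS s ON s.coll_id = c.coll_id
        JOIN enrollment AS e ON e.coll_id = s.coll_id
                            AND e.student_id = s.student_id;
    """)


def grade_from_view(conn, student_id):
    """Read a grade from the copy, never from the base tables."""
    row = conn.execute(
        "SELECT grade FROM enrollment_mv WHERE student_id = ?", (student_id,)
    ).fetchone()
    return row[0] if row else None


refresh_view(conn)
before = grade_from_view(conn, "56-8037")

# The base table changes, but the copy does not know about it yet.
conn.execute("UPDATE enrollment SET grade = 'A' WHERE student_id = '56-8037'")
stale = grade_from_view(conn, "56-8037")

refresh_view(conn)  # synchronization step: the whole copy is rebuilt
fresh = grade_from_view(conn, "56-8037")
```

Reads against the copy need no join, but the data is only as fresh as the last refresh, and every refresh pays for the full join again.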

Option #3: ScaleDB
ScaleDB’s multi-table index is relationship-aware

[Diagram: one ScaleDB Index links College, Departments, Students, Courses and Enrollment. Label: A Single Index Lookup]

Colleges
Coll_ID#  Coll_Name     Coll_Budget  Coll_Description
001       Agriculture   $1,234,567   Nice place to visit
002       Arts          $5,432,567   Sports not so good
003       Business      $9,999,666   Cool logo
004       Education     $3,234,567   Ugh Worcester
005       Engineering   $8,238,568   Serious work
006       Law           $7,237,767   Jumpy students
007       Liberal Arts  $9,898,777   Pretty campus
008       Medicine      $5,987,004   In Texas

Students
Student_ID#  College_ID#  Student_Name   Student_Desc
56-8033      008          Mike Hogan     Caucasian
56-8045      008          Moshe Smith    Caucasian
56-8044      008          Sally Shadmon  Native American
56-8055      008          Billy Fleegle  African American
56-8037      008          Saul Goode     African American
56-8122      008          Tim Collins    Polynesian
56-8233      008          Sam Gee        Asian
56-8334      008          Rod Paulino    Asian

Enrollment
College_ID#  Dept_ID#  Student_ID#  Grade
008          4455      56-8037      B+
008          4455      56-8033      C
008          4455      56-8045      B+
008          4456      56-8044      A-
008          4456      56-8122      B-
008          4454      56-8233      C
008          4455      56-8334      F
008          4454      56-8055      D

Query Result:
008 Medicine $5,987,004 In Texas | 56-8037 Saul Goode African American | 4455 B+ |
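The idea of an index that records relationships as well as data can be sketched as a toy structure. This is not a description of ScaleDB's internals, which the slides do not disclose: the class `RelationshipIndex`, its methods and the helper `lookup` are hypothetical names chosen for illustration.

```python
class RelationshipIndex:
    """Toy index holding rows from several tables plus parent/child links."""

    def __init__(self):
        self.rows = {}      # (table, key) -> row tuple
        self.children = {}  # (parent_table, parent_key, child_table) -> child keys

    def add(self, table, key, row, parent=None):
        """Store a row; if a parent is given it must already be indexed."""
        if parent is not None:
            if parent not in self.rows:
                # Referential integrity falls out of the structure itself.
                raise KeyError(f"missing parent {parent!r} for {table} {key!r}")
            self.children.setdefault(parent + (table,), []).append(key)
        self.rows[(table, key)] = row

    def related(self, table, key, child_table):
        """Follow a stored relationship from one row to its child rows."""
        keys = self.children.get((table, key, child_table), [])
        return [self.rows[(child_table, k)] for k in keys]


def lookup(index, coll_id, student_id):
    """Assemble the joined result from the one shared structure."""
    college = index.rows[("college", coll_id)]
    student = index.rows[("student", student_id)]
    return [
        (coll_id, *college, student_id, *student, *enrolled)
        for enrolled in index.related("student", student_id, "enrollment")
    ]


index = RelationshipIndex()
index.add("college", "008", ("Medicine", "$5,987,004", "In Texas"))
index.add("student", "56-8037", ("Saul Goode", "African American"),
          parent=("college", "008"))
index.add("enrollment", ("56-8037", "4455"), ("4455", "B+"),
          parent=("student", "56-8037"))

result = lookup(index, "008", "56-8037")

try:
    index.add("student", "00-0000", ("Nobody", "n/a"), parent=("college", "999"))
    orphan_rejected = False
except KeyError:
    orphan_rejected = True
```

In this sketch one structure is consulted instead of three independent indexes, and a row whose parent is missing is rejected at insert time.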

Building Relationships in ScaleDB
Create College
Create Department
- foreign key: College
Create Course
- foreign key: Department
Create Students
- foreign key: College
Create Enrollment
- foreign key: Students
Relationship creation is automated
[Diagram: College, Departments, Students, Courses, Enrollment]
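The create-table order and foreign keys listed above might translate into DDL along these lines. This is a sketch under assumptions: it runs on Python's built-in sqlite3 module rather than MySQL, column names are taken from the demo query where they appear there, `DeptName`, the data types and the sample course are hypothetical, and plain `JOIN` replaces `STRAIGHT_JOIN` because SQLite does not support that keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.executescript("""
CREATE TABLE College (
    CollNo   INTEGER PRIMARY KEY,
    CollName TEXT NOT NULL
);
CREATE TABLE Department (
    CollNo   INTEGER NOT NULL REFERENCES College (CollNo),
    DeptNo   INTEGER NOT NULL,
    DeptName TEXT,                       -- hypothetical column
    PRIMARY KEY (CollNo, DeptNo)
);
CREATE TABLE Course (
    CollNo     INTEGER NOT NULL,
    DeptNo     INTEGER NOT NULL,
    CourseNum  INTEGER NOT NULL,
    CourseName TEXT NOT NULL,
    PRIMARY KEY (CollNo, DeptNo, CourseNum),
    FOREIGN KEY (CollNo, DeptNo) REFERENCES Department (CollNo, DeptNo)
);
CREATE TABLE Student (
    CollNo    INTEGER NOT NULL REFERENCES College (CollNo),
    StudentNo TEXT NOT NULL,
    StudName  TEXT NOT NULL,
    PRIMARY KEY (CollNo, StudentNo)
);
CREATE TABLE Enrollment (
    CollNo    INTEGER NOT NULL,
    StudentNo TEXT NOT NULL,
    DeptNo    INTEGER NOT NULL,
    CourseNum INTEGER NOT NULL,
    Grade     TEXT,
    FOREIGN KEY (CollNo, StudentNo) REFERENCES Student (CollNo, StudentNo)
);
""")

# Parents are inserted before children, mirroring the creation order above.
conn.execute("INSERT INTO College VALUES (8, 'Medicine')")
conn.execute("INSERT INTO Department VALUES (8, 4455, 'Dept 4455')")
conn.execute("INSERT INTO Course VALUES (8, 4455, 101, 'Course 101')")  # hypothetical course
conn.execute("INSERT INTO Student VALUES (8, '56-8037', 'Saul Goode')")
conn.execute("INSERT INTO Enrollment VALUES (8, '56-8037', 4455, 101, 'B+')")

QUERY = """
SELECT c1.CollName, s.StudName, c2.CourseName, e.Grade
FROM College AS c1
JOIN Student AS s     ON s.CollNo = c1.CollNo
JOIN Enrollment AS e  ON e.CollNo = s.CollNo AND e.StudentNo = s.StudentNo
JOIN Course AS c2     ON c2.CollNo = e.CollNo AND c2.DeptNo = e.DeptNo
                     AND c2.CourseNum = e.CourseNum
WHERE c1.CollNo = ? AND s.StudentNo = ?
"""
rows = conn.execute(QUERY, (8, "56-8037")).fetchall()

# A student that points at a non-existent college violates the foreign key.
try:
    conn.execute("INSERT INTO Student VALUES (99, '00-0000', 'Nobody')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The foreign-key declarations are the only place the relationships are stated; nothing else has to be written to describe how the tables connect.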

Pros & Cons of Each Method

Method              Ease of Implementation  Real-Time Data  Performance  Tuning
Conventional Joins  +                       +               -            -
Materialized Views  -                       -               +            -
ScaleDB             +                       +               +            +

Performance Variables
• Early performance benchmarks
  • Used a vanilla scenario
• Our performance advantage increases with:
  • Query/Schema Complexity
  • Referential Integrity Checks
  • Key Size
  • Data Size/Number of Keys
• Performance Advantage: 2X – 20X+
MySQL Integration
• ScaleDB leverages its index to assemble data across tables without step-wise joins
• MySQL query optimizer sees multiple tables, so it calls for step-wise joins
• ScaleDB tells the query optimizer about joined tables; they are virtualized (built on the fly)
• When MySQL’s query optimizer recognizes ScaleDB, phantom tables…
Improving Scalability
The Challenges of Scaling
• How do I partition data?
• Predict usage patterns, application
evolution, data growth patterns…all are
moving targets
• Avoid data skew: bottlenecks caused by
frequently accessed data on just a few
nodes
• Data shipping between nodes (2-phase
commit)
• Searches outside the partition column
require participation by all nodes
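Two of the partitioning problems listed above, data skew and searches outside the partition column, can be shown with a small simulation. Everything here is a hypothetical setup chosen for illustration: four nodes, a modulo partition function on the college number, and a made-up student population concentrated in one college.

```python
from collections import Counter

NODES = 4  # hypothetical cluster size


def node_for(coll_id):
    """Partition function: a row lives on the node chosen by its college number."""
    return coll_id % NODES


# Hypothetical data: 90 students in college 8, 10 more spread over colleges 1 to 8.
students = [(f"S{n}", 8) for n in range(90)]
students += [(f"T{n}", n % 8 + 1) for n in range(10)]

# Data skew: count how many rows land on each node.
load = Counter(node_for(coll_id) for _, coll_id in students)


def nodes_for_query(coll_id=None):
    """Return the set of nodes a search must visit.

    A search on the partition column goes to one node. A search on any
    other column (for example a student name) cannot be routed, so every
    node has to take part.
    """
    if coll_id is not None:
        return {node_for(coll_id)}
    return set(range(NODES))


by_college = nodes_for_query(coll_id=8)
by_other_column = nodes_for_query()
```

With this data node 0 holds 92 of the 100 rows, and any search that does not name a college touches all four nodes.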
ScaleDB’s Plug-and-Cluster™
• Cluster-ready solution, just plug in a
server
• No need to partition the data
• Based on shared-everything
architecture
• Found in the highest-end commercial
databases
• Eliminates all of the data partitioning
problems
ScaleDB Cluster
[Diagram: four Local Lock Managers, one per node, all connected to Shared Storage]

ScaleDB Cluster
[Diagram: a Global Lock Manager above the Shared Storage]
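How local and global lock managers could cooperate in a shared-everything cluster can be sketched as follows. The slides show only the labelled boxes, so the protocol here is an assumption made for illustration: the class names, the exclusive-lock-only model and the request counter are all hypothetical, and none of it should be read as ScaleDB's actual design.

```python
class GlobalLockManager:
    """Single arbiter that records which node owns each shared resource."""

    def __init__(self):
        self.owner = {}    # resource -> node name
        self.requests = 0  # how many times nodes had to ask the arbiter

    def acquire(self, node, resource):
        """Grant the lock if it is free or already owned by the same node."""
        self.requests += 1
        holder = self.owner.setdefault(resource, node)
        return holder == node

    def release(self, node, resource):
        if self.owner.get(resource) == node:
            del self.owner[resource]


class LocalLockManager:
    """Per-node manager that answers repeat requests without a round trip."""

    def __init__(self, node, global_manager):
        self.node = node
        self.global_manager = global_manager
        self.held = set()

    def lock(self, resource):
        if resource in self.held:
            return True  # already owned by this node: answered locally
        if self.global_manager.acquire(self.node, resource):
            self.held.add(resource)
            return True
        return False     # another node owns it

    def unlock(self, resource):
        if resource in self.held:
            self.held.discard(resource)
            self.global_manager.release(self.node, resource)


glm = GlobalLockManager()
node_a = LocalLockManager("node-a", glm)
node_b = LocalLockManager("node-b", glm)

first = node_a.lock("page-1")    # goes to the global manager
second = node_a.lock("page-1")   # served locally, no new global request
requests_after_repeat = glm.requests
blocked = node_b.lock("page-1")  # refused: node-a owns the page
node_a.unlock("page-1")
granted = node_b.lock("page-1")  # now free, so node-b gets it
```

Because all nodes see the same storage, the only coordination in this sketch is about who may touch a resource, not about where the data lives.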
Demo

In a nutshell…

MySQL + ScaleDB

MySQL

Summary
• Revolutionary indexing solution delivers a quantum leap in performance & scalability
• Results:
  • Performance improvements of 2X and up
  • 7X smaller index size (average)
  • Stop jumping through hoops to avoid joins…FREE JOINS!
• Enables more complex applications, fresh data, lower TCO, superior scalability & performance
• We’re looking for appropriate beta testers