Data warehousing with MySQL

[Diagram: data from MS-SQL, Oracle, MySQL, DB2 and flat files consolidated into MySQL]

Free and Open Source Software

MySQL is licensed under the GPL.
The GPL is a Free and Open Source
Software (FOSS) license that grants
licensees many rights to the software
under the condition that, if they
choose to share the software, or
software built with GPL-licensed
software, they share it under the
same liberal terms.

Free and Open Source Software


Advantages of Open Source
MySQL has an installed base of more than
5 million active installations.
New releases are downloaded immediately
by users, who provide early feedback
on bugs and features.
Access to the source code.
Write your own features or a proprietary
storage engine.
Freedom!

Data Warehousing application


A data warehouse is a relational
database.
It is designed for query and analysis
rather than for transaction
processing.
It enables an organization to
consolidate data from several
sources.

Extraction, Transformation and Loading


[Diagram: ETL pipeline.
Extract: Data Source to Staging Tables.
Transform: MERGE and BULK INSERT.
Load / Storage: MERGE Tables (SWH, AWH).
Performance: Indexes, Memory (HEAP), Views, Summary.
Consumers: OLTP / BI Users.]

Extraction, Transformation and Loading

Staging database
LOAD DATA INFILE command
Merging of SQL statements
Segregating information
View enhancements
Index enhancements
Memory manipulation

Extraction, Transformation and Loading


Staging Area and its benefits
Relational table structures are
flattened in the staging area to support
the extract processes.
Data is first loaded into a temporary
table and then into the main database tables.
Reduces the space required during ETL.
Data can be distributed to any number
of data marts.
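
The staging flow above can be sketched roughly as follows. This is a minimal illustration, not the presenter's actual schema: the table names `staging_sales` and `sales`, the column layout, the file path `/tmp/sales.csv` and the cleanup rules are all assumptions made for the example. The server's `secure_file_priv` setting may also restrict where `LOAD DATA INFILE` can read from.

```sql
-- Hypothetical flat staging table: no keys or indexes, so bulk loading stays cheap.
CREATE TABLE staging_sales (
  sale_date  DATE,
  store_code CHAR(10),
  amount     DECIMAL(10,2)
) ENGINE=MyISAM;

-- Hypothetical main warehouse table.
CREATE TABLE sales (
  sale_id    INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  sale_date  DATE NOT NULL,
  store_code CHAR(10) NOT NULL,
  amount     DECIMAL(10,2) NOT NULL
) ENGINE=MyISAM;

-- Extract: bulk-load the flat file into the staging table.
-- The path and CSV layout (comma-separated, one header line) are assumed.
LOAD DATA INFILE '/tmp/sales.csv'
INTO TABLE staging_sales
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(sale_date, store_code, amount);

-- Transform and load: normalize the store code and keep only
-- rows with a positive amount (an illustrative rule).
INSERT INTO sales (sale_date, store_code, amount)
SELECT sale_date, UPPER(TRIM(store_code)), amount
FROM staging_sales
WHERE sale_date IS NOT NULL
  AND amount > 0;

-- Empty the staging table so the next batch starts clean.
TRUNCATE TABLE staging_sales;
```

Keeping the staging table free of indexes is a design choice for load speed; the cleanup then happens once, in the `INSERT ... SELECT` step.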

Partitioning and Storage Engine


The MERGE Table
A collection of identical
MyISAM tables used as one.
You can use SELECT,
DELETE, UPDATE, and
INSERT on the collection of
tables.
Use it when tables are
large.
When you DROP the MERGE table, you
drop only the MERGE specification;
the underlying tables are not affected.
Advantages: manageability
and performance

[Diagram: MERGE SALES table "Sales for Yr04" combining the monthly tables Aug04, Sep04 and Oct04]

Partitioning and Storage Engine


Merging based on month as range

[Diagram: monthly tables JUN2004, JUL2004, AUG2004, SEP2004 and OCT2004 merged into one table covering the range JUN2004 to OCT2004]

Partitioning and Storage Engine


MERGE Table Example
mysql> CREATE TABLE jan04 (
    ->   a INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    ->   message CHAR(20)) ENGINE=MyISAM;
mysql> CREATE TABLE feb04 (
    ->   a INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    ->   message CHAR(20)) ENGINE=MyISAM;
mysql> CREATE TABLE year04 (
    ->   a INT NOT NULL AUTO_INCREMENT,
    ->   message CHAR(20), INDEX(a))
    ->   ENGINE=MERGE UNION=(jan04,feb04) INSERT_METHOD=LAST;
(Older servers also accepted TYPE=MERGE; current MySQL versions require ENGINE=.)
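
As a hedged usage sketch, assuming the `jan04`, `feb04` and `year04` tables from the example above already exist, the MERGE table can be read and written like this. The inserted values are made up for illustration, and the explicit key `100` is used only to avoid relying on AUTO_INCREMENT behavior through the MERGE table.

```sql
-- Load some rows directly into the underlying monthly tables.
INSERT INTO jan04 (message) VALUES ('Testing'), ('table'), ('jan04');
INSERT INTO feb04 (message) VALUES ('Testing'), ('table'), ('feb04');

-- Reading the MERGE table returns the rows of both monthly tables.
SELECT a, message FROM year04;

-- With INSERT_METHOD=LAST, an insert through the MERGE table is
-- written to the last table in the UNION list, here feb04.
INSERT INTO year04 (a, message) VALUES (100, 'via year04');
SELECT a, message FROM feb04 WHERE a = 100;

-- Dropping the MERGE table removes only the MERGE definition;
-- jan04 and feb04 keep their rows.
DROP TABLE year04;
SELECT COUNT(*) FROM jan04;
```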

Partitioning and Storage Engine


MyISAM Storage Engine

Supports MERGE tables.
Supports full-text indexing.
The INSERT DELAYED option is very useful
when clients cannot wait for the INSERT
to complete. Rows from many clients are
bundled together and written in one block.
Compress MyISAM tables with
myisampack to take up much less
space.
Benefit from higher performance on
SELECT statements.

Partitioning and Storage Engine


Restrictions on MERGE tables
You can use only identical MyISAM tables
for a MERGE table.
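MERGE tables use more file descriptors. If
10 clients are using a MERGE table that
maps to 10 tables, the server uses (10*10)
+ 10 file descriptors: 10 data file
descriptors for each of the 10 clients, plus
10 index file descriptors shared among them.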
Key reads are slower. When you read a key,
the MERGE storage engine needs to issue a
read on all underlying tables to check
which one most closely matches the given
key.

Partitioning and Storage Engine

my.cnf parameters for DWH (example)


key_buffer = 1G
myisam_sort_buffer_size = 256M
sort_buffer = 5M
query_cache_type
query_cache_size = 100M

key_buffer is the most important one: it caps
how much memory MySQL uses to cache MyISAM
index blocks.

Business Intelligence
Using MySQL database server
Drastically reduce information retrieval time by
distributing data into replicated clusters,
which enables parallel processing.
Tighter storage format (3 TB squeezed to
1 TB).
Aggregate huge amounts of data and deliver
reports for OLAP.
Relieve overloaded OLTP databases.
Availability, scalability and throughput for
the most demanding applications, and of
course affordability.
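
One common way to aggregate data for OLAP-style reports and take load off the detail tables is a summary table. The sketch below is only an illustration under assumed names: `sales_detail`, `sales_monthly_summary` and their columns are hypothetical and do not come from the slides.

```sql
-- Hypothetical detail table; in practice this would be the loaded fact table.
CREATE TABLE sales_detail (
  sale_date  DATE NOT NULL,
  store_code CHAR(10) NOT NULL,
  amount     DECIMAL(10,2) NOT NULL
) ENGINE=MyISAM;

-- Summary table: one row per month and store, filled after each load.
CREATE TABLE sales_monthly_summary (
  sale_month   CHAR(7) NOT NULL,        -- formatted as 'YYYY-MM'
  store_code   CHAR(10) NOT NULL,
  total_amount DECIMAL(14,2) NOT NULL,
  sale_count   INT NOT NULL,
  PRIMARY KEY (sale_month, store_code)
) ENGINE=MyISAM;

-- Aggregate the detail rows once, at load time.
INSERT INTO sales_monthly_summary (sale_month, store_code, total_amount, sale_count)
SELECT DATE_FORMAT(sale_date, '%Y-%m'), store_code, SUM(amount), COUNT(*)
FROM sales_detail
GROUP BY DATE_FORMAT(sale_date, '%Y-%m'), store_code;

-- Reports then read the small summary table instead of scanning detail rows.
SELECT sale_month, SUM(total_amount) AS revenue
FROM sales_monthly_summary
GROUP BY sale_month
ORDER BY sale_month;
```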

Summary

Free and Open Source under GPL


MyISAM Storage Engine
No Transactional Overhead
MERGE Table
Tighter storage format
Highly efficient
