0 to 60 in 3.

Presented by, MySQL AB® & O’Reilly Media, Inc.

Tyler Carlton, Cory Sessions

<Insert funny joke here>

The Project
• Medium-sized demographics data mining project
• 1,700,000+ user base
• Hundreds of data points per user

“Legacy” System – Why Upgrade?
• Main DB (external users)
• Offline backup (internal users)
• Weekly manual copy backups
• Max of 3 simultaneous data pulls
• 8 hr+ data pull times for complex data pulls
• Random index corruption

Notes: Smaller is better. On average, CPU usage with MySQL was 20% lower than with our old database solution.

Why We Chose MySQL Cluster
• Scalable
• Distributed processing
• Five 9's (99.999%) reliability
• Instant data availability between internal & external users

What We Built – NDB Data Nodes
• 8-node NDB cluster
  - Dual Core 2 Quad 1.8 GHz
  - 16 GB RAM (data memory)
  - 6x RAID 10 SAS 15k RPM drives

What We Built – API & MGMT Nodes
• 3 API nodes + 1 management node
  - Dual Core 2 Quad 1.8 GHz
  - 8 GB RAM
  - 300 GB 7200 RPM (RAID 0)

NDB Issues with a Large Data Set
• NDB load times
  - Loading from backup: ~1 hour
  - Restarting NDB nodes: ~1 hour
Note: Load times differ depending on your data size.

NDB Issues with a Large Data Set
• Indexing issues
  - FORCE INDEX (NDB picks the wrong index)
  - Index creation/modification order matters (seriously!)
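
Note: the FORCE INDEX hint above might be applied as in the sketch below. The table `users`, the index `idx_zip`, and the column names are hypothetical and do not come from the talk; the helper only builds the SQL text and does not connect to a database.

```python
def force_index_query(table: str, index: str, columns: list[str], where: str) -> str:
    """Build a SELECT that asks MySQL to use `index` rather than its own pick."""
    cols = ", ".join(columns)
    # FORCE INDEX is MySQL's index-hint syntax; it goes right after the table name.
    return f"SELECT {cols} FROM `{table}` FORCE INDEX (`{index}`) WHERE {where}"


# Hypothetical names, for illustration only.
sql = force_index_query("users", "idx_zip", ["id", "zip"], "zip = %s")
print(sql)
```

The `%s` placeholder is left for the database driver to fill in, so values are never pasted into the SQL string directly.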

• Local checkpoint tuning
  - TimeBetweenLocalCheckpoints: 20 means 4 MB (4 × 2^20) of write operations
  - NoOfFragmentLogFiles: number of sets of 4 x 16 MB files
  - None deleted until 3 local checkpoints have completed
  - On startup: local checkpoint buffers would overrun
  - RTFM (two, maybe three times)
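
Note: the arithmetic behind these two parameters can be checked with a small sketch. It assumes TimeBetweenLocalCheckpoints is the base-2 logarithm of a count of 4-byte words and that each NoOfFragmentLogFiles unit is a set of four 16 MB files. The value 8 below is only an example, not the setting used in this project.

```python
WORD_BYTES = 4                          # TimeBetweenLocalCheckpoints counts 4-byte words
FRAGMENT_LOG_FILE_BYTES = 16 * 1024**2  # each fragment log file is 16 MB
FILES_PER_SET = 4                       # log files come in sets of four
MB = 1024**2


def lcp_write_threshold_bytes(time_between_lcp: int) -> int:
    """Bytes of write operations between local checkpoints: 4 * 2**value."""
    return WORD_BYTES * 2**time_between_lcp


def redo_log_bytes(no_of_fragment_log_files: int) -> int:
    """Total redo log size for a given NoOfFragmentLogFiles value."""
    return no_of_fragment_log_files * FILES_PER_SET * FRAGMENT_LOG_FILE_BYTES


print(lcp_write_threshold_bytes(20) // MB, "MB")  # 4 MB
print(redo_log_bytes(8) // MB, "MB")              # 512 MB (example value only)
```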

NDB Network Issues
• Network transport packet size
  - Buffer would fill and overrun
  - This caused nodes to miss their heartbeats and drop

• This would happen when:
  - A backup was running
  - A local checkpoint was running at the same time

• Solved by: increasing the network packet buffer
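
Note: the slides do not say which setting was raised. The sketch below assumes it was the TCP transporter buffers (SendBufferMemory and ReceiveBufferMemory in the [tcp default] section of config.ini). The 2M values are placeholders, not the values used in this project.

```python
import configparser
import io


def tcp_buffer_fragment(send: str, receive: str) -> str:
    """Render a [tcp default] config.ini fragment with larger transporter buffers."""
    cfg = configparser.ConfigParser()
    cfg.optionxform = str  # keep the parameter names' original case
    cfg["tcp default"] = {
        "SendBufferMemory": send,        # assumed parameter, see note above
        "ReceiveBufferMemory": receive,  # assumed parameter, see note above
    }
    out = io.StringIO()
    cfg.write(out, space_around_delimiters=False)
    return out.getvalue()


fragment = tcp_buffer_fragment("2M", "2M")  # placeholder sizes
print(fragment)
```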

Issues - IN Statements
• IN statements die with engine_condition_pushdown=ON when the set has approx. 10,000 or more items (we hit this with ZIP codes)
• We really need engine_condition_pushdown=ON, but this broke it for us, so… we had to disable it
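
Note: the fix described here was to disable engine_condition_pushdown. A different workaround, which is not from the talk and is untested against NDB, would be to split a large IN list into smaller batches. The column name `zip` and the batch size of 1,000 are arbitrary assumptions.

```python
from typing import Iterator, Sequence


def in_clause_batches(
    column: str, values: Sequence[str], batch_size: int = 1000
) -> Iterator[tuple[str, list[str]]]:
    """Yield (sql_fragment, params) pairs with at most batch_size placeholders each."""
    for start in range(0, len(values), batch_size):
        chunk = list(values[start:start + batch_size])
        placeholders = ", ".join(["%s"] * len(chunk))
        yield f"{column} IN ({placeholders})", chunk


zip_codes = [f"{n:05d}" for n in range(10_000)]  # made-up ZIP codes
batches = list(in_clause_batches("zip", zip_codes, batch_size=1000))
print(len(batches))  # 10
```

Each batch is run as its own query and the results are merged in the application, which costs extra round trips.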

Structuring Apps: Redundancy
• Redundant power supply + dual power sources
• Port trunking w/ redundant Gig-E switches
• # NDB replicas: 2 (2x4 setup), 64 GB max data size
• MySQL (API node) heads: load balanced with automatic failover
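
Note: the 64 GB figure follows from the hardware listed earlier (8 data nodes with 16 GB of data memory each, 2 replicas). A minimal sketch of that arithmetic, ignoring index memory and other overhead:

```python
def usable_data_gb(data_nodes: int, data_memory_gb: int, replicas: int) -> float:
    """Every row is stored `replicas` times, so capacity is total memory / replicas."""
    return data_nodes * data_memory_gb / replicas


def node_groups(data_nodes: int, replicas: int) -> int:
    """Each node group holds one full set of replicas for its share of the data."""
    return data_nodes // replicas


print(usable_data_gb(8, 16, 2))  # 64.0
print(node_groups(8, 2))         # 4, i.e. the "2x4" layout
```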

Structuring Apps: Internal Apps
• Ultimate goal: offload the data-intensive processing to the MySQL nodes

The Good Stuff: Stats!
Queries per Second (over 20 days)
• Average of 1,100-1,500 queries/sec during our peak times
• Average of 250 queries/sec

Website Traffic
Stats for March 2008

Net Usage: NDB Node
• All NDB data nodes have nearly identical network bandwidth usage
• MySQL (API) nodes use about 9 MB/s max under our current structure
• Totaling 75 MB/s during peak (600 Mb/s)

Monitoring & Maintenance
• SNMP monitoring: CPU, network, memory, load, disk

• Cron scripts:
  - Node status & node-down notification
  - Backups
  - Database maintenance routines

The MySQL Clustering book provided the base for the scripts.
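
Note: a node status check of the kind listed above could be sketched as follows. This is not the script from the talk. It assumes the `ndb_mgm` client is on the PATH, and the sample output is illustrative only; the exact format varies by version.

```python
import re
import subprocess


def show_cluster() -> str:
    """Run `ndb_mgm -e show` (needs a reachable management node) and return its output."""
    result = subprocess.run(
        ["ndb_mgm", "-e", "show"], capture_output=True, text=True, check=True
    )
    return result.stdout


def down_node_ids(show_output: str) -> list[int]:
    """Return the ids of all nodes whose status line says 'not connected'."""
    ids = []
    for line in show_output.splitlines():
        match = re.match(r"\s*id=(\d+)\b.*not connected", line)
        if match:
            ids.append(int(match.group(1)))
    return ids


# Illustrative sample, not captured from the cluster described in the talk.
SAMPLE = """\
[ndbd(NDB)]     2 node(s)
id=2    @10.0.0.2  (Version: 5.1.23, Nodegroup: 0, Master)
id=3 (not connected, accepting connect from 10.0.0.3)
"""
print(down_node_ids(SAMPLE))  # [3]
```

A cron job could call `down_node_ids(show_cluster())` and send a notification whenever the list is not empty.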

Net Usage: Next steps…
• Dolphin NIC testing
  - 4-node test cluster
  - 4x overall performance
  - Brand-new patch to handle automatic Ethernet failover / Dolphin failover (beta as of March 28)

Questions?

Contact Information
• Tyler Carlton: www.qdial.com, tcarlton@gmail.com
• Cory Sessions: CorySessions.com, OrangeSoda.com
