
Another MySQL Performance Talk

Morgan Tocker <morgan@mysql.com>


Topics I’ll cover

• How *not* to benchmark

• Disk, CPU, RAM, or Network?

• MySQL Topologies
How *not* to benchmark
My favorite quote...

• “I tested both MyISAM and InnoDB and accessing the same URL
1000 times it was always faster in MyISAM....”
First-past-the-post tests suck.
Why?

• By using the same URL you almost guaranteed that it would be in:

• The query_cache (if enabled)

• The MySQL key_cache (if MyISAM) for the index blocks

• The InnoDB buffer pool (both indexes and data) and the InnoDB
adaptive hash

• A filesystem cache. Maybe even a disk cache, or the drive head
might be at the right location to minimize seek.
What else is wrong?

• You’ve also omitted other key problems from your test.

• It can’t catch potential deadlocks or table locking.

• It misses any issues where disk starvation etc. occurs.


My second favorite quote...

• “My system’s average load is 0.10 with 100 connections, so I should be
able to support somewhere up to 1000 connections.”

• Rarely true.

• Bottlenecks shift.

• Multi-core and multi-CPU machines are exposing new problems.


BUG #15815, BUG #30738, BUG #26442
Tools to use
• Apache benchmark (ab)

• http_load

• Similar to ab, can accept an input file of URLs (apache access log
makes a good victim)

• Custom shell scripts

• curl $url
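As a sketch of the access-log-as-victim idea: build a URL list for http_load from an Apache log. The log lines and hostname below are made-up placeholders; point awk at your real access log.

```shell
# Fabricated sample log (common log format); use your real access log.
cat > /tmp/access.log <<'EOF'
1.2.3.4 - - [01/Jan/2008:00:00:00 +0000] "GET /index.php HTTP/1.1" 200 512
1.2.3.4 - - [01/Jan/2008:00:01:00 +0000] "GET /post.php?id=1 HTTP/1.1" 200 512
5.6.7.8 - - [01/Jan/2008:00:02:00 +0000] "GET /index.php HTTP/1.1" 200 512
EOF

# $7 is the request path in the common/combined log formats
awk '{print "http://example.com" $7}' /tmp/access.log | sort -u > /tmp/urls.txt
cat /tmp/urls.txt
# then, e.g.: http_load -parallel 10 -seconds 60 /tmp/urls.txt
```

Because the list covers many URLs instead of one, the cache hit pattern is closer to real traffic than the single-URL test above.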
Disk, RAM, CPU, or Network?
Potential Bottlenecks

• DISK: The number one bottleneck. Seriously.

• RAM: Use it to cheat - get out of using your DISK wherever you can.

• CPU: May report as high if heavy wait I/O.

• Network: Maximum throughput is not normally an issue, but round trips
add latency.
My Advice:

• Better value in fixing the bigger problems first.

• Sometimes this means not buying expensive CPUs and Gigabit
Ethernet cards.
Warning! Your bottleneck may change

• Mention how to create sample data?

• Drop caches between tests.

• Mention in-memory structure != on-disk structure; we need to
work on in-memory compression.

• What percentage of your data fits nicely into RAM?

• Almost any sloppy design will work with a dataset of 1G

• Can you afford the same when your dataset is 90G?


My favorite unixy commands

• iostat -d 60 5 > /tmp/iostat.txt &


• df -h

• vmstat 60 5 > /tmp/vmstat.txt &


• free

• lsof
My favorite unixy commands
(cont.)
• ps aux
• uptime

• top

• netstat
• ping -f
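As one way to read the vmstat captures from the previous slide: a sketch that flags samples where the `wa` (I/O wait) column is high, which is the "CPU may report as high if heavy wait I/O" case. The capture below is fabricated, and the 20% threshold is an arbitrary assumption.

```shell
# Fabricated vmstat capture; normally produced by
#   vmstat 60 5 > /tmp/vmstat.txt &
cat > /tmp/vmstat.txt <<'EOF'
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0      0 102400  20480 512000    0    0    10    20  100  200 10  5 80  5
 2  3      0  51200  20480 512000    0    0   900  1500  400  800 15 10 30 45
EOF

# wa is the last column; skip the two header lines, warn above 20%
awk 'NR > 2 && $NF > 20 {print "high I/O wait: " $NF "%"}' /tmp/vmstat.txt
```

A sample tripping this check suggests the box is disk-bound even though `top` shows the CPU busy.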
My advice: Hard Drives

• RAID 10 is the best (slightly better write speed than RAID5)

• Be careful that cheaper cards apparently do RAID5 in some crappy
reduced-performance way

• Drive speed also counts (10K, 15K RPM)


My advice: Hard Drives (cont.)

• Slower drives can’t fsync() fast enough, so you’ll still be write bound.

• Battery backed write caches on RAID controllers kick arse

• You can distribute InnoDB log files or binary log files for better disk
IO distribution
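As a sketch of that distribution (the mount points below are hypothetical; pick directories on separate spindles):

```ini
[mysqld]
# data files on the main array
datadir                   = /data/mysql
# InnoDB redo logs on their own spindle: sequential, fsync()-heavy writes
innodb_log_group_home_dir = /logs/innodb
# binary logs on a third device
log-bin                   = /logs/binlog/mysql-bin
```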
Some things to try...

• mount drives with noatime

• experiment with different IO schedulers

• Temporary tables that need to be created on disk in MySQL can be
redirected to a tmpfs

• You’ll want to reduce the size of max_heap_table_size and
tmp_table_size
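One hedged sketch of the tmpfs approach; the mount point and sizes are assumptions, not recommendations:

```ini
# /etc/fstab: give MySQL a RAM-backed scratch area
# tmpfs  /var/lib/mysql-tmp  tmpfs  size=512M,mode=0750  0 0

[mysqld]
# on-disk temporary tables now land in RAM
tmpdir              = /var/lib/mysql-tmp
# keep in-memory temp tables small; spilling to "disk" is cheap now
tmp_table_size      = 32M
max_heap_table_size = 32M
```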
Topologies that scale
In the beginning...

[Diagram: Internet → a single server running both www and mysql]
The next logical step...

[Diagram: Internet → www server → separate mysql server]
The next logical step (cont.)

[Diagram: Internet → www server → separate mysql server]
Continued Growth...

[Diagram: Internet → www (xN) → writes go to the master,
reads go to the slaves]
When this works...

[Diagram: each slave spends some capacity on replicated writes, some on
reads, and still has unused headroom]

Where this hits the wall...

[Diagram: replicated writes consume nearly all of every slave’s capacity,
leaving little room for reads]

* Single replication thread makes this even worse than it looks.


You need to partition (shard).

• Use groups of Master, Slave, Slave for different tasks:

• Do logging tables ever join on the other tables?

• Maybe have a global user table and store user_details on different
groups of servers.

• Different servers for searching


The End
(Here’s where I give
time for other Q&A)