Capacity planning uses p95 or p99 response time
Servers must be underutilized to tolerate variance
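As a rough illustration of planning on p95 or p99 rather than the mean, the sketch below computes a nearest-rank percentile from a list of response times. The function name `percentile` and the one-value-per-line input format are assumptions made for this example, not something the slides specify.

```shell
# percentile P: read one response time per line on stdin and print the
# nearest-rank P-th percentile (P is an integer from 1 to 100).
percentile() {
  p="$1"
  sort -n | awk -v p="$p" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      # ceil(p * NR / 100) using integer arithmetic
      rank = int((p * NR + 99) / 100)
      if (rank < 1) rank = 1
      print v[rank]
    }'
}

# Example: response times 1..100 ms
seq 1 100 | percentile 95
seq 1 100 | percentile 99
```

A single slow outlier barely moves the mean but shows up in p99, which is why a server sized for the mean has no headroom for variance.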
Manageability
Cost of extra hardware can be predicted
Cost of downtime cannot
Downtime comes in many forms (server down and server too busy)
What is manageability?
The rate of interrupts/server for the operations team
Server count grows quickly and operations team grows slowly
Quality of service must improve over time
Why MySQL?
It
We
My peers in db eng/ops are very good
Room for new people, ideas and products
indexes queries
replication on a WAN
of it does not
Automated replacement of failed nodes
Less downtime on schema changes or fewer schema changes
Multi-master
Better compression
Write-optimized
Rows: 450M peak
Network: 38GB peak
Rows: 3.5M peak
Queries per second: 13M peak
InnoDB: 5.2M peak
Add
Flash
Write-optimized
Engineering
Fix
Deploy
Tell me what to fix
Make
Market
The git log for our MySQL branch has 452 changes.
Fix stalls to make use of capacity
Improve efficiency to use less
Repeat
Fix stalls
Don't make MySQL faster, make it less slow
Stalls from file systems
Stalls from caches in MySQL
Stalls from mutexes in MySQL
Everything else
Switch
XFS does not lock a per-inode mutex on writes
XFS has less variance on write-append
purge removes delete-marked rows
insert buffer defers IO for secondary index maintenance
Arrival rate exceeds completion rate
Throughput collapses when cache is full
Solution
InnoDB files
and kernel_mutex
innodb_thread_concurrency
Group commit
Admission control
Solution
Some Linux kernels get the big kernel lock on fcntl calls
MySQL called it too often
Doubled peak QPS by changing MySQL to call it less
Problem is now fixed in official MySQL
Fix is simple
Disable it and rely on lock wait timeout
Detection is now more efficient in official MySQL
With 1000+ sleeping threads it can take too long to wake one
Allow some threads to run in FIFO order
When a new thread arrives, run it if other threads are slow to wake
FIFO + LIFO = FLIFO
Commit HW
Group commit
Modified
Fix was fun
Uses a group commit timeout
Threads only wait when other threads are about to commit (magic)
Useful side effect
Good: preserve the rows read rate, limit threads running
Better: preserve query completion rate, limit queries running
Admission control
Simple TP monitor in MySQL
Limits max concurrent queries per database account
Does the right thing when a query blocks on IO and lock waits
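The per-account limit can be pictured with a small simulation. This is only a sketch of the idea, not the code in the MySQL branch: the function name `admission_control`, the `start ACCOUNT` / `end ACCOUNT` event format and the `admit` / `wait` / `admit-waiter` output labels are all invented for the example, and it does not model the handling of queries that block on IO or lock waits.

```shell
# admission_control LIMIT: read "start ACCOUNT" / "end ACCOUNT" events on
# stdin and print the admission decision for each query start.
admission_control() {
  awk -v limit="$1" '
    $1 == "start" {
      if (running[$2] + 0 < limit + 0) {
        running[$2]++            # a slot is free: run the query now
        print "admit", $2
      } else {
        waiting[$2]++            # account is at its limit: queue the query
        print "wait", $2
      }
    }
    $1 == "end" {
      if (waiting[$2] > 0) {
        waiting[$2]--            # hand the freed slot to a queued query
        print "admit-waiter", $2
      } else if (running[$2] > 0) {
        running[$2]--
      }
    }'
}

# Example: limit of 2 concurrent queries per account
printf 'start a\nstart a\nstart a\nstart b\nend a\n' | admission_control 2
```

With a limit of 2, the third query from account `a` waits until one of the first two finishes, while account `b` is unaffected.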
know MySQL
We use PMP
Poor Man's Profiler
State of the art tool for debugging stalls
Continue to invest in making it better
This is PMP
echo "set pagination 0" > /tmp/pmpgdb
echo "thread apply all bt" >> /tmp/pmpgdb
mpid=$( pidof mysqld )
t=$( date +'%y%m%d_%H%M%S' )
gdb --command /tmp/pmpgdb --batch -p $mpid | grep -v 'New Thread' > f.$t
cat f.$t | awk 'BEGIN { s = ""; }
  /Thread/ { print s; s = ""; }
  /^\#/ { x=index($2, "0x"); if (x == 1) { n=$4 } else { n=$2 }; if (s != "" ) { s = s "," n} else { s = n } }
  END { print s }' - | sort | uniq -c | sort -r -n -k 1,1 > h.$t
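The aggregation half of PMP can be tried without attaching gdb to a live server. The sketch below wraps the same awk logic in a function and feeds it a small hand-written backtrace. The names `pmp_aggregate` and `pmp_sample` and the sample stack frames are made up for illustration, and unlike the script above this version skips empty stacks.

```shell
# pmp_aggregate: read "thread apply all bt" output on stdin, collapse each
# thread's stack into a comma-separated list of function names, and count
# how many threads share each stack (most common first).
pmp_aggregate() {
  awk '
    /Thread/ { if (s != "") print s; s = "" }
    /^#/ {
      # Frames look like "#1  0xADDR in func ()" or "#0  func ()"
      if (index($2, "0x") == 1) { n = $4 } else { n = $2 }
      if (s != "") { s = s "," n } else { s = n }
    }
    END { if (s != "") print s }
  ' | sort | uniq -c | sort -r -n -k 1,1
}

# pmp_sample: a tiny, invented backtrace with three threads.
pmp_sample() {
  cat <<'EOF'
Thread 3 (Thread 0x7f03 (LWP 13)):
#0  0x00007f0000000003 in fsync () from /lib/libc.so.6
#1  0x0000000000000004 in os_file_flush () at os0file.c:1
Thread 2 (Thread 0x7f02 (LWP 12)):
#0  0x00007f0000000001 in pthread_cond_wait () from /lib/libpthread.so.0
#1  0x0000000000000002 in os_event_wait () at os0sync.c:1
Thread 1 (Thread 0x7f01 (LWP 11)):
#0  0x00007f0000000001 in pthread_cond_wait () from /lib/libpthread.so.0
#1  0x0000000000000002 in os_event_wait () at os0sync.c:1
EOF
}

pmp_sample | pmp_aggregate
```

Stacks shared by many threads rise to the top of the output, which is what makes a stall on a single mutex or file operation easy to spot.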
Non
Queries
Queries
Don't do them
Manageability: solutions
Online
Dogpiled
Get performance counters and the list of running queries
Generate HTML page with interesting results
Pylander
Kill duplicate queries
Limit the number of queries from specific accounts
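The slides do not say how Pylander picks its victims, so the following is only a guess at the shape of the "kill duplicate queries" step. It assumes a tab-separated list of running queries with the columns id, user and query text (a hypothetical export format), and the function name `duplicate_query_ids` is invented for this example.

```shell
# duplicate_query_ids: read "id<TAB>user<TAB>query" lines on stdin and print
# the ids of every query that repeats an earlier (user, query) pair.
duplicate_query_ids() {
  awk -F '\t' '{ key = $2 FS $3; if (seen[key]++) print $1 }'
}

# Example: ids 2 and 4 repeat the query that id 1 is already running.
printf '1\tapp\tSELECT 1\n2\tapp\tSELECT 1\n3\tweb\tSELECT 1\n4\tapp\tSELECT 1\n' |
  duplicate_query_ids
```

Each id printed could then be passed to MySQL's `KILL QUERY` statement; the first copy of each query is left running.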
Schema Change
Must
Add a column, add an index, change an index
ALTER TABLE can take hours on a large table
ALTER TABLE can block reads and writes to the table
2. Copy data to new table with desired schema
3. Replay changes on new table
4. Rename new table as the target table
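The listed steps can be written out as the SQL they roughly correspond to. This is a sketch under stated assumptions, not the actual online schema change tool: the function name `osc_plan` and the `_new` / `_old` table suffixes are hypothetical, step 1 does not appear in the text and is only guessed at in a comment, and the plain `INSERT ... SELECT *` copy only works when the change keeps the column list the same (for example adding an index).

```shell
# osc_plan TABLE CHANGE: print a SQL outline for applying CHANGE (an ALTER
# TABLE clause) to TABLE by building a copy and swapping it in.
osc_plan() {
  table="$1"
  change="$2"
  new="${table}_new"
  old="${table}_old"
  cat <<EOF
-- Step 1 is not given in the text; assumed here: start capturing changes to ${table}.
CREATE TABLE ${new} LIKE ${table};
ALTER TABLE ${new} ${change};
-- 2. Copy data to new table with desired schema
INSERT INTO ${new} SELECT * FROM ${table};
-- 3. Replay changes on new table (mechanism not described in the text)
-- 4. Rename new table as the target table
RENAME TABLE ${table} TO ${old}, ${new} TO ${table};
EOF
}

osc_plan orders 'ADD INDEX idx_c (c)'
```

The expensive ALTER runs against the empty copy, and the only step that touches the live table name is the final `RENAME TABLE`, which swaps both names in one statement.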
InnoDB compression work for OLTP
Faker: tool for prefetching for replication slaves
replacement: replace failed and unhealthy MySQL servers
resharding: sharding is easy, resharding is hard
Replication
Bottleneck might be disk reads
Work done by a single thread
Transactions on master are concurrent
Faker
Multiple threads replay transactions in fake-changes mode on slaves
Captures 70% of disk reads, work in progress to improve the rate
is one host slow?
is the database tier doing a lot more work today?
do I spend the next N dollars (memory, disk, flash)?
do I run a workload across old (slow) and new (fast) servers?
do I integrate cache and database tiers?
monitoring signals generate useful interrupts?
What
a server in production before writing a new one
Invest more in monitoring, debugging and tuning
(c) 2007 Facebook, Inc. or its licensors. "Facebook" is a registered trademark of Facebook, Inc.. All rights reserved. 1.0