You are on page 1of 15

MySQL ep ca on

By Stephane Combaudon,
Aurimas Mikalauskas and
Muhammad Irfan

MySQL's built-in replication is the foundation for building large, high-performance applications
on top of MySQL using the "scale-out" architecture. MySQL replication enables DBAs to
configure one or more server as replicas of another server - keeping the data synced with the
master copy.

This Percona eBook answers the questions:


* "How does MySQL Replication really work?"
* "How to identify and cure MySQL replication slave lag"
* "GTIDs in MySQL 5.6: New replication protocol; new ways to break replication"

Copyright © 2006-2014 Percona LLC


MySQL Replication

Table of Contents

hap er How does MySQL Replication really work? 3


hap er How to identify and cure MySQL replication slave lag
hap er GTIDs in MySQL 5.6: New replication protocol; new ways to break replication 1

About Percona
Percona was founded in August 2006 and now employs a global
network of experts with a staff of more than 120 people. Our customer
list is large and diverse, including Fortune 50 corporations, popular
websites, and small startups. We have over 2,000 customers and,
although we do not reveal all of their names, chances are we're
working with nearly every large MySQL user you've heard about. To
put Percona's MySQL expertise to work for you, please contact us.

Is this an emergency? Get immediate assistance


from Percona Support 24/7. Click here

Skype: oncall.percona
GTalk: oncall@percona.com
AIM (AOL Instant Messenger): oncallpercona
Telephone direct-to-engineer: +1-877-862-4316 or
UK Toll Free: +44-800-088-5561
Telephone to live operator: +1-888-488-8556
Customer portal: https://customers.percona.com/

Copyright © 2006-2014 Percona LLC


MySQL Replication
hap er How does MySQL Replication

How does MySQL Replication really work?

By Auri as ika auskas

While Percona experts have written in-depth on MySQL replication on our MySQL Performance
Blog, such as on replication being single-threaded, on semi-synchronous replication or on
estimating replication capacity, I don’t think we’ve ever covered the very basics of how MySQL
replication really works. That’s one of the main reasons for publishing this eBook on the subject.

Of course, there are many aspects of MySQL replication, but my main focus will be the logistics –
how replication events are written on the master, how they are transferred to the replication slave
and then how they are applied there. Note that this is NOT a HOWTO setup replication, but rather
a type of thing.

Replication events
I say replication events in this article because I want to avoid discussion about different replication
formats. These are covered pretty well . Put simply, the events can be one of two types:

Statement based – in which case these are write queries


Row based – in this case these are changes to records, sort of row diffs if you will

But other than that, I won’t be going back to differences in replication with different replication
formats, mostly because there’s very little that’s different when it comes to transporting the data
changes.

On the master
So now let me start with what is happening on the master. For replication to work, first of all master
needs to be writing replication events to a special log called binary log. This is usually very
lightweight activity (assuming events are not synchronized to disk), because writes are buffered
and because they are sequential. The binary log file stores data that replication slave will be
reading later.

Whenever a replication slave connects to a master, master creates a new thread for the connection
(similar to one that’s used for just about any other server client) and then it does whatever the
client – replication slave in this case – asks. Most of that is going to be (a) feeding replication slave
with events from the binary log and (b) notifying slave about newly written events to its binary log.

Slaves that are up to date will mostly be reading events that are still cached in OS cache on the
master, so there is not going to be any physical disk reads on the master in order to feed binary log
MySQL Replication
hap er How does MySQL Replication

events to slave(s). However, when you connect a replication slave that is few hours or even days
behind, it will initially start reading binary logs that were written hours or days ago – master may no
longer have these cached, so disk reads will occur. If master does not have free IO resources, you
may feel a bump at that point.

On the replica
Now let’s see what is happening on the slave. When you start replication, two threads are started
on the slave:

1. IO thread

This process called IO thread connects to a master, reads binary log events from the master as
they come in and just copies them over to a local log file called relay log. That’s all.

Even though there’s only one thread reading binary log from the master and one writing relay log
on the slave, very rarely copying of replication events is a slower element of the replication. There
could be a network delay, causing a steady delay of few hundred milliseconds, but that’s about it.

If you want to see where IO thread currently is, check the following in “show slave statusG”:

Master_Log_File – last file copied from the master (most of the time it would be the same
as last binary log written by a master)
Read_Master_Log_Pos – binary log from master is copied over to the relay log on the
slave up until this position.

And then you can compare it to the output of “show master statusG” from the master.

2. SQL thread

The second process – SQL thread – reads events from a relay log stored locally on the replication
slave (the file that was written by IO thread) and then applies them as fast as possible.

This thread is what people often blame for being single-threaded. Going back to “show slave
statusG”, you can get the current status of SQL thread from the following variables:

Relay_Master_Log_File – binary log from master, that SQL thread is “working on” (in
reality it is working on relay log, so it’s just a convenient way to display information)
Exec_Master_Log_Pos – which position from master binary log is being executed by SQL
thread.
MySQL Replication
hap er How does MySQL Replication

Replication lag
Now I want to briefly touch the subject of replication lag in this context. When you are dealing with
replication lag, first thing you want to know is which of the two replication threads is behind. Most of
the time it will be the SQL thread, still it makes sense to double check. You can do that by
comparing the replication status variables mentioned above to the master binary log status from
the output of “show master statusG” from the master.

If it happens to be IO thread, which, as I mentioned many times already, is very rare, one thing you
may want to try to get that fixed is .

Otherwise, if you are sure it is SQL thread, then you want to understand what is the reason and
that you can usually observe by vmstat. Monitor server activity over time and see if it is “r” or “b”
column that is “scoring” most of the time. If it is “r”, replication is CPU-bound, otherwise – IO. If it
is not conclusive, mpstat will give you better visibility by CPU thread.

Note this assumes that there is no other activity happening on the server. If there is some activity,
then you may also want to look at diskstats or even do a query review for SQL thread to get a good
picture.

If you find that replication is CPU bound, this maybe very helpful.

If it is IO bound, then fixing it may not be as easy (or rather, as cheap). Let me explain. If replication
is IO bound, most of the time that means that SQL thread is unable to read fast enough because
reads are single threaded. Yes, you got that right – it is reads that are limiting replication
performance, not writes. Let me explain this further.

Assume you have a RAID10 with a bunch of disks and write-back cache. Writes, even though they
are serialized, will be fast because they are buffered in the controller cache and because internally
RAID card can parallelize writes to disks. Hence replication slave with similar hardware can write
just as fast as master can.

Now Reads. When your workset does not fit in memory, then the data that is about to get modified
is going to have to be read from disk first and this is where it is limited by the single-threaded
nature of the replication, because one thread will only ever read from one disk at a time.

That being said, one solution to fix IO-bound replication is to increase the amount of memory so
working set fits in memory. Another – get IO device that can do much more IO operations per
second even with a single thread – fastest traditional disks can do up to 250 iops, SSDs – in the
order of 10,000 iops.

In Chapter Two, Percona support engineer Muhammad Irfan will talk about identifying and curing
MySQL replication slave lag.

5
MySQL Replication
a r How to identify and cure MySQL replication slave lag

How to identify and cure MySQL replication slave lag

By Muhammad Irfan

Here on the Percona MySQL Support team, we often see issues where a
customer is complaining about replication delays – and many times the problem ends up being tied
to MySQL replication slave lag. This of course is nothing new for MySQL users and we’ve had a
few posts here on the MySQL Performance Blog on this topic over the years (two particularly
popular post in the past were: “Reasons for MySQL Replication Lag” and “Managing Slave Lag
with MySQL Replication,” both by Percona CEO Peter Zaitsev).

In this chapter, however, I will share some new ways of identifying delays in replication – including
possible causes of lagging slaves – and how to cure this problem.

How to identify Replication Delay


MySQL replication works with two threads, IO_THREAD & SQL_THREAD. IO_THREAD connects
to a master, reads binary log events from the master as they come in and just copies them over to
a local log file called relay log. On the other hand, SQL_THREAD reads events from a relay log
stored locally on the replication slave (the file that was written by IO thread) and then applies them
as fast as possible. Whenever replication delays, it’s important to discover first whether it’s
delaying on slave IO_THREAD or slave SQL_THREAD.

Normally, I/O thread would not cause a huge replication delay as it is just reading the binary logs
from the master. However, It depends on the network connectivity, network latency… how fast is
that between the servers. The Slave I/O thread could be slow because of high bandwidth usage.
Usually, when the slave IO_THREAD is able to read binary logs quickly enough it copies and piles
up the relay logs on the slave – which is one indication that the slave IO_THREAD is not the culprit
of slave lag.

On the other hand, when the slave SQL_THREAD is the source of replication delays it is probably
because of queries coming from the replication stream are taking too long to execute on the slave.
This is sometimes because of different hardware between master/slave, different schema indexes,
workload. Moreover, the slave OLTP workload sometimes causes replication delays because of
locking. For instance, if a long-running read against a MyISAM table blocks the SQL thread, or any
transaction against an InnoDB table creates an IX lock and blocks DDL in the SQL thread. Also,
take into account that slave is single threaded prior to MySQL 5.6, which would be another reason
for delays on the slave SQL_THREAD.

Let me show you via master status/slave status example to identify either slave is lagging on slave
IO_THREAD or slave SQL_THREAD on t e following page.

6
MySQL Replication
a r How to identify and cure MySQL replication slave lag
MySQL Replication
a r How to identify and cure MySQL replication slave lag

This clearly suggests that the slave IO_THREAD is lagging and obviously because of that the slave
SQL_THREAD is lagging, too, and it yields replication delays. As you can see the Master log file is
mysql-bin.018196 (File parameter from master status) and slave IO_THREAD is
on mysql-bin.018192 (Master_Log_File from slave status) which indicates slave IO_THREAD is
reading from that file, while on master it’s writing on mysql-bin.018196, so the slave IO_THREAD
is behind by 4 binlogs. Meanwhile, the slave SQL_THREAD is reading from same file i.e.

mysql-bin.018192 (Relay_Master_Log_File from slave status) This indicates that the slave
SQL_THREAD is applying events fast enough, but it’s lagging too, which can be observed from
the difference between Read_Master_Log_Pos & Exec_Master_Log_Pos from show slave
status output.

You can calculate slave SQL_THREAD lag from Read_Master_Log_Pos –


Exec_Master_Log_Pos in general as long as Master_Log_File parameter output from show
slave status and Relay_Master_Log_File parameter from show slave status output are the same.
This will give you rough idea how fast slave SQL_THREAD is applying events. As I mentioned
above, the slave IO_THREAD is lagging as in this example then off course slave SQL_THREAD is
behind too. You can read detailed description of show slave status output fields

Also, the Seconds_Behind_Master parameter shows a huge delay in seconds. However, this can
be misleading, because it only measures the difference between the timestamps of the relay log
most recently executed, versus the relay log entry most recently downloaded by the IO_THREAD.
If there are more binlogs on the master, the slave doesn’t figure them into the calculation of
Seconds_behind_master. You can get a more accurate measure of slave lag using pt-heartbeat
from Percona Toolkit. So, we learned how to check replication delays – either it’s slave
IO_THREAD or slave SQL_THREAD. Now, let me provide some tips and suggestions for what
exactly causing this delay.

Tips and Suggestions What Causing Replication Delay & Possible Fixes
Usually, the slave IO_THREAD is behind because of slow network between master/slave. Most of
the time, enabling helps to mitigate slave IO_THREAD lag. One other suggestion is to disable
binary logging on slave as it’s IO intensive too unless you required it for point in time recovery.

To minimize slave SQL_THREAD lag, focus on query optimization. My recommendation is to


enable the configuration option so that the queries executed by slave that take more than will be
logged to the slow log. To gather more information about query performance, I would also
recommend setting the configuration option log_slow_verbosity to “full”.

This way we can see if there are queries executed by slave SQL_thread that are taking long time to
complete. You can follow my previous post about how to enable slow query log for specific time
period with mentioned options here. And as a reminder, log_slow_slave_statements as variable
were first introduced in Percona Server 5.1 which is now part of vanilla MySQL from version 5.6.11
In upstream version of MySQL Server log_slow_slave_statements were introduced as command
line option. Details can be found here while log_slow_verbosity is Percona Server specific feature.
MySQL Replication
a r How to identify and cure MySQL replication slave lag

One ot er reason for delay on t e slave SQL_THREAD if you use row based binlog format is that if
any database table is missing a primary key or unique key then it will scan all rows of the table for
DML on slave and causes replication delays. Make sure all your tables have a primary key
or unique key. Check this bug report for details. You can use t e query below on t e slave to
identify which database tables are issing pri ary or uni ue keys.

One improvement is made for this case in MySQL 5.6, where in memory hash is used comes to
the rescue.

Note that Seconds_Behind_Master is not updated while we read huge RBR event, So, “lagging”
may be related to just that – we had not completed reading of the event. For example, in row based
replication huge transactions may cause delay on slave side e.g. if you have 10 million rows table
and you do “DELETE FROM table WHERE id 5M rows will be sent to slave, each row separately
which will be painfully slow. So, if you have to delete oldest rows time to time from huge table using
Partitioning might be good alternative for this for some kind of workloads where instead using
DELETE use DROP old partition may be good and only statement is replicated because it will be
DDL operation.

To explain it better, let suppose you have partition1 holding rows of ID’s from 1 to 1000000,
partition2 – ID’s from 1000001 to 2000000 and so on so instead of deleting via statement
“DELETE FROM table WHERE ID /dev/null mysqld_alive=$? if [[ $mysqld_alive == 0 ]] then
seconds_behind_master=$(mysql $EXT_ARGV -e "show slave status" --vertical | grep
Seconds_Behind_Master | awk '{print $2}') echo $seconds_behind_master else echo 1 fi } #
Uncomment below to test that trg_plugin function works as expected #trg_plugin ------- -- That's
the pt-plug.sh file you would need to create and then use it as below with pt-stalk: $
/usr/bin/pt-stalk --function=/root/pt-plug.sh --variable=seconds_behind_master --threshold=300
--cycles=60 --notify-by-email=muhammad@example.com --log=/root/pt-stalk.log
--pid=/root/pt-stalk.pid --daemonize

You can adjust the threshold, currently its 300 seconds, combining that with –cycles, it means that
if seconds_behind_master value is >= 300 for 60 seconds or more then pt-stalk will start capturing
data. Adding –notify-by-email option will notify via email when pt-stalk captures data. You can
adjust the pt-stalk thresholds accordingly so that’s how it triggers to collect diagnostic data during
problem.
MySQL Replication
a r How to identify and cure MySQL replication slave lag

Conclusion
A lagging slave is a tricky problem but a common issue in MySQL replication. I’ve tried to cover
most aspects of replication delays in this post. Please share in the comments section if you know of
any other reasons for replication delay.
MySQL Replication
a r GTIDs in MySQL 5.6: New replication protocol; new ways to break replication

GTIDs in MySQL 5.6: New replication protocol; new ways to break replication
By han m aud n

One of the MySQL 5.6 features many people are interested in is Global Transactions IDs (. This is
for a good reason: Reconnecting a slave to a new master has always been a challenge while it is
so trivial when GTIDs are enabled. However, using GTIDs is not only about replacing good old
binlog file/position with unique identifiers, it is also using a new replication protocol. And if you are
not aware of it, it can bite.

Replication protocols: old vs new


The old protocol is pretty straightforward: the slave connects to a given binary log file at a specific
offset, and the master sends all the transactions from there.

The new protocol is slightly different: the slave first sends the range of GTIDs it has executed, and
then the master sends every missing transaction. It also guarantees that a transaction with a given
GTID can only be executed once on a specific slave.

In practice, does it change anything? Well, it may change a lot of things. Imagine the following
situation: you want to start replicating from trx 4, but trx 2 is missing on the slave for some reason.

With the old replication protocol, trx 2 will never be executed while with the new replication
protocol, it WILL be executed automatically.

Here are 2 common situations where you can see the new replication protocol in action.

Skipping transactions

1
MySQL Replication
a r GTIDs in MySQL 5.6: New replication protocol; new ways to break replication

Skipping transactions
It is well known that the good old SET GLOBAL sql_slave_skip_counter = N is no longer supported
when you want to skip a transaction and GTIDs are enabled. Instead, to skip the transaction with
GTID XXX:N, you have to inject an empty transaction:

Why can’t we use sql_slave_skip_counter? Because of the new replication protocol!

Imagine that we have 3 servers like the picture below:

Let’s assume that sql_slave_skip_counter is allowed and has been used on S2 to skip trx 2. What
happens if you make S2 a slave of S1?

1
MySQL Replication
a r GTIDs in MySQL 5.6: New replication protocol; new ways to break replication

Both servers will exchange the range of executed GTIDs, and S1 will realize that it has to send trx to
S . Two options t en:

If trx 2 is still in the binary logs of S1, it will be sent to S2, and the transaction is no longer
skipped.
If trx 2 no longer exists in the binary logs of S1, you will get a replication error.

This is clearly not safe, that’s why sql_slave_skip_counter is not allowed with GTIDs. The only
safe option to skip a transaction is to execute a fake transaction instead of the real one.

Errant transactions
If you execute a transaction locally on a slave (called errant transaction in the MySQL
documentation), what will happen if you promote this slave to be the new master?

With the old replication protocol, basically nothing (to be accurate, data will be inconsistent
between the new master and its slaves, but that can probably be fixed later).

With the new protocol, the errant transaction will be identified as missing everywhere and will be
automatically executed on failover, which has the potential to break replication.

Let’s say you have a master (M), and 2 slaves (S1 and S2). Here are 2 simple scenarios where
reconnecting slaves to the new master will fail (with different replication errors):

# Scenario 1

1
MySQL Replication
a r GTIDs in MySQL 5.6: New replication protocol; new ways to break replication

# Scenario 2

As you can understand, errant transactions should be avoided with GTID-based replication. If you
need to run a local transaction, your best option is to disable binary logging for that specific
statement:

Conclusion
GTIDs are a great step forward in the way we are able to reconnect replicas to other servers. But
they also come with new operational challenges. If you plan to use GTIDs, make sure you correctly
understand the new replication protocol, otherwise you may end up breaking replication in new and
unexpected ways.

I’ll do more exploration about errant transactions in a future post on ercona s MySQL
Performance Blog. I invite you to join me there for a conversation on the topic.

1
MySQL Replication
a r GTIDs in MySQL 5.6: New replication protocol; new ways to break replication

About the authors

During his career, both as a system administrator and a performance engineer,


Percona architect Aurimas Mikalauskas has become familiar with many different
technologies and how to apply them. He always knows the right tools for the task. In
addition to MySQL performance optimization, his key areas of expertise include:
MySQL High Availability, full text search, web and content cache servers, and MySQL
data recovery.

Muhammad Irfan is vastly experienced in the LAMP Stack. He joined Percona in


December 2012 as Support Engineer. Prior to joining Percona, he worked in the role
of MySQL DBA & LAMP Administrator, maintained high traffic websites, and worked
as a Consultant. His professional interests focus on MySQL scalability and on
performance optimization. In his spare time, he normally spends time with family and
friends and loves to play and watch cricket.

Percona architect Stéphane Combaudon joined Percona in July 2012 after working as
a MySQL DBA for leading French companies. In real life, he lives in Paris with his wife
and their twin daughters. When not in front of a computer or not spending time with his
family, he likes playing chess and hiking.

Visit Percona's ever-growing MySQL eBook library for more great content at
http://www.percona.com/resources/mysql-ebooks

Copyright © 2006-2014 Percona LLC


Powered by TCPDF (www.tcpdf.org)

You might also like