Bob Burgess
radian6 Technologies
Current Solution
[Diagram: month table, month table, ...]
Data over NFS can be unreliable
Separate copy for each DB that could
use it (Master / Replicas)
Goals
Single place for all content
Redundancy without complete
duplication
Add storage by adding nodes
Survive a server failure
Existing Products
HiveDB
◦ Java/Hibernate-based
◦ No redundancy
Spock Proxy
◦ No redundancy
MySQL Cluster
◦ All indexes in RAM
Cluster
[Diagram: the Master DB and the other DBs each hold a federated content table that reaches the cluster through a Load Balancer]
Constants
MYSQLD_PACKET_OK
MYSQLD_PACKET_ERR
MYSQL_TYPE_LONG
PROXY_SEND_RESULT
MySQL Proxy
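[Diagram: MySQL client to port 3306 on the MySQL server]
[Diagram: MySQL client to port 4040 on MySQL Proxy (Lua, LuaSQL), then to port 3306 on the MySQL server]
[Diagram: MySQL Proxy sits between MySQL client and MySQL server; its Lua script provides read_query( ) and read_query_result( )]
[Diagram: MySQL Proxy sits between MySQL client and MySQL server; its Lua script makes LuaSQL calls]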
LuaSQL
Connect directly to databases from
Lua scripts
environment
Result set structure:
◦ fields (table): one entry per column, each with type and name
◦ rows (table): one entry per row, each a list of values
-- Answer the client directly with a one-row result set
proxy.response.type = proxy.MYSQLD_PACKET_OK
proxy.response.resultset = {
  -- one {type, name} entry per column
  fields = {
    { type = proxy.MYSQL_TYPE_LONGLONG,   name = "blogPostID" },
    { type = proxy.MYSQL_TYPE_LONG,       name = "partitionKey" },
    { type = proxy.MYSQL_TYPE_VAR_STRING, name = "rawContent" } },
  -- one list of values per row, in column order
  rows = { { tonumber(itemId),
             tonumber(partKey),
             contentValue } }
}
return proxy.PROXY_SEND_RESULT
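The response above can be wrapped in a small helper. This is a minimal sketch, not the talk's actual code: send_item is a hypothetical name, and the proxy table here is a stand-in with placeholder constant values so the sketch runs under plain Lua outside MySQL Proxy.

```lua
-- Stand-in for the `proxy` global that MySQL Proxy normally provides.
-- The numeric values are placeholders; delete this stub under MySQL Proxy.
proxy = {
  MYSQLD_PACKET_OK      = 0,
  MYSQL_TYPE_LONG       = 3,
  MYSQL_TYPE_LONGLONG   = 8,
  MYSQL_TYPE_VAR_STRING = 253,
  PROXY_SEND_RESULT     = 2,
  response              = {},
}

-- Hypothetical helper: fill proxy.response with a one-row result set
-- and tell the proxy to send it to the client.
function send_item(itemId, partKey, contentValue)
  proxy.response = {
    type = proxy.MYSQLD_PACKET_OK,
    resultset = {
      fields = {
        { type = proxy.MYSQL_TYPE_LONGLONG,   name = "blogPostID" },
        { type = proxy.MYSQL_TYPE_LONG,       name = "partitionKey" },
        { type = proxy.MYSQL_TYPE_VAR_STRING, name = "rawContent" },
      },
      rows = { { tonumber(itemId), tonumber(partKey), contentValue } },
    },
  }
  return proxy.PROXY_SEND_RESULT
end
```

Inside read_query( ), a call such as return send_item(itemId, partKey, contentValue) would then answer the client without forwarding the query to a server.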
Cluster Operation Overview
Cluster: Librarian
Talks to the client
Accepts items to store
Retrieves items / gives them to client
The single Lua script that runs under
Proxy
Cluster: Librarian
Directory
◦ itemId
◦ nodeId
◦ partitionKey
Event table
◦ serial no.
◦ event type
◦ event data
Cluster: Librarian
Content_partitionKey
◦ itemId
◦ item: compressed into a LONGBLOB
Node table
◦ nodeId
◦ connection / authentication info
◦ system status (disk)
◦ capacity factor
Cluster: Librarian
Insert flow (flowchart):
◦ insert? If no, leave this flow
◦ Bad syntax: syntax error
◦ Item already exists: exists error
◦ Error "table doesn't exist": create table, then store on this node
◦ Otherwise: store on this node
◦ Update ALL directories
◦ Store Event
◦ Return OK to client
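The insert flow above can be sketched in plain Lua with in-memory tables standing in for the real databases. Everything here is an assumption made for illustration: the function name handle_insert, the table layouts, and the use of a single directory where the cluster updates the directory on every node.

```lua
-- In-memory stand-ins for cluster state (hypothetical layouts).
local directory = {}  -- itemId -> { nodeId = ..., partitionKey = ... }
local tables    = {}  -- partitionKey -> { itemId -> content } on this node
local events    = {}  -- append-only event log

function handle_insert(nodeId, itemId, partKey, content)
  -- "bad syntax" branch: a required value is missing
  if itemId == nil or partKey == nil or content == nil then
    return "syntax error"
  end
  -- "exists" branch: the directory already knows this item
  if directory[itemId] ~= nil then
    return "exists error"
  end
  -- "table doesn't exist" branch: create it, then store as usual
  if tables[partKey] == nil then
    tables[partKey] = {}
  end
  tables[partKey][itemId] = content
  -- the real cluster updates the directory on ALL nodes; one copy here
  directory[itemId] = { nodeId = nodeId, partitionKey = partKey }
  -- record the event with the next serial number
  local serial = #events + 1
  events[serial] = { serial = serial, type = "insert", data = itemId }
  return "OK"
end
```

Calling handle_insert twice with the same itemId returns "OK" the first time and "exists error" the second, which mirrors the duplicate-entry error returned to the client.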
Cluster: Librarian
Returning an error
proxy.response = {
  type     = proxy.MYSQLD_PACKET_ERR,
  errmsg   = "Duplicate entry '"..itemId.."' for key 1",
  errcode  = 1062,
  sqlstate = "23000" }
return proxy.PROXY_SEND_RESULT
Content Tables
◦MyISAM (concurrent_insert=2)
◦One table per partition key
◦For us: 300 MB / hour
Cluster: Librarian
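Select flow (flowchart):
◦ select? If no, leave this flow
◦ Bad syntax: syntax error
◦ Query contains 1=0: return empty result set to client
◦ Info query: calculate max(ID) and count(*), return result set to client
◦ Table doesn't exist: return empty result set to client
◦ dir says local: return result set to client
◦ Otherwise remote read: get item from remote db, return result set to client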
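The select flow above can be sketched as two small decisions: what kind of query arrived, and where the item lives. This is a sketch under stated assumptions: classify_select and route_read are hypothetical names, info queries are detected by the _content_info table name, and an item missing from the directory is treated as an empty result.

```lua
-- Decide what kind of SELECT arrived. The string tests are assumptions,
-- not the talk's actual parsing rules.
function classify_select(query)
  local q = query:lower()
  if not q:match("^%s*select%s") then
    return "not a select"
  end
  if q:match("1%s*=%s*0") then
    return "empty result set"   -- structure probe: no rows wanted
  end
  if q:find("_content_info", 1, true) then
    return "info"               -- caller wants max(ID) and count(*)
  end
  return "item lookup"
end

-- Use the directory (itemId -> { nodeId = ... }) to pick local or remote.
function route_read(directory, localNodeId, itemId)
  local entry = directory[itemId]
  if entry == nil then
    return "empty result set"
  end
  if entry.nodeId == localNodeId then
    return "local read"
  end
  return "remote read", entry.nodeId
end
```

For a remote read, the second return value is the node to contact, which the script could then reach through a LuaSQL connection.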
Client
create table _content_info (
_max_id bigint unsigned,
_count bigint unsigned)
engine=federated
connection=(...schema/_content_info...);
Target
create view _content_info as
select max(id) _max_id, count(*) _count
from _content;
Cluster: Librarian
show table status like '_content';
Auto_increment 0
next step
Cluster: Rebalancer
[Diagram: Rebalancer loop: policy enforce for other items, next step, done]
Member Add
Add new node to Node table
Get a copy of the directory from all
nodes (a piece from each)
Member Remove
Set free-space margin to 0
Force "free disk space" for this node
to 0
Wait for the Rebalancer to copy everything
off
Restore the free-space margin and
update the Node list
Member Fail
Update node list
Remove all directory entries for this
node
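The two Member Fail steps can be sketched over in-memory tables. The name remove_failed_node and the data layout are assumptions: the directory is modeled as a list of rows so that one item can have copies on several nodes, and nodes is keyed by nodeId.

```lua
-- Drop a failed node from the node list and return the directory rows
-- that survive (every row not pointing at the failed node).
function remove_failed_node(nodes, directory, failedNodeId)
  nodes[failedNodeId] = nil            -- update node list
  local kept = {}
  for _, entry in ipairs(directory) do
    if entry.nodeId ~= failedNodeId then
      kept[#kept + 1] = entry          -- entry lives on a healthy node
    end
  end
  return kept
end
```

Items whose only copy was on the failed node disappear from the returned directory; items with a copy elsewhere remain readable.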
Information Age-Out
Drop tables for obsolete partition keys
Remove directory entries for those
partitions
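Age-out can be sketched as building the DROP TABLE statements and pruning the directory in one pass each. Assumptions in this sketch: the name age_out, numeric partition keys that sort by age, and content tables named Content_ followed by the partition key.

```lua
-- Return the DROP statements for obsolete partitions plus the directory
-- rows that remain. Keys below oldestKept are treated as obsolete.
function age_out(directory, partitionKeys, oldestKept)
  local statements, kept = {}, {}
  for _, key in ipairs(partitionKeys) do
    if key < oldestKept then
      statements[#statements + 1] =
        "DROP TABLE IF EXISTS Content_" .. tostring(key)
    end
  end
  for _, entry in ipairs(directory) do
    if entry.partitionKey >= oldestKept then
      kept[#kept + 1] = entry
    end
  end
  return statements, kept
end
```

The statements would be run on each node, and the pruned directory would replace the old one.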
Table Optimizer
information_schema.tables
compare data_free to data_length
"Lock" that partition on that node
(row in PartitionLock table)
Run optimize table but abandon if
any node fails
Unlock table
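The optimizer steps can be sketched as a ratio test followed by a lock, run, unlock sequence. This is an illustration only: needs_optimize, optimize_partition, and the 0.2 threshold are assumptions, the locks table stands in for rows in PartitionLock, and run_optimize is a caller-supplied function that raises an error when a node fails.

```lua
-- Compare data_free to data_length (both from information_schema.tables).
function needs_optimize(dataFree, dataLength, threshold)
  threshold = threshold or 0.2     -- assumed cut-off, not from the talk
  if dataLength == 0 then
    return false
  end
  return dataFree / dataLength > threshold
end

-- Lock one partition on one node, run the optimize, and always unlock.
function optimize_partition(locks, nodeId, partKey, run_optimize)
  local key = nodeId .. ":" .. partKey
  if locks[key] then
    return false, "already locked"
  end
  locks[key] = true
  -- pcall catches the error raised when the optimize is abandoned
  local ok, err = pcall(run_optimize, nodeId, partKey)
  locks[key] = nil
  return ok, err
end
```

Releasing the lock after pcall means an abandoned optimize never leaves the partition locked.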
Backup
"Write Lock" one partition across all
nodes
Copy that partition table for all nodes
Copy directory entries for that node
Unlock
Development Directions
Bulletproof error handling
Performance tuning
Alternative Architectures
Load Balancer choices
Move SQL parsing to a complex
proxy-based load balancer,
communicate with nodes on network
sockets
Librarian in Perl, Java, C
Alternative databases
Resources
MySQL Documentation
http://forge.mysql.com/wiki/MySQL_Proxy
http://forge.mysql.com/tools/search.php?t=tag&k=mysqlproxy
http://www.lua.org
http://www.keplerproject.org/luasql
http://lua-users.org
http://jan.kneschke.de/projects/mysql/mysql-proxy
http://www.linuxvirtualserver.org
Thank you!
Bob Burgess
bob.burgess@radian6.com
www.radian6.com/mysql