George Palmer
george.palmer@gmail.com
3dogsbark.com
Overview
• One server
• Two servers
• Scaling the database
• Scaling the web server
• User clusters
• Final architecture
• Caching
• Cached architecture
• Links
• Questions
George Palmer
17th February 2007
How you start out
[Diagram: shared hosting, with Web Server and DB]
• Shared Hosting
• One web server and DB on same machine
• Application designed for one machine
• Volume of traffic will depend on host
Two servers
[Diagram: Web Server, DB, Slave]
Scaling the database (2)
[Diagram: Web Server, two Master DBs, MySQL Cluster]
Scaling the web server
[Diagram: Web Server with four worker threads, DB, Farm]
Load balancing
[Diagram: three App Servers, Slave]
User Clusters
• For each user registered on the service,
add an entry to a master database detailing
where their user data is stored
– UserID
– DB Cluster
– Basic authorisation details such as username,
password, any NLS settings
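The lookup described above can be sketched as follows. This is a minimal illustration only: the master database is stood in for by an in-memory dictionary, and the names `DirectoryEntry`, `MASTER_DIRECTORY`, `CLUSTER_DSNS`, `locate_user` and the connection strings are all hypothetical, not part of the original talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectoryEntry:
    """One row of the (hypothetical) master database."""
    user_id: int
    db_cluster: str       # which user cluster holds this user's data
    username: str
    password_hash: str    # basic authorisation details
    locale: str           # NLS setting


# Stand-in for the master database: username -> directory entry.
MASTER_DIRECTORY = {
    "bob": DirectoryEntry(1, "cluster1", "bob", "<hash>", "en_GB"),
    "alice": DirectoryEntry(2, "cluster2", "alice", "<hash>", "fr_FR"),
}

# Assumed mapping from cluster name to a connection string.
CLUSTER_DSNS = {
    "cluster1": "mysql://cluster1-master.example/app",
    "cluster2": "mysql://cluster2-master.example/app",
}


def locate_user(username):
    """Return (user_id, cluster connection string) for a username."""
    entry = MASTER_DIRECTORY.get(username.lower())
    if entry is None:
        raise KeyError(f"unknown user: {username}")
    return entry.user_id, CLUSTER_DSNS[entry.db_cluster]


user_id, dsn = locate_user("Bob")
print(user_id, dsn)
```

Every request first consults the master database, then talks only to the cluster that holds that user's data.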
User Clusters (2)
SELECT * FROM users WHERE username='Bob' AND …
[Diagram: the query is sent to the Master DB]
User Clusters (3)
• ID management becomes an issue
– Best to use the master DB id as the user_id in the user cluster
– If you let the cluster allocate ids, make sure to use an offset and
increment (not auto_increment)
• Other DBs, such as session, must reference a
user by id and DB cluster
• Serious code changes may be required
• You will want the ability to move users
between clusters
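The offset-and-increment idea can be sketched as below, assuming each cluster is given a distinct offset and all clusters share the same increment (the number of clusters). The class name `ClusterIdAllocator` is hypothetical and the counter lives in memory purely for illustration. MySQL's `auto_increment_offset` and `auto_increment_increment` server variables express the same scheme.

```python
class ClusterIdAllocator:
    """Hands out ids offset, offset + increment, offset + 2 * increment, ..."""

    def __init__(self, offset, increment):
        if not 1 <= offset <= increment:
            raise ValueError("offset must be between 1 and increment")
        self._next = offset
        self._increment = increment

    def next_id(self):
        allocated = self._next
        self._next += self._increment
        return allocated


# Two clusters: same increment, different offsets, so ids never collide.
cluster1 = ClusterIdAllocator(offset=1, increment=2)
cluster2 = ClusterIdAllocator(offset=2, increment=2)

print([cluster1.next_id() for _ in range(3)])  # [1, 3, 5]
print([cluster2.next_id() for _ in range(3)])  # [2, 4, 6]
```

Because the id ranges are disjoint, a user can later be moved between clusters without renumbering.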
The final architecture
• As the number of app servers grows it’s a good idea
to add a database connection manager (eg
SQLRelay)
• Extract the session, search and translation
databases onto their own machines
• Use MySQL Cluster (or equivalent) for any
critical database
– In a replication setup you can make a slave a backup
master
• Add an NFS/SAN for static files
The final architecture (2)
[Diagram: Load balancer; App Server 1, App Server 2, … App Server 50; DB Connection Manager; NFS/SAN; two Master DBs; Session DB; Search DB; NLS DB; User Cluster 1 Master; User Cluster 2 Master]
Issues
• The load balancer and database connection manager are
single points of failure
– Easily solved
• 2PC (two-phase commit) is needed for some operations, for example when a user
wants to be removed from the search database
– 2PC is not supported in Rails
• Rails doesn’t support database switching for a given
model
– Can be done explicitly on each request, but this is expensive due to
connection establishment overhead
– Can be worked around with a connection manager, but a proper solution
is required (I may write a gem to do this)
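To make the 2PC point concrete, here is a minimal in-memory sketch of a two-phase commit coordinator. It is not Rails or MySQL code: `Participant`, `two_phase_commit` and the database names are hypothetical, and real participants would also need durable logs and timeout handling.

```python
class Participant:
    """A database taking part in the transaction (simulated in memory)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote on whether the change can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(participants):
    """Commit on all participants, or on none of them."""
    prepared = []
    for participant in participants:
        if participant.prepare():
            prepared.append(participant)
        else:
            # One "no" vote undoes everyone who had already prepared.
            for earlier in prepared:
                earlier.rollback()
            return False
    # Phase 2: every participant voted yes, so commit everywhere.
    for participant in participants:
        participant.commit()
    return True


# Removing a user touches both the user cluster and the search database.
user_cluster = Participant("user_cluster_1")
search_db = Participant("search_db", can_commit=False)
print(two_phase_commit([user_cluster, search_db]))  # False
print(user_cluster.state, search_db.state)          # rolled_back aborted
```

Without this coordination, a failure in the second database would leave the user deleted in one place and still present in the other.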
Making the most of your assets
• In many web applications a large percentage of
the hits are read-only, hence the need for
caching:
– Squid
• A reverse proxy (or web server accelerator)
– Memcached
• Distributed memory caching solution
Squid
[Diagram: Squid in front of App Server 1, App Server 2, … and the NFS/SAN, with paths labelled "In cache" and "Not in cache"]
Memcached
[Diagram: Memcached, with a path labelled "Not in memcached"]
• The location of data is independent of the physical machine
• A really nice simple API
– SET
– GET
– DELETE
• In Rails only a few lines of code will make a model cached
• Also useful for tracking cross-machine information, eg dodgy user behaviour
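The SET/GET/DELETE API above is enough to cache a model. A minimal sketch follows, with an in-memory stand-in rather than a real memcached client; `FakeMemcached`, `UserDatabase`, `get_user` and `update_username` are hypothetical names, and the key format is an assumption.

```python
class FakeMemcached:
    """In-memory stand-in exposing only set, get and delete."""

    def __init__(self):
        self._store = {}

    def set(self, key, value):
        self._store[key] = value

    def get(self, key):
        return self._store.get(key)   # None on a cache miss

    def delete(self, key):
        self._store.pop(key, None)


class UserDatabase:
    """Stand-in for the real database; counts reads to show cache effect."""

    def __init__(self):
        self.rows = {1: {"id": 1, "username": "bob"}}
        self.reads = 0

    def load(self, user_id):
        self.reads += 1
        return dict(self.rows[user_id])


def get_user(cache, db, user_id):
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is None:               # miss: read from the DB, then cache it
        user = db.load(user_id)
        cache.set(key, user)
    return user


def update_username(cache, db, user_id, username):
    db.rows[user_id]["username"] = username
    cache.delete(f"user:{user_id}")   # invalidate so the next read is fresh


cache, db = FakeMemcached(), UserDatabase()
get_user(cache, db, 1)
get_user(cache, db, 1)
print(db.reads)  # 1: the second read was served from the cache
```

Deleting the key on write keeps the cache from serving stale data; the next read repopulates it.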
Cached Architecture
• Introduce Squid
– Acts as load balancer (note there are higher
performing load balancers)
• Introduce memcached
– Can go on every machine that has spare
memory
• Best suited to application servers which have high
CPU usage but low memory requirements
Cached architecture
[Diagram: Squid; App Server 1, App Server 2, … App Server 50, each with MC; DB Connection Manager; NFS/SAN; two Master DBs; Session DB; Search DB; NLS DB; User Cluster 1 Master; User Cluster 2 Master]
MC = memcached
Cached architecture
• Wikipedia quotes a cache hit rate of 78%
for Squid and 7% for memcached
– So only 15% of hits actually get to the DB!
• Performance is a whole new ball game but
we recently gained 15-20% by optimising
our Rails configuration
– But don’t get carried away: at some point the cost of the
time you spend exceeds the money saved
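The 15% figure is simple subtraction, assuming (as the slide does) that both hit rates are percentages of all hits. The function names below are hypothetical.

```python
def db_share_percent(squid_hit_percent, memcached_hit_percent):
    """Percentage of hits that fall through both caches to the database."""
    return 100 - squid_hit_percent - memcached_hit_percent


def db_requests(total_hits, squid_hit_percent, memcached_hit_percent):
    """Number of hits out of total_hits that reach the database."""
    share = db_share_percent(squid_hit_percent, memcached_hit_percent)
    return total_hits * share // 100


print(db_share_percent(78, 7))   # 15
print(db_requests(1000, 78, 7))  # 150
```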
Cached architecture – 1 machine
[Diagram, labelled "Physical Machine": Squid; Memcached; App Server 1, App Server 2, … App Server 5; DB Connection Manager; NFS/SAN; two Master DBs; Session DB; Search DB; NLS DB; User Cluster 1 Master]
How far can it go?
• For a truly global application with millions
of users, in order of ease:
– Have a cache on each continent
– Make user clusters based on user location
• Distribute the clusters physically around the world
– Introduce app servers on each continent
– If you must replicate your site globally then
use transaction replication software, eg
GoldenGate
Useful Links
• http://www.squid-cache.org/
• http://www.danga.com/memcached/
• http://sqlrelay.sourceforge.net/
• http://railsexpress.de/blog/
Questions?