Rob Sharp Lead Developer The Sound Alliance

About Memcached
• Conceived by Brad Fitzpatrick as a solution to the scaling issues faced by LiveJournal
• “memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load”

Who we are
• Sydney-based online media
• Community and Content Sites
• Thoughtbythem - Marketing

Who we are
• Australia’s busiest music website
• ~ 250,000 pages per day
• Plus two other busy sites!
• Maximum performance for the hardware we have

Current Architecture
• 3 Linux servers
• Apache
• Lighttpd
• Memcache
• 1 MySQL Master

Why do we use memcached?
• CMS written in OO style, using ActiveRecord
• PHP4 and objects don't mix too well
• ActiveRecord gave us fast development but reduced performance
• Call us greedy, but we want both
• Use Rails Memcache!

Our Application
• CMS written from the ground up
• Effectively three sites running on one codebase
• Uses three separate databases, but aiming to consolidate more
• Has data namespacing implemented in most places
• But separation is not quite there yet!

Our Memcache Setup
• We have 3 webservers running memcache
• Each server runs three daemons on separate ports - one for each site (more on this later!)
• Each daemon knows about the other 2 daemons and connects to them over TCP
• This allows us to store data once, and access it from any server, whether in the pool or not

Memcache Pool
• Hashing algorithm means that a given key maps to a single server
• Efficient use of memory
• Efficient for cache clearing
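As a rough illustration of how a key can map to exactly one server, here is a minimal sketch using a CRC32 hash and a modulo. The server addresses and the choice of CRC32 are assumptions made for the example; the hashing used by a real client library may differ.

```ruby
require 'zlib'

# Hypothetical pool: one daemon per webserver (addresses are made up).
SERVERS = ['10.0.0.1:11211', '10.0.0.2:11211', '10.0.0.3:11211'].freeze

# Pick a server by hashing the key and taking the remainder.
# CRC32 is an assumption here; real clients may use another hash.
def server_for(key, servers = SERVERS)
  servers[Zlib.crc32(key) % servers.size]
end

keys = %w[article:1 article:2 artist:7 genre:3 gallery:9]
keys.each { |key| puts "#{key} -> #{server_for(key)}" }
```

Because the result depends on the number of servers, the same key always lands on the same daemon only while the pool stays the same size.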

Memcache Pool
• But what if we lose a server? We can:
• Ignore it - we simply get misses for any keys we attempt to retrieve
• Remove it - our hashing algorithm breaks... :(
• We can also add new servers to the pool after data has been stored, but the same hashing problem applies

Memcache Pool
• Consistent hashing will solve the problem of removing or adding servers once data has been hashed
• Currently in its infancy - not really production ready
• We simply monitor our daemons and restart if required
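To make the idea of consistent hashing concrete, here is a minimal sketch of a hash ring. The class name, the server names and the number of virtual points are all assumptions for illustration; this is not the API of any particular client library.

```ruby
require 'digest/md5'

# A minimal consistent-hash ring (illustrative only, not a library API).
class HashRing
  POINTS_PER_SERVER = 100 # virtual points smooth out the distribution

  def initialize(servers)
    @ring = {} # position on the ring => server
    servers.each { |server| add(server) }
  end

  def add(server)
    POINTS_PER_SERVER.times { |i| @ring[position("#{server}-#{i}")] = server }
    @sorted = @ring.keys.sort
  end

  # Walk clockwise to the first point at or after the key's position,
  # wrapping around to the start of the ring if there is none.
  def server_for(key)
    pos = position(key)
    point = @sorted.bsearch { |p| p >= pos } || @sorted.first
    @ring[point]
  end

  private

  def position(string)
    Digest::MD5.hexdigest(string).to_i(16)
  end
end

keys = (1..1000).map { |i| "article:#{i}" }
ring = HashRing.new(%w[web1:11211 web2:11211 web3:11211])
before = keys.map { |key| ring.server_for(key) }

ring.add('web4:11211')
after = keys.map { |key| ring.server_for(key) }

moved = keys.each_index.count { |i| before[i] != after[i] }
puts "#{moved} of #{keys.size} keys moved after adding a server"
```

Only the keys that now belong to the new server change owner; everything else stays where it was, which is what keeps most of the cache warm when the pool changes.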

Installing Memcached
• Available in most Linux distros
• Packaged for Fedora, RHEL4/5, Ubuntu, Debian, Gentoo and BSD
• OSX? Use Ports!
• sudo port install memcached
• sudo gem install memcache-client
• sudo gem install cached_model

Memcache and Ruby
• We’ll use the memcache-client

• Pure Ruby implementation
• Pretty fast!

Storing Stuff
rsharp$ sudo gem install memcache-client

require 'memcache'

memcache_options = {
  :compression => true,
  :debug => false,
  :namespace => 'my_favourite_artists',
  :readonly => false,
  :urlencode => false
}

# Create the client, then point it at the local memcached daemon
Cache = MemCache.new memcache_options
Cache.servers = 'localhost:11211'

Storing Stuff

Cache.set 'favourite_artist', 'Salvador Dali'
skateboarder = Cache.get 'favourite_artist'
Cache.delete 'favourite_artist'

Memcache Namespaces
• Memcache doesn’t have namespaces, so we have to
• Prefix your keys with a namespace by setting the namespace when you connect
• Our solution:
• Run multiple memcache instances on different ports
• Makes it easier to clear out
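Key prefixing can be pictured with a small wrapper. This is a sketch only: the class name and site names are hypothetical, and a plain Hash stands in for the memcache client so the example runs without a daemon.

```ruby
# Wraps a store and prefixes every key with a namespace.
# A Hash stands in for the memcache client here (an assumption).
class NamespacedCache
  def initialize(namespace, store)
    @namespace = namespace
    @store = store
  end

  def set(key, value)
    @store[full_key(key)] = value
  end

  def get(key)
    @store[full_key(key)]
  end

  def delete(key)
    @store.delete(full_key(key))
  end

  private

  def full_key(key)
    "#{@namespace}:#{key}"
  end
end

shared = {} # one underlying store shared by both "sites"
site_a = NamespacedCache.new('site_a', shared)
site_b = NamespacedCache.new('site_b', shared)

site_a.set('favourite_artist', 'Salvador Dali')
site_b.set('favourite_artist', 'Frida Kahlo')

puts site_a.get('favourite_artist')
puts site_b.get('favourite_artist')
```

Both sites use the same key without colliding, but the entries still live in one store, so flushing that store clears every site at once. Running one instance per site avoids that.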

Roll your own?
• Memcache-client provides basic cache methods
• What if we extended ActiveRecord?
• We can, with cached_model

Storing Stuff Part Deux
rsharp$ sudo gem install cached_model

require 'cached_model'

memcache_options = {
  :compression => true,
  :debug => false,
  :namespace => 'hifibuys',
  :readonly => false,
  :urlencode => false
}

# CachedModel looks for a memcache client in the CACHE constant
CACHE = MemCache.new memcache_options
CACHE.servers = 'localhost:11211'

Storing Stuff Part Deux
class Artist < CachedModel
end

cached_model Performance
• CachedModel is not magic.
• CachedModel only accelerates simple finds for single rows.
• CachedModel won’t cache every query you run.
• CachedModel isn’t smart enough to determine the dependencies between your queries so that it can accelerate more complicated queries. If you want to cache more complicated queries you need to do it by hand.
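Caching a more complicated query yourself usually means checking the cache first and only running the query on a miss. The sketch below shows that pattern under loud assumptions: a Hash stands in for the cache, and `expensive_query` and `articles_for` are made-up names, not part of cached_model.

```ruby
# Hypothetical stand-ins: a Hash for the cache, a method for the slow query.
CACHE_STORE = {}
QUERY_LOG = []

def expensive_query(genre)
  QUERY_LOG << genre # record each trip to the "database"
  ["#{genre} article 1", "#{genre} article 2"]
end

# Return the cached result if present, otherwise run the query and store it.
def articles_for(genre)
  key = "articles_for:#{genre}"
  CACHE_STORE[key] ||= expensive_query(genre)
end

articles_for('techno')
articles_for('techno') # second call is served from the cache
puts "queries run: #{QUERY_LOG.size}"
```

Note that `||=` treats a cached `nil` or `false` as a miss, so a real implementation would need to handle empty results explicitly.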

Other options
• acts_as_cached provides a similar

Memcache Storage
• Memcache stores blobs
• The memcache client handles marshalling, so you can easily cache objects
• This does however mean that the objects aren’t necessarily cross-language
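Marshalling can be seen directly with Ruby's built-in `Marshal` module. This is a small sketch of the round trip a client performs around a stored value; the `Artist` struct is a made-up example class.

```ruby
# Hypothetical value object used only for this example.
Artist = Struct.new(:name, :genre)

artist = Artist.new('Salvador Dali', 'surrealism')

blob = Marshal.dump(artist)   # a String of bytes, suitable for storing
restored = Marshal.load(blob) # rebuilds an equivalent Ruby object

puts blob.class
puts restored == artist
```

The byte format is specific to Ruby, which is why a client written in another language cannot necessarily read the stored value.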

Memcache Storage
• The most obvious things to store are objects
• We cache articles
• We cache collections of articles
• We cache template data
• We cache fragments
• We don’t cache SQL queries

What we cache
• Our sites are fairly big and data-rich communities
• Almost every page has editorially controlled attributes along with user generated content

• Like...

Our Example Dataset
• Article
• Joins Artists
• Joins Locations
• Joins Genres
• Joins Related Content
• Joins Related Forum Activity
• Joins Related Gallery Data

Our Example Dataset
• Article (continues)
• ...
• Joins Media Content
• Joins Comments
• Joins ‘Rollcalls’
• Joins other secret developments

Our Example
• An article requires many data
• Most don’t change that often
• We also know when they change
• Yay for the Observer pattern
• User content changes much more regularly
• Can be changed from outside our controlled area (e.g.
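Knowing when data changes is what makes observer-driven cache clearing possible. The following is a hand-rolled sketch of that idea, not Rails' own observer classes: `Article`, `CacheSweeper` and the `after_save` hook are hypothetical names, and a Hash stands in for the cache.

```ruby
# Hypothetical model that notifies observers whenever it changes.
class Article
  attr_reader :id, :title

  def initialize(id, title)
    @id = id
    @title = title
    @observers = []
  end

  def add_observer(observer)
    @observers << observer
  end

  def update_title(title)
    @title = title
    @observers.each { |observer| observer.after_save(self) }
  end
end

# Observer that drops the cached copy so the next read rebuilds it.
class CacheSweeper
  def initialize(cache)
    @cache = cache
  end

  def after_save(article)
    @cache.delete("article:#{article.id}")
  end
end

cache = { 'article:1' => 'old rendered article' } # Hash as a cache stand-in
article = Article.new(1, 'Old title')
article.add_observer(CacheSweeper.new(cache))

article.update_title('New title')
puts cache.key?('article:1')
```

Editorially controlled data suits this approach because every change goes through code we own; content changed from elsewhere would not trigger the observer.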

Our Example Summary
• Data can be loosely divided into editorially controlled and user-generated content
• Cache editorially controlled content separately from user-generated content
• Simplest way to implement is in fragment caching

Fragment Caching
• Memcache allows timed expiry of fragments
• Identify areas that change infrequently and cache
• Remember to measure performance before and after
• Evidence suggests very large
• Use memcache_fragments
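Timed expiry can be sketched with a tiny in-memory cache. This is an illustration of the behaviour only, not how memcache or memcache_fragments is implemented; the class name, the injectable clock and the 600-second lifetime are assumptions.

```ruby
# Tiny in-memory cache with timed expiry (a stand-in, not memcache itself).
class ExpiringCache
  # The clock is injectable so the example can simulate time passing.
  def initialize(clock = -> { Time.now })
    @clock = clock
    @entries = {} # key => [value, expires_at]
  end

  # Return the cached value, or run the block and cache its result.
  def fetch(key, ttl)
    value, expires_at = @entries[key]
    return value if expires_at && @clock.call < expires_at

    value = yield
    @entries[key] = [value, @clock.call + ttl]
    value
  end
end

now = Time.at(0)
cache = ExpiringCache.new(-> { now })
renders = 0
render = -> { renders += 1; "sidebar v#{renders}" }

first = cache.fetch('sidebar', 600) { render.call }
again = cache.fetch('sidebar', 600) { render.call } # served from the cache
now += 601                                          # jump past the expiry
fresh = cache.fetch('sidebar', 600) { render.call } # rendered again
puts [first, again, fresh].inspect
```

Counting how often the block runs, as `renders` does here, is also a cheap way to measure whether a fragment is worth caching.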

Caching Fragments
rsharp$ sudo gem install memcache_fragments

require 'memcache_fragments'

memcache_options = {
  :compression => true,
  :debug => false,
  :namespace => 'hifibuys',
  :readonly => false,
  :urlencode => false
}

# Create the shared client used by the fragment store
CACHE = MemCache.new memcache_options
CACHE.servers = 'localhost:11211'

Caching Fragments

# Use the memcache client for fragments and for sessions
ActionController::Base.fragment_cache_store = CACHE, {}
ActionController::CgiRequest::DEFAULT_SESSION_OPTIONS.merge!({ 'cache' => CACHE })

Caching Fragments

<% cache '…/cache/key', :expire => 10.minutes do %>
  …
<% end %>


Memcache Sessions
• We could store our session in memcache
• Great for load balancing - share across a server farm without using a DB store
• Ideal for transient data
• Solution exists in
• DB backend with memcache layer - the best of both worlds

In Summary
• Memcache gives you a distributed cache store
• Very fast and very easy to use
• Lots of Ruby and Rails libraries
• memcache-client
• cached_model
• db_memcache_store
• memcache_fragments

Any Questions?