
What is Memcached?

Memcached is a free, high-performance, distributed memory object caching system.
It stores objects as key-value pairs, similar to a hash table, and operates entirely as an in-memory data store.
It is arguably among the most widely used pieces of open-source software on the Internet, second perhaps only to the Linux operating system.

Why Memcached?
Advantages:

- It is an extremely lightweight process that can be co-hosted even with application servers.

- It reduces database workload and disk I/O, making it well suited for web/mobile applications.

- It is technology-agnostic and runs with no or minimal configuration.

Disadvantages:

- It has no persistence. Hence, it cannot be used in isolation without a database.

- It cannot cache objects larger than 1 MB, and key size is limited to 250 characters.

- It has no built-in replication. Hence, in a clustered setup, the memory stores contain mutually exclusive datasets.

Memcached deployments mostly follow one of the following deployment models:

- Co-hosted with the application server: In this model, the Memcached process runs on the same infrastructure as the
application server.

- Central hosting: In this model, a large server runs the Memcached process, which connects to the different application
server instances through sockets.

- Memcached follows a multi-master architecture, with every instance having the ability to both read and write.

- Redis follows a master-slave, single-master architecture, with all writes happening on the master and reads served
from N slaves.

Memcached Vs. Redis - Performance Comparison


The choice between Redis and Memcached comes down to a trade-off between atomicity and speed.
If you have complex retrievals with a high frequency of data change, you might opt for Redis. If you have simple retrievals
with a very high frequency of data change, you might opt for Memcached.

Source: antirez.com

Memcached Installation on Ubuntu


You can install Memcached (64-bit) with the following command on Ubuntu.

sudo apt-get install memcached


or

wget http://www.memcached.org/files/memcached-1.x.x.tar.gz
tar -zxvf memcached-1.x.x.tar.gz
cd memcached-1.x.x
./configure && make && make test && sudo make install
where 1.x.x is the latest version of the file you can download from http://www.memcached.org.

Installation of Memcached on Windows


To install Memcached on Windows:

- Download a stable version, in either 32-bit or 64-bit. I have tested the 64-bit version.

- Create a directory such as c:\memcached\.

- Copy the .zip contents to that directory. There should be a single memcached.exe file by itself (32-bit) or together with a couple of .dll
files (64-bit).

- Open the command prompt as an administrator: go to Start > Search and type 'cmd'.

- To start the service, type 'c:\memcached\memcached.exe -d start'.

- To stop the service, type 'c:\memcached\memcached.exe -d stop'.

- To change the memory pool size, type 'c:\memcached\memcached.exe -m 512' for 512 MB.

Source: ubergizmo

Memcached Configurations
All Memcached configurations are available in /etc/memcached.conf (Ubuntu only).
You can alter these configurations in the configuration file for permanent changes.
For non-persistent changes, you can pass them as command-line arguments when starting Memcached.

The configurations for Memcached are as follows (an illustrative startup command follows this list):

- -d: Runs Memcached as a daemon or service

- logfile: Points to the log file of Memcached (default: /var/log/memcached.log)

- -v: Enables verbose logging of Memcached operations

- -m: Memory pool allocated to the Memcached process

- -u: User to run the Memcached process as

- -p: Port to run the Memcached process on

- -c: Maximum number of incoming connections for Memcached, to be kept in line with the ulimit setting

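As a minimal illustration (the values here are placeholders rather than recommendations), several of these flags can be combined on a single startup command line:

memcached -d -m 512 -u memcache -p 11211 -c 1024 -v

This runs Memcached as a daemon with a 512 MB memory pool, as the memcache user, on the default port 11211, allowing up to 1024 concurrent connections, with verbose logging enabled.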
Memcached Control Flow

Conceptually, Memcached is a large store partitioned across N servers, where N denotes the number of application servers.

Every request to Memcached carries a key, and when the client sends a request the following steps take place:

- The client locates the server responsible for the given key.

- The client forwards the request to that particular server.

This routing is needed only when Memcached is centrally hosted. When co-hosted with application servers, the clients do not
have to route between different Memcached servers.
Slabs, Pages & Chunks

The latest versions of Memcached (1.2+) offer roughly a 30% reduction in memory usage.

- Slabs: The Memcached memory pool is divided into slab classes, each covering a fixed size range, e.g. 0-1000 bytes, 1001-2000
bytes, 2001-5000 bytes, and so on.

- Pages: Every slab can contain one or more pages, and the size of a page is limited to 1 MB.

- Chunks: Every page is divided into chunks of the slab class's fixed chunk size (for example, 2000 bytes); each chunk stores a single item.

Slabs, Pages & Chunks


Let us consider an example to understand memory allocation in Memcached.
When an application server asks Memcached to cache an object of, say, 250 KB, Memcached performs the following steps (see the sketch after this list):

- Identifies the slab class appropriate for the object size.

- Traverses the pages in that slab to identify a chunk with the required amount of free space.

- Allocates the first suitable chunk it finds to the object for storage.

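To make the first step above concrete, here is a minimal Python sketch (not Memcached's actual implementation; the slab boundaries are made-up values loosely based on the ranges mentioned earlier) that picks the slab class for an item of a given size:

# Illustrative slab class upper bounds in bytes, loosely based on the ranges above.
SLAB_CLASSES = [1000, 2000, 5000, 25000, 100000, 500000, 1048576]

def pick_slab_class(item_size):
    # Return the smallest slab class whose chunk size can hold the item.
    for chunk_size in SLAB_CLASSES:
        if item_size <= chunk_size:
            return chunk_size
    raise ValueError("objects larger than 1 MB cannot be cached")

print(pick_slab_class(250 * 1024))  # a 250 KB object lands in the 500000-byte class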
Consistent Hashing
$server_id = hashfunc($key) % $servercount;
A web application can talk to multiple Memcached servers, and the client library used by your application must be configured with the list
of IPs of the Memcached servers.
The Memcached client generates a hash of each key that is stored or retrieved, and the modulus of that hash identifies which
server holds the value; the same calculation is repeated for retrievals.
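Here is a minimal Python sketch of that key-to-server mapping. It uses zlib.crc32 as a stand-in hash function and a hypothetical server list; real clients such as python-memcached use their own hashing schemes, and note that plain modulo hashing remaps most keys whenever a server is added or removed, unlike true consistent hashing:

import zlib

SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]  # hypothetical server list

def server_for(key):
    # Equivalent of: $server_id = hashfunc($key) % $servercount;
    server_id = zlib.crc32(key.encode("utf-8")) % len(SERVERS)
    return SERVERS[server_id]

print(server_for("greeting"))  # the same key always maps to the same server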

Memcached - Memory Eviction


Let's say we are running a Memcached process with a 250 MB memory pool allocated, and its pages have now exhausted
the entire 250 MB.
What will happen when we try to cache additional objects?

Every object in the memory pool has a counter associated with it which holds a timestamp. On each fetch, update, or
create, the timestamp is reset to the current time, allowing Memcached to identify the Least Recently Used (LRU) objects and evict
them.
Items are evicted when the slab class is left with no free chunks.
Note: Eviction is forced in Memcached, and hence the durability of data within Memcached is extremely low.

Single Order Functions


One of the key advantages of Memcached is its atomicity. All operations on Memcached, such as SET, ADD, and DELETE, are single
order, O(1), functions: they take the same amount of time for the first record as for the 100th, and each operation is atomic.

Ways to Run Operations in Memcached


There are primarily two ways to connect to the Memcached process:

- Telnet to the actual process on the Memcached server.

- Use language-specific clients such as php5-memcached or python-memcached and connect to the Memcached server through
sockets.

As part of this course, we will use the telnet method, while in real application scenarios the second method is the one predominantly
used. A short python-memcached example follows.
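For reference, here is a minimal sketch using the python-memcached client mentioned above, assuming a Memcached instance is running locally on the default port 11211:

import memcache  # provided by the python-memcached package

mc = memcache.Client(["127.0.0.1:11211"])  # list of "host:port" strings

mc.set("greeting", "Hello world", time=0)  # time=0 means never expire
print(mc.get("greeting"))                  # -> "Hello world"
mc.delete("greeting")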

Memcached - Add Operation


The add operation stores the data sent along with the key only if the key does not already exist.

Syntax:

add greeting 1 0 11

Hello world

Explanation: Use key greeting, use 1 as a flag (arbitrary metadata), use 0 as the expiry (i.e., never expire), and expect the value to
be 11 bytes long.

- New items are placed at the top of the Least Recently Used (LRU) list.

- If an item already exists, the add fails but still promotes the existing item in the LRU list (see the example below).


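For example, assuming the key greeting was just stored as above, a second add on the same key is rejected by the server (illustrative telnet transcript):

add greeting 1 0 11
Hello again
NOT_STORED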

Set Memcached Operations

The set operation is the most commonly used command: it stores the value for the key whether or not the key already exists.

Syntax:

set newgreeting 1 0 14

Happy Birthday

New items are at the top of the LRU list.

Memcached Get Operation

The get operation retrieves the value stored against a given key.

Syntax:
get keyname
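Continuing the earlier set example, a get returns the flags, the byte count, and the stored value (illustrative telnet transcript):

get newgreeting
VALUE newgreeting 1 14
Happy Birthday
END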

Memcached - Delete

The delete operation removes the object stored against a given key.

Syntax:

delete greeting

Get All Keys in the Memcached


Memcached allows you to see the list of keys that are available in the current Memcached instance.
Syntax:

stats items
STAT items:1:number 2
The number after items: is the slab ID. To dump the keys in that slab, run:

stats cachedump 1 100


ITEM views.decorators.cache.cache_header..cc7d9 [6 b; 1256056128 s]
END
In the command above, 1 denotes the slab number and 100 denotes the number of keys to dump.

Memcached - Flush All


Flush all clears all existing cache items. It optionally takes a parameter for clearing objects after N seconds.
Syntax:

flush_all

flush_all 90

The first command clears all the objects and their corresponding keys, while the second one clears them after 90 seconds.

Other Memcached Operations


The list of other Memcached operations is as follows:

- Command: append, Syntax: append key 0 60 15, Description: Appends data to an existing key
- Command: prepend, Syntax: prepend key 0 60 15, Description: Prepends data to an existing key
- Command: incr, Syntax: incr mykey 2, Description: Increments a numerical key value by the given number (see the example below)
- Command: cas, Description: Stores data only if you were the last one to update it. Useful for resolving race
conditions when updating cached data.

You can read more on operations in this wiki.
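For instance, a counter can be created and then incremented over telnet as follows (illustrative transcript; incr replies with the new value):

set mykey 0 0 1
5
STORED
incr mykey 2
7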

Memcached - Multi Get Operation

Most Memcached clients have an option to perform a multi-get operation.

Consider that you need to retrieve 10 keys from the cache and there are three Memcached servers.

With multi-get, the following process takes place (see the client example below):

- The keys are mapped to their servers.

- A get is issued to each server with the array of keys it owns.

- Depending on the client, the servers may be queried in parallel or contacted one at a time.

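With python-memcached, for example, this is exposed as get_multi (a sketch; the server list and key names are hypothetical):

import memcache

mc = memcache.Client(["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"])

# The client maps each key to its server and batches the gets per server.
values = mc.get_multi(["user:1", "user:2", "user:3"])
print(values)  # dict of key -> value for the keys that were found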
Memcached Race Condition


How do you handle race conditions and stale data in Memcached?
Let's take an example. You are caching the latest five comments on your profile timeline page.
You decide that the data needs to be refreshed only once per minute (expiring items in 60 s). But you forget that the profile timeline is
refreshed 50 times per second. Thus, once 60 seconds roll around and the cache expires, suddenly there are 10+ processes
running the same SQL query to repopulate that particular cache. Every time the cache expires, it results in a sudden burst of SQL traffic.
In the worst case, since multiple processes update the same data, if the wrong one ends up updating the cache, you have
stale, outdated data floating about for another 60 s.
You can handle a race condition in Memcached in the following three ways:

- Using only add operations and not set operations

- Using cas() together with set or delete

- Using locks

Memcached - Set vs Add


In simple SQL parlance, add is analogous to an INSERT statement, while set is analogous to an UPDATE (or rather an UPSERT) statement.
If you consistently use only add in your code instead of set, you can avoid race conditions completely, as sketched below.
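A minimal python-memcached sketch of this idea (the key name and cached value are hypothetical):

import memcache

mc = memcache.Client(["127.0.0.1:11211"])

fresh_value = ["comment 1", "comment 2", "comment 3"]  # hypothetical freshly queried data

# add behaves like INSERT: it only succeeds if the key is absent,
# so only one of the competing processes populates the cache.
if mc.add("profile_timeline", fresh_value, time=60):
    print("this process populated the cache")
else:
    print("another process got there first; its value is left in place")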

Memcached - CAS
The latest versions of Memcached (1.2+) provide an atomic operation called cas(). CAS, also known as Check And Set or Compare And
Swap, stores a value only if no other client has modified it since you last fetched it; otherwise the operation fails.
The cas command may produce one of the following results (a client-side sketch follows the list):

- STORED indicates success.

- ERROR indicates an error while saving the data or wrong syntax.

- EXISTS indicates that someone has modified the data since your last fetch.

- NOT_FOUND indicates that the key does not exist on the Memcached server.

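A rough sketch of the same flow with python-memcached (gets/cas support generally requires creating the client with cache_cas=True; the key name is hypothetical):

import memcache

mc = memcache.Client(["127.0.0.1:11211"], cache_cas=True)

mc.set("post_1_like_count", 10)

value = mc.gets("post_1_like_count")        # fetch the value and remember its CAS token
if value is not None:
    if mc.cas("post_1_like_count", value + 1):
        print("updated; nobody modified the key in between")
    else:
        print("somebody else modified the key; retry or give up")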
Memcached - Stampeding Herd Problem


When hot items frequently miss the cache, the resulting load stresses your database, a situation often called the stampeding
herd problem.

A common way to handle this is to use Ghetto Locking.

Memcached - Locking
From Memcached docs,
"A series of commands is not considered atomic. If you issue a 'get' against an item, operate on the data, and then wish to 'set' it
back into Memcached, you are not guaranteed to be the only process working on that value. In parallel, you could end up
overwriting a value set by something else."
The easiest approach to implementing such a lock is the view-count technique described next.

Memcached - View Count Locking


Here is sample pseudocode for how Facebook managed the like count on posts (a Python sketch follows the steps):

- Step 1: Let the new like count be available in a variable new_count.

- Step 2: Identify the key that holds the like count for the post in Memcached, e.g. post_1_like_count.

- Step 3: Try to increment the key with the INCR operation.

- Step 4: If Step 3 fails because the key does not exist, try to create the key with an add.

- Step 5: If Step 4 also fails, some other client has raced us and already created the key, so increment it again.

In this method, a pseudo-lock is obtained by using the INCR operation on a counter key instead of on the actual data.
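Here is a rough python-memcached sketch of those steps (assuming a local instance, and assuming incr returns None when the key is missing, as the client does for a NOT_FOUND reply):

import memcache

mc = memcache.Client(["127.0.0.1:11211"])

def record_like(post_id):
    key = "post_%d_like_count" % post_id
    new_count = mc.incr(key)      # Step 3: try to increment
    if new_count is None:         # the key does not exist yet
        if mc.add(key, 1):        # Step 4: try to create it with an initial count
            new_count = 1
        else:                     # Step 5: another client raced us and created it
            new_count = mc.incr(key)
    return new_count

print(record_like(1))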

Memcached Monitoring
Memcached servers can be monitored in two modes:

- CLI mode: Using the stats command with its various arguments, along with memcached-top; suitable for monolithic (single-instance)
deployments.

- Memcached tools: Monitoring tools such as phpMemcachedAdmin can be used; suitable for cluster
deployments.

Memcached Monitoring - Why?


Memcached monitoring is about being able to answer the following questions with data:

- Is the Memcached daemon process running?

- What is the system usage on the Memcached servers?

- Are the Memcached queries running efficiently?

- Is my Memcached cache performing effectively?

Over the next few cards in this course, you will understand how to monitor the usual suspects.

Memcached Monitoring - Cache Performance


While monitoring cache performance, you need to derive some metrics. Here is a list of metrics that are indispensable for
monitoring Memcached:

- Hit rate indicates the efficiency of your Memcached server and is computed as get_hits / cmd_get.

- Fill percent represents how much of the memory pool is in use and is computed as bytes / limit_maxbytes.

- Evictions count the items removed before their expiry time due to lack of memory; the item to remove is selected
with a pseudo-LRU mechanism.

Memcached-top is a CLI utility that allows you to monitor the Memcached process at runtime. Run the following commands to install and launch it:

sudo add-apt-repository ppa:duggan/memcached-top
sudo apt-get update
sudo apt-get install memcache-top
memcached-top

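The hit rate and fill percent described above can also be derived programmatically. Here is a sketch using python-memcached's get_stats() (assuming a local instance; stat values typically come back as strings, hence the int conversions):

import memcache

mc = memcache.Client(["127.0.0.1:11211"])

for server, stats in mc.get_stats():
    hits = int(stats.get("get_hits", 0))
    gets = int(stats.get("cmd_get", 0))
    used = int(stats.get("bytes", 0))
    limit = int(stats.get("limit_maxbytes", 1))
    hit_rate = hits / gets if gets else 0.0
    fill_percent = 100.0 * used / limit
    print(server, "hit rate: %.2f" % hit_rate, "fill: %.1f%%" % fill_percent)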
Memcached Tuning - Connection Yields


Here is an excerpt from the output of the stats command:

stats
STAT pid 27631
STAT uptime 104
STAT time 1401932723
STAT version 1.4.5
STAT pointer_size 64
STAT rusage_user 0.004999
STAT rusage_system 0.010998
STAT curr_connections 10
[...]
STAT listen_disabled_num 0
STAT threads 4
STAT conn_yields 4354
STAT bytes 0
STAT curr_items 0
STAT total_items 0
STAT evictions 23420009
STAT reclaimed 0
END
conn_yields, or connection yields, counts the number of times Memcached throttled the connection of a specific client. If
connection yields are high, it usually means clients are issuing large multi-get requests, or the -R value (the maximum number of requests processed per client connection per event) is set too low.

Memcached Eviction Tuning


Here is an excerpt from the output of the stats slabs command:

stats slabs
STAT active_slabs 0
STAT total_malloced 0
END
If the eviction count on specific slabs is high, then you probably need to alter the growth factor (an illustrative command follows).
The growth factor denotes the size increase between successive slab classes. The default growth factor is 1.25.
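For example (the values here are illustrative), you can start Memcached with a different growth factor via the -f flag; with very verbose output it prints the chunk size of each slab class as it starts, which makes the effect of the growth factor easy to see:

memcached -vv -f 1.5 -m 64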

Memcached Tuning - Threads


The most important configuration influencing overall Memcached performance is the number of worker threads, set with the -t parameter.
The default value for this parameter is 4, and it is specified when the process is started.

Disable Compare and Swap


The Compare and Swap (CAS) feature adds an extra 8-byte field to every item to store its unique CAS value.
Disabling CAS (via the -C startup option) improves performance and saves both time and memory. This option is suitable when it is not a
cluster deployment.

Memcached Replication
Memcached replication allows the memory pool of one Memcached instance to be kept synchronous with another.
Replication here doesn't bring the entire set of Memcached servers to the same state; instead, it ensures that more than
one Memcached server holds the same dataset, offering extended durability.

Memcached Replication - Repcached


Repcached is a patch set that adds the data replication feature to Memcached.
Source: Repcached Site

Setting up Repcached in single cluster


Following are the steps to set up replication between Memcached servers:

- Step 1: Download and install Memcached.

- Step 2: Download and install repcached and enable the replication option.

- Step 3: Create a copy of the Memcached configuration file with the -x parameter pointing to the replica host name and port, for
example -x 127.0.0.1:11211.

- Step 4: Create a copy of the Memcached configuration file on the replica with the -x parameter pointing to the master host name.

- Step 5: Restart the services on both the master and the replica hosts.

Now, you should be able to replicate objects across both instances of Memcached.

add F 0 0 3
0-0
STORED
add N 0 0 3
0-0
STORED

Keys added on one instance this way (F and N above, each storing the 3-byte value 0-0) return STORED and should then be retrievable from the replica instance as well.
