
Introduction

Amazon runs its e-commerce operation around the world on a highly decentralized,
loosely coupled, service-oriented architecture comprising hundreds of services. By 2004,
Amazon was growing rapidly and starting to hit the scaling limits of its Oracle database.
To overcome the limitations of their RDBMS, engineers examined how the database was
actually used and found that 90 percent of operations did not use the JOIN functionality
that sits at the core of relational databases. They concluded that not every application
needs an ACID-compliant data store, and that relaxing consistency can improve
availability. In addition, the replication technologies available for relational databases are
limited and typically choose consistency over availability, and it is not easy to scale out
an RDBMS or apply smart partitioning schemes for load balancing. To overcome these
limitations of relational database systems, Amazon designed Dynamo, an eventually
consistent storage system suitable for production use by demanding applications. The
paper describes the lessons learned from building a highly available key-value store
designed to keep pace with the growth of Amazon.com.

Assumptions & Intended Usage

With millions of components in Amazon's architecture, some network components are
always failing during normal operation. Amazon's software services are therefore
designed to treat failure as the normal case, without impacting performance or
availability. Dynamo was intended to meet the high availability and scalability
requirements of Amazon's platform.

Across a diverse set of applications, ranging from the shopping cart to services with
complex security requirements, it was observed that the majority of applications need
only simple, primary-key-based access to the data store. Experience at Amazon had also
shown that applications relying on data stores with transactional ACID guarantees failed
to meet the platform's high availability needs. Dynamo therefore targets applications that
need high availability at the expense of consistency. Dynamo does not guarantee
isolation (the 'I' in ACID) and permits only single-key updates. Instead of supporting
complex relational queries such as joins, Dynamo's design assumes that services can
perform the majority of their operations against a single data item.

Another assumption was that platform services built on Dynamo must be able to meet
stringent latency and throughput requirements. Additionally, to control infrastructure
cost, the system had to run on affordable commodity hardware. To make these goals
compatible, applications can configure Dynamo to balance the tradeoffs among
performance, availability, and cost. Finally, Dynamo's design assumes that each service
runs its own instance in a non-hostile environment, with no additional security
requirements.

Design Tradeoffs

In a distributed system, when data is written to one node, the update must be propagated
to the other nodes holding replicas. (In Dynamo, a node is a storage host; replicas may be
spread across multiple data centers.) Replication improves the reliability of the system by
providing backups in case of failures [1]. However, in a distributed environment subject
to frequent network failures, maintaining both consistency and availability becomes
difficult. Per the CAP theorem, a distributed system can achieve only two of the three
desirable properties: consistency, availability, and partition tolerance [2]. For a platform
such as Amazon's, which serves an ever-increasing volume of user traffic, availability
and reliability had to take precedence over consistency.

Trading off consistency means that an update may not yet be visible at every replica at a
given moment, so the system needs conflict resolution. Dynamo, an "always writeable"
data store, can resolve conflicts either in the application or in the data store itself.
Application-level resolution is advantageous because the developer understands the data
schema and can choose an appropriate merge strategy. When resolution is left to the data
store, Dynamo applies a simple "last write wins" policy.
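
As a rough illustration, the following Python sketch (all names hypothetical, not from the
paper) contrasts the two resolution paths: a store-level "last write wins" rule and an
application-level merge, here for a shopping cart.

    from dataclasses import dataclass

    @dataclass
    class Version:
        value: set        # e.g. the set of item IDs in a shopping cart
        timestamp: float  # wall-clock time of the write

    def last_write_wins(versions):
        # Store-level policy: keep only the most recently written version.
        return max(versions, key=lambda v: v.timestamp)

    def merge_carts(versions):
        # Application-level policy: union the carts so no added item is lost.
        merged = set()
        for v in versions:
            merged |= v.value
        return merged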

With this consistency tradeoff accepted in the design, Dynamo can scale horizontally: the
system runs on many nodes and can absorb increasing request volume. Dynamo's
support for incremental scalability lets the system scale out one node at a time. To
achieve this, Dynamo uses consistent hashing to spread data items across nodes (a sketch
appears in the Implementation section below), so that adding or removing a node
disturbs only a small fraction of keys and does not noticeably affect overall performance.

Amazon uses a service-oriented architecture in which each business unit has its own
service, an approach that lets Amazon handle incoming requests without frequent
alterations to a shared database schema. For services that demand especially high
performance, Dynamo can trade off durability: write operations are first stored in an
in-memory buffer and periodically flushed to the storage engine, while read operations
check the buffer first and fall back to the storage engine on a miss.
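
A minimal sketch of this buffered write path, assuming an in-memory dict stands in for
the storage engine (names are hypothetical):

    class BufferedStore:
        # Write-back buffer in front of a durable storage engine. Writes land
        # in memory first (fast, but lost on a crash until flushed); reads
        # consult the buffer before the engine.

        def __init__(self, flush_threshold=100):
            self.buffer = {}            # pending writes, not yet durable
            self.engine = {}            # stands in for BDB/MySQL
            self.flush_threshold = flush_threshold

        def put(self, key, value):
            self.buffer[key] = value
            if len(self.buffer) >= self.flush_threshold:
                self.flush()

        def get(self, key):
            if key in self.buffer:      # newest value may still be buffered
                return self.buffer[key]
            return self.engine.get(key)

        def flush(self):
            # Periodically persist buffered writes to the storage engine.
            self.engine.update(self.buffer)
            self.buffer.clear()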

Implementation

To meet its scalability requirement, Dynamo partitions data across multiple nodes using a
modified consistent hashing technique. Instead of mapping each node to a single point on
the hash ring, each node is mapped to several points, called virtual nodes. This yields a
more uniform distribution of data across the nodes and improves load balance. To
achieve high availability, each data item is replicated at N distinct nodes.
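
A minimal sketch of a consistent-hash ring with virtual nodes (hypothetical names;
Dynamo's production partitioning scheme later evolved beyond this basic form):

    import bisect
    import hashlib

    def _hash(key: str) -> int:
        # Map a string onto the ring (the space of 128-bit MD5 values).
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    class Ring:
        def __init__(self, nodes, vnodes_per_node=8):
            # Each physical node owns several points ("virtual nodes") on the ring.
            self._points = sorted(
                (_hash(f"{node}#{i}"), node)
                for node in nodes
                for i in range(vnodes_per_node)
            )

        def preference_list(self, key, n=3):
            # Walk clockwise from hash(key), collecting N distinct nodes.
            start = bisect.bisect(self._points, (_hash(key),))
            result = []
            for i in range(len(self._points)):
                node = self._points[(start + i) % len(self._points)][1]
                if node not in result:
                    result.append(node)
                if len(result) == n:
                    break
            return result

    ring = Ring(["A", "B", "C", "D"])
    print(ring.preference_list("cart:12345"))  # e.g. ['C', 'A', 'D']
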
Dynamo uses vector clocks to maintain different versions of the same object. A vector
clock is a list of (node, counter) pairs, and each modification to the data results in a new
immutable version. Dynamo also stores a timestamp recording the last time each node
updated the object, which is used to truncate old vector-clock entries.
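
A minimal sketch of vector-clock comparison, representing each clock as a plain dict
from node name to counter (helper names are hypothetical):

    def increment(clock, node):
        # Record an update by `node`: bump its counter in the vector clock.
        clock = dict(clock)
        clock[node] = clock.get(node, 0) + 1
        return clock

    def descends(a, b):
        # True if clock `a` has seen everything clock `b` has (a >= b).
        return all(a.get(node, 0) >= count for node, count in b.items())

    def conflict(a, b):
        # Neither version descends from the other: concurrent updates.
        return not descends(a, b) and not descends(b, a)

    v1 = increment({}, "A")   # {'A': 1}
    v2 = increment(v1, "B")   # {'A': 1, 'B': 1}: descends from v1
    v3 = increment(v1, "C")   # {'A': 1, 'C': 1}: concurrent with v2
    print(descends(v2, v1), conflict(v2, v3))  # True True
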
To maintain consistency among the replicas, Dynamo uses a protocol similar to quorum
systems. It maintains a preference list of the top N nodes for any given key and sets R
and W such that R + W > N, where R and W are the minimum numbers of nodes that
must participate in a successful read and write operation, respectively.
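
A minimal sketch of a quorum read under the R + W > N rule (hypothetical names; real
coordination is asynchronous and failure-aware):

    N, R, W = 3, 2, 2   # R + W > N makes the read and write sets overlap
    assert R + W > N

    def quorum_read(replicas, key):
        # Ask the first N replicas in the preference list; succeed once R answer.
        responses = []
        for replica in replicas[:N]:
            value = replica.get(key)   # a real call can fail or time out
            if value is not None:
                responses.append(value)
            if len(responses) >= R:
                return responses       # caller reconciles divergent versions
        raise TimeoutError("fewer than R replicas responded")
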
To tolerate temporary node failures, Dynamo uses a "sloppy quorum", in which reads and
writes are performed on the first N healthy nodes in the preference list rather than strictly
the first N. To recover from permanent node failures, Dynamo runs an anti-entropy
protocol that minimizes inconsistencies between replicas by comparing Merkle trees built
over their key ranges. A gossip-based protocol propagates membership changes when a
node is added to or removed from the Dynamo ring.
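
A minimal sketch of the Merkle-tree comparison used for anti-entropy (hypothetical
names; a real implementation hashes key ranges hierarchically so replicas can skip
identical subtrees):

    import hashlib

    def h(data: bytes) -> bytes:
        return hashlib.sha256(data).digest()

    def merkle_root(leaves):
        # Build the tree bottom-up; the root hash summarizes every leaf.
        level = [h(leaf) for leaf in leaves]
        while len(level) > 1:
            if len(level) % 2:   # duplicate the last hash on odd levels
                level.append(level[-1])
            level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        return level[0]

    # Two replicas exchange roots; equal roots mean the key range is in sync
    # and no further data needs to be transferred.
    a = merkle_root([b"k1=v1", b"k2=v2"])
    b = merkle_root([b"k1=v1", b"k2=v2-stale"])
    print(a == b)  # False -> descend into subtrees to find divergent keys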

Every node consists of three main software components: request coordination,
membership and failure detection, and a local persistence engine. The persistence layer
supports multiple storage engines, such as Berkeley DB (BDB) and MySQL, so that the
engine best suited to an application's access patterns can be chosen. On top of this sits
the request coordination component, which coordinates read and write operations for a
given key on behalf of the client. Each client request creates a new state machine on the
node handling the request; the state machine contains the logic for forwarding the request
to the responsible nodes, performing retries, gathering responses, and sending the result
back to the client.
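
A minimal sketch of the write-coordination logic in this style (hypothetical names; the
real component is an event-driven state machine rather than blocking calls):

    def coordinate_write(preference_list, key, value, w=2, retries=1):
        # Forward the write to the key's replicas; succeed once W acknowledge.
        acks = 0
        for replica in preference_list:
            for _ in range(retries + 1):
                try:
                    replica.put(key, value)  # a remote call in the real system
                    acks += 1
                    break
                except ConnectionError:
                    continue                 # retry this replica, then move on
            if acks >= w:
                return True                  # W replicas acknowledged the write
        return False                         # surface the failure to the client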

Advantages

The main advantage is the ease of scaling a large number of nodes in and out while
maintaining high availability. In DynamoDB, the managed AWS service descended from
Dynamo, this scaling is automatic: nodes can be added or removed without operator
intervention. All nodes in the system are utilized close to their capacity, so resource
usage is efficient. There is no single point of failure, and the loss of a single node does
not affect the entire system. The managed service also integrates easily with AWS
Lambda and API Gateway.

Disadvantages

The main disadvantage is that Dynamo is not ACID compliant: it does not guarantee
atomicity or strong consistency. It becomes expensive for storing large values such as
big JSON blobs (in DynamoDB, a single query returns at most 1 MB of data). Once the
read (or write) capacity limit is reached, further requests are throttled until enough time
has elapsed. Backing up tables requires additional scripts. Compared to SQL databases
there is considerable overhead: joins are impossible, all complex data relationships must
be managed in application code, and the lack of foreign keys and triggers adds to the
burden. There is also some delay before availability: a newly created table is not usable
instantly.

Learning and Criticism

Dynamo is a state-of-the-art approach to building highly available databases, and it
contributed to the emergence of many other NoSQL systems. It is highly configurable,
for example through the per-application R and W quorum settings, and the paper clearly
presents techniques such as consistent hashing and Merkle trees.

There are some weaknesses, however, that the paper either does not state clearly or does
not support with quantitative evidence. For example, although high scalability is stated as
an advantage, there is no evaluation of performance when operating a very large number
of nodes. The paper presents an administrator-driven (command-line) mechanism for
adding or removing nodes from the Dynamo ring; with hundreds of nodes and multiple
concurrent failures, this could require too much human intervention, leading to
time-consuming, inefficient membership management.

References

[1] Bernstein, P. A., and Goodman, N. An algorithm for concurrency control and
recovery in replicated distributed databases. ACM Transactions on Database Systems,
9(4):596-615, December 1984.

[2] Shim, S. S. Guest Editor's Introduction: The CAP Theorem's Growing Impact.
Computer, vol. 45, pp. 21-22, 2012. doi:10.1109/MC.2012.54.
