Life beyond Distributed Transactions: an Apostate’s Opinion
Position Paper
Pat Helland
Amazon.com
705 Fifth Ave South
Seattle, WA 98104 USA
PHelland@Amazon.com
The positions expressed in this paper are personal opinions and do not in any way reflect the positions of my employer, Amazon.com.
Many decades of work have been invested in the area of distributed transactions, including protocols such as 2PC, Paxos, and various approaches to quorum. These protocols provide the application programmer a façade of global serializability. Personally, I have invested a non-trivial portion of my career as a strong advocate for the implementation and use of platforms providing guarantees of global serializability. My experience over the last decade has led me to liken these platforms to the Maginot Line. In general, application developers simply do not implement large scalable applications assuming distributed transactions. When they attempt to use distributed transactions, the projects founder because the performance costs and fragility make them impractical. Natural selection kicks in…

Footnote: The Maginot Line was a huge fortress that ran the length of the Franco-German border and was constructed at great expense between World War I and World War II. It successfully kept the German army from directly crossing the border between France and Germany. It was quickly bypassed in 1940 by the Germans, who invaded through Belgium.
This article is published under a Creative Commons License Agreement (http://creativecommons.org/licenses/by/2.5/). You may copy, distribute, display, and perform the work, make derivative works and make commercial use of the work, but you must attribute the work to the author and CIDR 2007.
3rd Biennial Conference on Innovative Data Systems Research (CIDR), January 7-10, 2007, Asilomar, California, USA.
Instead, applications are built using different techniques which do not provide the same transactional guarantees but still meet the needs of their businesses.

This paper explores and names some of the practical approaches used in the implementations of large-scale mission-critical applications in a world which rejects distributed transactions. We discuss the management of fine-grained pieces of application data which may be repartitioned over time as the application grows. We also discuss the design patterns used in sending messages between these repartitionable pieces of data.

This discussion aims to raise awareness of these patterns for two reasons. First, it is my belief that this awareness can ease the challenges faced by people hand-crafting very large scalable applications. Second, by observing the patterns, hopefully the industry can work towards the creation of platforms that make it easier to build these very large applications.
Let’s examine some goals for this paper, some assumptions that I am making for this discussion, and then some opinions derived from the assumptions. While I am keenly interested in high availability, this paper will ignore that issue and focus on scalability alone. In particular, we focus on the implications that fall out of assuming we cannot have large-scale distributed transactions.
This paper has three broad goals:
Discuss Scalable Applications
Many of the requirements for the design of scalable systems are understood implicitly by many application designers who build large systems. The problem is that the issues, concepts, and abstractions for the interaction of transactions and scalable systems have no names and are not crisply understood. When they get applied, they are inconsistently applied and sometimes come back to bite us. One goal of this paper is to launch a discussion which can increase awareness of these concepts and, hopefully, drive towards a common set of terms and an agreed approach to scalable programs.

This paper attempts to name and formalize some abstractions implicitly in use for years to implement scalable systems.
Think about Almost-Infinite Scaling of Applications
To frame the discussion on scaling, this paper presents an informal thought experiment on the impact of almost-infinite scaling. I assume the number of customers, purchasable entities, orders, shipments, health-care patients, taxpayers, bank accounts, and all other business concepts manipulated by the application grows significantly larger over time. Typically, the individual things do not get significantly larger; we simply get more and more of them. It really doesn’t matter what resource on the computer is saturated first; the increase in demand will drive us to spread what formerly ran on a small set of machines to run over a larger set of machines…

Almost-infinite scaling is a loose, imprecise, and deliberately amorphous way to motivate the need to be very clear about when and where we can know something fits on one machine and what to do if we cannot ensure it does fit on one machine. Furthermore, we want to scale almost linearly with the load (both data and computation).

Footnote: To be clear, this is conceptually assuming tens of thousands or hundreds of thousands of machines. Too many to make them behave like one “big” machine.
Footnote: Scaling at N log N for some big log would be really nice…

Describe a Few Common Patterns for Scalable Apps
What are the impacts of almost-infinite scaling on the business logic? I am asserting that scaling implies using a new abstraction called an “entity” as you write your program. An entity lives on a single machine at a time and the application can only manipulate one entity at a time. A consequence of almost-infinite scaling is that this programmatic abstraction must be exposed to the developer of business logic.

By naming and discussing this as-yet-unnamed concept, it is hoped that we can agree on a consistent programmatic approach and a consistent understanding of the issues involved in building scalable systems. Furthermore, the use of entities has implications on the messaging patterns used to connect the entities. These lead to the creation of state machines that cope with the message delivery inconsistencies foisted upon the innocent application developer as they attempt to build scalable solutions to business problems.
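Since an entity is the unit that stays on one machine and moves whole when data is repartitioned, the scaling story above can be sketched concretely. This is a toy illustration under my own assumptions, not the paper’s design; the names `bucket_for`, `machine_for`, and `NUM_BUCKETS` are invented. Entity-keys hash to stable logical buckets, and scaling out reassigns whole buckets (and therefore whole entities) to machines:

```python
# Hypothetical sketch: entities hash into a fixed, large number of logical
# buckets; repartitioning moves whole buckets between machines, so an entity
# always lives on exactly one machine at a time and is never split.
import hashlib

NUM_BUCKETS = 1024  # fixed forever; chosen much larger than any machine count

def bucket_for(entity_key: str) -> int:
    """Map an entity-key to a stable logical bucket (never changes)."""
    digest = hashlib.sha256(entity_key.encode()).hexdigest()
    return int(digest, 16) % NUM_BUCKETS

def machine_for(entity_key: str, num_machines: int) -> int:
    """Assign buckets to machines; adding machines reassigns buckets,
    but every entity still resides on exactly one machine."""
    return bucket_for(entity_key) % num_machines
```

The point of the indirection through buckets is that the lower layer can grow `num_machines` over time without the business logic ever naming a machine.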
Assumptions

Let’s start out with three assumptions which are asserted and not justified. We simply assume these are true based on experience.
Layers of the Application and Scale-Agnosticism
Let’s start by presuming (at least) two layers in each scalable application. These layers differ in their perception of scaling. They may have other differences, but that is not relevant to this discussion.

The lower layer of the application understands the fact that more computers get added to make the system scale. In addition to other work, it manages the mapping of the upper layer’s code to the physical machines and their locations. The lower layer is scale-aware in that it understands this mapping. We are presuming that the lower layer provides a scale-agnostic programming abstraction to the upper layer.

Using this scale-agnostic programming abstraction, the upper layer of application code is written without worrying about scaling issues. By sticking to the scale-agnostic programming abstraction, we can write application code that is not worried about the changes happening when the application is deployed against ever-increasing load.

Over time, the lower layer of these applications may evolve to become new platforms or middleware which simplify the creation of scale-agnostic applications (similar to the past scenarios when CICS and other TP-monitors evolved to simplify the creation of applications for block-mode terminals).

The focus of this discussion is on the possibilities posed by these nascent scale-agnostic APIs.
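The two layers can be sketched in code. This is a minimal toy model under my own assumptions, not an API from the paper; `ScaleAwareRouter`, `ScaleAgnosticAPI`, and `send_to_entity` are invented names. The upper layer addresses work by key only; the lower layer owns the key-to-machine mapping and can change it at will:

```python
# Toy sketch of the two layers: the upper layer never sees machines; the
# scale-aware lower layer hides the mapping of keys to physical machines.
import hashlib

class ScaleAwareRouter:
    """Lower layer: knows the physical machines and the mapping to them."""
    def __init__(self, machines):
        self.machines = list(machines)

    def locate(self, entity_key: str) -> str:
        h = int(hashlib.sha256(entity_key.encode()).hexdigest(), 16)
        return self.machines[h % len(self.machines)]

class ScaleAgnosticAPI:
    """Upper layer's view: address work by entity-key only."""
    def __init__(self, router: ScaleAwareRouter):
        self._router = router
        self.delivered = []  # stand-in for real network sends

    def send_to_entity(self, entity_key: str, message: str) -> None:
        machine = self._router.locate(entity_key)  # hidden from the caller
        self.delivered.append((machine, entity_key, message))

api = ScaleAgnosticAPI(ScaleAwareRouter(["m1", "m2", "m3"]))
api.send_to_entity("customer-17", "update-address")
```

Because the upper layer only ever names entity-keys, the lower layer is free to remap keys to machines as load grows, which is exactly the scale-agnosticism described above.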
Scopes of Transactional Serializability
Lots of academic work has been done on the notion of providing transactional serializability across distributed systems. This includes 2PC (two-phase commit), which can easily block when nodes are unavailable, and other protocols which do not block in the face of node failures, such as the Paxos algorithm.
Footnote: Google’s MapReduce is an example of a scale-agnostic programming abstraction.
[Figure: Two layers of application code. The upper layer of code is scale-agnostic; the lower layer of code is scale-aware code implementing support for the scale-agnostic programming abstraction that connects the two layers.]
Let’s describe these algorithms as ones which provide global transactional serializability. Their goal is to allow arbitrary atomic updates across data spread across a set of machines. These algorithms allow updates to exist in a single scope of serializability across this set of machines.

We are going to consider what happens when you simply don’t do this. Real system developers and real systems as we see them deployed today rarely even try to achieve transactional serializability across machines or, if they do, it is within a small number of tightly connected machines functioning as a cluster. Put simply, we aren’t doing transactions across machines except in the simple case where there is a tight cluster which looks like one machine.

Instead, we assume multiple disjoint scopes of transactional serializability. Consider each computer to be a separate scope of transactional serializability. Each data item resides in a single computer or cluster. Atomic transactions may include any data residing within that single scope of transactional serializability (i.e. within the single computer or cluster). You cannot perform atomic transactions across these disjoint scopes of transactional serializability. That’s what makes them disjoint!
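Disjoint scopes can be modeled in a few lines. This is my own toy model, not anything from the paper; `Scope` and `cross_scope_transact` are invented names. A transaction may touch many data items inside one scope atomically, but nothing spans scopes:

```python
# Toy model: each scope of transactional serializability is one store.
# Updates within a scope commit all-or-nothing; there is no primitive
# that commits atomically across two scopes.

class Scope:
    """One scope of transactional serializability (one machine or cluster)."""
    def __init__(self, name: str):
        self.name = name
        self.data = {}

    def transact(self, updates: dict) -> None:
        # Stage everything, then commit with a single atomic swap.
        staged = dict(self.data)
        staged.update(updates)
        self.data = staged  # commit point: all updates appear together

def cross_scope_transact(scopes, updates_by_scope):
    # Deliberately unimplementable in this model.
    raise RuntimeError("no atomic transactions across disjoint scopes")

a, b = Scope("machine-A"), Scope("machine-B")
a.transact({"acct-1": 90, "acct-2": 110})  # atomic: both rows or neither
```

The missing `cross_scope_transact` is the whole point: any workflow touching both `a` and `b` has to be decomposed into per-scope transactions stitched together with messages.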
Most Applications Use “At-Least-Once” Messaging
TCP/IP is great if you are a short-lived Unix-style process. But let’s consider the dilemma faced by an application developer whose job is to process a message and modify some data durably represented on disk (either in a SQL database or some other durable store). The message is consumed but not yet acknowledged. The database is updated and then the message is acknowledged. In a failure, this is restarted and the message is processed again.

The dilemma derives from the fact that the message delivery is not directly coupled to the update of the durable data other than through application action. While it is possible to couple the consumption of messages to the update of the durable data, this is not commonly available. The absence of this coupling leads to failure windows in which the message is delivered more than once. The messaging plumbing does this because its only other recourse is to occasionally lose messages (“at-most-once” messaging), and that is even more onerous to deal with.

A consequence of this behavior from the messaging plumbing is that the application must tolerate message retries and the out-of-order arrival of some messages. This paper considers the application patterns arising when business-logic programmers must deal with this burden in almost-infinitely large applications.

Footnote: I am deliberately conflating strict serializability and the weaker locking modes. The issue is the scope of the data participating in the transactions visible to the application.
Footnote: This is not intended to preclude a small collection of computers functioning in a cluster to behave as if they are one machine. This IS intended to formally state that we assume many computers and the likelihood that we must consider work which cannot be atomically committed.
Footnote: This is excluding replication for high availability, which will not change the presumption of disjoint scopes of transactional serializability.
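One standard way an application tolerates the retries described above is to make message handling idempotent: record processed message ids in the same durable store as the data, so a redelivery becomes a no-op. This is a toy sketch under my own assumptions (the names `DurableStore` and `handle` are invented), not a prescription from the paper:

```python
# Toy sketch of an idempotent consumer for at-least-once delivery: the
# processed-id set lives in the SAME store as the application data, so in a
# real system both updates would commit in one local transaction.

class DurableStore:
    """Stand-in for a durable store holding app data plus processed ids."""
    def __init__(self):
        self.balance = 0
        self.processed_ids = set()

def handle(store: DurableStore, msg_id: str, amount: int) -> None:
    if msg_id in store.processed_ids:
        return  # duplicate delivery: already applied, just re-acknowledge
    # These two updates must commit atomically within the single scope.
    store.balance += amount
    store.processed_ids.add(msg_id)

store = DurableStore()
handle(store, "msg-1", 50)
handle(store, "msg-1", 50)  # at-least-once redelivery after a failure
```

The key design choice is that deduplication state and business state share one scope of serializability, so a crash between the two updates cannot leave them disagreeing.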
Opinions to Be Justified
The nice thing about writing a position paper is that you can express wild opinions. Here are a few that we will be arguing in the corpus of this position paper.
Scalable Apps Use Uniquely Identified “Entities”
This paper will argue that the upper layer code for each application must manipulate a single collection of data we are calling an entity. There are no restrictions on the size of an entity except that it must live within a single scope of serializability (i.e. one machine or cluster).

Each entity has a unique identifier or key. An entity-key may be of any shape, form, or flavor, but it somehow uniquely identifies exactly one entity and the data contained within that entity.

There are no constraints on the representation of the entity. It may be stored as SQL records, XML documents, files, data contained within file systems, as blobs, or anything else that is convenient or appropriate for the app’s needs. One possible representation is as a collection of SQL records (potentially across many tables) whose primary key begins with the entity-key.

Footnote: I am a big fan of “exactly-once in-order” messaging, but to provide it for durable data requires a long-lived programmatic abstraction similar to a TCP connection. The assertion here is that these facilities are rarely available to the programmer building scalable applications. Hence, we are considering cases dealing with “at-least-once”.
Footnote: Note that these topics will be discussed in more detail.
[Figure: Data for an app comprises many different entities, each with its own key, e.g. entities with keys “UNW”, “ABC”, “WBP”, and “QLA”.]
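The SQL representation mentioned above can be sketched with `sqlite3` from the standard library. The table name, column names, and sample keys here are my own illustrative choices: each record’s primary key begins with the entity-key, so all of one entity’s records can be fetched together and kept within one scope of serializability:

```python
# Sketch: an entity as a collection of SQL records whose composite primary
# key BEGINS with the entity-key, so one entity's data stays together.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE order_lines (
        entity_key TEXT    NOT NULL,  -- identifies exactly one entity
        line_no    INTEGER NOT NULL,
        item       TEXT    NOT NULL,
        PRIMARY KEY (entity_key, line_no)
    )""")
conn.executemany(
    "INSERT INTO order_lines VALUES (?, ?, ?)",
    [("ABC", 1, "book"), ("ABC", 2, "lamp"), ("QLA", 1, "pen")])

def load_entity(entity_key: str):
    """Fetch every record belonging to one entity by its key prefix."""
    return conn.execute(
        "SELECT line_no, item FROM order_lines "
        "WHERE entity_key = ? ORDER BY line_no",
        (entity_key,)).fetchall()
```

Because every row of entity “ABC” shares the leading key, a repartitioning lower layer can move all of an entity’s rows as one unit without consulting the business logic.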
