Session 5 and 6, November 2013

Anteneh Tesfaye
antenehta@gmail.com
Make copies of data or services on multiple machines.
WHY REPLICATION?
Reliability
 Redundancy
Performance
 Reduce Communication
 Increase Processing Capacity
Scalability
 Geographic
 Size

Department of Computer Science


REPLICATION ISSUES
 Performance and Scalability
 Keeping copies of data close to the processes
using them
 Updates
• Consistency (how to deal with updated data)
• Update propagation
 Replica placement
• How many replicas?
• Where to put them?
 Redirection/Routing
 Which replica should clients use?
November 11, 2013
Department of
Computer Science
Distributed Data Store

 Replica manager process
Receives operation invocation requests from
clients and executes the operations locally
on its copy of the data
Communicates with the replica
managers running on the other replica servers
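
The replica manager's two duties above can be sketched as follows. This is a minimal illustration, not a real system: the class name ReplicaManager and its methods are hypothetical, and direct method calls stand in for network messages between replica servers.

```python
class ReplicaManager:
    """Hypothetical replica manager holding one local copy of the data."""

    def __init__(self, name):
        self.name = name
        self.store = {}   # local copy of the data
        self.peers = []   # replica managers on the other replica servers

    def connect(self, *others):
        # Register the replica managers we propagate updates to.
        self.peers.extend(others)

    def read(self, key):
        # Reads are served from the local copy only.
        return self.store.get(key)

    def write(self, key, value):
        # Execute the write locally, then propagate it to the peers.
        self.store[key] = value
        for peer in self.peers:
            peer.apply_remote(key, value)

    def apply_remote(self, key, value):
        # Update received from another replica manager; not re-propagated.
        self.store[key] = value


a, b = ReplicaManager("A"), ReplicaManager("B")
a.connect(b)
b.connect(a)
a.write("x", 1)
print(b.read("x"))  # 1
```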

Timeline of two clients accessing a distributed data-store

Time flows to the right: Absolute global time + Position…


Time of issue, time of execution, and time of completion
Arrows show time of execution on remote replicas
Read: locally
Write: locally + propagated to remote . . .
INCONSISTENCY
 Staleness:
 How old is the data?
 How old is the data allowed to be?
🞄 Time
🞄 Versions
 Operation order:
 Were operations performed in the right order?
 What orderings are allowed?
 Conflicting Data:
 Do replicas have exactly the same data?
 What differences are permitted?
Consistency Models
 Concerned with consistency of a data store.
 Specifies characteristics of valid
total orderings
1. Data Centric Consistency Model
2. Client Centric Consistency Model

Data Centric Consistency Model
A contract, between a distributed data store
and clients, in which the data store specifies
precisely what the results of read and write
operations are in the presence of
concurrency
 Described consistency is experienced by all clients
 Multiple clients accessing the same data store
 Client A, Client B, Client C see same kinds of orderings
 Non-mobile clients (replica used doesn’t change)
 Two approaches to achieve data-centric
consistency:
1. Continuous consistency
 Goal: to impose limits on deviations between
replicas
2. Ordering-based consistency
 Data consistency is defined in terms of the
ordering of reads and writes

1. Continuous consistency
 Goal: to impose limits on deviations
between replicas
 The degree of consistency is described by how far
replicas may deviate from one another:
🞄 in their numerical value
🞄 in their relative staleness
🞄 in the number and order of performed
update operations
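
As a rough sketch of the three kinds of deviation listed above, the check below tests one replica against a bound on each. The names Bounds, ReplicaState and within_bounds, and all the numbers, are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Bounds:
    max_numeric: float    # allowed difference in numerical value
    max_staleness: float  # allowed age of the last refresh, in seconds
    max_pending: int      # allowed number of updates not yet applied


@dataclass
class ReplicaState:
    value: float
    last_refresh: float   # time of the last sync with the reference copy
    pending_updates: int  # updates known about but not yet applied


def within_bounds(replica, reference_value, now, bounds):
    # A replica is acceptable only if all three deviations are bounded.
    numeric_ok = abs(replica.value - reference_value) <= bounds.max_numeric
    staleness_ok = (now - replica.last_refresh) <= bounds.max_staleness
    order_ok = replica.pending_updates <= bounds.max_pending
    return numeric_ok and staleness_ok and order_ok


bounds = Bounds(max_numeric=0.5, max_staleness=10.0, max_pending=3)
fresh = ReplicaState(value=100.2, last_refresh=95.0, pending_updates=1)
stale = ReplicaState(value=100.0, last_refresh=80.0, pending_updates=0)
print(within_bounds(fresh, 100.0, 100.0, bounds))  # True
print(within_bounds(stale, 100.0, 100.0, bounds))  # False
```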
2. Ordering-based consistency
 Various consistency models have been proposed.
 Strict consistency
🞄 Absolute time-based.
 Sequential consistency
🞄 All processes see the same order of operations.
 Causal consistency
🞄 All processes see causally related operations in
the same order.
NB: Events are causally related if:
 A read is followed by a write in the same client
 A write of a particular data item is followed by a read of that data
item in any client
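
The two rules in the NB above can be written as a small predicate. This is only a sketch: the Op record and the function name are hypothetical, "first" is assumed to happen before "second", and the read is assumed to return the value that the write produced.

```python
from collections import namedtuple

# Hypothetical operation record: who did it, read or write, on which item.
Op = namedtuple("Op", "client kind item")


def directly_causally_related(first, second):
    """Apply the two rules to 'first', which happens before 'second'."""
    # Rule 1: a read followed by a write in the same client.
    read_then_write = (
        first.kind == "R" and second.kind == "W"
        and first.client == second.client
    )
    # Rule 2: a write of an item followed by a read of it in any client.
    write_then_read = (
        first.kind == "W" and second.kind == "R"
        and first.item == second.item
    )
    return read_then_write or write_then_read


w1 = Op("P1", "W", "x")
r2 = Op("P2", "R", "x")
w2 = Op("P2", "W", "y")
w3 = Op("P3", "W", "x")
print(directly_causally_related(w1, r2))  # True
print(directly_causally_related(r2, w2))  # True
print(directly_causally_related(w1, w3))  # False
```

Two writes by different clients with no read in between, such as w1 and w3, are concurrent, so a causally consistent store may show them to different processes in different orders.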
. . . Ordering-based consistency
Synchronization-based consistency
 Weak consistency
🞄 Consistency is enforced only after synchronization is done.
 Entry consistency
🞄 Consistency is enforced when a critical region is entered.

CLIENT-CENTRIC CONSISTENCY MODELS
 Provides guarantees about ordering of operations for a single
client
 Single client accessing data store
 Client accesses different replicas (modified data store model)
 Data isn’t shared by clients
 Client A, Client B, Client C may see different kinds of
orderings

Client-Centric Consistency Models
 Monotonic Reads
 Monotonic Writes
 Read Your Writes
 Writes Follow Reads

Monotonic Reads
If a client has seen a value of x at a time t, it
will never see an older version of x at a later time

Figure panels: Monotonic Read; Not Monotonic Read

 Each time you connect to a different e-mail server, that
server fetches (at least) all the updates from the server
you previously visited.
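
One way to picture the e-mail example is a client session that remembers the replica it last read from. The sketch below is an assumption-laden illustration: Replica, MonotonicReadSession and the single version counter are hypothetical, and a real store would track per-item versions.

```python
class Replica:
    def __init__(self):
        self.version = 0
        self.value = None

    def sync_from(self, other):
        # Fetch the other replica's state if it is newer than ours.
        if other.version > self.version:
            self.version, self.value = other.version, other.value


class MonotonicReadSession:
    def __init__(self):
        self.last_replica = None  # replica used by the previous read

    def read(self, replica):
        # Bring the new replica up to date with the one this client used
        # last, so the read never returns an older version.
        if self.last_replica is not None:
            replica.sync_from(self.last_replica)
        self.last_replica = replica
        return replica.value


s1, s2 = Replica(), Replica()
s1.version, s1.value = 2, "mail v2"
s2.version, s2.value = 1, "mail v1"
session = MonotonicReadSession()
first = session.read(s1)
second = session.read(s2)
print(first, "|", second)  # mail v2 | mail v2
```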
Monotonic Writes
A write operation on data item x is completed before any
successive write on x by the same client
 All writes by a single client are sequentially ordered.

Figure panels: Monotonic Write; Not Monotonic Write

 Updating a program at server S2, and ensuring that all
components on which compilation and linking depend are
also placed at S2.
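
A minimal sketch of monotonic writes, assuming the client ships its own write history along with each new write so that a replica can catch up first. The classes Replica and Client and this catch-up mechanism are hypothetical; real systems usually fetch the missing writes from another replica.

```python
class Replica:
    def __init__(self):
        self.log = []      # values applied for x, in order
        self.applied = {}  # client id -> count of that client's writes applied

    def apply(self, client_id, seq, value, history):
        # First apply any earlier writes by this client that are missing here.
        done = self.applied.get(client_id, 0)
        for missing in history[done:seq]:
            self.log.append(missing)
        self.log.append(value)
        self.applied[client_id] = seq + 1


class Client:
    def __init__(self, client_id):
        self.client_id = client_id
        self.history = []  # this client's writes to x, in issue order

    def write(self, replica, value):
        seq = len(self.history)
        replica.apply(self.client_id, seq, value, self.history)
        self.history.append(value)


s1, s2 = Replica(), Replica()
c = Client("c1")
c.write(s1, "v1")
c.write(s2, "v2")  # s2 never saw v1, so it is applied before v2
print(s2.log)  # ['v1', 'v2']
```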
Read Your Writes
 The effect of a write on x will always be seen by a successive
read of x by the same client

Figure panels: Consistent; Inconsistent

 Updating your Web page and guaranteeing that your Web
browser shows the newest version instead of its cached copy.
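
The Web page example can be sketched with a session that remembers where its latest write went. Everything here is an assumption for illustration: the Replica and Session classes are hypothetical, and the sketch assumes a single writer so that one version counter is enough.

```python
class Replica:
    def __init__(self):
        self.version = 0
        self.value = None


class Session:
    def __init__(self):
        self.write_version = 0  # version produced by our latest write
        self.written_at = None  # replica that holds that write

    def write(self, replica, value):
        replica.version += 1
        replica.value = value
        self.write_version = replica.version
        self.written_at = replica

    def read(self, replica):
        # If this replica has not yet seen our latest write, pull it first.
        if replica.version < self.write_version:
            replica.version = self.written_at.version
            replica.value = self.written_at.value
        return replica.value


origin, cache = Replica(), Replica()
cache.value = "old page"
session = Session()
session.write(origin, "new page")
page = session.read(cache)
print(page)  # new page
```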

Writes Follow Reads
A write operation on x will be performed on a copy of x that is
up to date with the value most recently read by the same client

Figure panels: Consistent; Inconsistent

 See reactions to posted articles only if you have the original
posting
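
For the newsgroup example, a session can carry the set of postings it has read and make sure they exist wherever it writes next. The sketch below is hypothetical (class names and the read-set mechanism are assumptions) and ignores concurrent clients.

```python
class Replica:
    def __init__(self):
        self.posts = []  # ordered list of postings and reactions


class Session:
    def __init__(self):
        self.read_set = []  # postings this client has read so far

    def read(self, replica):
        self.read_set = list(replica.posts)
        return self.read_set

    def write(self, replica, reaction):
        # Make sure everything we have read is present before writing,
        # so the reaction never appears without the original posting.
        for post in self.read_set:
            if post not in replica.posts:
                replica.posts.append(post)
        replica.posts.append(reaction)


s1, s2 = Replica(), Replica()
s1.posts.append("article")
session = Session()
session.read(s1)
session.write(s2, "reaction")
print(s2.posts)  # ['article', 'reaction']
```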

CHOOSING THE RIGHT MODEL
 Tradeoffs
 Consistency and Redundancy:
• All copies must be strongly consistent
• All copies must contain full state
• Reduced consistency → Reduced reliability
 Consistency and Scalability:
• Implementation of consistency must be
scalable
🞄 don’t take a centralized approach
🞄 avoid too much extra communication
CHOOSING THE RIGHT MODEL
 Tradeoffs
 Consistency and Performance:
• Consistency requires extra work
• Consistency requires extra communication
• Can result in loss of overall performance

CONSISTENCY PROTOCOLS
 Implementation of a consistency model
 Primary-Based Protocols:
 Remote-write protocols
 Local-write protocols
 Replicated-Write Protocols:
 Active Replication
 Quorum-Based Protocols
REMOTE-WRITE PROTOCOLS
 Single Server:
 All writes and reads executed at single server
 No replication of data

 Primary-Backup:
 All writes executed at a single server (the primary); reads
are local
 Updates block until executed on all backups

Performance
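
A minimal sketch of the primary-backup (remote-write) scheme, assuming synchronous method calls in place of network messages and acknowledgements. The class names Backup and Primary are hypothetical. The blocking loop in write is where the performance cost comes from.

```python
class Backup:
    def __init__(self):
        self.store = {}

    def update(self, key, value):
        self.store[key] = value
        return True  # acknowledgement back to the primary

    def read(self, key):
        return self.store.get(key)  # reads are served locally


class Primary(Backup):
    def __init__(self, backups):
        super().__init__()
        self.backups = backups

    def write(self, key, value):
        # Apply locally, then block until every backup has acknowledged.
        self.store[key] = value
        acks = [backup.update(key, value) for backup in self.backups]
        return all(acks)


backups = [Backup(), Backup()]
primary = Primary(backups)
ok = primary.write("x", 7)
print(ok, backups[1].read("x"))  # True 7
```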

LOCAL-WRITE PROTOCOLS
 Migration:
 Data item migrated to local server on
access
 Distributed, non-replicated, data store
 Single Copy Migration
🞄 A type of local-write protocol in which a
single copy is migrated between processes
(fully distributed non-replicated version of
the data store).
 Primary Migration
🞄 A local-write variant of the primary-backup
protocol in which the primary copy migrates to
the process that wishes to perform a write
operation.

ACTIVE REPLICATION
 Updates (write operations) are sent to
all replicas; reads are local
 Needs totally ordered multicast
 Needs a sequencer/coordinator
to add sequence numbers
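
The role of the sequencer can be sketched as follows: every update gets a sequence number, and each replica applies updates strictly in that order even when messages arrive out of order. The Sequencer and Replica classes are hypothetical, and updates are modelled as plain functions on a single value.

```python
import itertools


class Sequencer:
    def __init__(self):
        self._counter = itertools.count()

    def stamp(self, op):
        # Attach the next global sequence number to the operation.
        return (next(self._counter), op)


class Replica:
    def __init__(self):
        self.value = 0
        self.next_seq = 0
        self.pending = {}  # seq -> operation that arrived too early

    def deliver(self, seq, op):
        self.pending[seq] = op
        # Apply operations strictly in sequence-number order.
        while self.next_seq in self.pending:
            self.value = self.pending.pop(self.next_seq)(self.value)
            self.next_seq += 1


sequencer = Sequencer()
m1 = sequencer.stamp(lambda v: v + 10)
m2 = sequencer.stamp(lambda v: v * 2)

r1, r2 = Replica(), Replica()
r1.deliver(*m1)
r1.deliver(*m2)
r2.deliver(*m2)  # arrives out of order and is held back
r2.deliver(*m1)
print(r1.value, r2.value)  # 20 20
```

Without the ordering, r2 would compute (0 * 2) + 10 = 10 and the replicas would diverge.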

QUORUM-BASED PROTOCOLS
 Voting
 Read Quorum: Nr
 Write Quorum: Nw
 Nr + Nw > N Why?
 Nw > N/2 Why?
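
The two "Why?" questions can be checked mechanically. Nr + Nw > N forces every read quorum to overlap the most recent write quorum, so a read always reaches at least one up-to-date replica. Nw > N/2 forces any two write quorums to overlap, so two writes cannot proceed on disjoint sets. The function names and the N = 12 configurations below are illustrative assumptions.

```python
def check_quorum(n, nr, nw):
    """Return which of the two voting constraints hold for n replicas."""
    return {
        "no_read_write_conflict": nr + nw > n,  # read and write sets overlap
        "no_write_write_conflict": nw > n / 2,  # any two write sets overlap
    }


def is_valid_quorum(n, nr, nw):
    return all(check_quorum(n, nr, nw).values())


print(is_valid_quorum(12, 3, 10))  # True
print(is_valid_quorum(12, 7, 6))   # False: two writes of 6 can be disjoint
print(is_valid_quorum(12, 1, 12))  # True: read one, write all (ROWA)
```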

Figure panels:
 A correct choice of read and write set
 A choice that may lead to write-write conflicts
 A correct choice, known as ROWA (read one, write all)
REPLICA PLACEMENT

 Permanent Replicas:
 Initial set of replicas
 Created and maintained by data-store owner(s)
 Allow writes
 Server-Initiated Replicas:
 Enhance performance
 Not maintained by owner
 Placed close to groups of clients
🞄 Dynamically

 Client-Initiated Replicas:
 Client caches
 Temporary
 Owner not aware of replica
 Placed close to client
 Maintained by host (often client)

DYNAMIC REPLICATION
 Situation changes over time
🞄 Number of users, Amount of data
🞄 Bursty changes: Flash crowds
🞄 R/W ratio
 Dynamic replica placement
