Chapters
Chapter 12: Distributed DBMS Reliability
Chapter 14: Distributed Object Database
Management Systems
Chapter 16: Current Issues
Preethi Vishwanath
[Figure: sources of system failure (Tandem data), by category: environment, hardware, software, maintenance, operations, unknown]
Fault tolerance
– Refers to a system design approach which recognizes that faults will occur
Fault prevention/Fault intolerance
– Aim at ensuring that the implemented system will not contain any faults
– Two aspects
Fault avoidance
– Refers to the techniques used to make sure that faults are not introduced into the system
– Involves detailed design methodologies such as design walkthroughs, design inspections, etc.
Fault removal
– Refers to the techniques that are employed to detect any faults that might have remained in the system despite the
application of fault avoidance, and to remove them.
Fault detection
– Issue a warning when a failure occurs but do not provide any means of tolerating the failure.
Latent Failure
– One that is detected some time after its occurrence
Mean time to detect
– Average error latency time over a number of identical systems.
Fail-stop modules
– Constantly monitors itself and when it detects a fault, shuts itself down automatically
Fail-fast
– Implemented in software by defensive programming, where each software module checks
its own state during state transitions.
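A minimal sketch of the fail-fast/fail-stop idea described above, using a hypothetical counter module (the class and its invariant are illustrative, not from the source): the module checks its own state on every transition and shuts itself down as soon as the check fails.

```python
class FailStopError(RuntimeError):
    """Raised when a module detects an internal fault and halts itself."""

class FailFastCounter:
    """Hypothetical fail-fast module: validates its own state on every
    state transition and stops itself when the invariant is violated."""

    def __init__(self):
        self.value = 0
        self.stopped = False

    def _check(self):
        # Defensive programming: the module monitors itself after each
        # transition and fails fast instead of continuing in a bad state.
        if self.value < 0:
            self.stopped = True
            raise FailStopError("invariant violated: value < 0")

    def add(self, delta):
        if self.stopped:
            raise FailStopError("module has shut itself down")
        self.value += delta
        self._check()
        return self.value
```

Once stopped, the module refuses all further operations, which is what makes the failure visible and containable rather than latent.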
[Figure: local recovery architecture, in which the Local Recovery Manager and the Database Buffer Manager sit between the stable database and the database buffers (volatile database), exchanging recovery information]
In-Place Update Recovery Information
– Necessary to store information about database state changes, in order to recover.
– Recorded in the database log.
– REDO Action
The log needs to include sufficient data to permit the redo, taking the old database state and recovering the new state.
– UNDO Action
The log needs to include sufficient data to permit the undo, taking the new database state and recovering the old state.
Out-of-Place Update Recovery Information
– Typical techniques
Shadowing
Every time an update is made, the old stable storage page, called the shadow page, is left intact and a new page with the updated data item values is written into the stable database.
Differential files
Network Partitioning
– Simple partition
Network is divided into only two components.
– Multiple partitioning
Network is divided into more than two components.
Centralized Protocols
– Primary site
It makes sense to permit the operation of the partition that contains the primary site, since it manages the locks.
– Primary copy
More than one partition may be operational for different queries.
Voting-based Protocols
– Transactions are executed if a majority of the sites vote to execute them.
– Quorum-based voting can be used as a replica control method, as well as a commit method to ensure transaction atomicity in the presence of network partitioning.
– In the case of non-replicated databases, this involves the integration of the voting principle with commit protocols.
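The voting conditions above can be sketched as two small checks (a minimal sketch; the function names are mine, and the quorum inequalities follow the standard weighted-voting conditions rather than anything stated in the slides):

```python
def quorums_valid(n, r, w):
    """Standard quorum conditions over n replica votes:
    r + w > n  ensures a read quorum overlaps every write quorum,
    2 * w > n  ensures two write quorums always overlap."""
    return r + w > n and 2 * w > n

def can_execute(votes_yes, total_sites):
    """Majority voting: a transaction runs only if a strict majority
    of the sites vote to execute it, even under network partitioning."""
    return votes_yes > total_sites // 2
```

Because only one partition can hold a majority (or a write quorum), at most one side of a partition can commit updates.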
2 Phase Commit Protocol
The two phase commit protocol is a distributed algorithm which lets all
sites in a distributed system agree to commit a transaction.
The protocol results in either all nodes committing the transaction or
aborting, even in the case of site failures and message losses.
Basic Algorithm
– Commit-request (voting) phase
– Commit phase
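The coordinator side of the two-phase commit protocol can be sketched as follows (a minimal sketch; the participant interface with `vote()`, `commit()`, and `abort()` methods is a hypothetical stand-in for the real messaging, and failure handling is omitted):

```python
class LocalParticipant:
    """Hypothetical in-process participant used to illustrate 2PC."""
    def __init__(self, will_commit):
        self.will_commit = will_commit
        self.outcome = None
    def vote(self):            # phase 1: vote yes/no on the transaction
        return self.will_commit
    def commit(self):          # phase 2: apply the global decision
        self.outcome = "commit"
    def abort(self):
        self.outcome = "abort"

def two_phase_commit(participants):
    """Coordinator sketch: all sites commit only if every site votes yes;
    otherwise every site aborts, keeping the outcome uniform."""
    votes = [p.vote() for p in participants]     # commit-request phase
    if all(votes):
        for p in participants:                   # commit phase: global commit
            p.commit()
        return "commit"
    for p in participants:                       # commit phase: global abort
        p.abort()
    return "abort"
```

A single "no" vote is enough to abort everywhere, which is how the protocol guarantees that all nodes either commit or abort together.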
Advantages
– With careful design, it is possible to ensure that single points of
failure are eliminated
– Overall system availability is maintained even when one or more
sites fail.
Disadvantages
– Whenever updates are introduced, the complexity of keeping
replicas consistent arises and this is the topic of replication
protocols.
Concepts
Object
– Represents a real entity in the system.
– Represented as a pair (object identity, state).
– Enables referential object sharing.
Type
– Template for all objects of that type.
State
– Either an atomic value or a constructed value.
Value
– An element of D is a value, called an atomic value.
– [a1:v1,…,an:vn], in which ai is an element of A and vi is either a value or an element of I, is called a tuple value.
– {v1,…,vn}, in which vi is either a value or an element of I, is called a set value.
Class
– Grouping of common objects.
– Template for all common objects.
Inheritance
– Declaring a type to be a subtype of another.
Abstract Data Types
– Describe a type of data by providing a domain of data with the same structure, as well as operations applicable to the objects of that domain.
– This abstraction capability is commonly referred to as encapsulation.
Composition (Aggregation)
– Restriction on composite objects results in complex objects.
– The composite object relationship between types can be represented by a composition graph.
Collection
– User-defined grouping of objects.
– Similar to a class in that it groups objects.
Subtyping
– Based on the specialization relationship among types.
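Encapsulation and subtyping as listed above can be illustrated with a short example (a minimal sketch; the `Account` type and its operations are invented for illustration):

```python
class Account:
    """Abstract data type: the state is encapsulated, and clients
    interact only through the operations the type provides."""
    def __init__(self, balance=0):
        self._balance = balance          # encapsulated state
    def deposit(self, amount):
        self._balance += amount
    def balance(self):
        return self._balance

class SavingsAccount(Account):
    """Subtype: specializes Account by adding an operation, while
    remaining usable wherever an Account is expected."""
    def add_interest(self, rate):
        self._balance += self._balance * rate
```

Every `SavingsAccount` is also an `Account`, which is exactly the specialization relationship subtyping is based on.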
Object Distribution Design
Path partitioning
– A concept describing the clustering of all the objects forming a composite object into a partition.
– Can be represented as a hierarchy of nodes forming a structural index.
– The index contains references to all the component objects of a composite object, eliminating the need to traverse the class composition hierarchy.
Class Partitioning Algorithms
– The main issue is to improve the performance of user queries and applications by reducing irrelevant data access.
– Affinity-based approach
Affinity among instance variables and methods, and affinity among multiple methods, can be used for horizontal and vertical class partitioning.
– Cost-driven approach
Allocation
– Local behavior-local object
The behavior, the object to which it is applied, and the arguments are all co-located.
No special mechanism is needed to handle this case.
– Local behavior-remote object
The behavior and the object to which it is applied are not co-located.
Two ways to deal with this:
– Move the remote object to the site where the behavior is located.
– Ship the behavior implementation to the site where the object is located.
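The two allocation options for a local behavior applied to a remote object can be sketched as follows (a minimal sketch; the `Site` container and the two functions are hypothetical, standing in for real object migration and code shipping):

```python
class Site:
    """Hypothetical site holding objects by identifier."""
    def __init__(self, name):
        self.name = name
        self.objects = {}

def move_object(oid, src, dst):
    """Option 1: migrate the remote object to the site where the
    behavior is located, so the invocation becomes local."""
    dst.objects[oid] = src.objects.pop(oid)

def ship_behavior(behavior, site, oid):
    """Option 2: send the behavior implementation to the object's
    site, run it there, and return only the result."""
    return behavior(site.objects[oid])
```

Which option is cheaper depends on the relative sizes of the object and the behavior's code, and on how often the object will be used at each site.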
Client-Server Architecture
[Figure: client-server object database architecture]
Cache Consistency
Problem in any data shipping system that moves data to the clients.
Cache consistency algorithms
– Avoidance-based synchronous algorithms
Clients retain read locks across transactions, but they relinquish write locks at the end
of the transaction.
Clients send lock requests to the server and block until the server responds.
If a client requests a write lock on a page that is cached at other clients, the server contacts those clients before granting it.
– Avoidance-based asynchronous algorithms
Do not have the message blocking overhead present in synchronous algorithms.
Clients send lock escalation messages to the server and continue application
processing
– Avoidance-based deferred algorithms
Clients batch their lock escalation requests and send them to the server at commit time.
The server blocks the updating client if other clients are reading the updated objects.
– Detection-based synchronous algorithms
Clients contact the server whenever they access a page in their cache to ensure that
the page is not stale or being written to by other clients.
– Detection-based asynchronous algorithms
Clients send lock escalation requests to the server, but optimistically assume that their
requests will be successful.
After a client transaction commits, the server propagates the updated pages to all the
other clients that have also cached the affected pages.
– Detection-based deferred algorithms
Can outperform callback locking algorithms even while encountering a higher abort rate
if the client transaction state completely fits into the client cache, and all application
processing is strictly performed at the clients.
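The detection-based synchronous case can be sketched with a version check on every cache access (a minimal sketch; the version-numbered page store and the `Server`/`Client` interfaces are invented for illustration, not the algorithms' actual protocol messages):

```python
class Server:
    """Hypothetical server tracking a version number per page."""
    def __init__(self):
        self.pages = {}                      # page id -> (version, data)
    def read(self, pid):
        return self.pages[pid]
    def is_current(self, pid, version):
        return self.pages[pid][0] == version

class Client:
    """Detection-based synchronous client: validates a cached page with
    the server on every access, refetching when the page is stale."""
    def __init__(self, server):
        self.server = server
        self.cache = {}
    def access(self, pid):
        if pid in self.cache and self.server.is_current(pid, self.cache[pid][0]):
            return self.cache[pid][1]            # cache hit, still current
        self.cache[pid] = self.server.read(pid)  # stale or missing: refetch
        return self.cache[pid][1]
```

The per-access round trip is the cost that the asynchronous and deferred variants try to hide or batch away.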
Object Identifier Management
Object Identifiers are system generated
Used to uniquely identify every object
Transient object identity can be implemented more
efficiently
Two common solutions
– Physical Identifier approach (POID)
Equates the OID with the physical address of the corresponding
object
Advantage: the object can be obtained directly from the OID.
Drawback: all the parent objects and indexes must be updated
whenever an object is moved to a different page.
– Logical Identifier approach (LOID)
Consists of allocating a system-wide unique OID.
Since OIDs are invariant, there is no overhead due to object
movement.
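The LOID approach amounts to one level of indirection, which can be sketched as follows (a minimal sketch; the directory class, its counter-based OID generation, and the page/slot address tuples are my assumptions):

```python
class LOIDDirectory:
    """Logical OID scheme: a system-wide map from logical OID to the
    object's current physical address. Moving an object updates only
    this map, so references held by parent objects stay valid."""
    def __init__(self):
        self._next = 0
        self._map = {}                 # LOID -> physical address
    def register(self, physical_addr):
        self._next += 1                # system-generated, unique LOID
        self._map[self._next] = physical_addr
        return self._next
    def locate(self, loid):
        return self._map[loid]
    def move(self, loid, new_addr):
        self._map[loid] = new_addr     # no parent or index needs updating
```

The extra lookup per access is the price paid for making OIDs invariant under object movement, which is exactly the POID drawback this scheme removes.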
Object Migration
Three alternatives can be considered for the migration of classes (types):
– The source code is moved and recompiled at the destination.
– The compiled version of a class is migrated just like any other object, or
– The source code of the class definition is moved, but not its compiled operations, for which a lazy migration strategy is used.
Objects can be in one of the following states:
– Active
Active objects are currently involved in an activity in response to an invocation or a message.
– Waiting
Waiting objects have invoked another object and are waiting for a response.
– Suspended
Suspended objects are temporarily unavailable for invocation.
Migration involves two steps.
[Figure: web data integration architecture, in which a global web server with a data dictionary accesses multiple data sources through wrappers]
Problems with the pull-based approach
– Users need to know a priori where and when to look for data.
– Mismatch between the asymmetric nature of some applications and the symmetric communications infrastructure of applications such as the Internet.
– Types of asymmetry
Network asymmetry: network bandwidth from client to server differs from that from server to client.
Distributed information systems: due to the imbalance between the number of clients and the number of servers.
Data: the amount of data being transferred between client and server.
Data volatility.
Why push-based technologies?
– A response to some of the problems inherent in pull-based systems.
Algorithm: push-based approach
1. Order the data items from hottest to coldest.
2. Partition the data items into ranges of items, such that the items in each range have similar application access profiles. The number of ranges is denoted by num_ranges.
3. Choose the relative broadcast frequency for each range as integers (rel_freq_i, where i is the range).
4. Divide each range into smaller elements, called chunks (C_ij is the j-th chunk of range i). The number of chunks into which range i is divided is num_chunks_i = max_chunks / rel_freq_i, where max_chunks is the least common multiple of the rel_freq_i over all i.
5. Create the broadcast schedule by interleaving the chunks of each range using the following procedure:
for i from 0 to max_chunks-1 do
  for j from 1 to num_ranges do
    Broadcast chunk C_{j, (i mod num_chunks_j)}
  end-for
end-for
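The scheduling steps above can be sketched in Python (a minimal sketch: the list-of-ranges input, the even chunk split, and the use of `math.lcm`, available since Python 3.9, are my assumptions; the interleaving itself follows the pseudocode):

```python
from math import lcm

def broadcast_schedule(ranges, rel_freq):
    """Build a push-based broadcast schedule.
    `ranges` lists the item groups, already ordered hottest to coldest;
    `rel_freq[i]` is the relative broadcast frequency of range i."""
    max_chunks = lcm(*rel_freq)          # LCM of all relative frequencies
    chunks = []
    for items, freq in zip(ranges, rel_freq):
        n = max_chunks // freq           # num_chunks_i = max_chunks / rel_freq_i
        size = -(-len(items) // n)       # ceiling division: items per chunk
        chunks.append([items[k * size:(k + 1) * size] for k in range(n)])
    schedule = []
    for i in range(max_chunks):          # interleave chunks of each range
        for j in range(len(ranges)):
            schedule.append(chunks[j][i % len(chunks[j])])
    return schedule
```

Because a hot range is split into fewer chunks, each of its chunks recurs more often in the cycle, which is what gives hot items their higher broadcast frequency.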
Difference between pull-based and push-based systems
– Cache replacement policies
– Prefetching mechanism
PIX algorithm: calculates the "cost" of replacing a page and replaces the
least costly one.
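A minimal sketch of that replacement rule, assuming each cached page carries an estimated access probability and broadcast frequency (the cache representation and the probability-to-frequency ratio used as the cost are my assumptions):

```python
def pix_victim(cache):
    """PIX-style replacement sketch: score each cached page by its
    access probability divided by its broadcast frequency, and evict
    the page with the lowest score. Such a page is cheap to lose:
    it is either rarely accessed or rebroadcast soon anyway.
    `cache` maps page id -> (access_prob, broadcast_freq)."""
    return min(cache, key=lambda p: cache[p][0] / cache[p][1])
```

This is why, in a push-based system, a frequently broadcast page can be a better eviction candidate than a colder page that is broadcast only rarely.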