
II B.Tech – II Sem Database Management Systems Dept. of AI

DBMS – Unit 4: Concurrency Control & Recovery System


4.1 Lock-Based Protocols

In a multiprogramming environment where multiple transactions can be executed simultaneously, it is highly important to control the concurrency of transactions. We have concurrency control protocols to ensure atomicity, isolation, and serializability of concurrent transactions. Concurrency control protocols can be broadly divided into two categories −

• Lock-based protocols
• Timestamp-based protocols

Database systems equipped with lock-based protocols use a mechanism by which a transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two kinds −
• Binary Locks − A lock on a data item can be in two states; it is either locked or unlocked.
• Shared/Exclusive − This type of locking mechanism differentiates the locks based on their use. If a lock is acquired on a data item to perform a write operation, it is an exclusive lock; allowing more than one transaction to write to the same data item would lead the database into an inconsistent state. Read locks are shared because no data value is being changed.
The following lock protocols are available −

Simplistic Lock Protocol: Simplistic lock-based protocols allow transactions to obtain a lock on
every object before a 'write' operation is performed. Transactions may unlock the data item after
completing the ‘write’ operation.

Pre-claiming Lock Protocol: Pre-claiming protocols evaluate their operations and create a list of
data items on which they need locks. Before initiating an execution, the transaction requests the
system for all the locks it needs beforehand. If all the locks are granted, the transaction executes and
releases all the locks when all its operations are over. If all the locks are not granted, the transaction
rolls back and waits until all the locks are granted.

Two-Phase Locking (2PL) Protocol: This locking protocol divides the execution of a transaction into two phases. In the first phase, as the transaction executes, it acquires the locks it requires and releases none. As soon as the transaction releases its first lock, the second phase starts; in this phase, the transaction cannot demand any new locks, it only releases the locks it already holds.

DrAA Unit-4 1

Two Phases: (a) Locking (Growing), (b) Unlocking (Shrinking).
Locking (Growing) Phase: A transaction applies locks (read or write) on desired data items one at a time.
Unlocking (Shrinking) Phase: A transaction unlocks its locked data items one at a time.
Requirement: For a transaction, these two phases must be mutually exclusive; that is, during the locking phase the unlocking phase must not start, and during the unlocking phase the locking phase must not begin.
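The two-phase requirement can be checked mechanically: scan a transaction's sequence of lock and unlock operations and verify that no lock request appears after the first unlock. A minimal Python sketch (the string encoding of operations is an illustrative assumption, not part of any DBMS API):

```python
def follows_2pl(operations):
    """Check that a transaction's operations obey two-phase locking:
    once any lock is released (shrinking phase has begun), no new
    lock may be acquired. Each operation is a string such as
    'read_lock(Y)', 'write_lock(X)' or 'unlock(Y)'."""
    shrinking = False
    for op in operations:
        if op.startswith("unlock"):
            shrinking = True                  # shrinking phase begins
        elif op.startswith(("read_lock", "write_lock")):
            if shrinking:                     # lock requested after an unlock
                return False
    return True

# T1 from the example below unlocks Y and later asks for a lock on X:
t1 = ["read_lock(Y)", "read_item(Y)", "unlock(Y)",
      "write_lock(X)", "read_item(X)", "write_item(X)", "unlock(X)"]
print(follows_2pl(t1))   # False: violates the two-phase policy
```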
Two-Phase Locking Example: Initially X = 50, Y = 50.

T1                          T2
----------------------      ----------------------
read_lock (Y);
read_item (Y);
unlock (Y);
                            read_lock (X);
                            read_item (X);
                            unlock (X);
                            write_lock (Y);
                            read_item (Y);
                            Y := X + Y;
                            write_item (Y);
                            unlock (Y);
write_lock (X);
read_item (X);
X := X + Y;
write_item (X);
unlock (X);

Result: the schedule is non-serializable because both transactions violate the two-phase policy: each releases a lock and later requests a new one.

Timestamp-based Protocol: The most commonly used concurrency protocol is the timestamp-based protocol. This protocol uses either system time or a logical counter as a timestamp. Lock-based protocols manage the order between conflicting pairs among transactions at the time of execution, whereas timestamp-based protocols start working as soon as a transaction is created.

Every transaction has a timestamp associated with it, and the ordering is determined by the age of the transaction. A transaction created at clock time 0002 would be older than all transactions that enter after it; for example, a transaction entering the system at 0004 is two seconds younger, and priority is given to the older one.
The timestamp-ordering protocol ensures serializability among transactions in their conflicting read and write operations. It is the responsibility of the protocol system that conflicting pairs of operations are executed according to the timestamp values of the transactions.

• The timestamp of transaction Ti is denoted as TS(Ti).
• The read time-stamp of data item X is denoted by R-timestamp(X).
• The write time-stamp of data item X is denoted by W-timestamp(X).


Basic Timestamp ordering protocol works as follows:

1. Check the following conditions whenever a transaction Ti issues a Read(X) operation:
   • If W_TS(X) > TS(Ti), the operation is rejected and Ti is rolled back.
   • If W_TS(X) <= TS(Ti), the operation is executed and R_TS(X) is set to the larger of R_TS(X) and TS(Ti).
2. Check the following conditions whenever a transaction Ti issues a Write(X) operation:
   • If TS(Ti) < R_TS(X), the operation is rejected and Ti is rolled back.
   • If TS(Ti) < W_TS(X), the operation is rejected and Ti is rolled back.
   • Otherwise, the operation is executed and W_TS(X) is set to TS(Ti).
Where,
TS(Ti) denotes the timestamp of the transaction Ti.
R_TS(X) denotes the read time-stamp of data item X.
W_TS(X) denotes the write time-stamp of data item X.

4.2 Deadlock Handling

A deadlock is a condition where two or more transactions wait indefinitely for one another to give up locks. Deadlock is one of the most feared complications in a DBMS, as no task ever finishes and every involved transaction waits forever.

For example: In the student table, transaction T1 holds a lock on some rows and needs to update
some rows in the grade table. Simultaneously, transaction T2 holds locks on some rows in the grade
table and needs to update the rows in the Student table held by Transaction T1.

Now the main problem arises: transaction T1 is waiting for T2 to release its lock, and similarly transaction T2 is waiting for T1 to release its lock. All activity comes to a halt and remains at a standstill until the DBMS detects the deadlock and aborts one of the transactions.


Deadlock Avoidance

o When a database is stuck in a deadlock state, it is better to avoid the deadlock rather than aborting and restarting the transactions, which wastes time and resources.
o The deadlock avoidance mechanism is used to detect any deadlock situation in advance. A method like the wait-for graph is used for detecting deadlock situations, but this method is suitable only for smaller databases. For larger databases, the deadlock prevention method can be used.

Deadlock Detection

In a database, when a transaction waits indefinitely to obtain a lock, the DBMS should detect whether the transaction is involved in a deadlock or not. The lock manager maintains a wait-for graph to detect deadlock cycles in the database.

Wait for Graph

o This is a suitable method for deadlock detection. In this method, a graph is created based on the transactions and their locks. If the created graph has a cycle or closed loop, then there is a deadlock.
o The wait-for graph is maintained by the system for every transaction which is waiting for some data held by the others. The system keeps checking the graph for any cycle.

In the wait-for graph for the above scenario, there is an edge from T1 to T2 and an edge from T2 to T1, forming a cycle and hence a deadlock.
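Cycle detection in the wait-for graph is a plain depth-first search. A small sketch, assuming the graph is given as a dictionary mapping each transaction to the set of transactions it waits on:

```python
def has_deadlock(wait_for):
    """Detect a cycle in a wait-for graph. wait_for maps each
    transaction to the set of transactions it is waiting on."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited / on stack / done
    colour = {t: WHITE for t in wait_for}

    def visit(t):
        colour[t] = GREY
        for u in wait_for.get(t, ()):
            if colour.get(u, WHITE) == GREY:      # back edge -> cycle
                return True
            if colour.get(u, WHITE) == WHITE and visit(u):
                return True
        colour[t] = BLACK
        return False

    return any(colour[t] == WHITE and visit(t) for t in wait_for)

# T1 waits for T2 and T2 waits for T1, as in the student/grade scenario:
print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))   # True
```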

Deadlock Prevention

o The deadlock prevention method is suitable for a large database. If the resources are allocated in such a way that deadlock never occurs, then deadlock is prevented.
o The database management system analyzes the operations of a transaction to determine whether they can create a deadlock situation. If they can, the DBMS never allows that transaction to be executed.


4.3 Multiple Granularity

Granularity: It is the size of the data item that is allowed to be locked.

Multiple Granularity:
o It can be defined as hierarchically breaking up the database into blocks which can be locked.
o The multiple granularity protocol enhances concurrency and reduces lock overhead.
o It keeps track of what to lock and how to lock.
o It makes it easy to decide whether to lock or unlock a data item. This type of hierarchy can be graphically represented as a tree.

For example: Consider a tree which has four levels of nodes.


o The first or highest level shows the entire database.
o The second level represents nodes of type area. The database consists of exactly these areas.
o Each area consists of child nodes known as files. No file can be present in more than one area.
o Finally, each file contains child nodes known as records. A file has exactly those records that are its child nodes. No record appears in more than one file.
o Hence, the levels of the tree, starting from the top, are as follows:
1. Database
2. Area
3. File
4. Record

In this example, the highest level is the entire database, and the levels below it are area, file, and record.


There are three additional lock modes with multiple granularity:

Intention-Shared (IS): It indicates explicit locking at a lower level of the tree, but only with shared locks.
Intention-Exclusive (IX): It indicates explicit locking at a lower level with exclusive or shared locks.

Shared & Intention-Exclusive (SIX): The node is locked in shared mode, and some lower-level node is locked in exclusive mode by the same transaction.

Compatibility Matrix with Intention Lock Modes: The table below shows the compatibility matrix for these lock modes (Yes means the two locks can be held on the same node by different transactions at the same time):

        IS    IX    S     SIX   X
  IS    Yes   Yes   Yes   Yes   No
  IX    Yes   Yes   No    No    No
  S     Yes   No    Yes   No    No
  SIX   Yes   No    No    No    No
  X     No    No    No    No    No

The protocol uses these intention lock modes to ensure serializability. It requires that a transaction attempting to lock a node must follow these rules:
o Transaction T1 must follow the lock-compatibility matrix.
o Transaction T1 must first lock the root of the tree. It can lock it in any mode.
o T1 can lock a node in S or IS mode only if it currently has the parent of that node locked in either IX or IS mode.
o T1 can lock a node in X, SIX, or IX mode only if it currently has the parent of that node locked in either IX or SIX mode.
o T1 can lock a node only if it has not previously unlocked any node (two-phase locking must be observed).
o T1 can unlock a node only if it currently has none of the children of that node locked.

Observe that in multiple granularity, locks are acquired in top-down order, and locks must be released in bottom-up order.
o If transaction T1 reads record Ra9 in file Fa, then T1 needs to lock the database, area A1, and file Fa in IS mode. Finally, it needs to lock Ra9 in S mode.
o If transaction T2 modifies record Ra9 in file Fa, then it can do so after locking the database, area A1, and file Fa in IX mode. Finally, it needs to lock Ra9 in X mode.
o If transaction T3 reads all the records in file Fa, then T3 needs to lock the database and area A1 in IS mode. At last, it needs to lock Fa in S mode.
o If transaction T4 reads the entire database, then T4 needs to lock the database in S mode.
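The top-down acquisition of intention locks can be sketched as a small helper. Assuming a root-to-leaf path through the granularity tree, each ancestor gets IS for a shared leaf lock and IX for an exclusive one (the list encoding is an illustrative assumption of this sketch):

```python
def locks_for(path, mode):
    """Compute the locks needed to lock the last node of a root-to-leaf
    path in the granularity tree in S or X mode: every ancestor gets
    the matching intention lock (IS for S, IX for X), acquired
    top-down."""
    intention = "IS" if mode == "S" else "IX"
    return [(node, intention) for node in path[:-1]] + [(path[-1], mode)]

# T2 modifies record Ra9 in file Fa (the second example above):
print(locks_for(["DB", "A1", "Fa", "Ra9"], "X"))
# [('DB', 'IX'), ('A1', 'IX'), ('Fa', 'IX'), ('Ra9', 'X')]
```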


4.4 Timestamp-Based Protocols

Timestamp-based protocol in DBMS is an algorithm which uses the system time or a logical counter as a timestamp to serialize the execution of concurrent transactions. The timestamp-based protocol ensures that all conflicting read and write operations are executed in timestamp order.
o The timestamp ordering protocol is used to order transactions based on their timestamps. The order of the transactions is nothing but the ascending order of transaction creation.
o The older transaction has the higher priority, which is why it executes first. To determine the timestamp of a transaction, this protocol uses system time or a logical counter.
o The lock-based protocol manages the order between conflicting pairs among transactions at execution time, but timestamp-based protocols start working as soon as a transaction is created.
o Let's assume there are two transactions, T1 and T2. Suppose transaction T1 entered the system at time 0007 and transaction T2 entered the system at time 0009. T1 has the higher priority, so it executes first, as it entered the system first.
o The timestamp ordering protocol also maintains the timestamps of the last 'read' and 'write' operations on each data item.

Basic Timestamp ordering protocol works as follows:

1. Check the following conditions whenever a transaction Ti issues a Read(X) operation:
   o If W_TS(X) > TS(Ti), the operation is rejected and Ti is rolled back.
   o If W_TS(X) <= TS(Ti), the operation is executed and R_TS(X) is set to the larger of R_TS(X) and TS(Ti).
2. Check the following conditions whenever a transaction Ti issues a Write(X) operation:
   o If TS(Ti) < R_TS(X), the operation is rejected and Ti is rolled back.
   o If TS(Ti) < W_TS(X), the operation is rejected and Ti is rolled back.
   o Otherwise, the operation is executed and W_TS(X) is set to TS(Ti).
Where,
TS(Ti) denotes the timestamp of the transaction Ti.
R_TS(X) denotes the read time-stamp of data item X.
W_TS(X) denotes the write time-stamp of data item X.
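The two rules above can be sketched directly as code. The following is an illustrative Python sketch (the dictionaries rts and wts standing for R_TS and W_TS are assumptions of this sketch); a False return means the operation is rejected and Ti must be rolled back and restarted with a new timestamp:

```python
def read(ts, x, rts, wts):
    """Basic timestamp-ordering check for Ti issuing Read(X).
    ts is TS(Ti); rts and wts hold R_TS and W_TS per data item.
    Returns True if the read is allowed, False if Ti must roll back."""
    if wts.get(x, 0) > ts:            # X was overwritten by a younger Tj
        return False                  # reject: roll back Ti
    rts[x] = max(rts.get(x, 0), ts)   # record the read
    return True

def write(ts, x, rts, wts):
    """Basic timestamp-ordering check for Ti issuing Write(X)."""
    if ts < rts.get(x, 0) or ts < wts.get(x, 0):
        return False                  # reject: roll back Ti
    wts[x] = ts
    return True

rts, wts = {}, {}
assert write(10, "X", rts, wts)    # T10 writes X, so W_TS(X) = 10
assert not read(7, "X", rts, wts)  # older T7 must not read the newer value
assert read(12, "X", rts, wts)     # younger T12 may read it
```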
Advantages and Disadvantages of TO protocol:
o The TO protocol ensures conflict serializability, since every edge in the precedence graph goes from an older transaction to a younger one, so the graph can contain no cycle.
o The TO protocol ensures freedom from deadlock, since no transaction ever waits.
o But the schedule may not be recoverable and may not even be cascade-free.


4.5 Validation-Based Protocols

The validation-based protocol is also known as the optimistic concurrency control technique. In this protocol, a transaction is executed in the following three phases:
1. Read phase: In this phase, transaction T reads the values of the various data items and stores them in temporary local variables. It performs all its write operations on these temporary variables, without updating the actual database.
2. Validation phase: In this phase, the temporary variable values are validated against the actual data to see whether they violate serializability.
3. Write phase: If validation succeeds, the temporary results are written to the database; otherwise, the transaction is rolled back.

Here each phase has the following different timestamps:

Start(Ti): It contains the time when Ti started its execution.

Validation (Ti): It contains the time when Ti finishes its read phase and starts its validation phase.

Finish(Ti): It contains the time when Ti finishes its write phase.


o This protocol determines the timestamp of a transaction for serialization using the timestamp of the validation phase, as that is the phase which actually determines whether the transaction will commit or roll back.
o Hence TS(T) = Validation(T).
o Serializability is determined during the validation process; it cannot be decided in advance.
o While executing transactions, it ensures a greater degree of concurrency and a smaller number of conflicts.
o Thus it results in fewer transaction rollbacks.
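The validation test itself can be sketched as follows. This is a simplified Kung-Robinson-style check, and the transaction representation (dicts with start, validation, finish, read_set, and write_set fields) is purely an assumption of this sketch: Ti passes validation against each earlier transaction Tj only if Tj finished before Ti started, or Tj finished before Ti's validation phase and Tj's write set does not intersect Ti's read set.

```python
def validate(ti, committed):
    """Optimistic validation test for transaction ti against every
    already-validated transaction tj. Returns False if ti must be
    rolled back."""
    for tj in committed:
        if tj["finish"] < ti["start"]:
            continue                      # tj finished before ti began
        if (tj["finish"] < ti["validation"]
                and not tj["write_set"] & ti["read_set"]):
            continue                      # no read-write conflict
        return False                      # validation fails: roll back ti
    return True

t_old = {"start": 1, "validation": 3, "finish": 4,
         "read_set": {"A"}, "write_set": {"B"}}
t_new = {"start": 2, "validation": 5, "finish": None,
         "read_set": {"B"}, "write_set": {"C"}}
print(validate(t_new, [t_old]))   # False: t_new read B, which t_old wrote
```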

4.6 Failure Classification


Failure in terms of a database can be defined as its inability to execute the specified transaction
or loss of data from the database. A DBMS is vulnerable to several kinds of failures and each of these
failures needs to be managed differently. There are many reasons that can cause database failures
such as network failure, system crash, natural disasters, carelessness, sabotage(corrupting the data
intentionally), software errors, etc. A failure in DBMS can be classified as:

Failure Classification in DBMS


Transaction Failure: If a transaction is not able to execute or it comes to a point from where the
transaction becomes incapable of executing further then it is termed as a failure in a transaction.
Reason for a transaction failure in DBMS:
Logical error: A logical error occurs if a transaction is unable to execute because of some mistakes
in the code or due to the presence of some internal faults.
System error: The database system itself terminates an active transaction due to some system issue, or because the database management system is unable to proceed with the transaction. For example, the system ends an executing transaction if it reaches a deadlock condition or if resources are unavailable.
System Crash: A system crash usually occurs when there is some sort of hardware or software
breakdown. Some other problems which are external to the system and cause the system to abruptly
stop or eventually crash include failure of the transaction, operating system errors, power cuts, main
memory crash, etc. These types of failures are often termed soft failures and are responsible for the
data losses in the volatile memory. It is assumed that a system crash does not have any effect on the
data stored in the non-volatile storage and this is known as the fail-stop assumption.

Data-transfer Failure: When a disk failure occurs amid data-transfer operation resulting in loss of
content from disk storage then such failures are categorized as data-transfer failures. Some other
reason for disk failures includes disk head crash, disk unreachability, formation of bad sectors, read-
write errors on the disk, etc. In order to quickly recover from a disk failure caused amid a data-
transfer operation, the backup copy of the data stored on other tapes or disks can be used. Thus it’s a
good practice to backup your data frequently.

4.7 Recovery and Atomicity


When a system crashes, it may have several transactions being executed and various files
opened for them to modify the data items. Transactions are made of various operations, which are
atomic in nature. But according to ACID properties of DBMS, atomicity of transactions as a whole
must be maintained, that is, either all the operations are executed or none. When a DBMS recovers
from a crash, it should maintain the following −
• It should check the states of all the transactions which were being executed.
• A transaction may be in the middle of some operation; the DBMS must ensure the atomicity of the transaction in this case.
• It should check whether the transaction can be completed now or needs to be rolled back.
• No transaction should be allowed to leave the DBMS in an inconsistent state.

There are two types of techniques which can help a DBMS in recovering as well as maintaining the atomicity of a transaction −
• Maintaining the logs of each transaction, and writing them onto some stable storage before actually modifying the database.
• Maintaining shadow paging, where the changes are done in volatile memory, and later the actual database is updated.


Log-based Recovery
Log is a sequence of records, which maintains the records of actions performed by a transaction. It is important that the logs are written prior to the actual modification and stored on a stable storage media, which is failsafe. Log-based recovery works as follows −
• The log file is kept on a stable storage media.
• When a transaction enters the system and starts execution, it writes a log record about it:
  <Tn, Start>
• When the transaction modifies an item X, it writes a log record as follows:
  <Tn, X, V1, V2>
  It reads: Tn has changed the value of X from V1 to V2.
• When the transaction finishes, it logs:
  <Tn, commit>

Recovery with Concurrent Transactions


When more than one transaction is being executed in parallel, the logs are interleaved. At recovery time, it would become hard for the recovery system to backtrack through all the logs and then start recovering. To ease this situation, most modern DBMSs use the concept of 'checkpoints'.

Checkpoint: Keeping and maintaining logs in real time and in a real environment may fill up all the available storage space in the system. As time passes, the log file may grow too big to be handled at all. Checkpoint is a mechanism where all the previous logs are removed from the system and stored permanently on a storage disk. A checkpoint declares a point before which the DBMS was in a consistent state and all the transactions were committed.

Recovery: When a system with concurrent transactions crashes and recovers, it behaves in the following manner −

• The recovery system reads the logs backwards from the end to the last checkpoint.
• It maintains two lists, an undo-list and a redo-list.
• If the recovery system sees a log with <Tn, Start> and <Tn, Commit>, or just <Tn, Commit>, it puts the transaction in the redo-list.
• If the recovery system sees a log with <Tn, Start> but no commit or abort log record, it puts the transaction in the undo-list.

All the transactions in the undo-list are then undone and their logs are removed. All the transactions in the redo-list are redone, i.e., their logged operations are replayed from the log.
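The construction of the two lists can be sketched as a single scan over the log. The tuple encoding of log records is an assumption of this sketch:

```python
def classify(log):
    """Build the redo-list and undo-list from a crash-time log.
    Each record is a tuple such as ('T1', 'start') or ('T1', 'commit').
    Transactions with a commit record are redone; transactions that
    started but never committed are undone."""
    committed = {t for t, action in log if action == "commit"}
    started = {t for t, action in log if action == "start"}
    redo = sorted(committed)
    undo = sorted(started - committed)   # started but never committed
    return redo, undo

log = [("T1", "start"), ("T2", "start"), ("T1", "commit"), ("T3", "start")]
print(classify(log))   # (['T1'], ['T2', 'T3'])
```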

4.8 Recovery Algorithm

Database systems, like any other computer system, are subject to failures, but the data stored in them must be available as and when required. When a database fails, it must possess the facilities for fast recovery. It must also ensure atomicity, i.e. either a transaction completes successfully and is committed (its effect is recorded permanently in the database) or it has no effect on the database at all.

There are both automatic and non-automatic ways for both, backing up of data and recovery from any
failure situations. The techniques used to recover the lost data due to system crash, transaction errors,
viruses, catastrophic failure, incorrect commands execution etc. are database recovery techniques. So
to prevent data loss recovery techniques based on deferred update and immediate update or backing
up data can be used.

Recovery techniques are heavily dependent upon the existence of a special file known as a system
log. It contains information about the start and end of each transaction and any updates which occur
in the transaction. The log keeps track of all transaction operations that affect the values of database
items. This information is needed to recover from transaction failure.

The log is kept on disk. The main types of log entries are:

start_transaction(T): This log entry records that transaction T starts execution.

read_item(T, X): This log entry records that transaction T reads the value of database item X.

write_item(T, X, old_value, new_value): This log entry records that transaction T changes the value of database item X from old_value to new_value. The old value is sometimes known as the before-image of X, and the new value is known as the after-image of X.

commit(T): This log entry records that transaction T has completed all accesses to the database
successfully and its effect can be committed (recorded permanently) to the database.

abort(T): This records that transaction T has been aborted.

checkpoint: Checkpoint is a mechanism where all the previous logs are removed from the system
and stored permanently in a storage disk. Checkpoint declares a point before which the DBMS was in
consistent state, and all the transactions were committed.

A transaction T reaches its commit point when all its operations that access the database have been executed successfully, i.e. the transaction has reached the point at which it will not abort (terminate without completing). Once committed, the transaction is permanently recorded in the database. Commitment always involves writing a commit entry to the log and writing the log to disk. At the time of a system crash, the log is searched backwards for all transactions T that have written a start_transaction(T) entry into the log but have not yet written a commit(T) entry; these transactions may have to be rolled back to undo their effect on the database during the recovery process.

Undoing – If a transaction crashes, then the recovery manager may undo the transaction, i.e. reverse its operations. This involves examining the log for every entry write_item(T, x, old_value, new_value) belonging to the transaction and setting the value of item x in the database back to old_value. There are two major techniques for recovery from non-catastrophic transaction failures: deferred updates and immediate updates.
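Undoing by before-images can be sketched in a few lines. The tuple layout of the write_item log entries is an assumption of this sketch; the log is examined in reverse so that the earliest old_value is the one finally restored:

```python
def undo(db, log, t):
    """Roll back transaction t by scanning its log entries in reverse
    and restoring each item's old value. Entries are tuples of the
    form ('write_item', T, X, old_value, new_value)."""
    for entry in reversed(log):
        if entry[0] == "write_item" and entry[1] == t:
            _, _, x, old_value, _ = entry
            db[x] = old_value          # restore the before-image of X
    return db

db = {"X": 30, "Y": 99}
log = [("write_item", "T1", "X", 10, 20),
       ("write_item", "T1", "X", 20, 30),
       ("write_item", "T2", "Y", 5, 99)]
print(undo(db, log, "T1"))   # {'X': 10, 'Y': 99}
```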


Deferred update – This technique does not physically update the database on disk until a transaction has reached its commit point. Before reaching commit, all transaction updates are recorded in the local transaction workspace. If a transaction fails before reaching its commit point, it will not have changed the database in any way, so UNDO is not needed. It may be necessary to REDO the effect of the operations that are recorded in the local transaction workspace, because their effect may not yet have been written to the database. Hence, deferred update is also known as the NO-UNDO/REDO algorithm.

Immediate update – In the immediate update, the database may be updated by some operations of a
transaction before the transaction reaches its commit point. However, these operations are recorded in
a log on disk before they are applied to the database, making recovery still possible. If a transaction
fails to reach its commit point, the effect of its operation must be undone i.e. the transaction must be
rolled back hence we require both undo and redo. This technique is known as undo/redo algorithm.

Caching/Buffering – In this technique, one or more disk pages that include the data items to be updated are cached into main-memory buffers and updated in memory before being written back to disk. A collection of in-memory buffers, called the DBMS cache, is kept under the control of the DBMS for holding these buffers. A directory is used to keep track of which database items are in the buffers. A dirty bit is associated with each buffer: it is 0 if the buffer has not been modified and 1 if it has.

Shadow paging – It provides atomicity and durability. A directory with n entries is constructed, where the ith entry points to the ith database page on disk. When a transaction begins executing, the current directory is copied into a shadow directory. When a page is to be modified, a new shadow page is allocated in which the changes are made, and when the transaction is ready to become durable, all directory entries that refer to the original pages are updated to refer to the new replacement pages.
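A toy sketch of shadow paging, with pages and directories held in plain dictionaries (all names here are illustrative assumptions): the shadow directory keeps pointing at the original pages, so aborting is just a matter of discarding the current directory.

```python
class ShadowPaging:
    """Minimal shadow-paging sketch: the current directory is copied
    when a transaction begins, pages are modified copy-on-write, and
    abort restores the untouched shadow directory."""
    def __init__(self, pages):
        self.pages = dict(pages)               # page_id -> contents
        self.current = {p: p for p in pages}   # directory: logical -> physical
        self.shadow = None

    def begin(self):
        self.shadow = dict(self.current)       # copy the directory, not pages

    def write(self, page_id, data):
        new_id = f"{page_id}'"                 # allocate a fresh page copy
        self.pages[new_id] = data
        self.current[page_id] = new_id         # only the current directory sees it

    def read(self, page_id):
        return self.pages[self.current[page_id]]

    def abort(self):
        self.current = self.shadow             # old directory is untouched

sp = ShadowPaging({"P1": "old"})
sp.begin()
sp.write("P1", "new")
print(sp.read("P1"))   # 'new'
sp.abort()
print(sp.read("P1"))   # 'old'
```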

Some of the backup techniques are as follows:

Full database backup – In this, the full database, including the data and the database meta-information needed to restore the whole database (such as full-text catalogs), is backed up in a predefined time series.

Differential backup – It stores only the data changes that have occurred since the last full database backup. When the same data has changed many times since the last full database backup, a differential backup stores only the most recent version of the changed data. To restore from it, the last full database backup must be restored first.

Transaction log backup – In this, all events that have occurred in the database, i.e. a record of every single statement executed, are backed up. It is the backup of the transaction log entries and contains all the transactions that have happened to the database. Through this, the database can be recovered to a specific point in time. It is even possible to perform a backup from the transaction log if the data files are destroyed, without losing even a single committed transaction.

4.9 ARIES
Algorithm for Recovery and Isolation Exploiting Semantics (ARIES) is based on the Write-Ahead Log (WAL) protocol. Every update operation writes a log record which is one of the following:
Undo-only log record: Only the before-image is logged. Thus, an undo operation can be performed to retrieve the old data.
Redo-only log record: Only the after-image is logged. Thus, a redo operation can be attempted.
Undo-redo log record: Both the before-image and the after-image are logged.

In it, every log record is assigned a unique and monotonically increasing log sequence
number (LSN). Every data page has a page LSN field that is set to the LSN of the log record
corresponding to the last update on the page. WAL requires that the log record corresponding to an
update make it to stable storage before the data page corresponding to that update is written to disk.
For performance reasons, each log write is not immediately forced to disk. A log tail is maintained in
main memory to buffer log writes. The log tail is flushed to disk when it gets full. A transaction
cannot be declared committed until the commit log record makes it to disk.

Once in a while the recovery subsystem writes a checkpoint record to the log. The checkpoint
record contains the transaction table and the dirty page table. A master log record is maintained
separately, in stable storage, to store the LSN of the latest checkpoint record that made it to disk. On
restart, the recovery subsystem reads the master log record to find the checkpoint’s LSN, reads the
checkpoint record, and starts recovery from there on.

The recovery process actually consists of three phases:

Analysis: The recovery subsystem determines the earliest log record from which the next pass must start. It also scans the log forward from the checkpoint record to construct a snapshot of what the system looked like at the instant of the crash.
Redo: Starting at the earliest LSN, the log is read forward and each update is redone.
Undo: The log is scanned backward and the updates corresponding to loser transactions are undone.
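The heart of the redo pass is the pageLSN comparison: an update is reapplied only if the page on disk does not already reflect it, i.e. pageLSN is smaller than the log record's LSN. A minimal sketch (the log and page representations are assumptions of this sketch):

```python
def redo_pass(log, page_lsn):
    """Redo pass sketch: scan the log forward from the earliest LSN
    and reapply each update whose effect is not yet on its page.
    log is a list of (lsn, page_id, value) records, and page_lsn
    maps page_id -> the LSN currently recorded on the disk page."""
    pages = {}
    for lsn, page_id, value in log:
        if page_lsn.get(page_id, 0) < lsn:    # effect missing on disk
            pages[page_id] = value
            page_lsn[page_id] = lsn           # pageLSN advances with the redo
    return pages

log = [(10, "P1", "a"), (20, "P2", "b"), (30, "P1", "c")]
# P1 already reflects LSN 10, so only LSNs 20 and 30 are redone:
print(redo_pass(log, {"P1": 10, "P2": 0}))   # {'P2': 'b', 'P1': 'c'}
```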


4.10 Remote Backup Systems


Remote backup provides a sense of security in case the primary location where the database is
located gets destroyed. Remote backup can be offline or real-time or online. In case it is offline, it is
maintained manually.

Online backup systems are more real-time and are lifesavers for database administrators and investors. An online backup system is a mechanism where every bit of the real-time data is backed up simultaneously at two distant places. One of them is directly connected to the system, and the other is kept at a remote place as backup.
As soon as the primary database storage fails, the backup system senses the failure and switches the user system to the remote storage. Sometimes this is so instant that users don't even notice the failure.
