
 A transaction is a unit of program execution that accesses and possibly updates various data items.


 A transaction is delimited by statements (or function calls) of the form
begin transaction and end transaction. The transaction consists of all
operations executed between the begin transaction and end
transaction.
Let’s take an example of a simple transaction. Suppose a bank employee transfers
Rs 500 from A's account to B's account. This very simple and small transaction
involves several low-level tasks.
A’s Account
Open_Account (A)
Old_Balance = A.balance
New_Balance = Old_Balance - 500
A.balance = New_Balance
Close_Account (A)
B’s Account
Open_Account (B)
Old_Balance = B.balance
New_Balance = Old_Balance + 500
B.balance = New_Balance
Close_Account (B)
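The low-level tasks above can be sketched in Python. This is a minimal illustration, assuming accounts live in a simple in-memory dictionary (the opening balances are hypothetical); a real DBMS would run these steps inside begin transaction / end transaction.

```python
# Hypothetical in-memory account store; balances are illustrative.
accounts = {"A": 1000, "B": 200}

def transfer(src, dst, amount):
    # Debit the source account (mirrors the A's Account steps)
    old_balance = accounts[src]
    accounts[src] = old_balance - amount
    # Credit the destination account (mirrors the B's Account steps)
    old_balance = accounts[dst]
    accounts[dst] = old_balance + amount

transfer("A", "B", 500)
print(accounts)   # A debited by 500, B credited by 500
```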
ACID Properties
A transaction in a database system must maintain Atomicity, Consistency, Isolation,
and Durability − commonly known as ACID properties − in order to ensure
accuracy, completeness, and data integrity.
Atomicity (A)
This property ensures that either all the operations of a transaction are reflected in the database or none are.
Example
Suppose Account A has a balance of Rs.400 & B has Rs.700. Account A is
transferring Rs.100 to Account B. This is a transaction that has two operations

a) Debiting Rs.100 from A’s balance


b) Crediting Rs.100 to B’s balance.

Suppose the first operation succeeds while the second fails. In this case A’s
balance would be Rs. 300 while B would still have Rs. 700 instead of Rs. 800.
This is unacceptable in a banking system. Either the transaction should fail without
executing any of the operations, or it should process both of them. The
Atomicity property ensures exactly that.
Consistency (C)
Integrity constraints should be maintained so that the database is consistent
before and after the transaction. To preserve the consistency of the database, the
execution of a transaction should take place in isolation (that is, no other
transaction should run concurrently while a transaction is already running).
Example
Account A has a balance of 400 and it is transferring 100 to accounts B and C,
so we have two transactions. Suppose these transactions run concurrently
and both read a balance of 400; in that case the final balance of A
would be 300 instead of 200.
This is wrong. If the transactions were to run in isolation, then the second
transaction would have read the correct balance of 300 (before debiting 100) once
the first transaction had completed successfully.

Isolation (I)
Even though multiple transactions may execute concurrently, the system
guarantees that, for every pair of transactions Ti and Tj , it appears to Ti that either
Tj finished execution before Ti started, or Tj started execution after Ti finished.
Thus, each transaction is unaware of other transactions executing concurrently in
the system.
Durability (D)
Once a transaction completes successfully, the changes it has made to the
database must persist even if there is a system failure. The recovery
management component of database systems ensures the durability of transactions.

Transaction States
Any transaction at any point of time must be in one of the following states:
Active: The initial state; the transaction stays in this state while it is executing.
Partially committed: After the final statement has been executed.
Failed: After the discovery that normal execution can no longer proceed.
Aborted: After the transaction has been rolled back and the database has been
restored to its state prior to the start of the transaction
Committed: After successful completion.

A transaction that completes its execution successfully is said to be committed. A


committed transaction that has performed updates transforms the database into a
new consistent state, which must persist even if there is a system failure.
Once a transaction has committed, we cannot undo its effects by aborting it. The
only way to undo the effects of a committed transaction is to execute a
compensating transaction.
An aborted transaction has two options
• It can be restarted, but only if the transaction was aborted as a result of
some hardware or software error that was not created through the
internal logic of the transaction. A restarted transaction is considered to
be a new transaction.

• It can be killed. This is usually done because of some internal logical error
that can be corrected only by rewriting the application program, or
because the input was bad, or because the desired data were not found in
the database.
Serializability
Schedule
A chronological execution sequence of the operations of one or more transactions
is called a schedule. A schedule can contain many transactions, each comprising
a number of instructions/tasks.
In general, there are two types of schedules
✓ Serial Schedules
✓ Equivalence Schedules
Let T1 and T2 be two transactions that transfer funds from one account to another.
Transaction T1 transfers $50 from account A to account B. It is defined as

T1: read(A);
A := A − 50;
write(A);
read(B);
B := B + 50;
write(B).
Transaction T2 transfers 10 percent of the balance from account A to account B. It
is defined as
T2: read(A);
temp := A * 0.1;
A := A − temp;
write(A);
read(B);
B := B + temp;
write(B).
Suppose the current values of accounts A and B are $1000 and $2000, respectively.
[Figure omitted: Schedule 2, a concurrent schedule of T1 and T2.]
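Executing T1 followed by T2 serially on these values can be checked with a small Python sketch (the variable names mirror the pseudocode above):

```python
# Serial execution of T1 then T2, starting from A = 1000, B = 2000.
A, B = 1000, 2000

# T1: transfer $50 from account A to account B
A = A - 50
B = B + 50

# T2: transfer 10 percent of A's balance from A to B
temp = A * 0.1
A = A - temp
B = B + temp

print(A, B)      # final balances: 855.0 2145.0
print(A + B)     # the sum A + B is preserved: 3000.0
```

Running T2 first and then T1 gives different final balances (A = 850, B = 2150), but either serial order preserves the invariant A + B = 3000.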
Serial Schedules
It is a schedule in which transactions are aligned in such a way that one transaction
is executed first. When the first transaction completes its cycle, then the next
transaction is executed. Transactions are ordered one after the other. This type of
schedule is called a serial schedule, as transactions are executed in a serial manner.
Equivalence Schedules
An equivalence schedule can be of the following types −
✓Result Equivalence
✓View Equivalence
✓Conflict Equivalence

Result Equivalence:
If two schedules produce the same result after execution, they are said to be
result equivalent. They may yield the same result for some value and different
results for another set of values. That's why this equivalence is not generally
considered significant.
View Equivalence
Two schedules are view equivalent if the transactions in both
schedules perform similar actions in a similar manner.
For example −
 If Ti reads the initial data in S1, then it also reads the initial data in S2.
 If Ti reads the value written by Tj in S1, then it also reads the value written by Tj
in S2.
 If Ti performs the final write on the data value in S1, then it also performs the
final write on the data value in S2.
Conflict Equivalence
Two operations are conflicting if they have the following properties –
✓ They belong to separate transactions.
✓ They access the same data item.
✓ At least one of them is a write operation.
Example:
The pair (R1(A), W2(A)) is conflicting because the operations belong to two
different transactions, act on the same data item A, and one of them is a write
operation.
Similarly, (W1(A), W2(A)) and (W1(A), R2(A)) are also conflicting pairs.
On the other hand, (R1(A), W2(B)) is non-conflicting because the operations
act on different data items.
Similarly, (W1(A), W2(B)) is non-conflicting.


Two schedules having multiple transactions with conflicting operations are
said to be conflict equivalent if and only if
❖ Both the schedules contain the same set of transactions.
❖ The order of conflicting pairs of operations is maintained in both
the schedules.
Check whether the following schedules are conflict equivalent or not.
S1: R1(X) R2(X) W1(X) R1(Y) W2(X) W1(Y)
S2:R1(X) W1(X) W1(Y) R1(Y) R2(X) W2(X)

Let us consider S1. In this the conflict operations are


 R1(X) W2(X)
 R2(X) W1(X)
 W1(X) W2(X)

Now, consider S2. In this the conflict operations are

 R1(X) W2(X)
 W1(X) R2(X)
 W1(X) W2(X)

The conflicting pair of R2(X) and W1(X) appears in opposite orders in the two
schedules (R2(X) before W1(X) in S1, but W1(X) before R2(X) in S2), so S1 and S2
are not conflict equivalent.
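Listing the conflicting pairs of a schedule can be automated. A small Python sketch, assuming each operation is encoded as a (transaction, action, item) tuple:

```python
# Each operation is (txn, action, item), e.g. ("T1", "R", "X") for R1(X).
def conflicting_pairs(schedule):
    """Return ordered pairs of conflicting operations: different
    transactions, same data item, at least one write."""
    pairs = []
    for i, (t1, a1, x1) in enumerate(schedule):
        for t2, a2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "W" in (a1, a2):
                pairs.append(((t1, a1, x1), (t2, a2, x2)))
    return pairs

S1 = [("T1","R","X"), ("T2","R","X"), ("T1","W","X"),
      ("T1","R","Y"), ("T2","W","X"), ("T1","W","Y")]
for p in conflicting_pairs(S1):
    print(p)   # the three conflicting pairs of S1
```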

Note − View equivalent schedules are view serializable and conflict equivalent
schedules are conflict serializable. All conflict serializable schedules are view
serializable too.
Serializability
A schedule is said to be a serializable schedule if it is equivalent to any one of
the serial schedules.
Conflict Serializable Schedules
A schedule S is said to be a conflict serializable schedule if it is conflict equivalent
to any one of the serial schedules.

Method-1 for checking the conflict Serializability: -


Step-1: Generate all possible serial schedules for the given schedule. If a
schedule contains n transactions, then the number of possible serial
schedules is n!
Step-2: Now compare each serial schedule with given schedule for conflict
equivalence. If any serial schedule is conflict equivalent with the given
schedule, then it is conflict serializable schedule otherwise it is not a
conflict serializable schedule.
Example: S: R1(X) R2(X) W1(X) R1(Y) W2(X) W1(Y)
Step-1: Generate all possible serial schedules for S. Here S contains 2
transactions, so the possible number of serial schedules are 2(2!). Those
are
<T1 T2> (S12)
<T2 T1> (S21)
Step-2: Now compare S with <T1 T2 > for conflict equivalence.
Now S: R1(X) R2(X) W1(X) R1(Y) W2(X) W1(Y)
S12: R1(X) W1(X) R1(Y) W1(Y) R2(X) W2(X)

In S the conflicting operations are


R1(X) W2(X),
R2(X) W1(X),
W1(X)W2(X)
In S12 the conflicting operations are
R1(X) W2(X),
W1(X) R2(X)
W1(X)W2(X)

Here S and S12 do not follow the same order for conflicting operation pairs, so S is not conflict equivalent to S12.
Now compare S with <T2 T1> for conflict equivalence.

S: R1(X) R2(X) W1(X) R1(Y) W2(X) W1(Y)


S21: R2(X)W2(X)R1(X)W1(X) R1(Y) W1(Y)

In S the conflicting operations are


R1(X) W2(X),
R2(X) W1(X),
W1(X)W2(X)

In S21 the conflicting operations are


R2(X) W1(X),
W2(X)W1(X),
W2(X)R1(X)

Here S and S21 do not follow the same order for conflicting operation pairs either.
So, the given schedule S is not a conflict serializable schedule.
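Method-1 enumerates all n! serial schedules. An equivalent and cheaper standard test (not part of Method-1 in these notes) builds a precedence graph with an edge Ti → Tj for every conflicting pair in which Ti's operation comes first; the schedule is conflict serializable iff this graph is acyclic. A hedged Python sketch, with operations encoded as (transaction, action, item) tuples:

```python
def is_conflict_serializable(schedule):
    """Precedence-graph test: edge Ti -> Tj for each conflicting pair
    in which Ti's operation comes first; conflict serializable iff
    the graph is acyclic."""
    edges = set()
    for i, (t1, a1, x1) in enumerate(schedule):
        for t2, a2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "W" in (a1, a2):
                edges.add((t1, t2))
    remaining = {t for t, _, _ in schedule}
    while remaining:
        # Peel off transactions that have no incoming edge from the rest.
        sources = [t for t in remaining
                   if not any((u, t) in edges for u in remaining)]
        if not sources:
            return False   # every node has an incoming edge: a cycle exists
        remaining -= set(sources)
    return True

S = [("T1","R","X"), ("T2","R","X"), ("T1","W","X"),
     ("T1","R","Y"), ("T2","W","X"), ("T1","W","Y")]
print(is_conflict_serializable(S))   # False, matching the result above
```

For the schedule S of the example, the edges T1 → T2 (from R1(X), W2(X)) and T2 → T1 (from R2(X), W1(X)) form a cycle, which agrees with the conclusion that S is not conflict serializable.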

Exercise: Check whether the following schedule is conflict serializable.
S: R1(A)W1(A)R2(A)W2(A)R1(B)W1(B)R2(B)W2(B)
View Serializable Schedules
A schedule S is said to be a view serializable schedule if it is view equivalent to
any one of the serial schedules.

Method for checking the View Serializability:

Step-1: Generate all possible serial schedules for the given schedule. If a schedule
contains n transactions, then the number of possible serial schedules is n!
Step-2: Now compare each serial schedule with given schedule for view
equivalence. If any serial schedule is view equivalent with the given
schedule, then it is view Serializable schedule otherwise it is not a view
Serializable schedule.

Example:
Check whether the schedule is view Serializable or not?
S: R2 (B); R2 (A); R1 (A); R3 (A); W1 (B); W2 (B); W3 (B);
Solution: With 3 transactions, total number of schedules possible = 3! = 6
<T1 T2 T3>
<T1 T3 T2>
<T2 T3 T1>
<T2 T1 T3>
<T3 T1 T2>
<T3 T2 T1>
Step 1: Initial Read
A: T2
B: T2
Remaining Schedules:
< T2 T1 T3>
<T2 T3 T1 >

Step 2: Write Read Sequence (WR):


No need to check here.
Step 3: Final write on data items:
A: -
B: T3
Since the final write on B is made by T3, transaction T3 must execute
after transactions T1 and T2.
Now, remove the schedules in which T3 does not execute last.
Hence, the view equivalent serial schedule is:
T2 → T1 → T3
Hence, the given schedule is a view serializable schedule.
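The checks used above (initial reads, write-read pairs, final writes) generalize to a brute-force view-equivalence test against every serial order. A Python sketch, with the (transaction, action, item) tuple encoding as an assumption:

```python
from itertools import permutations

def view_info(schedule):
    """Initial reads, reads-from pairs, and final writes of a schedule.
    Operations are (txn, action, item) tuples, e.g. ("T2", "R", "B")."""
    last_writer = {}                  # item -> most recent writer
    initial, reads_from, final = set(), set(), {}
    for t, a, x in schedule:
        if a == "R":
            if x in last_writer:
                reads_from.add((t, x, last_writer[x]))
            else:
                initial.add((t, x))   # read of the initial value
        else:
            last_writer[x] = t
            final[x] = t              # provisional final write
    return initial, reads_from, final

def is_view_serializable(schedule):
    txns = sorted({t for t, _, _ in schedule})
    target = view_info(schedule)
    for order in permutations(txns):
        # Build the serial schedule for this transaction order.
        serial = [op for t in order for op in schedule if op[0] == t]
        if view_info(serial) == target:
            return True
    return False

S = [("T2","R","B"), ("T2","R","A"), ("T1","R","A"), ("T3","R","A"),
     ("T1","W","B"), ("T2","W","B"), ("T3","W","B")]
print(is_view_serializable(S))   # True: view equivalent to T2, T1, T3
```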
Exercise: Check whether the following schedule is view serializable.
S: R1(A)W2(A)R3(A)W1(A)W3(A)
Recoverability
✓ A computer system, like any other device, is subject to failure from a variety
of causes. In any failure, information may be lost.
✓ Therefore, the database system must take actions in advance to ensure that the
atomicity and durability properties of transactions are preserved.
✓ An integral part of a database system is a recovery scheme that can restore the
database to the consistent state that existed before the failure.

Failure Classification
There are various types of failure that may occur in a system, each of which needs
to be dealt with in a different manner. The failures are classified as follows
✓ Transaction failure.
✓ System crash.
✓ Disk failure.
Transaction failure
There are two types of errors that may cause a transaction to fail:
Logical error
The transaction can no longer continue with its normal execution because of some
internal condition, such as bad input, data not found, overflow, or resource limit
exceeded.
System error
The system has entered an undesirable state (for example, deadlock), as a result of
which a transaction cannot continue with its normal execution. The transaction,
however, can be executed later.

System crash
There is a hardware malfunction, or a bug in the database software or the
operating system, that causes the loss of the content of volatile storage and brings
transaction processing to a halt. The content of non-volatile storage remains intact
and is not corrupted.
Disk failure
✓ A disk block loses its content as a result of either a head crash or failure during
a data transfer operation.

To determine how the system should recover from failures, we need to identify
the failure modes of those devices used for storing data. Next, we must consider
how these failure modes affect the contents of the database.
Storage Structure
The storage structure can be divided into two categories

Volatile storage
✓ As the name suggests, volatile storage cannot survive system crashes.
Examples:
Main memory (RAM) and cache memory are examples of volatile storage.

Non-volatile storage
✓ These memories are made to survive system crashes. They are huge in data
storage capacity, but slower in accessibility.

Examples:
Harddisks, magnetic tapes, flash memory, and non-volatile (battery backed up)
ROM.
Recovery and Atomicity
✓ When a system crashes, it may have several transactions being executed and
various files opened for them to modify the data items.
✓ The atomicity of transactions must be maintained, that is, either all
the operations are executed or none.

When a DBMS recovers from a crash, it should maintain the following

❖ It should check the states of all the transactions, which were being executed.
❖ A transaction may be in the middle of some operation; the DBMS must ensure
the atomicity of the transaction in this case.
❖ It should check whether the transaction can be completed now or it needs to be
rolled back.
❖ No transactions would be allowed to leave the DBMS in an inconsistent state.

There are two types of techniques, which can help a DBMS in recovering as well
as maintaining the atomicity of a transaction
➢ Maintaining the logs of each transaction and writing them onto some stable
storage before modifying the database.
➢ Maintaining shadow paging, where the changes are done on a volatile
memory, and later, the actual database is updated.
Log-based Recovery
The log maintains a record of the actions performed by each transaction. It is important
that the logs are written prior to the actual modification and stored on a stable
storage medium, which is failsafe.

Log-based recovery works as follows


The log file is kept on a stable storage medium.

 When a transaction enters the system and starts execution, it writes a log record
about it:
<Tn, Start>
 When the transaction modifies an item X, it writes a log record as follows −
<Tn, X, V1, V2>
It records that Tn has changed the value of X from V1 to V2.
 When the transaction finishes, it logs −
<Tn, commit>

The database can be modified using two approaches


➢ Deferred database modification
➢ Immediate database modification
Deferred database modification
In this all the writes are deferred to after partially commit. In this we have two
cases.
✓ All the operations are successful.
✓ Some or all the operations are failed.

If all the operations in schedule are successful, then all the writes are deferred to
partial commit otherwise no writes are deferred to partial commit.

✓ In this method a write log record does not need the old value, i.e., it is (write, T, x,
new value).
✓ An (abort, T) entry is never used in this method.
✓ If a system crash occurs because of a transaction failure, the recovery system
scans the log file: for every transaction T whose (commit, T) entry is found, it
redoes all of T's (write, T) entries against the database. If no (commit, T) entry
is found for a transaction, nothing needs to be done for it.
✓ This method is also called the no-undo/redo recovery scheme.
✓ Redo must be done in log (forward) order.
Example:
(Start,T0)
(Write,T0,a,9)
Crash (Nothing is to be done because no commit found)
(commit,T0)
(start, T1)
(write,T1,a,7)
Crash (T0 committed so redo T0)
(commit,T1)
Crash (T0, T1 are committed so redo T0,T1)
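The redo rule illustrated by the example can be sketched over a list of log tuples. The encoding mirrors the example entries; this is an illustrative sketch, not a real recovery manager:

```python
def redo_recover(log, db):
    """No-undo/redo recovery: reapply the writes of committed
    transactions in forward (log) order; writes of uncommitted
    transactions are ignored."""
    committed = {t for op, t, *_ in log if op == "commit"}
    for op, t, *rest in log:
        if op == "write" and t in committed:
            item, new_value = rest
            db[item] = new_value      # redo the deferred write
    return db

# Crash after (write, T1, a, 7): T0 committed, T1 did not.
log = [("start", "T0"), ("write", "T0", "a", 9), ("commit", "T0"),
       ("start", "T1"), ("write", "T1", "a", 7)]
print(redo_recover(log, {"a": 5}))   # only T0's write is redone
```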

Immediate Database Modification:


✓ It allows uncommitted writes to reach the database.
✓ Each write log record needs both the old and the new value.
✓ If a system crash occurs because of a transaction failure, the recovery system
scans the log file: for every transaction whose (commit, T) entry is found, it
redoes all of that transaction's operations; otherwise it undoes them.
✓ It is also called the undo/redo recovery scheme.
✓ Undo is done in the reverse order of the log, whereas redo is done in
forward order.
✓ Undo operations are performed first, followed by redo operations.

Example:
(Start, T0)
(Write, T0, A, 3, 9)
Crash (Undo T0 so A=3)
(Commit, T0)
(Start, T1)
(Write, T1, B, 2, 5)
(Write, T1, A, 9, 7)
Crash (undo – T1 and Redo T0 so B=2, A=9)
(Commit, T1)
Crash (Redo T0, T1 so B=5, A=7)
Recovery with concurrent Transactions
When more than one transaction are being executed in parallel, the logs are
interleaved. At the time of recovery, it would become hard for the recovery system
to backtrack all logs, and then start recovering. To ease this situation, most modern
DBMS use the concept of 'checkpoints’.
Checkpoint
✓ Checkpoint declares a point before which the DBMS was in consistent state,
and all the transactions were committed.
✓ Checkpoint is a mechanism where all the previous logs are removed from the
system and stored permanently in a storage disk.
Recovery
When a system with concurrent transactions crashes and recovers, it behaves in the
following manner
✓ The recovery system reads the logs backwards from the end to the last
checkpoint.
✓ It maintains two lists, an undo-list and a redo-list.
✓ If the recovery system sees a log with <Tn, Start> and <Tn, Commit> or just
<Tn, Commit>, it puts the transaction in the redo-list.
✓ If the recovery system sees a log with <Tn, Start> but no commit or abort log
found, it puts the transaction in undo-list.
All the transactions in the undo-list are then undone and their logs are removed.
All the transactions in the redo-list are redone using their log records, and their
logs are retained.
Example:
(Start, T1)
(Write, T1, B, 2, 3)
(Start, T2)
(Commit, T1)
(Write, T2, C, 5, 7)
(Commit, T2)
(Checkpoint, {T2})
(Start, T3)
(Write, T3, A, 1, 9)
(Commit, T3)
(Start, T4)
(Write, T4, C, 7, 2)
In the above example the undo-list is T4 and the redo-list is T2, T3.
First the undo-list (T4) is processed, and then the transactions in the redo-list
(T2, T3) are redone.
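Deriving the undo-list and redo-list from the example log can be sketched as follows (the tuple encoding of log records, with the checkpoint carrying its transaction list, is an assumption):

```python
def classify(log):
    """Split transactions into a redo-list (committed) and an undo-list
    (incomplete), considering only transactions active at the last
    checkpoint or started after it."""
    candidates, after = set(), []
    for rec in log:
        if rec[0] == "checkpoint":
            candidates = set(rec[1])   # transactions listed at the checkpoint
            after = []                 # only records after it matter
        else:
            after.append(rec)
    for rec in after:
        if rec[0] == "start":
            candidates.add(rec[1])
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    undo = sorted(t for t in candidates if t not in committed)
    redo = sorted(t for t in candidates if t in committed)
    return undo, redo

log = [("start","T1"), ("write","T1","B",2,3), ("start","T2"),
       ("commit","T1"), ("write","T2","C",5,7), ("commit","T2"),
       ("checkpoint", ["T2"]),
       ("start","T3"), ("write","T3","A",1,9), ("commit","T3"),
       ("start","T4"), ("write","T4","C",7,2)]
undo, redo = classify(log)
print(undo, redo)   # ['T4'] ['T2', 'T3'], matching the example
```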
Deadlock Handling
Deadlock is a state of a database system in which two or more transactions are
each waiting for a data item that is locked by another of the transactions.
A deadlock can be indicated by a cycle in the wait-for-graph. This is a directed
graph in which the vertices denote transactions and the edges denote waits for
data items.

Example
Transaction T1 is waiting for data item X which is locked by T3. T3 is waiting for
Y which is locked by T2 and T2 is waiting for Z which is locked by T1. Hence, a
waiting cycle is formed, and none of the transactions can proceed executing.

[Wait-for graph: T1 → T3 (waiting for X), T3 → T2 (waiting for Y), T2 → T1 (waiting for Z).]
There are three classical approaches for deadlock handling, namely
❖ Deadlock prevention.
❖ Deadlock avoidance.
❖ Deadlock detection and removal.
All of the three approaches can be incorporated in both a centralized and a
distributed database system.
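Detecting a deadlock in the wait-for graph reduces to cycle detection. A Python sketch using the edges from the example above (T1 waits for T3, T3 for T2, T2 for T1); the dictionary encoding of the graph is an assumption:

```python
def has_cycle(wait_for):
    """Detect a cycle in a wait-for graph given as
    {txn: list of txns it is waiting for}."""
    visited, on_stack = set(), set()
    def dfs(node):
        visited.add(node)
        on_stack.add(node)
        for nxt in wait_for.get(node, []):
            if nxt in on_stack:
                return True            # back edge -> cycle -> deadlock
            if nxt not in visited and dfs(nxt):
                return True
        on_stack.discard(node)
        return False
    return any(dfs(t) for t in wait_for if t not in visited)

# T1 waits for T3 (item X), T3 for T2 (Y), T2 for T1 (Z).
graph = {"T1": ["T3"], "T3": ["T2"], "T2": ["T1"]}
print(has_cycle(graph))   # True: the system is deadlocked
```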

Deadlock Prevention
 The deadlock prevention approach does not allow any transaction to acquire
locks that will lead to deadlock.
 The convention is that when more than one transaction requests a lock on the
same data item, only one of them is granted the lock.
 One of the most popular deadlock prevention methods is pre-acquisition of all
the locks.
 In this method, a transaction acquires all the locks before starting to execute
and retains the locks for the entire duration of transaction.
 Using this approach, the system is prevented from being deadlocked since
none of the waiting transactions are holding any lock.
There are two algorithms for this purpose, namely wait-die and wound-wait.
Let us assume that there are two transactions, T1 and T2, where T1 tries to lock a
data item which is already locked by T2. The algorithms are as follows
Wait-Die
If T1 is older than T2, T1 is allowed to wait. Otherwise, if T1 is younger than T2,
T1 is aborted and later restarted.

Wound-Wait
If T1 is older than T2, T2 is aborted and later restarted. Otherwise, if T1 is
younger than T2, T1 is allowed to wait.
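The two rules can be sketched as a single decision function. A smaller timestamp means an older transaction; the function and return labels are illustrative assumptions:

```python
def resolve(scheme, ts_requester, ts_holder):
    """Decide what happens when a transaction requests a lock that is
    held by another; smaller timestamp = older transaction."""
    requester_older = ts_requester < ts_holder
    if scheme == "wait-die":
        # An old requester waits; a young requester dies (is aborted).
        return "wait" if requester_older else "abort-requester"
    if scheme == "wound-wait":
        # An old requester wounds (aborts) the holder; a young one waits.
        return "abort-holder" if requester_older else "wait"
    raise ValueError("unknown scheme")

print(resolve("wait-die", 1, 2))     # older requester waits
print(resolve("wound-wait", 1, 2))   # older requester wounds the holder
```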

Deadlock Avoidance
✓ The deadlock avoidance approach handles deadlocks before they occur. It
analyzes the transactions and the locks to determine whether waiting will lead
to a deadlock.
✓ When a transaction requests a lock on a data item, the lock manager checks
whether the lock is available. If it is available, the lock manager allocates the
data item and the transaction acquires the lock.
✓ If the item is locked by some other transaction in an incompatible mode, the lock
manager runs an algorithm to test whether keeping the transaction in a waiting
state will cause a deadlock.
Deadlock Detection and Removal
 The deadlock detection and removal approach runs a deadlock detection
algorithm periodically and removes deadlock in case there is one.

 When a transaction requests a lock, the lock manager checks whether it is


available. If it is available, the transaction is allowed to lock the data item;
otherwise the transaction is allowed to wait.

 To detect deadlocks, the lock manager periodically checks if the wait-for graph
has cycles.

 If the system is deadlocked, the lock manager chooses a victim transaction


from each cycle. The victim is aborted and rolled back; and then restarted later.

Some of the methods used for victim selection are


➢ Choose the youngest transaction.
➢ Choose the transaction with fewest data items.
➢ Choose the transaction that has performed least number of updates.
➢ Choose the transaction having least restart overhead.
➢ Choose the transaction which is common to two or more cycles.
Concurrency Control
In a multiprogramming environment where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions.
We have concurrency control protocols to ensure atomicity, isolation, and
serializability of concurrent transactions. Concurrency control protocols can be
broadly divided into two categories
 Lock based protocols
 Time stamp based protocols

Lock based protocol


This protocol requires that all the data items be accessed in a mutually
exclusive manner, i.e., while one transaction holds a lock on a data item, no other
transaction may access it in a conflicting manner.

To implement Lock based protocol we need 2 types of locks.

Shared Lock: A transaction holding a shared lock on an item can read it but cannot write it.
Exclusive Lock: A transaction holding an exclusive lock on an item can both read and write it.
As the lock compatibility table (omitted here) shows, shared locks are compatible
with other shared locks, while an exclusive lock is incompatible with any other lock
on the same item.
In general we have 2 types of locking protocols. Those are
❖ Simple locking protocol
❖ 2 – Phase locking Protocol

Simple locking protocol


In this simply we have only two operations.
1. Lock the item before access that item.
Read operation (Lock-S (A))
Write operation & Read operation (Lock- X (A))
2. Unlock the item after access
Unlock-S (A)
Unlock-X (A)
Problem 1: Inconsistency in the database
The interleaving shown above (figure omitted) illustrates the data flow in the
following example.
Output:
Say at the start of the schedule A = 100 and B = 200, so the total should always be
300. But after Display (A+B) we have A = 50 and B = 200, a total of 250 instead of
the 300 required for consistency.

Problem 2: Deadlock Problem


Let’s say T2 is waiting for T1 to unlock (A) and T1 is waiting for T2 to unlock (B)
Two-Phase Locking protocol
A transaction is said to follow the Two-Phase Locking protocol if its locking and
unlocking are done in two phases.
1. Growing Phase: New locks on data items may be acquired but none can be
released.
2. Shrinking Phase: Existing locks may be released but no new locks can be
acquired.
LOCK POINT: The Point at which the growing phase ends, i.e., when a
transaction takes the final lock it needs to carry on its work
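A sequence of lock/unlock operations can be checked for the two-phase property mechanically: once the first unlock occurs (the shrinking phase begins), no further lock may be acquired. A minimal sketch; the tuple encoding of the operations is an assumption:

```python
def follows_2pl(ops):
    """ops: list of ('lock'|'unlock', item) for one transaction.
    Two-phase locking holds iff no lock is acquired after the first
    unlock (the lock point ends the growing phase)."""
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True           # shrinking phase has begun
        elif shrinking:
            return False               # a lock during the shrinking phase
    return True

print(follows_2pl([("lock", "A"), ("lock", "B"),
                   ("unlock", "A"), ("unlock", "B")]))   # True
print(follows_2pl([("lock", "A"), ("unlock", "A"),
                   ("lock", "B")]))                      # False
```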
2 PL suffers from other problems such as cascading rollback and deadlock.
Cascading Rollback Problem:
Failure of one transaction leads to the failure of subsequent transactions that
depend on it. Cascading rollback can be seen in the following example

So if transaction T1 rollbacks then transaction T2 and T3 has to rollback.


Deadlock Problem Example

In 2 PL we have seen:
Serializability: It is guaranteed to happen.
Cascading Rollback: It is possible which is bad.
Deadlock: It is possible.

Strict Two-phase locking (Strict-2PL)


The first phase of Strict-2PL is like 2PL.
✓ In the first phase, the transaction acquires locks and continues to
execute normally.
✓ The difference between 2PL and Strict-2PL is that Strict-2PL does not
release an exclusive lock immediately after using it.
✓ Strict-2PL waits until the whole transaction commits, and then releases all
the exclusive locks at once.
✓ The Strict-2PL protocol therefore has no gradual shrinking phase of lock release.
Strict 2-PL ensures that our schedule is:
✓ Recoverable
✓ Cascadeless
Rigorous 2-PL
✓ In the first phase, the transaction acquires locks and continues to
execute normally.
✓ The difference between 2PL and Rigorous 2PL is that Rigorous 2PL does
not release any lock after using it.
✓ Rigorous 2PL waits until the whole transaction commits, and then releases
all the exclusive locks and shared locks at once.
Rigorous 2-PL ensures that our schedule is:
✓ Recoverable
✓ Cascadeless

The difference between


Strict 2-PL and Rigorous 2-PL differ in that Rigorous is more restrictive: it requires
both exclusive and shared locks to be held until after the transaction commits, and
this is what makes Rigorous 2-PL easier to implement.
Conservative 2-PL
✓ This protocol requires the transaction to lock all the items it accesses before the
transaction begins execution, by predeclaring its read-set and write-set.
✓ If any of the required locks are not available, the transaction waits until it gets
them.
✓ Conservative 2-PL is Deadlock free, but it does not ensure a Strict schedule.
✓ It is difficult to use in practice because of the need to predeclare the read-set
and the write-set which is not possible in many situations.

The Venn diagram (omitted) shows the classification of schedules that are rigorous,
strict, and conservative.
Example 1:

Now we will see which of the properties discussed above the schedule follows.

2 PL: There is a growing and a shrinking phase, so it is 2 PL.

Strict 2 PL: Lock-X (B) is unlocked before the commit, so it is not strict 2 PL.
Rigorous 2 PL: If it is not strict 2 PL then it cannot be rigorous.
Conservative 2 PL: All the locks are not acquired before the transaction starts, so it is not conservative.
Example 2:

2 PL: There is a growing and a shrinking phase, so it is 2 PL.

Strict 2 PL: Exclusive locks are unlocked after the commit, so it is strict.
Rigorous: All the locks are unlocked after the commit, so it is rigorous.
Conservative: All the locks were not taken before the transaction started, so it is not
conservative.
Example 3:

2 PL: There is a growing and a shrinking phase, so it is 2 PL.

Strict 2 PL: Exclusive locks are unlocked after the commit, so it is strict.
Rigorous: All the locks are unlocked after the commit, so it is rigorous.
Conservative: All the locks were taken before the transaction started, so it is
conservative.
Time Stamp Protocol
✓ The Timestamp Ordering Protocol is used to order the transactions based on
their Timestamps.
✓ An older transaction has higher priority, so it executes first. To
determine the timestamp of a transaction, this protocol uses the system time or
a logical counter.
✓ The lock-based protocol is used to manage the order between conflicting pairs
among transactions at the execution time. But Timestamp based protocols start
working as soon as a transaction is created. We use two Timestamp Values
relating to each database item X.
✓ W_TS(X) is the largest timestamp of any transaction that
executed write(X) successfully.
✓ R_TS(X) is the largest timestamp of any transaction that
executed read(X) successfully.

Basic Timestamp ordering protocol works as follows:


Check the following conditions whenever a transaction Ti issues a Read
(X) operation:
✓ If W_TS(X) > TS(Ti) then the operation is rejected and Ti is rolled back.
✓ If W_TS(X) <= TS(Ti) then the operation is executed and R_TS(X) is set to
max(R_TS(X), TS(Ti)).
Check the following conditions whenever a transaction Ti issues
a Write(X) operation:
✓ If R_TS(X) > TS(Ti) then the operation is rejected and Ti is rolled back.
✓ If W_TS(X) > TS(Ti) then the operation is rejected and Ti is rolled back;
otherwise the operation is executed and W_TS(X) is set to TS(Ti).
Where,
TS(Ti) denotes the timestamp of the transaction Ti.
R_TS(X) denotes the Read time-stamp of data-item X.
W_TS(X) denotes the Write time-stamp of data-item X.
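The read and write checks can be sketched in Python. The dictionary holding R_TS(X) and W_TS(X) is an assumption, and the timestamp updates follow the standard basic TO protocol:

```python
def to_read(ts, x):
    """Basic timestamp-ordering check for read(X) by a transaction
    with timestamp ts; x holds {'R': R_TS(X), 'W': W_TS(X)}."""
    if x["W"] > ts:
        return "rollback"            # X was overwritten by a younger txn
    x["R"] = max(x["R"], ts)
    return "execute"

def to_write(ts, x):
    """Basic timestamp-ordering check for write(X)."""
    if x["R"] > ts or x["W"] > ts:
        return "rollback"            # a younger txn already read/wrote X
    x["W"] = ts
    return "execute"

x = {"R": 0, "W": 0}
print(to_read(5, x))    # execute; R_TS(X) becomes 5
print(to_write(3, x))   # rollback, since R_TS(X) = 5 > 3
```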

Advantages and Disadvantages of TO protocol:


✓ Timestamp Ordering protocol ensures serializability
✓ Timestamp protocol ensures freedom from deadlock as no transaction ever
waits.
✓ But the schedule may not be cascade free, and may not even be
recoverable.
Thomas Write Rule
The Timestamp Ordering Protocol states that if Ri(X) and Wj(X) are conflicting
operations then Ri(X) is processed before Wj(X) if and only if TS(Ti) < TS(Tj).
Whenever a schedule does not follow this order, the operation is rejected and
the transaction rolled back.
✓ The Thomas Write Rule is a modification of the Basic Timestamp Ordering
protocol that allows some of these operations: outdated writes are simply
ignored.
✓ The Thomas Write Rule does not enforce conflict serializability but rejects fewer
write operations by modifying the checks for the write(X) operation:

1. If R_TS(X) > TS(T), then abort and roll back T and reject the operation.
2. If W_TS(X) > TS(T), then don’t execute the Write Operation and continue
processing. This is a case of Outdated or Obsolete Writes. Remember, outdated
writes are ignored in Thomas Write Rule but a Transaction following Basic TO
protocol will abort such a Transaction.
3. If neither the condition in 1 or 2 occurs, then and only then execute the
W_item(X) operation of T and set W_TS(X) to TS(T)
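The three checks can be sketched as a small modification of the basic TO write check (the dictionary encoding of R_TS(X) and W_TS(X) is an assumption):

```python
def thomas_write(ts, x):
    """Thomas write rule for write(X): obsolete writes are ignored
    instead of rolling the transaction back."""
    if x["R"] > ts:
        return "rollback"     # 1. a younger transaction already read X
    if x["W"] > ts:
        return "ignore"       # 2. outdated write: skip, keep processing
    x["W"] = ts               # 3. otherwise execute and set W_TS(X)
    return "execute"

x = {"R": 0, "W": 10}
print(thomas_write(5, x))    # ignore: W_TS(X) = 10 > 5, obsolete write
```

Under basic TO the same write by the transaction with timestamp 5 would have caused a rollback; Thomas's rule simply drops it.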
Validation Based Protocol (Optimistic Concurrency)
The validation-based protocol executes a transaction in three phases:

Read phase
In this phase, the transaction T reads the values of the various data items and
stores them in temporary local variables. It performs all its write operations on
these temporary variables, without updating the actual database.

Validation phase
In this phase, the temporary variable values are validated against the actual data
to see whether serializability would be violated.

Write phase
If the transaction passes validation, the temporary results are written to the
database; otherwise, the transaction is rolled back.
Here each phase has the following different timestamps:
Start(Ti): It contains the time when Ti started its execution.
Validation (Ti): It contains the time when Ti finishes its read phase and starts its
validation phase.
Finish(Ti): It contains the time when Ti finishes its write phase.

✓ This protocol uses the timestamp of the validation phase as the transaction's
timestamp for serialization, since validation is the phase that actually
decides whether the transaction commits or rolls back.
✓ Hence TS(T) = Validation(T).
✓ Serializability is determined during the validation process; it cannot be
decided in advance.
✓ While executing transactions, the protocol allows a greater degree of
concurrency with fewer conflicts.
✓ As a result, transactions incur fewer rollbacks.
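The validation check itself can be sketched as follows. The class and function names are illustrative assumptions: a transaction Ti is validated against every transaction Tj that validated earlier, and may enter its write phase only if Tj finished before Ti started, or Tj finished before Ti's validation and wrote nothing that Ti read.

```python
# Sketch of optimistic (validation-based) concurrency control.
class Txn:
    def __init__(self, start):
        self.start = start
        self.validation = None   # set when the read phase ends
        self.finish = None       # set when the write phase ends
        self.read_set = set()
        self.write_set = set()

def validate(ti, older):
    """Validate ti against earlier-validated transactions; return True
    if ti may enter its write phase, False if it must roll back."""
    for tj in older:
        if tj.finish is not None and tj.finish < ti.start:
            continue  # tj finished before ti started: no conflict possible
        if (tj.finish is not None and tj.finish < ti.validation
                and not (tj.write_set & ti.read_set)):
            continue  # ti never read anything tj wrote
        return False  # possible serializability violation
    return True
```

For instance, if T1 wrote item X and finished while T3 (which read X) was still in its read phase, T3 fails validation and is rolled back.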
Multiple Granularity
Granularity: the size of the data item that is allowed to be locked.
✓ Multiple Granularity can be defined as hierarchically breaking up the database
into blocks which can be locked.
✓ The Multiple Granularity protocol enhances concurrency and reduces lock
overhead.
✓ It keeps track of what to lock and how to lock it.
✓ It makes it easy to decide whether to lock or unlock a data item. This type of
hierarchy can be represented graphically as a tree.
Example
✓ Consider a tree which has four levels of nodes.
✓ The first (topmost) level represents the entire database.
✓ The second level consists of nodes of type area; the database consists of
exactly these areas.
✓ Each area has child nodes known as files. No file can be present in more than
one area.
✓ Finally, each file has child nodes known as records. A file contains exactly
those records that are its children, and no record appears in more than one
file.
Hence, the levels of the tree starting from the top level are as follows:
1. Database
2. Area
3. File
4. Record
There are three additional lock modes with multiple granularity:
Intention Lock Modes
Intention-Shared (IS): indicates explicit locking at a lower level of the tree,
but only with shared locks.
Intention-Exclusive (IX): indicates explicit locking at a lower level with
exclusive or shared locks.
Shared & Intention-Exclusive (SIX): the node itself is locked in shared mode,
and some nodes below it are locked in exclusive mode by the same transaction.

Compatibility Matrix
The protocol uses the intention lock modes to ensure serializability. If a
transaction attempts to lock a node, it must follow these rules:

✓ Transaction T1 should follow the lock-compatibility matrix.


✓ Transaction T1 must first lock the root of the tree. It can lock it in any mode.
✓ T1 can lock a node in S or IS mode only if it currently has the parent of that
node locked in IS or IX mode.
✓ T1 can lock a node in X, SIX, or IX mode only if it currently has the parent of
that node locked in IX or SIX mode.
✓ T1 can lock a node only if it has not previously unlocked any node (the
protocol is two-phase).
✓ T1 can unlock a node only if it currently has none of that node's children
locked.

In multiple-granularity, the locks are acquired in top-down order, and locks must
be released in bottom-up order.
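The lock-compatibility matrix referred to above can be written out explicitly. The nested-dictionary encoding below is just one convenient representation; `True` means two different transactions may hold the two modes on the same node at the same time.

```python
# Multiple-granularity lock-compatibility matrix (symmetric).
MODES = ["IS", "IX", "S", "SIX", "X"]
COMPAT = {
    "IS":  {"IS": True,  "IX": True,  "S": True,  "SIX": True,  "X": False},
    "IX":  {"IS": True,  "IX": True,  "S": False, "SIX": False, "X": False},
    "S":   {"IS": True,  "IX": False, "S": True,  "SIX": False, "X": False},
    "SIX": {"IS": True,  "IX": False, "S": False, "SIX": False, "X": False},
    "X":   {"IS": False, "IX": False, "S": False, "SIX": False, "X": False},
}

def compatible(held, requested):
    """Can `requested` be granted while another txn holds `held`?"""
    return COMPAT[held][requested]
```

For example, two transactions may both hold IX on the database node (they intend to lock different records exclusively), but X is incompatible with everything.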
Algorithm for Recovery and Isolation Exploiting Semantics (ARIES)

ARIES algorithm follows three principles


write-ahead logging (WAL)
✓ It provides atomicity and durability (two of the ACID properties) in database
systems.
✓ WAL requires that changes first be recorded in the log, and the log written to
stable storage, before the changes are applied to the database.
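The WAL rule can be illustrated with a toy update function. The in-memory `log` list and `database` dict below are deliberate simplifications standing in for the stable log file and the disk pages:

```python
# Sketch of write-ahead logging: the log record (with undo and redo
# information) is appended and forced to stable storage BEFORE the
# corresponding change is applied to the database.

log = []          # stands in for the stable log file
database = {}     # stands in for pages on disk

def wal_update(txn_id, key, new_value):
    old_value = database.get(key)
    # 1. Write the log record first (old value for undo, new for redo);
    #    a real system would now force it to stable storage.
    log.append({"txn": txn_id, "key": key, "old": old_value, "new": new_value})
    # 2. Only then apply the change to the database.
    database[key] = new_value
```

Because the log reaches stable storage first, a crash between the two steps loses no information: recovery can redo (or undo) the change from the log record.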

Repeating history during Redo


On restart after a crash, ARIES retraces the actions of the database before the
crash and brings the system back to the exact state it was in before the crash.

Logging changes during Undo


Changes made to the database while undoing transactions are themselves logged,
to ensure that such an action is not repeated in the event of repeated restarts.
Log sequence number
✓ A log sequence number (LSN) is a pointer used to uniquely identify log records.

Dirty page table


✓ A dirty page is a page whose updated version is in main memory while the
version on disk has not yet been updated.
✓ A dirty page table is maintained, which is useful in reducing unnecessary redo
operations.

The ARIES recovery procedure consists of three main steps:

Analysis
✓ The analysis step identifies the dirty (updated) pages in the buffer and the set of
transactions active at the time of the crash.
✓ The appropriate point in the log where the REDO operation should start is also
determined

REDO

✓ The REDO phase reapplies updates from the log to the database; following the
repeating-history principle, even the actions of uncommitted transactions are
redone, and those transactions are then rolled back in the UNDO phase.
✓ ARIES will have information which provides the start point for REDO
✓ Information stored by ARIES and in the data pages will allow ARIES to determine
whether the operation to be redone had been applied to the database.
✓ Thus only the necessary REDO operations are applied during recovery.
UNDO
✓ During the UNDO phase, the log is scanned backwards and the operations of
transactions that were active at the time of the crash are undone in reverse
order.
✓ The information needed for ARIES to accomplish its recovery procedure
includes the log, the Transaction Table, and the Dirty Page Table.
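The three passes can be sketched end to end on a toy log. This is a drastic simplification of real ARIES, which tracks LSNs and pageLSNs, writes compensation log records during UNDO, and starts analysis from a checkpoint rather than the beginning of the log:

```python
# Highly simplified sketch of the three ARIES recovery passes.
# Each log record is a dict; "update" records carry old/new values.

def recover(log):
    # ANALYSIS: find transactions active at the crash (losers) and
    # dirty pages with the index of their first update (recLSN).
    active, dirty = set(), {}
    for i, rec in enumerate(log):
        if rec["type"] == "update":
            active.add(rec["txn"])
            dirty.setdefault(rec["page"], i)
        elif rec["type"] == "commit":
            active.discard(rec["txn"])
    redo_start = min(dirty.values()) if dirty else len(log)

    # REDO: repeat history forward from the earliest recLSN.
    db = {}
    for rec in log[redo_start:]:
        if rec["type"] == "update":
            db[rec["page"]] = rec["new"]

    # UNDO: roll back losers in reverse log order using the old values.
    for rec in reversed(log):
        if rec["type"] == "update" and rec["txn"] in active:
            db[rec["page"]] = rec["old"]
    return db
```

In the test below, T1 committed before the crash while T2 did not, so T1's update survives recovery and T2's is undone.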
