Suppose the first operation succeeds while the second fails. In this case A's
balance would be Rs. 300 while B would still have Rs. 700 instead of Rs. 800.
This is unacceptable in a banking system. Either the transaction should fail without
executing any of its operations, or it should execute both of them. The
Atomicity property ensures that.
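The all-or-nothing behaviour can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table name `account`, the helper `transfer`, and the `fail_midway` flag are assumptions introduced here, and the starting balances (A = 400, B = 700) are the ones implied by the figures above.

```python
import sqlite3

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one atomic unit (hypothetical helper)."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            if fail_midway:
                raise RuntimeError("simulated failure after the debit")
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
        return True
    except RuntimeError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 400), ("B", 700)])
conn.commit()

failed = transfer(conn, "A", "B", 100, fail_midway=True)
after_failure = balances(conn)   # the debit was rolled back
ok = transfer(conn, "A", "B", 100)
after_success = balances(conn)   # both operations were applied
```

After the simulated failure the balances are unchanged; only the complete transfer leaves A at 300 and B at 800.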
Consistency (C)
Integrity constraints must be maintained so that the database is consistent
before and after the transaction. To preserve the consistency of the database, each
transaction should execute as if in isolation (that is, as if no other
transaction were running concurrently with it).
Example
Account A has a balance of 400 and is transferring 100 to each of accounts B and C.
We have two transactions here. Suppose these transactions run concurrently
and both read a balance of 400. In that case the final balance of A
would be 300 instead of 200.
This is wrong. If the transactions were to run in isolation, the second
transaction would have read the correct balance of 300 (before debiting 100) once
the first transaction had completed successfully.
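The lost update in this example can be reproduced in a few lines. The function names are hypothetical; each function simply fixes one order of the two transactions' reads and writes on A.

```python
def run_interleaved(balance):
    # Both transactions read before either one writes (no isolation).
    read_1 = balance
    read_2 = balance
    balance = read_1 - 100   # transfer to B
    balance = read_2 - 100   # transfer to C overwrites the first debit
    return balance

def run_isolated(balance):
    # The second transaction reads only after the first has written.
    read_1 = balance
    balance = read_1 - 100
    read_2 = balance
    balance = read_2 - 100
    return balance

lost_update = run_interleaved(400)   # 300
correct = run_isolated(400)          # 200
```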
Isolation (I)
Even though multiple transactions may execute concurrently, the system
guarantees that, for every pair of transactions Ti and Tj , it appears to Ti that either
Tj finished execution before Ti started, or Tj started execution after Ti finished.
Thus, each transaction is unaware of other transactions executing concurrently in
the system.
Durability (D)
Once a transaction completes successfully, the changes it has made to the
database should be permanent even if there is a system failure. The recovery
management component of the database system ensures the durability of transactions.
Transaction States
Any transaction at any point of time must be in one of the following states:
Active: The initial state; the transaction stays in this state while it is executing.
Partially committed: After the final statement has been executed.
Failed: After the discovery that normal execution can no longer proceed.
Aborted: After the transaction has been rolled back and the database has been
restored to its state prior to the start of the transaction.
Committed: After successful completion.
T1: read(A);
A := A − 50;
write(A);
read(B);
B := B + 50;
write(B).
Transaction T2 transfers 10 percent of the balance from account A to account B. It
is defined as
T2: read(A);
temp := A * 0.1;
A := A − temp;
write(A);
read(B);
B := B + temp;
write(B).
Suppose the current values of accounts A and B are $1000 and $2000, respectively.
Schedule 2: Concurrent schedule
Serial Schedules
A serial schedule is one in which the transactions are ordered so that one transaction
is executed completely first. When the first transaction completes, the next
transaction is executed, and so on. The operations of different transactions are
never interleaved, which is why the schedule is called serial.
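As a small sketch, the two serial schedules of T1 and T2 defined earlier can be run on the starting values A = 1000 and B = 2000. The dictionary `db` and the helper `run_serial` are assumptions made for illustration.

```python
def t1(db):
    """T1: transfer 50 from A to B."""
    db["A"] = db["A"] - 50
    db["B"] = db["B"] + 50

def t2(db):
    """T2: transfer 10 percent of A's balance to B."""
    temp = db["A"] * 0.1
    db["A"] = db["A"] - temp
    db["B"] = db["B"] + temp

def run_serial(order, initial):
    db = dict(initial)          # work on a copy of the starting state
    for txn in order:           # one whole transaction after another
        txn(db)
    return db

start = {"A": 1000, "B": 2000}
t1_then_t2 = run_serial([t1, t2], start)
t2_then_t1 = run_serial([t2, t1], start)
```

Running T1 then T2 gives A = 855 and B = 2145, while T2 then T1 gives A = 850 and B = 2150. The two serial orders end in different states, but both preserve the sum A + B = 3000.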
Equivalence Schedules
Two schedules can be equivalent in the following ways −
✓Result Equivalence
✓View Equivalence
✓Conflict Equivalence
Result Equivalence:
If two schedules produce the same result after execution, they are said to be
result equivalent. They may yield the same result for some value and different
results for another set of values. That's why this equivalence is not generally
considered significant.
View Equivalence
Two schedules are view equivalent if the transactions in both
schedules perform the same reads and the same final writes.
For example −
If Ti reads the initial data in S1, then it also reads the initial data in S2.
If Ti reads the value written by Tj in S1, then it also reads the value written by Tj
in S2.
If Ti performs the final write on the data value in S1, then it also performs the
final write on the data value in S2.
Conflict Equivalence
Two operations are conflicting if they have the following properties −
✓ Both belong to separate transactions.
✓ Both access the same data item.
✓ At least one of them is a "write" operation.
Two schedules over the same operations are conflict equivalent if every pair of
conflicting operations appears in the same order in both schedules.
Example:
The pair (R1(A), W2(A)) is conflicting because the operations belong to two
different transactions, act on the same data item A, and one of them is a write.
Similarly, the pairs
(W1(A), W2(A)) and (W1(A), R2(A)) are also conflicting.
On the other hand, the pair (R1(A), W2(B)) is non-conflicting because the
operations act on different data items.
Similarly, the pair (W1(A), W2(B)) is non-conflicting.
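The three properties translate directly into a predicate. The `Op` record below is an assumed representation in which R1(A) becomes `Op("R", 1, "A")`.

```python
from typing import NamedTuple

class Op(NamedTuple):
    kind: str   # "R" for read, "W" for write
    txn: int    # transaction number
    item: str   # data item

def conflicts(p: Op, q: Op) -> bool:
    """True when p and q satisfy all three conflict properties."""
    return (p.txn != q.txn                  # separate transactions
            and p.item == q.item            # same data item
            and "W" in (p.kind, q.kind))    # at least one write

r1a, w1a = Op("R", 1, "A"), Op("W", 1, "A")
r2a, w2a, w2b = Op("R", 2, "A"), Op("W", 2, "A"), Op("W", 2, "B")
```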
Note − A schedule that is view equivalent to a serial schedule is view serializable,
and a schedule that is conflict equivalent to a serial schedule is conflict
serializable. All conflict serializable schedules are view serializable too.
Serializability
A schedule is said to be serializable if it is equivalent to some
serial schedule.
Conflict Serializable Schedules
A schedule S is said to be conflict serializable if it is conflict equivalent
to some serial schedule.
Here S and S12 (the serial schedule <T1 T2>) do not follow the same order for
conflicting operation pairs.
Now compare S with <T2 T1> for conflict equivalence.
Here S and S21 (the serial schedule <T2 T1>) do not follow the same order for
conflicting operation pairs.
So, the schedule S examined in this comparison is not a conflict serializable schedule.
S: R1(A)W1(A)R2(A)W2(A)R1(B)W1(B)R2(B)W2(B)
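One common way to test conflict serializability, not described in the text above, is a precedence graph: add an edge Ti → Tj whenever an operation of Ti conflicts with a later operation of Tj, and check for a cycle. The sketch below assumes schedules written in the compact form used on the line above. The second schedule, `bad`, is a hypothetical interleaving added for contrast.

```python
import re

def parse(schedule):
    """Turn 'R1(A)W2(B)' into [('R', 1, 'A'), ('W', 2, 'B')]."""
    return [(k, int(t), x) for k, t, x in re.findall(r"([RW])(\d+)\((\w+)\)", schedule)]

def precedence_edges(ops):
    edges = set()
    for i, (k1, t1, x1) in enumerate(ops):
        for k2, t2, x2 in ops[i + 1:]:
            if t1 != t2 and x1 == x2 and "W" in (k1, k2):
                edges.add((t1, t2))     # t1's operation precedes a conflicting one of t2
    return edges

def has_cycle(edges):
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    visiting, done = set(), set()

    def dfs(u):
        visiting.add(u)
        for v in graph[u]:
            if v in visiting or (v not in done and dfs(v)):
                return True
        visiting.discard(u)
        done.add(u)
        return False

    return any(dfs(u) for u in graph if u not in done)

def is_conflict_serializable(schedule):
    return not has_cycle(precedence_edges(parse(schedule)))

s = "R1(A)W1(A)R2(A)W2(A)R1(B)W1(B)R2(B)W2(B)"
bad = "R1(A)R2(A)W1(A)W2(A)"   # hypothetical schedule, not from the text
edges_s = precedence_edges(parse(s))
```

For the schedule S written on the line above, every conflicting pair has T1's operation first, so the only edge is T1 → T2 and that schedule is conflict equivalent to <T1 T2>. The hypothetical schedule `bad` produces edges in both directions and is therefore not conflict serializable.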
View Serializable Schedules
A schedule S is said to be view serializable if it is view equivalent to
some serial schedule.
Step-1: Generate all possible serial schedules for the given schedule. If a schedule
contains n transactions, then the number of possible serial schedules is n!
Step-2: Compare each serial schedule with the given schedule for view
equivalence. If any serial schedule is view equivalent to the given
schedule, then the given schedule is view serializable; otherwise it is not
view serializable.
Example:
Check whether the following schedule is view serializable.
S: R2 (B); R2 (A); R1 (A); R3 (A); W1 (B); W2 (B); W3 (B);
Solution: With 3 transactions, the total number of possible serial schedules = 3! = 6
<T1 T2 T3>
<T1 T3 T2>
<T2 T3 T1>
<T2 T1 T3>
<T3 T1 T2>
<T3 T2 T1>
Step 1: Initial Read
A: T2
B: T2
T2 reads the initial value of B, so T2 must come before T1 and T3, which both write B.
Remaining Schedules:
<T2 T1 T3>
<T2 T3 T1>
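The two steps can be sketched as a brute-force check. The representation is an assumption: a schedule's "view" is taken to be the set of reads-from facts plus the final writer of each item, and two schedules are treated as view equivalent when these match.

```python
import re
from collections import Counter
from itertools import permutations

def parse(schedule):
    return [(k, int(t), x)
            for k, t, x in re.findall(r"([RW])(\d+)\s*\((\w+)\)", schedule)]

def view_signature(ops):
    last_writer = {}     # item -> transaction of the most recent write
    seen = Counter()     # counts repeated reads of an item by one transaction
    reads = set()
    for kind, txn, item in ops:
        if kind == "R":
            seen[(txn, item)] += 1
            # None as the source means the initial value was read.
            reads.add((txn, item, seen[(txn, item)], last_writer.get(item)))
        else:
            last_writer[item] = txn
    return reads, last_writer    # last_writer now holds the final writes

def serial(ops, order):
    return [op for t in order for op in ops if op[1] == t]

def view_equivalent_serial_orders(schedule):
    ops = parse(schedule)
    txns = sorted({t for _, t, _ in ops})
    target = view_signature(ops)
    return [order for order in permutations(txns)
            if view_signature(serial(ops, order)) == target]

s = "R2 (B); R2 (A); R1 (A); R3 (A); W1 (B); W2 (B); W3 (B);"
orders = view_equivalent_serial_orders(s)
```

For the schedule in this example the check leaves a single serial order, <T2 T1 T3>. T2 must run first because it reads the initial value of B, and T3 must run last because it performs the final write on B. So the schedule is view serializable.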
Failure Classification
There are various types of failure that may occur in a system, each of which needs
to be dealt with in a different manner. The failures are classified as follows:
✓ Transaction failure.
✓ System crash.
✓ Disk failure.
Transaction failure
There are two types of errors that may cause a transaction to fail:
Logical error
The transaction can no longer continue with its normal execution because of some
internal condition, such as bad input, data not found, overflow, or resource limit
exceeded.
System error
The system has entered an undesirable state (for example, deadlock), as a result of
which a transaction cannot continue with its normal execution. The transaction,
however, can be executed later.
System crash
There is a hardware malfunction, or a bug in the database software or the
operating system, that causes the loss of the content of volatile storage and brings
transaction processing to a halt. The content of non-volatile storage remains intact
and is not corrupted.
Disk failure
A disk block loses its content as a result of either a head crash or a failure during
a data transfer operation.
To determine how the system should recover from failures, we need to identify
the failure modes of those devices used for storing data. Next, we must consider
how these failure modes affect the contents of the database.
Storage Structure
The storage structure can be divided into two categories:
Volatile storage
✓ As the name suggests, volatile storage cannot survive system crashes.
Examples:
Main memory (RAM) and cache memory are examples of volatile storage.
Non-volatile storage
✓ These memories are made to survive system crashes. They are huge in data
storage capacity, but slower in accessibility.
Examples:
Hard disks, magnetic tapes, flash memory, and non-volatile (battery backed up)
RAM.
Recovery and Atomicity
✓ When a system crashes, it may have several transactions being executed and
various files opened for them to modify the data items.
✓ The atomicity of transactions must be maintained, that is, either all
the operations are executed or none.
When a DBMS recovers from a crash, it should do the following:
❖ It should check the states of all the transactions that were being executed.
❖ A transaction may be in the middle of some operation; the DBMS must ensure
the atomicity of the transaction in this case.
❖ It should check whether the transaction can be completed now or needs to be
rolled back.
❖ No transaction should be allowed to leave the DBMS in an inconsistent state.
There are two types of techniques that can help a DBMS recover while
maintaining the atomicity of a transaction:
➢ Maintaining a log of each transaction and writing it to stable
storage before modifying the database.
➢ Maintaining shadow paging, where the changes are made in volatile
memory, and later, the actual database is updated.
Log-based Recovery
The log maintains records of the actions performed by a transaction. It is important
that log records are written before the actual modification and stored on stable
storage media, which is failsafe.
When a transaction enters the system and starts execution, it writes a log record
about it.
<Tn, Start>
When the transaction modifies an item X, it writes a log record as follows −
<Tn, X, V1, V2>
This reads: Tn has changed the value of X from V1 to V2.
When the transaction finishes, it logs −
<Tn, commit>
In the deferred modification method, all writes of a transaction are recorded in the
log and deferred until the transaction partially commits. If the transaction does not
reach partial commit, none of its writes are applied to the database.
✓ In this method a write record does not need the old value, i.e., (write, T, x, new
value).
✓ An (Abort, T) entry is never used in this method.
✓ If a system crash occurs, the recovery procedure scans the log file. For each
transaction T whose (commit, T) entry is found, it redoes all (write, T) entries
against the database. If the (commit, T) entry is not found, nothing needs to be
done for T.
✓ This method is also called the (No-undo) / redo recovery scheme.
✓ Redo must be done in the order in which the records appear in the log.
Example:
(Start,T0)
(Write,T0,a,9)
Crash (Nothing is to be done because no commit found)
(commit,T0)
(start, T1)
(write,T1,a,7)
Crash (T0 committed so redo T0)
(commit,T1)
Crash (T0, T1 are committed so redo T0,T1)
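The three crash points of this example can be replayed with a small redo-only recovery routine. This is a sketch: the tuple layout of the log and the initial value a = 0 are assumptions, since the text does not give the starting value of a.

```python
def recover_deferred(log, db):
    """No-undo/redo: replay, in log order, only the writes of committed transactions."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, item, new_value = rec
            db[item] = new_value
    return db

log = [("start", "T0"), ("write", "T0", "a", 9), ("commit", "T0"),
       ("start", "T1"), ("write", "T1", "a", 7), ("commit", "T1")]

# Each slice is the part of the log that reached stable storage before the crash.
crash_1 = recover_deferred(log[:2], {"a": 0})   # no commit found: nothing to do
crash_2 = recover_deferred(log[:5], {"a": 0})   # T0 committed: redo T0
crash_3 = recover_deferred(log[:6], {"a": 0})   # T0 and T1 committed: redo both, in order
```

Because redo follows log order, the third case ends with the value written by T1 (7) rather than the one written by T0 (9).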
Example (each write record holds the old value followed by the new value, so undo is possible):
(Start, T0)
(Write, T0, A, 3, 9)
Crash (Undo T0 so A=3)
(Commit, T0)
(Start, T1)
(Write, T1, B, 2, 5)
(Write, T1, A, 9, 7)
Crash (undo – T1 and Redo T0 so B=2, A=9)
(Commit, T1)
Crash (Redo T0, T1 so B=5, A=7)
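When write records carry both old and new values, recovery can undo uncommitted transactions as well as redo committed ones. The routine below is a minimal sketch. The tuple layout and the on-disk states passed in are assumptions, chosen to show that the outcome does not depend on which writes had reached the disk before the crash.

```python
def recover_undo_redo(log, db):
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    # Undo: scan backwards, restoring old values written by uncommitted transactions.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _, item, old, _ = rec
            db[item] = old
    # Redo: scan forwards, reapplying new values written by committed transactions.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, item, _, new = rec
            db[item] = new
    return db

log = [("start", "T0"), ("write", "T0", "A", 3, 9), ("commit", "T0"),
       ("start", "T1"), ("write", "T1", "B", 2, 5), ("write", "T1", "A", 9, 7),
       ("commit", "T1")]

crash_1 = recover_undo_redo(log[:2], {"A": 9, "B": 2})   # undo T0
crash_2 = recover_undo_redo(log[:6], {"A": 7, "B": 5})   # undo T1, redo T0
crash_3 = recover_undo_redo(log[:7], {"A": 3, "B": 2})   # redo T0 and T1
```

The three results match the annotations in the example: A = 3 after the first crash, B = 2 and A = 9 after the second, and B = 5 and A = 7 after the third.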
Recovery with concurrent Transactions
When more than one transaction is being executed in parallel, the logs are
interleaved. At the time of recovery, it would be hard for the recovery system
to backtrack through all the logs and then start recovering. To ease this situation,
most modern DBMSs use the concept of 'checkpoints'.
Checkpoint
✓ Checkpoint declares a point before which the DBMS was in consistent state,
and all the transactions were committed.
✓ Checkpoint is a mechanism where all the previous logs are removed from the
system and stored permanently in a storage disk.
Recovery
When a system with concurrent transactions crashes and recovers, it behaves in the
following manner:
✓ The recovery system reads the logs backwards from the end to the last
checkpoint.
✓ It maintains two lists, an undo-list and a redo-list.
✓ If the recovery system sees a log with <Tn, Start> and <Tn, Commit> or just
<Tn, Commit>, it puts the transaction in the redo-list.
✓ If the recovery system sees a log with <Tn, Start> but no commit or abort log,
it puts the transaction in the undo-list.
All the transactions in the undo-list are then undone and their logs are removed.
For all the transactions in the redo-list, their previous logs are removed, and the
transactions are then redone before their logs are saved.
Example:
(Start, T1)
(Write, T1, B, 2, 3)
(Start, T2)
(Commit, T1)
(Write, T2, C, 5, 7)
(Commit, T2)
(Checkpoint, {T2})
(Start, T3)
(Write, T3, A, 1, 9)
(Commit, T3)
(Start, T4)
(Write, T4, C, 7, 2)
In the above example the undo list is T4 and the redo list is T2, T3.
Recovery first undoes the transactions in the undo list (T4) and then redoes T2 and
T3, in that order.
Deadlock Handling
Deadlock is a state of a database system having two or more transactions, in which
each transaction is waiting for a data item that is locked by some other
transaction.
A deadlock can be indicated by a cycle in the wait-for graph. This is a directed
graph in which the vertices denote transactions and the edges denote waits for
data items.
Example
Transaction T1 is waiting for data item X which is locked by T3. T3 is waiting for
Y which is locked by T2 and T2 is waiting for Z which is locked by T1. Hence, a
waiting cycle is formed, and none of the transactions can proceed executing.
Figure: wait-for graph with edges T1 → T3 (waiting for X), T3 → T2 (waiting for Y),
and T2 → T1 (waiting for Z).
There are three classical approaches for deadlock handling, namely
❖ Deadlock prevention.
❖ Deadlock avoidance.
❖ Deadlock detection and removal.
All three approaches can be incorporated in both a centralized and a
distributed database system.
Deadlock Prevention
The deadlock prevention approach does not allow any transaction to acquire
locks that will lead to deadlock.
The convention is that when more than one transaction requests a lock on the
same data item, only one of them is granted the lock.
One of the most popular deadlock prevention methods is pre-acquisition of all
the locks.
In this method, a transaction acquires all its locks before starting to execute
and retains them for the entire duration of the transaction.
Using this approach, the system is prevented from being deadlocked since
none of the waiting transactions is holding any lock.
There are also two timestamp-based algorithms for this purpose, namely wait-die and
wound-wait.
Let us assume that there are two transactions, T1 and T2, where T1 tries to lock a
data item which is already locked by T2. The algorithms are as follows
Wait-Die
If T1 is older than T2, T1 is allowed to wait. Otherwise, if T1 is younger than T2,
T1 is aborted and later restarted.
Wound-Wait
If T1 is older than T2, T2 is aborted and later restarted. Otherwise, if T1 is
younger than T2, T1 is allowed to wait.
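The two rules can be written as small decision functions. The sketch assumes that age is represented by a start timestamp, with a smaller timestamp meaning an older transaction. The function names and return strings are illustrative only.

```python
def wait_die(requester_ts, holder_ts):
    """Older requester waits; younger requester is aborted (dies)."""
    return "requester waits" if requester_ts < holder_ts else "abort requester"

def wound_wait(requester_ts, holder_ts):
    """Older requester aborts (wounds) the holder; younger requester waits."""
    return "abort holder" if requester_ts < holder_ts else "requester waits"

# T1 requests an item locked by T2.
t1_older = (5, 10)     # T1 started at time 5, T2 at time 10
t1_younger = (10, 5)   # T1 started at time 10, T2 at time 5
```

In both schemes it is always the younger transaction that gets aborted, so the oldest transaction in the system is never rolled back by these rules.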
Deadlock Avoidance
✓ The deadlock avoidance approach handles deadlocks before they occur. It
analyzes the transactions and the locks to determine whether waiting leads to a
deadlock or not.
✓ When a transaction requests a lock on a data item, the lock manager checks
whether the lock is available. If it is available, the lock manager allocates the
data item and the transaction acquires the lock.
✓ If the item is locked by some other transaction in an incompatible mode, the lock
manager runs an algorithm to test whether keeping the transaction in a waiting
state will cause a deadlock or not.
Deadlock Detection and Removal
The deadlock detection and removal approach runs a deadlock detection
algorithm periodically and removes deadlock in case there is one.
To detect deadlocks, the lock manager periodically checks if the wait-for graph
has cycles.
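A cycle check on the wait-for graph can be sketched with a depth-first search. The dictionary representation (each transaction mapped to the transactions it waits for) is an assumption. The first graph is the T1, T2, T3 example given earlier.

```python
def find_cycle(wait_for):
    """Return one cycle as a list of transactions, or None if there is none."""
    state = {}   # node -> "active" while on the current DFS path, "done" afterwards
    path = []

    def dfs(node):
        state[node] = "active"
        path.append(node)
        for nxt in wait_for.get(node, ()):
            if state.get(nxt) == "active":
                return path[path.index(nxt):]    # back edge closes a cycle
            if nxt not in state:
                cycle = dfs(nxt)
                if cycle:
                    return cycle
        path.pop()
        state[node] = "done"
        return None

    for node in wait_for:
        if node not in state:
            cycle = dfs(node)
            if cycle:
                return cycle
    return None

deadlocked = {"T1": ["T3"], "T3": ["T2"], "T2": ["T1"]}
no_deadlock = {"T1": ["T2"], "T2": []}
cycle = find_cycle(deadlocked)
```

Once a cycle is found, the system removes the deadlock by rolling back one of the transactions on it. How that victim is chosen is not covered here.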
Shared Lock: While holding this lock we can only read the item, not write it.
Exclusive Lock: While holding this lock we can write as well as read the data item.
The table below summarizes what we can and cannot do while holding a
shared or exclusive lock.
Lock held    Read   Write
Shared       Yes    No
Exclusive    Yes    Yes
In general we have 2 types of locking protocols. Those are
❖ Simple locking protocol
❖ 2-Phase locking protocol
In 2 PL we have seen:
Serializability: It is guaranteed.
Cascading Rollback: It is possible, which is bad.
Deadlock: It is possible.
The Venn diagram below shows the classification of schedules that are rigorous,
strict and conservative.
Example 1:
Now we will see which of the properties discussed above the schedule above follows.
1. If R_TS(X) > TS(T), then abort and roll back T and reject the operation.
2. If W_TS(X) > TS(T), then don't execute the write operation and continue
processing. This is a case of an outdated or obsolete write. Remember, outdated
writes are ignored under the Thomas Write Rule, whereas the Basic TO
protocol would abort a transaction that attempts one.
3. If neither condition 1 nor condition 2 holds, then and only then execute the
W_item(X) operation of T and set W_TS(X) to TS(T).
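The three rules map onto a short function. In this sketch a data item is an assumed dictionary with keys `r_ts`, `w_ts` and `value`, and the returned strings are labels chosen for illustration.

```python
def write_item(x, ts, value):
    """Apply the Thomas Write Rule to a write by a transaction with timestamp ts."""
    if x["r_ts"] > ts:
        return "abort"       # rule 1: a younger transaction has already read X
    if x["w_ts"] > ts:
        return "ignore"      # rule 2: obsolete write, skipped without aborting
    x["value"] = value       # rule 3: perform the write
    x["w_ts"] = ts
    return "write"

x = {"r_ts": 5, "w_ts": 8, "value": "old"}
r_abort = write_item(x, 4, "a")     # 4 < R_TS(X) = 5
r_ignore = write_item(x, 6, "b")    # 5 <= 6 < W_TS(X) = 8
r_write = write_item(x, 9, "c")     # 9 is later than both timestamps
```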
Validation Based Protocol (Optimistic Concurrency)
The validation based protocol executes a transaction in three phases:
Read phase
In this phase, the transaction T is read and executed. It reads the values of
various data items and stores them in temporary local variables. It performs all
its write operations on temporary variables without updating the actual
database.
Validation phase
In this phase, the temporary variable values are validated against the actual data
to see whether they violate serializability.
Write phase
If the transaction passes validation, then the temporary results are
written to the database; otherwise the transaction is rolled back.
Each transaction Ti has the following three timestamps:
Start(Ti): The time when Ti started its execution.
Validation(Ti): The time when Ti finishes its read phase and starts its
validation phase.
Finish(Ti): The time when Ti finishes its write phase.
✓ This protocol uses the timestamp of the validation phase as the transaction's
timestamp for serialization, as it is the phase which determines whether the
transaction will commit or roll back.
✓ Hence TS(T) = Validation(T).
✓ Serializability is determined during the validation process. It can't be
decided in advance.
✓ While executing transactions, the protocol allows a greater degree of concurrency
when there are few conflicts.
✓ Thus it suits workloads in which transactions have few rollbacks.
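The text above does not state the validation test itself. The sketch below uses the test commonly given for this protocol, which is an assumption here. Tj passes if, for every Ti with an earlier validation timestamp, one of two things holds. Either Ti finished before Tj started, or Ti finished before Tj began validating and wrote nothing that Tj read. The `Txn` record and its field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Txn:
    start: int
    validation: int
    finish: Optional[int] = None          # None while the write phase is unfinished
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)

def validates(tj, earlier):
    """Check tj against every transaction that validated before it."""
    for ti in earlier:
        if ti.finish is not None and ti.finish < tj.start:
            continue    # ti finished before tj started
        if (ti.finish is not None and ti.finish < tj.validation
                and not (ti.write_set & tj.read_set)):
            continue    # ti's writes were done in time and tj read none of them
        return False
    return True

ti = Txn(start=1, validation=3, finish=5, write_set={"A"})
overlaps_and_reads_a = Txn(start=4, validation=6, read_set={"A"})
overlaps_reads_b = Txn(start=4, validation=6, read_set={"B"})
starts_after_ti = Txn(start=6, validation=8, read_set={"A"})
```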
Multiple Granularity
Granularity: It is the size of the data item that is allowed to be locked.
✓ Multiple granularity can be defined as hierarchically breaking up the database
into blocks which can be locked.
✓ The multiple granularity protocol enhances concurrency and reduces lock
overhead.
✓ It keeps track of what to lock and how to lock.
✓ It makes it easy to decide whether to lock or unlock a data item. This
type of hierarchy can be graphically represented as a tree.
Example
✓ Consider a tree which has four levels of nodes.
✓ The first or highest level represents the entire database.
✓ The second level represents nodes of type area. The database
consists of exactly these areas.
✓ Each area consists of child nodes which are known as files. No file can be
present in more than one area.
✓ Finally, each file contains child nodes known as records. The file has exactly
those records that are its child nodes. No record is present in more than one
file.
Hence, the levels of the tree starting from the top level are as follows:
1.Database
2.Area
3.File
4.Record
There are three additional lock modes with multiple granularity:
Intention Mode Lock
Intention-shared (IS): Explicit locking is being done at a lower level of the tree, but
only with shared locks.
Intention-Exclusive (IX): Explicit locking is being done at a lower level with
exclusive or shared locks.
Shared & Intention-Exclusive (SIX): The subtree rooted at the node is locked in shared
mode, and explicit locking is being done at a lower level with exclusive locks by the
same transaction.
Compatibility Matrix
The protocol uses the intention lock modes to ensure serializability. It requires that a
transaction attempting to lock a node follow these rules:
In multiple granularity, locks are acquired in top-down order and must
be released in bottom-up order.
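The compatibility matrix is not reproduced in the text above. The sketch below encodes the matrix as it is usually presented for the IS, IX, S, SIX and X modes, together with a helper that lists the locks to request top-down on the four-level tree described earlier. The names `COMPATIBLE`, `compatible` and `lock_plan` are assumptions.

```python
# For each held mode, the set of modes another transaction may still be granted.
COMPATIBLE = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}
LEVELS = ["database", "area", "file", "record"]

def compatible(held, requested):
    return requested in COMPATIBLE[held]

def lock_plan(target_level, mode):
    """Locks to request, in top-down order, to lock a node in mode 'S' or 'X'."""
    intention = "IS" if mode == "S" else "IX"
    depth = LEVELS.index(target_level)
    return [(level, intention) for level in LEVELS[:depth]] + [(target_level, mode)]

plan_write_record = lock_plan("record", "X")
plan_read_file = lock_plan("file", "S")
```

Writing a record therefore takes IX on the database, area and file before X on the record itself. Releasing happens in the reverse order.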
Algorithms for Recovery and Isolation Exploiting Semantics (ARIES)
Analysis
✓ The analysis step identifies the dirty (updated) pages in the buffer and the set of
transactions active at the time of the crash.
✓ The appropriate point in the log where the REDO operation should start is also
determined.
REDO
✓ The REDO phase reapplies updates from the log to the database. In many recovery
schemes REDO is applied only to committed transactions, but ARIES redoes the
logged updates of all transactions from the REDO start point.
✓ ARIES will have information which provides the start point for REDO.
✓ Information stored by ARIES and in the data pages will allow ARIES to determine
whether the operation to be redone had already been applied to the database.
✓ Thus only the necessary REDO operations are applied during recovery.
UNDO
✓ During the UNDO phase, the log is scanned backwards and the operations of
transactions that were active at the time of the crash are undone in reverse
order.
✓ The information needed for ARIES to accomplish its recovery procedure
includes the log, the Transaction Table, and the Dirty Page Table.