
Transaction Concepts and ACID Properties

Database System Concepts, 6th Ed.


©Silberschatz, Korth and Sudarshan
See www.db-book.com for conditions on re-use
Chapter 14: Transactions
Transaction Concept
Transaction State

Transaction Concept
A transaction is a unit of program execution that accesses and
possibly updates various data items.
E.g. transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
Two main issues to deal with:
Failures of various kinds, such as hardware failures and
system crashes
Concurrent execution of multiple transactions

Transaction Operations

The main operations of a transaction are:

Read(X): reads the value of X from the database into a buffer in main
memory. It transfers the data item X from the database to a variable,
also called X, in a buffer in main memory belonging to the transaction
that executed the read operation.

Write(X): writes the value from the buffer back to the database. It
transfers the value in the variable X in the main-memory buffer of the
transaction that executed the write to the data item X in the database.

Example of Fund Transfer
Transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
Atomicity requirement
• if the transaction fails after step 3 and before step 6, money will be
“lost”, leading to an inconsistent database state
  o Failure could be due to software or hardware
• the system should ensure that updates of a partially executed
transaction are not reflected in the database
Durability requirement — once the user has been notified that the transaction
has completed (i.e., the transfer of the $50 has taken place), the updates to
the database by the transaction must persist even if there are software or
hardware failures.
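
The atomicity requirement can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table name `account`, the starting balances and the `fail_midway` flag are made up for the example and do not come from the slides.

```python
import sqlite3

def make_bank():
    """Create an in-memory database with two accounts (hypothetical balances)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 200)])
    conn.commit()
    return conn

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src))
            if fail_midway:
                # Simulated failure after write(A) and before write(B).
                raise RuntimeError("simulated crash")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst))
        return True
    except RuntimeError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))

conn = make_bank()
transfer(conn, "A", "B", 50, fail_midway=True)
print(balances(conn))   # {'A': 100, 'B': 200}: the partial update was undone
transfer(conn, "A", "B", 50)
print(balances(conn))   # {'A': 50, 'B': 250}
```

The failed transfer leaves both balances untouched, which is the behaviour the atomicity requirement asks for.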

Example of Fund Transfer (Cont.)
Consistency requirement in above example:
the sum of A and B is unchanged by the execution of the transaction
In general, consistency requirements include
• Explicitly specified integrity constraints such as primary keys and
foreign keys
• Implicit integrity constraints
  – e.g. sum of balances of all accounts, minus sum of loan amounts,
  must equal value of cash-in-hand
A transaction must see a consistent database.
During transaction execution the database may be temporarily inconsistent.
When the transaction completes successfully the database must be
consistent
• Erroneous transaction logic can lead to inconsistency

Example of Fund Transfer (Cont.)
Isolation requirement — if between steps 3 and 6, another
transaction T2 is allowed to access the partially updated database, it
will see an inconsistent database (the sum A + B will be less than it
should be).
T1                      T2
1. read(A)
2. A := A – 50
3. write(A)
                        read(A), read(B), print(A+B)
4. read(B)
5. B := B + 50
6. write(B)
Isolation can be ensured trivially by running transactions serially,
that is, one after the other.
However, executing multiple transactions concurrently has
significant benefits, as we will see later.
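
The isolation problem above can be replayed in a few lines of Python. This is a toy simulation, not a real DBMS: the dictionary `db` stands in for the database, and the starting balances are assumed values.

```python
def run_interleaved(a=100, b=200):
    """T1 transfers 50 from A to B; T2 reads A and B between T1's steps 3 and 4."""
    db = {"A": a, "B": b}

    # T1, steps 1-3: read(A); A := A - 50; write(A)
    local_a = db["A"]
    local_a -= 50
    db["A"] = local_a

    # T2 runs here and sees the partially updated database.
    seen_by_t2 = db["A"] + db["B"]

    # T1, steps 4-6: read(B); B := B + 50; write(B)
    local_b = db["B"]
    local_b += 50
    db["B"] = local_b

    return seen_by_t2, db["A"] + db["B"]

seen, final = run_interleaved()
print(seen, final)   # 250 300: T2 saw a sum that is 50 too low
```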

Example of Data Access
[Figure: data access. input(A) and output(B) transfer buffer blocks A and B
between disk and the buffer in memory; read(X) and write(Y) transfer values
between the buffer and the private work areas of T1 (x1, y1) and T2 (x2).]

ACID Properties
A transaction is a unit of program execution that accesses and possibly
updates various data items. To preserve the integrity of data, the database
system must ensure:
Atomicity. Either all operations of the transaction are properly reflected
in the database or none are.
Consistency. Execution of a transaction in isolation preserves the
consistency of the database.
Isolation. Although multiple transactions may execute concurrently,
each transaction must be unaware of other concurrently executing
transactions. Intermediate transaction results must be hidden from
other concurrently executed transactions.
That is, for every pair of transactions Ti and Tj, it appears to Ti that
either Tj finished execution before Ti started, or Tj started execution
after Ti finished.
Durability. After a transaction completes successfully, the changes it
has made to the database persist, even if there are system failures.

Transaction State
Active – the initial state; the transaction stays in this state while it is
executing
Partially committed – after the final statement has been executed.
Failed – after the discovery that normal execution can no longer
proceed.
Aborted – after the transaction has been rolled back and the
database restored to its state prior to the start of the transaction.
Two options after it has been aborted:
• restart the transaction
  o can be done only if there was no internal logical error
• kill the transaction
Committed – after successful completion.
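
The five states and the moves between them can be sketched as a small state machine. The class and table names below (`State`, `ALLOWED`, `Transaction`) are illustrative choices, not part of any real DBMS API.

```python
from enum import Enum

class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    ABORTED = "aborted"
    COMMITTED = "committed"

# Which states may follow each state.
ALLOWED = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.ABORTED: set(),      # restart or kill happens outside this object
    State.COMMITTED: set(),    # a committed transaction cannot be rolled back
}

class Transaction:
    def __init__(self):
        self.state = State.ACTIVE   # the initial state

    def move_to(self, new_state):
        """Change state, rejecting transitions the model does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.value} -> {new_state.value}")
        self.state = new_state

t = Transaction()
t.move_to(State.PARTIALLY_COMMITTED)   # final statement executed
t.move_to(State.COMMITTED)             # changes made permanent
```

Trying `t.move_to(State.FAILED)` after the commit raises `ValueError`, mirroring the rule that a committed transaction cannot fail or be rolled back.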

Transaction State (Cont.)
❑ A transaction goes through several states during its life cycle.
❑ These states are called transaction states.

[Figure: transaction state diagram with the states active, partially
committed, committed, failed, aborted and terminated; transition labels:
R/W operations, permanent store, failure, roll back.]

Transaction State

Active
❑ This is the first state in the life cycle of a transaction.
❑ A transaction is said to be in the active state as long as its
instructions are being executed.
❑ All the changes made by the transaction at this point are stored in
the buffer in main memory.
Partially committed
❑ After the last instruction of the transaction has executed, it enters
the partially committed state.
❑ It is not considered fully committed because all the changes
made by the transaction are still stored in the buffer in main
memory.

Transaction State

Committed State
❑ After all the changes made by the transaction have been
successfully stored in the database, it enters the committed state.
❑ Now, the transaction is considered to be fully committed.
❑ After a transaction has entered the committed state, it is not
possible to roll back the transaction.
❑ In other words, it is not possible to undo the changes that have
been made by the transaction.
❑ This is because the database has been updated to a new consistent state.
❑ The only way to undo the changes is to carry out another
transaction, called a compensating transaction, that performs the
reverse operations.
Transaction State
Failed State
❑ When a transaction is executing in the active state or the
partially committed state and a failure makes it impossible to
continue, it enters the failed state.
Aborted State
❑ After the transaction has failed and entered the failed
state, all the changes made by it have to be undone.
❑ To undo these changes, the transaction must be rolled back.
❑ After the transaction has been rolled back completely, it enters
the aborted state.
Terminated State
❑ This is the last state in the life cycle of a transaction.
❑ After entering the committed state or the aborted state, the
transaction finally enters the terminated state, where its
life cycle comes to an end.

Schedules and Serializability

Chapter 14: Transactions
Concurrent Executions
Serializability
Implementation of Isolation

Concurrent Executions
Multiple transactions are allowed to run concurrently in the system.
Advantages are:
• increased processor and disk utilization, leading to better
transaction throughput
  o E.g. one transaction can be using the CPU while another is
  reading from or writing to the disk
• reduced average response time for transactions: short
transactions need not wait behind long ones.
Concurrency control schemes – mechanisms to achieve isolation
• That is, to control the interaction among the concurrent
transactions in order to prevent them from destroying the
consistency of the database

Schedules
Schedule – a sequence of instructions that specifies the chronological
order in which instructions of concurrent transactions are executed
• a schedule for a set of transactions must consist of all instructions
of those transactions
• it must preserve the order in which the instructions appear in each
individual transaction.
A transaction that successfully completes its execution will have a
commit instruction as the last statement
• by default, a transaction is assumed to execute a commit instruction
as its last step
A transaction that fails to successfully complete its execution will have
an abort instruction as the last statement

Types of Schedules
Schedules in DBMS
• Serial Schedules
• Non-Serial Schedules
    • Serializable
        • Conflict Serializable
        • View Serializable
    • Non-Serializable
        • Recoverable
            • Cascading schedule
            • Cascadeless schedule
            • Strict schedule
        • Non-Recoverable

Serial Schedules
In serial schedules,
❑ All the transactions execute serially one after the other.
When one transaction executes, no other transaction is allowed
to execute.
Characteristics
Serial schedules are always:
❑ Consistent
❑ Recoverable
❑ Cascadeless
❑ Strict

Serial Schedules - Example
Transaction T1      Transaction T2
R(A)
W(A)
R(B)
W(B)
Commit
                    R(A)
                    W(B)
                    Commit

• There are two transactions T1 and T2 executing serially one after the other.
• Transaction T1 executes first.
• After T1 completes its execution, transaction T2 executes.
• So, this schedule is an example of a Serial Schedule.

Serial Schedules - Example

Transaction T1      Transaction T2
                    R(A)
                    W(B)
                    Commit
R(A)
W(A)
R(B)
W(B)
Commit

Schedule 1
Let T1 transfer $50 from A to B, and T2 transfer 10% of the
balance from A to B.
A serial schedule in which T1 is followed by T2 :

Schedule 2
• A serial schedule where T2 is followed by T1
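
The two serial orders can be tried out with a short simulation. This is a sketch only: the starting balances (A = 1000, B = 2000) and the use of integer arithmetic for the 10% are assumptions made for the example.

```python
def t1(db):
    """T1: transfer $50 from A to B."""
    a = db["A"]; a -= 50; db["A"] = a
    b = db["B"]; b += 50; db["B"] = b

def t2(db):
    """T2: transfer 10% of A's balance to B (integer arithmetic)."""
    a = db["A"]; temp = a // 10; a -= temp; db["A"] = a
    b = db["B"]; b += temp; db["B"] = b

def run_serial(order, a=1000, b=2000):
    """Run the transactions one after the other and return the final state."""
    db = {"A": a, "B": b}
    for txn in order:
        txn(db)
    return db

schedule_1 = run_serial([t1, t2])   # T1 followed by T2
schedule_2 = run_serial([t2, t1])   # T2 followed by T1
print(schedule_1)   # {'A': 855, 'B': 2145}
print(schedule_2)   # {'A': 850, 'B': 2150}
```

The two orders give different final balances, but in both the sum A + B stays at 3000, so each serial schedule preserves consistency.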

Non-Serial Schedules
In non-serial schedules,
❑ Multiple transactions execute concurrently.
❑ Operations of all the transactions are interleaved or mixed
with each other.
Characteristics
Non-serial schedules are NOT always:
❑ Consistent
❑ Recoverable
❑ Cascadeless
❑ Strict

Non-Serial Schedules - Example

Transaction T1 Transaction T2

R(A)

W(B)
R(A)

R(B)
W(B)
Commit

W(B)
Commit

• In this schedule, there are two transactions T1 and T2 executing
concurrently.
• The operations of T1 and T2 are interleaved.
• So, this schedule is an example of a Non-Serial Schedule.
Non-Serial Schedules - Example

Transaction T1 Transaction T2

R(A)
R(A)

W(B)
W(B)

Commit
R(B)
W(B)
Commit

Schedule 3
Let T1 and T2 be the transactions defined previously. The
following schedule is not a serial schedule, but it is
equivalent to Schedule 1.

In Schedules 1, 2 and 3, the sum A + B is preserved.

Schedule 4
The following concurrent schedule does not preserve the
value of (A + B ).
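
The schedule's figure is not reproduced in this text, so the interleaving below is an assumed example of a concurrent schedule of this kind, with T1 transferring $50 and T2 transferring 10% of A. The starting balances and the integer arithmetic are also assumptions.

```python
def schedule_4(a=1000, b=2000):
    """One interleaving of T1 and T2 in which T1 overwrites T2's update of A."""
    db = {"A": a, "B": b}

    a1 = db["A"]; a1 -= 50                  # T1: read(A); A := A - 50
    a2 = db["A"]                            # T2: read(A), still the old value
    temp = a2 // 10; a2 -= temp             # T2: temp := 10% of A; A := A - temp
    db["A"] = a2                            # T2: write(A)
    b2 = db["B"]                            # T2: read(B)
    db["A"] = a1                            # T1: write(A), T2's update is lost
    b1 = db["B"]; b1 += 50; db["B"] = b1    # T1: read(B); B := B + 50; write(B)
    b2 += temp; db["B"] = b2                # T2: B := B + temp; write(B)
    return db

result = schedule_4()
print(result, sum(result.values()))   # {'A': 950, 'B': 2100} 3050
```

The sum grows from 3000 to 3050, so this interleaving does not preserve A + B.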

Finding Number Of Schedules
Consider n transactions T1, T2, T3, …, Tn with N1, N2, N3, …, Nn
operations respectively.
Total Number of Schedules
Total number of possible schedules (serial + non-serial) is given by

$$\frac{(N_1 + N_2 + N_3 + \dots + N_n)!}{N_1! \times N_2! \times N_3! \times \dots \times N_n!}$$

Total Number of Serial Schedules
Total number of serial schedules
= Number of different ways of arranging n transactions
= n!
Total Number of Non-Serial Schedules
Total number of non-serial schedules
= Total number of schedules – Total number of serial schedules

Finding Number Of Schedules - Example
Consider three transactions with 2, 3 and 4 operations respectively.
Total Number of Schedules

$$\text{Total number of schedules} = \frac{(2 + 3 + 4)!}{2! \times 3! \times 4!} = 1260$$

Total Number of Serial Schedules
= Number of different ways of arranging 3 transactions
= 3!
= 6
Total Number of Non-Serial Schedules
= Total number of schedules – Total number of serial schedules
= 1260 – 6
= 1254
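
The counting formulas are easy to check by machine. The function names in this sketch are illustrative.

```python
from math import factorial, prod

def total_schedules(op_counts):
    """(N1 + ... + Nn)! / (N1! x ... x Nn!) for the given operation counts."""
    return factorial(sum(op_counts)) // prod(factorial(n) for n in op_counts)

def serial_schedules(op_counts):
    """n! ways of ordering n transactions one after the other."""
    return factorial(len(op_counts))

def non_serial_schedules(op_counts):
    return total_schedules(op_counts) - serial_schedules(op_counts)

counts = [2, 3, 4]
print(total_schedules(counts))        # 1260
print(serial_schedules(counts))       # 6
print(non_serial_schedules(counts))   # 1254
```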

Serializability
Basic Assumption – Each transaction preserves database
consistency.
Thus serial execution of a set of transactions preserves
database consistency.
A (possibly concurrent) schedule is serializable if it is
equivalent to a serial schedule. Different forms of schedule
equivalence give rise to the notions of:
1. conflict serializability
2. view serializability

Simplified view of transactions

We ignore operations other than read and write instructions.
We assume that transactions may perform arbitrary
computations on data in local buffers in between reads
and writes.
Our simplified schedules consist of only read and write
instructions.

Conflicting Instructions
Instructions li and lj of transactions Ti and Tj respectively, conflict
if and only if there exists some item Q accessed by both li and lj,
and at least one of these instructions wrote Q.
1. li = read(Q), lj = read(Q). li and lj don’t conflict.
2. li = read(Q), lj = write(Q). They conflict.
3. li = write(Q), lj = read(Q). They conflict
4. li = write(Q), lj = write(Q). They conflict
Intuitively, a conflict between li and lj forces a (logical) temporal
order between them.
If li and lj are consecutive in a schedule and they do not
conflict, their results would remain the same even if they had
been interchanged in the schedule.

Conflict Serializability
If a schedule S can be transformed into a schedule S´ by a series of
swaps of non-conflicting instructions, we say that S and S´ are
conflict equivalent.
We say that a schedule S is conflict serializable if it is conflict
equivalent to a serial schedule

Conflict Serializability (Cont.)
Schedule 3 can be transformed into Schedule 6, a serial
schedule where T2 follows T1, by a series of swaps of
non-conflicting instructions. Therefore Schedule 3 is conflict
serializable.

[Figure: Schedule 3 and Schedule 6]

Anomalies with Interleaved Execution

Reading Uncommitted Data (WR Conflicts, “dirty reads”):

Anomalies with Interleaved Execution

Unrepeatable Reads (RW Conflicts):

Anomalies (Continued)
Overwriting Uncommitted Data (WW Conflicts):

Precedence graph for schedule 1 and
schedule 2.
A simple and efficient method for determining conflict serializability of a
schedule.
Consider a schedule S. We construct a directed graph, called a
precedence graph, from S.
This graph consists of a pair G = (V, E), where V is a set of vertices
and E is a set of edges.
The set of vertices consists of all the transactions participating in
the schedule.
The set of edges consists of all edges Ti → Tj for which one of three
conditions holds:
1. Ti executes write(Q) before Tj executes read(Q).
2. Ti executes read(Q) before Tj executes write(Q).
3. Ti executes write(Q) before Tj executes write(Q).
If the precedence graph has a cycle, S is not conflict serializable; if
it has no cycle, S is conflict serializable.

Figure 14.13

CONFLICT SERIALIZABILITY - PROBLEM
Check whether the given schedule S is conflict serializable or not-
S : R1(A) , R2(A) , R1(B) , R2(B) , R3(B) , W1(A) , W2(B)

Step 1:
List all the conflicting operations and determine the dependency
between the transactions:
R2(A) , W1(A) (T2 → T1)
R1(B) , W2(B) (T1 → T2)
R3(B) , W2(B) (T3 → T2)

Step 2:
Draw the precedence graph (edges T2 → T1, T1 → T2 and T3 → T2).
• Clearly, there exists a cycle (T1 → T2 → T1) in the precedence graph.
• Therefore, the given schedule S is not conflict serializable.
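
The two steps of this problem can be reproduced mechanically. The sketch below assumes schedules are written in the `R1(A)` / `W2(B)` notation used above; the helper names (`parse`, `precedence_graph`, `has_cycle`) are illustrative.

```python
import re
from collections import defaultdict

def parse(schedule):
    """Turn 'R1(A) , W2(B)' into [('R', 1, 'A'), ('W', 2, 'B')]."""
    return [(kind, int(txn), item)
            for kind, txn, item in re.findall(r"([RW])(\d+)\((\w+)\)", schedule)]

def precedence_graph(ops):
    """Add an edge Ti -> Tj for every conflicting pair where Ti's op comes first."""
    edges = defaultdict(set)
    for i, (k1, t1, q1) in enumerate(ops):
        for k2, t2, q2 in ops[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if t1 != t2 and q1 == q2 and "W" in (k1, k2):
                edges[t1].add(t2)
    return edges

def has_cycle(edges):
    """Depth-first search with three colours; a grey-to-grey edge is a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)

    def visit(node):
        colour[node] = GREY
        for nxt in edges.get(node, ()):
            if colour[nxt] == GREY:
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in list(edges))

S = "R1(A) , R2(A) , R1(B) , R2(B) , R3(B) , W1(A) , W2(B)"
graph = precedence_graph(parse(S))
print(dict(graph))        # edges T2 -> T1, T1 -> T2, T3 -> T2
print(has_cycle(graph))   # True, so S is not conflict serializable
```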

CONFLICT SERIALIZABILITY - PROBLEM
Check whether the given schedule S is conflict serializable and recoverable or
not

CONFLICT SERIALIZABILITY - PROBLEM
Step 1:
List all the conflicting operations and determine the dependency
between the transactions:
R2(X) , W3(X) (T2 → T3)
R2(X) , W1(X) (T2 → T1)
W3(X) , W1(X) (T3 → T1)
W3(X) , R4(X) (T3 → T4)
W1(X) , R4(X) (T1 → T4)
W2(Y) , R4(Y) (T2 → T4)

Step 2:
Draw the precedence graph.
• Clearly, there exists no cycle in the precedence graph.
• Therefore, the given schedule S is conflict serializable.
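
When the precedence graph has no cycle, a topological sort of it gives an equivalent serial order. The sketch below feeds the six dependencies listed in Step 1 to the standard-library `graphlib` module (Python 3.9 or later); the function name `serial_order` is illustrative.

```python
from graphlib import TopologicalSorter, CycleError

# Dependencies from Step 1, written as (before, after) pairs.
EDGES = [("T2", "T3"), ("T2", "T1"), ("T3", "T1"),
         ("T3", "T4"), ("T1", "T4"), ("T2", "T4")]

def serial_order(edges):
    """Return an equivalent serial order, or None if the graph has a cycle."""
    sorter = TopologicalSorter()
    for before, after in edges:
        sorter.add(after, before)   # 'after' must come later than 'before'
    try:
        return list(sorter.static_order())
    except CycleError:
        return None

print(serial_order(EDGES))   # ['T2', 'T3', 'T1', 'T4']
```

For these edges the order is forced (T2, then T3, then T1, then T4), so the schedule is conflict equivalent to that one serial schedule.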

View Serializability
Consider two schedules S and S′, where the same set of
transactions participates in both schedules. The schedules S and S′
are said to be view equivalent if three conditions are met:
➢ For each data item Q, if transaction Ti reads the initial
value of Q in schedule S, then transaction Ti must, in
schedule S′, also read the initial value of Q.
➢ For each data item Q, if transaction Ti executes read(Q) in
schedule S, and if that value was produced by a write(Q)
operation executed by transaction Tj, then the read(Q)
operation of transaction Ti must, in schedule S′, also read
the value of Q that was produced by the same write(Q)
operation of transaction Tj.
➢ For each data item Q, the transaction (if any) that performs
the final write(Q) operation in schedule S must perform the
final write(Q) operation in schedule S′.

View Serializability (Cont.)
Schedule 5 is view equivalent to the serial schedule <T27, T28, T29>, since
the one read(Q) instruction reads the initial value of Q in both schedules and
T29 performs the final write of Q in both schedules.
Every conflict-serializable schedule is also view serializable, but there are
view-serializable schedules that are not conflict serializable.
In schedule 5, transactions T28 and T29 perform write(Q) operations
without having performed a read(Q) operation.
Writes of this sort are called blind writes.
Blind writes appear in any view-serializable schedule that is not conflict
serializable.

[Figure: Schedule 4 and Schedule 5]

View Serializability - Problem
Check whether the given schedule S is view serializable or not. If yes, then give the
serial schedule.
S : R1(A) , W2(A) , R3(A) , W1(A) , W3(A)

we can represent the given schedule pictorially as

VIEW SERIALIZABILITY - PROBLEM
Step-01:

List all the conflicting operations and determine the dependency between
the transactions-
• R1(A) , W2(A) (T1 → T2)
• R1(A) , W3(A) (T1 → T3)
• W2(A) , R3(A) (T2 → T3)
• W2(A) , W1(A) (T2 → T1)
• W2(A) , W3(A) (T2 → T3)
• R3(A) , W1(A) (T3 → T1)
• W1(A) , W3(A) (T1 → T3)

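Step-02:
Draw the precedence graph. It contains a cycle (for example, T1 → T2 and
T2 → T1), so S is not conflict serializable. It may still be view
serializable, which is checked next.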

VIEW SERIALIZABILITY - PROBLEM
Checking for Blind Writes
• There exists a blind write W2(A) in the given schedule S.
• Therefore, the given schedule S may or may not be view serializable.

Drawing a Dependency Graph

T1 reads the initial value of A, and T2 is the first to update A.
So, T1 must execute before T2.
Thus, we get the dependency T1 → T2.
The final update on A is made by transaction T3.
So, T3 must execute after all other transactions.
Thus, we get the dependency (T1, T2) → T3.
From the write-read sequence W2(A), R3(A), we get the dependency T2 → T3.
The dependency graph therefore has the edges T1 → T2, T1 → T3 and
T2 → T3. It contains no cycle, so S is view serializable, and the
equivalent serial schedule is T1 → T2 → T3.
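
The three view-equivalence conditions can also be checked by brute force: compute, for a schedule, which write each read sees and which transaction writes each item last, then compare against every serial order. This is a sketch for small examples only (it tries all n! orders), and the helper names are illustrative.

```python
import re
from itertools import permutations

def parse(schedule):
    """Turn 'R1(A) , W2(A)' into [('R', 1, 'A'), ('W', 2, 'A')]."""
    return [(kind, int(txn), item)
            for kind, txn, item in re.findall(r"([RW])(\d+)\((\w+)\)", schedule)]

def view_profile(ops):
    """Return (reads_from, final_writer) for a schedule.

    reads_from maps (txn, position of the read inside txn) to the writer it
    reads from, with None meaning the initial value (conditions 1 and 2).
    final_writer maps each item to the last transaction that wrote it
    (condition 3).
    """
    last_writer, reads_from, position = {}, {}, {}
    for kind, txn, item in ops:
        index = position.get(txn, 0)
        position[txn] = index + 1
        if kind == "R":
            reads_from[(txn, index)] = last_writer.get(item)
        else:
            last_writer[item] = txn
    return reads_from, last_writer

def view_serial_orders(ops):
    """List every serial order of the transactions that is view equivalent."""
    txns = sorted({txn for _, txn, _ in ops})
    target = view_profile(ops)
    matches = []
    for order in permutations(txns):
        serial = [op for t in order for op in ops if op[1] == t]
        if view_profile(serial) == target:
            matches.append(order)
    return matches

S = parse("R1(A) , W2(A) , R3(A) , W1(A) , W3(A)")
print(view_serial_orders(S))   # [(1, 2, 3)]
```

For the schedule in this problem the only matching order is T1, T2, T3, which agrees with the dependencies T1 → T2 and T2 → T3 derived above.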

Transaction Isolation and Atomicity
Let's see the effect of transaction failures during concurrent
execution.
If a transaction Ti fails, for whatever reason, we need to undo the
effect of this transaction to ensure the atomicity property of the
transaction.
In a system that allows concurrent execution, the atomicity
property requires that any transaction Tj that is dependent on Ti
is also aborted.
To achieve this, we need to place restrictions on the type of
schedules permitted in the system.
➢ Non-recoverable Schedules
➢ Recoverable Schedules
➢ Cascading Rollback
➢ Cascadeless Schedules

Nonrecoverable Schedules
Consider the partial schedule in which T7 is a transaction that performs
only one instruction: read(A).
We call this a partial schedule because we have not included a commit or
abort operation for T6.
Notice that T7 commits immediately after executing the read(A)
instruction. Thus, T7 commits while T6 is still in the active state.
Now suppose that T6 fails before it commits. T7 has read the value of data
item A written by T6 . Therefore, we say that T7 is dependent on T6.
Because of this, we must abort T7 to ensure atomicity. However, T7 has
already committed and cannot be aborted. Thus, we have a situation
where it is impossible to recover correctly from the failure of T6.
This schedule is an example of a nonrecoverable schedule.

Recoverable Schedules
A recoverable schedule is one where, for each pair of transactions Ti and
Tj such that Tj reads a data item previously written by Ti , the commit
operation of Ti appears before the commit operation of Tj .
For the previous schedule to be recoverable, T7 would have to delay
committing until after T6 commits.

Cascading Rollback
Cascading rollback (also called cascading abort) is a situation in which
the abort of one transaction forces the abort of another transaction, to
prevent the second transaction from reading invalid data.
If, in a schedule, the failure of one transaction causes several other
dependent transactions to roll back or abort, the schedule is called a
cascading schedule.

Cascadeless Schedules
It is desirable to restrict the schedules to those where cascading rollbacks
cannot occur. Such schedules are called cascadeless schedules.
Formally, a cascadeless schedule is one where, for each pair of
transactions Ti and Tj such that Tj reads a data item previously written by
Ti , the commit operation of Ti appears before the read operation of Tj .
It is easy to verify that every cascadeless schedule is also recoverable.

Strict Schedule
If, in a schedule, a transaction is allowed neither to read nor to write a
data item until the last transaction that wrote it has committed or aborted,
the schedule is called a strict schedule.

A strict schedule allows only committed read and write operations.

Clearly, a strict schedule imposes more restrictions than a cascadeless
schedule.
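
The definitions of recoverable, cascadeless and strict schedules can be compared side by side with a small checker. This is a simplified sketch: schedules are lists of `(kind, txn, item)` tuples with kind `'R'`, `'W'` or `'C'` (commit), and aborts are left out, which is an assumption made to keep the example short.

```python
def classify(ops):
    """Classify a schedule made of ('R'|'W', txn, item) and ('C', txn, None)."""
    committed = set()
    last_writer = {}        # item -> transaction that wrote it most recently
    dirty_sources = {}      # reader -> writers it read from before they committed
    recoverable = cascadeless = strict = True

    for kind, txn, item in ops:
        if kind == "C":
            # Recoverable: everyone this transaction read from committed first.
            if any(w not in committed for w in dirty_sources.get(txn, ())):
                recoverable = False
            committed.add(txn)
            continue
        writer = last_writer.get(item)
        uncommitted = (writer is not None and writer != txn
                       and writer not in committed)
        if uncommitted:
            strict = False              # read or write over uncommitted data
            if kind == "R":
                cascadeless = False     # a dirty read
                dirty_sources.setdefault(txn, set()).add(writer)
        if kind == "W":
            last_writer[item] = txn

    return {"recoverable": recoverable,
            "cascadeless": cascadeless,
            "strict": strict}

# T7 reads A written by T6 and commits while T6 is still active.
nonrecoverable = [("R", 6, "A"), ("W", 6, "A"), ("R", 7, "A"), ("C", 7, None)]
print(classify(nonrecoverable))
# {'recoverable': False, 'cascadeless': False, 'strict': False}
```

Delaying T7's commit until after T6's commit makes the schedule recoverable but still not cascadeless; moving T6's commit before T7's read makes it cascadeless and strict as well.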

Example-

Concurrency Control
Content

• Concurrency Control

• Need for Concurrency Control


Concurrency Control

• Under concurrency control, multiple transactions can be executed
simultaneously.
• Concurrent execution may affect the result of a transaction, so it is
important to control the order in which the operations of those
transactions execute.
• Transactions perform various operations on the database, often
concurrently, to meet the requirements of different users.
Problems with Concurrent Execution

• When multiple transactions execute concurrently in an uncontrolled or
unrestricted manner, several problems can arise.
• Such problems are called concurrency problems.
• The concurrency problems are:
    • Dirty Read Problem
    • Unrepeatable Read Problem
    • Lost Update Problem
    • Phantom Read Problem


Dirty Read Problem

• Reading data written by an uncommitted transaction is called a dirty read.
• There is always a chance that the uncommitted transaction might roll
back later.
• In that case, other transactions will have read a value that never
becomes part of the database.
• This can leave the database inconsistent.
• A dirty read does not always lead to inconsistency.
• It becomes a problem only when the uncommitted transaction fails and
rolls back later.
Dirty Read Problem - Example

1. T1 reads the value of A.
2. T1 updates the value of A in the buffer.
3. T2 reads the value of A from the buffer.
4. T2 writes the updated value of A.
5. T2 commits.
6. T1 fails in later stages and rolls back.
In this example,
• T2 reads the dirty value of A written by the uncommitted transaction T1.
• T1 fails in later stages and rolls back.
• Thus, the value that T2 read is now incorrect.
• Therefore, the database becomes inconsistent.
Unrepeatable Read Problem

This problem occurs when a transaction reads different values of the same
variable in different read operations, even though it has not updated the
value itself.
1. T1 reads the value of X (= 10, say).
2. T2 reads the value of X (= 10).
3. T1 updates the value of X (from 10 to 15, say) in the buffer.
4. T2 again reads the value of X (but = 15).
In this example,
1. T2 reads a different value of X in its second read.
2. T2 cannot account for the change because, as far as it knows, it is
running in isolation.
Lost Update Problem

This problem occurs when multiple transactions execute concurrently and
updates from one or more transactions get lost.
1. T1 reads the value of A (= 10, say).
2. T1 updates the value of A (to 15, say) in the buffer.
3. T2 does a blind write A = 25 (write without read) in the buffer.
4. T2 commits.
5. When T1 commits, it writes A = 25 to the database.
In this example,
1. T1 writes the overwritten value of A to the database.
2. Thus, the update from T1 is lost.
Phantom Read Problem

This problem occurs when a transaction reads some variable from the buffer
and when it reads the same variable later, it finds that the variable does not
exist.

Here,
1. T1 reads X.
2. T2 reads X.
3. T1 deletes X.
4. T2 tries reading X but does not find it.

In this example,
• T2 finds that the variable X no longer exists when it tries reading X
again.
• T2 cannot tell who deleted X because, as far as it knows, it is
running in isolation.
Avoiding Concurrency Problems

• To ensure consistency of the database, it is very important to
prevent the occurrence of the above problems.
• Concurrency control protocols help to prevent these problems and
maintain the consistency of the database.
Concurrency control protocols can be broadly divided into two
categories:
• Lock-based protocols
• Timestamp-based protocols


Need for Concurrency Control

• Lost updates occur when multiple transactions select the same row and
then update the row based on the value originally selected.
• The dirty read problem occurs when a transaction reads a value written
by another transaction that is later aborted.
• The incorrect summary problem occurs when one transaction computes a
summary over the values of all instances of a repeated data item while a
second transaction updates some instances of that data item.
Optimistic Concurrency Control

• Optimistic Concurrency Control (OCC) is a technique for managing
concurrent access to data in an RDBMS.
• It defers conflict detection until the end of the transaction,
which reduces overhead during execution.
Phases of OCC:
• Read phase
• Validation phase
• Write phase
Phases of OCC:

• Read Phase:
• Data items are read and stored in local copies.
• Operations are performed on these copies without
updating the database.
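Validation Phase:

• Checks for conflicts with concurrent transactions.
• Transaction timestamps and read/write sets are maintained.
• Criteria for conflict checking: TransA passes validation if, for each
transaction TransB that is committed or in its validation phase, at least
one of the following holds:
    1. TransB completes its write phase before TransA starts its read phase.
    2. TransA starts its write phase after TransB completes its write phase,
    and there are no common items between the read set of TransA and the
    write set of TransB.
    3. There are no common items between the read/write sets of TransA and
    the write set of TransB, and TransB completes its read phase before
    TransA completes its read phase.
• Rollback occurs if conflicts are detected.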
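
The validation test can be sketched as follows. The sketch reads the criteria as three alternative conditions, which is the usual textbook grouping but is stated here as an assumption; the `Txn` record, its field names and the integer timestamps are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Txn:
    read_start: int                    # start of the read phase
    read_end: int                      # end of the read phase
    write_start: int                   # when the write phase starts (or would start)
    write_end: Optional[int] = None    # None while the write phase is unfinished
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)

def validates_against(a, b):
    """True if TransA (a) does not conflict with the earlier TransB (b)."""
    # 1. b finished writing before a started reading.
    if b.write_end is not None and b.write_end < a.read_start:
        return True
    # 2. a writes only after b has finished writing, and a read nothing b wrote.
    if (b.write_end is not None and b.write_end < a.write_start
            and not (a.read_set & b.write_set)):
        return True
    # 3. a neither read nor writes anything b wrote, and b finished reading first.
    if (b.read_end < a.read_end
            and not ((a.read_set | a.write_set) & b.write_set)):
        return True
    return False

def validate(a, earlier):
    """TransA passes only if it validates against every earlier transaction."""
    return all(validates_against(a, b) for b in earlier)

b = Txn(1, 2, 3, 4, read_set={"x"}, write_set={"x"})
a = Txn(3, 6, 7, read_set={"x"}, write_set={"x"})
print(validate(a, [b]))   # False: a read x while b was still writing it
```

A transaction that fails `validate` would be rolled back and restarted rather than entering its write phase.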
Write Phase:

• Updates are applied to the database if validation is successful.
• Otherwise, updates are discarded, and the transaction is aborted and
restarted.
• No locks are used, so the scheme is deadlock-free.
Concurrency Control Mechanisms
Concurrency Control

✓ Lock-Based Protocols
✓ Timestamp-Based Protocols
✓ Validation-Based Protocols
✓ Multiple Granularity
✓ Multiversion Schemes
✓ Insert and Delete Operations
✓ Concurrency in Index Structures
Concurrency Control

A lock is a mechanism to control concurrent access to a data item.
Data items can be locked in two modes:
• Exclusive (X) mode. Data item can be both read as well as
written. X-lock is requested using lock-X instruction.
• Shared (S) mode. Data item can only be read. S-lock is
requested using lock-S instruction.

Lock requests are made to the concurrency-control manager. A transaction
can proceed only after the request is granted.
Lock-compatibility matrix

• A transaction may be granted a lock on an item if the requested lock
is compatible with locks already held on the item by other
transactions.
• Any number of transactions can hold shared locks on an item, but if
any transaction holds an exclusive lock on the item, no other transaction
may hold any lock on the item.
• If a lock cannot be granted, the requesting transaction is made to wait
until all incompatible locks held by other transactions have been
released. The lock is then granted.
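
The compatibility rule above can be sketched in a few lines of Python. This is an illustrative sketch only: the names COMPATIBLE and can_grant are hypothetical and do not come from the slides.

```python
# Lock-compatibility matrix for the two modes described above:
# S (shared) and X (exclusive). Key = (requested mode, mode already held).
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


def can_grant(requested, held_by_others):
    """Return True if `requested` is compatible with every lock
    that OTHER transactions currently hold on the same item."""
    return all(COMPATIBLE[(requested, held)] for held in held_by_others)
```

Only the S/S combination is compatible, which is why any number of readers can share an item while a single writer excludes everyone else.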
Lock-Based Protocols

T1: lock-X(B);
    read(B);
    B := B - 50;
    write(B);
    unlock(B);
    lock-X(A);
    read(A);
    A := A + 50;
    write(A);
    unlock(A).

T2: lock-S(A);
    read(A);
    unlock(A);
    lock-S(B);
    read(B);
    unlock(B);
    display(A+B)
Example of a transaction performing locking

• Locking as above is not sufficient to guarantee serializability: if A
and B get updated in between the read of A and the read of B, the
displayed sum would be wrong.

• A locking protocol is a set of rules followed by all transactions
while requesting and releasing locks. Locking protocols restrict the
set of possible schedules.
Pitfalls of Lock-Based Protocols

Consider the partial schedule (figure not reproduced) in which T3 holds a
lock on B and T4 holds a lock on A.

• Neither T3 nor T4 can make progress: executing lock-S(B) causes T4 to
wait for T3 to release its lock on B, while executing lock-X(A) causes T3
to wait for T4 to release its lock on A.
• Such a situation is called a deadlock.
o To handle a deadlock, one of T3 or T4 must be rolled back and its
locks released.
Pitfalls of Lock-Based Protocols

• The potential for deadlock exists in most locking protocols.
Deadlocks are a necessary evil.
• Starvation is also possible if the concurrency control manager is badly
designed. For example:
o A transaction may be waiting for an X-lock on an item, while a
sequence of other transactions request and are granted an S-lock on
the same item.
o The same transaction is repeatedly rolled back due to
deadlocks.
• The concurrency control manager can be designed to prevent
starvation.
The Two-Phase Locking Protocol

This is a protocol which ensures conflict-serializable schedules.


Phase 1: Growing Phase
• transaction may obtain locks
• transaction may not release locks
Phase 2: Shrinking Phase
• transaction may release locks
• transaction may not obtain locks
The protocol assures serializability. It can be proved that the transactions
can be serialized in the order of their lock points (i.e. the point where a
transaction acquired its final lock).
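
The two-phase rule can be checked mechanically for a single transaction. The following is a minimal sketch, assuming the transaction's locking activity is given as a list of ("lock" | "unlock", item) pairs; the function name is_two_phase is hypothetical.

```python
def is_two_phase(ops):
    """Check one transaction's lock/unlock sequence against 2PL.

    ops: list of (action, item) pairs, action is "lock" or "unlock".
    The sequence is two-phase if no lock is acquired after the
    first unlock (growing phase, then shrinking phase).
    """
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True          # the shrinking phase has begun
        elif action == "lock" and shrinking:
            return False              # acquiring a lock while shrinking
    return True


# T1 from the earlier locking example unlocks B before locking A,
# so it does not follow the two-phase protocol.
t1_ops = [("lock", "B"), ("unlock", "B"), ("lock", "A"), ("unlock", "A")]
# A two-phase variant acquires both locks before releasing either.
t1_2pl = [("lock", "B"), ("lock", "A"), ("unlock", "B"), ("unlock", "A")]
```

In the second sequence the lock point is the acquisition of the lock on A, the final lock obtained.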
The Two-Phase Locking Protocol

• Two-phase locking does not ensure freedom from deadlocks.

• Cascading rollback is possible under two-phase locking. To avoid this,
follow a modified protocol called strict two-phase locking. Here a
transaction must hold all its exclusive locks until it commits or aborts.

• Rigorous two-phase locking is even stricter: here all locks are held
until commit or abort. In this protocol transactions can be serialized in
the order in which they commit.
Lock Conversions

Consider the following two transactions, for which we have shown only
some of the significant read and write operations:

T8: read(a1);
    read(a2);
    ...
    read(an);
    write(a1)

T9: read(a1);
    read(a2);
    display(a1 + a2)
Lock Conversions

Two-phase locking with lock conversions:

– First Phase:
• can acquire a lock-S on an item
• can acquire a lock-X on an item
• can convert a lock-S to a lock-X (upgrade)
– Second Phase:
• can release a lock-S
• can release a lock-X
• can convert a lock-X to a lock-S (downgrade)
This protocol assures serializability, but it still relies on the
programmer to insert the various locking instructions.
Automatic Acquisition of Locks

A transaction Ti issues the standard read/write instruction, without
explicit locking calls.
The operation read(D) is processed as:
    if Ti has a lock on D
    then
        read(D)
    else begin
        if necessary wait until no other
            transaction has a lock-X on D
        grant Ti a lock-S on D;
        read(D)
    end
Automatic Acquisition of Locks

write(D) is processed as:
    if Ti has a lock-X on D
    then
        write(D)
    else begin
        if necessary wait until no other transaction has any lock on D,
        if Ti has a lock-S on D
        then
            upgrade lock on D to lock-X
        else
            grant Ti a lock-X on D
        write(D)
    end;
All locks are released after commit or abort.
Implementation of Locking

• A lock manager can be implemented as a separate process to which
transactions send lock and unlock requests.
• The lock manager replies to a lock request by sending a lock grant
message (or a message asking the transaction to roll back, in case of
a deadlock).
• The requesting transaction waits until its request is answered.
• The lock manager maintains a data structure called a lock table to
record granted locks and pending requests.
• The lock table is usually implemented as an in-memory hash table
indexed on the name of the data item being locked.
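
A lock table of this kind can be sketched with ordinary Python dictionaries, which are hash tables keyed here on the item name. This is a simplified, single-threaded illustration, not a real lock manager: the class LockTable and its methods are hypothetical, deadlock handling is omitted, and queued requests are served strictly first-come, first-served.

```python
from collections import defaultdict, deque


class LockTable:
    """In-memory lock table: item name -> granted locks and waiting queue."""

    def __init__(self):
        self.granted = defaultdict(dict)    # item -> {txn: "S" or "X"}
        self.waiting = defaultdict(deque)   # item -> queue of (txn, mode)

    def _compatible(self, item, txn, mode):
        # Only locks held by OTHER transactions can conflict.
        others = [m for t, m in self.granted[item].items() if t != txn]
        if mode == "S":
            return all(m == "S" for m in others)
        return not others                   # X needs no other holders

    def lock(self, txn, item, mode):
        """Return True if granted now, False if the request was queued."""
        # Refusing to overtake queued requests keeps waiters from starving.
        if not self.waiting[item] and self._compatible(item, txn, mode):
            self.granted[item][txn] = mode
            return True
        self.waiting[item].append((txn, mode))
        return False

    def unlock(self, txn, item):
        """Release txn's lock; return the transactions granted as a result."""
        self.granted[item].pop(txn, None)
        newly_granted = []
        while self.waiting[item]:
            t, m = self.waiting[item][0]
            if not self._compatible(item, t, m):
                break
            self.waiting[item].popleft()
            self.granted[item][t] = m
            newly_granted.append(t)
        return newly_granted
```

Granting a new S-lock only when no request is already queued is one way to avoid the starvation case in which a waiting X-lock request is overtaken by a stream of S-lock requests.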
Deadlock

• A deadlock is a condition where two or more transactions are
waiting indefinitely for one another to give up locks.

• Deadlock is one of the most feared complications in a DBMS, because
none of the transactions involved can ever finish; each remains in a
waiting state forever.
Example

• Transaction T1 holds a lock on some rows in the Student table and
needs to update some rows in the Grade table. Simultaneously,
transaction T2 holds locks on those rows in the Grade table and
needs to update the rows in the Student table held by T1. Neither
transaction can proceed, so the two are deadlocked.
Deadlock Handling

A system is deadlocked if there is a set of transactions such that every
transaction in the set is waiting for another transaction in the set.

• Deadlock prevention
• Deadlock detection
• Deadlock recovery
Deadlock Prevention

• Deadlock prevention is suitable for a large database. If resources
are allocated in such a way that a deadlock can never occur, deadlock
is prevented.

• The DBMS analyzes the operations of a transaction to determine
whether they can create a deadlock situation. If they can, the DBMS
does not allow that transaction to be executed.
Deadlock Prevention

Two different deadlock-prevention schemes using timestamps have
been proposed:

Wait-Die scheme:
• In this scheme, if a transaction requests a resource that is already
held with a conflicting lock by another transaction, the DBMS
compares the timestamps of the two transactions.

• Only an older transaction is allowed to wait for the resource; a
younger requester is rolled back.

• Let TS(T) denote the timestamp of a transaction T, and consider two
transactions Ti and Tj, one of which requests a resource on which the
other holds a conflicting lock. The DBMS then acts as follows:
Deadlock Prevention

• If TS(Ti) < TS(Tj), so that Ti is the older transaction, and Ti
requests a resource held by Tj, then Ti is allowed to wait until the
data item is available.

• That is, if an older transaction is waiting for a resource that is
locked by a younger transaction, the older transaction is
allowed to wait for the resource until it is available.

• If TS(Ti) < TS(Tj) and Ti, the older transaction, holds a resource
that Tj requests, then Tj is killed and restarted later
after a random delay, but with the same timestamp.
Deadlock Prevention

Wound-Wait scheme:

• In the wound-wait scheme, if an older transaction requests a
resource that is held by a younger transaction, the older
transaction forces the younger one to abort and release the
resource.

• After a short delay, the younger transaction is restarted, but with
the same timestamp.

• If an older transaction holds a resource that is requested by a
younger transaction, the younger transaction is asked to wait
until the older one releases it.
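
Both timestamp-based schemes reduce to a single comparison between the requester's and the holder's timestamps, where a smaller timestamp means an older transaction. The sketch below is illustrative; the function names and the returned strings are hypothetical.

```python
def wait_die(ts_requester, ts_holder):
    """Wait-die: an older requester waits; a younger requester dies."""
    if ts_requester < ts_holder:
        return "wait"
    return "abort requester"   # restarted later with the SAME timestamp


def wound_wait(ts_requester, ts_holder):
    """Wound-wait: an older requester wounds the holder; a younger one waits."""
    if ts_requester < ts_holder:
        return "abort holder"  # restarted later with the SAME timestamp
    return "wait"
```

In both schemes the transaction that is rolled back is always the younger one, and keeping its original timestamp on restart means it eventually becomes the oldest and cannot be rolled back forever.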
Deadlock Prevention Schemes

• Another simple approach to deadlock prevention is based on
lock timeouts.

• In this approach, a transaction that has requested a lock waits for
at most a specified amount of time.

• If the lock has not been granted within that time, the transaction is
said to time out, and it rolls itself back and restarts.
Deadlock Detection

• Deadlock detection algorithms are used to detect deadlocks. A cycle
in the wait-for graph indicates a deadlock.

[Figure: a wait-for graph without a cycle (left) and a wait-for graph
with a cycle (right)]
Deadlock Detection

• When a deadlock is detected:
• Some transaction will have to be rolled back (made a victim) to break
the deadlock. Select as victim the transaction that will incur the
minimum cost.
• Rollback: determine how far to roll back the transaction.
o Total rollback: abort the transaction and then restart it.
o Partial rollback: it is more effective to roll back the transaction
only as far as necessary to break the deadlock.
• Starvation happens if the same transaction is always chosen as victim.
Include the number of rollbacks in the cost factor to avoid starvation.
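
Detection on a wait-for graph amounts to finding a cycle in a directed graph. The following is a minimal sketch using depth-first search, assuming the graph is given as a dictionary that maps each transaction to the set of transactions it is waiting for; find_cycle is a hypothetical name.

```python
def find_cycle(wait_for):
    """Return a list of transactions forming a cycle, or None.

    wait_for: dict mapping a transaction to the set of transactions
    it is waiting for (an edge Ti -> Tj means Ti waits for Tj).
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    color = {}
    path = []

    def visit(t):
        color[t] = GREY
        path.append(t)
        for u in wait_for.get(t, ()):
            state = color.get(u, WHITE)
            if state == GREY:               # back edge: u is on the path
                return path[path.index(u):]
            if state == WHITE:
                found = visit(u)
                if found:
                    return found
        path.pop()
        color[t] = BLACK
        return None

    for t in list(wait_for):
        if color.get(t, WHITE) == WHITE:
            cycle = visit(t)
            if cycle:
                return cycle
    return None
```

Any transaction on the returned cycle is a candidate victim; choosing among them by cost, including the number of earlier rollbacks, is left out of this sketch.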
Locking Extensions

Multiple granularity locking:

Idea: instead of getting separate locks on each record,
• lock an entire page explicitly, implicitly locking all records in the
page, or
• lock an entire relation, implicitly locking all records in the
relation.
Timestamp-Based Protocols

• Each transaction is issued a timestamp when it enters the system. If
an old transaction Ti has timestamp TS(Ti), a new transaction Tj is
assigned timestamp TS(Tj) such that TS(Ti) < TS(Tj).
• The protocol manages concurrent execution such that the timestamps
determine the serializability order.
• In order to assure such behavior, the protocol maintains for each
data item Q two timestamp values:
o W-timestamp(Q) is the largest timestamp of any transaction that
executed write(Q) successfully.
o R-timestamp(Q) is the largest timestamp of any transaction that
executed read(Q) successfully.
Timestamp-Based Protocols

The timestamp ordering protocol ensures that any conflicting read and
write operations are executed in timestamp order.
Suppose a transaction Ti issues a read(Q).
• If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q that
was already overwritten.
Hence, the read operation is rejected, and Ti is rolled back.
• If TS(Ti) ≥ W-timestamp(Q), then the read operation is executed,
and R-timestamp(Q) is set to max(R-timestamp(Q), TS(Ti)).
Timestamp-Based Protocols

Suppose that transaction Ti issues write(Q).


o If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is producing was
needed previously, and the system assumed that that value would
never be produced.
Hence, the write operation is rejected, and Ti is rolled back.
o If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an
obsolete value of Q.
Hence, this write operation is rejected, and Ti is rolled back.
o Otherwise, the write operation is executed, and W-timestamp(Q) is set
to TS(Ti).
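
The read and write rules of the timestamp-ordering protocol translate almost directly into code. The sketch below assumes that a read is rejected when TS(Ti) < W-timestamp(Q) and that timestamps start above 0; the names Item, Rollback, to_read and to_write are hypothetical.

```python
class Rollback(Exception):
    """Raised when the protocol rejects an operation and Ti must roll back."""


class Item:
    """A data item Q with its value and its two timestamps."""

    def __init__(self, value=None):
        self.value = value
        self.r_ts = 0   # R-timestamp(Q)
        self.w_ts = 0   # W-timestamp(Q)


def to_read(ts, q):
    """read(Q) issued by a transaction with timestamp ts."""
    if ts < q.w_ts:
        # The value Ti should have read was already overwritten.
        raise Rollback("read rejected")
    q.r_ts = max(q.r_ts, ts)
    return q.value


def to_write(ts, q, value):
    """write(Q) issued by a transaction with timestamp ts."""
    if ts < q.r_ts:
        # A younger transaction already read the value Ti would replace.
        raise Rollback("write rejected: value was needed previously")
    if ts < q.w_ts:
        # Ti is attempting to write an obsolete value.
        raise Rollback("write rejected: obsolete value")
    q.value = value
    q.w_ts = ts
```

No operation ever blocks in this sketch: it either executes or raises Rollback, which is why the protocol cannot deadlock.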
Example Use of the Protocol

A partial schedule for several data items for transactions with timestamps 1, 2,
3, 4, 5
Correctness of Timestamp-Ordering Protocol

The timestamp-ordering protocol guarantees serializability since all the
arcs in the precedence graph are of the form:

transaction with smaller TS → transaction with larger TS

Thus, there will be no cycles in the precedence graph.

• The timestamp protocol ensures freedom from deadlock, as no transaction
ever waits.
• But the schedule may not be cascade-free, and may not even be
recoverable.
Recoverability and Cascade Freedom

Problem with timestamp-ordering protocol:

• Suppose Ti aborts, but Tj has read a data item written by Ti


• Then Tj must abort; if Tj had been allowed to commit earlier, the
schedule is not recoverable.
• Further, any transaction that has read a data item written by Tj
must abort.
• This can lead to cascading rollback --- that is, a chain of rollbacks.
Recoverability and Cascade Freedom

Solution 1:
• A transaction is structured such that its writes are all performed at the
end of its processing.
• All writes of a transaction form an atomic action; no transaction may
execute while a transaction is being written.
• A transaction that aborts is restarted with a new timestamp.
Solution 2:
• Limited form of locking: wait for data to be committed before
reading it.
Solution 3:
• Use commit dependencies to ensure recoverability.
Validation-Based Protocols

Execution of transaction Ti is done in three phases.


1. Read and execution phase: Transaction Ti writes only to temporary
local variables
2. Validation phase: Transaction Ti performs a “validation test” to
determine if local variables can be written without violating
serializability.
3. Write phase: If Ti is validated, the updates are applied to the database;
otherwise, Ti is rolled back.
Validation-Based Protocols

• The three phases of concurrently executing transactions can be
interleaved, but each transaction must go through the three
phases in that order.

• Assume for simplicity that the validation and write phases occur
together, atomically and serially. That is, only one transaction
executes validation/write at a time.

• Also called optimistic concurrency control, since the transaction
executes fully in the hope that all will go well during validation.
Validation-Based Protocols

• To perform the validation test, we need to know when the various
phases of transactions took place.
• Therefore, associate three different timestamps with each transaction
Ti:

1. Start(Ti), the time when Ti started its execution.
2. Validation(Ti), the time when Ti finished its read phase and started
its validation phase.
3. Finish(Ti), the time when Ti finished its write phase.
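
These three timestamps are what a validation test operates on. The sketch below assumes the usual textbook formulation, which the slides do not state explicitly: Tj passes validation if, for every transaction Ti validated before it, either Finish(Ti) < Start(Tj), or Start(Tj) < Finish(Ti) < Validation(Tj) and the write set of Ti does not intersect the read set of Tj. The class Txn and the function validate are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    start: int                      # Start(T)
    validation: int                 # Validation(T)
    finish: float = float("inf")    # Finish(T); infinity while still writing
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)


def validate(tj, earlier):
    """Return True if tj passes validation against every transaction
    in `earlier` (those validated before tj). Assumed textbook rule."""
    for ti in earlier:
        if ti.finish < tj.start:
            continue    # ti finished before tj began: no overlap at all
        if (tj.start < ti.finish < tj.validation
                and not (ti.write_set & tj.read_set)):
            continue    # ti's writes finished in time and tj never read them
        return False    # possible conflict: tj must be rolled back
    return True
```

A transaction that fails the test is rolled back and restarted; because nothing was written to the database during its read phase, the rollback only discards local copies.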
Transaction Recovery and Save Points
Content

• Transaction Recovery

• Crash Recovery

• Failure Classification

• Transaction failure

• Log-based recovery Or Manual Recovery

• Save Points

• Isolation Levels

• SQL Facilities for Concurrency and Recovery


Transaction Recovery

• Recovery is the process of restoring a database to a correct
state in the event of a failure.

• It ensures that the database is reliable and remains in a consistent
state in case of a failure.

Database recovery can be classified into two parts:

1. Rolling forward applies redo records to the corresponding data
blocks.
2. Rolling back applies rollback segments to the data files. This
information is stored in transaction tables.
Crash Recovery

• A DBMS is a highly complex system with hundreds of transactions
being executed every second.

• The durability and robustness of a DBMS depend on its complex
architecture and its underlying hardware and system software.

• If it fails or crashes amid transactions, the system is expected to
follow some algorithm or technique to recover lost data.
Failure Classification

To see where the problem has occurred, we generalize failures into
the following categories:

• Transaction failure

• System crash

• Disk failure
Failure Classification

• Transaction failure: A transaction has to abort when it fails to
execute or when it reaches a point from which it cannot go any
further. This is called transaction failure, and it affects only a
few transactions or processes.
The reasons for transaction failure are:

• Logical errors

• System errors
Failure Classification

• Logical errors: a transaction cannot complete because of an error in
its code or an internal error condition.

• System errors: the database system itself terminates an active
transaction because the DBMS is not able to execute it, or it has to
stop because of some system condition. For example, in case of
deadlock or resource unavailability, the system aborts an active
transaction.
Failure Classification

• System crash: problems external to the system may cause it to stop
abruptly and crash. For instance, an interruption in the power supply
may cause the underlying hardware or software to fail.

• Disk failure: in the early days of technology evolution, it was a
common problem that hard-disk drives or storage drives failed
frequently. Disk failures include the formation of bad sectors,
unreachability of the disk, a disk head crash, or any other failure
that destroys all or part of disk storage.
Categories of Storage Structure:

• We have already described the storage system. In brief, the
storage structure can be divided into two categories:
Categories of Storage Structure:

• Volatile storage: volatile storage cannot survive system crashes.
Volatile storage devices are placed very close to the CPU; normally
they are embedded onto the chipset itself. Main memory and cache
memory are examples of volatile storage.

• Non-volatile storage: these memories are made to survive system
crashes. They are huge in data storage capacity, but slower to
access. Examples include hard disks, magnetic tapes, flash
memory, and non-volatile (battery backed up) RAM.
Log-based recovery Or Manual Recovery:

• A log is a sequence of records that keeps track of the actions
performed by a transaction. It is important that log records are
written before the actual modification and stored on stable
storage media, which is failsafe. Log-based recovery works as follows:

• The log file is kept on stable storage media.

• When a transaction enters the system and starts execution, it writes
a log record about it.
Log-based recovery Or Manual Recovery:

The database can be modified using two approaches:

• Deferred database modification: all log records are written to
stable storage, and the database is updated only when a transaction
commits.

• Immediate database modification: each log record is followed by the
actual database modification. That is, the database is modified
immediately after every operation.
Transactions Recovery

• Transaction recovery involves the transaction log file that is used
as a write-ahead log, plus journal files maintained on a
per-database basis.

• Log files contain short-term recovery information regarding
active databases, while the journal files contain long-term
information used for auditing and disaster recovery.
Types of Transaction Recovery

Recovery information is divided into two types:

• Undo (or Rollback) Operations
✓ Undo, or transaction backout, recovery is performed by the
DBMS Server.

• Redo (or Cache Restore) Operations
✓ A Redo recovery operation is database-oriented. Redo
recovery is performed after a server or an installation fails.
Save Points

• SAVEPOINT creates points within a transaction to which you can
later ROLLBACK.

The SAVEPOINT Command

• A SAVEPOINT is a point in a transaction to which you can roll the
transaction back without rolling back the entire transaction.

Syntax for a SAVEPOINT command:

SAVEPOINT SAVEPOINT_NAME;
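
As an illustration, the following sketch runs a SAVEPOINT and a partial rollback against an in-memory SQLite database through Python's standard sqlite3 module. The table name account, the savepoint name after_a and the inserted values are made up for the example; other database systems may differ in syntax details.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction boundaries below are exactly the BEGIN/COMMIT we issue.
conn = sqlite3.connect(":memory:", isolation_level=None)
cur = conn.cursor()
cur.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")

cur.execute("BEGIN")
cur.execute("INSERT INTO account VALUES ('A', 100)")
cur.execute("SAVEPOINT after_a")                  # mark a point to return to
cur.execute("INSERT INTO account VALUES ('B', 200)")
cur.execute("ROLLBACK TO SAVEPOINT after_a")      # undo only the insert of B
cur.execute("COMMIT")                             # the insert of A survives

rows = cur.execute(
    "SELECT name, balance FROM account ORDER BY name"
).fetchall()
print(rows)  # [('A', 100)]
```

Rolling back to the savepoint undoes only the work done after it; the transaction itself stays open and can still be committed or rolled back as a whole.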
Isolation Levels

• Isolation levels determine the type of phenomena that can occur
during the execution of concurrent transactions.

Three phenomena define SQL isolation levels for a transaction:

• Dirty Reads

• Non-Repeatable Reads

• Phantom Reads
SQL Facilities

• Based on these phenomena, the SQL standard defines four
isolation levels:

1. Read Uncommitted
2. Read Committed
3. Repeatable Read
4. Serializable

[Table: the relationship between isolation levels, read phenomena and
locks; not reproduced]
Recovery with Concurrent Transactions

• When more than one transaction is being executed in parallel,
the logs are interleaved. At the time of recovery, it would become
hard for the recovery system to backtrack through all the logs and
then start recovering.

Checkpoint

• A checkpoint is a mechanism where all the previous logs are
removed from the system and stored permanently on a storage
disk. A checkpoint declares a point before which the DBMS was in a
consistent state and all the transactions were committed.
Recovery with Concurrent Transactions

• The recovery system reads the logs backwards from the end to the
last checkpoint.

• It maintains two lists, an undo-list and a redo-list.

• If the recovery system sees a log with <Tn, Start> and <Tn,
Commit> or just <Tn, Commit>, it puts the transaction in the redo-
list.

• If the recovery system sees a log with <Tn, Start> but no commit or
abort log found, it puts the transaction in undo-list.
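
The backward scan described above can be sketched as follows. The log format is an assumption made for the example: a list of (transaction, kind) pairs, oldest first, with kind being "start", "commit", "abort" or "checkpoint". The function name build_lists is hypothetical, and transactions that were already active when the checkpoint was taken are not handled.

```python
def build_lists(log):
    """Scan the log backwards to the last checkpoint and return
    (redo_list, undo_list) as described on the slide."""
    redo, undo = [], []
    finished = set()          # transactions with a commit or abort record
    for txn, kind in reversed(log):
        if kind == "checkpoint":
            break             # stop at the most recent checkpoint
        if kind == "commit":
            redo.append(txn)
            finished.add(txn)
        elif kind == "abort":
            finished.add(txn)
        elif kind == "start" and txn not in finished:
            undo.append(txn)  # started, but never committed or aborted
    return redo, undo


example_log = [
    ("T1", "start"), ("T1", "commit"),
    (None, "checkpoint"),
    ("T2", "start"), ("T3", "start"), ("T2", "commit"),
    ("T4", "start"), ("T4", "abort"),
]
```

For example_log, T2 lands in the redo-list, T3 in the undo-list, T4 in neither because it has an abort record, and T1 is never examined because it lies before the checkpoint.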