CS 728
Advanced Database Systems
Chapter 20
Foundation of Database
Transaction Processing
Introduction to Transaction Processing (1)
A transaction is a
sequence of operations whose execution
transforms a database from one consistent state
to another consistent state.
Transaction boundaries:
Begin and End transaction.
Consistent state: the data currently in the database
satisfy all integrity constraints defined for the
database.
Introduction to Transaction Processing (3)
read(A)
A = A - 200
write(A)
read(B)
System crash
B = B + 200
write(B)
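A minimal sketch of why the transfer above must be all-or-nothing. The dictionary `db`, the `transfer` function, the `SimulatedCrash` exception and the before-image copy are hypothetical names chosen for illustration; a real DBMS would use a log rather than copying the whole database.

```python
class SimulatedCrash(Exception):
    """Stands in for the system crash shown between read(B) and write(B)."""


def transfer(db, src, dst, amount, crash_before_credit=False):
    """Move `amount` from src to dst as a single all-or-nothing unit."""
    before_image = dict(db)              # hypothetical undo information
    try:
        a = db[src]                      # read(A)
        db[src] = a - amount             # A = A - amount; write(A)
        b = db[dst]                      # read(B)
        if crash_before_credit:
            raise SimulatedCrash("crash after write(A), before write(B)")
        db[dst] = b + amount             # B = B + amount; write(B)
    except SimulatedCrash:
        db.clear()
        db.update(before_image)          # undo the partial update
        return False
    return True


accounts = {"A": 1000, "B": 500}         # assumed starting balances
ok = transfer(accounts, "A", "B", 200, crash_before_credit=True)
print(ok, accounts)
```

Without the undo step, the crash would leave A debited and B never credited, so the total of the two balances would no longer be preserved.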
Introduction to Transaction Processing (5)
Introduction to Transaction Processing (6)
Introduction to Transaction Processing (7)
Why Concurrency Control is needed
Multiple transactions are allowed to run concurrently.
Advantages are:
increased processor and disk utilization, leading
to better transaction throughput
one transaction can be using the CPU while another is
reading from or writing to the disk
throughput: number of transactions executed in a given
amount of time.
reduced average response time for transactions
short transactions need not wait behind long ones.
Concurrency control is needed to achieve isolation
i.e., to control the interaction among the concurrent
transactions in order to prevent them from destroying
the database consistency.
Why Concurrency Control is needed
Problems occur when concurrent transactions
execute in an uncontrolled manner:
The Lost Update
The Temporary Update (or Dirty Read)
The Incorrect Summary
Unrepeatable Read:
A transaction T1 may read a given value. If another
transaction later updates that value and T1 reads that
value again, then T1 will see a different value.
The Lost Update Problem
The Temporary Update Problem
(Dirty Read)
This occurs when one transaction updates a
database item and then the transaction fails for
some reason.
The updated item is accessed by another
transaction before it is changed back to its
original value.
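A small sketch of the dirty read described above, assuming an in-memory dictionary as the database and made-up values (X starts at 100, T1 subtracts 30). The variable names are hypothetical.

```python
db = {"X": 100}                  # assumed initial value

# T1 updates X in place but has not committed yet.
t1_before_image = db["X"]        # remembered so T1 can be rolled back
db["X"] = db["X"] - 30           # T1: write(X), uncommitted

# T2 reads the uncommitted ("dirty") value.
t2_seen = db["X"]                # T2: read(X)

# T1 fails, and X is changed back to its original value.
db["X"] = t1_before_image

print("T2 worked with", t2_seen, "but the database holds", db["X"])
```

T2 has based its work on a value that, after the rollback, never officially existed in the database.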
Incorrect Summary Problem
Transaction and System Concepts (1)
A transaction is an atomic unit of work that is either
completed in its entirety or not done at all.
For recovery purposes, the system needs to keep track
of when the transaction starts, terminates, and
commits or aborts.
Aborted:
after the transaction has been rolled back and the
database restored to its state prior to the start of the
transaction. Two options after it has been aborted:
– restart the transaction (only if no internal logical error)
– kill the transaction
Transaction and System Concepts (3)
Recovery manager keeps track of the following operations:
begin_transaction
This marks the beginning of transaction execution.
read or write:
These specify read or write operations on the database items
that are executed as part of a transaction.
end_transaction:
This specifies that read and write transaction operations have
ended and marks the end limit of transaction execution.
At this point it may be necessary to check whether the changes
introduced by the transaction can be permanently applied to
the database or whether the transaction has to be aborted
because it violates concurrency control or for some other
reason.
Transaction and System Concepts (4)
Recovery manager keeps track of the following
operations (cont):
commit_transaction:
Transaction and System Concepts (5)
Recovery techniques use the following operators:
undo:
State transition diagram illustrating the
states for transaction execution
Desirable Properties of Transactions (1)
The ACID properties of a transaction
Atomicity
a transaction is an atomic processing unit; it is either performed
in its entirety or not performed at all.
Consistency
a transaction transforms a database from a consistent state to
another consistent state.
Isolation
A transaction should not make its updates visible to other
transactions until it is committed; this property, when enforced
strictly, solves the temporary update problem.
Durability
committed work must never be lost due to a subsequent failure.
ACID Properties
Example:
T1              T2              value of X
read(X)                         200 (initial value)
X = X + 100                     300 (not saved yet)
                read(X)         200
                X = X - 50      150 (not saved yet)
write(X)                        300 (saved)
                write(X)        150 (overwrite 300)
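The interleaving in the example above can be replayed step by step. This is only an illustrative sketch: the dictionary `db` and the local variables `t1_local` and `t2_local` are assumed names standing in for the database and each transaction's private workspace.

```python
db = {"X": 200}                  # initial value from the example

t1_local = db["X"]               # T1: read(X)     -> 200
t1_local = t1_local + 100        # T1: X = X + 100 -> 300 (not saved yet)

t2_local = db["X"]               # T2: read(X)     -> 200 (T1's change is invisible)
t2_local = t2_local - 50         # T2: X = X - 50  -> 150 (not saved yet)

db["X"] = t1_local               # T1: write(X)    -> 300 saved
db["X"] = t2_local               # T2: write(X)    -> 150 overwrites 300

serial_result = 200 + 100 - 50   # what either serial order would produce

print("interleaved:", db["X"], "serial:", serial_result)
```

The interleaved run ends with X = 150, whereas running T1 and T2 one after the other (in either order) gives 250; T1's update has been lost.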
Schedules
A schedule S of n transactions T1, T2, ..., Tn is an
ordering of all the operations in these transactions
subject to the constraint that:
for each transaction Ti, the operations of Ti in S
must appear in the same order as they do in Ti.
Note, however, that operations from other transactions Tj
can be interleaved with the operations of Ti in S.
Example: Given
T1 = R1(Q) W1(Q) & T2 = R2(Q) W2(Q)
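As an illustration of the definition, the following sketch enumerates every schedule of the two transactions given above. The function name `interleavings` is hypothetical; the only rule it enforces is the one stated in the definition, namely that each transaction's own operations keep their order.

```python
def interleavings(t1, t2):
    """Yield every schedule that keeps each transaction's own order."""
    if not t1:
        yield list(t2)
        return
    if not t2:
        yield list(t1)
        return
    # Either the next operation comes from t1 ...
    for rest in interleavings(t1[1:], t2):
        yield [t1[0]] + rest
    # ... or it comes from t2.
    for rest in interleavings(t1, t2[1:]):
        yield [t2[0]] + rest


T1 = ["R1(Q)", "W1(Q)"]
T2 = ["R2(Q)", "W2(Q)"]
schedules = list(interleavings(T1, T2))

for s in schedules:
    print(" ".join(s))
```

For two transactions with two operations each there are 6 schedules: 2 serial ones and 4 that interleave the operations.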
Schedules
Conflict Operations
Recoverable Schedule
Recoverable schedule:
One where no committed transaction needs to be
rolled back.
A schedule S is recoverable if no transaction T in S
commits until all transactions T’ that have written an
item that T reads have committed.
T reads from T’ in S if X is first written by T’ and later
read by T.
T’ should not have been aborted before T reads X
There should be no transaction Ti that writes X
after T’ writes it and before T reads it (unless Ti, if any,
has aborted before T reads X).
Sa, Sb and Sa’ are recoverable:
Sa’: r1(X); r2(X); w1(X); r1(Y); w2(X); c2; w1(Y); c1;
Recoverable Schedule
Consider the following schedules:
Sc: r1(X); w1(X); r2(X); r1(Y); w2(X); c2; a1;
Sd: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2;
Se: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); a1; a2;
Sc is not recoverable because:
T2 reads item X from T1, and then T2 commits before T1
commits.
If T1 aborts after c2 operation in Sc, then the value of X
that T2 read is no longer valid and T2 must be aborted
after it is committed, leading to a schedule that is not
recoverable.
For the schedule to be recoverable c2 operation in Sc must
be postponed until after T1 commits, as shown in Sd;
If T1 aborts instead of committing, then T2 should also
abort as shown in Se, because the value of X it read is no
longer valid.
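The recoverability condition can be checked mechanically. The sketch below is one possible reading of the definition, not a standard library routine: `parse` and `is_recoverable` are hypothetical helpers, and a write by a transaction that has already aborted is simply ignored when deciding whom a later read "reads from".

```python
import re


def parse(schedule):
    """'r1(X); c2;' -> [('r', 1, 'X'), ('c', 2, None)]"""
    pattern = r"([rwca])(\d+)(?:\((\w+)\))?"
    return [(k, int(t), x or None) for k, t, x in re.findall(pattern, schedule)]


def is_recoverable(schedule):
    """True if no transaction commits before every transaction it read from."""
    writers = {}                  # item -> transactions that wrote it, in order
    reads_from = {}               # T -> set of T' whose writes T has read
    committed, aborted = set(), set()
    for kind, t, item in parse(schedule):
        if kind == "w":
            writers.setdefault(item, []).append(t)
        elif kind == "r":
            live = [w for w in writers.get(item, []) if w not in aborted]
            if live and live[-1] != t:
                reads_from.setdefault(t, set()).add(live[-1])
        elif kind == "a":
            aborted.add(t)
        elif kind == "c":
            if any(w not in committed for w in reads_from.get(t, ())):
                return False      # t commits before a transaction it read from
            committed.add(t)
    return True


Sc = "r1(X); w1(X); r2(X); r1(Y); w2(X); c2; a1;"
Sd = "r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2;"
Se = "r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); a1; a2;"

for name, s in (("Sc", Sc), ("Sd", Sd), ("Se", Se)):
    print(name, "recoverable:", is_recoverable(s))
```

Under these assumptions the check rejects Sc (c2 comes before T1 finishes) and accepts Sd and Se, matching the discussion above.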
Cascade-less Schedule
Cascadeless Schedule:
One where every transaction reads only the
items that are written by committed
transactions.
r2(X) in Sd and Se must be postponed until after T1
has committed (or aborted), thus delaying T2 but
ensuring no cascading rollback if T1 aborts.
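A sketch of a cascadeless check under the definition above. The helpers `parse` and `is_cascadeless` are hypothetical, and the schedule `Sd_postponed` is an assumed rearrangement of Sd in which r2(X) is delayed until after c1; it does not appear in the original slides.

```python
import re


def parse(schedule):
    """'r1(X); c2;' -> [('r', 1, 'X'), ('c', 2, None)]"""
    pattern = r"([rwca])(\d+)(?:\((\w+)\))?"
    return [(k, int(t), x or None) for k, t, x in re.findall(pattern, schedule)]


def is_cascadeless(schedule):
    """True if every read sees only items written by committed transactions."""
    writers = {}                  # item -> transactions that wrote it, in order
    committed, aborted = set(), set()
    for kind, t, item in parse(schedule):
        if kind == "w":
            writers.setdefault(item, []).append(t)
        elif kind == "r":
            live = [w for w in writers.get(item, []) if w not in aborted]
            # Reading another transaction's uncommitted write is a dirty read.
            if live and live[-1] != t and live[-1] not in committed:
                return False
        elif kind == "c":
            committed.add(t)
        elif kind == "a":
            aborted.add(t)
    return True


Sd = "r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2;"
Sd_postponed = "r1(X); w1(X); r1(Y); w1(Y); c1; r2(X); w2(X); c2;"

print(is_cascadeless(Sd), is_cascadeless(Sd_postponed))
```

Sd is recoverable but not cascadeless, because r2(X) reads T1's uncommitted write; the postponed variant passes the check at the cost of delaying T2.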
Cascade-less Schedule
Strict Schedules:
A schedule in which a transaction can neither
read nor write an item X until the last
transaction that wrote X has committed (or aborted).
Consider the following schedule:
Sf: w1(X, 5); w2(X , 8); a1;
Suppose the value of X was originally 9.
If T1 aborts, as in Sf, the recovery system will
restore the value of X to 9, even though it has
already been changed to 8 by T2, thus leading to
incorrect results.
Although Sf is cascade-less, it is not strict.
It permits T2 to write X even though T1, which last
wrote X, had not yet committed (or aborted).
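The following sketch checks strictness and replays the before-image restore for Sf. The helper names `parse` and `is_strict` are hypothetical, and the second schedule passed to `is_strict` is an assumed strict variant of Sf that does not appear in the slides.

```python
import re


def parse(schedule):
    """'w1(X, 5); a1;' -> [('w', 1, 'X'), ('a', 1, None)]; written values are ignored."""
    pattern = r"([rwca])(\d+)(?:\(\s*(\w+)[^)]*\))?"
    return [(k, int(t), x or None) for k, t, x in re.findall(pattern, schedule)]


def is_strict(schedule):
    """True if no item is read or written while its last writer is unfinished."""
    last_writer = {}              # item -> transaction that wrote it most recently
    finished = set()              # committed or aborted transactions
    for kind, t, item in parse(schedule):
        if kind in ("c", "a"):
            finished.add(t)
            continue
        w = last_writer.get(item)
        if w is not None and w != t and w not in finished:
            return False
        if kind == "w":
            last_writer[item] = t
    return True


Sf = "w1(X, 5); w2(X , 8); a1;"
print(is_strict(Sf), is_strict("w1(X, 5); a1; w2(X, 8); c2;"))

# Replaying Sf with undo by before-image:
x = 9                             # original value of X
before_image_t1 = x               # saved when T1 writes
x = 5                             # w1(X, 5)
x = 8                             # w2(X, 8)
x = before_image_t1               # a1: restore T1's before-image
print("X after undo:", x)         # T2's value 8 has been wiped out
```

In a strict schedule, undoing a write can always be done by restoring the before-image, because no other transaction can have touched the item in the meantime.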
Schedules Classification
In terms of:
1. Recoverability
2. Avoidance of cascading rollback
3. Strictness
Thus,
all strict schedules are cascade-less, and
all cascade-less schedules are recoverable.
Recoverability
Recoverability (Cont.)
Cascading rollback
a single transaction failure leads to a series of
transaction rollbacks.
Consider the following schedule where none of
the transactions has yet committed (so the
schedule is recoverable)
If T10 fails, T11 and T12 must also be rolled back.
Recoverability (Cont.)
Cascadeless schedules
cascading rollbacks cannot occur; for each pair of
transactions Ti and Tj such that Tj reads a data
item previously written by Ti, the commit
operation of Ti appears before the read operation
of Tj.
Every cascadeless schedule is also recoverable
Non-serial schedule:
R1(Q) R2(Q) W1(Q) W2(Q)
Example Schedules
The following is a serial schedule (Schedule 1), in
which T1 is followed by T2.
Example Schedule (Cont.)
Example Schedules (Cont.)
Example Schedules
Example Schedule (Cont.)
Several Observations
serial schedule:
R1(X) W1(X) R1(Y) W1(Y) R2(X) W2(X)
Serializability
Basic Assumption
each transaction preserves database consistency
Serializability
One way to ensure correctness of concurrent
transactions is to enforce serializability of
transactions
that is the interleaved execution of the
transactions must be equivalent to some serial
execution of those transactions.
The interleaved execution of a set of transactions is
considered correct iff it is serializable.
A nonserial but serializable schedule often permits a
higher degree of concurrency than a serial schedule.
Different forms of schedule equivalence give rise
to the notions of:
conflict serializability
view serializability
Serializability
Conflict Serializability
T2 = R2(X) W2(X)
Conflict Serializability (Cont.)
Conflict Serializability (Cont.)
Schedule 3 below can be transformed into Schedule
1, a serial schedule where T2 follows T1, by a series of
swaps of non-conflicting instructions.
Therefore Schedule 3 is conflict serializable.
View Serializability
View Serializability (Cont.)
Testing for Serializability
Example Schedule (Schedule A)
T1 T2 T3 T4 T5
R(X)
R(Y)
R(Z)
R(B)
R(A)
R(A)
R(Y)
W(Y)
W(Z)
R(U)
R(Y)
W(Y)
R(Z)
W(Z)
R(U)
W(U)
[Figure: precedence graph with nodes T1, T2, T3, T4, T5]
Test for Conflict Serializability
Test Schedule Serializability
[Figure: precedence graph with nodes T1, T2, T3, T4]
Two possible orders of topological sorting:
T1, T2, T3, T4 and T1, T2, T4, T3
S is equivalent to both of the above two serial
schedules
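The graph-based test can be sketched as follows, assuming Python 3.9 or later for the standard `graphlib` module. The function names `precedence_graph` and `serial_order` are hypothetical. An edge Ti -> Tj is added whenever an operation of Ti precedes a conflicting operation of Tj (same item, different transactions, at least one write); the schedule is conflict serializable if and only if the graph has no cycle, and any topological order is then an equivalent serial order.

```python
import re
from graphlib import TopologicalSorter, CycleError


def precedence_graph(schedule):
    """Map each transaction to the set of transactions that must precede it."""
    ops = [(k, int(t), x) for k, t, x in re.findall(r"([RW])(\d+)\((\w+)\)", schedule)]
    preds = {t: set() for _, t, _ in ops}
    for i, (k1, t1, x1) in enumerate(ops):
        for k2, t2, x2 in ops[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if t1 != t2 and x1 == x2 and "W" in (k1, k2):
                preds[t2].add(t1)
    return preds


def serial_order(schedule):
    """Return one equivalent serial order, or None if the graph has a cycle."""
    try:
        return list(TopologicalSorter(precedence_graph(schedule)).static_order())
    except CycleError:
        return None


# Non-serial schedule from the slides: T1 -> T2 and T2 -> T1 form a cycle.
print(serial_order("R1(Q) R2(Q) W1(Q) W2(Q)"))

# An assumed interleaving (not from the slides) whose only edge is T1 -> T2.
print(serial_order("R1(X) W1(X) R2(X) W2(X) R1(Y) W1(Y)"))
```

When the graph admits more than one topological order, as in the four-transaction example above, `static_order` returns just one of them; every such order is an equally valid serial equivalent.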