PART 1
1
Lecture Objectives:
• Discuss transaction processing concepts
• Describe desirable properties of transactions
• Achieving the desired properties:
• Concurrency Control
• Recovery
2
What is a Transaction?
• A transaction is the basic logical unit of execution in an
information system.
• Example: Transfer £500. The database may be temporarily in an
inconsistent state during execution.
4
Transaction concepts (3)
• Single-user vs. multi-user systems
• A DBMS is single-user if at most one user can use the
system at a time
• A DBMS is multi-user if many users can use the system
concurrently (the most common case)
• Problem
How to make the simultaneous interactions of multiple
users with the database safe, consistent, correct, and
efficient?
5
Transaction concepts (4)
• Concurrency in Computing systems
• Single-processor computer system
• Multiprogramming
• Inter-leaved Execution
• Pseudo-parallel processing
• Multi-processor computer system
• Parallel processing
6
Transaction concepts (5)
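[Figure: interleaved processing (single processor) versus parallel
processing (two or more processors). Horizontal axis: time (t1, t2).
Interleaved: transactions A and B alternate on CPU1.
Parallel: A runs on CPU1 while B runs on CPU2.]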
7
Transaction concepts (6)
• A transaction T is a logical unit of
database processing that includes one or
more database access operations and can
be either:
• Embedded within an application
program
• Specified interactively (e.g., via SQL)
8
Transaction concepts (7)
• Transactions have boundaries:
• Begin/end transaction
• Types of transactions
• Read transaction
• Write transaction
10
General Idea on Database Read and Write
Operations
• A database is represented as a collection of named data items
• read_item(X)
1. Find the address of the disk block that contains item X
2. Copy the disk block into a buffer in main memory
3. Copy the item X from the buffer to the program variable named X
• write_item(X)
1. Find the address of the disk block that contains item X.
2. Copy that disk block into a buffer in main memory
3. Copy item X from the program variable named X into its correct
location in the buffer.
4. Store the updated block from the buffer back to disk (either
immediately or at some later point in time).
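The steps above can be sketched in a few lines of Python. This is only an illustrative model, not how any particular DBMS implements its buffer manager: the dictionaries `disk`, `item_to_block`, and `buffers`, and the helper `flush`, are all hypothetical names introduced here.

```python
# Hypothetical in-memory model: "disk" maps block addresses to blocks,
# and each block maps item names to values.
disk = {0: {"X": 100, "Y": 200}}
item_to_block = {"X": 0, "Y": 0}   # assumed lookup table: item name -> block address
buffers = {}                        # main-memory copies of disk blocks

def load_block(name):
    addr = item_to_block[name]              # step 1: find the block address
    if addr not in buffers:
        buffers[addr] = dict(disk[addr])    # step 2: copy the block into a buffer
    return addr

def read_item(name):
    addr = load_block(name)
    return buffers[addr][name]              # step 3: copy the item to a program variable

def flush(addr):
    disk[addr] = dict(buffers[addr])        # write step 4: store the block back to disk

def write_item(name, value, flush_now=False):
    addr = load_block(name)
    buffers[addr][name] = value             # step 3: copy the variable into the buffer
    if flush_now:
        flush(addr)                         # step 4 may happen now or later

x = read_item("X")
write_item("X", x - 50)     # the change sits in the buffer until the block is flushed
```

Because step 4 may be deferred, the buffer and the disk can disagree for a while, which is one reason recovery has to be considered alongside concurrency control.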
11
A Transaction: A Formal Example
T1
t0  BEGIN:
    read_item(X);
    read_item(Y);
    X := X - 400000;
    Y := Y + 400000;
tk  write_item(X);
    write_item(Y);
    END
12
Transaction States: A state transition diagram
13
Transaction States
• BEGIN_TRANSACTION: marks start of transaction
• READ or WRITE: two possible operations on the data
• END_TRANSACTION: marks the end of the read or
write operations; start checking whether everything
went according to plan
• COMMIT_TRANSACTION: signals successful end of
transaction; changes can be “committed” to DB
• Partially committed
• ROLLBACK (or ABORT): signals unsuccessful end of
transaction, changes applied to DB must be undone
14
A Sample SQL Transaction
EXEC SQL WHENEVER SQLERROR GOTO UNDO;
EXEC SQL SET TRANSACTION
    READ WRITE,
    DIAGNOSTICS SIZE 5,
    ISOLATION LEVEL SERIALIZABLE;
EXEC SQL INSERT INTO
    EMPLOYEE (FNAME, LNAME, SSN, DNO, SALARY)
    VALUES ('Ali', 'Al-Fares', '991004321', 2, 35000);
EXEC SQL UPDATE EMPLOYEE
    SET SALARY = SALARY * 1.1 WHERE DNO = 2;
EXEC SQL COMMIT;
GOTO END_T;
UNDO: EXEC SQL ROLLBACK;
END_T: ……;
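The same commit-or-rollback pattern can be tried out with Python's standard `sqlite3` module. This is a rough analogue rather than a translation of the embedded SQL above: SQLite has no SET TRANSACTION statement, the table definition and the primary key on SSN are assumptions made here, and `hire_and_raise` is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema; the PRIMARY KEY on SSN is added so that a duplicate insert fails.
conn.execute(
    "CREATE TABLE EMPLOYEE (FNAME TEXT, LNAME TEXT, SSN TEXT PRIMARY KEY, "
    "DNO INTEGER, SALARY REAL)"
)
conn.commit()

def hire_and_raise(conn, fname, lname, ssn, dno, salary):
    """Insert one employee, then raise salaries in department dno by 10%."""
    try:
        conn.execute(
            "INSERT INTO EMPLOYEE (FNAME, LNAME, SSN, DNO, SALARY) "
            "VALUES (?, ?, ?, ?, ?)",
            (fname, lname, ssn, dno, salary),
        )
        conn.execute(
            "UPDATE EMPLOYEE SET SALARY = SALARY * 1.1 WHERE DNO = ?", (dno,)
        )
        conn.commit()          # plays the role of EXEC SQL COMMIT
        return True
    except sqlite3.Error:
        conn.rollback()        # plays the role of the UNDO label
        return False

first = hire_and_raise(conn, "Ali", "Al-Fares", "991004321", 2, 35000)
second = hire_and_raise(conn, "Ali", "Al-Fares", "991004321", 2, 35000)  # duplicate SSN
```

The second call fails on the INSERT, so the whole transaction is rolled back and the salary raise from the first call is not applied twice.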
15
Desirable Properties of Transactions
• ACID properties
1. Atomicity
A transaction is an atomic unit of processing; it is either
performed in its entirety or not performed at all.
2. Consistency preservation
A transaction is consistency preserving if its complete execution
takes the database from one consistent state to another
3. Isolation
The execution of a transaction should not be interfered with by any
other transactions executing concurrently
4. Durability
The changes applied to the database by a committed transaction must
persist in the database. These changes must not be lost because of any
failure
16
Achieving the ACID Properties
• Assertion: If every transaction satisfies the ACID
properties, then the database is always in a
consistent and correct state.
19
Why Do We Interleave Transactions?
T1                      T2
read_item(X);
X := X - N;
write_item(X);
(could be a long wait)
read_item(Y);
Y := Y + N;
write_item(Y);
                        read_item(X);
                        X := X + M;
                        write_item(X);
20
Concurrent Executions
• Serial execution is by far the simplest method to execute
transactions
• No extra work ensuring consistency
• Inefficient!
• Reasons for concurrency:
• Increased throughput
• Reduced average response time
• However, we need correct concurrent execution
21
Concurrency Control
• Why is concurrency control needed?
22
Lost Update
T1              T2
Read(X)
X = X - 5
                Read(X)
                X = X + 5
Write(X)                        (this update is lost)
                Write(X)        (only this update succeeds)
COMMIT
                COMMIT
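A small simulation makes the lost update concrete. It is a sketch under simplifying assumptions: the database is a single dictionary, each transaction keeps a private copy of X, and the starting value 100 and the helper `run` are invented for illustration.

```python
def run(schedule, db):
    """Execute (transaction, step) pairs; each transaction has a private copy of X."""
    local = {}
    for txn, step in schedule:
        if step == "read":
            local[txn] = db["X"]        # Read(X)
        elif step == "write":
            db["X"] = local[txn]        # Write(X)
        else:
            local[txn] += step          # X = X + delta on the private copy
    return db["X"]

# Interleaving from the slide: both transactions read X before either writes it.
interleaved = [("T1", "read"), ("T1", -5), ("T2", "read"), ("T2", +5),
               ("T1", "write"), ("T2", "write")]
# Serial execution of the same two transactions, T1 then T2.
serial = [("T1", "read"), ("T1", -5), ("T1", "write"),
          ("T2", "read"), ("T2", +5), ("T2", "write")]

after_interleaved = run(interleaved, {"X": 100})   # 105: T1's subtraction is overwritten
after_serial = run(serial, {"X": 100})             # 100: both updates are applied
```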
23
Uncommitted Dependency / Temporary Update (“Dirty Read”)
T1              T2
Read(X)
X = X - 5
Write(X)
                Read(X)         (this reads the value of X which it should not have seen)
                X = X + 5
                Write(X)
ROLLBACK
                COMMIT
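The effect of the dirty read can be traced step by step. The sketch below assumes a starting value of 100 and models ROLLBACK as restoring a saved before-image of X; real systems undo changes through their own logging mechanisms.

```python
db = {"X": 100}

# T1: Read(X); X = X - 5; Write(X), not yet committed
t1_before_image = db["X"]     # assumed undo information kept for T1
db["X"] = db["X"] - 5

# T2: Read(X) sees T1's uncommitted value, then X = X + 5; Write(X)
t2_read = db["X"]
db["X"] = t2_read + 5

# T1: ROLLBACK restores the before-image; T2: COMMIT
db["X"] = t1_before_image
final_x = db["X"]

# Had T1 never run, T2 alone would have produced 100 + 5.
expected_x = 100 + 5
```

Here T2 read 95, a value that officially never existed, and the final state (100) differs from what T2 running alone would have produced (105).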
24
Inconsistent analysis/Summary
T1              T2
Read(X)
X = X - 5
Write(X)
                Read(X)
                Read(Y)         (summing up data while it is being updated)
                Sum = X + Y
Read(Y)
Y = Y + 5
Write(Y)
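Tracing the schedule with concrete numbers shows why the sum is wrong. The starting values (X = Y = 100) are assumptions chosen for illustration.

```python
db = {"X": 100, "Y": 100}
total_before = db["X"] + db["Y"]

# T1: Read(X); X = X - 5; Write(X)
db["X"] = db["X"] - 5

# T2: Read(X); Read(Y); Sum = X + Y   (runs between T1's two updates)
t2_sum = db["X"] + db["Y"]

# T1: Read(Y); Y = Y + 5; Write(Y)
db["Y"] = db["Y"] + 5
total_after = db["X"] + db["Y"]
```

T1 moves 5 from X to Y, so the true total is 200 both before and after, yet T2 computes 195 because it sees X after the update and Y before it.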
25
Phantom record problem
T1 T2
Read(X)
Read(X)
Delete(X)
Read(X)
26
Unrepeatable read problem
T1 T2
Read(X)
Read(X)
Write(X)
Read(X)
27
Schedules
• A schedule is a time-ordered sequence of actions taken by
one or more transactions.
Formal Definition:
• Schedule S of n transactions T1, T2, … , Tn is an ordering of
the operations of various transactions subject to the
constraint that, for each transaction Ti that participates in S,
the operations of Ti in S must appear in the same order in
which they occur in Ti.
• That is, the READ and WRITE actions and their order are what
matter when considering concurrency.
28
Example of a Schedule
• A schedule, S:
r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y); c1; c2
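For experimenting with schedules written in this shorthand, a small parser is handy. The sketch below is one possible representation, not a standard one: `parse_schedule` and `projection` are hypothetical helpers, and each operation becomes a tuple (kind, transaction number, item).

```python
import re

def parse_schedule(text):
    """Turn 'r1(X); w2(Y); c1' into [('r', 1, 'X'), ('w', 2, 'Y'), ('c', 1, None)]."""
    ops = []
    for token in (t.strip() for t in text.split(";")):
        if not token:
            continue
        m = re.fullmatch(r"([rwca])(\d+)(?:\((\w+)\))?", token)
        if m is None:
            raise ValueError(f"unrecognised operation: {token!r}")
        ops.append((m.group(1), int(m.group(2)), m.group(3)))
    return ops

def projection(ops, txn):
    """Operations of one transaction, in the order they appear in the schedule."""
    return [op for op in ops if op[1] == txn]

S = parse_schedule("r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y); c1; c2")
t1_ops = projection(S, 1)   # r1(X), w1(X), r1(Y), w1(Y), c1 in that order
```

The projection onto a single transaction must match that transaction's own operation order, which is exactly the constraint in the formal definition of a schedule.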
29
Serial Schedule
• A serial schedule is a schedule where operations of
each transaction are executed consecutively without
any interleaved operations from other transactions
(each transaction commits before the next one is
allowed to begin).
• Non-serial schedule – operations from a set of
concurrent transactions are interleaved
30
Example
• If T2 is scheduled to run after T1 is completely
finished, then the schedule is serial.
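Whether a schedule is serial can be checked mechanically: once a schedule switches away from a transaction, that transaction must never appear again. The function below is a minimal sketch using (kind, transaction, item) tuples, a representation assumed here for illustration.

```python
def is_serial(ops):
    """True if no transaction resumes after another transaction has started."""
    finished = set()
    current = None
    for _, txn, _ in ops:
        if txn != current:
            if txn in finished:
                return False          # txn was interrupted earlier and now resumes
            if current is not None:
                finished.add(current)
            current = txn
    return True

t1_then_t2 = [("r", 1, "X"), ("w", 1, "X"), ("c", 1, None),
              ("r", 2, "X"), ("w", 2, "X"), ("c", 2, None)]
mixed = [("r", 1, "X"), ("r", 2, "X"), ("w", 1, "X"), ("w", 2, "X")]

serial_result = is_serial(t1_then_t2)   # True
mixed_result = is_serial(mixed)         # False
```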
31
Serial Schedule
• We consider transactions to be
Independent (Isolated), so a serial
schedule is always correct
• Based on C and I properties in ACID
• Furthermore, it does not matter which
transaction is executed first, as long as
every transaction is executed in its
entirety, from beginning to end
32
Serializable Schedule
Definition
33
Serializability
• Assumption: Every serial schedule is correct
• Goal: Find non-serial schedules which are also correct
• A schedule S of n transactions is serializable if it is
equivalent to some serial schedule of the same n
transactions
• Serializability of a schedule means equivalence to a serial
schedule (i.e., sequential with no transaction overlap in
time) with the same transactions.
• Q: When are two schedules equivalent?
• Option 1: They lead to same result (result equivalent)
• Option 2: The order of any two conflicting operations is the
same (conflict equivalent)
34
Types of Equivalence
• Conflict equivalence
• Result equivalence
• View equivalence
35
Example: Serial and Serializable
36
Conflict Equivalence
• Conflicting operations are used to define how
schedules are equivalent
• Two operations conflict if they satisfy ALL three
conditions:
1. they belong to different transactions AND
2. they access the same item AND
3. at least one is a write_item() operation
• Example:
• S: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);
• Conflicting pairs in S: r1(X) and w2(X); r2(X) and w1(X); w1(X) and w2(X)
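The three conditions translate directly into code. The sketch below lists every conflicting pair in a schedule, using (kind, transaction, item) tuples as an assumed representation; `conflicts` is a hypothetical helper name.

```python
from itertools import combinations

def conflicts(ops):
    """Return pairs (earlier, later) of conflicting operations."""
    pairs = []
    for a, b in combinations(ops, 2):        # a always precedes b in the schedule
        different_txn = a[1] != b[1]          # condition 1
        same_item = a[2] == b[2]              # condition 2
        one_is_write = "w" in (a[0], b[0])    # condition 3
        if different_txn and same_item and one_is_write:
            pairs.append((a, b))
    return pairs

# S: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);
S = [("r", 1, "X"), ("r", 2, "X"), ("w", 1, "X"),
     ("r", 1, "Y"), ("w", 2, "X"), ("w", 1, "Y")]
found = conflicts(S)
```

For S this yields three pairs, all on item X; the operations on Y never conflict because only T1 touches Y.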
37
Conflict Equivalent Schedules
38
Example 1: Conflict Serialisable Schedule
40
Conflict Equivalence
Serial Schedule S1
T1 T2
read_item(A);
write_item(A);
42
Conflict Equivalence
Schedule S1’’
T1 T2
read_item(A);
write_item(A);
read_item(A); different order than in S1
write_item(A);
read_item(B);
write_item(B);
read_item(B); different order than in S1
write_item(B);
Schedule S1’’ is not conflict equivalent to S1
(produces a different result than S1)
43
Conflict Serialisability
• Conflict serialisable schedules are the main focus
of concurrency control
• They allow for interleaving and at the same time they
are guaranteed to behave as a serial schedule
• Important questions:
• How to determine whether a schedule is conflict serialisable
• How to construct conflict serialisable schedules
44
Test for Conflict Serializability
• Construct a directed graph, precedence graph, G = (V, E)
• V: set of all transactions participating in schedule
• E: set of edges Ti → Tj for which one of the following holds:
• Ti executes a write_item(X) before Tj executes read_item(X)
• Ti executes a read_item(X) before Tj executes write_item(X)
• Ti executes a write_item(X) before Tj executes write_item(X)
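The test can be sketched as two small functions: one builds the edge set from the three rules above, the other looks for a cycle with a depth-first search. A schedule is conflict serializable exactly when the graph has no cycle. The tuple representation and function names are assumptions made for this example.

```python
def precedence_graph(ops):
    """Edges (Ti, Tj) for each conflicting pair in which Ti's operation comes first."""
    edges = set()
    for i, (kind_a, ta, item_a) in enumerate(ops):
        for kind_b, tb, item_b in ops[i + 1:]:
            if ta != tb and item_a == item_b and "w" in (kind_a, kind_b):
                edges.add((ta, tb))
    return edges

def has_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}                                # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "visiting":
            return True                       # back edge: a cycle exists
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        if any(visit(nxt) for nxt in graph[node]):
            return True
        state[node] = "done"
        return False

    return any(visit(node) for node in graph)

def is_conflict_serializable(ops):
    return not has_cycle(precedence_graph(ops))

# r1(X); r2(X); w1(X); w2(X): the lost update interleaving
lost_update = [("r", 1, "X"), ("r", 2, "X"), ("w", 1, "X"), ("w", 2, "X")]
result = is_conflict_serializable(lost_update)   # False: edges T1 -> T2 and T2 -> T1
```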
45
Sample Schedule S
T1                  T2                  T3
                                        read_item(Y);
                                        read_item(Z);
read_item(X);
write_item(X);
                                        write_item(Y);
                                        write_item(Z);
                    read_item(Z);
read_item(Y);
write_item(Y);
                    read_item(Y);
                    write_item(Y);
                    read_item(X);
                    write_item(X);
46
Precedence Graph for S
T1 → T2 (X, Y)
T3 → T1 (Y)
T3 → T2 (Y, Z)
No cycles, so S is serializable.
Equivalent serial schedule: T3 → T1 → T2
(precedence order)
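The graph and the serial order can be reproduced in code. The sketch below encodes sample schedule S under the assumption that its operations belong to T1, T2, and T3 as the precedence graph implies, and uses `graphlib.TopologicalSorter` from the Python standard library (3.9 or later) to derive a serial order.

```python
from graphlib import TopologicalSorter

# Sample schedule S as (kind, transaction, item); the assignment of
# operations to transactions is an assumption consistent with the graph.
S = [("r", 3, "Y"), ("r", 3, "Z"), ("r", 1, "X"), ("w", 1, "X"),
     ("w", 3, "Y"), ("w", 3, "Z"), ("r", 2, "Z"), ("r", 1, "Y"),
     ("w", 1, "Y"), ("r", 2, "Y"), ("w", 2, "Y"), ("r", 2, "X"), ("w", 2, "X")]

predecessors = {txn: set() for _, txn, _ in S}   # Tj -> set of Ti with an edge Ti -> Tj
labels = {}                                       # (Ti, Tj) -> items causing the edge
for i, (kind_a, ta, item_a) in enumerate(S):
    for kind_b, tb, item_b in S[i + 1:]:
        if ta != tb and item_a == item_b and "w" in (kind_a, kind_b):
            predecessors[tb].add(ta)
            labels.setdefault((ta, tb), set()).add(item_a)

# static_order() raises graphlib.CycleError if the graph has a cycle.
serial_order = list(TopologicalSorter(predecessors).static_order())
```

For this schedule the only valid order is T3, T1, T2, matching the precedence order on the slide.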
47
Precedence Graph Example
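• The lost update schedule:
T1              T2
Read(X)
X = X - 5
                Read(X)
                X = X + 5
Write(X)
                Write(X)
COMMIT
                COMMIT
• It has the precedence graph:
T1 → T2: T1 Write(X) followed by T2 Write(X)
T2 → T1: T2 Read(X) followed by T1 Write(X)
• The graph contains the cycle T1 → T2 → T1, so the schedule is
not conflict serialisable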
48
Precedence Graph Example
• No cycles: conflict
serialisable schedule
T1 T2
49
Result Equivalent Schedules
• Two schedules are result equivalent if they produce
the same final state of the database
• Problem: May produce same result by accident!
S1 S2
read_item(X); read_item(X);
X:=X+10; X:=X*1.1;
write_item(X); write_item(X);
Schedules S1 and S2 are result equivalent for X=100 but not in general
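A quick check confirms that the agreement at X = 100 is a coincidence. Exact fractions are used instead of floating point so that the comparison is not disturbed by rounding; the function names are invented for this example.

```python
from fractions import Fraction

def run_s1(x):
    return x + 10                    # S1: X := X + 10

def run_s2(x):
    return x * Fraction(11, 10)      # S2: X := X * 1.1, kept exact

same_at_100 = run_s1(100) == run_s2(100)   # True: both give 110
same_at_200 = run_s1(200) == run_s2(200)   # False: 210 versus 220
```

This is why result equivalence is not used to define serializability: it depends on the particular initial values rather than on the structure of the schedules.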
50
View Serializability
• View serializability
• Definition of serializability based on view equivalence.
A schedule is view serializable if it is view equivalent to a
serial schedule.
51
View Equivalence
Two schedules S and S’ are said to be view equivalent if the following three
conditions hold:
1. The same set of transactions participates in S and S’, and S and S’ include
the same operations of those transactions.
2. For any operation Ri(X) of Ti in S, if the value of X read by the operation
has been written by an operation Wj(X) of Tj (or if it is the original value
of X before the schedule started), the same condition must hold for the
value of X read by operation Ri(X) of Ti in S’.
3. If the operation Wk(Y) of Tk is the last operation to write item Y in S, then
Wk(Y) of Tk must also be the last operation to write item Y in S’.
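The three conditions can be checked with a short function. This is a simplified sketch: it assumes both inputs are valid schedules and that each transaction writes a given item at most once, and the example schedules at the end are made up for illustration.

```python
from collections import Counter

def view_info(ops):
    """Summarise what each read sees and which transaction writes each item last."""
    last_writer = {}             # item -> transaction of the latest write so far
    nth_read = Counter()
    reads_from = {}
    for kind, txn, item in ops:
        if kind == "r":
            nth_read[(txn, item)] += 1
            # None stands for the original value of the item
            reads_from[(txn, item, nth_read[(txn, item)])] = last_writer.get(item)
        elif kind == "w":
            last_writer[item] = txn
    return reads_from, last_writer

def view_equivalent(s, s_prime):
    same_operations = Counter(s) == Counter(s_prime)                # condition 1
    return same_operations and view_info(s) == view_info(s_prime)   # conditions 2 and 3

# Illustrative schedules: r1(X); w2(X); w1(X); w3(X) versus serial T1, T2, T3
interleaved = [("r", 1, "X"), ("w", 2, "X"), ("w", 1, "X"), ("w", 3, "X")]
serial = [("r", 1, "X"), ("w", 1, "X"), ("w", 2, "X"), ("w", 3, "X")]
equivalent = view_equivalent(interleaved, serial)
```

In both schedules r1(X) reads the original value of X and T3 performs the final write, so the check returns True even though w2(X) and w1(X) appear in different orders.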
52
View Equivalence
53
View and Conflict equivalence
54
View and Conflict equivalence
In Sa, the operations w2(X) and w3(X) are blind writes, since T2 and T3 do
not read the value of X.
55
Characterizing Schedules based on
Serializability
• Being serializable is not the same as being serial
• Being serializable implies that the schedule is a
correct schedule.
• It will leave the database in a consistent state.
• The interleaving is appropriate and will result in a
state as if the transactions were serially executed, yet
will achieve efficiency due to concurrent execution.
56
Characterizing Schedules based on
Serializability
57
Characterizing Schedules based on
Serializability
Practical approach:
• Come up with methods (protocols) to ensure serializability.
• It’s not possible to determine when a schedule begins and
when it ends. Hence, we reduce the problem of checking the
whole schedule to checking only the committed projection of the
schedule (i.e., the operations of committed transactions
only).
• Current approach used in most DBMSs (covered next lecture):
• Concurrency control techniques
• Examples
• Two-phase locking technique
• Timestamp ordering technique
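The committed projection mentioned above is simple to compute. The sketch below assumes operations are (kind, transaction, item) tuples with kind 'c' marking a commit; the example schedule is invented.

```python
def committed_projection(ops):
    """Keep only the operations of transactions that commit within ops."""
    committed = {txn for kind, txn, _ in ops if kind == "c"}
    return [op for op in ops if op[1] in committed]

# r1(X); w2(X); w1(X); c1; r3(X)  (T2 and T3 have not committed yet)
S = [("r", 1, "X"), ("w", 2, "X"), ("w", 1, "X"), ("c", 1, None), ("r", 3, "X")]
C = committed_projection(S)   # only T1's operations remain
```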
58
Summary
59