
Module 3: Transaction Processing

Overview
• In this chapter we will:
– Define transaction and discuss its elements
– Define schedules of transactions, and their
desirable properties
– Discuss what problems may arise from concurrent
execution of transactions
– Discuss the serialisation of transactions
– Introduce steps for recovery of a database
Interleaved Model of Concurrency
• Most DBMSs allow multiple users to interact
with the system concurrently, through
interleaving.
• Consider, as an example, two transactions A
and B that are interleaved to access a
database:
Interleaved…
Simultaneous Model of Concurrency

• Another possibility for concurrent execution of
transactions is simultaneous access to the
database.
• Consider, as an example, two transactions A
and B that simultaneously access the
database:
Simultaneous…
What are Transactions?
• A transaction is an atomic unit of work on a database
that either:
– Is completed in its entirety, or
– Has no effect on the database (is not completed)
• In most cases, we are concerned about the atomicity
of transactions that update the database.
• Read-only transactions are typically not difficult to
manage and, throughout this lecture, when discussing
transactions we are concerned with atomicity of
read/write transactions or write-only transactions.
Elements of a Transaction
• There are essentially two primitives in transactions when
considering database operations at the level of disk blocks or
variables/data items:
– read_item(A)—Read database item A into variable a.
– write_item(B)—Write variable b into database item B.
• In the general case, reading and writing database items and
variables requires disk transfers that are measured in disk
blocks (although, often, the item does not span the complete
block).
• Note that some blocks may be cached and, as such,
(subsequent) modifications can be made in the cache memory.
An Overview of Concurrency

T0            T1
read(X)       read(Y)
X=X+1         Y=Y∗2
write(X)      write(Y)
              read(X)
              X=X+2
              write(X)
Introduction…
• Consider two transactions T0 and T1. T0 is an
access to a stock database, to increase the
count of widgets available by 1. T1 doubles
the cost of widgets and adds two more
widgets to the stock count.
Concurrency Problems with Transactions
• Consider the following transaction primitives that are interleaved as
shown:
T0            T1
read(X)       read(Y)
              Y=Y∗2
              write(Y)
              read(X)
              X=X+2
              write(X)
X=X+1
write(X)
Lost Update Problem
• The preceding interleaving results in a "lost
update".
• T0 reads X (the number of widgets), adds 1,
and writes the new value of X (X + 1).
• While this is occurring (between the read and
write in transaction T0), T1 reads X (the number
of widgets), adds 2, and writes a new value
of X (X + 2).
• T0's write then overwrites T1's update, so the
value of X after both transactions has
increased by 1, but should have increased by 3.
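The lost update can be replayed step by step. The following is a minimal sketch, assuming an in-memory dictionary as the database and a starting widget count of 10 (both are illustrative assumptions, not part of the slides):

```python
# Hypothetical in-memory "database"; X is the widget count (10 is assumed).
db = {"X": 10}

# Each transaction holds a private copy of X between its read and its write.
t0_x = db["X"]      # r0(X): T0 reads 10
t1_x = db["X"]      # r1(X): T1 also reads 10, before T0 has written
t1_x = t1_x + 2     # T1: X = X + 2
db["X"] = t1_x      # w1(X): database now holds 12
t0_x = t0_x + 1     # T0: X = X + 1, computed from the stale value 10
db["X"] = t0_x      # w0(X): overwrites T1's update

interleaved_result = db["X"]

# Serial execution (T0 then T1) for comparison.
serial = {"X": 10}
serial["X"] = serial["X"] + 1   # all of T0
serial["X"] = serial["X"] + 2   # all of T1
serial_result = serial["X"]

print(interleaved_result, serial_result)  # prints: 11 13
```

The interleaved run ends with X one higher than it started, while either serial order ends three higher.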
Dirty Read Problem
• Consider two other transactions, interleaved as
follows:
T2            T3
              read(X)
              X=X+2
              write(X)
read(X)
X=X+1
write(X)
              read(Y)
Dirty Read…
• The preceding interleaving can result in a “dirty read”
problem if the last read of transaction T3 fails.
• T3 reads in X, adds 2, and writes the new value X. After
this, T2 reads the new value of X, adds 1, and writes X.
At this point the value of X is “correct”.
• T3 then attempts to read Y and, for some reason, this
fails. T3 should therefore be undone and the database
returned to its previous state.
• However, T2 has read and used the intermediate value
of X from T3. This is known as the “Dirty Read” problem.
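The dirty read can be traced in the same way. This is a sketch only, assuming a starting value of 10 for X and modelling the failed read of Y as a simple flag (both assumptions are for illustration):

```python
initial_x = 10                 # assumed starting value of X
db = {"X": initial_x}

# T3: read(X); X = X + 2; write(X)   (not yet committed)
t3_x = db["X"]
t3_x = t3_x + 2
db["X"] = t3_x

# T2: read(X); X = X + 1; write(X)   (reads T3's uncommitted value)
t2_x = db["X"]
t2_x = t2_x + 1
db["X"] = t2_x

# T3: read(Y) fails, so T3 must be undone.
t3_aborted = True

value_written_by_t2 = db["X"]            # built on T3's intermediate value
value_if_t3_never_ran = initial_x + 1    # what T2 alone would have produced

print(value_written_by_t2, value_if_t3_never_ran)  # prints: 13 11
```

Once T3 is undone, neither choice is right: keeping T2's value retains the effect of the aborted T3, and restoring the value from before T3 discards T2's update.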
Concurrency Control
• Two other common problems are encountered if transactions
can occur concurrently in an uncontrolled fashion:
– Incorrect Summary Problem—One transaction updates values while
another reads and summarizes the same values. Values for
summarization may be read before or after each individual update,
the total thus becoming inconsistent with the parts.
– Unrepeatable Read—A value is read in a transaction, updated by
another transaction, and subsequently re-read by the first
transaction. Despite not modifying the value, the first transaction
encounters two different values (an unrepeatable read).
• We will discuss concurrency control, and mechanisms to ensure that
concurrent transactions execute correctly, in the next chapter.
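Both of these problems can be illustrated with a few lines of straight-line code. The sketch below assumes two hypothetical items A and B holding 100 each, a transfer of 10 from A to B, and an arbitrary update of 5 for the unrepeatable read; none of these values come from the slides.

```python
# Incorrect summary: a transfer of 10 from A to B interleaved with a sum.
db = {"A": 100, "B": 100}
db["A"] = db["A"] - 10              # updater: A = A - 10
summary = db["A"] + db["B"]         # summariser reads A (new) and B (old)
db["B"] = db["B"] + 10              # updater: B = B + 10
actual_total = db["A"] + db["B"]    # total once the updater has finished

# Unrepeatable read: the same item is read twice with an update in between.
db2 = {"X": 50}
first_read = db2["X"]               # first transaction reads X
db2["X"] = db2["X"] + 5             # another transaction updates X
second_read = db2["X"]              # first transaction re-reads X

print(summary, actual_total)        # prints: 190 200
print(first_read, second_read)      # prints: 50 55
```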
An Introduction to Recovery
• The “Dirty Read” problem illustrates the need for
recovery in a database system: either all the elements
of a transaction should be completed and recorded in
the database, or the transaction should have no effect
on the database (or other transactions).
• Clearly, the DBMS must permit all actions in a
transaction or no actions in a transaction to occur.
• Moreover, the DBMS must be able to recover if a
transaction fails after some but not all actions are
carried out.
Types of Transaction Failure
• A transaction can fail for several reasons, such as:
– System failure—A hardware or software error causing, possibly,
the loss of information on both static and non-static devices.
– Transaction failure—Badly formed transactions, or user
interruption.
– Exception conditions—A condition in the transaction is not met.
– Concurrency failure—A problem occurs in assuring the
concurrency of the transaction and other transactions.
• In all cases, a DBMS should have techniques to recover from
both failure and concurrency problems (in some cases, such
as catastrophic system failure, this may not be possible).
Transaction States
• To ensure atomicity, we can state that a transaction must
be in one of several states:
– Active—A transaction that has executed at least one action, but
has not completed.
– Partially committed—A transaction that has completed all
actions, but has not been confirmed as completed.
– Failed—A transaction that fails completion checks, or is aborted
during the active state.
– Committed—A completed and checked transaction (we will
discuss this later).
– Aborted—A transaction that has failed and has had no effect on
the database system
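The states above can be read as a small state machine. The sketch below is one possible encoding; the transition table is an assumption inferred from the state definitions (for example, that a failed transaction can only move to aborted), and the names `State`, `ALLOWED`, and `move` are illustrative.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    COMMITTED = "committed"
    ABORTED = "aborted"


# Assumed transitions, inferred from the definitions of each state.
ALLOWED = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.COMMITTED: set(),   # terminal state
    State.ABORTED: set(),     # terminal state
}


def move(current, target):
    """Return the new state, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


# A transaction that completes all actions and passes its checks.
s = State.ACTIVE
s = move(s, State.PARTIALLY_COMMITTED)
s = move(s, State.COMMITTED)
```

Making committed and aborted terminal states reflects atomicity: a transaction ends either fully applied or with no effect.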
Transaction States …
Desirable Properties of Transactions
• Most importantly, while transactions should be in
known states, they should maintain four properties
(the ACID principle) at all times:
1. Atomicity—A transaction is atomic and should be
completed (committed) or have no effect (aborted).
2. Consistency—A transaction must take the database from
one consistent state to another.
3. Isolation—A transaction should be fully independent in its
actions until it is committed.
4. Durability—Once a transaction is committed, its effects must
persist in the event of subsequent failures.
Summary of Transactions
• Transactions are atomic (they complete or are
aborted completely).
• Transactions must be controlled to allow concurrency.
• Transactions must support (and be supported by)
recovery techniques.
• Transactions should be in a known state and have
known mechanisms to move between states.
• Transactions should obey the ACID principle
(atomicity, consistency, isolation, and durability).
Transaction Schedules
Schedule 1                    Schedule 2
T0            T1              T0            T1
read(X)       read(Y)         read(X)       read(Y)
X=X+1         Y=Y∗2                         Y=Y∗2
write(X)      write(Y)                      write(Y)
              read(X)                       read(X)
              X=X+2                         X=X+2
              write(X)                      write(X)
                              X=X+1
                              write(X)
Transaction Schedules…
• The preceding slide shows two possible schedules for
transactions T0 and T1.
• A schedule is a possibly interleaved ordering of the
operations of one or more transactions such that the operations
of each transaction appear in the same order as they would if
that transaction ran on its own.
• For example, schedule 1 can be written as:
– S1: r0(X); r1(Y); w0(X); w1(Y); r1(X); c0; w1(X); c1;
– Note that r means read, w means write, and c means commit. Other
operations are omitted.
• Schedule 2 is:
– S2: r0(X); r1(Y); w1(Y); r1(X); w1(X); c1; w0(X); c0;
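The shorthand above is easy to process mechanically. As a sketch, the hypothetical helpers `parse` and `preserves_order` below turn a schedule string into tuples and check the defining property of a schedule, namely that each transaction's operations keep their own order:

```python
import re


def parse(schedule):
    """Turn 'r0(X); c0;' into [('r', 0, 'X'), ('c', 0, None)]."""
    ops = []
    for action, txn, item in re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule):
        ops.append((action, int(txn), item or None))
    return ops


def preserves_order(schedule_ops, transactions):
    """True if each transaction's operations appear in their original order."""
    for txn, own_ops in transactions.items():
        projected = [op for op in schedule_ops if op[1] == txn]
        if projected != own_ops:
            return False
    return True


# T0 and T1 on their own (reads, writes, and commits only).
T0 = parse("r0(X); w0(X); c0;")
T1 = parse("r1(Y); w1(Y); r1(X); w1(X); c1;")

S1 = parse("r0(X); r1(Y); w0(X); w1(Y); r1(X); c0; w1(X); c1;")
S2 = parse("r0(X); r1(Y); w1(Y); r1(X); w1(X); c1; w0(X); c0;")
# Not a schedule of T0 and T1: T0's write comes before its read.
bad = parse("w0(X); r0(X); c0; r1(Y); w1(Y); r1(X); w1(X); c1;")

print(preserves_order(S1, {0: T0, 1: T1}))   # prints: True
print(preserves_order(S2, {0: T0, 1: T1}))   # prints: True
print(preserves_order(bad, {0: T0, 1: T1}))  # prints: False
```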
Serialisability of Schedules
• If two transactions are not interleaved, there are two possible
correct schedules:
– one that has all operations in T0 before all operations in T1, and
– a second schedule that has all operations in T1 before all operations in
T0 (these are known as serial schedules).
• If transactions can be interleaved, there may be many possible
interleavings (non-serial schedules).
• What is more interesting, however, is determining the
interleaved schedules that are correct, that is, those that are
serialisable.
• Serialisable schedules are equivalent to a serial schedule of the
same transactions.
Testing for Serialisability
• One approach to test for the serialisability of schedules is to draw
a directed graph using the following approach:
1. For each transaction Ti in S, draw a node labelled Ti.
2. For each read_item(X) in Tj that occurs after a write_item(X) in Ti,
draw an edge Ti → Tj.
3. For each write_item(X) in Tj that occurs after a read_item(X) in Ti,
draw an edge Ti → Tj.
4. For each write_item(X) in Tj that occurs after a write_item(X) in Ti,
draw an edge Ti → Tj.
5. The schedule is serialisable if and only if the graph has no cycles
(is acyclic).
• (This is a test for conflict-serialisability, a stricter form of
serialisability that is used in commercial database systems.)
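Steps 1 to 5 above can be sketched in code. The version below is illustrative rather than a production algorithm: it assumes a schedule is given as a list of (action, transaction, item) tuples containing only reads and writes, and the function names are hypothetical.

```python
def precedence_edges(ops):
    """Steps 2-4: add Ti -> Tj for each conflicting pair, Ti's operation first."""
    edges = set()
    for i, (a1, t1, x1) in enumerate(ops):
        for a2, t2, x2 in ops[i + 1:]:
            # Conflict: same item, different transactions, at least one write.
            if x1 == x2 and t1 != t2 and "w" in (a1, a2):
                edges.add((t1, t2))
    return edges


def has_cycle(nodes, edges):
    """Depth-first search for a cycle in the directed graph."""
    succ = {n: [b for a, b in edges if a == n] for n in nodes}
    state = {n: "new" for n in nodes}

    def visit(n):
        state[n] = "open"
        for m in succ[n]:
            if state[m] == "open" or (state[m] == "new" and visit(m)):
                return True
        state[n] = "done"
        return False

    return any(state[n] == "new" and visit(n) for n in nodes)


def conflict_serialisable(ops):
    """Step 5: serialisable if and only if the precedence graph is acyclic."""
    nodes = {t for _, t, _ in ops}   # step 1: one node per transaction
    return not has_cycle(nodes, precedence_edges(ops))


# Schedules 1 and 2 of T0 and T1, with commits omitted.
S1 = [("r", 0, "X"), ("r", 1, "Y"), ("w", 0, "X"),
      ("w", 1, "Y"), ("r", 1, "X"), ("w", 1, "X")]
S2 = [("r", 0, "X"), ("r", 1, "Y"), ("w", 1, "Y"),
      ("r", 1, "X"), ("w", 1, "X"), ("w", 0, "X")]

print(conflict_serialisable(S1))  # prints: True
print(conflict_serialisable(S2))  # prints: False
```

Schedule 1 yields the single edge T0 → T1, while schedule 2 yields edges in both directions and therefore a cycle.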
Testing for Serialisability …
• Example:
– Is S2: r0(X); r1(Y); w1(Y); r1(X); w1(X); c1; w0(X); c0; serialisable?
– The edge from T0 to T1 is present since w1(X) occurs after r0(X).
– The edge from T1 to T0 is present since w0(X) occurs after r1(X).
– The graph therefore contains the cycle T0 → T1 → T0, so S2 is not
serialisable.
Equivalent Serialisable Schedules
• If the graph contains no cycles, we can create an
equivalent serial schedule by:
– Identifying each edge in the graph.
– Ordering the schedule so that for an edge Ti → Tj , Ti
occurs before Tj .
• If there are no edges, then the operations can be
ordered in (almost) any order.
• In practice, however, rules are used in DBMSs to
ensure serialisability, as detecting cycles for
concurrent transactions is difficult.
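Ordering the transactions so that every edge points forward is a topological sort. A minimal sketch using Python's standard `graphlib` module (Python 3.9 or later) follows; the function name `serial_order` and the example edges are hypothetical and are not taken from the slide diagrams.

```python
from graphlib import CycleError, TopologicalSorter


def serial_order(nodes, edges):
    """Return an equivalent serial order, or None if the graph has a cycle."""
    sorter = TopologicalSorter()
    for node in nodes:
        sorter.add(node)
    for before, after in edges:
        sorter.add(after, before)   # 'after' must come later than 'before'
    try:
        return list(sorter.static_order())
    except CycleError:
        return None


nodes = ["T0", "T1", "T2"]
chain = {("T0", "T1"), ("T1", "T2")}      # assumed example graph
cyclic = {("T0", "T1"), ("T1", "T0")}     # assumed example graph

print(serial_order(nodes, chain))    # prints: ['T0', 'T1', 'T2']
print(serial_order(nodes, cyclic))   # prints: None
```

When the graph has few edges, several orders can be valid and the sort returns just one of them.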
Equivalent Serialisable Schedules …

• For each diagram, what is the equivalent serial
schedule? In both cases it is T0 → T1 → T2.
• Note that being serialisable is not the same as
being serial: a serialisable schedule may be interleaved.
Recoverable Schedules
• First, some terminology: T has read from T’ if, in a schedule, T’
writes X and T later reads X.
• A schedule S is recoverable if no transaction T commits until every
transaction T’ from which T has read has committed.
• The following schedule is not recoverable, since T1 reads X (which
has been written by T0) and commits before T0 has committed; T0 then
aborts:
– S: r0(X); w0(X); r1(X); w1(X); c1; a0;
• A recoverable schedule guarantees that a committed transaction
satisfies the Durability property, that is, it will never need
to be rolled back (reversed to bring the database back to the state
prior to the transaction).
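The recoverability condition can be checked with a single pass over a schedule. The sketch below assumes operations are (action, transaction, item) tuples with 'c' for commit and 'a' for abort, and it treats a read as reading from the most recent writer of the item; it ignores the restoring of old values after an abort, so it is a simplification.

```python
def is_recoverable(ops):
    """True if no transaction commits before every transaction it read from."""
    committed = set()
    last_writer = {}    # item -> transaction that most recently wrote it
    read_from = {}      # reader -> set of transactions it has read from
    for action, txn, item in ops:
        if action == "w":
            last_writer[item] = txn
        elif action == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                read_from.setdefault(txn, set()).add(writer)
        elif action == "c":
            if not read_from.get(txn, set()) <= committed:
                return False    # committing ahead of a transaction it read from
            committed.add(txn)
    return True


# S: r0(X); w0(X); r1(X); w1(X); c1; a0;
not_recoverable = [("r", 0, "X"), ("w", 0, "X"), ("r", 1, "X"),
                   ("w", 1, "X"), ("c", 1, None), ("a", 0, None)]
# Same operations, but T0 commits before T1 does.
recoverable = [("r", 0, "X"), ("w", 0, "X"), ("r", 1, "X"),
               ("w", 1, "X"), ("c", 0, None), ("c", 1, None)]

print(is_recoverable(not_recoverable))  # prints: False
print(is_recoverable(recoverable))      # prints: True
```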
Strict Recoverable Schedules
• In practical terms, guaranteeing that a schedule is recoverable
is only part of a recovery strategy.
• For example, cascading rollback may be required in some
recoverable schedules. Consider the following example:
– S: r0(X); w0(X); r1(X); w1(X); a0;
• In this case, T0 is aborted, but T1 must also be rolled back
since T1 read from T0. A schedule avoids cascading rollback
if transactions only read items written by committed transactions.
• A strict recoverable schedule extends this approach by
imposing that no transaction can read or write an item X until
the last transaction that wrote X has committed or aborted.
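The two conditions above can be checked together. In this sketch the function name `classify` is hypothetical, operations are (action, transaction, item) tuples with 'c' for commit and 'a' for abort, and an aborted writer is simply treated as finished (the restoring of old values is not modelled).

```python
def classify(ops):
    """Return (avoids_cascading_rollback, strict) for a schedule."""
    finished = set()     # transactions that have committed or aborted
    last_writer = {}     # item -> transaction that last wrote it
    cascadeless = True
    strict = True
    for action, txn, item in ops:
        if action in ("c", "a"):
            finished.add(txn)
            continue
        writer = last_writer.get(item)
        uncommitted = (writer is not None and writer != txn
                       and writer not in finished)
        if uncommitted:
            strict = False              # read or write of an unfinished write
            if action == "r":
                cascadeless = False     # reading it risks cascading rollback
        if action == "w":
            last_writer[item] = txn
    return cascadeless, strict


# S: r0(X); w0(X); r1(X); w1(X); a0;
cascading = [("r", 0, "X"), ("w", 0, "X"), ("r", 1, "X"),
             ("w", 1, "X"), ("a", 0, None)]
# T1 overwrites X before T0 finishes, but never reads it.
blind_write = [("w", 0, "X"), ("w", 1, "X"), ("c", 0, None), ("c", 1, None)]
# T1 touches X only after T0 has committed.
strict_schedule = [("r", 0, "X"), ("w", 0, "X"), ("c", 0, None),
                   ("r", 1, "X"), ("w", 1, "X"), ("c", 1, None)]

print(classify(cascading))        # prints: (False, False)
print(classify(blind_write))      # prints: (True, False)
print(classify(strict_schedule))  # prints: (True, True)
```

The second example shows why strictness is the stronger condition: it rules out writes over unfinished data as well as reads of it.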
Transaction Support in SQL
• Begin … end statement
Presentation Assignment
• Strict Serialisability and Strict Recoverability of
schedules
– Oracle and MS-SQL DBMS
