You are on page 1of 72

Database System Concepts, 6th Ed.

©Silberschatz, Korth and Sudarshan


See www.db-book.com for conditions on re-use
n Transaction Concept
n Transaction State
n Concurrent Executions
n Serializability
n Recoverability
n Implementation of Isolation
n Transaction Definition in SQL
n Testing for Serializability.

Database System Concepts - 6th Edition 14.2 ©Silberschatz, Korth and Sudarshan
n A transaction is a unit of program execution that accesses and
possibly updates various data items.
n E.g. transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
n Two main issues to deal with:
l Failures of various kinds, such as hardware failures and system
crashes
l Concurrent execution of multiple transactions

Database System Concepts - 6th Edition 14.3 ©Silberschatz, Korth and Sudarshan
n Transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
n Atomicity requirement
l if the transaction fails after step 3 and before step 6, money will be “lost”
leading to an inconsistent database state
4 Failure could be due to software or hardware
l the system should ensure that updates of a partially executed transaction
are not reflected in the database
n Durability requirement — once the user has been notified that the transaction
has completed (i.e., the transfer of the $50 has taken place), the updates to the
database by the transaction must persist even if there are software or
hardware failures.

Database System Concepts - 6th Edition 14.4 ©Silberschatz, Korth and Sudarshan
n Transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
n Consistency requirement in above example:
l the sum of A and B is unchanged by the execution of the transaction
n In general, consistency requirements include
4 Explicitly specified integrity constraints such as primary keys and foreign
keys
4 Implicit integrity constraints
– e.g. sum of balances of all accounts, minus sum of loan amounts
must equal value of cash-in-hand
l A transaction must see a consistent database.
l During transaction execution the database may be temporarily inconsistent.
l When the transaction completes successfully the database must be
consistent
4 Erroneous transaction logic can lead to inconsistency

Database System Concepts - 6th Edition 14.5 ©Silberschatz, Korth and Sudarshan
n Isolation requirement — if between steps 3 and 6, another
transaction T2 is allowed to access the partially updated database, it
will see an inconsistent database (the sum A + B will be less than it
should be).
T1 T2
1. read(A)
2. A := A – 50
3. write(A)
read(A), read(B), print(A+B)
4. read(B)
5. B := B + 50
6. write(B
n Isolation can be ensured trivially by running transactions serially
that is, one after the other.
l
n However, executing multiple transactions concurrently has significant
benefits, as we will see later.

Database System Concepts - 6th Edition 14.6 ©Silberschatz, Korth and Sudarshan
A transaction is a unit of program execution that accesses and possibly
updates various data items.To preserve the integrity of data the database
system must ensure:
n Atomicity. Either all operations of the transaction are properly reflected
in the database or none are.
n Consistency. Execution of a transaction in isolation preserves the
consistency of the database.
n Isolation. Although multiple transactions may execute concurrently,
each transaction must be unaware of other concurrently executing
transactions. Intermediate transaction results must be hidden from other
concurrently executed transactions.
l That is, for every pair of transactions Ti and Tj, it appears to Ti that
either Tj, finished execution before Ti started, or Tj started execution
after Ti finished.
n Durability. After a transaction completes successfully, the changes it
has made to the database persist, even if there are system failures.

Database System Concepts - 6th Edition 14.7 ©Silberschatz, Korth and Sudarshan
n Active – the initial state; the transaction stays in this state while it is
executing
n Partially committed – after the final statement has been executed.
n Failed -- after the discovery that normal execution can no longer
proceed.
n Aborted – after the transaction has been rolled back and the
database restored to its state prior to the start of the transaction.
Two options after it has been aborted:
l restart the transaction
4 can be done only if no internal logical error
l kill the transaction
n Committed – after successful completion.

Database System Concepts - 6th Edition 14.8 ©Silberschatz, Korth and Sudarshan
Database System Concepts - 6th Edition 14.9 ©Silberschatz, Korth and Sudarshan
n Multiple transactions are allowed to run concurrently in the system.
Advantages are:
l increased processor and disk utilization, leading to better
transaction throughput
4 E.g. one transaction can be using the CPU while another is
reading from or writing to the disk
l reduced average response time for transactions: short
transactions need not wait behind long ones.
n Concurrency control schemes – mechanisms to achieve isolation
l that is, to control the interaction among the concurrent
transactions in order to prevent them from destroying the
consistency of the database
4 Will study in Chapter 16, after studying notion of correctness
of concurrent executions.

Database System Concepts - 6th Edition 14.10 ©Silberschatz, Korth and Sudarshan
n Schedule – a sequences of instructions that specify the chronological
order in which instructions of concurrent transactions are executed
l a schedule for a set of transactions must consist of all instructions
of those transactions
l must preserve the order in which the instructions appear in each
individual transaction.
n A transaction that successfully completes its execution will have a
commit instructions as the last statement
l by default transaction assumed to execute commit instruction as its
last step
n A transaction that fails to successfully complete its execution will have
an abort instruction as the last statement

Database System Concepts - 6th Edition 14.11 ©Silberschatz, Korth and Sudarshan
n Let T1 transfer $50 from A to B, and T2 transfer 10% of the
balance from A to B.
n A serial schedule in which T1 is followed by T2 :

Database System Concepts - 6th Edition 14.12 ©Silberschatz, Korth and Sudarshan
• A serial schedule where T2 is followed by T1

Database System Concepts - 6th Edition 14.13 ©Silberschatz, Korth and Sudarshan
n Let T1 and T2 be the transactions defined previously. The
following schedule is not a serial schedule, but it is equivalent
to Schedule 1.

In Schedules 1, 2 and 3, the sum A + B is preserved.

Database System Concepts - 6th Edition 14.14 ©Silberschatz, Korth and Sudarshan
n The following concurrent schedule does not preserve the
value of (A + B ).

Database System Concepts - 6th Edition 14.15 ©Silberschatz, Korth and Sudarshan
n Basic Assumption – Each transaction preserves database
consistency.
n Thus serial execution of a set of transactions preserves
database consistency.
n A (possibly concurrent) schedule is serializable if it is
equivalent to a serial schedule. Different forms of schedule
equivalence give rise to the notions of:
1. conflict serializability
2. view serializability

Database System Concepts - 6th Edition 14.16 ©Silberschatz, Korth and Sudarshan
l We ignore operations other than read and write
instructions
l We assume that transactions may perform arbitrary
computations on data in local buffers in between reads
and writes.
l Our simplified schedules consist of only read and write
instructions.

Database System Concepts - 6th Edition 14.17 ©Silberschatz, Korth and Sudarshan
n Instructions li and lj of transactions Ti and Tj respectively, conflict
if and only if there exists some item Q accessed by both li and lj,
and at least one of these instructions wrote Q.
1. li = read(Q), lj = read(Q). li and lj don’t conflict.
2. li = read(Q), lj = write(Q). They conflict.
3. li = write(Q), lj = read(Q). They conflict
4. li = write(Q), lj = write(Q). They conflict
n Intuitively, a conflict between li and lj forces a (logical) temporal
order between them.
l If li and lj are consecutive in a schedule and they do not
conflict, their results would remain the same even if they had
been interchanged in the schedule.

Database System Concepts - 6th Edition 14.18 ©Silberschatz, Korth and Sudarshan
n Some applications are willing to live with weak levels of consistency,
allowing schedules that are not serializable
l E.g. a read-only transaction that wants to get an approximate total
balance of all accounts
l E.g. database statistics computed for query optimization can be
approximate (why?)
l Such transactions need not be serializable with respect to other
transactions
n Tradeoff accuracy for performance

Database System Concepts - 6th Edition 14.19 ©Silberschatz, Korth and Sudarshan
n Serializable — default
n Repeatable read — only committed records to be read, repeated
reads of same record must return same value. However, a
transaction may not be serializable – it may find some records
inserted by a transaction but not find others.
n Read committed — only committed records can be read, but
successive reads of record may return different (but committed)
values.
n Read uncommitted — even uncommitted records may be read.

n Lower degrees of consistency useful for gathering approximate


information about the database
n Warning: some database systems do not ensure serializable
schedules by default
l E.g. Oracle and PostgreSQL by default support a level of
consistency called snapshot isolation (not part of the SQL
standard)
Database System Concepts - 6th Edition 14.20 ©Silberschatz, Korth and Sudarshan
n Data manipulation language must include a construct for
specifying the set of actions that comprise a transaction.
n In SQL, a transaction begins implicitly.
n A transaction in SQL ends by:
l Commit work commits current transaction and begins a new
one.
l Rollback work causes current transaction to abort.
n In almost all database systems, by default, every SQL statement
also commits implicitly if it executes successfully
l Implicit commit can be turned off by a database directive
4 E.g. in JDBC, connection.setAutoCommit(false);

Database System Concepts - 6th Edition 14.21 ©Silberschatz, Korth and Sudarshan
Database System Concepts, 6th Ed.
©Silberschatz, Korth and Sudarshan
See www.db-book.com for conditions on re-use
n Lock-Based Protocols
n Timestamp-Based Protocols
n Validation-Based Protocols
n Multiple Granularity
n Multiversion Schemes
n Insert and Delete Operations
n Concurrency in Index Structures

Database System Concepts - 6th Edition 14.23 ©Silberschatz, Korth and Sudarshan
n A lock is a mechanism to control concurrent access to a data item
n Data items can be locked in two modes:
1. exclusive (X) mode. Data item can be both read as well as
written. X-lock is requested using lock-X instruction.
2. shared (S) mode. Data item can only be read. S-lock is
requested using lock-S instruction.
n Lock requests are made to concurrency-control manager. Transaction
can proceed only after request is granted.

Database System Concepts - 6th Edition 14.24 ©Silberschatz, Korth and Sudarshan
n Lock-compatibility matrix

n A transaction may be granted a lock on an item if the requested lock is


compatible with locks already held on the item by other transactions.
n Any number of transactions can hold shared locks on an item,
l but if any transaction holds an exclusive on the item no other
transaction may hold any lock on the item.
n If a lock cannot be granted, the requesting transaction is made to wait till
all incompatible locks held by other transactions have been released.
The lock is then granted.

Database System Concepts - 6th Edition 14.25 ©Silberschatz, Korth and Sudarshan
n Example of a transaction performing locking:
T2: lock-S(A);
read (A);
unlock(A);
lock-S(B);
read (B);
unlock(B);
display(A+B)
n Locking as above is not sufficient to guarantee serializability — if A
and B get updated in-between the read of A and B, the displayed
sum would be wrong.
n A locking protocol is a set of rules followed by all transactions
while requesting and releasing locks. Locking protocols restrict the
set of possible schedules.

Database System Concepts - 6th Edition 14.26 ©Silberschatz, Korth and Sudarshan
n Consider the partial schedule

n Neither T3 nor T4 can make progress — executing lock-S(B) causes T4


to wait for T3 to release its lock on B, while executing lock-X(A) causes
T3 to wait for T4 to release its lock on A.
n Such a situation is called a deadlock.
l To handle a deadlock one of T3 or T4 must be rolled back
and its locks released.

Database System Concepts - 6th Edition 14.27 ©Silberschatz, Korth and Sudarshan
n The potential for deadlock exists in most locking protocols. Deadlocks
are a necessary evil.
n Starvation is also possible if concurrency control manager is badly
designed. For example:
l A transaction may be waiting for an X-lock on an item, while a
sequence of other transactions request and are granted an S-lock
on the same item.
l The same transaction is repeatedly rolled back due to deadlocks.
n Concurrency control manager can be designed to prevent starvation.

Database System Concepts - 6th Edition 14.28 ©Silberschatz, Korth and Sudarshan
n This is a protocol which ensures conflict-serializable schedules.
n Phase 1: Growing Phase
l transaction may obtain locks
l transaction may not release locks
n Phase 2: Shrinking Phase
l transaction may release locks
l transaction may not obtain locks
n The protocol assures serializability. It can be proved that the
transactions can be serialized in the order of their lock points (i.e.,
the point where a transaction acquired its final lock).

Database System Concepts - 6th Edition 14.29 ©Silberschatz, Korth and Sudarshan
n Two-phase locking does not ensure freedom from deadlocks.
n Cascading roll-back is possible under two-phase locking. To avoid
this, follow a modified protocol called strict two-phase locking. Here
a transaction must hold all its exclusive locks till it commits/aborts.
n Rigorous two-phase locking is even stricter: here all locks are held
till commit/abort. In this protocol transactions can be serialized in the
order in which they commit.

Database System Concepts - 6th Edition 14.30 ©Silberschatz, Korth and Sudarshan
n There can be conflict serializable schedules that cannot be obtained if
two-phase locking is used.
n However, in the absence of extra information (e.g., ordering of access
to data), two-phase locking is needed for conflict serializability in the
following sense:
Given a transaction Ti that does not follow two-phase locking, we can
find a transaction Tj that uses two-phase locking, and a schedule for Ti
and Tj that is not conflict serializable.

Database System Concepts - 6th Edition 14.31 ©Silberschatz, Korth and Sudarshan
n Two-phase locking with lock conversions:
– First Phase:
l can acquire a lock-S on item
l can acquire a lock-X on item
l can convert a lock-S to a lock-X (upgrade)
– Second Phase:
l can release a lock-S
l can release a lock-X
l can convert a lock-X to a lock-S (downgrade)
n This protocol assures serializability. But still relies on the programmer to
insert the various locking instructions.

Database System Concepts - 6th Edition 14.32 ©Silberschatz, Korth and Sudarshan
n A transaction Ti issues the standard read/write instruction, without
explicit locking calls.
n The operation read(D) is processed as:
if Ti has a lock on D
then
read(D)
else begin
if necessary wait until no other
transaction has a lock-X on D
grant Ti a lock-S on D;
read(D)
end

Database System Concepts - 6th Edition 14.33 ©Silberschatz, Korth and Sudarshan
n write(D) is processed as:
if Ti has a lock-X on D
then
write(D)
else begin
if necessary wait until no other trans. has any lock on D,
if Ti has a lock-S on D
then
upgrade lock on D to lock-X
else
grant Ti a lock-X on D
write(D)
end;
n All locks are released after commit or abort

Database System Concepts - 6th Edition 14.34 ©Silberschatz, Korth and Sudarshan
n A lock manager can be implemented as a separate process to which
transactions send lock and unlock requests.
n The lock manager replies to a lock request by sending a lock grant
messages (or a message asking the transaction to roll back, in case of
a deadlock).
n The requesting transaction waits until its request is answered.
n The lock manager maintains a data-structure called a lock table to
record granted locks and pending requests.
n The lock table is usually implemented as an in-memory hash table
indexed on the name of the data item being locked.

Database System Concepts - 6th Edition 14.35 ©Silberschatz, Korth and Sudarshan
n Black rectangles indicate granted locks,
white ones indicate waiting requests
n Lock table also records the type of lock
granted or requested
n New request is added to the end of the
queue of requests for the data item, and
granted if it is compatible with all earlier
locks
n Unlock requests result in the request
being deleted, and later requests are
checked to see if they can now be
granted
n If transaction aborts, all waiting or
granted requests of the transaction are
deleted
l lock manager may keep a list of
locks held by each transaction, to
implement this efficiently

Database System Concepts - 6th Edition 14.36 ©Silberschatz, Korth and Sudarshan
n Consider the following two transactions:
T1: write (X) T2: write(Y)
write(Y) write(X)
n Schedule with deadlock

Database System Concepts - 6th Edition 14.37 ©Silberschatz, Korth and Sudarshan
n System is deadlocked if there is a set of transactions such that every
transaction in the set is waiting for another transaction in the set.
n Deadlock prevention protocols ensure that the system will never
enter into a deadlock state. Some prevention strategies:
l Require that each transaction locks all its data items before it
begins execution (predeclaration).
l Impose partial ordering of all data items and require that a
transaction can lock data items only in the order specified by the
partial order (graph-based protocol).

Database System Concepts - 6th Edition 14.38 ©Silberschatz, Korth and Sudarshan
n Following schemes use transaction timestamps for the sake of deadlock
prevention alone.
n wait-die scheme — non-preemptive
l older transaction may wait for younger one to release data item.
Younger transactions never wait for older ones; they are rolled back
instead.
l a transaction may die several times before acquiring needed data
item
n wound-wait scheme — preemptive
l older transaction wounds (forces rollback) of younger transaction
instead of waiting for it. Younger transactions may wait for older
ones.
l may be fewer rollbacks than wait-die scheme

Database System Concepts - 6th Edition 14.39 ©Silberschatz, Korth and Sudarshan
n Both in wait-die and in wound-wait schemes, a rolled back
transactions is restarted with its original timestamp. Older transactions
thus have precedence over newer ones, and starvation is hence
avoided.
n Timeout-Based Schemes:
l a transaction waits for a lock only for a specified amount of time.
After that, the wait times out and the transaction is rolled back.
l thus deadlocks are not possible
l simple to implement; but starvation is possible. Also difficult to
determine good value of the timeout interval.

Database System Concepts - 6th Edition 14.40 ©Silberschatz, Korth and Sudarshan
n Deadlocks can be described as a wait-for graph, which consists of a
pair G = (V,E),
l V is a set of vertices (all the transactions in the system)
l E is a set of edges; each element is an ordered pair Ti Tj.
n If Ti  Tj is in E, then there is a directed edge from Ti to Tj, implying
that Ti is waiting for Tj to release a data item.
n When Ti requests a data item currently being held by Tj, then the edge
Ti Tj is inserted in the wait-for graph. This edge is removed only when
Tj is no longer holding a data item needed by Ti.
n The system is in a deadlock state if and only if the wait-for graph has a
cycle. Must invoke a deadlock-detection algorithm periodically to look
for cycles.

Database System Concepts - 6th Edition 14.41 ©Silberschatz, Korth and Sudarshan
Wait-for graph without a cycle Wait-for graph with a cycle

Database System Concepts - 6th Edition 14.42 ©Silberschatz, Korth and Sudarshan
n When deadlock is detected:
l Some transaction will have to rolled back (made a victim) to break
deadlock. Select that transaction as victim that will incur minimum
cost.
l Rollback -- determine how far to roll back transaction
4 Total rollback: Abort the transaction and then restart it.
4 More effective to roll back transaction only as far as necessary
to break deadlock.
l Starvation happens if same transaction is always chosen as
victim. Include the number of rollbacks in the cost factor to avoid
starvation

Database System Concepts - 6th Edition 14.43 ©Silberschatz, Korth and Sudarshan
n Index-locking protocol to prevent phantoms required locking entire leaf
l Can result in poor concurrency if there are many inserts
n Alternative: for an index lookup
l Lock all values that satisfy index lookup (match lookup value, or
fall in lookup range)
l Also lock next key value in index
l Lock mode: S for lookups, X for insert/delete/update
n Ensures that range queries will conflict with inserts/deletes/updates
l Regardless of which happens first, as long as both are concurrent

Database System Concepts - 6th Edition 14.44 ©Silberschatz, Korth and Sudarshan
Database System Concepts, 6th Ed.
©Silberschatz, Korth and Sudarshan
See www.db-book.com for conditions on re-use
n Failure Classification
n Storage Structure
n Recovery and Atomicity
n Log-Based Recovery
n Remote Backup Systems

Database System Concepts - 6th Edition 14.46 ©Silberschatz, Korth and Sudarshan
n Transaction failure :
l Logical errors: transaction cannot complete due to some internal
error condition
l System errors: the database system must terminate an active
transaction due to an error condition (e.g., deadlock)
n System crash: a power failure or other hardware or software failure
causes the system to crash.
l Fail-stop assumption: non-volatile storage contents are assumed
to not be corrupted by system crash
4 Database systems have numerous integrity checks to prevent
corruption of disk data
n Disk failure: a head crash or similar disk failure destroys all or part of
disk storage
l Destruction is assumed to be detectable: disk drives use
checksums to detect failures

Database System Concepts - 6th Edition 14.47 ©Silberschatz, Korth and Sudarshan
n Consider transaction Ti that transfers $50 from account A to account B
l Two updates: subtract 50 from A and add 50 to B
n Transaction Ti requires updates to A and B to be output to the
database.
l A failure may occur after one of these modifications have been
made but before both of them are made.
l Modifying the database without ensuring that the transaction will
commit may leave the database in an inconsistent state
l Not modifying the database may result in lost updates if failure
occurs just after transaction commits
n Recovery algorithms have two parts
1. Actions taken during normal transaction processing to ensure
enough information exists to recover from failures
2. Actions taken after a failure to recover the database contents to a
state that ensures atomicity, consistency and durability

Database System Concepts - 6th Edition 14.48 ©Silberschatz, Korth and Sudarshan
n Volatile storage:
l does not survive system crashes
l examples: main memory, cache memory
n Nonvolatile storage:
l survives system crashes
l examples: disk, tape, flash memory,
non-volatile (battery backed up) RAM
l but may still fail, losing data
n Stable storage:
l a mythical form of storage that survives all failures
l approximated by maintaining multiple copies on distinct
nonvolatile media
l See book for more details on how to implement stable storage

Database System Concepts - 6th Edition 14.49 ©Silberschatz, Korth and Sudarshan
n Maintain multiple copies of each block on separate disks
copies can be at remote sites to protect against disasters such as
l
fire or flooding.
n Failure during data transfer can still result in inconsistent copies: Block
transfer can result in
l Successful completion
l Partial failure: destination block has incorrect information
l Total failure: destination block was never updated
n Protecting storage media from failure during data transfer (one
solution):
l Execute output operation as follows (assuming two copies of each
block):
1. Write the information onto the first physical block.
2. When the first write successfully completes, write the same
information onto the second physical block.
3. The output is completed only after the second write
successfully completes.

Database System Concepts - 6th Edition 14.50 ©Silberschatz, Korth and Sudarshan
n Protecting storage media from failure during data transfer (cont.):
n Copies of a block may differ due to failure during output operation. To
recover from failure:
1. First find inconsistent blocks:
1. Expensive solution: Compare the two copies of every disk block.
2. Better solution:
l Record in-progress disk writes on non-volatile storage (Non-
volatile RAM or special area of disk).
l Use this information during recovery to find blocks that may be
inconsistent, and only compare copies of these.
l Used in hardware RAID systems
2. If either copy of an inconsistent block is detected to have an error (bad
checksum), overwrite it by the other copy. If both have no error, but are
different, overwrite the second block by the first block.

Database System Concepts - 6th Edition 14.51 ©Silberschatz, Korth and Sudarshan
n Physical blocks are those blocks residing on the disk.
n Buffer blocks are the blocks residing temporarily in main memory.
n Block movements between disk and main memory are initiated
through the following two operations:
l input(B) transfers the physical block B to main memory.
l output(B) transfers the buffer block B to the disk, and replaces the
appropriate physical block there.
n We assume, for simplicity, that each data item fits in, and is stored
inside, a single block.

Database System Concepts - 6th Edition 14.52 ©Silberschatz, Korth and Sudarshan
buffer
Buffer Block A input(A)
X A
Buffer Block B Y B
output(B)
read(X)
write(Y)

x2
x1
y1

work area work area


of T1 of T2

memory disk

Database System Concepts - 6th Edition 14.53 ©Silberschatz, Korth and Sudarshan
n Each transaction Ti has its private work-area in which local copies of
all data items accessed and updated by it are kept.
l Ti's local copy of a data item X is called xi.
n Transferring data items between system buffer blocks and its private
work-area done by:
l read(X) assigns the value of data item X to the local variable xi.
l write(X) assigns the value of local variable xi to data item {X} in
the buffer block.
l Note: output(BX) need not immediately follow write(X). System
can perform the output operation when it deems fit.
n Transactions
l Must perform read(X) before accessing X for the first time
(subsequent reads can be from local copy)
l write(X) can be executed at any time before the transaction
commits

Database System Concepts - 6th Edition 14.54 ©Silberschatz, Korth and Sudarshan
n To ensure atomicity despite failures, we first output information
describing the modifications to stable storage without modifying the
database itself.
n We study log-based recovery mechanisms in detail
l We first present key concepts
l And then present the actual recovery algorithm
n Less used alternative: shadow-paging (brief details in book)

Database System Concepts - 6th Edition 14.55 ©Silberschatz, Korth and Sudarshan
n A log is kept on stable storage.
l The log is a sequence of log records, and maintains a record of
update activities on the database.
n When transaction Ti starts, it registers itself by writing a
<Ti start>log record
n Before Ti executes write(X), a log record
<Ti, X, V1, V2>
is written, where V1 is the value of X before the write (the old value),
and V2 is the value to be written to X (the new value).
n When Ti finishes it last statement, the log record <Ti commit> is written.
n Two approaches using logs
l Deferred database modification
l Immediate database modification

Database System Concepts - 6th Edition 14.56 ©Silberschatz, Korth and Sudarshan
n The immediate-modification scheme allows updates of an
uncommitted transaction to be made to the buffer, or the disk itself,
before the transaction commits
n Update log record must be written before database item is written
l We assume that the log record is output directly to stable storage
l (Will see later that how to postpone log record output to some
extent)
n Output of updated blocks to stable storage can take place at any time
before or after transaction commit
n Order in which blocks are output can be different from the order in
which they are written.
n The deferred-modification scheme performs updates to buffer/disk
only at the time of transaction commit
l Simplifies some aspects of recovery
l But has overhead of storing local copy

Database System Concepts - 6th Edition 14.57 ©Silberschatz, Korth and Sudarshan
Transaction Commit
n A transaction is said to have committed when its commit log record is
output to stable storage
l all previous log records of the transaction must have been output
already
n Writes performed by a transaction may still be in the buffer when the
transaction commits, and may be output later

Database System Concepts - 6th Edition 14.58 ©Silberschatz, Korth and Sudarshan
Log Write Output

<T0 start>
<T0, A, 1000, 950>
<To, B, 2000, 2050
A = 950
B = 2050
<T0 commit>
<T1 start>
<T1, C, 700, 600>
BC output before T1
C = 600 commits
BB , BC
<T1 commit>
BA
BA output after T0
n Note: BX denotes block containing X. commits

Database System Concepts - 6th Edition 14.59 ©Silberschatz, Korth and Sudarshan
Concurrency Control and Recovery
n With concurrent transactions, all transactions share a single disk
buffer and a single log
l A buffer block can have data items updated by one or more
transactions
n We assume that if a transaction Ti has modified an item, no other
transaction can modify the same item until Ti has committed or
aborted
l i.e. the updates of uncommitted transactions should not be visible
to other transactions
4 Otherwise how to perform undo if T1 updates A, then T2
updates A and commits, and finally T1 has to abort?
l Can be ensured by obtaining exclusive locks on updated items
and holding the locks till end of transaction (strict two-phase
locking)
n Log records of different transactions may be interspersed in the log.

Database System Concepts - 6th Edition 14.60 ©Silberschatz, Korth and Sudarshan
Undo and Redo Operations
n Undo of a log record <Ti, X, V1, V2> writes the old value V1 to X
n Redo of a log record <Ti, X, V1, V2> writes the new value V2 to X
n Undo and Redo of Transactions
l undo(Ti) restores the value of all data items updated by Ti to their
old values, going backwards from the last log record for Ti
4 each time a data item X is restored to its old value V a special
log record <Ti , X, V> is written out
4 when undo of a transaction is complete, a log record
<Ti abort> is written out.
l redo(Ti) sets the value of all data items updated by Ti to the new
values, going forward from the first log record for Ti
4 No logging is done in this case

Database System Concepts - 6th Edition 14.61 ©Silberschatz, Korth and Sudarshan
n When recovering after failure:
l Transaction Ti needs to be undone if the log
4 contains the record <Ti start>,
4 but does not contain either the record <Ti commit> or <Ti abort>.
l Transaction Ti needs to be redone if the log
4 contains the records <Ti start>
4 and contains the record <Ti commit> or <Ti abort>
n Note that If transaction Ti was undone earlier and the <Ti abort> record
written to the log, and then a failure occurs, on recovery from failure Ti is
redone
l such a redo redoes all the original actions including the steps that
restored old values
4 Known as repeating history
4 Seems wasteful, but simplifies recovery greatly

Database System Concepts - 6th Edition 14.62 ©Silberschatz, Korth and Sudarshan
Below we show the log as it appears at three instances of time.

Recovery actions in each case above are:


(a) undo (T0): B is restored to 2000 and A to 1000, and log records
<T0, B, 2000>, <T0, A, 1000>, <T0, abort> are written out
(b) redo (T0) and undo (T1): A and B are set to 950 and 2050 and C is
restored to 700. Log records <T1, C, 700>, <T1, abort> are written out.
(c) redo (T0) and redo (T1): A and B are set to 950 and 2050
respectively. Then C is set to 600

Database System Concepts - 6th Edition 14.63 ©Silberschatz, Korth and Sudarshan
n Redoing/undoing all transactions recorded in the log can be very slow
1. processing the entire log is time-consuming if the system has run
for a long time
2. we might unnecessarily redo transactions which have already
output their updates to the database.
n Streamline recovery procedure by periodically performing
checkpointing
1. Output all log records currently residing in main memory onto
stable storage.
2. Output all modified buffer blocks to the disk.
3. Write a log record < checkpoint L> onto stable storage where L
is a list of all transactions active at the time of checkpoint.
l All updates are stopped while doing checkpointing

Database System Concepts - 6th Edition 14.64 ©Silberschatz, Korth and Sudarshan
n During recovery we need to consider only the most recent transaction
Ti that started before the checkpoint, and transactions that started
after Ti.
1. Scan backwards from end of log to find the most recent
<checkpoint L> record
• Only transactions that are in L or started after the checkpoint
need to be redone or undone
• Transactions that committed or aborted before the checkpoint
already have all their updates output to stable storage.
• Some earlier part of the log may be needed for undo operations
1. Continue scanning backwards till a record <Ti start> is found for
every transaction Ti in L.
• Parts of log prior to earliest <Ti start> record above are not
needed for recovery, and can be erased whenever desired.

Database System Concepts - 6th Edition 14.65 ©Silberschatz, Korth and Sudarshan
Tc Tf
T1
T2
T3
T4

checkpoint system failure


n T1 can be ignored (updates already output to disk due to checkpoint)
n T2 and T3 redone.
n T4 undone

Database System Concepts - 6th Edition 14.66 ©Silberschatz, Korth and Sudarshan
Database System Concepts, 6th Ed.
©Silberschatz, Korth and Sudarshan
See www.db-book.com for conditions on re-use
n Remote backup systems provide high availability by allowing transaction
processing to continue even if the primary site is destroyed.

Database System Concepts - 6th Edition 14.68 ©Silberschatz, Korth and Sudarshan
n Detection of failure: Backup site must detect when primary site has
failed
l to distinguish primary site failure from link failure maintain several
communication links between the primary and the remote backup.
l Heart-beat messages
n Transfer of control:
l To take over control backup site first perform recovery using its copy
of the database and all the long records it has received from the
primary.
4 Thus, completed transactions are redone and incomplete
transactions are rolled back.
l When the backup site takes over processing it becomes the new
primary
l To transfer control back to old primary when it recovers, old primary
must receive redo logs from the old backup and apply all updates
locally.

Database System Concepts - 6th Edition 14.69 ©Silberschatz, Korth and Sudarshan
n Time to recover: To reduce delay in takeover, backup site periodically
proceses the redo log records (in effect, performing recovery from
previous database state), performs a checkpoint, and can then delete
earlier parts of the log.
n Hot-Spare configuration permits very fast takeover:
l Backup continually processes redo log record as they arrive,
applying the updates locally.
l When failure of the primary is detected the backup rolls back
incomplete transactions, and is ready to process new transactions.
n Alternative to remote backup: distributed database with replicated data
l Remote backup is faster and cheaper, but less tolerant to failure
4 more on this in Chapter 19

Database System Concepts - 6th Edition 14.70 ©Silberschatz, Korth and Sudarshan
n Ensure durability of updates by delaying transaction commit until update is
logged at backup; avoid this delay by permitting lower degrees of durability.
n One-safe: commit as soon as transaction’s commit log record is written at
primary
l Problem: updates may not arrive at backup before it takes over.
n Two-very-safe: commit when transaction’s commit log record is written at
primary and backup
l Reduces availability since transactions cannot commit if either site fails.
n Two-safe: proceed as in two-very-safe if both primary and backup are
active. If only the primary is active, the transaction commits as soon as is
commit log record is written at the primary.
l Better availability than two-very-safe; avoids problem of lost
transactions in one-safe.

Database System Concepts - 6th Edition 14.71 ©Silberschatz, Korth and Sudarshan
n 14.2; 14.13
n 15.2; 15.23
n 16.1; 16.24
n experiment 8
n submit before the 5:30pm to the monitor
n submit to the TA before 6:00pm.

Database System Concepts - 6th Edition 14.72 ©Silberschatz, Korth and Sudarshan

You might also like