
College of Computing and Informatics

Department of Software Engineering


Advanced Database Systems
Chapter One
Transaction Management & Concurrency Control

By Yaregal N.
Agenda

| What is transaction processing?

| Concurrency control Techniques

| Database recovery

| Transaction and recovery

| Recovery techniques and facilities

What do you understand from the diagram?
What is a Transaction?

| A transaction is a mechanism for applying the desired modifications/operations to a database.

| An action, or series of actions, carried out by a single user or application program, which accesses or changes the contents of the database (i.e. a logical unit of work on the database).

| A transaction is typically implemented by a computer program that includes database commands such as retrievals, insertions, deletions, and updates.

| Examples include ATM transactions, credit card approvals, flight reservations, hotel check-in, phone calls, supermarket scanning, academic registration and billing.

Transaction processing system

| A system that manages transactions and controls their access to a DBMS is called a TP monitor.

| A transaction processing system (TPS) generally consists of a TP monitor, one or more DBMSs, and a set of application programs containing transactions.

| In the database field, a transaction is a group of logical operations that must all succeed or fail as a group. Systems dedicated to supporting such operations are known as transaction processing systems.

| Transaction processing systems are systems with large databases and hundreds
of concurrent users executing database transactions.
Single-User Vs Multi-User Systems

Single-User | Multi-User
Access is restricted to a single user at a time | Access can be shared by multiple users at a time
Database structure is relatively simple | Complex database structure due to shared access; complexity increases with the structure of the database
Switching between projects is difficult as different schema repositories are used | Switching between projects is easy as a single schema repository is used
Changes are committed to the database without causing deadlock | Access sharing makes it difficult to make changes and sometimes causes deadlock
Infrastructure cost is minimal as the database is accessed by a single user at a time | Infrastructure such as servers and networks is needed for shared access
CPU and resources are wasted when the user/application remains idle | Optimum usage of resources between various users
Transaction Outcomes

1. Success - the transaction commits and the database reaches a new consistent state.

• A committed transaction cannot be aborted or rolled back.

• How do you discard a committed transaction?

2. Failure - the transaction aborts, and the database must be restored to the consistent state it was in before the transaction started.

• Such a transaction is rolled back or undone.

• An aborted transaction that is rolled back can be restarted later.

Desirable Properties of Transactions

| A transaction is expected to exhibit some basic features or properties to be considered a valid transaction.

| These are referred to as the ACID properties of a transaction. Without the ACID properties, the integrity of the database cannot be guaranteed.

| These features are:


A: Atomicity
C: Consistency
I: Isolation
D: Durability

Cont…..

| Atomicity: A transaction is an atomic unit of processing.

• It should either be performed in its entirety or not performed at all.

| Consistency preservation: A transaction should be consistency preserving.

• This means that if it is completely executed from beginning to end without interference from other transactions, it should take the database from one consistent state to another consistent state.

Cont…..

| Isolation: A transaction should appear as though it is being executed in isolation from other transactions, even though many transactions are executing concurrently.
• That is, the execution of a transaction should not be interfered with by any other transactions executing concurrently.

| Durability or permanency: The changes applied to the database by a committed transaction must persist in the database.
• These changes must not be lost because of any failure.

States of a Transaction

| A transaction can end in three possible ways.

1. Successful Termination: when a transaction completes the execution of all operations in it and reaches the COMMIT command.

2. Suicidal Termination: when the transaction detects an error during its processing, decides to abort itself before the end of the transaction, and performs a ROLLBACK.

3. Murderous Termination: when the DBMS or the system forces the execution to abort for any reason, and the transaction is hence rolled back.

States of a Transaction

Ways of Transaction Execution: Serially

| In a serial execution, transactions are executed strictly one after another.

| Thus, transaction Ti completes and writes its results to the database, and only then is the next transaction Tj scheduled for execution.

| This means that at any one time only one transaction is being executed in the system.

| The data is not shared between transactions at one specific time.

| In serial transaction execution, the transaction being executed does not interfere with the execution of any other transaction.
Ways of Transaction Execution: Serially

Ways of Transaction Execution: Serially

| Advantages:

• Correct execution (if the input is correct then the output will be correct)

• Fast execution (since all the resources are available to the active transaction)

| Disadvantages:

• Very inefficient resource utilization (i.e. reduced parallelism)

Examples of Serial Execution:

| Suppose data items X = 10, Y = 6, and N = 1, and T1 and T2 are transactions.

| Notation: Read(X) reads data item X from the database
            Write(X) writes data item X into the database

T1            | T2
read (X)      | read (X)
X := X+N      | X := X+N
write (X)     | write (X)
read (Y)      |
Y := Y+N      |
write (Y)     |

| We execute these transactions serially as follows:


Examples of Serial Execution:

Time increases downward.

T1                     | T2
read (X)  {X = 10}     |
X := X+N  {X = 11}     |
write (X) {X = 11}     |
read (Y)  {Y = 6}      |
Y := Y+N  {Y = 7}      |
write (Y) {Y = 7}      |
                       | read (X)  {X = 11}
                       | X := X+N  {X = 12}
                       | write (X) {X = 12}

• Final values of X, Y at the end of T1 and T2: X = 12 and Y = 7.


• Thus we can witness that in serial execution of transaction, if we have two
transactions Ti and Ti+1, then Ti+1 will only be executed after the completion of Ti.
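
The serial run above can be reproduced with a small simulation. This is only an illustrative sketch: the dictionary `db` stands in for the database, and the function names `t1` and `t2` are chosen here for the example.

```python
def t1(db, n=1):
    """T1: add N to X, then add N to Y."""
    x = db["X"]        # read(X)
    x = x + n          # X := X + N
    db["X"] = x        # write(X)
    y = db["Y"]        # read(Y)
    y = y + n          # Y := Y + N
    db["Y"] = y        # write(Y)


def t2(db, n=1):
    """T2: add N to X."""
    x = db["X"]        # read(X)
    x = x + n          # X := X + N
    db["X"] = x        # write(X)


db = {"X": 10, "Y": 6}
t1(db)   # T1 runs to completion first
t2(db)   # only then does T2 start
print(db)  # {'X': 12, 'Y': 7}
```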
Ways of Transaction Execution: Concurrently

| Concurrently: the opposite of serial execution. In this scheme the individual operations of transactions, i.e., reads and writes, are interleaved in some order.

Time increases downward.

T1                     | T2
read (X)  {X = 10}     |
                       | read (X)  {X = 10}
X := X+N  {X = 11}     |
                       | X := X+N  {X = 11}
write (X) {X = 11}     |
                       | write (X) {X = 11}
Ways of Transaction Execution: Concurrently

| Final values at the end of T1 and T2: X = 11 and Y = 7.

| This improves resource utilization but, unfortunately, gives an incorrect result.

| The correct value of X is 12, but in concurrent execution X = 11, which is incorrect.

| The reason for this error is the incorrect sharing of X by T1 and T2.

| In serial execution T2 read the value of X written by T1 (i.e., 11), but in concurrent execution T2 read the same value of X (i.e., 10) as T1 did, and the update made by T1 was overwritten by T2's update.

| This is the reason the final value of X is one less than what is produced by serial execution.
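
The difference between the two executions can be checked with a small step-by-step simulation. This is a sketch under simplifying assumptions: only the operations on X are modelled (T1's steps on Y are left out), each transaction keeps a private copy of what it read, and the helper names `make_increment`, `run` and `final_x` are invented for the example.

```python
def make_increment(db, item, n):
    """Return the three steps 'read, add N, write' as separate callables."""
    local = {}                      # private workspace of the transaction

    def read():
        local[item] = db[item]      # read(item)

    def compute():
        local[item] += n            # item := item + N

    def write():
        db[item] = local[item]      # write(item)

    return [read, compute, write]


def run(order, transactions):
    """Run one step at a time; each entry in order names whose next step runs."""
    position = {name: 0 for name in transactions}
    for name in order:
        transactions[name][position[name]]()
        position[name] += 1


def final_x(order):
    db = {"X": 10}
    transactions = {
        "T1": make_increment(db, "X", 1),
        "T2": make_increment(db, "X", 1),
    }
    run(order, transactions)
    return db["X"]


serial_x = final_x(["T1", "T1", "T1", "T2", "T2", "T2"])
interleaved_x = final_x(["T1", "T2", "T1", "T2", "T1", "T2"])
print(serial_x, interleaved_x)  # 12 11
```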
Why Concurrent Execution?

 Reasons:
| Many systems have an independent component to handle I/O like DMA
(Direct Memory Access) module.
| CPU may process other transactions while one is in I/O operation

| Improves resource utilization

| Increases the throughput of the system.

Problems Associated with Concurrent Transaction Processing

| Lost update problem: a successfully completed update on a data item by one transaction is overridden by another transaction/user.
Ex1:- Account with balance A=100.
T1 reads the account A
T1 withdraws 10 from A
T1 makes the update in the Database
T2 reads the account A
T2 adds 100 on A
T2 makes the update in the Database
| In the above case, if done one after the other (serially) then we have no problem.

| If the execution is T1 followed by T2 then A=190

| If the execution is T2 followed by T1 then A=190


Cont…..

| But if they start at the same time in the following sequence:


T1 reads the account A=100

T1 withdraws 10 making the balance A=90

T2 reads the account A=100

T2 adds 100 making A=200

T1 makes the update in the Database A=90

T2 makes the update in the Database A=200


| After the successful completion of the operations in this schedule, the final value of A will be 200, which overrides the update made by the first transaction that changed the value from 100 to 90.
Cont……

| Uncommitted Dependency Problem: occurs when one transaction can see the intermediate results of another transaction before it is committed.

| Example: T2 adds 100 to the balance, making it 200, but then aborts before the change is committed. Meanwhile T1 reads 200, subtracts 10 and makes it 190. But the actual balance should be 90.

Cont……

| Inconsistent Analysis Problem: occurs when a transaction reads several values but a second transaction updates some of them during the execution, and before the completion, of the first.

| Example: T2 would like to add the values of A = 10, B = 20 and C = 30. After the values are read by T2 and before its completion, T1 updates the value of B to 50. At the end of the execution of the two transactions, T2 will come up with the sum of 60 while it should be 90, since B has been updated to 50.
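
The arithmetic of this example can be replayed in a few lines. The sketch assumes T2 copies the values it reads into a private workspace (`snapshot`, a name invented here) before T1's update arrives.

```python
db = {"A": 10, "B": 20, "C": 30}

# T2 reads all three values into its private workspace.
snapshot = {item: db[item] for item in ("A", "B", "C")}

# T1 updates B after T2 has read it but before T2 completes.
db["B"] = 50

t2_sum = sum(snapshot.values())    # computed from the stale value of B
current_sum = sum(db.values())     # what the database now contains
print(t2_sum, current_sum)  # 60 90
```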

Serializability

| When multiple transactions are being executed by the operating system in a multiprogramming environment, the instructions of one transaction may be interleaved with those of another transaction.

| In any transaction processing system that implements concurrent processing, there is a concept called a schedule, which determines the execution sequence of the operations of the different transactions.

| Basic assumption- each transaction preserves database consistency.

Schedule

| Time-ordered sequence of the important actions taken by one or more transactions.

| Represents the order in which instructions are executed in the system, in chronological order.

| The scheduler component of a DBMS must ensure that the individual steps of different transactions preserve consistency.

| A schedule for a set of transactions must consist of all instructions of those transactions.

Cont…..

| Serial Schedule: a schedule where the operations of each transaction are executed
consecutively without any interleaved operations from other transactions.

| Non-serial Schedule: a schedule where operations from a set of concurrent transactions are interleaved.

| The objective of serializability is to find non-serial schedules that allow transactions to execute concurrently without interfering with one another.

| In other words, we want to find non-serial schedules that are equivalent to some serial schedule. Such a schedule is called serializable.

Serialization

| The goal is to find schedules that allow transactions to execute concurrently without interfering with one another.

| If two transactions only read data, order is not important.

| If two transactions either read or write completely separate data items, they do not conflict and order is not important.

| If one transaction writes a data item and another reads or writes the same data item, the order of execution is important.
| Possible solution: run all transactions serially.
| This is often too restrictive, as it limits the degree of concurrency or parallelism in the system.
Test of Serializability

| Constrained write rule: a transaction updates a data item based on its old value, which is first read by the transaction.

| Under the constrained write rule we can use a precedence graph to test for serializability.
| A precedence graph is composed of:
• Nodes to represent the transactions, and
• Arcs (directed edges) connecting nodes, indicating the order of execution of the transactions
| A directed edge Ti -> Tj implies that, in an equivalent serial execution, Ti must be executed before Tj.
Precedence Graph for Schedule S
| The set of edges contains all edges Ti -> Tj for which one of the three conditions holds:
1. Create an edge Ti -> Tj if Ti executes write(Q) before Tj executes read(Q)
2. Create an edge Ti -> Tj if Ti executes read(Q) before Tj executes write(Q)
3. Create an edge Ti -> Tj if Ti executes write(Q) before Tj executes write(Q)

Example 1:

Precedence Graph for Schedule S
| The precedence graph for the schedule looks like:

Precedence graph for S1


Note: The precedence graph for schedule S1 contains a cycle, which is why schedule S1 is non-serializable.

Precedence Graph for Schedule S
| Example2:

Precedence Graph for Schedule S
| Example 2 precedence graph for S2:

| The precedence graph for schedule S2 contains no cycle, which is why schedule S2 is serializable.
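
The precedence-graph test can be sketched in code. The schedules S1 and S2 from the slides are not reproduced in this text, so the example below instead uses the serial and interleaved executions of T1 and T2 on X shown earlier. The tuple format `(transaction, operation, item)` and the function names are assumptions made for this sketch.

```python
def precedence_edges(schedule):
    """schedule: list of (transaction, op, item) in time order, op is 'R' or 'W'."""
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Two operations conflict if they come from different transactions,
            # touch the same item, and at least one of them is a write.
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True              # back edge: a cycle exists
        visiting.add(node)
        if any(visit(nxt) for nxt in graph[node]):
            return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(visit(node) for node in graph)


def is_serializable(schedule):
    return not has_cycle(precedence_edges(schedule))


serial = [("T1", "R", "X"), ("T1", "W", "X"), ("T2", "R", "X"), ("T2", "W", "X")]
interleaved = [("T1", "R", "X"), ("T2", "R", "X"), ("T1", "W", "X"), ("T2", "W", "X")]
print(is_serializable(serial), is_serializable(interleaved))  # True False
```

In the interleaved schedule, T2 reads X before T1 writes it (edge T2 -> T1) and T1 reads X before T2 writes it (edge T1 -> T2), which forms the cycle.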
Concurrency Control Techniques

Concurrency Control Technique?

| Concurrency control is the process of managing simultaneous operations on the database without having them interfere with one another.

| It prevents interference when two or more users are accessing the database simultaneously and at least one is updating data.

| Although two transactions may be correct in themselves, interleaving of operations may produce an incorrect result.

| When more than one transaction is running simultaneously, conflicts may occur that can leave the database in an inconsistent state.

Cont….

| The purposes of concurrency control are:

1. To ensure the isolation property of concurrently executing transactions

2. To preserve database consistency

3. To resolve read-write and write-write conflicts

4. To help ensure serializability

| Example:
• In a concurrent execution environment, if T1 conflicts with T2 over a data item A, then the concurrency controller decides whether T1 or T2 should get A and which transaction should be rolled back or made to wait.
Cont…….
| Three basic concurrency control techniques are:

1. Locking methods
2. Time stamping
• Methods 1 and 2 are pessimistic (conservative) approaches, since they delay transactions in case they conflict with other transactions.
3. Optimistic
• Allows transactions to proceed and checks for conflicts at the end.

• Assumes conflict is rare and only checks for conflicts at commit.

Locking Method
| The locking method is a mechanism for preventing simultaneous access to a shared resource during a critical operation.

| A LOCK is a mechanism for enforcing limits on access to a resource in an environment where there are many threads of execution.

| A transaction uses locks to deny access to other transactions and so prevent incorrect updates.

| A lock prevents another transaction from modifying an item or, in the case of a write lock, even reading it.

Types of a Lock

Binary Locks
| Can have two states or values: locked and unlocked (or 1 and 0).

| A distinct lock is associated with each database item X.

| If the value of the lock on X is 1, item X cannot be accessed by a database operation that requests the item. If the value of the lock on X is 0, the item can be accessed when requested, and the lock value is changed to 1.

| We refer to the current value (or state) of the lock associated with item X as lock(X).

| Too restrictive for database items because at most one transaction can hold a lock on a given item.
Cont…
| Lock and Unlock operations are as follows:

Cont…
| Every transaction in a binary locking scheme must obey the following rules:

1. A transaction T must issue the operation lock_item(X) before any read_item(X) or write_item(X) operations are performed in T.

2. A transaction T must issue the operation unlock_item(X) after all read_item(X) and write_item(X) operations are completed in T.

3. A transaction T will not issue a lock_item(X) operation if it already holds the lock on item X.

4. A transaction T will not issue an unlock_item(X) operation unless it already holds the lock on item X.
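
The binary lock and rules 3 and 4 above can be sketched as a small lock table. This is a simplified, single-threaded illustration, not a DBMS implementation: instead of putting the requester to sleep, `lock_item` returns `False` when the transaction would have to wait. The class name `BinaryLockTable` and the `holder` dictionary are assumptions made for the example.

```python
class BinaryLockTable:
    def __init__(self):
        # item -> transaction holding the lock; a missing key means lock(X) = 0
        self.holder = {}

    def lock_item(self, txn, item):
        if self.holder.get(item) == txn:
            # Rule 3: do not lock an item that is already held.
            raise RuntimeError(f"{txn} already holds the lock on {item}")
        if item in self.holder:
            return False              # lock(X) = 1: the requester must wait
        self.holder[item] = txn       # lock(X) := 1
        return True

    def unlock_item(self, txn, item):
        if self.holder.get(item) != txn:
            # Rule 4: only the holder may unlock.
            raise RuntimeError(f"{txn} does not hold the lock on {item}")
        del self.holder[item]         # lock(X) := 0


locks = BinaryLockTable()
first = locks.lock_item("T1", "X")    # granted
second = locks.lock_item("T2", "X")   # denied: T2 would have to wait
locks.unlock_item("T1", "X")
third = locks.lock_item("T2", "X")    # granted once T1 has unlocked X
print(first, second, third)  # True False True
```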
Shared/Exclusive (Read/Write) Locks

| Allow several transactions to access the same item X if they all access X for reading purposes only.

| There are three locking operations: read_lock(X), write_lock(X), and unlock(X).

| A lock associated with an item X, LOCK(X), has three possible states: "read-locked," "write-locked," or "unlocked."

| A read-locked item is also called share-locked because other transactions are allowed to read the item, whereas a write-locked item is called exclusive-locked because a single transaction exclusively holds the lock on the item.

Cont…..

| In the shared/exclusive method, data items can be locked in two modes:

• Shared mode: shared lock (X)
o More than one transaction can apply a shared lock on X for reading its value, but no write lock can be applied on X by any other transaction.

• Exclusive mode: write lock (X)
o Only one write lock on X can exist at any time, and no shared lock can be applied by any other transaction on X.

Cont….

Cont…

| When we use the shared/exclusive locking scheme, the system must enforce the following rules:

1. A transaction T must issue the operation read_lock(X) or write_lock(X) before any read_item(X) operation is performed in T.

2. A transaction T must issue the operation write_lock(X) before any write_item(X) operation is performed in T.

3. A transaction T must issue the operation unlock(X) after all read_item(X) and write_item(X) operations are completed in T.

Cont…

4. A transaction T will not issue a read_lock(X) operation if it already holds a read (shared) lock or a write (exclusive) lock on item X.

5. A transaction T will not issue a write_lock(X) operation if it already holds a read (shared) lock or a write (exclusive) lock on item X.

6. A transaction T will not issue an unlock(X) operation unless it already holds a read (shared) lock or a write (exclusive) lock on item X.

Lock Conversion
| Sometimes it is desirable to relax rules 4 and 5 in the preceding list in order to allow lock conversions. That is:
• Under certain conditions, a transaction that already holds a lock on item X is allowed to convert the lock from one lock state to another.
• For example, it is possible for a transaction T to issue a read_lock(X) and then later upgrade the lock by issuing a write_lock(X) operation.
• If T is the only transaction holding a read lock on X at the time it issues the write_lock(X) operation, the lock can be upgraded; otherwise, the transaction must wait.
• It is also possible for a transaction T to issue a write_lock(X) and then later downgrade the lock by issuing a read_lock(X) operation.

Lock Conversion
| Lock upgrade: change an existing read lock to a write lock.
o If Ti has a read-lock (X) and no other transaction Tj (i ≠ j) has a read-lock (X), then convert read-lock (X) to write-lock (X); else force Ti to wait until Tj unlocks X.

| Lock downgrade: change an existing write lock to a read lock.

o If Ti has a write-lock (X) (no other transaction can have any lock on X), convert write-lock (X) to read-lock (X).
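
Shared/exclusive locking with upgrade and downgrade can be sketched as follows. This is a single-threaded illustration under stated assumptions: a request that would have to wait simply returns `False`, and the class name `SharedExclusiveLockTable` and its attributes are invented for the example.

```python
class SharedExclusiveLockTable:
    def __init__(self):
        self.readers = {}   # item -> set of transactions holding a read lock
        self.writer = {}    # item -> transaction holding the write lock

    def read_lock(self, txn, item):
        current_writer = self.writer.get(item)
        if current_writer is not None and current_writer != txn:
            return False                    # write-locked by another transaction
        if current_writer == txn:
            del self.writer[item]           # downgrade: write lock -> read lock
        self.readers.setdefault(item, set()).add(txn)
        return True

    def write_lock(self, txn, item):
        if self.writer.get(item, txn) != txn:
            return False                    # write-locked by another transaction
        if self.readers.get(item, set()) - {txn}:
            return False                    # other readers exist: must wait
        self.readers.get(item, set()).discard(txn)   # upgrade if txn was a reader
        self.writer[item] = txn
        return True

    def unlock(self, txn, item):
        if self.writer.get(item) == txn:
            del self.writer[item]
        else:
            self.readers.get(item, set()).discard(txn)


locks = SharedExclusiveLockTable()
r1 = locks.read_lock("T1", "X")             # True: shared lock granted
r2 = locks.read_lock("T2", "X")             # True: X is shared with T1
up_blocked = locks.write_lock("T1", "X")    # False: T2 also holds a read lock
locks.unlock("T2", "X")
up_ok = locks.write_lock("T1", "X")         # True: T1 is now the only reader
r3 = locks.read_lock("T2", "X")             # False: X is write-locked by T1
down = locks.read_lock("T1", "X")           # True: T1 downgrades its own lock
print(r1, r2, up_blocked, up_ok, r3, down)  # True True False True False True
```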

Example: Locking Method
| T1 and T2 are two transactions. They are executed under locking as follows. T1
locks A in exclusive mode. When T2 wants to lock A, it finds it locked by T1 so
T2 waits for Unlock on A by T1. When A is released then T2 locks A and begins
execution.

| Suppose a lock on a data item is applied, the data item is processed and it is
unlocked immediately after reading/writing is completed as follows.

| Initial values of A = 10 and B = 20.

Example: Locking Method

Two-Phase Locking (2PL)
| A transaction is said to follow two phase locking protocol if all locking
operations (either read_lock or write_lock) precede the first unlock operation in
the transaction.

| This is a protocol which ensures conflict-serializable schedules.


| Phase 1: Growing Phase
o transaction may obtain locks
o transaction may not release locks
| Phase 2: Shrinking Phase
o transaction may release locks
o transaction may not obtain locks

| Hence the 2PL protocol makes it possible to avoid the three problems of concurrent execution.
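
Whether a single transaction obeys the two-phase rule can be checked mechanically: no lock request may appear after the first unlock. The sketch below assumes the transaction is given as a list of operation strings in the notation used on these slides; the function name is hypothetical.

```python
def follows_two_phase_locking(operations):
    """operations: e.g. ['read_lock(X)', 'write_item(X)', 'unlock(X)'] in order."""
    shrinking = False
    for op in operations:
        if op.startswith("unlock"):
            shrinking = True          # the first unlock ends the growing phase
        elif op.startswith(("read_lock", "write_lock")) and shrinking:
            return False              # a lock was requested after an unlock
    return True


# Unlocks A before locking B: not two-phase.
not_2pl = ["write_lock(A)", "write_item(A)", "unlock(A)",
           "write_lock(B)", "write_item(B)", "unlock(B)"]

# Acquires every lock before the first unlock: two-phase.
is_2pl = ["write_lock(A)", "write_lock(B)", "write_item(A)",
          "unlock(A)", "write_item(B)", "unlock(B)"]

print(follows_two_phase_locking(not_2pl), follows_two_phase_locking(is_2pl))  # False True
```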
Locking methods: Problems
A. Deadlock: an impasse that may result when two (or more) transactions are each waiting for locks held by the other to be released.

• Deadlock - possible solutions:

o Only one way to break a deadlock: abort one or more of the transactions in the deadlock.

• Deadlock should be transparent to the user, so the DBMS should restart the transaction(s).

• Two general techniques for handling deadlock:

1. Deadlock prevention.

2. Deadlock detection and recovery.


Locking methods: Problems
B. Timeout:
o Deadlock detection can be done using the TIMEOUT technique.

o Every transaction is given a maximum time to wait for a lock.

o If a transaction waits for the predefined period of time in idle mode, the DBMS assumes that a deadlock has occurred, and it aborts and restarts the transaction.

Time-stamping Methods
• Time stamp (TS): a unique identifier created by the DBMS to identify a transaction.

• Time-stamping: a concurrency control protocol that orders transactions in such a way that older transactions (transactions with smaller time stamps) get priority in the event of conflict.

• Conflict is resolved by rolling back and restarting a transaction.

• Since locks are not used, there will be no deadlock.

Cont……
| Rules for permitting execution of operations in the time-stamping method (RTS(A) and WTS(A) are the read and write time stamps of data item A):
| Suppose that transaction Ti issues Read(A)
o If TS(Ti) < WTS(A): this implies that Ti needs to read a value of A which was already overwritten. Hence the read operation must be rejected and Ti is rolled back.
o If TS(Ti) >= WTS(A): the read is executed and RTS(A) is set to the maximum of RTS(A) and TS(Ti).
| Suppose that transaction Ti issues Write(A)
o If TS(Ti) < RTS(A): this implies that the value of A that Ti is producing was previously needed and it was assumed that it would never be produced. Hence, the write operation must be rejected and Ti is rolled back.
o If TS(Ti) < WTS(A): this implies that Ti is attempting to write an obsolete value of A. Hence, this write operation can be ignored.
o Otherwise the write operation is executed and WTS(A) is set to the maximum of WTS(A) and TS(Ti).
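
These read and write rules translate almost directly into code. The sketch below assumes integer time stamps and that RTS(A) and WTS(A) start at 0; the `Item` class and the string results `"ok"`, `"rollback"` and `"ignored"` are conventions chosen for the example.

```python
class Item:
    def __init__(self, value):
        self.value = value
        self.rts = 0    # RTS(A): largest time stamp that has read A
        self.wts = 0    # WTS(A): largest time stamp that has written A


def read(ts, item):
    if ts < item.wts:
        return "rollback", None          # the value Ti needs was overwritten
    item.rts = max(item.rts, ts)
    return "ok", item.value


def write(ts, item, value):
    if ts < item.rts:
        return "rollback"                # a younger transaction already read A
    if ts < item.wts:
        return "ignored"                 # obsolete write, skipped
    item.value = value
    item.wts = max(item.wts, ts)
    return "ok"


a = Item(100)
r1 = read(2, a)         # ('ok', 100), RTS(A) becomes 2
w1 = write(1, a, 50)    # 'rollback': TS = 1 < RTS(A) = 2
w2 = write(3, a, 70)    # 'ok', WTS(A) becomes 3
w3 = write(2, a, 60)    # 'ignored': TS = 2 < WTS(A) = 3
r2 = read(2, a)         # ('rollback', None): TS = 2 < WTS(A) = 3
print(r1, w1, w2, w3, r2)
```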

Optimistic Technique
| In this technique, Serializability is checked only at the time of commit and
transactions are aborted in case of non-serializable schedules

| No checking is done while transaction is executing

| In this scheme, updates in the transaction are not applied directly to the database
item until it reaches its commit point

| Three phases:

1. Read phase

2. Validation phase

3. Write phase
Cont….
1. Read phase: A transaction can read values of committed data items. However, updates are applied only to local copies (versions) of the data items (in the database cache).

2. Validation phase: Serializability is checked before transactions write their updates to the database.

3. Write phase: On a successful validation, the transaction's updates are applied to the database; otherwise, the transaction is restarted.
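
The three phases can be sketched with a very simple validation rule. The slides do not say how validation is performed, so the version counter per data item used below is an assumption of this sketch, as are the class and attribute names: a transaction passes validation only if every item it read still has the version it saw.

```python
class OptimisticTransaction:
    def __init__(self, db, versions):
        self.db = db
        self.versions = versions     # item -> number of committed writes so far
        self.read_versions = {}      # item -> version seen during the read phase
        self.local = {}              # local copies of updated items

    def read(self, item):
        if item in self.local:
            return self.local[item]
        self.read_versions.setdefault(item, self.versions[item])
        return self.db[item]

    def write(self, item, value):
        self.local[item] = value     # read phase: only the local copy changes

    def commit(self):
        # Validation phase: every item read must be unchanged since it was read.
        for item, seen in self.read_versions.items():
            if self.versions[item] != seen:
                return False         # validation failed: the caller restarts
        # Write phase: apply the local copies to the database.
        for item, value in self.local.items():
            self.db[item] = value
            self.versions[item] += 1
        return True


db, versions = {"A": 100}, {"A": 0}
t1 = OptimisticTransaction(db, versions)
t2 = OptimisticTransaction(db, versions)
t1.write("A", t1.read("A") - 10)     # T1 withdraws 10 (local copy: 90)
t2.write("A", t2.read("A") + 100)    # T2 adds 100 (local copy: 200)
c1 = t1.commit()                     # True: A becomes 90
c2 = t2.commit()                     # False: A changed after T2 read it
t2 = OptimisticTransaction(db, versions)   # restart T2
t2.write("A", t2.read("A") + 100)
c3 = t2.commit()                     # True: A becomes 190
print(c1, c2, c3, db["A"])  # True False True 190
```

The restarted T2 produces 190, the same result as the serial execution of the account example, instead of losing T1's update.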

Database Recovery

Definition
| Database recovery is the process of restoring database to a correct state in the
event of a failure.

| A database recovery is the process of eliminating the effects of a failure from the
database.

| In database systems terminology, recovery means restoring the last consistent state of the data items.

Purpose of Database Recovery
| To bring the database into the last consistent state, which existed prior to the
failure.
| To preserve transaction properties (Atomicity, Consistency, Isolation and
Durability).

| Example:
o If the system crashes before a fund transfer transaction completes its
execution, then either one or both accounts may have incorrect value. Thus,
the database must be restored to the state before the transaction modified any
of the accounts.

Types of Database Recovery
| The database may become unavailable for use due to:

1. Transaction failure: Transactions may fail because of incorrect input, deadlock, incorrect synchronization timeout, protection violation, or system error.

2. System failure: The database system is unable to process any transactions because of addressing error, application error, operating system fault, RAM failure, register overflow, etc.

3. Media failure: Failure of non-volatile storage media (mainly disk). Disk head crash, power disruption, etc.
Cont…….
| The basic steps in performing a recovery are:

1. Isolating the database from other users. Occasionally, you may need to drop
and re-create the database to continue the recovery.

2. Restoring the database from the most recent useable dump.

3. Applying transaction log dumps, in the correct sequence, to the database to make the data as current as possible.

Transaction Log
| Execution history of concurrent transactions.
| For recovery from any type of failure, the data values prior to modification (BFIM - BeFore IMage) and the new values after modification (AFIM - AFter IMage) are required.
| These values and other information are stored in a sequential file called the transaction log.

Back P | Next P | Operation | Data item | BFIM    | AFIM
0      | 1      | Begin     |           |         |
1      | 4      | Write     | X         | X = 100 | X = 200
0      | 8      | Begin     |           |         |
2      | 5      | W         | Y         | Y = 50  | Y = 100
4      | 7      | R         | M         | M = 200 | M = 200
0      | 9      | R         | N         | N = 400 | N = 400
5      | nil    | End       |           |         |
Recovery Facilities
| DBMS should provide following facilities to assist with recovery:
o Backup mechanism: that makes periodic backup copies of database.

o Logging facility: that keeps track of current state of transactions and database
changes.
o Checkpoint facility: that enables updates to database in progress to be made
permanent.
o Recovery manager: allows the DBMS to restore the database to a consistent state following a failure.

Checkpointing
| Randomly or under some criteria, the DBMS flushes its buffers to the database on disk to minimize the task of recovery.
| A checkpoint record is written into the log periodically, at the point when the system writes out to the database on disk all DBMS buffers that have been modified.
| The following steps define a checkpoint operation:
o Suspend execution of transactions temporarily.
o Force-write modified buffer data to disk.
o Write a [checkpoint] record to the log and save the log to disk.
o Resume normal transaction execution.
| During recovery, redo or undo is required only for transactions appearing after the [checkpoint] record.

Recovery Techniques
1. Deferred Update
o Updates are not written to the database until after a transaction has reached its commit point.
o If a transaction fails before commit, it will not have modified the database, so no undoing of changes is required.
o It may be necessary to redo updates of committed transactions, as their effect may not have reached the database.
o If a transaction aborts, ignore its log records: nothing is done with a transaction that has both a "transaction start" and a "transaction abort" log record.
o The redo operations are performed in the order they were written to the log.
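
Recovery under deferred update can be sketched as a single forward pass over the log. The log record layout used here, tuples such as `("write", transaction, item, new_value)`, is an assumption made for the example and not the format of any particular DBMS.

```python
def recover_deferred(log, db):
    """Redo the writes of committed transactions in log order; ignore the rest."""
    committed = {record[1] for record in log if record[0] == "commit"}
    for record in log:
        if record[0] == "write" and record[1] in committed:
            _, _, item, new_value = record
            db[item] = new_value         # redo: apply the new value
    return db


log = [
    ("start", "T1"),
    ("write", "T1", "X", 200),
    ("commit", "T1"),
    ("start", "T2"),
    ("write", "T2", "Y", 100),
    # crash happens here: T2 never reached its commit point
]
db = {"X": 100, "Y": 50}    # state on disk: no deferred update was applied yet
recover_deferred(log, db)
print(db)  # {'X': 200, 'Y': 50}
```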
Recovery Techniques
2. Immediate Update/ Update-In-Place
o Updates are applied to database as they occur.
o Need to redo updates of committed transactions following a failure.
o May need to undo effects of transactions that had not committed at time of failure.
o It is essential that log records are written before the corresponding write to the database (the write-ahead log protocol).
o If no "transaction commit" record in log, then that transaction was active at failure and must
be undone.
o Undo operations are performed in reverse order in which they were written to log.
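
Recovery under immediate update needs both passes: undo in reverse log order, then redo in log order. As a sketch, write records are assumed to be tuples of the form `("write", transaction, item, BFIM, AFIM)`; this layout and the function name are illustrative only.

```python
def recover_immediate(log, db):
    committed = {record[1] for record in log if record[0] == "commit"}
    # Undo: restore before images of uncommitted writes, newest first.
    for record in reversed(log):
        if record[0] == "write" and record[1] not in committed:
            db[record[2]] = record[3]
    # Redo: reapply after images of committed writes, oldest first.
    for record in log:
        if record[0] == "write" and record[1] in committed:
            db[record[2]] = record[4]
    return db


log = [
    ("start", "T1"),
    ("write", "T1", "X", 100, 200),
    ("commit", "T1"),
    ("start", "T2"),
    ("write", "T2", "Y", 50, 100),
    ("write", "T2", "Y", 100, 150),
    # crash happens here: no "transaction commit" record for T2
]
# State on disk at the crash: T1's update had not reached the disk,
# while T2's uncommitted updates had.
db = {"X": 100, "Y": 150}
recover_immediate(log, db)
print(db)  # {'X': 200, 'Y': 50}
```

Undoing in reverse order matters for Y: restoring the before image of the second write (100) and then of the first (50) returns Y to its value before T2 started.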

Recovery Techniques
3. Shadow Paging
o Maintain two page tables during the life of a transaction: the current page table and the shadow page table.
o When the transaction starts, the two page tables are the same.
o The shadow page table is never changed thereafter and is used to restore the database in the event of failure.
o During the transaction, the current page table records all updates to the database.
o When the transaction completes, the current page table becomes the shadow page table.
o No log record management is needed.
o However, it has the overhead of managing pages, i.e. page replacement issues have to be handled.
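
The role of the two page tables can be sketched with an in-memory model. Everything here is a simplification for illustration: pages are Python values, the class and method names are invented, one transaction runs at a time, and old physical pages are never reclaimed.

```python
class ShadowPagedDB:
    def __init__(self, pages):
        self.storage = dict(enumerate(pages))         # physical page id -> contents
        self.shadow = {i: i for i in self.storage}    # logical page -> physical page
        self.current = None                           # exists only during a transaction
        self.next_id = len(self.storage)

    def begin(self):
        self.current = dict(self.shadow)    # both tables are the same at the start

    def write(self, logical, contents):
        self.storage[self.next_id] = contents     # write to a fresh physical page
        self.current[logical] = self.next_id      # only the current table changes
        self.next_id += 1

    def read(self, logical):
        table = self.current if self.current is not None else self.shadow
        return self.storage[table[logical]]

    def commit(self):
        self.shadow, self.current = self.current, None   # current becomes shadow

    def abort(self):
        self.current = None     # the shadow table still points at the old pages


db = ShadowPagedDB(["a0", "b0"])
db.begin()
db.write(0, "a1")
in_txn = db.read(0)          # 'a1': seen through the current page table
db.abort()
after_abort = db.read(0)     # 'a0': restored from the shadow page table
db.begin()
db.write(1, "b1")
db.commit()
after_commit = db.read(1)    # 'b1'
print(in_txn, after_abort, after_commit)  # a1 a0 b1
```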

Many Thanks!!!
