You are on page 1of 32

F28DM Database Management Systems

Transaction Management
monica@macs.hw.ac.uk Room: EMG30, Ext: 4160 Material on Vision & my web page

Monica Farrow

Content taken from HW & GLA lecturers

Transaction Management

Supporting Concurrent Access


Many applications require a lot of users to access the data simultaneously (e.g. airline booking systems)
Uncontrolled simultaneous access can result in chaos, so some controlling mechanism is required We introduce the notion of the transaction to aid the discussion

A transaction is a logical unit of work which takes the DB from one consistent state to another, i.e. obeying constraints
It will probably be made up of smaller operations which temporarily cause inconsistency
Transaction Management 2

Transactions
Database transactions are logical units of work which must ALL be performed to maintain data integrity E.g. Move money from one account to another
UPDATE Account SET balance = balance 100 WHERE accountNo = 123; UPDATE Account SET balance = balance + 100 WHERE accountNo = 124;

Another example would be a purchase

Create order, decrease stock quantity, add payment


Transaction Management 3

ACID Properties of Transactions


Atomicity
ALL operations in a transaction must be completed. If not, the transaction is aborted. The entire transaction is treated as a single, indivisible unit of work which must be performed completely or not at all.

Consistency
If an operation is executed that violates the databases integrity constraints, the entire transaction will be rolled back. A successful transaction takes the database from one state that is consistent with the rules to another state that is also consistent with the rules.
Transaction Management 4

ACID Properties of Transactions


Isolation
Data used within a transaction cannot be used by another transaction until the first transaction is completed. (or it must appear that this happened!). The partial effects of incomplete transactions should not be visible to other transactions.

Durability
Once the transaction changes have been made, they will survive failure. The recovery system must ensure this.

Transaction Management

Transactions in Oracle
Transactions can consist of one or more SQL commands.
In Oracle, a transaction starts when you connect to sqlplus, and ends when you type COMMIT;
If youre accessing the database from 2 different applications, this explains why what youve altered in one application may not show up in the second.

An implicit COMMIT occurs before and after any Data Definition commands (CREATE, ALTER etc)
You can also SET AUTOCOMMIT ON (or OFF) to force a commit after each command.

Transaction Management

Transactions and the user


For the JDBC section of the coursework, Jenny is using the word transaction in a slightly more general way, meaning a logical unit of work, without necessarily requiring it to use database transactions
e.g.
Show data to user Ask for response Use response in next sql command

There is a problem with this sort of transaction where the user is involved because of the length of time taken. It is impractical to treat them within the DBMS in the same way as transactions not involving a user.
There are various partial solutions not discussed in this module
Transaction Management

Rollback
The DBMS maintains a transaction log. If the computer crashes in the middle of a transaction, the DBMS will rollback the database to the last completed transaction

Transaction Management

How transactions are used


Transactions are used for three purposes in DBMS:
to determine when integrity constraint checks should occur (only at the end of transactions) to control concurrent access. Gives a single user the illusion of being the sole user of the database to manage recovery from system crashes

Transaction Management

More generally, in databases


A transaction is the execution of a program that accesses the DB and starts with a BEGIN operation, followed by a sequence of READ and WRITE operations, ending with a COMMIT or ABORT operation.

Transaction Management

10

Concurrent access
In introducing many users, we can either serialise their transactions or interleave them
We wish to do the latter as we want to use the processor to perform other work while one transaction waits for a disc access However, we must not allow the transactions to conflict with each other Conflict may occur when two transactions are trying to use the same piece of data and at least one of them is trying to change it
Transaction Management 11

Schedules
The execution of a set of transactions is called a schedule If each transaction is executed entirely before the next transaction is started, the schedule is said to be serial Non-serial schedules are called interleaved schedules A schedule is serializable if it has the same effect on the database as a serial schedule
Here are some examples of problems with non-serial schedules
Transaction Management 12

Lost updates
Consider two transactions A and B which add 10 and 20 respectively to a value V A and B both take a copy of the original value of V They both change the value in memory A puts back its new value first and then B puts back its new value which immediately overwrites As change A's update is lost! A B | | Get V | | Get V Add 10 | | Add 20 Put V | put V V 5 5 5 5 5 15 25

Transaction Management

13

Temporary Update
A updates V B uses A's updated value A aborts and V's old value is restored B continues with erroneous value!
A B | | Store 20 in V | | Get V Crash & | Rollback | Use wrong value of V | V 5 20 20 5 5 5

Transaction Management

14

Incorrect Summary
A updates all the values in a set V B calculates an average while A is half-way through B uses inconsistent data
B A | | | Update V1 | Update V2 Read all Vs . | . Calculate . avg . | . Update Vn

One solution to these problems is to use locks


Transaction Management

15

Concurrency control Algorithms


The scheduler component of a DBMS applies some concurrency control algorithm (enforces a protocol) to ensure that only serializable schedules are permitted
Concurrency control algorithms can be divided into
Locking vs timestamp protocols Pessimistic and optimistic

One commonly used algorithm is 2-phase locking


Transaction Management 16

Locking
Every time a transaction makes use of a piece of data it notifies the DBMS of this and acquires a lock on that item This gives it certain access rights, usually one of two types:
an exclusive Lock (X-lock) means that no-one else can use it a shared Lock (S-lock) means that anyone else can also have an S-lock but not an X-lock (NB - Oracle has more than this)

"One writer or many readers"


When updating, the transaction needs an X-lock When retrieving, the transaction only needs an S-lock

Transaction Management

17

Locking continued
If a transaction tries to acquire a lock but someone else already holds an incompatible lock, the transaction must wait
The database system might provide locks at different levels of granularity:
e.g. locking a cell, a record, a page, the whole table, the whole database the bigger the locking unit the more the system will be slowed down by blocked transactions the smaller the locking unit the more lock management needs to be done
Transaction Management 18

2-Phase Locking protocol


If a lock request cannot be granted, the transaction must wait.
Rather than acquiring and releasing locks whenever they are required, a 2-phase locking protocol if often used. The transaction passes through 2 phases
a growth phase, acquiring locks and not releasing any a shrink phase, releasing locks and not acquiring more.

There are variations in strict 2PL, all write locks are released at the end.
Transaction Management 19

Solving Lost Updates


Transaction A Request X-lock on V Transaction B Request X-lock on V Wait .... .... Wait Acquire X-lock on V Get V Update V Value of V 5 5 5 5 5 5 15 15 15 15 35

Acquire X-lock on V
Get V Update V Release X-lock on V

Release X-lock on V

35

Transaction Management

20

Solving Temporary update


Transaction A Request X-lock on V Acquire X-lock on V Set V to 20 Crash Roll back Release lock Transaction B Value of V 5 Request S-lock on V 5 Wait 5 20 20 5 5 Acquire an S-lock on V 5 Get V 5

Transaction Management

21

Solving incorrect summary


Transaction A Request X-lock on V1 Acquire X-lock on V1 Update V1 Request X-lock on V2 ....... Release all locks Transaction B
Request S-lock on V1 Wait

Acquire S-lock on V1 etc.

Transaction Management

22

Deadlocks
2PL still leads to deadlocks A deadlock is a cycle of transactions waiting for locks to be released by each other
E.g.
T1 holds excl(X) and requests shrd(Y) T2 is holding excl(Y) and requests shrd(X)

Deadlocks can be timed-out, prevented or detected


Transaction Management 23

Deadlock handling
Time-outs
Assume that if a transaction is blocked longer than a certain period of time, it must be involved in a deadlock. Abort. Easy to implement, but may abort some transactions unnecessarily

Prevention
Order transactions by timestamps, apply rules as to whether transactions are allowed to wait or must be restarted.

Detection
Periodically check for deadlock. Create a Wait-For Graph. Deadlock exists if there are cycles in the graph. Abort transactions until cycles vanish

Transaction Management

24

Time Based Concurrency Control: Idea

Lock-based concurrency control is pessimistic


Assumption conflicts are likely to happen, and locking prevents this

Timestamp-based concurrency control is

optimistic

If conflict is discovered, transactions are rolled back and restarted

Transaction Management

25

Timestamps
Each transaction T gets a timestamp TS at startup
This may be from the system clock or simply by incrementing a logical counter for each transaction

Each data item contains timestamp data


Read-timestamp: largest (youngest) timestamp of the transactions which read the data Write-timestamp: largest (youngest) timestamp of the transactions which wrote the data

Transaction Management

26

Timestamps
Timestamping orders transactions so that older transactions get priority in case of conflict R/W operation is only allowed if the last update on a data item was carried out by an older transaction. If not, the transaction requesting the operation is aborted and restarted with a new timestamp.
No locking means no waiting and no deadlocks
Transaction Management 27

Timestamp reading
A transaction T wants to read X TS(T) < WriteTS(X)

Conflict! A later transaction has changed X. Values already acquired by T may now be inconsistent. Abort and restart T

TS(T)>= WriteTS(X)

T started after X was updated Let T read X Update ReadTS(X) if T is the youngest transaction to read X
Transaction Management 28

Timestamp writing
A transaction T wants to write X
TS(T) < ReadTS(X)
Conflict! A later transaction is using X. Its too late to update it. Abort and restart T with a later timestamp

TS(T)< WriteTS(X)

X has been written by a later transaction. Either abort and restart T with a later timestamp Or ignore the write on the grounds that it would already have been aborted on the read if it mattered (Thomass write rule)

Otherwise, T can write X and update WriteTS(X)


Transaction Management 29

Timestamps summary
2 kinds of conflict can arise
A transaction wants to write an item that has been read by a later transaction A transaction wants to read an item that has been written by a later transaction

Schedules may not be recoverable


E.G. T1 : W1(A) T2: R2(A) -> W2(A) -> Commit Need to rollback T1 before T1 commits, but cant rollback T2.

Full timestamp-based CC is more complex. . .


30

Transaction Management

Multiversion concurrency control


MVCC provides each user connected to the database with a "snapshot" of the database for that person to work with. Any changes made will not be seen by other users of the database until the transaction has been committed. Any changes made will not be seen by other users of the database until the transaction has been committed.
Timestamps are used to ensure serializability MVCC is used in Oracle
Transaction Management 31

Timestamp examples
2 transactions, T1 and T2.
So T2 is the later transaction

TS(T1) = 1, TS(T2) = 2

R1 means read by T1, W2 means write by T2 etc T1 = R1(X), W1(X), R1(Y), W1(Y) T2 = R2(Y), W2(Y) Schedule R1(X), W1(X), R1(Y), R2(Y), W2(Y), W1(Y)
Ok until W1(Y), when TS(T1) < ReadTS(Y) {1< 2}

Try
Schedule R1(X), W1(X), R1(Y), R2(X), W2(X), W1(Y) Schedule R1(X), R2(Y), W2(Y) ,W1(X), R1(Y), W1(Y)

Transaction Management

32