You are on page 1of 38

Unit – 4

Transaction Processing System


Transaction processing systems are the systems, which includes large size
database and allows many users to use same database concurrently.

A transaction process system (TPS) is an information processing system


for business transactions involving the collection, modification and retrieval of
all transaction data. Characteristics of a TPS include performance, reliability
and consistency.

Examples:
1. Railway Reservation System

2. Banking System

3. User Database based website etc.

1
Transaction System
 Collection of operation that form of single logical unit of work are called
transaction.

 A Transaction is a unit of program execution that accesses and possibly


updates various data items.

 Transaction access data using two operations:


 Read (X) : Which transfers the data item (X) from the database to

local buffer belong to the transaction that executed the read


operation.

 Write (X) : Which transfers the data item (X) from the local buffer of
the transaction that executed the write back to the database.
2
Transaction System
Ex. transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)

 Two main issues to deal with:


 Failures of various kinds, such as hardware failures and system crashes

 Concurrent execution of multiple transactions

3
Properties of Transaction
 Transaction have four basic properties, which are called ACID properties.
These properties are closely related to each other.

A Atomicity: a transaction is an atomic unit of processing and it is either


performed entirely or not at all.

C Consistency : a transaction's correct execution must take the database


from one correct state to another

I Isolation/Independence: the updates of a transaction must not be made


visible to other transactions until it is committed (solves the temporary
update problem)

D Durability (or Permanency): if a transaction changes the database and is


4
committed, the changes must never be lost because of subsequent failure
ACID Properties Examples
 Transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)

 Atomicity requirement
 if the transaction fails after step 3 and before step 6, money will be “lost” leading to an
inconsistent database state. Failure could be due to software or hardware.
 the system should ensure that updates of a partially executed transaction are not
reflected in the database.

 Durability requirement — once the user has been notified that the transaction has
completed (i.e., the transfer of the $50 has taken place), the updates to the database by the
transaction must carry on even if there are software or hardware failures. 5
ACID Properties Examples
 Consistency requirement :

 In general, consistency requirements include integrity constraints such


as primary keys and foreign keys and sum of balances of all accounts
etc.

 A transaction must be a consistent database.

 During transaction execution the database may be temporarily


inconsistent.

 When the transaction completes successfully the database must be


consistent

 invalid transaction logic can lead to inconsistency


6
ACID Properties Examples
 Isolation requirement — if between steps 3 and 6, another transaction
T2 is allowed to access the partially updated database, it will see an
inconsistent database.

T1 T2
1. read(A)
2. A := A – 50
3. write(A)
read(A), read(B), print(A+B)
4. read(B)
5. B := B + 50
6. write(B )
 Isolation can be ensured trivially by running transactions serially
 that is, one after the other.
 However, executing multiple transactions concurrently has significant 7
Transaction State

8
Transaction State

 Active – the initial state; the transaction stays in this state while it is executing.

 Partially committed – after the final statement has been executed.

 Failed -- after the discovery that normal execution can no longer proceed.

 Aborted – after the transaction has been rolled back and the database restored to
its state prior to the start of the transaction. Two options after it has been aborted:
 restart the transaction
 can be done only if no internal logical error
 kill the transaction

 Committed – after successful completion.


9
Recovery from Transaction Failures
A recovery process is an integral part of a database system, which is responsible
for system failure. The system failure may be several cause :

• Computer Failure (system crash)


• A transaction or system error
• Local errors or exception conditions detected by the transaction
• Concurrency control enforcement
• Disk failure and Physical component problems

There are various techniques of recovery:


• Cascading Rollback
• Recoverable Schedules
• Log Based Recovery
• Checkpoints
• Shadow Paging
• Backup Mechanism 10
Cascading Rollback
The event, in which a single transaction failure leads to a series of transaction
rollback is known as Cascading Rollback.
Example: Transaction T1, write a value of Z, that is read by transaction T 2.
Transaction T2, write a value of Z, that is read by transaction T 3.
Suppose that, at this point T1 fails, T1 must be rolled back. Since T2 is dependent on
T1.T2 must be rollback. Since T3 is dependent on T2. T3 must also be rolled back.

T1 T2 T3
Read (X)
Read (Y)
Z=X+Y
Write (Z)

Read (Z)
Write Z)
Read (Z)
11
Log Based Recovery
To keep track of the database transactions, the DBMS maintains special file called
log files that contain information about all updates.

The log file contains information like Transaction Identifier, Data Item Identifier,
Old Value and New Value etc.

The information contained in the log files are critical for the database recovery, two
separate copies are maintained.

In the past the log files were stored on magnetic tape. But nowadays, DBMS are
expected to recover from minor failures very quickly so that log files are stored
online on a fast Direct Access Storage Devices (DASD).

The log file are periodically archived and are stored in off-line storage. These log
files are called “Archive Log File” 12
Checkpoints

A checkpoints is a point of synchronization between the database and the


transaction log file. All buffers are force written to secondary storage at
the checkpoint.

Checkpoints are also called save points .

Checkpoints are schedules at pre determined interval and involves


operation like writing all log records in main memory to secondary storage.

If the transaction are executed serially, when a failure occurs, we check


the log file to find the transaction that started before the last checkpoint.

13
Backup Mechanism

The backup copy of the database can be used to recover the


database in the event that the database has been damaged or
destroyed.

A backup can be a complete copy of the entire database or an


incremental copy.

An incremental backup consists only of modifications made


since the last complete transaction.

14
Shadow Paging
The shadow paging scheme does not require the use of log file in a single user
environment, though a log is needed in multi users environment for concurrency
control.

Shadow paging considers the database to be made up of a number of fixed size


disk pages or disk blocks for recovery purposes.

The shadow paging technique maintain two directories during the transaction
current directory and shadow directory.

When the transaction starts, the two directories are the same. The shadow
directory is saved to the disk and the current directory is used by the transaction.

The shadow directory is never changed and is used to restore the database in case
of a failure. During the transaction execution, the shadow directory is never
modified. 15
Distributed Database
Distributed Database System (DDBMS) is a collection of data with
different parts under the control of separate DBMSs running on
different computers.

These computers are connected together and communicate with


one another through various communication techniques.

DDBMS can handle both local and global transactions.

The main difference between centralized and distributed database


system is that, in the former, the data exist in single location, but in
the later, the data exist in various locations.

16
Fig. Distributed Database between MFG./Sales/HQ

17
Advantage of Distributed Database
DDBMS allows each client/ site to store and maintain its own database,
causing immediate and efficient access to data.

It allows to access the data stored at remote sites. At the same time
users can retain the control to its own client/ site to access the local data.

If one site is not working due to any reason, the system will not be down
because other client/ site of the network can possible continue
functioning.

New client/ site can be added to the system any time.

If a user need to access the data from multiple clients/sites then the
desired query can be subdivided into sub queries. 18
Disadvantage of Distributed Database

Complex software is required for a Distributed Database


System (DDBMS)

Distributed Database System (DDBMS) technology are s


expensive.

DDBMS based on communication and server- client


technology.

At the same time multiple queries from one client to another


clients may be slow. 19
Data Fragmentation
Data Fragmentation is method in which different relations of a
relational database system can be sub-divided and distributed
among network sites/clients.

For example, suppose we have a relation S and it is fragmented, it


means it is divided sub into-sets S1,S2,…..Sn

There are three type fragmentation:

1. Horizontal Fragmentation
2. Vertical Fragmentation
3. Mixed Fragmentation

20
Horizontal Fragmentation
In Horizontal fragmentation records of a relation are divided into number of
subsets.

For example, consider a relation Packing as shown in figure below. This


relation can be divided into n different fragments, each of which records
belong to a particular Packing house no. If above relation has two packing
house no 1 and 2, then there may be two fragments.
Table :Packing
Date Cartoon No. Make/Type Weight Packing
(Kg.) House no
18/04/2104 123 BCF 95 1
18/04/2104 124 FDY 98 1
18/04/2104 125 POY 102 2
18/04/2104 126 POY 103 2
18/04/2104 127 BCF 110 2
18/04/2104 128 FDY 112 2 21
Horizontal Fragmentation
These two fragments are shown below in pack1 and pack2 and can now be stored in
different networks.
Pack1 = σ Packing_house_no = “1” (packing) Table :Pack1
Date Cartoon No. Make/Type Weight Packing
(Kg.) House no
18/04/2014 123 BCF 95 1
18/04/2014 124 FDY 98 1
Pack2 = σ Packing_house_no = “2” (packing) Table :Pack2
Date Cartoon No. Make/Type Weight Packing
(Kg.) House no
18/04/2014 125 POY 102 2
18/04/2014 126 POY 103 2
18/04/2014 127 BCF 110 2
18/04/2014 128 FDY 112 2
The reconstruction of relation packing can be obtained by taking the union of all
fragmentation i.e, Packing = Pack1 U Pack2 22
Vertical Fragmentation
Vertical fragmentation is very similar to horizontal fragmentation. In vertical
fragmentation attributes/ fields are divided into numbers of subsets.

Vertical fragmentation is accomplished by adding a special attribute called


Desp. A desp is a logical value for every record.
Table :Packing

Date Cartoon Make/Type Weight (Kg.) Packing Desp


No. House no
18/04/2014 123 BCF 95 1 T
18/04/2014 124 FDY 98 1 F
18/04/2014 125 POY 102 2 F
18/04/2014 126 POY 103 2 T
18/04/2014 127 BCF 110 2 T
18/04/2014 128 FDY 112 2 F
23
Vertical Fragmentation
These two vertical fragments are shown below in pack1 and pack2 and can
now be stored in different networks.
Pack1 = Π date,cartoon_no, make, weight (packing) Table :Pack1

Date Cartoon Make/Type Weight (Kg.)


No.
18/04/2014 123 BCF 95
18/04/2014 124 FDY 98
18/04/2014 125 POY 102
18/04/2014 126 POY 103
18/04/2014 127 BCF 110
18/04/2014 128 FDY 112

24
Vertical Fragmentation
Pack2 = Π date, cartoon, desp (packing)
Table :Pack2
Date Cartoon Desp
No.
18/04/2014 123 T
18/04/2014 124 F
18/04/2014 125 F
18/04/2014 126 T
18/04/2014 127 T
18/04/2014 128 F

To reconstruct the original packing relation from the fragments, we join


different individual fragments i.e. Packing = Pack1 X Pack2
25
Mixed Fragmentation
It is the combination of both horizontal and vertical fragmentation
For example, mixed fragmentation is:

Π date, packing_house_no (σ weight > 100 (packing))

Date Packing Table :Mix_Pack


House no
18/04/2014 2
18/04/2014 2
18/04/2014 2
18/04/2014 2

26
Schedules
We know that transactions are set of instructions and these instructions
perform operations on database. When multiple transactions are
running concurrently then there needs to be a sequence in which the
operations are performed because at a time only one operation can be
performed on the database. This sequence of operations is known
as Schedule. Banking transaction system is the best example of
scheduling. The various type of Schedules are:

1. Serial Schedules
2. Non Serial Schedules
3. Conflict Schedules
4. View Schedules

27
Serial Schedules
Two schedules A and B are called serial schedule if the operations of each
transaction are executed consecutively , without any interleaved operation from
the other transaction.

In a serial schedule, entire transaction re performed in serial order T1 and then


T2 or T2 and then T1.

Schedule A Schedule B
T1: T2: T1: T2:
read_item(X); read_item(X);
X:= X - N; X:= X + M;
write_item(X); write_item(X);
read_item(Y); read_item(X);
Y:=Y + N; X:= X - N;
write_item(Y); write_item(X);
read_item(X); read_item(Y);
X:= X + M; Y:=Y + N;
write_item(X); write_item(Y); 28
Non-Serial Schedules
Two schedules C and D are called non-serial schedule if the operations of each
transaction are executed non- consecutively with interleaved operation from the
other transaction.

Schedule C Schedule D
T1: T2: T1: T2:
read_item(X); read_item(X);
X:= X - N; X:= X - N;
read_item(X); write_item(X);
X:= X + M; read_item(X);
write_item(X); X:= X + M;
read_item(Y); write_item(X);
write_item(X); read_item(Y);
Y:=Y + N; Y:=Y + N;
write_item(Y); write_item(Y);

29
Conflict Schedules
Consider a schedule S in which there are two consecutive instructions I i and Ij
of the transactions Ti and Tj respectively .

Now we identified three rules that show that if two operations in a schedule
satisfy all these three condition then operations are said to conflict.

1. They belong to different transactions.


2. They access the same time.
3. At least one of the operations is a write-item.

If graph contains a cycle, hence it is not conflict serializable


30
Conflict Serializable Schedule
A schedule is called conflict serializability if after swapping of non-conflicting operations,
it can transform into a serial schedule. The schedule will be a conflict serializable if it
is conflict equivalent to a serial schedule.

Conflicting Operations : The two operations become conflicting if all conditions satisfy:
• Both belong to separate transactions.
• They have the same data item.
• They contain at least one write operation

Example: Swapping is possible only if S1 and S2 are logically equal.

31
Here, S1 = S2. That means it is non-conflict.
Conflict Serializable Schedule

Here, S1 ≠ S2. That means it is conflict.

32
Conflict Equivalent
In the conflict equivalent, one can be transformed to another by swapping non-conflicting
operations. In the given example, S2 is conflict equivalent to S1 (S1 can be converted to S2 by
swapping non-conflicting operations).

Two schedules are said to be conflict equivalent if and only if:


1.They contain the same set of the transaction.
2.If each pair of conflict operations are ordered in the same way

Example:

Schedule S2 is a serial schedule because, in this, all operations of T1 are performed before starting
any operation of T2. Schedule S1 can be transformed into a serial schedule by swapping non-
conflicting operations of S1. 33
Conflict Equivalent
After swapping of non-conflict operations, the schedule S1 becomes:

T1 T2

Read(A)
Write(A)
Read(B)
Write(B)
Read(A)
Write(A)
Read(B)
Write(B)

Since, S1 is conflict serializable.

34
Testing of Serializability

For testing of seriallzability the simple and efficient method is to


construct a graph, called precedence graph G.
G = (V , E)
The set of vertices consists of all the transaction participating in
the schedule. The set of edges consists of all edges T i > Tj for
which one of three condition holds:

1. Ti executes write (Q) before Tj executes read (Q)


2. Ti executes read (Q) before Tj executes write (Q)
3. Ti executes write (Q) before Tj executes write (Q)
35
Testing of Serializability
T1 T2
Example: Read (A)
A : A - 50
Read (A)
Temp : A * 0.1
A : A- temp
Write (A)
Read (B)
Write (A)
Read (B)
B : B + 50
Write (B)
B : B + temp
Write (B)

Precedence graph for above schedule and graph contains a cycle,


hence it is not conflict serializable.
36
View Schedules
View Serializability is a process to find out that a given schedule is view serializable
or not. To check whether a given schedule is view serializable, we need to check
whether the given schedule is View Equivalent to its serial schedule.

If a schedule is view equivalent to its serial schedule then the given schedule is
said to be View Serializable. Lets take an example.

37
View Schedules
Lets check the three conditions of view serializability:
Initial Read
In schedule S1, transaction T1 first reads the data item X. In S2 also transaction T1 first reads the data item X.
Lets check for Y. In schedule S1, transaction T1 first reads the data item Y. In S2 also the first read operation on
Y is performed by T1.
We checked for both data items X & Y and the initial read condition is satisfied in S1 & S2.

Final Write
In schedule S1, the final write operation on X is done by transaction T2. In S2 also transaction T2 performs the
final write on X.
Lets check for Y. In schedule S1, the final write operation on Y is done by transaction T2. In schedule S2, final
write on Y is done by T2.
We checked for both data items X & Y and the final write condition is satisfied in S1 & S2.

Update Read
In S1, transaction T2 reads the value of X, written by T1. In S2, the same transaction T2 reads the X after it is
written by T1.
In S1, transaction T2 reads the value of Y, written by T1. In S2, the same transaction T2 reads the value of Y
after it is updated by T1.
The update read condition is also satisfied for both the schedules.

Result: Since all the three conditions that checks whether the two schedules are view equivalent are satisfied in
this example, which means S1 and S2 are view equivalent. Also, as we know that the schedule S2 is the serial
schedule of S1, thus we can say that the schedule S1 is view serializable schedule. 38

You might also like