
UNIT-4

TRANSACTIONS AND CONCURRENCY MANAGEMENT


Meaning of Transaction
• A transaction is a collection of actions that transforms the database from one consistent state into another consistent state.
Let’s take an example of a simple transaction. Suppose a bank employee transfers Rs 500 from A's account to B's account. This simple, small transaction involves several low-level tasks.
A’s Account
Open_Account(A)
Old_Balance = A.balance
New_Balance = Old_Balance - 500
A.balance = New_Balance
Close_Account(A)
B’s Account
Open_Account(B)
Old_Balance = B.balance
New_Balance = Old_Balance + 500
B.balance = New_Balance
Close_Account(B)
Thus, a transaction involves 3 stages:
1. Begin Transaction
2. Execution of Transaction
3. End Transaction
States of Transactions
In a database, a transaction can be in one of the following states -
Active state
-The active state is the first state of every transaction. In this state, the transaction is being executed.
-For example: inserting, deleting or updating a record is done here, but the changes are not yet saved to the database.
Partially committed
-In the partially committed state, a transaction executes its final operation, but the data is still not saved to the database.
-For example, in a transaction that calculates a student's total marks, the final step that displays the total is executed in this state.
Committed
-A transaction is said to be in a committed state if it executes
all its operations successfully. In this state, all the effects are
now permanently saved on the database system.
Failed state
-If any of the checks made by the database recovery system fails, then the transaction is said to be in the failed state.
-For example, in a total mark calculation, if the database is not able to run the query that fetches the marks, then the transaction fails to execute.
Aborted
-If any of the checks fail and the transaction has reached the failed state, the database recovery system makes sure that the database is returned to its previous consistent state by aborting, or rolling back, the transaction.
-If the transaction fails midway, all the operations it has executed so far are rolled back, so the database returns to the consistent state it was in before the transaction started.
-After aborting the transaction, the database recovery module selects one of two operations:
Re-start the transaction
Kill the transaction
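The states above can be sketched as a small state machine. This is only an illustration: the transition table follows the usual textbook diagram (active to partially committed or failed, partially committed to committed or failed, failed to aborted), and the names `State`, `TRANSITIONS` and `move` are made up for this sketch.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    ABORTED = "aborted"


# Assumed transition table (standard textbook diagram, not taken from the slides).
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.COMMITTED: set(),  # terminal state
    State.ABORTED: set(),    # terminal state; a restart begins a new transaction
}


def move(current, target):
    """Return target if the transition is legal, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


# A transaction that runs to completion.
state = State.ACTIVE
state = move(state, State.PARTIALLY_COMMITTED)
state = move(state, State.COMMITTED)
print(state.value)
```

A committed transaction has no outgoing transitions, so `move(State.COMMITTED, State.ACTIVE)` raises `ValueError`.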
Transaction Issues:
There are two main transaction issues
1. Concurrent execution of multiple
transactions
2. Recovery after hardware failures and
system crashes
To preserve the integrity of data, the DBMS
has to ensure that the ACID Properties are
fulfilled for any transaction.
Transaction property

A transaction has four properties, known together as the ACID properties. These are used to maintain consistency in a database, before and after the transaction.

1. Atomicity
2. Consistency
3. Isolation
4. Durability
Atomicity
•It states that either all operations of the transaction take place, or none do; if any operation cannot complete, the transaction is aborted.
•There is no midway, i.e., the transaction cannot occur partially. Each transaction is treated as one unit and either runs to completion or is not executed at all.
Atomicity involves the following two operations:
Abort: If a transaction aborts, then none of the changes made are visible.
Commit: If a transaction commits, then all the changes made are visible.
Example: Let's assume a transaction T consisting of two parts, T1 and T2. A holds Rs 600 and B holds Rs 300. Transfer Rs 100 from account A to account B.

T1              T2
Read(A)         Read(B)
A := A - 100    B := B + 100
Write(A)        Write(B)

• After completion of the transaction, A holds Rs 500 and B holds Rs 400.
• If the transaction T fails after the completion of T1 but before the completion of T2, then the amount will be deducted from A but not added to B. This leaves the database in an inconsistent state. In order to ensure correctness of the database state, the transaction must be executed in its entirety.
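The failure case above can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch, not a description of how any particular bank system works: the table name `account`, the helper names and the `fail_midway` switch are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 600), ("B", 300)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Debit src and credit dst as one unit; undo everything if a step fails."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if fail_midway:
                # Simulated crash after T1 (the debit) and before T2 (the credit).
                raise RuntimeError("simulated failure")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False


def balances(conn):
    """Return the current balances as a dict, e.g. {'A': 600, 'B': 300}."""
    return dict(conn.execute("SELECT name, balance FROM account"))


failed_ok = transfer(conn, "A", "B", 100, fail_midway=True)
after_failure = balances(conn)   # the debit was rolled back
success_ok = transfer(conn, "A", "B", 100)
after_success = balances(conn)
print(after_failure, after_success)
```

After the simulated failure the balances are still 600 and 300, because the debit is rolled back together with the rest of the transaction; only the second call leaves A with 500 and B with 400.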
Consistency
The integrity constraints are maintained so that the database is consistent
before and after the transaction.
The execution of a transaction will leave a database in either its prior stable
state or a new stable state.
The consistency property states that every transaction sees a consistent database instance.
The transaction is used to transform the database from one consistent state to another consistent state.
For example: The total amount must be the same before and after the transaction.
Total before T occurs = 600+300=900
Total after T occurs= 500+400=900
Therefore, the database is consistent. In the case when T1 is completed but T2 fails, then inconsistency will occur.
Isolation
It states that the data used during the execution of a transaction cannot be used by a second transaction until the first one is completed.
In isolation, if the transaction T1 is being executed and is using the data item X, then that data item can't be accessed by any other transaction T2 until the transaction T1 ends.
The concurrency control subsystem of the DBMS enforces the isolation property.
Durability
The durability property indicates the permanence of the database's consistent state. It states that the changes made by a committed transaction are permanent.
They cannot be lost by the erroneous operation of a faulty transaction or by a system failure. When a transaction is completed, the database reaches a state known as the consistent state. That consistent state cannot be lost, even in the event of a system failure.
The recovery subsystem of the DBMS is responsible for the durability property.
Concurrent Transactions
Concurrent transactions are two or more transactions that run together and may access the same database rows during overlapping time periods. Such simultaneous accesses, called collisions, may result in errors or inconsistencies if not handled properly. The more overlapping that is possible, the greater the concurrency.
To overcome this the following two methods are adopted:
1. Schedule
2. Serial Schedule
Schedule
• It is a chronological execution sequence of transactions.
Serial Schedule
• The serial schedule is a type of schedule where one transaction is executed completely before starting another transaction. In the serial schedule, when the first transaction completes its cycle, then the next transaction is executed.
Equivalence of Schedules-

In DBMS, schedules may have the following three different kinds of equivalence relations among them-
1. Result Equivalent Schedules-

• If any two schedules generate the same result after their execution,
then they are called as result equivalent schedules.
• This equivalence relation is considered of least significance.
• This is because some schedules might produce same results for some
set of values and different results for some other set of values.
2. Conflict Equivalent Schedules-

• If any two schedules satisfy the following two conditions, then they are called conflict equivalent schedules-
-The set of transactions present in both the schedules is the same.
-The order of pairs of conflicting operations in both the schedules is the same.
3. View Equivalent Schedules-

• Two schedules are view equivalent if the transactions in both schedules perform similar actions in a similar manner.
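The two conditions for conflict equivalence can be checked mechanically. The sketch below assumes the usual definition of a conflict (two operations from different transactions on the same data item, at least one of them a write) and represents a schedule as a list of hypothetical `(transaction, operation, item)` tuples in which each operation appears only once.

```python
from itertools import combinations


def conflicts(a, b):
    """Operations conflict if they belong to different transactions,
    touch the same item, and at least one of them is a write."""
    (t1, op1, x1), (t2, op2, x2) = a, b
    return t1 != t2 and x1 == x2 and "W" in (op1, op2)


def conflict_pairs(schedule):
    """Set of (earlier, later) pairs of conflicting operations."""
    return {(a, b) for a, b in combinations(schedule, 2) if conflicts(a, b)}


def conflict_equivalent(s1, s2):
    """Same operations, and every conflicting pair occurs in the same order."""
    return sorted(s1) == sorted(s2) and conflict_pairs(s1) == conflict_pairs(s2)


serial = [("T1", "R", "A"), ("T1", "W", "A"), ("T1", "R", "B"),
          ("T2", "R", "A"), ("T2", "W", "A")]
# Only the non-conflicting operations R1(B) and R2(A) have been swapped.
swapped = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"),
           ("T1", "R", "B"), ("T2", "W", "A")]
# T2 reads A before T1 writes it, which reverses a conflicting pair.
reordered = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"),
             ("T1", "R", "B"), ("T2", "W", "A")]

print(conflict_equivalent(serial, swapped))
print(conflict_equivalent(serial, reordered))
```

Here `swapped` is conflict equivalent to the serial schedule, while `reordered` is not.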
Serialisable Schedules
• When multiple transactions are being executed by the operating system in a multiprogramming environment, there is a possibility that the instructions of one transaction are interleaved with those of another transaction.
• Schedule − A chronological execution sequence of a transaction is called a schedule. A schedule can have many
transactions in it, each comprising of a number of instructions/tasks.
• Serial Schedule − It is a schedule in which transactions are aligned in such a way that one transaction is executed first.
When the first transaction completes its cycle, then the next transaction is executed. Transactions are ordered one after
the other. This type of schedule is called a serial schedule, as transactions are executed in a serial manner.
• In a multi-transaction environment, serial schedules are considered as a benchmark. The execution sequence of an
instruction in a transaction cannot be changed, but two transactions can have their instructions executed in a random
fashion. This execution does no harm if two transactions are mutually independent and working on different segments
of data; but in case these two transactions are working on the same data, then the results may vary. This ever-varying
result may bring the database to an inconsistent state.
• To resolve this problem, we allow parallel execution of a transaction schedule, if its transactions are either serializable or
have some equivalence relation among them.
• Example:
• i. Suppose, two railway reservation agents perform two transactions T1 and T2 at approximately same time.
• ii. If no interleaving of operations is permitted, there are only two possible outcomes:
• Execute all the operations of transaction T1 (in sequence) followed by all the operations of transaction T2 (in sequence).
• Execute all the operations of transaction T2 (in sequence) followed by all the operations of transaction T1 (in sequence).
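To see why the two serial orders are used as the benchmark, the reservation example can be simulated. This is a toy sketch: the seat counts and the `run` helper are invented for illustration, and each transaction simply reads the shared counter, subtracts its booking, and writes the result back.

```python
def run(schedule, seats_available):
    """Execute (transaction, action) steps against one shared seat counter."""
    booking = {"T1": 2, "T2": 3}  # seats each agent reserves (hypothetical values)
    local = {}                    # each transaction's private copy of the counter
    for txn, action in schedule:
        if action == "read":
            local[txn] = seats_available
        elif action == "write":
            seats_available = local[txn] - booking[txn]
    return seats_available


serial_t1_t2 = [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]
serial_t2_t1 = [("T2", "read"), ("T2", "write"), ("T1", "read"), ("T1", "write")]
interleaved = [("T1", "read"), ("T2", "read"), ("T1", "write"), ("T2", "write")]

print(run(serial_t1_t2, 10))  # 5 seats left
print(run(serial_t2_t1, 10))  # 5 seats left
print(run(interleaved, 10))   # 7 seats left: T1's booking has been overwritten
```

Both serial orders leave 5 seats, but the interleaved schedule leaves 7 because T2 writes back a value computed from a stale read. This is the lost update problem.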
Concurrency Control in Database Management System

Concurrency Control in Database Management System is a procedure of managing simultaneous operations without conflicting with each other. It ensures that Database transactions are performed concurrently and accurately to produce correct results without violating data integrity of the respective Database.
Need for Concurrency control
Simultaneous execution of transactions over a shared database can create several data integrity and consistency problems like
• Lost updates
• Uncommitted data
• Inconsistent retrieval
Locking Protocol

What is Lock?
A lock is a variable associated with a data item that describes the status of
the item with respect to possible operations that can be applied to it.
Generally, there is one lock for each data item in the database.
Locks are used as a means of synchronizing the access by
concurrent transactions to the database item.
Types of Locks
• Binary Lock: This locking mechanism has two states or values: locked and
unlocked.
• Multi-mode locks: In this locking type each data item can be locked in one of the following modes:
Shared lock: It is also known as a Read-only lock. In a shared lock, the data item can only be read by the transaction.
Exclusive lock: In the exclusive lock, the data item can be both read and written by the transaction.
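The difference between the two modes is easiest to see as a compatibility table. The sketch below assumes the standard rule that only shared locks are compatible with each other; `COMPATIBLE` and `can_grant` are names chosen for this illustration.

```python
# (mode already held, mode requested) -> can the request be granted?
COMPATIBLE = {
    ("S", "S"): True,   # many readers may share an item
    ("S", "X"): False,  # a writer must wait for readers
    ("X", "S"): False,  # readers must wait for the writer
    ("X", "X"): False,  # only one writer at a time
}


def can_grant(requested, held_modes):
    """A request is granted only if it is compatible with every lock held by others."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)


print(can_grant("S", ["S", "S"]))  # another reader is fine
print(can_grant("X", ["S"]))       # a writer is blocked by a reader
print(can_grant("X", []))          # an unlocked item can be locked exclusively
```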
Two Phase locking protocol
To guarantee serializability, we must follow an additional protocol concerning the positioning of locking and unlocking operations in every transaction. This is where the concept of Two Phase Locking (2-PL) comes into the picture; 2-PL ensures serializability.
• Two Phase Locking –
A transaction is said to follow the Two Phase Locking protocol if its locking and unlocking are done in two phases.
Growing Phase: New locks on data items may be acquired but none can be released.
Shrinking Phase: Existing locks may be released but no new locks can be acquired.
Seen from the transaction's side, execution under the two-phase locking protocol can be divided into three parts.
-In the first part, when the execution of the transaction starts, it seeks permission for the locks it requires.
-In the second part, the transaction acquires all the locks. The third part starts as soon as the transaction releases its first lock.
-In the third part, the transaction cannot demand any new locks. It only releases the acquired locks.
If lock conversion is allowed, then the following can happen:
Upgrading of a lock (from S(a) to X(a)) is allowed in the growing phase.
Downgrading of a lock (from X(a) to S(a)) must be done in the shrinking phase.

Example:
The following shows how locking and unlocking work with 2-PL (the step numbers refer to a schedule table that is not reproduced here).

Transaction T1:
• Growing phase: from step 1-3
• Shrinking phase: from step 5-7
• Lock point: at 3
Transaction T2:
• Growing phase: from step 2-6
• Shrinking phase: from step 8-9
• Lock point: at 6
Lock Point:
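The “Lock Point” is the point in the schedule at which a transaction has acquired its final lock, that is, the end of its growing phase. From that point on the transaction holds every lock it needs, which binds together all of the data used in the transaction and prevents the transaction from being split into parts.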
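Whether a transaction obeys the two phases, and where its lock point lies, can be checked from its sequence of lock and unlock steps. The sketch below is an illustration only; `check_two_phase` and the `("lock" | "unlock", item)` encoding are assumptions, and lock modes are ignored.

```python
def check_two_phase(ops):
    """ops: list of ("lock" | "unlock", item) in execution order.

    Returns (is_two_phase, lock_point), where lock_point is the index of the
    last lock acquired before the first unlock (None if no lock was taken).
    """
    shrinking = False
    lock_point = None
    for i, (action, _item) in enumerate(ops):
        if action == "unlock":
            shrinking = True              # the shrinking phase has begun
        elif action == "lock":
            if shrinking:
                return False, lock_point  # a lock after an unlock violates 2-PL
            lock_point = i                # still growing; remember the latest lock
    return True, lock_point


good = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
bad = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]

print(check_two_phase(good))  # (True, 1): lock point at the second step
print(check_two_phase(bad))   # (False, 0): B is locked after A was released
```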
Deadlock and Its Prevention
• In a database, a deadlock is an unwanted situation in which two or more transactions are waiting indefinitely for one another to give up locks. Deadlock is said to be one of the most feared complications in DBMS as it brings the whole system to a halt.
• Example – let us understand the concept of deadlock with an example:
Suppose Transaction T1 holds a lock on some rows in the Students table and needs to update some rows in the Grades table. Simultaneously, Transaction T2 holds locks on those very rows (which T1 needs to update) in the Grades table but needs to update the rows in the Students table held by Transaction T1.
• Now, the main problem arises. Transaction T1 will wait for transaction T2 to give up its lock, and similarly, transaction T2 will wait for transaction T1 to give up its lock. As a consequence, all activity comes to a halt and remains at a standstill forever unless the DBMS detects the deadlock and aborts one of the transactions.
Deadlock Avoidance –
- It is always better to avoid a deadlock than to abort or restart transactions once the database is stuck in one. The deadlock avoidance method is suitable for smaller databases, whereas the deadlock prevention method is suitable for larger databases.
- One method of avoiding deadlock is using application-consistent logic. In the above example, transactions that access Students and Grades should always access the tables in the same order. In this way, in the scenario described above, one transaction simply waits for the other to release its lock on the first table before it begins. When that lock is released, the waiting transaction can proceed freely.
- Another method for avoiding deadlock is to apply both a row-level locking mechanism and the READ COMMITTED isolation level. However, it does not guarantee to remove deadlocks completely.
Deadlock Detection –
When a transaction waits indefinitely to obtain a lock, the database management system should detect whether the transaction is involved in a deadlock or not.

• The wait-for graph is one of the methods for detecting a deadlock situation. This method is suitable for smaller databases. In this method a graph is drawn based on the transactions and their locks on resources. If the graph has a closed loop or a cycle, then there is a deadlock.
• For the scenario mentioned above, the wait-for graph has an edge from T1 to T2 (T1 waits for T2) and an edge from T2 to T1 (T2 waits for T1), which together form a cycle.
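Cycle detection on a wait-for graph can be sketched with a depth-first search. The representation below (a dict mapping each transaction to the set of transactions it is waiting on) and the function name `has_deadlock` are assumptions for illustration; a real DBMS keeps this information inside its lock manager.

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True            # reached a node already on the current path
        visiting.add(node)
        for nxt in wait_for.get(node, ()):
            if visit(nxt):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(visit(t) for t in list(wait_for))


# T1 waits for T2 (Grades rows) and T2 waits for T1 (Students rows).
deadlocked = {"T1": {"T2"}, "T2": {"T1"}}
# T1 waits for T2, but T2 is not waiting for anyone.
waiting_only = {"T1": {"T2"}, "T2": set()}

print(has_deadlock(deadlocked))
print(has_deadlock(waiting_only))
```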
Deadlock prevention –
For a large database, the deadlock prevention method is suitable. A deadlock can be prevented if the resources are allocated in such a way that a deadlock never occurs. The DBMS analyzes the operations to see whether they can create a deadlock situation; if they can, that transaction is never allowed to be executed.
Deadlock prevention mechanism proposes two schemes:
Wait-Die Scheme –
• In this scheme, if a transaction requests a resource that is locked by another transaction, then the DBMS checks the timestamps of both transactions and allows the older transaction to wait until the resource is available for execution.
• Suppose there are two transactions T1 and T2, and let the timestamp of any transaction T be TS(T). If one transaction requests a resource that is held by the other, the DBMS checks whether TS(T1) < TS(T2), i.e., whether T1 is the older transaction:
-If T1 is the older transaction and T2 holds the resource, then T1 is allowed to wait until the resource is available for execution. That means if a younger transaction has locked some resource and an older transaction is waiting for it, then the older transaction is allowed to wait till it is available.
-If T1 is the older transaction and holds the resource, and T2 is waiting for it, then T2 is killed and restarted later with a random delay but with the same timestamp. That means if the older transaction holds some resource and a younger transaction waits for it, then the younger transaction is killed and restarted after a very small delay with the same timestamp.
• This scheme allows the older transaction to wait but kills the younger one.
Wound-Wait Scheme –
• In this scheme, if an older transaction requests a resource held by a younger transaction, then the older transaction forces the younger transaction to abort and release the resource. The younger transaction is restarted after a small delay but with the same timestamp.
• If the younger transaction requests a resource which is held by the older one, then the younger transaction is asked to wait till the older one releases it.
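The two schemes differ only in what happens for each combination of requester and holder, which the following sketch makes explicit. It assumes that a smaller timestamp means an older transaction; the function names and return strings are illustrative.

```python
def wait_die(ts_requester, ts_holder):
    """Wait-Die: an older requester waits, a younger requester is rolled back."""
    return "wait" if ts_requester < ts_holder else "abort requester"


def wound_wait(ts_requester, ts_holder):
    """Wound-Wait: an older requester aborts the holder, a younger requester waits."""
    return "abort holder" if ts_requester < ts_holder else "wait"


# T1 has timestamp 1 (older), T2 has timestamp 2 (younger).
print(wait_die(1, 2))    # older T1 requests from younger T2
print(wait_die(2, 1))    # younger T2 requests from older T1
print(wound_wait(1, 2))  # older T1 requests from younger T2
print(wound_wait(2, 1))  # younger T2 requests from older T1
```

In both schemes it is always the younger transaction that is aborted, and it keeps its original timestamp when restarted, so it eventually becomes the oldest and cannot be aborted forever.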
Concurrency Control Locking Strategies
Pessimistic Locking
• Protects the system from concurrency conflicts so that they do not happen.
• Best solution when there are a lot of updates and the possibility of concurrency is high.
• Locks records, so that a record selected for update cannot be updated in the meantime by another user.
• More complex to design and manage in the programming part (risk of deadlocks).
• Suits well when we have a table with a relatively small number of records but a lot of update operations. Frequent transaction rollbacks would be wasted effort.
Optimistic Locking
• Allows a concurrency conflict to happen, and if it happens, we react to it in some manner.
• Best solution when the possibility of concurrency is rather low.
• Doesn’t lock records. To ensure that a record wasn’t changed in the time between the select and submit operations, it checks the row version.
• Simple to design and program.
• Suits best when the DB has a lot of records and not too many users.
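The row-version check used by optimistic locking can be sketched with an in-memory record. Everything here is hypothetical (the `row` dict, `StaleVersionError`, and the helper functions); in a real database the same check is usually an `UPDATE ... WHERE version = ?` statement.

```python
class StaleVersionError(Exception):
    """Raised when the row changed between the select and the submit."""


# One "row" with a version column.
row = {"id": 1, "balance": 500, "version": 1}


def read_row():
    """Return a snapshot of the row, including the version that was seen."""
    return dict(row)


def update_balance(new_balance, expected_version):
    """Apply the update only if nobody else has changed the row in between."""
    if row["version"] != expected_version:
        raise StaleVersionError("row was modified by another user")
    row["balance"] = new_balance
    row["version"] += 1


# Two users read the same version of the row.
user1 = read_row()
user2 = read_row()

update_balance(user1["balance"] - 100, user1["version"])  # succeeds, version becomes 2

try:
    update_balance(user2["balance"] + 50, user2["version"])  # stale version 1
    conflict_detected = False
except StaleVersionError:
    conflict_detected = True  # user2 must re-read the row and retry

print(row, conflict_detected)
```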
Lock Problems
• Deadlocks (already explained above)
• Livelock: A livelock is a situation where a request for an exclusive lock is denied repeatedly because many overlapping shared locks keep interfering with each other. The processes keep changing their status, which prevents them from completing the task.
The simplest example of a livelock is two people who meet face-to-face in a corridor, and both of them move aside to let the other pass. They end up moving from side to side without making any progress, as they move the same way at the same time. Here, they never get past each other.
Basic Time-stamping
• Time-stamping is a concurrency control mechanism that eliminates deadlock.
• A timestamp is a unique identifier created by the DBMS to identify a transaction. Timestamps are usually assigned in the order in which transactions are submitted to the system.
• This allows an age to be assigned to each transaction.
• Data items have both a read-timestamp and a write-timestamp.
• These timestamps are updated each time the data item is read or updated respectively.
• Thus the time-stamping process allows the transactions to be serialized, and a chronological schedule of transactions can then be created.
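How the read-timestamp and write-timestamp of a data item are used can be sketched as follows. The acceptance rules are the standard basic timestamp-ordering rules from textbooks, which the points above do not spell out, so treat them as an assumption; `Item`, `read` and `write` are illustrative names, and a rejected operation would normally cause the transaction to be rolled back and restarted.

```python
class Item:
    """A data item carrying the largest timestamps that have read and written it."""

    def __init__(self):
        self.read_ts = 0
        self.write_ts = 0


def read(item, ts):
    """Reject the read if a younger transaction has already written the item."""
    if ts < item.write_ts:
        return False
    item.read_ts = max(item.read_ts, ts)
    return True


def write(item, ts):
    """Reject the write if a younger transaction has already read or written the item."""
    if ts < item.read_ts or ts < item.write_ts:
        return False
    item.write_ts = ts
    return True


x = Item()
results = [
    read(x, 2),   # accepted, read_ts becomes 2
    write(x, 1),  # rejected: transaction 1 is older than the reader (2)
    write(x, 3),  # accepted, write_ts becomes 3
    read(x, 2),   # rejected: transaction 2 is older than the writer (3)
]
print(results)
```

Because a transaction is never made to wait (it is either accepted or rolled back), no cycle of waiting transactions can form.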
Database Recovery
• Database recovery is the process of restoring the database to a
correct (consistent) state in the event of a failure. In other words, it is
the process of restoring the database to the most recent consistent
state that existed shortly before the time of system failure.
• Reasons for failures:
-user may decide to abort the transaction.
-there might be deadlock in the system.
-there might be system failure.
Kinds of Failures
• Software failures
-application program failure
-failure due to viruses
-DBMS software failure
-Operating system failure
• Hardware failures
• External failures
Failure Controlling Methods

• Having a regulated power supply.
• Having a better secondary storage system.
• Taking periodic backups of the DB and keeping track of transactions after each recorded state.
• Properly testing the transaction programs prior to use.
• Setting important integrity checks in the DBs as well as user interfaces.
Database Errors and Types
An error is said to have occurred if the execution of a command to manipulate the DB cannot be successfully completed, either due to inconsistent data or due to the state of the program.
Example:
There may be a command in a program to store data in the DB. On the execution of the command, it is found that there is no space in the DB to accommodate that additional data. Then it can be said that an error has occurred. This is because of the physical state of the DB storage.
Types
User errors
Consistency error
System error
Backup and Recovery Techniques
1. Periodic data and applications backups.
2. Proper backup identification.
3. Convenient and safe backup storage.
4. Physical protection of both hardware and software.
5. Personal access control to the software of a DB installation.
6. Insurance coverage for the data in the database.
Recovery can be Backward Recovery (UNDO) or Forward Recovery (REDO).
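The idea behind UNDO and REDO can be sketched with a simple log of old and new values. This is a deliberately simplified illustration, not the recovery algorithm of any particular DBMS: the log format, the `recover` function and the example values are assumptions, and it presumes that committed and uncommitted transactions did not write the same item.

```python
def recover(db, log, committed):
    """Bring db to a consistent state after a crash.

    log: list of (txn, item, old_value, new_value) in execution order.
    committed: set of transactions whose commit was recorded before the crash.
    """
    # Forward recovery (REDO): reapply the changes of committed transactions.
    for txn, item, _old, new in log:
        if txn in committed:
            db[item] = new
    # Backward recovery (UNDO): reverse uncommitted changes, newest first.
    for txn, item, old, _new in reversed(log):
        if txn not in committed:
            db[item] = old
    return db


# State found on disk after the crash: T1's change to A was not yet written,
# while T2's uncommitted change to B was.
on_disk = {"A": 600, "B": 400}
log = [("T1", "A", 600, 500), ("T2", "B", 300, 400)]

recovered = recover(dict(on_disk), log, committed={"T1"})
print(recovered)
```

After recovery, A reflects the committed transaction T1 (500) and B is back to its value before the uncommitted transaction T2 (300).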
Database Security
-Database security refers to the collective measures used to protect and secure a database or
database management software from illegitimate use and malicious threats and attacks.
-Database security covers and enforces security on all aspects and components of databases. This
includes:
• Data stored in database
• Database server
• Database management system (DBMS)
• Other database workflow applications
-Database security is generally planned, implemented and maintained by a database administrator and/or other information security professionals.
-Some of the ways database security is analyzed and implemented include:
• Restricting unauthorized access and use by implementing strong and multifactor access and
data management controls
• Load/stress testing and capacity testing of a database to ensure it does not crash in a distributed
denial of service (DDoS) attack or user overload
• Physical security of the database server and backup equipment from theft and natural disasters
• Reviewing existing system for any known or unknown vulnerabilities and defining and
implementing a road map/plan to mitigate them.
Database Integrity

• Data integrity is the overall accuracy, completeness, and consistency of data. Data integrity also refers to the safety of data in regards to regulatory compliance and security. It is maintained by a collection of processes, rules, and standards implemented during the design phase.
• When the integrity of data is secure, the information stored in a
database will remain complete, accurate, and reliable no matter how
long it’s stored or how often it’s accessed. Data integrity also ensures
that your data is safe from any outside forces.
Types of Data Integrity

• Physical integrity: Physical integrity is the protection of data’s wholeness and accuracy as it’s stored and retrieved.
When natural disasters strike, power goes out, or hackers disrupt database functions, physical integrity is
compromised. Human error, storage erosion, and a host of other issues can also make it impossible for data processing
managers, system programmers, applications programmers, and internal auditors to obtain accurate data.
• Logical integrity: Logical integrity keeps data unchanged as it’s used in different ways in a relational database. Logical
integrity protects data from human error and hackers as well, but in a much different way than physical integrity does.
There are four types of logical integrity.
• Entity integrity: Entity integrity relies on the creation of primary keys, or unique values that identify pieces of data, to ensure that data isn’t listed more than once and that no primary key field in a table is null. It’s a feature of relational systems which store data in tables that can be linked and used in a variety of ways.
• Referential integrity: Referential integrity refers to the series of processes that make sure data is stored and used
uniformly. Rules embedded into the database’s structure about how foreign keys are used ensure that only
appropriate changes, additions, or deletions of data occur. Rules may include constraints that eliminate the entry of
duplicate data, guarantee that data is accurate, and/or disallow the entry of data that doesn’t apply.
• Domain integrity: Domain integrity is the collection of processes that ensure the accuracy of each piece of data in a
domain. In this context, a domain is a set of acceptable values that a column is allowed to contain. It can include
constraints and other measures that limit the format, type, and amount of data entered.
• User-defined integrity: User-defined integrity involves the rules and constraints created by the user to fit their
particular needs. Sometimes entity, referential, and domain integrity aren’t enough to safeguard data. Often, specific
business rules must be taken into account and incorporated into data integrity measures.
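Several of the logical integrity types above map directly onto SQL constraints. The sketch below uses Python's built-in `sqlite3` module with hypothetical `student` and `grade` tables: the primary key illustrates entity integrity, the foreign key referential integrity, and the `CHECK` clause domain integrity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE grade (
        student_id INTEGER NOT NULL REFERENCES student(id),
        marks INTEGER NOT NULL CHECK (marks BETWEEN 0 AND 100)
    )
    """
)
conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.commit()


def rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        with conn:
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


duplicate_key = rejected("INSERT INTO student VALUES (1, 'Ravi')")  # entity integrity
missing_parent = rejected("INSERT INTO grade VALUES (99, 80)")      # referential integrity
out_of_domain = rejected("INSERT INTO grade VALUES (1, 150)")       # domain integrity
valid_row = rejected("INSERT INTO grade VALUES (1, 80)")            # satisfies all rules

print(duplicate_key, missing_parent, out_of_domain, valid_row)
```

Only the last insert is accepted; the other three are refused by the database itself rather than by application code.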
Authentication and Types
• Data authentication is the process of confirming the origin and
integrity of data. The term is typically related to communication,
messaging and integration. Data authentication has two
elements: authenticating that you're getting data from the correct
entity and validating the integrity of that data.
• Common Authentication Types
-Password-based authentication. Passwords are the most common method of authentication
-Access Control
-Discretionary access control (DAC)
Authorization and its Forms and Control Operations
• It is a set of rules that can be used to determine which user has what type of access to which
portion of the DB.
• Forms
-READ
-INSERT
-UPDATE
-DELETE
• Control operations
-ADD
-DROP
-ALTER
-PROPAGATE ACCESS CONTROL
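Such a set of rules can be pictured as a lookup from a user and a portion of the DB to the forms of access granted. The sketch below is purely illustrative: the users, the table name and the `GRANTS` structure are hypothetical, and a real DBMS stores and enforces these rules itself.

```python
# (user, table) -> set of access forms granted; all names are hypothetical.
GRANTS = {
    ("clerk", "account"): {"READ"},
    ("manager", "account"): {"READ", "INSERT", "UPDATE", "DELETE"},
}


def is_authorized(user, table, form):
    """Check whether the user holds the requested form of access on the table."""
    return form in GRANTS.get((user, table), set())


print(is_authorized("clerk", "account", "READ"))
print(is_authorized("clerk", "account", "UPDATE"))
print(is_authorized("manager", "account", "DELETE"))
```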
Encryption
• It is the process of encoding information. This process converts the
original representation of the information, known as plaintext, into an
alternative form known as ciphertext. Ideally, only authorized parties
can decipher a ciphertext back to plaintext and access the original
information.
• Ciphertext is encrypted text transformed from plaintext using an
encryption algorithm. Ciphertext can't be read until it has been
converted into plaintext (decrypted) with a key. The decryption cipher
is an algorithm that transforms the ciphertext back into plaintext.
Views
• A view is a means of providing a user with a personalized model of
the DB.
• It is a useful way of limiting user’s access to various portions of the
DB.
• By creating different views for different classes of users, a high degree
of access control is automatically attained.
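A view that hides a sensitive column can be sketched with Python's built-in `sqlite3` module. The `employee` table and the `employee_public` view are hypothetical names; note that SQLite itself has no user accounts, so in a multi-user DBMS the remaining step would be to grant users access to the view only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 50000)")

# The view exposes only the non-sensitive columns of the base table.
conn.execute("CREATE VIEW employee_public AS SELECT id, name FROM employee")
conn.commit()

cursor = conn.execute("SELECT * FROM employee_public")
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()

print(columns)  # the salary column is not part of the view
print(rows)
```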
