
TRANSACTION MANAGEMENT, RECOVERY AND QUERY PROCESSING
Learning Objectives
ž  A transaction represents a real‑world event such as the sale
of a product.
ž  A transaction must be a logical unit of work. That is, no
portion of a transaction stands by itself. For example, the
product sale has an effect on inventory and, if it is a credit
sale, it has an effect on customer balances.
ž  A transaction must take a database from one consistent
state to another. Therefore, all parts of a transaction must be
executed or the transaction must be aborted. (A consistent
state of the database is one in which all data integrity
constraints are satisfied.)
Course Content
ž  Introduction to Transaction Management
ž  ACID Properties
ž  Introduction to Concurrency Control
ž  Reasons of Transaction Failure, System
Recovery and Media Recovery
ž  Introduction to Query Processing
ž  Steps in Query Processing
Introduction
ž  A transaction is a logical unit of processing corresponding to a
series of elementary physical operations (reads/writes)
on the DB
ž  Examples:
—  Transfer of a sum between bank accounts
—  UPDATE CC SET balance=balance-50 WHERE account=123
—  UPDATE CC SET balance=balance+50 WHERE account=235
—  Updating wages of employees in a branch
—  UPDATE Emp
—  SET wage=1.1*wage
—  WHERE branch=‘S01’
Transaction Concept
ž  A transaction is a unit of program execution that accesses
and possibly updates various data items.
ž  A transaction must see a consistent database. During
transaction execution the database may be inconsistent.
ž  When the transaction is committed, the database must be
consistent
ž  Two main issues to deal with:
—  Failures of various kinds, such as hardware failures and
system crashes
—  Concurrent execution of multiple transactions
Transactions in DBMS
ž  Transactions are a set of operations used to perform a
logical set of work. A transaction usually means that the
data in the database has changed. One of the major uses
of DBMS is to protect the user’s data from system failures.
It is done by ensuring that all the data is restored to a
consistent state when the computer is restarted after a
crash. The transaction is any one execution of the user
program in a DBMS. Executing the same program multiple
times will generate multiple transactions.
ž  Example –
Transaction to be performed to withdraw cash from an
ATM vestibule.
ž  Set of Operations :
Consider the following example for transaction operations
as follows.
ž  Example -ATM transaction steps.
—  Transaction Start.
—  Insert your ATM card.
—  Select language for your transaction.
—  Select Savings Account option.
—  Enter the amount you want to withdraw.
—  Enter your secret pin.
—  Wait for some time for processing.
—  Collect your Cash.
—  Transaction Completed.
ž  Three operations can be performed in a transaction as follows.
—  Read/Access data (R).
—  Write/Change data (W).
—  Commit.
ž  Example –
Transfer of Rs.50 from Account A to Account B. Initially A = Rs.500,
B = Rs.800. This data is brought to RAM from the Hard Disk.
ž  The updated value of Account A = Rs.450 and Account B = Rs.850.
ž  All instructions before the commit are in a partially
committed state and are held in RAM. When the commit is
executed, the data is fully accepted and stored on the Hard Disk.
ž  If the transaction fails anywhere before the commit, we have to go
back and start from the beginning; we can't continue from
the same state. This is known as Rollback.
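The commit/rollback behaviour described above can be sketched in Python. This is a toy illustration, not a real DBMS API: the `transfer` helper, the account names, and the in-memory "RAM snapshot" are all invented for this example.

```python
# Illustrative sketch of atomic commit/rollback (hypothetical helper, not a DBMS API):
# either both account updates take effect, or the pre-transaction state is restored.

def transfer(accounts, src, dst, amount):
    """Move `amount` from src to dst; commit both writes or roll back to the snapshot."""
    snapshot = dict(accounts)          # state before the transaction ("RAM copy")
    try:
        accounts[src] -= amount        # write on the source account
        if accounts[src] < 0:
            raise ValueError("insufficient funds")
        accounts[dst] += amount        # write on the destination account
        return True                    # commit: changes become permanent
    except Exception:
        accounts.clear()
        accounts.update(snapshot)      # rollback: restore the consistent state
        return False

accounts = {"A": 500, "B": 800}
transfer(accounts, "A", "B", 50)       # succeeds: A = 450, B = 850
transfer(accounts, "A", "B", 1000)     # fails mid-way: rolled back, balances unchanged
```

The second call demonstrates the "go back and start from the beginning" rule: after the failure the balances are exactly as they were before the transaction started.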
Transaction failure in between the operations
ž  A transaction can fail before finishing all the
operations in the set. This can happen due to power failure,
system crash etc. This is a serious problem that can leave the
database in an inconsistent state. Assume the transaction
fails after the third operation; then the amount would be deducted
from your account but your friend will not receive it.
ž  To solve this problem, we have the following two operations
—  Commit: If all the operations in a transaction are
completed successfully then commit those changes to
the database permanently.
—  Rollback: If any of the operations fails, then roll back all the
changes done by previous operations.
Uses of Transaction Management
ž  The DBMS is used to schedule access to data
concurrently. It means that users can access multiple
data items from the database without interfering with each
other. Transactions are used to manage concurrency.
ž  It is also used to satisfy ACID properties.
ž  It is used to solve Read/Write Conflict.
ž  It is used to implement Recoverability and Serializability, and to
avoid Cascading rollbacks.
ž  Transaction Management is also used for Concurrency
Control Protocols and Locking of data.
Disadvantage of using a Transaction
ž  It may be difficult for end-users to change the information
within the transaction database.
ž  We need to always roll back and start from the
beginning rather than continue from the previous state.
ACID Properties in DBMS
Atomicity
ž  It states that all operations of the transaction take place at
once; if not, the transaction is aborted.
ž  There is no midway, i.e., the transaction cannot occur
partially. Each transaction is treated as one unit and either
runs to completion or is not executed at all.
ž  Atomicity involves the following two operations:
—  Abort: If a transaction aborts then all the changes made
are not visible.
—  Commit: If a transaction commits then all the changes
made are visible.
ž  Consider the following transaction T consisting of T1 and T2:
Transfer of Rs.100 from account X to account Y.
ž  If the transaction fails after completion of T1 but before
completion of T2 (say, after write(X) but before write(Y)),
then the amount has been deducted from X but not added to Y.
This results in an inconsistent database state. Therefore, the
transaction must be executed in its entirety in order to ensure
correctness of the database state.
Consistency
ž  The integrity constraints are maintained so that the database is
consistent before and after the transaction.
ž  The execution of a transaction will leave a database in either its prior
stable state or a new stable state.
ž  The consistent property of database states that every transaction
sees a consistent database instance.
ž  The transaction is used to transform the database from one
consistent state to another consistent state.
ž  For example: The total amount must be maintained before and after
the transaction.
ž  Therefore, the database is consistent. In the case when T1 is
completed but T2 fails, inconsistency will occur.
Isolation
ž  It shows that the data which is used at the time of execution
of a transaction cannot be used by the second transaction
until the first one is completed.
ž  In isolation, if the transaction T1 is being executed and using
the data item X, then that data item can't be accessed by any
other transaction T2 until the transaction T1 ends.
ž  The concurrency control subsystem of the DBMS enforces
the isolation property.
ž  Consider two transactions T and T''.
ž  Suppose T has been executed till Read (Y) and then T'' starts.
As a result, interleaving of operations takes place, due to which
T'' reads the correct value of X but an incorrect value of Y, and the
sum computed by
—  T'': (X+Y = 50,000 + 500 = 50,500)
—  is thus not consistent with the sum at the end of transaction
—  T: (X+Y = 50,000 + 450 = 50,450).
ž  This results in database inconsistency, due to a loss of 50 units.
Hence, transactions must take place in isolation and changes
should be visible only after they have been made to the main
memory.
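The interleaving above can be reproduced deterministically in Python. This is a hand-scheduled sketch (the variable names are invented; a real DBMS would interleave concurrent operations, not sequential statements): T deducts Rs.50 from Y while T'' sums X and Y in between T's read and write.

```python
# Sketch of the isolation violation: T'' reads Y before T writes its update back,
# so T'' computes a sum that never existed in any consistent state.
db = {"X": 50_000, "Y": 500}

# T starts: reads Y and computes Y - 50, but has not yet written it back
t_new_y = db["Y"] - 50

# T'' interleaves here and sums the current values: correct X, stale Y
sum_seen_by_t2 = db["X"] + db["Y"]     # 50,000 + 500 = 50,500

# T resumes and writes its update
db["Y"] = t_new_y
sum_after_t = db["X"] + db["Y"]        # 50,000 + 450 = 50,450 (consistent)

print(sum_seen_by_t2, sum_after_t)     # 50500 50450 -- T'' is off by 50 units
```

With isolation enforced, T'' would be blocked from reading Y until T finishes, and both sums would agree.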
Durability
ž  The durability property indicates the permanence of the
database's consistent state. It states that the changes made by a
committed transaction are permanent.
ž  They cannot be lost by the erroneous operation of a faulty
transaction or by the system failure. When a transaction is
completed, then the database reaches a state known as the
consistent state. That consistent state cannot be lost, even
in the event of a system's failure.
ž  The recovery subsystem of the DBMS is responsible for the
Durability property.
Transaction States
ž  Transactions can be implemented using SQL queries and the
server. In the below-given diagram, you can see how the
transaction states work.
Transaction Support
ž  Two possible outcomes:
—  Success – transaction commits and database
reaches a new consistent state
—  Failure – transaction aborts, and database is restored to the
consistent state before the transaction started
○  Referred to as a rolled back or undone transaction
ž  Committed transaction cannot be aborted
ž  Aborted transaction that is rolled back can be restarted
later
ž  Active state
—  The active state is the first state of every transaction. In this
state, the transaction is being executed.
—  For example: Insertion or deletion or updating a record is
done here. But all the records are still not saved to the
database.
ž  Partially committed
—  In the partially committed state, a transaction executes its
final operation, but the data is still not saved to the
database.
—  In the total mark calculation example, a final display of the
total marks step is executed in this state.
ž  Committed
—  A transaction is said to be in a committed state if it
executes all its operations successfully. In this state, all the
effects are now permanently saved on the database
system.
ž  Failed state
—  If any of the checks made by the database recovery system fails,
then the transaction is said to be in the failed state.
—  In the example of total mark calculation, if the database is not able
to fire a query to fetch the marks, then the transaction will fail to
execute.
ž  Aborted
—  If any of the checks fail and the transaction has reached a failed
state then the database recovery system will make sure that the
database is in its previous consistent state. If not then it will abort or
roll back the transaction to bring the database into a consistent
state.
—  If the transaction fails in the middle, then all the operations
executed so far are rolled back to restore the previous
consistent state.
—  After aborting the transaction, the database recovery module will
select one of the two operations:
○  Re-start the transaction
○  Kill the transaction
Schedule
ž  A series of operations from one transaction to another is known
as a schedule. It is used to preserve the order of the
operations in each individual transaction.
Serial Schedule
ž  The serial schedule is a type of schedule where one
transaction is executed completely before starting another
transaction. In the serial schedule, when the first
transaction completes its cycle, then the next transaction
is executed.
ž  For example: Suppose there are two transactions T1 and
T2 which have some operations. If it has no interleaving
of operations, then there are the following two possible
outcomes:
—  Execute all the operations of T1 followed by
all the operations of T2, or vice versa.
—  In the given figure (a), Schedule A shows the serial
schedule where T1 is followed by T2.
—  In the given figure (b), Schedule B shows the serial
schedule where T2 is followed by T1.
Non-serial Schedule
ž  If interleaving of operations is allowed, then there will be a
non-serial schedule.
ž  It contains many possible orders in which the system can
execute the individual operations of the transactions.
ž  In the given figures (c) and (d), Schedule C and Schedule D
are non-serial schedules. They have interleaving of
operations.
Serializable Schedule
ž  The serializability of schedules is used to find non-serial
schedules that allow the transactions to execute
concurrently without interfering with one another.
ž  It identifies which schedules are correct when executions
of the transactions have interleaving of their operations.
ž  A non-serial schedule will be serializable if its result is
equal to the result of its transactions executed serially.
ž  Here,
ž  Schedule A and Schedule B are serial
schedules.
ž  Schedule C and Schedule D are non-serial
schedules.
Serializable
ž  These are of two types:
—  Conflict Serializable:
—  A schedule is called conflict serializable if it can be
transformed into a serial schedule by swapping
non-conflicting operations.
—  View Serializable:
—  A schedule is called view serializable if it is view
equal to a serial schedule (no overlapping
transactions). Every conflict serializable schedule is
view serializable, but a view serializable schedule
that contains blind writes may not be conflict
serializable.
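Conflict serializability is usually tested by building a precedence graph and checking it for cycles. The following is a minimal sketch under an assumed schedule format of `(transaction, operation, item)` tuples; the function name and representation are invented for this illustration.

```python
# Sketch of a conflict-serializability test: add an edge Ti -> Tj whenever an
# operation of Ti conflicts with a later operation of Tj (same item, different
# transactions, at least one write), then look for a cycle in the graph.

def conflict_serializable(schedule):
    edges = set()
    for i, (ti, op_i, x) in enumerate(schedule):
        for tj, op_j, y in schedule[i + 1:]:
            if x == y and ti != tj and "W" in (op_i, op_j):
                edges.add((ti, tj))          # precedence edge Ti -> Tj

    nodes = {t for t, _, _ in schedule}

    def cyclic(n, seen):                     # depth-first search for a cycle
        if n in seen:
            return True
        return any(cyclic(b, seen | {n}) for a, b in edges if a == n)

    return not any(cyclic(n, frozenset()) for n in nodes)

# T1 fully precedes T2 on item A: equivalent to the serial order T1, T2
assert conflict_serializable([("T1", "R", "A"), ("T1", "W", "A"),
                              ("T2", "R", "A"), ("T2", "W", "A")])
# Conflicts in both directions between T1 and T2: cycle, not serializable
assert not conflict_serializable([("T1", "R", "A"), ("T2", "W", "A"),
                                  ("T1", "W", "A"), ("T2", "R", "A")])
```

An acyclic precedence graph means the schedule is equivalent to the serial order given by a topological sort of the graph.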
Non-Serializable
ž  The non-serializable schedule is divided into two types:
Recoverable and Non-recoverable Schedule.
ž  Recoverable Schedule: Schedules in which transactions commit
only after all transactions whose changes they read commit are
called recoverable schedules. In other words, if some transaction
Tj reads a value updated or written by some other transaction Ti,
then the commit of Tj must occur after the commit of Ti.
ž  Example – Consider the following schedule involving two
transactions T1 and T2.
This is a recoverable schedule, since T1 commits before T2, which
makes the value read by T2 correct.
Non-Recoverable Schedule
ž  Example: Consider the following schedule involving two
transactions T1 and T2.
ž  T2 read the value of A written by T1, and committed. T1 later
aborted; therefore the value read by T2 is wrong, but since T2
has already committed, this schedule is non-recoverable.
Concurrency Control
ž  When more than one transaction is running simultaneously,
there are chances of a conflict, which can leave the
database in an inconsistent state. To handle these conflicts we
need concurrency control in DBMS, which allows transactions to
run simultaneously but handles them in such a way that the
integrity of data remains intact.
○  Conflict Example
ž  You and your brother have a joint bank account, from which you
both can withdraw money. Now let's say you both go to different
branches of the same bank at the same time and try to withdraw
Rs.5000, while your joint account has only a Rs.6000 balance.
Without concurrency control in place, you both could get Rs.5000
at the same time, but once both transactions finish, the account
balance would be Rs.-4000, which is not possible and leaves the
database in an inconsistent state.
ž  We need something that controls the transactions in such a way
that they can run concurrently while maintaining the consistency
of data, to avoid such issues.
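The joint-account conflict above can be replayed deterministically. This sketch hand-schedules the bad interleaving (variable names are invented): both withdrawals read the balance before either writes, so both balance checks pass against the stale value.

```python
# Sketch of the joint-account race: two withdrawals of 5000 from a 6000 balance.
# Each transaction checks the balance it read *before* the other's update landed.

balance = 6000

read_by_you = balance          # your branch reads 6000
read_by_brother = balance      # brother's branch also reads 6000

if read_by_you >= 5000:        # check passes against the stale read
    balance -= 5000
if read_by_brother >= 5000:    # also passes: the other withdrawal isn't visible yet
    balance -= 5000

print(balance)   # -4000 -- the inconsistent state described above
```

With a lock on the account row, the second withdrawal would wait, re-read the balance as Rs.1000, and be refused.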
Concurrency Control
ž  Process of managing simultaneous
operations on the database without having
them interfere with one another
ž  Prevents interference when two or more
users access database simultaneously and at
least one updates data
ž  Interleaving of operations may produce
incorrect results
Need for Concurrency Control
ž  Potential concurrency problems:
—  Lost update problem
—  Uncommitted dependency problem
—  Inconsistent analysis problem
Lost Update Problem
ž  A successfully completed update is
overridden by another user.
ž  Example:
—  T1 withdrawing Rs.10 from an account with
balance balx, initially Rs.100
—  T2 depositing Rs.100 into the same account
—  Serially, the final balance would be Rs.190
Lost Update Problem
ž  Loss of T2's update is avoided by preventing T1 from reading balx
until after T2's update is complete.
Uncommitted Dependency Problem
ž  Occurs when one transaction can see intermediate
results of another transaction before it has
committed.
ž  Example:
—  T4 updates balx to Rs.200 but it aborts, so balx
should be back at original value of Rs.100
—  T3 has read new value of balx (Rs.200) and uses
value as basis of Rs.10 reduction, giving a new
balance of Rs.190, instead of Rs.90
ž  Problem avoided by preventing T3 from reading balx
until after T4 commits or aborts
Inconsistent Analysis Problem
ž  Occurs when a transaction reads several values but a second
transaction updates some of them during the execution of the
first.
ž  Aka dirty read or unrepeatable read
ž  Example:
ž  T6 is totaling balances of account x (Rs.100), account y
(Rs.50), and account z (Rs.25).
ž  Meanwhile, T5 has transferred Rs.10 from balx to balz,
so T6 now has the wrong result (Rs.10 too high).
ž  The problem is avoided by preventing T6 from reading balx and
balz until after T5 has completed its updates.
Concurrency Control
ž  Concurrency Control is a method used to ensure that database
transactions are executed in a safe manner (i.e. without data loss).
It is especially applicable to relational databases and DBMSs, which
must ensure that transactions are executed safely and that they
follow the ACID rules. The DBMS must be able to ensure that only
serializable, recoverable schedules are allowed.
ž  There are several categories of concurrency control protocols:
—  Lock Based Concurrency Control Protocol
—  Time Stamp Concurrency Control Protocol
—  Validation Based Concurrency Control Protocol
Lock Based Concurrency Control Protocol
ž  In this type of protocol, a transaction cannot read or write data
until it acquires an appropriate lock on it. There are two types of
lock:
—  1. Shared lock:
○  It is also known as a Read-only lock. Under a shared lock, the
data item can only be read by the transaction.
○  It can be shared between transactions, because while a
transaction holds a shared lock, it can't update the data item.
—  2. Exclusive lock:
○  Under an exclusive lock, the data item can be both read and
written by the transaction.
○  This lock is exclusive: multiple transactions cannot modify
the same data simultaneously.
ž  There are four types of lock protocols available:
ž  1. Simplistic lock protocol
—  It is the simplest way of locking data during a transaction. Simplistic
lock-based protocols require all transactions to get a lock on the data
before an insert, delete or update on it. The data item is unlocked
after the transaction completes.
ž  2. Pre-claiming Lock Protocol
—  Pre-claiming lock protocols evaluate the transaction to list all the data
items on which it needs locks.
—  Before initiating execution of the transaction, it requests the DBMS for
locks on all those data items.
—  If all the locks are granted, then this protocol allows the transaction to
begin. When the transaction is completed, it releases all the locks.
—  If all the locks are not granted, then the transaction rolls back
and waits until all the locks are granted.
ž  3. Two-phase locking (2PL)
—  The two-phase locking protocol divides the execution of
the transaction into three parts.
—  In the first part, when the execution of the transaction starts, it
seeks permission for the locks it requires.
—  In the second part, the transaction acquires all the locks. The
third phase starts as soon as the transaction releases its first
lock.
—  In the third phase, the transaction cannot demand any new locks.
It only releases the acquired locks.
ž  There are two phases of 2PL:
ž  Growing phase:
—  In the growing phase, a new lock on the data item may be
acquired by the transaction, but none can be released.
ž  Shrinking phase:
—  In the shrinking phase, existing locks held by the transaction may
be released, but no new locks can be acquired.
ž  In the below example, if lock conversion is allowed, then the
following conversions can happen:
—  Upgrading of a lock (from S(a) to X(a)) is allowed in the growing
phase.
—  Downgrading of a lock (from X(a) to S(a)) must be done in the
shrinking phase.
ž  Example:
The following way shows how unlocking and locking work with 2-PL.
Transaction T1:
Growing phase: from step 1-3
Shrinking phase: from step 5-7
Lock point: at 3
Transaction T2:
Growing phase: from step 2-6
Shrinking phase: from step 8-9
Lock point: at 6
ž  4. Strict Two-phase locking (Strict-2PL)
The first phase of Strict-2PL is the same as in 2PL: after acquiring
all the locks, the transaction continues to execute normally.
The only difference between 2PL and Strict-2PL is that Strict-2PL
does not release a lock immediately after using it.
Strict-2PL waits until the whole transaction commits, and then it
releases all the locks at once.
The Strict-2PL protocol does not have a gradual shrinking phase of
lock release.
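The Strict-2PL bookkeeping can be sketched as follows. This is a hypothetical, single-threaded illustration (the class name, lock-table representation, and error handling are invented); a real lock manager would block rather than raise, and would distinguish shared from exclusive locks.

```python
# Sketch of Strict-2PL: locks are only acquired during execution (growing phase)
# and are all released together at commit time -- there is no gradual shrinking.

class Strict2PLTransaction:
    def __init__(self, lock_table):
        self.lock_table = lock_table   # shared map: item -> owning transaction
        self.held = set()

    def lock(self, item):
        owner = self.lock_table.get(item)
        if owner not in (None, self):
            raise RuntimeError(f"{item} is locked by another transaction")
        self.lock_table[item] = self   # growing phase: acquire, never release early
        self.held.add(item)

    def commit(self):
        for item in self.held:         # Strict-2PL: release everything at once,
            del self.lock_table[item]  # only when the transaction commits
        self.held.clear()

locks = {}
t1 = Strict2PLTransaction(locks)
t1.lock("X")
t1.lock("Y")      # growing phase: both locks held until commit
t1.commit()       # all locks released together
```

Holding every lock until commit is what prevents other transactions from reading uncommitted values, which is why Strict-2PL avoids cascading rollbacks.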
Timestamp Ordering Protocol
ž  The Timestamp Ordering Protocol is used to order the transactions
based on their timestamps. The order of the transactions is simply
the ascending order of their creation times.
ž  The priority of the older transaction is higher, which is why it
executes first. To determine the timestamp of a transaction, this
protocol uses system time or a logical counter.
ž  The lock-based protocol is used to manage the order between
conflicting pairs among transactions at the execution time. But
Timestamp based protocols start working as soon as a transaction
is created.
ž  Let's assume there are two transactions T1 and T2. Suppose
transaction T1 entered the system at time 007 and transaction
T2 entered at time 009. T1 has the higher priority, so it executes
first, as it entered the system first.
ž  The timestamp ordering protocol also maintains the timestamps of
the last 'read' and 'write' operations on each data item.
ž  Basic Timestamp ordering protocol works as follows:
ž  1. Check the following conditions whenever a transaction Ti issues
a Read(X) operation:
—  If W_TS(X) > TS(Ti), then the operation is rejected.
—  If W_TS(X) <= TS(Ti), then the operation is executed.
—  The timestamps of the data item are updated.
ž  2. Check the following conditions whenever a transaction Ti issues
a Write(X) operation:
—  If TS(Ti) < R_TS(X), then the operation is rejected.
—  If TS(Ti) < W_TS(X), then the operation is rejected and Ti is
rolled back; otherwise the operation is executed.
○  Where,
○  TS(Ti) denotes the timestamp of the transaction Ti.
○  R_TS(X) denotes the Read timestamp of data item X.
○  W_TS(X) denotes the Write timestamp of data item X.
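The two checks above can be written out directly. This is a sketch of the decision logic only (function names and the dictionary representation of R_TS/W_TS are invented); actually rolling the transaction back is left out.

```python
# Sketch of basic timestamp-ordering checks. R_TS/W_TS hold the latest read and
# write timestamps per data item; a "reject" means the transaction is rolled back.

def read(ts, item, R_TS, W_TS):
    if W_TS.get(item, 0) > ts:                 # a younger transaction already wrote X
        return "reject"
    R_TS[item] = max(R_TS.get(item, 0), ts)    # record the read
    return "execute"

def write(ts, item, R_TS, W_TS):
    if ts < R_TS.get(item, 0) or ts < W_TS.get(item, 0):
        return "reject"                        # would invalidate a younger read/write
    W_TS[item] = ts                            # record the write
    return "execute"

R_TS, W_TS = {}, {}
assert read(7, "X", R_TS, W_TS) == "execute"    # T1 (TS = 7) reads X
assert write(9, "X", R_TS, W_TS) == "execute"   # T2 (TS = 9) writes X
assert read(7, "X", R_TS, W_TS) == "reject"     # T1 reads again: W_TS(X) = 9 > 7
```

The last call shows why the protocol never deadlocks: T1 is not made to wait for T2, it is simply rejected and restarted with a fresh timestamp.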
ž  Advantages and Disadvantages of the TO protocol:
ž  The TO protocol ensures serializability, since the precedence graph
has edges only from older to younger transactions and is therefore
acyclic.
ž  The TO protocol ensures freedom from deadlock, which means no
transaction ever waits.
ž  But the schedule may not be recoverable and may not even be
cascade-free.
Thomas Write Rule
ž  Thomas Write Rule provides the guarantee of serializability order
for the protocol. It improves the Basic Timestamp Ordering
Algorithm.
ž  The basic Thomas write rules are as follows:
ž  1. If TS(T) < R_TS(X), then transaction T is aborted and rolled
back, and the operation is rejected.
ž  2. If TS(T) < W_TS(X), then don't execute the W_item(X) operation
of the transaction and continue processing (the outdated write is
simply ignored).
ž  3. If neither condition 1 nor condition 2 occurs, then the WRITE
operation is executed by transaction T and W_TS(X) is set to TS(T).
ž  If we use the Thomas write rule, then some serializable schedules
can be permitted that are not conflict serializable, as illustrated by
the schedule in the given figure:
ž  Figure: A Serializable Schedule that is not Conflict Serializable
ž  In the above figure, T2's write occurs between T1's read and T1's
write of the same data item. This schedule is not conflict serializable.
ž  The Thomas write rule ensures that T2's write is never seen by any
transaction. If we delete the write operation in transaction T2, then
a conflict serializable schedule can be obtained, which is shown in
the figure below.
ž  Figure: A Conflict Serializable Schedule
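The Thomas write rule changes only the write check of basic TO: an outdated write is ignored instead of forcing a rollback. A minimal sketch of that decision (the function name and return labels are invented):

```python
# Sketch of the Thomas write rule. Compared with basic TO, the case
# TS(T) < W_TS(X) returns "ignore" (skip the obsolete write) instead of aborting.

def thomas_write(ts, item, R_TS, W_TS):
    if ts < R_TS.get(item, 0):
        return "abort"     # rule 1: a younger transaction already read X
    if ts < W_TS.get(item, 0):
        return "ignore"    # rule 2: obsolete write -- skip it, keep running
    W_TS[item] = ts        # rule 3: perform the write, record the timestamp
    return "execute"

R_TS, W_TS = {}, {"X": 9}                              # T2 (TS = 9) already wrote X
assert thomas_write(7, "X", R_TS, W_TS) == "ignore"    # T1's late write is skipped
assert thomas_write(10, "X", R_TS, W_TS) == "execute"  # T3's write proceeds
```

Skipping T1's write is safe because, in the equivalent serial order, T2's later write would have overwritten it anyway; this is exactly why the rule admits serializable schedules that are not conflict serializable.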
Validation Based Protocol
ž  The validation based protocol is also known as the optimistic
concurrency control technique. In the validation based protocol,
the transaction is executed in the following three phases:
—  Read phase: In this phase, the transaction T reads the values of
the various data items and stores them in temporary local
variables. It performs all its write operations on the temporary
variables, without updating the actual database.
—  Validation phase: In this phase, the temporary variable values
are validated against the actual data to check whether
serializability would be violated.
—  Write phase: If the transaction passes validation, the temporary
results are written to the database; otherwise the transaction is
rolled back.
ž  Here each phase has the following different timestamps:
ž  Start(Ti):
—  It contains the time when Ti started its execution.
ž  Validation (Ti):
—  It contains the time when Ti finishes its read phase and starts its
validation phase.
ž  Finish(Ti):
—  It contains the time when Ti finishes its write phase.
○  This protocol determines the timestamp for the transaction for
serialization using the timestamp of the validation phase, as it is
the phase which actually determines whether the transaction will
commit or roll back.
○  Hence TS(T) = Validation(T).
○  Serializability is determined during the validation process; it
can't be decided in advance.
○  While executing transactions, this ensures a greater degree of
concurrency and fewer conflicts.
○  Thus, fewer transactions need to be rolled back.
Multiple Granularity
ž  Granularity: It is the size of the data item allowed to be locked.
ž  Multiple Granularity:
—  It can be defined as hierarchically breaking up the database into
blocks which can be locked.
—  The multiple granularity protocol enhances concurrency and
reduces lock overhead.
—  It keeps track of what to lock and how to lock.
—  It makes it easy to decide whether to lock or unlock a data
item. This type of hierarchy can be represented graphically as a tree.
ž  For example: Consider a tree which has four levels of nodes.
—  The first level or higher level shows the entire database.
—  The second level represents a node of type area. The higher level
database consists of exactly these areas.
—  The area consists of children nodes which are known as files. No file
can be present in more than one area.
—  Finally, each file contains child nodes known as records. The file has
exactly those records that are its child nodes. No record is present
in more than one file.
ž  Hence, the levels of the tree starting from the top level are as follows:
—  Database
—  Area
—  File
—  Record
ž  In this example, the highest level shows the entire database; the
levels below are area, file, and record.
ž  There are three additional lock modes with multiple granularity:
ž  Intention Mode Lock
—  Intention-shared (IS): It contains explicit locking at a lower
level of the tree but only with shared locks.
—  Intention-Exclusive (IX): It contains explicit locking at a lower
level with exclusive or shared locks.
—  Shared & Intention-Exclusive (SIX): In this lock, the node is
locked in shared mode, and some node is locked in exclusive
mode by the same transaction.
—  Compatibility Matrix with Intention Lock Modes: The below
table describes the compatibility matrix for these lock modes:
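The compatibility matrix referred to above appears as a figure in the original slides; the matrix below is the standard one from the multiple-granularity locking literature, encoded as a lookup table (the table and function names are invented for this sketch).

```python
# The standard lock-mode compatibility matrix for multiple granularity.
# True means a new request in `requested` mode can coexist with an
# existing lock held in `held` mode on the same node.

MODES = ["IS", "IX", "S", "SIX", "X"]
COMPAT = {
    #        IS     IX     S      SIX    X
    "IS":  (True,  True,  True,  True,  False),
    "IX":  (True,  True,  False, False, False),
    "S":   (True,  False, True,  False, False),
    "SIX": (True,  False, False, False, False),
    "X":   (False, False, False, False, False),
}

def compatible(held, requested):
    return COMPAT[held][MODES.index(requested)]

assert compatible("IS", "IX")      # intention locks coexist freely
assert not compatible("S", "IX")   # a shared lock blocks intention-exclusive
assert not compatible("X", "IS")   # exclusive blocks everything
```

Note that only X is incompatible with IS, which is what lets a reader of one record coexist with writers elsewhere in the same file.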
ž  It uses the intention lock modes to ensure serializability. It requires that if
a transaction attempts to lock a node, then that node must follow these
protocols:
•  Transaction T1 should follow the lock-compatibility matrix.
•  Transaction T1 must first lock the root of the tree. It can lock it in any mode.
•  If T1 currently has the parent of the node locked in IX or IS mode,
then T1 may lock the node in S or IS mode only.
•  If T1 currently has the parent of the node locked in IX or SIX
mode, then T1 may lock the node in X, SIX, or IX mode only.
•  T1 can lock a node only if it has not previously unlocked any node
(two-phase locking is observed).
•  T1 can unlock a node only if none of that node's children are
currently locked by T1.
ž  Observe that in multiple-granularity, the locks are acquired in
top-down order, and locks must be released in bottom-up
order.
—  If transaction T1 reads record Ra9 in file Fa, then
transaction T1 needs to lock the database, area A1 and file
Fa in IS mode. Finally, it needs to lock Ra9 in S mode.
—  If transaction T2 modifies record Ra9 in file Fa, then it can
do so after locking the database, area A1 and file Fa in IX
mode. Finally, it needs to lock the Ra9 in X mode.
—  If transaction T3 reads all the records in file Fa, then
transaction T3 needs to lock the database and area A1 in
IS mode. At last, it needs to lock Fa in S mode.
—  If transaction T4 reads the entire database, then T4 needs
to lock the database in S mode.
Recovery with Concurrent Transactions
ž  Whenever more than one transaction is being executed,
their log records become interleaved. During recovery, it
would become difficult for the recovery system to
backtrack through all the logs and then start recovering.
ž  To ease this situation, the 'checkpoint' concept is used by
most DBMSs.
ž  As we have discussed checkpoints in the Transaction
Processing Concepts part of this tutorial, you can go
through those concepts again to make things clearer.
Reasons for Transaction Failure
ž  Data is manipulated by processes. Records can be altered, new
records added and old records deleted. A transaction is a
complete function undertaken by a set of processes.
ž  When a transaction is submitted for execution (for example, when
the save button is pressed) the system checks whether:
ž  all operations involved in the transaction are successfully
completed, or
ž  the transaction has had no effect on the database or any other
transaction.
ž  If neither of these conditions holds, the system will generate an
error message depending on the nature of the failure.
Types of Failure
ž  Transaction: Caused by errors within the transaction processes.
ž  System: Caused by failure of the network or operating system, or
physical threats to the system as a whole.
ž  Media: Failure of the hard disk, out-of-memory errors, or
out-of-disk-space errors.
ž  1. Transaction failure
ž  A transaction failure occurs when a transaction fails to execute or
reaches a point from which it can't go any further. If a transaction
or process is damaged partway through, this is called a transaction
failure.
ž  Reasons for a transaction failure could be -
—  Logical errors: If a transaction cannot complete due to some
code error or an internal error condition, then the logical error
occurs.
—  Syntax error: It occurs where the DBMS itself terminates an
active transaction because the database system is not able to
execute it. For example, The system aborts an active transaction,
in case of deadlock or resource unavailability.
ž  2. System Crash
—  System failure can occur due to power failure or other
hardware or software failure. Example: Operating system
error.
—  Fail-stop assumption: In the system crash, non-volatile
storage is assumed not to be corrupted.
ž  3. Disk Failure
—  It occurs when hard-disk drives or storage drives fail. Frequent
disk failures were a common problem in the early days of
technology evolution.
—  Disk failure occurs due to the formation of bad sectors, a disk
head crash, unreachability of the disk, or any other failure which
destroys all or part of disk storage.
Reasons for Failure
ž  Failure may be caused by a number of things. Transaction errors,
system errors, system crashes, concurrency problems and local
errors or exceptions are the more common causes of system
failure. The system must be able to recover from such failures
without loss of data.
System Recovery
What is Transaction Failure?
ž  When a transaction is submitted to a database system, it is the
responsibility of the database management system to execute all
the operations in the transaction.
ž  According to the atomicity property of a transaction, either all the
operations in a transaction are executed or none is. There won't
be a case where only half of the operations are executed; such a
case leads to a transaction failure.

Recovery System in DBMS from Transaction Failure
ž  In a database recovery management system, there are mainly two
recovery techniques that can help a DBMS recover while
maintaining the atomicity of a transaction. Those are as follows:
—  1. Log Based Recovery
—  2. Shadow Paging
Log Based Recovery in DBMS
ž  A log is a sequence of records that contains the history of all
updates made to the database. The log is the most commonly used
structure for recording database modifications. Sometimes the log
is also known as the system log.
ž  Update log has the following fields-
—  Transaction Identifier: To get the Transaction that is executing.
—  Data item Identifier: To get the data item of the Transaction
that is running.
—  The old value of the data item (Before the write operation).
—  The new value of the data item (After the write operation).
ž  the basic structure of the format of a log record.
—  <T, Start >. The Transaction has started.
—  <T, X, V1,V2>. The Transaction has performed write on data. V
is a value that X will have value before writing, and V2 is a
Value that X will have after the writing operation.
—  <T, Commit>. The Transaction has been committed.
—  <T, Abort>. The Transaction has aborted.
ž  Consider data items A and B, each with an initial value of 1000
(A = B = 1000). A transaction T that transfers 50 from A to B
would write the log records <T, Start>, <T, A, 1000, 950>,
<T, B, 1000, 1050>, and <T, Commit>, one record alongside each
operation of the transaction.
Shadow Paging Recovery Method
ž  Shadow paging is a commonly used method for database recovery in
DBMS. It requires fewer disk accesses than log-based methods.
ž  Here the database is partitioned into a number of fixed-length
blocks known as pages, and two page tables are maintained during
the life cycle of a transaction: 1) the current page table and
2) the shadow page table.
ž  Each page-table entry contains a pointer to a block on disk.
ž  When the transaction starts, both page tables are identical. During
the transaction, all changes are made through the current page
table, while the shadow page table remains as it was before the
transaction began: it preserves the old state of the database and
is never modified while the transaction runs.
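A minimal sketch of the two-table idea, assuming a toy in-memory page array and integer page numbers: updates go to fresh pages through the current table, the shadow table is untouched, and commit amounts to a single pointer swap.

```python
# Toy shadow paging. Pages and table layout are assumptions for illustration.
pages = ["p0-data", "p1-data", "p2-data"]   # fixed-length "disk" pages
shadow_table = list(range(len(pages)))      # entry i -> page number on disk
current_table = list(shadow_table)          # identical at transaction start

def txn_write(entry, new_data):
    new_page = len(pages)
    pages.append(new_data)                  # write the updated copy to a fresh page
    current_table[entry] = new_page         # only the current table is changed

txn_write(1, "p1-updated")

# Abort: discard current_table; shadow_table still points at the old pages.
# Commit: atomically install the current table as the new shadow table.
shadow_table = current_table

print(pages[shadow_table[1]])   # the committed, updated page
```

The original page (`p1-data`) is never overwritten in place, which is why no undo log is needed: an abort simply keeps the shadow table.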
CheckPoints Recovery Method in DBMS
ž  A checkpoint is another recovery technique used in database
recovery management in DBMS. In this technique, a checkpoint
operation is performed periodically that copies log information
from volatile storage onto stable storage. The information and
operations performed at each checkpoint consist of the following:
—  The start of the checkpoint, with the time and date of the
checkpoint, is written to the log on a stable storage device.
—  All log data in the buffers in main memory is copied to the
log on stable storage.
—  The database buffers in volatile storage are flushed to the
physical database on disk.
—  An end-of-checkpoint record is written, and the address of
the checkpoint record is saved in a file accessible to the
recovery routine on start-up after a system crash.
—  The frequency of checkpointing is a design consideration
of the recovery system. The options are:
ž  A fixed interval of time.
ž  Transaction-consistent checkpoint.
ž  Action-consistent checkpoint.
ž  Transaction-oriented checkpoint.
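The payoff of checkpointing can be shown with a toy log (record shapes assumed): since everything before the checkpoint has already been flushed to stable storage, recovery only has to scan the log suffix after the most recent checkpoint record.

```python
# Toy log with a checkpoint record; shapes are illustrative only.
log = [
    ("T1", "start"), ("T1", "A", 100, 90), ("T1", "commit"),
    ("checkpoint",),                 # buffers flushed to stable storage here
    ("T2", "start"), ("T2", "B", 50, 60),
]

# Find the most recent checkpoint; only records after it need processing.
last_cp = max(i for i, rec in enumerate(log) if rec[0] == "checkpoint")
to_process = log[last_cp + 1:]

print(to_process)   # only T2's records remain; T1's work needs no redo
```

Without the checkpoint, the recovery routine would have to scan and redo from the very beginning of the log.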
ž  Solution – Option B is the right answer.
ž  Explanation – The system failed before record 7 in
transaction T2, which means T2 never performed its commit
operation. When the recovery manager checks the log file to
recover the database, it will find both <T1, Start> and
<T1, Commit> in the log file.
ž  This means T1 committed successfully, so records 2 and 3 of
transaction T1 will be REDOne, and the new (updated) values
of B and M will be set in the database.
ž  For transaction T2, <T2, Start> is present but <T2, Commit>
is not, so to bring the database to a consistent state,
record 6 will be UNDOne: the value of B will not be changed
to 10500. The value of B set by transaction T1, i.e., 10000,
will be written to the database.
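The REDO/UNDO decision described above can be sketched as follows. The log format and the concrete values are illustrative; the rule is the one from the explanation: transactions with both `<Start>` and `<Commit>` in the log are redone, those with `<Start>` but no `<Commit>` are undone using the old values.

```python
# Toy recovery routine over an assumed log format.
def recover(log, db):
    committed = {rec[0] for rec in log if rec[-1] == "commit"}
    started = {rec[0] for rec in log if rec[-1] == "start"}
    # REDO committed transactions, scanning forward through the log.
    for rec in log:
        if len(rec) == 4 and rec[0] in committed:
            _, item, _, new = rec
            db[item] = new
    # UNDO uncommitted transactions, scanning backward through the log.
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] in (started - committed):
            _, item, old, _ = rec
            db[item] = old
    return db

log = [
    ("T1", "start"), ("T1", "B", 9000, 10000), ("T1", "commit"),
    ("T2", "start"), ("T2", "B", 10000, 10500),   # crash before T2 commits
]
db = recover(log, {"B": 10500})
print(db)   # {'B': 10000}: T1's write is redone, T2's is undone
```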
Media Recovery
ž  Unlike crash and instance recovery, media recovery is executed
on your command. In media recovery, you use online and archived
redo logs and incremental backups to make a restored backup
current or to update it to a specific time. It is called media recovery
because you usually perform it in response to media failure.
ž  Media recovery uses redo records or incremental backups to
recover restored data files either to the present or to a specified
non-current time. When performing media recovery, you can
recover the whole database, a table space, or a data file. In any
case, you always use a restored backup to perform the recovery.
The principal division in media recovery is between
complete recovery and incomplete recovery.
Complete Recovery
ž  Complete recovery involves using redo data or incremental
backups combined with a backup of a database, table space, or
data file to update it to the most current point in time. It is called
complete because Oracle applies all of the redo changes to the
backup. Typically, you perform media recovery after a media failure
damages data files or the control file.

ž  Requirements for Complete Recovery
ž  You can perform complete recovery on a database, tablespace, or
datafile. If you are performing complete recovery on the whole
database, then you must:
—  Mount the database.
—  Ensure that all data files you want to recover are online.
—  Restore a backup of the whole database or the files you want to
recover.
—  Apply online or archived redo logs, or a combination of the two.
ž  If you are performing complete recovery on a tablespace or
datafile, then you must:
—  Take the tablespace or datafile to be recovered offline if the
database is open.
—  Restore a backup of the datafiles you want to recover.
—  Apply online or archived redo logs, or a combination of the
two.
Incomplete Recovery
ž  Incomplete recovery uses a backup to produce a non-current version
of the database. In other words, you do not apply all of the redo data
generated since the most recent backup. You usually perform
incomplete recovery when:
—  Media failure destroys some or all of the online redo logs.
—  A user error causes data loss, e.g., a user inadvertently drops a
table.
—  You cannot perform complete recovery because an archived redo
log is missing.
—  You lose your current control file and must use a backup control file
to open the database.
ž  To perform incomplete media recovery, you must restore all datafiles
from backups created prior to the time to which you want to recover
and then open the database with the RESETLOGS option when
recovery completes. The RESETLOGS operation creates a new
incarnation of the database. All archived redo logs generated after the
point of the RESETLOGS on the old incarnation are invalid on the
new incarnation.
Query Processing in DBMS
ž  Query processing is the translation of high-level queries into
low-level expressions.
ž  It is a stepwise process that spans the physical level of the
file system, query optimization, and actual execution of the
query to get the result.
ž  It requires the basic concepts of relational algebra and file
structure.
ž  It refers to the range of activities that are involved in
extracting data from the database.
ž  It includes translation of queries in high-level database
languages into expressions that can be implemented at the
physical level of the file system.
ž  In query processing, we will actually understand how these
queries are processed and how they are optimized.
In the above diagram,
ž  The first step is to transform the query into a standard form.
The SQL query is translated into a relational algebra
expression. During this process, the parser checks the syntax
and verifies the relations and the attributes used in the
query.
ž  The second step is the query optimizer. It transforms the
query into equivalent expressions that are more efficient to
execute.
ž  The third step is query evaluation. It executes the query
execution plan produced above and returns the result.
ž  Translating SQL Queries into Relational Algebra
ž  Example
—  SELECT Ename FROM Employee
WHERE Salary > 5000;
ž  Translated into a Relational Algebra Expression
π Ename (σ Salary > 5000 (Employee))
(Note that the reverse order, σ Salary > 5000 (π Ename (Employee)),
would be invalid, because Salary is projected away before the
selection could be applied.)
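The selection (σ) and projection (π) operators can be illustrated over a small in-memory relation; the table contents below are made up for the example.

```python
# Tiny relational algebra over a list-of-dicts relation (illustrative data).
employee = [
    {"Ename": "Asha", "Salary": 7000},
    {"Ename": "Ravi", "Salary": 4000},
]

def select(pred, rel):     # σ_pred(rel): keep rows satisfying the predicate
    return [row for row in rel if pred(row)]

def project(attrs, rel):   # π_attrs(rel): keep only the listed attributes
    return [{a: row[a] for a in attrs} for row in rel]

result = project(["Ename"], select(lambda r: r["Salary"] > 5000, employee))
print(result)   # [{'Ename': 'Asha'}]
```

Applying `project` first would drop the Salary column, which is exactly why the σ-after-π ordering noted above is invalid.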

•  A sequence of primitive operations that can be used to evaluate
a query is a Query Execution Plan or Query Evaluation Plan.

•  The above diagram indicates that the query execution engine
takes a query execution plan and returns the answers to the
query.

•  A good query execution plan minimizes the cost of query
evaluation.
ž  Block Diagram of Query Processing is as:
ž  Detailed Diagram is drawn as:
Basic Steps in Query Processing
ž  Parsing and translation
ž  Optimization
ž  Evaluation
1. Parsing and translation
ž  Translate the query into its internal form. This is then translated
into relational algebra.
ž  The parser checks syntax and verifies relations.
ž  Step-1:
Parser: During the parse call, the database performs the following
checks – syntax check, semantic check, and shared pool check – after
converting the query into relational algebra. The parser performs the
following checks (refer to the detailed diagram):
—  Syntax check – checks SQL syntactic validity. Example:
SELECT * FORM employee
Here the misspelling of FROM is caught by this check.
—  Semantic check – determines whether the statement is meaningful.
For example, a query referring to a table that does not exist is
caught by this check.
—  Shared pool check – every query is given a hash code for its
execution. This check determines whether that hash code already
exists in the shared pool; if it does, the database does not take
the additional optimization and execution steps.
ž  Hard Parse and Soft Parse –
If a query is fresh and its hash code does not exist in the shared
pool, it has to pass through additional steps known as hard parsing;
if the hash code exists, the query skips these additional steps and
passes directly to the execution engine (refer to the detailed
diagram). This is known as soft parsing.
Hard parsing includes the following steps – optimization and row
source generation.
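A toy model of the shared-pool check, assuming a dictionary keyed by the hash of the SQL text (real DBMS internals are considerably more involved): the first call pays for a hard parse, the repeat call gets a soft parse.

```python
# Toy shared pool: hash of SQL text -> previously built execution plan.
import hashlib

shared_pool = {}

def parse(sql):
    key = hashlib.sha256(sql.encode()).hexdigest()
    if key in shared_pool:                 # hash found: skip extra steps
        return shared_pool[key], "soft parse"
    plan = f"plan for: {sql}"              # stand-in for optimizer + row source gen
    shared_pool[key] = plan
    return plan, "hard parse"

_, kind1 = parse("SELECT * FROM employee")
_, kind2 = parse("SELECT * FROM employee")
print(kind1, kind2)   # hard parse soft parse
```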
2. Optimization
ž  SQL is a very high-level language:
—  Users specify what to search for, not how the search is
actually done.
—  The algorithms are chosen automatically by the DBMS.
ž  For a given SQL query there may be many possible execution
plans.
ž  Among all equivalent plans, the one with the lowest cost is chosen.
ž  Cost is estimated using statistical information from the database
catalog.
ž  Step-2:
Optimizer:
ž  During the optimization stage, the database must perform a hard
parse at least once for each unique DML statement and perform
optimization during this parse. The database never optimizes DDL
unless it includes a DML component, such as a subquery, that
requires optimization. Optimization is a process in which multiple
query execution plans for satisfying a query are examined and the
most efficient query plan is selected for execution.
The database catalog supplies the statistics used to cost the
candidate plans, and the optimizer passes the lowest-cost plan on
for execution.
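Choosing among equivalent plans then reduces to taking the minimum over the estimated costs. A sketch with made-up plan names and cost numbers:

```python
# Illustrative cost-based plan choice; plans and costs are invented.
plans = {
    "full table scan": 1200.0,
    "index range scan": 85.0,
    "index fast full scan": 300.0,
}

best = min(plans, key=plans.get)   # lowest estimated cost wins
print(best)   # index range scan
```

In a real optimizer the cost figures come from catalog statistics (row counts, selectivities, index depths), not fixed constants.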

ž  Row Source Generation –
The row source generator is software that receives the optimal
execution plan from the optimizer and produces an iterative
execution plan usable by the rest of the database. The iterative
plan is a binary program that, when executed by the SQL engine,
produces the result set.
3. Evaluation
ž  The query evaluation engine takes a query evaluation plan,
executes that plan, and returns the answer to the query.
ž  Step-3:
Execution Engine: Finally, it runs the query and displays the
required result.
