Unit No. 7 Crash Recovery and Backup

1. Failure classifications

2. Recovery & Atomicity

3. Log based recovery

4. Checkpoint and Shadow Paging in Data recovery

5. Database backup and types of backups


1. Failure Classifications
To locate where a problem has occurred, failures are classified into the following categories:
1. Transaction failure
2. System crash
3. Disk failure
1. Transaction failure
A transaction failure occurs when a transaction fails to execute or reaches a point from which it cannot proceed. When
only a few transactions or processes are affected, the failure is called a transaction failure.
Reasons for a transaction failure include -
1. Logical errors: The transaction cannot complete because of a code error or an internal error condition.
2. System errors: The DBMS itself terminates an active transaction because the database system is not able to execute
it. For example, the system aborts an active transaction in case of deadlock or resource unavailability.
2. System Crash
A system crash occurs when a hardware or software breakdown brings the system to an abrupt stop. Typical causes include
power failures or power cuts, operating system errors, and main memory crashes.
These types of failures are often termed soft failures and are responsible for the loss of data held in volatile memory.
It is assumed that a system crash does not affect the data stored in non-volatile storage; this is known as the
fail-stop assumption.

3. Disk Failure
• A disk failure that occurs during a data-transfer operation and results in loss of content from disk storage is
categorized as a data-transfer failure. Other causes of disk failure include disk head crash, disk unreachability,
formation of bad sectors, and read-write errors on the disk.
2. Recovery & Atomicity
When a system crashes, several transactions may be executing and various files may be open for them to modify data
items. A transaction is made up of various operations, each of which is atomic in nature. According to the ACID
properties of a DBMS, the atomicity of the transaction as a whole must also be maintained: either all of its operations
are executed or none.
When a DBMS recovers from a crash, it should do the following −
•It should check the states of all the transactions that were being executed.
•A transaction may be in the middle of some operation; the DBMS must ensure the atomicity of the transaction in this
case.
•It should check whether each transaction can be completed now or needs to be rolled back.
•No transaction should be allowed to leave the DBMS in an inconsistent state.
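The all-or-nothing requirement can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module and
a hypothetical account table with a transfer function; none of these names come from the notes. It shows atomicity from
the application's side (a failed transfer is rolled back as a whole), not the internals of crash recovery.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                # The first update has already run; raising here undoes it.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a dict, for inspection."""
    return dict(conn.execute("SELECT id, balance FROM account"))


# Hypothetical data: two accounts in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()
```

Calling transfer(conn, "A", "B", 500) returns False and leaves both balances unchanged, because the debit that was
already applied is rolled back together with the rest of the transaction.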
Two types of techniques help a DBMS recover while maintaining the atomicity of a transaction −
•Maintaining a log of each transaction and writing it to stable storage before actually modifying the database.
•Maintaining shadow paging, where the changes are made on a copy of the affected pages rather than in place, and the
actual database is updated later.
3. Log-based Recovery
A log is a sequence of records that describe the actions performed by a transaction. It is important that the log
records are written before the actual modification and stored on stable storage media, which is failsafe.
Log-based recovery works as follows −
•The log file is kept on stable storage media.
•When a transaction enters the system and starts execution, it writes a log record about it.

<Tn, Start>

•When the transaction modifies an item X, it writes a log record as follows −

<Tn, X, V1, V2>

This record reads: Tn has changed the value of X from V1 to V2.

•When the transaction finishes, it logs −

<Tn, commit>
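The three record types above can be sketched in code. This is a minimal illustration under stated assumptions, not a
real DBMS log: the WriteAheadLog class, the run_transaction helper, the JSON-lines file format, and the dictionary
standing in for the database are all hypothetical. The point it shows is the ordering: each record is forced to disk
before the corresponding change is made.

```python
import json
import os
import tempfile


class WriteAheadLog:
    """Append-only log file; each record is one JSON line."""

    def __init__(self, path):
        self.path = path

    def append(self, *record):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())  # force the record to disk before returning

    def records(self):
        with open(self.path, encoding="utf-8") as f:
            return [tuple(json.loads(line)) for line in f]


def run_transaction(log, db, name, updates):
    """Apply `updates` to `db`, logging every step before performing it."""
    log.append(name, "start")                            # <Tn, Start>
    for item, new_value in updates.items():
        log.append(name, item, db.get(item), new_value)  # <Tn, X, V1, V2>
        db[item] = new_value                             # modify only after logging
    log.append(name, "commit")                           # <Tn, commit>


# Hypothetical usage: one transaction T1 changes X from 10 to 25.
log = WriteAheadLog(os.path.join(tempfile.mkdtemp(), "db.log"))
db = {"X": 10}
run_transaction(log, db, "T1", {"X": 25})
```

After this run, log.records() contains the start record, one update record holding both the old and the new value of X,
and the commit record, in that order.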
Recovery with Concurrent Transactions
When more than one transaction is being executed in parallel, the logs are interleaved. At the time of recovery, it
would be hard for the recovery system to backtrack through all the logs before starting to recover. To ease this
situation, most modern DBMSs use the concept of 'checkpoints'.
Checkpoint
Keeping and maintaining logs in real time and in a real environment may fill all the storage space available in the
system. As time passes, the log file may grow too big to be handled at all. A checkpoint is a mechanism where all the
previous logs are removed from the system and stored permanently on a storage disk. A checkpoint declares a point
before which the DBMS was in a consistent state and all the transactions were committed.
Recovery
When a system with concurrent transactions crashes and recovers, it behaves in the following manner −

•The recovery system reads the logs backwards from the end to the last checkpoint.
•It maintains two lists, an undo-list and a redo-list.
•If the recovery system sees a log with <Tn, Start> and <Tn, Commit>, or just <Tn, Commit>, it puts the transaction in the redo-list.
•If the recovery system sees a log with <Tn, Start> but finds no commit or abort log, it puts the transaction in the undo-list.

All the transactions in the undo-list are then undone and their logs are removed. For the transactions in the
redo-list, their previous logs are removed, and the transactions are then redone and their logs saved again.
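The undo-list and redo-list procedure can be sketched as follows. This is a simplified illustration, not a production
algorithm: the recover function and the tuple format for log records are assumptions, the input is taken to be the
records written after the last checkpoint in write order, and transactions with an abort record are assumed to have
been rolled back already.

```python
def recover(log, db):
    """Classify transactions from the log, then undo and redo their updates.

    log: records after the last checkpoint, in the order they were written.
         (txn, "start") / (txn, "commit") / (txn, "abort") / (txn, item, old, new)
    db:  dict holding the database state found on disk after the crash.
    """
    seen = {rec[0] for rec in log}
    committed = {rec[0] for rec in log if len(rec) == 2 and rec[1] == "commit"}
    aborted = {rec[0] for rec in log if len(rec) == 2 and rec[1] == "abort"}

    redo_list = committed
    undo_list = seen - committed - aborted  # no commit or abort record found

    # Undo: scan backwards and restore the old value of every update.
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] in undo_list:
            _txn, item, old, _new = rec
            db[item] = old

    # Redo: scan forwards and reapply the new value of every update.
    for rec in log:
        if len(rec) == 4 and rec[0] in redo_list:
            _txn, item, _old, new = rec
            db[item] = new

    return undo_list, redo_list


# Hypothetical crash: T1 committed but its write never reached disk;
# T2 did not commit but its writes did reach disk.
log = [
    ("T1", "start"),
    ("T1", "X", 10, 25),
    ("T2", "start"),
    ("T2", "Y", 5, 7),
    ("T1", "commit"),
    ("T2", "Y", 7, 9),
]
db = {"X": 10, "Y": 9}
undo_list, redo_list = recover(log, db)
```

In this run T2 lands in the undo-list and Y is restored to 5, while T1 lands in the redo-list and X is set to 25.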
4. Checkpoint and Shadow Paging in Data recovery

Checkpointing is a mechanism where all the previous logs are removed from the system and stored permanently on a
storage disk.
•A checkpoint declares a point before which the DBMS was in a consistent state and all the transactions were committed.
•The checkpoint is like a bookmark. During the execution of transactions, such checkpoints are marked, and as the steps
of each transaction are executed, the log files are created.
Shadow Paging
Shadow paging is a recovery technique used to recover databases.
It is an alternative to log-based recovery techniques and has both advantages and disadvantages.
In this method, transactions are executed in primary memory or on a shadow copy of the database. Once the transactions
have executed completely, the changes are applied to the database.
Shadow paging is a method used to achieve atomic and durable transactions, and it provides the capability to manipulate
pages in a database.
Shadow paging in different environments:
This recovery scheme does not require the use of a log in a single-user environment. In a multiuser environment, a log
may be needed for the concurrency control method. For recovery purposes, shadow paging considers the database to be
made up of a number of fixed-size disk pages (or disk blocks), say n.
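A minimal sketch of the idea follows, assuming a single-user setting. The ShadowPagedStore class and its method names
are hypothetical, and a dictionary stands in for the disk. A transaction writes changed pages to fresh blocks and
records them only in a current page table; the shadow page table keeps pointing at the old blocks until commit swaps
the tables.

```python
class ShadowPagedStore:
    """Toy model of shadow paging with n fixed-size pages."""

    def __init__(self, pages):
        self.disk = dict(enumerate(pages))   # physical block id -> content
        self.shadow_table = list(self.disk)  # logical page -> block (committed state)
        self.current_table = None            # working copy, exists only in a transaction
        self.next_block = len(self.disk)

    def begin(self):
        self.current_table = list(self.shadow_table)

    def write(self, page_no, content):
        # Copy-on-write: the new content goes to a fresh block, so the block
        # referenced by the shadow table is never overwritten.
        block = self.next_block
        self.next_block += 1
        self.disk[block] = content
        self.current_table[page_no] = block

    def read(self, page_no):
        table = self.current_table if self.current_table is not None else self.shadow_table
        return self.disk[table[page_no]]

    def commit(self):
        # The switch of page tables is the single step that makes changes permanent.
        self.shadow_table = self.current_table
        self.current_table = None

    def crash(self):
        # The in-memory current table is lost; the shadow table survives.
        self.current_table = None


# Hypothetical usage: an uncommitted write disappears after a crash.
store = ShadowPagedStore(["a", "b", "c"])
store.begin()
store.write(1, "B")
store.crash()
```

After the crash, store.read(1) returns the original content, which is why no undo step is needed. The sketch never
frees the blocks that are no longer referenced; a real system would have to reclaim them.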
Advantages:
•No overhead of writing log records.
•Recovery is trivial.
Disadvantages:
•Copying the entire page is quite expensive.
•Hard to extend algorithms to allow transactions to run concurrently.
5. Database backup and types of backups

There are various data backup types, each designed to address different risks, vulnerabilities and storage needs.
Effectively backing up files, networks, servers and other assets begins with assessing a network's capabilities and
selecting the proper type of backup for the circumstances.
A backup is a copy of data kept in a separate location, such as another disk, a tape, or the cloud. Backing up is an
important process that everyone should carry out to have a fail-safe for when the inevitable happens. The principle is
to make copies of particular data and use those copies to restore the information if a failure occurs. A data loss
event can be caused by deletion, corruption, theft, viruses, etc.
Protecting data against loss, corruption, disasters (human-caused or natural), and other problems is one of an IT
organization's top priorities. However, implementing an efficient and effective set of backup operations can be
difficult.
The four most common backup types, implemented in most backup programs, are:
1. Full backup
2. Incremental backup
3. Differential backup
4. Mirror backup

A backup type defines how data is copied from source to destination and lays the groundwork for the data repository
model, that is, how the backup is stored and structured.
1. Full backups
The most basic and complete type of backup operation is a full backup. As the name implies, this backup type makes a
copy of all data to a storage device, such as a disk. The primary advantage of performing a full backup during every
operation is that a complete copy of all data is available with a single media set.
A full backup also takes the shortest time to restore data, which helps meet the recovery time objective (RTO). However,
the disadvantages are that a full backup takes longer to perform than other types and requires more storage space.

2. Incremental backups
An incremental backup operation copies only the data that has changed since the last backup operation of any type. An
organization typically compares the modified timestamp on each file with the timestamp of the last backup. Backup
applications record the date and time at which backup operations occur so that they can identify files modified since
those operations. Because an incremental backup copies only data changed since the last backup of any type, an
organization may run it as often as desired, with only the most recent changes stored.
The benefit of an incremental backup is that it copies a smaller amount of data than a full backup. Thus, these
operations have a faster backup speed and require less media to store the backup.
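The timestamp comparison described above can be sketched as follows. The changed_since function and the file names are
hypothetical, and real backup tools often rely on other signals as well (archive bits, checksums, change journals); the
sketch only selects files whose modification time is later than the last backup.

```python
import os
import tempfile


def changed_since(root, last_backup_time):
    """Return relative paths under `root` modified after `last_backup_time`.

    last_backup_time is in seconds since the epoch, as returned by time.time().
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > last_backup_time:
                changed.append(os.path.relpath(path, root))
    return sorted(changed)


# Hypothetical demonstration with two files and artificial timestamps.
with tempfile.TemporaryDirectory() as root:
    for name, mtime in [("old.txt", 1_000), ("new.txt", 3_000)]:
        path = os.path.join(root, name)
        with open(path, "w") as f:
            f.write(name)
        os.utime(path, (mtime, mtime))  # set access and modification times
    result = changed_since(root, last_backup_time=2_000)
```

Only new.txt is selected, because its modification time is later than the recorded time of the last backup.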
3. Differential backups
A differential backup operation is similar to an incremental backup the first time it is performed, in that it copies
all data changed since the previous backup. However, each time it is run afterward, it continues to copy all data
changed since the previous full backup. Therefore, it stores more backed-up data than an incremental backup on
subsequent operations, although typically far less than a full backup.
Differential backups require more space and time to complete than incremental backups, although less than full backups.
From these three primary types of backup, it is possible to develop an approach for comprehensive data protection. An
organization often uses one of the following backup schedules:
• Full daily
• Full weekly + differential daily
• Full weekly + incremental daily
Performing a full backup daily requires the most space and also takes the most time. However, more total copies of data
are available, and fewer pieces of media are required to perform a restore operation. As a result, this backup policy
has a higher tolerance to disasters and provides the shortest restore time, since any data required will be located on
at most one backup set.
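The difference in restore effort between the three schedules can be worked through with a short sketch. The
restore_chain function and the one-week schedules are hypothetical; the function assumes at least one full backup
exists and returns the positions of the backup sets needed to restore the state of the most recent backup.

```python
def restore_chain(backups):
    """Return indices of the backup sets needed to restore the latest state.

    backups: backup kinds in chronological order, each one of
             "full", "differential" or "incremental".
    """
    last_full = max(i for i, kind in enumerate(backups) if kind == "full")
    chain = [last_full]
    start = last_full + 1

    # A differential holds every change since the last full backup, so only
    # the most recent one is needed and anything before it can be skipped.
    diffs = [i for i in range(start, len(backups)) if backups[i] == "differential"]
    if diffs:
        chain.append(diffs[-1])
        start = diffs[-1] + 1

    # Each incremental holds only the changes since the previous backup,
    # so every remaining incremental must be applied in order.
    chain.extend(i for i in range(start, len(backups)) if backups[i] == "incremental")
    return chain


# Hypothetical one-week schedules (day 0 to day 6).
full_daily = ["full"] * 7
weekly_plus_differential = ["full"] + ["differential"] * 6
weekly_plus_incremental = ["full"] + ["incremental"] * 6
```

For these schedules the chain has one set for daily full backups, two sets for full weekly plus differential daily, and
seven sets for full weekly plus incremental daily, which matches the trade-off between backup cost and restore time.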
4. Mirror backups
A mirror backup is comparable to a full backup. This backup type creates an exact copy of the source data set, but only
the latest data version is stored in the backup repository, with no record of earlier versions of the files. The
backed-up files are stored separately, just as they are in the source.
One of the benefits of a mirror backup is a fast data recovery time. It is also easy to access individual backed-up files.

One specific kind of mirror, disk mirroring, is also known as RAID 1. This process replicates data to two or more disks. Disk
mirroring is a strong option for data that needs high availability because of its quick recovery time. It's also helpful for
disaster recovery because of its immediate failover capability. Disk mirroring requires at least two physical drives. If one
hard drive fails, an organization can use the mirror copy. While disk mirroring offers comprehensive data protection, it
requires a lot of storage capacity.