RECAP: Overview of a DBMS

A. Functional Components of a DBMS

A typical DBMS has the following six basic components:

1. File Manager manages the allocation of space on disk storage and the data structures
used to represent information stored on other media. In most applications (99.9%) the
file is the central element. All applications are designed with a specific goal: the
generation and use of information. A typical layered file system architecture is the
following, from top to bottom (see also CPS510):
   User Program
   Logical I/O
   Basic File System Structure
   Device Drivers (disk, tape, etc.)
   Actual Device

2. Buffer Manager, among other tasks, transfers blocks between disk (or other devices)
and Main Memory (MM). DMA (Direct Memory Access) is a form of I/O that controls the
exchange of blocks between MM and a device. When a processor receives a request for the
transfer of a block, it sends the request to the DMA module, which transfers the block.
3. Query Parser translates statements in a query language, whether embedded or not,
into a lower-level language (see the RL language example from CPS510). This parser is
also a strategy selector, i.e., it finds the best and most efficient (fastest?) way of
executing the query.
4. Authorization and Integrity Manager checks the authority of users to access and
modify information, as well as integrity constraints (keys, etc.).
5. Recovery Manager ensures that the database is, and remains, in a consistent (sound)
state after any kind of failure.
6. Concurrency Controller enforces Mutual Exclusion by ensuring that concurrent
interactions with the database proceed without conflict (deadlocks, etc.).

Here are all the components:

[Figure: DBMS components. Labels: User Transaction, Query Parser, Strategy Selector,
Buffer Management, File Manager, Recovery Manager, Concurrency Controller, Buffer (Main
memory), and Disk Storage: Lock Table; User, stat and system data.]

B. Buffer Management
Buffer Management (BM) is an essential part of a DBMS. The buffer is a fairly large area
of Main Memory that the OS allocates to the DBMS. All Transactions share this buffer
through common blocks. (Note: here blocks and pages are used interchangeably.)
The blocks, like pages in an OS, fit exactly into the frames (located in Main Memory).
The larger the buffer, the more blocks it can hold, which minimizes transfers between
the disk and main memory, a slow process.
BM employs a directory, or page table, which describes the current contents of the
buffer: for each block held, the physical file and the corresponding block number.
Keeping recently used blocks in the buffer exploits what is called Locality or
Localization in Software Engineering (if there is such a thing; maybe the more
appropriate term is Software Development), in this particular instance localization of
data. When the block is already in the buffer, the Transaction does not have to wait
for it.
There is a rule of thumb, a well-known empirical law, which says that only 20% of the
data is accessed by 80% of applications!

Buffer Management primitives:

Fix - used by a Transaction to request a block that it has to access. If the block is
not already in the buffer (MM) it is loaded; such a miss is comparable to a page fault
in an OS. Once loaded, the block (page) is assigned (actually through a pointer) to the
executing Transaction that requested it. When the Transaction is finished, the
page/block is unassigned. It checks for a valid Page.
Use - the Transaction accesses a block/page that was previously loaded into MM. It
checks for a valid Page.
Unfix - the Transaction has terminated its use of the block/page. The block is no
longer valid.
Flush - used by the BM to transfer all used blocks, or blocks that have been inactive
for a period of time, back to the disk.
Force - transfers a page/block synchronously from the BM to the disk. The T that
requested the force is suspended until the end of its execution (which also includes a
physical write to disk) OR asynchronously by the BM
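
The primitives above can be sketched as a small Python class. This is only an illustration under stated assumptions, not how any particular DBMS implements them: the "disk" is a plain dictionary, a block is identified by a (file, block number) pair as in the directory described earlier, and `mark_dirty` and the fix counter are hypothetical helpers that do not appear in the notes.

```python
class BufferManager:
    """Toy buffer manager; `disk` is a dict standing in for secondary storage."""

    def __init__(self, disk):
        self.disk = disk        # (file, block_no) -> bytes (hypothetical disk)
        self.frames = {}        # directory: (file, block_no) -> page in MM
        self.fix_count = {}     # number of Ts currently holding each page
        self.dirty = set()      # pages modified since they were last written

    def fix(self, key):
        """Load the block if it is not in the buffer and assign it to the caller."""
        if key not in self.frames:              # miss, comparable to a page fault
            self.frames[key] = bytearray(self.disk[key])
        self.fix_count[key] = self.fix_count.get(key, 0) + 1
        return self.frames[key]

    def use(self, key):
        """Access a page that was previously fixed; reject invalid pages."""
        if self.fix_count.get(key, 0) == 0:
            raise KeyError(f"{key} is not fixed")
        return self.frames[key]

    def mark_dirty(self, key):
        """Hypothetical helper: remember that the page was modified."""
        self.dirty.add(key)

    def unfix(self, key):
        """The T has finished using the page."""
        self.fix_count[key] -= 1

    def force(self, key):
        """Synchronous write: returns only once the block is on 'disk'."""
        self.disk[key] = bytes(self.frames[key])
        self.dirty.discard(key)

    def flush(self):
        """Write back modified pages that no T holds and free their frames."""
        for key in list(self.frames):
            if self.fix_count.get(key, 0) == 0:
                if key in self.dirty:
                    self.force(key)
                del self.frames[key]


disk = {("emp", 0): b"old"}
bm = BufferManager(disk)
page = bm.fix(("emp", 0))      # block is loaded into the buffer
page[:] = b"new"               # the T modifies the page in main memory
bm.mark_dirty(("emp", 0))
bm.unfix(("emp", 0))
bm.flush()                     # the modified block reaches the disk only now
```

In this sketch the disk copy changes only at `force` or `flush`, which is why the recovery discussion later has to care about which pages had reached the disk at the time of a failure.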

Some page strategies available for virtual memory management

1. Replacement or Steal Policy: With Locality, the T requests a page, the buffer
manager finds it in the buffer and passes the address of the page to the running T.
If the page is not there, an existing page in the buffer, called the victim page, is
selected, provided that it is free, and is replaced by the new page transferred from
the disk. The victim is the page that was referenced least recently. (A No-steal
Policy does not permit this process.)
2. Pinned pages: For RECOVERY purposes (from crashes) it is wise to restrict when a
block may be written back to the disk. A page that is not allowed to be written back
to the disk is pinned! This policy is very good for systems that are inherently
resilient to crashes, although OSs do not support it.
3. Forced Output: An alternative for log recovery procedures. This may happen after a
commit, where the active pages of a T are forced to the disk, even though the space
they occupy is NOT needed.
4. Pre-fetching and Pre-flushing: The system anticipates the loading and unloading of
pages/blocks, i.e., it loads a page before the T requests it, or writes a page out
before it is chosen as a victim.
Note: there are more strategies available for Operating Systems.
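
A minimal sketch of the replacement policy, written in Python. It assumes a least-recently-used victim and simplifies pinning to "a pinned page may not be chosen as a victim"; the class name, the `load` callback and the `evicted` list are hypothetical and exist only for the illustration.

```python
from collections import OrderedDict


class LRUBuffer:
    """Buffer with a fixed number of frames and least-recently-used victims."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()   # page id -> contents, least recent first
        self.pinned = set()          # pages that may not be chosen as victims
        self.evicted = []            # history of victims, kept for the demo

    def access(self, page_id, load):
        """Return the page, loading it (and replacing a victim) on a miss."""
        if page_id in self.pages:                 # hit: page is now most recent
            self.pages.move_to_end(page_id)
            return self.pages[page_id]
        if len(self.pages) >= self.capacity:      # buffer full: pick a victim
            victim = next((p for p in self.pages if p not in self.pinned), None)
            if victim is None:
                raise RuntimeError("every page in the buffer is pinned")
            del self.pages[victim]
            self.evicted.append(victim)
        self.pages[page_id] = load(page_id)       # transfer from 'disk'
        return self.pages[page_id]


def load(page_id):
    """Stand-in for a disk read."""
    return f"contents of {page_id}"


buf = LRUBuffer(capacity=2)
buf.access("A", load)
buf.access("B", load)
buf.access("A", load)      # A becomes the most recently used page
buf.access("C", load)      # miss: B is the least recently used, so B is replaced
buf.pinned.add("A")
buf.access("D", load)      # A is older than C but pinned, so C is replaced
```

A real buffer manager would also write a modified victim back to disk before reusing its frame; that step is left out here to keep the victim selection visible.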

Buffer Manager and File System

DBMSs, as mentioned elsewhere in this course, make use of the OS File System, but for
efficiency they maintain their own abstraction of files by means of the buffers. The
functions are as follows:
1. Create and delete a file. At creation time an initial number of blocks is allocated
to the file, which is then extended from there.
2. Open and close a file. Opening loads the information that describes the file into
Main Memory and allocates a numerical identifier (fileid) to the name of the file
(filename).
3. read(fileid, block, buffer), for direct access to a block of a file, which is read
into a page of the buffer.
4. read_seq(fileid, f-block, count, f-buffer), for sequential access to a fixed number
(count) of blocks, starting from f-block, which are read into the buffer starting from
its first page, f-buffer.
5. write(fileid, block, buffer) and write_seq(fileid, f-block, count, f-buffer),
corresponding to the reads above.
6. The system must know the secondary memory directory structure and the latest
secondary memory use. It must also know where blocks are allocated and which are
free, as to communicate
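
The file functions listed above can be illustrated on top of an ordinary OS file. The following Python sketch is an assumption-laden toy: the block size is tiny, the class name `BlockFiles` is hypothetical, and instead of taking a buffer argument as in the signatures above, the read functions simply return the bytes.

```python
import os
import tempfile

BLOCK_SIZE = 16   # assumed block size, kept tiny for the demonstration


class BlockFiles:
    """Block-oriented access to OS files through numerical file identifiers."""

    def __init__(self):
        self.open_files = {}   # fileid -> open file object
        self.next_id = 0

    def create(self, filename, n_blocks):
        """Allocate an initial number of zero-filled blocks."""
        with open(filename, "wb") as f:
            f.write(bytes(BLOCK_SIZE * n_blocks))

    def delete(self, filename):
        os.remove(filename)

    def open(self, filename):
        """Associate a numerical identifier (fileid) with the filename."""
        fileid = self.next_id
        self.next_id += 1
        self.open_files[fileid] = open(filename, "r+b")
        return fileid

    def close(self, fileid):
        self.open_files.pop(fileid).close()

    def read(self, fileid, block):
        """Direct access to one block."""
        f = self.open_files[fileid]
        f.seek(block * BLOCK_SIZE)
        return f.read(BLOCK_SIZE)

    def read_seq(self, fileid, f_block, count):
        """Sequential access to `count` blocks starting from `f_block`."""
        return [self.read(fileid, f_block + i) for i in range(count)]

    def write(self, fileid, block, data):
        """Overwrite one block; `data` must be exactly one block long."""
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block long")
        f = self.open_files[fileid]
        f.seek(block * BLOCK_SIZE)
        f.write(data)
        f.flush()


with tempfile.TemporaryDirectory() as tmp:
    files = BlockFiles()
    path = os.path.join(tmp, "emp.dat")
    files.create(path, n_blocks=4)
    fid = files.open(path)
    files.write(fid, 2, b"B" * BLOCK_SIZE)
    block2 = files.read(fid, 2)          # the block just written
    run = files.read_seq(fid, 1, 3)      # blocks 1, 2 and 3
    files.close(fid)
    files.delete(path)
```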


C. Shadow Paging

This is an alternative to log-based recovery procedures.
Some systems, such as System R (DB2), offered both facilities.

The Technique
While a Transaction is active, and during its lifetime, two Directories or Page Tables
are maintained:
1. The Current Page Table and
2. The Shadow Page Table.
The entries of the Shadow Page Table point to the Pages on Disk, and the entries of the
Current Page Table also point to the Pages on Disk.
Before the start of the T both are identical. The Shadow table is never changed during
the execution of the T, exactly like the parent in a fork in Unix. The Current page
table may be changed when a write is performed; the system uses the Current page table
to locate the pages on disk.
Suppose that a T performs a write(X, x(i)) and that X is located in the ith page. We
have the execution:
1. If the ith page, where X is located, is not already in MM, we issue an Input(X).
2. Output the Current page table to disk. (Note: for recovery purposes we should not
overwrite the Shadow table.)
3. Output the disk address of the Current page table to the fixed location on disk
that contains the address of the Shadow page table. This overwrites the address of the
old Shadow table. Therefore the Current page table becomes the Shadow page table and T
has committed.

Recovery requires that we locate the Shadow page table AFTER the crash, through the
fixed location. Then we copy that Shadow page table into MM and use it for the
subsequent Transaction processing.
The Shadow page table then points to the pages that correspond to the state of the db
prior to any write of the interrupted Transaction.
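
The technique can be sketched in a few lines of Python. This is a toy model under explicit assumptions: disk pages and page tables live in dictionaries, there is a single Transaction at a time, and the sketch includes the usual copy-on-write step (the first write to a page goes to a fresh disk page), which the steps above do not spell out. The class and attribute names are hypothetical.

```python
class ShadowPagedDB:
    """Toy shadow paging: one T at a time, 'disk' modelled with dictionaries."""

    def __init__(self, n_pages):
        self.disk_pages = {addr: None for addr in range(n_pages)}
        self.next_free = n_pages                          # next unused disk page
        self.tables_on_disk = {0: list(range(n_pages))}   # page tables on disk
        self.fixed_location = 0     # holds the address of the Shadow page table
        self.current = None         # Current page table (volatile, in MM)

    def begin(self):
        """Before the start of the T, Current and Shadow are identical."""
        self.current = list(self.tables_on_disk[self.fixed_location])

    def read(self, i):
        table = self.current
        if table is None:           # no active T: use the Shadow page table
            table = self.tables_on_disk[self.fixed_location]
        return self.disk_pages[table[i]]

    def write(self, i, value):
        shadow = self.tables_on_disk[self.fixed_location]
        if self.current[i] == shadow[i]:    # first write to page i by this T
            self.current[i] = self.next_free    # assumed copy-on-write step
            self.next_free += 1
        self.disk_pages[self.current[i]] = value

    def commit(self):
        new_addr = len(self.tables_on_disk)
        self.tables_on_disk[new_addr] = self.current  # output Current table
        self.fixed_location = new_addr                # switch the fixed location
        self.current = None

    def crash(self):
        """Main memory is lost; the Shadow table on disk is untouched."""
        self.current = None


db = ShadowPagedDB(n_pages=2)
db.begin()
db.write(0, "x1")
db.commit()                 # the Current table becomes the Shadow table

db.begin()
db.write(0, "x2")
in_flight = db.read(0)      # the active T sees its own write
db.crash()
after_crash = db.read(0)    # the Shadow table still points to the old page
```

The old disk pages are never reclaimed in this sketch; a real implementation needs garbage collection for pages that no page table references any more.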

Advantages and Disadvantages

No maintenance of the log file.
No undo and redo.
Backtrack reading of the log. (The vi editor in Unix has this feature, using the -r
option; also .rtf files in MSDOS?)
Very hard to implement for multi-user environments/systems.

D. The RELIABILITY Control System and the FAILURE Management Model

1. The Reliability Control System (RCS)

This system ensures that some or all of the following properties of the T hold:

Atomicity: the T is indivisible.
Consistency: no violation of integrity constraints.
Isolation: enforcing Mutual Exclusion and other asynchronous policies.
Correctness (usually called Durability): requires that after a T has successfully
executed a commit, NO data is lost.

RCS is responsible for:

Executing the T commands mentioned above: fix, use, unfix, begin, commit, abort, and
flush, for the database and the log. (Note: the flush belongs to the Buffer Manager.)
Executing the primitives for recovery after failure: WARM Restart and COLD Restart
(see below).
Reading and writing pages.
Checkpoints.
Dumps. A dump is a complete copy of the db, i.e., a backup. At the conclusion of the
dump operation a dump record is written in the log, with the time, filename and device.

Checkpoint is an operation that is carried out periodically: it records which Ts are
active and updates the disk relative to all Ts that are finished.
Once the checkpoint has been initiated, no commit operations are accepted from the
active Ts. All pages of Ts that issued a commit or abort are written to disk by the
buffer manager executing a flush operation.
The checkpoint ends by synchronously forcing (writing) a checkpoint record, which
contains the ids of the active Ts.
So now we know that the commits that preceded the checkpoint are permanent.

2. The Failure Management Model.

There are two kinds of failure:

System Failures (power loss, exceptions, etc.)
Device Failures (head crashes, etc.)

When the system identifies a failure, all the Ts are halted, followed by a restart
(boot). It is called a Warm Restart in the case of a system failure and a Cold Restart
in the case of a device failure.
1. WARM Restart
Here we have four phases:
i. The last block of the log (as it was at the time of the failure) is accessed
and the log is traced back until the most recent checkpoint.
ii. It is decided which Ts must be redone or undone. All T names are scanned
before the decision is made.
iii. The undo is performed: the log is traced back.
iv. The redo actions are applied in the order recorded in the log.
2. COLD Restart
This responds to a failure that causes damage to a part of the db. There are three
phases:
i. The dump is accessed and the damaged parts are copied to the db from the
backup.
ii. The log is traced forward. The commit and the abort actions are applied to
the damaged parts of the db. Restored!
iii. A Warm Restart is carried out.
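
The two restart procedures can be sketched in Python. The notes do not define a log format, so the record layout here is an assumption: ("B", T) for begin, ("C", T) for commit, ("U", T, object, before, after) for an update with its before and after state, ("CK", [active Ts]) for a checkpoint and ("DUMP",) for a dump record. The db and the dump are plain dictionaries, and the log is assumed to contain a dump record whenever a cold restart is attempted.

```python
def warm_restart(log, db):
    """Four-phase warm restart over a list of log records (assumed layout)."""
    # Phase i: trace the log back to the most recent checkpoint.
    ck = max((i for i, rec in enumerate(log) if rec[0] == "CK"), default=None)
    undo = set(log[ck][1]) if ck is not None else set()
    redo = set()
    start = ck + 1 if ck is not None else 0
    # Phase ii: scan the T names after the checkpoint and decide undo vs redo.
    for rec in log[start:]:
        if rec[0] == "B":
            undo.add(rec[1])
        elif rec[0] == "C":
            undo.discard(rec[1])
            redo.add(rec[1])
    # Phase iii: trace the log back, restoring before states of the undo set.
    for rec in reversed(log):
        if rec[0] == "U" and rec[1] in undo:
            db[rec[2]] = rec[3]
    # Phase iv: apply the redo actions in the order recorded in the log.
    for rec in log:
        if rec[0] == "U" and rec[1] in redo:
            db[rec[2]] = rec[4]
    return undo, redo


def cold_restart(log, db, dump, damaged):
    """Three-phase cold restart for the objects listed in `damaged`."""
    # Phase i: copy the damaged parts from the backup.
    for obj in damaged:
        db[obj] = dump[obj]
    # Phase ii: trace the log forward from the dump record, reapplying the
    # recorded actions to the damaged parts only.
    dump_pos = max(i for i, rec in enumerate(log) if rec[0] == "DUMP")
    for rec in log[dump_pos + 1:]:
        if rec[0] == "U" and rec[2] in damaged:
            db[rec[2]] = rec[4]
    # Phase iii: a warm restart is carried out.
    return warm_restart(log, db)


log = [
    ("DUMP",),
    ("B", "T1"), ("U", "T1", "x", 0, 1),
    ("B", "T2"), ("U", "T2", "y", 0, 5),
    ("CK", ["T1", "T2"]),
    ("C", "T1"),
    ("B", "T3"), ("U", "T3", "z", 0, 9),
    ("U", "T2", "y", 5, 7),
]

# System failure: the committed x never reached disk, uncommitted y and z did.
db_warm = {"x": 0, "y": 7, "z": 9}
undo_set, redo_set = warm_restart(log, db_warm)

# Device failure: x is unreadable and is rebuilt from the dump and the log.
db_cold = {"x": None, "y": 7, "z": 9}
cold_restart(log, db_cold, dump={"x": 0, "y": 0, "z": 0}, damaged={"x"})
```

In this example T1 committed after the checkpoint and is redone, while T2 and T3 never committed and are undone, so both restarts end with x = 1 and y = z = 0.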