3) Operating Support Dbms PDF
PRACTICES
SUMMARY: Several operating system services are examined with a view toward their applicability to support of database management functions. These services include buffer pool management; the file system; scheduling, process management, and interprocess communication; and consistency control.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
This research was sponsored by U.S. Air Force Office of Scientific Research Grant 78-3596, U.S. Army Research Office Grant DAAG29-76-G-0245, Naval Electronics Systems Command Contract N00039-78-G-0013, and National Science Foundation Grant MCS75-03839-A01.
Key words and phrases: database management, operating systems, buffer management, file systems, scheduling, interprocess communication
CR Categories: 3.50, 3.70, 4.22, 4.33, 4.34, 4.35
Author's address: M. Stonebraker, Dept. of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720.
© 1981 ACM 0001-0782/81/0700-0412 $00.75.

1. Introduction
Database management systems (DBMS) provide higher level user support than conventional operating systems. The DBMS designer must work in the context of the OS he/she is faced with. Different operating systems are designed for different use. In this paper we examine several popular operating system services and indicate whether they are appropriate for support of database management functions. Often we will see that the wrong service is provided or that severe performance problems exist. When possible, we offer some suggestions concerning improvements. In the next several sections we look at the services provided by buffer pool management; the file system; scheduling, process management, and interprocess communication; and consistency control. We then conclude with a discussion of the merits of including all files in a paged virtual memory.

The examples in this paper are drawn primarily from the UNIX operating system [17] and the INGRES relational database system [19, 20] which was designed for use with UNIX. Most of the points made for this environment have general applicability to other operating systems and data managers.

2. Buffer Pool Management
Many modern operating systems provide a main memory cache for the file system. Figure 1 illustrates this service. In brief, UNIX provides a buffer pool whose size is set when the operating system is compiled. Then, all file I/O is handled through this cache. A file read (e.g., read X in Figure 1) returns data directly from a block in the cache, if possible; otherwise, it causes a block to be "pushed" to disk and replaced by the desired block. In Figure 1 we show block Y being pushed to make room for block X. A file write simply moves data into the cache; at some later time the buffer manager writes the block to the disk. The UNIX buffer manager used the popular LRU [15] replacement strategy. Finally, when UNIX detects sequential access to a file, it prefetches blocks before they are requested.

Conceptually, this service is desirable because blocks for which there is so-called locality of reference [15, 18] will remain in the cache over repeated reads and writes. However, the problems enumerated in the following subsections arise in using this service for database management.
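The cache behavior described in Section 2 can be illustrated with a minimal Python sketch. It is not the UNIX implementation: the class name `BufferCache`, the `pushed` list, and the use of a dictionary to stand in for the disk are assumptions made for illustration only. The sketch shows a fixed-size pool, reads served from the cache when possible, LRU selection of the block to push, and writes that reach the disk only when the block is later pushed.

```python
from collections import OrderedDict


class BufferCache:
    """Hypothetical fixed-size block cache with LRU replacement."""

    def __init__(self, capacity, disk):
        self.capacity = capacity      # pool size, fixed when the cache is built
        self.disk = disk              # dict of block id -> bytes, stands in for the disk
        self.blocks = OrderedDict()   # block id -> (data, dirty), least recent first
        self.pushed = []              # ids of blocks pushed out, kept for inspection

    def _make_room(self):
        if len(self.blocks) >= self.capacity:
            victim, (data, dirty) = self.blocks.popitem(last=False)
            if dirty:
                self.disk[victim] = data   # modified blocks are written back
            self.pushed.append(victim)

    def read(self, block_id):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)   # now the most recently used
            return self.blocks[block_id][0]
        self._make_room()
        data = self.disk[block_id]
        self.blocks[block_id] = (data, False)
        return data

    def write(self, block_id, data):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)
        else:
            self._make_room()
        self.blocks[block_id] = (data, True)    # the disk copy is updated later


disk = {"X": b"x-data", "Y": b"y-data", "Z": b"z-data"}
cache = BufferCache(capacity=2, disk=disk)
cache.read("Y")
cache.read("Z")
cache.read("X")   # pool is full, so the least recently used block Y is pushed
```

With a pool of two blocks, the final read of X pushes Y, which mirrors the situation drawn in Figure 1. Prefetch on sequential access is left out of the sketch.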
Fig. 1. Structure of a Cache. [Figure not reproduced; it shows the cache in main memory in front of the disk.]

2.1 Performance
The overhead to fetch a block from the buffer pool manager usually includes that of a system call and a core-to-core move. For UNIX on a PDP-11/70 the cost to fetch 512 bytes exceeds 5,000 instructions. To fetch 1 byte from the buffer pool requires about 1,800 instructions. It appears that these numbers are somewhat higher for UNIX than other contemporary operating systems. Moreover, they can be cut somewhat for VAX 11/780 hardware [10]. It is hoped that this trend toward lower overhead access will continue.

However, many DBMSs including INGRES [20] and System R [4] choose to put a DBMS managed buffer pool in user space to reduce overhead. Hence, each of these systems has gone to the trouble of constructing its own buffer pool manager to enhance performance.

In order for an operating system (OS) provided buffer pool manager to be attractive, the access overhead must be cut to a few hundred instructions. The trend toward providing the file system as a part of shared [...] Section 6.

2.2 LRU Replacement
Although the folklore indicates that LRU is a generally good tactic for buffer management, it appears to perform only marginally in a database environment. Database access in INGRES is a combination of:
(1) sequential access to blocks which will not be rereferenced;
(2) sequential access to blocks which will be cyclically rereferenced;
(3) random access to blocks which will not be referenced again;
(4) random access to blocks for which there is a nonzero probability of rereference.

Although LRU works well for case 4, it is a bad strategy for other situations. Since a DBMS knows which blocks are in each category, it can use a composite strategy. For case 4 it should use LRU while for 1 and 3 it should use toss immediately. For blocks in class 2 the reference pattern is 1, 2, 3, ..., n, 1, 2, 3, .... Clearly, LRU is the worst possible replacement algorithm for this situation. Unless all n pages can be kept in the cache, the strategy should be to toss immediately. Initial studies [9] suggest that the miss ratio can be cut 10-15% by a DBMS specific algorithm.

In order for an OS to provide buffer management, some means must be found to allow it to accept "advice" from an application program (e.g., a DBMS) concerning the replacement strategy. Designing a clean buffer management interface with this feature would be an interesting problem.

2.3 Prefetch
Although UNIX correctly prefetches pages when sequential access is detected, there are important instances in which it fails.

Except in rare cases INGRES at (or very shortly after) the beginning of its examination of a block knows [...] order. Hence, there is no way for an OS to implement the correct prefetch strategy.

2.4 Crash Recovery
An important DBMS service is to provide recovery from hard and soft crashes. The desired effect is for a unit of work (a transaction) which may be quite large and span multiple files to be either completely done or look like it had never started.

The way many DBMSs provide this service is to maintain an intentions list. When the intentions list is complete, a commit flag is set. The last step of a transaction is to process the intentions list making the actual updates. The DBMS makes the last operation idempotent (i.e., it generates the same final outcome no matter how many times the intentions list is processed) by careful programming. The general procedure is described in [6, 13]. An alternate process is to do updates as they are found and maintain a log of before images so that backout is possible.

During recovery from a crash the commit flag is examined. If it is set, the DBMS recovery utility processes the intentions list to correctly install the changes made by updates in progress at the time of the crash. If the flag is not set, the utility removes the intentions list, thereby backing out the transaction. The impact of crash recovery on the buffer pool manager is the following.

The page on which the commit flag exists must be forced to disk after all pages in the intentions list. Moreover, the transaction is not reliably committed until the commit flag is forced out to the disk, and no response can be given to the person submitting the transaction until this time.

The service required from an OS buffer manager is a selected force out which would push the intentions list and the commit flag to disk in the proper order. Such a service is not present in any buffer manager known to us.
5. Consistency Control
The services provided by an operating system in this area include the ability to lock objects for shared or exclusive access and support for crash recovery. Although most operating systems provide locking for files, there are fewer which support [...] management. If a DBMS provides buffer management in addition to whatever is supplied by the operating system, then transaction management by the operating system is impacted as discussed in the following subsections.