Questions:
1) What is a DBMS, specifically a RDBMS?
2) Why consider it for managing data?
3) How is application data rep. in a DBMS?
4) How is data retrieved & manipulated in a DBMS?
5) How does a DBMS support concurrent access & protect data during system failures?
6) Which are the main comp. of a DBMS?
7) Who is involved with DB in real life?
1. Short Definitions
• database = collection of data, typically describing the activities of one or more related organizations
o entities
o relationships between entities
• DBMS = software designed to assist in maintaining & utilizing large collections of data
2. History
• 1960s: Charles Bachman developed the 1st general-purpose DBMS, called Integrated Data Store =>
basis for the network data model
• late 1960s: IBM developed IMS (Information Mgmt. Syst.), used even today => hierarchical data
model
• 1970: Edgar Codd (IBM) proposed new model => relational data model (RDM)
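• 1980s: RDM consolidates its position as the dominant DBMS paradigm; SQL, developed at IBM as part
of the System R project, becomes the standard query language (standardized in the late 1980s)
• 1990s: advances in many areas of DB => data warehouses, which consolidate data from several DBs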
• late 1990s: James Gray receives the Turing Award (1998) for his contributions to DB transaction
mgmt. = concurrent execution of DB programs
• enterprise resource planning (ERP) & mgmt. resource planning (MRP) emerged
• Internet age: 1st generation of websites stored data exclusively in OS files
• now: more accessible, new visions including multimedia DB, interactive videos, streaming data, etc.
7. Data Independence
• application programs are insulated from changes in the way the data is structured or stored
• achieved through the levels of abstraction: external, conceptual & physical schemas
• relations in external schema are in principle generated on demand from the ones in the conceptual
schema => if the data is reorganized => the conceptual schema changes => the definition of a view
relation can be modified so the same relation is computed as before
ex:
if we replace the Faculty relation with 2 relations:
Faculty_public(fid: string, fname: string, office: integer)
Faculty_private(fid: string, sal: real)
• the Courseinfo view can be redefined over the new relations, so students querying it still see
only the public faculty info and get the same answers as before
• => users can be shielded from changes in the logical structure of data or ones in the choice of
relations to be stored = logical/conceptual data independence
• physical data independence = the conceptual schema insulates users from changes in physical
storage details (file organization, indexes), so these can change without affecting applications
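A minimal sketch of logical data independence, using Python's built-in sqlite3 module. The Teaches relation, the columns of Courseinfo, and the sample rows are assumptions made for illustration; only Faculty_public and Faculty_private come from the notes above. The point is that the query against the view stays the same while the stored relations change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

def query_courseinfo():
    # The application only ever issues this query against the view.
    return cur.execute("SELECT cid, fname FROM Courseinfo ORDER BY cid").fetchall()

# Original conceptual schema (Teaches and the sample data are hypothetical).
cur.execute("CREATE TABLE Faculty (fid TEXT PRIMARY KEY, fname TEXT, sal REAL, office INTEGER)")
cur.execute("CREATE TABLE Teaches (fid TEXT, cid TEXT)")
cur.execute("INSERT INTO Faculty VALUES ('f1', 'Smith', 5000.0, 12)")
cur.execute("INSERT INTO Teaches VALUES ('f1', 'c101')")

# External schema: a view computed on demand from the conceptual schema.
cur.execute(
    "CREATE VIEW Courseinfo AS "
    "SELECT t.cid AS cid, f.fname AS fname "
    "FROM Teaches t JOIN Faculty f ON f.fid = t.fid"
)
before = query_courseinfo()

# Reorganization: Faculty is replaced by Faculty_public and Faculty_private.
cur.execute("CREATE TABLE Faculty_public (fid TEXT PRIMARY KEY, fname TEXT, office INTEGER)")
cur.execute("CREATE TABLE Faculty_private (fid TEXT PRIMARY KEY, sal REAL)")
cur.execute("INSERT INTO Faculty_public SELECT fid, fname, office FROM Faculty")
cur.execute("INSERT INTO Faculty_private SELECT fid, sal FROM Faculty")
cur.execute("DROP VIEW Courseinfo")
cur.execute("DROP TABLE Faculty")

# Only the view definition changes; the application query does not.
cur.execute(
    "CREATE VIEW Courseinfo AS "
    "SELECT t.cid AS cid, p.fname AS fname "
    "FROM Teaches t JOIN Faculty_public p ON p.fid = t.fid"
)
after = query_courseinfo()

print(before == after)
```

Since Courseinfo never exposed sal, the redefined view only needs Faculty_public; a view that did expose the salary would join both new relations on fid.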
8. Queries
• useful tools to retrieve information from the DB
• relational calculus = formal query language based on mathematical logic
• relational algebra = formal query language based on a collection of operators for manipulating
relations
• a DBMS enables users to create, modify and query data through a data manipulation language
(DML); the query language is only one part of it
• DML + DDL = data sublanguage when embedded within a host language (e.g., C, COBOL)
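To make the "collection of operators for manipulating relations" concrete, here is a small sketch of three relational algebra operators (selection, projection, natural join) over relations represented as lists of dicts. The function names and the sample tuples are assumptions for illustration, not part of any DBMS API; the relation schemas follow Faculty_public and Faculty_private from the notes.

```python
def select(relation, predicate):
    """Selection: keep only the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """Projection: keep only the listed attributes and drop duplicate tuples."""
    result = []
    for row in relation:
        projected = {a: row[a] for a in attributes}
        if projected not in result:
            result.append(projected)
    return result

def natural_join(left, right):
    """Natural join: combine tuples that agree on all shared attributes."""
    result = []
    for l_row in left:
        for r_row in right:
            common = set(l_row) & set(r_row)
            if all(l_row[a] == r_row[a] for a in common):
                result.append({**l_row, **r_row})
    return result

# Hypothetical sample data.
faculty_public = [
    {"fid": "f1", "fname": "Smith", "office": 12},
    {"fid": "f2", "fname": "Jones", "office": 14},
]
faculty_private = [
    {"fid": "f1", "sal": 5000.0},
    {"fid": "f2", "sal": 3000.0},
]

# "Names of faculty members earning more than 4000", built by composing operators.
joined = natural_join(faculty_public, faculty_private)
well_paid = project(select(joined, lambda row: row["sal"] > 4000), ["fname"])
print(well_paid)
```

Each operator takes relations and returns a relation, which is what lets them be composed into larger queries.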
9. Transaction Mgmt.
• at any given time, it is possible that several users are accessing (and possibly modifying) a DB
concurrently => the DBMS must order their requests carefully to avoid conflicts
• the DBMS must protect users from the effects of system failures by ensuring that all data (and the
status of active applications) is restored to a consistent state when the system is restarted after a
crash
• transaction = any one execution of a user program in a DBMS; basic unit of change as seen by the
DBMS
• executing the same program several times => several transactions
• partial transactions aren't allowed: a transaction takes effect entirely or not at all
• the effect of a group of transactions is equivalent to some serial execution of all of them
• example of concurrent use:
o DB of airline reservations. At any given time several agents are looking up information about
available seats on various flights and making new seat reservations. If, for example, one agent
looks up flight 100 on some given day and finds an empty seat, another travel agent may
simultaneously be making a reservation for that seat, thereby making the information seen by
the 1st one obsolete
• locking protocol = set of rules to be followed by each transaction to ensure that, even though
actions of several transactions might be interleaved, the net effect is identical to executing all of
them in some serial order
• lock = access control mechanism on an object of the DB
o shared lock -> can be held on the same object by 2 or more transactions simultaneously
o exclusive lock -> while it is held, no other transaction can hold any lock on that object
• a transaction might be interrupted before running to completion (ex: system crash)
o to restore a consistent state after the crash => log of all writes to the DB
o Write-Ahead Log (WAL) = property of the log which provides that each write action must be
recorded in the log (on disk) before the corresponding effects take place
o checkpoint = periodically forcing some info. to disk to reduce recovery time after a crash
(checkpointing too often slows down normal execution)
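The write-ahead idea can be sketched in a few lines of Python. This is a deliberately simplified, redo-only illustration and not how any particular DBMS implements its log: the TinyStore class, the JSON record format, and the seat keys are all assumptions. Each write is forced to the log file before the in-memory state changes, and recovery replays only the writes of committed transactions, so a transaction interrupted by a crash leaves no partial effects.

```python
import json
import os
import tempfile

class TinyStore:
    """Hypothetical key-value store that logs every write before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}

    def _append(self, record):
        # Force the log record to disk before returning (the write-ahead step).
        with open(self.log_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def write(self, txn_id, key, value):
        self._append({"txn": txn_id, "op": "write", "key": key, "value": value})
        self.data[key] = value  # applied only after the log record is on disk

    def commit(self, txn_id):
        self._append({"txn": txn_id, "op": "commit"})

def recover(log_path):
    """Rebuild the state from the log, keeping only committed transactions."""
    with open(log_path, encoding="utf-8") as f:
        records = [json.loads(line) for line in f]
    committed = {r["txn"] for r in records if r["op"] == "commit"}
    data = {}
    for r in records:
        if r["op"] == "write" and r["txn"] in committed:
            data[r["key"]] = r["value"]
    return data

log_path = os.path.join(tempfile.mkdtemp(), "wal.log")
store = TinyStore(log_path)

store.write("T1", "flight100_seat12A", "agent_1")
store.commit("T1")
store.write("T2", "flight100_seat12B", "agent_2")  # simulated crash: T2 never commits

recovered = recover(log_path)
print(recovered)
```

Replaying the whole log on every restart gets slower as the log grows, which is the cost that periodic checkpoints are meant to reduce.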