Rdbms

2
Introduction to
Database Management Systems
(DBMS)
Database Management
System (DBMS)
Definitions:
 Data: Known facts that can be recorded and that have implicit meaning
 Database: Collection of related data
 Ex. the names, telephone numbers and addresses of all the people you know
 Database Management System: A computerized record-keeping system
4
DBMS (Contd.)
 Goals of a Database Management System:
 To provide an efficient as well as convenient environment for
accessing data in a database
 Enforce information security: database security, concurrency
control, crash recovery
 It is a general purpose facility for:
 Defining database
 Constructing database
 Manipulating database
5
Benefits of database approach
 Redundancy can be reduced
 Inconsistency can be avoided
 Data can be shared
 Standards can be enforced
 Security restrictions can be applied
 Integrity can be maintained
 Data independence can be provided
6
DBMS Functions
 Data Definition
 Data Manipulation
 Data Security and Integrity
 Data Recovery and Concurrency
 Data Dictionary
 Performance
7
Database System
Users
DATABASE SYSTEM
  Application Programs/Queries
  DBMS Software
    Software to process queries/programs
    Software to access stored data
  Stored Data Defn. (META-DATA)
  Stored Database
8
Database System
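[Figure: an application program (query Q1) and a user query (Q2) enter the query processor; the database scheme enters the DDL compiler, which produces the database description. The compiled query is passed to the database manager, which uses the database description and calls the file manager to reach the physical database.]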
9
Data Model
 A set of concepts used to describe the structure of a database
 By structure, we mean the data types, relationships, and constraints that should hold for the data
Categories of Data Models
Conceptual, Physical, Representational
10
Database Architecture
External level
(individual user
views)
Conceptual level
(community user view)
Internal level
(storage view)
Database
11
An example of the three levels
External View1:  SNo FName LName Age Salary
External View2:  SNo LName BranchNo
Conceptual View: SNo FName LName Age Salary BranchNo
Internal View:
struct STAFF {
    int staffNo;
    int branchNo;
    char fName[15];
    char lName[15];
    struct date dateOfBirth;
    float salary;
    struct STAFF *next;  /* pointer to next Staff record */
};
index staffNo; index branchNo;  /* define indexes for staff */
12
Schema
 Schema: Description of data in terms of a data
model
 Three-level DB Architecture defines following
schemas:
 External Schema (or sub-schema)
 Written using external DDL
 Conceptual Schema (or schema)
 Written using conceptual DDL
 Internal Schema
 Written using internal DDL or storage structure definition
13
Data Independence
 Change the schema at one level of a database system
without a need to change the schema at the next
higher level
 Logical data independence: Refers to the immunity of the
external schemas to changes in the conceptual schema
e.g., add new record or field
 Physical data independence: Refers to the immunity of the
conceptual schema to changes in the internal schema e.g.,
adding new index should not void existing ones
14
TYPES OF DATABASE MODELS
HIERARCHICAL
NETWORK
RELATIONAL (a TABLE of ROWs and COLUMNs, each cell holding a VALUE)
15
DATABASE DESIGN PHASES
DATA ANALYSIS
Entities - Attributes - Relationships - Integrity Rules
LOGICAL DESIGN
Tables - Columns - Primary Keys - Foreign Keys
PHYSICAL DESIGN
DDL for Tablespaces, Tables, Indexes
16
Introduction to
Relational Databases:
RDBMS
Definition : RDBMS
 It is a system in which, at a minimum :
 The data is perceived by the user as tables ( and
nothing but tables ); and
 The operators at the user’s disposal - e.g., for data
retrieval - are operators that generate new tables
from old, and those include at least SELECT,
PROJECT, and JOIN.
18
Features of an RDBMS
 The ability to create multiple relations (tables)

and enter data into them
 An interactive query language
 Retrieval of information stored in more than
one table
 Provides a Catalog or Dictionary, which itself
consists of tables ( called system tables )
19
Some Important Terms
 Relation : a table
 Tuple : a row in a table
 Attribute : a Column in a table
 Degree : number of attributes
 Cardinality : number of tuples
 Primary Key : a unique identifier for each row of the table
 Domain : a pool of values from which specific attributes of
specific relations draw their values
20
Properties of Relations (Tables)
 There are no duplicate rows (tuples)

 Tuples are unordered, top to bottom
 Attributes are unordered, left to right
 All attribute values are atomic ( or scalar )
 Relational databases do not allow repeating
groups
21
Keys
 Key
 Super Key
 Candidate Keys
 Primary Key
 Alternate Key
 Secondary Keys
22
Keys and Referential Integrity
Enrolled
sid    cid          grade
53666  carnatic101  C
53688  reggae203    B
53650  topology112  A
53666  history105   B

Student
sid    name   login       age  gpa
53666  Jones  Jones@cs    18   3.4
53688  Smith  Smith@eecs  18   3.2
53650  Smith  Smith@math  19   3.8

Primary key: sid of Student
Foreign key: sid of Enrolled, referring to sid of STUDENT relation
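As a rough illustration of what referential integrity means in practice, the sketch below loads the Enrolled and Student rows into an in-memory SQLite database and tries a few inserts. The column types, the composite key (sid, cid) on Enrolled, and the helper name try_enroll are assumptions made for this example, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("""CREATE TABLE Student (
    sid INTEGER PRIMARY KEY, name TEXT, login TEXT, age INTEGER, gpa REAL)""")
# Assumption: (sid, cid) identifies an enrollment row.
conn.execute("""CREATE TABLE Enrolled (
    sid INTEGER, cid TEXT, grade TEXT,
    PRIMARY KEY (sid, cid),
    FOREIGN KEY (sid) REFERENCES Student(sid))""")
conn.executemany("INSERT INTO Student VALUES (?, ?, ?, ?, ?)", [
    (53666, "Jones", "Jones@cs", 18, 3.4),
    (53688, "Smith", "Smith@eecs", 18, 3.2),
    (53650, "Smith", "Smith@math", 19, 3.8)])
conn.executemany("INSERT INTO Enrolled VALUES (?, ?, ?)", [
    (53666, "carnatic101", "C"), (53688, "reggae203", "B"),
    (53650, "topology112", "A"), (53666, "history105", "B")])

def try_enroll(sid, cid, grade):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO Enrolled VALUES (?, ?, ?)", (sid, cid, grade))
        return True
    except sqlite3.IntegrityError:
        return False
```

An enrollment for a sid that does not exist in Student is rejected by the foreign key, and a repeated (sid, cid) pair is rejected by the primary key.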
23
24
Relational Algebra
Relational Query Languages
 Query languages: Allow manipulation and
retrieval of data from a database.
 Relational model supports simple, powerful
QLs:
 Strong formal foundation based on logic.
 Allows for much optimization.
 Query Languages != programming languages!
26
Example Instances
R1
sid  bid  day
22   101  10/10/99
58   103  11/12/99

S1
sid  sname   rating  age
22   Deepa   7       45.0
31   Laxmi   8       55.5
58   Roopa   10      35.0

S2
sid  sname   rating  age
28   Yamuna  9       35.0
31   Laxmi   8       55.5
44   Geeta   5       35.0
58   Roopa   10      35.0
27
Relational Algebra
 Basic operations:
 Selection (σ)
 Projection (π)
 Cross-product (×)
 Set-difference (−)
 Union (∪)
28
Projection
π sname, rating (S2)
sname   rating
Yamuna  9
Laxmi   8
Geeta   5
Roopa   10

π age (S2)
age
35.0
55.5
29
Selection

σ rating > 8 (S2)
sid  sname   rating  age
28   Yamuna  9       35.0
58   Roopa   10      35.0

π sname, rating (σ rating > 8 (S2))
sname   rating
Yamuna  9
Roopa   10
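The two operators can be sketched in a few lines of Python over the S2 instance. This is only an illustration: representing a relation as a list of dictionaries, and the function names select and project, are assumptions of the example.

```python
S2 = [
    {"sid": 28, "sname": "Yamuna", "rating": 9, "age": 35.0},
    {"sid": 31, "sname": "Laxmi", "rating": 8, "age": 55.5},
    {"sid": 44, "sname": "Geeta", "rating": 5, "age": 35.0},
    {"sid": 58, "sname": "Roopa", "rating": 10, "age": 35.0},
]

def select(relation, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attrs):
    """Projection: keep only attrs and drop duplicate rows (first occurrence wins)."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result

high_rated = select(S2, lambda r: r["rating"] > 8)
names = project(high_rated, ["sname", "rating"])
ages = project(S2, ["age"])
```

Projection on age returns two rows rather than four because duplicates are eliminated, which matches the set semantics of the relational model.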
30
Union, Intersection, Set
Difference
S1 ∪ S2
sid  sname   rating  age
22   Deepa   7       45.0
31   Laxmi   8       55.5
58   Roopa   10      35.0
44   Geeta   5       35.0
28   Yamuna  9       35.0

S1 ∩ S2
sid  sname   rating  age
31   Laxmi   8       55.5
58   Roopa   10      35.0

S1 − S2
sid  sname   rating  age
22   Deepa   7       45.0
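Because relations are sets of tuples, these three operators map directly onto Python's built-in set operations. The sketch below assumes each tuple is (sid, sname, rating, age) and that both inputs have the same schema.

```python
# Each tuple is (sid, sname, rating, age); both relations share this schema.
S1 = {(22, "Deepa", 7, 45.0), (31, "Laxmi", 8, 55.5), (58, "Roopa", 10, 35.0)}
S2 = {(28, "Yamuna", 9, 35.0), (31, "Laxmi", 8, 55.5),
      (44, "Geeta", 5, 35.0), (58, "Roopa", 10, 35.0)}

union = S1 | S2          # tuples in S1 or S2
intersection = S1 & S2   # tuples in both
difference = S1 - S2     # tuples in S1 but not in S2
```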
31
Cross- Product
(sid) sname rating age (sid) bid day
22 Deepa 7 45.0 22 101 10/10/99
22 Deepa 7 45.0 58 103 11/12/99
31 Laxmi 8 55.5 22 101 10/10/99
31 Laxmi 8 55.5 58 103 11/12/99
58 Roopa 10 35.0 22 101 10/10/99
58 Roopa 10 35.0 58 103 11/12/99
32
Joins
Condition Join :
(sid) sname rating age (sid) bid day

22 Deepa 7 45.0 22 101 10/10/99
31 Laxmi 8 55.5 58 103 11/12/99
33
Equi-Join
(sid) sname rating age bid day

22 Deepa 7 45.0 101 10/10/99
58 Roopa 10 35.0 103 11/12/99
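Cross-product and the join variants can be sketched as filters over all pairs of tuples. The function names are hypothetical, and the comparison used for the condition join in the usage note (S1.sid < R1.sid) is an assumption chosen for illustration.

```python
from itertools import product

S1 = [(22, "Deepa", 7, 45.0), (31, "Laxmi", 8, 55.5), (58, "Roopa", 10, 35.0)]
R1 = [(22, 101, "10/10/99"), (58, 103, "11/12/99")]

def cross(left, right):
    """Cross-product: every left tuple paired with every right tuple."""
    return [l + r for l, r in product(left, right)]

def condition_join(left, right, cond):
    """Condition join: the pairs of the cross-product that satisfy cond."""
    return [l + r for l, r in product(left, right) if cond(l, r)]

def equi_join_on_sid(left, right):
    """Equi-join on sid (position 0), keeping a single copy of sid."""
    return [l + r[1:] for l, r in product(left, right) if l[0] == r[0]]
```

For example, condition_join(S1, R1, lambda l, r: l[0] < r[0]) keeps the pairs whose sailor sid is smaller than the reservation sid.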
34
Division
•Not supported as a primitive operator, but useful for expressing queries like:
•Find sailors who have reserved all boats.

A
sno  pno
s1   p1
s1   p2
s1   p3
s1   p4
s2   p1
s2   p2
s3   p2
s4   p2
s4   p4

B1 (pno): p2
B2 (pno): p2, p4
B3 (pno): p1, p2, p4

A/B1 (sno): s1, s2, s3, s4
A/B2 (sno): s1, s4
A/B3 (sno): s1
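One way to read A/B is "the sno values that are paired in A with every pno in B". The sketch below implements that reading directly; the function name divide is an assumption of this example.

```python
A = {("s1", "p1"), ("s1", "p2"), ("s1", "p3"), ("s1", "p4"),
     ("s2", "p1"), ("s2", "p2"), ("s3", "p2"),
     ("s4", "p2"), ("s4", "p4")}

def divide(a, b):
    """Return every sno that appears in a together with each pno listed in b."""
    candidates = {sno for sno, _ in a}
    return {sno for sno in candidates if all((sno, pno) in a for pno in b)}

B1 = {"p2"}
B2 = {"p2", "p4"}
B3 = {"p1", "p2", "p4"}
```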
35
36
Introduction to Query
Optimization
Processing a High-level Query
Typical steps when processing a high-level query:
Query in a high-level language
  -> SCANNING, PARSING AND VALIDATING
Intermediate form of query
  -> QUERY OPTIMIZER
Execution plan
  -> QUERY CODE GENERATOR
Code to execute the query
  -> RUNTIME DATABASE PROCESSOR
Result of query
38
Two Main Techniques for Query
Optimization
 Heuristic Rules: A heuristic is a rule that works well in most cases,
but not always. General Idea:
 Many different relational algebra expressions (and thus query trees) are
equivalent.
 Transform the initial query tree of a query into an equivalent final query tree
that is efficient to execute.
 Cost based query optimization
 Estimate the cost for each execution plan, and choose the one with the lowest
cost.
 Can we get the best execution plan?
39
Motivating Example
select *
from R1, R2, R3
where R1.r2no=R2.r2no
and R2.r3no=R3.r3no
and R1.a=5000

Plan:
NLJ
  NLJ
    SS(R2)
    SS(R3)
  SS(R1, “a=5000”)
40
Alternative Plans 1(No Indexes)
select *
from R1, R2, R3
where R1.r2no=R2.r2no
and R2.r3no=R3.r3no
and R1.a=5000

Plan:
NLJ
  NLJ
    SS(R1, “a=5000”)
    SS(R2)
  SS(R3)
41
Alternative Plans 2 (With
Indexes)
select *
from R1, R2, R3
where R1.r2no=R2.r2no
and R2.r3no=R3.r3no
and R1.a=5000

Plan:
NLJ
  NLJ
    IS(R1, “a=5000”)
    SS(R2)
  SS(R3)
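To see why the order of joins matters, the sketch below runs the same query with two nested-loop join orders and counts tuple comparisons. The table contents, sizes and function names are invented for the illustration, and only join comparisons are counted (scan and index costs are ignored), so it is a toy cost model rather than how a real optimizer estimates cost.

```python
def nested_loop_join(outer, inner, match, counter):
    """Nested-loop join that counts every tuple comparison in counter[0]."""
    out = []
    for o in outer:
        for i in inner:
            counter[0] += 1
            if match(o, i):
                out.append({**o, **i})
    return out

# Hypothetical data: 200 rows in R1 (4 of them with a = 5000), 20 in R2, 10 in R3.
R1 = [{"a": 5000 if k % 50 == 0 else k, "r2no": k % 20} for k in range(200)]
R2 = [{"r2no": k, "r3no": k % 10} for k in range(20)]
R3 = [{"r3no": k, "c": k * k} for k in range(10)]

def plan_join_r2_r3_first():
    cnt = [0]
    r23 = nested_loop_join(R2, R3, lambda x, y: x["r3no"] == y["r3no"], cnt)
    sel = [r for r in R1 if r["a"] == 5000]
    res = nested_loop_join(r23, sel, lambda x, y: x["r2no"] == y["r2no"], cnt)
    return res, cnt[0]

def plan_filter_r1_first():
    cnt = [0]
    sel = [r for r in R1 if r["a"] == 5000]
    r12 = nested_loop_join(sel, R2, lambda x, y: x["r2no"] == y["r2no"], cnt)
    res = nested_loop_join(r12, R3, lambda x, y: x["r3no"] == y["r3no"], cnt)
    return res, cnt[0]

res_a, cost_a = plan_join_r2_r3_first()
res_b, cost_b = plan_filter_r1_first()
```

Both plans return the same rows, but applying the selective predicate on R1 before joining keeps the intermediate result small and needs fewer comparisons on this data.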
42
43
Conceptual Design
Using the
Entity- Relationship
Model
Overview of Database Design
 Conceptual design : (ER Model is used at this
stage.)
 Schema Refinement : (Normalization)
 Physical Database Design and Tuning
45
E R Modeling
 Conceptual Schema Design
 Relational Calculus
- Formal Language for Relational D/B.
Relational Calculus
Predicate Calculus Domain Calculus

SQL / Tuple Based Query By Examples
46
Design Phases…
Requirements Collection & Analysis
  Data Requirements
  Functional Requirements: User Defined Operations, Data Flow Diagrams, Sequence Diagrams, Scenarios
Conceptual Design
  Entity Types, Constraints, Relationships
  No Implementation Details.
  Ensures the Design Meets the Requirements
Logical Design
  Data Model Mapping – Type of Database is identified
Physical Design
  Internal Storage Structures / Access Path / File Organizations
47
E-R Modeling
 Entity
 is anything that exists and is distinguishable
 Entity Set
 a group of similar entities
 Attribute
 properties that describe an entity
 Relationship
 an association between entities
48
Notations
ENTITY TYPE ( REGULAR )
WEAK ENTITY TYPE
RELATIONSHIP TYPE
WEAK RELATIONSHIP TYPE
49
Entity
Entity Set: Employee
Attributes: SSN, NAME, LOT

ssn          name       lot
123-22-3666  Attishoo   48
231-31-5368  Smiley     22
131-24-3650  Smethurst  35

CREATE TABLE Employees
  (ssn CHAR (11),
   name CHAR (20),
   lot INTEGER,
   PRIMARY KEY (ssn))
50
Types of Relationships
1:1  student (1) is issued (1) ID card
1:M  students (M) enrol in (1) course
M:M  students (M) take (M) tests
51
ER Model
ssn name lot since did dname budget
Employee Works_in Department

supervisor Sub-
ordinate
Reports_To
52
ER Model (Contd.)
Works_ In
SSN          DID  SINCE
123-22-3666  51   1/1/91
123-22-3666  56   3/3/93
231-31-5368  51   2/2/92

CREATE TABLE Works_In(
  ssn CHAR (11),
  did INTEGER,
  since DATE,
  PRIMARY KEY (ssn, did),
  FOREIGN KEY (ssn)
    REFERENCES Employees,
  FOREIGN KEY (did)
    REFERENCES Departments)
53
Key Constraints
Employee Department
Manages
54
Key Constraints for Ternary Relationships
Location
address capacity
55
Participation Constraints
Employee Manages Department
Works_in
since
56
Weak Entities
ssn name lot cost pname

age
Employee Dependent
policy
57
ISA (‘is a’) Hierarchies
ssn name lot
Employee
Hrly_wages
Hrs_worked IsA
contractid
Hourly_Emp Contract_Emp
58
Aggregation
ssn name lot
Employee
monitors until
pid pbudget Started on did dname budget
project sponsors department
59
Entity vs. Attribute
Works_In does not allow an employee to work in a department
for two or more periods (why?)
ssn name lot from to did dname budget
Employee Department
Works_in
60
Entity vs. Attribute (Contd.)
ssn name lot did dname budget
Employee Department
Works_in
from Duration to
61
Entity vs. Relationship
ssn name lot since DB did dname budget
Employee Department
manages
DB - Dbudget
62
Entity vs. Relationship
ssn name lot did dname budget
Employee Department
manages
since
Appt num Mgr_appt
DBudget
63
Binary vs. Ternary Relationships
ssn name lot pname age
Employee Dependent
covers
Policy
policyid cost
64
Binary vs. Ternary Relationships
Better Design
ssn name lot pname
age
Employee Dependent
purchaser Beneficiary
Policy
policyid cost
65
Constraints Beyond the ER Model
• Some constraints cannot be captured in ER diagrams:
• Functional dependencies
• Inclusion dependencies
• General constraints
66
E-R Diagram
[Figure: E-R diagram with entity types DEPARTMENT, EMPLOYEE, DEPENDENT, PROJECT, SUPPLIER and PART, connected by the relationships DEPT_EMP (1:M), PROJ_WORK (M:M), PROJ_MGR (1:1), EMP_DEP, SUPP_PART_PROJ (M:M:M), SUPP_PART (M:M) and PART_STRUCTURE (M:M).]
67
Example to Start with ….
 An example database application called
COMPANY serves to illustrate the ER
model concepts and their use in schema design.
The following requirements were collected from the client.
68
Analysis…
 Company :
Organized into departments. Each department
has a name, a number and a manager who manages the
department. The company keeps track of the
date on which that employee started managing the department.
A department may have several locations.
69
Analysis…
 Department :
A department controls a number of projects, each of
which has a unique name, a number and a single location.
 Employee :
Name, Age, Gender, BirthDate, SSN, Address,
Salary. An employee is assigned to one department and
may work on several projects, which are not necessarily
controlled by the same department. The number of
hours per week an employee works on each project is also tracked.
70
Analysis….
 Keep track of the dependents of each
employee for insurance policies: we keep
each dependent's first name, gender, date of
birth and relationship to the employee.
71
Now to our Company…
DEPARTMENT
( Name , Number , { Locations } , Manager, Start Date )
PROJECT
( Name, Number, Location , Controlling Department )
EMPLOYEE
( Name ( Fname, Lname ) , SSN , Gender, Address, Salary,
Birthdate, Department , Supervisor , Workson ( Project , Hrs ) )
DEPENDENT
( Employee, Name, Gender, Birthdate , Relationship )
72
Example …
 Manage:
 Department and Employee
 Partial Participation
 Relation Attribute : StartDate.
 Works For:
 Department and Employee
 Total Participation
73
Example…
 Control :
 Department , Project
 Partial Participation from Department
 Total Participation from Project
 Control Department is a RKA.
 Supervisor :
 Employee, Employee
 Partial and Recursive
74
Example …
 Works – On :
 Project , Employee
 Total Participation
 Hours Worked is a RKA.
 Dependants of:
 Employee , Dependant
 Dependant is a weak entity
 Dependant is Total , Employee is Partial.
75
One Possible mapping of the Problem Statement
[Figure: ER diagram of COMPANY. Entities: Employee (Name (Fname, Lname), SSN, Sex, Address, Sal, Bdate), Department (Name, No, Loc), Project (Name, No, Loc), Dependent (Name, Sex, Bdate, Relationship). Relationships: Works For, Manages (Sdate), Controls, Works On (Hours), Supervises, Depend On.]

Schema Refinement
and
Normalization
Normalization and Normal
Forms
 Normalization:
 Decomposing a larger, complex table into several smaller,
simpler ones.
 Move from a lower normal form to a higher Normal form.
 Normal Forms:
 First Normal Form (1NF)
 Second Normal Form (2NF)
 Third Normal Form (3NF)
 *Higher Normal Forms (BCNF, 4NF, 5NF ....)
 In practice, 3NF is often good enough.
82
Why Normal Forms
 The first question to ask is whether any
refinement is needed!
 If a relation is in a certain normal form
(BCNF, 3NF etc.), it is known that certain
kinds of problems are avoided/ minimized.
This can be used to help us decide whether
decomposing the relation will help.
83
The Evils of Redundancy
 Redundancy is at the root of several problems
associated with relational schemas
 More seriously, data redundancy causes
several anomalies: insert, update, delete
 Wastage of storage.
 Main refinement technique: decomposition
(replacing ABCD with, say, AB and BCD, or
ACD and ABD).
84
Refining an ER Diagram - Before
85
Refining an ER Diagram - After
ssn name since did dname budget

lot
Employee Department
Works_in
86
First Normal Form
 A table is in 1NF, if every row contains exactly one value for
each attribute.
 Disallow multivalued attributes, composite attributes and their
combinations.
 1NF states that :
 domains of attributes must include only atomic (simple, indivisible)
values and that value of any attribute in a tuple must be a single value
from the domain of that attribute.
 By definition, any relational table must be in 1NF.
87
Functional Dependencies (FDs)
 Provide a formal mechanism to express
constraints between attributes
 Given a relation R, attribute Y of R is
functionally dependent on the attribute X of R
if & only if each X-value in R has associated
with it precisely one Y-value in R.
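The definition can be checked mechanically on a relation instance: group rows by their X-value and confirm that each group has a single Y-value. The sketch below does this for sample rows with attributes S, N, L, R, W, H; the function name fd_holds is an assumption. A check like this can only show that an FD is violated or not violated in the given instance; it cannot prove that the FD holds for all legal instances.

```python
def fd_holds(rows, lhs, rhs):
    """True if every lhs-value in rows is associated with exactly one rhs-value."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True

columns = ["S", "N", "L", "R", "W", "H"]
rows = [dict(zip(columns, values)) for values in [
    ("123-22-3666", "Attishoo", 48, 8, 10, 40),
    ("231-31-5368", "Smiley", 22, 8, 10, 30),
    ("131-24-3650", "Smethurst", 35, 5, 7, 30),
    ("434-26-3751", "Guldu", 35, 5, 7, 32),
    ("612-67-4134", "Madayan", 35, 8, 10, 40),
]]
```

On these rows R determines W (rating 8 always goes with 10, rating 5 with 7), while L does not determine H because lot 35 appears with three different H values.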
88
Full Dependency
 Concept of full functional dependency
 A FD X → Y is a full functional dependency if
removal of any attribute A from X means that the
dependency does not hold any more.
89
Partial Dependency
 An F.D. X → Y is a partial dependency if there
is some attribute A ∈ X that can be removed
from X and the dependency will still hold.
90
Example: Constraints on Entity Set
S            N          L   R  W   H
123-22-3666  Attishoo   48  8  10  40
231-31-5368  Smiley     22  8  10  30
131-24-3650  Smethurst  35  5  7   30
434-26-3751  Guldu      35  5  7   32
612-67-4134  Madayan    35  8  10  40

S            N          L   H   R
123-22-3666  Attishoo   48  40  8
231-31-5368  Smiley     22  30  8
131-24-3650  Smethurst  35  30  5
434-26-3751  Guldu      35  32  5
612-67-4134  Madayan    35  40  8

R  W
5  7
8  10
91
Second Normal Form (2NF)
 A relation schema R is in 2NF if:
 it is in 1NF and
 every non-prime attribute A in R is fully
functionally dependent on the primary key of R.
 2NF prohibits partial dependencies.
92
2NF: An Example
 Emp{Eno, Dept, ProjCode, Hours}
 Primary key: {Eno, ProjCode}
 {Eno} -> {Dept}, {Eno, ProjCode} -> {Hours}
 Test of 2NF
 {Eno} -> {Dept}: partial dependency.
 Emp is in 1NF, but not in 2NF.
 Decomposition:
 Emp {Eno, Dept}
 Proj {Eno, ProjCode, Hours}
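A quick sanity check on a decomposition like this one is to split some sample data, join the pieces back on Eno, and confirm that exactly the original rows come back. The rows below are hypothetical values invented for the illustration.

```python
# Hypothetical rows of Emp(Eno, Dept, ProjCode, Hours).
emp = {
    (1, "Sales", "P1", 10),
    (1, "Sales", "P2", 5),
    (2, "HR", "P1", 8),
}

# Decomposition from the 2NF example: Emp(Eno, Dept) and Proj(Eno, ProjCode, Hours).
emp_dept = {(eno, dept) for eno, dept, _, _ in emp}
proj = {(eno, code, hours) for eno, _, code, hours in emp}

# Natural join of the two pieces on Eno.
rejoined = {(eno, dept, code, hours)
            for eno, dept in emp_dept
            for eno2, code, hours in proj
            if eno == eno2}
```

The department of employee 1 is now stored once instead of once per project, and the join recovers the original relation without spurious rows.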
93
Transitive Dependency
 An FD X → Y in a relation schema R is a
transitive dependency if
 there is a set of attributes Z that is not a subset of
any key of R, and
 both X → Z and Z → Y hold.
94
Third Normal Form
 A relation schema R is in 3NF if
 It is in 2NF and
 No nonprime attribute of R is transitively dependent on the
primary key.
 3NF means that each non-key attribute value in any tuple
is truly dependent on the Primary Key and not even
partially on other attributes.
 3NF prohibits transitive dependencies.
95
3NF: An Example
 Emp{Eno, Dept, Dept_Head}
 Primary key: {Eno}
 {Eno} -> {Dept}, {Dept} -> {Dept_Head}
 Test of 3NF
 {Eno} -> {Dept} -> {Dept_Head}: Transitive dependency.
 Emp is in 2NF, but not in 3NF.
 Decomposition:
 Emp {Eno, Dept}
 Dept {Dept, Dept_Head}
96
Boyce –Codd Normal Form
 BCNF is motivated by the fact that 3NF does not
satisfactorily handle the case of a relation
possessing two or more composite or
overlapping candidate keys
97
BCNF ( Boyce Codd Normal
Form)
 A Relation is said to be in Boyce Codd
Normal Form (BCNF) if and only if every
determinant is a candidate key.
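The BCNF condition can be tested with the standard attribute-closure computation. The sketch below uses the common textbook formulation, in which the determinant of every nontrivial FD must be a superkey; that formulation, the function names, and the reuse of the Emp{Eno, Dept, Dept_Head} example are choices made for this illustration.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_bcnf(all_attrs, fds):
    """Every nontrivial FD must have a determinant whose closure is all attributes."""
    return all(set(rhs) <= set(lhs) or closure(lhs, fds) == set(all_attrs)
               for lhs, rhs in fds)

emp_attrs = ["Eno", "Dept", "Dept_Head"]
emp_fds = [(["Eno"], ["Dept"]), (["Dept"], ["Dept_Head"])]
```

Here Dept determines Dept_Head but is not a key of Emp, so Emp fails the test, while each relation of the decomposition Emp{Eno, Dept} and Dept{Dept, Dept_Head} passes.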
98
Decomposition of a Relation
Scheme
 Suppose that relation R contains attributes
A1 ... An. A decomposition of R consists of
replacing R by two or more relations such that:
 Each new relation scheme contains a subset of the
attributes of R (and no attributes that do not appear
in R), and
 Every attribute of R appears as an attribute of one
of the new relations.
Transaction,
Concurrency Control
and Recovery
Transaction
 A sequence of many actions which are
considered to be one atomic unit of work.
 Read, write, commit, abort
 Governed by four ACID properties:
 Atomicity, Consistency, Isolation, Durability
 Has a unique starting point, some actions and
one end point
108
The ACID Properties
 Atomicity: All actions in the transaction
happen, or none happen.
 Consistency: If each transaction is consistent,
and the DB starts consistent, it ends up
consistent.
 Isolation: Execution of one transaction is
isolated from that of other transactions.
 Durability: If a transaction commits, its
effects persist.
109
Atomicity
 All-or-nothing, no partial results. An event either happens and is
committed or fails and is rolled back.
 e.g. in a money transfer, debit one account, credit the other. Either
both debiting and crediting operations succeed, or neither of them
do.
 Transaction failure is called Abort
 Commit and abort are irrevocable actions. There is no undo for these
actions.
 An Abort undoes operations that have already been executed
 For database operations, restore the data’s previous value from
before the transaction (Rollback-it); a Rollback command will
undo all actions taken since the last commit for that user.
 But some real world operations are not undoable.
Examples - transfer money, print ticket, fire missile
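The all-or-nothing behaviour of the money transfer can be tried out with Python's sqlite3 module. This is a minimal sketch: the account table, the CHECK constraint that forbids negative balances, and the function names are assumptions of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE account (
    name TEXT PRIMARY KEY,
    balance INTEGER CHECK (balance >= 0))""")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(src, dst, amount):
    """Credit dst, then debit src; undo both if the debit is rejected."""
    try:
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                     (amount, dst))
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # the credit already applied is undone as well
        return False

def balances():
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))
```

When the debit would overdraw the source account, the rollback also removes the credit that had already been applied, so no partial result is left behind.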
110
Consistency
 Every transaction should maintain DB consistency
 Referential integrity - e.g. each order references an
existing customer number and existing part numbers
 The books balance (debits = credits, assets =
liabilities)
 Consistency preservation is a property of a transaction,
not of the database mechanisms for controlling it
(unlike the A, I, and D of ACID)
 If each transaction maintains consistency,
then a serial execution of transactions does also
111
Isolation
Intuitively, the effect of a set of transactions should
be the same as if they ran independently.
 Formally, an interleaved execution of transactions is
serializable if its effect is equivalent to a serial one.
 Implies a user view where the system runs each user’s
transaction stand-alone.
 Of course, transactions in fact run with lots of
concurrency, to use device parallelism – this will be
covered later.
 Transactions can use common data (shared data)
 They can use the same data processing mechanisms
(time sharing)
112
Durability
 When a transaction commits, its results will survive failures
(e.g. of the application, OS, DB system … even of the disk).
 Makes it possible for a transaction to be a legal contract.
 Implementation is usually via a log
 DB system writes all transaction updates to a log file
 to commit, it adds a record “commit(Ti)” to the log
 when the commit record is on disk, the transaction is
committed.
 system waits for disk ack before acknowledging to user
113
Transaction processing
Can be automatic (controlled by the RDBMS) or
programmatic (programmed using SQL or
other supported programming languages, like
PL/SQL)
114
Why Have Concurrent Processes?
 Better transaction throughput
 Improved response time
 Done via better utilization of resources:
 While one process is doing a disk read, another
can be using the CPU or reading another disk.
115
Typical situations requiring
concurrency control
 Exclusive access to an external device or shared
service (e.g., managing printer queues)
 Coordination of applications which process
parallel data (e.g. parallel DB servers)
 Disabling or enabling execution of the client
programs in a specific moment (typically for
database administration - e.g. database backups,
enforcing resource occupation, etc.)
 Detection of transaction ends when managing
multiple sessions for connection to the database
(client/server architectures, Web access)
116
Problems with Concurrency (in absence
of locking)
 Lost Update problem - losing values due to
intervention of write operation from other
overlapping transactions
 Temporary Update problem - discarding
previous changes made by overlapping
transaction after rollback
 Incorrect Summary problem - overwriting of certain
values used for calculation by write operations
from other transactions
117
Lost Update Problem
Time  Transaction A     Value  Transaction B
T0    Start A           6
T1    Read Value (6)    6      Start B
T2    Add 2 (6+2=8)     6      Read Value (6)
T3    Write Value (8)   8      Add 3 (6+3=9)
T4    End A             9      Write Value (9)
T5                      9      End B
 What should the final Order Value be?
 Which Update has been lost?
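The interleaving in the table can be replayed with a tiny simulator in which each transaction works on a private copy of the value between its read and its write. The step encoding and the function name run are assumptions of this sketch.

```python
def run(schedule, initial):
    """Execute (txn, op, arg) steps; each transaction keeps a private local copy."""
    value, local = initial, {}
    for txn, op, arg in schedule:
        if op == "read":
            local[txn] = value
        elif op == "add":
            local[txn] += arg
        elif op == "write":
            value = local[txn]
    return value

# Interleaving from the table: B reads before A writes.
interleaved = [("A", "read", None), ("A", "add", 2), ("B", "read", None),
               ("A", "write", None), ("B", "add", 3), ("B", "write", None)]
# Serial execution: A finishes before B starts.
serial = [("A", "read", None), ("A", "add", 2), ("A", "write", None),
          ("B", "read", None), ("B", "add", 3), ("B", "write", None)]
```

Starting from 6, the serial order ends at 11, while the interleaved order ends at 9 because B's write is based on a value read before A's update.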
118
Temporary Update Problem
Time  Transaction A        Value  Transaction B
T0    Start A              6
T1    Read Value (6)       6
T2    Add 2 (8)            6
T3    Write Value (8)      8      Start B
T4    Failure: Rollback!   8      Read Value (8)
T5    Write Value (6)      6      Add 3 (8+3=11)
T6    End A                11     Write Value (11)
T7                         11     End B
 What should the final Order Value be?
 Where is the temporary update?
119
Incorrect Summary Problem
Time  Transaction A         Values (1st, 2nd)  Transaction B
T0    Read 1st Value (6)    6, 3
T1    Add 2 (6+2=8)         6, 3
T2    Write 1st Value (8)   8, 3
T3    Read 2nd Value (3)    8, 3               Read 1st Value (8)
T4    Add 2 (3+2=5)         8, 3               Read 2nd Value (3)
T5    Write 2nd Value (5)   8, 5               Total Sum = 11
 What should the total Order Value be?
 Which order was accumulated before update, and which after?
120
3.1 Database State and Changes
State D1 (at time t1) --- T ---> State D2 (at time t2)
D1, D2 - Logically consistent states of the database data
T - Transaction for changing the database
t1, t2 - Absolute time before and after the transaction
121
3.2 Transaction State and Progress
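A transaction reaches its commit point when all
operations accessing the database are completed
and the result has been recorded in the log. It then
writes a [commit, <transaction-id>] record to the log and terminates.
[State diagram: BEGIN leads to the active state, where READ and WRITE operations are executed. END moves the transaction from active to partially committed, and COMMIT moves it from partially committed to committed. ROLLBACK from either active or partially committed leads to aborted. Both committed and aborted lead to terminated.]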
When a system failure occurs, search the log file for entries
[start, <transaction-id>]
and if there are no logged entries [commit, <transaction-id>]
then undo all operations that have logged entries
[write, <transaction-id>, X, old_value, new_value]
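The undo step described above can be sketched as a backward scan of the log. In this illustration the log is a list of tuples shaped like the entries on the slide, the database is a dictionary, and the function name recover is an assumption.

```python
def recover(db, log):
    """Undo the writes of every transaction that has no commit record."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    # Scan backwards so the oldest old_value is restored last.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, txn, item, old_value, new_value = rec
            db[item] = old_value
    return db

log = [
    ("start", "T1"),
    ("write", "T1", "X", 6, 8),
    ("start", "T2"),
    ("write", "T2", "Y", 3, 5),
    ("commit", "T2"),
    # crash happens here: T1 never committed
]
db_after_crash = {"X": 8, "Y": 5}
recovered = recover(dict(db_after_crash), log)
```

T1's write to X is rolled back to its old value, while T2's committed write to Y is left in place.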
122
Schedules
• Schedule: Actions of transactions as seen by the DBMS
T1 T2
R(A)
W(A)
R(B)
W(B)
R(C)
W(C)
123
Serializable Schedule
 A schedule whose effect on the DB “state” is
the same as that of some serial schedule
 All serial schedules are serializable
 But the reverse may not be true
124
Serializability Violations
T1: Transfer Rs.10,000 from A to B
T2: Add 6% interest to A & B

T1        T2
R(A)
W(A)
          R(A)
          W(A)
          R(B)
          W(B)
          commit
R(B)
W(B)
commit

Database is inconsistent!
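Whether this interleaving is serializable can be checked by comparing its final state with the two serial orders. The starting balances of 50,000 each are hypothetical, and each R(X), W(X) pair is modelled as a single step, which is safe here because the pairs are adjacent in the schedule.

```python
def transfer_debit_a(db):   # T1: R(A), W(A)
    db["A"] -= 10_000

def transfer_credit_b(db):  # T1: R(B), W(B)
    db["B"] += 10_000

def interest_a(db):         # T2: R(A), W(A)
    db["A"] = db["A"] * 106 // 100

def interest_b(db):         # T2: R(B), W(B)
    db["B"] = db["B"] * 106 // 100

def run(steps, start):
    db = dict(start)
    for step in steps:
        step(db)
    return db

start = {"A": 50_000, "B": 50_000}  # hypothetical opening balances
interleaved = run([transfer_debit_a, interest_a, interest_b, transfer_credit_b], start)
t1_then_t2 = run([transfer_debit_a, transfer_credit_b, interest_a, interest_b], start)
t2_then_t1 = run([interest_a, interest_b, transfer_debit_a, transfer_credit_b], start)
```

Both serial orders leave a total of 106,000 in the two accounts; the interleaved schedule leaves 105,400 because the transferred amount earns no interest in either account, so it matches neither serial result.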
125
Cascading Aborts
T1        T2
R(A)
W(A)
          R(A)
          W(A)
abort
126
Recoverable Schedules
Unrecoverable Schedule
T1        T2
R(A)
W(A)
          R(A)
          W(A)
          commit
abort

Recoverable Schedule
T1        T2
R(A)
W(A)
          R(A)
          W(A)
commit
          commit
127
Locking
 The concept of locking data items is one of the main techniques
for controlling the concurrent execution of transactions.
 A lock is a variable associated with a data item in the database.
 Generally there is a lock for each data item in the database.
 A lock describes the status of the data item with respect to
possible operations that can be applied to that item
 used for synchronising the access by concurrent transactions
to the database items.
 A transaction locks an object before using it
 When an object is locked by another transaction, the requesting
transaction must wait
128
Locking Granularity
 A database item which can be locked could be
 a database record
 a field value of a database record
 the whole database
 Trade-offs
 coarse granularity
 the larger the data item size, the lower the degree of
concurrency
 fine granularity
 the smaller the data item size, the more locks to be managed
and stored, and the more lock/unlock operations needed.
129
Locking: A Technique for
Concurrency Control
•Locks are automatically obtained by DBMS.
•Guarantees serializability!
Compatibility matrix for lock types X and S
(S: Shared lock, X: Exclusive lock, --: No lock)

      --    S     X
--    yes   yes   yes
S     yes   yes   no
X     yes   no    no
130
Two- Phase Locking (2PL)
Strict 2PL:
– If T wants to read an object, first obtains an S lock.
– If T wants to modify an object, first obtains X lock.
– Hold all locks until end of transaction.
– Guarantees serializability, and recoverable schedule, too!
also avoids WW problems!
2PL:
– Slight variant of strict 2PL
– transactions can release locks before the end (commit or
abort)
But after releasing any lock it can acquire no new locks
– Guarantees serializability
131
Handling a Lock Request
Lock Request (XID, OID, Mode)
  Mode==X: Currently Locked?
    Yes: Put on Queue
    No: Grant Lock
  Mode==S: Empty Wait Queue?
    No: Put on Queue
    Yes: Currently X-locked?
      Yes: Put on Queue
      No: Grant Lock
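The decision logic for a lock request can be sketched as a small class. This is a simplified illustration, not a full lock manager: the class and attribute names are assumptions, and lock release, lock upgrades and deadlock handling are left out.

```python
from collections import deque

class LockManager:
    def __init__(self):
        self.holders = {}  # oid -> {xid: mode} for locks currently granted
        self.queue = {}    # oid -> deque of waiting (xid, mode) requests

    def request(self, xid, oid, mode):
        """Return 'granted' or 'queued' for a lock request (XID, OID, Mode)."""
        held = self.holders.setdefault(oid, {})
        waiting = self.queue.setdefault(oid, deque())
        if mode == "X":
            # Exclusive: only if nobody holds any lock on the object.
            grant = not held
        else:
            # Shared: only if nobody is waiting and no exclusive lock is held.
            grant = not waiting and "X" not in held.values()
        if grant:
            held[xid] = mode
            return "granted"
        waiting.append((xid, mode))
        return "queued"

lm = LockManager()
results = [
    lm.request("T1", "A", "S"),
    lm.request("T2", "A", "S"),
    lm.request("T3", "A", "X"),
    lm.request("T4", "A", "S"),
    lm.request("T5", "B", "X"),
]
```

Making a shared request wait whenever the queue is non-empty keeps a stream of readers from starving a waiting writer.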
132
133
Recovery
 Occurs in case of transaction failures.
 Database (DB) is restored to the most
recent consistent state just before the time
of failure.
 To do this, the DB system needs
information about changes applied by
various transactions. This information is kept in the system log.
134
Recovery: Motivation
crash
T1
T2
T3
T4
T5
•Atomicity: Undoing actions of transaction that do not commit
•Durability: Making sure all actions of committed transactions
survive system crashes
•The Recovery Manager guarantees Atomicity & Durability.
135
Recovery Outline
 Restore to most recent “consistent” state just
before time of failure
 Use data in the log file
 Catastrophic Failure
 Restore database from backup
 Replay transactions from log file
 Database becomes inconsistent (non-catastrophic errors)
 Undo or Redo last transactions until consistent state is
restored
136
Logging
 Record REDO and UNDO information, for
every update, in a log.
– Sequential writes to log (put it on a separate disk).
– Minimal info (diff) written to log, so multiple
updates fit in a single log page.
137
Handling the Buffer Pool
• When is buffer written back to disk?
• Steal/No-steal
Can it be written before commit? (steal)
Or does it have to wait till after commit? (no-steal)
• Force/No-force
Is it written “immediately” after commit? (force)
Or can it remain in memory? (no-force)
           NoSteal   Steal
Force      Trivial
NoForce              Desired
138
Write- Ahead Logging (WAL)
 The Write- Ahead Logging Protocol:
 Must force the log record for an update before the
corresponding data page gets to disk.
 Must write all log records for a transaction before
commit .
 What goes into log:
 BFIM needed for UNDO type algorithms
 AFIM needed for REDO type algorithms
139
Checkpoints in the System Log
 Checkpoint record written in log when all updated DB buffers written
out to disk
 Any committed transaction occurring before checkpoint in log can be
considered permanent (won’t have to be redone after crash)
 Actions
 suspend execution of all transactions
 force-write all modified buffers to disk
 write checkpoint entry in log and force write log
 resume transactions
 Fuzzy checkpointing
 resume transactions as soon as buffers written
