2

Introduction to Database Management Systems (DBMS)

Database Management System (DBMS)
Definitions: 


Data: Known facts that can be recorded and that have implicit meaning Database: Collection of related data 

Ex. the names, telephone numbers and addresses of all the people you know 

Database Management System: A computerized record-keeping system record4

DBMS (Contd.) 

Goals of a Database Management System: 

To provide an efficient as well as convenient environment for accessing data in a database Enforce information security: database security, concurrence control, crash recovery  

It is a general purpose facility for: 
 

Defining database Constructing database Manipulating database
5

Benefits of database approach 
     

Redundancy can be reduced Inconsistency can be avoided Data can be shared Standards can be enforced Security restrictions can be applied Integrity can be maintained Data independence can be provided

6

DBMS Functions 
    

Data Definition Data Manipulation Data Security and Integrity Data Recovery and Concurrency Data Dictionary Performance

7

Database System
Users

DATABASE SYSTEM DBMS Software

Application Programs/Queries

Software to process queries/programs Software to access stored data

Stored Data Defn.
(META-DATA).

Stored Database

8

Database System
Application program query Q2

user query Q1
Query processor

Database scheme
DDL compiler

Compiled query Q2

Database manager

Database description

File manager

Physical database

9

Data Model  

A set of concepts used to describe the structure of a database By structure, we mean the data types, relationships, and constraints that should holds for the data
Categories of Data Models

Conceptual

Physical

Representational
10

Database Architecture
External level (individual user views) Conceptual level (community user view)

Internal level (storage view) Database

11

An example of the three levels
SNo FName LName Conceptual View
struct STAFF { Internal int staffNo; View int branchNo; char fName[15]; char lName[15]; struct date dateOfBirth; float salary; struct STAFF *next; /* pointer to next Staff record */ }; index staffNo; index branchNo; /* define indexes for staff */

Age

Salary

BranchNo

SNo FName LName External View1

Age

Salary

SNo LName BranchNo External View2

12

Schema  

Schema: Description of data in terms of a data model ThreeThree-level DB Architecture defines following schemas: 

External Schema (or sub-schema) sub

Written using external DDL Written using conceptual DDL Written using internal DDL or storage structure definition
13 

Conceptual Schema (or schema)  

Internal Schema 

Data Independence 

Change the schema at one level of a database system without a need to change the schema at the next higher level 

Logical data independence: Refers to the immunity of the external schemas to changes in the conceptual schema e.g., add new record or field Physical data independence: Refers to the immunity of the conceptual schema to changes in the internal schema e.g., adding new index should not void existing ones 

14

TYPES OF DATABASE MODELS
HIERARCHICAL

NETWORK

COLUMN

ROW

VALUE

TABLE

RELATIONAL
15

DATABASE DESIGN PHASES
DATA ANALYSIS Entities - Attributes - Relationships - Integrity Rules

LOGICAL DESIGN Tables - Columns - Primary Keys - Foreign Keys

PHYSICAL DESIGN DDL for Tablespaces, Tables, Indexes

16

Introduction to Relational Databases: RDBMS

Definition : RDBMS 

It is a system in which, at a minimum : 

The data is perceived by the user as tables ( and nothing but tables ); and 

The operators at the user·s disposal - e.g., for data retrieval - are operators that generate new tables from old, and those include at least SELECT, PROJECT, and JOIN.
18

Features of an RDBMS  

 

The ability to create multiple relations (tables) and enter data into them An interactive query language Retrieval of information stored in more than one table Provides a Catalog or Dictionary, which itself consists of tables ( called system tables )
19

Some Important Terms 
     

Relation : a table Tuple : a row in a table Attribute : a Column in a table Degree : number of attributes Cardinality : number of tuples Primary Key : a unique identifier for the table Domain : a pool of values from which specific attributes of specific relations draw their values
20

Properties of Relations (Tables) 
   

There are no duplicate rows (tuples) Tuples are unordered, top to bottom Attributes are unordered, left to right All attribute values are atomic ( or scalar ) Relational databases do not allow repeating groups

21

Keys 
 

Key Super Key Candidate Keys 


Primary Key Alternate Key 

Secondary Keys

22

Keys and Referential Integrity
Enrolled
sid 53666 53688 53650 53666 cid grade carnatic101 C reggae203 B topology112 A history105 B sid

Student
name login Jones@cs Smith@eecs Smith@math age 18 18 19 gpa 3.4 3.2 3.8

53666 Jones 53688 Smith 53650 Smith

Foreign key referring to sid of STUDENT relation

Primary key

23

24

Relational Algebra

Relational Query Languages   

Query languages: Allow manipulation and retrieval of data from a database. Relational model supports simple, powerful QLs:  Strong formal foundation based on logic.  Allows for much optimization. Query Languages != programming languages!

26

Example Instances
sid bid 101 103 day 10/10/99 11/12/99 sid sname Deepa Laxmi Roopa sid rating 7 8 10 sname Yamuna Laxmi Geeta Roopa age 45.0 55.5 35.0 rating 9 8 5 10 age 35.0 55.5 35.0 35.0
27

R1

22 58

S1

22 31 58

S2

28 31 44 58

Relational Algebra 

Basic operations: 
   

Selection (W ) Projection (T) Cross- product ( ') CrossSetSet- difference ( ²) Union (Š )

28

Projection
sname Yamuna Laxmi Geeta Roopa rating 9 8 5 10

Tsname, rating(S2)

age 35.0 55.5

Tage(S2)
29

Selection
sid 28 58 sname Yamuna Roopa rating 9 10 age 35.0 35.0

Wrating > 8(S2)

sname Yamuna Roopa

rating 9 10

Tsname, rating(S2) (Wrating > 8(S2))

30

Union, Intersection, Set Difference
sid 22 31 58 44 28 sid 31 58 sid 22 sname Deepa Laxmi Roopa Geeta Yamuna sname Laxmi Roopa sname Deepa rating 7 8 10 5 9 rating 8 10 rating 7 age 45.0 55.5 35.0 35.0 35.0 age 55.5 35.0 age 45.0

S1 Š S2

S1 ‰ S2 S1  S2
31

CrossCross- Product
(sid) 22 22 31 31 58 58 sname Deepa Deepa Laxmi Laxmi Roopa Roopa rating 7 7 8 8 10 10 age 45.0 45.0 55.5 55.5 35.0 35.0 (sid) 22 58 22 58 22 58 bid 101 103 101 103 101 103 day 10/10/99 11/12/99 10/10/99 11/12/99 10/10/99 11/12/99

32

Joins
Condition Join :
(sid) 22 31 sname Deepa Laxmi rating 7 8 age 45.0 55.5 (sid) 22 58 bid 101 103 day 10/10/99 11/12/99

33

EquiEqui-Join

(sid) 22 58

sname Deepa Roopa

rating 7 10

age 45.0 35.0

bid 101 103

day 10/10/99 11/12/99

34

Division
‡Not supported as a primitive operator, but useful for expressing queries like: ‡Find sailors who have reserved all boats .
sno s1 s1 s1 s1 s2 s2 s3 s4 s4 pno p1 p2 p3 p4 p1 p2 p2 p2 p4 pno p2 B1 sno s1 s2 s3 s4 A/B1 pno p2 p4 B2 sno s1 s4 A/B2 pno p1 p2 p4 B3 sno s1 A/B3
35

A

36

Introduction to Query Optimization

Processing A High-level HighQuery
Query in a high level language SCANING, PARSING AND VALIDATING Intermediate form of query QUERY OPTIMIZER Execution plan QUERY CODE GENERATOR Code to execute the query
RUNTIME DATABASE PROCESSOR

Typical steps when processing a high level query.

Result of query

38

Two Main Techniques for Query Optimization 

Heuristic Rules: A heuristic is a rule that works well in most of cases, but not always. General Idea:
Many different relational algebra expressions (and thus query trees) are equivalent.  Transform the initial query tree of a query into an equivalent final query tree that is efficient to execute.  

Cost based query optimization 

Estimate the cost for each execution plan, and choose the one with the lowest cost.
39 

Can we get the best execution plan?

Motivating Example
select * from R1, R2, R3 where R1.r2no=R2.r2no and R2.r3no=R3.r3no and R1.a=5000
NLJ NLJ

SS(R1, ³a=5000´)

SS(R2)

SS(R3)

40

Alternative Plans 1(No Indexes) 1(No
select * from R1, R2, R3 where R1.r2no=R2.r2no and R2.r3no=R3.r3no and R1.a=5000
NLJ

NLJ

SS(R3)

SS(R1, ³a=5000´)

SS(R2)

41

Alternative Plans 2 (With Indexes)
select * from R1, R2, R3 where R1.r2no=R2.r2no and R2.r3no=R3.r3no and R1.a=5000
NLJ

NLJ

SS(R3)

IS(R1, ³a=5000´)

SS(R2)

42

43

Conceptual Design Using the EntityEntity- Relationship Model

Overview of Database Design 

Conceptual design : (ER Model is used at this stage.) Schema Refinement : (Normalization) Physical Database Design and Tuning  

45

E R Modeling 


Conceptual Schema Design Relational Calculus - Formal Language for Relational D/B.
Relational Calculus

Predicate Calculus SQL / Tuple Based

Domain Calculus Query By Examples
46

Design Phases«
Requirements Collection & Analysis Data Requirements

Functional Requirements User Defined Operations Data Flow Diagrams Sequence Diagrams, Scenarios

Conceptual Design Entity Types, Constraints , Relationships No Implementation Details. Ensures Requirements Meets the Design

Logical Design

Data Model Mapping ± Type of Database is identified Physical Design
Internal Storage Structures / Access Path / File Organizations
47

E-R Modeling 

Entity 

is anything that exists and is distinguishable a group of similar entities properties that describe an entity an association between entities
48 

Entity Set  

Attribute  

Relationship 

Notations
ENTITY TYPE ( REGULAR )

WEAK ENTITY TYPE RELATIONSHIP TYPE

WEAK RELATIONSHIP TYPE

49

Entity
Attributes ssn name lot SSN NAME LOT 48 22 123- 22- 3666 Attishoo 231- 31- 5368 Smiley

Employee Entity Set

131- 24- 3650 Smethurst 35

CREATE TABLE Employees (ssn CHAR (11), name CHAR (20), lot INTEGER, PRIMARY KEY (ssn))
50

Types of Relationships
1:1 st e t 1 Is iss e 1 I car

1:

st

e ts

e rols i

1

co rse

:

st

e ts

take

tests

51

ER Model
ssn name lot since

did

dname

budget

Employee supervisor

Works_in Subordinate

Department

Reports_To

52

ER Model (Contd.)
Works_ In SSN DID 123-22-3666 51 123-22-3666 56 231-31-5368 51 SINCE 1/1/91 3/3/93 2/2/92 CREATE TABLE Works_ In( ssn CHAR (11), did INTEGER, since DATE, PRIMARY KEY (ssn, did), FOREIGN KEY (ssn) REFERENCES Employees, FOREIGN KEY (did) REFERENCES Departments)
53

Key Constraints

ssn

name

lot

since

did

dname Department

budget

Employee Manages

54

Key Constraints for Ternary Relationships

ssn

name

lot

since

did

dname budget

Employee

Works_in

Department

Location address capacity

55

Participation Constraints
ssn name lot since did dname budget

Employee

Manages

Department

Works_in since
56

Weak Entities
ssn name lot cost pname age

Employee policy

Dependent

57

ISA (µis a¶) Hierarchies
lot

ssn

name

Employee Hrly_wages Hrs_worked IsA contractid Contract_Emp
58

Hourly_Emp

Aggregation
ssn name Employee monitors until lot

pid

pbudget

Started on

did

dname budget

project

sponsors

department
59

Entity vs. Attribute
Works_ In does not allow an employee to work in a department for two or more periods (why?)

ssn

name

lot

from

to

did

dname Department

budget

Employee Works_in

60

Entity vs. Attribute (Contd.)

ssn

name

lot

did

dname Department

budget

Employee Works_in

from

Duration

to

61

Entity vs. Relationship

ssn

name

lot

since

DB

did

dname Department

budget

Employee

manages

DB - Dbudget

62

Entity vs. Relationship
ssn name lot did dname Department manages budget

Employee

since Appt num Mgr_appt DBudget
63

Binary vs. Ternary Relationships

ssn

name

lot

pname

age

Employee covers

Dependent

Policy

policyid

cost
64

Binary vs. Ternary Relationships Better Design
ssn name lot pname age

Employee

Dependent

purchaser

Beneficiary

policyid

Policy

cost
65

Constraints Beyond the ER Model
‡ Some constraints cannot be captured in ER diagrams: ‡ Functional dependencies ‡ Inclusion dependencies ‡ General constraints

66

E-R Diagram
DEPARTMENT 1 SUPPLIER DEPT_ EMP M EMPLOYEE 1 EMP_ DEP M DEPENDENT M PART_ STRUC TURE 1 PROJ_ MGR M M M PROJ_ WORK M PROJECT M M SUPP_ PART_ PROJ M PART M

SUPP_ PART M

67

Example to Start with «. 

An Example Database Application called COMPANY which serves to illustrate the ER Model concepts and their schema design. The following are collection from the Client.

68

Analysis« 

Company : Organized into Departments, Each Department has a name, no and manager who manages the department. The Company keeps track of the date that employee managing the department. A Department may have a Several locations.

69

Analysis«  

Department : A Department controls a number of Projects each of which has a unique name , no and a single Location. Employee : Name, Age, Gender, BirthDate, SSN, Address, Salary. An Employee is assigned to one department, may work on several projects which are not controlled by the department. Track of the number of hours per week is also controlled.

70

Analysis«. 

Keep track of the dependents of each employee for insurance policies : We keep each dependant first name, gender, Date of birth and relationship to the employee.

71

Now to our Company«
DEPARTMENT ( Name , Number , { Locations } , Manager, Start Date ) PROJECT ( Name, Number, Location , Controlling Department )

EMPLOYEE (Name (Fname, Lname) , SSN , Gender, Address, Salary Birthdate, Department , Supervisor , (Workson ( Project , Hrs)) DEPENDENT ( Employee, Name, Gender, Birthdate , Relationship )

72

Example « 

Manage: 
 

Department and Employee Partial Participation Relation Attribute : StartDate. Department and Employee Total Participation 

Works For: 


73

Example« 

Control :
Department , Project  Partial Participation from Department  Total Participation from Project  Control Department is a RKA.  

Supervisor :
Employee, Employee  Partial and Recursive 

74

Example « 

Works ² On :
Project , Employee  Total Participation  Hours Worked is a RKA.  

Dependants of:
Employee , Dependant  Dependant is a Weaker  Dependant is Total , Employee is Partial. 

75

Lname Fname

One Possible mapping of the Problem Statement Name No
Sal Sex SSN Address Sdate Work s For Department

Loc

Name Employee Bdate Hours

Control s

manage s Project Name Depend On Dependent Name Sex Bdate Loc

Supe rvise s

Work sOn

No

Relationship
76

77

78

79

80

Schema Refinement and Normalization

Normalization and Normal Forms 

Normalization:
Decomposing a larger, complex table into several smaller, simpler ones.  Move from a lower normal form to a higher Normal form.  

Normal Forms:
First Normal Form (1NF)  Second Normal Form (2NF)  Third Normal Form (3NF)  *Higher Normal Forms (BCNF, 4NF, 5NF ....)  

In practice, 3NF is often good enough.
82

Why Normal Forms 

The first question to ask is whether any refinement is needed! If a relation is in a certain normal form (BCNF, 3NF etc.), it is known that certain kinds of problems are avoided/ minimized. This can be used to help us decide whether decomposing the relation will help.
83 

The Evils of Redundancy   



Redundancy is at the root of several problems associated with relational schemas More seriously, data redundancy causes several anomalies: insert, update, delete Wastage of storage. Main refinement technique: decomposition (replacing ABCD with, say, AB and BCD, or ACD and ABD).
84

Refining an ER Diagram - Before
ssn name lot since did dname budget

Employee

Works_in

Department

85

Refining an ER Diagram - After

ssn

name

since

did

dname

budget lot

Employee Works_in

Department

86

First Normal Form 

A table is in 1NF, if every row contains exactly one value for each attribute. Disallow multivalued attributes, composite attributes and their combinations. 1NF states that :   

domains of attributes must include only atomic (simple, indivisible) values and that value of any attribute in a tuple must be a single value from the domain of that attribute.
87 

By definition, any relational table must be in 1NF.

Functional Dependencies (FDs) 

Provide a formal mechanism to express constraints between attributes Given a relation R, attribute Y of R is functionally dependent on the attribute X of R if & only if each X-value in R has associated Xwith it precisely one Y-value in R. Y- 

88

Full Dependency 

Concept of full functional dependency 

A FD x p y is a full functional dependency if removal of any attribute A from X means that the dependency does not hold any more.

89

Partial Dependency 

An F.D. x p y is a partial dependency if there is some attribute A  X that can be removed from X and the dependency will still hold.

90

Example: Constraints on Entity Set
S N L 123- 22- 3666 Attishoo 48 231- 31- 5368 Smiley 22 131- 24- 3650 Smethurst 35 434- 26- 3751 Guldu 35 612- 67- 4134 Madayan 35 S N L 123- 22- 3666 Attishoo 48 231- 31- 5368 Smiley 22 131- 24- 3650 Smethurst 35 434- 26- 3751 Guldu 35 612- 67- 4134 Madayan 35 H 40 30 30 32 40 R 8 8 5 5 8 R 8 8 5 5 8 W 10 10 7 7 10 H 40 30 30 32 40

R W 5 7 8 10

91

Second Normal Form (2NF) 

A relation schema R is in 2NF if: 


it is in 1NF and every non-prime attribute A in R is fully functionally nondependent on the primary key of R. 

2NF prohibits partial dependencies. dependencies.

92

2NF: An Example 

Emp{Eno, Dept, ProjCode, Hours} 


Primary key: {Eno, ProjCode} {Eno} -> {Dept}, {Eno, ProjCode} -> {Hours} {Eno} -> {Dept}: partial dependency. Emp is in 1NF, but not in 2NF. Emp {Eno, Dept} {Eno, Proj {Eno, ProjCode, Hours} {Eno, ProjCode,
93 

Test of 2NF 
 

Decomposition: 


Transitive Dependency 

An FD X p Y in a relation schema R is a transitive dependency if 

there is a set of attributes Z that is not a subset of any key of R, and both X p Z and Z p Y hold. 

94

Third Normal Form 

A relation schema R is in 3NF if 


It is in 2NF and No nonprime attribute of R is transitively dependent on the primary key. 

3NF means that each non-key attribute nonvalue in any tuple is truly dependent on the Primary Key and not even partially on other attributes. 3NF prohibits transitive dependencies. dependencies.
95 

3NF: An Example 

Emp{Eno, Dept, Dept_Head}
Primary key: {Eno}  {Eno} -> {Dept}, {Dept} -> {Dept_Head}  

Test of 3NF
{Eno} -> {Dept} -> {Dept_Head}: Transitive dependency.  Emp is in 2NF, but not in 3NF.  

Decomposition:
Emp {Eno, Dept} {Eno, Dept}  Dept {Dept, Dept_Head} 

96

Boyce ²Codd Normal Form 

The intention of BCNF is that- 3NF does not thatsatisfactorily handle the case of a relation processing two or more composite or overlapping candidate keys

97

BCNF ( Boyce Codd Normal Form) 

A Relation is said to be in Boyce Codd Normal Form (BCNF) if and only if every determinant is a candidate key.

98

Decomposition of a Relation Scheme 

Suppose that relation R contains attributes A1 ... An. A decomposition of R consists of replacing R by two or more relations such that:
Each new relation scheme contains a subset of the attributes of R (and no attributes that do not appear in R), and  Every attribute of R appears as an attribute of one of the new relations. 

99

100

101

102

103

104

105

106

Transaction, Concurrency Control and Recovery

Transaction 

A sequence of many actions which are considered to be one atomic unit of work. 

Read, write, commit, abort Atomicity, Consistency, Isolation, Durability 

Governed by four ACID properties:  

Has a unique starting point, some actions and one end point

108

The ACID Properties 
 



A tomicity: All actions in the transaction happen, or none happen. C onsistency: If each transaction is consistent, and the DB starts consistent, it ends up consistent. I solation: Execution of one transaction is isolated from that of other transactions. D urability: If a transaction commits, its effects persist.
109

Automicity  



All-orAll-or-nothing, no partial results. An event either happens and is committed or fails and is rolled back.  e.g. in a money transfer, debit one account, credit the other. Either both debiting and crediting operations succeed, or neither of them do.  Transaction failure is called Abort Commit and abort are irrevocable actions. There is no undo for these actions. An Abort undoes operations that have already been executed  For database operations, restore the data·s previous value from before the transaction (Rollback-it); a Rollback (Rollbackcommand will undo all actions taken since the last commit for that user.  But some real world operations are not undoable. Examples - transfer money, print ticket, fire missile

110

Consistency   

Every transaction should maintain DB consistency  Referential integrity - e.g. each order references an existing customer number and existing part numbers  The books balance (debits = credits, assets = liabilities) Consistency preservation is a property of a transaction, not of the database mechanisms for controlling it (unlike the A, I, and D of ACID) If each transaction maintains consistency, then a serial execution of transactions does also
111

Isolation
Intuitively, the effect of a set of transactions should be the same as if they ran independently.  Formally, an interleaved execution of transactions is serializable if its effect is equivalent to a serial one.  Implies a user view where the system runs each user·s transaction stand-alone. stand Of course, transactions in fact run with lots of concurrency, to use device parallelism ² this will be covered later.  Transactions can use common data (shared data)  They can use the same data processing mechanisms (time sharing)
112

Durability  



When a transaction commits, its results will survive failures (e.g. of the application, OS, DB system « even of the disk). Makes it possible for a transaction to be a legal contract. Implementation is usually via a log  DB system writes all transaction updates to a log file  to commit, it adds a record ´commit(Ti)µ to the log  when the commit record is on disk, the transaction is committed.  system waits for disk ack before acknowledging to 113 user

Transaction processing
Can be automatic (controlled by the RDBMS) or programmatic (programmed using SQL or other supported programming languages, like PL/SQL)

114

Why Have Concurrent Processes? 
 

Better transaction throughput Improved response time Done via better utilization of resources: 

While one processes is doing a disk read, another can be using the CPU or reading another disk.

115

Typical situations requiring concurrency control 
  

Exclusive access to an external device or shared service (e.g., managing printer queues) mana Coordination of applications which process parallel data (e.g. parallel DB servers) Disabling or enabling execution of the client programs in a specific moment (typically for database administration - e.g. database backups, enforcing resource occupation, etc.) Detection of transaction ends when managing multiple sessions for connection to the database (client/server architectures, Web access)
116

Problems with Concurrency (in absence of locking)
Lost Update problem - losing values due to intervention of write operation from other overlapping transactions  Temporary Update problem - discarding previous changes made by overlapping transaction after rollback  Incorrect Summary problem - overwriting of certain values used for calculation by write operations from other transactions 

117

Lost Update Problem
Time T0 T1 T2 T3 T4 T5 


Transaction A Start A Read Value (6) Add 2 (6+2=8) Write Value (8) End A

Value 6 6 6 8 9 9

Transaction B

Start B Read Value (6) Add 3 (6+3=9) Write Value (9) End B

What should the final Order Value be? Which Update has been lost?
118

Temporary Update Problem
Time T0 T1 T2 T3 T4 T5 T6 T5 


Transaction A Start A Read Value (6) Add 2 (8) Write Value (8) Failure: Rollback! Write Value (6) End A

Value 6 6 6 8 8 6 11 11

Transaction B

Start B Read Value (8) Add 3 (8+3=11) Write Value (11) End B

What should the final Order Value be? Where is the temporary update?
119

Incorrect Summary Problem
Time T0 T1 T2 T3 T4 T5 


Transaction A Read 1st Value (6) Add 2 (6+2=8) Write 1st Value (8) Read 2nd Value (3)

Values 6 3 6 3 8 3 8 3 8 3 8 5

Transaction B

Read 1st Value (8) Read 2nd Value (3) Total Sum = 11

Add 2 (3+2 = 5) Write 2nd Value (5)

What should the total Order Value be? Which order was accumulated before update, and which after?

120

3.1 Database State and Changes

t1

t2

State D1

State D2

T

D1, D2

- Logically consistent states of the database data

TTransaction for changing the database t1, t2 - Absolute time before and after the transaction

121

3.2 Transaction State and Progress
A transaction reaches its commit point when all operations accessing the database are completed and the result has been recorded in the log. It then writes a [commit, <transaction-id>] and terminates.
BEGIN active END partially committed ROLLBACK ROLLBACK terminated COMMIT committed

READ , WRITE

aborted

When a system failure occurs, search the log file for entries [start, <transaction-id>] and if there are no logged entries [commit, <transaction-id>] then undo all operations that have logged entries [write, <transaction-id>, X, old_value, new_value]
122

Schedules
‡ Schedule: Actions of transactions as seen by the DBMS T1 R(A) W(A) T2

R(B) W(B) R(C) W(C)
123

Serializable Schedule 

A schedule whose effect on the DB ´stateµ is the same as that of some serial schedule 

All serial schedules are serializable 

But the reverse may not be true

124

Serializability Violations
Transfer Rs.10,000 from A to B Add 6% interest to A &B

T1 R(A) W(A)
Database is inconsistent!

T2 R(A) W(A) R(B) W(B) commit

R(B) W(B) commit
125

Cascading Aborts

T1 R(A) W(A)

T2

R(A) W(A) abort

126

Recoverable Schedules
Unrecoverable Schedule Recoverable Schedule

T1 R(A) W(A)

T2

T1 R(A) W(A)

T2

R(A) W(A) commit abort

R(A) W(A) commit commit
127

Locking 
  



The concept of locking data items is one of the main techniques for controlling the concurrent execution of transactions. A lock is a variable associated with a data item in the database.  Generally there is a lock for each data item in the database. A lock describes the status of the data item with respect to possible operations that can be applied to that item  used for synchronising the access by concurrent transactions to the database items. A transaction locks an object before using it When an object is locked by another transaction, the requesting transaction must wait

128

Locking Granularity  

A database item which can be locked could be  a database record  a field value of a database record  the whole database TradeTrade-offs  coarse granularity  the larger the data item size, the lower the degree of concurrency  fine granularity  the smaller the data item size, the more locks to be managed and stored, and the more lock/unlock operations needed.

129

Locking: A Technique for Concurrency Control
‡Locks are automatically obtained by DBMS. ‡Guarantees serializability!
Compatibility matrix for lock types X and S

-S X

-˜ ˜ ˜

S ˜ ˜

X ˜
S: Shared lock X: Exclusive lock -- No lock

130

TwoTwo- Phase Locking (2PL)
Strict 2PL: ± If T wants to read an object, first obtains an S lock. ± If T wants to modify an object, first obtains X lock. ± Hold all locks until end of transaction. ± Guarantees serializability, and recoverable schedule, too! also avoids WW problems! 2PL: ± Slight variant of strict 2PL ± transactions can release locks before the end (commit or abort) 
But after releasing any lock it can acquire no new locks

± Guarantees serializability
131

Handling a Lock Request
Lock Request (XID, OID, Mode)
Mode==X Mode==S

Currently Locked?
Yes No

Empty Wait Queue?
Yes

Currently X-locked? XNo Yes

Put on Queue
No

Grant Lock
132

133

Recovery 

Occurs in case of transaction failures. Database (DB) is restored to the most recent consistent state just before the time of failure. To do this, the DB system needs information about changes applied by various transactions. It is the system log. log.
134  

Recovery: Motivation
crash T1 T2 T3 T4 T5 ‡Atomicity: Undoing actions of transaction that do not commit ‡Durability: Making sure all actions of committed transactions survive system crashes ‡The Recovery Manager guarantees Atomicity & Durability.
135

Recovery Outline 

Restore to most recent ³consistent´ state just before time of failure 

Use data in the log file 

Catastrophic Failure
Restore database from backup  Replay transactions from log file  

Database becomes inconsistent (non(noncatastrophic errors) 

Undo or Redo last transactions until consistent state is restored

136

Logging 

Record REDO and UNDO information, for every update, in a log.
² Sequential writes to log (put it on a separate disk). ² Minimal info (diff) written to log, so multiple updates fit in a single log page.

137

Handling the Buffer Pool
‡ When is buffer written back to disk?
‡ Steal/No-steal Can it be written before commit? (steal) Or does it have to wait till after commit? (no-steal) ‡ Force/No-force Is it written ³immediately´ after commit? (force) Or can it remain in memory? (no-force) NoSteal Force NoForce Trivial Desired
138

Steal

WriteWrite- Ahead Logging (WAL) 

The Write- Ahead Logging Protocol: Write

Must force the log record for an update before the corresponding data page gets to disk. Must write all log records for a transaction before commit .  

What goes into log: 


BFIM needed for UNDO type algorithms AFIM needed for REDO type algorithms
139

Checkpoints in the System Log 
  

Checkpoint record written in log when all updated DB buffers written out to disk Any committed transaction occurring before checkpoint in log can be considered permanent (won·t have to be redone after crash) Actions  suspend execution of all transactions  force-write all modified buffers to disk force write checkpoint entry in log and force write log  resume transactions Fuzzy checkpointing  resume transactions as soon as buffers written
140

141

142

Sign up to vote on this title
UsefulNot useful