You are on page 1of 8

DBMS Architecture

Query Parser

Transaction Management: Query Rewriter
Query Optimizer
Concurrency Control Query Executor

Yanlei Diao Lock Manager Access Methods Log Manager
UMass Amherst Concurrency Control Buffer Manager Recovery

Disk Space Manager

DB
Slides Courtesy of R. Ramakrishnan and J. Gehrke 2

Outline Concurrent User Programs
 Concurrent execution of user programs: good for
 Transaction management overview
performance.
 When task 1 is doing I/O, run task 2 to utilize the CPU.
 Serializability & recoverability
 Improve average response time (average delay that a user
task experiences)
 Lock-based concurrency control
 Improve system throughput (number of user tasks
processed in each time unit)
 Optimistic concurrency control

 Efficient B+tree locking

3 4

Transactions Concurrency
 User programs may do many things on data retrieved.  Many users submit xacts, but each user thinks of his as
executing by itself.
 E.g., operations on Bob’s bank account.
 DMBS interleaves reads and writes of xacts for concurrency.
 E.g. transfer of money from account A to account B.
 E.g., search for a ticket, think about it…, and buy it.  Consistency: each xact starts and ends with a consistent
state (i.e., satisfying all integrity constraints).
 But the DBMS is only concerned about what data is read
 E.g., if an IC states that all accounts must have a positive
from/written to the database.
balance, no transaction can violate this rule.
 A transaction is DBMS’s abstract view of a user program:  Isolation: execution of one xact appears isolated from
a sequence of reads and writes. others.
 Nobody else can see the data in its intermediate state, e.g.,
account A being debited but B not being credited.

5 6

1

or it could be aborted after executing some actions. W(B) transactions running serially in some order! T2: R(A). 11 12 2 .  Recognizing any serializable schedule is highly complex. transactions that are not (explicitly) aborted. W(B) 9 10 Scheduling Transactions Serializability  Serial schedule: Schedule that does not interleave the actions  Serializability theory concerns the schedules of of different transactions.  Optimistic concurrency control  DBMS logs all actions so that it can redo the actions of committed xacts. its effect will persist despite system failure. B=1. But what about:  2nd xact credits both accounts with a 6% interest payment.  DBMS logs all actions so that it can undo the actions of aborted xacts.  Lock-based concurrency control  Durability: once a user program has been notified of success. B=B-100 T2: BEGIN A=1.  Serializable schedule: A schedule that is equivalent to some serial execution of the transactions.  This is OK. B=1. schedule preserves consistency. W(A). R(B). executing the second schedule. W(A). Recovery Outline  A transaction might commit after completing all its  Transaction management overview actions. ideally want to allow any executing the first schedule is identical to the effect of serializable schedule. T1: A=A+100. if possible. every serializable are easy to detect. B=B-100  No guarantee that T1 will execute before T2 or vice-versa.  Atomicity: either all actions of a xact are performed or  Serializability & recoverability none of them is (all-or-none).  Efficient B+tree locking 7 8 Example Example (Contd. the net effect must be equivalent to these two T1: R(A).)  Consider two transactions:  Consider a possible interleaving schedule: T1: BEGIN A=A+100. the effect of  Given a set of such xacts. allow only a subset of serializable schedules that  If each transaction preserves consistency.06*B END T2: A=1.  Instead. B=B-100 END T1: A=A+100. R(B).06*B both are submitted together. B=1.06*A.06*A. if T2: A=1.  Equivalent schedules: For any database state.06*A. The DBMS’s view of the second schedule:  However.06*B  1st xact transfers $100 from B’s account to A’s.

T1: R(A) W(A) T1: R(A).)  Schedules S1 and S2 are view equivalent if:  A schedule is view serializable if it is view equivalent to a  If Ti reads initial value of A in S1. WR. to some serial schedule.  Every pair of conflicting actions is ordered the same way.W(A)  The cycle in the graph reveals the problem. then Ti also reads initial serial schedule. Conflict Serializability Dependency Graph  Two schedules are conflict equivalent if:  Precedence graph:  Involve the same actions of the same transactions.  If Ti reads value of A written by Tj in S1. R(B). R(B).W(A) T2: W(A) T2: W(A) T3: W(A) T3: W(A) 17 18 3 .  One node per Xact.  There are serializable schedules that can’t be detected using conflict serializability. WW operations on the same object). The output of T1 depends on T2. W(B)  Conflict serializability is sufficient but not necessary for T2: R(A). A S1: interleaved schedule T1: R(A) W(A) T1 T2 Precedence graph T2: W(A) B T3: W(A)  The schedule is not conflict serializable: S2: serial schedule T1: R(A). 13 14 Example Weaker Condition on Serializability T1: R(A). value of A in S2  Every conflict serializable schedule is view serializable.  Edge from Xact Ti to Xact Tj if an action of Ti precedes and  Schedule S is conflict serializable if S is conflict equivalent conflicts with one of Tj‘s actions (RW.  Given a set of xacts. its precedence graph is acyclic. W(A). W(B) serializability. T2: W(A) T3: W(A) 15 16 View Serializability View Serializability (Contd. then Ti also reads  The converse is not true. W(A). and vice-versa. then Ti also writes final  Every view serializable schedule that is not conflict value of A in S2 serializable contains a blind write. conflict serializable schedules are a  Theorem: Schedule is conflict serializable if and only if subset of serializable schedules. value of A written by Tj in S2  If Ti writes final value of A in S1.

aborts Serial  No cascading aborts. 19 20 Recoverability (Contd. but with cascading aborts. T2: R(A). W(A) Abort All schedules View serializable T2: R(A) W(A) (Commit) Recoverable.  Actions of aborted xacts can be simply undone by restoring the original values of modified objects.  Serializability & recoverability  If an Xact holds an X lock on an object. Recoverability Recoverability (Contd.)  Recoverability theory concerns schedules that involve aborted transactions. no cascading aborts.W(A) Abort Recoverable. locks  Efficient B+tree locking time C 23 24 4 . T1: R(A).W(A) Abort T2: R(A). no other Xact can get a lock (S or X) on that object. Conflict serializable but update of A by T2 will be lost! Recoverable Avoid  S is strict if each xact may read and write only objects cascading Strict previously written by committed xacts.  Lock-based concurrency control  All locks held by a transaction are released when the transaction completes.W(A) Abort T1: R(A).  Optimistic concurrency control #. an X (exclusive) lock on object before writing.) Venn Diagram for Schedules Commited Xacts T1: R(A). Also Aborted Xacts 21 22 Outline Strict 2PL  Transaction management overview  Strict Two-Phase Locking (Strict 2PL) Protocol:  Each Xact must obtain a S (shared) lock on object before reading.W(A) Commit Unrecoverable!  S avoids cascading rollback if each xact may read only those values written by committed xacts.  A schedule S is recoverable if each xact commits only after all xacts from which it read have committed.

 Hence. locks time C 25 26 Nonstrict 2PL (contd. #.) Deadlocks  Theorem: Nonstrict 2PL ensures acyclicity of precedence graph. be released by each other.  Periodically check for cycles. locks time C 27 28 Deadlock Detection Deadlock Detection (Contd. indicating deadlocks.  But allows xacts to go through more quickly. T1 T2 T1 T2  Resolve a deadlock by aborting a transaction on the cycle and releasing all its locks.  A transaction cannot request additional locks once it releases  Strict 2PL is recoverable without anomalies related to any locks.  If an Xact holds an X lock on an object. aborted transactions.  Two ways of dealing with deadlocks:  Deadlock detection  Nonstrict 2PL is recoverable but not strict!  Deadlock prevention  Involves complex abort processing. no other Xact can get a lock (S or X) on that object.W(B) X(C) There is an edge from Xact Ti to Xact Tj if Ti is waiting for Tj T3: S(C). R(C)  T4: X(B) X(A) to release a lock. it simplifies transaction aborts.) Nonstrict 2PL  Theorem: Strict 2PL allows only schedules whose  Two-Phase Locking Protocol precedence graph is acyclic.  An equivalent serial schedule is given by the order of xacts entering their shrinking phase. an X (exclusive) lock on object before writing.  Deadlock: Cycle of transactions waiting for locks to  Nonstrict 2PL only allows conflict serializable schedules. Strict 2PL (contd. R(A).  Note the difference from the precedence graph for conflict serializability.  Each Xact must obtain a S (shared) lock on object before  Strict 2PL only allows conflict serializable schedules! reading. in the waits-for graph.)  Create a waits-for graph: T1: S(A).  Strict 2PL is strict with respect to recoverability. S(B)  Nodes are Xacts. #. T4 T3 T4 T3 29 30 5 . T2: X(B).

g. and finds oldest sailor.  so that no new records with rating = 1 can be added. even  The older the timestamp. otherwise Ti aborts. age = 71.  If there is an index on the rating field. age = 96  Wound-wait: Ti wants a lock that Tj holds.  If we consider insertion and update of records in a DB.  Higher priority xacts never wait. T1 must lock the entire file/table to prevent new pages from being added. we gain concurrency if we do not  Optimistic concurrency control lock. Ti waits for Tj.  Phantom problem. age = 96. Deadlock Prevention The Phantom Problem  Assign priorities based on timestamps. Strict 2PL won’t assure serializability:  Wait-Die: Ti wants a lock that Tj holds. if it existed!  Predicate locking: impractical.  Problem: T1 assumes that it has locked all sailor records  If a transaction re-starts. not discussed in this class  If there is no suitable index. timestamp so its priority increases. update a sailor’s rating from 3 to 1.. If Ti has higher appears as a phantom! priority.  E. If Ti has higher  T1 locks all pages containing sailor records with priority.  Efficient B+tree locking 35 36 6 .  If there are no data entries with rating = 1. and instead check for conflicts before Xacts commit. T1 should Data lock the index page containing data entries with  Need a mechanism to enforce the assumption that T1 rating = 1.  Deadlock detection/resolution. make sure it has its original with rating = 1. now. rating = 1.  This is only true if nobody adds/updates records while T1 is running! 31 32 The Phantom Problem (Contd. the higher the xact’s priority.  Lock-based concurrency control  Idea: If conflicts are rare.  T1 now reads reads the oldest sailor again. T1 must  Index locking: used in practice lock the index page where such a data entry would be.  T2 next inserts a new sailor: rating = 1.) Index Locking Index r=1  Phantoms can occur with both inserts and updates. Tj aborts. 33 34 Outline Optimistic CC (Kung-Robinson)  Transaction management overview  Locking is a conservative approach to preventing conflicts:  Lock management overhead. indeed holds locks on all recrods satisfying a condition. say. otherwise Ti waits.  Lower priority xacts can never wait.  Serializability & recoverability  Lock contention for heavily used objects.

but make changes to  Technique: find an equivalent serializable schedule local (private) copies of objects. if present. Does Tj read dirty data? Does Tj overwrite Ti’s writes?  Deadlock free. try Test 3 Comments on Optimistic CC  For all i and j such that Ti < Tj. If there is a conflict. old  ReadSet(Ti): objects read by Xact Ti  WriteSet(Ti): objects modified by Ti modified ROOT objects new 37 38 Test 1 If Test 1 fails. only fixes problems when conflicts appear. go one way only. and  WriteSet(Ti) ∩ ReadSet(Tj) is empty. Ti  Works well for some workloads: R V W  All xacts are readers. R-W. abort  Check if the timestamp-ordering of xacts is an equivalent (clear the local copy & restart)! serial order.g.  Three test conditions ensure an equivalent serializable schedule.  No phantom problem!  Why is it correct? 41 42 7 . Kung-Robinson Model Validation  Xacts have three phases:  Goal: guarantee that only serializable schedules result.  Assign an Xact id (timestamp) to each xact at the beginning of the validation phase. but may have starvation. and  Locking (pessimistic): conflicts are prevented in advance.  WriteSet(Ti) ∩ ReadSet(Tj) is empty.  Ti completes before Tj begins its Write phase. large amount of data. Tj  Low interference.  WriteSet(Ti) ∩ WriteSet(Tj) is empty. check that:  Compared to Locking  Ti completes Read phase before Tj does. W-W.  VALIDATE: Check for conflicts. try Test 2…  For all i and j such that Ti < Tj. by restarting xacts.  WRITE: Make local copies of changes public. 39 40 If Test 2 fails. Ti Ti Tj R V W R V W Tj R V W R V W Does Tj read dirty data? Does Tj overwrite Ti’s writes?  Check correctness: all three types of conflicts.  READ: Xacts read from the database. check that Ti completes before Tj begins. and  Optimistic CC: assumes no conflicts first. W-R. e. check that:  For all i and j such that Ti < Tj. by blocking from (potentially) nonserializable actions. each xact R V W accessing a small (likely non-overlapping) amount of data.

43 8 .  Restart Xacts that fail validation.  Work done so far is wasted.  Scheme for making writes global can reduce clustering of objects. Sequential I/O is unlikely later. Overheads in Optimistic CC  Record read/write activity in ReadSet/WriteSet per Xact. requires clean-up.  Check for conflicts during validation  Code for validation is in a critical section.  Must create and destroy these sets as needed.  Make validated writes ``global’’. and critical section can reduce concurrency.  Starvation may occur.