You are on page 1of 61

Log-Based Recovery Schemes

If you are going to be in the logging


business, one of the things that you
have to do is to learn about heavy
equipment.
Robert VanNatta,
Logging History of
Columbia County
CS3223 – Crash Recovery 1
Integrity or consistency constraints
• Predicates/constraints data must satisfy, e.g.
• x is key of relation R
• x  y holds in R
• Domain(x) = {Red, Blue, Green}
• no employee should make more than twice the average
salary
• Definitions
• Consistent state: satisfies all constraints
• Consistent DB: DB in consistent state

CS3223 – Crash Recovery 2


Observation:
DB cannot always be consistent!
Example: Transfer 100 from a2 to a10

a2  a2 - 100
a10  a10 + 100

. . .
. . .
a2 500 400 400
. . .
. . .
a10 1000 1000 1100

CS3223 – Crash Recovery 3


Transaction: collection of actions that
preserve consistency
Consistent DB T Consistent DB’

If T starts with a consistent state + T executes in


isolation (and absence of errors)
 T leaves a consistent state

CS3223 – Crash Recovery 4


Reasons for failures
• Transaction failures
• Logical errors, deadlocks
• System crash
• Power failures, operating system bugs etc
• Memory data lost
• Disk failure
• Disk Read-Write Head crashes
STABLE STORAGE: Data is never lost. Can approximate by using RAID
and maintaining geographically distant copies of the data
but if earthquake everything will still be lost
CS3223 – Crash Recovery 5
Key problem: Unfinished transaction

Example Transfer fund from A to B


T1: A  A - 100
B  B + 100

CS3223 – Crash Recovery 6


T1: Read (A);
A  A-100
Write (A);
Read (B);
B  B+100
Write (B);

A: 800
A: 800
B: 800
B: 800
memory disk
CS3223 – Crash Recovery 7
T1: Read (A);
A  A-100
Write (A);
Read (B);
B  B+100
Write (B);

A: 800 700
A: 800
B: 800
B: 800
memory disk
CS3223 – Crash Recovery 8
T1: Read (A);
A  A-100
Write (A);
Read (B);
B  B+100
Updated A value is written to disk.
Write (B); This may be triggered ANYTIME
by explicit command or DBMS or OS

A: 800 700
A: 800 700
B: 800
B: 800
memory disk
CS3223 – Crash Recovery 9
T1: Read (A);
A  A-100
Write (A);
Read (B);
B  B+100
Write (B);

A: 800 700
A: 800 700
B: 800 900
B: 800
memory disk
CS3223 – Crash Recovery 10
T1: Read (A);
A  A-100
Write (A);
Read (B);
B  B+100
Failure before commit
Write (B);
(memory content lost
before disk updated)!
A: 800 700
A: 800 700
B: 800 900
B: 800
memory disk
CS3223 – Crash Recovery 11
What is the disk content before
the crash?
Disk not updated yet
A = 700; B = 800?
A = 800; B = 700?
Disk fully updated
A: 700 A = 800; B = 800?
Disk partially updated
B: 800
disk
Need atomicity: execute all
actions of a transaction or
none at all
CS3223 – Crash Recovery 12
Recovery Manager
• Recovery Manager guarantees atomicity and durability
properties of Xacts
• Undo: remove effects of aborted Xact to preserve atomicity
• Redo: re-installing effects of committed Xact for durability
• Processes three operations:
• Commit(T) - install T’s updated “pages” into disk
• Abort(T) - restore all data that T updated to their prior values
• Restart - recover database to a consistent state from system failure
• abort all active Xacts at the time of system failure
• installs updates of all committed Xacts
need to remember old values
• Desirable properties:
• Add little overhead to the normal processing of Xacts
• Recover quickly from a failure

CS3223 – Crash Recovery 13


Interaction Between Recovery and Buffer
Managers: Dirty pages in buffer pool
• Can a dirty page updated by Xact T be written to disk
before T commits?
•yesYes policy
-> steal need to 
steal->policy need old
remember to remember
value old value
no -> no steal policy
• No  no steal policy
• Must all dirty pages that are updated by Xact T be
written to disk when T commits?
• Yes
yes policy
-> force force policy
no -> no force policy -> need to remember the new value
• No  no force policy  need to remember the new value

CS3223 – Crash Recovery 14


Recovery schemes: Design options

• Four possible design options


Force No-force

Steal Undo & no redo Undo & redo

No steal No undo & no redo No undo & redo

No-steal policy  No undo


Force policy  No redo

Which is the best solution??


CS3223 – Crash Recovery 15
Log-based Recovery
• Log (aka trail/journal): history of actions executed by
DBMS remembering old and new values
• Contains a log record for each write, commit, & abort
• Each log record has a unique identifier called Log
Sequence Number (LSN)
• Log is stored as a sequential file of records in stable
storage log should never be lost

• LSN of log record = address of log record

CS3223 – Crash Recovery 16


UNDO AND NO REDO
One Solution: Undo logging
(Immediate modification/Steal-Force)
T1: Read (A); A  A-100
Write (A); Undo log: <TID, Object, oldValue>
Read (B); B  B+100 (not showing LSN)
Write (B);

<T1, start>

A:800 A:800
B:800 B:800

memory disk Log (Stable)


CS3223 – Crash Recovery 17
One Solution: Undo logging
(Immediate modification)
T1: Read (A); A  A-100
Write (A); Undo log: <TID, Object, oldValue>
Read (B); B  B+100
Write (B);

<T1, start>

A:800 700 A:800


B:800 900 B:800

memory disk log


CS3223 – Crash Recovery 18
One Solution: Undo logging
(Immediate modification)
T1: Read (A); A  A-100
Write (A); Undo log: <TID, Object, oldValue>
Read (B); B  B+100
Write (B);

<T1, start>
<T1, A, 800>
A:800 700 A:800
B:800 900 B:800

memory disk log


CS3223 – Crash Recovery 19
One Solution: Undo logging
(Immediate modification)
T1: Read (A); A  A-100
Write (A);
Read (B); B  B+100
Write (B);

<T1, start>
<T1, A, 800>
A:800 700 A:800 700
B:800 900 B:800

memory disk log


CS3223 – Crash Recovery 20
One Solution: Undo logging
(Immediate modification)
T1: Read (A); A  A-100
Write (A);
Read (B); B  B+100
Write (B);

<T1, start>
<T1, A, 800>
A:800 700 A:800 700 <T1, B, 800>
B:800 900 B:800

memory disk log


CS3223 – Crash Recovery 21
One Solution: Undo logging
(Immediate modification)
T1: Read (A); A  A-100
Write (A);
Read (B); B  B+100
Write (B);

<T1, start>
<T1, A, 800>
A:800 700 A:800 700 <T1, B, 800>
B:800 900 B:800 900

memory disk log


CS3223 – Crash Recovery 22
One Solution: Undo logging
(Immediate modification)
T1: Read (A); A  A-100 yes forcing -> upon commit, all
updates written to disk
Write (A);
Read (B); B  B+100
Write (B);

<T1, start>
<T1, A, 800>
A:800 700 A:800 700 <T1, B, 800>
B:800 900 B:800 900 <T1, commit>

memory disk log


CS3223 – Crash Recovery 23
inefficient, alot of random writes to disk.
Complications
• Log is first written in memory

memory A: 800
B: 800 DB
A: 800 700
B: 800 900
Log: Log
<T1,start>
<T1, A, 800>
<T1, B, 800>

CS3223 – Crash Recovery 24


Complications
• Log is first written in memory

memory A: 800 700


B: 800 DB BAD STATE
A: 800 700
B: 800 900 #1
Log: Log Failure occurs after
<T1,start> partial updates on
<T1, A, 800> disk but before log
<T1, B, 800> is written to disk

CS3223 – Crash Recovery 25


Complications
• Log is first written in memory

memory A: 800 700


B: 800 DB BAD STATE
A: 800 700
B: 800 900 #1
Log: Log Failure occurs after
<T1,start> partial updates on
<T1, A, 800> disk but before log
<T1, B, 800> is written to disk

This means log record for A must be on log disk


before A can be updated on data disk (DB)
CS3223 – Crash Recovery 26
Complications
• Log is first written in memory
• Updates are not written to disk on every action
memory A: 800
B: 800 DB
A: 800 700
B: 800 900
Log: Log
<T1,start> <T1, B, 800>
<T1, A, 800> <T1, commit>
<T1, B, 800>
<T1, commit>

CS3223 – Crash Recovery 27


Complications
• Log is first written in memory
• Updates are not written to disk on every action
memory A: 800 700
B: 800 DB BAD STATE
A: 800 700
B: 800 900 #2
Log: Log All logs are on
disk (including
<T1,start> <T1, B, 800> commit log) but
<T1, A, 800> <T1, commit> only partial updates
<T1, B, 800> on disk.

CS3223 – Crash Recovery 28


Complications
• Log is first written in memory
• Updates are not written to disk on every action
memory A: 800 700
B: 800 DB BAD STATE
A: 800 700
B: 800 900 #2
Log: Log All logs are on
disk (including
<T1,start> <T1, B, 800> commit log) but
<T1, A, 800> <T1, commit> only partial updates
<T1, B, 800> on disk.
Before COMMIT log is written to Log, all updates
must be on disk (DB)
CS3223 – Crash Recovery force policy 29
Undo logging rules
(1) For every action generate undo log record
(containing old value)
(2) Before x is modified on disk, log record
pertaining to x must be on disk (write ahead
logging: WAL) write the log ahead of the actual modification to rem old val
(3) Before commit is flushed to log, all writes of
transaction must be reflected on disk

CS3223 – Crash Recovery 30


Undo Logging
T1: Read (A); A  A-100
Write (A);
Read (B); B  B+100
Write (B);

A: 800 700
B: 800 900
Log: A: 800
<T1,start> B: 800
<T1, A, 800>
<T1, B, 800>
log
CS3223 – Crash Recovery 31
Undo Logging
T1: Read (A); A  A-100
Write (A);
Read (B); B  B+100
Write (B);

A: 800 700
B: 800 900 <T1, Start>
Log: A: 800 <T1, A, 800>
<T1,start> B: 800 <T1, B, 800>
<T1, A, 800>
<T1, B, 800>
log
CS3223 – Crash Recovery 32
Undo Logging
T1: Read (A); A  A-100
Write (A);
Read (B); B  B+100
Write (B);

A: 800 700
B: 800 900 <T1, Start>
Log: A: 800 700 <T1, A, 800>
<T1,start> B: 800 <T1, B, 800>
<T1, A, 800>
<T1, B, 800>
log
CS3223 – Crash Recovery 33
Undo Logging
T1: Read (A); A  A-100
Write (A);
Read (B); B  B+100
Write (B);

A: 800 700
B: 800 900 <T1, Start>
Log: A: 800 700 <T1, A, 800>
<T1,start> B: 800 900 <T1, B, 800>
<T1, A, 800>
<T1, B, 800>
log
CS3223 – Crash Recovery 34
Undo Logging
T1: Read (A); A  A-100
Write (A);
Read (B); B  B+100
Write (B);

A: 800 700
B: 800 900 <T1, Start>
Log: A: 800 700 <T1, A, 800>
<T1,start> B: 800 900 <T1, B, 800>
<T1, A, 800> <T1, Commit>
<T1, B, 800>
<T1, commit> log
CS3223 – Crash Recovery 35
Recovery rules: Undo logging
(1)Let S = set of transactions with <Ti, start> in log, but
no <Ti, commit> (or <Ti, abort>) record in log
• What about those with Commit/Abort? commited transactions
dont care since updates
alr on disk
(2) For each <Ti, X, v> in log,
aborted txs
in reverse order (latest  earliest) do: have already
been undone

- if Ti  S then - X  v
- Update disk
(3) For each Ti  S do
- write <Ti, abort> to log
CS3223 – Crash Recovery 36
What if failure during recovery?
No problem! Undo is idempotent

CS3223 – Crash Recovery 37


Redo logging (deferred
modification/no-steal-no-force)
• In UNDO logging, we remember only the “old” value
• How about remembering only the “new” (updated)
values instead?
• Log record of the form <TID, object, newValue>
• What does this mean?
• NO old values, so NO updates must be written to disk
until a transaction commits!
• All updates have to be buffered in memory!
• Can also store dirty pages in temporary disk storage but very
inefficient
CS3223 – Crash Recovery 38
Redo logging (deferred modification)
T1: Read(A); A  A-100; write (A);
Read(B); B  B+100; write (B);

A: 800 A: 800
B: 800 B: 800
memory DB LOG
Redo log: <TID, Object, newValue>
CS3223 – Crash Recovery 39
Redo logging (deferred modification)
T1: Read(A); A  A-100; write (A);
Read(B); B  B+100; write (B);

A: 800 700 A: 800


B: 800900 B: 800
memory DB LOG

CS3223 – Crash Recovery 40


Redo logging (deferred modification)
T1: Read(A); A  A-100; write (A);
Read(B); B  B+100; write (B);

<T1, start>
<T1, A, 700>
A: 800 700 A: 800 <T1, B, 900>
B: 800900 B: 800 <T1, commit>

memory DB LOG

CS3223 – Crash Recovery 41


Redo logging (deferred modification)
T1: Read(A); A  A-100; write (A);
Read(B); B  B+100; write (B);

<T1, start>
<T1, A, 700>
A: 800 700 output A: 800 700 <T1, B, 900>
B: 800900 B: 800 <T1, commit>

memory DB LOG

CS3223 – Crash Recovery 42


Redo logging rules
(1) For every action, generate redo log record
(containing new value)
(2) Before X is modified on disk (DB), ALL
log records for transaction that modified X
(including commit) must be on disk
no urgency to ensure that data dirty page has to be written to disk.

CS3223 – Crash Recovery 43


Recovery rules: Redo logging
(1) Let S = set of transactions with
<Ti, commit> in log
(2) For each <Ti, X, v> in log, in forward
order (earliest  latest) do:
- if Ti  S then X  v
go ALL the way back
Update X on disk

Redo is also idempotent

CS3223 – Crash Recovery 44


Key drawbacks:
• Undo logging (steal/force)
• increase the number of disk I/Os
• Redo logging (no-steal/no-force)
• need to keep all modified blocks in memory until
commit inefficient use of space

CS3223 – Crash Recovery 45


Another Solution: undo/redo logging!
Update  <TID, object, newValue, oldValue>
page X

Rules:
1) Page X can be flushed before or after Ti commit
2) Log record flushed before corresponding updated page (WAL)
3) All log records flushed at commit

This is adopted in IBM DB2 – known as the


Aries Recovery Manager
CS3223 – Crash Recovery 46
Recovery process:

• Backwards pass
• construct set S of committed transactions
• undo actions of transactions not in S
• Forward pass
• redo actions of S transactions

CS3223 – Crash Recovery 47


Checkpointing

CS3223 – Crash Recovery 48


Recovery can be very, very SLOW !
Redo log:

... ... ...

First T1 wrote A,B Last


Crash
Record Committed a year ago Record
(1 year ago) --> STILL, Need to redo after crash!!

CS3223 – Crash Recovery 49


Solution: Checkpoint (simple version)
Periodically:
(1) Do not accept new transactions Tx freeze
(2) Wait until all (active) transactions finish
non-ideal
(3) Flush all log records to disk (log)
(4) Flush all buffers to disk (DB)
(5) Write “checkpoint” record on disk (log)
(6) Resume transaction processing
database now considered to be "new"

CS3223 – Crash Recovery 50


Example: what to do at recovery?
Redo log (disk): <T1,commit>

<T2,commit>
<T1,A,16>

<T3,C,21>
<T2,B,17>
Checkpoint Crash
... ... ... ... ... ...

No need to examine log records before the most recent Checkpoint

CS3223 – Crash Recovery 51


Non-quiescent Checkpoint
• Processing continues in the midst of
checkpointing
• No blocking of newly arrived transactions

too frequent - lots of overhead


rarely - log file very long

tradeoff

CS3223 – Crash Recovery 52


Non-quiescent Checkpoint: Undo Log
Only uncommitted transactions
started from this point onwards
need to be undone
L Start-ckpt
end ...
O ... active TR: ...
ckpt
T1,T2,... Tk
G
if theres no end, means that
cp did not happen successfully
Can be discarded Wait till all of
when checkpoint T1,. . . , Tk
completes commit or abort,
successfully but do not prohibit
other transactions
guaranteed T1-Tk completed
successfully, same for everything from starting
else before
CS3223 – Crash Recovery 53
Non-quiescent Checkpoint: Redo Log
We do not need to look further even if they started b4
start-cp
back than the earliest of the Start
Need to redo transactions that
of the active transactions T1… Tk
committed from this point
onwards (including T1… Tk)
L Start-ckpt
end ...
O ... active TR: ...
ckpt
T1,T2,... Tk
G
Committed transactions’ Write to disk all database elements that
updates all reflected on were written to buffers but not yet to disk
disk. So, no need to redo by transactions that had already committed
them. when the START CKPT record was written
to the log
CS3223 – Crash Recovery if commited tx have dirty pages that are not flushed 54
out, flush them out
Non-quiesce checkpoint
(Undo/Redo logging)
Committed transactions’
updates all reflected on Need to redo actions of transactions
disk. So, no need to redo committed from this point
them. onwards. No need to redo
actions of T1… Tk (if committed)
before Start-Ckpt.
L Start-ckpt
end
O ... active TR: ... ... need to undo
actions after start
T1,T2,... ckpt cp that have not been
G commited.
...

for Undo All dirty buffer pages prior to


(because Start-Ckpt are flushed without
dirty pages disrupting runtime operations
are flushed) write to disk, even if not completed. since
CS3223 – Crash Recovery we have undo value in case it crashes. 55
Examples: what to do at recovery time?
no T1 commit
L
T1,- Ckpt Ckpt T 1-
O ... ... ... ...
a T1 end b
G

 Undo T1 (undo a,b)

CS3223 – Crash Recovery 56


Example

L
O T1 ckpt-s T1 ckpt- T1 T1
... ... T1 ... ... ... ... ...
G a b end c cmt

 Redo T1: (redo b,c)


if crash happens after commit, only need to redo b and c,
because those before cp start are guaranteed to be flushed to disk

CS3223 – Crash Recovery 57


Media failure
(Loss of non-volatile storage)

A: 16

Solution: Make copies of data!


stateless storage. read from 1 copy but write to m copies
hopefully most copies will survive when theres a failure

CS3223 – Crash Recovery 58


Triple Modular Redundancy
• Keep 3 copies on separate disks
• Output(X) --> three outputs
• Input(X) --> three inputs + vote

X1 X2 X3

CS3223 – Crash Recovery 59


DB Dump + Log

backup active
database database
log

• If active database is lost,


– restore active database from backup
– bring up-to-date using redo entries in log

CS3223 – Crash Recovery 60


Summary
• Consistency of data
• Two sources of problems:
• Failures
• Logging
• Redundancy
• Data sharing
• Concurrency Control
• Log-based recovery mechanisms
• Undo, Redo, Undo/Redo
• What about No Undo/No Redo???
no recovery needed. if failure, restart
not used in practice because of high overhead.
crashes are uncommon anyway, not worth the overhead
CS3223 – Crash Recovery 61

You might also like