
The Database Life Cycle (DBLC)

Within larger information systems, the database is subject to a life cycle. It goes hand in hand with the System/Software Development Life Cycle (SDLC): the accompanying diagram pairs the DBLC phases (Database Initial Study, Database Design, Implementation and Loading, Testing and Evaluation, Operation, Maintenance and Evolution) with the SDLC phases of Analysis, Detailed Design, Coding, Testing and Evaluation, and Application program maintenance.

1. Database Initial Study
   a. Analyse the company situation
   b. Define problems and constraints
   c. Define objectives
   d. Define scope and boundaries

2. Database Design
   a. Create the conceptual design
   b. DBMS software selection
   c. Create the logical design
   d. Create the physical design

3. Implementation and Loading
   a. Install the DBMS
   b. Create the database(s)
   c. Load or convert the data

4. Testing and Evaluation
   a. Test the database
   b. Fine-tune the database
   c. Evaluate the database and its application programs

5. Operation
   a. Produce the required information flow

6. Maintenance and Evolution
   a. Introduce changes
   b. Make enhancements

Database Design Strategies

Classic Approaches:

1. Top-down design starts by identifying the data sets, then defines the data elements for each of those sets. The entities are defined before the attributes of each entity.

2. Bottom-up design first identifies the data elements (items), then groups them together into data sets. The attributes are defined first and are then grouped to form the entities.

The design approach used often depends on the scope of the problem and on personal preferences. The two approaches are complementary rather than mutually exclusive.

Database Design Philosophies

The choice of design philosophy is essentially based on the company's structure as well as the scope and size of the system.

1. Centralized design is more productive when the data component is composed of a relatively small number of objects and procedures. The database design can be represented in a fairly simple database. A single conceptual design is completed and then validated.

2. Decentralized design is more appropriate for systems whose data components have a considerable number of entities and complex relations on which complex operations are performed. It is also applied when the problem itself is spread across several operational sites and each element is a subset of the entire data set.

Transaction Management and Concurrency Control

Basically, a transaction is any action that reads from and/or writes to a database. It may be as simple as a single SELECT statement, or it may consist of any combination of SELECTs, INSERTs, UPDATEs, and/or DELETEs.

A transaction is a logical unit of work that must be entirely completed or entirely aborted. A successful transaction changes the database from one consistent state to another. A consistent database state is one in which all data integrity constraints are satisfied.

Thus, to ensure consistency of the database, every transaction must begin with the database in a known consistent state, and the transaction itself must either succeed (thereby changing the database to another consistent state) or be aborted (so the database is reverted to the consistent state it started with). Because of this, all transactions are controlled and executed by the DBMS to guarantee database integrity.

Most transactions are composed of two or more database requests. A database request is the equivalent of a single SQL statement in an application program or transaction.

Transaction Properties

All transactions must display:

• Atomicity - requires that all operations of a transaction be completed; if not, the transaction is aborted.

• Consistency - indicates the permanence of the database's consistent state.

• Isolation - means that the data used during the execution of a transaction cannot be used by a second transaction until the first one is completed. This is particularly important in a multiuser database environment.

• Durability - ensures that once transaction changes are done (committed), they cannot be undone or lost, even in the event of system failure.

• Serializability - ensures that the concurrent execution of several transactions yields consistent results; that is, the concurrent execution of transactions T1, T2, and T3 yields results that appear to have been executed in serial order.
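Atomicity and consistency can be observed directly in any transactional DBMS. The following is a minimal sketch using Python's built-in sqlite3 module; the account table, balances, and the transfer helper are invented for illustration. A transfer either commits in full or is rolled back in full.

```python
import sqlite3

# In-memory database with a simple accounts table (illustrative schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one atomic unit of work."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                     (amount, dst))
        # Enforce a consistency constraint: no negative balances allowed.
        if conn.execute("SELECT MIN(balance) FROM account").fetchone()[0] < 0:
            raise ValueError("integrity constraint violated")
        conn.commit()    # success: database moves to a new consistent state
    except Exception:
        conn.rollback()  # failure: database reverts to its prior consistent state

transfer(conn, "A", "B", 30)    # succeeds
transfer(conn, "A", "B", 999)   # aborts: would drive A's balance negative

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)   # {'A': 70, 'B': 80} - the failed transfer left no trace
```

The second call updates both rows in memory, detects the violated constraint, and rolls back, leaving the database exactly as the first committed transfer left it.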
Transaction Management with SQL

ANSI has defined standards that govern SQL database transactions. Transaction support is provided by two statements: COMMIT and ROLLBACK. ANSI requires that when a transaction sequence is initiated by a user or an application program, the sequence must continue through all succeeding statements until one of the following occurs:

1. A COMMIT statement is reached. Changes are permanently recorded within the database.

2. A ROLLBACK statement is reached. All changes are aborted and the database is returned to the previous consistent state.

3. The end of the program is successfully reached. This is equivalent to a COMMIT.

4. The program is abnormally terminated. This is equivalent to a ROLLBACK.

Transaction Log

A DBMS uses a transaction log to keep track of all transactions that update the database. The information in the log may be used by the DBMS for recovery triggered by a ROLLBACK statement, an abnormal termination, or a system failure.

Concurrency Control

The coordination of the simultaneous execution of transactions in a multiuser database system is known as concurrency control. Its main objective is to ensure the serializability of transactions in a multiuser database environment.

Concurrency control aims to resolve three main problems:

1. Lost Updates
   Without concurrency control, updates may be lost. Consider the following transactions:
   T1: purchase 100 units
   T2: sell 30 units
   Both transactions refer to the same product, which currently has 35 units in stock. T1 and T2 both read that value, so each gets 35. T1 then adds 100 units, making the quantity 135 (35 + 100), while T2 subtracts 30 units, making it 5 (35 - 30). T1 writes its updated value to the database (quantity is 135). T2 then also writes its updated value (quantity is 5). Because T2 writes after T1, T1's update is lost; the only update actually saved is T2's.
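The lost-update anomaly can be reproduced in a few lines by modelling two transactions that each read the quantity before either one writes. This is a plain simulation, not DBMS code; the variable names are illustrative:

```python
# Simulate the lost-update anomaly: both transactions read the same
# starting quantity (35) before either one writes its result back.
quantity = 35

t1_read = quantity          # T1 reads 35
t2_read = quantity          # T2 also reads 35 (no concurrency control)

t1_result = t1_read + 100   # T1: purchase 100 units -> 135
t2_result = t2_read - 30    # T2: sell 30 units      -> 5

quantity = t1_result        # T1 writes 135
quantity = t2_result        # T2 writes 5, overwriting T1's update

print(quantity)             # 5 - T1's purchase of 100 units is lost
# The correct serial result would have been 35 + 100 - 30 = 105.
```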

2. Uncommitted Data
   Using the same transactions and values, consider this scenario:
   T1 reads the quantity as 35.
   T1 adds 100 units, so the quantity = 135.
   T1 writes the new quantity (135).
   T2 then reads the quantity as 135.
   T2 computes the new value (135 - 30 = 105).
   T1 issues a ROLLBACK.
   T2 writes the new quantity (105).
   Because T2 read the uncommitted update of T1, it violated the isolation property of transactions. Once T1 is rolled back, the quantity should be 35, and T2's update should make the final quantity 5, not 105.

3. Inconsistent Retrievals
   Inconsistent retrievals occur when a transaction calculates some summary (aggregate) value over a set of data while other transactions are updating that data. Consider:
   T1: SUM of quantity
   T2: UPDATEs 2 products to correct a typing error; 10 units were added to product A but were actually meant for product AA. The correction should be A - 10 and AA + 10.
   While T1 is summing up the quantities, it may read the values of A and AA after T2 has deducted 10 from A but before it has added 10 to AA. The total would then be off by 10.

Scheduler

The scheduler is a special DBMS program that establishes the order in which the operations within concurrent transactions are executed. The scheduler interleaves the execution of database operations to ensure serializability and isolation of transactions.

Concurrency Control with Locking Methods

A lock guarantees exclusive use of a data item to the current transaction. A transaction acquires a lock prior to data access; the lock is released (unlocked) when the transaction is completed, so that another transaction can lock and use the data item.

Lock granularity indicates the level of lock use. Locking can take place at the following levels:

• Database
  In a database-level lock, the entire database is locked. This prevents the use of any tables in the database by transaction T2 while transaction T1 is being executed. This level is good for batch processing but unsuitable for online multiuser databases.

• Table
  In a table-level lock, the entire table is locked. This prevents the use of any rows from the locked table by T2. However, T2 can still access the same database as long as it accesses a different table.
• Page
  In a page-level lock, the DBMS locks an entire disk page. A disk page, or page, is the equivalent of a disk block, the directly addressable section of a disk. A page has a fixed size, e.g. 4 KB, 8 KB, or 16 KB. If you have to write 73 bytes to a 4 KB page, the entire 4 KB page must be read from disk, updated in memory, and written back to disk. A table can span several pages, and a page can contain several rows of one or more tables.

• Row
  A row-level lock is much less restrictive: it allows concurrent transactions to access different rows of the same table even when the rows are located on the same page. This greatly improves the availability of data but incurs high overhead.

• Field
  A field-level lock allows concurrent transactions to access the same row as long as they require different fields (attributes). This is the most flexible level but is rarely implemented because of the extremely high overhead required.

Lock Types

1. Binary Locks

A binary lock has two states: locked (1) or unlocked (0). If an object (database, table, page, or row) is locked by a transaction, no other transaction can use that object; if it is unlocked, any transaction can lock it for its own use. As a rule, a transaction must unlock the object after its termination. These operations are automatically managed and scheduled by the DBMS, and every DBMS has a default locking mechanism. If the end user wants to override the default, SQL commands are available to do so.

2. Shared/Exclusive Locks

The labels "shared" and "exclusive" indicate the nature of the lock. An exclusive lock exists when access is reserved specifically for the transaction that locked the object; it should be used when the potential for conflict exists. A shared lock exists when concurrent transactions are granted access on the basis of a common lock; it produces no conflict as long as all concurrent transactions are read-only.

A shared lock is issued when a transaction wants to read data and no exclusive lock is held on that data item. An exclusive lock is issued when a transaction wants to update (write) a data item and no locks are currently held on that data item.

A lock can therefore have three states: unlocked, shared (read), and exclusive (write). Still, the mutual exclusion rule applies: only one transaction at a time can own an exclusive lock on the same object.
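A shared/exclusive lock can be sketched as a classic readers-writer structure: any number of shared (read) holders, or exactly one exclusive (write) holder. This is a minimal single-process illustration in Python; the class and method names are my own, not a DBMS API:

```python
import threading

class SharedExclusiveLock:
    """Minimal readers-writer lock: many readers OR one writer."""
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0      # number of shared (read) locks held
        self._writer = False   # True while an exclusive lock is held

    def acquire_shared(self):
        with self._cond:
            while self._writer:            # wait out any exclusive holder
                self._cond.wait()
            self._readers += 1

    def release_shared(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_exclusive(self):
        with self._cond:
            while self._writer or self._readers:   # need sole access
                self._cond.wait()
            self._writer = True

    def release_exclusive(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()

lock = SharedExclusiveLock()
lock.acquire_shared()
lock.acquire_shared()           # a second reader shares access: no conflict
readers_held = lock._readers
lock.release_shared()
lock.release_shared()
lock.acquire_exclusive()        # exclusive lock: sole access
writer_held = lock._writer
lock.release_exclusive()
print(readers_held, writer_held)   # 2 True
```

The two `while` loops encode the issuing rules from the text: a shared lock waits only on an exclusive holder, while an exclusive lock waits until no lock of either kind is held.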
Read/Write Conflict Scenarios:

  T1      T2      Result
  Read    Read    No conflict
  Read    Write   Conflict
  Write   Read    Conflict
  Write   Write   Conflict

Issues with Locks

Although locks prevent serious data inconsistencies, they can lead to two major problems:

• The resulting transaction schedule may not be serializable.

• The schedule may create deadlocks. A database deadlock is caused when two transactions wait for each other to unlock data.

Two-Phase Locking to Ensure Serializability

Two-phase locking (2PL) defines how transactions acquire and relinquish locks. It can ensure serializability, but it cannot prevent deadlocks. It is called two-phase locking because it consists of:

1. A growing phase, in which a transaction acquires all required locks without unlocking any data. Once all locks have been acquired, the transaction is at its locked point.

2. A shrinking phase, in which a transaction releases all locks and cannot obtain any new lock.

Deadlocks

A deadlock occurs when two transactions wait for each other to unlock data. To illustrate:

T1 = access data items X and Y
T2 = access data items Y and X

T1 locks data item X
T2 locks data item Y
T1 waits for T2 to unlock data item Y
T2 waits for T1 to unlock data item X

This scenario is also known as a deadly embrace. In a real-world DBMS, many more transactions can be executed simultaneously, thereby increasing the probability of generating deadlocks.

It is important to note that deadlocks are only possible when one of the transactions wants to obtain an exclusive lock on a data item; no deadlock condition can exist among shared locks.
The three basic techniques to control deadlocks are:

• Deadlock prevention. A transaction requesting a new lock is aborted when there is the possibility that a deadlock could occur. If the transaction is aborted, all changes made by the transaction are rolled back and all its locks are released. The transaction is then rescheduled for execution.

• Deadlock detection. The DBMS periodically tests the database for deadlocks. If a deadlock is found, one of the transactions (the "victim") is aborted and the other transaction continues.

• Deadlock avoidance. The transaction must obtain all of the locks it needs before it can be executed. This technique avoids the rollback of conflicting transactions by requiring that locks be obtained in succession.

Concurrency Control with Time Stamping Methods

The time stamping approach to scheduling concurrent transactions assigns a global, unique time stamp to each transaction. The time stamp value produces an explicit order in which transactions are submitted to the DBMS.

Time stamps must have two properties:

1. Uniqueness. This ensures that no equal time stamp values can exist.

2. Monotonicity. This ensures that time stamp values always increase. (NB: the term monotonicity is part of the standard concurrency control vocabulary.)

All database operations (read and write) within the same transaction must have the same time stamp. The DBMS executes conflicting operations in time stamp order, thereby ensuring serializability of the transactions.

The disadvantage of the time stamping approach is that each value stored in the database requires two additional time stamp fields: one for the last time the field was read and one for the last update. Time stamping thus increases memory requirements and processing overhead. It also tends to demand a lot of system resources, because many transactions may have to be stopped, rescheduled, and restamped.

Wait/Die and Wound/Wait Schemes

When using time stamps to manage concurrent transactions, two schemes can be applied to decide which transaction is rolled back and which continues executing.

Assume T1 has a time stamp of 11548789 and T2 has a time stamp of 19562545, so T1 is the older transaction:

  Requesting  Owning   Wait/Die Scheme              Wound/Wait Scheme
  T1          T2       T1 waits until T2 is         T1 preempts (rolls back) T2;
                       completed and T2 releases    T2 is rescheduled using the
                       its locks                    same time stamp
  T2          T1       T2 dies (rolls back);        T2 waits until T1 is
                       T2 is rescheduled using      completed and T1 releases
                       the same time stamp          its locks

In both schemes, transactions may wait for lock requests, and that waiting may cause deadlocks. To prevent this, each lock request has an associated time-out value: if the lock is not granted before the time-out expires, the transaction is rolled back.
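The table reduces to a simple rule on time stamps: under wait/die, an older requester waits and a younger requester dies; under wound/wait, an older requester wounds (preempts) the owner and a younger requester waits. A small sketch (the function name and return strings are my own):

```python
def resolve(requester_ts, owner_ts, scheme):
    """Decide the fate of a lock request from the two transactions'
    time stamps. A smaller time stamp means an older transaction."""
    older_requester = requester_ts < owner_ts
    if scheme == "wait/die":
        # Older requester waits; younger requester dies (rolls back and
        # is rescheduled later with its original time stamp).
        return "requester waits" if older_requester else "requester dies"
    elif scheme == "wound/wait":
        # Older requester preempts (wounds) the owner; younger waits.
        return "owner preempted" if older_requester else "requester waits"
    raise ValueError("unknown scheme")

# T1 (ts=11548789) is older than T2 (ts=19562545), as in the table above.
print(resolve(11548789, 19562545, "wait/die"))    # requester waits
print(resolve(19562545, 11548789, "wait/die"))    # requester dies
print(resolve(11548789, 19562545, "wound/wait"))  # owner preempted
print(resolve(19562545, 11548789, "wound/wait"))  # requester waits
```

Note that in both schemes it is always the younger transaction that is rolled back, which is what guarantees the schemes themselves cannot livelock forever.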
Database Recovery Management

Database recovery restores a database from a given state, usually inconsistent, to a previously consistent state. Recovery techniques are based on the atomic transaction property: all portions of the transaction must be treated as a single, logical unit of work in which all operations are applied and completed in order to produce a consistent database.

Levels of backup:

• A full backup, or dump, of the database

• A differential backup of the database, in which only the modifications made since a previous backup copy are copied

• A transaction log backup, which backs up only the transaction log operations that are not reflected in a previous backup copy of the database

The database backup is stored in a secure place, usually in a different building, and is protected against dangers such as fire, theft, flood, and other potential calamities. The backup's existence guarantees database recovery following system (hardware/software) failures.

Causes / Factors of system failures:

• Software
  o Software-induced failures are often traced back to the OS, the DBMS software, application programs, or viruses.

• Hardware
  o May include memory chip errors, disk crashes, bad disk sectors, and disk-full errors.

• Programming exceptions
  o Application programs or end users may roll back transactions when certain conditions are detected.

• Transactions
  o The system detects deadlocks and aborts one of the transactions.
• External Factors
  o Fire, earthquake, flood, or other natural disasters.

Concepts that affect the recovery process:

• The write-ahead-log protocol. This ensures that transaction logs are always written before any database data are actually updated, so that, in case of failure, the database can later be recovered to a consistent state using the data in the transaction log.

• Redundant transaction logs. Several copies of the transaction log ensure that a copy is available even when there is a physical disk failure.

• Database buffers. A buffer is a temporary storage area in primary memory used to speed up disk operations. To improve processing time, the DBMS reads data from the physical disk and stores a copy in a buffer. When a transaction updates data, it updates the copy in the buffer. Later, all buffers that contain updated data are written to the physical disk in a single operation.

• Database checkpoints. A database checkpoint is an operation in which the DBMS writes all of its updated buffers to disk. While the checkpoint runs, the DBMS does not execute any other requests. Checkpoint operations are registered in the transaction log.

Transaction Recovery

The database recovery process involves bringing the database to a consistent state after a failure. Transaction recovery procedures generally make use of deferred-write and write-through techniques.

When the recovery procedure uses deferred-write (deferred-update), transaction operations do not immediately update the physical database. Instead, only the transaction log is updated. The database is physically updated only after the transaction reaches its commit point, using the transaction log information. If the transaction aborts before it reaches its commit point, no changes (no ROLLBACK or undo) need to be made to the database because the database was never updated. The recovery process for all transactions started and committed before the failure follows these steps:

1. Identify the last checkpoint in the transaction log. This is the last time transaction data were physically saved to disk.

2. For a transaction that started and committed before the last checkpoint, nothing needs to be done because the data are already saved.

3. For a transaction that performed a COMMIT operation after the last checkpoint, the DBMS uses the transaction log records to redo the transaction and update the database, using the "after" values in the transaction log. The changes are made in ascending order, from oldest to newest.

4. For any transaction that had a ROLLBACK operation after the last checkpoint, or that was left active (with neither a COMMIT nor a ROLLBACK) before the failure occurred, nothing needs to be done because the database was never updated.
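The deferred-write steps can be sketched as a replay over a simple transaction log. The log record layout and field names here are invented for illustration; real DBMS logs carry far more detail:

```python
# Each log record: (transaction_id, operation, item, after_value).
# A "checkpoint" record marks when buffers were last flushed to disk.
log = [
    ("T1", "update", "X", 10),
    ("T1", "commit", None, None),
    (None, "checkpoint", None, None),
    ("T2", "update", "Y", 20),
    ("T2", "commit", None, None),
    ("T3", "update", "Z", 30),   # T3 never committed before the crash
]

def recover(log, database):
    # Step 1: find the last checkpoint in the log.
    last_cp = max(i for i, rec in enumerate(log) if rec[1] == "checkpoint")
    tail = log[last_cp + 1:]
    committed = {rec[0] for rec in tail if rec[1] == "commit"}
    # Steps 2-4: redo only the transactions that committed after the
    # checkpoint, applying "after" values oldest to newest. Everything
    # else needs no work under deferred write, since the database was
    # never physically updated by uncommitted transactions.
    for tid, op, item, after in tail:
        if op == "update" and tid in committed:
            database[item] = after
    return database

db = {"X": 10}              # X was already flushed at the checkpoint
print(recover(log, db))     # {'X': 10, 'Y': 20} - T3's change is ignored
```

T1 needs nothing (committed before the checkpoint), T2 is redone from its "after" value, and T3 is skipped entirely, exactly mirroring steps 2 through 4.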

When the recovery procedure uses write-through (immediate update), the database is updated by transaction operations during the transaction's execution, even before the transaction reaches its commit point. If the transaction aborts before it reaches its commit point, a ROLLBACK (undo) operation is needed to restore the database to a consistent state. In that case, the ROLLBACK operation uses the "before" values in the transaction log. The recovery process follows these steps:

1. Identify the last checkpoint in the transaction log. This is the last time transaction data were physically saved to disk.

2. For a transaction that started and committed before the last checkpoint, nothing needs to be done because the data are already saved.

3. For a transaction that committed after the last checkpoint, the DBMS redoes the transaction, using the "after" values in the transaction log. Changes are applied in ascending order, from oldest to newest.

4. For any transaction that had a ROLLBACK operation after the last checkpoint, or that was left active (with neither a COMMIT nor a ROLLBACK) before the failure occurred, the DBMS uses the transaction log records to roll back (undo) the operations, using the "before" values in the transaction log. Changes are applied in reverse order, from newest to oldest.
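Under write-through, recovery must also undo uncommitted work using the "before" values, newest to oldest. A companion sketch to the deferred-write one above (again, the log format is invented for illustration):

```python
# Each record: (transaction_id, operation, item, before_value, after_value).
log = [
    ("T1", "update", "X", 1, 10),
    ("T1", "commit", None, None, None),
    ("T2", "update", "Y", 2, 20),   # T2 was still active at the crash
]

def recover(log, database):
    committed = {rec[0] for rec in log if rec[1] == "commit"}
    # Redo committed work oldest-to-newest using the "after" values.
    for tid, op, item, before, after in log:
        if op == "update" and tid in committed:
            database[item] = after
    # Undo uncommitted work newest-to-oldest using the "before" values.
    for tid, op, item, before, after in reversed(log):
        if op == "update" and tid not in committed:
            database[item] = before
    return database

db = {"X": 10, "Y": 20}     # write-through: both updates already hit disk
print(recover(log, db))     # {'X': 10, 'Y': 2} - T2 is rolled back
```

Unlike the deferred-write case, T2's update is already on disk, so the "before" value is essential to undo it; this is why the write-ahead-log protocol insists the log is written before the data.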
