
CoSc 265

Fundamentals of Database Management Systems
CHAPTER 7
Data Protection
Introduction
 Database Recovery
 Concurrency
 Data Security

Data Recovery
Introduction
 Recovery algorithms
 Recovery concepts
◦ Write-ahead logging
◦ In-place versus shadow updates
◦ Rollback
◦ Deferred update
◦ Immediate update
 Certain recovery techniques best used with
specific concurrency control methods

Database Recovery
Purpose of Database Recovery
◦ To bring the database into the last consistent state, which
existed prior to the failure.
◦ To restore the database to the most recent consistent
state just before the failure.
◦ To preserve transaction properties (Atomicity,
Consistency, Isolation and Durability).
 Example:
◦ If the system crashes before a fund transfer transaction
completes its execution, then one or both
accounts may have an incorrect value. Thus, the database
must be restored to the state before the transaction
modified any of the accounts.

Types of Failure
The database may become unavailable for use due to:

 Computer/system failure
 A hardware or software error occurs during transaction execution. The system may fail because
of an addressing error, application error, operating system fault, RAM failure, etc.
 Transaction failure
 A transaction may fail because of an operation error such as integer overflow or division
by zero.
 Local errors or exception conditions
 Certain conditions necessitate cancellation of the transaction, such as an insufficient account
balance in a banking database.
 Concurrency control enforcement
 A transaction may be aborted because of incorrect input, deadlock, or incorrect
synchronization.
 Media failure: disk head crash, power disruption, etc.
 Physical problems and catastrophes

Recovery Concepts
 Recovery process restores database to the
most recent consistent state before time of
failure
 Information kept in system log
 Typical recovery strategies
◦ Restore backed-up copy of database
 Best in cases of extensive damage
◦ Identify any changes that may cause inconsistency
 Best in cases of non catastrophic failure
 Some operations may require redo

Recovery Concepts (cont’d.)
 Deferred update techniques
◦ Do not physically update the database until after
transaction commits
◦ Undo is not needed; redo may be needed
 Immediate update techniques
◦ Database may be updated by some operations of
a transaction before it reaches commit point
◦ Operations also recorded in log
◦ Recovery still possible

Recovery Concepts (cont’d.)
 Undo and redo operations required to be
idempotent
◦ Executing operations multiple times equivalent to
executing just once
◦ Entire recovery process should be idempotent
 Caching (buffering) of disk blocks
◦ DBMS cache: a collection of in-memory buffers
◦ Cache directory keeps track of which database
items are in the buffers
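
As a minimal sketch of the idempotence requirement above, the following Python fragment treats a log record as a tuple holding the before-image and after-image of an item. The record layout and the names redo and undo are assumptions made for illustration, not the interface of any particular DBMS.

```python
# Hypothetical log record layout: (transaction, item, before_image, after_image)

def redo(db, rec):
    """Reapply a logged write by installing the after-image."""
    _, item, _, after_image = rec
    db[item] = after_image

def undo(db, rec):
    """Reverse a logged write by restoring the before-image."""
    _, item, before_image, _ = rec
    db[item] = before_image

db = {"X": 100}
rec = ("T1", "X", 100, 150)

redo(db, rec)
redo(db, rec)              # repeating the redo changes nothing further
once_or_twice = db["X"]

undo(db, rec)
undo(db, rec)              # repeating the undo changes nothing further
```

Both operations assign an absolute value instead of applying a change such as "add 50", so recovery can be restarted after a crash during recovery without corrupting the item.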

Recovery Concepts (cont’d.)
 Cache buffers replaced (flushed) to make
space for new items
 Dirty bit associated with each buffer in the
cache
◦ Indicates whether the buffer has been modified
 Contents written back to disk before flush if
dirty bit equals one
 Pin-unpin bit
◦ Page is pinned if it cannot be written back to disk
yet

Recovery Concepts (cont’d.)
 Main strategies
◦ In-place updating
 Writes the buffer to the same original disk location
 Overwrites old values of any changed data items
◦ Shadowing
 Writes an updated buffer at a different disk location, to
maintain multiple versions of data items
 Not typically used in practice
 Before-image: old value of data item
 After-image: new value of data item

Recovery Concepts (cont’d.)
 Write-ahead logging
◦ Ensure the before-image (BFIM) is recorded
◦ Appropriate log entry flushed to disk
◦ Necessary for UNDO operation if needed
 UNDO-type log entries
 REDO-type log entries

Recovery Concepts (cont’d.)
 Steal/no-steal and force/no-force
◦ Specify rules that govern when a page from the
database cache can be written to disk
 No-steal approach
◦ Cache buffer page updated by a transaction
cannot be written to disk before the transaction
commits
 Steal approach
◦ Recovery protocol allows writing an updated
buffer before the transaction commits

Recovery Concepts (cont’d.)
 Force approach
◦ All pages updated by a transaction are
immediately written to disk before the
transaction commits
◦ Otherwise, no-force
 Typical database systems employ a steal/no-
force strategy
◦ Avoids need for very large buffer space
◦ Reduces disk I/O operations for heavily updated
pages

Recovery Concepts (cont’d.)
 Write-ahead logging protocol for recovery
algorithm requiring both UNDO and REDO
◦ BFIM of an item cannot be overwritten by its after
image until all UNDO-type log entries have been
force-written to disk
◦ Commit operation of a transaction cannot be
completed until all REDO-type and UNDO-type
log records for that transaction have been force-
written to disk
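
The two write-ahead logging rules above can be sketched roughly as follows. The class WAL, its in-memory lists standing in for the log buffer and the log on disk, and the method names are all assumptions for illustration. A real recovery manager tracks log sequence numbers per page and is considerably more involved.

```python
class WAL:
    """Toy model: Python lists and dicts stand in for the log and the disk."""

    def __init__(self):
        self.log_buffer = []   # log records still in main memory
        self.log_disk = []     # log records already force-written
        self.disk = {}         # database items on disk
        self.cache = {}        # item -> (value, index of its latest log record)

    def write_item(self, txn, item, new_value):
        old_value = self.cache.get(item, (self.disk.get(item), None))[0]
        # One record carries both the BFIM (for UNDO) and the AFIM (for REDO).
        self.log_buffer.append((txn, item, old_value, new_value))
        lsn = len(self.log_disk) + len(self.log_buffer) - 1
        self.cache[item] = (new_value, lsn)

    def force_log(self):
        self.log_disk.extend(self.log_buffer)
        self.log_buffer.clear()

    def flush_item(self, item):
        value, lsn = self.cache[item]
        if lsn is not None and lsn >= len(self.log_disk):
            self.force_log()   # rule 1: the log record reaches disk before the data
        self.disk[item] = value

    def commit(self, txn):
        self.log_buffer.append((txn, "commit", None, None))
        self.force_log()       # rule 2: commit completes only once the log is on disk

wal = WAL()
wal.write_item("T1", "X", 5)
log_before_flush = list(wal.log_disk)   # still empty: nothing forced yet
wal.flush_item("X")                     # forces the log first, then writes X
wal.commit("T1")
```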

Checkpoints in the System Log and Fuzzy Checkpointing

 Taking a checkpoint
◦ Suspend execution of all transactions temporarily
◦ Force-write all main memory buffers that have been
modified to disk
◦ Write a checkpoint record to the log, and force-write
the log to the disk
◦ Resume executing transactions
 DBMS recovery manager decides on checkpoint interval
 Fuzzy checkpointing
◦ System can resume transaction processing after a
begin_checkpoint record is written to the log
◦ Previous checkpoint record maintained until
end_checkpoint record is written

Transaction Rollback
 Transaction failure after update but before
commit
◦ Necessary to roll back the transaction
◦ Old data values restored using undo-type log
entries
 Cascading rollback
◦ If transaction T is rolled back, any transaction S
that has read value of item written by T must also
be rolled back
◦ Almost all recovery mechanisms designed to
avoid this
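
To make the cascading rollback rule above concrete, here is a small sketch that computes which transactions must be rolled back when one fails. The function name and the reads_from representation (pairs of reader and writer) are hypothetical, and the sketch assumes none of the listed transactions has committed yet.

```python
def cascading_rollback(failed, reads_from):
    """Return every transaction that must roll back when `failed` rolls back.

    reads_from is a set of (reader, writer) pairs, meaning that `reader`
    read an item written by `writer`.
    """
    to_rollback = {failed}
    changed = True
    while changed:                      # repeat until no new transaction is added
        changed = False
        for reader, writer in reads_from:
            if writer in to_rollback and reader not in to_rollback:
                to_rollback.add(reader)
                changed = True
    return to_rollback

# T2 read from T1 and T3 read from T2, so rolling back T1 cascades to both.
victims = cascading_rollback("T1", {("T2", "T1"), ("T3", "T2"), ("T5", "T4")})
```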
Figure 22.1 Illustrating cascading rollback (a process that never occurs in strict or cascadeless schedules) (a) The read and write operations of three transactions (b) System log at point of crash (c) Operations before the crash
Transactions that Do Not Affect the Database

 Example actions: generating and printing
messages and reports
 If transaction fails before completion, may not
want user to get these reports
◦ Reports should be generated only after
transaction reaches commit point
 Commands that generate reports issued as
batch jobs executed only after transaction
reaches commit point
◦ Batch jobs canceled if transaction fails

NO-UNDO/REDO Recovery Based on Deferred Update
 Deferred update concept
◦ Postpone updates to the database on disk until the transaction
completes successfully and reaches its commit point
◦ Redo-type log entries are needed
◦ Undo-type log entries not necessary
◦ Can only be used for short transactions and transactions that
change few items
 Buffer space an issue with longer transactions
 Deferred update protocol
◦ Transaction cannot change the database on disk until it reaches
its commit point
 All buffers changed by the transaction must be pinned until the
transaction commits (no-steal policy)
◦ Transaction does not reach its commit point until all its REDO-
type log entries are recorded in log and log buffer is force-
written to disk
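
A rough sketch of NO-UNDO/REDO recovery under deferred update is given below. The log record shapes ("write", txn, item, after_image) and ("commit", txn) are assumptions for illustration. Since uncommitted transactions never reached the disk, their log entries are simply ignored.

```python
def recover_deferred(disk, log):
    """Redo the writes of committed transactions in log order; ignore the rest."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, txn, item, after_image = rec
            disk[item] = after_image    # REDO: install the after-image
    return disk

log = [
    ("write", "T1", "A", 10),
    ("write", "T2", "B", 20),
    ("commit", "T1"),
    ("write", "T2", "A", 99),           # T2 never committed before the crash
]
disk = recover_deferred({"A": 1, "B": 2}, log)
```

Only REDO-type information (the after-image) appears in the log records, which matches the point above that UNDO-type entries are not necessary with this technique.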

NO-UNDO/REDO Recovery Based on Deferred Update (cont’d.)

Figure 22.2 An example of a recovery timeline to illustrate the effect of checkpointing
Recovery Techniques Based on Immediate Update

 Database can be updated immediately
◦ No need to wait for transaction to reach commit
point
◦ Not a requirement that every update be
immediate
 UNDO-type log entries must be stored
 Recovery algorithms
◦ UNDO/NO-REDO (steal/force strategy)
◦ UNDO/REDO (steal/no-force strategy)

Figure 22.3 An example of recovery using deferred update with concurrent transactions (a) The READ and WRITE operations of four transactions (b) System log at the point of crash
Shadow Paging
 No log required in a single-user environment
◦ Log may be needed in a multiuser environment
for the concurrency control method
 Shadow paging considers disk to be made of
n fixed-size disk pages
◦ Directory with n entries is constructed
◦ When transaction begins executing, directory
copied into shadow directory to save while
current directory is being used
◦ Shadow directory is never modified

Shadow Paging (cont’d.)
 New copy of the modified page created and
stored elsewhere
◦ Current directory modified to point to new disk
block
◦ Shadow directory still points to old disk block
 Failure recovery
◦ Discard current directory
◦ Free modified database pages
◦ NO-UNDO/NO-REDO technique
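
The directory mechanics described above can be sketched for a single-user setting as follows. The class ShadowPagedDB and its method names are hypothetical, and a dictionary stands in for the disk blocks.

```python
class ShadowPagedDB:
    def __init__(self, pages):
        self.blocks = dict(enumerate(pages))     # disk block number -> contents
        self.current = list(range(len(pages)))   # page i -> disk block number
        self.shadow = None

    def begin(self):
        self.shadow = list(self.current)         # saved copy, never modified

    def write(self, page, data):
        if self.current[page] == self.shadow[page]:
            # First write to this page: place the new copy in a fresh block.
            self.current[page] = max(self.blocks) + 1
        self.blocks[self.current[page]] = data

    def read(self, page):
        return self.blocks[self.current[page]]

    def commit(self):
        self._free(old=self.shadow, keep=self.current)
        self.shadow = None

    def recover(self):
        # Discard the current directory and free the modified pages.
        self._free(old=self.current, keep=self.shadow)
        self.current, self.shadow = self.shadow, None

    def _free(self, old, keep):
        for block in set(old) - set(keep):
            del self.blocks[block]

db = ShadowPagedDB(["a", "b"])
db.begin()
db.write(0, "A2")
db.recover()                 # failure: page 0 is back to its old contents
after_failure = db.read(0)
db.begin()
db.write(1, "B2")
db.commit()
```

No old value is ever overwritten and no new value needs to be reapplied, which is why the technique is classed as NO-UNDO/NO-REDO.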

Shadow Paging (cont’d.)

Figure 22.4 An example of shadow paging

Database Backup and Recovery from Catastrophic Failures

 Database backup
◦ Entire database and log periodically copied onto
inexpensive storage medium
◦ Latest backup copy can be reloaded from disk in case
of catastrophic failure
 Backups often moved to physically separate
locations
◦ Subterranean storage vaults
 Backup system log at more frequent intervals
and copy to magnetic tape
◦ System log smaller than database
 Can be backed up more frequently
◦ Benefit: users do not lose all transactions since last
database backup
Summary
 Main goal of recovery
◦ Ensure atomicity property of a transaction
 Caching
 In-place updating versus shadowing
 Before and after images of data items
 UNDO and REDO operations
 Deferred versus immediate update
 Shadow paging
 Catastrophic failure recovery

Concurrency
Introduction
 Concurrency control protocols
◦ Set of rules to guarantee serializability
 Two-phase locking protocols
◦ Lock data items to prevent concurrent access
 Timestamp
◦ Unique identifier for each transaction

Two-Phase Locking Techniques for Concurrency Control

 Lock
◦ Variable associated with a data item describing
status for operations that can be applied
◦ One lock for each item in the database
 Binary locks
◦ Two states (values)
 Locked (1)
 Item cannot be accessed
 Unlocked (0)
 Item can be accessed when requested

Two-Phase Locking Techniques for Concurrency Control
(cont’d.)
 Transaction requests access by issuing a
lock_item(X) operation

Figure 21.1 Lock and unlock operations for binary locks


Two-Phase Locking Techniques for Concurrency Control
(cont’d.)
 Lock table specifies items that have locks
 Lock manager subsystem
◦ Keeps track of and controls access to locks
◦ Rules enforced by lock manager module
 At most one transaction can hold the lock on an item
at a given time
 Binary locking too restrictive for database items
 Shared/exclusive or read/write locks
◦ Read operations on the same item are not conflicting
◦ Must have exclusive lock to write
◦ Three locking operations
 read_lock(X)
 write_lock(X)
 unlock(X)
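
The three locking operations listed above might be modeled along these lines. This is a simplified sketch: the class LockTable is hypothetical, a refused request returns False instead of placing the transaction in a waiting queue, and the operation names mirror the slide.

```python
class LockTable:
    def __init__(self):
        self.locks = {}   # item -> ("read", {holders}) or ("write", {holder})

    def read_lock(self, txn, item):
        mode, holders = self.locks.get(item, ("read", set()))
        if mode == "write":
            return holders == {txn}    # allowed only if txn itself holds the write lock
        holders.add(txn)               # any number of readers may share the item
        self.locks[item] = ("read", holders)
        return True

    def write_lock(self, txn, item):
        _, holders = self.locks.get(item, ("read", set()))
        if holders - {txn}:
            return False               # some other transaction holds a lock
        self.locks[item] = ("write", {txn})   # also covers upgrading a read lock
        return True

    def unlock(self, txn, item):
        _, holders = self.locks.get(item, ("read", set()))
        holders.discard(txn)
        if not holders:
            self.locks.pop(item, None)

lt = LockTable()
shared_ok = lt.read_lock("T1", "X") and lt.read_lock("T2", "X")
write_refused = not lt.write_lock("T1", "X")   # T2 still reads X
lt.unlock("T2", "X")
upgraded = lt.write_lock("T1", "X")
```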

Figure 21.2 Locking and unlocking operations for two-mode (read/write, or shared/exclusive) locks
Two-Phase Locking Techniques for Concurrency Control
(cont’d.)
 Lock conversion
◦ Transaction that already holds a lock allowed to
convert the lock from one state to another
 Upgrading
◦ Issue a read_lock operation then a write_lock
operation
 Downgrading
◦ Issue a read_lock operation after a write_lock
operation

Guaranteeing Serializability by Two-Phase Locking

 Two-phase locking protocol
◦ All locking operations precede the first unlock
operation in the transaction
◦ Phases
 Expanding (growing) phase
 New locks can be acquired but none can be released
 Lock conversion upgrades must be done during this phase
 Shrinking phase
 Existing locks can be released but none can be acquired
 Downgrades must be done during this phase
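
A quick way to check the basic rule above is to scan a transaction's operations and confirm that no lock request follows the first unlock. The function name and the string encoding of operations are assumptions for illustration, and the sketch does not check the placement of lock conversions.

```python
def obeys_two_phase_locking(operations):
    """operations: strings such as 'read_lock(Y)', 'write_item(X)', 'unlock(Y)'."""
    shrinking = False
    for op in operations:
        if op.startswith("unlock"):
            shrinking = True           # the shrinking phase has begun
        elif op.startswith(("read_lock", "write_lock")) and shrinking:
            return False               # a lock acquired after an unlock
    return True

# Hypothetical transaction that unlocks Y before locking X.
not_2pl = ["read_lock(Y)", "read_item(Y)", "unlock(Y)",
           "write_lock(X)", "read_item(X)", "write_item(X)", "unlock(X)"]

# The same work with every lock acquired before the first unlock.
is_2pl = ["read_lock(Y)", "read_item(Y)", "write_lock(X)", "unlock(Y)",
          "read_item(X)", "write_item(X)", "unlock(X)"]
```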

Figure 21.3 Transactions that do not obey two-phase locking (a) Two transactions T1 and T2 (b) Results of possible serial schedules of T1 and T2 (c) A nonserializable schedule S that uses locks
Guaranteeing Serializability by Two-Phase Locking

 If every transaction in a schedule follows the
two-phase locking protocol, the schedule is
guaranteed to be serializable
 Two-phase locking may limit the amount of
concurrency that can occur in a schedule
 Some serializable schedules will be prohibited
by two-phase locking protocol

Variations of Two-Phase Locking
 Basic 2PL
◦ Technique described on previous slides
 Conservative (static) 2PL
◦ Requires a transaction to lock all the items it
accesses before the transaction begins
 Predeclare read-set and write-set
◦ Deadlock-free protocol
 Strict 2PL
◦ Transaction does not release exclusive locks until
after it commits or aborts

Variations of Two-Phase Locking (cont’d.)

 Rigorous 2PL
◦ Transaction does not release any locks until after
it commits or aborts
 Concurrency control subsystem responsible
for generating read_lock and write_lock
requests
 Locking generally considered to have high
overhead

Dealing with Deadlock and Starvation
 Deadlock
◦ Occurs when each transaction T in a set is waiting
for some item locked by some other transaction
T’
◦ Both transactions stuck in a waiting queue

Figure 21.5 Illustrating the deadlock problem (a) A partial schedule of T1′ and T2′ that is in a state of deadlock (b) A wait-for graph for the partial schedule in (a)
Dealing with Deadlock and Starvation (cont’d.)
 Deadlock prevention protocols
◦ Every transaction locks all items it needs in advance
◦ Ordering all items in the database
 Transaction that needs several items will lock them in that order
◦ Both approaches impractical
 Protocols based on a timestamp
◦ Wait-die
◦ Wound-wait
 No waiting algorithm
◦ If transaction unable to obtain a lock, immediately aborted
and restarted later
 Cautious waiting algorithm
◦ Deadlock-free
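
The slide names wait-die and wound-wait without stating their rules. As commonly defined, both compare the timestamp of the transaction requesting a lock with that of the transaction holding it, where a smaller timestamp means an older transaction. The function names and return strings below are illustrative only.

```python
def wait_die(ts_requester, ts_holder):
    """An older requester waits; a younger requester is aborted ("dies")."""
    return "wait" if ts_requester < ts_holder else "abort requester"

def wound_wait(ts_requester, ts_holder):
    """An older requester aborts ("wounds") the holder; a younger one waits."""
    return "abort holder" if ts_requester < ts_holder else "wait"
```

In both schemes only the younger transaction is ever aborted, and it is usually restarted with its original timestamp so that it eventually becomes the oldest.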

Dealing with Deadlock and Starvation (cont’d.)
 Deadlock detection
◦ System checks to see if a state of deadlock exists
◦ Wait-for graph
 Victim selection
◦ Deciding which transaction to abort in case of
deadlock
 Timeouts
◦ If a transaction waits longer than a predefined time, the
system aborts it
 Starvation
◦ Occurs if a transaction cannot proceed for an
indefinite period of time while other transactions
continue normally
◦ Solution: first-come-first-served queue
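
Deadlock detection with a wait-for graph, as mentioned above, amounts to looking for a cycle. A minimal depth-first search sketch follows; the function name and the dictionary representation of the graph are assumptions for illustration.

```python
def has_deadlock(wait_for):
    """wait_for maps each transaction to the set of transactions it waits for."""
    visiting, done = set(), set()

    def visit(t):
        if t in done:
            return False
        if t in visiting:
            return True                # reached a transaction on the current path: cycle
        visiting.add(t)
        found = any(visit(u) for u in wait_for.get(t, ()))
        visiting.discard(t)
        done.add(t)
        return found

    return any(visit(t) for t in wait_for)

deadlocked = has_deadlock({"T1": {"T2"}, "T2": {"T1"}})
no_deadlock = has_deadlock({"T1": {"T2"}, "T2": set()})
```

Once a cycle is found, victim selection decides which transaction in the cycle to abort.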
Concurrency Control Based on Timestamp Ordering
 Timestamp
◦ Unique identifier assigned by the DBMS to identify a transaction
◦ Assigned in the order submitted
◦ Transaction start time
 Concurrency control techniques based on timestamps do
not use locks
◦ Deadlocks cannot occur
 Generating timestamps
◦ Counter incremented each time its value is assigned to a
transaction
◦ Current date/time value of the system clock
 Ensure no two timestamps are generated during the same tick of the
clock
 General approach
◦ Enforce equivalent serial order on the transactions based on
their timestamps

Concurrency Control Based on Timestamp Ordering
 Timestamp ordering (TO)
◦ Allows interleaving of transaction operations
◦ Must ensure timestamp order is followed for each pair of
conflicting operations
 Each database item assigned two timestamp values
◦ read_TS(X)
◦ write_TS(X)
 Basic TO algorithm
◦ If conflicting operations detected, later operation rejected
by aborting transaction that issued it
◦ Schedules produced guaranteed to be conflict serializable
◦ Starvation may occur
 Strict TO algorithm
◦ Ensures schedules are both strict and conflict serializable
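
The checks performed by basic TO, as usually stated, can be sketched as follows. The class Item and the functions to_read and to_write are hypothetical names. A return value of False stands for rejecting the operation and aborting the issuing transaction.

```python
class Item:
    def __init__(self):
        self.read_ts = 0    # largest timestamp of a transaction that read the item
        self.write_ts = 0   # timestamp of the transaction that last wrote the item

def to_read(item, ts):
    if item.write_ts > ts:
        return False        # a younger transaction has already written the item
    item.read_ts = max(item.read_ts, ts)
    return True

def to_write(item, ts):
    if item.read_ts > ts or item.write_ts > ts:
        return False        # a younger transaction has already read or written it
    item.write_ts = ts
    return True

X = Item()
first_write = to_write(X, 5)    # accepted, write_TS(X) becomes 5
late_read = to_read(X, 3)       # rejected: transaction 3 is older than the writer
later_read = to_read(X, 7)      # accepted, read_TS(X) becomes 7
late_write = to_write(X, 6)     # rejected: transaction 7 has already read X
```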

Concurrency Control Based on Timestamp Ordering

 Thomas’s write rule
◦ Modification of basic TO algorithm
◦ Does not enforce conflict serializability
◦ Rejects fewer write operations by modifying
checks for write_item(X) operation
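
As the rule is usually stated, the modified check for write_item(X) looks roughly like this. The function name and the returned strings are assumptions for illustration.

```python
def thomas_write(read_ts, write_ts, ts):
    """Decide what to do with write_item(X) issued by a transaction with timestamp ts."""
    if read_ts > ts:
        return "abort"      # a younger transaction has already read X
    if write_ts > ts:
        return "ignore"     # outdated write: skip it and let the transaction continue
    return "write"          # perform the write and set write_TS(X) to ts
```

The "ignore" case is the one that basic TO would have rejected by aborting the transaction, which is why fewer writes are rejected here.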

Summary
 Concurrency control techniques
◦ Two-phase locking
◦ Timestamp-based ordering

Data Security
Outline
Database Security and Authorization
 Introduction to Database Security Issues
 Types of Security
 Database Security and DBA
 Access Protection, User Accounts, and Database Audits
Discretionary Access Control Based on
Granting and Revoking Privileges
 Types of Discretionary Privileges
 Specifying Privileges Using Views
 Revoking Privileges
 Propagation of Privileges Using the GRANT OPTION
 Specifying Limits on Propagation of Privileges
Introduction to Database Security Issues
Database
 A collection of data or information
Security
 The state of being free from danger or injury

Database Security:
◦ Securing databases against a variety of threats.
◦ Protecting the database from any kind of damage
or attack.
◦ Denying database access privileges to
unauthorized users.
Introduction to Database Security Issues
Types of Security
 Legal and ethical issues
regarding the right to access certain information.
 Policy issues
at the governmental, institutional, or corporate level as
to what kinds of information should not be made
publicly available
 System-related issues
the system levels at which various security functions
should be enforced.
 The need to identify multiple security levels
and to categorize the data and users based on these
classifications, for example top secret, secret,
confidential, and unclassified
Introduction to Database Security Issues
Threats to databases
 Loss of confidentiality
Database confidentiality refers to the protection of data from
unauthorized disclosure
 Loss of integrity
Database integrity refers to the requirement that information be
protected from improper modification
 Loss of availability
Database availability refers to making objects available to a human
user or a program to which they have a legitimate right
To protect databases against these types of
threats, four kinds of countermeasures can
be implemented:
 Access control
 Inference control
 Flow control
 Encryption
Introduction to Database Security Issues
 A DBMS typically includes a database security and
authorization subsystem that is responsible for ensuring the
security of portions of a database against unauthorized access.

 Two types of database security mechanisms:
 Discretionary security mechanisms
These are used to grant privileges to users, including the capability to
access specific data files, records, or fields in a specified mode (such as read,
insert, delete, or update).
 Mandatory security mechanisms
These are used to enforce multilevel security by classifying the data and
users into various security classes (or levels) and then implementing the
appropriate security policy of the organization.

 The security mechanism of a DBMS must include provisions for
restricting access to the database as a whole.
 This function is called access control and is handled by creating
user accounts and passwords to control the login process of the
DBMS.
Introduction to Database Security Issues
 Another security problem associated with databases is that
of controlling the access to a statistical database,
which is used to provide statistical information or
summaries of values based on various criteria.

 The countermeasures to the statistical database security
problem are called inference control measures.

 Another security issue is that of flow control, which prevents
information from flowing in such a way that it reaches
unauthorized users.

 Channels that are pathways for information to flow
implicitly in ways that violate the security policy of an
organization are called covert channels.
Introduction to Database Security Issues
 A final security issue is data encryption,
which is used to protect sensitive data (such
as credit card numbers) that is being
transmitted via some type of communication
network.
 The data is encoded using some encoding
algorithm.
◦ An unauthorized user who accesses encoded data
will have difficulty deciphering it, but authorized
users are given decoding or decrypting
algorithms (or keys) to decipher the data.
Database Security and the DBA
 The database administrator (DBA) is the central authority for
managing a database system.
 The DBA’s responsibilities include
granting privileges to users who need to use the system
classifying users and data in accordance with the policy of the
organization
 The DBA is responsible for the overall security of the database
system.
 The DBA has a DBA account in the DBMS
 This is sometimes called a system or superuser account
 This account provides powerful capabilities such as:
1. Account creation
2. Privilege granting
3. Privilege revocation
4. Security level assignment
 Action 1 is access control, whereas actions 2 and 3 are discretionary and
action 4 is used to control mandatory authorization
Access Control, User Accounts, and Database Audits
 Whenever a person or group of persons needs to access a
database system, the individual or group must first apply
for a user account.
 The DBA will then create a new account id and password
for the user if he/she deems there is a legitimate need to
access the database
 The user must log in to the DBMS by entering account id
and password whenever database access is needed.
 The database system must also keep track of all
operations on the database that are applied by a certain
user throughout each login session.
 To keep a record of all updates applied to the database and of
the particular user who applied each update, we can modify the
system log so that each entry also identifies the user account. The log
already includes an entry for each operation applied to the database
that may be required for recovery from a transaction failure or system crash.
Access Control, User Accounts, and Database Audits

 If any tampering with the database is
suspected, a database audit is performed
◦ A database audit consists of reviewing the log to
examine all accesses and operations applied to
the database during a certain time period.
 A database log that is used mainly for
security purposes is sometimes called an
audit trail.
Discretionary Access Control Based on Granting
and Revoking Privileges
The typical method of enforcing
discretionary access control in a
database system is based on the granting
and revoking of privileges.

Types of Discretionary Privileges


 The account level:
At this level, the DBA specifies the particular privileges
that each account holds independently of the relations
in the database.
 The relation level (or table level):
At this level, the DBA can control the privilege to
access each individual relation or view in the database.
Types of Discretionary Privileges
 The privileges at the account level apply
to the capabilities provided to the account
itself and can include
◦ the CREATE SCHEMA or CREATE TABLE privilege,
to create a schema or base relation;
◦ the CREATE VIEW privilege;
◦ the ALTER privilege, to apply schema changes such as adding
or removing attributes from relations;
◦ the DROP privilege, to delete relations or views;
◦ the MODIFY privilege, to insert, delete, or update tuples;
◦ and the SELECT privilege, to retrieve information from
the database by using a SELECT query.
Types of Discretionary Privileges
 The second level of privileges applies to the
relation level
◦ This includes base relations and virtual (view) relations.
 The granting and revoking of privileges
generally follow an authorization model for
discretionary privileges known as the access
matrix model where
◦ The rows of a matrix M represent subjects (users, accounts,
programs)
◦ The columns represent objects (relations, records, columns,
views, operations).
◦ Each position M(i,j) in the matrix represents the types of
privileges (read, write, update) that subject i holds on object j.
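
The access matrix model above can be pictured as a mapping from (subject, object) pairs to sets of privileges. The class AccessMatrix and the account and relation names below are hypothetical. A real DBMS exposes this through GRANT and REVOKE statements instead.

```python
from collections import defaultdict

class AccessMatrix:
    def __init__(self):
        self.m = defaultdict(set)   # M(i, j): privileges subject i holds on object j

    def grant(self, subject, obj, privilege):
        self.m[(subject, obj)].add(privilege)

    def revoke(self, subject, obj, privilege):
        self.m[(subject, obj)].discard(privilege)

    def allowed(self, subject, obj, privilege):
        return privilege in self.m.get((subject, obj), set())

M = AccessMatrix()
M.grant("A1", "EMPLOYEE", "read")
M.grant("A1", "EMPLOYEE", "update")
M.grant("A2", "EMPLOYEE", "read")
M.revoke("A1", "EMPLOYEE", "update")
```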
Types of Discretionary Privileges
 To control the granting and revoking of
relation privileges, each relation R in a
database is assigned an owner account,
which is typically the account that was used
when the relation was created in the first
place.
◦ The owner of a relation is given all privileges on that relation.
◦ In SQL2, the DBA can assign an owner to a whole schema
by creating the schema and associating the appropriate
authorization identifier with that schema, using the CREATE
SCHEMA command.
◦ The owner account holder can pass privileges on any of the
owned relations to other users by granting privileges to their
accounts.
Types of Discretionary Privileges
In SQL the following types of privileges can be
granted on each individual relation R:
 SELECT (retrieval or read) privilege on R:
Gives the account retrieval privilege.
In SQL this gives the account the privilege to use the SELECT
statement to retrieve tuples from R.
 MODIFY privileges on R:
This gives the account the capability to modify tuples of R.
In SQL this privilege is further divided into UPDATE, DELETE,
and INSERT privileges to apply the corresponding SQL command
to R.
In addition, both the INSERT and UPDATE privileges can specify
that only certain attributes can be updated by the account.
 REFERENCES privilege on R:
This gives the account the capability to reference relation
R when specifying integrity constraints.
The privilege can also be restricted to specific attributes
of R.
Types of Discretionary Privileges
Notice that to create a view, the account must
have SELECT privilege on all relations involved
in the view definition.

Specifying Privileges Using Views


The mechanism of views is an important
discretionary authorization mechanism in its
own right. For example,
 If the owner A of a relation R wants another account B to be able to
retrieve only some fields of R, then A can create a view V of R that
includes only those attributes and then grant SELECT on V to B.
 The same applies to limiting B to retrieving only certain tuples of R; a
view V’ can be created by defining the view by means of a query that
selects only those tuples from R that A wants to allow B to access.
End of Chapter Seven
Thank You
