
UNIT NO 3

1. Explain different features of good relational database design

 The database should be strong enough to store all the relevant data and
requirements.
 It should be possible to relate the tables in the database by means of a
relationship; for example, an employee works for a department, so the employee is
related to a particular department. We should be able to define such a relationship
between any two entities in the database.
 Multiple users should be able to access the same database without affecting
one another. For example, several teachers can work on a database to update
learners' marks at the same time, and each teacher should be able to update the
marks for their own subject without modifying other subjects' marks.
 A single database provides different views to different users, depending on
their roles. In a school database, for example, teachers can see the breakdown of
learners' marks, whereas parents can see only their own child's report, so the
parents' access would be read-only. At the same time, teachers have access to all
the learners' information and assessment details with modification rights. All of
this happens in the same database.
 Data integrity refers to how accurate and consistent the data in a database
is. A database with a lot of missing or incorrect information is said to have low
data integrity.
 Data independence refers to the separation between the data and the
application (or applications) in which it is used. This allows you to update the
data used by your application (such as fixing a spelling mistake) without having to
recompile the entire application.
 Data redundancy refers to having the exact same data in different places in
the database. Redundancy increases the size of the database, creates integrity
problems, decreases efficiency, and leads to anomalies. Data should be stored so
that it is not repeated in multiple tables.
 Data security refers to how well the data in the database is protected from
crashes, hacks, and accidental deletion.
 Data maintenance refers to the monthly, daily, or hourly tasks that are run to
fix errors within a database and prevent anomalies from occurring. Database
maintenance not only fixes errors but also detects potential errors and prevents
future errors from occurring.
2. Explain what is normalization? Explain with example
requirements of Third Normal Form
Normalization is the process of organizing the data in the database.
o Normalization is used to minimize redundancy in a relation or set of relations.
It is also used to eliminate undesirable characteristics such as insertion,
update, and deletion anomalies.
o Normalization divides larger tables into smaller ones and links them using
relationships.
o The normal forms are used to reduce redundancy in the database tables.
Why do we need Normalization?
The main reason for normalizing relations is to remove these anomalies. Failure
to eliminate anomalies leads to data redundancy and can cause data integrity and
other problems as the database grows. Normalization consists of a series of
guidelines that help you create a good database structure.
Third Normal Form:
A relation is in third normal form if it is in second normal form and there is no
transitive dependency for non-prime attributes.
A relation is in 3NF if at least one of the following conditions holds in every
non-trivial functional dependency X -> Y:
 X is a super key.
 Y is a prime attribute (each element of Y is part of some candidate key).
In other words, a relation that is in First and Second Normal Form, and in which no
non-primary-key attribute is transitively dependent on the primary key, is in Third
Normal Form (3NF).

Note – If A->B and B->C are two FDs, then A->C is called a transitive dependency.

The normalization of 2NF relations to 3NF involves the removal of transitive
dependencies. If a transitive dependency exists, we remove the transitively
dependent attribute(s) from the relation by placing the attribute(s) in a new
relation along with a copy of the determinant.

Consider the examples given below.


Example-1:
In relation STUDENT given in Table 4,

FD set:
{STUD_NO -> STUD_NAME, STUD_NO -> STUD_STATE, STUD_STATE ->
STUD_COUNTRY, STUD_NO -> STUD_AGE}

Candidate Key:
{STUD_NO}

For this relation in Table 4, STUD_NO -> STUD_STATE and STUD_STATE ->
STUD_COUNTRY hold. So STUD_COUNTRY is transitively dependent on STUD_NO. This
violates the third normal form. To convert it to third normal form, we decompose
the relation STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_COUNTRY,
STUD_AGE) as:

STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_AGE)


STATE_COUNTRY (STATE, COUNTRY)
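The decomposition above can be sketched with `sqlite3` from the Python standard library. The table and column names follow the example; the sample row values (a student "Asha" from state "MH" in "India") are made up purely for illustration.

```python
# Hypothetical sketch of the 3NF decomposition above, using sqlite3.
# Sample data values are illustrative, not from the notes.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# After decomposition, STUDENT no longer stores STUD_COUNTRY directly;
# the country is reachable through STUD_STATE via STATE_COUNTRY.
cur.execute("""CREATE TABLE STATE_COUNTRY (
    STATE TEXT PRIMARY KEY,
    COUNTRY TEXT NOT NULL)""")
cur.execute("""CREATE TABLE STUDENT (
    STUD_NO INTEGER PRIMARY KEY,
    STUD_NAME TEXT,
    STUD_PHONE TEXT,
    STUD_STATE TEXT REFERENCES STATE_COUNTRY(STATE),
    STUD_AGE INTEGER)""")

cur.execute("INSERT INTO STATE_COUNTRY VALUES ('MH', 'India')")
cur.execute("INSERT INTO STUDENT VALUES (1, 'Asha', '99', 'MH', 20)")

# STUD_COUNTRY is now recovered with a join instead of being stored
# redundantly in every STUDENT row.
row = cur.execute("""SELECT s.STUD_NAME, sc.COUNTRY
                     FROM STUDENT s JOIN STATE_COUNTRY sc
                     ON s.STUD_STATE = sc.STATE""").fetchone()
```

Because country now lives in one place, fixing a wrong state-to-country mapping is a single-row update rather than an update to every matching student row.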

3. One of the rules designed by Codd for a good relational
database management system is integrity independence, which
states that all integrity constraints can be independently modified
without the need for any change in the application. Justify the
significance of this rule in relational database management
A database must be independent of the application that uses it. All its integrity
constraints can be independently modified without the need of any change in the
application. This rule makes a database independent of the front-end application and
its interface.
When we use SQL to put data into tables, the DBMS must guarantee integrity
independence. The integrity of the data should not be reliant on any external
component or application; the constraints are stored and enforced in the database
itself. This also helps make each front-end application DB-independent.
4. Explain 3NF and BCNF. Also enlist their differences.
A relation is in 3NF if it is in 2NF and no non-key attribute of the relation is
transitively dependent on the primary key.
3NF prohibits transitive dependencies.
In short, 3NF means:
1. The relation should be in 2NF.
2. There should not be any transitive dependency.
A table is said to be in third normal form when:
i. It is in the second normal form (i.e. it does not have partial functional
dependency).
ii. It does not have transitive dependency.
A relational schema R is in third normal form with respect to a set F of
functional dependencies if, for all functional dependencies in F+ of the form
α → β, where α ⊆ R and β ⊆ R, at least one of the following holds:
 α → β is a trivial functional dependency.
 α is a superkey for R.
 Each attribute A in β − α is contained in a candidate key for R.
BCNF: A relation is in BCNF if, for every non-trivial functional dependency
α → β that holds on it, α is a superkey. BCNF is stricter than 3NF: 3NF
additionally allows α → β when every attribute of β − α is a prime attribute, so
every BCNF relation is in 3NF, but not vice versa.

5. Consider the following relation R(A, B, C, D, E, F) and FDs
A->BC, C->A, D->E, F->A, E->D. Is the decomposition of R into
R1(A, C, D), R2(B, C, D) and R3(E, F, D) lossless? Check for
losslessness.
Solution:
Step 1: R1 ∪ R2 ∪ R3 = R. The first condition for a lossless join is satisfied,
as (A,C,D) ∪ (B,C,D) ∪ (E,F,D) = {A,B,C,D,E,F}, which is exactly R.
Step 2: Consider R1 ∩ R2 = {C,D} and R2 ∩ R3 = {D}. Hence the second condition,
that the intersections are not empty, is satisfied.
Step 3: Now consider R1(A, C, D) and R2(B, C, D). We find R1 ∩ R2 = {C,D} and
(CD)+ = {A,B,C,D,E} ⊇ {A,C,D}, the attributes of R1. Hence condition 3 for a
lossless join of R1 and R2 is satisfied.
Step 4: Now consider R2(B, C, D) and R3(E, F, D). We find R2 ∩ R3 = {D} and
(D)+ = {D,E}, which contains the complete attribute set of neither R2 nor R3.
(Note that F, an attribute of R3, is missing.)
Hence it is not a lossless join decomposition; in other words, it is a lossy
decomposition.
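The attribute-closure computations used in the steps above can be checked mechanically. A minimal sketch, with the FDs and sub-relations taken exactly from the question:

```python
# Attribute-closure helper to verify the lossless-join check above.
def closure(attrs, fds):
    """Return the closure of a set of attributes under a set of FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole LHS is determined, its RHS is determined too.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

fds = [({'A'}, {'B', 'C'}), ({'C'}, {'A'}), ({'D'}, {'E'}),
       ({'F'}, {'A'}), ({'E'}, {'D'})]
R1, R2, R3 = {'A', 'C', 'D'}, {'B', 'C', 'D'}, {'E', 'F', 'D'}

# R1 ∩ R2 = {C, D}; its closure covers all of R1, so R1 joins R2 losslessly.
cd_plus = closure(R1 & R2, fds)
# R2 ∩ R3 = {D}; its closure covers neither R2 nor R3 fully -> lossy.
d_plus = closure(R2 & R3, fds)
```

Running this confirms Step 3 ((CD)+ = {A,B,C,D,E}) and Step 4 ((D)+ = {D,E}, with F of R3 unreachable).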

6. What is Normalization? Give types of normalization

Normalization is the process of organizing data into related tables; it also
eliminates redundancy and increases integrity, which improves query performance.
To normalize a database, we divide the database into tables and establish
relationships between the tables.

Database normalization can essentially be defined as the practice of optimizing
table structures. Optimization is accomplished through a thorough investigation of
the various pieces of data that will be stored within the database, concentrating
in particular upon how this data is interrelated.

Normalization avoids:
Duplication of Data – the same data is listed in multiple rows of the database.
Insert Anomaly – a record about an entity cannot be inserted into the table
without first inserting information about another entity (e.g. a customer cannot
be entered without a sales order).
Delete Anomaly – a record cannot be deleted without deleting a record about a
related entity (e.g. a sales order cannot be deleted without deleting all of the
customer's information).
Update Anomaly – information cannot be updated without changing it in many
places (e.g. to update customer information, it must be updated for each sales
order the customer has placed).

First Normal Form (1st NF)

In 1NF:
 The table cells must hold a single value.
 Eliminate repeating groups in individual tables.
 Create a separate table for each set of related data.
 Identify each set of related data with a primary key.
Definition: An entity is in first normal form if it contains no repeating groups.
In relational terms, a table is in first normal form if it contains no repeating
columns. Repeating columns make your data less flexible, waste disk space, and
make it more difficult to search for data.

Note: In a 1NF relation, the order of tuples (rows) and attributes (columns) does
not matter.
 
Example
 
Order  Customer  Contact Person  Total
1      Rishabh   Manish          134.23
2      Preeti    Rohan           521.24
3      Rishabh   Manish          1042.42
4      Rishabh   Manish          928.53
 
The above relation satisfies the properties of relation and is said to be in first normal
form (or 1NF). Conceptually it is convenient to have all the information in one relation
since it is then likely to be easier to query the database.
 
Second Normal Form (2nd NF)
 
In 2NF:
 Remove partial dependencies.
 Functional dependency: the value of one attribute in a table is determined
entirely by the value of another.
 Partial dependency: a type of functional dependency where an attribute is
functionally dependent on only part of the primary key (the primary key must be a
composite key).
 Create a separate table with the functionally dependent data and the part of
the key on which it depends. The tables created at this step will usually contain
descriptions of resources.
Definition: A relation is in 2NF if it is in 1NF and every non-key attribute is
fully dependent on each candidate key of the relation.
 
Example
 
The following relation is not in Second Normal Form:
 
Order  Customer  Contact Person  Total
1      Rishabh   Manish          134.23
2      Preeti    Rohan           521.24
3      Rishabh   Manish          1042.42
4      Rishabh   Manish          928.53
 
In the table above, the order number serves as the primary key. Notice that the
customer and total amount are dependent upon the order number — this data is
specific to each order. However, the contact person is dependent upon the
customer, not the order. The way to resolve this is to create two tables:
 
Customer  Contact Person
Rishabh   Manish
Preeti    Rohan

Order  Customer  Total
1      Rishabh   134.23
2      Preeti    521.24
3      Rishabh   1042.42
4      Rishabh   928.53
 
The creation of two separate tables eliminates the dependency problem. In the first
table, contact person is dependent upon the primary key -- customer name. The
second table only includes the information unique to each order. Someone interested
in the contact person for each order could obtain this information by performing a
Join Operation.
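The Join Operation mentioned above can be sketched over the sample data. A minimal illustration only; the dict/list representation is an assumption for the sketch, not how a DBMS stores tables.

```python
# Recombining the two 2NF tables above with a join on Customer.
contact = {"Rishabh": "Manish", "Preeti": "Rohan"}   # Customer -> Contact Person
orders = [(1, "Rishabh", 134.23), (2, "Preeti", 521.24),
          (3, "Rishabh", 1042.42), (4, "Rishabh", 928.53)]

# For each order, look up the customer's contact person: this reproduces
# the original un-normalized table without storing the contact redundantly.
joined = [(order_no, cust, contact[cust], total)
          for (order_no, cust, total) in orders]
```

Updating Rishabh's contact person is now a single change to `contact`, and every joined row reflects it, which is exactly the update anomaly the decomposition removes.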
 
Third Normal Form (3rd NF)
 
In 3NF:
 Remove transitive dependencies.
 Transitive dependency: a type of functional dependency where an attribute is
functionally dependent on an attribute other than the primary key. Thus its value
is only indirectly determined by the primary key.
 Create a separate table containing the attribute and the fields that are
functionally dependent on it. The tables created at this step will usually contain
descriptions of either resources or agents. Keep a copy of the key attribute in
the original table.
A relation is in third normal form if it is in 2NF and every non-key attribute of
the relation is non-transitively dependent on each candidate key of the relation.
 
Transitive dependency

Let A, B, and C be three attributes of a relation R such that A → B and B → C.
From these FDs, we may derive A → C. This dependency A → C is transitive.
 
Example
 
Company   City     State  ZIP
ABC Ltd.  Mumbai   MH     10169
XYZ Ltd.  Noida    UP     33196
ASD Ltd.  Chennai  TN     21046
 
The above table is not in the 3NF.
 
In this example, the city and state are dependent upon the ZIP code. To place this
table in 3NF, two separate tables would be created -- one containing the company
name and ZIP code and the other containing city, state, ZIP code pairings.
 
Company   ZIP
ABC Ltd.  10169
XYZ Ltd.  33196
ASD Ltd.  21046

City     State  ZIP
Mumbai   MH     10169
Noida    UP     33196
Chennai  TN     21046
 
This may seem overly complex for daily applications and indeed it may be. Database
designers should always keep in mind the tradeoffs between higher level normal
forms and the resource issues that complexity creates.
 
Boyce-Codd Normal Form (BCNF)
 
In BCNF:
 When a relation has more than one candidate key, anomalies may result even
though the relation is in 3NF.
 3NF does not deal satisfactorily with the case of a relation with overlapping
candidate keys, i.e. composite candidate keys with at least one attribute in
common.
 BCNF is based on the concept of a determinant.
 A determinant is any attribute (simple or composite) on which some other
attribute is fully functionally dependent.
 A relation is in BCNF if, and only if, every determinant is a candidate key.
Definition: A relation is in Boyce-Codd Normal Form (BCNF) if every determinant
is a candidate key.
 
The difference between 3NF and BCNF is that for a functional dependency A → B,
3NF allows this dependency to remain in a relation if B is a prime (candidate-key)
attribute even when A is not a candidate key, whereas BCNF insists that for this
dependency to remain in a relation, A must be a candidate key.
 
Example
 
CLIENT INTERVIEW
 
ClientNo  InterviewDate  InterviewTime  StaffNo  RoomNo
CR76      13-May-11      10:30          SG5      G101
CR76      13-May-11      12:00          SG5      G101
CR74      13-May-11      12:00          SG37     G102
CR56      02-Jul-11      10:30          SG5      G102
 FD1: ClientNo, InterviewDate -> InterviewTime, StaffNo, RoomNo (primary key)
 FD2: StaffNo, InterviewDate, InterviewTime -> ClientNo (candidate key)
 FD3: RoomNo, InterviewDate, InterviewTime -> ClientNo, StaffNo (candidate key)
 FD4: StaffNo, InterviewDate -> RoomNo (not a candidate key)
FD4 violates BCNF because its determinant is not a candidate key. As a
consequence, the ClientInterview relation may suffer from update anomalies.
 
To transform the ClientInterview relation to BCNF, we must remove the violating
functional dependency by creating two new relations called Interview and StaffRoom
as shown below,
 
Interview (ClientNo, InterviewDate, InterviewTime, StaffNo)
StaffRoom (StaffNo, InterviewDate, RoomNo)
 
INTERVIEW
 
ClientNo  InterviewDate  InterviewTime  StaffNo
CR76      13-May-11      10:30          SG5
CR76      13-May-11      12:00          SG5
CR74      13-May-11      12:00          SG37
CR56      02-Jul-11      10:30          SG5
 
STAFFROOM
 
StaffNo  InterviewDate  RoomNo
SG5      13-May-11      G101
SG37     13-May-11      G102
SG5      02-Jul-11      G102
 
BCNF Interview and StaffRoom relations.
 
An entity is in Fourth Normal Form (4NF) when it is in Third Normal Form (3NF)
and additionally:
 It has no multiple sets of multi-valued dependencies. In other words, 4NF
states that no entity can have more than a single one-to-many relationship within
an entity if the one-to-many attributes are independent of each other.
 Many-to-many relationships are resolved independently.

7. Explain lossless join decomposition

Lossless-join decomposition is a process in which a relation is decomposed into
two or more relations. This property guarantees that no extra or missing tuples
are generated and no information is lost from the original relation during the
decomposition. It is also known as non-additive join decomposition.
When the sub-relations are combined again, the resulting relation must be the
same as the original relation was before decomposition.
Consider a relation R decomposed into sub-relations R1 and R2. The decomposition
is lossless when it satisfies the following:
 The union of the attributes of R1 and R2 must contain all the attributes that
were in the original relation R before decomposition.
 The intersection of R1 and R2 cannot be empty: the sub-relations must share a
common attribute.
 The common attribute must be a super key of at least one of the sub-relations
R1 or R2.
Here,
R = (A, B, C)
R1 = (A, B)
R2 = (B, C)
The relation R has three attributes A, B, and C. R is decomposed into two
relations R1 and R2, each with two attributes. The common attribute is B.
The values in column B must be unique; if B contains duplicate values, B is not a
key of either sub-relation and the lossless-join property is not guaranteed.
Draw a table of relation R with raw data:

R (A, B, C)
A   B   C
12  25  34
10  36  09
12  42  30

It decomposes into the two sub-relations:

R1 (A, B)
A   B
12  25
10  36
12  42

R2 (B, C)
B   C
25  34
36  09
42  30
Now, we can check the conditions for lossless-join decomposition.
The natural join of the sub-relations R1 and R2, on the common attribute B, gives
back relation R:
R1 ⋈ R2 = R
We get the following result:

A   B   C
12  25  34
10  36  09
12  42  30

The result is the same as the original relation R. Hence, the above decomposition
is a lossless-join decomposition.
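The claim above can be checked directly with the sample data: splitting R into R1(A, B) and R2(B, C) and then joining on the common attribute B reproduces R exactly. A minimal sketch using sets of tuples:

```python
# Lossless-join check on the sample data from the notes.
R  = {(12, 25, 34), (10, 36, 9), (12, 42, 30)}

# Project R onto the two sub-relations.
R1 = {(a, b) for (a, b, c) in R}      # R1(A, B)
R2 = {(b, c) for (a, b, c) in R}      # R2(B, C)

# Natural join on B: pair every R1 row with every R2 row sharing the B value.
joined = {(a, b, c) for (a, b) in R1 for (b2, c) in R2 if b == b2}
```

Because the B values (25, 36, 42) are unique, the join produces no spurious tuples and `joined` equals R; with duplicate B values, extra tuples would appear and the decomposition would be lossy.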

8. Explain transitive dependency? Explain the third normal form
with a suitable example?

When an indirect relationship causes functional dependency, it is called a
transitive dependency.
If P -> Q and Q -> R hold, then P -> R is a transitive dependency.
To achieve 3NF, eliminate transitive dependencies (see the STUDENT example in
Question 2 above for a 3NF decomposition that removes one).

UNIT NO 4

1. Explain ACID properties of Transaction

1) Atomicity:
 This property states that each transaction must be treated as a single unit
and must be completed fully or not completed at all.
 No transaction in the database is left half completed.
 The database should be in the state either before the transaction's execution
or after it; it should never be left in a mid-execution state.
 For example, in an ATM withdrawal transaction, all the steps must be completed
fully or none of them. Suppose the transaction fails after the cash has been
dispensed; then the customer would get the money but the balance would not be
updated accordingly. The state of the database should be either as before the ATM
withdrawal (customer without the withdrawn money) or as after it (customer with
the money and the account updated). This keeps the system in a consistent state.
2) Consistency:
 The database must remain in a consistent state after performing any
transaction.
 For example, in an ATM withdrawal operation, the balance must be updated
appropriately after performing the transaction, so that the database remains in a
consistent state.
3) Isolation:
 In a database system where more than one transaction is being executed
simultaneously and in parallel, the property of isolation states that each
transaction is carried out and executed as if it were the only transaction in the
system.
 No transaction affects the existence of any other transaction.
 For example, if a bank manager is checking the account balance of a particular
customer, the manager should see the balance either before or after the money is
withdrawn, never an intermediate value. This ensures that each individual
transaction is completed and any other dependent transaction reads consistent
data. A failure of one transaction will not affect other transactions in this
case. Hence isolation keeps all the transactions consistent.
4) Durability:
 The database should be strong enough to handle any system failure.
 Any committed set of inserts/updates should be retained in the database.
 If there is any failure, the database should be able to recover to a
consistent state.
 For example, in the ATM withdrawal example, if a system failure happens after
the customer gets the money, then the system should be strong enough to update the
database with the new balance once it recovers. For that purpose the system keeps
a log of each transaction and its failure, so that when the system recovers it
knows when it failed, and any pending transaction can be applied to the database.

2. Explain Timestamp-Based Concurrency Control protocol and the
modifications implemented in it

The main idea of this protocol is to order the transactions based on their
timestamps. A schedule in which the transactions participate is then serializable,
and the only equivalent serial schedule permitted has the transactions in the
order of their timestamp values. Stated simply, the schedule is equivalent to the
particular serial order corresponding to the order of the transaction timestamps.
The algorithm must ensure that, for each item accessed by conflicting operations
in the schedule, the order in which the item is accessed does not violate the
timestamp ordering. To ensure this, two timestamp values are kept for each
database item X:
 W_TS(X) is the largest timestamp of any transaction that
executed write(X) successfully.
 R_TS(X) is the largest timestamp of any transaction that
executed read(X) successfully.
Basic Timestamp Ordering –
Every transaction is issued a timestamp based on when it enters the system.
Suppose an old transaction Ti has timestamp TS(Ti); a new transaction Tj is
assigned timestamp TS(Tj) such that TS(Ti) < TS(Tj). The protocol manages
concurrent execution such that the timestamps determine the serializability
order. The timestamp ordering protocol ensures that any conflicting read and
write operations are executed in timestamp order. Whenever some transaction T
tries to issue a R_item(X) or a W_item(X), the Basic TO algorithm compares the
timestamp of T with R_TS(X) and W_TS(X) to ensure that the timestamp order is not
violated. The Basic TO protocol operates as follows in two cases.

1. Whenever a transaction T issues a W_item(X) operation, check the following
conditions:
 If R_TS(X) > TS(T) or W_TS(X) > TS(T), then abort and roll back T and reject
the operation; else
 Execute the W_item(X) operation of T and set W_TS(X) to TS(T).
2. Whenever a transaction T issues a R_item(X) operation, check the following
conditions:
 If W_TS(X) > TS(T), then abort and roll back T and reject the operation;
else
 If W_TS(X) <= TS(T), then execute the R_item(X) operation of T and set
R_TS(X) to the larger of TS(T) and the current R_TS(X).
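The two rules above can be sketched as a pair of functions that accept or reject an operation. A minimal sketch only: real systems also roll the transaction back and restart it with a new timestamp, which is omitted here.

```python
# Basic Timestamp Ordering: per-item read/write timestamps, initially 0.
r_ts, w_ts = {}, {}

def read_item(ts, x):
    """Return True if the transaction with timestamp ts may read item x."""
    if w_ts.get(x, 0) > ts:          # a younger transaction already wrote x
        return False                 # reject: caller must abort and roll back
    r_ts[x] = max(r_ts.get(x, 0), ts)
    return True

def write_item(ts, x):
    """Return True if the transaction with timestamp ts may write item x."""
    if r_ts.get(x, 0) > ts or w_ts.get(x, 0) > ts:
        return False                 # reject: caller must abort and roll back
    w_ts[x] = ts
    return True

ok1 = write_item(5, "X")   # T with TS=5 writes X: allowed, W_TS(X) becomes 5
ok2 = read_item(3, "X")    # older T (TS=3) reads after a younger write: rejected
ok3 = read_item(7, "X")    # younger T (TS=7) reads: allowed, R_TS(X) becomes 7
```

The rejected read at TS=3 is exactly the "W_TS(X) > TS(T)" case: the value T expected to read has already been overwritten by a transaction that must follow it in the equivalent serial order.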
 
Whenever the Basic TO algorithm detects two conflicting operations that occur in
an incorrect order, it rejects the later of the two operations by aborting the
transaction that issued it. Schedules produced by Basic TO are guaranteed to be
conflict serializable. As already discussed, using timestamps also ensures that
the schedule is deadlock-free.
One drawback of the Basic TO protocol is that cascading rollback is still
possible. Suppose transaction T2 has used a value written by T1. If T1 is aborted
and resubmitted to the system, then T2 must also be aborted and rolled back. So
the problem of cascading aborts still prevails.
Advantages and disadvantages of the Basic TO protocol:
 Timestamp ordering ensures serializability, since the precedence graph has
edges only from older to newer transactions.
[Image: precedence graph for TS ordering]
 The timestamp protocol ensures freedom from deadlock, as no transaction ever
waits.
 But the schedule may not be cascadeless, and may not even be recoverable.
Strict Timestamp Ordering –
A variation of Basic TO called Strict TO ensures that the schedules are both
strict and conflict serializable. In this variation, a transaction T that issues
a R_item(X) or W_item(X) such that TS(T) > W_TS(X) has its read or write
operation delayed until the transaction T' that wrote the value of X has
committed or aborted.

3. How lock based protocols are useful to control concurrency?

 One of the methods to ensure the isolation property of transactions is to
require that data items be accessed in a mutually exclusive manner. That means,
while one transaction is accessing a data item, no other transaction can modify
that data item.
 The most common method used to implement this requirement is to allow a
transaction to access a data item only if it is currently holding a lock on that
item.
 Thus locks on data items are required to ensure the isolation of
transactions.

4. Explain Two phase locking protocol

Two-phase locking is a protocol in which each transaction goes through two
phases:
i) Growing phase (locking phase): the phase in which the transaction may obtain
locks but does not release any lock.
ii) Shrinking phase (unlocking phase): the phase in which the transaction may
release locks but does not obtain any new lock.
 Lock point: the position of the last lock acquisition (equivalently, the
point just before the first unlock) is called the lock point.
5. How concurrency is performed? Explain the protocol that is used
to maintain the concurrency concept.

Concurrency control is provided in a database to:
 (i) enforce isolation among transactions;
 (ii) preserve database consistency through consistency-preserving execution
of transactions;
 (iii) resolve read-write and write-read conflicts.
Various concurrency control techniques are:
1. Two-phase locking protocol
2. Timestamp ordering protocol
3. Multiversion concurrency control
4. Validation (optimistic) concurrency control
These are briefly explained below.
1. Two-Phase Locking Protocol: Locking is an operation which secures permission
to read or permission to write a data item. Two-phase locking is a process used
to gain ownership of shared resources without creating the possibility of
deadlock. The three activities taking place in the two-phase update algorithm
are:
(i) lock acquisition,
(ii) modification of data,
(iii) lock release.
Two phase locking prevents deadlock from occurring in distributed systems
by releasing all the resources it has acquired, if it is not possible to acquire
all the resources required without waiting for another process to finish using
a lock. This means that no process is ever in a state where it is holding some
shared resources, and waiting for another process to release a shared
resource which it requires. This means that deadlock cannot occur due to
resource contention. A transaction in the Two Phase Locking Protocol can
assume one of the 2 phases:
 (i) Growing Phase: In this phase a transaction can only acquire locks but
cannot release any lock. The point when a transaction acquires all the
locks it needs is called the Lock Point.
 (ii) Shrinking Phase: In this phase a transaction can only release locks
but cannot acquire any.
2. Timestamp Ordering Protocol: A timestamp is a tag that can be attached to any
transaction or any data item, denoting a specific time at which the transaction
or the data item was last used. A timestamp can be implemented in two ways. One
is to directly assign the current value of the clock to the transaction or data
item. The other is to attach the value of a logical counter that is incremented
as new timestamps are required. The timestamp of a data item can be of two types:
 (i) W-timestamp(X): the latest time at which the data item X was written.
 (ii) R-timestamp(X): the latest time at which the data item X was read. These
two timestamps are updated each time a successful read/write operation is
performed on the data item X.
3. Multiversion Concurrency Control: Multiversion schemes keep old versions of
data items to increase concurrency. In multiversion two-phase locking, each
successful write results in the creation of a new version of the data item
written, and timestamps are used to label the versions. When a read(X) operation
is issued, an appropriate version of X is selected based on the timestamp of the
transaction.
4. Validation Concurrency Control: The optimistic approach is based on the
assumption that the majority of database operations do not conflict. The
optimistic approach requires neither locking nor timestamping techniques.
Instead, a transaction is executed without restrictions until it is committed.
Using an optimistic approach, each transaction moves through two or three phases,
referred to as read, validation and write.
 (i) During the read phase, the transaction reads the database, executes the
needed computations and makes the updates to a private copy of the database
values. All update operations of the transaction are recorded in a temporary
update file, which is not accessed by the remaining transactions.
 (ii) During the validation phase, the transaction is validated to ensure that
the changes made will not affect the integrity and consistency of the database.
If the validation test is positive, the transaction goes to the write phase; if
it is negative, the transaction is restarted and the changes are discarded.
 (iii) During the write phase, the changes are permanently applied to the
database.

6. Explain the Concept of Conflict Serializability. Decide whether
the following schedule is conflict serializable or not. Justify your
answer

Two operations conflict if they belong to different transactions, access the same
data item, and at least one of them is a write. A schedule is conflict
serializable if it can be transformed into a serial schedule by swapping
non-conflicting adjacent operations; equivalently, if its precedence graph is
acyclic.

T1        T2
Read(A)
Write(A)
          Read(A)
          Write(A)
Read(B)
Write(B)
          Read(B)
          Write(B)

Every conflict on A and on B places T1's operation before T2's, so the precedence
graph has only the edge T1 -> T2 and is acyclic. The schedule is therefore
conflict serializable, equivalent to the serial order T1, T2.
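The precedence-graph test can be run mechanically. A minimal sketch, reading the two-column schedule as: T1 does R(A), W(A); then T2 does R(A), W(A); then T1 does R(B), W(B); then T2 does R(B), W(B) (that column assignment is an assumption about the flattened table):

```python
# Build the precedence graph of the schedule and check for a cycle.
schedule = [("T1", "R", "A"), ("T1", "W", "A"),
            ("T2", "R", "A"), ("T2", "W", "A"),
            ("T1", "R", "B"), ("T1", "W", "B"),
            ("T2", "R", "B"), ("T2", "W", "B")]

# Edge Ti -> Tj whenever an operation of Ti precedes a conflicting
# operation of Tj (same item, different transactions, at least one write).
edges = set()
for i, (ti, op1, x1) in enumerate(schedule):
    for tj, op2, x2 in schedule[i + 1:]:
        if ti != tj and x1 == x2 and "W" in (op1, op2):
            edges.add((ti, tj))

# With only two transactions, a cycle would need edges in both directions.
serializable = not (("T1", "T2") in edges and ("T2", "T1") in edges)
```

Here every edge points from T1 to T2, so the graph is acyclic and the schedule is conflict serializable, equivalent to the serial order T1, T2.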
7. To ensure atomicity despite failures we use Recovery Methods.
Explain in detail Log-Based Recovery method

The log is a sequence of log records, recording all the update activities in the
database. Logs for each transaction are maintained in stable storage. Any
operation performed on the database is recorded in the log. Prior to performing
any modification to the database, an update log record is created to reflect that
modification.
An update log record, represented as <Ti, Xj, V1, V2>, has these fields:
1. Transaction identifier: Unique Identifier of the transaction that
performed the write operation.
2. Data item: Unique identifier of the data item written.
3. Old value: Value of data item prior to write.
4. New value: Value of data item after write operation.
Other types of log records are:
1. <Ti start>: It contains information about when a transaction Ti starts.
2. <Ti commit>: It contains information about when a transaction Ti
commits.
3. <Ti abort>: It contains information about when a transaction Ti aborts.
Undo and Redo Operations –
Because every database modification must be preceded by the creation of a log
record, the system has available both the old value of the data item (prior to
the modification) and the new value that is to be written. This allows the system
to perform undo and redo operations as appropriate:
1. Undo: using a log record, set the data item specified in the log record to
its old value.
2. Redo: using a log record, set the data item specified in the log record to
its new value.
The database can be modified using two approaches –
1. Deferred Modification Technique: If the transaction does not modify the
database until it has partially committed, it is said to use deferred
modification technique.
2. Immediate Modification Technique: If database modification occur
while transaction is still active, it is said to use immediate modification
technique.
Recovery using Log records –
After a system crash has occurred, the system consults the log to determine
which transactions need to be redone and which need to be undone.
1. Transaction Ti needs to be undone if the log contains the record <Ti
start> but does not contain either the record <Ti commit> or the record
<Ti abort>.
2. Transaction Ti needs to be redone if log contains record <Ti start> and
either the record <Ti commit> or the record <Ti abort>.
Use of Checkpoints –
When a system crash occurs, we must consult the log. In principle, we would need
to search the entire log to determine which transactions to undo and redo. There
are two major difficulties with this approach:
1. The search process is time-consuming.
2. Most of the transactions that, according to our algorithm, need to be redone
have already written their updates into the database. Although redoing them will
cause no harm, it makes recovery take longer.
To reduce this overhead, we introduce checkpoints. A log record of the form
<checkpoint L> is used to represent a checkpoint in the log, where L is a list of
transactions active at the time of the checkpoint. When a checkpoint log record
is added to the log, all transactions that committed before this checkpoint have
their <Ti commit> log record before the checkpoint record. Any database
modification made by such a Ti was written to the database either prior to the
checkpoint or as part of the checkpoint itself. Thus, at recovery time, there is
no need to perform a redo operation on Ti.
After a system crash has occurred, the system examines the log to find the
last <checkpoint L> record. The redo or undo operations need to be applied
only to transactions in L, and to all transactions that started execution after
the record was written to the log. Let us denote this set of transactions as T.
Same rules of undo and redo are applicable on T as mentioned in Recovery
using Log records part.
Note that we need to examine only the part of the log starting with the last
checkpoint log record to find the set of transactions T, and to find out whether
a commit or abort record occurs in the log for each transaction in T.
For example, consider the set of transactions {T0, T1, . . ., T100}. Suppose
that the most recent checkpoint took place during the execution of
transaction T67 and T69, while T68 and all transactions with subscripts lower
than 67 completed before the checkpoint. Thus, only transactions T67, T69, .
. ., T100 need to be considered during the recovery scheme. Each of them
needs to be redone if it has completed (that is, either committed or aborted);
otherwise, it was incomplete, and needs to be undone.
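The checkpoint-based classification above can be sketched as a short scan of the log. A minimal sketch: the record encoding is a simplification I have assumed, and the log contents use only T67, T69 and T70 from the example (T69 is shown active in L and later committing; T70 starts after the checkpoint).

```python
# Illustrative log: (kind, payload) pairs in write order.
log = [("start", "T67"),
       ("start", "T69"),
       ("checkpoint", ("T67", "T69")),  # L = transactions active at checkpoint
       ("start", "T70"),
       ("commit", "T69")]

# Find the last checkpoint record; only transactions in its list L, plus
# those that started after it, need be considered.
last_cp = max(i for i, rec in enumerate(log) if rec[0] == "checkpoint")
consider = set(log[last_cp][1])
completed = set()
for kind, txn in log[last_cp + 1:]:
    if kind == "start":
        consider.add(txn)
    elif kind in ("commit", "abort"):
        completed.add(txn)

redo = consider & completed      # commit/abort record found -> redo
undo = consider - completed      # incomplete -> undo
```

Scanning forward from the checkpoint, T69 has a commit record and is redone, while T67 and T70 have neither commit nor abort records and are undone, matching the rules stated earlier.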

8. Transaction during its execution should be in one of the
different states at any point of time, explain the different states
of transactions during its execution

A transaction passes through several states during its lifetime. These states
describe the current status of the transaction and determine how further
processing proceeds; they govern the rules which decide the fate of the
transaction, whether it will commit or abort.
1. Active State –
When the instructions of the transaction are running, the transaction is in the
active state. If all the read and write operations are performed without any
error, it goes to the "partially committed state"; if any instruction fails, it
goes to the "failed state".

2. Partially Committed –
After completion of all the read and write operations, the changes exist only in
main memory or the local buffer. If the changes are made permanent on the
database, the state changes to the "committed state"; in case of failure it goes
to the "failed state".

3. Failed State –
When any instruction of the transaction fails, or a failure occurs while making
the changes permanent on the database, the transaction goes to the "failed
state".

4. Aborted State –
After any type of failure, the transaction goes from the "failed state" to the
"aborted state". Since in the previous states the changes were made only to the
local buffer or main memory, these changes are deleted or rolled back.

5. Committed State –
This is the state reached when the changes are made permanent on the database;
the transaction is then complete and moves on to the "terminated state".

6. Terminated State –
If there is no roll-back pending, or the transaction comes from the "committed
state", the system is consistent and ready for a new transaction, and the old
transaction is terminated.
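The six states and their legal moves can be summarized as a transition table. A minimal sketch using the state names from the notes; any transition not listed is treated as invalid.

```python
# Transaction state machine: each state maps to the states it may enter next.
TRANSITIONS = {
    "active":              {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "failed":              {"aborted"},
    "aborted":             {"terminated"},
    "committed":           {"terminated"},
    "terminated":          set(),           # terminal state
}

def step(state, nxt):
    """Advance the transaction to nxt, rejecting illegal transitions."""
    if nxt not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state} -> {nxt}")
    return nxt

# A successful transaction's lifetime:
s = "active"
for nxt in ("partially committed", "committed", "terminated"):
    s = step(s, nxt)
```

A failing transaction instead follows active -> failed -> aborted -> terminated, which is exactly the roll-back path described in states 3 and 4.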
