Prelims Ans
The database should be strong enough to store all the relevant data and
requirements.
It should be possible to relate the tables in the database by means of a
relationship. For example, an employee works for a department, so that employee is
related to a particular department. We should be able to define such a relationship
between any two entities in the database.
Multiple users should be able to access the same database at the same time without
affecting one another. For example, several teachers can work on a database to
update learners’ marks at the same time. Each teacher should be allowed to update
the marks for their own subjects without modifying the marks of other subjects.
A single database can provide different views to different users, depending on each
user’s role. In a school database, for example, teachers can see the breakdown of
learners’ marks, whereas parents can see only their own child’s report, so the
parents’ access is read only. At the same time, teachers have access to all the
learners’ information and assessment details, with modification rights. All of this
happens in the same database.
Data integrity refers to how accurate and consistent the data in a database
is. A database with a lot of missing or incorrect information is said to have low
data integrity.
Data independence refers to the separation between data and the
application (or applications) in which it is being used. This allows you to update the
data in your application (such as fixing a spelling mistake) without having to
recompile the entire application.
Data redundancy refers to having the exact same data at different places in
the database. Data redundancy increases the size of the database, creates
integrity problems, decreases efficiency and leads to anomalies. Data should be
stored so that it is not repeated in multiple tables.
Data security refers to how well the data in the database is protected from
crashes, hacks and accidental deletion.
Data maintenance refers to monthly, daily or hourly tasks that are run to fix
errors within a database and prevent anomalies from occurring. Database
maintenance not only fixes errors, but it also detects potential errors and prevents
future errors from occurring.
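The relationship and multiple-view characteristics described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the learner and mark tables, the report_learner_1 view, and all the data are hypothetical, and a real multi-user DBMS would normally control who may read or modify data through user roles and privileges rather than through a view alone.

```python
import sqlite3

# Hypothetical school schema; every table, column, and view name is made up
# for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE learner (
        learner_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE mark (
        learner_id INTEGER NOT NULL REFERENCES learner(learner_id),
        subject    TEXT NOT NULL,
        score      INTEGER NOT NULL,
        PRIMARY KEY (learner_id, subject)
    );
    -- The restricted view that a parent of learner 1 would be given.
    CREATE VIEW report_learner_1 AS
        SELECT subject, score FROM mark WHERE learner_id = 1;
    INSERT INTO learner VALUES (1, 'Thabo'), (2, 'Lindiwe');
    INSERT INTO mark VALUES (1, 'Maths', 72), (1, 'Science', 65), (2, 'Maths', 80);
    """
)

# A teacher works on the base table and sees every learner's marks.
teacher_rows = conn.execute(
    "SELECT learner_id, subject, score FROM mark ORDER BY learner_id, subject"
).fetchall()

# A parent queries only the view and sees one child's report.
parent_rows = conn.execute(
    "SELECT subject, score FROM report_learner_1 ORDER BY subject"
).fetchall()

# The relationship is enforced: a mark for a learner that does not exist is rejected.
try:
    conn.execute("INSERT INTO mark VALUES (99, 'Maths', 50)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

# In SQLite a plain view cannot be modified, so this access path is read only.
try:
    conn.execute("UPDATE report_learner_1 SET score = 100")
    view_is_read_only = False
except sqlite3.Error:
    view_is_read_only = True

print(teacher_rows)
print(parent_rows)
print(fk_rejected, view_is_read_only)
```

Both kinds of user read the same stored rows; only the access path differs, which is what lets one database serve several roles.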
2. What is normalization? Explain the requirements of Third Normal Form
with an example.
Normalization is the process of organizing the data in the database.
o Normalization is used to minimize the redundancy in a relation or set of
relations. It is also used to eliminate undesirable characteristics such as insertion,
update, and deletion anomalies.
o Normalization divides a larger table into smaller tables and links them using
relationships.
o The normal forms are used to reduce redundancy in the database table.
Why do we need Normalization?
The main reason for normalizing relations is to remove insertion, update, and
deletion anomalies. Failure to eliminate anomalies leads to data redundancy and can
cause data integrity and other problems as the database grows. Normalization
consists of a series of guidelines that help you create a good database structure.
Third Normal Form :
A relation is in third normal form if it is in second normal form and there is no
transitive dependency for non-prime attributes.
A relation is in 3NF if at least one of the following conditions holds for every
non-trivial functional dependency X -> Y:
X is a super key.
Y is a prime attribute (each element of Y is part of some candidate key).
In other words,
A relation is in Third Normal Form (3NF) if it is in First and Second Normal Form and
no non-primary-key attribute is transitively dependent on the primary key.
Note – If A -> B and B -> C are two FDs, then A -> C is called a transitive dependency.
FD set:
{STUD_NO -> STUD_NAME, STUD_NO -> STUD_STATE, STUD_STATE ->
STUD_COUNTRY, STUD_NO -> STUD_AGE}
Candidate Key:
{STUD_NO}
For this relation in table 4, STUD_NO -> STUD_STATE and STUD_STATE ->
STUD_COUNTRY are true. So STUD_COUNTRY is transitively dependent on
STUD_NO, which violates the third normal form. To convert it to third normal form, we
decompose the relation STUDENT (STUD_NO, STUD_NAME, STUD_PHONE,
STUD_STATE, STUD_COUNTRY, STUD_AGE) as:
STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_AGE)
STATE_COUNTRY (STUD_STATE, STUD_COUNTRY)
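The two 3NF conditions (X is a super key, or Y is prime) can be checked mechanically with an attribute-closure computation. The sketch below is a minimal illustration, not a full normalization tool: the helper names closure and violations_3nf are hypothetical, STUD_PHONE is left out because the FD set given above does not mention it, and the split into a student relation and a state/country relation is the usual way to remove this transitive dependency.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def violations_3nf(attributes, fds, candidate_keys):
    """List the FDs X -> Y where X is not a super key and Y is not prime."""
    prime = set().union(*candidate_keys)
    bad = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial dependency, always allowed
        is_super_key = closure(lhs, fds) >= attributes
        rhs_is_prime = (rhs - lhs) <= prime
        if not (is_super_key or rhs_is_prime):
            bad.append((lhs, rhs))
    return bad


def fd(lhs, rhs):
    """Build one functional dependency from two attribute names."""
    return (frozenset([lhs]), frozenset([rhs]))


# FD set and candidate key from the text (STUD_PHONE omitted, see lead-in).
student_attrs = {"STUD_NO", "STUD_NAME", "STUD_STATE", "STUD_COUNTRY", "STUD_AGE"}
student_fds = [
    fd("STUD_NO", "STUD_NAME"),
    fd("STUD_NO", "STUD_STATE"),
    fd("STUD_STATE", "STUD_COUNTRY"),
    fd("STUD_NO", "STUD_AGE"),
]
before = violations_3nf(student_attrs, student_fds, [frozenset(["STUD_NO"])])

# After moving STUD_COUNTRY into its own relation keyed by STUD_STATE.
student2_attrs = {"STUD_NO", "STUD_NAME", "STUD_STATE", "STUD_AGE"}
student2_fds = [
    fd("STUD_NO", "STUD_NAME"),
    fd("STUD_NO", "STUD_STATE"),
    fd("STUD_NO", "STUD_AGE"),
]
state_attrs = {"STUD_STATE", "STUD_COUNTRY"}
state_fds = [fd("STUD_STATE", "STUD_COUNTRY")]

after_student = violations_3nf(student2_attrs, student2_fds, [frozenset(["STUD_NO"])])
after_state = violations_3nf(state_attrs, state_fds, [frozenset(["STUD_STATE"])])

print(before)         # only STUD_STATE -> STUD_COUNTRY is reported
print(after_student)  # no violations
print(after_state)    # no violations
```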
Normalization Avoids
Duplication of Data - The same data is listed in multiple rows of the database.
Insert Anomaly - A record about an entity cannot be inserted into the table without
first inserting information about another entity. For example, a customer cannot be
entered without a sales order.
Delete Anomaly - A record cannot be deleted without deleting a record about a
related entity. For example, a sales order cannot be deleted without deleting all of
the customer's information.
Update Anomaly - Information cannot be updated without changing it in many
places. For example, to update customer information, it must be updated for each
sales order the customer has placed.
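The update anomaly can be made concrete with a small sketch. The customer name, city values, and dictionary layout below are hypothetical; the point is only that when customer data is repeated on every sales order, one missed row leaves the data inconsistent, whereas a normalized layout has a single place to change.

```python
# Unnormalized: the customer's city is repeated on every sales order row.
orders = [
    {"order_id": 1, "customer": "Acme", "city": "Pune"},
    {"order_id": 2, "customer": "Acme", "city": "Pune"},
]
orders[0]["city"] = "Mumbai"  # the update reaches only one of the two rows
cities = {row["city"] for row in orders if row["customer"] == "Acme"}
inconsistent = len(cities) > 1  # the same customer now has two cities

# Normalized: customer data is stored once and the orders refer to it.
customers = {"Acme": {"city": "Pune"}}
orders_normalized = [
    {"order_id": 1, "customer": "Acme"},
    {"order_id": 2, "customer": "Acme"},
]
customers["Acme"]["city"] = "Mumbai"  # one change, visible from every order
cities_normalized = {customers[o["customer"]]["city"] for o in orders_normalized}

print(inconsistent)
print(cities_normalized)
```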
R (A, B, C)
A B C
12 25 34
10 36 09
12 42 30
It decomposes into the two sub relations −
R1 (A, B)
A B
12 25
10 36
12 42
R2 (B, C)
B C
25 34
36 09
42 30
Now, we can check the first condition for lossless-join decomposition: the union of
the attributes of sub-relations R1 and R2 is the same as the attributes of relation R.
R1 U R2 = R
Joining R1 and R2 on the common attribute B, we get the following result −
A B C
12 25 34
10 36 09
12 42 30
The result is the same as the original relation R. Hence, the above decomposition
is a lossless-join decomposition.
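The check can be reproduced by projecting R onto the two attribute sets and taking the natural join of the projections. The sketch below uses the same three tuples (09 is written as 9); the second decomposition, which splits on A instead of B, is not in the text above and is added only as a contrast.

```python
# Relation R(A, B, C) from the example.
R = {(12, 25, 34), (10, 36, 9), (12, 42, 30)}

# Project onto (A, B) and (B, C), then join on the common attribute B.
R1 = {(a, b) for a, b, c in R}
R2 = {(b, c) for a, b, c in R}
joined_on_b = {(a, b, c) for a, b in R1 for b2, c in R2 if b == b2}
lossless_on_b = joined_on_b == R

# Contrast (an added assumption, not part of the original example):
# splitting on A produces spurious tuples because A = 12 appears twice.
S1 = {(a, b) for a, b, c in R}
S2 = {(a, c) for a, b, c in R}
joined_on_a = {(a, b, c) for a, b in S1 for a2, c in S2 if a == a2}
lossless_on_a = joined_on_a == R

print(lossless_on_b, len(joined_on_b))
print(lossless_on_a, len(joined_on_a))
```

The join on B returns exactly the original three tuples because each B value identifies one tuple, while the join on A returns five tuples and so loses information about which B went with which C.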
UNIT NO 4
The log is a sequence of log records, recording all the update activities in the
database. Logs for each transaction are maintained in stable storage. Any
operation which is performed on the database is recorded in the log. Prior
to performing any modification to the database, an update log record is created
to reflect that modification.
An update log record, represented as <Ti, Xj, V1, V2>, has these fields:
1. Transaction identifier: Unique Identifier of the transaction that
performed the write operation.
2. Data item: Unique identifier of the data item written.
3. Old value: Value of data item prior to write.
4. New value: Value of data item after write operation.
Other types of log records are:
1. <Ti start>: It contains information about when a transaction Ti starts.
2. <Ti commit>: It contains information about when a transaction Ti
commits.
3. <Ti abort>: It contains information about when a transaction Ti aborts.
Undo and Redo Operations –
Because every database modification must be preceded by the creation of a log
record, the system has available both the old value of the data item prior to
modification and the new value that is to be written. This allows the
system to perform redo and undo operations as appropriate:
1. Undo: using a log record, sets the data item specified in the log record to the
old value.
2. Redo: using a log record, sets the data item specified in the log record to the
new value.
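Undo and redo can be sketched directly from the <Ti, Xj, V1, V2> record layout. In this minimal illustration the database is just a Python dictionary, and the UpdateRecord type and its field names are hypothetical.

```python
from typing import NamedTuple


class UpdateRecord(NamedTuple):
    """Mirrors the update log record <Ti, Xj, V1, V2>."""
    txn: str    # Ti: transaction identifier
    item: str   # Xj: data item written
    old: int    # V1: value before the write
    new: int    # V2: value after the write


def undo(db, rec):
    """Set the data item named in the log record back to its old value."""
    db[rec.item] = rec.old


def redo(db, rec):
    """Set the data item named in the log record to its new value."""
    db[rec.item] = rec.new


db = {"X": 100}
rec = UpdateRecord(txn="T1", item="X", old=100, new=150)

redo(db, rec)
after_redo = db["X"]
undo(db, rec)
after_undo = db["X"]

print(after_redo, after_undo)
```

Both operations overwrite the item with a value stored in the record, so applying either one more than once leaves the same result.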
The database can be modified using two approaches –
1. Deferred Modification Technique: If the transaction does not modify the
database until it has partially committed, it is said to use the deferred
modification technique.
2. Immediate Modification Technique: If database modifications occur
while the transaction is still active, it is said to use the immediate
modification technique.
Recovery using Log records –
After a system crash has occurred, the system consults the log to determine
which transactions need to be redone and which need to be undone.
1. Transaction Ti needs to be undone if the log contains the record <Ti
start> but does not contain either the record <Ti commit> or the record
<Ti abort>.
2. Transaction Ti needs to be redone if the log contains the record <Ti start> and
either the record <Ti commit> or the record <Ti abort>.
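The two rules above amount to one scan over the log. The sketch below assumes a simplified log in which each record is a (kind, transaction) pair and update records are left out; the classify function and the sample log are hypothetical.

```python
def classify(log):
    """Split transactions into (redo, undo) lists using the two rules above."""
    started = []
    finished = set()
    for kind, txn in log:
        if kind == "start":
            started.append(txn)
        elif kind in ("commit", "abort"):
            finished.add(txn)
    # Redo: start plus commit or abort. Undo: start with neither.
    redo = [t for t in started if t in finished]
    undo = [t for t in started if t not in finished]
    return redo, undo


# Hypothetical log contents at the moment of the crash.
log = [
    ("start", "T1"),
    ("start", "T2"),
    ("commit", "T1"),
    ("start", "T3"),
    ("abort", "T3"),
]
redo_list, undo_list = classify(log)

print(redo_list)
print(undo_list)
```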
Use of Checkpoints –
When a system crash occurs, the system must consult the log to determine which
transactions need to be redone and which need to be undone. In principle, it would
need to search the entire log to determine this information. There are two
major difficulties with this approach:
1. The search process is time-consuming.
2. Most of the transactions that, according to our algorithm, need to be
redone have already written their updates into the database. Although
redoing them will cause no harm, it will cause recovery to take longer.
To reduce these types of overhead, we introduce checkpoints. A log record
of the form <checkpoint L> represents a checkpoint in the log, where L
is a list of the transactions active at the time of the checkpoint. When a
checkpoint log record is added to the log, every transaction Ti that
committed before the checkpoint has its <Ti commit> log record before the
checkpoint record. Any database modifications made by Ti are written to the
database either prior to the checkpoint or as part of the checkpoint itself.
Thus, at recovery time, there is no need to perform a redo operation on Ti.
After a system crash has occurred, the system examines the log to find the
last <checkpoint L> record. The redo or undo operations need to be applied
only to transactions in L, and to all transactions that started execution after
the record was written to the log. Let us denote this set of transactions as T.
The same undo and redo rules described under Recovery using Log records
apply to T.
Note that the system needs to examine only the part of the log starting with the
last checkpoint log record to find the set of transactions T, and to find out
whether a commit or abort record occurs in the log for each transaction in T.
For example, consider the set of transactions {T0, T1, ..., T100}. Suppose
that the most recent checkpoint took place during the execution of
transactions T67 and T69, while T68 and all transactions with subscripts lower
than 67 completed before the checkpoint. Thus, only transactions T67, T69,
..., T100 need to be considered during the recovery scheme. Each of them
needs to be redone if it has completed (that is, either committed or aborted);
otherwise, it was incomplete, and needs to be undone.
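The T0 to T100 example can be replayed in code. The sketch below assumes a simplified log of (kind, payload) pairs with update records left out, and it assumes the log contains at least one checkpoint. Which transactions finish after the checkpoint is not stated in the example, so the choice made here (T67 and the even-numbered ones commit, the rest are still running at the crash) is purely illustrative.

```python
def recover_from_checkpoint(log):
    """Return (redo, undo), scanning only from the last <checkpoint L> record."""
    last = max(i for i, rec in enumerate(log) if rec[0] == "checkpoint")
    candidates = list(log[last][1])  # L: transactions active at the checkpoint
    finished = set()
    for kind, txn in log[last + 1:]:
        if kind == "start":
            candidates.append(txn)   # started after the checkpoint
        elif kind in ("commit", "abort"):
            finished.add(txn)
    redo = [t for t in candidates if t in finished]
    undo = [t for t in candidates if t not in finished]
    return redo, undo


log = []
for i in range(67):                  # T0..T66 complete before the checkpoint
    log += [("start", f"T{i}"), ("commit", f"T{i}")]
log += [("start", "T67"), ("start", "T68"), ("commit", "T68"), ("start", "T69")]
log.append(("checkpoint", ["T67", "T69"]))
log.append(("commit", "T67"))        # assumption: T67 commits after the checkpoint
for i in range(70, 101):
    log.append(("start", f"T{i}"))
    if i % 2 == 0:                   # assumption: even-numbered ones commit
        log.append(("commit", f"T{i}"))

redo_list, undo_list = recover_from_checkpoint(log)

print(len(redo_list) + len(undo_list))  # transactions considered
print(redo_list[:3], undo_list[:3])
```

T68 and T0 to T66 never appear in either list, which is the saving the checkpoint provides.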
A transaction goes through a number of states during its lifetime. These
states describe the current status of the transaction and determine how
it will be processed further. They also govern the
rules which decide the fate of the transaction: whether it will commit or abort.
1. Active State –
When the instructions of the transaction are running then the transaction
is in active state. If all the ‘read and write’ operations are performed
without any error then it goes to the “partially committed state”; if any
instruction fails, it goes to the “failed state”.
2. Partially Committed –
After completion of all the read and write operations, the changes are made
in main memory or a local buffer. If the changes are made permanent in
the database, the state changes to the “committed state”; in case
of failure, it goes to the “failed state”.
3. Failed State –
A transaction goes to the “failed state” when any of its instructions fails, or
when a failure occurs while making a permanent change to the data in the database.
4. Aborted State –
After any type of failure, the transaction goes from the “failed state” to the
“aborted state”. Since in the previous states the changes were made only
to the local buffer or main memory, these changes are deleted or
rolled back.
5. Committed State –
This is the state in which the changes have been made permanent in the database.
The transaction is complete and therefore moves to the
“terminated state”.
6. Terminated State –
If there isn’t any roll-back, or the transaction comes from the “committed
state”, then the system is consistent and ready for a new transaction, and
the old transaction is terminated.
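The six states and the moves between them can be written as a small transition table. This is a sketch under one assumption: the text does not say explicitly where an aborted transaction goes next, and the table below lets it move to the terminated state, as a committed one does. The TRANSITIONS table and the advance and run helpers are hypothetical names.

```python
# Allowed moves between transaction states, following the descriptions above.
TRANSITIONS = {
    "active": {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "failed": {"aborted"},
    "aborted": {"terminated"},      # assumption, see lead-in
    "committed": {"terminated"},
    "terminated": set(),
}


def advance(state, next_state):
    """Move to next_state, or raise ValueError if the move is not allowed."""
    if next_state not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition: {state} -> {next_state}")
    return next_state


def run(path):
    """Start in the active state and follow the given sequence of states."""
    state = "active"
    for next_state in path:
        state = advance(state, next_state)
    return state


success = run(["partially committed", "committed", "terminated"])
failure = run(["partially committed", "failed", "aborted"])

try:
    run(["committed"])  # a transaction cannot skip the partially committed state
    skipped_ok = True
except ValueError:
    skipped_ok = False

print(success, failure, skipped_ok)
```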