The Concept of Anomalies

The problems associated with using a relation that is not appropriately normalized are known as anomalies. Anomalies
can occur whenever data in a database is inserted, updated or deleted. They are harmful because they can leave the
data logically inconsistent.
Problems without Normalization
Without normalization, it becomes difficult to maintain and update the database without risking data loss or
inconsistency. Insertion, update and deletion anomalies are very frequent if a database is not normalized. To
understand these anomalies, let us take the example of a Student table.
S_id  S_Name   S_Address       Subject_opted
401   Mulenga  22 Masuku Rd    Biology
402   Zulu     4 Orange close  Maths
403   Sinks    10 Mango St     Maths
401   Mulenga  22 Masuku Rd    Physics

Update anomaly: To update the address of a student who occurs twice or more in the table, we have to update the
S_Address column in all of that student's rows; otherwise the data becomes inconsistent.
Insertion anomaly: Suppose that for a new admission we have the student id (S_id), name and address, but the
student has not opted for any subjects yet. We then have to insert NULL in Subject_opted, leading to an
insertion anomaly.
Deletion anomaly: If student (S_id) 401 had only one subject and temporarily dropped it, deleting that row would
delete the entire student record along with it.
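
The update and deletion anomalies can be reproduced on the Student table above. The following is a minimal Python sketch that holds the rows in a plain list; the new address "5 New Rd" is made up for illustration, and student 402 is used for the deletion case because that student really does have only one row.

```python
# Rows of the Student table from the notes.
students = [
    {"S_id": 401, "S_Name": "Mulenga", "S_Address": "22 Masuku Rd", "Subject_opted": "Biology"},
    {"S_id": 402, "S_Name": "Zulu", "S_Address": "4 Orange close", "Subject_opted": "Maths"},
    {"S_id": 403, "S_Name": "Sinks", "S_Address": "10 Mango St", "Subject_opted": "Maths"},
    {"S_id": 401, "S_Name": "Mulenga", "S_Address": "22 Masuku Rd", "Subject_opted": "Physics"},
]

def addresses_of(rows, s_id):
    """Return the set of distinct addresses stored for one student."""
    return {r["S_Address"] for r in rows if r["S_id"] == s_id}

# Update anomaly: the address is changed in only the first matching row.
updated = [dict(r) for r in students]
for row in updated:
    if row["S_id"] == 401:
        row["S_Address"] = "5 New Rd"  # hypothetical new address
        break
inconsistent = addresses_of(updated, 401)  # one student, two addresses

# Deletion anomaly: student 402 has a single subject, so deleting that
# row removes every trace of the student.
after_delete = [r for r in students
                if not (r["S_id"] == 402 and r["Subject_opted"] == "Maths")]
still_known = any(r["S_id"] == 402 for r in after_delete)

print(sorted(inconsistent))  # ['22 Masuku Rd', '5 New Rd']
print(still_known)           # False
```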

Functional dependency
An important concept associated with normalization is functional dependency, which describes the relationship
between attributes.
Y is functionally dependent on X if the value of Y is determined by X. In other words, if Y = X + 1, the value of X will
determine the resultant value of Y. Thus, Y is dependent on X as a function of the value of X (X → Y).
For example, if A and B are attributes of relation R, B is functionally dependent on A (A → B) if each value of A is
associated with exactly one value of B. If we know the value of A and we examine the relation that holds this
dependency, we find only one value for B in all the tuples that have a given value of A, at any moment in time. If two
tuples have the same value of A, they also have the same value of B. However, for a given value of B, there may be
several different values of A.
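
The rule "two tuples with the same value of A have the same value of B" can be checked mechanically. The sketch below is one possible way to do it in Python; `fd_holds` is a hypothetical helper name, and the rows are those of the Student table shown earlier. Note that a check against sample rows can only refute a dependency: a dependency that happens to hold in the sample is not thereby guaranteed to hold at every moment in time.

```python
COLUMNS = ("S_id", "S_Name", "S_Address", "Subject_opted")
STUDENT = [dict(zip(COLUMNS, r)) for r in [
    (401, "Mulenga", "22 Masuku Rd", "Biology"),
    (402, "Zulu", "4 Orange close", "Maths"),
    (403, "Sinks", "10 Mango St", "Maths"),
    (401, "Mulenga", "22 Masuku Rd", "Physics"),
]]

def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key is remembered; any later
        # row with the same key must agree with it.
        if seen.setdefault(key, value) != value:
            return False
    return True

print(fd_holds(STUDENT, ("S_id",), ("S_Name", "S_Address")))  # True
print(fd_holds(STUDENT, ("Subject_opted",), ("S_id",)))       # False: Maths maps to 402 and 403
```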

A → B   (B is functionally dependent on A)

Dependency, Determinants
Determinant: the attribute or group of attributes on the left-hand side of the arrow of a functional dependency. In
A → B, A is the determinant of B.

(a) staffNo functionally determines position (staffNo → position): each staff number is associated with exactly
one position, e.g. staffNo ZT30 is associated with Lecturer.

(b) position does not functionally determine staffNo (position ↛ staffNo): one position can be associated with
several staff numbers, e.g. Lecturer is associated with staffNo ZT05, ZT10, ZT45 and ZT30.

Transitive dependency: C is transitively dependent on A when A determines B and B determines C. Transitive
dependence thus describes that C is indirectly dependent on A through its relationship with B.

Full functional dependency: B is fully functionally dependent on A if B is functionally dependent on A but not on
any proper subset of A. In other words, B depends on the whole of A: no attribute can be removed from A without
losing the dependency. When the determinant A is a single attribute, the dependency is always full. When A is
composite (it contains more than one field, as a composite key does), the dependency is full only if every field of A
is needed to determine B.

Partial functional dependency: A → B is a partial dependency if some attribute can be removed from A and the
dependency still holds.

staffNo, sName → branchNo   (a partial dependency, because branchNo is also functionally dependent on a subset
of (staffNo, sName), i.e. staffNo)

staffNo → branchNo   (a full dependency)

Identifying functional dependencies

Functional Dependencies of the ClientRental relation


ClientRental (clientNo, propertyNo, cName, pAddress, rentStart, rentFinish, rent, ownerNo, oName)

fd1  clientNo → cName  (partial dependency)
fd2  propertyNo → pAddress, rent, ownerNo, oName  (partial dependency)
fd3  ownerNo → oName  (transitive dependency)
fd4  clientNo, propertyNo → rentStart, rentFinish  (primary key)
fd5  clientNo, rentStart → propertyNo, pAddress, rentFinish, rent, ownerNo, oName  (candidate key)
fd6  propertyNo, rentStart → clientNo, cName, rentFinish  (candidate key)
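
Whether a dependency on a composite key is partial or full can be tested by trying every proper subset of the determinant. The following Python sketch does this against sample rows; `fd_holds` and `classify` are hypothetical helper names, and the rows are a five-column extract of the ClientRental data used later in these notes. As with any check on sample rows, the result describes this sample only.

```python
from itertools import combinations

COLUMNS = ("clientNo", "propertyNo", "cName", "rent", "rentStart")
ROWS = [dict(zip(COLUMNS, r)) for r in [
    ("CR76", "PG4", "John Kay", 350, "01-Jul-07"),
    ("CR76", "PG16", "John Kay", 450, "01-Sep-08"),
    ("CR56", "PG4", "Aline Stewart", 350, "01-Sep-06"),
    ("CR56", "PG36", "Aline Stewart", 375, "10-Oct-07"),
    ("CR56", "PG16", "Aline Stewart", 450, "01-Nov-09"),
]]

def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

def classify(rows, lhs, rhs):
    """Return 'none', 'partial' or 'full' for lhs -> rhs in these rows."""
    if not fd_holds(rows, lhs, rhs):
        return "none"
    # A dependency is partial if a non-empty proper subset of lhs suffices.
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if fd_holds(rows, subset, rhs):
                return "partial"
    return "full"

key = ("clientNo", "propertyNo")
print(classify(ROWS, key, ("cName",)))      # partial: clientNo alone determines cName
print(classify(ROWS, key, ("rent",)))       # partial: propertyNo alone determines rent
print(classify(ROWS, key, ("rentStart",)))  # full: both key attributes are needed
```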

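Trivial dependency: a dependency A → B is trivial if B is a subset of A, i.e. the right-hand side attributes already
appear on the left-hand side, e.g. (A, B) → A.

Non-trivial dependency: a dependency A → B in which B is not a subset of A.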
Closure of FDs
Suppose a relation R has a set of functional dependencies F specified. The closure of F is the set of all functional
dependencies that can be logically derived from F, that is, F itself together with every dependency that can be
deduced from it. The closure is important and may, for example, be needed in finding one or more candidate keys
of the relation.

Student
SNo   SName       CNo  CName   Addr              Instr.    Office
5425  Susan Ross  102  Calc I  …San Jose, CA     P. Smith  B42 Room 112
7845  Dave Turco  541  Bio 10  ...San Diego, CA  L. Talip  B24 Room 210

SNo → SName
CNo → CName
Instr → Office
CNo → Instr
SNo → Addr
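
In practice the full closure of F is rarely enumerated. A common shortcut is to compute the closure of a set of attributes (every attribute that the set determines under F) and to call the set a candidate key when its closure covers the whole relation and no smaller subset does. The sketch below applies that idea to the five Student dependencies above; the function names are assumptions, and the brute-force key search is only suitable for small relations like this one.

```python
from itertools import combinations

ATTRS = {"SNo", "SName", "CNo", "CName", "Addr", "Instr", "Office"}
FDS = [
    ({"SNo"}, {"SName"}),
    ({"CNo"}, {"CName"}),
    ({"Instr"}, {"Office"}),
    ({"CNo"}, {"Instr"}),
    ({"SNo"}, {"Addr"}),
]

def closure(attrs, fds):
    """Return every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(all_attrs, fds):
    """Return the minimal attribute sets whose closure is all_attrs."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == all_attrs:
                keys.append(candidate)
    return keys

print(sorted(closure({"CNo"}, FDS)))                     # ['CName', 'CNo', 'Instr', 'Office']
print([sorted(k) for k in candidate_keys(ATTRS, FDS)])   # [['CNo', 'SNo']]
```

Under these five dependencies the only candidate key found is (SNo, CNo), which is why the derivations in these notes start from that pair.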

Axioms
1. Reflexivity Rule --- If X is a set of attributes and Y is a subset of X, then X → Y holds. That is,
each subset of X is functionally dependent on X.
2. Augmentation Rule --- If X → Y holds and W is a set of attributes, then WX → WY holds.
3. Transitivity Rule --- If X → Y and Y → Z hold, then X → Z holds.
4. Union Rule --- If X → Y and X → Z hold, then X → YZ holds.
5. Decomposition Rule --- If X → YZ holds, then so do X → Y and X → Z.
6. Pseudotransitivity Rule --- If X → Y and WY → Z hold, then so does WX → Z.

Student (SNo, SName, CNo, CName, Addr, Instr, Office)

Based on the rules provided, the following dependencies can be derived.

(SNo, CNo) → SNo  (Rule 1) -- subset
(SNo, CNo) → CNo  (Rule 1)
(SNo, CNo) → (SName, CName)  (Rule 2, then Rule 3) -- augmentation followed by transitivity
CNo → Office  (Rule 3) -- transitivity
SNo → (SName, Addr)  (Union Rule)
etc.
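
As a worked example, the derivation of (SNo, CNo) → (SName, CName) can be written out step by step. This is one possible derivation from the given dependencies SNo → SName and CNo → CName; other orderings of the rules reach the same result.

```latex
\begin{align*}
1.\;& \mathrm{SNo} \to \mathrm{SName} && \text{given} \\
2.\;& \mathrm{CNo} \to \mathrm{CName} && \text{given} \\
3.\;& \{\mathrm{SNo}, \mathrm{CNo}\} \to \{\mathrm{SName}, \mathrm{CNo}\} && \text{augmentation of 1 with CNo} \\
4.\;& \{\mathrm{SName}, \mathrm{CNo}\} \to \{\mathrm{SName}, \mathrm{CName}\} && \text{augmentation of 2 with SName} \\
5.\;& \{\mathrm{SNo}, \mathrm{CNo}\} \to \{\mathrm{SName}, \mathrm{CName}\} && \text{transitivity on 3 and 4}
\end{align*}
```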

Normalization
The term normalization means to make normal, in the sense of causing something to conform to a standard or
introducing consistency with respect to style and content. In relational database modeling, that consistency is
achieved largely by removing duplication in data, which minimizes redundancy. Minimizing redundancy means getting
rid of unneeded copies of data in particular tables. The goal of normalization is to reduce problems with data
consistency by reducing redundancy, that is, to identify a suitable set of relations that support the data
requirements of an enterprise.
The characteristics of a suitable set of relations include:
• The minimum number of attributes necessary to support the data requirements of the enterprise;
• Attributes with a close logical relationship (functional dependency) are found in the same relation;
• Minimum redundancy, with each attribute represented only once.

The successive stages of the normalization process are called normal forms.
Normalization is an incremental process, i.e. each normal form builds on the normal forms that have already
been applied. The stages covered here are the 1st, 2nd and 3rd normal forms.

Benefits of Normalization
Effectively minimizing redundancy is another way of describing removal of duplication. The effect of removing
duplication is as follows:
• The physical space needed to store data is reduced, thus minimizing costs;
• Data becomes better organized, hence updates to the data stored in the database are achieved with a
minimum number of operations;
• The database becomes easier for users to access and maintain.

Unnormalized Form (UNF)


The data may come from a source such as a standard data entry form and then be transferred into table format with
rows and columns. The result is referred to as an unnormalized table: a table that contains one or more repeating
groups.
DreamHome Lease

Client Number   CR76         Property Number    PG4
Full Name       John Kay     Property Address   6 Lawrence St, Glasgow
Monthly Rent    350          Owner Number       CO40
Rent Start      01/07/07     Full Name (owner)  Tina Murphy
Rent Finish     31/08/08

ClientRental unnormalized table


clientNo  cName          propertyNo  pAddress                rentStart  rentFinish  rent  ownerNo  oName
CR76      John Kay       PG4         6 Lawrence St, Glasgow  1 Jul 07   31 Aug 08   350   CO40     Tina Murphy
                         PG16        5 Novar Dr, Glasgow     1 Sep 08   1 Sep 09    450   CO93     Tony Shaw
CR56      Aline Stewart  PG4         6 Lawrence St, Glasgow  1 Sep 06   10 Jun 07   350   CO40     Tina Murphy
                         PG36        2 Manor Rd, Glasgow     10 Oct 07  1 Dec 08    375   CO93     Tony Shaw
                         PG16        5 Novar Dr, Glasgow     1 Nov 09   10 Aug 10   450   CO93     Tony Shaw

1st Normal Form (1NF)


The first normal form (1NF) requires that data in tables be two-dimensional, that is, that there be no repeating
groups in the rows.
First identify the key attribute of the unnormalized table, in this case clientNo. Then identify the repeating group.

Repeating group = (propertyNo, pAddress, rentStart, rentFinish, rent, ownerNo, oName).

The problem with putting data in tables with repeating groups is that the table cannot be easily indexed or arranged so
that the information in the repeating group can be found without searching each record individually.
The solution is to eliminate repeating groups so that all records in all tables can be identified uniquely. The table is
converted into a 1NF table with no repeating groups by entering the client data in every row:

ClientRental
clientNo  cName          propertyNo  pAddress                rentStart  rentFinish  rent  ownerNo  oName
CR76      John Kay       PG4         6 Lawrence St, Glasgow  01-Jul-07  31-Aug-08   350   CO40     Tina Murphy
CR76      John Kay       PG16        5 Novar Dr, Glasgow     01-Sep-08  01-Sep-09   450   CO93     Tony Shaw
CR56      Aline Stewart  PG4         6 Lawrence St, Glasgow  01-Sep-06  10-Jun-07   350   CO40     Tina Murphy
CR56      Aline Stewart  PG36        2 Manor Rd, Glasgow     10-Oct-07  01-Dec-08   375   CO93     Tony Shaw
CR56      Aline Stewart  PG16        5 Novar Dr, Glasgow     01-Nov-09  10-Aug-10   450   CO93     Tony Shaw
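
The step from the unnormalized table to the 1NF table amounts to repeating the client data once for each entry in the repeating group. A minimal Python sketch of that flattening follows; the nested `rentals` list and the function name `to_1nf` are assumptions chosen for illustration, and the values are those of the ClientRental table.

```python
# Unnormalized data: each client carries a repeating group of rentals.
GROUP = ("propertyNo", "pAddress", "rentStart", "rentFinish", "rent", "ownerNo", "oName")
unnormalized = [
    {"clientNo": "CR76", "cName": "John Kay", "rentals": [
        ("PG4", "6 Lawrence St, Glasgow", "01-Jul-07", "31-Aug-08", 350, "CO40", "Tina Murphy"),
        ("PG16", "5 Novar Dr, Glasgow", "01-Sep-08", "01-Sep-09", 450, "CO93", "Tony Shaw"),
    ]},
    {"clientNo": "CR56", "cName": "Aline Stewart", "rentals": [
        ("PG4", "6 Lawrence St, Glasgow", "01-Sep-06", "10-Jun-07", 350, "CO40", "Tina Murphy"),
        ("PG36", "2 Manor Rd, Glasgow", "10-Oct-07", "01-Dec-08", 375, "CO93", "Tony Shaw"),
        ("PG16", "5 Novar Dr, Glasgow", "01-Nov-09", "10-Aug-10", 450, "CO93", "Tony Shaw"),
    ]},
]

def to_1nf(records):
    """Emit one flat row per (client, rental) pair."""
    flat = []
    for rec in records:
        for rental in rec["rentals"]:
            row = {"clientNo": rec["clientNo"], "cName": rec["cName"]}
            row.update(zip(GROUP, rental))
            flat.append(row)
    return flat

rows = to_1nf(unnormalized)
# Every flat row is identified by the pair (clientNo, propertyNo).
keys = {(r["clientNo"], r["propertyNo"]) for r in rows}
print(len(rows), len(keys))  # 5 5
```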

2nd Normal Form (2NF)


A relation is in second normal form (2NF) if it is in 1NF and every non-primary-key attribute is fully functionally
dependent on the primary key. Partial dependencies are not allowed. If a partial dependency exists, we remove the
partially dependent attribute(s) from the relation by placing them in a new relation along with a copy of their
determinant.
Converting 1NF relations to 2NF relations:

Functional Dependencies of the ClientRental relation


ClientRental (clientNo, propertyNo, cName, pAddress, rentStart, rentFinish, rent, ownerNo, oName)

fd1  clientNo → cName  (partial dependency)
fd2  propertyNo → pAddress, rent, ownerNo, oName  (partial dependency)
fd3  ownerNo → oName  (transitive dependency)
fd4  clientNo, propertyNo → rentStart, rentFinish  (primary key)
fd5  clientNo, rentStart → propertyNo, pAddress, rentFinish, rent, ownerNo, oName  (candidate key)
fd6  propertyNo, rentStart → clientNo, cName, rentFinish  (candidate key)

Using the functional dependencies above, we identify the partial dependencies on the primary key: in fd1, cName
depends only on clientNo, and in fd2, pAddress, rent, ownerNo and oName depend only on propertyNo. The attributes
rentStart and rentFinish are fully dependent on the whole primary key, that is, on clientNo and propertyNo together.
Removing the partial dependencies results in three new relations called Client, Rental and PropertyOwner. These
three relations are in second normal form.
Client
clientNo  cName
CR76      John Kay
CR56      Aline Stewart

Rental
clientNo  propertyNo  rentStart  rentFinish
CR76      PG4         01-Jul-07  31-Aug-08
CR76      PG16        01-Sep-08  01-Sep-09
CR56      PG4         01-Sep-06  10-Jun-07
CR56      PG36        10-Oct-07  01-Dec-08
CR56      PG16        01-Nov-09  10-Aug-10

PropertyOwner
propertyNo  pAddress                rent  ownerNo  oName
PG4         6 Lawrence St, Glasgow  350   CO40     Tina Murphy
PG16        5 Novar Dr, Glasgow     450   CO93     Tony Shaw
PG36        2 Manor Rd, Glasgow     375   CO93     Tony Shaw
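
The three 2NF relations are projections of the 1NF ClientRental table onto the attributes of each relation, with duplicate rows dropped. A short Python sketch of that projection follows, assuming the 1NF rows are held as dictionaries; `project` is a hypothetical helper name.

```python
COLUMNS = ("clientNo", "cName", "propertyNo", "pAddress", "rentStart",
           "rentFinish", "rent", "ownerNo", "oName")
ROWS_1NF = [dict(zip(COLUMNS, r)) for r in [
    ("CR76", "John Kay", "PG4", "6 Lawrence St, Glasgow", "01-Jul-07", "31-Aug-08", 350, "CO40", "Tina Murphy"),
    ("CR76", "John Kay", "PG16", "5 Novar Dr, Glasgow", "01-Sep-08", "01-Sep-09", 450, "CO93", "Tony Shaw"),
    ("CR56", "Aline Stewart", "PG4", "6 Lawrence St, Glasgow", "01-Sep-06", "10-Jun-07", 350, "CO40", "Tina Murphy"),
    ("CR56", "Aline Stewart", "PG36", "2 Manor Rd, Glasgow", "10-Oct-07", "01-Dec-08", 375, "CO93", "Tony Shaw"),
    ("CR56", "Aline Stewart", "PG16", "5 Novar Dr, Glasgow", "01-Nov-09", "10-Aug-10", 450, "CO93", "Tony Shaw"),
]]

def project(rows, attrs):
    """Project rows onto attrs, dropping duplicates but keeping order."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out

client = project(ROWS_1NF, ("clientNo", "cName"))
rental = project(ROWS_1NF, ("clientNo", "propertyNo", "rentStart", "rentFinish"))
property_owner = project(ROWS_1NF, ("propertyNo", "pAddress", "rent", "ownerNo", "oName"))

print(len(client), len(rental), len(property_owner))  # 2 5 3
```

The row counts match the Client, Rental and PropertyOwner tables above: each client and each property is now stored once instead of once per rental.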

3rd Normal Form (3NF)


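A relation is in third normal form (3NF) if it is in 2NF and no non-primary-key attribute is transitively dependent
on the primary key. A transitive dependency means that a field is only indirectly determined by the primary key: the
field is functionally dependent on another field, which in turn is dependent on the primary key. In PropertyOwner,
oName depends on ownerNo (fd3) and ownerNo depends on the primary key propertyNo, so oName is transitively
dependent on propertyNo. To eliminate the transitive dependency, PropertyOwner is split into the relations Owner
and PropertyForRent.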
Owner
ownerNo  oName
CO40     Tina Murphy
CO93     Tony Shaw

PropertyForRent
propertyNo  pAddress                rent  ownerNo
PG4         6 Lawrence St, Glasgow  350   CO40
PG16        5 Novar Dr, Glasgow     450   CO93
PG36        2 Manor Rd, Glasgow     375   CO93

The resulting 3NF relations have the form:


Client (clientNo, cName)
Rental (clientNo, propertyNo, rentStart, rentFinish)
PropertyForRent (propertyNo, pAddress, rent, ownerNo)
Owner (ownerNo, oName)
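
One way to check that the split into Owner and PropertyForRent loses no information is to join the two relations back together on ownerNo and compare the result with the original PropertyOwner rows. The sketch below does this in Python; `project` is a hypothetical helper name, and the join is written as a simple dictionary lookup on ownerNo.

```python
PROPERTY_OWNER = [
    {"propertyNo": "PG4", "pAddress": "6 Lawrence St, Glasgow", "rent": 350, "ownerNo": "CO40", "oName": "Tina Murphy"},
    {"propertyNo": "PG16", "pAddress": "5 Novar Dr, Glasgow", "rent": 450, "ownerNo": "CO93", "oName": "Tony Shaw"},
    {"propertyNo": "PG36", "pAddress": "2 Manor Rd, Glasgow", "rent": 375, "ownerNo": "CO93", "oName": "Tony Shaw"},
]

def project(rows, attrs):
    """Project rows onto attrs, dropping duplicates but keeping order."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out

# Decompose along the transitive dependency ownerNo -> oName.
owner = project(PROPERTY_OWNER, ("ownerNo", "oName"))
property_for_rent = project(PROPERTY_OWNER, ("propertyNo", "pAddress", "rent", "ownerNo"))

# Join back on ownerNo; ownerNo is the key of Owner, so each lookup is unique.
owner_name = {o["ownerNo"]: o["oName"] for o in owner}
rejoined = [dict(p, oName=owner_name[p["ownerNo"]]) for p in property_for_rent]

print(len(owner), len(property_for_rent))  # 2 3
print(rejoined == PROPERTY_OWNER)          # True: nothing was lost
```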

EXERCISE:
1. In relational database development, what is the process of normalization intended to achieve?
2. Normalize the following customer record

Customer Record
Customer No
Customer Firstname
Customer Surname
Address
Tel No
Supplier No
Supplier name
Supplier address
Stock No
Stock item
Stock cost
Description
Supplier Tel No

3. Design a set of three relations to represent this data that conform to First Normal Form (1NF), selecting
keys as necessary.
Emp_Proj
EmpNo ProjNo Hours EmpName ProjName ProjLocation

4. Members of a sports club can take up to two activities with a personal trainer, for which they
have to pay a fee depending on the activity, as shown below:

Member       Activity 1  Cost 1  Activity 2  Cost 2
John Adams   Tennis      36      Swimming    15
Jane Bloggs  Squash      40      Swimming    15
John Smith   Tennis      36
Mark Jones   Swimming    15      Golf        47
Design a set of three relations to represent this data that conform to First Normal Form (1NF), selecting or
creating keys as necessary. Construct the relations as tables and identify the keys by adding an asterisk (*)
to their names.
5. Identify possible candidate keys in the relation STUDENT with attributes NAME, STUDENT_ID_NO, ADDRESS,
DATE_OF_BIRTH, SEX, FIELD_OF_STUDY, justifying your answer.
