
DATABASE MANAGEMENT SYSTEM

PITFALLS OF RELATIONAL DATABASE SYSTEM:


A relational database design may have some disadvantages. They are

-Repetition of information

-Inability to represent certain information

Consider the following example:

Relation: lending

Branch Name  Branch City  Assets  Cname  Lno  Amt
XYZ          Chennai      400000  A      L-2  30000
PQR          Madurai      600000  B      L-3  40000
ABC          Coimbatore   300000  C      L-4  50000
PQR          Madurai      600000  D      L-5  60000
XYZ          Chennai      400000  E      L-6  70000

Problems:

In the above relation, we have the following problems.

Insertion
Suppose we need to insert a tuple for branch 'XYZ' with customer 'F' and amount 800000. We need to repeat 'Chennai' and 400000
again, which wastes storage space. Also, we cannot insert a branch without a loan number and amount.

Updation
If 'XYZ' moves from Chennai to another city, the update must be made in every tuple for that branch. If any one is not
changed, it leads to data inconsistency.

Deletion
Suppose that when a loan is paid off we delete its tuple. If it is the last tuple for that branch, this results in loss of the branch information.

Data Redundancy
In the above example, Chennai, Madurai and their assets are repeated. This leads to repetition of
data.

To avoid redundancy, we can decompose the relation into 2 relations.

Branch (bname, city, assets, cname)

Loan (cname, lno, amt)

Decomposition

The process of dividing a relation into several relations is called decomposition.

R -> R1 U R2 U R3 ... U Rn

Here,

R -> Original relation

R1, R2 ... Rn -> Sub-relations of R

Advantages
-Avoids redundancy

-Avoids inconsistency

-Avoids anomalies (insertion, deletion and updation)

Lossy Decomposition

If information is lost, then it is a lossy (or) lossy-join decomposition. When the decomposed relations
are joined, if there is any loss of information, then the decomposition is lossy.

In a lossy decomposition, the join contains the tuples present in the original relation and also extra (spurious) tuples.

Let us consider the relation R with the attributes A, B, C.

R = (A, B, C)

A   B   C
A1  B1  C1
A3  B1  C2
A2  B2  C3
A4  B2  C4

Decompose

R1:        R2:
A   B      B   C
A1  B1     B1  C1
A3  B1     B1  C2
A2  B2     B2  C3
A4  B2     B2  C4

Join

A   B   C
A1  B1  C1
A1  B1  C2
A3  B1  C1
A3  B1  C2
A2  B2  C3
A2  B2  C4
A4  B2  C3
A4  B2  C4

The relation R is decomposed into the relations R1(A,B) and R2(B,C). When these relations are joined, the result has 4 extra tuples,
so we can no longer tell which tuples belonged to the original relation. This results in loss
of information.
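The lossy join above can be reproduced with a short Python sketch. The helper names (relation, project, natural_join) are hypothetical and not part of the notes; relations are modelled as sets of tuples, which is an assumption made only for illustration.

```python
def relation(attrs, rows):
    """Build a relation as a set of frozensets of (attribute, value) pairs."""
    return {frozenset(zip(attrs, row)) for row in rows}

def project(rel, attrs):
    """Projection: keep only attrs; duplicates vanish because rel is a set."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}

def natural_join(left, right):
    """Natural join: combine tuples that agree on all shared attributes."""
    out = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                out.add(frozenset({**ld, **rd}.items()))
    return out

# The relation R(A, B, C) from the example above.
R = relation("ABC", [("A1", "B1", "C1"), ("A3", "B1", "C2"),
                     ("A2", "B2", "C3"), ("A4", "B2", "C4")])
R1 = project(R, {"A", "B"})
R2 = project(R, {"B", "C"})

joined = natural_join(R1, R2)
spurious = joined - R  # tuples that were never in R
print(len(R), len(joined), len(spurious))  # 4 8 4
```

The join keeps every original tuple but adds four spurious ones, because B does not determine A or C.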

EXAMPLES:

Original relation:

BNAME  BCITY      ASSETS   CNAME  LNO   AMT
XYZ    Chennai    1170000  A      L-5   3000
PQR    Mumbai     120000   B      L-10  5000
LMN    Calcutta   800000   A      L-15  7000
DEF    Hyderabad  1600000  C      L-20  8000
PQR    Mumbai     120000   D      L-25  9000

Decomposed relations:

BNAME  BCITY      ASSETS   CNAME     CNAME  LNO   AMT
XYZ    Chennai    1170000  A         A      L-5   3000
PQR    Mumbai     120000   B         B      L-10  5000
LMN    Calcutta   800000   A         A      L-15  7000
DEF    Hyderabad  1600000  C         C      L-20  8000
PQR    Mumbai     120000   D         D      L-25  9000

Joined relation:

BNAME  BCITY      ASSETS   CNAME  LNO   AMT
XYZ    Chennai    1170000  A      L-5   3000
XYZ    Chennai    1170000  A      L-15  7000
PQR    Mumbai     120000   B      L-10  5000
LMN    Calcutta   800000   A      L-5   3000
LMN    Calcutta   800000   A      L-15  7000
DEF    Hyderabad  1600000  C      L-20  8000
PQR    Mumbai     120000   D      L-25  9000

This is a lossy decomposition because, after decomposition, the joined relation has additional tuples.

LOSSLESS JOIN DECOMPOSITION


The relation 'R' is decomposed into several relations. When the decomposed relations are joined,
if there is no loss of information, then it is called a lossless-join decomposition.

Relation: Lending

BNAME  BCITY      ASSETS   CNAME  LNO  AMT
XYZ    Chennai    1170000  A      L-1  3000
PQR    Mumbai     120000   B      L-2  5000
ABC    Calcutta   1800000  C      L-3  7000
DEF    Delhi      1600000  D      L-4  9000
GHI    Hyderabad  120000   E      L-5  11000

branch:

BNAME  BCITY      ASSETS
XYZ    Chennai    1170000
PQR    Mumbai     120000
ABC    Calcutta   1800000
DEF    Delhi      1600000
GHI    Hyderabad  120000

loan:

LNO  BNAME  AMT
L-1  XYZ    3000
L-2  PQR    5000
L-3  ABC    7000
L-4  DEF    9000
L-5  GHI    11000

borrower:

CNAME  LNO
A      L-1
B      L-2
C      L-3
D      L-4
E      L-5

JOIN

BNAME  BCITY      ASSETS   CNAME  LNO  AMT
XYZ    Chennai    1170000  A      L-1  3000
PQR    Mumbai     120000   B      L-2  5000
ABC    Calcutta   1800000  C      L-3  7000
DEF    Delhi      1600000  D      L-4  9000
GHI    Hyderabad  120000   E      L-5  11000


The relation is decomposed into three relations. They are

Branch (Bname, Bcity, assets)

Loan (Lno, Bname, amt)

Borrower (cname, Lno)

When the decomposed relations are joined, we get the original relation, i.e., there are no
additional tuples.

NORMALIZATION:
It helps in eliminating data redundancy and inconsistency problems.

Types,

⦁ First normal form (1NF)

⦁ Second normal form (2NF)

⦁ Third normal form (3NF)

⦁ Boyce-Codd normal form (BCNF)

⦁ Fourth normal form (4NF)

⦁ Fifth normal form (5NF)

Structurally, 2NF is better than 1NF, and 3NF is better than 2NF.

Generally, the higher the normal form, the more joins are required to produce a specified output, and the
slower the database system responds. To improve performance, denormalization is used.

DENORMALIZATION:
It produces a lower normal form, i.e., a table in third normal form would be converted to 2NF.

FIRST NORMAL FORM:

A table is said to be in 1NF if it satisfies the following conditions. A table that violates them is in unnormalized form.

-It should not contain repeated groups of data

-All attributes must be simple

-It should not contain multi-valued (or) composite attributes

-Each column should contain one and only one value.

Consider the following example

EMPLOYEE

ENO  ENAME  DESIGNATION  CHILD
                         NAME  AGE
E-3  X      Engineer     a     10
                         b     5
                         c     7
E-4  Y      Clerk        p     5
                         q     8

-To bring the table to 1NF we should eliminate repeating groups by splitting the table into two.

-The composite attributes are changed into simple attributes.

eg:

child (name, age) -> child name, child age

-Also, the multivalued attributes are removed.

employee

ENO  ENAME  DESIGNATION
E-3  X      Engineer
E-4  Y      Clerk

child

ENO  CHILDNAME  CHILDAGE
E-3  a          10
E-3  b          5
E-3  c          7
E-4  p          5
E-4  q          8

Now, the tables are in 1NF.
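The split into employee and child can be sketched in Python as below. The nested "children" field stands in for the repeating group, and the function name to_1nf is a hypothetical helper introduced only for this illustration.

```python
# Unnormalized EMPLOYEE rows: "children" is a repeating group of (name, age).
employees_unf = [
    {"eno": "E-3", "ename": "X", "designation": "Engineer",
     "children": [("a", 10), ("b", 5), ("c", 7)]},
    {"eno": "E-4", "ename": "Y", "designation": "Clerk",
     "children": [("p", 5), ("q", 8)]},
]

def to_1nf(rows):
    """Split rows into an employee table and a child table keyed by ENO."""
    employee, child = [], []
    for row in rows:
        employee.append((row["eno"], row["ename"], row["designation"]))
        for name, age in row["children"]:
            # One child per row, so every column holds exactly one value.
            child.append((row["eno"], name, age))
    return employee, child

employee, child = to_1nf(employees_unf)
for row in child:
    print(row)
```

Each child row repeats ENO, which links it back to its employee.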

FUNCTIONAL DEPENDENCY

An attribute B is said to be functionally dependent on attribute A if every value of A uniquely
determines the value of B.

A->B

A is called the determinant because it determines the value of B. B is functionally dependent on A.

eg:

Regno -> stud_name

The value of Regno determines the name of the student.
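Whether a dependency such as Regno -> stud_name holds in a given set of rows can be checked mechanically. The sketch below assumes rows are Python dicts; the function name holds and the sample student data are hypothetical. Note that a check on sample data can only refute a dependency, since a dependency is a rule about all possible data.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical sample data: two different students share the name "Ravi".
students = [
    {"regno": 101, "stud_name": "Ravi", "course": "DBMS"},
    {"regno": 102, "stud_name": "Ravi", "course": "OS"},
    {"regno": 103, "stud_name": "Mala", "course": "DBMS"},
    {"regno": 101, "stud_name": "Ravi", "course": "OS"},
]

regno_determines_name = holds(students, ["regno"], ["stud_name"])
name_determines_regno = holds(students, ["stud_name"], ["regno"])
print(regno_determines_name, name_determines_regno)  # True False
```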

FULL FUNCTIONAL DEPENDENCY

A non-prime attribute is said to be fully functionally dependent on a composite key if it depends on
the whole key but not on any proper subset of it.

{I, J} -> K

{I} -/-> K (K is not functionally dependent on I alone)

{J} -/-> K (K is not functionally dependent on J alone)

eg:

{sname, address} -> course

sname -/-> course

address -/-> course

SECOND NORMAL FORM (2 NF)

A table is said to be in 2NF if

-It is in 1NF

-All the non-prime attributes are fully functionally dependent on the primary key.

CONSIDER THE FOLLOWING TABLE:

PROD  SUPPLIER  SUP-NAME  SUP-CITY  PRICE
P1    S1        X         Chennai   50
P2    S2        Y         Trichy    45
P3    S1        X         Chennai   100
P4    S3        Z         Salem     900
P5    S1        X         Chennai   150

primary key -> {prod, supplier}

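PARTIAL FUNCTIONAL DEPENDENCY:


{prod, supplier} -> sup-name

prod -/-> sup-name (sup-name does not depend on prod)

supplier -> sup-name (sup-name depends on only a part of the key)

{prod, supplier} -> sup-city

prod -/-> sup-city (sup-city does not depend on prod)

supplier -> sup-city (sup-city depends on only a part of the key)

FULL FUNCTIONAL DEPENDENCY:


{prod, supplier} -> price

prod -/-> price (price does not depend on prod alone)

supplier -/-> price (price does not depend on supplier alone)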
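The distinction can be checked from a list of declared dependencies. The sketch below is a simplification: it inspects only the dependencies that are explicitly listed and does not infer new ones. The function name partial_dependencies is hypothetical.

```python
def partial_dependencies(key, fds):
    """Return declared FDs whose left side is a proper subset of the key
    and whose right side is a non-prime attribute."""
    key = set(key)
    found = []
    for lhs, rhs in fds:
        is_proper_subset = set(lhs).issubset(key) and set(lhs) != key
        if is_proper_subset and not set(rhs).issubset(key):
            found.append((lhs, rhs))
    return found

key = ("prod", "supplier")
# Dependencies as stated for the product/supplier table.
fds = [
    (("prod", "supplier"), ("price",)),
    (("supplier",), ("sup_name",)),
    (("supplier",), ("sup_city",)),
]

partial = partial_dependencies(key, fds)
print(partial)
# [(('supplier',), ('sup_name',)), (('supplier',), ('sup_city',))]
```

Only price depends on the whole key, so sup_name and sup_city are the attributes that keep the table out of 2NF.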
PROBLEMS:
Although the table is in 1NF, it may have the following problems.

1.DATA REDUNDANCY:
In the above table, sup-name and sup-city are repeated.

eg:

S1, X, 'Chennai'

2.UPDATION ANOMALY:
Suppose 'S1' moves from 'Chennai' to 'Madurai'. More than one tuple must be updated. If all
three tuples are not updated, it leads to data inconsistency.

3.INSERTION ANOMALY:
Suppose 'S4' is a supplier who is not supplying any product at present. We cannot insert S4's details,
because the primary key needs a product value as well.

4.DELETION ANOMALY:
If 'S3' temporarily stops supplying products, deleting its tuple loses the information about sup_name and
sup_city. We will not be able to contact the supplier in future.

To avoid all of these problems, the table is split as follows.

T-1

PROD  SUPPLIER  PRICE
P1    S1        50
P2    S2        45
P3    S1        100
P4    S3        900
P5    S1        150

Primary Key

{prod, supplier}

{prod, supplier} -> price

T-2

SUPPLIER  SUP-NAME  SUP-CITY
S1        X         Chennai
S2        Y         Trichy
S3        Z         Salem

Primary Key

Supplier

Supplier->sup_name

Supplier->sup_city

PROBLEMS AVOIDED:

1.DATA REDUNDANCY:
The supplier information {S1, X, 'Chennai'} is stored only once in T-2, so repetition is avoided.

2.UPDATION:
The update from Chennai to Madurai for supplier 'X' is also done only once, in T-2.

3.DELETION:
We can delete 'S3' from T-1 without loss of data, because we can still obtain the sup_name and
sup_city from T-2. Thus the tables are in 2NF.

TRANSITIVE DEPENDENCY:
Suppose we have the dependencies

A->B

B->C

A->C

i.e., attribute C is functionally dependent on the primary key A, but it also depends on another non-prime
attribute B. Such a dependency is called a transitive dependency.

THIRD NORMAL FORM:


A table is said to be in 3NF if it is in 2NF and it contains no transitive dependencies. Consider the table
given below.

SUPPLIER #  SUP-NAME  SUP-CITY  STATUS
S1          X         Chennai   10
S2          Y         Trichy    20
S3          Z         Salem     30
S4          P         Chennai   10

The table has the following dependencies


Supplier->Sup_name

Supplier->Sup_city

Supplier->Status

Sup_city->Status

The non-prime attribute status depends on another non-prime attribute, sup_city, so the table has a
transitive dependency.

To bring the table to 3NF, we should eliminate the transitive dependency. Note that the table is already in
2NF, because all non-prime attributes are fully functionally dependent on the primary key.

PROBLEMS:
Although the table is in 2NF, it has some problems. They are,

1.UPDATE ANOMALY:
If the status value for 'Chennai' is changed from 10 to 100, we need to update 2 rows. If not, we will have
two status values for Chennai. This leads to data inconsistency.

2.INSERTION ANOMALY:
Suppose we want to record the status value 50 for all suppliers from Mumbai. We cannot add such information
unless there is a supplier from that city, because we need a value for supplier.

3.DELETION ANOMALY:
Suppose 's3' stops supplying products and we delete the record belonging to supplier 's3'. We lose the
information that the status value of Salem city is 30. To avoid all of these, we split the table as below.

T-1                             T-2
SUPPLIER  SUP-NAME  SUP-CITY    SUP-CITY  STATUS
S1        X         Chennai     Chennai   10
S2        Y         Trichy      Trichy    20
S3        Z         Salem       Salem     30
S4        P         Chennai

UPDATE:
If the status value for Chennai is changed from 10 to 100, we change it in only one place, in T-2.

INSERTION:
We can add the city {Mumbai, 50} in T-2 without a supplier.

DELETION:
We may delete the record of s3 from T-1; the remaining information (Salem, 30) is still retrieved from T-2.
Thus it avoids the loss of information.
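The three fixes can be traced in a few lines of Python. Using dictionaries for T-1 and T-2 is an assumption made for illustration, and status_of is a hypothetical helper that plays the role of a join on sup_city.

```python
# The supplier table before the split: (supplier, sup_name, sup_city, status).
supplier = [
    ("S1", "X", "Chennai", 10),
    ("S2", "Y", "Trichy", 20),
    ("S3", "Z", "Salem", 30),
    ("S4", "P", "Chennai", 10),
]

# T-1: supplier -> (sup_name, sup_city)
t1 = {s: (name, city) for s, name, city, _ in supplier}
# T-2: sup_city -> status, so each city's status is stored once
t2 = {city: status for _, _, city, status in supplier}

t2["Chennai"] = 100  # update: one change covers both S1 and S4
t2["Mumbai"] = 50    # insertion: a city status with no supplier yet
del t1["S3"]         # deletion: Salem's status survives in T-2

def status_of(s):
    """Look up a supplier's status by joining T-1 and T-2 on sup_city."""
    return t2[t1[s][1]]

print(status_of("S1"), status_of("S4"), t2["Salem"])  # 100 100 30
```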

BOYCE-CODD NORMAL FORM(BCNF):

A table is said to be in BCNF if

-it is in 3NF

-each determinant is a candidate key.

i.e., if we have A->B, then A is called the determinant. If a table contains only one candidate key, then 3NF and
BCNF are equivalent.

Consider the table which stores information concerning students, the sports in which they participate and
their coaches.

RULES FOR THE TABLE:


-Each student may participate in one (or) more sports.

-For each sport, a student may have a different coach.

-Each coach may work with only one sport.

-Each sport may have several coaches.

STUDENT SPORT COACH

X Baseball ABC

X Volleyball XYZ

Y Volleyball LMN

P Baseball ABC

Z Baseball ABC

The above table has two candidate keys, student+sport and student+coach.

Since student+sport is used as the primary key, we have the following dependencies.

Student+sport->coach

Coach->sport

The prime attribute sport depends on a non-prime attribute, coach, i.e., coach is a determinant but it is not a
candidate key. So the table is in 3NF but not in BCNF.
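The BCNF condition can be tested with an attribute-closure computation. The sketch below checks the slightly more general textbook condition that every determinant is a superkey; the function names closure and bcnf_violations are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure: every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs).issubset(result) and not set(rhs).issubset(result):
                result.update(rhs)
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """Return the non-trivial FDs whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs).issubset(lhs) and closure(lhs, fds) != set(all_attrs)
    ]

attrs = {"student", "sport", "coach"}
fds = [
    (("student", "sport"), ("coach",)),
    (("coach",), ("sport",)),
]

violations = bcnf_violations(attrs, fds)
print(violations)  # [(('coach',), ('sport',))]
```

The closure of coach is only {coach, sport}, so coach cannot identify a row on its own.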
Although the table is in 3NF, it has the following problems.

1.DATA REDUNDANCY:

The information that 'ABC' coaches 'Baseball' is stored three times, which leads to data redundancy.

2.INSERTION ANOMALY:
We cannot add a new coach 'PQR' and the sport coached unless a student is participating in it.

3.UPDATION ANOMALY:
Suppose the coach 'ABC' is replaced by another coach 'DEF'. We must update three rows. If not, there may
be different coaches recorded for baseball.

4.DELETION ANOMALY:
Suppose 'y' stops playing volleyball. If we delete the record of 'y', then the information that
'LMN' coaches volleyball is lost.

To eliminate all these problems, we split the table as follows.

T-1                T-2
STUDENT  COACH     COACH  SPORT
X        ABC       ABC    Baseball
X        XYZ       XYZ    Volleyball
Y        LMN       LMN    Volleyball
P        ABC
Z        ABC

DATA REDUNDANCY:
The information ('ABC', 'Baseball') is stored only once in T-2, so the redundancy problem is avoided.

UPDATION:
We can change the baseball coach from 'ABC' to 'DEF' only once in T-2, because that fact is represented only one time.

INSERTION:
We can add the coach 'PQR' in T-2 without a student name.

DELETION:
We can delete 'y' from T-1 and still get the coach and sport information from T-2, hence loss of data is
avoided.

MULTI-VALUED DEPENDENCY:
If for each value of attribute A there is a set of one or more associated values of B, it is known as a multivalued
dependency, denoted as

A->->B.

Eg:

CUST-NAME  CUST-STREET   CUST-CITY
X          T-nagar       Chennai
           Jaya nagar    Bangalore
Y          Airport road  Bangalore
           Egmore        Chennai

Consider a bank with wealthy customers who have several addresses (say
winter/summer homes).

Primary Key: customer_name.

In the above table we have multi-valued dependencies.

Cust_name->->cust_street.

Cust_name->->cust_city.

GROUP FUNCTIONS:
A group function returns a result based on a group of rows. Some of these are purely mathematical functions.

1)AVG:
It returns the average of the values of the column specified in the argument of the function.

eg:

SQL> select avg(total) from student;

2)MIN:
It returns the least of all values of the column given in the argument.

eg:

SQL> select min(total) from student;

3)MAX:
It gives the maximum of a set of values.

eg:
SQL> select max(total) from student;

4)SUM:
It can be used to obtain the sum of a range of values.

eg:

SQL> select sum(total) from student;

5)COUNT:
It returns the number of rows. count(*) counts all rows, count(distinct column) counts the distinct values, and count(column) counts the rows in which the column is not null.

eg:

SQL> select count(*) from student;

SQL> select count(distinct sname) from student;

SQL> select count(sno) from student;
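The queries above can be tried without an Oracle installation. The sketch below runs them in SQLite through Python's built-in sqlite3 module, which is a substitute engine and not the SQL*Plus session shown in the notes. The rows in the student table are hypothetical, and one sno is left null to show how the three forms of count differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (sno INTEGER, sname TEXT, total INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    # Hypothetical rows; the last student has no sno (NULL).
    [(1, "Anu", 80), (2, "Bala", 60), (3, "Anu", 70), (None, "Devi", 90)],
)

def one(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

avg_total = one("SELECT AVG(total) FROM student")
min_total = one("SELECT MIN(total) FROM student")
max_total = one("SELECT MAX(total) FROM student")
sum_total = one("SELECT SUM(total) FROM student")
count_all = one("SELECT COUNT(*) FROM student")
count_names = one("SELECT COUNT(DISTINCT sname) FROM student")
count_sno = one("SELECT COUNT(sno) FROM student")

print(avg_total, min_total, max_total, sum_total)  # 75.0 60 90 300
print(count_all, count_names, count_sno)           # 4 3 3
```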

FOURTH NORMAL FORM(4NF):


A table is said to be in 4NF if
-It is in BCNF.
-It does not contain multiple sets of multivalued dependencies.
EG:
Consider a table which stores information concerning students, the sports in which
they participate and their coaches.

RULES FOR THE TABLE:


-Each student may participate in many sports.
-Each student may be assigned several coaches.
-A coach may assist students in many sports.

STUDENT  SPORTS      COACH
x        Baseball    ABC
x        Volleyball  XYZ
x        Baseball    LMN
y        Volleyball  ABC
y        Football    XYZ
z        Football    ABC
z        Football    LMN

PRIMARY KEYstudent+sport

DEPENDENCIES:
Student->->sport
Student->->coach

PROBLEMS:
1.DATA REDUNDANCY:
We repeat the information 'x plays baseball' and 'z plays football' 2 times each in the above
table, which leads to data redundancy.

2.UPDATION:
If we want to change 'z plays football' to volleyball, we have to update 2 rows. If we
change only one row, it leads to data inconsistency.

3.INSERTION:
Suppose we want to insert 'y is coached by XYZ'. We need to specify whether the coach teaches
volleyball or football or both or none.

4.DELETION:
Suppose the following suddenly stop.
X plays baseball
Z plays football
We need to delete the 2 records for each, which results in loss of
information about coaches.

To avoid these problems, the table is split as follows.

T1:
STUDENT  SPORTS
X        baseball
X        volleyball
Y        volleyball
Y        football
Z        football

T2:
STUDENT  COACH
X        ABC
X        XYZ
X        LMN
Y        ABC
Y        XYZ
Z        ABC
Z        LMN

FIFTH NORMAL FORM(5NF):
A table is said to be in 5NF if
-The relation is in 4NF.
-The relation has a lossless join dependency.
-The table is broken into as many relations as possible.
It is also called PJNF (Project-Join Normal Form).

SUB    LECTURER  SEMESTER
c      A         1
maths  A         1
maths  B         1
maths  C         2
C++    D         2

T1:

SUB    LECTURER
c      A
maths  A
maths  B
maths  C
C++    D

T2:

SUB    SEM
c      1
maths  1
maths  2
C++    2

T3:

LECTURER  SEM
A         1
B         1
C         2
D         2
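For this particular data, joining all three projections gives back the original table, while joining only T1 and T2 does not. The following Python sketch checks that by brute force. Modelling tables as sets of tuples is an assumption for illustration, and the variable names are hypothetical.

```python
# Original table: (subject, lecturer, semester).
R = {
    ("c", "A", 1),
    ("maths", "A", 1),
    ("maths", "B", 1),
    ("maths", "C", 2),
    ("C++", "D", 2),
}

# The three binary projections T1, T2 and T3.
T1 = {(sub, lec) for sub, lec, _ in R}
T2 = {(sub, sem) for sub, _, sem in R}
T3 = {(lec, sem) for _, lec, sem in R}

# Join T1 and T2 on subject: this alone produces spurious rows.
two_way = {(sub, lec, sem) for sub, lec in T1 for sub2, sem in T2 if sub == sub2}

# Keep only the rows whose (lecturer, semester) pair appears in T3.
three_way = {row for row in two_way if (row[1], row[2]) in T3}

print(len(two_way), len(three_way), three_way == R)  # 8 5 True
```

The two-way join has 8 rows, three of them spurious; the third projection filters them out.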
