STRUCTURE:
- Normalizations
- Normalizations process:
- Repeating group:
- 2NF
- 3NF
The goal of a relational database design is to generate a set of relation schemas that allows us to store
information without any redundant (repeated) data. It also allows us to retrieve information easily and
more efficiently.
For this we use normal forms as a set of rules. The main objective of designing a relation schema is
to:
Minimize data redundancy
Working through these steps is useful for the requirements analysis and data modelling stages of
software development, and it helps reduce undesirable design problems.
Dependency:
Dependency Diagram
ii) Makes it less likely that an important dependency is overlooked.
Partial dependency:
Transitive Dependency:
A functional dependency
A functional dependency states that the value of one field (or set of fields) is determined by another. If
there are three fields, say X, Y and Z, in a relation r, and
X is dependent on Y (Y -> X) and
Y is dependent on Z (Z -> Y), then
X is also dependent on Z, and therefore on YZ.
The left and right sides of a functional dependency are called the determinant and the dependent,
respectively. Both the determinant and the dependent are sets of attributes. A functional dependency (FD)
is not itself a key constraint. It does not determine or hold back the transaction of a query.
A functional dependency is a many-to-one relationship between two sets of attributes of a given relation r.
Functional dependencies allow us to express constraints that we cannot express with super keys. We shall
use functional dependencies in two ways:
1) To test whether relations are legal under a given set of functional dependencies. If a relation m is
legal under a set R of functional dependencies, we say m satisfies R.
2) To specify constraints on the set of legal relations. We will concern ourselves with only those relations
that satisfy a given set of functional dependencies. If we constrain ourselves to relations on schema m that
satisfy a set R of functional dependencies, we say that R holds on m.
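The first use, testing whether a relation satisfies a functional dependency, can be sketched in a few lines of Python. This is only an illustration: the function name `satisfies` and the sample rows are assumptions made for this example and do not come from the text.

```python
def satisfies(rows, lhs, rhs):
    """Return True if rows that agree on lhs always agree on rhs (lhs -> rhs)."""
    seen = {}
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        dependent = tuple(row[a] for a in rhs)
        # setdefault records the first dependent value seen for a determinant;
        # a later, different value means the dependency is violated.
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True


# Hypothetical sample relation (illustrative data only).
employees = [
    {"SSN": "111", "ENAME": "Ann", "DNUMBER": 5},
    {"SSN": "222", "ENAME": "Bob", "DNUMBER": 5},
    {"SSN": "333", "ENAME": "Ann", "DNUMBER": 4},
]

print(satisfies(employees, ["SSN"], ["ENAME"]))      # True
print(satisfies(employees, ["ENAME"], ["DNUMBER"]))  # False
```

Note that a test like this can only show that one particular relation is legal under the dependency. Whether the dependency holds on the schema is a design decision about all legal relations.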
- Normalizations
Normalization is a method by which we reorganize a table into a more structured form so that
queries are simpler to write and execute. Normalization is a process of assigning attributes to entities.
Database normalization is a data design and organization process applied to data structures based on
their functional dependencies and primary keys, and it helps in building relational databases. This helps in:
The result will be a database that can produce the same information as the original.
- Normalizations process:
Normalization works through a series of stages called normal forms, which are named as follows:
Database normalization is a useful tool for the requirements analysis and data modeling stages of software
development. Normalization is thus the process of reducing undesirable problems by using
functional dependencies and keys.
- Repeating group:
It derives its name from the fact that a group of multiple entries can exist for any single key attribute
occurrence.
Normalizing the table structure will reduce these data redundancies.
In the first normal form, we try to make each table dependent on a primary key. Each field in a table
must be functionally dependent on the primary key.
Converting to 1NF:
A table in a relational database is in 1NF if
Definition of 1NF:
Some tables contain partial dependencies but still are subject to data redundancy.
Example: Consider the DEPARTMENT relation schema, in which the primary key is DNUMBER, and suppose we
extend it by introducing a DLOCATIONS attribute as shown. Each department may have a number of locations.
There are two ways to look at DLOCATIONS:
i) The domain of DLOCATIONS contains atomic values, but some tuples can have a set of these values. In this
case DLOCATIONS is not functionally dependent on the primary key DNUMBER.
ii) The domain of DLOCATIONS contains sets of values and hence is non-atomic. In this case
DNUMBER->DLOCATIONS, because each set is considered a single member of the attribute domain.
In either case, the DEPARTMENT relation is not in 1NF.
There are three ways to bring DEPARTMENT into 1NF:
1) Remove the DLOCATIONS attribute that violates 1NF and place it in a separate relation along with the primary
key DNUMBER of DEPARTMENT. The primary key of the new relation is the combination {DNUMBER, DLOCATION}.
2) Expand the key so that there is a separate tuple in the DEPARTMENT relation for each
location of a department, as in figure c. This has the disadvantage of introducing redundancy.
3) If the maximum number of values for the attribute is known, replace the DLOCATIONS attribute with that
many atomic attributes, for example DLOCATION1, DLOCATION2, and
DLOCATION3. This has the disadvantage of introducing null values if most departments have fewer than
three locations.
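The first technique, moving the multivalued attribute into its own relation, can be sketched as follows. This is a minimal illustration: the helper name `split_multivalued`, the relation name `dept_locations`, and the sample rows are assumptions made for this example.

```python
def split_multivalued(rows, key, attr, new_attr):
    """Move a set-valued attribute into a separate relation keyed by (key, new_attr)."""
    # The base relation keeps every attribute except the multivalued one.
    base = [{k: v for k, v in row.items() if k != attr} for row in rows]
    # The new relation gets one tuple per (key value, single member of the set).
    detail = [
        {key: row[key], new_attr: value}
        for row in rows
        for value in sorted(row[attr])
    ]
    return base, detail


# Hypothetical DEPARTMENT state with a set-valued DLOCATIONS attribute.
department = [
    {"DNAME": "Research", "DNUMBER": 5, "DLOCATIONS": {"Bellaire", "Houston"}},
    {"DNAME": "Administration", "DNUMBER": 4, "DLOCATIONS": {"Stafford"}},
]

department_1nf, dept_locations = split_multivalued(
    department, "DNUMBER", "DLOCATIONS", "DLOCATION"
)
for row in dept_locations:
    print(row)
```

Every value in both resulting relations is atomic, and no tuple of DEPARTMENT is repeated, which is why this technique avoids the redundancy of the second one.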
[Figure with three panels: a) DEPARTMENT, b) DEPARTMENT, c) DEPARTMENT with columns DNAME, DNUMBER, DMGRSSN, DLOCATION]
Figure: Normalization into 1NF. a) A relation schema that is not in 1NF. b) Example state of the
DEPARTMENT relation. c) 1NF version of the DEPARTMENT relation.
1NF also disallows multivalued attributes that are themselves composite. These are called nested relations,
because each tuple can have a relation within it. For example, each tuple represents an employee entity, and a
relation PROJS (PNUMBER, HOURS) within the tuple represents the employee’s projects and the hours worked
per week on each. The schema can be represented as follows:
EMP_PROJ (SSN, ENAME, {PROJS (PNUMBER, HOURS)}). Here the primary key of the EMP_PROJ relation is
SSN, and PNUMBER is the partial key of the nested relation. To normalize into 1NF, we move the nested
relation attributes into a new relation and propagate the primary key into that relation, as shown in the figure.
[Figure with three panels: a) EMP_PROJ, b) EMP_PROJ, c) EMP_PROJ1 and EMP_PROJ2]
Figure: Normalizing nested relations into 1NF. a) Schema of the EMP_PROJ relation with a nested
relation attribute PROJS. b) Extension of the EMP_PROJ relation showing nested relations within
each tuple. c) Decomposition of EMP_PROJ into relations EMP_PROJ1 and EMP_PROJ2 by
propagating the primary key.
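The decomposition of a nested relation can be sketched in Python as below. The helper name `unnest` and the sample employee data are assumptions made for illustration. The attribute names follow the EMP_PROJ schema given above.

```python
def unnest(rows, key, nested):
    """Split a relation with a nested relation attribute into two flat relations."""
    # Parent relation: everything except the nested attribute.
    parent = [{k: v for k, v in row.items() if k != nested} for row in rows]
    # Child relation: each nested tuple, with the parent's primary key propagated.
    child = [{key: row[key], **inner} for row in rows for inner in row[nested]]
    return parent, child


# Hypothetical state of EMP_PROJ (SSN, ENAME, {PROJS (PNUMBER, HOURS)}).
emp_proj = [
    {"SSN": "111", "ENAME": "Ann",
     "PROJS": [{"PNUMBER": 1, "HOURS": 32.5}, {"PNUMBER": 2, "HOURS": 7.5}]},
    {"SSN": "222", "ENAME": "Bob",
     "PROJS": [{"PNUMBER": 3, "HOURS": 40.0}]},
]

emp_proj1, emp_proj2 = unnest(emp_proj, "SSN", "PROJS")
for row in emp_proj2:
    print(row)
```

In this sketch the primary key of `emp_proj2` is the combination {SSN, PNUMBER}, that is, the propagated primary key together with the partial key of the nested relation.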
- 2NF
2NF is a normal form in database normalization. It requires that all the data elements in a table are fully
functionally dependent on the table's primary key. If data elements depend on only part of the primary
key, they are moved out to separate tables. If a table in 1NF has a single field as the primary key, it is
automatically in 2NF. A functional dependency X->Y is a full functional dependency if removal of any
attribute A from X means that the dependency no longer holds, i.e. for any attribute A ∈ X, (X -
{A}) does not functionally determine Y.
A functional dependency X->Y is a partial functional dependency if some attribute A ∈ X can be removed
from X and the dependency still holds, i.e. for some attribute A ∈ X, (X - {A}) -> Y.
Converting to 2NF:
Identify all key components:
ProT: ProNo->Proname.
Define 2NF:
Table is in 2NF, if
It is in 1NF.
- 3NF
3NF is a normal form used in database normalization to check whether all the non-key attributes of a
relation depend only on the candidate keys of the relation. This means that all non-key attributes are
mutually independent; in other words, a non-key attribute cannot be transitively dependent on
the key through another non-key attribute.
A functional dependency X->Y is a transitive dependency if there is a set of attributes Z that is neither a
candidate key nor a subset of any key of R, and both X->Z and Z->Y hold.
A relation schema R is in 3NF if each non-prime attribute of R meets both of the following.
Converting to 3NF:
Resolve transitive dependencies.
Pro_T:proNo-> proname
The test for 2NF involves testing for functional dependencies whose left-hand attributes are part of
the primary key. The EMP_PROJ relation in the figure is in 1NF but not in 2NF. The non-prime
attribute ENAME violates 2NF because of FD2, as do PNAME and PLOCATION because of FD3. FD2 and FD3
make ENAME, PNAME and PLOCATION partially dependent on the primary key {SSN, PNUMBER} of
EMP_PROJ, thus violating 2NF. EMP_PROJ is decomposed into EP1, EP2, and EP3, each of which is in
2NF, as shown in Figure a.
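The 2NF test and decomposition described above can be sketched as follows. The function name `decompose_2nf` is hypothetical, and FD1 is assumed here to be {SSN, PNUMBER} -> HOURS, since the text names FD1 without spelling it out. The sketch also assumes that no two partial dependencies share the same left-hand side.

```python
def decompose_2nf(attributes, primary_key, fds):
    """Split off every dependency whose left side is a proper part of the key."""
    key = set(primary_key)
    # A partial dependency has a determinant that is a proper subset of the
    # primary key and at least one dependent attribute outside the key.
    partial = [(lhs, rhs - key) for lhs, rhs in fds if lhs < key and rhs - key]
    moved = set().union(*(rhs for _, rhs in partial))
    remainder = set(attributes) - moved
    return [remainder] + [lhs | rhs for lhs, rhs in partial]


emp_proj = ["SSN", "PNUMBER", "HOURS", "ENAME", "PNAME", "PLOCATION"]
fds = [
    ({"SSN", "PNUMBER"}, {"HOURS"}),        # FD1 (assumed for this sketch)
    ({"SSN"}, {"ENAME"}),                   # FD2
    ({"PNUMBER"}, {"PNAME", "PLOCATION"}),  # FD3
]

ep1, ep2, ep3 = decompose_2nf(emp_proj, ["SSN", "PNUMBER"], fds)
for schema in (ep1, ep2, ep3):
    print(sorted(schema))
```

Each resulting schema keeps its determinant as the primary key, so every non-prime attribute is fully dependent on the key of the relation it ends up in.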
[Figure a: 2NF normalization. EMP_PROJ with functional dependencies FD1, FD2, and FD3]
The relation schema EMP_DEPT in the following figure is in 2NF but not in 3NF because of the transitive
dependency of DMGRSSN (and also DNAME) on SSN via DNUMBER. EMP_DEPT is decomposed into ED1 and
ED2, each of which is in 3NF, as in Figure b.
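The removal of the transitive dependency can be sketched in Python. The function name `decompose_3nf` is hypothetical, the sketch assumes the relation has a single candidate key, and the attribute ENAME is included only as an assumed example of an attribute that depends directly on SSN.

```python
def decompose_3nf(attributes, key, fds):
    """Split off dependencies Z -> Y where Z is neither a super key nor part of the key."""
    key = set(key)
    transitive = [
        (lhs, rhs - key - lhs)
        for lhs, rhs in fds
        # With a single candidate key, lhs is a super key exactly when it
        # contains the key. Determinants inside the key are a 2NF matter.
        if not key <= lhs and not lhs <= key and rhs - key - lhs
    ]
    moved = set().union(*(rhs for _, rhs in transitive))
    return [set(attributes) - moved] + [lhs | rhs for lhs, rhs in transitive]


emp_dept = ["SSN", "ENAME", "DNUMBER", "DNAME", "DMGRSSN"]
fds = [
    ({"SSN"}, {"ENAME", "DNUMBER"}),      # direct dependencies on the key
    ({"DNUMBER"}, {"DNAME", "DMGRSSN"}),  # makes DNAME, DMGRSSN transitive on SSN
]

ed1, ed2 = decompose_3nf(emp_dept, ["SSN"], fds)
print(sorted(ed1))
print(sorted(ed2))
```

DNUMBER stays in ED1 as a foreign key, so joining ED1 and ED2 on DNUMBER recovers the original information.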
[Figure b: EMP_DEPT and its decomposition into ED1 and ED2]
BCNF is a normal form used in database normalization. It is a slightly stronger version of 3NF.
(b) For every one of its nontrivial functional dependencies X->Y, X is a super key.
OR: a relation schema R is in BCNF if and only if, whenever a nontrivial functional dependency X->A holds in R,
X is a super key of R.
Converting to BCNF:
Each and every determinant in the table is a candidate key.
A candidate key has the same characteristics as a primary key but, for some reason, was not chosen as the primary key.
If a table contains only one candidate key, 3NF and BCNF are equivalent.
BCNF can only be violated if the table contains more than one candidate key.
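The BCNF condition, that every nontrivial determinant is a super key, can be checked with an attribute closure. The sketch below is illustrative only: the function names and the relation with attributes STUDENT, COURSE and INSTRUCTOR are assumptions, chosen because that relation has two overlapping candidate keys.

```python
def closure(attrs, fds):
    """Return all attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD whenever its determinant is already covered.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """List the nontrivial FDs whose determinant is not a super key."""
    attributes = set(attributes)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not closure(lhs, fds) >= attributes
    ]


# Hypothetical relation: each instructor teaches one course, and a student
# has one instructor per course.
teach = ["STUDENT", "COURSE", "INSTRUCTOR"]
fds = [
    ({"STUDENT", "COURSE"}, {"INSTRUCTOR"}),
    ({"INSTRUCTOR"}, {"COURSE"}),
]

print(bcnf_violations(teach, fds))
```

Here {STUDENT, COURSE} and {STUDENT, INSTRUCTOR} are both candidate keys, so the table is in 3NF, but INSTRUCTOR alone is a determinant that is not a super key, which is the kind of violation BCNF rules out.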
Example:
2NF:
Supply table:
3NF:
Supply Table:
Sid Status
S1 20
S2 10
S3 10
S4 10
City Table:
Status City
20 PARIS
20 LONDON
Example No.2
2NF:
Adv Adv_room
J 412
S 216