
CHAPTER FIVE

DATABASE DESIGN
Database design is the process of producing the different kinds of specifications for the data to be
stored in the database. It is one of the middle phases of information systems development when the
system follows a database approach. In the design phase we describe how the data should be
perceived at different levels and, finally, how it is going to be stored in a computer system.
4.1 Database Development Life Cycle
As database design is one component of most information system development tasks, there are several
steps in designing a database system. Here more emphasis is given to the design phases of the system
development life cycle. The major steps in database design are:
1. Planning: identifying an information gap in an organization and proposing a database solution to
solve the problem.
2. Analysis: concentrates on fact finding about the problem or the opportunity. Feasibility
analysis, requirement determination and structuring, and selection of the best design method are also
performed at this phase.
3. Design: the phase that receives the most emphasis in database design. It is further divided
into three sub-phases:
• Conceptual Design
• Logical Design
• Physical Design
4. Implementation: the testing and deployment of the designed database for use.
5. Operation and Support: administering and maintaining the operation of the database system and
providing support to users.
In developing a good design, one should answer such questions as:
• What are the relevant entities for the organization?
• What are the important features of each entity and of the relationships?
• What are the important queries from the users?
• What are the other requirements of the organization and the users?

4.2 Levels of Database Design


Database system development has several phases. Of these, the prime interest of database
development is the design phase, which is again subdivided into three sub-phases:
1. Conceptual design
2. Logical design
3. Physical design

[Figure: The three levels of database design]

Conceptual Database Design
The information gathered in the requirements analysis step is used to develop a high-level description
of the data to be stored in the database, along with the constraints that are known to hold over this
data. This step is often carried out using the ER model, or a similar high-level data model, and is
discussed in the rest of this chapter.
Logical Database Design
This step converts the complex data structures of the conceptual design into simpler structures. The
process is independent of a particular DBMS and of other physical considerations; only the structure
of the data is considered.
Physical Database Design
In this step we must consider the typical expected workloads that our database must support and further
refine the database design to ensure that it meets the desired performance criteria. This step:
• Describes the storage structures and access methods used to achieve efficient access to the
data.
• Is tailored to a specific DBMS; its characteristics are a function of the DBMS and the operating
system.
• Includes an estimate of storage space.

4.2.1 Conceptual Database Design


Conceptual design revolves around discovering and analyzing organizational and user data
requirements.
• The important activities are to identify:
  ✓ Entities
  ✓ Attributes
  ✓ Relationships
  ✓ Constraints
• Based on these components, the ER model is developed using ER diagrams.

Entity-Relationship (ER) Model
The entity-relationship (ER) data model allows us to describe the data involved in a real-world
enterprise in terms of objects and their relationships and is widely used to develop an initial database
design. In this section, we introduce the ER model and discuss how its features allow us to model a
wide range of data faithfully.
The ER model is important primarily for its role in database design. It provides useful concepts that
allow us to move from an informal description of what users want from their database to a more
detailed and precise description that can be implemented in a DBMS.
In this section, we will follow the traditional approach of concentrating on the database structures and
constraints during database design. We will present the modeling concepts of the Entity-Relationship
(ER) model, which is a popular high-level conceptual data model. This model and its variations are
frequently used for the conceptual design of database applications, and many database design tools
employ its concepts.
Components of the Entity-Relationship (ER) Model

1. Entities
2. Attributes
3. Relationships
4. Relational constraints

1. Entities
The basic object that the ER model represents is an entity, which is a "thing" in the real world with an
independent existence. An entity may be an object with a physical existence (a particular person, car,
house, or employee) or it may be an object with a conceptual existence (a company, a job, or a
university course).
• Entity Types and Entity Sets
A database usually contains groups of entities that are similar. For example, a company employing
hundreds of employees may want to store similar information concerning each of the employees.
These employee entities share the same attributes, but each entity has its own value(s) for each
attribute. An entity type defines a collection (or set) of entities that have the same attributes. The
collection of all entities of a particular entity type in the database at any point in time is called an
entity set; the entity set is usually referred to using the same name as the entity type. For example,
EMPLOYEE refers to both a type of entity and the current set of all employee entities in the
database.
2. Attributes
Each entity has attributes, the particular properties that describe it. For example, an employee entity
may be described by the employee's name, age, address, salary, and job. A particular entity will have a
value for each of its attributes. The attribute values that describe each entity become a major part of
the data stored in the database.
For example: FName, MName, Age, and HomePhone can be attributes of the employee entity,
with the values "Bereket", "Kebede", "23", and "046-881-3456", respectively.
Types of Attributes
Several types of attributes occur in the ER model: simple versus composite; single-valued versus
multivalued; and stored versus derived. We first define these attribute types and illustrate their use via
examples. We then introduce the concept of a null value for an attribute.
Composite versus Simple (Atomic) Attributes
Composite attributes can be divided into smaller subparts, which represent more basic attributes with
independent meanings. For example, the Address attribute of the employee entity can be sub-divided
into City, Region, and Zip. Attributes that are not divisible are called simple or atomic attributes.
The value of a composite attribute is the concatenation of the values of its constituent simple
attributes.
Composite attributes are useful to model situations in which a user sometimes refers to the composite
attribute as a unit but at other times refers specifically to its components. If the composite attribute is
referenced only as a whole, there is no need to subdivide it into component attributes. For example, if
there is no need to refer to the individual components of an address (Zip, Street, and so on), then the
whole address is designated as a simple attribute.
Single-valued Versus Multivalued Attributes
Most attributes have a single value for a particular entity; such attributes are called single-valued. For
example, Age is a single-valued attribute of person. In some cases an attribute can have a set of values
for the same entity—for example, Colors attribute for a car, or a College Degrees attribute for a
person. Cars with one color have a single value, whereas two-tone cars have two values for Colors.
Similarly, one person may not have a college degree, another person may have one, and a third person
may have two or more degrees; so different persons can have different numbers of values for the
College Degrees attribute. Such attributes are called multivalued. A multivalued attribute may have
lower and upper bounds on the number of values allowed for each individual entity. For example, the
Colors attribute of a car may have between one and three values, if we assume that a car can have at
most three colors.
Stored Versus Derived Attributes
In some cases two (or more) attribute values are related, for example the Age and BirthDate
attributes of a person. For a particular person entity, the value of Age can be determined from the
current (today's) date and the value of that person's BirthDate. The Age attribute is hence called a
derived attribute and is said to be derivable from the BirthDate attribute, which is called a stored
attribute. Some attribute values can be derived from related entities; for example, an attribute
NumberOfEmployees of a department entity can be derived by counting the number of employees
related to (working for) that department.
Null Values
In some cases a particular entity may not have an applicable value for an attribute. For example, a
CollegeDegrees attribute applies only to persons with college degrees. For such situations, a special
value called null is created. A person with no college degree would have null for CollegeDegrees.
Null can also be used if we do not know the value of an attribute for a particular entity—for example,
if we do not know the home phone of an employee. The meaning of the former type of null is not
applicable, whereas the meaning of the latter is unknown. The unknown category of null can be further
classified into two cases. The first case arises when it is known that the attribute value exists but is
missing—for example, if the Height attribute of a person is listed as null. The second case arises when
it is not known whether the attribute value exists—for example, if the HomePhone attribute of a person
is null.
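The attribute types described above can be illustrated with a small Python sketch. The class names Address and Person, the field names, and the sample values are assumptions made for this illustration only; they are not part of the ER model or of any DBMS. In this sketch an empty list stands for "no degrees" and None plays the role of null.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Address:
    """Composite attribute: built from simpler parts with their own meaning."""
    city: str
    region: str
    zip_code: str


@dataclass
class Person:
    name: str                        # simple, single-valued attribute
    birth_date: date                 # stored attribute
    address: Address                 # composite attribute
    college_degrees: List[str] = field(default_factory=list)  # multivalued
    home_phone: Optional[str] = None  # None plays the role of null

    def age(self, today: date) -> int:
        """Derived attribute: computed from birth_date, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.birth_date.month, self.birth_date.day)
        return today.year - self.birth_date.year - (0 if had_birthday else 1)


# Hypothetical sample entity.
p = Person("Bereket", date(2000, 6, 15), Address("SomeCity", "SomeRegion", "1000"))
print(p.age(date(2023, 6, 14)))  # 22: the day before the 23rd birthday
print(p.college_degrees)         # []: no degrees recorded
print(p.home_phone)              # None: unknown or not applicable
```

Because age is computed on demand, it can never disagree with the stored birth date, which is the reason derived attributes are usually not stored.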
3. Relationships
The relationships that exist between entities must be taken into account when processing information.
In any business process, one object may be associated with another object due to some event. Such
an association is what we call a RELATIONSHIP between entity objects.
• One external event or process may affect several related entities.
• Related entities require setting of LINKS from one part of the database to another.
• A relationship should be named by a word or phrase which explains its function.
• Role names are different from the names of the entities forming the relationship: one entity may
take on many roles, and the same role may be played by different entities.
• For each RELATIONSHIP, one can talk about the number of entities and the number of tuples
participating in the association. These two concepts are called the DEGREE and the CARDINALITY of
a relationship, respectively.

Degree of Relationship
• An important point about a relationship is how many entities participate in it. The number of
entities participating in a relationship is called the DEGREE of the relationship.
• Among the degrees of relationship, the following are the basic ones:
  ✓ UNARY/RECURSIVE RELATIONSHIP: tuples/records of a single entity are related
  with each other.
  ✓ BINARY RELATIONSHIP: tuples/records of two entities are associated in a
  relationship.
  ✓ TERNARY RELATIONSHIP: tuples/records of three different entities are associated.
  ✓ And a generalized one, N-ARY RELATIONSHIP: tuples from an arbitrary number of
  entity sets participate in a relationship.
Cardinality of Relationship
Another important concept about a relationship is the number of instances/tuples that can be associated
with a single instance from one entity in a single relationship. The number of instances participating or
associated with a single instance from an entity in a relationship is called the CARDINALITY of the
relationship. The major cardinalities of a relationship are:
• ONE-TO-ONE: one tuple is associated with only one other tuple.
  ✓ E.g. Building – Location: a single building is located in a single location, and
  a single location accommodates only a single building.
• ONE-TO-MANY: one tuple can be associated with many other tuples, but not the reverse.
  ✓ E.g. Department – Student: one department can have multiple students.
• MANY-TO-ONE: many tuples are associated with one tuple, but not the reverse.
  ✓ E.g. Employee – Department: many employees belong to a single department.
• MANY-TO-MANY: one tuple is associated with many other tuples and, from the other side,
with a different role name, one tuple is associated with many tuples.
  ✓ E.g. Student – Course: a student can take many courses and a single course can be
  attended by many students.
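As an illustration, the four cardinalities can be recognized from the instance pairs of a binary relationship. The sketch below is only an aid to understanding: the function name cardinality and the sample pairs are hypothetical, and in real design the cardinality is a rule taken from the requirements, which sample data can suggest but never prove.

```python
from collections import defaultdict
from typing import Hashable, Iterable, Tuple


def cardinality(pairs: Iterable[Tuple[Hashable, Hashable]]) -> str:
    """Classify a binary relationship from its (left, right) instance pairs."""
    rights_of = defaultdict(set)  # left instance  -> related right instances
    lefts_of = defaultdict(set)   # right instance -> related left instances
    for left, right in pairs:
        rights_of[left].add(right)
        lefts_of[right].add(left)
    left_has_many = any(len(s) > 1 for s in rights_of.values())
    right_has_many = any(len(s) > 1 for s in lefts_of.values())
    if left_has_many and right_has_many:
        return "MANY-TO-MANY"
    if left_has_many:
        return "ONE-TO-MANY"
    if right_has_many:
        return "MANY-TO-ONE"
    return "ONE-TO-ONE"


# Hypothetical instance pairs for the examples in the text.
print(cardinality([("Building1", "LocationA"), ("Building2", "LocationB")]))
print(cardinality([("InfoSc", "Abebe"), ("InfoSc", "Alem")]))          # Department - Student
print(cardinality([("Abebe", "Sales"), ("Alem", "Sales")]))            # Employee - Department
print(cardinality([("S1", "IS380"), ("S1", "IS416"), ("S2", "IS380")]))  # Student - Course
```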

4. Relational Constraints (Integrity rules)

• Relational Integrity
  ✓ Domain Integrity: no value of an attribute should be beyond the allowable limits.
  ✓ Entity Integrity: in a base relation, no attribute of a primary key can assume a value of
  NULL.
  ✓ Referential Integrity: if a foreign key exists in a relation, either the foreign key value
  must match a candidate key value in its home relation or the foreign key value must
  be NULL.
  ✓ Enterprise Integrity: additional rules specified by the users or database administrators
  of a database are incorporated.
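The integrity rules above can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, not a recommended schema: the tables department and employee, their columns, and the age limits of 18 to 65 are assumptions chosen for the illustration. Note that SQLite checks foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE department (
    dept_id TEXT NOT NULL PRIMARY KEY,             -- entity integrity
    name    TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id  TEXT NOT NULL PRIMARY KEY,             -- entity integrity
    name    TEXT NOT NULL,
    age     INTEGER CHECK (age BETWEEN 18 AND 65), -- domain integrity (assumed limits)
    dept_id TEXT REFERENCES department(dept_id)    -- referential integrity
);
""")
conn.execute("INSERT INTO department VALUES ('D1', 'Sales')")
conn.execute("INSERT INTO employee VALUES ('E1', 'Abebe', 30, 'D1')")  # valid row
conn.execute("INSERT INTO employee VALUES ('E2', 'Almaz', 40, NULL)")  # NULL FK is allowed


def rejected(sql: str) -> bool:
    """Return True if the DBMS refuses the statement with an integrity error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


print(rejected("INSERT INTO employee VALUES ('E3', 'Lemma', 17, 'D1')"))  # age outside domain
print(rejected("INSERT INTO employee VALUES (NULL, 'Chane', 30, 'D1')"))  # NULL primary key
print(rejected("INSERT INTO employee VALUES ('E1', 'Alem', 30, 'D1')"))   # duplicate primary key
print(rejected("INSERT INTO employee VALUES ('E4', 'Belay', 30, 'D9')"))  # no department D9
```

All four statements are refused, while the row with a NULL foreign key is accepted, which matches the referential integrity rule as stated above.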
• Key Constraints
If tuples need to be unique in the database, then we need to make each tuple distinct.
To do this we need relational keys that uniquely identify each tuple in a relation.
  ✓ Super Key: an attribute or set of attributes that uniquely identifies a tuple within a
  relation.
  ✓ Candidate Key: a super key such that no proper subset of its attributes is a super key
  within the relation. If a super key has only one attribute, it is automatically a
  candidate key. If a candidate key consists of more than one attribute, it is called a
  composite key. A candidate key has two properties:
    1. Uniqueness
    2. Irreducibility
  ✓ Primary Key: the candidate key that is selected to identify tuples uniquely within the
  relation. In the worst case, the entire set of attributes in a relation can serve as the
  primary key.
  ✓ Foreign Key: an attribute, or set of attributes, within one relation that matches the
  candidate key of some relation. A foreign key is a link between different relations that is used
  to create a view or an unnamed relation.
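The two properties of a candidate key, uniqueness and irreducibility, can be checked mechanically on sample data. The sketch below uses rows borrowed from the Student-grade table that appears later in this chapter; the function names is_super_key and candidate_keys are hypothetical. Keep in mind that a key is a rule about all possible data, so sample rows can only rule a key out, never confirm it.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, object]


def is_super_key(rows: Sequence[Row], attrs: Sequence[str]) -> bool:
    """Uniqueness: no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows: Sequence[Row]) -> List[Tuple[str, ...]]:
    """Irreducibility: keep only super keys that contain no smaller key."""
    attributes = sorted(rows[0])
    found: List[Tuple[str, ...]] = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            contains_smaller_key = any(set(k) <= set(combo) for k in found)
            if not contains_smaller_key and is_super_key(rows, combo):
                found.append(combo)
    return found


student_grade = [
    {"ST_ID": 100, "Course_code": "IS380", "Grade": "A"},
    {"ST_ID": 100, "Course_code": "IS416", "Grade": "B"},
    {"ST_ID": 200, "Course_code": "IS380", "Grade": "B"},
    {"ST_ID": 200, "Course_code": "IS416", "Grade": "B"},
    {"ST_ID": 200, "Course_code": "IS420", "Grade": "C"},
    {"ST_ID": 300, "Course_code": "IS417", "Grade": "A"},
]
print(is_super_key(student_grade, ["ST_ID"]))                          # False
print(is_super_key(student_grade, ["ST_ID", "Course_code", "Grade"]))  # True, but reducible
print(candidate_keys(student_grade))  # [('Course_code', 'ST_ID')]
```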

Developing an ER Diagram
• Designing the conceptual model for the database is not a linear process but an iterative activity
in which the design is refined again and again.
• To identify the entities, attributes, relationships, and constraints on the data, different
methods are used during the analysis phase. These include information gathered by:
  ✓ Interviewing end users individually and in groups
  ✓ Questionnaire surveys
  ✓ Direct observation
  ✓ Examining different documents
• The basic E-R model is graphically depicted and presented for review.
• The process is repeated until the end users and designers agree that the E-R diagram is a fair
representation of the organization's activities and functions.
• Checking for redundant relationships in the ER diagram: relationships between entities
indicate access from one entity to another. It is therefore possible to access one entity
occurrence from another entity occurrence even if there are other entities and relationships that
separate them. This is often referred to as "navigation" of the ER diagram.
• The last phase in ER modeling is validating the ER model against the requirements of the user.
Graphical Representation in ER Diagramming
• An entity is represented by a RECTANGLE containing the name of the entity.
[Figure: symbols for a strong entity and a weak entity]
• Connected entities are called relationship participants.
• Attributes are represented by OVALS and connected to the entity by a line.
[Figure: symbols for an attribute, a multi-valued attribute, and a composite attribute]
• A derived attribute is indicated by a DOTTED LINE.
[Figure: symbol for a derived attribute]
• Relationships are represented by DIAMOND-shaped symbols.
[Figure: symbols for a strong relationship and a weak relationship]
• A weak relationship is a relationship between a weak and a strong entity.
• A strong relationship is a relationship between two strong entities.
• PRIMARY KEYS are underlined.
[Figure: notation for a primary key (PK) and a foreign key (FK)]

An Entity Relationship Diagram Methodology
1. Identify Entities: identify the roles, events, locations, tangible things or concepts about which
the end-users want to store data.
2. Identify Attributes: name the information details (fields) which are essential to the system under
development.
3. Find Relationships: find the natural associations between pairs of entities using a relationship
matrix.
4. Fill in Cardinality: determine the number of occurrences of one entity for a single occurrence
of the related entity.
5. Define Primary Keys: identify the data attribute(s) that uniquely identify one and only one
occurrence of each entity.
6. Draw Key-Based ERD: eliminate many-to-many relationships and include primary and foreign
keys in each entity.
7. Draw Full ERD: adjust the full ERD.
8. Check Results: does the final Entity Relationship Diagram accurately depict the system
data?
Table: Steps to Develop an Entity Relationship Diagram (ERD)

Converting ER Diagram to Relational Tables

Four basic rules are used to convert an ER diagram into tables or relations:
1. Entity type rule:
• Each entity type becomes a table. The PK of the entity type (if not weak) becomes the PK of the
table. The attributes of the entity type become columns in the table. This rule should be applied
first, before the relationship rules.
2. For a relationship with one-to-one cardinality:
• All the attributes can be merged into a single table. Alternatively, one can post the primary key or
candidate key of one of the relations to the other as a foreign key.
3. For a relationship with one-to-many cardinality:
• Post the primary key or candidate key from the "one" side as a foreign key attribute to the
"many" side. E.g., for a relationship called "Belongs To" between Employee (many) and
Department (one), the primary key of Department is posted to Employee as a foreign key.
4. For a relationship with many-to-many cardinality:
• Create a new table (the associative entity) and post the primary key or candidate
key from each entity as attributes in the new table, along with some additional attributes
(if applicable).
After converting the ER diagram into tables, the next phase is applying the process of
normalization, which is a collection of rules each table should satisfy.
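The conversion rules can be sketched as SQL table definitions, run here through Python's sqlite3 module. The table and column names below (department, employee, student, course, enrollment) are hypothetical choices for the Employee/Department and Student/Course examples used in this chapter, and the one-to-one rule is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Rule 1: each entity type becomes a table with its own primary key.
CREATE TABLE department (dept_id     TEXT PRIMARY KEY, dept_name   TEXT);
CREATE TABLE student    (st_id       TEXT PRIMARY KEY, s_name      TEXT);
CREATE TABLE course     (course_code TEXT PRIMARY KEY, course_desc TEXT);

-- Rule 3 (one-to-many, "Belongs To"): the key of the "one" side
-- (department) is posted to the "many" side (employee) as a foreign key.
CREATE TABLE employee (
    emp_id  TEXT PRIMARY KEY,
    e_name  TEXT,
    dept_id TEXT REFERENCES department(dept_id)
);

-- Rule 4 (many-to-many): a new associative table holds the keys of both
-- entities plus an attribute of the relationship itself (grade).
CREATE TABLE enrollment (
    st_id       TEXT REFERENCES student(st_id),
    course_code TEXT REFERENCES course(course_code),
    grade       TEXT,
    PRIMARY KEY (st_id, course_code)
);
""")
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)  # ['course', 'department', 'employee', 'enrollment', 'student']
```

The many-to-many relationship ends up as two one-to-many relationships, one from student to enrollment and one from course to enrollment.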

Logical Database Design


Logical design is the process of constructing a model of the information used in an enterprise based
on a specific data model, but independent of a particular DBMS and other physical considerations.
• Normalization process
  ✓ Collection of rules to be maintained
  ✓ Discover new entities in the process
  ✓ Revise attributes based on the rules and the discovered entities
The first step, before applying the rules of the relational data model, is converting the conceptual
design to a form suitable for the relational logical model, which is a set of tables.
Normalization
A relational database is merely a collection of data, organized in a particular manner. As the father of
the relational database approach, Codd created a series of rules called normal forms that help define
that organization.
One of the best ways to determine what information should be stored in a database is to clarify what
questions will be asked of it and what data would be included in the answers.
Database normalization is a series of steps followed to obtain a database design that allows for
consistent storage and efficient access of data in a relational database. These steps reduce data
redundancy and the risk of data becoming inconsistent.
Normalization is the process of identifying the logical associations between data items and designing a
database that represents such associations without suffering from the update anomalies, which are:
• Insertion anomalies
• Deletion anomalies
• Modification anomalies
Normalization may reduce system performance, since data will be cross-referenced from many tables.
Thus denormalization is sometimes used to improve performance, at the cost of reduced consistency
guarantees.
A normalization is considered good if it is a lossless decomposition.
All the normalization rules eventually remove the update anomalies that may exist during data
manipulation after the implementation. The types of problems that can occur in an insufficiently
normalized table are called update anomalies, and include:
1. Insertion anomalies
An "insertion anomaly" is a failure to place information about a new database entry into all the places
in the database where information about that new entry needs to be stored. In a properly normalized
database, information about a new entry needs to be inserted into only one place in the database; in an
inadequately normalized database, information about a new entry may need to be inserted into more
than one place and, human fallibility being what it is, some of the needed additional insertions may be
missed.
2. Deletion anomalies
A "deletion anomaly" is a failure to remove information about an existing database entry when it is
time to remove that entry. In a properly normalized database, information about an old, to-be-gotten-
rid-of entry needs to be deleted from only one place in the database; in an inadequately normalized
database, information about that old entry may need to be deleted from more than one place, and,
human fallibility being what it is, some of the needed additional deletions may be missed.
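3. Modification anomalies
A modification of a database involves changing some value of an attribute of a table. In a properly
normalized database, whatever information is modified by the user needs to be changed in only one
place, and the change takes effect everywhere the value is used. In an inadequately normalized
database, the same value may be stored in more than one place, and some of the needed additional
changes may be missed, leaving the data inconsistent.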
Functional Dependency (FD)
Before moving to the definition and application of normalization, it is important to have an
understanding of "functional dependency."
The logical associations between data items that point the database designer in the direction of a good
database design are referred to as determinant or dependent relationships.
Two data items A and B are said to be in a determinant or dependent relationship if certain values of
data item B always appear with certain values of data item A. If data item A is the determinant
and B the dependent data item, then the direction of the association is from A to B and not
vice versa.
The essence of this idea is that if the existence of something, call it A, implies that B must exist and
have a certain value, then we say that "B is functionally dependent on A." We also often express
this idea by saying that "A determines B," or that "B is a function of A," or that "A functionally
governs B." Often, the notions of functionality and functional dependency are expressed briefly by the
statement, "If A, then B." It is important to note that the value of B must be unique for a given value
of A, i.e., any given value of A must imply one and only one value of B, in order for the relationship
to qualify for the name "function." (However, this does not necessarily prevent different values of A
from implying the same value of B.)
X → Y holds if, whenever two tuples have the same value for X, they must have the same value for Y.

The notation is A → B, which is read as: B is functionally dependent on A.


In general, a functional dependency is a relationship among attributes. In relational databases, we can
have a determinant that governs one other attribute or several other attributes. FDs are derived from
the real-world constraints on the attributes.
EXAMPLE
Dinner     Type of Wine
Meat       Red
Fish       White
Cheese     Rose

Since the type of Wine served depends on the type of Dinner, we say Wine is functionally dependent
on Dinner.
Dinner → Wine

Dinner     Type of Wine     Type of Fork
Meat       Red              Meat fork
Fish       White            Fish fork
Cheese     Rose             Cheese fork

Since both Wine type and Fork type are determined by the Dinner type, we say Wine is functionally
dependent on Dinner and Fork is functionally dependent on Dinner.
Dinner → Wine
Dinner → Fork
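The definition "X → Y holds if, whenever two tuples have the same value for X, they have the same value for Y" translates directly into a short check. The sketch below applies it to the Dinner table; the function name fd_holds and the extra conflicting row are hypothetical. A check on sample rows can disprove a dependency, but a dependency is really a rule derived from real-world constraints.

```python
from typing import Dict, List, Sequence

Row = Dict[str, str]


def fd_holds(rows: Sequence[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Return True if lhs -> rhs holds in rows (same lhs value implies same rhs value)."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for x; any later mismatch breaks the FD.
        if seen.setdefault(x, y) != y:
            return False
    return True


dinners: List[Row] = [
    {"Dinner": "Meat", "Wine": "Red", "Fork": "Meat fork"},
    {"Dinner": "Fish", "Wine": "White", "Fork": "Fish fork"},
    {"Dinner": "Cheese", "Wine": "Rose", "Fork": "Cheese fork"},
]
print(fd_holds(dinners, ["Dinner"], ["Wine"]))          # True: Dinner -> Wine
print(fd_holds(dinners, ["Dinner"], ["Wine", "Fork"]))  # True: Dinner -> Wine and Fork

# Hypothetical extra row: Meat served with a second wine breaks Dinner -> Wine.
conflicting = dinners + [{"Dinner": "Meat", "Wine": "White", "Fork": "Meat fork"}]
print(fd_holds(conflicting, ["Dinner"], ["Wine"]))      # False
print(fd_holds(conflicting, ["Dinner"], ["Fork"]))      # True
```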
Three Types of Functional Dependency
1. Partial Dependency
If an attribute which is not a member of the primary key is dependent on some part of the primary key
(when we have a composite primary key), then that attribute is partially functionally dependent on the
primary key.
Let {A, B} be the primary key and C a non-key attribute.
If {A, B} → C and either B → C or A → C,
then C is partially functionally dependent on {A, B}.
2. Full Dependency
If an attribute which is not a member of the primary key is dependent not on some part of the primary
key but on the whole key (when we have a composite primary key), then that attribute is fully
functionally dependent on the primary key.
Let {A, B} be the primary key and C a non-key attribute.
If {A, B} → C, and neither B → C nor A → C holds (that is, A alone cannot determine C and B alone
cannot determine C), then C is fully functionally dependent on {A, B}.
3. Transitive Dependency
In mathematics and logic, a transitive relationship is a relationship of the following form: "If A implies
B, and if also B implies C, then A implies C."
Example:
If Mr X is a Human, and if every Human is an Animal, then Mr X must be an Animal.
A generalized way of describing transitive dependency is:
If A functionally governs B, AND
if B functionally governs C,
THEN A functionally governs C,
provided that neither C nor B determines A, i.e. (B ↛ A and C ↛ A).
In the normal notation:
{(A → B) AND (B → C)} ⟹ A → C, provided that B ↛ A and C ↛ A
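The three kinds of dependency can be told apart with a few lines of Python. This is an illustrative sketch only: the helper names fd_holds, dependency_on_key and is_transitive are hypothetical, and the sample rows are borrowed from the STUDENT_COURSE and STUDENT examples used later in this chapter.

```python
from itertools import combinations
from typing import Dict, Sequence

Row = Dict[str, object]


def fd_holds(rows: Sequence[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """True if rows never pair one lhs value with two different rhs values."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True


def dependency_on_key(rows: Sequence[Row], key: Sequence[str], attr: str) -> str:
    """Classify attr as 'full', 'partial' or 'none' with respect to a composite key."""
    if not fd_holds(rows, key, [attr]):
        return "none"
    for size in range(1, len(key)):            # every proper, non-empty part of the key
        for part in combinations(key, size):
            if fd_holds(rows, part, [attr]):
                return "partial"
    return "full"


def is_transitive(rows: Sequence[Row], a: str, b: str, c: str) -> bool:
    """A -> B and B -> C hold, while neither B -> A nor C -> A holds."""
    return (fd_holds(rows, [a], [b]) and fd_holds(rows, [b], [c])
            and not fd_holds(rows, [b], [a]) and not fd_holds(rows, [c], [a]))


enrolment = [
    {"ST_ID": 100, "Course_code": "IS380", "S_Name": "John", "Grade": "A"},
    {"ST_ID": 100, "Course_code": "IS416", "S_Name": "John", "Grade": "B"},
    {"ST_ID": 200, "Course_code": "IS380", "S_Name": "Smith", "Grade": "B"},
    {"ST_ID": 200, "Course_code": "IS416", "S_Name": "Smith", "Grade": "B"},
    {"ST_ID": 200, "Course_code": "IS420", "S_Name": "Smith", "Grade": "C"},
    {"ST_ID": 300, "Course_code": "IS417", "S_Name": "Tome", "Grade": "A"},
]
key = ["ST_ID", "Course_code"]
print(dependency_on_key(enrolment, key, "S_Name"))  # partial: ST_ID alone determines it
print(dependency_on_key(enrolment, key, "Grade"))   # full: needs the whole key

students = [
    {"StudID": "125/97", "Year": 1, "Dormitory": 401},
    {"StudID": "654/95", "Year": 3, "Dormitory": 403},
    {"StudID": "842/95", "Year": 3, "Dormitory": 403},
    {"StudID": "165/97", "Year": 1, "Dormitory": 401},
]
print(is_transitive(students, "StudID", "Year", "Dormitory"))  # True
```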
Steps of Normalization:
We have various levels or steps in normalization called Normal Forms. The level of complexity,
the strength of the rules and the degree of decomposition increase as we move from a lower-level
Normal Form to a higher one.
A table in a relational database is said to be in a certain normal form if it satisfies certain constraints.
First Normal Form (1NF)
1NF requires that all column values in a table are atomic (e.g., a number is an atomic value, while a
list or a set is not). We have two ways of achieving this:
• Putting each repeating group into a separate table and connecting them with a primary key-foreign
key relationship.
• Moving these repeating groups to new rows by repeating the common attributes.
Definition: a table (relation) is in 1NF if
• there are no duplicated rows in the table (each row has a unique identifier),
• each cell is single-valued (i.e., there are no repeating groups), and
• entries in a column (attribute, field) are of the same kind.
Example
UNNORMALIZED STUDENT_COURSE relation (table)

ST_ID  S_Name  Phone     Courses taken
                         Course_code  Course_desc  Credit_H  Grade
100    John    4872454   IS380        DB concept   3         A
                         IS416        Unix OS      3         B
200    Smith   6718120   IS380        DB concept   3         B
                         IS416        Unix OS      3         B
                         IS420        Data NW      3         C
300    Tome    8712356   IS417        SA           3         A

This relation is not in any normal form, because a normal-form relation must have only simple values
in each row-column intersection. A simple value is a single value that does not have further
components. The values in the Courses taken attribute are made up of one or more rows of another
relation, so they are not simple values. To make the relation into a first normal form relation, we
must make each value in each column a simple value. To do this, we take each row in the
non-normalised relation and look at the nested relation in its non-simple column. For each row of that
nested relation, we make a row in a new relation.
STUDENT_COURSE relation
ST_ID  S_Name  Phone     Course-code  Course-description      Credit-hours  Grade
100    John    487 2454  IS380        Database Concepts       3             A
100    John    487 2454  IS416        Unix Operating System   3             B
200    Smith   671 8120  IS380        Database Concepts       3             B
200    Smith   671 8120  IS416        Unix Operating System   3             B
200    Smith   671 8120  IS420        Data Net Work           3             C
300    Tome    871 2356  IS417        System Analysis         3             A

STUDENT_COURSE table with sample data, in First Normal Form (1NF)
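The second way of reaching 1NF, moving each repeating group to its own row while repeating the common attributes, can be sketched in Python as follows. The nested dictionary layout and the function name to_1nf are assumptions made for the illustration; the data is that of the STUDENT_COURSE example.

```python
from typing import Dict, List

# Unnormalized form: each student record holds a nested list of courses.
unnormalized = [
    {"ST_ID": 100, "S_Name": "John", "Phone": "487 2454",
     "Courses": [("IS380", "Database Concepts", 3, "A"),
                 ("IS416", "Unix Operating System", 3, "B")]},
    {"ST_ID": 200, "S_Name": "Smith", "Phone": "671 8120",
     "Courses": [("IS380", "Database Concepts", 3, "B"),
                 ("IS416", "Unix Operating System", 3, "B"),
                 ("IS420", "Data Net Work", 3, "C")]},
    {"ST_ID": 300, "S_Name": "Tome", "Phone": "871 2356",
     "Courses": [("IS417", "System Analysis", 3, "A")]},
]


def to_1nf(records: List[Dict]) -> List[Dict]:
    """Emit one flat row per (student, course) pair, repeating the student attributes."""
    rows = []
    for rec in records:
        for code, desc, credit, grade in rec["Courses"]:
            rows.append({
                "ST_ID": rec["ST_ID"], "S_Name": rec["S_Name"], "Phone": rec["Phone"],
                "Course_code": code, "Course_description": desc,
                "Credit_hours": credit, "Grade": grade,
            })
    return rows


student_course = to_1nf(unnormalized)
print(len(student_course))  # 6 rows, one per course taken
for row in student_course:
    print(row["ST_ID"], row["S_Name"], row["Course_code"], row["Grade"])
```

Every cell of the result holds a single value, and each row is identified by the combination of ST_ID and Course_code.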


In the STUDENT_COURSE relation, each row is unique for the combination of ST_ID
and Course_code. Following is a detailed description of the anomalies that the relation Student-courses
suffers from.
1. Insertion anomaly: we cannot add a new course such as IS247, with course description
Programming Techniques, to the database unless we add a student who takes the course.
2. Update anomaly: if we change the course description for IS380 from Database Concepts to
New_Database_Concepts, we have to make the change in more than one place or else the database
will be inconsistent. In other words, in some places the course description will be
New_Database_Concepts, and in any place where we forgot to make the change the description will
still be Database_Concepts.
3. Deletion anomaly: if student Tome is deleted from the database, we also lose the information that
we had on course IS417 with description System_Analysis.
• The above discussion indicates that having a single table Student-courses for our database
causes problems (anomalies). Therefore we break the table into smaller tables to get relations in a
higher normal form.
Second Normal Form (2NF)
A relation is in second normal form (2NF) if every non-primary-key attribute is functionally
dependent on the whole primary key. Thus no non-primary-key attribute is functionally dependent on
part, but not all, of the primary key.
Definition: a relation is in 2NF if
• it is in 1NF, and
• all non-key attributes are dependent on the entire primary key, i.e. there is no partial dependency.
Note that primary attributes are those attributes which are parts of the primary key, and non-primary
attributes do not participate in the primary key. In the Student-courses relation both ST_ID and
Course_code are primary attributes because they are components of the primary key. However, the
attributes S_Name, Phone, Course-description, Credit-hours and Grade are all non-primary attributes
because none of them is a component of the primary key.
To convert the Student-courses relation to second normal form we have to make all non-primary
attributes fully functionally dependent on the primary key. To do that we need to project the
Student-courses table (that is, break it down) into two or more tables. However, projections may
cause problems. To avoid such problems it is important, when a relation is projected into smaller
relations, to keep attributes which are dependent on each other in the same table. Following
this principle we should divide the Student-courses relation into the following three relations:
PROJECT Student-courses ON (ST_ID, S_Name, Phone) creates a table; call it Student. The
relation Student will be Student (ST_ID:pk, S_Name, Phone), and
PROJECT Student-courses ON (ST_ID, Course_code, Grade) creates a table; call it Student-grade.
The relation Student-grade will be:
Student-grade (ST_ID:pk1:fk:Student, Course_code:pk2:fk:Courses, Grade), and
PROJECT Student-courses ON (Course_code, Course-description, Credit-hours) creates a table; call it
Courses. Following are these three relations and their contents:
Student (ST_ID:pk, S_Name, Phone)

ST_ID  S_Name  Phone
100    John    487 2454
200    Smith   671 8120
300    Tome    871 2356

Courses (Course-code:pk, Course-description, Credit-hours)

Course-code  Course-description      Credit-hours
IS380        Database Concepts       3
IS416        Unix Operating System   3
IS420        Data Net Work           3
IS417        System Analysis         3

Student-grade (ST_ID:pk1:fk:Student, Course-code:pk2:fk:Courses, Grade)

ST_ID  Course-code  Grade
100    IS380        A
100    IS416        B
200    IS380        B
200    IS416        B
200    IS420        C
300    IS417        A

All three of these relations are in second normal form. Examination of these relations shows that we
have eliminated the redundancy in the database. Now the relation Student contains information related
only to the entity student, the relation Courses contains information related only to the entity course,
and the relation Student-grade contains information related to the relationship between these two
entities. Further, these three relations are free from the anomalies described above. Let us clarify
this in more detail.
Insertion anomaly: a new course with course code IS247 and its course description can now be
inserted into the table Courses. Equally, we can add any new student to the database by adding their
ID, name and phone to the Student table. Therefore our database, which is made up of these three
tables, does not suffer from the insertion anomaly.
Update anomaly: since the redundancy of the data was eliminated, no update anomaly can occur. To
change the course description for IS380, only one change is needed, in the table Courses.
Deletion anomaly: the deletion of student Tome from the database is achieved by deleting Tome's
records from both the Student and Student-grade relations, and this does not have any side effect
because the course IS417 remains untouched in the table Courses.
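The PROJECT operations used for the 2NF decomposition can be imitated in Python to see the effect on the sample data. The function name project and the tuple layout are assumptions for this sketch; the rows are those of the 1NF STUDENT_COURSE table.

```python
from typing import List, Sequence, Tuple

COLUMNS = ("ST_ID", "S_Name", "Phone", "Course_code", "Course_description",
           "Credit_hours", "Grade")
student_courses = [
    (100, "John", "487 2454", "IS380", "Database Concepts", 3, "A"),
    (100, "John", "487 2454", "IS416", "Unix Operating System", 3, "B"),
    (200, "Smith", "671 8120", "IS380", "Database Concepts", 3, "B"),
    (200, "Smith", "671 8120", "IS416", "Unix Operating System", 3, "B"),
    (200, "Smith", "671 8120", "IS420", "Data Net Work", 3, "C"),
    (300, "Tome", "871 2356", "IS417", "System Analysis", 3, "A"),
]


def project(rows: Sequence[Tuple], attrs: Sequence[str]) -> List[Tuple]:
    """Relational projection: keep only attrs and drop duplicate rows."""
    positions = [COLUMNS.index(a) for a in attrs]
    result: List[Tuple] = []
    for row in rows:
        projected = tuple(row[p] for p in positions)
        if projected not in result:
            result.append(projected)
    return result


student = project(student_courses, ["ST_ID", "S_Name", "Phone"])
courses = project(student_courses, ["Course_code", "Course_description", "Credit_hours"])
student_grade = project(student_courses, ["ST_ID", "Course_code", "Grade"])
print(len(student), len(courses), len(student_grade))  # 3 4 6

# Deleting student Tome (ST_ID 300) no longer loses course IS417.
student = [s for s in student if s[0] != 300]
student_grade = [g for g in student_grade if g[0] != 300]
print(any(c[0] == "IS417" for c in courses))  # True
```

Each student and each course description is now stored exactly once, which is why the update and deletion anomalies described above disappear.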
Third Normal Form (3NF)
A relation is in third normal form if no non-primary attribute (that is, an attribute that is not part
of the primary key) is transitively dependent on the primary key.
Definition: a table (relation) is in 3NF if
• it is in 2NF, and
• there are no transitive dependencies between the primary key and non-primary-key attributes, i.e.
no non-primary-key attribute is functionally dependent on another non-primary-key attribute.
Example
Assumption: students of the same batch (same year) live in one building or dormitory.
STUDENT relation
StudID  Stud_F_Name  Stud_L_Name  Dept    Year  Dormitory
125/97  Abebe        Mekuria      InfoSc  1     401
654/95  Lemma        Alemu        Geog    3     403
842/95  Chane        Kebede       CompSc  3     403
165/97  Alem         Kebede       InfoSc  1     401
985/95  Almaz        Belay        Geog    3     403
This relation is in 2NF but not in 3NF, because a non-key attribute is functionally dependent on
another non-key attribute.
• Let's take StudID, Year and Dormitory and see the dependencies:
StudID → Year AND Year → Dormitory,
and Year cannot determine StudID and Dormitory cannot determine StudID.
Then, transitively, StudID → Dormitory.
• To convert it to 3NF we need to remove all transitive dependencies of non-key attributes on
other non-key attributes. The non-primary-key attributes that depend on each other are
moved to another table and linked with the main table using a primary key-foreign key
relationship.
STUDENT table
StudID  Stud_F_Name  Stud_L_Name  Dept    Year
125/97  Abebe        Mekuria      InfoSc  1
654/95  Lemma        Alemu        Geog    3
842/95  Chane        Kebede       CompSc  3
165/97  Alem         Kebede       InfoSc  1
985/95  Almaz        Belay        Geog    3

DORM table
Year  Dormitory
1     401
3     403
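The 3NF decomposition above can be checked to be lossless: joining the STUDENT and DORM tables on Year must give back exactly the original relation. The following Python sketch performs that check on the sample data; the variable names are hypothetical.

```python
# Original STUDENT relation: (StudID, F_Name, L_Name, Dept, Year, Dormitory).
student_full = [
    ("125/97", "Abebe", "Mekuria", "InfoSc", 1, 401),
    ("654/95", "Lemma", "Alemu", "Geog", 3, 403),
    ("842/95", "Chane", "Kebede", "CompSc", 3, 403),
    ("165/97", "Alem", "Kebede", "InfoSc", 1, 401),
    ("985/95", "Almaz", "Belay", "Geog", 3, 403),
]

# Decomposition: Dormitory moves to its own table, keyed by Year.
student = sorted({row[:5] for row in student_full})        # drops Dormitory
dorm = sorted({(row[4], row[5]) for row in student_full})  # (Year, Dormitory)
print(dorm)  # [(1, 401), (3, 403)]

# Natural join on Year (primary key of DORM, foreign key in STUDENT).
rejoined = sorted(s + (dormitory,)
                  for s in student
                  for year, dormitory in dorm
                  if s[4] == year)
print(rejoined == sorted(student_full))  # True: nothing lost, nothing invented
```

The dormitory of each year is now stored once instead of once per student, so changing the dormitory of a batch needs a single update.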
Generally, even though there are four additional levels of normalization, a table is said to be
normalized if it reaches 3NF. A database with all tables in 3NF is said to be a normalized database.

