Chapter 7
The Relational
Database Model
There are four types of database models: relational, hierarchical, network and
object-oriented. The relational database model is by far the most widely used data
model today (although the object-oriented model has also been gaining popularity in
recent years). The relational model was first introduced in the early 1970s by E.F. Codd. It
was the result of an attempt to produce a precise, formal mathematical representation
of data.
The term relational model (also called relational data model or relational database
model) refers to the arrangement of data as a set of tables, while the term relational
database management system (RDBMS) refers to the software that manages those
tables. What exactly is a relational model? This chapter explains the basic concepts
of the relational data model.
7.1 The Relational Model
The relational database model represents data in the form of tables (or relations), and
a mathematical language called relational algebra is used to manipulate the tables.
The relational data model has the following characteristics:
1. Data structure. Data are stored in the form of tables or relations.
2. Data manipulation. Powerful operations are used to manipulate the data stored
in the relations.
3. Data integrity. Facilities are included to specify business rules that maintain the
integrity of data when they are manipulated.
7.2 Relations
A relation is a named, two-dimensional table of data. It consists of a set of named
columns and an arbitrary number of unnamed rows. Each column (e.g., student
name) in the relation corresponds to an attribute of that relation. Each row
corresponds to a record that contains data values for an entity (e.g., student).
The table below shows an example of a relation named Student. The relation
contains the following attributes describing students: StudId, StudName,
Age, Sex and Address. There are seven rows in the table corresponding to seven
students. Adding or deleting rows changes the contents of the relation but not its
structure.
Student
StudId | StudName | Age | Sex | Address
-------+----------+-----+-----+-------------------------------------
444    | Marina   | 25  | F   | 44 Lake Avenue, 60000 Kuala Lumpur
222    | Anthony  | 27  | M   | 22 Forest View, 47000 Petaling Jaya
555    | Zainal   | 26  | M   | 55 Beach Road, 33000 Penang
111    | Mei Ling | 25  | F   | 11 Orchard Road, 22000 Johor
333    | Ozem     | 30  | M   | 33 Mountain View, 72000 Langkawi
777    | Shoba    | 24  | F   | 77 Ocean View, 45000 Kuantan
666    | Moti Lal | 28  | M   | 66 Garden Road, 50000 Kuala Lumpur
You can represent the structure of the above relation using the following shorthand
notation:
Student (StudId, StudName, Age, Sex, Address)
         ------
Here Student is the relation name and the names in parentheses are the field names.
Note that the key field (StudId) is underlined in the shorthand
notation. (Key fields are used for linking tables and searching information in a
database.)
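The shorthand notation maps quite directly onto code. The following is a minimal sketch, assuming Python: the class name `Student` and the variable `student_relation` are illustrative choices, and only a few rows of the table are used. It treats the attribute list as the fixed structure and the rows as a set.

```python
from typing import NamedTuple


class Student(NamedTuple):
    """One row of the Student relation; the field names are the attributes."""
    StudId: str
    StudName: str
    Age: int
    Sex: str
    Address: str


# A relation is modelled here as a set of rows: no duplicates, no row order.
student_relation = {
    Student("444", "Marina", 25, "F", "44 Lake Avenue, 60000 Kuala Lumpur"),
    Student("222", "Anthony", 27, "M", "22 Forest View, 47000 Petaling Jaya"),
    Student("555", "Zainal", 26, "M", "55 Beach Road, 33000 Penang"),
}

# Adding a row changes the contents, not the structure (the attribute list).
student_relation.add(Student("111", "Mei Ling", 25, "F", "11 Orchard Road, 22000 Johor"))

# Adding the identical row again has no effect, because a set holds unique rows.
student_relation.add(Student("111", "Mei Ling", 25, "F", "11 Orchard Road, 22000 Johor"))

print(Student._fields)
print(len(student_relation))
```

Using a set rather than a list is a deliberate choice in this sketch: it mirrors the idea that a relation has no inherent row order and no repeated rows.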
7.3 Properties of Relations
We have defined relations as two-dimensional tables; however, not all tables are
relations. Relations have certain properties and only those tables that have these
properties are relations. A relation has the following properties:
1. Entries in columns are atomic. There is only one entry at the intersection of a
row and a column. That means there can be no multi-valued attributes or
repeating groups in a relation. The above table meets this definition.
2. Entries in columns are from the same domain. The set of values that can
appear in a column or field is called its domain. For example, the domains for
the above table may be defined as follows:
Field      Domain
StudId     Three digits in the range 000 to 999.
StudName   A string of alphabetic characters (including blanks) up to 20
           characters long.
Age        An integer in the range 15 to 35.
Sex        The characters F, f, M or m.
Address    A string of characters up to 30 characters long.
The domain of an attribute is the set of all possible values for that attribute. It is
important to know the domain of an attribute in order to determine whether a
given data item is valid for that attribute. For example, a value of
Sex must be either F (f) or M (m). Anything else is invalid.
3. Each row is unique. No two rows in a relation are identical. The above table
meets this requirement. This property assures that each row in the table is
meaningful and that a user can easily locate the desired data. The primary key of
a relation guarantees the uniqueness of its rows. The key can be a single field or
a combination of fields. The primary key values cannot be null (empty), as they
would then not identify a row. This is known as the entity integrity constraint.
4. The sequence of columns is insignificant. The columns of a relation can be
interchanged without changing the meaning or use of the relation. That means
they can be arranged and stored in any sequence. The columns, however, must be
referenced by name and not by position within the table (since position is not
significant).
5. The sequence of rows is insignificant. As with columns, the rows of a relation
can be interchanged. That means you can insert a new record anywhere in the
table: at the beginning, end or middle.
6. If a value in one relation refers to a row in another relation, then that row must
exist. This is known as the referential integrity constraint.
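Several of these properties can be enforced by the database software itself. The sketch below is one possible illustration, assuming Python's built-in sqlite3 module and an in-memory database. The table definitions are simplified, the Registration table and the helper `rejected` are hypothetical names introduced only for this example, and the inserted values other than Marina's are made up to trigger each rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("""
    CREATE TABLE Student (
        StudId   TEXT NOT NULL PRIMARY KEY,              -- entity integrity
        StudName TEXT NOT NULL CHECK (length(StudName) <= 20),
        Age      INTEGER CHECK (Age BETWEEN 15 AND 35),  -- domain of Age
        Sex      TEXT CHECK (Sex IN ('F', 'f', 'M', 'm')) -- domain of Sex
    )
""")
conn.execute("""
    CREATE TABLE Registration (
        StudId   TEXT NOT NULL REFERENCES Student (StudId),  -- referential integrity
        CourseId TEXT NOT NULL,
        PRIMARY KEY (StudId, CourseId)
    )
""")


def rejected(sql, params):
    """Return True if the database refuses the statement as a constraint violation."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


conn.execute("INSERT INTO Student VALUES (?, ?, ?, ?)", ("444", "Marina", 25, "F"))

# Property 3: a second row with the same key, or with no key at all, is refused.
duplicate_key = rejected("INSERT INTO Student VALUES (?, ?, ?, ?)", ("444", "Other", 20, "M"))
null_key = rejected("INSERT INTO Student VALUES (?, ?, ?, ?)", (None, "NoKey", 20, "M"))

# Property 2: 'X' is outside the domain of Sex.
bad_domain = rejected("INSERT INTO Student VALUES (?, ?, ?, ?)", ("555", "Zainal", 26, "X"))

# Property 6: student 999 does not exist, so the reference is refused.
dangling_ref = rejected("INSERT INTO Registration VALUES (?, ?)", ("999", "IS300"))
valid_ref = not rejected("INSERT INTO Registration VALUES (?, ?)", ("444", "IS300"))

print(duplicate_key, null_key, bad_domain, dangling_ref, valid_ref)
```

The key column is declared NOT NULL explicitly because SQLite, unlike most RDBMSs, does not imply it for a non-integer primary key.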
7.4 Well-Structured Relations
What is a well-structured relation? Simply put, a well-structured relation contains a
minimum number of redundancies and allows users to insert, modify and delete the
rows in a table without errors or inconsistencies. The above table is a well-structured
relation. Each record describes one student. Any modification is confined to one row
of the table.
Redundancies in a table may result in errors or inconsistencies when a user attempts
to update the data in the table. These errors are called anomalies. There are three
types of anomalies: insertion, deletion and modification anomalies.
To illustrate these anomalies, let's use the following table (which is not a
well-structured relation):
StudId | StudName | Tel  | Major | CourseId | CourseTitle | TeacherName | TeacherRoom | Grade
-------+----------+------+-------+----------+-------------+-------------+-------------+------
111    | Marina   | 1234 | BU    | BU100    | Bus Org     | Wong        | A1          | B
111    | Marina   | 1234 | BU    | BU200    | Econs       | Aziz        | A2          | A
111    | Marina   | 1234 | BU    | IS300    | Database    | Shoba       | B3          | C
222    | Anthony  | 2345 | IS    | IS200    | Info Sys    | Allen       | B4          | B
222    | Anthony  | 2345 | IS    | IS300    | Database    | Shoba       | B3          | A
Insertion anomaly. Suppose that we want to insert a new course. We cannot insert it
into the table until a student has registered for that course.
Deletion anomaly. Suppose that student number 222 is deleted from the table.
All information about the course IS200 is then lost, because no other row records it.
Modification anomaly. Suppose that student number 222 changes his telephone
number. We must record this change in several rows of the table.
The above anomalies indicate that the table given above is not a well-structured
relation. The problem with this table is that it contains data on more than one entity,
e.g., Student and Course.
The solution to these anomalies is normalization.
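The deletion and modification anomalies can be reproduced on the five rows of the table. The following is a rough sketch in Python: the rows are held as plain tuples, the replacement telephone number 9999 is invented for the example, and the helper name `course_ids` is an assumption of this sketch.

```python
# Columns: StudId, StudName, Tel, Major, CourseId, CourseTitle,
#          TeacherName, TeacherRoom, Grade
rows = [
    ("111", "Marina", "1234", "BU", "BU100", "Bus Org", "Wong", "A1", "B"),
    ("111", "Marina", "1234", "BU", "BU200", "Econs", "Aziz", "A2", "A"),
    ("111", "Marina", "1234", "BU", "IS300", "Database", "Shoba", "B3", "C"),
    ("222", "Anthony", "2345", "IS", "IS200", "Info Sys", "Allen", "B4", "B"),
    ("222", "Anthony", "2345", "IS", "IS300", "Database", "Shoba", "B3", "A"),
]
STUD_ID, TEL, COURSE_ID = 0, 2, 4  # column positions used below


def course_ids(table):
    """Return the set of courses that the table still knows about."""
    return {row[COURSE_ID] for row in table}


# Deletion anomaly: removing student 222 also removes every trace of IS200.
remaining = [row for row in rows if row[STUD_ID] != "222"]
lost_courses = course_ids(rows) - course_ids(remaining)

# Modification anomaly: one change of telephone number touches several rows.
updated = [
    row[:TEL] + ("9999",) + row[TEL + 1:] if row[STUD_ID] == "222" else row
    for row in rows
]
rows_changed = sum(1 for old, new in zip(rows, updated) if old != new)

print(lost_courses)
print(rows_changed)
```

The insertion anomaly has no counterpart in this sketch because a plain list accepts any row; it appears once a key such as (StudId, CourseId) is required, since a new course has no StudId to supply.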
7.5 Normalization
Normalization is a process of converting complex data structures into simpler, stable
data structures. It is based on the analysis of functional dependence (discussed
below). The process of normalization is often accomplished in stages, each of which
corresponds to a normal form. The dependencies are removed successively by
applying a few simple rules.
The steps in normalization are as follows:
1. Remove repeating groups. This produces a single value at the intersection of each
row and column of the table (explained in the example below). The resulting
table is in First normal form (1NF).
2. Remove partial functional dependencies. This results in Second normal form
(2NF).
3. Remove transitive dependencies. This results in Third normal form (3NF).
4. Remove remaining anomalies resulting from functional dependencies. This
results in Boyce-Codd normal form.
5. Remove multi-valued dependencies. This results in Fourth normal form (4NF).
6. Remove remaining anomalies. This results in Fifth normal form (5NF).
A functional dependency is a particular relationship between two attributes. For any
relation R, attribute B is functionally dependent on attribute A if, for every valid
occurrence of A, that value of A uniquely determines the value of B. The functional
dependence of B on A is represented using the arrow symbol (→) as follows: A → B.
An attribute may also be functionally dependent on two or more attributes.
If A → B and B → C, then A → C. That is, if B depends on A and C depends on B, then
C depends on A. This is called transitive dependency.
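The definition can be tested mechanically against sample data. Below is a minimal sketch in Python, where the function name `holds` and the three-column sample with attributes A, B and C are illustrative assumptions. Note that sample rows can only refute a dependency: a functional dependency is a statement about every valid occurrence, so passing this check on one sample does not prove it.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are tuples of attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False  # same lhs value paired with two different rhs values
    return True


sample = [
    {"A": 1, "B": "x", "C": 10},
    {"A": 2, "B": "x", "C": 10},
    {"A": 3, "B": "y", "C": 20},
]

print(holds(sample, ("A",), ("B",)))  # A -> B
print(holds(sample, ("B",), ("C",)))  # B -> C
print(holds(sample, ("A",), ("C",)))  # A -> C, the transitive consequence
print(holds(sample, ("B",), ("A",)))  # B -> A fails: "x" goes with both 1 and 2
```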
Although five normal forms are listed in the literature, there is seldom a need to go
beyond Third normal form (3NF). So here we will only discuss normalization up to
3NF.
To illustrate normalization, let's use the table given below.
Grade
StudId | StudName | Tel  | Major | CourseId | CourseTitle | TeacherName | TeacherRoom | Grade
-------+----------+------+-------+----------+-------------+-------------+-------------+------
111    | Marina   | 1234 | BU    | BU100    | Bus Org     | Wong        | A1          | B
       |          |      |       | BU200    | Econs       | Aziz        | A2          | A
       |          |      |       | IS300    | Database    | Shoba       | B3          | C
222    | Anthony  | 2345 | IS    | IS200    | Info Sys    | Allen       | B4          | B
       |          |      |       | IS300    | Database    | Shoba       | B3          | A
The above table contains repeated entries: the course data is repeated for each
student. As a result there are multiple values at the intersection between certain rows
and columns. For example, there are three values for Course Id (BU100, BU200
and IS300) for Marina.
To normalize the above table, use the following steps:
Step 1: Remove repeating groups. This is done easily by copying the information
above downwards as shown below. The table (Grade2) is now in 1NF.
Grade2 (1NF)
StudId | StudName | Tel  | Major | CourseId | CourseTitle | TeacherName | TeacherRoom | Grade
-------+----------+------+-------+----------+-------------+-------------+-------------+------
111    | Marina   | 1234 | BU    | BU100    | Bus Org     | Wong        | A1          | B
111    | Marina   | 1234 | BU    | BU200    | Econs       | Aziz        | A2          | A
111    | Marina   | 1234 | BU    | IS300    | Database    | Shoba       | B3          | C
222    | Anthony  | 2345 | IS    | IS200    | Info Sys    | Allen       | B4          | B
222    | Anthony  | 2345 | IS    | IS300    | Database    | Shoba       | B3          | A
As explained, the above relation suffers from the insertion, deletion and modification
anomalies. That means the relation is not well-structured; it requires further
normalization.
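Step 1, copying the student information downwards, can be sketched in a few lines of Python. This is only an illustration: blank cells of the Grade table are represented as `None`, the function name `fill_down` is an assumption, and the sketch assumes the first row always has its student columns filled in.

```python
# Grade table with repeating groups: None in the student columns means
# "same student as the row above".
grade = [
    ("111", "Marina", "1234", "BU", "BU100", "Bus Org", "Wong", "A1", "B"),
    (None, None, None, None, "BU200", "Econs", "Aziz", "A2", "A"),
    (None, None, None, None, "IS300", "Database", "Shoba", "B3", "C"),
    ("222", "Anthony", "2345", "IS", "IS200", "Info Sys", "Allen", "B4", "B"),
    (None, None, None, None, "IS300", "Database", "Shoba", "B3", "A"),
]


def fill_down(table, n_leading):
    """Copy the first n_leading columns down into rows where they are blank."""
    result, last = [], None
    for row in table:
        if row[0] is not None:
            last = row[:n_leading]  # remember the most recent student columns
        result.append(last + row[n_leading:])
    return result


grade2 = fill_down(grade, 4)  # StudId, StudName, Tel, Major are the leading columns

for row in grade2:
    print(row)
```

After this step every cell holds exactly one value, which is what First normal form requires; the redundancy that causes the anomalies is still present.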
Step 2: Remove partial functional dependencies. To apply step 2, we must first
analyze the functional dependencies in the above relation and then select a primary
key for the relation.
The above relation reveals the following dependencies:
1. StudId → StudName, Tel, Major
2. CourseId → CourseTitle, TeacherName, TeacherRoom
3. StudId, CourseId → Grade (dependent on two fields)
4. TeacherName → TeacherRoom
In (1), StudName, Tel and Major are functionally dependent on StudId.
Similarly, in (2), CourseTitle, TeacherName and TeacherRoom are
functionally dependent on CourseId. And so on.
A candidate key for the relation is a set of attributes that uniquely determines all
the other attributes and is non-redundant (no attribute can be dropped from it).
Careful examination shows that there is one such
key, containing both StudId and CourseId. It is called a composite primary key:
a primary key that contains more than one attribute. The relation is shown as
follows:
Grade2 (StudId, StudName, Tel, Major, CourseId,
        ------                        --------
        CourseTitle, TeacherName, TeacherRoom, Grade)
The composite key (StudId, CourseId) is underlined.
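The "careful examination" can be automated with the standard attribute-closure computation: start from a set of attributes and keep adding everything the listed dependencies let you derive. The sketch below, in Python, encodes the four dependencies given above; the function names `closure` and `is_candidate_key` are illustrative.

```python
# The four functional dependencies of Grade2, as (left side, right side) pairs.
FDS = [
    ({"StudId"}, {"StudName", "Tel", "Major"}),
    ({"CourseId"}, {"CourseTitle", "TeacherName", "TeacherRoom"}),
    ({"StudId", "CourseId"}, {"Grade"}),
    ({"TeacherName"}, {"TeacherRoom"}),
]
ALL_ATTRS = {
    "StudId", "StudName", "Tel", "Major", "CourseId",
    "CourseTitle", "TeacherName", "TeacherRoom", "Grade",
}


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_candidate_key(attrs, fds, all_attrs):
    """A candidate key determines all attributes and has no redundant member."""
    attrs = set(attrs)
    if closure(attrs, fds) != all_attrs:
        return False
    return all(closure(attrs - {a}, fds) != all_attrs for a in attrs)


print(is_candidate_key({"StudId", "CourseId"}, FDS, ALL_ATTRS))
print(is_candidate_key({"StudId"}, FDS, ALL_ATTRS))
print(is_candidate_key({"StudId", "CourseId", "Grade"}, FDS, ALL_ATTRS))
```

The first call confirms the composite key; the second fails because StudId alone determines nothing about courses; the third fails because Grade is redundant.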
Next we examine the functional dependencies on this composite key. Three attributes
(StudName, Tel and Major) are functionally dependent on part of the key (StudId)
as shown in Figure 7.1. Similarly, the attributes (CourseTitle, TeacherName and
TeacherRoom) are functionally dependent on another part of the key (CourseId).
These six attributes are partially functionally dependent on the primary key. Only the
attribute Grade is functionally dependent on both StudId and CourseId. That
means we must know both the StudId and the CourseId taken in order to
identify Grade. Since partial dependencies exist, the above relation is not in 2NF.
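Partial dependencies can be listed by testing each proper part of the key on its own. The following Python sketch is one way to do that under the dependencies stated above; `partial_dependencies` is a hypothetical helper name, and `closure` is the usual attribute-closure routine repeated here so that the example runs on its own.

```python
from itertools import combinations

FDS = [
    ({"StudId"}, {"StudName", "Tel", "Major"}),
    ({"CourseId"}, {"CourseTitle", "TeacherName", "TeacherRoom"}),
    ({"StudId", "CourseId"}, {"Grade"}),
    ({"TeacherName"}, {"TeacherRoom"}),
]
KEY = {"StudId", "CourseId"}


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, fds):
    """Map each proper, non-empty part of the key to the non-key attributes it determines."""
    found = {}
    for size in range(1, len(key)):
        for part in combinations(sorted(key), size):
            dependents = closure(part, fds) - key
            if dependents:
                found[part] = dependents
    return found


for part, dependents in partial_dependencies(KEY, FDS).items():
    print(part, "->", sorted(dependents))
```

Grade appears under neither part of the key, which matches the observation that it depends on the whole key.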
Figure 7.1: Functional dependencies (diagram not reproduced)
To convert to 2NF, we must remove the partial dependencies. Close examination shows
that we can do this by creating three new relations as follows:
Student (StudId, StudName, Tel, Major)
CourseTeacher (CourseId, CourseTitle, TeacherName,
TeacherRoom)
Registration (StudId, CourseId, Grade)
The three new relations are as follows:
Student
StudId | StudName | Tel  | Major
-------+----------+------+------
111    | Marina   | 1234 | BU
222    | Anthony  | 2345 | IS
CourseTeacher
CourseId | CourseTitle | TeacherName | TeacherRoom
---------+-------------+-------------+------------
BU100    | Bus Org     | Wong        | A1
BU200    | Econs       | Aziz        | A2
IS300    | Database    | Shoba       | B3
IS200    | Info Sys    | Allen       | B4
IS300    | Database    | Shoba       | B3
Registration
StudId | CourseId | Grade
-------+----------+------
111    | BU100    | B
111    | BU200    | A
111    | IS300    | C
222    | IS200    | B
222    | IS300    | A
The above relations are in 2NF as each non-key attribute in the relations is fully
dependent on the key for that relation.
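The split into three relations is a projection of Grade2 onto three attribute lists. A minimal Python sketch follows; the helper name `project` is an assumption, and rows are held as dictionaries keyed by attribute name. Because this projection collects rows in a set, the repeated IS300 row of CourseTeacher collapses into one, in line with the rule that no two rows of a relation are identical.

```python
ATTRS = ("StudId", "StudName", "Tel", "Major", "CourseId",
         "CourseTitle", "TeacherName", "TeacherRoom", "Grade")
GRADE2 = [dict(zip(ATTRS, values)) for values in [
    ("111", "Marina", "1234", "BU", "BU100", "Bus Org", "Wong", "A1", "B"),
    ("111", "Marina", "1234", "BU", "BU200", "Econs", "Aziz", "A2", "A"),
    ("111", "Marina", "1234", "BU", "IS300", "Database", "Shoba", "B3", "C"),
    ("222", "Anthony", "2345", "IS", "IS200", "Info Sys", "Allen", "B4", "B"),
    ("222", "Anthony", "2345", "IS", "IS300", "Database", "Shoba", "B3", "A"),
]]


def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate rows."""
    return {tuple(row[a] for a in attrs) for row in rows}


student = project(GRADE2, ("StudId", "StudName", "Tel", "Major"))
course_teacher = project(GRADE2, ("CourseId", "CourseTitle", "TeacherName", "TeacherRoom"))
registration = project(GRADE2, ("StudId", "CourseId", "Grade"))

print(len(student), len(course_teacher), len(registration))
```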
Step 3: Remove transitive dependencies. The Student and Registration relations are
already in 3NF. However, CourseTeacher is still only in 2NF. That means it is
still subject to the anomalies discussed above. For example, if we want to change the
location for the teacher Shoba from room B3 to A3, we must make this change in
multiple rows. Similarly, if we delete the course IS200 and there is only one row
for IS200, then we may lose the information that the teacher Allen is located in
room B4.
The anomalies in the CourseTeacher relation exist because data concerning the
entity Teacher are hidden within CourseTeacher. The functional
dependencies in this relation are:
CourseId → CourseTitle, TeacherName, TeacherRoom
TeacherName → TeacherRoom
As TeacherRoom is functionally dependent on TeacherName (a non-key
attribute), there is a transitive dependency in the relation. Therefore, the relation
CourseTeacher is not yet in 3NF.
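The chain from the key through a non-key attribute can be spotted programmatically. The sketch below, in Python, is deliberately simplified: it assumes every dependency has a single attribute on its left side, which is true of CourseTeacher but not of relations in general, and `transitive_dependencies` is a name made up for the example.

```python
# Dependencies of CourseTeacher, keyed by their single left-side attribute.
FDS = {
    "CourseId": {"CourseTitle", "TeacherName", "TeacherRoom"},
    "TeacherName": {"TeacherRoom"},
}
KEY = "CourseId"


def transitive_dependencies(key, fds):
    """Return (key, via, target) triples where key -> via -> target
    and via is a non-key attribute."""
    found = []
    for via in sorted(fds.get(key, ())):
        if via == key:
            continue
        for target in sorted(fds.get(via, ())):
            if target != key:
                found.append((key, via, target))
    return found


print(transitive_dependencies(KEY, FDS))
```

The single triple reported, CourseId to TeacherName to TeacherRoom, is the dependency that the next step removes.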
To remove this transitive dependency from CourseTeacher, we split it into two
relations, Course and Teacher, as follows:
Course
CourseId | TeacherName
---------+------------
BU100    | Wong
BU200    | Aziz
IS300    | Shoba
IS200    | Allen
IS300    | Shoba
Teacher
TeacherName | TeacherRoom
------------+------------
Wong        | A1
Aziz        | A2
Shoba       | B3
Allen       | B4
Shoba       | B3
The relation Course contains the following attributes: CourseId (key),
CourseTitle and TeacherName. The relation Teacher contains two attributes:
TeacherName (key) and TeacherRoom. Thus TeacherName becomes the primary
key in the new relation Teacher and a foreign key in the new Course relation.
A foreign key is an attribute that appears as a non-key attribute in one relation and as
a primary key in another relation.
We have transformed the original table
into a set of four relations in 3NF through a series of simple steps. The complete set is
as follows:
Student (StudId, StudName, Tel, Major)
StudId | StudName | Tel  | Major
-------+----------+------+------
111    | Marina   | 1234 | BU
222    | Anthony  | 2345 | IS
Course (CourseId, CourseTitle, TeacherName)
CourseId | CourseTitle | TeacherName
---------+-------------+------------
BU100    | Bus Org     | Wong
BU200    | Econs       | Aziz
IS300    | Database    | Shoba
IS200    | Info Sys    | Allen
IS300    | Database    | Shoba
Teacher (TeacherName, TeacherRoom)
TeacherName | TeacherRoom
------------+------------
Wong        | A1
Aziz        | A2
Shoba       | B3
Allen       | B4
Shoba       | B3
Registration (StudId, CourseId, Grade)
StudId | CourseId | Grade
-------+----------+------
111    | BU100    | B
111    | BU200    | A
111    | IS300    | C
222    | IS200    | B
222    | IS300    | A
The above relations are free from the insertion, deletion and modification anomalies.
As each entity is described in a separate relation, we can insert or delete data on each
entity without reference to other entities. Modification to an entity can also be done
easily since any change is confined to only a single row. No information has been
lost through the normalization process.
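The claim that no information has been lost can be checked by joining the four relations back together. The following Python sketch uses a simple nested-loop natural join; the helper names `rows` and `natural_join` are assumptions of this example, and each distinct row is entered once, since the rows of a relation are unique.

```python
def rows(attrs, *values):
    """Build a list of dict rows from an attribute list and value tuples."""
    return [dict(zip(attrs, v)) for v in values]


def natural_join(left, right):
    """Join two lists of dict rows on every attribute name they share."""
    joined = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[a] == r[a] for a in shared):
                joined.append({**l, **r})
    return joined


student = rows(("StudId", "StudName", "Tel", "Major"),
               ("111", "Marina", "1234", "BU"), ("222", "Anthony", "2345", "IS"))
course = rows(("CourseId", "CourseTitle", "TeacherName"),
              ("BU100", "Bus Org", "Wong"), ("BU200", "Econs", "Aziz"),
              ("IS300", "Database", "Shoba"), ("IS200", "Info Sys", "Allen"))
teacher = rows(("TeacherName", "TeacherRoom"),
               ("Wong", "A1"), ("Aziz", "A2"), ("Shoba", "B3"), ("Allen", "B4"))
registration = rows(("StudId", "CourseId", "Grade"),
                    ("111", "BU100", "B"), ("111", "BU200", "A"), ("111", "IS300", "C"),
                    ("222", "IS200", "B"), ("222", "IS300", "A"))

# Registration joined with Student, then Course, then Teacher.
rebuilt = natural_join(natural_join(natural_join(registration, student), course), teacher)

for row in rebuilt:
    print(row["StudId"], row["CourseId"], row["TeacherName"], row["TeacherRoom"], row["Grade"])
```

The result has the same five rows and nine attributes as Grade2, so the decomposition is reversible for this data.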
As mentioned previously, the relational database management system (RDBMS) is
the most widely used type of database system today. Some of the main reasons for this are:
1. The RDBMS is well documented in the literature.
2. It is well understood.
3. It is widely taught in schools, colleges, universities and training centers. That
means it is easier (and probably less expensive) to recruit database personnel
who are familiar with the RDBMS than with hierarchical or network
database systems.
4. It is also fast as it uses powerful database engines such as the Jet Database Engine.
5. It is user-friendly as it uses a graphical user interface (GUI).
6. It provides good backup and security features such as passwords, encryption and
views.
7. It has Fourth Generation Language (4GL) features. For example, it comes with an
easy-to-use Structured Query Language (SQL) that allows end-users to retrieve
information quickly without having to write code (programs).
8. Some RDBMSs also provide documentation features. This not only
reduces software development and maintenance time, but
also improves software quality.
9. It supports rich data types such as currency, date, logical, memo and general (for
images, sound and video).
10. It provides a lot more features than the other database models.
11. It has the support of powerful software vendors (such as Microsoft and Oracle).
To retain their customers, they always provide an easy upgrade path. That means
customers do not have to redo their work when they change to a newer version of the
database software, operating system or hardware.
12. It supports network applications, including concurrency and locking mechanisms.
That means many users can simultaneously work on an application without
running into coordination or synchronization problems.
13. It supports the popular client/server architecture.
14. It can be linked to Internet applications.
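Two of the points above, SQL retrieval and views, are easy to illustrate. The sketch below assumes Python's built-in sqlite3 module with an in-memory database and reuses the Student and Registration data from this chapter; the view name `StudentPublic` is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (StudId TEXT PRIMARY KEY, StudName TEXT, Tel TEXT, Major TEXT);
    CREATE TABLE Registration (StudId TEXT, CourseId TEXT, Grade TEXT,
                               PRIMARY KEY (StudId, CourseId));
    INSERT INTO Student VALUES ('111', 'Marina', '1234', 'BU'),
                               ('222', 'Anthony', '2345', 'IS');
    INSERT INTO Registration VALUES
        ('111', 'BU100', 'B'), ('111', 'BU200', 'A'), ('111', 'IS300', 'C'),
        ('222', 'IS200', 'B'), ('222', 'IS300', 'A');
    -- A view exposes a subset of the data; this one hides telephone numbers.
    CREATE VIEW StudentPublic AS SELECT StudId, StudName, Major FROM Student;
""")

# A declarative query: it states what is wanted, not how to find it.
a_grades = conn.execute("""
    SELECT s.StudName, r.CourseId
    FROM Registration AS r JOIN Student AS s ON s.StudId = r.StudId
    WHERE r.Grade = 'A'
    ORDER BY s.StudName
""").fetchall()

# Column names visible through the view.
view_columns = [col[0] for col in conn.execute("SELECT * FROM StudentPublic").description]

print(a_grades)
print(view_columns)
```

The query names the students who earned an A and the course concerned without any looping code, which is the sense in which SQL spares end-users from writing programs.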
7.6 Database Design Considerations
A good database should exhibit the following characteristics:
1. Minimal data redundancy. Duplication of data in a database should be minimal.
Although there will be some redundancies in a database (e.g., the key fields),
such redundancies are controlled.
2. Data consistency. If the same data (or information) is stored in multiple tables,
then all copies should be identical. For example, if the occupational code of an
employee is modified, then all other occurrences of that information in the
database must also be updated. Otherwise, the data will be inconsistent.
3. Data integration. Data stored in one table should be easily accessible from other
related tables. For example, if the employee record has a job code, then it should
be possible to obtain his or her job title.
4. Data sharing. Often there will be many users who share the same database. Each
user may be given a view of the database. A view roughly corresponds to a subset
of the database fields.
5. Provide security, privacy and integrity controls. There must be proper controls
for accessing, updating and protecting data. This often requires setting standards
and procedures.
6. Data accessibility and responsiveness. Users should be able to access the
information stored in a database using simple commands, such as SQL statements.
7. Data independence. Data should be separated from the application programs that
use the data. In other words, the data stored in a database can change without
necessitating change in the application programs. This calls for some kind of
data dictionary or repository to store information on the data stored in the
database.
8. Simplify application development and maintenance. Developing and
documenting applications should be easy and fast. This implies the use of 4GLs
or CASE (computer aided software engineering) features.
7.7 Relational Database Management Systems
The term relational database management system (RDBMS) refers to software that
supports the relational data model. Several RDBMSs are currently available on
the market. These include Access, FoxPro, dBASE, Paradox, Clipper, Informix,
Oracle, DB400 and DB2. Some of them run only on mainframes and minis (e.g.,
DB2, DB400); some only on PCs (e.g., Access, FoxPro); while others may run on
both types (e.g., Oracle and Informix).
The use of an RDBMS greatly simplifies the design of databases. Most RDBMSs today
provide many sophisticated features such as graphical user interfaces (GUI), a variety
of data types (e.g., text, date/time, number, currency, memo, general/OLE (for
objects)), automatic code generation and even diagramming support. They also
provide support for designing forms, queries and reports. Many also provide query
language (e.g., Structured Query Language (SQL)) support. All these features
greatly simplify the task of designing databases.
Sometimes, the term Fourth Generation Language (4GL) or Computer-Aided
Software Engineering (CASE) is used to refer to these RDBMS.
7.8 The RDBMS Components
The RDBMS environment includes the following components:
1. CASE tools. These are automated tools used to design databases and application
programs.
2. Repository. Stores information about the database such as data definitions,
screen and report formats.
3. RDBMS. The relational database software for creating, calculating, maintaining
and accessing information in the database. It also manages the information
repository.
4. Database. A shared collection of logically related data, designed to meet the
information needs of multiple users in an organization.
5. Application programs. Programs written in a database software for creating and
manipulating databases.
6. User Interface. Languages, menus, icons and other features by which users
interact with the system components such as the application programs, DBMS,
CASE and repository.
7. Data administrators. People who are responsible for the overall planning and
management of the data resource.
8. System developers. People (e.g., system analysts, programmers) who develop
new applications.
9. End users. People who actually use the system: those who add, modify and
delete information as well as those who receive information from the system.
Figure 7.2 shows the above components and their relationship with one another.
Figure 7.2: Components of a database environment (diagram showing data
administrators, end users, CASE tools, application programs, DBMS, repository and
database; not reproduced)
1. Define the term relational data model. List the main characteristics of this data
model.
2. Distinguish between a table and a relation.
3. Explain the terms (a) primary key, (b) foreign key and (c) composite key.
4. What is a well-structured relation?
5. Define the term normalization. Describe the steps for normalizing an
ill-structured relation.
6. Given the following table:
Order No | Order Date | Customer Name | Item Code | Item Quantity | Item Description | Unit Price | Vendor Code | Vendor Name
---------+------------+---------------+-----------+---------------+------------------+------------+-------------+------------
7)       | 2098       | Yasmin        | 355       | 20            | Pen              | 7.00       | VP          | Penco
77       | 202798-    | Yasmin        | $55       | 15            | Stapler          | 3.00       | VS          | Staples
797      | 207098     | Yasmin        | 385       | 10            | Ruler            | 0.50       | VR          | Rulerco
Normalize the above table as far as necessary so that the resulting tables
represent simple relations.
7. A real estate agency rents houses to customers on behalf of the house owners.
The table gives information about houses, customers and owners:
Cust No | Cust Name | House No | House Address | Start Date | End Date | Rent | Owner No | Owner Name
--------+-----------+----------+---------------+------------+----------+------+----------+-----------
111     | Wong      | H1       | SNewSt, KL    | Wis        | 6nm7     | 1000 | W1       | Karen
        |           | H2       | 6oldst, KL    | 7718       | 61697    | 900  | W1       | Karen
        |           | H3       | 7™Me view,P9  | 7/1097     | 6298     | 1200 | W2       | Zainal
222     | Yasmin    | H4       | $OneSe,KL     | 5/595      | 4/597    | 1100 | W1       | Karen
        |           | H5       | otwosepr      | 5/786      | we57     | 1000 | W2       | Zainal
Normalize the above table as far as necessary so that the resulting tables
represent simple relations.
8. List and describe the characteristics of a good database design.
9. Describe the major components of a relational database environment.