
LEVEL 4

DATABASES
Lecturer Guide

DB Lecturer Guide V1.0

Modification History

Version    Date         Revision Description
V1.0       June 2011    For Release

NCC Education Limited, 2011


All Rights Reserved
The copyright in this document is vested in NCC Education Limited. The document must not be
reproduced by any means, in whole or in part, or used for manufacturing purposes, except with the
prior written permission of NCC Education Limited and then only on condition that this notice is
included in any such reproduction.
Published by: NCC Education Limited, The Towers, Towers Business Park, Wilmslow Road,
Didsbury, Manchester M20 2EZ, UK.
Tel: +44 (0) 161 438 6200 Fax: +44 (0) 161 438 6240 Email: info@nccedu.com
http://www.nccedu.com

CONTENTS
1. Module Overview and Objectives
2. Learning Outcomes and Assessment Criteria
3. Syllabus
4. Related National Occupational Standards
5. Resources
6. Pedagogic Approach
    6.1 Lectures
    6.2 Tutorials
    6.3 Laboratory Sessions
    6.4 Private Study
7. Assessment
8. Further Reading List
Topic 1: Introduction to the Module and Database Fundamentals
    1.1 Learning Objectives
    1.2 Pedagogic Approach
    1.3 Timings
    1.4 Lecture Notes
        1.4.1 Guidance on the Use of the Slides
    1.5 Laboratory Sessions
    1.6 Private Study
    1.7 Tutorial Notes
Topic 2: Databases and Database Management Systems (DBMS)
    2.1 Learning Objectives
    2.2 Pedagogic Approach
    2.3 Timings
    2.4 Lecture Notes
        2.4.1 Guidance on the Use of the Slides
    2.5 Laboratory Sessions
        2.5.1 Simple Queries
        2.5.2 Using the COUNT function with queries
    2.6 Private Study
    2.7 Tutorial Notes
Topic 3: Entity Relationship (ER) Modelling (1)
    3.1 Learning Objectives
    3.2 Pedagogic Approach
    3.3 Timings
    3.4 Lecture Notes
        3.4.1 Guidance on the Use of the Slides
    3.5 Laboratory Sessions
        3.5.1 Selecting from More Than One Table
    3.6 Private Study
    3.7 Tutorial Notes
Topic 4: Entity Relationship (ER) Modelling (2)
    4.1 Learning Objectives
    4.2 Pedagogic Approach
    4.3 Timings
    4.4 Lecture Notes
        4.4.1 Guidance on the Use of the Slides
    4.5 Laboratory Sessions
    4.6 Private Study
        4.6.1 Review Material on ER Modelling
        4.6.2 Drawing ER Models
    4.7 Tutorial Notes
Topic 5: The Relational Model (1)
    5.1 Learning Objectives
    5.2 Pedagogic Approach
    5.3 Timings
    5.4 Lecture Notes
        5.4.1 Guidance on the Use of the Slides
    5.5 Laboratory Sessions
        5.5.1 More on Creating and Modifying Tables
    5.6 Private Study
    5.7 Tutorial Notes
Topic 6: The Relational Model (2)
    6.1 Learning Objectives
    6.2 Pedagogic Approach
    6.3 Timings
    6.4 Lecture Notes
        6.4.1 Guidance on the Use of the Slides
    6.5 Laboratory Sessions
    6.6 Private Study
    6.7 Tutorial Notes
Topic 7: SQL (1)
    7.1 Learning Objectives
    7.2 Pedagogic Approach
    7.3 Timings
    7.4 Lecture Notes
        7.4.1 Guidance on the Use of the Slides
    7.5 Laboratory Sessions
    7.6 Private Study
    7.7 Tutorial Notes
Topic 8: SQL (2)
    8.1 Learning Objectives
    8.2 Pedagogic Approach
    8.3 Timings
    8.4 Lecture Notes
        8.4.1 Guidance on the Use of the Slides
    8.5 Laboratory Sessions
        8.5.1 Aggregation
        8.5.2 Part One: Using the Workers, Departments and Job_Types Tables
        8.5.3 Using the Personal_details, Qualifications and Qualification_types Tables
    8.6 Private Study
    8.7 Tutorial Notes
Topic 9: Database Design
    9.1 Learning Objectives
    9.2 Pedagogic Approach
    9.3 Timings
    9.4 Lecture Notes
        9.4.1 Guidance on the Use of the Slides
    9.5 Laboratory Sessions
        9.5.1 Grouping
    9.6 Private Study
    9.7 Tutorial Notes
Topic 10: Supporting Transactions
    10.1 Learning Objectives
    10.2 Pedagogic Approach
    10.3 Timings
    10.4 Lecture Notes
        10.4.1 Guidance on the Use of the Slides
    10.5 Laboratory Sessions
        10.5.1 Nested Sub-queries
        10.5.2 Use of the Workers, Departments and Job_types Tables
        10.5.3 Using the Personal_details, Qualifications and Qualification_types Tables
    10.6 Private Study
    10.7 Tutorial Notes
Topic 11: Database Implementation
    11.1 Learning Objectives
    11.2 Pedagogic Approach
    11.3 Timings
    11.4 Lecture Notes
        11.4.1 Guidance on the Use of the Slides
    11.5 Laboratory Sessions
        11.5.1 Views
        11.5.2 Indexes
        11.5.3 Constraints
    11.6 Private Study
    11.7 Tutorial Notes
Topic 12: Summary
    12.1 Learning Objectives
    12.2 Pedagogic Approach
    12.3 Timings
    12.4 Lecture Notes
        12.4.1 Guidance on the Use of the Slides
    12.5 Private Study


Overview
1. Module Overview and Objectives

This unit aims to give the learner a thorough grounding in practical techniques for the design and
development of database systems, and the theoretical frameworks that underpin them.

2. Learning Outcomes and Assessment Criteria

Learning Outcomes;                      Assessment Criteria;
The Learner will:                       The Learner can:

1. Understand the concepts associated   1.1 Summarise the common uses of database systems
   with database systems                1.2 Explain the meaning of the term database
                                        1.3 Explain the meaning of the term database management system (DBMS)
                                        1.4 Describe the components of the DBMS environment
                                        1.5 Describe the typical functions of a DBMS
                                        1.6 Summarise the advantages and disadvantages of a DBMS

2. Understand the concepts associated   2.1 Summarise the concept of the relational model
   with the relational model            2.2 Explain the terminology associated with the relational model
                                        2.3 Explain the purpose of relational integrity

3. Understand how to design and         3.1 Explain the use of ER modelling in database design
   develop a database system            3.2 Describe the basic concepts of an ER model
                                        3.3 Describe ways of identifying problems in an ER model
                                        3.4 Explain ways of solving problems in an ER model
                                        3.5 Summarise the purpose of SQL
                                        3.6 Describe how to create database tables using SQL

4. Be able to develop a logical         4.1 Identify a set of tables from an ER model
   database design                      4.2 Check that the tables are capable of supporting the required transactions

5. Be able to develop a database        5.1 Create database tables based on data dictionary
   system using SQL                     5.2 Insert data into the tables
                                        5.3 Update data in the tables
                                        5.4 Delete data in the tables

3. Syllabus

Syllabus

Topic No: 1
Title: Introduction to the Module and Database Fundamentals
Proportion: 1/15 (2 hours of lectures, 1 hour of tutorials, 1 hour of laboratory sessions)
Content:
    Introduction to the module
    What are databases?
    Examples of databases in use
    Data and information
Learning Outcome: 1

Topic No: 2
Title: Databases and Database Management Systems (DBMS)
Proportion: 1/15 (2 hours of lectures, 1 hour of tutorials, 1 hour of laboratory sessions)
Content:
    Components of a database system
    Types of applications
    Database Management Systems
    Available commercial implementations
    History of information management
    Pre-database information systems
    Advantages of database approach and DBMS
    Disadvantages of DBMS
    Relational model and alternatives
Learning Outcome: 1

Topic No: 3
Title: Entity Relationship (ER) Modelling (1)
Proportion: 1/15 (2 hours of lectures, 1 hour of tutorials, 1 hour of laboratory sessions)
Content:
    The goal of ER modelling
    Types of notation
    Basic concepts (Entities, Attributes and Relationships)
    Identifying entities
Learning Outcome: 3

Topic No: 4
Title: Entity Relationship (ER) Modelling (2)
Proportion: 1/10 (2 hours of lectures, 2 hours of tutorials, 2 hours of laboratory sessions)
Content:
    Constructing ER models
    Strong and weak entities
    Identifying problems in ER models
    Problem solving in ER models
Learning Outcome: 3

Topic No: 5
Title: The Relational Model (1)
Proportion: 1/10 (2 hours of lectures, 2 hours of tutorials, 2 hours of laboratory sessions)
Content:
    Aims of the relational model
    Basic concept of the relational model
    Terminology
Learning Outcome: 2

Topic No: 6
Title: The Relational Model (2)
Proportion: 1/10 (2 hours of lectures, 2 hours of tutorials, 2 hours of laboratory sessions)
Content:
    The purpose of relational integrity
    Basic purpose and concepts of normalisation
Learning Outcome: 2

Topic No: 7
Title: SQL (1)
Proportion: 1/12 (2 hours of lectures, 1 hour of tutorials, 2 hours of laboratory sessions)
Content:
    The purpose and role of SQL
    Basic concepts of SQL
    Standards and flavours of SQL
Learning Outcome: 3

Topic No: 8
Title: SQL (2)
Proportion: 1/12 (2 hours of lectures, 1 hour of tutorials, 2 hours of laboratory sessions)
Content:
    Key constructs in SQL
    Create statements
    Select statements
    Fixing mistakes
Learning Outcome: 3

Topic No: 9
Title: Database Design
Proportion: 1/10 (2 hours of lectures, 2 hours of tutorials, 2 hours of laboratory sessions)
Content:
    Understanding requirements
    Identify a set of tables from an ER model
    The data dictionary
    Use of CASE tools
    Entities to tables
Learning Outcome: 4

Topic No: 10
Title: Supporting Transactions
Proportion: 1/10 (2 hours of lectures, 2 hours of tutorials, 2 hours of laboratory sessions)
Content:
    Identifying business rules
    Checking a database will support the required transactions
    Identifying possible performance issues
    Indexing and de-normalisation
Learning Outcome: 4

Topic No: 11
Title: Database Implementation
Proportion: 1/10 (2 hours of lectures, 2 hours of tutorials, 2 hours of laboratory sessions)
Content:
    The implementation environment
    Creating tables based on database dictionary
    Enforcing integrity via constraints
    Enforcing business rules via constraints
    Creating indexes
    Insert, Update and Delete
Learning Outcome: 5

Topic No: 12
Title: Summary
Proportion: 1/30 (2 hours of lectures)
Content:
    Summary of module
    Identifying links with other modules/subject areas
    Clarification of module material and related issues as identified by students
Learning Outcomes: All

4. Related National Occupational Standards


The UK National Occupational Standards describe the skills that professionals are expected to
demonstrate in their jobs in order to carry them out effectively. They are developed by employers
and this information can be helpful in explaining the practical skills that students have covered in this
module.
Related National Occupational Standards (NOS)
Sector Subject Area: 6.1 ICT Professional Competence
Related NOS: 4.2.A.1 Contribute to data analysis assignment;
4.2.A.2 Carry out specified data analysis activities;
4.5.A.1 Collate specified information relating to data design activities;
4.5.A.2 Contribute to producing and maintaining data designs;
4.5.A.3 Assist, under supervision, the management of data relating to data designs;
4.5.P.1 Assist with the development for data design activities

5. Resources

Lecturer Guide: This guide contains notes for lecturers on the organisation of each topic, and
suggested use of the resources. It also contains all of the suggested exercises and model answers.

PowerPoint Slides: These are presented for each topic for use in the lectures. They contain many
examples which can be used to explain the key concepts. Handout versions of the slides are also
available; it is recommended that these are distributed to students for revision purposes as it is
important that students learn to take their own notes during lectures.

Student Guide: This contains the topic overviews and all of the suggested exercises.

This module also makes use of SQL. You may choose which version of SQL is to be used but
students will need to have access to this during laboratory and private study time. You may wish to
consider MySQL. This is open source software which is available from
http://www.mysql.com/downloads.

6. Pedagogic Approach

Suggested Learning Hours
Lectures: 24
Tutorial: 17
Seminar: -
Laboratory: 19
Private Study: 90
Total: 150

The teacher-led time for this module comprises lectures, laboratory sessions and tutorials. The
breakdown of the hours is also given at the start of each topic.

6.1 Lectures

Lectures are designed to start each topic and PowerPoint slides are presented for use during these
sessions. Lecturers should encourage active participation wherever possible, giving students time
to discuss and/or practise the concepts covered.

6.2 Tutorials

These are designed to deal with the questions arising from the lectures and private study sessions.
For some topics these will be structured sessions with students engaging in tasks related to the
lecture. Other sessions will involve problem solving and trouble-shooting discussions related to the
practical work.

6.3 Laboratory Sessions

During these sessions, students are required to work through practical tutorials and various
exercises. Most of this practical work is aimed at gaining sufficient mastery of the chosen
database tool and of SQL to implement the assessment task.
Students will be introduced to SQL in the laboratory sessions and this learning will later be
augmented by lecture and tutorial sessions. The details of these are provided in this guide and also
in the Student Guide.

6.4 Private Study

In addition to the taught portion of the module, students will also be expected to undertake private
study. Exercises are provided in the Student Guide for students to complete during this time.
Teachers will need to set deadlines for the completion of this work. These should ideally be before
the tutorial session for each topic, when Private Study Exercises are usually reviewed.

7. Assessment

This module will be assessed by means of an examination worth 75% of the total mark and an
assignment worth 25% of the total mark. These assessments will be based on the assessment
criteria given above and students will be expected to demonstrate that they have met the module's
learning outcomes. Sample assessments are available through the NCC Education Campus
(http://campus.nccedu.com) for your reference.
Assignments for this module will include topics covered up to and including Topic 7. Questions for
the examination will be drawn from the complete syllabus. Please refer to the Academic Handbook
for the programme for further details.

8. Further Reading List

A selection of sources of further reading around the content of this module must be available in your
Accredited Partner Centre's library. The following list provides suggestions of some suitable
sources:
Beynon-Davies, P. (2003). Database Systems, 3rd Revised Edition. Palgrave Macmillan.
ISBN-10: 1403916012
ISBN-13: 978-1403916013
Connolly, T. and Begg, C. (2009). Database Systems: A Practical Approach to Design,
Implementation, and Management, 5th Edition. Pearson Addison Wesley.
ISBN-10: 0321523067
ISBN-13: 978-0321523068
Hoffer, J., Ramesh, V. and Topi, H. (2010). Modern Database Management, 10th Edition. Pearson
Prentice Hall.
ISBN-10: 1408264315
ISBN-13: 978-1408264317


Topic 1: Introduction to the Module and Database Fundamentals

1.1 Learning Objectives

This topic provides an overview of the module syllabus and a general introduction to databases.
On completion of the topic, students will be able to:

Give a definition of what a database is;
Give examples of databases in use;
Distinguish between data and information.

1.2 Pedagogic Approach

Information will be transmitted to the students during the lectures. Students will be encouraged to
participate in activities during the lecture. They will then practise the skills during the tutorial session.
The laboratory sessions will introduce SQL to enable students to begin creating some database
structures that will be used in future topics.

1.3 Timings

Lectures: 2 hours
Laboratory Sessions: 1 hour
Private Study: 7.5 hours
Tutorials: 1 hour

1.4 Lecture Notes

The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:

Outline of the module
Examples of databases in use
What are databases?
Data and information

1.4.1 Guidance on the Use of the Slides


Slide 2:

Scope and coverage

Slide 3:

Learning outcomes of this lecture

Slide 4:

An overview of the module is presented here. Give a brief introduction to each of the topics. The
structure of the module moves from an initial introduction to databases into the techniques needed
to be able to complete a database project.

Slide 5:

This slide gives an outline of the pedagogic approach. Information will be transmitted to the
students during the lectures. Students will be encouraged to participate in activities during
lectures. They will then practise the skills during the tutorial session. The laboratory sessions
will be used to develop skills in SQL to be able to create database structures and manipulate data.
Students will also be expected to engage in private study to consolidate and extend their knowledge
of the topics covered.

Slide 6:

Outline the assessment methods for this module to the students.

Slide 7:

This slide gives a brief overview of the importance of databases. Point out that they
are a relatively new technology. You may also like to ask students to give their own
explanations/definitions of what a database is. This can be helpful in gauging
students' initial knowledge levels on the topic.

Slides 8-9:

The students should now be asked to write down all the different places where they
think information about them might be kept in a database, or where they have
interacted with a database. Run a class feedback session to gather and collate the
results before showing students the examples on Slide 9.

Slides 10-11:

A more detailed example of purchasing insurance is presented here. To purchase


insurance of any sort, there is a lot of information that needs to be gathered about
the person applying and matched to a lot of other information about the types of
insurance available. To use the example of travel insurance: the broker needs to
know what sort of travel is being undertaken, for how long, what sort of cover is
required (just health or belongings as well), the health and age of the person
applying and the sort of activities that will be undertaken (anything dangerous like
skiing or mountain climbing will cost more). All this information will be stored on a
database.

Page 14 of 158
DB Lecturer Guide V1.0

Title Here
To match this to the right policy, they would have to search through one or more
databases before coming up with the insurance quote.
Sometimes this matching of person to policy is done with a sophisticated piece of
software called an Expert System. This uses sets of rules to match the person to
the policy and uses databases of the personal data, the policy data and the actual
rules themselves to do so.

Slides 12-13:

So far, the lecture has looked at lots of different uses of databases, but how do we
define what a database is? There are various definitions given in textbooks. This
slide presents the definition given by one of the founding fathers of modern
databases, C. J. Date: "A database is a computerised record-keeping system." This
definition is acceptable as a starting point, but highlight to students that some people
include manual filing systems as a type of database.

Slide 14:

Databases have the capacity to store, manipulate and retrieve data. We keep data
there (storage), we do things to that data via programs and applications
(manipulation) and we need to be able to get the data out of the database when we
need it (retrieval).

Slides 15-16:

There is a wide range of databases in existence, from single-user databases on a
PC to multi-user databases with hundreds of users. The example given on the
slide is of the supermarket chain Wal-Mart's data warehouse, which was (as of
2004) about 500 terabytes in size. You may also want to provide local examples of
very large databases.

Slides 17-18:

This slide makes the point that databases are not just big buckets of data. Most of
these systems are not just a mass of data that isn't in any way organised or
controlled. Thus, Date's definition is not precise enough and a more detailed
definition from Hoffer et al. is presented on Slide 18.

Slides 19-21:

The terms in this definition will be examined in more depth as the module
progresses. How data is organised and logically related will be extremely important
when students examine entity relationship modelling and the relational model later
in the module. These slides present an introduction to the importance of these
ideas. The example of a salesperson's database is given on Slide 21 but this could
be replaced with another database with which students are familiar, for example a
database holding hospital records. We say that data should be related in that
certain data belongs together in the sense that it forms a meaningful set for the
person using it.

Slide 22:

Now ask students to work in pairs or small groups and think about which of their
own characteristics might become data. They should make two lists:
one is a list of their data that might be relevant to a database for a college or
university; the other list should be for a database for a social networking site.
After students have had a few minutes to discuss, run a class feedback session.
The lists should contain different sorts of data. For example, a college would be
interested in their qualifications; a social networking site would be interested in what
sort of music and films they like.

Slides 23-24:

These slides return to the question of what data is. Historically, data meant facts
that could be recorded and stored in a computer system. For example, in a sales
database the data would include facts such as a customer name, address and
telephone number. This is quite simple data which consists of text. Numerical data,
such as the amount that customer spent last year, might also be stored. Today, this
definition has to be expanded to reflect a new reality since databases store objects
such as whole documents, photographic images, sound and video.

Slides 25-26:

Traditionally there has been a distinction made between data and information.
Ask students to look at the list on Slide 25 and guess what it means. Highlight that if
it is processed in a certain way, it becomes more meaningful. Information is data
that has been processed in such a way that it can increase the knowledge of the
person who uses it.

Slide 27:

This slide highlights the importance of information. We are told that we live in a
knowledge economy where information is the most vital resource. Databases are
therefore one of the key technologies for gaining access to this information.

Slide 28:

Recap the learning outcomes. Ask the students to explain how each one has been
covered in the lecture.

1.5 Laboratory Sessions

The laboratory time allocation for this topic is 1 hour.


Lecturer's Notes:
Students have copies of the laboratory exercises in the Student Guide. Answers are not provided in
their guide.
It is envisaged that the first session will include some set-up time to provide students with SQL
accounts. You can use any SQL implementation that supports the standard features, such as Oracle, SQL
Server or MySQL. This will need to be available to each student during all of the laboratory
sessions. You will need to be familiar with the features of whichever product you are using in order
to be able to instruct the students.
This session is about creating some data structures in the form of database tables and populating
them with some data. These tables and data will be used in future exercises.
These exercises are to get students started. The parts of the SQL language they will use will be
revisited in subsequent sessions. You will need to monitor students carefully as they work through the
exercises and provide guidance as necessary.

Exercise 1
The following SQL scripts will create two tables. Create and run them. If possible learn how to save
them in a file and run them from the SQL prompt. The mechanism for doing this will depend upon
the version of SQL you are using.
-- Each department is identified by its dept_no.
Create table departments
(dept_no integer not null,
department_name varchar(30),
location varchar(30),
primary key (dept_no));
-- Each worker belongs to the department given by dept_no.
Create table workers
(emp_no integer not null,
first_name varchar(30),
last_name varchar(30),
job_title varchar(30),
age integer,
dept_no integer,
primary key (emp_no),
foreign key (dept_no) references departments (dept_no));
Exercise 2
Examine the tables you have created. You do this using the desc <table_name> command.
Suggested Answer:
desc departments
desc workers
Exercise 3: Insert Statements
To insert data into a table, you need to use an insert statement. The structure of insert statements
is:
Insert into departments values (1,'Packing','Cairo');
Now use similar statements to insert the Accounts department in Lagos with reference number 2
and the Human Resources department in London with reference number 3.
Note: single quotes (' ') are used around text-based fields and are not required for numeric fields.
Suggested Answer:
Insert into departments values (2,'Accounts','Lagos');
Insert into departments values (3,'Human Resources','London');
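Lecturers may also wish to show students the form of the insert statement that names its columns. The following is a minimal sketch, assuming the departments table has the columns dept_no, department_name and location as created in Exercise 1. It is an alternative way of writing the first suggested answer, not an additional row; running both would break the primary key.

```sql
-- Naming the columns means the statement does not rely on
-- the order in which the columns were defined in the table.
Insert into departments (dept_no, department_name, location)
values (2, 'Accounts', 'Lagos');
```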
Exercise 4
Now use insert statements to create the following workers:
Emp_no  First_name  Last_name  Job_title        Age  Dept_no
        Lawrence    Surani     Manager          56
        Jason       Argo       Manager          33
        Emily       Villa      Manager          32
        Ahmed       Mukani     Packer           23
        Joe         Todj       Packer           24
        Hattie      Smith      Accountant       56
        Sally       Boorman    Admin Assistant  34

Suggested Answer:
Insert statements structured as above with relevant data,
e.g. Insert into workers values (1,'Lawrence','Surani','Manager',56,1);

Exercise 5: Looking at the Data
To see the data in your table, you need to use a select statement. The structure of select statements
is:
select <column_name> from <table_name>
To see all the columns:
select * from <table_name>
Use the select command to view the contents of your tables.
Suggested Answer:
Relevant select statements, such as: Select * from workers;

1.6 Private Study

The time allocation for private study in this topic is expected to be approximately 7.5 hours.
Lecturer's Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide.
The Private Study exercises will be reviewed during the tutorial session so students should be
directed to complete this work in advance of that session.

Exercise 1: Databases in Organisations


Consider at least 3 organisations that you are familiar with; for example: your college, place of work,
somewhere you shop or an organisation you are involved with during your leisure time. For each of
them make notes on the following:

Do they have databases at the moment? If so, what are they used for?

What sort of data would they be interested in collecting?

How might they break down the data they are interested in into different categories? These
might be types of people, objects for sale, courses etc.

Suggested Answer:
The answer will depend on the organisations. Students should be encouraged to make educated
guesses about aspects that they are not familiar with and cannot get access to. Students should
also be encouraged to ask questions.
With regard to the breakdown of data, students should not be expected at this stage to have come
up with definitive data structures but to demonstrate that they have thought about different types of
data.
Exercise 2
Many organisations that collect data that will eventually go into a database begin the collection
process with paper forms. Examples of this might be a passport or driving licence application, a job
application or an application to join a library.
Collect some examples of such paper forms and turn each into a list of the data that could then be
input into a database.
Suggested Answer:
Once again, it is not expected that students should come up with fully-fledged data structures. The
aim here is to encourage them to think about different sources of data, data types and the way data
is organised.

Exercise 3
Revise the topics that were discussed in the lecture. Ensure that you understand examples of
databases in use, the definitions of databases that were given, types of data, and the difference
between data and information.
If anything remains unclear after you have revised the topic, make a list of your questions and bring
it to the tutorial session.

1.7 Tutorial Notes

The time allowance for tutorials in this topic is 1 hour.


Lecturer's Notes:
Students have copies of the tutorial activities in the Student Guide. Answers are not provided in their
guide.
In Exercise 1 below, group members should be encouraged to look at each student's findings and
compare them with their own. They should then prepare a summary to report back to the rest of the
class. You should also encourage students to ask any questions they have about the content of this
topic.

Exercise 1: Group Discussion of Private Study Activities

In small groups, discuss your findings from Private Study Exercises 1 and 2. You should collate your
findings and report back to the class.
Exercise 2: Data and Information

Answer the following questions:


1. What is the difference between data and information?
2. Identify in the materials you have collected as part of the Private Study: what is data and what is
information?
Suggested Answer:
Data is raw facts kept in a database and unprocessed. When data is processed in some way so as
to become meaningful, it becomes information.
With regard to the materials the students have collected, the point should be made that it is not
always easy to distinguish data and information. The headings on the forms mean that the entries
will have meaning. When they are put into a database they will be stored as raw data and have to
undergo some process of transformation (such as headings supplied and formatted in a report) for
them to become information again.
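The distinction can also be demonstrated in the laboratory environment. As a minimal sketch, assuming the workers table from the Topic 1 laboratory session has been populated, the first query below returns the stored data as it is, while the second processes it into a single labelled figure that answers a question. The alias managers_under_40 is an illustrative name only, and the COUNT function is not introduced to students until the Topic 2 laboratory session.

```sql
-- Data: the rows exactly as they are stored.
Select * from workers;

-- Information: the same data processed to answer a question,
-- with a heading (alias) that gives the result its meaning.
Select count(emp_no) as managers_under_40
from workers
where job_title = 'Manager'
and age < 40;
```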


Topic 2: Databases and Database Management Systems (DBMS)

2.1 Learning Objectives

This topic provides an overview of databases and database management systems.

On completion of the topic, students will be able to:

Describe the main features of a database system;
Understand the role of the database management system;
Describe pre-database information systems;
Identify some of the commercial products available;
Understand the importance of the relational model and identify some alternatives to it.

2.2 Pedagogic Approach

Information will be transmitted to the students during the lectures. They will investigate some of the
topics further in private study and feed their results back to the tutorial. The laboratory session will
consist of exercises in the SQL language.

2.3 Timings

Lectures: 2 hours
Laboratory Sessions: 1 hour
Private Study: 7.5 hours
Tutorials: 1 hour

2.4 Lecture Notes

The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:

Recap of data
Introducing the concept of metadata
Pre-database file processing systems
Structure of a database system
Common applications
DBMS architecture and functions
DBMS advantages and disadvantages
Commercial products
Data models

2.4.1 Guidance on the Use of the Slides


Slides 2-3:

Scope and Coverage and Learning Outcomes - Go through the topics that will be
covered in this lecture. Recap some of the key concepts covered in the first lecture.
The main features of database systems will be described in more detail. Pre-database
information management tools (file processing systems) will be examined
in order to point out the different approach taken by database systems and the
benefits and advantages they offer.

Slides 4-5:

A recap on the previous definition of data is given here: data is the raw facts stored
in a computer system. In a relational database, data is stored in the form of tables
with columns and rows, and each row represents an instance of data. A row is
sometimes referred to as a tuple. Each column represents a different
attribute of that data. MS Access and Oracle are both examples of relational
database systems and the slides show how each displays its table
structures. Note that in the Oracle slide, the data in the table has been
retrieved using a SELECT statement of the sort the students will be using in the
laboratory sessions.

Slides 6-9:

Metadata is data that describes the structure of other data. The structure of the
tables in a relational database is kept within the database itself in the form of
metadata. This defines the name of each table and, for each column, its name,
length and data type. Here, the students could be asked about their
understanding of data types, e.g. what data types are available? It should be
pointed out that different implementations of relational databases from different
vendors might have slightly different names for the same data type, although there
are standards. These slides show examples of how metadata is held in MS Access
and Oracle. Metadata is important because it is how the database keeps a record of
the structure of the data that is stored in it; the keeping of metadata means that
within the database itself, there is a record of the structure of the tables and
attributes. This enables a database to realise one of the advantages of the
database approach, namely, program-data independence. This will be discussed in
more detail below. The collection of metadata in a database is known as the data
dictionary.
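
If time allows, the data dictionary can be queried like any other set of tables. The following is a sketch rather than a definitive method, assuming the workers table from the Topic 1 laboratory session exists; which of the two views is available depends on the product in use. INFORMATION_SCHEMA.COLUMNS is provided by MySQL and SQL Server, while USER_TAB_COLUMNS is the Oracle equivalent (Oracle stores unquoted table names in upper case).

```sql
-- MySQL or SQL Server: metadata held in the INFORMATION_SCHEMA views.
Select column_name, data_type, character_maximum_length
from information_schema.columns
where table_name = 'workers';

-- Oracle: metadata held in the data dictionary views.
Select column_name, data_type, data_length
from user_tab_columns
where table_name = 'WORKERS';
```

Each row returned describes one column of the workers table, which is exactly the distinction between metadata and data that Slide 10 asks students to practise.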

Slide 10:

Activity - The students should be asked to define metadata for the systems
mentioned in the slide. The important point to get across here is that they
understand the distinction between data and metadata; this exercise will test the
learning of the distinction between these concepts.

Slides 11-12:

These slides examine two file-processing systems. The example used is of a car
rental system. One system processes CUSTOMER data, and the other processes
RENTAL data. Each of the files and the applications that use them are totally
separate. Although this is an improvement over older manual systems, there are a
number of problems here:

Data are separated and isolated:


Customer information is stored in a separate file to rental information. If
salespeople need to relate substantial customer information to the cars rented,
then there is a problem. Data will have to be extracted from each file and
combined into a single file. This involves working out how each file is related to
the other and which parts of the file are needed; then a process of extraction
has to take place. This is often more difficult the greater the number of files
involved.

Data duplication:
For example, a customer's name, address and phone number might be stored
many times over: once in the customer file and once again every time they
make a rental (therefore possibly many times in the rental file).
This wastes space and also raises the more serious problem of compromising
data integrity. Data integrity refers to data being logically consistent. For
example, if a customer changes his or her name or address, then all the files
containing that data must be updated, but the danger with duplication is that this
doesn't happen. The address might be changed in one file and not in another,
which would make it difficult to know which address is correct.

Application program dependency:


With file processing systems, the application programs depend on the file
format. For example, if you write a program in COBOL to get some data from a
file then you have to specify in your program the exact way in which that file
holds the data. The problem with this arrangement is that when changes are
made in file formats, then the application programs must also be changed.

Incompatible files:
Due to this application program dependency, files that can be processed by one
programming language will be different to those processed by another. This
makes files difficult to combine, which reinforces the isolation and separation of
data that we discussed earlier.

Difficulty of representing data in the users' perspectives:


Users often want to see data in a way that is different from the way it is stored.
For example, they might want to see rental information with a substantial
amount of customer information. This means doing things like combining files,
which we have already noted is difficult, just to make the data appear natural to
the users.

Slide 13:

It should be pointed out that the most obvious feature of the diagram of the Basic
Structure of a database system is that now the data is stored in one place. All the
various applications will access it via the Database Management System.

Slide 14:

This slide represents the database system in more detail.

Slide 15:

The features of the database approach overcome the problems discussed with
regard to file-processing systems. The following points should be discussed:

Integrated data:
In a database system, all the application data is stored in a single facility called
a database. An application program can access customer information and rental
information easily. The program can specify how to combine the data and the
DBMS will do it.

Reduced data duplication:


As data is stored in only one place, there is no need to duplicate it. It is easy to
retrieve and if something changes, we only have to update it in one place.

Program/Data independence:
Since the record formats are stored in the database itself (as metadata),
we don't need to include file information in our application programs.

Easier representation of users' perspectives:


Database technology makes it much easier to represent data in a way the users
like to see it. This is a product of integration, which means it is much easier to
produce the sorts of applications where all the data that is needed can be
shown. When a developer creates a database, the information is stored in
tables. It is the job of the DBMS to store and retrieve data in these tables. When
a user wants to see the data in other formats, such as on a screen or in a
report, then we have to develop applications to do so.

Database systems are self-describing


In addition to the users' source data, a database contains a description of its
own structure. This description is held as metadata in a set of tables
known as the data dictionary.

Database Systems Maintain Program-Data Independence


Since a database is self-describing, application programs do not need
knowledge of the underlying file system formats. This means that changes in
the structure of the data will not have a major impact on the application
programs.

A database is a model of a model


A database is a model. It should be pointed out to the students that while it is
tempting to say that a database is a model of reality or some portion of reality as
it relates to a business, this is not strictly true. A database does not model
reality or some portion of reality. Instead, a database is a model of the users'
model. It is an attempt to capture the way users understand the data held for
their business needs. So, for example, the amount of detail held by a system
would depend on the users' needs. Understanding the way the user thinks, their
requirements and pre-conceptions, is a major topic of study in systems analysis,
but we should always bear in mind that what we are concerned with is
people's perceptions and understanding.
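
Two of the points above, integrated data and program-data independence, can be sketched in SQL using the car rental example. All table and column names below (customers, rentals, customer_no, car_registration and so on) are hypothetical and chosen only for illustration; joins are not covered until later in the module, so this is for the lecturer's reference rather than for students to type in.

```sql
-- Customer details are stored once, in one table.
Create table customers
(customer_no integer not null,
customer_name varchar(30),
address varchar(60),
primary key (customer_no));

-- Each rental refers to its customer instead of repeating the details.
Create table rentals
(rental_no integer not null,
customer_no integer,
car_registration varchar(10),
primary key (rental_no),
foreign key (customer_no) references customers (customer_no));

-- Integrated data: the query states how the tables relate
-- and the DBMS combines them.
Select customers.customer_name, customers.address, rentals.car_registration
from customers
join rentals on customers.customer_no = rentals.customer_no;

-- Program-data independence: the structure of customers changes,
-- but the query above names its columns and so runs unchanged.
Alter table customers add phone varchar(20);
```

In a file-processing system, the equivalent change to the customer file format would require every program that reads the file to be amended.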
Slide 16:

Slides 17-18:

Slide 19:

The students should be asked what their understanding of an application is. They
can be encouraged to think about how they have interacted with databases. An
overview of the various types of applications is given:

Forms - Interactive, allowing input and output. Used for data entry.

Reports - Read-only. Can be paper-based or on-screen. The result of some
querying of the database.

Web applications - Many websites interact with databases. The type of
applications found on the web might be forms allowing interaction or reports that
are the result of a query. The students should be asked to think about what
databases they might have interacted with online.

Batch Processes - Programs that perform an activity on the database in one go,
such as multiple updates. If, for example, someone wanted to delete all the
customers from a system who hadn't bought anything for over a year, a
batch process could do this in one go without an end-user having to go
through the records one by one.
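
A batch process of this kind often comes down to a single SQL statement. The following is a minimal sketch only: the customers table, its last_purchase_date column and the cut-off date are hypothetical and are not part of the laboratory schema. DATE '...' is the standard SQL date literal, accepted by Oracle and MySQL; SQL Server expects a plain quoted string such as '2010-06-01' instead.

```sql
-- Remove every customer whose most recent purchase
-- was before the cut-off date, in one operation.
Delete from customers
where last_purchase_date < DATE '2010-06-01';
```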

Slide 14 showed how the Database Management System (DBMS) sits as the
intermediary between the applications and the database itself. The DBMS is the
piece of software that handles all the interactions between applications and the
database. Paul Beynon-Davies provides a useful way of looking at the structure of
the DBMS itself.

Kernel - Central engine, which operates most of the core data management
functions such as those discussed below.

Toolkit - The tools and applications that interact with the end-users. There is a
vast range of these available now. These might be provided as part of the
DBMS product or as a separate piece of software.

Interface - Handles the interaction between the toolkit and the kernel.

The standard functions that the DBMS performs should be outlined. Most of these
will be performed by the kernel. It should be noted that some of these topics will be
covered in more detail later in the module.

CRUD - Stands for Create, Retrieve, Update and Delete. The basic interactions
between applications and the data.

Data Dictionary - The repository for the metadata should be supported. Not only
the structure of tables, but also the primary keys, relationships between tables
etc.

Transaction Management - A transaction is one or more operations that access
or make a change to the database. This must be supported by the DBMS.

Concurrency Control - The ability for many users to perform transactions at the
same time.

Recovery - In the event of a hardware or software failure, the database must be
capable of being recovered.


Slide 20:

Slides 21-23:

Authorisation - Security must be enforceable. This means being able to allocate
different roles and levels of access to users with associated user names,
passwords and privileges that give access to some, but not all areas of the
database and some, but not necessarily all, CRUD operations.

Data Communication - To be able to interact with other software, enabling
connections to other parts of an ICT system. This enables connection between
the kernel and the toolkit software.

Data Integrity - Making sure that the database data reflects accurately the
model of the world that data is being kept about. This involves the use of
integrity constraints, such as enforcing that the values of an attribute are valid
values.

Administration Utilities - Allow importing and exporting of data, monitoring of use
and monitoring of performance.
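
Transaction management, one of the functions listed above, can be demonstrated with a short sketch, assuming the workers table from the Topic 1 laboratory session and that a worker with emp_no 1 exists. The two updates are treated as a single unit of work and are then undone with ROLLBACK, so the laboratory data is left unchanged. START TRANSACTION is the standard SQL form and is accepted by MySQL (with a transactional storage engine); SQL Server uses BEGIN TRANSACTION and Oracle starts a transaction implicitly with the first statement, so the opening line may need adapting.

```sql
-- Begin a unit of work made up of two related changes.
Start transaction;
Update workers set dept_no = 2 where emp_no = 1;
Update workers set job_title = 'Accountant' where emp_no = 1;
-- Undo both changes together; COMMIT would instead make both permanent.
Rollback;
```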

The interface should provide the following languages to support DBMS functions:

Data Definition Language - Used to create structures, such as tables, and to
delete and amend existing structures.

Data Manipulation Language - Supports CRUD functions.

Data Integrity Language - Specifies constraints.

Data Control Language - Designed for database administration and
authorisation.
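
In SQL, these four languages are all parts of the one language. The sketch below shows one or more statements from each category; the projects table and the lab_user account are hypothetical names introduced only for illustration. Two assumptions should be noted: some products require the user account to exist, or to be written in a product-specific form, before GRANT will succeed, and MySQL versions before 8.0.16 accept a CHECK constraint but do not enforce it.

```sql
-- Data Definition Language: create a structure.
Create table projects
(project_no integer not null,
project_name varchar(30),
budget integer,
primary key (project_no),
-- Data Integrity Language: a constraint on the valid values of budget.
check (budget >= 0));

-- Data Manipulation Language: the CRUD operations.
Insert into projects values (1, 'Warehouse Move', 5000);
Select * from projects;
Update projects set budget = 6000 where project_no = 1;
Delete from projects where project_no = 1;

-- Data Control Language: give another user read-only access.
Grant select on projects to lab_user;
```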

These slides list the advantages and disadvantages of the DBMS. Rather than just
going through this list, the lecturer could use each slide in turn as a set of headings
from which the students should discuss what they think these points mean and how
they relate to the content of the lecture so far.
Some concepts might need to be explained at the outset:

Data redundancy - Refer back to the discussion on file-based systems where
data can be stored in more than one place. Databases overcome this problem.

Integrity - This refers to the consistency of data. This can be enforced with the
use of constraints.

Standards - Applying a set of standardised formats and procedures to an
organisation is easier when data is stored all in one place.

Concurrency - Allowing multiple accesses without creating conflicts or
inconsistent data.

A full discussion of these points can be found in Connolly and Begg, Chapter 1.

Slides 24-25:

These slides look at commercial products. The market share of the main vendors is
shown. Some points can be mentioned with regard to the three main vendors. The
students will be investigating this further as part of their private study.

Oracle - The biggest database company which, for a long time, held the largest
market share. It has a mature product and support network. It is favoured by many
professional organisations.

Microsoft SQL Server - This fully integrates with other Microsoft tools.

MySQL - This integrates well with the web-development language PHP. It is
open-source software and free versions of it are available, which goes some way
to explaining its increasing use.

Slide 26:

It should be noted at this point that the lecture has focused on the relational model.
This is the most widely used data model for databases. It is the model that will be
the focus of this module. It is not, however, the most up-to-date model, in the
sense that there are newer data models available for databases. The order of the
bullet points here is more or less a chronological one.
As part of their private study, the students will be investigating the various different
data models that have existed.

Lecturer's Notes:
Please note that the Private Study exercise for this topic requires organisation in order to ensure
suitable topic coverage (see Section 2.6 below). You should ensure that students know which topic
they have been assigned following the lecture session(s).

2.5 Laboratory Sessions

The laboratory time allocation for this topic is 1 hour.


Lecturer's Notes:
Students have copies of the laboratory exercises in the Student Guide. Answers are not provided in
their guide.
During the laboratory session, you may wish to check that students are working on their reports for
Private Study Exercise 1 (see Section 2.6 below) and that each student is aware of their assigned
topic.

2.5.1 Simple Queries


The SELECT statement is used to retrieve data from tables in the database. Whenever you want to
retrieve some data, use the SELECT keyword followed by the name of the columns you want and
the table (or tables) that those columns are in. The form of the SELECT statement is:
Select <column name>, <column name>
From <table name>
Where <Condition>
An example of a SELECT statement that gets the emp_no and the last_name from the workers table
is:
Select emp_no, last_name
From workers
Where dept_no = 1;
Try this and look at the results.

Exercise 1
Select the emp_no, first_name and last_name from the workers table.
Suggested Solution:
Select emp_no, first_name, last_name from workers;
Exercise 2
Select the emp_no, first_name and last_name from the workers table for all the workers in
department no 1.

Suggested Solution:
Select emp_no, first_name, last_name from workers
Where dept_no = 1;
Exercise 3
Select the first_name, last_name and job_title for all the managers in the workers table.
Suggested Solution:
Select first_name, last_name, job_title from workers
Where job_title = 'Manager';
Exercise 4
Select the first_name and last_name for all the workers whose first names start with the letter J.
Suggested Solution:
Select first_name, last_name from workers
where first_name like 'J%';
Exercise 5
Select all the columns from the workers table for workers over the age of 50.
Suggested Solution:
Select * from workers
Where age > 50;
Exercise 6
Select the emp_no, first_name and last_name for all the managers who are under the age of 40.
Suggested Solution:
Select emp_no, first_name, last_name
from workers
where age < 40
and job_title = 'Manager';

Exercise 7
Select the name and location of all the departments.
Suggested Solution:
Select department_name, location from departments;
Exercise 8
Select all the columns for the department located in Cairo.
Suggested Solution:
Select * from departments where location = 'Cairo';

2.5.2 Using the COUNT function with queries


The COUNT function allows us to count the rows in a table.
Example:
Select Count(emp_no)
From Workers;
Using the primary key, in this case emp_no, in the count function will give the number of
rows that match whatever criteria are used in the WHERE clause.
Exercise 9
Count the number of workers who are under the age of 30.
Suggested Solution:
Select count(emp_no) from workers where age < 30;
Exercise 10
Count the number of Managers.
Suggested Solution:
Select count(emp_no) from workers where job_title = 'Manager';
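Students may ask why the primary key is used inside COUNT. As a brief sketch, assuming the workers table from Topic 1: COUNT(*) counts every row, whereas COUNT applied to a column counts only the rows in which that column holds a value. Because dept_no in workers is allowed to be empty (null), the two queries below can give different results, which is why a column that is never empty, such as the primary key, is a safe choice.

```sql
-- Counts every row in the table.
Select count(*) from workers;

-- Counts only the rows in which dept_no holds a value.
Select count(dept_no) from workers;
```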

2.6 Private Study

The time allocation for private study in this topic is expected to be approximately 7.5 hours.
Lecturer's Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide. Students are also expected to use some of their private study time to review the
content of this topic and to conduct any extra reading required to improve their understanding.
You will need to allocate the topics to the students to ensure that there is suitable coverage of each
one. Students should write individual reports and ideally half of the class will cover each topic.

Exercise 1
Write a report on one of the following topics. Your report should be 600-900 words in length and you
should be prepared to discuss your report in the tutorial session. Your lecturer will allocate the
topics to ensure that the different content is covered evenly.

Prepare a report on three of the commercially available database management systems.
Include at least two of the main vendors (Oracle, Microsoft SQL Server, MySQL).
Your report should discuss the products and features available for each of the vendors,
their market share and type of market they are most commonly used in, and the ways in
which they differ from one another.

Prepare a report on three of the alternative data model approaches that have been used
for database systems: network, hierarchical, relational, object-oriented, deductive and
post-relational. Discuss the history, structure and uses of each of your chosen models.

These reports will be used as the basis for the classroom discussion in the tutorial session.
Suggested Answer:
The report should be about 600 to 900 words with a third given to each of the chosen systems or
data models. Students should be encouraged to carry out research online and use the results of this
to produce a document that is in their own words.

2.7 Tutorial Notes

The time allowance for tutorials in this topic is 1 hour.

Lecturer's Notes:
Students have copies of the tutorial activities in the Student Guide. Answers are not provided in their
guide.
It is envisaged that Exercise 1 should take approximately 20 minutes. Students can then spend
around 15 minutes on Exercise 2. This should be followed by a whole class discussion where you
summarise the key information about commercial implications and vendors, and the data models
used for databases. You should also allow students the opportunity to raise any queries they may
have about the content of Topic 2.

Exercise 1
Work in a small group with other students who have written a report on the same topic during
private study time.
Discuss the information you have found. You should take the opportunity to add any additional
information to your own notes.
Now prepare to present your information to students who have worked on the other report. You
should work together as a group to prepare a short (5 minutes), informal presentation which will give
the other students a summary of the main information you have found.
Exercise 2
Join together with another small group who have worked on the other report topic.
Work with your group to present your information to students from the other group. You should also
answer any questions they might have.
Now listen to their presentation and take notes.


Topic 3: Entity Relationship (ER) Modelling (I)
3.1 Learning Objectives

This topic provides an overview of Entity Relationship Modelling.

On completion of the topic, students will be able to:

Understand the goal of Entity Relationship (ER) Modelling;
Recognise different types of notation;
Understand the concepts of an entity type, relationships and attributes;
Begin to develop an approach to identifying entities.

3.2 Pedagogic Approach

Information will be transmitted to the students during the lectures. They will then practise the skills
during the tutorial and seminar sessions.

3.3 Timings

Lectures: 2 hours
Laboratory Sessions: 1 hour
Private Study: 7.5 hours
Tutorials: 1 hour

3.4 Lecture Notes

The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:

Goal of Entity Relationship Modelling
Different types of notation
Definition of key terms
Identifying entities
Multiplicity

3.4.1 Guidance on the Use of the Slides


Slides 2-3:

Scope and Learning Outcomes. It should be mentioned that this topic will be
covered over 2 lectures and the scope and learning outcomes presented here are
those that relate to this topic's lecture.

Slide 4:

Point out that the ERM technique is used by professionals within the database industry
to specify database structures. It is also used to communicate data models
to non-specialist users, such as clients sponsoring database projects.

Slides 5-7:

The students should be made aware that there are different notations available.
For this unit, the UML notation will be used, but they might come across other
notations in textbooks. The notations are more or less equivalent in terms of what
they can express.

Slides 8-9:

These slides present definitions of an Entity Type and Entity Occurrence. These are
best explored through the use of examples.

Slide 10:

The students should be encouraged to engage in this activity to think about what
entity types might exist in a library. Allow some time for them to do this either
individually or in groups.

Slides 11-12:

Allow students to make notes before going through the likely answers, to illustrate
the difference between entity type and entity occurrence.

Slide 13:

This slide explains UML standard notation for entity type. Note that attributes (more
of which below) will sometimes be listed on the diagram and this will be shown in
more detail in the second ER lecture.

Slides 14-15:

Here are definitions of relationship types and relationship occurrence. As with
entities, this is best explored through example.

Slide 16:

For these entities identified as part of a Library System, specify the relationships
between them by connecting them with a line.

Slides 17-18:

Discuss with students their own solutions, which might be different. Ask what they
understand as the nature of the relationships they have specified.

Slide 19:

Relationship Names.
Slides 20-23:

These slides present a definition of multiplicity and examples of types of multiplicity
from the Library System. Point out the meaning of each of these and indicate the
way the UML notation specifies these types of multiplicity. Note that multiplicity is
one of the ways that different notations of ER modelling differ from one another.

Slide 24:

As part of self-study, the students will be asked to specify the multiplicity of the
remaining relationships on the Library System diagram: BOOK to LOAN and LOAN to
BORROWER.

Slide 25:

This slide gives a definition of attributes.

Slide 26:

Attribute Domain - a domain is the set of allowable values for an attribute or
number of attributes. A domain therefore limits the values that an attribute can
have. For example, the domain of 'Sex' would include the values 'Male' and
'Female'. The domain of 'Fruit' would include the values 'Apple', 'Orange' etc.
Simple Attribute - composed of a single component; for example, 'Sex' is just one
value for any occurrence.
Composite Attribute - composed of more than one component. For example,
'Address' might have address lines, town, post or zip code.
Single-Valued Attribute - holds a single value for an occurrence of an entity type.
Again, 'Sex' is a good example because there will only be one value.
Multi-Valued Attribute - where there might be more than one value for a given
occurrence of an entity type, e.g. 'Telephone Number', where a person or
business might have many of these.
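If students ask how these attribute types eventually appear in a database, the sketch below shows one possible mapping. It is only an illustration looking ahead to implementation: the person and person_phone tables, all column names, the types and the sample values are assumptions and do not come from the slides.

```sql
-- Hypothetical tables; every name, type and value here is an assumption.
CREATE TABLE person (
    person_id     INTEGER PRIMARY KEY,
    -- Simple, single-valued attribute whose domain is limited by a CHECK constraint
    sex           VARCHAR(6) CHECK (sex IN ('Male', 'Female')),
    -- Composite attribute 'Address' stored as its separate components
    address_line1 VARCHAR(60),
    town          VARCHAR(40),
    post_code     VARCHAR(10)
);

-- Multi-valued attribute 'Telephone Number': one row per number held by a person
CREATE TABLE person_phone (
    person_id INTEGER     NOT NULL REFERENCES person (person_id),
    phone_no  VARCHAR(20) NOT NULL,
    PRIMARY KEY (person_id, phone_no)
);

INSERT INTO person (person_id, sex, address_line1, town, post_code)
VALUES (1, 'Female', '1 High Street', 'Leeds', 'LS1 1AA');

INSERT INTO person_phone (person_id, phone_no) VALUES (1, '0113 496 0000');
INSERT INTO person_phone (person_id, phone_no) VALUES (1, '07700 900000');

-- A DBMS that enforces CHECK constraints should reject the next statement,
-- because 'Unknown' is outside the domain defined for sex:
-- INSERT INTO person (person_id, sex) VALUES (2, 'Unknown');
```

The design choice to note is that the multi-valued attribute moves to its own table, while the composite attribute simply becomes several columns.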

Slide 27:

Identify attributes for the entity types of the library system. Allow some time for this
and discuss the findings with students.

Slides 28-30:

Identifying entities - This is best illustrated by looking at an example. Bear in mind
that the approach that is being taken is very much a top-down approach. Students
will need to practise identifying entities and there will be more examples and
discussion on this topic in the next lecture.

3.5 Laboratory Sessions

The laboratory time allocation for this topic is 1 hour.

Lecturer's Notes:
Students have copies of the laboratory exercises in the Student Guide. Answers are not provided in
their guide.

3.5.1 Selecting from More Than One Table


When we want to get information from more than one table, we use what is known as a join
between the tables. In its simplest form, this means using the WHERE clause of the SELECT
statement to test for equality between the columns that link the two tables.
For example, the workers and departments tables are joined in our data model by the fact that
dept_no is a foreign key on workers. This shows in which department each worker is. If we want to
select the department_name from departments and the first_name and last_name of the workers
from the workers table, the SELECT statement will look like this:
Select departments.department_name, workers.first_name, workers.last_name
From departments, workers
Where departments.dept_no = workers.dept_no;
We can make this a bit easier to write out by giving each of our tables an alias, usually a letter so we
do not have to keep writing the whole name for the table each time we refer to it:
Select d.department_name, w.first_name, w.last_name
From departments d, workers w
Where d.dept_no = w.dept_no;
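Students may meet the same join written with the JOIN ... ON keywords in textbooks. The sketch below shows both forms side by side on a small scratch copy of the two tables. The column types and the sample rows (apart from the Packing department in Cairo) are assumptions, and it should be run in an empty scratch schema rather than against the lab tables.

```sql
-- Hypothetical scratch copies of the two tables; types and most rows are assumptions.
CREATE TABLE departments (
    dept_no         INTEGER PRIMARY KEY,
    department_name VARCHAR(40),
    location        VARCHAR(40)
);

CREATE TABLE workers (
    emp_no     INTEGER PRIMARY KEY,
    first_name VARCHAR(30),
    last_name  VARCHAR(30),
    dept_no    INTEGER REFERENCES departments (dept_no)
);

INSERT INTO departments VALUES (1, 'Packing', 'Cairo');
INSERT INTO departments VALUES (2, 'Sales', 'London');
INSERT INTO workers VALUES (1, 'Amal', 'Said', 1);
INSERT INTO workers VALUES (2, 'Ben', 'Okoro', 1);
INSERT INTO workers VALUES (3, 'Chidi', 'Eze', 2);

-- Join condition written in the WHERE clause (the style used in this guide)
SELECT d.department_name, w.first_name, w.last_name
FROM departments d, workers w
WHERE d.dept_no = w.dept_no;

-- The same join written with JOIN ... ON; it returns the same three rows
SELECT d.department_name, w.first_name, w.last_name
FROM departments d
JOIN workers w ON d.dept_no = w.dept_no;

-- Leaving out the join condition pairs every department with every worker
SELECT COUNT(*) AS row_pairs
FROM departments d, workers w;
```

The last query returns 6 (2 departments x 3 workers), which is a useful way to show students what goes wrong when the join condition is forgotten.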
Exercise 1
Make sure you can write and run the above query using the table aliases.
Exercise 2
Try the following exercises that all use similar statements:
1. Select the department_name from departments and the first_name, last_name and job_title
from workers.
2. Select the department_name and location from departments and the first_name and
last_name from workers.
3. Select the department_name and location from departments and the first_name and
last_name from workers. Only select workers who are in the Packing department.
4. Select the department_name and location from departments, and the first_name, last_name
and job_title from workers, for just the managers that work in Cairo.
5. Select the job_title, age and location for all the workers who work in London.
6. Using the COUNT function and joining the two tables, count how many workers there are in
Lagos.
Suggested Solutions:
1. Select d.department_name, w.first_name, w.last_name, w.job_title
from departments d, workers w
where d.dept_no = w.dept_no;
2. Select d.department_name, d.location, w.first_name, w.last_name
from departments d, workers w
where d.dept_no = w.dept_no;
3. Select d.department_name, d.location, w.first_name, w.last_name
from departments d, workers w
where d.dept_no = w.dept_no
and d.department_name = 'Packing';
4. Select d.department_name, d.location, w.first_name, w.last_name, w.job_title
from departments d, workers w
where d.dept_no = w.dept_no
and w.job_title = 'Manager'
and d.location = 'Cairo';
5. Select w.job_title, w.age, d.location
from departments d, workers w
where d.dept_no = w.dept_no
and d.location = 'London';
6. Select Count(*)
from departments d, workers w
where d.dept_no = w.dept_no
and d.location = 'Lagos';
Exercise 3
The ORDER BY clause is a way of specifying the order in which you want your selected data to
appear. For example, to retrieve the emp_no and first_name of workers, we could order by emp_no
or order by first_name:
To order by emp_no:
Select emp_no, first_name
From workers
Order by emp_no;

To order by first_name:
Select emp_no, first_name
From workers
Order by first_name;
1. Select the department_name from departments, the first_name, last_name and age from
workers. Order by the age.
2. Select the first and last names of the workers who work in Cairo and order them by their age.
Suggested Solutions:
1. Select d.department_name, w.first_name, w.last_name, w.age
from departments d, workers w
where d.dept_no = w.dept_no
order by w.age;
2. Select w.first_name, w.last_name
from departments d, workers w
where d.dept_no = w.dept_no
and d.location = 'Cairo'
order by w.age;
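ORDER BY sorts in ascending order unless told otherwise. If students ask how to reverse the order or to sort on more than one column, a short sketch such as the following may help. The cut-down workers table, its types and its rows are assumptions, and it is intended for an empty scratch schema.

```sql
-- Hypothetical cut-down workers table; types and rows are assumptions.
CREATE TABLE workers (
    emp_no     INTEGER PRIMARY KEY,
    first_name VARCHAR(30),
    age        INTEGER
);

INSERT INTO workers VALUES (1, 'Chidi', 33);
INSERT INTO workers VALUES (2, 'Amal', 45);
INSERT INTO workers VALUES (3, 'Ben', 33);

-- DESC reverses the default ascending order: the oldest worker comes first
SELECT emp_no, first_name, age
FROM workers
ORDER BY age DESC;

-- Two sort columns: by age, then by first_name where ages are equal
SELECT emp_no, first_name, age
FROM workers
ORDER BY age, first_name;
```

In the second query Ben is listed before Chidi, because they share the same age and the tie is broken by first_name.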

3.6 Private Study

The time allocation for private study in this topic is expected to be 7.5 hours.
Lecturer's Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide.

Exercise 1
In the Private Study session for Topic 1, you were asked to collect paper data input forms from an
organisation, such as a library or government department or any other organisation.
Examine these forms and specify what entities and attributes might be needed in a database to
capture the data that they collect.
Bring both the paper forms and your analysis to the tutorial for discussion.
Suggested Answer:
The answer will depend on the material brought by the student. However, students should be
looking for field entries for attributes and thinking how these might be grouped under certain entities.
Entities might be identifiable as section headings within forms, such as personal information,
qualifications etc.
Exercise 2
Examine the library system diagram on Slide 24. Identify the missing multiplicities for Book to Loan
and Loan to Borrower.
Suggested Answer:
A Loan is for 1 Book and for 1 Borrower, so if someone took out more than one Book, it would be set
up as a number of different Loans. A Borrower might have zero or more Loans (0..*). A Book might
have zero or more Loans (0..*).
Exercise 3
Revise the topics studied in the lecture. Make notes on the following topics and make sure you
understand the concepts:

Entity Type
Entity Occurrence
Relationship Type
Relationship Occurrence
Attributes
Multiplicity

Suggested Answer:
The students should make some short notes on each of the topics and demonstrate their
understanding of them in the tutorial. They should be encouraged to raise aspects of the topics that
they find difficult or do not understand.

3.7 Tutorial Notes

The time allowance for tutorials in this topic is 1 hour.

Lecturer's Notes:
Students have copies of the tutorial activities in the Student Guide. Answers are not provided in their
guide.

Exercise 1: Group Discussion of Private Study Activity

In small groups, discuss your findings from Private Study Exercises 1 and 2. You should collate your
findings and report back to class.
Exercise 2
Use this time to raise questions regarding the material. In small groups, discuss the concepts listed
below and report your findings back to the class.
Do you understand the following concepts?

Entity Type
Entity Occurrence
Relationship Type
Relationship Occurrence
Attributes
Multiplicity



Topic 4: Entity Relationship (ER) Modelling (2)


4.1 Learning Objectives

This topic provides an overview of Entity Relationship Modelling.

On completion of the topic, students will be able to:

Construct an ER model from a scenario;
Understand the purpose of a primary key;
Understand the role of foreign keys;
Recognise strong and weak entities;
Identify and solve problems in ER models.

4.2 Pedagogic Approach

Information will be transmitted to the students during the lectures. They will then practise the skills
during the tutorial and seminar sessions.

4.3 Timings

Lectures: 2 hours
Laboratory Sessions: 2 hours
Private Study: 7.5 hours
Tutorials: 2 hours

4.4 Lecture Notes

The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:

Identifying entities
Primary and Foreign keys
Chasm and Fan traps

4.4.1 Guidance on the Use of the Slides


Slides 2-3:

Scope and Learning Outcomes. It should be noted that this is the second lecture
about ER modelling, which expands on topics covered in the first lecture. There is a
focus in this topic on practical application of the model to example scenarios.

Slides 4-8:

Mention that as a technique, ER modelling is best learnt by going through practical
examples. The focus here is on discovery of entities using a top-down approach
that tries to recognise entities by looking at the nouns used in a scenario along with
an understanding of what it is that a business actually does. Last week, students
were able to guess at entities in a library, because it is an organisation of which
most people have a basic understanding.
This topic's scenario is related to a gardening company. Try to recognise nouns that
could possibly be entities. Note that some nouns might not actually be entities.
Town here is not an entity, because we are only concerned with one town. In
another situation, it might become an entity. This contextual nature of data
structures is important and should be emphasised: in some situations something
will be an entity while in other situations it would not.

Slide 9:

Primary Key - Attributes were introduced in the last topic's lecture; they are qualities
of an entity. This is an example of attributes of the entity Client. Each entity will
have one or more attributes that uniquely identify an entity instance; in this case, it
is ClientNo. No two client numbers will be the same.

Slide 10:

Entities are linked by foreign keys. A foreign key is a copy of an attribute from
another entity. The attribute copied across links the two entities. It is usually (though
not always) the primary key that is copied across. In the example here, a Client has
a Preference for types of room and a maximum rent. We know whose Preference it
is, because the clientNo is a foreign key that has been copied from Client and links
the two entities.
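The Client and Preference link can be sketched in SQL as follows. This is a minimal illustration under stated assumptions: the table names, the column names other than the client number, the types and the sample values are all hypothetical, and the slide itself may use different attribute names.

```sql
-- Hypothetical tables based loosely on the Client/Preference example;
-- names, types and values are assumptions.
CREATE TABLE client (
    client_no   VARCHAR(5) PRIMARY KEY,
    client_name VARCHAR(40)
);

CREATE TABLE preference (
    -- Foreign key: a copy of the primary key of client
    client_no VARCHAR(5) NOT NULL REFERENCES client (client_no),
    room_type VARCHAR(20),
    max_rent  INTEGER
);

INSERT INTO client (client_no, client_name) VALUES ('C001', 'A. Client');
INSERT INTO preference (client_no, room_type, max_rent) VALUES ('C001', 'Flat', 425);

-- The shared client_no value tells us whose preference each row is
SELECT c.client_name, p.room_type, p.max_rent
FROM client c, preference p
WHERE c.client_no = p.client_no;

-- A DBMS that enforces foreign keys should reject a preference for a
-- client number that does not exist in client:
-- INSERT INTO preference (client_no, room_type, max_rent) VALUES ('C999', 'House', 600);
```

Note that some systems only enforce foreign keys when the feature is switched on, so the commented-out statement is a point for discussion rather than a guaranteed error.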

Slide 11:

Here is another example where there is a many-to-many relationship. Semantically
(in terms of its meaning), this represents a Module having many Students and
Students taking many Modules. When we have a situation like this, we need to
think about how to represent it in our model. Where do we put the foreign keys?
Normally, a foreign key should go at the many end. But, in this case, it would go on
both, which would lead to confusion. It is preferable to resolve many-to-many
relationships along the lines shown here.

Slide 12:

A new entity is created called Student on Module. This represents an instance of
one Student taking one Module. Therefore, the foreign keys are moved from
Module and Student to the new entity.

Slides 13-15:

These slides show how instances of data are represented on the entities with the
use of foreign keys. Note that the Primary Key of the Student on Module is the
compound primary key composed of the two attributes Module Code and Student
Code. Ask the students whether this would really provide a unique identifier for
each instance of Student on Module. What would happen in a situation where
students were allowed to resit a Module? In that case, there would be duplicate
values in the primary key on Student on Module, which is not allowed. Therefore,
we would need to introduce another attribute as part of the primary key on Student
on Module, for example a date or semester and year.
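The resit discussion can be made concrete with a short sketch of the link entity as a table. The table names, column names, types and sample rows below are assumptions for illustration only. The semester column is the extra attribute added to the compound primary key so that a second attempt at the same Module does not duplicate the key.

```sql
-- Hypothetical tables for the Student / Module example; all names,
-- types and rows are assumptions.
CREATE TABLE students (
    student_code VARCHAR(10) PRIMARY KEY,
    student_name VARCHAR(40)
);

CREATE TABLE modules (
    module_code  VARCHAR(10) PRIMARY KEY,
    module_title VARCHAR(60)
);

-- Link entity: one row per Student taking one Module in one semester
CREATE TABLE student_on_module (
    module_code  VARCHAR(10) NOT NULL REFERENCES modules (module_code),
    student_code VARCHAR(10) NOT NULL REFERENCES students (student_code),
    semester     VARCHAR(10) NOT NULL,
    PRIMARY KEY (module_code, student_code, semester)
);

INSERT INTO students VALUES ('S001', 'A. Student');
INSERT INTO modules VALUES ('DB101', 'Databases');

-- First attempt and a resit: allowed because the semester differs
INSERT INTO student_on_module VALUES ('DB101', 'S001', '2011-S1');
INSERT INTO student_on_module VALUES ('DB101', 'S001', '2011-S2');

-- Number of attempts per student and module
SELECT student_code, module_code, COUNT(*) AS attempts
FROM student_on_module
GROUP BY student_code, module_code;
```

Had the primary key been only (module_code, student_code), the second insert would have been rejected as a duplicate key, which is exactly the problem raised in the discussion.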

Slides 16-17:

These slides define strong and weak entity types.

Slide 18:

Primary keys on weak entity types will normally include the primary key of the strong
entity on which they depend (which exists there as a foreign key) as part of their own primary key.

Slides 19-20:

Fan Traps - These slides illustrate that where there is a structure similar to that in
Slide 19, there is a potential problem as shown by the example data in Slide 20. We
have a Campus entity that shows a number of Staff and a number of Departments
as belonging to a particular Campus. However, there is no link between Department
and Staff, and therefore, where we have a Campus that has more than one
Department, we will not always know the Department in which a member of Staff
works.

Slides 21-22:

The solution is to adopt a different structure where we have a Campus having one
or more Departments, which in turn have one or more Staff. It should be clear from
the example data that this solves the problem.
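The restructured Campus, Department and Staff chain can be sketched as tables to show why the ambiguity disappears. Table names, column names, types and sample rows are assumptions and are not taken from the slides.

```sql
-- Hypothetical tables for the restructured model: a campus has departments,
-- and a department has staff. All names and rows are assumptions.
CREATE TABLE campuses (
    campus_no   INTEGER PRIMARY KEY,
    campus_name VARCHAR(40)
);

CREATE TABLE departments (
    dept_no   INTEGER PRIMARY KEY,
    dept_name VARCHAR(40),
    campus_no INTEGER NOT NULL REFERENCES campuses (campus_no)
);

CREATE TABLE staff (
    staff_no   INTEGER PRIMARY KEY,
    staff_name VARCHAR(40),
    dept_no    INTEGER NOT NULL REFERENCES departments (dept_no)
);

INSERT INTO campuses VALUES (1, 'North Campus');
INSERT INTO departments VALUES (10, 'Computing', 1);
INSERT INTO departments VALUES (20, 'Business', 1);
INSERT INTO staff VALUES (100, 'A. Lecturer', 10);
INSERT INTO staff VALUES (101, 'B. Tutor', 20);

-- Each member of staff now leads to exactly one department, and through it
-- to exactly one campus, even when a campus has several departments
SELECT s.staff_name, d.dept_name, c.campus_name
FROM staff s, departments d, campuses c
WHERE s.dept_no = d.dept_no
  AND d.campus_no = c.campus_no;
```

In the fan trap version both departments and staff would carry only campus_no, so there would be no column from which to work out a person's department.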

Slides 23-24:

Chasm Traps - These occur where there are relationships between entities, but one
of the relationships is non-mandatory, i.e. there does not HAVE to be an instance of
this relationship. This is shown in the example data in Slide 23; here a Branch has
many Staff members who manage Properties, but not every Property must be
managed by someone. Therefore, because Hill House is not managed by a
member of Staff, we do not know from which Branch that Property is managed.

Slides 25-26:

The solution is to change the structure and represent both relationships. This is
illustrated by the data shown.
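One way to show the chasm trap fix in SQL is to give the property table its own foreign key to the branch, alongside the optional foreign key to staff. The sketch below is an illustration only: apart from the Hill House example, the table names, column names, types and rows are assumptions.

```sql
-- Hypothetical tables; names, types and most rows are assumptions.
CREATE TABLE branches (
    branch_no   INTEGER PRIMARY KEY,
    branch_name VARCHAR(40)
);

CREATE TABLE staff (
    staff_no   INTEGER PRIMARY KEY,
    staff_name VARCHAR(40),
    branch_no  INTEGER NOT NULL REFERENCES branches (branch_no)
);

CREATE TABLE properties (
    property_no   INTEGER PRIMARY KEY,
    property_name VARCHAR(40),
    staff_no      INTEGER REFERENCES staff (staff_no),              -- optional: may be NULL
    branch_no     INTEGER NOT NULL REFERENCES branches (branch_no)  -- the added direct link
);

INSERT INTO branches VALUES (1, 'City Branch');
INSERT INTO staff VALUES (100, 'A. Manager', 1);
INSERT INTO properties VALUES (500, 'Oak Court', 100, 1);
INSERT INTO properties VALUES (501, 'Hill House', NULL, 1);

-- Going through staff loses Hill House, because it has no member of staff
SELECT p.property_name, b.branch_name
FROM properties p, staff s, branches b
WHERE p.staff_no = s.staff_no
  AND s.branch_no = b.branch_no;

-- The direct relationship finds the branch for every property
SELECT p.property_name, b.branch_name
FROM properties p, branches b
WHERE p.branch_no = b.branch_no;
```

The first query returns only Oak Court, while the second returns both properties.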

Slides 27-29:

Students should be given a fairly generous amount of time to attempt the activity,
either individually or in pairs, during the lecture. The solution should then be
discussed and used as an example of the techniques of entity discovery, setting up
relationships and multiplicity that have been discussed over the two lectures. Note
that the self-study and tutorial exercises will involve more example ER diagrams.

4.5 Laboratory Sessions

The laboratory time allocation for this topic is 2 hours.

Lecturer's Notes:
Students have copies of the laboratory exercises in the Student Guide. Answers are not provided in
their guide.

Exercise 1: Inserting Data

The INSERT statement allows us to add new data to our tables.
The basic structure of the INSERT statement was introduced in the first laboratory session.
Insert into departments values (1,'Packing','Cairo');
There is some variation on this. For example, to insert into the table but only particular fields, we
specify the columns in the insert statement. So, if we wanted to insert a new row into departments,
but did not yet know where it would be located, we would write:
Insert into departments (dept_no, department_name) values (4, 'Marketing');
Run this insert statement.
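It can help to show students what happens to a column that is left out of the column list. The following is a small sketch on a scratch copy of the departments table. The column types are assumptions, and it should be run in an empty scratch schema rather than against the lab table.

```sql
-- Hypothetical scratch copy of departments; column types are assumptions.
CREATE TABLE departments (
    dept_no         INTEGER PRIMARY KEY,
    department_name VARCHAR(40),
    location        VARCHAR(40)
);

-- Full-row insert: one value for every column, in table order
INSERT INTO departments VALUES (1, 'Packing', 'Cairo');

-- Column-list insert: location is not mentioned, so it is stored as NULL
INSERT INTO departments (dept_no, department_name) VALUES (4, 'Marketing');

-- Show both rows; the location for Marketing is empty (NULL)
SELECT dept_no, department_name, location
FROM departments;

-- NULL is tested with IS NULL, not with = NULL
SELECT department_name
FROM departments
WHERE location IS NULL;
```

The last query returns only Marketing, which is the department whose location is not yet known.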
1. Insert another new department, Public Relations. Its dept_no will be 5. Its location is Madrid.
2. Insert another new department, Research and Development. Its dept_no is 6. Its location is
unknown at the moment.
3. A new employee has joined Research and Development. Their first_name is Jonas and their
last_name is McKenzie. They are 38 years old. Their emp_no is 8. Their job_title has not
yet been decided. Insert data into the workers table for them.

IMPORTANT: Make sure you commit your work by writing the command: commit;
Suggested Solutions:
1. Insert into departments values (5, 'Public Relations', 'Madrid');
2. Insert into departments (dept_no, department_name) values (6, 'Research and Development');
3. Insert into workers (emp_no, first_name, last_name, age, dept_no)
values (8, 'Jonas', 'McKenzie', 38, 6);

Exercise 2: Updating Data

The UPDATE statement allows us to change data that already exists.
For example, we might have a script that changes someone's age.
Update workers
Set age = age + 1
Where emp_no = 1;
This will add one to the age of worker one.
Try this Update script.
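A useful habit to pass on is checking which rows a WHERE clause selects before and after an update. The sketch below uses a cut-down scratch version of the workers table. The column types and sample rows are assumptions, and it is meant for an empty scratch schema.

```sql
-- Hypothetical cut-down workers table; types and rows are assumptions.
CREATE TABLE workers (
    emp_no     INTEGER PRIMARY KEY,
    first_name VARCHAR(30),
    age        INTEGER
);

INSERT INTO workers VALUES (1, 'Amal', 45);
INSERT INTO workers VALUES (2, 'Ben', 28);

-- Preview the row that the update will touch
SELECT emp_no, age
FROM workers
WHERE emp_no = 1;

-- Add one to the age of the worker whose emp_no is 1
UPDATE workers
SET age = age + 1
WHERE emp_no = 1;

-- Worker 1 is now 46; worker 2 is unchanged at 28
SELECT emp_no, age
FROM workers;

-- Without a WHERE clause the same statement would change every row:
-- UPDATE workers SET age = age + 1;
```

Running the SELECT with the same WHERE clause first costs little and avoids the most common mistake, which is updating more rows than intended.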
1. It has now been decided where the new Research and Development department should be
located. Write an update script that sets the location of this department to Berlin.
2. Write an update script to set the job_title of the employee McKenzie to Manager.

Suggested Solutions:
1. Update departments
set location = 'Berlin'
where dept_no = 6;
2. Update workers
set job_title = 'Manager'
where emp_no = 8;
Exercise 3: Deleting Data

The DELETE statement allows us to get rid of data from our database.
For example, if we wanted to delete all the managers from the database, we would write the
following (do NOT run this; it is an example):
Delete from workers
Where job_title = 'Manager';
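Because a DELETE cannot easily be undone once committed, it is worth showing students how to preview the rows first. The following sketch does this on a scratch table so that the real workers table is untouched. The column types and sample rows are assumptions.

```sql
-- Hypothetical cut-down workers table; types and rows are assumptions.
CREATE TABLE workers (
    emp_no     INTEGER PRIMARY KEY,
    first_name VARCHAR(30),
    job_title  VARCHAR(30)
);

INSERT INTO workers VALUES (1, 'Amal', 'Manager');
INSERT INTO workers VALUES (2, 'Ben', 'Packer');
INSERT INTO workers VALUES (3, 'Chidi', 'Manager');

-- Preview: the same WHERE clause in a SELECT shows exactly what would be deleted
SELECT emp_no, first_name
FROM workers
WHERE job_title = 'Manager';

-- Delete only the rows that matched the preview
DELETE FROM workers
WHERE job_title = 'Manager';

-- One row (Ben) remains
SELECT COUNT(*) AS remaining
FROM workers;

-- Without a WHERE clause, DELETE removes every row in the table:
-- DELETE FROM workers;
```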
1. Use the INSERT statement to insert yourself in the workers table with an emp_no of 10. Then write
a DELETE statement to get rid of yourself from the database.

Suggested Solution:
The insert statement will depend on the data but will be along the lines of those in Exercise 1 above.
Delete from workers
where emp_no = 10;

4.6 Private Study

The time allocation for private study in this topic is expected to be 7.5 hours.
Lecturer's Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide.

4.6.1 Review Material on ER Modelling


Please review all the materials for Topics 3 and 4 before going on to the exercises below. You
should make note of anything that you feel requires further clarification and bring your questions to
the tutorial for this topic.

4.6.2 Drawing ER Models


ER diagrams are one of the most important techniques used in database development. You will
need to master this technique in order to complete your assignment. Almost any work in the
database field requires an understanding and ability to construct ER diagrams.
The following is a series of short scenarios. Draw an ER diagram for each.
Exercise 1
A customer records system for a mail order beauty products company. A customer is assigned to
one and only one geographical region. A customer may be interested in a number of different
product lines. Any particular product line belongs to one product category that may contain many
product lines.
Suggested Solution:
[ER diagram not reproduced: entities CUSTOMER, REGION, CATEGORY and PRODUCT LINE, with multiplicity labels 0..* and 0..N.]

Exercise 2: A Boat Rental System

A boat is rented to a customer for a set period of time. Any damage to the boat is recorded for that
particular rental.

Suggested Solution:
[ER diagram not reproduced: entities BOAT, RENTAL, CUSTOMER and DAMAGE, with multiplicity labels 1 and 0..*.]

Exercise 3: A Personnel Database

Employees can be members of one or more than one department.


Suggested Solution:
[ER diagram not reproduced: entities EMPLOYEE, MEMBER and DEPARTMENT, with multiplicity labels 1 and 0..N.]

Exercise 4: A Database for a Private Collection of Books

Each author may have written one or more books. A book might have one or more authors. Each
book belongs to one category.
Suggested Solution:
[ER diagram not reproduced: entities BOOK TYPE, BOOK, WRITTEN BY and AUTHOR, with multiplicity labels 0..N.]

Exercise 5: A Plane Ticket System

Each ticket is for one flight and one customer. A customer may book many flights. A flight has many
customers.
Suggested Solution:
[ER diagram not reproduced: entities FLIGHT, TICKET and PASSENGER, with multiplicity labels 1 and 0..N.]

Exercise 6: A Film Rental Shop

The shop needs to keep track of rentals. A member can rent films. A film can be rented by many
members. A film can be rented by the same member more than once.
Suggested Solution:
[ER diagram not reproduced: entities FILM, RENTAL and CUSTOMER, with multiplicity labels 0..N.]

4.7 Tutorial Notes

The time allowance for tutorials in this topic is 2 hours.

Lecturer's Notes:
Students have copies of the tutorial activities in the Student Guide. Answers are not provided in their
guide.
Before eliciting questions in Exercise 1 below, you may want to elicit from students what they
believe are the key things to understand in terms of ER Modelling in order to revise the topic.
After Exercise 2, run a whole-class feedback session to check students' answers and deal with any
problems/questions.

Exercise 1: Review of ER Modelling

Ask your tutor any questions you have with regard to Topics 3 and 4 on Entity Relationship Modelling.
Exercise 2: Review of Private Study Exercises

Work in a small group and review your answers to Private Study Exercises 1-6. Discuss the
decisions you took in drawing each ER diagram.
Exercise 3
Answer the following questions in your own words:
a. Give an explanation of a fan trap using examples.
b. Give an explanation of a chasm trap using examples.
c. Why is it important to resolve many-to-many relationships into one-to-many relationships?
d. Give examples of entities, from any system or example system, which represent the
following: people, events, concepts, physical objects.
e. Define the following: simple attribute, composite attribute and single-valued attribute.
Suggested Answers:
a. A fan trap is where two or more one-to-many relationships fan out from a central entity, so the link
between the entities at the many ends becomes unclear. For example, if we had a central
entity of Branch (of an organisation) and two entities fanning out from it: Staff and
Department, we cannot clearly link Staff with Department. The students should provide a
suitable example similar to this.
b. A chasm trap is a data model that has relationships where there is a non-mandatory link
between a parent and a child entity type. An example of this (derived from Connolly and
Begg) is where there is a Branch entity that has a one-to-many relationship with a Staff
entity. The Staff entity has a one-to-many relationship with a PropertyForRent entity. If the
relationship between Staff and PropertyForRent is non-mandatory, then the information as to
which Branch the PropertyForRent rows are related to could be lost. The students should
provide a suitable example similar to this.
c. It is important to do this because a foreign key is placed at the many end of a relationship. In a
many-to-many relationship both ends are 'many', so there would be foreign keys at both ends, each
relating to the other in ways that make it unclear what the actual relationship is.
d. An example of a person would be a student entity type.
An example of an event might be an appointment or lecture entity type.
An example of a concept might be a course entity type.
An example of a physical object might be a book entity type.
Other examples are, of course, acceptable.
e. Simple Attribute - composed of a single component; for example, 'Sex' is just one value for
any occurrence.
Composite Attribute - composed of more than one component. For example, 'Address' might
have address lines, town, post or zip code.
Single-Valued Attribute - holds a single value for an occurrence of an entity type. Again, 'Sex'
is a good example, because there will only be one value.


Topic 5: The Relational Model (I)
5.1 Learning Objectives

This topic provides an overview of the relational model.

On completion of the topic, students will be able to:

Understand the aims of the relational model;
Understand the basic concepts of the relational model;
Define key terms of the relational model.

5.2 Pedagogic Approach

Information will be transmitted to the students during the lectures. They will then practise the skills
during the tutorial and seminar sessions.

5.3 Timings

Lectures: 2 hours
Laboratory Sessions: 2 hours
Private Study: 7.5 hours
Tutorials: 2 hours


5.4 Lecture Notes

The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:

Aims of the relational model
Basic concepts of the relational model
Terminology

5.4.1 Guidance on the Use of the Slides


Slides 2-3:

Note that this is the first of two lectures on the relational model. This first topic will
serve as a background to the relational model and go through some of the
fundamental terms. The second topic will focus more on the process of
normalisation.

Slide 4:

The relational model is so named because it is built around relations. It is based on
mathematical set theory.

Slide 5:

Point out that the relations in the model are NOT the same as the relationships in
ER modelling; relations should be thought of as tables. This distinction can be a
cause of confusion for students.

Slide 6:

The advantages that databases have over previous file processing systems,
as discussed in an earlier lecture, were only fully realised with the coming of the
relational model. The three aspects mentioned here are key to realising those
advantages.

Slide 7:

Data independence means that access to data moves from being the realm of the
programmer to that of, ideally, the end user. The internal storage structure of the
data does not need to concern someone who wants to access the data. All they
need to know about is the structure of the relations (or tables), and the attributes (or
columns). Previously, in a language like COBOL for example, a programmer was
needed to take account of the file structure; this made accessing and changing
data much more difficult.

Slide 8:

The process of normalisation is about structuring the data so as to minimise
redundancy and duplication. Ideally, an item of data should be stored only in one
place. In practice, there is some duplication due to the use of foreign keys (of which
more later) and approaches taken to ensure good performance.

Slide 9:

Set-oriented data manipulation languages - From the mathematics of set theory,
the relational model has used languages like relational algebra and relational
calculus to express data manipulation. These are beyond the scope of this module.
However, SQL (Structured Query Language) is ultimately based on these languages
and the underlying set theory.

Slide 10:

System R was developed at IBM's San Jose laboratory and involved some of the
key people in the early development of databases, such as Codd and Boyce. They
used to play something called the 'Query Game' to work out how to express queries
as simply as possible. This led to the development of SQL. There were also
commercial implementations of System R and commercial spin-offs. Other aspects
that were investigated during the System R project include: transaction
management, concurrency control, recovery techniques, query optimisation, data
security, data integrity, human factors and user interfaces.
Slide 11:

INGRES stands for Interactive Graphics and Retrieval System. It was used to
investigate the concepts of the relational model. From the initial investigation, the
INGRES project was used to develop the commercial product Ingres.

Slide 12:

The Peterlee Relational Test Vehicle was the first relational database to be able to
handle large volumes of data in terms of both rows and columns.

Slide 13:

The relational model has its own set of terms. These are often unfamiliar names for
concepts and structures with which we are familiar.

Relation - The logical structure of the database is perceived as a series of
tables with rows and columns. These are known as relations. It is worth noting
that this has no implication for how the database is stored physically.
Relations can be thought of as tables with certain properties. These properties
are discussed later.

Attribute - The columns in a relation are known as attributes.

Domain - The set of allowable values of an attribute. The domain 'primary
colours for screen displays' could be 'Red', 'Green' and 'Blue'. An attribute that
drew its values from this domain could only have one of these values or
(possibly) have a NULL value.

Null - A value that is currently unknown. It IS NOT zero or blank. As it is
unknown, the statement 'NULL is equal to NULL' is not true.
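Lecturers who want to demonstrate the behaviour of NULL live may find a short script useful. The following is a minimal sketch, assuming Python's built-in sqlite3 module rather than the database product used in the laboratory sessions; other products should behave the same way for these comparisons, but this has not been checked against each one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# NULL = NULL evaluates to NULL (unknown); sqlite3 returns this as None.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# IS NULL is the test for a missing value; it yields 1 (true).
null_is_null = conn.execute("SELECT NULL IS NULL").fetchone()[0]

# NULL is neither zero nor an empty (blank) string.
zero_is_null = conn.execute("SELECT 0 IS NULL").fetchone()[0]
blank_is_null = conn.execute("SELECT '' IS NULL").fetchone()[0]

print(null_equals_null, null_is_null, zero_is_null, blank_is_null)
```

The comparison with = never yields true when either side is NULL, which is why queries test for missing values with IS NULL instead.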

Slide 14:

This slide illustrates attributes, domains and a relation.

Slide 15:

Tuple - A tuple is equivalent to a row in a table.

Degree - The degree of a relation is the number of attributes it has.

Cardinality - The cardinality of a relation is the number of tuples it contains, or the
number of rows in a database table.

Relational Database - In the terminology of the relational model, as opposed to that
of the business world where it might refer to a type of product, a relational database
can be thought of as a collection of relations that are normalised and have unique
names.

Relational Schema - A named relation defined by a set of attribute and domain
name pairs. This recognises that each attribute has its value drawn from a
particular domain.

Slide 16:

This slide illustrates tuples, degree and cardinality.
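Degree and cardinality can also be read directly from a table in a running database. The sketch below assumes Python's built-in sqlite3 module and a departments table like the one from Topic 1; the two rows of data are invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE departments ("
    " dept_no INTEGER NOT NULL PRIMARY KEY,"
    " department_name VARCHAR(30),"
    " location VARCHAR(30))"
)
# Hypothetical sample rows, not taken from the course data.
conn.executemany(
    "INSERT INTO departments VALUES (?, ?, ?)",
    [(1, "Sales", "London"), (2, "Accounts", "Leeds")],
)

cursor = conn.execute("SELECT * FROM departments")
degree = len(cursor.description)      # number of attributes (columns)
cardinality = len(cursor.fetchall())  # number of tuples (rows)
print(degree, cardinality)
```

Adding a row changes the cardinality but not the degree; adding a column with ALTER TABLE changes the degree but not the cardinality.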

Slide 17:

It should be pointed out that the sets of terms that have been used also have a
counterpart in a third set of terms (Alternative 2) that are remnants of older file
processing and pre-relational database technologies.


Slides 18-20:

Properties of a relation should be discussed and the example table in Slide 19
shown. Students should be invited to discuss among themselves whether this
meets the criteria.

Slide 21:

This shows the same data rearranged as a relation. Note that there is a lot of
repetition here; for example the name, address and course. Note also that where an
address is not known, there is no data and this column is NULL.

Slide 22:

In order to overcome the problem of repetition, the relation is split into three. This
should reduce repetition to a minimum. Only certain attributes are
repeated and these are foreign keys that link the data in one relation with the
data in another. Understanding this data should be intuitive. So if asked 'What
modules is Guy Smith taking and what are their names?' the students should all be
able to answer this by reading the data from the three different tables.

Slide 23:

Primary Keys are the attributes that uniquely identify a row in a table (a tuple in a
relation).

Slide 24:

Foreign Keys are an attribute or set of attributes within one relation that matches
an attribute (usually the primary key but not always) in another relation. A foreign
key in a relation can also match an attribute in the same relation. If the attribute that it
matches is not a primary key, then it will be what is known as a candidate key (see
below). Foreign keys are the way in which relationships (of the sort that exist in ER
diagrams) are represented in the relational model and in implemented databases.

Slide 25:

The process that has been shown here gives a flavour of what is known as
normalisation.

Slide 26:

This slide looks at the different types of keys:

Primary Key and Foreign Key have been discussed - elicit from students what
they are in order to check their understanding.

Super Key - An attribute or set of attributes that uniquely identifies a tuple.

Candidate Key - A candidate key is a super key in which ALL the attributes are
necessary to uniquely identify a tuple. No smaller subset of the attributes that
make up the key should itself qualify as a super key.

The primary key will be chosen from the candidate keys.
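The difference between a super key and a candidate key can be made concrete with a small script. This is a sketch only: the function names and the sample rows are invented for illustration, and a check against sample data can only suggest keys. Whether an attribute is truly unique depends on the meaning of the data, not on the rows that happen to be present.

```python
from itertools import combinations

# Hypothetical sample tuples; note that two students share a name.
rows = [
    {"student_id": "S1", "email": "guy@example.com", "name": "Guy Smith"},
    {"student_id": "S2", "email": "ann@example.com", "name": "Ann Jones"},
    {"student_id": "S3", "email": "guy2@example.com", "name": "Guy Smith"},
]


def is_superkey(rows, attrs):
    """True if no two rows share the same values for attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows, attributes):
    """Return the minimal super keys for this sample of rows."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A set containing a smaller key is a super key but not minimal.
            if any(set(key) <= set(attrs) for key in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


print(is_superkey(rows, ("student_id", "name")))
print(candidate_keys(rows, ["student_id", "email", "name"]))
```

Here (student_id, name) is a super key but not a candidate key, because student_id alone is already unique. Both student_id and email come out as candidate keys, and either could be chosen as the primary key.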


5.5

Laboratory Sessions

The laboratory time allocation for this topic is 2 hours.


Lecturers' Notes:
Students have copies of the laboratory exercises in the Student Guide. Answers are not provided in
their guide.

5.5.1 More on Creating and Modifying Tables


The basic format for creating tables was shown in Topic 1:
Create table departments
(dept_no integer not null,
department_name varchar(30),
location varchar(30),
primary key (dept_no));
Be aware of variations depending on which database product you use. The variations are usually to
do with where you specify primary keys and foreign keys.
Create table departments
(dept_no integer not null primary key,
department_name varchar(30),
location varchar(30));
In the exercises for this laboratory session, we will look at creating a table from another table and
modifying an already existing table.
If we wanted to create a table which stored the unique names for the different types of jobs that are
available in our firm, we could create it in the conventional way. However, since all the values we
want already exist in the workers table, it would be easier to create the table with the data from the
existing workers table. This is done like this:
Create table job_type as
(Select distinct job_title from workers);
Exercise 1
Run the above script and use a SELECT statement to look at the results.
Suggested Solution:
Select * from job_type;


Exercise 2
Our new table only has one attribute and no primary key. Therefore, we should modify this with the
ALTER TABLE statement as follows:
Add a column for the salary of that job_title:
ALTER TABLE job_type
Add salary FLOAT;
Note that Float designates the floating point data type. It is also known as REAL.
Run this script to alter the table.
Exercise 3
We now need to add the primary key for this table. The primary key will be the job_title field.
ALTER TABLE job_type
ADD PRIMARY KEY (job_title);
Run this script.
Exercise 4
We must now enforce the fact that job_title in the workers table is now a foreign key to the job_type
table. We do this in a similar way using the ALTER table statement.
ALTER TABLE workers
ADD FOREIGN KEY (job_title) REFERENCES job_type(job_title);
Run this script.
Be aware that different vendors' versions of SQL may implement these constraints slightly
differently.
Exercise 5
You will notice that the salary field is blank. Update the job_type table to set the salaries as follows:

Managers earn 30000
Packers earn 20000
Admin Assistants earn 15000
Accountants earn 28000

Suggested Solution:
Update job_type
Set salary = 30000
Where job_title = 'Manager';
The other updates will be similar.

Exercise 6
You should now be confident enough to be able to create tables of your own design.
1. Design a table that keeps your personal details. This should include your name, address and
date of birth. Create this table using SQL with an appropriate primary key.

2. Design a table that keeps a list of your qualifications. This will have a foreign key to the table
with your personal details. Create this table using SQL with the appropriate primary and
foreign keys. You should include information about the name of the qualification, the level of
the qualification (e.g. Level 4), the name of the institution the qualification was taken at and
the final grade.

Suggested Solution:
This will depend on the student, but an example is given below:
Create table Personal_details
(personal_no integer not null,
first_name varchar(30),
second_name varchar(30),
primary key (personal_no));
Create table qualifications
(qual_no integer not null,
qual_name varchar(30),
qual_level integer,
institution varchar(30),
grade varchar(30),
personal_no integer,
primary key (qual_no),
foreign key (personal_no) references Personal_details (personal_no));


5.6

Private Study

The time allocation for private study in this topic is expected to be 7.5 hours.
Lecturers' Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide.

Exercise 1
Look at the following table of data from a hair-care product supplier:
Customer ID   Customer Name    Customer Products      Product Prices
C1            Manjeet Islam    Hair dryer             $35
                               Shampoo                $7
                               Specialist comb set    $8
C4            Tolu Amusia      Hair net
C2            Sid James        Specialist comb set    $7
                               Hair dryer             $35
C6            Ambereen Reeza   Clippers

1. Identify why you think this table is not a relation.
2. Is the price of the hair net the same as the price of the clippers?
3. What are the prices of the hair net and the clippers?
4. Should the row for customer C2 be put before that of C4?
5. Redraw the table as a single table so that it qualifies as a relation.

Suggested Answers:
1. There are repeating groups of values in individual cells. There is also no name.
2. No, we cannot tell.
3. Their values are Null.
4. No, the order of the tuples/rows is not important in a relation.
5. An example is given below:


Name: Customers
Customer ID   Customer Name    Customer Products      Product Prices
C1            Manjeet Islam    Hair dryer             $35
C1            Manjeet Islam    Shampoo                $7
C1            Manjeet Islam    Specialist Comb set    $8
C4            Tolu Amusia      Hair net
C2            Sid James        Specialist Comb set    $7
C2            Sid James        Hair dryer             $35
C6            Ambereen Reeza   Clippers

Exercise 2
Look at the single table you produced for Question 5 of Exercise 1 above, where you were asked to
redraw the table so that it qualifies as a relation. There will still be a number of problems with it.
What issues are there with duplication and the primary key?
Suggested Answer:
There is a lot of duplication now with customer information repeated and price information repeated.
Customer ID cannot be the primary key as it is duplicated across different rows and a primary key
must uniquely identify each row.
Exercise 3
Redraw the single table as three separate tables that have less duplication. You should be guided in
this by the example shown in the lecture for this topic.
Suggested Solution:
Name: Customers
Customer ID   Customer Name
C1            Manjeet Islam
C4            Tolu Amusia
C2            Sid James
C6            Ambereen Reeza

Name: Products
Product               Price
Hair dryer            $35
Shampoo               $7
Specialist Comb set   $8
Hair net
Clippers

Name: Customer Products
Customer ID   Product
C1            Hair dryer
C1            Shampoo
C1            Specialist Comb set
C4            Hair net
C2            Specialist Comb set
C2            Hair dryer
C6            Clippers

Note that Products might be given an ID and Customer Products would have the appropriate FK.
Exercise 4
Identify the primary and foreign keys for each of your new relations.


Suggested Answers:
Customers: the PK is Customer ID.
Products: the PK is Product, or possibly a Product ID.
Customer Products: the PK is both columns together. Each column is also an FK to one of the other tables.
Exercise 5
Review the content of this topic and conduct any further reading you need to undertake in order to
ensure that you understand the material. You should make note of anything that you still feel
requires further clarification and bring your questions to the tutorial for this topic.


5.7

Tutorial Notes

The time allowance for tutorials in this topic is 2 hours.


Lecturers' Notes:
Students have copies of the tutorial activities in the Student Guide. Answers are not provided in their
guide.
Students can work in small groups to complete the exercises. You should then run a whole class
feedback session to discuss the students' answers.

Exercise 1:

Group Discussion of Private Study Activity

In small groups, discuss your findings to Private Study Exercises 1-4. Your tutor will then lead a
class feedback session, during which you can also raise any questions you have about the material
covered in this topic.
Exercise 2:

Questions

Answer these questions in your own words.


a. What is meant by the concept of data independence?

b. What was System R and what was its importance in the development of the relational
model?

c. What is meant by the term NULL and why can't a primary key contain a NULL value?

d. What are the properties of a relation?

e. What is the purpose of foreign keys in a relational database?

f. Look at the following tables that form part of a database from a library system:

Book: BookID, BookName, AuthorID, BookTypeCode, ISBN
BookType: BookTypeCode, BookTypeDescription
Author: AuthorID, AuthorName, NationalityCode
Country: NationalityCode, CountryName
Borrower: BorrowerID, BorrowerName
Loan: BorrowerID, BookID, LoanStartDate, LoanEndDate

Identify the primary and foreign keys in the above schema.
Suggested Answers:
a. Data independence means that the internal storage structure of the data does not need to
concern someone who wants to access the data. All they need to know about is the structure
of the relations (or tables) and the attributes (or columns). Previously, in a language like
COBOL for example, a programmer was needed to take account of the file structure; this
made accessing and changing data much more difficult.
b. System R was an early relational database developed at IBM's San Jose laboratory and
involved some of the key people in the early development of databases, such as Codd and
Boyce.
It was most important as a testing ground for relational concepts and for leading to the
development of SQL.
There were also commercial implementations of System R and commercial spin-offs. Other
aspects that were investigated during the System R project include: transaction
management, concurrency control, recovery techniques, query optimisation, data security,
data integrity, human factors and user interfaces.
c. A NULL is an unknown value. It is worth re-emphasising that a NULL is NOT a blank or a
zero; it is unknown and could potentially have a value. Because of this, it cannot be part of a
primary key, as it could potentially be the same as the value of an already existing primary key.
Thus the primary key would not be unique.
d. This was specified on Slide 18 of the Topic 5 lecture slides.

It has a name which is unique within the relational schema.
Each cell of a relation contains exactly one value.
Each attribute has a name.
Each tuple is unique.
The order of the attributes is insignificant.
The order of tuples is insignificant.

e. Foreign keys are the links between relations. A foreign key must have the value of an
attribute that exists in another table (or must be null). The foreign key must relate to a
candidate key in its parent table. This is usually, but not always, the primary key.


f.

Book: BookID (PK), BookName, AuthorID (FK), BookTypeCode (FK), ISBN
BookType: BookTypeCode (PK), BookTypeDescription
Author: AuthorID (PK), AuthorName, NationalityCode (FK)
Country: NationalityCode (PK), CountryName
Borrower: BorrowerID (PK), BorrowerName
Loan: BorrowerID (PK) (FK), BookID (PK) (FK), LoanStartDate (PK), LoanEndDate


Topic 6
Topic 6: The Relational Model (2)
6.1

Learning Objectives

This topic provides an overview of further aspects of the relational model.


On completion of the topic, students will be able to:

Describe the types of relational integrity;
Describe the concept of functional dependency;
Recognise anomalies in relations;
Normalise a paper-type form.

6.2

Pedagogic Approach

Information will be transmitted to the students during the lectures. They will then practise the skills
during the tutorial and seminar sessions.

6.3

Timings

Lectures: 2 hours
Laboratory Sessions: 2 hours
Private Study: 7.5 hours
Tutorials: 2 hours


6.4

Lecture Notes

The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:

Relational integrity
Normalisation
Anomalies
Functional dependency
The process of normalisation

6.4.1 Guidance on the Use of the Slides


Slides 2-3:

The focus of this topic will be on normalisation as a way of producing a set of
relations in a desired state that will minimise repetition.

Slide 4:

Relational integrity refers to the different rules that exist within the model to make
sure that it is made of relations.
The concept of nulls was introduced in the last session, but it would be useful to
recap at this point as it is an important concept with regard to integrity rules. Nulls
represent values of an attribute that are unknown. Note that this does NOT mean
blank or zero. Since null means unknown, it is NOT possible to say that an attribute
with a value of null is equal to another attribute with a value of null.
Entity integrity - This rule is about making sure that each tuple (or row) in a relation
is unique. The rule states that no attribute of a primary key can be null.
Activity: Ask the students why an attribute that is a primary key (or part of a
primary key) cannot be null. Why would this potentially violate uniqueness?
Answer: A null value, being unknown, might be the same as the value in the
primary key of another tuple.
Referential integrity - If a foreign key exists in a relation, it must match a candidate
key in its home relation or must be null.
General constraints - Any additional rules that are set up at the request of the users
in order to satisfy their requirements. For example, in a database of voters in an
election, a rule could be set up that says all voters must be over a certain age.
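The three kinds of rule can be shown being enforced by a database. The following is a sketch under stated assumptions: it uses Python's built-in sqlite3 module, the student and course tables and the minimum age of 18 are invented for illustration, and the PRAGMA line is specific to SQLite, which does not enforce foreign keys unless asked to. Other products enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys once this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE course ("
    " course_id VARCHAR(30) NOT NULL PRIMARY KEY,"
    " course_name VARCHAR(30))"
)
conn.execute(
    "CREATE TABLE student ("
    " student_id VARCHAR(30) NOT NULL PRIMARY KEY,"          # entity integrity
    " age INTEGER CHECK (age >= 18),"                         # general constraint
    " course_id VARCHAR(30) REFERENCES course (course_id))"   # referential integrity
)
conn.execute("INSERT INTO course VALUES ('DB001', 'Databases')")


def try_insert(student_id, age, course_id):
    """Return 'ok', or the database's error message if the row is rejected."""
    try:
        conn.execute(
            "INSERT INTO student VALUES (?, ?, ?)", (student_id, age, course_id)
        )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)


results = [
    try_insert("NCC001", 18, "DB001"),  # valid row
    try_insert(None, 20, "DB001"),      # null primary key: entity integrity
    try_insert("NCC002", 20, "XX999"),  # no such course: referential integrity
    try_insert("NCC003", 20, None),     # a null foreign key is allowed
    try_insert("NCC004", 12, "DB001"),  # under age: general constraint
]
for message in results:
    print(message)
```

Only the first and fourth rows are accepted. The exact wording of the error messages differs between products and versions.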

Slide 5:

Recap the properties of a relation from Topic 5.

Slides 6-8:

Outline the concept of functional dependence as being a key technique that is
needed to be able to go through a formal process of normalisation.

Slides 9-11:

The students should be asked to identify the candidate key. Remember the formal
definition of a candidate key is that firstly it is a superkey (an attribute or set of
attributes that uniquely identifies a tuple). Secondly, it is a superkey such that no
part of it could be a superkey on its own. Therefore it is the minimum set of
attributes that could uniquely identify a tuple. It is called a candidate key because
it is a candidate to become a primary key.
In this example, the functional dependencies can be examined to help the student
identify a candidate key. If student ID is known, does that mean that the other two
attributes are known? The answer is no. If activity is known then the fee would be
known, but not the student ID. Knowing the fee tells nothing about the other
attributes, so there are no functional dependencies there. What is clear is that the
candidate key is a combination of attributes: student ID and activity.
However, if this does not feel right, semantically speaking, this is because this
relation is not fully normalised; there are anomalies.
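The reasoning about which attributes determine which can be checked mechanically against sample data. The sketch below is illustrative only: the function name holds and the rows are invented (the fees in particular are not taken from the slides), and a dependency that holds in a sample is only evidence, not proof, that it holds in general.

```python
# Hypothetical tuples for the student activity relation.
rows = [
    {"student_id": 9901, "activity": "Skiing", "fee": 200},
    {"student_id": 9902, "activity": "Swimming", "fee": 50},
    {"student_id": 9902, "activity": "Squash", "fee": 50},
    {"student_id": 9903, "activity": "Swimming", "fee": 50},
]


def holds(rows, determinant, dependent):
    """True if each determinant value maps to exactly one dependent value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # setdefault stores the first value seen and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True


print(holds(rows, ["student_id"], ["activity"]))         # student ID alone
print(holds(rows, ["activity"], ["fee"]))                # activity determines fee
print(holds(rows, ["fee"], ["activity"]))                # fee determines nothing
print(holds(rows, ["student_id", "activity"], ["fee"]))  # the combined key
```

With this data, student ID alone does not determine activity, activity does determine fee, and the fee does not determine activity. The combination of student ID and activity determines everything else, which is what makes it the candidate key.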
Slide 12:

Activity: The students should be asked to think about loss of information if a tuple
is deleted from this relation.
Answer: We would lose the price of Skiing as well as the fact that student 9901
was taking skiing. This deletion anomaly is one of the anomalies that occur when
relations are not fully normalised.
There are also insert and update anomalies.
If we want to record a new activity, but no one has yet taken it, we cannot do so; we
need a student ID because the student ID is part of the primary key and therefore
cannot be null. This is an insert anomaly.
If we wanted to change the cost of swimming to 75, we would have to do it for
every tuple where someone was taking swimming. In a relational database, doing
such repetitive updates should not be necessary. It points to an update anomaly.
The problem with this relation is that it contains details about two separate facts:
who is taking an activity and how much the activity costs. The solution is to split the
relation.
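All three anomalies can be reproduced on a single unnormalised table. This is a sketch assuming Python's built-in sqlite3 module; the table name, the extra students, the Tennis activity and all fees other than the new swimming price are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student_activity ("
    " student_id INTEGER NOT NULL,"
    " activity VARCHAR(30) NOT NULL,"
    " fee INTEGER,"
    " PRIMARY KEY (student_id, activity))"
)
conn.executemany(
    "INSERT INTO student_activity VALUES (?, ?, ?)",
    [(9901, "Skiing", 200), (9902, "Swimming", 50), (9904, "Swimming", 50)],
)

# Update anomaly: one price change has to touch every swimming row.
rows_updated = conn.execute(
    "UPDATE student_activity SET fee = 75 WHERE activity = 'Swimming'"
).rowcount

# Insert anomaly: an activity with no student needs a NULL in the primary key.
try:
    conn.execute("INSERT INTO student_activity VALUES (NULL, 'Tennis', 60)")
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# Deletion anomaly: removing the only skier also removes the skiing fee.
conn.execute("DELETE FROM student_activity WHERE student_id = 9901")
skiing_fee = conn.execute(
    "SELECT fee FROM student_activity WHERE activity = 'Skiing'"
).fetchone()

print(rows_updated, insert_rejected, skiing_fee)
```

The update changes two rows for one fact, the new activity cannot be recorded, and after the delete no row remains from which the skiing fee could be read.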

Slide 13:

The anomalies in the previous section should be revisited. They have been
overcome. This is because the relations are now normalised.
The functional dependencies here are much clearer. In student activity, the two
attributes are self contained; since a student may take many activities they both
must be part of the primary key. In activity cost, there is a functional dependency
whereby activity determines cost. If we know the activity, we know the cost. The
attribute activity will be the primary key.
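The effect of the split can be checked with the same three operations: recording an activity nobody takes, changing a fee, and deleting a student. The sketch assumes Python's built-in sqlite3 module; the table names follow the description above, while the data values and the Tennis activity are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific; off by default
conn.execute(
    "CREATE TABLE activity_cost ("
    " activity VARCHAR(30) NOT NULL PRIMARY KEY,"
    " fee INTEGER)"
)
conn.execute(
    "CREATE TABLE student_activity ("
    " student_id INTEGER NOT NULL,"
    " activity VARCHAR(30) NOT NULL REFERENCES activity_cost (activity),"
    " PRIMARY KEY (student_id, activity))"
)
conn.executemany(
    "INSERT INTO activity_cost VALUES (?, ?)",
    [("Skiing", 200), ("Swimming", 50)],
)
conn.executemany(
    "INSERT INTO student_activity VALUES (?, ?)",
    [(9901, "Skiing"), (9902, "Swimming"), (9904, "Swimming")],
)

# Insert: a new activity can be recorded before any student takes it.
conn.execute("INSERT INTO activity_cost VALUES ('Tennis', 60)")

# Update: the swimming fee is stored once, so one row changes.
rows_updated = conn.execute(
    "UPDATE activity_cost SET fee = 75 WHERE activity = 'Swimming'"
).rowcount

# Delete: removing the only skier leaves the skiing fee in place.
conn.execute("DELETE FROM student_activity WHERE student_id = 9901")
skiing_fee = conn.execute(
    "SELECT fee FROM activity_cost WHERE activity = 'Skiing'"
).fetchone()[0]

print(rows_updated, skiing_fee)
```

Each fact is now stored in exactly one place, so none of the three operations has side effects on unrelated information.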

Slide 14:

There is a more formal process of normalisation. What follows is one example of it.
There are other approaches, although the rules for each normal form are the same.
The starting point is a paper document of the sort that is used to store data in a
manual system.
It is worth noting a number of features of this document. It is a document containing
data about students and the modules they take. There is also information about the
results. Note that each result code has a corresponding description, so that 'P'
means Pass and 'RE' means Refer Exam etc.
Note also that this is information for one student and that for that one student, as for
all the other students in the system, there is information about more than one
module. In this example, there are seven rows of data about modules. These rows
are called, in the language of normalisation, repeating groups. For each student
there are repeating groups of information about modules, results etc.
Slide 15:

The first step is to identify which attributes belong to the repeating group. The
attributes are listed as shown. Those attributes where there is one occurrence are
annotated with a 1. Those attributes where there is a repeating group are
annotated with a 2. The 2 in this case simply means more than one. The
tentative primary key is also underlined. In this case it is student number.

Slide 16:

First Normal Form. In the column marked 1NF (for first normal form) the repeating
group information is separated out. Note the important step also: the student
number which is the primary key of the upper set of attributes, is also copied down
to become a foreign key in the lower block. Foreign keys are identified by a star.
This maintains the link between the student information and the module information.
The repeating group data is module information for a particular student. The
primary key of the lower block of attributes (the repeating group) is also identified as
being the combination of student number and module code: once again this makes
sense semantically, because this is data about modules for a particular student.
It is worth stressing that this step, identifying the primary key of the repeating group
block, is often a cause for confusion. If this is not done properly, then the remaining
steps will go wrong.

Slide 17:

For second normal form, we need to examine the functional dependencies that
exist when there is a primary key that is made up of more than one attribute.
Activity: Which block of data do we have to examine now?
Answer: The repeating group block, because it has a primary key made up of more
than one attribute.
Examine each non-key attribute in the relation and see if it is fully dependent on the
WHOLE primary key.
Examining each of the attributes in turn:

Result: Yes. It is dependent on both student number and module code,
because a result is for a particular student on a particular module.

Result Code: Yes. The same as for result.

Grade Point: Yes. A grade is given for a particular student for a particular
module.

No of Credits: No. This attribute records the number of credits a module is
worth. It has nothing to do with the student and is dependent on just the
module code.

Module Title: No. Module title is dependent just on the module code.

The next step is that where we have identified attributes that are only dependent on
one part of the primary key, we separate them out, taking with them a copy of the
part of the primary key on which they are dependent. That part also stays behind in
the original block, where it becomes a foreign key. In this case, we have separated
out the module information, and module code becomes the primary key of the module
information. Module code is left behind as a foreign key.
Slide 18:

Third Normal Form. The process now looks for any functional dependencies in a
relation that are not on the primary key. Go through each attribute in turn to see if it is
dependent on the primary key directly. There are two examples where this is not so
here. Course title is dependent on the course code, so while it is true that there is a
dependence on student number (in the sense that if I know a student number, I
know the course title) this dependency is what is known as transitive, i.e. through
another attribute; in this case, the course code.
The course code is separated out with the course title. Course code is also left as a
foreign key in the student block.
The second example is the result code and result. Result is transitively dependent
on the student number/module code primary key and so is separated out.

Slides 19-20:

What we are left with at third normal form is different blocks of attributes that
correspond to entities. We can now work from the bottom up and draw our entity
diagram since we know a foreign key means the many end of a relationship. As
part of private study, students will draw the full ER diagram for this set of attributes.


6.5

Laboratory Sessions

The laboratory time allocation for this topic is 2 hours.


Lecturers' Notes:
Students have copies of the laboratory exercises in the Student Guide. Answers are not provided in
their guide.
It is suggested that this session be dedicated to ensuring that all students have completed the
exercises from the laboratory sessions for Topics 2-5 and are confident with SQL. By this stage, all
students should be confident with the following:
Creating tables
Inserting data
Updating data
Deleting data
Performing queries
Start the session by ensuring that all students have completed the following laboratory exercises
and provide guidance for those students who have not:
Topic 2 Laboratory Sessions: Exercises 4-10
Topic 3 Laboratory Sessions: Exercises 1-2
Topic 4 Laboratory Sessions: Exercises 1-3
Topic 5 Laboratory Sessions: Exercises 1 and 6
Additional exercises are provided below for students who have already completed the
aforementioned exercises or who are able to complete them before the end of this session.

Exercise 1
Create a table called Student. The table should have the following attributes all of type varchar:

Student_id

First_name

Last_name

Gender

Student_id is the primary key for the Student table. The Student table is attached to the table
Course in a one-to-many relationship where Student is the many part of the relationship. The
primary key of the Course table is Course_id. You will also need to create the Course table.
Suggested Solution:
The point of this exercise is to ensure that students can create tables with appropriate primary and
foreign keys from a given statement about the relevant tables. The students may have different
sizes for each of the attributes. For example, Student_id could be varchar(10) because a student ID
is not likely to be more than 10 characters, etc.

Create table Student
(Student_id varchar(30) not null,
First_name varchar(30),
Last_name varchar(30),
Gender varchar(30),
Course_id varchar(30),
primary key (Student_id),
foreign key (Course_id) references Course (Course_id));
For the foreign key/primary key relationship to work properly, a dummy Course table should be
created before the Student table. This could be as simple as the following:
Create table Course
(Course_id varchar(30) not null,
Course_name varchar(30),
primary key (Course_id));
Exercise 2
Insert the following person into the Student table created in Exercise 1. The student's first name is
Chris, his last name is Peters, he is an 18 year old male and his student ID will be NCC001. Chris
will be on course DB001 Databases. You will also need to update the Course table.
Suggested Solution
The insert statements should be carried out in the following order so that the foreign key/primary key
relationship works properly:
1. Insert into Course (Course_id, Course_name) values ('DB001', 'Databases');
2. Insert into Student (Student_id, First_name, Last_name, Gender, Course_id) values
('NCC001', 'Chris', 'Peters', 'Male', 'DB001');
Exercise 3
Update the Course table to show that the course name for course DB001 has changed from
Databases to Database Systems. Then update the Student table to show that student NCC001 no
longer wishes to take course DB001, they now wish to take course SE001 Software engineering.
You will also need to update the course table.
Suggested Solution:
The following update can be carried out at any time as it does not cause problems with the foreign
key/primary key relationship.
Update Course set Course_name = 'Database Systems' where Course_id = 'DB001';
These statements should be carried out in the following order so that the foreign key/primary key
relationship works properly:
1. Insert into Course (Course_id, Course_name) values ('SE001', 'Software Engineering');
2. Update Student set Course_id = 'SE001' where Student_id = 'NCC001';

Exercise 4
Delete the student with the ID NCC001 from the Student table.
Suggested Solution:
Delete from Student where Student_id = 'NCC001';
Exercise 5
1. Using the COUNT function and joining Student and Course, count how many students there are
on the Software Engineering course.

2. Select the first and last names of the students who are on the Database Systems course and
order them by their gender.

Suggested Solution
1. Select Count(*)
from Student s, Course c
where s.Course_id = c.Course_id
and c.Course_name = 'Software Engineering';

2. Select s.First_name, s.Last_name
from Student s, Course c
where s.Course_id = c.Course_id
and c.Course_name = 'Database Systems'
order by s.Gender;

6.6 Private Study

The time allocation for private study in this topic is expected to be 7.5 hours.
Lecturers' Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide.

Exercise 1
Draw the ER diagram for the set of relations produced in Slide 20.
Suggested Answer:
[ER diagram showing the entities Course, Student, Student Module, Module and Module Type, linked
by one-to-many (1 to 0..*) relationships.]

Exercise 2: Normalisation

Product Number: 009
Product Name: Wall Bracket
Product Type Code: HF
Product Type Name: Home Fitting

Supplier Number | Supplier's Name   | Supplier's Product Ref No | Price | Main Supplier Y/N?
099             | Gibbons           | WB09                      |       |
0100            | Jarrolds Fittings | 98383                     | 3.50  |
0101            | H Drammond        | B010                      | 3.75  |
098             | Crambornes        | Br 7                      | 3.99  |
078             | Jamison           | 8383                      | 3.99  |


Above is a form used by a firm to keep track of the different suppliers that supply them with the
same part. Supplier's Product Ref No is the reference number given to the part by the supplier. Main
Supplier Y/N indicates whether this is their preferred supplier of the part.
Using the techniques discussed in the lecture, break this document down into a set of third normal
form relations.

Suggested Answer:

UNF
Product Number, Product Name, Product Type Code, Product Type Name, Supplier Number,
Supplier Name, P/S Reference, Product Price, Main Indicator

1NF
(Product Number, Product Name, Product Type Code, Product Type Name)
(Product Number*, Supplier Number, Supplier Name, P/S Reference, Product Price, Main Indicator)

2NF
(Product Number, Product Name, Product Type Code, Product Type Name)
(Product Number*, Supplier Number*, P/S Reference, Product Price, Main Indicator)
(Supplier Number, Supplier Name)

3NF
(Product Number, Product Name, Product Type Code*)
(Product Type Code, Product Type Name)
(Product Number*, Supplier Number*, P/S Reference, Product Price, Main Indicator)
(Supplier Number, Supplier Name)

Entities
PRODUCT
PRODUCT TYPE
SUPPLIER PRODUCT
SUPPLIER

Exercise 3
Review the content of this topic and conduct any further reading you need to undertake in order to
ensure that you understand the material. You should make note of anything that you still feel
requires further clarification and bring your questions to the tutorial for this topic.

6.7 Tutorial Notes

The time allowance for tutorials in this topic is 2 hours.

Lecturers' Notes:
Students have copies of the tutorial activities in the Student Guide. Answers are not provided in their
guide.
Students can work in small groups to complete the exercises. You should then run a whole class
feedback session to discuss the students' answers.

Exercise 1: Review of Private Study Exercises

In small groups, discuss your findings for Private Study Exercises 1 & 2. Your tutor will then lead a
class feedback session, during which you can also raise any questions you have about the material
covered in this topic.

Exercise 2: Questions

Answer these questions in your own words.
a. Give a definition for first, second and third normal form.
b. What is the purpose of normalisation? Why is it necessary to split data into separate tables?
c. Why do you think Entity Diagrams are usually referred to as a top-down approach and
normalisation as a bottom-up approach?
d. Describe the concept of functional dependency.
e. What role does functional dependency play in the process of normalisation?

Suggested Answers:
a. First Normal Form: A formal definition might be that it is a table where each cell contains only
one value. For our purposes, the students should be able to express that it is a table which
contains no repeating groups.
Second Normal Form: A relation in first normal form in which every non-key attribute is fully
functionally dependent on the primary key. There are no partial key dependencies.
Third Normal Form: A relation in second normal form in which no non-key attribute is
transitively dependent on the primary key.

b. The purpose of normalisation is to produce a set of relations that has the minimum amount of
duplication of data. It is also to avoid anomalies. Tables should contain data about one topic;
that is, they should semantically fit that portion of the real world that the data is trying to
represent. If all the data was put into one big table, then every time something was changed
or added it would mean replicating already existing data.

c. ER diagrams approach the problem of organising data by identifying the larger clusters of
data known as entities that correspond to real-world categorisation of data by an
organisation. The aim is to get an overview of the main ways in which data is grouped. The
details of this (the attributes, data-types and so on) tend to be added later.
Normalisation starts with attributes and works its way up to arrive at a set of entities at the
end of the process.
d. This concept describes the relationship between attributes in a relation. If there is a
functional dependency between two attributes, it means that if the value of one attribute is
known then the value of the other will also be known. If A functionally determines B, then for
each value of A there will be exactly one value of B. For example, for each student ID there
will be one student name. The reverse is not necessarily true: two students with different IDs
can share the same name.
e. Within normalisation, each stage will be looking for functional dependencies and applying the
rules of that stage to see if they fit. For example, second normal form is about identifying
non-key attributes that depend on only part of the primary key, and third normal form is about
identifying functional dependencies between non-key attributes.
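The idea of functional dependency in answer d can be illustrated with a short sketch. It is written in plain Python, and the helper name determines and the sample student rows are hypothetical, invented only for this illustration. Note that sample data can only show that a dependency does not hold; it cannot prove that one holds for all possible data.

```python
def determines(rows, a, b):
    """Return True if, in these rows, each value of attribute a maps to one value of b."""
    seen = {}
    for row in rows:
        # setdefault stores the first b-value met for this a-value and returns it afterwards
        if seen.setdefault(row[a], row[b]) != row[b]:
            return False
    return True


# Hypothetical sample data: two different students share the same name.
students = [
    {"student_id": "NCC001", "name": "Chris Peters"},
    {"student_id": "NCC002", "name": "Gary Smith"},
    {"student_id": "NCC003", "name": "Gary Smith"},
]

id_determines_name = determines(students, "student_id", "name")
name_determines_id = determines(students, "name", "student_id")
print(id_determines_name, name_determines_id)
```

Here the student ID determines the name, but the name does not determine the ID, because the same name appears against two different IDs.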

Topic 7: SQL (I)

7.1 Learning Objectives

This topic provides an overview of SQL (Structured Query Language).
On completion of the topic, students will be able to:
Explain the purpose of SQL;
Outline the basic concepts of SQL;
Understand that there are different flavours of SQL.

7.2 Pedagogic Approach

Information will be transmitted to the students during the lectures. They will then practise the skills
during the tutorial and seminar sessions.

7.3 Timings

Lectures: 2 hours
Laboratory Sessions: 2 hours
Private Study: 7.5 hours
Tutorials: 1 hour

7.4 Lecture Notes

The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:
History and purpose of SQL
Basic concepts of SQL
Data-types in SQL

7.4.1 Guidance on the Use of the Slides


Slide 4:

The objectives of SQL are to create the database and relation structures, perform
basic tasks, such as insertions, updates and deletions of data from base relations
and perform simple and complex queries.

Slides 5-6:

DDL is for defining the database structures and controlling access to data. DML is
for retrieving and updating data.

Slides 7-8:

These slides look at the history of SQL. SQL was developed out of System R,
which was mentioned during the lecture on the relational model. It was developed
during the late 1970s and went on to become THE standard database language. The
first SQL standard was published in 1987 by ISO (International Organization for
Standardization), based on work by ANSI (American National Standards Institute).
Revisions to SQL occurred as follows:

1989 - Minor revisions.
1992 - Major revision introducing new data types like VARCHAR and some
new operators like UNION JOIN and NATURAL JOIN.
1999 - Addition of the object-relational features.
2003 - Addition of the concept of Core-SQL, a set of SQL features that every
implementation of SQL must meet. The aim was to overcome the divergence in the
language that had happened as different vendors had implemented SQL.
2006 - SQL to be used with the XML mark-up language.
2008 - Minor revisions.

Before the language was first standardised in 1987, it was being developed by
different vendors. Various vendors added features known as extensions. The result
of this is what is known as the various SQL dialects.
Slide 9:

Data Manipulation is for retrieving and updating data.


SELECT - This is the main keyword used in the retrieval of data. It is at the heart of
SQL. Learning to write select statements is a vital part of learning the language.

INSERT, UPDATE and DELETE - These keywords are for updating data; that is
changing it in various ways on the database.
Ask the students if they understand the difference between an update and an insert.
An update changes data that is already there, whereas an insert is to put new data
into the database.
Slide 10:

It is important to distinguish how to use different sorts of literal constants in the


language. All non-numeric data must be enclosed in single quotes; all numeric data
must not be enclosed in quotes.

Slides 11-12:

Activity: Ask the students to look at this example of a query. What are the
functions of the keywords specified in bold here?
SELECT specifies which columns from the table are to appear in the result
FROM specifies which table or tables are to be used to get the results
WHERE specifies some condition that will restrict the rows that are retrieved
GROUP BY groups rows by some column value
HAVING restricts which of the groups produced by GROUP BY appear in the result
ORDER BY specifies the order in which the result will appear
Mention that the full power of the SELECT construct will be looked at in greater
detail in the next lecture.
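The roles of the six keywords can be shown in a single query. The following is a minimal sketch, assuming Python's built-in sqlite3 module and a hypothetical staff table with invented rows; it is not the query shown on the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("Ann", "Sales", 30000), ("Bob", "Sales", 20000),
     ("Cat", "IT", 40000), ("Dan", "IT", 10000), ("Eve", "HR", 25000)],
)

rows = conn.execute(
    "SELECT dept, COUNT(*) "      # columns to appear in the result
    "FROM staff "                 # table to read from
    "WHERE salary >= 20000 "      # filter individual rows before grouping
    "GROUP BY dept "              # form one group per department
    "HAVING COUNT(*) >= 2 "       # keep only groups with at least two rows
    "ORDER BY dept"               # sort the final result
).fetchall()
print(rows)
```

WHERE removes Dan before grouping, so only the Sales group still has two rows and passes the HAVING test.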

Slide 13:

Database Update. Despite its name, SQL can be used not only to query the database
but also to add, change and delete data. These constructs are
usually simpler than the SELECT statement.

Slide 14:

This slide shows two examples of an insert statement. In the first, the columns are
specified. The second assumes an insert for all the columns. Note that an insert
statement is for adding new data to a table.

Slide 15:

This slide shows two examples of an update statement. In the first, ALL ROWS are
updated. The second updates specific rows based on a condition. Note that an
update statement is for changing data that is already in the database.

Slide 16:

This slide shows two examples of a delete statement. In the first, ALL ROWS are
deleted. It should be pointed out that this needs to be used with care. The second
deletes specific rows based on a condition.

Slides 17-18:

Activity: Look at the department table that was created in both the first laboratory
session and in subsequent ones. How would you write a statement to insert a new
row of data in the table? The department will be number 8, based in Glasgow and
will be the Complaints department.

Slide 19:

The commit statement is needed to actually save the changes you have made;
otherwise they will be lost.

Slide 20:

The rollback command can be used to undo an action like an insert, update or
delete.
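The effect of commit and rollback can be demonstrated with a minimal sketch such as the one below. It assumes Python's built-in sqlite3 module, and the column names of the department table are assumptions made for illustration; the table used in the laboratory sessions may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE department (dept_no INTEGER PRIMARY KEY, dept_name TEXT, location TEXT)"
)
conn.commit()

conn.execute("INSERT INTO department VALUES (8, 'Complaints', 'Glasgow')")
conn.rollback()  # undo the uncommitted insert
after_rollback = conn.execute("SELECT COUNT(*) FROM department").fetchone()[0]

conn.execute("INSERT INTO department VALUES (8, 'Complaints', 'Glasgow')")
conn.commit()    # make the insert permanent
conn.rollback()  # there is nothing left to undo
after_commit = conn.execute("SELECT COUNT(*) FROM department").fetchone()[0]
print(after_rollback, after_commit)
```

The row disappears when the insert is rolled back, but survives a rollback issued after the commit.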
Slide 21:

Datatypes are used to enforce a domain. Ask the students if they can give a
definition of a domain as discussed in the topic on the relational model. A domain is
a set of allowable values for a column or attribute. Datatypes enforce general
domains, such as whether a column takes a character or number. For more specific
domains, such as Male or Female, then SQL uses something called constraints;
this will be examined in a later topic.
It should be noted that different vendors have extended their versions of SQL to
include new datatypes.

Slides 22-23:

String Datatypes. Character can be abbreviated to CHAR. This is a column with a
fixed length. If, for example, a CHAR is defined as being six characters long and the
input is only four characters, then the rest of the field will be filled with blank spaces.
This has implications for queries.
For example, imagine we wanted to retrieve the data from a row in a table and use
the name field for a person called Gary. If the name was stored in a field that was
defined as datatype CHAR of length six, then what would actually be stored in the
database would be the letters of the name followed by two blank spaces:
Gary__
If a query then used SELECT * from People Where name = 'Gary'; this might not
work as expected, because what is stored in the database is 'Gary  ' (Gary followed
by two blank spaces).

Varying character can be abbreviated to VARCHAR. This overcomes the problem of
having the field padded out with blank characters as with a CHAR datatype. Only the
actual characters are stored, so the above SQL statement would work if the field
was a VARCHAR.
BIT(N) - Fixed length binary field
BIT VARYING(N) - Variable length binary field
Slide 24:

Numeric Datatypes. Numeric and Decimal are effectively the same; the definition is
Decimal(M,N), with M being the total number of digits (the precision) and N the
number of those digits that come after the decimal point (the scale). For example,
Decimal(5,2) can hold values up to 999.99. Decimal is abbreviated to DEC.
Integer is a number without a decimal point. It is abbreviated to INT.
Float (also called real) is a number stored with a decimal point that can move as the
number requires it.

Slide 25:

This slide shows datetime types:

Date - System defined date

Time - System defined time

Timestamp - Dates and times, including fractions of a second

Interval - Intervals between dates

It should be noted that the representation of dates can vary depending on the
vendor and there are extensions in the various vendors' flavours of SQL. There was
much refinement in SQL systems (along with all other computer systems) in the run
up to the year 2000, because of the need to store the year in a four digit format.
The students could be asked why this was such an issue at the time and what the
problem is with storing the year as a two digit number. The answer is that two digit
years are ambiguous, e.g. does 01-JAN-20 mean the first of January 1920 or
2020? We only know if we store the year as a four digit number. The date formats
can usually be specified by the given vendor's methods.
Slide 26:

It should be emphasised that SQL is near universal as a language for relational


databases. It is relatively easy to use and provides the features that professionals
and other users will need when working with databases.

Slide 27:

C.J. Date pointed out early in 1987 that SQL did not support all the features of the
relational model. Referential integrity is not supported in the sense that while we
can define primary keys and foreign keys, it is possible to create tables without a
primary key (so allowing the insert of duplicate tuples) and to have non-enforced
foreign keys.
There is no one standard and the different flavours of SQL can be confusing. SQL
has had to be extended to support new developments in database technology, such
as object-oriented features.

7.5 Laboratory Sessions

The laboratory time allocation for this topic is 2 hours.

Lecturers' Notes:
Students have copies of the laboratory exercises in the Student Guide. Answers are not provided in
their guide.
At this stage of the module, you should also introduce the assessed assignment to students.
Assignments for the relevant assessment cycle are available from the NCC Education Campus
(http://campus.nccedu.com). You will need to ensure that each student has a copy of the
assignment and understands the requirements. Assignments would normally be submitted for
marking during Topic 9 or 10, depending on how much time you feel you need for marking.

Exercise 1
In Topic 5, you should have designed and created your own personal details tables. Gather details
of at least 8 of your fellow students. You should get data for both tables. Insert the data into the
tables.
Suggested Solution:
The purpose of these exercises is for students to get used to writing queries and performing
operations on data structures of their own devising. The answers will depend on the table structure
that the students have created. The model answers here will be based on the generic solution given
in Topic 5.
Create table Personal_details
(personal_no integer not null,
first_name varchar(30),
second_name varchar(30),
primary key (personal_no));
Create table qualifications
(qual_no integer not null,
qual_name varchar(30),
qual_level integer,
institution varchar(30),
grade varchar(30),
personal_no integer,
primary key (qual_no),
foreign key (personal_no) references personal_details (personal_no));
Please note that some implementations of SQL specify the primary key in a slightly different way:
Create table Personal_details
(personal_no integer not null primary key,
first_name varchar(30),
second_name varchar(30));
Therefore the insert statements will look like this:

Insert into Personal_details
Values (2,'Gary','Smith');
Insert into qualifications
Values (5,'BSc Computing',6,'University of Cairo','Upper-Second',2);
The important point to note with these inserts is that students know how to link the qualifications
that they are creating to the people those qualifications belong to. One person might have many
qualifications and that would result in many rows being created in the qualifications table with the
same foreign key. For example, another qualification for Gary Smith would look like:
Insert into qualifications
Values (10,'MSc Computing',7,'University of Luxor','Merit',2);
Note also that, in this example, the Primary Key of qualifications is just a consecutive number with no
meaningful relationship to the person the qualification belongs to.
Exercise 2
Write a query that shows all the qualifications for a named person. This could be yourself.
Suggested Solution:
Select p.First_name, p.Second_name, q.qual_name
From personal_details p, qualifications q
Where q.personal_no = p.personal_no
And p.First_name = 'Gary'
And p.Second_name = 'Smith';
Exercise 3
Write a query that shows which institution each student has attended. Order this by the students'
last names.
Suggested Solution:
Select p.First_name, p.Second_name, q.institution
From personal_details p, qualifications q
Where q.personal_no = p.personal_no
Order by p.Second_name;
Exercise 4
Show all those people who have achieved a Level 3 qualification.
Suggested Solution:
Select p.First_name, p.Second_name
From personal_details p, qualifications q
Where q.personal_no = p.personal_no
And q.qual_level = 3;

Exercise 5
Write a query that shows how many qualifications you have.
Suggested Solution:
Select Count(q.personal_no)
From personal_details p, qualifications q
Where q.personal_no = p.personal_no
And p.second_name = '<Student Last Name>';
The student will put their own last name between the quotes, in place of the angle brackets and
the text inside them.
Exercise 6
If there is not one already there, then add a column to the personal_details table to record a
person's age.
Suggested Solution:
Alter table personal_details
Add age integer;
Exercise 7
Update the personal_details table with each person's age.
Suggested Solution:
This is an example.
Update personal_details
Set age = 18
Where personal_no = 1;
Exercise 8
Write a query to show all first names, last names and the level 2 qualifications for students who are
under the age of 20.
Suggested Solution:
Select p.first_name, p.second_name, q.qual_name
From personal_details p, qualifications q
Where q.personal_no = p.personal_no
And q.qual_level = 2
And p.age < 20;
Exercise 9
Create a new table called Qualification_Type using the as statement that shows all the
qualifications that exist. There should be one row for each qualification without duplications.
Suggested Solution:
Create table qualification_type as
(Select distinct qual_name
From qualifications);
Exercise 10
Add a column to the qualification_type table to show the level each qualification is at.
Suggested Solution:
Alter table qualification_type
Add qual_level integer;
Exercise 11
Update the qualification_type with the correct level for each qualification.
Suggested Solution:
The simple way to do this without nested sub-queries is to use an update statement of the basic
form.
Update qualification_type
Set qual_level = 3
Where qual_name = 'Certificate in Computing';
Exercise 12
Once the qualification_type table is updated with the level, the level can be deleted from the
qualifications table. Use the drop column script as shown below:
Alter table qualifications
Drop column qual_level;
Exercise 13
Make the qualification name attribute (qual_name) the primary key of qualification_type.
Suggested Solution:
Alter table qualification_type
Add primary key (qual_name);
Exercise 14
Now create a foreign key between qualifications and qualification_type using the qualification name
attribute (qual_name).

Suggested Solution:
Alter table qualifications
Add foreign key (qual_name) references qualification_type (qual_name);
Exercise 15
Rewrite the query from Exercise 4. Show all those people who have achieved a Level 3 qualification.
You will now need to include all three tables.
Suggested Solution:
Select p.First_name, p.Second_name
From personal_details p, qualifications q, qualification_type t
Where q.personal_no = p.personal_no
And q.qual_name = t.qual_name
And t.qual_level = 3;

7.6 Private Study

The time allocation for private study in this topic is expected to be 7.5 hours.
Lecturers' Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide.

Exercise 1
In a Customer Accounts System, the following tables have been created using SQL DDL
commands.
1. CREATE TABLE Customer_Purchase
(
ItemNo         char(4) not null,
CustomerNo     char(4) not null,
PurchaseDate   date,
PaymentTotal   decimal,
PRIMARY KEY (ItemNo, CustomerNo),
FOREIGN KEY (CustomerNo) REFERENCES Customer(CustomerNo),
FOREIGN KEY (ItemNo) REFERENCES Item(ItemNo) );

2. CREATE TABLE Customer
(
CustomerNo     char(4) not null,
CustomerName   char(10),
City           char(20),
PRIMARY KEY (CustomerNo) );

3. CREATE TABLE Item
(
ItemNo         char(4) not null,
Item_Name      char(30) not null,
Item_Type_Code char(4) not null,
PRIMARY KEY (ItemNo),
FOREIGN KEY (Item_Type_Code) REFERENCES Item_Type (Item_Type_Code) );

4. CREATE TABLE Item_Type
(
Item_Type_Code char(4) not null,
Item_Type_Name char(30) not null,
PRIMARY KEY (Item_Type_Code) );

A user tried to execute the following commands in the given order to insert values into the created
tables. Find those commands that would result in the return of an error message. Explain why.
1. INSERT INTO Item_Type values ('2345', 'Hand Drill', 25);
2. INSERT INTO Item_Type values ('2344', 'Electronic Drill');
3. INSERT INTO Item_Type values ('2346', 'Drill Bit');
4. INSERT INTO Item values ('1010', 2344, '2344');
5. INSERT INTO Item values ('1005', 'Dulux Cordless Electronic Drill', '2344');
6. INSERT INTO Item values ('1005', '5mm Ceramic Drill Bit', '2354');
7. INSERT INTO Item values (1005, 'Standard Long Cord Electronic Drill', '2344');
8. INSERT INTO Customer values ('5566', 'HASNET', 'LONDON');
9. INSERT INTO Customer values ('5667', 'SONGARA', 'BIRMINGHAM');
10. INSERT INTO Customer values ('5667', 'SINGH', 'CAIRO');
11. INSERT INTO Customer_Purchase values ('1005', '5566', '03-FEB-2004', 20.00);
12. INSERT INTO Customer_Purchase values ('1007', '5566', '04-FEB-2004', 40.00);
Suggested Answer:
1. INSERT INTO Item_Type values ('2345', 'Hand Drill', 25);
   ERROR - TOO MANY VALUES
2. INSERT INTO Item_Type values ('2344', 'Electronic Drill');
   OK
3. INSERT INTO Item_Type values ('2346', 'Drill Bit');
   OK
4. INSERT INTO Item values ('1010', 2344, '2344');
   ERROR - INCORRECT DATA TYPE, ITEM NAME IS A CHAR
5. INSERT INTO Item values ('1005', 'Dulux Cordless Electronic Drill', '2344');
   OK
6. INSERT INTO Item values ('1005', '5mm Ceramic Drill Bit', '2354');
   ERROR - FOREIGN KEY DOES NOT MATCH PRIMARY KEY IN TARGET TABLE
7. INSERT INTO Item values (1005, 'Standard Long Cord Electronic Drill', '2344');
   ERROR - INCORRECT DATA TYPE, PRIMARY KEY IS CHAR NOT A NUMBER
8. INSERT INTO Customer values ('5566', 'HASNET', 'LONDON');
   OK
9. INSERT INTO Customer values ('5667', 'SONGARA', 'BIRMINGHAM');
   OK
10. INSERT INTO Customer values ('5667', 'SINGH', 'CAIRO');
   ERROR - DUPLICATE VALUE FOR CUSTOMER ID
11. INSERT INTO Customer_Purchase values ('1005', '5566', '03-FEB-2004', 20.00);
   OK
12. INSERT INTO Customer_Purchase values ('1007', '5566', '04-FEB-2004', 40.00);
   ERROR - FOREIGN KEY DOES NOT MATCH PRIMARY KEY IN TARGET TABLE
Exercise 2
Using online resources, compare the features of any two implementations of SQL, as provided by a
vendor. For example, you could compare Oracle SQL*Plus with MySQL.
Use your own words to write your answer and make sure you include a reference list of the places
where you found the information.
Suggested Answer:
The students should present their answer to the class during the tutorial session.
The answer would depend on the products chosen. For example, if choosing Oracle as one of the
products, there should be discussion of how Oracle has extended the language with non-standard
features, including the procedural extensions (PL/SQL). Those choosing MySQL could discuss its fit
with PHP and its availability.

7.7 Tutorial Notes

The time allowance for tutorials in this topic is 1 hour.

Lecturers' Notes:
Students have copies of the tutorial activities in the Student Guide. Answers are not provided in their
guide.

Exercise 1: Review of Private Study Exercises

In small groups, discuss your findings to Private Study Exercise 1, asking your tutor for clarification
when needed.
Exercise 2
Work in a small group. Present your findings from Private Study Exercise 2 to the other students and
answer any questions they may have.
Make notes on the findings of the other students to increase your understanding.
Your tutor will then run a whole class feedback session.

Topic 8: SQL (2)

8.1 Learning Objectives

This topic provides an overview of further aspects of SQL.
On completion of the topic, students will be able to:
Understand the syntax of the create statement;
Understand the construction of more complex selections;
Recognise the issues around error messaging and query optimisation.

8.2 Pedagogic Approach

Information will be transmitted to the students during the lectures. They will then practise the skills
during the tutorial and seminar sessions.

8.3 Timings

Lectures: 2 hours
Laboratory Sessions: 2 hours
Private Study: 7.5 hours
Tutorials: 1 hour

8.4 Lecture Notes

The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:
Creating tables
More of the select statement
Fixing errors and optimisation

8.4.1 Guidance on the Use of the Slides


Slides 4-5:

The most important part of data definition that is covered in this module is creating
tables.
The following activity relating to Slide 4 can be discussed with the class. The
answers are on Slide 5.
Activity: Examine the Create table statement. Where is the name of the table
defined? Where are the columns defined? What defines the datatype for each of
the columns? What defines whether a column is mandatory or not? What
defines the maximum length of each of the columns?

Slide 6:

This slide shows the specification of a primary key. Note that there are different
ways of doing this depending on the vendor's flavour of SQL.

Slide 7:

This slide shows the specification of a foreign key.

Slide 8:

There are various ways in which a table can be modified after it has been created.
List:
Add an extra column
Drop a column from a table
Modify the maximum length of a column
Add a new constraint
Drop a constraint
Set a default for a column
Drop a default

The significance of being able to do this is that changes in the database structure
will not necessarily mean that a table will have to be recreated from scratch.
Modifying an existing table leaves the data in that table intact; dropping the table
and creating it from scratch does not, and would entail first copying the data into a
temporary table. There are limitations to altering a table; for example, adding a new
NOT NULL column with no default value to a table that already contains rows would
not work.
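Both points, that existing data survives an alteration and that a NOT NULL column cannot simply be added to a populated table, can be tried out with a minimal sketch. It assumes Python's built-in sqlite3 module and a cut-down personal_details table; the email column is hypothetical and exists only for this illustration. Other products word the error differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE personal_details (personal_no INTEGER PRIMARY KEY, first_name TEXT)")
conn.execute("INSERT INTO personal_details VALUES (1, 'Gary')")

# Adding a nullable column keeps the existing row; the new column holds NULL.
conn.execute("ALTER TABLE personal_details ADD COLUMN age INTEGER")
row = conn.execute("SELECT personal_no, first_name, age FROM personal_details").fetchone()

# Adding a NOT NULL column with no default is rejected because a row already exists.
try:
    conn.execute("ALTER TABLE personal_details ADD COLUMN email TEXT NOT NULL")
    not_null_added = True
except sqlite3.Error:
    not_null_added = False
print(row, not_null_added)
```

The existing row is returned with an empty (NULL) age, and the second alteration fails because the database would have no value to put in the new mandatory column.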
Slide 9:

Example of using the alter table to add a column.

Slide 10:

Data Manipulation - The select statement was introduced in its basic form in the
previous lecture. Students should be familiar with it and with other aspects of SQL
from the workshop materials. The following slides provide an overview of the key
features listed below:
Select
Order by
Aggregate functions
Group by
Sub-queries
Joins

Slide 11:

Select - Simple retrieval uses the keywords SELECT, FROM and WHERE.
Activity: Ask the students if they know the function of each of these keywords.
Answer: Select specifies the columns, From specifies the table(s) and Where
specifies some condition that will limit the rows retrieved. This was covered in the
previous lecture.

Slide 12:

This slide shows retrieving all the columns using the star operator and a simple
retrieval specifying the columns and a where clause.

Slide 13:

In the relational model, there is no ordering of tuples in a relation. In other words,


although it might be counter-intuitive, the rows in a database table are not stored in
a particular order as far as our logical model is concerned. It should be pointed out
that different vendors might impose a physical ordering of rows, but this should not
affect the way in which data is retrieved.
The Order By clause imposes an order on the query result of a select
statement. There is an example on this slide.

Slide 14:

Ascending and Descending - The default order is ascending, but using the DESC
keyword will make it descending in this example.

Slides 15-16:

Standard SQL defines five aggregate functions: Count, Sum, Avg, Min and Max.
Activity: What is the purpose of each of these functions?
Answer: Count returns the number of values in a column, Sum returns the sum
total of values of a column, Avg returns the mean average of values in a column,
Min returns the lowest value in a column and Max returns the highest value in a
column.
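The five functions can be seen side by side in a minimal sketch such as the following. It assumes Python's built-in sqlite3 module and a hypothetical staff table with invented salaries; one salary is deliberately left NULL to show that the aggregate functions skip NULL values, whereas COUNT(*) counts every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?)",
    [("Ann", 10000), ("Bob", 20000), ("Cat", 30000), ("Dan", None)],
)

# COUNT(*) counts rows; the other aggregates work on the non-NULL salaries only.
summary = conn.execute(
    "SELECT COUNT(*), COUNT(salary), SUM(salary), AVG(salary), MIN(salary), MAX(salary) "
    "FROM staff"
).fetchone()
print(summary)
```

With these rows there are four rows but only three salaries, so the average is taken over three values.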

Slide 17:

This slide shows an example of the syntax for using aggregate functions.

Slide 18:

The aggregate functions can be used to find summaries of data for particular
groups of rows in a database table. Using the Group By clause means that the
aggregate function used is applied to each group. The slide shows an example
drawn from the workshop.

Slide 19:

The group by clause can be modified using the having clause. Having does for
groups what the where clause does for individual rows: it restricts which groups
appear in the result. This slide shows an example drawn from the workshop.

Slide 20:

SQL has the capability to nest one query inside another. The nested query is known
as a sub-query. This allows SQL to have queries where the results are based on
the outcome of another query. This slide shows an example drawn from the
workshop.
It is worth noting that there is often more than one way to produce the same query
result in SQL. Often a result that uses a sub-query could equally be done by using a
join of some sort.
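The equivalence of a sub-query and a join can be checked with a minimal sketch such as the following. It assumes Python's built-in sqlite3 module and cut-down Course and Student tables with invented rows; it is not the example shown on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Course (Course_id TEXT PRIMARY KEY, Course_name TEXT);
    CREATE TABLE Student (Student_id TEXT PRIMARY KEY, First_name TEXT, Course_id TEXT);
    INSERT INTO Course VALUES ('DB001', 'Database Systems'), ('SE001', 'Software Engineering');
    INSERT INTO Student VALUES ('NCC001', 'Chris', 'SE001'), ('NCC002', 'Gary', 'DB001'),
                               ('NCC003', 'Mona', 'SE001');
""")

# Version 1: the inner query finds the course id, the outer query finds the students.
with_subquery = conn.execute(
    "SELECT First_name FROM Student "
    "WHERE Course_id IN (SELECT Course_id FROM Course "
    "                    WHERE Course_name = 'Software Engineering') "
    "ORDER BY First_name"
).fetchall()

# Version 2: the same question answered by joining the two tables.
with_join = conn.execute(
    "SELECT s.First_name FROM Student s, Course c "
    "WHERE s.Course_id = c.Course_id AND c.Course_name = 'Software Engineering' "
    "ORDER BY s.First_name"
).fetchall()
print(with_subquery, with_join)
```

Both versions return the same two students, which makes the point that the choice between them is one of style and performance rather than of result.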

Slide 21:

Joins. Data in a relational database is stored in different tables that can be


connected by foreign keys. Often what is required is to produce a result of a query
that has data drawn from different tables. In order to do that, we need to join the
tables, usually by equating the foreign key of one table with the primary key of
another table. This slide shows an example drawn from the workshop.
Activity: What would happen if we selected data from the two tables, but did not
specify the join condition?
Answer: The result would be the Cartesian product of the two tables: every row
from one table combined with every row from the other. The students should be
encouraged to try this and look at the results in the laboratory session.
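
A minimal sketch of this activity follows, assuming SQLite via Python's sqlite3 module and a handful of invented rows. Without a join condition, each row of one table is paired with each row of the other, so the row count is the product of the two table sizes.

```python
import sqlite3

# Invented sample data: 3 workers and 2 departments.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dept_no INTEGER PRIMARY KEY, location TEXT);
    CREATE TABLE workers (emp_no INTEGER PRIMARY KEY, dept_no INTEGER);
    INSERT INTO departments VALUES (1, 'Cairo'), (2, 'Lagos');
    INSERT INTO workers VALUES (10, 1), (11, 2), (12, 1);
""")

# No join condition: every worker is paired with every department.
no_condition = conn.execute("SELECT * FROM workers, departments").fetchall()

# Join condition present: each worker is paired only with its own department.
with_condition = conn.execute(
    "SELECT * FROM workers w, departments d WHERE w.dept_no = d.dept_no"
).fetchall()

print(len(no_condition), len(with_condition))
```

With 3 workers and 2 departments, the first query returns 6 rows and the second returns 3.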

Slide 22:

Not specifying the join condition correctly is one of the commonest mistakes when
starting to use SQL. The result can often be a very long output from a select
statement. Unfortunately, there is no magic fix for debugging SQL and the way in
which different vendors have implemented the language does not help. Error
messages are often cryptic and might just point to the line in which the error occurs
(or even the next line as it is the next point parsed (examined) by the SQL
compiler). There is also no standard editing feature built into SQL, so it is important
to establish, with the product being used, how editing will take place, for example
using the operating system's text editor.

Slide 23:

Much time is spent by professionals using SQL to work out the best way to write a
particular query. There are lots of ways of doing the same thing in SQL, especially
when it comes to querying. The example of producing the same query results with
either a join or a sub-query is an example of this. Different ways of writing the query
can result in dramatic differences in the amount of time it takes to produce the
result. This issue is known as performance and is especially important in databases
where there are large numbers of rows. Much work in query optimisation requires
knowledge of one of the underlying languages of the relational model - relational
algebra. This is beyond the scope of this module.

Slide 24:

TOAD is one widely used product for helping to optimise queries.

Slide 25:

The best way of becoming adept with SQL is to use it. The work in the laboratory
sessions and the work that students do as part of their assignment will help them
develop their ability with SQL.


8.5

Laboratory Sessions

The laboratory time allocation for this topic is 2 hours.


Lecturer's Notes:
Students have copies of the laboratory exercises in the Student Guide. Answers are not provided in
their guide.
You should also allow time during the laboratory sessions to check that students are working on
their assignments and answer any general questions on the expected scope of the work. You may
also wish to remind them of the submission deadline and documentation requirements.

8.5.1 Aggregation
Many important queries in a database involve using the aggregation functions COUNT, MIN, MAX,
SUM and AVG.
COUNT counts the number of times something occurs in a database table, for example the number of
rows that meet a particular condition.
MIN finds the lowest value of an attribute in a database table.
MAX finds the highest value of an attribute in a database table.
AVG finds the mean average of an attribute in a database table.
SUM finds the total of all the values of an attribute in a database table.
Example: To find the number of rows in the workers table, we use the primary key, Emp_no, as
there will be exactly one occurrence of this value for each row and no duplicates.
Select count(emp_no)
From Workers;
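
The reason for counting on the primary key can be illustrated with a short sketch. It assumes SQLite via Python's sqlite3 module and uses invented rows; the age column is left empty for one worker purely to show the effect. COUNT on a column ignores NULLs, so only a column that is never NULL (such as the primary key) is guaranteed to give the number of rows.

```python
import sqlite3

# Invented sample data; worker 2 deliberately has no age recorded.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE workers (emp_no INTEGER PRIMARY KEY, age INTEGER);
    INSERT INTO workers VALUES (1, 30), (2, NULL), (3, 45);
""")

# COUNT(emp_no) and COUNT(*) count every row; COUNT(age) skips the NULL.
by_key, by_star, by_age = conn.execute(
    "SELECT COUNT(emp_no), COUNT(*), COUNT(age) FROM workers"
).fetchone()

print(by_key, by_star, by_age)
```

Here the first two counts are 3 and the count on age is 2.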

8.5.2 Part One: Using the Workers, Departments and Job_Types Tables
Exercise 1
Try the above select statement.
Exercise 2
Find the average age of the workers in the workers table.
Suggested Solution:
Select Avg(Age) from Workers;


Exercise 3
Find the average age of the managers in the workers table.
Suggested Solution:
Select Avg(Age)
From Workers
Where job_title = 'Manager';
Exercise 4
Find the minimum, maximum, average and the sum of the ages of all the packers in the workers table.
Suggested Solution:
Select min(age), max(age), avg(age), sum(age)
From workers
Where job_title = 'Packer';
Exercise 5
Write a query that tells you the age of the youngest employee in Cairo. You will need to use the
joining of tables that you have studied in previous tutorials.
Suggested Solution:
Select min(age)
From Workers w, Departments d
Where w.dept_no = d.dept_no
and d.location = 'Cairo';
Exercise 6
Write a query that tells you how many employees there are in Lagos.
Suggested Solution:
Select count(emp_no)
From Workers w, Departments d
Where w.dept_no = d.dept_no
And d.location = 'Lagos';


Exercise 7
Write a query that finds the job_type with the highest salary. You will need to use the job_type table
you created in Topic 5.
Suggested Solution:
Select max(salary)
From Job_Type;
Exercise 8
What is the total of all salaries paid?
Suggested Solution:
Select Sum(j.salary)
From job_type j, workers w, departments d
Where w.job_title = j.job_title
And w.dept_no = d.dept_no;
Exercise 9
What is the lowest salary paid in Cairo?
Suggested Solution:
Select Min(j.Salary)
From job_type j, workers w, departments d
Where w.job_title = j.job_title
And w.dept_no = d.dept_no
And d.location = 'Cairo';

8.5.3 Using the Personal_details, Qualifications and Qualification_types Tables


The following operations use the personal_details, qualifications and qualification_types tables.
Exercise 10
Select the maximum level of qualification attained overall.


Suggested Solution:
Select max(qual_level) from qualification_type;
Exercise 11
Select the highest level of qualification attained by you.
Suggested Solution:
Select max(qual_level)
from qualification_type qt, personal_details p, qualifications q
Where qt.qual_name = q.qual_name
And q.personal_no = p.personal_no
And p.second_name = '<students_surname>';
Exercise 12
Select the highest level achieved for those students who are over 20.
Suggested Solution:
Select max(qt.qual_level)
from qualification_type qt, personal_details p, qualifications q
Where qt.qual_name = q.qual_name
And q.personal_no = p.personal_no
And p.age > 20;
Exercise 13
How many students have achieved level 2 qualifications?
Suggested Solution:
Select count(p.personal_no)
from qualification_type qt, personal_details p, qualifications q
Where qt.qual_name = q.qual_name
And q.personal_no = p.personal_no
And qt.qual_level = 2;
Exercise 14
What is the average grade for level 3 qualifications?
Suggested Solution:
Select Avg(q.grade)
From qualifications q, qualification_type qt
Where q.qual_name = qt.qual_name
And qt.qual_level = 3;


Exercise 15
What is the average level of qualification achieved by students under 19?
Suggested Solution:
Select Avg(qt.qual_level)
From qualifications q, qualification_type qt, personal_details p
Where q.qual_name = qt.qual_name
And p.personal_no = q.personal_no
And p.age < 19;


8.6

Private Study

The time allocation for private study in this topic is expected to be 7.5 hours.
Lecturer's Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide.
The following tables are for a garden products database.
Customers
Customer ID    Customer Name
C1             Arthur Smith
C4             Samson Odogo
C2             Jagpal Singh
C6             Jenkins Watson

Products
Product           Price
Lawn mower        100
Slug Repellent
Trowel
Weed killer
Knee rest


Customer Products
Customer ID    Product
C1             Lawn Mower
C1             Slug Repellent
C1             Trowel
C4             Weed Killer
C2             Weed Killer
C2             Lawn Mower
C6             Trowel

Use these tables to complete the exercises below.


Exercise 1
Write an SQL statement that returns the names of all the customers.
Suggested Solution:
Select customer_name from customers;
Exercise 2
Write an SQL statement that returns the names of all the customers who have bought a lawn
mower.
Suggested Solution:
Select customer_name
From customers, customer_products
Where customer_products.customer_id = customers.customer_id
And product = 'Lawn Mower';


Exercise 3
Write an SQL statement that finds the average price for all the products.
Suggested Solution:
Select avg(price) from products;
Exercise 4
Write an SQL statement that sets the price of weed killer to 5.
Suggested Solution:
Update products
Set price = 5
Where product = 'Weed Killer';
Exercise 5
Write a query that gives the total spent by each customer.
Suggested Solution:
Select customer_name, sum(price)
From customers, customer_products, products
Where customers.customer_id = customer_products.customer_id
And customer_products.product = products.product
Group by customer_name;
Exercise 6
Review all the material for Topics 7 and 8 (SQL). You should make sure that you understand the
following concepts and be prepared to raise any questions about them in the next tutorial:

The purpose of SQL

Data definition language (DDL)

Data manipulation language (DML)

How to update data on a database

How to retrieve data on a database using the select statement

How to create and modify tables using SQL

The advantages and disadvantages of SQL


8.7

Tutorial Notes

The time allowance for tutorials in this topic is 2 hours.


Lecturer's Notes:
Students have copies of the tutorial activities in the Student Guide. Answers are not provided in their
guide.
Students can work in small groups to complete the exercises. You should then run a whole class
feedback session to discuss the students' answers.

Exercise 1:

Review of Private Study Exercises

Work in a small group and review your answers to Part One of the private study exercises. Your
tutor will then lead a whole class feedback session.
Exercise 2:

Questions

Answer the following questions which relate to aspects of SQL.


a.

SQL has two major components, DDL and DML. What are these components and what are
their functions?

b.

What are the disadvantages of the CHAR data-type and how does the VARCHAR data-type
overcome these?

c.

What is the purpose of the ROLLBACK statement?

d.

List the advantages and disadvantages of SQL.

e.

What are the advantages of using the ALTER TABLE statement as opposed to creating a
new table from scratch when changes are needed?

f.

What is the purpose of the GROUP BY clause?

Suggested Answers:
a.

DDL is data definition language. Its purpose is to define the database objects, such as tables
and columns. Its primary operator is the CREATE statement.
DML is data manipulation language. It is concerned with performing actions on data. This
could be putting data into the database using the INSERT statement, or changing data using
UPDATE, or deleting data using DELETE. It is also concerned with performing retrievals of
data using the SELECT statement.

b.

The CHAR data-type causes problems because it is of fixed length. If data is entered into it
that is shorter than the fixed length, then the remaining characters are filled with blanks. This
causes problems with select statements, where equality comparisons do not always work as expected.
The VARCHAR data-type overcomes this by being only as long as the data that is entered
into it.

c.

The ROLLBACK statement takes the database back to the state it was in just after the last
COMMIT statement was issued - to the last point it was saved. It can be used to undo
transactions and operations that have already been carried out that the user does not want
to be saved.

d.

Advantages:

Universal

Easy to Use

Fits (more or less) with the relational model

Disadvantages:

Does not support some features of the relational model

Has no one standard

Has had to be extended

Has much redundancy: it is possible to do the same thing in many ways

e.

Altering the table means that the data in it can be retained.

f.

This clause is used when an aggregate function is used in the main part of the select
statement. Its purpose is to group the rows by some attribute from the table or tables on
which the select statement operates, so that the aggregate function returns one result per group.
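
If a demonstration is wanted for answer (c), the following is a minimal sketch of COMMIT and ROLLBACK. It assumes SQLite via Python's sqlite3 module, where commit() and rollback() on the connection issue the corresponding statements; the product row and prices are invented for illustration.

```python
import sqlite3

# Invented sample data for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product TEXT PRIMARY KEY, price INTEGER)")
conn.execute("INSERT INTO products VALUES ('Trowel', 8)")
conn.commit()  # the saved state that a later ROLLBACK returns to

# An unwanted change made after the last COMMIT.
conn.execute("UPDATE products SET price = 80 WHERE product = 'Trowel'")
changed = conn.execute("SELECT price FROM products").fetchone()[0]

# ROLLBACK undoes everything since the last COMMIT.
conn.rollback()
restored = conn.execute("SELECT price FROM products").fetchone()[0]

print(changed, restored)
```

The price reads 80 before the rollback and 8 afterwards.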

Exercise 3:

Review of SQL

Take part in a class discussion around the relevant points of Topics 7 and 8 that are listed in Private
Study Exercise 6. Ask your tutor any questions you have about SQL.


Topic 9: Database Design
9.1

Learning Objectives

This topic provides an overview of database design.


On completion of the topic, students will be able to:

Understand the process of requirements gathering;
Design a set of database tables from an entity model;
Document the tables, columns and domains in a database using a data dictionary;
Understand the use of CASE tools.

9.2

Pedagogic Approach

Information will be transmitted to the students during the lectures. They will then practise the skills
during the tutorial and seminar sessions.

9.3

Timings

Lectures:

2 hours

Laboratory Sessions: 2 hours


Private Study:

7.5 hours

Tutorials:

2 hours


9.4

Lecture Notes

The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:

Understanding requirements
Moving from entities to tables
Documenting attributes with a data dictionary
The use of CASE tools

9.4.1 Guidance on the Use of the Slides


Slides 4-5:

Understanding user requirements as a subject overlaps with the whole area of
systems analysis. There have been a number of approaches to systems
development, and the development of databases has formed an important part of
most of the systems that have been constructed over the past few decades.
There has been a lot written on computer systems development in general and
database development in particular.

Slide 6:

There are many different approaches to systems development; there has been
what might be termed a mainstream and a traditional approach, but this has fallen
out of favour in recent years, particularly with regard to the question of
understanding user requirements.
Understanding requirements is vital to being able to produce a finished system that
meets the business needs of an organisation.

Slide 7:

The traditional methodology goes under various names, the Systems Development
Life Cycle (SDLC) or sometimes the Waterfall approach.
This involves a complete set of steps that a team follows. The fundamental idea is
to divide the development process into a series of phases or stages, each of which
finishes before the next one starts.
This process is often viewed as a cascade of steps, which is why it has been called
the waterfall approach.
It is worth noting that there are many different variations on the steps involved. This
is because there are particular methodologies known as Structured Methods,
which have been developed to guide developers through the whole lifecycle of
building a computer system. Some examples of these are the Structured Systems
Analysis and Design Method (SSADM) and Information Engineering.

Slide 8:

Listed below is one variation of the different steps:

Strategy and Planning
Feasibility Study
Systems Analysis (or Analysis)
Design
Implementation
Maintenance

Slide 9:

The concern of this module is being able to design a database given a particular
business scenario. It is worth understanding how a developer might have reached
this point following the traditional approach:

Systems Analysis
In this stage, investigation is undertaken to understand the requirements of
both the business and the users. The emphasis should be on what the system
should deliver, rather than how it should be delivered.

Design
In this stage, the purpose is to translate the requirements that have been
gathered during the previous stage into a systems design which details how
they will be satisfied.

Implementation
This is the construction of the actual system from the given design, using a
particular choice of software (a DBMS product). Here the developers will build the
required structures on the database, write the application programs, and
integrate them on the chosen hardware platform.

Slide 10:

We have seen that the fundamental idea of the Waterfall Approach is to do one
thing after another. However, it is worth acknowledging that very few waterfall
projects adhere to the pure waterfall model. Even the most formal project will
benefit from some amount of feedback and rework based on that feedback. But the
principles behind the model remain: to perform certain processes in a certain order.
Historically, there have been serious problems with this approach.
What if the original requirements specified in the original analysis turn out to be
wrong in some way because:

the users have not communicated their needs satisfactorily
the users did not really understand their own needs
the analyst misunderstood
omission - a piece of vital information was not discussed when it should have
been

These problems may not become apparent until the users are looking at their new
database system. By now it will be very expensive to make a change. For
example, changing a database field length costs nothing at the analysis stage when
everything is on paper; however, once the system is implemented then that field
has to be changed physically on the database which will be much more costly.
Slide 11:

Alternative approaches have been developed, most of which involve the concept of
what is known as Iteration, which is going over some part of the development and
incorporating user feedback in order to get the requirements right. Usually this
involves some sort of prototyping.


Slide 12:

There are whole methodologies devoted to an alternative to the waterfall lifecycle
and using prototyping, for example the Dynamic Systems Development Method
(DSDM). Without going into this in too much detail, the point to stress here is the
basic concept of prototyping.
Prototype - a first or original example of something from which others have been or
will be developed.
Prototyping - the process whereby a model is built of part of the envisaged system.
Enhancements or amendments are discussed with the user which can then be
incorporated in the finished product.
This is basically what designers, architects and engineers have been doing for
years when they produce a model of something, because it is much easier to see a
problem when we have some idea of what the finished product will look like.
What we do is build a part of the system or demonstrate some aspect of the
system.

Slide 13:

The point to stress here is that although the focus will now be on the specifics of
database design, it should be considered that within the context of a larger
Information Technology project, this design stage might also serve as part of a
requirements gathering and verification process. This will involve the database
developer having some of the skills of the systems analyst - primarily good
communication skills.

Slide 14:

Database Design means moving from a set of requirements to implementing these
with database technology.

Slide 15:

The phases here suggest the sort of linear progression from one to another
characteristic of a traditional systems development life cycle. The outputs of one
phase would be the inputs of another. For example, the output of logical design
might be an Entity Relationship Model. This would become the basis for table
design in the Physical Design phase. More iterative approaches to development
would have the deliverables from previous phases revisited so that, for example,
prototyping at physical design stage might lead to revisions in the ER model that
was produced during logical design.
Discussions of database phases tend to present them in a linear fashion, but it is
worth noting that current trends in project management and development
methodology are more iterative.

Slide 16:

Conceptual Database Design. At this stage, data is investigated and design is
undertaken without regard either to the physical implementation OR the data model.
So the designers at this stage do not even assume they will be using the relational
model. The sorts of areas of investigation are: what data does the enterprise hold?
In what format is it? How is it used? How might the use of data affect how it is held?
What terminology do users have for their data?

Slide 17:

Logical Database Design constructs the model without regard for the particular
DBMS that will be used. However, the data model (e.g. the relational model) is
known. A key activity is normalisation.

Slide 18:

Physical Database Design - The move from entities to tables is one of the key
activities here involving what is known as designing the base relations. But this
phase also involves other activities, such as indexing, denormalisation, view
creation and query tuning.

Slide 19:

With regard to this module, the activities that are undertaken are those of logical
and physical design. The process of logical design (identifying entities,
normalisation) has been discussed earlier. Our concern in this topic is to examine
some of the aspects of physical design.

Slide 20:

There will usually be a one-to-one mapping of entities to tables. An entity in the
ERD will become a table in the database. The naming conventions have been that
an entity is singular and a table is plural, so the entity Student would become the
table Students.

Slide 21:

Decomposing many-to-many relationships. Where many-to-many relationships exist
in the ERD, these should be decomposed, typically into two one-to-many
relationships joined through a linking table.
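
One common way to decompose a many-to-many relationship is a linking table that holds the primary keys of both sides. The sketch below assumes SQLite via Python's sqlite3 module; the students, modules and enrolments tables and their contents are hypothetical and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    -- Hypothetical entities: a student takes many modules,
    -- and a module is taken by many students.
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE modules (module_id INTEGER PRIMARY KEY, title TEXT NOT NULL);

    -- The linking table turns one many-to-many relationship
    -- into two one-to-many relationships.
    CREATE TABLE enrolments (
        student_id INTEGER NOT NULL REFERENCES students (student_id),
        module_id INTEGER NOT NULL REFERENCES modules (module_id),
        PRIMARY KEY (student_id, module_id)
    );

    INSERT INTO students VALUES (1, 'Amina'), (2, 'Ben');
    INSERT INTO modules VALUES (100, 'Databases'), (200, 'Networks');
    INSERT INTO enrolments VALUES (1, 100), (1, 200), (2, 100);
""")

# Which modules does student 1 take? Join through the linking table.
titles = conn.execute("""
    SELECT m.title FROM modules m, enrolments e
    WHERE m.module_id = e.module_id AND e.student_id = 1
    ORDER BY m.title
""").fetchall()

print(titles)
```

The composite primary key on the linking table stops the same student being enrolled on the same module twice.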

Slide 22:

Representing domains as separate tables. Some domains should be implemented
as separate tables known as look-up tables. The number of values and how
dynamic the data is are issues here. If there are a large number of values in a
domain and it is likely that these will be added to or updated, then it is likely that
these should be implemented as a separate table.

Slide 23:

This slide covers documenting domains and base relations in a data dictionary,
which should include the name of the relation, a list of the attributes, the primary
key and any foreign keys. For each attribute, a number of different aspects should
be listed, starting with the domain. The domain might be a specifically defined
domain or simply the data type, length, any constraint and any default value. It
should also be noted whether the field is mandatory or whether it can be null.
Constraints will be covered in a coming topic.
Example of data dictionary for base relation Students and its associated domains:
Domains
Domain StudentType varchar, length 30, must be 'Overseas' or 'Home'
Domain City varchar, length 30
Base Relations
Students(
StudentID Number NOT NULL,
Address_line1 Varchar (30) NOT NULL,
Address_line2 Varchar (30) NOT NULL,
City City,
StudentType StudentType NOT NULL DEFAULT 'Home',
PRIMARY KEY (StudentID),
FOREIGN KEY (City) REFERENCES City (City_name));
In this case, StudentType is a simple domain with two values and would be
implemented by a constraint on the table. City as a domain is enforced by having a
separate table with valid cities in it.
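
As a rough illustration of how this data dictionary entry might translate into tables, the sketch below uses SQLite via Python's sqlite3 module. The CHECK constraint for StudentType and the sample rows are assumptions made for the example; the course DBMS may express domains differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    -- City domain implemented as a look-up table.
    CREATE TABLE City (City_name VARCHAR(30) PRIMARY KEY);

    -- StudentType domain implemented as a constraint on the table.
    CREATE TABLE Students (
        StudentID INTEGER NOT NULL,
        Address_line1 VARCHAR(30) NOT NULL,
        Address_line2 VARCHAR(30) NOT NULL,
        City VARCHAR(30),
        StudentType VARCHAR(30) NOT NULL DEFAULT 'Home'
            CHECK (StudentType IN ('Overseas', 'Home')),
        PRIMARY KEY (StudentID),
        FOREIGN KEY (City) REFERENCES City (City_name)
    );

    -- Invented sample rows.
    INSERT INTO City VALUES ('Manchester');
    INSERT INTO Students (StudentID, Address_line1, Address_line2, City)
        VALUES (1, '1 High Street', 'Didsbury', 'Manchester');
""")

def rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

default_type = conn.execute(
    "SELECT StudentType FROM Students WHERE StudentID = 1").fetchone()[0]
bad_type = rejected(
    "INSERT INTO Students VALUES (2, 'a', 'b', 'Manchester', 'Visiting')")
bad_city = rejected(
    "INSERT INTO Students VALUES (3, 'a', 'b', 'Atlantis', 'Home')")

print(default_type, bad_type, bad_city)
```

The first student picks up the default type of Home, while a student type outside the domain or a city missing from the look-up table is refused.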

Slide 24:

Documenting large database developments can be a daunting task. Software has
been developed to aid with this. Computer Aided Software Engineering (CASE)
tools enable developers to store entity definitions, domain definitions etc. From
these, the base relations can automatically be generated. The CASE tool can also
generate the scripts to create the physical database.


9.5

Laboratory Sessions

The laboratory time allocation for this topic is 2 hours.


Lecturer's Notes:
Students have copies of the laboratory exercises in the Student Guide. Answers are not provided in
their guide.

9.5.1 Grouping
In the previous laboratory session we looked at aggregation. You were asked to find the minimum,
maximum, average and the sum of the ages of all the packers in the workers table.
The suggested solution was like this:
Select min(age), max(age), avg(age), sum(age)
From workers
Where job_title = 'Packer';
But what if we want to provide a query that shows us the maximum age for each of the different
types of workers? SQL provides a group by clause that allows us to do this.
Select job_title, max(age)
From Workers
Group by job_title;
Exercise 1
Run the above query and study the results.
Exercise 2
Write a query that finds the average age for the employees in Cairo. Group this by job_title.
Suggested Solution:
Select job_title, avg(age)
From Workers w, Departments d
Where w.dept_no = d.dept_no
And d.location = 'Cairo'
Group by job_title;


Exercise 3
Write a query that shows the age of the oldest worker in each department. Group this by the
dept_no. You do not have to show the department name.
Suggested Solution:
Select dept_no, max(age)
From Workers
Group by dept_no;
SQL also provides the ability to place a selection condition on the group by clause. This is the
Having clause. This example shows the above query modified so that only those departments with
a maximum age above 35 are shown:
Select dept_no, max(age)
From Workers
Group by dept_no
Having max(age) > 35;
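
The effect of the Having clause can be checked with a small sketch, assuming SQLite via Python's sqlite3 module and invented rows. The same grouped query is run with and without the Having clause so that the filtered-out group is visible.

```python
import sqlite3

# Invented sample data: two departments with two workers each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Workers (emp_no INTEGER PRIMARY KEY, dept_no INTEGER, age INTEGER);
    INSERT INTO Workers VALUES (1, 10, 28), (2, 10, 41), (3, 20, 30), (4, 20, 33);
""")

# One row per department, no filtering of groups.
all_groups = conn.execute("""
    SELECT dept_no, MAX(age) FROM Workers
    GROUP BY dept_no
    ORDER BY dept_no
""").fetchall()

# Having keeps only the groups whose maximum age is above 35.
filtered = conn.execute("""
    SELECT dept_no, MAX(age) FROM Workers
    GROUP BY dept_no
    HAVING MAX(age) > 35
    ORDER BY dept_no
""").fetchall()

print(all_groups)
print(filtered)
```

Department 20 has a maximum age of 33, so it appears in the first result but not in the second.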
Exercise 4
Find the departments that have an average age of over 40. You do not need to show the department
name.
Suggested Solution:
Select dept_no, avg(age)
From Workers
Group by dept_no
Having avg(age) > 40;
Exercise 5
Find the maximum age, the minimum age, the average age and the job title for those jobs where the
average age is above 35. Group this by the job title.
Suggested Solution:
Select job_title, max(age), min(age), avg(age)
From Workers
Group by job_title
Having avg(age) > 35;
Exercise 6
As part of the laboratory exercises in Topic 5, you created two tables that kept personal information
about yourself and the qualifications that you have. In Topic 7 you should have added some new
rows about your friends and their qualifications to these tables.


Now use the aggregate functions from Topic 8's Laboratory Session and the Group By clause from
Topic 9's Laboratory Session to create a set of useful queries using these tables.
Suggested Solution:
This will depend on the tables the students created. A suitable field to operate on would be age. If
there are no suitable fields, students should add them using the modify table statements that they
learnt in Topic 4's Laboratory Session.


9.6

Private Study

The time allocation for private study in this topic is expected to be 7.5 hours.
Lecturer's Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide.
You should also allow time during the laboratory sessions to check that students are working on
their assignments and answer any general questions on the expected scope of the work. You may
also wish to remind them of the submission deadline and documentation requirements.
This topic's private study time involves practising some database design based on elaboration of a
previous private study exercise.
Exercise 1
In Topic 4, you were asked to draw an ERD for a boat rental system. The requirements were the
following:

You should be able to record that a boat is rented to a customer for a set period.
Any damage to the boat is recorded against the particular rental.
A boat should have a name.
All boats are of the same type (yacht).
Damage is classified as being hull, interior or other.

Using the ERD for this system, produce a data dictionary specifying the base relations (tables),
attributes and domains. The data dictionary should be in the format given in the lecture.
Suggested Answer:
[ERD showing the entities BOAT, RENTAL, CUSTOMER and DAMAGE, with multiplicities 0..N, 0..N, 1 and 0..N on the relationships]

Domain DamageType varchar, length 10, must be one of 'Hull', 'Interior' or 'Other';

Base Relation
Boat(
BoatID number NOT NULL,
BoatName varchar 30 NOT NULL,
Primary Key (BoatID));
Base Relation
Customer(
CustomerID number NOT NULL,
CustomerName varchar 30 NOT NULL,
CustomerAddress varchar 60 NOT NULL,
Primary Key (CustomerID));
Base Relation
Rental(
BoatID number NOT NULL,
CustomerID number NOT NULL,
RentalStartDate date NOT NULL,
RentalEndDate date NOT NULL,
Primary Key (BoatID, CustomerID, RentalStartDate),
Foreign Key (BoatID) REFERENCES Boat (BoatID),
Foreign Key (CustomerID) REFERENCES Customer(CustomerID));
Base Relation
Damage(
BoatID number NOT NULL,
CustomerID number NOT NULL,
RentalStartDate date NOT NULL,
DamageType DamageType);
Exercise 2
Find some examples of CASE tools online. What are their features? For how much of the database
development process do they cater? What might be their disadvantages?
Prepare a brief written discussion for the tutorial.
Suggested Answer:
This will depend on the CASE tool chosen. Oracle Designer, for example, covers logical design and
physical design, and there are tools for modelling entities, tables, processes and functions. There
are code generators for the database structures and for applications, such as forms. Students
should be encouraged to give an outline of the tool they have investigated.
Exercise 3
Investigate a systems development methodology such as SSADM. Each stage or step has what is
known as a set of deliverables. These are the outcomes of that stage which will form the basis of
work in the next stage.
What are the deliverables from analysis, design and implementation stages for the methodology that
you have investigated?


Suggested Answer:
This will depend on the development methodology investigated, but typically it would be something
along these lines:
Stage               Deliverable
1. Analysis         Analysis report that might include various sorts of diagram, but requires an ER diagram and a list of requirements
2. Design           Design specification, including physical data design (tables that will exist in the database) and module design (the applications that will use the database)
3. Implementation   Data loaded, testing and training complete

Exercise 4
Review the content of this topic and conduct any further reading you need to undertake in order to
ensure that you understand the material. You should make note of anything that you still feel
requires further clarification and bring your questions to the tutorial for this topic.


9.7

Tutorial Notes

The time allowance for tutorials in this topic is 2 hours.


Lecturer's Notes:
Students have copies of the tutorial activities in the Student Guide. Answers are not provided in their
guide.
Students can work in small groups to complete the exercises. You should then run a whole class
feedback session to discuss the students' answers.

Exercise 1:

Review of Private Study Exercises

In small groups, discuss your findings from the Private Study Exercises, asking your tutor for
clarification when needed.
Exercise 2
Answer the following questions, which relate to approaches to development.
1. What is the difference between analysis and design?
2. Why is the traditional systems development approach called the Waterfall Model?
3. What stages in a traditional waterfall lifecycle do you think overlap with the conceptual,
logical and physical stages of database design?
4. What is prototyping and what are its advantages?
Suggested Answers:
1. Analysis is the investigation to understand the requirements of the business and the user.
The emphasis is on WHAT the system should deliver rather than HOW it should be
delivered.
Design is translating the requirements from analysis into a systems design which details how
they will be satisfied.
2. We divide the development process into a series of phases or stages and each finishes
before the next one starts. This is often viewed as a cascade of steps, hence the waterfall.
(You could also draw an analogy with water flow and data flow from one step to the next in one
direction.)
3. The stages in a traditional waterfall lifecycle are analysis, design and implementation. We
can assume some of the analysis has been done given that the outcomes of analysis would
be the background under which the conceptual design would begin. Logical and physical
design would fall within the design stage of the waterfall lifecycle. Those parts of physical
design which involve creating objects within the database might be classified as part of the
implementation in a waterfall approach.

4. Prototyping is the process whereby a model is built of part of the envisaged system.
Enhancements or amendments are discussed with the user which can then be incorporated
in the finished product.
You can obtain feedback from the user and use this to amend the original requirements,
analysis and design. Prototyping helps the analyst and user to communicate and it can save
time and money to deliver what the user wants.
Key aspects of prototyping:

Build the appropriate prototype


Obtain feedback
Iteration

Exercise 3
Outline the difference between Conceptual, Logical and Physical Design.
Suggested Answer:
Conceptual design is the initial investigation of the data that is needed to support a system. It does
not take into account either a particular data model or the implementation environment.
Logical design involves investigation of the data using the tools of a particular data model, such as
the relational model, but is still independent of the implementation environment.
Physical Design takes into account the chosen DBMS product and the physical structure and
implementation of the database.


Topic 10: Supporting Transactions
10.1 Learning Objectives
This topic provides an overview of supporting transactions.
On completion of the topic, students will be able to:

Identify transactions;
Understand business rules and their implications;
Recognise potential performance issues;
Identify the potential need for de-normalisation.

10.2 Pedagogic Approach


Information will be transmitted to the students during the lectures. They will then practise the skills
during the tutorial and seminar sessions.

10.3 Timings
Lectures: 2 hours
Laboratory Sessions: 2 hours
Private Study: 7.5 hours
Tutorials: 2 hours


10.4 Lecture Notes


The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:

Business Rules
Identifying and documenting transactions
Views and de-normalisation

10.4.1 Guidance on the Use of the Slides


Slides 4-5:

This topic will use the example of the boat rental system specified in the Topic 9
Private Study exercise.

Slide 6:

The suggested answer shown on the slide specifies the basis for the system in
terms of the relational model. It also embodies essential business rules of the
organisation; for example, the original scenario states that any damage is stored
against the rental, so the entity damage is related to the rental entity. This may
seem counter-intuitive since it is boats that get damaged; however, by storing it
against the rental, the database tells us who was renting the boat when it was
damaged, during what time it was rented, as well as which boat was damaged.
In this case, the business rule is built into the structure of the database. Other
business rules might be embodied in a constraint of the sort created on the Workers
table in the Topic 9 laboratory session (that workers had to be 70 or younger). One
could imagine a business rule on this database that specified a maximum time for a
rental period. This would be enforced with a check constraint. The other rule that is
specified here is for the domain of damage type.
Identifying business rules will form part of the requirements gathering process.

Slide 7:

A transaction is an operation carried out on the database. Transactions can
generally be identified as retrievals, inserts, updates and deletes. This is sometimes
usefully remembered by the acronym CRUD (Create, Retrieve, Update and Delete).

Slide 8:

In order to design a database that meets the user requirements, the transactions
that will take place have to be identified. There is both qualitative information
about them (what they are, what they do), and quantitative information (how often
they run, how many rows they will affect).

Slide 9:

Connolly and Begg provide a useful summary of what to look at when identifying
transactions and beginning to analyse the effect they will have on the design of the
database, the applications and the performance of the database when running
queries.

Transactions that run often and could have an impact on performance

Transactions critical to the running of the business

Transactions that take place in busy times (alongside other similar
transactions), creating a peak time.


Slide 10:

To investigate transactions, the following needs to be carried out:

Trace all transactions to the relations that they use or affect. This will mean
thinking about which tables are written to, read from, etc.

Determine if some relations are frequently used by many transactions. Cross
reference transactions with each other so that frequently used relations can be
identified.

Analyse how the data is used by a given transaction. This will be discussed
below by looking at CRUD matrices, but it is worth noting that this could go to
quite a detailed level, looking at each attribute on a table.

Note: All these investigations can be done much more efficiently with the use of a
CASE tool that allows cross referencing between transactions and relations.
Slide 11:

This slide shows the requirements for transactions for the boat hire system:
a. Enter the details of all the boats. Update any details for boats. Delete
boats.
b. Enter the details for customers. Update any details for customers.
c. Enter the details for hiring of boats.
d. Enter the details for any damage to boats.
e. List the details of all the boats.
f. List the details of all the customers and their hire, for which boats.
g. List the details for damage, to which boats, during which hire periods and for
which customers.
h. Provide a summary of the hires for a particular period.

Note: these requirements would be known as a result of the requirements gathering
process as discussed under Database Development. Users themselves would not
necessarily articulate their requirements in this way, but it would be up to the
analyst to break them down so that they refer to tables in the database. The above
list of transactions is a typical example of the sort of transactions that would be
necessary in such a system: there are inserts of core data and the sort of queries
that would be needed for useful business reports.
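The cross-referencing described under Slide 10 can be sketched for these eight transactions. The Python fragment below records, for each transaction, the relations it touches and then inverts the mapping to show which relations are used most often. The CRUD entries are an assumption based on one plausible reading of requirements a to h (for example, that entering a hire only creates a Rental row); they are not taken from the slides.

```python
from collections import defaultdict

# Transaction -> {relation: CRUD operations}. Assumed entries, derived from
# the wording of requirements a-h rather than from the slide.
crud = {
    "a": {"Boat": "CUD"},
    "b": {"Customer": "CU"},
    "c": {"Rental": "C"},
    "d": {"Damage": "C"},
    "e": {"Boat": "R"},
    "f": {"Customer": "R", "Rental": "R", "Boat": "R"},
    "g": {"Damage": "R", "Boat": "R", "Rental": "R", "Customer": "R"},
    "h": {"Rental": "R"},
}

# Invert the mapping: relation -> transactions that use it.
usage = defaultdict(list)
for transaction, relations in crud.items():
    for relation in relations:
        usage[relation].append(transaction)

# Print the relations, most heavily used first.
for relation, transactions in sorted(usage.items(), key=lambda kv: -len(kv[1])):
    print(f"{relation:<9} used by {len(transactions)} transactions: "
          f"{', '.join(transactions)}")
```

A CASE tool would normally hold this information, but even a small table like this makes the frequently used relations easy to spot.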
Slide 12:

Activity: Examine the blank CRUD matrix presented on this slide, copy it and fill in
the relevant operations in the transactions.

Slide 13:

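This slide shows the completed CRUD (or IRUD) matrix for the transactions.

The matrix lists the relations Boat, Customer, Damage and Rental against the
transactions (A onwards), with the CRUD operations each transaction performs on
each relation; for example, transaction A is CUD on Boat.

Slide 14: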

Some transactions may affect some attributes, but not another. For example,
updating a customers details to change their address would only affect the address
attributes and not name or ID.


Slide 15:

When analysing transactions, the type of access (CRUD) is already identified. We
must, however, identify the following:

Does the transaction involve any predicates (specific conditions in a where
clause)? For example if, as in transaction H on Slide 11, we wanted all the hire
data for a particular period, then the where clause would contain a range
search between two dates that were input.

For retrievals which will eventually become SQL queries, are there any table
joins?

Which attributes are involved? As mentioned in Slide 14, it might not be all of
them.

How often does a transaction run? How many times per day or per week? Are
there any peak periods? For example, the boat hire company might do more
business at weekends or during holiday periods. It is important to note this
because it might affect performance. Will the transactions perform the same
during periods of high activity?

Slide 16:

This slide presents an example of a transaction analysis form, which specifies
details of transactions. It shows the section of ER affected and the number of tuples
(in this case 50).
Below is a copy of the transaction analysis form as presented on Slide 16.

Transaction Analysis Form
1st Jan 2010
Transaction (e) List the details of all boats
Transaction volume
Average 1 per day
Peak 100 times per day during production of promotional literature (June)
Select boatID, boatName
From Boats
Boats (50)

Access
Entity    Type of Access    Average Number    Peak Number
Boats     R                 1 * 50            100 * 50

Slide 17:

The term Performance is generally used by database professionals to refer to the
way in which a query or other Data Manipulation statement behaves when run
against a database. Usually, this is measured in the amount of time it takes to
perform the operation. Put simply, a query that takes three seconds to return is
better performing than one that takes twenty seconds. However, performance
would have to take into account other factors, such as the number of rows in the
database. Performance can be improved by creating indexes on the attributes that
will be used a lot in retrieval of data and on those that are used most often in joining
tables. Be aware that updates to attributes that are often updated will be slowed down
by an index, because the index has to be maintained as well as the table.
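The effect of an index on a retrieval can be seen by asking the DBMS how it intends to run a query. The sketch below uses SQLite (through Python's standard sqlite3 module) rather than Oracle, and a simplified, hypothetical Rental table with made-up rows. It compares the query plan for a date-range search before and after an index is created on RentalStartDate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical Rental table (single-column key for brevity).
conn.execute(
    "CREATE TABLE Rental ("
    " RentalID INTEGER PRIMARY KEY,"
    " BoatID INTEGER NOT NULL,"
    " RentalStartDate TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO Rental (BoatID, RentalStartDate) VALUES (?, ?)",
    [(n % 50, f"2010-{1 + n % 12:02d}-{1 + n % 28:02d}") for n in range(1000)],
)

query = ("SELECT * FROM Rental "
         "WHERE RentalStartDate BETWEEN '2010-06-01' AND '2010-06-30'")

def plan(sql):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

plan_before = plan(query)  # no index yet: every row has to be scanned
conn.execute("CREATE INDEX idx_rental_start ON Rental (RentalStartDate)")
plan_after = plan(query)   # the range search can now use the index

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan differs between SQLite versions, and other products have their own tools for this, so the output should be read as an illustration only.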
Slide 18:

Note that a lot of the performance related issues presented on Slide 18 pertain to
databases with tables that have a large number of rows. Such tables are
increasingly common as databases become large.
The issue of analysing transactions and their behaviour in databases of different
sizes is known as scalability. Professionals have to investigate whether a
database is scalable. This means asking whether a database and its transactions
behave in a comparable way when implemented at quite a small scale (with very few
rows) and when implemented with a much larger number of rows.

Slide 19:

How the transactions will be manifest as part of a deployed system will depend on
the set of applications that are built on top of it. This will depend on the type of
system it is. A database can be accessed by web-based forms, by queries
embedded in a website, by forms applications built in a language like Visual Basic
or Oracle Forms. Queries might be run from an SQL prompt, but are more likely to
be embedded inside some other application.
Much specialised database development also involves the development of
applications.

Slide 20:

Not every user of a database has the same role. Within any organisation and
among the population of people in it that are users of the database, there will be
different levels in terms of hierarchy as well as people doing different jobs. These
differing roles will require access to different information. There is also the issue of
data protection with many countries having legislation which means that data
should only be accessed by those who have a genuine need to use it. Some people
will need to just see data, whereas others will have a need to insert, update or
delete data.

Slide 21:

In the boat hire system, we might define two types of users: the manager who
should be able to have full access to the entire database; the administrative
assistant who is hired for the holiday period to insert the rentals, add any new
customers and record any damage to the boats. They would have different
privileges on the database.
We can define this using a similar CRUD technique, but here we are recording
access rather than transactions.
Table/User         Boat    Customer    Rental    Damage
Manager            CRUD    CRUD        CRUD      CRUD
Admin Assistant            CRU         CRU       CRU

Slide 22:

SQL has various facilities to manage the way different users are granted access to
different parts of the database. These facilities can enforce the roles and access
rights that have been defined for a database.

Slide 23:

The grant facility gives access to a particular object, e.g. a view or a table created
by a different user.


Grant insert on Boat to Admin: this command will give the role of Admin the right
to create (insert) data on the table Boat.
Grant all on Boat to Manager: this command will give the role of Manager the right
to carry out any operation on the table Boat.
Slide 24:

The revoke facility removes a user's access to a particular object.

Revoke all on Boat from Admin: this command will take away any access rights
from the role of Admin on the table Boat.
Revoke delete on Boat from Manager: this command will take away the right to
delete data from the Boat table by the Manager.
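One way to connect the access matrix from Slide 21 with the grant facility is to generate the statements from the matrix. The fragment below is a minimal sketch in Python. The role names, the Admin entries and the mapping of Create, Retrieve, Update and Delete to the INSERT, SELECT, UPDATE and DELETE privileges are assumptions made for illustration. It only prints the statements and does not run them against a database.

```python
# Map each CRUD letter to the SQL privilege that grants it.
PRIVILEGES = {"C": "INSERT", "R": "SELECT", "U": "UPDATE", "D": "DELETE"}

# Access matrix: role -> {table: CRUD letters}. The role names and the
# Admin entries are assumptions for illustration.
access = {
    "Manager": {"Boat": "CRUD", "Customer": "CRUD",
                "Rental": "CRUD", "Damage": "CRUD"},
    "Admin": {"Customer": "CRU", "Rental": "CRU", "Damage": "CRU"},
}

def grant_statements(matrix):
    """Build one GRANT statement per (role, table) pair in the matrix."""
    statements = []
    for role, tables in matrix.items():
        for table, letters in tables.items():
            privileges = ", ".join(PRIVILEGES[letter] for letter in letters)
            statements.append(f"GRANT {privileges} ON {table} TO {role};")
    return statements

statements = grant_statements(access)
for statement in statements:
    print(statement)
```

Keeping the matrix as the single source means that a change of role only has to be recorded once before the statements are regenerated.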

Slide 25:

Normalising our data model means we will have the minimum amount of
redundancy. However, this can have an effect on performance. If we are running a
query that joins tables, this will be slower than running a query against a single
table or view.
Denormalisation can be done by including an attribute in a table that should not be
there according to the rules of normalisation, e.g. the name of the boat on the Hire
entity.

Slide 26:

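This slide presents the use of views as a way to improve performance. A view is a
virtual table. The students will have encountered the creation of views in the SQL
laboratory sessions. Views can be used to combine tables, so that instead of joining
tables in a query, the query will just access the view. Note that in most DBMS
products a standard view stores only the query definition, so the join is still carried
out when the view is accessed; the gain is mainly in simpler queries unless the results
are stored (for example, in a materialised view).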

Slide 27:

In the boat hire system, we could create a view where there is a query that
combines lots of tables. For example, transaction G - List the details for damage: to
which boats, during which hire periods and for which customers (presented on Slide
11) involves joining all the tables in the database. For part of the private study, you
will need to write the SQL for creating a view for this transaction that will contain all
the relevant data.


10.5 Laboratory Sessions


The laboratory time allocation for this topic is 2 hours.
Lecturers' Notes:
Students have copies of the laboratory exercises in the Student Guide. Answers are not provided in
their guide.

10.5.1 Nested Sub-queries


When we need to find the result of some complex enquiry on our database, we can put one query
inside another. This is known as nesting.
Consider the following example:
Select d.department_name, d.location
From departments d, workers w
Where d.dept_no = w.dept_no
And w.age =
(select max(w2.age)
From workers w2);

10.5.2 Use of the Workers, Departments and Job_types Tables


Exercise 1
Run the above query. What information is it telling us?
Suggested Solution:
The query gives us the name and the location of the departments that have workers whose
age is the maximum age of all our workers.
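If the laboratory database is not to hand, the behaviour of the nested query can be checked on a small in-memory copy. The sketch below uses SQLite through Python's sqlite3 module. The table definitions are assumed minimal versions of the laboratory tables and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal versions of the laboratory tables, with made-up rows.
conn.executescript("""
CREATE TABLE departments (dept_no INTEGER PRIMARY KEY,
                          department_name TEXT, location TEXT);
CREATE TABLE workers (worker_no INTEGER PRIMARY KEY, first_name TEXT,
                      age INTEGER, dept_no INTEGER REFERENCES departments);
INSERT INTO departments VALUES (1, 'Sales', 'London'), (2, 'IT', 'Leeds');
INSERT INTO workers VALUES (1, 'Ann', 64, 1), (2, 'Bob', 41, 1),
                           (3, 'Cal', 64, 2), (4, 'Dee', 29, 2);
""")

# The inner query yields a single value (the maximum age); the outer query
# then keeps only the workers of that age and reports their departments.
rows = conn.execute("""
    SELECT d.department_name, d.location
    FROM departments d, workers w
    WHERE d.dept_no = w.dept_no
      AND w.age = (SELECT MAX(w2.age) FROM workers w2)
    ORDER BY d.department_name
""").fetchall()
print(rows)
```

Because two workers share the maximum age in this sample data, two departments are returned, which is a useful point to raise with students.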
Exercise 2
Modify the above query to select the department and its location that has the youngest manager.
Suggested Solution:
Select d.department_name, d.location
From departments d, workers w
Where d.dept_no = w.dept_no
And w.job_title = 'Manager'
And w.age =
(select min(w2.age)
From workers w2
Where w2.job_title = 'Manager');


Exercise 3
Write a query using a nested sub-query to find those department IDs where the average age of the
workers is less than the average age for all the workers in the company.
Suggested Solution:
Select d.dept_no
From departments d, workers w
Where d.dept_no = w.dept_no
And w.age < (select avg(age) from workers);
Exercise 4
Note that the result above produces multiple repeating values. Use the Group By clause after the
closing brackets to group by department ID.
Suggested Solution:
Select d.dept_no
From departments d, workers w
Where d.dept_no = w.dept_no
And w.age < (select avg(age) from workers)
Group by d.dept_no;
Exercise 5
Make this query more user-friendly by changing the department ID to the department name.
Suggested Solution:
Select d.department_name
From departments d, workers w
Where d.dept_no = w.dept_no
And w.age < (select avg(age) from workers)
Group by d.department_name;
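Note that the solutions to Exercises 3 to 5 compare each individual worker's age with the company average, so a department is listed if it has at least one worker younger than that average. If the intention is to compare each department's own average age, one option is to group first and filter with a Having clause. The sketch below, using SQLite through Python's sqlite3 module with assumed table definitions and made-up rows, shows the difference between the two readings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (dept_no INTEGER PRIMARY KEY, department_name TEXT);
CREATE TABLE workers (worker_no INTEGER PRIMARY KEY, age INTEGER,
                      dept_no INTEGER REFERENCES departments);
INSERT INTO departments VALUES (1, 'Sales'), (2, 'IT');
-- Company average age is 40; Sales averages 50, IT averages 30.
INSERT INTO workers VALUES (1, 65, 1), (2, 35, 1), (3, 30, 2), (4, 30, 2);
""")

# Exercise 5 style: one worker below the company average is enough.
per_worker = conn.execute("""
    SELECT d.department_name FROM departments d, workers w
    WHERE d.dept_no = w.dept_no
      AND w.age < (SELECT AVG(age) FROM workers)
    GROUP BY d.department_name ORDER BY d.department_name
""").fetchall()

# Having variant: each department's own average against the company average.
per_department = conn.execute("""
    SELECT d.department_name FROM departments d, workers w
    WHERE d.dept_no = w.dept_no
    GROUP BY d.department_name
    HAVING AVG(w.age) < (SELECT AVG(age) FROM workers)
    ORDER BY d.department_name
""").fetchall()
print(per_worker, per_department)
```

With this data, Sales appears in the first result because of its one younger worker, but not in the second, since its departmental average is above the company average.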
Exercise 6
Using a nested sub-query, get the first names of all those workers who have the maximum age for
the whole company. Remember to use Group By if you are getting repeating values.
Suggested Solution:
Select w.first_name
From workers w
Where w.age = (select max(age) from workers)
Group by w.first_name;


10.5.3 Using the Personal_details, Qualifications and Qualification_types Tables


The following exercises use the Personal_details, qualifications and qualification_types tables.
Exercise 7
Show the first name and last name of all the people who have qualifications at level 3 and are older
than the average age.
Suggested Solution:
Select p.first_name, p.last_name
From personal_details p, qualifications q, qualification_types qt
Where p.personal_no = q.personal_no
And q.qualification = qt.qualification
And qt.level = 3
And p.age > (select avg(age) from personal_details);
Exercise 8
Show a count of those people who have got a higher grade than the average for their level 2
qualification.
Suggested Solution:
Select count(p.personal_no)
From personal_details p, qualifications q, qualification_types qt
Where p.personal_no = q.personal_no
And q.qualification = qt.qualification
And qt.level = 2
And q.grade > (select avg(grade) from qualifications);


10.6 Private Study


The time allocation for private study in this topic is expected to be 7.5 hours.
Lecturers' Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide.

Exercise 1
Write the SQL for creating the view for the following transaction:
List the details for damage: to which boats, during which hire periods and for which customers.
Suggested Answer:
Create View DamageDetails as
Select b.BoatName, c.Customer, r.RentalStartDate, r.RentalEndDate, d.DamageType
From Boat b, Customer c, Rental r, Damage d
Where r.BoatID = b.BoatID
And r.CustomerID = c.CustomerID
And d.CustomerID = r.CustomerID
And d.BoatID = r.BoatID
And d.RentalStartDate = r.RentalStartDate;
Exercise 2
Explain what you think the purpose and effect of this view would be for the Boat Hire system.
Suggested Answer:
The purpose is to group all the information about the damage that the users need to see in one
convenient place; in this case it will be stored in a view. The effect should be to speed up retrievals
of damage information. Without the view, each time this information is needed, the whole query
would have to be written out. This query joins all four tables and, as can be seen from the length of
the join statement, means that in a database with lots of rows, there could be a significant effect on
the performance of the query. This would be manifested in the time it took for the query to come
back with a result.
If the view is set up however, then a simple select from the view would give all the data needed.
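The view and a simple select from it can be tried out on a small in-memory database. The sketch below uses SQLite through Python's sqlite3 module with cut-down versions of the four tables. The view name DamageDetails, the CustomerName column and the sample rows are assumptions for illustration. In SQLite, as in most products, a standard view stores the query rather than the data, so the main benefit shown here is the simpler select.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down boat hire tables; the view name, the CustomerName column and
# the rows are assumptions made for this illustration.
conn.executescript("""
CREATE TABLE Boat (BoatID INTEGER PRIMARY KEY, BoatName TEXT);
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE Rental (BoatID INTEGER, CustomerID INTEGER,
                     RentalStartDate TEXT, RentalEndDate TEXT,
                     PRIMARY KEY (BoatID, CustomerID, RentalStartDate));
CREATE TABLE Damage (BoatID INTEGER, CustomerID INTEGER,
                     RentalStartDate TEXT, DamageType TEXT);
INSERT INTO Boat VALUES (1, 'Kingfisher'), (2, 'Heron');
INSERT INTO Customer VALUES (10, 'A. Patel');
INSERT INTO Rental VALUES (1, 10, '2010-06-05', '2010-06-06'),
                          (2, 10, '2010-07-01', '2010-07-02');
INSERT INTO Damage VALUES (1, 10, '2010-06-05', 'Scratched hull');

CREATE VIEW DamageDetails AS
SELECT b.BoatName, c.CustomerName, r.RentalStartDate, r.RentalEndDate,
       d.DamageType
FROM Boat b, Customer c, Rental r, Damage d
WHERE r.BoatID = b.BoatID
  AND r.CustomerID = c.CustomerID
  AND d.CustomerID = r.CustomerID
  AND d.BoatID = r.BoatID
  AND d.RentalStartDate = r.RentalStartDate;
""")

# Users now query the view as if it were a single table.
rows = conn.execute("SELECT * FROM DamageDetails").fetchall()
print(rows)
```

Only the rental that has a damage record appears in the result, because the view joins Damage to Rental on all three key attributes.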


Exercise 3
Use online resources to look for jobs advertised for database development work. What sorts of skills
are being required for the jobs? What software is involved?
Suggested Answer:
There should be a discussion about the sort of skills and software that is required during the tutorial
session. Different database vendors should be highlighted. The development of application software
that accesses databases (PHP, VB.net, Oracle Forms etc.) should be pointed out.
Exercise 4
One of the definitions of a transaction is that it should possess four basic properties, usually
remembered by the abbreviation ACID:

Atomicity
Consistency
Isolation
Durability

Research and give a definition for each of these properties.


Suggested Answer:
Atomicity - This is the property that defines a transaction as an indivisible unit in the sense that
either the whole transaction must occur or no part of it must occur. Although a transaction could be
made up of a number of operations, for the transaction to be atomic ALL operations must be carried
out or none of them at all. An example of this is buying something using a credit card online. The
whole transaction would involve debiting the card of the customer (operation 1) and then placing the
order for the item (operation 2). Both operations must be carried out or we might get a situation
where the customer would be debited, but no goods ordered.
Consistency - A transaction must not leave the database in an inconsistent state. Therefore all
constraints should be followed including enterprise constraints that enforce business rules. For
example, in our Boat Hire system, if it was a rule that any Rental of a Boat must have both a start
date and end date, then a transaction that created a new Rental record without the end
date would not be consistent.
Isolation - Transactions should not interfere with other transactions. For example, in a banking
system, updating a current account with a deposit should not interfere with debiting that same
account for a payment. Otherwise it could leave the database in an inconsistent state. Ensuring that
this isolation occurs is the responsibility of the concurrency control subsystem in the database.
Durability - When a transaction has taken place, its effects must be lasting and not vulnerable
to being lost because of a subsequent system failure. The recovery subsystem is there to ensure
that this is possible.
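Atomicity in particular can be demonstrated in a few lines. The sketch below uses SQLite through Python's sqlite3 module to mimic the credit card example. The Card and Orders tables are hypothetical. When the second operation fails, the first is rolled back, so the card balance is unchanged by the failed purchase.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables for the credit card example.
conn.executescript("""
CREATE TABLE Card (CardID INTEGER PRIMARY KEY, Balance INTEGER NOT NULL);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, Item TEXT NOT NULL);
INSERT INTO Card VALUES (1, 100);
""")
conn.commit()

def buy(item, price):
    """Debit the card and place the order as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            # Operation 1: debit the card.
            conn.execute(
                "UPDATE Card SET Balance = Balance - ? WHERE CardID = 1",
                (price,))
            # Operation 2: place the order.
            conn.execute("INSERT INTO Orders (Item) VALUES (?)", (item,))
        return True
    except sqlite3.IntegrityError:
        return False

ok = buy("Life jacket", 30)  # both operations succeed
failed = buy(None, 30)       # operation 2 violates NOT NULL; operation 1 is undone

balance = conn.execute("SELECT Balance FROM Card").fetchone()[0]
orders = conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]
print(ok, failed, balance, orders)
```

The card is debited once only, for the purchase that also produced an order.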
Exercise 5
Review all the material from this topic and prepare any questions for the tutorial session.


10.7 Tutorial Notes


The time allowance for tutorials in this topic is 2 hours.
Lecturers' Notes:
Students have copies of the tutorial activities in the Student Guide. Answers are not provided in their
guide.
Students can work in small groups to complete the exercises. You should then run a whole class
feedback session to discuss the students' answers.

Exercise 1:

Review of Private Study Exercises

In groups, review the work you carried out during your private study.
Exercise 2
Give an explanation as to your understanding of a business rule. Using a system you are familiar
with, from an example in the course materials or through personal experience, specify business
rules that apply to that system.
Suggested Answer:
Definition of a business rule: A rule, procedure or way of doing something that applies to a particular
business.
The examples the students give should demonstrate that they understand that business rules are
something that is in addition to the normal integrity rules of relational databases (although a
business rule could be enforced with an integrity constraint.) For example, a rule that all boats must
be blue, red or green is something particular to a business that would have to be enforced, e.g. with
a domain.
Exercise 3
A student record system consists of three tables: Student, Module and StudentModule. The ER
diagram is shown below:

Student 1 ---- 0..* StudentModule 0..* ---- 1 Module

Data dictionary

Student
StudentID (PK)
StudentFirstName
StudentLastName
StudentAddress
StudentAge

StudentModule
StudentID (PK)(FK)
ModuleCode (PK)(FK)
Semester
Year
Result

Module
ModuleCode (PK)
ModuleName

Complete a CRUD matrix for the following transactions:
a. Insert a new student.
b. List a student's personal details and results for all of the modules they have taken. Include
the module name.
c. List details of all the modules.
d. Insert a new module.
e. Allocate a student to a module.
f. Assign a result to a student for a module.

Suggested Answer:
Transaction/Table    Student    StudentModule    Module


Exercise 4
A number of business rules have been defined for this student records system:
1. All students should have an enrolment date recorded for them and a final completion date.
All students should be deleted from the system three years after their completion date.
2. Students should be classified as being Home or Overseas.
3. Students should be allowed to retake a module that they fail.


Discuss how each of these business rules might be enforced on the system. This might require the
creation of new attributes or other database structures.
Suggested Solution:


1. Attributes for enrolment date and completion date would have to be added to the Student
table. There would have to be a transaction to delete students three years after their
completion date.
2. A new attribute of StudentType would have to be added to the Student table. The domain
for this (with values Home and Overseas) could be supported in a number of ways. This
could be the creation of a separate domain or a check constraint on the Student table.
3. The reason why students would NOT be able to retake a module at the moment is that the
Primary Key on StudentModule is the StudentID and the ModuleCode. If the student took
the module again, as things stand, then the Primary Key would be duplicated. Therefore the
change that would need to be made to make it possible for the student to retake the module
would be for some other attribute or attributes to become part of the primary key. This
might be Year and Semester.
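The point about the primary key in answer 3 can be checked directly. The sketch below uses SQLite through Python's sqlite3 module to create the StudentModule table twice, once with the original key and once with Year and Semester added, and tries to store a retake in each. The module code, dates and results are made up.

```python
import sqlite3

def can_retake(primary_key):
    """Return True if a second attempt at the same module can be stored."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE StudentModule ("
        " StudentID INTEGER, ModuleCode TEXT, Semester INTEGER,"
        " Year INTEGER, Result TEXT,"
        f" PRIMARY KEY ({primary_key}))"
    )
    insert = "INSERT INTO StudentModule VALUES (?, ?, ?, ?, ?)"
    conn.execute(insert, (1, "DB101", 1, 2010, "Fail"))
    try:
        conn.execute(insert, (1, "DB101", 1, 2011, "Pass"))  # the retake
        return True
    except sqlite3.IntegrityError:  # duplicate primary key
        return False
    finally:
        conn.close()

original_key = can_retake("StudentID, ModuleCode")
extended_key = can_retake("StudentID, ModuleCode, Year, Semester")
print(original_key, extended_key)
```

With the original key the second insert is rejected; with the extended key both attempts are stored.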


Topic 11: Database Implementation
11.1 Learning Objectives
This topic provides an overview of database implementation issues and the implementation
environment.
On completion of the topic, students will be able to:

Understand how constraints can be enforced;


Insert multiple rows of data in SQL;
Explain some of the features of the Oracle RDBMS.

11.2 Pedagogic Approach


Information will be transmitted to the students during the lectures. They will then practise the skills
during the tutorial and seminar sessions.

11.3 Timings
Lectures: 2 hours
Laboratory Sessions: 2 hours
Private Study: 7.5 hours
Tutorials: 2 hours


11.4 Lecture Notes


The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:

The implementation environment


Creating tables
Enforcing integrity via constraints
Creating indexes
Inserting data at implementation
Oracle as an example of an implementation environment

11.4.1 Guidance on the Use of the Slides


Slide 4:

This slide presents aspects of implementation, such as creating the tables from the
data dictionary by writing the create scripts (this will include any tables used to
enforce domains). Aspects of implementation also include creating other structures
in the database, such as domains themselves, indexes and views.

Slide 5:

Create table statements will create the tables on the database.

Slide 6:

Domains can be enforced in a number of ways. The ISO SQL standard has
specified a statement that creates domains as separate structures in the database.
This is an example of the syntax of such a statement; it creates a domain called
Allowable_Colours, which can have one of three values: Red, Blue or Green. It
also specifies a default of Red, which means that if no colour is specified when this
domain is used, then it will be set to Red.
Create domain Allowable_Colours As Char(5)
Default 'Red'
Check (Value in ('Red','Blue','Green'));

Slide 7:

Consideration is needed as to the different types of constraints that are needed in a
relational database; they are listed on this slide. What happens when a constraint is
violated also needs to be considered.

Slide 8:

This slide presents an example of referential integrity, which is where a foreign key
references another table. This example is from the laboratory session in Topic 1.

Slide 9:

This slide presents the propagation constraint, which is that when two tables are
related and there is an update in one, it can affect the other. What would happen to
all the records for hire and damage in our Boat Hire system if there was an update
or deletion of one of the Boats? The answer will depend on how we have set up our
database. But if it were just the case that we could delete Boats without doing
anything about the records on the Rental table, then there would be a lot of Rental
records that referred to Boats that no longer existed.

Slide 10:

This slide presents the create script for the Rental table (a base relation):
Create Table Rental
(BoatID number NOT NULL,
CustomerID number NOT NULL,
RentalStartDate date NOT NULL,
RentalEndDate date NOT NULL,
Primary Key (BoatID, CustomerID, RentalStartDate),
Foreign Key (CustomerID) REFERENCES Customer(CustomerID),
Foreign Key (BoatID) REFERENCES Boat (BoatID)
On delete no action
On update cascade);
What the script is saying is that a boat cannot be deleted from the Boat table while
Rental records still refer to it, so the Rental records are left as they are, in order to
keep a genuine record of our business. However, if a boat is updated in some way that
affects the primary key, then this will also be updated in the rental record.
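The two propagation settings in the script can be observed on a small example. The sketch below uses SQLite through Python's sqlite3 module, which follows the standard SQL behaviour here: On update cascade copies a changed key into the Rental row, and On delete no action means the delete is refused while a Rental row still refers to the boat. The tables are cut-down versions of those in the script and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.executescript("""
CREATE TABLE Boat (BoatID INTEGER PRIMARY KEY, BoatName TEXT);
CREATE TABLE Rental (
    BoatID INTEGER NOT NULL,
    CustomerID INTEGER NOT NULL,
    RentalStartDate TEXT NOT NULL,
    PRIMARY KEY (BoatID, CustomerID, RentalStartDate),
    FOREIGN KEY (BoatID) REFERENCES Boat (BoatID)
        ON DELETE NO ACTION
        ON UPDATE CASCADE);
INSERT INTO Boat VALUES (1, 'Kingfisher');
INSERT INTO Rental VALUES (1, 10, '2010-06-05');
""")

# Changing the boat's primary key is cascaded to the Rental row.
conn.execute("UPDATE Boat SET BoatID = 99 WHERE BoatID = 1")
rental_boat = conn.execute("SELECT BoatID FROM Rental").fetchone()[0]

# Deleting the boat is refused while a Rental row still refers to it.
try:
    conn.execute("DELETE FROM Boat WHERE BoatID = 99")
    delete_allowed = True
except sqlite3.IntegrityError:
    delete_allowed = False

print(rental_boat, delete_allowed)
```

Support for these clauses varies between products, so the chosen DBMS's documentation should be checked before relying on a particular setting.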
Slide 11:

There are a number of ways of dealing with the knock-on effect of an action being
performed on one table that will affect another table. This effect is known as
propagation and the ways of dealing with it are propagation constraints. The various
settings for propagation constraints are presented on this slide.

No action means that no change is propagated: the record in the table with the
foreign key is left as it is, and the DBMS rejects the update or delete if it would
leave that record referring to a row that no longer exists.

Cascade means that any change (including deleting a record) is replicated in
the table with the foreign key in it.

Set Default means that the change in the parent table (in this case Boat)
causes the record in the child table to be set to some sort of default, e.g. a
boat could be deleted and in the Rental record, the foreign key would be set
to some value, such as X, which indicates that the record refers to a boat
that no longer exists on the system.

Set Null is similar to Set Default except that the table with the foreign key has
that foreign key set to null.

Slide 12:

A domain is the set of allowable values that an attribute can have. Domain
constraints are rules for specifying how this set of allowable values can be enforced
on the database. For example, if we say that a Boat can only be of a number of
types, then how do we make sure that this is the case? We do so by enforcing the
domain with a constraint. There are a number of ways we can implement this
constraint. Firstly, we add a column to the Boat table called BoatType. The
allowable values for BoatType can be enforced in several ways:

As a check constraint in the table definition.

Using the create domain statement.

As a foreign key to another table.

Slide 13:

This slide shows how to enforce a domain using a check constraint. A check
constraint is a rule placed on an attribute on a table that specifies that the values for
that attribute must obey that rule. In this case, the rule is that the value of BoatType
must be either Yacht, Cruiser or Rower.
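A check constraint of this kind can be tried out as follows. The sketch uses SQLite through Python's sqlite3 module. The Boat table definition is hypothetical apart from the BoatType rule, and the boat names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical Boat table; only the BoatType rule comes from the slide.
conn.execute(
    "CREATE TABLE Boat ("
    " BoatID INTEGER PRIMARY KEY,"
    " BoatName TEXT NOT NULL,"
    " BoatType TEXT NOT NULL"
    " CHECK (BoatType IN ('Yacht', 'Cruiser', 'Rower')))"
)

def add_boat(name, boat_type):
    """Try to insert a boat; report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO Boat (BoatName, BoatType) VALUES (?, ?)",
                     (name, boat_type))
        return True
    except sqlite3.IntegrityError:  # raised when the check constraint fails
        return False

accepted = add_boat("Kingfisher", "Yacht")
rejected = add_boat("Nautilus", "Submarine")
print(accepted, rejected)
```

The second insert is refused by the database itself, so the rule holds whichever application is used to enter the data.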


Slide 14:

This slide shows how to enforce a domain, by setting up a separate domain object
in the database. Here, a domain of BoatType is set up and the table Boat refers to it
in the definition of the BoatTypeCode attribute.

Slide 15:

This slide shows how to enforce a domain constraint as a foreign key to another
table. As mentioned in the lecture in Topic 10, if the values of the domain are likely
to be dynamic, then it would be better to implement them as a look-up table.
Boat Type is set up as a separate table that is referenced by the Boat table. The
advantage of this is that if a new Boat Type is used by the business, then all that is
needed is to add a row with details of it to the Boat Type table. Note that it is the
BoatTypeCode that is now on Boat. To get the full description of the boat type in
any query, then the tables would have to be joined. This could have an effect on
performance.
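The look-up table approach, and the join it makes necessary, can be sketched as below. The example uses SQLite through Python's sqlite3 module. The table and column names follow the BoatTypeCode idea from the slide but are otherwise assumptions, as are the codes and rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
# Assumed table and column names, following the slide's BoatTypeCode idea.
conn.executescript("""
CREATE TABLE BoatType (BoatTypeCode TEXT PRIMARY KEY,
                       Description TEXT NOT NULL);
CREATE TABLE Boat (BoatID INTEGER PRIMARY KEY, BoatName TEXT NOT NULL,
                   BoatTypeCode TEXT NOT NULL
                       REFERENCES BoatType (BoatTypeCode));
INSERT INTO BoatType VALUES ('Y', 'Yacht'), ('C', 'Cruiser'), ('R', 'Rower');
INSERT INTO Boat VALUES (1, 'Kingfisher', 'Y');
""")

# A new boat type needs only a new row, not a change to the table definition.
conn.execute("INSERT INTO BoatType VALUES ('K', 'Kayak')")
conn.execute("INSERT INTO Boat VALUES (2, 'Otter', 'K')")

# The full description is only available by joining back to the look-up table.
rows = conn.execute("""
    SELECT b.BoatName, t.Description
    FROM Boat b JOIN BoatType t ON t.BoatTypeCode = b.BoatTypeCode
    ORDER BY b.BoatID
""").fetchall()
print(rows)
```

Compared with a check constraint, the allowable values become data rather than part of the table definition, at the cost of the extra join.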

Slide 16:

A table constraint is a constraint that forms part of the table definition, but does
not fall into one of the other categories.
The example on this slide limits the number of times a boat can be rented to fewer
than 10. Table constraints can be dropped using the Alter Table statement.
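One way such a rule could be written is sketched below. It assumes that the number of rentals is held in a TimesRented column on Boat, which may not match the example on the slide; the constraint name is also an assumption.

```sql
-- Sketch only: TimesRented and Max_Rentals are assumed names.
CREATE TABLE Boat (
    BoatID       INTEGER      PRIMARY KEY,
    BoatName     VARCHAR(30)  NOT NULL,
    TimesRented  INTEGER      DEFAULT 0 NOT NULL,
    -- Table constraint: a boat can be rented fewer than 10 times.
    CONSTRAINT Max_Rentals CHECK (TimesRented < 10)
);

-- Dropping the constraint by name with Alter Table.
ALTER TABLE Boat DROP CONSTRAINT Max_Rentals;
```

The exact Alter Table syntax for dropping a constraint varies slightly between vendors, so students should check the documentation for the product they are using.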

Slides 17-18:

Note that as well as the statement for inserting multiple rows of data, different
vendors have products which automate the process. These tools make it easier to
load large amounts of data into a database where that data already exists in an
electronic form. It is increasingly the case that databases are built not just to
replace paper-based systems, but to replace other computerised systems.
Data in the older systems will be stored in various formats. The data
loading tools can read these files and put the data into a format which
allows it to be loaded into a relational database.
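As an illustration of inserting multiple rows with one statement, a sketch is given below. The BoatType table and its rows are assumptions. The row-list form of Values is accepted by many products, but some products and older versions need a different form, so treat this as one possible syntax.

```sql
-- Sketch only: the table and its rows are assumed for illustration.
CREATE TABLE BoatType (
    BoatTypeCode  CHAR(1)      PRIMARY KEY,
    Description   VARCHAR(30)  NOT NULL
);

-- One statement inserts three rows.
INSERT INTO BoatType (BoatTypeCode, Description)
VALUES ('Y', 'Yacht'),
       ('C', 'Cruiser'),
       ('R', 'Rower');
```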

Slide 19:

The implementation environment includes the choice of vendor, the supporting
architecture within the organisation in which the system will be implemented and
any other systems with which the database system will have to interface. The
diagram on the slide is a schematic representation of a database existing within a
wider environment. The database will be implemented using a vendor's DBMS. It
will be stored on a server of a particular type. That server will be connected to a
network with a particular network operating system. Applications such as forms and
reports will send requests for data over the network to the database. These
applications exist on application servers or individual PCs. The network might be
connected to other servers that provide services, such as printing, so that if a user
wanted to print a report that accessed the database, then the print server would
come into operation, and it might require the result to be processed in some way to
make the format usable. The network will probably be connected to the Internet and
there might be Internet applications, such as webpages, that can access the
database.

Slide 20:

This slide presents Oracle as an example of a database vendor. Oracle is the
largest database vendor in terms of value and was, until recently, the largest in
terms of market share. Oracle has now been challenged by MySQL; however, it still
remains the choice of vendor for many businesses for their database infrastructure.
New versions introduce new features, and a number of Oracle features are outlined
in the next few slides.


Slide 21:

The Oracle environment has tools that support various languages, such as Java
and XML, and help them interact with the database. This means that the database
is able to interpret data from a wide range of sources.

Slide 22:

Oracle also provides support for the following:


Distributed Database features. This allows the data to be split across a
number of different database servers. To the user, it would seem they were
querying a single server.
Advanced Security. This includes encryption and authorisation control.
Data Warehousing. This allows data from various sources to be
integrated into a single warehouse and then examined for the purposes of
gathering management information.
Internet-ready features. This includes a set of tools to enable developers to
build web-based applications and control web-database connectivity.

Slide 23:

The objects that are supported by the Oracle environment include physical
implementation of all the objects that exist in the logical structure in the relational
model: tables, indexes, views, domains etc.
Additional objects include objects themselves in the object-oriented sense of
having internal structure. For example, an address column could be defined as an
object type address which has a structure of address lines, city and post or zip
code.
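The address example could be sketched in Oracle-style SQL as shown below. The type name, attribute names, sizes and the Customer table are all assumptions made for illustration.

```sql
-- Oracle-style sketch: all names and sizes are assumed.
CREATE TYPE Address_Type AS OBJECT (
    Address_Line1  VARCHAR2(50),
    Address_Line2  VARCHAR2(50),
    City           VARCHAR2(30),
    Post_Code      VARCHAR2(10)
);
/

-- The Address column has internal structure defined by the object type.
CREATE TABLE Customer (
    Customer_No  NUMBER        PRIMARY KEY,
    Name         VARCHAR2(50)  NOT NULL,
    Address      Address_Type
);
```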
PL/SQL is a procedural extension of SQL with all the capacity that this implies: if
statements, loops etc. It allows for the writing of complex programs that can be
stored in the database itself in the form of:
Stored functions
Stored procedures
Triggers
A trigger is a piece of logic that will begin to operate (or fire) when some action is
carried out. Usually some type of database transaction will cause a trigger to fire.
For example, there is a trigger known as an On-Update trigger. This trigger will fire
when a transaction updates a table. What happens when the trigger fires will
depend on what logic is put into the trigger. An example would be that the trigger
could call a function or procedure that did something like create a record in another
table for audit purposes.
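A minimal PL/SQL sketch of such an audit trigger is given below. It assumes the Workers table used in the laboratory sessions; the Workers_Audit table and the trigger name are hypothetical. For brevity, the trigger inserts the audit row directly rather than calling a stored procedure.

```sql
-- Oracle-style sketch: Workers_Audit and the trigger name are hypothetical.
CREATE TABLE Workers_Audit (
    Emp_no      NUMBER,
    Changed_On  DATE
);

CREATE OR REPLACE TRIGGER Workers_Update_Audit
AFTER UPDATE ON Workers
FOR EACH ROW
BEGIN
    -- Fires once for every row changed by an update on Workers.
    INSERT INTO Workers_Audit (Emp_no, Changed_On)
    VALUES (:OLD.Emp_no, SYSDATE);
END;
/
```

After the trigger is created, any Update statement against Workers adds one row to Workers_Audit for each worker affected.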

Slide 24:

A detailed examination of Oracle architecture is beyond the scope of this module;
however, a general outline can be given. A given implementation of an Oracle
database on a server consists of:
The database (the data, including any control files)
The database instance (all the processes and memory areas on the
server that handle access to the database itself)
Logical and physical structure


Slide 25:

Logical structure includes tablespaces. These are logical storage units that
contain the other logical structures, such as tables. Tables and other objects will be
defined as belonging to a particular schema. The logical structure also includes
definitions of users.

Slide 26:

Datafiles are the physical files that contain the logical structures, such as the
tablespace, which in turn contains objects. Redo log files record all the changes to
the data, so that in the event of some kind of system failure the data can be
recovered. Control files contain a list of all the other files in the database.
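The link between the logical and the physical structure can be illustrated with the Oracle-style sketch below. The tablespace name, datafile name, file size and table are assumptions, and creating a tablespace normally requires administrator privileges, so this is for discussion rather than for students to run.

```sql
-- Oracle-style sketch: names and sizes are assumed.
-- The tablespace (logical) is stored in a datafile (physical).
CREATE TABLESPACE rental_data
    DATAFILE 'rental_data01.dbf' SIZE 100M;

-- The table (logical) is placed in the tablespace.
CREATE TABLE Boat (
    BoatID    NUMBER        PRIMARY KEY,
    BoatName  VARCHAR2(30)  NOT NULL
) TABLESPACE rental_data;
```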

Slide 27:

The Oracle instance is all the processes and memory areas that are needed to
allow and control access to a database. Some examples of these are:
System Global Area - an area of shared memory used to store data for one
instance, which in turn contains other memory structures: database buffer
cache, redo log buffer, shared pool.
Program Global Area - non-shared memory that is private to a single server process.
User Processes
Oracle Processes, including log writer and recovery processes.

Lecturers' Notes:
Please note that the Private Study exercise for this topic requires organisation in order to ensure
suitable topic coverage (see Section 11.6 below). You should ensure that students know which topic
they have been assigned following the lecture session(s).


11.5 Laboratory Sessions


The laboratory time allocation for this topic is 2 hours.
Lecturers' Notes:
Students have copies of the laboratory exercises in the Student Guide. Answers are not provided in
their guide.
During the laboratory session, you may wish to check that students are working on their reports for
Private Study Exercise 2 (see Section 11.6 below) and that each student is aware of their assigned
topic.

11.5.1 Views
A view can be thought of as a virtual table. It is the result of an SQL select operation and, to the
user, looks like a table with rows and columns. However, unlike a table, it does not necessarily exist
permanently in the database.
The syntax to create a view is a select statement preceded by Create View, the view name and As. For
example:
Create View WorkersOverThirty
As Select Emp_no, First_name, Last_name
From Workers
Where Age > 30;
Exercise 1
Run the above script and then run a select statement to see all the data from it.
Suggested Solution:
Run the above script then:
Select * from WorkersOverThirty;
Exercise 2
Create a view that will contain the last name and the job title for all the workers in Cairo.
Suggested Solution:
Create View CairoWorkers
As Select last_name, job_title
From Workers
Where dept_no = (Select dept_no From Departments Where location = 'Cairo');
Note: There will be different ways of doing this, for example using a join instead of a sub-query, but it is not
acceptable to use the Workers table only and hard-code dept_no = 1 simply because the student knows
this is the dept_no of the department in Cairo.
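An equally acceptable answer that uses a join rather than a sub-query might look like the sketch below. The view name is an assumption, chosen so that it does not clash with the view created above.

```sql
-- Alternative sketch: the same result using a join instead of a sub-query.
Create View CairoWorkersJoin
As Select w.last_name, w.job_title
From Workers w, Departments d
Where w.dept_no = d.dept_no
And d.location = 'Cairo';
```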


Exercise 3
Create a summary view that includes the emp_no, first_name, last_name, department_name,
location, the job type and the salary (from the job type table).
Suggested Solution:
Create View Summary
As Select w.emp_no, w.first_name, w.last_name, d.department_name, d.location, j.job_title, j.salary
From Workers w, Departments d, Job_Type j
Where w.dept_no = d.dept_no
And w.job_title = j.job_title;
Exercise 4
Now recreate the summary view from Exercise 3, but make it only for Workers who earn more than
25000.
Note: You will have to give it a different name from the previous summary.
Suggested Solution:
Create View HighEarnerSummary
As Select w.emp_no, w.first_name, w.last_name, d.department_name, d.location, j.job_title, j.salary
From Workers w, Departments d, Job_Type j
Where w.dept_no = d.dept_no
And w.job_title = j.job_title
And j.salary > 25000;

11.5.2 Indexes

Exercise 5
An index is a structure in a database that helps queries run more quickly. This will be discussed in
more detail in a coming lecture. Indexes can be unique, meaning that they will prevent a duplicate
value from being added to that column, or they can be non-unique.
The syntax to create a unique index for the Workers table column Emp_no is:
Create Unique Index EmpNoIndex on Workers(Emp_No);
Run this script.
If you need to get rid of this index, the syntax is:
Drop Index EmpNoIndex;


11.5.3 Constraints
Exercise 6
As well as having constraints to enforce primary and foreign keys, constraints can also be added to
enforce a business rule. This will be discussed in more detail in a coming lecture. The example
below enforces a rule that all our workers must be 70 or younger.
Alter table Workers
Add Constraint Valid_age
Check (age < 71);
Run this script and then see what happens if you try to update someone's age to over 70.
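A test statement along the following lines could be used. It assumes that a worker with Emp_no 1 exists; any existing Emp_no will do.

```sql
-- Sketch only: Emp_no 1 is assumed to exist in Workers.
-- The DBMS should reject this update because 75 violates the Valid_age constraint.
Update Workers
Set Age = 75
Where Emp_no = 1;
```

The wording of the error message differs between products, but it should name the Valid_age constraint.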


11.6 Private Study


The time allocation for private study in this topic is expected to be 7.5 hours.
Lecturers' Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide.

Exercise 1
You should prepare a short presentation on the database architecture of the vendor that you used to
implement your assignment. Focus should be on the logical structure and the physical structure. The
degree of detail that you will need to present should be guided by the lecture slides, i.e. it should be
an overview in your own words rather than a detailed technical paper.
Exercise 2
Your lecturer will assign one of the following topics, concerning bulk loading facilities, to you.
Prepare a short report about the features and the facilities of the tool that you are assigned to
investigate.
Bulk Insert in SQL Server
http://sqlserver2000.databases.aspfaq.com/how-do-i-load-text-or-csv-file-data-into-sql-server.html
Oracle SQL*Loader
http://oreilly.com/catalog/orsqlloader/chapter/ch01.html
MySQL bulk loading (the Load Data Infile statement, often loosely called 'bulk insert')
http://mysql.bigresource.com/Bulk-insert-from-text-files-dDPRzHYo.html#2t6P0D5I
Exercise 3: Review of All Topics

Review the materials for all the topics up to this week and prepare questions for the final overview
lecture in Topic 12.


11.7 Tutorial Notes


The time allowance for tutorials in this topic is 2 hours.
Lecturers' Notes:
Students have copies of the tutorial activities in the Student Guide. Answers are not provided in their
guide.
This is the last tutorial session for the module. You may wish to include time to discuss the
examination and suitable revision techniques with students. You may also wish students to attempt
the sample examination paper, which can be found on the NCC Education Campus
(http://campus.nccedu.com).

Exercise 1: Vendor Presentation

Give your presentation on the database architecture of the vendor you have chosen to the rest of
the group.
Take notes on interesting points while other students are speaking. Your tutor will also lead a
discussion to summarise the findings of the class at the end of the presentations.
Exercise 2
Work in a small group with other students who have written a report on the same topic during
private study time.
Discuss the information you have found. You should take the opportunity to add any additional
information to your own notes.
Now prepare to present your information to students who have worked on the other reports. You
should work together as a group to prepare a short (5 minutes), informal presentation which will give
the other students a summary of the main information you have found.
Exercise 3
Work with your group to present your information to students from the other groups. You should also
answer any questions they might have.
Now listen to their presentations and take notes.


Topic 12
Topic 12: Summary
12.1 Learning Objectives
This topic provides an overview of the module materials as a whole.
On completion of the topic, students will be able to:

Recognise the topics they have studied on the module;


Recognise links to other modules.

12.2 Pedagogic Approach


Information will be transmitted to the students during the lectures. They will then practise the skills
during the tutorial and seminar sessions.

12.3 Timings
Lectures: 2 hours
Private Study: 7.5 hours


12.4 Lecture Notes


The following is an outline of the material to be covered during the lecture time. Please also refer to
the slides.
The structure of this topic is as follows:

Summary of module
Clarification of module material and related issues as identified by students
Identify links to other modules/subject areas

12.4.1 Guidance on the Use of the Slides


Slide 4:

This lecture recaps the list of topics covered during the module. It should be noted
that the overview given in this week's topic should serve as a pointer to further
revision of material and not as a definitive statement of all that needs to be known
from the module for the purposes of the final summative assessment. The slides
revisit some of the key features of the topics that have been studied.

Slide 5:

This slide provides examples of databases in use. There are various definitions
given in textbooks and this slide presents the definition given by one of the founding
fathers of modern databases, C. J. Date: "A database is a computerised record
keeping system". This definition is sufficient as a starting point, but highlight to
students that some people include manual filing systems as being a type of
database. Databases have the capacity to store, manipulate and retrieve data. We
keep data there (storage), we do things to that data via programs and applications
(manipulation) and we need to be able to get the data out of the database when we
need it (retrieval).

Slide 6:

This slide presents disadvantages of pre-database systems.

Data are separated and isolated


Customer information is stored in a separate file to rental information. If
salespeople need to relate substantial customer information to the cars
rented, then there is a problem. Data will have to be extracted from each file
and combined into a single file. This involves working out how each file is
related to the other and which parts of the file are needed; then a process of
extraction has to take place. This is often more difficult the greater the number
of files involved.

Data duplication
A customer's name, address and phone number might be stored many times
over, i.e. once in the customers file and once again every time they make a
rental (therefore possibly many times in the rental file).
This wastes space and also raises the more serious problem of compromising
data integrity. Data integrity refers to data being logically consistent. For
example, if a customer changes his or her name or address, then all the files
containing that data must be updated, but the danger with duplication is that
this does not happen. The address might be changed in one file and not in
another, which would lead to difficulties in knowing which the correct address
is.

Application program dependency


With file processing systems, the application programs depend on the file
format. For example, if you write a program in COBOL to get some data from
a file, then you have to specify in your program the exact way in which that file
holds the data. The problem with this arrangement is that when changes are
made in file formats, the application programs must also be changed.

Incompatible files
Due to the application program dependency, files that can be processed by
one programming language will be different to those processed by another.
This makes files difficult to combine, which reinforces the isolation and
separation of data that we discussed earlier.

Difficulty of representing data in the users' perspectives


Users often want to see data in a way that is different from the way it is
stored. For example, they might want to see rental information with a
substantial amount of customer information. This means doing things like
combining files, which we have already noted is difficult, just to make the data
appear natural to the users.

Slide 7:

This slide presents how the database approach overcomes the problems
associated with pre-database systems:

Integrated data
In a database system, all the application data is stored in a single facility
called a database. An application program can access customer information
and rental information easily. The program can specify how to combine the
data and the DBMS will do it.

Reduced data duplication


Since data is stored in only one place, there is no need to duplicate it. It is
easy to retrieve and if something changes, we only have to update it in one
place.

Program/Data independence
Since the record formats are stored in the database itself (as metadata) then
we do not need to include file information in our application programs.

Easier representation of users' perspectives


Database technology makes it much easier to represent data in a way the
users like to see it. This is a product of integration, which means it is much
easier to produce the sorts of applications where all the data that is needed
can be shown. When a developer creates a database, the information is
stored in tables. It is the job of the DBMS to store and retrieve data in these
tables. When a user wants to see the data in other formats, such as on a
screen or in a report, then we have to develop applications to do so.

Database systems are self-describing


In addition to the users' source data, a database contains a description of its
own structure. This description is known as metadata and is held in a set of
tables known as the data dictionary.

Database systems maintain program-data independence


Since a database is self-describing, application programs do not need
knowledge of the underlying file system formats. This means that changes in
the structure of the data will not have a major impact on the application
programs.

A database is a model of a model


A database is a model. It should be pointed out to the students that while it is
tempting to say that a database is a model of reality or some portion of reality
as it relates to a business, this is not strictly true. A database does not model
reality or some portion of reality. Instead, a database is a model of the users'
model. It is an attempt to capture the way users understand the data held for
their business needs. So, for example, the amount of detail held by a system
would depend on the users' needs. Understanding the way the user thinks,
their requirements and pre-conceptions, is a major topic of study in systems
analysis, but we should always bear in mind that what we are concerned with
is people's perceptions and understanding.

Slides 8-9:

Data are raw facts kept in a computerised system. An example was given of data
in a sales database. Data in a sales database would include facts, such as a
customer name, address and telephone number. This is quite simple data which is
comprised of bits of text. Numerical data, such as the amount that a customer spent
last year, might also be stored. Today, this definition has to be expanded to reflect a
new reality since databases store objects, such as whole documents, photographic
images, sound and video.
Traditionally there has been a distinction made between data and information.
Information is data that has been processed in such a way that it can increase the
knowledge of the person who uses it.

Slide 10:

Metadata is data that describes the structure of other data. The structure of the
tables in a relational database is kept within the database itself in the form of
metadata. This defines the name of each table and, for each column, its name,
length and data-type.
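Students can see metadata for themselves by querying the data dictionary. The sketch below assumes the Workers table from the laboratory sessions. The name of the dictionary view depends on the product: many products provide information_schema.columns, while Oracle provides user_tab_columns. The case in which table names are stored also varies between products.

```sql
-- Products that provide the standard information schema:
SELECT table_name, column_name, data_type, character_maximum_length
FROM information_schema.columns
WHERE table_name = 'workers';

-- Oracle equivalent (object names are stored in upper case by default):
SELECT table_name, column_name, data_type, data_length
FROM user_tab_columns
WHERE table_name = 'WORKERS';
```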

Slide 11:

Activity: Ask the students why metadata is important:


Answer: Metadata is important, because it is the way a database knows about its
own structure and is able to realise one of the advantages of databases:
program-data independence. This is because with metadata, the structure of the database,
such as the tables, is stored in the database itself. A programmer only needs to
know how to access this in order to be able to retrieve data from the database. The
collection of metadata in a database is known as the data dictionary.

Slide 12:

The students could be asked at this point about their understanding of data-types,
e.g. what data-types are available? It should be pointed out that different
implementations of relational databases from different vendors might have slightly
different names for the same data-type, even though there are standards.

Slide 13:

Students could be asked to give a definition of Entity, Attribute and Relationship.
They should be encouraged to give their own explanations and examples.

Entity - Something of significance about which we want to keep data. This
could be a person, thing or concept.

Attribute - The quality or property of an entity.

Relationship - The way one entity is linked to another. This represents real
world relationships, such as a customer buying a product. It can also
represent concepts within our data model, such as customer types relating to
customers.

Slides 14-15:

The students were asked in Topic 4 to draw the ER diagram for this scenario. They should
be at the stage where they are comfortable understanding such a
scenario and constructing the relevant ER diagram and accompanying data dictionary with
primary and foreign keys and other appropriate attributes.

Slide 16:

This slide presents the key concepts of a relational model.

Slides 17-18:

Relations and Tables - Tables can be thought of as the most basic structure
in the relational model. Within the model itself, as opposed to its
implementations, the equivalent of a table is known as a relation. Relations are
made up of attributes and are implemented in the database as two-dimensional
tables made up of columns and rows.

Attribute - The quality or property of an entity (already mentioned). It
corresponds to a column in a table.

Domain - The set of valid values of an attribute. For example, the valid values
in the domain sex would be Male and Female.

Tuples and Rows - An instance of an entity is known as a tuple. It
corresponds to a row in a table. Each tuple or row holds one value for each of
the attributes of a particular table.

Primary Key - The attribute or attributes that uniquely identify a row in a
table.

Foreign Key - This is an attribute which references an attribute (usually the
primary key) in another table. Foreign keys are the way in which relationships
are represented in the implemented database.
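The concepts above can be pointed out on a pair of table definitions. The sketch below reuses the Workers and Departments tables from the laboratory sessions, but the data types, sizes and sample rows are assumptions.

```sql
-- Sketch only: data types, sizes and sample rows are assumed.
CREATE TABLE Departments (                  -- relation implemented as a table
    dept_no          INTEGER      PRIMARY KEY,    -- primary key
    department_name  VARCHAR(30)  NOT NULL,       -- attribute = column
    location         VARCHAR(30)
);

CREATE TABLE Workers (
    emp_no      INTEGER      PRIMARY KEY,
    first_name  VARCHAR(30),
    last_name   VARCHAR(30)  NOT NULL,
    age         INTEGER      CHECK (age < 71),    -- domain: the valid values
    dept_no     INTEGER      REFERENCES Departments (dept_no)   -- foreign key
);

-- Each inserted row corresponds to one tuple.
INSERT INTO Departments (dept_no, department_name, location)
VALUES (1, 'Sales', 'Cairo');

INSERT INTO Workers (emp_no, first_name, last_name, age, dept_no)
VALUES (100, 'Amira', 'Hassan', 34, 1);
```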

Ask the students to give a definition of 1st, 2nd and 3rd Normal Form. The students
should also be able to normalise a document such as the one in the example. In
Topic 6, this process was illustrated in detail by breaking it down into steps.
Students should revisit this and be confident they understand the process at each
of the steps. There is also an example in the tutorial materials for Topic 6.

Slide 19:

The key concepts in SQL are create, insert, update, delete and select statements.
The students should understand how each of these important parts of SQL works.
Students should refer to the lectures and the laboratory work to make sure they
understand these concepts.
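As a quick revision aid, one example of each statement is sketched below. The Customer table, its columns and the sample values are assumptions made for illustration.

```sql
-- Sketch only: the table and its data are assumed.
-- Create: define a table.
CREATE TABLE Customer (
    customer_no  INTEGER      PRIMARY KEY,
    name         VARCHAR(50)  NOT NULL,
    phone        VARCHAR(20)
);

-- Insert: add a row.
INSERT INTO Customer (customer_no, name, phone)
VALUES (1, 'A. Hassan', '555-0100');

-- Update: change existing data.
UPDATE Customer
SET phone = '555-0199'
WHERE customer_no = 1;

-- Select: retrieve data.
SELECT customer_no, name, phone
FROM Customer
WHERE customer_no = 1;

-- Delete: remove a row.
DELETE FROM Customer
WHERE customer_no = 1;
```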

Slide 20:

Point out that database development involves skills from other disciplines within
ICT, e.g. systems analysis. There are particular issues with regard to database
development that are different from other types of development, for example the
development of networks or websites. This module focuses on the elements that are
unique to database development.
Requirements gathering is part of systems analysis. There are different
methodologies available, but recently iterative approaches have been popular.
These entail a large amount of user involvement, for example by using prototypes
that are shown to the user to enable them to better identify what they want.

Slide 21:

Database Design - moving from a set of requirements to implementing these with
database technology.

Slide 22:

A transaction is one or more operations that are carried out on the database.
Transactions can generally be identified as retrievals, inserts, updates and deletes.
This is sometimes usefully remembered by the acronym CRUD (Create, Retrieve,
Update, Delete). Transactions can be cross-referenced with tables by constructing
a CRUD matrix that shows which of the operations in a transaction (Create,
Retrieve, Update and/or Delete) affects which tables.

Slide 23:

What is de-normalisation and what is its purpose? Normalising our data model
means we will have the minimum amount of redundancy. However, this can have
an effect on performance. A query that joins several tables will generally be
slower than a query against a single table. De-normalisation deliberately
reintroduces some redundancy, for example by combining tables, in order to reduce
the number of joins and so improve query performance.

Slide 24:

This diagram demonstrates database implementation and that there is a wider
implementation environment. Oracle is an example of a specific database
architecture.

Slide 25:

This slide lists how constraints can be enforced using SQL, as discussed in
Topic 11.

Slide 26:

This slide looks at links with other modules. The Level 5 Database Development
module focuses specifically on the development aspects. Again using SQL, the
students will develop a system from an example scenario. They will also gain a
greater understanding of the development process and study each phase of it in
detail.
Systems Analysis modules are an important part of understanding how the
requirements for database systems are arrived at in the first place. Other modules
that deal with web technology will also enable students to understand aspects of
the interaction between web applications and databases.


Slides 27-28:

This is a good opportunity for an open question session. Students should have
prepared questions. These can have been e-mailed to the lecturer or raised in open
discussion.


12.5 Private Study


The time allocation for private study in this topic is expected to be 7.5 hours.
Lecturers' Notes:
Students have copies of the private study exercises in the Student Guide. Answers are not provided
in their guide.
Exercise 1
You are now at a point where you should be revising for the examination. The lists below, although
not exhaustive, indicate things that you should understand, be able to describe and be able to
produce, in order to do well in the examination.
Understand

Metadata

Fan traps

Chasm traps

The concepts associated with SQL

Constraints on data

In order to make sure that you can show your understanding of the above, read through the lecture
slides and make short notes on each of the points. Revise from these notes. You can also ask your
tutor for guidance.
Describe

Entity types

The relational model

The database development process

How databases are used

How databases are deployed

In order to make sure that you can describe the above, read through the lecture slides and make
detailed notes on each of the points. Revise from these notes. You can also ask your tutor for
guidance.

Produce

An ER Diagram from a scenario


SQL SELECT statements from information given to you

SQL CREATE statements from information given to you

SQL INSERT statements from information given to you

CRUD Matrices from information given to you

Normalised tables from information given to you

In order to make sure that you can produce the above, go through the appropriate laboratory, tutorial
and private study exercises and make sure that you can answer the questions. If you are having
difficulties answering the questions, you may need to either revisit the lecture slides or ask your tutor
for guidance.

GOOD LUCK IN YOUR EXAMINATION
