You are on page 1of 153

221-INFS- DATABASE SYSTEMS-1

Bachelor in Information System (BSIS)


Department of IT & Security
College of CS & IT
Jazan University, K.S.A

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


1
COURSE OUTLINE

CHAPTER TOPICS PAGE

1 Databases and Database Users 3 – 24

2 Database System Concepts and Architecture 25 – 42

3 Data Modeling Using the Entity-Relationship 43 – 78


(ER) Model

4 The Relational Data Model and Relational 79 – 101


Database Constraints

5 Relational Algebra 102 – 123

6 Functional Dependencies and Normalization for 124 - 153


Relational Databases

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


2
Chapter 1
Databases and Database Users

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


3
Overview
Chapter-1: Databases and Databases Users
1. Introduction : Types of databases and functionalities
1.2 An Example (University Database)
1.3 Characteristics of the Database Approach
1.3.1 Self-Describing Nature of a Database System
1.3.2 Insulation between Programs and Data, and Data Abstraction
1.3.3 Support of Multiple Views of the Data
1.3.4 Sharing of Data and Multiuser Transaction Processing
1.4 Actors on the Scene
1.4.1 Database Administrators
1.4.2 Database Designers
1.4.3 End Users
1.4.4 System Analysts and Application Programmers (Software Engineers)
1.5 Workers behind the Scene
1.6 Advantages of Using the DBMS Approach
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
4
Types of Databases and Database
Applications
 Traditional Applications:
 Numeric and Textual Databases
 More Recent Applications:
 Multimedia Databases
 Geographic Information Systems (GIS)
 Data Warehouses
 Real-time and Active Databases
 Many other applications
 We will focus on traditional applications of
database to solve mini-world problems.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


5
Basic Definitions
 Database:
 A collection of related data.

 Data:
 Known facts that can be recorded and have an implicit
meaning.
 Mini-world or the Universe of discourse (UoD) :
 Represents some aspects /parts of the real world about which
data is stored in a database. For example, student registration
system at a university, Patients record system in a hospital.
 Database Management System (DBMS):
 A collection of programs/software package to facilitate the
creation and maintenance of a database.
 Its facilitates the processes of defining, constructing,
manipulating, and sharing databases among various users and
applications.
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
6
Typical DBMS Functionality
 Define a particular database in terms of its data types,
structures, and constraints
 Construct or Load the initial database contents on a
secondary storage medium
 Manipulating the database:
 Retrieval: Querying, generating reports
 Modification: Insertions, deletions and updates to its content
 Generating reports from the data
 Processing and Sharing a database allows multiple users
and programs to access the database simultaneously

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


7
Typical DBMS Functionality (continued)x
 Application program: accesses the database by sending queries
or requests for data to the DBMS.
 A query typically causes some data to be retrieved; a transaction
may cause some data to be read and some data to be written into
the database.
 Protection includes system protection against hardware or software
malfunction (or crashes), unauthorized or malicious access.
 DBMS must be able to maintain the database system by allowing
the system to evolve as requirements change over time.
 Database system: To complete our initial definitions, we will call the
database and DBMS software together a database system.
 Figure I.I illustrates some of the concepts we have discussed so far.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


8
Simplified database system environmentx

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


9
Example of a Database
(with a Conceptual Data Model)
 Mini-world for the example:
 Part of a UNIVERSITY environment.
 Some mini-world entities:
 STUDENTs
 COURSEs
 SECTIONs (of COURSEs)
 (academic) DEPARTMENTs
 INSTRUCTORs

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


10
Example of a Database
(with a Conceptual Data Model)
 Some mini-world relationships:
 SECTIONs are of specific COURSEs
 STUDENTs take SECTIONs
 COURSEs have prerequisite COURSEs
 INSTRUCTORs teach SECTIONs
 COURSEs are offered by DEPARTMENTs
 STUDENTs major in DEPARTMENTs

 Note: The above entities and relationships are typically


expressed in a conceptual data model, such as the
ENTITY-RELATIONSHIP data model (see Chapters 3, 4)

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


11
Example of a simple databasex

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


12
Main Characteristics of the Database
Approachx
 Self-describing nature of a database system:
 A DBMS catalog stores the description of a particular
database (e.g. data structures, types and constraints on the
data)
 The information stored in the catalog is called meta-
data, and it describes the structure of the primary
database.
 This allows the DBMS software to work with different
database applications.
 Insulation between programs and data:
 Called program-data independence.
 Allows changing data structures and storage organization
without having to change the DBMS access programs. 13
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
Example of a simplified database catalogx

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


14
Main Characteristics of the Database
Approach (continued)
 Data Abstraction:
 The characteristic that allows program-data
independence and program-operation independence is
called data abstraction. Informally, A data model is
used to hide storage details and present the users with
a conceptual view of the database.
 Support of multiple views of the data:x
 Each user may see a different view (subset or
virtual data) of the database, which describes only
the data of interest to that user.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


15
Main Characteristics of the Database
Approach (continued)
 Sharing of data and multi-user transaction
processing:
 Allowing a multiple users to retrieve/access from
and update to the database
 Concurrency control within the DBMS guarantees
that each transaction is correctly executed or
aborted
 These types of applications are generally called
online transaction processing (OLTP)
applications.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


16
Database Users
 Users may be divided into
 Those who actually use and control the database
content, and those who design, develop and
maintain database applications (called “Actors on
the Scene”), and
 Those who design and develop the DBMS
software and related tools, and the computer
systems operators (called “Workers Behind the
Scene”).

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


17
Database Users
 Actors on the scene
 Database administrators:
 Responsible for authorizing access to the database,
for coordinating and monitoring its use, acquiring
software and hardware resources, controlling its use
and monitoring security and efficiency of operations.
 Database Designers:
 Responsible to define the content, the structure, the
constraints, and functions or transactions against
the database. They must communicate with the
end-users and understand their needs.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


18
Categories of End-users
 Actors on the scene
 End-users: They use the data for queries, reports
and some of them update the database content.
End-users can be categorized into:
 Casual: access database occasionally when
needed
 Naïve or Parametric: Constantly querying and
updating the database, using standard types of
queries and updates-called canned transactions.
 Examples are bank-tellers or university secretaries who do
this activity for an entire shift of operations.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


19
Categories of End-users (continued)x
 Sophisticated:
 These include business analysts, scientists, engineers,
others thoroughly familiar with the system capabilities.
 Stand-alone:
 Mostly maintain personal databases using ready-to-use
packaged applications.
 E.g.:
 (i) User of a tax package that stores a variety of personal
financial data for tax purposes,
 (ii) A scientists that creates a database for its own
experiments,
 (iii) A user that maintains an address book
 You may become sophisticated or stand-alone
users
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
20
Categories of End-users (continued)
 System analysts: determine the requirements of end
users, and develop specifications for standard canned
transactions that meet these requirements.
 Application programmers:
implement these
specifications as programs; then they test, debug,
document, and maintain these canned transactions. Such
analysts and programmers commonly referred to as
software developers or software engineers.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


21
Workers behind the Scenex
 DBMS system designers and implementers design
and implement the DBMS modules and interfaces as a
software package.
 Tool developers design and implement tools- that
facilitate database modeling and design, database system
design, and improved performance.
 Operators and maintenance personnel (system
administration personnel) are responsible for the actual
running and maintenance of the hardware and software.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


22
Advantages of Using the Database
Approach
 Interacting easily with data using high-level dedicated tools
 Controlling redundancy in data storage and in development
and maintenance efforts.
 Sharing of data among multiple users.

 Ensuring consistency of data.

 Restricting unauthorized access to data.


 Providing storage structures (e.g. indexes) for efficient query
processing
 Providing backup and recovery services.
 Providing multiple interfaces to different classes of users.
 Representing complex relationships among data.
 Enforcing integrity constraints on the database. 23
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
Review questions
1. Define the following terms: data, database, DBMS, database system, database
catalog, UoD, data model, program-data independence, DBA, User view,
canned transaction, meta-data, transaction-processing application etc.
2. What are the responsibilities of the DBA and the Database designers?
3. List down the advantages of using DBMS approach.
4. Discuss the differences between system Analysts and Application programmers.
5. What do you understand by database management systems (DBMS)?
6. Briefly explain the advantages of using DBMS approach.
7. What four main types of actions involve databases? Briefly discuss each.
8. What are the different types of database end users? Discuss the main activities
of each.
9. Discuss the main characteristics of the database approach and how it differs
from traditional file systems.
10. Draw a simplified diagram of a database system environment.
11. Draw a schema of a mini world database.
12. Explain the workers behind the scene.
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
24
Chapter 2
Database System Concepts and
Architecture

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


25
Overview
Chapter-2 : Database System Concepts and Architecture
2. Introduction
2.1 Data models, Schemas and Instances
2.1.1 Categories of Data Models
2.1.2 Schemas, Instances, and Database State
2.2 Three schema architecture and data independence
2.2.1 The Three-Schema Architecture
2.2.2 Data Independence
2.3 Database languages and interfaces
2.3.1 DBMS Languages
2.3.2 DBMS Interfaces
2.4 DBMS system environment
2.4.1 DBMS Component Modules
2.4.2 Database System Utilities
2.4.3 Tools, Application Environments
In-Class Exercise & Assignment
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
26
Data Model & Data Abstraction
 Data Model:
 A data model is a collection of concepts that can
be used to describe the structure of a database-
provides the necessary means to achieve this
abstraction. By structure of a database we mean
the data types, relationships, and constraints
that apply to the data.
 Data abstraction:
 It generally refers to the hiding of details of data
organization and storage, and the highlighting of
the essential features for an improved
understanding of data.
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
27
Categories of Data Modelsx
 High-level or conceptual data models: provide concepts
that are close to the way many users perceive data.
 Low-level or physical data models: provide concepts that
describe the details of how data is stored on the computer’s disk.
 An access path is a structure that makes the search for particular
database records efficient.
 An index is an example of an access path that allows direct access
to data using an index term or a keyword.
Between these two extremes we have,
 Representational or implementation data models provide
concepts that may be easily understood by end users but that are
not too far removed from the way data is organized in computer
storage (hard disks). 28
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
Categories of Data Models (Continued)
 Conceptual data models use concepts such as entities,
attributes, and relationships.
 Entity represents a real-world object or concept, such as an
employee or a project from the mini-world.
 Attribute represents some property of interest that further
describes an entity, such as the employee's name or salary.
 Relationship among two or more entities represents an
association among the entities,
 Example, a works-on relationship between an employee and a
project. (See chapter-3 : Entity-Relationship model).

Employee works-on Project

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


29
Schemas, Instances, and Database State
 Schemas: The description of a database is called the database
schema.
 Schema diagram: A displayed schema is called a schema
diagram. (See Figure 2.1)

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


30
Schemas, Instances, and Database State
 Schema construct: Each object in the schema - such as
STUDENT or COURSE- is a schema construct.
 Database state or Snapshot: The data in the database at a
particular moment in time is called a database state or
snapshot . It is also called the current set of occurrences or
instances.
 The distinction between database schema and database
state is very important.
 When we define a new database, we specify its database
schema only to the DBMS. At this point, the corresponding
database state is the empty state with no data. We get
the initial state of the database when the database is first
populated or loaded with the initial data.
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
31
Three-Schema Architecture and Data
Independencex
 Three of the four important characteristics of the database
approach:
(1) Use of a catalog to store the database description (schema) so as
to make it self-describing,
(2) Insulation of programs and data (program-data and program-
operation independence), and
(3) Support of multiple user views.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


32
The Three-Schema Architecture

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


33
The Three-Schema Architecture
1. Internal schema (Internal level) at the internal level to
describe physical storage structures and access paths.
Typically uses a physical data model.

2. Conceptual schema (Conceptual level) describes the


structure of the whole database. It hides the details of physical
storage structures and concentrates on describing entities, data
types, relationships, user operations, and constraints.

3. External schema (view level), describes the part of the


database that a particular user group is interested in and hides
the rest of the database from that user group.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


34
Data Independencex
 Logical data independence is the capacity to change the
conceptual schema without having to change external schemas
or application programs.
 We may change the conceptual schema to expand the database
(by adding a record type or data item), to change constraints, or
to reduce the database (by removing a record type or data item).
 Physical data independence is the capacity to change
the internal schema without having to change the conceptual
schema. Hence, the external schemas need not be changed as
well.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


35
DBMS Languages
 Data Definition Language (DDL): Used by the DBA
and database designers to specify the conceptual schema of a
database. In many DBMSs, the DDL is also used to define
internal and external schemas (views). In some DBMSs,
separate Storage Definition Language (SDL) and View
Definition Language (VDL) are used to define internal and
external schemas.
 Data Manipulation Language (DML): Used to
specify database retrievals and updates.
 DML commands (data sublanguage) can be embedded in a
general-purpose programming language (host language), such
as COBOL, C or an Assembly Language.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


36
DBMS Languages (Continued)x
 High Level or Non-procedural Languages: e.g.,
SQL, are set-oriented and specify what data to retrieve than
how to retrieve. Also called declarative languages.

 Low Level or Procedural Languages: record-at-a-


time; they specify how to retrieve data and include constructs
such as looping.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


37
DBMS Interfaces
 User-friendly interfaces:
 Menu-based, popular for browsing on the web
 Forms-based, designed for naïve users
 Graphical User Interface (Point and Click, Drag and Drop
etc.)
 Natural language: requests in written English
 Speech as Input (?) and Output
 Parametric interfaces (e.g., bank tellers) using function
keys.
 Interfaces for the DBA: Creating accounts, granting
authorizations, Setting system parameters, Changing
schemas or access path

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


38
The Database System Environment
DBMS Component Modules

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


39
Database System Utilitiesx
 To perform certain functions such as:
 Loading data stored in files into a database. Includes
data conversion tools.
 Backing up the database periodically on tape.
 Database storage reorganization reorganizes a set
of database files into different file organizations, and
create new access paths to improve performance.
 Performance monitoring utility monitors database
usage and provides statistics to the DBA. such as
whether or not to reorganize files or whether to add or
drop indexes to improve performance.
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
40
Tools and Application Environments
 Data dictionary / repository : Used to store schema
descriptions and other information such as design
decisions, application program descriptions, user
information, usage standards, etc.
 Active data dictionary is accessed by DBMS software
and users/DBA.
 Passive data dictionary is accessed by users/DBA

only.
 Application Development Environments and CASE

(computer-aided software engineering) tools:


 Examples – Power builder (Sybase), Builder (Borland)

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


41
Review questions
1. Define the following terms: data model, database schema, database state,
internal schema, conceptual schema, external schema, data independence,
access path, DDL, DML, SDL, VDL, query language, host language, data
sublanguage, database utility etc.
2. List down the main categories of data models.
3. What is the difference between a database schema and a database state?
4. What is the difference between logical data independence and physical data
independence?
5. Draw the diagram of three-schema architecture.
6. What is the difference between procedural and nonprocedural DMLs?
7. List down the different types of user-friendly interfaces.
8. With what other computer system software does a DBMS interact?
9. Discuss the main categories of data models.
10. Explain the three-schema architecture with the help of its diagram.
11. Discuss the different types of user-friendly interfaces and the types of users who
typically use each.
12. Discuss some types of database utilities and tools and their functions.
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
42
Chapter 3
Data Modeling Using the Entity-
Relationship (ER) Model

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


43
Overview
Chapter 3 : Data Modeling Using the Entity-Relationship (ER) Model
3. Introduction
3.1 Using High–Level Conceptual Data Model for Database Designs
3.2 An Example database application
3.3 Entity types, Entity Sets, Attributes, and keys
3.3.1 Entities and Attributes
3.3.2 Entity types, Entity Sets, keys and Value Sets
3.3.3 Initial conceptual design of the COMPANY database
3.4 Relationships types, Relationship Sets, Roles, and Structural Constraints
3.4.1 Relationships types, Sets, Roles
3.4.2 Relationships Degree, Roles Names, and Recursive Relationships
3.4.3 Constraints on Binary Relationship Types
3.4.4 Attributes of Relationship Types
3.5 Weak Entity Types
3.6 Refining the ER Design for the COMPANY Database
In-Class Exercise & Assignment
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
44
ER Model
 Entity-Relationship (ER) model:
 In this chapter, we follow the traditional approach of
focusing on the database structures and
constraints during conceptual database design.
 We present the modeling concepts of the Entity-
Relationship (ER) model, which is a popular high-
level conceptual data model.
 This model is frequently used ER diagrams for the
conceptual design of database applications, and
many database design tools employ its concepts.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


45
Figure 3.1: A simplified diagram to illustrate the main phases
of database design

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


46
Using High-Level Conceptual Data
Models for Database Design
 Database design process steps:
1. Requirements collection and analysis: Interview database
users to understand and document their data requirements &
functional requirements (operations, transactions, retrievals
and updates)
2. Create a conceptual schema using a high-level conceptual
data model (ER Model). This step is called conceptual
design.
3. Logical design or data model mapping: Mapping of database
schema in the implementation data model of the DBMS.
4. Physical design phase: In which the internal storage
structures, file organizations, indexes, access paths, and
physical design parameters for the database files are specified.
47
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
ER-Diagram Notation for ER Schemas
Symbol Meaning

ENTITY TYPE

WEAK ENTITY TYPE

RELATIONSHIP TYPE

IDENTIFYING RELATIONSHIP TYPE

ATTRIBUTE

KEY ATTRIBUTE

MULTIVALUED ATTRIBUTE

COMPOSITE ATTRIBUTE

DERIVED ATTRIBUTE

E1 R E2 TOTAL PARTICIPATION OF E2 IN R

N
E1 R E2 CARDINALITY RATIO 1:N FOR E1:E2 IN R

(min,max) STRUCTURAL CONSTRAINT (min, max) ON


R E
PARTICIPATION OF E IN R
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
48
Entity Types, Entity Sets, Attributes, and
Keys
 Entities- The basic object that the ER model represents is an
entity, which is a thing in the real world with an independent
physical OR conceptual existence.
 E.g. a particular person, car, house, employee or a company, a
job, a university course.
 Attributes - The particular properties that describe an Entity.
E.g.- an EMPLOYEE entity may be described by the
employee's ssn, name, age, address, salary, sex, department
etc..

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


49
Entity Types, Entity Sets, Attributes,
and Keys
 Composite Attributes - Composite attributes can be divided
into smaller subparts. E.g.- The Address attribute of the
EMPLOYEE can be subdivided into Street_address, City, State,
and Zip. Attributes that are not divisible are called Simple
(Atomic) Attributes E.g.- Zip, University ID, Section, Grade etc.

Street_address State Zip


City

Address

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


50
Entity Types, Entity Sets, Attributes,
and Keys
 Single-Valued Attributes- Most attributes have a single value
for a particular entity; such attributes are called single-valued.
E.g.- Age is a single-valued attribute of a person.
 Multivalued Attributes- have a set of values for the same
entity. E.g. (1) a Colors attribute for a car, Cars with one color
have a single value, whereas two-tone cars have two color
values.
 E.g.-(2) One person may not have a college degree, another
person may have one, and a third person may have two or
more degrees; therefore, different people can have different
numbers of values for the College_degrees attribute. Such
attributes are called Multivalued.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


51
Entity Types, Entity Sets, Attributes,
and Keys
 Stored Attribute- In some cases, two (or more) attribute
values are related.
 E.g.(1) - the Age and Birth_date attributes of a person. For a
particular person entity, the value of Age can be determined
from the current (today's) date and the value of that person's
Birth_date, hence Age is called Derived attribute and is
derived from the Birth_date attribute, which is called a stored
attribute.
 E.g.(2) An attribute Number_of_employees (Derived) of a
DEPARTMENT entity can be derived by counting the number
of employees related to (working for) that department.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


52
Entity Types, Entity Sets, Attributes,
and Keys
 NULL Values- In some cases, a particular entity may not have
an applicable value/not known/unavailable for an attribute.
 E.g. (1) The Apartment_number attribute of an address applies
only to addresses that are in apartment buildings and not to
other types of residences, such as single-family homes.
 E.g. (2) A College_degrees attribute applies only to people with
college degrees. For such situations, a special value called
NULL is created.
 NULL can also be used if we do not know the value of an attrib-
ute for a particular entity- E.g.- if we do not know the home
phone number of ‘Ali‘.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


53
Entity Types, Entity Sets, Attributes,
and Keys
 Complex Attributes- Composite and Multivalued attributes
can be nested arbitrarily. Such attributes are called Complex
attributes.
Composite
 Example: attribute
Multivalued
attribute
 {Address_phone( {Phone(Area_code,Phone_number)},Address(Street_address
(Number,Street,Apartment_number),City,State,Zip) )}
Multivalued
attribute

 Composite attribute should be mentioned in between


parentheses ()
 Multivalued attributes between braces { }
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
54
Entity Types, Entity Sets, Keys, and
Value Sets
 Entity Types- An entity type defines a collection (or set) of
entities that have the same attributes.
 Entity Sets- The collection of all entities of a particular entity
type in the data base at any point in time is called an entity
set.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


55
Entity Types, Entity Sets, Keys, and
Value Sets
 Key Attributes- An important constraint on the entities of an
entity type is the key or uniqueness constraint on attributes.
 Has one or more attributes in an entity
 Values are distinct/different for each record
 Used to identify each entity uniquely
 Use underlined inside the oval to define
Such an attribute is called a key attribute.

Figure 3.7
The CAR entity
type with two key
attributes

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


56
Entity Types, Entity Sets, Keys, and
Value Sets
 Value Sets (Domains) of Attributes- A set of value that
may be assigned to that attribute for each individual entity, in
an entity set is called Value set or Domain.
 E.g.(1)- E.g.- In Employee entity, range of ages allowed for
employees is between 20 and 58 then the value set of the
Age attribute consist of integer numbers between 20 and 58.
Age: Domain[20-58]
 E.g.(2) The value set for the Name attribute to be the set of
strings of alphabetic characters and some special characters
and so on.
 Name: Domain[a-z],[A-Z],blank space, dot,#,_,@ e.c.t.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


57
Initial Conceptual Design of the COMPANY
Database
 An entity type DEPARTMENT with attributes Name, Number, Locations,
Manager, and Manager_start_date. Locations is the only multivalued
attribute. Name and Number are (separate) key attributes because each
was specified to be unique.

 An entity type PROJECT with attributes Name, Number, Location, and


Controlling_department. Both Name and Number are (separate) key
attributes.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


58
Initial Conceptual Design of the
COMPANY Database
 An entity type EMPLOYEE with attributes Name, Ssn, Sex, Address,
Salary, Birth_date, Department, and Supervisor. Both Name and
Address may be composite attributes and ssn is key attribute.

 An entity type DEPENDENT with attributes Employee, Dependent_name,


Sex, Birth_date, and Relationship (to the employee).

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


59
Relationship Types, Relationship Sets,
Roles, and Structural Constraints
 A Relationship relates two or more distinct entities with a specific
meaning.
 E.g.- EMPLOYEE Ali works on a PROJECT or EMPLOYEE Hussian
manages the Research DEPARTMENT.

(a) (b)
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
60
Relationship Types, Relationship Sets,
Roles, and Structural Constraints
 Relationships of the same type are grouped or typed into a
relationship type.

 E.g.- The WORKS_ON relationship type in which


EMPLOYEEs and PROJECTs participate, or the MANAGES
relationship type in which EMPLOYEEs and
DEPARTMENTs participate in previous figures (a-b).

 A relationship type R among n entity types El' E2' ... , En


defines a set of associations- or a relationship set - among
entities from these entity types.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


61
Relationship Degree, Role Names, and
Recursive Relationships
 Degree of a Relationship Type- The degree of a relationship type is
the number of participating entity types. Hence, the WORKS_FOR
relationship is of degree two.
 A relationship type of degree two is called Binary, and one of degree
three is called Ternary.

Binary relationships Ternary relationships


Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
62
Relationship as Attributes
 A relationship type can have attributes;

 E.g.- HoursPerWeek of WORKS_ON; its value for each


relationship instance describes the number of hours per
week that an EMPLOYEE works on a PROJECT.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


63
Role Names and Recursive Relationships.
 Each entity type that participates in a relationship type plays a particular
role in the relationship. The role name signifies the role that a
participating entity from the entity type plays in each relationship instance,
and helps to explain what the relationship means.
 For example, in the WORKS_FOR relationship type, EMPLOYEE plays
the role of employee or worker and DEPARTMENT plays the role of
department or employer.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


64
Role Names and Recursive Relationships
 We can also have a Recursive relationship type.
 Both participations are same entity type in different roles.
 E.g. - SUPERVISION relationships between EMPLOYEE (in role of supervisor or
boss) and (another) EMPLOYEE (in role of subordinate or worker).
 In following figure, first role participation labeled with 1 and second role
participation labeled with 2.
 In ER diagram, need to display role names to distinguish participations.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


65
Constraints on Relationship Types
 Also known as Ratio constraints
 Maximum Cardinality
 One-to-One (1:1)
 One-to-Many (1:N) or Many-to-One (N:1)
 Many-to-Many (N:M)

 Minimum Cardinality (also called participation


constraint or existence dependency constraints)
 zero (optional participation, not existence-dependent)
 one or more (mandatory, existence-dependent)

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


66
One-to-One (1:1) RELATIONSHIP
(MANAGES)

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


67
Many-to-one (N:1) RELATIONSHIP
(WORKS_FOR)

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


68
Many-to-Many (M:N) RELATIONSHIP
(WORKS_FOR)

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


69
Weak Entity Types
 Weak entity that does not have a key attribute itself,
hence it is identified by a partial key, which is the
attribute that can uniquely identify weak entities that
are related to the same owner entity
 A weak entity must participate in an identifying
relationship type with an owner or identifying entity
type
Example:
Suppose that a DEPENDENT entity is identified by the dependent’s first
name and birhtdate, and the specific EMPLOYEE that the dependent is
related to. DEPENDENT is a weak entity type with EMPLOYEE as its
identifying entity type via the identifying relationship type
DEPENDENT_OF.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


70
ER DIAGRAM OF COMPANY SCHEMA
Entity Types are: EMPLOYEE, DEPARTMENT, PROJECT, DEPENDENT

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


71
Review questions
1. Discuss the role of a high-level data model in the database design process.
2. What do you understand by Entity-Relationship model?
3. List the various cases where use of a NULL value would be appropriate.
4. Define the following terms: entity, attribute, attribute value, relationship,
instance, composite attribute, multivalued attribute, derived attribute, complex
attribute, key attribute, and value set (domain).
5. Explain the difference between an attribute and a value set.
6. What is meant by a recursive relationship type? Give some examples of
recursive relationship types.
7. Define the terms owner entity type, weak entity type, identifying relationship
type, and partial key.
8. Construct a preliminary design of Department, Employee, Product, Car,
Student, Teacher, Mobile, Class, book etc with all its possible attributes.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


72
Review questions
9. What is an entity type? What is an entity set? Explain the differences among
an entity, an entity type, and an entity set.
10. What is a relationship type? Explain the differences among a relationship
instance, a relationship type, and a relationship set.
11. Discuss the conventions for displaying an ER schema as an ER diagram.
12. Draw a simplified diagram to illustrate the main phases of database design.
13. For the following binary relationships, suggest cardinality ratios based on the
common- sense meaning of the entity types.
Entity 1 Cardinality Ratio Entity 2
STUDENT ______________ EXAM
TEACHER ______________ OFFICE
STUDENT ______________ TEACHER
CLASSROOM ______________ WALL
COURSE ______________ TEXTBOOK
PRODUCT ______________ PRICE
BOOK ______________ COURSE

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


73
Review questions
14. Draw an ER diagram of UNIVERSITY database, that captures all the given
below requirements. Specify key attribute(s) of each entity set. For each
relationship set, specify structural constraints and participation constraints.

 For each STUDENT, the university maintains student Name, student ID, SSN (Social
Security Number), Address, Phone, Birthdate, Gender, Level, Major (CS, IS, CNET).
Both Social Security Number and Student Number have unique values for each
student.

 Each DEPARTMENT is described by a Name, Department code, Department Phone,


and Office address. Both Name and Code have unique values for each department.

 Each COURSE has a Course Name, Description, Course Code, Credit Hours, Level,
and Offering Department. The value of Course Code is unique for each course.

 Each SECTION is associated with an Instructor, Semester, Year, Course, and


Section Number. The Section Number distinguishes sections of the same course
that are taught during the same semester.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


74
Review questions
15. Draw an ER diagram of COMPANY database, that captures all the given below
requirements. Specify key attribute(s) of each entity set. For each relationship
set, specify structural constraints and participation constraints.

 For each DEPARTMENT, the company maintains Name, department


Number, Manager, Location. Both Name and department Number have
unique values for each student.

 Each PROJECT is described by a Name, Number, Location, and


Controlling Department. Both Name and Number have unique values for
each department.

 Each EMPLOYEE has a SSN, Name, Sex, Department, Address, and Salary. The
value of SSN is unique for each employee, whereas Name, a composite attribute has
Fname, Minit and Lname as a subpart of it.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


75
Review questions
16. Draw an ER diagram of ONLINE COURSE REGISRATION database, that
captures all the given below requirements. Specify key attribute(s) of each
entity set. For each relationship set, specify structural constraints and
participation constraints.

 For each USER, the portal maintains user ID, Name, E-mail. Each user has
a unique ID. Name is a composite attributes with Fname, Midname, Lname.

 Each COURSE is described by a course Code, Description, Category, and


Term. Both Code and Term have unique values for each course.

 Each LECTUR has a lecturer Title ID, Duration, and Date. The value of ID
is unique for each lecturer.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


76
Review questions
17. Read the following case study, which describes the data requirements
for a video rental company.
 The video rental company has several branches throughout the USA. The data
held on each branch is the branch address made up of street, city, state, and
zip code, and the telephone number. Each branch is given a branch number,
which is unique throughout the company.
 Each branch is allocated staff, which includes a Manager. The Manager is
responsible for the day-to-day running of a given branch. The data held on a
member of staff is his or her name, position, and salary. Each member of staff is
given a staff number, which is unique throughout the company.
 Each branch has a stock of videos. The data held on a video is the catalog
number, video number, title, category, daily rental, cost, status, and the names
of the main actors and the director. The catalog number uniquely identifies each
video.
 However, in most cases, there are several copies of each video at a branch,
and the individual copies are identified using the video number. A video is given
a category such as Action, Adult, Children, Drama, Horror, or Sci-Fi. The status
77
indicates whether a specific copy of a video is available for rent.
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
Review questions
 Before hiring a video from the company, a customer must first register as a
member of a local branch. The data held on a member is the first and last
name, address, and the date that the member registered at a branch. Each
member is given a member number, which is unique throughout all branches of
the company. Once registered, a member is free to rent videos, up to a
maximum of ten at any one time.
 The data held on each video rented is the rental number, the full name and
number of the member, the video number, title, and daily rental, and the dates
the video is rented out and returned. The rental number is unique throughout
the company.
 Identify the main entity types of the video rental company.
 Identify attributes and associate them with entity or relationship types.
Represent each attribute in the ER diagrams created in (a).
 Determine candidate and primary key attributes for each (strong) entity
type.
 Identify the main relationship types between the entity types described in
(a) and represent each relationship as an ER diagram.
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
78
Chapter 4
The Relational Data Model and
Relational Database Constraints

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


79
Overview
Chapter 4 :The Relational Data Model and Relational Database Constraints
5. Introduction
5.1 Relational model concepts
5.1.1 Domains, Attributes, Tuples, and Relations
5.2 Relational model Constraints and Relational Database Schema
5.2.1 Domain Constraints
5.2.2 Key Constraints and Constraints on NULL Values
5.2.3 Relational Databases and Relational Database Schemas
5.2.4 Integrity, Referential Integrity, and Foreign Keys
In-Class Exercise & Assignment

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


80
Relational Model Concept

 The Relational Model represents the database as a


collection of relations. Informally, each relation resembles a
table of values or, to some extent, a flat file of records.
 It is called a flat file because each record has a simple linear
or flat structure (See Figure 1.2).

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


81
A database that stores STUDENT and
COURSE information

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


82
Domains, Attributes, Tuples, and
Relations
 A domain D is a set of atomic values. By atomic we mean
that each value in the domain is indivisible.
 Example: A domain has a logical definition:
 “KSA_phone_numbers”- are the set of 10 digit phone numbers valid
in the U.S.
 “Social_security_numbers”- The set of valid nine-digit Social Security
numbers.
 A domain may have a data-type or a format defined for it.
The USA_phone_numbers may have a format: (ddd)-ddd-dddd where
each d is a decimal digit.
 Dates have various formats such as month name, date, year or yyyy-
mm-dd, or dd mm,yyyy etc.
 An attribute designates the role played by the domain.
E.g., the domain Date may be used to define attributes “Invoice-date”
and “Payment-date”.
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
83
Domains, Attributes, Tuples, and
Relations
 A column header is called an attribute.
 The degree (or arity) of a relation is the number of
attributes of its relation schema.
 E.g.: A STUDENT relation of degree of seven

STUDENT(Name, Ssn, Home_phone, Address, Office_phone, Age, Gpa)

 Using the data type of each attribute, the definition is


sometimes written as:

 STUDENT(Name:string, Ssn:integer, Home_phone:integer, Address:


string, Office_phone: string, Age: integer, Gpa: real)
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
84
Domains, Attributes, Tuples, and
Relations
 A tuple is an ordered set of values

 Each value is derived from an appropriate domain.

 Each row in the STUDENT table may be referred to as a


tuple in the table and would consist of four values.

 STUDENT(ID: integer, Name: string, Section_No: integer,


Major: string)

<201898722, “Ali”,, 33, “CS”>

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


85
Domains, Attributes, Tuples, and
Relations
 RELATION: A table of values
 A relation may be thought of as a set of rows.
 A relation may alternately be though of as a set of
columns.
 Each row represents a fact that corresponds to a real-
world entity or relationship.
 Each row has a value of an item or set of items that
uniquely identifies that row in the table.
 Sometimes row-ids or sequential numbers are assigned
to identify the rows in the table.
 Each column typically is called by its column name or
column header or attribute name.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


86
Formal Relational Model Terminology
 A Relation may be defined in multiple ways.
 The Schema of a Relation: R (A1, A2, .....An)
Relation schema R is defined over attributes A1, A2, .....An
For Example -
CUSTOMER (Cust-id, Cust-name, Address, Phone#)

Here, CUSTOMER is a relation defined over the four attributes


Cust-id, Cust-name, Address, Phone#, each of which has a
domain or a set of valid values. For example, the domain of
Cust-id is 6 digit numbers.
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
87
Example Figure 5.1 and Definition Summary

Informal Terms Formal Terms


Table Relation
Column Attribute/Domain
Row Tuple
Values in a column Domain
Table Definition Schema of a Relation
Populated Table Extension
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
88
Relational Model Constraints
and Relational Database Schemas
 Constraints (or restrictions) on the actual values in a
database state are derived from the rules in the mini-world.
 Constraints can be divided into three main categories:
1. Constraints that are inherent in the data model. We call these inherent
model-based constraints or implicit constraints.

2. Constraints that can be directly expressed in schemas of the data model,


typically by specifying them in the DDL (data definition language. We call
these schema-based constraints or explicit constraints.

3. Constraints that cannot be directly expressed in the schemas of the data


model, and hence must be enforced by the application programs. We call
these application-based or semantic constraints or business rules.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


89
Domain Constraints
 Domain constraints specify that within each tuple, the value
of each attribute A must be an atomic value from the domain
dom(A).
 Each table has certain set of columns and each column
allows a same type of data, based on its data type. The
column does not accept values of any other data type.
 E.g.
KSA_phone_numbers(10 digit) – 0543776699
Date_of_Birth(dd/mm/yyyy) – 21/09/1998
Name(A-Z, a-z) – Ali Hamza
GPA(0 to 5, real numbers) – 3.65
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
90
Key Constraints and Constraints on
NULL Values
 There are three main types of constraints:

1. Key constraints

2. Entity integrity constraints

3. Referential integrity constraints

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


91
Initial Conceptual Design of the COMPANY
Database
 Superkey of R: A set of attributes SK of R such that no two tuples
in any valid relation instance r(R) will have the same value for SK.
That is, for any distinct tuples t1 and t2 in r(R), t1[SK]  t2[SK].
 Key of R: A "minimal" superkey; that is, a superkey K such that
removal of any attribute from K results in a set of attributes that is
not a superkey.
Example: The CAR relation schema:
CAR(State, Reg#, SerialNo, Make, Model, Year)
has two keys Key1 = {State, Reg#}, Key2 = {SerialNo}, which are
also superkeys. {SerialNo, Make} is a superkey but not a key.
 If a relation has several candidate keys, one is chosen arbitrarily
to be the primary key. The primary key attributes are underlined.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


92
Initial Conceptual Design of the
COMPANY Database

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


93
Entity Integrity, Referential Integrity, and
Foreign Keys
 Relational Database Schema: A set S of relation schemas
that belong to the same database. S is the name of the
database.
S = {R1, R2, ..., Rn}
 Entity Integrity: The primary key attributes PK of each
relation schema R in S cannot have null values in any tuple
of r(R). This is because primary key values are used to
identify the individual tuples.
t[PK]  null for any tuple t in r(R)
Note: Other attributes of R may be similarly constrained to
disallow null values, even though they are not members of the
primary key.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


94
Entity Integrity, Referential Integrity, and
Foreign Keys
 Referential Integrity: A constraint involving two relations
(the previous constraints involve a single relation).
 Used to specify a relationship among tuples in two relations:
the referencing relation and the referenced relation.
 Tuples in the referencing relation R1 have attributes FK
(called foreign key attributes) that reference the primary key
attributes PK of the referenced relation R2.
 A tuple t1 in R1 is said to reference a tuple t2 in R2 if
t1[FK] = t2[PK].
 A referential integrity constraint can be displayed in a
relational database schema as a directed arc from R1. FK to
R2 (FK PK). 95
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
Referential Integrity, and Foreign Keys
PK FK
T_ID Name Department Course_Code# Foreign key (can be)
T001 Ali IS 221 • Duplicate
T002 Hussain CS 452
T003 Rashid CNET 443
• Null
T004 Khalid CS 443 • Reference to Primary
T005 John IS NULL key or Unique key

PK C_Code# Name Level


221 Database-1 5
Primary key 222 Database-2 6
• Unique 443 Multimedia 9
• Not Null 334 Software Eng 8

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


96
Schema diagram of COMPANY database

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


97
Referential Integrity Displayed (PKFK)

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


98
COMPANY database state

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


99
Review questions
1. Define the following terms as they apply to the relational model of data: domain,
attribute, tuple, relation schema, relation state, degree of a relation, relational
database schema, and relational database state etc.
2. Why are duplicate tuples not allowed in a relation?
3. What is the difference between a key and a superkey?
4. Define key, Super key, Candidate key, Primary key, and foreign key.
5. Compare the differences between primary key and foreign key.
6. Discuss the entity integrity and referential integrity constraints.
7. Identify the Super key, Candidate key, Primary key, Foreign Key in the below
relation.
T_ID Name Department Course_Code#
T001 Ali IS 221
T002 Hussain CS 452
T003 Rashid CET 443
T004 Khalid CS 443

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


100
Review questions
8. Consider the following relations for a database that keeps track of automobile
sales in a car dealership, Specify the foreign keys for this schema, stating any
assumptions you make.
CAR(SerialNo, Model, Manufacturer, Price)
OPTION(SerialNo, OptionName, Price)
SALE(SalespersoId, SerialNo, Date, Sale_price)
SALESPERSON(SalespersonId, Name, Phone)
9. Consider the following relations for a database that keeps track of student
enrollment in courses and the books adopted for each course. Draw a
relational schema diagram specifying the foreign keys for this schema.

STUDENT (SSN, Name, Major, Bdate)


COURSE (Course#, Cname, Dept)
ENROLL (SSN, Course#, Quarter, Grade)
BOOK_ADOPTION(Course#, Quarter, Book_ISBN)
TEXT(Book_ISBN, Book_Title, Publisher, Author) 101
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
Chapter 5
Relational Algebra

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


102
Overview

Chapter 5- Relational algebra


6. Introduction
6.1 Unary Relational Operations: SELECT and PROJECT
6.1.1 The SELECT operation
6.1.2 The PROJECT operation
6.2 Relational Algebra Operations from Set Theory
6.2.1 The UNION, INTERSECTION, and MINUS Operations
6.2.2 The CARTESIAN PRODUCT (CROSS PRODUCT) Operation
In-Class Exercise & Assignment

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


103
Relational Algebra
 In this chapter we discuss the formal languages for the
relational model: the Relational Algebra.
Definition:
 The basic set of operations for the relational model is the

relational algebra. These operations enable a user to


specify basic retrieval requests as relational algebra
expressions (Query).
Example RA expression:
 (EMPLOYEE)
Dno=4 AND salary>25000

πLname, Fname, Dno(EMPLOYEE)


(Dno=4 AND Salary>25000) OR (Dno=5 AND Salary>40000) (EMPLOYEE)

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


104
Relational Algebra Operations Taxonomy
Relational Algebra
Operations

Relational Database operations Set operations

Extended Set Operations Traditional Set Operations

Unary
SELECT, operations
PROJECT UNION,
INTERSECTION,
SET DIFFERENCE, and
CARTESIAN PRODUCT
JOIN Binary operations
OR CROSS PRODUCT

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


105
Unary Relational Operations:
SELECT and PROJECT
 The SELECT Operation:
 SELECT and PROJECT are unary operations because
they operate on single relation.
 The SELECT operation is used to choose a subset of the
tuples from a relation that satisfies a selection condition.
 The SELECT operation can also be visualized as a
horizontal partition of the relation into two sets.
 Syntax:
Select < table_name > where <condition>

(The <condition> must involve only columns from the mentioned table)

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


106
Example: SELECT operation
We can write this in Relational algebra expression:
 (table_name)
<condition> [Where σ is “Phi”]
EMPLOYEE
Fname Minit Lname Ssn Bdate Address Sex Salary Super_ssn Dno
Franklin T Wong 333445555 1955-12-08 638 Voss, Houston, TX M 40000 888665555 5
Jennifer S Wallace 987654321 1941-06-20 291 Berry, Bellaire, TX F 43000 888665555 4
Ramesh K Narayan 666884444 1962-09-15 975 Fire Oak, Humble, TX M 38000 333445555 5

Example Display those Employee who are working in department number 5.


 (EMPLOYEE)
Dno=5
Fname Minit Lname Ssn Bdate Address Sex Salary Super_ssn Dno
Franklin T Wong 333445555 1955-12-08 638 Voss, Houston, TX M 40000 888665555 5
Ramesh K Narayan 666884444 1962-09-15 975 Fire Oak, Humble, TX M 38000 333445555 5

Subset of EMPLOYEE relation


Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
107
SELECT operation with comparison operators (=, <, ≤, >, ≥, ≠), and
Boolean operators (and, or, not to) form a general selection condition

Eg: Select the tuples for all employees who either work in department 4 and
make over $25,000 per year, or work in department 5 and make over
$40,000, we can specify the following SELECT operation:

(Dno=4 AND Salary>25000) (


OR Dno=5 AND Salary>40000 ) (EMPLOYEE)

Fname Minit Lname Ssn Bdate Address Sex Salary Super_ssn Dno
Franklin T Wong 333445555 1955-12-08 638 Voss, Houston, TX M 41000 888665555 5
Jennifer S Wallace 987654321 1941-06-20 291 Berry, Bellaire, TX F 43000 888665555 4
Ramesh K Narayan 666884444 1962-09-15 975 Fire Oak, Humble, TX M 38000 333445555 5

Output subset tuples of relation Employee

Fname Minit Lname Ssn Bdate Address Sex Salary Super_ssn Dno
Franklin T Wong 333445555 1955-12-08 638 Voss, Houston, TX M 41000 888665555 5
Jennifer S Wallace 987654321 1941-06-20 291 Berry, Bellaire, TX F 43000 888665555 4

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


108
SELECT operation correspond to the SQL query:

In SQL, the SELECT condition is typically specified in the


WHERE clause of a query.

For example, the following operation:


 (EMPLOYEE)
Dno=4 AND salary>25000

would correspond to the following SQL query:

SQL> SELECT * FROM EMPLOYEE WHERE Dno=4 AND Salary>25000;

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


109
The PROJECT Operation
 The PROJECT operation selects certain columns from the
table and discards the other columns.
 If we are interested in only certain attributes of a relation,
we use the PROJECT operation to project the relation
over these attributes only.
 Therefore, the result of the PROJECT operation can be
visualized as a vertical partition of the relation into two
relations.
 The PROJECT operation removes any duplicate tuples,
so the result of the PROJECT operation is a set of distinct
tuples.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


110
Example: PROJECT Operation
EMPLOYEE
Fname Minit Lname Ssn Bdate Address Sex Salary Super_ssn Dno
Franklin T Wong 333445555 1955-12-08 638 Voss, Houston, TX M 38000 888665555 5
Jennifer S Wallace 987654321 1941-06-20 291 Berry, Bellaire, TX F 43000 888665555 4
Ramesh K Narayan 666884444 1962-09-15 975 Fire Oak, Humble, TX M 38000 333445555 5

Syntax : π<attribute list>(R) [Where π is “Pie”]

a. πLname, Fname, Dno(EMPLOYEE) b. πSex, Salary(EMPLOYEE)

Lname Fname Dno Sex Salary


Wong Franklin 5 F 43000
Wallace Jennifer 4 M 38000
Narayan Ramesh 5
Duplicate elimination

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


111
PROJECT operation correspond to the SQL query:

 In SQL, the PROJECT attribute list is specified in the


SELECT clause of a query.

 For example, the following operation:

πSex, Salary(EMPLOYEE)

 would correspond to the following SQL query:


SQL> SELECT DISTINCT Sex, Salary FROM EMPLOYEE;
Sex Salary
Output with Duplicate elimination F 43000
M 38000

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


112
Relational Algebra Operations from Set Theory
The UNION, INTERSECTION, and MINUS Operations

 UNION, INTERSECTION, and SET DIFFERENCE (also


called MINUS or EXCEPT) set theoretic operations are used
to merge the elements of two sets.
 These are binary operations, because these are applied to
two sets (of tuples).
 When these operations are applied to the two relations, the
relations must have the same type of tuples; this condition
has been called union compatibility or type compatibility.
 It means, R(A1, A2, ..., An) and S(B1, B2, ..., Bn) are union
compatible, if these two relations have the same number of
attributes and each corresponding pair of attributes has the
same domain. 113
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
UNION Operation
• UNION: The result of this operation, denoted by R U S, is
a relation that includes all tuples that are either in R or in S
or in both R and S. Duplicate tuples are eliminated.
(a) Two union-compatible relations (b) STUDENT U INSTRUCTOR.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


114
INTERSECTION Operation
• INTERSECTION: The result of this operation,
denoted by R ∩ S, is a relation that includes all
tuples that are in both R and S.
(C) STUDENT ∩ INSTRUCTOR.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


115
SET DIFFERENCE (or MINUS) Operation:

 SET DIFFERENCE (or MINUS): The result of this


operation, denoted by R - S, is a relation that includes
all tuples that are in R but not in S.
(D) STUDENT – INSTRUCTOR. (E) INSTRUCTOR – STUDENT.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


116
The CARTESIAN PRODUCT (CROSS
PRODUCT) Operation
 CARTESIAN PRODUCT operation-also known as
CROSS PRODUCT or CROSS JOIN - which is denoted
by x.
 This is also a binary set operation, but the relations on
which it is applied do not have to be union compatible.
 It is used between two set of ordered pair.
 In general, the CARTESIAN PRODUCT operation applied
by itself is generally meaningless.
 It is mostly useful when followed by a selection that
matches values of attributes coming from the component
relations.
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
117
The CARTESIAN PRODUCT (CROSS
PRODUCT) Operation
Let’s: R1={1,2} and R2={a, b, c}
Ordered
Then, R1 X R2 = {(1,a), (1,b), (1,c), (2,a), (2,b), (2,c)} set of
R2 X R1 = {(a,1), (a,2), (b,1), (b,2), (c,1), (c,2)} pairs
Hence, R1 X R2 ≠ R2 X R1

R1 R2 R1 R2
a a
1 1
b b
2 2
c c
R1 X R2 R2 X R1
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
118
CARTESIAN PRODUCT Operation
STUDENT X MAJOR
STUDENT MAJOR
SID SNAME GPA SID MAJOR
SID SNAME GPA SID MAJOR
10 ALI 3.5 10 CS
10 ALI 3.5 10 CS
10 ALI 3.5 11 IS
11 ZOYA 3.7 11 IS
10 ALI 3.5 12 CNET
12 ALEX 4.1 12 CNET
11 ZOYA 3.7 10 CS
11 ZOYA 3.7 11 IS
11 ZOYA 3.7 12 CNET
12 ALEX 4.1 10 CS
12 ALEX 4.1 11 IS
12 ALEX 4.1 12 CNET

Cartesian product would correspond to the following SQL


query:
SQL> Select * from Student Cross Join Major;

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


119
Review questions
1. Define Relational Algebra. Why do we need Relation Algebra for DBMS?
2. Illustrate diagrammatically the taxonomy (classification) of relation algebra
operations.
3. List the operations of relational algebra and the purpose of each.
4. Explain unary relation operations: SELECT and PROJECT with example.
5. Differentiate unary and binary relational algebra operations with example.
6. Explain relational algebra binary operations: UNION, INTERSECTION, SET
DIFFERENCE (or MINUS) with example.
7. Explain relational algebra binary operations: CARTESIAN PRODUCT (x) with
example.
8. What is union compatibility? Why do the UNION, INTERSECTION, and SET
DIFFERENCE operations require that the relations on which they are applied
be union compatible?

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


120
Review questions
EMPLOYEE

9. Use above relation EMPLOYEE and write the relational algebra query.
a. Retrieve the records form EMPLOYEE whose department is 4.
b. Select the tuples where salary > 40000.
c. Retrieve the records of all employee who are Male and work for department
number 5.
10. Use above relation EMPLOYEE and write the relational algebra query.
a. List each employee’s First name, Last name and salary.
b. List First name, Last name and salary of all employees who work in
department number 5.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


121
Review questions
11. Convert the given below relational algebra operation into relational query.
a. Dno=4 (EMPLOYEE)
b. Salary > 30,000 (EMPLOYEE)
c.  (Dno=4 AND Salary > 30,000)(EMPLOYEE)
 
d. Ssn, Fname, Bdate( Salary>39000(EMPLOYEE))
 
e. Fname, Lname, Salary( Dno=5(EMPLOYEE))
12. Consider the following GRADEBOOK relational schema describing the data for a
grade book of a particular instructor.
CATALOG(Cno, Ctitle)
STUDENTS(Sid, Fname, Lname, Minit)
COURSES(Term, Sec_no, Cno, A, B, C, D)
ENROLLS(Sid, Term, Sec_no)
a. Retrieve the names of students enrolled in the Automata class during the fall
2009 term.
b. Retrieve the Sid values of students who have enrolled in CSc226 and CSc227.
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
122
Review questions
13. Apply the set theoretic operations UNION, INTERSECTION, MINUS on the
given tables and draw the result.
a. STUDENT U INSTRUCTURE
b. STUDENT Ո INSTRUCTURE
c. STUDENT — INSTRUCTURE
d. INSTRUCTURE — STUDENT
STUDENT
INSTRUCTURE
Fn Ln
Fname Lname
Susan Yao
John Smith
Ramesh Shah Ricardo Browne
Johnny Kohler Susan Yao
Barbara Jones Francis Johnson
Amy Ford Ramesh Shah
Jimmy Wang
Ernest Gilbert
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
123
Chapter 6
Functional Dependencies and
Normalization for Relational
Databases

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


124
Overview
Chapter 6 - Functional Dependencies and Normalization for Relational Databases
10. Introduction
10.1 Informal design guideline for relation schema
10.2 Functional dependencies
10.2.1 Definition of Functional Dependency
10.3 Normal Forms Based on Primary Keys
10.3.1 Normalization of Relations
10.3.2 Practical Use of Normal Forms
10.3.3 Definitions of Keys and Attributes Participating in Keys
10.3.4 First Normal Form and Example
10.3.5 Second Normal Form and Example
10.3.6 Third Normal Form and Example
In-Class Exercise & Assignment
Revision and Lab exam

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


125
Database design approach
 Database design may be performed using two approaches:
(i) Bottom-up approach (ii) Top-down approach
 A bottom-up design methodology (design by synthesis)

considers the basic relationships among individual attributes


as the starting point and uses those to construct relation
schemas.
 This bottom-up approach is not very popular in practice:

i. Because it suffers from the problem of having to collect a large number of


binary relationships among attributes as the starting point.

ii. For practical situations, it is next to impossible to capture binary


relationships among all such pairs of attributes.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


126
Database design approach
 In contrast, a top-down design methodology (design by
analysis) starts with a number of groupings of attributes into
relations that exist together naturally, for example, on an
invoice, a form, or a report.
 The relations are then analyzed individually and collectively,
leading to further decomposition until all desirable properties
are met.
 Relational database design ultimately produces a set of
relations.
 The implicit goals of the design activity are information
preservation and minimum redundancy.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


127
Informal Design Guidelines for Relation
Schemas
 Informal guidelines that may be used as measures to
determine the quality of relation schema design and may
eventually answer;
 Why Normalization for Relational Databases?

1. Making sure that the semantics of the attributes is clear in the


schema
2. Reducing the redundant information in tuples
3. Reducing the NULL values in tuples
4. Disallowing the possibility of generating spurious tuples

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


128
1. Semantics of the Relation Attributes
GUIDELINE 1: Design a relation schema so that it is easy to
explain its meaning. Do not combine attributes from multiple
entity types and relationship types into a single relation.
Consider the relation:
 Attributes of different entities (EMPLOYEEs,
DEPARTMENTs, PROJECTs) should not be mixed in the
same relation.
 Only foreign keys should be used to refer to other entities.
 Entity and relationship attributes should be kept apart as
much as possible.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


129
2. Redundant Information in Tuples and
Update Anomalies
 Mixing attributes of multiple entities may cause problems
 Information is stored redundantly wasting storage
 Problems with update anomalies
 Insertion anomalies
 Deletion anomalies
 Modification anomalies

GUIDELINE 2: Design a schema that does not suffer from the


insertion, deletion and update anomalies. If there are any
present, then note them so that programs that update the
database will operate correctly.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


130
EXAMPLE OF AN UPDATE ANOMALY (1)

 Consider the relation:

 EMP_PROJ ( Emp#, Proj#, Ename, Pname, No_hours)

 Update Anomaly: Changing the name of project


number P1 from “Billing” to “Customer-
Accounting” may cause this update to be made
for all 100 employees working on project P1.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


131
EXAMPLE OF AN UPDATE ANOMALY (2)

 Insert Anomaly: Cannot insert a project unless an


employee is assigned to.
 Inversely - Cannot insert an employee unless an he/she is
assigned to a project.

 Delete Anomaly: When a project is deleted, it will result


in deleting all the employees who work on that project.
Alternately, if an employee is the sole employee on a
project, deleting that employee would result in deleting the
corresponding project.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


132
3. Null Values in Tuples

GUIDELINE 3: Relations should be designed such


that their tuples will have as few NULL values as
possible.
 Attributes that are NULL frequently could be placed in
separate relations (with the primary key)

 Reasons for NULL:


 The attribute does not applicable or invalid
 The attribute value for this tuple unknown
 The value is known to exist, but unavailable

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


133
4. Spurious Tuples

 Bad designs for a relational database may result in erroneous


results for certain JOIN operations.
 The "lossless join" property is used to guarantee meaningful
results for join operations.
 GUIDELINE 4: The relations should be designed to satisfy
the lossless join condition. No spurious tuples should be
generated by doing a natural-join of any relations.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


134
Functional Dependencies

 Functional dependencies (FDs) are used to specify


formal measures of the "goodness" of relational designs

 FDs and keys are used to define normal forms for relations
 FDs are constraints that are derived from the meaning and
interrelationships of the data attributes

 A set of attributes X functionally determines a set of


attributes Y if the value of X determines a unique value for Y

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


135
Functional Dependencies (2)
 X -> Y holds if whenever two tuples have the same value
for X, they must have the same value for Y
 For any two tuples t1 and t2 in any relation instance r(R):
If t1[X]=t2[X], then t1[Y]=t2[Y]
 X -> Y in R specifies a constraint on all relation instances
r(R)
 Written as X -> Y; can be displayed graphically on a
relation schema as in Figures. (denoted by the arrow)
 FDs are derived from the real-world constraints on the
attributes

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


136
Examples of FD constraints (1)

 Social security number determines employee


name
SSN -> ENAME
 Project number determines project name and
location
PNUMBER -> {PNAME, PLOCATION}
 Employee ssn and project number determines the
hours per week that the employee works on the
project
{SSN, PNUMBER} -> HOURS

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


137
Normalization of Relations
 Normalization: The process of decomposing unsatisfactory
"bad" relations by breaking up their attributes into smaller
relations.
 Normal Form: Condition using keys and FDs of a relation to
certify whether a relation schema is in a particular normal
form.
 Denormalization: The process of storing the join of higher
normal form relations as a base relation-which is in a lower normal
form.
 Transforming normalized relations into non- normalized physical
record specifications.
 Benefit: Can improve performance(speed) by reducing numbers of
tables lookup. 138
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
Why do we need Normalization?

 To reduce or minimize redundancy (Duplication


of data) from the database.
 To save storage space and to make more
efficient and error free database.
STUDENT
ID Name Department Head Phone
1001 S1 IS Mr. A 0888888
1002 S2 CS Mr. B 0999999
1003 S3 IS Mr. A 0888888
1004 S4 CNET Mr. C 0777777
1005 S5 IS Mr. A 0888888

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


139
Breaking up the table to Solve Redundancy
problem
STUDENT
ID Name Department Head Phone
1001 S1 IS Mr. A 0888888
1002 S2 CS Mr. B 0999999
1003 S3 IS Mr. A 0888888
1004 S4 CNET Mr. C 0777777
1005 S5 IS Mr. A 0888888

STUDENT
DEPARTMENT
ID Name Department
Dept_ID Head Phone
1001 S1 IS
IS Mr. A 0888888
1002 S2 CS
CNET Mr. C 0777777
1003 S3 IS
CS Mr. B 0999999
1004 S4 CNET
1005 S5 IS Redundancy Reduced
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
140
Normal Forms
Types and Processes

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


141
1. First Normal Form
 Composite attributes: can be further subdivided. Eg:
Name, Address
 Multivalued attributes: can have set of value for same
entity (non-atomic). Eg. Color, Degree
 Nested relations: part of the definition of relation,
means No two columns can have the same name.
 Every column should contain the values of same
domain
 The order of rows and the order of column is
irrelevant
 Considered to be part of the definition of relation 142
Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe
Normalization into 1NF Fname
Mnit
Lname
Eg. 1 : Composite Attribute
Ename

EMPLOYEE
EMPLOYEE
Ssn Ename Address Salary Dept
1001 Ahmad, A, Jarebi Jazan 5600 IT
1002 Ali, Mohemmed, Hakami Sabya 7900 IS
1003 Yahya, M.,Makki Jeddah 8000 CS
1004 Mosa, A, Hazmi Abha 6700 IBD

EMPLOYEE
Ssn Fname Mnit Lname Address Salary Dept
1001 Ahmad, A., Jarebi Jazan 5600 IT
1002 Ali, Mohemmed, Hakami Sabya 7900 IS
1003 Yahya, M, Makki Jeddah 8000 CS
1004 Mosa, A, Hazmi Abha 6700 IBD

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


143
Normalization into 1NF
Eg. 2 : Multivalued Attribute:

Multivalued
Attribute

Converted
in 1NF

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


144
Normalization into 1NF
Eg. 3 : Nested relations
Nested
relation

Converted
in 1NF

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


145
2. Second Normal Form

Definitions: A relation schema R is in Second Normal Form


(2NF), if every Non-prime attribute is Fully functionally
dependent on the Primary key
 Prime attribute :- Attribute that is member of the primary key K
Prime Attributes Non Prime Attributes

 Full functional dependency :- a FD Y -> Z where removal of


any attribute from Y means the FD does not hold any more
FD1: {SSN, PNUMBER} -> HOURS (Full FD )
FD2: {SSN, PNUMBER} -> ENAME (it is called a partial dependency )

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


146
2. Second Normal Form (Cont...)

2 Normal Form

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


147
Class Exercise:2
Consider the given relation PRODUCT is already in 1NF, and the FD
given as FD1,FD2 and FD3, Decompose the relation into 2NF.

FD1 FD3
P_ID Pname Description Order_ID Unit Total_Price
FD2

2 NF Normalization PROD1
P_ID Pname Description
FD1 : P_ID -> {Pname, Description}
PROD2
FD2: {P_ID,Order_ID } ->Total_Price P_ID Order_ID Total_Price
PROD3
FD3: Order_ID -> Unit Order_ID Unit

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


148
3. Third Normal Form
Definition:
 Transitive functional dependency - a FD X -> Z that
can be derived from two FDs X -> Y and Y -> Z

Examples:
- SSN -> DMGRSSN is a transitive FD since
SSN -> DNUMBER and DNUMBER -> DMGRSSN hold

- SSN -> ENAME is non-transitive since there is no set of


attributes X where SSN -> X and X -> ENAME

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


149
Third Normal Form (2)
 A relation schema R is in third normal form (3NF) if it is in
2NF and no non-prime attribute A in R is transitively
dependent on the primary key
 R can be decomposed into 3NF relations via the process of
3NF normalization
NOTE:
In X -> Y and Y -> Z, with X as the primary key, we consider
this a problem only if Y is not a candidate key. When Y is a
candidate key, there is no problem with the transitive
dependency.
E.g., Consider EMP (SSN, Emp#, Salary ).
Here, SSN -> Emp# -> Salary and Emp# is a candidate key.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


150
Example: Normalize into 2NF and 3NF

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


151
Example: Normalize into 2NF and 3NF

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


152
Review questions
1. Discuss bottom-up and top-down design methodology.
2. Discuss insertion, deletion, and modification anomalies. Why are they
considered bad? Illustrate with examples.
3. Define functional dependency with example.
4. Write down the informal design guideline for relation schema.
5. Why do we have NULL values in the relation? What are the demerits of it?
6. What undesirable dependencies are avoided when a relation is in 2NF?
7. What undesirable dependencies are avoided when a relation is in 3NF?
8. What is multivalued dependency? When does it arise?
9. Describe in detail the 1NF,2NF and also 3NF.
10. Discuss the guideline for 1NF, 2NF and 3NF.
11. Write the summary of normal form based on primary keys and corresponding
normalization.

Copyright © 2015 Ramez Elmasri and Shamkant B. Navathe


153

You might also like