Fundamentals of
Database Systems
COMPSCI/SOFTENG 351
COMPSCI 751
Fundamentals of Database Systems Semester 1, 2017
Database Design
Iterative Database Design
Requirements analysis
Conceptual Design
Logical Design
Physical Design
Conceptual Database Design
View of the World
Example: Welcome Harry. Harry is just about to open his own DVD rental
shop. He has everything: potential clients, DVDs to rent, ... well, it might be wise
to keep track of clients renting DVDs. Doing this by hand is a burdensome job, in
particular as Harry cannot afford a secretary.
So Harry needs a database which stores all the data necessary to provide the desired
information.
What about the target of the database to be designed? It contains Harry’s DVDs,
his clients and, of course, all the rentals.
I The Entity-Relationship Model (ERM) views the target of the database as consisting of entities and relationships:
I Entities are basic objects in the target of the database
I Example: Clients and DVDs are basic objects, that is, entities. Rentals are connections
between DVDs and clients, that is, relationships.
I This simple view of the world is the major reason for the success of the data model
Modeling Entities
Entities, Attributes, and Keys
I As in the relational data model (RDM), each attribute has a domain (its set of possible values)
I Throughout, let $D = \{D_i\}_{i \in I}$ be a fixed family of sets, and call each $D_i$ a domain
Entity Types
For short, we write E = (attr(E), id(E)). The attributes in the key id(E) are
called the key attributes of E.
Entities
I Client is an abstract concept that stands for all the real-world clients like Johnny
I Remark: The first tuple notation above does not need a fixed order of the attributes,
since attributes have unique names. The second tuple notation, however,
requires a fixed order of the attributes. Thus, before we can drop the attribute names,
we must fix such an order of the attributes.
Example: Some more clients, namely JohnnyTwo, Lizzie and Debbie. Using the
order of attributes fixed before, these entities give rise to the following tuples:
,→ (John Fox, 06/06/1966, 66 Victoria Street, 3506060)
,→ (Lisa Hunter, 12/12/1982, 22 Te Awe Awe Street, 3502222)
,→ (Debra Gunner, 08/08/1988, 21 Park Road, 3501111)
Entities and Entity Sets
I Again, every client may be seen as a mapping that assigns each attribute a particular
value from the attribute domain
I This approach leads to our next definition
I For the sake of simplicity, we introduce the universal domain $D = \bigcup_{i \in I} D_i$,
which is just the union of all domains under consideration
I A finite set $E^t$ of entities of type E is an entity set if any two entities e1, e2 in $E^t$ that share the same
values on all key attributes (that is, e1(A) = e2(A) for all A ∈ id(E)) are in fact
equal
I This property is called the unique-key-value property
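The unique-key-value property can be checked mechanically. The following is a minimal sketch, assuming entities are represented as Python dictionaries from attribute names to values; the function name has_unique_key_values is hypothetical and not part of the slides. The data are the four example clients of type Client, with key {Name, Birthday}.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

Entity = Dict[str, str]  # an entity maps each attribute name to a value


def has_unique_key_values(entities: Iterable[Entity], key: FrozenSet[str]) -> bool:
    """Return True if entities that agree on all key attributes are equal."""
    seen: Dict[Tuple[str, ...], Entity] = {}
    for e in entities:
        key_values = tuple(e[a] for a in sorted(key))
        if key_values in seen and seen[key_values] != e:
            return False  # two different entities share the same key values
        seen[key_values] = e
    return True


clients = [
    {"Name": "John Fox", "Birthday": "08/08/1980", "Address": "88 Main Street", "Phone": "3508888"},
    {"Name": "John Fox", "Birthday": "06/06/1966", "Address": "66 Victoria Street", "Phone": "3506060"},
    {"Name": "Lisa Hunter", "Birthday": "12/12/1982", "Address": "22 Te Awe Awe Street", "Phone": "3502222"},
    {"Name": "Debra Gunner", "Birthday": "08/08/1988", "Address": "21 Park Road", "Phone": "3501111"},
]

print(has_unique_key_values(clients, frozenset({"Name", "Birthday"})))  # True
print(has_unique_key_values(clients, frozenset({"Name"})))              # False
```

With {Name} alone the check fails, because the two clients called John Fox differ in their other attributes; adding Birthday to the key separates them.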
The Unique-Key-Value Property
I Clearly, the clients Johnny, JohnnyTwo, Lizzie and Debbie are entities of type
Client. Together they form an entity set of type Client:
Client
Name Birthday Address Phone
John Fox 08/08/1980 88 Main Street 3508888
John Fox 06/06/1966 66 Victoria Street 3506060
Lisa Hunter 12/12/1982 22 Te Awe Awe Street 3502222
Debra Gunner 08/08/1988 21 Park Road 3501111
Another Example
In addition to clients, the target of the database for Harry’s DVD rental shop
contains DVDs. So DVD is a further essential concept.
We are going to model this as an entity type, too. Again we have to decide which attributes are
useful and required, we have to specify their domains, and we have to select a key. Say we decide to
use
I the attributes attr(DVD) = {Title, Director, Year},
I the domain assignment is given by dom(Title) = string, dom(Director) = string,
dom(Year) = number,
I and we choose the key id(DVD) = {Title}
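We may couple the domain assignment with the attributes by writing
attr(DVD) = {Title: string, Director: string, Year: number}
or, if the attribute domains are known, even attr(DVD) = {Title, Director, Year}
This gives us a surprisingly short specification of the entity type DVD in the form of a
pair E = (attr(E), id(E)):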
DVD = ({Title, Director, Year}, {Title})
Visualization of Entity Types
I Practice shows that the graphical representation has significant advantages over
textual specification for the purposes of communication between systems analysts,
database designers and potential users of the database
I Entity types are often visualized by rectangles.
I If desired, the attributes can be attached to the rectangle, and those attributes
forming the key are underlined.
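[Figure: rectangle labelled DVD with the attributes Title, Director and Year attached; the key attribute Title is underlined.]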
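I Example: To visualize the entity type
Client = ({Name, Birthday, Address, Phone}, {Name, Birthday})
we use
[Figure: rectangle labelled Client with the attributes Name, Birthday, Address and Phone attached; the key attributes Name and Birthday are underlined.]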
I Remark: Please recall that this convention implicitly supposes that we know about
the domains of the attributes
Modeling Relationships
I So far we found the abstract concepts Client and DVD. In both cases we decided
to specify them as entity types. What about rentals?
I Clearly, they are also in the target of the database and, thus, Rental is a further
abstract concept that deserves to be modeled.
I However, rentals can hardly be seen as basic objects. Roughly speaking, their
existence depends on some client hiring a DVD.
I Hence, rentals are relationships between entities rather than entities themselves:
each rental connects a client and a DVD.
I The abstract concept Rental has two components Client and DVD.
I In addition to the components, we might want to store further data on rentals in
the database.
Example: We choose additional attributes RentalDay and DueDay, both with
domain date.
Relationships, Attributes, and Keys
I These attributes are common attributes of all rentals that may be used to describe
them, that is, we consider the abstract concept Rental to comprise the components
Client and DVD and the attributes RentalDay and DueDay.
Example: Suppose Johnny rents the BlueDVD. Then we are interested in the
rental day (say 03/10/2011) and the due day (say 03/11/2011).
I As before, a key will be used to distinguish different rentals. However, a key of a
relationship type may also contain components.
Example: Johnny might rent more than one DVD, and he might even rent the
BlueDVD again later on. Hence, neither Client and RentalDay, nor Client and
DVD provide sufficient information to distinguish between rentals. But DVD and
RentalDay should be fine.
I The component DVD together with the attribute RentalDay form a key that allows
us to uniquely identify a particular rental.
I Of course, for relationship types, a key may also contain components.
Relationship Types
An Example
Again we prefer to couple the domain assignment with the attributes by writing:
attr(Rental) = {RentalDay: date, DueDay: date}
or even to omit the domain assignment whenever we know about the domains:
attr(Rental) = {RentalDay, DueDay}
This again gives rise to a shorter specification of the relationship type Rental in
the form of a triple R = (comp(R), attr(R), id(R)):
Rental = ({Client, DVD}, {RentalDay, DueDay}, {DVD, RentalDay})
Visualization of Relationship Types
I A relationship type is visualized by a diamond. It is linked by edges to the rectangles
representing its components. For key components, the corresponding edge is marked by a dot.
I If desired, the attributes are simply attached to the diamond. Key attributes are
underlined similar to key attributes of entity types.
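[Figure: diamond labelled Rental with the attributes RentalDay (underlined) and DueDay, linked by edges to the rectangles Client (Name, Birthday, Address, Phone) and DVD (Title, Director, Year); the edge to the key component DVD is marked by a dot.]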
Relationships
I Rental is an abstract concept that stands for all the real-world rentals.
I As before, each real-world rental is written in the form of a tuple like
,→ (Client: Johnny, DVD: BlueDVD, RentalDay: 03/10/2011, DueDay: 03/11/2011)
,→ or, even shorter, (Johnny, BlueDVD, 03/10/2011, 03/11/2011)
I Each rental may be seen as a mapping that assigns each component a particular
entity from the entity set and each attribute a particular value from the attribute
domain.
I Let ent(E) denote the set of all entities of an entity type E.
I ent(Client) consists of the clients Johnny, JohnnyTwo, Lizzie, and Debbie.
I ent(DVD) consists of the DVDs BlueDVD, WhiteDVD, and RedDVD.
Relationships, and Relationship Sets
1) A relationship of type R is a mapping r that assigns a value to each component and each attribute of R,
with r(E) ∈ ent(E) for all components E ∈ comp(R) and r(A) ∈ dom(A) for
all attributes A ∈ attr(R).
2) A relationship set is a finite set $R^t$ of relationships of type R with unique key
values, that is, for all r1, r2 ∈ $R^t$ with r1(X) = r2(X) for all key components
and key attributes X ∈ id(R) we must have r1 = r2.
I Entities and relationships are jointly called objects
I A finite set $R^t$ of relationships of type R is a relationship set if any two relationships r1, r2
in $R^t$ that share the same values on all key components and key attributes (that is,
r1(X) = r2(X) for all X ∈ id(R)) are actually equal
I This is again the unique-key-value property
The Unique-Key-Value Property
Rental
Client DVD RentalDay DueDay
Johnny BlueDVD 03/10/2011 03/11/2011
Johnny BlueDVD 05/01/2012 05/02/2012
Johnny WhiteDVD 03/10/2011 03/11/2011
JohnnyTwo RedDVD 07/10/2011 07/11/2011
Debbie WhiteDVD 15/11/2011 15/12/2011
Lizzie BlueDVD 11/11/2011 11/12/2011
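The choice of key for Rental can be tested against the table above. This is a minimal sketch, assuming rentals are stored as plain tuples in the column order Client, DVD, RentalDay, DueDay; the helper is_key is a hypothetical name introduced only for illustration.

```python
from collections import Counter
from typing import List, Sequence, Tuple

COLUMNS = ("Client", "DVD", "RentalDay", "DueDay")

rentals: List[Tuple[str, str, str, str]] = [
    ("Johnny", "BlueDVD", "03/10/2011", "03/11/2011"),
    ("Johnny", "BlueDVD", "05/01/2012", "05/02/2012"),
    ("Johnny", "WhiteDVD", "03/10/2011", "03/11/2011"),
    ("JohnnyTwo", "RedDVD", "07/10/2011", "07/11/2011"),
    ("Debbie", "WhiteDVD", "15/11/2011", "15/12/2011"),
    ("Lizzie", "BlueDVD", "11/11/2011", "11/12/2011"),
]


def is_key(rows: Sequence[Tuple[str, ...]], key: Sequence[str]) -> bool:
    """Return True if no two distinct rows agree on all columns named in key."""
    positions = [COLUMNS.index(name) for name in key]
    counts = Counter(tuple(row[p] for p in positions) for row in set(rows))
    return all(n == 1 for n in counts.values())


print(is_key(rentals, ["DVD", "RentalDay"]))     # True
print(is_key(rentals, ["Client", "DVD"]))        # False
print(is_key(rentals, ["Client", "RentalDay"]))  # False
```

The two failing candidates correspond to Johnny renting the BlueDVD twice and Johnny renting two DVDs on the same day.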
Components with Roles
I Example: Harry also rents his DVDs to teenagers like Debbie and her friends. Now
Harry is wondering whether their parents are clients, too. This information might
be helpful in running the shop.
I This can be modeled by a relationship type Descendent whose components are
both of type Client.
I To avoid confusion, roles are associated with the different components, such as
Child and Parent. This gives rise to the relationship type Descendent
I with components
comp(Descendent) = {Child:Client, Parent:Client},
I without additional attributes (and thus without additional domain assignment)
attr(Descendent) = ∅,
I and with key id(Descendent) = {Child:Client, Parent:Client}
Visualizing Roles
[Figure: diamond labelled Descendent, linked by two edges with the role names Child and Parent to the rectangle Client (Name, Birthday, Address, Phone).]
I Apart from the use of roles, we may observe two further details in the example above:
I First, there is no need for relationship types to possess attributes. Attributes are
intended to capture properties which are useful in meeting the information needs
or to ensure that each relationship in a relationship set can be uniquely identified.
But this does not imply that attributes are compulsory for relationship types.
Observations
Descendent
Child:Client Parent:Client
Debbie Mary
Debbie Bob
Julie Mary
Julie Bob
I Second, the key may include all components or attributes of a relationship type.
While this is an extreme situation, it may well occur in practice.
I In most cases, however, we do not need all components and attributes
in order to identify objects (entities or relationships) of some type uniquely. This
is because we are often interested in several properties that are useful to know but
irrelevant for the identification of the object itself.
Entity-Relationship Schemata
I Entities and relationships are objects in the target of the database. Consequently,
they should be stored in the database. By classifying these objects we found abstract
concepts like Client, DVD or Rental.
I These abstract concepts were modeled by entity types or relationship types. Together
they form the database schema for the database to be designed.
I Naturally, whenever we use a relationship type then all the entity types forming its
components should be in the database schema, too.
An Entity-Relationship schema (or ER schema, for short) is a finite set S
of entity types and relationship types such that for each relationship type R in S
and each of its components E or p : E in comp(R), we have that the entity type
E belongs to S as well.
I Every object type should have a unique name in the ER schema. Attributes, on the
other hand, need not be unique.
I In fact, an attribute like Title or Name is likely to be an interesting property for
various object types.
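The closure condition in the definition of an ER schema can be sketched in code. The classes EntityType and RelationshipType and the function is_er_schema below are hypothetical names chosen for illustration; components are assumed to be stored as strings of the form "E" or "p:E", where p is a role name.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Union


@dataclass(frozen=True)
class EntityType:
    name: str
    attrs: FrozenSet[str]
    key: FrozenSet[str]


@dataclass(frozen=True)
class RelationshipType:
    name: str
    comps: FrozenSet[str]  # "E" or "role:E"
    attrs: FrozenSet[str]
    key: FrozenSet[str]


def is_er_schema(types: List[Union[EntityType, RelationshipType]]) -> bool:
    """Check unique type names and that every component's entity type is present."""
    names = [t.name for t in types]
    if len(names) != len(set(names)):
        return False
    entity_names = {t.name for t in types if isinstance(t, EntityType)}
    for t in types:
        if isinstance(t, RelationshipType):
            component_types = {c.split(":")[-1] for c in t.comps}  # drop role names
            if not component_types <= entity_names:
                return False
    return True


client = EntityType("Client", frozenset({"Name", "Birthday", "Address", "Phone"}),
                    frozenset({"Name", "Birthday"}))
dvd = EntityType("DVD", frozenset({"Title", "Director", "Year"}), frozenset({"Title"}))
rental = RelationshipType("Rental", frozenset({"Client", "DVD"}),
                          frozenset({"RentalDay", "DueDay"}), frozenset({"DVD", "RentalDay"}))
descendent = RelationshipType("Descendent", frozenset({"Child:Client", "Parent:Client"}),
                              frozenset(), frozenset({"Child:Client", "Parent:Client"}))

print(is_er_schema([client, dvd, rental, descendent]))  # True
print(is_er_schema([client, rental]))                   # False, DVD is missing
```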
Entity-Relationship Diagrams
I ER schema and ER diagram are just two different ways of presenting essentially the
same information.
An Example
The ER schema for Harry’s shop consists of the entity types Client and DVD,
and of the relationship type Rental.
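[Figure: ER diagram with the diamond Rental (attributes RentalDay, underlined, and DueDay) linked to the rectangles Client (Name, Birthday, Address, Phone) and DVD (Title, Director, Year); the edge to the key component DVD is marked by a dot.]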
I So far we always attached the attributes to the entity or relationship types. This
increases the level of detail, but might decrease the readability of the diagram. For
that reason, we sometimes omit the attributes in the diagram.
I Finally, key components of a relationship type are marked by dots on the edges to
the corresponding entity types. Key attributes of object types are underlined, as
long as we decided to attach attributes to the rectangles and diamonds.
Database Instances
I Each of the object types in an ER schema stands for a set of objects in the target of
the database. Hence, the database will contain an object set for each of the object
types.
I Clearly, these sets may change over time: Hopefully, Harry’s DVD rental shop will
acquire new clients over time. Similarly, Harry might buy new DVDs or replace
some of the old ones.
I The latter condition ensures consistency of the database instance: an entity may
only occur in a relationship if it belongs to the relevant entity set.
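The consistency condition can be illustrated with a small check. This is only a sketch, assuming that entities are referred to by shortcuts such as Johnny or BlueDVD; the function is_consistent and the entry GreenDVD are hypothetical and serve only to show a violation.

```python
from typing import Dict, List, Set


def is_consistent(entity_sets: Dict[str, Set[str]],
                  relationships: List[Dict[str, str]],
                  components: List[str]) -> bool:
    """Every entity used in a relationship must belong to the relevant entity set."""
    return all(r[c] in entity_sets[c] for r in relationships for c in components)


entity_sets = {
    "Client": {"Johnny", "JohnnyTwo", "Lizzie", "Debbie"},
    "DVD": {"BlueDVD", "WhiteDVD", "RedDVD"},
}

rentals = [
    {"Client": "Johnny", "DVD": "BlueDVD", "RentalDay": "03/10/2011", "DueDay": "03/11/2011"},
    {"Client": "Lizzie", "DVD": "BlueDVD", "RentalDay": "11/11/2011", "DueDay": "11/12/2011"},
]

# GreenDVD is a made-up DVD that does not occur in the DVD entity set.
dangling = rentals + [
    {"Client": "Debbie", "DVD": "GreenDVD", "RentalDay": "01/01/2012", "DueDay": "01/02/2012"},
]

print(is_consistent(entity_sets, rentals, ["Client", "DVD"]))   # True
print(is_consistent(entity_sets, dangling, ["Client", "DVD"]))  # False
```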
An Example
Client
Name Birthday Address Phone
John Fox 08/08/1980 88 Main Street 3508888
John Fox 06/06/1966 66 Victoria Street 3506060
Lisa Hunter 12/12/1982 22 Te Awe Awe Street 3502222
Debra Gunner 08/08/1988 21 Park Road 3501111
DVD
Title Director Year
Blue Velvet David Lynch 1986
White Oleander Peter Kosminsky 2003
The Hunt for Red October John McTiernan 1990
Rental
Client DVD RentalDay DueDay
Johnny BlueDVD 03/10/2011 03/11/2011
Johnny BlueDVD 05/01/2012 05/02/2012
Johnny WhiteDVD 03/10/2011 03/11/2011
JohnnyTwo RedDVD 07/10/2011 07/11/2011
Debbie WhiteDVD 15/11/2011 15/12/2011
Lizzie BlueDVD 11/11/2011 11/12/2011
Semantics of Relationships
I In the examples so far we were somewhat cheating. We used shortcuts like Johnny
or BlueDVD. These shortcuts exist only in our imagination ...
I ... but in the database they are not available. To be precise, the rental
(Johnny, BlueDVD, 03/10/2011, 03/11/2011)
should read as
((John Fox, 08/08/1980, 88 Main Street, 3508888), (Blue Velvet, David Lynch, 1986), 03/10/2011, 03/11/2011)
I In fact, it is not necessary to include all the attributes of the client into the rental:
I It suffices to include the key attributes
I By the unique-key-value property, we can identify the corresponding object
Set Semantics vs. Foreign Key Semantics
I Note: In the definitions above we used set semantics. Foreign key semantics leads to
slightly modified definitions of relationships, relationship sets and database instances
I The major difference from the original definition is that we no longer use the entire
entity e, but only its restriction $e|_{id(E)}$ to the key attributes of the type E
I In this sense, foreign key semantics reflects the fact that we need only the values on
key attributes to uniquely identify an entity in an entity set
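The step from set semantics to foreign key semantics amounts to restricting each component entity to its key attributes. A minimal sketch, assuming entities and relationships are dictionaries; the names restrict, to_foreign_key_semantics and KEYS are hypothetical.

```python
from typing import Any, Dict, List


def restrict(entity: Dict[str, Any], key: List[str]) -> Dict[str, Any]:
    """Restriction of an entity to the attributes listed in key."""
    return {a: entity[a] for a in key}


def to_foreign_key_semantics(relationship: Dict[str, Any],
                             keys: Dict[str, List[str]]) -> Dict[str, Any]:
    """Replace every component entity by its restriction to the key attributes."""
    return {name: restrict(value, keys[name]) if name in keys else value
            for name, value in relationship.items()}


KEYS = {"Client": ["Name", "Birthday"], "DVD": ["Title"]}

johnny = {"Name": "John Fox", "Birthday": "08/08/1980",
          "Address": "88 Main Street", "Phone": "3508888"}
blue_dvd = {"Title": "Blue Velvet", "Director": "David Lynch", "Year": 1986}

rental = {"Client": johnny, "DVD": blue_dvd,
          "RentalDay": "03/10/2011", "DueDay": "03/11/2011"}

print(to_foreign_key_semantics(rental, KEYS))
```

The result keeps only (John Fox, 08/08/1980) for the client and (Blue Velvet) for the DVD, while the attributes RentalDay and DueDay are unchanged.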
An Example
The set of rentals in set semantics:
Rental
Client  DVD  RentalDay  DueDay
(John Fox, 08/08/1980, 88 Main Street, 3508888)  (Blue Velvet, David Lynch, 1986)  03/10/2011  03/11/2011
(John Fox, 08/08/1980, 88 Main Street, 3508888)  (Blue Velvet, David Lynch, 1986)  05/01/2012  05/02/2012
(John Fox, 08/08/1980, 88 Main Street, 3508888)  (White Oleander, Peter Kosminsky, 2003)  03/10/2011  03/11/2011
(John Fox, 06/06/1966, 66 Victoria Street, 3506060)  (The Hunt for Red October, John McTiernan, 1990)  07/10/2011  07/11/2011
(Debra Gunner, 08/08/1988, 21 Park Road, 3501111)  (White Oleander, Peter Kosminsky, 2003)  15/11/2011  15/12/2011
(Lisa Hunter, 12/12/1982, 22 Te Awe Awe Street, 3502222)  (Blue Velvet, David Lynch, 1986)  11/11/2011  11/12/2011
The same set of rentals in foreign key semantics:
Rental
Client  DVD  RentalDay  DueDay
(John Fox, 08/08/1980)  (Blue Velvet)  03/10/2011  03/11/2011
(John Fox, 08/08/1980)  (Blue Velvet)  05/01/2012  05/02/2012
(John Fox, 08/08/1980)  (White Oleander)  03/10/2011  03/11/2011
(John Fox, 06/06/1966)  (The Hunt for Red October)  07/10/2011  07/11/2011
(Debra Gunner, 08/08/1988)  (White Oleander)  15/11/2011  15/12/2011
(Lisa Hunter, 12/12/1982)  (Blue Velvet)  11/11/2011  11/12/2011
Identifier Semantics
I Still, the idea to use shortcuts like Johnny or BlueDVD seems to be promising
An Example
I The unique-identifier property in part 2) of the definition ensures that each entity
in an entity set receives a unique identifier. Clearly, this shortcut may then be used
to identify the corresponding entity in the entity set.
Client
ID Name Birthday Address Phone
Johnny John Fox 08/08/1980 88 Main Street 3508888
JohnnyTwo John Fox 06/06/1966 66 Victoria Street 3506060
Lizzie Lisa Hunter 12/12/1982 22 Te Awe Awe Street 3502222
Debbie Debra Gunner 08/08/1988 21 Park Road 3501111
DVD
ID Title Director Year
BlueDVD Blue Velvet David Lynch 1986
WhiteDVD White Oleander Peter Kosminsky 2003
RedDVD The Hunt for Red October John McTiernan 1990
Unique-Key-Value and Unique-Identifier Property
Rental
ID Client DVD RentalDay DueDay
i1 Johnny BlueDVD 03/10/2011 03/11/2011
i2 Johnny BlueDVD 05/01/2012 05/02/2012
i3 Johnny WhiteDVD 03/10/2011 03/11/2011
i4 JohnnyTwo RedDVD 07/10/2011 07/11/2011
i5 Debbie WhiteDVD 15/11/2011 15/12/2011
i6 Lizzie BlueDVD 11/11/2011 11/12/2011
I Of course, the rentals in our database also receive identifiers. Here we used artificial
identifiers i1, i2, ... This is usual in practice, and refers to the fact that an identifier
does not reflect an additional attribute, but is simply a technical means to allow a
more compact representation of data.
Adjusting to Identifier Semantics
More Entity-Relationship Modeling
I Shortly after its introduction, the ERM became the most popular data model used
in conceptual database design
I Over time, a number of extensions were proposed to overcome some minor
drawbacks and to increase the expressive power of the ERM
I After all, the ERM with its extensions provides effective and convenient means of
describing the target of the database
I Some extensions we shall discuss:
I Further ways to form relationship types
I Higher-order relationship types
I So far we used aggregation to form relationship types. To provide more freedom when
modelling objects in the target of the database, other ways of forming relationship
types were proposed including
I Specialization,
I Generalization, and
I Clusters.
Specialization
I Sometimes, an object in the target of the database can be represented by more than
just a single abstract concept.
I Example: Students are also persons. Graduate students are also students and thus
also persons.
I For this it would be good to derive abstract concepts from other abstract concepts,
such as the more specific concept Student from the more general concept Person.
I This idea is known as specialization. The derived object type is a subtype of the
more general supertype. A subtype inherits all features of its supertype, but often
adds some new properties.
I The subtype U may be modeled as a unary relationship type whose single component
is just its supertype C. Clearly, U may have some additional attributes, and we may
use C as the key for U :
U = ({C}, attr(U ), {C})
I Note: Every object of type C gives rise to at most one object of type U .
An Example of a Specialization Hierarchy
[Figure: specialization hierarchy. Types shown: Person, Student, General Staff, Lecturer, Graduate Student. Attribute labels shown: Name, Address, StudentId, Major, Position, Department, Subject, Degree, Topic.]
Adding Some Further Relationship Types
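[Figure: the specialization hierarchy extended by further types. Types shown: Person, Student, General Staff, Lecturer, Paper, Supervises. Attribute labels shown: Name, Address, StudentId, Major, Position, Department, Subject, No, Title.]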
Generalization and Clusters
I For this it would be good to have an abstract concept that models all of them, that
is, we would like to combine several object types into a single new type.
I The idea is known as generalization as the new abstract concept is more general
than the individual ones.
I To use roles, we also allow components of the form pi : Ci rather than simply Ci.
An Example of a Cluster Type
[Figure: example of a cluster type. Types shown: Person, Student, Graduate Student, General Staff, Lecturer, Tutor, Employee. Attribute labels shown: Name, Address, Phone, StudentId, Major, Position, Department, Subject, Degree, Topic.]
Disjoint Unions of Object Sets
I Thus, if the same object o occurs in another object set I(Cj) with i ≠ j, the pairs
(j, o) and (i, o) are different
I Afterwards, the object set I(U) associated with a cluster U is the disjoint union
$$I(U) = \bigcup_{i=1}^{n} \{(i, o) : o \in I(C_i)\}$$
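The tagging with the index i is what keeps the union disjoint. A minimal sketch, assuming each object set is a Python set of hashable objects; the function disjoint_union and the example sets are hypothetical.

```python
from typing import Hashable, List, Set, Tuple


def disjoint_union(object_sets: List[Set[Hashable]]) -> Set[Tuple[int, Hashable]]:
    """Tag every object with the (1-based) index of the set it comes from."""
    return {(i, o) for i, objects in enumerate(object_sets, start=1) for o in objects}


# Hypothetical object sets I(C1) and I(C2) that share the object "Bob".
objects_c1 = {"Mary", "Bob"}
objects_c2 = {"Bob", "Julie"}

cluster_objects = disjoint_union([objects_c1, objects_c2])

print(len(objects_c1 | objects_c2))  # 3, the ordinary union merges "Bob"
print(len(cluster_objects))          # 4, (1, "Bob") and (2, "Bob") are different
```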
Adding a Further Relationship Type
[Figure: the cluster example extended by a further relationship type. Types shown: Person, Student, Graduate Student, General Staff, Lecturer, Tutor, Employee, Department, Hires. Attribute labels shown: Name, Address, Phone, StudentId, Major, Position, Office, Subject, Degree, Topic, Salary, Since.]
Higher-order Relationship Types
Extended Entity-Relationship Schemata
An Example
[Figure: example of an extended ER schema. Types shown: LendingPeriod, Billing, Rental, Manager, Supplies, Vendor, Buys. Attribute labels shown: Month, Year, Advance, Date, RentalNo, IRD_No, Conditions, Salary, VendorName, Representative, Tel, Amount.]
Transforming ER Schemata into RDM Schemata
Requirements analysis
Conceptual Design
Logical Design
Physical Design
Our Running Example
[Figure: ER diagram of the running example. Types shown: Supplier, DayOffer, Article. Attribute labels shown: No, Name, Date, Shortname, Price, Address, QuantityOnStock, Budget.]
Transformation of Entity Types
I Start with level 0 types and then work your way up gradually
I Entity types:
Example: Transformation of the Entity Types
Transformation of Relationship Types
Example: Transformation of Order-1 Relationship Types
I k_attr(Department) = {Department.No}
I k_attr(Supplier) = {Supplier.No}
I k_attr(Article) = {Article.No}
I key: {Supplier.No,Article.No,Date}
I foreign keys:
,→ [Supplier.No] ⊆ Supplier’[No]
,→ [Article.No] ⊆ Article’[No]
Example: Transformation of Order-2 Relationship Types
I k_attr(DayOffer) = {DayOffer.Supplier.No,DayOffer.Article.No,DayOffer.Date}
I Purchase’ = {Department.No,DayOffer.Supplier.No,DayOffer.Article.No,DayOffer.Date,Quantity}
with
I key: {Department.No,DayOffer.Supplier.No,DayOffer.Article.No,DayOffer.Date}
I foreign keys:
[Department.No] ⊆ Department’[No]
[DayOffer.Supplier.No,DayOffer.Article.No,DayOffer.Date] ⊆
DayOffer’[Supplier.No,Article.No,Date]
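The pattern visible in the two examples can be sketched as a recursive computation. The rule below is inferred from the examples only and is not taken from a definition in these slides; the class ObjectType and the functions k_attr, relation_attrs and relation_key are hypothetical names. The attribute lists are assumptions: entity types are reduced to their key attribute No, DayOffer is assumed to carry Date and Price, and Purchase is assumed to have the components Department and DayOffer with the attribute Quantity.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ObjectType:
    name: str
    comps: List[str] = field(default_factory=list)  # names of component types
    attrs: List[str] = field(default_factory=list)
    key: List[str] = field(default_factory=list)    # key components and key attributes


def k_attr(t: ObjectType, schema: Dict[str, ObjectType]) -> List[str]:
    """Key attributes of t, each prefixed with the name of t."""
    result: List[str] = []
    for x in t.key:
        if x in t.comps:  # key component: inherit its key attributes
            result.extend(f"{t.name}.{a}" for a in k_attr(schema[x], schema))
        else:             # key attribute of t itself
            result.append(f"{t.name}.{x}")
    return result


def relation_attrs(t: ObjectType, schema: Dict[str, ObjectType]) -> List[str]:
    """Attributes of the relation schema derived from t."""
    cols = [a for c in t.comps for a in k_attr(schema[c], schema)]
    return cols + list(t.attrs)


def relation_key(t: ObjectType, schema: Dict[str, ObjectType]) -> List[str]:
    """Key of the relation schema derived from t."""
    cols: List[str] = []
    for x in t.key:
        cols.extend(k_attr(schema[x], schema) if x in t.comps else [x])
    return cols


schema = {t.name: t for t in [
    ObjectType("Department", attrs=["No"], key=["No"]),
    ObjectType("Supplier", attrs=["No"], key=["No"]),
    ObjectType("Article", attrs=["No"], key=["No"]),
    ObjectType("DayOffer", comps=["Supplier", "Article"], attrs=["Date", "Price"],
               key=["Supplier", "Article", "Date"]),
    ObjectType("Purchase", comps=["Department", "DayOffer"], attrs=["Quantity"],
               key=["Department", "DayOffer"]),
]}

print(k_attr(schema["DayOffer"], schema))
print(relation_attrs(schema["Purchase"], schema))
print(relation_key(schema["Purchase"], schema))
```

Under these assumptions the sketch reproduces k_attr(DayOffer) as well as the attributes and the key of Purchase’ listed above.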
How to Handle Cluster Types
Transformation of Cluster Types
I Cluster types in an ER schema S that are not a component of any relationship type can
be removed from S
I For each i = 1, . . . , n:
,→ Ri is obtained from R by replacing every occurrence of C by Ci
I If some Ri still contains clusters, then repeat the process of replacing these clusters
by their components
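The replacement of clusters by their components can be sketched as follows. Instead of repeating the replacement step by step, the sketch expands all clusters at once, which gives the same result as long as clusters are not defined in terms of themselves. The names leaf_members and expand_clusters are hypothetical, and the cluster Supervisor with members Professor and AssociateProf is an assumption made for the example.

```python
from itertools import product
from typing import Dict, List, Tuple

RelType = Tuple[List[str], List[str], List[str]]  # (components, attributes, key)


def leaf_members(name: str, clusters: Dict[str, List[str]]) -> List[str]:
    """All non-cluster types reachable from name by unfolding clusters."""
    if name not in clusters:
        return [name]
    return [leaf for member in clusters[name] for leaf in leaf_members(member, clusters)]


def expand_clusters(rel: RelType, clusters: Dict[str, List[str]]) -> List[RelType]:
    """One cluster-free relationship type per choice of cluster members."""
    comps, attrs, key = rel
    results: List[RelType] = []
    for choice in product(*(leaf_members(c, clusters) for c in comps)):
        substitution = dict(zip(comps, choice))
        new_key = [substitution.get(k, k) for k in key]  # key attributes stay as they are
        results.append((list(choice), list(attrs), new_key))
    return results


# Assumed input: Supervise with the cluster Supervisor as a key component.
clusters = {"Supervisor": ["Professor", "AssociateProf"]}
supervise: RelType = (["Supervisor", "Student", "Project"],
                      ["since", "until"],
                      ["Supervisor", "Project"])

for r in expand_clusters(supervise, clusters):
    print(r)
```

Each printed triple has the form (components, attributes, key), with Supervisor replaced by one of its members in both the components and the key.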
Example: Cluster-free ER Schema
[Figure: ER diagram around the relationship type Supervise. Labels shown: Professor, Associate Prof, Supervisor, Student, PostGrad, Under Graduate, Dept, Subject, Degree, Name, since, until.]
Example: Cluster-free ER Schema
I Prof Supervision:
({Professor, Student, Project},{since,until},{Professor, Project})
I AProf Supervision:
({AssociateProf, Student, Project},{since,until},{AssociateProf, Project})
Main Contributors to the Entity-Relationship Model
I Peter PC Chen
I originator of the basic ER model
I The Entity-Relationship Model -
Toward a Unified View of Data, ACM ToDS, 1976.
I over 8,000 citations
I Bernhard Thalheim
I several extensions of the basic ER model,
including semantics and higher-order object types
I Entity-Relationship modeling -
Foundations of Database Technology, Springer, 2000.
Summary