
COMP 204: Advanced Database Systems
(BSc. Computer Science)
Lecturer: Patrick Kwabena Mensah
WhatsApp: 0247 89 0072

06/27/2022 13:12 1
Course Outline
• Relational database features;
• Internet-based distributed databases;
• Relational algebra and calculus;
• SQL, queries, constraints, triggers, MySQL;
• Application development, JDBC, ODBC;
• Storage and indexing, transaction management, concurrency control,
crash recovery, security and authorization;
• Configuration issues: setup and installation of SQL Server;
• Tuning and performance measurements.

Recommended Books
1. Toby J. Teorey et al., (2011) Database Systems: A Practical
Approach to Design, Implementation, and Management, Sixth
Edition, Morgan Kaufmann Publishers.
2. Abraham Silberschatz et al., (2011) Database System Concepts, 7th
Edition, McGraw-Hill Publishers.
3. Anthony DeBarros, (2018) Practical SQL: A Beginner’s Guide to
Storytelling with Data, No Starch Press.
4. Tamer Özsu, (2020) Principles of Distributed Database Systems,
Fourth Edition, Springer.

Relational Databases
1. Enhanced Entity Relationship Modelling
• EER => ER model supported with additional semantic concepts to
meet complex demanding application requirements.
• Examples of such database applications:
• Computer-Aided Design (CAD),
• Computer-Aided Manufacturing (CAM),
• Computer-Aided Software Engineering (CASE) tools,
• Office Information Systems (OIS) and Multimedia Systems,
• Digital Publishing, and
• Geographical Information Systems (GIS)

Additional Concepts that EER adds to ER
• Specialization/generalization,
• Aggregation, and
• Composition.

Specialization/Generalization
• Includes special types of entities known as
• Super-classes and Subclasses, and
• the process of Attribute Inheritance.

• We show how to represent specialization/generalization in an EER
diagram using UML.

• Entity type => a set of entities of the same type such as Staff, Branch,
and PropertyForRent.

• Superclass => an entity type that has one or more distinct subclasses.
• E.g.
Superclass (generalization): Staff
Subclasses (specialization): Manager, SalesPersonnel, Secretary
• Each member of a subclass is also a member of the superclass; this
one-to-one (1:1) relationship is called a superclass/subclass relationship.

• Overlapping subclasses => e.g. a member of staff who is both a Manager and a
SalesPersonnel.
• Not every member of a superclass is a member of a subclass. E.g.?
• Problem of holding all staff details in one relation:
• attributes appropriate to all staff (namely staffNo, name, position, and
salary) are always filled, but those that apply only to particular job roles are
only partially filled. E.g.

Advantages of superclasses and subclasses in an ER model
1. To avoid describing similar concepts more than once, thereby saving
time for the designer and making the ER diagram more readable.
2. To add more semantic information to the design in a form that is
familiar to many people.
E.g. “Manager IS-A member of staff” and “flat IS-A type of property,”
communicates significant semantic content in a concise form.
• Type hierarchy => Entity and its subclasses and their subclasses, etc.
• Type hierarchy is also called specialization hierarchy or
generalization hierarchy or IS-A hierarchy.

• Shared subclass => A subclass with more than one superclass.
• Multiple inheritance => attributes of the superclasses are inherited by the
shared subclass.

• Specialization is a top-down approach to defining a set of superclasses and
their related subclasses.
• E.g. Manager, SalesPersonnel, and Secretary are subclasses of the Staff
superclass because they have distinctive attributes.

• Generalization is a bottom-up approach that results in the identification
of a generalized superclass from the original entity types.

E.g.

Constraints on Specialization/Generalization

• Two types:
1. Participation constraints
2. Disjoint constraints

• A participation constraint may be mandatory or optional.
• superclass/subclass relationship with mandatory participation means every
member in the superclass must also be a member of a subclass.
• To represent mandatory participation, “Mandatory” is placed in curly
brackets below the triangle that points towards the superclass.
• E.g. Fig 13.3 => every member of staff must have a contract of employment.
• Optional participation specifies that a member of a superclass need not
belong to any of its subclasses.

• A disjoint constraint only applies when a superclass has more than one subclass.
• Disjoint => an entity occurrence can be a member of only
one of the subclasses.
• “Or” => represents a disjoint superclass/subclass relationship.
• E.g. Fig. 13.3: a member of staff must have a full-time permanent or
a part-time temporary contract, but not both.
• “And” => the subclasses of a specialization/generalization are not
disjoint (called non-disjoint), so an entity occurrence may be a
member of more than one subclass.
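
The participation and disjoint constraints can also be checked on sample data. The following is only an illustrative Python sketch, not part of the course material: the function name, the staff identifiers, and the subclass names FullTimePermanent and PartTimeTemporary are assumptions loosely based on the contract example.

```python
def check_specialization(superclass, subclasses, mandatory, disjoint):
    """Return a list of constraint violations for one specialization.

    superclass: set of entity identifiers in the superclass
    subclasses: dict mapping subclass name -> set of member identifiers
    """
    violations = []
    for entity in sorted(superclass):
        memberships = [name for name, members in subclasses.items() if entity in members]
        if mandatory and not memberships:
            violations.append(f"{entity}: belongs to no subclass (violates Mandatory)")
        if disjoint and len(memberships) > 1:
            violations.append(f"{entity}: belongs to {memberships} (violates Or)")
    return violations


# Hypothetical sample data, invented for illustration.
staff = {"S1", "S2", "S3"}
contracts = {
    "FullTimePermanent": {"S1", "S2"},
    "PartTimeTemporary": {"S2"},
}

# Under {Mandatory, Or}: S2 holds both contracts and S3 holds none.
problems = check_specialization(staff, contracts, mandatory=True, disjoint=True)
for line in problems:
    print(line)
```

With {Optional, And} (both flags False) the same data produces no violations.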

Aggregation

• Aggregation => a relationship in which one entity represents a larger entity
(the “whole”) consisting of smaller entities (the “parts”).
• Place an open diamond shape at one end of the relationship line, next
to the entity that represents the “whole.”

Composition

• Aggregation does nothing more than distinguish a “whole” from a “part.”
• Composition represents strong ownership and coincidental lifetime
between the “whole” and the “part”.
• In a composite, the “whole” is responsible for the disposition of the
“parts,”
• i.e. the composition must manage the creation and destruction of its “parts.”

• Place a filled-in diamond shape at the entity that represents the
“whole” in the relationship.

Difference between Composition and Aggregation
• Composition => e.g. Advert entity (the “part”) belongs to exactly one
Newspaper entity (the “whole”).
• Aggregation => A part may be shared by many wholes.
• E.g. a Staff entity may be “a part of” one or more Branch entities.

Normalization
• Database design objective is to create an accurate representation of
1. the data,
2. relationships between the data,
3. and constraints on the data
• Use either of the following database design techniques to achieve these objectives:
1. ER modeling or
2. Normalization
• Attributes => properties of the data, or of the relationships between the data, that are
important to the enterprise.
• Normalization => a database design technique that examines the
relationships (i.e. functional dependencies) between attributes.

• Normalization uses a series of tests (described as normal forms) to
help identify the optimal grouping for these attributes.

• It is a formal technique for analyzing relations based on their primary
key (or candidate keys) and functional dependencies.
• It has a series of rules used to test relations for normalizing the
database to any degree.
• When a requirement is not met, the relation violating the requirement
must be decomposed into relations that individually meet the
requirements of normalization.
E.g. Dependencies
1. Data Redundancy and Update Anomalies
• Major aim of relational database design => to group attributes into
relations to minimize data redundancy.
• Benefits:
• Updates done with minimal number of operations reducing the
occurrence of data inconsistencies
• Reduces file storage space, hence minimizing costs
• However, some data redundancy is necessary:
• Copies of primary keys (or candidate keys) act as foreign keys
in related relations to enable the modeling of relationships
between data.
Problems of unwanted data redundancy
• Primary Key underlined

• StaffBranch relation has redundant data; the details of a branch are
repeated for every member of staff located at that branch.
• Redundant Relations may have problems called update anomalies:
• insertion,
• deletion, or
• modification
• Insertion anomalies => 2 types
1. To add a new member of staff to the StaffBranch relation, the details of the
branch (e.g. B007) must be entered again and must agree with the existing
tuples for that branch. The separate Staff and Branch relations do not suffer
from this potential inconsistency: only the branchNo is entered in Staff, and
the details of branch B007 are recorded as a single tuple in the Branch relation.
2. To add a new branch with no staff to the StaffBranch relation, we must
enter nulls into the attributes for staff, e.g. staffNo.

• But staffNo is the primary key for the StaffBranch relation, so
entering nulls would violate entity integrity.
• We therefore cannot enter a tuple for a new branch into the StaffBranch
relation with a null for staffNo.
• The separate Staff and Branch relations avoid this problem.
• Deletion anomalies =>
• Deleting the tuple of the last member of staff at a branch from the
StaffBranch relation also deletes the details of that branch from the
database.
• E.g. deleting the tuple for staff number SA9 (Mary Howe) from the
StaffBranch relation would lose the details of branch number
B007, were it not for the presence of the Branch
relation.

• Modification Anomalies =>
• To change the value of a branch attribute in the StaffBranch relation
(e.g. the address of branch number B003), the tuples of all staff located at
that branch must be updated.
• Otherwise, the database will become inconsistent.
• This anomaly is avoided by decomposing the original relation into
the Staff and Branch relations.
• Two properties of decomposition of larger relation into smaller
relations:
1. lossless-join property => makes sure any instance of the
original relation can be identified from corresponding
instances in the smaller relations.
2. dependency preservation property => makes sure the
constraint on the original relation can be maintained by
enforcing some constraint on each of the smaller relations.
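
The modification and deletion anomalies, and how the decomposition avoids them, can be reproduced in a few lines. This is a minimal sketch using Python's built-in sqlite3 module; the column subset, staff numbers, names, and addresses are invented for illustration and are not the values from the course figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample data in a single redundant relation.
conn.executescript("""
    CREATE TABLE StaffBranch (staffNo TEXT PRIMARY KEY, sName TEXT,
                              branchNo TEXT, bAddress TEXT);
    INSERT INTO StaffBranch VALUES
        ('S1', 'Ann', 'B003', '1 Main St'),
        ('S2', 'Ben', 'B003', '1 Main St'),
        ('S3', 'Cat', 'B007', '9 High St');
""")

# Modification anomaly: updating only one tuple leaves B003 with two addresses.
conn.execute("UPDATE StaffBranch SET bAddress = '5 New Rd' WHERE staffNo = 'S1'")
addresses_b003 = conn.execute(
    "SELECT COUNT(DISTINCT bAddress) FROM StaffBranch WHERE branchNo = 'B003'"
).fetchone()[0]

# Deletion anomaly: removing the last staff member of B007 loses the branch details.
conn.execute("DELETE FROM StaffBranch WHERE staffNo = 'S3'")
b007_rows = conn.execute(
    "SELECT COUNT(*) FROM StaffBranch WHERE branchNo = 'B007'"
).fetchone()[0]

# Decomposed design: each branch address is stored exactly once, in Branch.
conn.executescript("""
    CREATE TABLE Branch (branchNo TEXT PRIMARY KEY, bAddress TEXT);
    CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, sName TEXT,
                        branchNo TEXT REFERENCES Branch(branchNo));
    INSERT INTO Branch VALUES ('B003', '1 Main St'), ('B007', '9 High St');
    INSERT INTO Staff VALUES ('S1', 'Ann', 'B003'), ('S2', 'Ben', 'B003'),
                             ('S3', 'Cat', 'B007');
""")
conn.execute("UPDATE Branch SET bAddress = '5 New Rd' WHERE branchNo = 'B003'")
conn.execute("DELETE FROM Staff WHERE staffNo = 'S3'")
b007_kept = conn.execute(
    "SELECT bAddress FROM Branch WHERE branchNo = 'B007'"
).fetchone()[0]

print(addresses_b003, b007_rows, b007_kept)
```

In the single relation the partial update leaves two addresses for B003 and the delete removes every trace of B007; in the decomposed design one UPDATE changes the address everywhere and the branch tuple survives the deletion of its last staff member.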

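2. Functional Dependencies
• A functional dependency describes the relationship between attributes in a relation.
• If each value of attribute A is associated with exactly one value of attribute B,
then B is functionally dependent on A, written A → B.
• A → B is read “A functionally determines B.”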

• When a functional dependency is present,
• the attribute or group of attributes on the left-hand side of the arrow is called
the determinant.
• the dependency is specified as a constraint between the attributes
• Example of a functional dependency:
• Consider the attributes staffNo and position of the Staff relation
• For the member of staff with staffNo SL21, we can determine the position as
Manager.
• i.e. staffNo functionally determines position (staffNo → position)
• position does not functionally determine staffNo

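• a member of staff holds one position, but one position can be held by
several staff
• => the relationship between staffNo and position is one-to-one (1:1):
each staffNo is associated with exactly one position.
• the relationship between position and staffNo is one-to-many (1:*):
each position may be associated with several staffNo values.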
• staffNo is the determinant of this functional dependency
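
A functional dependency can be tested against a sample of tuples. The sketch below is illustrative only: the helper name `holds` and all rows other than SL21 (Manager) are assumptions. Note that a sample can refute a dependency but cannot prove it, because a functional dependency is a constraint on every possible instance of the relation.

```python
def holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by this sample of tuples."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key is remembered; any different value
        # for the same key violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample of the Staff relation.
staff = [
    {"staffNo": "SL21", "position": "Manager"},
    {"staffNo": "SG37", "position": "Assistant"},
    {"staffNo": "SG14", "position": "Supervisor"},
    {"staffNo": "SA9", "position": "Assistant"},
]

print(holds(staff, ["staffNo"], ["position"]))  # True
print(holds(staff, ["position"], ["staffNo"]))  # False: two staff are Assistants
```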

Full Functional dependency
• A dependency is full when the determinant has the minimal number of attributes
necessary to maintain the functional dependency with the attribute(s) on the
right-hand side.
• The functional dependency A → B is a full functional dependency if
removal of any attribute from A results in the dependency no longer
existing.
• A → B is a partial dependency if there is some attribute that can be removed
from A and yet the dependency still holds.
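
Given a set of known functional dependencies, the full/partial distinction can be checked mechanically with attribute closure. This is a sketch under assumptions: the helper names and the single dependency staffNo → sName, branchNo are chosen for illustration.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_full(lhs, rhs, fds):
    """True if lhs -> rhs holds and removing any one attribute from lhs breaks it."""
    lhs, rhs = set(lhs), set(rhs)
    if not rhs <= closure(lhs, fds):
        return False  # the dependency does not hold at all
    return all(not rhs <= closure(lhs - {a}, fds) for a in lhs)


# Assumed dependency for illustration: staffNo -> sName, branchNo
fds = [({"staffNo"}, {"sName", "branchNo"})]

print(is_full({"staffNo"}, {"branchNo"}, fds))           # True: full dependency
print(is_full({"staffNo", "sName"}, {"branchNo"}, fds))  # False: sName is removable
```

Testing single-attribute removals is enough: if any smaller subset of the determinant still determined the right-hand side, so would the larger subset obtained by removing just one attribute outside it.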

Transitive dependency

• Identifying Functional Dependencies
1. From user requirement specifications
2. Depending on the database application
• Identifying the Primary Key for a Relation Using Functional Dependencies
• It is necessary to identify the candidate keys, one of which is selected to be
the primary key for the relation.

E.g. Identifying the primary key for the StaffBranch relation

1. Identify the determinants of the functional dependencies for the StaffBranch relation,
i.e. Determinants => staffNo, branchNo, bAddress,
(branchNo, position), and (bAddress, position).
2. Identify the attribute (or group of attributes) that uniquely identifies each tuple in
this relation.
• If a relation has more than one candidate key, identify the candidate key that is to act as the
primary key for the relation.

• All attributes that are not part of the primary key (non-
primary-key attributes) are functionally dependent on the key.
• staffNo => The only candidate key & hence the primary key
in the StaffBranch relation.
• all other attributes of the relation are functionally dependent
on staffNo.

• Although branchNo, bAddress, (branchNo, position), and
(bAddress, position) are determinants, they are not candidate
keys for the relation.
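
The search for candidate keys can be sketched with attribute closure: a candidate key is a minimal set of attributes whose closure is the whole relation. The determinants below come from the slide, but the right-hand sides of the dependencies are assumptions added for illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets that determine every attribute."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys


attributes = {"staffNo", "sName", "position", "salary", "branchNo", "bAddress"}
# Assumed dependencies for StaffBranch (right-hand sides are illustrative).
fds = [
    ({"staffNo"}, {"sName", "position", "salary", "branchNo", "bAddress"}),
    ({"branchNo"}, {"bAddress"}),
    ({"bAddress"}, {"branchNo"}),
    ({"branchNo", "position"}, {"salary"}),
    ({"bAddress", "position"}, {"salary"}),
]

print(candidate_keys(attributes, fds))  # [{'staffNo'}]
```

The other determinants fail because their closures never include staffNo, which matches the conclusion that staffNo is the only candidate key.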

The Process of Normalization
• Normalization => Formal technique for analyzing relations based on
their primary key (or candidate keys) and functional dependencies.
• Three normal forms were initially proposed called
• First Normal Form (1NF),
• Second Normal Form (2NF), and
• Third Normal Form (3NF).
• A stronger definition of third normal form, called Boyce–Codd Normal
Form (BCNF), was introduced later.
• With the exception of 1NF, all these normal forms are based on functional
dependencies among the attributes.
• There are also Fourth Normal Form (4NF) and Fifth Normal Form
(5NF).
• Normalization follows a sequence of steps.
• Each step corresponds to a specific normal form.
• As normalization proceeds, the relations become less vulnerable to update anomalies.
• For the relational data model, only First Normal Form (1NF) is critical; all
subsequent normal forms are optional.
• But to avoid update anomalies, proceed to at least Third Normal Form
(3NF).
• some 1NF relations are also in 2NF, and some 2NF relations are also
in 3NF, and so on (See Figure on next page).

Process of Normalization
First Normal Form (1NF)

• Identify and remove repeating groups within the table.
• Repeating group => an attribute, or group of attributes, in a table that
occurs with multiple values for a single occurrence of the key
attribute(s) for that table.

• two approaches to remove repeating groups from unnormalized
tables:
1. enter appropriate data in the empty columns of the rows containing the
repeating data => Perform “flattening” by filling in the blanks by duplicating
the nonrepeating data,
✓ introduces more redundancy into the original UNF table as part of the “flattening”
process
2. placing the repeating data, along with a copy of the original key attribute(s),
in a separate relation.
✓ creates two or more relations with less redundancy than in the original UNF table.
• A set of relations is in 1NF if it contains no repeating groups.
• 1NF requires atomic (or single) values at the intersection of each row
and column.
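
The two approaches to removing repeating groups can be compared side by side. This is a minimal Python sketch; the client and property values are hypothetical sample data, and the variable names are chosen only for illustration.

```python
# Hypothetical unnormalized (UNF) data: "properties" is a repeating group.
unnormalized = [
    {"clientNo": "CR76", "cName": "John Kay", "properties": ["PG4", "PG16"]},
    {"clientNo": "CR56", "cName": "Aline Stewart", "properties": ["PG4", "PG36"]},
]

# Approach 1: flatten by duplicating the nonrepeating data on every row.
flat = [
    {"clientNo": c["clientNo"], "cName": c["cName"], "propertyNo": p}
    for c in unnormalized
    for p in c["properties"]
]

# Approach 2: move the repeating data, with a copy of the key, to its own relation.
client = [{"clientNo": c["clientNo"], "cName": c["cName"]} for c in unnormalized]
rental = [
    {"clientNo": c["clientNo"], "propertyNo": p}
    for c in unnormalized
    for p in c["properties"]
]

# Every value is now atomic; approach 1 stores each client name once per property,
# approach 2 stores it once per client.
print(len(flat), len(client), len(rental))  # 4 2 4
```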

• Normalization

• 4 Types

1NF

2NF

3NF

Boyce Codd normal form (BCNF)

4NF

5NF

Relational Decomposition

Lossless Decomposition

• The above relation is decomposed into two relations EMPLOYEE and
DEPARTMENT
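
A decomposition is lossless when the natural join of the smaller relations gives back exactly the original relation. The following is a minimal sketch of that check; the helper functions and the three sample tuples are assumptions, since the original figures are not reproduced here.

```python
def project(rows, attrs):
    """Projection with duplicate elimination."""
    out = []
    for row in rows:
        picked = {a: row[a] for a in attrs}
        if picked not in out:
            out.append(picked)
    return out


def natural_join(left, right):
    """Join rows that agree on every attribute name the two relations share."""
    joined = []
    for l in left:
        for r in right:
            common = l.keys() & r.keys()
            if all(l[a] == r[a] for a in common):
                joined.append({**l, **r})
    return joined


def same_relation(a, b):
    """Compare two relations as sets of tuples, ignoring row order."""
    as_set = lambda rows: {tuple(sorted(r.items())) for r in rows}
    return as_set(a) == as_set(b)


# Hypothetical sample data.
original = [
    {"EMP_ID": 1, "EMP_NAME": "Ama", "DEPT_NAME": "Sales"},
    {"EMP_ID": 2, "EMP_NAME": "Kofi", "DEPT_NAME": "Sales"},
    {"EMP_ID": 3, "EMP_NAME": "Esi", "DEPT_NAME": "IT"},
]

# Decomposition sharing the key EMP_ID: the join restores the original.
employee = project(original, ["EMP_ID", "EMP_NAME"])
department = project(original, ["EMP_ID", "DEPT_NAME"])
print(same_relation(natural_join(employee, department), original))  # True

# Decomposition sharing only DEPT_NAME: the join produces spurious tuples.
lossy_a = project(original, ["EMP_NAME", "DEPT_NAME"])
lossy_b = project(original, ["EMP_ID", "DEPT_NAME"])
print(same_relation(natural_join(lossy_a, lossy_b), original))  # False
```

The second decomposition is lossy because DEPT_NAME is not a key of either part, so every Sales employee name is paired with every Sales employee number.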

• Now, when these two relations are joined on the common column
"EMP_ID", then the resultant relation will look like:

Multivalued Dependency

Join Dependency

Inclusion Dependency

Canonical Cover (irreducible set)
Armstrong’s Axioms

Secondary Rules

Applying Canonical Cover using Armstrong’s Axioms

Assignment
1. Given a relational schema R(A, B, C, D) and the set of functional
dependencies FD = { B → A, AD → BC, C → ABD }, find the canonical
cover.
2. Given a relational schema R(W, X, Y, Z) and the set of functional
dependencies FD = { W → X, Y → X, Z → WXY, WY → Z }, find the
canonical cover.

Internet-based distributed databases
• A distributed database is a collection of multiple interconnected
databases, which are spread physically across various locations that
communicate via a computer network.

Distributed Database Management System

Assignment
1. Briefly explain the following terms
a) Distribution transparency
b) Location transparency
c) Fragmentation transparency
d) Replication transparency



Database Control
• Database control refers to the task of enforcing regulations so as to
provide correct data to authentic users and applications of a
database.
• Data should be screened from unauthorized users so as to
maintain the security and privacy of the database.
• Database control is one of the primary tasks of the database administrator (DBA).



Security and authorization

Assignment

• Write and give examples of all the SQL queries you know.



Relational algebra and calculus
• Relational algebra is a procedural query language. It gives a step-by-step
process to obtain the result of a query, using operators that take
relations as input and produce a relation as output.
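
The procedural, step-by-step nature of relational algebra can be illustrated by writing two operators as functions that take a relation and return a relation. This is only a sketch: the function names and the rows of the loan relation are hypothetical.

```python
def select(rows, predicate):
    """Selection (sigma): keep the tuples that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, attrs):
    """Projection (pi): keep the named attributes and drop duplicate tuples."""
    out = []
    for row in rows:
        picked = {a: row[a] for a in attrs}
        if picked not in out:
            out.append(picked)
    return out


# Hypothetical loan relation.
loan = [
    {"branch_name": "Downtown", "loan_no": "L-17", "amount": 1000},
    {"branch_name": "Redwood", "loan_no": "L-23", "amount": 2000},
    {"branch_name": "Perryridge", "loan_no": "L-15", "amount": 1500},
    {"branch_name": "Downtown", "loan_no": "L-14", "amount": 1500},
]

# Step 1: sigma amount > 1200 (loan); step 2: pi branch_name (result of step 1).
large_loans = select(loan, lambda row: row["amount"] > 1200)
branches = project(large_loans, ["branch_name"])
print(branches)
```

Because each operator returns a relation, the output of one step can be fed directly into the next, which is what makes the language procedural.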

• E.g. LOAN Relation

Product

Precedence of Operators



Join Operations:

Outer Join
• The outer join operation is an extension of the join operation. It is
used to deal with missing information.
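
How an outer join deals with missing information can be sketched for the left outer join: tuples from the left relation that have no partner are kept and padded with nulls. The function name and both sample relations below are assumptions made for illustration.

```python
def left_outer_join(left, right, key):
    """Join on `key`; unmatched left tuples are kept and padded with None."""
    right_attrs = sorted({a for row in right for a in row if a != key})
    result = []
    for l in left:
        matches = [r for r in right if r[key] == l[key]]
        if matches:
            result.extend({**l, **r} for r in matches)
        else:
            # No partner in the right relation: pad its attributes with nulls.
            result.append({**l, **{a: None for a in right_attrs}})
    return result


# Hypothetical sample relations.
employee = [
    {"emp_name": "Ram", "city": "Mumbai"},
    {"emp_name": "Hari", "city": "Delhi"},
]
works = [{"emp_name": "Ram", "branch": "Infosys"}]

for row in left_outer_join(employee, works, "emp_name"):
    print(row)
```

An ordinary join would drop Hari entirely; the outer join keeps the tuple with branch set to None.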

Types of Outer join

Expression Trees

Assignment: Self-Join

Integrity Constraints

Relational Calculus

SQL JOINs
• SQL JOIN means "to combine two or more tables"
• INNER JOIN
• LEFT JOIN
• RIGHT JOIN
• FULL JOIN

Left Join
• Returns all the rows from the left table and the matching rows from
the right table.
• If a left-table row has no match, the right-table columns are returned as NULL.
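
A LEFT JOIN can be tried out with Python's built-in sqlite3 module. This is a minimal sketch; the Customer and Orders tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables: Kofi has placed no orders.
conn.executescript("""
    CREATE TABLE Customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO Customer VALUES (1, 'Ama'), (2, 'Kofi');
    INSERT INTO Orders VALUES (10, 1, 'Book');
""")

rows = conn.execute("""
    SELECT c.name, o.item
    FROM Customer AS c
    LEFT JOIN Orders AS o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

print(rows)  # [('Ama', 'Book'), ('Kofi', None)]
```

The customer without an order still appears in the result, with NULL (None in Python) in the column taken from the right table.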



Right Join
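• Returns all the rows from the right table and the matching rows from
the left table.
• If a right-table row has no match, the left-table columns are returned as NULL.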



FULL JOIN
• In SQL, FULL JOIN combines the results of both the left and right
outer joins.
• The joined table has all the records from both tables, with NULL in
place of values where no match is found.
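
Not every database engine supports FULL JOIN directly (older SQLite versions, for example, do not), so the sketch below emulates it as a left join plus the unmatched rows of the right table. The tables and rows are hypothetical, and the emulation assumes staffNo is never NULL in Staff.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical data: S2 refers to a branch that is not listed,
# and branch B007 has no staff.
conn.executescript("""
    CREATE TABLE Staff (staffNo TEXT, branchNo TEXT);
    CREATE TABLE Branch (branchNo TEXT, city TEXT);
    INSERT INTO Staff VALUES ('S1', 'B003'), ('S2', 'B009');
    INSERT INTO Branch VALUES ('B003', 'Glasgow'), ('B007', 'Aberdeen');
""")

rows = conn.execute("""
    -- All staff, with branch details where they exist (left outer join).
    SELECT s.staffNo, s.branchNo, b.branchNo, b.city
    FROM Staff s LEFT JOIN Branch b ON b.branchNo = s.branchNo
    UNION ALL
    -- Plus the branches that matched no staff at all.
    SELECT s.staffNo, s.branchNo, b.branchNo, b.city
    FROM Branch b LEFT JOIN Staff s ON s.branchNo = b.branchNo
    WHERE s.staffNo IS NULL
""").fetchall()

for row in rows:
    print(row)
```

The result has three rows: the matched pair for B003, staff S2 padded with NULLs on the branch side, and branch B007 padded with NULLs on the staff side.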

