
CO2102 Databases and Domain

Modelling

Dr Kehinde Aruleba
Convenor: Dr Kehinde Aruleba
• Lecturer at School of Computing and Mathematical Sciences, University of Leicester
• Research Interest:
• Handwriting recognition
• Algorithms and artificial intelligence, with a focus on inequality
• Data colonialism, accessibility, and sharing
• Responsible AI
• Email ka388@leicester.ac.uk
Module Information
• Name: CO2102 Databases and Domain Modelling
• Credits: 15 (Semester 1)
• 150 hours of work in semester 1
• A 2h live lecture on Monday
• Each week: one computer lab on Wednesday: 9:00 – 11:00 & 11:00 – 13:00
Lectures
• Theory topics presented in class
• Time for questions on theory and understanding

Computer Lab Sessions
• Pre-session materials
• materials for you to read/watch before the session
• Session materials
• Demos and exercises to complete during the session
• Post-session materials
• Solutions to exercises for you to review
• Common problems students experienced

CO2102 Semester 1
• Database Development
• with a Quality outlook
• We will use:
• SQL Workbench

Assessment
• Project Assignment – 50%
• Class Test – 50%

CO2102 Checklist
Every lab session
❑ Read pre-session materials and watch videos
❑ Prepare questions for labs/lectures
❑ Bring headphones for the lab for instruction videos
❑ Work on exercises (below are optional)
❑ use the discussion forum or MS Teams
❑ If you are still stuck, book TA help
❑ Update the discussion forum

Twice per semester
❑ Start early to complete project
❑ Prepare questions for project and class test completion sessions

Recommendations from CO2102
• Engage with material
• Attend lectures!
• Watch previous lectures’ recordings
• Read material before labs
• Work on lab sessions
• Independently to the best of your ability
• Seek help from Lecturer and TAs
• Consult lab solutions
• Use Blackboard forum and MS Teams
• Relax and enjoy!

Before Databases: File-Based Approach
• A collection of application programs that perform services for the
end-users such as answering queries and production of reports.
• Each program defines and manages its own data, so we may have
data redundancy, inconsistency and duplication.
• Imagine you have a collection of important documents, like letters,
invoices, and receipts, that you want to organize and store for easy
access. Without a database, you might use a file-based approach to
manage these documents.

Library Example
• In a file-based database, data is stored in separate files,
similar to our physical files in the library example.
• Each file contains specific information about a particular
entity (e.g., a customer, a product, an order).
• These files are organized in folders or directories, much
like the drawers and folders in the library's filing
cabinet.
• Each program defines and manages its own data, so we
may have data redundancy, inconsistency and
duplication.

Basic Library DB Requirements
• Library needs to manage:
• book catalogue
• book loans
• book reservations
• recall of overdue loans
• catalogue inquiries
• Membership
• etc . . .

Traditional file approach
• Library-user file:
information about each library user.
• Catalogue file:
information about books in the library.
• Loans file:
information about each book on loan and the
person who is borrowing the book.
• Reservations file:
information about each book reserved and the
person who is reserving the book

Traditional approach
• The processes must
relate data between the
files when necessary.
• This can become very
complicated in the
traditional approach with
numerous issues.

Another approach: Create one big single file
Why not put all the information in a single table, such as an Excel spreadsheet?

File-based Approach Disadvantages
• Separation of data
• Duplication of data: wasteful, loss of data integrity.
• Data dependence - difficult to change the structure of data, unresponsive to updates, extensions, redevelopment: high development cost
• Incompatible file formats
• Fixed queries, difficult to do ad hoc queries
• Low reliability
• Low security
• Low integrity

Name        Address         Telephone  Status  Book title     Book author  ISBN      Date borrowed
D. Andrews  14 High Street  274 4893   Adult   Donatello      Green Halgh  123-4567  12/04/2006
D. Andrews  14 High Street  274 4893   Adult   War and Peace  Tolstoy      456-1234  18/05/2006

The Solution: Database
• A database is a collection of information that is organized so that it can be easily
accessed, managed and updated.
• A shared collection of logically related data (and a description of this data) designed to
meet the information needs of an organisation.
• Important points to remember:
• Shared
• Collection
• Self-describing
• Logically related:
• Entities
• Attributes
• relationships

Types of Databases
• Databases have evolved since their inception in the 1960s, beginning with hierarchical and network databases, through the 1980s with object-oriented databases.
• In one view, databases can be classified according to content type: bibliographic, full text, numeric and images.
• In computing, databases are sometimes classified according to their organizational approach. There are many different kinds of databases, starting with the most prevalent approach:
• The relational database
• Distributed database
• Cloud database
• Graph database
• NoSQL database

Relational Database
• A relational database, invented by E.F. Codd at IBM in 1970, is a
tabular database in which data is defined so that it can be
reorganized and accessed in a number of different ways.
• Relational databases are made up of a set of tables with data that
fits into a predefined category. Each table has at least one data
category in a column, and each row has a certain data instance for
the categories which are defined in the columns.
• The Structured Query Language (SQL) is the standard user and
application program interface for a relational database. Relational
databases are easy to extend, and a new data category can be added
after the original database creation without requiring that you
modify all the existing applications.

Definition of Data Model
• An integrated collection of concepts for describing data,
relationships between data, and constraints on the data.
• There are three components of each data model:
• a structural part: rules for constructing databases, terminology
• a set of integrity rules to ensure data accuracy
• a manipulative part: defines the type of operations that are allowed on
the database

History of Data Models
Three projects that started it all in late 1970s
• System R, by IBM in late 1970s
• Structured Query Language (SQL) developed
• Commercial products: SQL/DS and ORACLE
• INGRES, Uni. of California at Berkeley
• Relational DBMS Ingres
• Peterlee Relational Test Vehicle, IBM UK
• Query processing, optimization, functional extensions
• Commercially available PC versions of RDBMSs
• Access and FoxPro from Microsoft
• Paradox and Visual dBase from Borland

Entities
An entity is a
• Thing
• Object
• concept
in the real world

Can be distinguished from other objects.

Attributes
• An attribute is a property of an entity
• An object can be described by a set of properties:
• One may uniquely identify an object (key)
• Others may add (useful) information

Terminology
• A set of entities of the same type is represented as a table (also
called relation)
• Each row in the table represents an entity
• Each column represents an attribute
• An atomic place in a table is called a cell

Different Terminology
User     Relational Model   Programmer
Table    Relation           File or Array
Row      Tuple              Record
Column   Attribute          Field

Name             YOB    Sex      Country
Marlon Brando    1924   Male     USA
Harrison Ford    1942   Male     USA
Dustin Hoffman   1937   Male     USA
Robert Redford   1937   Male     USA
Meryl Streep     1949   Female   USA

Properties of Relations
• Relations have names
• Each cell of the relation contains one atomic value
• Each attribute (column) has a distinct name
• Values of attributes come from the same domain
• The order of attributes is irrelevant
• The number of columns (attributes) is called degree
• Each tuple (row) is distinct
• The order of tuples is not significant, theoretically
• The number of rows (tuples) is called cardinality

Primary Keys
• The primary key is the candidate key that is selected to identify rows
uniquely within the relation

• Primary Key: a column or a combination of columns that uniquely identifies a record.
• Example: In the library database, you might use the ISBN (International Standard Book Number)
as the primary key for the "Books" table. Each book has a unique ISBN, and this ISBN serves as a
primary key to distinguish one book from another.

Example

Car(registration, make, model, colour)

Superkeys
• A superkey is a set of one or more attributes that uniquely identifies an
entity within a relation

• A superkey is a combination of columns that uniquely identifies any row within a relational database management system (RDBMS) table.
• Superkeys always exist: just take all attributes.
• In our library database, a super key could be a combination of attributes like (Title, Author, Publication
Date). This combination could potentially identify a unique book, but it's not the most minimal or
irreducible set because you could still uniquely identify a book using just the ISBN.

Candidate Keys
• A Candidate Key can be any column or a combination of columns that
can qualify as unique key in database. There can be multiple
Candidate Keys in one table. Each Candidate Key can qualify as
Primary Key.
• A candidate key is a closely related concept where the superkey is
reduced to the minimum number of columns required to uniquely
identify each row.
• In our library database, the ISBN is an example of a candidate key because it's the smallest set of
attributes (just one attribute) required to uniquely identify a book.

Relationships
• An association between two or more entities
• Identified/described by two or more key attributes

Relationships Represented by Tables
Given the Film Star table and the Film table, an example of a
relationship between these tables is the Role relation (table):

Relation Schema Definition
Relation schema is a set of attribute names and their domains.

Example of Relation
A set of tuples for relation Player

(Tom Watson, USA, 1949)
(Jack Nicklaus, USA, 1940)
(Severiano Ballesteros, Spain, 1957)
(Raymond Floyd, USA, 1942)

Foreign Keys
• Recall:
○ There may be more than one candidate key
○ We may choose a particular one and call it the primary key
• A foreign key is an attribute (or a set of attributes) that matches the candidate key of some
relation
• A foreign key is a column (or columns) that references a column (most often the primary
key) of another table. The purpose of the foreign key is to ensure referential integrity of the
data. In other words, only values that are supposed to appear in the database are
permitted.

Relational Integrity
• Relational integrity is a set of rules which ensure that the data is
accurate
• The rules contain:
• Domain constraints (we have already met these constraints)
• Entity integrity rule
• Referential integrity rule
• Before we define the last two rules, we study the concept of nulls

Nulls
• Null represents a value for an attribute that is currently unknown or it
is not applicable.
• Null is not the same as
• numeric value zero
• string of spaces
• Not all RDBMS use nulls. If we don’t use nulls, then we need to use “false” data to represent missing or non-applicable data. For example, use number -1 in the field of type text.

Entity Integrity
• Recall: Entities are modelled by tables. Each table has a primary key.
• Entity Integrity:
In a table, no cell under a primary
key column can be null.
• For example, in the Student table no record can have a null in the
stud_id column.
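
The entity integrity rule can be tried out with a minimal sketch. It assumes Python's built-in sqlite3 module rather than the tool used in the labs, and the Student columns and sample rows are hypothetical. NOT NULL is written out because SQLite, unlike most RDBMSs, does not imply it for a non-integer primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical Student table; stud_id is the primary key and may not be null.
conn.execute("CREATE TABLE Student (stud_id TEXT PRIMARY KEY NOT NULL, name TEXT)")
conn.execute("INSERT INTO Student VALUES ('S001', 'Ada')")

def try_insert(stud_id, name):
    """Return True if the row is accepted, False if the DBMS rejects it."""
    try:
        conn.execute("INSERT INTO Student VALUES (?, ?)", (stud_id, name))
        return True
    except sqlite3.IntegrityError:
        return False

null_key_accepted = try_insert(None, "Bob")        # null under the primary key column
duplicate_accepted = try_insert("S001", "Carol")   # repeated primary key value
print(null_key_accepted, duplicate_accepted)
```

Both inserts are rejected, so the table still holds only the first row.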

Referential Integrity
If a foreign key exists in a table, either the foreign key value must match a candidate key value of some row in its home table, or the foreign key value must be null.
• For example, we cannot have a record in the
Animal table with 700 in the keeper cell,
unless there is a keeper 700 in the Keeper
table.
• But, we can have an animal with null value
under keeper: the new arrival (Larry the
Leopard) has not yet been assigned to a
particular keeper.
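
The keeper example can be sketched as follows, assuming Python's built-in sqlite3 module. The column names, the keeper 100 and the animals other than Larry are hypothetical. SQLite only enforces foreign keys after the PRAGMA shown; other RDBMSs usually enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.execute("CREATE TABLE Keeper (keeper_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE Animal (
    animal_id INTEGER PRIMARY KEY,
    name      TEXT,
    keeper    INTEGER REFERENCES Keeper(keeper_id))""")
conn.execute("INSERT INTO Keeper VALUES (100, 'Morris')")

def try_insert_animal(animal_id, name, keeper):
    """Return True if the Animal row is accepted, False if it is rejected."""
    try:
        conn.execute("INSERT INTO Animal VALUES (?, ?, ?)", (animal_id, name, keeper))
        return True
    except sqlite3.IntegrityError:
        return False

matched = try_insert_animal(1, "Candice the Camel", 100)      # keeper 100 exists
dangling = try_insert_animal(2, "Zebedee the Zebra", 700)     # no keeper 700
unassigned = try_insert_animal(3, "Larry the Leopard", None)  # null is allowed
print(matched, dangling, unassigned)
```

Only the row pointing at the non-existent keeper 700 is refused.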

Queries
• How to search for specific data in the database?
• Example: list all animals eating buns with weight between 20 and 200kg, in
ascending order of their age
• DBMS supports flexible queries of the data
• QBE (Query-by-example): a visual way of specifying a query
• SQL (Structured Query Language): a language for specifying queries to
a relational database
• de facto standard
• Use English words, e.g. SELECT, INSERT, WHERE…
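
One way the example query above might look in SQL, run here through Python's built-in sqlite3 module. The Animal table layout and its rows are assumptions made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical Animal table: what each animal eats, its weight and its age.
conn.execute("CREATE TABLE Animal (name TEXT, food TEXT, weight_kg REAL, age INTEGER)")
conn.executemany("INSERT INTO Animal VALUES (?, ?, ?, ?)", [
    ("Candice", "buns", 150.0, 7),
    ("Larry", "meat", 60.0, 3),
    ("Zena", "buns", 25.0, 4),
    ("Jumbo", "buns", 3000.0, 20),
])

# Animals eating buns, weighing between 20 and 200 kg, youngest first.
rows = conn.execute("""
    SELECT name, age
    FROM Animal
    WHERE food = 'buns' AND weight_kg BETWEEN 20 AND 200
    ORDER BY age ASC
""").fetchall()
print(rows)  # [('Zena', 4), ('Candice', 7)]
```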

Summary

• A relational database consists of nothing but relations represented as tables
• A table can represent either an entity set or a relationship between entities
• Each table has a predefined and time-invariant format; this is specified by the relation schema
• Each table is homogeneous: each column contains only values from a specified domain

Entity-Relationship Modelling

Dr Kehinde Aruleba
Outlines
• Entity-Relationship (ER) model concepts: entities,
attributes, relationships
• Relationship types, relationship sets, roles, and
structural constraints
• Weak entity types
• Diagrammatic technique for displaying ER models
Entities
• Thing in real world with independent existence – (Person: STAFF,
STUDENT; place: CLASS, STORE)
• Entity - distinguishable “thing” in the real world
• Strong (or regular) entity - has an independent existence and is uniquely identified by its own attributes (e.g. staff)
• Weak entity - existence depends on some other entity; does not have a unique identifier of its own (e.g. next of kin)

Entity type notation: a box labelled with the entity type name, EntityName (singular, no spaces, capital letter at start of each word), with space for attributes.


Attributes

• Entity types have Attributes (or properties) which associate each entity with a value from a
domain of values for that attribute (STUDENT: StudentID, StudentName, Address, Degree)
• Attributes can be
• simple (atomic) e.g. Surname
• composite e.g. address (street, town, postcode)
• multi-valued e.g. phone number – home and work
• complex: nested multi-valued and composite
• base or derived, e.g. D.O.B. (base); age (derived)
Simple vs. Composite Attributes

• Simple attribute: composed of a single component with an independent existence.
• Composite attribute: composed of multiple components each with an independent existence.
• E.g. Address: contains number, street, town, … components.


Single vs. Multi-valued Attributes

• Attributes are either single- or multi-valued depending on the number of values they hold

• Examples:

• Tel_No: one may have more than one telephone number

• Type_of_food: a dish may consist of several types of food: protein, carbohydrates

Derived Attributes

• Derived attribute: an attribute whose value is derivable from the values of other attributes, not necessarily in the same entity.

• Examples:

• Age is derivable from DoB


Notation for attributes
• Primary key: marked {PK}, e.g. keyAttribute {PK}
• Composite attribute: compositeAttribute with its parts (partOne, partTwo) listed beneath it, e.g. name with firstname and lastname
• Derived attribute: prefixed with /, e.g. / derivedAttribute
• Multi-valued attribute: number of values in [ ] brackets, multiValued [min..max], e.g. skills[1..3]
Relationships

• Entities can be associated with one another in relationships.

• The entities involved in a particular relationship type are called the participants in that
relationship.

• Relationship degree defines the number of entity classes participating in the relationship:

• Degree 1 is a unary relationship.

• A relationship of degree 2 is a binary relationship.

• Degree 3 is a ternary relationship.


Degree 2 Relationship: Binary

• The Has and POwns relationships are examples of binary relationships

Thomas Connolly - Database Systems: A Practical Approach to Design, Implementation, and Management, Global Edition
2015 Pearson Education UK
Degree 3 Relationship: Ternary

• This relationship represents the registration of a client by a member of staff at a branch.

Thomas Connolly - Database Systems: A Practical Approach to Design, Implementation, and Management, Global Edition 2015 Pearson Education UK
One-to-One Binary Relationship
• 1:1 (one-to-one)
• A single entity instance in one entity class is related to a single
entity instance in another entity class.
• This means that for every entry in the first table, there is a
corresponding and unique entry in the second table, and vice
versa
• An employee can manage only one department; and
• Each department can be managed by one employee only
One-to-Many Binary Relationship

• 1:N (one-to-many)
• A single entity instance in one entity class is related to many entity instances
in another entity class.
• A customer may place many orders; but
• Each order can be placed by one customer only
Many-to-Many Binary Relationship

• N:M (many-to-many)
• Many entity instances in one entity class are related to many entity instances in another entity class:
• One student may belong to more than one student organisation; and
• One organization can admit more than one student.

Maximum Cardinality

• Relationships are named and classified by their cardinality, which is a word that means count.
• Each of the three types of binary relationships shown above has a different maximum cardinality.
• Maximum cardinality is the maximum number of entity instances that may participate in a relationship instance: one, many, or some other fixed number.
• That is, the maximum number of related records that can exist on one side of a relationship.
Minimum Cardinality

• Minimum cardinality is the minimum number of entity instances that must participate in a relationship instance.
• That is, the minimum number of related records that must exist on one side of a relationship.
• These values typically assume a value of zero (optional) or one (mandatory).
Cardinality Example
• Maximum cardinality is many for both ITEM and
SUPPLIER.
• Minimum cardinality is zero (optional) for ITEM and one (mandatory) for SUPPLIER.
• A SUPPLIER does not have to supply an ITEM.
• An ITEM must have a SUPPLIER.
Entity-Relationship Diagrams
• The diagrams in previous slides are called entity-relationship
diagrams.
• Entity classes are shown by rectangles.
• Relationships are shown by diamonds.
• The maximum cardinality of the relationship is shown inside the diamond.
• The minimum cardinality is shown by the oval or hash mark next to the entity.
• The name of the entity is shown inside the rectangle.
• The name of the relationship is shown near the diamond.
Example

A university consists of a number of departments. Each department offers several courses. A number of modules make up each course. Students enroll in a particular course and take modules towards the completion of that course. Each module is taught by a lecturer from the appropriate department, and each lecturer tutors a group of students.
Solution
Normalisation
Dr Kehinde Aruleba
Outlines
• Normal Forms
• Functional Dependency
• Anomalies
• Examples
Normalization
Normal Forms

First Normal Form (1NF)


Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
Fourth Normal Form (4NF)
Fifth Normal Form (5NF)
Normalization
• First Normal Form: functional dependency of nonkey attributes on the primary key - atomic values only
• Second Normal Form: full functional dependency of nonkey attributes on the primary key
• Third Normal Form: no transitive dependency between nonkey attributes
• Boyce-Codd and higher: all determinants are candidate keys - single multivalued dependency
Functional Dependencies
FD – Example
• Functional Dependencies
• AuthNo → AuthName, AuthEmail, AuthAddress
• PaperNo → Primary-AuthNo, Title, Abstract, Status
• RevNo → RevName, RevEmail, RevAddress
• RevNo, PaperNo → AuthComm, Prog-Comm, Date,
Rating1, Rating2, Rating3, Rating4, Rating5
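
A functional dependency X → Y says that rows agreeing on X must also agree on Y. The sketch below tests that against sample rows; fd_holds and the author data are hypothetical, made up for illustration. Sample data can only refute a dependency: passing the check does not prove the dependency holds in general.

```python
def fd_holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical author rows; two different authors share the name "Ann".
authors = [
    {"AuthNo": 1, "AuthName": "Ann", "AuthEmail": "ann@example.org"},
    {"AuthNo": 2, "AuthName": "Ben", "AuthEmail": "ben@example.org"},
    {"AuthNo": 1, "AuthName": "Ann", "AuthEmail": "ann@example.org"},
    {"AuthNo": 3, "AuthName": "Ann", "AuthEmail": "ann2@example.org"},
]

authno_determines_details = fd_holds(authors, ["AuthNo"], ["AuthName", "AuthEmail"])
name_determines_authno = fd_holds(authors, ["AuthName"], ["AuthNo"])
print(authno_determines_details, name_determines_authno)  # True False
```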
Insertion Anomaly
• If new information is available, we may not be able to enter the data until we have data for all fields.
• Having to insert redundant data for every new row (student record) is a data insertion anomaly.
• Reason for this: two kinds of information (student info and module info) are saved in a single table.

Student no  Name     Module  Lecturer    Email
1           Kehinde  CO2102  Dr Kehinde  ka388
2           Karim    CO2102  Dr Kehinde  ka388
3           John     CO2102  Dr Kehinde  ka388

Student no  Name     Module  Lecturer    Email
1           Kehinde  CO2102  Dr Kehinde  ka388
2           Karim    CO2102  Dr Kehinde  ka388
3           John     CO2102  Dr Kehinde  ka388
4           ABC      CO2102  Dr Kehinde  ka388
…           …        …       …           …
1000        XYZ      CO2102  Dr Kehinde  ka388
                     CO2103  Dr Karim
Deletion Anomaly
• If we delete some information from the database, we may accidentally remove additional data which we wish to keep.
• Loss of related data when some other data is deleted

Student no  Name     Module  Lecturer    Email
1           Kehinde  CO2102  Dr Kehinde  ka388
2           Karim    CO2102  Dr Kehinde  ka388
3           John     CO2102  Dr Kehinde  ka388

Student no  Name     Module  Lecturer    Email
1           Kehinde  CO2102  Dr Kehinde  ka388
Update/Modification Anomaly
• If data appears more than once in the database, then if one item of the data is altered, we must alter all the instances of that data.
• If a single field is missed out during modification (student no = 3) this will lead to inconsistent data.

Student no  Name     Module  Lecturer               Email
1           Kehinde  CO2102  Dr Kehinde → Prof. A   ka388
2           Karim    CO2102  Dr Kehinde → Prof. A   ka388
3           John     CO2102  Dr Kehinde             ka388



The Solution – Multiple Tables
• Database Normalization is a technique of organizing the data in the database. Normalization is a systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics like Insertion, Update and Deletion Anomalies, e.g. Student table and Module table.
• If the lecturer name or email is updated in the Module table, it is updated automatically for students.

Student no  Name     Module
1           Kehinde  CO2102

Module  Lecturer    Email
CO2102  Dr Kehinde  ka388

• It is a multi-step process that puts data into tabular form by removing duplicated data from the relation tables.
• Normalization is used for mainly two purposes:
• Eliminating redundant (useless) data.
• Ensuring data dependencies make sense, i.e. data is logically stored.
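
The benefit of splitting the data into a Student table and a Module table can be sketched as below. This assumes Python's built-in sqlite3 module, and the column names are illustrative rather than taken from the lab material. The lecturer is stored once, so a single UPDATE changes what every student of the module sees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Module (module TEXT PRIMARY KEY, lecturer TEXT, email TEXT)")
conn.execute("""CREATE TABLE Student (
    student_no INTEGER PRIMARY KEY,
    name       TEXT,
    module     TEXT REFERENCES Module(module))""")
conn.execute("INSERT INTO Module VALUES ('CO2102', 'Dr Kehinde', 'ka388')")
conn.executemany("INSERT INTO Student VALUES (?, ?, ?)",
                 [(1, "Kehinde", "CO2102"), (2, "Karim", "CO2102"), (3, "John", "CO2102")])

# One UPDATE on one row; no student row has to be touched.
conn.execute("UPDATE Module SET lecturer = 'Prof. A' WHERE module = 'CO2102'")

lecturers = conn.execute("""
    SELECT s.name, m.lecturer
    FROM Student AS s JOIN Module AS m ON s.module = m.module
    ORDER BY s.student_no
""").fetchall()
print(lecturers)  # [('Kehinde', 'Prof. A'), ('Karim', 'Prof. A'), ('John', 'Prof. A')]
```

Because no lecturer name is repeated per student, the update anomaly from the single-table design cannot occur here.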
Unnormalized Relations
In unnormalized relations data can repeat within a column

Student No Name Module


1 Kehinde CO2102, CO2103
2 Karim CO3201, CO3001
3 John CO1101
First Normal Form
• To move to First Normal Form a relation must contain only atomic
values at each row and column.
• No repeating groups
• A column or set of columns is called a Candidate Key when its values can
uniquely identify the row in the relation.

Unnormalized:
Student No  Name     Module
1           Kehinde  CO2102, CO2103
2           Karim    CO3201, CO3001
3           John     CO1101

First Normal Form:
Student No  Name     Module
1           Kehinde  CO2102
1           Kehinde  CO2103
2           Karim    CO3201
2           Karim    CO3001
3           John     CO1101
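
The move to First Normal Form shown in the tables can be sketched in plain Python. The helper name to_first_normal_form is hypothetical; it emits one row per module code so that every cell holds a single atomic value.

```python
# Unnormalized rows: the Module cell holds a comma-separated list of codes.
unnormalized = [
    (1, "Kehinde", "CO2102, CO2103"),
    (2, "Karim", "CO3201, CO3001"),
    (3, "John", "CO1101"),
]

def to_first_normal_form(rows):
    """Return one (student_no, name, module) row per module code."""
    return [
        (student_no, name, module.strip())
        for student_no, name, modules in rows
        for module in modules.split(",")
    ]

first_nf = to_first_normal_form(unnormalized)
for row in first_nf:
    print(row)
```

In the result, Student No alone no longer identifies a row; the pair (Student No, Module) does.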
Second Normal Form
• Must be 1NF
• A relation is said to be in Second Normal Form when every non-key attribute is fully
functionally dependent on the primary key.
• That is, every non-key attribute needs the full primary key for unique identification
• Partial dependency occurs when non-key attributes (columns that are not part of the
primary key) are functionally dependent on only a portion of the primary key, rather
than the entire primary key.
• Automatically 2NF if no composite primary key.
Second Normal Form Example
Module Table
Module_id (PK)  Module_name
1               CO2102
2               CO2103
3               CO1101
4               CO3001

At this point, we have the student table and the module table. Let us create another table for storing students' marks.

Score Table
Mark_id  Student No (Cand. Key)  Module_id (Cand. Key)  Marks  Convenor
1        6                       1                      64     Kehinde
2        6                       2                      70     ABC
3        12                      1                      71     Kehinde
4        7                       3                      50     XYZ

• Student No and Module_id together form a composite key; a table has only one primary key, which may span several columns.
• In this table, Convenor depends only on Module_id and has nothing to do with Student No; this is a partial dependency.
Second Normal Form Example
• Remove the Convenor field from Score table and add it to the Module
table.
• Score table will have no partial dependency, i.e., it will be in 2NF

New Module Table
Module_id (PK)  Module_name  Convenor
1               CO2102       Kehinde
2               CO2103       ABC
3               CO1101       XYZ
4               CO3001       Karim

New Score Table
Mark_id  Student No  Module_id  Marks
1        6           1          64
2        6           2          70
3        12          1          71
4        7           3          50
Third Normal Form
• Must be 2NF
• A relation is said to be in Third Normal Form if there is no transitive functional dependency between non-key attributes
• When one non-key attribute can be determined with one or more non-key attributes there is said to be a transitive functional dependency.
Third Normal Form
OrderNo [PK] Customer Contact Person Total
1 Acme Dave 20
2 StuCorp Fred 20000
3 Acme Dave 15
4 Acme Dave 123

• OrderNo serves as the primary key.
• Customer and total amount are dependent on the order number; this data is specific to each order.
• However, the contact person is dependent upon the customer.
• There is really a 1-1 relationship between Customer and Contact Person
Third Normal Form
OrderNo [PK]  Customer [FK]  Total
1             Acme           20
2             StuCorp        20000
3             Acme           15
4             Acme           123

Customer  Contact Person
Acme      Dave
StuCorp   Fred

• Create tables to represent relations - a 1-1 relationship between Customer and Contact Person
Summary

• First normal form: A table is in the first normal form if it contains no repeating columns.
• Second normal form: A table is in the second normal
form if it is in the first normal form and contains only
columns that are dependent on the whole (primary) key.
• Third normal form: A table is in the third normal form if
it is in the second normal form and all the non-key
columns are dependent only on the primary key.
SQL

Dr Kehinde Aruleba
Outlines
• SQL components
• SQL queries
SQL Components: DDL, DML & DCL
Data Definition Language (DDL)
• Commands that define a database, including creating, altering, and dropping tables and establishing constraints
• Commands for defining and controlling access to data
Data Manipulation Language (DML)
• Commands that maintain and query a database
• To access a DB by retrieving and updating data e.g. SELECT, INSERT, UPDATE, DELETE
Data Control Language (DCL)
• Commands that control a database, including administering privileges and committing data e.g. GRANT, REVOKE (and DENY in SQL Server)
SQL Database Definition

• Data Definition Language (DDL)


• Major CREATE statements:
• CREATE SCHEMA – defines a portion of the database owned by a particular user
• CREATE TABLE – defines a new table and its columns
• CREATE VIEW – defines a logical table from one or more tables or views
• Other CREATE statements: CHARACTER SET, COLLATION, TRANSLATION, ASSERTION,
DOMAIN

• Available data types in SQL?


Domain Types in SQL

• char(n). Fixed length character string, with user-specified length n.


• varchar(n). Variable length character strings, with user-specified maximum length n.
• int. Integer (a finite subset of the integers that is machine-dependent).
• smallint. Small integer (a machine-dependent subset of the integer domain type).
• numeric(p,d). Fixed point number, with user-specified precision of p digits, with d digits to the right of decimal point. (e.g. numeric(3,1) allows 44.5 to be stored exactly, but not 444.5 or 0.32)
• float(n). Floating point number, with user-specified precision of at least n digits. Use float
when you need a floating-point number with a wide range and are comfortable with some
degree of imprecision.
Steps in Table Creation

1. Identify data types for attributes
2. Identify columns that can and cannot be null
3. Identify columns that must be unique (candidate keys)
4. Identify primary key–foreign key mates
5. Determine default values
6. Identify constraints on columns (domain specifications)
7. Create the table and associated indexes
Create Table Construct
• An SQL relation is defined using the create table command:
CREATE TABLE table_name(
Column1name data_type_for_column_1,
Column2name data_type_for_column_2,... )

• Example:
create table instructor (
ID char(5),
name varchar(20),
dept_name varchar(20),
salary numeric(8,2))
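
To check what a create table statement produced, the instructor example can be run and inspected as in this sketch. It assumes Python's built-in sqlite3 module. PRAGMA table_info is SQLite-specific; MySQL offers DESCRIBE for the same purpose. SQLite records the declared types but does not enforce lengths such as char(5).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE instructor (
        ID        char(5),
        name      varchar(20),
        dept_name varchar(20),
        salary    numeric(8,2))
""")

# table_info yields one row per column: (cid, name, type, notnull, dflt_value, pk)
info = conn.execute("PRAGMA table_info(instructor)").fetchall()
column_names = [row[1] for row in info]
for row in info:
    print(row[1], row[2])  # column name and declared type
```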
SQL Syntax

• SQL uses reserved keywords & user-defined names


CREATE TABLE Staff ( StaffNo INTEGER, Salary FLOAT,
Lname VARCHAR(20) );
INSERT INTO Staff VALUES (32, 25000.0, 'Smith');
• By convention, keywords are upper-case, though most SQL dialects are case-insensitive;
• User-defined names must be entered exactly as in tables
• Text data is enclosed using single quotes (' ')
• Round brackets () are used to group related items
• Commas , separate items in a list
• Statements are terminated with a semicolon (';')
Tables in SQL
Table name: Product
Attribute names: Pid, Price, Category, Manufacturer

Pid   Price   Category     Manufacturer
1001  19.99   Gadgets      GizmoWorks
1002  29.99   Gadgets      GizmoWorks
1003  149.99  Photography  Canon
1004  203.99  Household    Hitachi

• How would you create this table?
• First decide what data type to use for each field


Tables Explained

• The schema of a table is the table name and its attributes: Product(PId, Price, Category, Manfacturer)
• A key is an attribute whose values are unique; we underline a key: Product(PId, Price, Category, Manfacturer), where PId is the underlined key
Tables Explained - DDL

SQL description of this table is as follows:

CREATE TABLE Product (PId INT,
Price FLOAT, Category VARCHAR(50),
Manfacturer VARCHAR(60));


Data Definition Language

• Forgot the domains of your table?


• Use the command
• DESCRIBE table_name
• e.g. DESCRIBE Product;
Removing Tables

• DROP TABLE statement allows you to remove tables from your schema:

• DROP TABLE Product;

SQL Syntax
• Now we want to add data to our table
INSERT INTO Product VALUES (1001, 19.99, 'Gadgets', 'Gizmo Works');
• This means insert into our product table a row with the following fields:
• Pid of 1001
• Price of 19.99
• Category of 'Gadgets'
• Manfacturer of 'Gizmo Works'
SQL Syntax

• How would you insert values into the table?

INSERT INTO Product VALUES (1002, 29.99, 'Gadgets', 'Gizmo Works');
INSERT INTO Product VALUES (1003, 149.99, 'Photography', 'Canon');
INSERT INTO Product VALUES (1004, 203.99, 'Household', 'Hitachi');
SQL Syntax
• Suppose we don’t know all the information?
• e.g. we know a product price, category and manufacturer but not product ID?
INSERT INTO Product (Price, Category, Manfacturer) VALUES (49.99, 'Phone', 'Samsung');
INSERT INTO Product (PId, Category, Manfacturer) VALUES (1006, 'Phone', 'Apple');

• Leaves the unknown fields as NULL. Can put these in later…
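
The partial inserts can be tried out as below. This is a sketch assuming Python's built-in sqlite3 module, where NULL comes back as Python's None. The column is spelled Manufacturer here as an assumption (the DDL on the earlier slide spells it Manfacturer), and the product ID 1005 filled in afterwards is a made-up value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product (PId INT, Price FLOAT, "
             "Category VARCHAR(50), Manufacturer VARCHAR(60))")
conn.execute("INSERT INTO Product (Price, Category, Manufacturer) "
             "VALUES (49.99, 'Phone', 'Samsung')")
conn.execute("INSERT INTO Product (PId, Category, Manufacturer) "
             "VALUES (1006, 'Phone', 'Apple')")

# Columns left out of the INSERT are NULL.
rows = conn.execute(
    "SELECT PId, Price, Manufacturer FROM Product ORDER BY Manufacturer").fetchall()
print(rows)  # [(1006, None, 'Apple'), (None, 49.99, 'Samsung')]

# "Put these in later": fill in the missing product ID with an UPDATE.
conn.execute("UPDATE Product SET PId = 1005 "
             "WHERE Manufacturer = 'Samsung' AND PId IS NULL")
filled = conn.execute(
    "SELECT PId FROM Product WHERE Manufacturer = 'Samsung'").fetchone()[0]
```

Note that a missing value is tested with IS NULL, not with = NULL.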




SELECT Statement

• Used for queries on single or multiple tables
• Clauses of the SELECT statement:
• SELECT: List the columns (and expressions) to be returned from the query
• FROM: Indicate the table(s) or view(s) from which data will be obtained
• WHERE: Indicate the conditions under which a row will be included in the result
• GROUP BY: Indicate categorization of results
• HAVING: Indicate the conditions under which a category (group) will be included
• ORDER BY: Sorts the result according to specified criteria
SQL statement processing order (based on van der Lans, 2006, p.100)

Copyright © 2014 Pearson Education, Inc.


SELECT

SELECT fields_list
FROM table_list
WHERE condition;

• fields_list: a list of fields from the tables in table_list; can also specify DISTINCT (no duplicates) or all (*: an asterisk in the select clause denotes “all attributes”)
• table_list: a list of table names you wish to select from
• condition: filters rows subject to some condition
SELECT

• The SELECT statement retrieves data and formats output


• SELECT is the most frequently used SQL statement
• Performs relational algebra's selection, projection and join operations in a
single statement
SELECT * FROM Product;
• In English this means: select all columns of all the rows from the Product table
Selecting Specific Columns
• Specific columns can be output by giving
their names:
SELECT Manufacturer FROM Product;

• In English this means:
• Select the Manufacturer field for all the rows from the Product table
Selecting Specific Columns
• Specific columns can be output by giving their names:
SELECT Pid, Category, Manufacturer FROM Product;

• NB. There must be a comma (',') between column names

• In English this means:
• Select the product ID (Pid), category (Category) and manufacturer (Manufacturer) fields of all the rows from the Product table.
Simple SQL Query
• Use WHERE to specify a condition:
Product
Pid    Price    Category      Manufacturer
1001   19.99    Gadgets       GizmoWorks
1002   29.99    Gadgets       GizmoWorks
1003   149.99   Photography   Canon
1004   203.99   Household     Hitachi

SELECT *
FROM Product
WHERE Category = 'Gadgets';

Result ("selection"):
Pid    Price    Category      Manufacturer
1001   19.99    Gadgets       GizmoWorks
1002   29.99    Gadgets       GizmoWorks


Simple SQL Query
Product
Pid    Price    Category      Manufacturer
1001   19.99    Gadgets       GizmoWorks
1002   29.99    Gadgets       GizmoWorks
1003   149.99   Photography   Canon
1004   203.99   Household     Hitachi

SELECT Pid, Price, Manufacturer
FROM Product
WHERE Price > 100;

Result ("selection" and "projection"):
Pid    Price    Manufacturer
1003   149.99   Canon
1004   203.99   Hitachi
Selecting Distinct Rows

• Use DISTINCT to get only one occurrence of values:

SELECT DISTINCT Category, Manufacturer FROM Product;

In English the query means: select the unique combinations of category (Category) and manufacturer (Manufacturer) from the Product table (i.e. Gadgets does not appear twice).

Category values stored in Product, duplicates included:
Category
Gadgets
Gadgets
Photography
Household
Eliminating Duplicates
SELECT DISTINCT category
FROM Product

Category
Gadgets
Photography
Household

Compare to:

SELECT category
FROM Product

Category
Gadgets
Gadgets
Photography
Household
Selecting Specific Rows & Columns

• Test for range (BETWEEN, NOT BETWEEN)

SELECT PId, Category, Manufacturer
FROM Product
WHERE Price BETWEEN 0 AND 30;

Product
Pid    Price    Category      Manufacturer
1001   19.99    Gadgets       GizmoWorks
1002   29.99    Gadgets       GizmoWorks
1003   149.99   Photography   Canon
1004   203.99   Household     Hitachi
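A short sketch of how the range test behaves, assuming the four-row Product table shown on the slides (column names taken from the table headers). BETWEEN includes both end points.

```sql
-- BETWEEN is inclusive: equivalent to Price >= 0 AND Price <= 30.
SELECT PId, Price, Category, Manufacturer
FROM Product
WHERE Price BETWEEN 0 AND 30;        -- sample data: rows 1001 and 1002

-- NOT BETWEEN returns the rows outside the range.
SELECT PId, Price, Category, Manufacturer
FROM Product
WHERE Price NOT BETWEEN 0 AND 30;    -- sample data: rows 1003 and 1004
```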
Break
SQL Aggregate Functions
We do not just want to retrieve data; we also want to summarise it.

These functions operate on the multiset of values of a column of a relation, and return a value:
• avg: average value
• min: minimum value
• max: maximum value
• sum: sum of values
• count: number of values

Except count, all aggregations apply to a single attribute.


SQL Aggregate Functions

• SELECT COUNT(Manufacturer) AS PhoneManufacturer
  FROM Product WHERE Category = 'Phone';

• SELECT COUNT(Manufacturer) AS countManufacturer,
  MAX(Price) AS maxPrice, MIN(Price) AS minPrice
  FROM Product;
Simple Aggregations
Purchase
Product   Date    Price   Quantity
Apple     10/21   1       20
Banana    10/3    0.5     10
Banana    10/10   1       10
Apple     10/25   1.50    20

SELECT SUM(price * quantity)
FROM Purchase
WHERE product = 'Apple';

Result: 50 (= 20 + 30)
More Examples
Purchase(product, date, price, quantity)
Product Date Price Quantity
Apple 10/21 1 20
Banana 10/3 0.5 10
Banana 10/10 1 10
Apple 10/25 1.50 20

SELECT SUM(price * quantity)
FROM Purchase;

What do they mean?
Group By

• Aggregate functions help us to summarise the whole column(s) of data into one row.
• Sometimes we want to group data before applying aggregate functions
• This gives us 'subtotals' rather than an 'overall total'
• GROUP BY is used to achieve that
SELECT fields-list
FROM table-list
WHERE condition
GROUP BY columnList [HAVING condition]
ORDER BY columnList;

• GROUP BY collects rows into groups; HAVING filters those groups


Grouping and Aggregation
Purchase(product, date, price, quantity)

Find total sales after 10/1/2005 per product.

SELECT product, SUM(price*quantity) AS TotalSales
FROM Purchase
WHERE date > '10/1/2005'
GROUP BY product;

Let’s see what this means…


Grouping and Aggregation

1. Compute the FROM and WHERE clauses.
2. Group by the attributes in the GROUP BY clause.
3. Compute the SELECT clause: grouped attributes and aggregates.

Steps 1 & 2. FROM-WHERE-GROUP BY

Product   Date    Price   Quantity
Apple     10/21   1       20
Apple     10/25   1.50    20
Banana    10/3    0.5     10
Banana    10/10   1       10

Step 3. SELECT

Product   TotalSales
Apple     50
Banana    15

SELECT product, SUM(price*quantity) AS TotalSales
FROM Purchase
WHERE date > '10/1/2005'
GROUP BY product;
HAVING
• Use with GROUP BY to restrict output; HAVING filters groups, whereas WHERE filters rows; the HAVING condition normally contains an aggregate function

SELECT branchNo, COUNT(staffNo) AS myCount, SUM(salary) AS mySum
FROM MoreStaff
GROUP BY branchNo
HAVING COUNT(staffNo) > 1;
SELECT - Order By
SELECT fields-list
FROM table-list
WHERE condition
GROUP BY columnList [HAVING condition]
ORDER BY columnList;

• ORDER BY sorts results: it can be ASCending or DESCending
• Ordering is ascending, unless you specify the DESC keyword.

SELECT PId, price, manufacturer
FROM Product
WHERE category = 'Gadgets' AND price > 50
ORDER BY price, PId;
UPDATE Queries
UPDATE table-list
SET field1 = value1,
    field2 = value2 …
WHERE condition; (WHERE is optional)

UPDATE Product SET Manufacturer = 'LG' WHERE PId = 1001;

• What do you think this statement will do?

UPDATE Product SET Manufacturer = 'LG';
Delete Statement

• Removes rows from a table


• Delete certain rows
DELETE FROM Product WHERE Manufacturer = 'LG';
• Delete all rows
DELETE FROM Product;
• Completely removing a table is a DDL (data definition language) operation:
• DROP TABLE Staff;
NULLS in SQL

Whenever we don’t have a value, we can put a NULL

Can mean many things:


• Value does not exist
• Value exists but is unknown
• Value not applicable
• Etc.
The schema specifies for each attribute whether it can be null (nullable attribute) or not

How does SQL cope with tables that have NULLs ?


Null Values
• If x = NULL then 4*(3-x)/7 is still NULL

• If x = NULL then x = 'Joe' is UNKNOWN
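The practical consequence of UNKNOWN can be sketched as follows, assuming the Product table from the earlier slides, where product 1006 was inserted without a price. The price value 999.00 is a made-up placeholder.

```sql
-- A comparison with NULL evaluates to UNKNOWN, so this returns no rows,
-- even if some prices are missing.
SELECT * FROM Product WHERE Price = NULL;

-- Use IS NULL (or IS NOT NULL) to test for missing values.
SELECT * FROM Product WHERE Price IS NULL;

-- The missing value can be filled in later with UPDATE.
UPDATE Product SET Price = 999.00 WHERE PId = 1006;
```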


SQL

Dr Kehinde Aruleba
Objectives
After completing this lesson, you should be able to do the following:
◦ Define subqueries
◦ Describe the types of problems that subqueries can solve
◦ List the types of subqueries
◦ Write single-row and multiple-row subqueries
◦ Use INNER, LEFT, RIGHT and FULL JOIN
Subqueries

• A Subquery is a query within another SQL query, usually embedded within the WHERE clause.
• Use subqueries when the result that you want requires more than one query, and each subquery
provides a subset of the table involved in the query
• The inner query executes first before its parent query. The results of an inner query are then
passed to the outer query.

SELECT select_list
FROM table
WHERE expr operator
(SELECT select_list
FROM table);
Subquery Syntax
◦ The subquery (inner query) executes once before the main
query (outer query).
◦ A subquery is a SELECT statement that is embedded in a
clause of another SELECT statement. You can build
powerful statements out of simple ones by using
subqueries.
◦ The result of the subquery is used by the main query.
◦ Operator includes a comparison condition such as >, =, or
IN
Using a Subquery to Solve a Problem
• Who has a salary greater than Mark's?

Main query:
Which employees have salaries greater than Mark's salary?

Subquery:
What is Mark's salary?


Using a Subquery to Solve a Problem

• Suppose you want to write a query to find out who earns a salary greater
than Mark's salary.
• To solve this problem, you need two queries: one to find how much
Mark earns, and a second query to find who earns more than that
amount.
• You can solve this problem by combining the two queries, placing one
query inside the other query.
• The inner query (or subquery) returns a value that is used by the outer
query (or main query). Using a subquery is equivalent to performing two
sequential queries and using the result of the first query as the search
value in the second query.
Using a Subquery
SELECT StaffNo, Fname
FROM Staff
WHERE Salary >
    (SELECT Salary
     FROM Staff
     WHERE Fname = 'Mark');

• In the slide, the inner query determines the salary of the staff member Mark (20000). The outer query takes the result of the inner query and uses it to display all the staff who earn more than this amount.

• The main query retrieves the "StaffNo" and "Fname" of staff members whose salary is greater than the salary of the staff member with the first name 'Mark'. This is done by comparing the "Salary" column in the main query with the result of the subquery that retrieves Mark's salary.
Guidelines for Using Subqueries

• Enclose subqueries in parentheses.
• Place subqueries on the right side of the comparison condition.
• Use single-row operators with single-row subqueries, and use multiple-row operators with multiple-row subqueries.
Example 2

• Find all members of staff who earn more than the managers

SELECT * FROM Staff WHERE Salary >
    (SELECT Salary FROM Staff WHERE Position = 'Manager');

Since we have 3 managers, would this example work?
Types of Subqueries
◦ Single-row subquery: a query that returns only one row from the inner SELECT statement
  [Figure: the subquery returns a single value (ST_CLERK) to the main query]

◦ Multiple-row subquery: a query that returns more than one row from the inner SELECT statement
  [Figure: the subquery returns several values (ST_CLERK, SA_MAN) to the main query]
Single-Row Subqueries
◦ Return only one row
◦ Use single-row comparison operators

Operator   Meaning
=          Equal to
>          Greater than
>=         Greater than or equal to
<          Less than
<=         Less than or equal to
<>         Not equal to


Using Group Functions in a Subquery
• You can display data from a main query by using a group function in a subquery to return a single row.
• This example shows the first name, last name and salary of all staff whose salary is equal to the minimum salary.
• The MIN group function returns a single value (200) to the outer query.

SELECT Fname, Lname, Salary
FROM Staff
WHERE Salary =
    (SELECT MIN(Salary)
     FROM Staff);
Multiple-Row Subqueries
◦ Return more than one row
◦ Use multiple-row comparison operators

Operator   Meaning
IN         Equal to any member in the list
ANY        Compare value to each value returned by the subquery
ALL        Compare value to every value returned by the subquery
Using the ANY Operator in Multiple-Row Subqueries

• The ANY operator (and its synonym, the SOME operator) compares a value
to each value returned by a subquery.
• <ANY means less than the maximum.
• >ANY means more than the minimum.
• =ANY is equivalent to IN.

• Example: find those members of staff who are not managers but earn more
than at least one manager:
Using the ANY Operator in Multiple-Row Subqueries

SELECT * FROM Staff
WHERE Position <> 'Manager' AND Salary > ANY
    (SELECT Salary
     FROM Staff
     WHERE Position = 'Manager');

• This example displays staff who are not Managers and whose salary is greater than that of at least one Manager, i.e. greater than the lowest Manager salary. The maximum salary that a Manager earns is 32500.

• ANY: Predicate is true if it matches ANY value in the list, e.g. the statement is true if a member of staff is not a manager and has a higher salary than at least one manager.
Output
Using the ALL Operator in Multiple-Row Subqueries
• The ALL operator compares a value to every value returned by a subquery.
• >ALL means more than the maximum, and <ALL means less than the minimum.

SELECT * FROM Staff
WHERE Position <> 'Manager'
AND Salary < ALL
    (SELECT Salary
     FROM Staff
     WHERE Position = 'Manager');

• ALL: Predicate is true if it matches ALL values in the list, e.g. the statement is true if a member of staff is not a manager and has a lower salary than ALL the managers.
Using the IN Operator in Multiple-Row Subqueries
• The IN operator in SQL is used in multiple-row subqueries to compare a value to a set
of values produced by a subquery. It is often used when you want to filter rows in a
query where a column's value matches any value in a list generated by a subquery.

SELECT * FROM Staff WHERE Position IN ('Manager', 'CEO');

IN: Predicate is true if the test element is in the list (i.e. if position is IN the set of values we supply: Manager or CEO)
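The slide example uses a fixed list; the same operator also accepts a list produced by a subquery. A minimal sketch, assuming the Staff table with the Position and Salary columns used in the ANY and ALL examples:

```sql
-- Staff who are not managers but earn exactly the same as some manager.
-- Salary IN (subquery) behaves like Salary = ANY (subquery).
SELECT *
FROM Staff
WHERE Position <> 'Manager'
  AND Salary IN (SELECT Salary
                 FROM Staff
                 WHERE Position = 'Manager');
```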
SQL Join
An SQL JOIN clause is used to combine rows from two or more tables, based on a common field between them.

The most common type of join is the SQL INNER JOIN (simple join).

An SQL INNER JOIN returns all rows from multiple tables where the join condition is met.
Types of Joins
Inner Joins

• This returns the records which have matching values in both tables.
• For example, if you perform an INNER JOIN between the Employee table and the Project table, all the rows which have matching values in both tables will be returned as output.
Inner Joins
EmpID EmpFname EmpLname Age Address
1 John Head 22 Leicester
2 Julie Walter 32 Wigston
3 Yash Ali 24 London
4 Vard Kumar 25 Coventry
5 Teja Ali 26 Oxford

• Considering both tables (Employee and Project), we can see there is a matching column (EmpID).
• The main relationship between both tables is that a specific employee with an EmpID can work on any number of projects. EmpID = 3 is currently working on two projects.
Inner Joins

EmpID EmpFname EmpLname ProjectID ProjectName


1 John Head 111 Project1
2 Julie Walter 222 Project2
3 Yash Ali 333 Project3
3 Yash Ali 444 Project4
5 Teja Ali 555 Project5

From the result we can see that all the matching values from both tables have
been retrieved, i.e. all employees working on specific projects have been
retrieved
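A query of roughly the following shape would produce the result above. It is a sketch under assumptions: the Project table is not reproduced in the slides, so it is assumed to have the columns ProjectID, ProjectName and EmpID.

```sql
-- Only employees with at least one matching project row are returned.
SELECT e.EmpID, e.EmpFname, e.EmpLname, p.ProjectID, p.ProjectName
FROM Employee AS e
INNER JOIN Project AS p
        ON e.EmpID = p.EmpID;
```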
Left Joins

• LEFT JOIN or LEFT OUTER JOIN returns all the records from the left table and also those records from the right table that satisfy the join condition.
• For the records having no matching values in the right table, the result set will contain NULL values.
Left Joins

EmpFname EmpLname ProjectID ProjectName


John Head 111 Project1
Julie Walter 222 Project2
Yash Ali 333 Project3
Yash Ali 444 Project4
Teja Ali 555 Project5
Vard Kumar Null Null

• All the records from the left table are returned; those that have no matching record in the right table have NULL in the right table's columns.
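The left join result can be sketched with the query below, again assuming a Project table with the columns ProjectID, ProjectName and EmpID (not shown in the slides).

```sql
-- Every employee is returned; employees without a project
-- (Vard Kumar in the sample data) get NULL in the project columns.
SELECT e.EmpFname, e.EmpLname, p.ProjectID, p.ProjectName
FROM Employee AS e
LEFT JOIN Project AS p
       ON e.EmpID = p.EmpID;
```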
Right Joins
• RIGHT JOIN or RIGHT OUTER JOIN returns all the records from the
right table and also those records which satisfy a condition from the
left table.
• For records having no matching values in the left table, the output
will contain NULL values.
RIGHT Joins

EmpFname   EmpLname   ProjectID   ProjectName
John       Head       111         Project1
Julie      Walter     222         Project2
Yash       Ali        333         Project3
Yash       Ali        444         Project4
Teja       Ali        555         Project5
NULL       NULL       666         Project6
NULL       NULL       777         Project7
NULL       NULL       888         Project8

• All the records from the right table are retrieved, and the records that satisfy the condition from the left table are also retrieved, but records with no matching values from the left table have null values in the record.
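The mirror-image query, under the same assumption about the Project table's columns (ProjectID, ProjectName, EmpID):

```sql
-- Every project is returned; projects with no matching employee
-- (666, 777 and 888 in the sample data) get NULL in the employee columns.
SELECT e.EmpFname, e.EmpLname, p.ProjectID, p.ProjectName
FROM Employee AS e
RIGHT JOIN Project AS p
        ON e.EmpID = p.EmpID;
```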
Full Joins
• FULL JOIN or FULL OUTER JOIN returns all records that have a match in either the left table (Table1) or the right table (Table2).
• MySQL does not support FULL JOIN, so you have to combine LEFT JOIN, UNION and RIGHT JOIN to get an equivalent. It gives the result of A union B: it returns all records from both tables. Rows that exist in only one table contain NULL in the columns of the other table.
FULL Joins

EmpFname   EmpLname   ProjectID
John       Head       111
Julie      Walter     222
Yash       Ali        333
Yash       Ali        444
Teja       Ali        555
Vard       Kumar      NULL
NULL       NULL       666
NULL       NULL       777
NULL       NULL       888

• We retrieve all the values from the left table and the matching values in the right table through the left join; we then UNION this with a right join, which retrieves all the values from the right table and the matching values from the left table.
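One common way to emulate a full join in MySQL is sketched below. The Project column names (ProjectID, EmpID) are assumptions, as the table is not reproduced in the slides.

```sql
-- Left join: all employees, with their projects where they exist.
SELECT e.EmpFname, e.EmpLname, p.ProjectID
FROM Employee AS e
LEFT JOIN Project AS p ON e.EmpID = p.EmpID
UNION
-- Right join: all projects, with their employees where they exist.
SELECT e.EmpFname, e.EmpLname, p.ProjectID
FROM Employee AS e
RIGHT JOIN Project AS p ON e.EmpID = p.EmpID;
```

UNION removes the duplicate rows that appear in both halves (the matched rows); UNION ALL would list them twice.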
Creating Natural Joins
• The NATURAL JOIN clause is based on all columns in the two tables that have the same name.
• In a NATURAL JOIN, columns with the same name in the associated tables appear only once in the result. If the columns having the same names have different data types, an error may be returned, depending on the DBMS.
Creating Natural Joins
• Natural join is used when you want fewer columns returned, which avoids redundancy.
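A minimal sketch, assuming EmpID is the only column name shared by the Employee and Project tables:

```sql
-- Joins on every column name the two tables share (assumed here: EmpID only).
-- The shared column appears once in the result.
SELECT *
FROM Employee
NATURAL JOIN Project;
```

Because the join columns are chosen by name, adding a second same-named column to either table silently changes the join condition, which is one reason an explicit ON clause is often preferred.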
SQL

Dr Kehinde Aruleba
Objectives
After completing this lesson, you should understand the following:
◦ Primary keys
◦ Composite primary keys
◦ Foreign keys
◦ Views
Primary Key

The PRIMARY KEY constraint uniquely identifies each record in a table.

Primary keys must contain UNIQUE values and cannot contain NULL values.

The primary key is the only guaranteed way to identify rows in queries

Primary Key Rules
• A primary key must contain unique values.
• If the primary key consists of multiple columns, the combination of values
in these columns must be unique – composite primary key.

• There is only ONE PRIMARY KEY (PK_Student), but the VALUE of the primary key is made up of TWO COLUMNS (ID + LastName).
• If the primary key consists of multiple columns, you must specify them at the end of the CREATE TABLE statement, as a comma-separated list of primary key columns inside parentheses.
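A composite primary key of this kind could be declared as sketched below. The constraint name PK_Student and the key columns ID and LastName come from the slide; the table name Student, the FirstName column and the data types are assumptions.

```sql
CREATE TABLE Student (
    ID        INT          NOT NULL,
    LastName  VARCHAR(255) NOT NULL,
    FirstName VARCHAR(255),
    -- One primary key whose value is the combination of two columns.
    CONSTRAINT PK_Student PRIMARY KEY (ID, LastName)
);
```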
Primary Key Rules

• A primary key column cannot have NULL values. Note that MySQL implicitly adds a NOT NULL constraint to primary key columns.
• A table can have one and only one primary key.
Primary Key

A.
CREATE TABLE School (
    sid INT,
    sname VARCHAR(50) NOT NULL
);

INSERT INTO School VALUES
(1, 'Computing & Mathematical Science'),
(1, 'Law'),
(2, 'Accounting');

sid   sname
1     Computing & Mathematical Science
1     Law
2     Accounting

B.
CREATE TABLE School (
    sid INT,
    sname VARCHAR(50) NOT NULL,
    PRIMARY KEY (sid)
);

INSERT INTO School VALUES
(1, 'Computing & Mathematical Science'),
(1, 'Law'),
(2, 'Accounting');

sid   sname
1     Computing & Mathematical Science
2     Law
3     Accounting

In A, duplicate sid values are accepted. In B, sid must be unique, so the same INSERT is rejected; each row needs a distinct sid, as in the table for B.
Composite Primary Key
• Common use-case - when we have a many-to-many relationship between two tables i.e. multiple
rows in table A are associated with multiple rows in table B.
• Examples
• A customer and product table – A customer can purchase many products also a product can be
purchased by many customers.
• Author and Books – One author can write many books and also a book can be written by many
authors

• NB: An RDBMS cannot implement a many-to-many relationship directly. Therefore, we need to create a third table, commonly called a JUNCTION/LINK table, i.e. the table that will contain the many-to-many relationship.
• The junction table would usually contain only the two foreign key columns (AuthorId & BookId)

Author table (primary key: AuthorID)
AuthorID   FirstName   LastName   Location
1          Dave        Scott      England
2          Albert      Einstein   Germany
3          Jennifer    Pitts      USA

Book table (primary key: BookID)
BookID   BookTitle        Subject
101      C#               Computing
102      Relativity       Physics
103      Law and Empire   Politics
Composite Primary Key
• From the junction table (Table A), we can see that AuthorId = 1 authored two books (101 & 102) and BookId = 101 is written by three authors.
• This structure allows multiple authors to be associated with multiple books, creating a many-to-many relationship.
• Duplicates are allowed within each individual column of a composite primary key, but each combination of values across the columns must be unique.

A.                     B.                     C.
AuthorID   BookID      AuthorID   BookID      AuthorID   BookID
1          101         1          101         1          101
1          102         1          102         1          102
2          101         2          101         2          101
3          101         1          102         1          Null
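The junction table could be created along the following lines. This is a sketch: the table names Author, Book and AuthorBook and all data types are assumptions; the column names follow the tables on the slide.

```sql
CREATE TABLE Author (
    AuthorID  INT PRIMARY KEY,
    FirstName VARCHAR(50),
    LastName  VARCHAR(50),
    Location  VARCHAR(50)
);

CREATE TABLE Book (
    BookID    INT PRIMARY KEY,
    BookTitle VARCHAR(100),
    Subject   VARCHAR(50)
);

-- Junction table: one row per (author, book) pair.
CREATE TABLE AuthorBook (
    AuthorID INT NOT NULL,
    BookID   INT NOT NULL,
    PRIMARY KEY (AuthorID, BookID),                       -- composite primary key
    FOREIGN KEY (AuthorID) REFERENCES Author (AuthorID),
    FOREIGN KEY (BookID)   REFERENCES Book (BookID)
);
```

With this definition, the rows of table A are accepted, the repeated pair (1, 102) in table B violates the primary key, and the NULL in table C is rejected because primary key columns cannot be NULL.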

Foreign Key

• A FOREIGN KEY is a field (or group of fields) in one table that refers to the PRIMARY KEY in
another table.

• The table with the foreign key is called the child table, and the table with the primary key
is called the referenced or parent table.

• A foreign key constraint is defined on the child table.

• Foreign and primary keys must be defined on the same data type.

Foreign Key

Each customer can have zero or many orders and each order must have a customer.

The customers table is called the parent table or referenced table, and the orders table is known as the child table or referencing table.

The relationship between the customer table and the order table is one-to-many. This is created by the foreign key in the orders table specified by the CustomerNo column.

A table can have more than one foreign key, where each foreign key references the primary key of a different parent table.

Each row in the orders table has a CustomerNo that exists in the CustomerNo column of the customers table. Multiple rows in the orders table can have the same CustomerNo.
Foreign Key - syntax

• Specify the name of the foreign key constraint that you want to create after the CONSTRAINT keyword. If you omit it, MySQL generates a constraint name automatically for InnoDB tables.
• Specify a list of comma-separated foreign key columns after the FOREIGN KEY keywords. The foreign key name is optional and is generated automatically if you skip it.
• Specify the parent table followed by a list of comma-separated columns which the foreign key columns reference.
• Specify how the foreign key maintains referential integrity between the child and parent tables by using the ON DELETE and ON UPDATE clauses.
• The reference_option determines the action which MySQL will take when values in the parent key columns are deleted (ON DELETE) or updated (ON UPDATE).
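The parts described above fit together as in the sketch below. The Books and Orders table names and the BookId column follow the error messages quoted on the later slides; the remaining columns, the data types and the constraint name fk_orders_book are assumptions.

```sql
CREATE TABLE Books (
    BookId INT PRIMARY KEY,
    Title  VARCHAR(100)
);

CREATE TABLE Orders (
    OrderId    INT PRIMARY KEY,
    CustomerNo INT,
    BookId     INT,
    CONSTRAINT fk_orders_book          -- optional constraint name
        FOREIGN KEY (BookId)           -- foreign key column(s) in the child table
        REFERENCES Books (BookId)      -- parent table and referenced column(s)
        ON DELETE RESTRICT             -- reference_option for deletes
        ON UPDATE RESTRICT             -- reference_option for updates
);
```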
Foreign Key – referential actions

MySQL has five reference options: CASCADE, SET NULL, NO ACTION, RESTRICT, and SET DEFAULT
• CASCADE: Delete or update the row from the parent table and automatically delete or update the
matching rows in the child table.
• SET NULL: Delete or update the row from the parent table and set the foreign key column or columns
in the child table to NULL.
• RESTRICT: if a row from the parent table has a matching row in the child table, MySQL rejects
deleting or updating rows in the parent table.
• NO ACTION: is the same as RESTRICT.
• SET DEFAULT: is recognised by the MySQL parser. However, this action is rejected by both InnoDB and
NDB tables.

Foreign Key - RESTRICT & NO ACTION actions

• Attempt to insert a new row into the Orders table with a BookId value that does not exist in the Book table:
Insert into Orders Values (4, 6712, 108);
Error Code: 1452. Cannot add or update a child row: a foreign key constraint fails (`lecture5`.`orders`, CONSTRAINT `orders_ibfk_1` FOREIGN KEY (`BookId`) REFERENCES `books` (`BookId`))
Foreign Key
• Attempt to update the value in BookId column (101) in the Book Table to 107

Error Code: 1451. Cannot delete or update a parent row: a foreign key constraint fails (`lecture5`.`orders`, CONSTRAINT `orders_ibfk_1` FOREIGN KEY (`BookId`) REFERENCES `books` (`BookId`))

Because of the RESTRICT option, you cannot delete or update BookId = 101 since it is referenced by the OrderId = 1 in the order table.
Foreign Key - CASCADE action (Update)

• Attempt to update the value in BookId column (101) in the Book Table to 107:

• Two rows with value 101 in the BookId column of the orders table (table A) were
automatically updated to 107 (table B) because of the ON UPDATE CASCADE action.
Foreign Key - CASCADE action (delete)

• Attempt to delete the row with BookId = 102 from the Book table:

• All rows with BookId = 102 in the orders table were automatically deleted because of the ON DELETE CASCADE action.
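The CASCADE behaviour on the last two slides could be set up and triggered as sketched here. It assumes a Books table with primary key BookId and an Orders table with a BookId column that has no other foreign key on it; the constraint name is hypothetical.

```sql
ALTER TABLE Orders
    ADD CONSTRAINT fk_orders_book_cascade
    FOREIGN KEY (BookId) REFERENCES Books (BookId)
    ON UPDATE CASCADE
    ON DELETE CASCADE;

-- The matching BookId values in Orders change from 101 to 107 as well.
UPDATE Books SET BookId = 107 WHERE BookId = 101;

-- The Orders rows that reference book 102 are deleted as well.
DELETE FROM Books WHERE BookId = 102;
```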
Foreign Key - SET NULL action (update)
• If a row from the parent table is deleted or updated, the values of the foreign key column (or columns) in the child table are set to NULL.

• Update the value in BookId column (101) in the Book Table to 123:

• The BookId values of the rows with BookId 101 in the orders table were automatically set to NULL due to the ON UPDATE SET NULL action.
Foreign Key - SET NULL action (delete)

• Delete the row with BookId = 102 from the Book table:

• The values in the BookId column of the rows with BookId = 102 in the orders table were automatically set to NULL due to the ON DELETE SET NULL action.
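A sketch of the SET NULL variant, under the same assumptions (Books with primary key BookId, Orders with a BookId column and no other foreign key on it; hypothetical constraint name). The foreign key column must be nullable for this action to be usable.

```sql
ALTER TABLE Orders
    ADD CONSTRAINT fk_orders_book_setnull
    FOREIGN KEY (BookId) REFERENCES Books (BookId)
    ON UPDATE SET NULL
    ON DELETE SET NULL;

-- The parent row is removed; the child rows stay but lose their reference.
DELETE FROM Books WHERE BookId = 102;

-- The affected orders can be found afterwards with IS NULL.
SELECT * FROM Orders WHERE BookId IS NULL;
```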
Views

• A view is a virtual table, constructed from base tables
• A view is based on a table or another view and acts as a window through which data on tables can be viewed or changed.
• Views are relations, except that they are not physically stored, and they don't contain any data of their own.
• A view can be accessed with an SQL SELECT statement, like a table.
• A view can also be made up by selecting data from more than one table.
The Importance of Views & Privileges

• For presenting different information to different users


• DBMSs are primarily used in organisations, and in large organisations, DBMSs are used by
a range of staff:
• directors, managers, analysts, engineers, personnel, secretarial, etc.
• Consequently, access to data in different tables may need to be controlled to:
• provide access to authorised users
• restrict access to unauthorised users (security)…what could happen if a db was
hacked?!
• enforce business rules or government regulations
• Views & privileges can help implement access control

Example
• If you want to get the same information (StaffID, Name & Salary) repeatedly, you need to write the query each time or save the query as a .txt or .sql file.
• A better way to achieve this is to save it in the db server and give it a name. This is called a view.

Once you execute the CREATE VIEW statement, MySQL creates the view and stores it in the database.

Now, you can reference the view as a table in SQL statements. For example, select * from Staff_payroll;
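The view named on the slide could be defined roughly as follows. The view name Staff_payroll and the three columns come from the slide; the base table name Staff is an assumption.

```sql
-- Store the query under a name in the database.
CREATE VIEW Staff_payroll AS
SELECT StaffID, Name, Salary
FROM Staff;

-- Query the view like a table.
SELECT * FROM Staff_payroll;
```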
Example
• This query returns matching data from both tables, customers and payments, using the inner join

A better way to do this?
Example

Once you execute the CREATE VIEW statement, MySQL creates the view and stores it in the database.
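Wrapping a join in a view might look like the sketch below. Only the table names customers and payments come from the slide; the view name and every column name here are hypothetical.

```sql
CREATE VIEW customer_payments AS
SELECT c.customerName, p.paymentDate, p.amount
FROM customers AS c
INNER JOIN payments AS p
        ON c.customerNumber = p.customerNumber;

-- The join no longer has to be written out each time.
SELECT * FROM customer_payments WHERE amount > 1000;
```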
Advantages of views

• Views help provide the granularity of access control


• Views can help reduce complexity (simplify tables & (e.g.) multi-table queries) and improve access control, e.g:
• customisation (same tables can be seen differently by different users)
• convenience (users see only what they need),
• improved security (views appropriate for the user, restricts access to db),
• constancy (view shows part of base tables & if base columns not in view change, no effect),
• currency (changes to base table accessed in view are immediately seen in view)
• Views can help maintain DB integrity (e.g. by doing updates via views)

Disadvantages of views

There are some restrictions on their use: some views can't be updated; you can't do aggregates on aggregates in the SELECT clause or use an aggregate function in the WHERE clause; also, grouped views can't be joined with a base table/view, etc.

Implementation: the resolution method can cause a performance penalty (the defining query is run on the base tables again each time, which is a problem with frequent access).

Many views are read-only: you cannot perform data modification operations (INSERT, UPDATE, DELETE) directly on such a view, especially if it involves multiple tables. This can be a limitation when you need to perform data manipulation operations on a dataset.

Structure restriction: the downside of constancy (SELECT * FROM Foo only selects the columns which were in Foo when the view was created; the view needs to be recreated for new columns in Foo to appear in it).
SQL (4)
Dr Kehinde Aruleba
Objectives
After completing this lesson, you should understand the following:
◦ ALTER TABLE Statement
◦ DCL Commands
◦ Grant
◦ Revoke
◦ Create Roles
Alter

• The ALTER command is used for altering the table structure, for example:
• to add a column to an existing table
• to rename any existing column
• to change the datatype of any column or to modify its size
• to drop a column from the table
ALTER Command: Add a new Column

• Using the ALTER command we can add a column to any existing table. Following is the syntax.

In this syntax:
• First, specify the name of the table in which you want to add the new column.
• Second, specify the name of the column, its data type, and constraint if applicable
ALTER Command: Add a new Column
(Example)
• Add publisher column to the existing Authors table

This ALTER command will add a new column PUBLISHER to the table Authors
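The statement on the slide is not reproduced in the text; a version in MySQL syntax might look like this. The data type VARCHAR(100) is an assumption.

```sql
-- General form: ALTER TABLE table_name ADD [COLUMN] column_name data_type [constraint];
ALTER TABLE Authors
    ADD COLUMN Publisher VARCHAR(100);
```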
ALTER Command: Add multiple new Columns

• Using the ALTER command we can add multiple new columns to any existing table.
• In this syntax, you specify a comma-separated list of columns that you want to add to a table after the ADD clause.

ALTER Command: Add multiple new Columns
• Example
This command will add three new columns to the Authors table.
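A sketch of adding three columns in one statement. The column names and data types are hypothetical, since the slide's own statement is not reproduced in the text.

```sql
-- Several ADD clauses, separated by commas, in a single ALTER TABLE.
ALTER TABLE Authors
    ADD COLUMN Email   VARCHAR(100),
    ADD COLUMN Phone   VARCHAR(20),
    ADD COLUMN Country VARCHAR(50);
```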
ALTER Command: Modify an existing Column
• ALTER command can also be used to modify data type of any existing
column.

• Example: Convert the data type of dob from date to varchar


• Before converting, let us check the current state (datatypes) of our table, and of dob in particular
ALTER Command: Modify an existing Column
• Example: Convert the data type of dob from date to varchar

ALTER Command: Modify an existing Column
• ALTER command can be used to change the size of a column.
• Example: Change the size of department from 60 to 100.

ALTER Command: Modify an existing Column
• ALTER command can be used to change a nullable column to NOT NULL.
• Example: Change the column LastName to NOT NULL.
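The three MODIFY examples can be sketched as below, assuming that dob, department and LastName are all columns of the Authors table; the VARCHAR lengths for dob and LastName are assumptions. In MySQL, MODIFY restates the whole column definition, so any attribute that should be kept has to be repeated.

```sql
-- Check the current column definitions first.
DESCRIBE Authors;

-- Change the data type of dob from DATE to VARCHAR.
ALTER TABLE Authors MODIFY COLUMN dob VARCHAR(20);

-- Change the size of department from 60 to 100.
ALTER TABLE Authors MODIFY COLUMN department VARCHAR(100);

-- Make LastName NOT NULL (fails if existing rows contain NULL).
ALTER TABLE Authors MODIFY COLUMN LastName VARCHAR(50) NOT NULL;
```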

ALTER Command: Rename a Column
• ALTER command can be used to rename a column.

• table_name
• The name of the table to modify.
• original_name
• The column to rename.
• new_name
• The new name for the column.
• column_definition
• The datatype and definition of the column (NULL or NOT NULL, etc). You must specify the column
definition when renaming the column, even if it does not change.
• FIRST | AFTER column_name
• Optional. It tells MySQL where in the table to position the column, if you wish to change its position.
ALTER Command: Rename a Column
• Example
• Rename the LastName column to Surname
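Following the parameter list above, the rename could be written as sketched here. The column definition VARCHAR(50) NOT NULL is an assumption; it has to be restated even though it does not change.

```sql
-- ALTER TABLE table_name CHANGE COLUMN original_name new_name column_definition;
ALTER TABLE Authors
    CHANGE COLUMN LastName Surname VARCHAR(50) NOT NULL;
```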

ALTER Command: Drop a Column
• The syntax to delete a column in a table in MySQL (using the ALTER TABLE
statement) is:

• table_name
• The name of the table to modify.
• column_name
• The name of the column to delete from the table.

ALTER Command: Drop a Column
• This example shows how to remove the dob column from Authors table
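In MySQL syntax, the statement has this shape:

```sql
-- ALTER TABLE table_name DROP COLUMN column_name;
ALTER TABLE Authors
    DROP COLUMN dob;
```

The column's data is removed together with the column.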

ALTER Command: Rename table
• To rename a table, you use the ALTER TABLE RENAME TO statement:

• Rename Authors table to AuthorsInfo
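In MySQL syntax, the rename looks like this:

```sql
-- ALTER TABLE old_name RENAME TO new_name;
ALTER TABLE Authors
    RENAME TO AuthorsInfo;
```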

SQL: DCL Commands
• DCL commands are used to control user access in a database.
• DCL commands are used to enforce database security in a multiple user
database environment.
• Helps to restrict users from accessing data in a database.
• Two types of DCL commands are GRANT and REVOKE.
• Only Database Administrators or owners of the database object can provide/remove privileges on a database object.
DCL Commands - GRANT
• SQL GRANT is a command used to provide access or privileges on the
database objects to the users.
• The grant privilege is given by the administrator who has the control on the
database
Privilege    Capability
all          All privileges
create       Ability to CREATE tables
select       Ability to run SELECT statements on tables or views
insert       Ability to run INSERT statements on the table or view
update       Ability to run UPDATE statements on the table or view
delete       Ability to run DELETE statements on the table or view
references   Ability to create a foreign key constraint that refers to another table
alter        Ability to perform ALTER TABLE statements
drop         Ability to DROP objects in the database
DCL Commands - GRANT
• SQL GRANT syntax

• privilege_name is the access right or privilege granted to the user. Some of


the access rights are ALL, EXECUTE, and SELECT.
• object_name is the name of a database object like TABLE or VIEW.
• user_name is the name of the user to whom an access right is being granted.
• PUBLIC is used to grant access rights to all users.
• ROLES are a set of privileges grouped together.
• WITH GRANT OPTION allows a user to grant access rights to other users.
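The parts listed above fit a generic GRANT statement of the shape shown in the comment below. The concrete statements are a hedged MySQL 8 sketch: the account 'user1'@'localhost', its password, the role report_reader, and the Employee table are all assumptions made for illustration, and PUBLIC is left out because it belongs to other database systems rather than to this sketch.

```sql
-- Generic shape (placeholders, not literal SQL):
--   GRANT privilege_name ON object_name
--   TO {user_name | PUBLIC | role_name} [WITH GRANT OPTION];

-- Hypothetical table, account, and role, created only for this illustration.
CREATE TABLE IF NOT EXISTS Employee (
  EmployeeID INT PRIMARY KEY,
  Name       VARCHAR(50)
);
CREATE USER IF NOT EXISTS 'user1'@'localhost' IDENTIFIED BY 'Change_me_123!';
CREATE ROLE IF NOT EXISTS 'report_reader';

-- Grant a privilege to the role, then give the role to the user.
GRANT SELECT ON Employee TO 'report_reader';
GRANT 'report_reader' TO 'user1'@'localhost';

-- List what the account has been granted.
SHOW GRANTS FOR 'user1'@'localhost';
```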
19
DCL Commands - GRANT
• For example, statement A gives user1 the SELECT, INSERT, UPDATE, and DELETE
privileges on the Employee table. Statement B grants the same privileges
WITH GRANT OPTION.
• Use WITH GRANT OPTION carefully: if the privileges in A are granted to user1
together with WITH GRANT OPTION, as in B, then user1 can GRANT SELECT, INSERT,
UPDATE, and DELETE on the Employee table to another user, such as user2.
• If you later REVOKE the privileges on Employee from user1, user2 will still
have the privileges on the Employee table.
• Similarly to A and B, statement C gives user1 permission to view and modify
ALL records in the Employee table.
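The three statements labelled A, B, and C can be sketched as follows. The exact text of the original statements is not recoverable, so this is an assumed reconstruction: the account is written as 'user1'@'localhost', and it must already exist along with the Employee table.

```sql
-- A: plain grant of four privileges on one table.
GRANT SELECT, INSERT, UPDATE, DELETE
ON Employee
TO 'user1'@'localhost';

-- B: the same privileges, but user1 may now pass them on to other accounts.
GRANT SELECT, INSERT, UPDATE, DELETE
ON Employee
TO 'user1'@'localhost'
WITH GRANT OPTION;

-- C (assumed form): every table-level privilege on Employee.
GRANT ALL ON Employee TO 'user1'@'localhost';
```

After B, user1 could run a GRANT of their own for user2, which is why the grant option should be handed out sparingly.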
20
DCL Commands - REVOKE
• SQL REVOKE command removes user access rights or privileges to the
database objects.
• Used to cancel previously granted or denied permissions.

21
DCL Commands - REVOKE

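• This command revokes the SELECT privilege on the Employee table from user1.
• Once you REVOKE the SELECT privilege on a table from a user, the user can no
longer SELECT data from that table through that grant.
• In database systems that track who granted each privilege (Oracle, for example),
a user who received SELECT on a table from more than one user can still SELECT
from it until everyone who granted the permission revokes it, and you cannot
REVOKE privileges that were not initially granted by you.
• MySQL stores privileges per account and object rather than per grantor, so one
REVOKE removes the privilege whoever granted it. To run it you need the GRANT
OPTION privilege and the privileges you are revoking.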
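A minimal sketch of the single-privilege revoke described above, assuming the account 'user1'@'localhost' exists and currently holds SELECT on Employee. The general shape is REVOKE privilege_name ON object_name FROM user_name.

```sql
-- Remove only the SELECT privilege on one table.
REVOKE SELECT ON Employee FROM 'user1'@'localhost';

-- Confirm what the account still holds.
SHOW GRANTS FOR 'user1'@'localhost';
```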

22
DCL Commands - REVOKE
• For example, a single REVOKE statement can remove the SELECT, INSERT, UPDATE,
and DELETE privileges granted to user1.

• To remove all privileges from user1, including the option to grant privileges,
the privileges and the grant option have to be revoked separately.

• With DBA access, one statement can instead remove ALL of the user's database access.
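The three cases above might look like this in MySQL. This is a sketch under the assumption that 'user1'@'localhost' was previously granted these privileges on Employee WITH GRANT OPTION; the original statements themselves are not recoverable.

```sql
-- 1. Revoke several table privileges in one statement.
REVOKE SELECT, INSERT, UPDATE, DELETE ON Employee FROM 'user1'@'localhost';

-- 2. The grant option is a separate privilege, so it is revoked separately.
REVOKE GRANT OPTION ON Employee FROM 'user1'@'localhost';

-- 3. With sufficient (DBA-level) rights: strip every privilege the account
--    holds at every level, together with the grant option.
REVOKE ALL PRIVILEGES, GRANT OPTION FROM 'user1'@'localhost';
```

The third form leaves the account in place but with no privileges; removing the account itself would need DROP USER.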

23
Summary

In this lecture you have learned the following in MySQL:

• ALTER TABLE Statement

• DCL Commands
• Grant
• Revoke

30
SQL (5): Triggers
Dr Kehinde Aruleba
IF( ) Function

• The MySQL IF() function is used for validating a condition.


• The IF() function returns a value if the condition is TRUE and another value
if the condition is FALSE.
• The MySQL IF() function can return values that can be either numeric or
strings depending upon the context in which the function is used.

2
IF( ) Function

• condition: This is the expression that is evaluated. If the condition is true, the IF()
function returns value_if_true; otherwise, it returns value_if_false.

• value_if_true: The value to be returned if the condition is true.

• value_if_false: The value to be returned if the condition is false
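The three parameters above fit the call shape IF(condition, value_if_true, value_if_false). A minimal sketch with literal values (chosen here purely for illustration) shows both a string and a numeric result:

```sql
-- The condition is true, so the second argument is returned.
SELECT IF(10 > 5, 'condition is true', 'condition is false') AS result;
-- result: condition is true

-- The condition is false, so the third argument is returned (numeric here).
SELECT IF(1 = 2, 100, 200) AS result;
-- result: 200
```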

3
IF( ) Function - Examples

• Example A returns Yes
• Example B returns No
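The original queries for A and B are not preserved, so the two statements below are assumed stand-ins that produce the same results, Yes and No.

```sql
-- A: 500 < 1000 is true, so IF() returns its second argument.
SELECT IF(500 < 1000, 'Yes', 'No') AS A;
-- A: Yes

-- B: STRCMP() returns 0 only for equal strings; 'hello' and 'bye' differ,
--    so the condition is false and IF() returns its third argument.
SELECT IF(STRCMP('hello', 'bye') = 0, 'Yes', 'No') AS B;
-- B: No
```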

4
Triggers
• A trigger is a statement that a system executes automatically when there is any
modification to the database.
• Modification - SQL INSERT, UPDATE, or DELETE statement

• For example
• you can define a trigger that is invoked automatically before a new row is inserted into a
table.
• Charge £10 overdraft fee if the balance of an account after a withdrawal transaction is less
than £500

5
Triggers Event
• trigger_event indicates the kind of operation that activates the trigger.
These trigger_event values are permitted:
• INSERT: The trigger activates whenever a new row is inserted into the table.

• UPDATE: The trigger activates whenever a row is modified.

• DELETE: The trigger activates whenever a row is deleted from the table.

6
Types of Triggers in MySQL

• We can define 6 types of triggers for each table:


1. AFTER INSERT: activated after data is inserted into the table.
2. AFTER UPDATE: activated after data in the table is modified.
3. AFTER DELETE: activated after data is deleted/removed from the table.
4. BEFORE INSERT: activated before data is inserted into the table.
5. BEFORE UPDATE: activated before data in the table is modified.
6. BEFORE DELETE: activated before data is deleted/removed from the table.

7
Triggers Syntax
• This statement creates a new trigger. A trigger is a named database object that is associated with a table,
and activates when a particular event occurs for the table.

• trigger_name: name of the trigger you want to create. Specify this after the CREATE TRIGGER keywords.
• {BEFORE | AFTER}: trigger action time. This indicates whether the trigger is invoked before or after each row is modified.
• {INSERT | UPDATE | DELETE}: the operation that activates the trigger.
• ON table_name FOR EACH ROW: table_name is the table to which the trigger belongs; FOR EACH ROW means the trigger body runs once for every affected row.
• Finally, specify the statement to execute when the trigger activates.

• If you want to execute multiple statements within the trigger_body, you use the BEGIN END compound statement.
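The pieces described above can be sketched with the overdraft-fee idea from the Triggers slide. The account table, its columns, and the trigger name overdraft_fee are hypothetical; the rule (charge 10 when a withdrawal leaves the balance under 500) follows the earlier example.

```sql
-- Generic shape (placeholders, not literal SQL):
--   CREATE TRIGGER trigger_name
--   {BEFORE | AFTER} {INSERT | UPDATE | DELETE}
--   ON table_name FOR EACH ROW
--   trigger_body;

CREATE TABLE account (
  account_id INT PRIMARY KEY,
  balance    DECIMAL(10,2) NOT NULL
);

DELIMITER //
CREATE TRIGGER overdraft_fee
BEFORE UPDATE ON account
FOR EACH ROW
BEGIN
  -- A withdrawal lowers the balance; charge the fee if it ends below 500.
  IF NEW.balance < OLD.balance AND NEW.balance < 500 THEN
    SET NEW.balance = NEW.balance - 10;
  END IF;
END//
DELIMITER ;

INSERT INTO account VALUES (1, 600.00);
UPDATE account SET balance = balance - 200 WHERE account_id = 1;
SELECT balance FROM account WHERE account_id = 1;  -- 390.00
```

The body needs BEGIN ... END here because the IF block is a compound statement; a trigger with a single simple statement can omit it.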

8
Triggers Syntax – before insert
• MySQL BEFORE INSERT triggers are automatically fired before an insert event occurs on
the table.

• For example, create a BEFORE INSERT trigger that adds 10 extra marks to the
mark field whenever a new student record is added to the table.
• The trigger is called lecture7
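A sketch of what the lecture7 trigger could look like. The table name student_mark and its columns are assumptions; only the trigger name and the 10-mark rule come from the slide.

```sql
-- Hypothetical table holding one mark per student.
CREATE TABLE student_mark (
  student_id INT PRIMARY KEY,
  name       VARCHAR(50),
  mark       INT
);

-- Single-statement body, so no BEGIN ... END and no DELIMITER change is needed.
CREATE TRIGGER lecture7
BEFORE INSERT ON student_mark
FOR EACH ROW
SET NEW.mark = NEW.mark + 10;

INSERT INTO student_mark VALUES (1, 'Ada', 55);
SELECT mark FROM student_mark WHERE student_id = 1;  -- 65
```

Because the trigger runs before the insert, it can change NEW.mark and the adjusted value is what gets stored.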

9
Triggers Syntax – before insert
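• For example, create a BEFORE INSERT trigger that converts every average mark
less than 0 to 0.

• Change the delimiter to // because the trigger body contains several statements, each ending in a semicolon. The client must not treat those semicolons as the end of the CREATE TRIGGER statement.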
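A minimal sketch of this trigger, assuming a hypothetical results table with an average column and a trigger named clamp_average. DELIMITER is a command of the mysql client and MySQL Workbench, not part of SQL itself.

```sql
CREATE TABLE results (
  student_id INT PRIMARY KEY,
  average    DECIMAL(5,2)
);

DELIMITER //
CREATE TRIGGER clamp_average
BEFORE INSERT ON results
FOR EACH ROW
BEGIN
  -- Replace a negative average with 0 before the row is stored.
  IF NEW.average < 0 THEN
    SET NEW.average = 0;
  END IF;
END//
DELIMITER ;

INSERT INTO results VALUES (1, -12.50), (2, 64.00);
SELECT student_id, average FROM results ORDER BY student_id;
-- 1  0.00
-- 2  64.00
```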

10
Triggers Syntax – after insert
• MySQL AFTER INSERT triggers are automatically invoked after an insert event occurs on
the table.

• For example, create an AFTER INSERT trigger that sends a message to students when
the name of their department is missing from their record.
1. Create a Student table (A) and a message table (B)
11
Triggers Syntax – after insert
2. Create trigger called CheckDept

3. Insert new records to the existing table A
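The three steps can be sketched as below. The column names and the message wording are assumptions; the table roles (Student as A, Message as B) and the trigger name CheckDept come from the slides.

```sql
-- 1. Table A (students) and table B (messages).
CREATE TABLE Student (
  StudentID  INT PRIMARY KEY,
  Name       VARCHAR(50) NOT NULL,
  Department VARCHAR(50)            -- NULL means the department is missing
);
CREATE TABLE Message (
  MessageID INT AUTO_INCREMENT PRIMARY KEY,
  StudentID INT,
  Message   VARCHAR(200)
);

-- 2. The trigger: after each insert, log a message if Department is missing.
DELIMITER //
CREATE TRIGGER CheckDept
AFTER INSERT ON Student
FOR EACH ROW
BEGIN
  IF NEW.Department IS NULL THEN
    INSERT INTO Message (StudentID, Message)
    VALUES (NEW.StudentID,
            CONCAT('Hi ', NEW.Name, ', please update your department.'));
  END IF;
END//
DELIMITER ;

-- 3. Insert new records; only the second one produces a message.
INSERT INTO Student VALUES (1, 'Ada', 'Computing'), (2, 'Ben', NULL);
SELECT StudentID, Message FROM Message;
-- 2  Hi Ben, please update your department.
```

An AFTER INSERT trigger cannot change the inserted row, so it suits side effects such as logging into another table.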

12
Triggers Syntax – before update
• A BEFORE UPDATE trigger in MySQL is invoked automatically before an update
operation is applied to the table associated with the trigger.

• For example, create a BEFORE UPDATE trigger such that, when table A is updated,
a Salary equal to 20000 is set to 90000, and a Salary less than 20000 is set
to 70000.
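A sketch of the salary rule. The table name Staff, its columns, and the trigger name adjust_salary are hypothetical, and the rule is assumed to apply to the incoming value (NEW.Salary) rather than the stored one.

```sql
CREATE TABLE Staff (
  StaffID INT PRIMARY KEY,
  Name    VARCHAR(50),
  Salary  INT
);

DELIMITER //
CREATE TRIGGER adjust_salary
BEFORE UPDATE ON Staff
FOR EACH ROW
BEGIN
  IF NEW.Salary = 20000 THEN
    SET NEW.Salary = 90000;
  ELSEIF NEW.Salary < 20000 THEN
    SET NEW.Salary = 70000;
  END IF;
END//
DELIMITER ;

INSERT INTO Staff VALUES (1, 'Ada', 50000), (2, 'Ben', 50000);
UPDATE Staff SET Salary = 20000 WHERE StaffID = 1;  -- stored as 90000
UPDATE Staff SET Salary = 15000 WHERE StaffID = 2;  -- stored as 70000
SELECT StaffID, Salary FROM Staff ORDER BY StaffID;
-- 1  90000
-- 2  70000
```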

14
Triggers Syntax – before update
• If you update the value in the quantity column (Table A) to a new value that is more than 3 times the
current value, the trigger raises an error and stops the update.
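One way to sketch this check uses SIGNAL to raise the error. The table name sales, its columns, the trigger name, and the error text are assumptions; SQLSTATE '45000' is the generic code for a user-defined error.

```sql
CREATE TABLE sales (
  id       INT PRIMARY KEY,
  product  VARCHAR(50),
  quantity INT NOT NULL
);

DELIMITER //
CREATE TRIGGER before_sales_update
BEFORE UPDATE ON sales
FOR EACH ROW
BEGIN
  -- Reject the update when the new quantity exceeds 3 times the old one.
  IF NEW.quantity > OLD.quantity * 3 THEN
    SIGNAL SQLSTATE '45000'
      SET MESSAGE_TEXT = 'New quantity cannot exceed 3 times the current quantity';
  END IF;
END//
DELIMITER ;

INSERT INTO sales VALUES (1, 'Pen', 100);
UPDATE sales SET quantity = 150 WHERE id = 1;  -- allowed: 150 <= 300
UPDATE sales SET quantity = 500 WHERE id = 1;  -- rejected: 500 > 450
```

The second UPDATE fails with the custom message and the row keeps quantity 150.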

16
Triggers Syntax – after update
• MySQL AFTER UPDATE triggers are invoked automatically after an update event occurs
on the table associated with the triggers.

• For example, promote all students in the "students" table (Table A) to the next class
and use an AFTER UPDATE trigger to record each change.
• Create another table named students_log (Table B) that keeps information about each update.
• Whenever an update is performed on a single row in the "students" table, a new
row is inserted in the "students_log" table. This table keeps the current
user id and a description of the update.
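A sketch of the logging trigger. The column names and the trigger name log_student_update are assumptions; the two table names come from the slide. USER() returns the current MySQL account as a string.

```sql
-- Table A: the data being updated.
CREATE TABLE students (
  id    INT PRIMARY KEY,
  name  VARCHAR(50),
  class INT
);

-- Table B: one log row per updated student row.
CREATE TABLE students_log (
  user_id     VARCHAR(100),
  description VARCHAR(255)
);

CREATE TRIGGER log_student_update
AFTER UPDATE ON students
FOR EACH ROW
INSERT INTO students_log (user_id, description)
VALUES (USER(),
        CONCAT('Student ', NEW.name, ' moved from class ',
               OLD.class, ' to class ', NEW.class));

INSERT INTO students VALUES (1, 'Ada', 1), (2, 'Ben', 2);

-- Promote everyone: the trigger fires once per row, so two log rows appear.
UPDATE students SET class = class + 1;
SELECT description FROM students_log;
```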

17
The Special Powers of a Trigger
• While in the body of a trigger, there are potentially two sets of
column values available to you, with special syntax for denoting them.
• old.<column name> will give you the value of the column before the DML
statement executed.
• new.<column name> will give you the value of that column after the DML
statement executed.
• Insert triggers have no old values available, and delete triggers have
no new values available for obvious reasons. Only update triggers
have both the old and the new values available.
• Only triggers can access these values this way.
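As a sketch of the delete case, where only OLD values exist, the trigger below copies a row into an archive table just before it is removed. The book and book_archive tables and the trigger name archive_book are hypothetical.

```sql
CREATE TABLE book (
  book_id INT PRIMARY KEY,
  title   VARCHAR(100),
  price   DECIMAL(6,2)
);
CREATE TABLE book_archive (
  book_id    INT,
  title      VARCHAR(100),
  price      DECIMAL(6,2),
  deleted_at DATETIME
);

-- In a DELETE trigger only OLD.<column> is available; referring to
-- NEW.<column> here would be rejected when the trigger is created.
CREATE TRIGGER archive_book
BEFORE DELETE ON book
FOR EACH ROW
INSERT INTO book_archive
VALUES (OLD.book_id, OLD.title, OLD.price, NOW());

INSERT INTO book VALUES (1, 'SQL Basics', 19.99);
DELETE FROM book WHERE book_id = 1;
SELECT book_id, title, price FROM book_archive;
-- 1  SQL Basics  19.99
```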
Database Security
Dr Kehinde Aruleba
Introduction
• Most organizations have one or more databases

WHICH MUST BE PROTECTED
2


Question: Is the protection secure enough???

3
The Real Threat
• Database trusts the applications and users interacting with it

• The trust in this context refers to the assumption that the applications and users
interacting with the database are legitimate and authorized.

• The database relies on the integrity and security of the applications and users to
ensure that only valid and authorized operations are performed on the data.

• The real threat arises when this trust is misplaced or exploited.

5
Threats to Databases from Applications

1. Application Code Exposure

6
Application code exposure
• Attackers / malicious users analyse
• the URL and error messages
• the application code
• to determine
• what platform is being used
• what business rules are in place
• and how the database is being accessed, i.e.
• connection strings, actual SQL queries, etc.
• This helps in launching the database attack
• directly
• or through the application

7
Application configuration files
• Holds connection details
• Contains database server name
• And username and password to connect to database

• Is it stored securely?

8
Application configuration files
• "Attackers won't find my configuration file."
• Why don't you ask my friend Google?
• Search for ext:php intext:"$dbms""$dbhost""$dbuser""$dbpasswd""$table_prefix""phpbb_installed"
• (This searches for the configuration files of phpBB, a popular PHP-based
bulletin board)

9
Anatomy of Application configuration files
• Options to store configuration files
• For Microsoft technologies:
• Web.config and global.asa files
• UDL files
• Registry
• For all technologies:
• Include text files
• Hard-coded connection strings

10
Suggestions
• Use Windows authentication or LDAP authentication (if possible) to
connect to the database over a secure channel
• Never store the username and password in clear text
• Encrypt the configuration files
• Use DPAPI (for Windows 2000 and above)
• Create a baseline
• Monitor the database connections being made
• Make a list of IP Addresses and applications which should be allowed to access the
database
• And deny access to others using a good SQL firewall

11
Protecting the code
• Code obfuscation
• Provides security through obscurity
• Converting the intermediate code into a form that makes reverse engineering very
difficult (not impossible though)
• Add code to break decompilers
• Use tools, e.g. DashO and HoseMocha for Java.

• Don’t store application code in unsecured environments


• Security of development and test environments is as necessary as of the production
environment
• Patch the development and test environments regularly

12
Threats to Databases from Applications

2. SQL Injection

13
SQL Injection
• Famous attacks
• Tesla vulnerability—in 2014, security researchers publicized that they were able to
breach the website of Tesla using SQL injection, gain administrative privileges and
steal user data.
• Heartland Payment Systems (breach disclosed in 2009; indictments followed in 2013):
within days of disclosure, it had been classified as the biggest ever
criminal breach of card data. One estimate claimed 100 million cards and more than
650 financial services companies were compromised. Prosecutors have said that
three of the corporate victims reported $300m in losses.
• What is it?
• SQL injection usually occurs when you ask a user for input, like their
username/userid, and instead of a name/id, the user gives you an SQL statement
that you will unknowingly run on your database.

14
SQL Injection
• Can be used to
• Bypass authentication

• Access sensitive data

• Even shut down a system

• And more…

15
SQL Injection
• Look at the following example which creates a SELECT statement by adding
a variable (txtUserId) to a select string.
• The variable is fetched from user input (getRequestString):

• The original purpose of the code was to create an SQL statement to select
a user, with a given user id.
• If there is nothing to prevent a user from entering "wrong" input, the user can enter
some "smart" input like this:
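The original code is not preserved, so the sketch below mimics it in MySQL itself. The user variable @txtUserId stands in for the value fetched by getRequestString, and @txtSQL for the string the application builds; both names are borrowed from the slide text, and the CONCAT-based construction is an assumption about how the statement was assembled.

```sql
-- @txtUserId stands in for whatever the user typed into the input field.
SET @txtUserId = '105 OR 1=1';

-- Vulnerable pattern: the input is pasted straight into the SQL text.
SET @txtSQL = CONCAT('SELECT * FROM Users WHERE UserId = ', @txtUserId);

-- Inspect the statement that would be sent to the database.
SELECT @txtSQL AS statement_to_run;
-- SELECT * FROM Users WHERE UserId = 105 OR 1=1
```

The input was meant to be a single id, but because it becomes part of the statement text, the attacker's OR 1=1 is parsed as SQL.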

16
SQL Injection
• Then, the SQL statement will look like this:

• The SQL above is valid and will return ALL rows from the "Users" table,
since OR 1=1 is always TRUE.
• Does the example above look dangerous? What if the "Users" table
contains names and passwords?
• The SQL statement above is much the same as this:

• A hacker might get access to all the user names and passwords in a
database, by simply inserting 105 OR 1=1 into the input field.
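To make the effect concrete, here is a sketch against a small, hypothetical Users table. The rows are invented, and the plain-text Password column only mirrors the slide's scenario; it is not a recommendation.

```sql
CREATE TABLE Users (
  UserId   INT PRIMARY KEY,
  Name     VARCHAR(50),
  Password VARCHAR(50)
);
INSERT INTO Users VALUES (105, 'alice', 'pw-alice'), (106, 'bob', 'pw-bob');

-- Intended query: matches only user 105.
SELECT * FROM Users WHERE UserId = 105;

-- Injected query: OR 1=1 is true for every row, so every row is returned.
SELECT * FROM Users WHERE UserId = 105 OR 1=1;

-- Much the same statement with the columns written out.
SELECT UserId, Name, Password FROM Users WHERE UserId = 105 OR 1=1;
```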
17
SQL Injection
• Countermeasures
• Limiting application vulnerabilities
• Find the vulnerability and fix the code
• All input must be sanitized
• SQL used to access data must not be formed by string concatenation
• Prepared statements and parameterized stored procedures must be used
wherever possible
• Add quotes to all user input including numerical data
• Don’t display the database generated error messages
• Apply latest patches ASAP
• Remove or minimize the application's access to the vulnerable function
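The prepared-statement countermeasure can be sketched with MySQL's own PREPARE / EXECUTE statements. The Users table and the statement name find_user are assumptions; application code would normally use the parameter-binding API of its database driver, which follows the same idea.

```sql
-- Placeholder table so the example runs on its own.
CREATE TABLE IF NOT EXISTS Users (
  UserId   INT PRIMARY KEY,
  Name     VARCHAR(50),
  Password VARCHAR(50)
);

-- The SQL text is fixed; the ? marks where a value will be bound later.
PREPARE find_user FROM 'SELECT UserId, Name FROM Users WHERE UserId = ?';

-- The malicious input is bound as one value. It is never parsed as SQL,
-- so the OR 1=1 inside it cannot become part of the WHERE clause.
SET @txtUserId = '105 OR 1=1';
EXECUTE find_user USING @txtUserId;

DEALLOCATE PREPARE find_user;
```

Binding keeps the statement structure fixed, but validating that the input really is a number before it reaches the database is still worthwhile.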

18
