
Database Systems: Design, Implementation, and Management

Chapter 8
Advanced SQL (Part 1)
Objectives
• In this lecture, you will learn:
– How to implement the SQL concepts you learnt in the last lecture
– How to use the advanced SQL JOIN operator syntax
– About the relational set operators UNION, UNION ALL, INTERSECT, and MINUS
– About the different types of subqueries and correlated queries

Database Systems, 11th Edition 2


SQL language statement types
STATEMENT                               Type
SELECT                                  Data Retrieval
INSERT, UPDATE, DELETE                  Data Manipulation Language (DML)
CREATE, ALTER, DROP, RENAME, TRUNCATE   Data Definition Language (DDL)
GRANT, REVOKE                           Data Control Language (DCL)
COMMIT, ROLLBACK                        Transaction Control
SQL SELECT Overview
The most basic part

SELECT
[DISTINCT | ALL] <column-list>
FROM <table-names>
[WHERE <condition>]
[GROUP BY <column-list>]
[HAVING <condition>]
[ORDER BY <column-list>]
• ([] - optional, | - or)
Example Tables

Student
ID    First  Last
S103  John   Smith
S104  Mary   Jones
S105  Jane   Brown
S106  Mark   Jones
S107  John   Brown

Grade
ID    Code  Mark
S103  DBS   72
S103  IAI   58
S104  PR1   68
S104  IAI   65
S106  PR2   43
S107  PR1   76
S107  PR2   60
S107  IAI   35

Course
Code  Title
DBS   Database Systems
PR1   Programming 1
PR2   Programming 2
IAI   Intro to AI
DISTINCT and ALL

• Sometimes you end up with duplicate entries
• Using DISTINCT removes duplicates
• Using ALL retains them - this is the default

SELECT ALL Last
FROM Student

Last
Smith
Jones
Brown
Jones
Brown

SELECT DISTINCT Last
FROM Student

Last
Smith
Jones
Brown
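
The difference between ALL and DISTINCT can be checked with a small sketch. It assumes Python's built-in sqlite3 module and an in-memory database (neither is part of the slides); the Student rows are the ones from the example tables. Row order is not guaranteed without ORDER BY, so the results are sorted before printing.

```python
import sqlite3

# In-memory SQLite database (an assumption; the slides do not name a DBMS).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (ID TEXT, First TEXT, Last TEXT);
INSERT INTO Student VALUES
  ('S103','John','Smith'), ('S104','Mary','Jones'), ('S105','Jane','Brown'),
  ('S106','Mark','Jones'), ('S107','John','Brown');
""")

# ALL (the default) keeps duplicate surnames; DISTINCT removes them.
all_last = sorted(r[0] for r in conn.execute("SELECT ALL Last FROM Student"))
distinct_last = sorted(r[0] for r in conn.execute("SELECT DISTINCT Last FROM Student"))

print(all_last)       # five surnames, duplicates kept
print(distinct_last)  # three surnames
```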
WHERE Clauses

• Usually you don't want all the rows
– A WHERE clause restricts the rows that are returned
– It takes the form of a condition - only those rows that satisfy the condition are returned
• Example conditions:
– Mark < 40
– First = 'John'
– First <> 'John'
– First = Last
– (First = 'John') AND (Last = 'Smith')
– (Mark < 40) OR (Mark > 70)
WHERE Examples

SELECT * FROM Grade
WHERE Mark >= 60

ID    Code  Mark
S103  DBS   72
S104  PR1   68
S104  IAI   65
S107  PR1   76
S107  PR2   60

SELECT DISTINCT ID
FROM Grade
WHERE Mark >= 60

ID
S103
S104
S107
WHERE Example

• Given the table

Grade
ID    Code  Mark
S103  DBS   72
S103  IAI   58
S104  PR1   68
S104  IAI   65
S106  PR2   43
S107  PR1   76
S107  PR2   60
S107  IAI   35

• Attempt this on your own before going to the next slide
• Write an SQL query to find a list of the ID numbers and marks in IAI of students who have passed (scored 40 or higher) IAI

ID    Mark
S103  58
S104  65
One Solution – you can do SQL queries in different ways

SELECT ID, Mark FROM Grade
WHERE Code = 'IAI' AND
Mark >= 40

• ID, Mark: we only want the ID and Mark, not the Code
• 'IAI': single quotes around the string
• Code = 'IAI': we're only interested in IAI
• Mark >= 40: we're looking for entries with pass marks
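
To try the solution query, here is a minimal sketch that assumes Python's built-in sqlite3 module and an in-memory database (not something the slides prescribe). The Grade rows are copied from the example table; the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite database (an assumption; any SQL DBMS would do).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Grade (ID TEXT, Code TEXT, Mark INTEGER);
INSERT INTO Grade VALUES
  ('S103','DBS',72), ('S103','IAI',58), ('S104','PR1',68), ('S104','IAI',65),
  ('S106','PR2',43), ('S107','PR1',76), ('S107','PR2',60), ('S107','IAI',35);
""")

# Keep only IAI rows with a pass mark, and project just ID and Mark.
passed_iai = conn.execute(
    "SELECT ID, Mark FROM Grade WHERE Code = 'IAI' AND Mark >= 40 ORDER BY ID"
).fetchall()

print(passed_iai)  # [('S103', 58), ('S104', 65)]
```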


SELECT from Multiple Tables

• Often you need to combine information from two or more tables
• You can get the effect of a product by using
  SELECT * FROM Table1, Table2...
• If the tables have columns with the same name ambiguity results
• You resolve this by referencing columns with the table name
  TableName.Column
SELECT from Multiple Tables

SELECT First, Last, Mark
FROM Student, Grade
WHERE
(Student.ID = Grade.ID) AND
(Mark >= 40)

Student
ID    First  Last
S103  John   Smith
S104  Mary   Jones
S105  Jane   Brown
S106  Mark   Jones
S107  John   Brown

Grade
ID    Code  Mark
S103  DBS   72
S103  IAI   58
S104  PR1   68
S104  IAI   65
S106  PR2   43
S107  PR1   76
S107  PR2   60
S107  IAI   35
SELECT from Multiple Tables

SELECT ... FROM Student, Grade WHERE...

• All of the entries from the Grade table are matched with the first entry from the Student table...
• And then with the second… and so on

ID    First  Last   ID    Code  Mark
S103  John   Smith  S103  DBS   72
S103  John   Smith  S103  IAI   58
S103  John   Smith  S104  PR1   68
S103  John   Smith  S104  IAI   65
S103  John   Smith  S106  PR2   43
S103  John   Smith  S107  PR1   76
S103  John   Smith  S107  PR2   60
S103  John   Smith  S107  IAI   35
S104  Mary   Jones  S103  DBS   72
S104  Mary   Jones  S103  IAI   58
S104  Mary   Jones  S104  PR1   68
S104  Mary   Jones  S104  IAI   65
S104  Mary   Jones  S106  PR2   43
SELECT from Multiple Tables

SELECT ... FROM Student, Grade


WHERE (Student.ID = Grade.ID) AND ...

ID First Last ID Code Mark


S103 John Smith S103 DBS 72
S103 John Smith S103 IAI 58
S104 Mary Jones S104 PR1 68
S104 Mary Jones S104 IAI 65
S106 Mark Jones S106 PR2 43
S107 John Brown S107 PR1 76
S107 John Brown S107 PR2 60
S107 John Brown S107 IAI 35

Student.ID Grade.ID
SELECT from Multiple Tables

SELECT ... FROM Student, Grade


WHERE (Student.ID = Grade.ID) AND (Mark >= 40)

ID First Last ID Code Mark


S103 John Smith S103 DBS 72
S103 John Smith S103 IAI 58
S104 Mary Jones S104 PR1 68
S104 Mary Jones S104 IAI 65
S106 Mark Jones S106 PR2 43
S107 John Brown S107 PR1 76
S107 John Brown S107 PR2 60
SELECT from Multiple Tables

SELECT First, Last, Mark FROM Student, Grade


WHERE (Student.ID = Grade.ID) AND (Mark >= 40)

First Last Mark


John Smith 72
John Smith 58
Mary Jones 68
Mary Jones 65
Mark Jones 43
John Brown 76
John Brown 60
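
The three stages walked through above (full product, matching IDs, pass marks only) can be reproduced with a short sketch. It assumes Python's built-in sqlite3 module with an in-memory database; the rows are those of the Student and Grade example tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed DBMS: SQLite, in memory
conn.executescript("""
CREATE TABLE Student (ID TEXT, First TEXT, Last TEXT);
INSERT INTO Student VALUES
  ('S103','John','Smith'), ('S104','Mary','Jones'), ('S105','Jane','Brown'),
  ('S106','Mark','Jones'), ('S107','John','Brown');
CREATE TABLE Grade (ID TEXT, Code TEXT, Mark INTEGER);
INSERT INTO Grade VALUES
  ('S103','DBS',72), ('S103','IAI',58), ('S104','PR1',68), ('S104','IAI',65),
  ('S106','PR2',43), ('S107','PR1',76), ('S107','PR2',60), ('S107','IAI',35);
""")

# Stage 1: every Student row paired with every Grade row (5 x 8 pairs).
product = conn.execute("SELECT * FROM Student, Grade").fetchall()

# Stage 2: keep only the pairs that describe the same student.
matched = conn.execute(
    "SELECT * FROM Student, Grade WHERE Student.ID = Grade.ID"
).fetchall()

# Stage 3: add the pass-mark condition and project three columns.
passed = conn.execute(
    "SELECT First, Last, Mark FROM Student, Grade "
    "WHERE (Student.ID = Grade.ID) AND (Mark >= 40)"
).fetchall()

print(len(product), len(matched), len(passed))  # 40 8 7
```

Student S105 has no grades, so she drops out at stage 2; the failed IAI mark of 35 drops out at stage 3.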
SELECT from Multiple Tables

• When selecting from multiple tables you almost always use a WHERE clause to find entries with common values

SELECT * FROM
Student, Grade, Course
WHERE
Student.ID = Grade.ID
AND
Course.Code = Grade.Code
SELECT from Multiple Tables

Student Grade Course

ID First Last ID Code Mark Code Title


S103 John Smith S103 DBS 72 DBS Database Systems
S103 John Smith S103 IAI 58 IAI Intro to AI
S104 Mary Jones S104 PR1 68 PR1 Programming 1
S104 Mary Jones S104 IAI 65 IAI Intro to AI
S106 Mark Jones S106 PR2 43 PR2 Programming 2
S107 John Brown S107 PR1 76 PR1 Programming 1
S107 John Brown S107 PR2 60 PR2 Programming 2
S107 John Brown S107 IAI 35 IAI Intro to AI

Student.ID = Grade.ID Course.Code = Grade.Code


SQL Join Operators
• Join operation merges rows from two tables
and returns the rows with one of the following:
– Have common values in common columns
• Natural join
– Meet a given join condition
• Equality or inequality
– Have common values in common columns or
have no matching values
• Outer join
• Inner join: only returns rows meeting criteria



JOINs

• JOINs can be used to combine tables
– There are many types of JOIN
• CROSS JOIN
• INNER JOIN
• NATURAL JOIN
• OUTER JOIN
– OUTER JOINs are linked with NULLs - more later

A CROSS JOIN B
– returns all pairs of rows from A and B
A NATURAL JOIN B
– returns pairs of rows with common values for identically named columns and without duplicating columns
A INNER JOIN B
– returns pairs of rows satisfying a condition
Cross Join
• Performs relational product of two tables
– Also called Cartesian product
• Syntax:
SELECT column-list FROM table1 CROSS JOIN
table2
• Perform a cross join that yields specified
attributes



CROSS JOIN

SELECT * FROM
Student CROSS JOIN
Enrolment

Student
ID   Name
123  John
124  Mary
125  Mark
126  Jane

Enrolment
ID   Code
123  DBS
124  PRG
124  DBS
126  PRG

Result
ID   Name  ID   Code
123  John  123  DBS
124  Mary  123  DBS
125  Mark  123  DBS
126  Jane  123  DBS
123  John  124  PRG
124  Mary  124  PRG
125  Mark  124  PRG
126  Jane  124  PRG
123  John  124  DBS
124  Mary  124  DBS
Natural Join

• Returns all rows with matching values in the matching columns
– Eliminates duplicate columns
• Used when tables share one or more common
attributes with common names
• Syntax:
SELECT column-list FROM table1 NATURAL
JOIN table2



NATURAL JOIN

SELECT * FROM
Student NATURAL JOIN
Enrolment

Student
ID   Name
123  John
124  Mary
125  Mark
126  Jane

Enrolment
ID   Code
123  DBS
124  PRG
124  DBS
126  PRG

Result
ID   Name  Code
123  John  DBS
124  Mary  PRG
124  Mary  DBS
126  Jane  PRG
CROSS and NATURAL JOIN

SELECT * FROM
A CROSS JOIN B

• is the same as

SELECT * FROM A, B

SELECT * FROM
A NATURAL JOIN B

• is the same as

SELECT A.col1,… A.coln,
[and all other columns
apart from B.col1,…B.coln]
FROM A, B
WHERE A.col1 = B.col1
AND A.col2 = B.col2
...AND A.coln = B.coln

(this assumes that col1… coln in A and B have common names)
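
Both equivalences can be checked on the Student and Enrolment tables from the join examples. This is a sketch under the assumption that Python's built-in sqlite3 module (in-memory database) is an acceptable stand-in for whichever DBMS the course uses; results are sorted before comparison because row order is not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed DBMS: SQLite, in memory
conn.executescript("""
CREATE TABLE Student (ID INTEGER, Name TEXT);
INSERT INTO Student VALUES (123,'John'), (124,'Mary'), (125,'Mark'), (126,'Jane');
CREATE TABLE Enrolment (ID INTEGER, Code TEXT);
INSERT INTO Enrolment VALUES (123,'DBS'), (124,'PRG'), (124,'DBS'), (126,'PRG');
""")

def run(sql):
    """Run a query and return its rows in a comparable (sorted) form."""
    return sorted(conn.execute(sql).fetchall())

# CROSS JOIN versus the comma-separated FROM list.
cross = run("SELECT * FROM Student CROSS JOIN Enrolment")
comma = run("SELECT * FROM Student, Enrolment")

# NATURAL JOIN versus an explicit WHERE on the shared column ID,
# listing ID only once in the select list.
natural = run("SELECT * FROM Student NATURAL JOIN Enrolment")
where = run(
    "SELECT Student.ID, Name, Code FROM Student, Enrolment "
    "WHERE Student.ID = Enrolment.ID"
)

print(cross == comma, len(cross))  # True 16
print(natural == where, natural)
```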
INNER JOIN

• INNER JOINs specify a condition which the pairs of rows satisfy

SELECT * FROM
A INNER JOIN B
ON <condition>

• Can also use

SELECT * FROM
A INNER JOIN B
USING
(col1, col2,…)

• Chooses rows where the given columns are equal
JOIN USING Clause

• Returns only rows with matching values in the column indicated in the USING clause
• Syntax:
SELECT column-list FROM table1 JOIN table2
USING (common-column)
• JOIN USING operand does not require table
qualifiers
– Oracle returns error if table name is specified



INNER JOIN

SELECT * FROM
Student INNER JOIN
Enrolment USING (ID)

Student
ID   Name
123  John
124  Mary
125  Mark
126  Jane

Enrolment
ID   Code
123  DBS
124  PRG
124  DBS
126  PRG

Result (the USING column ID appears only once)
ID   Name  Code
123  John  DBS
124  Mary  PRG
124  Mary  DBS
126  Jane  PRG
JOIN ON Clause

• Used when tables have no common attributes
• Returns only rows that meet the join condition
– Typically includes equality comparison expression of two columns
• Syntax:
SELECT column-list FROM table1 JOIN table2 ON join-condition



INNER JOIN
SELECT * FROM
Buyer INNER JOIN
Property ON
Price <= Budget

Buyer
Name   Budget
Smith  100,000
Jones  150,000
Green  80,000

Property
Address      Price
15 High St   85,000
12 Queen St  125,000
87 Oak Row   175,000

Result
Name   Budget   Address      Price
Smith  100,000  15 High St   85,000
Jones  150,000  15 High St   85,000
Jones  150,000  12 Queen St  125,000
INNER JOIN

SELECT * FROM
A INNER JOIN B
ON <condition>

• is the same as

SELECT * FROM A, B
WHERE <condition>

SELECT * FROM
A INNER JOIN B
USING(col1, col2,...)

• is the same as

SELECT * FROM A, B
WHERE A.col1 = B.col1
AND A.col2 = B.col2
AND ...

(except that with USING each of col1, col2,... appears only once in the result)
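
The ON form and its WHERE rewrite can be compared on the Buyer and Property example. The sketch assumes Python's built-in sqlite3 module with an in-memory database, and stores the budgets and prices as plain integers (100,000 becomes 100000).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed DBMS: SQLite, in memory
conn.executescript("""
CREATE TABLE Buyer (Name TEXT, Budget INTEGER);
INSERT INTO Buyer VALUES ('Smith',100000), ('Jones',150000), ('Green',80000);
CREATE TABLE Property (Address TEXT, Price INTEGER);
INSERT INTO Property VALUES
  ('15 High St',85000), ('12 Queen St',125000), ('87 Oak Row',175000);
""")

# Join on an inequality: each buyer with every property they can afford.
with_on = sorted(conn.execute(
    "SELECT * FROM Buyer INNER JOIN Property ON Price <= Budget"
).fetchall())

# The same condition written as a WHERE clause over the product.
with_where = sorted(conn.execute(
    "SELECT * FROM Buyer, Property WHERE Price <= Budget"
).fetchall())

for row in with_on:
    print(row)
print(with_on == with_where)  # True
```

Green appears in neither result because no property costs 80,000 or less.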
JOINs vs WHERE Clauses

• JOINs (so far) are not needed
– You can have the same effect by selecting from multiple tables with an appropriate WHERE clause
– So should you use JOINs or not?
• Yes, because
– They often lead to concise queries
– NATURAL JOINs are very common
• No, because
– Support for JOINs varies a fair bit among SQL dialects
Writing Queries
• When writing queries
– There are often many ways to write the query
– You should worry about being correct, clear, and concise in that order
– Don't worry about being clever or efficient
• Most DBMSs have query optimisers
– These take a user's query and figure out how to efficiently execute it
– A simple query is easier to optimise
– We'll look at some ways to improve efficiency later
Outer Joins
• Returns rows matching the join condition
• Also returns rows with unmatched attribute
values for tables to be joined
• Three types
– Left
– Right
– Full
• Left and right designate order in which tables
are processed



Outer Joins (cont’d.)
• Left outer join
– Returns rows matching the join condition
– Returns rows in left side table with unmatched
values
– Syntax: SELECT column-list FROM table1 LEFT
[OUTER] JOIN table2 ON join-condition
• Right outer join
– Returns rows matching join condition
– Returns rows in right side table with unmatched
values



Outer Joins (cont’d.)

• Full outer join
– Returns rows matching join condition
– Returns all rows with unmatched values in either side table
– Syntax:
SELECT column-list
FROM table1 FULL [OUTER] JOIN table2
ON join-condition
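
A left outer join can be sketched on the Student and Enrolment tables from the earlier join examples. The sketch assumes Python's built-in sqlite3 module with an in-memory database. Only LEFT OUTER JOIN is run here, because older SQLite releases do not accept RIGHT or FULL OUTER JOIN; on a DBMS that does, the same pattern applies with the keyword swapped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed DBMS: SQLite, in memory
conn.executescript("""
CREATE TABLE Student (ID INTEGER, Name TEXT);
INSERT INTO Student VALUES (123,'John'), (124,'Mary'), (125,'Mark'), (126,'Jane');
CREATE TABLE Enrolment (ID INTEGER, Code TEXT);
INSERT INTO Enrolment VALUES (123,'DBS'), (124,'PRG'), (124,'DBS'), (126,'PRG');
""")

# Every Student row is kept; where no Enrolment row matches,
# the Enrolment columns are filled with NULL (None in Python).
left_rows = conn.execute(
    "SELECT Student.ID, Name, Code "
    "FROM Student LEFT OUTER JOIN Enrolment ON Student.ID = Enrolment.ID "
    "ORDER BY Student.ID"
).fetchall()

for row in left_rows:
    print(row)
```

Student 125 (Mark) has no enrolment, so an inner join would drop him, while the left outer join returns him once with a NULL Code.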

Subqueries and Correlated Queries
• Often necessary to process data based on
other processed data
• Subquery is a query inside a query, normally
inside parentheses
• First query is the outer query
– Inside query is the inner query
• Inner query is executed first
• Output of inner query is used as input for outer
query
• Sometimes referred to as a nested query
WHERE Subqueries
• Most common type uses inner SELECT
subquery on right side of WHERE comparison
– Requires a subquery that returns only one single
value
• Value generated by subquery must be of
comparable data type
• Can be used in combination with joins



Subqueries

• A SELECT statement can be nested inside another query to form a subquery
• The results of the subquery are passed back to the containing query

E.g. get the names of people who are in Andy's department:

SELECT Name
FROM Employee
WHERE Dept =
(SELECT Dept
FROM Employee
WHERE Name = 'Andy')
Subqueries

SELECT Name
FROM Employee
WHERE Dept =
(SELECT Dept
FROM Employee
WHERE Name = 'Andy')

• First the subquery is evaluated, returning the value 'Marketing'
• This result is passed to the main query

SELECT Name
FROM Employee
WHERE Dept =
'Marketing'
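
The two-step evaluation can be tried out with a small sketch. It assumes Python's built-in sqlite3 module with an in-memory database. The slides do not show the rows of this Employee table, so the three rows below (including Beth and Carl) are invented purely for illustration; only Andy being in Marketing comes from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed DBMS: SQLite, in memory
conn.executescript("""
CREATE TABLE Employee (Name TEXT, Dept TEXT);
-- Hypothetical rows: only Andy/Marketing is taken from the slide.
INSERT INTO Employee VALUES
  ('Andy','Marketing'), ('Beth','Marketing'), ('Carl','Sales');
""")

# Step 1: the inner query on its own yields a single value.
andys_dept = conn.execute(
    "SELECT Dept FROM Employee WHERE Name = 'Andy'"
).fetchone()[0]

# Step 2: the full nested query uses that value in the outer WHERE.
colleagues = conn.execute(
    "SELECT Name FROM Employee "
    "WHERE Dept = (SELECT Dept FROM Employee WHERE Name = 'Andy') "
    "ORDER BY Name"
).fetchall()

print(andys_dept)  # Marketing
print(colleagues)  # [('Andy',), ('Beth',)]
```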
Subqueries

• Often a subquery will return a set of values rather than a single value
• You can't directly compare a single value to a set
• Options
– IN - checks to see if a value is in the set
– EXISTS - checks to see if the set is empty or not
– ALL/ANY - checks to see if a relationship holds for every/one member of the set
(NOT) IN

• Using IN we can see if a given value is in a set of values
• NOT IN checks to see if a given value is not in the set
• The set can be given explicitly or from a subquery

SELECT <columns>
FROM <tables>
WHERE <value>
IN <set>

SELECT <columns>
FROM <tables>
WHERE <value>
NOT IN <set>
(NOT) IN

SELECT *
FROM Employee
WHERE Department IN
('Marketing',
'Sales')

Employee
Name   Department  Manager
John   Marketing   Chris
Mary   Marketing   Chris
Chris  Marketing   Jane
Peter  Sales       Jane
Jane   Management

Result
Name   Department  Manager
John   Marketing   Chris
Mary   Marketing   Chris
Chris  Marketing   Jane
Peter  Sales       Jane
(NOT) IN

SELECT *
FROM Employee
WHERE Name NOT IN
(SELECT Manager
FROM Employee)

Employee
Name   Department  Manager
John   Marketing   Chris
Mary   Marketing   Chris
Chris  Marketing   Jane
Peter  Sales       Jane
Jane   Management
(NOT) IN

• First the subquery

SELECT Manager
FROM Employee

• is evaluated giving

Manager
Chris
Chris
Jane
Jane

• This gives

SELECT *
FROM Employee
WHERE Name NOT
IN ('Chris',
'Jane')

Name   Department  Manager
John   Marketing   Chris
Mary   Marketing   Chris
Peter  Sales       Jane
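
One practical caveat is worth testing. The slide leaves Jane's manager blank. If that blank is stored as NULL (an assumption; it could also be an empty string), then NOT IN over the raw subquery returns no rows at all, because a comparison with NULL is neither true nor false. The sketch below, which assumes Python's built-in sqlite3 module and an in-memory database, shows the effect and one common way around it: filtering NULLs out of the subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed DBMS: SQLite, in memory
conn.executescript("""
CREATE TABLE Employee (Name TEXT, Department TEXT, Manager TEXT);
INSERT INTO Employee VALUES
  ('John','Marketing','Chris'), ('Mary','Marketing','Chris'),
  ('Chris','Marketing','Jane'), ('Peter','Sales','Jane'),
  ('Jane','Management',NULL);  -- assumption: the blank manager is NULL
""")

# With a NULL in the subquery result, NOT IN never evaluates to true.
naive = conn.execute(
    "SELECT Name FROM Employee "
    "WHERE Name NOT IN (SELECT Manager FROM Employee)"
).fetchall()

# Excluding NULLs gives the set ('Chris', 'Jane') used on the slide.
non_managers = conn.execute(
    "SELECT Name FROM Employee "
    "WHERE Name NOT IN (SELECT Manager FROM Employee WHERE Manager IS NOT NULL) "
    "ORDER BY Name"
).fetchall()

print(naive)         # []
print(non_managers)  # [('John',), ('Mary',), ('Peter',)]
```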
IN Subqueries
• Used when comparing a single attribute to a list
of values



HAVING Subqueries
• HAVING clause restricts the output of a
GROUP BY query
– Applies conditional criterion to the grouped rows



Multirow Subquery Operators:
ANY and ALL
• Allows comparison of single value with a list of
values using inequality comparison
• “Greater than ALL” equivalent to “greater than
the highest in list”
• “Less than ALL” equivalent to “less than lowest”
• Using equal to ANY operator equivalent to IN
operator



FROM Subqueries

• Specifies the tables from which the data will be drawn
• Can use SELECT subquery in the FROM
clause
– View name can be used anywhere a table is
expected
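
As an illustration of a subquery in the FROM clause, the sketch below computes each student's average mark in an inner query and then filters that derived table in the outer query. It assumes Python's built-in sqlite3 module with an in-memory database and reuses the Grade rows from the example tables; the names per_student and avg_mark, and the threshold of 60, are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed DBMS: SQLite, in memory
conn.executescript("""
CREATE TABLE Grade (ID TEXT, Code TEXT, Mark INTEGER);
INSERT INTO Grade VALUES
  ('S103','DBS',72), ('S103','IAI',58), ('S104','PR1',68), ('S104','IAI',65),
  ('S106','PR2',43), ('S107','PR1',76), ('S107','PR2',60), ('S107','IAI',35);
""")

# The inner SELECT acts as a temporary table named per_student;
# the outer query treats it like any other table.
high_averages = conn.execute(
    "SELECT ID, avg_mark "
    "FROM (SELECT ID, AVG(Mark) AS avg_mark FROM Grade GROUP BY ID) AS per_student "
    "WHERE avg_mark >= 60 "
    "ORDER BY ID"
).fetchall()

print(high_averages)  # [('S103', 65.0), ('S104', 66.5)]
```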



Attribute List Subqueries

• SELECT statement uses attribute list to indicate columns to project resulting set
– Columns can be attributes of base tables
– Result of aggregate function
• Attribute list can also include subquery
expression: inline subquery
– Must return one single value
• Cannot use an alias in the attribute list



Correlated Subqueries

• Subquery that executes once for each row in the outer query
• Correlated because inner query is related to the outer query
– Inner query references column of outer query
• Can also be used with the EXISTS special
operator
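
A correlated subquery with EXISTS can be sketched on the Student and Grade example tables. The sketch assumes Python's built-in sqlite3 module with an in-memory database; the aliases s and g are introduced here for readability. The inner query mentions s.ID, a column of the outer query, so it is conceptually re-evaluated for each Student row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed DBMS: SQLite, in memory
conn.executescript("""
CREATE TABLE Student (ID TEXT, First TEXT, Last TEXT);
INSERT INTO Student VALUES
  ('S103','John','Smith'), ('S104','Mary','Jones'), ('S105','Jane','Brown'),
  ('S106','Mark','Jones'), ('S107','John','Brown');
CREATE TABLE Grade (ID TEXT, Code TEXT, Mark INTEGER);
INSERT INTO Grade VALUES
  ('S103','DBS',72), ('S103','IAI',58), ('S104','PR1',68), ('S104','IAI',65),
  ('S106','PR2',43), ('S107','PR1',76), ('S107','PR2',60), ('S107','IAI',35);
""")

# Students with at least one mark below 40.
failed_something = conn.execute(
    "SELECT First, Last FROM Student AS s "
    "WHERE EXISTS (SELECT 1 FROM Grade AS g WHERE g.ID = s.ID AND g.Mark < 40) "
    "ORDER BY s.ID"
).fetchall()

# Students with no Grade rows at all.
no_grades = conn.execute(
    "SELECT First, Last FROM Student AS s "
    "WHERE NOT EXISTS (SELECT 1 FROM Grade AS g WHERE g.ID = s.ID) "
    "ORDER BY s.ID"
).fetchall()

print(failed_something)  # [('John', 'Brown')]
print(no_grades)         # [('Jane', 'Brown')]
```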



Summary
• Operations that join tables are classified as
inner joins and outer joins
• Natural join returns all rows with matching
values in the matching columns
– Eliminates duplicate columns
• Subqueries and correlated queries process
data based on other processed data
• Most subqueries are executed in serial fashion
