You are on page 1of 15

Query Languages for

Week 3 Relational Algebra Relational Databases


Operations on databases:
Queries read data from the database;
Querying and Updating a Database
Updates change the content of the database.
The Relational Algebra In this lecture unit we discuss the relational algebra,
Union, Intersection, Difference a procedural language that defines database
Renaming, Selection and Projection operations in terms of algebraic expressions.
[The Relational Calculus is a declarative language
Join, Cartesian Product for database operations based on Predicate Logic;
we will not discuss it here.]

CSC343 Introduction to Databases University of Toronto Relational Algebra 1 CSC343 Introduction to Databases University of Toronto Relational Algebra 2

Relational Algebra Union, Intersection, Difference


A collection of algebraic operators that
Are defined on relations;
Relations are sets, so we can apply set-theoretic
Produce relations as results, operators
and therefore can be combined to form However, we want the results to be relations
complex algebraic expressions. (that is, homogeneous sets of tuples)
Operators: It is therefore meaningful to only apply union,
Union, intersection, difference; intersection, difference to pairs of relations
Renaming; defined over the same attributes.
Selection and Projection;
Join (natural join, Cartesian product, theta
join).
CSC343 Introduction to Databases University of Toronto Relational Algebra 3 CSC343 Introduction to Databases University of Toronto Relational Algebra 4

1
Union Intersection

Graduates Graduates
Number Surname Age Number Surname Age
7274 Robinson 37 7274 Robinson 37
7432 O'Malley 39 Graduates Managers 7432 O'Malley 39
9824 Darkes 38 Number Surname Age 9824 Darkes 38 Graduates Managers
7274 Robinson 37 Number Surname Age
7432 O'Malley 39 7432 O'Malley 39
Managers 9824 Darkes 38 Managers 9824 Darkes 38
Number Surname Age 9297 O'Malley 56 Number Surname Age
9297 O'Malley 56 9297 O'Malley 56
7432 O'Malley 39 7432 O'Malley 39
9824 Darkes 38 9824 Darkes 38

CSC343 Introduction to Databases University of Toronto Relational Algebra 5 CSC343 Introduction to Databases University of Toronto Relational Algebra 6

Difference A Meaningful but Impossible Union

Graduates
Paternity Maternity
Number Surname Age Father Child Mother Child
7274 Robinson 37 Adam Cain Eve Cain
7432 O'Malley 39 Adam Abel Eve Seth
9824 Darkes 38 Graduates - Managers Abraham Isaac Sarah Isaac
Number Surname Age Abraham Ishmael Hagar Ishmael
Managers 7274 Robinson 37
Number Surname Age
9297 O'Malley 56 Paternity Maternity ???
7432 O'Malley 39
9824 Darkes 38
The problem: Father and Mother are different
names, but both represent a parent.
The solution: rename attributes!
CSC343 Introduction to Databases University of Toronto Relational Algebra 7 CSC343 Introduction to Databases University of Toronto Relational Algebra 8

2
Renaming
Example of Renaming
This is a unary operator which changes attribute
Paternity Father> Parent(Paternity)
names for a relation without changing any Father Child Parent Child
values. Adam Cain Adam Cain
Adam Abel Adam Abel
Renaming removes the limitations associated Abraham Isaac Abraham Isaac
with set operators. Abraham Ishmael Abraham Ishmael

Notation: OldNameNewName(r)
For example, FatherParent(Paternity) The textbook allows positions rather than attribute
names, e.g., 1 Parent
If there are two or more attributes involved in a
renaming operation, then ordering is meaningful: Textbook also allows renaming of the relation
itself,e.g.,Paternity,1 Parenthood,Parent
e.g., Branch,Salary Location,Pay(Employees)
CSC343 Introduction to Databases University of Toronto Relational Algebra 9 CSC343 Introduction to Databases University of Toronto Relational Algebra 10

Renaming and Union,


Renaming and Union with Several Attributes
Paternity Maternity
Father Child Mother Child
Employees
Adam Cain Eve Cain
Eve Seth Surname Branch Salary
Adam Abel
Sarah Isaac Patterson Rome 45 Staff
Abraham Isaac
Hagar Ishmael Trumble London 53 Surname Factory Wages
Abraham Ishmael
Patterson Rome 45
Trumble London 53
Father->Parent(Paternity) Mother->Parent(Maternity)
Parent Child
Adam Cain
Adam Abel Branch,SalaryLocation,Pay(Employees) Factory, WagesLocation,Pay(Staff)
Abraham Isaac Surname Location Pay
Abraham Ishmael Patterson Rome 45
Eve Cain Trumble London 53
Eve Seth Cooke Chicago 33
Sarah Isaac Bush Monza 32
Hagar Ishmael

CSC343 Introduction to Databases University of Toronto Relational Algebra 11 CSC343 Introduction to Databases University of Toronto Relational Algebra 12

3
Selection and Projection Selection
These are unary operators, in a sense
orthogonal:
This is a unary operation which returns a relation
selection for "horizontal" decompositions;
with the same schema as the operand;
projection for "vertical" decompositions.
but, with a subset of the tuples of the operand,
A B C A B C
Selection i.e., only those that satisfy a condition.
Notation: F(r)
Semantics: F(r) = { t | t r s.t. t satisfies F, I.e., F(t)}
A B C A B
Projection

CSC343 Introduction to Databases University of Toronto Relational Algebra 13 CSC343 Introduction to Databases University of Toronto Relational Algebra 14

Selection Example Selection, Another Example

We can use Employees


the or Surname FirstName Age Salary Citizens
symbol (v), Smith Mary 25 2000 Surname FirstName PlaceOfBirth Residence
Black Lucy 40 3000 Smith Mary Rome Milan
Not symbol Verdi Nico 36 4500 Black Lucy Rome Rome
() & and Smith Mark 40 3900 Verdi Nico Florence Florence
symbol () Smith Mark Naples Florence
as we would
for Age<30 v Salary>4000 (Employees) PlaceOfBirth=Residence (Citizens)
specifying Surname FirstName Age Salary Surname FirstName PlaceOfBirth Residence
any logical Smith Mary 25 2000 Black Lucy Rome Rome
Verdi Nico 36 4500 Verdi Nico Florence Florence
condition

CSC343 Introduction to Databases University of Toronto Relational Algebra 15 CSC343 Introduction to Databases University of Toronto Relational Algebra 16

4
Projection Example of Projection

Projection returns a relation which includes a subset Employees


of the attributes of the operand. Surname FirstName Department Head
Smith Mary Sales De Rossi
Notation: Given a relation r(X) and a subset Y of X: Black Lucy Sales De Rossi
Verdi Mary Personnel Fox
Y(r) Smith Mark Personnel Fox

Semantics: Y(r) = { t[Y] | t r }


Surname, FirstName(Employees)
Surname FirstName
Smith Mary
Black Lucy
Verdi Mary
Smith Mark

CSC343 Introduction to Databases University of Toronto Relational Algebra 17 CSC343 Introduction to Databases University of Toronto Relational Algebra 18

Another Example Cardinality of Projection Operations


Note that the result of a projection contains at
most as many tuples as the operand relation.
Employees
Surname FirstName Department Head However, it may contain fewer, if several tuples
Smith Mary Sales De Rossi
Black Lucy Sales De Rossi collapse, i.e., they are identical in all their values.
Verdi Mary Personnel Fox
Smith Mark Personnel Fox Theorem: Y(r) contains as many tuples as r if
and only if Y is a superkey for r.
Department, Head (Employees) This property holds even if Y is "by chance a
Department Head superkey, i.e., it is not defined as a superkey in the
Sales
Personnel
De Rossi
Fox
schema, but it is a superkey for the current
database, see the example.

CSC343 Introduction to Databases University of Toronto Relational Algebra 19 CSC343 Introduction to Databases University of Toronto Relational Algebra 20

5
Tuples that Collapse Tuples that do not Collapse,
"by Chance"
Students Students
RegNum Surname FirstName BirthDate DegreeProg RegNum Surname FirstName BirthDate DegreeProg
284328 Smith Luigi 29/04/59 Computing
296328 Smith John 29/04/59 Computing
296328 Smith John 29/04/59 Computing
587614 Smith Lucy 01/05/61 Engineering
587614 Smith Lucy 01/05/61 Engineering
934856 Black Lucy 01/05/61 Fine Art 934856 Black Lucy 01/05/61 Fine Art
965536 Black Lucy 05/03/58 Fine Art 965536 Black Lucy 05/03/58 Engineering

Surname, DegreeProg (Students) Surname, DegreeProg (Students)


Surname DegreeProg Surname DegreeProg
Smith Computing Smith Computing
Smith Engineering Smith Engineering
Black Fine Art
Black Fine Art
Black Engineering

CSC343 Introduction to Databases University of Toronto Relational Algebra 21 CSC343 Introduction to Databases University of Toronto Relational Algebra 22

Join A Natural Join


The most used operator in the relational algebra.
Allows us to establish connections among data in r1 r2
different relations, taking advantage of the "value- Employee Department Department Head
Smith sales production Mori
based" nature of the relational model. Black production sales Brown
White production
Two main versions of the join:
"natural" join: takes attribute names into account;
r1 r2
"theta" join. Employee Department Head
Both join operations are denoted by the symbol . Smith
Black
sales
production
Brown
Mori
White production Mori

CSC343 Introduction to Databases University of Toronto Relational Algebra 23 CSC343 Introduction to Databases University of Toronto Relational Algebra 24

6
Definition of Natural Join Natural Join: Comments
r1 (X1), r2 (X2) The tuples in the resulting relation are obtained by
r1 r2 (natural join of r1 and r2) is a relation on combining tuples in the operands with equal
X1X2 (the union of the two sets): values on the common attributes
{ t on X1X2 | t [X1] r1 and t [X2] r2 } The common attributes often form a key of one
of the operands (remember: references are
or, equivalently realized by means of foreign keys, and we join in
order to follow references)
Not always! Consider Person(Name,Addr,PostalC) and let
{ t on X1X2 | exist t1 r1 and t2 r2 with t [X1] = t1 us define Neighbour(Name,Addr,Name1,Addr1,PostalC)
and t [X2] = t2 } by joining Person with Name,AddrName1,Addr1(Person);
What is criterion for neighbourhood here?

CSC343 Introduction to Databases University of Toronto Relational Algebra 25 CSC343 Introduction to Databases University of Toronto Relational Algebra 26

Another Example Yet Another Join


Offences Code Date Officer Dept Registartion In this example, join gives very different results
143256 25/10/1992 567 75 5694 FR from union (see earlier example)
987554 26/10/1992 456 75 5694 FR
987557 26/10/1992 456 75 6544 XY
630876 15/10/1992 456 47 6544 XY
539856 12/10/1992 567 47 6544 XY Paternity Maternity
Father Child Mother Child
Registration Dept Owner Adam Cain Eve Cain
Cars Adam Abel Eve Seth
6544 XY 75 Cordon Edouard
7122 HT 75 Cordon Edouard Abraham Isaac Sarah Isaac
5694 FR 75 Latour Hortense Abraham Ishmael Hagar Ishmael
6544 XY 47 Mimault Bernard
Offences Cars Paternity Maternity
Code Date Officer Dept Registration Owner
143256 25/10/1992 567 75 5694 FR Latour Hortense Father Child Mother
987554 26/10/1992 456 75 5694 FR Latour Hortense Adam Cain Eve
987557 26/10/1992 456 75 6544 XY Cordon Edouard Abraham Isaac Sarah
630876 15/10/1992 456 47 6544 XY Cordon Edouard Abraham Ishmael Hagar
539856 12/10/1992 567 47 6544 XY Mimault Bernard
CSC343 Introduction to Databases University of Toronto Relational Algebra 27 CSC343 Introduction to Databases University of Toronto Relational Algebra 28

7
Joins can be Incomplete Joins can be Empty
If a tuple does not have a "counterpart" in the As an extreme, we might have that no tuple has
other relation, then it does not contribute to the a counterpart, and all tuples are dangling
join ("dangling" tuple)
r2
r1
Department Head
r1 r2
production Mori
Employee Department Department Head
Employee Department purchasing Brown
Smith sales marketing Mori
Smith sales purchasing Brown
Black production Black production
White production White production

r1 r2 r1 r2
Employee Department Head Employee Department Head
Black production Mori
White production Mori

CSC343 Introduction to Databases University of Toronto Relational Algebra 29 CSC343 Introduction to Databases University of Toronto Relational Algebra 30

Another Extreme How Many Tuples in a Join?


If each tuple of each operand can be combined Given r1 (X1), r2 (X2) the join has cardinality
with all the tuples of the other, then the join has a 0 | r1 r2 | | r1 | | r2|
cardinality that is the product of the cardinalities
where | r | is the cardinality of relation r.
of the operands
Moreover:
r1
Employee Project if the join is complete, then its cardinality is at
Smith A r1 r2 least the maximum of | r1 | and | r2|.
Black A
White A
Employee
Smith
Project
A
Head
Mori
if X1X2 contains a key for r2,
r2 Black A Brown then | r1 r2 | | r1|
White A Mori
Project
A
Head
Mori
Smith A Brown if X1X2 is the primary key for r2, and there is a
Black A Mori
A Brown
White A Brown
referential constraint between X1X2 in r1 and
such a key, then | r1 r2 | = | r1|.
CSC343 Introduction to Databases University of Toronto Relational Algebra 31 CSC343 Introduction to Databases University of Toronto Relational Algebra 32

8
Outer Join Outer Join Operations
r1 r2
A variant of the join, to keep all pieces of Employee Department Department Head
Smith sales
information from the operands. Black production
production
purchasing
Mori
Brown
White production
An outer join operation "pads with nulls" the
tuples in one operant relation that have no Employee Department Head
Smith Sales
r1 LEFTr2
NULL
counterpart in the other relation. Black production Mori
White production Mori
Three variants:
Employee Department Head
LEFT only tuples of left operand are
r1 RIGHT r2
Black production Mori
padded; White
NULL
production
purchasing
Mori
Brown
RIGHT only tuples of right operand are
Employee Department Head
padded; Smith Sales NULL
r1 FULL r2 Black production Mori
FULL tuples of both operands are padded. White production Mori
NULL purchasing Brown
CSC343 Introduction to Databases University of Toronto Relational Algebra 33 CSC343 Introduction to Databases University of Toronto Relational Algebra 34

N-ary Join Operations


Example of N-ary Join Operation
The natural join is r1
Employee Department
commutative: r1 r2 = r2 r1 Smith sales r2
Black production
associative: (r1 r2) r3 = r1 (r2 r3) Brown marketing Department Division
White production production A
Therefore, we can write n-ary joins without marketing B
purchasing B
ambiguity: r3
r1 r2 rn Division Head
A Mori
B Brown

r1 r2 r3
Employee Department Division Head
Black production A Mori
Brown marketing B Brown
White production A Mori
CSC343 Introduction to Databases University of Toronto Relational Algebra 35 CSC343 Introduction to Databases University of Toronto Relational Algebra 36

9
Join and Intersection Natural Join as Cartesian Product

We have made no assumptions about the sets The natural join is defined also when the
of attributes X1 and X2 on which the operands operands have no attributes in common.
of a join operation are defined; the two sets In this case no condition is imposed on tuples,
could even be equal or disjoint. and therefore the result contains tuples obtained
If X1 = X2 then r1 r2 = r1 r2 since, by by combining the tuples of the operands in all
definition, the result is a relation which includes possible ways.
tuples t such that t[X1] r1 and t[X2] r2, and
X1 = X2.

CSC343 Introduction to Databases University of Toronto Relational Algebra 37 CSC343 Introduction to Databases University of Toronto Relational Algebra 38

Cartesian Product: Example Theta-Join


In most cases, a Cartesian product is meaningful
Employees Projects
Employee Project
only if followed by a selection:
Code Name
Smith A A Venus theta-join: a derived operator
Black A B Mars
Black B r1 F r2 = F(r1 r2)

Employes Projects
Employee Project Code Name if F is a conjunction of equalities, then we
Smith A A Venus have an equi-join
Black A A Venus
Black B A Venus
Smith A B Mars
Black A B Mars
Black B B Mars

CSC343 Introduction to Databases University of Toronto Relational Algebra 39 CSC343 Introduction to Databases University of Toronto Relational Algebra 40

10
Equi-join: example Division
Employees Consider two relations A(x,y), B(y) and suppose
Employee Project we want to specify the query
Smith A
Black A Projects "Find all A's that are associated with all B's"
Black B Code Name This can be expressed as
A Venus
B Mars A/B = x(A) - x((x(A) B) - A)
This means that division does not extend the
Employes Project=Code Projects expressiveness of Relational Algebra, but it is a
Employee
Smith
Project
A
Code
A
Name
Venus
convenient operation to use in many situations.
Black A A Venus
Black B B Mars

CSC343 Introduction to Databases University of Toronto Relational Algebra 41 CSC343 Introduction to Databases University of Toronto Relational Algebra 42

Example of Division Queries


Assume
Take(x,y) - "student x has taken course y", A query is a function from database instances to
CS(y) - "y is a CS course" relations.
We want "All students who have taken all CS Queries are formulated in relational algebra by
courses" means of expressions over relations.
x(Take) CS -- ? Table of all students, CS
courses .
(x(Take) CS) - Take -- ?? Table of all
students and the CS courses they have not
taken
x((x(Take) CS) - Take) -- ??? All
students who have not taken a CS course
CSC343 Introduction to Databases
University of Toronto Relational Algebra 43 CSC343 Introduction to Databases University of Toronto Relational Algebra 44

11
A Sample Database Example 1
Find the numbers, names and ages of employees
Supervision earning more than 40k.
Head Employee Employees(Number,Name,Age,Salary)
210 101
Employees 210 103 Supervision(Head,Emp)
Number Name Age Salary 210 104
101 Mary Smith 34 40 231 105
103 Mary Bianchi 23 35 301 210 Try it!
Number,Name,Age(Salary>40Employees)
104 Luigi Neri 38 61 301 231
105 Nico Bini 44 38 375 252 Number Name Age
210 Marco Celli 49 60 104 Luigi Neri 38
231 Siro Bisi 50 60 210 Marco Celli 49
252 Nico Bini 44 70 231 Siro Bisi 50
301 Steve Smith 34 70 252 Nico Bini 44
375 Mary Smith 50 65 301 Steve Smith 34
375 Mary Smith 50

CSC343 Introduction to Databases University of Toronto Relational Algebra 45 CSC343 Introduction to Databases University of Toronto Relational Algebra 46

Example 2 Example 3
Find the registration numbers of the supervisors Find the names and salaries of the supervisors
of the employees earning more than 40M. of the employees earning more than 40M.
Employees(Number,Name,Age,Salary) Employees(Number,Name,Age,Salary)
Supervision(Head,Emp) Supervision(Head,Emp)

Try it! Try it! (this


NameH,SalaryH is a bit tougher)
(Num,Name,Sal,AgeNumH,NameH,SalH,AgeHEmployees
Head(Supervision Emp=Number(Salary>40Employees))
NumberH=Head (Supervision Number=Emp(Salary>40Employees))))
Head
210 NameH SalaryH
301 Marco Celli 60
375 Steve Smith 70
Mary Smith 65
R(Head,Emp,Number,Name,Age,Salary)) R(NumH,NameH,AgeH,SalH,Head,Emp,Number,Name,Age,Salary)

CSC343 Introduction to Databases University of Toronto Relational Algebra 47 CSC343 Introduction to Databases University of Toronto Relational Algebra 48

12
Example 4 Example 5
Find the employees earning more than their Find registration numbers and names of
respective supervisors, return registration supervisors, all of whose employees earn more
numbers, names and salaries of the than 40M.
employees and their supervisors. Employees(Number,Name,Age,Salary)
Employees(Number,Name,Age,Salary) Supervision(Head,Emp)
Supervision(Head,Emp) (Employees Number=Head (Head (Supervision) -
Try it!
Number,Name
Number,Name,Salary,NumberH,Nam3H, SalaryH
Try it! Definitely
(Salary>SalaryH challenging
(Num,Name,Sal,AgeNumH,NameH,SalH,AgeH Employees Head (Supervision Number=Emp (Salary40Employees))))
NumberH=Head (Supervision Emp=Number Employees)))
Number Name
Number Name Salary NumberH NameH SalaryH
301 Steve Smith
104 Luigi Neri 61 210 Marco Celli 60
252 Nico Bini 70 375 Mary Smith 65 375 Mary Smith

CSC343 Introduction to Databases University of Toronto Relational Algebra 49 CSC343 Introduction to Databases University of Toronto Relational Algebra 50

Another Series of Examples: Example 2

Films(Film#,Title,Director,Year,ProdCost) Films(Film#,Title,Director,Year,ProdCost)
Artists(Actor#,Surname,FirsName,Sex,Birthday, Artists(Actor#,Surname,FirsName,Sex,Birthday,
Nationality) Nationality)
Roles(Film#,Actor#,Character) Roles(Film#,Actor#,Character)

Find The titles of all films in which the director is


Find The titles of films starring Henry Fonda also an actor
Try it! Try it!
Title(Films ((FirstName=Henry) Title((Director=Actor#)(Films Roles))
(Surname=Fonda)(Artists Roles)))
CSC343 Introduction to Databases University of Toronto Relational Algebra 51 CSC343 Introduction to Databases University of Toronto Relational Algebra 52

13
Example 3 Example 4
Films(Film#,Title,Director,Year,ProdCost)
Artists(Actor#,Surname,FirsName,Sex,Birthday, Films(Film#,Title,Director,Year,ProdCost)
Nationality) Artists(Actor#,Surname,FirsName,Sex,Birthday,
Roles(Film#,Actor#,Character) Nationality)
Roles(Film#,Actor#,Character)
Find The actors who have played two characters in
the same film; show the title of each such film, first The titles of the films in which the actors are all of
name and surname of the actor and the two the same sex
characters TitleTry
(Films)
it!
Title,FirstName,Surname,Character1,Character
Try it! Title(Films) sexsex1((Artists Roles)
(Film#,Actor#,CharacterFilm#1,Actor#1,Character1(Roles) Ac#,Char,Sur,Fir,Sex,BD,N Ac#1,Char1,Sur1,Fir1,Sex1,BD1,N1
(Film#1=Film#)(Actor#1=Actor#) (Character1Character) (Artists Roles))
Roles Artists Films)
CSC343 Introduction to Databases University of Toronto Relational Algebra 53 CSC343 Introduction to Databases University of Toronto Relational Algebra 54

Lecture Example (for blackboard)


Relational Algebra and Null Values Emp(Num,Name,Age,Sal)
"Find all ages for which there are at least two
People Name Age Salary employees earning more than 40M"
Aldo 35 15
sal>40(Emp) sal1>40(Num,Name,SalNum1,Name1,Sal1Emp)
Andrea 27 21
Maria NULL 42 This results in a relation R with schema
??(Num,Name,Age,Sal,Num1,Name1,Sal1)
Now we select tuples where NumNum1 and project
Consider Age>30 (People)
on Age
Which tuples belong to the result?
Age( NumNum1R)
The first yes, the second no, but the third??
In a similar fashion, you can find ages for which there
are 3 employees earning >40M, etc.
CSC343 Introduction to Databases University of Toronto Relational Algebra 55 CSC343 Introduction to Databases University of Toronto Relational Algebra 56

14
Blackboard Example II

Finding the maximum value in Relational Algebra.


Emp(Num,Name,Age,Sal)
"Find the employees earning maximum salary"
A = Sal(Emp) - Sal( Sal1>Sal(Emp everything(Emp))
This gives us the maximum salary; the rest is easy

Num(Emp A)

CSC343 Introduction to Databases University of Toronto Relational Algebra 57

15

You might also like