You are on page 1of 53

Relational Algebra

Formal Relational Query Languages


„ Two mathematical Query Languages form the
basis for practical languages (e.g. SQL)
– Relational Algebra: Operational, useful for
representing execution plans
– Relational Calculus: Declarative: Describe
what you want, rather than how to compute it
„ They are NOT programming languages
– Not expected to be Turing complete
Operations (Operators)

„ Operations on a single relation


– selection σ, projection π
„ Usual set operations (relations are sets):
– union ∪, intersection ∩, and difference 
„ Operations combining two or more relations
– Cartesian product ×, join and natural join
„ And a renaming operation ρ
Projection

„ Keeping vertical slices of a relation


according to L
„ L is a list of attributes (i.e. a list of
columns) of the relation R
πL(R)
Projection

„ πEmp_No, Num(Assigned_To)

Emp_No Dep_Date Num


1001 Nov 1 100 Emp_No Num
1001 Oct 31 100 1001 100
1002 Nov 1 100 1002 100
1002 Oct 31 100

πEmp_No, Num(
1003 100
1003
1003
Oct 31
Oct 31
100
337
)= 1003
1004
337
337
1004 Oct 31 337 1005 337
1005 Oct 31 337 1006 337
1006 Nov 1 991 1006 991
1006 Oct 31 337
Projection
„ Is πEmp_No, Num(Assigned_To) Equivalent to
SELECT Emp_No, Num
FROM Assigned_To
???
Emp_No Dep_Date Num
Emp_No Num

πEmp_No, Num( 1001


1001
Nov 1
Oct 31
100
100 )= 1001
1002
100
100
1002 Nov 1 100

„ NO. Relational algebra works with sets


(i.e. No dublicates)
SELECT DISTINCT Emp_No, Num
FROM Assigned_To
Selection

„ Selecting the t-uples of a relation R


verifying a condition c

σc(R)
Selection

„ σ Salary<100000(Employee)

Name Salary Emp_No


Clark 150000 1006 Name Salary Emp_No

Gates 5000000 1005 Jones 50000 1001

σSalary<100000( Jones
Peters
50000
45000
1001
1002
)= Peters
Rowe
45000
35000
1002
1003

Phillips 25000 1004 Phillips 25000 1004

Rowe 35000 1003


Warnock 500000 1007
Selection

„ σ Salary<100000(Employee)

SELECT DISTINT *
FROM Employee
WHERE Salary < 100000
Confusing terms
Projection πattr1,attr2(Relation)

SELECT DISTINCT attr1,attr2


FROM Relation
WHERE cond

Selection σcond(Relation)
Selection (θ-)Condition

„ Term Op Term is a condition


– where Term is an attribute name
– or Term is a constant
– Op is one of <, >, =, ≠, etc.
∧ C2), (C1 ∨ C2), (¬ C1) are
„ (C1
conditions where C1 and C2 are
conditions
Selection

„ σSalary>100000 ∧¬(Name=‘Gates’)(Employee)

Name Salary Emp_No


Clark 150000 1006
Gates 5000000 1005
Jones 50000 1001
Peters 45000 1002
Phillips 25000 1004
Rowe 35000 1003
Warnock 500000 1007
Composability

„ The result of an expression is a relation

πEmp_No, Num(σNum>150(Employee))

Emp_No Dep_Date Num

Emp_No Num
1001 Nov 1 100
1002 200
1001 Oct 31 100

1002 Nov 1 200


Composability
πEmp_No, Num(σNum>150(Employee))
What is the equivalent SQL query?
SELECT DISTINCT Emp_No, Num
FROM Employee
WHERE Num>150

σNum>150(πEmp_No, Num(Employee))
What is the equivalent SQL query?
Can I always exchange the order of σ and π ?
Union, Intersection, Set-difference
„ R1 ∪ R2 = { t | t ∈ R1 or t ∈ R2}
„ R1 ∩ R2 = { t | t ∈ R1 and t ∈ R2}
„ R1 — R2 = { t | t ∈ R1 and t ∉ R2}

„ The
relations R1 and R2 must be union
compatible
– Same number of attributes
– Corresponding attributes have the same
type (but not necessarily the same name)
Set operations - Union

„ Plane1 ∪ Plane2
Maker Model_No
Airbus A310
Maker Model_No
Maker Model_No Airbus A320
Airbus A310
Boeing B727 Airbus A330
Airbus A320

∪ =
Boeing B747 Airbus A340
Airbus A330
Boeing B757 Boeing B727
Airbus A340
MD DC10 Boeing B747
MD DC10
MD DC9 Boeing B757
MD DC9
MD DC10
MD DC9
Set operations - Union
„ Plane1 ∪ Plane2

SELECT DISTINCT *
FROM Plane1
UNION
SELECT DISTINCT *
FROM Plane2
Set operations - Intersection

„ Plane1 ∩ Plane2
Maker Model_No
Maker Model_No
Airbus A310
Boeing B727
Airbus A320 Maker Model_No

∩ =
Boeing B747
Airbus A330 MD DC9
Boeing B757
Airbus A340 MD DC10
MD DC10
MD DC10
MD DC9
MD DC9
Set operations - Intersection

„ Plane1 ∩ Plane2

SELECT DISTINCT *
FROM Plane1
INTERSECT
SELECT DISTINCT *
FROM Plane2
Set operations – Set difference

„ Plane1 — Plane2

Maker Model_No
Maker Model_No
Airbus A310 Maker Model_No
Boeing B727
Airbus A320 Airbus A310

— =
Boeing B747
Airbus A330 Airbus A320
Boeing B757
Airbus A340 Airbus A330
MD DC10
MD DC10 Airbus A340
MD DC9
MD DC9
Set operations – Set difference

„ Plane1 — Plane2

SELECT DISTINCT *
FROM Plane1
EXCEPT
SELECT DISTINCT *
FROM Plane2
Are all operators essential?

R∩S = ((R∪S) − (R−S)) − (S−R)

Compute all tuples


belonging to R or S
Remove the ones that
belong only to R

Remove the ones that


belong only to S
Cartesian Product
„ Combining two relations
R1 × R2
„ {a, b} x {1,3} = {(a,1), (a,3), (b,1), (b,3)}
„ {(a,1), (a,3)} x {(a,1), (a,3)}
= {((a,1),(a,1)), ((a,1),(a,3)),
((a,3),(a,1)),((a,3),(a,3))}
= {(a,1,a,1), (a,1,a,3), (a,3,a,1),(a,3,a,3)}
Emp_No Model_No Maker Model_No

Cartesian Product
1001 B727 Airbus A310
1001 B727 Airbus A320
1001 B727 Airbus A330
1001 B727 Airbus A340
1001 B727 Boeing B727
1001 B727 Boeing B747

„ Can_fly × Plane 1001 B727 Boeing B757


1001 B727 MD DC10
1001 B727 MD DC9
1001 B747 Airbus A310
Emp_No Model_No Maker Model_No
1001 B747 Airbus A320
1001 B727 Airbus A310
1001 B747 Airbus A330
1001 B747 Airbus A320
1001 B747 Airbus A340
1001 DC10 Airbus A330
1001 B747 Boeing B727

×
1002 A320 Airbus A340

=
1001 B747 Boeing B747
1002 A340 Boeing B727
1001 B747 Boeing B757
1002 B757 Boeing B747
1001 B747 MD DC10
1002 DC9 Boeing B757
1001 B747 MD DC9
1003 A310 MD DC10
1001 B727 Airbus A310
1003 DC9 MD DC9 1001 B727 Airbus A320

… … … …

81 t-uples!!!
Cartesian Product

„ Can_fly × Plane

SELECT DISTINCT *
FROM Can_Fly, Plane
θ-Join
„ Combining two relations R1 and R2 on a
condition c
R1 c R2 = σc(R1 × R2)
θ-Join

„ Can_fly Can_fly.Model_No = Plane.Model_No Plane


Emp_No Model_No Maker Model_No Emp_No Can_Fly.Model_No Maker Plane.Model_No
1001 B727 Airbus A310 1003 A310 Airbus A310
1001 B747 Airbus A320 1002 A320 Airbus A320
1001 DC10 Airbus A330 1002 A340 Airbus A340

… =
1002 A320 Airbus A340 1001 B727 Boeing B727
1002 A340 Boeing B727 1001 B747 Boeing B747
1002 B757 Boeing B747 1002 B757 Boeing B757
1002 DC9 Boeing B757 1001 DC10 MD DC10
1003 A310 MD DC10 1002 DC9 MD DC9
1003 DC9 MD DC9 1003 DC9 MD DC9
θ-Join

„ Can_fly Can_fly.Model_No = Plane.Model_No Plane

SELECT DISTINCT *
FROM Can_Fly, Plane
WHERE Can_Fly.Model_No = Plane.Model_No
θ-Join

„ Flight1.Dest = Flight2.Origin ∧
Flight1.Arr_Time < Flight2.Dept_Time
Num Origin Dest Dep_Time Arr_Time Num Origin Dest Dep_Time Arr_Time
334 ORD MIA 12:00 14:14 334 ORD MIA 12:00 14:14
335 MIA ORD 15:00 17:14 335 MIA ORD 15:00 17:14
336
337
ORD
MIA
MIA
ORD
18:00
20:30
20:14
23:53
… 336
337
ORD
MIA
MIA
ORD
18:00
20:30
20:14
23:53
394 DFW MIA 19:00 21:30 394 DFW MIA 19:00 21:30
395 MIA DFW 21:00 23:43 395 MIA DFW 21:00 23:43
θ-Join

„ Flight1.Dest = Flight2.Origin ∧
Flight1.Arr_Time < Flight2.Dept_Time
Flight1 Flight1. Flight1 Flight1.De Flight1.Ar Flight2_ Flight2.O Flight2. Flight2.Dep Flight2.Arr_
.Num Origin .Dest p_Time r_Time 1.Num rigin Dest _Time Time
334 ORD MIA 12:00 14:14 335 MIA ORD 15:00 17:14

335 MIA ORD 15:00 17:14 336 ORD MIA 18:00 20:14

336 ORD MIA 18:00 20:14 337 MIA ORD 20:30 23:53

334 ORD MIA 12:00 14:14 337 MIA ORD 20:30 23:53

336 ORD MIA 18:00 20:14 395 MIA DFW 21:00 23:43

334 ORD MIA 12:00 14:14 395 MIA DFW 21:00 23:43
The Equi-Join

„ Combines two relations on a condition


composed only of equalities of attributes
of the first and second relation and
projects only one of the redundant
attributes (since they are equal)
„ R1 E(A1.1=A2.1 ∧… ∧A1.n = A2.n) R2
The Equi-Join

„ Can_fly E(Can_fly.Model_No = Plane.Model_No) Plane


Emp_No Model_No Maker Model_No Emp_No Can_Fly.Model_No Maker Plane.Model_No
1001 B727 Airbus A310 1003 A310 Airbus A310
1001 B747 Airbus A320 1002 A320 Airbus A320
1001 DC10 Airbus A330 1002 A340 Airbus A340

… =
1002 A320 Airbus A340 1001 B727 Boeing B727
1002 A340 Boeing B727 1001 B747 Boeing B747
1002 B757 Boeing B747 1002 B757 Boeing B757
1002 DC9 Boeing B757 1001 DC10 MD DC10
1003 A310 MD DC10 1002 DC9 MD DC9
1003 DC9 MD DC9 1003 DC9 MD DC9
The Natural Join

„ Combines two relations on the equality


of the attributes with the same names
and projects only one of the redundant
attributes
„ R1 n R2
The Natural Join

„ Can_fly n Plane
Emp_No Model_No Maker Model_No Emp_No Model_No Maker

1001 B727 Airbus A310 1003 A310 Airbus

1001 B747 Airbus A320 1002 A320 Airbus

1001 DC10 Airbus A330 1002 A340 Airbus

=
1002 A320 Airbus A340 1001 B727 Boeing

1002 A340 n Boeing B727 1001 B747 Boeing

1002 B757 Boeing B747 1002 B757 Boeing

1002 DC9 Boeing B757 1001 DC10 MD

1003 A310 MD DC10 1002 DC9 MD

1003 DC9 MD DC9 1003 DC9 MD


Renaming
„ Ifattributes or relations have the same
name (for instance when joining a
relation with itself) it may be convenient
to rename one
ρ(R’(N -> N’1, Nn -> N’n), R)
„ The new relation R’ has the same
instance has R, its schema has
attribute N’i instead of attribute Ni
Renaming

„ ρ(Staff(Name -> Family_Name,


Salary -> Gross_salary),
Employee)
Employee Staff
Family_ Gross_
Name Salary Emp_No Emp_No
Name Salary
Clark 150000 1006 Clark 150000 1006
Gates 5000000 1005 Gates 5000000 1005
Jones 50000 1001 Jones 50000 1001
Peters 45000 1002 Peters 45000 1002
Phillips 25000 1004 Phillips 25000 1004
Rowe 35000 1003 Rowe 35000 1003
Warnock 500000 1007 Warnock 500000 1007
Renaming
„ ρ(Staff(Name-> Family_Name,
Salary -> Gross_salary),
Employee)

SELECT Name AS Family_Name,


Salary AS Gross_salary
FROM Employee Staff
Complex Expression

„ R1 ∪ (R2 ∩ πb(R3 × ρ(R4(a -> b), R5)))


R1 ∩
R2
πb
×
R3 ρ (R4(a -> b))
R5
Example

„ Find for each employee, its name and


the model number of engine it can fly
„ πName, Model_No (σEmployee.Emp_No=Can_fly.Emp_No(Employee ×
Can_fly))

πName, Model_No

σEmployee.Emp_No=Can_fly.Emp_No

Employee Can_fly
Example

Emp_No Model_No
Name Salary Emp_No
1001 B727
Clark 150000 1006
1001 B747
Gates 5000000 1005
1001 DC10
Jones 50000 1001
1002 A320
Peters 45000 1002
1002 A340
Phillips 25000 1004
1002 B757
Rowe 35000 1003
1002 DC9
Warnock 500000 1007
1003 A310
1003 DC9
Simple SQL and Algebra queries

„ Also
called Project Select Join queries
(PSJ)

SELECT DISTINCT student.name,


course.name
FROM student, course
WHERE student.take = course.id AND
student.year = 1997
Simple SQL and Algebra queries

„ πstudent.name, course.name(σstudent.take = course.id ∧ student.year = 1997(student ×


course))
„π student.name, course.name (σ student.take = course.id (σ
year =

(student)×course))
1997

„π (σ (student ⊗
student.name, course.name student.year = 1997 student.take = course.id

course))
„π (σ (student) ⊗
student.name, course.name year = 1997 student.take = course.id

course)
Example 1
„ Find the Employment numbers of the pilots
who can fly Airbus planes
πEmp_No(σMaker=‘Airbus’(Can_Fly⊗Plane))

Emp_No Model_No Maker Model_No

1001 B727 Airbus A310

1001 B747 Airbus A320

1001 DC10 Airbus A330

1002 A320 Airbus A340


Emp_No
1002 A340 Boeing B727

1002 B757 Boeing B747 1002


1002 DC9 Boeing B757
1003
1003 A310 MD DC10

1003 DC9 MD DC9


Example 2
„ Find the Employment numbers of the pilots
who can fly Boeing or MD planes
πEmp_No(σMaker=‘Boeing’ ∨ Maker=‘MD’
(Can_Fly⊗Plane))

Emp_No Model_No Maker Model_No

1001 B727 Airbus A310

1001 B747 Airbus A320

1001 DC10 Airbus A330

1002 A320 Airbus A340


Emp_No
1002 A340 Boeing B727

1002 B757 Boeing B747 1001


1002 DC9 Boeing B757
1002
1003 A310 MD DC10

1003 DC9 MD DC9 1003


Example 3
„ Find the Employment numbers of the pilots
who can fly Boeing and MD planes
Emp_No
πEmp_No(σMaker=‘Boeing’ ∧ Maker=‘MD’ 1001
1002
(Can_Fly⊗Plane))
Emp_No Model_No
1001 B727

WRONG! The correct is:


1001 B747
1001 DC10

πEmp_No(σMaker=‘Boeing’(Can_Fly⊗Plane))
1002 A320
1002 A340


1002 B757
1002 DC9
1003 A310

πEmp_No(σMaker=‘MD’(Can_Fly⊗Plane)) 1003 DC9


Example 4
„ Find the Employment numbers of the pilots
who can fly at least two Boeing planes
Emp_No
1001
ρ(BP1, σMaker=‘Boeing’(Can_Fly⊗Plane))
Emp_No Model_No

ρ(BP2, BP1) 1001 B727


1001 B747
1001 DC10

πBP1.Emp_No(
1002 A320
1002 A340

σBP1.Model_No≠BP2.Model.No(BP1×BP2)) 1002 B757


1002 DC9
1003 A310
1003 DC9
Example 4 – cont.
„ Find the Employment numbers of the pilots
who can fly at least two Boeing planes
Emp_No
1001
SELECT Can_Fly.Emp_No
FROM Can_Fly, Plane
Emp_No Model_No

WHERE Plane.Maker = ‘Boeing’ 1001 B727


1001 B747
GROUP BY Can_Fly.Emp_No 1001 DC10
1002 A320
HAVING COUNT(*) >= 2 1002 A340
1002 B757
1002 DC9

Algebra does not support aggregations 1003 A310


1003 DC9
Example 5 - Division

„ Find the Employment numbers of the pilots


who can fly all MD planes

Emp_No Model_No Maker Model_No


1001 B727 Airbus A310
1001 B747 Airbus A320
1001 DC10 Airbus A330
1002 A320 Airbus A340
1002 A340 Boeing B727
1002 B757 Boeing B747
Emp_No
1002 DC9 Boeing B757 1003
1003 A310 MD DC10
1003 DC9 MD DC9

1003 DC10
Example 5 - Division
„ Let A have two fields x and y
„ Let B have one field y
„ A/B contains all x tuples, such that for every y tuple
in B there is a xy tuple in A
Sno Pno
s1 p1 Pno

s1 p2 p2 B
s1 p3
s1 p4
s2 p1 Sno

s2 p2 s1

s3 p2 s2
s3
A/B
s4 p2
s4 p4 s4

A
Example 5 - Division
„ Let A have two fields x and y
„ Let B have one field y
„ A/B contains all x tuples, such that for every y tuple
in B there is a xy tuple in A
Sno Pno
s1 p1 Pno

s1 p2 p2 B
s1 p3 p4

s1 p4
s2 p1
s2 p2 Sno
s3 p2 s1 A/B
s4 p2 s4
s4 p4

A
Example 5 - Division
„ Let A have two fields x and y
„ Let B have one field y
„ A/B contains all x tuples, such that for every y tuple
in B there is a xy tuple in A
Sno Pno
s1 p1 Pno

s1 p2 p1 B
s1 p3 p2

s1 p4 p4

s2 p1
s2 p2 Sno
s3 p2 s1 A/B
s4 p2
s4 p4

A
Example 5 - Division
„ Compute all possible combinations of the first
column of A and B.
„ Then remove those rows that do not belong to A
„ Keep only the first column of the result. These are
the disqualified values
πx( (πx(A)×B) − A)
„ A/B is the first column of A except of the
disqualified values
A/B = πx(A) − πx((πx(A)×B)− A)
Example 5 - Division
„ Find the Employment numbers of the pilots
who can fly all MD planes

ρ(B, πModel_No(σMaker=‘MD’(Plane)))

ρ(A, Can_Fly)

πEmp_No(A) − πEmp_No((πEmp_No(A)×B)− A)
What is the corresponding SQL?

You might also like