
# The Relational Algebra and Relational Calculus

Silberschatz Text: 6
What are Relational Algebra and Relational Calculus?

● Relational Algebra
– A basic set of operations for manipulating the database
– Procedural: specifies how to retrieve the data; incorporated in SQL
– Provides the formal foundation for relational model operations
– Used as a basis for implementing and optimizing queries in an RDBMS

● Relational Calculus
– A higher-level declarative notation for specifying queries
– Non-procedural: specifies only what is to be retrieved
– Firm basis in mathematical logic
– Tuple Calculus: variables range over rows (tuples)
– Domain Calculus: variables range over single attribute values (column domains)
R.Algebra Slide -2
Topics to cover
● Basic Relational Algebra Operations
– Set (∪, ∩, –, X)
– Specific to relational databases (σ, π, ⋈, ρ, ÷)
– Unary (single relation) and binary (2 relations)
– Aggregate Functions and Grouping ℑ
– Recursive Closure Operations
– OUTER JOIN Operations ]><[
– The OUTER UNION Operation
● Tuple Relational Calculus
– SQL is based on tuple relational calculus
● Domain Relational Calculus
– QBE is based on domain calculus

Relational Algebra Operations
1. SET OPERATIONS
● UNION ∪, INTERSECTION ∩
● SET DIFFERENCE (MINUS) –
● CARTESIAN PRODUCT (CROSS JOIN) X

2. SPECIFIC OPERATIONS
● SELECT σ, PROJECT π (unary operations)
● JOIN ⋈ (various types: equijoin =, theta join θ, natural join *) (binary operations)
● OTHERS
– Sequences of operations, RENAME ρ (unary operations)
– DIVISION ÷ (binary operation)

SELECT OPERATION
● Unary: applied to a single relation, and to each tuple individually
● Selects the subset of tuples of a relation that satisfy a selection condition
● σ<selection condition>(R)
– σ (sigma) = SELECT operation
– <selection condition> = Boolean expression made up of clauses of the form
<attribute name> <comparison op> <constant value>, or
<attribute name> <comparison op> <attribute name>
where <comparison op> is one of {=, <, ≤, >, ≥, ≠} and
<constant value> is a constant from the attribute's domain
– Clauses can be connected by AND, OR and NOT
● The degree of the relation resulting from SELECT is the same as the degree of R
● Number of resulting tuples ≤ number of tuples in R
● Commutative, i.e. can be applied in any order: σ<cond1>(σ<cond2>(R)) = σ<cond2>(σ<cond1>(R))
● A cascade of SELECT operations can be combined into a single SELECT operation with a conjunctive (AND) condition:
σ<cond1>(σ<cond2>(…(σ<condn>(R))…)) = σ<cond1> AND <cond2> AND … AND <condn>(R)

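The commutativity and cascade properties of SELECT can be checked with a small Python sketch. The list-of-dicts representation of a relation, the helper name `select`, and the three sample tuples are assumptions made for illustration only; they are not part of the COMPANY database.

```python
# A relation is modelled as a list of dicts, one dict per tuple (assumption).
EMPLOYEE = [
    {"FNAME": "Ann", "DNO": 4, "SALARY": 26000},  # hypothetical sample rows
    {"FNAME": "Bob", "DNO": 5, "SALARY": 28000},
    {"FNAME": "Cid", "DNO": 5, "SALARY": 40000},
]

def select(cond, relation):
    """sigma<cond>(relation): keep the tuples for which cond is true."""
    return [t for t in relation if cond(t)]

in_dept5 = lambda t: t["DNO"] == 5
high_paid = lambda t: t["SALARY"] > 30000

# Two orders of a cascade, and one SELECT with a conjunctive condition.
a = select(in_dept5, select(high_paid, EMPLOYEE))
b = select(high_paid, select(in_dept5, EMPLOYEE))
c = select(lambda t: in_dept5(t) and high_paid(t), EMPLOYEE)
```

All three expressions keep the same tuples, and the result never has more tuples than EMPLOYEE.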
PROJECT OPERATION
● Unary: applied to a single relation, and to each tuple individually
● Selects certain columns (attributes) from the table and discards the other columns
● π<attribute list>(R)
– π (pi) = PROJECT operation
– e.g. π LNAME, FNAME, SALARY (EMPLOYEE)
● The result contains only the attributes specified in <attribute list>, in the same order as they appear in the list
● Duplicate elimination: if <attribute list> includes only non-key attributes of R, duplicate tuples are likely to occur and are removed from the result
● Number of resulting tuples ≤ number of tuples in R
● If the projection list is a superkey of R, the resulting relation has the same number of tuples as R
● If <list2> contains the attributes in <list1>: π<list1>(π<list2>(R)) = π<list1>(R)
● Commutativity does not hold for the PROJECT operation

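Duplicate elimination in PROJECT can be illustrated with a short Python sketch. The helper `project` and the sample rows are hypothetical; the point is only that projecting onto non-key attributes can shrink the result, while a projection list containing a key cannot.

```python
# Hypothetical sample relation; SSN plays the role of the key.
EMPLOYEE = [
    {"SSN": 1, "SEX": "F", "SALARY": 25000},
    {"SSN": 2, "SEX": "F", "SALARY": 25000},
    {"SSN": 3, "SEX": "M", "SALARY": 30000},
]

def project(attrs, relation):
    """pi<attrs>(relation): keep the listed attributes, in list order,
    and drop duplicate result tuples."""
    seen = []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in seen:
            seen.append(row)
    return seen

sex_salary = project(["SEX", "SALARY"], EMPLOYEE)  # non-key attributes only
with_key = project(["SSN", "SALARY"], EMPLOYEE)    # list contains the key
```

Here `sex_salary` has two tuples because the two (F, 25000) rows collapse into one, while `with_key` keeps all three.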
Results of SELECT and PROJECT operations.
Refer to COMPANY database schema
SELECT
(a): σ(DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000)(EMPLOYEE).

PROJECT
(b): πLNAME, FNAME, SALARY(EMPLOYEE).
(c): πSEX, SALARY(EMPLOYEE).
Note for (c): only one entry of (F, 25000) appears, because duplicates are eliminated.

Fig 1

SEQUENCES OF OPERATIONS
● Operations can be nested into a single relational algebra expression:
– e.g. (Fig 2a): π FNAME, LNAME, SALARY (σ DNO=5 (EMPLOYEE))
● Sometimes it is simpler to apply one operation at a time and create intermediate result relations, giving a name to each intermediate relation:
– DEP5_EMPS ← σ DNO=5 (EMPLOYEE)
– RESULT ← π FNAME, LNAME, SALARY (DEP5_EMPS)

Results of a sequence of operations.
(a) πFNAME, LNAME, SALARY(σDNO=5(EMPLOYEE)).
(b) Using intermediate relations and renaming of attributes.

Fig 2
RENAME OPERATIONS
● Sometimes the attributes in the intermediate result relations are renamed
– TEMP ← σ DNO=5 (EMPLOYEE)
– ρS(FIRSTNAME, LASTNAME, SALARY) (π FNAME, LNAME, SALARY (TEMP))

● RENAME Operation
– ρ (rho)
– S is the new relation name
– B1, B2, …, Bn are the new attribute names

● Can rename the relation, the attributes, or both
– ρS(B1, B2, …, Bn)(R) renames both the relation and its attributes
– ρS(R) renames the relation only
– ρ(B1, B2, …, Bn)(R) renames the attributes only

SET OPERATIONS
● UNION Operation (fig 4a, 4b)
– R ∪ S results in a new relation that includes all tuples that are in R, in S, or in both; duplicate tuples are eliminated
● INTERSECTION Operation (fig 4c)
– R ∩ S results in a new relation that includes all tuples that are in both R and S
● SET DIFFERENCE (MINUS) Operation (fig 4d, 4e)
– R – S results in a new relation that includes all tuples that are in R but not in S


Results of the UNION operation
● Example: retrieve the SSNs of all employees who either work in department 5 or directly supervise an employee who works in department 5
– RESULT1: SSNs of all employees who work in department 5
– RESULT2: SSNs of all employees who directly supervise an employee who works in department 5
– RESULT: UNION operation, i.e. the SSNs that are in RESULT1, in RESULT2, or in both

RESULT ← RESULT1 ∪ RESULT2

Fig 3

SET OPERATIONS
● Union compatibility: UNION, INTERSECTION and SET DIFFERENCE apply only to relations with the same type of tuples
– same degree n for R(A1, A2, …, An) and S(B1, B2, …, Bn)
– dom(Ai) = dom(Bi) for 1 ≤ i ≤ n

● Commutative operations:
– R ∪ S = S ∪ R and R ∩ S = S ∩ R

● Associative operations:
– R ∪ (S ∪ T) = (R ∪ S) ∪ T and R ∩ (S ∩ T) = (R ∩ S) ∩ T

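Python's built-in `set` type behaves like the set operations above, so the properties can be tried out directly. The two relations below are hypothetical sample data, and `union_compatible` is an illustrative helper that only checks the degree, not the attribute domains.

```python
# Relations as sets of (first name, last name) tuples; hypothetical data.
STUDENT = {("Susan", "Yao"), ("Ramesh", "Shah"), ("Johnny", "Kohler")}
INSTRUCTOR = {("John", "Smith"), ("Susan", "Yao"), ("Ramesh", "Shah")}

def union_compatible(r, s):
    """True if every tuple in r and s has the same degree.
    Domains are not checked in this sketch."""
    return len({len(t) for t in r | s}) <= 1

union = STUDENT | INSTRUCTOR             # R ∪ S, duplicates eliminated
intersection = STUDENT & INSTRUCTOR      # R ∩ S
student_only = STUDENT - INSTRUCTOR      # R – S
instructor_only = INSTRUCTOR - STUDENT   # S – R
```

Union and intersection give the same result with the operands swapped, whereas the two differences are not equal, which shows that MINUS is not commutative.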

SET operations UNION, INTERSECTION, MINUS
(a) Two union-compatible relations.
(b) STUDENT ∪ INSTRUCTOR.
(c) STUDENT ∩ INSTRUCTOR.
(d) STUDENT – INSTRUCTOR.
(e) INSTRUCTOR – STUDENT.

Fig 4

SET Operation:
The CARTESIAN PRODUCT (CROSS JOIN)

CROSS JOIN = X
● R(A1, A2, …, An) X S(B1, B2, …, Bm)
● Result = Q, with degree n+m and attributes Q(A1, A2, …, An, B1, B2, …, Bm), in that order
● If R has nR tuples and S has nS tuples, then R X S will have nR × nS tuples
Fig 5a

SET Operation:
The CARTESIAN PRODUCT (CROSS JOIN)
Example: retrieve a list of the names of each female employee's dependents

FEMALE_EMPS ← σ SEX=‘F’ (EMPLOYEE)
EMPNAMES ← π FNAME, LNAME, SSN (FEMALE_EMPS)
EMP_DEPENDENTS ← EMPNAMES X DEPENDENT
ACTUAL_DEPENDENTS ← σ SSN=ESSN (EMP_DEPENDENTS)
RESULT ← π FNAME, LNAME, DEPENDENT_NAME (ACTUAL_DEPENDENTS)

EMPNAMES is cross joined with DEPENDENT

Fig 5a

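The cross join, select and project steps of this example can be traced in Python. This is a minimal sketch: relations are lists of dicts, and apart from Jennifer Wallace and her dependent Abner the rows are hypothetical stand-ins rather than the real contents of Fig 5.

```python
from itertools import product

# EMPNAMES and DEPENDENT cut down to two rows each (sample data).
EMPNAMES = [
    {"FNAME": "Jennifer", "LNAME": "Wallace", "SSN": "987654321"},
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SSN": "999887777"},
]
DEPENDENT = [
    {"ESSN": "987654321", "DEPENDENT_NAME": "Abner"},
    {"ESSN": "333445555", "DEPENDENT_NAME": "Alice"},
]

# EMP_DEPENDENTS <- EMPNAMES X DEPENDENT: every combination of tuples.
EMP_DEPENDENTS = [{**e, **d} for e, d in product(EMPNAMES, DEPENDENT)]

# ACTUAL_DEPENDENTS <- sigma SSN=ESSN (EMP_DEPENDENTS)
ACTUAL_DEPENDENTS = [t for t in EMP_DEPENDENTS if t["SSN"] == t["ESSN"]]

# RESULT <- pi FNAME, LNAME, DEPENDENT_NAME (ACTUAL_DEPENDENTS)
RESULT = [(t["FNAME"], t["LNAME"], t["DEPENDENT_NAME"])
          for t in ACTUAL_DEPENDENTS]
```

The cross join has 2 × 2 = 4 tuples with 3 + 2 = 5 attributes each; only the one combination with matching SSN and ESSN survives the selection.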
SET Operation: Cross JOIN
EMP_DEPENDENTS ← EMPNAMES X DEPENDENT

Notice: in a Cartesian product (cross join), ALL combinations of tuples are included in the result.

Fig 5b

SET Operation:
SELECT tuple(s) and PROJECT desired attributes from the
CROSS JOIN result EMP_DEPENDENTS
ACTUAL_DEPENDENTS ← σ SSN=ESSN (EMP_DEPENDENTS)
RESULT ← π FNAME, LNAME, DEPENDENT_NAME (ACTUAL_DEPENDENTS)

Example: out of the 3 female employees, only Jennifer Wallace (SSN 987654321) has a dependent, whose name is Abner.

Fig 5c

BINARY JOIN OPERATIONS – INNER JOINS
● Only matching tuples are kept in the result relation of an inner join ⋈

● THETA JOIN, ⋈θ
– R(A1, A2, …, An) ⋈<join condition> S(B1, B2, …, Bm), result = Q(A1, …, An, B1, …, Bm)
– Only combinations of tuples that satisfy the join condition appear in the result
– Each <join condition> is of the form Ai θ Bj, where Ai and Bj have the same domain and θ (theta) is one of the comparison operators {=, <, ≤, >, ≥, ≠}

● EQUIJOIN
– A theta join in which the only comparison operator used in the <join condition> is =
– The join result carries superfluous attributes (each pair of join attributes has identical values in every result tuple)

BINARY JOIN OPERATIONS – INNER JOINS
● NATURAL JOIN *
– Similar to EQUIJOIN, but the superfluous attributes are removed
– Both join attributes must have the same name; if not, apply a renaming operation first
– Q ← R * (<List1>), (<List2>) S

● n-way JOIN for EQUI and NATURAL joins, specified among multiple tables
– e.g. 3-way join:
– (( PROJECT JOIN DNUMB=DNUMBER DEPARTMENT)) JOIN


BINARY THETA JOIN θ:
JOIN combines related tuples from 2 relations into single tuples

Result of the JOIN operation
DEPT_MGR ← DEPARTMENT ⋈ MGRSSN=SSN EMPLOYEE.

Previous example of cross join in Fig 5b and 5c:
EMP_DEPENDENTS ← EMPNAMES X DEPENDENT
ACTUAL_DEPENDENTS ← σ SSN=ESSN (EMP_DEPENDENTS)

can be replaced by:
ACTUAL_DEPENDENTS ← EMPNAMES ⋈ SSN=ESSN DEPENDENT

Fig 6

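The claim that a JOIN replaces a cross join followed by a SELECT can be checked with a small Python sketch. The helper name `theta_join`, the list-of-dicts representation and the sample rows are assumptions for illustration.

```python
from itertools import product

def theta_join(r, s, cond):
    """R JOIN<cond> S: combine each pair of tuples that satisfies cond."""
    return [{**t1, **t2} for t1, t2 in product(r, s) if cond(t1, t2)]

# Hypothetical sample rows for EMPNAMES and DEPENDENT.
EMPNAMES = [
    {"FNAME": "Jennifer", "LNAME": "Wallace", "SSN": "987654321"},
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SSN": "999887777"},
]
DEPENDENT = [
    {"ESSN": "987654321", "DEPENDENT_NAME": "Abner"},
    {"ESSN": "333445555", "DEPENDENT_NAME": "Alice"},
]

# One step: ACTUAL_DEPENDENTS <- EMPNAMES JOIN SSN=ESSN DEPENDENT
joined = theta_join(EMPNAMES, DEPENDENT, lambda e, d: e["SSN"] == d["ESSN"])

# Two steps: cross join, then select on SSN=ESSN
cross = [{**e, **d} for e, d in product(EMPNAMES, DEPENDENT)]
selected = [t for t in cross if t["SSN"] == t["ESSN"]]
```

Both routes give the same relation. Because this is an equijoin, each result tuple carries both SSN and ESSN with identical values, which is the superfluous attribute that a natural join would drop.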
Results of two NATURAL JOIN * operations.
(a) PROJ_DEPT ← PROJECT * DEPT (attributes are renamed before the JOIN).
(b) DEPT_LOCS ← DEPARTMENT * DEPT_LOCATIONS.

Fig 7

The DIVISION ÷ operation

● The DIVISION operation is applied to 2 relations:
R(Z) ÷ S(X), where X ⊆ Z and Y = Z – X (i.e. Y is the set of attributes of R that are not attributes of S)
● The result T(Y) contains a tuple t if, for every tuple tS in S, R contains a tuple tR with tR[Y] = t and tR[X] = tS
● The DIVISION operation can be expressed as a sequence of π, X, and – operations:
T1 ← π Y(R)
T2 ← π Y((S X T1) – R)
T ← T1 – T2

SQL does not implement DIVISION directly; it has to be expressed in a roundabout way.

Fig 8b

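The three-step expression for DIVISION can be traced in Python with sets of tuples. The relations R(A, B) and S(A) below are hypothetical sample data, chosen so that X = {A} and Y = {B}.

```python
# R(A, B) and S(A): hypothetical sample data, with X = {A} and Y = {B}.
R = {("a1", "b1"), ("a2", "b1"), ("a3", "b1"),
     ("a1", "b2"), ("a3", "b2"),
     ("a1", "b3"), ("a2", "b3"), ("a3", "b3")}
S = {("a1",), ("a2",), ("a3",)}

# T1 <- pi_Y(R): every B value that occurs in R
T1 = {(b,) for (_, b) in R}

# S X T1: all (A, B) pairs that would have to be present for each B
S_x_T1 = {(a, b) for (a,) in S for (b,) in T1}

# T2 <- pi_Y((S X T1) - R): B values missing at least one required pair
T2 = {(b,) for (_, b) in S_x_T1 - R}

# T <- T1 - T2: B values paired in R with every A value of S
T = T1 - T2

# Direct reading of the definition, for comparison
direct = {(b,) for (b,) in T1 if all((a, b) in R for (a,) in S)}
```

Only b2 is disqualified, because the pair (a2, b2) is missing from R, so T contains b1 and b3.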
The DIVISION ÷ operation
Example Fig 8a: Retrieve the names of employees who work on all the projects that “John Smith” works on:
Step 1: retrieve the list of PNOs that John Smith works on
SMITH ← σ FNAME=‘John’ AND LNAME=‘Smith’ (EMPLOYEE)
SMITH_PNOS ← π PNO (WORKS_ON ⋈ ESSN=SSN SMITH)
Step 2: create a relation of <ESSN, PNO> tuples
SSN_PNOS ← π ESSN, PNO (WORKS_ON)
Step 3: divide SSN_PNOS by SMITH_PNOS to get the desired SSNs
SSNS(SSN) ← SSN_PNOS ÷ SMITH_PNOS
RESULT ← π FNAME, LNAME (SSNS * EMPLOYEE)

T ← R ÷ S

Fig 8

Some common database requests cannot be performed with the basic algebra operations (in the previous list) and require additional operations:
● Aggregate Functions and Grouping ℑ
● Recursive Closure Operations
● OUTER JOIN Operations
● The OUTER UNION Operation


AGGREGATE FUNCTION / GROUPING operation

<grouping attributes> ℑ <function list> (R)

where:
grouping attributes: list of attributes of relation R
function list: list of (<function> <attribute>) pairs, where <function> is one of SUM, AVERAGE, MAXIMUM, MINIMUM, COUNT

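The grouping operation can be sketched in Python as "partition the tuples by the grouping attributes, then apply each function to every partition". The helper `aggregate`, its fixed function list (COUNT SSN, AVERAGE SALARY) and the sample rows are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical sample rows.
EMPLOYEE = [
    {"SSN": "1", "DNO": 5, "SALARY": 30000},
    {"SSN": "2", "DNO": 5, "SALARY": 40000},
    {"SSN": "3", "DNO": 4, "SALARY": 25000},
]

def aggregate(grouping_attrs, relation):
    """<grouping_attrs> ℑ COUNT SSN, AVERAGE SALARY (relation)."""
    groups = defaultdict(list)
    for t in relation:
        groups[tuple(t[a] for a in grouping_attrs)].append(t)
    return {
        key: {
            "COUNT_SSN": len(rows),
            "AVERAGE_SALARY": sum(r["SALARY"] for r in rows) / len(rows),
        }
        for key, rows in groups.items()
    }

by_dept = aggregate(["DNO"], EMPLOYEE)  # one result tuple per department
overall = aggregate([], EMPLOYEE)       # no grouping: a single result tuple
```

With an empty list of grouping attributes all tuples fall into one group, which corresponds to applying the functions to the whole relation.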

The AGGREGATE FUNCTION operation.
Example (fig 9a): retrieve each department number, the number of employees in the department and their average salary; a list of attribute names is specified with a rename

Example (fig 9b): no renaming applied; each attribute of the resulting relation is named by concatenating the function name with the attribute name
DNO ℑ COUNT SSN, AVERAGE SALARY(EMPLOYEE)

Example (fig 9c): no grouping attributes specified; the functions are applied to all tuples
ℑ COUNT SSN, AVERAGE SALARY(EMPLOYEE)

Fig 9

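Recursive Closure Operations
Query:
Specify the SSNs of all employees supervised by the employee whose name is James Borg (i.e. Borg’s division)

BORG_SSN ← π SSN (σ FNAME=‘James’ AND LNAME=‘Borg’ (EMPLOYEE))
SUPERVISION(SSN1, SSN2) ← π SSN, SUPERSSN (EMPLOYEE)
RESULT1(SSN) ← π SSN1 (SUPERVISION ⋈ SSN2=SSN BORG_SSN)
RESULT2(SSN) ← π SSN1 (SUPERVISION ⋈ SSN2=SSN RESULT1)

RESULT1 holds the employees supervised directly by Borg; RESULT2 holds those supervised at the second level. Each further level needs another join, so the basic algebra cannot express the closure over an arbitrary number of levels.

Fig 10
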
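Outside the algebra, the closure is usually computed with a loop that repeats the join step until no new supervisees appear. The following Python sketch shows that idea; the pairs in `SUPERVISION` are hypothetical (names are used in place of SSNs for readability), and `supervisees_all_levels` is an illustrative helper.

```python
# (SSN1, SSN2) pairs: SSN2 directly supervises SSN1. Hypothetical data.
SUPERVISION = {
    ("wong", "borg"), ("wallace", "borg"),
    ("smith", "wong"), ("zelaya", "wallace"),
    ("borg", None),
}

def supervisees_all_levels(boss, supervision):
    """Repeat the join step level by level until nothing new is found."""
    result = set()
    frontier = {boss}
    while frontier:
        next_level = {e for (e, s) in supervision if s in frontier} - result
        result |= next_level
        frontier = next_level
    return result

# The two fixed levels written out as in RESULT1 and RESULT2
level1 = {e for (e, s) in SUPERVISION if s == "borg"}
level2 = {e for (e, s) in SUPERVISION if s in level1}
all_levels = supervisees_all_levels("borg", SUPERVISION)
```

For this sample hierarchy the closure is exactly the union of the two levels; a deeper hierarchy would need more iterations of the loop but no change to the code.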
LEFT, RIGHT, FULL OUTER JOINS
● LEFT OUTER JOIN
– keeps every tuple in the first, or left, relation R in R ]><| S; if no matching tuple is found in S, the attributes of S in the join result are filled or “padded” with null values

● RIGHT OUTER JOIN
– keeps every tuple in the second, or right, relation S in R |><[ S; if no matching tuple is found in R, the attributes of R in the join result are padded with null values

● FULL OUTER JOIN
– keeps ALL tuples in both the left and right relations in R ]><[ S; when no matching tuples are found, they are padded with null values as needed


The result of a LEFT OUTER JOIN operation.
Example: retrieve a list of all employee names together with the name of the department each one manages, if any; employees who do not manage a department get a null department name. Here the LEFT OUTER JOIN keeps every tuple of the ‘left’ relation, i.e. EMPLOYEE. RIGHT OUTER JOIN and FULL OUTER JOIN work analogously.

TEMP ← (EMPLOYEE ]><| SSN=MGRSSN DEPARTMENT)
RESULT ← π FNAME, MINIT, LNAME, DNAME (TEMP)

Fig 11

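A LEFT OUTER JOIN can be sketched in Python as an inner join plus a padding step for unmatched left tuples. The helper `left_outer_join`, the use of `None` for null, and the two-employee sample data are assumptions for illustration.

```python
# Hypothetical sample rows; None stands for a null value.
EMPLOYEE = [
    {"SSN": "333445555", "LNAME": "Wong"},
    {"SSN": "123456789", "LNAME": "Smith"},
]
DEPARTMENT = [{"DNAME": "Research", "MGRSSN": "333445555"}]

def left_outer_join(r, s, cond, s_attrs):
    """Keep every tuple of r; pad the attributes of s with None
    when no tuple of s matches."""
    out = []
    for t1 in r:
        matches = [t2 for t2 in s if cond(t1, t2)]
        if matches:
            out.extend({**t1, **t2} for t2 in matches)
        else:
            out.append({**t1, **{a: None for a in s_attrs}})
    return out

TEMP = left_outer_join(
    EMPLOYEE, DEPARTMENT,
    lambda e, d: e["SSN"] == d["MGRSSN"],
    ["DNAME", "MGRSSN"],
)
RESULT = [(t["LNAME"], t["DNAME"]) for t in TEMP]
```

Smith manages no department, so his tuple is kept with a null DNAME instead of being dropped as an inner join would do.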
OUTER UNION operation
● Takes the UNION of tuples from 2 relations that are only partially union compatible, e.g. R(X, Y) and S(X, Z)
● The union-compatible attributes are represented only once in the result; the other attributes are also kept, giving the result relation T(X, Y, Z)

Example:
● Apply OUTER UNION to the 2 relations STUDENT(Name, SSN, Department, Advisor) and INSTRUCTOR(Name, SSN, Department, Rank).
● Result relation:
– STUDENT_OR_INSTRUCTOR(Name, SSN, Department, Advisor, Rank)
– Tuples with the same (Name, SSN, Department) appear only once
– Tuples only in STUDENT have a null for the Rank attribute
– Tuples only in INSTRUCTOR have a null for the Advisor attribute

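The STUDENT and INSTRUCTOR example can be mimicked in Python by merging tuples on their shared attributes. This is only a sketch: `outer_union` is a hypothetical helper, the rows are invented, and `None` stands for null.

```python
# Hypothetical sample rows.
STUDENT = [
    {"Name": "Ann", "SSN": "1", "Department": "CS", "Advisor": "Lee"},
    {"Name": "Bob", "SSN": "2", "Department": "EE", "Advisor": "Kim"},
]
INSTRUCTOR = [
    {"Name": "Bob", "SSN": "2", "Department": "EE", "Rank": "Lecturer"},
    {"Name": "Cy", "SSN": "3", "Department": "CS", "Rank": "Professor"},
]
COMMON = ("Name", "SSN", "Department")

def outer_union(r, s, common, r_only, s_only):
    """Merge tuples that agree on the common attributes; attributes
    that a tuple does not supply stay None."""
    extra = r_only + s_only
    merged = {}
    for t in r + s:
        key = tuple(t[a] for a in common)
        row = merged.setdefault(
            key, {**dict(zip(common, key)), **{a: None for a in extra}})
        row.update({a: t[a] for a in extra if a in t})
    return list(merged.values())

result = outer_union(STUDENT, INSTRUCTOR, COMMON, ["Advisor"], ["Rank"])
by_name = {t["Name"]: t for t in result}
```

Bob appears in both relations and ends up as a single tuple with both Advisor and Rank filled in; Ann has a null Rank and Cy a null Advisor.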

What is Relational Calculus?
● Provides a higher-level declarative notation for specifying relational queries
– e.g. a new relation is specified in terms of variables that range over rows (tuples) or over column values (domains)
– Based on predicate calculus
– No order of operations is specified for producing the result (non-procedural language)

● Tuple Relational Calculus
– SQL is based on tuple relational calculus

● Domain Relational Calculus
– QBE is based on domain calculus


What is Tuple Relational Calculus?
● A single declarative expression specifies a retrieval request; the emphasis is on what is to be retrieved, not on how to retrieve it
● A query language such as SQL is relationally complete when it can express any query that can be expressed in relational calculus (equivalently, in basic relational algebra)
● A formula (condition) is made up of predicate calculus atoms connected via the logical operators AND, OR, and NOT


Tuple Relational Calculus
Tuple Variables and Range Relations
Expressions, Formulas
Existential and Universal Quantifiers
– Queries examples using Ex-Quantifier
Transforming Universal and Existential
Quantifiers
Using the Universal Quantifier
Safe Expressions
Tuple Variables and Range Relations
● Example of a simple tuple relational calculus query
– {t | COND(t)}
– where t is a tuple variable
– COND(t) is a conditional expression involving t
– result = all tuples t that satisfy COND(t)

– Query 1: retrieve all attributes of each EMPLOYEE whose salary is > $50,000
{t | EMPLOYEE(t) AND t.SALARY>50000}

– Query 2: retrieve only 2 attribute values (first and last name) for each selected EMPLOYEE tuple t
{t.FNAME, t.LNAME | EMPLOYEE(t) AND t.SALARY>50000}

– Query 3: retrieve the birthdate and address of John B. Smith
{t.BDATE, t.ADDRESS | EMPLOYEE(t) AND t.FNAME=‘John’ AND t.MINIT=‘B’ AND t.LNAME=‘Smith’}

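Tuple calculus expressions read much like Python comprehensions: the part before `|` is the output expression, the range relation becomes the `for` clause, and the remaining condition becomes the `if` clause. The sketch below uses a hypothetical three-attribute EMPLOYEE with sample values.

```python
from collections import namedtuple

# Hypothetical reduced schema and sample rows.
Employee = namedtuple("Employee", "FNAME LNAME SALARY")
EMPLOYEE = [
    Employee("John", "Smith", 30000),
    Employee("James", "Borg", 55000),
    Employee("Jennifer", "Wallace", 43000),
]

# Query 1: {t | EMPLOYEE(t) AND t.SALARY > 50000}
q1 = [t for t in EMPLOYEE if t.SALARY > 50000]

# Query 2: {t.FNAME, t.LNAME | EMPLOYEE(t) AND t.SALARY > 50000}
q2 = [(t.FNAME, t.LNAME) for t in EMPLOYEE if t.SALARY > 50000]
```

In both queries `t` is the tuple variable and EMPLOYEE is its range relation; only the output expression differs.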

Expressions and Formulas
● General expression:
– {t1.Aj, t2.Ak, …, tn.Am | COND(t1, t2, …, tn, tn+1, tn+2, …, tn+m)}
– where t1, t2, …, tn, tn+1, tn+2, …, tn+m are tuple variables

● A formula is made up of predicate calculus atoms. Each of the atoms below evaluates to either TRUE or FALSE. Atoms are connected via the logical operators AND, OR, NOT.
– R(ti)
where R is a relation name and ti is a tuple variable

– ti.A op tj.B
where op is one of the comparison operators in {=, <, ≤, >, ≥, ≠},
A is an attribute of the relation on which ti ranges, and
B is an attribute of the relation on which tj ranges

– ti.A op C or C op tj.B
where C is a constant value

Existential and Universal Quantifiers
● 1. Every atom is a formula

● 2. Truth values of formulas are derived from their component formulas
– F1 AND F2 is TRUE if both F1 and F2 are TRUE; otherwise, it is FALSE
– F1 OR F2 is FALSE if both F1 and F2 are FALSE; otherwise, it is TRUE
– NOT (F1) is TRUE if F1 is FALSE; it is FALSE if F1 is TRUE
– NOT (F2) is TRUE if F2 is FALSE; it is FALSE if F2 is TRUE

● 3. Existential Quantifier (∃ t)(F)
– The formula is TRUE if F evaluates to TRUE for some (at least one) tuple assigned to the free occurrences of t in F

● 4. Universal Quantifier (∀ t)(F)
– The formula is TRUE if F evaluates to TRUE for every tuple (in the universe) assigned to the free occurrences of t in F


Existential and Universal Quantifiers
● A tuple variable t is bound if it is quantified; otherwise it is free
● One type of quantifier can be transformed into the other using NOT, together with AND and OR, e.g. (∀ t)(F) ≡ NOT (∃ t)(NOT (F))
● Example:
– Query 1
Retrieve the name and address of all employees who work for the “Research” department
Q1: {t.FNAME, t.LNAME, t.ADDRESS | EMPLOYEE(t) AND (∃ d) (DEPARTMENT(d) AND d.DNAME=‘Research’ AND d.DNUMBER=t.DNO)}

– t is the only free variable (it appears to the left of | ), whereas d is bound by the existential quantifier
– The conditions EMPLOYEE(t) and DEPARTMENT(d) specify the range relations for t and d
– The SELECT condition is d.DNAME=‘Research’
– The JOIN condition is d.DNUMBER=t.DNO

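In Python, the existential quantifier corresponds to `any(...)` and the universal quantifier to `all(...)`, which makes Q1 and the quantifier transformation easy to try out. The relations below are cut down to two attributes each and filled with hypothetical rows; `exists` and `forall` are illustrative helper names.

```python
# Hypothetical reduced relations.
EMPLOYEE = [{"LNAME": "Smith", "DNO": 5}, {"LNAME": "Wallace", "DNO": 4}]
DEPARTMENT = [
    {"DNAME": "Research", "DNUMBER": 5},
    {"DNAME": "Administration", "DNUMBER": 4},
]

# Q1: t is free (it ranges over EMPLOYEE); d is bound by "there exists".
q1 = [
    t["LNAME"]
    for t in EMPLOYEE
    if any(d["DNAME"] == "Research" and d["DNUMBER"] == t["DNO"]
           for d in DEPARTMENT)
]

def exists(pred, relation):
    return any(pred(x) for x in relation)

def forall(pred, relation):
    return all(pred(x) for x in relation)

is_research = lambda d: d["DNAME"] == "Research"

# (forall d)(F) is equivalent to NOT (exists d)(NOT F)
lhs = forall(is_research, DEPARTMENT)
rhs = not exists(lambda d: not is_research(d), DEPARTMENT)
```

The selection condition and the join condition from Q1 both sit inside the `any(...)`, mirroring the scope of (∃ d) in the formula.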

What is Domain Relational Calculus?
● Rather than having variables range over tuples, as in tuple calculus, domain calculus uses variables that range over single values from the domains of attributes

● A query language such as QBE (Query-By-Example) is based on domain relational calculus

● General expression:
– {x1, x2, …, xn | COND(x1, x2, …, xn, xn+1, xn+2, …, xn+m)}
where x1, x2, …, xn, xn+1, xn+2, …, xn+m are domain variables that range over domains (of attributes) and COND is a condition or formula of the domain relational calculus


Domain Relational Calculus Query Example
● Query 1
– Retrieve the name and address of all employees who work for the “Research” department
– Q1: {qsv | (∃ z) (∃ l) (∃ m) (EMPLOYEE(qrstuvwxyz) AND DEPARTMENT(lmno) AND l=‘Research’ AND m=z)}

– First specify the requested attributes by the free occurrences of domain variables: q=FNAME, s=LNAME, v=ADDRESS, where qrstuvwxyz are the 10 variables for the EMPLOYEE relation, one ranging over the domain of each attribute in order, and lmno are the 4 variables for the DEPARTMENT relation
– m=z is a join condition (joining on DNUMBER and DNO)
– l=‘Research’ is a selection condition that relates a domain variable to a constant, in this case DNAME=‘Research’
– The existential quantifiers (∃ z) (∃ l) (∃ m) bind the variables that take part in a condition; the result contains the (q, s, v) values for which the formula is TRUE
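
Domain calculus can be imitated in Python by unpacking each tuple into one variable per attribute. The sketch below uses a reduced, hypothetical schema (4 attributes for EMPLOYEE and 2 for DEPARTMENT instead of the 10 and 4 named in the example) and sample rows, keeping the variable letters q, s, v, z, l and m.

```python
# Positional tuples, one variable per attribute when unpacked.
# EMPLOYEE: (FNAME, LNAME, ADDRESS, DNO); DEPARTMENT: (DNAME, DNUMBER).
# Reduced schema and rows are hypothetical.
EMPLOYEE = [
    ("John", "Smith", "Houston", 5),
    ("Jennifer", "Wallace", "Bellaire", 4),
]
DEPARTMENT = [("Research", 5), ("Administration", 4)]

# {q s v | (exists z)(exists l)(exists m)
#          (EMPLOYEE(q s v z) AND DEPARTMENT(l m)
#           AND l = 'Research' AND m = z)}
q1 = [
    (q, s, v)
    for (q, s, v, z) in EMPLOYEE
    if any(l == "Research" and m == z for (l, m) in DEPARTMENT)
]
```

Each variable holds a single attribute value rather than a whole tuple, which is the difference from the tuple calculus version of the same query.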