
The Relational Algebra and Relational Calculus

Silberschatz Text: 6
What are Relational Algebra, Relational Calculus?

 Relational Algebra
– A basic set of operations to manipulate the database
– Procedural, specify how to retrieve, incorporated in SQL
– Provides formal foundation for relational model operations
– Used as a basis for implementing and optimizing queries in
RDBMS

 Relational Calculus
– A higher-level declarative notation for specifying queries
– Non-procedural, specify only what is to be retrieved
– Firm basis in mathematical logic
– Tuple Calculus: variables range over rows (tuples)
– Domain Calculus: variables range over the values of columns (attribute domains)
R.Algebra Slide -2
Topics to cover
 Basic Relational Algebra Operations
– Set (∪, ∩, –, X)
– Specific for relational database (σ, π, ⋈ , ρ, ÷)
– Unary (single relation) and Binary (2 relations)
 Additional Relational Algebra Operations
– Aggregate Functions and Grouping ℑ
– Recursive Closure Operations
– OUTER JOIN Operations ]><[
– The OUTER UNION Operation
 Tuple Relational Calculus
– SQL based on Relational calculus
 Domain Relational Calculus
– QBE based on Domain Calculus

Relational Algebra Operations
1. SET OPERATIONS
 UNION ∪, INTERSECTION ∩
 SET DIFFERENCE (MINUS) –
 CARTESIAN (CROSS JOIN) PRODUCT X

2. SPECIFIC OPERATIONS
 SELECT σ, PROJECT π (unary operations)
 JOIN ⋈ (various types: θ, =, *) (binary operations)
 OTHERS
 Sequence, rename ρ (unary operations)
 Division ÷ (binary operations)

SELECT OPERATION
 UNARY implies it is applied to a single relation and to each tuple individually
 Select a subset of tuples from a relation that satisfy a selection condition
 σ<selection condition>(R)
 σ (sigma) = SELECT operation
 selection condition = Boolean expression
 <attribute name> <comparison op> <constant value>, or
<attribute name> <comparison op> <attribute name>
<comparison op> = {=,<, ≤,>, ≥, ≠}
<constant value> = constant value from the attribute domain
 Can connect clauses by AND, OR and NOT
 The degree of the relation resulting from SELECT is the same as the degree of R
 Number of resulting tuples <= number of tuples in R
 Commutative, i.e. can be applied in any order: σ<cond1>(σ<cond2>(R)) = σ<cond2>(σ<cond1>(R))
 Can combine a cascade of SELECT operations into a single SELECT operation
with a conjunctive (AND) condition
 σ<cond1>(σ<cond2>(…(σ<condn>(R))…)) = σ<cond1> AND <cond2> AND … AND <condn>(R)
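The commutativity and cascade rules above can be checked on a toy relation. This is only a sketch: relations are modelled as lists of dicts, the rows are illustrative sample data (not the figure's contents), and `select`, `in_dept_4` and `high_salary` are hypothetical helper names.

```python
# Illustrative sample rows; not taken from the COMPANY figures.
EMPLOYEE = [
    {"LNAME": "Smith", "DNO": 5, "SALARY": 30000},
    {"LNAME": "Wong", "DNO": 5, "SALARY": 40000},
    {"LNAME": "Zelaya", "DNO": 4, "SALARY": 25000},
    {"LNAME": "Wallace", "DNO": 4, "SALARY": 43000},
]

def select(condition, relation):
    """sigma<condition>(relation): keep the tuples for which condition is true."""
    return [t for t in relation if condition(t)]

def in_dept_4(t):
    return t["DNO"] == 4

def high_salary(t):
    return t["SALARY"] > 25000

# Cascade in both orders, then one SELECT with the conjunctive condition.
a = select(in_dept_4, select(high_salary, EMPLOYEE))
b = select(high_salary, select(in_dept_4, EMPLOYEE))
c = select(lambda t: in_dept_4(t) and high_salary(t), EMPLOYEE)
```

All three results contain the same tuples, and each result has the same attributes (degree) as EMPLOYEE.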
PROJECT OPERATION
 UNARY implies it is applied to a single relation and to each tuple individually
 Select certain columns (attributes) from the table and discards the other columns
 π <attribute list>(R)
 π (pi) = PROJECT operation
 e.g. π LNAME, FNAME, SALARY (EMPLOYEE)
 Project (choose) only the attributes specified in <attribute list>, in the same order as they
appear in the list
 Duplicate elimination: if the list contains only non-key attributes of R, duplicate tuples are removed from the result
 Number of resulting tuples <= number of tuples in R
 If projection list is a superkey of R, the resulting relation has the same number of
tuples as R
 If <list2> contains all the attributes in <list1>: π <list1> (π <list2> (R)) = π <list1> (R)
 Commutativity does not hold for the PROJECT operation
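Duplicate elimination and the superkey rule can be illustrated with a small sketch. The `project` helper and the three sample rows are assumptions made for this example; tuples are returned as Python tuples in the order of the attribute list.

```python
def project(attrs, relation):
    """pi<attrs>(relation): keep the listed attributes, in list order, without duplicates."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in seen:  # duplicate elimination
            seen.add(row)
            result.append(row)
    return result

# Illustrative sample rows; SSN plays the role of the key.
EMPLOYEE = [
    {"SSN": "1", "SEX": "F", "SALARY": 25000},
    {"SSN": "2", "SEX": "F", "SALARY": 25000},
    {"SSN": "3", "SEX": "M", "SALARY": 40000},
]

sex_salary = project(["SEX", "SALARY"], EMPLOYEE)  # non-key attributes only
with_key = project(["SSN", "SEX"], EMPLOYEE)       # list contains the key
```

Projecting on non-key attributes collapses the two (F, 25000) rows into one, while a list that includes the key keeps one result tuple per original tuple.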

Results of SELECT and PROJECT operations.
Refer to COMPANY database schema
SELECT
(a): σ(DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000)(EMPLOYEE).

PROJECT
(b): πLNAME, FNAME, SALARY(EMPLOYEE).
(c): πSEX, SALARY(EMPLOYEE).

Only one entry of (F, 25000) appears instead of 2 (duplicates are eliminated)

Fig 1
SEQUENCES OF OPERATIONS

 Nesting operations into one single relational algebra expression:
– e.g. 2a: π FNAME, LNAME, SALARY (σ DNO=5 (EMPLOYEE))

 Sometimes it is simpler to apply one operation at a time and create intermediate result relations, giving a name to each intermediate relation:
– DEP5_EMPS ← σ DNO=5 (EMPLOYEE)
– RESULT ← π FNAME, LNAME, SALARY (DEP5_EMPS)

Results of a sequence of operations.
(a) πFNAME, LNAME, SALARY(σDNO=5(EMPLOYEE)).
(b) Using intermediate relations and renaming of attributes.

Fig 2
RENAME OPERATIONS
● Sometimes the attributes in the intermediate result relations are
renamed
– TEMP ← σ DNO=5 (EMPLOYEE)
– ρS(FIRSTNAME, LASTNAME, SALARY) (π FNAME, LNAME, SALARY (TEMP))

● RENAME Operation
– ρ (rho)
– S is new relation name
– B1, B2, …, Bn are new attribute names

● Can rename either the relation name or the attribute names, or both
– ρS(B1, B2, …, Bn)(R) renames both the relation and its attributes
– ρS(R) renames the relation only
– ρ(B1, B2, …, Bn)(R) renames the attributes only
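The three forms of RENAME can be sketched as one function with optional arguments. The `(name, attrs, rows)` triple used to model a relation, the `rename` helper and the sample row are all assumptions of this example.

```python
def rename(relation, new_name=None, new_attrs=None):
    """Sketch of rho: relation is a (name, attrs, rows) triple; rows are left untouched."""
    name, attrs, rows = relation
    if new_attrs is not None and len(new_attrs) != len(attrs):
        raise ValueError("need exactly one new name per attribute")
    return (
        new_name if new_name is not None else name,
        tuple(new_attrs) if new_attrs is not None else attrs,
        rows,
    )

# Illustrative sample relation.
TEMP = ("TEMP", ("FNAME", "LNAME", "SALARY"), [("John", "Smith", 30000)])

both = rename(TEMP, "S", ("FIRSTNAME", "LASTNAME", "SALARY"))            # rho S(B1..Bn)(R)
relation_only = rename(TEMP, new_name="S")                               # rho S(R)
attrs_only = rename(TEMP, new_attrs=("FIRSTNAME", "LASTNAME", "SALARY")) # rho (B1..Bn)(R)
```

Renaming is positional: the i-th new name replaces the i-th attribute, and the tuples themselves do not change.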
SET OPERATIONS

 BINARY operations: each applies to two sets of tuples

 UNION Operation (fig 4a, 4b)
– R ∪ S results in a new relation that includes all tuples that are either in R or in S or in both; duplicate tuples are eliminated

 INTERSECTION Operation (fig 4c)
– R ∩ S results in a new relation that includes all tuples that are in both R and S

 SET DIFFERENCE (MINUS) Operation (fig 4d, 4e)
– R – S results in a new relation that includes all tuples that are in R but not in S
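Because relations are sets of tuples, Python's built-in `set` type gives a direct, minimal illustration of these three operations. The two relations below hold made-up (first name, last name) tuples and are not the contents of Fig 4.

```python
# Illustrative sample tuples (FN, LN); both relations have the same degree and domains.
STUDENT = {("Susan", "Yao"), ("Ramesh", "Shah"), ("Amy", "Ford")}
INSTRUCTOR = {("Susan", "Yao"), ("John", "Smith")}

union = STUDENT | INSTRUCTOR            # in either relation, duplicates eliminated
intersection = STUDENT & INSTRUCTOR     # in both relations
student_only = STUDENT - INSTRUCTOR     # in STUDENT but not in INSTRUCTOR
instructor_only = INSTRUCTOR - STUDENT  # in INSTRUCTOR but not in STUDENT
```

The tuple that occurs in both inputs appears only once in the union, and the two differences are not equal.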



Results of the UNION operation
 Example: to retrieve the SSN of all employees who either work in department 5
or directly supervise an employee who works in department 5
– RESULT1: SSN of all employees who work in department 5
– RESULT2: SSN of all direct supervisors of employees who work in department 5
– RESULT: UNION operation, SSN in either RESULT1 or RESULT2

DEP5_EMPS ← σ DNO=5 (EMPLOYEE)
RESULT1 ← π SSN (DEP5_EMPS)
RESULT2(SSN) ← π SUPERSSN (DEP5_EMPS)
RESULT ← RESULT1 ∪ RESULT2

Fig 3


SET OPERATIONS

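 Union Compatibility: UNION, INTERSECTION and SET DIFFERENCE only apply to relations with the same type of tuples
– same degree n for R(A1, A2, …, An) and S(B1, B2, …, Bn)
– dom(Ai) = dom(Bi) for 1 ≤ i ≤ n

 Commutative operations:
– R ∪ S = S ∪ R and R ∩ S = S ∩ R
– SET DIFFERENCE is not commutative: in general R – S ≠ S – R

 Associative operations:
– R ∪ (S ∪ T) = (R ∪ S) ∪ T and R ∩ (S ∩ T) = (R ∩ S) ∩ T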
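The union-compatibility test can be written as a small check on two schemas. In this sketch a schema is assumed to be a list of (attribute name, domain) pairs, with Python types standing in for domains; `union_compatible` and the three schemas are hypothetical.

```python
def union_compatible(schema_r, schema_s):
    """Same degree and pairwise equal domains; attribute names may differ."""
    if len(schema_r) != len(schema_s):
        return False
    return all(
        dom_r == dom_s
        for (_, dom_r), (_, dom_s) in zip(schema_r, schema_s)
    )

# Hypothetical schemas; Python types stand in for attribute domains.
STUDENT = [("FN", str), ("LN", str)]
INSTRUCTOR = [("FNAME", str), ("LNAME", str)]
PAY = [("LNAME", str), ("SALARY", int)]
```

STUDENT and INSTRUCTOR pass the check even though their attribute names differ, whereas STUDENT and PAY fail because the second domains do not match.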



SET operations UNION, INTERSECTION, MINUS

(a) Two union-compatible relations.


(b) STUDENT ∪ INSTRUCTOR.
(c) STUDENT ∩ INSTRUCTOR.
(d) STUDENT – INSTRUCTOR.
(e) INSTRUCTOR – STUDENT

Fig 4
SET Operation:
The CARTESIAN PRODUCT (CROSS JOIN)

CROSS JOIN = X
R(A1, A2, …, An) X S(B1, B2, …, Bm)

result = Q, with degree n+m attributes Q(A1, A2, …, An, B1, B2, …, Bm), in that order

Q has one tuple for each combination of one tuple from R and one tuple from S

If R has nR tuples and S has nS tuples, then R X S will have nR x nS tuples
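The degree and cardinality claims can be verified with `itertools.product` from the Python standard library. The relations R and S below are made-up, and `cartesian_product` is a hypothetical helper.

```python
from itertools import product

def cartesian_product(r, s):
    """R X S: concatenate every tuple of R with every tuple of S."""
    return [tr + ts for tr, ts in product(r, s)]

# Illustrative sample relations: R has degree 2, S has degree 1.
R = [("a1", 1), ("a2", 2), ("a3", 3)]
S = [("x",), ("y",)]

Q = cartesian_product(R, S)
```

Here nR = 3 and nS = 2, so Q holds 6 tuples, each of degree 2 + 1 = 3 with R's attributes first.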
Fig 5a
SET Operation:
The CARTESIAN PRODUCT (CROSS JOIN)
Example: to retrieve a list of the names of each female employee’s dependents

FEMALE_EMPS ← σ SEX=‘F’ (EMPLOYEE)


EMPNAMES ← π FNAME, LNAME, SSN (FEMALE_EMPS)
EMP_DEPENDENTS ← EMPNAMES X DEPENDENT
ACTUAL_DEPENDENTS ← σ SSN=ESSN (EMP_DEPENDENTS)
RESULT ← π FNAME, LNAME, DEPENDENT_NAME (ACTUAL_DEPENDENTS )

EMPNAMES is cross-joined with DEPENDENT


Fig 5a
SET Operation: Cross JOIN
EMP_DEPENDENTS ← EMPNAMES X DEPENDENT

Notice: in a Cartesian Product (cross join), ALL combinations of tuples are included in the result

Fig 5b
SET Operation:
SELECT tuple(s) and PROJECT desired attributes from the
CROSS JOIN result EMP_DEPENDENTS
ACTUAL_DEPENDENTS ← σ SSN=ESSN (EMP_DEPENDENTS)
RESULT ← π FNAME, LNAME, DEPENDENT_NAME (ACTUAL_DEPENDENTS )

Example: Out of the 3 female employees, only SSN 987654321, Jennifer Wallace, has a dependent, whose name is Abner.

Fig 5c
BINARY JOIN OPERATIONS – INNER JOINS
 Only matching tuples of inner joins ⋈ are kept in result relation

 THETA JOIN, ⋈ θ
– R(A1, A2, …, An) JOIN <join condition> S(B1, B2, …, Bm), result
= Q(A1, …, An, B1, …, Bm)
– Only combinations of tuples that satisfy the join condition
appear in the result
– Each <join condition> is of the form Ai θ Bj, where Ai and Bj
have the same domain. θ (theta) is one of the comparison
operators {=, <, ≤, >, ≥, ≠}

 EQUIJOIN
– Similar to theta-join but only comparison operator used in the
<join condition> is =
– Join result carries superfluous attributes
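A theta join behaves like a Cartesian product followed by a selection, which the sketch below makes explicit. The `theta_join` helper is hypothetical, the second row of each relation is invented for illustration, and merging dicts assumes the two relations have no attribute names in common.

```python
from itertools import product

def theta_join(r, s, condition):
    """R JOIN<condition> S: merged pairs of tuples that satisfy the condition."""
    # Assumes R and S have disjoint attribute names, so the dict merge loses nothing.
    return [{**tr, **ts} for tr, ts in product(r, s) if condition(tr, ts)]

# Illustrative sample rows.
EMPNAMES = [
    {"FNAME": "Jennifer", "LNAME": "Wallace", "SSN": "987654321"},
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SSN": "999887777"},
]
DEPENDENT = [
    {"ESSN": "987654321", "DEPENDENT_NAME": "Abner"},
    {"ESSN": "123456789", "DEPENDENT_NAME": "Alice"},
]

# EQUIJOIN on SSN = ESSN.
actual_dependents = theta_join(EMPNAMES, DEPENDENT, lambda e, d: e["SSN"] == d["ESSN"])

# The same result as a cross join followed by a SELECT.
cross = theta_join(EMPNAMES, DEPENDENT, lambda e, d: True)
filtered = [t for t in cross if t["SSN"] == t["ESSN"]]
```

The equijoin result still carries both SSN and ESSN with identical values, which is the superfluous attribute mentioned above.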
BINARY JOIN OPERATIONS – INNER JOINS
 NATURAL JOIN *
– similar to EQUIJOIN, but superfluous attributes are removed
– Both join attributes must have the same name; if not, apply a
renaming operation first
– Q ← R *(<list1>),(<list2>) S

 n-way JOIN for EQUI and NATURAL joins, specified among multiple tables
– e.g. 3-way join:
– ((PROJECT JOIN DNUM=DNUMBER DEPARTMENT) JOIN
MGRSSN=SSN EMPLOYEE)
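A natural join can be sketched as an equijoin on every attribute name the two relations share, keeping a single copy of each shared attribute. The `natural_join` helper and the sample rows are assumptions; the helper reads the attribute names from the first tuple of each relation.

```python
def natural_join(r, s):
    """R * S: equijoin on all same-named attributes, keeping one copy of each."""
    if not r or not s:
        return []
    # Attribute names are taken from the first tuple of each relation.
    common = [a for a in r[0] if a in s[0]]
    return [
        {**tr, **ts}
        for tr in r
        for ts in s
        if all(tr[a] == ts[a] for a in common)
    ]

# Illustrative sample rows; DNUMBER is the shared join attribute.
DEPARTMENT = [
    {"DNAME": "Research", "DNUMBER": 5},
    {"DNAME": "Administration", "DNUMBER": 4},
]
DEPT_LOCATIONS = [
    {"DNUMBER": 5, "DLOCATION": "Houston"},
    {"DNUMBER": 5, "DLOCATION": "Bellaire"},
    {"DNUMBER": 4, "DLOCATION": "Stafford"},
]

DEPT_LOCS = natural_join(DEPARTMENT, DEPT_LOCATIONS)
```

Each result tuple has DNUMBER once. If the relations shared no attribute name, this sketch would return the full Cartesian product.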



BINARY THETA JOIN θ:
JOIN combines related tuples from 2 relations into single tuples

Result of the JOIN operation

DEPT_MGR ← DEPARTMENT JOIN ⋈ MGRSSN=SSN EMPLOYEE

Previous example of cross join in Fig 5b and 5c:

EMP_DEPENDENTS ← EMPNAMES X DEPENDENT
ACTUAL_DEPENDENTS ← σ SSN=ESSN (EMP_DEPENDENTS)

Can be replaced by:

ACTUAL_DEPENDENTS ← EMPNAMES JOIN ⋈ SSN=ESSN DEPENDENT

Fig 6
Results of two NATURAL JOIN * operations.
(a) RENAME attributes before doing JOIN
DEPT ← ρ(DNAME,DNUM,MGRSSN,MGRSTARTDATE)(DEPARTMENT)

PROJ_DEPT ← PROJECT * DEPT.


(b) DEPT_LOCS ← DEPARTMENT * DEPT_LOCATIONS.
Join attribute

Fig 7
The DIVISION ÷ operation

DIVISION operation is applied to 2 relations

R(Z) ÷ S(X), where X ⊆ Z, Y = Z – X (i.e. the set of attributes of R that are not attributes of S)

The result T(Y) contains a tuple t if t appears in R combined with every tuple of S

DIVISION operation can be expressed as a sequence of π, X, and – operations

T1 ← π Y(R)
T2 ← π Y((S X T1) – R)
T ← T1 – T2

SQL does not implement DIVISION directly; it has to be expressed in a roundabout way.
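The T1, T2, T expansion above can be traced step by step on a tiny relation. In this sketch R is assumed to be a set of (y, x) pairs and S a set of x values; `divide` and the sample data are hypothetical.

```python
def divide(r, s):
    """R(Y, X) / S(X), with R a set of (y, x) pairs and S a set of x values."""
    t1 = {y for (y, _) in r}                             # T1 <- pi_Y(R)
    t2 = {y for y in t1 for x in s if (y, x) not in r}   # T2 <- pi_Y((S X T1) - R)
    return t1 - t2                                       # T  <- T1 - T2

# Illustrative sample data: (ESSN, PNO) pairs and the PNOs to divide by.
SSN_PNOS = {("e1", 1), ("e1", 2), ("e2", 1), ("e3", 1), ("e3", 2), ("e3", 3)}
SMITH_PNOS = {1, 2}

SSNS = divide(SSN_PNOS, SMITH_PNOS)

# Direct reading of the definition, for comparison.
direct = {
    y for (y, _) in SSN_PNOS
    if all((y, x) in SSN_PNOS for x in SMITH_PNOS)
}
```

T2 collects the employees that miss at least one of the required projects (here only e2), so subtracting it from T1 leaves those who work on all of them.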

Fig 8b
The DIVISION ÷ operation
Example Fig 8a: Retrieve the names of employees who work on all the projects
that “John Smith” works on:
Step 1: retrieve the list of PNO that John Smith works on
SMITH ← σ FNAME=‘JOHN’ AND LNAME=‘SMITH’ (EMPLOYEE)
SMITH_PNOS ← π PNO (WORKS_ON JOIN ⋈ ESSN=SSN SMITH)
Step 2: Create a relation that includes a tuple <ESSN, PNO> for each employee/project assignment
SSN_PNOS ← π ESSN, PNO (WORKS_ON)
Step 3: Divide SSN_PNOS by SMITH_PNOS to get the desired SSNS
SSNS(SSN) ← SSN_PNOS ÷ SMITH_PNOS
RESULT ← π FNAME, LNAME (SSNS * EMPLOYEE)

T←R÷S

Fig 8
Additional Relational Operations
Some common database requests cannot be performed with the
basic algebra operations (in the previous list) and require the
following additional relational algebra operations:

Aggregate Functions and Grouping ℑ
Recursive Closure Operations
OUTER JOIN Operations
The OUTER UNION Operation



AGGREGATE FUNCTION / GROUPING operation

<grouping attributes> ℑ <function list> (R)

where:
grouping attributes: list of attributes of relation R
function list: list of (<function> <attribute>) pairs, where
<function> is one of SUM, AVERAGE, MAXIMUM, MINIMUM, COUNT



The AGGREGATE FUNCTION operation.
Example: fig 9a, retrieve each department number, the number of employees in the
department and their average salary; a list of attribute names is specified with rename

ρ R(DNO, NO_OF_EMPLOYEES, AVERAGE_SAL) (DNO ℑ COUNT SSN, AVERAGE SALARY (EMPLOYEE))

Example: fig 9b, no renaming applied; the attributes of the resulting relation will each be the concatenation
of the function name with the attribute name

DNO ℑ COUNT SSN, AVERAGE SALARY (EMPLOYEE)

Example: fig 9c, no grouping attributes specified; functions are applied to all tuples

ℑ COUNT SSN, AVERAGE SALARY (EMPLOYEE)
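Grouping followed by COUNT and AVERAGE can be sketched with a dictionary keyed by the grouping attributes. The `group_aggregate` helper and the three sample rows are assumptions; an empty grouping list puts every tuple in a single group, as in the third example above.

```python
from collections import defaultdict

def group_aggregate(relation, group_attrs, value_attr):
    """Group by group_attrs, then return (COUNT of tuples, AVERAGE of value_attr) per group."""
    groups = defaultdict(list)
    for t in relation:
        key = tuple(t[a] for a in group_attrs)  # () when there are no grouping attributes
        groups[key].append(t[value_attr])
    return {key: (len(vals), sum(vals) / len(vals)) for key, vals in groups.items()}

# Illustrative sample rows.
EMPLOYEE = [
    {"SSN": "1", "DNO": 5, "SALARY": 30000},
    {"SSN": "2", "DNO": 5, "SALARY": 40000},
    {"SSN": "3", "DNO": 4, "SALARY": 25000},
]

by_dept = group_aggregate(EMPLOYEE, ["DNO"], "SALARY")  # DNO F COUNT, AVERAGE SALARY
overall = group_aggregate(EMPLOYEE, [], "SALARY")       # F COUNT, AVERAGE SALARY
```

Department 5 yields a count of 2 and an average of 35000, and the ungrouped call yields one result tuple covering all three employees.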


Fig 9
Recursive Closure Operations
Query:
Specify the SSNs of all employees
supervised by the employee whose name is
James Borg (i.e. Borg’s division).

BORG_SSN ← π SSN (σ FNAME=‘James’ AND LNAME=‘Borg’ (EMPLOYEE))
SUPERVISION(SSN1, SSN2) ← π SSN, SUPERSSN (EMPLOYEE)

Level 1, employees supervised directly by Borg:
RESULT1(SSN) ← π SSN1 (SUPERVISION JOIN ⋈ SSN2=SSN BORG_SSN)

Level 2, employees supervised by someone in RESULT1:
RESULT2(SSN) ← π SSN1 (SUPERVISION JOIN ⋈ SSN2=SSN RESULT1)
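The algebra expressions above stop after a fixed number of levels; reaching every level requires repeating the join until no new employees appear. A minimal sketch of that loop follows, with a hypothetical `all_supervisees` helper and made-up SSN placeholders.

```python
def all_supervisees(supervision, boss):
    """supervision: set of (ssn, super_ssn) pairs; returns everyone below boss at any level."""
    result = set()
    frontier = {boss}
    while frontier:
        # One join step: employees whose supervisor is in the current frontier.
        next_level = {e for (e, s) in supervision if s in frontier} - result
        result |= next_level
        frontier = next_level
    return result

# Illustrative sample pairs: e2 and e3 report to e1, e4 to e2, e5 to e4.
SUPERVISION = {("e2", "e1"), ("e3", "e1"), ("e4", "e2"), ("e5", "e4")}

level1 = {e for (e, s) in SUPERVISION if s == "e1"}    # corresponds to RESULT1
level2 = {e for (e, s) in SUPERVISION if s in level1}  # corresponds to RESULT2
everyone = all_supervisees(SUPERVISION, "e1")
```

Subtracting the employees already found keeps the loop from running forever if the supervision data happened to contain a cycle.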

Fig 10
LEFT, RIGHT, FULL OUTER JOINS
 LEFT OUTER JOIN
– keep every tuple in the first, or left, relation R in R ]><| S; if
no matching tuple is found in S, then the attributes of S in the join
result are filled or “padded” with null values

 RIGHT OUTER JOIN
– keep every tuple in the second, or right, relation S in R |><[ S; if
no matching tuple is found in R, the attributes of R are padded with null values

 FULL OUTER JOIN
– keep ALL tuples in both the left and right relations in R ]><[ S;
when no matching tuples are found, pad them with null
values as needed.
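The padding behaviour of a LEFT OUTER JOIN can be sketched as follows, with `None` standing in for null. The `left_outer_join` helper and the sample rows are assumptions, and the attribute names of the right relation are passed in explicitly so that padding works even when nothing matches.

```python
def left_outer_join(r, s, condition, s_attrs):
    """Keep every tuple of r; pad the attributes of s with None when no match exists."""
    result = []
    for tr in r:
        matches = [ts for ts in s if condition(tr, ts)]
        if matches:
            result.extend({**tr, **ts} for ts in matches)
        else:
            result.append({**tr, **{a: None for a in s_attrs}})
    return result

# Illustrative sample rows.
EMPLOYEE = [{"LNAME": "Wong", "SSN": "1"}, {"LNAME": "Smith", "SSN": "2"}]
DEPARTMENT = [{"DNAME": "Research", "MGRSSN": "1"}]

TEMP = left_outer_join(
    EMPLOYEE, DEPARTMENT,
    lambda e, d: e["SSN"] == d["MGRSSN"],
    ["DNAME", "MGRSSN"],
)
```

A RIGHT OUTER JOIN can be obtained from the same function by swapping the two relations and the argument order of the condition.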



The result of a LEFT OUTER JOIN operation.
Example: Retrieve a list of employee names and also the names of the departments
they manage, if they happen to manage a department; null indicates those not
managing a department. Here, a LEFT OUTER JOIN keeps every tuple in the
‘left’ relation, i.e. EMPLOYEE. RIGHT OUTER JOIN and FULL OUTER JOIN work analogously.

TEMP ← (EMPLOYEE LEFTJOIN ]><| SSN=MGRSSN DEPARTMENT)

RESULT ← π FNAME, MINIT, LNAME, DNAME (TEMP)

Fig 11
OUTER UNION operation
UNION of tuples from 2 relations with partially UNION compatible attributes,
e.g. R(X, Y) and S (X, Z)

UNION compatible attributes are represented once in the result, others are kept in
the result relation T (X, Y, Z)

Example:
 Apply OUTER UNION to 2 relations STUDENT (Name, SSN, Department,
Advisor) and INSTRUCTOR(Name, SSN, Department, Rank).

 Result relation:
 STUDENT_OR_INSTRUCTOR (Name, SSN, Department, Advisor, Rank)
 Tuples with same (Name, SSN, Department) will appear once only
 Tuples only in STUDENT will have a null for the Rank attribute
 Tuples only in INSTRUCTOR will have a null for the Advisor attribute
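The STUDENT and INSTRUCTOR example can be sketched by merging tuples on their shared attributes. The `outer_union` helper and the sample rows are assumptions, and `None` stands in for null.

```python
def outer_union(r, r_attrs, s, s_attrs):
    """Merge tuples that agree on the shared attributes; pad missing attributes with None."""
    shared = [a for a in r_attrs if a in s_attrs]
    all_attrs = list(r_attrs) + [a for a in s_attrs if a not in r_attrs]
    merged = {}
    for t in list(r) + list(s):
        key = tuple(t[a] for a in shared)
        row = merged.setdefault(key, dict.fromkeys(all_attrs))  # all values start as None
        row.update(t)
    return list(merged.values())

# Illustrative sample rows.
STUDENT_ATTRS = ["Name", "SSN", "Department", "Advisor"]
INSTRUCTOR_ATTRS = ["Name", "SSN", "Department", "Rank"]
STUDENT = [
    {"Name": "Ann", "SSN": "1", "Department": "CS", "Advisor": "Lee"},
    {"Name": "Bob", "SSN": "2", "Department": "EE", "Advisor": "Kim"},
]
INSTRUCTOR = [
    {"Name": "Ann", "SSN": "1", "Department": "CS", "Rank": "Lecturer"},
    {"Name": "Cy", "SSN": "3", "Department": "CS", "Rank": "Professor"},
]

STUDENT_OR_INSTRUCTOR = outer_union(STUDENT, STUDENT_ATTRS, INSTRUCTOR, INSTRUCTOR_ATTRS)
```

Ann appears once with both Advisor and Rank filled in, Bob has no Rank, and Cy has no Advisor.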



What is Relational Calculus?
 Provides a higher-level declarative notation for specifying
relational queries.
– e.g. form new relations that are specified in terms of variables over
rows (tuples) or columns (domains)
– Based on predicate calculus
– No order of operations is specified for forming the new relations
(non-procedural language)

 Tuple Relational Calculus
– SQL is based on relational calculus

 Domain Relational Calculus
– QBE is based on Domain Calculus



What is Tuple Relational Calculus?
 Only one declarative expression is needed to specify a retrieval request;
the emphasis is on what is to be retrieved, not how to retrieve it

 Nonprocedural language, no description of how to evaluate a query

 A query language such as SQL is relationally complete when it can express
any retrieval that can be expressed in relational calculus
(or, equivalently, in basic relational algebra)

 A formula (condition) is made up of predicate calculus atoms
connected via the logical operators AND, OR, and NOT



Tuple Relational Calculus
Tuple Variables and Range Relations
Expressions, Formulas
Existential and Universal Quantifiers
– Query examples using the existential quantifier
Transforming Universal and Existential
Quantifiers
Using the Universal Quantifier
Safe Expressions
Tuple Variables and Range Relations
 Example of a simple tuple relational calculus query
– {t | COND(t)}
 where t is a tuple variable
 COND(t) is a conditional expression involving t
 result = all tuples t that satisfy COND(t)

– Query 1: retrieve all attributes of each EMPLOYEE whose salary is > $50,000
 {t | EMPLOYEE(t) AND t.SALARY>50000}

– Query 2: retrieve 2 attribute values for each selected EMPLOYEE tuple t
 {t.FNAME, t.LNAME | EMPLOYEE(t) AND t.SALARY>50000}

– Query 3: retrieve the birthdate and address of the employee John B. Smith
 {t.BDATE, t.ADDRESS | EMPLOYEE(t) AND t.FNAME=‘John’ AND
t.MINIT=‘B’ AND t.LNAME=‘Smith’}
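Python comprehensions have nearly the same shape as {t | COND(t)}, so the three queries can be mirrored directly. This is only an analogy: the EMPLOYEE rows below are made-up, and iterating over a list plays the role of the range condition EMPLOYEE(t).

```python
# Illustrative sample rows; values are invented for this sketch.
EMPLOYEE = [
    {"FNAME": "John", "MINIT": "B", "LNAME": "Smith",
     "BDATE": "1965-01-09", "ADDRESS": "1 Main St", "SALARY": 30000},
    {"FNAME": "Jane", "MINIT": "A", "LNAME": "Doe",
     "BDATE": "1970-02-02", "ADDRESS": "2 Oak Ave", "SALARY": 60000},
]

# Query 1: {t | EMPLOYEE(t) AND t.SALARY > 50000}
q1 = [t for t in EMPLOYEE if t["SALARY"] > 50000]

# Query 2: {t.FNAME, t.LNAME | EMPLOYEE(t) AND t.SALARY > 50000}
q2 = {(t["FNAME"], t["LNAME"]) for t in EMPLOYEE if t["SALARY"] > 50000}

# Query 3: {t.BDATE, t.ADDRESS | EMPLOYEE(t) AND name is John B. Smith}
q3 = {
    (t["BDATE"], t["ADDRESS"])
    for t in EMPLOYEE
    if t["FNAME"] == "John" and t["MINIT"] == "B" and t["LNAME"] == "Smith"
}
```

In each case the part before `for` corresponds to the requested attributes on the left of the bar, and the `if` clause corresponds to COND(t).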



Expressions and Formulas
 General expression:
– {t1.Aj, t2.Ak, …, tn.Am | COND(t1, t2, …, tn, tn+1, tn+2, …, tn+m)}
 where t1, t2, …, tn, tn+1, tn+2, …, tn+m are tuple variables

 A formula is made up of predicate calculus atoms. Each of the
atoms below evaluates to either TRUE or FALSE. Atoms are
connected via the logical operators AND, OR, NOT.
– R(ti)
 where R is a relation name and ti is a tuple variable

– ti.A op tj.B
 where op is one of the comparison operators in {=, <, ≤, >, ≥, ≠}
 A is an attribute of the relation on which ti ranges
 B is an attribute of the relation on which tj ranges

– ti.A op C or C op tj.B
 where C is a constant value
Existential and Universal Quantifiers
 1. Every atom is a formula

 2. Truth values of formulas are derived from their component formulas
– F1 AND F2 is TRUE if both F1 and F2 are TRUE; otherwise, it is FALSE
– F1 OR F2 is FALSE if both F1 and F2 are FALSE; otherwise, it is TRUE
– NOT (F1) is TRUE if F1 is FALSE; it is FALSE if F1 is TRUE
– NOT (F2) is TRUE if F2 is FALSE; it is FALSE if F2 is TRUE

 3. Existential Quantifier (∃ t) (F)
– The formula is TRUE if (F) evaluates to TRUE for some (at least one) tuple
assigned to free occurrences of t in F

 4. Universal Quantifier (∀ t) (F)
– The formula is TRUE if (F) evaluates to TRUE for every (in the universe)
tuple assigned to free occurrences of t in F
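Over a finite collection, the two quantifiers correspond to Python's built-in `any` and `all`. The sketch below assumes a small made-up list of salaries as the universe; `exists`, `for_all` and the two predicates are hypothetical names. It also checks the standard identity that (∀ t)(F) is equivalent to NOT (∃ t)(NOT F).

```python
def exists(universe, formula):
    """(EXISTS t)(F): TRUE if F holds for at least one t in the universe."""
    return any(formula(t) for t in universe)

def for_all(universe, formula):
    """(FORALL t)(F): TRUE if F holds for every t in the universe."""
    return all(formula(t) for t in universe)

# Illustrative finite universe.
SALARIES = [25000, 30000, 43000]

def above_20k(s):
    return s > 20000

def above_40k(s):
    return s > 40000

some_above_40k = exists(SALARIES, above_40k)
all_above_40k = for_all(SALARIES, above_40k)
all_above_20k = for_all(SALARIES, above_20k)

# FORALL rewritten with EXISTS and NOT.
all_above_20k_via_exists = not exists(SALARIES, lambda s: not above_20k(s))
```

The rewritten form gives the same truth value as `for_all`, which is the kind of transformation between quantifiers that tuple calculus allows.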



Existential and Universal Quantifiers
 Tuple variable t is bound if it is quantified, otherwise it is free
 Can transform one type of quantifier into the other using NOT, AND, and OR,
e.g. (∀ x) (P(x)) ≡ NOT (∃ x) (NOT (P(x)))
 Example:
– Query 1
Retrieve the name and address of all employees who work for the “Research” department
Q1: {t.FNAME, t.LNAME, t.ADDRESS | EMPLOYEE(t) AND (∃ d)
(DEPARTMENT(d) AND d.DNAME=‘Research’ AND d.DNUMBER=t.DNO)}

– t is the only free variable (on the left of | ), whereas d is bound by the existential
quantifier.
– Conditions EMPLOYEE(t) and DEPARTMENT(d) specify the range relations
for t and d.
– SELECT condition is d.DNAME=‘Research’
– JOIN condition is d.DNUMBER=t.DNO
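Q1 can be mirrored in Python with a comprehension over the free variable t and `any` for the bound variable d. The rows below are made-up sample data, and iterating over each list stands in for the range conditions EMPLOYEE(t) and DEPARTMENT(d).

```python
# Illustrative sample rows.
EMPLOYEE = [
    {"FNAME": "John", "LNAME": "Smith", "ADDRESS": "1 Main St", "DNO": 5},
    {"FNAME": "Jane", "LNAME": "Doe", "ADDRESS": "2 Oak Ave", "DNO": 4},
]
DEPARTMENT = [
    {"DNAME": "Research", "DNUMBER": 5},
    {"DNAME": "Administration", "DNUMBER": 4},
]

q1 = [
    (t["FNAME"], t["LNAME"], t["ADDRESS"])   # attributes of the free variable t
    for t in EMPLOYEE                        # range relation EMPLOYEE(t)
    if any(                                  # existential quantifier over d
        d["DNAME"] == "Research"             # SELECT condition
        and d["DNUMBER"] == t["DNO"]         # JOIN condition
        for d in DEPARTMENT                  # range relation DEPARTMENT(d)
    )
]
```

Only t contributes attributes to the result; d is used purely to test whether a matching department exists.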



What is Domain Relational Calculus?
 Rather than having variables range over tuples, as in
Tuple Calculus, Domain Calculus uses variables that range
over single values from the domains of attributes

 A query language such as QBE (Query-By-Example) is based on
domain relational calculus.

 General expression:
– {x1, x2, …, xn | COND(x1, x2, …, xn, xn+1, xn+2, …, xn+m)}
where x1, x2, …, xn, xn+1, xn+2, …, xn+m are domain
variables that range over domains (of attributes) and
COND is a condition or formula of the domain
relational calculus



Domain Relational Calculus Query Example
 Query 1
– Retrieve the name and address of all employees who work for the “Research” department
– Q1: {qsv | (∃ z) (∃ l) (∃ m) (EMPLOYEE(qrstuvwxyz) AND
DEPARTMENT(lmno) AND l=‘RESEARCH’ AND m=z)}

– first specify the requested attributes by the free occurrences of domain
variables: q=FNAME, s=LNAME, v=ADDRESS, where qrstuvwxyz are
the 10 variables for the EMPLOYEE relation, one to range over the
domain of each attribute in order, and lmno are the 4 variables for the
DEPARTMENT relation
– m=z is a join condition (joining on DNUMBER and DNO)
– l=‘RESEARCH’ is a selection condition that relates a domain variable to a
constant, in this case DNAME=‘RESEARCH’
– the existential quantifiers (∃ z) (∃ l) (∃ m) require that values exist for z, l and m that
make the formula TRUE
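The domain-variable style can be imitated in Python by unpacking each positional tuple into one variable per attribute. In this sketch the EMPLOYEE and DEPARTMENT rows are made-up; only the positions of q, s, v, z, l and m follow the description above, and the remaining positions hold placeholder values.

```python
# Illustrative sample rows: 10 positional values per EMPLOYEE, 4 per DEPARTMENT.
EMPLOYEE = [
    ("John", "B", "Smith", "111", "1965-01-09", "1 Main St", "M", 30000, "333", 5),
    ("Jane", "A", "Doe", "222", "1970-02-02", "2 Oak Ave", "F", 40000, "333", 4),
]
DEPARTMENT = [
    ("RESEARCH", 5, "333", "1988-05-22"),
    ("ADMINISTRATION", 4, "222", "1995-01-01"),
]

q1 = {
    (q, s, v)                                     # free variables: FNAME, LNAME, ADDRESS
    for (q, r, s, t, u, v, w, x, y, z) in EMPLOYEE
    for (l, m, n, o) in DEPARTMENT
    if l == "RESEARCH" and m == z                 # selection and join conditions
}
```

A set comprehension is used so that each qualifying (q, s, v) combination appears once, matching the set semantics of the calculus.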
