
CHAPTER SIX

Relational Algebra
Basic Relational Algebra Operations
In addition to defining the database structure and constraints, a data model must include a set of
operations to manipulate the data. A basic set of relational model operations constitutes the relational
algebra. These operations enable the user to specify basic retrieval requests. The result of a retrieval is a
new relation, which may have been formed from one or more relations. The algebra operations thus
produce new relations, which can be further manipulated using operations of the same algebra. A
sequence of relational algebra operations forms a relational algebra expression, whose result will also
be a relation.
The relational algebra operations are usually divided into two groups. One group includes set operations
from mathematical set theory; these are applicable because each relation is defined to be a set of
tuples. Set operations include UNION, INTERSECTION, SET DIFFERENCE, and CARTESIAN
PRODUCT. The other group consists of operations developed specifically for relational databases;
these include SELECT, PROJECT, and JOIN, among others. The SELECT and PROJECT operations
are discussed first, because they are the simplest. Then we discuss set operations. Finally, we discuss
JOIN and other complex operations.
Some common database requests cannot be performed with the basic relational algebra operations, so
additional operations are needed to express these requests.
Relational Algebra Symbols
The operations have their own symbols. The symbols are hard to write in HTML that works with all
browsers, so I'm writing PROJECT etc. here. The real symbols:

Operation            My HTML             Symbol
Projection           PROJECT             π
Selection            SELECT              σ
Renaming             RENAME              ρ
Union                UNION               ∪
Intersection         INTERSECTION        ∩
Assignment           <-                  ←
Cartesian product    X                   ×
Join                 JOIN                ⋈
Left outer join      LEFT OUTER JOIN     ⟕
Right outer join     RIGHT OUTER JOIN    ⟖
Full outer join      FULL OUTER JOIN     ⟗
Semijoin             SEMIJOIN            ⋉

Operators - Write
• INSERT - provides a list of attribute values for a new tuple in a relation. This operator works the
same way as INSERT in SQL.
• DELETE - provides a condition on the attributes of a relation to determine which tuple(s) to
remove from the relation. This operator works the same way as DELETE in SQL.
• MODIFY - changes the values of one or more attributes in one or more tuples of a relation, as
identified by a condition on the attributes of the relation. This is equivalent to SQL
UPDATE.
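
The three write operators can be sketched in Python as follows. This is only an illustration: modelling a relation as a list of attribute-to-value dictionaries is an assumption of this sketch, the helper names insert, delete and modify are hypothetical, and the new subject name used in the MODIFY step is made up.

```python
# A relation modelled as a list of attribute -> value dictionaries (an assumption).
subject = [
    {"NUM": 31, "NAME": "Systems", "DEPT": "CS"},
    {"NUM": 33, "NAME": "Database", "DEPT": "CS"},
]

def insert(relation, new_tuple):
    """INSERT: return the relation with one new tuple added."""
    return relation + [new_tuple]

def delete(relation, condition):
    """DELETE: drop every tuple for which condition(tuple) is true."""
    return [t for t in relation if not condition(t)]

def modify(relation, condition, changes):
    """MODIFY: apply `changes` to every tuple that satisfies the condition."""
    return [{**t, **changes} if condition(t) else t for t in relation]

subject = insert(subject, {"NUM": 39, "NAME": "Java", "DEPT": "CS"})
# "Java Programming" is a made-up value, used only to show MODIFY.
subject = modify(subject, lambda t: t["NUM"] == 39, {"NAME": "Java Programming"})
subject = delete(subject, lambda t: t["NUM"] == 31)
print(subject)
```

Each helper returns a new list instead of changing the old one, which mirrors the idea that every operation yields a relation.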
Operators - Retrieval
There are two groups of operations:
• Operations based on mathematical set theory:
UNION, INTERSECTION, DIFFERENCE, and CARTESIAN PRODUCT.
• Special database operations:
SELECT, PROJECT, and JOIN.

The SELECT Operation

The SELECT operation is used to select a subset of the tuples from a relation that satisfy a selection
condition. One can consider the SELECT operation to be a filter that keeps only those tuples that
satisfy a qualifying condition. For example, to select the EMPLOYEE tuples whose department is 4, or
those whose salary is greater than $30,000, we can individually specify each of these two conditions
with a SELECT operation as follows:

σDNO=4(EMPLOYEE)
σSALARY>30000(EMPLOYEE)
In general, the SELECT operation is denoted by

σ<selection condition>(R)

where the symbol σ (sigma) is used to denote the SELECT operator, and the selection condition is a
Boolean expression specified on the attributes of relation R. Notice that R is generally a relational
algebra expression whose result is a relation; the simplest expression is just the name of a database
relation. The relation resulting from the SELECT operation has the same attributes as R. The Boolean
expression specified in <selection condition> is made up of a number of clauses of the form
<attribute name> <comparison op> <constant value>, or
<attribute name> <comparison op> <attribute name>
where <attribute name> is the name of an attribute of R, <comparison op> is normally one of the
operators {=, <, ≤, >, ≥, ≠}, and <constant value> is a constant value from the attribute domain. Clauses
can be arbitrarily connected by the Boolean operators AND, OR, and NOT to form a general selection
condition. For example, to select the tuples for all employees who either work in department 4 and make
over $25,000 per year, or work in department 5 and make over $30,000, we can specify the following
SELECT operation:

σ(DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000)(EMPLOYEE)

EXAMPLE:

• The query to select the computer science subjects from the SUBJECT table,

σDEPT='CS'(SUBJECT)

produces the following output:

NUM  NAME      DEPT
31   Systems   CS
33   Database  CS
39   Java      CS

• The query σDEPT='CS' AND NAME='Java'(SUBJECT) produces the following output:

NUM  NAME  DEPT
39   Java  CS
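
The two SELECT queries above can be reproduced with a small Python sketch. The list-of-dictionaries representation and the helper name select are assumptions made for illustration; the Algebra row comes from the PROJECT example in this chapter, but its NUM value is not given in the notes and is invented here.

```python
SUBJECT = [
    {"NUM": 31, "NAME": "Systems", "DEPT": "CS"},
    {"NUM": 33, "NAME": "Database", "DEPT": "CS"},
    {"NUM": 39, "NAME": "Java", "DEPT": "CS"},
    {"NUM": 40, "NAME": "Algebra", "DEPT": "Maths"},  # NUM 40 is an assumed value
]

def select(relation, condition):
    """SELECT: keep only the tuples for which the Boolean condition holds."""
    return [t for t in relation if condition(t)]

# DEPT = 'CS'
cs_subjects = select(SUBJECT, lambda t: t["DEPT"] == "CS")

# DEPT = 'CS' AND NAME = 'Java'
cs_java = select(SUBJECT, lambda t: t["DEPT"] == "CS" and t["NAME"] == "Java")

for row in cs_subjects:
    print(row["NUM"], row["NAME"], row["DEPT"])
```

The result of select has the same attributes as its input, as the text states; only the number of tuples changes.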

The PROJECT Operation


If we think of a relation as a table, the SELECT operation selects some of the rows from the table while
discarding other rows. The PROJECT operation, on the other hand, selects certain columns from the
table and discards the other columns. If we are interested in only certain attributes of a relation, we use
the PROJECT operation to project the relation over these attributes only. For example, to list each
employee’s first and last name and salary from the EMPLOYEE table, we can use the PROJECT operation as
follows:

πLNAME, FNAME, SALARY(EMPLOYEE)

The general form of the PROJECT operation is

π<attribute list>(R)

where π (pi) is the symbol used to represent the PROJECT operation and <attribute list> is a list of
attributes from the attributes of relation R. Again, notice that R is, in general, a relational algebra
expression whose result is a relation, which in the simplest case is just the name of a database relation.
The result of the PROJECT operation has only the attributes specified in <attribute list> and in the same
order as they appear in the list. Hence, its degree is equal to the number of attributes in <attribute list>.

If the attribute list includes only non-key attributes of R, duplicate tuples are likely to occur; the
PROJECT operation removes any duplicate tuples, so the result of the PROJECT operation is a set of
tuples and hence a valid relation. This is known as duplicate elimination.

EXAMPLE:

Find the names of all subjects in the SUBJECT table, and find the departments in the SUBJECT table.

• πNAME(SUBJECT) produces the output

NAME
Systems
Database
Java
Algebra

• πDEPT(SUBJECT) produces the output

DEPT
CS
Maths

• The expression πNAME,DEPT(SUBJECT) lists both attributes, NAME and DEPT:

NAME      DEPT
Systems   CS
Database  CS
Java      CS
Algebra   Maths
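
A minimal Python sketch of PROJECT with duplicate elimination follows. The helper name project and the list-of-dictionaries representation are assumptions of this sketch; the NUM of the Algebra row is not given in the notes and is invented.

```python
SUBJECT = [
    {"NUM": 31, "NAME": "Systems", "DEPT": "CS"},
    {"NUM": 33, "NAME": "Database", "DEPT": "CS"},
    {"NUM": 39, "NAME": "Java", "DEPT": "CS"},
    {"NUM": 40, "NAME": "Algebra", "DEPT": "Maths"},  # NUM 40 is an assumed value
]

def project(relation, attributes):
    """PROJECT: keep only the listed attributes, in list order, without duplicates."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:  # duplicate elimination
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result

print(project(SUBJECT, ["DEPT"]))          # CS appears once, although three subjects have it
print(project(SUBJECT, ["NAME", "DEPT"]))  # four distinct (NAME, DEPT) pairs
```

Projecting on DEPT alone shows why duplicate elimination matters: three CS tuples collapse into one.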

Combining SELECT and PROJECT

We can combine the SELECT and PROJECT operations to produce more complex expressions. For
example, for the query "Which students live in Carlton? List their names.", we can use the following
expression:

πNAME(σSUBURB='Carlton'(STUDENT))

which produces


NAME
Nando
Naida

We can see that in the above expression, we first use the SELECT operation to choose the tuples for
students who live in "Carlton" and then use the PROJECT operation to keep only the NAME attribute of
those tuples.

Consider another query: "What is the student number of Nando of Carlton?" The query could be written as

πID(σNAME='Nando'(σSUBURB='Carlton'(STUDENT)))

or

πID(σNAME='Nando' AND SUBURB='Carlton'(STUDENT))

and both return

ID
6933
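
The two equivalent forms of the query can be checked with a short Python sketch. Only the facts that Nando and Naida live in Carlton and that Nando's ID is 6933 come from the notes; the other IDs, the third row, and the helper names select and project are assumptions made for illustration.

```python
STUDENT = [
    {"ID": 6933, "NAME": "Nando", "SUBURB": "Carlton"},
    {"ID": 1001, "NAME": "Naida", "SUBURB": "Carlton"},  # ID is an assumed value
    {"ID": 1002, "NAME": "Nando", "SUBURB": "Fitzroy"},  # whole row is assumed
]

def select(relation, condition):
    """SELECT: keep the tuples that satisfy the condition."""
    return [t for t in relation if condition(t)]

def project(relation, attributes):
    """PROJECT: keep the listed attributes and drop duplicate tuples."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attributes}
        if row not in result:
            result.append(row)
    return result

# Which students live in Carlton? List their names.
names = project(select(STUDENT, lambda t: t["SUBURB"] == "Carlton"), ["NAME"])

# Two nested SELECTs ...
nested = project(
    select(select(STUDENT, lambda t: t["SUBURB"] == "Carlton"),
           lambda t: t["NAME"] == "Nando"),
    ["ID"],
)
# ... and one SELECT with an AND condition.
combined = project(
    select(STUDENT, lambda t: t["NAME"] == "Nando" and t["SUBURB"] == "Carlton"),
    ["ID"],
)
print(names, nested, combined)
```

Because every operation takes a relation and returns a relation, the output of one call can be passed straight into the next.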

Set Operations
Consider two relations R and S.
• UNION of R and S
The union of R and S is a relation that includes all the tuples that are either in R or in S or in
both R and S. Duplicate tuples are eliminated.
• INTERSECTION of R and S
The intersection of R and S is a relation that includes all tuples that are in both R and S.
• DIFFERENCE of R and S
The difference of R and S is the relation that contains all the tuples that are in R but not
in S.
SET Operations - requirements
For set operations to function correctly, the relations R and S must be union compatible. Two relations
are union compatible if
• they have the same number of attributes, and
• the domain of each attribute, in column order, is the same in both R and S.

UNION Example

Figure: UNION
INTERSECTION Example

Figure: INTERSECTION
DIFFERENCE Example

Figure: DIFFERENCE
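
Since the figures are not reproduced here, the three set operations can be tried out with a small Python sketch. The relations R and S below are invented sample data, relations are modelled as Python sets of tuples, and using the Python type of each value as a stand-in for the attribute domain is a simplifying assumption.

```python
# Two union-compatible relations: (integer, string) in both. Sample data is invented.
R = {(1, "a"), (2, "b"), (3, "c")}
S = {(2, "b"), (4, "d")}

def union_compatible(r, s):
    """True if all tuples have the same length and the same type in each column."""
    signatures = {tuple(type(v) for v in t) for t in r | s}
    return len(signatures) <= 1

assert union_compatible(R, S)

union = R | S          # tuples in R, in S, or in both; duplicates vanish in a set
intersection = R & S   # tuples in both R and S
difference = R - S     # tuples in R but not in S

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
```

Note that R - S and S - R are generally different relations, whereas union and intersection give the same result in either order.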
CARTESIAN PRODUCT
The Cartesian product is also an operator which works on two sets. It is sometimes called the CROSS
PRODUCT or CROSS JOIN.
It combines each tuple of one relation with every tuple of the other relation, so two relations with m
and n tuples produce a result with m × n tuples.
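
A minimal Python sketch of the Cartesian product is shown below. The relations R and S and their attribute names A and B are invented, and the sketch assumes the two relations have no attribute names in common.

```python
from itertools import product

R = [{"A": 1}, {"A": 2}]                  # invented sample data
S = [{"B": "x"}, {"B": "y"}, {"B": "z"}]  # invented sample data

def cartesian_product(r, s):
    """Pair every tuple of r with every tuple of s (attribute names must differ)."""
    return [{**t, **u} for t, u in product(r, s)]

result = cartesian_product(R, S)
print(len(result))  # 2 tuples times 3 tuples
for row in result:
    print(row)
```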

CARTESIAN PRODUCT example

Figure: CARTESIAN PRODUCT


JOIN Operator
The JOIN operator is used to combine related tuples from two relations:

• In its simplest form the JOIN operator is just the cross product of the two relations.
• As the join becomes more complex, tuples are removed from the cross product to make the
result of the join more meaningful.
• JOIN allows you to evaluate a join condition between the attributes of the relations on which
the join is undertaken.

The notation used is

R JOIN<join condition> S
EXAMPLE:

Figure: JOIN
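
The description of JOIN as a cross product from which tuples are removed can be sketched in Python. The helper name theta_join is hypothetical, and the EMPLOYEE and DEPARTMENT rows (including the DNUMBER and DNAME attributes) are invented sample data.

```python
EMPLOYEE = [{"NAME": "Ann", "DNO": 4}, {"NAME": "Bob", "DNO": 5}]  # invented rows
DEPARTMENT = [
    {"DNUMBER": 4, "DNAME": "Administration"},                     # invented rows
    {"DNUMBER": 5, "DNAME": "Research"},
]

def theta_join(r, s, condition):
    """JOIN: cross product of r and s, keeping only pairs that satisfy the condition."""
    return [{**t, **u} for t in r for u in s if condition(t, u)]

# With a condition that is always true, the join is just the cross product.
cross = theta_join(EMPLOYEE, DEPARTMENT, lambda t, u: True)

# EMPLOYEE JOIN<DNO = DNUMBER> DEPARTMENT keeps only the related pairs.
joined = theta_join(EMPLOYEE, DEPARTMENT, lambda t, u: t["DNO"] == u["DNUMBER"])

print(len(cross), len(joined))
for row in joined:
    print(row)
```

A real database system would normally avoid building the full cross product; the nested loop here is only meant to match the definition.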

Natural Join
Most often the JOIN involves an equality test, and such a join is described as an equi-join. An equi-join
results in two attributes in the resulting relation having exactly the same value. A 'natural join' will
remove the duplicate attribute(s).

• In most systems a natural join requires that the attributes have the same name to identify the
attribute(s) to be used in the join. This may require a renaming mechanism.
• If you do use natural joins, make sure that the relations do not have two attributes with the same
name by accident.
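
The following Python sketch shows a natural join and the pitfall of an accidental shared attribute name. The helper name natural_join and all sample rows and attribute names are invented for illustration.

```python
def natural_join(r, s):
    """Join on all attributes with the same name; each shared attribute appears once."""
    result = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            if all(t[a] == u[a] for a in common):
                result.append({**t, **u})
    return result

# Intended case: DNO is the only shared attribute name.
EMPLOYEE = [{"ENAME": "Ann", "DNO": 4}, {"ENAME": "Bob", "DNO": 5}]
DEPARTMENT = [{"DNO": 4, "DNAME": "Administration"}, {"DNO": 5, "DNAME": "Research"}]
good = natural_join(EMPLOYEE, DEPARTMENT)

# Accidental case: both relations also call an attribute NAME, so the join
# additionally demands employee name = department name and matches nothing.
EMPLOYEE2 = [{"NAME": "Ann", "DNO": 4}]
DEPARTMENT2 = [{"DNO": 4, "NAME": "Administration"}]
bad = natural_join(EMPLOYEE2, DEPARTMENT2)

print(good)
print(bad)
```

Renaming one of the NAME attributes before joining, as the first bullet suggests, would avoid the empty result.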
OUTER JOINs
Notice that data can be lost when applying a join to two relations: a tuple with no matching tuple in
the other relation does not appear in the result. In some cases this lost data might hold useful
information. An outer join retains the tuples that would have been lost from the tables, filling the
attributes that have no matching data with nulls.
There are three forms of the outer join, depending on which data is to be kept.

• LEFT OUTER JOIN - keep data from the left-hand table
• RIGHT OUTER JOIN - keep data from the right-hand table
• FULL OUTER JOIN - keep data from both tables

OUTER JOIN example 1

Figure: OUTER JOIN (left/right)

OUTER JOIN example 2

Figure: OUTER JOIN (full)
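
Since the figures are not reproduced here, the three outer joins can be sketched in Python. The helper names, the sample rows, and the use of None to stand for a null are all assumptions of this sketch.

```python
def left_outer_join(r, s, condition, s_attrs):
    """Keep every tuple of r; pad with None where no tuple of s matches."""
    result = []
    for t in r:
        matches = [{**t, **u} for u in s if condition(t, u)]
        result.extend(matches or [{**t, **{a: None for a in s_attrs}}])
    return result

def right_outer_join(r, s, condition, r_attrs):
    """Mirror image of the left outer join: keep every tuple of s."""
    return left_outer_join(s, r, lambda u, t: condition(t, u), r_attrs)

def full_outer_join(r, s, condition, r_attrs, s_attrs):
    """Left outer join plus the tuples of s that matched nothing in r."""
    result = left_outer_join(r, s, condition, s_attrs)
    for u in s:
        if not any(condition(t, u) for t in r):
            result.append({**{a: None for a in r_attrs}, **u})
    return result

# Invented sample data: Bob's department 9 and department 6 have no partner.
EMPLOYEE = [{"NAME": "Ann", "DNO": 4}, {"NAME": "Bob", "DNO": 9}]
DEPARTMENT = [{"DNUMBER": 4, "DNAME": "Administration"},
              {"DNUMBER": 6, "DNAME": "Research"}]
on = lambda e, d: e["DNO"] == d["DNUMBER"]

left = left_outer_join(EMPLOYEE, DEPARTMENT, on, ["DNUMBER", "DNAME"])
right = right_outer_join(EMPLOYEE, DEPARTMENT, on, ["NAME", "DNO"])
full = full_outer_join(EMPLOYEE, DEPARTMENT, on, ["NAME", "DNO"], ["DNUMBER", "DNAME"])
print(len(left), len(right), len(full))
```

An ordinary join on the same data would return only the Ann and Administration pair; the outer joins add the unmatched tuples back with nulls.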
