Relational Algebra
Basic Relational Algebra Operations
In addition to defining the database structure and constraints, a data model must include a set of
operations to manipulate the data. A basic set of relational model operations constitutes the relational
algebra. These operations enable the user to specify basic retrieval requests. The result of retrieval is a
new relation, which may have been formed from one or more relations. The algebra operations thus
produce new relations, which can be further manipulated using operations of the same algebra. A
sequence of relational algebra operations forms a relational algebra expression, whose result will also
be a relation.
The relational algebra operations are usually divided into two groups. One group includes set operations
from mathematical set theory; these are applicable because each relation is defined to be a set of
tuples. Set operations include UNION, INTERSECTION, SET DIFFERENCE, and CARTESIAN
PRODUCT. The other group consists of operations developed specifically for relational databases;
these include SELECT, PROJECT, and JOIN, among others. The SELECT and PROJECT operations
are discussed first, because they are the simplest. Then we discuss set operations. Finally, we discuss
JOIN and other complex operations.
Some common database requests cannot be performed with the basic relational algebra operations, so
additional operations are needed to express these requests.
Relational Algebra Symbols
The operations have their own symbols. The symbols are hard to write in HTML that works with all
browsers, so I'm writing PROJECT etc here. The real symbols:
Operation     My HTML     Symbol     Operation     My HTML     Symbol
Assignment    <-                     Semijoin      SEMIJOIN
Operators - Write
INSERT - provides a list of attribute values for a new tuple in a relation. This operator
corresponds to SQL INSERT.
DELETE - provides a condition on the attributes of a relation to determine which tuple(s) to
remove from the relation. This operator corresponds to SQL DELETE.
MODIFY - changes the values of one or more attributes in one or more tuples of a relation, as
identified by a condition on the attributes of the relation. This operator corresponds to SQL
UPDATE.
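The three write operators can be sketched in a few lines of Python. This is only an illustration: the relation is modelled as a list of dictionaries, the helper names insert, delete and modify are hypothetical, and the tuples are invented for the example.

```python
# Hypothetical relation: each tuple is a dict from attribute name to value.
# The data values are invented for illustration.
student = [
    {"ID": 1, "NAME": "Ann", "SUBURB": "Carlton"},
    {"ID": 2, "NAME": "Bob", "SUBURB": "Kew"},
]

def insert(relation, new_tuple):
    """INSERT: add one tuple given as a list of attribute values."""
    relation.append(dict(new_tuple))

def delete(relation, condition):
    """DELETE: remove every tuple that satisfies the condition."""
    relation[:] = [t for t in relation if not condition(t)]

def modify(relation, condition, changes):
    """MODIFY: change attribute values in every tuple that satisfies the condition."""
    for t in relation:
        if condition(t):
            t.update(changes)

insert(student, {"ID": 3, "NAME": "Cy", "SUBURB": "Kew"})
delete(student, lambda t: t["ID"] == 1)
modify(student, lambda t: t["SUBURB"] == "Kew", {"SUBURB": "Carlton"})
print(student)
```

In each case the condition is an ordinary Boolean function over one tuple, which mirrors the role of the WHERE clause in the SQL statements.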
Operators - Retrieval
There are two groups of operations:
Operations based on mathematical set theory:
UNION, INTERSECTION, DIFFERENCE, and CARTESIAN PRODUCT.
Special database operations:
SELECT, PROJECT, and JOIN.
The SELECT Operation
The SELECT operation is used to select a subset of the tuples from a
relation that satisfy a selection condition. One can consider the SELECT operation to be a filter that
keeps only those tuples that satisfy a qualifying condition. For example, to select the EMPLOYEE
tuples whose department is 4, or those whose salary is greater than $30,000, we can individually specify
each of these two conditions with a SELECT operation as follows:
σDNO=4(EMPLOYEE)
σSALARY>30000(EMPLOYEE)
In general, the SELECT operation is denoted by
σ<selection condition>(R)
where the symbol σ (sigma) is used to denote the SELECT operator, and the selection condition is a
Boolean expression specified on the attributes of relation R. Notice that R is generally a relational
algebra expression whose result is a relation; the simplest expression is just the name of a database
relation. The relation resulting from the SELECT operation has the same attributes as R. The Boolean
expression specified in <selection condition> is made up of a number of clauses of the form
<attribute name> <comparison op> <constant value>, or
<attribute name> <comparison op> <attribute name>
where <attribute name> is the name of an attribute of R, <comparison op> is normally one of the
operators {=, <, ≤, >, ≥, ≠}, and <constant value> is a constant value from the attribute domain. Clauses
can be arbitrarily connected by the Boolean operators AND, OR, and NOT to form a general selection
condition. For example, to select the tuples for all employees who either work in department 4 and make
over $25,000 per year, or work in department 5 and make over $30,000, we can specify the following
SELECT operation:
σ(DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000)(EMPLOYEE)
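As a rough sketch of how SELECT behaves, the compound condition above can be written as a Python predicate applied to each tuple. Only the attribute names DNO and SALARY come from the text; the NAME attribute, the tuples themselves, and the helper name select are assumptions made for illustration.

```python
# Hypothetical EMPLOYEE tuples; only DNO and SALARY are attribute names from the text.
EMPLOYEE = [
    {"NAME": "A", "DNO": 4, "SALARY": 26000},
    {"NAME": "B", "DNO": 4, "SALARY": 24000},
    {"NAME": "C", "DNO": 5, "SALARY": 38000},
    {"NAME": "D", "DNO": 5, "SALARY": 30000},
]

def select(relation, condition):
    """SELECT: keep only the tuples for which the Boolean condition holds."""
    return [t for t in relation if condition(t)]

result = select(
    EMPLOYEE,
    lambda t: (t["DNO"] == 4 and t["SALARY"] > 25000)
    or (t["DNO"] == 5 and t["SALARY"] > 30000),
)
print([t["NAME"] for t in result])  # ['A', 'C']
```

Every tuple in the result keeps all the attributes of EMPLOYEE, which matches the statement that the result of SELECT has the same attributes as R.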
EXAMPLE:
Select the computer science (CS) subjects from the SUBJECT table:
σDEPT='CS'(SUBJECT)
31 Systems CS
33 Database CS
39 Java CS
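The PROJECT Operation
The PROJECT operation selects certain attributes (columns) of a relation and discards the others. In
general, the PROJECT operation is denoted by
π<attribute list>(R)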
where π (pi) is the symbol used to represent the PROJECT operation and <attribute list> is a list of
attributes from the attributes of relation R. Again, notice that R is, in general, a relational algebra
expression whose result is a relation, which in the simplest case is just the name of a database relation.
The result of the PROJECT operation has only the attributes specified in <attribute list> and in the same
order as they appear in the list. Hence, its degree is equal to the number of attributes in <attribute list>.
If the attribute list includes only nonkey attributes of R, duplicate tuples are likely to occur; the
PROJECT operation removes any duplicate tuples, so the result of the PROJECT operation is a set of
tuples and hence a valid relation. This is known as duplicate elimination.
EXAMPLE:
Find the name and department of every subject in the SUBJECT table:
NAME DEPT
Systems CS
Database CS
Java CS
Algebra Maths
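Duplicate elimination is easiest to see by projecting on a nonkey attribute. The sketch below is illustrative only: the attribute name NO and the number given to the Algebra subject are assumptions, and project is a hypothetical helper, not part of any library.

```python
# SUBJECT tuples; the attribute name "NO" and the number 41 are assumed for illustration.
SUBJECT = [
    {"NO": 31, "NAME": "Systems", "DEPT": "CS"},
    {"NO": 33, "NAME": "Database", "DEPT": "CS"},
    {"NO": 39, "NAME": "Java", "DEPT": "CS"},
    {"NO": 41, "NAME": "Algebra", "DEPT": "Maths"},
]

def project(relation, attributes):
    """PROJECT: keep the listed attributes, in list order, and drop duplicate tuples."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:  # duplicate elimination
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result

name_dept = project(SUBJECT, ["NAME", "DEPT"])
dept_only = project(SUBJECT, ["DEPT"])
print(len(name_dept))  # 4
print(dept_only)       # [{'DEPT': 'CS'}, {'DEPT': 'Maths'}]
```

Projecting on DEPT alone collapses the three CS tuples into one, so the result is still a set of tuples.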
We can combine the Select and Project operations together to produce more complex expressions. For
example, for the query "Which students live in Carlton? List their names.", we can use the following
expression:
πNAME(σSUBURB='Carlton'(STUDENT))
We can see that in the above expression, we first use the Select operation to choose the tuples for
students who live in "Carlton" and then use the Project operation to choose the NAME attribute from
those tuples from step 1.
Look at another query, "What is the student number of Nando of Carlton?". The query could be written as
πID(σNAME='Nando'(σSUBURB='Carlton'(STUDENT)))
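or, equivalently, with a single SELECT whose condition combines both tests:
πID(σNAME='Nando' AND SUBURB='Carlton'(STUDENT))
The result is: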
ID
6933
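Because every operation returns a relation, SELECT and PROJECT can be nested freely. The following sketch composes them for the two queries above. Only the Nando tuple with ID 6933 reflects the text; the other tuples and the helper names are assumptions. It also checks that two nested SELECTs give the same answer as one SELECT with an AND condition.

```python
# STUDENT tuples; all rows except Nando of Carlton (ID 6933) are invented.
STUDENT = [
    {"ID": 6933, "NAME": "Nando", "SUBURB": "Carlton"},
    {"ID": 1001, "NAME": "Nando", "SUBURB": "Kew"},
    {"ID": 1002, "NAME": "Mei", "SUBURB": "Carlton"},
]

def select(relation, condition):
    """SELECT: keep the tuples that satisfy the condition."""
    return [t for t in relation if condition(t)]

def project(relation, attributes):
    """PROJECT: keep the listed attributes and remove duplicate tuples."""
    rows = dict.fromkeys(tuple(t[a] for a in attributes) for t in relation)
    return [dict(zip(attributes, row)) for row in rows]

def in_carlton(t):
    return t["SUBURB"] == "Carlton"

def is_nando(t):
    return t["NAME"] == "Nando"

# "Which students live in Carlton? List their names."
names = project(select(STUDENT, in_carlton), ["NAME"])

# "What is the student number of Nando of Carlton?" written two ways.
nested = project(select(select(STUDENT, in_carlton), is_nando), ["ID"])
combined = project(select(STUDENT, lambda t: is_nando(t) and in_carlton(t)), ["ID"])

print(names)     # [{'NAME': 'Nando'}, {'NAME': 'Mei'}]
print(nested)    # [{'ID': 6933}]
print(combined)  # [{'ID': 6933}]
```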
Set Operations
Consider two relations R and S.
UNION of R and S
The union of R and S is a relation that includes all the tuples that are in R, in S, or in
both R and S. Duplicate tuples are eliminated.
INTERSECTION of R and S
The intersection of R and S is a relation that includes all the tuples that are in both R and S.
DIFFERENCE of R and S
The difference of R and S is the relation that contains all the tuples that are in R but not
in S.
SET Operations - requirements
For set operations to function correctly the relations R and S must be union compatible. Two relations
are union compatible if:
They have the same number of attributes.
The domain of each attribute, in column order, is the same in both R and S.
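The union compatibility test and the three set operations can be sketched with Python's built-in set type. The schemas, attribute names and tuples below are invented for illustration, and domains are approximated by Python types, which is a simplification.

```python
# Hypothetical schemas: (attribute name, domain) pairs in column order.
R_SCHEMA = (("NAME", str), ("AGE", int))
S_SCHEMA = (("FULLNAME", str), ("YEARS", int))

# Relations as sets of tuples, so duplicates cannot occur.
R = {("Ann", 30), ("Bob", 25)}
S = {("Bob", 25), ("Cy", 41)}

def union_compatible(r_schema, s_schema):
    """Same number of attributes, and the same domain in each column position."""
    return len(r_schema) == len(s_schema) and all(
        r_dom == s_dom for (_, r_dom), (_, s_dom) in zip(r_schema, s_schema)
    )

if not union_compatible(R_SCHEMA, S_SCHEMA):
    raise ValueError("R and S are not union compatible")

union = R | S         # tuples in R, in S, or in both
intersection = R & S  # tuples in both R and S
difference = R - S    # tuples in R but not in S

print(sorted(union))         # [('Ann', 30), ('Bob', 25), ('Cy', 41)]
print(sorted(intersection))  # [('Bob', 25)]
print(sorted(difference))    # [('Ann', 30)]
```

Note that the attribute names differ between the two schemas; only the number of columns and their domains matter for union compatibility.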
UNION Example
Figure: UNION
INTERSECTION Example
Figure: INTERSECTION
DIFFERENCE Example
Figure: DIFFERENCE
CARTESIAN PRODUCT
The Cartesian product is also an operator which works on two sets. It is sometimes called the CROSS
PRODUCT or CROSS JOIN.
It combines the tuples of one relation with all the tuples of the other relation.
CARTESIAN PRODUCT example
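A minimal sketch of the Cartesian product, assuming the two relations have no attribute names in common. The relations R and S and the helper name cartesian_product are invented for illustration.

```python
from itertools import product

# Hypothetical relations with disjoint attribute names.
R = [{"A": 1}, {"A": 2}]
S = [{"B": "x"}, {"B": "y"}, {"B": "z"}]

def cartesian_product(r, s):
    """Pair every tuple of r with every tuple of s."""
    return [{**tr, **ts} for tr, ts in product(r, s)]

result = cartesian_product(R, S)
print(len(result))  # 6, that is 2 tuples times 3 tuples
print(result[0])    # {'A': 1, 'B': 'x'}
```

The size of the result is the product of the sizes of the inputs, which is why a raw cross product grows quickly and is usually followed by a selection.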
JOIN
In its simplest form the JOIN operator is just the cross product of the two relations.
As the join becomes more complex, tuples are removed within the cross product to make the
result of the join more meaningful.
JOIN allows you to evaluate a join condition between the attributes of the relations on which
the join is undertaken.
Figure: JOIN
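The description above, a cross product from which non-matching tuples are removed, translates almost directly into code. In this sketch the relations, the attribute names ENAME, DNUMBER and DNAME, and the helper name join are assumptions made for illustration.

```python
from itertools import product

# Hypothetical relations; the data is invented.
EMPLOYEE = [
    {"ENAME": "Ann", "DNO": 4},
    {"ENAME": "Bob", "DNO": 5},
    {"ENAME": "Cy", "DNO": 7},
]
DEPARTMENT = [
    {"DNUMBER": 4, "DNAME": "Admin"},
    {"DNUMBER": 5, "DNAME": "Research"},
]

def join(r, s, condition):
    """JOIN: the cross product of r and s, keeping only pairs that satisfy the join condition."""
    return [{**tr, **ts} for tr, ts in product(r, s) if condition(tr, ts)]

# A condition that is always true gives the plain cross product.
cross = join(EMPLOYEE, DEPARTMENT, lambda tr, ts: True)

# An equality condition keeps only the meaningful combinations.
matched = join(EMPLOYEE, DEPARTMENT, lambda tr, ts: tr["DNO"] == ts["DNUMBER"])

print(len(cross))                       # 6
print([t["ENAME"] for t in matched])    # ['Ann', 'Bob']
```

Cy does not appear in the second result because no department has number 7.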
Natural Join
Most often the JOIN involves an equality test, and such a join is described as an equi-join. An
equi-join results in two attributes of the resulting relation having exactly the same value in every
tuple. A 'natural join' removes the duplicate attribute(s).
In most systems a natural join will require that the attributes have the same name to identify the
attribute(s) to be used in the join. This may require a renaming mechanism.
If you do use natural joins make sure that the relations do not have two attributes with the same
name by accident.
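A natural join can be sketched as an equi-join on every attribute name the two relations share, keeping one copy of each shared attribute. The relations, the rename helper and the attribute names below are hypothetical and serve only as an illustration.

```python
# Hypothetical relations; the join attribute has different names in each.
EMPLOYEE = [
    {"ENAME": "Ann", "DNO": 4},
    {"ENAME": "Bob", "DNO": 5},
]
DEPARTMENT = [
    {"DNUMBER": 4, "DNAME": "Admin"},
    {"DNUMBER": 5, "DNAME": "Research"},
]

def rename(relation, old, new):
    """Rename one attribute so that the join attributes share a name."""
    return [{(new if k == old else k): v for k, v in t.items()} for t in relation]

def natural_join(r, s):
    """Equi-join on all attribute names common to r and s, keeping one copy of each."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    return [
        {**tr, **ts}
        for tr in r
        for ts in s
        if all(tr[a] == ts[a] for a in common)
    ]

result = natural_join(EMPLOYEE, rename(DEPARTMENT, "DNUMBER", "DNO"))
print(result[0])  # {'ENAME': 'Ann', 'DNO': 4, 'DNAME': 'Admin'}
```

Because the function joins on every shared name, an attribute that happens to have the same name in both relations would silently become part of the join condition, which is the accident the warning above refers to.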
OUTER JOINs
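Notice that a join drops every tuple that has no matching tuple in the other relation. In some cases
this lost data might hold useful information. An outer join retains the tuples that would otherwise
have been lost, filling the attributes that have no matching values with nulls.
There are three forms of the outer join, depending on which data is to be kept: the left outer join
keeps every tuple of the first relation, the right outer join keeps every tuple of the second relation,
and the full outer join keeps every tuple of both.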
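The three forms, usually called left, right and full outer join, can be sketched with one function. This is an illustration under stated assumptions: the relations and attribute names are invented, None stands in for null, and outer_join is a hypothetical helper rather than any library's API.

```python
# Hypothetical relations; Cy has no department and Sales has no employee.
EMPLOYEE = [
    {"ENAME": "Ann", "DNO": 4},
    {"ENAME": "Bob", "DNO": 5},
    {"ENAME": "Cy", "DNO": 7},
]
DEPARTMENT = [
    {"DNUMBER": 4, "DNAME": "Admin"},
    {"DNUMBER": 5, "DNAME": "Research"},
    {"DNUMBER": 6, "DNAME": "Sales"},
]

def outer_join(r, s, condition, r_attrs, s_attrs, keep="left"):
    """Join r and s, padding unmatched tuples with None according to keep."""
    result = []
    matched_s = set()
    for tr in r:
        found = False
        for j, ts in enumerate(s):
            if condition(tr, ts):
                result.append({**tr, **ts})
                matched_s.add(j)
                found = True
        if not found and keep in ("left", "full"):
            # Keep the unmatched tuple of r, with nulls for the attributes of s.
            result.append({**tr, **dict.fromkeys(s_attrs)})
    if keep in ("right", "full"):
        for j, ts in enumerate(s):
            if j not in matched_s:
                # Keep the unmatched tuple of s, with nulls for the attributes of r.
                result.append({**dict.fromkeys(r_attrs), **ts})
    return result

def same_dept(tr, ts):
    return tr["DNO"] == ts["DNUMBER"]

args = (EMPLOYEE, DEPARTMENT, same_dept, ["ENAME", "DNO"], ["DNUMBER", "DNAME"])
left = outer_join(*args, keep="left")
right = outer_join(*args, keep="right")
full = outer_join(*args, keep="full")
print(len(left), len(right), len(full))  # 3 3 4
```

The left form keeps Cy with null department attributes, the right form keeps Sales with null employee attributes, and the full form keeps both.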
OUTER JOIN example 2