CHAPTER SIX

6. Query Languages
6.1. Relational Query Languages
 Query languages allow manipulation and retrieval of data from a database.
 Query languages are not general-purpose programming languages.
 Query languages are not intended to be used for complex calculations.
 Query languages support easy, efficient access to large data sets.
 The relational model supports simple, powerful query languages.
Formal Relational Query Languages
 Relational DBMSs use a variety of query languages for manipulating relations.
 Some of them are procedural: the user tells the system exactly what data is needed and how to
manipulate it. PL/SQL (Procedural Language/Structured Query Language) is a procedural
language. A C++ or Java program is also procedural, which means that you have to state, step
by step, exactly how the result should be calculated.
 Others are non-procedural: the user states what data is needed rather than how it is to be
retrieved. SQL is referred to as a non-procedural (declarative) database language: to retrieve
data it is enough to tell SQL what data to retrieve, and the DBMS takes care of locating the
information in the database and deciding how the result is calculated.
 Relational algebra is more procedural than SQL.
 Relational algebra queries are written as mathematical expressions.
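The contrast between the procedural and the declarative style can be sketched in a few lines. This is only an illustration: the rows are taken from the small EMPLOYEE example used later in this chapter, the table and column names (`employee`, `emp_no`, `name`, `salary`) are assumptions, and Python's built-in `sqlite3` module stands in for a DBMS.

```python
import sqlite3

# (emp_no, name, salary) rows; the names of the columns are illustrative.
rows = [(1, "Abebe", 10000), (2, "Sara", 8000), (3, "Tomas", 10000)]

# Procedural style: we spell out, step by step, how the result is built.
procedural_result = []
for emp_no, name, salary in rows:
    if salary > 8000:
        procedural_result.append(name)

# Declarative style: we state what we want; the DBMS decides how to find it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_no INTEGER, name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)
declarative_result = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM employee WHERE salary > 8000 ORDER BY emp_no"
    )
]
conn.close()

print(procedural_result)
print(declarative_result)
```

Both lists hold the same names; only the first version says how to find them.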
Two mathematical Query Languages form the basis for Relational languages:
 Relational Algebra:
 Relational Calculus:
 We may describe the relational algebra as procedural language: it can be used to tell the
DBMS how to build a new relation from one or more relations in the database.
 We may describe relational calculus as a non-procedural language: it can be used to
formulate the definition of a relation in terms of one or more database relations.

 Formally the relational algebra and relational calculus are equivalent to each other. For
every expression in the algebra, there is an equivalent expression in the calculus.
 Both are non-user-friendly languages. They have been used as the basis for other, higher-
level data manipulation languages for relational databases.
A query is applied to relation instances, and the result of a query is also a relation instance.
 Schemas of input relations for a query are fixed.
 The schema for the result of a given query is also fixed: it is determined by the definitions
of the query language constructs.
6.2. Relational Algebra
The basic set of operations for the relational model is known as the relational algebra. These
operations enable a user to specify basic retrieval requests. The result of the retrieval is a new
relation, which may have been formed from one or more relations. The algebra operations thus
produce new relations, which can be further manipulated using operations of the same algebra. A
sequence of relational algebra operations forms a relational algebra expression, whose result will
also be a relation that represents the result of a database query (or retrieval request).
Relational algebra is an abstract language, which means that the queries formulated in relational
algebra are not intended to be executed on a computer. Relational algebra consists of a group of
relational operators that can be used to manipulate relations to obtain a desired result. Knowledge
about relational algebra allows us to understand query execution and optimization in relational
database management systems.
 Relational algebra is a theoretical language with operations that work on one or more relations
to define another relation without changing the original relation.
 The output from one operation can become the input to another operation (nesting is possible).
 Several basic operations can be applied to the relations of a database, depending on the
requirement.
What is an Algebra? An algebra is a set of operators and operands. Relational algebra is a notation
for specifying queries about the contents of relations. Operands are relations, thought of as sets of
tuples. Operators are symbols denoting procedures that construct new values from given values.
6.2.1. Role of Relational Algebra in DBMS
Knowledge about relational algebra allows us to understand query execution and optimization in
relational database management systems. The role of relational algebra in a DBMS is shown in the
figure below. Before an SQL query can be converted into executable code, it is first parsed into a
valid relational algebra expression; the query optimizer then produces a query execution plan for
that expression so that the data can be retrieved efficiently.

Figure: Role of relational algebra in DBMS
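Many systems let you inspect the plan the optimizer has chosen. As a hedged illustration, the sketch below uses SQLite's `EXPLAIN QUERY PLAN` statement through Python's `sqlite3` module; the `employee` table is an assumption made for the example, and the wording of the plan differs between SQLite versions and between database products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Abebe", 10000), (2, "Sara", 8000), (3, "Tomas", 10000)],
)

# EXPLAIN QUERY PLAN is SQLite-specific: it reports the plan chosen by the
# optimizer instead of running the query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM employee WHERE salary > 8000"
).fetchall()
conn.close()

for step in plan:
    print(step[-1])  # the last column holds the human-readable plan step
```

With no index on `salary`, the plan is expected to be a full scan of the table.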


6.3. Relational Algebra Operations
Operations in relational algebra can be broadly classified into set operations and database
operations.
6.3.1. Unary and Binary Operations
A unary operation involves one operand, whereas a binary operation involves two operands.
Selection and projection are unary operations. Union, difference, Cartesian product, and join
are binary operations:
 A unary operation operates on one relation.
 A binary operation operates on two relations.

Notation:
Each operation has its own symbol. The standard symbols are:
Operation      Symbol    Operation          Symbol
Projection     π         Cartesian product  ×
Selection      σ         Join               ⋈
Renaming       ρ         Left outer join    ⟕
Union          ∪         Right outer join   ⟖
Intersection   ∩         Full outer join    ⟗
Assignment     ←         Semijoin           ⋉

Three main database operations are SELECTION, PROJECTION, and JOIN.


 Selection ( σ ) Selects a subset of rows from a relation.
 Projection ( π ) Deletes unwanted columns from a relation.
 Renaming ( ρ ) Assigns a name to a relation, such as the intermediate result of an operation.
 Cross-Product ( × ) Allows us to combine two relations.
 Set-Difference ( - ) Tuples in relation1, but not in relation2.
 Union ( ∪ ) Tuples in relation1 or in relation2.
 Intersection ( ∩ ) Tuples in both relation1 and relation2.
 Join ( ⋈ ) Tuples from two relations combined based on a condition.
 Using these we can build up sophisticated database queries.
The sample Employee relation below is used to illustrate the different kinds of relational
operations. It contains information about employees, the skills they have, and the university
where they studied each skill.
EmpID  Fname   Lname   SkillID  Skill   SkillType    University  UniversityAddress  SkillLevel
12     Abebe   Tomas   2        SQL     Database     AAU         Sidist Killo       5
16     Lemma   Ayele   5        C++     Programming  ASTU        Adama              6
28     Robel   Dawit   2        SQL     Database     AAU         Sidist Killo       10
25     Abera   Taye    6        C#      Programming  WKU         Wolkite            8
65     Almaz   Belay   2        SQL     Database     WKU         Wolkite            9
24     Dereje  Tamiru  8        Oracle  Database     ASTU        Adama              5
51     Selam   Belay   4        Prolog  Programming  JU          Jimma              8
94     Alem    Kebede  3        Cisco   Networking   AAU         Sidist Killo       7
18     Girma   Dereje  1        IP      Programming  JU          Jimma              4
13     Henok   Yared   7        Java    Programming  AAU         Sidist Killo       6
1. Selection
 Selects the subset of tuples/rows in a relation that satisfy a selection condition.
 The selection operation is a unary operator (it is applied to a single relation).
 The selection condition is applied to each tuple individually.
 The degree of the resulting relation is the same as that of the original relation, but its
cardinality (number of tuples) is less than or equal to that of the original relation.
 The selection operator is commutative: applying two selections in either order gives the same result.
 A set of conditions can be combined using the Boolean operators ∧ (AND), ∨ (OR), and ~ (NOT).
 There are no duplicates in the result.
 The schema of the result is identical to the schema of the (only) input relation.
 The result relation can be the input for another relational algebra operation (operator
composition).
 It is a filter that keeps only those tuples that satisfy a qualifying condition (those satisfying
the condition are selected while the others are discarded).
Relational Algebra Selection Notation: σ <Selection Condition> <Relation Name>

SQL: SELECT * FROM R WHERE <selection condition>


Example: Find all employees with a skill type of Database.

σ <SkillType = "Database"> (Employee)

This query extracts every tuple, with all its attributes, from the relation Employee in which
the SkillType attribute has the value "Database". The resulting relation is the following.
EmpID  Fname   Lname   SkillID  Skill   SkillType  University  UniversityAddress  SkillLevel
12     Abebe   Tomas   2        SQL     Database   AAU         Sidist Killo       5
28     Robel   Dawit   2        SQL     Database   AAU         Sidist Killo       10
65     Almaz   Belay   2        SQL     Database   WKU         Wolkite            9
24     Dereje  Tamiru  8        Oracle  Database   ASTU        Adama              5
If the query asks for all employees with SkillType Database and University ASTU, the relational
algebra operation and the resulting relation are as follows.
σ <SkillType = "Database" ∧ University = "ASTU"> (Employee)

EmpID  Fname   Lname   SkillID  Skill   SkillType  University  UniversityAddress  SkillLevel
24     Dereje  Tamiru  8        Oracle  Database   ASTU        Adama              5
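The two selections above can be mimicked in a few lines of Python. This is a minimal sketch rather than how a DBMS implements selection: a relation is modelled as a list of dictionaries, only some columns of the sample table are copied, and the helper name `select` is an assumption made for the example.

```python
COLUMNS = ("EmpID", "Fname", "Skill", "SkillType", "University", "SkillLevel")
ROWS = [
    (12, "Abebe", "SQL", "Database", "AAU", 5),
    (16, "Lemma", "C++", "Programming", "ASTU", 6),
    (28, "Robel", "SQL", "Database", "AAU", 10),
    (25, "Abera", "C#", "Programming", "WKU", 8),
    (65, "Almaz", "SQL", "Database", "WKU", 9),
    (24, "Dereje", "Oracle", "Database", "ASTU", 5),
    (51, "Selam", "Prolog", "Programming", "JU", 8),
    (94, "Alem", "Cisco", "Networking", "AAU", 7),
    (18, "Girma", "IP", "Programming", "JU", 4),
    (13, "Henok", "Java", "Programming", "AAU", 6),
]
# Each tuple of the relation becomes one dictionary keyed by attribute name.
employee = [dict(zip(COLUMNS, row)) for row in ROWS]


def select(relation, condition):
    """Selection: keep the tuples for which the condition is true."""
    return [t for t in relation if condition(t)]


database_staff = select(employee, lambda t: t["SkillType"] == "Database")
astu_database_staff = select(
    employee,
    lambda t: t["SkillType"] == "Database" and t["University"] == "ASTU",
)

print([t["EmpID"] for t in database_staff])       # [12, 28, 65, 24]
print([t["EmpID"] for t in astu_database_staff])  # [24]
```

The result has the same attributes as the input and never more tuples, which matches the remarks on degree and cardinality.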
2. Projection
 Selects certain attributes of the base relation while discarding the others.
 PROJECT creates a vertical partitioning: one part with the needed columns (attributes),
which contains the result of the operation, and another with the discarded columns.
 Deletes attributes that are not in the projection list.
 The schema of the result contains exactly the fields in the projection list, with the same
names that they had in the (only) input relation.
 The projection operator has to eliminate duplicates. Note: real systems typically don't do
duplicate elimination unless the user explicitly asks for it.
 If the primary key is in the projection list, then duplication will not occur.
 Duplicate removal is necessary to ensure that the resulting table is also a relation.
Relational Algebra Projection Operation Notation: π <Selected Attributes> <Relation Name>
SQL: SELECT DISTINCT <attribute list> FROM table-name
Example: To display the name, skill, and skill level of each employee, the query and the resulting
relation will be: π <FName, LName, Skill, SkillLevel> (Employee)

Fname   Lname   Skill   SkillLevel
Abebe   Tomas   SQL     5
Lemma   Ayele   C++     6
Robel   Dawit   SQL     10
Abera   Taye    C#      8
Almaz   Belay   SQL     9
Dereje  Tamiru  Oracle  5
Selam   Belay   Prolog  8
Alem    Kebede  Cisco   7
Girma   Dereje  IP      4
Henok   Yared   Java    6
If we want the name, skill, and skill level of the employees with Skill SQL and SkillLevel
greater than 5, the query will be:

π <FName, LName, Skill, SkillLevel> (σ <Skill = "SQL" ∧ SkillLevel > 5> (Employee))

Fname  Lname  Skill  SkillLevel
Robel  Dawit  SQL    10
Almaz  Belay  SQL    9
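Projection and its composition with selection can be sketched in the same spirit. The helper names `select` and `project` are assumptions made for the example; `project` removes duplicates explicitly, which is what makes its result a relation.

```python
COLUMNS = ("Fname", "Lname", "Skill", "SkillLevel")
ROWS = [
    ("Abebe", "Tomas", "SQL", 5),
    ("Lemma", "Ayele", "C++", 6),
    ("Robel", "Dawit", "SQL", 10),
    ("Abera", "Taye", "C#", 8),
    ("Almaz", "Belay", "SQL", 9),
    ("Dereje", "Tamiru", "Oracle", 5),
    ("Selam", "Belay", "Prolog", 8),
    ("Alem", "Kebede", "Cisco", 7),
    ("Girma", "Dereje", "IP", 4),
    ("Henok", "Yared", "Java", 6),
]
employee = [dict(zip(COLUMNS, row)) for row in ROWS]


def select(relation, condition):
    """Selection: keep the tuples for which the condition is true."""
    return [t for t in relation if condition(t)]


def project(relation, attributes):
    """Projection: keep only the listed attributes and drop duplicate tuples."""
    rows = (tuple(t[a] for a in attributes) for t in relation)
    return list(dict.fromkeys(rows))  # dict keys are unique and keep their order


# Projecting on Skill alone shows duplicate elimination: SQL appears three
# times in the input but only once in the result.
skills = project(employee, ["Skill"])

# Projection applied to the output of a selection (operator composition).
sql_experts = project(
    select(employee, lambda t: t["Skill"] == "SQL" and t["SkillLevel"] > 5),
    ["Fname", "Lname", "Skill", "SkillLevel"],
)

print(len(skills))  # 8
print(sql_experts)  # [('Robel', 'Dawit', 'SQL', 10), ('Almaz', 'Belay', 'SQL', 9)]
```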

Example 2: The table EMPLOYEE


EmpNo Ename Salary
1 Abebe 10000
2 Sara 8000
3 Tomas 10000
Comparison of SQL and relational algebra for the projection and selection operations.

Projection

SQL: select distinct salary from EMPLOYEE
Relational algebra: π <Salary> (EMPLOYEE)
Result:
Salary
10000
8000

SQL: select EmpNo, Salary from EMPLOYEE
Relational algebra: π <EmpNo, Salary> (EMPLOYEE)
Result:
EmpNo  Salary
1      10000
2      8000
3      10000

Note that there are no duplicate rows in the result.

Selection

SQL: select * from EMPLOYEE where salary > 8000
Relational algebra: σ <salary > 8000> (EMPLOYEE)
Result:
EmpNo  Ename  Salary
1      Abebe  10000
3      Tomas  10000

SQL: select * from EMPLOYEE where salary > 8000 and EmpNo >= 3
Relational algebra: σ <salary > 8000 ∧ EmpNo >= 3> (EMPLOYEE)
Result:
EmpNo  Ename  Salary
3      Tomas  10000

Note that the selection operation in relational algebra is not the same as the SQL keyword select.
Selection in relational algebra returns those tuples in a relation that fulfil a condition (the role
of the WHERE clause in SQL), while the SQL keyword select starts a query and lists the columns to
return, which corresponds more closely to projection.

Relational algebra expressions

SQL: select distinct Ename, Salary from EMPLOYEE where salary > 8000
Relational algebra: π <Ename, Salary> (σ <salary > 8000> (EMPLOYEE))
or, step by step, using an intermediate result:
Temp ← σ <salary > 8000> (EMPLOYEE)
Result ← π <Ename, Salary> (Temp)
Result:
Ename  Salary
Abebe  10000
Tomas  10000
6.3.2. Rename Operation
We may want to apply several relational algebra operations one after the other. The results of
operations in the relational algebra do not have names, and it is often useful to name such results
for use in later expressions. The rename operator returns an existing relation under a new name:
ρA(B) is the relation B with its name changed to A. It can therefore be used to name the result of
a relational algebra operation. A query can be written in two different forms:
1. Write the operations as a single relational algebra expression by nesting the operations.
2. Apply one operation at a time and create intermediate result relations. In this case, we
must give names to the relations that hold the intermediate results, using the rename operation.
If we want the name, skill, and skill level of the employees with Skill SQL and SkillLevel greater
than 5, we can write the expression for this query using the two alternatives:
1. A single algebraic expression, nesting the selection inside the projection:

π <FName, LName, Skill, SkillLevel> (σ <Skill = "SQL" ∧ SkillLevel > 5> (Employee))

2. Using an intermediate relation named by the rename operation:
Step 1: Result1 ← σ <Skill = "SQL" ∧ SkillLevel > 5> (Employee)

Step 2: Result ← π <FName, LName, Skill, SkillLevel> (Result1)

Then Result is equivalent to the relation we get using the first alternative.
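SQL offers a loosely comparable way to name an intermediate result: the WITH clause (a common table expression). The sketch below, run through Python's `sqlite3` module, writes one query in the nested form and once with a named intermediate result and checks that both return the same rows. The table and column names are assumptions, and only four rows of the sample data are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (fname TEXT, lname TEXT, skill TEXT, skill_level INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        ("Abebe", "Tomas", "SQL", 5),
        ("Robel", "Dawit", "SQL", 10),
        ("Almaz", "Belay", "SQL", 9),
        ("Lemma", "Ayele", "C++", 6),
    ],
)

# Alternative 1: a single nested expression.
nested = conn.execute(
    "SELECT DISTINCT fname, lname, skill, skill_level "
    "FROM (SELECT * FROM employee WHERE skill = 'SQL' AND skill_level > 5) "
    "ORDER BY skill_level DESC"
).fetchall()

# Alternative 2: name the intermediate result (result1) and then use it.
stepwise = conn.execute(
    "WITH result1 AS "
    "(SELECT * FROM employee WHERE skill = 'SQL' AND skill_level > 5) "
    "SELECT DISTINCT fname, lname, skill, skill_level FROM result1 "
    "ORDER BY skill_level DESC"
).fetchall()
conn.close()

print(nested == stepwise)  # True
```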
6.3.3. UNION Operation
The result of this operation, denoted by R ∪ S, is a relation that includes all tuples that are
in R, in S, or in both R and S. Duplicate tuples are eliminated. The two operands must be "type
compatible" (union compatible).
6.3.3.1. Union Compatibility
In order to perform the union, intersection, and difference operations on two relations, the
two relations must be union compatible. Two relations are union compatible if they have the same
number of attributes and corresponding attributes are defined on the same domain. Formally, let
R(A1, A2, …, An) and S(B1, B2, …, Bn) be two relations with the attributes A1, A2, …, An and
B1, B2, …, Bn respectively. R and S are union compatible if domain(Ai) = domain(Bi) for
i = 1, 2, …, n.
 The relation resulting from R1 ∪ R2, R1 ∩ R2, or R1 - R2 has the same attribute names as the
first operand relation R1 (by convention).
 RA: R ∪ S
 SQL: SELECT columns, … FROM R UNION SELECT columns, … FROM S;
1. INTERSECTION Operation
The result of this operation, denoted by R ∩ S, is a relation that includes all tuples that are in both
R and S. The two operands must be "type compatible".
 RA: R ∩ S
 SQL: SELECT * FROM R INTERSECT SELECT * FROM S;
2. Set Difference (or MINUS) Operation
The result of this operation, denoted by R - S, is a relation that includes all tuples that are in R but
not in S. The two operands must be "type compatible".
 RA: R - S
 SQL: SELECT * FROM R EXCEPT SELECT * FROM S;
 Intersection can be expressed using difference: R - (R - S) = R ∩ S
 SQL: SELECT * FROM R EXCEPT (SELECT * FROM R EXCEPT SELECT * FROM S); is equivalent to
SELECT * FROM R INTERSECT SELECT * FROM S; The parentheses are needed because a chain of
EXCEPT operators is otherwise evaluated from left to right.
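The grouping of a chained EXCEPT is easy to get wrong, so it is worth checking on a small case. The sketch below uses SQLite through Python's `sqlite3` module with two hypothetical one-column tables `r` and `s`. SQLite does not accept a parenthesised compound operand, so the inner difference is wrapped in a subquery instead; other systems may accept the parenthesised form directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE r (a INTEGER)")
conn.execute("CREATE TABLE s (a INTEGER)")
conn.executemany("INSERT INTO r VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO s VALUES (?)", [(2,), (3,), (4,)])


def run(sql):
    """Run a query and return its single column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


intersect = run("SELECT a FROM r INTERSECT SELECT a FROM s")

# Chained EXCEPT groups from left to right: (r EXCEPT r) EXCEPT s is empty.
chained = run("SELECT a FROM r EXCEPT SELECT a FROM r EXCEPT SELECT a FROM s")

# Wrapping the inner difference in a subquery gives r - (r - s).
nested = run(
    "SELECT a FROM r EXCEPT "
    "SELECT a FROM (SELECT a FROM r EXCEPT SELECT a FROM s)"
)
conn.close()

print(intersect)  # [2, 3]
print(chained)    # []
print(nested)     # [2, 3]
```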
6.3.3.2. Some Properties of the Set Operators
Notice that both union and intersection are commutative operations; that is:
R ∪ S = S ∪ R, and R ∩ S = S ∩ R
Both union and intersection can be treated as n-ary operations applicable to any number of
relations, because both are associative operations; that is:
R ∪ (S ∪ T) = (R ∪ S) ∪ T, and (R ∩ S) ∩ T = R ∩ (S ∩ T)
The minus operation is not commutative; that is, in general:
R - S ≠ S - R
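These properties can be spot-checked with Python's built-in set type, treating each relation as a set of tuples. The three relations below are made-up, union-compatible examples; a check on one instance illustrates the laws but does not prove them.

```python
# Three hypothetical union-compatible relations, each a set of (id, code) tuples.
R = {(1, "a"), (2, "b"), (3, "c")}
S = {(2, "b"), (3, "c"), (4, "d")}
T = {(3, "c"), (5, "e")}

union_commutes = (R | S) == (S | R)
intersection_commutes = (R & S) == (S & R)
union_associates = (R | (S | T)) == ((R | S) | T)
intersection_associates = ((R & S) & T) == (R & (S & T))

# Difference is not commutative: here R - S and S - R hold different tuples.
difference_commutes = (R - S) == (S - R)

print(union_commutes, intersection_commutes)      # True True
print(union_associates, intersection_associates)  # True True
print(difference_commutes)                        # False
```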
3. CARTESIAN (cross product) Operation
The Cartesian product (also called cross-product or product) of two tables combines each row in
one table with each row in the other table. This operation is used to combine tuples from two
relations in a combinatorial fashion: every tuple in relation R is paired with every tuple in
relation S.
In general, the result of R(A1, A2, . . ., An) × S(B1, B2, . . ., Bm) is a relation Q of degree n + m
with attributes Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order, where R has n attributes and S
has m attributes.
The resulting relation Q has one tuple for each combination of tuples, one from R and one from
S. Hence, if R has p tuples and S has q tuples, then R × S has p * q tuples. The two
operands do NOT have to be "type compatible".
 The result is a relation whose schema is the schema for R followed by the schema for S.
 We rename attributes to avoid ambiguity, or we prefix each attribute with the name of the
relation it belongs to.
RA: R × S SQL: SELECT * FROM R, S;
Example 1: Employee
ID FName LName
123 Abebe Lemma
567 Belay Taye
822 Kefle Kebede
Department
DeptID DeptName MangID
2 Finance 567
3 Personnel 123
Then Employee × Department will be of the form:
ID FName LName DeptID DeptName MangID
123 Abebe Lemma 2 Finance 567
123 Abebe Lemma 3 Personnel 123
567 Belay Taye 2 Finance 567
567 Belay Taye 3 Personnel 123
822 Kefle Kebede 2 Finance 567
822 Kefle Kebede 3 Personnel 123
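To list each department manager together with the department managed, we select the combinations
in which ID = MangID and project the attributes we need:
π <ID, FName, LName, DeptName> (σ <ID = MangID> (Employee × Department))

ID   FName  LName  DeptName
123  Abebe  Lemma  Personnel
567  Belay  Taye   Finance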
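The same three steps (product, selection, projection) can be reproduced with `itertools.product` from the Python standard library. This is a sketch on the example data above: attributes are addressed by position, which is an assumption made to keep the code short.

```python
from itertools import product

# Employee(ID, FName, LName) and Department(DeptID, DeptName, MangID)
employee = [
    (123, "Abebe", "Lemma"),
    (567, "Belay", "Taye"),
    (822, "Kefle", "Kebede"),
]
department = [(2, "Finance", 567), (3, "Personnel", 123)]

# Cartesian product: every employee tuple paired with every department tuple.
cross = [e + d for e, d in product(employee, department)]

# Selection ID = MangID: positions 0 and 5 of each concatenated tuple.
managers = [t for t in cross if t[0] == t[5]]

# Projection on ID, FName, LName, DeptName.
result = [(t[0], t[1], t[2], t[4]) for t in managers]

print(len(cross))  # 6, i.e. 3 employee tuples * 2 department tuples
print(result)
```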
6.3.4. JOIN Operation
The join operation combines two relations to form a new relation. The tables are joined based
on a common column, which must be compatible in terms of domain. The sequence of a Cartesian
product followed by a selection is used so commonly to identify and select related tuples from
two relations that a special operation, called JOIN, was defined for it. Thus, in the JOIN
operation, the Cartesian product and selection operations are used together. The JOIN operation
is denoted by the ⋈ symbol. This operation is very important for any relational database with more
than a single relation, because it allows us to process relationships among relations. The general
form of a join operation on two relations R(A1, A2, . . ., An) and S(B1, B2, . . ., Bm) is:
R ⋈<join condition> S, which is equivalent to σ <selection condition> (R × S)
where <join condition> and <selection condition> are the same.

Figure: Types of Join Operation


1. EQUIJOIN Operation
An EQUIJOIN is a simple join that uses the equal sign (=) as the only comparison operator in the
join condition. In the result of an EQUIJOIN we always have one or more pairs of attributes (whose
names need not be identical) that have identical values in every tuple, since the equality
operator was used.
Example of EquiJoin: Given the two relations STAFF and DEPARTMENT, produce a list of staff and the
departments they work in.
STAFF
StaffNo  Job        DeptNo
1        Salesman   50
2        Draftsman  60
DEPARTMENT
DepNo  Name
50     Accounting
60     Civil

The answer to this query is the equijoin of STAFF and DEPARTMENT on DeptNo = DepNo:

STAFF ⋈<DeptNo = DepNo> DEPARTMENT
StaffNo  Job        DeptNo  DepNo  Name
1        Salesman   50      50     Accounting
2        Draftsman  60      60     Civil
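Since a join is a Cartesian product followed by a selection, it can be sketched as a nested loop with a condition. The helper name `theta_join` is an assumption made for the example, and the dictionary merge assumes that the two relations use different attribute names, as STAFF and DEPARTMENT do here. A real DBMS would normally use a more efficient algorithm such as a hash join.

```python
def theta_join(r, s, condition):
    """Join: pair every tuple of r with every tuple of s, keep the matching pairs."""
    return [{**t, **u} for t in r for u in s if condition(t, u)]


staff = [
    {"StaffNo": 1, "Job": "Salesman", "DeptNo": 50},
    {"StaffNo": 2, "Job": "Draftsman", "DeptNo": 60},
]
department = [
    {"DepNo": 50, "Name": "Accounting"},
    {"DepNo": 60, "Name": "Civil"},
]

# Equijoin: the join condition uses only equality.
equijoin = theta_join(staff, department, lambda t, u: t["DeptNo"] == u["DepNo"])

for row in equijoin:
    print(row)
```

Each result tuple carries both DeptNo and DepNo with the same value, which is the pair of identical attributes an equijoin always produces.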

2. NON EQUI JOIN

A NON EQUI JOIN is a join that uses a comparison operator other than the equal sign, such as
>, <, >=, or <=, in the join condition.
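A typical use of a non-equi join is matching a value against a range. In the sketch below the employee rows come from the EMPLOYEE example earlier in the chapter, while the `grade` relation and its salary bands are invented purely for illustration.

```python
# (name, salary) rows from the EMPLOYEE example.
employee = [("Abebe", 10000), ("Sara", 8000), ("Tomas", 10000)]

# Hypothetical relation: (grade, lowest_salary, highest_salary).
grade = [("A", 9000, 12000), ("B", 6000, 8999)]

# Non-equi join: the condition uses <= and >= rather than =.
graded = [
    (name, salary, g)
    for name, salary in employee
    for g, low, high in grade
    if low <= salary <= high
]

print(graded)  # [('Abebe', 10000, 'A'), ('Sara', 8000, 'B'), ('Tomas', 10000, 'A')]
```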

3. NATURAL JOIN Operation
The natural join performs an equijoin of the two relations R and S over all common attributes. One
occurrence of each common attribute is eliminated from the result; in other words, a natural join
removes the duplicate attributes. A natural join requires that the join attributes have the same
name in both relations, since this is how the attributes to be used in the join are identified. If
the attributes have the same domain but different names, a renaming must be applied first. The SQL
NATURAL JOIN is a type of EQUI JOIN structured in such a way that columns with the same name in
the associated tables appear only once. The associated tables have one or more pairs of
identically named columns, and these columns must have the same data type. Do not use an ON
clause in a natural join.
Input: Two relations (tables) R and S Notation: R ⋈ S
Purpose: Relate rows from the two tables by enforcing equality on all common attributes, and
eliminate one copy of each common attribute.
 Shorthand for π <L> (σ <C> (R × S)), where C requires equality on every common attribute and
L is the union of all attributes from R and S with duplicates removed.
 The natural join operation is associative; that is, if R, S, and T are three relations, then
R ⋈ (S ⋈ T) = (R ⋈ S) ⋈ T
SQL: SELECT * FROM table1 NATURAL JOIN table2;
Example of Natural Join Operation: Consider the two relations EMPLOYEE and DEPARTMENT, whose
common attribute is DeptNo.
EMPLOYEE
EmpID  Title                DeptNo
C100   Lecturer             E1
C101   Assistant Professor  E2
C102   Professor            C1
DEPARTMENT
DepName     DeptNo
Electrical  E1
Computer    C1
EMPLOYEE ⋈ DEPARTMENT
EmpID  Title      DeptNo  DepName
C100   Lecturer   E1      Electrical
C102   Professor  C1      Computer
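A natural join can be sketched by first working out which attribute names the two relations share and then joining on equality of all of them. The helper name `natural_join` is an assumption made for the example, and the common attribute is called DeptNo in both relations, as a natural join requires.

```python
def natural_join(r, s):
    """Equijoin on every attribute name shared by r and s, keeping one copy of each."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    return [
        {**t, **u}  # shared attributes have equal values, so one copy survives
        for t in r
        for u in s
        if all(t[a] == u[a] for a in common)
    ]


employee = [
    {"EmpID": "C100", "Title": "Lecturer", "DeptNo": "E1"},
    {"EmpID": "C101", "Title": "Assistant Professor", "DeptNo": "E2"},
    {"EmpID": "C102", "Title": "Professor", "DeptNo": "C1"},
]
department = [
    {"DepName": "Electrical", "DeptNo": "E1"},
    {"DepName": "Computer", "DeptNo": "C1"},
]

joined = natural_join(employee, department)
for row in joined:
    print(row)
```

Employee C101 does not appear in the result, because department E2 has no tuple in DEPARTMENT.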
4. SELF JOIN
A self-join is a join in which a table is joined with itself (a unary relationship), typically when
the table has a FOREIGN KEY that references its own PRIMARY KEY. To join a table with itself means
that each row of the table is combined with itself and with every other row of the table. The
self-join can be viewed as a join of two copies of the same table. Example of a SELF JOIN:
SELECT * FROM table1 X, table1 Y WHERE X.A = Y.A;
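A common case is a table in which each row points at another row of the same table. The sketch below uses a hypothetical `staff` table whose `manager_id` column references `emp_id` in the same table, run through Python's `sqlite3` module; the table, its columns, and its rows are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff ("
    "emp_id INTEGER PRIMARY KEY, "
    "name TEXT, "
    "manager_id INTEGER REFERENCES staff(emp_id))"
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [(1, "Abebe", None), (2, "Sara", 1), (3, "Tomas", 1)],
)

# The table is used twice under two aliases: e for the employee, m for the manager.
pairs = conn.execute(
    "SELECT e.name, m.name FROM staff e JOIN staff m "
    "ON e.manager_id = m.emp_id ORDER BY e.emp_id"
).fetchall()
conn.close()

print(pairs)  # [('Sara', 'Abebe'), ('Tomas', 'Abebe')]
```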
5. OUTER JOIN Operation
OUTER JOIN is another version of the JOIN operation in which non-matching tuples from one or both
relations are also included in the resulting relation; in those tuples, the attributes that come
from the other relation are given the value NULL. In other words, matched pairs are retained, and
the missing values for unmatched tuples are set to NULL. Notation: R ⟕<join condition> S (left
outer join), R ⟖<join condition> S (right outer join), R ⟗<join condition> S (full outer join).

Figure: Representation of left and right outer join.


Types of Outer Join
The pictorial representation of the full, left and the right outer join of two relations R and S are
shown in the above Figure:
1. Left Outer Join: Returns all records from the left table, and the matched records from the right
table. A left outer join is a join in which tuples from R that do not have matching values in the
common column of S are also included in the result relation.
2. Right Outer Join: Returns all records from the right table, and the matched records from the
left table. A right outer join is a join in which tuples from S that do not have matching values in
the common column of R are also included in the result relation.
3. Full Outer Join: Returns all records from both tables, whether or not they have a match. A full
outer join is a join in which tuples from R that do not have matching values in the common columns
of S still appear, and tuples in S that do not have matching values in the common columns of R
also appear in the resulting relation.
Example of Full Outer, Left Outer and Right Outer Join: Consider the two relations PEOPLE and
MENU, and determine their full outer, left outer, and right outer joins.

PEOPLE
Name    Age  Food
Abebe   25   Kitifo
Tomas   28   Pasta
Robel   24   Pizza
Yoseph  25   Tibis
MENU
Food    Day
Tibis   Monday
Kitifo  Tuesday
Pasta   Wednesday
Berger  Thursday
Fruit   Friday

Table 6.1. Left outer join of PEOPLE and MENU relation

PEOPLE ⟕<PEOPLE.Food = MENU.Food> MENU

Name    Age  People.Food  Menu.Food  Day
Abebe   25   Kitifo       Kitifo     Tuesday
Tomas   28   Pasta        Pasta      Wednesday
Robel   24   Pizza        Null       Null
Yoseph  25   Tibis        Tibis      Monday
The left outer join of PEOPLE and MENU on Food is represented as
PEOPLE ⟕<PEOPLE.Food = MENU.Food> MENU. The result of the left outer join is shown in Table 6.1.
Note that all the tuples from the left table (in our case the PEOPLE relation) appear in the
result. Where a tuple has no match, NULL values are returned for the attributes of the other
relation.
SQL: SELECT column_name(s) FROM table1 LEFT JOIN table2 ON table1.column_name =
table2.column_name;
Table 6.2. Right outer join of PEOPLE and MENU relation

PEOPLE ⟖<PEOPLE.Food = MENU.Food> MENU

Name    Age   People.Food  Menu.Food  Day
Yoseph  25    Tibis        Tibis      Monday
Abebe   25    Kitifo       Kitifo     Tuesday
Tomas   28    Pasta        Pasta      Wednesday
Null    Null  Null         Berger     Thursday
Null    Null  Null         Fruit      Friday
The right outer join of PEOPLE and MENU on Food is represented in the relational algebra as
PEOPLE ⟖<PEOPLE.Food = MENU.Food> MENU. The result of the right outer join is shown in
Table 6.2. From this table, it is clear that all tuples from the right-hand relation (in our case
MENU) appear in the result.
SQL: SELECT column_name(s) FROM table1 RIGHT JOIN table2 ON table1.column_name =
table2.column_name;
Table 6.3. Full outer join of PEOPLE and MENU relation

PEOPLE ⟗<PEOPLE.Food = MENU.Food> MENU

Name    Age   People.Food  Menu.Food  Day
Abebe   25    Kitifo       Kitifo     Tuesday
Tomas   28    Pasta        Pasta      Wednesday
Robel   24    Pizza        Null       Null
Yoseph  25    Tibis        Tibis      Monday
Null    Null  Null         Berger     Thursday
Null    Null  Null         Fruit      Friday
The full outer join of PEOPLE and MENU on Food is represented in the relational algebra as
PEOPLE ⟗<PEOPLE.Food = MENU.Food> MENU. The result of the full outer join is shown
in Table 6.3. From this table, it is clear that tuples from both the PEOPLE and the MENU relations
appear in the result.
SQL: SELECT column_name(s) FROM table1 FULL OUTER JOIN table2 ON
table1.column_name = table2.column_name;
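The three outer joins of PEOPLE and MENU can be reproduced with a small helper. This is a sketch under a few assumptions: relations are lists of plain tuples, Python's `None` stands in for NULL, and the helper names `left_outer_join` and `same_food` are invented for the example. The right outer join is obtained by swapping the operands, and the full outer join by adding the unmatched right-hand rows to the left outer join.

```python
people = [
    ("Abebe", 25, "Kitifo"),
    ("Tomas", 28, "Pasta"),
    ("Robel", 24, "Pizza"),
    ("Yoseph", 25, "Tibis"),
]
menu = [
    ("Tibis", "Monday"),
    ("Kitifo", "Tuesday"),
    ("Pasta", "Wednesday"),
    ("Berger", "Thursday"),
    ("Fruit", "Friday"),
]


def left_outer_join(left, right, match, right_width):
    """Keep every left tuple; pad with None (NULL) when nothing on the right matches."""
    result = []
    for l in left:
        matches = [r for r in right if match(l, r)]
        if matches:
            result.extend(l + r for r in matches)
        else:
            result.append(l + (None,) * right_width)
    return result


def same_food(person, item):
    """Join condition PEOPLE.Food = MENU.Food."""
    return person[2] == item[0]


left = left_outer_join(people, menu, same_food, right_width=2)

# Right outer join: swap the operands, then restore the PEOPLE-then-MENU column order.
swapped = left_outer_join(menu, people, lambda m, p: same_food(p, m), right_width=3)
right = [row[2:] + row[:2] for row in swapped]

# Full outer join: the left outer join plus the right-hand rows that found no match.
full = left + [row for row in right if row not in left]

for row in full:
    print(row)
```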
6. SEMIJOIN Operation (⋉)
SEMIJOIN is another version of the JOIN operation in which the resulting relation contains only
those tuples of the first relation that are related to at least one tuple in the second relation.
Expression for the semijoin: R ⋉<F> S = π <A> (R ⋈<F> S), where F is the join predicate and A is the
set of attributes of R.
Example of Semi-Join: In order to understand the semi-join, consider the two relations EMPLOYEE and
PAY.
EMPLOYEE
EmpID  Name    Designation
E1     Abebe   Programmer
E2     Tomas   System Analyst
E3     Robel   Database Administrator
E4     Yoseph  Consultant
PAY
Designation  Salary
Programmer   Birr 30,000
Consultant   Birr 40,000

The semi-join of EMPLOYEE with PAY is denoted by
EMPLOYEE ⋉<EMPLOYEE.Designation = PAY.Designation> PAY. The result of this semi-join is given
below:
EmpID  Name    Designation
E1     Abebe   Programmer
E4     Yoseph  Consultant
From the result of the semi-join it is clear that a semi-join is half of a join: the rows of one table
that match at least one row of another table. Only the rows (and attributes) of the first table
appear in the result.
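A semi-join reduces to an existence test: a tuple of the first relation is kept if at least one tuple of the second relation matches it. In this sketch the helper name `semijoin` is an assumption, and the salaries are stored as plain numbers for simplicity.

```python
employee = [
    ("E1", "Abebe", "Programmer"),
    ("E2", "Tomas", "System Analyst"),
    ("E3", "Robel", "Database Administrator"),
    ("E4", "Yoseph", "Consultant"),
]
pay = [("Programmer", 30000), ("Consultant", 40000)]


def semijoin(r, s, condition):
    """Tuples of r that match at least one tuple of s; only r's attributes are kept."""
    return [t for t in r if any(condition(t, u) for u in s)]


# Join condition: EMPLOYEE.Designation = PAY.Designation
result = semijoin(employee, pay, lambda e, p: e[2] == p[0])

print(result)  # [('E1', 'Abebe', 'Programmer'), ('E4', 'Yoseph', 'Consultant')]
```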
6.3.5. Advantages of Relational Algebra
The relational algebra has solid mathematical background. The mathematical background of
relational algebra is the basis of many interesting developments and theorems. If we have two
expressions for the same operation and if the expressions are proved to be equivalent, then a query
optimizer can automatically substitute the more efficient form. Moreover, the relational algebra is
a high level language which talks in terms of properties of sets of tuples and not in terms of for-
loops.
6.3.6. Limitations of Relational Algebra
The relational algebra cannot do arithmetic. For example, the price of 10 litres of petrol after
an assumed 10% increase in the petrol price cannot be computed using relational algebra. The
relational algebra cannot sort or print results in various formats. For example, arranging the
product names in increasing order of their price cannot be done using relational algebra.
Relational algebra cannot perform aggregates. For example, we may want to know how many staff are
working in a particular department. This query cannot be performed using relational algebra.
The relational algebra cannot modify the database. For example, we may want to increase the salary
of all employees by 10%. This cannot be done using relational algebra.
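For contrast, SQL does provide these facilities. The sketch below runs an aggregate, an update with arithmetic, and a sorted query through Python's `sqlite3` module; the `staff` table, its columns, and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("Abebe", "Sales", 10000), ("Sara", "Sales", 8000), ("Tomas", "IT", 10000)],
)

# Aggregate: how many staff work in each department.
headcount = conn.execute(
    "SELECT dept, COUNT(*) FROM staff GROUP BY dept ORDER BY dept"
).fetchall()

# Modification with arithmetic: raise every salary by 10%.
conn.execute("UPDATE staff SET salary = salary * 11 / 10")

# Sorting: list staff in increasing order of salary.
by_salary = conn.execute(
    "SELECT name, salary FROM staff ORDER BY salary, name"
).fetchall()
conn.close()

print(headcount)  # [('IT', 1), ('Sales', 2)]
print(by_salary)  # [('Sara', 8800), ('Abebe', 11000), ('Tomas', 11000)]
```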
6.4. Relational Calculus
A relational calculus expression creates a new relation, which is specified in terms of variables
that range over rows of the stored database relations (in tuple calculus) or over the values of
the attributes of the stored relations (in domain calculus). Relational calculus comes in two
flavors: (1) Tuple Relational Calculus (TRC) and (2) Domain Relational Calculus (DRC). The basic
difference between relational algebra and relational calculus is that the former gives the
procedure for evaluating the query, whereas the latter states only the query, without giving the
procedure for evaluating it:
 The variable in tuple relational calculus formulae range over tuples.
 The variable in domain relational calculus formulae range over individual values in the
domains of the attributes of the relations.
 Relational calculus is nonoperational, and users define queries in terms of what they want,
not in terms of how to compute it.
The purpose of relational calculus is to provide a formal basis for defining declarative query
languages appropriate for relational databases.
 A relational calculus query specifies what information is retrieved. (Declarative.)
 A relational algebra query specifies how information is retrieved. (Procedural.)
In a calculus expression, there is no order of operations to specify how to retrieve the query result.
A calculus expression specifies only what information the result should contain rather than how
to retrieve it.
A predicate is a truth-valued function with arguments. When we substitute values for the
arguments in the predicate, the function yields an expression, called a proposition, which can be
either true or false. If a predicate contains a variable, as in 'x is a member of staff', there
must be a range for x. When we substitute some values of this range for x, the proposition may be
true; for other values, it may be false. If COND is a predicate, then the set of all tuples for
which the predicate COND evaluates to true is expressed as follows:
{t | COND(t)} where t is a tuple variable and COND(t) is a conditional expression involving t.
The result of such a query is the set of all tuples t that satisfy COND(t). If we have a set of
predicates to evaluate for a single query, the predicates can be connected using ∧ (AND), ∨ (OR),
and ~ (NOT).
Tuple-oriented Relational Calculus:
 The tuple relational calculus is based on specifying a number of tuple variables. A tuple
variable is a variable that 'ranges over' a named relation: that is, a variable whose only
permitted values are tuples of that relation. Tuple relational calculus is interested in
finding the tuples of a relation for which a predicate is true.
 If E is a tuple variable that ranges over the relation Employee, this is written Employee(E),
i.e. the range of E is Employee. To extract all tuples that satisfy a certain condition,
we write the set of all tuples E such that COND(E) evaluates to true:
{E | COND(E)}
The predicates can be connected using the Boolean operators ∧ (AND), ∨ (OR), and ~ (NOT).
COND(t) is a formula, and is called a Well-Formed Formula (WFF) if:
 COND is composed of one or more predicates (an n-ary formula is composed of n single
predicates), connected by any of the Boolean operators.
 Each predicate is of the form A θ B, where θ is one of the comparison operators {<, ≤, >, ≥, ≠,
=} and A and B are either constants or variables, so that it can be evaluated to either true
or false.
 The formula is unambiguous and makes sense.
Example (TRC): Extract all employees whose skill level is greater than or equal to 8.
{E | Employee(E) ∧ E.SkillLevel >= 8}
EmpID  Fname  Lname  SkillID  Skill   SkillType    University  UniversityAddress  SkillLevel
28     Robel  Dawit  2        SQL     Database     AAU         Sidist Killo       10
25     Abera  Taye   6        C#      Programming  WKU         Wolkite            8
65     Almaz  Belay  2        SQL     Database     WKU         Wolkite            9
51     Selam  Belay  4        Prolog  Programming  JU          Jimma              8
 To find only the EmpID, FName, LName, Skill, and the University where the skill was
studied, for employees with skill level greater than or equal to 8, the tuple
relational calculus expression will be:
{E.EmpId, E.FName, E.LName, E.Skill, E.University | Employee(E) ∧ E.SkillLevel >= 8}
EmpID  Fname  Lname  Skill   University
28     Robel  Dawit  SQL     AAU
25     Abera  Taye   C#      WKU
65     Almaz  Belay  SQL     WKU
51     Selam  Belay  Prolog  JU

E.FName means the value of the First Name (FName) attribute for the tuple E.
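
The two expressions above can be mimicked with Python comprehensions, which read almost like the
calculus notation. This is only an illustrative sketch: the list employee is a hypothetical
in-memory copy of the four tuples shown in the result table, plus one invented tuple (EmpID 99)
that exists only so the condition has something to reject.

```python
from collections import namedtuple

# Attribute names follow the table headers; this in-memory relation is an assumption.
Employee = namedtuple(
    "Employee",
    "EmpID FName LName SkillID Skill SkillType University UniversityAddress SkillLevel",
)

employee = [
    Employee(28, "Robel", "Dawit", 2, "SQL", "Database", "AAU", "Sidist Killo", 10),
    Employee(25, "Abera", "Taye", 6, "C#", "Programming", "WKU", "Wolkite", 8),
    Employee(65, "Almaz", "Belay", 2, "SQL", "Database", "WKU", "Wolkite", 9),
    Employee(51, "Selam", "Belay", 4, "Prolog", "Programming", "JU", "Jimma", 8),
    Employee(99, "Sample", "Row", 1, "HTML", "Web", "AAU", "Sidist Killo", 5),  # invented tuple
]

# {E | Employee(E) ∧ E.SkillLevel >= 8}
skilled = [E for E in employee if E.SkillLevel >= 8]

# {E.EmpId, E.FName, E.LName, E.Skill, E.University | Employee(E) ∧ E.SkillLevel >= 8}
projected = [
    (E.EmpID, E.FName, E.LName, E.Skill, E.University)
    for E in employee
    if E.SkillLevel >= 8
]
```

The loop variable E plays the role of the tuple variable: it takes each tuple of the relation in
turn, and the if clause is COND(E).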
6.4.1. Quantifiers in Relational Calculus
 To state how many instances a predicate applies to, we can use the two quantifiers of
predicate logic.
 A relational calculus expression written with the Existential Quantifier can also be written
with the Universal Quantifier, because (∃x)(P(x)) is equivalent to ~(∀x)(~P(x)).
1. Existential quantifier ∃ (‘there exists’)
The existential quantifier is used in formulae that must be true for at least one instance. For
example, the statement that some employee has a skill level greater than or equal to 8 is written:
(∃E)(Employee(E) ∧ E.SkillLevel >= 8)
This means that there exists at least one tuple of the relation Employee whose value for
SkillLevel is greater than or equal to 8.
2. Universal quantifier ∀ (‘for all’)
The universal quantifier is used in statements about every instance. For example, the statement
that every employee has a skill level greater than or equal to 8 is written:
(∀E)(~Employee(E) ∨ E.SkillLevel >= 8)
This means that every tuple E either is not a tuple of the relation Employee or has a SkillLevel
value greater than or equal to 8; in other words, all tuples of Employee have a skill level of
at least 8.
Example: Let’s say that we have the following Schema (set of Relations)
Employee(EID, FName, LName, Dept)
Project(PID, PName, Dept)
Dept(DID, DName, DMangID)
WorksOn(EID, PID)
To find employees who work on projects controlled by department 5, the query will be:
{E | Employee(E) ∧ (∃x)(∃w)(Project(x) ∧ WorksOn(w) ∧ x.Dept = 5 ∧ w.PID = x.PID ∧ w.EID = E.EID)}
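
As a rough illustration, the quantifiers correspond to Python's built-in any() (∃) and all() (∀).
The sketch below assumes the intended reading of the department 5 query is "there is some project
x of department 5 and some WorksOn tuple w that links employee E to x". The relations and every
row in them are invented sample data, not taken from the text.

```python
from collections import namedtuple

Employee = namedtuple("Employee", "EID FName LName Dept")
Project = namedtuple("Project", "PID PName Dept")
WorksOn = namedtuple("WorksOn", "EID PID")

# Invented sample tuples for the schema above.
employees = [
    Employee(1, "Robel", "Dawit", 5),
    Employee(2, "Abera", "Taye", 4),
    Employee(3, "Almaz", "Belay", 4),
]
projects = [Project(10, "Payroll", 5), Project(20, "Website", 4)]
works_on = [WorksOn(1, 10), WorksOn(2, 20), WorksOn(3, 10), WorksOn(3, 20)]

# ∃: employees linked by some WorksOn tuple w to some project x with x.Dept = 5.
on_dept5_project = [
    E
    for E in employees
    if any(
        x.Dept == 5 and w.PID == x.PID and w.EID == E.EID
        for x in projects
        for w in works_on
    )
]

# ∀: is it true that every project has at least one employee working on it?
every_project_staffed = all(
    any(w.PID == x.PID for w in works_on) for x in projects
)
```

With this sample data the first query keeps employees 1 and 3, since both have a WorksOn tuple
for project 10, the only project of department 5.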
6.5. Structured Query Language (SQL)
What is SQL? SQL stands for Structured Query Language. It is a standard language for accessing
and manipulating relational databases, and it is an ANSI (American National Standards Institute)
standard. By itself, SQL does not make a DBMS; it is the medium used to communicate with the
DBMS. SQL commands consist of English-like statements which are used to query, insert, update,
and delete data. English-like means that SQL commands resemble English sentences in their
construction and use, and are therefore easy to learn and understand.

SQL is referred to as a nonprocedural database language. Nonprocedural means that, to retrieve
data from the database, it is enough to tell SQL what data is to be retrieved, rather than how
to retrieve it. The DBMS takes care of locating the information in the database.

Commercial database management systems allow SQL to be used in two distinct ways. First, SQL
commands can be typed directly at the command line. The DBMS interprets and processes the
SQL commands immediately, and the results are displayed. This method of SQL processing is
called interactive SQL. The second method is called programmatic SQL. Here, SQL statements
are embedded in a host language such as COBOL, FORTRAN, C, etc. SQL needs a host language
because it is not a complete programming language: it has no statements or constructs for
branching or looping. The host language provides the necessary looping and branching
structures and the interface with the user, while SQL provides the statements to communicate
with the DBMS. Some of the features of SQL are:

 SQL is a language used to interact with the database.
 SQL is a data access language.
 SQL is based on relational tuple calculus.
 SQL is a standard relational database management language.
 The first commercial DBMS that supported SQL was Oracle in 1979.
 SQL is a “nonprocedural” or “declarative” language.
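
The idea of programmatic SQL described above can be sketched with Python as the host language.
Python and its standard-library sqlite3 module are assumptions made for this illustration (the
text names COBOL, FORTRAN and C); the employee table, its column names, and the row with emp_id 7
are likewise hypothetical. SQL states what data is wanted, while the host language supplies the
loop and the branch.

```python
import sqlite3

# In-memory SQLite database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, fname TEXT, skill_level INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(28, "Robel", 10), (25, "Abera", 8), (7, "Sample", 5)],  # emp_id 7 is invented
)

labels = {}
# Declarative part: the SELECT statement says WHAT to retrieve.
query = "SELECT fname, skill_level FROM employee ORDER BY emp_id"
# Procedural part: looping and branching come from the host language.
for fname, skill_level in conn.execute(query):
    if skill_level >= 8:
        labels[fname] = "senior"
    else:
        labels[fname] = "junior"

conn.close()
```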
6.5.1. Languages in SQL

The most common categories of SQL commands are:
1. Data Definition Language commands (DDL): DDL commands are used to define a database:
they create, alter, and drop schemas, relations (tables), and other database structures,
establish constraints, and update these structures as the database evolves. The different
structures that are created by DDL are Tables, Views, Sequences, Triggers, Indexes, etc.
A. Tables: The main features of table are:
 It is a relation that is used to store records of related data.
 It is a logical structure maintained by the database manager.
 It is made up of columns and rows.
 At the intersection of every column and row there is a specific data item called a value.
 A base table is created with the CREATE TABLE statement and is used to hold persistent
user data.
B. Views: The basic concepts of VIEW are:
 It is a stored SQL query used as a “Virtual table.”
 It provides an alternative way of looking at the data in one or more tables.
 It is a named specification of a result table. The specification is a SELECT statement that
is executed whenever the view is referenced in an SQL statement. Consider a view to have
columns and rows just like a base table. For retrieval, all views can be used just like base
tables.
 When the column of a view is directly derived from the column of a base table, that column
inherits any constraints that apply to the column of the base table. For example, if a view
includes a foreign key of its base table, INSERT and UPDATE operations using that view
are subject to the same referential constraints as the base table. Also, if the base table of a
view is a parent table, DELETE and UPDATE operations using that view are subject to the
same rule as DELETE and UPDATE operations on the base table.
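
The behaviour of a view as a stored query can be sketched as follows, assuming Python's sqlite3
module as the DBMS. The table, the view name senior_employee, and the row with emp_id 7 are
hypothetical. Because the view's SELECT is executed whenever the view is referenced, a change to
the base table shows up in the view without redefining it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, fname TEXT, skill_level INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(28, "Robel", 10), (25, "Abera", 8), (7, "Sample", 5)],  # emp_id 7 is invented
)

# The view stores the query, not the data.
conn.execute(
    "CREATE VIEW senior_employee AS "
    "SELECT emp_id, fname FROM employee WHERE skill_level >= 8"
)

before = conn.execute("SELECT fname FROM senior_employee ORDER BY emp_id").fetchall()

# Changing the base table changes what the view returns.
conn.execute("UPDATE employee SET skill_level = 9 WHERE emp_id = 7")
after = conn.execute("SELECT fname FROM senior_employee ORDER BY emp_id").fetchall()

conn.close()
```

Views in SQLite are read-only, so the INSERT, UPDATE and DELETE behaviour through views described
above is not shown here.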
C. Sequences: A sequence generates a series of integers that increases (or decreases) by a given
constant value. It is typically used for unique ID assignment.
D. Triggers: A trigger automatically executes certain commands when given conditions are met.
E. Indexes: Indexes are used for performance tuning; they play a crucial role in fast data
retrieval.
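
A trigger and an index might be created as sketched below, again assuming SQLite through Python's
sqlite3 module. The names skill_audit, idx_employee_skill and log_skill_change are hypothetical,
and trigger syntax differs between DBMSs. Sequences are left out because SQLite has no separate
sequence object.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, fname TEXT, skill_level INTEGER);
CREATE TABLE skill_audit (emp_id INTEGER, old_level INTEGER, new_level INTEGER);

-- Index: lets the DBMS find rows by skill_level without scanning the whole table.
CREATE INDEX idx_employee_skill ON employee (skill_level);

-- Trigger: runs automatically whenever skill_level is updated.
CREATE TRIGGER log_skill_change
AFTER UPDATE OF skill_level ON employee
BEGIN
    INSERT INTO skill_audit VALUES (OLD.emp_id, OLD.skill_level, NEW.skill_level);
END;

INSERT INTO employee VALUES (25, 'Abera', 8);
UPDATE employee SET skill_level = 9 WHERE emp_id = 25;
""")

# The trigger has recorded the change without any explicit INSERT by the user.
audit_rows = conn.execute("SELECT * FROM skill_audit").fetchall()
# Column 1 of PRAGMA index_list holds the index name.
index_names = [row[1] for row in conn.execute("PRAGMA index_list(employee)")]

conn.close()
```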

Create Table Command: The CREATE TABLE command is used to implement the schemas of
individual relations. Steps in Table Creation:

a. Identify datatypes for attributes
b. Identify columns that can and cannot be null
c. Identify columns that must be unique
d. Identify primary key–foreign key mates
e. Determine default values
f. Identify constraints on columns (domain specifications)
g. Create the table

Syntax: CREATE TABLE table-name (
column-name1 data-type-1 [constraint],
column-name2 data-type-2 [constraint],
…,
column-nameN data-type-N [constraint] );
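
The steps and syntax above might be applied as in the following sketch, which assumes SQLite
through Python's sqlite3 module. The column names, the email column, the default value and the
1 to 10 range for skill_level are all hypothetical choices made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        emp_id      INTEGER     PRIMARY KEY,                 -- step d: key
        fname       VARCHAR(30) NOT NULL,                    -- step b: cannot be null
        lname       VARCHAR(30) NOT NULL,
        email       VARCHAR(60) UNIQUE,                      -- step c: must be unique
        skill_level INTEGER     DEFAULT 1                    -- step e: default value
                                CHECK (skill_level BETWEEN 1 AND 10)  -- step f: domain
    )
""")

# Columns that are not mentioned receive NULL or their declared default.
conn.execute("INSERT INTO employee (emp_id, fname, lname) VALUES (28, 'Robel', 'Dawit')")
row = conn.execute(
    "SELECT emp_id, fname, lname, email, skill_level FROM employee"
).fetchone()

# Column 1 of PRAGMA table_info holds the column name.
columns = [info[1] for info in conn.execute("PRAGMA table_info(employee)")]

conn.close()
```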

2. Data Manipulation Language commands (DML): DML commands are used to maintain
and query a database, including inserting, updating, deleting, and querying data.
3. Data Control Language commands (DCL): DCL commands are used to control a database,
including administering privileges and saving of data. They determine whether a user is
allowed to carry out a particular operation or not. The ANSI standard groups these commands
as being part of the DDL.
6.5.2. Datatypes in SQL
In the relational model, data are stored in the form of tables. A table is composed of rows and
columns. When we create a table, we must specify a datatype for each of its columns. These
datatypes define the domain of values that each column can take. MS SQL Server, Oracle,
MySQL, and other DBMSs provide a number of built-in datatypes as well as several categories of
user-defined types that can be used as datatypes. Some of the built-in datatypes are string
datatypes to store characters, number datatypes to store numerical values, and date and time
datatypes to store when an event happened (history, date of birth, etc.). Read about the
datatypes of different DBMSs in your laboratory manual.

6.5.3. SQL Selection Operation

The selection operation can be considered as row-wise filtering. We can select specific row(s)
using a condition. Syntax: SELECT * FROM table-name WHERE condition;

6.5.4. SQL Projection Operation
The projection operation performs column-wise filtering: specific columns are selected. If all
the columns of the table are selected, the query is not considered a PROJECTION.
Syntax: SELECT column1, column2, …, columnN FROM table-name;

6.5.5. SELECTION and PROJECTION Operation

We can perform both selection and projection on a relation. Combining the two operations
restricts both the rows and the columns of the relation.
Syntax: SELECT column1, column2, …, columnN FROM table-name WHERE condition;
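
Selection, projection, and their combination can be compared side by side in the sketch below.
It assumes SQLite through Python's sqlite3 module; the table layout and the row with emp_id 7
are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER, fname TEXT, university TEXT, skill_level INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(28, "Robel", "AAU", 10), (25, "Abera", "WKU", 8), (7, "Sample", "JU", 5)],
)

# Selection: all columns, only the rows that satisfy the condition.
selection = conn.execute(
    "SELECT * FROM employee WHERE skill_level >= 8 ORDER BY emp_id"
).fetchall()

# Projection: all rows, only the listed columns.
projection = conn.execute(
    "SELECT fname, university FROM employee ORDER BY emp_id"
).fetchall()

# Selection and projection together: fewer rows and fewer columns.
both = conn.execute(
    "SELECT fname, university FROM employee WHERE skill_level >= 8 ORDER BY emp_id"
).fetchall()

conn.close()
```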

6.5.6. Aggregate Functions

SQL provides built-in aggregate functions to facilitate query processing. The seven covered here
are COUNT, MAX, MIN, SUM, AVG, STDDEV, and VARIANCE. The availability and spelling of STDDEV
and VARIANCE vary between DBMSs.
No.  Function  Use                                           Syntax
1.   COUNT     Counts the rows of the relation; with a       A. SELECT COUNT(*) FROM table-name;
               column name, counts its non-null values.      B. SELECT COUNT(column-name) FROM table-name;
2.   MAX       Finds the maximum value of the attribute.     SELECT MAX(attribute-name) FROM table-name;
3.   MIN       Finds the minimum value of the attribute.     SELECT MIN(attribute-name) FROM table-name;
4.   SUM       Finds the sum of the values of the            SELECT SUM(attribute-name) FROM table-name;
               attribute, provided its datatype is number.
5.   AVG       Finds the average of n values, ignoring       SELECT AVG(attribute-name) FROM table-name;
               null values.
6.   STDDEV    Standard deviation of n values, ignoring      SELECT STDDEV(attribute-name) FROM table-name;
               null values.
7.   VARIANCE  Variance of n values, ignoring null values.   SELECT VARIANCE(attribute-name) FROM table-name;
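
The first five functions can be tried out as sketched below, assuming SQLite through Python's
sqlite3 module. STDDEV and VARIANCE are omitted because SQLite does not provide them as built-in
functions. The skill levels 10, 8, 9 and 8 come from the employee example earlier in this
chapter; the fifth row, with a NULL skill level, is invented to show how nulls are treated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER, skill_level INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(28, 10), (25, 8), (65, 9), (51, 8), (7, None)],  # emp_id 7 is invented
)

# COUNT(*) counts rows; the other aggregates skip the NULL value.
stats = conn.execute("""
    SELECT COUNT(*), COUNT(skill_level), MAX(skill_level),
           MIN(skill_level), SUM(skill_level), AVG(skill_level)
    FROM employee
""").fetchone()

conn.close()
```

Here COUNT(*) sees five rows, while COUNT(skill_level), SUM and AVG work on the four non-null
values only, so the average is 35 / 4.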
 GROUP BY Clause: The GROUP BY clause is used to group rows in order to compute group
statistics. Syntax: SELECT attribute-name, aggregate-function FROM table-name GROUP
BY attribute-name;
 HAVING Clause: The HAVING clause is used to select groups. In other words,
HAVING restricts the groups according to a specified condition.
Syntax: SELECT attribute-name, aggregate-function FROM table-name GROUP BY
attribute-name HAVING condition;
 SORTING of Results: The ORDER BY clause is used to sort the result in
ascending or descending order. Syntax: SELECT * FROM table-name ORDER BY
attribute-name [ASC | DESC];
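
GROUP BY, HAVING and ORDER BY might be combined as in the sketch below, assuming SQLite through
Python's sqlite3 module. The universities and skill levels are those of the employee example
earlier in this chapter; the table layout is a simplification made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER, university TEXT, skill_level INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(28, "AAU", 10), (25, "WKU", 8), (65, "WKU", 9), (51, "JU", 8)],
)

# One result row per university, sorted by the group average, highest first.
grouped = conn.execute("""
    SELECT university, COUNT(*), AVG(skill_level)
    FROM employee
    GROUP BY university
    ORDER BY AVG(skill_level) DESC
""").fetchall()

# HAVING filters whole groups, here keeping universities with two or more employees.
large_groups = conn.execute("""
    SELECT university, COUNT(*)
    FROM employee
    GROUP BY university
    HAVING COUNT(*) >= 2
""").fetchall()

conn.close()
```

WHERE filters individual rows before grouping, whereas HAVING filters the groups after the
aggregate values have been computed.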
6.5.7. Data Manipulation Language
1. Adding a New Row to the Table: The INSERT command is used to add a new row to the table.
Syntax: INSERT INTO table-name VALUES (value1, value2, …, valueN);
2. Updating the Data in the Table: The data in the table can be updated by using the UPDATE
command. Syntax: UPDATE table-name SET attribute-name = new-value WHERE
condition;
3. Deleting Row from the Table: The DELETE command in SQL is used to delete row(s)
from the table. Syntax: DELETE FROM table-name WHERE condition;
4. Filtering Rows from Single or Multiple Tables: The SELECT command is used to retrieve data
from one or more tables. It can be combined with aggregate functions, the WHERE clause,
the LIKE operator, and projection. Syntax: SELECT * FROM table-name;
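
The INSERT, UPDATE and DELETE commands above can be followed step by step in this sketch, which
assumes SQLite through Python's sqlite3 module and a hypothetical three-column employee table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, fname TEXT, skill_level INTEGER)"
)

# INSERT: one value per column, in the order the columns were declared.
conn.execute("INSERT INTO employee VALUES (28, 'Robel', 10)")
conn.execute("INSERT INTO employee VALUES (25, 'Abera', 8)")

# UPDATE: the WHERE clause limits the change to the matching row(s).
conn.execute("UPDATE employee SET skill_level = 9 WHERE emp_id = 25")

# DELETE: without a WHERE clause every row would be removed.
conn.execute("DELETE FROM employee WHERE emp_id = 28")

remaining = conn.execute("SELECT * FROM employee").fetchall()
conn.close()
```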
6.5.8. Table Modification Commands

We can use the ALTER TABLE command to alter the structure of a table, for example to add a new
column. It is also possible to delete a column from the table using the DROP COLUMN clause.

1. Adding a Column to the Table: We can add a column to the table by using the ADD clause.
Syntax: ALTER TABLE table-name ADD column-name datatype; Data can be inserted into
the newly added column by using the UPDATE command.
Syntax: UPDATE table-name SET attribute-name = new-value WHERE condition;
2. Modifying the Column of the Table: We can modify the datatype of a column (for example, its
width) by using ALTER TABLE with the MODIFY clause.
Syntax: ALTER TABLE table-name MODIFY column-name datatype;
3. Deleting the Column of the Table: The DROP COLUMN clause can be used along with
the ALTER TABLE command to delete a column of the table. Syntax: ALTER TABLE table-name
DROP COLUMN column-name;
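
Adding a column and then filling it with UPDATE might look like the sketch below, assuming SQLite
through Python's sqlite3 module and a hypothetical employee table. Only the ADD case is shown:
SQLite has no MODIFY clause, and DROP COLUMN is available only in recent SQLite releases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, fname TEXT)")
conn.execute("INSERT INTO employee VALUES (28, 'Robel')")

# Existing rows receive NULL in the newly added column.
conn.execute("ALTER TABLE employee ADD COLUMN university TEXT")
after_add = conn.execute("SELECT * FROM employee").fetchall()

# UPDATE then supplies a value for the new column.
conn.execute("UPDATE employee SET university = 'AAU' WHERE emp_id = 28")
after_update = conn.execute("SELECT * FROM employee").fetchall()

conn.close()
```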
6.5.9. Table Truncation
The TRUNCATE TABLE command removes all the rows from the table. It also releases the storage
space used by the table. Syntax: TRUNCATE TABLE table-name;
Note: Another way to delete all the rows of the table is to use the DELETE command. The syntax
is: DELETE FROM table-name;

1. Dropping a Table: The definition of the table as well as the contents of the table is deleted by
issuing DROP TABLE command. Syntax: DROP TABLE table-name;
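
The difference between emptying a table and dropping it can be seen in the sketch below. It
assumes SQLite through Python's sqlite3 module; since SQLite has no TRUNCATE TABLE statement, the
DELETE form without a WHERE clause is used to remove all rows. The employee table is
hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, fname TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?)", [(28, "Robel"), (25, "Abera")])

# Removing all rows keeps the table definition in place.
conn.execute("DELETE FROM employee")
rows_after_delete = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]

# DROP TABLE removes the definition as well, so later queries fail.
conn.execute("DROP TABLE employee")
try:
    conn.execute("SELECT * FROM employee")
    table_exists = True
except sqlite3.OperationalError:  # raised because the table no longer exists
    table_exists = False

conn.close()
```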
6.5.10. Constraints

Constraints are used to impose rules on a table; the rules are checked whenever a row is
inserted, updated, or deleted. Constraints also prevent the deletion of a table if there are
dependencies. The different types of constraints that can be imposed on the table are NOT NULL,
UNIQUE, PRIMARY KEY, FOREIGN KEY, INDEX and CHECK.

1. NOT NULL Constraint: If a column must never take a NULL value, we can impose the NOT NULL
constraint on that column. The syntax of the NOT NULL constraint is:

CREATE TABLE table-name (
column-name1 data-type-1 (size) NOT NULL,
column-name2 data-type-2, …, column-nameN data-type-N);
2. UNIQUE Constraint: The UNIQUE constraint requires every value in a column or set of
columns to be unique. It means that no two rows of a table can have duplicate values in the
specified column or set of columns. The syntax of the UNIQUE constraint is:
CREATE TABLE table-name (
column-name1 data-type-1 (size) UNIQUE,
column-name2 data-type-2, …, column-nameN data-type-N);
Note: A NOT NULL constraint accepts duplicate values, whereas a UNIQUE constraint does not.
Conversely, an attribute with a UNIQUE constraint can accept NULL values, whereas an attribute
with a NOT NULL constraint cannot.

3. Primary Key Constraint: When an attribute or set of attributes is declared as the primary key,
it will accept neither NULL values nor duplicate values. It is to be noted that “only one
primary key can be defined for each table.” The syntax of the PRIMARY KEY constraint is:

CREATE TABLE table-name (
column-name1 data-type-1 (size) PRIMARY KEY,
column-name2 data-type-2, …, column-nameN data-type-N);
Difference Between UNIQUE and PRIMARY KEY Constraints: An attribute declared as primary
key will not accept NULL values, whereas an attribute declared as UNIQUE will accept NULL
values. Only one PRIMARY KEY can be defined for each table, but more than one UNIQUE
constraint can be defined for each table.

4. CHECK Constraint: A CHECK constraint is added to the declaration of an attribute. Its
condition may use the name of the attribute itself; any other relation or attribute name may
appear only in a subquery. An attribute-based check is evaluated only when the value of the
attribute is inserted or updated. Syntax of the CHECK constraint:

CREATE TABLE table-name (
column-name1 data-type-1 (size) PRIMARY KEY,
column-name2 data-type-2, …, column-nameN data-type-N,
CHECK (column-name2 comparison-operator value));
Here comparison-operator is one of =, >=, <=, <, >.
5. Referential Integrity Constraint: According to the referential integrity constraint, when a
foreign key in one relation references the primary key of another relation, the foreign key
value must match a primary key value in the referenced relation.

CREATE TABLE table-name1 (
column-name1 data-type-1 (size) PRIMARY KEY,
column-name2 data-type-2, …, column-nameN data-type-N,
CHECK (column-name2 comparison-operator value));

CREATE TABLE table-name2 (
column-name1 data-type-1 (size),
column-name2 data-type-2, …, column-nameN data-type-N,
CHECK (column-name2 comparison-operator value),
FOREIGN KEY (column-name1) REFERENCES table-name1 (column-name1));
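
The five constraints can be seen rejecting bad data in the sketch below. It assumes SQLite
through Python's sqlite3 module; the dept and employee tables, their columns, and the helper
function rejected are hypothetical. SQLite enforces foreign keys only after the PRAGMA shown in
the code, which is specific to that DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable foreign key checks
conn.executescript("""
CREATE TABLE dept (
    did   INTEGER PRIMARY KEY,
    dname TEXT NOT NULL UNIQUE
);
CREATE TABLE employee (
    eid         INTEGER PRIMARY KEY,
    fname       TEXT NOT NULL,
    skill_level INTEGER CHECK (skill_level >= 1),
    dept        INTEGER,
    FOREIGN KEY (dept) REFERENCES dept (did)
);
INSERT INTO dept VALUES (5, 'Research');
INSERT INTO employee VALUES (1, 'Robel', 10, 5);
""")


def rejected(sql):
    """Hypothetical helper: True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


results = {
    "NOT NULL": rejected("INSERT INTO employee VALUES (2, NULL, 5, 5)"),
    "UNIQUE": rejected("INSERT INTO dept VALUES (6, 'Research')"),
    "PRIMARY KEY": rejected("INSERT INTO employee VALUES (1, 'Abera', 8, 5)"),
    "CHECK": rejected("INSERT INTO employee VALUES (3, 'Almaz', 0, 5)"),
    "FOREIGN KEY": rejected("INSERT INTO employee VALUES (4, 'Selam', 8, 99)"),
}
row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
conn.close()
```

Each statement breaks exactly one rule, so none of the five rows is stored and the employee table
still holds only the row inserted at the start.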
6.5.11. Join Operation and Set Operations
A join operation is used to retrieve data from more than one table. There are different joins,
such as inner join, left outer join, right outer join, full outer join, self-join, equijoin and
natural join. The UNION, INTERSECTION, and MINUS (Difference) operations are considered SET
operations. All three are binary operations. UNION and INTERSECTION are commutative, whereas
MINUS (Difference) is not.
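
An inner join, a left outer join, and the three set operations might be tried out as in the
sketch below. It assumes SQLite through Python's sqlite3 module, where the set operators are
spelled UNION, INTERSECT and EXCEPT (Oracle uses MINUS for the last one). The tables and rows
are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (eid INTEGER PRIMARY KEY, fname TEXT);
CREATE TABLE works_on (eid INTEGER, pid INTEGER);
INSERT INTO employee VALUES (1, 'Robel'), (2, 'Abera'), (3, 'Almaz');
INSERT INTO works_on VALUES (1, 10), (2, 10), (2, 20);
""")

# Inner join: only employees that have a matching works_on row.
inner = conn.execute("""
    SELECT e.fname, w.pid FROM employee e
    INNER JOIN works_on w ON e.eid = w.eid
    ORDER BY e.eid, w.pid
""").fetchall()

# Left outer join: every employee; pid is NULL where there is no match.
left = conn.execute("""
    SELECT e.fname, w.pid FROM employee e
    LEFT OUTER JOIN works_on w ON e.eid = w.eid
    ORDER BY e.eid, w.pid
""").fetchall()

a = "SELECT eid FROM works_on WHERE pid = 10"  # employees on project 10
b = "SELECT eid FROM works_on WHERE pid = 20"  # employees on project 20


def ids(sql):
    """Run a one-column query and return its values as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


union = ids(f"{a} UNION {b}")
intersection = ids(f"{a} INTERSECT {b}")
a_minus_b = ids(f"{a} EXCEPT {b}")
b_minus_a = ids(f"{b} EXCEPT {a}")  # differs from a_minus_b: difference is not commutative

conn.close()
```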