You are on page 1of 35

UNIT-II DBMS

Unit II

Relational Model: Structure of Relational Databases – Relational Algebra – Extended -


Relational Algebra Operations – Modification of Database – Views – Tuple Relational -
Calculus – Domain Relational Calculus. SQL: Background – Basic Structure – Set -
Operations – Aggregate Functions – Null Values – Nested Sub-queries – Views – Complex
Queries –Modification of the database –Joined Relations – Data-Definition Language.
THE RELATIONAL MODEL

1. The first database systems were based on the network and hierarchical models.
These are covered briefly in appendices in the text. The relational model was first
proposed by E.F. Codd in 1970 and the first such systems (notably INGRES and
System/R) was developed in 1970s. The relational model is now the dominant
model for commercial data processing applications.
2. Note: Attribute Name Abbreviations

The text uses fairly long attribute names which are abbreviated in the notes as
follows.

o customer-name becomes cname


o customer-city becomes ccity
o branch-city becomes bcity
o branch-name becomes bname
o account-number becomes account#
o loan-number becomes loan#
o banker-name becomes banker

Structure of Relational Database

1. A relational database consists of a collection of tables, each having a unique


name.

A row in a table represents a relationship among a set of values.

Thus a table represents a collection of relationships.

2. There is a direct correspondence between the concept of a table and the


mathematical concept of a relation. A substantial theory has been developed for
relational databases.

Basic Structure
1. Figure 3.1 shows the deposit and customer tables for our banking example.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 1


UNIT-II DBMS

Figure 3.1: The deposit and customer relations.

o It has four attributes.


o For each attribute there is a permitted set of values, called the domain of
that attribute.
o E.g. the domain of bname is the set of all branch names.

Let denote the domain of bname, and , and the remaining attributes'
domains respectively.

Then, any row of deposit consists of a four-tuple where

In general, deposit contains a subset of the set of all possible rows.

That is, deposit is a subset of

In general, a table of n columns must be a subset of

2. Mathematicians define a relation to be a subset of a Cartesian product of a list of


domains. You can see the correspondence with our tables.

We will use the terms relation and tuple in place of table and row from now on.

3. Some more formalities:


o let the tuple variable refer to a tuple of the relation .
o We say to denote that the tuple is in relation .
o Then [bname] = [1] = the value of on the bname attribute.
o So [bname] = [1] = ``Downtown'',
o and [cname] = [3] = ``Johnson''.
4. We'll also require that the domains of all attributes be indivisible units.
o A domain is atomic if its elements are indivisible units.
o For example, the set of integers is an atomic domain.
o The set of all sets of integers is not.
o Why? Integers do not have subparts, but sets do - the integers comprising
them.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 2


UNIT-II DBMS

o We could consider integers non-atomic if we thought of them as ordered


lists of digits.

Database Scheme
1. We distinguish between a database scheme (logical design) and a database
instance (data in the database at a point in time).
2. A relation scheme is a list of attributes and their corresponding domains.
3. The text uses the following conventions:
o italics for all names
o lowercase names for relations and attributes
o names beginning with an uppercase for relation schemes

These notes will do the same.

For example, the relation scheme for the deposit relation:

o Deposit-scheme = (bname, account#, cname, balance)

We may state that deposit is a relation on scheme Deposit-scheme by writing


deposit(Deposit-scheme).

If we wish to specify domains, we can write:

o (bname: string, account#: integer, cname: string, balance: integer).

Note that customers are identified by name. In the real world, this would not be
allowed, as two or more customers might share the same name.

Figure 3.2 shows the E-R diagram for a banking enterprise.

Figure 3.2: E-R diagram for the banking enterprise

4. The relation schemes for the banking example used throughout the text are:
o Branch-scheme = (bname, assets, bcity)

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 3


UNIT-II DBMS

o Customer-scheme = (cname, street, ccity)


o Deposit-scheme = (bname, account#, cname, balance)
o Borrow-scheme = (bname, loan#, cname, amount)

Note: some attributes appear in several relation schemes (e.g. bname, cname).
This is legal, and provides a way of relating tuples of distinct relations.

5. Why not put all attributes in one relation?

Suppose we use one large relation instead of customer and deposit:

o Account-scheme = (bname, account#, cname, balance, street, ccity)


o If a customer has several accounts, we must duplicate her or his address
for each account.
o If a customer has an account but no current address, we cannot build a
tuple, as we have no values for the address.
o We would have to use null values for these fields.
o Null values cause difficulties in the database.
o By using two separate relations, we can do this without using null values

Keys
1. The notions of superkey, candidate key and primary key all apply to the
relational model.
2. For example, in Branch-scheme,
o {bname} is a superkey.
o {bname, bcity} is a superkey.
o {bname, bcity} is not a candidate key, as the superkey {bname} is
contained in it.
o {bname} is a candidate key.
o {bcity} is not a superkey, as branches may be in the same city.
o We will use {bname} as our primary key.
3. The primary key for Customer-scheme is {cname}.
4. More formally, if we say that a subset of is a superkey for , we are
restricting consideration to relations in which no two distinct tuples have the
same values on all attributes in . In other words,
o If and are in , and
o ,
o then .

Query Languages
1. A query language is a language in which a user requests information from a
database. These are typically higher-level than programming languages.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 4


UNIT-II DBMS

They may be one of:

o Procedural, where the user instructs the system to perform a sequence of


operations on the database. This will compute the desired information.
o Nonprocedural, where the user specifies the information desired without
giving a procedure for obtaining the information.
2. A complete query language also contains facilities to insert and delete tuples as
well as to modify parts of existing tuples.

The Relational Algebra

1. The relational algebra is a procedural query language.


o Six fundamental operations:
 select (unary)
 project (unary)
 rename (unary)
 cartesian product (binary)
 union (binary)
 set-difference (binary)
o Several other operations, defined in terms of the fundamental operations:
 set-intersection
 natural join
 division
 assignment
o Operations produce a new relation as a result.

Fundamental Operations
1. The Select Operation

Select selects tuples that satisfy a given predicate. Select is denoted by a


lowercase Greek sigma ( ), with the predicate appearing as a subscript. The
argument relation is given in parentheses following the .

For example, to select tuples (rows) of the borrow relation where the branch is
``SFU'', we would write

Let Figure 3.3 be the borrow and branch relations in the banking example.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 5


UNIT-II DBMS

Figure 3.3: The borrow and branch relations.

The new relation created as the result of this operation consists of one tuple:
.

We allow comparisons using =, , <, , > and in the selection predicate.

We also allow the logical connectives (or) and (and). For example:

Figure 3.4: The client relation.

Suppose there is one more relation, client, shown in Figure 3.4, with the scheme

we might write

to find clients who have the same name as their banker.

2. The Project Operation

Project copies its argument relation for the specified attributes only. Since a
relation is a set, duplicate rows are eliminated.

Projection is denoted by the Greek capital letter pi ( ). The attributes to be copied


appear as subscripts.

For example, to obtain a relation showing customers and branches, but ignoring
amount and loan#, we write

We can perform these operations on the relations resulting from other operations.

To get the names of customers having the same name as their bankers,

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 6


UNIT-II DBMS

Think of select as taking rows of a relation, and project as taking columns of a


relation.

3. The Cartesian Product Operation

The cartesian product of two relations is denoted by a cross ( ), written

The result of is a new relation with a tuple for each possible pairing of
tuples from and .

In order to avoid ambiguity, the attribute names have attached to them the name
of the relation from which they came. If no ambiguity will result, we drop the
relation name.

The result is a very large relation. If has tuples, and has


tuples, then will have tuples.

The resulting scheme is the concatenation of the schemes of and , with


relation names added as mentioned.

To find the clients of banker Johnson and the city in which they live, we need
information in both client and customer relations. We can get this by writing

However, the customer.cname column contains customers of bankers other than


Johnson. (Why?)

We want rows where client.cname = customer.cname. So we can write

to get just these tuples.

Finally, to get just the customer's name and city, we need a projection:

4. The Rename Operation

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 7


UNIT-II DBMS

The rename operation solves the problems that occurs with naming when
performing the cartesian product of a relation with itself.

Suppose we want to find the names of all the customers who live on the same
street and in the same city as Smith.

We can get the street and city of Smith by writing

To find other customers with the same information, we need to reference the
customer relation again:

where is a selection predicate requiring street and ccity values to be equal.

Problem: how do we distinguish between the two street values appearing in the
Cartesian product, as both come from a customer relation?
Solution: use the rename operator, denoted by the Greek letter rho ( ).

We write

to get the relation under the name of .

If we use this to rename one of the two customer relations we are using, the
ambiguities will disappear.

5. The Union Operation

The union operation is denoted as in set theory. It returns the union (set union)
of two compatible relations.

For a union operation to be legal, we require that

o and must have the same number of attributes.


o The domains of the corresponding attributes must be the same.

To find all customers of the SFU branch, we must find everyone who has a loan
or an account or both at the branch.

We need both borrow and deposit relations for this:

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 8


UNIT-II DBMS

As in all set operations, duplicates are eliminated, giving the relation of Figure
3.5(a).

Figure 3.5: The union and set-difference operations.

6. The Set Difference Operation

Set difference is denoted by the minus sign ( ). It finds tuples that are in one
relation, but not in another.

Thus results in a relation containing tuples that are in but not in .

To find customers of the SFU branch who have an account there but no loan, we
write

The result is shown in Figure 3.5(b).

We can do more with this operation. Suppose we want to find the largest account
balance in the bank.
Strategy:

o Find a relation containing the balances not the largest.


o Compute the set difference of and the deposit relation.

To find , we write

This resulting relation contains all balances except the largest one. (See Figure
3.6(a)).

Now we can finish our query by taking the set difference:

Figure 3.6(b) shows the result.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 9


UNIT-II DBMS

Figure 3.6: Find the largest account balance in the bank.

Formal Definition of Relational Algebra


1. A basic expression consists of either
o A relation in the database.
o A constant relation.
2. General expressions are formed out of smaller subexpressions using
o select (p a predicate)
o project (s a list of attributes)
o rename (x a relation name)
o union
o set difference
o cartesian product

Additional Operations
1. Additional operations are defined in terms of the fundamental operations. They do
not add power to the algebra, but are useful to simplify common queries.
2. The Set Intersection Operation

Set intersection is denoted by , and returns a relation that contains tuples that are
in both of its argument relations.

It does not add any power as

To find all customers having both a loan and an account at the SFU branch, we
write

3. The Natural Join Operation

Often we want to simplify queries on a cartesian product.

For example, to find all customers having a loan at the bank and the cities in
which they live, we need borrow and customer relations:

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 10


UNIT-II DBMS

Our selection predicate obtains only those tuples pertaining to only one cname.

This type of operation is very common, so we have the natural join, denoted by a
sign. Natural join combines a cartesian product and a selection into one
operation. It performs a selection forcing equality on those attributes that appear
in both relation schemes. Duplicates are removed as in all relation operations.

To illustrate, we can rewrite the previous query as

The resulting relation is shown in Figure 3.7.

Figure 3.7: Joining borrow and customer relations.

We can now make a more formal definition of natural join.

o Consider and to be sets of attributes.


o We denote attributes appearing in both relations by .
o We denote attributes in either or both relations by .
o Consider two relations and .
o The natural join of and , denoted by is a relation on scheme
.
o It is a projection onto of a selection on where the predicate
requires for each attribute in .

Formally,

where .

To find the assets and names of all branches which have depositors living in
Stamford, we need customer, deposit and branch relations:

Note that is associative.

To find all customers who have both an account and a loan at the SFU branch:

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 11


UNIT-II DBMS

This is equivalent to the set intersection version we wrote earlier. We see now that
there can be several ways to write a query in the relational algebra.

If two relations and have no attributes in common, then , and


.

4. The Division Operation

Division, denoted , is suited to queries that include the phrase ``for all''.

Suppose we want to find all the customers who have an account at all branches
located in Brooklyn.

Strategy: think of it as three steps.

We can obtain the names of all branches located in Brooklyn by

Figure 3.19 in the textbook shows the result.

We can also find all cname, bname pairs for which the customer has an account
by

Figure 3.20 in the textbook shows the result.

Now we need to find all customers who appear in with every branch name in
.

The divide operation provides exactly those customers:

which is simply .

Formally,

o Let and be relations.


o Let .
o The relation is a relation on scheme .
o A tuple is in if for every tuple in there is a tuple in satisfying
both of the following:

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 12


UNIT-II DBMS

o These conditions say that the portion of a tuple is in if and


only if there are tuples with the portion and the portion in for
every value of the portion in relation .

We will look at this explanation in class more closely.

The division operation can be defined in terms of the fundamental operations.

Read the text for a more detailed explanation.

5. The Assignment Operation

Sometimes it is useful to be able to write a relational algebra expression in parts


using a temporary relation variable (as we did with and in the division
example).

The assignment operation, denoted , works like assignment in a programming


language.

We could rewrite our division definition as

No extra relation is added to the database, but the relation variable created can be
used in subsequent expressions. Assignment to a permanent relation would
constitute a modification to the database.

The Tuple Relational Calculus

1. The tuple relational calculus is a nonprocedural language. (The relational algebra


was procedural.)

We must provide a formal description of the information desired.

2. A query in the tuple relational calculus is expressed as

i.e. the set of tuples for which predicate is true.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 13


UNIT-II DBMS

3. We also use the notation


o to indicate the value of tuple on attribute .
o to show that tuple is in relation .

Example Queries
1. For example, to find the branch-name, loan number, customer name and amount
for loans over $1200:

This gives us all attributes, but suppose we only want the customer names. (We
would use project in the algebra.)

We need to write an expression for a relation on scheme (cname).

In English, we may read this equation as ``the set of all tuples such that there
exists a tuple in the relation borrow for which the values of and for the cname
attribute are equal, and the value of for the amount attribute is greater than
1200.''

The notation means ``there exists a tuple in relation such that


predicate is true''.

How did we get the above expression? We needed tuples on scheme cname such
that there were tuples in borrow pertaining to that customer name with amount
attribute .

The tuples get the scheme cname implicitly as that is the only attribute is
mentioned with.

Let's look at a more complex example.

Find all customers having a loan from the SFU branch, and the the cities in which
they live:

In English, we might read this as ``the set of all (cname,ccity) tuples for which
cname is a borrower at the SFU branch, and ccity is the city of cname''.

Tuple variable ensures that the customer is a borrower at the SFU branch.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 14


UNIT-II DBMS

Tuple variable is restricted to pertain to the same customer as , and also ensures
that ccity is the city of the customer.

The logical connectives (AND) and (OR) are allowed, as well as (negation).

We also use the existential quantifier and the universal quantifier .

Some more examples:

1. Find all customers having a loan, an account, or both at the SFU branch:

Note the use of the connective.


As usual, set operations remove all duplicates.

2. Find all customers who have both a loan and an account at the SFU branch.

Solution: simply change the connective in 1 to a .

3. Find customers who have an account, but not a loan at the SFU branch.

4. Find all customers who have an account at all branches located in Brooklyn.
(We used division in relational algebra.)

For this example we will use implication, denoted by a pointing finger in the text,
but by here.

The formula means implies , or, if is true, then must be true.

In English: the set of all cname tuples such that for all tuples in the branch
relation, if the value of on attribute bcity is Brooklyn, then the customer has an
account at the branch whose name appears in the bname attribute of .

Division is difficult to understand. Think it through carefully.

Formal Definitions

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 15


UNIT-II DBMS

1. A tuple relational calculus expression is of the form

where is a formula.

Several tuple variables may appear in a formula.

2. A tuple variable is said to be a free variable unless it is quantified by a or a .


Then it is said to be a bound variable.
3. A formula is built of atoms. An atom is one of the following forms:
o , where is a tuple variable, and r is a relation ( is not allowed).
o , where and are tuple variables, and and are attributes,
and is a comparison operator ( ).
o , where is a constant in the domain of attribute .
4. Formulae are built up from atoms using the following rules:
o An atom is a formula.
o If is a formula, then so are and .
o If and are formulae, then so are , and .
o If is a formula containing a free tuple variable , then

are formulae also.

5. Note some equivalences:


o
o
o

Safety of Expressions
1. A tuple relational calculus expression may generate an infinite expression, e.g.

2. There are an infinite number of tuples that are not in borrow! Most of these tuples
contain values that do not appear in the database.
3. Safe Tuple Expressions

We need to restrict the relational calculus a bit.

o The domain of a formula , denoted dom( ), is the set of all values


referenced in .
o These include values mentioned in as well as values that appear in a
tuple of a relation mentioned in .

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 16


UNIT-II DBMS

o So, the domain of is the set of all values explicitly appearing in or that
appear in relations mentioned in .
o is the set of all values appearing
in borrow.
o is the set of all values appearing in borrow.

We may say an expression is safe if all values that appear in the result
are values from dom( ).

4. A safe expression yields a finite number of tuples as its result. Otherwise, it is


called unsafe.

Expressive Power of Languages


1. The tuple relational calculus restricted to safe expressions is equivalent in
expressive power to the relational algebra.

The Domain Relational Calculus

1. Domain variables take on values from an attribute's domain, rather than values for
an entire tuple.

Formal Definitions
1. An expression is of the form

where the represent domain variables, and is a formula.

2. An atom in the domain relational calculus is of the following forms


o where is a relation on attributes, and ,
are domain variables or constants.
o , where and are domain variables, and is a comparison
operator.
o , where c is a constant.
3. Formulae are built up from atoms using the following rules:
o An atom is a formula.
o If is a formula, then so are and .
o If and are formulae, then so are , and .
o If is a formula where x is a domain variable, then so are
and .

Example Queries
Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 17
UNIT-II DBMS

1. Find branch name, loan number, customer name and amount for loans of over
$1200.

2. Find all customers who have a loan for an amount > than $1200.

3. Find all customers having a loan from the SFU branch, and the city in which they
live.

4. Find all customers having a loan, an account or both at the SFU branch.

5. Find all customers who have an account at all branches located in Brooklyn.

If you find this example difficult to understand, try rewriting this expression using
implication, as in the tuple relational calculus example. Here's my attempt:

I've used two letter variable names to get away from the problem of having to
remember what stands for.

Safety of Expressions
1. As in the tuple relational calculus, it is possible to generate infinite expressions.
The solution is similar for domain relational calculus-restrict the form to safe
expressions involving values in the domain of the formula.

Read the text for a complete explanation.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 18


UNIT-II DBMS

Expressive Power of Languages


1. All three of the following are equivalent:
o The relational algebra.
o The tuple relational calculus restricted to safe expressions.
o The domain relational calculus restricted to safe expressions.

Modifying the Database

1. Up until now, we have looked at extracting information from the database. We


also need to add, remove and change information. Modifications are expressed
using the assignment operator.

Deletion
1. Deletion is expressed in much the same way as a query. Instead of displaying, the
selected tuples are removed from the database. We can only delete whole tuples.

In relational algebra, a deletion is of the form

where is a relation and is a relational algebra query.

Tuples in for which is true are deleted.

2. Some examples:

1. Delete all of Smith's account records.

2. Delete all loans with loan numbers between 1300 and 1500.

3. Delete all accounts at Branches located in Needham.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 19


UNIT-II DBMS

Insertions
1. To insert data into a relation, we either specify a tuple, or write a query whose
result is the set of tuples to be inserted. Attribute values for inserted tuples must
be members of the attribute's domain.
2. An insertion is expressed by

where is a relation and is a relational algebra expression.

3. Some examples:

1. To insert a tuple for Smith who has $1200 in account 9372 at the SFU branch.

2. To provide all loan customers in the SFU branch with a $200 savings account.

Updating
1. Updating allows us to change some values in a tuple without necessarily changing
all.

We use the update operator, , with the form

where is a relation with attribute , which is assigned the value of expression


.

The expression is any arithmetic expression involving constants and attributes


in relation .

Some examples:

1. To increase all balances by 5 percent.

This statement is applied to every tuple in deposit.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 20


UNIT-II DBMS

2. To make two different rates of interest payment, depending on balance amount:

Views

1. We have assumed up to now that the relations we are given are the actual
relations stored in the database.
2. For security and convenience reasons, we may wish to create a personalized
collection of relations for a user.
3. We use the term view to refer to any relation, not part of the conceptual model,
that is made visible to the user as a ``virtual relation''.
4. As relations may be modified by deletions, insertions and updates, it is generally
not possible to store views. (Why?) Views must then be recomputed for each
query referring to them.

View Definition
1. A view is defined using the create view command:

where <query expression> is any legal query expression.

The view created is given the name .

2. To create a view all-customer of all branches and their customers:

3. Having defined a view, we can now use it to refer to the virtual relation it creates.
View names can appear anywhere a relation name can.
4. We can now find all customers of the SFU branch by writing

Updates Through Views and Null Values


1. Updates, insertions and deletions using views can cause problems. The
modifications on a view must be transformed to modifications of the actual
relations in the conceptual model of the database.
2. An example will illustrate: consider a clerk who needs to see all information in
the borrow relation except amount.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 21


UNIT-II DBMS

Let the view loan-info be given to the clerk:

3. Since SQL allows a view name to appear anywhere a relation name may appear,
the clerk can write:

This insertion is represented by an insertion into the actual relation borrow, from
which the view is constructed.

However, we have no value for amount. A suitable response would be

o Reject the insertion and inform the user.


o Insert (``SFU'',3,``Ruth'',null) into the relation.

The symbol null represents a null or place-holder value. It says the value is
unknown or does not exist.

4. Another problem with modification through views: consider the view

This view lists the cities in which the borrowers of each branch live.

Now consider the insertion

Using nulls is the only possible way to do this (see Figure 3.22 in the textbook).

If we do this insertion with nulls, now consider the expression the view actually
corresponds to:

As comparisons involving nulls are always false, this query misses the inserted
tuple.

To understand why, think about the tuples that got inserted into borrow and
customer. Then think about how the view is recomputed for the above query.

Sql

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 22


UNIT-II DBMS

 Basic structure
 Set operations
 Aggregate functions
 Null values
 Nested sub queries
 Derived relations
 Views
 Modification of the database
 Joined relations
 Data definition language

Schema used in examples

Basic structure
 Sql is based on set and relational operations with certain modifications and
enhancements
 A typical sql query has the form:
select a1, a2, ..., an
from r1, r2, ..., rm
where p
o Ais represent attributes
o Ris represent relations
o P is a predicate.
 This query is equivalent to the relational algebra expression.
 a1, a2, ..., an(p (r1 x r2 x ... X rm))
 The result of an sql query is a relation.
The select clause
 The select clause list the attributes desired in the result of a query
o Corresponds to the projection operation of the relational algebra
 E.g. Find the names of all branches in the loan relation
select branch-name
from loan

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 23


UNIT-II DBMS

 In the “pure” relational algebra syntax, the query would be:


 branch-name(loan)
 Note: sql does not permit the ‘-’ character in names,
o Use, e.g., branch_name instead of branch-name in a real implementation.
o We use ‘-’ since it looks nicer!
 Note: sql names are case insensitive, i.e. You can use capital or small letters.
o You may wish to use upper case where-ever we use bold font.
 Sql allows duplicates in relations as well as in query results.
 To force the elimination of duplicates, insert the keyword distinct after select.
1. Find the names of all branches in the loan relations, and remove duplicates
select distinct branch-name from loan
2. The keyword all specifies that duplicates not be removed.
select all branch-name from loan
3. An asterisk in the select clause denotes “all attributes”
select * from loan
4. The select clause can contain arithmetic expressions involving the operation,
+, –, , and /, and operating on constants or attributes of tuples.
5. The query:
Select loan-number, branch-name, amount  100 from loan
Would return a relation which is the same as the loan relations, except that the
attribute amount is multiplied by 100.

The where clause


 The where clause specifies conditions that the result must satisfy
o Corresponds to the selection predicate of the relational algebra.
1. To find all loan number for loans made at the perryridge branch with loan
amounts greater than $1200.
Select loan-number from loan
where branch-name = ‘perryridge’ and amount > 1200
 Comparison results can be combined using the logical connectives and, or, and
not.
 Comparisons can be applied to results of arithmetic expressions.
 Sql includes a between comparison operator

1. Find the loan number of those loans with loan amounts between $90,000 and
$100,000 (that is, $90,000 and $100,000)
select loan-number from loan
where amount between 90000 and 100000

The from clause


 The from clause lists the relations involved in the query

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 24


UNIT-II DBMS

o Corresponds to the cartesian product operation of the relational algebra.


1. find the cartesian product borrower x loan
select from borrower, loan

2. Find the name, loan number and loan amount of all customers
having a loan at the perryridge branch.
select customer-name, borrower.loan-number, amount
from borrower, loan
where borrower.loan-number = loan.loan-number an branch-name =
‘perryridge’
The rename operation
 The sql allows renaming relations and attributes using the as clause:
old-name as new-name
1. Find the name, loan number and loan amount of all customers; rename the
column name loan-number as loan-id.
select customer-name, borrower.loan-number as loan-id,
amount
from borrower, loan
where borrower.loan-number = loan.loan-number
Tuple variables
 Tuple variables are defined in the from clause via the use of the as clause.
1. Find the customer names and their loan numbers for all customers having a loan
at some branch
select customer-name, t.loan-number, s.amount
from borrower as t, loan as s
where t.loan-number = s.loan-number
2. Find the names of all branches that have greater assets than some branch
located in brooklyn.
select distinct t.branch-name
from branch as t, branch as s
where t.assets > s.assets and s.branch-city = ‘brooklyn’
String operations
 Sql includes a string-matching operator for comparisons on character strings.
Patterns are described using two special characters:
 Percent (%). The % character matches any substring.
 Underscore (_). The _ character matches any character.
1. Find the names of all customers whose street includes the substring “main”.
 Select customer-name
from customer
where customer-street like ‘%main%’
 Match the name “main%”
 Like ‘main\%’ escape ‘\’

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 25


UNIT-II DBMS

 Sql supports a variety of string operations such as


 Concatenation (using “||”)
 converting from upper to lower case (and vice versa)
 finding string length, extracting substrings, etc.

Ordering the display of tuples


1. List in alphabetic order the names of all customers having a loan in perryridge
branch
 Select distinct customer-name
from borrower, loan
where borrower loan-number - loan.loan-number and
branch-name = ‘perryridge’
order by customer-name
2. We may specify desc for descending order or asc for ascending order, for each
attribute; ascending order is the default.
o E.g. Order by customer-name desc

Set operations
 The set operations union, intersect, and except operate on relations and
correspond to the relational algebra operations 
 Each of the above operations automatically eliminates duplicates; to retain all
duplicates use the corresponding multiset versions union all, intersect all and
except all.
Suppose a tuple occurs m times in r and n times in s, then, it occurs:
o M + n times in r union all s
o Min(m,n) times in r intersect all s
o Max(0, m – n) times in r except all s
1. Find all customers who have a loan, an account, or both:
(select customer-name from depositor)
union
(select customer-name from borrower)
2. Find all customers who have both a loan and an account.
(select customer-name from depositor)
intersect
(select customer-name from borrower)
3. find all customers who have an account but no loan
(select customer-name from depositor)
except
(select customer-name from borrower)
Aggregate functions
 These functions operate on the multiset of values of a column of a relation, and
return a value

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 26


UNIT-II DBMS

avg: average value


min: minimum value
max: maximum value
sum: sum of values
count: number of values
 Find the average account balance at the perryridge branch.
select avg (balance) from account
where branch-name = ‘perryridge’
 Find the number of tuples in the customer relation.
select count (*) from customer
 Find the number of depositors in the bank.
select count (distinct customer-name) from depositor
Aggregate functions – group by
 Find the number of depositors for each branch.
select branch-name, count (distinct customer-name)
from depositor, account
where depositor.account-number = account.account-number
group by branch-name
Note: attributes in select clause outside of aggregate functions must
appear in group by list
Aggregate functions – having clause
 Find the names of all branches where the average account balance is more than
$1,200.
select branch-name, avg (balance)
from account
group by branch-name
having avg (balance) > 1200
Note: predicates in the having clause are applied after the
formation of groups whereas predicates in the where
clause are applied before forming groups
Null values
 It is possible for tuples to have a null value, denoted by null, for some of their
attributes
 Null signifies an unknown value or that a value does not exist.
 The predicate is null can be used to check for null values.
E.g. Find all loan number which appear in the loan relation with null values for
amount.
 Select loan-number
from loan
where amount is null
 The result of any arithmetic expression involving null is null
E.g. 5 + null returns null
 However, aggregate functions simply ignore nulls
More on this shortly
Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 27
UNIT-II DBMS

Null values and three valued logic


 Any comparison with null returns unknown
o E.g. 5 < null or null <> null or null = null
 Three-valued logic using the truth value unknown:
o Or: (unknown or true) = true, (unknown or false) = unknown
(unknown or unknown) = unknown
o And: (true and unknown) = unknown, (false and unknown) = false,
(unknown and unknown) = unknown
o Not: (not unknown) = unknown
o “p is unknown” evaluates to true if predicate p evaluates to unknown
 Result of where clause predicate is treated as false if it evaluates to unknown

Null values and aggregates


 Total all loan amounts
 Select sum (amount)
from loan
o Above statement ignores null amounts
o Result is null if there is no non-null amount, that is the
 All aggregate operations except count(*) ignore tuples with null values on the
aggregated attributes.

Outer join
 An extension of the join operation that avoids loss of information.
 Computes the join and then adds tuples form one relation that does not match
tuples in the other relation to the result of the join.
 Uses null values:
o Null signifies that the value is unknown or does not exist
o All comparisons involving null are (roughly speaking) false by definition.

 Relation loan
loan-number branch-name amount

L-170 Downtown 3000


L-230 Redwood 4000
L-260 Perryridge 1700

 Relation borrower

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 28


UNIT-II DBMS

customer-name loan-number

Jones L-170
Smith L-230
Hayes L-155

Outer Join – Example


 Inner Join
loan Borrower

loan-number branch-name amount customer-


name
L-170 Downtown 3000 Jones
L-230 Redwood 4000 Smith

 Left Outer Join loan Borrower

loan-number branch-name amount customer-


name
L-170 Downtown 3000 Jones
L-230 Redwood 4000 Smith
L-260 Perryridge 1700 null

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 29


UNIT-II DBMS

Outer Join – Example


 Right Outer Join
loan borrower

loan-number branch-name amount customer-


name
L-170 Downtown 3000 Jones
L-230 Redwood 4000 Smith
L-155 null null Hayes

 Full Outer Join


loan borrower
loan-number branch-name amount customer-
name
L-170 Downtown 3000 Jones
L-230 Redwood 4000 Smith
L-260 Perryridge 1700 null
L-155 null null Hayes

Nested subqueries
 Sql provides a mechanism for the nesting of subqueries.
 A subquery is a select-from-where expression that is nested within another
query.
 A common use of subqueries is to perform tests for set membership, set
comparisons, and set cardinality.

Example query
 Find all customers who have both an account and a loan at the bank.
select distinct customer-name
from borrower
where customer-name in (select customer-name
from depositor)
 Find all customers who have a loan at the bank but do not have
an account at the bank
select distinct customer-name
from borrower
where customer-name not in (select customer-name
from depositor)
 Find all customers who have both an account and a loan at the perryridge
branch

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 30


UNIT-II DBMS

select distinct customer-name


from borrower, loan
where borrower.loan-number = loan.loan-number and
branch-name = “perryridge” and
(branch-name, customer-name) in
(select branch-name, customer-name
from depositor, account
where depositor.account-number =
account.account-number)
Set comparison
 Find all branches that have greater assets than some branch located in
brooklyn.
select distinct t.branch-name
from branch as t, branch as s
where t.assets > s.assets and
s.branch-city = ‘brooklyn’

 Same query using > some clause


select branch-name
from branch
where assets > some
(select assets
from branch
where branch-city = ‘brooklyn’)
 Find the names of all branches that have greater assets than all branches
located in brooklyn.
select branch-name
from branch
where assets > all
(select assets
from branch
where branch-city = ‘brooklyn’)
Views
 Provide a mechanism to hide certain data from the view of certain users. To
create a view we use the command:
create view v as <query expression>
Where:
o <query expression> is any legal expression
o The view name is represented by v
Example queries
 A view consisting of branches and their customers
create view all-customer as
(select branch-name, customer-name
from depositor, account
where depositor.account-number = account.account-number) union

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 31


UNIT-II DBMS

(select branch-name, customer-name


from borrower, loan
where borrower.loan-number = loan.loan-number)
 Find all customers of the perryridge branch
select customer-name
from all-customer
where branch-name = ‘perryridge’
Update of a view
 Create a view of all loan data in loan relation, hiding the amount attribute
 Create view branch-loan as
select branch-name, loan-number
from loan
 Add a new tuple to branch-loan
 Insert into branch-loan
values (‘perryridge’, ‘l-307’)
 This insertion must be represented by the insertion of the tuple
(‘l-307’, ‘perryridge’, null)
Into the loan relation
 Updates on more complex views are difficult or impossible to translate, and
hence are disallowed.
 Most sql implementations allow updates only on simple views (without
aggregates) defined on a single relation

Data definition language (ddl)


Allows the specification of not only a set of relations but also information about each
relation, including:
 The schema for each relation.
 The domain of values associated with each attribute.
 Integrity constraints
 The set of indices to be maintained for each relations.
 Security and authorization information for each relation.
 The physical storage structure of each relation on disk.
Domain types in sql
 Char(n). Fixed length character string, with user-specified length n.
 Varchar(n). Variable length character strings, with user-specified maximum
length n.
 Int. Integer (a finite subset of the integers that is machine-dependent).
 Smallint. Small integer (a machine-dependent subset of the integer domain type).
 Numeric(p,d). Fixed point number, with user-specified precision of p digits,
with n digits to the right of decimal point.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 32


UNIT-II DBMS

 Real, double precision. Floating point and double-precision floating point


numbers, with machine-dependent precision.
 Float(n). Floating point number, with user-specified precision of at least n digits.
 Null values are allowed in all the domain types. Declaring an attribute to be not
null prohibits null values for that attribute.
 Create domain construct in sql-92 creates user-defined domain types
 Create domain person-name char(20) not null
date/time types in sql
 Date. Dates, containing a (4 digit) year, month and date
o E.g. Date ‘2001-7-27’
 Time. Time of day, in hours, minutes and seconds.
o E.g. Time ’09:00:30’ time ’09:00:30.75’
 Timestamp: date plus time of day
o E.g. Timestamp ‘2001-7-27 09:00:30.75’
 Interval: period of time
o E.g. Interval ‘1’ day
o Subtracting a date/time/timestamp value from another gives an interval
value
o Interval values can be added to date/time/timestamp values
 Can extract values of individual fields from date/time/timestamp
o E.g. Extract (year from r.starttime)
 Can cast string types to date/time/timestamp
o E.g. Cast <string-valued-expression> as date

Embedded sql:
 The sql standard defines embeddings of sql in a variety of programming
languages such as pascal, pl/i, fortran, c, and cobol.
 A language to which sql queries are embedded is referred to as a host language,
and the sql structures permitted in the host language comprise embedded sql.
 The basic form of these languages follows that of the system r embedding of sql
into pl/i.
 Exec sql statement is used to identify embedded sql request to the preprocessor
 Exec sql <embedded sql statement > end-exec
o Note: this varies by language. E.g. The java embedding uses
# sql { …. } ;
Example query:
From within a host language, find the names and cities of customers with more than the
variable amount dollars in some account.
 Specify the query in sql and declare a cursor for it
Exec sql

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 33


UNIT-II DBMS

declare c cursor for


select customer-name, customer-city
from depositor, customer, account
where depositor.customer-name = customer.customer-name
and depositor account-number = account.account-number
and account.balance > :amount
End-exec

 The open statement causes the query to be evaluated


 Exec sql open c end-exec
 The fetch statement causes the values of one tuple in the query result to be placed
on host language variables.
 Exec sql fetch c into :cn, :cc end-exec
repeated calls to fetch get successive tuples in the query result
 A variable called sqlstate in the sql communication area (sqlca) gets set to
‘02000’ to indicate no more data is available
 The close statement causes the database system to delete the temporary relation
that holds the result of the query.
 Exec sql close c end-exec
 Note: above details vary with language. E.g. The java embedding defines java
iterators to step through result tuples.

Updates through cursors:


 Can update tuples fetched by cursor by declaring that the cursor is for update
 Declare c cursor for
select *
from account
where branch-name = ‘perryridge’
for update
 To update tuple at the current location of cursor
 Update account
set balance = balance + 100
where current of c

Dynamic sql:
 Allows programs to construct and submit sql queries at run time.
 Example of the use of dynamic sql from within a c program.

Char * sqlprog = “update account


 set balance = balance * 1.05
where account-number = ?”
Exec sql prepare dynprog from :sqlprog;

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 34


UNIT-II DBMS

char account [10] = “a-101”;


exec sql execute dynprog using :account;
 The dynamic sql program contains a ?, which is a place holder for a value that is
provided when the sql program is executed.

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 35

You might also like