
Chapter One

Introduction

Data: a collection of raw facts.

Information: processed data in a form that is meaningful to the user.

An information system is a system that:

Receives data and instructions

Processes the data as per the instruction

Produces output

Stores data/information for future use


04/15/2021 Database System 1
Introduction to Database
In the traditional way of handling data and information, cards and paper are used. Data are typed or written on paper and kept in a file cabinet, and storage and retrieval are performed by human labour.

Limitations of the traditional (paper-based) approach:


Prone to error
Data loss
Redundancy
Inconsistency
Difficult to update, retrieve and integrate
Difficult to compile the information
Limited to small volumes of information
Cross-referencing is difficult
Database

A database is a collection of related data stored in an organized way.

The logically related data comprise the entities, attributes, relationships, and business rules of an organization's information.

A database also contains a description of its own data, known as “metadata”, the “data dictionary”, the “system catalogue”, or “data about data”.

E.g. book database

Databases and database technology have a major impact on the growing use of computers.

They play a critical role in almost all areas where computers are used, including business, engineering, medicine, law, education, and library science.

A database is designed once and used simultaneously by many users.

The advantages of a database approach over traditional, paper-based methods of record keeping include:
Compactness: no need for possibly voluminous paper files.
Speed: the machine can retrieve and change data faster than a human can.
Accuracy: timely, accurate and up-to-date information is available at any time.
Data can be shared: two or more users can access and use the same data instead of storing it redundantly for each user.
Redundancy can be reduced: in non-database or non-centralized systems each application or department keeps its own private files, so the same data is often stored more than once; a shared database avoids this.

Integrity can be maintained: the problem of integrity is the problem of ensuring that the data in the database is accurate.

Security restrictions can be applied: different rules can be established for each type of access (retrieve, insert, delete, etc.) to each piece of information in the database.

Limitations and risks of the database approach

Complexity in designing and managing data

Cost and risk of converting from the old system to the new one

High cost of developing and maintaining the system

Complex backup and recovery services from the user's perspective

Reduced performance due to centralization and data independence

High impact on the whole system when the central system fails

Database Systems
A database system is a computerized record-keeping system. Users of the database can perform a variety of operations, such as:

Retrieving data from an existing file

Adding new data to an existing file

Adding new data to an empty file

Modifying data in an existing file

Deleting data from an existing file

Searching for target information


Components of a database system

A database system involves four major components:

Hardware: consists of secondary storage media.
Data: the actual data may be stored as a single database or distributed across many distinct files and treated as one.
Software: the DBMS is responsible for the overall management of communication between the user and the database.
Users and designers of the database: different stakeholders play different roles in the design and operation of a database system.

Database Administrator (DBA)

Responsible for overseeing, controlling and managing the database resources (the database itself, the DBMS and other related software)
Authorizes access to the database
Coordinates and monitors the use of the database
Responsible for determining and acquiring hardware and software resources
Accountable for problems such as poor security or poor system performance
Involved in all steps of database development

Database Management System (DBMS)
A DBMS is a collection of programs that enables users to create and maintain a database.

Examples: MS Access, SQL Server, MySQL, Oracle, etc.

It is a general-purpose software system that facilitates the processes of:

Defining a database

Constructing a database

Manipulating databases for various applications

Defining a database: involves specifying the data types, structures, and constraints for the data to be stored in the database.

Constructing the database: the process of storing the data itself on some storage medium that is controlled by the DBMS.

Manipulating a database: querying the database to retrieve specific data, updating the database to reflect changes in the miniworld, and generating reports from the data.

Sharing: allows multiple users to access the data.

Components of DBMS Environment

A DBMS is a software package used to design, manage, and maintain databases.

It provides facilities to define the database, manipulate its content, and control the database.

These facilities help the designer, the user, and the database administrator to discharge their responsibilities in designing, using and managing the database.

It provides the following facilities:

Data Definition Language (DDL):

A language used to define each data element required by the organization.

Used to set up a database: to create, drop, rename and alter tables, with facilities for handling constraints.

Data Manipulation Language (DML):

The core commands used by end users and programmers to insert, delete, and update the data in the database.

Provides basic data manipulation operations on data held in the database.

A language for manipulating the data organized by the appropriate data model.
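The difference between DDL and DML can be sketched with Python's built-in sqlite3 module. This is only an illustration: the `book` table and its columns are made-up names, and SQLite stands in for whichever DBMS is actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# DDL: define and alter the structure of a (hypothetical) book table.
conn.execute("CREATE TABLE book (book_no TEXT PRIMARY KEY, title TEXT NOT NULL)")
conn.execute("ALTER TABLE book ADD COLUMN pub_year INTEGER")

# DML: insert, update and delete the rows held in the table.
conn.execute("INSERT INTO book VALUES ('B10', 'Database Systems', 2010)")
conn.execute("INSERT INTO book VALUES ('B11', 'Operating Systems', 2012)")
conn.execute("UPDATE book SET pub_year = 2011 WHERE book_no = 'B10'")
conn.execute("DELETE FROM book WHERE book_no = 'B11'")

rows = conn.execute("SELECT book_no, title, pub_year FROM book").fetchall()
print(rows)
```

The DDL statements change what the table looks like; the DML statements change only the rows stored in it.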

Data Query Language (DQL):

A language for accessing or retrieving the data organized by the appropriate data model.

Since the data required by the user (the query) is extracted using this type of language, it is also called a “query language”.

Procedural DQL: the user specifies what data is required and how to get the data.

Non-procedural DQL: the user specifies what data is required but not how it is to be retrieved.
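The contrast between procedural and non-procedural retrieval can be sketched as follows. The employee tuples are invented sample data; the loop plays the role of a procedural query, and the SQL statement (run through Python's sqlite3 module) plays the role of a non-procedural one.

```python
import sqlite3

employees = [("E1", "Abebe", 35000), ("E2", "Sara", 28000), ("E3", "Mulu", 41000)]

# Procedural style: we say HOW to find the rows (scan, test, collect).
procedural = []
for emp_id, name, salary in employees:
    if salary > 30000:
        procedural.append(name)

# Non-procedural style: we say WHAT we want; the DBMS decides how to get it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (emp_id TEXT, name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", employees)
declarative = [row[0] for row in conn.execute(
    "SELECT name FROM employees WHERE salary > 30000 ORDER BY emp_id")]
```

Both approaches return the same names; only the amount of retrieval detail supplied by the user differs.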

Data Control Language (DCL):

The database administrator should have the facility to control the overall operation of the system.

The commands include granting or revoking privileges to access the database or particular objects within it, and saving (committing) or undoing (rolling back) database transactions.

Types of Database Model
A model is a representation of real-world objects and events and their associations. Data models can be divided into four types:
Hierarchical database model: consists of an ordered set of trees in a parent-child arrangement.
The connection between a child and its parent is called a link.
1-1 or 1-M links are allowed.
It is inflexible.
Used widely in applications such as banking and telecom.
Easy to understand.
A top-down way of viewing business entities.
Allows a node to have only one parent.

Network model: an extension of the hierarchical structure that allows many-to-many relationships to be managed in a tree-like structure in which a node may have multiple parents.

It represents data as a collection of records but, unlike in the hierarchical model, a node can have any number of parents.

Linked records are called a set.

Object-oriented database model: information is represented in the form of objects, as used in object-oriented programming.

Relational model: data are stored in tables (relations), one for each entity. A table has two dimensions: rows and columns.

Rows represent records and columns represent attributes.

Most current DBMS technologies use the relational model.

Steps of Database Design
Information systems design in general involves three steps:

Requirements analysis: specifies what the system is required to do, based on user input.

Design: specifies how the system will address the requirements.

Implementation: translates design specifications into a working system.

Steps in Requirements Analysis for Database
Identify the scope of the design effort.
Establish metadata collection standards: who to interview, what to collect, and how to structure the interviews.
Identify user views, extracted by reviewing user tasks and types of decisions. Forms, reports, graphs and maps can be useful sources for defining views. A user view is the subset of data used by a user in a specific context.
Build a data dictionary: define and describe each item in detail (name, description, type, length, range and relationships).
Identify data volumes and usage patterns: how much data is used and how frequently the data changes.
Identify operational (functional) requirements.

Design of a database involves three types of design steps:
Conceptual design: synthesis of the information from requirements analysis according to semantic rules. The outcome is a conceptual model, which describes entities, attributes and relationships among entities independent of implementation details.
Logical design: transforms the conceptual data model into an internal model, that is, a schema that can be processed by a particular DBMS. For example, mapping an E/R model to a relational model.
Physical design: involves the design of internal storage structures, record formats, access methods, record blocking and so on.

Chapter Two
Conceptual-level data design: refers to the Entity-Relationship (E/R) data model.

Entity-Relationship (E/R) model: views the real world as a set of basic objects (entities) and relationships among these objects.

It represents the overall logical structure of the database.

The three basic notions of the E/R model are:

Entity: represents a real-world object or concept, such as a place, object, event, person, order, or customer.

Relationship: represents an association between objects, such as the fact that a customer may place an order.

Attribute: describes an entity, such as the invoice date or the customer's first name.

Entity set
An entity set is a set of entities of the same type that share the same properties.

E.g. the “EMPLOYEES” entity set (EMPLOYEE1, EMPLOYEE2, EMPLOYEE3, etc.) represents the set of all employees.

Entities are classified as independent (strong) or dependent (weak).

A strong (independent) entity is one that does not rely on other entities for identification.
A weak (dependent) entity is one that relies on other entities for identification.
An individual occurrence of an entity set is also known as an instance (object).

Types of attributes
Attributes are descriptive properties associated with an entity. A set of attributes describes an entity.

A particular instance of an attribute is called a value. For example, “Employee Id” and “Name” are attributes of the “EMPLOYEES” entity set, and “XYZ” is one value of the attribute “Name”.

Attributes can be classified as:

Identifiers or

Descriptors.

Identifiers: more commonly called keys, uniquely identify an instance of an entity.
Example: “Employee Id” uniquely identifies an employee entity within the entity set.
Descriptor: describes a non-unique characteristic of an entity instance.
Example: “Name” is a descriptor for the “EMPLOYEES” entity set.
Another way of categorizing attributes is as simple or composite attributes.
Simple attributes: also known as atomic attributes, cannot be divided into subparts.
Example: “Age” and “Gender” of the “EMPLOYEES” entity set.

Composite attributes: attributes composed of smaller subparts, each of which is itself an attribute.

Another classification of attributes is based on the values that they can hold:
Single-valued and
Multi-valued attributes.
Single-valued attributes: attributes that have only one value at any time.
Example: “Name” and “Gender” of the “EMPLOYEES” entity set.
Multi-valued attributes: attributes that may have more than one value.
Example: “PERSON_OF_DEGREE”.

Attributes can also be categorized as stored or derived attributes.

Derived attributes: attributes that can be calculated from related stored attributes, entities or general states.

Stored attributes: attributes that cannot be calculated from other stored attributes.
Example: “Birth Date” of the “EMPLOYEES” entity set is a stored attribute, whereas “Age” is a derived attribute that can be calculated from “Birth Date” and the current date.
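As a small illustration of a derived attribute, the sketch below computes Age from a stored Birth Date and a given current date. The function name `age_on` is an assumption made for this example only.

```python
from datetime import date

def age_on(birth_date: date, current_date: date) -> int:
    """Derive Age from the stored Birth Date and a given current date."""
    years = current_date.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred this year.
    if (current_date.month, current_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

print(age_on(date(1990, 6, 15), date(2021, 4, 15)))
```

Because Age can always be recomputed this way, it does not need to be stored alongside Birth Date.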

Relationship
A relationship represents an association between two or more entities. Examples of relationships are:
“EMPLOYEES” are assigned to “TEAMS”
“CUSTOMERS” own “PROJECTS”
“TEAMS” work on “PROJECTS”

Relationships are classified in terms of degree, connectivity, cardinality, and existence.
Degree: the degree of a relationship is the number of entities associated with the relationship. The multi-way (n-ary) relationship is the general form, for degree n. Special cases are the binary and ternary relationships, where the degree is 2 and 3, respectively.

Connectivity: the connectivity of a relationship describes the mapping of associated entity instances in the relationship. The values of connectivity are “one” or “many”. The basic types of connectivity for relationships are one-to-one, one-to-many, many-to-one and many-to-many.

Cardinality: the cardinality of a relationship is the actual number of related occurrences for each of the two entities.

Existence: the existence of an entity in a relationship is defined as either mandatory or optional.



Types of keys
Keys are attributes or sets of attributes that can be used to uniquely identify an entity within the entity set.

Why do we need keys?

Keys help you to identify any row of data in a table. In a real-world application, a table could contain thousands of records.

Keys allow you to establish and identify relationships between tables.

Keys help you to enforce identity and integrity in the relationship.

A super key is a set of one or more attributes that uniquely identifies rows in a table. A super key may have additional attributes that are not needed for unique identification.

In the example (an employee table), EmpSSN and “EmpNum + Empname” are super keys.

Primary key: a column or group of columns in a table that uniquely identifies every row in that table is called a primary key. The same value cannot appear more than once in the table.

Rules for defining a primary key:

Two rows cannot have the same primary key value.

Every row must have a primary key value.

The primary key field cannot be null.

The value in a primary key column should not be modified or updated if any foreign key refers to that primary key.

Alternate key: a table may have more than one candidate for the primary key. A candidate key that is not currently the primary key is called an alternate key.
Example: in a student table, StudID, Roll No and Email each qualify to become the primary key. Since StudID is chosen as the primary key, Roll No and Email become alternate keys.

Candidate key: a super key with no redundant attributes (a minimal super key) is called a candidate key.

The primary key should be selected from the candidate keys. Every table must have at least one candidate key.

Properties of a candidate key:

It must contain unique values

It may have multiple attributes

It must not contain null values

It uniquely identifies each record in a table

Example: in a student table, Stud ID, Roll No, and Email are candidate keys, each of which uniquely identifies a student record in the table.
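The relationship between super keys and candidate keys can be sketched in code. The rows below are invented sample data and the helper names are assumptions; note also that a real key is a property of the schema, so checking a handful of rows like this can only rule keys out, not prove them.

```python
from itertools import combinations

# Hypothetical student rows; the column names follow the example above.
students = [
    {"StudID": 1, "RollNo": 11, "Email": "abc@example.com", "Name": "Alex"},
    {"StudID": 2, "RollNo": 12, "Email": "xyz@example.com", "Name": "Mulu"},
    {"StudID": 3, "RollNo": 13, "Email": "mno@example.com", "Name": "Alex"},
]

def is_super_key(rows, attrs):
    """True if no two rows agree on all attributes in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows):
    """Minimal super keys: super keys that contain no smaller super key."""
    columns = sorted(rows[0])
    found = []
    for size in range(1, len(columns) + 1):
        for attrs in combinations(columns, size):
            if is_super_key(rows, attrs) and not any(set(k) <= set(attrs) for k in found):
                found.append(attrs)
    return found

print(candidate_keys(students))
```

Here (StudID, Name) is a super key but not a candidate key, because StudID alone already identifies each row.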

Foreign key: a foreign key is a column (or group of columns) added to a table to create a relationship with another table; its values refer to the primary key of that other table.

Foreign keys help us to maintain data integrity and allow navigation between two different instances of an entity.

The rule that every foreign key value must match an existing primary key value is known as referential integrity.

Composite key: a key that uses multiple attributes to uniquely identify rows in a table is called a composite key.

A composite key may or may not be part of a foreign key.

Constraints

Constraints: the purpose of constraints is to maintain data integrity during an update, delete or insert on a table.

Types of constraints
NOT NULL
UNIQUE
DEFAULT
CHECK
Key constraints: PRIMARY KEY, FOREIGN KEY
Mapping constraints
Domain constraints

NOT NULL: makes sure that a column does not hold a NULL value. When we do not provide a value for a column while inserting a record into a table, it takes the NULL value by default, so such an insert is rejected for a NOT NULL column unless the column has a default value.

UNIQUE: enforces a column or set of columns to have unique values.

DEFAULT: provides a default value for a column when no value is provided while inserting a record into a table.

CHECK: used for specifying the range of values allowed for a particular column of a table.

Key constraints: PRIMARY KEY and FOREIGN KEY.

E.g.:
CREATE TABLE STUDENT (
    ROLL_NO     INT         NOT NULL CHECK (ROLL_NO > 1000),
    STU_NAME    VARCHAR(35) NOT NULL UNIQUE,
    STU_AGE     INT         NOT NULL,
    EXAM_FEE    INT         DEFAULT 10000,
    STU_ADDRESS VARCHAR(35),
    PRIMARY KEY (ROLL_NO)
);
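To see these constraints being enforced, the STUDENT table can be created in an in-memory SQLite database through Python's sqlite3 module. This is a sketch under assumptions: SQLite stands in for the course's DBMS, the inserted values are invented, and the helper `try_insert` exists only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE STUDENT (
        ROLL_NO     INT         NOT NULL CHECK (ROLL_NO > 1000),
        STU_NAME    VARCHAR(35) NOT NULL UNIQUE,
        STU_AGE     INT         NOT NULL,
        EXAM_FEE    INT         DEFAULT 10000,
        STU_ADDRESS VARCHAR(35),
        PRIMARY KEY (ROLL_NO)
    )""")

def try_insert(sql):
    """Return 'ok', or the error message if a constraint rejects the row."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)

results = [
    try_insert("INSERT INTO STUDENT (ROLL_NO, STU_NAME, STU_AGE) VALUES (1001, 'Alex', 20)"),
    try_insert("INSERT INTO STUDENT (ROLL_NO, STU_NAME, STU_AGE) VALUES (900, 'Mulu', 21)"),   # violates CHECK
    try_insert("INSERT INTO STUDENT (ROLL_NO, STU_NAME, STU_AGE) VALUES (1002, 'Alex', 22)"),  # violates UNIQUE
    try_insert("INSERT INTO STUDENT (ROLL_NO, STU_NAME) VALUES (1003, 'Sara')"),               # violates NOT NULL
]
# EXAM_FEE was not supplied for roll number 1001, so DEFAULT fills it in.
fee = conn.execute("SELECT EXAM_FEE FROM STUDENT WHERE ROLL_NO = 1001").fetchone()[0]
```

Only the first insert succeeds; each of the other three is rejected by the constraint named in its comment.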

Mapping Constraints

Mapping cardinality:

One to one: an entity of entity set A can be associated with at most one entity of entity set B, and an entity in entity set B can be associated with at most one entity of entity set A.

One to many: an entity of entity set A can be associated with any number of entities of entity set B, and an entity in entity set B can be associated with at most one entity of entity set A.

Many to one: an entity of entity set A can be associated with at most one entity of entity set B, and an entity in entity set B can be associated with any number of entities of entity set A.

Many to many: an entity of entity set A can be associated with any number of entities of entity set B, and an entity in entity set B can be associated with any number of entities of entity set A.

Domain constraints: each table has a certain set of columns, and each column allows only one type of data, based on its data type. The column does not accept values of any other data type.
Domain constraint = data type + constraints (NOT NULL / UNIQUE / PRIMARY KEY / FOREIGN KEY / CHECK / DEFAULT)

Database Design Guidelines
The steps in designing a database are:

Determine the purpose of your database.

Find and organize the required information.

Divide the information into tables.

Turn information items into columns.

Specify primary keys.

Set up the table relationships.

Refine your design.

Apply the normalization rules

Chapter Three
Relational database design:
Relational database design (RDD) uses tables to store information. The standard fields and records are represented as columns (fields/attributes) and rows (records/tuples) in a table.

The design of a relational database is composed of four stages, in which the data are modeled into a set of related tables. The stages are:
Define relations/attributes
Define primary keys
Define relationships
Normalization

Functional dependency in DBMS
An attribute of a table is said to be functionally dependent on another attribute of the same table when the value of the second uniquely determines the value of the first.

For example, suppose we have a student table with attributes Stu_Id, Stu_Name, Stu_Age.

Here the Stu_Id attribute uniquely identifies the Stu_Name attribute of the student table, because if we know the student id we can tell the student name associated with it.

This is known as a functional dependency and is written Stu_Id -> Stu_Name; in words, Stu_Name is functionally dependent on Stu_Id.
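Whether a functional dependency holds in a given set of rows can be checked mechanically. The sketch below uses invented sample rows and a hypothetical helper named `holds`; it tests the data at hand, which is weaker than a dependency declared for the schema.

```python
def holds(rows, lhs, rhs):
    """True if the functional dependency lhs -> rhs holds in the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same lhs value must always map to the same rhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample data for the student table.
student = [
    {"Stu_Id": 1, "Stu_Name": "Alex", "Stu_Age": 20},
    {"Stu_Id": 2, "Stu_Name": "Mulu", "Stu_Age": 21},
    {"Stu_Id": 3, "Stu_Name": "Alex", "Stu_Age": 22},
]

print(holds(student, ["Stu_Id"], ["Stu_Name"]))
print(holds(student, ["Stu_Name"], ["Stu_Age"]))
```

In this data Stu_Id -> Stu_Name holds, while Stu_Name -> Stu_Age does not, because two students named Alex have different ages.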

Normalization: the process of minimizing redundancy in a relation or set of relations.
It ensures an optimum structure.
It ensures that data are atomic.
It eliminates data inconsistencies and anomalies.
It improves data integrity.

Anomalies in DBMS: there are three types of anomalies that occur when the database is not normalized: insertion, update and deletion anomalies.

Example: suppose a library management system database stores its details in a single table that has six attributes:

The above table is not normalized. We will see the problems that we face when a table is not normalized.

Update anomaly: in the above table, two rows hold the same age value for the student Mulu. If we want to update the age of Mulu, we have to update both rows; if only one row is updated, the data will become inconsistent.

Insert anomaly: if we want to add new records for which the Roll_No value is not yet known, we would not be able to insert the data into the table if the Roll_No field does not allow nulls.

Delete anomaly: suppose that at some point the manager withdraws Book_No B10. Deleting the rows that have Book_No B10 would also delete the information about the student Alex, since he is associated only with that book.
To overcome these anomalies we need to normalize the data.
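The effect of normalizing can be sketched with plain Python data structures. The rows and the four-column layout below are assumptions made for illustration (the library table itself is not shown in this text); only the names Roll_No, Book_No, Mulu, Alex and B10 come from the discussion above.

```python
# Hypothetical unnormalized rows: (Roll_No, Name, Age, Book_No).
library = [
    (1, "Mulu", 20, "B10"),
    (1, "Mulu", 20, "B11"),
    (2, "Alex", 22, "B10"),
]

# Decompose: student facts are stored once, loans refer to them by Roll_No.
students = {roll: {"name": name, "age": age} for roll, name, age, _ in library}
loans = [(roll, book) for roll, _, _, book in library]

# Update: Mulu's age now changes in exactly one place.
students[1]["age"] = 21

# Delete: withdrawing book B10 removes loans only; Alex's record survives.
loans = [(roll, book) for roll, book in loans if book != "B10"]
```

After the decomposition, neither the update nor the deletion can leave the student data inconsistent or incomplete.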

First normal form (1NF): a relation is in 1NF if:
It does not contain any multi-valued attribute.
Every attribute is single-valued.
Every attribute value is atomic.
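One common way to bring a table into 1NF is to emit one row per value of the multi-valued attribute. The sketch below assumes a made-up employee table with a multi-valued `phones` column; the function name `to_1nf` is likewise hypothetical.

```python
# Hypothetical rows where "phones" is a multi-valued attribute.
unnormalized = [
    {"emp_id": 1, "name": "Alex", "phones": ["0911", "0922"]},
    {"emp_id": 2, "name": "Mulu", "phones": ["0933"]},
]

def to_1nf(rows, multi_valued):
    """Emit one row per value so that every attribute value is atomic."""
    flat = []
    for row in rows:
        for value in row[multi_valued]:
            new_row = dict(row)
            new_row[multi_valued] = value
            flat.append(new_row)
    return flat

flat_rows = to_1nf(unnormalized, "phones")
```

The result repeats emp_id and name for each phone number, which is exactly the redundancy that the higher normal forms then remove.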

Second normal form (2NF): a table is said to be in 2NF if the following conditions hold:

The table is in 1NF (first normal form).

No non-prime attribute is dependent on a proper subset of any candidate key of the table; that is, no partial dependency exists.

An attribute that is not part of any candidate key is known as a non-prime attribute.

Conversion to 2NF

Third normal form (3NF): a table is in 3NF if both of the following conditions hold:
The table is in 2NF.
There is no transitive functional dependency of a non-prime attribute on any super key.
In other words, a table is in 3NF if it is in 2NF and, for each non-trivial functional dependency X -> Y, at least one of the following conditions holds:
X is a super key of the table
Y is a prime attribute of the table
An attribute that is part of one of the candidate keys is known as a prime attribute.

Boyce-Codd normal form (BCNF): a stronger version of third normal form.

A table is in Boyce-Codd normal form if and only if at least one of the following conditions is met for each functional dependency A → B:

A → B is a trivial functional dependency (a trivial functional dependency means that all columns of B are contained in the columns of A)

A is a super key

Equivalently, a relation is in BCNF if every determinant is a candidate key.
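The BCNF test can be sketched with the standard attribute-closure computation: a set of attributes is a super key when its closure contains every attribute of the relation. The relation R(Stu_Id, Course, Teacher) and its dependencies below are invented for illustration, and the function names are assumptions.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_bcnf(all_attrs, fds):
    """Every non-trivial FD in fds must have a super key on its left side."""
    for lhs, rhs in fds:
        trivial = rhs <= lhs
        if not trivial and closure(lhs, fds) != set(all_attrs):
            return False
    return True

# Hypothetical relation R(Stu_Id, Course, Teacher) with Teacher -> Course.
attrs = {"Stu_Id", "Course", "Teacher"}
fds = [
    (frozenset({"Stu_Id", "Course"}), frozenset({"Teacher"})),
    (frozenset({"Teacher"}), frozenset({"Course"})),
]

print(is_bcnf(attrs, fds))
```

Here Teacher determines Course but is not a super key, so the relation is not in BCNF even though Course is a prime attribute (which is why 3NF would accept it).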

Individual Assignment One (5%)
1. Explain the functionality of a database system.

2. What is participation in DBS? Explain briefly.

3. What is multiplicity in DBS? Explain briefly.

4. What are the basic differences between multiplicity and participation?

5. What is ODL? Explain with a neat diagram and compare it with E/R.

6. What are anomalies? Explain briefly with examples.

Submission date 28/02/2012


Chapter Four

Relational algebra
Relational algebra is a procedural query language that consists of a set of operations that take one or two relations as input and produce a new relation as a result. The algebra operations enable a user to specify retrieval requests on a relational model.

A sequence of relational algebra operations that produces a new relation forms a relational algebra expression.

Relational algebra is important for several reasons:

It provides a formal foundation for relational model operations.

It is used as a basis for implementing and optimizing queries in RDBMSs.

Some of its concepts are incorporated into SQL for RDBMSs.


Fundamental operations of relational algebra can be grouped into two based on the number of relation operands of the operator. These are:

Unary operators

Selection (σ)

Projection (Π)

Rename (ρ)

Binary operators

Product (Cartesian product) (×)

Union (∪)

Intersection (∩)

Difference (−)

The binary operators listed above are also known as set operators.

Unary Operations:

Select operation: selects the subset of tuples from a relation instance that satisfy a selection condition.

It picks certain rows (tuples) from a relation R.

It is denoted by σ_C(R), where σ (sigma) denotes the SELECT operator, C is a Boolean expression (the select condition) built from comparisons (=, <, >, ≠, ≤, ≥), and R is a relation or relational algebra expression.

Example: to extract from the “EMPLOYEES” relation the senior managers or those whose salary is greater than 30,000, the selection operation can be written as:

Employees(EmpId, Name, BDate, Age, Gender, Position, Salary)

σ Position = “Senior Manager” ∨ Salary > 30,000 (Employees)

Project operation: forms a new relation by picking certain columns of the relation.

Because the result is a set, duplicate tuples are eliminated from it. It is denoted by Π_A(R), where Π (pi) represents the PROJECT operator and A is a set of attributes of the relation R.

Example: to extract only the employees' Name and Position from the “EMPLOYEES” relation:

Π Name, Position (Employees)
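The select and project operations can be sketched in Python by modelling a relation as a list of dictionaries. The sample tuples are invented, and the function names `select` and `project` are chosen for this illustration only.

```python
# A relation is modelled as a list of dicts; the sample tuples are made up.
employees = [
    {"EmpId": 1, "Name": "Alex", "Position": "Senior Manager", "Salary": 25000},
    {"EmpId": 2, "Name": "Mulu", "Position": "Programmer", "Salary": 32000},
    {"EmpId": 3, "Name": "Sara", "Position": "Programmer", "Salary": 18000},
]

def select(relation, condition):
    """sigma: keep the tuples for which condition(tuple) is true."""
    return [t for t in relation if condition(t)]

def project(relation, attrs):
    """pi: keep only attrs, dropping duplicate tuples from the result."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attrs}
        if row not in result:
            result.append(row)
    return result

senior_or_well_paid = select(
    employees, lambda t: t["Position"] == "Senior Manager" or t["Salary"] > 30000)
positions = project(employees, ["Position"])
```

Selection keeps whole rows and never changes the columns; projection keeps whole columns and may reduce the number of rows, because duplicates collapse.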

Rename operation: the rename operator can be used to explicitly rename the relation resulting from an expression within a query. It does not permanently rename the relation.

It is denoted by ρ_S(A1, A2, …, An)(R), where ρ (rho) represents the RENAME operator, S is a name for the new relation, and A1, A2, …, An are new names for the attributes of the relation R.

It can be useful in connection with more complex operations such as union and join.

E.g.: find the maximum account balance in the bank. The inner expression finds every balance that is smaller than some other balance; subtracting these from the set of all balances leaves the maximum.

Π bal (account) − Π account.bal (σ account.bal < a.bal (account × ρ_a (account)))
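The usual way to obtain a maximum using only product, rename, select, project and difference is to find every balance that is smaller than some other balance and subtract those from the set of all balances. The sketch below follows that idea; the account tuples and helper names are assumptions made for illustration.

```python
# Hypothetical account relation: one balance per account number.
account = [{"acc_no": "A1", "bal": 500}, {"acc_no": "A2", "bal": 900},
           {"acc_no": "A3", "bal": 700}]

def rename(relation, prefix):
    """rho: give every attribute a prefixed name, e.g. bal -> a.bal."""
    return [{f"{prefix}.{k}": v for k, v in t.items()} for t in relation]

def product(r, s):
    """Cartesian product of two relations with disjoint attribute names."""
    return [{**t, **u} for t in r for u in s]

left = rename(account, "account")
right = rename(account, "a")

# Balances that are smaller than some other balance (so not the maximum).
not_max = {t["account.bal"] for t in product(left, right)
           if t["account.bal"] < t["a.bal"]}
# Maximum = all balances minus the non-maximum ones.
max_bal = {t["bal"] for t in account} - not_max
```

Renaming is what makes the self-product possible: without it, the two copies of `account` would have identical attribute names and could not be compared with each other.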


If no renaming is applied, the names of the attributes in the relation resulting from a select operation are the same as those in the original relation and in the same order.

For a projection operation with no renaming, the resulting relation has the same attribute names as those in the projection list, in the same order in which they appear in the list.

After renaming, the new names of the relation and its attributes can be used as ordinary relation and attribute names in a sequence of relational algebra expressions.

Binary Operations
Cartesian product operation: also known as cross product, cross join or product, it is a binary set operation that generates a new relation from two relations in a combinatorial fashion.

It is denoted by R × S, where × represents the PRODUCT operator and R and S are the relations to be combined.

The product operation pairs each tuple in R with every tuple in S.
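A minimal sketch of the product, using Python's itertools and two small invented relations stored as lists of tuples:

```python
from itertools import product

R = [("a", 1), ("b", 2)]
S = [("x",), ("y",), ("z",)]

# Pair each tuple of R with every tuple of S by concatenating them.
R_x_S = [r + s for r, s in product(R, S)]
print(len(R_x_S))
```

The result has as many tuples as R has tuples times S has tuples, and each result tuple has the attributes of both operands.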

Example: for two relations R and S, R × S contains one tuple for every pairing of a tuple of R with a tuple of S.

Consider the Employees, EmpTeams and Teams relations and develop a relational algebra expression that retrieves the name and position of the employees who work on Project 1 as programmers, renaming the resulting relation as Programmers1.

Employees(EmpId, Name, BDate, Age, Gender, Position, Salary)

EmpTeams(EmpId, TeamId)

Teams(TeamId, PrjId, Name, Descr)

Union operation: the union of R and S, denoted by R ∪ S, is a relation that includes all tuples that are in R, in S, or in both. Duplicates are eliminated from the result.

Intersection operation: the intersection of R and S, denoted by R ∩ S, is a relation that includes all tuples that are in both R and S.

Set difference operation: the set difference of R and S, denoted by R − S, is the set of tuples that are in R but not in S.

For the set operations union, intersection and set difference, the two operand relations R and S must have the same type of tuples; this condition is known as union compatibility. (The Cartesian product does not require it.)

Two relations R(A1, A2, …, An) and S(B1, B2, …, Bn) are said to be union compatible if:

They have the same degree n, and

Domain(Ai) = Domain(Bi) for all i = 1, 2, …, n
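Union compatibility and the three set operations can be sketched with Python sets. The relations are invented, and Python types stand in for attribute domains, which is a simplification made for this example.

```python
def union_compatible(r_schema, s_schema):
    """Same degree and pairwise equal domains (here: Python types)."""
    return len(r_schema) == len(s_schema) and all(
        a == b for a, b in zip(r_schema, s_schema))

# Hypothetical relations as sets of tuples, with their domains listed.
R_schema, R = (int, str), {(1, "Alex"), (2, "Mulu")}
S_schema, S = (int, str), {(2, "Mulu"), (3, "Sara")}

if union_compatible(R_schema, S_schema):
    union = R | S          # tuples in R, in S, or in both
    intersection = R & S   # tuples in both R and S
    difference = R - S     # tuples in R but not in S
```

Because the relations are sets, the tuple (2, "Mulu") appears only once in the union.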

Example: for two union-compatible relations R and S, R ∪ S contains every tuple of either relation once, and R − S contains the tuples of R that do not appear in S.

Find the name and position of the employees who work on both Projects 1 and 2 as programmers. This is similar to the previous example.

Additional Operations
Natural join operation: a frequent type of join connects two relations by:

Equating attributes of the same name, and

Projecting out one copy of each pair of equated attributes.

Such a join is known as a natural join and is denoted by R ⋈ S, where ⋈ represents the NATURAL JOIN operator and R and S are the relations to be joined.
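The two steps of the natural join can be sketched as follows, with relations modelled as lists of dictionaries. The tuples are invented sample data, and `natural_join` is a name chosen for this illustration.

```python
def natural_join(r, s):
    """Join tuples (dicts) that agree on every attribute name they share."""
    result = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            if all(t[a] == u[a] for a in common):
                # Merging the dicts keeps one copy of each common attribute.
                result.append({**t, **u})
    return result

# Hypothetical tuples for the EmpTeams and Teams relations.
emp_teams = [{"EmpId": 1, "TeamId": "T1"}, {"EmpId": 2, "TeamId": "T2"}]
teams = [{"TeamId": "T1", "PrjId": 1}, {"TeamId": "T3", "PrjId": 2}]

joined = natural_join(emp_teams, teams)
```

Only the pair that agrees on TeamId survives, and TeamId appears once in the joined tuple.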

For two relations R and S, R ⋈ S combines each pair of tuples that agree on their common attributes.

The previous example, which retrieves the name and position of employees who work on Project 1 as programmers, can be simplified using the modified relations below:

Employees(EmpId, FullName, BDate, Age, Gender, Position, Salary)

EmpTeams(EmpId, TeamId)

Teams(TeamId, PrjId, TName, Descr)

Write the relational algebra expression based on the above information.

Theta join operation: while the natural join enforces a join condition by equating attributes of the same name in the relations to be joined, a theta join joins relations on an arbitrary condition C.

The notation for theta join is R ⋈_C S.

The result of the theta join operation is constructed by:

Taking the product of R and S, and

Selecting only those tuples satisfying the condition C.
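The construction above (product, then selection) can be sketched directly. The relations and the condition A < B are invented for illustration.

```python
def theta_join(r, s, condition):
    """Product of r and s followed by selection on condition(t, u)."""
    return [{**t, **u} for t in r for u in s if condition(t, u)]

# Hypothetical relations with disjoint attribute names.
R = [{"A": 1}, {"A": 5}]
S = [{"B": 3}, {"B": 4}]

result = theta_join(R, S, lambda t, u: t["A"] < u["B"])
```

Unlike the natural join, the condition may be any comparison, not only equality on attributes of the same name.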

For two relations R and S, the theta join R ⋈_C S is obtained by applying the condition C to the tuples of R × S.

Extended Operation
The basic relational algebra operations have been extended in several ways to enhance the expressive power of the original relational algebra. Some of the extended operations are:

Outer Join

Extended Projection

Duplicate Elimination

Aggregation and Grouping



Cont..
Outer Join Operation: The natural join operation produces a tuple only when there is a match on the common attributes of the tuples in the relations R and S. Such joins are known as inner join operations.

However, there are cases when we want to keep all the tuples from the participating relations and form the join wherever there is a match. In such cases outer join operations can be used to keep all the tuples in R, or all those in S, or all those in both relations, irrespective of whether they have matching tuples on their common attributes.


Cont..
The three types of outer join operators are:

Left Outer Join: Keeps every tuple in the left relation R; when a tuple in R has no matching tuple in S, the attributes of S are filled (padded) with NULL values.

It is denoted by: R ⟕ S


Cont..
Right Outer Join: Similar to the left outer join operation, it keeps all tuples in the right relation S; when a tuple in S has no matching tuple in R, the attributes of R are padded with NULL values.

It is denoted by: R ⟖ S


Cont..
Full Outer Join: Keeps all tuples in both the left and right relations; when no matching tuples are found, the missing attributes are padded with NULL values as needed.

It is denoted by: R ⟗ S
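The three outer joins differ only in which unmatched tuples are kept. The sketch below is an illustration under stated assumptions: relations are lists of dictionaries, NULL is represented by Python's None, and the function outer_join with its keep_left and keep_right flags is a made-up helper.

```python
def outer_join(r, s, r_attrs, s_attrs, keep_left=True, keep_right=False):
    """Natural outer join; unmatched tuples are padded with None (NULL)."""
    common = [a for a in r_attrs if a in s_attrs]
    all_attrs = list(r_attrs) + [a for a in s_attrs if a not in common]
    result, matched_s = [], set()
    for tr in r:
        found = False
        for j, ts in enumerate(s):
            if all(tr[a] == ts[a] for a in common):
                result.append({**tr, **ts})
                matched_s.add(j)
                found = True
        if not found and keep_left:
            # Left tuple without a partner: pad the attributes of S.
            result.append({a: tr.get(a) for a in all_attrs})
    if keep_right:
        for j, ts in enumerate(s):
            if j not in matched_s:
                # Right tuple without a partner: pad the attributes of R.
                result.append({a: ts.get(a) for a in all_attrs})
    return result


# Made-up sample data.
R = [{"EmpId": 1, "TeamId": 10}, {"EmpId": 2, "TeamId": 20}]
S = [{"TeamId": 10, "TName": "Core"}, {"TeamId": 30, "TName": "QA"}]
r_attrs, s_attrs = ["EmpId", "TeamId"], ["TeamId", "TName"]

left = outer_join(R, S, r_attrs, s_attrs)
right = outer_join(R, S, r_attrs, s_attrs, keep_left=False, keep_right=True)
full = outer_join(R, S, r_attrs, s_attrs, keep_right=True)
```

With both flags off, the same function returns the plain inner (natural) join.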



Cont..
Aggregation and Grouping Operation: Aggregation functions (operators) such as SUM, COUNT, MIN, MAX, and AVG are collection operators that return a single value as a result.

Aggregation operators are not relational algebra operators themselves, but they are used by the grouping operator (γ), which groups tuples according to their values in one or more attributes.

It is denoted by: γ_L(R), where L is a list whose elements are either grouping attributes or aggregation functions applied to the attributes of the relation R.
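A small sketch of what the grouping operator computes, assuming relations are lists of dictionaries. The helper group_by and the output name NumTeams are hypothetical; the list L is split here into the grouping attributes and a dictionary of aggregations.

```python
from collections import defaultdict


def group_by(rows, group_attrs, aggregates):
    """Group rows on group_attrs, then apply each aggregate per group.

    aggregates maps an output name to (function, attribute).
    """
    groups = defaultdict(list)
    for t in rows:
        groups[tuple(t[a] for a in group_attrs)].append(t)
    result = []
    for key, members in groups.items():
        out = dict(zip(group_attrs, key))
        for name, (fn, attr) in aggregates.items():
            out[name] = fn([m[attr] for m in members])
        result.append(out)
    return result


# Made-up EmpTeams tuples: employee 1 is in two teams, employee 2 in one.
emp_teams = [
    {"EmpId": 1, "TeamId": 10},
    {"EmpId": 1, "TeamId": 20},
    {"EmpId": 2, "TeamId": 10},
]
# COUNT is modelled with len; SUM, MIN and MAX map to sum, min and max.
counts = group_by(emp_teams, ["EmpId"], {"NumTeams": (len, "TeamId")})
```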



Cont..

Example - Write a relational algebra expression that determines the number of teams each of the employees is working in.


Introduction to Relational Calculus
A relational calculus expression is declarative and nonprocedural: it specifies a retrieval request without describing how to evaluate the query. In other words, it specifies what is to be retrieved rather than how.

The relational calculus is not related to the differential and integral calculus of mathematics; it takes its name from a branch of symbolic logic called predicate calculus.

Relational calculus is classified into two types:

Tuple Relational Calculus, and

Domain Relational Calculus

Cont..
The Tuple Relational Calculus: A query in the tuple relational calculus (tuple calculus) is expressed as: { t | P(t) } where t is a tuple variable and P(t) is a predicate (condition) that must be true for the tuple t.

P(t) may combine various conditions logically with OR (∨), AND (∧), and NOT (¬). It also uses quantifiers:

∃ t ∈ r (Q(t)) = “there exists” a tuple t in relation r such that predicate Q(t) is true.

∀ t ∈ r (Q(t)) = Q(t) is true “for all” tuples t in relation r.


Cont..

EXAMPLE: Find the loan number, branch, and amount of loans with an amount greater than or equal to 10,000.

Cont..
SOLUTION
{t | t ∈ loan ∧ t[amount] >= 10000}

Find the loan number for each loan of an amount greater than or equal to 10,000.

{t | ∃ s ∈ loan (t[loan number] = s[loan number] ∧ s[amount] >= 10000)}

Find the names of all customers who have both a loan and an account.

{t | ∃ s ∈ borrower (t[customer-name] = s[customer-name]) ∧ ∃ u ∈ depositor (t[customer-name] = u[customer-name])}
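Tuple calculus expressions read much like set comprehensions, which gives a convenient way to experiment with them. The sketch below mirrors the two loan queries and the two quantifiers; it assumes the loan relation is a list of dictionaries, and the rows are made-up sample data.

```python
# Hypothetical contents of the loan relation.
loan = [
    {"loan_number": "L-1", "branch": "ABC", "amount": 5000},
    {"loan_number": "L-2", "branch": "ABC", "amount": 10000},
    {"loan_number": "L-3", "branch": "XYZ", "amount": 25000},
]

# { t | t ∈ loan ∧ t[amount] >= 10000 }
big_loans = [t for t in loan if t["amount"] >= 10000]

# { t | ∃ s ∈ loan (t[loan number] = s[loan number] ∧ s[amount] >= 10000) }
big_loan_numbers = {s["loan_number"] for s in loan if s["amount"] >= 10000}

# ∃ t ∈ loan (Q(t)) corresponds to any(); ∀ t ∈ loan (Q(t)) to all().
exists_big = any(t["amount"] >= 10000 for t in loan)
all_big = all(t["amount"] >= 10000 for t in loan)
```

As in the calculus, the comprehension states which tuples qualify and leaves the evaluation order to the language.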
Cont..
The Domain Relational Calculus: A query in the domain relational calculus (domain calculus) uses domain variables that take on values from an attribute's domain rather than values for an entire tuple.

It is expressed as { <x1, x2, …, xn> | P(x1, x2, …, xn) } where x1, x2, …, xn represent domain variables and P is the predicate, as in the case of the tuple calculus.

Formulas in the predicate are built in the same way as the tuple calculus predicates.


Cont..
Example - Retrieve the name and description of all the teams working on the project named “banking db”.

Projects(PrjId, PName, SDate, DDate, CDate)

Teams(TeamId, PrjId, TName, Descr)


Chapter five



Structured query language(SQL)
What is SQL? SQL stands for Structured Query Language.

SQL became a standard of the American National Standards Institute (ANSI) in 1986, and of the International Organization for Standardization (ISO) in 1987.

SQL is a standard: although SQL is an ANSI/ISO standard, there are different versions of the SQL language. However, to be compliant with the ANSI standard, they all support at least the major commands (such as SELECT, UPDATE, DELETE, and INSERT, together with clauses such as WHERE) in a similar manner.


Structured query language(SQL)
What Can SQL do?

Can retrieve, insert, update, and delete records in a database

Can create new databases

Can create new tables in a database

Can create stored procedures in a database

Can create views in a database

Can set permissions on tables, procedures, and views



SQL Data Definition Language (DDL)
The most important DDL statements in SQL are:

CREATE TABLE - creates a new database table.

ALTER TABLE - alters (changes) a database table.

DROP TABLE - deletes a database table.

CREATE INDEX - creates an index (search key).

DROP INDEX - deletes an index.



SQL Data Manipulation Language (DML)
The Data Manipulation Language (DML) is the part of the SQL syntax used for executing queries that insert, retrieve, update, and delete records.
The statements are:

INSERT INTO - inserts new data into a database table.

SELECT - extracts data from a database table.

UPDATE - updates data in a database table.

DELETE - deletes data from a database table.



Schema Definition in SQL
The term "schema" refers to the organization of data: a blueprint of how the database is constructed (divided into tables in the case of relational databases). Schemas are like folders within a database, while the database itself is the container.

SQL uses the following terms for the corresponding terms in the relational model:

Table – Relation

Column – Attribute

Row – Tuple


Schema Creation and Modification
The CREATE SCHEMA command in SQL is used to group database objects such as tables, views and permissions.

The syntax for the command is: CREATE SCHEMA <schema_name> AUTHORIZATION <owner>

<schema_name> is the name of the schema.

<owner> identifies the user who is the owner of the schema.

Example: CREATE SCHEMA schema AUTHORIZATION user


Cont..
Whereas the CREATE SCHEMA command groups database objects, the CREATE DATABASE command in SQL is used to create a new database and the corresponding files for storing the database.

The syntax for the command is:

CREATE DATABASE <database_name>

Example: CREATE DATABASE DMU


Table Creation and Modification
The CREATE TABLE command in SQL is used to specify a new relation in a database by giving it a name and listing its attributes.

The syntax:

CREATE TABLE table_name (
    column_name1 data_type column_constraint,
    column_name2 data_type column_constraint,
    ……
    column_nameN data_type column_constraint
)


Cont..
Example: For the PROJECTS and TEAMS relations the corresponding tables can be defined as:

CREATE TABLE Projects (
    PrjId INT NOT NULL PRIMARY KEY,
    Name VARCHAR(30) NOT NULL,
    SDate DATE NOT NULL,
    DDate DATE NULL,
    CDate DATE NULL
)

CREATE TABLE Teams (
    PrjId INT NOT NULL FOREIGN KEY REFERENCES Projects(PrjId),
    Name VARCHAR(30) NOT NULL,
    Description VARCHAR(100) NULL,
    PRIMARY KEY (PrjId, Name)
)

…


Cont..
The primary key constraint in a relation is enforced by using the keyword PRIMARY KEY following the key attribute; in the case of multiple attributes it can be specified on a separate line.

The referential integrity constraint in a relational database is implemented by the use of a foreign key. If the referential integrity enforced by a FOREIGN KEY is violated, the default behaviour is to reject the operation that introduces the violating tuple. However, by using the optional referential triggered actions, the designer can attach clauses to the foreign key constraint, such as ON DELETE CASCADE and ON UPDATE CASCADE.


Cont..
Cascading referential integrity constraints are foreign key constraints that tell SQL Server to perform certain actions when a primary key field in a primary key-foreign key relationship is updated or deleted.

Example: create table parent ( pid int not null primary key, Pname char(20) not null )

create table child ( cid int not null primary key, pid int foreign key references parent(pid) on delete cascade on update cascade )
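The effect of the cascading actions can be observed with a small experiment. The sketch below uses SQLite through Python's standard sqlite3 module as a stand-in for SQL Server, so the column-level syntax is slightly different (REFERENCES without the FOREIGN KEY keyword); the rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE parent (pid INTEGER NOT NULL PRIMARY KEY, pname CHAR(20) NOT NULL)"
)
conn.execute(
    """CREATE TABLE child (
           cid INTEGER NOT NULL PRIMARY KEY,
           pid INTEGER REFERENCES parent(pid)
               ON DELETE CASCADE ON UPDATE CASCADE)"""
)
conn.execute("INSERT INTO parent VALUES (1, 'first'), (2, 'second')")
conn.execute("INSERT INTO child VALUES (10, 1), (11, 1), (12, 2)")

# ON UPDATE CASCADE: child 12 follows its parent from pid 2 to pid 5.
conn.execute("UPDATE parent SET pid = 5 WHERE pid = 2")
# ON DELETE CASCADE: children 10 and 11 are deleted with parent 1.
conn.execute("DELETE FROM parent WHERE pid = 1")

remaining = conn.execute("SELECT cid, pid FROM child ORDER BY cid").fetchall()
```

Without the cascade clauses, both the UPDATE and the DELETE would be rejected because child rows still reference the parent key.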



Cont..
SQL ALTER TABLE Statement:
The ALTER TABLE statement is used to add, delete, or modify columns in an existing table.
The ALTER TABLE statement is also used to add and drop various constraints on an existing table.
To add a column, syntax: ALTER TABLE table_name ADD column_name datatype;
To drop a column, syntax: ALTER TABLE table_name DROP COLUMN column_name;
To change the data type of a column in a table, syntax:
ALTER TABLE table_name ALTER COLUMN column_name datatype;
To add constraints, for example:
ALTER TABLE Persons ADD PRIMARY KEY (ID);
ALTER TABLE Orders ADD FOREIGN KEY (PersonID) REFERENCES Persons(PersonID);


Index creation and modification
Indexes are the heart of fast data access. Data access can be fast without indexes, but only if the table is small.

If the table contains thousands or millions of rows, efficient data access has to be done through indexes. An index in a book helps to find information about a specific subject without having to read the entire book.

The same applies to a database index; it helps to find information about a specific row or rows without having to search through the entire table.


Cont..
An index for a table is maintained as a separate structure (conceptually an external table) whose columns are the search key (index attribute) and a pointer to the location of the data.

Creating indexes is a straightforward process when done with the CREATE INDEX statement.

The basic CREATE INDEX statement is:

CREATE INDEX index_name
ON table_name (column1, column2, ...);

Example: CREATE INDEX idx_lastname ON Persons (LastName);
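Whether a query actually uses an index can be checked by asking the DBMS for its query plan. The following sketch uses SQLite through Python's sqlite3 module; EXPLAIN QUERY PLAN is SQLite-specific, the exact plan text varies between versions, and the Persons rows are generated sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (ID INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT)"
)
conn.executemany(
    "INSERT INTO Persons (LastName, FirstName) VALUES (?, ?)",
    [(f"Last{i}", f"First{i}") for i in range(1000)],
)


def plan(sql):
    """Return SQLite's textual query plan (last column of each plan row)."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = "SELECT * FROM Persons WHERE LastName = 'Last500'"
before = plan(query)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_lastname ON Persons (LastName)")
after = plan(query)   # the plan now mentions idx_lastname
```

The query text is identical in both runs; only the access path chosen by the DBMS changes.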


Simple Query Constructs and Syntax
The SELECT statement is used to select data from a database.

The data returned is stored in a result table, called the result-set.

SELECT Syntax: SELECT column1, column2, ...
FROM table_name;

Here, column1, column2, ... are the field names of the table you want to select data from.

If you want to select all the fields available in the table, use the following syntax: SELECT * FROM table_name;


Cont..
SELECT <column_list> FROM <table_list> WHERE <condition>

<column_list> is the list of column names whose values are retrieved by the query.

<table_list> is the list of table names required in the process.

<condition> is a Boolean expression (conditional expression) that determines the rows to be selected in the query.

The HAVING clause was added to SQL because the WHERE keyword could not be used with aggregate functions.

Eg: SELECT column_name(s) FROM table_name WHERE condition
GROUP BY column_name(s) HAVING condition

Cont..
The WHERE clause is a search condition used to filter records.

The WHERE clause is used to extract only those records that fulfill a specified condition.

Syntax: SELECT column1, column2, ... FROM table_name
WHERE condition;

The WHERE clause is not only used in the SELECT statement; it is also used in the UPDATE and DELETE statements, etc.

Syntax: UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;

Cont..
In SQL, aliases are used to give a table, or a column in a table, a temporary name.

Aliases are often used to make column names more readable and to avoid ambiguity.

An alias only exists for the duration of the query.

Column alias syntax: SELECT column_name AS alias_name
FROM table_name;

Table alias syntax: SELECT column_name(s)
FROM table_name AS alias_name;

Example: SELECT CustomerID AS ID,
CustomerName AS Customer
FROM Customers;

Cont..
The SELECT DISTINCT statement is used to return only distinct (different) values.

Inside a table, a column often contains many duplicate values, and sometimes you only want to list the different (distinct) values.

To remove duplicates and obtain a set of rows as a result, one can use the DISTINCT keyword in the SELECT clause as follows:

SELECT DISTINCT <column_list> FROM <table_list>
WHERE <condition>


Cont..
Example: A query to retrieve the employees' names, the projects they are participating in, and the due dates of the projects.

SELECT DISTINCT e.Name, p.Name, p.DDate
FROM Employees AS e, EmpTeams AS et, Teams AS t, Projects AS p
WHERE e.EmpId = et.EmpId AND et.TeamId = t.TeamId AND p.PrjId = t.PrjId

If the SELECT is not DISTINCT, the resulting table will include identical rows for an employee participating in different teams of the same project.


Cont..
The LIKE operator is used in a WHERE clause to search for a
specified pattern in a column.

There are two wildcards often used in conjunction with the LIKE
operator:
% The percent sign represents zero, one, or multiple characters
_ The underscore represents a single character



Cont..
Example: A query to retrieve employees with a name starting with the letter 'A'. The pattern must be compared with LIKE; with = the string 'A%' would be matched literally.
SELECT * FROM Employees WHERE Name LIKE 'A%';
The following SQL statement selects all customers with a CustomerName that has "r" in the second position:
SELECT * FROM Customers
WHERE CustomerName LIKE '_r%';
The following SQL statement selects all customers with a CustomerName that starts with "a" and is at least 3 characters in length:
SELECT * FROM Customers
WHERE CustomerName LIKE 'a__%';
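The wildcard patterns can be tried out on a few rows. This is a small sketch using SQLite via Python's sqlite3 module with made-up customer names; the patterns are written in the same letter case as the data, because whether LIKE ignores case depends on the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?)",
    [("Al",), ("Ana",), ("Arnold",), ("Bruno",), ("Greta",)],
)


def names(condition):
    """Run a WHERE condition against Customers and return sorted names."""
    sql = f"SELECT CustomerName FROM Customers WHERE {condition} ORDER BY CustomerName"
    return [row[0] for row in conn.execute(sql)]


starts_with_a = names("CustomerName LIKE 'A%'")      # % = zero or more characters
r_in_second = names("CustomerName LIKE '_r%'")       # _ = exactly one character
a_at_least_3 = names("CustomerName LIKE 'A__%'")     # A plus at least two more
equals_pattern = names("CustomerName = 'A%'")        # = compares literally
```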



Nested Sub queries and Complex Queries
The SELECT-FROM-WHERE statement discussed so far is the simplest SQL statement for querying a database. SQL SELECT statements can be combined together to form sub queries.

Sub queries in a SQL statement are complete SELECT-FROM-WHERE statements that are contained in another query.

They can be used in different ways:

Sub queries in the WHERE clause to form nested queries,

Sub queries in set operations such as UNION, EXCEPT, …, and

Sub queries in the FROM clause as constant tables.


Cont..
SQL SELECT statements can be contained in the WHERE clause of another SQL statement to form nested queries. Sub queries in a nested SQL statement can produce a scalar value (constant) or a table.

Consider the following relations:

Employees(EmpId, Name, BDate, SubCity, Kebele, Phone, Salary)

Teams(TeamId, PrjId, Name, Descr)

EmpTeams(EmpId, TeamId)

Projects(PrjId, Name, SDate, DDate, CDate, CustId)

Customers(CustId, Name, Address)


Cont..
Example: Write a query to retrieve all the projects that are owned by the customer 'XYZ'. (Assume the name of a customer is unique.)

SELECT Name, SDate, DDate, CDate FROM Projects
WHERE CustId = (SELECT CustId FROM Customers WHERE Name = 'XYZ')
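The scalar sub query above can be run on sample data to see how the inner SELECT feeds a single value to the outer WHERE clause. This sketch uses SQLite via Python's sqlite3 module; the tables are cut down to the columns the query needs and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customers (CustId INTEGER PRIMARY KEY, Name TEXT, Address TEXT);
    CREATE TABLE Projects (PrjId INTEGER PRIMARY KEY, Name TEXT,
                           CustId INTEGER REFERENCES Customers(CustId));
    INSERT INTO Customers VALUES (1, 'XYZ', 'Addr1'), (2, 'ABC', 'Addr2');
    INSERT INTO Projects VALUES (100, 'Banking DB', 1), (101, 'Payroll', 1),
                                (102, 'Inventory', 2);
    """
)

# The inner query yields exactly one CustId because customer names are unique.
xyz_projects = conn.execute(
    """SELECT Name FROM Projects
       WHERE CustId = (SELECT CustId FROM Customers WHERE Name = 'XYZ')
       ORDER BY Name"""
).fetchall()
```

If customer names were not unique, the inner query could return several rows and the comparison with = would no longer be appropriate; IN would be used instead.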



Cont..
TEST Operators in sub query:

The ANY and ALL operators are used with a WHERE or HAVING clause.

The ANY operator returns true if any of the sub query values meet the condition.

Syntax: SELECT column_name(s) FROM table_name
WHERE column_name operator ANY
(SELECT column_name FROM table_name WHERE condition);

Eg: SELECT ProductName FROM Products
WHERE ProductID = ANY (SELECT ProductID FROM OrderDetails WHERE Quantity = 10);

Cont..
The ALL operator returns true if all of the sub query values meet the condition.

Syntax: SELECT column_name(s) FROM table_name
WHERE column_name operator ALL
(SELECT column_name FROM table_name WHERE condition);

The operator must be a standard comparison operator (=, <>, !=, >, >=, <, or <=).

Eg: SELECT ProductName FROM Products
WHERE ProductID = ALL (SELECT ProductID FROM OrderDetails WHERE Quantity > 99);

Cont..
The EXISTS operator: is used to test for the existence of any record in
a sub query.

The EXISTS operator returns true if the sub query returns one or more
records.
EXISTS Syntax: SELECT column_name(s) FROM table_name
WHERE EXISTS
(SELECT column_name FROM table_name WHERE condition);
The BETWEEN operator: selects values within a given range. The
values can be numbers, text, or dates. It is inclusive: begin and end
values are included.
Syntax: SELECT column_name(s) FROM table_name
WHERE column_name BETWEEN value1 AND value2;



Cont..
Example: Write a query to retrieve customers that own at least one project.

SELECT Name, Address FROM Customers AS c
WHERE EXISTS (SELECT * FROM Projects AS p WHERE c.CustId = p.CustId)

Example: Retrieve the suppliers that supply at least one product with a price below 20.

SELECT SupplierName FROM Suppliers
WHERE EXISTS (SELECT ProductName FROM Products
WHERE Products.SupplierID = Suppliers.SupplierID AND Price < 20);
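A correlated EXISTS sub query is evaluated once per outer row. The sketch below runs the customers query on made-up data with SQLite via Python's sqlite3 module; the NOT EXISTS variant is added only for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customers (CustId INTEGER PRIMARY KEY, Name TEXT, Address TEXT);
    CREATE TABLE Projects (PrjId INTEGER PRIMARY KEY, Name TEXT, CustId INTEGER);
    INSERT INTO Customers VALUES (1, 'XYZ', 'Addr1'), (2, 'ABC', 'Addr2'),
                                 (3, 'NoProjects', 'Addr3');
    INSERT INTO Projects VALUES (100, 'Banking DB', 1), (101, 'Payroll', 1),
                                (102, 'Inventory', 2);
    """
)

# Customers for which the inner query returns at least one row.
owners = conn.execute(
    """SELECT Name FROM Customers AS c
       WHERE EXISTS (SELECT * FROM Projects AS p WHERE c.CustId = p.CustId)
       ORDER BY Name"""
).fetchall()

# Customers for which the inner query returns no row at all.
non_owners = conn.execute(
    """SELECT Name FROM Customers AS c
       WHERE NOT EXISTS (SELECT * FROM Projects AS p WHERE c.CustId = p.CustId)"""
).fetchall()
```

A customer with two projects still appears once, because EXISTS only tests whether the inner result is non-empty.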



Cont..
A JOIN clause is used to combine rows from two or more tables,
based on a related column between them.

types of join:-

The INNER JOIN keyword selects records that have matching


values in both tables.

SELECT column_name(s) FROM table1
INNER JOIN table2
ON table1.column_name = table2.column_name;



Cont..
The LEFT JOIN keyword returns all records from the left table
(table1), and the matched records from the right table (table2). The
result is NULL from the right side, if there is no match.

Syntax: SELECT column_name(s) FROM table1
LEFT OUTER JOIN table2
ON table1.column_name = table2.column_name;


Cont..
The RIGHT JOIN keyword returns all records from the right table
(table2), and the matched records from the left table (table1). The
result is NULL from the left side, when there is no match.

Syntax: SELECT column_name(s) FROM table1
RIGHT OUTER JOIN table2
ON table1.column_name = table2.column_name;


Cont..
The FULL OUTER JOIN keyword returns all records from both tables: rows are combined when there is a match between the left (table1) and right (table2) tables, and unmatched rows from either side are padded with NULL.

Syntax: SELECT column_name(s) FROM table1
FULL OUTER JOIN table2
ON table1.column_name = table2.column_name;


Cont..
Cross Product: <left_table> CROSS JOIN <right_table>. The cross product forms the Cartesian product of the participating tables.

Natural Join: <left_table> NATURAL JOIN <right_table>. The natural join forms a join of rows with identical values in the common attributes of the participating tables.

Theta Join: <left_table> JOIN <right_table> ON <condition>. The theta join joins the tables on the condition specified by the ON clause.


Cont..
Example: The query that retrieves the name and phone of employees participating in projects owned by the customer 'XYZ' can be written as follows using joined tables:

SELECT Name, Phone FROM Employees WHERE EmpId IN
(SELECT EmpId FROM EmpTeams NATURAL JOIN Teams WHERE PrjId IN
(SELECT PrjId FROM Projects AS p JOIN Customers AS c ON p.CustId = c.CustId
WHERE c.Name = 'XYZ'))


Cont..
Example: A query to retrieve all the projects and the teams they consist of, if any:

SELECT p.Name, p.SDate, p.DDate, t.Name, t.Descr
FROM Projects AS p LEFT OUTER JOIN Teams AS t ON p.PrjId = t.PrjId

The query returns all the projects; if a project has teams, it is joined with those teams as well. A project having more than one team is joined with each of its teams in the resulting table.
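The behaviour described above (one row per team, and a NULL-padded row for a project without teams) can be seen on a small data set. The sketch uses SQLite via Python's sqlite3 module; the tables are reduced to a few columns and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Projects (PrjId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Teams (TeamId INTEGER PRIMARY KEY, PrjId INTEGER, Name TEXT);
    INSERT INTO Projects VALUES (1, 'Banking DB'), (2, 'Payroll'), (3, 'Inventory');
    INSERT INTO Teams VALUES (10, 1, 'Design'), (11, 1, 'Testing'), (12, 2, 'Core');
    """
)

rows = conn.execute(
    """SELECT p.Name, t.Name
       FROM Projects AS p LEFT OUTER JOIN Teams AS t ON p.PrjId = t.PrjId
       ORDER BY p.PrjId, t.TeamId"""
).fetchall()
```

Python's sqlite3 module reports SQL NULL as None, so the project without a team shows up with None in the team column.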



Cont..
The ORDER BY keyword is used to sort the result-set in ascending or descending order. It sorts the records in ascending order by default. To sort the records in descending order, use the DESC keyword.

The default sort order is ASC, which can be omitted.

Syntax: SELECT column1, column2, ... FROM table_name
ORDER BY column1, column2, ... ASC|DESC;

Eg1: SELECT * FROM Customers ORDER BY Country, CustomerName DESC;

Eg2: SELECT * FROM Customers
ORDER BY Country ASC, CustomerName DESC;

Cont..
The GROUP BY statement groups rows that have the same values into summary rows, like "find the number of customers in each country". It is often used with aggregate functions (COUNT, MAX, MIN, SUM, AVG) to group the result-set by one or more columns.

Syntax: SELECT column_name(s) FROM table_name
WHERE condition
GROUP BY column_name(s) ORDER BY column_name(s);

Eg: SELECT COUNT(CustomerID), Country FROM Customers
GROUP BY Country ORDER BY COUNT(CustomerID) DESC;
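The customers-per-country example can be executed on sample data, together with a HAVING clause that filters the groups after aggregation. The sketch uses SQLite via Python's sqlite3 module; the Customers rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, Country TEXT)"
)
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [
        (1, "Ana", "Ethiopia"), (2, "Ben", "Ethiopia"), (3, "Cal", "Kenya"),
        (4, "Dee", "Ethiopia"), (5, "Eli", "Kenya"), (6, "Fay", "Sudan"),
    ],
)

# One summary row per country, largest group first.
per_country = conn.execute(
    """SELECT COUNT(CustomerID), Country FROM Customers
       GROUP BY Country ORDER BY COUNT(CustomerID) DESC"""
).fetchall()

# HAVING filters groups; WHERE could not refer to COUNT(...) here.
at_least_two = conn.execute(
    """SELECT COUNT(CustomerID), Country FROM Customers
       GROUP BY Country HAVING COUNT(CustomerID) >= 2
       ORDER BY COUNT(CustomerID) DESC"""
).fetchall()
```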



Aggregate Function in SQL
The COUNT() function: returns the number of rows that match a specified criterion.

Syntax: SELECT COUNT(column_name) FROM table_name
WHERE condition;

The AVG() function: returns the average value of a numeric column.

Syntax: SELECT AVG(column_name) FROM table_name
WHERE condition;

The SUM() function: returns the total sum of a numeric column.

Syntax: SELECT SUM(column_name) FROM table_name
WHERE condition;

Cont..
The MAX() function: returns the largest value of the selected column.

Syntax: SELECT MAX(column_name) FROM table_name
WHERE condition;

The MIN() function: returns the smallest value of the selected column.

Syntax: SELECT MIN(column_name) FROM table_name
WHERE condition;

Eg1: SELECT MIN(Price) AS SmallestPrice FROM Products;
Eg2: SELECT MAX(Price) AS LargestPrice FROM Products;


views
In SQL, a view is a virtual table based on the result-set of an SQL
statement.

A view contains rows and columns, just like a real table. The fields in
a view are fields from one or more real tables in the database.

You can add SQL functions, WHERE, and JOIN statements to a view
and present the data as if the data were coming from one single table.

Syntax: CREATE VIEW view_name AS SELECT column1, column2, ...
FROM table_name WHERE condition;
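Because a view is a stored query rather than a stored table, it reflects later changes to the underlying table. The sketch below illustrates this with SQLite via Python's sqlite3 module; the view name CheapProducts and the rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName TEXT, Price INTEGER);
    INSERT INTO Products VALUES (1, 'Pen', 5), (2, 'Bag', 45);
    CREATE VIEW CheapProducts AS
        SELECT ProductName, Price FROM Products WHERE Price < 20;
    """
)

view_query = "SELECT ProductName, Price FROM CheapProducts ORDER BY ProductName"
before = conn.execute(view_query).fetchall()

# A new row in the base table shows up in the view without redefining it.
conn.execute("INSERT INTO Products VALUES (3, 'Ink', 12)")
after = conn.execute(view_query).fetchall()
```

The view is queried exactly like a table, which is what makes it useful for hiding a WHERE condition or a join from the user.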



views
1. Discuss types of schema i.e. conceptual schema, logical schema, and
physical schema

2. Discuss relational calculus with example.

Submitted date 03/04/2012




Chapter six
Data storage and querying



Storage and File Structure
Introduction: A database system is designed to hold large volumes of data that need to be stored physically (permanently) on a storage medium. The storage media in a computer can be categorized as:

Primary Storage: Storage media that the CPU can access directly: the main memory and the cache. Cache is the fastest level in the memory hierarchy, closest to the CPU, and is built inside the microprocessor chip. Typically the response time is in nanoseconds.

The main memory is the next level in the hierarchy; it provides the main working environment for the CPU to keep programs and data.

Storage and File Structure
Secondary Storage: Storage media for permanent storage, such as magnetic disk and optical disk. They are larger in capacity but significantly slower than primary storage. Typical response time is in milliseconds. Secondary storage is used as virtual memory, disk storage, and file system.

Tertiary Storage: Storage media that are used for archive and backup storage of data, such as magnetic tape. Typical response time is a few seconds or even minutes. Virtual memory is storage on the disk that is often addressed by a 32-bit address space; hence 2^32 bytes ≈ 4 GB of data can be managed.
