
Unit 1: Introduction

# Data:-
 Data is a collection of raw facts related to
elements, objects or an entity.
 It includes facts, figures, letters, words, charts,
symbols, audio, video or any combination of these.
 The data collected from the user will be processed
to generate some information.
 This process is known as data processing.

# Information:-
 Information is processed data that gives
some useful facts or meaning.
 Information is useful in the decision-making process.
 Information can be data for the next level of
processing.
 In this way processed data is re-used to generate
further information, which is known as information
processing.

# Characteristics of information
As mentioned in the definition of information, it must
give some meaning, so information must have some
characteristics. They are;
Subjectivity:
 The information must be highly subjective, as
there may be varieties of data collected.
 The data of two persons, objects or elements may
match, but the information must be specific
to the one concerned.

Relevance:
 The information must be relevant, that is, pertinent
and meaningful to the decision maker.
 The right information for the right person will help in
the decision-making process.
Timeliness:
 The information must be delivered on time as well.
 The right information at the right time will help in
decision making.

Accuracy:
 The information provided to make a decision must
be accurate.
 Wrong information may lead to a wrong decision,
which will be harmful for the organization.

Correct information format:
 The information generated must be in the proper
format, as desired.
 The wrong format of right information may misguide
the decision-making process.

Completeness:
 The information provided for the decision-making
process must be complete.
 Incomplete information may guide you towards
the wrong decision.

Accessibility:
 The information generated must be accessible for
the decision-making process.
 The information will be valueless if it is not
accessible to the decision maker in the right format at
the right time.

# Process of converting data into information:

In data processing various steps are followed to
convert the data into information. These steps
include;
Collection:
 Data are collected through surveys, interviews,
sensors, documents, newspapers or other
media.

Classification:
 The collected data will be classified into
different groups as required or convenient.

Adding, Merging & Sorting:
 The data collected may be incomplete.
 These data will be combined, merged and
arranged in order as required.

Summarizing:
 These data will be processed and become
information, which can be summarized and
represented to get the desired result.

Storing and Retrieving:
 The data or information will be stored properly
and extracted for further use.

# File Processing System

 Collecting data, storing them and processing them
as required is termed file processing.
 In the traditional approach, small amounts of data are
collected, stored and processed; this is termed the flat
file processing system.
 Applications written in languages such as COBOL, PASCAL
and BASIC were used to design the traditional processing
system.
 All data related to a department or block are
stored, accessed and processed using a separate
application.

[Figure: Flat file processing system — separate applications on Server 1, Server 2 and Server 3, each with its own database for the Personal Account, Saving Account and Loan Processing departments]

# Limitations of the flat file processing system;
 Evaluation, analysis and extraction of data
from two or more databases are not possible.
 Data and information stored in a database may be
repeated, i.e. the same data stored in too many places.
 Data and information will be stored and coded in a
special format; this data format cannot be
changed. If changed, it may give a different meaning.
 Linking data elements between two or more tables
is very difficult and will not give the desired result.
 Data stored in a database are not flexible enough to
link between two or more tables, which will result in
incomplete data elements.
 Data elements stored are not very secure or
consistent, which may raise problems in
constraint handling.
 The data elements could not be distributed to
multiple users as required.
 The most important disadvantage is security; due
to the weak security provision it is not widely used in
financial transactions.

# Database Management System;

 A database management system is a systematic
collection and processing of the data elements
collected.
 In a database, data elements are organized and
systematically stored on a database server.
 These data will be accessed by various users
through different applications.
 It provides various types of processing functions
such as inserting, retrieving, updating and deleting
records.
 It is used in almost every area of the computing
field such as business, medicine, law, education,
libraries and many other related areas.
 In particular, databases are widely used in financial
transactions such as banking, financial institutions,
department stores, shops etc.

# Some implicit properties of a database are;

 A database represents some aspects of the real
world, sometimes called the mini world. Changes to
the mini world are reflected in the database.
 A database is a logically coherent collection of data
with some inherent meaning; a random assortment
of data cannot correctly be referred to as a
database.
 A database is designed, built and populated with
data for a specific purpose. It has an intended group
of users and some preconceived applications in
which these users are interested.

# History of Database system


 1950s and early 1960s:
* Data processing using magnetic tapes for
storage;
* Tapes provide only sequential access;
* Punched cards for input;
 Late 1960s and 1970s:
* Hard disks allow direct access to data;
* Network and hierarchical data models in
widespread use;
* Ted Codd defines the relational data model;
* He would win the ACM Turing Award for this work;
* IBM Research begins System R prototype;
* UC Berkeley begins Ingres prototype;
* High-performance (for the era) transaction
processing;
 1980s:
* Research relational prototypes evolve into
commercial systems
* SQL becomes industrial standard;
* Parallel and distributed database systems;
* Object-oriented database systems;
 1990s:
* Large decision support and data-mining
applications;
* Large multi-terabyte data warehouses;
* Emergence of Web commerce;
 2000s:
* XML and XQuery standards;
* Automated database administration;

# Database approach;
A flat filing system has various types of problems
and limitations. All of these problems
can be solved very easily using a database system.
It stores all the data content in a centrally
located storage section, accessed through
different applications. Basic tasks such as
preparation, insertion, updating, retrieval, deletion,
backup and restore are performed very
efficiently. The data processing approach is
presented below;
# Advantages of Modern Database Management
System;
 Data redundancy can be removed, as repeated
data are eliminated and a link is established
to access these data.
 Data inconsistency will be avoided, as similar
data elements which may give different meanings
are collected separately.
 Sharing of data between users is very
easily possible.
 It will maintain the standard format of the report as
desired by the user.
 Different levels of security are maintained, which
make database tasks highly secure even over a
network.
 The data values collected in the database are accurate
and consistent. (Integrity constraints are satisfied.)
 It provides a multi-user environment and
interfaces for the users.
 Easy relationships among the data entities or objects
within the database or other databases.
 Backup and recovery provisions make it highly
secure, as we can recover data loss from
backups.

# Disadvantages of Modern Database Management
System

 High investment required initially.
 Overhead expenses for the security, backup & recovery
process.
 Extra cost of hardware upgrading for extensive
applications and workspace for execution and
storage.
 Regular cost for maintenance of the software and
hardware.
 An additional cost for converting from the traditional
method to the integrated method.
 Cost for backup and security.

The most important disadvantage is the cost factor.
Exercises
1)What is Data and Information? List out the
characteristics of information;
2)What are the processes of converting the data into
information?
3)What are the limitations of flat file processing
system?
4)What is Database Management System?
5)What are the characteristics of DBMS?
6)What are the advantages of modern DBMS?
7)What are the limitations of DBMS?
Unit 2: Database system, Concept and Architecture

# Database processing and its methods:


In data processing, the DBMS acts as an intermediary
between the user, application programs and the database.
The DBMS stores the data on a database server; these
data are accessed through various applications
as required for processing. After the processing,
the result will be delivered through a report to
the user as required. All of these processes, along with
relationships, data integrity and other related tasks, are
handled by the DBMS.
Data stored in the database are accessed in three
different ways. Processing of the data means
retrieval of the data and information. Depending on the
method used, the data retrieval tasks are
performed as follows;

# Database Architecture:
Centralized DBMS Architecture:
 In centralized DBMS architecture, data contents
are stored on a centrally located server and
accessed by various dumb terminals known as
clients.
 These clients will access the database through the
application program.
 Each client will get access permission from the
DBA and has to work accordingly.
 Database structure of a single institution with
multiple nodes.

[Figure: Centralized DBMS architecture — clients access the data server and its magnetic storage section through the application]

Client Server DBMS Architecture:

 In client server architecture, a specialized server
will hold the data, which will be accessed through
remote terminals or substations via a network.
 Using different application programs the data will
be accessed to fulfil the requirements of the
client.
 The client is an interface to access the data and the
server is the machine which will fulfil the request of
the client.
 Different types and natures of server might be used
for this purpose, such as file server, database
server, web server, print server etc.
 The data elements stored on these servers could
be accessed through the network or even
through the Internet.
 SCT is an example of a client server network used by
various banks and financial institutions in our
country.

[Figure: Client Server DBMS structure — clients access the database server, which manages the database storage]

 Two Tier and Three Tier client server architecture;

[Figure: In the two tier architecture the user's application talks directly to the database system; in the three tier architecture an application server sits between the user's application and the database system]
Data abstraction / View of data;
A database system is a collection of data elements
in a set of files or programs that allow the user to
store, access and modify them. A major purpose of the
database is to give access to the data using an
abstract view of the data. Since not all
database users are computer experts
familiar with the complexity of the database
structure, the users are classified into various levels,
termed the levels of the user. This is defined as
the abstract view of the data.
There are three levels of data abstraction.
1) Physical Level or Internal Level:
* The physical view is a representation of the entire
database at the lowest level;
* It is expressed by an internal schema which contains
the definitions of stored records.
* It defines the methods of representing the data
fields and the access methods over the data.
* Only one physical view is defined per database.

2) Logical or Conceptual Level:

* The logical level expresses all the database entities
and their relationships.
* It is a representation of the entire information
content of the database in the physical storage.
* It includes the conceptual view of the data
elements stored.
* Only one conceptual level is defined per database.

3) View Level or External Level:

* This level is closest to the user, as it describes the
logical records and the relationships visible to that user.
* It contains the methods of deriving the objects in
the database. The objects include entities,
attributes and relationships.
* External views can be defined as required (a small
SQL sketch follows this list).
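The difference between the conceptual level and the view level can be sketched in SQL. A minimal sketch, assuming a hypothetical STUDENT table (the table name, view name and data types are illustrative, loosely based on the student example used later in these notes):

```sql
-- Conceptual (logical) level: the full table definition.
CREATE TABLE STUDENT (
    Roll_No   CHAR(13),
    Std_Name  VARCHAR(30),
    D_o_B     DATE,
    Faculty   VARCHAR(20),
    Semester  VARCHAR(10)
);

-- View (external) level: one user group sees only names and faculties,
-- while the physical level (files, indices) stays hidden from everyone.
CREATE VIEW STUDENT_LIST AS
SELECT Std_Name, Faculty
FROM   STUDENT;
```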

Database Schema:
A schema describes the view at each level. A
schema is an outline or a plan that describes the
records and relationships existing in the view. The
schema also describes the way in which entities at
one level of abstraction can be mapped to the next
level. The overall design of the database is called the
database schema.
A database schema includes:
* Characteristics of data items such as entities and
attributes.
* Logical structure and relationships among those
data items.
* Format of storage representation.
* Integrity parameters such as physical
authorization and backup policies.
Since each view is defined by a schema, there are
three levels of schemas: at the lower level one physical
schema, at the middle level one conceptual schema and
at the higher level several sub-schemas.
Each user group refers to its own sub-schema. The
DBMS transforms a request specified on an external
schema into a request on the conceptual schema, and
then into a request on the internal schema for
processing over the database. The process of
transforming requests and results between the
levels is called mapping.

Data independence:
* Data independence is a major objective of the
database system, which is implemented through the
three levels of schema.
* If a change made at one level does not affect the
application or definition at the other levels, this is termed
data independence.
Logical data independence is the ability to modify
the logical schema without causing application
programs to be rewritten or any change to the
external schema. When data is added or removed,
only the view definition and the mapping need to be
changed in a DBMS that supports logical data
independence.
Physical data independence is the ability to modify
the internal schema without causing any changes to the
external schema and without causing application
programs to be rewritten.
# Data models:
* The physical or logical structure of the database is
termed the database model.
* It is a conceptual tool to describe the data, data
relationships, data semantics, and data
constraints.
* Two major types of data models are in use, whereas
it is believed that the evolution of data models is still
in progress.
# Object based data model:
* The object based concept is based on data and data
relationships.
* It is gaining wide acceptance for its flexible
structuring capabilities.
* Various data integrity constraints can be specified
explicitly using the object based model.
The Entity Relationship and Object Oriented models are
two database models based on objects.

Entity relationship model:

* The model was published by Peter Chen in 1976.
* It implies a link between objects, called entities, and
relationships among these entities.
* Each entity has a set of attributes that describes the
object.
* A relationship is an association among the entities.

Object Oriented model:

* It is also based on collections of objects, like the ER
model.
* The objects contain values and bodies of code,
which are called methods.

Relational model
* It was derived by E. F. Codd in 1970 and is
considered an important concept in DBMS.
* It is based on the mathematical notion of a
relation, consisting of rows and columns of data.
* The column names are termed attributes
(fields) and the rows are treated as tuples (records).
* For each relation there is a set of attributes that
uniquely determines the tuples; this is called a key.
* It relieves the user from details of storage
structure and access methods.
* It is based on the relations among the database
elements and is popular as RDBMS.
# Data Dictionary and Database Language;
The Data Dictionary is a set of specific tables in a special
file where schema descriptions are stored. It contains
the metadata about the structure of the database.
The Data Dictionary is used very heavily; that's why great
emphasis should be placed on developing a good
design and an efficient implementation of the data
dictionary. The data dictionary is consulted
during the processing of queries.
In order to provide various facilities to
different types of users, a DBMS provides one or more
specialized programming languages called Database
Languages. Structured Query Language (SQL) has been
taken as the standard database language. Database
languages come in different forms, which are categorized
based on the different levels of abstraction facilities
provided by the DBMS.

* DDL [Data Definition Language];

DDL is used to describe the structure of the data. The
conceptual schema is specified by a set of definitions
expressed in this special language. The database will
have a compiler whose function is to process DDL
statements; for this the Data Dictionary is consulted
before actual data are read or modified in the
database system.
Another language called Storage Definition
Language (SDL) is used to specify the internal
schema. In some DBMSs there is only one language
which has both DDL and SDL capabilities. A small DDL
sketch in SQL follows.
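A minimal SQL DDL sketch (the ACCOUNT table and its columns are hypothetical, chosen to match the deposit record used later in Unit 3):

```sql
-- Define a schema object:
CREATE TABLE ACCOUNT (
    Branch_Name CHAR(22),
    Account_No  CHAR(10),
    Balance     NUMERIC(10,2)
);

-- Later schema changes are also DDL:
ALTER TABLE ACCOUNT ADD Opened_Date DATE;   -- add a new column
DROP TABLE ACCOUNT;                         -- remove the table and its definition
```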
DML [Data Manipulating Language];
The language that enables users to access or
manipulate the data is called Data Manipulation
Language (DML). By DML we mean:
 The retrieval of information stored in a database.
 The insertion of new data into the database.
 The deletion of data and information in a
database.
 The modification of data and information stored in
a database.

Procedural DML and Non-Procedural DML are two
further types of DML used in a database.
Procedural DML requires the user to specify
what data are needed and how to get these data.
Non-Procedural DML requires the user to specify
what data are needed without specifying
how to get it. A small SQL sketch of the basic DML
operations follows.
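A minimal SQL sketch of the four DML operations listed above, reusing the hypothetical ACCOUNT table from the DDL example:

```sql
-- Insertion
INSERT INTO ACCOUNT (Branch_Name, Account_No, Balance)
VALUES ('Pokhara', 'A-101', 5000);

-- Retrieval
SELECT Account_No, Balance
FROM   ACCOUNT
WHERE  Branch_Name = 'Pokhara';

-- Modification
UPDATE ACCOUNT
SET    Balance = Balance + 500
WHERE  Account_No = 'A-101';

-- Deletion
DELETE FROM ACCOUNT
WHERE  Account_No = 'A-101';
```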

# Database Users and Administrator:

Users are the people who use the system. Users
may be divided into those who actually use and control
the contents and those who enable the database to be
developed and the DBMS software to be designed and
implemented. Users are differentiated by the way they are
expected to interact with the system. There are four
different types of database user.
 Naive users interact with the system by
invoking one of the application programs that have
been written previously. They have authority to
use only the predefined application programs.

The database administrator (DBA) has central control of
the system. The major responsibilities of the DBA include;
 Schema definition: The DBA creates the original
database schema by writing a set of definitions,
which is translated by the DDL compiler to a set
of tables that is stored permanently in the Data
Dictionary.

 Storage structure and access method definition:

The DBA creates appropriate storage structures
and access methods by writing a set of definitions,
which is translated by the data storage and DDL
compiler.

 Schema and physical organization modification:

The DBA accomplishes the relatively rare
modifications either to the database schema or to
the description of the physical storage
organization by writing a set of definitions that is
used by the data storage and DDL compiler to
generate modifications to the appropriate internal
system tables (Data Dictionary).

 Granting of authorization for data access: The


granting of different types of authorization allows
the DBA to regulate which parts of the database
various users can access. The authorization
information is kept in special system structure that
is consulted by the database system whenever
access to the data is attempted in the system.

 Integrity constraint specification: The data values


stored in the database must satisfy certain
consistency constraints. The DBA must specify
such constraints explicitly. The integrity
constraints are kept in a special structure that is
consulted by the database system whenever an
update takes place in the system.

 Routine maintenance: The DBA is overall

responsible for the whole database system, so
routine maintenance of the database system is
required. The routine maintenance includes:

 Periodic backup of the database to prevent

data loss.
 Ensuring that enough free space is available for
operation.
 Monitoring the job scheduling on the database
system.
 Overviewing the activities of the different users.
 Security provision against various problems that
may arise.

# Database System Architecture (Environment):

 A database system is partitioned into modules that
deal with each of the responsibilities of the overall
system. The functional components of a database
system can be broadly divided into the storage
manager and the query processor components.

System Structure:
# Storage Manager:
The storage manager is a program module that provides
the interface between the low level data stored in the
database and the application programs and queries
submitted to the system. The storage manager is
responsible for the interaction with the file manager.
The raw data are stored on the disk using the file system,
which is usually provided by a conventional OS. The
storage manager translates the various DML
statements into low level file system commands. The
storage manager is responsible for storing, retrieving
and updating data in the database. The storage
manager components include;
* Authorization and Integrity Manager, which tests
for the satisfaction of integrity constraints and
checks the authority of users to access data.
* Transaction Manager, which ensures that the
database remains in a consistent (correct) state
despite system failures and that concurrent
transaction executions proceed without
conflicting.
* File Manager, which manages the allocation of
space on disk storage and the data structures used
to represent information stored on disk.
* Buffer Manager, which is responsible for fetching
data from disk storage into main memory and deciding
what data to cache in main memory. The buffer
manager is a critical part of the database system,
since it enables the database to handle data
sizes that are much larger than the size of main
memory.

The storage manager implements several data
structures as a part of the physical system
implementation:
* Data Files, which store the database itself.
* Data Dictionary, which stores metadata about the
database, in particular the schema of the
database.
* Indices, which provide fast access to data
items that hold particular values.

Query Processor:
The query processing components are:
* DDL Interpreter, which interprets DDL statements
and records the definitions in the data dictionary.
* DML Compiler, which translates DML statements
in a query language into an evaluation plan
consisting of low level instructions that the query
evaluation engine understands.
* Query Evaluation Engine, which executes the low level
instructions generated by the DML compiler.
# Entity Relationship (E-R) Model:
The ER data model is based on the real world, which
consists of a set of basic objects called entities and of
relationships among these objects. It was developed to
facilitate database design by allowing the specification
of an enterprise schema, which represents the overall
logical structure of the database. The ER model is one
of several semantic data models: the semantic aspect
of the model lies in its representation of the meaning
of the data. The ER model is very useful in mapping the
meaning and interactions of a real world enterprise onto
a conceptual schema.
The ER model is well suited to data modelling for use
with databases. It is fairly abstract and is easy to discuss
and explain. ER models are readily translated to
relations. ER modelling is based on two concepts:
 Entity (things or objects)
 Relationship (association among several
entities)

 Entity and Entity sets:

An entity is a thing or object in the real world that
is distinguishable from all other objects. An entity has a
set of properties, and the values of some set of
properties may uniquely identify an entity. An entity is
an abstraction from the complexities of some domain.
When we speak of an entity we normally speak of
some aspect of the real world which can be
distinguished from other aspects.
A collection of similar entities is called an Entity
Set. An entity set shares properties or
attributes. The individual entities that constitute an
entity set are said to be the extension of the entity set.
An entity is represented by a set of attributes.
Attributes are descriptive properties possessed by each
member of an entity set. The designation of an
attribute for an entity set expresses that the database
stores similar information concerning each entity in the
entity set. However, each entity may have its own value
for each attribute. Each attribute has a set of permitted
values, called the domain or value set of that attribute.
The attributes, as used in the ER model, can be
characterized into the following types.
 Simple and Composite attributes:
The attributes that are divisible into sub-parts
(such as Name, which can be divided into First, Middle
and Last Name) are termed composite attributes. A
composite attribute is a concatenation of simple
attributes. Most of the attributes are simple or
atomic.

 Single valued & Multi valued attributes:

The attributes that have only one value are
termed single valued attributes. The attributes
that have a set of values for a specific entity are
termed multi valued attributes. Age is a single
valued attribute, whereas phone no is a multi valued
attribute, as one person can have more than one
phone no.
[Figure: ER diagram with simple & composite attributes
and single & multi valued attributes]
 Null attributes:
An attribute that does not have a value is
termed a null valued attribute. If a person is not
married then the attributes spouse's name or
children's name will not have any value.
 Derived attributes:
An attribute whose value is derived from other
attributes is termed a derived attribute. For
example, Age can be derived from the Date of
Birth attribute.

* Relationship & Relationship Sets:

A relationship is an association among two or
more entities. In the E-R model one entity is linked with
another entity by a relationship. A relationship set is
a set of relationships of the same type; each relationship
is an instance of the relationship set it belongs to.
Each relationship instance is a line joining entity
instances. The degree of a relationship type is the
number of participating entity types. A
relationship of degree two is called a binary
relationship; a relationship of degree three is called a
ternary relationship.

# Mapping Cardinalities:
Mapping cardinalities, or cardinality ratios, express
the number of entities to which another entity can
be associated via a relationship set. Mapping
cardinalities are most useful in describing binary
relationship sets, although occasionally they
contribute to the description of relationship sets that
involve more than two entity sets.
For a binary relationship set R between entity sets
A and B, the mapping cardinality must be one of the
following (a small SQL sketch follows the list);
 One to one:
An entity in A is associated with at most one entity
in B, and an entity in B is associated with at most
one entity in A.

 One to many:
An entity in A is associated with any number of
entities in B. An entity in B, however, can be
associated with at most one entity in A.
 Many to one:
An entity in A is associated with at most one entity
in B. An entity in B, however, can be associated with
any number of entities in A.

 Many to many:
An entity in A is associated with any number of
entities in B, and an entity in B is associated with
any number of entities in A.

[Figure: One to One, One to Many and Many to Many mappings]
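In a relational design these cardinalities are usually enforced with key constraints. A minimal sketch, assuming hypothetical CUSTOMER and ACCOUNT tables:

```sql
CREATE TABLE CUSTOMER (
    Cust_No   CHAR(6) PRIMARY KEY,
    Cust_Name VARCHAR(30)
);

-- One to many: a customer may own many accounts,
-- but each account belongs to at most one customer.
CREATE TABLE ACCOUNT (
    Account_No CHAR(10) PRIMARY KEY,
    Cust_No    CHAR(6) REFERENCES CUSTOMER(Cust_No)
);

-- One to one: additionally declaring the foreign key UNIQUE
-- limits each customer to at most one account.
-- ALTER TABLE ACCOUNT ADD CONSTRAINT one_account UNIQUE (Cust_No);
```

A many-to-many relationship is normally represented by a separate relationship table whose rows pair the two foreign keys, in the style of the WORK_IN relation used in Unit 4.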

* Key fields:
We must have a way to distinguish the entities within
an entity set. Two entities in an entity
set are not allowed to have exactly the same values for
all their attributes. A key allows us to identify a set of
attributes that suffices to distinguish entities from
each other.
 Super Key is a set of one or more attributes that,
taken collectively, allow us to identify uniquely an
entity in the entity set. Soc Sec No, Symbol No,
Cust No etc. are some key fields used as super keys.

 Candidate Key is a minimal super key in an entity set;

candidate keys are the possible choices for the
primary key.
 Primary Key is a candidate key that is chosen by
the application programmer to identify entities within
the entity set. It is the principal means of uniquely
identifying entities within entity sets.

 Composite Key is a combination of more than one

attributes used to identify the entity in an entity
set. When a primary key is defined by combining
more than one field, it is termed a composite key.

# Weak Entity and Strong Entity sets:

An entity set may not have sufficient attributes to
form a primary key; such an entity set is termed a
weak entity set. An entity set that has a primary key
is termed a strong entity set.
A strong entity type can exist on its own
without participating in any relationship, but this is not
so with weak entity sets. Each instance of a weak entity
set has to participate in a relationship with a strong
entity set in order to exist. This strong entity set is
called the owner of the weak entity set. A weak entity
set always has a total participation constraint.
[Figure: ER diagram of Customers and Loan]

[Figure: ER diagram of Customer and Account]

# The Entity Relationship Diagram Notations:

To construct an ER diagram different types of
notations are used. These are;
 Rectangles, which represent entity sets.

 Ellipses, which represent attributes.

 Diamonds, which represent relationship sets.

 Lines, which represent links between entities and
attributes.

 Double ellipses, which represent multi-valued
attributes.

 Dashed ellipses, which indicate derived attributes.

 Double lines, which indicate total participation
of an entity in a relationship.

 Double rectangles, which represent weak entity
sets.

Cardinality Constraints:
 One-to-One;

 One-to-Many;

 Many-to-One ;

 Many to Many;
Exercises

1)What is Database? Why data processing is to be


computerized?
2)Explain the database architecture used in data
processing.
3)Explain about the two tiers and three tiers
architecture.
4)What are the levels of views in a database? Explain
them
5)What is database schema? Explain about the
physical, logical and sub schema.
6)What does data independence means? Explain
with its types.
7)What are the data models? Explain record based
models with figure.
8)Explain about object based models with figure.
9)What are the languages used in database? Explain
them
10) What is data dictionary? Explain its uses and
importance.
11) Explain about the users of database.
12) Who is DBA? What are the major
responsibilities of DBA?
13) Explain the system structure of a database
with diagram.
14) What is storage manager? What are the
functions of the storage manager?
15) What is ER model? Explain
16) Write short notes on:
(Entity, Attributes, Relationship, Key Fields)
17) Explain different types of attributes.
a) Simple and composite attributes.
b) Single Value and multi valued attributes
c) Derived Attribute
18) Explain about the mapping cardinalities;
19) Explain the strong and weak entity with
examples.
20) Prepare an ER diagram of the following;
a) Student’s database; b) Customers
Database;
c) Banking system; d) Publication’s
database;
e) Employees database; f) Suppliers
database
Unit 3: Filing and File Structure:
The physical or internal level of organization of a
database system is concerned with the efficient
storage of information in the secondary storage
devices. At this level we are no longer concerned with
the application programmer’s views of the database.
The physical to conceptual level mapping must provide
the necessary shield to the user. The basic problem in
physical database representation is to select a suitable
file system to store the desired information. A file
consists of records, and a record may consist of several
fields.
The typical operations that may be performed on the
information stored in the file are as follows.
 Retrieve: Find a record or set of records having a
particular value in a particular field, or where the field
values satisfy certain conditions.
 Insert: Insert a record or set of records at some
specific location.
 Update: Modify the field values of a record or set
of records.
 Delete: Delete a particular record or set of records.

# Overview on Physical Storage Media;


Various types of physical storage devices are used
in a database. For data storage, very large
storage devices are required. These devices vary
in speed of access, cost, reliability etc. Some common
physical storage devices used are:

Cache memory:
 Cache is the fastest and most expensive type of
memory used in a computer.
 It bridges the speed gap between the processor and
main memory.
 While processing, cache acts as a high speed buffer
between the CPU and main memory.
 It holds very active and instantly used data and
instructions temporarily.
 While processing, the CPU will search for the data
first in cache and then in main memory.
Main Memory;
 The storage area where data and instructions are
stored while working.
 The CPU will access the data and instructions stored here.
 Semiconductor memory, volatile in nature (it will
hold its data contents only while electricity is present.)
 Generally small in size and expensive too.
 RAM and ROM are used as main memory in a
computer.

Flash Memory;
 A semiconductor type of memory, which is also called
EEPROM.
 The data stored in flash memory survive
power failure.
 Accessing data in flash memory is a little bit
complicated.
(Reading data takes about 10 nanoseconds and writing
takes 4-10 microseconds.)
 To overwrite data, it erases the old data and then
writes the new data.
 It became very popular due to its small size and
accessibility.

Magnetic storage memory;

 It stores data in the form of magnetic fields on
magnetically coated disk platters.
 It is termed direct access storage, as the data
contents can be read and written in any order.
 It is non-volatile in nature. (Data survive even on
power failure.)
 The storage capacity is very large. (It depends upon the
number of platters used to store the data.)
 Mostly used as a mass storage device for
server side storage.
Optical memory;
 Optical disks are read/written using optical or
laser rays.
 CD-ROM, DVD, Blu-ray Disc etc. are common
optical storage devices.
 It is WORM (Write Once, Read Many) in nature.
Magnetic tapes;
 A sequential access device where data are stored
on a tape coated with magnetic oxide.
 A cheaper and much slower access device, used for
backup provision.
 Used as protection from disk failure.

The hierarchy of the storage media is shown below:
[Figure: storage hierarchy — cache, main memory, flash memory, magnetic disk, optical disk, magnetic tape]
# Data accessing from storage sections:
According to how data are stored in or accessed from the
storage sections, access methods can be divided into
further types.
Sequential Access:
 Data will be stored or accessed one after another,
sequentially.
(If we want to read the data placed at the 5th position,
first we have to pass through positions 1 to 4; then the
5th positioned data can be accessed.)
 Slow in access speed and not very convenient.
 Mostly used for backup provision.
 Magnetic tapes, punched cards, tape drives etc.
Index Sequential Access:
 Data will be stored or accessed sequentially based on an
index table.
 During file creation, a file handling program
routine establishes an index on the disk.
 Using this index, files or records can be located
quickly.
 The access speed is faster than in the sequential access
method.
 Laser disks such as CD, VCD, DVD, Blu-ray Disc etc.
are common examples.

Random Access:
 Data will be stored or accessed directly from the
stored location.
 It is also termed the direct access method.
 It is one of the fastest methods of data access.
 Magnetic disks and core/flash memories are
common examples.
 Different types of File Allocation Tables are used to
allocate the storage locations even in this method
(similar to the index sequential method).
# File organizations:
A file is organized logically as a sequence of
records. These records are mapped onto disk blocks.
Files are provided as a basic construct in operating
systems (OS), so we shall assume the existence of an
underlying file system. We need to consider ways of
representing logical data models in terms of files.
Although blocks are of a fixed size determined by the
physical properties of the disk and the OS, record sizes
vary. In an RDBMS, tuples of distinct relations may be of
different sizes.
One approach to mapping database to files is to use
several files and to store records of only one fixed
length in a given file. An alternative is to structure files
to accommodate variable length records (Fixed length
is easier to implement). Many of the techniques used
for the former can be applied to the variable length
case. Thus we begin by considering a file of fixed length
records.

* Fixed Length Records:

Type deposit = record
    Branch_name: char (22);
    Account_no: char (10);
    Balance: real;
End
If we assume each character occupies 1 byte and a
real occupies 8 bytes, then each record is 40
bytes long. A simple approach is to use the first 40
bytes for the first record, the next 40 for the second one,
and so on. However, there are two problems with this
simple approach.
1)It is difficult to delete a record from this structure.
The space occupied by the record to be deleted
must be filled with some other record of the
file, or we must have a way of marking deleted
records so that they can be ignored.
2)Unless the block size happens to be a multiple of 40,
some records will cross block boundaries. That is,
part of the record will be stored in one block and
part in another. It would thus require two block
accesses to read or write such a record.
When a record is deleted, we could move the record
that comes after it into the space formerly occupied by
the deleted record, and so on until every record
following the deleted record has been moved ahead.
Such an approach requires moving a large number of
records. It might be easier simply to move the final
record of the file into the space occupied by the deleted
record.
It is undesirable to move records to occupy the space
freed by a deleted record, since doing so requires
additional block accesses. Since insertions tend to be
more frequent than deletions, it is acceptable to leave
open the space occupied by the deleted record and to
wait for a subsequent insertion before reusing the
space. A simple marker on a deleted record is not
sufficient, since it is hard to find this available space
when an insertion is being done. Thus we need to
introduce an additional structure.
At the beginning of the file, we allocate a certain
number of bytes as a file header. The header will
contain a variety of information about the file. For
now, all we need to store there is the address of the
first record whose contents are deleted. We use this
first record to store the address of the second available
record, and so on. Intuitively, we can think of these
stored addresses as pointers, since they point to the
location of a record. The deleted records thus form a
linked list, which is often referred to as a free list.
On insertion of a new record, we use the record
pointed to by the header. We change the header
pointer to point to the next available record. If no
space is available, we add the new record to the end of
the file.
Insertion and deletion for files of fixed-length
records are simple to implement, because the space
made available by a deleted record is exactly the space
needed to insert a record. If we allow records of
variable length in a file, this match no longer holds. An
inserted record may not fit in the space left free by a
deleted record, or it may fill only part of that space.

* Variable Length Records:


Variable-length records arise in database systems in
several ways:
* Storage of multiple record types in a file
* Record types that allow variable lengths for one or
more fields
* Record types that allow repeating fields
Different techniques for implementing variable-
length records exist. For purposes of illustration, we
shall use one example to demonstrate the various
implementation techniques. We shall consider a
different representation of the account information
stored in the file, in which we use one variable-length
record for each branch name and for all the account
information for that branch. The format of the record
is
Type account-list = record
    Branch_name: char (22);
    Account_info: array [1..∞] of record
        Account_number: char (10);
        Balance: real;
    End
End
We define account-info as an array with an arbitrary
number of elements. That is, the type definition does
not limit the number of elements in the array, although
any actual record will have a specific number of
elements in its array. There is no limit on how large a
record can be (up to the size of the disk storage).

# Organization of Records in a File:


So far, we have studied how records are represented
in a file structure. An instance of a relation is a set of
records. Given a set of records, the next question is
how to organize them in a file. Several of the possible
ways of organizing records in files are:

 Heap File Organization:


Any record can be placed anywhere in the file
where there is space for the record. There is no
ordering of records. Typically, there is a single file for
each relation.
 Sequential File Organization:
Records are stored in sequential order, according to
the value of a “search key” of each record.

 Hashing File Organization:


A hash function is computed on some attribute of
each record. The result of the hash function specifies
in which block of the file the record should be
placed.

Generally, a separate file is used to store the records


of each relation. However, in a clustering file
organization, records of several different relations are
stored in the same file; further, related records of the
different relations are stored on the same block, so
that one I/O operation fetches related records from all
the relations.

There are basically two methods of organizing


records in a file.

# Sequential File Organization:


A sequential file is designed for efficient processing
of records in sorted order based on some search-key. A
search key is any attribute or set of attributes; it need
not be the primary key, or even a super-key. To permit
fast retrieval of records in search-key order, we chain
together records by pointers. The pointer in each
record points to the next record in search-key order.
Furthermore, to minimize the number of block
accesses in sequential file processing, we store records
physically in search-key order, or as close to search-key
order as possible.
We can manage deletion by using pointer chains, as
we saw previously. For insertion, we apply the
following rules:

1)Locate the record in the file that comes before the
record to be inserted in search-key order.
2)If there is a free record (that is, space left after a
deletion) within the same block as this record, insert
the new record there. Otherwise, insert the new
record in an overflow block. In either case, adjust the
pointers so as to chain together the records in
search-key order.

# Clustering File Organization:


Many relational-database systems store each
relation in a separate file, so that they can take full
advantage of the file system that the operating system
provides. Usually, tuples of a relation can be
represented as fixed-length records. Thus, relations
can be mapped to a simple file structure. This simple
implementation of a relational database system is well
suited to low-cost database implementations as in, for
example, embedded systems or portable devices.
In such systems, the size of the database is small, so
little is gained from a sophisticated file structure.
Furthermore, in such environments, it is essential that
the overall size of the object code for the database
system be small. A simple file structure reduces the
amount of code needed to implement the system. This
simple approach to relational-database
implementation becomes less satisfactory as the size of
the database increases. We have seen that there are
performance advantages to be gained from careful
assignment of records to blocks, and from careful
organization of the blocks themselves. Clearly, a more
complicated file structure may be beneficial, even if we
retain the strategy of storing each relation in a
separate file.
Exercise:
1)What are the various operations performed with the
information stored in the file?
2)Define different types of physical storage media used
to store database.
OR
Draw a hierarchical structure of the storage media
and explain each of them.
3)What are the data accessing methods followed to
access the data elements from the storage sections?
4)What is a File Organization? Explain them
5)How are the records organized in a file? Explain
6)What is sequential file organization? Explain
7)Explain about the cluster file organization.

Unit 4: Relational Model:

# Introduction:
A relational database is a collection of tables, each
of which is assigned a unique
name. The table structure is very similar to the
structure of an ER database. A row in a table
represents a relationship among a set of values. As a
table is a collection of such relationships, there is a
close correspondence between the concept of a table
and the mathematical concept of a relation, from which
the relational data model takes its name. In what
follows, we introduce the concept of a relation.
A relation represents facts describing a set of real
world entities. In a relation, we represent one
entity per row and one attribute per column. The table
name and the column names are used to help in
interpreting the meaning of the values in each row of
the relation.

Database example:

| Roll No       | Std_Name  | D_o_B      | Faculty    | Semester |
|---------------|-----------|------------|------------|----------|
| BBA-00156-009 | Rabindra  | 12/21/1989 | Management | VI Sem   |
| BSc-00176-010 | Laxmi     | 09/11/1990 | Science    | IV Sem   |
| BA-00454-010  | DilBhadur | 01/09/1991 | Humanities | III Sem  |
| :             | :         | :          | :          | :        |

# Different features of a Relation:

 Attributes:
Attributes are the headings assigned to the
data elements collected in the table. Each table will have
different attributes (fields) that are used to collect the
related values in the table. For example, in the above
table Roll No, Std_Name, D_o_B, Faculty and Semester
are the attributes. Two attributes cannot have the same
name in a relation.

 Domain;
A domain defines the set of values that are allowed in
an attribute. Whatever set of data elements,
such as numbers, alphabets, dates in a given format,
logical values etc., we collect in an attribute is its domain.
The domain is the data type that we collect in an
attribute. Different attributes may have the same
domain in an entity.

 Null Value;
A value that is blank or unknown for an attribute is
termed a null value. As far as possible, attributes
should not be left blank or null.

 Tuple:
Each set of elements in a relation where related
data are collected is known as a tuple. It is a
collection of data elements of one object or person.
A tuple is also known as a record. The number of records
is also called the cardinality of the relation. Individual
attributes of tuples may have the same values, but two
duplicate tuples are not allowed.
# Database Schema:
A database schema is the logical design of the
database that we set for extracting the data
elements from the database.
A database instance is a snapshot of the data in the
database at a given instant in time.
A relation schema is a list of attributes in a specific
order (a logical design). It doesn't contain any tuples.
The concept of a relation schema corresponds to
the programming-language notion of a type
definition. The concept of a relation instance
corresponds, in the programming language, to the
value of a variable. The value of a variable may
change with time; similarly the contents of a relation
instance may change with time as the relation is
updated.

STD_schema = (Roll_No, Std_Name, D_o_B,

Faculty, Semester)
We denote by this that the facts about students form a
relation on STD_schema.
Similarly,
EXAM_schema = (Semester, Roll_No, Subject)
TEACHER_schema = (Teacher_Name, Course,
Lecture_hrs)
A relation schema can be turned directly into a table
definition, as sketched below.
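A minimal SQL sketch for TEACHER_schema (the data types are assumptions, not given in the notes):

```sql
CREATE TABLE TEACHER (
    Teacher_Name VARCHAR(30),
    Course       VARCHAR(30),
    Lecture_hrs  INTEGER
);
-- Whatever rows TEACHER holds at a given moment form a relation instance;
-- the schema (the attribute list above) does not change when the rows do.
```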

 Constraints:
A constraint is a rule that restricts the values that
may be present in the database. There are several
constraints that are used to check the validity of
the data in a relational database. They are:
o Entity Integrity Constraints:
o Referential Integrity Constraints:

Entity Integrity Constraints:

The constraints which restrict the values in
any individual tuple in a relation are known as
Entity Integrity Constraints. There are two types of
Entity Integrity Constraints.
* Domain Constraints:
A domain constraint specifies that the value of
each attribute must be an atomic value from the
domain of the respective attribute. It keeps the
data collected in an attribute well defined, and it
ensures that query comparisons make sense.
* Key constraints:
A key constraint defines how and for what
purpose the attributes are used in the database. It
is part of the primary key definition in the database:
whether the field is compulsory, whether duplicate
values are allowed, whether it uniquely identifies a
tuple, whether it can be null, and so on.
Referential Integrity:
Referential integrity is defined through a foreign
key field in a database. A foreign key is a set of
attributes whose values are linked with the
primary key of another relation. It will provide the
requested data elements in the relation that
match the same value as the primary key. A small
SQL sketch of these constraints follows.
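A minimal SQL sketch of these constraint types, using simplified EMPLOYEE and DEPARTMENT tables (the column list is abbreviated from the relations shown later in this unit, and the CHECK range is an assumed example):

```sql
CREATE TABLE DEPARTMENT (
    Dept_No INTEGER PRIMARY KEY,          -- key constraint
    DName   VARCHAR(20) NOT NULL
);

CREATE TABLE EMPLOYEE (
    ENo     INTEGER PRIMARY KEY,          -- entity integrity: unique and not null
    EName   VARCHAR(30) NOT NULL,
    Age     INTEGER CHECK (Age BETWEEN 16 AND 70),   -- domain constraint
    Dept_No INTEGER REFERENCES DEPARTMENT(Dept_No)   -- referential integrity (foreign key)
);
```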

 Key Fields:
We must have a way to distinguish the entities
within an entity set. Two entities in an
entity set are not allowed to have exactly the same
values for all their attributes. A key allows us to
identify a set of attributes that suffices to distinguish
tuples from each other.
o Super Key is a set of one or more attributes that,
taken collectively, allow us to identify uniquely an
entity in the entity set. Soc Sec No, Symbol No,
Cust No etc. are some key fields used as super keys.

o Candidate Key is a minimal super key in an entity set;

candidate keys are the possible choices for the
primary key.

o Primary Key is a candidate key that is chosen by

the application programmer to identify entities
within an entity set. It is the principal means of
uniquely identifying entities within entity sets.

o Composite Key is a combination of more than one

attributes used to identify the entity in an entity
set. When a primary key is defined by combining
more than one field, it is termed a composite key.
o Foreign Key is a set of attributes whose values are
linked with the primary key of another relation in the
database. It will provide the requested data elements
in the relation that match the value of the primary key.

 Query Language:
A query language is a language in which a user
requests information from the database. These
languages are usually at a level higher than that of a
standard programming language. Query languages
can be categorized as either procedural or non-
procedural. In a procedural language, the user
instructs the system to perform a sequence of
operations on the database to compute the desired
result. In a non-procedural language, the user
describes the desired information without giving a
specific procedure to obtain that information.
Most commercial relational database systems
offer a query language that includes elements of
both the procedural and non-procedural approaches.
SQL (Structured Query Language), QBE (Query by
Example), Quel, Datalog etc. are some common query
languages.
The relational algebra is a procedural language,
whereas the tuple relational calculus and domain
relational calculus are non-procedural. These query
languages illustrate the fundamental techniques for
extracting data from the database.

# Relational Algebra:
Relational algebra is a procedural query language. It
consists of a set of operations that take one or two
relations as input and produce a new relation as their
result. The fundamental operations in the relational
algebra are Select, Project, Union, Set difference,
Cartesian product and Rename. In addition to the
fundamental operations, there are several other
operations, namely Set Intersection, Natural Join,
Division and Assignment. These operations can be
defined in terms of the fundamental operations.
Relational algebra operations manipulate relations.
That is, these operations use one or two existing
relations to create a new relation. This new relation
may then be used as input to a new operation. This
powerful concept - the creation of new relations from
old ones - makes possible an infinite variety of data
manipulations.
The select, project & rename operations are called
unary operations, because they operate on one
relation. The other three operations operate on pairs of
relations and are therefore binary operations.

The relation EMPLOYEE:

| ENO | ENAME    | ADDRESS    | SALARY | JOB_STATUS      | AGE | SpvNo | DEPTNO |
|-----|----------|------------|--------|-----------------|-----|-------|--------|
| 101 | Sagar    | Bhaktapur  | 3500   | Research fellow | 28  | 102   | 10     |
| 107 | Rashmi   | Kathmandu  | 4200   | Office Asst     | 24  | 102   | 20     |
| 102 | Namrata  | Bhaktapur  | 5000   | Secretary       | 23  | 112   | 30     |
| 112 | Sachin   | Kathmandu  | 9000   | Administrator   | 25  | 107   | 20     |
| 109 | Supraj   | Pokhara    | 3700   | Office Asst     | 30  | 112   | 10     |
| 115 | K. Singh | Lalitpur   | 8500   | Professor       | 45  | 113   | 20     |
| 111 | Sarad    | Pokhara    | 12000  | Director        | 35  | 115   | 40     |
| 113 | Bidur    | Chitwan    | 4500   | Research fellow | 28  | 115   | 40     |
| 116 | Naren    | Biratnagar | 9000   | Professor       | 49  | 107   | 10     |
| 103 | Pooja    | Pokhara    | 4500   | Office Asst     | 24  | 107   | Null   |

The relation PROJECT;

| PNO  | PNAME   | S_DATE      | LOCATION  | PMGR | DEPTNO |
|------|---------|-------------|-----------|------|--------|
| M110 | MARS    | 03-DEC-2007 | Kathmandu | 102  | 10     |
| J220 | JUPITOR | 03-JAN-2008 | Pokhara   | 107  | 30     |
| V208 | VENUS   | 15-OCT-2007 | Hetauda   | 112  | 40     |
| E005 | EARTH   | 01-FEB-2008 | Butwal    | 252  | 50     |

The relation WORK_IN;

| ENO | PNO  | P_JOB       | HOURS |
|-----|------|-------------|-------|
| 101 | M110 | Coordinator | 3     |
| 107 | V208 | Engineer    | 4     |
| 102 | M110 | Typist      | 5     |
| 112 | E005 | Accountant  | 9     |
| 107 | M110 | Scientist   | 3     |
| 112 | V208 | Coordinator | 8     |
| 111 | M110 | Engineer    | 2     |
| 107 | E005 | Scientist   | 4     |
| 102 | V208 | Scientist   | 9     |
| 107 | J220 | Scientist   | 2     |
The relation DEPARTMENT;

| DEPT_NO | DNAME         | LOCATION  | DMgr | MgrSDate    |
|---------|---------------|-----------|------|-------------|
| 10      | Store         | Kathmandu | 101  | 03-DEC-2006 |
| 20      | Research      | Bhaktapur | 112  | 07-FEB-2007 |
| 30      | Sales         | Lalitpur  | 102  | 05-JAN-2008 |
| 40      | Marketing     | Pokhara   | 111  | 26-DEC-2007 |
| 50      | Administrator | Lalitpur  | 222  | 01-APR-2005 |

* Select Operation: (σ)
The select operation selects tuples that satisfy a
given predicate (condition). The symbol sigma (σ) is
used to denote the select operation. A tuple from the
source relation is selected (or not) based on whether it
satisfies the specified predicate. A predicate is a truth-
valued expression involving tuple component values
and their relationships. The predicate appears as a
subscript to σ. The argument relation is given in
parentheses following the σ. All tuples satisfying the
predicate are then collected into the resultant relation.
The general syntax of selection is;

Syntax: - σ <predicate> (relation name)

Eg: - σ Address = "Pokhara" (EMPLOYEE)

All the tuples that match the address Pokhara will
be selected.

| ENO | ENAME  | ADDRESS | SALARY | JOB_STATUS  | AGE | SpvNo | DEPTNO |
|-----|--------|---------|--------|-------------|-----|-------|--------|
| 109 | Supraj | Pokhara | 3700   | Office Asst | 30  | 112   | 10     |
| 111 | Sarad  | Pokhara | 12000  | Director    | 35  | 115   | 40     |
| 103 | Pooja  | Pokhara | 4500   | Office Asst | 24  | 107   | Null   |
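For comparison, the same selection written in SQL (a sketch, assuming the EMPLOYEE relation above exists as a table with these column names):

```sql
SELECT *
FROM   EMPLOYEE
WHERE  Address = 'Pokhara';
```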

* Project Operation (π):

Project is the operation of selecting certain
attributes from a relation to form a new relation that
satisfies the given predicate. The symbol pi (π) is used to
denote the project operation. The desired columns are
simply specified by name during the projection. With
the help of this operation, any number of columns can
be omitted from the table, or the columns of the table
can be rearranged as required. The general
syntax of projection is;

Syntax: - π <list of attributes> (relation name)

Eg: - π EName, Address, Salary, Job_Status
(EMPLOYEE)

All the tuples of the attributes EName, Address,
Salary and Job_Status will be selected to form a new
relation.
ENAME    | ADDRESS    | SALARY | JOB_STATUS
Sagar    | Bhaktapur  | 3500   | Research fellow
Rashmi   | Kathmandu  | 4200   | Office Asst
Namrata  | Bhaktapur  | 5000   | Secretary
Sachin   | Kathmandu  | 9000   | Administrator
Supraj   | Pokhara    | 3700   | Office Asst
K. Singh | Lalitpur   | 8500   | Professor
Sarad    | Pokhara    | 12000  | Director
Bidur    | Chitwan    | 4500   | Research fellow
Naren    | Biratnagar | 9000   | Professor
Pooja    | Pokhara    | 4500   | Office Asst
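As an illustrative sketch, the same projection could be expressed in SQL over the sample EMPLOYEE relation (note that SQL keeps duplicate rows unless DISTINCT is used, whereas the algebraic projection removes duplicates):
SELECT ENAME, ADDRESS, SALARY, JOB_STATUS FROM EMPLOYEE;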

* Set Union Operation (U):

Union is the operation of combining the tuples of
two relations into a single relation. The symbol U is
used to denote the union operation. Two relations are
said to be union compatible if they have the same
degree n and the corresponding attributes are drawn
from the same domain. Union compatibility is required
so that the result of the union operation is itself a
valid relation.

* Intersection Operation (∩):

Intersection is the operation of selecting the tuples
that are common to two relations. The symbol ∩ is
used to denote the intersection operation. Like union,
the relational intersection operator requires the two
relations to be union compatible.

* Set Difference Operation (-):

Difference is the operation of selecting the tuples of
the first relation that do not appear in the second
relation. The symbol (-) is used to denote the
difference operation; it extracts only those tuples of
the first relation that are not common to both
relations. The relational difference operator also
requires the two relations to be union compatible.

Let us consider the union of the relations R and N
shown below. This operation includes all the tuples of
the relation R as well as those of N.

Relation R | Relation N | Union R U N | Intersection R ∩ N | Difference R - N
X  Y  Z    | X  Y  Z    | X  Y  Z     | X  Y  Z            | X  Y  Z
a  i  u    | a  j  v    | a  i  u     | a  j  v            | a  i  u
a  j  v    | b  k  w    | a  j  v     | b  k  w            |
b  k  w    | b  q  s    | b  k  w     | d  e  f            |
d  e  f    | d  e  f    | b  q  s     |                    |
           |            | d  e  f     |                    |
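As a sketch, the corresponding SQL set operators over two union-compatible relations R and N would be (MINUS is the Oracle keyword also used in Unit 6; standard SQL uses EXCEPT):
SELECT * FROM R UNION SELECT * FROM N;
SELECT * FROM R INTERSECT SELECT * FROM N;
SELECT * FROM R MINUS SELECT * FROM N;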

* Cartesian Product (X):

Cartesian product is the operation of combining the
tuples of two relations to generate all possible
combinations of their tuples. The symbol X is used to
denote the Cartesian product operation. It extracts all
possible record combinations, similar to the
multiplication of the relations. Unlike union,
intersection and difference, the Cartesian product does
not require the two relations to be union compatible.
The Cartesian product of relations Employee (E) and
Project (P), denoted E X P, is the set of all possible
combinations of tuples of the two operand relations.
Each resultant tuple consists of all the attributes of E
and P.
The Cartesian product of EMPLOYEE and PROJECT:
ENO | ENAME    | ADDRESS    | … | DEPTNO | PNO  | … | LOCATION
101 | Sagar    | Bhaktapur  | … | 10     | M110 | … | Kathmandu
101 | Sagar    | Bhaktapur  | … | 10     | J220 | … | Pokhara
101 | Sagar    | Bhaktapur  | … | 10     | V208 | … | Hetauda
101 | Sagar    | Bhaktapur  | … | 10     | E005 | … | Butwal
107 | Rashmi   | Kathmandu  | … | 20     | M110 | … | Kathmandu
107 | Rashmi   | Kathmandu  | … | 20     | J220 | … | Pokhara
107 | Rashmi   | Kathmandu  | … | 20     | V208 | … | Hetauda
107 | Rashmi   | Kathmandu  | … | 20     | E005 | … | Butwal
102 | Namrata  | Bhaktapur  | … | 30     | M110 | … | Kathmandu
102 | Namrata  | Bhaktapur  | … | 30     | J220 | … | Pokhara
102 | Namrata  | Bhaktapur  | … | 30     | V208 | … | Hetauda
102 | Namrata  | Bhaktapur  | … | 30     | E005 | … | Butwal
112 | Sachin   | Kathmandu  | … | 20     | M110 | … | Kathmandu
112 | Sachin   | Kathmandu  | … | 20     | J220 | … | Pokhara
112 | Sachin   | Kathmandu  | … | 20     | V208 | … | Hetauda
112 | Sachin   | Kathmandu  | … | 20     | E005 | … | Butwal
109 | Supraj   | Pokhara    | … | 10     | M110 | … | Kathmandu
109 | Supraj   | Pokhara    | … | 10     | J220 | … | Pokhara
109 | Supraj   | Pokhara    | … | 10     | V208 | … | Hetauda
109 | Supraj   | Pokhara    | … | 10     | E005 | … | Butwal
115 | K. Singh | Lalitpur   | … | 20     | M110 | … | Kathmandu
115 | K. Singh | Lalitpur   | … | 20     | J220 | … | Pokhara
115 | K. Singh | Lalitpur   | … | 20     | V208 | … | Hetauda
115 | K. Singh | Lalitpur   | … | 20     | E005 | … | Butwal
111 | Sarad    | Pokhara    | … | 40     | M110 | … | Kathmandu
111 | Sarad    | Pokhara    | … | 40     | J220 | … | Pokhara
111 | Sarad    | Pokhara    | … | 40     | V208 | … | Hetauda
111 | Sarad    | Pokhara    | … | 40     | E005 | … | Butwal
113 | Bidur    | Chitwan    | … | 40     | M110 | … | Kathmandu
113 | Bidur    | Chitwan    | … | 40     | J220 | … | Pokhara
113 | Bidur    | Chitwan    | … | 40     | V208 | … | Hetauda
113 | Bidur    | Chitwan    | … | 40     | E005 | … | Butwal
116 | Naren    | Biratnagar | … | 10     | M110 | … | Kathmandu
116 | Naren    | Biratnagar | … | 10     | J220 | … | Pokhara
…   | …        | …          | … | …      | …    | … | …
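As a sketch, the same Cartesian product could be obtained in SQL simply by listing both relations in the FROM clause:
SELECT * FROM EMPLOYEE, PROJECT;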

* Join Operation (⋈)
Join is a binary operation that allows the user to
combine two relations in a specific way. The join of two
relations R and N is a restriction of their Cartesian
product such that a specified condition is met. A join is
normally defined on attributes drawn from the same
domain. An attribute a of relation R (R.a) and an
attribute b of relation N (N.b) are joined under a
specified condition (R.a Ɵ N.b). The specified condition
is called the join predicate.

There are different types of join operations;


- Natural Join:
- Outer Join:
- Theta Join:

The natural join allows us to combine certain
selections and a Cartesian product into one operation.
It is denoted by the symbol ⋈ and performs the
matching based on equality of the common attributes:
the natural join forms the Cartesian product of the two
relations, keeps only the tuples that agree on the
common attributes, and removes the duplicate
attributes from the result.
The theta (Ɵ) join joins two relations on the basis of
some comparison operator, not necessarily equality.
Each condition is of the form a Ɵ b, where a is an
attribute of R and b is an attribute of N drawn from the
same domain, and Ɵ is one of =, >, <, >=, <= and <>.
The outer join is an extension of the natural join (⋈)
that keeps all rows from one or both relations while
joining: tuples of an operand relation that have no
matching tuple in the other relation are not discarded.
This is very useful when we do not want to miss any
tuples from the relations while extracting. Three forms
of outer join are commonly used: the left outer join,
the right outer join and the full outer join.
In the left outer join (⟕), the resulting relation
contains all the tuples of the left operand; in the right
outer join (⟖), the resulting relation contains all the
tuples of the right operand. Similarly, in the full outer
join (⟗), the resulting relation contains all the tuples
from both operands.
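For example, the natural join EMPLOYEE ⋈ DEPARTMENT matches each employee with his or her department on the common department-number attribute. A roughly equivalent SQL sketch, written as an equijoin because the two sample relations spell the attribute differently (DEPTNO and DEPT_NO), is:
SELECT * FROM EMPLOYEE E, DEPARTMENT D WHERE E.DEPTNO = D.DEPT_NO;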

* Rename Operation (ρ):

While working with multiple relations, two
relations may have attributes with the same name. It is
difficult to use such relations together, so we can
rename those attributes (or the relations) so that the
two relations have disjoint sets of attribute names. The
rename operator is denoted by the Greek letter rho (ρ).
The expression
ρ X (E)
returns the result of expression E under the new name
X. The expression
ρ R(a1, a2, …, an) (T)
renames relation T as R, with its attributes renamed to
a1, a2, …, an in the new schema.

* Assignment Operation (←):

The assignment operator is used to assign the result
of a relational algebra expression to a temporary
relation variable. It is denoted by a left arrow (←),
which works like assignment in a programming
language.
Temp1 ← (relational algebra expression)
The evaluation of an assignment does not result in
any relation being displayed to the user. The
expression at the right of the arrow is assigned to the
relation variable at the left of the arrow, and the result
variable may then be used in subsequent expressions.

* Division Operation (÷):

A division operation consists of a dividend and a
divisor, in which the dividend contains the divisor
several times within the expression. In relational
algebra the quotient is the answer of the division, and
we do not bother about the remainder. Let us see the
expression R ÷ N:
R              | N              | R ÷ N
R.a  R.b  R.c  | N.a  N.b  N.c  | _.a  _.b  _.c
1    a    x    | 1    a    x    | 1    a    x
1    b    y    | 2    a    x    | 1    b    y
2    a    x    | 3    a    x    | 2    a    x
2    b    y    | 1    b    x    | 2    b    y
3    c    x    | 2    b    x    |
3    d    y    | 3    b    x    |
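As a further illustration using the sample relations given earlier, division answers queries of the form "related to all". For instance, the expression
π ENO, PNO (WORK_IN) ÷ π PNO (PROJECT)
would return the ENO of every employee who works in all of the projects listed in the PROJECT relation; with the sample data above it yields the single value 107, since employee 107 appears in WORK_IN with all four project numbers.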

# Extended Relational Algebra Operations:
The basic relational algebra operations have been
extended in several ways. A simple extension is to
allow arithmetic operations as part of projection. An
important extension is to allow aggregate operations,
such as computing the sum of the elements of a set or
their average. Another important extension is the
outer join operation, which allows relational algebra
expressions to deal with null values, which model
missing information.
* Generalized Projection:
The generalized projection operation extends the
projection operation by allowing arithmetic functions
to be used in the projection list. The generalized
projection operation has the form
Π F1, F2, …, Fn (E)
where E is any relational algebra expression, and each
of F1, F2, …, Fn is an arithmetic expression involving
constants and attributes in the schema of E. As a
special case, the arithmetic expression may simply be
an attribute or a constant.
Π ename, salary + 500 (Employee)
The attribute resulting from the expression salary
+ 500 does not have a name. We can apply the rename
operation to the result of generalized projection in
order to give it a name. As a notational convenience,
renaming of attributes can be combined with
generalized projection.
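As a sketch, the same generalized projection corresponds to an SQL query with an expression in the select list; the AS clause plays the role of the rename operation just mentioned (new_salary is only an illustrative name):
SELECT ename, salary + 500 AS new_salary FROM Employee;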

* Aggregate Function:
The aggregate functions take a collection of values
and return a single value as a result. For example, the
aggregate functions sum takes a collection of values
and returns the sum of the values. Some of the
commonly used aggregate functions are SUM, AVG,
COUNT, MAX and MIN.
The general form of the aggregate is as follows;
SUM (Salary (PAYROLL))
AVG (Salary (PAYROLL))
COUNT (NAME (EMPLOYEE))
MIN (Salary (PAYROLL))
MAX (Salary (PAYROLL))
The general form of the aggregation operation (Ʋ) is
as follows:
G1, G2, …, Gn Ʋ F1(A1), F2(A2), …, Fm(Am) (E)
where E is any relational algebra expression; G1,
G2, …, Gn constitute a list of attributes on which to
group; each Fi is an aggregate function; and each Ai is
an attribute name.
The tuples in the result of expression E are partitioned
into groups in such a way that:
1) All tuples in a group have the same values for G1,
G2, …, Gn.
2) Tuples in different groups have different values for
G1, G2, …, Gn.
As in generalized projection, the result of an
aggregation operation does not have a name. We can
apply the rename operation to the result in order to
give it a name.
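For example, grouping the sample EMPLOYEE relation by department and totalling the salaries could be written as
deptno Ʋ SUM(salary) (EMPLOYEE)
which, as a rough sketch, corresponds to the SQL query
SELECT deptno, SUM(salary) FROM employee GROUP BY deptno;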

* Outer Join:
The outer join operation is an extension of the
join operation to deal with missing information. In
both the theta and natural join, the tuples that have
no matching values in the other relation are not
displayed, which may result in loss of information. To
retain all the information of both relations, it is
desirable to have a join which keeps the tuples having
no corresponding tuples in the other relation,
associated with null values. This is the external join, or
outer join. The outer join is divided into the following:

- Left Outer Join:


- Right Outer Join:
- Full Outer Join:

In the left outer join (⟕), the resulting relation
contains all the tuples of the left operand; in the right
outer join (⟖), the resulting relation contains all the
tuples of the right operand. Similarly, in the full outer
join (⟗), the resulting relation contains all the tuples
from both operands.

* Null Value:
Operations on and comparisons with null values
have to be handled carefully. A null value means that
the value is unknown or does not exist. Any arithmetic
operation involving a null value returns null as its
result, and any comparison with a null value results in
a special truth value called unknown.

How the different relational algebra operations deal
with null values:
Select: In selection, any tuple for which the predicate
evaluates to false or unknown (because of nulls) is
not included in the result.
Join: A join can be seen as a cross product of the two
relations followed by a selection on the join
condition. If either tuple has a null value in a
common (join) attribute, the comparison evaluates
to unknown and the tuples do not match.
Projection: The projection operation treats null
values just like other values when eliminating
duplicates. If two tuples in the projection result are
identical apart from nulls in the same attributes,
they are treated as duplicates. This decision is a bit
arbitrary, since without knowing the actual values
we cannot say whether they are really duplicates.
Union, Intersection, Difference, Generalized
Projection, Aggregation: The treatment of
duplicates is the same as in projection.
Outer join: Behaves just like the join operation,
except that tuples that do not occur in the join
result are added to the result, padded with nulls.

# Modification of the database:
We express database modifications by using the
assignment operation. We make assignments to the
actual database relations, using the same notation as
assignment.
* Deletion:
We can delete only whole tuples; we cannot
delete values of particular attributes only. In relational
algebra a deletion is expressed by
R ← R – E
where R is a relation and E is a relational algebra
query. The tuples resulting from the algebraic query E
are removed from the relation R.
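For example, using the sample EMPLOYEE relation, removing every employee whose address is Chitwan could be written as
EMPLOYEE ← EMPLOYEE – σ Address = "Chitwan" (EMPLOYEE)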
* Insertion:
To insert data into a relation, we either specify a
tuple to be inserted or write a query whose result (a
set of tuples) will be inserted. The inserted tuples must
be drawn from the same domains as the relation. In
relational algebra an insertion operation is expressed by
R ← R U E
where R is a relation and E is a relational algebra
query. The tuples resulting from the relational
algebraic query E are inserted into the relation R; the
attributes of the relation and of the algebraic query
must belong to the same domains.
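For example, a single new tuple could be inserted into the sample EMPLOYEE relation as
EMPLOYEE ← EMPLOYEE U {(104, "Hari", "Kathmandu", 4000, "Office Asst", 26, 107, 20)}
where the tuple values are purely illustrative.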

* Updating:
When we have to change the value of a particular
attribute in a relation, we have to update the tuple or
attribute. We can use the generalized projection
operator to do this task, and various predicates
(conditions) can be applied while updating the values
of the tuples.
Temp1 ← Π ename, eadd, salary + salary * 0.10, Job_Status
(σ dept = Account (employee))
In the above example the salary is increased by 10%
for all the employees whose dept is Account.
Temp2 ← Π ename, eadd, salary + salary * 0.10, Job_Status
(σ dept = Account (employee)) U
Π ename, eadd, salary + salary * 0.05, Job_Status
(σ Age >= 45 (employee))
In the above example the salary is increased by 10%
if the dept is Account, and increased by 5% for those
whose age is 45 or above.
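As a sketch, the first of these updates corresponds to the SQL UPDATE statement covered in Unit 6:
UPDATE employee SET salary = salary + salary * 0.10 WHERE dept = 'Account';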

# Views:
In all of our examples so far we have operated on the
logical model. For security reasons, or to provide a
personalized collection of data, it may not be desirable
to show the entire logical model to a user; certain data
may need to be hidden from certain users.
Any relation that is not part of the logical model, but
is made visible to a user as a virtual relation, is called a
view. It is possible to support a large number of views
on top of any given set of actual relations.
* View Definition:
We define a view using the create view statement.

CREATE VIEW emp1 as
Π eid, ename, eadd, salary, Job_Status (employee)

A view "emp1" is defined containing data drawn from
employee.

CREATE VIEW emp2 as
Π eid, ename, eadd, salary, Job_Status (employee ⋈ project ⋈ payroll)

A view "emp2" is defined containing data drawn from
employee, project and payroll.
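As a sketch in the SQL syntax of Unit 6, the first of these views might be written as follows; the column names simply follow the projection list above:
CREATE VIEW emp1 AS
SELECT eid, ename, eadd, salary, job_status FROM employee;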

Exercise
1)What is a relation? Explain the components
associated with a relation.
2)What is database schema? Differentiate between
relation schema and relation instance.
3)What is constraint? Differentiate between Entity
Integrity Constraints and Referential Integrity
Constraints.
4)What is a key field? Explain the various types of keys
used in a relationship.
5)What is relational algebra? List out the operations
used in a relational algebra.
6)Give an expression in the relational algebra for the
following queries.
a. Select all names from the relation employee.
b. Select Name, Address, Phone, Dept from the
relation Employee.
c. Select Name, Address, Phone, Dept who works in
account department.
d. Select Name, Address, Phone, Dept who works in
account department and earns more than 25000.
Unit 5: Integrity Constraints

# Introduction:
As security is a major feature of the database, we
must make sure that changes made by users are
authorized and do not introduce errors, even
accidentally. To ensure this, security and integrity
constraints are very useful.
Integrity constraints provide a means of ensuring
that changes made to the database by authorized users
do not result in a loss of data consistency. Thus
integrity constraints guard against accidental damage
to the database.
In general, an integrity constraint can be an arbitrary
predicate pertaining to the database. However,
arbitrary predicates may be costly to test. Thus, we
usually limit ourselves to integrity constraints that can
be tested with minimal overhead.

# Types of Integrity Constraints:

In a database several types of constraints can be
defined to check for errors. The most common of them
are:

* Domain Constraints:
A domain constraint specifies that the value of each
attribute must be an atomic value from the domain of
the respective attribute. It keeps the values of an
attribute within a well-defined set and ensures that
queries and comparisons on the attribute make sense.
Data is said to have domain integrity when the value of
a column is drawn from its domain.
Domain constraints are the most elementary form
of integrity constraint. They are tested easily by the
system whenever a new data item is entered into
the database. It is possible to have several attributes
with the same domain. For example, the attributes Emp
Name, Std Name and Cust Name might have the same
domain.

CREATE DOMAIN Salary numeric (9, 2)
CONSTRAINT salary-null-test CHECK (value NOT NULL)

In the above example a domain Salary is declared to
be a number with 9 digits in total, of which 2 digits are
after the decimal point. The check clause restricts the
domain to non-null values, so the attribute is
compulsory and cannot be null.

CREATE DOMAIN Department char (15)
CONSTRAINT department-value-test
CHECK (value IN ("Account", "Management",
"Administration", "Technical", "Contractual"))

Here, in this example, a domain Department is
declared to be at most 15 characters long. The value in
clause restricts the values the attribute can take.

* Entity Integrity Constraints:

Entity integrity constraints restrict the key attributes
so that every tuple can be identified uniquely: two
tuples cannot have the same key values, and the key
values cannot be null. This is useful even for sorting and
searching the values of the attributes quickly. A primary
key is used for entity integrity; it makes an attribute (or
a set of attributes) not null as well as unique, so that it
identifies the tuples. Thus the rule for the entity
integrity constraint asserts that no attribute
participating in the primary key of a relation is
permitted to accept null values, and the key must be
unique.
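As a minimal sketch, entity integrity is normally declared through a PRIMARY KEY clause; the table and column names here are only illustrative:
CREATE TABLE department
( dept_no number (2) PRIMARY KEY,
dname varchar2 (20) );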

* Referential Integrity:
In the relational model, data elements are stored
in different relations and are linked together while
extracting, as required. For example, in a PERSONAL
relation all the details of a student are stored, and in a
PARENTS relation the details of the parents are stored.
If we have to find the details of the parents, we link
these two relations through a unique and common
attribute (PAR_ID). The attribute PAR_ID is set as a
primary key in the PARENTS relation, whereas it is
termed a foreign key in the PERSONAL relation. We can
then look up the details of the parents through the
PAR_ID value stored in PERSONAL.
Here the foreign key is supposed to hold a value of
the referenced primary key or a null value; values other
than these two are not allowed. If a tuple does not
have any matching value in the parent primary key
field, it is termed an orphan record.

* Referential Integrity Constraints in SQL;


Foreign keys can be specified as part of the SQL
CREATE TABLE statement by using the FOREIGN KEY
clause. A foreign key or a Referential Integrity
Constraint can be specified in a relation as follows;
Create table account
(acc_no char (10),
branch_name char (15),
balance integer,
Primary key (acc_no),
Foreign key (branch_name) references branch,
Check (balance >= 0))
A relation named ACCOUNT is created with acc_no,
branch_name and balance attributes. acc_no is set as
the primary key, and branch_name is defined as a
foreign key referencing the BRANCH relation.

Check Constraints:
A check constraint is used to verify that the value
entered satisfies a condition. The value entered in an
attribute must lie within the specified range or
satisfy the specified condition. The syntax of a check
constraint is
[Constraint <name>] CHECK (<condition>)
For example:
CREATE TABLE employee
(………,
E_ID Integer CONSTRAINT chk_empid
CHECK (E_ID IS NOT NULL and E_ID < 10000),
Dept Char (3) CONSTRAINT chk_dept
CHECK (Dept in ("ACC", "ADM", "MAN",
"DIR", "TEC", "HLP")),
… … … );
Here, in the above example, the attribute E_ID must
have a value less than 10000 and the field cannot
contain a null value. The other attribute can have only
the values ACC (Accountant), ADM (Administration),
MAN (Manager), DIR (Director), TEC (Technical) and
HLP (Helper); besides these, other values are not
allowed.

Assertions:
An assertion is a predicate expressing a condition
that we wish the database always to satisfy. Domain
constraints and referential integrity constraints are
special forms of assertions. However, there are many
constraints that we cannot express by using only
these special forms. Some examples of such
constraints are:
* Every loan has at least one customer who
maintains an account with a minimum balance of
1000.
* The sum of all loan amounts for each branch must
be less than the sum of all account balances at the
branch.
An assertion in SQL takes the form
CREATE ASSERTION <assertion name> CHECK
<predicate>

Create assertion sum-constraint check
(not exists (select * from branch
where (select sum (amount) from loan
where loan.branch-name = branch.branch-name)
>= (select sum (balance) from account
where account.branch-name = branch.branch-name)))

Create assertion balance-constraint check
(not exists (select * from loan
where not exists (select * from borrower, depositor, account
where loan.loan-number = borrower.loan-number
and borrower.customer-name = depositor.customer-name
and depositor.account-number = account.account-number
and account.balance >= 1000)))

When an assertion is created, the system tests it for
validity. If the assertion is valid, any future
modification to the database is allowed only if it
does not cause that assertion to be violated.

Triggers:
A trigger is a statement that the system executes
automatically as a side effect of a modification to the
database. To design a trigger mechanism, we must
meet two requirements.
* Specify when a trigger is to be executed. This is
broken up into an event that causes the trigger to
be checked and a condition that must be satisfied
for the trigger to execute.
* Specify the action to be taken when the trigger
executes.
Once we enter a trigger into the database, the
database system takes on the responsibility of
executing it whenever the specified event occurs and
the corresponding condition is satisfied.
Triggers are a useful mechanism for alerting humans
or for starting certain tasks automatically when
certain conditions are met. As an example of the use of
triggers, suppose a warehouse wishes to maintain a
minimum inventory of each item; when the
inventory level of an item falls below the minimum
level, an order should be placed automatically. This
business rule can be implemented by a trigger: on an
update of the inventory level of an item, the trigger
compares the level with the minimum inventory
level for the item, and if the level is at or below the
minimum, a new order is added to an orders relation.
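As a rough sketch only, such a reorder rule might be written with SQL:1999-style trigger syntax as shown below; the table and column names (inventory, minlevel, reorder, orders) are assumed for illustration, and the exact syntax varies between database systems:
CREATE TRIGGER reorder_trigger
AFTER UPDATE OF level ON inventory
REFERENCING NEW ROW AS nrow
FOR EACH ROW
WHEN (nrow.level <= (SELECT level FROM minlevel
WHERE minlevel.item = nrow.item))
BEGIN ATOMIC
INSERT INTO orders
(SELECT item, amount FROM reorder
WHERE reorder.item = nrow.item);
END;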
# Security and integrity violation:
Database integrity is closely related to the security
of the database. The data stored in a database are
supposed to be protected from unauthorized access
and from destruction or alteration. Data alteration may
be intentional or accidental.
Intentional data loss may be caused by the following:
 Unauthorized reading of data (theft of
information).
 Unauthorized modification of data.
 Unauthorized destruction of data.
Accidental data loss may be caused by:
 Crashes during transaction processing.
 Data loss during the distribution of data over
several computers.
 Data loss while updating the database.
 Logical errors that violate the assumption that
transactions preserve the database consistency
constraints.
It is easier to protect against accidental loss of data
than to protect against intentional data loss in the
database. Protection against intentional data loss may
be costly, and high alertness against unauthorized
access is also required. Database security usually refers
to protection against intentional data loss, whereas
integrity refers to the avoidance of accidental loss of
data in the database.
The dividing line between security and integrity is
not always clear; the term security is often used to
refer to both. To protect the database, we must take
security measures at several levels:

 Database system:
Some database-system users may be
authorized to access only a limited portion of the
database. Other users may be allowed to issue
queries, but may be forbidden to modify the data.
It is the responsibility of the database system to
ensure that these authorization restrictions are
not violated.

 Operating System:
No matter how secure the database system is,
weakness in operating-system security may serve
as a means of unauthorized access to the
database. Weak security restrictions in the
operating system can therefore cause many
problems for the database system.
 Network:
Since almost all database systems allow
remote access through terminals or networks,
software-level security within the network
software is as important as physical security, both
on the Internet and in private networks. Granting
authority carefully over the network helps prevent
various security-related problems.

 Physical:
Sites with computer systems must be
physically secured against armed or surreptitious
entry by intruders.

 Human:
Users must be authorized carefully to reduce
the chance of any user giving access to an intruder
in exchange for a bribe or other favors.
Security at all these levels must be maintained if
database security is to be ensured. A weakness at a low
level of security (physical or human) allows
circumvention of strict high-level (database) security
measures.
Security within the operating system is implemented
at several levels, ranging from passwords for access to
the system to the isolation of concurrent processes
running within the system. The file system also
provides some degree of protection. The
bibliographical notes reference coverage of these
topics in operating-system texts. Finally, network-level
security has gained widespread recognition as the
Internet has evolved from an academic research
platform to the basis of international electronic
commerce. The bibliographic notes list textbook
coverage of the basic principles of network security.
We shall present our discussion of security in terms of
the relational-data model, although the concepts of
the security are equally applicable to all data models.

# Access control and Authorization:


The authorization over the database may be granted
based on various requirements. We may assign a user
several forms of authorization on parts of the
database. These various authorizations are;
 Read authorization allows reading, but not
modification, of data.
 Insert authorization allows insertion of new data,
but not modification of existing data.
 Update authorization allows modification, but not
deletion, of data.
 Delete authorization allows deletion of data.
We may assign the user all, none, or a combination
of these types of authorization. In addition to these
forms of authorization for access to data, we may grant
a user authorization to modify the database schema:
 Index authorization allows the creation and deletion
of indices.
 Resource authorization allows the creation of new
relations.
 Alteration authorization allows the addition or
deletion of attributes in a relation.
 Drop authorization allows the deletion of relations.
The drop and delete authorization differ in that
delete authorization allows deletion of tuples only. If a
user deletes all tuples of a relation, the relation still
exists, but it is empty. If a relation is dropped, it no
longer exists.
We regulate the ability to create new relations
through resource authorization. A user with resource
authorization who creates a new relation is given all
privileges on that relation automatically.
Index authorization may appear unnecessary, since
the creation or deletion of an index does not alter data
in relations. Rather, indices are a structure for
performance enhancements. However, indices also
consume space, and all database modifications are
required to update indices. If index authorization were
granted to all users, those who performed updates
would be tempted to delete indices, whereas those
who issued queries would be tempted to create
numerous indices. To allow the database administrator
to regulate the use of system resources, it is necessary
to treat index creation as a privilege.
The ultimate form of authority is that given to the
database administrator. The database administrator
may authorize new users, restructure the database,
and so on. This form of authorization is analogous to
that of a super user or operator for an operating
system.

# Security and Views:


View is a means of providing a user with a
personalized model of the database. A view can hide
data that a user does not need to see. The ability of
views to hide data serves both to simplify usage of the
system and to enhance security. Views simplify system
usage because they restrict the user’s attention to the
data of interest. Although a user may be denied direct
access to a relation, that user may be allowed to access
part of that relation through a view.
A combination of relational-level security and view-
level security limits a user’s access to precisely the data
that the user needs. Creation of a view does not
require resource authorization. A user who creates a
view does not necessarily receive all privileges on that
view. One receives only those privileges that provide
no additional authorization beyond those that she
already had. For example, a user cannot be given
update authorization on a view without having update
authorization on the relations used to define the view.
If a user creates a view on which no authorization can
be granted, the system will deny the view creation
request.
# Encryption and Decryption:
The various provisions that a database system may
make for authorization may still not provide sufficient
protection for highly sensitive data. In such cases, data
may be stored in encrypted form. It is not possible for
encrypted data to be read unless the reader knows
how to decipher (decrypt) them. Encryption also forms
the basis of good schemes for authenticating users to a
database.

# Encryption Techniques:
There are a vast number of techniques for the
encryption of data. Simple encryption techniques may
not provide adequate security, since it may be easy for
an unauthorized user to break the code. As an example
of a weak encryption technique, consider the
substitution of each character with the next character
in the alphabet. Thus,
Perryridge
becomes
Qfsszsjehf
If an unauthorized user sees only “Qfsszsjehf,” she
probably has insufficient information to break the
code. However, if the intruder sees a large number of
encrypted branch names, she could use statistical data
regarding the relative frequency of characters to guess
what substitution is being made (for example, E is the
most common letter in English text, followed by T, A,
O, N, I and so on).
A good encryption technique has the following
properties:
 It is relatively simple for authorized users to encrypt
and decrypt data.
 It depends not on the secrecy of the algorithm, but
rather on a parameter of the algorithm called the
encryption key.
 Its encryption key is extremely difficult for an
intruder to determine.

# Authentication:
Authentication refers to the task of verifying the
identity of a person/software connecting to a
database. The simplest form of authentication consists
of a secret password which must be presented when a
connection is opened to a database.
Password-based authentication is used widely by
operating systems as well as databases. However, the
use of passwords has some drawbacks, especially over
a network. If an eavesdropper is able to “sniff” the data
being sent over the network, she may be able to find
the password as it is being sent across the network.
Once the eavesdropper has a user name and password,
she can connect to the database, pretending to be the
legitimate user.

Unit 6: Introduction to SQL:


Most database systems require a query
language to interact with the data stored in the
database. SQL is the most common query language; it
is very powerful and flexible for manipulating the
data stored in the database. SQL uses a combination of
relational algebra and relational calculus constructs.
Although SQL is referred to as a "query language," it
can do much more than just query a database. It is also
used to define the structure of the data, modify data in
the database, and specify security constraints.
Individual implementations of SQL may differ in
details, or may support only a subset of the full
language.

# Development of the SQL as a query Language:


* IBM developed the original version of SQL at its San
Jose Research Laboratory.
* IBM implemented the language, originally called
Sequel, as part of the System R project in the early
1970s.
* Its name was later changed to SQL (Structured Query
Language), and it has clearly established itself as
the standard relational-database language.
* In 1986, the American National Standards Institute
(ANSI) and the International Organization for
Standardization (ISO) published an SQL standard,
called SQL-86.
* IBM published its own corporate SQL standard, the
Systems Application Architecture Database Interface
(SAA-SQL) in 1987.
* ANSI published an extended standard for SQL, SQL-
89, in 1989. The next version of the standard was
SQL-92, and the most recent version described here
is SQL:1999.
* The SQL: 1999 standard is a superset of the SQL-92
standard.
* Many database systems support some of the new
constructs in SQL: 1999, although currently no
database system supports all the new constructs.

# The SQL language has several parts:


Data-definition language (DDL). The SQL DDL
provides commands for defining relation schemas,
deleting relations, and modifying relation schemas.
Data Manipulation Language (DML). The SQL DML
includes a query language based on both the
relational algebra and the tuple relational calculus.
It includes also commands to insert tuples into,
delete tuples from, and modify tuples in the
database.
View definition. The SQL DDL includes commands
for defining views.
Transaction Control. SQL includes commands for
specifying the beginning and ending of
transactions.
Embedded SQL and dynamic SQL. Embedded and
dynamic SQL define how SQL statements can be
embedded within general-purpose programming
languages, such as C, C++, Java, PL/I, Cobol, Pascal,
and Fortran.
Integrity. The SQL DDL includes commands for
specifying integrity constraints that the data
stored in the database must satisfy. Updates that
violate integrity constraints are disallowed.
Authorization. The SQL DDL includes commands
for specifying access rights to relations and views.

# Basic Structure of the SQL:


To create a table:
Syntax:- CREATE TABLE <table name>
(Col name<data type> (<size>)
<Constraints>,
… … … …

[Table Constraint]
);
(Create a table employee with the following
attributes:
Employee's Id No,
Employee's Name,
Employee's Address,
Employee's Department,
Involved Project No,
Basic Salary,
Join date)
Eg:- CREATE TABLE employee
( emp_no number (4) NOT NULL,
e_name varchar2 (30),
e_add varchar2 (30),
dept varchar2 (20),
proj_no number (4) NOT NULL,
b_sal number (10,2),
joindate date,
Primary Key (emp_no),
Check (e_add in (“Kathmandu”, “Lalitpur”,
“Bhaktapur”, “Kirtipur” ) ) );

(Create a table employee with the following
attributes:
Employee's Id No  Primary Key,
Employee's Name,
Employee's Address (KTM, LTP, BKT, KIR),
Employee's Department (Man, Ast Man, Adm,
Tech, Acc),
Involved Project No  NOT NULL,
Basic Salary  NOT NULL,
Join date)

Eg:- CREATE TABLE employee
( emp_no number (4) PRIMARY KEY,
e_name varchar2 (30),
e_add varchar2 (30)
Check (e_add in ("KTM", "LTP", "BKT", "KIR")),
dept varchar2 (20)
Check (dept in ("Man", "Ast Man", "Adm",
"Tech", "Acc")),
proj_no number (4) NOT NULL,
b_sal number (10,2) NOT NULL,
joindate date );

To insert data values;


Syntax:- INSERT INTO <table_name>
( column_1, column_2, … … …, column_n)
VALUES (data_1, data_2, … … …, data_n);
Eg:- INSERT INTO employee
(emp_no, e_name, e_add, dept,
proj_no, b_sal)
VALUES (101, 'Ramesh', 'Kathmandu', 'Acc',
5001, 12034);

To select data records;


Syntax:- SELECT <col_name1, col_name2, … … …,
col_nameX>
FROM <table_name> WHERE (<conditions>);
Eg:- SELECT emp_no, e_name, e_add FROM
employee;
SELECT emp_no, e_name, e_add FROM
employee
WHERE (emp_no = 105);

# Removing Duplicate (DISTINCT);


SELECT DISTINCT dept FROM employee;

# Combining conditions using Boolean Operators;


* AND (Both predicate must be correct to get correct
result);
SELECT emp_no, e_name, e_add FROM employee
WHERE (b_sal>= 5000 AND b_sal< 10000);

* OR (any one predicate should be correct to get


correct result);
SELECT emp_no, e_name, e_add FROM employee
WHERE (e_add = 'Kathmandu' OR e_add = 'Lalitpur');
# Range specification (IN and BETWEEN);
* IN (values as specified);
SELECT emp_no, e_name, e_add FROM employee
WHERE e_add IN ('Kathmandu', 'Lalitpur', 'Bhaktapur');

* BETWEEN (values as specified within range);


SELECT emp_no, e_name, e_add FROM employee
WHERE b_sal BETWEEN 5000 AND 10000;

# Arranging Tuples;
* ORDER BY (Arranging in Ascending or Descending
Order);
SELECT emp_no, e_name, e_add FROM employee
ORDER BY b_sal;
SELECT emp_no, e_name, e_add FROM employee
ORDER BY b_sal DESC;

* GROUP BY (grouping rows that have the same value);

SELECT dept, COUNT (emp_no) FROM employee
GROUP BY dept;

* HAVING (specifies a condition on the groups formed
by GROUP BY);

SELECT dept, COUNT (emp_no) FROM employee
GROUP BY dept
HAVING AVG (b_sal) > 8000;

# Aggregate Function (Mathematical aggregate


functions);
* To find SUM of the values of the Field;
SELECT SUM (b_sal) FROM employee
WHERE (dept= “Account”);

* To find average of the values;


SELECT AVG (b_sal) FROM employee
WHERE (e_add= “Kathmandu”);

* To find Minimum values of the Field;


SELECT MIN (b_sal) FROM employee;

* To find Maximum values of the Field;


SELECT MAX (b_sal) FROM employee;

* To count no of tuples extracted;


SELECT COUNT (emp_id) FROM employee;

# String Functions (functions that support
alphanumeric values);
* To select LIKE (similar) values;
SELECT emp_no, e_name, e_add FROM employee
WHERE (e_name LIKE ‘Sa%’);
SELECT emp_no, e_name, e_add FROM employee
WHERE (e_name LIKE ‘%am%’);
SELECT emp_no, e_name, e_add FROM employee
WHERE (e_name LIKE ‘%sh’);

# NULL and NOT NULL values;

In standard SQL a null value cannot be tested with the
= operator; the IS NULL and IS NOT NULL predicates
are used instead:
SELECT emp_no, e_name, e_add FROM employee
WHERE (proj_no IS NULL);
SELECT emp_no, e_name, e_add FROM employee
WHERE (proj_no IS NOT NULL);

# Selecting data from two different tables:


Syntax: -SELECT <table1.column1, table2.column1>
FROM <table1>, <table2>;
Eg:- SELECT staff.emp_no, proj.proj_no
FROM staff, proj;
SELECT s1.emp_no, s1.e_name, p1.add,
p1.proj_no
FROM staff s1, proj p1;
SELECT s1.emp_no, s1.e_name, p1.add,
p1.payment
FROM staff s1, proj p1
WHERE (s1.proj_no=p1.proj_no);

To update data records;


Syntax:- UPDATE <table_name>
SET col_name = value
WHERE <predicates>;
Eg:- UPDATE employee
SET b_sal = b_sal + b_sal * 40 / 100
WHERE dept = 'Karnali';
UPDATE marks
SET eng = 32
WHERE (eng >= 27 and eng < 32);

To delete data records;


Syntax: -DELETE FROM <table_name>
WHERE <predicates>;
Eg:- DELETE FROM employee
WHERE dept = 'Karnali';

To delete Table (Drop Table);


* To delete the table;
Syntax: -DROP TABLE <table name>;
Eg:- DROP TABLE employee;

* To delete all records of the table (Make a table


empty);
Syntax: -TRUNCATE TABLE <table name>;
Eg:- TRUNCATE TABLE employee;
To modify the table (ALTER);
* To add field on the table;
Syntax:- ALTER TABLE <table name>
ADD <col_name> <datatype> (<size>);
Eg:- ALTER TABLE employee
ADD mobile number (10);

* To Modify or Change the field on the table;


Syntax:- ALTER TABLE <table name>
MODIFY <col_name> <datatype> (<size>);
Eg:- ALTER TABLE employee
MODIFY mobile varchar2 (10);

* To delete the one field on the table;


Syntax: -ALTER TABLE <table name>
DROP COLUMN <col_name>;
Eg:- ALTER TABLE employee
DROP COLUMN telno;

* To Delete multiple fields on the table;


Syntax:- ALTER TABLE <table name>
DROP (<column1, column2, ……);
Eg:- ALTER TABLE employee
DROP (e_add, telno, mobile);

# Set operation;
In SQL various types of SET operators are also
used. Some of them are;
* UNION operator;
Syntax:- <Query1> UNION <Query2>
Eg:- SELECT * FROM employee
UNION
SELECT * FROM employee1;
* UNION ALL operator;
Syntax:- <Query1> UNION ALL <Query2>
Eg:- SELECT * FROM employee
UNION ALL
SELECT * FROM employee1;

* INTERSECT operator;
Syntax:- <Query1> INTERSECT <Query2>
Eg:- SELECT * FROM employee
INTERSECT
SELECT * FROM employee1;

* MINUS operator;
Syntax:- <Query1> MINUS <Query2>
Eg:- SELECT * FROM employee
MINUS
SELECT * FROM employee1;
# Sub Queries;
Syntax:- SELECT <col_name1, col_name2, … … …, col_nameX>
FROM <table_name>
WHERE <column> <operator> (<SUB QUERY>);
Eg:- SELECT emp_no, e_name, e_add FROM employee
WHERE dept IN
(SELECT dept FROM employee
WHERE e_add = 'Kathmandu');
SELECT emp_no, e_name, e_add FROM employee
WHERE dept IN
(SELECT dept FROM employee
WHERE emp_no < 110);
Unit 7: Query Processing
# Introduction to query processing:
# Equivalence of expressions:
# Query optimization:
# Query decomposition:
Unit 8: Object Oriented Model
# Introduction:
# Design of object oriented model:
Unit 9: Application of Database System in
Organization
# Submission of the database system report of any
one commercial organization:
(Such as Financial, Accounting, Payroll, Inventory,
Ticketing, Banking, Online banking etc)
