
Introduction to Database Management Systems

Database
 A database is a shared, systematically organized collection of logically related data
that is stored to meet the requirements of different users of an organization and that
can easily be accessed, managed and updated.
 It is a place where related pieces of information are stored and on which various
operations can be performed.
 A database can be maintained manually or through electronic devices such as digital
diaries, mobile phones and computers.

Database Management System


 A Database Management System (DBMS) is system software that allows users to
efficiently define, create, maintain and share databases.
 Defining a database involves specifying the data types, structures and constraints of
the data to be stored in the database.
 Creating a database involves storing the data on some storage medium that is
controlled by DBMS.
 Maintaining a database involves updating the database whenever required so that it
evolves to reflect changes in the miniworld (the part of the real world it represents),
and also generating reports for each change.
 Sharing a database involves allowing multiple users to access the database.
 DBMS also serves as an interface between the database and end users or
application programs.
 It provides controlled access to the data and ensures that the data is consistent and
correct by defining rules on it.
 An application program accesses the database by sending queries or requests for
data to the DBMS. A query causes some data to be retrieved from the database.

A DBMS allows users to perform the following tasks:

o Data Definition: It is used for the creation, modification, and removal of the
definitions that describe the organization of data in the database.
o Data Update: It is used for the insertion, modification, and deletion of the actual
data in the database.
o Data Retrieval: It is used to retrieve data from the database so that it can be used
by applications for various purposes.
o User Administration: It is used for registering and monitoring users, maintaining data
integrity, enforcing data security, dealing with concurrency control, monitoring
performance and recovering information corrupted by unexpected failure.
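
The first three tasks can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as the DBMS; the student table and its contents are hypothetical and are not part of these notes.

```python
import sqlite3

# In-memory database; the table and column names are illustrative only.
conn = sqlite3.connect(":memory:")

# Data definition: create the structure that organizes the data.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Data update: insert, modify, and delete the actual data.
conn.execute("INSERT INTO student (roll_no, name) VALUES (1, 'Asha')")
conn.execute("INSERT INTO student (roll_no, name) VALUES (2, 'Ravi')")
conn.execute("UPDATE student SET name = 'Ravi Kumar' WHERE roll_no = 2")
conn.execute("DELETE FROM student WHERE roll_no = 1")

# Data retrieval: query the data for use by an application.
rows = conn.execute("SELECT roll_no, name FROM student ORDER BY roll_no").fetchall()
print(rows)  # [(2, 'Ravi Kumar')]
conn.close()
```

User administration (accounts, privileges) is not shown because SQLite, being an embedded engine, has no user management of its own.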

Characteristics of Database Management System


o Provides security and reduces redundancy
o Self-describing nature of a database system
o Insulation between programs and data, and data abstraction
o Support of multiple views of the data
o Sharing of data and multiuser transaction processing
o Allows entities and the relations among them to be represented as tables
o Follows the ACID concept (Atomicity, Consistency, Isolation, and Durability)
o Supports a multi-user environment that allows users to access and manipulate
data in parallel

Advantages of DBMS
o Controls data redundancy: A DBMS can control data redundancy because all the data
is stored in one central database instead of being duplicated across separate files.
o Data sharing: The authorized users of an organization can share the data among
themselves.
o Easy maintenance: The database is easy to maintain because of the centralized nature of
the database system.
o Reduced time: It reduces development time and maintenance needs.
o Backup: It provides backup and recovery subsystems that automatically back up the
data to protect against hardware and software failures and restore the data if
required.
o Multiple user interfaces: It provides different types of user interfaces, such as graphical
user interfaces and application program interfaces.

Disadvantages of DBMS
o Cost of hardware and software: A high-speed processor and a large memory are
required to run DBMS software.
o Size: It occupies a large amount of disk space and needs a large memory to run efficiently.
o Complexity: A database system creates additional complexity and requirements.
o Higher impact of failure: A failure has a large impact because, in most
organizations, all the data is stored in a single database; if that database is
damaged by a power failure or by corruption, the data may be lost
forever.

Users in a DBMS environment


The following are the various categories of users of a DBMS:
 Application Programmers: Application programmers write programs in
various programming languages to interact with databases.
 Database Administrators: The database administrator (DBA) is responsible for
managing the entire DBMS.
 End-Users: End users are the people who interact with the database
management system. They perform various operations on the database, such as retrieving,
updating and deleting data.

Application of DBMS
 Banking: For customer information, account activities, payments, deposits, loans,
etc.
 Airlines: For reservations and schedule information.
 Universities: For student information, course registrations, colleges and grades.
 Telecommunication: For keeping call records, generating monthly bills and maintaining
balances.
 Finance: For storing information about stock, sales, and purchases of financial
instruments like stocks and bonds.
 Sales: For storing customer, product and sales information.
 Manufacturing: For managing the supply chain and for tracking the
production of items and the status of inventories in warehouses.
 HR Management: For information about employees, salaries, payroll, deduction,
generation of paychecks, etc.

Types of DBMS

Hierarchical Model
 This database model organizes data into a tree-like structure with a single root, to
which all the other data is linked. The hierarchy starts from the root data and
expands like a tree, adding child nodes to the parent nodes.
 In this model, a child node has only a single parent node.
 This model efficiently describes many real-world relationships, such as the index of a book
or recipes.
 In the hierarchical model, data is organized with a one-to-many relationship
between two different types of data; for example, one department
can have many courses, many professors and, of course, many students.
Network Model
 This is an extension of the hierarchical model. In this model, data is organised more
like a graph, and a child node is allowed to have more than one parent node.
 Because more relationships are established in this model, the data is more closely
related, which can make accessing the data easier and faster. This database model
was used to map many-to-many data relationships.
 In this model, entities are organized in a graph in which they can be accessed through
several paths.

Relational model
 The relational DBMS is the most widely used DBMS model because it is one of the
easiest to use.
 This model is based on normalizing data into the rows and columns of tables.
Relational data is stored in fixed structures and manipulated using SQL.
 In this model, data is organised in two-dimensional tables, and relationships are
maintained by storing a common field.
 The basic structure of data in the relational model is the table. All the information
related to a particular type is stored in the rows of that table.
 Tables are also known as relations in the relational model.
Object-Oriented Model
 In the object-oriented model, data is stored in the form of objects.
 The structure of the objects is defined by classes, which hold the data within them.
 It defines a database as a collection of objects that store both the values of data members
and operations.
 The distinguishing feature of an object-oriented database is that it adds database functionality
to an object-oriented programming language.

Components of DBMS
 User: Users are the people who actually use the database. Users can be administrators,
developers or end users.
 Data or Database: Data is one of the most important components of the
database. A very large amount of data is stored in the database, and it forms the
main source through which all the other components interact with each other. There are two
types of data. One is user data: the data the database exists to hold, which is stored,
according to the requirements, in the various tables of the database in the form of rows
and columns. The other is metadata, known as ‘data about data’: it records how many
tables there are and their names, how many columns each has and their names, primary keys,
foreign keys, etc. In short, the metadata holds information about each table and its
constraints in the database.
 DBMS: This is the software that helps the user to interact with the database. It allows
the users to insert, delete, update or retrieve the data. All these operations are
expressed in a query language such as SQL, which DBMS products like MySQL and Oracle implement.
 Database Application: This is the application program that helps the users to
interact with the database by means of query languages. The database application does
not need to know the internal details of the underlying DBMS.
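
The idea of metadata as 'data about data' can be illustrated with a short sketch. It assumes Python's built-in sqlite3 module; the department and employee tables are hypothetical. SQLite keeps its catalog in the sqlite_master table and exposes column details through PRAGMA table_info; other DBMS products use different catalog names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, "
    "emp_name TEXT, "
    "dept_id INTEGER REFERENCES department(dept_id))"
)

# Metadata: names of the tables, read from SQLite's built-in catalog.
table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

# Metadata: column names of one table. PRAGMA table_info returns rows of
# (cid, name, type, notnull, dflt_value, pk).
employee_columns = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]

print(table_names)       # ['department', 'employee']
print(employee_columns)  # ['emp_id', 'emp_name', 'dept_id']
conn.close()
```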

Database Administrator (DBA)


 One of the main reasons for using a DBMS is to have central control over both the
data and the applications that access that data. The person who has such central
control is called a database administrator (DBA).

Functions of DBA:
 Defining Conceptual Schema: The DBA creates the original database schema
by executing a set of data definition statements in the DDL.

 Schema and physical-organization modification: The DBA carries out
changes to the schema and physical organization to reflect the changing needs of
the organization, or to alter the physical organization to improve performance.

 Software installation and Maintenance: A DBA often collaborates on the
initial installation and configuration of a new database such as Oracle or SQL Server.
The system administrator sets up hardware and deploys the operating system
for the database server, then the DBA installs the database software and
configures it for use. As updates and patches are required, the DBA handles this
on-going maintenance. If a new server is needed, the DBA handles the transfer of
data from the existing system to the new platform.

 Security and Integrity Checks: The DBA ensures data integrity, which means that the data is
complete, accurate and current for the tasks at hand. The DBA also controls data security,
including preventing unauthorized access to the data and protecting against
other security threats.
 Backup and Recovery Strategies: DBAs create backup and recovery plans and
procedures based on industry best practices, then make sure that the necessary
steps are followed. Backups cost time and money, so the DBA may have to
persuade management to take the necessary precautions to preserve data. System
admins or other personnel may actually create the backups, but it is the DBA’s
responsibility to make sure that everything is done on schedule.
In the case of a server failure or other form of data loss, the DBA will use existing
backups to restore lost information to the system. Different types of failures may
require different recovery strategies, and the DBA must be prepared for any
eventuality.

 Granting User access and Authentication: Setting up employee access is an
important aspect of database security. DBAs control who has access and what
type of access they are allowed. For instance, a user may have permission to see
only certain pieces of information, or they may be denied the ability to make
changes to the system.

 Monitoring Performance: Monitoring databases for performance issues is part
of the on-going system maintenance a DBA performs. If some part of the system
is slowing down processing, the DBA may need to make configuration changes to
the software or add hardware capacity. Many types of monitoring
tools are available, and part of the DBA’s job is to understand what needs to be
tracked to improve the system. This aspect can also be outsourced to third-party
organisations, provided they offer up-to-date DBA support.

Limitations of File Processing Systems

A file processing system works well when there are only a limited number of files and
little data. As the data and files in the system grow, handling them becomes difficult.

1. Data Mapping and Access: - Although all the related information is grouped
and stored in different files, there is no mapping between any two files, i.e., any
two dependent files are not linked. Even though the Student and Student_Report
files are related, they are two different files and are not linked by any means.
Hence, if we need to display a student's details along with his report, we cannot
pick them directly from those two files. We have to write a lengthy program to search
the Student file first, get all the details, then go to the Student_Report file and search
for his report.
When there is a very large amount of data, searching the file system for a particular
piece of information is always time consuming and inefficient.
2. Data Redundancy: - A file system has no method for detecting the insertion of duplicate
data. Any user can enter any data. The file system validates neither
the kind of data being entered nor whether the
same data already exists in the same file. Duplicate data is undesirable because it
wastes space and always leads to confusion and mishandling of data. When there
is duplicate data in the file and we need to update or delete a record, we
might end up updating or deleting only one of the copies, leaving the other copy in
the file. Again, the file system does not validate this process. Hence the purpose of
storing the data is lost.
Though the file name says Student file, there is a chance of staff
information or report information being entered in the file. The file system allows any
information to be entered into any file. It does not isolate the data being entered
from the group it belongs to.

3. Data Dependence: - In files, data is stored in a specific format, say tab-,
comma- or semicolon-separated. If the format of any file is changed, then the program
that processes this file needs to be changed. But there may be many programs
that depend on this file. We need to know in advance all the programs that
use this file and change every one of them. Missing the change in any one place
will make the whole application fail. Similarly, changes in the storage structure, or in
the way the data is accessed, affect all the places where this file is used. That is, even
the smallest change in the file affects all the programs and requires changes in all of
them.
4. Data inconsistency: - Imagine that both the Student and Student_Report files contain the student’s
address, and there is a change request for one particular student’s address.
The program searches only the Student file for the address and updates it
correctly. Another program prints the student’s report and mails it
to the address recorded in the Student_Report file. What happens to the report
of a student whose address has been changed? There is a mismatch with the actual
address, and his report is sent to his old address. This mismatch between different copies
of the same data is called data inconsistency. It occurs here because there is
no proper listing of the files that hold copies of the same data.
5. Data Isolation: - Imagine we have to generate a single report for a student
studying in a particular class, containing his study report, his library book details, and his hostel
information. All this information is stored in different files. How do we get all
these details into one report? We have to write a program. But before writing the
program, the programmer must find out which files hold the information
needed, what the format of each file is, how to search for data in each file, etc. Once all
this analysis is done, he writes the program. If only 2-3 files are involved,
the programming is fairly simple. But if many files are involved,
it requires a lot of effort from the programmer. Since the data is
isolated in different files, programming becomes difficult.
6. Security: - Each file can be password protected. But what if we have to give access to
only a few records in the file? For example, a user has to be given access to view only
their own bank account information in the file. This is very difficult in a file system.
7. Integrity: - If we need to check certain criteria while entering
data into a file, this is not possible directly. We can do it only by writing programs. Say, if we
have to restrict students above age 18, this can be done by means of a program alone.
There is no direct checking facility in the file system. Hence these kinds of
integrity checks are not easy in a file system.
8. Atomicity: - If an insert, update or delete fails in the file system,
there is no mechanism to switch back to the previous state. Imagine that marks for one
particular subject need to be entered into the Report file and then the total needs to
be recalculated. Suppose that, after the new marks are entered, the file is closed without saving. That
means the whole of the required transaction has not been performed: the totalling of
marks has been done, but the addition of the new marks has not. The total
calculated is wrong in this case. Atomicity means that a transaction is either completed
in full or not performed at all. Partial completion of any transaction leads
to incorrect data in the system. The file system does not guarantee atomicity. It
may be possible with complex programs, but introducing them for every transaction
is costly.
9. Concurrent Access: - Accessing the same data from the same file by more than one
user at the same time is called concurrent access. In the file system, concurrent access
leads to incorrect data. For example, a student wants to borrow a book from the library. He searches for
the book in the library file and sees that only one copy is available. At the same
time another student also wants to borrow the same book and sees that one copy is
available. The first student opts to borrow and gets the book. But the file has not yet been updated
to zero copies, so the second student also opts to borrow, although there are
no books available. This is the problem of concurrent access in the file system.
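
For contrast, the sketch below shows how a DBMS can handle the integrity and atomicity points itself. It assumes Python's built-in sqlite3 module; the report and total tables, the 0 to 100 marks rule and the add_marks helper are hypothetical and only illustrate the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Integrity: the CHECK constraint (an assumed rule) rejects marks outside 0..100.
conn.execute(
    "CREATE TABLE report ("
    "student TEXT, subject TEXT, "
    "marks INTEGER CHECK (marks BETWEEN 0 AND 100))"
)
conn.execute("CREATE TABLE total (student TEXT PRIMARY KEY, total INTEGER)")
conn.execute("INSERT INTO report VALUES ('Asha', 'Maths', 80)")
conn.execute("INSERT INTO total VALUES ('Asha', 80)")
conn.commit()

def add_marks(student, subject, marks):
    """Update the total and insert the marks as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE total SET total = total + ? WHERE student = ?",
                         (marks, student))
            conn.execute("INSERT INTO report VALUES (?, ?, ?)",
                         (student, subject, marks))
        return True
    except sqlite3.IntegrityError:
        return False

first_ok = add_marks("Asha", "Physics", 70)      # valid: both steps are kept
second_ok = add_marks("Asha", "Chemistry", 150)  # violates CHECK: both steps are undone

total = conn.execute("SELECT total FROM total WHERE student = 'Asha'").fetchone()[0]
subjects = [r[0] for r in conn.execute(
    "SELECT subject FROM report WHERE student = 'Asha' ORDER BY subject")]
print(first_ok, second_ok, total, subjects)
```

In the second call the total is updated first, but the failed insert causes the whole transaction to be rolled back, so the stored total stays consistent with the stored marks.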

DBMS vs. File System


The following are the differences between a DBMS and a file system:

1. DBMS: A DBMS is a collection of data. In a DBMS, the user is not required to write the
procedures.
File system: A file system is a collection of data. In this system, the user has to write the
procedures for managing the database.

2. DBMS: A DBMS gives an abstract view of data that hides the details.
File system: A file system exposes the details of the representation and storage of data.

3. DBMS: A DBMS provides a crash recovery mechanism, i.e., the DBMS protects the user from
system failure.
File system: A file system doesn't have a crash recovery mechanism, i.e., if the system crashes
while some data is being entered, the content of the file will be lost.

4. DBMS: A DBMS provides a good protection mechanism.
File system: It is very difficult to protect a file under the file system.

5. DBMS: A DBMS contains a wide variety of sophisticated techniques to store and retrieve the data.
File system: A file system can't efficiently store and retrieve the data.

6. DBMS: A DBMS takes care of concurrent access to data using some form of locking.
File system: In a file system, concurrent access causes many problems, such as one user reading
the file while another deletes or updates some information.

Data Abstraction

Database systems are made up of complex data structures. In order to make the system
efficient in terms of data retrieval, and to reduce complexity in terms of usability,
developers use abstraction, i.e., they hide irrelevant details from the users. This approach
simplifies database design.
There are mainly 3 levels of data abstraction:
 Physical: This is the lowest level of data abstraction. It tells us how the data is
actually stored in memory. Access methods such as sequential or random access
and file organisation methods such as B+ trees and hashing are used for this. Usability,
the size of memory, and the number of times the records are accessed are factors that we need to
know while designing the database. Suppose we need to store the details of an
employee. The blocks of storage and the amount of memory used for this purpose are
kept hidden from the user.

 Logical: This level comprises the information that is actually stored in the
database in the form of tables. It also stores the relationships among the data entities
in relatively simple structures. At this level, the information available to the user at
the view level is unknown. We can store the various attributes of an employee, and
relationships, e.g. with the manager, can also be stored.

 View: This is the highest level of abstraction. Only a part of the actual database is
viewed by the users. This level exists to ease access to the database for an
individual user. Users view data in the form of rows and columns. Tables and
relations are used to store data. Multiple views of the same database may exist.
Users can view the data and interact with the database; storage and
implementation details are hidden from them.
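
The difference between the logical level and the view level can be sketched as follows. The example assumes Python's built-in sqlite3 module; the employee table and the employee_directory view are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Logical level: the full employee table, including a sensitive column.
conn.execute("CREATE TABLE employee (emp_id INTEGER, name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, "Asha", 50000), (2, "Ravi", 30000)])

# View level: expose only part of the table to a group of users.
conn.execute("CREATE VIEW employee_directory AS SELECT emp_id, name FROM employee")

directory = conn.execute(
    "SELECT * FROM employee_directory ORDER BY emp_id").fetchall()
print(directory)  # [(1, 'Asha'), (2, 'Ravi')]
conn.close()
```

Users of the view see rows and columns like any other table, but the salary column and the way the data is stored stay hidden from them.
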
Data Independence

o The ability to modify a schema definition at one level without affecting a schema
definition at a higher level is called data independence.

o Metadata itself follows a layered architecture, so that when we change data at one
layer, it does not affect the data at another layer. The data at each layer is independent
of, but mapped to, the data at the other layers.
Logical Data Independence
 Logical data is data about the database, that is, it stores information about how data is
managed inside. For example, a table (relation) stored in the database and all the
constraints applied to that relation.
 Logical data independence is a mechanism that decouples this logical description from the
actual data stored on the disk. If we make changes to the table format, the data
residing on the disk should not change.
 Logical data independence refers to the ability to change the
conceptual schema without having to change the external schema.
 Logical data independence is used to separate the external level from the
conceptual view.
 If we make any changes to the conceptual view of the data, the user view of the
data is not affected.
 Logical data independence occurs at the user interface level.

Physical Data Independence


o All the schemas are logical, and the actual data is stored in bit format on the disk.
Physical data independence is the ability to change the physical data without
impacting the schema or logical data.
o Physical data independence can be defined as the capacity to change the internal
schema without having to change the conceptual schema.
o If we make any changes to the storage size of the database system server, the
conceptual structure of the database is not affected.
o Physical data independence is used to separate the conceptual level from the internal
level.
o Physical data independence occurs at the logical interface level.
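
A small sketch of physical data independence, assuming Python's built-in sqlite3 module and a hypothetical employee table: adding an index changes how the data is organized for access, while the table definition, the query and its result stay the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER, name TEXT, city TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, "Asha", "Delhi"), (2, "Ravi", "Mumbai"), (3, "Hari", "Delhi")])

query = "SELECT name FROM employee WHERE city = 'Delhi' ORDER BY name"
before = conn.execute(query).fetchall()

# Physical change: add an index. The table definition and the query are untouched.
conn.execute("CREATE INDEX idx_employee_city ON employee (city)")
after = conn.execute(query).fetchall()

print(before == after)  # True
conn.close()
```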

DBMS Architecture
o The design of a DBMS depends upon its architecture. The basic client/server
architecture is used to deal with a large number of PCs, web servers, database
servers and other components that are connected by networks.
o The client/server architecture consists of many PCs and a workstation that are
connected via the network.
o DBMS architecture depends upon how users are connected to the database to get
their requests served.

Types of DBMS Architecture

Database architecture can be seen as single-tier or multi-tier. Logically, however, database
architecture is of two types: 2-tier architecture and 3-tier architecture.

1-Tier Architecture
o In this architecture, the database is directly available to the user. It means the user
works directly on the DBMS.
o Any changes made here are applied directly to the database itself. It doesn't
provide a handy tool for end users.
o The 1-Tier architecture is used for the development of local applications, where
programmers can communicate directly with the database for a quick response.
2-Tier Architecture
o The 2-Tier architecture is the same as the basic client-server model. In the two-tier architecture,
applications on the client end can communicate directly with the database on the
server side. For this interaction, APIs such as ODBC and JDBC are used.
o The user interfaces and application programs run on the client side.
o The server side is responsible for providing functionality such as query processing
and transaction management.
o To communicate with the DBMS, the client-side application establishes a connection
with the server side.

Fig: 2-tier Architecture

3-tier Architecture
o A 3-tier architecture separates its tiers from each other based on the complexity of
the users and how they use the data present in the database.
o It is the most widely used architecture to design a DBMS.
Fig: 3-tier Architecture

 Database (Data) Tier − At this tier, the database resides along with its query
processing languages. We also have the relations that define the data and their
constraints at this level.

 Application (Middle) Tier − At this tier reside the application server and the
programs that access the database. For a user, this application tier presents an
abstracted view of the database. End-users are unaware of any existence of the
database beyond the application. At the other end, the database tier is not aware of
any other user beyond the application tier. Hence, the application layer sits in the
middle and acts as a mediator between the end-user and the database.
 User (Presentation) Tier − End-users operate on this tier and they know nothing
about any existence of the database beyond this layer. At this layer, multiple views
of the database can be provided by the application. All views are generated by
applications that reside in the application tier.
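
The separation of the three tiers can be sketched in a single program. This is only an illustration under simplifying assumptions: Python's built-in sqlite3 module stands in for the database tier, all three tiers run in one process, and the account table and the get_balance and show_balance functions are hypothetical.

```python
import sqlite3

# Database (data) tier: the database and its relations.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (owner TEXT, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('Asha', 1200)")

# Application (middle) tier: the only code that talks to the database.
def get_balance(owner):
    row = conn.execute(
        "SELECT balance FROM account WHERE owner = ?", (owner,)).fetchone()
    return None if row is None else row[0]

# User (presentation) tier: formats what the application tier returns
# and never sees SQL or the table layout.
def show_balance(owner):
    balance = get_balance(owner)
    if balance is None:
        return f"{owner}: no account"
    return f"{owner}: balance {balance}"

print(show_balance("Asha"))  # Asha: balance 1200
```

In a real deployment each tier would typically run on a separate machine or process, with the presentation tier reaching the application tier over a network.
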
There are the following three levels or layers of DBMS architecture:
• External Level
• Conceptual Level
• Internal Level

 The above diagram shows the architecture of a DBMS.
 Mapping is the process of transforming requests and responses between the various
levels of the architecture.
 Mapping is not well suited to small databases, because it takes more time.
 In External / Conceptual mapping, the DBMS transforms a request on an external
schema into a request against the conceptual schema.
 In Conceptual / Internal mapping, the request is transformed from the
conceptual to the internal level.
1. Physical Level
 Physical level describes the physical storage structure of data in database.
 It is also known as Internal Level.
 This level is very close to physical storage of data.
 At the lowest level, data is stored in the form of bits with physical addresses on the
secondary storage device.
 At the highest level, it can be viewed in the form of files.
 The internal schema defines the various stored data types. It uses a physical data
model.
2. Conceptual Level
 The conceptual level describes the structure of the whole database for a group of users.
 It is also called the data model.
 The conceptual schema is a representation of the entire content of the database.
 This schema contains all the information needed to build the relevant external records.
 It hides the internal details of physical storage.
3. External Level
 The external level relates to the data that is viewed by individual end users.
 This level includes a number of user views or external schemas.
 This level is closest to the user.
 An external view describes the segment of the database that is required for a particular
user group and hides the rest of the database from that user group.

Functions of DBMS:
There are the following important functions of a DBMS:
(i) Data Storage Management: It provides a mechanism for management of permanent
storage of the data. The internal schema defines how the data should be stored by the
storage management mechanism and the storage manager interfaces with the operating
system to access the physical storage. 
(ii) Data Manipulation Management: A DBMS furnishes users with the ability to retrieve,
update and delete existing data in the database.
(iii) Data Definition Services: The DBMS accepts the data definitions such as external
schema, the conceptual schema, the internal schema, and all the associated mappings in
source form.
(iv) Data Dictionary/System Catalog Management: The DBMS provides a data
dictionary or system catalog function in which descriptions of data items are stored and
which is accessible to users. 
(v) Database Communication Interfaces: The end-user's requests for database access
are transmitted to DBMS in the form of communication messages.
(vi) Authorization / Security Management: The DBMS protects the database against
unauthorized access, either intentional or accidental. It furnishes mechanisms to ensure
that only authorized users can access the database.
(vii) Backup and Recovery Management: The DBMS provides mechanisms for backing
up data periodically and recovering from different types of failures. This prevents the loss
of data.
(viii) Concurrency Control Service: Since DBMSs support sharing of data among multiple
users, they must provide a mechanism for managing concurrent access to the database.
DBMSs ensure that the database is kept in a consistent state and that the integrity of the data is
preserved.
(ix) Transaction Management: A transaction is a series of database operations, carried
out by a single user or application program, which accesses or changes the contents of the
database. Therefore, a DBMS must provide a mechanism to ensure either that all the
updates corresponding to a given transaction are made or that none of them is made.

Relational algebra

o Relational algebra is a widely used procedural query language. It takes instances
of relations as input and yields instances of relations as output. It uses various
operations to perform this action.

o Relational algebra operations are performed recursively on a relation. The output of
these operations is a new relation, which might be formed from one or more input
relations.

o Relational algebra operations are divided into the following groups:

Unary Relational Operations

 SELECT (symbol: σ)
 PROJECT (symbol: π)
 RENAME (symbol: ρ)

Relational Algebra Operations from Set Theory

 UNION (∪)
 INTERSECTION (∩)
 DIFFERENCE (-)
 CARTESIAN PRODUCT (×)

Binary Relational Operations

 JOIN
 DIVISION

o Projection (π)
 Projection is used to project the required columns (attributes) from a relation.

o Selection (σ)

 Selection is used to select the required tuples of a relation. For a relation R with an
attribute c, σ (c>3)R selects the tuples in which c is greater than 3.
 Note: the selection operator returns whole tuples, with all their attributes. To keep only
certain columns of the selected tuples, the projection operator is applied to the result.
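
As a minimal sketch of how selection and projection combine, the Python code below models a relation as a list of dictionaries. The relation R, its attribute values and the helper functions select and project are hypothetical and chosen only for illustration.

```python
# A relation is modeled as a list of dicts; R and its values are hypothetical.
R = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 4, "b": 5, "c": 6},
    {"a": 7, "b": 5, "c": 9},
]

def select(relation, predicate):
    """sigma: keep the tuples for which the predicate is true."""
    return [t for t in relation if predicate(t)]

def project(relation, attributes):
    """pi: keep only the named attributes and drop duplicate tuples."""
    seen, result = set(), []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result

selected = select(R, lambda t: t["c"] > 3)  # sigma (c>3) R
projected = project(selected, ["b"])        # pi b (sigma (c>3) R)
print(selected)
print(projected)
```

Selection keeps two whole tuples here; projecting them onto b leaves a single tuple because the duplicate value 5 is removed, as relations are sets.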

o Union (∪)
 UNION is denoted by the ∪ symbol. It includes all tuples that are in table A or in table B. It
also eliminates duplicate tuples. So, set A UNION set B would be expressed as:
 Result <- A ∪ B
 For a union operation to be valid, the following conditions must hold:
 A and B must have the same number of attributes.
 Attribute domains need to be compatible.
 Duplicate tuples are automatically removed.

Example

Consider the following tables.

Table A
column 1   column 2
1          1
1          2

Table B
column 1   column 2
1          1
1          3

A ∪ B gives

Table A ∪ B
column 1   column 2
1          1
1          2
1          3

o Intersection (∩)
 An intersection is denoted by the symbol ∩.
 A ∩ B
 It defines a relation consisting of the set of all tuples that are in both A and B. However, A
and B must be union-compatible.

Example:

A ∩ B

Table A ∩ B
column 1   column 2
1          1

o Set Difference (-)

 Set difference in relational algebra is the same operation as set difference in set theory, with the constraint that both relations must have the same set of attributes.
 It is denoted by the - symbol. The result of A - B is a relation that includes all tuples that are in A but not in B.

 The attribute names of A must match the attribute names of B.
 The two operand relations A and B must be union-compatible.

Example

A-B
Table A - B

column 1 column 2

1 2
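The three set operations on Table A and Table B can be checked with Python's built-in set type. This is only a sketch: each row is written as a tuple (column 1, column 2), and the variable names are chosen for the example.

```python
# Rows of Table A and Table B, each written as (column 1, column 2).
A = {(1, 1), (1, 2)}
B = {(1, 1), (1, 3)}

union = A | B          # tuples in A or in B; duplicates collapse automatically
intersection = A & B   # tuples in both A and B
difference = A - B     # tuples in A but not in B

print(sorted(union))         # [(1, 1), (1, 2), (1, 3)]
print(sorted(intersection))  # [(1, 1)]
print(sorted(difference))    # [(1, 2)]
```

A Python set never holds the same element twice, which mirrors the automatic removal of duplicate tuples in relational algebra.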

o Rename (ρ)

 Rename is a unary operation used for renaming a relation or its attributes.

ρ (a/b)R renames the attribute ‘b’ of relation R to ‘a’.

Example: We can use the rename operator to rename the STUDENT relation to STUDENT1.

ρ(STUDENT1, STUDENT)  

o Cross Product (×)

 The cross product of two relations A and B, written A × B, contains all the attributes of A followed by all the attributes of B. Each record of A is paired with every record of B, so if A has m records and B has n records, A × B has m × n records.
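A cross product can be sketched in Python with itertools.product. The two small relations below are hypothetical and serve only to show how every row of A is paired with every row of B.

```python
from itertools import product

# Hypothetical relations (values invented for illustration); each row is a tuple.
A = [("101", "Stephan"), ("102", "Jack")]
B = [("Delhi",), ("Mumbai",), ("Noida",)]

# Concatenate every row of A with every row of B.
cross = [a + b for a, b in product(A, B)]

for row in cross:
    print(row)
```

With 2 rows in A and 3 rows in B, the result has 2 × 3 = 6 rows, each carrying the attributes of A followed by the attribute of B.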

Join Operations:
A Join operation combines related tuples from different relations, if and only if a given join
condition is satisfied. It is denoted by ⋈.
Example:
EMPLOYEE

EMP_CODE EMP_NAME

101 Stephan

102 Jack

103 Harry

SALARY

EMP_CODE SALARY

101 50000

102 30000

103 25000

Operation: EMPLOYEE ⋈ SALARY

Result:

EMP_CODE EMP_NAME SALARY

101 Stephan 50000

102 Jack 30000

103 Harry 25000

Types of Join operations:


1. Natural Join:
o A natural join is the set of all combinations of tuples in R and S that are equal on their common attribute names. Each common attribute appears only once in the result.
o It is denoted by ⋈.

Example: Let's use the above EMPLOYEE table and SALARY table:

Input:

∏ EMP_NAME, SALARY (EMPLOYEE ⋈ SALARY)

Output:

EMP_NAME SALARY

Stephan 50000

Jack 30000

Harry 25000
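The natural join can be sketched in Python as a nested loop over the two relations. The function name natural_join is hypothetical; the rows are the EMPLOYEE and SALARY tables from the example.

```python
def natural_join(r, s):
    """Combine rows of r and s that agree on every attribute name they share."""
    result = []
    for row_r in r:
        for row_s in s:
            common = row_r.keys() & row_s.keys()
            if all(row_r[name] == row_s[name] for name in common):
                # Merging the dictionaries keeps each common attribute once.
                result.append({**row_r, **row_s})
    return result

EMPLOYEE = [
    {"EMP_CODE": 101, "EMP_NAME": "Stephan"},
    {"EMP_CODE": 102, "EMP_NAME": "Jack"},
    {"EMP_CODE": 103, "EMP_NAME": "Harry"},
]
SALARY = [
    {"EMP_CODE": 101, "SALARY": 50000},
    {"EMP_CODE": 102, "SALARY": 30000},
    {"EMP_CODE": 103, "SALARY": 25000},
]

joined = natural_join(EMPLOYEE, SALARY)
# Projection on EMP_NAME and SALARY, as in the example query.
names_and_salaries = [(row["EMP_NAME"], row["SALARY"]) for row in joined]
print(names_and_salaries)
```

A nested loop is the simplest way to express the definition; real systems use faster strategies, but the result is the same.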
2. Outer Join:
The outer join operation is an extension of the join operation. It is used to deal with missing
information.

Example:

EMPLOYEE

EMP_NAME STREET CITY

Ram Civil line Mumbai

Shyam Park street Kolkata

Ravi M.G. Street Delhi

Hari Nehru nagar Hyderabad

FACT_WORKERS

EMP_NAME BRANCH SALARY

Ram Infosys 10000

Shyam Wipro 20000

Kuber HCL 30000

Hari TCS 50000

Input:
EMPLOYEE ⋈ FACT_WORKERS

Output:

EMP_NAME STREET CITY BRANCH SALARY

Ram Civil line Mumbai Infosys 10000

Shyam Park street Kolkata Wipro 20000

Hari Nehru nagar Hyderabad TCS 50000

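The natural join above leaves out Ravi and Kuber because each appears in only one of the two relations. An outer join keeps such tuples. It is basically of three types: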


a. Left outer join
b. Right outer join
c. Full outer join

a. Left outer join:


o Left outer join contains the set of all combinations of tuples in R and S that are equal on their common attribute names.
o In addition, it keeps the tuples in R that have no matching tuples in S; their S attributes are filled with NULL.
o It is denoted by ⟕.

Example: Using the above EMPLOYEE table and FACT_WORKERS table

Input:

EMPLOYEE ⟕ FACT_WORKERS

Output:

EMP_NAME STREET CITY BRANCH SALARY

Ram Civil line Mumbai Infosys 10000

Shyam Park street Kolkata Wipro 20000

Hari Nehru nagar Hyderabad TCS 50000

Ravi M.G. Street Delhi NULL NULL

b. Right outer join:


o Right outer join contains the set of all combinations of tuples in R and S that are equal on their common attribute names.
o In addition, it keeps the tuples in S that have no matching tuples in R; their R attributes are filled with NULL.
o It is denoted by ⟖.

Example: Using the above EMPLOYEE table and FACT_WORKERS table

Input:

EMPLOYEE ⟖ FACT_WORKERS

Output:

EMP_NAME BRANCH SALARY STREET CITY

Ram Infosys 10000 Civil line Mumbai

Shyam Wipro 20000 Park street Kolkata

Hari TCS 50000 Nehru nagar Hyderabad

Kuber HCL 30000 NULL NULL

c. Full outer join:


o Full outer join is like a left or right outer join except that it keeps the unmatched rows from both tables.
o In full outer join, tuples in R that have no matching tuples in S and tuples in S that have no matching tuples in R are both included, with NULL in the missing attributes.
o It is denoted by ⟗.

Example: Using the above EMPLOYEE table and FACT_WORKERS table

Input:

EMPLOYEE ⟗ FACT_WORKERS

Output:

EMP_NAME STREET CITY BRANCH SALARY

Ram Civil line Mumbai Infosys 10000


Shyam Park street Kolkata Wipro 20000

Hari Nehru nagar Hyderabad TCS 50000

Ravi M.G. Street Delhi NULL NULL

Kuber NULL NULL HCL 30000
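The three outer joins differ only in which unmatched tuples they keep, which a single Python function with two flags can illustrate. This is a sketch under simplifying assumptions: the function name outer_join and its parameters are hypothetical, the join uses one shared attribute, both relations are non-empty, and None stands in for NULL.

```python
def outer_join(r, s, key, keep_left=False, keep_right=False):
    """Join r and s on one shared attribute; unmatched rows are padded with None."""
    r_attrs = list(r[0].keys())
    s_attrs = list(s[0].keys())
    result = []
    matched_s = set()
    for row_r in r:
        found = False
        for i, row_s in enumerate(s):
            if row_r[key] == row_s[key]:
                result.append({**row_r, **row_s})
                matched_s.add(i)
                found = True
        if not found and keep_left:
            # Keep the unmatched row of r, with None for the attributes of s.
            result.append({**{name: None for name in s_attrs}, **row_r})
    if keep_right:
        for i, row_s in enumerate(s):
            if i not in matched_s:
                # Keep the unmatched row of s, with None for the attributes of r.
                result.append({**{name: None for name in r_attrs}, **row_s})
    return result

EMPLOYEE = [
    {"EMP_NAME": "Ram", "STREET": "Civil line", "CITY": "Mumbai"},
    {"EMP_NAME": "Shyam", "STREET": "Park street", "CITY": "Kolkata"},
    {"EMP_NAME": "Ravi", "STREET": "M.G. Street", "CITY": "Delhi"},
    {"EMP_NAME": "Hari", "STREET": "Nehru nagar", "CITY": "Hyderabad"},
]
FACT_WORKERS = [
    {"EMP_NAME": "Ram", "BRANCH": "Infosys", "SALARY": 10000},
    {"EMP_NAME": "Shyam", "BRANCH": "Wipro", "SALARY": 20000},
    {"EMP_NAME": "Kuber", "BRANCH": "HCL", "SALARY": 30000},
    {"EMP_NAME": "Hari", "BRANCH": "TCS", "SALARY": 50000},
]

left = outer_join(EMPLOYEE, FACT_WORKERS, "EMP_NAME", keep_left=True)
right = outer_join(EMPLOYEE, FACT_WORKERS, "EMP_NAME", keep_right=True)
full = outer_join(EMPLOYEE, FACT_WORKERS, "EMP_NAME", keep_left=True, keep_right=True)

print(len(left), len(right), len(full))
```

With both flags left at False the function returns the plain natural join on the key, so the outer joins are visibly an extension of it.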

3. Equi join:
It is a form of inner join and is the most common join. It keeps only the pairs of tuples that satisfy an equality condition, so the join condition uses only the comparison operator (=).

Example:

CUSTOMER

CLASS_ID NAME

1 John

2 Harry

3 Jackson

PRODUCT

PRODUCT_ID CITY

1 Delhi

2 Mumbai

3 Noida

Input:
CUSTOMER ⋈ (CLASS_ID = PRODUCT_ID) PRODUCT

Output:

CLASS_ID NAME PRODUCT_ID CITY

1 John 1 Delhi

2 Harry 2 Mumbai

3 Jackson 3 Noida

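An equi join can be sketched in Python as a cross product filtered by an equality test. The condition CLASS_ID = PRODUCT_ID is an assumption inferred from the output table, since the two relations share no attribute name.

```python
CUSTOMER = [(1, "John"), (2, "Harry"), (3, "Jackson")]   # (CLASS_ID, NAME)
PRODUCT = [(1, "Delhi"), (2, "Mumbai"), (3, "Noida")]    # (PRODUCT_ID, CITY)

# Pair only the rows whose CLASS_ID equals PRODUCT_ID (assumed join condition).
equi = [c + p for c in CUSTOMER for p in PRODUCT if c[0] == p[0]]

for row in equi:
    print(row)
```

Unlike a natural join, both compared attributes stay in the result, which is why CLASS_ID and PRODUCT_ID both appear as columns.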
Relational Calculus
o Relational calculus is a non-procedural query language. In a non-procedural query language, the user is not concerned with the details of how to obtain the end results.
o The relational calculus tells what to do but never explains how to do it.

Types of Relational calculus:

1. Tuple Relational Calculus (TRC)


o The tuple relational calculus is used to select the tuples in a relation. In TRC, the filtering variable ranges over the tuples of a relation.
o The result of the relation can have one or more tuples.

In Tuple Calculus, a query is expressed as


{t| P(t)}
where t = resulting tuples,
P(t) = the predicate, that is, the conditions used to fetch t.
Thus, it generates the set of all tuples t such that predicate P(t) is true for t.
 P(t) may have various conditions logically combined with OR (∨), AND (∧), NOT (¬).
It also uses quantifiers:
 ∃ t ∈ r (Q(t)) = “there exists” a tuple t in relation r such that predicate Q(t) is true.
 ∀ t ∈ r (Q(t)) = Q(t) is true “for all” tuples t in relation r.
Example:
Table-1: Customer
CUSTOMER NAME STREET CITY

Saurabh A7 Patiala

Mehak B6 Jalandhar

Sumiti D9 Ludhiana

Ria A5 Patiala
Table-2: Branch
BRANCH NAME BRANCH CITY

ABC Patiala

DEF Ludhiana

GHI Jalandhar
Table-3: Account

ACCOUNT NUMBER BRANCH NAME BALANCE

1111 ABC 50000

1112 DEF 10000

1113 GHI 9000

1114 ABC 7000


Table-4: Loan
LOAN NUMBER BRANCH NAME AMOUNT

L33 ABC 10000

L35 DEF 15000

L49 GHI 9000

L98 DEF 65000


Table-5: Borrower
CUSTOMER NAME LOAN NUMBER

Saurabh L33

Mehak L49

Ria L98
Table-6: Depositor
CUSTOMER NAME ACCOUNT NUMBER

Saurabh 1111

Mehak 1113

Sumiti 1114
Query-1: Find the loan number, branch and amount of loans of an amount greater than or equal to 10000.
{t | t ∈ loan ∧ t[amount] ≥ 10000}
Resulting relation:

LOAN NUMBER BRANCH NAME AMOUNT

L33 ABC 10000

L35 DEF 15000

L98 DEF 65000


In the above query, t is known as the tuple variable; t[amount] denotes the value of the attribute amount in tuple t.
Query-2: Find the loan number for each loan of an amount greater than or equal to 10000.
{t | ∃ s ∈ loan (t[loan-number] = s[loan-number] ∧ s[amount] ≥ 10000)}
Resulting relation:

LOAN NUMBER

L33

L35

L98

Query-3: Find the names of all customers who have a loan and an account at the bank.
{t | ∃ s ∈ borrower( t[customer-name] = s[customer-name])
∧ ∃ u ∈ depositor( t[customer-name] = u[customer-name])}
Resulting relation:

CUSTOMER NAME

Saurabh

Mehak

Query-4: Find the names of all customers having a loan at the “ABC” branch.
{t | ∃ s ∈ borrower(t[customer-name] = s[customer-name]
∧ ∃ u ∈ loan(u[branch-name] = “ABC” ∧ u[loan-number] = s[loan-number]))}
Resulting relation:

CUSTOMER NAME

Saurabh
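Tuple relational calculus maps closely onto Python comprehensions: the tuple variable becomes the loop variable and ∃ becomes any(). The sketch below re-expresses the four queries over the Loan, Borrower and Depositor tables; the dictionary keys are names chosen for this example.

```python
loan = [
    {"loan_number": "L33", "branch_name": "ABC", "amount": 10000},
    {"loan_number": "L35", "branch_name": "DEF", "amount": 15000},
    {"loan_number": "L49", "branch_name": "GHI", "amount": 9000},
    {"loan_number": "L98", "branch_name": "DEF", "amount": 65000},
]
borrower = [
    {"customer_name": "Saurabh", "loan_number": "L33"},
    {"customer_name": "Mehak", "loan_number": "L49"},
    {"customer_name": "Ria", "loan_number": "L98"},
]
depositor = [
    {"customer_name": "Saurabh", "account_number": 1111},
    {"customer_name": "Mehak", "account_number": 1113},
    {"customer_name": "Sumiti", "account_number": 1114},
]

# Query-1: whole loan tuples with amount >= 10000.
q1 = [t for t in loan if t["amount"] >= 10000]

# Query-2: only the loan numbers of those loans.
q2 = {s["loan_number"] for s in loan if s["amount"] >= 10000}

# Query-3: customers who appear in borrower and also in depositor.
q3 = {
    s["customer_name"]
    for s in borrower
    if any(u["customer_name"] == s["customer_name"] for u in depositor)
}

# Query-4: customers whose loan is held at branch "ABC".
q4 = {
    s["customer_name"]
    for s in borrower
    if any(
        u["branch_name"] == "ABC" and u["loan_number"] == s["loan_number"]
        for u in loan
    )
}

print(q2, q3, q4)
```

The comprehension states which tuples are wanted and leaves the iteration order to the interpreter, which is the non-procedural flavour of the calculus.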

2. Domain Relational Calculus (DRC)


o The second form of relational calculus is known as domain relational calculus. In domain relational calculus, the filtering variables range over the domains of attributes.
o Domain relational calculus uses the same operators as tuple calculus. It uses the logical connectives ∧ (and), ∨ (or) and ¬ (not).
o It uses the existential (∃) and universal (∀) quantifiers to bind the variables.

Notation:

{ a1, a2, a3, ..., an | P (a1, a2, a3, ..., an) }

Where

a1, a2, ..., an are domain variables, one for each attribute of the result

P stands for a formula built from these variables

Example:

Table-1: Customer
CUSTOMER NAME STREET CITY

Debomit Kadamtala Alipurduar

Sayantan Udaypur Balurghat

Soumya Nutanchati Bankura

Ritu Juhu Mumbai

Table-2: Loan
LOAN NUMBER BRANCH NAME AMOUNT

L01 Main 200

L03 Main 150

L10 Sub 90

L08 Main 60

Table-3: Borrower
CUSTOMER NAME LOAN NUMBER

Ritu L01

Debomit L08

Soumya L03
Query-1: Find the loan number, branch and amount of loans of an amount greater than or equal to 100.
{≺l, b, a≻ | ≺l, b, a≻ ∈ loan ∧ (a ≥ 100)}
Resulting relation:
LOAN NUMBER BRANCH NAME AMOUNT

L01 Main 200

L03 Main 150
Query-2: Find the loan number for each loan of an amount greater than or equal to 150.
{≺l≻ | ∃ b, a (≺l, b, a≻ ∈ loan ∧ (a ≥ 150))}
Resulting relation:

LOAN NUMBER

L01

L03
Query-3: Find the names of all customers having a loan at the “Main” branch and find the loan amount.
{≺c, a≻ | ∃ l (≺c, l≻ ∈ borrower ∧ ∃ b (≺l, b, a≻ ∈ loan ∧ (b = “Main”)))}
Resulting relation:

CUSTOMER NAME AMOUNT

Ritu 200

Debomit 60

Soumya 150
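In domain relational calculus the variables stand for attribute values rather than whole tuples, which corresponds to tuple unpacking in Python. The sketch below evaluates the three queries over the Loan and Borrower tables; the variable names follow the letters used in the formulas.

```python
loan = [                      # (loan number, branch name, amount)
    ("L01", "Main", 200),
    ("L03", "Main", 150),
    ("L10", "Sub", 90),
    ("L08", "Main", 60),
]
borrower = [                  # (customer name, loan number)
    ("Ritu", "L01"),
    ("Debomit", "L08"),
    ("Soumya", "L03"),
]

# Query-1: <l, b, a> in loan with a >= 100.
q1 = [(l, b, a) for (l, b, a) in loan if a >= 100]

# Query-2: l such that some b, a exist with <l, b, a> in loan and a >= 150.
q2 = [l for (l, b, a) in loan if a >= 150]

# Query-3: <c, a> where c borrowed loan l and l is held at the "Main" branch.
q3 = [
    (c, a)
    for (c, l) in borrower
    for (l2, b, a) in loan
    if l2 == l and b == "Main"
]

print(q1)
print(q2)
print(q3)
```

Loan L10 has an amount of 90, so it fails the condition a ≥ 100 and does not appear in the result of Query-1.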

Solution Set Unit I DBMS

1. Let R = (A, B, C) and let r1 and r2 both be relations on schema R. Give an expression in
tuple relational calculus and domain relational calculus that is equivalent to
a. Π A(r1)
b. σ B = 17 (r1)
c. r1 ∪ r2
d. r1 − r2
e. r1 ∩ r2
Solution:
In Tuple relational calculus
a. {t | ∃ q ∈ r1 (q[A] = t[A])}
b. {t | t ∈ r1 ∧ t[B] = 17}
c. {t | t ∈ r1 ∨ t ∈ r2}
d. {t | t ∈ r1 ∧ ¬(t ∈ r2)}
e. {t | t ∈ r1 ∧ t ∈ r2}

In Domain relational calculus


a. {< a > | ∃ b, c (< a, b, c > ∈ r1)}
b. {< a, b, c > | < a, b, c > ∈ r1 ∧ b = 17}
c. {< a, b, c > | < a, b, c > ∈ r1 ∨ < a, b, c > ∈ r2}
d. {< a, b, c > | < a, b, c > ∈ r1 ∧ ¬(< a, b, c > ∈ r2)}
e. {< a, b, c > | < a, b, c > ∈ r1 ∧ < a, b, c > ∈ r2}

2. Consider the following relational database and give the relational algebra for each of
the following.
Manages (Person_name, manager_name)
Company (Company_name, city)
Works (Person_name, company_name, salary)
Employee (Person_name, street, city)
Underlined columns are the primary keys.

1) Find the names of all employees who work for SBI.


Π person-name (σ company-name = “SBI” (works))

2) Find the names and cities of residence of all employees who work for SBI.
Π person-name, city (employee ⋈ (σ company-name = “SBI” (works)))

3) Find the names, street address, and cities of residence of all employees who work for
SBI and earn more than $10,000 per annum.
Π person-name, street, city (σ (company-name = “SBI” ∧ salary > 10000) (works ⋈ employee))

4) Find the names of all employees in this database who live in the same city as the
Company for which they work.
Π person-name (employee ⋈ works ⋈ company)

5) Find the names of all employees who do not work for SBI.
The following solutions assume that all people work for exactly one company. If one
allows people to appear in the database (e.g. in employee) but not appear in works, the
problem is more complicated.
Π person-name (σ company-name ≠ “SBI” (works))
If people may not work for any company:
Π person-name (employee) – Π person-name (σ (company-name = “SBI”) (works))

6) Find the names of all employees who earn more than every employee of SBI.
Π person-name (works) − (Π works.person-name (works ⋈ (works.salary ≤ works2.salary ∧ works2.company-name = “SBI”) ρ works2(works)))
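The expression for query 6 works by subtraction: first collect everyone who earns no more than some SBI employee, then remove them from the full list. A Python sketch with made-up rows for works (the names and salaries are hypothetical) shows the two steps.

```python
# Hypothetical rows of works: (person_name, company_name, salary).
works = [
    ("Asha", "SBI", 12000),
    ("Bala", "SBI", 15000),
    ("Chen", "HDFC", 15000),
    ("Dina", "HDFC", 20000),
]

everyone = {person for (person, company, salary) in works}

# People whose salary is less than or equal to that of at least one SBI employee.
not_above_some_sbi = {
    person
    for (person, company, salary) in works
    for (person2, company2, salary2) in works
    if company2 == "SBI" and salary <= salary2
}

# Whoever is left earns more than every SBI employee.
result = everyone - not_above_some_sbi
print(result)
```

The second copy of works in the comprehension plays the role of the renamed relation works2 in the algebra expression.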

3. Consider a database with the following schema:

Student(ssn, name, address, major)
Course(code, title)
Registered(ssn, code)
Solve the following queries with relational algebra.

1. List the codes of courses in which at least one student is registered :


 πcode ( Registered)

2. List the titles of registered courses

 Π title ( Course ⋈ Registered )

3. List the codes of courses for which no student is registered


 πcode ( Course ) - πcode ( Registered )

4. List the titles of courses for which no student is registered.

 Π title ( (πcode ( Course ) - πcode ( Registered )) ⋈ Course )

5. List the names of students and the titles of the courses they are registered in.

 πname, title ( Student ⋈ Registered ⋈ Course )

6. List the codes of courses in which all students are registered.

 πcode, ssn ( Registered ) ÷ πssn ( Student )
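The division in query 6 returns the course codes that are paired with every student ssn. A small Python sketch makes the definition concrete; the ssn and code values are made up, and the function name divide is hypothetical.

```python
# Hypothetical data: ssn values from Student and (ssn, code) pairs from Registered.
student_ssns = {"s1", "s2", "s3"}
registered = {
    ("s1", "DB"), ("s2", "DB"), ("s3", "DB"),
    ("s1", "OS"), ("s2", "OS"),
}

def divide(pairs, divisor):
    """Return every code that is paired with all of the values in divisor."""
    codes = {code for (_, code) in pairs}
    return {
        code
        for code in codes
        if all((ssn, code) in pairs for ssn in divisor)
    }

courses_with_all_students = divide(registered, student_ssns)
print(courses_with_all_students)
```

Course OS is excluded because student s3 is not registered in it, while DB is paired with all three students.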

4. Give classification of DBMS and explain using the following

i) Classification on the basis of number of users:


 Single user – As the name indicates, it supports only one user at a time. It is mostly used on a personal computer, where the data is accessible to a single person. The user may design and maintain the database and write the database programs.
 Multiple users – It supports multiple users concurrently. Data can be both integrated and shared. A database is integrated when the same information need not be recorded in two places. For example, a college keeps a database containing each student's information, and it must be accessible to all the departments that deal with the student, such as the library and the fee section. With an integrated database, the data resides in only one place and both departments still have access to it.

ii) Classification on the basis of site location.


 Centralized database system – The DBMS and the database are stored at a single site that is used by several other systems too. Put simply, the data is maintained on a centralized server.

 Parallel network database system – This system has the advantage of improved processing and input/output speeds. It is mainly used in applications that query large databases. It has multiple central processing units and data storage disks working in parallel.
 Distributed database system – The data and the DBMS software are distributed over several sites that are connected by a computer network.

 Distributed systems are further classified as:

1. Homogeneous DBMS – They use the same DBMS software at multiple sites, so data exchange between the sites can be handled easily. For example, library information systems by the same vendor, such as Geac Computer Corporation, use the same DBMS software, which allows exchanges between the various Geac library sites.
2. Heterogeneous DBMS – They use different DBMS software at different sites, with additional software that supports the exchange of data between the sites.
 Client-server database system – This system has two logical components, namely client and server. Clients are generally personal computers or workstations, whereas servers are large workstations, mini-range computers or mainframe computer systems. The applications and tools of the DBMS run on the client platforms, and the DBMS software runs on the server. The server and client computers are connected over the network. The applications and tools act as clients that send requests for services; the DBMS processes these requests and returns the results to the client. The server handles jobs that are common to many clients, such as database access and updates.
 Multi-tier client-server database system – The rise of personal computers in business and the increased reliability of network hardware led to the evolution of two-tier and three-tier systems, which use different software for the client and the server.

iii) Classification on the basis of Type and extent of use


 Online transaction processing (OLTP) DBMS – They manage the operational data. The database server must be able to process a large number of simple transactions per unit of time. Transactions are initiated in real time, simultaneously, by many users and applications, so the workload is a high volume of short, simple queries.
 Online analytical processing (OLAP) DBMS – They use the operational data for tactical and strategic decision making. They have a limited number of users, who work with huge amounts of data and complex queries.
 Big data and analytics DBMS – To cope with big data, new database technologies have been introduced. One such technology is NoSQL (not only SQL), which abandons the well-known relational database scheme.
 XML DBMS – two types:

1. Native XML DBMS – Uses the logical, intrinsic structure of the XML document.
2. Enabled XML DBMS – An existing DBMS with facilities to store XML data and structured data in an integrated way.
 Multimedia DBMS – Stores data such as text, images, audio, video and 3D games, which are usually stored as binary large objects (BLOBs).
 GIS DBMS – Stores and queries spatial data.
 Sensor DBMS – Manages sensor data, biometric data and telematics data.
 Mobile DBMS – Runs on smartphones and tablets. It handles local queries and supports self-management (no DBA).
 Open source DBMS – The code is publicly available and can be extended by anyone; popular for small business applications.
