
Module: 1 (DATABASE MANAGEMENT SYSTEM)

Introduction: An overview of database management system, database system vs. file system,
database system concept and architecture, data model schema and instances, data independence
and database language and interfaces, data definition language, DML, overall database
structure.
Data Modeling using the Entity Relationship Model: ER model concepts, notation for ER
diagram, mapping constraints, keys, concepts of super key, candidate key, primary key,
generalization, aggregation, reduction of ER diagrams to tables, extended ER model,
relationships of higher degree.

Introduction:

What is Data?
In simple words, data are facts related to any object under consideration. For example, your name,
age, height, weight, etc. are some data related to you. A picture, image, file, PDF, etc. can also be
considered data.

What is a Database?

A database is a systematic collection of data. Databases support storage and manipulation of data
and make data management easy. Let's discuss a few examples.
An online telephone directory would use a database to store data pertaining to people,
phone numbers, other contact details, etc.
Your electricity service provider most likely uses a database to manage billing, client-related
issues, fault data, etc.
Let's also consider Facebook. It needs to store, manipulate and present data related to
members, their friends, member activities, messages, advertisements and a lot more.

What is a Database Management System (DBMS)?

A Database Management System (DBMS) is a collection of programs that enables its users to
access a database, manipulate data, and report on or represent data.
It also helps to control access to the database.
Database Management Systems are not a new concept; they were first implemented in the
1960s.
Charles Bachman's Integrated Data Store (IDS) is said to be the first DBMS in history.
Over time, database technologies have evolved considerably, while the usage and expected
functionality of databases have increased immensely.

Types of DBMS

Let's see how the DBMS family evolved over time. The following diagram shows the
evolution of DBMS categories.

There are 4 major types of DBMS. Let's look into them in detail.
✓ Hierarchical - this type of DBMS employs the "parent-child" relationship for storing data.
This type of DBMS is rarely used nowadays. Its structure is like a tree, with nodes
representing records and branches representing the links between parent and child
records. The Windows Registry used in Windows XP is an example of a hierarchical
database. Configuration settings are stored as tree structures with nodes.
✓ Network DBMS - this type of DBMS supports many-to-many relationships. This usually
results in complex database structures. RDM Server is an example of a database
management system that implements the network model.
✓ Relational DBMS - this type of DBMS defines database relationships in the form of tables,
also known as relations. Unlike a network DBMS, an RDBMS does not represent many-to-many
relationships directly; they are modeled with an additional table that links the two
related tables. Relational DBMSs usually have pre-defined data types that they can
support. This is the most popular DBMS type in the market. Examples of relational
database management systems include MySQL, Oracle, and Microsoft SQL Server.
✓ Object-Oriented Relational DBMS - this type supports storage of new data types. The data
to be stored is in the form of objects. The objects to be stored in the database have attributes
(e.g. gender, age) and methods that define what to do with the data. PostgreSQL is an
example of an object-oriented relational DBMS.

Database System versus File System

DBMS | File Processing System
------------------------------------------------------------
Data redundancy is minimal | Data redundancy problem exists
Data inconsistency does not exist | Data inconsistency exists
Accessing the database is easier | Accessing data is comparatively difficult
The problem of data isolation is not found in a database | Data is scattered in various files, and files may be of different formats, so the data isolation problem exists
Transactions like insert, delete, view, update, etc. are possible in a database | In a file system, transactions are not possible
Concurrent access and recovery are possible in a database | Concurrent access and recovery are not possible
Security of data is good | Security of data is not good
A database manager (administrator) stores the relationships in the form of structural tables | A file manager is used to store all relationships in directories in file systems

Database Architecture

DBMS architecture helps in the design, development, implementation, and maintenance of a
database. A database stores critical information for a business. Selecting the correct database
architecture helps in quick and secure access to this data.

1 tier Architecture

Figure 1: 1-tier Architecture Diagram

The simplest database architecture is 1-tier, where the client, server, and database all reside
on the same machine. Any time you install a DB on your own system and access it to practise SQL
queries, you are using a 1-tier architecture. Such an architecture is rarely used in production.

2-tier DBMS Architecture

2-tier DBMS architecture includes an application layer between the user and the DBMS, which
is responsible for communicating the user's request to the database management system and then
sending the response from the DBMS back to the user.
An application interface known as ODBC (Open Database Connectivity) provides an API that
allows client-side programs to call the DBMS. Most DBMS vendors provide ODBC drivers for
their DBMS.

3-tier DBMS Architecture

3-tier DBMS architecture is the most commonly used architecture for web applications.

It is an extension of the 2-tier architecture. In the 2-tier architecture, we have an application layer
which can be accessed programmatically to perform various operations on the DBMS. The
application generally understands the database access language and passes end users'
requests to the DBMS.
In 3-tier architecture, an additional presentation or GUI layer is added, which provides a
graphical user interface for the end user to interact with the DBMS.
For the end user, the GUI layer is the database system; the end user has no knowledge of the
application layer or the DBMS.
If you have used MySQL, then you have probably seen phpMyAdmin, which is a familiar example
of a 3-tier DBMS architecture.

Types of Data Models

Data models are collections of concepts used to describe the structure of a database. The types
of data models are:

Hierarchical Database Model


This is the oldest form of database. This data model organizes the data in a tree structure, i.e.
each child node can have only one parent node, and at the top of the structure there is a single
root (parent) node.

In this model a database record is a tree that consists of one or more groupings of fields called
segments, which make up the individual nodes of the tree. This model uses one-to-many
relationships.

Advantage : Data access is quite predictable in structure, and retrieval and updates can be
highly optimized by a DBMS.

Disadvantage : The links are permanently established and cannot be modified, which makes this
model rigid.

Network Database Model

The network database model was developed as an alternative to the hierarchical database. This
model expands on the hierarchical model by providing multiple paths among segments, i.e. more
than one parent-child relationship. Hence this model allows one-to-one, one-to-many and
many-to-many relationships.
Although supporting multiple paths in the data structure eliminates some of the drawbacks of the
hierarchical model, the network model is not very practical.

Disadvantage : It can be quite complicated to maintain all the links.


Relational Database Model

The key difference between the previous database models and the relational database model is
flexibility.

A relational database represents all data in the database as simple two-dimensional tables called
relations. Each row of a relational table, called a tuple, represents a data entity, with the columns
of the table representing attributes (fields). The set of allowable values for an attribute is called
its domain.
Each row in a relational table must have a unique primary key. A table may also have secondary
keys (usually called foreign keys) which correspond to primary keys in other tables.

Advantage : Provides flexibility that allows changes to the database structure to be easily
accommodated. It facilitates multiple views of the same database for different users.

For example: the COLLEGE table has Batch_Year as its primary key and has the secondary keys
Student_ID and Course_ID; these keys serve as primary keys for the STUDENT and COURSE
tables.
Student Table

Student_ID  Student_Name
101         Shubham
102         Rajat

Course Table

Course_ID  Course_Name
14         Java
16         Android

College Table

Batch_Year  Student_ID  Course_ID  Teacher_Name  Teacher_Number
2012-16     101         14         Jack          9876543
2013-17     102         16         Tom           9823451
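
The three tables above can be tried out directly. The following is a minimal sketch using Python's built-in sqlite3 module with an in-memory database; the choice of SQLite, the lower-case table and column names, and the extra row used to trigger the key violation are assumptions made for illustration, not part of the original notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE student (
    student_id   INTEGER PRIMARY KEY,
    student_name TEXT NOT NULL
);
CREATE TABLE course (
    course_id   INTEGER PRIMARY KEY,
    course_name TEXT NOT NULL
);
CREATE TABLE college (
    batch_year     TEXT PRIMARY KEY,
    student_id     INTEGER NOT NULL REFERENCES student(student_id),
    course_id      INTEGER NOT NULL REFERENCES course(course_id),
    teacher_name   TEXT,
    teacher_number TEXT
);
""")

conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(101, "Shubham"), (102, "Rajat")])
conn.executemany("INSERT INTO course VALUES (?, ?)",
                 [(14, "Java"), (16, "Android")])
conn.executemany("INSERT INTO college VALUES (?, ?, ?, ?, ?)",
                 [("2012-16", 101, 14, "Jack", "9876543"),
                  ("2013-17", 102, 16, "Tom", "9823451")])

# Follow the secondary (foreign) keys from COLLEGE to STUDENT and COURSE.
rows = conn.execute("""
    SELECT c.batch_year, s.student_name, k.course_name, c.teacher_name
    FROM college AS c
    JOIN student AS s ON s.student_id = c.student_id
    JOIN course  AS k ON k.course_id  = c.course_id
    ORDER BY c.batch_year
""").fetchall()

# Hypothetical row: student 999 does not exist, so the DBMS should refuse it.
try:
    conn.execute(
        "INSERT INTO college VALUES ('2014-18', 999, 14, 'Ann', '9000000')")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

Here `rows` holds one joined record per COLLEGE row, and `fk_rejected` shows that a secondary key must match an existing primary key in the referenced table.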

Database Schema

A database schema is the skeleton structure that represents the logical view of the entire
database. It defines how the data is organized and how the relations among them are associated.
It formulates all the constraints that are to be applied on the data.
A database schema defines its entities and the relationships among them. It contains a descriptive
detail of the database, which can be depicted by means of schema diagrams. Database
designers design the schema to help programmers understand the database and make it
useful.

A database schema can be divided broadly into two categories −


• Physical Database Schema − This schema pertains to the actual storage of data and its
form of storage like files, indices, etc. It defines how the data will be stored in a
secondary storage.
• Logical Database Schema − This schema defines all the logical constraints that need to
be applied on the data stored. It defines tables, views, and integrity constraints.

Database Instance

• It is important to distinguish between a database schema and a database instance. A
database schema is the skeleton of the database. It is designed before the database exists
at all. Once the database is operational, it is very difficult to make any changes to the
schema. A database schema does not contain any data or information.
• A database instance is the state of an operational database, with its data, at a given time.
It is a snapshot of the database. Database instances tend to change with time. A
DBMS ensures that every instance (state) is valid by diligently enforcing all
the validations, constraints, and conditions that the database designers have imposed.
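
The difference between schema and instance can be observed in a few lines. This is a small sketch, assuming SQLite (through Python's sqlite3 module) as the DBMS and a hypothetical `student` table; `sqlite_master` is SQLite's own catalog table, so other systems expose the schema differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, student_name TEXT)")


def schema(conn):
    # The catalog stores the table definitions, never the rows themselves.
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'").fetchall()


def instance(conn):
    # The instance is whatever data the table holds right now.
    return conn.execute(
        "SELECT student_id, student_name FROM student ORDER BY student_id"
    ).fetchall()


schema_before = schema(conn)
instance_before = instance(conn)

conn.execute("INSERT INTO student VALUES (101, 'Shubham')")
conn.execute("INSERT INTO student VALUES (102, 'Rajat')")

schema_after = schema(conn)
instance_after = instance(conn)
```

After the inserts, `schema_after` is identical to `schema_before`, while `instance_after` differs from `instance_before`: the instance changed, the schema did not.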

Database Languages
A database system provides a Data Definition Language to specify the database schema and
a Data Manipulation Language to express database queries and updates.

Data-Definition Language

We specify a database schema by a set of definitions expressed in a special language called a
data-definition language (DDL).
In many DBMSs (for example Oracle and MySQL), DDL commands are auto-committed. That means
their changes are saved permanently in the database as soon as they run.
DDL deals with database schemas and descriptions of how the data should reside in the
database.
Examples of DDL are :
CREATE : creates a database or its objects (tables, views, etc.)
ALTER : alters the structure of an existing database object
DROP : deletes objects from the database
TRUNCATE : removes all records from a table, including all space allocated for the records
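
The DDL commands listed above can be sketched as follows. The example assumes SQLite through Python's sqlite3 module and a hypothetical `student` table. SQLite has no TRUNCATE statement, so only CREATE, ALTER and DROP are shown; the helper functions `columns` and `tables` are illustrative names, not part of any library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def columns(conn, table):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def tables(conn):
    return [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]


# CREATE: define a new table.
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, student_name TEXT)")
after_create = columns(conn, "student")

# ALTER: change the structure of the existing table.
conn.execute("ALTER TABLE student ADD COLUMN age INTEGER")
after_alter = columns(conn, "student")

# DROP: remove the table, its definition and its data.
conn.execute("DROP TABLE student")
after_drop = tables(conn)
```

Each statement changes the schema rather than the stored rows, which is what separates DDL from DML.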

Data-Manipulation Language
A data manipulation language (DML) is a language that enables users to access or manipulate
data as organized by the appropriate data model.
DML commands are not auto-committed. This means their changes are not permanent in the
database until committed, and they can be rolled back.
Data manipulation is the retrieval of information, insertion of new information, deletion of
information or modification of information stored in the database.
Examples of DML are :
SELECT : retrieves data from the database
INSERT : inserts data into a table
UPDATE : updates existing data in a table
DELETE : deletes records from a table (all records if no condition is given)
There are two types of DML :
Procedural DML : requires a user to specify what data are needed and how to get those data.
Non-Procedural DML : requires a user to specify what data are needed without specifying how
to get those data.
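
As an illustration of the four DML commands, here is a minimal sketch that again assumes SQLite via Python's sqlite3 module and a hypothetical `student` table; the updated name is made-up sample data. SQL is a non-procedural DML: each statement says what data is wanted, not how to fetch it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, student_name TEXT)")


def all_students(conn):
    # SELECT: retrieve data without saying how the rows should be located.
    return conn.execute(
        "SELECT student_id, student_name FROM student ORDER BY student_id"
    ).fetchall()


# INSERT: add new rows.
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(101, "Shubham"), (102, "Rajat")])
after_insert = all_students(conn)

# UPDATE: modify existing rows that match a condition.
conn.execute("UPDATE student SET student_name = ? WHERE student_id = ?",
             ("Rajat K", 102))
after_update = all_students(conn)

# DELETE with a condition removes only the matching rows ...
conn.execute("DELETE FROM student WHERE student_id = ?", (101,))
after_delete_one = all_students(conn)

# ... while DELETE without a condition removes every row in the table.
conn.execute("DELETE FROM student")
after_delete_all = all_students(conn)
```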

Transaction Control Language

TCL commands keep a check on other commands and their effect on the database.
These commands can annul changes made by other commands by rolling back to the original state.
They can also make changes permanent.
Examples of TCL are :
COMMIT : permanently saves the changes made in a transaction
ROLLBACK : undoes the changes made in a transaction
SAVEPOINT : marks a point within a transaction to which it can later be rolled back
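
The effect of COMMIT, ROLLBACK and SAVEPOINT can be sketched as below. The example assumes SQLite via Python's sqlite3 module; the `account` table and the savepoint name `before_second` are hypothetical. Passing `isolation_level=None` stops the module from opening transactions on its own, so the explicit statements are the only transaction control in play.

```python
import sqlite3

# isolation_level=None: the module issues no implicit BEGIN, so every
# transaction boundary below is written out explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")


def accounts(conn):
    return conn.execute(
        "SELECT id, balance FROM account ORDER BY id").fetchall()


# Transaction 1: two inserts, the second undone through a savepoint.
conn.execute("BEGIN")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("SAVEPOINT before_second")
conn.execute("INSERT INTO account VALUES (2, 200)")
conn.execute("ROLLBACK TO SAVEPOINT before_second")  # undoes only row 2
conn.execute("COMMIT")                               # row 1 becomes permanent
after_commit = accounts(conn)

# Transaction 2: an update that is rolled back completely.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = 0 WHERE id = 1")
conn.execute("ROLLBACK")
after_rollback = accounts(conn)
```

Both `after_commit` and `after_rollback` contain only the first account with its original balance: the savepoint discarded the second insert, and the full rollback discarded the update.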

DBMS Interfaces

A database management system (DBMS) interface is a user interface that allows users to submit
queries to a database without using the query language itself. User-friendly interfaces provided by a DBMS
may include the following:

Menu-Based Interfaces for Web Clients or Browsing –
These interfaces present the user with lists of options (called menus) that lead the user
through the formation of a request. The basic advantage of using menus is that they remove
the need to remember the specific commands and syntax of a query language; instead, the
query is composed step by step by picking options from a menu that is shown by the
system. Pull-down menus are a very popular technique in Web-based interfaces. They are
also often used in browsing interfaces, which allow a user to look through the contents of
a database in an exploratory and unstructured manner.
Forms-Based Interfaces –
A forms-based interface displays a form to each user. Users can fill out all of the form entries to
insert new data, or they can fill out only certain entries, in which case the DBMS will retrieve
matching data for the remaining entries. Forms of this type are usually designed and
programmed for users who have no technical expertise. Many DBMSs
have forms specification languages, which are special languages that help specify such forms.
Graphical User Interface –
A GUI typically displays a schema to the user in diagrammatic form. The user can then specify a
query by manipulating the diagram. In many cases, GUIs utilize both menus and forms. Most
GUIs use a pointing device, such as a mouse, to pick certain parts of the displayed schema diagram.
Natural language Interfaces –
These interfaces accept requests written in English or some other language and attempt to
understand them. A natural language interface has its own schema, which is similar to the
database conceptual schema, as well as a dictionary of important words.
The natural language interface refers to the words in its schema, as well as to the set of standard
words in a dictionary, to interpret the request. If the interpretation is successful, the interface
generates a high-level query corresponding to the natural language request and submits it to the DBMS
for processing; otherwise a dialogue is started with the user to clarify any provided condition or
request. The main disadvantage is that the capabilities of this type of interface are not
very advanced.

Overall Structure of DBMS

The DBMS acts as an interface between the user and the database. DBMSs are very large and are
typically divided into modules :-

DDL Compiler –
Converts DDL statements to a set of tables containing metadata stored in a data
dictionary. Metadata can include the names of files, data items, storage details of each
file, mapping information, constraints, etc.
DML Compiler and Query Optimizer –
The DML compiler translates Data Manipulation Language statements into query engine
instructions. It may also optimize the query.
The query processor/optimizer translates statements in a query language into low-level instructions
that the database manager understands, looking for an equivalent but more efficient form.
Data Manager –
• The data manager is the central software component of the DBMS. It is sometimes
referred to as the database control system.
• One of the functions of the data manager is to convert operations in the user's queries,
coming directly via the query processor or indirectly via an application program, from the
user's logical view to a physical file system.
• The data manager is responsible for interfacing with the file system. In addition,
the tasks of enforcing constraints to maintain the consistency and integrity of the data, as
well as its security, are also performed by the data manager.
• It is also the responsibility of the data manager to synchronize the
simultaneous operations performed by concurrent users and to maintain the backup and
recovery operations.
Data Dictionary –
A data dictionary is a repository of descriptions of the data in the database. It contains
a list of all files in the database, the number of records in each file, and the names and types of
each field. Most database management systems keep the data dictionary hidden from users to
prevent them from accidentally destroying its contents.
Functions of the Data Dictionary-
1. Defines the data elements.
2. Helps in scheduling.
3. Helps in control.
4. Lets the various users know which data is available and how it can be obtained.
5. Helps in the identification of organizational data irregularities.
6. Acts as a very essential data management tool.
7. Provides a good standardization mechanism.
8. Acts as the corporate glossary of the ever-growing information resource.
9. Provides the report facility, the control facility and the excerpt facility.
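
The kind of information a data dictionary holds (tables, number of records, field names and types) can be read from a real catalog. The sketch below assumes SQLite via Python's sqlite3 module; `sqlite_master` and `PRAGMA table_info` are SQLite-specific, and `data_dictionary` is a hypothetical helper written for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, student_name TEXT);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, course_name  TEXT);
INSERT INTO student VALUES (101, 'Shubham'), (102, 'Rajat');
INSERT INTO course  VALUES (14, 'Java');
""")


def data_dictionary(conn):
    """Return {table: {"records": n, "fields": [(name, type), ...]}}."""
    dictionary = {}
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for name in names:
        # table_info rows are (cid, name, type, notnull, dflt_value, pk).
        fields = [(col[1], col[2])
                  for col in conn.execute(f"PRAGMA table_info({name})")]
        count = conn.execute(f"SELECT COUNT(*) FROM {name}").fetchone()[0]
        dictionary[name] = {"records": count, "fields": fields}
    return dictionary


dd = data_dictionary(conn)
```

Note that the dictionary describes the data (metadata) and is built entirely from the catalog plus a row count; it never needs to read the stored values themselves.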

Data Files - These store the database itself.

Compiled DML - The DML compiler converts high-level queries into low-level file access
commands known as compiled DML.

End Users - End users are the people who interact with the database through applications or
utilities. The various categories of end users are:

1. Casual End Users - These users access the database occasionally but may need different
information each time. They use a sophisticated database query language to specify their requests.
For example: high-level managers who access the data weekly or biweekly.
2. Naive End Users - These users frequently query and update the database using standard
types of queries. The operations that can be performed by this class of users are very limited and
affect a precise portion of the database. For example: reservation clerks for airlines/hotels check
availability for a given request and make reservations. People using Automated Teller
Machines (ATMs) also fall under this category, as they have access to a limited portion of the database.
3. Standalone End Users/On-line End Users - These end users interact with the database
directly via an on-line terminal or indirectly through menu- or graphics-based interfaces.
Example: a library management system.

Data Modeling using the Entity Relationship Model:

ER Model Concepts
ER Diagram Notations:

The ER model defines the conceptual view of a database. It works around real-world entities and
the associations among them. At the view level, the ER model is considered a good option for
designing databases.

Entity
Entities are represented by means of rectangles. Rectangles are named with the entity set they
represent.

Attributes
Attributes are the properties of entities. Attributes are represented by means of ellipses. Every
ellipse represents one attribute and is directly connected to its entity (rectangle).
If the attributes are composite, they are further divided in a tree-like structure. Every node is then
connected to its parent attribute. That is, composite attributes are represented by ellipses that are
connected to another ellipse.

Multivalued attributes are depicted by a double ellipse.

Derived attributes are depicted by a dashed ellipse.

These attribute types can be combined as follows −
• simple single-valued attributes
• simple multi-valued attributes
• composite single-valued attributes
• composite multi-valued attributes

Entity-Set and Keys


A key is an attribute or collection of attributes that uniquely identifies an entity within an entity set.

For example, the roll_number of a student makes him/her identifiable among students.

• Super Key − A set of attributes (one or more) that collectively identifies an entity in an
entity set.
• Candidate Key − A minimal super key is called a candidate key. An entity set may have
more than one candidate key.
• Primary Key − A primary key is one of the candidate keys chosen by the database
designer to uniquely identify entities within the entity set.
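
The relationship between super keys and candidate keys can be checked mechanically on sample data. The following Python sketch uses a made-up student entity set (the `email` and `name` attributes and all values are assumptions). Strictly speaking, a key is a property of the schema, so a sample can only show that an attribute set is not a key; it cannot prove that one is.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, attributes):
    """Return the minimal super keys of this sample, smallest first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A set that contains an already-found key is not minimal.
            if any(set(key) <= set(attrs) for key in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


# Hypothetical sample data: two students share the same name.
students = [
    {"roll_number": 1, "email": "a@example.com", "name": "Asha"},
    {"roll_number": 2, "email": "b@example.com", "name": "Ravi"},
    {"roll_number": 3, "email": "c@example.com", "name": "Asha"},
]
attributes = ("roll_number", "email", "name")

keys = candidate_keys(students, attributes)
```

In this sample, `(roll_number, name)` is a super key but not a candidate key because `roll_number` alone already identifies each student; the designer would then pick one of the candidate keys in `keys` as the primary key.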

Relationship
The association among entities is called a relationship. For example, an employee works_at a
department, a student enrolls in a course. Here, Works_at and Enrolls are called relationships.

Relationship Set

A set of relationships of similar type is called a relationship set. Like entities, a relationship too
can have attributes. These attributes are called descriptive attributes.

Participation Constraints

• Total Participation − Each entity is involved in the relationship. Total participation is
represented by double lines.
• Partial participation − Not all entities are involved in the relationship. Partial
participation is represented by single lines.

Degree of Relationship
The number of participating entity sets in a relationship defines the degree of the relationship.
• Binary = degree 2
• Ternary = degree 3
• n-ary = degree n

Mapping Cardinalities (Mapping Constraints)

Cardinality defines the number of entities in one entity set that can be associated with
entities of another entity set via a relationship set.
• One-to-one − One entity from entity set A can be associated with at most one entity of
entity set B and vice versa.

• One-to-many − One entity from entity set A can be associated with more than one
entity of entity set B; however, an entity from entity set B can be associated with at most
one entity of entity set A.

• Many-to-one − More than one entity from entity set A can be associated with at most
one entity of entity set B; however, an entity from entity set B can be associated with
more than one entity from entity set A.
• Many-to-many − One entity from A can be associated with more than one entity from B
and vice versa.
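
The four cases above can be told apart by counting, for each entity, how many partners it has on the other side. The Python sketch below classifies a relationship set given as pairs of identifiers; the function name and the sample identifiers are hypothetical. It describes what one particular set of pairs exhibits, whereas a mapping cardinality in a design is a constraint on all possible instances.

```python
from collections import defaultdict


def mapping_cardinality(pairs):
    """Classify a relationship set given as (a, b) pairs of entity identifiers."""
    b_per_a = defaultdict(set)  # partners in B for each entity of A
    a_per_b = defaultdict(set)  # partners in A for each entity of B
    for a, b in pairs:
        b_per_a[a].add(b)
        a_per_b[b].add(a)

    # Does some A entity have several B partners, and the other way round?
    a_has_many = any(len(bs) > 1 for bs in b_per_a.values())
    b_has_many = any(len(as_) > 1 for as_ in a_per_b.values())

    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"


# Hypothetical samples: A is the first element of each pair, B the second.
one_to_one = mapping_cardinality([("p1", "passport1"), ("p2", "passport2")])
one_to_many = mapping_cardinality([("dept1", "emp1"), ("dept1", "emp2")])
many_to_one = mapping_cardinality([("stu1", "course1"), ("stu2", "course1")])
many_to_many = mapping_cardinality(
    [("stu1", "club1"), ("stu1", "club2"), ("stu2", "club1")])
```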

What is an Enhanced ER Diagram?


Enhanced entity-relationship models, also known as extended entity-relationship models, are
advanced database diagrams very similar to regular ER diagrams. Enhanced ERDs are high-level
models that represent the requirements and complexities of complex databases.
EER is a high-level data model that incorporates the extensions to the original ER model.

It is a diagrammatic technique for displaying the following concepts:

• Sub Class and Super Class
• Specialization and Generalization
• Union or Category
• Aggregation

These concepts are used in an EER schema, and the resulting schema diagrams are called
EER diagrams.

Features of EER Model
• EER creates a design that is more accurate to database schemas.
• It reflects the data properties and constraints more precisely.
• It includes all modeling concepts of the ER model.
• Its diagrammatic technique helps in displaying the EER schema.
• It includes the concepts of specialization and generalization.
• It can represent a collection of objects that is the union of objects of
different entity types.

A. Sub Class and Super Class
• The sub class and super class relationship leads to the concept of inheritance.
• The relationship between sub class and super class is denoted with a symbol.
1. Super Class
• A super class is an entity type that has a relationship with one or more subtypes.
• An entity cannot exist in the database merely by being a member of a sub class; it must
also be a member of the super class.
For example: the Shape super class has the sub groups Square, Circle, and Triangle.
2. Sub Class
• A sub class is a group of entities with unique attributes.
• A sub class inherits properties and attributes from its super class.
For example: Square, Circle, and Triangle are sub classes of the Shape super class.

B. Specialization and Generalization

1. Generalization
Generalization is the process of combining entities into a generalized entity that contains the
properties common to all of them.
It is a bottom-up approach, in which two or more lower-level entities combine to form a higher-level entity.
Generalization is the reverse process of specialization.
It defines a general entity type from a set of specialized entity types.
It minimizes the difference between the entities by identifying their common features.
For example, Tiger, Lion, and Elephant can all be generalized as Animal.
2. Specialization

Specialization is a process in which a group of entities is divided into sub groups based on
their characteristics.
It is a top-down approach, in which one higher-level entity can be broken down into two or more
lower-level entities.
It maximizes the difference between the members of an entity by identifying the unique
characteristics or attributes of each member.
It defines one or more sub classes for the super class and also forms the superclass/subclass
relationship.
For example, Employee can be specialized as Developer or Tester, based on what role
they play in an organization.
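
The Employee example maps naturally onto class inheritance in a programming language. The Python sketch below is only an analogy for the superclass/subclass relationship; the attribute names (`emp_id`, `language`, `tool`) and the sample values are assumptions, not part of the original example.

```python
from dataclasses import dataclass


@dataclass
class Employee:             # super class: attributes shared by all employees
    emp_id: int
    name: str


@dataclass
class Developer(Employee):  # sub class: inherits emp_id and name
    language: str           # attribute specific to developers


@dataclass
class Tester(Employee):     # sub class: inherits emp_id and name
    tool: str               # attribute specific to testers


dev = Developer(emp_id=1, name="Asha", language="Java")
tester = Tester(emp_id=2, name="Ravi", tool="Selenium")

# Every member of a sub class is also a member of the super class.
everyone_is_employee = all(isinstance(e, Employee) for e in (dev, tester))
```

Reading the hierarchy top-down (Employee split into Developer and Tester) is specialization; reading it bottom-up (Developer and Tester merged into Employee) is generalization.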

C. Category or Union

A category represents a single super class/sub class relationship with more than one super class.
It can have total or partial participation.
For example, in car booking, a car owner can be a person, a bank (which holds a lien on a car) or a
company. The category (sub class) Owner is a subset of the union of the three super classes
Company, Bank, and Person. A category member must exist in at least one of its super classes.

D. Aggregation

Aggregation is a process that represents a relationship between a whole object and its component
parts.
It abstracts a relationship between objects and views the relationship as an object.
In this process, a relationship between two entities is treated as a single entity.

For example, the relationship between College and Course can act as an entity in a relationship
with Student.

Reduce ER Diagram into Tables


An Entity Relationship (ER) diagram is a diagrammatic representation of the data in a database; it shows
how the data is related.
Note: This section is for those who already know what an ER diagram is and how to draw one.

For example, a student can be enrolled in only one course, but a course can have many
students enrolled in it.

For Student(SID, Name), SID is the primary key. For Course(CID, C_name), CID is the
primary key.
Student          Course             Enroll
(SID, Name)      (CID, C_name)      (SID, CID)
--------------   -----------------  ---------------
Now the question is: what should be the primary key for Enroll: SID, CID, or both combined? We
can't have CID as the primary key because, in Enroll, the same CID appears with multiple SIDs.
(SID, CID) identifies each row uniquely, but it is not minimal. Since each student is enrolled in
only one course, SID alone identifies each row, so SID is the primary key for the relation Enroll.
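
The choice of SID as the primary key of Enroll can be tried out as follows. This is a sketch under assumptions: SQLite via Python's sqlite3 module, lower-case table names, and made-up sample rows. It shows that many students may share a course while a second enrollment for the same student is refused.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE student (sid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course  (cid INTEGER PRIMARY KEY, c_name TEXT);
CREATE TABLE enroll (
    sid INTEGER PRIMARY KEY REFERENCES student(sid),  -- one row per student
    cid INTEGER NOT NULL    REFERENCES course(cid)    -- cid may repeat
);
INSERT INTO student VALUES (1, 'Shubham'), (2, 'Rajat');
INSERT INTO course  VALUES (14, 'Java'), (16, 'Android');
-- Two students in the same course: allowed, since cid is not a key.
INSERT INTO enroll  VALUES (1, 14), (2, 14);
""")

# A second course for student 1 would repeat the primary key sid = 1.
try:
    conn.execute("INSERT INTO enroll VALUES (1, 16)")
    second_enrollment_rejected = False
except sqlite3.IntegrityError:
    second_enrollment_rejected = True

enrollments = conn.execute(
    "SELECT sid, cid FROM enroll ORDER BY sid").fetchall()
```

Because Enroll has the same primary key as Student, a common design choice in this one-course-per-student case is to merge Enroll into Student by adding CID as a column there.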

Degrees of Relationship (Cardinality)

The degree of relationship (also known as cardinality) is the number of occurrences in one entity
which are associated (or linked) to the number of occurrences in another.
There are four degrees of relationship, known as:
• one-to-one (1:1)
• one-to-many (1:N)
• many-to-one (N:1)
• many-to-many (M:N)
Note that the last one is written M:N, not M:M.

Binary Relationship and Cardinality

A relationship in which two entities participate is called a binary relationship. Cardinality is
the number of instances of an entity that can be associated with the relationship.
One-to-one − When only one instance of an entity is associated with the relationship, it is
marked as '1:1'. The following image reflects that only one instance of each entity should be
associated with the relationship. It depicts a one-to-one relationship.

One-to-many − When more than one instance of an entity is associated with a relationship, it is
marked as '1:N'. The following image reflects that only one instance of the entity on the left and
more than one instance of the entity on the right can be associated with the relationship. It depicts
a one-to-many relationship.

Many-to-one − When more than one instance of an entity is associated with the relationship, it is
marked as 'N:1'. The following image reflects that more than one instance of the entity on the left
and only one instance of the entity on the right can be associated with the relationship. It depicts
a many-to-one relationship.

Many-to-many − The following image reflects that more than one instance of the entity on the
left and more than one instance of the entity on the right can be associated with the relationship. It
depicts a many-to-many relationship.
