
Module I: Introduction to DBMS
Contents
Definition of DBMS
Data Independence
DBMS Architecture, Levels
Database Administrator
File System Approach Vs DBMS Approach
Advantages of Using a DBMS
Data Models, Schemas, and Instances
What is a database?
• To understand a database, we must first understand data: "data means known facts that can be
recorded and that have implicit meaning."
• A "database is a collection or group of inter-related data."
• A database represents some aspect of the real world.
• For example, consider the data of a bank account. Here, all the data (id, name, address, contact no.) are
interrelated, so together they represent one particular customer.
• So, a database is an organized collection of related data.
What is a database management system?

• A database management system (DBMS) is a collection of inter-related data and a set of
programs to manipulate those data.
• A database management system stores data in such a way that it becomes easier to
retrieve, manipulate, and produce information.
• Data manipulation consists of various operations such as storing, modifying, removing,
and retrieving data.
• So, DBMS = Database + Set of programs
• It also provides security and protection against system crashes.
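The four manipulation operations (store, modify, remove, retrieve) can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: SQLite stands in for any DBMS, and the customer table, its columns, and the sample names are hypothetical.

```python
import sqlite3

# In-memory database; SQLite is used here only as a stand-in for a DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cid INTEGER PRIMARY KEY, cname TEXT)")

# Store data
conn.execute("INSERT INTO customer (cid, cname) VALUES (?, ?)", (1, "Asha"))
conn.execute("INSERT INTO customer (cid, cname) VALUES (?, ?)", (2, "Ravi"))

# Modify data
conn.execute("UPDATE customer SET cname = ? WHERE cid = ?", ("Asha K", 1))

# Remove data
conn.execute("DELETE FROM customer WHERE cid = ?", (2,))

# Retrieve data
rows = conn.execute("SELECT cid, cname FROM customer ORDER BY cid").fetchall()
print(rows)
```

The program never opens or parses a file itself; it only issues requests, and the "set of programs" inside the DBMS does the storage work.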
What is a database environment?

• A database system provides basic functionalities such as storage, manipulation, and use of
data.
• It has four major components, which together form the database
system environment.
– Data
– Hardware
– Software
– Users
1. Data
It is the most important component of a database system.
Data means known facts which can be recorded and have implicit meaning.
2. Hardware
All physical components of a computer are referred to as hardware,
such as memory, mouse, hard disk, printer, etc.
3. Software
Software provides the interface between users and the database stored on physical devices.
The operating system manages all hardware of the computer.
The DBMS software and the operating system form the software component here.
4. Users
Any person who interacts with a database in any form is considered a database user.
Categories of database users:
Database administrator
Database designers
Application programmers
End users
Various Definitions related to database
What is Data?
• Data means known facts which can be recorded and have implicit meaning. Data is also a collection of
facts and figures.

What is Information?
• Information means processed or organized data, which is derived from data and facts.

What is Data-Item (field)?
• It is a character or group of characters that has a specific meaning.
• For example, cid and cname from a customer table.

What is a record?
• It is a collection of logically related fields.
• A record consists of a value for each field.

What is a file?
• It is a collection of related records, arranged in a specific sequence.
What is metadata?
Metadata is a set of data that describes and gives information about other data.
In other words, data about data is called metadata.
An example of metadata is given in the table below.

name          type      size
no            NUMBER    3
student name  VARCHAR2  20

What is System Catalog?
• The system catalog is a collection of tables and views that contain important
information about a database. A system catalog is available for each database.
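As a rough illustration of metadata and a system catalog, the sketch below asks SQLite to describe a table instead of returning its rows. It assumes SQLite as the DBMS: sqlite_master is SQLite's own catalog table and PRAGMA table_info is SQLite-specific, so other systems expose this information differently. The student table mirrors the metadata example above, with student_name as an assumed spelling of the second column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "no" is quoted because NO is an SQL keyword.
conn.execute('CREATE TABLE student ("no" INTEGER, student_name VARCHAR(20))')

# Catalog lookup: which tables does this database contain?
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# Metadata lookup: each row is (cid, name, type, notnull, dflt_value, pk).
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")
]

print(tables)
print(columns)
```

No student rows were inserted, yet both queries return results, because they read data about the data, not the data itself.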
What is data warehouse?
•A data warehouse is a decision support database that is maintained separately from the
organization's operational database.
What is a data dictionary?
• A data dictionary is a file that contains metadata and is usually part of the system catalog.
• It has the following four components:
• Entities, Attributes, Relationships, Keys
DBMS Architecture
DBMS architecture involves schemas, sub-schemas, and instances.
Schemas:
A schema is the overall logical design of the database.
Sub-Schemas:
A sub-schema is a subset of the schema and inherits the same properties that the
schema has. The plan (or scheme) for a view is often called a sub-schema.
Instances:
An instance is the collection of information stored in the database at a particular moment.
Three-Schema Architecture
• Proposed to support the following DBMS characteristics:
• Program-data independence.
• Support of multiple views of the data.
1. Internal level:
• This is the lowest level of data abstraction.
• It describes how the data are actually stored on storage devices.
• It is also known as the physical level.
• It provides an internal view of the physical storage of data.
• It deals with complex low-level data structures, file structures, and access methods in detail.
• It also deals with data compression and encryption techniques, if used.
2. Conceptual level:
• This is the next higher level of data abstraction above the internal level.
• It describes what data are stored in the database and what relationships exist among those data.
• It is also known as the logical level.
• It hides the low-level complexities of physical storage.
• Database administrators and designers work at this level to determine what data to keep in the
database.
• Application developers also work at this level.
3. External Level:
• This is the highest level of data abstraction.
• It describes only the part of the entire database that concerns a particular end user.
• It is also known as the view level.
• End users need to access only part of the database rather than the entire database.
• Different users need different views of the database, so there can be many view-level abstractions
of the same database.
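The relationship between the conceptual and external levels can be sketched with SQL views. This is a minimal example assuming SQLite; the customer table, its columns, and the two view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the full logical table.
conn.execute(
    "CREATE TABLE customer ("
    "cid INTEGER PRIMARY KEY, cname TEXT, address TEXT, balance REAL)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Asha', 'Pune', 2500.0)")

# External level: two views, each exposing a different part of the same data.
conn.execute(
    "CREATE VIEW customer_contact AS SELECT cid, cname, address FROM customer"
)
conn.execute(
    "CREATE VIEW customer_balance AS SELECT cid, balance FROM customer"
)

contact_view = conn.execute("SELECT * FROM customer_contact").fetchall()
balance_view = conn.execute("SELECT * FROM customer_balance").fetchall()
print(contact_view)
print(balance_view)
```

A user given only customer_contact never sees the balance column, even though both views are backed by the same stored rows.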
Advantages of Three-Schema Architecture:
• Its main objective is to provide data abstraction.
• The same data can be accessed by different users through different customized views.
• The user is not concerned with the physical data storage details.
• The physical storage structure can be changed without requiring changes to the conceptual
structure of the database or to users' views.
• The conceptual structure of the database can be changed without affecting end users.

Data Independence
Data independence is the ability to modify a schema definition at one level without
affecting a schema definition at the next higher level.
There are two levels of data independence:
1. Physical Data Independence
2. Logical Data Independence
1. Physical Data Independence:
• Physical data independence is the ability to modify the physical schema without requiring any change
in application programs.
• Modifications at the internal level are occasionally necessary to improve performance. Possible
modifications at the internal level include changes in file structures, compression techniques, hashing
algorithms, storage devices, etc.
• Physical data independence separates the conceptual level from the internal level.
• This makes it possible to provide a logical description of the database without the need to specify physical
structures.
• Comparatively, it is easy to achieve physical data independence.
2. Logical Data Independence:
• Logical data independence is the ability to modify the conceptual schema without requiring any change in
application programs.
• Modifications at the logical level are necessary whenever the logical structure of the database is
altered.
• Logical data independence separates the external level from the conceptual level.
• Comparatively, it is difficult to achieve logical data independence.
• Application programs are heavily dependent on the logical structure of the data they access, so a
change in logical structure often requires programs to change as well.
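Both kinds of data independence can be sketched in a few lines. This assumes SQLite, and every table, column, index, and view name is hypothetical. In the first half an index is added (a physical change) and the application query is untouched. In the second half the base table is replaced by one with a renamed column (a logical change), and a redefined view keeps the application query working.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cid INTEGER PRIMARY KEY, cname TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ravi", "Surat")],
)

# Physical data independence: the same query text runs before and after
# the access path changes.
APP_QUERY = "SELECT cname FROM customer WHERE city = 'Pune'"
before_index = conn.execute(APP_QUERY).fetchall()
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")
after_index = conn.execute(APP_QUERY).fetchall()

# Logical data independence: the application reads through a view.
VIEW_QUERY = "SELECT cid, cname FROM customer_names ORDER BY cid"
conn.execute("CREATE VIEW customer_names AS SELECT cid, cname FROM customer")
view_before = conn.execute(VIEW_QUERY).fetchall()

# The conceptual schema changes: new table, column cname becomes full_name.
conn.execute(
    "CREATE TABLE customer_v2 (cid INTEGER PRIMARY KEY, full_name TEXT, city TEXT)"
)
conn.execute("INSERT INTO customer_v2 SELECT cid, cname, city FROM customer")

# Only the external/conceptual mapping (the view definition) is updated.
conn.execute("DROP VIEW customer_names")
conn.execute(
    "CREATE VIEW customer_names AS "
    "SELECT cid, full_name AS cname FROM customer_v2"
)
view_after = conn.execute(VIEW_QUERY).fetchall()
```

The second half needed more work than the first, which matches the point that logical data independence is the harder of the two to achieve: a program that queried the customer table directly would have had to change.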
Mappings
The process of transforming requests and results between the three levels is called mapping.

There are two types of mappings:

Conceptual/Internal Mapping
External/Conceptual Mapping

1. Conceptual/Internal Mapping:
The conceptual/internal mapping defines the correspondence between the conceptual view
and the stored database.
It specifies how conceptual records and fields are represented at the internal level.
It relates the conceptual schema to the internal schema.
If the structure of the stored database is changed, that is, if a change is made to the
storage structure definition, then the conceptual/internal mapping
must be changed accordingly, so that the conceptual schema can remain invariant.
There can be only one mapping between the conceptual and internal levels.
2. External/Conceptual Mapping:
The external/conceptual mapping defines the correspondence between a particular
external view and the conceptual view.
It relates each external schema to the conceptual schema.
The differences that can exist between these two levels are analogous to those that can
exist between the conceptual view and the stored database.
Example: fields can have different data types; field and record names can be changed;
several conceptual fields can be combined into a single external field.
Any number of external views can exist at the same time; any number of users can share
a given external view; different external views can overlap.
There can be several mappings between the external and conceptual levels.
Database Administrator (DBA):
"The person in the organization who controls the design and the use of the database is referred to as the
database administrator."
The DBA provides the necessary technical support for implementing a database.
The DBA works across the design, development, testing, and operational phases.
The DBA is a technical person with knowledge of database technology.
The DBA does not need to be a business person. In short, the DBA is a technically focused
person but should understand the business well enough to administer the database
effectively.
Functions and responsibilities of DBAs
1. Schema Definition:
The DBA defines the logical schema of the database. A schema refers to the overall logical structure of the database.
According to this schema, the database will be developed to store the required data for an organization.
2. Storage Structure and Access Method Definition:
The DBA decides how the data is to be represented in the stored database.
3. Assisting Application Programmers:
The DBA provides assistance to application programmers to develop application programs.
4. Physical Organization Modification:
The DBA modifies the physical organization of the database to reflect the changing needs of the organization or to
improve performance.
5. Approving Data Access:
The DBA determines which users need access to which parts of the database.
Accordingly, various types of authorizations are granted to different users.
6. Monitoring Performance:
The DBA monitors the performance of the system. The DBA ensures that good performance is maintained by making
changes to the physical or logical schema if required.
7. Backup and Recovery:
The database should not be lost or damaged.
The DBA ensures this by periodically backing up the database to magnetic tapes or remote servers. In case of a failure, such
as a virus attack, the database is recovered from this backup.
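The backup and recovery responsibility can be sketched as follows. This is a toy illustration, not a production procedure: it assumes SQLite and uses the Connection.backup method of Python's sqlite3 module (available since Python 3.7). Both databases live in memory, and the "failure" is simulated by deleting rows.

```python
import sqlite3

# The live, operational database (hypothetical table and data).
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE customer (cid INTEGER PRIMARY KEY, cname TEXT)")
live.execute("INSERT INTO customer VALUES (1, 'Asha')")
live.commit()

# Periodic backup: copy the live database into a separate backup database.
backup = sqlite3.connect(":memory:")
live.backup(backup)

# Simulated failure: the live data is damaged after the backup was taken.
live.execute("DELETE FROM customer")
live.commit()
damaged = live.execute("SELECT cid, cname FROM customer").fetchall()

# Recovery: copy the backup over the damaged database.
backup.backup(live)
recovered = live.execute("SELECT cid, cname FROM customer").fetchall()
print(damaged, recovered)
```

Anything written to the live database after the last backup would be lost by this kind of restore, which is why the frequency of backups matters.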
Advantages of File-oriented system:
1. Backup:
It is possible to take faster and automatic backups of a database stored in files on computer-based
systems.
Computer systems provide functionalities to serve this purpose. It is also possible to develop a specific
application program for this purpose.
2. Compactness:
It is possible to store data compactly.
3. Data Retrieval:
Computer-based systems provide enhanced data retrieval techniques to retrieve data stored in files in
an easy and efficient way.
4. Editing:
It is easy to edit any information stored on computers in the form of files.
Specific application programs or editing software can be used for this purpose.
5. Remote Access:
In computer-based systems, it is possible to access data remotely.
So, to access data, a user does not need to be present at the location where the data are kept.
6. Sharing:
Data stored in files on computer-based systems can be shared among multiple users at the same time.
Disadvantages of File-oriented system:
1. Data Redundancy:
The same information may be duplicated in different files. This
data redundancy results in wasted storage.
2. Data Inconsistency:
Because of data redundancy, the data may not be in a consistent state.
3. Difficulty in Accessing Data:
Accessing data is not convenient or efficient in a file processing system.
4. Limited Data Sharing:
Data are scattered across various files. Different files may also have different formats, and
these files may be stored in different folders, possibly belonging to different departments.
So, due to this data isolation, it is difficult to share data among different applications.
5. Integrity Problems:
Data integrity means that the data contained in the database are both correct and
consistent. For this purpose, the data stored in the database must satisfy certain
constraints, which are hard to enforce when the checks are scattered across application programs.
6. Atomicity Problems:
Any operation on a database must be atomic.
This means it must happen in its entirety or not at all, which is difficult to ensure in a file processing system.
7. Concurrent Access Anomalies:
Multiple users are allowed to access data simultaneously for the sake of better
performance and faster response. Without supervision, concurrent updates can leave the data inconsistent.
8. Security Problems:
A database should be accessible to users only in a limited way.
Each user should be allowed to access only the data relevant to their requirements, which is hard to enforce in a file processing system.
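To make the integrity and atomicity points concrete, here is a minimal sketch of how a DBMS handles them, assuming SQLite through Python's sqlite3 module. The account table, the CHECK constraint, and the transfer function are hypothetical. The constraint states the integrity rule once, inside the database, and the transaction makes the two updates of a transfer succeed or fail together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Integrity constraint declared once in the schema: no negative balances.
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)       # both updates are committed
failed = transfer(conn, 1, 2, 500)  # debit violates CHECK; credit is undone too
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
print(ok, failed, balances)
```

In the failed transfer the credit to account 2 had already been applied when the debit was rejected. The rollback removes it, so no money appears from nowhere. A program working on plain files would have to implement that undo logic itself.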
Advantages of DBMS
1. Improved data sharing:
The DBMS helps create an environment in which end users have better access to more and
better-managed data.
Such access makes it possible for end users to respond quickly to changes in their environment.
2. Improved data security:
The more users access the data, the greater the risk of data security breaches. Corporations
invest considerable amounts of time, effort, and money to ensure that corporate data are used
properly.
A DBMS provides a framework for better enforcement of data privacy and security policies.
3. Better data integration:
Wider access to well-managed data promotes an integrated view of the organization’s
operations and a clearer view of the big picture.
It becomes much easier to see how actions in one segment of the company affect other
segments.
4. Minimized data inconsistency:
Data inconsistency exists when different versions of the same data appear in different places.
The probability of data inconsistency is greatly reduced in a properly designed database.
5. Improved data access:
The DBMS makes it possible to produce quick answers to ad hoc queries.
From a database perspective, a query is a specific request issued to the DBMS for data
manipulation—for example, to read or update the data. Simply put, a query is a question, and an
ad hoc query is a spur-of-the-moment question.
The DBMS sends back an answer (called the query result set) to the application.
For example, end users
6. Improved decision making:
Better-managed data and improved data access make it possible to generate better-quality
information, on which better decisions are based.
The quality of the information generated depends on the quality of the underlying data.
Data quality is a comprehensive approach to promoting the accuracy, validity, and timeliness of the
data. While the DBMS does not guarantee data quality, it provides a framework to facilitate data
quality initiatives.
7. Increased end-user productivity:
The availability of data, combined with the tools that transform data into usable information,
empowers end users to make quick, informed decisions that can make the difference between
success and failure in the global economy.
Disadvantages of DBMS
1. Increased costs:
Database systems require sophisticated hardware and software and highly skilled personnel.
The cost of maintaining the hardware, software, and personnel required to operate and manage a
database system can be substantial. Training, licensing, and regulation compliance costs are often
overlooked when database systems are implemented.
2. Management complexity:
Database systems interface with many different technologies and have a significant impact on a
company’s resources and culture.
The changes introduced by the adoption of a database system must be properly managed to ensure that
they help advance the company’s objectives. Given the fact that database systems hold crucial company
data that are accessed from multiple sources, security issues must be assessed constantly.
3. Maintaining currency:
To maximize the efficiency of the database system, you must keep your system current.
Therefore, you must perform frequent updates and apply the latest patches and security measures to all
components.
Because database technology advances rapidly, personnel training costs tend to be significant.
4. Vendor dependence:
Given the heavy investment in technology and personnel training, companies might be reluctant to change
database vendors.
5. Frequent upgrade/replacement cycles:
DBMS vendors frequently upgrade their products by adding new functionality. Such new
features often come bundled in new upgrade versions of the software.
Some of these versions require hardware upgrades. Not only do the upgrades
themselves cost money, but it also costs money to train database users and
administrators to properly use and manage the new features.
Main Characteristics of the
Database Approach
• Self-describing nature of a database system:
A DBMS catalog stores the description of the
database (the description is called metadata).
This allows the DBMS software to work
with different databases.
• Insulation between programs and data: Called
program-data independence. Allows
changing data storage structures and
operations without having to change the
DBMS access programs.
• Data Abstraction: A data model is used to
hide storage details and present the users
with a conceptual view of the database.
• Support of multiple views of the data: Each
user may see a different view of the
database, which describes only the data of
interest to that user.
• Sharing of data and multiuser transaction
processing: Allows a set of concurrent users
to retrieve and to update the database.
Concurrency control within the DBMS
guarantees that each transaction is correctly
executed or completely aborted. OLTP (Online
Transaction Processing) is a major part of
database applications.
Data Models
• Data Model: A set of concepts to describe the
structure of a database, and certain constraints
that the database should obey.
• Data Model Operations: Operations for
specifying database retrievals and updates by
referring to the concepts of the data model.
Operations on the data model may include basic
operations and user-defined operations.
Categories of data models
• Conceptual (high-level, semantic) data
models: Provide concepts that are close to the
way many users perceive data. (Also called
entity-based or object-based data models.)
• Physical (low-level, internal) data models:
Provide concepts that describe details of how
data is stored in the computer.
• Implementation (representational) data
models: Provide concepts that fall between the
above two, balancing user views with some
computer storage details.
• The three most widely used record-based models are:
• Relational model:
• The relational model uses a collection of tables to represent both data and the relationships
among those data.
• Each table has multiple columns, and each column has a unique name.
• Network model:
• Data in the network model are represented by collections of records, and relationships among data
are represented by links, which can be viewed as pointers.
• The records in the database are organized as collections of arbitrary graphs.
• Advantages of the network model:
• Conceptual Simplicity
• Ease of data access
• Data Integrity and capability to handle more relationship types
• Data independence
• Database standards
• Hierarchical model:
• In the hierarchical model, the data and relationships among the data are represented by records and
links.
• It is similar to the network model but differs in that records are organized as collections of trees
rather than graphs.
• Advantages of the hierarchical model:
• Simplicity
• Data Security and Data Integrity
• Efficiency
• Disadvantages of the hierarchical model:
• Implementation Complexity
• Lack of structural independence
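The structural difference between the three record-based models can be sketched with plain Python data structures. This is a toy illustration only: real hierarchical and network DBMSs use their own record and link formats, and the record names here are hypothetical.

```python
# Hierarchical model: every record has at most one parent, so records form a tree.
parent_of = {"Sales": None, "Asha": "Sales", "Ravi": "Sales"}


def path_to_root(record, parent_of):
    """Follow parent links upward; in a tree this path is unique."""
    path = [record]
    while parent_of[path[-1]] is not None:
        path.append(parent_of[path[-1]])
    return path


# Network model: a record may be linked to several owners, so records form a graph.
owners_of = {"Asha": ["Sales", "Project Apollo"], "Ravi": ["Sales"]}

# Relational model: the same relationships stored as rows of a table, with
# no pointers; a relationship is found by matching values.
works_for = [("Asha", "Sales"), ("Asha", "Project Apollo"), ("Ravi", "Sales")]
asha_owners = sorted(owner for member, owner in works_for if member == "Asha")

print(path_to_root("Asha", parent_of))
print(owners_of["Asha"])
print(asha_owners)
```

The fact that Asha belongs to both a department and a project fits the graph and the table, but not the single-parent tree. This is one aspect of the lack of structural independence listed for the hierarchical model.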
