
UNIT 1

What is a Database?

A database is a collection of related data which represents some aspect of the real world. A
database system is designed to be built and populated with data for a certain task.

What is DBMS?

A Database Management System (DBMS) is software for storing and retrieving users' data while
applying appropriate security measures. It allows users to create their own databases as per
their requirements.

It consists of a group of programs which manipulate the database and provide an interface
between the database and its users, including other application programs.

The DBMS accepts the request for data from an application and instructs the operating system to
provide the specific data. In large systems, a DBMS helps users and other third-party software to
store and retrieve data.

Example of a DBMS

Let us see a simple example of a university database. This database maintains information
concerning students, courses, and grades in a university environment. The database is organized
as five files:

- The STUDENT file stores data on each student.
- The COURSE file stores data on each course.
- The SECTION file stores information about the sections of a particular course.
- The GRADE file stores the grades which students receive in the various sections.
- The TUTOR file contains information about each professor.

To define a database system:

- We need to specify the structure of the records of each file by defining the different types
  of data elements to be stored in each record.
- We can also use a coding scheme to represent the values of a data item.
- In short, the database will have five tables, with foreign keys defined among the
  various tables.
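
The five-file university database described above can be sketched as five tables linked by foreign keys. This is only an illustration using Python's built-in sqlite3 module: the column names, the letter-grade coding scheme, and the sample rows are assumptions made for the example and do not come from the text.

```python
import sqlite3

# Hypothetical schema: table names follow the text, column names are illustrative.
SCHEMA = """
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE tutor   (tutor_id   INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course  (course_id  TEXT PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE section (
    section_id INTEGER PRIMARY KEY,
    course_id  TEXT    NOT NULL REFERENCES course(course_id),
    tutor_id   INTEGER NOT NULL REFERENCES tutor(tutor_id)
);
CREATE TABLE grade (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    section_id INTEGER NOT NULL REFERENCES section(section_id),
    -- a simple coding scheme for the values of a data item
    grade      TEXT    NOT NULL CHECK (grade IN ('A', 'B', 'C', 'D', 'F')),
    PRIMARY KEY (student_id, section_id)
);
"""


def build_university_db():
    """Create an in-memory database containing the five tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.executescript(SCHEMA)
    return conn


conn = build_university_db()
conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.execute("INSERT INTO tutor VALUES (10, 'Dr. Rao')")
conn.execute("INSERT INTO course VALUES ('CS101', 'Databases')")
conn.execute("INSERT INTO section VALUES (100, 'CS101', 10)")
conn.execute("INSERT INTO grade VALUES (1, 100, 'A')")

# Follow the foreign keys from GRADE back to STUDENT and COURSE.
rows = conn.execute("""
    SELECT s.name, c.title, g.grade
    FROM grade g
    JOIN student s ON s.student_id = g.student_id
    JOIN section x ON x.section_id = g.section_id
    JOIN course  c ON c.course_id  = x.course_id
""").fetchall()
print(rows)  # [('Asha', 'Databases', 'A')]
```

With foreign keys enabled, an attempt to record a grade for a student or section that does not exist is rejected by the DBMS rather than by application code.
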
Users in a DBMS environment

The following are the various categories of users of a DBMS:

Component Name            Task
Application Programmers   Write programs in various programming languages
                          to interact with databases.
Database Administrators   Responsible for managing the entire DBMS. Often
                          called the database admin or DBA.
End-Users                 Interact with the database management system,
                          performing operations such as retrieving,
                          updating, and deleting data.

Popular DBMS Software

Here is a list of some popular DBMS software:

- MySQL
- Microsoft Access
- Oracle
- PostgreSQL
- dBASE
- FoxPro
- SQLite
- IBM DB2
- LibreOffice Base
- MariaDB
- Microsoft SQL Server

Application of DBMS

Sector             Use of DBMS
Banking            Customer information, account activities, payments, deposits, loans, etc.
Airlines           Reservations and schedule information.
Universities       Student information, course registrations, colleges, and grades.
Telecommunication  Call records, monthly bills, balance maintenance, etc.
Finance            Information about stock, sales, and purchases of financial instruments such as stocks and bonds.
Sales              Customer, product, and sales information.
Manufacturing      Supply chain management, tracking production of items, and inventory status in warehouses.
HR Management      Information about employees, salaries, payroll, deductions, generation of paychecks, etc.

File Oriented approach:

The traditional file-oriented approach to information processing keeps, for each application, a
separate master file and its own set of personal files. In the file-oriented approach, the
programs become dependent on the files and the files become dependent on the programs.

Disadvantages of file oriented approach:

1) Data redundancy and inconsistency: The same information may be written in several files.
This redundancy leads to higher storage and access costs. It may also lead to data inconsistency;
that is, the various copies of the same data may no longer agree. For example, a changed customer
address may be reflected in a single file but not elsewhere in the system.

2) Difficulty in accessing data: Conventional file processing systems do not allow data to be
retrieved in a convenient and efficient manner according to the user's choice.

3) Data isolation: Because data are scattered in various files, and files may be in different
formats, writing new application programs to retrieve the appropriate data is difficult.

4) Integrity problems: Developers enforce data validation in the system by adding appropriate
code to the various application programs. However, when new constraints are added, it is difficult
to change the programs to enforce them.

5) Atomicity: It is difficult to ensure atomicity in a file processing system when a transaction
fails because of a power failure, networking problem, etc. (Atomicity: either all operations of
the transaction are reflected properly in the database or none are.)

6) Concurrent access: In a file processing system it is generally not possible for multiple
transactions to safely access the same file at the same time.

7) Security problems: A file processing system provides no built-in mechanism to secure the data
from unauthorized use.
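
To make the atomicity point concrete, here is a minimal sketch of how a DBMS handles a failure in the middle of a transaction. It uses Python's built-in sqlite3 module; the `account` table, the `transfer` helper, and the simulated failure are assumptions invented for this example.

```python
import sqlite3


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from `src` to `dst` as a single transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
            if fail_midway:
                raise RuntimeError("simulated power failure")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
    except RuntimeError:
        pass  # the debit has already been rolled back by the DBMS


def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

transfer(conn, "A", "B", 30, fail_midway=True)
print(balances(conn))  # {'A': 100, 'B': 50}  (nothing was applied)

transfer(conn, "A", "B", 30)
print(balances(conn))  # {'A': 70, 'B': 80}   (both updates were applied)
```

In a plain file processing system, the equivalent program would have to implement this undo logic itself for every file it touches.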

DBMS vs. Flat File

DBMS                                                           Flat File Management System
Multi-user access                                              Does not support multi-user access
Designed to meet the needs of small and large businesses       Limited to smaller systems
Removes redundancy and integrity issues                        Redundancy and integrity issues
Expensive, but low total cost of ownership in the long term    Cheaper
Easy to implement complicated transactions                     No support for complicated transactions

Characteristics of Database Management System

- Provides security and removes redundancy
- Self-describing nature of a database system
- Insulation between programs and data, and data abstraction
- Support of multiple views of the data
- Sharing of data and multiuser transaction processing
- DBMS allows entities and the relations among them to form tables.
- DBMS supports a multi-user environment that allows users to access and manipulate data
  in parallel.

ADVANTAGES OF THE DBMS:

1. Improved data sharing:
The DBMS helps create an environment in which end users have better access to more and
better-managed data. Such access makes it possible for end users to respond quickly to changes in
their environment.
2. Improved data security:
The more users access the data, the greater the risks of data security breaches. Corporations
invest considerable amounts of time, effort, and money to ensure that corporate data are used
properly. A DBMS provides a framework for better enforcement of data privacy and security
policies.
3. Data integration:
Wider access to well-managed data promotes an integrated view of the organization’s operations
and a clearer view of the big picture. It becomes much easier to see how actions in one segment
of the company affect other segments.
4. Minimized data inconsistency:
Data inconsistency exists when different versions of the same data appear in different places. For
example, data inconsistency exists when a company’s sales department stores a sales
representative’s name as “Bill Brown” and the company’s personnel department stores that same
person’s name as “William G. Brown,” or when the company’s regional sales office shows the
price of a product as $45.95 and its national sales office shows the same product’s price as
$43.95. The probability of data inconsistency is greatly reduced in a properly designed
database.
5. Improved data access:
The DBMS makes it possible to produce quick answers to ad hoc queries. From a database
perspective, a query is a specific request issued to the DBMS for data manipulation—for
example, to read or update the data. Simply put, a query is a question, and an ad hoc query is a
spur-of-the-moment question. The DBMS sends back an answer (called the query result set) to
the application. For example, end users, when dealing with large amounts of sales data, might
want quick answers to questions (ad hoc queries) such as:
- What was the dollar volume of sales by product during the past six months?
- What is the sales bonus figure for each of our salespeople during the past three months?
- How many of our customers have credit balances of $3,000 or more?
6. Improved decision making:
Better-managed data and improved data access make it possible to generate better-quality
information, on which better decisions are based. The quality of the information generated
depends on the quality of the underlying data. Data quality is a comprehensive approach to
promoting the accuracy, validity, and timeliness of the data. While the DBMS does not guarantee
data quality, it provides a framework to facilitate data quality initiatives.
7. Increased end-user productivity:
The availability of data, combined with the tools that transform data into usable information,
empowers end users to make quick, informed decisions that can make the difference between
success and failure in the global economy.
8. Controlling data redundancy:
In non-database systems (traditional computer file processing), each application program has its
own files, so duplicate copies of the same data are created in many places. In a DBMS, all the
data of an organization is integrated into a single database, and each item is ideally recorded in
only one place. For example, the dean's faculty file and the faculty payroll file contain several
identical items. When they are converted into a database, the data is integrated so that multiple
copies of the same data are reduced to a single copy.
In a DBMS, data redundancy can be controlled or reduced, but it is not removed completely.
Sometimes it is necessary to duplicate the same data items in order to relate tables to each
other. Controlling data redundancy saves storage space and also makes it easier to retrieve data
from the database using queries.
9. Backup and recovery procedures:
In a computer file-based system, the user must back up data regularly to protect valuable data
from damage caused by failures of the computer system or application program. This is time
consuming if the volume of data is large. Most DBMSs provide backup and recovery subsystems that
automatically create backups and restore data if required. For example, if the computer system
fails in the middle (or at the end) of an update operation, the recovery subsystem is responsible
for making sure that the database is restored to the state it was in before the program started
executing.
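
One of the ad hoc questions listed under improved data access ("How many of our customers have credit balances of $3,000 or more?") can be sketched as a query. The example below uses Python's built-in sqlite3 module; the `customer` table, its columns, and the sample rows are hypothetical and exist only to show the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, credit_balance REAL)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("Ann", 3500.0), ("Bob", 1200.0), ("Cy", 3000.0)],  # made-up sample data
)


def customers_with_balance_at_least(conn, threshold):
    """Ad hoc query: count customers whose credit balance is >= threshold."""
    row = conn.execute(
        "SELECT COUNT(*) FROM customer WHERE credit_balance >= ?", (threshold,)
    ).fetchone()
    return row[0]  # the query result set here is a single row with a single value


print(customers_with_balance_at_least(conn, 3000))  # 2
```

The question is answered by the DBMS from a one-line request; no dedicated report program has to be written in advance.
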

DISADVANTAGES OF DATABASE
1. Increased costs:
Database systems require sophisticated hardware and software and highly skilled personnel. The
cost of maintaining the hardware, software, and personnel required to operate and manage a
database system can be substantial. Training, licensing, and regulation compliance costs are
often overlooked when database systems are implemented.
2. Management complexity:
Database systems interface with many different technologies and have a significant impact on a
company’s resources and culture. The changes introduced by the adoption of a database system
must be properly managed to ensure that they help advance the company’s objectives. Given the
fact that database systems hold crucial company data that are accessed from multiple sources,
security issues must be assessed constantly.
3. Maintaining currency:
To maximize the efficiency of the database system, you must keep your system current.
Therefore, you must perform frequent updates and apply the latest patches and security measures
to all components. Because database technology advances rapidly, personnel training costs tend
to be significant.
4. Vendor dependence:
Given the heavy investment in technology and personnel training, companies might be reluctant to
change database vendors. As a consequence, vendors are less likely to offer pricing point
advantages to existing customers, and those customers might be limited in their choice of
database system components.
5. Frequent upgrade/replacement cycles:
DBMS vendors frequently upgrade their products by adding new functionality. Such new
features often come bundled in new upgrade versions of the software. Some of these versions
require hardware upgrades. Not only do the upgrades themselves cost money, but it also costs
money to train database users and administrators to properly use and manage the new features.
6. Appointing technical staff:
Trained technical staff, such as database administrators and application programmers, are
required to handle the DBMS. These staff command high salaries, which increases the cost of the
system.

DATABASE ADMINISTRATOR
The people responsible for managing databases are called database administrators. Each database
administrator (DBA for short) may be engaged in performing various database tasks such as
archiving, testing, running, and security control, all related to the environmental side of the
databases.

1. Selection of hardware and software

- Keep up with current technological trends
- Predict future changes
- Emphasis on established off-the-shelf products

2. Managing data security and privacy

- Protection of data against accidental or intentional loss, destruction, or misuse
- Firewalls
- Establishment of user privileges
- Complicated by the use of distributed systems such as internet access and client/server
  technology

3. Managing data integrity

- Integrity controls protect data from unauthorized use
- Data consistency
- Maintaining data relationships
- Domains: set allowable values
- Assertions: enforce database conditions

4. Data backup

- We must assume that a database will eventually fail
- Establishing procedures:
    - how often should the data be backed up?
    - what data should be backed up more frequently?
    - who is responsible for the backups?
- Backup facilities:
    - automatic dump: facility that produces a backup copy of the entire database
    - periodic backup: done on a periodic basis, such as nightly or weekly
    - cold backup: database is shut down during backup
    - hot backup: a selected portion of the database is shut down and backed up at a
      given time
    - backups stored in a secure, off-site location

5. Database recovery

- Application of proven strategies for reinstallation of the database after a crash
- Recovery facilities include backup, journalizing, checkpoint, and recovery manager

Alongside the backup facilities, a DBMS also provides journalizing, checkpoint, and recovery
facilities:

- Journalizing facilities include:
    - audit trail of transactions and database updates
    - transaction log, which records essential data for each transaction processed against
      the database
    - database change log, which shows images of updated data; the log stores a copy of the
      image before and after modification
- Checkpoint facilities:
    - when the DBMS refuses to accept a new transaction, the system is in a quiet state
    - database and transactions are synchronized
    - allows the recovery manager to resume processing from a short period instead of
      repeating the entire day
- Recovery and restart procedures:
    - switch: mirrored databases
    - restore/rerun: reprocess transactions against the backup
    - transaction integrity: commit or abort all transaction changes
    - backward recovery (rollback): apply before images
    - forward recovery (roll forward): apply after images (preferable to
      restore/rerun)

6. Tuning database performance

- Set installation parameters / upgrade DBMS
- Monitor memory and CPU usage
- Input/output contention:
    - use striping
    - distribution of heavily accessed files
- Application tuning by modifying SQL code in applications

7. Improving query processing performance
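
The backward and forward recovery procedures listed under database recovery rely on the before and after images kept in the database change log. The following is a toy sketch of that idea in plain Python, not how any particular DBMS implements it: the database is modelled as a dictionary, the log as a list, and all function names are hypothetical. It also assumes `None` is never stored as a value, so that a `None` before image can mean "the item did not exist".

```python
def apply_update(db, log, key, new_value):
    """Update `db` and journal the before and after images of the item."""
    log.append({"key": key, "before": db.get(key), "after": new_value})
    db[key] = new_value


def rollback(db, log):
    """Backward recovery: undo changes by applying before images, newest first."""
    for entry in reversed(log):
        if entry["before"] is None:      # item did not exist before the change
            db.pop(entry["key"], None)
        else:
            db[entry["key"]] = entry["before"]


def roll_forward(backup, log):
    """Forward recovery: redo changes by applying after images to a backup copy."""
    db = dict(backup)
    for entry in log:
        db[entry["key"]] = entry["after"]
    return db


backup = {"x": 1, "y": 2}   # state captured by the last backup
db = dict(backup)           # live database
log = []                    # database change log

apply_update(db, log, "x", 10)
apply_update(db, log, "z", 5)

restored = roll_forward(backup, log)
print(restored)  # {'x': 10, 'y': 2, 'z': 5}

rollback(db, log)
print(db)        # {'x': 1, 'y': 2}
```

Roll forward rebuilds the latest state from the backup without rerunning the original programs, which is why it is listed as preferable to restore/rerun.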

USERS IN DBMS
Users are of 4 types:
1. Application programmers or Ordinary users
2. End users
3. Database Administrator (DBA)
4. System Analyst
1. Application programmers or Ordinary users: These users write application programs to
interact with the database. Application programs can be written in a programming language such
as COBOL, PL/I, C++, or Java, or in a higher-level fourth-generation language. Such programs
access the database by issuing the appropriate request, typically an SQL statement, to the DBMS.
2. End Users: End users are the users who use the applications developed. End users need not
know about the internal workings, the database design, the access mechanism, etc. They just use
the system to get their tasks done. End users are of two types:
a) Direct users b) Indirect users
a) Direct users: Direct users are the users who use the computer and database system directly,
by following instructions provided in the user interface. They interact using the application
programs already developed to get the desired result, e.g. people at railway reservation
counters, who interact directly with the database.
b) Indirect users: Indirect users are those users who derive benefit from the work of the DBMS
indirectly. They use the outputs generated by the programs for decision making or any other
purpose. They are concerned only with the output and are not bothered about the programming part.
3. Database Administrator (DBA): The Database Administrator (DBA) is the person who makes
the strategic and policy decisions regarding the data of the enterprise, and who provides the
necessary technical support for implementing these decisions. The DBA is therefore responsible
for overall control of the system at a technical level. In a database environment, the primary
resource is the database itself and the secondary resource is the DBMS and related software.
Administering these resources is the responsibility of the DBA.
4. System Analyst: The System Analyst determines the requirements of end users, especially naive
and parametric end users, and develops specifications for transactions that meet these
requirements. The System Analyst plays a major role in designing the database, its properties,
and its structure, and prepares the system requirement statement, which covers the feasibility,
economic, and technical aspects of the system.

THREE-LEVEL ARCHITECTURE / THREE-SCHEMA ARCHITECTURE:

In this architecture, the overall database description can be defined at three levels, namely
the internal, conceptual, and external levels. This is shown below:
Figure: Logical Architecture

Figure: Example

Internal level: It is the lowest level of data abstraction that deals with the physical
representation of the database on the computer and thus is also known as the physical level. It
describes how the data is physically stored and organized on the storage medium. At this level,
various aspects are considered to achieve optimal runtime performance and storage space
utilization. These aspects include storage space allocation techniques for data and indexes,
access paths such as indexes, data compression and encryption techniques, and record placement.
Conceptual level: This level of abstraction deals with the logical structure of the entire
database and is also known as the logical level. It describes what data is stored in the
database, the relationships among the data, and a complete view of the user's requirements
without any concern for the physical implementation. It hides the complexity of physical storage
structures. The conceptual view is the overall view of the database and it includes all the
information that is going to be represented in the database.
External level: It is the highest level of abstraction that deals with the user's view of the
database and is also known as the view level. Most users and application programs do not
require the entire data stored in the database. The external level describes a part of the
database for a particular group of users. It permits users to access data in a way that is
customized according to their needs, so that the same data can be seen by different users in
different ways at the same time. It provides a powerful and flexible security mechanism by hiding
parts of the database from certain users, as the user is not aware of the existence of any
attributes that are missing from the view.
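
The external level can be illustrated with a view defined on top of a conceptual-level table. The sketch below uses Python's built-in sqlite3 module; the `employee` table, the `employee_directory` view, and the sample rows are hypothetical. Note one limitation of the illustration: SQLite has no user accounts or privileges, so the view only demonstrates the "different users see different shapes of the same data" idea, whereas a multi-user DBMS would additionally restrict who may read the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual level: the full logical description of an employee.
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    name   TEXT,
    dept   TEXT,
    salary INTEGER
);
INSERT INTO employee VALUES (1, 'Ravi', 'Sales', 52000), (2, 'Mina', 'HR', 61000);

-- External level: a view for users who must not see salaries.
CREATE VIEW employee_directory AS
    SELECT emp_id, name, dept FROM employee;
""")


def column_names(conn, relation):
    """Return the column names exposed by a table or view (trusted name only)."""
    cur = conn.execute(f"SELECT * FROM {relation} LIMIT 0")
    return [d[0] for d in cur.description]


print(column_names(conn, "employee"))            # ['emp_id', 'name', 'dept', 'salary']
print(column_names(conn, "employee_directory"))  # ['emp_id', 'name', 'dept']
```

A user working only through `employee_directory` has no indication that a `salary` attribute exists, which is the hiding effect described for the external level.
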

Types of DBMS

Relational database – This is the most popular data model used in industry. It is based on the
relational model, and SQL is its standard query language. Relational databases are table
oriented: data is stored in different tables, each of which has a key field whose task is to
identify each row. The tables are called relations, rows are called records, and columns are
referred to as attributes or fields. A few examples are MySQL (Oracle, open source), Oracle
Database (Oracle), Microsoft SQL Server (Microsoft), and DB2 (IBM).
Object oriented database – The information here is in the form of objects, as used in
object-oriented programming. It adds database functionality to object programming languages. It
requires less code, uses more natural data models, and makes code bases easier to maintain. An
example is ObjectDB (ObjectDB Software).
Object relational database – Relational DBMSs are evolving continuously and have been
incorporating many concepts developed in object databases, leading to a new class called the
extended relational database or object relational database.
Hierarchical database – In this model, records are organized by parent-child relationships,
forming a structure similar to a tree. The data is held as a series of records, each with a set
of values attached to it. Hierarchical databases are used in industry on mainframe platforms.
Examples are IMS (IBM) and the Windows Registry (Microsoft).
Network database – Mainly used on large digital computers. This model is efficient when the data
has many connections. Network databases are similar to hierarchical databases, but their records
form a cobweb-like interconnected network. Examples are CA-IDMS (Computer Associates) and
IMAGE (HP).

Data dictionary

Introduction

As the name suggests, a data dictionary is a dictionary about the data that we store in the
database. It contains all the information about the data objects. It stores up-to-date
information about objects such as tables, columns, indexes, constraints, and functions. It makes
it easy to identify, access, and understand the facts about an object. One can think of a data
dictionary as storing information about a house, such as the house name, its address, how many
people live in it, who the eldest and youngest persons are, and the responsibilities of each
member of the household, or as the personal details of an employee in a company.
In the case of a table, the data dictionary provides information about:

- Its name
- Security information, such as who the owner of the table is, when it was created, and when it
  was last accessed
- Physical information, such as where the data for this table is stored
- Structural information, such as its attribute names and their datatypes, constraints, and
  indexes

Below is a sample data dictionary view of the tables. (Only a few of the columns are shown,
to give an idea of the data dictionary.) Since it is from the USER view, it lists only those
tables which were created by the current user. Hence no owner column is displayed.

From the above example it is clear that any data dictionary would contain:

- The definitions of all database objects, such as tables, views, constraints, indexes, clusters,
  synonyms, sequences, procedures, functions, packages, and triggers
- How much space is allocated for each object and how much of that space has been used
- Any default values that a column can have
- Database user names (schemas)
- Access rights for schemas on each of the objects
- Last updated and last accessed information about each object
- Any other database information

All this information is stored in the form of tables in the data dictionary.
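
As a small, concrete stand-in for such dictionary tables, SQLite keeps a catalogue table named `sqlite_master` and exposes column details through `PRAGMA table_info`. The sketch below queries both from Python. It is not the USER view mentioned above, and SQLite's catalogue holds far less than a full data dictionary (no owners, space usage, or access times); the `student` table and the helper names are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    status     TEXT DEFAULT 'active'
);
CREATE INDEX idx_student_name ON student(name);
""")


def list_objects(conn):
    """Return (type, name) for every object recorded in SQLite's catalogue."""
    return conn.execute(
        "SELECT type, name FROM sqlite_master ORDER BY type, name"
    ).fetchall()


def describe_table(conn, table):
    """Return (column name, declared type, default) for each column of `table`."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(r[1], r[2], r[4]) for r in rows]


print(list_objects(conn))  # [('index', 'idx_student_name'), ('table', 'student')]
for column in describe_table(conn, "student"):
    print(column)          # one (name, type, default) tuple per column
```

The catalogue is itself queried with ordinary SQL, which matches the statement that dictionary information is stored in the form of tables.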

Types of Data Dictionary

There are two types of data dictionary – Active and Passive.


Active Data Dictionary

Any change to a database object's structure via DDL must be reflected in the data dictionary.
Updating the data dictionary tables for such changes is the responsibility of the database in
which the data dictionary exists. If the data dictionary is created in the same database, the
DBMS software updates the data dictionary automatically, so there is no mismatch between the
actual structure and the data dictionary details. Such a data dictionary is called an active
data dictionary.
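
The automatic update that defines an active data dictionary can be observed in a short sketch. It again uses SQLite's built-in catalogue from Python as a stand-in; the `course` table and the `columns` helper are hypothetical.

```python
import sqlite3


def columns(conn, table):
    """Read the column names of `table` from SQLite's catalogue."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (course_id TEXT PRIMARY KEY, title TEXT)")
before = columns(conn, "course")

# A DDL change: nobody edits the catalogue by hand, the DBMS records it itself.
conn.execute("ALTER TABLE course ADD COLUMN credits INTEGER")
after = columns(conn, "course")

print(before)  # ['course_id', 'title']
print(after)   # ['course_id', 'title', 'credits']
```

Because the DDL statement and the catalogue update happen inside the same DBMS, the dictionary cannot drift from the real structure.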

Passive Data Dictionary

In some databases, the data dictionary is created separately from the current database, as an
entirely new database that stores only data dictionary information. Sometimes it is stored as
XML, Excel spreadsheets, or some other file format. In such cases, effort is required to keep
the data dictionary in sync with the database objects, and there is a chance of a mismatch
between the two. This kind of data dictionary is called a passive data dictionary, and it has to
be handled with utmost care.
