CIS Itm 4 DBMS
Unit -4
Mahabala Shetty M
Database
• A database is an organised, integrated collection
of well-defined data and information, centrally
controlled in all aspects, created and stored in a
typical structure
• The data structure and its storage should
facilitate the shareability, availability,
evolvability and integrity of the data
DBMS
• Data stored in the database is independent of
the computer programs using it
Why Database ?
In the conventional method (file system), each
subsystem has its own Master File and
Transaction Files.
• The same data is repeated across files
(redundancy), which affects its quality
• Sharing of data is not possible
• Redundancy causes lack of integrity and
results in inconsistency
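A minimal sketch of how redundancy leads to inconsistency. The three dictionaries stand in for the separate master files of the subsystems; the employee data and the helper `find_inconsistencies` are hypothetical and used only for illustration.

```python
# Each subsystem keeps its own copy of the employee record (hypothetical data).
hrms_file = {101: {"name": "A. Rao", "qualification": "B.Com"}}
payroll_file = {101: {"name": "A. Rao"}}
academic_file = {101: {"name": "A. Rao", "qualification": "B.Com"}}


def find_inconsistencies(files, field):
    """Return the employee IDs whose `field` value differs between files."""
    conflicts = set()
    first_seen = {}
    for records in files:
        for emp_id, record in records.items():
            if field not in record:
                continue
            if emp_id in first_seen and first_seen[emp_id] != record[field]:
                conflicts.add(emp_id)
            first_seen.setdefault(emp_id, record[field])
    return conflicts


# The name is corrected in the HRMS file only; the other copies go stale.
hrms_file[101]["name"] = "A. Rao Shetty"

name_conflicts = find_inconsistencies(
    [hrms_file, payroll_file, academic_file], "name"
)
qualification_conflicts = find_inconsistencies(
    [hrms_file, payroll_file, academic_file], "qualification"
)
```

With a single shared copy of each employee record, the update would have been made once and no such conflict could arise.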
File system approach
HRMS Payroll Academic details
Employee ID Employee ID Employee ID
Employee Name Employee Name Employee Name
Qualification Qualification
User / Application program → DBMS → Database
Uses of DBMS
• Database interrogation & Reporting
• Application Development
• Database Access & Maintenance
• Database creation & Modification
Components of a DBMS
• At least one person who “owns” and is
responsible for the database
• A set of rules and relationships that defines
and governs the interactions among elements
of the database
• People who put data into the database
• People who get data out of the database
• The database itself
Languages for communicating with
Database
1. Data Description Language (DDL)
Provides a standardised description of the data,
which constitutes the data dictionary
2. Data Manipulation Language (DML)
Facilitates data insertion, deletion,
modification and retrieval
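The two language types can be sketched with SQL, run here through Python's built-in `sqlite3` module. SQL and SQLite are assumptions chosen for illustration, and the `employee` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: describe the structure of the data.
conn.execute(
    "CREATE TABLE employee ("
    " employee_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " qualification TEXT)"
)

# DML: insertion
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 'B.Com')")
conn.execute("INSERT INTO employee VALUES (2, 'Ravi', 'MBA')")

# DML: modification
conn.execute("UPDATE employee SET qualification = 'M.Com' WHERE employee_id = 1")

# DML: deletion
conn.execute("DELETE FROM employee WHERE employee_id = 2")

# DML: retrieval
rows = conn.execute(
    "SELECT employee_id, name, qualification FROM employee"
).fetchall()

conn.close()
```

After these statements only employee 1 remains, with the updated qualification.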
Characteristics of DBMS
• Capacity to store large volumes of data
• Capacity to interrogate data files, retrieving and
modifying data
• Minimises duplication of data
• System controls limit access to
the database files and protect the
confidentiality of all data in these files
Objective of DBMS
• Provide for mass storage
• Make access to the data easy for the user
• Eliminate redundancy
• Allow for multiple users to be active
concurrently
Components of a DBMS package
(Figure: modules of a DBMS package)
• Command Language Processor
• Host Language Interface
• Query Language
• Report Generator
• Application Program Generator
• Utilities Program
• DBMS Kernel
• Software
This is the set of programs used to control and manage the overall database. This
includes the DBMS software itself, the Operating System, the network software
being used to share the data among users, and the application programs used to
access data in the DBMS.
• Hardware
Consists of a set of physical electronic devices such as computers, I/O devices and
storage devices. It provides the interface between computers and real-world
systems.
• Data
The DBMS exists to collect, store, process and access data, the most important
component. The database contains both the actual or operational data and the
metadata.
• Procedures
These are the documented instructions and rules that explain how to use the DBMS
and how to design and run the database. They guide the users who operate and
manage it.
• Database Access Language
This is used to move data to and from the database: to enter new data, update existing data, or
retrieve required data. The user writes a set of appropriate commands in a
database access language and submits them to the DBMS, which then processes the data and
generates and displays a set of results in a user-readable form.
• Query Processor
This transforms user queries into a series of low-level instructions. It reads the online user's
query and translates it into an efficient series of operations in a form capable of being sent to the
run-time data manager for execution.
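One way to observe a query processor at work is to ask the DBMS for the plan it has chosen. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` statement through Python's `sqlite3` module. SQLite is an assumption chosen for illustration, the table and index names are hypothetical, and the wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_employee_name ON employee (name)")

# Ask the query processor how it would execute the query, without running it.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT employee_id FROM employee WHERE name = ?",
    ("Asha",),
).fetchall()

# The last column of each plan row is a human-readable description of one step.
plan_details = [row[-1] for row in plan]
for detail in plan_details:
    print(detail)

conn.close()
```

The printed steps are the lower-level operations (table scans or index searches) that the high-level query was translated into.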
• Data Dictionary
This is a reserved space within a database used to store information about the
database itself. A data dictionary is a set of read-only tables and views containing
information about the data used in the enterprise, to ensure that the
database representation of the data follows one standard as defined in the
dictionary.
• Report Writer
Also referred to as the report generator, it is a program that extracts information
from one or more files and presents the information in a specified format. Most
report writers allow the user to select records that meet certain conditions and to
display selected fields in rows and columns, or to format the data into different
charts.
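The data dictionary and the report writer can be sketched together. In SQLite (an assumption chosen for illustration) the catalogue table `sqlite_master` and `PRAGMA table_info` play the role of a data dictionary. The `employee` table, its rows and the helper `write_report` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employee_id INTEGER, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 52000), (2, "Ravi", 38000), (3, "Meena", 61000)],
)

# Data dictionary lookup: which tables exist, and which columns does one have?
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
columns = [r[1] for r in conn.execute("PRAGMA table_info(employee)")]


def write_report(connection, fields, min_salary):
    """Select records meeting a condition and lay out chosen fields in columns."""
    known = [r[1] for r in connection.execute("PRAGMA table_info(employee)")]
    if not set(fields) <= set(known):
        raise ValueError("unknown field requested")
    query = (
        f"SELECT {', '.join(fields)} FROM employee "
        "WHERE salary >= ? ORDER BY employee_id"
    )
    lines = ["".join(f"{name:<14}" for name in fields)]  # header row
    for row in connection.execute(query, (min_salary,)):
        lines.append("".join(f"{str(value):<14}" for value in row))
    return "\n".join(lines)


report = write_report(conn, ["name", "salary"], 50000)
print(report)
conn.close()
```

The report contains a header line followed by one line per employee whose salary meets the condition, with the field names checked against the dictionary before the query is built.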
Uses of DBMS
• Database interrogation and reporting (for end
users)
• Application Development
• Database access
• Database creation & modification
Data description types
• Conceptual Schema
• Internal Schema
• Stored Data
(Figure: programs accessing data by means of subschemas)
Program → Subschema → Schema → Data
CODASYL
• Conference on Data Systems Languages
• It specifies schema & subschema
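The schema and subschema idea can be approximated with SQL views. This is only an analogy in SQLite, not CODASYL's own data description language, and the table and view names are hypothetical. Each program sees only the part of the schema that its view exposes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: the complete description of the stored data.
conn.execute(
    "CREATE TABLE employee ("
    " employee_id INTEGER PRIMARY KEY, name TEXT,"
    " qualification TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 'M.Com', 52000)")

# Subschemas: each program gets its own restricted view of the schema.
conn.execute(
    "CREATE VIEW payroll_view AS SELECT employee_id, name, salary FROM employee"
)
conn.execute(
    "CREATE VIEW academic_view AS "
    "SELECT employee_id, name, qualification FROM employee"
)

payroll_cursor = conn.execute("SELECT * FROM payroll_view")
payroll_columns = [d[0] for d in payroll_cursor.description]
payroll_rows = payroll_cursor.fetchall()

academic_cursor = conn.execute("SELECT * FROM academic_view")
academic_columns = [d[0] for d in academic_cursor.description]
academic_rows = academic_cursor.fetchall()

conn.close()
```

The payroll program never sees the qualification column and the academic program never sees the salary column, although both read the same stored data.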
Types of Databases
• Operational Databases
• Distributed databases
– These are databases of local work groups and
Departments at different locations and work sites
Data Definition Language
is used to describe
Comparison of the file system and DBMS approaches:
• Application Maintenance
– File system: low degree of data independence means higher maintenance cost due to data file re-organization
– DBMS: high data independence reduces the maintenance cost due to database reorganization
• Database Ownership
– File system: applications may have their own private data
– DBMS: all the data belongs to the DBMS
Data Flow Diagram (DFD)
• DFD can help to identify the flow of data in a
system without specifying the media or
hardware involved
• It can illustrate the logical relationships among
data, flows, external entities (source &
destination )
DFD
The 4 basic symbols used are
1. External entity (e.g. Customer, Vendor)
2. Process (e.g. preparing reports, sending quotation)
3. Data flow (e.g. payment)
4. Data store (e.g. customer file, vendor)
DFD
(Figure: example data flow for a vendor quotation)
Vendor → Quotation (data) → Capture Quotation details → Transaction data → Transaction File
Transaction File → Sort Transaction → Sorted Transactions → Sorted Transaction File
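The vendor quotation flow can be traced in code as a rough sketch. Each function stands for a process, each list for a data store, and the values passed between them for data flows. The field names, the sample quotations and the sort key are assumptions made for illustration.

```python
def capture_quotation_details(quotation):
    """Process: turn a vendor's quotation into transaction data."""
    return {
        "vendor": quotation["vendor"],
        "item": quotation["item"],
        "amount": quotation["amount"],
    }


def sort_transactions(transaction_file):
    """Process: read the transaction file and produce sorted transactions."""
    return sorted(transaction_file, key=lambda t: (t["vendor"], t["amount"]))


# External entity: vendors. Data flow: the quotations they send (hypothetical).
quotations = [
    {"vendor": "Zenith", "item": "paper", "amount": 1200},
    {"vendor": "Apex", "item": "toner", "amount": 800},
]

# Data store: Transaction File
transaction_file = [capture_quotation_details(q) for q in quotations]

# Data store: Sorted Transaction File
sorted_transaction_file = sort_transactions(transaction_file)
```

As in a DFD, nothing here says whether the stores are paper files, disk files or database tables; only the movement of data is described.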
Data Warehouse
• A data warehouse (DW) is designed to give people
information rather than data.
• An example of information would be that the
profitability of our company has increased by
5% per month over the last six months.
• In order to provide good quality information
for a company (in other words, in order to
build a data warehouse) we have to overcome
two main problems:
DWH
• First, the data is scattered across multiple transactional systems,
so we have to extract it from those systems, transform it into a
standardized format and finally load it into a central repository
called a data warehouse.
• Second, we have to reorganize it so that it is presented to
the users in an understandable way. As a side issue we need to
ensure that the information system is very responsive; in many
cases we try to get response times down below two seconds.
• The term "data warehouse," in its broadest sense, describes both
the centralized repository and the system for delivering information.
Transactional systems, by contrast, are woefully sub-optimal
for carrying out detailed and complex analysis.
DWH
• ETL (Extract – Transform – Load )
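A minimal ETL sketch, assuming two hypothetical transactional systems that record dates and amounts in different formats. An in-memory SQLite table named `fact_revenue` stands in for the central repository. All names, formats and figures are illustrative.

```python
import sqlite3
from datetime import datetime

# Extract: rows as they might arrive from two source systems (hypothetical).
sales_system_rows = [{"date": "2024-01-15", "amount": "1,200.50"}]
billing_system_rows = [("16/01/2024", 99900)]  # (dd/mm/yyyy, amount in paise)


def transform_sales(row):
    """Dates are already ISO formatted; strip thousands separators from amounts."""
    return ("sales", row["date"], float(row["amount"].replace(",", "")))


def transform_billing(row):
    """Convert dd/mm/yyyy to an ISO date and paise to rupees."""
    iso_date = datetime.strptime(row[0], "%d/%m/%Y").date().isoformat()
    return ("billing", iso_date, row[1] / 100)


# Transform: bring both sources into one standardized format.
standardized = [transform_sales(r) for r in sales_system_rows]
standardized += [transform_billing(r) for r in billing_system_rows]

# Load: write the standardized rows into the central repository.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE fact_revenue (source TEXT, txn_date TEXT, amount REAL)"
)
warehouse.executemany("INSERT INTO fact_revenue VALUES (?, ?, ?)", standardized)

loaded = warehouse.execute(
    "SELECT source, txn_date, amount FROM fact_revenue ORDER BY txn_date"
).fetchall()
total = warehouse.execute("SELECT SUM(amount) FROM fact_revenue").fetchone()[0]
warehouse.close()
```

Once the data sits in one standardized table, a single query can answer questions that span both source systems.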
Big Data
• Big data is a term that describes the large
volume of data – both structured and
unstructured – that accumulates in a business
on a day-to-day basis.
• Big data can be analyzed for insights that lead
to better decisions and strategic business
moves.
BD characteristics
• Volume. Organizations collect data from a variety of
sources, including business transactions, social media and
information from sensor or machine-to-machine data. In
the past, storing it would’ve been a problem – but new
technologies (such as Hadoop) have eased the burden.
• Velocity. Data streams in at an unprecedented speed and
must be dealt with in a timely manner. RFID tags, sensors
and smart metering are driving the need to deal with
torrents of data in near-real time.
• Variety. Data comes in all types of formats – from
structured, numeric data in traditional databases to
unstructured text documents, email, video, audio, stock
ticker data and financial transactions.
Two additional dimensions when it
comes to big data:
• Variability. In addition to the increasing velocities and
varieties of data, data flows can be highly inconsistent
with periodic peaks. Is something trending in social
media? Daily, seasonal and event-triggered peak data
loads can be challenging to manage. Even more so with
unstructured data.
• Complexity. Today's data comes from multiple sources,
which makes it difficult to link, match, cleanse and
transform data across systems. However, it’s necessary
to connect and correlate relationships, hierarchies and
multiple data linkages or your data can quickly spiral
out of control.
Why Is Big Data Important?
The importance of big data doesn’t revolve around how much data you
have, but what you do with it. One can take data from any source
and analyze it to find answers that enable:
1) cost reductions
2) time reductions
3) new product development and optimized offerings
4) smart decision making