
Database Management System

Unit -4

Mahabala Shetty M
Database
• Database is an organised, integrated collection
of well defined Data & Information, centrally
controlled in all aspects, created & stored in a
typical structure
• The data structure and its storage should be
such that it facilitates Shareability, Availability,
Evolvability and Integrity of the data
DBMS
• Data stored in the database is independent of
the computer programs using it
Why Database ?
In the conventional method (file system), each
subsystem has its own Master File and
Transaction Files.
• The same data is repeated across files
(redundancy), which lowers the quality of the data
• Sharing of data is not possible
• Redundancy causes lack of integrity and
results in inconsistency
File system approach
Files kept by separate subsystems: HRMS, Payroll, Academic details
• Employee ID: stored in all three files
• Employee Name: stored in all three files
• Date of Birth: stored in two files
• Qualification: stored in two files
• Basic Salary: stored in two files
• Tax payable: stored in two files
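The redundancy problem can be made concrete with a minimal Python sketch. The three dictionaries stand in for the separate HRMS, Payroll and Academic files; the employee ID, names and values are made-up sample data, not taken from the slides.

```python
# Each subsystem keeps its own private copy of the employee record
# (hypothetical sample data standing in for three separate files).
hrms_file = {"E001": {"name": "A. Kumar", "basic_salary": 30000}}
payroll_file = {"E001": {"name": "A. Kumar", "basic_salary": 30000}}
academic_file = {"E001": {"name": "A. Kumar", "qualification": "M.Com"}}

# The name is corrected in HRMS only; the other files are not updated.
hrms_file["E001"]["name"] = "Anil Kumar"


def names_on_record(emp_id):
    """Collect the distinct names stored for one employee across all files."""
    files = (hrms_file, payroll_file, academic_file)
    return {f[emp_id]["name"] for f in files if emp_id in f}


names = names_on_record("E001")
# More than one distinct name means the redundant copies are inconsistent.
print(len(names) > 1, sorted(names))
```

With a single shared database there would be only one copy of the name to update, so this kind of mismatch could not arise.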
• JE & Ldgr
• Spreadsheets
• SQL server 2005
Objective of Database Approach
1 . Availability :
Data should be available all the time for the use
of the Application
2 . Shareability :
Data item prepared by one Application should be
available to all Applications / queries.
No data items are “owned” by an application
3 . Evolvability :
The database can evolve as Application usage
and Query needs change
… Objective of Database Approach
4 . Data Independence :
The users of the database establish their
view of the data and its structure without
regard to the actual physical storage of data
5 . Data Integrity :
The Database establishes a uniform high
level of accuracy & consistency
… Objectives of Database Approach
Data integration:
• The data in file system is stored in separate
files. It is very difficult to access data stored in
separate and independent files. An important
objective of databases is to solve this problem.
• The data in database may be located at
different computers physically but it is connected
through data communication links.
In this way, data appears centralized logically.
Data Processing Systems
1 . Recording
2 . Classifying
3 . Sorting
4 . Calculation
5 . Storing
6 . Retrieving
7 . Reproducing data
8 . Communication of data
DBMS
• Database Management involves the control of
how databases are created, accessed and
maintained to provide information
• DBMS is a collection of programs that enable
users to create and maintain a database
• DBMS is hence a general purpose software
system that facilitates the process of defining,
constructing and manipulating databases for
various applications
DBMS
• It consists of a collection of inter related data
and a set of programs to access that data
DBMS

User / Application program <-> DBMS <-> Database
Uses of DBMS
• Database interrogation & Reporting
• Application Development
• Database Access & Maintenance
• Database creation & Modification
Components of a DBMS
• At least one person who “owns” and is
responsible for the database
• A set of rules and relationships that defines
and governs the interactions among elements
of the database
• People who put data into the database
• People who get data out of the database
• The database itself
Languages for communicating with
Database
1 . Data Description Language (DDL)
Provides a standardised description of the data,
which constitutes the data dictionary
2 . Data Manipulation Language (DML)
Facilitates data Insertion, Deletion,
Modification & Retrieval
Characteristics of DBMS
• Capacity to store large volume of data
• Capacity to interrogate data files, retrieving &
modifying data
• Minimise duplication of data
• Access to the database files is controlled by
the system, which protects the
confidentiality of all data in these files
Objective of DBMS
• Provide for mass storage
• Make access to the data easy for the user
• Eliminate redundancy
• Allow for multiple users to be active
concurrently
Components of a DBMS package
• Command Language Processor
• Query Language
• Report Generator
• Host Language Interface
• Application Program Generator
• Utilities
• DBMS Kernel Program
• Software
This is the set of programs used to control and manage the overall database. This
includes the DBMS software itself, the Operating System, the network software
being used to share the data among users, and the application programs used to
access data in the DBMS.

• Hardware
Consists of a set of physical electronic devices such as computers, I/O devices
and storage devices. These provide the interface between computers and the real
world systems.

• Data
DBMS exists to collect, store, process and access data, the most important
component. The database contains both the actual or operational data and the
metadata.

• Procedures
These are the instructions and rules for designing, running and using the DBMS:
documented procedures that guide the users who operate and manage it.
• Database Access Language
This is used to enter new data, update existing data, or retrieve required data from
databases. The user writes a set of appropriate commands in a database access language
and submits them to the DBMS, which then processes the data and
generates and displays a set of results in a user-readable form.

• Query Processor
This transforms the user queries into a series of low level instructions. This reads the online user’s
query and translates it into an efficient series of operations in a form capable of being sent to the
run time data manager for execution.

• Run Time Database Manager
Sometimes referred to as the database control system, this is the central software component of
the DBMS that interfaces with user-submitted application programs and queries, and handles
database access at run time. It converts the operations in users’ queries from the user’s logical
view into accesses to the physical files. It provides
control to maintain the consistency, integrity and security of the data.

• Data Manager
Also called the cache manager, this is responsible for handling the data in the database. It provides a
recovery mechanism that allows the system to recover the data after a failure.
• Database Engine
The core service for storing, processing, and securing data, this provides controlled
access and rapid transaction processing to address the requirements of the most
demanding data consuming applications. It is often used to create relational
databases for online transaction processing or online analytical processing data.

• Data Dictionary
This is a reserved space within a database used to store information about the
database itself. A data dictionary is a set of read-only tables and views containing
information about the data used in the enterprise, to ensure that the
database representation of the data follows one standard as defined in the
dictionary.

• Report Writer
Also referred to as the report generator, it is a program that extracts information
from one or more files and presents the information in a specified format. Most
report writers allow the user to select records that meet certain conditions and to
display selected fields in rows and columns, or also format the data into different
charts.
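As a rough illustration of the data dictionary described above, SQLite (used here through Python's standard sqlite3 module) keeps metadata about a database in a built-in table called sqlite_master. The employees table is a hypothetical example, and other DBMS products organise their dictionaries differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT)")

# sqlite_master holds one row per table, index or view: data about the data.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# PRAGMA table_info lists the columns of one table; field 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(employees)")]
print(catalog, columns)
```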
Uses of DBMS
• Database interrogation and reporting (for end
users)
• Application Development
• Database access
• Database creation & modification
Data description types

1 . From the application programmer’s viewpoint
- a description of the subschema
2 . From the global logical viewpoint
- a description of the schema
3 . From the physical viewpoint
- a description of physical records & their
linkages
Schema & Mapping
• Schema is an overall conceptual or logical
view of the relationships between the data in
a database
• The different schemas reflect different views
of the database
Schema
1 . External Schema or user schema is the user’s
view of part of the database
2 . Conceptual schema is the overall logical view
of the database
3 . Internal schema or data storage definition is
the way the data is physically organised in the
database
Schema & Mapping
External Schema (User View, User View, User View)
-> Conceptual Schema
-> Internal Schema
-> Stored Data
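The schema levels can be sketched with Python's standard sqlite3 module. In this sketch the table plays the role of the conceptual schema and a view plays the role of one external schema; the internal schema (physical storage) is left to SQLite itself. The table, view and column names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual schema: the overall logical definition of the data.
conn.execute(
    "CREATE TABLE employee "
    "(emp_id INTEGER PRIMARY KEY, name TEXT, basic_salary REAL)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 42000.0)")

# External schema: a view exposing only part of the database to one user group.
conn.execute("CREATE VIEW employee_directory AS SELECT emp_id, name FROM employee")

view_cursor = conn.execute("SELECT * FROM employee_directory")
view_columns = [d[0] for d in view_cursor.description]
view_rows = view_cursor.fetchall()
print(view_columns, view_rows)
```

Users of the view never see the salary column, and they do not need to know how the rows are stored on disk.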
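Programs accessing data by means of subschema

Program -> Subschema -> Schema -> Data
Program -> Subschema -> Schema -> Data
(each program has its own subschema; both subschemas map onto the one schema)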
CODASYL
• Conference on Data Systems Languages
• It specifies schema & subschema
Types of Databases
• Operational Databases
• Distributed databases
– These are databases of local work groups and
Departments at different locations and work sites
Data Definition Language
is used to describe
• Schema & Subschema
• Records, fields within Records, relationship
between records and validation specifications
• Storage & Retrieval method specification
• Authorisation specifications
Data Manipulation Language
• Manipulation --- > access
• It is a set of statements that are embedded in
Host Languages & Query Languages to
interact with DBMS
• Provides the commands to Retrieve, Modify,
store, delete & update the data in the
database
• SELECT ... FROM ... WHERE ...
– SELECT ... INTO ...
• INSERT INTO ... VALUES ...
• UPDATE ... SET ... WHERE ...
• DELETE FROM ... WHERE ...
• For example, the command to insert a row into
table employees:
• INSERT INTO employees (first_name, last_name,
fname) VALUES ('John', 'Capita', 'xcapit00');
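The four DML statements above can be tried out with Python's standard sqlite3 module. This is only a sketch: the CREATE TABLE statement, the column types and the replacement surname 'Smith' are assumptions added so that the INSERT example from the slide has a table to run against.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed table definition; the slide shows only the INSERT statement.
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")

# INSERT: store a new row (the example row from the text).
conn.execute(
    "INSERT INTO employees (first_name, last_name, fname) VALUES (?, ?, ?)",
    ("John", "Capita", "xcapit00"),
)
# UPDATE: modify an existing row.
conn.execute("UPDATE employees SET last_name = 'Smith' WHERE fname = 'xcapit00'")
# SELECT: retrieve the rows that match a condition.
after_update = conn.execute(
    "SELECT first_name, last_name FROM employees WHERE fname = 'xcapit00'"
).fetchall()
# DELETE: remove the rows that match a condition.
conn.execute("DELETE FROM employees WHERE fname = 'xcapit00'")
remaining = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(after_update, remaining)
```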
Database Administrator (DBA)
• One of the main reasons for having a DBMS is
to have a central control of both the data and
the programs accessing it.
• The person responsible for this is the DBA
Functions of DBA
• Schema definition
• Storage structure & access method definition
• Schema & physical organization for data
access
• Integrity constraints specification
• Backup & Restore database
• Monitor database performance & efficiency
Database Model
• A set of rules and standards that define how
the database organizes data is called database
model.
• It also defines how users view the
organization of data.
Database Models
1 . Hierarchical Model
• The hierarchical model arranges records in hierarchy
like an organizational chart.
• Each record type in this model is called a node or
segment.
• A node represents a particular entity. The top-most
node is called root.
• Each node is a subordinate of the node that is at the
next higher level. A higher level node is called parent
and lower level node is called child.
• A parent node can have one or many child nodes. A
child node can have only one parent node.
• This kind of structure is often called inverted tree.
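The parent-child rules of the hierarchical model can be sketched as a small tree in Python. The Node class and the organisation-chart names are hypothetical, chosen only to mirror the description above.

```python
class Node:
    """A record (segment) in a hierarchical model; each node has at most one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            # A parent may have many children.
            parent.children.append(self)

    def path_from_root(self):
        """Return the node names from the root down to this node."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


root = Node("Company")                      # top-most node (root)
sales = Node("Sales", parent=root)
finance = Node("Finance", parent=root)
clerk = Node("Accounts Clerk", parent=finance)

path = clerk.path_from_root()
print(path)
```

Because every child has exactly one parent, there is exactly one path from the root to any record.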
2. Network Model

• The network model is similar to the hierarchical
model. The difference is that a child node can
have more than one parent node.
• The links between parent and child nodes are
represented by arrows in the network model. It
requires a more complex diagram to represent a database.
• It also provides more flexibility than
the hierarchical model.
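The one difference from the hierarchical model, a child with several parents, can be sketched in Python as follows. The supplier, project and part names are hypothetical sample data.

```python
from collections import defaultdict

# Map each child record to the set of its parent records.
parents_of = defaultdict(set)


def link(parent, child):
    """Record a parent-child link; a child may be linked to several parents."""
    parents_of[child].add(parent)


link("Supplier A", "Part 17")
link("Project X", "Part 17")
link("Project X", "Part 42")

part17_parents = sorted(parents_of["Part 17"])
print(part17_parents)
```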
3 . Relational Model
• The relational model is the most commonly used
database model.
• It is more flexible than hierarchical and network
database models.
• The relational model consists of simple relations.
A relation is a term used for table.
• A relation represents a particular entity. It is used
to store information about the entity.
• The relationships are based on the data of the
entities: rows in different relations are linked
through matching values.
[Diagram: relationship between entities]
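The idea that relationships are based on the data itself can be sketched with Python's standard sqlite3 module. The department and employee tables, and the dept_id column that links them, are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name TEXT,
        dept_id INTEGER REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (10, 'Finance');
    INSERT INTO employee VALUES (1, 'Asha', 10);
    """
)

# The relationship is expressed through matching dept_id values, not pointers.
joined = conn.execute(
    "SELECT e.name, d.dept_name FROM employee e "
    "JOIN department d ON e.dept_id = d.dept_id"
).fetchall()
print(joined)
```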
Structured query language

• A structured query language (e.g., SQL) uses a
collection of imperative verbs (such as CREATE,
ALTER and DROP) whose effect is
to modify the schema of the database by
adding, changing, or deleting definitions of
tables or other elements.
• These statements can be freely mixed with
other SQL statements, so the DDL is not a
separate language.
SQL
• CREATE TABLE statement
– A commonly used CREATE command is the CREATE
TABLE command. The typical usage is:
– CREATE TABLE [table name] ( [column definitions] )
[table parameters]
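The CREATE TABLE pattern above can be run with Python's standard sqlite3 module. This is a sketch under assumptions: the employees table and its column names are hypothetical, and the exact column types and table parameters accepted vary from one DBMS to another.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE TABLE [table name] ( [column definitions] )
conn.execute(
    """
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY,
        first_name TEXT NOT NULL,
        last_name TEXT NOT NULL
    )
    """
)
# ALTER TABLE is another DDL verb: it changes an existing definition.
conn.execute("ALTER TABLE employees ADD COLUMN date_of_birth TEXT")

# Read the resulting column names back from the schema.
column_names = [row[1] for row in conn.execute("PRAGMA table_info(employees)")]
print(column_names)
```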
Database Creation & Modification
• DBMS removes the control of the database from
the programmers and end users and places the
responsibility in the hands of the specialist called
DBA
• To protect the integrity & security of the organization's
databases, the DBA uses the DDL to develop and specify the
data contents, relationships and structure of each database
• DBA also uses the DDL to modify these database
specifications when necessary. Such information
is catalogued and stored in the Data Dictionary of
the DBMS
• Data Dictionary is a software module and its
database contains metadata
• It contains the names and descriptions of all
types of data records and their inter-
relationships
File Management & Database Management
Topic | File Management | Database Management
Data Redundancy | High | Low
Data Independence | Re-organization of record format or access method means the application has to be changed | Similar re-organization has little or no impact on the application
Application Maintenance | Low degree of data independence means higher maintenance cost due to data file re-organization | High data independence reduces maintenance cost due to database reorganization
Integrity Controls | Provided by the programmer (if at all) | Provided by the DBMS
Unanticipated Queries | A custom program has to be written | A query language can be used to quickly answer ad hoc queries
Data Relationships | Handled by the application (if at all) | An integral part of the DBMS
Database Ownership | Applications may have their own private data | All the data belongs to the DBMS
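The "Integrity Controls" row can be illustrated with Python's standard sqlite3 module. In this sketch the rule "salary cannot be negative" is declared once in the schema, so the DBMS enforces it for every program. The payroll table and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The rule is declared once in the schema, so every application is bound by it.
conn.execute(
    "CREATE TABLE payroll ("
    "emp_id INTEGER PRIMARY KEY, "
    "basic_salary REAL NOT NULL CHECK (basic_salary >= 0))"
)
conn.execute("INSERT INTO payroll VALUES (1, 25000.0)")

try:
    conn.execute("INSERT INTO payroll VALUES (2, -500.0)")
    rejected = False
except sqlite3.IntegrityError:
    # The DBMS refuses the row; no application code had to check it.
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM payroll").fetchone()[0]
print(rejected, row_count)
```

In a file management approach the same check would have to be repeated, by hand, in every program that writes to the payroll file.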
Data Flow Diagram (DFD)
• DFD can help to identify the flow of data in a
system without specifying the media or
hardware involved
• It can illustrate the logical relationships among
data, flows, external entities (source &
destination )
DFD
The 4 basic symbols used are
1 . External entity (eg : Customer)
2 . Process (eg: preparing reports )
3 . Data flow (eg: payment)
4. Data store (eg: customer file)
DFD
The 4 basic symbols used are
1 . External entity (eg : Vendor)
2 . Process (eg: sending quotation )
3 . Data flow (eg: payment)
4. Data store (eg: vendor )
DFD
Vendor (external entity) -> Quotation (data) -> Capture Quotation details (process)
-> Transaction data -> Transaction File (data store)
-> Sort Transaction (process) -> Sorted Transactions -> Sorted Transaction File (data store)
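The vendor quotation flow can be mirrored in a short Python sketch, with functions standing in for the processes and lists standing in for the data stores. The function names, field names and vendor data are hypothetical, and a DFD itself says nothing about how a process is implemented.

```python
def capture_quotation_details(quotation):
    """Process 1: turn a vendor quotation into a transaction record."""
    return {"vendor": quotation["vendor"], "amount": quotation["amount"]}


def sort_transactions(transactions):
    """Process 2: sort the transaction records by vendor name."""
    return sorted(transactions, key=lambda t: t["vendor"])


# Data flow from the external entity (made-up quotations from two vendors).
quotations = [
    {"vendor": "Zenith Traders", "amount": 1200},
    {"vendor": "Apex Supplies", "amount": 950},
]

# Data store: transaction file.
transaction_file = [capture_quotation_details(q) for q in quotations]
# Data store: sorted transaction file.
sorted_transaction_file = sort_transactions(transaction_file)
print([t["vendor"] for t in sorted_transaction_file])
```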
Data Warehouse
• DW is designed to give people information
rather than data.
• An example of information would be that the
profitability of our company has increased by
5% per month over the last six months.
• In order to provide good quality information
for a company (in other words, in order to
build a data warehouse) we have to overcome
two main problems:
DWH
• The first is that the data is scattered across multiple transactional systems,
so we have to extract it from those systems, transform it into a
standardized format and finally load it into a central repository
called a data warehouse.
• The second is that we have to reorganize it so that it is presented to
the users in an understandable way. As a side issue we need to
ensure that the information system is very responsive; in many
cases we try to get response times down below two seconds.
• The term "data warehouse," in its broadest sense, describes both
the centralized repository and the system for delivering information.
Transactional systems, by contrast, are woefully sub-optimal
for carrying out detailed and complex analysis.
DWH
• ETL (Extract – Transform – Load )
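The three ETL steps can be sketched in Python, using the standard sqlite3 module as a stand-in for the central repository. The two source systems, their field names and the sales_fact table are assumptions made for illustration; real ETL tools deal with far larger volumes and many more formats.

```python
import sqlite3

# Extract: rows as they might arrive from two transactional systems
# (made-up sample data with inconsistent formats).
sales_system = [{"cust": "  asha ", "amount": "1,200.50"}]
billing_system = [{"customer_name": "RAVI", "total": 300}]


def transform(name, amount):
    """Standardise the name and convert the amount to a float."""
    return name.strip().title(), float(str(amount).replace(",", ""))


rows = [transform(r["cust"], r["amount"]) for r in sales_system]
rows += [transform(r["customer_name"], r["total"]) for r in billing_system]

# Load: write the standardised rows into the central repository.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales_fact (customer TEXT, amount REAL)")
warehouse.executemany("INSERT INTO sales_fact VALUES (?, ?)", rows)

loaded = warehouse.execute(
    "SELECT customer, amount FROM sales_fact ORDER BY customer"
).fetchall()
print(loaded)
```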
Big Data
• Big data is a term that describes the large
volume of data – both structured and
unstructured – that accumulates in a business
on a day-to-day basis.
• Big data can be analyzed for insights that lead
to better decisions and strategic business
moves.
BD characteristics
• Volume. Organizations collect data from a variety of
sources, including business transactions, social media and
information from sensor or machine-to-machine data. In
the past, storing it would’ve been a problem – but new
technologies (such as Hadoop) have eased the burden.
• Velocity. Data streams in at an unprecedented speed and
must be dealt with in a timely manner. RFID tags, sensors
and smart metering are driving the need to deal with
torrents of data in near-real time.
• Variety. Data comes in all types of formats – from
structured, numeric data in traditional databases to
unstructured text documents, email, video, audio, stock
ticker data and financial transactions.
Two additional dimensions when it
comes to big data:
• Variability. In addition to the increasing velocities and
varieties of data, data flows can be highly inconsistent
with periodic peaks. Is something trending in social
media? Daily, seasonal and event-triggered peak data
loads can be challenging to manage. Even more so with
unstructured data.
• Complexity. Today's data comes from multiple sources,
which makes it difficult to link, match, cleanse and
transform data across systems. However, it’s necessary
to connect and correlate relationships, hierarchies and
multiple data linkages or your data can quickly spiral
out of control.
Why Is Big Data Important?
The importance of big data doesn’t revolve around how much data you
have, but what you do with it. One can take data from any source
and analyze it to find answers that enable
1) cost reductions
2) time reductions
3) new product development and optimized offerings
4) smart decision making.

When big data is combined with high-powered analytics, one can
accomplish business-related tasks such as:
• Determining root causes of failures, issues and defects in near-real
time.
• Generating coupons at the point of sale based on the customer’s
buying habits.
• Recalculating entire risk portfolios in minutes.
• Detecting fraudulent behavior before it affects your organization.
Who uses big data?
Banking

• With large amounts of information streaming in from
countless sources, banks are faced with finding new
and innovative ways to manage big data.
• While it’s important to understand customers and
boost their satisfaction, it’s equally important to
minimize risk and fraud while maintaining regulatory
compliance.
• Big data brings big insights, but it also requires financial
institutions to stay one step ahead of the game with
advanced analytics.
Health Care

• Patient records, treatment plans and prescription
information.
• When it comes to health care, everything needs
to be done quickly, accurately – and, in some
cases, with enough transparency to satisfy
stringent industry regulations.
• When big data is managed effectively, health care
providers can uncover hidden insights that
improve patient care.
How It Works
Streaming data

This category includes data that reaches your
IT systems from a web of connected devices.
You can analyze this data as it arrives and
make decisions on what data to keep, what
not to keep and what requires further
analysis.
Social media data
• The data on social interactions is an
increasingly attractive set of information,
particularly for marketing, sales and support
functions.
• It's often in unstructured or semi-structured
forms, so it poses a unique challenge when it
comes to consumption and analysis.
The final step in making big data work for your business is to
research the technologies that help you make the most of
big data and big data analytics. Consider:

• Cheap, abundant storage.
• Faster processors.
• Affordable open source, distributed big data platforms,
such as Hadoop.
• Parallel processing, clustering, virtualization, large grid
environments, high connectivity and high throughputs.
• Cloud computing and other flexible resource allocation
arrangements.
Data Analytics

• Big Data analytics is a process in which large sets of
data (Big Data) are collected, organized and analysed to
discover useful patterns, including hidden patterns,
market trends and customer preferences.
• These patterns provide useful information that can
help a company make future decisions. Data
analytics comprises techniques of data analysis.
• These techniques include algorithms and data mining
methods to give results with fewer calculations.
Data Mining

• Data mining is a process by which companies extract
useful information from raw data (data may be in any
form, i.e. structured, unstructured or semi-structured).
• Using one or more software tools, patterns are
discovered in huge sets of data that help to learn about
customers and develop effective marketing strategies.
• Data mining techniques help to convert data into one
form so that the data can be retrieved easily from
the server.
• Another name for data mining is KDD (Knowledge
Discovery in Databases).
Data Science

• Data Science is exploratory and useful in getting
to know the data.
• Automated methods are used to analyse massive
amounts of data.
• Data science can be seen as the parent of all of the above
(data analytics, data analysis and data mining).
• Data science involves demanding and complex
Mathematics.
• It consists of tools, methods, processes,
algorithms and systems for gaining insights from data.
Machine Learning

• Machine learning is a distinct field from data science,
but it can be used to
make machines learn (improve performance on
specific tasks) from data.
• It is a field of Computer Science that gives computers
ability to learn using different statistical techniques
• From huge data sets, machine learning can apply
knowledge to excel at speech recognition, facial
recognition or many other tasks.
• Machine learning provides systems with the
opportunity of recognizing patterns and making
predictions.
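"Recognizing patterns and making predictions" can be illustrated with about the simplest learning method there is: fitting a straight line to past data by least squares and using it to predict a new value. This is only a sketch; the advertising-spend figures are made up, and practical machine learning uses much larger data sets and richer models.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical training data: advertising spend vs. units sold.
spend = [1.0, 2.0, 3.0, 4.0]
units = [3.0, 5.0, 7.0, 9.0]

# "Learning": estimate the pattern from the data.
slope, intercept = fit_line(spend, units)
# "Prediction": apply the learned pattern to an unseen input.
prediction = slope * 5.0 + intercept
print(slope, intercept, prediction)
```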
Thank You
Data Flow Diagram (DFD)

• A data flow diagram (DFD) is a graphical representation of the
"flow" of data through an information system, modelling its process
aspects. A DFD is often used as a preliminary step to create an
overview of the system without going into great detail, which can
later be elaborated. DFDs can also be used for the visualization of
data processing (structured design).
• A DFD shows what kind of information will be input to and output
from the system, how the data will advance through the system,
and where the data will be stored. It does not show information
about process timing or whether processes will operate in
sequence or in parallel, unlike a traditional structured flowchart
which focuses on control flow, or a UML activity workflow diagram,
which presents both control and data flows as a unified model.
