
Database Management Systems
1 BASICS OF DATA ARRANGEMENT AND
ACCESS
 The Data Hierarchy: A bit (binary digit)
represents the smallest unit of data a computer
can process (a 0 or a 1); a byte (a group of eight
bits) represents a single character, which can be
a letter, a number, or a symbol.
 Field: A logical grouping of characters into a
word, a small group of words, or a complete
number.
 Record: A logical grouping of related fields.
 File: A logical grouping of related records.
 Database: A logical grouping of related files.
 Entity: A person, place, thing, or event about
which information is maintained in a record.
 Attribute: Each characteristic or quality
describing a particular entity.
 Primary key: The identifier field that uniquely
identifies a record.
 Secondary key: An identifier field that has
some identifying information, but typically
does not identify a record with complete
accuracy.
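The hierarchy and the two kinds of key can be sketched in a few lines of Python. This is only an illustration: the StudentRecord type, its fields, and the sample values are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    """A record: a logical grouping of related fields (hypothetical example)."""
    student_id: int  # field used as the primary key (unique per record)
    name: str        # ordinary field
    major: str       # field used as a secondary key (not unique)


# A file: a logical grouping of related records.
student_file = [
    StudentRecord(1001, "Ada", "Math"),
    StudentRecord(1002, "Ben", "History"),
    StudentRecord(1003, "Cy", "Math"),
]

# A database: a logical grouping of related files.
database = {"students": student_file}


def find_by_primary_key(records, student_id):
    """Return the single record with this primary key, or None."""
    for record in records:
        if record.student_id == student_id:
            return record
    return None


def find_by_secondary_key(records, major):
    """Return every record that shares this secondary-key value."""
    return [record for record in records if record.major == major]
```

A primary-key lookup returns at most one record, while a secondary-key lookup may return several, which is why a secondary key does not identify a record with complete accuracy.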
2 THE TRADITIONAL FILE
ENVIRONMENT
A data file is a collection of logically
related records. In the traditional file
management environment, each
application has a specific data file related
to it, containing all the data records
needed by the application.
Problems With the Data File Approach
 Data redundancy
 Data inconsistency
 Data isolation
 Data security
 Data integrity
 Lack of application/data independence
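Data redundancy and data inconsistency can be illustrated with a small Python sketch. It assumes two hypothetical applications (billing and shipping) that each keep their own copy of a customer address; the names and values are invented for the example.

```python
# Each application owns its own file, so the address is stored twice (redundancy).
billing_file = [{"customer_id": 7, "address": "12 Oak St"}]
shipping_file = [{"customer_id": 7, "address": "12 Oak St"}]


def update_address(file_records, customer_id, new_address):
    """Update the address in ONE application's file only."""
    for record in file_records:
        if record["customer_id"] == customer_id:
            record["address"] = new_address


def inconsistent_customers(file_a, file_b):
    """List customer ids whose address differs between the two files."""
    addresses_b = {r["customer_id"]: r["address"] for r in file_b}
    return [
        r["customer_id"]
        for r in file_a
        if r["customer_id"] in addresses_b
        and addresses_b[r["customer_id"]] != r["address"]
    ]


# The billing application records a move; the shipping file is never told.
update_address(billing_file, 7, "99 Elm Ave")
```

After the update, the two files disagree about the same customer, which is the inconsistency that a shared database is meant to prevent.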
3 DATABASES: THE MODERN
APPROACH
Database. A logical group of
related files that stores data
and the associations among
them.
Creating the Database
To create a database, designers must develop a
conceptual design and a physical design.
 Conceptual design: An abstract model of a
database from the user or business
perspective.
 Physical design: Layout that shows how a
database is actually arranged on storage
devices.
 Entity-relationship modeling: The process of
designing a database by organizing data entities
to be used and identifying the relationships among
them.
 Entity-relationship (ER) diagram: Document that
shows data entities and attributes and
relationships among them.
 Entity classes: A grouping of entities of a given
type.
 Instance: A particular entity within an entity class.
 Identifier: An attribute that identifies an entity
instance.
 Relationships: The conceptual linking of
entities in a database.
 The number of entities in a relationship is the
degree of the relationship. Relationships
between two items are common and are
called binary relationships.
 There are three types of binary relationships:
 In a 1:1 (one-to-one) relationship, a single entity
instance of one type is related to a single entity
instance of another type.
 In a 1:M (one-to-many) relationship, a single entity
instance of one type is related to many entity
instances of another type.
 In an M:M (many-to-many) relationship, a single entity
instance of one type is related to many entity
instances of another type, and vice versa.
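One common way to realize the three binary relationships in a relational schema is sketched below with Python's built-in sqlite3 module. The tables (student, parking_permit, professor, course, enrollment) are hypothetical examples chosen for illustration, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE professor (professor_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE student   (student_id   INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- 1:1  one student holds at most one permit (UNIQUE foreign key).
CREATE TABLE parking_permit (
    permit_id  INTEGER PRIMARY KEY,
    student_id INTEGER NOT NULL UNIQUE REFERENCES student(student_id)
);

-- 1:M  one professor teaches many courses (foreign key on the "many" side).
CREATE TABLE course (
    course_id    INTEGER PRIMARY KEY,
    title        TEXT NOT NULL,
    professor_id INTEGER NOT NULL REFERENCES professor(professor_id)
);

-- M:M  students take many courses and courses have many students
--      (a separate linking table holds one row per pairing).
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
""")

conn.execute("INSERT INTO professor VALUES (1, 'Dr. Rao')")
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Ada"), (2, "Ben")])
conn.execute("INSERT INTO parking_permit VALUES (10, 1)")
conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?)",
    [(100, "Databases", 1), (101, "Networks", 1)],
)
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?)", [(1, 100), (1, 101), (2, 100)]
)

courses_per_professor = conn.execute(
    "SELECT COUNT(*) FROM course WHERE professor_id = 1"
).fetchone()[0]
students_in_databases = conn.execute(
    "SELECT COUNT(*) FROM enrollment WHERE course_id = 100"
).fetchone()[0]
```

The design choice to note is the linking table: a many-to-many relationship is stored as two one-to-many relationships that meet in enrollment.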
Entity-relationship diagram model
 Normalization: A method for analyzing
and reducing a relational database to its
most streamlined form for minimum
redundancy, maximum data integrity,
and best processing performance.
Non-normalized relation
Normalized relation
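As a rough illustration of the difference between a non-normalized and a normalized relation, the Python sketch below splits a flat order list into two relations. The order and customer data are hypothetical; the original figures are not reproduced here.

```python
# Non-normalized relation: the customer name is repeated on every order row.
orders_flat = [
    {"order_id": 1, "customer_id": 7, "customer_name": "Ada", "item": "Disk"},
    {"order_id": 2, "customer_id": 7, "customer_name": "Ada", "item": "Tape"},
    {"order_id": 3, "customer_id": 8, "customer_name": "Ben", "item": "Disk"},
]

# Normalized relations: each customer fact is stored once, and orders
# refer to the customer by key only.
customers = {}
orders = []
for row in orders_flat:
    customers[row["customer_id"]] = {"customer_name": row["customer_name"]}
    orders.append(
        {
            "order_id": row["order_id"],
            "customer_id": row["customer_id"],
            "item": row["item"],
        }
    )
```

After the split, correcting a customer's name means changing one entry in customers rather than every matching order row.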
4 DATABASE MANAGEMENT SYSTEMS

 Database management system
(DBMS): The software program (or
group of programs) that provides
access to a database.
Logical versus Physical View
 Physical view: The plan for the actual,
physical arrangement and location of data in
the direct access storage devices (DASDs) of
a database management system.
 Logical view: The user’s view of the data
and the software programs that process that
data in a database management system.
DBMS Components
 Data model: Definition of the way data in a DBMS are
conceptually structured.
 Data definition language (DDL): Set of statements that
describe a database structure (all record types and
data set types).
 Schema: The logical description of the entire
database and the listing of all the data items and the
relationships among them.
 Subschema: The specific set of data from the
database that is required by each application.
 Data manipulation language (DML):
Instructions used with higher-level
programming languages to query the
contents of the database, store or update
information, and develop database
applications.
 Structured query language (SQL): Popular
relational database language that enables
users to perform complicated searches with
relatively simple instructions.
 Query by example (QBE): Database language
that enables the user to fill out a grid (form) to
construct a sample or description of the data
wanted.
 Data dictionary: Collection of definitions of data
elements, data characteristics that use the data
elements, and the individuals, business
functions, applications, and reports that use these
data elements.
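The roles of DDL, DML, and SQL queries can be seen in a minimal sketch using Python's built-in sqlite3 module. The employee table and its rows are hypothetical, and SQLite's PRAGMA table_info is used here only as a stand-in for the kind of metadata a data dictionary holds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: describe the structure of the database.
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL, dept TEXT, salary REAL)"
)

# DML: store and update information.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Ada", "IT", 5000.0), (2, "Ben", "HR", 4000.0), (3, "Cy", "IT", 4500.0)],
)
conn.execute("UPDATE employee SET salary = salary + 500 WHERE emp_id = 2")

# SQL query: a search expressed with a relatively simple instruction.
it_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM employee WHERE dept = 'IT' ORDER BY name"
    )
]

# Metadata lookup: column names as recorded by the DBMS itself.
columns = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]
```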
5 LOGICAL DATA MODELS
 The three most common data models are
hierarchical, network, and relational. Other
types of data models include
multidimensional, object-relational,
hypermedia, embedded, and virtual.
 Hierarchical and network DBMSs usually tie
related data together through linked lists.
Relational and multidimensional DBMSs
relate data through information contained in
the data.
Hierarchical Database Model
 The hierarchical database model rigidly structures data
into an inverted “tree” in which each record contains
two elements: a single root or master field, often
called a key, and a variable number of subordinate
fields.
 The strongest advantage of the hierarchical
database approach is the speed and efficiency with
which it can be searched for data.
 The hierarchical model does have problems: Access
to data in this model is predefined by the database
administrator before the programs that access the
data are written. Programmers must follow the
hierarchy established by the data structure.
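A hierarchical structure and its predefined access path can be sketched as a small tree in Python. The node names (Company, Sales, HR, Order-17) are hypothetical and serve only to show that every record is reached by walking down from the root.

```python
# Each node has a key and a variable number of subordinate nodes.
hierarchy = {
    "key": "Company",
    "children": [
        {"key": "Sales", "children": [{"key": "Order-17", "children": []}]},
        {"key": "HR", "children": []},
    ],
}


def find_path(node, target, path=()):
    """Return the root-to-target path of keys, or None if the key is absent."""
    path = path + (node["key"],)
    if node["key"] == target:
        return path
    for child in node["children"]:
        found = find_path(child, target, path)
        if found is not None:
            return found
    return None
```

Calling find_path(hierarchy, "Order-17") returns the path from Company through Sales to the order. There is no direct route between HR and Order-17 except through the root, which reflects the limited query flexibility of the model.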
Network Database Model

 Data model that creates relationships among
data in which subordinate records can be
linked to more than one data element.
Relational Database Model
 Data model based on the simple concept of tables in
order to capitalize on characteristics of rows and
columns of data.
 Relations: The tables of rows and columns used in a
relational database.
 Tuple: A row of data in the relational database
model.
 Attribute: A column of data in the relational database
model.
Three basic operations of a relational database:

 “Select” operation: creates a subset
consisting of all file records that meet stated
criteria.
 “Join” operation: combines relational tables.
 “Project” operation: creates a subset
consisting of columns in a table, permitting
the user to create new tables that contain
only the information required.
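The three operations can be sketched in plain Python, treating a relation as a list of dictionaries (one dictionary per tuple). The function names mirror the operations above; the employees and departments relations are hypothetical sample data.

```python
employees = [
    {"emp_id": 1, "name": "Ada", "dept_id": 10},
    {"emp_id": 2, "name": "Ben", "dept_id": 20},
    {"emp_id": 3, "name": "Cy", "dept_id": 10},
]
departments = [
    {"dept_id": 10, "dept_name": "IT"},
    {"dept_id": 20, "dept_name": "HR"},
]


def select(relation, predicate):
    """Select: keep only the rows that meet the stated criteria."""
    return [row for row in relation if predicate(row)]


def project(relation, columns):
    """Project: keep only the named columns, dropping duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        values = tuple(row[column] for column in columns)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(columns, values)))
    return result


def join(left, right, key):
    """Join: combine rows from two relations that agree on a shared column."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


it_staff = select(employees, lambda row: row["dept_id"] == 10)
dept_ids = project(employees, ["dept_id"])
staff_with_dept = join(employees, departments, "dept_id")
```

Here select returns rows, project returns columns, and join produces a wider relation that carries the department name alongside each employee.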
Advantages and Disadvantages of Logical Data Models

Model | Advantages | Disadvantages
Hierarchical database | Searching is fast and efficient. | Access to data is predefined by exclusively hierarchical relationships, predetermined by administrator. Limited search/query flexibility. Not all data are naturally hierarchical.
Network | Many more relationships can be defined. There is greater speed and efficiency than with relational database models. | This is the most complicated database model to design, implement, and maintain. Greater query flexibility than with hierarchical model, but less than with relational model.
Relational database | Conceptual simplicity; there are no predefined relationships among data. High flexibility in ad-hoc querying. New data and records can be added easily. | Processing efficiency and speed are lower. Data redundancy is common, requiring additional maintenance.
Emerging Data Models
 Multidimensional DB: Data warehouses
 Object-oriented DB: Includes objects in the database (object attributes,
classes, methods, messages)
 Hypermedia DB
 Object-relational database model: Data model that adds new object storage
capabilities to relational databases.
 Includes traditional data, complex objects (time series and geospatial
data), audio, video, etc. Has both data and processes.
 Hypermedia database model: Data model that stores chunks of information in
nodes that can contain data in a variety of media (including executable
programs); users can branch to related data in any kind of relationship,
structured by DBMS.
Specialized Database Models
 Geographical information database: Data
model that contains locational data for
overlaying on maps or images.
 Knowledge database: Data model that can
store decision rules that can be used for
expert decision making.
 Small-footprint database: The subset of a
larger database provided for field workers.
 Embedded database: A database built into
devices or into applications; designed to be
self-sufficient and to require little or no
administration.
 Virtual database: A database that consists
only of software; manages data that can
physically reside anywhere on the network
and in a variety of formats.
Data Life Cycle

Data Sources
 Internal Data Sources: data about people,
products, services, and processes.
 Personal Data: IS users or other corporate
employees may document their own expertise
by creating personal data.
 External Data Sources: Data from commercial
databases to sensors and satellites.

2 Data Warehousing
 Transaction Processing: The data are
organized in a hierarchical structure and
centrally processed.
 Analytical Processing: Analysis of
accumulated data.
 Data Warehouse: A repository of subject-
oriented historical data that are organized to
be accessible in a form readily acceptable for
analytical processing.

Characteristics of a Data Warehouse
 Organization. Data are organized by subject and contain
information relevant for decision support only.
 Consistency. Data in different operational databases may be
encoded differently. In the data warehouse, though, they will be
coded in a consistent manner.
 Time variant. The data are kept for many years so that they can
be used for trends, forecasting, and comparisons over time.
 Non-volatile. Data are not updated once entered into the
warehouse.
 Multidimensional. Typically the data warehouse uses a
multidimensional structure.
 Web-based. Today’s data warehouses are designed to provide
an efficient computing environment for web-based applications.
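The consistency characteristic can be illustrated with a short Python sketch of recoding values as they are loaded into the warehouse. The two source systems (billing and CRM), their status codes, and the mapping table are all hypothetical assumptions made for the example.

```python
# Hypothetical mapping from source-system encodings to one warehouse code.
CANONICAL_STATUS = {
    "a": "active",
    "active": "active",
    "1": "active",
    "i": "inactive",
    "inactive": "inactive",
    "0": "inactive",
}


def to_warehouse_code(raw_value):
    """Translate any known source encoding into the warehouse's single code."""
    return CANONICAL_STATUS[str(raw_value).strip().lower()]


# The same attribute is encoded differently in two operational databases.
billing_rows = [{"customer_id": 7, "status": "A"}]
crm_rows = [{"customer_id": 8, "status": 0}]

warehouse_rows = [
    {"customer_id": row["customer_id"], "status": to_warehouse_code(row["status"])}
    for row in billing_rows + crm_rows
]
```

An unknown source code raises a KeyError in this sketch, which surfaces an unmapped value instead of loading it silently.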

Building a Data Warehouse

Relational and Multidimensional Database

 Relational databases store data in two-dimensional
tables. Multidimensional
databases typically store data in arrays,
which consist of at least three business
dimensions.
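The contrast between a two-dimensional table and a multidimensional array can be sketched in Python. The example assumes three hypothetical business dimensions (product, region, month) and invented sales amounts; a dictionary keyed by a three-part tuple stands in for the array.

```python
# Relational view: one flat table of rows and columns.
sales_rows = [
    ("Disk", "East", "Jan", 100),
    ("Disk", "West", "Jan", 80),
    ("Tape", "East", "Feb", 50),
    ("Disk", "East", "Feb", 120),
]

# Multidimensional view: each cell is addressed by (product, region, month).
cube = {}
for product, region, month, amount in sales_rows:
    cell = (product, region, month)
    cube[cell] = cube.get(cell, 0) + amount


def roll_up(cube, axis):
    """Total the measure along one dimension (0=product, 1=region, 2=month)."""
    totals = {}
    for dims, amount in cube.items():
        totals[dims[axis]] = totals.get(dims[axis], 0) + amount
    return totals


sales_by_product = roll_up(cube, 0)
sales_by_region = roll_up(cube, 1)
```

Rolling up along a different axis answers a different business question from the same cells, without restructuring the data.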

Data Marts
 Data Mart: A small data warehouse designed for a
strategic business unit (SBU) or a department.
 The advantages of data marts include:
low cost (prices under $100,000 versus $1 million or
more for data warehouses);
significantly shorter lead time for implementation (often
less than 90 days);
local rather than central control (conferring power on
the using group);
more rapid response, and easier to understand and
navigate than an enterprise-wide data warehouse.

3 Information & Knowledge Discovery with
Business Intelligence
 Business Intelligence: A broad category of
applications and techniques for gathering,
storing, analyzing, and providing access to
data to help enterprise users make better
business and strategic decisions.

How Business Intelligence Works

The Tools and techniques of business intelligence

 The major applications include the activities of
query and reporting, online analytical
processing, decision support, data mining,
forecasting, and statistical analysis.
 BI tools are divided into two major categories:
 (1) information and knowledge discovery
 (2) decision support and intelligent analysis.

Categories of business intelligence

Knowledge Discovery (KD)
 The process of extracting knowledge from
volumes of data; includes data mining.

Stages in the Evolution of Knowledge Discovery

Evolutionary stage | Business question | Enabling technologies | Characteristics
Data collection (1980s) | What was my total revenue in the last 5 years? | Computers, tapes, disks | Retrospective, static data delivery
Data access (1980s) | What were unit sales in New England last March? | Relational databases (RDBMS), structured query language (SQL) | Retrospective, dynamic data delivery at record level
Data warehousing and decision support (early 1990s) | What were the sales in region A by product, by salesperson? | OLAP, multidimensional databases, data warehouses | Retrospective, proactive data delivery at multiple levels
Intelligent data mining (late 1990s) | What’s likely to happen to the Boston unit’s sales next month? Why? | Advanced algorithms, multiprocessor computers, massive databases | Prospective, proactive information delivery
Advanced intelligent systems; complete integration (2000-2004) | What is the best plan to follow? How did we perform compared to metrics? | Neural computing, advanced AI models, complex optimization, web services | Proactive, integrative; multiple business partners

4 Data Mining Concepts
 Data mining: The process of searching for
valuable business information in a large
database, data warehouse, or data mart.
 Data mining capabilities include:
1) Automated prediction of trends and
behaviours, and
2) Automated discovery of previously
unknown patterns.
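Automated discovery of patterns can be illustrated with a very small Python sketch that counts which items are bought together. The transactions and the min_support threshold are hypothetical, and simple pair counting is used here as one basic technique, not as the method any particular data mining product uses.

```python
from collections import Counter
from itertools import combinations

# Hypothetical retail baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "milk", "eggs"},
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Keep only pairs that occur in at least min_support baskets.
min_support = 3
frequent_pairs = {
    pair: count for pair, count in pair_counts.items() if count >= min_support
}
```

With these baskets, only the bread and milk pair reaches the threshold, so it is the one pattern reported.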

Data Mining Applications
Retailing and sales
Banking
Manufacturing and production
Insurance
Police work
Health care
Marketing

Web Mining
The application of data mining techniques to discover
actionable and meaningful patterns, profiles, and
trends from web resources.
Web mining is used in the following areas:
information filtering, surveillance, mining of
web-access logs for analyzing usage, assisted browsing,
and services that fight crime on the internet.
Web mining can perform the following functions:
Resource discovery
Information extraction
Generalization
5 Data Visualization Technologies
Data Visualization: Visual presentation of
data by technologies such as graphics,
multidimensional tables and graphs, videos
and animation, and other multimedia formats.

7 Knowledge Management
 Knowledge: Information that is contextual,
relevant, and actionable.
 Intellectual capital (intellectual assets): Other
terms for knowledge.

