
Entity-Relationship Model

Database System Concepts 2.1 ©Silberschatz, Korth and Sudarshan


HOW TO START MODELING

ENTERPRISE MODEL

BUSINESS PROCESS MODEL

REQUIREMENT OF INFORMATION

RELATIONSHIP AMONG INFORMATION

DECIDE ON THE TYPE OF MODEL



HOW TO CONSTRUCT A HOUSE

MOBILISE THE FINANCES

SELECT A PLACE/SITE & BUY IT

REQUIREMENT OF MATERIALS
AS PER PLAN

HIRING OF LABOUR FORCE

COMPLETE CONSTRUCTION & OCCUPY



PUBLISHING A BOOK

■ DECIDE THE SUBJECT ON WHICH YOU ARE GOING TO WRITE AND PUBLISH A BOOK.
■ COLLECT ALL RELEVANT DATA FROM VARIOUS SOURCES.
■ ORGANISE THE SUBJECT CHAPTERWISE & SEQUENCE THE CHAPTERS.
■ DEVELOP THE SUBJECT IN EACH CHAPTER.
■ ADD ILLUSTRATIONS, EXAMPLES, QUOTES.
■ ADD A SUMMARY, REVIEW QUESTIONS, SELF STUDY WORK AND A CASE STUDY AT THE END OF EACH CHAPTER.
■ ONCE ALL THE CHAPTERS ARE READY, PREPARE THE CONTENTS, PREFACE, APPENDICES, BIBLIOGRAPHY AND REFERENCES.
■ PREPARE THE NAME INDEX AND SUBJECT INDEX.
■ SETTLE THE PUBLISHING AGENCY, PRICING AND MARKETING.


WHAT IS DATA MODELING

■ It is a conceptual framework which defines the logical relationships among the data elements needed to support a basic business process or other activities.
■ It is the underlying structure of a database.
■ It is a map or diagram that represents entities and their relationships.
■ It is a collection of tools for describing
★ data
★ data relationships
★ data semantics
★ data constraints

What is a database all about

A database model is defined as a combination of the following components:

■ A collection of data object types, which form the basic building blocks for any database that conforms to the model.
■ A collection of general integrity rules, which constrain the set of occurrences of these object types that can legally appear in any such database.
■ A collection of operators, which can be applied to such object occurrences for retrieval and other purposes.


TYPES OF MODELS AVAILABLE
■ OBJECT BASED LOGICAL MODELS.

■ RECORD BASED LOGICAL MODELS.


■ PHYSICAL MODELS.



TYPES OF MODELS AVAILABLE

OBJECT BASED LOGICAL MODELS.


★ Entity – relationship (ER) model
★ Object oriented model
★ Semantic data model
★ Functional data model
RECORD BASED LOGICAL MODELS.

★ Relational model
★ Network model
★ Hierarchical model
PHYSICAL MODELS.

★ Unifying model
★ Frame-memory model

What is an Object-Based Logical Model

■ Object-based logical models are used to describe data at the logical and view levels.
■ They are characterised by fairly flexible structuring capabilities.
■ They allow data constraints to be specified explicitly.


What is the ER model

■ It is used at the design stage of the database.
■ It plays a role like that of the flow chart in computer programming.
■ Just as flow charts use certain symbols, ER modeling uses symbols too.
■ An entity is a thing that exists, living or otherwise, and that is distinguishable from other objects. It is described by some of its characteristics, called attributes.


Entity Sets

■ A database can be modeled as:


★ a collection of entities,
★ relationship among entities.
■ An entity is an object that exists and is distinguishable from
other objects.
★ Example: specific person, company, event, plant
■ Entities have attributes
★ Example: people have names and addresses
■ An entity set is a set of entities of the same type that share the
same properties.
★ Example: set of all persons, companies, trees, holidays
Entity Relationship Model

★ Entities (objects)
✔ E.g. customers, accounts, bank branches
★ Relationships between entities
✔ E.g. account A-101 is held by customer Sridhar
✔ The relationship set depositor associates customers with accounts
■ Widely used for database design
★ A database design in the E-R model is usually converted to a design in the relational model (coming up next), which is used for storage and processing

Entity Sets customer and loan
[Figure: sample entities of the entity set customer (customer-id, customer-name, customer-street, customer-city) and of the entity set loan (loan-number, amount)]


Attributes

■ An entity is represented by a set of attributes, that is, descriptive properties possessed by all members of an entity set.
Example:
customer = (customer-id, customer-name, customer-street, customer-city)
loan = (loan-number, amount)
■ Domain – the set of permitted values for each attribute
■ Attribute types:
★ Simple and composite attributes
★ Single-valued and multi-valued attributes
✔ E.g. multivalued attribute: phone-numbers
★ Derived attributes
✔ Can be computed from other attributes
✔ E.g. age, given date of birth
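The attribute types listed on this slide can be sketched in Python. This is only an illustration under stated assumptions: the class names `Name` and `Customer`, the field names and the sample customer are made up for the example and do not come from the slides.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Name:
    # Composite attribute: made up of simpler component attributes.
    first_name: str
    last_name: str


@dataclass
class Customer:
    customer_id: str        # simple, single-valued attribute
    name: Name              # composite attribute
    date_of_birth: date     # simple, single-valued attribute
    phone_numbers: list[str] = field(default_factory=list)  # multivalued attribute

    def age(self, today: date) -> int:
        # Derived attribute: computed from date_of_birth instead of being stored.
        years = today.year - self.date_of_birth.year
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month, self.date_of_birth.day)
        return years if had_birthday else years - 1


# Hypothetical sample entity.
c = Customer("C-001", Name("Mary", "Thomas"), date(1980, 6, 15),
             ["044-1234", "044-5678"])
print(c.age(date(2005, 6, 4)))   # 24 (the birthday has not yet occurred that year)
```

Storing the date of birth and deriving the age keeps the data correct as time passes, which is why derived attributes are usually computed on demand.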


Composite Attributes



Relationship Sets

■ A relationship is an association among several entities
Example:
Sridhar            depositor           A-102
customer entity    relationship set    account entity

■ A relationship set is a mathematical relation among n ≥ 2 entities, each taken from entity sets
{(e1, e2, …, en) | e1 ∈ E1, e2 ∈ E2, …, en ∈ En}
where (e1, e2, …, en) is a relationship
★ Example:
(Sridhar, A-102) ∈ depositor

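The definition of a relationship set as a mathematical relation can be checked directly with Python sets. This is a minimal sketch; apart from Sridhar, A-101 and A-102, which appear on the slides, the entities are invented sample data.

```python
from itertools import product

customers = {"Sridhar", "Anju"}      # entity set E1
accounts = {"A-101", "A-102"}        # entity set E2

# A relationship set is a subset of the Cartesian product E1 x E2.
depositor = {("Sridhar", "A-101"), ("Sridhar", "A-102"), ("Anju", "A-101")}

all_pairs = set(product(customers, accounts))
is_valid = depositor <= all_pairs    # every relationship pairs existing entities

print(is_valid)                            # True
print(("Sridhar", "A-102") in depositor)   # True
print(len(all_pairs))                      # 4
```
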
Relationship Set borrower
[Figure: the relationship set borrower, linking entities of customer (customer-id, customer-name, customer-street, customer-city) to entities of loan (loan-number, amount)]


Relationship Sets (Cont.)
■ An attribute can also be a property of a relationship set.
■ For instance, the depositor relationship set between entity sets customer and account may have the attribute access-date


Degree of a Relationship Set

■ Refers to the number of entity sets that participate in a relationship set.
■ Relationship sets that involve two entity sets are binary (or degree two). Generally, most relationship sets in a database system are binary.
■ Relationship sets may involve more than two entity sets: binary (two), ternary (three), quaternary (four), quinary (five), senary (six) and so on.
★ E.g. suppose employees of a bank may have jobs (responsibilities) at multiple branches, with different jobs at different branches. Then there is a ternary relationship set between entity sets employee, job and branch.
■ Relationships between more than two entity sets are rare. Most relationships are binary. (More on this later.)


Mapping Cardinalities

■ Express the number of entities to which another entity can be associated via a relationship set.
■ Most useful in describing binary relationship sets.
■ For a binary relationship set the mapping cardinality must be one of the following types:
★ One to one
★ One to many
★ Many to one
★ Many to many
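As a rough illustration of the four types, the sketch below classifies the pairs currently present in a binary relationship from A to B. The function name `mapping_cardinality` and the sample data are assumptions. Note that a mapping cardinality is a design constraint, so the data can only show which constraints a given instance happens to satisfy.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify the (a, b) pairs of a binary relationship from A to B."""
    pairs = set(pairs)
    a_counts = Counter(a for a, _ in pairs)   # number of B entities per A entity
    b_counts = Counter(b for _, b in pairs)   # number of A entities per B entity
    a_to_many = any(n > 1 for n in a_counts.values())
    b_to_many = any(n > 1 for n in b_counts.values())
    if not a_to_many and not b_to_many:
        return "one to one"
    if a_to_many and not b_to_many:
        return "one to many"
    if not a_to_many and b_to_many:
        return "many to one"
    return "many to many"


# Hypothetical (customer, loan) pairs: one customer holds two loans.
borrower = {("Anju", "L-1"), ("Anju", "L-2"), ("Mary", "L-3")}
print(mapping_cardinality(borrower))   # one to many
```
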


Mapping Cardinalities

One to one One to many


Note: Some elements in A and B may not be mapped to any
elements in the other set



Mapping Cardinalities

Many to one Many to many


Note: Some elements in A and B may not be mapped to any
elements in the other set



Mapping Cardinalities affect ER Design
■ Can make access-date an attribute of account, instead of a relationship attribute, if each account can have only one customer
■ I.e., the relationship from account to customer is many to one, or equivalently, customer to account is one to many


E-R Diagrams

■ Rectangles represent entity sets.
■ Diamonds represent relationship sets.
■ Lines link attributes to entity sets and entity sets to relationship sets.
■ Ellipses represent attributes.
■ Double ellipses represent multivalued attributes.
■ Dashed ellipses denote derived attributes.
■ Underline indicates primary key attributes (will study later)


E-R Diagram With Composite, Multivalued, and
Derived Attributes



Relationship Sets with Attributes



Roles
■ Entity sets of a relationship need not be distinct
■ The labels “manager” and “worker” are called roles; they specify how
employee entities interact via the works-for relationship set.
■ Roles are indicated in E-R diagrams by labeling the lines that connect
diamonds to rectangles.
■ Role labels are optional, and are used to clarify semantics of the
relationship



Cardinality Constraints
■ We express cardinality constraints by drawing either a directed line (→), signifying “one,” or an undirected line (─), signifying “many,” between the relationship set and the entity set.
■ E.g.: One-to-one relationship:
★ A customer is associated with at most one loan via the relationship borrower
★ A loan is associated with at most one customer via borrower


One-To-Many Relationship

■ In a one-to-many relationship, a loan is associated with at most one customer via borrower, while a customer is associated with several (including 0) loans via borrower


Many-To-One Relationships

■ In a many-to-one relationship, a loan is associated with several (including 0) customers via borrower, while a customer is associated with at most one loan via borrower


Many-To-Many Relationship

■ A customer is associated with several (possibly 0) loans via borrower
■ A loan is associated with several (possibly 0) customers via borrower


Participation of an Entity Set in a
Relationship Set
■ Total participation (indicated by a double line): every entity in the entity set participates in at least one relationship in the relationship set
★ E.g. participation of loan in borrower is total
✔ every loan must have a customer associated with it via borrower
■ Partial participation: some entities may not participate in any relationship in the relationship set
★ E.g. participation of customer in borrower is partial
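The difference between total and partial participation can be tested on sample data. The helper below is a sketch: the function name `participation` and the entities are assumptions made for illustration.

```python
def participation(entity_set, relationship_set, position):
    """Return 'total' if every entity appears in some relationship, else 'partial'.

    position is the index of the entity set inside each relationship tuple.
    """
    participating = {rel[position] for rel in relationship_set}
    return "total" if set(entity_set) <= participating else "partial"


# Hypothetical sample data; borrower holds (customer, loan) pairs.
customers = {"Anju", "Mary", "Nishu"}
loans = {"L-1", "L-2"}
borrower = {("Anju", "L-1"), ("Mary", "L-2")}

print(participation(loans, borrower, 1))       # total: every loan has a customer
print(participation(customers, borrower, 0))   # partial: Nishu has no loan
```
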


Alternative Notation for Cardinality
Limits
■ Cardinality limits can also express participation constraints


Keys

■ A super key of an entity set is a set of one or more attributes whose values uniquely determine each entity.
■ A candidate key of an entity set is a minimal super key
★ customer-id is a candidate key of customer
★ account-number is a candidate key of account
■ Although several candidate keys may exist, one of the candidate keys is selected to be the primary key.
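The two definitions can be tried out on a small table. In this sketch the function names `is_superkey` and `candidate_keys` and the sample rows are assumptions. It tests only the rows it is given, whereas a real key is a property the designer asserts for every possible instance.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows, attributes):
    """Minimal super keys of the given sample rows."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A superset of a key already found is a super key but not minimal.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


# Hypothetical sample rows of the customer entity set.
customers = [
    {"customer-id": "C1", "customer-name": "Anju", "customer-city": "Ranchi"},
    {"customer-id": "C2", "customer-name": "Mary", "customer-city": "Ranchi"},
    {"customer-id": "C3", "customer-name": "Anju", "customer-city": "Kolkata"},
]
attrs = ["customer-id", "customer-name", "customer-city"]
print(candidate_keys(customers, attrs))
# [('customer-id',), ('customer-name', 'customer-city')]
```

In these three rows the pair (customer-name, customer-city) also happens to be unique, yet two customers with the same name could well live in the same city. That is why the choice of key rests on the meaning of the data and not on a sample.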


Keys for Relationship Sets

■ The combination of primary keys of the participating entity sets forms a super key of a relationship set.
★ (customer-id, account-number) is a super key of depositor
★ NOTE: this means a pair of entities can have at most one relationship in a particular relationship set.
✔ E.g. if we wish to track all access-dates to each account by each customer, we cannot assume a relationship for each access. We can use a multivalued attribute though
■ Must consider the mapping cardinality of the relationship set when deciding what the candidate keys are
■ Need to consider the semantics of the relationship set in selecting the primary key in case of more than one candidate key
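A dictionary keyed by the pair of primary keys shows why repeated accesses need a multivalued attribute. This is a sketch under assumptions: the helper `record_access`, the identifiers and the dates are invented for the example.

```python
# depositor maps the super key (customer-id, account-number) to the attributes
# of that relationship. A dictionary key can occur only once, so repeated
# accesses are collected in a multivalued attribute (a list of dates).
depositor = {}


def record_access(customer_id, account_number, access_date):
    attrs = depositor.setdefault((customer_id, account_number),
                                 {"access-dates": []})
    attrs["access-dates"].append(access_date)


record_access("C1", "A-102", "2005-06-04")
record_access("C1", "A-102", "2005-06-19")
record_access("C2", "A-101", "2005-06-05")

print(len(depositor))                               # 2 relationships, not 3
print(depositor[("C1", "A-102")]["access-dates"])   # ['2005-06-04', '2005-06-19']
```
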


E-R Diagram with a Ternary Relationship



Cardinality Constraints on Ternary
Relationship
■ We allow at most one arrow out of a ternary (or greater degree)
relationship to indicate a cardinality constraint
■ E.g. an arrow from works-on to job indicates each employee works
on at most one job at any branch.
■ If there is more than one arrow, there are two ways of defining the
meaning.
★ E.g. a ternary relationship R between A, B and C with arrows to B and C could mean
1. each A entity is associated with a unique entity from B and C, or
2. each pair of entities from (A, B) is associated with a unique C entity, and each pair (A, C) is associated with a unique B
★ Each alternative has been used in different formalisms
★ To avoid confusion we outlaw more than one arrow


Binary Vs. Non-Binary Relationships
■ Some relationships that appear to be non-binary may be better
represented using binary relationships
★ E.g. A ternary relationship parents, relating a child to his/her father
and mother, is best replaced by two binary relationships, father
and mother
✔ Using two binary relationships allows partial information (e.g. only the mother being known)
★ But there are some relationships that are naturally non-binary
✔ E.g. works-on


Converting Non-Binary Relationships to
Binary Form
■ In general, any non-binary relationship can be represented using binary relationships by creating an artificial entity set.
★ Replace R between entity sets A, B and C by an entity set E, and three relationship sets:
1. RA, relating E and A
2. RB, relating E and B
3. RC, relating E and C
★ Create a special identifying attribute for E
★ Add any attributes of R to E
★ For each relationship (ai, bi, ci) in R,
1. create a new entity ei in the entity set E
2. add (ei, ai) to RA
3. add (ei, bi) to RB
4. add (ei, ci) to RC
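The conversion steps can be followed in code. This is a minimal sketch, assuming the relationship set is held as a Python set of (a, b, c) tuples. The function names, the generated identifiers e1, e2, … and the works-on sample rows are illustrative choices.

```python
def to_binary(R):
    """Replace ternary relationship set R by entity set E and RA, RB, RC."""
    E, RA, RB, RC = set(), set(), set(), set()
    for i, (a, b, c) in enumerate(sorted(R), start=1):
        e = f"e{i}"       # special identifying attribute of the new entity
        E.add(e)
        RA.add((e, a))
        RB.add((e, b))
        RC.add((e, c))
    return E, RA, RB, RC


def from_binary(E, RA, RB, RC):
    """Rebuild R, assuming each e is linked to exactly one a, one b and one c."""
    a_of, b_of, c_of = dict(RA), dict(RB), dict(RC)
    return {(a_of[e], b_of[e], c_of[e]) for e in E}


# Hypothetical works-on relationships: (employee, job, branch).
works_on = {("Anju", "teller", "Ranchi"), ("Anju", "auditor", "Kolkata")}
E, RA, RB, RC = to_binary(works_on)

print(len(E))                                    # 2
print(from_binary(E, RA, RB, RC) == works_on)    # True
```

The round trip succeeds only because every new entity is linked to exactly one entity in each of A, B and C. Nothing in the binary relationship sets enforces that on its own.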


Converting Non-Binary Relationships
(Cont.)
■ Also need to translate constraints
★ Translating all constraints may not be possible
★ There may be instances in the translated schema that
cannot correspond to any instance of R
✔ Exercise: add constraints to the relationships RA, RB and
RC to ensure that a newly created entity corresponds to
exactly one entity in each of entity sets A, B and C
★ We can avoid creating an identifying attribute by making E a weak
entity set (described shortly) identified by the three relationship
sets



Design Issues

■ Use of entity sets vs. attributes
The choice mainly depends on the structure of the enterprise being modeled, and on the semantics associated with the attribute in question.
■ Use of entity sets vs. relationship sets
A possible guideline is to designate a relationship set to describe an action that occurs between entities.
■ Binary versus n-ary relationship sets
Although it is possible to replace any nonbinary (n-ary, for n > 2) relationship set by a number of distinct binary relationship sets, an n-ary relationship set shows more clearly that several entities participate in a single relationship.
■ Placement of relationship attributes


How about doing an ER design
interactively on the board?
Suggest an application to be modeled.



E-R Diagram for a Banking Enterprise



Summary of Symbols Used in E-R
Notation



Summary of Symbols (Cont.)



Alternative E-R Notations



DATABASE MODELING




The better we understand the data,
the more effective
the discovery and retrieval will be.


TYPES OF MODELS AVAILABLE

OBJECT BASED LOGICAL MODELS.


★ Entity – relationship (ER) model


★ Object oriented model
★ Semantic data model
★ Functional data model

RECORD BASED LOGICAL MODELS.

★ Relational model
★ Network model
★ Hierarchical model

PHYSICAL MODELS.

★ Unifying model
★ Frame-memory model



What is a Record-Based Logical Model

■ Used to describe data at the logical and view levels.
■ Can be used to specify the overall logical structure.
■ Can also be used to provide a higher-level description of the implementation.
■ The database is structured in fixed-format records.


Database Models
[Figure: three database structures. Hierarchical structure: a tree of Dept, Project A, Project B and Employee nodes. Network structure: Dept A and Dept B, Employees 1, 2 and 3, and Project A and Project B, interlinked. Relational structure: a department table with columns Dept, Dname, Dloc, Dmgr (rows A, B, C) and an employee table with columns Empno, Ename, Etitle, Dept (rows 1, 2, 3).]


Hierarchical Database Model

• Organizes data in a tree structure
• Hierarchy of parent-child relationships
• One-to-many relationships
• Restricts each child segment to having only one parent segment
• Leads to redundant data
• Used for structured, routine types of transactions
• Not flexible in support of databases
• Cannot easily handle ad hoc requests


Relational Database Model

• Organizes data in the form of tables consisting of rows and columns
• Supports many-to-many relationships
• Based on relational algebra, with operations such as selection using Boolean conditions, projection and Cartesian product
• Redundancy in the data can be avoided by applying normalization rules
• Used for unstructured types of transactions
• More flexible in support of databases
• Can easily handle ad hoc requests
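The relational algebra operations named on this slide can be sketched over tables held as lists of dictionaries. The column names Empno, Ename, Dept and Dname follow the Relational Structure figure earlier in the deck. The column name Dno, the function names and the rows are assumptions made for the example.

```python
from itertools import product

employee = [
    {"Empno": 1, "Ename": "Anju", "Dept": "A"},
    {"Empno": 2, "Ename": "Mary", "Dept": "B"},
    {"Empno": 3, "Ename": "Nishu", "Dept": "A"},
]
dept = [{"Dno": "A", "Dname": "Loans"}, {"Dno": "B", "Dname": "Deposits"}]


def select(rows, predicate):
    """Selection: keep the rows that satisfy a Boolean condition."""
    return [r for r in rows if predicate(r)]


def project(rows, attrs):
    """Projection: keep only the named columns and drop duplicate rows."""
    result = []
    for r in rows:
        t = {a: r[a] for a in attrs}
        if t not in result:
            result.append(t)
    return result


def cartesian_product(left, right):
    """Cartesian product: pair every row of left with every row of right."""
    return [{**l, **r} for l, r in product(left, right)]


# Ad hoc request: names of the employees who work in the Loans department.
pairs = cartesian_product(employee, dept)
matching = select(pairs, lambda r: r["Dept"] == r["Dno"] and r["Dname"] == "Loans")
print(project(matching, ["Ename"]))   # [{'Ename': 'Anju'}, {'Ename': 'Nishu'}]
```

Because any tables can be combined this way at query time, no access path has to be planned in advance, which is what makes ad hoc requests easy in the relational model.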


What is DBMS

D DEFINE THE (TYPE OF) DATABASE

B BUILD THE STRUCTURE (FOR STORAGE)

M MANIPULATE DATA (FOR RETRIEVAL)

S SAFETY OF THE DATA (AGAINST SYSTEM CRASH AND UNAUTHORISED INTRUSIONS)


DDL AND DML

DATA DEFINITION PART:
DATA DEFINITION LANGUAGES (DDL)
COBOL, SQL
Creation of tables, entering data, normalisation

DBMS

DATA MANIPULATION PART:
DATA MANIPULATION LANGUAGES (DML)
Structured Query Language (SQL), QBE
Query, report generation and writing, checking errors, other control features


Database Management Systems

■ A DBMS is a set of computer programs that facilitates
★ Creation of the database by programmers
★ Maintenance and security of the database by DBAs
★ Application of the database by end users
■ DDL – Data Definition Language is for defining tables (files)
■ DML – Data Manipulation Language is for processing data and records (e.g., update, sort, …)
■ DCL – Data Control Language is a computer language for controlling access to data in a database. Examples of DCL commands are GRANT and REVOKE
■ Data dictionary – a computer-based catalog or directory containing metadata
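The split between DDL and DML can be seen in a few SQL statements run through Python's built-in sqlite3 module. This is a sketch with an assumed table layout and sample rows. DCL is not shown because SQLite, being an embedded single-file database, has no GRANT or REVOKE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of a table.
conn.execute(
    "CREATE TABLE customer ("
    " customer_id TEXT PRIMARY KEY,"
    " customer_name TEXT NOT NULL,"
    " customer_city TEXT)"
)

# DML: insert, update and query the data held in that structure.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [("C1", "Anju", "Tiruvalla"), ("C2", "Mary", "Anaiyur")],
)
conn.execute("UPDATE customer SET customer_city = ? WHERE customer_id = ?",
             ("Kolkata", "C2"))
rows = conn.execute(
    "SELECT customer_name, customer_city FROM customer ORDER BY customer_id"
).fetchall()
print(rows)   # [('Anju', 'Tiruvalla'), ('Mary', 'Kolkata')]
conn.close()
```
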


DATA DICTIONARY

■ It is a database management catalog, prepared by database designers to help individuals enter data.
■ It contains metadata, i.e. data about data, such as why a data item is needed, how often it should be updated, and on which forms and reports the data appears.
■ It relies on a DBMS software component to manage a database of data definitions, i.e. metadata about the structure, data elements and other characteristics of a database.
■ It contains the names and descriptions of all types of data records and their interrelationships.
■ It also has information on end users’ access requirements, use of application programs, database maintenance and security.
■ It can be queried by the database administrator, who can also make some changes to the data dictionary.
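Many DBMSs expose part of their data dictionary as ordinary queryable tables. As one concrete case, the sketch below reads SQLite's built-in catalog through Python's sqlite3 module. The loan table is an assumed example, and other systems name and organise their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loan (loan_number TEXT PRIMARY KEY, amount REAL)")

# sqlite_master is SQLite's catalog: one row per table, index, view or trigger.
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
columns = [(name, ctype, pk) for _, name, ctype, _, _, pk
           in conn.execute("PRAGMA table_info(loan)")]

print(tables)    # [('loan',)]
print(columns)   # [('loan_number', 'TEXT', 1), ('amount', 'REAL', 0)]
conn.close()
```
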


Concept of primary key



Types of keys

■ Primary Key
1. Simple Primary Key
2. Composite Primary Key
■ Secondary Key
■ Foreign Key
■ Reference Key
------------------------------------------------------------------------------
Primary Key: Unique identifier attribute (or set of attributes) for an entity.
Simple Primary Key: Based on a single attribute.
Composite Primary Key: Based on two or more attributes.
Secondary Key: Any attribute other than the primary key that is used to look up records; it need not be unique.
Foreign Key: An attribute of one table that is the primary key of another table.
Reference Key: Column in the child table that refers to the primary key of the parent table.
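The key types can be declared in SQL and checked by the DBMS. The sketch below uses Python's sqlite3 module; the table and index names and the rows are assumptions. Treating an index on customer_name as the secondary key is one common reading of that term, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite checks foreign keys only when enabled

conn.executescript("""
CREATE TABLE customer (
    customer_id TEXT PRIMARY KEY,            -- simple primary key
    customer_name TEXT
);
CREATE TABLE account (
    account_number TEXT PRIMARY KEY          -- simple primary key
);
CREATE TABLE depositor (
    customer_id TEXT REFERENCES customer(customer_id),       -- foreign key
    account_number TEXT REFERENCES account(account_number),  -- foreign key
    PRIMARY KEY (customer_id, account_number)                -- composite primary key
);
CREATE INDEX idx_customer_name ON customer(customer_name);   -- secondary key for lookups
""")

conn.execute("INSERT INTO customer VALUES ('C1', 'Sridhar')")
conn.execute("INSERT INTO account VALUES ('A-102')")
conn.execute("INSERT INTO depositor VALUES ('C1', 'A-102')")

try:
    # No customer 'C9' exists in the parent table, so this row is rejected.
    conn.execute("INSERT INTO depositor VALUES ('C9', 'A-102')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)   # True
```
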


TYPES OF DATABASES

■ OPERATIONAL DATABASE
■ ANALYTICAL DATABASE
■ DATA WAREHOUSING
■ DISTRIBUTED DATABASE
■ HYPERMEDIA DATABASE



OPERATIONAL DATABASE

■ DATABASE ABOUT THE OPERATIONS OF AN ORGANISATION.
■ CAN BE SUB-DIVIDED INTO SUBJECT AREAS LIKE:
★ TRANSACTION DATABASE
★ MARKETING DATABASE
★ PRODUCTION DATABASE
■ STORES INFORMATION RELATED TO DAY-TO-DAY OPERATIONS, RELATED TO
★ CUSTOMERS
★ INVENTORY
★ EMPLOYEES
■ TRANSACTION PROCESSING SYSTEMS (TPS) USE THIS DATABASE.


ANALYTICAL DATABASE

■ INFORMATION EXTRACTED FROM OPERATIONAL AND EXTERNAL DATABASES.
■ INFORMATION REQUIRED FOR MIDDLE LEVEL MANAGERS TO TAKE DECISIONS.
■ ALSO CALLED MANAGEMENT DATABASES.
■ USE A MULTI-DIMENSIONAL DATA STRUCTURE TO ORGANISE DATA.
■ ONLINE ANALYTICAL PROCESSING, DECISION SUPPORT SYSTEMS AND EXECUTIVE SUPPORT SYSTEMS USE THIS DATABASE.


DISTRIBUTED DATABASE

■ It is a database where the data is not stored in one physical location, but is distributed over many locations (computers/servers), which are geographically dispersed and connected by a network.
■ Example.
■ Advantages.
■ Disadvantages.


HYPERMEDIA DATABASES

■ Hypermedia databases are a repository of home pages and other hyperlinked pages of multimedia.
■ Can store text, graphics, photo or video files.
■ The World Wide Web uses hypermedia databases to access HTML files, GIF files and video files.


Data Warehousing

■ Large organizations have complex internal organizations, and have data stored at different locations, on different operational (transaction processing) systems, under different schemas
■ Data sources often store only current data, not historical data
■ Corporate decision making requires a unified view of all organizational data, including historical data
■ A data warehouse is a repository (archive) of information gathered from multiple sources, stored under a unified schema, at a single site
★ Greatly simplifies querying, permits study of historical trends
★ Shifts decision support query load away from transaction processing systems

Definition of data warehousing

According to W. H. Inmon:

A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision making process.


Data Warehousing



DATA WAREHOUSING

■ A data warehouse archives information gathered from multiple sources, and stores it under a unified schema, at a single site.
★ Important for large businesses which generate data from multiple divisions, possibly at multiple sites
★ Data may also be purchased externally
★ This has been depicted pictorially in the next slide.
★ The concept is much like that of physical warehouses (godowns)

DATA MART

■ One way of describing a data mart is:
“It is a data warehouse that has limited or specific scope”.
■ It contains selected information from the data warehouse, such that each separate data mart is customised for the decision support applications of a particular end-user group.
■ Data stored in the centralised location of a data warehouse is edited, standardised, integrated and updated so that it can be used by managers for decision making.
■ Some organisations, instead of having one centralised warehouse, might opt for multiple data marts, each one concentrating on a specific aspect.


DATA MART

[Figure: a data warehouse feeding three data marts: Production, Finance and Marketing]


Data Mining
■ Broadly speaking, data mining is the process of semi-automatically analyzing large databases to find useful patterns
■ Like knowledge discovery in artificial intelligence, data mining discovers statistical rules and patterns
■ Differs from machine learning in that it deals with large volumes of data stored primarily on disk
■ Some types of knowledge discovered from a database can be represented by a set of rules
★ E.g.: “Young women with annual incomes greater than $50,000 are most likely to buy sports cars”
■ Other types of knowledge are represented by equations, or by prediction functions
■ Some manual intervention is usually required

What is Data Mining?

■ Data Mining is:
(1) The efficient discovery of previously unknown, valid, potentially useful, understandable patterns in large datasets
(2) The analysis of (often large) observational data sets to find unsuspected relationships and to summarize the data in novel ways that are both understandable and useful to …

What is Data Mining?

■ Very little functionality in database systems to support mining applications
■ Beyond SQL querying:
★ SQL (OLAP) query:
- How many computers did we sell in the 1st Qtr of 1999 in Chennai vs New Delhi?
★ Data mining queries:
- Which sales region (Chennai, Mumbai, Kolkata & New Delhi) had anomalous sales in the 1st Qtr of 2005?

Relation between data mining and OLAP


Data mining can be viewed as an advanced stage
of On-line Analytical Processing (OLAP).

However, data mining goes far beyond the narrow
scope of summarization style analytical processing of
data warehouse systems by incorporating more
advanced techniques for data understanding.

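The contrast between a summarization-style OLAP query and a data mining query can be sketched as follows. All sales figures are made up, and the simple deviation test in `anomalous_regions` is only a stand-in for real mining techniques, which the slides do not specify.

```python
from statistics import mean, pstdev

# Hypothetical quarterly unit sales per region (made-up numbers).
sales = {
    "Chennai":   [120, 125, 118, 122],
    "Mumbai":    [200, 205, 198, 202],
    "Kolkata":   [90, 92, 88, 150],
    "New Delhi": [160, 158, 163, 161],
}

# OLAP-style query: a summary that the analyst asks for explicitly.
first_qtr_chennai_vs_delhi = (sales["Chennai"][0], sales["New Delhi"][0])


def anomalous_regions(data, threshold=3.0):
    """Flag regions whose latest quarter is far from their own earlier quarters."""
    flagged = []
    for region, series in data.items():
        history, latest = series[:-1], series[-1]
        spread = pstdev(history) or 1.0    # avoid dividing by zero
        if abs(latest - mean(history)) / spread > threshold:
            flagged.append(region)
    return flagged


print(first_qtr_chennai_vs_delhi)   # (120, 160)
print(anomalous_regions(sales))     # ['Kolkata']
```

In the first query the analyst already knows which cells to compare. In the second the system has to decide what counts as unusual, which is the step that goes beyond summarization.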


Applications of Data Mining
■ Prediction based on past history
★ Predict if a credit card applicant poses a good credit risk, based on some attributes (income, job type, age, ...) and past history
★ Predict if a customer is likely to switch brand loyalty
★ Predict if a customer is likely to respond to “junk mail”
★ Predict if a pattern of phone calling card usage is likely to be fraudulent
■ Descriptive patterns
★ Associations

OLTP AND OLAP

■ ONLINE TRANSACTION PROCESSING (OLTP) AND ONLINE ANALYTICAL PROCESSING (OLAP) ARE TWO IMPORTANT ASPECTS OF DATA MINING.
■ OLTP:
★ It refers to the immediate and automated response to the requests of the user.
★ Example: send a query mail to HDFC or ICICI bank, and you will get an immediate response (a sort of acknowledgement).
★ OLTP is designed to handle multiple concurrent transactions from customers.
★ It has a fixed number of inputs (types of query) and a standard output format.
★ OLTP is a big part of interactive e-commerce applications.


EXAMPLE OF OLTP

From: ktmk54@gmail.com [ktmk54@gmail.com]
Sent: Saturday, Jun 4 2005 2:26PM
To: support@hdfcbank.com [support@hdfcbank.com]
Subject: Banker's Cheque Inquiry

Name : MR. T MOHANAKRISHNAN


Customer ID: 7488678
Account No : 0821050216585

Sir/Madam,
I have deposited a US cheque for $250, about 15 days back,
through Lloyds Road ATM center. I would like to get confirmation
of receipt of this cheque, as well as approximate time to realise
the cheque, please.
Thanking you,
T Mohanakrishnan.



EXAMPLE FOR OLTP

Dear Customer,
Thank you for writing to us. This auto acknowledgement confirms the
receipt of your e-mail. This interaction is being tracked through the
subject line. We request you not to change the subject line.
If you are an existing account holder and have not mentioned your
Customer Identification Number / Account Number in your earlier mail,
please re-send your mail with the details.
Please ignore this message if you have quoted your Customer
Identification Number / Account Number.
We will get back to you shortly.

Warm regards,
HDFC Bank Ltd


OLAP

■ Graphical software tools that provide complex analysis of data stored in a database.
■ Why is it called “Online” analytical processing?
■ The purpose of an OLAP server is to link data and special functions.
■ Analysis by OLAP goes beyond normal database queries.
■ It can provide time series and trend analysis views of the data.
■ It supports “what if” and “why” analysis.


WHAT IF ANALYSIS

■ A CAPABILITY OF SOME INFORMATION SYSTEMS (E.G. DECISION SUPPORT SYSTEMS) THAT ALLOWS THE USER TO MAKE HYPOTHETICAL CHANGES TO THE DATA ASSOCIATED WITH A PROBLEM AND OBSERVE HOW THESE CHANGES INFLUENCE THE RESULTS OR OUTCOME.


WHAT IF ANALYSIS

■ Take the case of the payroll of a company.
■ Basic pay, DA, other allowances.
■ If DA is increased by some %, what is the net result or load on the exchequer?
■ If Basic pay is increased, what are the effects? (HRA, DA etc. are a % of Basic)
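The payroll case can be worked through as a small what-if calculation. The pay figures and the percentages (DA at 50% and HRA at 20% of basic pay) are assumptions chosen only to make the arithmetic visible.

```python
def monthly_payroll(basic_pays, da_percent, hra_percent):
    """Total monthly pay when DA and HRA are percentages of basic pay."""
    return sum(b * (100 + da_percent + hra_percent) / 100 for b in basic_pays)


basic_pays = [10000, 15000, 20000]   # hypothetical basic pay per employee

current = monthly_payroll(basic_pays, da_percent=50, hra_percent=20)

# What if DA is raised from 50% to 55% of basic pay?
da_raised = monthly_payroll(basic_pays, da_percent=55, hra_percent=20)

# What if basic pay is raised by 10%? DA and HRA rise with it.
raised_basic = [b * 110 // 100 for b in basic_pays]
basic_raised = monthly_payroll(raised_basic, da_percent=50, hra_percent=20)

print(current)                  # 76500.0
print(da_raised - current)      # 2250.0 extra per month
print(basic_raised - current)   # 7650.0 extra per month
```

Raising basic pay costs more than raising DA by a similar-looking margin, because every allowance defined as a percentage of basic pay grows with it.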


OLAP

■ OLAP applications are found in the areas of financial modeling (budgeting, planning), sales forecasting, customer and product profitability, etc.
■ Provides leverage to library managers by giving them the ability to model real-life projections and make more efficient use of resources
■ OLAP enables the organization as a whole to respond more quickly to market demands and improve revenue and profitability


Additional Database Models
[Figure: two additional database structures. Multidimensional Database Structure: a cube holding Sales and Margin figures for TV and VCR, with Actual and Budget values, labelled Denver, West, East and Feb. Object-Oriented Database Structure: a Bank Account Object (attributes: Customer, Balance; operations: Deposit, Withdraw) shown together with a Checking Account Object and a Savings Account Object, each with attributes Credit Line and Mthly Statement and operations Calculate Interest and Print Mthly Statement.]

Multidimensional Model

■ Multidimensional Model Enables Realistic Description of Data


■ The multidimensional model is also a natural fit for describing
and storing complex data. Developers can create data
structures that accurately represent real-world data, thus
making it faster to develop applications and easier to maintain
them.

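One simple way to picture a multidimensional structure is a table of cells addressed by one value per dimension. The sketch below is an assumption-laden illustration: the dimension names, the `roll_up` helper and all figures are invented, with only the labels East, West, Feb, TV, VCR and Sales taken from the figure on the earlier slide.

```python
# Each cell of the cube is addressed by one value per dimension:
# (region, month, product, measure). All figures are made up.
DIMENSIONS = ("region", "month", "product", "measure")
cube = {
    ("East", "Feb", "TV", "Sales"): 500,
    ("East", "Feb", "VCR", "Sales"): 300,
    ("West", "Feb", "TV", "Sales"): 450,
    ("West", "Feb", "VCR", "Sales"): 250,
}


def roll_up(cube, keep):
    """Sum the cells over every dimension that is not listed in keep."""
    positions = [DIMENSIONS.index(d) for d in keep]
    totals = {}
    for key, value in cube.items():
        group = tuple(key[p] for p in positions)
        totals[group] = totals.get(group, 0) + value
    return totals


print(roll_up(cube, ["region"]))    # {('East',): 800, ('West',): 700}
print(roll_up(cube, ["product"]))   # {('TV',): 950, ('VCR',): 550}
```
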


Object-Oriented Database Model

• Offers more complex data types to overcome the restrictions of the normalization rules for relational databases.
• Object-oriented databases are based on the following principles:
• Encapsulation
• Inheritance
• Polymorphism
• Aggregation


OBJECT ORIENTED DATABASE



Object-Oriented Database Model

• Encapsulation: An application (another object) can only communicate with an object via messages. The operations provided by an object define the set of messages which can be understood by it; no other operations can be applied to an object.

• Inheritance: New object classes can be derived from another class (the super-class) by inheritance. The new classes inherit the attributes and methods of the super-class and offer additional attributes and operations. The relation between a derived class and its super-class is called the “is-a” relation because an instance of the derived class is also an instance of the super-class.

• Polymorphism: This feature is closely connected to inheritance. Derived classes may re-define methods of their super-class(es). This is very useful for achieving class-specific behaviour using messages already available for the super-class.

• Aggregation: Composite objects may be constructed as consisting of a set of elementary objects. The container object can communicate with its contained objects via their methods. The relation between the container object and its components is called the “part-of” relation because a component is a part of the container object.
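The four principles can be seen side by side in a short Python sketch built on the bank account objects from the earlier figure. The `Branch` container, the 4% interest rate and the balances are assumptions added for illustration.

```python
class BankAccount:
    def __init__(self, customer, balance=0):
        self.customer = customer
        self._balance = balance        # encapsulated: changed only via methods

    def deposit(self, amount):
        self._balance += amount

    def withdraw(self, amount):
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount

    def balance(self):
        return self._balance

    def calculate_interest(self):
        return 0


class SavingsAccount(BankAccount):     # inheritance: a SavingsAccount is a BankAccount
    def calculate_interest(self):      # polymorphism: re-defines the super-class method
        return self._balance * 4 // 100


class Branch:                          # aggregation: accounts are part of a branch
    def __init__(self, accounts):
        self.accounts = list(accounts)

    def total_interest(self):
        # The container talks to its components only through their methods.
        return sum(a.calculate_interest() for a in self.accounts)


branch = Branch([BankAccount("Anju", 1000), SavingsAccount("Mary", 2000)])
print(branch.total_interest())   # 80
```

The same `calculate_interest` message is sent to every account, and each class answers in its own way, so the `Branch` code never needs to know which kind of account it holds.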


CHARACTERISTICS OF A WELL DEFINED
DATABASE SYSTEM
■ DATA INTEGRITY:
★ Ability to navigate and manipulate the data to produce results

■ DATA INDEPENDENCE:
★ Logical independence: application programs are not affected by a change in the logical structure of the data
★ Physical independence: not affected by a change in the physical storage structure

■ PREVENTION OF DATA REDUNDANCY & INCONSISTENCY
★ The storage space required grows in proportion to redundancy

■ SHARING OF DATA
★ Flexibility in accessing & viewing data by different users
★ Availability of data to existing as well as new applications

■ DATA MAINTENANCE
★ Checking the integrity of the system and data
★ Provision for modification, addition, deletion of records

■ DATA SECURITY and REGULATION OF STANDARDS
★ System failure/crash, backup
★ Unauthorised users
★ SOP, provision of passwords, DBA’s tasks
