Warehousing
Use or disclosure of data contained on this sheet is subject to the restriction on the title page of this
In the Beginning, life was simple…
But…
Our information needs…
Kept growing. (The Spider web)
SOURCE: William H. Inmon
Purpose
A producer wants to know….
What are the users saying...
• Data should be integrated across the enterprise
• Summary data has a real value to the organization
• Historical data holds the key to understanding data over time
• What-if capabilities are required
What is Data Warehousing?
A process of transforming data into information and making it available to users in a timely enough manner to make a difference
Data Warehousing --
It is a process
Data Warehouse
• A data warehouse is a
o subject-oriented
o integrated
o time-varying
o non-volatile
collection of data that is used primarily in
organizational decision making.
-- Bill Inmon, Building the Data Warehouse, 1996
Briefing Contents
Data Warehouse?
Scenario 1
Scenario 1 : ABC Pvt Ltd.
Branches: Mumbai, Delhi, Chennai, Bangalore
The Sales Manager wants sales per item type per branch for the first quarter.
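The report requested in Scenario 1 is a grouped total: sales summed by branch and item type, restricted to the first quarter. A minimal sketch of that calculation follows. The records, item types, amounts, and the year 2024 are hypothetical, and `first_quarter_sales` is an illustrative helper, not something taken from the briefing.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sales records: (branch, item_type, sale_date, amount).
SALES = [
    ("Mumbai", "Electronics", date(2024, 1, 15), 1200.0),
    ("Mumbai", "Clothing", date(2024, 2, 3), 300.0),
    ("Delhi", "Electronics", date(2024, 3, 20), 800.0),
    ("Chennai", "Clothing", date(2024, 4, 2), 450.0),  # second quarter, excluded
    ("Bangalore", "Electronics", date(2024, 1, 9), 950.0),
    ("Mumbai", "Electronics", date(2024, 3, 30), 500.0),
]


def first_quarter_sales(records, year):
    """Total sales per (branch, item_type) for January through March of `year`."""
    totals = defaultdict(float)
    for branch, item_type, sale_date, amount in records:
        if sale_date.year == year and 1 <= sale_date.month <= 3:
            totals[(branch, item_type)] += amount
    return dict(totals)


report = first_quarter_sales(SALES, 2024)
for (branch, item_type), total in sorted(report.items()):
    print(f"{branch:10} {item_type:12} {total:10.2f}")
```

The calculation itself is simple. The difficulty in the scenario is that each branch holds its own data, so the records must first be brought together in one place.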
Solution 1:ABC Pvt Ltd.
[Diagram: Mumbai, Delhi, Chennai, Bangalore -> Data Warehouse -> Query & Analysis tools -> Report -> Sales Manager]
Scenario 2
Scenario 2 : One Stop Shopping
[Diagram: two Data Entry Operators; Report]
Solution 2
[Diagram: two Data Entry Operators; Report]
Scenario 3
Solution 3
[Chart: sales over time, annotated "Expansion" and "Improvement"]
Why Do We Need Data Warehouses?
Need for Data Warehousing
What Is a Data Warehouse Used for?
• Knowledge discovery
o Making consolidated reports
o Finding relationships and correlations
o Data mining
o Examples
Banks identifying credit risks
Insurance companies searching for fraud
Medical research
How Do Data Warehouses Differ From
Operational Systems?
• Goals
• Structure
• Size
• Performance optimization
• Technologies used
Comparison Chart of Database Types
Design Differences
Operational System vs. Data Warehouse
Supporting a Complete Solution
• Operational System - Data Entry
• Data Warehouse - Data Retrieval
Data Warehouses, Data Marts, and
Operational Data Stores
• Data Warehouse – The queryable source of data in the enterprise. It is comprised of the union of all of its constituent data marts.
• Data Mart – A logical subset of the complete data warehouse. Often viewed as a restriction of the data warehouse to a single business process or to a group of related business processes targeted toward a particular business group.
• Operational Data Store (ODS) – A point of integration for operational systems that developed independently of each other. Since an ODS supports day to day
SOURCE: Ralph Kimball
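One way to picture a data mart as a "logical subset" of the warehouse is a view restricted to a single business process. The sketch below uses Python's built-in `sqlite3` module. The table `warehouse_facts`, the view `sales_mart`, and all rows are hypothetical, and real data marts are often built quite differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical warehouse table covering several business processes.
conn.execute(
    "CREATE TABLE warehouse_facts (business_process TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
    [("sales", "North", 100.0), ("sales", "South", 250.0), ("claims", "North", 75.0)],
)

# The data mart is sketched as a view: a restriction of the warehouse
# to one business process (here, sales).
conn.execute(
    "CREATE VIEW sales_mart AS "
    "SELECT region, amount FROM warehouse_facts WHERE business_process = 'sales'"
)

mart_rows = conn.execute(
    "SELECT region, amount FROM sales_mart ORDER BY region"
).fetchall()
print(mart_rows)
```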
Data Mining works with Warehouse Data
We want to know ...
Industry                  Application
Finance                   Credit Card Analysis
Insurance                 Claims, Fraud Analysis
Telecommunication         Call record analysis
Transport                 Logistics management
Consumer goods            promotion analysis
Data Service providers    Value added data
Utilities                 Power usage analysis
Data Mining in Use
Data Warehousing Tools
• Data Warehouse
o SQL Server 2000 DTS
o Oracle 8i Warehouse Builder
• OLAP tools
o SQL Server Analysis Services
o Oracle Express Server
• Reporting tools
o MS Excel Pivot Chart
o VB Applications
RDBMS used for OLTP
OLTP vs Data Warehouse
To summarize ...
• OLTP systems are used to “run” a business
• The Data Warehouse helps to “optimize” the business
Building a Data Warehouse
Data Warehouse
Lifecycle
• Analysis
• Design
• Import data
• Install front-end tools
• Test and deploy
Stage 1: Analysis
• Identify:
o Target Questions
o Data needs
o Timeliness of data
o Granularity
• Create an enterprise-level data dictionary
• Dimensional analysis
o Identify facts and dimensions
Stage 2: Design
• Star schema (Dimensional Modeling)
• Data Transformation
• Aggregates
• Pre-calculated Values
• HW/SW Architecture
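The design items above can be made concrete with a small star schema: one fact table surrounded by dimension tables, plus an aggregate table holding pre-calculated totals. The sketch below uses Python's built-in `sqlite3` module. Every table name, column name, and row is hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Star schema: dimension tables describe the "who/what/when";
# the fact table holds the measurements and points at the dimensions.
conn.executescript("""
CREATE TABLE dim_branch (branch_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE dim_item   (item_id INTEGER PRIMARY KEY, item_type TEXT);
CREATE TABLE dim_date   (date_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
CREATE TABLE fact_sales (
    branch_id INTEGER REFERENCES dim_branch(branch_id),
    item_id   INTEGER REFERENCES dim_item(item_id),
    date_id   INTEGER REFERENCES dim_date(date_id),
    amount    REAL
);
""")

conn.executemany("INSERT INTO dim_branch VALUES (?, ?)", [(1, "Mumbai"), (2, "Delhi")])
conn.executemany("INSERT INTO dim_item VALUES (?, ?)", [(1, "Electronics"), (2, "Clothing")])
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)", [(1, 2024, 1), (2, 2024, 2)])
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 1200.0), (1, 1, 1, 500.0), (1, 2, 1, 300.0),
     (2, 1, 1, 800.0), (2, 2, 2, 450.0)],
)

# Aggregate: totals per branch and quarter are computed once and stored,
# so reports do not have to rescan the detailed fact rows.
conn.execute("""
CREATE TABLE agg_sales_by_branch_quarter AS
SELECT b.city AS city, d.year AS year, d.quarter AS quarter,
       SUM(f.amount) AS total_amount
FROM fact_sales f
JOIN dim_branch b ON b.branch_id = f.branch_id
JOIN dim_date d ON d.date_id = f.date_id
GROUP BY b.city, d.year, d.quarter
""")

rows = conn.execute(
    "SELECT city, year, quarter, total_amount "
    "FROM agg_sales_by_branch_quarter ORDER BY city, quarter"
).fetchall()
for row in rows:
    print(row)
```

An aggregate table trades storage and load-time work for faster queries. It has to be refreshed whenever new facts are loaded.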
Dimensional Modeling
Stage 3: Import Data
• Identify data sources
• Extract the needed data from existing systems to a data staging area
• Transform and Clean the data
o Resolve data type conflicts
o Resolve naming and key conflicts
o Remove, correct, or flag bad data
o Conform Dimensions
• Load the data into the warehouse
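The import steps above (extract to a staging area, transform and clean, load) can be sketched in a few lines. This is a toy illustration under stated assumptions: the two source extracts, their field names, the mapping dictionaries, and the `transform` helper are all hypothetical, and the "warehouse" is an in-memory SQLite table.

```python
import sqlite3

# Hypothetical extracts from two source systems with conflicting conventions.
SOURCE_A = [
    {"cust_id": "17", "city": "mumbai", "amount": "1200.50"},
    {"cust_id": "18", "city": "Delhi", "amount": "n/a"},  # bad amount
]
SOURCE_B = [{"CustomerNo": 17, "City": "MUMBAI ", "Amt": 300}]

# Naming conflicts are resolved by mapping source fields to warehouse names.
MAP_A = {"cust_id": "customer_id", "city": "city", "amount": "amount"}
MAP_B = {"CustomerNo": "customer_id", "City": "city", "Amt": "amount"}


def transform(row, mapping):
    """Rename fields, unify data types, normalize city names, flag bad data."""
    out = {target: row[source] for source, target in mapping.items()}
    out["customer_id"] = int(out["customer_id"])      # resolve type conflict
    out["city"] = str(out["city"]).strip().title()    # one spelling per city
    try:
        out["amount"] = float(out["amount"])
        out["is_bad"] = 0
    except (TypeError, ValueError):
        out["amount"] = None                          # flag rather than drop
        out["is_bad"] = 1
    return out


# Staging area: cleaned rows from all sources in one common shape.
staging = [transform(r, MAP_A) for r in SOURCE_A] + [transform(r, MAP_B) for r in SOURCE_B]

# Load into the warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales (customer_id INTEGER, city TEXT, amount REAL, is_bad INTEGER)"
)
warehouse.executemany(
    "INSERT INTO sales VALUES (:customer_id, :city, :amount, :is_bad)", staging
)
loaded = warehouse.execute(
    "SELECT customer_id, city, amount, is_bad FROM sales ORDER BY rowid"
).fetchall()
for row in loaded:
    print(row)
```

The sketch flags the bad row instead of removing it, which is one of the three options the slide lists. Which option is right depends on the data and the target questions.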
Importing Data Into the Warehouse
Operational Systems (source systems)
Stage 4: Install Front-end Tools
• Reporting tools
• Data mining tools
• GIS
• Etc.
Stage 5: Test and Deploy
• Usability tests
• Software installation
• User training
• Performance tweaking based on usage
Special Concerns
Goals of the STORET Central Warehouse
Old Web Application Flow
Central Warehouse Application Flow
Search Criteria Selection -> Report Size Feedback / Report Customization -> Report Generation
Web Application Demo
STORET Central Warehouse:
http://epa.gov/storet/dw_home.html
STORET Central Warehouse – Potential Future Enhancements
Data Warehouse Components
SOURCE: Ralph Kimball
Data Warehouse Components – Detailed
SOURCE: Ralph Kimball