
Data warehousing

Inmon's definition
A data warehouse is a
-subject-oriented,
-integrated,
-time-variant,
-nonvolatile
collection of data in support of management's
decision-making process.
Need for Data Warehousing

Industry holds huge amounts of operational data.
Knowledge workers want to turn this data into
useful information.
They use this information to support
strategic decision making.
Need for Data Warehousing (contd..)

It is a platform that consolidates historical data for
analysis.
It stores good-quality data so that knowledge
workers can make correct decisions.
Need for Data Warehousing (contd..)

From a business perspective, a data warehouse
-is the latest marketing weapon,
-helps to keep customers by learning more about
their needs,
-is a valuable tool in today's competitive,
fast-evolving world.
Data Warehouse Components

• Staging Area
  • A preparatory repository where transaction data can be
    transformed for use in the data warehouse
• Data Mart
  • Traditional dimensionally modeled set of dimension and
    fact tables
  • Per Kimball, a data warehouse is the union of a set of
    data marts
• Operational Data Store (ODS)
  • Modeled to support near real-time reporting needs
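
To make "dimension and fact tables" concrete, here is a minimal sketch of a dimensionally modeled data mart using Python's built-in sqlite3 module. The table names, column names, and sample rows are hypothetical and chosen only for illustration; a real data mart would be designed from the business requirements.

```python
import sqlite3

def build_sales_mart(conn):
    """Create a tiny star schema: one fact table keyed to two dimension tables."""
    conn.executescript("""
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
        CREATE TABLE fact_sales (
            product_key INTEGER REFERENCES dim_product(product_key),
            date_key INTEGER REFERENCES dim_date(date_key),
            quantity INTEGER,
            amount REAL);
    """)
    # Hypothetical sample rows, for illustration only.
    conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                     [(1, "Pen", "Stationery"), (2, "Lamp", "Home")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                     [(20240101, 2024, 1), (20240201, 2024, 2)])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                     [(1, 20240101, 10, 15.0),
                      (1, 20240201, 4, 6.0),
                      (2, 20240201, 1, 30.0)])

def sales_by_category(conn):
    """Aggregate the fact table by an attribute of the product dimension."""
    return conn.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()

conn = sqlite3.connect(":memory:")
build_sales_mart(conn)
print(sales_by_category(conn))
```

The fact table holds the measures (quantity, amount) and the dimension tables hold the descriptive attributes used to group and filter them.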
Data Warehouse Functionality

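[Diagram: source systems (relational databases, ERP systems, purchased data, legacy data) pass through extraction and cleansing; an optimized loader feeds the data warehouse engine, which is supported by a metadata repository and serves analysis and query tools.]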
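
The flow named on this slide (extraction, cleansing, loading, metadata) can be sketched in a few lines of Python. This is an illustrative sketch only: the source names, field names, and cleansing rules are assumptions, and in-memory lists stand in for real source systems and the warehouse engine.

```python
def extract(sources):
    """Pull raw rows from every source and tag each row with its origin."""
    for source_name, rows in sources.items():
        for row in rows:
            yield {**row, "source": source_name}

def cleanse(rows):
    """Normalize fields and skip rows that cannot be repaired."""
    for row in rows:
        name = (row.get("customer") or "").strip().title()
        try:
            amount = float(row.get("amount"))
        except (TypeError, ValueError):
            continue  # amount is missing or not numeric
        if not name:
            continue  # customer name is missing
        yield {"customer": name, "amount": amount, "source": row["source"]}

def load(rows, warehouse, metadata):
    """Append clean rows to the warehouse and count loaded rows per source."""
    for row in rows:
        warehouse.append(row)
        metadata[row["source"]] = metadata.get(row["source"], 0) + 1

# Hypothetical source data, for illustration only.
sources = {
    "erp": [{"customer": " alice ", "amount": "10.5"},
            {"customer": "", "amount": "3"}],
    "legacy": [{"customer": "BOB", "amount": "n/a"},
               {"customer": "carol", "amount": 7}],
}
warehouse, metadata = [], {}
load(cleanse(extract(sources)), warehouse, metadata)
print(warehouse)
print(metadata)
```

Here the metadata dictionary only records how many rows each source contributed; a real metadata repository would also describe schemas, lineage, and load history.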
STEPS INVOLVED IN DATA
WAREHOUSING:
STEP 1: ASSEMBLE THE TEAM

STEP 2: GATHER BUSINESS REQUIREMENTS

STEP 3: DEFINE TECHNICAL REQUIREMENTS

STEP 4: IDENTIFY THE DATA REQUIREMENTS

STEP 5: CREATE DATA MAPS

STEP 6: DEVELOP THE DATA DICTIONARY

STEP 7: DETERMINE WHETHER TO USE OUTSIDE SUPPORT

STEP 8: DECIDE ON SOFTWARE AND HARDWARE.

STEP 9: PERFORM INDIVIDUAL WAREHOUSE BUILD

STEP 10: DEVELOP A PRODUCTION CALENDAR.
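
Steps 5 and 6 are easier to picture with a small example. The sketch below assumes a data map is a source-to-target mapping with one transform per column, and derives a simple data dictionary listing from it. The system names, field names, and transforms are all hypothetical.

```python
# Hypothetical data map: target column -> (source system, source field, transform)
DATA_MAP = {
    "customer_name": ("crm", "CUST_NM", str.strip),
    "order_total": ("erp", "ORD_AMT", float),
    "order_date": ("erp", "ORD_DT", lambda s: s[:10]),  # keep YYYY-MM-DD only
}

def apply_data_map(source_records, data_map):
    """Build one target row by pulling each column from its mapped source field."""
    target = {}
    for column, (system, field, transform) in data_map.items():
        target[column] = transform(source_records[system][field])
    return target

def data_dictionary(data_map):
    """List each target column with its source, sorted by column name."""
    return [f"{column}: from {system}.{field}"
            for column, (system, field, _) in sorted(data_map.items())]

# Hypothetical source records, for illustration only.
record = {
    "crm": {"CUST_NM": "  Acme Ltd "},
    "erp": {"ORD_AMT": "199.90", "ORD_DT": "2024-03-05T10:00:00"},
}
print(apply_data_map(record, DATA_MAP))
print(data_dictionary(DATA_MAP))
```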


Data Warehousing Tools
Data warehouse
-SQL Server 2000 DTS
-Oracle 8i Warehouse Builder
OLAP tools
-SQL Server Analysis Services
-Oracle Express Server
Reporting tools
-MS Excel Pivot Chart
-VB Applications
Evolution of data warehouse architecture

Top-Down Architecture
Bottom-Up Architecture
Enterprise Data Mart Architecture
Data Stage/Data Mart Architecture
Very Large Data Bases

WAREHOUSES ARE VERY LARGE DATABASES

Terabytes -- 10^12 bytes: Wal-Mart -- 24 Terabytes

Petabytes -- 10^15 bytes: Geographic Information Systems

Exabytes -- 10^18 bytes: National Medical Records

Zettabytes -- 10^21 bytes: Weather images

Yottabytes -- 10^24 bytes: Intelligence Agency Videos
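
As a quick check on the scale of these units, the sketch below converts a size to bytes using the decimal (power-of-ten) definitions on this slide. The largest unit, 10^24 bytes, is written with its standard SI name, yottabyte. The function and dictionary names are illustrative.

```python
# Decimal (SI) exponents for each unit, matching the powers of ten above.
UNIT_EXPONENTS = {
    "terabyte": 12,
    "petabyte": 15,
    "exabyte": 18,
    "zettabyte": 21,
    "yottabyte": 24,
}

def to_bytes(value, unit):
    """Convert a size in the given unit to a whole number of bytes."""
    return value * 10 ** UNIT_EXPONENTS[unit]

walmart_bytes = to_bytes(24, "terabyte")
print(walmart_bytes)  # 24000000000000

# Each step up the list is a factor of 1000.
print(to_bytes(1, "petabyte") // to_bytes(1, "terabyte"))  # 1000
```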


Complexities of Creating a Data Warehouse

Incomplete errors
-Missing fields
-Records or fields that, by design, are not being
recorded

Incorrect errors
-Wrong calculations and aggregations
-Duplicate records
-Wrong information entered into the source system
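
The error types above can be detected with simple checks before data is loaded. The following is a minimal sketch; the required field names, the use of order_id as the duplicate key, and the sample records are assumptions made for illustration.

```python
REQUIRED_FIELDS = ("order_id", "customer", "amount")  # hypothetical field names

def find_incomplete(records):
    """Return indexes of records with a missing or empty required field."""
    return [i for i, r in enumerate(records)
            if any(r.get(f) in (None, "") for f in REQUIRED_FIELDS)]

def find_duplicates(records):
    """Return indexes of records whose order_id was already seen earlier."""
    seen, duplicates = set(), []
    for i, r in enumerate(records):
        key = r.get("order_id")
        if key is None:
            continue
        if key in seen:
            duplicates.append(i)
        else:
            seen.add(key)
    return duplicates

def total_mismatch(records, reported_total, tolerance=0.01):
    """Report whether a reported aggregate disagrees with the detail rows."""
    actual = sum(r["amount"] for r in records
                 if isinstance(r.get("amount"), (int, float)))
    return abs(actual - reported_total) > tolerance

# Hypothetical sample records, for illustration only.
records = [
    {"order_id": 1, "customer": "Alice", "amount": 10.0},
    {"order_id": 2, "customer": "", "amount": 5.0},        # missing field
    {"order_id": 1, "customer": "Alice", "amount": 10.0},  # duplicate record
    {"order_id": 3, "customer": "Bob", "amount": None},    # missing field
]
print(find_incomplete(records))
print(find_duplicates(records))
print(total_mismatch(records, reported_total=15.0))
```

Wrong information entered into the source system is harder to catch automatically, since the value can be well-formed and still be false; checks like these only cover errors that are visible in the data itself.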
Data Warehouse Pitfalls

You are going to spend much time extracting, cleaning, and
loading data
You are going to find problems with systems feeding the data
warehouse
You will find the need to store/validate data not being
captured/validated by any existing system
Large-scale data warehousing can become an exercise in data
homogenization
Data Warehouse Pitfalls (contd..)

The time it takes to load the warehouse will expand to
fill the available load window
You are building a HIGH maintenance system
You will fail if you concentrate on resource optimization
to the neglect of project, data, and customer management
issues and an understanding of what adds value to the
customer
Best Practices

Complete requirements and design

Prototyping is key to business understanding

Utilizing proper aggregations and detailed data

Training is an on-going process

Build data integrity checks into your system.
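
One way to build an integrity check into a system that keeps both aggregations and detailed data is to recompute each aggregate from the detail rows and compare. The sketch below assumes monthly totals keyed by a YYYY-MM string; the field names and sample rows are hypothetical.

```python
from collections import defaultdict

def build_monthly_summary(detail_rows):
    """Aggregate detail rows into a total amount per month (YYYY-MM)."""
    summary = defaultdict(float)
    for row in detail_rows:
        summary[row["date"][:7]] += row["amount"]
    return dict(summary)

def check_summary(detail_rows, summary, tolerance=1e-9):
    """Return the months whose stored aggregate disagrees with the detail rows."""
    expected = build_monthly_summary(detail_rows)
    months = set(expected) | set(summary)
    return sorted(m for m in months
                  if abs(expected.get(m, 0.0) - summary.get(m, 0.0)) > tolerance)

# Hypothetical detail rows, for illustration only.
detail = [
    {"date": "2024-01-15", "amount": 100.0},
    {"date": "2024-01-20", "amount": 50.0},
    {"date": "2024-02-01", "amount": 25.0},
]
summary = build_monthly_summary(detail)
print(check_summary(detail, summary))   # []

summary["2024-02"] = 30.0               # simulate a corrupted aggregate
print(check_summary(detail, summary))   # ['2024-02']
```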


Thank You
[Diagram: Top-Down Architecture]
[Diagram: Bottom-Up Architecture]
[Diagram: Enterprise Data Mart Architecture]
[Diagram: Data Stage/Data Mart Architecture]
