
By: RAVI RANJAN

DATA WAREHOUSE
DEFINITION
Data Warehouse
A collection of corporate
information, derived directly
from operational systems and
some external data sources. Its
specific purpose is to support
business decisions, not business
operations.
THE PURPOSE OF DATA WAREHOUSING

• Realize the value of data
  • Data / information is an asset
  • Methods to realize the value (Reporting, Analysis, etc.)
• Make better decisions
  • Turn data into information
  • Create competitive advantage
  • Methods to support the decision-making process (EIS, DSS, etc.)
Data Warehouse Components

• Staging Area
  • A preparatory repository where transaction data can be transformed for use in the data warehouse
• Data Mart
  • Traditional dimensionally modeled set of dimension and fact tables
  • Per Kimball, a data warehouse is the union of a set of data marts
• Operational Data Store (ODS)
  • Modeled to support near real-time reporting needs
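As a rough illustration of how a staging area feeds a dimensionally modeled data mart, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table names (stg_sales, dim_product, fact_sales), the columns, and both helper functions are hypothetical choices made for this example and do not come from the slides.

```python
import sqlite3


def build_mart(rows):
    """Load raw rows into a staging table, then fill a tiny star schema."""
    con = sqlite3.connect(":memory:")
    cur = con.cursor()

    # Staging area: raw transaction data, stored as text exactly as received.
    cur.execute("CREATE TABLE stg_sales (product TEXT, sale_date TEXT, amount TEXT)")
    cur.executemany("INSERT INTO stg_sales VALUES (?, ?, ?)", rows)

    # Data mart: one dimension table and one fact table (hypothetical layout).
    cur.execute(
        "CREATE TABLE dim_product ("
        "product_id INTEGER PRIMARY KEY, name TEXT UNIQUE)"
    )
    cur.execute(
        "CREATE TABLE fact_sales ("
        "product_id INTEGER REFERENCES dim_product(product_id), "
        "sale_date TEXT, amount REAL)"
    )

    # Transform 1: normalize product names while filling the dimension.
    cur.execute(
        "INSERT OR IGNORE INTO dim_product (name) "
        "SELECT DISTINCT UPPER(TRIM(product)) FROM stg_sales"
    )

    # Transform 2: cast amounts to numbers and swap names for surrogate keys.
    cur.execute(
        "INSERT INTO fact_sales (product_id, sale_date, amount) "
        "SELECT d.product_id, s.sale_date, CAST(s.amount AS REAL) "
        "FROM stg_sales AS s "
        "JOIN dim_product AS d ON d.name = UPPER(TRIM(s.product))"
    )
    con.commit()
    return con


def sales_by_product(con):
    """Report total sales per product from the fact and dimension tables."""
    return con.execute(
        "SELECT d.name, SUM(f.amount) "
        "FROM fact_sales AS f JOIN dim_product AS d USING (product_id) "
        "GROUP BY d.name ORDER BY d.name"
    ).fetchall()
```

Calling build_mart with a few (product, date, amount) tuples and then sales_by_product on the returned connection gives one total per cleaned product name. An operational data store is not modeled here.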
DATA WAREHOUSE FUNCTIONALITY

[Diagram. Labels: Relational Databases, ERP Systems, Purchased Data, Legacy Data; Extraction, Cleansing; Optimized Loader; Data Warehouse Engine; Analyze, Query; Metadata Repository]
EVOLUTION OF DATA WAREHOUSE ARCHITECTURE

• Top-Down Architecture
• Bottom-Up Architecture
• Enterprise Data Mart Architecture
• Data Stage/Data Mart Architecture
VERY LARGE DATA BASES

WAREHOUSES ARE VERY LARGE DATABASES

• Terabytes -- 10^12 bytes: Wal-Mart -- 24 Terabytes
• Petabytes -- 10^15 bytes: Geographic Information Systems
• Exabytes -- 10^18 bytes: National Medical Records
• Zettabytes -- 10^21 bytes: Weather images
• Yottabytes -- 10^24 bytes: Intelligence Agency Videos
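To make the scale concrete, here is a small sketch of the arithmetic behind these decimal units. The standard name for 10^24 bytes is the yottabyte. The UNITS table and the two helper functions are illustrative names invented for this example.

```python
# Decimal (SI) storage units, expressed in bytes.
UNITS = {
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
    "yottabyte": 10**24,
}


def to_bytes(value, unit):
    """Convert a quantity in the given unit to bytes."""
    return value * UNITS[unit]


def convert(value, from_unit, to_unit):
    """Convert a quantity from one unit to another."""
    return value * UNITS[from_unit] / UNITS[to_unit]
```

For instance, to_bytes(24, "terabyte") gives the 24-terabyte figure from the slide in bytes, and convert(24, "terabyte", "petabyte") expresses it as a fraction of a petabyte.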


COMPLEXITIES OF CREATING A DATA WAREHOUSE

• Incomplete errors
  • Missing Fields
  • Records or Fields That, by Design, Are Not Being Recorded
• Incorrect errors
  • Wrong Calculations, Aggregations
  • Duplicate Records
  • Wrong Information Entered into Source System
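Some of the error types listed above can be detected mechanically before data reaches the warehouse. The following is a minimal sketch, assuming records arrive as Python dictionaries. The function names, parameters, and the choice to treat None or an empty string as "missing" are assumptions made for this example, not part of the original slides.

```python
from collections import Counter


def find_quality_issues(records, required_fields, key_fields):
    """Flag incomplete records and duplicated keys in a list of dicts."""
    # Incomplete: a required field is absent, None, or an empty string.
    incomplete = [
        index
        for index, record in enumerate(records)
        if any(record.get(field) in (None, "") for field in required_fields)
    ]
    # Duplicates: the same key tuple appears more than once.
    key_counts = Counter(
        tuple(record.get(field) for field in key_fields) for record in records
    )
    duplicates = [key for key, count in key_counts.items() if count > 1]
    return {"incomplete": incomplete, "duplicates": duplicates}


def total_matches(records, field, reported_total, tolerance=1e-9):
    """Check a reported aggregate against the sum of the detail records."""
    computed = sum(r[field] for r in records if r.get(field) is not None)
    return abs(computed - reported_total) <= tolerance
```

The first function returns the positions of incomplete records and the key values that occur more than once. The second compares a stored aggregate with the sum recomputed from detail rows, which is one way to catch a wrong aggregation. Wrong information entered into the source system generally cannot be caught by checks like these alone.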
SUCCESS & FUTURE OF DATA WAREHOUSE
• The Data Warehouse has successfully supported the increased needs of the State over the past eight years.
• The need for growth continues, however, as the desire for more integrated data increases.
• The Data Warehouse has software and tools in place to provide the functionality needed to support new enterprise Data Warehouse projects.
• The future capabilities of the Data Warehouse can be expanded to include other programs and agencies.
DATA WAREHOUSE PITFALLS

• You are going to spend much time extracting, cleaning, and loading data
• You are going to find problems with systems feeding the data warehouse
• You will find the need to store/validate data not being captured/validated by any existing system
• Large-scale data warehousing can become an exercise in data homogenizing
DATA WAREHOUSE PITFALLS…

• The time it takes to load the warehouse will expand to the amount of time in the available window... and then some
• You are building a HIGH maintenance system
• You will fail if you concentrate on resource optimization to the neglect of project, data, and customer management issues and an understanding of what adds value to the customer
BEST PRACTICES

• Complete requirements and design
• Prototyping is key to business understanding
• Utilizing proper aggregations and detailed data
• Training is an on-going process
• Build data integrity checks into your system
Thank You
Top-Down Architecture

Bottom-Up Architecture

Enterprise Data Mart Architecture

Data Stage/Data Mart Architecture
