DW Architecture
[Figure: date formats integrated into the warehouse. Appl A - date (julian), Appl B - date (yymmdd), and Appl C - date (absolute) are stored in one warehouse format: date (julian).]
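The date integration in the figure can be sketched in Python. This is a minimal illustration, not the presenter's method: it assumes "julian" means an ordinal date written as YYDDD, "yymmdd" is a six-digit string, and "absolute" is a day count from an epoch. The epoch (1900-01-01) and all function names are hypothetical.

```python
from datetime import date, datetime, timedelta

# Assumption: the slides do not name the epoch for "absolute" dates.
ABSOLUTE_EPOCH = date(1900, 1, 1)


def from_julian(value: str) -> date:
    """Parse an ordinal date written as YYDDD, e.g. '99032' (Appl A)."""
    return datetime.strptime(value, "%y%j").date()


def from_yymmdd(value: str) -> date:
    """Parse a date written as YYMMDD, e.g. '990201' (Appl B)."""
    return datetime.strptime(value, "%y%m%d").date()


def from_absolute(days: int) -> date:
    """Convert a day count since ABSOLUTE_EPOCH into a date (Appl C)."""
    return ABSOLUTE_EPOCH + timedelta(days=days)


def to_julian(d: date) -> str:
    """Render a date in the single warehouse format, YYDDD."""
    return d.strftime("%y%j")
```

Each source application keeps its own encoding; only the load step calls the matching `from_*` function and then `to_julian`, so the warehouse holds one representation.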
[Figure: Operational Data supports insert, delete, replace, and change; the Warehouse supports load and read-only access.]
Why?
Data -> Information -> Knowledge
A Need For New Technology
Government and industrial entities have been collecting data in electronic format since the 1960s.
Today, organizations collect millions of pieces of information about every aspect of their operation
on a daily basis.
Data is obtained from multiple disparate sources.
Often information is replicated, leading to confusion.
Related data is often retained in seemingly heterogeneous and incompatible platforms.
Common data attributes are represented in nonstandard formats and naming constructs across
systems.
Most systems are built for data collection (transaction based).
Designed to support On-Line Transaction Processing (OLTP).
Designed to support day-to-day business operations.
Very specific applications built to support interaction with the data.
Perform best when handling small specific volumes of data.
Do not readily accept information from dissimilar sources.
Capable of answering questions of a specific nature and time frame.
How many items do I have in stock today?
How many tickets were sold on a specific date?
What is the current price of an item?
Databases Applications
These are the backbone systems of any enterprise, such as order entry, inventory, etc.
Transaction integrity
OLAP (On-Line Analytical Processing) applications - designed for online ad-hoc data
access to analytical content such as time series and trend analysis views and
summary-level information.
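The contrast between a transaction-style question ("How many tickets were sold on a specific date?") and an analytical one (a trend over time) can be sketched with Python's built-in sqlite3 module. The table, column names, and sample rows are hypothetical and exist only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ticket_sales (sale_date TEXT, event TEXT, tickets INTEGER);
    INSERT INTO ticket_sales VALUES
        ('2024-01-05', 'Concert', 120),
        ('2024-01-05', 'Play',     30),
        ('2024-02-10', 'Concert', 200),
        ('2024-04-02', 'Play',     50);
    """
)


def tickets_on(day: str) -> int:
    """Transaction-style question: a narrow lookup for one specific date."""
    row = conn.execute(
        "SELECT COALESCE(SUM(tickets), 0) FROM ticket_sales WHERE sale_date = ?",
        (day,),
    ).fetchone()
    return row[0]


def tickets_by_month() -> list:
    """Analytical question: summary-level totals across the whole time series."""
    return conn.execute(
        """
        SELECT substr(sale_date, 1, 7) AS month, SUM(tickets)
        FROM ticket_sales
        GROUP BY month
        ORDER BY month
        """
    ).fetchall()
```

The first query touches a handful of rows for a known key; the second scans and aggregates everything, which is the kind of workload a warehouse is organized to serve.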
[Figure: rule-based data processing. Labels: Rule 1, Rule 2, Rule 3; Transformation Engine; Integrator; Error View, Check, Correct; Loader; Warehouse.]
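One way to read the rule-based processing in the figure is as a list of checks applied to each incoming record, with failures routed to an error view for correction and clean records passed on to the loader. The sketch below is an assumption about how such an engine might look; the three rules, the record fields, and the region list are all hypothetical.

```python
from typing import Callable, Optional

Record = dict
# A rule returns None when the record passes, or an error message when it fails.
Rule = Callable[[Record], Optional[str]]

KNOWN_REGIONS = {"North", "South", "East", "West"}  # hypothetical reference data


def rule_has_store(rec: Record) -> Optional[str]:
    return None if rec.get("store") else "missing store"


def rule_amount_non_negative(rec: Record) -> Optional[str]:
    amount = rec.get("amount")
    if isinstance(amount, (int, float)) and amount >= 0:
        return None
    return "amount must be a non-negative number"


def rule_known_region(rec: Record) -> Optional[str]:
    return None if rec.get("region") in KNOWN_REGIONS else "unknown region"


RULES = [rule_has_store, rule_amount_non_negative, rule_known_region]


def run_rules(records, rules=RULES):
    """Split records into (accepted, rejected); rejected keeps the messages."""
    accepted, rejected = [], []
    for rec in records:
        problems = []
        for rule in rules:
            message = rule(rec)
            if message is not None:
                problems.append(message)
        if problems:
            rejected.append((rec, problems))  # shown in the error view
        else:
            accepted.append(rec)  # handed to the loader
    return accepted, rejected
```

Keeping each rule as a small function makes it possible to add or reorder rules without touching the loop that drives them.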
Data Warehousing Architecture
Overview
[Figure: Data Warehouse Architecture. Labels: Data, Legacy Data, Transformation, Data Warehouse.]
Model concepts:
Fact table(s)
A table containing multiple measurable descriptors relating to a
Dimension Table(s)
Retains information (product description, geography description,
customer description) that is descriptive and remains moderately
constant over time
Data Warehousing
[Figure: SALES fact table (Sales Facts) surrounded by four dimensions: TIME, GEOGRAPHY, STORE, and CUSTOMER.]
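The fact and dimension layout in the figure can be sketched as tables in SQLite through Python's sqlite3 module. Only the four dimension names and the SALES fact table come from the slide; every column name, key, and sample row below is a hypothetical choice for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: descriptive attributes that change slowly.
    CREATE TABLE dim_time      (time_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
    CREATE TABLE dim_geography (geo_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE dim_store     (store_id INTEGER PRIMARY KEY, store_name TEXT);
    CREATE TABLE dim_customer  (customer_id INTEGER PRIMARY KEY, customer_name TEXT);

    -- Fact table: one row per sale, keyed by the dimensions, holding the measure.
    CREATE TABLE fact_sales (
        time_id     INTEGER REFERENCES dim_time(time_id),
        geo_id      INTEGER REFERENCES dim_geography(geo_id),
        store_id    INTEGER REFERENCES dim_store(store_id),
        customer_id INTEGER REFERENCES dim_customer(customer_id),
        amount      INTEGER
    );

    INSERT INTO dim_time VALUES (1, 2023, 'Q1'), (2, 2023, 'Q2');
    INSERT INTO dim_geography VALUES (1, 'North'), (2, 'South');
    INSERT INTO dim_store VALUES (1, 'Store A'), (2, 'Store B');
    INSERT INTO dim_customer VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO fact_sales VALUES
        (1, 1, 1, 1, 100),
        (1, 2, 2, 2, 250),
        (2, 1, 1, 2,  75),
        (2, 1, 2, 1,  25);
    """
)


def sales_by_region_and_quarter() -> list:
    """Join the fact table to two dimensions and total the measure."""
    return conn.execute(
        """
        SELECT g.region, t.quarter, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_geography AS g ON g.geo_id = f.geo_id
        JOIN dim_time AS t ON t.time_id = f.time_id
        GROUP BY g.region, t.quarter
        ORDER BY g.region, t.quarter
        """
    ).fetchall()
```

Every analytical query follows the same shape: start from the fact table, join only the dimensions the question needs, and aggregate the measure.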
Sample Snowflake Model
[Figure: SALES fact table (Sales Facts) with STORE and CUSTOMER dimensions. Further labels: Year, Qtr; North, South, East Region, West Region.]
Sample Fact Constellation Model
[Figure: two Sales fact tables with STORE and CUSTOMER dimensions. Further labels: District, Store.]
Data Mart
[Figure: data mart sourcing. Labels: Data From Transaction Sources, Data Warehouse, Update From the Warehouse, Data Mart.]
warehouse
Hardware oriented to support the massive storage requirements and
analytical queries
Keys to Success
Do you understand why you are building the warehouse?
Have you identified both technical and business professionals that you will need
to build the warehouse?
Do you have a strong management sponsor?
Are you managing the expectations of the users?
Careers in Data Warehousing
System Administration
DBA
DW Architect
Application Developer
Data Architect
Data Cleansing/Transformation Analyst
DW Manager
Business Analyst
DW Administrator
Management
Decision Support Analysts