You are on page 1of 15

David M Walker

Consultant
Data Management & Warehousing

A
Technical Architecture
For The
Data Warehouse
Data Warehouse Implementation Strategy

Project Management Business Analysis

Database Schema
Design

Technical
Architecture
Business Analysis

•! End user driven


•! Cross Functional Workshops
•! Iterative design principle (80/20 rules)
•! Determine the Key Performance Indicators
(KPI)
•! Determine constraints on KPI
Database Schema Design

•! Identify sources of information


•! Qualify external sources of information
•! Translate KPI into facts
•! Translate constraints into dimensions
•! Choose required aggregations
•! Build Meta Data and Security Model
Project Management

•! Iterative Process
•! Rapid Application Development (RAD)
techniques
•! Arbitration when 80/20 rule used
•! Conflict of short and long term goals
The Data Warehouse Systems Logical Architecture

Presentation
Third Party Tools Third Party Tools

Layer
The Data Warehouse Middleware Middleware

Security
EIS EIS
Meta
Data
Decision Decision
Support Systems Support Systems

Transaction Repository
Data Acquisition
Operational
Systems

OLTP Legacy External


System System Data
Sources
Data Acquisition

Data Extraction Data Load


•!Extraction •!Loading
•!Transformation •!Exception Processing
•!Collation •!Quality Assurance
•!Migration •!Publication
Transaction Repository

Dimension Dimension

Dimension Dimension
Fact Fact

Fact Fact

Dimension Dimension
Fact Fact

Fact
Dimension Dimension

Dimension Dimension
Data Aggregation

Year
Executive
Information
Systems
Quarter
Month

Decision
Support
System
Week

Transaction
Repository
Day
The Cost Of Aggregation

A very simple schema:

100 Stores 1095 Days 100000 Products


10 Regions 157 Weeks 1000 Categories
1 Company 36 Month 10 Groups
12 Quarters 1 Type
3 Years

Rows: No aggregation, No sparsity: 10950000000


Aggregation, No sparsity: 14609523963 Growth 33%
No aggregation,30% sparsity: 7665000000
Aggregation, Variable sparsity: 10574481741 Growth 38%

If each row is 64 bytes long, a 10Billion row schema without indexes


and other overheads would be 630Gb!
Data Mart
Time Dimension Associated Another Dimension
Day Facts
Week

Month

Quarter
Year

Another Dimension Another Dimension


Meta Data Dictionary And Security

Meta Data
•!Master schema Security
•!Star schema Control of
•!Star schema description user access
•!Table to the data
•!Table description
•!Table row count
•!Column
•!Column description
•!Column derivation
•!Column format
Middleware and Presentation

•! Use a common middleware


•! Group users based on their requirements
•! Try a number of tools for each group
•! Final solution will have more than one front
end, but not an infinite number
•! Add value with alert systems
Conclusion

Strategy Technical Architeture


•! Project Managment •! Source Systems
•! Business Analysis •! Data Acquisition
•! Schema Design •! Transaction Repository
•! Technical Architecture •! Data Aggregation
•! Data Mart
•! Meta Data & Security
•! Middleware & Presentation

Help your users find it !


Contacts

•! Data Management & Warehousing


–! WWW http://www.datamgmt.com
–! Mail davidw@datamgmt.com
–! Telephone +44 1734 771291
–! Fax +44 1734 773058
•! The Data Warehouse Institute
–! WWW http://www.tekptnr.com/tpi/tdwi
–! Mail tdwi@aol.com
•! The Data Warehouse Information Center
–! WWW http://pwp.starnetinc.com/larryg/index.html