David M Walker, Consultant, Data Management & Warehousing

A Technical Architecture For The Data Warehouse

Data Warehouse Implementation Strategy

• Business Analysis
• Project Management
• Database Schema Design
• Technical Architecture

Business Analysis

• End user driven
• Cross Functional Workshops
• Iterative design principle (80/20 rules)
• Determine the Key Performance Indicators (KPI)
• Determine constraints on KPI

Database Schema Design

• Identify sources of information
• Qualify external sources of information
• Translate KPI into facts
• Translate constraints into dimensions
• Choose required aggregations
• Build Meta Data and Security Model
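
As a rough illustration of translating a KPI into facts and its constraints into dimensions, the sketch below assumes a hypothetical KPI, "sales value", constrained by store, product and day. All table and column names are invented for the example, and sqlite3 is used only because it ships with Python; none of this comes from the presentation.

```python
import sqlite3

# Hypothetical example: the KPI "sales value" becomes a fact column, and the
# constraints on it (store, product, day) become dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, store_name   TEXT, region   TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
CREATE TABLE dim_day     (day_id     INTEGER PRIMARY KEY, day          TEXT, month    TEXT);
CREATE TABLE fact_sales (
    store_id    INTEGER REFERENCES dim_store(store_id),
    product_id  INTEGER REFERENCES dim_product(product_id),
    day_id      INTEGER REFERENCES dim_day(day_id),
    sales_value REAL
);
""")

# A few illustrative rows (made-up data).
conn.executemany("INSERT INTO dim_store VALUES (?, ?, ?)",
                 [(1, "Store A", "North"), (2, "Store B", "South")])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "Widget", "Hardware")])
conn.executemany("INSERT INTO dim_day VALUES (?, ?, ?)",
                 [(1, "1996-01-01", "1996-01"), (2, "1996-01-02", "1996-01")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 [(1, 1, 1, 10.0), (1, 1, 2, 5.0), (2, 1, 1, 7.5)])


def sales_by_region(connection):
    """Answer a KPI question by constraining the fact through a dimension."""
    rows = connection.execute("""
        SELECT s.region, SUM(f.sales_value)
        FROM fact_sales f
        JOIN dim_store s ON s.store_id = f.store_id
        GROUP BY s.region
        ORDER BY s.region
    """).fetchall()
    return dict(rows)


print(sales_by_region(conn))
```

The fact table holds only keys and the measure; every descriptive attribute a user might filter or group by lives in a dimension table.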

Project Management

• Iterative Process
• Rapid Application Development (RAD) techniques
• Arbitration when 80/20 rule used
• Conflict of short and long term goals

The Data Warehouse Systems Logical Architecture
[Figure: logical architecture diagram. Labelled components: Operational Systems (OLTP System, Legacy System), External Data Sources, Data Acquisition, Transaction Repository, The Data Warehouse, Meta Data, Security, Middleware, EIS, Decision Support Systems, Presentation Layer (Third Party Tools).]

Data Acquisition

Data Extraction
• Extraction
• Transformation
• Collation
• Migration

Data Load
• Loading
• Exception Processing
• Quality Assurance
• Publication
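
The eight steps above can be read as a simple pipeline. The sketch below is one possible interpretation, not the presentation's own design: the function names, the record layout (`store`, `amount`) and the rejection rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LoadResult:
    loaded: list = field(default_factory=list)
    exceptions: list = field(default_factory=list)
    published: bool = False


def extract(sources):
    """Extraction: pull raw records from each source system."""
    return {name: list(rows) for name, rows in sources.items()}


def transform(raw):
    """Transformation: map every source onto one common record layout."""
    return {
        name: [
            {
                "source": name,
                "store": str(row.get("store", "")).strip().upper(),
                "amount": row.get("amount"),
            }
            for row in rows
        ]
        for name, rows in raw.items()
    }


def collate(transformed):
    """Collation: merge the per-source records into one ordered batch."""
    batch = [row for rows in transformed.values() for row in rows]
    return sorted(batch, key=lambda r: (r["store"], r["source"]))


def migrate(batch):
    """Migration: hand the batch to the warehouse side (here, just a copy)."""
    return [dict(row) for row in batch]


def load(staged):
    """Loading, exception processing, quality assurance and publication."""
    result = LoadResult()
    for row in staged:
        # Exception processing: set aside rows that cannot be loaded.
        if not row["store"] or not isinstance(row["amount"], (int, float)):
            result.exceptions.append(row)
        else:
            result.loaded.append(row)
    # Quality assurance: every staged row must be accounted for.
    qa_ok = len(result.loaded) + len(result.exceptions) == len(staged)
    # Publication: make the data visible only if the QA check passed.
    result.published = qa_ok
    return result


# Made-up source data for two hypothetical systems.
sources = {
    "oltp": [{"store": " s1 ", "amount": 10.0}, {"store": "s2", "amount": None}],
    "legacy": [{"store": "s1", "amount": 3}],
}
result = load(migrate(collate(transform(extract(sources)))))
print(len(result.loaded), len(result.exceptions), result.published)
```

Keeping rejected rows in an exceptions list, rather than dropping them, lets the quality assurance step reconcile row counts before anything is published.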

Transaction Repository
[Figure: Transaction Repository schema diagram showing several Fact tables, each surrounded by Dimension tables.]

Data Aggregation

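[Figure: aggregation diagram. The Transaction Repository is summarised over the time levels Day, Week, Month, Quarter and Year; Executive Information Systems and the Decision Support System are shown against these levels.]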
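
A minimal sketch of rolling daily facts up the time levels named on this slide. It assumes facts arrive as (date, value) pairs and that weeks follow the ISO calendar; both are assumptions for the example, and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import date

LEVELS = ("day", "week", "month", "quarter", "year")


def period_keys(d):
    """Return the key of each time level that a given day belongs to."""
    iso_year, iso_week, _ = d.isocalendar()
    return {
        "day": d.isoformat(),
        "week": f"{iso_year}-W{iso_week:02d}",
        "month": f"{d.year}-{d.month:02d}",
        "quarter": f"{d.year}-Q{(d.month - 1) // 3 + 1}",
        "year": str(d.year),
    }


def aggregate(daily_facts):
    """Sum daily values into every level of the time hierarchy."""
    totals = {level: defaultdict(float) for level in LEVELS}
    for d, value in daily_facts:
        for level, key in period_keys(d).items():
            totals[level][key] += value
    return {level: dict(values) for level, values in totals.items()}


# Made-up daily facts.
facts = [
    (date(1996, 1, 1), 10.0),
    (date(1996, 1, 2), 5.0),
    (date(1996, 4, 15), 7.0),
]
agg = aggregate(facts)
print(agg["month"], agg["quarter"], agg["year"])
```

Weeks do not nest cleanly inside months, so each level is summed directly from the daily rows rather than from the level below it.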

The Cost Of Aggregation
A very simple schema:
• 100 Stores, 10 Regions, 1 Company
• 1095 Days, 157 Weeks, 36 Months, 12 Quarters, 3 Years
• 100000 Products, 1000 Categories, 10 Groups, 1 Type

Rows:
• No aggregation, no sparsity: 10,950,000,000
• Aggregation, no sparsity: 14,609,523,963 (growth 33%)
• No aggregation, 30% sparsity: 7,665,000,000
• Aggregation, variable sparsity: 10,574,481,741 (growth 38%)

If each row is 64 bytes long, a 10 billion row schema would be about 630 GB, before indexes and other overheads!
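
The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes that the unaggregated row count is the product of the lowest level of each hierarchy, and that full aggregation stores one row for every combination of levels; the variable names are illustrative. The "variable sparsity" figure depends on a sparsity model the slide does not give, so it is taken from the slide rather than derived.

```python
from math import prod

# Level counts from the slide, lowest level first in each hierarchy.
dimensions = {
    "store": {"Stores": 100, "Regions": 10, "Company": 1},
    "time": {"Days": 1095, "Weeks": 157, "Months": 36, "Quarters": 12, "Years": 3},
    "product": {"Products": 100_000, "Categories": 1_000, "Groups": 10, "Type": 1},
}


def base_rows(dims):
    """Rows with no aggregation: product of the lowest level of each hierarchy."""
    return prod(next(iter(levels.values())) for levels in dims.values())


def aggregated_rows(dims):
    """Rows with every level combination stored: product of the level sums."""
    return prod(sum(levels.values()) for levels in dims.values())


no_agg = base_rows(dimensions)
with_agg = aggregated_rows(dimensions)
growth = with_agg / no_agg - 1

# Reading "30% sparsity" as 30% of the cells being empty.
sparse_no_agg = no_agg * 70 // 100

ROW_BYTES = 64
slide_rows = 10_574_481_741  # "Aggregation, variable sparsity" as quoted on the slide
size_gib = slide_rows * ROW_BYTES / 2**30

print(f"{no_agg:,} rows, {with_agg:,} with aggregation (+{growth:.0%})")
print(f"{sparse_no_agg:,} rows at 30% sparsity")
print(f"{size_gib:.0f} GB at {ROW_BYTES} bytes per row")
```

Because the total is a product of per-dimension sums, aggregation multiplies the row count by roughly the same factor however large the base table is.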

Data Mart
[Figure: data mart diagram. Associated Facts are linked to a Time Dimension (Day, Week, Month, Quarter, Year) and to three other dimensions.]

Meta Data Dictionary And Security

Meta Data
• Master schema
• Star schema
• Star schema description
• Table
• Table description
• Table row count
• Column
• Column description
• Column derivation
• Column format
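
The items above suggest a nested dictionary: a master schema containing star schemas, which contain tables, which contain columns. The sketch below is one way to model that in memory. The class names, the `find_column` helper and the sample entries are hypothetical, chosen only to mirror the list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ColumnMeta:
    name: str
    description: str
    derivation: str  # how the value is produced from the source data
    format: str


@dataclass
class TableMeta:
    name: str
    description: str
    row_count: int
    columns: List[ColumnMeta] = field(default_factory=list)


@dataclass
class StarSchemaMeta:
    name: str
    description: str
    tables: List[TableMeta] = field(default_factory=list)


@dataclass
class MasterSchema:
    star_schemas: List[StarSchemaMeta] = field(default_factory=list)

    def find_column(self, column_name):
        """Return (star schema name, table name, column) for each match."""
        matches = []
        for star in self.star_schemas:
            for table in star.tables:
                for column in table.columns:
                    if column.name == column_name:
                        matches.append((star.name, table.name, column))
        return matches


# Made-up dictionary entries.
master = MasterSchema([
    StarSchemaMeta("sales", "Sales by store, product and day", [
        TableMeta("fact_sales", "One row per store, product and day", 3, [
            ColumnMeta("sales_value", "Value of goods sold",
                       "Sum of till receipts", "DECIMAL(12,2)"),
        ]),
    ]),
])
hits = master.find_column("sales_value")
print(hits[0][0], hits[0][1], hits[0][2].format)
```

A lookup like `find_column` is the kind of question a meta data dictionary exists to answer: where a piece of information lives and how it was derived.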

Security
Control of user access to the data

Middleware and Presentation

• Use a common middleware
• Group users based on their requirements
• Try a number of tools for each group
• Final solution will have more than one front end, but not an infinite number
• Add value with alert systems

Conclusion

Strategy
• Project Management
• Business Analysis
• Schema Design
• Technical Architecture

Technical Architecture
• Source Systems
• Data Acquisition
• Transaction Repository
• Data Aggregation
• Data Mart
• Meta Data & Security
• Middleware & Presentation

Help your users find it!

Contacts

• Data Management & Warehousing
  – WWW: http://www.datamgmt.com
  – Mail: davidw@datamgmt.com
  – Telephone: +44 1734 771291
  – Fax: +44 1734 773058

• The Data Warehouse Institute
  – WWW: http://www.tekptnr.com/tpi/tdwi
  – Mail: tdwi@aol.com

• The Data Warehouse Information Center
  – WWW: http://pwp.starnetinc.com/larryg/index.html