
Data Modeling for the Data Warehouse
Tom Haughey
April 27-29, 2005

Registration $1595
Early Registration Rate $1495 if registered by March 27, 2005
DAMA, MPO or IDMA Rate $1395 if registered by March 27, 2005

This workshop focuses on the specialized techniques used to model data warehouse data and provides concrete guidelines and rules for modeling this data. It stresses that data warehouse modeling is not just an intuitive process based solely on the spontaneous judgment of a skilled analyst. It can be based on sound rules and guidelines, just as production data models are. The workshop emphasizes three fundamental concepts: finding the right level of atomic data in the warehouse; recognizing that the data warehouse is not a database but an integrated environment consisting of different levels of data; and optimizing the data warehouse design through dimensional modeling.

Workshop Agenda

Introduction
  Scope and levels of modeling
  Kinds of data
  The framework for data modeling
  Challenges in data management
  Five major characteristics of data warehouse Data Models
  Types and technologies of data warehousing

Data Warehouse Methodology
  Explanation of methodology steps
  Iterative nature of development

Introduction To Data Modeling
  Definition and components
  Levels of data models
  Rules for each level
  The high level data model
  Identifying and defining subject areas
  The detailed level data model
  Normalization
  Functional dependency
  Mathematical normalization
  Natural normalization
  Application to the data warehouse

Building the Data Warehouse Model
  Comparison of operational and informational data
  Case 1: direct access to operational data
  Case 2: using informational data bases
  A data controlled environment
  Progression of data in a data controlled environment
  Two types of data changes in the data warehouse: transformations and optimizations

Levels of Data In the Enterprise
  Four types of data and systems
  The warehouse and decision support data model
  Sources of warehouse and decision data
  Definition and rules
  The corporate model
  The business area model

Derived Data
  Types and different rules for handling during analysis and design

Modeling Time And History
  Short term and long term view
  Four ways of handling time and date
  Time-series data
  Capturing business changes
  Importance of representing the business time dimension

Information Gathering
  Facilitated sessions
  Interviews
  Information gathering techniques

Analyzing Current Systems Data
  Define key data elements
  Data stewards for each data element
  Key data element business rules
  Define domains and valid values
  Define valid ranges for error
  Document key data elements on repository
  Validate data mappings
  Identify key data elements in source systems
  Map relationships for repository

Data Transformations
  Remove pure production data
  Add time and history to the data identifier
  Add data derivations
  Find the right atomicity of data
  Determine the functional dependencies in summary data
  Create data arrays and fact tables
  Accommodate varying levels of summarization
  Add summary data
  Merge like data from different tables
  Create arrays of data
  Separate data based on its stability
  Embed relationships in the data
  Add external data
  Techniques for derived data

  Different levels of summarization

Critical Warehouse Components
  Definition of fact tables and dimensions
  Creating multidimensional arrays
  Developing fact tables and arrays
  Corporate reference tables
  The star database schema
  The snowflake database schema
  Meta-data repository and components

Optimizing the Data Warehouse Design
  Data design compromises
  Safe compromises to data
  Merge like tables
  Create arrays of data (violate first normal form)
  Split data based on stability and usage
  Add indices
  Encode-decode data
  Aggressive compromises to data
  Store derived data
  Summarize data
  Add redundant data
  Embed relationship data
  Add redundant relationships
  Add partial dependencies (violate second normal form)
  Add transitive dependencies (violate third normal form)
  Critical factors in data design
  Number of occurrences of each table
  The ratio of one table to another
  The queries that use the data
  The data accesses made by each query
  The load factor for each query
  The steps of optimization

Data Warehouse Technology
  Categories of warehouse tools
  Review of major products

Important Considerations And Issues
  Denormalization and performance
  Archiving and purging
  Data distribution and replication
  Change control
  Copy management
  Alternative models for copied data

What you will learn

Among the most important things you will learn at this seminar is how to design a dimensional data warehouse. OLTP (on-line transaction processing) data models, by contrast, are normalized so as to reduce update problems, and their focus is on current data, which is two-dimensional. Data warehouse design usually introduces a third dimension: time. For example, to support trend analysis, it is necessary to show the values of data changing over time (such as monthly, quarterly, or annually). Usually, this requires a degree of denormalization when creating the data model. The seminar will teach how to accomplish this aspect of data modeling effectively while ensuring a quality data warehouse design. Concepts learned are reinforced by individual and group exercises.

Tom Haughey is considered one of the four founding fathers of Information Engineering in America. He is currently President of InfoModel, Inc., a training and consulting company specializing in practical and rapid development methods. His courses on data management, data warehousing, Information Engineering and software development have been delivered to Fortune 1000 companies around the world. He has worked on the development of seven different CASE tools, over 40,000 copies of which have been sold to date. He was formerly Chief Technology Officer for the Pepsi Bottling Group and Enterprise Director of Data Warehousing for PepsiCo. He was also formerly Vice President of Technology for Computer Systems Advisers, who market the CASE tools called POSE and SILVERRUN. He wrote his own CASE tool in 1984. He formerly worked for IBM for 17 years as a senior project manager. He is the author of many articles on Data Management, Information Engineering and Data Warehousing.
