[Figure: project lifecycle diagram. Boxes: Project Planning; Business Requirement Definition; Dimensional Modeling; Physical Design; Data Staging Design & Development; End-User Application Specification; End-User Application Development; Deployment; Maintenance and Growth; Project Management]
Project Planning & Management
• Who Wants the Warehouse?
– A single visionary user
• desirable because the focus remains manageable
• requires political leverage to make it work
• the need must have broad and definable impacts to show worth
– Multiple demands
• Many organizations want a data mart or warehouse
• Focus is spread, therefore politics and planning play a vital
role
Project Planning & Management
• Who Wants the Warehouse? (cont)
– No identified need
• Organization wanting to get in the “warehouse”
game
• More effort on the warehouse team to identify the
need
• A real need is highly likely to exist.
Project Planning & Management
• Determine Warehouse Readiness
– Do you have a strong business sponsor?
• Vision
• Politically savvy
• Connected
• Influential
• History of success
• Respected
• Realistic
• Understands the need and the process and can communicate it
Project Planning & Management
• Determine Warehouse Readiness (cont)
– Without this person you will fail
– Try to recruit multiple sponsors.
– Is there a real and identifiable business need?
– Does a strong partnership exist between IT and
the business groups?
– What is the current analytical environment?
• How are things done now?
• What culture shock will be created?
Project Planning & Management
• Determine Warehouse Readiness (cont)
– What is the feasibility?
• Is the data “dirty” beyond recovery?
• Are the target sources too dispersed and dynamic to
achieve early and significant results?
Project Planning & Management
• Take the Readiness “Litmus Test”
– The test looks at:
• Sponsor
• Business Needs
• IT/Business Partnership
• Current Analytical Environment
• Feasibility
– A strong sponsor is the most important factor in getting
a high rating from the test
– Business needs and IT/Business Partnerships are
secondary in importance
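The weighting described above can be sketched as a simple weighted score. The numeric weights and the 1-to-5 rating scale are assumptions made for illustration only; the slides state just the ordering (sponsor first, then business needs and the IT/business partnership).

```python
# Hypothetical weights: only their relative ordering comes from the slides.
WEIGHTS = {
    "sponsor": 0.40,
    "business_needs": 0.20,
    "it_business_partnership": 0.20,
    "analytical_environment": 0.10,
    "feasibility": 0.10,
}

def readiness_score(ratings):
    """Return the weighted average of 1-5 ratings, one per readiness factor."""
    missing = set(WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings: {sorted(missing)}")
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)

# Example ratings (made up) for one organization.
ratings = {
    "sponsor": 5,
    "business_needs": 4,
    "it_business_partnership": 3,
    "analytical_environment": 2,
    "feasibility": 4,
}
score = readiness_score(ratings)
```

Because the sponsor carries the largest weight in this sketch, a weak sponsor pulls the score down more than a weak rating on any other factor.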
Project Planning & Management
• Addressing Readiness Issues
– High-level business requirements analysis
• Identify the strategic initiatives
• Identify the business metrics
• Identify the high impact and ROI areas
– Business Requirements Prioritization
• Look for high impact, ROI, and feasibility
– Proof of Concept
Project Planning & Management
• Develop the Initial Scope
– Keep the scope narrow and short to retain clarity
– The bigger the scope the more difficult it
becomes to retain focus
– Always define the scope based on business
requirements. Try to keep deadlines and budget
cycles from driving the scope.
Project Planning & Management
• Develop the Initial Scope (cont)
– Scope definition involves both IT and business
representatives
– Make the scope have significance but ensure it is
achievable and timely
– Start with a single or few data sources and a single
business process
– Limit your initial user base (typically 25 - 35 people).
– Determine what management expects so success can be
identified
Project Planning & Management
• Develop the Initial Scope (cont)
– Document the scope definition and success
indicators
– Acknowledge that the scope will likely change
– Develop a plan to manage the change
Project Planning & Management
• Build the Business Justification
– Determine the costs
• Identify hardware and software costs (start-up and
ongoing)
• Identify maintenance costs
• Internal staff needs
• External resources (consultants, etc.)
• Operational support
• Support of growth pains
Project Planning & Management
• Build the Business Justification (cont)
– Determine the benefits (financial and other)
• Increased revenue
• Increased profit
• Increased customer satisfaction
• Expansion of a market or capability
• Increased employee productivity
• Reduction of capital investments (storage requirements, etc.)
• Protection against fraud and attack
Project Planning & Management
• Build the Business Justification (cont)
– It is important to monitor and track the business
to identify and market the impacts the warehouse
has made
– Look for the tangibles and intangibles
Project Planning & Management
• Plan the Project
– Establish project identity
• Create a name
• Create documentation describing your project
• Make T-shirts, mugs, etc.
• Market, market, market!!!
Project Planning & Management
• Plan the Project (cont)
– Staff up
Jeffrey T. Edgell
The Dimensional Model
• More intuitive structure for presentation and
reporting
• Likely predates the E/R approach
– General Mills & Dartmouth University
developed a fact and dimension structure
– Nielsen Marketing Research used this on
grocery and drug store auditing and scanner
data in the 70s and 80s.
The Dimensional Model
• Dimensions are descriptive
• Facts are likely numeric and are
measurement based
• Additive facts are vital to allow aggregation
of many records during a retrieval
• Page 145 (A typical dimensional model)
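As a minimal sketch of the fact and dimension structure, the example below builds a tiny star schema in an in-memory SQLite database and sums additive facts across many fact rows. The table names, columns, and data are hypothetical and are not taken from the figure referenced above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE store_dim   (store_key INTEGER PRIMARY KEY, city TEXT, region TEXT);
CREATE TABLE sales_fact  (product_key INTEGER, store_key INTEGER,
                          units_sold INTEGER, dollars_sold REAL);
INSERT INTO product_dim VALUES (1, 'Cola', 'Beverage'), (2, 'Chips', 'Snack');
INSERT INTO store_dim VALUES (1, 'Austin', 'South'), (2, 'Boston', 'East');
INSERT INTO sales_fact VALUES (1, 1, 10, 15.0), (1, 2, 4, 6.0), (2, 1, 3, 9.0);
""")

# Descriptive dimension attributes constrain and group the query;
# the numeric, additive facts are summed across the matching rows.
rows = conn.execute("""
SELECT p.category, SUM(f.units_sold), SUM(f.dollars_sold)
FROM sales_fact f JOIN product_dim p ON p.product_key = f.product_key
GROUP BY p.category ORDER BY p.category
""").fetchall()
```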
The Argument for the
Dimensional Model
• Tools can utilize a standardized framework
• Query tools can leverage this framework for
performance optimization
• High performance entry browsing is
possible
• All queries can be initially constrained thus
significantly increasing performance
The Argument for the
Dimensional Model
• Easily adapts to unpredictable queries
• Extends to allow the addition of new tables
or data elements
– will not require rebuilding the database from
scratch
– data does not need to be reloaded
– existing reports and query tools do not need to
be redesigned or reimplemented
The Argument for the
Dimensional Model
• The model can be altered as follows without
interruption:
– The addition of new facts (consistent with the
defined grain)
– The addition of new dimensions
– The widening of a dimension table
– Changing the detail of a dimension to a lower
level
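A minimal sketch of two of these changes, assuming a hypothetical SQLite schema: a new fact column is added at the existing grain and a dimension table is widened, while a report query written before the change keeps returning the same result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales_fact (product_key INTEGER, units_sold INTEGER);
INSERT INTO product_dim VALUES (1, 'Cola'), (2, 'Chips');
INSERT INTO sales_fact VALUES (1, 10), (2, 3), (1, 4);
""")

# An existing report, written before the schema is extended.
REPORT = """
SELECT p.name, SUM(f.units_sold)
FROM sales_fact f JOIN product_dim p ON p.product_key = f.product_key
GROUP BY p.name ORDER BY p.name
"""
before = conn.execute(REPORT).fetchall()

# Add a new fact at the same grain and widen the dimension in place.
conn.execute("ALTER TABLE sales_fact ADD COLUMN dollars_sold REAL")
conn.execute("ALTER TABLE product_dim ADD COLUMN package_type TEXT")

# The old report runs unchanged against the extended schema.
after = conn.execute(REPORT).fetchall()
```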
The Argument for the
Dimensional Model
• The dimensional model exhibits a predefined set
of approaches used to deal with common issues.
– Slowly changing dimensions
– Heterogeneous products (track different lines of
business, e.g. checking & savings)
– Pay-in-advance databases (look at individual
components as well as the total)
– Event handling (no facts)
The Argument for the
Dimensional Model
• Aggregation in a warehouse delivers query
performance that would otherwise be left to
hardware to solve (at greatly increased cost)
• A standard set of schemas for different
business types and applications exists
The Bus
• Supports the incremental approach
• The data mart approach has often led to the
development of warehouses without a
corporate framework
• Stovepipe decision structures result
• The bus produces a uniform global structure,
eliminating the pocket or stovepipe data marts
The Bus
• Look at the entire enterprise as you design
and build the data marts
• A high level architecture must be defined
that explains the entire structure
• A detailed architecture must be developed
to support each data mart as it is
addressed
Conformed Dimensions
• Dimensions used to represent concepts
across the enterprise must be standardized
and agreed upon
– customer
– product
– time
– potentially not region (sales & management)
Conformed Dimensions
• Conformed dimensions must be carefully
managed, maintained, and published to
ensure consistency
• The conformed dimension represents the
central source description on which everyone
agrees
• If the conformed dimension approach is not
observed, the bus will not properly function
Conformed Dimensions
• With conformed dimensions
– One dimension table relates to multiple facts
– Browsers are consistent with the dimension
providing a unified view
– Rollups and meanings remain consistent across
facts
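The points above can be sketched with one product dimension shared by two fact tables. The schema and data are hypothetical; the idea shown is that both fact tables roll up through the same dimension rows, so the category totals line up and can be combined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE sales_fact (product_key INTEGER, units_sold INTEGER);
CREATE TABLE inventory_fact (product_key INTEGER, units_on_hand INTEGER);
INSERT INTO product_dim VALUES
    (1, 'Cola', 'Beverage'), (2, 'Juice', 'Beverage'), (3, 'Chips', 'Snack');
INSERT INTO sales_fact VALUES (1, 10), (2, 5), (3, 7);
INSERT INTO inventory_fact VALUES (1, 100), (2, 40), (3, 60);
""")

def by_category(fact_table, measure):
    """Roll one fact table up to product category via the shared dimension."""
    # Table and column names are fixed by this script, not user input.
    sql = f"""
        SELECT p.category, SUM(f.{measure})
        FROM {fact_table} f JOIN product_dim p ON p.product_key = f.product_key
        GROUP BY p.category
    """
    return dict(conn.execute(sql).fetchall())

sold = by_category("sales_fact", "units_sold")
on_hand = by_category("inventory_fact", "units_on_hand")

# Same category labels on both sides, so the results merge row by row.
combined = {cat: (sold[cat], on_hand[cat]) for cat in sorted(sold)}
```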
Conformed Dimensions
• Design
– Lowest level of granularity possible (based on
the lowest level defined)
– Use the sequential numeric key (surrogate key)
Conformed Facts
• Occurs during the definition of conformed
dimensions
• Relates common measurements accurately
– Cost
– Profit
– Unit price
• If facts are different use different names (marketing
profit & sales profit)
• As much political as technical
When the Bus is not Required
• The business you are dealing with is
intentionally segmented
– Components operate autonomously with no
unified corporate view required
– Products or business areas are disjoint
– For example a company sells music and repairs
train engines (no business or product synergy
except at the very top)
The Components of the
Dimensional Model
• Facts
• Dimensions
• Attributes
• The Bus (optional but highly suggested)
Operations
• Drill down and rollup
– Example on page 168
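A minimal sketch of the two operations, assuming a hypothetical store dimension with a region and city hierarchy: rolling up groups by fewer dimension attributes, and drilling down adds an attribute to the grouping. This is an illustration, not the example from the page cited above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE store_dim (store_key INTEGER PRIMARY KEY, city TEXT, region TEXT);
CREATE TABLE sales_fact (store_key INTEGER, units_sold INTEGER);
INSERT INTO store_dim VALUES
    (1, 'Austin', 'South'), (2, 'Dallas', 'South'), (3, 'Boston', 'East');
INSERT INTO sales_fact VALUES (1, 10), (2, 6), (3, 4), (1, 2);
""")

def totals(*attributes):
    """Sum units_sold grouped by the given store_dim attributes."""
    # Attribute names are fixed by this script, not user input.
    cols = ", ".join(f"s.{a}" for a in attributes)
    sql = f"""
        SELECT {cols}, SUM(f.units_sold)
        FROM sales_fact f JOIN store_dim s ON s.store_key = f.store_key
        GROUP BY {cols} ORDER BY {cols}
    """
    return conn.execute(sql).fetchall()

rolled_up = totals("region")             # coarser view
drilled_down = totals("region", "city")  # one more attribute, finer view
```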
Snowflakes
• What is it?
The removal of low-cardinality fields from a dimension into
a new table that is linked back with keys
• Complicates design detail
• Decreases performance
• Saves some space but normally not a significant
amount
• Bitmap indexes cannot be used effectively
When a Snowflake is OK
• When used as a subdimension
– The data in the subdimension and the dimension it
relates to are at different levels of granularity
– The data load times for the data are different
– Examples:
• County and state
• District and region
• Ship and battle group
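The county and state case can be sketched as follows, using a hypothetical schema in which state attributes sit in their own subdimension table linked from the county dimension by a key. Note the extra join needed to reach the state name, which is the design and performance cost a snowflake carries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Snowflaked: state attributes live in their own subdimension.
CREATE TABLE state_dim (state_key INTEGER PRIMARY KEY, state_name TEXT);
CREATE TABLE county_dim (county_key INTEGER PRIMARY KEY, county_name TEXT,
                         state_key INTEGER REFERENCES state_dim(state_key));
CREATE TABLE sales_fact (county_key INTEGER, units_sold INTEGER);
INSERT INTO state_dim VALUES (1, 'Texas'), (2, 'Ohio');
INSERT INTO county_dim VALUES (1, 'Travis', 1), (2, 'Harris', 1), (3, 'Franklin', 2);
INSERT INTO sales_fact VALUES (1, 5), (2, 7), (3, 2);
""")

# Reaching a state attribute takes two joins instead of one.
by_state = conn.execute("""
SELECT st.state_name, SUM(f.units_sold)
FROM sales_fact f
JOIN county_dim c ON c.county_key = f.county_key
JOIN state_dim st ON st.state_key = c.state_key
GROUP BY st.state_name ORDER BY st.state_name
""").fetchall()
```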
Good Descriptive Dimensions
• Large dimension tables
• Highly descriptive
• Without good descriptive dimensions, the
warehouse is not useful
• Use:
– full words, no missing values (null), QA,
metadata
Common Dimension Techniques
• Time
– example figure 5.7 page 176
• Address
– example page 178
• Commercial address
– example page 179
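For the time dimension, a minimal sketch of generating one row per calendar day is shown below. The column names and the choice of attributes are assumptions for illustration and do not reproduce the figure cited above.

```python
from datetime import date, timedelta

def build_date_dimension(start, days):
    """Return one row per calendar day, keyed by a sequential surrogate key."""
    rows = []
    for offset in range(days):
        d = start + timedelta(days=offset)
        rows.append({
            "date_key": offset + 1,                    # surrogate key
            "full_date": d.isoformat(),
            "day_of_week": d.strftime("%A"),           # full words, not codes
            "month": d.strftime("%B"),
            "quarter": f"Q{(d.month - 1) // 3 + 1}",
            "year": d.year,
            "is_weekend": d.weekday() >= 5,            # Saturday or Sunday
        })
    return rows

# One leap year of days, starting 1 January 2024.
dim = build_date_dimension(date(2024, 1, 1), 366)
```

Precomputing descriptive attributes such as the quarter and a weekend flag lets users constrain and group on them directly instead of deriving them in every query.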
Slowly Changing Dimensions
• What to do:
– Type 0: Ignore the change
– Type 1: Overwrite the changed attribute
– Type 2: Add a new dimension record with new
value of the surrogate key
– Type 3: Add an “old value” field
Slowly Changing Dimensions
• Ignore the change
– Not typically a good solution to the problem, but
is done.
• Overwrite the changed attribute
– Valid when correcting a value from the source
• Add a new dimension record with a
generalized key
– Retains history of a changed product
Slowly Changing Dimensions
• Add an “old value” field
– Valid when only the previous value is needed
for decision making
Slowly Changing Dimensions
• Type 2 example:
Change in product (bottle changes from
plastic to glass)
Key   001      002
Type  Plastic  Glass
SKU   1234     1234
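The bottle example can be sketched in a few lines. The helper name and the dictionary layout are hypothetical, and the keys are plain integers rather than the zero-padded values shown on the slide. The point is that the SKU stays the same while the changed version gets a new surrogate key, so both versions remain available.

```python
def apply_type2_change(dimension, sku, changes):
    """Append a new row for this SKU with a new surrogate key; keep old rows."""
    current = [row for row in dimension if row["sku"] == sku][-1]
    next_key = max(row["key"] for row in dimension) + 1
    new_row = {**current, **changes, "key": next_key}
    dimension.append(new_row)
    return new_row

product_dim = [{"key": 1, "sku": "1234", "type": "Plastic"}]
apply_type2_change(product_dim, "1234", {"type": "Glass"})
```

Fact rows loaded before the change keep pointing at key 1 and rows loaded afterwards use key 2, which is how history is partitioned.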
Slowly Changing Dimensions
• Type 3 example:
Regional divisions of a company change
(only one historical change is supported)
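A minimal sketch of the Type 3 approach, with a hypothetical helper and made-up region names: the current value is overwritten and the prior value is kept in a single "old value" field, so a second change discards the first.

```python
def apply_type3_change(row, attribute, new_value):
    """Overwrite attribute in place, saving the prior value in old_<attribute>."""
    row[f"old_{attribute}"] = row[attribute]
    row[attribute] = new_value
    return row

store = {"key": 1, "store": "Austin", "region": "South"}
apply_type3_change(store, "region", "Southwest")
first_change = dict(store)  # snapshot after one change

# A second change pushes out the original value 'South'.
apply_type3_change(store, "region", "West")
```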