
Dimensional Design

Details

1
Star Schema Dimension Tables
 Dimension tables
   Store dimension values
   Textual content
 Dimension tables usually referred to simply as 'dimensions'
 Spend extra effort to add dimensional attributes

[Diagram: boxes labeled "Dimension"]
2
Dimension Keys
 Synthetic keys
   Each table assigned a unique primary key, specifically generated for the data warehouse
   Primary keys from source systems may be present in the dimension, but are not used as primary keys in the star schema

[Diagram: Dimension tables, each with its own key]
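
A minimal sketch of a synthetic key, using Python's built-in sqlite3 module. The table name dealer_dim and the column source_dealer_id are hypothetical; the slides only name dealer_key. The warehouse generates dealer_key itself, and the source system's own key is carried along as an ordinary attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dealer_dim (
        dealer_key       INTEGER PRIMARY KEY,  -- synthetic key assigned by the warehouse
        source_dealer_id TEXT,                 -- source-system key, kept only as an attribute
        dealer           TEXT,
        city             TEXT
    )
""")

# Incoming rows carry the source-system id only; dealer_key is generated on insert.
source_rows = [("D-100", "Acme Motors", "Boston"), ("D-200", "Best Cars", "Denver")]
conn.executemany(
    "INSERT INTO dealer_dim (source_dealer_id, dealer, city) VALUES (?, ?, ?)",
    source_rows,
)

dealer_rows = conn.execute(
    "SELECT dealer_key, source_dealer_id FROM dealer_dim ORDER BY dealer_key"
).fetchall()
print(dealer_rows)
```

Fact rows would then reference dealer_key, never source_dealer_id.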

3
Dimension Columns
 Dimension attributes
   Specify the way in which measures are viewed: rolled up, broken out or summarized
   Often follow the word “by” as in “Show me Sales by Region and Quarter”
   Frequently referred to as 'Dimensions'

[Diagram: Dimension tables, each with a Key column followed by attribute columns]
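
As a rough illustration of “Show me Sales by Region and Quarter”, the sketch below uses Python's sqlite3 module with made-up table names and data. The attributes named after “by” become the GROUP BY columns, and the measure is summed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dealer_dim (dealer_key INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE time_dim   (time_key   INTEGER PRIMARY KEY, quarter TEXT);
    CREATE TABLE sales_facts (dealer_key INTEGER, time_key INTEGER, revenue REAL);

    INSERT INTO dealer_dim VALUES (1, 'East'), (2, 'West');
    INSERT INTO time_dim   VALUES (1, 'Q1'), (2, 'Q2');
    INSERT INTO sales_facts VALUES
        (1, 1, 100.0), (1, 1, 50.0), (1, 2, 70.0), (2, 1, 30.0);
""")

# "Sales by Region and Quarter": group the measure by two dimension attributes.
sales_by_region_quarter = conn.execute("""
    SELECT d.region, t.quarter, SUM(f.revenue)
    FROM sales_facts f
    JOIN dealer_dim d ON d.dealer_key = f.dealer_key
    JOIN time_dim   t ON t.time_key   = f.time_key
    GROUP BY d.region, t.quarter
    ORDER BY d.region, t.quarter
""").fetchall()
print(sales_by_region_quarter)
```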

4
Star Schema Fact Table
 Process measures
   Start by assigning one fact table per business subject area
   Fact tables store the process measures (aka Facts)
   Compared to dimension tables, fact tables usually have a very large number of rows

[Diagram: Fact Table with columns fact1, fact2, fact3]

5
Fact Table Primary Key
 Every fact table
   Has a multi-part primary key
   Made up of foreign keys referencing dimensions

[Diagram: Fact Table with columns key, key, key, fact1, fact2, fact3]
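
A sketch of a multi-part primary key, again with Python's sqlite3 module. The key and fact names follow the Sales Facts example used later in the deck; the *_dim table names and the try_insert helper are hypothetical. Each part of the key is also a foreign key to one dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE model_dim  (model_key  INTEGER PRIMARY KEY);
    CREATE TABLE dealer_dim (dealer_key INTEGER PRIMARY KEY);
    CREATE TABLE time_dim   (time_key   INTEGER PRIMARY KEY);
    CREATE TABLE sales_facts (
        model_key  INTEGER NOT NULL REFERENCES model_dim(model_key),
        dealer_key INTEGER NOT NULL REFERENCES dealer_dim(dealer_key),
        time_key   INTEGER NOT NULL REFERENCES time_dim(time_key),
        revenue    REAL,
        quantity   INTEGER,
        PRIMARY KEY (model_key, dealer_key, time_key)
    );
    INSERT INTO model_dim  VALUES (1);
    INSERT INTO dealer_dim VALUES (1);
    INSERT INTO time_dim   VALUES (1);
    INSERT INTO sales_facts VALUES (1, 1, 1, 250.0, 2);
""")

def try_insert(row):
    """Return True if the fact row is accepted, False if a key constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO sales_facts VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_accepted = try_insert((1, 1, 1, 99.0, 1))  # same key combination as the existing row
orphan_accepted = try_insert((2, 1, 1, 99.0, 1))     # model_key 2 has no dimension row
print(duplicate_accepted, orphan_accepted)
```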

6
Fact Table Sparsity
 Sparsity
   Term used to describe the very common situation where a fact table does not contain a row for every combination of every dimension table row for a given time period
   Because fact tables contain a very small percentage of all possible combinations, they are said to be "sparsely populated" or "sparse"
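
To make sparsity concrete, the following sketch counts how many dimension combinations are possible and how many actually have a fact row. The dimension values and the fact rows are invented for illustration.

```python
from itertools import product

models = ["sedan", "coupe", "truck"]
dealers = ["north", "south"]
days = ["2024-01-01", "2024-01-02"]

# Only combinations where a sale actually happened get a fact row.
fact_rows = {
    ("sedan", "north", "2024-01-01"): 2,
    ("truck", "south", "2024-01-02"): 1,
}

possible = len(list(product(models, dealers, days)))  # every model x dealer x day
populated = len(fact_rows)
density = populated / possible
print(f"{populated} of {possible} combinations populated")
```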

7
Fact Table Grain
 Grain
   The level of detail represented by a row in the fact table
   Must be identified early
   Cause of greatest confusion during design process
 Example
   Each row in the fact table represents the daily item sales total
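
One way to picture the example grain: individual sales transactions are rolled up so that each fact row holds one item's sales total for one day. The transaction data below is hypothetical.

```python
from collections import defaultdict

# Hypothetical source transactions: (date, item, amount)
transactions = [
    ("2024-01-01", "widget", 10.0),
    ("2024-01-01", "widget", 15.0),
    ("2024-01-01", "gadget", 7.5),
    ("2024-01-02", "widget", 20.0),
]

daily_item_sales = defaultdict(float)
for date, item, amount in transactions:
    daily_item_sales[(date, item)] += amount  # one fact row per (day, item)

fact_rows = sorted((d, i, total) for (d, i), total in daily_item_sales.items())
print(fact_rows)
```

Four transactions collapse into three fact rows; anything finer than a day and an item is no longer recoverable from the fact table.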

8
Designing a Star Schema
 Five initial design steps
 Based on Kimball's six steps
 Start designing in order
 Re-visit and adjust over project life

9
Step One

1. Identify fact table

Start by naming the fact table with the name of the business subject area

10
Step Two

2. Identify fact table grain

Describe what a row in the fact table represents, in business terms

11
Step Three

3. Identify dimensions

12
Step Four

4. Select facts

13
Step Five

5. Identify dimensional attributes

14
Fact Table Details

15
Example Fact Table

Sales Facts
model_key
dealer_key
time_key

revenue
quantity

16
Facts
 Fully additive
 Can be summed across any and all dimensions
 Stored in fact table
 Examples: revenue, quantity

17
Facts
 Semi-additive
 Can be summed across most dimensions but not
all
 Anything that measures a “level”
 Must be careful with ad-hoc reporting
 Often aggregated across the “forbidden
dimension” by averaging
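
A small sketch of a semi-additive fact, assuming an account balance as the “level” and time as the forbidden dimension (both are illustrative choices, not taken from the slides). Balances add up across accounts on one day, but across days they are averaged rather than summed.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical daily balance snapshots: (day, account, balance)
balances = [
    ("2024-01-01", "A", 100.0),
    ("2024-01-01", "B", 50.0),
    ("2024-01-02", "A", 120.0),
    ("2024-01-02", "B", 30.0),
]

# Summing across accounts within one day is meaningful.
total_by_day = defaultdict(float)
for day, account, balance in balances:
    total_by_day[day] += balance

# Summing across days would count the same money twice, so average over time instead.
average_by_account = {
    account: mean(b for _, a, b in balances if a == account)
    for account in {a for _, a, _ in balances}
}
print(dict(total_by_day), average_by_account)
```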

18
Facts
 Non-Additive
 Cannot be summed across any dimension
 All ratios are non-additive
 Break down to fully additive components, store
them in fact table
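
The sketch below shows why a ratio is broken down into additive components. Margin percentage is used as the example ratio, with invented revenue and cost figures; the fact rows store revenue and cost, and the ratio is computed only after summing.

```python
# Hypothetical fact rows storing additive components: (region, revenue, cost)
rows = [
    ("East", 100.0, 60.0),
    ("West", 300.0, 270.0),
]

def margin_pct(revenue, cost):
    """Margin as a fraction of revenue."""
    return (revenue - cost) / revenue

# Misleading: averaging row-level ratios ignores how much revenue each row carries.
naive = sum(margin_pct(r, c) for _, r, c in rows) / len(rows)

# Sum the additive components first, then compute the ratio once.
total_revenue = sum(r for _, r, _ in rows)
total_cost = sum(c for _, _, c in rows)
overall = margin_pct(total_revenue, total_cost)
print(naive, overall)
```

With these figures the two approaches disagree (about 0.25 versus 0.175), which is the reason the ratio itself is not stored.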

19
Factless Fact Table
 A fact table with no measures in it
 Nothing to measure...
 …Except the convergence of dimensional
attributes
 Sometimes store a “1” for convenience
 Examples: Attendance, Customer
Assignments, Coverage
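
For the Attendance example, a factless fact table can be sketched as rows of dimension keys only. The key names and values below are hypothetical. Questions are answered by counting rows rather than summing a measure.

```python
from collections import Counter

# Hypothetical attendance fact rows: (date_key, student_key, class_key), no measures.
attendance_facts = [
    (20240101, 1, 10),
    (20240101, 2, 10),
    (20240102, 1, 10),
    (20240102, 1, 20),
]

# "How many attendances per class?" is a row count grouped by class_key.
attendance_by_class = Counter(class_key for _, _, class_key in attendance_facts)
print(attendance_by_class)
```

Storing a constant “1” in each row would let the same question be answered with a sum instead of a count.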

20
Dimension Table
Details

21
Example Dimension Tables
Model
model_key

brand
category
line
model

Time
time_key

year
quarter
month
date

Dealer
dealer_key

region
state
city
dealer
22
Dimension Tables
 Characteristics
 Hold the dimensional attributes
 Usually have a large number of attributes (“wide”)
 Add flags and indicators that make it easy to
perform specific types of reports
 Have small number of rows in comparison to fact
tables (most of the time)

23
Don’t Normalize Dimensions
 Saves very little space
 Hurts query performance because of the extra joins
 Can confuse matters when multiple
hierarchies exist
 A star schema with normalized dimensions is
called a "snowflake schema"
 Usually advocated by software vendors
whose products require a snowflake schema for
performance
24
Slowly Changing Dimensions
 Dimension source data may change over time
 Relative to fact tables, dimension records
change slowly
 Allows dimensions to have multiple 'profiles'
over time to maintain history
 Each profile is a separate record in a
dimension table

25
Slowly Changing Dimension
Example
 Example: A woman gets married
 Possible changes to customer dimension
• Last Name
• Marital Status
• Address
• Household Income
 Existing facts need to remain associated with her
single profile
 New facts need to be associated with her married
profile

26
Slowly Changing Dimension
Types
 Three types of slowly changing dimensions
 Type 1
• Updates existing record with modifications
• Does not maintain history
 Type 2
• Adds new record
• Does maintain history
• Maintains old record
 Type 3
• Keep old and new values in the existing row
• Requires a design change
27
Designing Loads to Handle SCD
 Design and implementation guidelines
 Gather SCD requirements when designing data
mapping and loading
 SCD needs to be defined and implemented at the
dimensional attribute level
 Each column in a dimension table needs to be
identified as a Type 1 or a Type 2 SCD
 If one Type 1 column changes, then all Type 1
columns will be updated
 If one Type 2 column changes, then a new record
will be inserted into the dimension table
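
The guidelines above can be sketched as a small load routine. Everything here is an assumption made for illustration: the dimension is a plain list of dicts, the column names and the current flag are hypothetical, address is treated as Type 1, and last_name and marital_status as Type 2. Overwriting Type 1 columns on every profile of the customer is also a design choice of this sketch, not something the slides prescribe.

```python
TYPE1_COLS = ("address",)                      # overwrite, no history (assumed classification)
TYPE2_COLS = ("last_name", "marital_status")   # new row, history kept (assumed classification)

def apply_change(dimension, source):
    """Apply one source record to the dimension (a list of dict rows)."""
    versions = [r for r in dimension if r["customer_id"] == source["customer_id"]]
    current = next((r for r in versions if r["current"]), None)

    if current is None or any(current[c] != source[c] for c in TYPE2_COLS):
        # Type 2 change (or first load): insert a new profile under a new synthetic key.
        if current is not None:
            current["current"] = False
        new_row = dict(source, customer_key=len(dimension) + 1, current=True)
        dimension.append(new_row)
        versions.append(new_row)

    # Type 1: overwrite all Type 1 columns on every profile, so no old value survives.
    for row in versions:
        for col in TYPE1_COLS:
            row[col] = source[col]

dimension = []
apply_change(dimension, {"customer_id": "C1", "last_name": "Smith",
                         "marital_status": "single", "address": "1 Elm St"})
apply_change(dimension, {"customer_id": "C1", "last_name": "Smith",
                         "marital_status": "single", "address": "9 Oak Ave"})   # Type 1 change
apply_change(dimension, {"customer_id": "C1", "last_name": "Jones",
                         "marital_status": "married", "address": "9 Oak Ave"})  # Type 2 change
print(dimension)
```

After the three loads the dimension holds two profiles for the same customer: the single profile (no longer current) and the married profile, both carrying the latest address.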
28
Designing Loads to Handle SCD
 Design and implementation guidelines
 For large dimension tables, change data capture
techniques may be used to minimize the data
volume
 For smaller dimension tables, compare all OLTP
records with dimension table records
 Balance data volume with change data capture
logic complexities

29
Conformed Dimensions
 A conformed dimension means exactly the same
thing in every fact table to which
it is joined.
 E.g., the date dimension table connected to
the sales facts is identical to the date
dimension connected to the inventory facts.

30
Degenerate Dimensions
 Dimensions with no other place to go
 Stored in the fact table
 Are not facts
 Common examples include invoice numbers
or order numbers

31
Junk Dimensions
 A junk dimension is a collection of random
transactional codes, flags and/or text
attributes that are unrelated to any particular
dimension. The junk dimension is simply a
structure that provides a convenient place to
store the junk attributes.
 E.g., a Gender dimension

32
