
Transparencies for:

Prosser, Ossimitz

Data Warehouse Management


Using SAP BW
UTB für Wissenschaft, Stuttgart 2001

Background

Functional software

MM

Sales

PPC

Acc.

(C) Prosser, A., Ossimitz, M.L., Department of Production Management, Univ. of Economics and Business Admin. Vienna, Austria, 2000

Background
Process orientation:

Business Process

Value for the customer
MM

Sales

PPC

Acc.


Background

Vertical integration

EIS
Decision
Support

Reporting

Reporting,
Analysis, Controlling

Functional applications
Cross-functional base applications
Operational systems

Office automation

Horizontal integration

Background

Organisation #2

Organisation #1

Decision
Support

Decision
Support

Reporting,
Analysis, Controlling

Reporting,
Analysis, Controlling

Functional applications
Cross-functional base applications
Office automation

Functional applications
Cross-functional base applications
Office automation

Cross-company integration (e.g., supply chain management)


Why this subject?

Warehouse knowledge has become a key factor for non-IT specialists as well, e.g., in controlling, strategic planning, production planning, and marketing
Conventional reporting is increasingly being replaced by more powerful warehouse tools
Warehouse boom in the IT and consulting industry
Warehouses are the foundation of decision support systems


Why this subject?

Introduction, concepts
Conceptual design
Database design
Case study: SAP BW


Introduction
External Sources

Operational IS

Data Mining
KBS

OLAP

DW


Introduction
Usage
  Operational system: Transaction-intensive (read and write)
  Data warehouse: Query-intensive (read only)
Users
  Operational system: Relatively large number
  Data warehouse: Relatively small number
Coverage
  Operational system: (In most cases) current data only
  Data warehouse: Current & historical data; time-dependent
Integration
  Operational system: Data is integrated according to functional activity or business processes
  Data warehouse: Data is integrated according to subjects
Data quality
  Operational system: Depending on maintenance of data in the individual operational system
  Data warehouse: Quality is defined in terms of consistency
Database
  Operational system: Data is updated continuously and is therefore volatile
  Data warehouse: Data is refreshed or updated at intervals and is therefore non-volatile


Introduction

Data Mining
KBS

Operational IS
Read/write

OLAP
Read-only
DW

Periodic update
ODS

Introduction
operational

current data
1

warehouse
history
n


Introduction
Key Integration
Operational IS
Read/write
Key_1

Key_2

DW

one object in
DW

Example:
accounts receivable
customer
transport destination
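
Key integration can be sketched as a lookup that maps several operational keys to one warehouse object. The sketch below is only an illustration: the system names, the local keys and the warehouse key `BP-001` are hypothetical and do not come from the slides.

```python
# Hypothetical source systems and keys; only the idea (several operational
# keys, one object in the DW) is taken from the slide.
KEY_MAP = {
    ("accounting", "AR-4711"): "BP-001",  # accounts receivable key
    ("sales", "CUST-0815"): "BP-001",     # customer key
    ("shipping", "DEST-17"): "BP-001",    # transport destination key
}


def warehouse_key(source_system: str, local_key: str) -> str:
    """Return the single warehouse key for an operational (system, key) pair."""
    return KEY_MAP[(source_system, local_key)]


# All three operational keys resolve to the same warehouse object.
resolved = {warehouse_key(system, key) for (system, key) in KEY_MAP}
print(resolved)
```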

Introduction
Field Integration
Subject-oriented
Operational IS
Transaction or
process-oriented

DW

Filter:

field format
numeric construction
allowable field content
attributes stored


Introduction
Semantic integration

Customer
1:1?

Accounts receivable

1:1?

Transport destination


Introduction
Semantic integration

DW

Work center

Batch production

Work center

Work center

Project management


Introduction

Development
  Operational system: OLTP systems are often built by first capturing a full set of requirements and then applying the well-established waterfall development method
  Data warehouse: Data warehouses are built according to the principles of evolutionary and iterative development
Sponsorship
  Operational system: Operational systems tend to be sponsored by a clear process owner whose responsibilities map clearly onto the organisational hierarchy
  Data warehouse: The data warehouse requires a more complex sponsorship model because of the organisational scope of the project


Introduction

Model
  Operational system: Data is usually normalised and is modeled according to the needs of the transaction being supported
  Data warehouse: Data is usually de-normalised and is modeled according to dimensions of a subject
Data structure
  Operational system: Typically, data is organised relationally or hierarchically
  Data warehouse: Data is usually multidimensional according to the subject
Scope
  Operational system: Operational transaction systems are defined by the scope of the application, which is fixed
  Data warehouse: The data warehouse project is concerned with providing a decision-support infrastructure that is scalable, extensible and flexible


Insert on Normalization
SID  Activity  Fee $
1    Squash    35
2    Sailing   210
3    Fencing   50
4    Squash    35
4    Sailing   210
5    Squash    35
5    Weights   50
5    Sailing   210

Only singular entries
First Normal Form, 1NF

The following examples are taken and adapted from David Kroenke, Database Processing, 2000

Insert on Normalization
SID  Activity  Fee $
1    Squash    35
2    Sailing   210
3    Fencing   50
4    Squash    35
4    Sailing   210
5    Squash    35
5    Weights   50
5    Sailing   210

Key: SID, Activity
Activity → Fee
Deletion anomaly: delete entry (5, Weights, 50) → ?


Insert on Normalization
SID  Activity  Fee $
1    Squash    35
2    Sailing   210
3    Fencing   50
4    Squash    35
4    Sailing   210
5    Squash    35
5    Weights   50
5    Sailing   210

Update anomaly: change entry (4, Sailing, 210) to (4, Sailing, 200) → ?
Insertion anomaly: insert (6, Sailing, 190) → ?


Insert on Normalization
All non-key attributes depend on the entire key
=> Second Normal Form, 2NF

SID  Activity
1    Squash
2    Sailing
3    Fencing
4    Squash
4    Sailing
5    Squash
5    Weights
5    Sailing

Activity  Fee $
Squash    35
Sailing   210
Fencing   50
Weights   50
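
The effect of the 2NF split on the deletion anomaly can be checked with a few lines of Python. This is a minimal sketch that uses the activity data from the slides; the variable names are illustrative only.

```python
# Single table (SID, Activity, Fee) as on the 1NF slide.
enrolment = [
    (1, "Squash", 35), (2, "Sailing", 210), (3, "Fencing", 50),
    (4, "Squash", 35), (4, "Sailing", 210),
    (5, "Squash", 35), (5, "Weights", 50), (5, "Sailing", 210),
]

# 2NF decomposition: (SID, Activity) and (Activity, Fee).
student_activity = {(sid, activity) for sid, activity, _ in enrolment}
activity_fee = {activity: fee for _, activity, fee in enrolment}

# Single table: deleting the only Weights enrolment also loses the Weights fee.
after_delete = [row for row in enrolment if row != (5, "Weights", 50)]
fee_lost = all(activity != "Weights" for _, activity, _ in after_delete)

# Decomposed tables: only the enrolment row goes, the fee information stays.
student_activity.discard((5, "Weights"))
fee_kept = activity_fee["Weights"]

print(fee_lost, fee_kept)
```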


Insert on Normalization

Second Normal Form, 2NF:

SID  Activity
1    Squash
2    Sailing
3    Fencing
4    Squash
4    Sailing
5    Squash
5    Weights
5    Sailing

Activity  Fee $
Squash    35
Sailing   210
Fencing   50
Weights   50


Insert on Normalization

SID  Building  Fee $
1    A         1000
2    B         1200
3    C         1300
4    A         1000
5    B         1200
6    C         1300
7    C         1300
8    A         1000

Anomalies??


Insert on Normalization

SID  Building  Fee $
1    A         1000
2    B         1200
3    C         1300
4    A         1000
5    B         1200
6    C         1300
7    C         1300
8    A         1000

SID → Building → Fee
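
Whether a functional dependency holds in a given table can be tested mechanically. The helper below is a sketch, not part of the original material; `holds` is a hypothetical name, and the data is the housing table from the slide.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        # The first value seen for a given lhs is stored; any later
        # deviation means lhs does not determine rhs.
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True


housing = [
    {"SID": sid, "Building": building, "Fee": fee}
    for sid, building, fee in [
        (1, "A", 1000), (2, "B", 1200), (3, "C", 1300), (4, "A", 1000),
        (5, "B", 1200), (6, "C", 1300), (7, "C", 1300), (8, "A", 1000),
    ]
]

print(holds(housing, "SID", "Building"))  # key determines Building
print(holds(housing, "Building", "Fee"))  # transitive step Building -> Fee
print(holds(housing, "Fee", "SID"))       # does not hold
```

Because both SID → Building and Building → Fee hold, Fee depends on the key only transitively, which is exactly what the 3NF decomposition removes.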


Insert on Normalization
If a relation is in 2NF and has no transitive dependencies
=> Third Normal Form, 3NF

SID  Building
1    A
2    B
3    C
4    A
5    B
6    C
7    C
8    A

Building  Fee $
A         1000
B         1200
C         1300


Insert on Normalization

Third Normal Form, 3NF:

SID  Building
1    A
2    B
3    C
4    A
5    B
6    C
7    C
8    A

Building  Fee $
A         1000
B         1200
C         1300


Insert on Normalization
SID  Major            Prof.
1    WI               Werthner
2    AI               Janko
2    Production Man.  Taudes
3    Marketing        Scheuch
3    Production Man.  Jammernegg
4    Production Man.  Taudes
4    AI               Janko
5    Marketing        Schlegelmilch
5    WI               Neumann

Assume: (i) a student can have several majors, (ii) a major can have several supervisors, (iii) each professor supervises only one subject.
Anomalies??


Insert on Normalization
SID  Major            Prof.
1    WI               Werthner
2    AI               Janko
2    Production Man.  Taudes
3    Marketing        Scheuch
3    Production Man.  Jammernegg
4    Production Man.  Taudes
4    AI               Janko
5    Marketing        Schlegelmilch
5    WI               Neumann

Prof. → Major
Key candidate #1: SID/Major
Key candidate #2: SID/Prof.


Insert on Normalization
Relation in 3NF and every determinant is a candidate key
=> Boyce-Codd Normal Form, BCNF

Prof.          Major
Werthner       WI
Janko          AI
Taudes         Production Man.
Scheuch        Marketing
Jammernegg     Production Man.
Schlegelmilch  Marketing
Neumann        WI

SID  Prof.
1    Werthner
2    Janko
2    Taudes
3    Scheuch
3    Jammernegg
4    Taudes
4    Janko
5    Schlegelmilch
5    Neumann


Insert on Normalization

Boyce-Codd Normal Form, BCNF:

Prof.          Major
Werthner       WI
Janko          AI
Taudes         Production Man.
Scheuch        Marketing
Jammernegg     Production Man.
Schlegelmilch  Marketing
Neumann        WI

SID  Prof.
1    Werthner
2    Janko
2    Taudes
3    Scheuch
3    Jammernegg
4    Taudes
4    Janko
5    Schlegelmilch
5    Neumann


Insert on Normalization

SID  Major                Activity
1    WI                   Squash
1    AI                   Squash
2    Production Man.      Sailing
3    Transport/Logistics  Fencing
3    Production Man.      Fencing
4    AI                   Squash
4    AI                   Sailing
5    Marketing            Squash
5    Marketing            Weights
5    Marketing            Sailing

Anomalies??


Insert on Normalization

SID  Major                Activity
1    WI                   Squash
1    AI                   Squash
2    Production Man.      Sailing
3    Transport/Logistics  Fencing
3    Production Man.      Fencing
4    AI                   Squash
4    AI                   Sailing
5    Marketing            Squash
5    Marketing            Weights
5    Marketing            Sailing

SID →→ Major
SID →→ Activity
Multi-value dependency


Insert on Normalization
Relation in BCNF and no multi-value dependencies
=> Fourth Normal Form, 4NF

SID  Major
1    WI
1    AI
2    Production Man.
3    Transport/Logistics
3    Production Man.
4    AI
5    Marketing

SID  Activity
1    Squash
2    Sailing
3    Fencing
4    Squash
4    Sailing
5    Squash
5    Weights
5    Sailing
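
That the 4NF split loses no information can be verified by projecting the original relation onto the two tables and joining them again on SID. The following is a small sketch over the slide data; the variable names are illustrative.

```python
# Original relation (SID, Major, Activity) from the multi-value dependency slide.
original = {
    (1, "WI", "Squash"), (1, "AI", "Squash"),
    (2, "Production Man.", "Sailing"),
    (3, "Transport/Logistics", "Fencing"), (3, "Production Man.", "Fencing"),
    (4, "AI", "Squash"), (4, "AI", "Sailing"),
    (5, "Marketing", "Squash"), (5, "Marketing", "Weights"),
    (5, "Marketing", "Sailing"),
}

# Projections corresponding to the two 4NF tables.
majors = {(sid, major) for sid, major, _ in original}
activities = {(sid, activity) for sid, _, activity in original}

# Natural join on SID reproduces the original relation exactly.
rejoined = {
    (sid_m, major, activity)
    for sid_m, major in majors
    for sid_a, activity in activities
    if sid_m == sid_a
}

print(len(majors), len(activities), rejoined == original)
```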


Insert on Normalization

Fourth Normal Form, 4NF:

SID  Major
1    WI
1    AI
2    Production Man.
3    Transport/Logistics
3    Production Man.
4    AI
5    Marketing

SID  Activity
1    Squash
2    Sailing
3    Fencing
4    Squash
4    Sailing
5    Squash
5    Weights
5    Sailing


Insert on Normalization

Exercise on Normalization #1:

Customer        Sales_Area    Delivery_Plant
Artner & Co     Austria/East  Graz
Umbrellas&More  Austria/West  Innsbruck
Umbrella Store  Austria/West  Innsbruck
...             ...           ...


Insert on Normalization

Exercise on Normalization #2:

Cust_Nr  Cust_Name       Product       Unit_Price  Order_qty
101      Artner & Co     Beau Paysage  450         100
102      Umbrellas&More  Beau Paysage  450         500
103      Umbrella Store  Adelaide      400         1000
101      Artner & Co     Beau Paysage  450         500
102      Umbrellas&More  Adelaide      400         1000
...      ...             ...           ...         ...


Introduction
Relational

multidimensional

[Figure: ER model of the purchasing data with the entities Source, Material, Info record, PO, PR (batch) and PR_line_item; relationship labels: predef._source_in, processed_by, order assigmt.]
Elementary fact in data warehouse

Introduction
Material

Purchase transactions

Purchase organization

Source

Introduction

Step #1 in designing a data warehouse:
Identify your elementary transaction, the dimensions of analysis and the aggregation steps in each dimension


Introduction

Exercise: extract possible facts from the following data model of a cost center accounting system
(to be continued)


Introduction

[Figure: ER model of a cost center accounting system. Recoverable labels: Cost Centre, Perf.-depd., Activity, price, Cost Element, prim. cost, Ass. cost elem., stat. key, Alloc. cycle, Calc. Schema, Base, OH surchrg., Credit; relationship labels: IS A, consumes, attributed, credit to, debits, uses, accordg. to, has]


Introduction
Introducing the Cenox chainstore case:
The dimensions under consideration:
customers (such as Woolworths, K-Mart
and McDonalds),
items (such as oranges, bananas, milk),
days when the transaction was concluded.


Introduction
[Figure: three-dimensional data cube with the axes Item (Item 1 ... Item n), Customer (Customer 1 ... Customer n) and Period (Day 1 ... Day n)]


Introduction
Aggregation
Business objects  Hierarchy levels
Customer:         Customer (individ.) → Customer group → Customer total
Item:             Item (individual) → Item type → Item total
Period:           Day → Month → Year
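
Aggregating along these hierarchies can be sketched as a roll-up from day-level cells to (customer group, item type, month). The group and type memberships and all sales figures below are made up for illustration; only the hierarchy levels come from the slide.

```python
from collections import defaultdict

# Hypothetical memberships; the slides name the levels, not the members.
CUSTOMER_GROUP = {"Woolworths": "Retail", "K-Mart": "Retail",
                  "McDonalds": "Restaurants"}
ITEM_TYPE = {"oranges": "Fruit", "bananas": "Fruit", "milk": "Dairy"}

# (customer, item, day as ISO date) -> sales value; made-up figures.
sales = {
    ("Woolworths", "oranges", "2000-01-01"): 100.0,
    ("K-Mart", "bananas", "2000-01-01"): 50.0,
    ("K-Mart", "milk", "2000-01-02"): 30.0,
    ("McDonalds", "milk", "2000-02-01"): 70.0,
}


def roll_up(cells):
    """Sum day-level cells up to (customer group, item type, month)."""
    totals = defaultdict(float)
    for (customer, item, day), value in cells.items():
        month = day[:7]  # "YYYY-MM" prefix of the ISO date
        totals[(CUSTOMER_GROUP[customer], ITEM_TYPE[item], month)] += value
    return dict(totals)


monthly = roll_up(sales)
print(monthly)
```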


Introduction
Simple graphical representation
*
Customer group
Customer

Item
Y

$_value
no_trans

Item type


Introduction
Naming convention - the CENOX case
Business objects  Level 0                     Level 1              Level 2
Customer:         C_cu customer (individual)  C_gr customer group  C_* customer total
Item:             I_it item (individual)      I_ty item type       I_* item total
Period:           P_d day                     P_m month            P_y year


Introduction
Information cube:
[Figure: cube with the axes Item (levels Item, Item type, Total), Customer (levels Customer, Customer group, Total) and Period (levels Day, Month, Year)]


Introduction
individual hypercube, cube:
[Figure: cube with the axes items (oranges, bananas, milk, tomatoes), customer (Woolworths, K-Mart, Mc Donalds) and period (01/01/00, 02/01/00, 03/01/00, 04/01/00, ...)]


Introduction

Cubes derived from an information cube:
[Figure: information cube with the axes Item (Item, Item type, Total), Customer (Customer, Customer group, Total) and Period (Day, Month, Year)]
(I_it, C_cu, P_d)
(I_ty, C_cu, P_d)
(I_it, C_gr, P_d)
...
(I_*, C_*, P_*)
(I_it, C_cu, P_m)
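
The full set of cubes contained in an information cube is the combination of one hierarchy level per dimension. A minimal sketch, assuming the three levels per dimension of the CENOX naming convention (with P_y as the top period level):

```python
from itertools import product

# Hierarchy levels per dimension, lowest level first (CENOX naming convention).
LEVELS = {
    "Item": ["I_it", "I_ty", "I_*"],
    "Customer": ["C_cu", "C_gr", "C_*"],
    "Period": ["P_d", "P_m", "P_y"],
}

# One cube per combination of levels: 3 x 3 x 3 combinations.
cubes = list(product(*LEVELS.values()))

print(len(cubes))
print(cubes[0])
```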


Introduction

Sample cubes:
[Figure: information cube with the axes Item (Item, Item type, Total), Customer (Customer, Customer group, Total) and Period (Day, Month, Year)]
(I_*, C_cu, P_d)
(I_*, C_gr, P_d)
(I_*, C_*, P_d)


Introduction
Derivatives of a cube
(I_it, C_gr, P_m)
(I_ty, C_gr, P_*)
(I_it, C_gr, P_m)
(I_*, C_gr, P_d)


Introduction
Derivatives of a cube
$v = (a, b, c)^{\top}, \quad w = (x, y, z)^{\top}$
w is a derivative of v, iff $a \le x$ and $b \le y$ and $c \le z$,
with $a \le x$ meaning a is of lower or the same hierarchy level as x in the respective dimension
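
The derivative test is a component-wise comparison of hierarchy levels. The sketch below assumes the level numbering of the CENOX naming convention (0 = individual, 2 = total); `is_derivative` is a hypothetical helper name.

```python
# Hierarchy level of each characteristic (0 = lowest level).
LEVEL = {
    "I_it": 0, "I_ty": 1, "I_*": 2,
    "C_cu": 0, "C_gr": 1, "C_*": 2,
    "P_d": 0, "P_m": 1, "P_y": 2,
}


def is_derivative(w, v):
    """True if w is a derivative of v: each component of v is at a lower
    or the same hierarchy level as the matching component of w."""
    return all(LEVEL[a] <= LEVEL[x] for a, x in zip(v, w))


base = ("I_it", "C_cu", "P_d")
print(is_derivative(("I_ty", "C_gr", "P_m"), base))  # coarser in every dimension
print(is_derivative(base, ("I_ty", "C_gr", "P_m")))  # finer data cannot be derived
```

Two cubes can also be incomparable: (I_*, C_cu, P_d) and (I_it, C_gr, P_d) are not derivatives of each other, because each is finer than the other in one dimension.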


Introduction

characteristics identify an aggregation level, e.g., customer, item, time period
key figures are aggregated, e.g., sales value in $, number of transactions
non-aggregation attributes carry additional information, e.g., customer address


Introduction

Each characteristic and key figure has a value set (in some cases a continuous space)
e.g., customers = {Baumann AG, Meier&Co, ...}
number_of_transactions = $\mathbb{N}^{+}$
length = $\mathbb{R}^{+}$


Introduction

Entries in a cube are defined by the Cartesian product of the value sets of its characteristics
cube c = (I_*, C_cu, P_y):
VS(c) = VS(I_*) × VS(C_cu) × VS(P_y)
[Figure: grid of the customers Baumann AG, Meier & Co, Artner KEG against the years 1990, 1991, 1992]
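
For the cube c = (I_*, C_cu, P_y) the Cartesian product can be enumerated directly. This sketch uses the customers and years shown on the slide and assumes that the value set of I_* consists of the single element "*".

```python
from itertools import product

VS_I_TOTAL = ["*"]  # assumption: the item total has exactly one value
VS_C_CU = ["Baumann AG", "Meier & Co", "Artner KEG"]
VS_P_Y = [1990, 1991, 1992]

# Every cell of the cube is one element of the Cartesian product.
cells = list(product(VS_I_TOTAL, VS_C_CU, VS_P_Y))

print(len(cells))
print(cells[0])
```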


Introduction
Important qualification on dimensions:
=> should show pairwise independence
Example:

*
Customer type
Customer

$_value
no_trans

Rate

Rate group

Introduction
Aggregation operator:
One or more indicated for each combination
of characteristics in a cube
summation, average, extremum
Aggregation:
(key figure, aggregation operator)
e.g., number_of_transactions, summation


Introduction

Operator  Nominal  Ordinal  Interval  Rational
Sum       No       No       No        Yes
Average   No       (Yes)    Yes       Yes
Minimum   No       Yes      Yes       Yes
Maximum   No       Yes      Yes       Yes


Introduction

Nominal: numerical coding without meaningful values
Ordinal: coding represents >< relationships, no meaningful sum
Interval: metric, but have a beginning and/or end, hence, no meaningful sum
Rational: metric, any operation
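
These rules can be encoded as a lookup that a warehouse design tool might consult before offering an aggregation operator. The sketch reflects one reading of the scale definitions above; in particular, the average on ordinal data is excluded here as a cautious assumption, and `may_aggregate` is a hypothetical name.

```python
# Aggregation operators considered meaningful per scale type (assumed reading).
ALLOWED = {
    "nominal": set(),
    "ordinal": {"min", "max"},  # average only with reservations, excluded here
    "interval": {"avg", "min", "max"},
    "rational": {"sum", "avg", "min", "max"},
}


def may_aggregate(scale: str, operator: str) -> bool:
    """Return True if the operator is meaningful for the given scale type."""
    return operator in ALLOWED[scale]


print(may_aggregate("interval", "sum"))
print(may_aggregate("rational", "sum"))
```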


Introduction
*
Additivity

Plant
Storage_location

Stock_
level

Material

Material_group

=> AVG ✓


Introduction
max

min

x
AVG

                            Some dimensions  All dimensions
Some aggregation operators  Semi-additive    Semi-additive
All aggregation operators   Semi-additive    Additive
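
A stock level is a common example of a semi-additive key figure. The sketch below assumes a period dimension in addition to the storage location, which the slides do not show for this example: summation is meaningful across storage locations, whereas across days only an average (or minimum, maximum) makes sense. All numbers are made up.

```python
from statistics import mean

# (storage location, day) -> stock level of one material; made-up figures.
stock = {("SL1", 1): 10, ("SL2", 1): 5, ("SL1", 2): 8, ("SL2", 2): 7}


def plant_stock_per_day(levels):
    """Sum the stock over all storage locations, separately for each day."""
    per_day = {}
    for (_, day), quantity in levels.items():
        per_day[day] = per_day.get(day, 0) + quantity
    return per_day


per_day = plant_stock_per_day(stock)      # additive across storage locations
avg_over_days = mean(per_day.values())    # averaging across days is meaningful
naive_sum = sum(per_day.values())         # not a stock level at any point in time

print(per_day, avg_over_days, naive_sum)
```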


Introduction

Step #2 in designing a data warehouse:
Identify your aggregation hierarchy and the key figures to aggregate


Introduction

Exercise: extract possible dimensions and hierarchy levels from the following data model of a cost center accounting system


Introduction

[Figure: ER model of a cost center accounting system. Recoverable labels: Cost Centre, Perf.-depd., Activity, price, Cost Element, prim. cost, Ass. cost elem., stat. key, Alloc. cycle, Calc. Schema, Base, OH surchrg., Credit; relationship labels: IS A, consumes, attributed, credit to, debits, uses, accordg. to, has]


Key Terms
I want to keep one dimension at a fixed value
[Figure: cube with the axes Item (Item 1 ... Item n), Customer (Customer 1 ... Customer n) and Period (Day 1 ... Day n); with Customer 3 fixed, an Item × Period slice remains]
=> Slicing
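
Slicing can be sketched on a cube stored as a dictionary keyed by (customer, item, day). The cell values are made up, and `slice_cube` is a hypothetical helper name.

```python
# (customer, item, day) -> key figure; made-up values.
cube = {
    ("Customer 1", "Item 1", "Day 1"): 10.0,
    ("Customer 3", "Item 1", "Day 1"): 4.0,
    ("Customer 3", "Item 2", "Day 2"): 6.0,
}


def slice_cube(cells, customer):
    """Fix the customer dimension and return the remaining Item x Period slice."""
    return {
        (item, day): value
        for (cell_customer, item, day), value in cells.items()
        if cell_customer == customer
    }


customer_3 = slice_cube(cube, "Customer 3")
print(customer_3)
```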


Key Terms
[Figure: cube with the axes Item (Item 1 ... Item n), Customer (Customer 1 ... Customer n) and Period (Day 1 ... Day n), redrawn with dimensions exchanged]
=> Pivoting

Key Terms
[Figure: cube with the axes Item (Item 1 ... Item n), Customer (Customer 1 ... Customer n) and Period (Day 1 ... Day n), next to the Item × Period cube obtained for Customer = *]
=> Dicing

Key Terms
Slicing & dicing used for user-specific data
presentation and access rights:

marketing
manager

controller

product manager

top manager or
ad hoc view

customer
i
t
e
m

revenue

period


Key Terms

Items total

Food

Oranges

Dairy

Bananas

Milk

Vegetables

Tomatoes

Drill through


Key Terms

Items total
Drill down
Food

Oranges

Dairy

Bananas

Milk

Vegetables

Tomatoes

Roll up


Key Terms
Cube: filter for customers with revenue p.a. in excess of $100,000.
[Figure: cube with the axes Item (Item 1 ... Item n), Customer and Period (Day 1 ... Day n) in which only Customer 3, Customer 5, Customer 6, Customer 12, ..., Customer n remain]
=> Filtering (screening, selection)
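
A filter of this kind first aggregates the key figure per customer and then keeps only the cells of the qualifying customers. A minimal sketch with made-up revenue figures; `filter_customers` is a hypothetical helper name.

```python
from collections import defaultdict

# (customer, item, day) -> revenue; made-up figures.
cube = {
    ("Customer 3", "Item 1", "Day 1"): 80_000.0,
    ("Customer 3", "Item 2", "Day 2"): 30_000.0,
    ("Customer 4", "Item 1", "Day 1"): 90_000.0,
}


def filter_customers(cells, threshold):
    """Keep only cells of customers whose total revenue exceeds the threshold."""
    revenue = defaultdict(float)
    for (customer, _, _), value in cells.items():
        revenue[customer] += value
    keep = {customer for customer, total in revenue.items() if total > threshold}
    return {key: value for key, value in cells.items() if key[0] in keep}


filtered = filter_customers(cube, 100_000)
print(sorted({customer for customer, _, _ in filtered}))
```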

Introduction

Step #3 in designing a data warehouse:
Make a user-oriented warehouse model, relate warehouse model to data model in operational database


Introduction

Step #4 in designing a data warehouse:
Define your tables and input paths


Conceptual Modeling

Conceptual Model: user view, database-independent; in operational systems: ERM or SOM
Logical Model: database-dependent, represents data structures to be stored, e.g., relational model
Physical Model: implementation of the logical model on storage medium, indices, blocks, etc., product-dependent


Conceptual Modeling
Operational IS

Aggregates

Aggregation
Path Array
Dimensional
Fact Modeling

DW


Dimensional Fact Modeling

Dimensional Fact Modeling (DFM)


developed by Golfarelli, Maio and Rizzi (1998)
facts - event of interest
dimensions
hierarchies
measures the facts from different points of view

Dimensional Fact Modeling

The Sales Example revisited:

Business objects  Level 0                     Level 1              Level 2
Customer:         C_cu customer (individual)  C_gr customer group  C_* customer total
Item:             I_it item (individual)      I_ty item type       I_* item total
Period:           P_d day                     P_m month            P_y year


Dimensional Fact Modeling


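[Figure: ER schema of the sales example. Entities and attributes: ITEM (ITEM NUMBER, UNIT PRICE, SIZE), ITEM-TYPE (ITEM-TYPE NUMBER), PURCHASE TICKET (TICKET NUMBER, DATE), CUSTOMER (CUSTOMER NUMBER, ADDRESS), CUSTOMER GROUP (CUSTOMER GROUP NUMBER); relationship SALES (QUANTITY) between PURCHASE TICKET and ITEM; further relationships labelled FOR and OF]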


Dimensional Fact Modeling

Building attribute tree


Pruning & grafting
Dimensions & hierarchies
Defining fact attributes


Dimensional Fact Modeling


After identifying the fact of interest, the attribute tree is built as follows:
1. Each attribute within the ER schema becomes a node.
2. The identifying attribute of the chosen fact (sales in our case) becomes the root, which is marked in yellow (in the graphical illustration).
[Figure: the attributes as unconnected nodes: item number + ticket number (root), item number, unit price, size, item type number, ticket number, date, customer number, address, customer group number, quantity]


Dimensional Fact Modeling


3. Attach all attributes of one entity to the primary key of this entity.
4. The nodes representing the identifying attributes are connected to each other according to their corresponding relationships in the ER schema.

item number + ticket number (root)
  quantity
  item number
    unit price
    size
    item type number
  ticket number
    date
    customer number
      address
      customer group number
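
The four construction steps can be sketched as a recursive function over a description of the ER schema. The two dictionaries below are an assumed encoding of the sales example as read from the slides; `build_tree` is a hypothetical name.

```python
# Identifier -> other attributes of the same entity (steps 1 and 3).
ATTRIBUTES = {
    "item number + ticket number": ["quantity"],
    "item number": ["unit price", "size"],
    "ticket number": ["date"],
    "customer number": ["address"],
}

# Identifier -> identifiers reached through a relationship (step 4).
RELATIONSHIPS = {
    "item number + ticket number": ["item number", "ticket number"],
    "item number": ["item type number"],
    "ticket number": ["customer number"],
    "customer number": ["customer group number"],
}


def build_tree(identifier):
    """Return the subtree below identifier as nested dictionaries."""
    children = {attribute: {} for attribute in ATTRIBUTES.get(identifier, [])}
    for target in RELATIONSHIPS.get(identifier, []):
        children[target] = build_tree(target)
    return children


ROOT = "item number + ticket number"  # identifier of the fact (step 2)
tree = {ROOT: build_tree(ROOT)}
print(sorted(tree[ROOT]))
```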


Dimensional Fact Modeling

Building attribute tree


Pruning & grafting
Dimensions & hierarchies
Defining fact attributes


Dimensional Fact Modeling


Pruning: dropping a complete subtree
Grafting: deleting a node, but its descendants are preserved
In our example: no pruning, ticket number grafted
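
Both operations are simple tree transformations. The sketch below represents the attribute tree as nested dictionaries, which is an assumption made for illustration; `prune` and `graft` are hypothetical helper names.

```python
def prune(tree, node):
    """Drop node together with its complete subtree."""
    return {key: prune(sub, node) for key, sub in tree.items() if key != node}


def graft(tree, node):
    """Delete node but attach its descendants to the node's parent."""
    result = {}
    for key, sub in tree.items():
        if key == node:
            result.update(graft(sub, node))
        else:
            result[key] = graft(sub, node)
    return result


# Attribute tree of the sales example below the root (assumed structure).
tree = {
    "quantity": {},
    "item number": {"unit price": {}, "size": {}, "item type number": {}},
    "ticket number": {
        "date": {},
        "customer number": {"address": {}, "customer group number": {}},
    },
}

grafted = graft(tree, "ticket number")   # date and customer number move up
pruned = prune(tree, "customer number")  # address and group number disappear
print(sorted(grafted))
```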


Dimensional Fact Modeling

item number + ticket number (root)
  quantity
  date
  item number
    unit price
    size
    item type number
  customer number
    address
    customer group number


Dimensional Fact Modeling

Building attribute tree


Pruning & grafting
Dimensions & hierarchies
Defining fact attributes


Dimensional Fact Modeling

Dimensions chosen:
[Figure: fact SALE (quantity sold, revenue) with the dimensions item (type, size), customer (customer group) and date (day, month)]


Dimensional Fact Modeling


Hierarchy levels chosen:
attributes must be arranged into a tree so that there is a many-to-one relationship between each node and its descendants
From ER model:
item → item type
customer → customer group
all elements of the date: day → month → year


Dimensional Fact Modeling

Building attribute tree


Pruning & grafting
Dimensions & hierarchies
Defining fact attributes


Dimensional Fact Modeling

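[Figure: fact scheme SALE with the fact attributes quantity sold and revenue and the dimensions item (type, size), customer (group, address) and day (month); the callouts mark a hierarchy, a non-dimension attribute, the fact and a fact attribute]
If no fact attribute is given: simple counter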


Dimensional Fact Modeling

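[Figure: fact scheme SALE with the dimensions item (type, size), customer (group, address) and day (month) and the fact attributes quantity sold, revenue, number of customers and inventory level; AVG is marked as aggregation operator]
Some semi-additive facts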



Dimensional Fact Modeling


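Glossary:
[Figure: ER excerpt with the relationship SALES (QUANTITY) between PURCHASE TICKET and ITEM (UNIT PRICE), mapped to the fact SALE with the fact attributes quantity sold and revenue]
quantity sold = SUM (SALE.qty)
revenue = SUM (SALE.qty * SALE.unit price)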
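
The two glossary formulae translate directly into sums over the elementary sales lines. The field names `qty` and `unit_price` and the figures are illustrative assumptions.

```python
# One entry per elementary SALE line; made-up figures.
sale_lines = [
    {"qty": 2, "unit_price": 1.5},
    {"qty": 10, "unit_price": 0.5},
]

# quantity sold = SUM (SALE.qty)
quantity_sold = sum(line["qty"] for line in sale_lines)

# revenue = SUM (SALE.qty * SALE.unit price)
revenue = sum(line["qty"] * line["unit_price"] for line in sale_lines)

print(quantity_sold, revenue)
```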


Dimensional Fact Modeling


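Extension: parallel hierarchies
[Figure: fact scheme SALE (quantity sold, revenue) in which item carries two parallel hierarchies, item → type → category and item → vendor → group, next to the attribute size, the customer dimension (customer group, address) and the period dimension (day, month)]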


Dimensional Fact Modeling


Extension: parallel hierarchies
Dimension  Level=0              Level=1             Level=2               Level=3
Customer:  C_cu → Indiv. Cust.  C_gr → Cust. group  C_* → Cust. total
Item:      I_it → Indiv. item   I_ty → Item type    I_ca → Item category  I_* → Item total
Vendor:                         V_v → Vendor        V_gr → Vendor group   V_* → Vendor total
Period:    P_d → Day            P_m → Month         P_y → Year


Dimensional Fact Modeling

Important note:
Do not confuse the necessity of dimension
independence with parallel hierarchies

Dimension   Abbr.   Level=0
customer:   C_cu    customer   ==>
item:       I_it    item       ==>
vendor:     I_it    item       ==>
period:     P_d     day        ==>

Still mutually independent


Dimensional Fact Modeling

Review of method:
+ graphical representation
+ indicates relationship options
+ ability to represent parallel hierarchies
+ works without fact attributes
- may become difficult to manage in large cases
- cannot depict derivatives in a systematic way
- formulae for formation of aggregates cannot be
included into the model


Dimensional Fact Modeling - Example

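[Figure: ERM of the Internet example.
Entities: INTERNET USER (USER-NUMBER, NAME), USER-TYPE (USER-TYPE NUMBER), WEB PAGE (WEB PAGE NUMBER),
PAGEGROUP (PAGE GROUP NUMBER), PERIOD (DATE).
Relationships: ACCESS (attribute NUMBER OF VISITS) links INTERNET USER, WEB PAGE and PERIOD;
INTERNET USER n:1 USER-TYPE; WEB PAGE n:1 PAGEGROUP.]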


Dimensional Fact Modeling - Example

# Derive the Dimensional Fact Model from the ERM


by building an attribute tree, defining dimensions,
fact attributes and hierarchies, and compile a glossary
of the fact attributes.
# Include an additional fact attribute counting the
number of users!


The Lattice
The Sales Case revisited:

Business objects  Level 0                     Level 1              Level 2
Customer:         C_cu customer (individual)  C_gr customer group  C_* customer total
Item:             I_it item (individual)      I_ty item type       I_* item total
Period:           P_d day                     P_m month            P_y year


The Lattice
[Figure: cubes at different aggregation levels, from the base cube (I_it, C_cu, P_d) via cubes
such as (I_ty, C_cu, P_d), (I_it, C_gr, P_d) and (I_it, C_cu, P_m) up to the total (I_*, C_*, P_*)]


The Lattice
(I_it,C_cu,P_d)
(I_ty,C_cu,P_d)  (I_it,C_gr,P_d)  (I_it,C_cu,P_m)
(I_*,C_cu,P_d)  (I_ty,C_gr,P_d)  (I_ty,C_cu,P_m)  (I_it,C_*,P_d)  (I_it,C_gr,P_m)  (I_it,C_cu,P_y)
(I_*,C_gr,P_d)  (I_*,C_cu,P_m)  (I_ty,C_*,P_d)  (I_ty,C_gr,P_m)  (I_ty,C_cu,P_y)  (I_it,C_*,P_m)  (I_it,C_gr,P_y)
(I_*,C_*,P_d)  (I_*,C_gr,P_m)  (I_*,C_cu,P_y)  (I_ty,C_*,P_m)  (I_ty,C_gr,P_y)  (I_it,C_*,P_y)
(I_*,C_*,P_m)  (I_*,C_gr,P_y)  (I_ty,C_*,P_y)
(I_*,C_*,P_y)


The Lattice

An aggregation path through the lattice:


(I_it,C_gr,P_d),(I_ty,C_gr,P_d),(I_*,C_gr,P_d),
(I_*,C_gr,P_m),(I_*,C_gr,P_y),(I_*,C_*,P_y)
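
A minimal Python sketch of the lattice and of a check for aggregation paths. It assumes that one step along a path raises exactly one dimension by exactly one hierarchy level; this reading of the lattice edges and the function names are assumptions made for illustration.

```python
from itertools import product

# Hierarchy levels per dimension, lowest level first, as in the Sales Case.
DIMENSIONS = [
    ["I_it", "I_ty", "I_*"],
    ["C_cu", "C_gr", "C_*"],
    ["P_d", "P_m", "P_y"],
]

def lattice_nodes(dimensions):
    """All combinations of one hierarchy level per dimension."""
    return list(product(*dimensions))

def is_aggregation_path(path, dimensions):
    """True if every step raises exactly one dimension by one level (assumed rule)."""
    def levels(node):
        return [dim.index(name) for dim, name in zip(dimensions, node)]
    for current, following in zip(path, path[1:]):
        diffs = [b - a for a, b in zip(levels(current), levels(following))]
        if sorted(diffs) != [0] * (len(dimensions) - 1) + [1]:
            return False
    return True

# The aggregation path given on the slide.
path = [
    ("I_it", "C_gr", "P_d"), ("I_ty", "C_gr", "P_d"), ("I_*", "C_gr", "P_d"),
    ("I_*", "C_gr", "P_m"), ("I_*", "C_gr", "P_y"), ("I_*", "C_*", "P_y"),
]
nodes = lattice_nodes(DIMENSIONS)
path_ok = is_aggregation_path(path, DIMENSIONS)
```

With three levels in each of the three dimensions, `nodes` holds the 3 * 3 * 3 combinations of the lattice.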


The Lattice

(I_it,C_cu,P_d)
(I_ty,C_cu,P_d)  (I_it,C_gr,P_d)  (I_it,C_cu,P_m)
(I_*,C_cu,P_d)  (I_ty,C_gr,P_d)  (I_ty,C_cu,P_m)  (I_it,C_*,P_d)  (I_it,C_gr,P_m)  (I_it,C_cu,P_y)
(I_*,C_gr,P_d)  (I_*,C_cu,P_m)  (I_ty,C_*,P_d)  (I_ty,C_gr,P_m)  (I_ty,C_cu,P_y)  (I_it,C_*,P_m)  (I_it,C_gr,P_y)
(I_*,C_*,P_d)  (I_*,C_gr,P_m)  (I_*,C_cu,P_y)  (I_ty,C_*,P_m)  (I_ty,C_gr,P_y)  (I_it,C_*,P_y)
(I_*,C_*,P_m)  (I_*,C_gr,P_y)  (I_ty,C_*,P_y)
(I_*,C_*,P_y)


The Lattice

Review of the method:
+ Each combination of hierarchical dimension steps
can be addressed separately
+ Derivatives can be shown
+ The lattice shows all combinations of hierarchical steps
in a redundancy-free way
+ The lattice is complete
- Complex in user dialogue
- Unwieldy in big cases


The Lattice
General notation based on the lattice:

$d_i^{j(i)}$: hierarchy level j in dimension i

Dimension 1 (item):      I_it  I_ty  I_*
Dimension 2 (customer):  C_cu  C_gr  C_*
Dimension 3 (period):    P_d   P_m   P_y

$(I\_*, C\_cu, P\_m) = (d_1^2, d_2^0, d_3^1)$


Aggregation Path Array

Prosser, Ossimitz (2000)


Intended to depict all possible cubes in a
warehouse at a glance
Supports the planning of materialized views


Aggregation Path Array


How to build it:
STEP 1: Concatenate dimensions, beginning at
level 1, the dimension with the most hierarchy
levels being the last:

$d_1^1, d_1^2, \ldots, d_1^{j(1)}, \ldots, d_{n-1}^1, d_{n-1}^2, \ldots, d_{n-1}^{j(n-1)}, d_n^1, \ldots, d_n^{j(n)}$

In our example:
I_ty, I_*, C_gr, C_*, P_m, P_y


Aggregation Path Array


STEP 2: Take the last dimension as a block and permute
it through dimension n-1:

$d_1^1, d_1^2, \ldots, d_1^{j(1)}, \ldots, d_{n-1}^1, d_{n-1}^2, \ldots, d_{n-1}^{j(n-1)}, d_n^1, \ldots, d_n^{j(n)}$
$d_1^1, d_1^2, \ldots, d_1^{j(1)}, \ldots, d_{n-1}^1, d_{n-1}^2, \ldots, d_n^1, \ldots, d_n^{j(n)}, d_{n-1}^{j(n-1)}$
$\ldots$
$d_1^1, d_1^2, \ldots, d_1^{j(1)}, \ldots, d_{n-1}^1, d_n^1, \ldots, d_n^{j(n)}, d_{n-1}^2, \ldots, d_{n-1}^{j(n-1)}$
$d_1^1, d_1^2, \ldots, d_1^{j(1)}, \ldots, d_n^1, \ldots, d_n^{j(n)}, d_{n-1}^1, d_{n-1}^2, \ldots, d_{n-1}^{j(n-1)}$

In our example:
I_ty, I_*, C_gr, C_*, P_m, P_y
I_ty, I_*, C_gr, P_m, P_y, C_*
I_ty, I_*, P_m, P_y, C_gr, C_*


Aggregation Path Array


STEP 3: Take the last two dimensions as a block and
permute it through dimension n-2; at each permutation,
repeat the permutation of dimension n through dimension n-1 (STEP 2):
I_ty, I_*, C_gr, C_*, P_m, P_y
I_ty, I_*, C_gr, P_m, P_y, C_*
I_ty, I_*, P_m, P_y, C_gr, C_*
I_ty, C_gr, C_*, P_m, P_y, I_*
I_ty, C_gr, P_m, P_y, C_*, I_*
I_ty, P_m, P_y, C_gr, C_*, I_*


Aggregation Path Array

STEP 4: Repeat step 3 for all remaining dimensions
until dimension 1 is reached
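
Steps 1 to 4 can be read as a recursion: the block of all later dimensions is moved through the levels of the current dimension, and at each position the inner permutations are repeated. The following Python sketch implements this reading; the function name `build_apa` and the recursive formulation are assumptions, not the authors' tool.

```python
def build_apa(dimensions):
    """Build the rows of an Aggregation Path Array.

    `dimensions` lists, per dimension, its hierarchy levels from level 1
    upwards; the dimension with the most levels is expected to come last.
    """
    if len(dimensions) == 1:
        return [list(dimensions[0])]
    first = list(dimensions[0])
    inner_rows = build_apa(dimensions[1:])
    rows = []
    # Move the block of all later dimensions leftwards through `first`,
    # one level at a time, repeating the inner permutations at each position.
    for split in range(len(first), -1, -1):
        for inner in inner_rows:
            rows.append(first[:split] + inner + first[split:])
    return rows

# Sales Case: levels 1 and 2 of item, customer and period.
apa = build_apa([["I_ty", "I_*"], ["C_gr", "C_*"], ["P_m", "P_y"]])
for row in apa:
    print(" ".join(row))
```

For the Sales Case this yields nine rows of six entries each, beginning with the concatenation from STEP 1.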


Aggregation Path Array


APA of the Sales Case (left: dimension number of each entry, right: the entries):

1 1 2 2 3 3    I_ty  I_*   C_gr  C_*   P_m   P_y
1 1 2 3 3 2    I_ty  I_*   C_gr  P_m   P_y   C_*
1 1 3 3 2 2    I_ty  I_*   P_m   P_y   C_gr  C_*
1 2 2 3 3 1    I_ty  C_gr  C_*   P_m   P_y   I_*
1 2 3 3 2 1    I_ty  C_gr  P_m   P_y   C_*   I_*
1 3 3 2 2 1    I_ty  P_m   P_y   C_gr  C_*   I_*
2 2 3 3 1 1    C_gr  C_*   P_m   P_y   I_ty  I_*
2 3 3 2 1 1    C_gr  P_m   P_y   C_*   I_ty  I_*
3 3 2 2 1 1    P_m   P_y   C_gr  C_*   I_ty  I_*


Aggregation Path Array


Each entry in APA shows a cube:

[Figure: entries of the APA linked to the cubes they denote:
(I_ty, C_cu, P_d), (I_ty, C_*, P_y) and (I_*, C_*, P_*)]

I_ty  I_*   C_gr  C_*   P_m   P_y
I_ty  I_*   C_gr  P_m   P_y   C_*
I_ty  I_*   P_m   P_y   C_gr  C_*
I_ty  C_gr  C_*   P_m   P_y   I_*
I_ty  C_gr  P_m   P_y   C_*   I_*
I_ty  P_m   P_y   C_gr  C_*   I_*
C_gr  C_*   P_m   P_y   I_ty  I_*
C_gr  P_m   P_y   C_*   I_ty  I_*
P_m   P_y   C_gr  C_*   I_ty  I_*


Aggregation Path Array


Redundancy-free set in APA:
Beginning from the left go right until you find a
cube not mentioned in the preceding line,
select all cubes until end of last dimension.

I_ty  I_*   C_gr  C_*   P_m   P_y
I_ty  I_*   C_gr  P_m   P_y   C_*
I_ty  I_*   P_m   P_y   C_gr  C_*
I_ty  C_gr  C_*   P_m   P_y   I_*
I_ty  C_gr  P_m   P_y   C_*   I_*
I_ty  P_m   P_y   C_gr  C_*   I_*
C_gr  C_*   P_m   P_y   I_ty  I_*
C_gr  P_m   P_y   C_*   I_ty  I_*
P_m   P_y   C_gr  C_*   I_ty  I_*
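
The mapping from APA entries to cubes can be sketched in Python as follows. The sketch assumes that an entry denotes the cube reached after reading its row from the left up to that entry, starting from the base cube (I_it, C_cu, P_d). The duplicate filter is a plain "first occurrence over all earlier rows" rule, a simplification that is not necessarily identical to the selection rule stated on the slide; all function names are hypothetical.

```python
# Level-0 element of each dimension (the base cube).
BASE = {"I": "I_it", "C": "C_cu", "P": "P_d"}

# APA of the Sales Case, one row per line.
APA = [row.split() for row in """
I_ty I_* C_gr C_* P_m P_y
I_ty I_* C_gr P_m P_y C_*
I_ty I_* P_m P_y C_gr C_*
I_ty C_gr C_* P_m P_y I_*
I_ty C_gr P_m P_y C_* I_*
I_ty P_m P_y C_gr C_* I_*
C_gr C_* P_m P_y I_ty I_*
C_gr P_m P_y C_* I_ty I_*
P_m P_y C_gr C_* I_ty I_*
""".strip().splitlines()]

def cube_at(row, column):
    """Cube reached after reading `row` from the left up to `column` (0-based)."""
    cube = dict(BASE)
    for entry in row[: column + 1]:
        cube[entry[0]] = entry  # first letter names the dimension (I, C or P)
    return (cube["I"], cube["C"], cube["P"])

def first_occurrences(apa):
    """Per row, the cubes that did not appear in any earlier row (assumed rule)."""
    seen, result = set(), []
    for row in apa:
        fresh = []
        for column in range(len(row)):
            cube = cube_at(row, column)
            if cube not in seen:
                seen.add(cube)
                fresh.append(cube)
        result.append(fresh)
    return result

fresh_per_row = first_occurrences(APA)
distinct_cubes = sum(len(fresh) for fresh in fresh_per_row)
```

Together with the base cube, the distinct cubes found this way cover the whole lattice of the Sales Case.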


Aggregation Path Array

Hint:
A tool to automatically draw and analyze
the APA can be downloaded from:
http://erp.wu-wien.ac.at
Java Applet, self extractor:
http://erp.wu-wien.ac.at/install.exe


Conceptual Modeling

Conceptual Model: user view, database independent
in operational systems: ERM or SOM
Logical Model: database-dependent, represents data
structures to be stored, e.g., relational model
Physical Model: implementation of the logical model
on storage medium, indices, blocks, etc.,
product-dependent


Relational Representation
Ingredients to a relational schema representing
dimensional data:
A fact table (FT) containing a primary key, with one pointer
(foreign key) for each dimension and additional columns
for additive, numeric facts.
A dimension table (DT) for each dimension with a
generated key and a level indicator.


Relational Representation
DT_Item           Fact Table   DT_Customer
I_Key             I_Key        C_Key
I_it              C_Key        C_cu
I_description     P_Key        C_description
I_ty              Qty Sold     C_gr
I_ty_description  Revenue      C_gr_description
level                          level

DT_Period
P_Key
P_d
P_m
level

Relational Representation

Dimension Table Item

I_Key  I_it     I_description           I_ty  I_ty_description  level
K0001  1234567  Oranges Naval - S.A.    1     fruits            0
K0002  1234789  Banana Rosa - Victoria  1     fruits            0
K0003  2345678  Milk - Dairy Farmers    2     dairy             0
K5000  *        *                       1     fruits            1
K5700  *        *                       2     dairy             1
K9999  *        *                       *     *                 2

C_cu
C_gr
C_*


Relational Representation

Dimension Table Customer

C_Key   C_cu     Cu_description  C_gr  C_gr_description  level  address
K00001  9876543  Woolworth       1     foodstore         0      207, View Rd, NSW, 9910
K00002  9874321  K-Mart          1     foodstore         0      12, Creek Ave, NSW, 6677
K00003  8912345  Mc Donalds      2     fastfood          0      403, Flood St, NSW, 1109
K19999  *        *               1     foodstore         1
K29999  *        *               2     fastfood          1
K99999  *        *               *     *                 2


Relational Representation

Dimension Table Period

P_Key    P_d  P_description  P_m  P_m_description  P_y  P_y_description  level
K000001  1    1-Jan-2000     1    Jan., 2000       1    Year, 2000       0
K000002  2    2-Jan-2000     1    Jan., 2000       1    Year, 2000       0
K000003  3    3-Jan-2000     1    Jan., 2000       1    Year, 2000       0
...      ...  ...            ...  ...              ...  ...              ...
K000170  170  18-Jun-2000    6    June, 2000       1    Year, 2000       0
K000171  171  19-Jun-2000    6    June, 2000       1    Year, 2000       0
...      ...  ...            ...  ...              ...  ...              ...
K900001  *    *              1    Jan., 2000       1    Year, 2000       1
K900002  *    *              2    Feb., 2000       1    Year, 2000       1
...      ...  ...            ...  ...              ...  ...              ...
K999001  *    *              *    *                1    Year, 2000       2


Relational Representation
Dimension Table Item

I_Key  I_it     I_description           I_ty  I_ty_description  level
K0001  1234567  Oranges Naval - S.A.    1     fruits            0
K0002  1234789  Banana Rosa - Victoria  1     fruits            0
K0003  2345678  Milk - Dairy Farmers    2     dairy             0
K5000  *        *                       1     fruits            1
K5700  *        *                       2     dairy             1
K9999  *        *                       *     *                 2

Find all individual items of type 2:
Find all entries
where type = 2 and level = 0
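
A minimal sketch of this lookup using Python's built-in `sqlite3` module. The table and column names follow the slide; the choice of SQLite and of text columns for the keys is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE DT_Item (
        I_Key TEXT PRIMARY KEY, I_it TEXT, I_description TEXT,
        I_ty TEXT, I_ty_description TEXT, level INTEGER
    )
""")
conn.executemany("INSERT INTO DT_Item VALUES (?, ?, ?, ?, ?, ?)", [
    ("K0001", "1234567", "Oranges Naval - S.A.", "1", "fruits", 0),
    ("K0002", "1234789", "Banana Rosa - Victoria", "1", "fruits", 0),
    ("K0003", "2345678", "Milk - Dairy Farmers", "2", "dairy", 0),
    ("K5000", "*", "*", "1", "fruits", 1),
    ("K5700", "*", "*", "2", "dairy", 1),
    ("K9999", "*", "*", "*", "*", 2),
])

# type = 2 alone would also return the aggregate entry K5700;
# level = 0 restricts the result to individual items.
items_of_type_2 = conn.execute(
    "SELECT I_Key, I_description FROM DT_Item "
    "WHERE I_ty = ? AND level = ? ORDER BY I_Key",
    ("2", 0),
).fetchall()
```

The level indicator is what separates individual items from the aggregate entries stored in the same dimension table.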


Relational Representation
Dimension Table Customer

C_Key   C_cu     Cu_description  C_gr  C_gr_description  level  address
K00001  9876543  Woolworth       1     foodstore         0      207, View Rd, NSW, 9910
K00002  9874321  K-Mart          1     foodstore         0      12, Creek Ave, NSW, 6677
K00003  8912345  Mc Donalds      2     fastfood          0      403, Flood St, NSW, 1109
K19999  *        *               1     foodstore         1
K29999  *        *               2     fastfood          1
K99999  *        *               *     *                 2

Find all individual customers in group 1


Relational Representation
DT_Item           Fact Table   DT_Customer
I_Key             I_Key        C_Key
I_it              C_Key        C_cu
I_description     P_Key        C_description
I_ty              Qty Sold     C_gr
I_ty_description  Revenue      C_gr_description
level                          level

DT_Period
P_Key
P_d
P_m
level

Build a table showing revenue by individual
customers and items irrespective of the day

Find all customers, items, revenue
where FactTable.C_Key = DT_Customer.C_Key
and level(DT_Customer) = 0
and FactTable.I_Key = DT_Item.I_Key
and level(DT_Item) = 0
Group output by Customer and Item
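
The query can be sketched with Python's built-in `sqlite3` module. The tables below are reduced to the columns the query needs, and the fact rows are a small excerpt. One assumption goes beyond the slide: because the fact table also stores aggregate rows for months and years, the sketch restricts the period dimension to level 0 and sums over the days, so that no revenue is counted twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DT_Customer (C_Key TEXT PRIMARY KEY, C_cu TEXT, level INTEGER);
    CREATE TABLE DT_Item     (I_Key TEXT PRIMARY KEY, I_it TEXT, level INTEGER);
    CREATE TABLE DT_Period   (P_Key TEXT PRIMARY KEY, P_d TEXT, level INTEGER);
    CREATE TABLE FactTable   (P_Key TEXT, C_Key TEXT, I_Key TEXT,
                              Qty_Sold INTEGER, Revenue INTEGER);
""")
conn.executemany("INSERT INTO DT_Customer VALUES (?, ?, ?)",
                 [("K00001", "9876543", 0), ("K19999", "*", 1)])
conn.executemany("INSERT INTO DT_Item VALUES (?, ?, ?)",
                 [("K0001", "1234567", 0), ("K0002", "1234789", 0), ("K5000", "*", 1)])
conn.executemany("INSERT INTO DT_Period VALUES (?, ?, ?)",
                 [("K000001", "1", 0), ("K000002", "2", 0), ("K900001", "*", 1)])
conn.executemany("INSERT INTO FactTable VALUES (?, ?, ?, ?, ?)", [
    ("K000001", "K00001", "K0001", 1000, 1200),
    ("K000001", "K00001", "K0002", 1111, 1667),
    ("K000002", "K00001", "K0002", 1100, 1650),
    ("K000001", "K00001", "K5000", 2111, 2867),    # aggregate row (item type level)
    ("K900001", "K00001", "K0001", 19000, 22800),  # aggregate row (month level)
])

revenue_by_customer_item = conn.execute("""
    SELECT c.C_cu, i.I_it, SUM(f.Revenue)
    FROM FactTable AS f
    JOIN DT_Customer AS c ON f.C_Key = c.C_Key
    JOIN DT_Item     AS i ON f.I_Key = i.I_Key
    JOIN DT_Period   AS p ON f.P_Key = p.P_Key
    WHERE c.level = 0 AND i.level = 0 AND p.level = 0
    GROUP BY c.C_cu, i.I_it
    ORDER BY c.C_cu, i.I_it
""").fetchall()
```

Dropping the `p.level = 0` condition would add the month row to the daily rows of the same item.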

Relational Representation
DT_ Item
I_Key
I_it
I_description
I_ty
I_ty_description
level

Fact Table
I_Key
C_Key
P_Key
Qty Sold
Revenue

DT_ Customer
C_Key
C_cu
C_description
C_gr
C_gr_description
level

DT_Period
P_Key
P_d
P_m
level

Build a table showing revenue by individual


customers and items irrespective of the day


Relational Representation
How to determine the size of a warehouse:
Example:
Item: 1,000 items, 50 item types, *
Customer: 10,000 customers, 20 groups, *
Time: 1750 days, 60 months, 5 years


Relational Representation
Dimension Table Item

I_Key  I_it     I_description           I_ty  I_ty_description  level
K0001  1234567  Oranges Naval - S.A.    1     fruits            0
K0002  1234789  Banana Rosa - Victoria  1     fruits            0
K0003  2345678  Milk - Dairy Farmers    2     dairy             0
K5000  *        *                       1     fruits            1
K5700  *        *                       2     dairy             1
K9999  *        *                       *     *                 2

Each item, item type, and * is one entry in
the dimension table

=> 1051 entries in DT_Item


Relational Representation
Fact Table

P_Key    C_Key   I_Key  Qty Sold  Revenue
K000001  K00001  K0001  1,000     1,200
K000001  K00001  K0002  1,111     1,667
K000001  K00002  K0001  700       840
K000001  K00002  K0003  20,000    18,000
K000001  K00003  K0001  900       1,080
K000001  K00003  K0003  4,000     3,600
K000002  K00001  K0002  1,100     1,650
K000002  K00001  K0003  10,000    9,000
K000002  K00002  K0001  980       1,470
...      ...     ...    ...       ...
K900001  K00001  K0001  19,000    22,800
K900001  K00001  K0002  17,150    25,725
K900001  K00001  K0003  16,900    15,210
K900001  K00002  K0001  17,890    21,468
...      ...     ...    ...       ...
K999001  K00001  K0001  110,100   110,101
K999001  K00001  K0002  89,600    134,400
K999001  K00001  K0003  96,100    86,490
...      ...     ...    ...       ...
K000001  K00001  K5000  2,111     2,867
K000001  K00002  K5000  700       840
K000001  K00002  K5700  20,000    18,000
K000001  K00003  K5000  900       1,080
K000001  K00003  K5700  4,000     3,600
K000001  K00001  K9999  2,111     2,867
K000001  K00002  K9999  20,700    18,840
K000001  K00003  K9999  4,900     4,680
K000001  K99999  K9999  27,711    26,387
...      ...     ...    ...       ...

1,051 I_Keys
10,021 C_Keys
1,815 P_Keys
Fact Table is Cartesian Product of DimTabs
Possible(!) FT entries:
1051 * 10021 * 1815 = 19.12 b


Relational Representation
Fact Table

P_Key    C_Key   I_Key  Qty Sold  Revenue
K000001  K00001  K0001  1,000     1,200
K000001  K00001  K0002  1,111     1,667
K000001  K00002  K0001  700       840
K000001  K00002  K0003  20,000    18,000
K000001  K00003  K0001  900       1,080
K000001  K00003  K0003  4,000     3,600
K000002  K00001  K0002  1,100     1,650
K000002  K00001  K0003  10,000    9,000
K000002  K00002  K0001  980       1,470
...      ...     ...    ...       ...
K900001  K00001  K0001  19,000    22,800
K900001  K00001  K0002  17,150    25,725
K900001  K00001  K0003  16,900    15,210
K900001  K00002  K0001  17,890    21,468
...      ...     ...    ...       ...
K999001  K00001  K0001  110,100   110,101
K999001  K00001  K0002  89,600    134,400
K999001  K00001  K0003  96,100    86,490
...      ...     ...    ...       ...
K000001  K00001  K5000  2,111     2,867
K000001  K00002  K5000  700       840
K000001  K00002  K5700  20,000    18,000
K000001  K00003  K5000  900       1,080
K000001  K00003  K5700  4,000     3,600
K000001  K00001  K9999  2,111     2,867
K000001  K00002  K9999  20,700    18,840
K000001  K00003  K9999  4,900     4,680
K000001  K99999  K9999  27,711    26,387
...      ...     ...    ...       ...

Possible FT entries: 19.12 b
Filling ratio 1%: 191 m entries
Size of an FT entry:
7, 6, and 5 byte for keys, qty 12 byte, revenue 10 byte = 40 byte
Net size of FT = 40 byte * 191 m = 7.64 GB
Rule of thumb: add 100%
for indices and admin. data


Relational Representation
General rule for fact table:

$\text{Size} = b \cdot r \cdot \prod_{i} \sum_{j(i)} \left| VS\left(d_i^{j(i)}\right) \right|$

b ... size of FT entry
r ... filling ratio
i ... dimensions
j(i) ... number of hierarchy levels in i
VS ... valueset
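
The rule can be checked against the Sales Case figures with a short Python sketch. It assumes the formula is read as entry size times filling ratio times the product, over all dimensions, of the summed valueset sizes of their hierarchy levels; the function name is hypothetical.

```python
from math import prod

def fact_table_size(entry_bytes, filling_ratio, valueset_sizes):
    """Return (possible entries, net size in bytes) of a fact table.

    `valueset_sizes` holds, per dimension, the number of values
    on each hierarchy level (including the total "*").
    """
    possible = prod(sum(levels) for levels in valueset_sizes)
    return possible, entry_bytes * filling_ratio * possible

valuesets = [
    [1_000, 50, 1],    # items, item types, *
    [10_000, 20, 1],   # customers, groups, *
    [1_750, 60, 5],    # days, months, years
]
entry_bytes = 7 + 6 + 5 + 12 + 10  # three keys, qty, revenue = 40 byte
possible_entries, net_bytes = fact_table_size(entry_bytes, 0.01, valuesets)
gross_bytes = 2 * net_bytes        # rule of thumb: add 100% for indices and admin. data
```

With these figures the sketch arrives at roughly 19.1 billion possible entries and a net size of about 7.6 GB, in line with the rounded values of the example.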


Relational Representation

Exercise:
# Build the classic Star Schema for the Internet example!
# Construct the dimension tables for the Internet example!
# Construct the fact table for the Internet example!


Relational Representation

Parallel hierarchies:

Dimension   Level=0              Level=1              Level=2               Level=3
Customer:   C_cu (indiv. cust.)  C_gr (cust. group)   C_* (cust. total)
Item:       I_it (indiv. item)   I_ty (item type)     I_ca (item category)  I_* (item total)
Vendor:     I_it (indiv. item)   V_v (vendor)         V_gr (vendor group)   V_* (vendor total)
Period:     P_d (day)            P_m (month)          P_y (year)


Relational Representation
Dimension Table Item
(columns: I_Key, I_it, I_it_des., I_ty_Key, I_ty_des., I_ca_Key, I_ca_des., V_Key, V_v, V_v_des.,
V_gr_Key, V_gr_des., level; the key columns I_ty_Key, I_ca_Key, V_Key and V_gr_Key carry no values
on the slide and are omitted below)

I_Key  I_it     I_it_des.              I_ty_des.  I_ca_des.  V_v     V_v_des.    V_gr_des.  level
K0001  1234567  Oranges Naval S.A.     fruits     food       471118  Woolworth   foodstore  item
K0001  1234567  Oranges Naval S.A.     fruits     food       411983  Mc Donalds  fast food  item
K0002  1234789  Banana Rosa Victoria   fruits     food       471118  Woolworth   foodstore  item
K0002  1234789  Banana Rosa Victoria   fruits     food       481399  K-Mart      foodstore  item
K0003  2345678  Milk - Dairy Farmers   dairy      food       471118  Woolworth   foodstore  item
K0003  2345678  Milk - Dairy Farmers   dairy      food       481399  K-Mart      foodstore  item
K0003  2345678  Milk - Dairy Farmers   dairy      food       411983  Mc Donalds  fast food  item
...    ...      ...                    ...        ...        ...     ...         ...        ...


Relational Representation
Dimension Table Item (continued; empty cells carry no value on the slide)

I_Key  I_it     I_it_des.              I_ty_des.  I_ca_des.  V_v     V_v_des.    V_gr_des.  level
...    ...      ...                    ...        ...        ...     ...         ...        ...
K5000                                  fruits     food                                      it_type
K5700                                  dairy      food                                      it_type
...    ...      ...                    ...        ...        ...     ...         ...        ...
K6000                                             food                                      it_cat.
...    ...      ...                    ...        ...        ...     ...         ...        ...
K7000                                                        471118  Woolworth   foodstore  vendor
K7001                                                        481399  K-Mart      foodstore  vendor
K7002                                                        411983  Mc Donalds  fast food  vendor
...    ...      ...                    ...        ...        ...     ...         ...        ...
K7120                                                                            foodstore  ven.group
K7121                                                                            fast food  ven.group
...    ...      ...                    ...        ...        ...     ...         ...        ...
K9999                                                                                       total


Relational Representation

Careful when counting instances


=> double count


Relational Representation
Fact Constellation Schema:
a fact table for each combination of dimension
hierarchies
basic fact table (BFT) for atomic data,
aggregate fact table (AFT) for aggregates
+ no level attribute
+ better performance due to table split
- number of tables
- queries over different aggregates or the BFT involve a
large number of tables to be joined in the WHERE clause


Relational Representation
DT Period: P_d_Key, P_d_description, P_m_Key, P_m_description
DT Customer: C_cu_Key, C_cu, C_cu_description, C_gr_Key, C_gr_description
DT Item: I_it_Key, I_it, I_it_description, I_ty_Key, I_ty_description, I_ca_Key, I_ca_description,
         V_v_Key, V_v, V_v_description, V_gr_Key, V_gr_description

Fact tables (each holds Quantity Sold and Revenue in addition to its keys):

FT Item:                 I_it_Key, P_d_Key, C_cu_Key
FT Item/C_gr:            I_it_Key, P_d_Key, C_gr_Key
FT Item/P_m:             I_it_Key, P_m_Key, C_cu_Key
FT Item/P_m/C_gr:        I_it_Key, P_m_Key, C_gr_Key
FT It_type:              I_ty_Key, P_d_Key, C_cu_Key
FT It_type/C_gr:         I_ty_Key, P_d_Key, C_gr_Key
FT It_type/P_m:          I_ty_Key, P_m_Key, C_cu_Key
FT It_type/P_m/C_gr:     I_ty_Key, P_m_Key, C_gr_Key
FT It_category:          I_ca_Key, P_d_Key, C_cu_Key
FT It_cat/C_gr:          I_ca_Key, P_d_Key, C_gr_Key
FT It_cat/P_m:           I_ca_Key, P_m_Key, C_cu_Key
FT It_cat/P_m/C_gr:      I_ca_Key, P_m_Key, C_gr_Key
FT Vendor:               V_v_Key, P_d_Key, C_cu_Key
FT Vendor/C_gr:          V_v_Key, P_d_Key, C_gr_Key
FT Vendor/P_m:           V_v_Key, P_m_Key, C_cu_Key
FT Vendor/P_m/C_gr:      V_v_Key, P_m_Key, C_gr_Key
FT Ven.group:            V_gr_Key, P_d_Key, C_cu_Key
FT Ven.group/C_gr:       V_gr_Key, P_d_Key, C_gr_Key
FT Ven.group/P_m:        V_gr_Key, P_m_Key, C_cu_Key
FT Ven.group/P_m/C_gr:   V_gr_Key, P_m_Key, C_gr_Key


Relational Representation
Normalized dimension tables:
Dimension Table Item

I_Key  I_it     I_description           I_ty  I_ty_description  level
K0001  1234567  Oranges Naval - S.A.    1     fruits            0
K0002  1234789  Banana Rosa - Victoria  1     fruits            0
K0003  2345678  Milk - Dairy Farmers    2     dairy             0
K5000  *        *                       1     fruits            1
K5700  *        *                       2     dairy             1
K9999  *        *                       *     *                 2

Which NF is the table in?



Relational Representation
DT Period_Day: P_d_Key, P_d, P_description, P_m_Key
DT Period_Month: P_d_Key, P_m_Key, P_m_description

DT Customer: C_cu_Key, C_cu, C_cu_description, C_gr_Key
DT Customer_Group: C_cu_Key, C_gr_Key, C_gr_description

DT Item: I_Key, I_it, I_it_description, I_ty_Key, V_gr_Key
DT Item_Type: I_Key, I_ty_Key, I_ty_description, I_ca_Key
DT Item_Category: I_Key, I_ca_Key, I_ca_description
DT Vendor: I_Key, V_Key, V_v, V_v_description, V_gr_Key
DT Vendor Group: I_Key, V_gr_Key, V_gr_description

[In the figure, the tables of each dimension are linked by 1:n relationships.]


Relational Representation
DT Customer
C_cu_Key
C_cu
C_cu_description
C_gr_Key
C_gr_description

DT Customer

C_cu_Key
C_cu
C_cu_description
C_gr_Key

customer

DT Customer_Group
n

C_cu_Key
C_gr_Key
C_gr_description


Relational Representation
DT Periode_Day

P_d_Key
P_d
P_description
P_m_Key
DT Periode _Month
P_d_Key
P_m_Key
P_m_description

DT Customer

customer

C_cu_Key
C_cu
C_cu_description
C_gr_Key
DT Customer_Group

C_cu_Key
C_gr_Key
C_gr_description


Relational Representation

DT Period_Day: P_d_Key, P_d, P_description, P_m_Key
DT Period_Month: P_d_Key, P_m_Key, P_m_description

DT Customer: C_cu_Key, C_cu, C_cu_description, C_gr_Key
DT Customer_Group: C_cu_Key, C_gr_Key, C_gr_description

DT Item: I_Key, I_it, I_it_description, I_ty_Key, V_gr_Key
DT Item_Type: I_Key, I_ty_Key, I_ty_description, I_ca_Key
DT Item_Category: I_Key, I_ca_Key, I_ca_description
DT Vendor: I_Key, V_Key, V_v, V_v_description, V_gr_Key
DT Vendor Group: I_Key, V_gr_Key, V_gr_description

[In the figure, the tables of each dimension are linked by 1:n relationships.]

item


Relational Representation
Normalized dimension tables:

+ no inconsistencies can occur


+ no level attribute
+ no danger of double count
- slower as more tables have to be joined in
higher-order aggregate operations


Relational Representation

Snow Flake Schema:


Build a Star Schema
Extend it to a Fact Constellation Schema
Normalize dimension tables


Relational Representation
DT Customer
C_cu_Key
C_cu
C_cu_description
C_gr_Key
C_gr_description
DT Customer

C_cu_Key
C_cu
C_cu_description
C_gr_Key

customer

DT Customer
n

C_cu_Key
C_gr_Key
C_gr_description


Relational Representation
DT Period
P_d_Key
1 P_d_description
P_m_Key
P_m_description
DT Period
n P_d_Key
P_m_Key
P_m_description

DT Customer
C_cu_Key
C_cu
1 C_cu_description
C_gr_Key

customer

DT Customer
n C_cu_Key
C_gr_Key
C_gr_description


Relational Representation
DT Period
P_d_Key
1 P_d_description
P_m_Key
P_m_description
DT Period
n P_d_Key
P_m_Key
P_m_description
DT Customer
C_cu_Key
C_cu
1 C_cu_description
C_gr_Key
DT Customer
n C_cu_Key
C_gr_Key
C_gr_description

FT Item
I_it_Key
P_d_Key
C_cu_Key
Quantity Sold
Revenue

FT Vendor
V_v_Key
P_d_Key
C_cu_Key
Quantity Sold
Revenue

FT Item/C_gr
I_it_Key
P_d_Key
C_gr_Key
Quantity Sold
Revenue

FT Vendor/C_gr
V_v_Key
P_d_Key
C_gr_Key
Quantity Sold
Revenue

FT It_Type
I_ty_Key
P_d_Key
C_cu_Key
Quantity Sold
Revenue

FT Vd_Group
V_gr_Key
P_d_Key
C_cu_Key
Quantity Sold
Revenue

FT It_Type/C_gr
I_ty_Key
P_d_Key
C_gr_Key
Quantity Sold
Revenue

FT Vd_Gr/C_gr
V_gr_Key
P_d_Key
C_gr_Key
Quantity Sold
Revenue

FT It_Category
I_ca_Key
P_d_Key
C_cu_Key
Quantity Sold
Revenue

FTItem/P_month
I_it_Key
P_m_Key
C_cu_Key
Quantity Sold
Revenue

FT It_Cat/C_gr
I_ca_Key
P_d_Key
C_gr_Key
Quantity Sold
Revenue

FT Item/P_m/C_gr
I_it_Key
P_m_Key
C_gr_Key
Quantity Sold
Revenue


Relational Representation

DT Item
I_it_Key
I_it
I_it_description
1 I_ty_Key
1
V_v_Key
DT Item
n I_it_Key
I_ty_Key
1 I_ty_description
I_ca_Key

FT It_Type/P_m
I_ty_Key
P_m_Key
C_cu_Key
Quantity Sold
Revenue

FT It_Cat/P_m
I_ca_Key
P_m_Key
C_cu_Key
Quantity Sold
Revenue

FT Vendor/P_m
V_v_Key
P_m_Key
C_cu_Key
Quantity Sold
Revenue

FT Vd_Group/P_m
V_gr_Key
P_m_Key
C_cu_Key
Quantity Sold
Revenue

FT It_Ty/P_m/C_gr
I_ty_Key
P_m_Key
C_gr_Key
Quantity Sold
Revenue

FT It_Cat/P_m/C_gr
I_ca_Key
P_m_Key
C_gr_Key
Quantity Sold
Revenue

FT Ven/P_m/C_gr
V_v_Key
P_m_Key
C_gr_Key
Quantity Sold
Revenue

FT Vd_Gr/P_m/C_gr
V_gr_Key
P_m_Key
C_gr_Key
Quantity Sold
Revenue

DT Item
n I_it_Key
I_ca_Key
I_ca_description

.
.
. see next slide

Relational Representation
.
. see last slide
.
DT_Item
DT Item
n I_it_Key
V_v_Key
V_v
1 V_v_description
V_gr_Key
DT Item
n I_it_Key
V_gr_Key
V_gr_description


Physical Modeling

Conceptual Model: user view, database independent


in operational systems: ERM or SOM
Logical Model: database-dependent, represents data
structures to be stored, e.g., relational model
Physical Model: implementation of the logical model
on storage medium, indices, blocks, etc.,
product-dependent


The Case Study with


SAP Business Warehouse
