[Figure: lattice of cuboids. The 0-D (apex) cuboid is "all"; the 1-D cuboids are time, item, location, and supplier.]
Specification of hierarchies
• Schema hierarchy
day < {month < quarter; week} < year
• Set_grouping hierarchy
{1..10} < inexpensive
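
The two kinds of hierarchy above can be sketched in code. This is a minimal illustration, not a prescribed implementation: the function names `schema_levels` and `price_group`, the use of ISO week numbering, and the "other" label for prices outside {1..10} are all assumptions made for the example.

```python
from datetime import date


def schema_levels(d: date) -> dict:
    """Map a day to the coarser levels of day < {month < quarter; week} < year.

    Weeks do not nest inside months, which is why the hierarchy is a
    partial order with two branches rather than a single chain.
    """
    iso_year, iso_week, _ = d.isocalendar()  # ISO numbering is an assumption
    return {
        "day": d.isoformat(),
        "week": f"{iso_year}-W{iso_week:02d}",
        "month": f"{d.year}-{d.month:02d}",
        "quarter": f"{d.year}-Q{(d.month - 1) // 3 + 1}",
        "year": d.year,
    }


def price_group(price: int) -> str:
    """Set_grouping hierarchy: the value range {1..10} rolls up to 'inexpensive'."""
    # "other" is a placeholder label; the slide only defines the first group.
    return "inexpensive" if 1 <= price <= 10 else "other"


levels = schema_levels(date(2011, 2, 25))
```

A schema hierarchy is derived from the attributes of the dimension itself, while a set_grouping hierarchy is defined by explicitly assigning ranges of values to named groups.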
February 25, 2011 Data Mining: Concepts and Techniques
A Sample Data Cube
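[Figure: a sample sales data cube with three dimensions: Date (1Qtr, 2Qtr, 3Qtr, 4Qtr, sum), Product (TV, PC, VCR, sum), and Country (U.S.A, Canada, Mexico, sum). One highlighted cell holds the total annual sales of TV in U.S.A.]
[Figure: the corresponding lattice of cuboids. The 0-D (apex) cuboid is "all"; the 1-D cuboids are product, date, and country; the 3-D (base) cuboid is (product, date, country).]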
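
To make the lattice concrete, the sketch below enumerates every cuboid for the three dimensions product, date, and country and sums a sales measure for each group-by. The fact rows and their sales values are invented for illustration, and the helper names `cuboid` and `lattice` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

DIMS = ("product", "date", "country")

# Hypothetical fact rows: (product, date, country, sales). Values are made up.
FACTS = [
    ("TV", "1Qtr", "U.S.A", 100),
    ("TV", "2Qtr", "U.S.A", 150),
    ("PC", "1Qtr", "Canada", 80),
    ("VCR", "4Qtr", "Mexico", 40),
]


def cuboid(group_by):
    """Sum sales for each distinct combination of the group-by dimensions."""
    idx = [DIMS.index(d) for d in group_by]
    totals = defaultdict(int)
    for row in FACTS:
        totals[tuple(row[i] for i in idx)] += row[3]
    return dict(totals)


def lattice():
    """Build all 2^n cuboids, from the 0-D apex () to the 3-D base cuboid."""
    return {
        g: cuboid(g)
        for k in range(len(DIMS) + 1)
        for g in combinations(DIMS, k)
    }


cube = lattice()
```

With three dimensions the lattice holds 2^3 = 8 cuboids. The apex cuboid `cube[()]` contains the single grand total, and `cube[("product", "country")]` contains cells such as the total sales of TV in U.S.A. across all quarters.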
• Visualization
• OLAP capabilities
• Interactive manipulation
Typical OLAP Operations
[Figure: a star-net query model. Radial lines include Time (ANNUALLY, QTRLY, DAILY), Product (PRODUCT LINE, PRODUCT GROUP, PRODUCT ITEM), Location, Promotion, and Organization. Other labels in the figure: ORDER, TRUCK, CITY, COUNTRY, REGION, SALES PERSON, DISTRICT, DIVISION. Each circle is called a footprint.]
Data Warehousing and OLAP Technology for Data Mining
Data Marts
(city, item, year)
Cube Computation: ROLAP-Based Method
• Efficient cube computation methods
o ROLAP-based cubing algorithms (Agarwal et al’96)
o Array-based cubing algorithm (Zhao et al’97)
o Bottom-up computation method (Beyer & Ramakrishnan’99)
• ROLAP-based cubing algorithms
o Sorting, hashing, and grouping operations are applied to the
dimension attributes in order to reorder and cluster related
tuples
o Grouping is performed on some subaggregates as a “partial
grouping step”
o Aggregates may be computed from previously computed
aggregates, rather than from the base fact table
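
The ideas in this list can be sketched as follows: rows are sorted on the grouping attributes so that related tuples become adjacent, each run is aggregated, and a coarser aggregate is then computed from the finer one instead of from the base fact table. This is only an illustrative sketch, not the algorithm of any cited paper; the fact rows, the (city, item, year, sales) layout, and the helper `group_sum` are assumptions.

```python
from itertools import groupby

# Hypothetical base fact table: (city, item, year, sales). Values are made up.
FACT = [
    ("Vancouver", "TV", 2010, 5),
    ("Toronto", "PC", 2010, 7),
    ("Vancouver", "TV", 2011, 3),
    ("Vancouver", "PC", 2011, 4),
    ("Toronto", "PC", 2011, 1),
]


def group_sum(rows, key_idx):
    """Sort rows on the key columns to cluster related tuples, then sum each run.

    The measure is assumed to be the last column of every row.
    Returns rows of the form (key columns..., summed measure).
    """
    def key(row):
        return tuple(row[i] for i in key_idx)

    ordered = sorted(rows, key=key)
    return [k + (sum(r[-1] for r in run),) for k, run in groupby(ordered, key=key)]


# Group-by (city, item), computed from the base fact table.
city_item = group_sum(FACT, [0, 1])

# Group-by (city), computed from the smaller (city, item) aggregate.
city_from_aggregate = group_sum(city_item, [0])

# The same group-by computed directly from the base table, for comparison.
city_from_base = group_sum(FACT, [0])
```

Because SUM is distributive, `city_from_aggregate` equals `city_from_base`, but it scans three aggregated rows instead of five base rows. On a real fact table the gap is typically much larger.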
[Figure: a 3-D array with dimensions A (a0 to a3), B (b0 to b3), and C (c0 to c3), partitioned into 64 chunks numbered 1 to 64. Chunks 1 to 4 run along A for b0 and c0, chunks 1 to 16 fill the c0 layer, and chunk 64 is at (a3, b3, c3).]
What is the best traversing order to do multi-way aggregation?
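
One way to reason about the traversal question is to estimate how much memory each 2-D plane needs while the chunks are scanned in a given order. The sketch below uses a simple model: the plane that aggregates over the fastest-varying dimension needs one chunk in memory, the plane that aggregates over the middle dimension needs one row of chunks, and the plane that aggregates over the slowest-varying dimension must be held in full. The dimension and chunk sizes are assumed for illustration (the figure only shows four partitions per dimension), and `plane_memory` is a hypothetical helper.

```python
# Assumed sizes for illustration: each dimension is split into 4 partitions.
SIZE = {"A": 40, "B": 400, "C": 4000}
CHUNK = {"A": 10, "B": 100, "C": 1000}


def plane_memory(order, size=SIZE, chunk=CHUNK):
    """Memory units needed per 2-D plane for a given chunk scan order.

    `order` lists the dimensions from fastest-varying to slowest-varying.
    """
    inner, middle, outer = order

    def name(x, y):
        return "".join(sorted(x + y))

    return {
        # Aggregates over `inner`: a cell block is complete after one run of chunks.
        name(middle, outer): chunk[middle] * chunk[outer],
        # Aggregates over `middle`: one row of chunks along `inner` stays live.
        name(inner, outer): size[inner] * chunk[outer],
        # Aggregates over `outer`: the whole plane stays live until the end.
        name(inner, middle): size[inner] * size[middle],
    }


abc = plane_memory("ABC")  # A varies fastest, as in chunk order 1..64
cba = plane_memory("CBA")  # C varies fastest
total_abc = sum(abc.values())
total_cba = sum(cba.values())
```

Under these assumed sizes, scanning with the smallest dimension varying fastest needs 156,000 memory units, while the reverse order needs 1,641,000. The plane that has to be held in full should be the smallest one.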
Multi-way Array Aggregation for Cube Computation
[Figure: the 3-D array with dimensions A (a0 to a3), B (b0 to b3), and C (c0 to c3), partitioned into 64 chunks numbered 1 to 64.]
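
The core idea of multi-way aggregation can be sketched as a single pass over the chunks in which every cell contributes to all three 2-D cuboids (AB, AC, BC) at once, so the base array is read only one time. The code below is a simplified illustration under assumed sizes (a 4x4x4 dense array with 2x2x2 chunks) and invented cell values. It keeps all three planes fully in memory and does not model releasing a plane's chunk as soon as it is complete.

```python
from itertools import product

N, CH = 4, 2  # assumed: 4 cells per dimension, chunks of 2 cells per dimension

# Hypothetical dense base cuboid, indexed as cell[a][b][c]. Values are made up.
cell = [
    [[a + 10 * b + 100 * c for c in range(N)] for b in range(N)]
    for a in range(N)
]


def multiway_aggregate(cell):
    """Compute the AB, AC, and BC cuboids in one scan over the chunks."""
    ab = [[0] * N for _ in range(N)]
    ac = [[0] * N for _ in range(N)]
    bc = [[0] * N for _ in range(N)]
    starts = range(0, N, CH)
    # product() varies its last argument fastest, so chunks are visited
    # with A varying fastest, then B, then C.
    for c0, b0, a0 in product(starts, starts, starts):
        for a in range(a0, a0 + CH):
            for b in range(b0, b0 + CH):
                for c in range(c0, c0 + CH):
                    v = cell[a][b][c]
                    ab[a][b] += v  # aggregates over C
                    ac[a][c] += v  # aggregates over B
                    bc[b][c] += v  # aggregates over A
    return ab, ac, bc


ab, ac, bc = multiway_aggregate(cell)
```

Each base cell is read once and updates three partial sums, which is the "simultaneous aggregation" that avoids recomputing each 2-D cuboid with its own scan of the array.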
[Figure residue: labels Layer2, MDDB, Meta Data.]