Professional Documents
Culture Documents
Chapter 29
Chapter 29
Navathe
Slide 29- 1
Chapter 29
Overview of
Data Warehousing and OLAP
Chapter 29 Outline
Slide 29- 3
Traditional databases are not optimized for data access only they
have to balance the requirement of data access with the need to
ensure integrity of data.
Most of the times the data warehouse users need only read access
but, need the access to be fast over a large volume of data.
Most of the data required for data warehouse analysis comes from
multiple databases and these analysis are recurrent and predictable
to be able to design specific software to meet the requirements.
There is a great need for tools that provide decision makers with
information to make decisions quickly and reliably based on historical
data.
The above functionality is achieved by Data Warehousing and Online
analytical processing (OLAP)
Slide 29- 4
Slide 29- 5
Slide 29- 6
Data
Reformatting
Metadata
DSSI
EIS
Data
Mining
Updates/New Data
Slide 29- 7
Slide 29- 8
Slide 29- 9
Slide 29- 10
Traditional Databases generally deal with twodimensional data (similar to a spread sheet).
Non volatile
The degree of predictability of the analysis that will
be performed on them is high.
Copyright 2007 Ramez Elmasri and Shamkant B. Navathe
Slide 29- 11
REGION
REG1
P
R
O
D
U
C
T
REG2
REG3
P123
P124
P125
P126
:
:
P
r
o
d
u
c
t
P1 2 3
er
a r t Q tr 4
u
Q
a l Qtr 3
c
F i s tr 2
Q
1
r
t
Q
Reg 1 Reg 2
Reg 3
P1 2 4
P1 2 5
P1 2 6
:
:
Region
Slide 29- 12
Slide 29- 13
Multi-dimensional Schemas
Dimension table
Fact table
Slide 29- 14
Multi-dimensional Schemas
Snowflake Schema:
Slide 29- 15
Multi-dimensional Schemas
Star schema:
Slide 29- 16
Multi-dimensional Schemas
Snowflake Schema:
Slide 29- 17
Multi-dimensional Schemas
Fact Constellation
Slide 29- 18
Multi-dimensional Schemas
Indexing
Slide 29- 19
Slide 29- 20
Slide 29- 21
Slide 29- 22
Slide 29- 23
Slide 29- 24
Usage projections
The fit of the data model
Characteristics of available resources
Design of the metadata component
Modular component design
Design for manageability and change
Considerations of distributed and parallel
architecture
Slide 29- 25
Slide 29- 26
Slide 29- 27
Slide 29- 28
data acquisition
data quality management
selection and construction of access paths and structures
self-maintainability
functionality and performance optimization
Slide 29- 29
Recap
Slide 29- 30