Data Warehouses
Query Driven Approach
• Query Driven Approach
a query is issued on the client side
a metadata dictionary translates the query into inner queries
the inner queries are mapped and sent to the local query processor
the inner queries are appropriate for the individual data sources
[Figure: Data Sources, Convert, Mediator]
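The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not a real mediator: the two sources, their schemas, the `METADATA` dictionary and the `mediator` function are all hypothetical names introduced here.

```python
# Two hypothetical local sources, each with its own schema.
SOURCE_A = [{"cust": "Ann", "amount": 120}, {"cust": "Bob", "amount": 80}]
SOURCE_B = [{"customer_name": "Cy", "total": 200}]


def query_source_a(min_amount):
    """Inner query for source A, converted to the global schema."""
    return [
        {"customer": r["cust"], "amount": r["amount"]}
        for r in SOURCE_A
        if r["amount"] >= min_amount
    ]


def query_source_b(min_amount):
    """Inner query for source B, converted to the global schema."""
    return [
        {"customer": r["customer_name"], "amount": r["total"]}
        for r in SOURCE_B
        if r["total"] >= min_amount
    ]


# Assumed metadata dictionary: global query name -> inner queries per source.
METADATA = {"sales_over": [query_source_a, query_source_b]}


def mediator(query_name, **params):
    """Translate the global query, send it to each source, merge the answers."""
    results = []
    for inner_query in METADATA[query_name]:
        results.extend(inner_query(**params))
    return results


answer = mediator("sales_over", min_amount=100)
```

Every call to `mediator` goes back to the sources, which is why the answer is always current and why frequent queries become expensive.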
Query Driven Approach
• Query Driven Approach
Disadvantages
needs complex integration and filtering processes
is very inefficient and very expensive for frequent queries
is very expensive for queries that require aggregations
Query Driven Approach
• Query Driven Approach
Advantages
access to current data
no data redundancy
DW Incentives
• Update Driven Approach
the information is integrated from multiple heterogeneous sources in advance and stored
the information is made available for direct querying and analysis
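A minimal sketch of the update-driven idea, assuming an in-memory SQLite database stands in for the warehouse. The source data, the `sales` table and the `integrate` function are hypothetical and only illustrate "integrate in advance, then query the stored copy".

```python
import sqlite3

# Two hypothetical heterogeneous sources with different shapes.
SOURCE_A = [("Ann", 120), ("Bob", 80)]
SOURCE_B = [{"customer_name": "Cy", "total": 200}]


def integrate(conn):
    """Copy and restructure the source data into one warehouse table, in advance."""
    conn.execute(
        "CREATE TABLE sales (customer TEXT, amount INTEGER, source TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, 'A')", SOURCE_A)
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, 'B')",
        [(r["customer_name"], r["total"]) for r in SOURCE_B],
    )
    conn.commit()


warehouse = sqlite3.connect(":memory:")
integrate(warehouse)

# Analysis queries run directly on the stored copy, not on the sources.
total = warehouse.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
```

The stored copy is redundant with the sources and has to be refreshed when they change, which is the trade-off of this approach.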
DW Incentives
• Other possibilities?
Advantages / Disadvantages?
[Figure: Data Sources, Integrator, Integration rules, Change detection, Metadata]
Update Driven Approach
• Update Driven Approach
Advantages
provides high performance
data are copied, processed, integrated, annotated, summarized and restructured in a semantic data store in advance
query processing does not interfere with the processing at the local sources
Update Driven Approach
• Update Driven Approach
Disadvantages
High data redundancy
Problems with update
BI Process
• Data from the operational systems are
Extracted
Cleansed
Transformed
Aggregated
Loaded into the DW
http://blog.cybyte.com/etl-business-intelligence/
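The five steps can be sketched as a chain of small Python functions. This is an illustrative toy under stated assumptions: the `RAW_ORDERS` records, the cleansing rule (drop rows with a missing amount) and the dictionary used as the "DW" are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records from an operational system.
RAW_ORDERS = [
    {"city": " Lisbon ", "amount": "10.5"},
    {"city": "porto", "amount": "4"},
    {"city": "Lisbon", "amount": None},  # incomplete record
    {"city": "Porto", "amount": "6"},
]


def extract():
    """Extract: read the rows from the operational system."""
    return list(RAW_ORDERS)


def cleanse(rows):
    """Cleanse: drop rows with a missing amount (assumed rule)."""
    return [r for r in rows if r["amount"] is not None]


def transform(rows):
    """Transform: normalize city names and convert amounts to numbers."""
    return [
        {"city": r["city"].strip().title(), "amount": float(r["amount"])}
        for r in rows
    ]


def aggregate(rows):
    """Aggregate: total amount per city."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["city"]] += r["amount"]
    return dict(totals)


def load(totals, warehouse):
    """Load: store the aggregated rows in the (toy) DW."""
    warehouse.update(totals)


dw = {}
load(aggregate(transform(cleanse(extract()))), dw)
```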
Data Warehouse
• Having a DW
It is the first time the company has an integrated view of its
information
A bit of History
• 1985
Procter and Gamble uses the first commercial system focused on business analytics
Excel 1.0
• 1988
B. Devlin and P. Murphy publish “An architecture for a business and information system”
• 1989
SQL
• 1990
Inmon publishes “Building the Data Warehouse”
Cognos PowerPlay
• 1992
Essbase (Extended Spreadsheet Database), published by Hyperion Solutions; it became a major OLAP server product in the market in 1997
• 1993
Introduction of the term OLAP
• 1996
Kimball publishes “The Data Warehouse Toolkit”
• 2002
Inmon updates his book and defines an architecture for the collection of disparate sources into a detailed, time-variant data store
Kimball updates his book and defines multiple databases, called data marts, that are organized by business processes but use an enterprise-standard data bus
Inmon / Kimball
• Data warehouse:
A data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data in support of management’s decision-making process
W.H. Inmon, 1992
A data warehouse is a database with these particular features
The database engine will perform a ‘star join’ where a Cartesian product will be created using all of the dimension values and the fact table will be queried finally for the selective rows.
This is known to be a very effective database operation.
This makes the dimensional model hard to change as the business requirements change.
Cannot handle all the enterprise reporting needs because the model is oriented towards business processes rather than the enterprise as a whole.
Integration of legacy data into the data warehouse
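The star join mentioned above can be illustrated with a small, simplified sketch: select the matching keys in each dimension, build their Cartesian product, then probe the fact table with each key combination. The tables, keys and the `star_join` function are hypothetical, and the fact table is assumed to hold one row per key combination.

```python
from itertools import product

# Hypothetical dimension tables: surrogate key -> attributes.
DIM_DATE = {1: {"year": 2001}, 2: {"year": 2002}}
DIM_CITY = {10: {"city": "Lisbon"}, 20: {"city": "Porto"}}

# Hypothetical fact table: (date_key, city_key) -> sales amount.
FACT_SALES = {(1, 10): 100, (2, 10): 150, (2, 20): 70}


def star_join(year, city):
    """Filter the dimensions first, then probe the fact table last."""
    date_keys = [k for k, d in DIM_DATE.items() if d["year"] == year]
    city_keys = [k for k, c in DIM_CITY.items() if c["city"] == city]
    rows = []
    # Cartesian product of the selected dimension keys.
    for key in product(date_keys, city_keys):
        if key in FACT_SALES:  # only the selective fact rows are read
            rows.append((year, city, FACT_SALES[key]))
    return rows


result = star_join(2002, "Lisbon")
```

Because the dimensions are small compared with the fact table, filtering them first keeps the number of fact lookups low.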
• OLAP operations/queries
Aggregation, e.g., SUM
Change level, e.g. (Year, City) -> (Year, Month, City)
Roll Up: Less detail
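A short sketch of these operations, assuming a hypothetical `SALES` list of (year, month, city, amount) rows and a helper named `aggregate`. Grouping by (Year, Month, City) instead of (Year, City) adds detail, while grouping by Year alone is a roll up to less detail.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, month, city, amount).
SALES = [
    (2001, 1, "Lisbon", 10),
    (2001, 2, "Lisbon", 20),
    (2001, 1, "Porto", 5),
    (2002, 1, "Lisbon", 7),
]
COLUMNS = ("year", "month", "city")


def aggregate(rows, level):
    """SUM the amount, grouped by the columns named in `level`."""
    idx = [COLUMNS.index(c) for c in level]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


by_year_city = aggregate(SALES, ("year", "city"))
# Change level: (Year, City) -> (Year, Month, City), more detail.
by_year_month_city = aggregate(SALES, ("year", "month", "city"))
# Roll up: (Year, City) -> (Year), less detail.
by_year = aggregate(SALES, ("year",))
```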
Idea:
?
• Inmon W., Building the Data Warehouse, John Wiley & Sons, New York 2002
• https://www.youtube.com/watch?v=rvURMymCpJM