2 12.11.2014 Knowledge Discovery WS 2014/15 - Data Warehousing & OLAP Institut AIFB
Outline
Exploratory KD Techniques
Analytical Queries over Multidimensional Datasets
Data Cube and OLAP
Data structures and Operations for Analytical Queries
Data Warehouses
Implementing the Informational System
The 3 most essential dimensions
defining a KD problem:
Data representation
Method (approach, learning algorithm)
Task (problem domain)
Remember: Feature Vectors
Remember: Classification Example
Tid   Home Owner   Marital Status   Annual Income   Defaulted Borrower
6     No           Married          60K             No
7     Yes          Divorced         220K            No
8     No           Single           85K             Yes
9     No           Married          75K             No
10    No           Single           90K             Yes
[Figure: Training Set -> Learn Classifier -> Model; the model is applied to New Data. What is the quality of the learned model?]
Multidimensional Datasets
Dimensions: Many independent attributes spanning an n-dimensional space (coordinate system)
Measures: Few dependent measurement attributes
[Figure: coordinate system with axes Geo and Time]
Example
Time   Geo       Sex   ...   Population
2013   Germany   F     ...   41 673 725
2013   Germany   M     ...   40 346 853
2013   Spain     F     ...   23 702 400
2013   Spain     M     ...   23 001 908
...    ...       ...   ...   ...
Solution: Exploratory KD Techniques
Extract-Visualize-Analyze
Loop of Analysis
[Gray, 1995]
Informational Systems – Problem
[Figure: tools for exploratory KD over multidimensional datasets (pivot tables, visualisations) need Overview, Filter and Zoom; how to get there from the multidimensional datasets is still open ("?")]
Spreadsheets
RDBMS / Operational Systems
Visualisation Techniques
Spreadsheets
Problem
Eurostat wiki
RDBMS / Operational Systems
Problem
Complementary requirements [Köppen, 2012] [Golfarelli, 2009]
There is no “One Size fits All“ [Stonebraker, 07]
Visualisation Techniques
Problem: Multidimensionality
OECD Explorer
Gapminder Visualisation
Example Exploratory KD User Interface
Multidimensional Datasets in Relational Databases – Example
Dimensions: Many independent attributes spanning an n-dimensional space (coordinate system)
Measures: Few dependent measurement attributes (often aggregated over time or space)
[Figure: coordinate system with axes Geo and Time]
Time   Geo       Sex   ...   Population
2013   Germany   F     ...   41 673 725
2013   Germany   M     ...   40 346 853
2013   Spain     F     ...   23 702 400
2013   Spain     M     ...   23 001 908
...    ...       ...   ...   ...
Group by
Group by partitions a table into groups.
Example: Age groups
[Figure: Group-by illustration [Gray, 1995]]
Aggregation function
An aggregation function summarises the attribute values of a group, returning one value for each group.
Example: SUM, COUNT, AVG
[Figure: Histogram of population, SUM over age groups 0-18, 18-36, 36-54, ...]
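The two steps above, partitioning into groups and then summarising each group, can be sketched in a few lines of Python. This is only an illustration: the helper names `group_by` and `aggregate` are hypothetical, and the rows are the four population rows from the example table.

```python
from collections import defaultdict

# Rows copied from the example table: (time, geo, sex, population)
rows = [
    (2013, "Germany", "F", 41_673_725),
    (2013, "Germany", "M", 40_346_853),
    (2013, "Spain", "F", 23_702_400),
    (2013, "Spain", "M", 23_001_908),
]

def group_by(rows, key):
    """Partition rows into groups that share the same key value."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    return dict(groups)

def aggregate(groups, measure, aggfunc):
    """Apply an aggregation function to the measure values of each group."""
    return {k: aggfunc([measure(r) for r in g]) for k, g in groups.items()}

by_geo = group_by(rows, key=lambda r: r[1])
population_by_geo = aggregate(by_geo, measure=lambda r: r[3], aggfunc=sum)   # SUM
count_by_geo = aggregate(by_geo, measure=lambda r: r[3], aggfunc=len)        # COUNT
```

Swapping `sum` for another function (for example `statistics.mean` for AVG) changes the aggregation without touching the grouping step.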
How to Successively Reduce Dimensionality?
Subtotals
Successively aggregating data over attributes (roll-up)
Input: Aggregation function
Output: Group by
Roll-up
Time   Geo       Sex   ...   Population by Time/Geo/Sex   Pop by Time/Geo   Pop by Time
2013   Germany   F     ...   41 673 725
2013   Germany   M     ...   40 346 853
                                                          82 020 578
2013   Spain     F     ...   23 702 400
2013   Spain     M     ...   23 001 908
                                                          46 704 308
                                                                            128 724 886
Population 2013 Pivot Table (further tables for 2012, ...)
Geo\Sex   F            M            total
Germany   41 673 725   40 346 853   82 020 578
Spain     23 702 400   23 001 908   46 704 308
total     65 376 125   63 348 761   128 724 886
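The subtotal columns of the roll-up table can be reproduced with a minimal sketch. The function name `subtotal` is hypothetical, and SUM is assumed as the aggregation function, as in the table.

```python
from collections import defaultdict

# Rows copied from the example table: (time, geo, sex, population)
rows = [
    (2013, "Germany", "F", 41_673_725),
    (2013, "Germany", "M", 40_346_853),
    (2013, "Spain", "F", 23_702_400),
    (2013, "Spain", "M", 23_001_908),
]

def subtotal(rows, keep):
    """SUM population grouped by the first `keep` dimensions."""
    totals = defaultdict(int)
    for *dims, population in rows:
        totals[tuple(dims[:keep])] += population
    return dict(totals)

# Roll-up: drop one dimension at a time, from the finest to the coarsest grouping.
by_time_geo_sex = subtotal(rows, 3)
by_time_geo = subtotal(rows, 2)
by_time = subtotal(rows, 1)
```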
How Difficult is it to Compute All Dimension Reductions? – CUBE Operator [Gray, 95]
Input
Multidimensional dataset in relational database
Aggregation function (SUM, AVG, COUNT, ...)
CUBE Operator
Output
All possible cross tabulations for multidimensional dataset
Stored in relational database
CUBE Operator – Example
Time   Geo       Sex   Population
2013   Germany   F     41 673 725
2013   Germany   M     40 346 853
2013   Spain     F     23 702 400
2013   Spain     M     23 001 908
...    ...       ...   ...
Rows added by the group-bys:
2013   Germany   ALL   82 020 578
...    ...       ...   ...
...    ALL       ...   ...
...    ...       ...   ...
ALL    ...       ...   ...
...    ...       ...   ...
...    ALL       ALL   ...
...    ...       ...   ...
ALL    ALL       ALL   ...
CUBE Operator – Example
Number of group-bys: 2^|N|, where N is the set of dimensions
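To make the count of group-bys concrete, here is a small sketch of a CUBE computation in plain Python. It assumes SUM as the aggregation function and uses the string "ALL" as the placeholder for an aggregated-away dimension, as in the example table; the function name `cube` is hypothetical and this is not how a relational database implements the operator.

```python
from collections import defaultdict
from itertools import combinations

ALL = "ALL"

# Rows copied from the example table: (time, geo, sex, population)
rows = [
    (2013, "Germany", "F", 41_673_725),
    (2013, "Germany", "M", 40_346_853),
    (2013, "Spain", "F", 23_702_400),
    (2013, "Spain", "M", 23_001_908),
]

def cube(rows, n_dims):
    """Compute SUM for every subset of dimensions; dropped dimensions show ALL."""
    result = {}
    for k in range(n_dims + 1):
        for kept in combinations(range(n_dims), k):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[i] if i in kept else ALL for i in range(n_dims))
                totals[key] += row[n_dims]
            result[kept] = dict(totals)
    return result

group_bys = cube(rows, 3)
# Flatten all group-bys into one table of cells, like the output table of the slide.
cells = {key: value for g in group_bys.values() for key, value in g.items()}
```

With three dimensions the sketch computes 2^3 = 8 group-bys, one per subset of {Time, Geo, Sex}.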
How to Analyse Multidimensional Datasets? Representing and Querying
Conceptual Level: Independent from representation
Logical Level: Dependent on representation
Physical Level: Dependent on setting and actual data
Data Cube
[Figure: elements of a data cube: Fact, Dimension, Hierarchy, Level, Member]
Data Cube – Example
[Figure: Population data cube with a Geo dimension; levels NUTS0, NUTS1, NUTS2; members such as DE, DE1, DE2, DE12, DE212; example cell value 20 000]

rollupmember: V -> V
Meaning: "member more specific than"
Example: 2013 -> ALL

Example
Level = {NUTS2, NUTS1, NUTS0, Day, Month, Year}
NUTS0 = ("NUTS0", {DE, ES, ...}, 1)
rolluplevel = {Day -> Month, Day -> Week, Month -> Year, Year -> ALL}
[Figure: level orderings NUTS2 -> NUTS1 -> NUTS0 -> ALL and Day -> Month -> Year -> ALL]
Data Cube Elements: Hierarchy
Hierarchy = Set of all hierarchies
H ∈ 2^Hierarchy
h = (name, V, rolluplevel, rollupmember) ∈ Hierarchy
The levels form an ordered list
Example
Hierarchy = {geoH, timeH, sexH}
timeH = ("timeH", {Day, Month, Year, ALL}, {Day -> Month, Month -> Year, Year -> ALL}, {2013-01-01 -> 2013-01, 2013-01 -> 2013, 2013 -> ALL})
[Figure: Year: 2013; Month: January 2013; Day = ⊥: 2013-01-01, 2013-01-02]
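A hierarchy tuple of this form maps naturally onto a small record type. The sketch below is an illustration only: the class and method names are hypothetical, the level roll-up Day -> Month -> Year -> ALL is assumed so that it matches the member roll-up of the example, and the entry for 2013-01-02 is an assumption taken from the second day shown in the figure.

```python
from dataclasses import dataclass

@dataclass
class Hierarchy:
    name: str
    levels: tuple        # ordered from most specific level up to ALL
    rollup_level: dict   # level -> next more general level
    rollup_member: dict  # member -> next more general member

    def ancestors(self, member):
        """Follow rollup_member from a member up to ALL."""
        chain = []
        while member in self.rollup_member:
            member = self.rollup_member[member]
            chain.append(member)
        return chain

time_h = Hierarchy(
    name="timeH",
    levels=("Day", "Month", "Year", "ALL"),
    rollup_level={"Day": "Month", "Month": "Year", "Year": "ALL"},
    rollup_member={
        "2013-01-01": "2013-01",
        "2013-01-02": "2013-01",  # assumed from the figure
        "2013-01": "2013",
        "2013": "ALL",
    },
)

path = time_h.ancestors("2013-01-01")
```

Here `path` lists the members that 2013-01-01 rolls up to, from its month to ALL.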
Data Cube Elements: Dimension
Data Cube Elements: Measure
Measure = Set of all measures
M ∈ 2^Measure
m = (name, aggfunc) ∈ Measure
aggfunc ∈ {SUM, AVG, COUNT, ...}
Data Cube Elements: Fact
Fact = Set of all facts
F ∈ 2^Fact
Fact = String × 2^(Dimension × Member) × 2^(Measure × Number)
Example
Fact = {fact1, fact2, fact3, ...}

CS ∈ 2^DataCubeSchema
cs = (name, D, M) ∈ DataCubeSchema
DataCubeSchema = String × 2^Dimension × 2^Measure
Example
DataCubeSchema = {populationCS, gdpCS}
Data Cube Elements: Data Cube
DataCube = Set of all data cubes
cube = (name, cs, F) ∈ DataCube
DataCube = String × DataCubeSchema × 2^Fact
No two facts in a data cube may have the same dimension values.
Example
DataCube = {populationC, gdpC}
populationC = ("populationC", populationCS, {fact1, fact2, fact3, ...})
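The tuple definitions of measure, fact, data cube schema, and data cube can be written down as record types. The following is a sketch under assumptions: the field names are hypothetical, a fact is read as a name plus a set of (dimension, member) pairs plus a set of (measure, number) pairs, and the constraint on dimension values is checked when a cube is built.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Measure:
    name: str
    aggfunc: str  # e.g. "SUM", "AVG", "COUNT"

@dataclass(frozen=True)
class Fact:
    name: str
    members: frozenset  # set of (dimension, member) pairs
    values: frozenset   # set of (measure name, number) pairs

@dataclass(frozen=True)
class DataCubeSchema:
    name: str
    dimensions: frozenset
    measures: frozenset

@dataclass
class DataCube:
    name: str
    schema: DataCubeSchema
    facts: list

    def __post_init__(self):
        # No two facts may have the same dimension values.
        seen = set()
        for fact in self.facts:
            if fact.members in seen:
                raise ValueError(f"duplicate dimension values in {fact.name}")
            seen.add(fact.members)

population_cs = DataCubeSchema(
    "populationCS",
    frozenset({"timeD", "geoD", "sexD"}),
    frozenset({Measure("population", "SUM")}),
)
fact1 = Fact("fact1",
             frozenset({("timeD", "2013"), ("geoD", "Germany"), ("sexD", "F")}),
             frozenset({("population", 41_673_725)}))
fact2 = Fact("fact2",
             frozenset({("timeD", "2013"), ("geoD", "Germany"), ("sexD", "M")}),
             frozenset({("population", 40_346_853)}))
population_c = DataCube("populationC", population_cs, [fact1, fact2])
```

Building a cube from two facts with identical dimension members raises a `ValueError`, which is one simple way to enforce the uniqueness constraint.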
Representing Data Cubes
Star Schema (“Sternschema”)
Advantages
Intuitive transformation from data cube to star schema.
Easy to implement in widely-used relational databases.
Fast queries, since the schema is partly denormalised and few joins are required.
Extensions and changes to the schema easy to realise.
Elements of star schema
Fact tables (large)
Dimension tables (small)
How to Create a Star Schema of a Data Cube?
For every data cube, one fact table
For every fact, one row in fact table
For every measure, a (numeric) attribute in fact table
For every dimension, a dimension table with a primary key, and a foreign key in the fact table (all foreign keys together form the primary key of the fact table)
For every level, an attribute in dimension table
For every member with highest granularity, a row in a dimension table
Example
geoD (dimension table for Geo): geoID, NUTS2, NUTS1, NUTS0
timeD (dimension table for Time): timeID, Day, Week, Month, Year
sexD (dimension table for Sex): sexID, sex
populationC (fact table for Population data cube): geoID, timeID, sexID, population
Each dimension table is related 1:* to the fact table via its ID.
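The example star schema can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: the table and column names follow the example (with `timeID` as the key of timeD), while the data rows and population numbers are invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE geoD  (geoID INTEGER PRIMARY KEY, NUTS2 TEXT, NUTS1 TEXT, NUTS0 TEXT);
CREATE TABLE timeD (timeID INTEGER PRIMARY KEY, Day TEXT, Week TEXT, Month TEXT, Year TEXT);
CREATE TABLE sexD  (sexID INTEGER PRIMARY KEY, sex TEXT);
CREATE TABLE populationC (
    geoID      INTEGER REFERENCES geoD(geoID),
    timeID     INTEGER REFERENCES timeD(timeID),
    sexID      INTEGER REFERENCES sexD(sexID),
    population INTEGER,
    PRIMARY KEY (geoID, timeID, sexID)  -- all foreign keys form the primary key
);
""")

# One row per member of highest granularity (hypothetical sample members).
conn.executemany("INSERT INTO geoD VALUES (?, ?, ?, ?)", [
    (1, "DE11", "DE1", "DE"), (2, "DE12", "DE1", "DE"), (3, "ES11", "ES1", "ES"),
])
conn.execute("INSERT INTO timeD VALUES (1, '2013-01-01', '2013-W01', '2013-01', '2013')")
conn.executemany("INSERT INTO sexD VALUES (?, ?)", [(1, "F"), (2, "M")])

# One row per fact (invented population numbers).
conn.executemany("INSERT INTO populationC VALUES (?, ?, ?, ?)", [
    (1, 1, 1, 10), (1, 1, 2, 11), (2, 1, 1, 20),
    (2, 1, 2, 21), (3, 1, 1, 30), (3, 1, 2, 31),
])

# Aggregate to NUTS0 and Year with two joins against the dimension tables.
totals = conn.execute("""
    SELECT g.NUTS0, t.Year, SUM(p.population)
    FROM populationC p
    JOIN geoD g ON p.geoID = g.geoID
    JOIN timeD t ON p.timeID = t.timeID
    GROUP BY g.NUTS0, t.Year
    ORDER BY g.NUTS0
""").fetchall()
```

Because every level is a column of the dimension table, aggregating to a higher level only changes the columns in `SELECT` and `GROUP BY`, not the joins.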
Challenges with Representing Data Cubes
Conformed dimensions
Complex hierarchies (e.g., non-strict)
Aggregation functions (e.g., meaningless: "Sum of population over time")
Other metadata (e.g., human-readable labels)
Pre-aggregated values (e.g., often-used summarisations)
Conformed Dimensions
Example: Integrating several data cubes
Conformed dimensions [Kimball, 2002]: equivalent dimension, or partially shared hierarchy
[Figure: Population data cube with Geo members DE, ES, ... at level NUTS0; GDP data cube with Geo members DE, DE1, DE11, DE12, DE212 at levels NUTS0, NUTS1, NUTS2; Time dimension with Day 1, Month January, Year 2013]
[Figure: Overview, Filter and Zoom for tools for exploratory KD (pivot tables, visualisations) over multidimensional datasets; the Data Cube is now filled in, two components are still open ("?")]
OLAP Operations
Example OLAP User Interface
Example Star Schema for Data Cube
Example
Time (Year) Geo (NUTS2) gdpMeasSum gdpMeasAvg ...
Projection() – Filter Measures
Example
Projection(gdpC, {gdpMeasSum})
Dice() – Filter Dimensions
Example
Dice(gdpC,geoD,{DE21,DE11})
Slice() – Remove Dimensions
Slice: DataCube × 2^Dimension -> DataCube with
Slice(c, SD) = c'
c'.DataCubeSchema.D = c.DataCubeSchema.D \ SD
Example
Slice(gdpC, {geoD})
Roll-Up() – Aggregate to Higher Level
Example
Roll-Up(gdpC, geoD, NUTS1)
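The four operations Projection, Dice, Slice, and Roll-Up can be sketched over a cube held as a plain dictionary. Everything here is an assumption made for illustration: the cube layout, the function names, the invented GDP numbers, and the choice to aggregate with SUM (so the example first projects onto gdpMeasSum). Unlike Roll-Up(gdpC, geoD, NUTS1), the sketch passes the member roll-up mapping directly instead of a level name.

```python
from collections import defaultdict

# Hypothetical GDP cube; members and values are invented for illustration.
gdp_c = {
    "dimensions": ["timeD", "geoD"],
    "measures": ["gdpMeasSum", "gdpMeasAvg"],
    "facts": [
        {"timeD": "2013", "geoD": "DE11", "gdpMeasSum": 10, "gdpMeasAvg": 5.0},
        {"timeD": "2013", "geoD": "DE12", "gdpMeasSum": 20, "gdpMeasAvg": 4.0},
        {"timeD": "2013", "geoD": "DE21", "gdpMeasSum": 30, "gdpMeasAvg": 6.0},
    ],
}
GEO_TO_NUTS1 = {"DE11": "DE1", "DE12": "DE1", "DE21": "DE2"}  # rollupmember

def _sum_facts(dimensions, measures, facts):
    """Group facts by their dimension members and SUM every measure."""
    totals = defaultdict(lambda: defaultdict(int))
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        for m in measures:
            totals[key][m] += fact[m]
    return [dict(zip(dimensions, key), **values) for key, values in totals.items()]

def projection(cube, measures):
    """Filter measures: keep only the listed measures."""
    keep = cube["dimensions"] + list(measures)
    facts = [{k: f[k] for k in keep} for f in cube["facts"]]
    return {"dimensions": cube["dimensions"], "measures": list(measures), "facts": facts}

def dice(cube, dimension, members):
    """Filter dimensions: keep only facts whose member is in `members`."""
    return {**cube, "facts": [f for f in cube["facts"] if f[dimension] in members]}

def slice_(cube, removed):
    """Remove dimensions and aggregate the facts over them."""
    dims = [d for d in cube["dimensions"] if d not in removed]
    return {"dimensions": dims, "measures": cube["measures"],
            "facts": _sum_facts(dims, cube["measures"], cube["facts"])}

def roll_up(cube, dimension, rollup_member):
    """Replace each member by its parent and aggregate."""
    lifted = [{**f, dimension: rollup_member[f[dimension]]} for f in cube["facts"]]
    return {**cube, "facts": _sum_facts(cube["dimensions"], cube["measures"], lifted)}

sum_c = projection(gdp_c, ["gdpMeasSum"])
diced = dice(sum_c, "geoD", {"DE21", "DE11"})
sliced = slice_(sum_c, {"geoD"})
rolled = roll_up(sum_c, "geoD", GEO_TO_NUTS1)
```

Each function returns a new cube, so the operations compose in the same way as the nested calls used on the slides.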
Drill-Across() – Integrate Two Data Cubes
Drill-Across: DataCube × DataCube -> DataCube with
Drill-Across(c1, c2) = c'
"Drill-across over conformed dimensions and (total/partial) shared hierarchies"
c'.DataCubeSchema.D = c1.DataCubeSchema.D ∪ c2.DataCubeSchema.D
c'.DataCubeSchema.M = c1.DataCubeSchema.M ∪ c2.DataCubeSchema.M
Example
Drill-Across(gdpC, populationC)
    -- join both fact tables through their shared dimension tables
    SELECT year, nuts0, sex, SUM(gdpM), SUM(populationM)
    FROM gdpC, populationC, timeD, geoD, sexD
    WHERE gdpC.timeID = timeD.timeID
      AND gdpC.geoID = geoD.geoID
      AND populationC.timeID = timeD.timeID
      AND populationC.geoID = geoD.geoID
      AND populationC.sexID = sexD.sexID
    GROUP BY year, nuts0, sex
Problems: Partially shared hierarchies. Non-conformed dimensions.
Drill-Across() – Integrate Two Data Cubes
Example
Drill-Across(
    Roll-Up(gdpC, geoD, nuts0),
    Slice(populationC, {sexD})
)
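Once both cubes have been brought to the same dimensions and levels, as in the nested example, drill-across reduces to joining facts on their dimension values. The sketch below assumes exactly that situation. The cube layout (a dictionary from (year, nuts0) to measure values) and the function name are hypothetical, the GDP numbers are invented, and the population numbers are the country totals from the pivot table.

```python
# GDP cube after Roll-Up to NUTS0 (hypothetical values).
gdp_nuts0 = {
    ("2013", "DE"): {"gdpM": 100.0},
    ("2013", "ES"): {"gdpM": 50.0},
    ("2013", "FR"): {"gdpM": 75.0},   # no matching population fact
}

# Population cube after Slice over sexD (totals from the pivot table).
population_no_sex = {
    ("2013", "DE"): {"populationM": 82_020_578},
    ("2013", "ES"): {"populationM": 46_704_308},
}

def drill_across(cube1, cube2):
    """Join facts that agree on all dimension values and union their measures."""
    shared = cube1.keys() & cube2.keys()
    return {key: {**cube1[key], **cube2[key]} for key in sorted(shared)}

integrated = drill_across(gdp_nuts0, population_no_sex)
```

Facts without a partner in the other cube are dropped here (an inner join); keeping them with missing measures would be an equally valid design choice.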
[Figure: Overview, Filter and Zoom for tools for exploratory KD (pivot tables, visualisations) over multidimensional datasets; Data Cube and OLAP Operations are now filled in, one component is still open ("?")]
Data Warehouse –
Purpose and Requirements
A Data Warehouse is a
subject-oriented,
integrated,
non-volatile and
time-variant
collection of data in support of management's decision-making process. [Inmon, 1996]
FASMI requirements for Data Warehouses [Pendse, 95]
Fast: Returns results typically within 5 seconds for interactive analysis.
Analysis: Allows useful ad-hoc analytical queries. Most often: OLAP.
Shared: Several users with different rights.
Multidimensional: Adequate representation of multidimensional
datasets. Most often: Data Cube.
Information: Integration of all relevant metadata and data.
Data Warehouse
Data Flow [Köppen, 2012]
[Figure: data sources -> extraction -> work space -> load -> base database -> load -> data cube, with transformation, ETL and analysis components; data flow and control flow]
Data Warehousing vs. Federated Data
http://semanticweb.com/defending_the_warehouse_b17223
Data Warehouse
Data Flow [Köppen, 2012]
Data sources
Task: Supply data for data warehouse
Not part of the data warehouse
Data Warehouse
Data Flow [Köppen, 2012]
Realisation
Usage of standard interfaces (e.g. ODBC)
Exception handling for continuation in case of error
Data Warehouse
Data Flow [Köppen, 2012]
Transformation component (Mediators)
Task: Preparation and adaptation of data for loading
Transform all data to a consistent database schema (Data Integration)
    Align data types, dates, units of measure, encodings...
    Identify equivalent entities
    Apply conversions etc.
Elimination of impurity (Data Cleaning)
    Incorrect or missing values, redundancy, out-dated values
    Use domain knowledge (e.g. Business Rules) for finding impurities, redundancies, ...
    Data Mining e.g. for deviations
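A few of the transformation tasks listed above (aligning units and types, identifying equivalent entities, dropping missing and redundant values) can be illustrated with a small function. This is a sketch only: the record layout, the lookup tables `GEO_CODES` and `UNIT_FACTORS`, and the sample records are hypothetical, and real ETL tools cover far more cases.

```python
# Hypothetical lookup tables used to align encodings and units.
GEO_CODES = {"Germany": "DE", "Deutschland": "DE", "Spain": "ES"}
UNIT_FACTORS = {"persons": 1, "thousands": 1000}

def transform(records):
    """Align units and encodings, then drop incomplete or redundant records."""
    cleaned, seen = [], set()
    for rec in records:
        if rec.get("population") is None:            # missing value
            continue
        geo = GEO_CODES.get(rec["geo"], rec["geo"])  # identify equivalent entities
        year = int(rec["year"])                      # align data types
        population = round(rec["population"] * UNIT_FACTORS[rec.get("unit", "persons")])
        key = (year, geo, rec["sex"])
        if key in seen:                              # redundancy
            continue
        seen.add(key)
        cleaned.append({"year": year, "geo": geo, "sex": rec["sex"],
                        "population": population})
    return cleaned

# Hypothetical extracted records from two sources with different conventions.
extracted = [
    {"year": "2013", "geo": "Germany", "sex": "F", "population": 41_673_725, "unit": "persons"},
    {"year": 2013, "geo": "Deutschland", "sex": "F", "population": 41_673.725, "unit": "thousands"},
    {"year": 2013, "geo": "Spain", "sex": "M", "population": None},
    {"year": 2013, "geo": "Spain", "sex": "F", "population": 23_702.4, "unit": "thousands"},
]
loaded = transform(extracted)
```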
Data Warehouse
Data Flow [Köppen, 2012]
Load component
Task: Transfer of cleaned and preprocessed (e.g. aggregated) data into base database/data cube
Characteristics
Usage of special load tools (e.g. bulk loader of database)
Changes must not overwrite existing data warehouse data; instead, new data has to be stored in addition (history of data)
Load process:
Online: Base database or data cube remains available during loading
Offline: Database not available (Time frame: during nights, weekends)
Data Warehouse
Data Flow [Köppen, 2012]
Work space
Task: Central acquisition component in so-called staging area
Data Warehouse
Data Flow [Köppen, 2012]
Base database
Task: Supply data cube with cleaned data
Non-temporary storage for pre-processed data
Independent of concrete analyses, i.e. no aggregations
Since its information is redundant to the data cube, it is often omitted in real applications
Data Warehouse
Data Flow [Köppen, 2012]
Analysis tools
Task: GUI for Exploratory KD
Data Warehouse – Reference Architecture
[Figure: reference architecture of the data warehouse with monitors, data warehouse manager, metadata manager, repository, transformation and analysis; data flow]
Data Warehouse – Reference Architecture – Monitors
[Figure: reference architecture with monitors, data warehouse manager, metadata manager and repository]
[Figure: Overview, Filter and Zoom for tools for exploratory KD (pivot tables, visualisations) over multidimensional datasets; Data Cube, OLAP Operations, and Data Warehouses & ETL are now filled in]
http://km.aifb.kit.edu/projects/ldcx/ (Kämpgen et al. 2014)
Case Study: Can we confirm or oppose Okun's law? (2)
Case Study: Can we confirm or oppose Okun's law? (3)
Pearson-Correlation: 0.851548866822179
Case Study: Financial Information Observation System (FIOS) – Components
OLAP engine (OLAP4LD)
References & Further Reading (2)
[Inmon, 1996] Inmon, W. H.: Building the Data Warehouse. 2nd edition. John Wiley & Sons, Inc., 1996.
[Kimball, 2002] Kimball, R., & Ross, M. (2002). The data warehouse toolkit: the complete guide to dimensional modelling. Reprint. New York [et al.]: Wiley, 1–447. doi:10.1145/945721.945741
[Köppen, 2012] Köppen, V., Saake, G., Sattler, K.: Data Warehouse Technologien. 2012.
[Pendse, 95] Pendse, N.: The FASMI Definition for OLAP. Business Intelligence, August 1995.
[Stolte, 2002] Stolte, C., Tang, D., & Hanrahan, P. (2002). Query, analysis, and visualization of hierarchically structured data using Polaris. Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '02, 112. doi:10.1145/775063.775064
[Shneiderman, 1996] Shneiderman, B. (1996). The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations. Information Visualization, 336–343.
Benedikt Kämpgen, Andreas Harth. OLAP4LD - A Framework for Building Analysis Applications over Governmental Statistics. ESWC 2014 Posters & Demo session, Springer, May 2014.
Knowledge Discovery Lecture WS14/15
22.10.2014   Introduction
29.10.2014   Design of KD-experiments
05.11.2014   Linear Classifiers
12.11.2014   Data Warehousing & OLAP
19.11.2014   Non-Linear Classifiers (ANNs)
26.11.2014   Kernels, SVM
03.12.2014   cancelled
10.12.2014   Decision Trees
17.12.2014   IBL & Clustering
07.01.2015   Relational Learning I
14.01.2015   Relational Learning II
21.01.2015   Relational Learning III
28.01.2015   Textmining
04.02.2015   Guest lecture
11.02.2015   Crisp, Visualisation
Topic groups shown beside the schedule: Basics, Overview; Supervised Techniques, Vector+Label Representation; Unsupervised Techniques; Semi-supervised Techniques, Relational Representation; Meta-Topics