Professional Documents
Culture Documents
Data warehouse
1
What is Data Mining?
2
What is a Data Warehouse?
• Defined in many different ways, but not rigorously.
• A decision support methods that is maintained
separately from the organization’s operational
database
• Support information processing by providing a
solid platform of consolidated, historical data for
analysis.
• “A data warehouse is a subject-oriented, integrated,
time-variant, and nonvolatile collection of data in
support of management’s decision-making process.”—
W. H. Inmon
• Data warehousing:
• The process of constructing and using data
warehouses 3
Data Warehouse—Subject-Oriented
9
Conceptual Modeling of Data
Warehouses
10
When compared star with snowflake model, star
model is the best one, because the snowflake is the
normalized form to reduce redundancies.
-easy to maintain.
-save storage space.
-reduce the effectiveness of browsing.
-More joins will be needed to execute the query.
11
Example of Star Schema
time
time_key item
day item_key
day_of_the_week Sales Fact Table item_name
month brand
quarter time_key type
year supplier_type
item_key
branch_key
branch location
location_key
branch_key location_key
branch_name units_sold street
branch_type city
dollars_sold province_or_street
country
avg_sales
Measures
12
Example of Snowflake Schema
time
time_key item
day item_key supplier
day_of_the_week Sales Fact Table item_name supplier_key
month brand supplier_type
quarter time_key type
year item_key supplier_key
branch_key
branch location
location_key
location_key
branch_key
units_sold street
branch_name
city_key city
branch_type
dollars_sold
city_key
avg_sales city
province_or_street
Measures country
13
A Concept Hierarchy: Dimension (location)
all all
14
Typical OLAP Operations
1. Roll up (drill-up): summarize data
• by climbing up hierarchy or by dimension
reduction
• Example: street<city<state<country.
• Aggregates the data by ascending the location
hierarchy from the level city to the level country.
• Dimension reduction –one or more dimensions
are removed
• From the given cube.
15
2. Drill down (roll down): reverse of roll-up
• Navigate from less detailed data to more detailed
data, or introducing new dimensions.
• Day<month<quarter<year
• Aggregates from the level month to day.
• By descending order.
• Additional Dimension-adding new dimension to
the given cube.
Cont..
3. Slice and dice:
• Slice operation performs selection on one
dimension of a given cube.
• Example: time=q1.
• Dice operation performs selection on two or
more operations.
• Example: location=q1 or q2, time= t1 or t2.
4. Pivot (rotate):
• Visualization operation that rotates the data axes
in view in order to provide an alternative
presentation of data. 17
Data warehouse architecture
Steps:
1.top-down view
-select relevant information.
-information matches current and future
business needs.
2.Data source view
-information being collected, stored
managed by operating systems.
- information may be documented at various
levels of detail and accuracy.
18
3.Data warehouse view
-fact table and dimension table.
-represents the information that is stored
inside the data warehouse.
19
Process of data warehouse design
1.top-down approach
-planning and designing
-technology mature and well known
-business problem solved clearly and
well understood.
2.Bottom-up approach
-experiments and prototypes.
-business modeling and technology
development.
20
Data Warehouse Design Process
21
Data Warehouse Usage