Modeling Techniques
Entity-Relationship Modeling
Traditional modeling technique
Technique of choice for OLTP
Suited for corporate data warehouse
Dimensional Modeling
Analyzing business measures in the specific business context
Helps visualize very abstract business questions
End users can easily understand and navigate the data structure
In simple terms,
Dimensional modeling is a data modeling method that stores data in such a way that it is
relatively easy to retrieve from the database.
ER modeling, by contrast, stores data in such a way that there is less redundancy.
Goals and Benefits of Dimensional Modeling
Dimensions:
Dimensions are the objects or context of the analysis. That is, dimensions are the 'things'
about which something is being said.
Facts/Measures:
Measures are the quantifiable values being analyzed, and they are usually numeric.
Measures are not stored in the dimension tables; a separate table, called the fact table, is
created to store them.
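The split between dimension tables and a fact table can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module, not a prescribed design: the table and column names (food, store, sales_fact, quantity, amount) and all sample rows are assumptions made up for the example.

```python
import sqlite3

def build_star_schema(conn):
    """Create two dimension tables and one fact table (hypothetical names and data)."""
    conn.executescript("""
        -- Dimensions: the 'things' that the measures describe.
        CREATE TABLE food  (key INTEGER PRIMARY KEY, name TEXT, type TEXT);
        CREATE TABLE store (key INTEGER PRIMARY KEY, city TEXT);

        -- Fact table: numeric measures plus one foreign key per dimension.
        CREATE TABLE sales_fact (
            food_key  INTEGER REFERENCES food(key),
            store_key INTEGER REFERENCES store(key),
            quantity  INTEGER,  -- measure
            amount    REAL      -- measure
        );

        INSERT INTO food  VALUES (1, 'Apple', 'Fruit'), (2, 'Bread', 'Bakery');
        INSERT INTO store VALUES (1, 'Springfield'), (2, 'Shelbyville');
        INSERT INTO sales_fact VALUES (1, 1, 10, 5.0), (2, 1, 3, 7.5), (1, 2, 4, 2.0);
    """)

def quantity_by_store(conn):
    """Sum a measure (quantity) grouped by a dimension attribute (city)."""
    return conn.execute("""
        SELECT s.city, SUM(t.quantity)
        FROM sales_fact t
        JOIN store s ON s.key = t.store_key
        GROUP BY s.city
        ORDER BY s.city
    """).fetchall()

conn = sqlite3.connect(":memory:")
build_star_schema(conn)
print(quantity_by_store(conn))
```

The measures live only in sales_fact, while the descriptive attributes live in the dimension tables, so a typical question ("how much was sold per city?") becomes an aggregate over the fact table joined to one dimension.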
Schema Types:
Why?
In dimensional modeling, we can create different schemas to suit our requirements. We need
various schemas to accomplish things such as accommodating the hierarchies of a dimension or
maintaining change histories of information.
A snowflake schema, which splits a dimension across several tables, has an obvious disadvantage
in terms of information retrieval, since we need to read more tables (and traverse more SQL
joins) in order to get the same information.
For example, to find every food and food type sold from store 1, the SQL queries for the
star and snowflake schemas look like this:
SQL Query for Star Schema:
SELECT DISTINCT f.name, f.type
FROM food f, sales_fact t
WHERE f.key = t.food_key
AND t.store_key = 1
SQL Query for Snowflake Schema:
SELECT DISTINCT f.name, tp.type_name
FROM food f, type tp, sales_fact t
WHERE f.key = t.food_key
AND f.type_key = tp.key
AND t.store_key = 1
As this example shows, the snowflake schema requires one more join than the star schema (to
connect one more table) to retrieve the same information. This extra join work is why a
snowflake schema generally performs worse for retrieval than a star schema.
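To make the comparison concrete, the two queries can be run against small in-memory databases. The sketch below uses Python's built-in sqlite3 module. The table layouts are inferred from the columns the queries reference, the sample rows are invented for illustration, and an ORDER BY has been added to each query only so the output order is deterministic.

```python
import sqlite3

# Star schema: the food type is a plain column on the food dimension.
STAR_QUERY = """
    SELECT DISTINCT f.name, f.type
    FROM food f, sales_fact t
    WHERE f.key = t.food_key
      AND t.store_key = 1
    ORDER BY f.name
"""

# Snowflake schema: the food type lives in its own table, costing one more join.
SNOWFLAKE_QUERY = """
    SELECT DISTINCT f.name, tp.type_name
    FROM food f, type tp, sales_fact t
    WHERE f.key = t.food_key
      AND f.type_key = tp.key
      AND t.store_key = 1
    ORDER BY f.name
"""

def star_db():
    """Build a tiny star-schema database (assumed layout, invented rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE food (key INTEGER PRIMARY KEY, name TEXT, type TEXT);
        CREATE TABLE sales_fact (food_key INTEGER, store_key INTEGER);
        INSERT INTO food VALUES (1, 'Apple', 'Fruit'), (2, 'Bread', 'Bakery'), (3, 'Pear', 'Fruit');
        INSERT INTO sales_fact VALUES (1, 1), (2, 1), (1, 1), (3, 2);
    """)
    return conn

def snowflake_db():
    """Build the same data with the food type normalized into its own table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE type (key INTEGER PRIMARY KEY, type_name TEXT);
        CREATE TABLE food (key INTEGER PRIMARY KEY, name TEXT, type_key INTEGER);
        CREATE TABLE sales_fact (food_key INTEGER, store_key INTEGER);
        INSERT INTO type VALUES (10, 'Fruit'), (20, 'Bakery');
        INSERT INTO food VALUES (1, 'Apple', 10), (2, 'Bread', 20), (3, 'Pear', 10);
        INSERT INTO sales_fact VALUES (1, 1), (2, 1), (1, 1), (3, 2);
    """)
    return conn

star_rows = star_db().execute(STAR_QUERY).fetchall()
snowflake_rows = snowflake_db().execute(SNOWFLAKE_QUERY).fetchall()
print(star_rows)
print(snowflake_rows)
```

Both queries return the same rows; the snowflake version simply has to touch the extra type table to get there, while storing each type name only once.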
3 Types
Certain kinds of dimension attribute changes need to be handled differently in a data warehouse