
DIMENSIONAL MODELING IN DATA WAREHOUSING

By Chiradip Bhattacharya
CSE, A, 13000216109

What is Dimensional Modeling?

• Dimensional modeling (DM) is a set of techniques and concepts used in
data warehouse design.

• It was developed by Ralph Kimball of the Kimball Group.

• Dimensional modeling is a data modeling method that helps us store
data in such a way that it is relatively easy to retrieve from the
database.

• Dimensional modeling always uses the concepts of facts (measures) and
dimensions (context).

Aim of Dimensional Modeling

• Its main aim is to express business measurements in a standardized
framework that helps end users understand the design of the business
measurements easily.

• Dimensional models are queried and maintained with SQL or
special-purpose management tools, which eases extraction of the
required information.

Entity-Relationship (ER) vs Dimensional Modeling (DM)

ER                                                    | DM
------------------------------------------------------+------------------------------------------------------------------------
Normalized data, optimized for OLTP processing        | Denormalized data in data warehouses and data marts, optimized for OLAP
Tables are units of storage                           | Data cubes are units of storage
Several tables and chains of relationships among them | Few tables; fact tables are connected to dimension tables
Volatile (frequent updates), holds current data       | Non-volatile and time-variant (keeps history)
Normal reports                                        | User-friendly, interactive, drag-and-drop multidimensional OLAP reports
Optimized for updates                                 | Optimized for retrieval
Minimizes data redundancy                             | Maximizes understandability

Why is ER not suitable for Data Warehouses?
• End users cannot easily comprehend, remember, or navigate an ER
model.

• There is no GUI that takes a general ER diagram and makes it usable
by end users.

• ER modeling is not optimized for complex, ad hoc queries. It is
optimized for repetitive, narrow queries.

• Retrieving information through an ER model takes a lot of time.

Dimensional Modeling Architecture

Components of Dimensional Modeling

a) Fact
• Facts are the measurements or metrics from your business process.

b) Dimension
• A dimension provides the context surrounding a business process
event. In simple terms, dimensions give the who, what, and where of a
fact. A dimension is a window for viewing the information in the facts.

c) Attribute
• Attributes are the various characteristics of a dimension. They are
used to search, filter, or classify facts. Dimension tables contain
attributes.

d) Fact Table
• The fact table contains the facts (measures) as well as keys to each
of the related dimension tables. It can also be defined as the place
where numerical measures about the business are stored.

• It contains two or more foreign keys and tends to have a huge number
of records.

• Useful facts tend to be numeric and additive.

• The foreign key columns allow joins with the dimension tables, and
the measure columns contain the data being analyzed.

e) Dimension Table
• A dimension table keeps the records of a dimension. It consists of
the textual descriptions of that dimension.

• A data warehouse organizes descriptive attributes as columns in
dimension tables.

• A dimension table has a primary key column that uniquely identifies
each dimension record (row).
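
As a minimal sketch of how a fact table and its dimension tables fit together, the following Python snippet uses the standard-library sqlite3 module. The table and column names (dim_product, dim_date, fact_sales) and the sample rows are hypothetical, chosen only for illustration; they do not come from the original slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- Dimension tables: one primary key plus descriptive attributes.
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    category    TEXT NOT NULL
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    day      TEXT NOT NULL,
    month    TEXT NOT NULL
);
-- Fact table: foreign keys to each dimension plus numeric measures.
CREATE TABLE fact_sales (
    product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
    date_key    INTEGER NOT NULL REFERENCES dim_date(date_key),
    quantity    INTEGER NOT NULL,
    amount      REAL NOT NULL
);
""")

conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "Pen", "Stationery"), (2, "Lamp", "Furniture")],
)
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?)",
    [(20240101, "2024-01-01", "2024-01"), (20240102, "2024-01-02", "2024-01")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [(1, 20240101, 10, 15.0), (1, 20240102, 4, 6.0), (2, 20240102, 1, 30.0)],
)


def fk_violation_rejected(connection):
    """Try to insert a fact row whose product key has no dimension row."""
    try:
        connection.execute("INSERT INTO fact_sales VALUES (99, 20240101, 1, 1.0)")
    except sqlite3.IntegrityError:
        return True
    return False


rejected = fk_violation_rejected(conn)
fact_rows = conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
```

In this sketch the fact row with product key 99 is refused, because every fact must point at an existing dimension record.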

Facts and Dimensions

Criteria      | Fact Attributes                             | Dimension Attributes
--------------+---------------------------------------------+----------------------------------------------------------
Purpose       | Measurements for reporting or analysis      | Constraints or qualifiers for the measurements
Data type     | Additive or semi-additive quantitative data | Textual, descriptive
Size          | Larger number of records                    | Smaller number of records
Reporting use | Main report contents                        | Row or report headers
Examples      | Measurements for sales                      | About time, people, departments, objects, geographic units
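
The difference between additive and semi-additive facts can be illustrated with a small Python sketch. The bank-account rows below are made up for this example. Deposits are treated as an additive fact (they can be summed across every dimension), while the end-of-day balance is treated as semi-additive (it can be summed across accounts, but not across days).

```python
from collections import defaultdict

# Hypothetical rows: (account, day, deposits_that_day, end_of_day_balance)
rows = [
    ("A", "2024-01-01", 100.0, 100.0),
    ("A", "2024-01-02", 50.0, 150.0),
    ("B", "2024-01-01", 200.0, 200.0),
    ("B", "2024-01-02", 0.0, 200.0),
]

# Additive fact: summing deposits over all accounts and all days is meaningful.
total_deposits = sum(deposit for _account, _day, deposit, _balance in rows)

# Semi-additive fact: balances may be summed across accounts for a single day...
balance_by_day = defaultdict(float)
for _account, day, _deposit, balance in rows:
    balance_by_day[day] += balance

# ...but across days we take the latest day's value instead of a sum.
closing_balance = balance_by_day[max(balance_by_day)]

# Summing balances across days double counts money and has no business meaning.
naive_balance_sum = sum(balance for _a, _d, _dep, balance in rows)
```

Here the closing balance (350.0) matches the total deposits, whereas the naive sum of balances over both days (650.0) does not correspond to any real quantity.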

Dimensional Modeling Life Cycle
The phases involved in dimensional modeling are as follows:

a) Gather the requirements
• Select the business processes to be modeled, then gather and document
the requirements for each of them.

b) Identify the grain
• The grain is the atomic level at which the data can be analyzed.
Granularity is the level of detail of the information stored in a
table.

c) Identify the dimensions
• In this step we determine the dimensions for the data model.

d) Identify the facts
• Here we identify the fact table and the relevant facts (measures) in
that table.

e) Verify the model
• Before continuing, we must verify that the dimensional model can meet
the business requirements.

f) Design the multi-dimensional schema
• In this step, the chosen schema (star, snowflake, or fact
constellation) is drawn.
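
The role of the grain can be sketched in a few lines of Python. The order-line rows and the roll_up helper below are hypothetical. The point of the sketch is that data stored at the finest grain (one row per order line) can be rolled up to any coarser level, while the reverse is not possible once only the totals are kept.

```python
from collections import Counter

# Finest grain: one row per order line (hypothetical sample data).
line_items = [
    {"order_id": 1, "day": "2024-01-01", "product": "Pen", "amount": 3.0},
    {"order_id": 1, "day": "2024-01-01", "product": "Lamp", "amount": 30.0},
    {"order_id": 2, "day": "2024-01-02", "product": "Pen", "amount": 6.0},
]


def roll_up(rows, key):
    """Aggregate the amount measure to a coarser grain defined by `key`."""
    totals = Counter()
    for row in rows:
        totals[row[key]] += row["amount"]
    return dict(totals)


# The same atomic rows answer questions at several coarser grains.
daily_totals = roll_up(line_items, "day")
product_totals = roll_up(line_items, "product")
```

If only daily_totals had been stored, the per-product question could no longer be answered, which is why the grain is fixed before the dimensions and facts are chosen.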

Multi-Dimensional Data Model Schema
a) Star Schema

• The star schema is the simplest data warehouse schema.

• It is called a star schema because its diagram resembles a star, with
points radiating from a center.

• The center of the star is the fact table, and the points of the star
are the dimension tables.

• Usually the fact table in a star schema is in third normal form
(3NF), whereas the dimension tables are denormalized.

Star Schema Example
Star schema for college database
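
A star schema for a college database might look like the following sketch, written in Python with the standard-library sqlite3 module. The tables (dim_student, dim_course, fact_result), their columns, and the sample rows are assumptions made for illustration and are not taken from the original figure. The query shows the typical star join: the fact table is joined to each dimension once and the measure is aggregated by dimension attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Denormalized dimensions: the department name is stored on each student row.
CREATE TABLE dim_student (student_key INTEGER PRIMARY KEY, name TEXT, department TEXT);
CREATE TABLE dim_course  (course_key  INTEGER PRIMARY KEY, title TEXT, credits INTEGER);
-- Central fact table: one row per student per course.
CREATE TABLE fact_result (
    student_key INTEGER REFERENCES dim_student(student_key),
    course_key  INTEGER REFERENCES dim_course(course_key),
    marks       INTEGER
);
INSERT INTO dim_student VALUES (1, 'Asha', 'CSE'), (2, 'Ravi', 'CSE'), (3, 'Meera', 'ECE');
INSERT INTO dim_course  VALUES (10, 'Databases', 4), (11, 'Networks', 3);
INSERT INTO fact_result VALUES (1, 10, 80), (2, 10, 70), (3, 10, 90), (1, 11, 60), (3, 11, 75);
""")

# Star join: one join per dimension, grouped by dimension attributes.
query = """
SELECT s.department, c.title, AVG(f.marks) AS avg_marks
FROM fact_result AS f
JOIN dim_student AS s ON s.student_key = f.student_key
JOIN dim_course  AS c ON c.course_key  = f.course_key
GROUP BY s.department, c.title
ORDER BY s.department, c.title
"""
report = conn.execute(query).fetchall()
```

Each row of report holds a department, a course title, and the average marks for that combination.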

b) Snowflake Schema

• A snowflake schema can be derived from a star schema by expanding
(normalizing) its dimension tables.

• Some dimension tables in the snowflake schema are normalized.

• The normalization splits the data into additional tables.

• The snowflake effect affects only the dimension tables, not the fact
tables.
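
The effect of snowflaking can be sketched with Python's sqlite3 module. In this hypothetical college example, the department attributes are moved out of the student dimension into their own table (dim_department), so each department is stored once, at the cost of one extra join in the query. All table names, columns, and rows are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized (snowflaked) dimension: departments get their own table.
CREATE TABLE dim_department (
    department_key INTEGER PRIMARY KEY,
    name           TEXT,
    building       TEXT
);
CREATE TABLE dim_student (
    student_key    INTEGER PRIMARY KEY,
    name           TEXT,
    department_key INTEGER REFERENCES dim_department(department_key)
);
-- The fact table is unchanged by snowflaking.
CREATE TABLE fact_result (
    student_key INTEGER REFERENCES dim_student(student_key),
    marks       INTEGER
);
INSERT INTO dim_department VALUES (1, 'CSE', 'Block A'), (2, 'ECE', 'Block B');
INSERT INTO dim_student    VALUES (1, 'Asha', 1), (2, 'Ravi', 1), (3, 'Meera', 2);
INSERT INTO fact_result    VALUES (1, 80), (2, 70), (3, 90);
""")

# Reaching the department name now takes two joins instead of one.
query = """
SELECT d.name, AVG(f.marks) AS avg_marks
FROM fact_result AS f
JOIN dim_student    AS s ON s.student_key    = f.student_key
JOIN dim_department AS d ON d.department_key = s.department_key
GROUP BY d.name
ORDER BY d.name
"""
avg_by_department = conn.execute(query).fetchall()
department_rows = conn.execute("SELECT COUNT(*) FROM dim_department").fetchone()[0]
```

The department name 'CSE' appears in a single row of dim_department even though two students belong to it, which is the redundancy saving that normalization buys.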

Snowflake Schema Example
Snowflake schema for college database

c) Fact Constellation Schema

• A fact constellation has multiple fact tables. It is also known as a
galaxy schema.

• Dimension tables can be shared between fact tables.

• It is a collection of multiple star schemas with shared dimension
tables, hence the name constellation.
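
A fact constellation can be sketched as two fact tables that share one dimension. The example below uses Python's sqlite3 module with hypothetical tables (fact_result and fact_fee_payment sharing dim_student). One common way to combine such facts, assumed here, is to aggregate each fact table separately and then join the results through the shared dimension, so that rows from one fact table do not multiply rows from the other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Shared dimension.
CREATE TABLE dim_student (student_key INTEGER PRIMARY KEY, name TEXT);
-- Two fact tables that both reference the shared dimension.
CREATE TABLE fact_result (
    student_key INTEGER REFERENCES dim_student(student_key),
    marks       INTEGER
);
CREATE TABLE fact_fee_payment (
    student_key INTEGER REFERENCES dim_student(student_key),
    amount      REAL
);
INSERT INTO dim_student      VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO fact_result      VALUES (1, 80), (1, 60), (2, 70);
INSERT INTO fact_fee_payment VALUES (1, 500.0), (1, 250.0), (2, 400.0);
""")

# Aggregate each fact table on its own, then combine via the shared dimension.
query = """
WITH result_agg AS (
    SELECT student_key, AVG(marks) AS avg_marks
    FROM fact_result GROUP BY student_key
),
fee_agg AS (
    SELECT student_key, SUM(amount) AS total_paid
    FROM fact_fee_payment GROUP BY student_key
)
SELECT s.name, r.avg_marks, p.total_paid
FROM dim_student AS s
LEFT JOIN result_agg AS r ON r.student_key = s.student_key
LEFT JOIN fee_agg    AS p ON p.student_key = s.student_key
ORDER BY s.name
"""
summary = conn.execute(query).fetchall()
```

Joining fact_result directly to fact_fee_payment would pair every result row with every payment row of the same student and inflate the totals, which is why the sketch aggregates first.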

Fact Constellation Schema Example
Fact Constellation schema for college database

Advantages of Dimensional Modeling
• Predictable, standard framework.

• Responds well to changes in user reporting needs.

• Relatively easy to add data without reloading tables.

• Standard design approaches have been developed.

• A number of products support the dimensional model.

• Dimensional models are denormalized and optimized for fast data
querying.

Utility of Dimensional Modeling

• The benefits of dimensional models carry over to Hadoop and similar
big data frameworks.

• HDFS and HBase can exploit the benefits of dimensional modeling,
thereby making the overall process of information retrieval faster.

• For a large fact table and a large dimension table, we can
denormalize the dimension table directly into the fact table. For two
very large transaction tables, we can nest the records of the child
table inside the parent table and flatten the data at run time.
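
The nesting idea in the last point can be sketched in plain Python. The orders and order_lines records, and the nest and flatten helpers, are hypothetical; a real big data platform would use its own nested types and query engine. The sketch only shows the shape of the technique: child records are stored inside their parent record, and a flat, join-like view is produced when the data is read.

```python
# Hypothetical parent and child transaction records.
orders = [
    {"order_id": 1, "customer": "Asha"},
    {"order_id": 2, "customer": "Ravi"},
]
order_lines = [
    {"order_id": 1, "product": "Pen", "amount": 3.0},
    {"order_id": 1, "product": "Lamp", "amount": 30.0},
    {"order_id": 2, "product": "Pen", "amount": 6.0},
]


def nest(parents, children, key, field="lines"):
    """Store each child record inside its parent, grouped by the shared key."""
    by_parent = {}
    for child in children:
        child_attrs = {k: v for k, v in child.items() if k != key}
        by_parent.setdefault(child[key], []).append(child_attrs)
    return [{**parent, field: by_parent.get(parent[key], [])} for parent in parents]


def flatten(nested, field="lines"):
    """Expand nested records back into one flat row per child at read time."""
    flat_rows = []
    for parent in nested:
        parent_attrs = {k: v for k, v in parent.items() if k != field}
        for child in parent[field]:
            flat_rows.append({**parent_attrs, **child})
    return flat_rows


nested_orders = nest(orders, order_lines, key="order_id")
flat_rows = flatten(nested_orders)
```

Because each parent already carries its children, producing flat_rows needs no join between two large tables; the cost is paid once, when the nested records are written.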

Conclusions
• Dimensional modeling is one of the most popular and effective
techniques used in data warehousing.

• With these models, storage can be managed efficiently. Where
dimensions are normalized, as in the snowflake schema, information need
not be repeated, which reduces the risk of inconsistency in the
database.

• The star schema is the simplest to design and implement at the
physical level, while the snowflake and fact constellation schemas are
comparatively difficult to design and implement, are time consuming,
and involve more manpower.

THANK YOU
