GRANULARITY
• Granularity is the extent to which a system is broken down into small parts.
• For example, a yard broken into inches has finer granularity than a yard broken into feet.
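• The granularity of data refers to the size into which data fields are subdivided. For example, a postal
address can be recorded, with coarse granularity, as a single field:
• address = 200 2nd Ave. South #358, St. Petersburg, FL 33701-4313 USA
or, with fine granularity, as multiple fields:
• street address = 200 2nd Ave. South #358
• city = St. Petersburg
• state = FL
• postal-code = 33701-4313
• country = USA
OR EVEN FINER GRANULARITY:
• street = 2nd Ave. South
• address number = 200
• suite/apartment = #358
• city = St. Petersburg
• state = FL
• postal-code = 33701
• postal-code-add-on = 4313
• country = USA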
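A minimal Python sketch of the same address at two granularities. The field names (`street_address`, `postal_code_add_on`, and so on) are illustrative assumptions, not a prescribed standard. It shows that fine-grained fields can be recombined into the coarse form and can be filtered without parsing.

```python
# Coarse granularity: the whole address is one opaque string.
coarse = {"address": "200 2nd Ave. South #358, St. Petersburg, FL 33701-4313 USA"}

# Finer granularity: each part is its own field (field names are illustrative).
fine = {
    "street_address": "200 2nd Ave. South #358",
    "city": "St. Petersburg",
    "state": "FL",
    "postal_code": "33701",
    "postal_code_add_on": "4313",
    "country": "USA",
}

# Fine-grained data can always be recombined into the coarse form...
rebuilt = (
    f"{fine['street_address']}, {fine['city']}, "
    f"{fine['state']} {fine['postal_code']}-{fine['postal_code_add_on']} "
    f"{fine['country']}"
)

# ...and can be filtered by a single field without any string parsing.
is_florida = fine["state"] == "FL"
```

Going the other way, from the single string to separate fields, would need parsing rules, which is why the finest useful granularity is usually captured when the data is first recorded.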
• Data granularity in a data warehouse refers to the level of detail. The lower the
level of detail, the finer the data granularity. Of course, if you want to keep data
at the lowest level of detail, you have to store a lot of data in the data warehouse.
Data may also be kept at different levels. Depending on the query, you can then go to
the particular level of detail.
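As a rough illustration of keeping data at more than one level, the sketch below rolls hypothetical daily sales rows up to a monthly level. The sample rows and the function name `roll_up_to_month` are assumptions made for this example.

```python
from collections import defaultdict

# Finest grain: one row per (date, product) with the units sold that day.
daily_sales = [
    ("2024-01-05", "widget", 3),
    ("2024-01-17", "widget", 2),
    ("2024-02-02", "widget", 4),
    ("2024-02-02", "gadget", 1),
]

def roll_up_to_month(rows):
    """Summarize daily rows to a coarser (month, product) grain."""
    monthly = defaultdict(int)
    for date, product, units in rows:
        month = date[:7]  # keep only "YYYY-MM"
        monthly[(month, product)] += units
    return dict(monthly)

monthly_sales = roll_up_to_month(daily_sales)
```

The monthly level holds fewer rows and answers monthly questions directly, but the individual days cannot be recovered from it. That is the storage versus detail trade-off described above.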
FACT TABLES
• In data warehousing, a fact table consists of the measurements, metrics, or facts of a business
process.
• The primary key of a fact table is usually a composite key that is made up of all of its foreign
keys.
• Fact tables contain the content of the data warehouse and store different types of measures, such as
additive, non-additive, and semi-additive measures.
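The composite-key point can be sketched with Python's built-in `sqlite3` module. The table and column names (`fact_sales`, `dim_date`, and so on) are hypothetical, chosen only to illustrate a primary key made up of all the foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date    (date_key    INTEGER PRIMARY KEY, day TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_store   (store_key   INTEGER PRIMARY KEY, city TEXT);

-- The fact table's primary key is the combination of its foreign keys.
CREATE TABLE fact_sales (
    date_key     INTEGER REFERENCES dim_date(date_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    store_key    INTEGER REFERENCES dim_store(store_key),
    quantity     INTEGER,
    dollar_sales REAL,
    PRIMARY KEY (date_key, product_key, store_key)
);
""")
conn.execute("INSERT INTO fact_sales VALUES (1, 10, 100, 2, 19.98)")

# A second row with the same key combination violates the composite key.
try:
    conn.execute("INSERT INTO fact_sales VALUES (1, 10, 100, 5, 49.95)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Note that SQLite does not enforce the `REFERENCES` clauses unless `PRAGMA foreign_keys = ON` is set, so this sketch only demonstrates the uniqueness of the composite key.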
FACT TABLES CONTINUED…
• A fact table is the primary table in a dimensional model where the numerical
performance measurements of the business are stored. We use the term fact to
represent a business measure.
• We can imagine standing in the marketplace watching products being sold and writing
down the quantity sold and dollar sales amount each day for each product in each
store.
• A measurement is taken at the intersection of all the dimensions (day, product, and store).
This list of dimensions defines the grain of the fact table and tells us what the scope of the
measurement is.
• The most useful facts are numeric and additive, such as dollar sales amount.
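The marketplace example can be sketched as a small Python dictionary whose key is the grain (day, product, store). The sample values and the helper `total_dollars` are assumptions for illustration. Because dollar sales is additive, it can be summed along any combination of the dimensions.

```python
# Grain: one measurement per (day, product, store).
facts = {
    ("Mon", "apples", "store_1"): {"quantity": 10, "dollars": 25.0},
    ("Mon", "apples", "store_2"): {"quantity": 4, "dollars": 10.0},
    ("Tue", "apples", "store_1"): {"quantity": 6, "dollars": 15.0},
    ("Tue", "pears", "store_1"): {"quantity": 2, "dollars": 8.0},
}

def total_dollars(day=None, product=None, store=None):
    """Sum dollar sales over every fact row that matches the given filters."""
    total = 0.0
    for (d, p, s), measures in facts.items():
        if day in (None, d) and product in (None, p) and store in (None, s):
            total += measures["dollars"]
    return total
```

For example, `total_dollars(day="Mon")` sums across products and stores, while `total_dollars(product="apples", store="store_1")` sums across days.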
ILLUSTRATION
DIMENSION TABLES
• Dimension tables are integral companions to a fact table. The dimension tables contain
the textual descriptors of the business.
• Each dimension is defined by its single primary key, designated by the PK notation,
which serves as the basis for referential integrity with any given fact table to which it is
joined.
• Dimension attributes serve as the primary source of query constraints, groupings,
and report labels. In a query or report request, attributes are identified as the "by"
words.
• For example, when a user states that he or she wants to see dollar sales by week
by brand, week and brand must be available as dimension attributes.
• Dimension table attributes play a vital role in the data warehouse. Since they are
the source of virtually all interesting constraints and report labels, they are key to
making the data warehouse usable and understandable.
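A request such as "dollar sales by week by brand" can be sketched in Python as a group-by over two dimension attributes. The keys, weeks, brands, and amounts below are made-up sample data.

```python
from collections import defaultdict

# Dimension tables: surrogate key -> descriptive attributes.
date_dim = {
    1: {"week": "2024-W01"},
    2: {"week": "2024-W01"},
    3: {"week": "2024-W02"},
}
product_dim = {10: {"brand": "Acme"}, 11: {"brand": "Zenith"}}

# Fact rows: (date_key, product_key, dollar_sales).
sales_fact = [(1, 10, 100.0), (2, 10, 50.0), (2, 11, 30.0), (3, 11, 70.0)]

# "Dollar sales by week by brand": group on the two dimension attributes.
sales_by_week_brand = defaultdict(float)
for date_key, product_key, dollars in sales_fact:
    week = date_dim[date_key]["week"]
    brand = product_dim[product_key]["brand"]
    sales_by_week_brand[(week, brand)] += dollars
```

If `week` or `brand` were missing from the dimension tables, this grouping could not be expressed at all, which is the point made about the "by" words.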
SAMPLE DIMENSION TABLE.
BRINGING TOGETHER FACTS AND
DIMENSIONS: FACT AND DIMENSION TABLES IN A
DIMENSIONAL MODEL
DIMENSIONAL MODELING (DM)
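• Dimensional modeling (DM) refers to a logical design technique often used for data warehouses. It seeks to
present the data in a standard, intuitive framework that allows for high-performance access.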
• It differs from the entity –relationship (E-R) designs in that while the E-R aims at
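• Every dimensional model is composed of one table with a multipart key, called the fact
table, and a set of smaller tables called dimension tables (D-Tables).
• Each D-Table has a single-part primary key that corresponds to exactly one component
of the multipart key in the fact table.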
• These represent Business objects or subjects; they could be equated to the entities in
E-R models
• Dimension have attributes in the same way entities have properties. these form the
• The fact table represents sales facts, that is, the amount sold, units sold, and cost.
CONFORMED DIMENSION:
• In data warehousing, a conformed dimension is a dimension which has the same meaning
to every fact with which it relates.
Conformed dimensions allow facts and measures to be categorized and described in the
same way across multiple fact tables and data marts, ensuring consistent reporting across the
enterprise.
JUNK DIMENSION:
• A junk dimension is a collection of random transactional codes or text attributes that are unrelated to
any particular dimension.
• The junk dimension is simply a structure that provides a convenient place to store the junk attributes.
E.g.: Assume that we have a gender dimension and a marital status dimension. In the fact table we would need to
maintain two keys referring to these dimensions.
• Instead, create a junk dimension that holds all the combinations of gender and marital status
(cross join the gender and marital status tables to create the junk table). Now we need to maintain only one key
in the fact table.
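The cross join described above can be sketched with `itertools.product`. The attribute values and the names `junk_dim` and `junk_key` are illustrative assumptions.

```python
from itertools import product

genders = ["Female", "Male"]
marital_statuses = ["Single", "Married"]

# Cross join: one junk-dimension row per combination, with a surrogate key.
junk_dim = {
    key: {"gender": g, "marital_status": m}
    for key, (g, m) in enumerate(product(genders, marital_statuses), start=1)
}

# Reverse lookup used at load time to find the single key for a fact row.
junk_key = {
    (row["gender"], row["marital_status"]): key for key, row in junk_dim.items()
}

# The fact row stores one junk key instead of two separate dimension keys.
fact_row = {
    "customer_id": 42,
    "junk_key": junk_key[("Male", "Married")],
    "amount": 99.0,
}
```

With two attributes of two values each the junk table has four rows. The row count is the product of the attribute cardinalities, so this approach suits low-cardinality flags and codes.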
DEGENERATE DIMENSION:
• A degenerate dimension is a dimension key in the fact table that does not have its own dimension table, because all
the interesting attributes have been placed in analytic dimensions.
ROLE-PLAYING DIMENSION:
• Dimensions which are often used for multiple purposes within the same database are
called role-playing dimensions. For example, a date dimension can be used for "date of
sale" as well as "date of delivery" or "date of hire".
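A role-playing date dimension can be sketched with `sqlite3`: one physical `dim_date` table is joined twice under different aliases. The table names and the two roles used here (sale date and delivery date) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, calendar_date TEXT);
CREATE TABLE fact_orders (
    order_id          INTEGER PRIMARY KEY,
    sale_date_key     INTEGER REFERENCES dim_date(date_key),
    delivery_date_key INTEGER REFERENCES dim_date(date_key)
);
INSERT INTO dim_date VALUES (1, '2024-03-01'), (2, '2024-03-04');
INSERT INTO fact_orders VALUES (500, 1, 2);
""")

# The same physical dimension is joined twice, once per role.
row = conn.execute("""
    SELECT o.order_id, sale.calendar_date, delivery.calendar_date
    FROM fact_orders AS o
    JOIN dim_date AS sale     ON sale.date_key     = o.sale_date_key
    JOIN dim_date AS delivery ON delivery.date_key = o.delivery_date_key
""").fetchone()
```

In practice the roles are often exposed as views over the single date table so that report labels stay distinct, but the underlying data is stored once.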
• The schema is a logical description of the entire database. The schema includes the
name and description of records of all record types including all associated data-items
and aggregates.
• Like a database, a data warehouse also requires a schema. A database uses
the relational model, whereas a data warehouse uses the star, snowflake, and
fact constellation schemas.
• The word schema comes from the Greek word skhēma, which means shape or, more generally, plan.
• In star schema each dimension is represented with only one dimension table.
• The following diagram shows the sales data of a company with respect to the
four dimensions, namely time, item, branch, and location.
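[Diagram: star schema for sales. Recoverable contents:]
• sales fact table: branch_key, location_key; measures: dollars_sold, units_sold, avg_sales
• branch dimension: branch_key, branch_name, branch_type
• location dimension: location_key, street, city, province_or_state, country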
• There is a fact table at the centre.
• The fact table also contains the measures dollars sold and units sold.
Note: Each dimension has only one dimension table and each table holds a set of attributes.
• For example, the location dimension table contains the attribute set
{location_key, street, city, province_or_state, country}.
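A minimal star schema along these lines can be sketched with `sqlite3`. Only the branch and location dimensions are included, and the sample rows are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE location (
    location_key INTEGER PRIMARY KEY,
    street TEXT, city TEXT, province_or_state TEXT, country TEXT
);
CREATE TABLE branch (
    branch_key INTEGER PRIMARY KEY, branch_name TEXT, branch_type TEXT
);
-- Fact table at the centre: keys to each dimension plus the measures.
CREATE TABLE sales (
    branch_key   INTEGER REFERENCES branch(branch_key),
    location_key INTEGER REFERENCES location(location_key),
    dollars_sold REAL,
    units_sold   INTEGER
);
INSERT INTO location VALUES (1, '1 Main St', 'Toronto', 'Ontario', 'Canada'),
                            (2, '9 High St', 'Ottawa',  'Ontario', 'Canada');
INSERT INTO branch VALUES (1, 'B1', 'retail');
INSERT INTO sales VALUES (1, 1, 100.0, 4), (1, 2, 50.0, 2), (1, 1, 25.0, 1);
""")

# One join per dimension is enough, because each dimension is a single table.
rows = conn.execute("""
    SELECT l.city, SUM(s.dollars_sold), SUM(s.units_sold)
    FROM sales AS s JOIN location AS l ON l.location_key = s.location_key
    GROUP BY l.city ORDER BY l.city
""").fetchall()
```

The query reaches any location attribute with a single join, which is the practical benefit of keeping each dimension in one table.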
SNOWFLAKE SCHEMA
• For example, the item dimension table in the star schema is normalized and split into two
dimension tables, namely the item and supplier tables.
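[Diagram: snowflake schema for sales. Recoverable contents:]
• sales fact table: branch_key, location_key; measures: dollars_sold, units_sold, avg_sales
• branch dimension: branch_key, branch_name, branch_type
• location dimension: location_key, street, city_key
• city dimension (normalized out of location): city_key, city, province_or_state, country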
• Therefore the item dimension table now contains the attributes item_key, item_name,
type, brand, and supplier_key.
• The supplier dimension table contains the attributes supplier_key and supplier_type.
• Snowflake schemas use less space to store dimension tables. This is because, as a
rule, a normalized database produces far fewer redundant records.
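The item and supplier split can be sketched in plain Python. The sample rows and the function name `snowflake` are assumptions. The star-style rows here carry both `supplier_key` and `supplier_type` so that the normalization step is visible.

```python
# Star-style item dimension: supplier_type is repeated on every item row.
item_star = [
    {"item_key": 1, "item_name": "pen", "supplier_key": 7, "supplier_type": "wholesale"},
    {"item_key": 2, "item_name": "pencil", "supplier_key": 7, "supplier_type": "wholesale"},
    {"item_key": 3, "item_name": "eraser", "supplier_key": 8, "supplier_type": "import"},
]

def snowflake(items):
    """Split the item rows into an item table and a supplier table."""
    item_table, supplier_table = [], {}
    for row in items:
        item_table.append(
            {k: row[k] for k in ("item_key", "item_name", "supplier_key")}
        )
        supplier_table[row["supplier_key"]] = {"supplier_type": row["supplier_type"]}
    return item_table, supplier_table

item_table, supplier_table = snowflake(item_star)

# Each supplier_type is now stored once per supplier, not once per item.
stored_before = len(item_star)       # supplier_type values in the star form
stored_after = len(supplier_table)   # supplier_type values after normalization
```

The saving grows with the number of items per supplier. The cost is an extra join whenever a query needs `supplier_type`.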
• In a fact constellation schema there are multiple fact tables. This schema is also known as a galaxy
schema.
• In the following diagram we have two fact tables, namely sales and shipping.
EXAMPLE OF FACT CONSTELLATION
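[Diagram: fact constellation with two fact tables. Recoverable contents:]
• time dimension: time_key, day, day_of_the_week, month, quarter, year
• item dimension: item_key, item_name, brand, type, supplier_type
• Sales Fact Table: time_key, item_key, branch_key
• Shipping Fact Table: time_key, item_key, shipper_key, from_location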
• The shipping fact table has five dimensions, namely item_key, time_key, shipper_key,
from_location, and to_location.
• The shipping fact table also contains two measures, namely dollars sold and units sold.
• It is also possible to share dimension tables between fact tables. For example, the time,
item, and location dimension tables are shared between the sales and shipping fact tables.
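Sharing a dimension between two fact tables can be sketched as follows. The keys and quantities are made-up sample data, and `units_by_item` is a hypothetical helper.

```python
# Shared dimension, used by both fact tables.
item_dim = {1: "laptop", 2: "phone"}

# Two fact tables with different dimensions but common dimension keys.
sales_fact = [  # (time_key, item_key, units_sold)
    (20240101, 1, 5), (20240101, 2, 8), (20240102, 1, 3),
]
shipping_fact = [  # (time_key, item_key, shipper_key, units_sold)
    (20240102, 1, 90, 6), (20240103, 2, 91, 8),
]

def units_by_item(fact_rows, measure_index):
    """Total one measure per item name, via the shared item dimension."""
    totals = {name: 0 for name in item_dim.values()}
    for row in fact_rows:
        totals[item_dim[row[1]]] += row[measure_index]
    return totals

sold = units_by_item(sales_fact, 2)
shipped = units_by_item(shipping_fact, 3)

# Because both results use the same item names, they line up side by side.
comparison = {name: (sold[name], shipped[name]) for name in item_dim.values()}
```

The comparison works only because both fact tables refer to the same item keys with the same meaning.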
WHAT IS A FACT TABLE?
• A fact table is one that consists of the measurements, metrics, or facts of a business
process.
• These measurable facts are used to know the business value and to forecast the future.
• Additive: Additive facts are facts that can be summed up through all of the
dimensions in the fact table.
• Semi-Additive: Semi-additive facts are facts that can be summed up for some of the
dimensions in the fact table, but not the others.
E.g.: A daily balances fact can be summed up through the customers dimension but
not through the time dimension.
• Non-Additive: Non-additive facts are facts that cannot be summed up for any of the
dimensions present in the fact table.
Profit margins are non-additive. If a department has two employees, and one
employee has sold an item with a 55% profit margin and the other has sold an
item with a 45% profit margin, the profit margin for the department is not 100%.
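The difference between the fact types can be sketched numerically. The balances, revenues, and profits below are made-up figures. The two sales were chosen to give 55% and 45% margins, echoing the example above.

```python
# Semi-additive: daily account balances per customer.
balances = {
    ("2024-01-01", "alice"): 100.0,
    ("2024-01-01", "bob"): 50.0,
    ("2024-01-02", "alice"): 120.0,
    ("2024-01-02", "bob"): 50.0,
}
# Summing across customers for one day is meaningful (total held that day).
total_on_jan_2 = sum(v for (day, _), v in balances.items() if day == "2024-01-02")
# Summing one customer across days is not: 100 + 120 is not a balance.
alice_sum_over_days = sum(v for (_, who), v in balances.items() if who == "alice")

# Non-additive: profit margins. Recompute the ratio from additive parts instead.
sales = [
    {"revenue": 100.0, "profit": 55.0},  # 55% margin
    {"revenue": 200.0, "profit": 90.0},  # 45% margin
]
# Adding the two margins gives about 1.00 (100%), which means nothing.
summed_margins = sum(s["profit"] / s["revenue"] for s in sales)
# The department margin is total profit divided by total revenue.
department_margin = sum(s["profit"] for s in sales) / sum(s["revenue"] for s in sales)
```

A common design response is to store the additive components (revenue and profit) in the fact table and compute ratios such as margin at query time.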
FACTLESS FACT TABLES
• A factless fact table is a fact table that does not contain any facts. It contains only
dimensional keys, and it captures events that matter at the information level but are not
included in calculations: it simply records that an event happened over a
period.
• A factless fact table captures the many-to-many relationships between dimensions,
but contains no numeric or textual facts. Factless fact tables are often used to record events or
coverage information. Common examples of factless fact tables include:
• Identifying product promotion events (to determine promoted products that didn’t
sell)
• Events or activities occur that you wish to track, but you find no measurements. In
situations like this, build a standard transaction-grained fact table that contains no
facts.
• For example, a factless fact table can be used to capture the leave taken by an employee.
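An employee-leave table of this kind can be sketched as rows that hold only dimension keys. The column layout and key values are assumptions for illustration.

```python
from collections import Counter

# Factless fact table: one row per leave event, dimension keys only.
# Columns: (employee_key, date_key, leave_type_key). No measure column.
fact_leave = [
    (101, 20240105, 1),
    (101, 20240106, 1),
    (102, 20240105, 2),
]

# Analysis is done by counting rows rather than summing a measure.
leave_days_per_employee = Counter(employee for employee, _, _ in fact_leave)
```

Here the existence of a row is the information, so the only aggregate that makes sense is a count.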
• It is used to support negative analysis reports, for example a report on stores that did not sell a
product for a given period. To produce such a report, you need a fact table that
captures all the possible combinations. You can then figure out what is missing.
• E.g., fact_promo gives information about the products that have promotions but
still did not sell.
• This fact table answers questions such as:
• The list of products that have a promotion but did not sell.
• This kind of factless fact table is used to track conditions, coverage, or eligibility.
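The negative analysis can be sketched as a set difference between a coverage table and the keys present in the sales fact table. The name `fact_promo` comes from the text. The key layout (product, store, week) and the sample rows are assumptions.

```python
# Coverage table (factless): every (product, store, week) that was on promotion.
fact_promo = {
    ("soap", "store_1", "2024-W01"),
    ("soap", "store_2", "2024-W01"),
    ("tea", "store_1", "2024-W01"),
}

# Keys that appear in the sales fact table for the same period.
sold = {
    ("soap", "store_1", "2024-W01"),
    ("tea", "store_1", "2024-W01"),
    ("milk", "store_1", "2024-W01"),  # sold, but not promoted
}

# Promoted but not sold: set difference between coverage and sales.
promoted_not_sold = sorted(fact_promo - sold)
```

The sales table alone cannot answer this question, because a sale that never happened leaves no row. The coverage table supplies the missing side of the comparison.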