
DATA WAREHOUSING

Data Warehousing Concepts


• What is a Data Warehouse?
• A data warehouse is a subject-oriented, integrated, time-variant,
non-volatile collection of data in support of management's
decision-making process (OR)
A data warehouse is a relational database designed for query and analysis
rather than for transaction processing.

• Subject-oriented (customer, products, sales, etc.)
• Non-volatile
• Time-variant
• Integrated
(William Inmon, 1993)
Dimension Modeling
Dimension: A dimension is a structure
that consists of levels, with
hierarchies defined over those levels.
Example:
SEX

MALE FEMALE
Dimension Modeling
Example:
Level 0: Profession
Level 1: Engineer, Secretary, Teacher
Level 2: Chemical, Civil (Engineer); Executive, Junior (Secretary); Elementary, High School (Teacher)
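The Profession example can be sketched as a nested structure. This is a minimal illustration, not part of the original slides: the names PROFESSION and level_members are hypothetical, and the grouping of level 2 members under their parents is inferred from the slide layout.

```python
# Hypothetical nested-dict encoding of the Profession dimension hierarchy.
# Level 0 is the dimension root, level 1 the profession, level 2 the specialty.
# The parent/child grouping below is an assumption read off the slide layout.
PROFESSION = {
    "Profession": {
        "Engineer": ["Chemical", "Civil"],
        "Secretary": ["Executive", "Junior"],
        "Teacher": ["Elementary", "High School"],
    }
}


def level_members(hierarchy, level):
    """Return the member names found at the given level (0, 1, or 2)."""
    root = next(iter(hierarchy))
    if level == 0:
        return [root]
    if level == 1:
        return list(hierarchy[root])
    if level == 2:
        return [leaf for leaves in hierarchy[root].values() for leaf in leaves]
    raise ValueError("this sketch only models levels 0 to 2")


print(level_members(PROFESSION, 1))
```

Rolling up from level 2 to level 1 then amounts to replacing each specialty with its parent key.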
FACTS
• Fact: A fact table holds the data being analyzed. It has a
primary key, foreign key relationships
with the dimensions, and
measures.
• There are three types of facts:
1. ADDITIVE FACTS
2. SEMI-ADDITIVE FACTS
3. NON-ADDITIVE FACTS
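The slide only names the three fact types. As a rough sketch of the usual distinction (additive measures sum across every dimension, semi-additive measures sum across some dimensions only, non-additive measures cannot be summed), the following uses made-up account snapshot rows; the data and all variable names are hypothetical.

```python
# Hypothetical daily snapshots: (date, account, deposit, balance, interest_rate)
rows = [
    ("2024-01-01", "A", 100.0, 1100.0, 0.02),
    ("2024-01-02", "A", 50.0, 1150.0, 0.02),
    ("2024-01-01", "B", 200.0, 2200.0, 0.03),
    ("2024-01-02", "B", 0.0, 2200.0, 0.03),
]

# Additive: deposits can be summed across every dimension (accounts and dates).
total_deposits = sum(r[2] for r in rows)

# Semi-additive: balances sum across accounts for one date, but not across dates.
balance_on_jan_2 = sum(r[3] for r in rows if r[0] == "2024-01-02")

# Non-additive: a rate cannot be summed; aggregate it another way, e.g. an average.
average_rate = sum(r[4] for r in rows) / len(rows)

print(total_deposits, balance_on_jan_2)
```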
Factless Fact & Conformed Dimension
• A factless fact is a fact table that does not
contain measures.
• A dimension that is shared by more than
one fact table is called a conformed dimension.
• A collection of star schemas and
snowflake schemas is called a galaxy.
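A factless fact can be pictured as rows that carry only dimension keys, so analysis reduces to counting rows. The attendance example below is a hypothetical illustration; the table name and key values are assumptions, not from the slides.

```python
from collections import Counter

# Hypothetical factless fact: each row records that an event occurred
# (a student attended a class on a date). It holds only dimension keys.
attendance_fact = [
    {"date_key": 1, "student_key": 10, "class_key": 100},
    {"date_key": 1, "student_key": 11, "class_key": 100},
    {"date_key": 2, "student_key": 10, "class_key": 101},
]

# With no measures, the analysis is a row count per dimension key.
attendance_per_class = Counter(row["class_key"] for row in attendance_fact)

print(attendance_per_class[100])
```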
Star Schema & Snowflake Schema
• Star schema:
A centralized fact table
surrounded by dimension tables,
with primary key to foreign key relationships
between them, is called a star schema.
• Snowflake schema:
A normalized star schema is
called a snowflake schema.
Star Schema
Fact Table: Profession key, Sex key, Address key, Date key, Measures (Numeric)
Sex-Dim: Sex key, Sex
Date-Dim: Date key, Current year, Current month, Current week
Address Dim: Address key, Country, State, City
Profession-Dim: Profession-key, Profession-class, Title, Level, discipline
(The diagram also carries the label "Conform Dim".)
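The star layout can be tried out with Python's built-in sqlite3 module. This is a trimmed sketch under stated assumptions: only two of the dimensions are created, the measure column amount and all sample rows are hypothetical, and the column lists are shortened.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sex_dim (sex_key INTEGER PRIMARY KEY, sex TEXT);
CREATE TABLE profession_dim (
    profession_key INTEGER PRIMARY KEY,
    profession_class TEXT,
    title TEXT
);
-- The fact table references each dimension by foreign key and holds the measure.
CREATE TABLE fact_table (
    profession_key INTEGER REFERENCES profession_dim(profession_key),
    sex_key INTEGER REFERENCES sex_dim(sex_key),
    amount REAL  -- hypothetical numeric measure
);
INSERT INTO sex_dim VALUES (1, 'MALE'), (2, 'FEMALE');
INSERT INTO profession_dim VALUES (1, 'Engineer', 'Civil'), (2, 'Teacher', 'Elementary');
INSERT INTO fact_table VALUES (1, 1, 10.0), (1, 2, 20.0), (2, 2, 5.0);
""")

# A typical star join: one hop from the fact table to a dimension.
star_totals = conn.execute("""
    SELECT s.sex, SUM(f.amount)
    FROM fact_table AS f
    JOIN sex_dim AS s ON s.sex_key = f.sex_key
    GROUP BY s.sex
    ORDER BY s.sex
""").fetchall()

print(star_totals)
```

Every dimension is reachable from the fact table in a single join, which is what keeps star queries simple.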
Snowflake Schema
Fact Table: Time key, Item key, Location key, Rupees sold, Units sold
Time Dim: Time key, Year, Quarter, Month, day
Item Dim: Item key, Item name, Type, Supplier key
Supplier Dim: Supplier key, Supplier name, Supplier address, Supplier type
Location Dim: Location key, Street, City key
City Dim: City key, City name, State, Country, Pin code
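In the snowflake layout, the Location dimension points to a separate City dimension, so a query by city needs two joins instead of one. The sketch below uses Python's sqlite3 module; the table name sales_fact and all sample rows are hypothetical, and only the Location and City branch is modelled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE city_dim (city_key INTEGER PRIMARY KEY, city_name TEXT, state TEXT);
-- Location is normalized: it points to city_dim instead of repeating city columns.
CREATE TABLE location_dim (
    location_key INTEGER PRIMARY KEY,
    street TEXT,
    city_key INTEGER REFERENCES city_dim(city_key)
);
CREATE TABLE sales_fact (
    location_key INTEGER REFERENCES location_dim(location_key),
    rupees_sold REAL,
    units_sold INTEGER
);
INSERT INTO city_dim VALUES (1, 'Chennai', 'Tamil Nadu'), (2, 'Pune', 'Maharashtra');
INSERT INTO location_dim VALUES (10, 'Street A', 1), (11, 'Street B', 1), (12, 'Street C', 2);
INSERT INTO sales_fact VALUES (10, 500.0, 5), (11, 300.0, 3), (12, 200.0, 2);
""")

# Two hops: fact -> location_dim -> city_dim.
snowflake_totals = conn.execute("""
    SELECT c.city_name, SUM(f.rupees_sold), SUM(f.units_sold)
    FROM sales_fact AS f
    JOIN location_dim AS l ON l.location_key = f.location_key
    JOIN city_dim AS c ON c.city_key = l.city_key
    GROUP BY c.city_name
    ORDER BY c.city_name
""").fetchall()

print(snowflake_totals)
```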
TYPES OF MAPPINGS
Mapping (History kept)
• Simple pass through (None)
• Slowly growing target (Full)
• Slowly changing dimension (depends)
Types of SCDs
• Slowly changing dimension-1
• Slowly changing dimension-2
   1. Time stamping
   2. Versioning
   3. Flagging
• Slowly changing dimension-3
Slowly changing Dimension-1
• SCD-1: Use this kind of mapping when you do not want
history. Only an insert or an
update takes place: it inserts the
new row or updates the existing
dimension row.
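The SCD-1 behaviour (insert, else update, with no history) can be sketched in a few lines. This is an illustration over plain Python dictionaries, not how any particular ETL tool implements it; apply_scd1 and the customer columns are hypothetical names.

```python
def apply_scd1(dimension, incoming, key="customer_id"):
    """Insert new rows and overwrite changed ones; no history is kept."""
    by_key = {row[key]: row for row in dimension}
    for row in incoming:
        if row[key] in by_key:
            by_key[row[key]].update(row)  # update in place, the old value is lost
        else:
            new_row = dict(row)
            dimension.append(new_row)
            by_key[row[key]] = new_row
    return dimension


customers = [{"customer_id": 1, "city": "Pune"}]
apply_scd1(customers, [
    {"customer_id": 1, "city": "Mumbai"},  # existing key: overwritten
    {"customer_id": 2, "city": "Delhi"},   # new key: inserted
])
print(customers)
```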
Slowly changing Dimension-2
• SCD-2 (Time stamp): Use this kind of mapping when you want to
maintain full history.
Inserts new and changed dimensions. Creates an
effective date range to track changes.
• SCD-2 (Versioning): Inserts new and
changed dimensions. Creates a version number
and increments the primary key to track changes.
• SCD-2 (Flagging): Inserts new and changed
dimensions. Flags the current version and
increments the primary key to track changes.
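The three SCD-2 variants differ only in which tracking columns they maintain. The sketch below keeps all three at once (date range, version number, current flag) so they can be compared side by side; a real mapping would normally pick one. The function apply_scd2 and every column name are assumptions made for illustration.

```python
from datetime import date


def apply_scd2(dimension, incoming, load_date, key="customer_id"):
    """Insert new and changed rows as new versions; old versions are kept."""
    for row in incoming:
        current = next(
            (d for d in dimension if d[key] == row[key] and d["is_current"]), None
        )
        tracked = {k: v for k, v in row.items() if k != key}
        if current is not None:
            if all(current[k] == v for k, v in tracked.items()):
                continue  # nothing changed, keep the current version
            current["end_date"] = load_date   # time stamping: close the date range
            current["is_current"] = False     # flagging: retire the old version
        dimension.append({
            "surrogate_key": len(dimension) + 1,  # incremented primary key
            key: row[key],
            **tracked,
            "version": current["version"] + 1 if current else 1,  # versioning
            "start_date": load_date,
            "end_date": None,
            "is_current": True,
        })
    return dimension


history = []
apply_scd2(history, [{"customer_id": 1, "city": "Pune"}], date(2024, 1, 1))
apply_scd2(history, [{"customer_id": 1, "city": "Mumbai"}], date(2024, 6, 1))
print(len(history))
```

After the second load both rows remain: the Pune row is closed and unflagged, and the Mumbai row is the current version.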
Slowly changing Dimension-3
• SCD-3: Use this kind of mapping when you want partial history.
It inserts new
dimensions, updates changed values in
existing dimensions, and optionally uses the
load date to track changes.
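Partial history in SCD-3 is commonly kept by storing the previous value in an extra column of the same row. The sketch below assumes that design; apply_scd3, the previous_city column, and the sample data are hypothetical.

```python
def apply_scd3(dimension, incoming, load_date, key="customer_id", column="city"):
    """Keep the current value plus one previous value in the same row."""
    by_key = {row[key]: row for row in dimension}
    for row in incoming:
        existing = by_key.get(row[key])
        if existing is None:
            new_row = {
                key: row[key],
                column: row[column],
                "previous_" + column: None,
                "load_date": load_date,
            }
            dimension.append(new_row)
            by_key[row[key]] = new_row
        elif existing[column] != row[column]:
            existing["previous_" + column] = existing[column]  # keep one old value
            existing[column] = row[column]
            existing["load_date"] = load_date  # optional change tracking
    return dimension


dim = []
apply_scd3(dim, [{"customer_id": 1, "city": "Pune"}], "2024-01-01")
apply_scd3(dim, [{"customer_id": 1, "city": "Mumbai"}], "2024-06-01")
apply_scd3(dim, [{"customer_id": 1, "city": "Delhi"}], "2024-09-01")
print(dim)
```

After the third load the row holds Delhi and Mumbai; Pune is gone, which is why the history is only partial.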
Data Warehouse Execution Architecture
• Architecture
Stages: Source Systems, ETL, Data Warehouse, Reporting (DSS) Servers
Source systems: DB2/400, Oracle, SQL Server, Flat files
Connectivity (labels in the diagram): ODBC, Native, FTP
ETL: Informatica, on UNIX-HP
Data warehouse: NCR Teradata