• BI Application Architecture
• History of Data Warehouse
• What is Data Warehouse
• What is a Data Mart
• OLTP Vs OLAP
• Top-Down Approach (Dr. Inmon)
• Bottom-Up Approach (Dr. Kimball)
BI Application Architecture
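[Diagram: BI application architecture]
• OLTP (Bank): read/write, live data, small size. Sources: UK, US, India.
• ETL (Extract, Transform, Load) into the Staging DB.
• Data cleansing in the Staging DB: remove duplicates, merging, split, convert.
• Data warehouse (OLAP): read only, history, huge size.
• Data mart (labels in the diagram: Loans, Insurance, Credit card).
• End users.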
History:
• Dr. William H. Inmon: father of the data warehouse.
• Dr. Ralph Kimball: introduced dimension modeling to design the DWH in 1990.
What is Data Warehouse?
[Diagram: OLTP, DWS, and data marts (Credit Card, Insurance)]
Bottom-up Approach
[Diagram: OLTP, DWS, and three data marts]
Normalization: converting larger tables into smaller, related tables to reduce redundancy.
OLTP (HEAP)
EMPID  ENAME   SALARY  DEPT  LOCATION
19     SMITH   10000   10    CHICAGO
15     MILLER  25000   20    NEWYORK
1      FORD    15000   10    CHICAGO
20     ADAM    40000   30    WASHINGTON

OLAP (INDEXES)
EMPID  ENAME   SALARY  DEPT  LOCATION
1      FORD    15000   10    CHICAGO
15     MILLER  25000   20    NEWYORK
19     SMITH   10000   10    CHICAGO
20     ADAM    40000   30    WASHINGTON
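The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration under simple assumptions: plain lists stand in for the tables, and the function names `heap_lookup` and `index_lookup` are hypothetical. A real database engine stores heaps and indexes very differently.

```python
from bisect import bisect_left

# Rows in arrival order, as in the heap table
# (EMPID, ENAME, SALARY, DEPT, LOCATION).
heap = [
    (19, "SMITH", 10000, 10, "CHICAGO"),
    (15, "MILLER", 25000, 20, "NEWYORK"),
    (1, "FORD", 15000, 10, "CHICAGO"),
    (20, "ADAM", 40000, 30, "WASHINGTON"),
]


def heap_lookup(rows, empid):
    """Scan the rows one by one until the key is found."""
    for row in rows:
        if row[0] == empid:
            return row
    return None


# The same rows ordered by EMPID, as in the indexed table.
indexed = sorted(heap, key=lambda row: row[0])
keys = [row[0] for row in indexed]


def index_lookup(rows, sorted_keys, empid):
    """Binary-search the sorted keys instead of scanning every row."""
    pos = bisect_left(sorted_keys, empid)
    if pos < len(sorted_keys) and sorted_keys[pos] == empid:
        return rows[pos]
    return None
```

Both lookups return the same row; the ordered version simply needs fewer comparisons as the table grows.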
[Diagram: OLTP -> ETL -> OLAP. Labels: To Design OLTP, To Design OLAP (DWH), Amazon ETL, Amazon DWH]
OLTP to OLAP Table Conversion
OLTP Contains
- Master Tables
• Product Tables
• Category Tables
• Customer Tables
- Transactional Tables
• Payment Details
• Order Details
• Delivery Details
In OLAP
- Master Tables converted as Dimension Tables
- Transactional Tables converted as Fact Tables
[Diagram: Master -> Dimension, Transactional -> Fact. Product Master -> Product Dimension, Customer Master -> Customer Dimension, Category Master, Sales (transactional) -> Sales (fact)]
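The conversion can be sketched as follows. This is a minimal illustration, not a definitive ETL implementation: the sample rows, the helper `build_dimension`, and the use of integer surrogate keys are all assumptions made for the example.

```python
# Hypothetical OLTP rows; column names and values are illustrative only.
product_master = [
    {"productid": "P10", "name": "Phone", "price": 300},
    {"productid": "P20", "name": "Laptop", "price": 900},
]
customer_master = [
    {"custid": "C1", "name": "Smith", "city": "Chicago"},
]
sales_transactions = [
    {"productid": "P10", "customerid": "C1", "qty": 2, "amount": 600},
    {"productid": "P20", "customerid": "C1", "qty": 1, "amount": 900},
]


def build_dimension(master_rows, natural_key):
    """Copy master rows into a dimension, adding an integer surrogate key."""
    dimension, key_map = [], {}
    for surrogate, row in enumerate(master_rows, start=1):
        dimension.append({"key": surrogate, **row})
        key_map[row[natural_key]] = surrogate
    return dimension, key_map


# Master tables become dimension tables.
dim_product, product_keys = build_dimension(product_master, "productid")
dim_customer, customer_keys = build_dimension(customer_master, "custid")

# Transactional rows become fact rows: numeric measures plus dimension keys.
fact_sales = [
    {
        "product_key": product_keys[t["productid"]],
        "customer_key": customer_keys[t["customerid"]],
        "qty": t["qty"],
        "amount": t["amount"],
    }
    for t in sales_transactions
]
```

The fact rows carry only keys and measures; descriptive attributes such as the product name stay in the dimension.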
ERD Modeling
OLTP
Master (PK, string): incremental load
• Product Master(productid, name, price): 50 GB + 50 rows
• customerMaster(custid, name, addr, city, phone): 40 GB + 1000 rows
Transactional (FK, numeric): full load
• Sales(productid, customerid, qty, amount, date): 500 GB, delete all data
[ERD diagram]
• Product: Pid (PK), Caid (FK), pName, price
• Customer: cid (PK), Cname, Address, Phone, age
• Sales: Pid (FK), cid (FK), Quantity, Amount, date
• Relationship: Product M purchase N Customer. Other cardinality labels in the diagram: M : 1 and 1 : 1
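The two load styles named above, incremental load and full load, can be contrasted with a small sketch. The function names and sample rows are hypothetical, and the incremental version only appends rows with new keys; it does not handle updates to existing rows.

```python
def full_load(target, source_rows):
    """Delete everything in the target, then reload all source rows."""
    target.clear()
    target.extend(dict(row) for row in source_rows)
    return len(target)


def incremental_load(target, source_rows, key):
    """Append only the source rows whose key is not yet in the target."""
    existing = {row[key] for row in target}
    new_rows = [dict(row) for row in source_rows if row[key] not in existing]
    target.extend(new_rows)
    return len(new_rows)


# Hypothetical master data: one new product has appeared in the source.
source_products = [
    {"productid": "P1", "name": "Phone"},
    {"productid": "P2", "name": "Laptop"},
    {"productid": "P3", "name": "Tablet"},
]
target_products = [
    {"productid": "P1", "name": "Phone"},
    {"productid": "P2", "name": "Laptop"},
]
added = incremental_load(target_products, source_products, "productid")

# Hypothetical transactional data: the target is wiped and rebuilt.
source_sales = [
    {"productid": "P1", "customerid": "C1", "qty": 2},
    {"productid": "P3", "customerid": "C1", "qty": 1},
]
target_sales = [{"productid": "P9", "customerid": "C9", "qty": 5}]
reloaded = full_load(target_sales, source_sales)
```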
• Dimension modeling was developed by Dr. Kimball
• It is a DWH design concept
• It is a process to convert OLTP data into OLAP data in the form of dimension tables and fact tables
• Dimension modeling is the heart of the data warehouse
[Diagram: hierarchy levels Day, Week (W1 to W4), Month, and Quarter (Q1 to Q4), plus a city label, with sample values 50, 50, 100, 200]
Star schema:
1. Has redundant data and hence is tough to maintain/change
2. Less complex queries and hence easy to understand
3. Fewer foreign keys, so faster execution time
5. Good to use for a large database
7. When dimension tables have a small number of rows, we can go for this

Snowflake schema:
1. No redundancy and hence easier to maintain/change
2. More complex queries and hence less easy to understand
3. More foreign keys, so it takes more execution time
5. Good to use for a small database
7. When dimension tables are relatively big in size, this is better as it reduces space
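Assuming the comparison above contrasts a denormalized layout with a normalized one, the trade-off between redundancy and extra foreign keys can be sketched with Python's built-in `sqlite3` module. All table names, column names, and rows here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Denormalized dimension: the category name is repeated on every product row.
CREATE TABLE dim_product_star (
    pid_key INTEGER PRIMARY KEY, pname TEXT, category TEXT);

-- Normalized dimension: the category lives in its own table.
CREATE TABLE dim_category (caid_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_product_snow (
    pid_key INTEGER PRIMARY KEY, pname TEXT,
    caid_key INTEGER REFERENCES dim_category(caid_key));

CREATE TABLE fact_sales (pid_key INTEGER, amount INTEGER);

INSERT INTO dim_product_star VALUES
    (1, 'Phone', 'Electronics'), (2, 'Laptop', 'Electronics');
INSERT INTO dim_category VALUES (1, 'Electronics');
INSERT INTO dim_product_snow VALUES (1, 'Phone', 1), (2, 'Laptop', 1);
INSERT INTO fact_sales VALUES (1, 600), (2, 900);
""")

# One join: the category is already on the product dimension.
star_total = conn.execute("""
    SELECT p.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product_star p ON p.pid_key = f.pid_key
    GROUP BY p.category
""").fetchall()

# Two joins: an extra foreign key must be followed to reach the category.
snow_total = conn.execute("""
    SELECT c.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product_snow p ON p.pid_key = f.pid_key
    JOIN dim_category c ON c.caid_key = p.caid_key
    GROUP BY c.category
""").fetchall()
```

Both queries give the same total; the normalized version stores the category name once but needs one more join to reach it.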
Types of Dimensions
• Role Playing Dimension
• Conformed Dimension
• Junk Dimension
• Degenerate Dimension
• Slowly Changing Dimension
• Inferred Dimension
• Role Play Dimension: one dimension having more than one relationship with the same fact table.
• Conformed Dimension: one single dimension shared by multiple fact tables in dimension modeling.
Role Play Dimension:
When one dimension table has more than one relationship with the fact table, the dimension is called a Role Play Dimension.
[Diagram: Fact_Sales with its dimension tables]
• Fact_Sales: Pid_key (FK), Cid_key (FK), Location_key (FK), pprice, Qty, Salesamount, Orderdate_key (FK), Deliverydate_key (FK)
• Dim_Product: Pid_key (PK), Pname, Brand
• Dim_Customer: Cid_key (PK), Cname, Address, Phone, Age
• Dim_Location: Location_Key (PK), Region, Country, State, City
• Time: date_key (PK), Date, Day, Month, Week, Qtr; linked to Fact_Sales through both Orderdate_key (FK) and Deliverydate_key (FK)
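A role play dimension shows up in a query as the same table joined twice under different aliases. The sketch below uses Python's built-in `sqlite3` module with a cut-down version of the tables above; the sample dates and amounts are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_time (date_key INTEGER PRIMARY KEY, date TEXT);
CREATE TABLE fact_sales (
    salesamount INTEGER,
    orderdate_key INTEGER REFERENCES dim_time(date_key),
    deliverydate_key INTEGER REFERENCES dim_time(date_key)
);
INSERT INTO dim_time VALUES (1, '2024-01-01'), (2, '2024-01-05');
INSERT INTO fact_sales VALUES (500, 1, 2);
""")

# The single time dimension plays two roles: order date and delivery date.
rows = conn.execute("""
    SELECT f.salesamount, o.date AS order_date, d.date AS delivery_date
    FROM fact_sales f
    JOIN dim_time o ON o.date_key = f.orderdate_key
    JOIN dim_time d ON d.date_key = f.deliverydate_key
""").fetchall()
```

Only one physical time table exists; the aliases `o` and `d` give it its two roles.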
• Conformed Dimension
• A conformed dimension relates to multiple fact tables within the same DWH.
• Dim Date is a common conformed dimension because its attributes (day, week, month, quarter, year, etc.) have the same meaning when joined to any fact table.
[Diagram: Date Dimension with Date_key (PK)]
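As a rough sketch of a conformed dimension, the example below joins one date dimension to two fact tables. It uses Python's built-in `sqlite3` module; the tables `fact_sales` and `fact_returns`, the helper `total_by_month`, and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, month TEXT);
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key), amount INTEGER);
CREATE TABLE fact_returns (
    date_key INTEGER REFERENCES dim_date(date_key), amount INTEGER);
INSERT INTO dim_date VALUES (1, 'Jan'), (2, 'Feb');
INSERT INTO fact_sales VALUES (1, 100), (1, 50), (2, 70);
INSERT INTO fact_returns VALUES (1, 20), (2, 5);
""")


def total_by_month(fact_table):
    """Sum `amount` per month for one of the two known fact tables."""
    # Only fixed table names are accepted, so the f-string cannot inject SQL.
    if fact_table not in ("fact_sales", "fact_returns"):
        raise ValueError("unknown fact table")
    query = (
        f"SELECT d.month, SUM(f.amount) FROM {fact_table} f "
        "JOIN dim_date d ON d.date_key = f.date_key "
        "GROUP BY d.date_key, d.month ORDER BY d.date_key"
    )
    return conn.execute(query).fetchall()


sales_by_month = total_by_month("fact_sales")
returns_by_month = total_by_month("fact_returns")
```

Because both fact tables share the same date keys, their monthly results line up and can be compared directly.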
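Balance of company's acct 1 for day 1: 5000
Balance of company's acct 2 for day 1: 2000
-----------------------------------------------------------------
Total: 7000
-----------------------------------------------------------------
Day 1 refers to the Date/time dimension.

(1200 - 1000)/1000 = 20%    (1300 - 1000)/1000 = 30%
(1400 - 1000)/1000 = 40%    (1600 - 1000)/1000 = 60%
(2600 - 2000)/2000 = 30%    (2900 - 2000)/2000 = 45%
The combined percentages (30% and 45%) are computed from the totals; they are not the sum of the individual percentages.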
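The percentage figures above can be checked with a short calculation. This sketch assumes each figure is the percentage growth of a value over its base; the function name `growth` and the variable names are hypothetical.

```python
def growth(new, base):
    """Percentage growth of `new` over `base`."""
    return (new - base) * 100 / base


# First column of figures.
first_a = growth(1200, 1000)
first_b = growth(1400, 1000)
first_combined = growth(1200 + 1400, 1000 + 1000)

# Second column of figures.
second_a = growth(1300, 1000)
second_b = growth(1600, 1000)
second_combined = growth(1300 + 1600, 1000 + 1000)

# Adding the individual percentages gives a different (wrong) answer.
naive_first = first_a + first_b
```

The percentage has to be recomputed from the summed values, which is why such a measure cannot simply be added across rows.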
Factless Fact:
A factless fact table doesn't contain any measures or facts.
It contains only key attributes.
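A factless fact table is still useful because its rows can be counted. The sketch below assumes a hypothetical attendance table whose rows hold nothing but dimension keys; the table name and key names are made up for illustration.

```python
from collections import Counter

# Hypothetical factless fact rows: dimension keys only, no measures.
fact_attendance = [
    {"date_key": 1, "student_key": 10, "class_key": 100},
    {"date_key": 1, "student_key": 11, "class_key": 100},
    {"date_key": 1, "student_key": 10, "class_key": 200},
    {"date_key": 2, "student_key": 11, "class_key": 200},
]

# With no measure to sum, analysis is done by counting rows per key.
attendance_per_class = Counter(row["class_key"] for row in fact_attendance)
attendance_per_day = Counter(row["date_key"] for row in fact_attendance)
```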