
Slowly Changing Dimensions
Kiran Pendse (Sr. Data Engineer)
Date: 20 November 2023
AGENDA
1. Introduction
2. What is SCD?
3. Types of SCD
4. Advantages
5. Disadvantages
6. Implementation
7. Demo
8. Learnings
9. Q&A
Introduction :
What is SCD?

A Slowly Changing Dimension (SCD) is a dimension that stores and manages both
current and historical data over time in a data warehouse.
Advantages :
• Historical Data Tracking

• Improved Decision-Making

• Better Data Quality

• Auditing and Compliance

• User-Friendly Reporting

Overall,
Slowly Changing Dimensions play a crucial role in enabling organizations to work with historical data
effectively, leading to better decision-making, improved data quality, and compliance with regulatory
requirements.
Disadvantages :
• Increased Storage Requirements

• Complex Data Transformation

• Performance Overhead

• Data Maintenance Overhead

• Data Quality Challenges

Overall,
to mitigate these disadvantages, organizations must carefully plan and implement SCD strategies,
consider the trade-offs, and make technology and infrastructure choices that align with their specific data
warehousing needs and capabilities.
SCD 0 : [No change]
Source System :

ID Name Salary FaxNumber

123 Vivek 10000 456-123-8899

Data Warehouse Table :

ID Name Salary FaxNumber

123 Vivek 10000 456-123-8899


SCD 1 : [Only Maintain The Latest Information]
Source System:
ID Name Salary Designation

123 Vivek 10000 Developer

Data Warehouse Table :

ID Name Salary Designation

123 Vivek 10000 Developer


SCD 1 :
Source System Dimensions With Corrections:

ID Name Salary Designation

123 Vivek 40000 Lead

Data Warehouse Table :

ID Name Salary Designation

123 Vivek 40000 Lead
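
The SCD 1 overwrite shown above can be sketched in a few lines of Python. This is only an illustration under assumptions: the warehouse table is modelled as an in-memory dict keyed by ID, and the function name `apply_scd1` is hypothetical, not part of any library.

```python
def apply_scd1(dimension, source_row):
    """SCD 1: overwrite the stored row for this ID; no history is kept."""
    dimension[source_row["ID"]] = dict(source_row)
    return dimension


# Hypothetical in-memory stand-in for the warehouse table.
warehouse = {}
apply_scd1(warehouse, {"ID": 123, "Name": "Vivek",
                       "Salary": 10000, "Designation": "Developer"})
# The correction from the source system replaces the earlier values.
apply_scd1(warehouse, {"ID": 123, "Name": "Vivek",
                       "Salary": 40000, "Designation": "Lead"})
print(warehouse[123])
```

After the second call only the corrected row remains, which is why SCD 1 cannot answer questions about earlier values.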


SCD 2 : [ The latest row carries Current_Flag = Y; a new row is added
every time the source dimension changes. ]
Source System :

ID Name Salary Office Date Eff.

123 Vivek 10000 Gurgaon 10 Aug 2013

Data Warehouse Table :


ID Name Salary Office From_Date To_Date Current_Flag

123 Vivek 10000 Gurgaon 10 Aug 2013 31 Dec 9999 Y
SCD 2 :
Source System With New Updates:

ID Name Salary Office Date Eff.

123 Vivek 40000 California 10 Jan 2015

Data Warehouse Table With New Updates:


ID Name Salary Office From_Date To_Date Current_Flag

123 Vivek 10000 Gurgaon 10 Aug 2013 9 Jan 2015 N

123 Vivek 40000 California 10 Jan 2015 31 Dec 9999 Y


SCD 2 :
Source System With Further Updates:

ID Name Salary Office Date Eff.

123 Vivek 40000 Houston 25 Jul 2015

Data Warehouse Table With New Updates:

ID Name Salary Office From_Date To_Date Current_Flag

123 Vivek 10000 Gurgaon 10 Aug 2013 9 Jan 2015 N

123 Vivek 40000 California 10 Jan 2015 24 Jul 2015 N

123 Vivek 40000 Houston 25 Jul 2015 31 Dec 9999 Y
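
The SCD 2 steps above (close the current row, then add a new current row) can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the table is a plain list of dicts, the names `apply_scd2`, `OPEN_END` and `TRACKED` are hypothetical, and the previous row is assumed to end the day before the new effective date, as in the example tables.

```python
from datetime import date, timedelta

OPEN_END = date(9999, 12, 31)            # open-ended To_Date for the current row
TRACKED = ("Name", "Salary", "Office")   # assumed set of change-tracked columns


def apply_scd2(rows, source_row, effective):
    """SCD 2: expire the current row if it changed, then append a new one."""
    current = next(
        (r for r in rows
         if r["ID"] == source_row["ID"] and r["Current_Flag"] == "Y"),
        None,
    )
    if current is not None:
        if all(current[c] == source_row[c] for c in TRACKED):
            return rows  # no change in the source, so no new row
        current["To_Date"] = effective - timedelta(days=1)
        current["Current_Flag"] = "N"
    new_row = {"ID": source_row["ID"]}
    new_row.update({c: source_row[c] for c in TRACKED})
    new_row.update(From_Date=effective, To_Date=OPEN_END, Current_Flag="Y")
    rows.append(new_row)
    return rows


dim = []
apply_scd2(dim, {"ID": 123, "Name": "Vivek", "Salary": 10000,
                 "Office": "Gurgaon"}, date(2013, 8, 10))
apply_scd2(dim, {"ID": 123, "Name": "Vivek", "Salary": 40000,
                 "Office": "California"}, date(2015, 1, 10))
apply_scd2(dim, {"ID": 123, "Name": "Vivek", "Salary": 40000,
                 "Office": "Houston"}, date(2015, 7, 25))
for row in dim:
    print(row)
```

Replaying the three source states yields three rows, with only the Houston row flagged as current.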


SCD 3 : [ Use only if we know in advance that the dimension
will not change frequently ]

Source System :

ID Name Salary Office Date Eff.

123 Vivek 10000 Gurgaon 10 Aug 2013

Data Warehouse Table :


ID Name Salary Prev_Office Curr_Office From_Date

123 Vivek 10000 Gurgaon Gurgaon 10 Aug 2013


SCD 3 :

Source System :

ID Name Salary Office Date Eff.

123 Vivek 40000 California 10 Jan 2015

Data Warehouse Table :

ID Name Salary Prev_Office Curr_Office From_Date

123 Vivek 40000 Gurgaon California 10 Jan 2015


SCD 3 :

Source System :

ID Name Salary Office Date Eff.

123 Vivek 40000 Houston 25 Jul 2015

Data Warehouse Table :

ID Name Salary Prev_Office Curr_Office From_Date

123 Vivek 40000 California Houston 25 Jul 2015
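
The SCD 3 behaviour above (one row per ID, with the old value shifted into a "previous" column) can be sketched in Python. The dict-based table and the function name `apply_scd3` are assumptions made for illustration; other attributes such as Salary are simply overwritten here.

```python
from datetime import date


def apply_scd3(dimension, source_row, effective):
    """SCD 3: keep one row per ID; move the old office into Prev_Office."""
    existing = dimension.get(source_row["ID"])
    if existing is None:
        # First load: Prev_Office starts equal to Curr_Office.
        prev_office, from_date = source_row["Office"], effective
    elif existing["Curr_Office"] != source_row["Office"]:
        prev_office, from_date = existing["Curr_Office"], effective
    else:
        # Office unchanged: keep the stored previous value and date.
        prev_office, from_date = existing["Prev_Office"], existing["From_Date"]
    dimension[source_row["ID"]] = {
        "ID": source_row["ID"],
        "Name": source_row["Name"],
        "Salary": source_row["Salary"],
        "Prev_Office": prev_office,
        "Curr_Office": source_row["Office"],
        "From_Date": from_date,
    }
    return dimension


dim = {}
apply_scd3(dim, {"ID": 123, "Name": "Vivek", "Salary": 10000,
                 "Office": "Gurgaon"}, date(2013, 8, 10))
apply_scd3(dim, {"ID": 123, "Name": "Vivek", "Salary": 40000,
                 "Office": "California"}, date(2015, 1, 10))
apply_scd3(dim, {"ID": 123, "Name": "Vivek", "Salary": 40000,
                 "Office": "Houston"}, date(2015, 7, 25))
print(dim[123])
```

After the third update Gurgaon is no longer stored anywhere: SCD 3 keeps only one level of history per tracked column.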


SCD 4 : [ One table for the latest snapshot & another to maintain history ]
Source System :
ID Name Salary Office Date Eff.

123 Vivek 40000 Houston 25 Jul 2015

Data Warehouse Table:


Current Snapshot :
ID Name Salary Office

123 Vivek 40000 Houston

History Table :
ID Name Salary Office From_Date To_Date

123 Vivek 10000 Gurgaon 10 Aug 2013 9 Jan 2015

123 Vivek 40000 California 10 Jan 2015 24 Jul 2015

123 Vivek 40000 Houston 25 Jul 2015 31 Dec 9999
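
The two-table SCD 4 layout above can be sketched in Python. This is an illustration under assumptions: the current snapshot is a dict keyed by ID, the history table is a list of dicts that also holds the open-ended current version (as in the example), and `apply_scd4` and `OPEN_END` are hypothetical names.

```python
from datetime import date, timedelta

OPEN_END = date(9999, 12, 31)  # open-ended To_Date for the latest history row


def apply_scd4(current, history, source_row, effective):
    """SCD 4: overwrite the snapshot table and version rows in the history table."""
    key = source_row["ID"]
    if current.get(key) == source_row:
        return  # nothing changed, so neither table is touched
    # Close the open history row for this ID, if there is one.
    for row in history:
        if row["ID"] == key and row["To_Date"] == OPEN_END:
            row["To_Date"] = effective - timedelta(days=1)
    history.append({**source_row, "From_Date": effective, "To_Date": OPEN_END})
    current[key] = dict(source_row)


current, history = {}, []
apply_scd4(current, history, {"ID": 123, "Name": "Vivek", "Salary": 10000,
                              "Office": "Gurgaon"}, date(2013, 8, 10))
apply_scd4(current, history, {"ID": 123, "Name": "Vivek", "Salary": 40000,
                              "Office": "California"}, date(2015, 1, 10))
apply_scd4(current, history, {"ID": 123, "Name": "Vivek", "Salary": 40000,
                              "Office": "Houston"}, date(2015, 7, 25))
print(current[123])
for row in history:
    print(row)
```

Queries that only need the latest state read the small snapshot table, while historical reports read the history table.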


SCD 6 : [ SCD 1 + 2 + 3 ]
Source System :

ID Name Salary Office Date Eff.

123 Vivek 40000 Houston 25 Jul 2015

Data Warehouse Table:


ID Name Salary Curr_Office Hist_Office From_Date To_Date Curr_Flag

123 Vivek 10000 Houston Gurgaon 10 Aug 2013 9 Jan 2015 N

123 Vivek 40000 Houston California 10 Jan 2015 24 Jul 2015 N

123 Vivek 40000 Houston Houston 25 Jul 2015 31 Dec 9999 Y
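
The SCD 6 table above combines the three earlier ideas: a new row per change (SCD 2), a current-value column next to the historical one (SCD 3), and an overwrite of that current-value column on every row (SCD 1). A minimal Python sketch, assuming a list-of-dicts table and the hypothetical names `apply_scd6` and `OPEN_END`:

```python
from datetime import date, timedelta

OPEN_END = date(9999, 12, 31)  # open-ended To_Date for the current row


def apply_scd6(rows, source_row, effective):
    """SCD 6: version rows (type 2) and overwrite Curr_Office on all rows (type 1)."""
    mine = [r for r in rows if r["ID"] == source_row["ID"]]
    current = next((r for r in mine if r["Curr_Flag"] == "Y"), None)
    if current is not None:
        unchanged = (current["Name"] == source_row["Name"]
                     and current["Salary"] == source_row["Salary"]
                     and current["Hist_Office"] == source_row["Office"])
        if unchanged:
            return rows
        current["To_Date"] = effective - timedelta(days=1)
        current["Curr_Flag"] = "N"
    # Type 1 part: every existing row for this ID shows the latest office.
    for r in mine:
        r["Curr_Office"] = source_row["Office"]
    # Type 2 part: the new version; Hist_Office is the office valid for this row.
    rows.append({
        "ID": source_row["ID"], "Name": source_row["Name"],
        "Salary": source_row["Salary"],
        "Curr_Office": source_row["Office"], "Hist_Office": source_row["Office"],
        "From_Date": effective, "To_Date": OPEN_END, "Curr_Flag": "Y",
    })
    return rows


dim = []
apply_scd6(dim, {"ID": 123, "Name": "Vivek", "Salary": 10000,
                 "Office": "Gurgaon"}, date(2013, 8, 10))
apply_scd6(dim, {"ID": 123, "Name": "Vivek", "Salary": 40000,
                 "Office": "California"}, date(2015, 1, 10))
apply_scd6(dim, {"ID": 123, "Name": "Vivek", "Salary": 40000,
                 "Office": "Houston"}, date(2015, 7, 25))
for row in dim:
    print(row)
```

Every row ends up with Curr_Office = Houston, while Hist_Office preserves the office that applied during each row's date range.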


Q&A
THANK YOU
