You are on page 1of 16

Creating a SCD Type 2 mapping using the Informatica PowerCenter Mapping Wizard

The Mapping Wizard available in the Informatica PowerCenter Designer client provides pre-designed mapping
templates to create mappings based on specific requirements like SCD Types 1, 2 & 3.

The illustration below explains the creation of an SCD Type 2 mapping using the Mapping Wizard. The source table
is EMPLOYEES that contains employee information like Employee ID, Name, Role, Department ID, Location,
Employment Status and the Date of joining.

The EMPLOYEES table is shown below.

EMPLOYEES
EMP_ID EMP_NAME EMP_ROLE DEPT_ID LOCATION EMPL_STATUS JOIN_DT
1321 Shaun Mathews Clerk 209 Atlanta Active 13-Apr-08
1487 Shane Smith Supervisor 110 Atlanta Active 4-Aug-08
1678 Katie Wells Manager 198 Atlanta Active 20-Aug-08

The field EMP_ID is the primary key for the EMPLOYEES table. The fields on which history needs to be maintained
are EMP_ROLE, DEPT_ID, LOCATION and EMPL_STATUS.

Import the source definition EMPLOYEES using the Source Analyzer workspace. Go to Sources > Import from
Database.

Nelrick Rodrigues
This opens the Import Tables window. Assuming that a system DSN is already created for this connection, specify
all the necessary details and click Connect.

Select the EMPLOYEES table to import and click OK to continue.

The EMPLOYEES source definition is created and appears in the workspace. Click Save to save the source definition
in the repository.

Nelrick Rodrigues
The source table EMPLOYEES contains only current data and doesn't have any historical data. This mapping would
be run daily to capture the historical data in the EMPLOYEES_SCD2 target table. The Effective Date logic would be
used for SCD Type 2 mapping.

Click on the Mapping Designer tab.

Go to Mappings > Wizards > Slowly Changing Dimensions.

Nelrick Rodrigues
Provide a suitable mapping name as shown and select the Type 2 Dimension radio button. Click Next to continue.

Select the correct Source definition from the Select Source Table drop-down list and type the New Target Table
name as EMPLOYEES_SCD2 as shown. Click Next to continue.

This opens the Target Field Selection window as shown.

Nelrick Rodrigues
Add the EMP_ID field as Logical Key Fields as shown as it is the primary key in the EMPLOYEES source table.

Add the remaining fields on which history needs to be maintained as Fields to compare for changes as shown.

Click Next. Select Mark the dimension records with their effective date range as the versioning method to
maintain history.

Nelrick Rodrigues
This adds two more fields PM_BEGIN_DATE and PM_END_DATE which helps identify the effective start date and
the end date respectively for each employee's record and if any of the fields on which history needs to be tracked
undergo a change in the source table, a new record for that employee will be created with new effective start date
and end date will be null. The PM_END_DATE value will be null for all current version of the records in the
EMPLOYEES_SCD2 table. The date logic is indeed useful in scenarios wherein the source system doesn't have an
effective date or last updated date field and it is binding on the ETL system to provide such a function. However,
the definiteness of the effective start and end date being loaded into the history table will also depend on the
frequency at which the SCD Type 2 mapping is run.

Click Finish.

The SCD Type 2 mapping is generated.

Nelrick Rodrigues
Save the mapping in the repository by pressing Ctrl+S. Check the Output Window below which displays messages
stating that the mapping is valid with no parsing errors.

The new target definition EMPLOYEES_SCD2 is created in Informatica Designer, but not in the database.

Drag the target definition EMPLOYEES_SCD2 from the Repository Navigator into the Target Designer workspace.

Nelrick Rodrigues
Go to Targets > Generate/Execute SQL.

This opens the Database Object Generation window. Mention the path and filename for the DDL file to be
created. Select the Create table radio button under the Generation options. Select the Create index radio button.

Click Generate and execute.

This opens the Connect to an ODBC Data Source window. Mention the necessary database details where the
target table should be created and click Connect.

Nelrick Rodrigues
Check the Output Window to verify if the script has been successfully generated and executed in the database.

Click Close to close the Database Object Generation window.

The iconic view of the mapping is shown below.

Nelrick Rodrigues
A brief description of the transformations used in the mapping is given below.

1. LKP_GetData: This is a lookup on the target table EMPLOYEES_SCD2 and will compare the incoming data
from the EMPLOYEES source table based on the key field EMP_ID with that of the target table,
EMPLOYEES_SCD2. All the currently active records in the EMPLOYEES_SCD2 table will have a null
PM_END_DATE. Hence, only these records should be compared for changes with the incoming data and
therefore, an unconnected input port INPUT_NULL_DATE is also matched with the PM_END_DATE field
of the lookup table as part of the Condition. The condition used in the lookup transformation
LKP_GetData is shown below.

2. EXP_DetectChanges: This expression transformation will generate two flags - ChangedFlag and NewFlag.
The ChangedFlag will check if the employee information in the EMPLOYEES_SCD2 target table has
undergone a change in the source EMPLOYEES table. The NewFlag will check for the occurrence of new
employee records in the source EMPLOYEES table.

3. SEQ_GenerateKeys: This sequence generator generates unique keys for the PM_PRIMARYKEY field in the
EMPLOYEES_SCD2 table for both new records and records that have undergone change in the fields on
which history is maintained in the EMPLOYEES source table, which will be inserted as a new record in the
target table.

4. EXP_KeyProcessing_InsertNew & EXP_KeyProcessing_InsertChanged: The expression transformations


EXP_KeyProcessing_InsertNew & EXP_KeyProcessing_InsertChanged generate the effective start date
PM_BEGIN_DATE for the new records and the changed records that are inserted into the
EMPLOYEES_SCD2 target table respectively.

5. FIL_InsertNewRecord, FIL_InsertChangedRecord & FIL_UpdateChangedRecord: The filter transformation


FIL_InsertNewRecord passes the new rows if the NewFlag is TRUE while the filter transformations -
FIL_InsertChangedRecord & FIL_UpdateChangedRecord passes the changed rows if the ChangedFlag is
TRUE.

6. UPD_ForceInserts, UPD_ChangedInserts & UPD_ChangedUpdate: The update strategy transformations


UPD_ForceInserts & UPD_ChangedInserts are used to manage inserts for new rows and changed rows

Nelrick Rodrigues
respectively while the UPD_ChangedUpdate is used to update the old version rows based on the
PM_PRIMARYKEY field.

7. EXP_CalcToDate: This expression transformation generates the effective end date PM_END_DATE for the
old version of an employee’s record in the EMPLOYEES_SCD2 target table.

The only optimization needed in the mapping is replacing the three filter transformations with a router
transformation.

Create a valid session and workflow for this mapping.

Start the Workflow Manager client tool and click on the Task Developer tab. Go to Tasks > Create to create a new
task.

This opens the Create Task window. Select the Session task from the drop-down and enter a name for this task as
shown below.

Click Create to continue. Select the mapping created in the previous steps to associate with this session.

Nelrick Rodrigues
Click OK to continue. Click Done in the Create Task window.

A new task is created in the Task Developer workspace as shown above. Double click on the session to edit it. Click
on the Mapping tab and select the Connections option on the left and apply the correct relational connections as
shown below.

Click OK to continue. Right click on the session task and click Validate to validate the session as shown below.

Nelrick Rodrigues
A notification is generated in the Output Window as shown below stating that the session is valid.

Press Ctrl+S to save the session task.

Nelrick Rodrigues
Click on the Workflow Designer tab. Go to Workflows > Create to create a new workflow.

This opens the Create Workflow window. Provide the workflow name as shown below.

Click OK. Drag the session created in the previous steps from the Repository Navigator into the Workflow
Designer workspace. Go to Tasks > Link Task.

Nelrick Rodrigues
Link the Start task to the session task as shown below.

Click Ctrl+S to validate and save the workflow.


th
Assuming the workflow ran for the first time on the 24 March, 2010, the data loaded in the target table is shown
below.

EMPLOYEES_SCD2
PM_PRIMARYKEY EMP_ID EMP_NAME EMP_ROLE DEPT_ID LOCATION EMPL_STATUS JOIN_DT PM_BEGIN_DATE PM_END_DATE
1 1321 Shaun Mathews Clerk 209 Atlanta Active 13-Apr-08 24-Mar-10
2 1487 Shane Smith Supervisor 110 Atlanta Active 4-Aug-08 24-Mar-10
3 1678 Katie Wells Manager 198 Atlanta Active 20-Aug-08 24-Mar-10

Observing the data in the target table, it is evident that since all these records are the current version, the
PM_END_DATE for all these records are null.

Nelrick Rodrigues
Assuming that the role of Shane Smith changes to Manager and the department ID of Katie Wells changes to 151 in
th
the source system and a new employee, Jim Mason joins the organization on the 25 March, 2010, the
EMPLOYEES table is shown below.

EMPLOYEES
EMP_ID EMP_NAME EMP_ROLE DEPT_ID LOCATION EMPL_STATUS JOIN_DT
1321 Shaun Mathews Clerk 209 Atlanta Active 13-Apr-08
1487 Shane Smith Manager 110 Atlanta Active 4-Aug-08
1678 Katie Wells Manager 151 Atlanta Active 20-Aug-08
2050 Jim Mason Clerk 171 Chicago Active 25-Mar-10

th
When the workflow runs on the 25 March, 2010, the data loaded in the target table is shown below.

EMPLOYEES_SCD2
PM_PRIMARYKEY EMP_ID EMP_NAME EMP_ROLE DEPT_ID LOCATION EMPL_STATUS JOIN_DT PM_BEGIN_DATE PM_END_DATE
1 1321 Shaun Mathews Clerk 209 Atlanta Active 13-Apr-08 24-Mar-10
2 1487 Shane Smith Supervisor 110 Atlanta Active 4-Aug-08 24-Mar-10 25-Mar-10
3 1678 Katie Wells Manager 198 Atlanta Active 20-Aug-08 24-Mar-10 25-Mar-10
4 2050 Jim Mason Clerk 171 Chicago Active 25-Mar-10 25-Mar-10
5 1487 Shane Smith Manager 110 Atlanta Active 4-Aug-08 25-Mar-10
6 1678 Katie Wells Manager 151 Atlanta Active 20-Aug-08 25-Mar-10

The old version of records for Shane Smith and Katie Wells have been updated with a PM_END_DATE of 25-Mar-
10 and two new versions of records having PM_PRIMARYKEY values 5 and 6 and a null value for PM_END_DATE
get inserted into the target table. The new record for Jim Mason also gets inserted into the target table with a null
value for PM_END_DATE, indicating it is the current version of the record in the target table.

Nelrick Rodrigues

You might also like