
Data Masking with Expressions

2009 Informatica Corporation

Table of Contents
Overview
Source File
Source Qualifier Transformation
Data Masking Transformation
  Configuring Mapping Parameters
  Configuring Expression Masking
Lookup Transformation
Target File

Overview
The following example shows how to configure expression masking in the Data Masking transformation.
A company insurance policy file contains sensitive data that you want to use in a test scenario, but you need to
maintain security. You can create a mapping to mask each column of the company data and write the test data to
another file.
The mapping includes a Data Masking transformation to mask the company fields. It also includes a Lookup
transformation that retrieves a substitute company name from a dictionary file.
The following figure shows the mapping:

The mapping has the following transformations:


Transformation | Type | Description
SQ_Company_Policy | Source Qualifier | Passes company data to the Data Masking transformation.
DM_Mask_Company_Policy_Names | Data Masking | Performs the following types of masking: Random. Generates a new start and end date within three months of the original dates. Generates a random number to use in a Lookup transformation. Key. Generates a company ID and policy number that is repeatable each time the same company ID or policy number occurs in the source data. Expression. Calculates an end date to maintain the same policy length as the original policy length.
LKUP_Company_Names | Lookup | Performs a flat file lookup on Company_Names.dic to retrieve a substitute company name.

Source File
The following table describes each source column and the masking requirements:
Column | Datatype | Mask Transformation | Mask Objectives
Company_ID | Integer | Data Masking | Replace with a random number that is repeatable each time the same company ID occurs in the source data.
Company_Name | String | Lookup | Replace with a random company name from the Company_Names dictionary file.
Policy_Number | Integer | Data Masking | Replace with a random number that is repeatable each time the same policy number occurs in the source data.
Start_Date | Date/time | Data Masking | Replace with a date that is within three months of the original date.
End_Date | Date/time | Data Masking | Replace with a date so that the time difference between the original Start_Date and End_Date is maintained after masking.

Source Qualifier Transformation


The Source Qualifier passes company data to the Data Masking transformation. It passes the Company_ID column to
the following ports in the Data Masking transformation:
• Company_ID. Company number.
• Randid1. Random number generator for lookups on the Company_Names.dic file.

Data Masking Transformation


Configure the masking properties for input ports on the Masking Properties tab.

The following figure shows how each port in the Data Masking transformation is masked:

The Data Masking transformation modifies the following source columns:


• Company_ID. Key masking. The mapping has a mapping parameter, $$CompanyNumber, that contains a seed
value.
• Company_Name. No masking.
• Policy_Number. Key masking. The mapping has a mapping parameter, $$CustomerPolicyNum, that contains a
seed value.
• Start_Date. Random masking within a variance of the source date. Configure blurring to mask the date as a
variance of the source date. The unit is the part of the date to apply the variance to. You can select the year,
month, day, or hour. For this example, the variance is three months, so the Data Masking transformation returns
a Start_Date that is within three months of the source date.
• End_Date. Expression masking. The difference between the original start date and end date is maintained after
masking, so the policy length is the same for the original dates and the masked dates.
• Randid1. Random masking. Randid1 represents a random serial number in the Company_Names.dic file. Randid1
must be an integer from 1 to 500 because the Company_Names.dic file has 500 records in this example.
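
The blurring described for Start_Date can be pictured with a short Python sketch. It only illustrates picking a random date within a three-month variance of the source date. It is not the Data Masking transformation's actual algorithm, and the names add_months and blur_date are hypothetical helpers introduced for this example.

```python
import random
from datetime import date, timedelta


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping the day to the target month's length."""
    month_index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(month_index, 12)
    month = month0 + 1
    # Find the last day of the target month so that Jan 31 + 1 month gives Feb 28/29.
    if month == 12:
        next_month_first = date(year + 1, 1, 1)
    else:
        next_month_first = date(year, month + 1, 1)
    last_day = (next_month_first - timedelta(days=1)).day
    return date(year, month, min(d.day, last_day))


def blur_date(original: date, variance_months: int = 3, rng=None) -> date:
    """Return a random date within +/- variance_months of the original (assumed behavior)."""
    rng = rng or random.Random()
    low = add_months(original, -variance_months)
    high = add_months(original, variance_months)
    offset = rng.randint(0, (high - low).days)
    return low + timedelta(days=offset)


if __name__ == "__main__":
    start_date = date(2009, 1, 31)
    print(start_date, "->", blur_date(start_date))
```

The sketch draws uniformly from the whole window. How the transformation distributes values inside the variance is not stated in this document.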

Configuring Mapping Parameters


When you configure key masking, you can apply a mapping parameter or variable for the seed value. When you use a
mapping parameter you can apply the same parameter when you mask primary-key and foreign-key values for parent
and child tables.
If you plan to use a mapping parameter for a seed value, create the mapping parameter before you configure the Data
Masking transformation. Verify that the parameter datatype is valid for a seed number. You must create the Data
Masking transformation in the Mapping Designer if you plan to use a mapping parameter.

To configure a mapping parameter for a seed value:


1. On the Masking Properties tab, select a port and choose key masking.
2. Select the Mapping Parameter option.
   If you have not configured a mapping parameter or variable, an error message appears.
3. Select a mapping parameter or variable from the list.

If the mapping parameter value is invalid, the Integration Service uses a default seed value from the defaultValue
file. The defaultValue file is an XML file in the following location:
<PowerCenter Installation Directory>\infa_shared\SrcFiles\defaultValue.xml
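
To see why sharing one seed matters for primary-key and foreign-key columns, consider the following Python sketch. It assumes a keyed hash as a stand-in for key masking, which is not how the Data Masking transformation is documented to work. The function key_mask and the seed value 190 are hypothetical, chosen only for illustration.

```python
import hashlib
import hmac


def key_mask(value: int, seed: int, digits: int = 9) -> int:
    """Deterministically map an integer to a masked integer for a given seed.

    Assumption: a keyed hash (HMAC-SHA256) stands in for the product's key masking.
    """
    key = str(seed).encode("utf-8")
    digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") % (10 ** digits)


# Hypothetical seed value standing in for the $$CompanyNumber mapping parameter.
COMPANY_SEED = 190

# Company_ID values as they might appear in a parent table and a child table.
parent_company_ids = [1001, 1002]
child_company_ids = [1001, 1001, 1002]

masked_parent = [key_mask(v, COMPANY_SEED) for v in parent_company_ids]
masked_child = [key_mask(v, COMPANY_SEED) for v in child_company_ids]

if __name__ == "__main__":
    print("parent:", masked_parent)
    print("child: ", masked_child)
```

Because both tables use the same seed, every occurrence of a company ID maps to the same masked value, so joins between the masked tables still line up. Unlike a production masking algorithm, this sketch does not guarantee that two different inputs never collide.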

Configuring Expression Masking


You can configure an expression to modify data in a port. When you configure the expression, you can use functions,
ports, and variables. You can reference masked data in an expression.
To configure expression masking in the Data Masking transformation:
1. On the Masking Properties tab, select a port and choose Expression Masking.
   The Designer displays the port name as the default expression.
2. Click Configure Expression.
   The Expression Editor appears.
3. Add the following expression in the Expression Editor for End_Date:

ADD_TO_DATE(out_Start_Date, 'DD', DATE_DIFF(End_Date, Start_Date, 'DD'))

To calculate a new end date that maintains the same policy length, the expression calculates the number of days
between the original start and end dates. It adds the result to the masked start date to determine the new end
date.

You can select the functions and port names to use in the expression. The Expression Editor validates the
expression when you click OK. You can also click Validate to verify the expression syntax. The expression
appears on the Masking Properties tab.
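
The End_Date calculation described in step 3 can be checked outside the Designer with a small Python sketch. It mirrors the same arithmetic (days between the original dates, added to the masked start date) using the standard datetime module. The function name mask_end_date and the sample dates are hypothetical, and the sketch works in whole days only.

```python
from datetime import datetime, timedelta


def mask_end_date(start_date: datetime, end_date: datetime,
                  masked_start_date: datetime) -> datetime:
    """Return an end date that keeps the original policy length in days."""
    # Number of days between the original start and end dates.
    policy_days = (end_date - start_date).days
    # Add that length to the masked start date.
    return masked_start_date + timedelta(days=policy_days)


# Sample values for illustration only.
start = datetime(2008, 1, 15)
end = datetime(2009, 1, 15)
masked_start = datetime(2008, 3, 2)

masked_end = mask_end_date(start, end, masked_start)

if __name__ == "__main__":
    print("original length:", (end - start).days, "days")
    print("masked length:  ", (masked_end - masked_start).days, "days")
```

Both lengths print as the same number of days, which is the property that expression masking is meant to preserve for End_Date.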

Lookup Transformation
The mapping contains a Lookup transformation to retrieve random company names from a lookup file. The lookup
source is a sample company names file that has 500 records. The mapping substitutes company names from the
production data with company names from the dictionary file. The Lookup transformation receives a random number
between 1 and 500 from the Data Masking transformation.
Each company record contains a serial number and company name. The Company_Names file contains records
similar to the following:
SNO,COMPANY_NAME
1,7-ELEVEN
2,ABBOTT LABORATORIES
3,ADC TELECOMMUNICATIONS
4,ADELPHIA COMMUNICATIONS
5,ADOBE SYSTEMS
6,ADVANCED MICRO DEVICES

The following figure shows the ports in the LKP_Company_Names Lookup transformation:

The lookup condition compares the value of the random number with the serial number in the Company_Names.dic
file. The Lookup transformation returns the record that contains the serial number equal to the random number. The
Condition tab contains the following lookup condition:
SNO = out_Randid1
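
The lookup condition amounts to a keyed lookup on the serial number. A minimal Python sketch follows. It assumes the dictionary file is plain comma-separated text with the header shown above, and it uses only the six sample records rather than the full 500. The helper names load_company_names and lookup_company_name are hypothetical.

```python
import csv
import io
import random

# The six sample records shown above; the real file has 500 records in this example.
SAMPLE_DIC = """SNO,COMPANY_NAME
1,7-ELEVEN
2,ABBOTT LABORATORIES
3,ADC TELECOMMUNICATIONS
4,ADELPHIA COMMUNICATIONS
5,ADOBE SYSTEMS
6,ADVANCED MICRO DEVICES
"""


def load_company_names(text: str) -> dict:
    """Read SNO,COMPANY_NAME records into a dict keyed by serial number."""
    reader = csv.DictReader(io.StringIO(text))
    return {int(row["SNO"]): row["COMPANY_NAME"] for row in reader}


def lookup_company_name(names: dict, randid: int):
    """Return the record where SNO equals the random number, or None if no row matches."""
    return names.get(randid)


names = load_company_names(SAMPLE_DIC)

# Stand-in for out_Randid1: a random integer between 1 and the number of records.
rng = random.Random(42)
out_randid1 = rng.randint(1, len(names))
substitute = lookup_company_name(names, out_randid1)

if __name__ == "__main__":
    print(out_randid1, "->", substitute)
```

The random number must stay within the range of serial numbers in the file. A value outside that range finds no matching record, which is why Randid1 is limited to 1 through 500 in this example.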

Target File
The Company_Policy target definition receives the new company data from the following mapping components:
Transformation | Column Name
Data Masking | Company_ID, Policy_Number, Beg_Date, End_Date
Lkp_Company_Names | Company_Name

The Target_Company_Policy file contains realistic data that you can use in a test environment. None of the original
data can be derived from the substitute data.

Author
Ellen Chandler
Principal Technical Writer
