
DATA QUALITY 9.1 Training
IDQ 9.1 Labs
Lab 1 - Content Management Service
Lab 2 - New Reference Table Capabilities
Lab 3 - Content Sets
Lab 4 - Tags
Lab 5 - Match Enhancements
Lab 6 - New Exception Transform
Lab 7 - Data Quality for MS Excel
Lab 8 - Profiling Labs
Informatica Data Quality 9.1
Lab 1 - Content Management Service
Create and Configure CMS
Objective: Configure AddressDoctor options and check AV reference file status from Developer
Steps
Open Administrator Console
Select Action/New/Content Management Service
Follow the Wizard to create and start the CMS
Open the CMS Processes tab
Edit AV Options
AV Licence: S0PCF4MN94L7ZXEZ635NCSM90NZKR0NJUTWA
o Set No Pre-Load to ALL for all types
o Set AV Reference data path to:
C:\Informatica\9.1.0\services\DQContent\INFA_Content\av\default
Recycle CMS Service
Recycle DIS Service
Open Developer
o Select Window / Preferences
o Select Content Status
o Check Status is displayed correctly (expected view below)
Lab 2 - New Reference Table Capabilities
Create Managed Reference Table from database
Objective: Create new reference table using a database source
Steps
Open Analyst Tool
Select "Create New Reference Table"
Select "Connect to a Relational Table"
Select "DQ_Tables" Connection
Select fname table
Select Column1 as valid value
Save as fname in Your Project
Create Unmanaged Reference Table from database
Objective: Create new unmanaged reference table
Steps
Open Analyst Tool
Select "Create New Reference Table"
Select "Connect to a Relational Table"
Make sure "Unmanaged Table" is ticked
Select "DQ_Tables" Connection
Select us_states table
Select Column1 as valid value
Save as us_states in Your Project
End of Exercise
Lab 3 - Content Sets
Create and configure new Content Set
Objective: Create new content set and content set expressions
Steps
Open Developer
Select File / New / Content Set
Create new Content Set called "ContentSet_91" in Your project
Open your content set
Add new Character Set:
o Name: Char
o Label: C
o Range: a-z and A-Z
Add a new Regular Expression:
o Name: num
o Number of Outputs: 1
o RegEx: ^[0-9]+$
Add a new Token Set (RegEx):
o Name: date
o Label: date
o RegEx: ^\d{1,2}\/\d{1,2}\/\d{4}$
o Description: matches dates of the form XX/XX/YYYY where XX can be 1 or 2 digits long
and YYYY is always 4 digits long.
Save Content Set
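To check what the three expressions accept before using them in a mapping, here is a minimal sketch in Python. It assumes Python's `re` module behaves like the Developer tool's regex engine for these simple patterns, and that the date separator is a slash as the pattern reads above. The `classify` helper and the sample tokens are hypothetical and only illustrate the patterns.

```python
import re

# Patterns copied from the Content Set steps. Python's re module stands in
# for the Developer tool's regex engine (an assumption).
NUM = re.compile(r"^[0-9]+$")
DATE = re.compile(r"^\d{1,2}\/\d{1,2}\/\d{4}$")
CHAR = re.compile(r"^[a-zA-Z]+$")  # tokens made only of characters in the Char range


def classify(token):
    """Return the name of the first expression that matches the token, or None."""
    for name, pattern in (("num", NUM), ("date", DATE), ("Char", CHAR)):
        if pattern.match(token):
            return name
    return None


# Hypothetical sample tokens
for sample in ("221", "7/4/2011", "07/04/11", "Baker", "221B"):
    print(sample, "->", classify(sample))
```

A two-digit year such as 07/04/11 is not matched by the date pattern, which is worth remembering when the date labels look sparse in the data viewer.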
Use a Content Set
Parse, Cleanse and Standardize Data
Objective: Prepare data source for upload to Warehouse and matching scenarios
Steps
Create New Mapping: m_process_customer_data
Add customer Flat file source from c:\DQ_DATA directory
Add Parser - Token Parser
o Add Input Port contact_name from source
o Create new strategy
Name: parse_names
Operation 1:
Operation: Parse Using Reference Table
Name: parse_fname
Reference Table: fnames (Enablement_91 project)
Output: fname, string, 25
Operation 2:
Operation: Parse Using Reference Table
Name: parse_sname
Reference Table: usa_surnames_infa
(Informatica_DQ_Content/Dictionaries/North America/USA)
Output: sname, string, 25
o Add Input port address1 from source
o Create new strategy
Name: parse_housenum
Operation 1:
Operation: Parse Using Token Set
Select Regular Expression
o Choose RegEx "num" from ContentSet_91
Create new output house_num
o Run data viewer and examine your results
Add Labeler
o Add Input Port address3
o Create New Strategy
Name: label_state
Mode: Token
Operation: Label with Reference Table
Reference Table: us_states (Your project)
Label: state?
Add Labeler (or use existing one)
o Add Input Port cust_start_date from source
o Create new strategy
Name: lbl_date
Mode: Token
Operation 1:
Operation: Label Tokens with Token Set
Name: lbl_date
Select Token Set "date" from ContentSet_91
o Add Input Port currency from source
o Create new strategy
Name: lbl_currency
Mode: Token
Operation 1:
Operation: Label Tokens with reference table
Informatica_DQ_Content/dictionaries/general/currency_codes_infa
Name: lbl_currency
o Run the data viewer and examine your results
Should look something like this:

Lab 4 - Tags
Create and associate new tags - Developer
Objective: Create new tags and associate to objects in Developer
Create Tag Steps
Open Developer
Open Window / Preferences
Select Tags
View "Out of the Box" Tags (These will appear when you install 9.1 accelerators, which this
image does not have)
Create new tags:
o Customer
o Product
o Content
Associate Tag Steps
Open Developer
Apply Tags to Data Sources
o Open Source
o Navigate to Tags View
o Select Edit
o Apply Tag
Apply Tags to content set expressions
o Open Content Set
o Navigate to Tags View
o Select Edit
o Apply Content Tag to all elements
Create and associate new tags - Analyst
Objective: Create new tags and associate to objects in Analyst
Create Tag Steps
Open Analyst
Select Actions / Show Tags
Create new tags:
o Order
o Address
Associate Tag Steps
Open Profile_order
(You probably have not already profiled the order table. Create a data object using the flat file
"order" in the c:\DQ_DATA directory and profile it, columns only.)
Apply Tags to data columns
o Show Tags view
o Select Address related columns
Apply Address Tag
o Select Order related columns
Apply Order Tag
o Select Customer name related columns
Apply Customer Tag
Go to project view
o Select RTM fnames
Apply Customer and Content Tags
Lab 5 - Match Enhancements
Pre-req for image
Copy 3 .ysp files from C:\Informatica\9.1.0\services\DQContent\INFA_Content\identity\
To a newly created folder called "default" at
C:\Informatica\9.1.0\services\DQContent\INFA_Content\identity\default
Key Gen and Match Analysis
Objective: Identify potential duplicates
Steps
Create New Mapping: m_match_customer
Add c:\DQ_Data\aml_demo_data source
Add Key Generator
o Use String strategy on iso_ctry_code
Right click on the key generator and Select Analyse Detail from the menu
o Review the following information:
Estimated processing time
Groups above the recommended threshold
o Edit the desired throughput value and observe how estimated processing time changes
o Edit min and max group size values
o Select groups above the threshold from the dropdown list and drill down to the record
level
Re-configure Key Generator to zip_or_postcode port
Run GroupKey Analysis again and observe the results
Add Match transform and configure as follows
o Field Matching (Single Source)
o Edit Distance on contact_name
o Threshold 0.6
Select Runtime Analysis by MatchType from the right click menu
o Review results
Select Output Analysis of Clusters from the right click menu
o Review results (There is a bug in this Beta release so you may get funny results)
Repeat the above steps for a Match transform configured for Identity Matching
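To see why the choice of group key drives the estimated processing time, and what a 0.6 threshold on edit distance means in practice, here is a rough sketch in Python. It is not the Match transform's algorithm: the normalised score (1 minus the Levenshtein distance divided by the longer string's length) is an assumed stand-in, and the sample records are hypothetical. Only the field names `zip_or_postcode` and `contact_name` come from the lab.

```python
from collections import defaultdict
from itertools import combinations


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Assumed normalisation: 1.0 means identical, 0.0 means nothing in common."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def match_within_groups(records, key_field, match_field, threshold=0.6):
    """Compare records only inside the same key group and keep pairs at or above the threshold."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key_field]].append(rec)
    pairs = []
    for members in groups.values():
        for left, right in combinations(members, 2):
            score = similarity(left[match_field].lower(), right[match_field].lower())
            if score >= threshold:
                pairs.append((left[match_field], right[match_field], round(score, 2)))
    return pairs


# Hypothetical sample records
records = [
    {"zip_or_postcode": "10001", "contact_name": "John Smith"},
    {"zip_or_postcode": "10001", "contact_name": "Jon Smith"},
    {"zip_or_postcode": "94105", "contact_name": "John Smith"},
]
print(match_within_groups(records, "zip_or_postcode", "contact_name"))
```

Records are only compared inside a group, so a group of n records costs n(n-1)/2 comparisons. That is why a coarse key such as iso_ctry_code produces groups above the recommended threshold, while zip_or_postcode keeps them small at the risk of missing duplicates that fall in different groups.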
Lab 6 - New Exception Transform
Objective: Identify and manually correct exception records
Think of this as further down in a mapping where you have already run data cleansing and address
validation. You have also run a data quality check on the phone field. Now that you've done that, you
need to decide what records pass and which need manual intervention.
Steps
Create New Mapping: m_exception_records
Add c:\DQ_data\cleansed_customer source
Add Decision
o Assign a score of 60 to records with AddressStatus 'Incomplete Address' or 'Invalid
Address Line' and Phone Status 'Incomplete Phone'
o Assign a score of 90 to all remaining records
if AddressStatus='Incomplete Address'
or AddressStatus='Invalid Address Line'
or PhoneStatus='Incomplete Phone'
then
score:=60
else
score:=90
endif
Add Exception transform and configure as follows
o Bad Records Exception
o Table 'BAD RECORDS' in Staging DB
o Connect data ports
o Add AddressStatus and PhoneStatus ports to Labels input
o Connect the score port to Inputs >> Control >> Score.
o Records with a score between 40 and 90 to be reviewed manually
o Send good records to standard output and bad records to Bad Records table
o Map AddressStatus and PhoneStatus ports to respective issues in the priority tab
Run the mapping (Data Viewer)
o Run the data viewer on the Exception transform
In Analyst, add the newly created table to the project view
Review available filters

Select &dit mode


Correct a number of records and select Save All Corrected Records
Open the audit trail and review the changes made
Hover over the new value to see the old value
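The Decision scoring and the score-based routing in this lab can be sketched in Python as follows. The `score_record` function mirrors the Decision expression as written. The `route` function is an assumption about how the 40 and 90 thresholds split records (at or above the upper value is good, below the lower value is bad, anything in between goes to manual review), so check it against what the data viewer shows. The status value "Valid" in the usage lines is hypothetical.

```python
def score_record(address_status, phone_status):
    """Mirror of the Decision expression: 60 for a flagged record, otherwise 90."""
    if (address_status == "Incomplete Address"
            or address_status == "Invalid Address Line"
            or phone_status == "Incomplete Phone"):
        return 60
    return 90


def route(score, lower=40, upper=90):
    """Assumed threshold handling, not confirmed by the lab text."""
    if score >= upper:
        return "good"            # standard output
    if score < lower:
        return "bad"             # Bad Records table
    return "manual review"       # between the thresholds


# Hypothetical status values
for address, phone in (("Incomplete Address", "Valid"), ("Valid", "Valid")):
    score = score_record(address, phone)
    print(address, "|", phone, "->", score, route(score))
```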
Lab 7 - Data Quality for MS Excel
Install DQ for Excel
Objective: Perform base install of DQ for Excel
Steps
Extract lp3.zip (from lab files provided) to Desktop
Close Excel
Run setup.exe
o Note this will connect to internet to download base Excel DLLs required to run Add-In
Once complete, open Excel
Check Informatica Ribbon is available
Use DQ for Excel
Objective: Add new service to DQ for Excel and use with sample data
Steps
Export WSDL from service created in DQ/WS lab
Open Excel
Select Informatica Ribbon and Add Service
Point to WSDL file extracted from first step
Open customer.xlsx (see Desktop on 9.1 TTT VM, folder DQ_data_enablement91)
Use imported service to parse Name column into First and Last names
Lab 8 - Profiling Labs
Profiling
1. Add 3 Data Objects (Location: c:\DQ_DATA) - Tool Matters (make sure you click the
because there are embedded commas in the data)
a. Customer Orders
b. Product
2. Profile Customer - Columns only
3. Delete profile
4. Profile All the tables at once
a. 2 ways
i. Select all the data objects and profile
ii. Profile one - add the other two
1. Add a Prefix of DW_ to the objects
5. Profile Customer - Column, Primary Key and Dependency
a. Take all the defaults (but keep hitting next not finish to see them)
Remember to select the columns for PK and Functional Dependency
b. View PK Results
c. Select Cust_number - verify
i. What happened to the display?
d. View Violations
e. Select cust_Number 15952672 and Drill down
i. What is the difference between the records?
f. View Functional Dependencies
i. Why are there blank determinants?
g. Select a column and verify
i. What happened in the Display?
6. Profile Orders - All three (Column, Primary Key and Dependency) and take defaults
a. View column profile
i. What is the Key to this table?
b. View Primary Key inference
i. What is the key?
c. View the Functional Dependencies
i. Can you identify any potential Sub-tables?
7. Delete the orders profile
8. Re-profile orders - This time override and change the default options
Hint: Be careful in changes to these options otherwise you will be here till Saturday
a. Primary Key - Minimum Percent = 100
b. Dependency - Minimum Percent = 100
i. View results
ii. Sort on the determinant column
iii. Can you identify a key?
iv. Can you identify a potential sub-table?
9. Go back and modify the profile description (PD) to change the minimum to 75
i. What does this show you?
10. Delete the profile
11. Profile orders - dependency only
a. Exclude all columns from determinants except sales_id
b. Verify all fields that show 100%
12. Go back and modify the profile definition to profile all three tasks
a. Did it work?
13. Profile orders - all three tasks
a. Verify the primary key
b. Look at dependencies
i. Why are form, ingredient_list, on_hand and segment not determined by the
primary key?
ii. How can I fix segment and form to show they are determined by the primary key?
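To get a feel for what the Minimum Percent option on Primary Key and Dependency inference is thresholding, here is a minimal sketch in Python. The two percentages are one plausible definition chosen for illustration, and the lab does not state how the tool computes them. The sample rows and the `order_id` and `region` columns are hypothetical. Only `sales_id` comes from the lab.

```python
from collections import Counter, defaultdict


def uniqueness_percent(rows, columns):
    """Percentage of rows whose value combination in `columns` occurs exactly once."""
    counts = Counter(tuple(r[c] for c in columns) for r in rows)
    unique_rows = sum(1 for r in rows if counts[tuple(r[c] for c in columns)] == 1)
    return 100.0 * unique_rows / len(rows)


def dependency_percent(rows, determinant, dependent):
    """Percentage of rows consistent with determinant -> dependent,
    taking the most common dependent value per determinant as the expected one."""
    by_determinant = defaultdict(Counter)
    for r in rows:
        by_determinant[r[determinant]][r[dependent]] += 1
    conforming = sum(c.most_common(1)[0][1] for c in by_determinant.values())
    return 100.0 * conforming / len(rows)


# Hypothetical sample rows
rows = [
    {"order_id": 1, "sales_id": "S1", "region": "East"},
    {"order_id": 2, "sales_id": "S1", "region": "East"},
    {"order_id": 3, "sales_id": "S2", "region": "West"},
    {"order_id": 4, "sales_id": "S2", "region": "East"},
]
print(uniqueness_percent(rows, ["order_id"]))            # candidate key check
print(dependency_percent(rows, "sales_id", "region"))    # candidate dependency check
```

Under this definition, a dependency that holds for 3 of 4 rows is hidden at Minimum Percent = 100 and appears once the minimum is lowered to 75, which is the kind of difference to look for when comparing the two runs.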
Filters
1. Delete Customer profile
2. Profile customer - adding filters
a. Address3 = NY
b. Address3 = CA
c. Address3 = NY and address2 = NEW YORK
3. Run each profile, view results, modify Profile definition and run the next one.
Drilldowns
1. Open the Customer Profile in the Analyst tool
2. Remove any active filters and re-run the profile
3. Go to address3 and drill down on NY
4. Edit the drill down filter and add address3 = NEW YORK (yes case matters)
5. Add another field Zip_or_postcode < 10020
6. Add to the filter, iso_ctry_code is not null or != USA or != U.S.A.
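The filter and drill-down conditions above combine with AND and compare values exactly, which is why case matters. A small sketch in Python, assuming exact string equality and a numeric comparison for the postcode. The `apply_filter` helper and the sample rows are hypothetical.

```python
def apply_filter(rows, **conditions):
    """Keep rows where every named field equals the given value exactly (case-sensitive)."""
    return [r for r in rows if all(r.get(f) == v for f, v in conditions.items())]


# Hypothetical sample rows using the lab's column names
customers = [
    {"address2": "NEW YORK", "address3": "NY", "zip_or_postcode": "10001"},
    {"address2": "New York", "address3": "NY", "zip_or_postcode": "10025"},
    {"address2": "SAN JOSE", "address3": "CA", "zip_or_postcode": "95113"},
]

ny = apply_filter(customers, address3="NY")
ny_city = apply_filter(customers, address3="NY", address2="NEW YORK")
low_zip = [r for r in ny if int(r["zip_or_postcode"]) < 10020]

print(len(ny), len(ny_city), len(low_zip))
```

The row with "New York" in mixed case passes the state filter but drops out once the city condition is added, the same effect the drill-down filter shows in the Analyst tool.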
Modeling
1. Create a Profile model called Whatever with orders, customer and product
a. Select the customer object, right click and data object profile
b. Profile all 3 steps - setting options you think are appropriate
c. View results
d. In Primary Key inference, right click and add cust_number to model
2. Go back to the default view and create and run a data object profile for orders
a. View results
b. Add item+order to the model
3. Go back to the default view and create and run a data object profile for product
a. Make sure you verify all your keys
b. Add product_id to the model
4. Select customer and orders and profile foreign keys
a. Verify and add relationship to the model
b. Re-select the relationship and view the Venn diagram
c. Double click on the non-overlapping orders.
i. See orders without a valid customer ID
ii. Double click on customer and see all the deadbeat customers.
5. Go back to the default view and do the same for orders and product changing the inference
options (Trim values and case sensitivity)
a. Verify the relationship (if it makes sense)
b. Re-select the relationship and view the Venn diagram.
i. Find the products that don't have open orders.
ii. Find the most ordered products.
6. FINALLY (well almost), select customer and do a join profile
a. Add orders
b. Add a join on cust_number and KC'OC
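The three regions of the Venn diagram in the foreign key steps correspond to simple set operations on the key values. Here is a minimal sketch in Python, assuming the Trim values and case sensitivity options act as a normalisation step applied before comparison. The function names and the sample keys are hypothetical.

```python
def normalise(value, trim=True, ignore_case=True):
    """Assumed effect of the Trim values and case sensitivity inference options."""
    text = str(value)
    if trim:
        text = text.strip()
    if ignore_case:
        text = text.upper()
    return text


def key_overlap(parent_keys, child_keys, **options):
    """Split key values into the three regions of the Venn diagram."""
    parent = {normalise(k, **options) for k in parent_keys}
    child = {normalise(k, **options) for k in child_keys}
    return {
        "matched": parent & child,       # overlap: child rows with a valid parent
        "parent_only": parent - child,   # e.g. customers with no orders
        "child_only": child - parent,    # e.g. orders without a valid customer ID
    }


# Hypothetical key values
customer_keys = ["C001", "C002", "C003"]
order_keys = ["C001", " c002 ", "C999"]

print(key_overlap(customer_keys, order_keys))
print(key_overlap(customer_keys, order_keys, trim=False, ignore_case=False))
```

With the options switched off, the padded lower-case key no longer matches its parent and moves from the overlap into the orphan region, which is the kind of shift to watch for when changing the inference options.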
Generating a Mapping from a Profile
1. Open the Customer Profile
2. Add an OOTB rule to validate cust_start_date
a. You may want to copy it and change the date format if you want it to actually work
correctly.
3. Add an OOTB rule to validate Last_Order_Date
a. See above
4. Create a rule to validate iso_ctry_code (US is only valid value)
a. IIF(iso_ctry_code ="US","Valid","Invalid")
5. Add an OOTB rule to remove punctuation from contact_name
6. Add more as appropriate or stop to get out early.
7. Run the profile (anything interesting?)
8. Right click on profile and generate a mapping
9. View the mapping and see the results and behavior.
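For comparison with the results the generated mapping produces, the rules in this exercise can be approximated in Python. `validate_country` restates the IIF expression above. The date check and the punctuation removal are stand-ins for the OOTB rules, whose actual logic is not shown in the lab, and the default date format `%m/%d/%Y` is an assumption.

```python
import string
from datetime import datetime


def validate_country(iso_ctry_code):
    """Same logic as IIF(iso_ctry_code = "US", "Valid", "Invalid")."""
    return "Valid" if iso_ctry_code == "US" else "Invalid"


def validate_date(value, fmt="%m/%d/%Y"):
    """Stand-in for a date validation rule; the format string is an assumption."""
    try:
        datetime.strptime(value, fmt)
        return "Valid"
    except (ValueError, TypeError):
        return "Invalid"


def remove_punctuation(text):
    """Stand-in for a punctuation removal rule: drops ASCII punctuation characters."""
    return text.translate(str.maketrans("", "", string.punctuation))


# Hypothetical sample values
print(validate_country("US"), validate_country("USA"))
print(validate_date("12/31/2010"), validate_date("31/12/2010"))
print(remove_punctuation("O'Neil, J."))
```

The second date fails only because of the assumed format. Passing `fmt="%d/%m/%Y"` makes it valid, which is the same adjustment the note about copying the rule and changing the date format points at.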