
Informatica Master Data Management (MDM)

Topic 4: Load Process

Objectives

Following are the objectives of this topic:


• Configure Trust
• Configure Validation Rules
• Configure Relationships
• Configure Lookups
• Describe the Load Process

Trust

• Dynamic Cell-level Survivorship


• Base Object property
• A mechanism for measuring the confidence factor associated with each cell based on
its source system, change history, and other business rules
• Defined at a column level for each contributing source system
• Ensures that the most reliable data at the cell level is consolidated based on data
characteristics

Trust

When two base object records merge:

• MRM calculates the trust for each trusted column in the two base object records being
merged
• The cells with the highest trust values survive in the final merged record

2 Base Object records to merge:

ROWID_OBJECT  Name                      Phone
100           Doug McDougal Grp         1-555-901-4670
200           The Doug McDougal Group   201-10810

Calculate Trust (trust score shown in parentheses):

ROWID_OBJECT  Name                           Phone
100           Doug McDougal Grp (62)         1-555-901-4670 (56)
200           The Doug McDougal Group (71)   201-10810 (37)

Winners Survive:

ROWID_OBJECT  Name                      Phone
100           The Doug McDougal Group   1-555-901-4670
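
The cell-level survivorship in this example can be sketched in a few lines of Python. This is only an illustration of the idea, not how MRM implements it: the function name `merge_by_trust` is hypothetical, the trust scores are taken from the slide, and the tie-breaking choice (keep the first record's value on equal trust) is an assumption.

```python
def merge_by_trust(rec_a, trust_a, rec_b, trust_b, surviving_rowid):
    """Build the merged record cell by cell, keeping the value with higher trust."""
    merged = {"ROWID_OBJECT": surviving_rowid}
    for column in trust_a:
        # Assumption: on a tie, the value from the first record is kept.
        if trust_a[column] >= trust_b[column]:
            merged[column] = rec_a[column]
        else:
            merged[column] = rec_b[column]
    return merged


rec_100 = {"Name": "Doug McDougal Grp", "Phone": "1-555-901-4670"}
rec_200 = {"Name": "The Doug McDougal Group", "Phone": "201-10810"}
trust_100 = {"Name": 62, "Phone": 56}
trust_200 = {"Name": 71, "Phone": 37}

merged = merge_by_trust(rec_100, trust_100, rec_200, trust_200, surviving_rowid=100)
print(merged)
```

The Name cell comes from record 200 (71 beats 62) and the Phone cell from record 100 (56 beats 37), which matches the "Winners Survive" row.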

Trust

When an update comes in from a source:

• MRM calculates the trust on the incoming data and compares it to the trust of the data
in the base object
• Updates are only applied to the base object for cells that have higher trust on the
incoming data

Base object prior to update (trust score shown in parentheses):

ROWID_OBJECT  Name                           Phone
100           The Doug McDougal Group (71)   1-555-901-4670 (56)

Data from staging table:

ROWID_OBJECT  Name              Phone
100           DMcD Group (75)   201-10810 (50)

Base Object cells only updated where new data has higher trust weighting:

ROWID_OBJECT  Name         Phone
100           DMcD Group   1-555-901-4670

Trust

• Trust is an optional property for a base object column


• If trust is switched off for a column, then the most recently updated value from any
source is the survived value in the base object
• Only switch on trust for a column if:
• Two or more source systems contribute to the column
• The sources are not deemed to be equally reliable providers of values to the column

Trust

Trust Demo

Validation Rules

• Defines a condition under which a data value is not valid


• Base Object property
• If the validation condition is met, then trust weighting is downgraded
• Trust after validation downgrade is

TRUST - (TRUST * downgrade_pct/100)

• Reserve Minimum Trust can be set to avoid having trust scores below the minimum
trust value

x := TRUST - (TRUST * downgrade_pct/100);
if x < MINIMUM_TRUST then x := MINIMUM_TRUST;
endif;
• Validation check can be done on any column in a base object and Downgrade can be
applied to any other columns in the base object
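
As a worked example of the downgrade formula, the following Python sketch applies a percentage downgrade and then the optional minimum-trust floor. The function name `downgraded_trust` and the sample numbers are illustrative assumptions; only the arithmetic comes from the slide.

```python
def downgraded_trust(trust, downgrade_pct, minimum_trust=None):
    """Apply TRUST - (TRUST * downgrade_pct / 100), with an optional floor."""
    x = trust - (trust * downgrade_pct / 100)
    # Reserve Minimum Trust: never let the score fall below the floor.
    if minimum_trust is not None and x < minimum_trust:
        x = minimum_trust
    return x


no_floor = downgraded_trust(80, 25)                       # 80 - 20 = 60.0
with_floor = downgraded_trust(80, 75, minimum_trust=30)   # 20 is below the floor
print(no_floor, with_floor)
```

A trust of 80 downgraded by 25% becomes 60. Downgraded by 75% it would be 20, but with a minimum trust of 30 the result is held at 30.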

Validation Rules

Some examples of validation rules:


• Downgrade trust on Last Name if

length(last_name) < 3 and last_name <> 'NG'

• Downgrade trust on Middle Name if

middle_name is null

• Downgrade trust on Address Line 1, City, State, Zip, and Valid Address Ind if

valid_address_ind = 'False'
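
The three example rules can be expressed as condition/target pairs, which also shows that the column checked and the columns downgraded need not be the same. This is a hedged Python sketch: the `RULES` structure, the lower-case column names, and the function `columns_to_downgrade` are hypothetical, and a NULL last name is assumed not to trigger the first rule (as it would not in SQL).

```python
# Each rule: (condition on the record, columns whose trust is downgraded).
RULES = [
    (lambda r: r["last_name"] is not None
               and len(r["last_name"]) < 3
               and r["last_name"] != "NG",
     ["last_name"]),
    (lambda r: r["middle_name"] is None,
     ["middle_name"]),
    (lambda r: r["valid_address_ind"] == "False",
     ["address_line_1", "city", "state", "zip", "valid_address_ind"]),
]


def columns_to_downgrade(record):
    """Return the columns whose trust should be downgraded for this record."""
    columns = []
    for condition, targets in RULES:
        if condition(record):
            columns.extend(targets)
    return columns


short_name = {"last_name": "Li", "middle_name": None, "valid_address_ind": "True"}
exempt_name = {"last_name": "NG", "middle_name": "A", "valid_address_ind": "False"}
print(columns_to_downgrade(short_name))
print(columns_to_downgrade(exempt_name))
```

The second record shows the 'NG' exemption: its last name is short but is not downgraded, while its invalid address downgrades all five address-related columns.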

Validation Rules

Validation Rules Demo

Relationships

• Relationships are associations between base objects via a matching column


• Property of the Base Object
• Types of relationships:
• One to Many Relationship
• Many to Many Relationship

Relationship

One-to-many Relationship
• One table (the child) contains a foreign key column, which matches a unique key
column of another table (the parent)
• One-to-many relationships are always defined from the child table in the relationship
(i.e. the referencing table rather than the referenced table).

Relationship

Many-to-many Relationship
• A base object acts as an intersection table between another two base objects
• The intersection table has a one-to-many relationship with the other two base objects

Lookups

Automatic Lookups
• MRM automatically handles lookups loading/updating the primary key of a Base Object

Staging Table for Customer data from CRM System (C_STG_CRM_CUST):

PKEY_SRC_OBJECT  FULL_NAME
3507             JOHN JAMES HANCOCK

Customer Base Object

ROWID_OBJECT  FULL_NAME
10810         JOHN J HANCOCK

Customer Cross-Reference

ROWID_OBJECT  ROWID_SYSTEM  PKEY_SRC_OBJECT  FULL_NAME
10810         CRM_SYS       3507             JOHN J HANCOCK
10810         SALES_SYS     A53UT1           JOHN HANCOCK

Lookups

User Defined Lookups


• For user-defined relationships, the corresponding lookups have to be manually
configured
• Lookups can be based on the XREF table or on a Unique Key in the Base Object

Staging Table for Address data from CRM System (C_STG_CRM_ADDR):

PKEY_SRC_OBJECT  CRM_ID
ADDR100          3507

Customer Base Object

ROWID_OBJECT  FULL_NAME
10810         JOHN J HANCOCK

Customer Cross-Reference

ROWID_OBJECT  ROWID_SYSTEM  PKEY_SRC_OBJECT  FULL_NAME
10810         CRM_SYS       3507             JOHN J HANCOCK
10810         SALES_SYS     A53UT1           JOHN HANCOCK
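
An XREF-based lookup like the one in this example resolves a source system's key to the ROWID_OBJECT of the parent base object. The Python sketch below illustrates that resolution with the slide's data; the function `lookup_rowid` and the in-memory list standing in for the cross-reference table are assumptions for illustration only.

```python
CUSTOMER_XREF = [
    {"ROWID_OBJECT": "10810", "ROWID_SYSTEM": "CRM_SYS", "PKEY_SRC_OBJECT": "3507"},
    {"ROWID_OBJECT": "10810", "ROWID_SYSTEM": "SALES_SYS", "PKEY_SRC_OBJECT": "A53UT1"},
]


def lookup_rowid(xref_rows, source_system, source_key):
    """Resolve a source key to a ROWID_OBJECT; None means the lookup failed."""
    for row in xref_rows:
        if (row["ROWID_SYSTEM"] == source_system
                and row["PKEY_SRC_OBJECT"] == source_key):
            return row["ROWID_OBJECT"]
    return None


staging_address = {"PKEY_SRC_OBJECT": "ADDR100", "CRM_ID": "3507"}
customer_rowid = lookup_rowid(CUSTOMER_XREF, "CRM_SYS", staging_address["CRM_ID"])
print(customer_rowid)
```

The CRM_ID 3507 on the staged address resolves to customer 10810. A key with no XREF row returns `None`, which is the situation that leads to a rejected record during the load.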

Lookups

Shadow Foreign Key


• The foreign key value stored on the cross-reference (X-ref) is the same as the value
stored on the base object
• This facilitates certain MRM internal processes on parent merge
• However, it makes it difficult to tie child X-refs back to their original parent X-ref
• Shadow foreign key is an additional column added to the X-ref for every foreign key
defined on the base object
• Contains the source system’s original foreign key value
• Name of shadow foreign key column is S_FKColumnName, for example
• Foreign key column name = Customer_ROWID
• Shadow foreign key column name = S_Customer_ROWID

Lookups

Shadow Foreign Key on XREF

Staging Table for Address data from CRM System (C_STG_CRM_ADDR):

PKEY_SRC_OBJECT  CUST_ID
ADDR100          3507

Customer Base Object

ROWID_OBJECT  FULL_NAME
10810         JOHN J HANCOCK

Customer Cross-Reference

ROWID_OBJECT  ROWID_SYSTEM  PKEY_SRC_OBJECT  FULL_NAME
10810         CRM_SYS       3507             JOHN J HANCOCK

Address Base Object

ROWID_OBJECT  CUST_ID  ADDRESS
24680         10810    123 Main St

Address Cross-Reference

ROWID_OBJECT  ROWID_SYSTEM  PKEY_SRC_OBJECT  CUST_ID  S_CUST_ID  ADDRESS
24680         CRM_SYS       ADDR100          10810    3507       123 Main St

Load Process

Load Process

Load process is a two-step process:


• Apply Updates
• Apply Inserts

Load job flow:

Register LOAD job
-> Process Updates
-> If STRIP_ON_LOAD_IND = 1: Tokenize (if STRIP_ON_LOAD_IND = 0: skip)
-> Process Inserts
-> If STRIP_ON_LOAD_IND = 1: Tokenize (if STRIP_ON_LOAD_IND = 0: skip)
-> End LOAD job

Load Process

Updates
• Load job applies updates for existing records whose

LAST_UPDATE_DATE (Staging table) > SRC_LUD (XREF table)


• The update process always updates the XREF table record
• The update process may update the Base Object depending on trust:
• For columns not flagged for trust, update happens if incoming data has new LUD
• For columns flagged for trust, load job compares trust weightings of staging table data
to trust weightings of existing data in base object to determine what can be updated
• If history flag is switched on for the Base Object, then the update process writes to the
history tables of Base Object and XREF
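
The update rules above can be sketched as follows. This is a simplified Python illustration, not the load job itself: the dictionary layout, the function `apply_update`, and the dates are hypothetical, the trust scores reuse the earlier Trust example, and history-table writes are left out.

```python
from datetime import date


def apply_update(staging, xref, base, base_trust, staging_trust, trusted_cols):
    """Apply one staging row; return the base object columns that changed."""
    # Only rows newer than the XREF record are treated as updates.
    if staging["LAST_UPDATE_DATE"] <= xref["SRC_LUD"]:
        return []
    changed = []
    for col, value in staging["data"].items():
        xref["data"][col] = value  # the XREF record is always updated
        if col in trusted_cols and staging_trust[col] <= base_trust[col]:
            continue  # trusted column: the existing base object value wins
        base["data"][col] = value
        if col in trusted_cols:
            base_trust[col] = staging_trust[col]  # assumption: trust follows the value
        changed.append(col)
    xref["SRC_LUD"] = staging["LAST_UPDATE_DATE"]
    return changed


staging = {"LAST_UPDATE_DATE": date(2024, 2, 1),
           "data": {"Name": "DMcD Group", "Phone": "201-10810"}}
xref = {"SRC_LUD": date(2024, 1, 1),
        "data": {"Name": "The Doug McDougal Group", "Phone": "1-555-901-4670"}}
base = {"data": {"Name": "The Doug McDougal Group", "Phone": "1-555-901-4670"}}
base_trust = {"Name": 71, "Phone": 56}
staging_trust = {"Name": 75, "Phone": 50}

changed = apply_update(staging, xref, base, base_trust, staging_trust,
                       trusted_cols={"Name", "Phone"})
print(changed, base["data"])
```

Both XREF cells take the incoming values, but only the Name cell of the base object changes, because the incoming Phone trust (50) is lower than the existing one (56).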

Load Process

Inserts
• Load job applies inserts for records that do not exist in the XREF table
• ROWID_OBJECT values are generated for the new records
• New records are inserted into base object and XREF with CONSOLIDATION_IND = 4
• If history flag is switched on for the Base Object, then the insert process writes to the
history tables of Base Object and XREF

Load Process

Rejects
• Referential Integrity is maintained among base objects in the consolidated data model
• Rejects will occur in the load process if any records violate the RI constraint
• Parent records do not exist
• Child records are loaded before the parent records
• Lookup has been defined incorrectly

• Rejected records are inserted into the reject table of the staging table

staging_table_name_REJ
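
The insert and reject behaviour can be sketched together. In this hedged Python example, a staged row is skipped if its key already exists in the XREF (it belongs to the update step), rejected if its parent cannot be found, and otherwise inserted with CONSOLIDATION_IND = 4. The function `apply_inserts`, the `PARENT_KEY` field, and the sequential ROWID generation are illustrative assumptions.

```python
def apply_inserts(staging_rows, xref_keys, parent_keys, next_rowid):
    """Split new staging rows into inserted records and rejected rows."""
    inserted, rejected = [], []
    for row in staging_rows:
        if row["PKEY_SRC_OBJECT"] in xref_keys:
            continue  # already in the XREF: handled by the update step
        if row["PARENT_KEY"] not in parent_keys:
            rejected.append(row)  # would be written to <staging_table_name>_REJ
            continue
        inserted.append({**row, "ROWID_OBJECT": next_rowid, "CONSOLIDATION_IND": 4})
        next_rowid += 1
    return inserted, rejected


rows = [
    {"PKEY_SRC_OBJECT": "ADDR100", "PARENT_KEY": "3507"},
    {"PKEY_SRC_OBJECT": "ADDR200", "PARENT_KEY": "9999"},  # parent does not exist
    {"PKEY_SRC_OBJECT": "ADDR300", "PARENT_KEY": "3507"},  # already loaded
]
inserted, rejected = apply_inserts(rows, xref_keys={"ADDR300"},
                                   parent_keys={"3507"}, next_rowid=24680)
print(inserted, rejected)
```

Loading parents before children keeps `parent_keys` complete and avoids the second kind of reject listed above.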

Topic 5: Match Process

Objectives

Following are the objectives of this topic:


• Match & Merge Overview
• Match Rules Configuration
• Exact Match/Search Strategy
• Fuzzy Match/Search Strategy
• Match Server Architecture

Match & Search Strategy

Match Process

Match & Merge Overview

Challenges with identifying duplicate records


• Misspellings, typing, and transcription errors
• Nicknames
• Synonyms
• Abbreviations
• Foreign and Anglicized words
• Prefix and suffix abbreviations
• Concatenation or splitting of words
• Noise words and punctuation
• Casing and character set variations

Match & Merge Overview

• To merge or link records, MRM needs to know which records are likely duplicates of
each other
• Match rules tell MRM how to identify likely duplicates
• Match rules also tell MRM if two matching records are similar enough to automatically
merge/link, or if they should be reviewed by a data steward

Match & Merge Overview

Data Consolidation Options


• Merging (merge-style base objects) physically combines the matched records in the base
object. Makes the most-current best version of the truth (BVT) available
• Linking (link-style base objects) quickly determines the BVT without physically
combining the records. Provides much faster overall throughput

Match/Search Strategy

Exact
• Does not allow for any variations in the data in the match columns
• Very simple match process, therefore fast

Fuzzy
• Allows for variations in spelling, formats, word order, nicknames, synonyms, etc.
• More complex match process, therefore slower

Match/Search Strategy

High level process flow for the match process

Register MATCH job
-> Fuzzy or Exact?
   Fuzzy: Generate Keys -> Search for Match Candidates -> Compare records to match
   against match candidates
   Exact: Compare records to match against rest of records in base object
-> Populate match table with matched ids
-> End MATCH job

Match Path

Match Path

• A Match Path identifies the base objects that provide data for matching
• It traverses the hierarchy between records across multiple base objects or within a
single base object
• Foreign key relationships between tables are used to traverse the hierarchy
• Parent-to-child or child-to-parent relationships can be specified

Match Path - Check for Missing Children

• By default, MDM does an inner join between the base objects defined in the Match
Path
• The join therefore excludes rows that don’t have corresponding rows in the joined
tables
• To include those records, switch on “Check for Missing Children” – MDM will then do
an outer join instead of an inner join
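
The difference between the default inner join and the outer join used with "Check for Missing Children" can be illustrated with a small Python sketch. The function `join_match_path`, its `check_missing_children` parameter, and the sample rows are hypothetical stand-ins for the parent and child base objects in a Match Path.

```python
def join_match_path(parents, children, check_missing_children=False):
    """Join parent rows to child rows; optionally keep parents with no children."""
    children_by_parent = {}
    for parent_id, value in children:
        children_by_parent.setdefault(parent_id, []).append(value)

    rows = []
    for parent_id, name in parents:
        matches = children_by_parent.get(parent_id, [])
        if matches:
            rows.extend((parent_id, name, child) for child in matches)
        elif check_missing_children:
            rows.append((parent_id, name, None))  # outer join: parent kept, child NULL
    return rows


customers = [(200, "John Smith"), (250, "John Smith"), (400, "Jane Doe")]
addresses = [(200, "123 Main Street"), (250, "109 Broad Street")]

inner = join_match_path(customers, addresses)
outer = join_match_path(customers, addresses, check_missing_children=True)
print(len(inner), len(outer))
```

Customer 400 has no address, so it is absent from the inner-join result and present (with a NULL child) in the outer-join result.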

Match Path

Match Path – Inter Table

Match Path

Match Path – Intra Table

Match Column

• A match column contains an identifying characteristic of the base object to be
consolidated
• Each base object can have multiple match columns
• Examples:
• Full Name
• Generation
• Address
• Phone
• Email
• Provider columns are the base object columns that provide the data for the match
column:
• Can be a single column or a concatenation of columns
• Must be VARCHAR / CHAR columns to concatenate
• Date columns are also supported for matching

Match Column

Each match column is based on one or more columns:
• From base object
• Or from X-ref (in some cases)
• Or from child base object (in some cases)

Customer:

ROWID_OBJECT  Name
200           John Smith
250           John Smith
300           John J Smith

Would get false positive matches if matching just on Name. Include Address attributes
in the Match to reduce false positives.

Address:

CUSTOMER_ROWID  Address
200             123 Main Street, Boston MA
250             109 Broad Street, Boulder CO
300             123 Main Street, Boston MA
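
The false-positive effect in this example can be reproduced with a short Python sketch that compares every pair of records on one or more columns. The helper `exact_matches` is hypothetical and uses plain equality, so it illustrates the exact strategy only.

```python
from itertools import combinations

names = {200: "John Smith", 250: "John Smith", 300: "John J Smith"}
addresses = {
    200: "123 Main Street, Boston MA",
    250: "109 Broad Street, Boulder CO",
    300: "123 Main Street, Boston MA",
}


def exact_matches(rowids, *columns):
    """Return ROWID pairs whose values are identical in every given column."""
    return [
        (a, b)
        for a, b in combinations(sorted(rowids), 2)
        if all(column[a] == column[b] for column in columns)
    ]


name_only = exact_matches(names, names)
name_and_address = exact_matches(names, names, addresses)
print(name_only, name_and_address)
```

Matching on Name alone pairs 200 with 250 even though they live in different cities. Adding Address removes that pair. Records 200 and 300 share an address but differ in name spelling, so pairing them would need the fuzzy strategy.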

Exact Match/Search Strategy

Steps for defining Exact Match Rules


• Select Match/Search Strategy = Exact
• Define Match Path
• Define Match Columns
• Create at least one Match Rule Set
• Create Match Rules for Match Rule Set(s)

Exact Match/Search Strategy

Match Columns
• A match column contains an identifying characteristic of the base object record to be
consolidated
• Exact Match Columns:
• Do not make allowance for any variations in data content
• Records will match if they have identical values in the match columns used in match
rules

Exact Match/Search Strategy

Match Rule & Match Rule Set


• Match Rules are grouped into Match Rule Sets
• Can define multiple rule sets
• Only one match rule set can be active at any point in time

• Match rule defines the combination of columns that constitute a match

Match Rule - Auto property


• Match rules are flagged either for auto merge/auto link or for manual merge/link
• Matches resulting from auto merge/auto link rules will result in the records being
automatically merged/linked by the system when the auto merge/auto link batch job
runs
• Matches resulting from manual merge/link rules will be queued for review by a data
steward

Exact Match/Search Strategy

Match Rule – Null Matching


• By default, NULL is not regarded as being the same as NULL
• NULL Matches NULL: Use this flag to specify the match columns in a match rule that
should be regarded as matches even if the 2 values being compared are both NULL
• NULL Matches non-NULL: Use this flag to specify the match columns in a match rule that
should be regarded as matches when one of the values being compared is NULL and the
other is not

Data Example:

ROWID_OBJECT  Customer_Name     Generation
500           Douglas McDougal  Jr
550           Doug McDougal
560           D McDougall       Jr
570           Doug McDougall

The example shows the effect of Null Matching on the Generation column, which is NULL
for records 550 and 570.
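
The three null-matching behaviours can be summarised in a small comparison function. This is a Python sketch under assumptions: the function `null_aware_equal` and the mode strings are invented for illustration, and the behaviour of "NULL Matches non-NULL" when both values are NULL follows the slide wording literally (no match).

```python
def null_aware_equal(a, b, null_match="disabled"):
    """Compare two exact match column values under a null-matching mode."""
    if a is None and b is None:
        return null_match == "null_matches_null"
    if a is None or b is None:
        return null_match == "null_matches_non_null"
    return a == b


generation = {500: "Jr", 550: None, 560: "Jr", 570: None}

default_550_570 = null_aware_equal(generation[550], generation[570])
null_null_550_570 = null_aware_equal(generation[550], generation[570],
                                     null_match="null_matches_null")
null_value_500_550 = null_aware_equal(generation[500], generation[550],
                                      null_match="null_matches_non_null")
print(default_550_570, null_null_550_570, null_value_500_550)
```

With the default setting, records 550 and 570 do not match on Generation even though both are NULL. "NULL Matches NULL" lets them match, and "NULL Matches non-NULL" lets 500 (Jr) match 550 (NULL).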

Exact Match/Search Strategy

Match Rule – Non-Equal Matching

• Specifies that 2 records are a match if they do not have the same values in the
non-equal match column
• Reverses whatever would/would NOT have matched without Non-equal match
• If using non-equal match, then MUST switch on Validate Matches property in Base
Object Advanced Properties

Data Example:

ROWID_OBJECT  Customer_Type  Customer_Name            CRM_FLAG
500           ORG            The Doug McDougal Group
550           IND            Doug McDougal            Y
560           IND            D McDougall              Y
570           ORG            Doug McDougal

• If non-equal match is used on the CRM_FLAG column to prevent 2 records from the CRM
system from matching each other, then:
• NULL=Y is a match
• NULL=NULL is a match
• Y=Y is not a match
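
The three outcomes listed for the CRM_FLAG example can be captured in a one-line predicate. The Python function below is a sketch of that specific example only; the name `non_equal_match` is hypothetical and the handling of NULL simply mirrors the three cases on the slide.

```python
def non_equal_match(a, b):
    """Non-equal (anti-match) column check, following the CRM_FLAG example."""
    if a is None or b is None:
        return True   # NULL=Y and NULL=NULL are both treated as a match
    return a != b     # two identical non-NULL values are not a match


crm_flag = {500: None, 550: "Y", 560: "Y", 570: None}

print(non_equal_match(crm_flag[500], crm_flag[550]))  # NULL vs Y
print(non_equal_match(crm_flag[500], crm_flag[570]))  # NULL vs NULL
print(non_equal_match(crm_flag[550], crm_flag[560]))  # Y vs Y
```

Records 550 and 560 both come from the CRM system (flag Y), so the rule keeps them from matching each other while still allowing either to match a non-CRM record.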

Exact Match/Search Strategy

Match Rule – Segment Matching


• Allows a match rule to be limited to a specific subset of data
• Different match rules can use different segment values

Data Example:

ROWID_OBJECT  Customer_Type  Customer_Name            CRM_FLAG
500           ORG            The Doug McDougal Group
550           IND            Doug McDougal            Y
560           IND            D McDougall              Y
570           ORG            Doug McDougal

• Use a Segment Match value of ‘ORG’ on Customer Type match column to create a match
rule that only applies to Organizations.
• Use a Segment Match value of ‘IND’ on Customer Type match column to create a match
rule that only applies to Individuals.
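
Segment matching amounts to filtering the records a rule is allowed to see. The following Python sketch shows that filter on the example data; the function `records_in_segment` is a hypothetical helper, not part of any MDM API.

```python
RECORDS = [
    {"ROWID_OBJECT": 500, "Customer_Type": "ORG", "Customer_Name": "The Doug McDougal Group"},
    {"ROWID_OBJECT": 550, "Customer_Type": "IND", "Customer_Name": "Doug McDougal"},
    {"ROWID_OBJECT": 560, "Customer_Type": "IND", "Customer_Name": "D McDougall"},
    {"ROWID_OBJECT": 570, "Customer_Type": "ORG", "Customer_Name": "Doug McDougal"},
]


def records_in_segment(records, segment_column, segment_values):
    """Return the ROWID_OBJECTs a segment-limited match rule would consider."""
    return [r["ROWID_OBJECT"] for r in records if r[segment_column] in segment_values]


org_rule_scope = records_in_segment(RECORDS, "Customer_Type", {"ORG"})
ind_rule_scope = records_in_segment(RECORDS, "Customer_Type", {"IND"})
print(org_rule_scope, ind_rule_scope)
```

An organization rule therefore only compares 500 with 570, and an individual rule only compares 550 with 560.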

Exact Match/Search Strategy

Match Rule

Fuzzy Match/Search Strategy

Steps for defining Fuzzy Match Rules


• Select Match/Search Strategy = Fuzzy
• Choose a Population
• Define Match Path
• Define Match Key
• Define Match Columns
• Create at least one Match Rule Set & choose Search Level
• Create Match Rules for Match Rule Set(s)

Fuzzy Match/Search Strategy

Population
• Population is intended to address the name distribution problem
• Common family names in each population skew the data and query performance
e.g. Smith, Williams in English-speaking populations

• Each population also has a large number of uncommon names that tend to have the
most error and variability
• Match needs to account for both of these situations in the way that the keys are built,
to give optimal search performance for both
• Defines how to identify matches within a particular population and language
• Defines how to build keys and perform searches on name and address
• Supports a specific set of match purposes

Fuzzy Match/Search Strategy

Population

Fuzzy Match/Search Strategy

Match Key
• Match key is used to search for match candidates
• It is a fixed-length, compressed, and encoded value
• Built from a combination of the words and numbers in a name or address
• For one name or address, multiple SSA match keys are generated
• Match Key Properties:
• Key Type
• Key Width
• Path Component
• Match Column Contents

Fuzzy Match/Search Strategy

Match Key – Key Type


• The match key type describes important characteristics about a column to MDM Hub
• Should be based on the main identifying data in your base object
• For standard population, the options are:

Key Type            Description
Organization Name   Use if the data contains organization names or both organization
                    names and individual names
Person Name         Use if the data contains individual names only
Address Part1       Use if the data contains addresses

Fuzzy Match/Search Strategy

Match Key – Key Width

• Determines the degree of variance that will be supported in the key values
• Represents tradeoff between match precision and the space used by match key
records

Key Width   Description
Extended    • Generates the most keys
            • Allows for the most variance in key values i.e. supports greatest
            search completeness
            • Uses the most disk space
Limited     • Generates the fewest keys
            • Does not allow for word order variances
            • Uses the least disk space
Standard    • Aims for balance between Limited and Extended i.e. balance between
            disk usage/performance and search completeness
Preferred   • Generates single key
            • Might result in fewer match candidates

Fuzzy Match/Search Strategy

Match Key – Path Component


• Contains the column that forms the basis for defining the Match Key
• Can be any table defined in the Match Path

Match Key – Match Column Contents


• The column(s) from Path Component that provide data to the Match Key

Fuzzy Match/Search Strategy

Match Key – Example

• Key Type = Organization_Name
• Key Width = Standard
• Path Component = Customer
• Match Column Contents = Full_Name

Full_Name             Match Key
ELIZABETH S O'BRIAN   PCOJLK$-
                      PCWG$$OG
                      VL/IEFLM
                      VL/IJ/$-
ELIZABETH O BRIEN     MIDIA*P-
                      MIP$$$DI
                      PC>AO$$-
                      PCP$$$>>
ELIZABETH O'BRIEN     PCOG$$$$
                      VL/IEFLM
BETH O'BRIEN          MMU$?/$-
                      PCOG$$$$
                      VL/IEFLM
LIZ O'BRIEN           PCOG$$$$
                      SXOG$$$-
                      VL/IEFLM

Fuzzy Match/Search Strategy

Match Key

Fuzzy Match/Search Strategy

Match Column
• A match column contains an identifying characteristic of the base object record to be
consolidated
• Can be a fuzzy column or an exact column
• Fuzzy Match Column
• The column name you choose defines the type of data that the match expects that
column to contain
• Examples: Person Name, Address Part 1, Address Part 2, etc.

• Exact Match Column


• Acts as a filter in the match
• Can have additional properties when used in match rules like Null match, Non-equal
match and segment match

Fuzzy Match/Search Strategy

Match Column

Fuzzy Match/Search Strategy

Match Rule Set


• A Match Rule Set is a logical grouping of Match Rules that collectively act on a base
object to identify duplicates
• Multiple rule sets can be defined for a base object
• Only one rule set can be active at any point in time
• Each rule set has a Search Level and can comprise one or more Match Rules

Fuzzy Match/Search Strategy

Match Rule Set – Search Level


• Determines how many match candidates are returned in the search phase of match
process

Search Level   Description
Narrow         • Least complex and generates fewest candidates
               • Gives the best performance
Typical        • The appropriate search level for typical data sets
Exhaustive     • Used when the data set is small or if it is critical to identify the
               highest number of matching records
Extreme        • Supports the highest level of complexity
               • Gives the worst performance as it generates the most candidates

Fuzzy Match/Search Strategy

Match Rule Set – Search Level Examples

Key Type = Organization_Name; Key Width = Standard;

Record to be Matched = “ELIZABETH S O’BRIAN”

Number of search ranges per Search Level is shown in parentheses.

Narrow (3)            Typical (14)          Exhaustive (26)       Extreme (27)
Start key  End key    Start key  End key    Start key  End key    Start key  End key
PCOG$$$$   PCOG$$$/   PCWG$$$$   PCWG$$ZZ   PVS$$$$$   PVS$BZZZ   PVS$$$$$   PVS$BZZZ
PC$$$$$$   PC$$$$$/   PCOG$$$$   PCOJZZZZ   MM/OB/$$   MM/OB/$/   MM/OAH$$   MM/OB/ZZ
PCWG$$$$   PCWG$$ZZ   OVOG$$$$   OVOJZZZZ   M-WG$$$$   M-WG$$ZZ   M-TO$$$$   M-WJZZZZ
                      VL/IEF$$   VL/IEF$/   PCWG$$$$   PCWG$$ZZ   PCTO$$$$   PCWJZZZZ
                      PC$$$$$$   PC$$$$$/   MMV>B/$$   MMV>B/$/   MMV>AH$$   MMV>B/ZZ
                      PVS$$$$$   PVS$$$$/   RSWG$$$$   RSWG$$ZZ   RSTO$$$$   RSWJZZZZ
                      MM/OB/$$   MM/OB/$/   P?WG$$$$   P?WG$$ZZ   P?TO$$$$   P?WJZZZZ
                      MMV>B/$$   MMV>B/$/   KXWG$$$$   KXWG$$ZZ   KXTO$$$$   KXWJZZZZ
                      P?WG$$$$   P?WG$$ZZ   PBWG$$$$   PBWG$$ZZ   PBTO$$$$   PBWJZZZZ
                      PVLKB/$$   PVLKB/$/   PVLGB/$$   PVLGB/$/   PVLGAH$$   PVLGB/ZZ
                      S$S$B/$$   S$S$B/$/   PVKSB/$$   PVKSB/$/   PVKSAH$$   PVKSB/ZZ
                      TNKBJ/$$   TNKBJ/$/   PAWG$$$$   PAWG$$ZZ   PATO$$$$   PAWJZZZZ
                      TIWG$$$$   TIWG$$ZZ   RAWG$$$$   RAWG$$ZZ   RATO$$$$   RAWJZZZZ
                      YMU$B/$$   YMU$B/$/   PVLKB/$$   PVLKB/$/   PVLKAH$$   PVLKB/ZZ
                                            …          …          …          …

Fuzzy Match/Search Strategy

Match Rule Set

Fuzzy Match/Search Strategy

Match Rule
• Determines what constitutes a match during match process
• Fuzzy Match Rule Properties:
• Match Purpose
• Match Level
• Accept Limit Adjustment

Match Rule – Match Purpose


• Determines the fields that will be used in the match
• Different fields are required fields for different purposes
• There are also optional fields for each purpose that can help improve the match

• Determines the importance accorded to each field

Fuzzy Match/Search Strategy

Match Rule – Match Level


• Determines how precise the match is i.e. how similar a candidate record is to the
queued record to be considered a match
• Supported match levels are:
• Conservative: Tight Matching
• Typical: Appropriate for most matches
• Loose: Allows more variance in the values being matched

Match Rule – Accept Limit Adjustment


• Determines the acceptability of a match for the specified match level
• The Accept Limit Adjustment allows a coarse adjustment to what is considered to be a
match for this match rule:
• A positive adjustment results in tighter matching
• A negative adjustment results in looser matching

Fuzzy Match/Search Strategy

Match Rule

Fuzzy Match/Search Strategy

Match Rule – Syntax Used in Rule Description

Symbol                  Description
Column_1 (Fuzzy)        Indicates that Column_1 is a fuzzy match column
Column_1 (Fuzzy) (+2)   Indicates that the fuzzy column, Column_1, has had its weighting in
                        the rule manually increased
Column_2 {‘a’}          Set of segment match values for Column_2
Column_3 (Ø)            Indicates that null match is switched on for Column_3. Can be
                        combined with non-equal match: Column_3 (≠ Ø)
Column_4 (≠)            Indicates that non-equal match (anti-match) is switched on for
                        Column_4. Can be combined with null match: Column_4 (≠ Ø)

Match Server Architecture

Match server is multi-threaded
• Can configure how many threads MDM Hub will create for matching
• If not configured, 4 threads will be created regardless of the number of CPUs on the
machine

Multiple match servers can be configured
• Allows match jobs to be run in parallel. A single match job is not load balanced across
multiple match servers
• MDM Hub will assign match jobs to available match servers on a round robin basis
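
Round-robin assignment of whole jobs to servers can be sketched as below. This Python example is only an illustration of the scheduling idea: the function `assign_round_robin`, the job names, and the server names are hypothetical, and it ignores server availability, which the real Hub takes into account.

```python
from itertools import cycle


def assign_round_robin(match_jobs, match_servers):
    """Assign each match job, as a whole, to the next server in rotation."""
    rotation = cycle(match_servers)
    return {job: next(rotation) for job in match_jobs}


assignment = assign_round_robin(
    ["match_customer", "match_address", "match_product"],
    ["match_server_1", "match_server_2"],
)
print(assignment)
```

Each job lands on exactly one server, which reflects the point that a single match job is not split across servers; parallelism comes from running several jobs at once.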

Topic 6: Merge Process

Objectives

Following are the objectives of this topic:


• Configure Merge Settings
• Describe Immutable Source Systems
• Describe Distinct Systems
• Describe the Un-Merge Process

Merge Process

Merge Process

Merge Process

Merge
• Consolidation process of two matched records in the Base Object
• Merge can be Auto-Merge or Manual-Merge depending on the degree of matching

Immutable Source Systems


• An immutable source means that the source system is seen as a distinct source
• All records coming from this source always have a consolidation indicator of 1
• If two immutable records must be merged, then a data steward needs to perform a
manual verification in order to allow that change. The data steward will have to choose
the key that remains

Distinct Systems

• Records from a source marked as Distinct will not merge amongst themselves

Merge Process

Un-Merge Process
• By default, unmerging parent records does not unmerge associated child records
• Unmerge Child When Parent Unmerges option allows you to specify what happens if
records in the parent base object are unmerged
• Pre-Requisites for enabling this option are:
• The parent-child relationship must already be configured in the child base object
• The foreign key column in the child base object must be a match-enabled column

Topic 7: Batch Process

Objectives

Following are the objectives of this topic:


• Overview of Batch Viewer
• Executing Stored Procedures
• Job Status & Job Statistics
• Scheduling Considerations
• Overview of Batch Group
• Viewing Logs and Rejected Records

Batch Process

Batch Viewer

• Provides a way to execute a batch job from the Hub Console

• Shows job completion status (Success / Failure / Warning) with associated message

• Shows job statistics

• Useful for starting the run of a single job, or running jobs that don’t often need to run
(e.g. Synchronize Trust job after changing Trust settings)

• Does not provide any automation or scheduling

Batch Process

Batch Viewer

Executing Stored Procedures

Stored Procedures
• All public MRM batch processes can be executed through stored procedures
• Can easily be integrated with any job scheduling software – Tivoli, CA Unicenter etc.
• The full list of public batch processes per user-defined object can be found in
C_REPOS_TABLE_OBJECT_V

SELECT * FROM C_REPOS_TABLE_OBJECT_V WHERE PUBLIC_IND = 1


• Various Run Status upon completion of a Stored Procedure:
• 0 = Completed Successfully
• 1 = Completed with Errors/Warnings
• 3 = Failed
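
A scheduler wrapper typically translates these run status codes into a decision about whether to continue with the next job. The Python sketch below shows one way to do that; the helper names are hypothetical, and treating only status 0 as safe to continue is a policy assumption, not something the slide prescribes.

```python
RUN_STATUS = {
    0: "Completed Successfully",
    1: "Completed with Errors/Warnings",
    3: "Failed",
}


def describe_run_status(code):
    """Translate a stored procedure run status into a readable message."""
    return RUN_STATUS.get(code, f"Unknown run status {code}")


def should_continue(code):
    """Assumed policy: only a clean completion lets the next job start."""
    return code == 0


for status in (0, 1, 3):
    print(status, describe_run_status(status), should_continue(status))
```

A site that tolerates warnings could relax `should_continue` to accept status 1 as well.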

Job Status & Job Statistics

Job Status and Statistics


• Job status & statistics can be viewed in the Batch Tool or by querying the C_REPOS_JOB*
tables directly

Scheduling Considerations

Stage Jobs
• If cleanse server machine has enough CPU and memory to handle multiple cleanse
servers, then parallelize stage jobs

Load Jobs
• Easiest way to schedule Load jobs is in serial
• If a large number of Loads must run in a short batch window, then Load separate
targets in parallel and check all dependencies before each Load starts

Match/Merge Jobs
• Determine whether to run match-merge once per object per batch window, or after
every source load
• Consider whether to tokenize after load. Can switch off the STRIP_ON_LOAD indicator
so that the strip process does not run as part of the load

Batch Group

Batch Group
• A batch group is a collection of individual batch jobs (e.g. Stage, Load, Match, etc.) that
can be executed with a single command
• Each batch job in a group can be executed sequentially or in parallel to other jobs
• Group Levels – Jobs in a particular Group Level are executed in parallel

Viewing Logs and Rejected Records


• History logs can be viewed across all Batch Groups, based on their execution status by
clicking on the appropriate node under the “Logs By Status” node
• A batch group that contains stage jobs may encounter rejected records. These can be
viewed by selecting the log record for the stage job that contains the rejected record,
then clicking the “View Rejects” button

Batch Group

Batch Group

• Jobs in the same level are executed in parallel
• Individual levels are executed in sequence
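
The level semantics of a batch group (parallel within a level, sequential across levels) can be sketched with a thread pool. This Python example is an illustration only: `run_batch_group`, the job names, and the stand-in `run_job` callable are hypothetical and do not call any MDM Hub API.

```python
from concurrent.futures import ThreadPoolExecutor


def run_batch_group(levels, run_job):
    """Run each level's jobs in parallel; start a level only after the previous one ends."""
    results = []
    for level in levels:
        # Leaving the "with" block waits for every job in this level to finish.
        with ThreadPoolExecutor(max_workers=max(1, len(level))) as pool:
            results.append(list(pool.map(run_job, level)))
    return results


levels = [
    ["stage_crm_customer", "stage_sales_customer"],  # level 1: run together
    ["load_customer"],                               # level 2: after both stage jobs
    ["match_customer"],                              # level 3: after the load
]
results = run_batch_group(levels, run_job=lambda job: f"{job}: done")
print(results)
```

Putting dependent jobs (for example a Load and the Stage jobs that feed it) in different levels is what guarantees their ordering.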
