
Informatica Overview

Contents
• Introduction
• Clients
• Server(s)
• Source, Target, Repository
• Connectivity
What is Informatica?
• Allows you to load data into a centralized
location, such as a data mart or data warehouse.

• ETL Tool
– Extract data from multiple sources
– Transform the data according to business logic
and needs
– Load the transformed data into file and
relational targets
Example

EMPLOYEE (source)       EMP_DETAILS (target)
EMP_ID                  EMP_ID
EMP_NAME                EMP_NAME
EMP_CITY                EMP_CITY
EMP_STATE               EMP_STATE
EMP_COUNTRY             EMP_COUNTRY
EMP_DATE_OF_JOINING     YRS_OF_SRV

Transform: Date of Joining to Yrs of Service
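As a rough illustration of the transform step in this example, the sketch below derives YRS_OF_SRV from EMP_DATE_OF_JOINING in Python. The function names, the sample row, and the rule of counting only completed years are assumptions made for illustration; the slide does not say how the years are rounded.

```python
from datetime import date


def years_of_service(date_of_joining: date, as_of: date) -> int:
    """Return the number of completed years between two dates."""
    years = as_of.year - date_of_joining.year
    # Subtract one year if this year's anniversary has not been reached yet.
    if (as_of.month, as_of.day) < (date_of_joining.month, date_of_joining.day):
        years -= 1
    return years


def to_emp_details(row: dict, as_of: date) -> dict:
    """Copy the pass-through columns and replace the joining date with YRS_OF_SRV."""
    target = {k: v for k, v in row.items() if k != "EMP_DATE_OF_JOINING"}
    target["YRS_OF_SRV"] = years_of_service(row["EMP_DATE_OF_JOINING"], as_of)
    return target


# Hypothetical source row shaped like the EMPLOYEE definition.
employee = {
    "EMP_ID": 101,
    "EMP_NAME": "Arun",
    "EMP_CITY": "Pune",
    "EMP_STATE": "MH",
    "EMP_COUNTRY": "IN",
    "EMP_DATE_OF_JOINING": date(2015, 6, 15),
}

emp_details = to_emp_details(employee, as_of=date(2020, 6, 14))
print(emp_details)
```

The reference date is passed in explicitly so the result does not depend on the day the sketch is run.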
Data Warehousing
[Diagram: Operational Sources; Extract, Transform, Load; Data Warehouse; Metadata; Developer and End User]
 Informatica Architecture
 Design Process
 Client Tool Review
– Repository Manager
– Designer
– Server Manager
Informatica’s Architecture
[Diagram: Designer, Repository Manager, and Server Manager (clients); Data Models; PowerPlugs; Repository; Server; Sources 1-n; Targets 1-n]
Informatica Design Process
[Diagram: Source Def, Target Def, Mapping, and Sessions, numbered to match the steps below]

1. Create Repository
2. Import Source Definitions
3. Create Target Schema
4. Create Mappings
5. Load Data
Informatica Client
• Repository Manager – View much of the metadata in the
Repository.
• Designer – Create source-to-target mappings that contain
transformation instructions for the Informatica Server.
• Server Manager – Create, schedule, and monitor sessions.
You create a session based on a mapping and
schedule it to run on the Informatica Server.
Informatica Client
Repository Manager
Metadata Repository

 Information about the data mart system


 Catalogs the repository
 Directs the server
 Contains record of user access
 Can be shared
 Can be searched and reported
 Bridged through Metadata Exchange
Repository Manager

Navigator Window Analysis Window

Dependency Window

Output Window
Folder Attributes

• FOLDER OWNER - user who serves as focal point for
folder permissions
• PERMISSIONS - rights to read, write, and/or execute
objects in a folder
• SHARED - property that allows you to make shortcuts to
objects in a folder
• SHORTCUT - a dynamic link to an object stored in a
shared folder
• VERSIONS - folder iterations that indicate development
stages
Informatica Client
Designer
Designer Workspace
Open Folder List

Navigator
Workspace

Workbook Tabs

Output Window

Status Bar
Designer Options
• General – workspace options, reload objects on open, group source definitions
• Tables – columns viewed, column size, object size, import keys, automatic Source Qualifier creation
• Format – workspace colors, object colors
Informatica Client
Server Manager
Server Manager

Navigator

Configure Window

Monitor Window

Output Window
Designer
• Source Analyzer
• Warehouse Designer
• Transformation Developer
• Mapplet Designer
• Mapping Designer
Source Analyzer
• Identify the sources used to build the warehouse.
• Create repository definitions for these sources
Analyzing Sources
• Relational – Oracle, Sybase, Informix, IBM DB2,
Microsoft SQL Server, and Teradata
• File – Fixed and delimited flat file, COBOL file, and XML
• Other – Microsoft Excel, Microsoft Access
• Extended – PeopleSoft, SAP R/3, Siebel, and IBM
MQSeries (additional products must be purchased for these
sources)
• Mainframe – Additional products must be purchased.


Warehouse Designer
• Create relational tables in Target database
• Edit target definitions
• Preview relational target data
Targets
• Relational – Oracle, Sybase, Sybase IQ, Informix, IBM DB2,
Microsoft SQL Server, and Teradata
• File – Fixed and delimited flat files and XML
• Extended – SAP BW, IBM MQ Series (need to purchase
additional products for these targets)
• Other - Microsoft Access
Mixing Sources and Targets
• You can combine data from different platforms and
source types.

Oracle

Sybase

Flat
File
Transformation Developer
• Generates, modifies, or passes data through ports
• 12 objects for transforming data
Transformation Types
• Source Qualifier represents all data queried from the
source
• Normalizer normalizes records from VSAM or
relational sources
• Expression performs simple calculations
• Filter serves as a conditional filter
• Aggregator performs aggregate calculations
• Rank limits records to top or bottom range
Transformation Types (contd.)
• Update Strategy allows for logic to insert, update,
delete, or reject data
• Lookup looks up values and passes to other
objects
• Stored Procedure calls a stored procedure and
captures return values
• External Procedure calls a procedure defined in a
shared library
• Sequence Generator generates unique ID values
• Joiner allows for heterogeneous joins
Source Qualifier Transformation
• Represents the records that the Informatica Server reads
when it runs a session
• Automatically attached when a Source is added to a
mapping
Use a Source Qualifier to:
 Filter Records when the Informatica Server reads
source data
 Specify sorted ports
– Order by clause
 Select only distinct values from a source
 Create a custom query for the Informatica Server
to read source data
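The Source Qualifier options above correspond to clauses of the query issued against a relational source. The sketch below uses Python's built-in sqlite3 module with a made-up EMPLOYEE table to show a filter (WHERE), sorted ports (ORDER BY), and distinct values (SELECT DISTINCT) in one query. The table contents and the filter value are assumptions for illustration, and this is not the SQL that Informatica itself generates.

```python
import sqlite3

# In-memory database standing in for a relational source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (EMP_ID INTEGER, EMP_NAME TEXT, EMP_CITY TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [
        (3, "Chitra", "Pune"),
        (1, "Arun", "Pune"),
        (2, "Bala", "Chennai"),
        (1, "Arun", "Pune"),  # duplicate row, removed by DISTINCT
    ],
)

# Filter, distinct, and sort are all pushed into the source query,
# so only the rows that are needed ever leave the database.
query = """
    SELECT DISTINCT EMP_ID, EMP_NAME, EMP_CITY
    FROM EMPLOYEE
    WHERE EMP_CITY = 'Pune'
    ORDER BY EMP_ID
"""
rows = conn.execute(query).fetchall()
print(rows)
```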
Expression Transformation
• Calculate values in a single row
– Adjust employee salaries, concatenate first and
last names, convert string to number…
• Perform any non-aggregate calculations
• Test conditional statements before you output to
target
Example

EMPLOYEE (source)    EMP_SALARY (target)
EMP_ID               EMP_ID
EMP_NAME             EMP_NAME
ROLE_CODE            ROLE_CODE
BASIC_SALARY         GROSS_SALARY

Expression: Gross Salary = Basic Salary * 3.5
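The row-by-row nature of this expression can be sketched in Python as follows. The function name and the sample rows are hypothetical; only the formula (gross salary equals basic salary times 3.5) comes from the example.

```python
def expression_transform(row: dict) -> dict:
    """Compute GROSS_SALARY for one row; no other row is consulted."""
    return {
        "EMP_ID": row["EMP_ID"],
        "EMP_NAME": row["EMP_NAME"],
        "ROLE_CODE": row["ROLE_CODE"],
        "GROSS_SALARY": row["BASIC_SALARY"] * 3.5,
    }


# Hypothetical EMPLOYEE rows.
employee_rows = [
    {"EMP_ID": 1, "EMP_NAME": "Arun", "ROLE_CODE": "SE", "BASIC_SALARY": 20000.0},
    {"EMP_ID": 2, "EMP_NAME": "Bala", "ROLE_CODE": "PM", "BASIC_SALARY": 40000.0},
]

emp_salary = [expression_transform(row) for row in employee_rows]
print(emp_salary)
```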
Aggregator Transformation
• Allows you to perform aggregate calculations,
such as averages and sums
• While the Expression is on a row-by-row basis,
the Aggregator can perform calculations on
groups
Example

REVENUE (source)     PU_REVNUE (target)
PU_CODE              PU_CODE
PROJECT_CODE         MAX_REVENUE
REVENUE              MIN_REVENUE
                     AVG_REVENUE

Aggregator Transformation:
Max Revenue = Max(Revenue)
Min Revenue = Min(Revenue)
Avg Revenue = Avg(Revenue)
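A minimal Python sketch of this aggregation follows. It assumes the rows are grouped by PU_CODE, which the slide implies (PU_CODE is the only non-aggregate column in the target) but does not state; the sample rows and function name are made up.

```python
from collections import defaultdict

# Hypothetical REVENUE rows.
revenue_rows = [
    {"PU_CODE": "WENA", "PROJECT_CODE": "NML", "REVENUE": 100.0},
    {"PU_CODE": "WENA", "PROJECT_CODE": "ABC", "REVENUE": 300.0},
    {"PU_CODE": "EAST", "PROJECT_CODE": "XYZ", "REVENUE": 50.0},
]


def aggregate_by_pu(rows):
    """Group revenue values by PU_CODE, then emit one output row per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["PU_CODE"]].append(row["REVENUE"])
    return [
        {
            "PU_CODE": pu_code,
            "MAX_REVENUE": max(values),
            "MIN_REVENUE": min(values),
            "AVG_REVENUE": sum(values) / len(values),
        }
        for pu_code, values in groups.items()
    ]


pu_revenue = aggregate_by_pu(revenue_rows)
print(pu_revenue)
```

Unlike the Expression sketch style of calculation, the output here has one row per group rather than one row per input row.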
Filter Transformation
• Provides the means for filtering rows in a
mapping
– Employees who are currently working in the
project “NML” of “WENA” as “SE”
• Only rows that meet the condition pass
through the mapping.
Filter Transformation
• All ports are input/output
• Returns TRUE or FALSE for each row passed
through the mapping based on the condition
• Discarded rows do not appear in the session log
or reject files
• The input ports must all come from a single
transformation
Filter vs Source Qualifier (SQ)
• SQ provides better performance
• SQ only lets you filter rows from relational
sources; the Filter Transformation filters rows from
any source
• SQ only uses standard SQL; Filter can use any
statement or function that returns True/False
Example

EMPLOYEE (source)    NM_EMP_DETAILS (target)
EMP_ID               EMP_ID
EMP_NAME             EMP_NAME
PROJECT_CODE         PROJECT_CODE
PU_CODE              PU_CODE
ROLE_CODE            ROLE_CODE

Filter Transformation: Where Project = “NML” and PU = “WENA” and Role = “SE”
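The filter condition in this example can be sketched in Python as a predicate applied to each row. The sample rows and function name are hypothetical; rows for which the condition is false are simply dropped, mirroring the point that discarded rows are not written anywhere.

```python
def filter_condition(row: dict) -> bool:
    """TRUE only for software engineers on project NML in unit WENA."""
    return (
        row["PROJECT_CODE"] == "NML"
        and row["PU_CODE"] == "WENA"
        and row["ROLE_CODE"] == "SE"
    )


# Hypothetical EMPLOYEE rows.
employee_rows = [
    {"EMP_ID": 1, "EMP_NAME": "Arun", "PROJECT_CODE": "NML", "PU_CODE": "WENA", "ROLE_CODE": "SE"},
    {"EMP_ID": 2, "EMP_NAME": "Bala", "PROJECT_CODE": "NML", "PU_CODE": "WENA", "ROLE_CODE": "PM"},
    {"EMP_ID": 3, "EMP_NAME": "Chitra", "PROJECT_CODE": "XYZ", "PU_CODE": "WENA", "ROLE_CODE": "SE"},
]

# All ports are input/output, so passing rows keep every column unchanged.
nm_emp_details = [row for row in employee_rows if filter_condition(row)]
print(nm_emp_details)
```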
Router Transformation
 Groups data into many groups
 Routes rows of data that do not meet any
condition to a default group
 Can enter any expression that returns a single
value
 Condition returns True or False for each row
 If the condition = NULL, the row is assumed as
FALSE
Router Transformation
• One Group can be connected to One
transformation or target
• One Output Port can be connected to multiple
transformations or targets
• Multiple Output ports in One Group can be
connected to multiple transformations or
targets
• CANNOT Connect more than One Group to One
Transformation or Target
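The routing rules above (one condition per group, NULL treated as FALSE, unmatched rows sent to a default group) can be sketched in Python as follows. The group names, sample rows, and the choice to send a row to every group whose condition is true are assumptions for illustration.

```python
def is_role(role_code):
    """Build a condition that returns None (NULL) when ROLE_CODE is missing."""
    def condition(row):
        if row["ROLE_CODE"] is None:
            return None
        return row["ROLE_CODE"] == role_code
    return condition


def route(rows, group_conditions):
    """Send each row to every group whose condition is TRUE, else to DEFAULT."""
    groups = {name: [] for name in group_conditions}
    groups["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in group_conditions.items():
            # Only an explicit True passes; None (NULL) counts as FALSE.
            if condition(row) is True:
                groups[name].append(row)
                matched = True
        if not matched:
            groups["DEFAULT"].append(row)
    return groups


# Hypothetical rows; the third one has a NULL role code.
rows = [
    {"EMP_ID": 1, "ROLE_CODE": "SE"},
    {"EMP_ID": 2, "ROLE_CODE": "PM"},
    {"EMP_ID": 3, "ROLE_CODE": None},
    {"EMP_ID": 4, "ROLE_CODE": "QA"},
]

routed = route(rows, {"ENGINEERS": is_role("SE"), "MANAGERS": is_role("PM")})
print({name: [r["EMP_ID"] for r in group] for name, group in routed.items()})
```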
Lookup Transformation
 Looks up data in a relational table
– Can be the Source, Target, or any database that the
Informatica Client and Server can connect to
– Lookup table can be a single table or can join multiple
tables
 Lookups can:
– Get a related value (your source includes Employee_ID
and you want Employee_Name),
– Perform a calculation
– Update a slowly changing dimension table (check if
records exist on a target)
Lookup Transformation
 For each input row, the Informatica Server queries
the lookup table based on the lookup ports and
the condition in the transformation
– The Informatica Server can return values from that
lookup (static cache)
OR
– The Informatica Server inserts a row into the cache to
flag rows as new or existing (dynamic cache)
Connected and Unconnected Lookup Transformations

CONNECTED LOOKUP | UNCONNECTED LOOKUP
Part of the mapping data flow | Separate from the mapping data flow
Returns multiple values (by linking output ports to another transformation) | Returns one value (by checking the Return (R) port option for the output port that provides the return value)
Executed for every record passing through the transformation | Only executed when the lookup function is called
More visible, shows where the lookup values are used | Less visible, as the lookup is called from an expression within another transformation
Default values are used | Default values are ignored
Example

EMPLOYEE_PROJECT (source)   NM_EMP_DETAILS (target)
EMP_ID                      EMP_ID
EMP_NAME                    EMP_NAME
PROJECT_CODE                PROJECT_CODE
                            PROJECT_DESC

PROJECT (lookup table)
PROJECT_CODE
PROJECT_DESC

Lookup Transformation: Get PROJECT.PROJECT_DESC where
PROJECT.PROJECT_CODE = NM_EMP_DETAILS.PROJECT_CODE
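The lookup in this example can be approximated in Python with a dictionary that plays the role of a static cache: the PROJECT table is read once, and each input row is then matched on PROJECT_CODE. The sample data, the function name, and the choice to return None when no match is found are assumptions for illustration.

```python
# Hypothetical contents of the PROJECT lookup table.
project_rows = [
    {"PROJECT_CODE": "NML", "PROJECT_DESC": "Hypothetical project NML"},
    {"PROJECT_CODE": "ABC", "PROJECT_DESC": "Hypothetical project ABC"},
]

# Static cache: built once, never modified while rows are processed.
lookup_cache = {row["PROJECT_CODE"]: row["PROJECT_DESC"] for row in project_rows}


def lookup_project_desc(row: dict, cache: dict) -> dict:
    """Return the input row plus PROJECT_DESC (None when the code is unknown)."""
    out = dict(row)
    out["PROJECT_DESC"] = cache.get(row["PROJECT_CODE"])
    return out


# Hypothetical EMPLOYEE_PROJECT rows.
employee_project_rows = [
    {"EMP_ID": 1, "EMP_NAME": "Arun", "PROJECT_CODE": "NML"},
    {"EMP_ID": 2, "EMP_NAME": "Bala", "PROJECT_CODE": "ZZZ"},
]

nm_emp_details = [lookup_project_desc(r, lookup_cache) for r in employee_project_rows]
print(nm_emp_details)
```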
Update Strategy
• Two ways to set an update strategy:
– Within a session: instruct the Informatica Server how to treat
the rows when the session is configured
– Within a mapping: use the Update Strategy transformation to
flag records for insert, delete, update, or reject
Constant for Each Database Operation

Operation   Constant    Numeric Value
Insert      DD_INSERT   0
Update      DD_UPDATE   1
Delete      DD_DELETE   2
Reject      DD_REJECT   3
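To show how these constants might be used to flag rows, here is a small Python sketch. The constant values are taken from the table above; the flagging rule (reject rows without a key, update rows whose key already exists in the target, insert the rest) and all other names are hypothetical.

```python
# Numeric values as listed in the table above.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3


def flag_row(row: dict, existing_ids: set) -> int:
    """Pick an operation for one row (illustrative rule, not a fixed policy)."""
    if row["EMP_ID"] is None:
        return DD_REJECT          # no key, cannot be loaded
    if row["EMP_ID"] in existing_ids:
        return DD_UPDATE          # key already present in the target
    return DD_INSERT              # new key


# Hypothetical incoming rows and keys already in the target.
incoming = [{"EMP_ID": 101}, {"EMP_ID": 102}, {"EMP_ID": None}]
existing_ids = {101}

flags = [flag_row(row, existing_ids) for row in incoming]
print(flags)
```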
Joiner Transformation
• Active Transformation
• Join two flat files
• Join two tables from different databases
• Join a flat file with a relational table
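As a sketch of the last case, joining a flat file with a relational table, the Python below reads delimited text with the csv module and joins it to a table held in an in-memory sqlite3 database. The file contents, table, join column, and the use of an inner join are all assumptions for illustration.

```python
import csv
import io
import sqlite3

# Delimited flat file, held in memory for the sake of the example.
flat_file = io.StringIO(
    "EMP_ID,EMP_NAME,PROJECT_CODE\n"
    "1,Arun,NML\n"
    "2,Bala,XYZ\n"
)

# Relational table in a separate (in-memory) database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PROJECT (PROJECT_CODE TEXT, PROJECT_DESC TEXT)")
conn.execute("INSERT INTO PROJECT VALUES ('NML', 'Hypothetical project NML')")

# Load the relational side once, keyed by the join column.
projects = dict(conn.execute("SELECT PROJECT_CODE, PROJECT_DESC FROM PROJECT"))

# Inner join: file rows without a matching PROJECT_CODE are dropped.
joined = [
    {**row, "PROJECT_DESC": projects[row["PROJECT_CODE"]]}
    for row in csv.DictReader(flat_file)
    if row["PROJECT_CODE"] in projects
]
print(joined)
```

Because the number of output rows can differ from the number of input rows, the join changes the row count, which is consistent with the Joiner being listed as an active transformation.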


Transformation Overview
Three views:
 Iconized View -- shows transformation in relation to mapping
 Normal View -- shows data flow through transformation
 Edit View -- shows transformation properties and allows for
editing
Transformation Overview
Normal view shows data flow through the
transformation. Data passes through I/O ports unchanged.
• DATE_ENTERED passes into the transformation
through an input port.
• It is used in the MTH port to extract the month.
• The month is passed through the MTH output port to
another transformation.
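The port behavior described here can be mimicked in a few lines of Python: an input-only value feeds a calculation, the result leaves through an output port, and input/output values pass through untouched. The ORDER_ID column and the function name are hypothetical additions used to show a pass-through port.

```python
from datetime import date


def transform(row: dict) -> dict:
    """DATE_ENTERED is input-only; MTH is output-only; ORDER_ID passes through."""
    return {
        "ORDER_ID": row["ORDER_ID"],        # I/O port: unchanged
        "MTH": row["DATE_ENTERED"].month,   # output port derived from the input port
    }


result = transform({"ORDER_ID": 7, "DATE_ENTERED": date(2001, 3, 25)})
print(result)
```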
Transformation Overview
Edit view provides flexibility in defining transformation rules
• Define port level handling
• Define transformation level properties
• Switch between transformations
• Enter comments
• Make reusable
Transformations and Expressions
• Calculation or conditional statement
• Used in Expression, Aggregator, Rank, Filter, Update
Strategy
• Performs calculation based on ports, functions,
operators, variables, literals, constants, and return
values from other transformations
Mapplets
• Reusable Object
• Include multiple transformations
• Include Source definitions
• Multiple groups of output ports
Mapping
 Move and transform data from sources to targets
 Includes
 source definitions
 target definitions
 transformations.

Source Transformations Target

Mapping
Mapping Designer

Transformation Toolbar
Mapping List

Iconized

Status Bar
Validation
 Three different levels of validation:

 Connection validation
 Expression validation
 Mapping validation
Connection Validation
Examples of invalid connections:
• Connecting ports with mismatched datatypes
• Connecting output ports to a source
• Connecting a source to anything but a Source
Qualifier or Normalizer
• Connecting an output to an output, or an input
to an input
• Connecting more than one active
transformation to another transformation
• Copying columns to a target definition
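A few of these connection rules can be expressed as simple checks on a pair of ports. The Python sketch below is a simplified model, not the Designer's actual validation logic: the Port class, the object kinds, and the requirement that datatypes match exactly are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Port:
    owner: str       # object that owns the port
    owner_kind: str  # "source", "source_qualifier", "normalizer", "transformation", "target"
    name: str
    direction: str   # "input" or "output"
    datatype: str


def validate_link(from_port: Port, to_port: Port) -> List[str]:
    """Return the list of rule violations for one proposed link."""
    errors = []
    if from_port.direction != "output" or to_port.direction != "input":
        errors.append("links must run from an output port to an input port")
    if from_port.datatype != to_port.datatype:
        errors.append("mismatched datatypes")
    if to_port.owner_kind == "source":
        errors.append("cannot connect an output port to a source")
    if from_port.owner_kind == "source" and to_port.owner_kind not in (
        "source_qualifier",
        "normalizer",
    ):
        errors.append("a source may only connect to a Source Qualifier or Normalizer")
    return errors


src = Port("EMPLOYEE", "source", "EMP_ID", "output", "integer")
sq_in = Port("SQ_EMPLOYEE", "source_qualifier", "EMP_ID", "input", "integer")
tgt_in = Port("EMP_DETAILS", "target", "EMP_NAME", "input", "string")

valid_link_errors = validate_link(src, sq_in)
invalid_link_errors = validate_link(src, tgt_in)
print(valid_link_errors, invalid_link_errors)
```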
Expression Validation
 Parse the current expression, with remote port
searching (references to a port in another
transformation are resolved)
 Parse expression attributes such as filter
condition, lookup condition, SQL Query, etc.
 Parse default values
Mapping Validation
 Mapping validation will take place with menu
commands:
 Mapping | Validate
 Repository | Save
 Mapping validation will:
 Perform connection validation
 Perform expression validation
 Check the mapping flow validation
 Data from Source Qualifier mapped to a target
 Targets are connected to transformations
Informatica Server
• Reads information from the Repository
• Extracts data from the Sources and stores the data in
memory while it applies the transformation rules you
created
• Loads the transformed data into the mapping targets


Transformation Process
Session Metadata (held in the Repository):
• source information
• target information
• mapping
• scheduling
• error handling
• pre- / post-session scripts
• tuning parameters
• output log information
• transformation overrides

[Diagram: Repository (Source Def, Mapping, Target Def, Session); Server Manager; Server; Sources; Targets]
Definitions
• Session - A set of instructions that tells the
Informatica Server how and when to move data from
sources to targets
• Batch - A group of sessions that are run
together
Process
 Configure server
 Create session
 Run session
 Monitor session
 View logs
 Tune session
Server Configuration
Server Variables
• Establish default directories for files and
caches.
• Variables are server specific.
• Allows for easy deployment.
• Can be overridden at session level.
• Changing the variable updates sessions.
• Directories must exist prior to session
launch.
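The idea of server-wide defaults that a session can override may be easier to see in a small Python sketch. The variable names below follow the $PM... style, but they and the directory paths should be read as placeholders, and the resolve_path helper is hypothetical rather than part of any Informatica tool.

```python
def resolve_path(template, server_variables, session_overrides=None):
    """Expand variables in a path, letting session values win over server values."""
    variables = dict(server_variables)
    variables.update(session_overrides or {})
    # Replace longer names first so a short name cannot clobber a longer one
    # that happens to start with the same characters.
    for name in sorted(variables, key=len, reverse=True):
        template = template.replace(name, variables[name])
    return template


# Placeholder server-level defaults.
server_variables = {
    "$PMSessionLogDir": "/opt/infa/SessLogs",
    "$PMBadFileDir": "/opt/infa/BadFiles",
    "$PMCacheDir": "/opt/infa/Cache",
}

default_log = resolve_path("$PMSessionLogDir/s_load_emp.log", server_variables)
overridden_log = resolve_path(
    "$PMSessionLogDir/s_load_emp.log",
    server_variables,
    session_overrides={"$PMSessionLogDir": "/data/project_a/logs"},
)
print(default_log)
print(overridden_log)
```

Because sessions refer to the variable rather than to a literal path, changing the server-level value changes where every non-overridden session writes.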
Server Output
[Diagram: outputs of the Informatica Server: Target Data, Control file, .dat and .idx files, .perf file, .bad file, .log file, Event Log, Error Log, and E-mail]
Source Settings - Session Wizard
Source
• Select source type
• File, Relational, Heterogeneous

Treat rows as:
• Source Type: Relational, File
• Tells server how to treat source rows
• Insert, update, delete, data driven
• Works in conjunction with Target Options

Source Options...
• Source Type: Relational
– Database name
• Source Type: File
– fixed / delimited properties
– file list
– FTP properties

Source Database
• Source Type: Relational
• Database connection
Target Settings - Session Wizard

Target
• Select target type
• File, Relational

Flat File Options:
• File properties
• FTP properties
• Loader properties

Target Options:
• Target Type: Relational
• Tell server how to load target
• Insert, Update, Delete
• Truncate target
• Bulk, Normal, Test

Target Database
• Target Type: Relational
• Database connection
Launching Sessions
• Manual start - Manually launch a session
from within the Server Manager
• Session Schedule - Schedule a session using
business cycle start, stop, and repeat intervals
• Batching - Use batches to run sessions
concurrently or sequentially
• Event based - Configure a session to launch
based upon the appearance of an indicator file
• Command Line - Launch a session from the
command line prompt
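The event-based option amounts to waiting for a file to appear before starting work. The Python sketch below shows that general pattern with a polling loop; the file name, timings, and the launch callback are hypothetical, and this is not how the Informatica Server implements the feature.

```python
import os
import tempfile
import time


def wait_for_indicator(path, poll_seconds=1.0, timeout_seconds=60.0):
    """Poll until the indicator file exists or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(poll_seconds)
    return os.path.exists(path)


def launch_when_ready(path, launch, poll_seconds=1.0, timeout_seconds=60.0):
    """Call launch() once the indicator file appears; return None on timeout."""
    if wait_for_indicator(path, poll_seconds, timeout_seconds):
        return launch()
    return None


# Demo: create the indicator file first, so the wait returns immediately.
with tempfile.TemporaryDirectory() as tmp:
    indicator = os.path.join(tmp, "load_ready.ind")
    open(indicator, "w").close()
    started = launch_when_ready(
        indicator, lambda: "session started", poll_seconds=0.01, timeout_seconds=1.0
    )

print(started)
```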
After the Session Launches...

 Poll/Refresh Session Status


 View Session Details
 View Performance Monitor
 View Logs
 Tune the Session
 Correct Session Problems
Monitor Session

 Select Server Requests | Session Details

 Number of rows loaded/failed

 Read/Write throughput

 Most current Server message

 Audit trail in repository or session log


Monitor Session
Performance Monitor
• Select Server Requests | Session Performance
Details, or open the file <sessionname>.perf
• Need to configure session properties to save the
performance detail counters
• Helps determine where session performance can
be improved
Log Files - Session Wizard

Log Files...
 Log file path and name
 Reject file path and name
 Session log archive options
Connectivity Overview
