
Basics of Informatica

Abstract:
This paper introduces the data warehouse and the ETL tools used to
test it. Several ETL tools on the market, such as Informatica,
DataStage, and SSIS, can be useful when testing the ETL process in a
DW. A data warehouse is a critical business application, and defects
in it can result in business losses that are hard to quantify. Here, we
walk you through some of the basics of one such ETL tool, Informatica.
Introduction:
This is an era of global competition, and ignorance is one of the
greatest threats to modern business. As such, organizations across the
globe rely on IT services for strategic decision-making. A data
warehouse implementation is one such tool that comes to the rescue.
Let us take a look at the basics of Informatica.
What is Informatica?
Informatica has a simple visual interface. You do most of the work by
simply dragging and dropping with your mouse in the Designer. This
graphical approach also makes it very easy to understand what is
going on (it is "self-documenting" in a sense).
Informatica can communicate with all major databases and can
move/transform data between them. It can move huge volumes of
data very efficiently. It can throttle transactions (perform big
updates in small chunks to avoid long locks and a full transaction
log). It can effectively join tables in different databases on
different servers. The tasks are performed by the Informatica Server
(on UNIX or MS Windows). You get a client application called
"Server Manager" to work with the server.
You design your processes in a client application called "Designer".
This is where you specify the source databases and tables, the
targets, and how the data is moved and transformed.
Informatica uses its own database called the "Metadata Repository
Database", or simply the Repository. The Repository stores the data
(rules) needed for data extraction, transformation, loading, and
management. You get a client application, "Repository Manager", to
work with the repository. A data warehouse is mainly focused on the
data, so we need a clear picture of the data types and functions
available in the tool. Here are some of the functions and objects
used to test effectively.

Informatica Functions Data Cleansing

Used to process data during data cleansing.

METAPHONE and SOUNDEX create indexes based on English
pronunciation (two different standards).
Ex:
INSTR
IS_DATE
IS_NUMBER
IS_SPACES
ISNULL
LTRIM
METAPHONE
REPLACECHR
REPLACESTR
RTRIM
SOUNDEX
SUBSTR
TO_CHAR
TO_DATE
TO_DECIMAL
TO_FLOAT
TO_INTEGER
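To illustrate what the pronunciation-based cleansing functions do, here is a minimal Python sketch of the classic Soundex algorithm. Informatica's SOUNDEX follows the same general scheme; this is an illustrative reimplementation (assuming a non-empty alphabetic input), not Informatica's own code:

```python
def soundex(name: str) -> str:
    """Minimal Soundex: first letter plus three digits encoding pronunciation."""
    codes = {**dict.fromkeys("BFPV", "1"), **dict.fromkeys("CGJKQSXZ", "2"),
             **dict.fromkeys("DT", "3"), "L": "4",
             **dict.fromkeys("MN", "5"), "R": "6"}
    name = name.upper()
    result, prev = name[0], codes.get(name[0], "")
    for ch in name[1:]:
        code = codes.get(ch, "")
        if code and code != prev:   # skip repeats of the same sound
            result += code
        if ch not in "HW":          # H and W do not break a run of one code
            prev = code
    return (result + "000")[:4]     # pad with zeros to four characters
```

Because similar-sounding names map to the same code, a tester can use such a function to find near-duplicate records, e.g. "Smith" and "Smyth" both encode to S530.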

Informatica Functions Date

Used to round, truncate, or compare dates; extract one part of a
date; or perform arithmetic on a date.

To pass a string to a date function, first use the TO_DATE
function to convert it to a date/time data type.
Ex:
ADD_TO_DATE
DATE_COMPARE
DATE_DIFF
GET_DATE_PART
LAST_DAY
ROUND (Date)
SET_DATE_PART
TO_CHAR (Date)
TRUNC (Date)
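The behavior of several of these date functions can be mimicked with Python's standard datetime module. This is a hedged sketch of equivalents (the Informatica originals take format strings and part codes such as 'DD' or 'MM'; the mapping here is illustrative):

```python
from datetime import datetime, timedelta
import calendar

# TO_DATE: parse a string into a date/time value before doing date math
d = datetime.strptime("2024-03-15 10:30:00", "%Y-%m-%d %H:%M:%S")

# ADD_TO_DATE(d, 'DD', 10): shift a date by 10 days
shifted = d + timedelta(days=10)

# DATE_DIFF in days between two dates
diff_days = (shifted - d).days

# LAST_DAY: last day of the month containing d
last = d.replace(day=calendar.monthrange(d.year, d.month)[1])

# TRUNC(d, 'MM'): truncate to the first of the month at midnight
trunc_month = d.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
```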

Informatica Functions Numerical and Scientific

Numerical functions are used to perform mathematical operations
on numeric data.
Ex:
ABS
CEIL
CUME
EXP
FLOOR
LN
LOG
MOD
MOVINGAVG
MOVINGSUM
POWER
ROUND
SIGN
SQRT
TRUNC

Scientific functions are used to calculate geometric values of
numeric data.
Ex:
COS
COSH
SIN
SINH
TAN
TANH
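Most of the row-level numeric functions have direct counterparts in Python's math module; the windowed functions (MOVINGAVG, MOVINGSUM) need a little state. A minimal sketch (the exact NULL handling of Informatica's MOVINGAVG for the first rows may differ; here we simply emit nothing until the window is full):

```python
import math
from collections import deque

# Row-level functions: ABS, CEIL, FLOOR, SQRT, MOD, ... map to math/builtins
row_results = (abs(-4), math.ceil(3.2), math.floor(3.7), math.sqrt(16), 10 % 3)

def movingavg(values, n):
    """Average of the last n rows, like MOVINGAVG(port, n) across a rowset."""
    window, out = deque(maxlen=n), []
    for v in values:
        window.append(v)
        if len(window) == n:          # emit only once n rows have been seen
            out.append(sum(window) / n)
    return out
```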

Informatica Functions Special and Test

Special Functions - Used to handle specific conditions within a
session; search for certain values; test conditional statements,
e.g. IIF (condition, value_if_true, value_if_false).
Ex:
ABORT
DECODE
ERROR
IIF
LOOKUP

Test Functions - Used to test whether a lookup result is null and
to validate data.
Ex:
IS_DATE
IS_NUMBER
IS_SPACES
ISNULL
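The two special functions a tester meets most often, IIF and DECODE, behave like an inline if/else and a value-to-value lookup. A hedged Python sketch (the argument order mirrors the Informatica calls, but the `default` keyword here is a convenience of this sketch, not Informatica syntax):

```python
def iif(condition, value_if_true, value_if_false):
    """IIF(condition, a, b): return a when the condition holds, else b."""
    return value_if_true if condition else value_if_false

def decode(value, *pairs, default=None):
    """DECODE(value, search1, result1, search2, result2, ...):
    return the result paired with the first matching search value."""
    mapping = dict(zip(pairs[0::2], pairs[1::2]))
    return mapping.get(value, default)
```

For example, `decode(gender_code, "M", "Male", "F", "Female", default="Unknown")` expands a coded column the same way a tester would expect DECODE to in an expression transformation.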

Variable Ports

Used to simplify complex expressions
o e.g. create and store a depreciation formula to be
referenced more than once

Can be used in another variable port or an output port expression

Local to the transformation (a variable port cannot also be an
input or output port)

Available in the Expression, Aggregator and Rank
transformations
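The idea behind a variable port can be sketched as an intermediate value computed once per row and reused by several outputs. In this hedged Python analogy, `v_annual_dep` plays the variable port and the two `out_` values play output ports (straight-line depreciation is an assumed example formula, matching the depreciation example above):

```python
def depreciation_outputs(cost, salvage, life_years):
    # v_annual_dep acts like a variable port: computed once, reused below
    v_annual_dep = (cost - salvage) / life_years

    # Output ports both reference the variable port instead of
    # repeating the formula
    out_monthly_dep = v_annual_dep / 12
    out_book_value_after_5y = cost - v_annual_dep * 5
    return out_monthly_dep, out_book_value_after_5y
```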

Informatica Data types

NATIVE DATATYPES
o Specific to the source and target database types
o Display in source and target tables within Mapping
Designer

TRANSFORMATION DATATYPES
o Power Center internal data types based on UCS-2
o Display in transformations within Mapping Designer

Transformation data types allow mixing and matching of source
and target database types.

When connecting ports, native and transformation data types
must be compatible (or must be explicitly converted).

Data type Conversions within Power Center

Data can be converted from one data type to another by:
o Passing data between ports with different data types
o Passing data from an expression to a port
o Using transformation functions
o Using transformation arithmetic operators

The only supported conversions are:
o Numeric data types to other numeric data types
o Numeric data types to/from String
o Date/Time to Date or String

For further information, see the Power Center Client Help >
Index > port-to-port data conversion.
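The three supported conversion families above can be sketched in Python using the standard library. This is only an analogy for what the TO_CHAR / TO_DECIMAL / TO_DATE conversion functions do, not Informatica's exact semantics:

```python
from datetime import datetime

# Numeric -> other numeric (e.g. a Decimal port feeding an Integer port)
i = int(123.0)

# Numeric <-> String, like TO_CHAR(number) and TO_DECIMAL(string)
s = str(123.45)
n = float("123.45")

# Date/Time <-> String, like TO_DATE(string, fmt) and TO_CHAR(date, fmt)
d = datetime.strptime("15-03-2024", "%d-%m-%Y")
back = d.strftime("%d-%m-%Y")
```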

Mapping Designer

Transformation Objects Used in This Class

Source Qualifier: reads data from flat file & relational sources

Expression: performs row-level calculations

Filter: drops rows conditionally

Sorter: sorts data

Aggregator: performs aggregate calculations

Joiner: joins heterogeneous sources

Lookup: looks up values and passes them to other objects

Update Strategy: tags rows for insert, update, delete, reject

Router: splits rows conditionally

Sequence Generator: generates unique ID values

Other Transformation Objects

Normalizer: normalizes records from relational or VSAM
sources

Rank: filters the top or bottom range of records

Union: merges data from multiple pipelines into one pipeline

Transaction Control: allows user-defined commits

Stored Procedure: calls a database stored procedure

External Procedure: calls compiled code for each row

Custom: calls compiled code for multiple rows

Midstream XML Parser: reads XML from a database table or
message queue

Midstream XML Generator: writes XML to a database table or
message queue

More Source Qualifiers: read from XML, message queues
and applications
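The way these objects chain into a pipeline can be sketched in plain Python. The sample rows and the Filter condition below are invented for illustration; the point is the Filter, Sorter, and Aggregator stages mirrored as list operations:

```python
from itertools import groupby
from operator import itemgetter

# Rows as they might arrive from a Source Qualifier (sample data, assumed)
rows = [
    {"region": "EAST", "amount": 100},
    {"region": "WEST", "amount": 250},
    {"region": "EAST", "amount": 50},
    {"region": "WEST", "amount": 10},
]

# Filter: drop rows conditionally (here, amounts under 50)
filtered = [r for r in rows if r["amount"] >= 50]

# Sorter: order by the group key (an Aggregator benefits from sorted input)
ordered = sorted(filtered, key=itemgetter("region"))

# Aggregator: one output row per group, like SUM(amount) grouped by region
totals = {region: sum(r["amount"] for r in group)
          for region, group in groupby(ordered, key=itemgetter("region"))}
```

Running the pipeline on the sample rows yields one total per region, which is exactly the row reduction a tester verifies when comparing a staging table against an aggregated target.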

Acronyms:
1. DW: Data Warehouse
2. QA: Quality Assurance
3. ETL: Extraction, Transformation and Loading