
Que: source ---> transformer ---> target
In the source I have only one record.
I want 10 records (the same record repeated) in the target.
Can anyone tell me the logic in the transformer?

Ans: You can do this with the Transformer loop, which is driven by @ITERATION.
Declare a stage variable stgvar = col1.
Set the loop condition to @ITERATION <= 10, so the loop runs for @ITERATION = 1, 2 and so on up to 10.
In the output, map stgvar to the output column; the record is written once per iteration.
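To make the looping idea concrete, here is a minimal Python sketch of what a loop bounded by @ITERATION <= 10 does to a single input record. This is only an illustration outside DataStage; the function name repeat_record and the sample record are hypothetical.

```python
def repeat_record(record, times=10):
    """Emit the same record once per loop iteration (like @ITERATION <= times)."""
    out = []
    iteration = 1  # plays the role of @ITERATION, which starts at 1
    while iteration <= times:
        out.append(dict(record))  # copy of the stage-variable value for this pass
        iteration += 1
    return out


rows = repeat_record({"col1": "A"})
print(len(rows))
```

One input record goes in and the loop limit alone decides how many copies come out.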
########################################
Que: My sequential file has 10 columns, but in my target I want only 3 columns. What can we do?
Ans: 1) Pass only the 3 columns through a Transformer stage.
2) Load through Oracle and select only the 3 columns there.
3) Use a Copy stage and map only the 3 columns.
4) Use a Modify stage.
5) Use the Filter option of the Sequential File stage and specify the UNIX cut command.
Ex: cut -d ',' -f1,5,8
(Assuming columns 1, 5 and 8 are to be loaded.)
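For readers not familiar with cut, the following Python sketch mimics what cut -d ',' -f1,5,8 does to one line of the file. The helper name cut_fields and the sample line are made up for illustration; this is not how the Sequential File stage itself works.

```python
def cut_fields(line, fields=(1, 5, 8), delimiter=","):
    """Keep only the listed 1-based fields of a delimited line."""
    parts = line.rstrip("\n").split(delimiter)
    # Fields beyond the end of the line are simply skipped.
    return delimiter.join(parts[i - 1] for i in fields if i <= len(parts))


print(cut_fields("a,b,c,d,e,f,g,h,i,j"))
```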
##############################################
#################################
Que: Scenario based.
Input:
id   code
---  --------
111  01,02,03
222  04,05
Expected output:
id   code
111  01
111  02
111  03
222  04
222  05
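FYI, the code column here can have any number of delimited values. It's not fixed.
Ans: We have to do a column-to-row conversion, i.e. horizontal pivoting.
Use the Pivot Enterprise stage for this purpose. Because the number of values is not fixed, a Transformer loop with the condition @ITERATION <= DCount(code, ',') and the derivation Field(code, ',', @ITERATION) is an alternative.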
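The expected result of this column-to-row conversion can be sketched in Python as below. It is only an illustration of the transformation, not DataStage code; explode_codes is a hypothetical name.

```python
def explode_codes(rows, delimiter=","):
    """Turn (id, 'v1,v2,...') rows into one (id, value) row per delimited value."""
    out = []
    for row_id, codes in rows:
        for code in codes.split(delimiter):  # any number of values per row
            out.append((row_id, code))
    return out


source = [("111", "01,02,03"), ("222", "04,05")]
for row in explode_codes(source):
    print(*row)
```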
##############################################
#################################
Que: The source has 2 fields like this:
COMPANY  LOCATION
IBM      HYD
TCS      BAN
IBM      CHE
HCL      HYD
TCS      CHE
IBM      BAN
HCL      BAN
HCL      CHE
and so on.

And I want the output like this:
Company  loc  count
TCS      HYD  3
         BAN
         CHE
IBM      HYD
         BAN
         CHE
HCL      HYD
         BAN
         CHE
Ans: source ---> Aggregator (group by and take the count) ---> target

In the Aggregator, set the group key to the company name and add the sub-property that counts non-missing values, which means count(*), so you get output like that.
Follow-up: with that answer, how can we get the count column in the output together with the other columns?
You can use the following design:
Source ---> Aggregator (Company, Count)
                 |
                 |
Source ---> Lookup ---> Sort (Company) ---> Target
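The fork-and-lookup design above can be sketched in Python as follows: one pass computes the count per company (the Aggregator), a second pass attaches that count to every source row (the Lookup), and the result is sorted by company. The function name add_company_count is hypothetical and the data is the sample from the question.

```python
from collections import Counter


def add_company_count(rows):
    """rows: list of (company, location). Returns (company, location, count) sorted by company."""
    counts = Counter(company for company, _ in rows)      # Aggregator: count per company
    enriched = [(c, loc, counts[c]) for c, loc in rows]   # Lookup: attach count to each row
    return sorted(enriched, key=lambda r: r[0])           # Sort on Company (stable)


source = [("IBM", "HYD"), ("TCS", "BAN"), ("IBM", "CHE"), ("HCL", "HYD"),
          ("TCS", "CHE"), ("IBM", "BAN"), ("HCL", "BAN"), ("HCL", "CHE")]
for row in add_company_count(source):
    print(*row)
```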
##############################################
######################
There are three stages for combining data horizontally:

1. Lookup stage
2. Join stage
3. Merge stage

1. Lookup stage: only matched records are passed to the next stage. For example:
Table A (source)      Table B (reference)
---------------------------------------------
EID  ENAME            EID  ESALARY
102  JOHN             103  7000
103  MIKE
---------------------------------------------
Result will be
EID  ENAME  ESALARY
103  MIKE   7000

2. Join stage: not only matched but also unmatched records are passed to the next stage, depending on which type of join we are using. Suppose we use a left outer join:
Table A (source)      Table B (reference)
---------------------------------------------
EID  ENAME            EID  ESALARY
102  JOHN             103  7000
103  MIKE
---------------------------------------------
Result will be
EID  ENAME  ESALARY
102  JOHN
103  MIKE   7000

3. Merge stage: the Merge stage is a processing stage used to perform horizontal combining, like the Join stage and the Lookup stage. The only differences between these stages are in data-size handling and input requirements. With n inputs, the Merge stage can have n-1 reject links.
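The difference between the lookup result and the left outer join result in the two examples above can be reproduced with a small Python sketch. The function names are hypothetical, and the reference table is modelled as a dictionary keyed on EID.

```python
table_a = [(102, "JOHN"), (103, "MIKE")]   # source: EID, ENAME
table_b = {103: 7000}                      # reference: EID -> ESALARY


def lookup_matched_only(source, reference):
    """Keep only source rows that find a match in the reference data."""
    return [(eid, name, reference[eid]) for eid, name in source if eid in reference]


def left_outer_join(source, reference):
    """Keep every source row; unmatched rows get None (NULL) for the salary."""
    return [(eid, name, reference.get(eid)) for eid, name in source]


print(lookup_matched_only(table_a, table_b))
print(left_outer_join(table_a, table_b))
```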
---------------------------------------------
4 types of lookup are available:
1. Normal lookup
2. Sparse lookup
3. Range lookup
4. Caseless lookup

1. Normal lookup: here the secondary (reference) data is stored in memory, and each primary record is checked against the secondary records.
--> It supports N inputs, 1 output and 1 reject.
2. Sparse lookup: this is the best technique to use when memory is insufficient, that is, when the secondary data is much larger than the primary data (ratio 1:100).
--> It supports 2 inputs in the case of databases (if more inputs are needed, we can use a lookup file set).
##############################################
################################
Que: Can anyone give a brief explanation of the conductor node, section leaders and players?
Ans:
When the job is initiated, the primary process (called the conductor) reads the job design, which is a generated Orchestrate shell (osh) script. The conductor also reads the parallel execution configuration file specified by the current setting of the APT_CONFIG_FILE environment variable.
Once the execution nodes are known (from the configuration file), the conductor causes a coordinating process called a section leader to be started on each: by forking a child process if the node is on the same machine as the conductor, or by remote shell execution if the node is on a different machine from the conductor (things are a little more dynamic in a grid configuration, but essentially this is what happens).
Each section leader process is passed the score and executes it on its own node, and is visible as a process running osh. Section leaders' stdout and stderr are redirected to the conductor, which is solely responsible for logging entries from the job.
The score contains a number of Orchestrate operators. Each of these runs in a separate process, called a player (the metaphor clearly is one of an orchestra). Player processes' stdout and stderr are redirected to their parent section leader. Player processes also run the osh executable.
Communication between the conductor, section leaders and player processes in a parallel job is effected via TCP.
##############################################
#############################
Que: Explain the reverse pivot in version 8.0.1 with an example.
Source          Output
---------------------------------
id,name         id,name
1,a             1 a,c,e
2,b             2 b,w
1,c             3 f
3,f
1,e
2,w
Ans:
1. Convert rows to columns with a vertical pivot.
Seq ---> Vertical Pivot ---> Dataset
2. Sort the data on id and set Create Key Change Column = True (to mark the start of each group).
In the Transformer, create 2 stage variables:
stg1 = name
stg2 = If keyChange = 1 Then stg1 Else stg2 : ',' : stg1
Map stg2 to the name field.
Seq ---> Sort ---> Tfm ---> RM (Remove Duplicates) ---> Dataset
3. You can build the logic in the Transformer (using a link sort) and then use a Remove Duplicates stage after that.
Seq ---> Transformer ---> Remove Duplicates ---> Dataset
In the Transformer stage:
Step 1: Do a link sort on ID. It sorts the data in ascending order.
Step 2: Use stage variables S and SID1 with these derivations:
S = If ID = SID1 Then S : ',' : NAME Else NAME
SID1 = ID
The Transformer output is:
ID  NAME
1   A
1   A,C
1   A,C,E
2   B
2   B,W
3   F
In the Remove Duplicates stage:
Key = ID, Duplicate To Retain = Last
Drag the columns to the output dataset.
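The sort, running-concatenation and retain-last steps can be sketched in Python as below, using the sample data from the question. This is an illustration of the logic only; concat_names is a hypothetical name, and the dictionary overwrite stands in for the Remove Duplicates stage keeping the last row per key.

```python
def concat_names(rows):
    """rows: list of (id, name). Returns one (id, 'n1,n2,...') row per id."""
    ordered = sorted(rows, key=lambda r: r[0])   # sort on id (stable, keeps input order)
    running = []
    prev_id = object()                           # sentinel: the first row is always a key change
    acc = ""
    for row_id, name in ordered:
        key_change = row_id != prev_id
        acc = name if key_change else acc + "," + name
        running.append((row_id, acc))            # every row passes out of the transformer
        prev_id = row_id
    last = {}
    for row_id, value in running:
        last[row_id] = value                     # later rows overwrite earlier: retain last
    return list(last.items())


source = [(1, "a"), (2, "b"), (1, "c"), (3, "f"), (1, "e"), (2, "w")]
print(concat_names(source))
```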
##############################################

###############
Que: Convert a single column to multiple columns.
Source
----------------------
ID  Mobile
1   samsung
1   nokia
1   ericsson
2   iphone
2   motorola
3   lava
3   blackberry
3   reliance

Expected output
-------------------------------------
ID  Mobile1  Mobile2     Mobile3
1   samsung  nokia       ericsson
2   iphone   motorola
3   lava     blackberry  reliance

Ans:
1. Convert rows to columns with a vertical pivot.
Seq file ---> Vertical Pivot ---> TFM2 ---> Dataset
In the vertical pivot, specify the pivot column, index and array size.

2. Seq ---> Sort ---> Tfm1 ---> RM ---> Tfm2 ---> out
In the Sort stage, sort on ID and set Create Key Change Column = True.
Logic in Transformer 1:
Create a stage variable Stg1:
Stg1 = If keyChange = 1 Then Mobile Else Stg1 : ',' : Mobile
Map Stg1 to the column Mobile1, so that Mobile1 carries the comma-delimited list.
In the Remove Duplicates stage the key is ID; set Duplicate To Retain = Last.
In Transformer 2, define 3 columns Mobile1, Mobile2 and Mobile3:
in the Mobile1 derivation give Field(InputColumn,",",1),
in the Mobile2 derivation give Field(InputColumn,",",2) and
in the Mobile3 derivation give Field(InputColumn,",",3)
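The last step, splitting the comma-delimited list into Mobile1, Mobile2 and Mobile3, can be sketched in Python. The helper field below is a hypothetical stand-in for the Field function and assumes that a missing occurrence yields an empty string.

```python
def field(text, delimiter, occurrence):
    """Return the 1-based occurrence-th delimited field, or '' if it does not exist."""
    parts = text.split(delimiter)
    return parts[occurrence - 1] if 1 <= occurrence <= len(parts) else ""


def split_mobiles(joined):
    """Split 'a,b,c' into the three output columns Mobile1, Mobile2, Mobile3."""
    return tuple(field(joined, ",", n) for n in (1, 2, 3))


print(split_mobiles("samsung,nokia,ericsson"))
print(split_mobiles("iphone,motorola"))
```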

##############################################
######################
Que: How are UNIX shell scripts used in DataStage?
Ans:
A) Go to Job Properties --> select Before-job subroutine --> ExecSH --> in the input text box give the UNIX command or script file path.
In this case the UNIX command is executed before the job runs.
Go to Job Properties --> select After-job subroutine --> ExecSH --> in the input text box give the UNIX command.
In this case the UNIX command is executed after the job runs.

B) In the Sequential File stage: Filter = UNIX command

C) In job sequences: the Execute Command activity.


##############################################
###################
Que: Input file
Dummy
------
1
Output should be like
Dummy
------
1
2
3
4
5
Ans:
1. Use the Surrogate Key Generator stage. Add the Sequence Initial Value property to the Options group and specify the value with which to initialize the sequence.
2. In a Transformer stage, create one stage variable as (rownum*1).
3. Row Generator --> xfm --> seqfile
In the Row Generator: Number of Records = num (a job parameter).
In the xfm: add a column and put @INROWNUM in its derivation.
Then run the job; at run time you can pass any value. For your example, pass 5 to get the output.
4. Using looping:
Loop condition: @ITERATION <= 5
In the Transformer: Dummy = @ITERATION
##############################################
###############################
Que: I have a file with empid and empname, and I want to load these two fields along with sal into my target.
1) The salary must be the same for all the records.
2) I want to pass the salary at run time.
Ans:
1. Create a job parameter salary so that you can pass the same salary for all records at run time, and map this parameter to the output dataset as a new column.
##############################################
#######################
Que: I have two source tables/files, numbered 1 and 2.
In the target there are three output tables/files, numbered 3, 4 and 5.
To output 4 -> the records which are common to both 1 and 2 should go (INNER JOIN).
To output 3 -> the records which are only in 1 but not in 2 should go (LEFT OUTER JOIN).
To output 5 -> the records which are only in 2 but not in 1 should go (RIGHT OUTER JOIN).
How do I go about this scenario?
Ans:
You can use the Join stage with all three options:
1) Inner join
2) Left outer join
3) Right outer join
For outputs 3 and 5, keep only the rows whose columns from the other side are NULL after the outer join.

#############################################
Que: My input is like below:
empno  sal
------------
ravi   1000
raju   2000
srinu  3000
rao    500
I want the o/p like:
empno  sal
------------
ravi   1000
ravi   1000
raju   2000
raju   2000
raju   2000
raju   2000
srinu  3000
srinu  3000
srinu  3000
srinu  3000
srinu  3000
srinu  3000
rao    500
Can anyone explain how I can achieve this?
Ans: 1. You can apply the "Pad string" function in the Transformer stage.
2. You can divide sal by 1000, multiply by 2, store the result in a stage variable and use it as the loop limit,
i.e. @ITERATION <= Stgvar1
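The second approach can be checked against the expected output with a short Python sketch: the loop limit is sal / 1000 * 2, so 1000 gives 2 rows, 2000 gives 4, 3000 gives 6, and 500 gives a limit of 1 and therefore a single row. The function name repeat_by_salary is hypothetical.

```python
def repeat_by_salary(rows):
    """rows: list of (empno, sal). Repeat each row while iteration <= sal / 1000 * 2."""
    out = []
    for empno, sal in rows:
        limit = sal / 1000 * 2      # the stage variable used as the loop limit
        iteration = 1               # plays the role of @ITERATION
        while iteration <= limit:
            out.append((empno, sal))
            iteration += 1
    return out


source = [("ravi", 1000), ("raju", 2000), ("srinu", 3000), ("rao", 500)]
for row in repeat_by_salary(source):
    print(*row)
```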
##############################################
###########
Que: My input is like
col1  col2  col3  NULL  NULL  col4
Output should be like
col1  col2  col3  col4  NULL  NULL
That is, if any value in the record is NULL it should come last. The remaining values should be populated first, pushing the NULL values to the end.
Ans: Use a Sort stage to sort the row data (we need to sort the values within the row, not the column) and specify Null Position = Last.
seq ---> row generator ---> pivot (columns to rows) ---> sort ---> pivot (rows to columns) ---> out
Pivoted rows (sorted, NULL retained last)     Result after pivoting back
ID  Col1                                      ID  col1  col2  col3  col4
-------------------------------------------------------------------------
1   a                                         1   a     b     c     NULL
1   b                                         2   k     l     m     NULL
1   c
1   NULL

##############################################
###########################
Que: Input:
Dummy
------
a
a
a
O/p:
Dummy
------
3
Ans: 1. Use an Aggregator stage (count function).
2. With the help of a Transformer (using a stage variable).
##############################################
##
Que: DataStage real-time scenario.
The source table has data like this:
name
A
A
B
B
B
C
C
D
but I want the target table like this:
name  count
A     1
A     2
B     1
B     2
B     3
C     1
C     2
D     1
Can anyone please solve this?
Ans: Using a Transformer.
1. The first way uses loop variables in the Transformer.
seq ---> aggregator ---> tfm ---> out
Take the count per name from an Aggregator stage (or from the database) and pass the output to the Transformer.
The loop condition would be:
@ITERATION <= count
and @ITERATION is passed to the output column.

2. The second way uses stage variables, without loop variables, on input sorted by name:
StgV1 = Name
StgV2 = If StgV1 = Temp Then StgV2 + 1 Else 1
Temp = StgV1
In the Transformer output link you can create one more column for the sequence number and use StgV2 in its derivation.
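The stage-variable approach amounts to a running counter that resets whenever the name changes. A minimal Python sketch, assuming the input is already grouped by name as in the question (running_count is a hypothetical name):

```python
def running_count(names):
    """Number each row within its group of consecutive equal names, starting at 1."""
    out = []
    prev = None      # previous row's name (the Temp stage variable)
    count = 0        # running counter (the StgV2 stage variable)
    for name in names:
        count = count + 1 if name == prev else 1
        out.append((name, count))
        prev = name  # updated after the comparison, like stage variables evaluated in order
    return out


for row in running_count(["A", "A", "B", "B", "B", "C", "C", "D"]):
    print(*row)
```

The order matters here just as it does for stage variables: the comparison with the previous name has to happen before the previous name is overwritten.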
##############################################
#######################
Que: I have a source like this:
a,b,c,1,2,3 (all of this in one column)
Required output: {(a,b,c,1),(a,b,c,2),(a,b,c,3)}
Ans:
1. Source -> TR -> RD -> Pivot -> Target. With this order of stages we can get the required output.
Transformer:
We have to concatenate the values by using a loop (we will get a,b,c,1,2,3) and after that split the result into separate fields using the Field function.
o/p is:
c1 c2 c3 c4 c5 c6 c7
1  a
1  a  b
1  a  b  c
1  a  b  c  1
1  a  b  c  1  2
1  a  b  c  1  2  3
(c1 is a dummy column; if we don't want it we can drop it here itself.)
Remove Duplicates:
Set Duplicate To Retain = Last.
o/p is:
c1 c2 c3 c4 c5 c6 c7
1  a  b  c  1  2  3
Pivot:
In the derivation of the c5 column give c5, c6, c7.
Here we are converting columns into rows.
o/p is:
c2 c3 c4 c5
a  b  c  1
a  b  c  2
a  b  c  3
2. We can do it using a Transformer. Take 3 stage variables (s1, s2, s3):
s1 = inputcolumn
s2 = If Alpha(inputcolumn) Then Trim(s3 : ',' : s1) Else Trim(s3 : ',' : inputcolumn)
s3 = If Alpha(inputcolumn) Then s2 Else s3
In the constraint we have to write the condition:
AlNum(s2)
In the derivation we have to map s2 to the output column.

3. Source ---> Transformer ---> Pivot stage ---> Target
In the Transformer, use the Field function in a stage variable, like Field(inputcolumn, ',', 1, 3), and concatenate this value again with each remaining field.

4. Source ---> Pivot ---> Transformer ---> Target
We will get 6 columns from the Pivot stage (rows to columns). Then in the Transformer stage we have to concatenate the input columns like col1:col2:col3: (here we can use stage variables and increment the value by one).

5. Job flow: Transformer (3 output links) ---> Funnel ---> Dataset
In the Transformer, use 3 stage variables:
in sv1: left(Cust_id,1,7)
in sv2: left(Cust_id,1,6)||Left(right(Cust_id,3),1)
in sv3: left(Cust_id,1,6)||Right(Cust_id,1)
Take 3 outputs from the Transformer: map sv1 to the output-1 column CUST_ID,
map sv2 to the output-2 column CUST_ID,
map sv3 to the output-3 column CUST_ID.
Here one input record splits into 3 output records.
Now capture all three output records using the Funnel stage.
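Whichever stage layout is chosen, the target of this scenario is the same: keep the first three values as a fixed prefix and emit one row per remaining value. A Python sketch of that result, with hypothetical names expand and prefix_len:

```python
def expand(value, prefix_len=3, delimiter=","):
    """'a,b,c,1,2,3' -> ['a,b,c,1', 'a,b,c,2', 'a,b,c,3']."""
    parts = value.split(delimiter)
    prefix = parts[:prefix_len]                 # like Field(inputcolumn, ',', 1, 3)
    # One output row per remaining value, each glued to the shared prefix.
    return [delimiter.join(prefix + [tail]) for tail in parts[prefix_len:]]


print(expand("a,b,c,1,2,3"))
```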
##############################################
######################
Que: I am trying to find out the method for obtaining the last non-null value
for each column within a key (or group of keys). Could not find anything by
searching the forum.
Here's a data scenario:
ID

Seq

C3

C4

C5

-------- ------ ------ ------ --1000 2

--- -----------------------------

<NULL>

999

1000 12

<NULL>

1000 13

1000 18

<NULL>

1001 5

<NULL>

1001 8

1001 10

<NULL> <NULL>

111
<NULL>

<NULL>
<NULL>
Y

777
666
<NULL>
555

Basically I have sorted it according to ID and Seq. Now what I want is the ID, the last Seq number and the last non-null values for the rest of the columns.
My desired result is:
ID    Seq  C3  C4  C5
----  ---  --  --  ---
1000  18   O   Y   777
1001  10   A   Y   555

Ans: seq ---> tfm ---> RM ---> out

Stage variables will help here. Sort the data on ID and Seq.
svLastC3 = IF (In.ID=svLastID) THEN NullToValue(In.C3,svLastC3) ELSE NullToValue(In.C3,"")
svLastC4 = IF (In.ID=svLastID) THEN NullToValue(In.C4,svLastC4) ELSE NullToValue(In.C4,"")
svLastC5 = (same pattern), etc.
svLastID = In.ID
Pass all rows out of the Transformer stage. Add a Remove Duplicates stage keyed on ID and keep the last row.
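A Python sketch of the same carry-forward logic is below. The sample rows are made up for illustration (the full input table is not reproduced here), None stands for NULL, and last_non_null is a hypothetical name. The first row of each ID starts from empty strings, as in the NullToValue(In.C3,"") branch.

```python
def last_non_null(rows):
    """rows: (id, seq, c3, c4, c5) sorted by id and seq. Returns one row per id."""
    result = {}
    for row_id, seq, *cols in rows:
        if row_id not in result:
            # New key: start from this row, replacing NULLs with ""
            result[row_id] = [seq] + ["" if c is None else c for c in cols]
        else:
            kept = result[row_id]
            kept[0] = seq                      # always take the latest Seq
            for i, c in enumerate(cols, start=1):
                if c is not None:              # carry the old value forward over NULLs
                    kept[i] = c
    return [(k, *v) for k, v in result.items()]


sample = [  # hypothetical input
    (1000, 2, None, None, 999),
    (1000, 12, "O", None, None),
    (1000, 18, None, "Y", 777),
    (1001, 5, "A", None, None),
    (1001, 10, None, "Y", 555),
]
print(last_non_null(sample))
```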
##############################################
###################
Que: We have a text file as source; it contains only one column. The first row is the job status of the second row, which is the job name; the third row is the job status of the fourth row, which is the job name; and so on.
The output should contain 2 columns, i.e. the job status and the corresponding job name. (It is guaranteed that job status and job name appear in that sequential order.)
Required answer: jobstatus | jobname
Input:
COLUMN
---------
RUN OK
test_seq
WARNING
test_seq1
FAILED
test_seq2
Output:
Jobstatus  Jobname
----------------------
RUN OK     test_seq
WARNING    test_seq1
FAILED     test_seq2
We are using DataStage server jobs (7.5v) on a Windows server.
Ans:

Solutions we had tried:

We split the single source link into two links using Transformer constraints: Mod(@INROWNUM,2) = 0 into one link, which contains the data
test_seq
test_seq1
test_seq2
and Mod(@INROWNUM,2) <> 0 into the second link, which contains
RUN OK
WARNING
FAILED
We are not sure how to combine these two links to get the desired output, i.e.
Output:
Jobstatus  Jobname
----------------------
RUN OK     test_seq
WARNING    test_seq1
FAILED     test_seq2
Can anyone help me with this?

I think a vertical pivot should help.

Let's say that you have 6 rows with data and then the pattern repeats. You need to concatenate the first 6 rows into the 1st record, the next 6 rows into the 2nd record, and so on.
Afterwards you can parse them out and put them in different columns.

You can do it the way you specified, but you would need to add a "key", which could be the row number DIV 2.
Alternatively, you can keep every odd row in a stage variable and output on every even row.

Below is the design for this logic.

Seqfile ---> Transformer ---> output

In the Transformer, declare a stage variable as below:
StgVar = If Mod(@INROWNUM,2)=1 Then InputColumn Else StgVar : '|' : InputColumn
In the constraint give the condition Mod(@INROWNUM,2)=0
In the derivation for the output column pass the stage variable, i.e. StgVar.
Then you will get the required output.
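The odd-row/even-row stage-variable design can be traced with a small Python sketch: odd rows are remembered, and on every even row the remembered status is joined to the job name and written out. pair_rows is a hypothetical name; the input is the sample from the question.

```python
def pair_rows(lines):
    """Combine each odd row (status) with the following even row (job name)."""
    out = []
    stg_var = ""
    for in_row_num, value in enumerate(lines, start=1):   # like @INROWNUM
        if in_row_num % 2 == 1:
            stg_var = value                   # odd row: remember the job status
        else:
            stg_var = stg_var + "|" + value   # even row: append the job name
            out.append(stg_var)               # constraint: output only on even rows
    return out


source = ["RUN OK", "test_seq", "WARNING", "test_seq1", "FAILED", "test_seq2"]
for line in pair_rows(source):
    print(line)
```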
##############################################
################################

Que: The source file has columns like
Name     Company
krish    IBM
pooja    TCS
nandini  WIPRO
krish    IBM
pooja    TCS
If a row is repeated, I want the result like this.
Output:
Name     Company  Count
krish    IBM
pooja    TCS
nandini  WIPRO    1
krish    IBM
pooja    TCS
Ans:

1. seq ---> tfm ---> out
stgvar1 = If company = 'IBM' Then stgvar1 + 1 Else stgvar1 (initial value 0)
stgvar2 = If company = 'TCS' Then stgvar2 + 1 Else stgvar2 (initial value 0)
stgvar3 = If company = 'WIPRO' Then stgvar3 + 1 Else stgvar3 (initial value 0)
In the column derivation:
If company = 'IBM' Then stgvar1 Else If company = 'TCS' Then stgvar2 Else If company = 'WIPRO' Then stgvar3 Else 0
2. seq ---> sort ---> tfm ---> out
First sort on both name and company, and then use stage variables in the Transformer:
curr = name : company
val = If curr <> prev Then 1 Else val + 1
prev = curr
Output columns: name, company, val
##############################################
######################
Que: The scenario is that I have a record 1234"13233434%343434^23232!
1212$23232. In this record all the special characters must be removed. How can we do it in DataStage 8.0.1? Can anyone please answer this? Thanks in advance.
Ans:

##############################################
###########################
Que: I have a source like this:
deptno,sal
1,2000
2,3000
3,4000
1,2300
4,5000
5,1100
I want the targets like this:
target1
1,2000
3,4000
4,5000

target2
2,3000
1,2300
5,1100
without using a Transformer.
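Ans:
1.
                          |-----> tgt1
source ---> transformer --|
                          |-----> tgt2
The expected targets alternate by row position (rows 1, 3, 5 go to target1; rows 2, 4, 6 go to target2), so the constraints are on @INROWNUM rather than on the deptno value. Run the Transformer sequentially so that @INROWNUM follows the file order.
In the Transformer:
TGT1 constraint --> Mod(@INROWNUM,2) = 1
derivation -->
source.deptno - deptno
source.sal - sal

TGT2 constraint --> Mod(@INROWNUM,2) = 0
derivation -->
source.deptno - deptno
source.sal - sal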
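The expected targets in the question alternate by row position, which can be verified with a short Python sketch that routes odd-numbered rows to one list and even-numbered rows to the other. split_alternate is a hypothetical name; the data is the sample from the question.

```python
def split_alternate(rows):
    """Send odd-numbered rows to the first target and even-numbered rows to the second."""
    tgt1, tgt2 = [], []
    for in_row_num, row in enumerate(rows, start=1):   # 1-based row number
        (tgt1 if in_row_num % 2 == 1 else tgt2).append(row)
    return tgt1, tgt2


source = [(1, 2000), (2, 3000), (3, 4000), (1, 2300), (4, 5000), (5, 1100)]
target1, target2 = split_alternate(source)
print(target1)
print(target2)
```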

##############################################
##############################
