

The Aggregator Stage:

The Aggregator stage is a processing stage in DataStage that is used for grouping and summary operations. By
default, the Aggregator stage executes in parallel mode in parallel jobs.
Note: In a parallel environment, the way the data is partitioned before grouping and
summarizing affects the results. If you partition the data using the round-robin method,
records with the same key values will be distributed across different partitions, and that
will give incorrect results.
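
The effect described in the note can be illustrated outside DataStage with a small Python sketch. This is not DataStage code; the helper names, the three-partition setup, and the sample dno values are assumptions made only for illustration. Each partition counts its own rows independently, which is roughly what happens when a grouping operation runs in parallel.

```python
from collections import Counter

def round_robin_partition(rows, n):
    """Deal rows out to n partitions in turn, ignoring the key."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts

def hash_partition(rows, n, key):
    """Send every row with the same key value to the same partition."""
    parts = [[] for _ in range(n)]
    for row in rows:
        # hash() of a small int is the int itself, so this is deterministic here
        parts[hash(row[key]) % n].append(row)
    return parts

def count_per_partition(parts, key):
    """Each partition counts rows per key on its own and emits its own groups."""
    out = []
    for part in parts:
        counts = Counter(row[key] for row in part)
        out.extend(sorted(counts.items()))
    return out

# Hypothetical input: six rows keyed on dno
rows = [{"dno": d} for d in (10, 10, 10, 20, 20, 30)]

rr = count_per_partition(round_robin_partition(rows, 3), "dno")
hp = count_per_partition(hash_partition(rows, 3, "dno"), "dno")

print("round-robin:", rr)  # key 10 shows up in several partitions with partial counts
print("hash:       ", hp)  # each key shows up once with its full count
```

With round-robin partitioning the same key is emitted once per partition with a partial count, while hash partitioning on the grouping key yields one complete count per key.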
Aggregation Method:
The Aggregator stage has two aggregation methods.

1) Hash: Use hash mode for a relatively small number of groups; generally, fewer than about 1000
groups per megabyte of memory.
2) Sort: Sort mode requires the input data set to have been partition sorted with all of the grouping
keys specified as hashing and sorting keys. Unlike the hash aggregator, the sort aggregator requires
presorted data, but only maintains the calculations for the current group in memory.
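
The memory trade-off between the two methods can be sketched in Python. This is a conceptual illustration only, not how DataStage implements either method; the function names, column names, and sample rows are hypothetical. The hash version keeps one running total per group in a dictionary, while the sort version walks presorted input and holds only the current group.

```python
from itertools import groupby
from operator import itemgetter

def hash_aggregate(rows, key, value):
    """Hash-style: one running total per group is held in memory at once."""
    totals = {}
    for row in rows:
        totals[row[key]] = totals.get(row[key], 0) + row[value]
    return totals

def sort_aggregate(sorted_rows, key, value):
    """Sort-style: input must already be sorted on the key; only the
    current group's rows are consumed before its result is emitted."""
    for k, group in groupby(sorted_rows, key=itemgetter(key)):
        yield k, sum(r[value] for r in group)

# Hypothetical rows, deliberately unsorted
rows = [
    {"dept": 20, "amt": 5},
    {"dept": 10, "amt": 1},
    {"dept": 20, "amt": 7},
    {"dept": 10, "amt": 2},
]

hashed = hash_aggregate(rows, "dept", "amt")
presorted = sorted(rows, key=itemgetter("dept"))
streamed = dict(sort_aggregate(presorted, "dept", "amt"))

print(hashed)
print(streamed)
```

Both approaches give the same totals; the difference is that the sort version depends on the input order, which is why presorted data is required.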
Aggregation Data Type:
By default, the output column of an Aggregator stage calculation is of the double data type. If you
want decimal output, add the following property as shown in the figure below.

If you are using a single key column for the grouping keys, then there is no need to sort or hash
partition the incoming data.

AGGREGATOR STAGE AND FILTER STAGE WITH EXAMPLE

If we have data as below:

table_a
dno,name
10,siva
10,ram
10,sam
20,tom
30,emy
20,tiny
40,remo

We need to get the records whose dno occurs multiple times into one target, and the single records, whose dno is not repeated, into another target.

Take the job design as:

Seq.File--------Aggregator--------Filter--------two target Seq.Files

Read and load the data in the sequential file.

In the Aggregator stage, select:
Group = dno
Aggregator type = Count Rows
Count output column = dno_count (user defined)

In the output, drag and drop the required columns. Then click OK.

In the Filter stage:
First where clause: dno_count>1, Output link = 0
Second where clause: dno_count<=1, Output link = 1

Drag and drop the outputs to the two targets. Give the target file names, then compile and run the job. You will get the required data in the targets.

AGGREGATOR STAGE TO FIND NUMBER OF PEOPLE GROUP WISE

We can use the Aggregator stage to find the number of people in each department. For example, suppose we have ten employee records with the columns e_id, e_name, dept_no, where e_id runs from 1 to 10 and the departments are 10, 20, and 30.

Take the job design as below:

Seq.File--------Agg.Stage--------Seq.File

Read and load the data in the source file.

Go to the Aggregator stage and select:
Group = Dept_No
Aggregator type = Count Rows
Count Output Column = Count (this is user determined)

Click OK. Give the target file any name you wish, then compile and run the job.

AGGREGATOR STAGE WITH REAL TIME SCENARIO EXAMPLE

The Aggregator stage works on groups. It is used for calculations and counting. It supports 1 input and 1 output.

Example for the Aggregator stage: the input table to read has the columns e_id, e_name, e_job, e_sal, dept_no, with six employee records (e_id 100 to 600) and salaries ranging from 1200 to 2500.

Here our requirement is to find the maximum salary in each department number. According to this sample data, we have two departments.

Take a Sequential File to read the data and an Aggregator for the calculations, and take a Sequential File to load into the target. That is, we can take it like this:

Seq.File--------Aggregator-----------Seq.File

Read the data in the Seq.File. In the Aggregator stage, in Properties, select Group = DeptNo and select e_sal in Column for Calculation, because we want to calculate the maximum salary for each department group. Select the output file name in the second sequential file. Now compile and run.
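
The logic of the examples above (count rows per group, route groups on the count, and take a maximum per group) can be sketched in plain Python. This is only an illustration of what the stages compute, not DataStage code. The function names are hypothetical, the dno values follow the first example, and the employee rows are made-up values for two departments.

```python
from collections import Counter

def split_by_count(dnos):
    """Mimic Aggregator (Group = dno, Count Rows) followed by a Filter
    with two where clauses: dno_count > 1 and dno_count <= 1."""
    counts = Counter(dnos)
    repeated = {d: c for d, c in counts.items() if c > 1}   # first target
    single = {d: c for d, c in counts.items() if c <= 1}    # second target
    return repeated, single

def max_by_group(rows, key, value):
    """Mimic an Aggregator that keeps the maximum of one column per group."""
    best = {}
    for row in rows:
        k = row[key]
        if k not in best or row[value] > best[k]:
            best[k] = row[value]
    return best

# dno values from the Aggregator and Filter example
dnos = [10, 10, 10, 20, 30, 20, 40]
repeated, single = split_by_count(dnos)
print(repeated)  # groups whose dno occurs more than once
print(single)    # groups whose dno occurs exactly once

# Hypothetical employee rows (not the document's sample data)
employees = [
    {"dept_no": 10, "e_sal": 2000},
    {"dept_no": 20, "e_sal": 1200},
    {"dept_no": 10, "e_sal": 2500},
    {"dept_no": 20, "e_sal": 2300},
]
max_sal = max_by_group(employees, "dept_no", "e_sal")
print(max_sal)
```

Note that, as in the job itself, the counting step outputs only the group key and its count; other columns such as name are not carried through.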