AGGREGATOR TRANSFORMATION

Basics and Advanced Concepts

Transformations

Transformations transform the source data according to the requirements of the target system and help ensure the quality of the data loaded into the target. Transformations are of two types: 1) Active 2) Passive

Overview

The Aggregator transformation allows you to perform aggregate calculations, such as averages and sums. The Aggregator transformation is unlike the Expression transformation: we can use the Aggregator transformation to perform calculations on groups, whereas the Expression transformation permits calculations on a row-by-row basis only.
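The difference between row-by-row and group calculations can be sketched in plain Python. This is only a conceptual illustration, not PowerCenter code; the sample rows and column names are hypothetical.

```python
# Hypothetical sample rows; the column names are illustrative only.
rows = [
    {"region": "EAST", "price": 10.0, "qty": 2},
    {"region": "EAST", "price": 5.0, "qty": 1},
    {"region": "WEST", "price": 4.0, "qty": 3},
]

# Expression-style calculation: one output value per input row.
line_totals = [r["price"] * r["qty"] for r in rows]

# Aggregator-style calculation: one output value per group of rows.
totals_by_region = {}
for r in rows:
    totals_by_region[r["region"]] = (
        totals_by_region.get(r["region"], 0.0) + r["price"] * r["qty"]
    )
```

The first calculation keeps the row count unchanged, while the second returns one result per region.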

Aggregator Transformation

We can use conditional clauses to filter rows, providing more flexibility. After we create a session that includes an Aggregator transformation, we can enable the session option, Incremental Aggregation. While performing incremental aggregation, the PowerCenter Server passes new source data through the mapping and uses historical cache data to perform new aggregation calculations incrementally.
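The idea behind incremental aggregation can be sketched as follows. This is a simplified Python model, assuming the saved state is a per-group count and total; the function name `apply_increment` and the data are hypothetical and do not reflect how PowerCenter stores its cache files.

```python
def apply_increment(cache, new_rows):
    """Fold only the new rows into the saved per-group (count, total) state."""
    for region, amount in new_rows:
        count, total = cache.get(region, (0, 0.0))
        cache[region] = (count + 1, total + amount)
    return cache

# State saved by an earlier run (hypothetical values).
historical_cache = {"EAST": (2, 30.0), "WEST": (1, 12.0)}

# Only the rows that arrived since the last run.
new_rows = [("EAST", 10.0), ("NORTH", 7.0)]

updated = apply_increment(historical_cache, new_rows)

# Averages are derived from the saved state, without rereading old rows.
averages = {region: total / count for region, (count, total) in updated.items()}
```

Only the two new rows are processed; the groups that received no new data keep their earlier results.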

Creating an Aggregator Transformation

To use an Aggregator transformation in a mapping, you add the Aggregator transformation to the mapping, then configure the transformation with an aggregate expression and group by ports, if desired.

To create an Aggregator transformation:

In the Mapping Designer, choose Transformation-Create. Select the Aggregator transformation. Enter a name for the Aggregator, then click Create. Then click Done. The Designer creates the Aggregator transformation. Drag the desired ports to the Aggregator transformation. The Designer creates input/output ports for each port you include.

Double-click the title bar of the transformation to open the Edit Transformations dialog box. Select the Ports tab. Click the group by option for each column you want the Aggregator to use in creating groups. You can optionally enter a default value to replace null groups. Click Add and enter a name and data type for the aggregate expression port. Make the port an output port by clearing Input (I). Click in the right corner of the Expression field to open the Expression Editor. Enter the aggregate expression, click Validate, then click OK. Make sure the expression validates before closing the Expression Editor.

Select the Properties tab. Click OK. Choose Repository-Save to save changes to the mapping.

Components of the Aggregator Transformation

The Aggregator is an active transformation, changing the number of rows in the pipeline. The Aggregator transformation has the following components and options:

Aggregate expression. Entered in an output port. Can include non-aggregate expressions and conditional clauses.
Group by port. Indicates how to create groups. The port can be any input, input/output, output, or variable port. When grouping data, the Aggregator transformation outputs the last row of each group unless otherwise specified.
Sorted input. Use to improve session performance. To use sorted input, you must pass data to the Aggregator transformation sorted by group by port, in ascending or descending order.
Aggregate cache. The PowerCenter Server stores data in the aggregate cache until it completes aggregate calculations. It stores group values in an index cache and row data in the data cache.
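A conditional clause in an aggregate expression restricts which rows contribute to the result. The Python sketch below models that idea only; the helper `conditional_sum` and the sample data are hypothetical and are not part of the transformation language.

```python
# Hypothetical sample rows.
rows = [
    {"region": "EAST", "sales": 100.0},
    {"region": "EAST", "sales": 40.0},
    {"region": "WEST", "sales": 75.0},
]

def conditional_sum(rows, value, condition):
    """Sum value(row) over only the rows where condition(row) is true."""
    return sum(value(r) for r in rows if condition(r))

# Total sales for one region only.
east_total = conditional_sum(
    rows, lambda r: r["sales"], lambda r: r["region"] == "EAST"
)

# Total of the sales amounts above a threshold.
large_total = conditional_sum(
    rows, lambda r: r["sales"], lambda r: r["sales"] > 50
)
```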

Aggregate Caches

When you run a session that uses an Aggregator transformation, the PowerCenter Server creates index and data caches in memory to process the transformation. If the PowerCenter Server requires more space, it stores overflow values in cache files. You can configure the index and data caches in the Aggregator transformation or in the session properties. You do not need to configure cache memory for Aggregator transformations that use sorted ports.

Aggregate Functions

You can use the following aggregate functions within an Aggregator transformation. You can nest one aggregate function within another aggregate function. The transformation language includes the following aggregate functions:

AVG
FIRST
COUNT
LAST
MAX
PERCENTILE
STDEV
SUM
VARIANCE

When you use any of these functions, you must use them in an expression within an Aggregator transformation.
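Nesting one aggregate function within another can be pictured as computing the inner aggregate for each group and then applying the outer aggregate across those results. The Python sketch below shows that reading, assuming grouping by a store identifier; the data is hypothetical, and the way PowerCenter evaluates nested functions internally is not described here.

```python
from collections import Counter

# Hypothetical (store_id, item) rows.
rows = [(101, "battery"), (101, "AAA"), (101, "battery"), (201, "battery")]

# Inner aggregate: COUNT of rows per store.
count_per_store = Counter(store_id for store_id, _ in rows)

# Outer aggregate: MAX over the per-group counts.
max_count = max(count_per_store.values())
```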

Group By Ports

The Aggregator transformation allows you to define groups for aggregations, rather than performing the aggregation across all input data. For example, rather than finding the total company sales, you can find the total sales grouped by region. To define a group for the aggregate expression, select the appropriate input, input/output, output, and variable ports in the Aggregator transformation. You can select multiple group by ports, creating a new group for each unique combination of groups.

The following Aggregator transformation groups first by STORE_ID and then by ITEM:

[Figure: Aggregator transformation with STORE_ID and ITEM selected as group by ports]

If you send data through this Aggregator transformation, the PowerCenter Server performs the aggregate calculation on each unique group, that is, each unique combination of STORE_ID and ITEM. The PowerCenter Server then passes the last row received for each group, along with the results of the aggregation.
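The grouping behavior can be sketched in Python. This is an illustration under stated assumptions: the rows and the QTY and TOTAL_QTY columns are hypothetical sample data, not the data from the original slides, and the aggregate is assumed to be a sum of QTY.

```python
# Hypothetical input rows; only STORE_ID and ITEM come from the text.
rows = [
    {"STORE_ID": 101, "ITEM": "battery", "QTY": 3},
    {"STORE_ID": 101, "ITEM": "battery", "QTY": 1},
    {"STORE_ID": 101, "ITEM": "AAA", "QTY": 2},
    {"STORE_ID": 201, "ITEM": "battery", "QTY": 4},
]

groups = {}
for row in rows:
    # One group per unique combination of the group by ports.
    key = (row["STORE_ID"], row["ITEM"])
    total = groups[key]["TOTAL_QTY"] + row["QTY"] if key in groups else row["QTY"]
    # Keep the last row received for the group, plus the running aggregate.
    groups[key] = {**row, "TOTAL_QTY": total}

output = list(groups.values())
```

Four input rows produce three output rows. For the group (101, "battery"), the output carries QTY from the last row received together with the total across both rows.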

Using Sorted Input

You can improve Aggregator transformation performance by using the sorted input option. When you use sorted input, the PowerCenter Server assumes all data is sorted by group. As the PowerCenter Server reads rows for a group, it performs aggregate calculations. When necessary, it stores group information in memory. To use the Sorted Input option, you must pass sorted data to the Aggregator transformation.

When you do not use sorted input, the PowerCenter Server performs aggregate calculations as it reads. However, since data is not sorted, the PowerCenter Server stores data for each group until it reads the entire source to ensure all aggregate calculations are accurate.
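The memory difference between the two modes can be sketched with Python's standard `itertools.groupby`, which, like sorted input, assumes that rows of the same group arrive together. The function names and data below are hypothetical; this is a conceptual model, not the PowerCenter implementation.

```python
from itertools import groupby
from operator import itemgetter

def aggregate_sorted(rows):
    """Yield (key, total) as soon as each group ends; holds one group at a time."""
    for key, group in groupby(rows, key=itemgetter(0)):
        yield key, sum(amount for _, amount in group)

def aggregate_unsorted(rows):
    """Hold a running total for every group until all rows are read."""
    totals = {}
    for key, amount in rows:
        totals[key] = totals.get(key, 0) + amount
    return totals

sorted_rows = [("A", 1), ("A", 2), ("B", 5)]
unsorted_rows = [("A", 1), ("B", 5), ("A", 2)]

sorted_result = list(aggregate_sorted(sorted_rows))
unsorted_result = aggregate_unsorted(unsorted_rows)

# Feeding unsorted rows to the streaming version splits group "A" in two.
split_result = list(aggregate_sorted(unsorted_rows))
```

The last line shows why sorted input requires data that is actually sorted: the streaming version silently produces two partial results for group "A" when the rows arrive out of order.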

Sorted Input Conditions

Do not use sorted input if either of the following conditions is true:

The aggregate expression uses nested aggregate functions.
The session uses incremental aggregation.

If you use sorted input and do not sort data correctly, the session fails.

Pre-Sorting Data

To use sorted input, you pass sorted data through the Aggregator. Data must be sorted as follows:

By the Aggregator group by ports, in the order they appear in the Aggregator transformation.
Using the same sort order configured for the session. If data is not in strict ascending or descending order based on the session sort order, the PowerCenter Server fails the session.
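A check of this kind can be sketched in Python. The function `check_sorted` is hypothetical and uses Python's default comparison, which only approximates a session sort order; it raises an error on the first out-of-order pair, loosely mirroring a failed session.

```python
def check_sorted(rows, group_by, descending=False):
    """Raise ValueError if rows are not ordered by the group by columns."""
    # Build one comparison key per row, in group by port order.
    keys = [tuple(row[col] for col in group_by) for row in rows]
    for prev, curr in zip(keys, keys[1:]):
        out_of_order = prev < curr if descending else prev > curr
        if out_of_order:
            raise ValueError(f"rows not sorted: {prev} before {curr}")

# Hypothetical rows, sorted by STORE_ID and then by ITEM.
good = [
    {"STORE_ID": 101, "ITEM": "AAA"},
    {"STORE_ID": 101, "ITEM": "battery"},
    {"STORE_ID": 201, "ITEM": "battery"},
]

# The same rows with the last two swapped.
bad = [good[0], good[2], good[1]]

check_sorted(good, ["STORE_ID", "ITEM"])  # passes silently
```

Calling `check_sorted(bad, ["STORE_ID", "ITEM"])` raises `ValueError`, because store 201 appears before a row for store 101.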

[Figure: Mapping with a Sorter transformation configured to sort the source data in descending order by ITEM_NAME]

Optimizing Aggregator Transformations

Aggregator transformations often slow performance because they must group data before processing it. Aggregator transformations need additional memory to hold intermediate group results. You can optimize Aggregator transformations by performing the following tasks:

Group by simple columns.
Use sorted input.
Use incremental aggregation.

Performance Tuning Tips

Group By Simple Columns

You can optimize Aggregator transformations when you group by simple columns. When possible, use numbers instead of strings and dates in the columns used for the GROUP BY. You should also avoid complex expressions in the Aggregator expressions.

Use Sorted Input

You can increase session performance by sorting data and using the Aggregator Sorted Input option. The Sorted Input option decreases the use of aggregate caches. When you use the Sorted Input option, the PowerCenter Server assumes all data is sorted by group. As the PowerCenter Server reads rows for a group, it performs aggregate calculations. When necessary, it stores group information in memory. The Sorted Input option reduces the amount of data cached during the session and improves performance. Use this option with the Source Qualifier Number of Sorted Ports option to pass sorted data to the Aggregator transformation.

Use Incremental Aggregation

If you can capture changes from the source that change less than half the target, you can use Incremental Aggregation to optimize the performance of Aggregator transformations. When using incremental aggregation, you apply captured changes in the source to aggregate calculations in a session. The PowerCenter Server updates your target incrementally, rather than processing the entire source and repeating the same calculations every time you run the session.

Filter Before Aggregating

If you use a Filter transformation in the mapping, place the transformation before the Aggregator transformation to reduce unnecessary aggregation.

Thank You