Aggregator Transformation: Basics and Advanced Concepts
Transformations
Transformations convert source data to meet the requirements of the target system and help ensure the quality of the data loaded into the target. There are two types of transformation: 1) Active 2) Passive
Overview
The Aggregator transformation performs aggregate calculations, such as averages and sums. Unlike the Expression transformation, which performs calculations row by row, the Aggregator transformation can perform calculations on groups of rows.
Aggregator Transformation
You can use conditional clauses in aggregate expressions to filter rows, which provides more flexibility. After you create a session that includes an Aggregator transformation, you can enable the Incremental Aggregation session option. With incremental aggregation, the PowerCenter Server passes new source data through the mapping and uses historical cache data to perform the new aggregation calculations incrementally.
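The two ideas above, a conditional clause and incremental aggregation, can be sketched in Python. This is only an illustration of the concepts, not PowerCenter code: the function names, the REGION and SALES fields, and the sample values are all hypothetical.

```python
from collections import defaultdict


def conditional_sum(rows, value_key, condition):
    """Sum value_key over only the rows for which condition(row) is true."""
    return sum(row[value_key] for row in rows if condition(row))


def incremental_sum(cache, new_rows, group_key, value_key):
    """Fold new rows into a copy of a cache of per-group running totals."""
    updated = defaultdict(float, cache)  # copy, so the saved cache is untouched
    for row in new_rows:
        updated[row[group_key]] += row[value_key]
    return dict(updated)


# Hypothetical data: totals saved by an earlier run, plus the new source rows.
historical_cache = {"EAST": 100.0, "WEST": 50.0}
new_rows = [
    {"REGION": "EAST", "SALES": 25.0},
    {"REGION": "NORTH", "SALES": 10.0},
]

# Only the new rows are read; earlier rows are represented by the cache.
totals = incremental_sum(historical_cache, new_rows, "REGION", "SALES")

# Conditional aggregate: include a row only when SALES exceeds 20.
large_sales = conditional_sum(new_rows, "SALES", lambda r: r["SALES"] > 20.0)
```

The point of the incremental version is that the earlier source rows never have to be reread: their contribution is already held in the cached totals.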
To use an Aggregator transformation in a mapping, add the Aggregator transformation to the mapping, then configure it with an aggregate expression and, if desired, group by ports.
To create an Aggregator transformation:
1. In the Mapping Designer, choose Transformation-Create. Select the Aggregator transformation.
2. Enter a name for the Aggregator, click Create, then click Done. The Designer creates the Aggregator transformation.
3. Drag the desired ports to the Aggregator transformation. The Designer creates input/output ports for each port you include.
4. Select the Properties tab.
5. Click OK.
6. Choose Repository-Save to save changes to the mapping.
Aggregate Caches
When you run a session that uses an Aggregator transformation, the PowerCenter Server creates index and data caches in memory to process the transformation. If the PowerCenter Server requires more space, it stores overflow values in cache files. You can configure the index and data caches in the Aggregator transformation or in the session properties. You do not need to configure cache memory for Aggregator transformations that use sorted ports.
Aggregate Functions
The transformation language includes the following aggregate functions, which you can use within an Aggregator transformation:
AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, VARIANCE
You can nest one aggregate function within another aggregate function. When you use any of these functions, you must use them in an expression within an Aggregator transformation.
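As a rough illustration of nesting, the Python sketch below mirrors one reading of an expression such as MAX(COUNT(ITEM)) grouped by STORE_ID: the inner function is evaluated for each group, and the outer function is then applied across the per-group results. The rows are made up for the example, and the sketch is not PowerCenter syntax.

```python
from collections import Counter

# Hypothetical input rows: (STORE_ID, ITEM)
rows = [
    (101, "battery"),
    (101, "battery"),
    (101, "AAA"),
    (201, "battery"),
]

# Inner aggregate: COUNT(ITEM) for each STORE_ID group.
count_per_store = Counter(store_id for store_id, _item in rows)

# Outer aggregate: MAX over the per-group counts, giving a single value.
max_count = max(count_per_store.values())
```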
Group By Ports
The Aggregator transformation allows you to define groups for aggregations, rather than performing the aggregation across all input data. For example, rather than finding the total company sales, you can find the total sales grouped by region. To define a group for the aggregate expression, select the appropriate input, input/output, output, and variable ports in the Aggregator transformation. You can select multiple group by ports, which creates a new group for each unique combination of values in those ports.
The following Aggregator transformation groups first by STORE_ID and then by ITEM:
If you send the following data through this Aggregator transformation:
The PowerCenter Server performs the aggregate calculation on the following unique groups:
The PowerCenter Server then passes the last row received, along with the results of the aggregation, as follows:
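A minimal Python sketch of this behavior, grouping by STORE_ID and then ITEM and returning the last row received for each group together with the aggregate result. The rows, the QTY and PRICE fields, the SUM(QTY * PRICE) expression, and the SALES_PER_STORE output name are all assumptions chosen for illustration; they are not the data from the original example.

```python
def aggregate_by_store_and_item(rows):
    """Return one output row per unique (STORE_ID, ITEM) combination."""
    last_row = {}  # most recent input row seen for each group
    totals = {}    # running SUM(QTY * PRICE) for each group
    for row in rows:
        key = (row["STORE_ID"], row["ITEM"])
        last_row[key] = row
        totals[key] = totals.get(key, 0.0) + row["QTY"] * row["PRICE"]
    # Each output row is the group's last input row plus the aggregate value.
    return [{**last_row[key], "SALES_PER_STORE": totals[key]} for key in last_row]


# Hypothetical input rows.
rows = [
    {"STORE_ID": 101, "ITEM": "battery", "QTY": 3, "PRICE": 2.0},
    {"STORE_ID": 101, "ITEM": "battery", "QTY": 1, "PRICE": 3.0},
    {"STORE_ID": 101, "ITEM": "AAA", "QTY": 2, "PRICE": 2.5},
    {"STORE_ID": 201, "ITEM": "battery", "QTY": 4, "PRICE": 1.5},
]

result = aggregate_by_store_and_item(rows)
```

With these rows there are three unique groups, so three rows come out. The QTY and PRICE values in each output row are those of the last input row in the group, not a summary of the group.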
Sorted Input
You can improve Aggregator transformation performance by using the Sorted Input option. When you use sorted input, the PowerCenter Server assumes all data is sorted by group. As the PowerCenter Server reads rows for a group, it performs aggregate calculations, storing group information in memory when necessary. To use the Sorted Input option, you must pass sorted data to the Aggregator transformation. When you do not use sorted input, the PowerCenter Server also performs aggregate calculations as it reads. However, because the data is not sorted, it must store data for each group until it has read the entire source, to ensure that all aggregate calculations are accurate.
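The difference between the two modes can be sketched in Python. The sorted version holds only one group at a time and can emit a result as soon as the group key changes; the unsorted version has to keep a running value for every group until all rows are read. The function names and the REGION/SALES rows are hypothetical, and this is a conceptual sketch rather than a description of PowerCenter internals.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_sorted(rows):
    """Stream over rows already sorted by REGION, one group at a time."""
    for region, group in groupby(rows, key=itemgetter("REGION")):
        # The group is complete once the key changes, so emit it right away.
        yield region, sum(row["SALES"] for row in group)


def aggregate_unsorted(rows):
    """Keep a running total for every group until the whole input is read."""
    totals = {}
    for row in rows:
        totals[row["REGION"]] = totals.get(row["REGION"], 0) + row["SALES"]
    return totals


# Hypothetical rows, first unsorted and then sorted by the group key.
unsorted_rows = [
    {"REGION": "WEST", "SALES": 5},
    {"REGION": "EAST", "SALES": 7},
    {"REGION": "WEST", "SALES": 3},
]
sorted_rows = sorted(unsorted_rows, key=itemgetter("REGION"))

sorted_totals = dict(aggregate_sorted(sorted_rows))
unsorted_totals = aggregate_unsorted(unsorted_rows)
```

Both approaches give the same totals; the sorted version simply needs far less state when there are many groups.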
Pre-Sorting Data
To use sorted input, you pass sorted data through the Aggregator. Data must be sorted as follows:
- By the Aggregator group by ports, in the order they appear in the Aggregator transformation.
- Using the same sort order configured for the session.
If data is not in strict ascending or descending order based on the session sort order, the PowerCenter Server fails the session.
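The ordering requirement can be pictured as a check like the Python sketch below, which raises an error at the first row that breaks the order, loosely analogous to the session failing. The function name is hypothetical, and the comparison uses Python's default ordering as a stand-in for the session sort order, which is an assumption. Rows that share the same group key are treated as in order.

```python
def check_sorted(rows, group_by_ports, descending=False):
    """Raise ValueError if rows are not ordered by the group by ports."""
    previous = None
    for index, row in enumerate(rows):
        # Build the key in the order the group by ports are listed.
        key = tuple(row[port] for port in group_by_ports)
        if previous is not None:
            out_of_order = key > previous if descending else key < previous
            if out_of_order:
                raise ValueError(f"row {index} breaks the sort order: {key}")
        previous = key


# Hypothetical rows sorted by STORE_ID, then ITEM: the check passes silently.
check_sorted(
    [
        {"STORE_ID": 101, "ITEM": "AAA"},
        {"STORE_ID": 101, "ITEM": "battery"},
        {"STORE_ID": 201, "ITEM": "AAA"},
    ],
    ["STORE_ID", "ITEM"],
)
```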
Mapping with a Sorter transformation configured to sort the source data in descending order by ITEM_NAME
Aggregator transformations often slow performance because they must group data before processing it. Aggregator transformations need additional memory to hold intermediate group results. You can optimize Aggregator transformations by performing the following tasks:
- Group by simple columns.
- Use sorted input.
- Use incremental aggregation.