Every mapping contains one or more source pipelines. A source pipeline consists of a source qualifier and all the transformations and targets that receive data from that source qualifier.
If you use PowerCenter, you can specify partitioning information for each source pipeline in a mapping. If you use PowerMart, you must accept the default partitioning information. The partitioning information for a pipeline controls the following factors:
• The number of reader, transformation, and writer threads that the master thread creates for the pipeline. For more information, see Understanding Processing Threads.
• How the Informatica Server reads data from the source, including the number of connections to the source.
• How the Informatica Server distributes rows of data to each transformation as it processes the pipeline.
• How the Informatica Server writes data to the target, including the number of connections to each target in the pipeline.

You can specify partitioning information for a pipeline by setting the following attributes:
• Location of partition points. Partition points mark the thread boundaries in a pipeline and divide the pipeline into stages. The Informatica Server sets partition points at several transformations in a pipeline by default. If you use PowerCenter, you can define other partition points. When you add partition points, you increase the number of transformation threads, which can improve session performance. The Informatica Server can redistribute rows of data at partition points, which can also improve session performance. For more information on partition points, see Partition Points.
• Number of partitions. A partition is a pipeline stage that executes in a single thread. If you use PowerCenter, you can set the number of partitions at any partition point. If you use PowerMart, the Informatica Server defines one partition for the pipeline. When you add partitions, you increase the number of processing threads, which can improve session performance. For more information, see Number of Partitions.
• Partition types. The Informatica Server specifies a default partition type at each partition point. If you use PowerCenter, you can change the partition type. The partition type controls how the Informatica Server redistributes data among partitions at partition points. For more information, see Partition Types.

Partition Points

By default, the Informatica Server sets partition points at various transformations in the pipeline. Partition points mark thread boundaries as well as divide the pipeline into stages. A stage is a section of a pipeline between any two partition points. When you set a partition point at a transformation, the new pipeline stage includes that transformation. For more information, see Understanding Processing Threads. Table 10-1 lists the partition points that the Workflow Manager creates by default.

The mapping in Figure 10-1 contains four stages. The partition point at the source qualifier marks the boundary between the first (reader) and second (transformation) stages. The partition point at the Aggregator transformation marks the boundary between the second and third (transformation) stages. The partition point at the target instance marks the boundary between the third (transformation) and fourth (writer) stages.

When you add a partition point, you increase the number of pipeline stages by one. Similarly, when you delete a partition point, you reduce the number of stages by one. For more information on adding and deleting partition points, see Adding and Deleting Partition Points.

Besides marking stage boundaries, partition points also mark the points in the pipeline where the Informatica Server can redistribute data across partitions. For example, if you place a partition point at a Filter transformation and define multiple partitions, the Informatica Server can redistribute rows of data among the partitions before the Filter transformation processes the data. The partition type you set at this partition point controls the way in which the Informatica Server passes rows of data to each partition. For more information, see Partition Types.
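The stage arithmetic for partition points can be sketched in a few lines of Python. This is only an illustration, not Informatica code: the function names `count_stages` and `estimate_threads` are hypothetical, and the thread estimate assumes one thread per stage per partition, which may simplify how the master thread actually allocates threads.

```python
def count_stages(partition_points):
    """Partition points are stage boundaries, so N points yield N + 1 stages."""
    return len(partition_points) + 1


def estimate_threads(partition_points, partitions=1):
    """Rough estimate, ASSUMING one thread per stage per partition."""
    return count_stages(partition_points) * partitions


# Default partition points of the mapping described for Figure 10-1.
default_points = ["Source Qualifier", "Aggregator", "Target"]

# The same pipeline with an extra partition point at a Filter transformation.
with_filter = ["Source Qualifier", "Filter", "Aggregator", "Target"]

print(count_stages(default_points))   # 4
print(count_stages(with_filter))      # 5
print(estimate_threads(default_points, partitions=3))  # 12
```

Adding the Filter partition point raises the stage count from four to five, which matches the rule that each added partition point adds one pipeline stage.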
Number of Partitions

A partition is a pipeline stage that executes in a single reader, transformation, or writer thread. By default, the Informatica Server defines a single partition in the source pipeline. If you use PowerCenter, you can increase the number of partitions. This increases the number of processing threads, which can improve session performance.

For example, you need to use the mapping in Figure 10-1 to extract data from three flat files of various sizes. To do this, you define three partitions at the source qualifier to read the data simultaneously. When you do this, the Workflow Manager defines three partitions in the pipeline. Figure 10-2 shows the threads that the master thread creates for this mapping.

By default, the Informatica Server sets the number of partitions to one. If you use PowerMart, you cannot change the number of partitions. If you use PowerCenter, you can generally define up to 16 partitions at any partition point. However, there are situations in which you can define only one partition in the pipeline. For more information, see Restrictions on the Number of Partitions.

If the server machine contains ample CPU bandwidth, processing rows of data in a session concurrently can increase session performance. However, if you create a large number of partitions or partition points in a session that processes large amounts of data, you can overload the system.

Note: Increasing the number of partitions or partition points increases the number of threads. Therefore, increasing the number of partitions or partition points also increases the load on the server machine.

For more information on adding and deleting partitions, see Adding and Deleting Partitions.

Partition Types

When you configure the partitioning information for a pipeline, you must specify a partition type at each partition point in the pipeline. The partition type determines how the Informatica Server redistributes data across partition points.

The Workflow Manager allows you to specify the following partition types:
• Round-robin partitioning. The Informatica Server distributes data evenly among all partitions. Use round-robin partitioning where you want each partition to process approximately the same number of rows. For more information, see Round-Robin Partitioning.
• Hash partitioning. The Informatica Server applies a hash function to a partition key to group data among partitions. If you select hash auto-keys, the Informatica Server uses all grouped or sorted ports as the partition key. If you select hash user keys, you specify a number of ports to form the partition key. Use hash partitioning where you want to ensure that the Informatica Server processes groups of rows with the same partition key in the same partition. For more information, see Hash Partitioning.
• Key range partitioning. You specify one or more ports to form a compound partition key. The Informatica Server passes data to each partition depending on the ranges you specify for each port. Use key range partitioning where the sources or targets in the pipeline are partitioned by key range. For more information, see Key Range Partitioning.
• Pass-through partitioning. The Informatica Server passes all rows at one partition point to the next partition point without redistributing them. Choose pass-through partitioning where you want to create an additional pipeline stage to improve performance, but do not want to change the distribution of data across partitions. For more information, see Pass-through Partitioning.

You can specify different partition types at different points in the pipeline. For example, the mapping in Figure 10-3 reads data about items and calculates average wholesale costs and prices. The mapping must read item information from three flat files of various sizes, and then filter out discontinued items. It sorts the active items by description, calculates the average prices and wholesale costs, and writes the results to a relational database in which the target tables are partitioned by key range.

When you use this mapping in a session, you can increase session performance by specifying different partition types at the following partition points in the pipeline:
• Source qualifier. To read data from the three flat files concurrently, you must specify three partitions at the source qualifier. Accept the default partition type, pass-through.
• Filter transformation. Since the source files vary in size, each partition processes a different amount of data. Set a partition point at the Filter transformation, and choose round-robin partitioning to balance the load going into the Filter transformation.
• Sorter transformation. To eliminate overlapping groups in the Sorter and Aggregator transformations, use hash auto-keys partitioning at the Sorter transformation. This causes the Informatica Server to group all items with the same description into the same partition before the Sorter and Aggregator transformations process the rows. You can delete the default partition point at the Aggregator transformation.
• Target. Since the target tables are partitioned by key range, specify key range partitioning at the target to optimize writing data to the target.
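The four partition types can be compared with a small Python simulation. This is a conceptual sketch only: the function names, the CRC32 hash, the range boundaries, and the sample item rows are all assumptions made for illustration, and the Informatica Server's actual hash function and row routing are not described in this section.

```python
import zlib
from bisect import bisect_right


def round_robin(rows, n):
    """Deal rows out in turn so each partition gets about the same count."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def hash_partition(rows, n, key):
    """Send rows with the same key value to the same partition.

    CRC32 is a stand-in hash chosen for this sketch (an assumption).
    """
    parts = [[] for _ in range(n)]
    for row in rows:
        digest = zlib.crc32(str(row[key]).encode("utf-8"))
        parts[digest % n].append(row)
    return parts


def key_range(rows, key, boundaries):
    """Route rows by key range; `boundaries` is a sorted list of split values.

    Partition 0 gets keys below boundaries[0], the last partition gets keys
    at or above boundaries[-1].
    """
    parts = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        parts[bisect_right(boundaries, row[key])].append(row)
    return parts


def pass_through(parts):
    """Hand the existing partitions on without redistributing any rows."""
    return parts


# Hypothetical item rows, loosely modeled on the item mapping in the text.
items = [
    {"item_id": 5, "description": "lamp"},
    {"item_id": 120, "description": "desk"},
    {"item_id": 130, "description": "lamp"},
    {"item_id": 250, "description": "desk"},
    {"item_id": 260, "description": "chair"},
]

balanced = round_robin(items, 3)
grouped = hash_partition(items, 3, "description")
ranged = key_range(items, "item_id", [100, 200])

print([len(p) for p in balanced])  # [2, 2, 1]
print([len(p) for p in ranged])    # [1, 2, 2]
```

In this sketch, round-robin balances row counts regardless of content, hash partitioning keeps every "lamp" row together (the property the Sorter and Aggregator example relies on), and key range partitioning mirrors how a range-partitioned target table would receive rows.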