
Decode stage

http://pic.dhe.ibm.com/infocenter/iisinfsv/v8r7/advanced/print.jsp?topic=...

Contents
1. Decode stage: fast path
2. Decode stage: Stage page
2.1. Decode stage: Properties tab
2.1.1. Decode stage: Options category
2.2. Decode stage: Advanced tab
3. Decode stage: Input page
3.1. Decode stage: Partitioning tab
4. Decode stage: Output page

IBM InfoSphere DataStage, Version 8.7.0

Decode stage
The Decode stage is a processing stage. It decodes a data set using a UNIX decoding command, such as gzip, that you supply, converting a stream of raw binary data into a data set. Its companion stage, Encode, converts a data set from a sequence of records into a stream of raw binary data (see Encode Stage). Because the input is always a single stream, you do not have to define metadata for the input link.

The stage editor has three pages:
Stage Page. This is always present and is used to specify general information about the stage.
Input Page. This is where you specify details about the single input link that carries the data to be decoded.
Output Page. This is where you specify details about the processed data being output from the stage.


9/18/2013 5:42 PM


Release date: 2011-10-01
PDF version of this information: IBM InfoSphere DataStage and QualityStage Parallel Job Developer's Guide

1. Decode stage: fast path


This section specifies the minimum steps to take to get a Decode stage functioning.

About this task


InfoSphere DataStage has many defaults, which means that it can be very easy to include Decode stages in a job. InfoSphere DataStage provides a versatile user interface, and there are many shortcuts to achieving a particular end. This section describes the basic method; you will learn where the shortcuts are as you become familiar with the product.

Procedure
On the Stage page Properties tab, specify the UNIX command that will be used to decode the data, together with any required arguments. The command should expect its input from STDIN and send its output to STDOUT.
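As a rough illustration of the STDIN/STDOUT contract described above, the following Python sketch pipes encoded bytes into an external command and captures whatever the command writes to standard output. It is not part of InfoSphere DataStage. The helper name `run_decode_command` is hypothetical, and Python's own `gzip` module, run as a command, is used as a stand-in for a UNIX decoder such as gzip so that the sketch does not depend on a particular binary being installed.

```python
import gzip
import subprocess
import sys


def run_decode_command(command, encoded_bytes):
    """Feed encoded_bytes to the command's STDIN and return its STDOUT.

    `command` is an argument list. This helper is hypothetical and only
    mimics the contract that the decoding command is expected to honor.
    """
    completed = subprocess.run(
        command,
        input=encoded_bytes,     # sent to the command's standard input
        stdout=subprocess.PIPE,  # decoded bytes are read from standard output
        stderr=subprocess.PIPE,
        check=True,              # raise CalledProcessError on a non-zero exit
    )
    return completed.stdout


# Stand-in decoder (an assumption for this sketch): Python's gzip module run
# as a command, reading from STDIN and writing to STDOUT.
STAND_IN_DECODER = [sys.executable, "-m", "gzip", "-d"]

if __name__ == "__main__":
    raw = b"id,name\n1,alpha\n2,beta\n"
    encoded = gzip.compress(raw)
    decoded = run_decode_command(STAND_IN_DECODER, encoded)
    print(decoded == raw)
```

Where a gzip binary is on the search path, the argument list `["gzip", "-d", "-c"]` should play the same role as the stand-in decoder.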


2. Decode stage: Stage page


The General tab allows you to specify an optional description of the stage. The Properties tab lets you specify what the stage does. The Advanced tab allows you to specify how the stage executes.


2.1. Decode stage: Properties tab


The Properties tab allows you to specify properties that determine what the stage actually does. This stage has only one property, and you must supply a value for it. The property appears in the warning color (red by default) until you supply a value.

Table 1. Properties

Category/Property     Values        Default  Mandatory?  Repeats?  Dependent of
Options/Command Line  Command Line  N/A      Y           N         N/A


2.1.1. Decode stage: Options category


Command line
Specifies the command line used for decoding the data set. The command line must configure the UNIX command to accept input from standard input and write its results to standard output. The command must be located in the search path of your application and be accessible by every processing node on which the Decode stage executes.
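The search-path requirement can be checked ahead of time on a given machine. The Python sketch below is one possible way to do that, assuming the command line is a simple program name followed by arguments. The function name `find_decode_command` and the sample command lines are hypothetical. The check covers only the machine it runs on, so it would have to be repeated on each processing node to say anything about that node.

```python
import shlex
import shutil


def find_decode_command(command_line):
    """Return the resolved path of the program named in command_line, or None.

    Only the first token (the program name) is looked up on the search path;
    the arguments that follow it are ignored.
    """
    tokens = shlex.split(command_line)
    if not tokens:
        return None
    return shutil.which(tokens[0])


if __name__ == "__main__":
    # Hypothetical command lines used only to exercise the check.
    for candidate in ("gzip -d -c", "no-such-decoder-xyz --decode"):
        location = find_decode_command(candidate)
        print(candidate, "->", location if location else "not found on PATH")
```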


2.2. Decode stage: Advanced tab


This tab allows you to specify the following:
Execution Mode. The stage can execute in parallel mode or sequential mode. In parallel mode the input data is processed by the available nodes as specified in the Configuration file, and by any node constraints specified on the Advanced tab. In sequential mode the entire data set is processed by the conductor node.
Combinability mode. This is Auto by default, which allows InfoSphere DataStage to combine the operators that underlie parallel stages so that they run in the same process if it is sensible for this type of stage.
Preserve partitioning. This is Propagate by default. It adopts Set or Clear from the previous stage. You can explicitly select Set or Clear. Select Set to request that the next stage in the job should attempt to maintain the partitioning.
Node pool and resource constraints. Select this option to constrain parallel execution to the node pool or pools or resource pool or pools specified in the grid. The grid allows you to make choices from drop-down lists populated from the Configuration file.
Node map constraint. Select this option to constrain parallel execution to the nodes in a defined node map. You can define a node map by typing node numbers into the text box or by clicking the browse button to open the Available Nodes dialog box and selecting nodes from there. You are effectively defining a new node pool for this stage (in addition to any node pools defined in the Configuration file).
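The Preserve partitioning rule above can be read as a small lookup: an explicit Set or Clear wins, and Propagate takes whatever the previous stage ended up with. The Python sketch below models only that reading. The function name and the string values are illustrative and do not correspond to any InfoSphere DataStage API.

```python
VALID_SETTINGS = ("Propagate", "Set", "Clear")
VALID_FLAGS = ("Set", "Clear")


def effective_preserve_partitioning(stage_setting, previous_stage_flag):
    """Resolve the Preserve partitioning flag for one stage.

    stage_setting: the value chosen on the Advanced tab.
    previous_stage_flag: the flag as resolved for the previous stage.
    """
    if stage_setting not in VALID_SETTINGS:
        raise ValueError(f"unknown setting: {stage_setting!r}")
    if previous_stage_flag not in VALID_FLAGS:
        raise ValueError(f"unknown flag: {previous_stage_flag!r}")
    if stage_setting == "Propagate":
        # Propagate adopts Set or Clear from the previous stage.
        return previous_stage_flag
    # An explicit Set or Clear is used as selected.
    return stage_setting
```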


3. Decode stage: Input page


The Input page allows you to specify details about the incoming data set. The Decode stage expects a single incoming data set. The General tab allows you to specify an optional description of the input link. The Partitioning tab allows you to specify how incoming data is partitioned before being decoded. The Columns tab specifies the column definitions of incoming data. The Advanced tab allows you to change the default buffering settings for the input link. Details about Decode stage partitioning are given in the following section. See "Stage Editors" for a general description of the other tabs.


3.1. Decode stage: Partitioning tab


The Partitioning tab allows you to specify details about how the incoming data is partitioned or collected before it is decoded. It also allows you to specify that the data should be sorted before being operated on.
The Decode stage partitions in Same mode and this cannot be overridden.
If the Decode stage is set to execute in sequential mode, but the preceding stage is executing in parallel, then you can set a collection method from the Collector type drop-down list. This will override the default collection method.
The following collection methods are available:
(Auto). This is the default collection method for Decode stages. Normally, when you are using Auto mode, InfoSphere DataStage will eagerly read any row from any input partition as it becomes available.
Ordered. Reads all records from the first partition, then all records from the second partition, and so on.
Round Robin. Reads a record from the first input partition, then from the second partition, and so on. After reaching the last partition, the operator starts over.
Sort Merge. Reads records in an order based on one or more columns of the record. This requires you to select a collecting key column from the Available list.
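The three deterministic collection methods can be pictured with a small Python sketch that treats each input partition as a list of records. This is an illustration of the reading order only, not how InfoSphere DataStage implements collection. The function names are hypothetical, and two points are assumptions: Round Robin here skips partitions that have run out of records, and Sort Merge assumes each partition is already sorted on the collecting key. Auto is left out because its order depends on when rows become available.

```python
import heapq
from itertools import chain, zip_longest


def collect_ordered(partitions):
    """All records from the first partition, then the second, and so on."""
    return list(chain.from_iterable(partitions))


def collect_round_robin(partitions):
    """One record from each partition in turn, starting over after the last."""
    exhausted = object()  # sentinel marking a partition with no records left
    collected = []
    for row in zip_longest(*partitions, fillvalue=exhausted):
        collected.extend(record for record in row if record is not exhausted)
    return collected


def collect_sort_merge(partitions, key):
    """Merge partitions that are each already sorted on the collecting key."""
    return list(heapq.merge(*partitions, key=key))


if __name__ == "__main__":
    # Three partitions of (id, name) records, each sorted on id.
    partitions = [
        [(1, "a"), (5, "e")],
        [(2, "b")],
        [(3, "c"), (4, "d")],
    ]
    print(collect_ordered(partitions))
    print(collect_round_robin(partitions))
    print(collect_sort_merge(partitions, key=lambda record: record[0]))
```

On this sample, Ordered keeps each partition's records together, Round Robin interleaves them one at a time, and Sort Merge yields the records in ascending order of the first column.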


4. Decode stage: Output page


The Output page allows you to specify details about data output from the Decode stage. The Decode stage can have only one output link. The General tab allows you to specify an optional description of the output link. The Columns tab specifies the column definitions for the decoded data. See "Stage Editors" for a general description of the tabs.
