Welcome to Scribd. Sign in or start your free trial to enjoy unlimited e-books, audiobooks & documents.Find out more
Standard view
Full view
of .
Look up keyword or section
Like this

Table Of Contents

Partition parallelism
Combining pipeline and partition parallelism
Repartitioning data
Parallel processing environments
The configuration file
Partitioning, repartitioning, and collecting data
The mechanics of partitioning and collecting
Sorting data
Data sets
Runtime column propagation
Table definitions
Schema files and partial schemas
Data types
Strings and ustrings
Complex data types
Date and time formats
Date formats
Time formats
Timestamp formats
Incorporating server job functionality
Chapter 3. Parallel Jobs and NLS
How NLS Mode Works
Internal Character Sets
Maps and Locales in InfoSphere DataStage Parallel Jobs
Using Maps in Parallel Jobs
Character Data in Parallel Jobs
Specifying a Project Default Map
Specifying a Job Default Map
Specifying a Stage Map
Specifying a Column Map
Using Locales in Parallel Jobs
Specifying a Project Default Locale
Specifying a Job Default Locale
Specifying a Stage Locale
Defining Date/Time and Number Formats
Specifying Formats at Project Level
Specifying Formats at Job Level
Specifying Formats at Stage Level
Specifying Formats at Column Level
Chapter 4. Stage editors
Showing stage validation errors
The stage page
General tab
Properties tab
Advanced tab
Link ordering tab
NLS Map tab
NLS Locale tab
Inputs page
Partitioning tab
DB2 partition properties
Range partition properties
Format tab
Columns Tab
Field level
Example of writing a sequential file
Example of reading a sequential file
Sequential File stage: fast path
Sequential File stage: Stage page
Sequential File stage: Input page
Sequential File stage: Output page
Row number column
File set stage
File Set stage: fast path
File Set stage: Stage page
File Set stage: Input page
Single file per partition
File Set stage: Output page
Using RCP With file set stages
Lookup file set stage
Lookup File Set stage: fast path
Lookup File Set stage: Stage page
Lookup File Set stage: Input page
v Keep
Lookup File Set stage: Output page
External source stage
External Source stage: fast path
External Source stage: Stage page
External Source stage: Output page
Using RCP With External Source Stages
External Target stage
External Target stage: fast path
External Target stage: Stage page
External Target stage: Input page
Using RCP with External Target stages
Complex Flat File stage
Editing a Complex Flat File stage as a source
Column definitions
Filler creation and expansion:
Simple normalized array columns
Nested normalized array columns
Parallel normalized array columns
Editing a Complex Flat File stage as a target
Reject links
Chapter 6. Processing Data
Transformer stage
Transformer stage: fast path
Transformer editor components
Transformer stage basic concepts
Editing Transformer stages
Defining output column derivations
Column auto-match facility:
Specifying link order
To define a loop condition:
Key break detection
Detection of end of data or end of wave
The InfoSphere DataStage expression editor
Transformer stage properties
BASIC Transformer stage
BASIC Transformer stage: fast path
BASIC Transformer editor components
BASIC Transformer stage basic concepts
Editing BASIC transformer stages
Defining local stage variables
BASIC Transformer stage properties
Aggregator stage
Aggregator stage: fast path
Aggregator stage: Stage page
Aggregator stage: Properties tab
v Maximum Value
v Mean Value
v Minimum Value
v Missing Value
Aggregator stage: Input page
Aggregator stage: Output page
Join stage
Join versus lookup
Example joins
Join stage: fast path
Join stage: Stage page
Join stage: Input page
Join stage: Output page
Merge Stage
Example merge
Merge stage: fast path
Merge stage: Stage page
Warn On Reject Masters
Merge stage: Input page
Merge stage: Output page
Lookup Stage
Lookup Versus Join
Example Look Up
Lookup stage: fast path
Lookup Editor Components
Editing Lookup Stages
Lookup Stage Properties
Lookup Stage Conditions
Range lookups
The InfoSphere DataStage Expression Editor
Sort stage
Sort stage: fast path
Sort stage: Stage page
Sort stage: Input page
Sort stage: Output page
Funnel Stage
Funnel stage: fast path
Funnel stage: Stage page
N Key
Funnel stage: Input page
Funnel stage: Output page
Remove Duplicates Stage
Remove Duplicates stage: fast path
Remove Duplicates stage: Stage page
Remove Duplicates stage: Properties tab
Remove Duplicates stage: Input page
Remove Duplicates stage: Output page
Compress stage
Compress stage: fast path
Compress stage: Stage page
Compress stage: Properties tab
Compress stage: Input page
Compress stage: Output page
Expand Stage
Expand stage: fast path
Expand stage: Stage page
Expand stage: Input page
Expand stage: Output page
Copy stage
Copy stage: fast path
Copy stage: Stage page
Copy stage: Input page
Copy stage: Output page
Modify stage
Modify stage: fast path
Modify stage: Stage page
Modify stage: Input page
Modify stage: Output page
Filter Stage
Specifying the filter
Filter stage: fast path
Filter stage: Stage page
Output link
Filter stage: Input page
Filter stage: Output page
External Filter stage
External Filter stage: fast path
External Filter stage: Stage page
External Filter stage: Properties tab
External Filter stage: Input page
External Filter stage: Output page
Change Capture stage
Example Data
Change Capture stage: fast path
Change Capture stage: Stage page
v Sort Order
Log statistics
Change Capture stage: Input page
Change Capture stage: Output page
Change Apply stage
Change Apply stage: fast path
Change Apply stage: Stage page
Change Apply stage: Input page
Change Apply stage: Output page
Difference stage
Example data
Difference stage: fast path
Difference stage: Stage page
Difference stage: Properties tab
Difference stage: Input page
Difference stage: Output page
Compare stage
Compare stage: fast path
Compare stage: Stage page
Compare stage: Properties tab
Compare stage: Input page
FTP Enterprise Stage
Restart capability
FTP Enterprise stage: fast path
FTP Enterprise stage: Stage Page
FTP Enterprise stage: Input Page
Open command
User Name
Restartable Mode
v Job Id
Schema file
Transfer Type
Editing a Slowly Changing Dimension stage
Dimension update action
Pivot Enterprise stage
Specifying a horizontal pivot operation
To specify the horizontal pivot operation:
Specifying a vertical pivot operation
Pivot Enterprise stage: Properties tab
Specifying execution options
To specify execution options:
Specifying where the stage runs
To specify where the stage runs:
Specifying partitioning or collecting methods
Specifying a sort operation
Checksum stage
Adding a checksum column to your data
To add a checksum column to your data:
Properties for Checksum Stage
Mapping output columns
Chapter 7. Cleansing your Data
Chapter 8. Restructuring Data
Column Import stage
Column Import stage: fast path
Column Import stage: Stage page
Column Import stage: Input page
Column Import stage: Output page
Using RCP With Column Import Stages
Column Export stage
Column Export stage: fast path
Column Export stage: Stage page
Column Export stage: Input page
Using RCP with Column Export stages
Make Subrecord stage
Make Subrecord stage: fast path
Make Subrecord stage: Stage page
Make Subrecord stage: Properties tab
Make Subrecord stage: Input page
Make Subrecord stage: Output page
Split Subrecord Stage
Split Subrecord stage: fast path
Split Subrecord stage: Stage page
Split Subrecord stage: Properties tab
Split Subrecord stage: Input page
Split Subrecord stage: Output page
Combine Records stage
Example 1
Combine Records stage: fast path
Combine Records stage: Stage page
Combine Records stage: Properties tab
Combine Records stage: Input page
Combine Records stage: Output page
Promote Subrecord stage
Promote Subrecord stage: fast path
Promote Subrecord stage: Stage page
Promote Subrecord stage: Properties tab
Promote Subrecord stage: Input page
Promote Subrecord stage: Output page
Make Vector stage
Make Vector stage: fast path
Make Vector stage: Stage page
Make Vector stage: Properties tab
Make Vector stage: Input page
Make Vector stage: Output page
Split Vector Stage
Split Vector stage: fast path
Split Vector stage: Stage page
Split Vector stage: Properties tab
Split Vector stage: Input page
Split Vector stage: Output page
Chapter 9. Debugging your designs
Head stage
Head stage: fast path
Head stage: Stage page
Partition number
Head stage: Input page
Head stage: Output page
Tail stage
Tail stage: fast path
Tail stage: Stage page
Tail stage: Input page
Tail stage: Output page
Sample stage
Sample stage: fast path
Sample stage: Stage page
Sample stage: Input page
Sample stage: Output page
Peek stage
Peek stage: fast path
Peek stage: Stage page
Delimiter string
Peek stage: Input page
Peek stage: Output page
Row Generator stage
Row Generator stage: fast path
Row Generator stage: Stage page
Row Generator stage: Output page
Column Generator stage
Column Generator stage: fast path
Column Generator stage: Stage page
Column Generator stage: Input page
Column Generator stage: Output page
Write Range Map stage
Write Range Map stage: fast path
Write Range Map stage: Stage page
Write Range Map stage: Input page
Job optimization
Optimization scenarios
Balanced Optimization process overview
Relationship between root job and optimized job
Balanced Optimization options and properties
Push processing to database sources
Push processing to database targets
Push data reductions to database targets
Optimizing InfoSphere DataStage jobs
Selecting the job to optimize
Setting optimization options
Optimizing a job and saving the new optimized job
Viewing the optimization log
Searching for optimized jobs
Searching for the root job
To search for the root job:
Breaking the relationship between optimized job and root job
Exporting optimized jobs
Exporting root jobs
Job design considerations
Supported functions
Supported operators
Supported macros and system variables
Database stage types
Processing stage types
User-defined SQL
Combining connectors in Join, Lookup, and Funnel stages
Sorting data and database sources
Sorting data and database targets
Conversion errors
Type conversion errors
Complex write modes and circular table references
Staging table management
Chapter 11. Managing data sets
Structure of data sets
Starting the data set manager
Data set viewer
Viewing the schema
Viewing the data
Copying data sets
Deleting data sets
Configurations editor
Configuration considerations
Logical processing nodes
Optimizing parallelism
Configuration options for an SMP
Example Configuration File for an SMP
Configuration options for an MPP system
Configuration options for an SMP cluster
An example of an SMP cluster configuration
Diagram of a cluster environment
Configuration files
The default path name and the APT_CONFIG_FILE
Node names
Node pools and the default node pool
Disk and scratch disk pools and their defaults
Buffer scratch disk pools
The resource DB2 option
The resource INFORMIX option
The resource ORACLE option
The SAS resources
Adding SAS information to your configuration file
Sort configuration
Allocation of resources
Selective configuration with startup scripts
Hints and tips
Chapter 13. Grid deployment
Resource manager software requirements
Configuring the grid environment
Example LoadL_admin file
Example LoadL_config file
Example LoadL_config.local file for conductor node
Example LoadL_config.local file for compute node
Example master_config.apt configuration file
Example resource_manager file
Getting started with grid project design
Dynamic resources
Designing grid projects
Configuration file templates
Format of templates
Other grid computing considerations
Configuring SSH for passwordless commands
File type considerations
Job operation considerations
Deploying jobs in other environments
Requirements for master templates in different environments
Chapter 14. Remote deployment
Enabling a project for job deployment
Deployment package
Command shell script - pxrun.sh
Environment variable setting source script - evdepfile
Main parallel (OSH) program script - OshScript.osh
Script parameter file - jpdepfile
XML report file - <jobname>.xml
Self-contained transformer compilation
Deploying a job
Server side plug-ins
Appendix A. Schemas
Schema format
Date columns
Decimal columns
Floating-point columns
Integer columns
Raw columns
String columns
Time columns
Timestamp columns
Tagged columns
Partial schemas
Date and time functions
Logical functions
Mathematical functions
Null handling functions
Number functions
Raw functions
String functions
Vector function
Type conversion functions
Utility functions
Appendix C. Fillers
Creating fillers
Filler creation rules
Filler creation examples
Select multiple redefine columns
Select multiple cascading redefine columns
Expanding fillers
0 of .
Results for:
No results containing your search query
P. 1


Ratings: (0)|Views: 825|Likes:
Published by Sai Vishnu Vardhan

More info:

Published by: Sai Vishnu Vardhan on Jun 24, 2011
Copyright:Attribution Non-commercial


Read on Scribd mobile: iPhone, iPad and Android.
download as PDF, TXT or read online from Scribd
See more
See less





You're Reading a Free Preview
Pages 15 to 80 are not shown in this preview.
You're Reading a Free Preview
Pages 95 to 374 are not shown in this preview.
You're Reading a Free Preview
Pages 389 to 395 are not shown in this preview.
You're Reading a Free Preview
Pages 410 to 689 are not shown in this preview.
You're Reading a Free Preview
Pages 704 to 713 are not shown in this preview.

Activity (5)

You've already reviewed this. Edit your review.
1 hundred reads
1 thousand reads
addubey2 liked this
addubey2 liked this
addubey2 liked this

You're Reading a Free Preview

/*********** DO NOT ALTER ANYTHING BELOW THIS LINE ! ************/ var s_code=s.t();if(s_code)document.write(s_code)//-->