( Day -2 )
A Practical
Introduction to
Ab Initio Software:
Part 2
AB INITIO
Simple
Components
Component Organizer
Click on Component
Organizer Button
The Graph Model
The Graph Model: Naming the
Pieces
Components
Datasets
Flows
The Graph Model: Some Details
Ports
Record format metadata
Expression metadata
Components
Components may run on any computer
running the Co>Operating System.
Different components do different jobs.
The particular work a component
accomplishes depends upon its parameter
settings.
Some parameters are data transformations,
that is, business rules applied to one or more
inputs to produce a required output.
Datasets
A dataset is a source or destination of
data. It can be a simple file, a database
table, a SAS dataset, ...
Datasets may reside on any machine
running the Co>Operating System.
Datasets may reside on other machines if
connected by FTP or database middleware.
Data is always described by record format
metadata (termed “DML”).
Dataset: Simple components
• These are the main
dataset components,
used for source data
and result storage
in files.
Input file
Input File represents data records read as
input to a graph from one or multiple
serial files or from a multifile.
Description Tab :
URL : Path of the file where data is stored.
( file: / mfile: )
Partition : Ad hoc multifile (changing depth)
Access Tab :
File Handling
File Protection
Ports : DML
Dataset: Records and Fields
A dataset is made up of records; a record
consists of fields. Analogous database
terms are rows and columns.
0345John Smith
0212Sam Spade
0322Elvis Jones
0492Sue West
0121Mary Forth
0221Bill Black
Sources of Record Format
Metadata
Record formats can be generated
from:
• Database catalogs
• COBOL copybooks
• Other third-party products
• SAS datasets
One can always resort to manual
entry!
Viewing Component Properties
Double click on a
component to bring
up its Properties Page
Viewing Port Properties
DML
Record Format Metadata in
Graphical Form
0345John Smith
0212Sam Spade
0322Elvis Jones
0492Sue West
0121Mary Forth
0221Bill Black
DML Types
Fixed length
Delimiter
Mixed
Editing Types in GDE
Description Tab :
URL : Path of the file where data is stored.
( file: / mfile: )
Partition : Ad hoc multifile (changing depth)
Access Tab :
File Handling
File Protection
Port : DML
Intermediate file
Intermediate File represents one or
multiple serial files or a multifile of
intermediate results that a graph writes
during execution, and saves for your
review after execution.
Description Tab :
URL : Path of the file where data is stored.
( file: / mfile: )
Partition : Ad hoc multifile (changing depth)
Access Tab :
File Handling
File Protection
Port : DML
Viewing Data
Type in an expression...
Expression text
Exercise : Writing DML
Open a new graph and create an input file.
The data file data1.dat contains the following data:
Rao,Sunita,20031223,24000,\n
Shinde,Sachin,19931029,32000,\n
Sharma,Sunil,19941102,19000,\n
Use the Record Format Editor to create a
description of this data:
last_name
first_name
joining_date
salary
Then use View Data to verify the description is
correct.
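As a rough cross-check of the field layout outside the GDE, the sample records can be split in Python. This is only an illustrative sketch, not DML: it assumes every field (including the last) is terminated by a comma, that joining_date is in YYYYMMDD form, and that salary is a whole number. The names RAW, FIELDS and parse_records are hypothetical.

```python
from datetime import datetime

# The three sample records from data1.dat, with "\n" as the record terminator.
RAW = (
    "Rao,Sunita,20031223,24000,\n"
    "Shinde,Sachin,19931029,32000,\n"
    "Sharma,Sunil,19941102,19000,\n"
)

# Field names taken from the exercise text.
FIELDS = ("last_name", "first_name", "joining_date", "salary")


def parse_records(raw):
    """Split raw text into records, then each record into named fields."""
    records = []
    for line in raw.splitlines():
        # Every field ends with a comma, so drop the empty trailing piece.
        values = line.split(",")[: len(FIELDS)]
        rec = dict(zip(FIELDS, values))
        # Assumption: dates are YYYYMMDD and salaries are integers.
        rec["joining_date"] = datetime.strptime(rec["joining_date"], "%Y%m%d").date()
        rec["salary"] = int(rec["salary"])
        records.append(rec)
    return records


records = parse_records(RAW)
for rec in records:
    print(rec)
```

If a field were described with the wrong delimiter or type, the parse would fail or the values would land in the wrong field, which is the same kind of mismatch View Data makes visible.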
Simple components
These
components do not
have any
parameters.
Trash
Trash ends a flow by accepting all
the data records in it and discarding
them.
Replicate
Replicate arbitrarily combines all the data
records it receives into a single flow and
writes a copy of that flow to each of its
output flows.
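The behaviour of Trash and Replicate described above can be pictured with a small Python sketch. It is an illustration of the semantics only, not how the components are implemented; the function names and the sample record are hypothetical.

```python
import copy


def trash(records):
    """Accept every record on the flow and discard it."""
    for _ in records:
        pass


def replicate(records, n_outputs):
    """Write a copy of the incoming flow to each of n_outputs output flows."""
    flow = list(records)  # materialize once so every output sees all records
    return [copy.deepcopy(flow) for _ in range(n_outputs)]


outputs = replicate([{"id": 345, "name": "John Smith"}], 3)
trash(outputs[2])  # one of the copies can simply be thrown away
print(len(outputs), outputs[0])
```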
Component: Gather Logs
Reads logging records from
multiple flows connected to the
input port and writes them to the
specified ‘log file’ outside of the
application’s transactional context.
Database Components
These
components work
with third-party
databases to
read, manipulate,
and save data in
tables.
In these
components, the
record format
metadata does
not change from
input to output.
The Filter by Expression
For each record on the input port the
‘select_expr’ parameter is evaluated.
• If ‘select_expr’ evaluates true (non-zero), the
input record is written to the ‘out’ port exactly
as the input was read.
• If the ‘select_expr’ evaluates false (zero), the
record is written to the ‘deselect’ port.
The ‘out’ port, which carries the records
meeting the ‘select_expr’ criteria, must be
connected downstream.
Use of the ‘deselect’ port is
optional.
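The routing rule can be sketched in Python as follows. This is a minimal illustration of the select/deselect behaviour, not Ab Initio code; the employee records and the predicate are made-up examples.

```python
def filter_by_expression(records, select_expr):
    """Route each record to 'out' or 'deselect' without modifying it."""
    out, deselect = [], []
    for rec in records:
        # A true (non-zero) result selects the record; false (zero) deselects it.
        if select_expr(rec):
            out.append(rec)
        else:
            deselect.append(rec)
    return out, deselect


# Hypothetical sample data and expression.
employees = [
    {"id": 45, "salary": 24000},
    {"id": 612, "salary": 32000},
    {"id": 500, "salary": 19000},
]
out, deselect = filter_by_expression(employees, lambda r: r["id"] <= 500)
print(out)
print(deselect)
```

Because records pass through unchanged, both output lists hold records in the same format as the input.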
Filter Data (Selection)
1. Push “Run” button.
Combine
n_id,full_name,n_date
The Transform Function Editor
Text DML: Transform Function
Syntax
Transform Functions look like:
output-variables :: name ( input-variables ) =
begin
  assignments;
end;
Assignments look like:
output-variable.field :: expression;
The Transform Function in Text
Format
a b c
x y z
A Record arrives at the input port
9 45 QF
out :: trans(in) =
begin
out.x :: in.b - 1;
out.y :: in.a;
out.z :: fn(in.c);
end;
The Record is read into the component
The Transformation Function is evaluated
Since every rule within the Transform
function
The result record is written to the output port
of the component
44 9 RG
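The walk-through above can be reproduced with a short Python sketch. The three rules mirror the assignments in trans; the body of fn is not given in the slides, so the version here (shift each letter forward by one) is purely a hypothetical stand-in chosen because it maps QF to RG.

```python
def fn(code):
    # HYPOTHETICAL: the slides do not define fn; shifting each letter
    # forward by one happens to turn "QF" into "RG".
    return "".join(chr(ord(ch) + 1) for ch in code)


def trans(in_rec):
    """Evaluate one rule per output field, as in the DML transform."""
    return {
        "x": in_rec["b"] - 1,  # out.x :: in.b - 1;
        "y": in_rec["a"],      # out.y :: in.a;
        "z": fn(in_rec["c"]),  # out.z :: fn(in.c);
    }


result = trans({"a": 9, "b": 45, "c": "QF"})
print(result["x"], result["y"], result["z"])
```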
Exercise : Reformat Data
Create a new graph (use input file data1.dat)
• id|city|name|salary
Remove records for employees with
id > 500
Add a new field named city to the
output (default value ‘Mumbai’)
Change the delimiter from “|” to “;”
Increase salary by 25% for employees
with id < 100
Run the graph and examine the results.
Rollup
Rollup generates data records that
summarize groups of data records.
By default, Rollup reads grouped
(sorted) records from the input port,
aggregates them as indicated by key
and transform parameters, and writes
the resulting aggregate record on the
out port.
Data Aggregation
Input records (grouped by city):
0345Smith Bristol 56
0121Forth Bristol 7
0322Jones Compton 12
0212Spade London 8
0492West London 23
0221Black New York 42
Output records (one per city):
Bristol 63
Compton 12
London 31
New York 42
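The aggregation in this example can be sketched in Python with itertools.groupby, which, like Rollup's default mode, expects records with the same key to be adjacent. This is an illustration of the idea only; the tuple layout and the rollup_sum helper are assumptions made for the sketch.

```python
from itertools import groupby
from operator import itemgetter

# (id, name, city, amount), already grouped by city as in the slide.
records = [
    ("0345", "Smith", "Bristol", 56),
    ("0121", "Forth", "Bristol", 7),
    ("0322", "Jones", "Compton", 12),
    ("0212", "Spade", "London", 8),
    ("0492", "West", "London", 23),
    ("0221", "Black", "New York", 42),
]


def rollup_sum(records, key_index, value_index):
    """Emit one (key, total) record per group of adjacent equal keys."""
    key = itemgetter(key_index)
    return [
        (k, sum(rec[value_index] for rec in group))
        for k, group in groupby(records, key=key)
    ]


totals = rollup_sum(records, key_index=2, value_index=3)
for city, total in totals:
    print(city, total)
```

If the input were not grouped by city, the same city could appear in several output records, which is why sorted or grouped input matters here.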
Built-in Functions for Rollup
The following aggregation functions
are predefined and are only available
in the rollup component:
avg
max
count
min
first
product
last
sum
Rollup Wizard
0345Bristol 561997/09/24
0212London 81900/01/01
0322Compton 121997/04/02
0492London 231997/11/23
0121Bristol 71996/12/11
0221New York 421900/01/01
Joining Sorted Data on the ‘id’ field
0121Bristol 71996/12/11
0212London 81900/01/01
...
Building the Output Record
in0:
record
  decimal(4) id;
  string(6) name;
  string(8) city;
  decimal(3) amount;
end
in1:
record
  decimal(4) id;
  date("YYMMDD") dt;
  decimal(9.2) cost;
end
out:
record
  decimal(4) id;
  string(8) city;
  decimal(3) amount;
  date("YYYY/MM/DD") dt;
end
What if the in1 record is missing?
in0:
record
  decimal(4) id;
  string(6) name;
  string(8) city;
  decimal(3) amount;
end
in1:
record
  decimal(4) id;
  date("YYMMDD") dt; ???
  decimal(9.2) cost;
end
out:
record
  decimal(4) id;
  string(8) city;
  decimal(3) amount;
  date("YYYY/MM/DD") dt;
end
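One way to picture the missing-record case is to give out.dt a list of sources in priority order and take the first one that yields a value. The Python sketch below illustrates that idea under stated assumptions: the fallback date 1900/01/01 is taken from the sample rows in this deck that appear to lack a matching in1 record, the input value "970924" is a hypothetical in1.dt in YYMMDD form, and the helper names are invented for the sketch.

```python
from datetime import datetime

# Assumption: the fallback value seen in the deck's sample output rows.
DEFAULT_DT = "1900/01/01"


def convert_dt(in1):
    """Reformat in1.dt from YYMMDD to YYYY/MM/DD, or None if in1 is missing."""
    if in1 is None:
        return None
    return datetime.strptime(in1["dt"], "%y%m%d").strftime("%Y/%m/%d")


def build_output(in0, in1):
    # Sources for out.dt in priority order; the first non-missing one wins.
    candidates = (convert_dt(in1), DEFAULT_DT)
    dt = next(value for value in candidates if value is not None)
    return {"id": in0["id"], "city": in0["city"], "amount": in0["amount"], "dt": dt}


matched = build_output(
    {"id": 345, "name": "Smith", "city": "Bristol", "amount": 56},
    {"id": 345, "dt": "970924", "cost": 10.50},
)
unmatched = build_output(
    {"id": 212, "name": "Spade", "city": "London", "amount": 8},
    None,
)
print(matched)
print(unmatched)
```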
Prioritized Assignment
Destination Priority Source
in0 fields: a b c
in1 fields: a q r
out fields: a x q
Records arrive at the inputs of the Join
in0: G 234 42
in1: G NY 4
Align inputs by a
out: G 24 NY
New records arrive at the inputs of the Join
in0: H 79 23
in1: K IL 8
Align inputs by a
out: H 89 XX
Exercise: Join Data
Study different joins
• Inner
• Full Outer
• Explicit
Record required parameter
If a port does not have a record with a key
value that matches the current key value,
and you set the record-required parameter
for that port to:
• false - Join calls the transform function with NULL
for the corresponding argument.
• true - Join does not call the transform function at
all for the current key value.
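The effect of the record-required setting can be sketched in Python. This is a simplified illustration, not the Join component itself: it assumes at most one record per key on each input, uses None in place of NULL, and borrows the key field a and the sample records from the join example in this deck. The transform shown, including the "XX" default for q, is a hypothetical stand-in.

```python
def join(in0, in1, transform, record_required0=True, record_required1=True):
    """Call transform once per key value, honouring the record-required flags."""
    by_key0 = {rec["a"]: rec for rec in in0}
    by_key1 = {rec["a"]: rec for rec in in1}
    out = []
    for key in sorted(by_key0.keys() | by_key1.keys()):
        rec0, rec1 = by_key0.get(key), by_key1.get(key)
        # true: the transform is not called at all for this key value.
        if (rec0 is None and record_required0) or (rec1 is None and record_required1):
            continue
        # false: the transform is called with None for the missing input.
        out.append(transform(rec0, rec1))
    return out


def transform(rec0, rec1):
    # HYPOTHETICAL rules: take the key from whichever record exists,
    # and default q to "XX" when in1 has no matching record.
    key = rec0["a"] if rec0 is not None else rec1["a"]
    q = rec1["q"] if rec1 is not None else "XX"
    return {"a": key, "q": q}


in0 = [{"a": "G", "b": 234, "c": 42}, {"a": "H", "b": 79, "c": 23}]
in1 = [{"a": "G", "q": "NY", "r": 4}, {"a": "K", "q": "IL", "r": 8}]

inner = join(in0, in1, transform)
left_outer = join(in0, in1, transform, record_required1=False)
full_outer = join(in0, in1, transform, record_required0=False, record_required1=False)
print(inner)
print(left_outer)
print(full_outer)
```

With both flags true the result behaves like an inner join; relaxing one or both flags gives outer-join behaviour, which is what an explicit join lets you configure per port.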
The GDE Debugger
The GDE has a built-in
debugger.
To enable the debugger, select
Debugger:Enable Debugger
The Debugger Toolbar
Enable Debugger Remove All Watchers