Fourteenth

Data Parallelism
• Scales with data.
• Requires data partitioning.
• Different partitioning methods for
different operations.
Confidential & Proprietary

Data Partitioning
Expanded View:
Global View:

Data Partitioning:
The Global View
Degree of Parallelism
Fan-out Flow

Component: Partition by Round-robin
• Reads records from its input port and writes them to
the flow partitions connected to its output port. Records
are written to partitions in round-robin fashion, with
block-size records going to each partition before moving on
to the next.

Round-robin Partitioning
Input records, in order:
A B C D E F C D B G B A A D F E A D

After round-robin partitioning:
Partition 0   Partition 1   Partition 2
A             B             C
D             E             F
C             D             B
G             B             A
A             D             F
E             A             D
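
The round-robin rule described above can be sketched in a few lines of Python. This is only an illustration of the distribution rule, not the component's implementation; the function name `partition_round_robin` and its parameters are hypothetical, and a block size of 1 is assumed to match the 18-record slide example.

```python
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def partition_round_robin(records: Iterable[T], n_partitions: int,
                          block_size: int = 1) -> List[List[T]]:
    """Deal records out to partitions in turn, block_size records at a time."""
    partitions: List[List[T]] = [[] for _ in range(n_partitions)]
    for i, rec in enumerate(records):
        # Block number i // block_size decides which partition gets the record.
        partitions[(i // block_size) % n_partitions].append(rec)
    return partitions


# The 18-record input from the slide, split three ways.
records = list("ABCDEFCDBGBAADFEAD")
parts = partition_round_robin(records, 3)
for index, part in enumerate(parts):
    print(f"Partition {index}: {' '.join(part)}")
```

Because the rule ignores record contents, every partition receives the same number of records (give or take one block), but records with equal values are scattered across partitions.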

A Data Parallel Application:
The Expanded View

Exercise 14: Data Parallel
Reformatting (Expanded)
• Open figure-04.
• Save As... to figure-04-expanded.
• Create a copy of the Reformat and the Simple-Out dataset (use Edit...Copy and
Edit...Paste).
• Change the path for the copy of Simple-Out.
• Add a Partition by Round-robin component before the Reformat components;
hook them up with flows.
• Run the application and examine the results.

A Data Parallel Application:
The Global View
Degree of Parallelism (Abstract)
Fan-out Flow
Multifile

Exercise 15: Data Parallel
Reformatting (Global)
• Open figure-04.
• Save As... to figure-04-global.
• Add a Partition by Round-robin component.
• Change the Simple-Out dataset to a multifile.
• Run the application and examine the results (use the “Partition”
option in View Data).

Data Aggregation in Parallel
Input records:
0345Smith Bristol 56
0322Jones Compton 12
0121Forth Bristol 7
0212Spade London 8
0492West London 23
0221Black New York 42
Aggregated output:
Bristol 63
Compton 12
London 31
New York 42

Data Aggregation of Grouped
Input in Parallel
Input records (grouped by city):
0345Smith Bristol 56
0121Forth Bristol 7
0322Jones Compton 12
0212Spade London 8
0492West London 23
0221Black New York 42
Aggregated output:
Bristol 63
Compton 12
London 31
New York 42

Key-Dependent Data
Parallelism
• Aggregation processes records in groups defined by key
values.
• Parallel aggregation requires partitioning based on key
value.
• Parallel aggregation takes three steps, using the same key in each step:
– Partition by key.
– Sort by key.
– Aggregate by key.
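
The three steps can be sketched in Python using the city records from the aggregation slides. This is a minimal illustration under stated assumptions: `partition_by_key` and `aggregate_partition` are hypothetical helper names, `zlib.crc32` stands in for whatever hash the real component uses, and two partitions are chosen arbitrarily.

```python
import zlib
from itertools import groupby
from operator import itemgetter

# (id and name, city, amount) records as shown on the aggregation slides.
records = [
    ("0345Smith", "Bristol", 56),
    ("0322Jones", "Compton", 12),
    ("0121Forth", "Bristol", 7),
    ("0212Spade", "London", 8),
    ("0492West", "London", 23),
    ("0221Black", "New York", 42),
]
city = itemgetter(1)  # the same key is used in all three steps


def partition_by_key(recs, key, n_partitions):
    """Step 1: send each record to the partition chosen by a hash of its key."""
    parts = [[] for _ in range(n_partitions)]
    for rec in recs:
        parts[zlib.crc32(key(rec).encode("utf-8")) % n_partitions].append(rec)
    return parts


def aggregate_partition(part, key):
    """Steps 2 and 3: sort one partition by key, then total each group."""
    ordered = sorted(part, key=key)
    return [(k, sum(rec[2] for rec in group)) for k, group in groupby(ordered, key=key)]


partitions = partition_by_key(records, city, 2)
totals = [aggregate_partition(part, city) for part in partitions]
combined = dict(pair for part in totals for pair in part)
print(combined)
```

Each partition is aggregated independently, and because a city can only land in one partition, no city total is split across outputs.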

Component: Partition by Key
• Reads records from its input port and writes them to
the flow partitions connected to its output port. A hash
code computed using the key determines which
partition a record will be written on, meaning that
records with the same key value will go to the same
partition.

Partitioning by Key
Input records, in order:
A B C D E F C D B G B A A D F E A D

After partitioning by key:
Partition 0   Partition 1   Partition 2
A             B             D
C             E             F
C             B             D
A             G             D
A             B             F
A             E             D
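
A short Python sketch of hash-based key partitioning follows, applied to the same 18-record input. The hash used here (`zlib.crc32`) is an assumption chosen because it is stable between runs; it will not necessarily assign keys to the same partitions as the slide, but it preserves the property that matters: every record with a given key lands in one partition.

```python
import zlib
from collections import defaultdict


def partition_by_key(records, n_partitions):
    """Write each record to the partition selected by a hash of its key."""
    partitions = [[] for _ in range(n_partitions)]
    for rec in records:
        # Here the whole record is the key. crc32 is deterministic across
        # runs, unlike Python's built-in hash() for strings.
        index = zlib.crc32(rec.encode("utf-8")) % n_partitions
        partitions[index].append(rec)
    return partitions


records = list("ABCDEFCDBGBAADFEAD")
partitions = partition_by_key(records, 3)

# For every key, collect the set of partitions it was written to.
homes = defaultdict(set)
for index, part in enumerate(partitions):
    for rec in part:
        homes[rec].add(index)

for key in sorted(homes):
    print(key, "-> partition", sorted(homes[key]))
```

Unlike round-robin, partition sizes now depend on the key distribution, so a heavily repeated key can make one partition much larger than the others.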

Partition by Key + Sort =
Parallel Grouping
After Partition by Key:
Partition 0   Partition 1   Partition 2
A             B             D
C             E             F
C             B             D
A             G             D
A             B             F
A             E             D
After Sort:
Partition 0   Partition 1   Partition 2
A             B             D
A             B             D
A             B             D
A             E             D
C             E             F
C             G             F

Common Mistakes
• Incorrect Results if:
– Keys for partition, sort, or aggregate differ.
– Data is partitioned, but is never sorted.
• Computationally Expensive if:
– Data is sorted before it is partitioned.
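
The "partitioned but never sorted" mistake can be reproduced with a small Python sketch. It assumes an aggregation step that, like `itertools.groupby`, only combines adjacent records with equal keys; whether a particular aggregate component behaves this way depends on how it is configured, so treat this as an illustration rather than a description of the product.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_sorted_input(records):
    """Total the amounts of each run of adjacent records sharing a key."""
    return [(city, sum(amount for _, amount in group))
            for city, group in groupby(records, key=itemgetter(0))]


# One partition holding two keys, in arrival order.
partition = [("Bristol", 56), ("Compton", 12), ("Bristol", 7)]

unsorted_result = aggregate_sorted_input(partition)
sorted_result = aggregate_sorted_input(sorted(partition, key=itemgetter(0)))

print("without sort:", unsorted_result)
print("with sort:   ", sorted_result)
```

Without the sort, Bristol is reported twice with partial totals; with the sort, each key produces exactly one output record.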

Exercise 16:
Data Parallel Aggregation
• Start with figure-05.
• Save As... to figure-05-parallel.
• Add a Partition by Key component.
• Change the output file to a multifile.
• Run the application and examine the results.

Departitioning
Departitioning combines many flows of data to
produce one flow. It is the opposite of partitioning.
Each departition component combines flows in a
different manner.

Departitioning
Expanded View: Score 1, Score 2 and Score 3 each flow into a single
Departition component, which writes to the Output File.
Global View:

Departitioning
Fan-in Flow
• For the various departitioning components:

• Key-based?
• Result ordering?
• Effect on parallelism?
• Uses?

Departitioning: Performance
Input buffer Output buffer
Free space
Used space

Concatenation
Globally ordered, partitioned data:
Partition 0       Partition 1       Partition 2
49Jane 02241 2    47Bill 02114 14   42John 02116 30
44Bob 02116 8     46Rick 02116 23   48Mary 02116 38
43Mark 02114 9                      45Sue 02241 92
Sorted data:
49Jane 02241 2
44Bob 02116 8
43Mark 02114 9
47Bill 02114 14
46Rick 02116 23
42John 02116 30
48Mary 02116 38
45Sue 02241 92
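
Concatenation can be sketched in Python as reading each partition in turn, to the end, before starting the next. The partition contents below are taken from the slide; placing the 45Sue record in the last partition is an inference from the requirement that the data be globally ordered.

```python
from itertools import chain

# (id and name, zip code, amount); globally ordered by amount across partitions.
partitions = [
    [("49Jane", "02241", 2), ("44Bob", "02116", 8), ("43Mark", "02114", 9)],
    [("47Bill", "02114", 14), ("46Rick", "02116", 23)],
    [("42John", "02116", 30), ("48Mary", "02116", 38), ("45Sue", "02241", 92)],
]

# Read partition 0 completely, then partition 1, then partition 2.
concatenated = list(chain.from_iterable(partitions))
amounts = [rec[2] for rec in concatenated]
print(amounts)
```

The output is sorted only because the input was already globally ordered; concatenation itself never compares records.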

Concatenation: Performance
Reading single flow in its entirety
Running components
Blocked components

Concatenation
• Not key-based.
• Result ordering is by partition.
• Serializes pipelined computation.
• Useful for:
• creating serial flow from partitioned data
• appending headers and trailers
• writing DML
• Used infrequently

Merge
Round-robin partitioned and sorted by amount:
Partition 0       Partition 1       Partition 2
42John 02116 30   49Jane 02241 2    44Bob 02116 8
48Mary 02116 38   43Mark 02114 9    47Bill 02114 14
45Sue 02241 92    46Rick 02116 23
Sorted data, following merge on amount:
49Jane 02241 2
44Bob 02116 8
43Mark 02114 9
47Bill 02114 14
46Rick 02116 23
42John 02116 30
48Mary 02116 38
45Sue 02241 92
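
A key-based merge of sorted partitions can be sketched with Python's standard `heapq.merge`, which repeatedly takes the smallest head record among its inputs. This illustrates the ordering behaviour only; the partition contents come from the slide, and the amount field is the merge key.

```python
import heapq
from operator import itemgetter

# Each partition is already sorted by amount (third field).
partitions = [
    [("42John", "02116", 30), ("48Mary", "02116", 38), ("45Sue", "02241", 92)],
    [("49Jane", "02241", 2), ("43Mark", "02114", 9), ("46Rick", "02116", 23)],
    [("44Bob", "02116", 8), ("47Bill", "02114", 14)],
]

# heapq.merge is lazy: it reads from whichever input holds the next key.
merged = list(heapq.merge(*partitions, key=itemgetter(2)))
for rec in merged:
    print(*rec)
```

If any input partition were not sorted on the merge key, the output would not be sorted either, which is why the ordering guarantee is conditional on sorted inputs.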

Merge: Performance
If keys evenly distributed:
Reading flows roughly evenly
Components running roughly in lock-step

Merge: Performance
If keys globally sorted or near globally sorted:
Reading single flow in its entirety
Blocked components

Merge
• Key-based.
• Result ordering is sorted if each input is sorted.
• Possibly synchronizes pipelined computation; may
even serialize.
• Useful for creating ordered data flows.
• Used more than concatenate, but still infrequently

Interleave
Round-robin partitioned and scored:
Partition 0        Partition 1        Partition 2
42John 02116 30A   43Mark 02114 9C    44Bob 02116 8C
45Sue 02241 92A    46Rick 02116 23B   47Bill 02114 14B
48Mary 02116 38A   49Jane 02241 2C
Scored dataset in original order, following interleave:
42John 02116 30A
43Mark 02114 9C
44Bob 02116 8C
45Sue 02241 92A
46Rick 02116 23B
47Bill 02114 14B
48Mary 02116 38A
49Jane 02241 2C
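
Interleave can be sketched in Python as taking one record from each partition in turn. The helper name `interleave` is hypothetical, and a block size of 1 is assumed so that it exactly reverses the round-robin split shown on the slide.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel for partitions that run out early


def interleave(partitions):
    """Read one record from each partition in round-robin sequence."""
    out = []
    for row in zip_longest(*partitions, fillvalue=_MISSING):
        out.extend(rec for rec in row if rec is not _MISSING)
    return out


partitions = [
    ["42John 02116 30A", "45Sue 02241 92A", "48Mary 02116 38A"],
    ["43Mark 02114 9C", "46Rick 02116 23B", "49Jane 02241 2C"],
    ["44Bob 02116 8C", "47Bill 02114 14B"],
]
restored = interleave(partitions)
for rec in restored:
    print(rec)
```

The original order is recovered only if each partition still holds exactly the records it was dealt, in the same order, which is why this suits record-independent computations.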

Interleave: Performance
Reading flows in
round-robin sequence
Components running in lock-step

Interleave
• Not key-based.
• Result ordering is inverse of round-robin.
• Synchronizes pipelined computation.
• Useful for restoring original order following a
record-independent parallel computation
partitioned by round-robin.
• Used in rare circumstances

Gather
Round-robin partitioned and scored:
Partition 0        Partition 1        Partition 2
42John 02116 30A   43Mark 02114 9C    44Bob 02116 8C
45Sue 02241 92A    46Rick 02116 23B   47Bill 02114 14B
48Mary 02116 38A   49Jane 02241 2C
Scored dataset in random order, following gather:
43Mark 02114 9C
46Rick 02116 23B
42John 02116 30A
45Sue 02241 92A
48Mary 02116 38A
44Bob 02116 8C
47Bill 02114 14B
49Jane 02241 2C

Gather: Performance
Reading flows as
data is available

Gather
• Not key-based.
• Result ordering is unpredictable.
• Neither serializes nor synchronizes pipelined
computation.
• Useful for efficient collection of data from multiple
partitions and for repartitioning.
• Used most frequently
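
Gather's "take whatever arrives first" behaviour can be approximated in Python with one thread per partition writing into a shared queue. This is a toy model under stated assumptions (threads standing in for upstream components, a hypothetical `gather` helper), not the component's implementation.

```python
import queue
import threading


def gather(partitions):
    """Collect records from all partitions in whatever order they arrive."""
    arrivals = queue.Queue()

    def feed(part):
        for rec in part:
            arrivals.put(rec)

    threads = [threading.Thread(target=feed, args=(part,)) for part in partitions]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    gathered = []
    while not arrivals.empty():
        gathered.append(arrivals.get())
    return gathered


partitions = [
    ["42John", "45Sue", "48Mary"],
    ["43Mark", "46Rick", "49Jane"],
    ["44Bob", "47Bill"],
]
gathered = gather(partitions)
print(gathered)
```

Every record arrives exactly once, but the order depends on scheduling and may change from run to run, so downstream logic must not rely on it.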

Summary of Departitioning Methods
Method        Key-based?   Ordering?                Uses
Merge         Yes          Sorted                   Creating ordered serial flow
Concatenate   No           Global                   Creating serial flow from partitioned data
Interleave    No           Inverse of round-robin   “Undoing” round-robin partitioning
Gather        No           Unpredictable            Unordered departitioning, repartitioning

Deadlock
Blocking on read
Blocking on write

Avoiding Deadlock
• Use Concatenate, Interleave and Merge with
care.
• Use flow buffering.
• Insert phase break before departition.
• Don’t serialize data unnecessarily;
repartition instead of departition.
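
The interaction of "blocking on read" and "blocking on write" can be shown with a deterministic toy simulation in Python. The model is an assumption made for illustration: one round-robin partitioner writes into two bounded flow buffers, and a concatenating reader insists on finishing flow 0 before touching flow 1. The function name `simulate` and the buffer sizes are hypothetical.

```python
from collections import deque


def simulate(n_records, capacity):
    """Return 'done' or 'deadlock' for a round-robin writer and a concatenating reader."""
    flows = [deque(), deque()]
    produced = 0   # records written so far by the partitioner
    consumed = 0   # records read so far by the concatenate
    reading = 0    # flow the concatenate is currently reading
    while consumed < n_records:
        progress = False
        # Writer: the next record goes to flow (produced % 2); it blocks if that buffer is full.
        if produced < n_records and len(flows[produced % 2]) < capacity:
            flows[produced % 2].append(produced)
            produced += 1
            progress = True
        # Reader: takes from the current flow and moves on only at end of flow.
        if flows[reading]:
            flows[reading].popleft()
            consumed += 1
            progress = True
        elif reading == 0 and produced == n_records:
            reading = 1  # flow 0 is closed and drained
            progress = True
        if not progress:
            # Writer is blocked on a full buffer, reader on an empty one.
            return "deadlock"
    return "done"


print(simulate(n_records=20, capacity=4))
print(simulate(n_records=20, capacity=10))
```

With a small buffer, flow 1 fills while the reader is still waiting on flow 0, and neither side can proceed. A buffer large enough to hold all of flow 1 lets the job finish, which is the intuition behind flow buffering and phase breaks; a gather-style reader that takes from either flow would avoid the problem entirely.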

Repartitioning
Use to redistribute records across partitions.
Records are almost always redistributed in a
key-based manner, but don’t have to be.
Records can be redistributed to fewer partitions,
the same number of partitions, or more partitions.

The “Wrong” Way
This serializes the computation.

Repartitioning -- The Right Way
Expanded View:
Global View:

Repartitioning
All-to-All Flow
Note: The departition component is almost
always a Gather.

Key Repartition + Sort =
Regroup
Input:
Partition 0   Partition 1   Partition 2
A 1           B 6           D 6
A 2           B 5           D 4
A 3           B 3           D 5
A 4           E 6           D 6
C 5           E 2           F 7
C 5           G 7           F 2
After Partition by Key and Gather:
Partition 0   Partition 1   Partition 2
C 5           B 3           A 4
B 5           E 2           D 6
D 5           F 2           B 6
G 7           A 1           E 6
F 7           A 2           D 4
C 5           A 3           D 6

Key Repartition + Sort = Regroup
After Gather:
Partition 0   Partition 1   Partition 2
C 5           B 3           A 4
B 5           E 2           D 6
D 5           F 2           B 6
G 7           A 1           E 6
F 7           A 2           D 4
C 5           A 3           D 6
After Sort:
Partition 0   Partition 1   Partition 2
C 5           A 1           A 4
C 5           A 2           D 4
B 5           E 2           B 6
D 5           F 2           E 6
G 7           A 3           D 6
F 7           B 3           D 6
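
The regroup pattern can be sketched in Python using the slide's records, which start out grouped by letter and end up grouped by number. Two assumptions are made for illustration: the helper `repartition` is hypothetical, and the number modulo the partition count stands in for the hash, so the destination of each number differs from the slide while the grouping property is the same.

```python
# Three source partitions, grouped by the letter field.
source = [
    [("A", 1), ("A", 2), ("A", 3), ("A", 4), ("C", 5), ("C", 5)],
    [("B", 6), ("B", 5), ("B", 3), ("E", 6), ("E", 2), ("G", 7)],
    [("D", 6), ("D", 4), ("D", 5), ("D", 6), ("F", 7), ("F", 2)],
]


def number(rec):
    """The new grouping key: the numeric field."""
    return rec[1]


def repartition(source_partitions, key, n_out):
    """All-to-all flow: each source partitions by key, each destination gathers."""
    destinations = [[] for _ in range(n_out)]
    for part in source_partitions:
        for rec in part:
            destinations[key(rec) % n_out].append(rec)
    return destinations


# Gather order is not guaranteed in general, so each destination sorts afterwards.
regrouped = [sorted(part, key=number) for part in repartition(source, number, 3)]
for index, part in enumerate(regrouped):
    print(f"Partition {index}:", part)
```

The data never passes through a single serial flow: each record moves directly from its source partition to its destination partition.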

Regroup

Sort Does “Gathering”

Which Components will
Gather?
Many built-in components will gather. To find out if
a specific component will gather:
• Select the component in the component organizer
• Either:
– Look at the adjacent help
– Look for “fan” next to Input Ports: in
OR
– Press the help button
– Look for “fan-in” in the Ports section beside in

Layout
• Layout determines the location of a
resource.
• A layout is either serial or parallel.
• A serial layout specifies one node and
one directory.
• A parallel layout specifies multiple nodes
and multiple directories. It is
permissible for the same node to be
repeated.

Layout
• The location of a Dataset is one or more
places on one or more disks.
• The location of a computing component is
one or more directories on one or more
nodes. By default, the node and directory are
unknown.
• Computing components propagate their
layouts from neighbors, unless specifically
given a layout by the user.

Layout
(notice that all layouts are serial in this graph)
files on Node X
file on Node X
Q: On which node do the processing components run?
A: On Node X.

Layout Determines What
Runs Where
Q: On which node do the processing components run?
Node W   Node X   Node Y   Node Z

Serial: file on Node W
Parallel: 3-way multifile on Node X,Y,Z

Serial: file on Node W   Q: Serial or Parallel?   Serial: file on Node W
Q: Where do the Reformat(s) run?

Controlling Layout
• Propagate (default)
• Bind layout to that of another component
• Use layout of URL
• Construct layout manually
• Run on these hosts

Multidirectory URL as a
Layout
Multidirectory URL: mfile://host1/u/jo/mfs
Partitions: //host1/vol4/pA/ //host2/vol3/pB/ //host3/vol7/pC/
Layout specifies the locations of the partitions.
Each partition of a layout has:
• A host part (node to run on)
• A data part (directory for working storage)
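
To make the host part and data part concrete, the Python sketch below splits the three partition locations from the slide. It is purely illustrative: `LayoutPartition` and `parse_partition` are hypothetical names, not part of any Ab Initio API, and the `//host/path` form is assumed from the example alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayoutPartition:
    host: str       # node to run on
    directory: str  # directory for working storage


def parse_partition(location: str) -> LayoutPartition:
    """Split a '//host/dir/...' partition location into host and data parts."""
    if not location.startswith("//"):
        raise ValueError(f"expected //host/path, got {location!r}")
    host, _, path = location[2:].partition("/")
    return LayoutPartition(host=host, directory="/" + path)


layout = [parse_partition(loc) for loc in
          ["//host1/vol4/pA/", "//host2/vol3/pB/", "//host3/vol7/pC/"]]

for part in layout:
    print(part.host, part.directory)
print("degree of parallelism:", len(layout))
```

The number of entries in the layout is the number of partitions, so this example describes a three-way parallel layout.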

Reining in the Parallel Beast
• Applications built with Ab Initio Software can
combine all forms of parallelism.
• Layouts control the number of partitions of a
parallel computation; that is, the degree of data
parallelism.
• Phases control the number of components running
at any one time; that is, the degree of component
and pipeline parallelism.

Phases
Phase 0 Phase 1

Phases
• Breaking an application into phases limits
the contention for:
– Main memory.
– Processor(s).
• Breaking an application into phases costs:
– Disk space.

Checkpoints
• Since data is staged to disk between
phases, one can arrange to use that
data to “start from the middle” should
something go wrong.
• Any phase break can be a checkpoint.

The Phase Toolbar
A Toggle between:
Phase (P), and Checkpoint After Phase (C)
Select Phase Number
View Phase Set Phase

Anatomy of a Running Job
What happens when you push the “Run” button?
• Your graph is translated into a script that can be executed
in the Shell Development Environment.
• This script and any metadata files stored on the GDE client
machine are shipped (via FTP) to the server.
• The script is invoked (via REXEC or TELNET) on the server.
• The script creates and runs a job that may run across
many nodes.
• Monitoring information is sent back to the GDE client.

• Host Process Creation
– Pushing “Run” button generates script.
– Script is transmitted to Host node.
– Script is invoked, creating Host process.
(Diagram labels: GDE on the Client; Host; Processing nodes.)

• Agent Process Creation
– Host process spawns Agent processes.

• Component Process Creation
– Agent processes create Component
processes on each processing node.

• Component Execution
– Component processes do their jobs.
– Component processes communicate directly with
datasets and each other to move data around.

• Successful Component Termination
– As each Component process finishes with its
data, it exits with success status.

• Agent Termination
– When all of an Agent’s Component processes exit,
the Agent informs the Host process that those
components are finished.
– The Agent process then exits.

• Host Termination
– When all Agents have exited, the Host process
informs the GDE that the job is complete.
– The Host process then exits.

• Abnormal Component Termination
– When an error occurs in a Component
process, it exits with error status.
– The Agent then informs the Host.

• Abnormal Component Termination
– The Host tells each Agent to kill its
Component processes.

• Agent Termination
– When every Component process of an Agent has
been killed, the Agent informs the Host process that
those components are finished.
– The Agent process then exits.

• Host Termination
– When all Agents have exited, the Host
process informs the GDE that the job failed.
– The Host process then exits.

To View or Edit the Script
“Edit Script” button
Lines beginning with “mp” are Shell
Development Environment directives
