
m_dump metadata [data] [action]

Produces a human-readable report that shows how input data is interpreted by Ab Initio
metadata. It can print the metadata or a description of the metadata. It can also evaluate
specified expressions within each record.

metadata is one of:
    filename: Read metadata from file.
    -string string: Read metadata from string.

data is one of:
    filename: Read data from file. Specify a URL for a remote file or multifile.
    -string string: Read data from string.
    - [hyphen]: Read data from standard input.

action is zero or more of:
    -print-metadata: Print metadata.
    -describe: Describe structure of metadata: the names, offsets, sizes, and types for every field.
    -print-data: Default. Print data to standard output.
    -no-print-data: Suppress printing of data.
    -print expression: Evaluate expression for each record displayed and print result.
    -start recnum: Start data printing at record recnum.
    -end recnum: End data printing at record recnum.
    -record recnum: Print data only for record recnum.
    Note: the first record is record number 1, and start and end are inclusive.
    -partition: Print an individual partition of a multifile. This option must appear last on
    the command line. Partitions are numbered in the range 0-n, inclusive.
    -report: Produce monitor reports as specified in the variable XX_REPORT.
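
As a rough illustration of the syntax above, the following ksh-compatible sketch prints an inclusive range of records from a data file. The file names customers.dml and customers.dat and the helper dump_range are hypothetical; only options listed above are used. When m_dump is not on the PATH, the helper just prints the command it would have run.

```shell
# Hypothetical helper (not part of Ab Initio): dump an inclusive record range.
# usage: dump_range <metadata-file> <data-file> <first-recnum> <last-recnum>
dump_range() {
    # Build the command line from the documented -start and -end options.
    set -- m_dump "$1" "$2" -start "$3" -end "$4"
    if command -v m_dump >/dev/null 2>&1; then
        "$@"                    # m_dump is installed: run it
    else
        printf '%s\n' "$*"      # otherwise show the command only (dry run)
    fi
}

# Hypothetical file names; record numbering starts at 1.
dump_range customers.dml customers.dat 1 10
```

Wrapping the call this way keeps the record range in one place, which is convenient when the same file is inspected repeatedly.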

m_attach: Ab Initio provides this shell command to facilitate remote startup on large parallel

m_env: Displays the current settings of the Ab Initio environment variables. Invoke m_env with
the option -h for added help (m_env -h).

Environment Variables: Set these environment variables if we want a value different from the
default.

XX_TIMEOUT=seconds: The time-out interval for certain operations, such as starting a remote
process. Default is 30 seconds.
XX_MAX_RECORD_BUFFER=bytes: Maximum buffer size that certain parts of the system will use to
hold a record. Default is 5 million bytes.
XX_NICE=priority: Run jobs on remote nodes at the specified priority.
XX_SORT_MAX_CORE=megabytes: The default value for the max-core argument to the local-sort
component. Default is 10 megabytes.

Special Ab Initio Facilities:

HOST_ALIAS_FILE=path: File containing hostname aliases.
XX_CATALOG=path: Location of user-created metadata catalogs.
XX_REPORT=keyword: Monitor the current job and produce reports.
IWAIT=true: Enable debugging via interactive wait.
XX_DEBUG=value: Set debugging mode.
DISPLAY=display_id: The X Windows display. Used to pop up
TRACE_ALL_SOCS=path: Trace all process SOC events to files named program-name.soc in the
directory specified by path.
LAUNCHER_TRACE: Enable trace output from the low-level layer that controls remote process
control and job

An Ab Initio application is a set of mp commands, beginning with mp job and ending (usually)
with mp run. In between are commands that identify the program components and indicate the
flow of data from one to the next. Thus, the mp script usually defines and runs the job.

When a script is invoked, the mp job command executes. At this point, the system creates two
files in the current working directory:
    jobname.job: As the rest of the script is read, a text representation of the application
    being defined is placed in this text file.
    .abinitio-current-job: This file contains jobname, which lets the system know the name of
    the current job.

If two or more mp jobs are running in the same directory at the same time, one job will overwrite
the other's .abinitio-current-job file. To avoid this problem, use the environment variable
AB_JOB. When AB_JOB is set, all mp commands use its value as the name of the current mp
job, ignoring the name stored in .abinitio-current-job.
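
A minimal sketch of that workaround, for a script that shares its working directory with other jobs. The job name load_customers is hypothetical, and the sketch assumes the value of AB_JOB is the same name the script passes to mp job.

```shell
# Hypothetical job name; each script sharing the directory sets its own.
AB_JOB=load_customers
export AB_JOB

# The script's mp commands would follow here. With AB_JOB set, they take the
# current job name from the variable, not from .abinitio-current-job.
echo "current job: $AB_JOB"
```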

An Ab Initio application may be designed to execute in sequential phases with or without
checkpointing, which means saving state to disk between phases.

Phased execution is enabled from within the application, if the script developer has inserted the
command mp phase or mp checkpoint between one component and another.

Phasing makes a difference in how the application uses the system resources, often trading off
performance for safety. Phasing inhibits pipeline parallelism but guarantees that resource-
intensive stages will not compete with each other.

When a job does not complete normally, it leaves a file in the working directory on the host
system with the name jobname.rec. This file contains a set of pointers to the log files on the host
and on every node. The log files are placed in the subdirectories that are created when the
application starts and deleted when the application successfully completes.

If the application encounters a software failure, all nodes and their respective files will be rolled
back to their initial state, as if the application had not run at all. If the program contains
checkpoint commands, the state restored is that of the most recent checkpoint.
Specifically, the Ab Initio system will:
    Kill all processes running on all nodes, including control processes and processes that
    constitute the partitions of a parallel program.
    Cleanly shut down all data flows.
    Roll back the effects of all file changes.
    Report the state of the system.

It is not always possible for the Co>Operating System to restore the system to an earlier state.
For example, a failure could occur because a node or its native operating system crashed. In this
case, it is not possible to cleanly shut down flow or file operations, nor to roll back file
operations performed in the current phase. In fact, it is likely that stray files (intermediate
temporaries) will be left lying around. To complete the cleanup and get the job running again, you
must perform a manual rollback. For this, we use the command m_rollback.

m_rollback [-d] [-i] [-h] recovery-file

-d: Delete the job along with its recovery file and any log files it created.
-i: Display the state of the job and prompt the user whether the job should be deleted.
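
One way to spot jobs that may need a manual rollback is to look for leftover recovery files. The sketch below is a hypothetical helper, not an Ab Initio command: it lists the jobname.rec files in a directory and prints an m_rollback -i command for each one, leaving it to the operator to decide whether to run it.

```shell
# Hypothetical helper: suggest an interactive rollback for each recovery file.
# usage: suggest_rollbacks [directory]    (defaults to the current directory)
suggest_rollbacks() {
    dir=${1:-.}
    for rec in "$dir"/*.rec; do
        [ -e "$rec" ] || continue    # no .rec files: the pattern stays unexpanded
        echo "m_rollback -i $rec"
    done
}

# Print suggestions for the current working directory (nothing if it is clean).
suggest_rollbacks .
```

Printing the commands instead of running them keeps the decision with the operator, which matches the prompting behaviour of the -i option.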
If the -i option is not used, jobs that have reached their first checkpoint will be rolled back to the
checkpoint. Jobs that do not include checkpoints or that did not reach their first checkpoint will be


Monitoring is controlled in either (or both) of two ways:

From the shell, set the configuration variable XX_REPORT before running the job.
Within the script, supply arguments to the report option to the mp run command.

The keywords are:


export XX_REPORT="flows times interval=10"    (ksh)

mp run -report flows times interval=10    (in script)

File Skew

Skew is only of concern if it is large (say, over 25%) and if large amounts of data or CPU time
are involved.

Situations that might lead to skew are an overloaded node, unbalanced data, or
different node speeds.
    An overloaded node: If a node is overloaded, then data flows will tend to show up
    as initially skewed, but the skew will go to zero at the end of the run.
    Unbalanced data: If different partitions of a data flow have different amounts of data,
    then both data and CPU time will be skewed at the end of the run.
    Different node speeds: If some nodes are faster than others, then skew is likely to
    result. In this case, CPU times will be skewed at the end of the run, but data will not.
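
The notes give a 25% threshold but not the formula behind it. As a sketch only, the helper below assumes that the skew of a partition is (size - average size) / largest size, expressed as a percentage. Both this definition and the name partition_skew are assumptions, not taken from the text above.

```shell
# Hypothetical helper: print the skew of each partition from its record count.
# ASSUMPTION: skew% = (size - average size) / largest size * 100.
# usage: partition_skew <size-of-partition-0> <size-of-partition-1> ...
partition_skew() {
    printf '%s\n' "$@" | awk '
        { size[NR] = $1; total += $1; if ($1 > max) max = $1 }
        END {
            if (NR == 0 || max == 0) exit    # nothing to measure
            avg = total / NR
            for (i = 1; i <= NR; i++)
                printf "partition %d: %.1f%%\n", i - 1, (size[i] - avg) / max * 100
        }'
}

# Four partitions, one of them three times the size of the others.
partition_skew 100 100 100 300
```

Under this assumed definition the oversized partition comes out at 50.0%, well above the 25% guideline, while the other three are negative because they hold less than the average.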


The XX_DEBUG environment variable controls the tracing and debugging of processes.

The IWAIT mechanism is a simple job-tracking system that lets us detect and handle processes
that fail. We must set IWAIT in order to use any tracing or debugging features.


AB_SUPPRESS_HISTORY_CHECK: Permits changing parameters when restarting a checkpointed
mp job.


of remote connections.

AB_NODES: Used for defining node aliases.


The m_attach utility accelerates job startup on IBM SP configurations of 9 or more nodes.
