
m_dump metadata [data] [action]

It produces a human-readable report that shows how input data is interpreted by Ab Initio
metadata. It can print the metadata or a description of the metadata. It can also evaluate
specified expressions within each record.

metadata is one of:
filename: Read metadata from file.
-string string: Read metadata from string.

data is one of:
filename: Read data from file. Specify with URL for a remote file or multifile.
-string string: Read data from string.
- [hyphen]: Read data from standard input.

action is zero or one of:
-print-metadata: Print metadata.
-describe: Describe structure of metadata: the names, offsets, sizes, and types for every field.
-print-data: Default: Print data to standard output.
-no-print-data: Suppress printing of data.
-print expression: Evaluate expression for each record displayed and print result.
-start recnum: Start data printing at record recnum.
-end recnum: End data printing at record recnum.
-record recnum: Print data only for record recnum.
Note: the first record is record number 1, and start and end are inclusive.
-partition: Print an individual partition of a multifile. This option must appear last on the
command line. Partitions are numbered in the range 0-n, inclusive.
-report: Produce monitor reports as specified in the variable XX_REPORT.
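
As a rough illustration of the synopsis above, the following shell fragment builds an m_dump command line that prints records 1 through 10 of a data file. The file names customer.dml and customer.dat are hypothetical placeholders, and the fragment runs m_dump only if it is found on the PATH; otherwise it prints the command it would have run.

```shell
# Hypothetical inputs: replace with a real metadata file and data file.
METADATA=customer.dml
DATA=customer.dat

# Print records 1 through 10; -start and -end are inclusive, numbering starts at 1.
DUMP_CMD="m_dump $METADATA $DATA -start 1 -end 10"

if command -v m_dump >/dev/null 2>&1; then
    $DUMP_CMD
else
    echo "m_dump not on PATH; would run: $DUMP_CMD"
fi
```

Keeping the command in a variable makes it easy to log exactly what was run when the output is later compared against the metadata.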

m_attach: Ab Initio provides this shell command to facilitate remote startup on large parallel
systems.
m_env: Displays the current settings of the Ab Initio environment variables. Invoke m_env with
the option h for added help (m_env h).
Environment Variables: Set these environment variables if we want a value different from the
default.
XX_TIMEOUT=seconds: The time-out interval for certain operations, such as starting a remote
process. Default is 30 seconds.
XX_MAX_RECORD_BUFFER=bytes: Maximum buffer size that certain parts of the system will use to
hold a record. Default is 5 million bytes.
XX_NICE=priority: Run jobs on remote nodes at the specified priority.
XX_SORT_MAX_CORE=megabytes: The default value for the max-core argument to the local-sort
component. Default is 10 megabytes.
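
A minimal sketch of overriding two of these defaults from the shell before starting a job. The values are illustrative only and are not recommendations; the final line simply lists what a job started from this shell would inherit.

```shell
# Illustrative values only, chosen to differ from the documented defaults.
export XX_TIMEOUT=120                  # seconds; documented default is 30
export XX_MAX_RECORD_BUFFER=10000000   # bytes; documented default is 5 million

# Show the XX_ settings that child processes of this shell will inherit.
env | grep '^XX_' | sort
```

Exporting the variables in the same shell (or wrapper script) that launches the job keeps the overrides scoped to that run.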

Special Ab Initio Facilities:
HOST_ALIAS_FILE=path: File containing hostname aliases.
XX_CATALOG=path: Location of user-created metadata catalogs.
XX_REPORT=keyword: Monitor the current job and produce reports.

Debugging:
IWAIT=true: Enable debugging via interactive wait.
XX_DEBUG=value: Set debugging mode.
DISPLAY=display_id: The X Windows display. Used to pop up debuggers.
TRACE_ALL_SOCS=path: Trace all process SOC events to files named program-name.soc in the
directory specified by path.
LAUNCHER_TRACE: Enable trace output from the low-level layer that controls remote process
control and job recovery.

An Ab Initio application is a set of mp commands, beginning with mp job and ending (usually)
with mp run. In between are commands that identify the program components and indicate the
flow of data from one to the next. Thus, the mp script usually defines and runs the job.
When a script is invoked, the mp job command executes. At this point, the system creates two
files in the current working directory:
jobname.job: As the rest of the script is read, a text representation of the application
being defined is placed here. The file is a text file.
.abinitio-current-job: This file contains jobname, which enables the system to know the
name of the current job.
If two or more mp jobs are running in the same directory at the same time, one job will overwrite
the other's .abinitio-current-job file. To avoid this problem, use the environment variable
AB_JOB. When AB_JOB is set, all mp commands use its value as the name of the current mp
job, ignoring the name stored in .abinitio-current-job.
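
A minimal sketch of the AB_JOB workaround, assuming each concurrent run is started from its own shell. The job name prefix nightly_load is hypothetical, and the process ID suffix is just one way to keep names distinct; how AB_JOB should relate to the name given to mp job is not covered here.

```shell
# Hypothetical job name; the shell's process ID keeps concurrent runs distinct.
AB_JOB="nightly_load_$$"
export AB_JOB

echo "mp commands in this shell will use job name: $AB_JOB"
```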
An Ab Initio application may be designed to execute in sequential phases with or without
checkpointing, which means saving state to disk between phases.
Phased execution is enabled from within the application when the script developer has inserted
the command mp phase or mp checkpoint between one component and another.
Phasing makes a difference in how the application uses the system resources, often trading off
performance for safety. Phasing inhibits pipeline parallelism but guarantees that
resource-intensive stages will not compete with each other.
When a job does not complete normally, it leaves a file in the working directory on the host
system with the name jobname.rec. This file contains a set of pointers to the log files on the host
and on every node. The log files are placed in the subdirectories that are created when the
application starts and deleted when the application successfully completes.

If the application encounters a software failure, all nodes and their respective files will be rolled
back to their initial state, as if the application were not run at all. If the program contains
checkpoint commands, the state restored is that of the most recent checkpoint.
Specifically, the Ab Initio system will:
Kill all processes running on all nodes, including control processes and processes that
constitute the partitions of a parallel program.
Cleanly shut down all data flows.
Roll back the effects of all file changes.
Report the state of the system.
Exit.
It is not always possible for the Co>Operating System to restore the system to an earlier state.
For example, a failure could occur because a node or its native operating system crashed. In this
case, it is not possible to cleanly shut down flow or file operations, nor to roll back file
operations performed in the current phase. In fact, it is likely that stray files (intermediate
temporaries) will be left lying around. To complete the cleanup and get the job running again, you
must perform a manual rollback. For this, we use the command m_rollback.
m_rollback [-d] [-i] [-h] recovery file
-d: Delete the job along with its recovery file and any log files it created.
-i: Display the state of the job and prompt the user whether the job should be deleted.
If the -i option is not used, jobs that have reached their first checkpoint will be rolled back to
the checkpoint. Jobs that do not include checkpoints or that did not reach their first checkpoint
will be deleted.
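
The manual rollback step can be sketched as a small shell loop, assuming the recovery files (jobname.rec) were left in the current working directory. It uses only the -i option described above, so nothing is deleted without a prompt, and it only prints the command if m_rollback is not on the PATH.

```shell
# Look for recovery files left in the current directory by jobs that failed.
REC_COUNT=0
for rec in ./*.rec; do
    [ -e "$rec" ] || continue          # the glob matched nothing
    REC_COUNT=$((REC_COUNT + 1))
    if command -v m_rollback >/dev/null 2>&1; then
        # -i displays the job state and asks before deleting anything.
        m_rollback -i "$rec"
    else
        echo "m_rollback not on PATH; would run: m_rollback -i $rec"
    fi
done
echo "recovery files found: $REC_COUNT"
```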
Monitoring
Monitoring is controlled in either (or both) of two ways:
From the shell, set the configuration variable XX_REPORT before running the job.
Within the script, supply arguments to the report option to the mp run command.
The keywords are:
verbose-errors
expanded-graph
flows
times
skew
skew=n
scroll=mode
file=filename
interval=n
table-flows
export XX_REPORT="flows times interval=10" (ksh)
mp run report flows times interval=10 (in script)
File Skew
Skew is only of concern if it is large (say, over 25%) and if large amounts of data or CPU time
are involved.

Situations that might lead to skew are an overloaded node, unbalanced data, or
different node speeds.

An overloaded node: If a node is overloaded, then data flows will tend to show up
as initially skewed, but the skew will go to zero at the end of the run.
Unbalanced Data: If different partitions of a data flow have different amounts of data,
then both data and CPU time will be skewed at the end of the run.
Different node speeds: If some nodes are faster than others, then skew is likely to
result. In this case, CPU times will be skewed at the end of the run, but not data
volumes.

Debugging
The XX_DEBUG environment variable controls the tracing and debugging of processes.
The IWAIT mechanism is a simple job-tracking system that lets us detect and handle processes
that fail. We must set IWAIT in order to use any tracing or debugging features.
Administration
AB_SUPPRESS_HISTORY_CHECK: Permits changing parameters when restarting a
checkpointed mp job.
AB_CONNECTION, AB_CONNECTION_SCRIPT, AB_PASSWORD, AB_USER: control aspects
of remote connections.
AB_NODES: used for defining node aliases.
Performance
The m_attach utility accelerates job startup on IBM SP configurations of 9 or more nodes.
