
m_dump metadata [data] [action]

m_dump produces a human-readable report that shows how input data is interpreted by Ab Initio metadata. It can print the metadata or a description of the metadata. It can also evaluate specified expressions within each record.

metadata is one of:
filename: Read metadata from file.
-string string: Read metadata from string.

data is one of:
filename: Read data from file. Specify with URL for a remote file or multifile.
-string string: Read data from string.
- [hyphen]: Read data from standard input.

action is zero or one of:
-print-metadata: Print metadata.
-describe: Describe structure of metadata: the names, offsets, sizes, and types for every field.
-print-data: Default: Print data to standard output.
-no-print-data: Suppress printing of data.
-print expression: Evaluate expression for each record displayed and print result.
-start recnum: Start data printing at record recnum.
-end recnum: End data printing at record recnum.
-record recnum: Print data only for record recnum.
Note: the first record is record number 1, and start and end are inclusive.
-partition: Print an individual partition of a multifile. This option must appear last on the command line. Partitions are numbered in the range 0-n, inclusive.
-report: Produce monitor reports as specified in the variable XX_REPORT.
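The synopsis above can be illustrated with a few command lines. This is a minimal sketch, not a definitive usage guide: the file names customer.dml and customer.dat are hypothetical, and the commands are only assembled and printed so they can be reviewed before being run on a system where m_dump is installed. Only options listed above are used.

```shell
# Hypothetical file names (assumptions, not from the source text).
METADATA=customer.dml
DATA=customer.dat

# Describe the record structure: names, offsets, sizes, and types.
DESCRIBE_CMD="m_dump $METADATA -describe"

# Print records 1 through 10 (start and end are inclusive).
RANGE_CMD="m_dump $METADATA $DATA -start 1 -end 10"

# Print only record 5.
ONE_CMD="m_dump $METADATA $DATA -record 5"

# Show the assembled commands instead of executing them.
printf '%s\n' "$DESCRIBE_CMD" "$RANGE_CMD" "$ONE_CMD"
```

To run one of them for real, paste the printed line into a shell that has the Ab Initio environment set up.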

m_attach: Ab Initio provides this shell command to facilitate remote startup on large parallel systems.
m_env: Displays the current settings of the Ab Initio environment variables. Invoke m_env with the option h for added help (m_env h).

Environment Variables:
Set these environment variables if we want a value different from the default.
XX_TIMEOUT=seconds: The time-out interval for certain operations, such as starting a remote process. Default is 30 seconds.
XX_MAX_RECORD_BUFFER=bytes: Maximum buffer size that certain parts of the system will use to hold a record. Default is 5 million bytes.
XX_NICE=priority: Run jobs on remote nodes at the specified priority.
XX_SORT_MAX_CORE=megabytes: The default value for the max-core argument to the local-sort component. Default is 10 megabytes.

Special Ab Initio Facilities:
HOST_ALIAS_FILE=path: File containing hostname aliases.
XX_CATALOG=path: Location of user-created metadata catalogs.
XX_REPORT=keyword: Monitor the current job and produce reports.

Debugging:
IWAIT=true: Enable debugging via interactive wait.
XX_DEBUG=value: Set debugging mode.
DISPLAY=display_id: The X Windows display. Used to pop up debuggers.
TRACE_ALL_SOCS=path: Trace all process SOC events to files named program-name.soc in the directory specified by path.
LAUNCHER_TRACE: Enable trace output from the low-level layer that controls remote process control and job recovery.
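As a sketch of how these settings might be applied from a ksh-style shell before starting a job: the values below are illustrative assumptions chosen only to differ from the stated defaults, not recommendations.

```shell
# Illustrative values only (assumptions); the stated defaults are
# 30 seconds and 5 million bytes.
export XX_TIMEOUT=120
export XX_MAX_RECORD_BUFFER=10000000

# A value containing spaces must be quoted, otherwise the shell
# treats each word as a separate argument to export.
export XX_REPORT="flows times interval=10"

# Confirm what the job will inherit.
echo "XX_TIMEOUT=$XX_TIMEOUT"
echo "XX_MAX_RECORD_BUFFER=$XX_MAX_RECORD_BUFFER"
echo "XX_REPORT=$XX_REPORT"
```

Exported variables are inherited only by processes started from the same shell session afterwards.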

An Ab Initio application is a set of mp commands, beginning with mp job and ending (usually) with mp run. In between are commands that identify the program components and indicate the flow of data from one to the next. Thus, the mp script usually both defines and runs the job.

When a script is invoked, the mp job command executes. At this point, the system creates two files in the current working directory:
jobname.job: A text file. As the rest of the script is read, a text representation of the application being defined is placed here.
.abinitio-current-job: This file contains jobname, which lets the system know the name of the current job.

If two or more mp jobs are running in the same directory at the same time, one job will overwrite the other's .abinitio-current-job file. To avoid this problem, use the environment variable AB_JOB. When AB_JOB is set, all mp commands use its value as the name of the current mp job, ignoring the name stored in .abinitio-current-job.

An Ab Initio application may be designed to execute in sequential phases, with or without checkpointing, which means saving state to disk between phases. Phased execution is enabled from within the application, if the script developer has inserted the command mp phase or mp checkpoint between one component and another. Phasing makes a difference in how the application uses system resources, often trading off performance for safety. Phasing inhibits pipeline parallelism but guarantees that resource-intensive stages will not compete with each other.

When a job does not complete normally, it leaves a file named jobname.rec in the working directory on the host system. This file contains a set of pointers to the log files on the host and on every node. The log files are placed in subdirectories that are created when the application starts and deleted when the application completes successfully.

If the application encounters a software failure, all nodes and their respective files will be rolled back to their initial state, as if the application had not been run at all. If the program contains checkpoint commands, the state restored is that of the most recent checkpoint.
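Returning to the AB_JOB setting described above, a minimal sketch of giving each job run in a shared directory its own name. The job name load_customers is hypothetical, and it is an assumption here that it should match the name used by the script's mp job command.

```shell
# Hypothetical job name (assumption); AB_JOB itself is the variable
# named in the text.
AB_JOB=load_customers
export AB_JOB

# mp commands started from this shell now take the current job name
# from AB_JOB rather than from .abinitio-current-job.
echo "current mp job: $AB_JOB"
```

Setting the variable in each job's own shell session keeps concurrent jobs in one directory from depending on the shared .abinitio-current-job file.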

Specifically, on a software failure the Ab Initio system will:
Kill all processes running on all nodes, including control processes and processes that constitute the partitions of a parallel program.
Cleanly shut down all data flows.
Roll back the effects of all file changes.
Report the state of the system.
Exit.

It is not always possible for the Co>Operating System to restore the system to an earlier state. For example, a failure could occur because a node or its native operating system crashed. In this case, it is not possible to cleanly shut down flow or file operations, nor to roll back file operations performed in the current phase. In fact, it is likely that stray files (intermediate temporaries) will be left lying around. To complete the cleanup and get the job running again, you must perform a manual rollback. For this, we use the command m_rollback.

m_rollback [-d] [-i] [-h] recovery file

-d: Delete the job along with its recovery file and any log files it created.
-i: Display the state of the job and prompt the user whether the job should be deleted.
If the -i option is not used, jobs that have reached their first checkpoint will be rolled back to the checkpoint. Jobs that do not include checkpoints or that did not reach their first checkpoint will be deleted.

Monitoring
Monitoring is controlled in either (or both) of two ways:
From the shell, set the configuration variable XX_REPORT before running the job.
Within the script, supply arguments to the report option to the mp run command.

The keywords are:
verbose-errors
expanded-graph
flows
times
skew
skew=n
scroll=mode
file=filename
interval=n
table-flows

export XX_REPORT="flows times interval=10" (ksh)
mp run report flows times interval=10 (in script)

File Skew
Skew is only of concern if it is large (say, over 25%) and if large amounts of data or CPU time are involved. Situations that might lead to skew are an overloaded node, unbalanced data, or different node speeds.

An overloaded node: If a node is overloaded, then data flows will tend to show up as initially skewed, but the skew will go to zero at the end of the run.
Unbalanced data: If different partitions of a data flow have different amounts of data, then both data and CPU time will be skewed at the end of the run.
Different node speeds: If some nodes are faster than others, then skew is likely to result. In this case, CPU times will be skewed at the end of the run, but not data volumes.

Debugging
The XX_DEBUG environment variable controls the tracing and debugging of processes. The IWAIT mechanism is a simple job-tracking system that lets us detect and handle processes that fail. We must set IWAIT in order to use any tracing or debugging features.

Administration
AB_SUPPRESS_HISTORY_CHECK: Permits changing parameters when restarting a checkpointed mp job.
AB_CONNECTION, AB_CONNECTION_SCRIPT, AB_PASSWORD, AB_USER: Control aspects of remote connections.
AB_NODES: Used for defining node aliases.

Performance
The m_attach utility accelerates job startup on IBM SP configurations of 9 or more nodes.