Teradata Utilities MultiLoad
Table of Contents
Chapter 4: Multiload........................................................................................................................1
Why it is Called "Multi" Load..................................................................................................1
Two MultiLoad Modes: IMPORT and DELETE................................................................1
Block and Tackle Approach.............................................................................................2
MultiLoad Imposes Limits.................................................................................................3
Error Tables, Work Tables and Log Tables...........................................................................3
Supported Input Formats.......................................................................................................4
MultiLoad Has Five IMPORT Phases....................................................................................5
Phase 1: Preliminary Phase.............................................................................................5
Phase 2: DML Transaction Phase...................................................................................6
Phase 3: Acquisition Phase.............................................................................................6
Phase 4: Application Phase.............................................................................................7
Phase 5: Clean Up Phase................................................................................................7
MultiLoad Commands............................................................................................................8
Two Types of Commands................................................................................................8
Parameters for .BEGIN IMPORT MLOAD.............................................................................9
Parameters for .BEGIN DELETE MLOAD...........................................................................12
A Simple Multiload IMPORT Script......................................................................................12
Building our Multiload Script................................................................................................13
Executing Multiload..............................................................................................................14
Another Simple MultiLoad IMPORT Script...........................................................................15
MultiLoad IMPORT Script....................................................................................................18
Error Treatment Options for the .DML LABEL Command....................................................19
An IMPORT Script with Error Treatment Options................................................................21
A IMPORT Script that Uses Two Input Data Files...............................................................22
Redefining the INPUT..........................................................................................................24
A Script that Uses Redefining the Input...............................................................................24
DELETE MLOAD Script Using a Hard Coded Value...........................................................26
A DELETE MLOAD Script Using a Variable........................................................................27
An UPSERT Sample Script..................................................................................................28
What Happens when MultiLoad Finishes.............................................................................29
MultiLoad Statistics........................................................................................................29
Troubleshooting Multiload Errors.........................................................................................30
RESTARTing Multiload........................................................................................................31
RELEASE MLOAD: When You DON'T Want to Restart MultiLoad.....................................31
MultiLoad and INMODs........................................................................................................32
How Multiload Compares with FastLoad.............................................................................32
Chapter 4: Multiload
"In the end we'll remember not the sound of our enemies, but the silence of our
friends."
- Martin Luther King Jr.
This feature-rich utility can perform multiple types of DML tasks, including INSERT, UPDATE,
DELETE and UPSERT on up to five (5) empty or populated target tables at a time. These DML
functions may be run either solo or in combinations, against one or more tables. For these reasons,
MultiLoad is the utility of choice when it comes to loading populated tables in the batch environment.
As the volume of data being loaded or updated in a single block increases, the performance of MultiLoad
improves. MultiLoad shines when it can impact more than one row in every data block. In other
words, MultiLoad looks at massive amounts of data and says, "Bring it on!"
Leo Tolstoy once said, "All happy families resemble each other." Like happy families, the Teradata
load utilities resemble each other, although they may have some differences. You are going to be
pleased to find that you do not have to learn all new commands and concepts for each load utility.
MultiLoad has many similarities to FastLoad. It has even more commands in common with TPump.
The similarities will be evident as you work with them. Where there are some quirky differences, we
will point them out for you.
The MultiLoad DELETE mode is used to perform a global (all AMP) delete on just one table. The
reason to use .BEGIN DELETE MLOAD is that it bypasses the Transient Journal (TJ) and can be
RESTARTed if an error causes it to terminate prior to finishing. When performing in DELETE mode,
the DELETE SQL statement cannot reference the Primary Index in the WHERE clause. This is due to
the fact that a primary index access goes to a specific AMP, whereas this is a global operation.
The other factor that makes a DELETE mode operation so good is that it examines an entire block
of rows at a time. Once all the eligible rows have been removed, the block is written one time and a
checkpoint is written. So, if a restart is necessary, it simply starts deleting rows from the next block
without a checkpoint. This is a smart way to continue. Remember, when using the TJ all deleted
rows are put back into the table from the TJ as a rollback. A rollback can take longer to finish than
the delete. MultiLoad does not do a rollback; it does a restart.
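The DELETE mode described above can be sketched as a minimal script. Everything here is hypothetical (the logon string, the CDW_Log table, the SQL01.Employee_Dept table and its Del_Date column); it simply illustrates the shape of a .BEGIN DELETE MLOAD job:

```
.LOGTABLE SQL01.CDW_Log;            /* checkpoint/restart log                  */
.LOGON cdw/sql01,password;
.BEGIN DELETE MLOAD TABLES SQL01.Employee_Dept;

DELETE FROM SQL01.Employee_Dept
WHERE Del_Date < '2002-02-01';      /* non-PI column: a global, all-AMP delete */

.END MLOAD;
.LOGOFF;
```

Because the WHERE clause does not use the Primary Index, every AMP removes its eligible rows a block at a time, and no Transient Journal entries are written.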
Reprinted for ibmkvskumar@in.ibm.com, IBM Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition 2
In the above diagram, monthly data is being stored in a quarterly table. To keep the contents limited
to four months, monthly data is rotated in and out. At the end of every month, the oldest month of
data is removed and the new month is added. The cycle is "add a month, delete a month, add a
month, delete a month." In our illustration, that means that January data must be deleted to make
room for May's data.
Here is a question for you: What if there was another way to accomplish this same goal without
consuming all of these extra resources? To illustrate, let's consider the following scenario: Suppose
you have TableA that contains 12 billion rows. You want to delete a range of rows based on a date
and then load in fresh data to replace these rows. Normally, the process is to perform a MultiLoad
DELETE to DELETE FROM TableA WHERE <date-column> < '2002-02-01'. The final step would be
to INSERT the new rows for May using MultiLoad IMPORT.
Amazingly, MultiLoad has full RESTART capability in all of its five phases of operation. Once again,
this demonstrates its tremendous flexibility as a load utility. Is it pure magic? No, but it almost
seems so. MultiLoad makes effective use of two error tables to save different types of errors and a
LOGTABLE that stores built-in checkpoint information for restarting. This is why MultiLoad does not
use the Transient Journal, thus averting time-consuming rollbacks when a job halts prematurely.
Here is a key difference to note between MultiLoad and FastLoad. Sometimes an AMP (Access
Module Processor) fails and the system administrators say that the AMP is "down" or "offline."
When using FastLoad, you must restart the AMP to restart the job. MultiLoad, however, can
continue running when an AMP fails, if the table is fallback protected. At the same time, you can
use the AMPCHECK option to make it work like FastLoad if you want.
Rule #2: Referential Integrity is not supported. MultiLoad will not load data into tables that are
defined with Referential Integrity (RI). Like a USI, this requires the AMPs to communicate with each
other. So, RI constraints must be dropped from the target table prior to using MultiLoad.
Rule #3: Triggers are not supported at load time. Triggers cause actions on related tables based
upon what happens in a target table. Again, this is a multi-AMP operation and to a different table. To
keep MultiLoad running smoothly, disable all Triggers prior to using it.
Rule #4: No concatenation of input files is allowed. MultiLoad does not want you to do this
because it could impact a restart if the files were concatenated in a different sequence or data
was deleted between runs.
Rule #5: The host will not process aggregates, arithmetic functions or exponentiation. If you
need data conversions or math, you might be better off using an INMOD to prepare the data prior to
loading it.
HINT: Sometimes a company wants all of these load support tables to be housed in a particular
database. When these tables are to be stored in any database other than the user's own default
database, then you must give them a qualified name (<databasename>.<tablename>) in the script
or use the DATABASE command to change the current database.
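As a sketch of the two approaches in the HINT, assuming a hypothetical work database named CORP_WORK:

```
/* Option 1: qualify the support table explicitly */
.LOGTABLE CORP_WORK.CDW_Log;

/* Option 2: change the current database so unqualified
   worktables and error tables are created in CORP_WORK */
DATABASE CORP_WORK;
```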
Where will you find these tables in the load script? The Logtable is generally identified immediately
prior to the .LOGON command. Worktables and error tables can be named in the BEGIN MLOAD
statement. Do not underestimate the value of these tables. They are vital to the operation of
MultiLoad. Without them a MultiLoad job cannot run. Now that you have had the "executive
summary", let's look at each type of table individually.
Two Error Tables: Here is another place where FastLoad and MultiLoad are similar. Both require
the use of two error tables per target table. MultiLoad will automatically create these tables. Rows
are inserted into these tables only when errors occur during the load process. The first error table is
the acquisition Error Table (ET). It contains all translation and constraint errors that may occur
while the data is being acquired from the source(s).
The second is the Uniqueness Violation (UV) table that stores rows with duplicate values for
Unique Primary Indexes (UPI). Since a UPI must be unique, MultiLoad can only load one
occurrence into a table. Any duplicate value will be stored in the UV error table. For example, you
might see a UPI error that shows a second employee number "99." In this case, if the name for
employee "99" is Kara Morgan, you will be glad that the row did not load since Kara Morgan is
already in the Employee table. However, if the name showed up as David Jackson, then you know
that further investigation is needed, because employee numbers must be unique.
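When a load leaves rows behind, ordinary SQL is enough to inspect them. The statements below assume the default error table names against a hypothetical Employee target table; the ErrorCode and ErrorField columns are the ones MultiLoad normally records, but verify the layout on your system:

```sql
SELECT ErrorCode, ErrorField, COUNT(*)
FROM   ET_Employee        /* acquisition (translation/constraint) errors */
GROUP  BY 1, 2;

SELECT *
FROM   UV_Employee;       /* rows that duplicated a UPI value            */
```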
You have the option to name these tables in the MultiLoad script (shown later). Alternatively, if you
do not name them, they default to ET_<target_table_name> and UV_<target_table_name>. In
either case, MultiLoad will not accept error table names that are the same as target table names. It
does not matter what you name them. It is recommended that you standardize on the naming
convention to make it easier for everyone on your team. For more details on how these error tables
can help you, see the subsection in this chapter titled, "Troubleshooting MultiLoad Errors."
Log Table: MultiLoad requires a LOGTABLE. This table keeps a record of the results from each
phase of the load so that MultiLoad knows the proper point from which to RESTART. There is one
LOGTABLE for each run. Since MultiLoad will not resubmit a command that has been run
previously, it will use the LOGTABLE to determine the last successfully completed step.
Work Table(s): MultiLoad will automatically create one worktable for each target table. This means
that in IMPORT mode you could have one or more worktables. In the DELETE mode, you will only
have one worktable since that mode only works on one target table. The purpose of worktables is to
hold two things: the DML tasks requested and the input data that is ready to APPLY to the AMPs.
The worktables are created in a database using PERM space. They can become very large. If the
script uses multiple SQL statements for a single data record, the data is sent to the AMP once for
each SQL statement. This replication guarantees fast performance and that no SQL statement will
ever be done more than once. So, this is very important. However, there is no such thing as a free
lunch; the cost is space. Later, you will see that using a FILLER field can help reduce this disk
space by not sending unneeded data to an AMP. In other words, the efficiency of the MultiLoad run
is in your hands.
BINARY: Each record is a 2-byte integer, n, that is followed by n bytes of data. A byte is the
smallest unit of storage for Teradata.
FASTLOAD: This format is the same as BINARY, plus a marker (X'0A' or X'0D') that specifies
the end of the record.
TEXT: Each record has an arbitrary number of bytes and is followed by an end-of-record
marker.
UNFORMAT: The format for these input records is defined in the LAYOUT statement of the
MultiLoad script using the components FIELD, FILLER and TABLE.
VARTEXT: This is a variable-length text RECORD format separated by delimiters such as a
comma. For this format you may only use VARCHAR, LONG VARCHAR (IBM) or
VARBYTE data formats in your MultiLoad LAYOUT. Note that two delimiter
characters in a row will result in a null value between them.
Figure 5-1
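As an illustration of the VARTEXT format, here is a sketch of a comma-delimited layout; the layout name, field names and the INSERTS label are made up for the example:

```
.LAYOUT FILEIN;
.FIELD Employee_No  * VARCHAR(11);   /* VARTEXT fields must be VARCHAR,   */
.FIELD Last_Name    * VARCHAR(20);   /* LONG VARCHAR or VARBYTE           */
.FIELD Salary       * VARCHAR(10);

.IMPORT INFILE employee.txt
 FORMAT VARTEXT ','                  /* the comma is the delimiter        */
 LAYOUT FILEIN
 APPLY INSERTS;
```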
Let's take a look at each phase and see what it contributes to the overall load process of this
magnificent utility. Should you memorize every detail about each phase? Probably not. But it is
important to know the essence of each phase because sometimes a load fails. When it does, you
need to know in which phase it broke down since the method for fixing the error to RESTART may
vary depending on the phase. And if you can picture what MultiLoad actually does in each phase,
you will likely write better scripts that run more efficiently.
Second, all MultiLoad sessions with Teradata need to be established. The default is the number of
available AMPs. Teradata will quickly establish this number, using a factor of 16 as the basis for
the number of sessions to create. The general rule of thumb for the number of sessions on
smaller systems is the following: use the number of AMPs plus two more. For larger systems with
hundreds of AMP processors, the SESSIONS option is available to lower the default. Remember,
these sessions are running on your poor little computer as well as on Teradata.
Each session loads the data to Teradata across the network or channel. Every AMP plays an
essential role in the MultiLoad process. They receive the data blocks, hash each row and send the
rows to the correct AMP. When the rows come to an AMP, it stores them in worktable blocks on
disk. But, lest we get ahead of ourselves, suffice it to say that there is ample reason for multiple
sessions to be established.
What about the extra two sessions? Well, the first one is a control session to handle the SQL and
logging. The second is a backup or alternate for logging. You may have to use some trial and error
to find what works best on your system configuration. If you specify too few sessions it may impair
performance and increase the time it takes to complete load jobs. On the other hand, too many
sessions will reduce the resources available for other important database activities.
Third, the required support tables are created. They are the following:
Type of Table Table Details
ERRORTABLES MultiLoad requires two error tables per target table. The first error
table contains constraint violations, while the second error table
stores Unique Primary Index violations.
WORKTABLES Work Tables hold two things: the DML tasks requested and the
input data that is ready to APPLY to the AMPs.
LOGTABLE The LOGTABLE keeps a record of the results from each phase of
the load so that MultiLoad knows the proper point from which to
RESTART.
Figure 5-2
The final task of the Preliminary Phase is to apply utility locks to the target tables. Initially, access
locks are placed on all target tables, allowing other users to read or write to the table for the time
being. However, this lock does prevent the opportunity for a user to request an exclusive lock.
Although these locks will still allow the MultiLoad user to drop the table, no one else may DROP or
ALTER a target table while it is locked for loading. This leads us to Phase 2.
Later, during the Acquisition phase the actual input data will also be stored in the worktable so that it
may be applied in Phase 4, the Application Phase. Next, a match tag is assigned to each DML
request that will match it with the appropriate rows of input data. The match tags will not actually be
used until the data has already been acquired and is about to be applied to the worktable. This is
somewhat like a student who receives a letter from the university in the summer that lists his
courses, professor's names, and classroom locations for the upcoming semester. The letter is a
"match tag" for the student to his school schedule, although it will not be used for several months.
This matching tag for SQL and data is the reason that the data is replicated for each SQL statement
using the same data record.
At this point, Teradata does not care about which AMP receives the data block. The blocks are
simply sent, one after the other, to the next AMP in line. For their part, each AMP begins to deal
with the blocks that they have been dealt. It is like a game of cards - you take the cards that you
have received and then play the game. You want to keep some and give some away.
Similarly, the AMPs will keep some data rows from the blocks and give some away. The AMP
hashes each row on the primary index and sends it over the BYNET to the proper AMP where it will
ultimately be used. But the row does not get inserted into its target table, just yet. The receiving
AMP must first do some preparation before that happens. Don't you have to get ready before
company arrives at your house? The AMP puts all of the hashed rows it has received from other
AMPs into the worktables where it assembles them into the SQL. Why? Because once the rows are
reblocked, they can be sorted into the proper order for storage in the target table. Now the utility
places a load lock on each target table in preparation for the Application Phase. Of course, there is
no Acquisition Phase when you perform a MultiLoad DELETE task, since no data is being acquired.
Figure 5-3
Remember, MultiLoad allows for the existence of NUSI processing during a load. Every
hash-sequence sorted block from Phase 3 and each block of the base table is read only once to
reduce I/O operations to gain speed. Then, all matching rows in the base block are inserted,
updated or deleted before the entire block is written back to disk, one time. This is why the match
tags are so important. Changes are made based upon corresponding data and DML (SQL) based
on the match tags. They guarantee that the correct operation is performed for the rows and blocks
with no duplicate operations, a block at a time. And each time a table block is written to disk
successfully, a record is inserted into the LOGTABLE. This permits MultiLoad to avoid starting again
from the very beginning if a RESTART is needed.
What happens when several tables are being updated simultaneously? In this case, all of the
updates are scripted as a multi-statement request. That means that Teradata views them as a
single transaction. If there is a failure at any point of the load process, MultiLoad will merely need to
be RESTARTed from the point where it failed. No rollback is required. Any errors will be written to
the proper error table.
Phase 5 examines the final Error Code (&SYSRC). MultiLoad believes the adage, "All is well that ends
well." If the last error code is zero (0), all of the job steps have ended successfully (i.e., all has
certainly ended well). This being
the case, all empty error tables, worktables and the log table are dropped. All locks, both Teradata
and MultiLoad, are released. The statistics for the job are generated for output (SYSPRINT) and the
system count variables are set. After this, each MultiLoad session is logged off. So what happens if
the final error code is not zero? Stay tuned. Restarting MultiLoad is a topic that will be covered later
in this chapter.
MultiLoad Commands
Two Types of Commands
You may see two types of commands in MultiLoad scripts: tasks and support functions. MultiLoad
tasks are commands that are used by the MultiLoad utility for specific individual steps as it
processes a load. Support functions are those commands that involve the Teradata utility Support
Environment (covered in Chapter 9), are used to set parameters, or are helpful for monitoring a
load.
The chart below lists the key commands, their type, and what they do.
.BEGIN [IMPORT] MLOAD / .BEGIN DELETE MLOAD (Support): This command communicates
directly with Teradata to specify if the MultiLoad mode is going to be IMPORT or DELETE. Note
that the word IMPORT is optional in the syntax because it is the DEFAULT, but DELETE is
required. We recommend using the word IMPORT to make the coding consistent and easier for
others to read. Any parameters for the load, such as error limits or checkpoints, will be included
under the .BEGIN command, too. It is important to know which commands or parameters are
optional since, if you do not include them, MultiLoad may supply defaults that may impact your
load.

.DML LABEL (Task): The DML LABEL defines treatment options and labels for the application
(APPLY) of data for the INSERT, UPDATE, UPSERT and DELETE operations. A LABEL is simply
a name for a requested SQL activity. The LABEL is defined first, and then referenced later in the
APPLY clause.

.END MLOAD (Task): This instructs MultiLoad to finish the APPLY operations with the changes to
the designated databases and tables.

.FIELD (Task): This defines a column of the data source record that will be sent to the Teradata
database via SQL. When writing the script, you must include a FIELD for each data field you need
in SQL. This command is used with the LAYOUT command.

.FILLER (Task): Do not assume that MultiLoad has somehow uncovered much of what you used
in your term papers at the university! FILLER defines a field that is accounted for as part of the
data source's row format, but is not sent to the Teradata DBS. It is used with the LAYOUT
command.

.LAYOUT (Task): LAYOUT defines the format of the INPUT DATA record so Teradata knows what
to expect. If one record is not large enough, you can concatenate
Reprinted for ibmkvskumar@in.ibm.com, IBM Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition 9
Figure 5-4
PARAMETER (REQUIRED OR NOT): WHAT IT DOES

AMPCHECK {NONE|APPLY|ALL} (Optional): NONE specifies that MLOAD starts even with one
down AMP per cluster if all tables are Fallback. APPLY (the DEFAULT) specifies MLOAD will not
start or finish Phase 4 with a down AMP. ALL specifies MLOAD will not allow a down AMP during
any processing of a load job.

ERRORTABLES ET_ERR UV_ERR (Optional): Names the two error tables, two per target table.
Note there is no comma separator.

NOTIFY {LOW|MEDIUM|HIGH|OFF} (Optional): If you opt to use NOTIFY for any event during a
load, you may designate the priority of that notification: MEDIUM for important events, HIGH for
events at operational decision points, and OFF to eliminate any notification at all for a given
phase.

SESSIONS <MAX> <MIN> (Optional): This refers to the number of SESSIONS that should be
established with Teradata. For MultiLoad, the optimal number of sessions is the number of AMPs
in the system, plus two more.
Figure 5-5
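Putting several of the parameters above into one statement, a .BEGIN might look like the following sketch (the table names, session counts and checkpoint value are hypothetical):

```
.BEGIN IMPORT MLOAD
   TABLES      SQL01.Employee_Dept
   WORKTABLES  SQL01.WT_Employee_Dept
   ERRORTABLES SQL01.ET_Employee_Dept SQL01.UV_Employee_Dept  /* no comma */
   AMPCHECK    APPLY                  /* the default                      */
   SESSIONS    12 4                   /* max 12, min 4                    */
   CHECKPOINT  100000;
```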
Ask not what your MultiLoad can do for you; ask what you can do for your MultiLoad. MultiLoad is a
great tool when you're short on time. MultiLoad can update, insert, delete or upsert on Teradata
tables that are already populated. It can even do all four in one script. Our flat file will contain
Employee_numbers and Salaries * 2. We are giving a big raise. We're going to create a flat file to
use with MultiLoad, as shown below:
Remember, we'll still use the BTEQ utility to create our flat file.
Getting these scripts down is a very hard process, so don't be discouraged if you make a couple of
mistakes. The next two slides will show you a blank copy of the basic Multiload script, as well as a
marked slide illustrating the important parts of the script:
"If you don't know where you're going, any road will take you there."
- Lewis Carroll
Executing Multiload
You will feel like the King after executing your first MultiLoad script. MultiLoad is the Elvis Presley of
data warehousing because nobody knows how to make more records than MultiLoad. If you have the
ambition to learn, this book will give you what it takes to steer through these utilities. We initialize
the MultiLoad utility like we do with BTEQ, except that the keyword with MultiLoad is mload.
Remember that this MultiLoad is going to double the salaries of our employees.
"Those who dance are considered insane by those who cannot hear the music."
- George Carlin
MultiLoad can be somewhat intimidating to the new user because there are many commands and
phases. In reality, the load scripts are understandable when you think through what the IMPORT
mode does:
• Setting up a Logtable
This first script example is designed to show MultiLoad IMPORT in its simplest form. It depicts the
loading of a three-column Employee table. The actual script is in the left column and our comments
are on the right. Below the script is a step-by-step description of how this script works.
Step One: Setting up a Logtable and Logging onto Teradata — MultiLoad requires that you specify a
log table right at the outset with the .LOGTABLE command. We have called it CDW_Log. Once you
name the Logtable, it will be automatically created for you. The Logtable may be placed in the same
database as the target table, or it may be placed in another database. Immediately after this you log
onto Teradata using the .LOGON command. The order of these two commands is interchangeable,
but it is recommended to define the Logtable first and then to Log on, second. If you reverse the
order, Teradata will give a warning message. Notice that the commands in MultiLoad require a dot
in front of the command key word.
Step Two: Identifying the Target, Work and Error tables — In this step of the script you must tell
Teradata which tables to use. To do this, you use the .BEGIN IMPORT MLOAD command. Then
you will preface the names of these tables with the sub-commands TABLES, WORKTABLES and
ERRORTABLES. All you must do is name the tables and specify what database they are in. Work
tables and error tables are created automatically for you. Keep in mind that you get to name and
locate these tables. If you do not do this, Teradata might supply some defaults of its own!
At the same time, these names are optional. If the WORKTABLES and ERRORTABLES had not
specifically been named, the script would still execute and build these tables. They would have
been built in the default database for the user. The name of the worktable would be
WT_EMPLOYEE_DEPT1 and the two error tables would be called ET_EMPLOYEE_DEPT1 and
UV_EMPLOYEE_DEPT1, respectively.
Sometimes, large Teradata systems have a work database with a lot of extra PERM space. One
customer calls this database CORP_WORK. This is where all of the logtables and worktables are
normally created. You can use a DATABASE command to point all table creations to it or qualify the
names of these tables individually.
Step Three: Defining the INPUT flat file record structure — MultiLoad is going to need to know
the structure of the INPUT flat file. Use the .LAYOUT command to name the layout. Then list the fields
and their data types used in your SQL as a .FIELD. Did you notice that an asterisk is placed
between the column name and its data type? This means to automatically calculate the next byte in
the record. It is used to designate the starting location for this data based on the previous field's
length. If you are listing fields in order and need to skip a few bytes in the record, you can either use
the .FILLER (like above) to position the cursor at the next field, or the "*" on the Dept_No field
could have been replaced with the number 132 (CHAR(11)+CHAR(20)+CHAR(100)+1). Then, the
.FILLER is not needed. Also, if the input record fields are exactly the same as the table, the .TABLE
can be used to automatically define all the .FIELDS for you. The LAYOUT name will be referenced
later in the .IMPORT command. If the input file is created with INDICATORS, it is specified in the
LAYOUT.
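The positioning rules in Step Three can be sketched as follows. The field names are reconstructed from the walkthrough, and the lengths match the example's arithmetic (CHAR(11)+CHAR(20)+CHAR(100)+1 = 132):

```
.LAYOUT FILEIN;
.FIELD  Employee_No *  CHAR(11);   /* starts at byte 1                       */
.FIELD  Dept_Name   *  CHAR(20);   /* '*' = next byte after the prior field  */
.FILLER Comments    *  CHAR(100);  /* read from the file, never sent to AMPs */
.FIELD  Dept_No     *  CHAR(6);    /* '*' here could be replaced with 132    */
```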
Step Four: Defining the DML activities to occur — The .DML LABEL names and defines the SQL that is to execute. It is like setting up executable code in a programming language, but using SQL. In our example, MultiLoad is being told to INSERT a row into the SQL01.Employee_Dept table. The VALUES come from the data in each FIELD; the colon (:) prefix marks each name as a field reference. Are you allowed to use multiple labels in a script? Sure! But remember this: every label must be referenced in an APPLY clause of the .IMPORT command.
Step Five: Naming the INPUT file and its format type — This step is vital! Using the .IMPORT command, we identify the INFILE data as being contained in a file called "CDW_Join_Export.txt". Then we list the FORMAT type as TEXT. Next, we reference the LAYOUT named FILEIN to describe the fields in the record. Finally, we tell MultiLoad to APPLY the DML LABEL called INSERTS — that is, to INSERT the data rows into the target table. This is still a sub-component of the .IMPORT MLOAD command. If the script is to run on a mainframe, the INFILE name is actually the name of a JCL Data Definition (DD) statement that contains the real name of the file.
Notice that the .IMPORT goes on for 4 lines of information. This is possible because the command continues until it finds the semi-colon that defines its end; that is how MultiLoad separates one operation from another. The semi-colon is therefore very important: without it, MultiLoad would have attempted to process the .END MLOAD as part of the IMPORT, and that wouldn't work.
Reprinted for ibmkvskumar@in.ibm.com, IBM Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump, and FastExport, Second Edition 17
Step Six: Finishing loading and logging off of Teradata — This is the closing ceremonies for the load. MultiLoad wraps things up, closes the curtains, and logs off of the Teradata system.
Important note: Since the script above in Figure 5-6 does not DROP any tables, it is
completely capable of being restarted if an error occurs. Compare this to the next
script in Figure 5-7. Do you think it is restartable? If you said no, pat yourself on the
back.
PARAMETER                     REQUIRED OR NOT   WHAT IT DOES
TABLES tablename1             Required          Names the target table.
WORKTABLES tablename1         Optional          Names the worktables, one per target table.
ERRORTABLES ET_ERR UV_ERR     Optional          Names the error tables, two per target
                                                table; there is no comma separator
                                                between them.
TENACITY                      Optional          Tells MultiLoad how many hours to try
                                                establishing sessions when its initial
                                                effort to do so is rebuffed.
Figure 5-6
Figure 5-7
Figure 5-8
Figure 5-9
In IMPORT mode, you may specify as many as five distinct error-treatment options for one
.DML statement. For example, if there is more than one instance of a row, do you want MultiLoad to
IGNORE the duplicate row, or to MARK it (list it) in an error table?
If you do not specify IGNORE, then MultiLoad will MARK, or record, all of the errors. Imagine you have a standard INSERT load that you know will end up recording about 20,000 duplicate row errors. The syntax "IGNORE DUPLICATE INSERT ROWS;" will keep them out of the error table. By ignoring those errors, the error table does not fill up needlessly, and MultiLoad runs much faster since it is not conducting a duplicate row check.
• For an UPSERT, the default is IGNORE MISSING UPDATE ROWS (MARK is the default for all other operations). When doing an UPSERT, you anticipate that some rows are missing; otherwise, why do an UPSERT? So this keeps those rows out of your error table.
• The DO INSERT FOR MISSING UPDATE ROWS option is mandatory for an UPSERT. This tells MultiLoad to insert a row from the data source if that row does not exist in the target table because the update didn't find it.
The table that follows shows you, in more detail, how flexible your options are:
ERROR TREATMENT OPTIONS IN DETAIL

.DML LABEL OPTION                WHAT IT DOES
MARK DUPLICATE INSERT ROWS       Logs an entry for every duplicate INSERT row
                                 in the UV_ERR table. Use this when you want
                                 to know about the duplicates.
IGNORE DUPLICATE INSERT ROWS     Tells MultiLoad to IGNORE duplicate INSERT
                                 rows because you do not want to see them.
MARK DUPLICATE UPDATE ROWS       Logs the existence of every duplicate UPDATE
                                 row.
IGNORE DUPLICATE UPDATE ROWS     Eliminates the listing of duplicate UPDATE
                                 row errors.
MARK MISSING UPDATE ROWS         Ensures a listing of data rows that had to
                                 be INSERTed since there was no row to
                                 UPDATE.
IGNORE MISSING UPDATE ROWS       Tells MultiLoad NOT to list missing UPDATE
                                 rows as errors. This is a good option when
                                 doing an UPSERT, since the UPSERT will
                                 INSERT a new row.
MARK MISSING DELETE ROWS         Makes a note in the ET error table that a
                                 row to be deleted is missing.
IGNORE MISSING DELETE ROWS       Says, "Do not tell me that a row to be
                                 deleted is missing."
DO INSERT FOR MISSING            Required to accomplish an UPSERT. Tells
UPDATE ROWS                      MultiLoad that if the row to be updated does
                                 not exist in the target table, then INSERT
                                 the entire row from the data source.
Figure 5-10
Figure 5-11
.LOGON TDATA/SQL01,SQL01;

/* Drop the worktables and error tables, in case they existed from a
   prior load. NOTE: do NOT include these DROPs IF you want to RESTART
   using CHECKPOINT. */
DROP TABLE SQL01.EMP_WT;
DROP TABLE SQL01.DEPT_WT;
DROP TABLE SQL01.EMP_ET;
DROP TABLE SQL01.EMP_UV;
DROP TABLE SQL01.DEPT_ET;
DROP TABLE SQL01.DEPT_UV;

/* The following identifies the 2 target tables for loading, with a
   comma between them, then names the worktable and error tables for
   each target table. Note there are NO commas between a pair of error
   table names, but there is a comma between one pair and the next. */
.BEGIN IMPORT MLOAD
TABLES
   SQL01.Employee_Table,
   SQL01.Department_Table
WORKTABLES
   SQL01.EMP_WT,
   SQL01.DEPT_WT
ERRORTABLES
   SQL01.EMP_ET
   SQL01.EMP_UV,
   SQL01.DEPT_ET
   SQL01.DEPT_UV;

/* These next 2 LAYOUTs name and define the records of the 2 different
   INPUT files. */
.LAYOUT FILEIN1;
.FIELD Emp_No   * INTEGER;
.FIELD LName    * CHAR(20);
.FIELD FName    * VARCHAR(20);
.FIELD Sal      * DECIMAL(10,2);
.FIELD Dept_Num * INTEGER;

.LAYOUT FILEIN2;
.FIELD DeptNo   * CHAR(6);
.FIELD DeptName * CHAR(20);

/* Names the 1st DML Label. IGNORE tells MultiLoad to discard duplicate
   INSERT rows because you do not want to see them. The INSERT puts a
   row into the table but does NOT name the columns. */
.DML LABEL EMP_INS;
IGNORE DUPLICATE INSERT ROWS;
INSERT INTO SQL01.Employee_Table
VALUES (:Emp_No
       ,:FName
       ,:LName
       ,:Sal
       ,:Dept_Num);
Figure 5-12
Figure 5-13
/* Identify the Logtable and log onto Teradata with a valid logon
   string. */
.LOGTABLE RemoveLog;
.LOGON TDATA/SQL01,SQL01;

/* Begin MultiLoad in DELETE mode and name the target table. */
.BEGIN DELETE MLOAD
TABLES Order_Table;

/* The SQL DELETE statement does a massive delete of order data for
   orders placed prior to the hard-coded date in the WHERE clause.
   Notice that this is not the Primary Index: you CANNOT DELETE in
   DELETE MLOAD mode based upon the Primary Index. */
DELETE FROM Order_Table
WHERE Order_Date < '99/12/31';

/* End loading and log off of Teradata. */
.END MLOAD;
.LOGOFF;
Figure 5-14
How many differences from a MultiLoad IMPORT script readily jump off of the page at you? Here
are a few that we saw:
• At the beginning, you must specify the word "DELETE" in the .BEGIN MLOAD command.
You need not specify it in the .END MLOAD command.
• You will readily notice that this mode has no .DML LABEL command. Since it is focused on
just one absolute function, no APPLY clause is required so you see no .DML LABEL.
• Notice that the DELETE with a WHERE clause is an SQL function, not a MultiLoad
command, so it has no dot prefix.
• Since default names are available for worktables (WT_<target_tablename>) and error tables
(ET_<target_tablename> and UV_<target_tablename>), they need not be specifically
named, but be sure to define the Logtable.
Do not confuse the DELETE MLOAD task with the SQL delete task that may be part of a MultiLoad
IMPORT. The IMPORT delete is used to remove small volumes of data rows based upon the
Primary Index. On the other hand, the MultiLoad DELETE does global deletes on tables, bypassing
the Transient Journal. Because there is no Transient Journal, there are no rollbacks when the job
fails for any reason. Instead, it may be RESTARTed from a CHECKPOINT. Also, the MultiLoad
DELETE task is never based upon the Primary Index.
Because we are not importing any data rows, there is neither a need for worktables nor an
Acquisition Phase. One DELETE statement is sent to all the AMPs with a match tag parcel. That
statement will be applied to every table row. If the condition is met, then the row is deleted. Using
the match tags, each target block is read once and the appropriate rows are deleted.
/* Identify the Logtable and log onto Teradata with a valid logon
   string. */
.LOGTABLE RemoveLog;
.LOGON TDATA/SQL01,SQL01;

/* Begin the DELETE task; it names only one table, but still uses the
   TABLES option. */
.BEGIN DELETE MLOAD
TABLES Order_Table;

/* Name the LAYOUT and define the column whose value will be passed as
   a single row to MultiLoad. In this case, all of the order dates in
   the Order_Table will be tested against this OrdDate value. */
.LAYOUT OldMonth;
.FIELD OrdDate * DATE;

/* Data rows with orders placed prior to the date value (:OrdDate)
   passed from the LAYOUT OldMonth will be DELETEd from the
   Order_Table. */
DELETE FROM Order_Table
WHERE Order_Date < :OrdDate;

/* Note that this time there is no dot in front of LAYOUT in the
   .IMPORT INFILE clause, since the layout is only being referenced. */
.IMPORT INFILE
LAYOUT OldMonth;

/* End loading and log off of Teradata. */
.END MLOAD;
.LOGOFF;
Figure 5-15
Figure 5-16
Figure 5-17
For the most part, the Acquisition error table logs errors that occur during that processing phase. The Application error table lists Unique Primary Index violations, field overflow errors on non-PI columns, and constraint errors that occur in the Application Phase. MultiLoad error tables not only list the errors they encounter, they also have the capability to STORE those errors. Do you remember the MARK and IGNORE parameters? This is where they come into play. MARK will ensure that the error rows, along with some details about the errors, are stored in the error table. IGNORE does neither; it is as if the error never occurred.
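A toy Python model of that difference (illustrative only; this is not how MultiLoad is implemented internally):

```python
def insert_rows(rows, on_duplicate="MARK"):
    """Toy model of MARK vs IGNORE for duplicate INSERT rows:
    MARK stores the offending rows in an error table, while IGNORE
    drops them silently, as if the error never occurred."""
    target, error_table, seen = [], [], set()
    for row in rows:
        if row in seen:
            if on_duplicate == "MARK":
                error_table.append(row)   # record the duplicate
            # IGNORE: do nothing at all
        else:
            seen.add(row)
            target.append(row)
    return target, error_table

print(insert_rows([1, 2, 2, 3], "MARK"))    # ([1, 2, 3], [2])
print(insert_rows([1, 2, 2, 3], "IGNORE"))  # ([1, 2, 3], [])
```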
Figure 5-19
Figure 5-20
RESTARTing MultiLoad
Who hasn't experienced a failure at some time when attempting a load? Don't take it personally!
Failures can and do occur on the host or Teradata (DBC) for many reasons. MultiLoad has the
impressive ability to RESTART from failures in either environment. In fact, it requires almost no
effort to continue or resubmit the load job. Here are the factors that determine how it works:
First, MultiLoad will check the Restart Logtable and automatically resume the load process from the
last successful CHECKPOINT before the failure occurred. Remember, the Logtable is essential for
restarts. MultiLoad uses neither the Transient Journal nor rollbacks during a failure. That is why you
must designate a Logtable at the beginning of your script. MultiLoad either restarts by itself or waits
for the user to resubmit the job. Then MultiLoad takes over right where it left off.
Second, suppose Teradata experiences a reset while MultiLoad is running. In this case, the host
program will restart MultiLoad after Teradata is back up and running. You do not have to do a thing!
Third, if a host mainframe or network client fails during a MultiLoad, or the job is aborted, you may
simply resubmit the script without changing a thing. MultiLoad will find out where it stopped and start
again from that very spot.
Fourth, if MultiLoad halts during the Application Phase it must be resubmitted and allowed to run
until complete.
Fifth, during the Acquisition Phase the CHECKPOINT (n) you stipulated in the .BEGIN MLOAD
clause will be enacted. The results are stored in the Logtable. During the Application Phase,
CHECKPOINTs are logged each time a data block is successfully written to its target table.
HINT: The default for CHECKPOINT is 15 minutes. If you specify CHECKPOINT as 60 or less, minutes are assumed; if you specify it as 61 or above, that number of records is assumed.
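That rule of thumb can be captured in a couple of lines of Python (a sketch of the documented convention, not a Teradata API):

```python
def checkpoint_unit(n):
    """Interpret a CHECKPOINT value per the rule above: 60 or less
    means minutes between checkpoints; 61 or more means a record
    count."""
    return "minutes" if n <= 60 else "records"

print(checkpoint_unit(15))    # minutes (the default interval)
print(checkpoint_unit(1000))  # records
```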
Before V2R3: In the earlier days of Teradata it was NOT possible to use RELEASE MLOAD if one
of the following three conditions was true:
• In IMPORT mode, once MultiLoad had reached the end of the Acquisition Phase you could
not use RELEASE MLOAD. This is sometimes referred to as the "point of no return."
• In DELETE mode, the point of no return was when Teradata received the DELETE
statement.
• If the job halted in the Apply Phase, you had to RESTART the job.
With and since V2R3: The advent of V2R3 brought new possibilities with regard to using the RELEASE MLOAD command: it can NOW be used even in the Apply Phase, under certain conditions. Even so, you should be very cautious using the RELEASE command. It could potentially leave your table half updated. It is handy for a test environment, but please don't become too reliant on it for production runs; those should be allowed to finish to guarantee data integrity.
You will find a more detailed discussion on how to write INMODs for MultiLoad in the chapter of this
book titled, "INMOD Processing".
Figure 5-21