
Teradata Tools

FastLoad

Vidya T
FastLoad
(Fast Data Load)

After completing this module, you will be able to:

• Describe the two phases of FastLoad.
• Prepare a FastLoad script.
• Partition a data load over successive runs.
• Restart an interrupted FastLoad.


FastLoad

• Fast batch mode utility for loading new tables onto the Teradata
database
• Can reload previously emptied tables
• Full Restart capability
• Error Limits and Error Tables, accessible using SQL
• Restartable INMOD routine capability
• Ability to load data in several stages

[Figure: FastLoad running on the Host, loading the Teradata Database]

FastLoad Characteristics
Purpose
• Load large amounts of data into an empty table at high speed.

Concepts
• Loads into an empty table with no secondary indexes.
• Has two phases - creates an error table for each phase.
• Status of run is displayed.
• Checkpoints can be taken for restarts.

Restrictions
• Each FastLoad job can load only one empty table.
• The Teradata Database will accommodate up to 15 FastLoad, MultiLoad,
and FastExport (FL/ML/FE) jobs at one time.
• Tables defined with referential integrity, secondary indexes, join
indexes, or triggers cannot be loaded with FastLoad.
• Duplicate rows cannot be loaded into a multiset table with FastLoad.
• If an AMP goes down, FastLoad cannot be restarted until it is back
online.
FastLoad Phase 1
[Figure: Phase 1 data flow in six numbered steps. The host sends blocks of records (B1R1-B1R3, B2R4-B2R6) through the PEs and across the BYNET to AMP 1 and AMP 2, which redistribute the rows (R1-R6) by hash value.]

Phase 1
• FastLoad uses one SQL session to define AMP steps.
• The PE sends a block to each AMP which stores blocks of unsorted
data records.
• AMPs hash each record and redistribute them to the AMP responsible
for the hash value.
• At the end of Phase 1, each AMP has the rows it should have, but the
rows are not in row hash sequence.
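
The Phase 1 redistribution described above can be pictured with a small simulation. This is only a sketch under stated assumptions: `row_hash` uses CRC32 as a stand-in (Teradata's real row-hash algorithm is different), blocks are dealt to AMPs round-robin, and the names `phase1_redistribute`, `blocks`, and `amps` are hypothetical.

```python
import zlib


def row_hash(primary_index_value):
    # Stand-in hash (CRC32); Teradata's actual row-hash algorithm is different.
    return zlib.crc32(str(primary_index_value).encode("utf-8"))


def phase1_redistribute(blocks, num_amps):
    """Return, per AMP, the rows it owns after Phase 1 (arrival order, unsorted)."""
    # Step 1: each incoming block lands on some AMP (round-robin is an assumption).
    received = [[] for _ in range(num_amps)]
    for i, block in enumerate(blocks):
        received[i % num_amps].extend(block)

    # Step 2: every AMP hashes its rows and forwards each one to the owning AMP.
    owned = [[] for _ in range(num_amps)]
    for rows in received:
        for row in rows:
            owner = row_hash(row[0]) % num_amps
            owned[owner].append(row)
    return owned


# Two blocks of three rows each; the first field plays the primary index.
blocks = [[(1, "R1"), (2, "R2"), (3, "R3")], [(4, "R4"), (5, "R5"), (6, "R6")]]
amps = phase1_redistribute(blocks, num_amps=2)
```

After the call, every row sits on the AMP that owns its hash value, but each AMP's list is still in arrival order rather than row-hash order, which is the state Phase 2 starts from.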
FastLoad Phase 2
[Figure: Phase 2 in six numbered steps. The host sends END LOADING; through the PE and across the BYNET; AMP 1 and AMP 2 each sort their own rows (R1-R6) and write them to disk.]

Phase 2
• When the FastLoad job receives END LOADING; statement, FastLoad
starts Phase 2.
• Each AMP sorts the target table, puts the rows into blocks, and writes
the blocks to disk.
• Fallback rows are then generated if required.
• Table data is available when Phase 2 completes.
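
As a rough illustration of what each AMP does in Phase 2, the sketch below sorts one AMP's rows by row hash and packs them into fixed-size blocks. The CRC32-based `row_hash`, the `rows_per_block` parameter, and the function name are assumptions made for the example, not Teradata internals.

```python
import zlib


def row_hash(primary_index_value):
    # Stand-in hash (CRC32); Teradata's actual row-hash algorithm is different.
    return zlib.crc32(str(primary_index_value).encode("utf-8"))


def phase2_sort_and_block(amp_rows, rows_per_block=2):
    """Sort one AMP's rows into row-hash sequence and group them into blocks."""
    ordered = sorted(amp_rows, key=lambda row: row_hash(row[0]))
    return [
        ordered[i:i + rows_per_block]
        for i in range(0, len(ordered), rows_per_block)
    ]


# Rows as one AMP might hold them at the end of Phase 1 (unsorted).
unsorted_rows = [(4, "R4"), (2, "R2"), (5, "R5")]
data_blocks = phase2_sort_and_block(unsorted_rows)
```

Fallback copies, when the table is defined with FALLBACK, are outside the scope of this sketch.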
A Sample FastLoad Script

/* SETUP: create the table, if it doesn't already exist. */
LOGON tdpid/username,password;
DROP TABLE Acct;
DROP TABLE AcctErr1;
DROP TABLE AcctErr2;

CREATE TABLE Acct, FALLBACK (
   AcctNum   INTEGER
  ,Number    INTEGER
  ,Street    CHAR(25)
  ,City      CHAR(25)
  ,State     CHAR(2)
  ,Zip_Code  INTEGER)
UNIQUE PRIMARY INDEX (AcctNum);
LOGOFF;

/* FASTLOAD: start the utility. */
LOGON tdpid/username,password;

/* Error files must be defined. Checkpoint is optional. */
BEGIN LOADING Acct
   ERRORFILES AcctErr1, AcctErr2
   CHECKPOINT 100000;

/* DEFINE the input; must agree with host data format. */
DEFINE in_AcctNum (INTEGER)
      ,in_Zip     (INTEGER)
      ,in_Nbr     (INTEGER)
      ,in_Street  (CHAR(25))
      ,in_State   (CHAR(2))
      ,in_City    (CHAR(25))
FILE=data_infile1;

/* INSERT must agree with table definition. */
/* Phase 1 begins. Unsorted blocks are written to disk. */
INSERT INTO Acct VALUES (
   :in_AcctNum
  ,:in_Nbr
  ,:in_Street
  ,:in_City
  ,:in_State
  ,:in_Zip);

/* Phase 2 begins with END LOADING. Sorting and writing blocks to disk. */
END LOADING;
LOGOFF;
FastLoad from a UNIX Server

fastload < /home/job1.fld > /home/job1.out &

/home/job1.fld is the input script file name; /home/job1.out is the output file (report) name.

SESSIONS 8;
LOGON educ2/bank,bkpasswd;

/* Customer is the name of the empty table. Starts Phase 1. */
BEGIN LOADING Customer
   ERRORFILES CustrErr1, CustrErr2;

/* Defines input record. */
DEFINE in_CustNum (INTEGER)
      ,in_SocSec  (INTEGER)
      ,Filler     (CHAR(40))
      ,in_Lname   (CHAR(30))
      ,in_Fname   (CHAR(20))
FILE=custdata.dat;

/* SQL Insert statement. */
INSERT INTO CUSTOMER VALUES (
   :in_CustNum
  ,:in_Lname
  ,:in_Fname
  ,:in_SocSec);

/* Start Phase 2; if omitted, utility will pause. */
END LOADING;
LOGOFF;
Converting the Data

LOGON educ2/user14,ziplock;
DROP TABLE Accounts;
DROP TABLE Accts_ErrTab_1;
DROP TABLE Accts_ErrTab_2;

/* FastLoad permits conversion from one data type to another, */
/* once for each column. */
CREATE TABLE Accounts, FALLBACK (
   Account_Number   INTEGER
  ,Account_Status   CHAR(15)
  ,Trans_Date       DATE
  ,Balance_Forward  DECIMAL(5,2)
  ,Balance_Current  DECIMAL(7,2) )
UNIQUE PRIMARY INDEX (Account_Number);

BEGIN LOADING Accounts
   ERRORFILES Accts_ErrTab_1, Accts_ErrTab_2;

DEFINE in_Acctno   (CHAR(9))
      ,in_Trnsdate (CHAR(10))
      ,in_Balcurr  (CHAR(7))
      ,in_Balfwd   (INTEGER)
      ,in_Status   (CHAR(10))
FILE = INFILE;

/* Including column names provides script documentation, which may */
/* aid in future when debugging or modifying the job script. */
INSERT INTO Accounts
  (Account_Number
  ,Account_Status
  ,Trans_Date
  ,Balance_Forward
  ,Balance_Current)
VALUES (
   :in_Acctno
  ,:in_Status
  ,:in_Trnsdate (Format 'YYYY-MM-DD')
  ,:in_Balfwd
  ,:in_Balcurr);

END LOADING;
LOGOFF;
Data Conversion Chart

(b = blank; v = implied decimal point)

FROM:          TO:          ORIGINAL DATA:   STORED AS:

CHAR(13)       VARCHAR(5)   ABCDEFGHIJKLM    ABCDE
CHAR(5)        INTEGER      ABCDE            invalid
CHAR(5)        INTEGER      12345            0000012345
CHAR(13)       INTEGER      12345bbbbbbbb    0000012345
CHAR(13)       INTEGER      1234567890123    overflow
CHAR(13)       DATE         92/01/15bbbbb    920115
CHAR(13)       DATE         920115bbbbbbb    invalid
CHAR(13)       DATE         01/15/92bbbbb    invalid
CHAR(6)        DEC(5,2)     123.50           123.50
CHAR(6)        DEC(5,2)     12350            overflow
VARCHAR(5)     CHAR(13)     ABCDE            ABCDEbbbbbbbb
BYTEINT        INTEGER      123              0000000123
SMALLINT       INTEGER      12345            0000012345
INTEGER        SMALLINT     0000012345       12345
INTEGER        SMALLINT     1234567890       invalid
INTEGER        BYTEINT      0000000123       123
INTEGER        BYTEINT      0000012345       invalid
INTEGER        DATE         0000920115       920115
INTEGER        CHAR(8)      0000012345       bbbbbb12
DECIMAL(3,2)   INTEGER      1v23             0000000001
DECIMAL(3,2)   CHAR(5)      1v23             b1.23
DECIMAL(3,2)   CHAR(3)      1v23             b1.
DATE           INTEGER      0000920115       0000920115
DATE           SMALLINT     0000920115       invalid
DATE           CHAR(8)      0000920115       92/01/15
DATE           CHAR(6)      0000920115       92/01/
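
To make the CHAR-to-INTEGER rows of the chart concrete, here is a minimal sketch that mimics the three outcomes shown (a stored value, "invalid", or "overflow"). It is an illustration in Python, not FastLoad's conversion code; the function name and the returned status strings are hypothetical, and the range used is that of a 32-bit signed integer.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1  # range of a 32-bit signed INTEGER


def char_to_integer(field):
    """Classify a character field as in the CHAR -> INTEGER rows of the chart."""
    text = field.strip()  # blank padding is ignored, as in '12345bbbbbbbb'
    try:
        value = int(text)
    except ValueError:
        return ("invalid", None)   # non-numeric text such as 'ABCDE'
    if not INT_MIN <= value <= INT_MAX:
        return ("overflow", None)  # too large, such as '1234567890123'
    return ("ok", value)


results = [
    char_to_integer("ABCDE"),
    char_to_integer("12345"),
    char_to_integer("12345        "),
    char_to_integer("1234567890123"),
]
```

Python's `int()` is more permissive than a database conversion in some edge cases (for example, it accepts underscores between digits), so the sketch only covers the inputs shown in the chart.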
NULLIF

DEFINE in_Acctno   (CHAR(9))
      ,in_Status   (CHAR(10))
      ,in_Trnsdate (CHAR(10), NULLIF = 0)
      ,in_Balfwd   (INTEGER)
      ,in_Balcurr  (CHAR(7))
FILE = infile5;

INSERT INTO Accounts VALUES (
   :in_Acctno
  ,:in_Status
  ,:in_Trnsdate (FORMAT 'YYYY-MM-DD')
  ,:in_Balfwd
  ,:in_Balcurr);

NULLIF allows you to specify that if an input field contains a
specified value, it should be treated as NULL.
One common example occurs when dates are entered as
zeroes; they may cause a fault since they are not in the
expected format.
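
The effect of NULLIF on a zero-filled date can be sketched as follows. This is an analogy in Python rather than FastLoad behavior verbatim: `load_date` and `null_marker` are hypothetical names, and the all-zero string stands in for a date "entered as zeroes".

```python
from datetime import date, datetime


def load_date(field, null_marker=None):
    """Parse a 'YYYY-MM-DD' field, treating the marker value as NULL (None)."""
    if null_marker is not None and field == null_marker:
        return None  # the NULLIF case: store NULL instead of converting
    # Without a matching marker, a malformed date raises ValueError.
    return datetime.strptime(field, "%Y-%m-%d").date()


good = load_date("2001-01-17")
nulled = load_date("0000000000", null_marker="0000000000")

try:
    load_date("0000000000")  # no NULLIF: the zero-filled field is rejected
    rejected = False
except ValueError:
    rejected = True
```

The last call shows the failure mode the text describes: without a NULLIF rule, the zero-filled field reaches the date conversion and is rejected.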
BEGIN LOADING Statement

BEGIN LOADING Target_table_name
   ERRORFILES Error_Table_1, Error_Table_2
   [ CHECKPOINT integer ]
   [ INDICATORS ] ;

• Target_table_name: name of the empty table (not a view).
• ERRORFILES: names the two error tables (required).
• CHECKPOINT: optional checkpoint interval.
• INDICATORS: allows NULLs to be preserved.

Required privileges for the userid logged in to the FastLoad job:

• For Target_table_name: SELECT and INSERT (CREATE and
DROP or DELETE, if using those functions)
• For Error_tables: CREATE TABLE
• Required privileges for the user PUBLIC on the restart log
table (SYSADMIN.FASTLOG):
– SELECT
– INSERT
– UPDATE
– DELETE
• There will be a row in the FASTLOG table for each FastLoad
job that has not completed in the system.
FastLoad Error Tables

ErrorTable1
Contains one row for each row that failed to load because of
constraint violations or translation errors. The table has three
columns:

Column_Name      Datatype         Content

ErrorCode        Integer          The error code in DBC.ErrorMsgs.
ErrorFieldName   VarChar(30)      The column that caused the error.
DataParcel       VarByte(64000)   The data record sent by the host.

ErrorTable2
For non-duplicate rows, captures those rows that cause a UPI
duplicate violation.
Notes
• Duplicate rows are counted and reported but not captured.
• Error tables are automatically dropped if empty upon completion
of the run.
• Performance Note: Rows are written into error tables one row at
a time. Errors slow down FastLoad.
Error Recovery

Output report from FastLoad


A Total Records Read = 35000
B Total ErrorTable 1 = 1250 (Not loaded due to error)
C Total ErrorTable 2 = 30 (Duplicate UPIs only)
D Total Inserts Applied = 33700
E Total Duplicate Rows = 20

(A = B + C + D + E)
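
The identity A = B + C + D + E gives a quick sanity check on any FastLoad report. A minimal sketch, using the figures from the report above (the function and parameter names are hypothetical):

```python
def unaccounted_rows(total_read, error_table_1, error_table_2,
                     inserts_applied, duplicate_rows):
    """Return A - (B + C + D + E); zero means every input record is accounted for."""
    accounted = error_table_1 + error_table_2 + inserts_applied + duplicate_rows
    return total_read - accounted


# Figures taken from the sample report.
difference = unaccounted_rows(
    total_read=35000,
    error_table_1=1250,
    error_table_2=30,
    inserts_applied=33700,
    duplicate_rows=20,
)
```

A non-zero result would suggest that the counts were copied incorrectly or that the run did not finish.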

Investigating the failed rows


SELECT DISTINCT ErrorCode, ErrorFieldName
FROM Error_Table_1;

Investigating the duplicate index violations


SELECT * FROM Error_Table_2;
CHECKPOINT Option

BEGIN LOADING . . .
CHECKPOINT integer;

• Used to verify that rows have been transmitted and processed.
• Specifies the number of rows transmitted before pausing to
take a checkpoint and verify receipt by AMPs.
• If the CHECKPOINT parameter is not specified, FastLoad
takes checkpoints as follows:
– Beginning of Phase 1
– Every 100,000 input records
– End of Phase 1
• FastLoad can be restarted from previous checkpoint.
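
The checkpoint arithmetic can be sketched in a few lines. The default interval of 100,000 records comes from the list above; the function names are hypothetical, and the sketch only models periodic checkpoints and the restart position, not the checkpoints at the beginning and end of Phase 1.

```python
def periodic_checkpoints(total_rows, interval=100_000):
    """Row counts at which a periodic checkpoint falls during Phase 1."""
    return list(range(interval, total_rows + 1, interval))


def restart_row(last_checkpoint_row):
    """First row sent after a restart: the row after the last checkpoint."""
    return last_checkpoint_row + 1


marks = periodic_checkpoints(350_000)
resume_at = restart_row(1_804_231)
```

With a last checkpoint at row 1804231, the restart position is row 1804232, so only rows sent after the last checkpoint need to be transmitted again.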
END LOADING Statement

END LOADING ;

• Indicates that all data rows have been transmitted.


• Begins Phase 2 processing.
• Omission implies:
– The load is incomplete and will be restarted later.
– This causes the table that is being loaded to become
“FastLoad paused.”
– If you attempt to access a table (via SQL) that is in a
“FastLoad paused” state, you will get the following error.
Error #2652 Operation Not Allowed
tablename is being loaded
RECORD Statement

RECORD [integer] [THRU integer];

• If you do not use a RECORD command, FastLoad reads from the
first record in the data source (or, on a restart, from the record
after the last checkpoint) through the last record.
• RECORD allows control over which input records are to be
brought in for loading.
• RECORD is a separate statement used before the DEFINE
statement.

RECORD 1 THRU 1000;
   Loads the 1st through the 1,000th record.

RECORD 1;
   Loads the 1st through the last record.
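
The record range selected by RECORD is 1-based and inclusive at both ends. A small sketch of that selection logic, with hypothetical names (`select_records`, `start`, `thru`), may help when working out ranges for a partitioned load:

```python
def select_records(records, start=1, thru=None):
    """Return the records a 'RECORD start THRU thru;' statement would select."""
    if start < 1:
        raise ValueError("record numbers start at 1")
    end = len(records) if thru is None else thru
    return records[start - 1:end]  # 1-based, inclusive range


source = list(range(1, 2001))                 # 2,000 numbered input records
first_thousand = select_records(source, 1, 1000)
everything = select_records(source, 1)
middle = select_records(source, 5, 7)
```

Here `first_thousand` corresponds to `RECORD 1 THRU 1000;` and `everything` to `RECORD 1;`.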


INSERT Statement

DEFINE AcctNum (INTEGER)
      ,Nbr     (INTEGER)
      ,Strt    (CHAR(25))
      ,Cty     (CHAR(25))
      ,St      (CHAR(2))
      ,Zip     (INTEGER)
File=data_in ;

INSERT INTO Accounts VALUES (
   :AcctNum
  ,:Nbr
  ,:Strt
  ,:Cty
  ,:St
  ,:Zip);

INSERT privilege is required on the table.


Restarting FastLoad

Condition 1: Abort in Phase 1 - data acquisition incomplete.
Solution: Resubmit the script. FastLoad will begin from record 1 or the first
record past the last checkpoint.

Condition 2: Abort occurs in Phase 2 - data acquisition complete.
Solution: Submit only BEGIN and END LOADING statements; restarts
Phase 2 only.

Condition 3: Normal end of Phase 1 (paused) - more data to acquire, thus
there is no 'END LOADING' statement in script.
Solution: Resubmit the script. FastLoad will be positioned to record 1 or the
first record past the last checkpoint.

Condition 4: Normal end of Phase 1 (paused) - no more data to acquire, no
'END LOADING' statement was in the script.
Solution: Submit BEGIN and END LOADING statements; restarts Phase 2
only.
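
The four restart conditions reduce to one question: has all the data been acquired? A minimal sketch of that decision, with a hypothetical function name and plain-text return values:

```python
def restart_action(acquisition_complete):
    """Pick the restart procedure from the four conditions listed above."""
    if acquisition_complete:
        # Conditions 2 and 4: only Phase 2 remains to be run.
        return "submit BEGIN LOADING and END LOADING only"
    # Conditions 1 and 3: Phase 1 still has data to acquire.
    return "resubmit the script"


abort_in_phase_1 = restart_action(acquisition_complete=False)
abort_in_phase_2 = restart_action(acquisition_complete=True)
```

In the resubmit case, FastLoad itself decides whether to start at record 1 or at the first record past the last checkpoint.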
FastLoad Fails to Complete
LOGON Username, Password;
BEGIN LOADING Accounts
ERRORFILES AcctErr1, AcctErr2;

DEFINE AcctNum (INTEGER)


,Nbr (INTEGER)
,Strt (CHAR(25))
,Cty (CHAR(25))
,St (CHAR(2))
,Zip (INTEGER)
File=Data_in ;
FDL4803.DEFINE statement processed
INSERT INTO Accounts VALUES (
:AcctNum
,:Nbr
,:Strt
,:Cty
,:St
,:Zip);
***Number of recs / msg =1442
**** 11:17:29 Starting to send to RDBMS with record 1
**** 11:17:58 Starting row 100000
**** 11:18:23 Starting row 200000
**** 11:18:31 Starting row 300000
:
**** 11:24:09 Starting row 1700000
**** 11:24:35 Starting row 1800000
**** 11:24:36 RDBMS error 2644: No more room in data base target.
**** 11:24:36 Increase database size and restart FastLoad

FAST LOAD PAUSED

LOGOFF;
Restarting FastLoad (Output)
LOGON Username, Password;
BEGIN LOADING Accounts
ERRORFILES AcctErr1, AcctErr2;
DEFINE AcctNum (INTEGER)
:
FDL4803.DEFINE statement processed
INSERT INTO Accounts VALUES (
:
;
FastLoad Restarted
***Number of recs / msg =1442
**** 11:26:45 The last checkpoint was taken at row: 1804231
**** 11:26:45 FastLoad will now restart at row: 1804232
**** 09:41:15 Starting row 1900000
**** 09:41:40 Starting row 2000000
:
**** 09:43:38 Starting row 2800000
**** 09:43:41 Sending row 2820489
**** 09:43:41 Finished sending rows to the RDBMS

END LOADING;
0009 end loading;

**** 09:49:57 END LOADING COMPLETE

Total Records Read = 2820489


Total Error Table 1 = 0 ---- Table has been dropped
Total Error Table 2 = 0 ---- Table has been dropped
Total Inserts Applied = 2820489
Total Duplicate Rows = 0

Start: Wed Jan 17 09:44:00 2001


End : Wed Jan 17 09:49:57 2001
FastLoad with Multiple Data Files

Script load_US.fld, run with: fastload < load_US.fld

LOGON username,password;
BEGIN LOADING Customer ERRORFILES CustErr1, CustErr2;

DEFINE in_CustNum  (INTEGER)
      ,in_Lname    (CHAR(15))
      ,in_Fname    (CHAR(10))
      ,in_Mailcode (INTEGER)
FILE=US.dat;

INSERT INTO Customer VALUES (
   :in_CustNum
  ,:in_Lname
  ,:in_Fname
  ,:in_Mailcode);

/* No END LOADING; statement. Table is in "FastLoad Paused" state. */
LOGOFF;

Script load_Int.fld, run with: fastload < load_Int.fld

LOGON username,password;
BEGIN LOADING Customer ERRORFILES CustErr1, CustErr2;

DEFINE in_CustNum  (INTEGER)
      ,in_Lname    (CHAR(15))
      ,in_Fname    (CHAR(10))
      ,in_Mailcode (INTEGER)
FILE=International.dat;

INSERT INTO Customer VALUES (
   :in_CustNum
  ,:in_Lname
  ,:in_Fname
  ,:in_Mailcode);

/* END LOADING; indicates no more data and to start Phase 2. */
END LOADING;
LOGOFF;
Summary — FastLoad

• Good for loading new tables from the host.
• Can reload existing tables once they have been emptied.
• Use INMOD for specialized processing.
• Remove referential integrity or secondary indexes prior to using
FastLoad.
