Teradata Notes

FastLoad
1. The key concept is the match tag, which FastLoad does not have.
2. FastLoad discards duplicate rows because it does not store any information about the input record sequence, unlike MultiLoad's match tag (ApplySeq + DMLSeq + ImportSeq + SMTSeq + SourceSeq). It therefore cannot tell whether a row was duplicated within the data or was sent twice because of a restarted FastLoad (in the application phase).
3. FastLoad uses multiple sessions to load data. However, it loads data into only one table on a Teradata RDBMS per job. If you want to load data into more than one table in an RDBMS, you must submit multiple FastLoad jobs, one for each table.
4. During a load operation, FastLoad inserts the data from each record of your data source into one row of the table on a Teradata RDBMS. The table on the Teradata RDBMS receiving the data must be empty and have no defined secondary indexes.
5. Note: FastLoad does not load duplicate rows from your data source to the Teradata RDBMS. (A duplicate row is one in which every field contains exactly the same data as the fields of an existing row.) This is true even for MULTISET tables. If you want to load duplicate rows into a MULTISET table, use the MultiLoad utility.
6. You can restart serial FastLoad operations by loading the next tape in a series instead of beginning with the first tape in a set.
7. In either case, FastLoad:
   - Uses multiple Teradata sessions, at one session per AMP, to transfer data
   - Transfers multiple rows of data within a single message
8. Also, in either case, until you complete the FastLoad job and have loaded the data into the FastLoad table:
   - There is no journaling or fallback data
   - You cannot define secondary indexes
9. The FastLoad utility can redefine the data type specification of numeric, character and date input data so that it matches the type specification of its destination column in the FastLoad table on the Teradata RDBMS. If, for example, an input field with numeric type data is targeted for a column with a character data type specification, FastLoad can change the input data specification to character before inserting it into the table. The types of data conversions you can specify are:
   - Numeric-to-numeric (for example, integer-to-decimal)
   - Character-to-numeric
   - Character-to-date
   - Date-to-character
10. You use the datatype specification of the DEFINE command to convert input data to a different type before inserting it into the FastLoad table on the Teradata RDBMS.
11. Checkpoints are entries posted to a restart log table at regular intervals during the FastLoad data transfer operation. If processing stops while a FastLoad job is running, you can restart the job at the most recent checkpoint. If, for example, you are loading 1,000,000 records into a table and have specified checkpoints every 50,000 records, FastLoad pauses and posts an entry to the restart log table whenever a multiple of 50,000 records has been successfully sent to the Teradata RDBMS. If the job stops after record 60,000 has been loaded, you can restart the job at the record immediately following the last checkpoint: record 50,001. You enable the checkpoint function by specifying a checkpoint value in the BEGIN LOADING command.
12. The FastLoad utility logs off all sessions with the Teradata RDBMS and returns a status message indicating:
   - The total processor time that was used
   - The job start and stop date/time
   - The highest return code that was encountered:
       0 if the job completed normally
       4 if a warning condition occurred
       8 if a user error occurred
       12 if a fatal error occurred
   - Whether the utility terminated or paused
13. When a FastLoad job is in the paused state, the FastLoad target table and the two error tables on the Teradata RDBMS are locked. You can access the two error tables by using a locking modifier, such as:
    locking error_table_name for access
    select errorcode, errorfieldname from error_table_name;
14. The FastLoad utility does not support foreign key references in target tables.
15. The FastLoad utility does not support target tables defined with secondary indexes. Attempting a FastLoad task against a target table defined with secondary indexes produces an error condition. If only non-unique secondary indexes are involved, consider using the MultiLoad utility instead.
16. The FastLoad utility does not maintain Join Indexes. You cannot use FastLoad to load data to tables with an associated Join Index on a Teradata RDBMS for UNIX, V2R2.1 database. In this case, you must first drop the Join Index, then recreate it after running the FastLoad job.
17. The FastLoad utility does not load duplicate rows, even into MULTISET tables. If you use FastLoad to load a target table defined as MULTISET, the utility will discard any duplicate rows.
18. The maximum file size that is supported by FastLoad on network-attached client systems is 2 gigabytes.
19. A multi-file FastLoad job is one that loads the FastLoad table with input data from more than one source. You do this by:
   1) Using a LOGOFF command, with no END LOADING command, to intentionally pause the FastLoad job after you have initiated the job and loaded the data from the first source.
   2) Successively restarting and pausing the FastLoad job to load the data from each subsequent input source.
   3) Using an END LOADING command to terminate the FastLoad job after you have loaded the data from the last input source.
   Note: When you run a multi-file FastLoad job, the FastLoad table and the two error tables remain locked and are not available to users until you use the END LOADING command to conclude the FastLoad job.
20. While processing your FastLoad job script, FastLoad tracks and records information about five types of error conditions that cause the Teradata RDBMS to reject an input data record:
   - Constraint violations
   - Conversion errors
   - Unavailable AMP conditions
   - Unique primary index violations
   - Duplicate rows
21. The FastLoad utility stores the input data records related to constraint violations, conversion errors, unavailable AMP conditions and unique primary index violations in the two error tables that you specify in your BEGIN LOADING command. The FastLoad utility discards all records that produce a duplicate row error, but includes the total number of duplicate rows encountered, along with the total records in each error table, in the end-of-job status report.
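The checkpoint arithmetic from note 11 can be sketched in a few lines of Python. This is only an illustration of the rule "restart at the record following the last checkpoint"; the function name and parameters are hypothetical and are not part of FastLoad itself.

```python
def restart_record(last_record_sent: int, checkpoint_interval: int) -> int:
    """Return the 1-based record number at which a restarted load would resume.

    Hypothetical helper: it models the rule from the notes, not FastLoad's
    internal restart log.
    """
    if checkpoint_interval <= 0:
        # No checkpoints were requested, so the load starts over (assumption).
        return 1
    # The last checkpoint is the largest multiple of the interval
    # that had already been sent when the job stopped.
    last_checkpoint = (last_record_sent // checkpoint_interval) * checkpoint_interval
    return last_checkpoint + 1


# The example from the notes: checkpoints every 50,000 records,
# job stops after record 60,000.
print(restart_record(60_000, 50_000))  # 50001
```

A smaller interval means less rework after a failure, at the cost of pausing more often to post entries to the restart log table.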
Analyze Explain Plan

1) What does "pseudo table" mean? Is this a temporary table created for the global temporary table?
2) What do DBC.TVM and DBC.DBase mean?
3) I do not see any insert into the global temporary table. Can someone explain how data is inserted into a global temporary table?

Pseudo tables are dummy tables that sit on all AMPs. Pseudo tables store the table ID hash codes of database objects that are involved in all-AMP operations. Whenever an all-AMP operation is performed on a database object, a row-hash lock is placed on the pseudo table to prevent a deadlock (for example, if another user has submitted another request on the same object). Looking at the explain, it appears that you are performing a DDL change through a macro. DBC.TVM and DBC.DBase are dictionary tables that store the metadata about database objects (tables, views and macros). Whenever a database object is created, deleted or modified, these dictionary tables are updated.
What is a Pseudo Lock?

Whenever all AMPs are used in a query, a pseudo lock must be placed on the table. This sounds like fancy terminology, but all it means is that a "Gatekeeper" is responsible for locking the table for one user at a time. Let me explain this EXPLAIN terminology.

We know that each AMP holds a portion of a table. We also know that when a full table scan is performed, each AMP reads its portion of the table. If Teradata isn't careful, a deadlock can happen. A deadlock is when two different users require multiple locks, and one user gets one lock while the other user gets the other lock. Both users require both locks and wait for the other lock to become available. They will wait forever unless Teradata breaks one of the locks.

A pseudo lock is how Teradata prevents a deadlock. When a user does an all-AMP operation, Teradata assigns a single AMP to command the other AMPs to lock the table. Teradata hashes the table name and uses the hash map to choose an AMP. This single "Gatekeeper" AMP will always be responsible for locking that particular table on all AMPs, so every user running an all-AMP query on the table has to report to the "Gatekeeper" AMP. The "Gatekeeper" AMP never plays favorites and performs the locking on a first-come, first-served basis. The first user to run the query gets the lock. The others have to wait.
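The gatekeeper idea described above can be sketched as a toy model in Python. Everything here is an assumption made for illustration: `zlib.crc32` stands in for Teradata's real hashing and hash map, and the `Gatekeeper` class and its method names are hypothetical.

```python
import zlib
from collections import deque


def gatekeeper_amp(table_name: str, num_amps: int) -> int:
    """Pick the AMP that acts as gatekeeper for a table.

    crc32 is only a stand-in for Teradata's hash function and hash map.
    The point is that the same table name always maps to the same AMP.
    """
    return zlib.crc32(table_name.encode("utf-8")) % num_amps


class Gatekeeper:
    """Toy first-come, first-served lock queue kept by the gatekeeper AMP."""

    def __init__(self) -> None:
        # Table name -> users in arrival order; the head of the queue holds the lock.
        self.queues: dict = {}

    def request(self, user: str, table: str) -> bool:
        """Queue a lock request. Returns True if the lock is granted at once."""
        queue = self.queues.setdefault(table, deque())
        queue.append(user)
        return queue[0] == user

    def release(self, table: str):
        """Release the current holder's lock and return the next holder, if any."""
        queue = self.queues[table]
        queue.popleft()
        return queue[0] if queue else None


gate = Gatekeeper()
print(gate.request("user_a", "sales"))  # True: first requester gets the lock
print(gate.request("user_b", "sales"))  # False: user_b has to wait
print(gate.release("sales"))            # user_b is next in line
```

Because every request for the same table goes through one queue, two users can never each hold part of the all-AMP lock while waiting for the rest, which is the deadlock the notes describe.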
MultiLoad
1. A single Teradata MultiLoad job performs a number of different import and delete tasks on database tables and views:
   - Each Teradata MultiLoad import task can do multiple data insert, update, and delete functions on up to five different tables or views.
   - Each Teradata MultiLoad delete task can remove large numbers of rows from a single table.
2. Phases of Teradata MultiLoad operations
3. Use a RELEASE MLOAD statement specifying all of the target tables identified in the TABLES clause of the BEGIN MLOAD command for the aborted Teradata MultiLoad job.
   Note: An unsuccessful RELEASE MLOAD statement indicates that the Teradata MultiLoad job was in the application phase when it terminated. In this case, the Teradata MultiLoad job cannot be abandoned. Restart the job and let it run to completion.
4. Teradata MultiLoad creates four tables that are required for restarting a paused Teradata MultiLoad job:
   - Restart Log Table
   - Work Table
   - Acquisition Error Table
   - Application Error Table
5. Down AMPs
   The impact of down AMPs on Teradata MultiLoad tasks depends on:
   - The number of AMPs that are down, either logically or physically, in a cluster.
   - The operational phase of the Teradata MultiLoad task when the down AMP condition occurs.
   - Whether the target tables are fallback or nonfallback.
   If all of the target tables are fallback, Teradata MultiLoad tasks continue to run as long as no more than one AMP is down, either logically or physically, in a cluster. The down AMP does not participate in the application phase if:
   - The AMP goes down before the Teradata MultiLoad task enters the application phase, and the AMPCHECK parameter is set to NONE
   - Certain I/O errors occur during the application phase
   If all of the target tables are fallback and two or more AMPs are down, either logically or physically, in a cluster, then Teradata MultiLoad tasks do not run, or they terminate if already running.
   Note: In the application phase, if AMPs are down to the extent that data on the disk is corrupted, then the affected tables must be manually restored.
   If one or more of the target tables is nonfallback and one or more AMPs are down, then Teradata MultiLoad tasks terminate, and they cannot be restarted until all of the AMPs are back up.
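The down-AMP rules above can be condensed into a small decision function. This is a simplified sketch, assuming the only inputs are whether every target table is fallback and how many AMPs are down in each cluster. The function name and parameters are hypothetical, and it ignores the operational phase and the AMPCHECK setting.

```python
from typing import Sequence


def mload_can_run(all_targets_fallback: bool, down_amps_per_cluster: Sequence[int]) -> bool:
    """Return True if a MultiLoad task may run or keep running, per the notes.

    down_amps_per_cluster holds the number of down AMPs (logical or
    physical) in each cluster. Hypothetical model, not a Teradata API.
    """
    worst_cluster = max(down_amps_per_cluster, default=0)
    if worst_cluster == 0:
        # Every AMP is up, so nothing blocks the task.
        return True
    if not all_targets_fallback:
        # A nonfallback target with any AMP down: the task terminates and
        # cannot be restarted until all AMPs are back up.
        return False
    # All targets are fallback: tolerate at most one down AMP per cluster.
    return worst_cluster <= 1


print(mload_can_run(True, [0, 1, 0]))   # True: one down AMP in one cluster
print(mload_can_run(True, [2, 0, 0]))   # False: two down AMPs in one cluster
print(mload_can_run(False, [0, 1, 0]))  # False: a nonfallback target is involved
```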
