DATA PROTECTION IN TERADATA AND TERADATA UTILITIES OVERVIEW

Different data protection methods in Teradata
• Transient Journal
• RAID Protection
• Fallback
• Clusters
• Clique
• Permanent Journals

Locking for data integrity

Transient Journal
• The transient journal permits the successful rollback of a failed transaction
• In case a transaction fails, the data is returned to its original state after the transaction failure
• Transient Journal activities are automatic
• The transient journal maintains a copy, on each AMP, of before images of all rows affected by the transaction
• If the transaction fails, the before images are reapplied, the rollback operation is completed, and the before images are then deleted from the journal
• If the transaction succeeds, the before images for the transaction are discarded

RAID Protection
• Provides protection against disk failure
• Teradata supports the following disk protection schemes: RAID-1 and RAID-5
• RAID-1 is a disk mirroring technique wherein a mirror copy of each physical disk is maintained; in case of a disk failure, the mirror disk becomes the primary disk for the data and performance is unchanged
• RAID-5 is a parity checking technique; it does not store redundant data, but stores parity information using which the data on the failed disk can be reconstructed
• Teradata recommends RAID-1

Fallback Feature
• Provides data protection from AMP failure
• Fallback is an optional feature in Teradata which we need to specify during table creation; if specified, it is automatic
• Stores a copy of each row of the table on a separate Fallback AMP in the same cluster
• Fallback guarantees that the two copies of a row will always be on different AMPs
• If an AMP fails, the system accesses the Fallback rows to meet requests


Fallback Cluster
• A cluster is a group of AMPs that act as a single Fallback unit
• Clustering has no effect on the primary row distribution of the table, but the Fallback row will always go to another AMP in the same cluster
• Cluster size may range from 2 to 16 AMPs
• The loss of an AMP in one cluster has no effect upon other clusters; it is possible to lose one AMP in each cluster and still have full access to all Fallback-protected table data
• When one AMP fails in a cluster, the other AMPs in the cluster must do their own work plus the work of the failed AMP, i.e., the workload of the other AMPs will increase
• But if two AMPs fail in the same cluster, data access is lost


Recovery Journal for Down AMPs
• After the loss of any AMP, a down-AMP recovery journal is started automatically to log any changes to rows which reside on the down AMP
• Any INSERT/UPDATE/DELETE operations to the rows on the down AMP are applied to the Fallback copy within the cluster
• Once the down AMP is active again, the recovery journal is read and the changes are applied to the recovered AMP
• The recovery journal is then discarded and the AMP is brought back online again

Cliques
• A collection of Teradata nodes which share a common set of disks
• Provides fault tolerance from node failures
• In case of a node failure, the vprocs from the failed node can migrate to other available nodes in the clique, thereby keeping the system operational
• When the failed node is returned to operation, the vprocs are moved back to the original node


Permanent Journal
• Permanent Journals are optional and can be defined during table creation
• Provides database recovery up to a specified point in time
• We can specify capture of before images (for roll back), after images (for roll forward), or both, for all rows which are changed (INSERT/UPDATE/DELETE operations)
• Additionally, the user must specify whether single images (default) or dual images (for fault tolerance) are to be captured
• Multiple tables or multiple databases may share a permanent journal
• Reduces the need for full table backups, which are very costly
• The journal can be periodically dumped to external media

Table definition with Fallback & Journal enabled

CREATE TABLE table_name, FALLBACK,
NO BEFORE JOURNAL,
AFTER JOURNAL
( field1 INTEGER,
  field2 INTEGER )
PRIMARY INDEX (field1);

Locking in Teradata
• Locks are used for concurrency control; locking prevents multiple users from changing the same data at the same time, which could affect data integrity
• Locks are automatically acquired during the processing of a request and released at the termination of the request; users can also specify locks explicitly
• There are 3 levels of locking in Teradata, i.e., Database level, Table level and Row level
• There are 4 types of locks, i.e., Exclusive locks, Write locks, Read locks and Access locks
• The type and level of locks are automatically chosen based on the type of SQL command

Locking
• Exclusive locks are placed whenever there is any database- or table-level structural change, i.e., DDL changes; this is the most restrictive of all the locks
• A Write lock is established when there is an INSERT, DELETE or UPDATE request; write locks enable users to modify data while locking out all other users except readers not concerned about data consistency
• Read locks are used to ensure consistency during read operations; read locks are established for SELECT requests, and several users may hold concurrent read locks on the same data
• Access locks (stale read locks): users who are not concerned about data consistency can specify access locks; this is placed in response to a user-defined LOCKING FOR ACCESS
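
To illustrate the last point, a minimal sketch of a user-specified access lock is shown below; the table and column names (Sales_Fact, store_id, sale_amt) are hypothetical placeholders, not objects from these slides.

/* illustrative only -- Sales_Fact and its columns are placeholder names */
LOCKING TABLE Sales_Fact FOR ACCESS
SELECT store_id, SUM(sale_amt) AS total_sales
FROM Sales_Fact
GROUP BY store_id;

The access lock lets the SELECT proceed even while other sessions hold write locks on the table, at the cost of possibly reading uncommitted ("dirty") data.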

Locking
• Teradata locks objects on a first come, first served basis; the first user to request an object is the first to lock the object
• Teradata will place other users who are accessing the same object in a locking queue
• Teradata allows a user to move up the queue if their lock is compatible with the lock in front of them

Compatibility between different Locks


• In the example on the previous slide:
  USER 1 is first in line and READS the object
  USER 2 must wait on USER 1
  USER 3 must wait on USER 2
  USER 4 moves past USER 3, then USER 2, and simultaneously reads the object with USER 1

Example 2:

  USER 1 READS the object immediately
  USER 2 is compatible and also WRITES on the object
  USER 3 must wait on both USER 1 and USER 2
  USER 4 must wait until USER 3 is done; it is not compatible with the EXCL lock and cannot move up

TERADATA UTILITIES OVERVIEW

BTEQ
• BTEQ stands for Basic Teradata Query
• Batch-mode utility for submitting SQL requests to the Teradata database
• Runs on every supported platform, from laptop to mainframe
• Can be used to export data from the Teradata database to a client system
• Reads input data and imports it to the Teradata database as bulk INSERTs, UPDATEs, DELETEs or Upserts
• A BTEQ script is a combination of BTEQ commands and SQL statements
• Supports conditional logic and error handling
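
A minimal BTEQ export sketch is shown below; the TDPID, credentials, database, table and file names are placeholders, not part of the original material.

/* illustrative only -- tdpid, mydb.sales_fact and sales_report.txt are placeholders */
.LOGON tdpid/username,password;
.EXPORT REPORT FILE = sales_report.txt;

SELECT store_id, SUM(sale_amt) AS total_sales
FROM   mydb.sales_fact
GROUP BY 1;

.EXPORT RESET;
/* conditional logic: abort with a non-zero return code if the SELECT failed */
.IF ERRORCODE <> 0 THEN .QUIT 8;
.LOGOFF;
.QUIT 0;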

FASTLOAD
• FastLoad is a utility that can be used to quickly load large amounts of data into an empty table on Teradata
• FastLoad uses multiple sessions to load data to the Teradata table
• Quickly loads the data into an empty table in 64K blocks
• One FastLoad job per target table
• Full restart capability
• Error limits may be set
• Error tables collect records that fail to load, which can later be used for analysis

Restrictions on FastLoad
• Target table must initially be empty
• Target tables must NOT have:
  – Secondary Indexes defined
  – Enabled Triggers
  – Referential Integrity constraints
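
Putting the points above together, a FastLoad script might be sketched as follows; the system name, credentials, target and error tables, and the pipe-delimited input file are all assumed placeholders.

/* illustrative only -- all object and file names are placeholders */
LOGON tdpid/username,password;
DATABASE mydb;
ERRLIMIT 50;

SET RECORD VARTEXT "|";
DEFINE store_id (VARCHAR(10)),
       sale_amt (VARCHAR(20))
FILE = sales.txt;

BEGIN LOADING mydb.sales_stage
      ERRORFILES mydb.sales_err1, mydb.sales_err2;

INSERT INTO mydb.sales_stage (store_id, sale_amt)
VALUES (:store_id, :sale_amt);

END LOADING;
LOGOFF;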

MULTILOAD
• MultiLoad is used for loading, updating or deleting data to and from populated or empty tables
• Supports up to five target tables per script
• Performs block-level operations against populated tables and is good for high-percentage updates; data processing is done in 64K blocks
• Uses conditional logic for applying changes
• Ability to do INSERTs, UPDATEs, DELETEs and UPSERTs (UPDATE if exists, else INSERT)
• Supports mainframe and network-attached systems
• Full restart capability using a Logtable
• Definable error limits
• Error capture and reporting via error tables

MULTILOAD
• There are two distinct types of tasks that MultiLoad can perform:
  – IMPORT task: intermix a number of different SQL/DML statements and apply them to up to five different tables
  – DELETE task: execute a single DELETE statement on a single table
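
As an illustration of an IMPORT task, a MultiLoad UPSERT script might be sketched as below; the log, work and error tables, the layout and label names, and the data file are hypothetical.

/* illustrative only -- all object and file names are placeholders */
.LOGTABLE mydb.sales_ml_log;
.LOGON tdpid/username,password;

.BEGIN IMPORT MLOAD
       TABLES mydb.sales_fact
       WORKTABLES mydb.sales_wt
       ERRORTABLES mydb.sales_et mydb.sales_uv;

.LAYOUT sales_layout;
.FIELD in_store_id * VARCHAR(10);
.FIELD in_sale_amt * VARCHAR(20);

.DML LABEL upsert_sales
     DO INSERT FOR MISSING UPDATE ROWS;
UPDATE mydb.sales_fact
   SET sale_amt = :in_sale_amt
 WHERE store_id = :in_store_id;
INSERT INTO mydb.sales_fact (store_id, sale_amt)
VALUES (:in_store_id, :in_sale_amt);

.IMPORT INFILE sales.txt
        FORMAT VARTEXT '|'
        LAYOUT sales_layout
        APPLY upsert_sales;

.END MLOAD;
.LOGOFF;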

FASTEXPORT
• Exports large volumes of data from Teradata to a file
• Can use multiple sessions with Teradata
• It can export data from multiple tables
• Fully automated restart capability
• Data export is done in 64K blocks
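
A simple FastExport sketch follows; the log table, credentials, output file and source table are placeholders.

/* illustrative only -- all object and file names are placeholders */
.LOGTABLE mydb.sales_fexp_log;
.LOGON tdpid/username,password;

.BEGIN EXPORT SESSIONS 4;

.EXPORT OUTFILE sales_extract.dat;

SELECT store_id, sale_amt
FROM   mydb.sales_fact;

.END EXPORT;
.LOGOFF;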

TPump Utility
• Allows near real-time updates from transactional systems into the warehouse
• Allows constant loading of data into a table
• Performs INSERT, UPDATE, DELETE, or a combination, to more than 60 tables at a time
• Allows target tables to:
  – Have secondary indexes, referential integrity constraints and enabled triggers
  – Be populated or empty
• Supports automatic restarts
• No session limit; use as many sessions as necessary
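
To round out the utilities, a TPump sketch for continuous row-at-a-time inserts is given below; all object and file names are placeholders, and the SESSIONS and PACK values are arbitrary.

/* illustrative only -- all object and file names are placeholders */
.LOGTABLE mydb.sales_tpump_log;
.LOGON tdpid/username,password;

.BEGIN LOAD SESSIONS 4
       PACK 20
       ERRORTABLE mydb.sales_tpump_et;

.LAYOUT txn_layout;
.FIELD in_store_id * VARCHAR(10);
.FIELD in_sale_amt * VARCHAR(20);

.DML LABEL ins_sales;
INSERT INTO mydb.sales_fact (store_id, sale_amt)
VALUES (:in_store_id, :in_sale_amt);

.IMPORT INFILE txn_feed.txt
        FORMAT VARTEXT '|'
        LAYOUT txn_layout
        APPLY ins_sales;

.END LOAD;
.LOGOFF;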
