Teradata Primary and Secondary Index Guide

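Primary index in Teradata:
The primary index determines where the data resides in Teradata. The system hashes the primary index
value of each row, and the hash result decides which AMP gets the row. A table normally has a primary
index; if the CREATE TABLE statement does not name one, the system assigns one by default (newer
releases also support tables without a primary index, called NoPI tables).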
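
As a rough illustration of how the primary index drives row placement, the sketch below creates a table with a primary index and then counts the rows held on each AMP. The table and column names (employee, emp_id, emp_name, dept_no) are made up for this example; HASHROW, HASHBUCKET and HASHAMP are Teradata's built-in hashing functions.

```sql
-- Hypothetical table; emp_id is declared as the (non-unique) primary index.
CREATE TABLE employee (
    emp_id    INTEGER NOT NULL,
    emp_name  VARCHAR(50),
    dept_no   INTEGER
)
PRIMARY INDEX (emp_id);

-- Rows per AMP: the primary index value is hashed, the hash maps to a
-- hash bucket, and the bucket maps to an AMP.
SELECT HASHAMP(HASHBUCKET(HASHROW(emp_id))) AS amp_no,
       COUNT(*)                             AS row_count
FROM employee
GROUP BY 1
ORDER BY 1;
```

A roughly even row_count across the AMPs suggests that the chosen column distributes the data well; a few very large counts point to skew.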
A Secondary Index (SI) is an alternate data access path. It allows you to access the data without
having to do a full-table scan.
You can drop and recreate secondary indexes dynamically, as they are needed. Secondary indexes
are stored in separate subtables, which require additional disk space and maintenance; the system
handles that maintenance automatically.
The sole purpose of the secondary index subtable is to point back to the real row in the
base table via the Row-ID.
Rule 1: Secondary Indexes are optional.
Rule 2: Secondary Index values can be unique or non-unique.
Rule 3: Secondary Index values can be NULL.
Rule 4: Secondary Index values can be modified.
Rule 5: Secondary Indexes can be changed (dropped and recreated) at any time.
Rule 6: A Secondary Index can contain at most 64 columns.
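
The rules above can be sketched with a few statements. This is only an illustration: it assumes a table named employee with columns badge_no and dept_no already exists, and those names are hypothetical.

```sql
-- Unique secondary index (USI): each badge_no value points to at most one row.
CREATE UNIQUE INDEX (badge_no) ON employee;

-- Non-unique secondary index (NUSI): many rows may share one dept_no.
CREATE INDEX (dept_no) ON employee;

-- Secondary indexes are optional, so one can be dropped when it is no
-- longer useful; its subtable is removed with it.
DROP INDEX (dept_no) ON employee;
```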
Teradata utilities:
http://www.bi-dw.info/teradata-loading-tools.htm
In API mode, data processing (load/update/insert/delete) is slow; however, other processes can
access the database tables during the update.
By comparison, utility mode processing (load/update/insert/delete) is faster because it handles data
records in large chunks. During that time, however, no other process can access the database table;
that is, the process running in utility mode locks the table and holds exclusive ownership of it.
In cross-functional, largely distributed organizations, API mode is recommended over utility mode,
because many processes need concurrent access to the same tables and a table lock would block them.
However, for one-time loads or the initialization of huge data volumes in tables, utility mode can be used.
API and utility modes both have advantages and disadvantages:

API mode — Provides flexibility: generally, the vendor opens up a range of functions for
the programmer to use; this permits a wide variety of tasks to be performed against the
database. However, the tradeoff is performance; this is often a slower process than using a
utility. As an Ab Initio user, you might use API mode when you want to use a function that
is not available through a utility. In some instances, a component will only run in API mode
for just this reason - the function inherent in the component is not available through that
vendor’s published utilities. In general, however, it is useful to remember that API mode
executes SQL statements.
Utility mode — Makes direct use of the vendor’s utilities to access the database. These
programs are generally tuned by the vendor for optimum performance. The tradeoff here is
functionality. For example, you might not be able to set up a commit table. In such an
instance, you must trust the ability of the utility to do its job correctly. Because the granular
control given by API mode is not present in utility mode, utility mode is best when your
purpose most closely resembles the purpose for which the utility was created. For example,
any support of transactionality and record locking is subject to the abilities of the utility in
question. Also, unlike API mode, utility mode does not normally run SQL statements.
Secondary index:
http://www.teradatawiki.net/2013/08/Teradata-Secondary-Indexes.html
When are secondary indexes required?
Some queries do not use the primary index (PI). In those cases a secondary index (SI) can
enhance performance and improve the chance of avoiding a full-table scan (FTS). A value-ordered
non-unique secondary index (NUSI) is recommended for range queries.
Secondary indexes can be created and dropped at any time.
The business requirements and the table design determine which SIs to create.
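
A minimal sketch of a value-ordered NUSI for a range query follows. The orders table and its order_id and order_date columns are assumptions made for this example. Teradata restricts value ordering to a single numeric or DATE column, which is why a date column is used here.

```sql
-- Value-ordered NUSI: index rows are stored sorted by order_date
-- rather than by hash, which suits range predicates.
CREATE INDEX (order_date) ORDER BY VALUES (order_date) ON orders;

-- A range query that can benefit from the index.
SELECT order_id, order_date
FROM orders
WHERE order_date BETWEEN DATE '2013-08-01' AND DATE '2013-08-31';

-- EXPLAIN shows whether the optimizer actually picks the index
-- or still falls back to a full-table scan.
EXPLAIN
SELECT order_id, order_date
FROM orders
WHERE order_date BETWEEN DATE '2013-08-01' AND DATE '2013-08-31';
```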

How does a primary index work? - http://www.teradatatech.com/?p=470
Partitioned primary index:
http://www.teradatawiki.net/2013/09/partitioned-primary-index.html
When will you create a PPI and when will you create secondary indexes?
Partitioned primary indexes (PPIs) are created to divide a table into partitions based on ranges or
values, as required. This is effective for larger tables partitioned on date or integer columns.
A PPI needs no subtable, so no special tables are created for it.
Secondary indexes are created on a table to provide an alternate way to access data. This is the second
fastest method of retrieving data from a table, after the primary index. Subtables are created.
With a PPI or a secondary index, a query can avoid a full-table scan and access only a defined set of
data on the AMPs.
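
As an illustration of range partitioning, the sketch below defines a PPI table with one partition per month and then runs a query whose date filter lets the system skip the other partitions. The sales table, its columns and the date range are all hypothetical.

```sql
-- Rows are still distributed to AMPs by the hash of sale_id; on each AMP
-- they are grouped into monthly partitions by sale_date.
CREATE TABLE sales (
    sale_id    INTEGER NOT NULL,
    sale_date  DATE NOT NULL,
    amount     DECIMAL(10,2)
)
PRIMARY INDEX (sale_id)
PARTITION BY RANGE_N(
    sale_date BETWEEN DATE '2013-01-01' AND DATE '2013-12-31'
    EACH INTERVAL '1' MONTH
);

-- Only the March partition needs to be read on each AMP.
SELECT SUM(amount) AS march_total
FROM sales
WHERE sale_date BETWEEN DATE '2013-03-01' AND DATE '2013-03-31';
```

The primary index here is non-unique on purpose: a unique primary index on a partitioned table would have to include the partitioning column.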
When will you choose a primary index and when will you choose a secondary index?
The primary index is chosen at the time of table creation. It drives data distribution, data
retrieval and join operations.
Secondary indexes can be created and dropped at any time. They are used as an alternate path to
access data other than the primary index.
What is the difference between SAMPLE and TOP?
The Sampling function (SAMPLE) permits a SELECT to randomly return rows from a Teradata
database table.
It allows the request to specify either an absolute number of rows or a percentage of rows to return.
Additionally, it provides an ability to return rows from multiple samples.
SELECT * FROM student_course_table SAMPLE 5;
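The TOP clause is used to specify the number of records to return. Unlike SAMPLE, it does not pick
rows at random; combined with ORDER BY, it returns the first rows of the sorted result. The TOP clause
can be very useful on large tables with thousands of records, where returning a large number of records
can hurt performance.
SELECT TOP 2 * FROM EMP;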
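
A few more variations are sketched below to show the contrast. The table names come from the statements above; the salary column is an assumption added for illustration.

```sql
-- SAMPLE with a fraction: roughly 25 percent of the rows, chosen at random.
SELECT * FROM student_course_table SAMPLE 0.25;

-- Multiple samples in one request: 3 rows, then 5 different rows.
-- SAMPLEID reports which sample (1 or 2) each returned row belongs to.
SELECT s.*, SAMPLEID
FROM student_course_table s
SAMPLE 3, 5;

-- TOP with ORDER BY: the two rows with the highest salary, not a random pick.
-- salary is a hypothetical column of EMP.
SELECT TOP 2 *
FROM EMP
ORDER BY salary DESC;
```

Running a SAMPLE query twice generally returns different rows, while the TOP query with ORDER BY returns the same rows as long as the data does not change.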
