After completing this module, you will be able to:
Analyze Optimizer Access scenarios.
Explain partial value searches and data conversions.
Identify the effects of conflicting data types.
Determine the cost of I/Os.
Identify column level attributes and constraints.
Identify table level attributes and constraints.
Add, modify and drop constraints from tables.
Explain how the Identity column allocates new numbers.
Unique Secondary Index: Very efficient. Two AMPs, one row. No spool file.
The Optimizer chooses the fastest access method. COLLECT STATISTICS to help the Optimizer make good decisions.
[Table: Optimizer access scenarios for Col_1 (entries include USI and FTS); cell layout not recoverable.]
Notes:
1. The Optimizer prefers Primary Indexes over Secondary Indexes. It chooses the NUPI if only one I/O (block) is accessed. The Optimizer prefers Unique indexes over non-unique indexes. Only one row is involved with a USI even though it is a two-AMP operation.
2. Depending on relative selectivity, the Optimizer may use either NUSI, may use both with NUSI Bit Mapping, or may do a FTS.
The Teradata Database does a FTS on a partial index value unless the index is ordered by value (Value-ordered NUSI or Hash Index).
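The difference between a partial value search and a value-ordered index can be sketched as follows. This is a minimal illustration, not output from a real system: the tables Customer_Demo and Orders_Demo, their columns, and the date range are all hypothetical, and the actual plan depends on statistics and release, so check it with EXPLAIN.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE Customer_Demo
  (cust_id    INTEGER NOT NULL
  ,last_name  CHAR(20))
UNIQUE PRIMARY INDEX (cust_id);

CREATE TABLE Orders_Demo
  (order_id   INTEGER NOT NULL
  ,order_date DATE)
UNIQUE PRIMARY INDEX (order_id);

-- Ordinary (hash-ordered) NUSI on last_name.
CREATE INDEX (last_name) ON Customer_Demo;

-- Partial value search: only the leading characters are supplied, so the
-- hash-ordered NUSI cannot be probed and a full-table scan is expected.
EXPLAIN
SELECT * FROM Customer_Demo WHERE last_name LIKE 'Sm%';

-- Value-ordered NUSI: index rows are stored in order_date sequence.
CREATE INDEX (order_date) ORDER BY VALUES (order_date) ON Orders_Demo;

-- A range request on the value-ordered column may use the index
-- instead of scanning the whole table.
EXPLAIN
SELECT * FROM Orders_Demo
WHERE  order_date BETWEEN DATE '2004-01-01' AND DATE '2004-01-31';
```

Comparing the two EXPLAIN outputs is the practical way to confirm which access path the Optimizer actually picked.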
Data Conversions
Columns (or values) must be of the same data type to be compared. If the column (or value) types differ, an internal conversion is performed.
Numeric values are converted to the same underlying representation. Character to numeric comparison requires the character value to be converted to a numeric value.
Data conversion is expensive and generally unnecessary. Implement data types at the Domain level. Comparison across data types may indicate that Domain definitions are not clearly understood.
Comparison Rules: To be compared, columns must be of the same data type. When a character value is compared to a numeric value, the character value is always the one converted to numeric. Bottom Line: Always store numeric data in numeric data types to avoid unnecessary and costly data conversions.
INTEGER = DATE = DECIMAL (x,0)
CHAR = VARCHAR = LONG VARCHAR
BYTE = VARBYTE
GRAPHIC = VARGRAPHIC
Administer data type assignments at the domain level.
Give matching Primary Indexes across tables the same data type.
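A conflicting-data-type comparison might look like the sketch below. The table Emp_Demo and its columns are hypothetical and exist only to show the rule; the intent is that numeric-looking data stored in a CHAR column forces a conversion whenever it is compared to a number.

```sql
-- Hypothetical table: emp_code holds digits but is declared as CHAR.
CREATE TABLE Emp_Demo
  (emp_id   INTEGER NOT NULL
  ,emp_code CHAR(5))
UNIQUE PRIMARY INDEX (emp_id);

-- Character column compared to a numeric literal:
-- every emp_code value must be converted to numeric before the comparison.
SELECT * FROM Emp_Demo WHERE emp_code = 12345;

-- Both sides are character: no conversion is needed.
SELECT * FROM Emp_Demo WHERE emp_code = '12345';

-- Both sides are numeric: no conversion is needed.
SELECT * FROM Emp_Demo WHERE emp_id = 12345;
```

Declaring emp_code as INTEGER in the first place would remove the conversion from the first query.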
Cache hits, swapping, rows per block, cylinder splits/migrates, mini-cylpacks, number of spool files, spool file sizes.
I/Os may be done serially or in parallel. Data and index block I/O may or may not require Cylinder Index I/O. Changes to data rows and USI rows require Transient Journal I/O. I/O counts indicate the relative cost of a transaction. A given I/O operation may not cause any actual physical I/O.
The Transient Journal (TJ) provides for automatic rollback in the event of transaction (TXN) failure. It is automatic and transparent. TJ space comes from available free cylinders in the system. When a transaction completes, TJ space is returned to the free cylinder lists. The TJ provides Transaction Integrity.
Therefore, when modifying a table, there are I/Os for both the data table and the Transient Journal. Some situations where the Transient Journal is not used include:
INSERT / SELECT into an empty table
DELETE FROM tablename ALL
Utilities such as FastLoad and MultiLoad
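The cases above can be sketched in SQL as follows. This assumes an existing Employee table with a salary_amount column; Emp_Copy and salary_amount are hypothetical names, and the comments restate the rules from the text rather than measured behavior.

```sql
-- Hypothetical empty copy of an existing Employee table.
CREATE TABLE Emp_Copy AS Employee WITH NO DATA;

-- INSERT / SELECT into an empty table: listed above as not using the TJ.
INSERT INTO Emp_Copy
SELECT * FROM Employee;

-- Unconditional delete of all rows: listed above as not using the TJ.
DELETE FROM Emp_Copy ALL;

-- Ordinary row changes: expect Transient Journal I/O in addition to
-- the data block I/O.
UPDATE Employee
SET    salary_amount = salary_amount * 1.05;
```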
INSERT and DELETE Operations
* = I/O Operations
DATA ROW
* READ DATA BLOCK
* WRITE TRANSIENT JOURNAL
  INSERT or DELETE the DATA ROW
* WRITE NEW DATA BLOCK
* WRITE CYLINDER INDEX
For each USI
* READ INDEX BLOCK
* WRITE TRANSIENT JOURNAL
  INSERT or DELETE the NEW INDEX ROW
* WRITE NEW INDEX BLOCK
* WRITE CYLINDER INDEX
For each NUSI
* READ INDEX BLOCK
  ADD or DELETE the ROWID on the ROWID LIST or ADD or DELETE the SUBTABLE ROW
* WRITE NEW INDEX BLOCK
* WRITE CYLINDER INDEX
UPDATE Operations
UPDATE tablename SET colname = exp . . .
* = I/O Operations
DATA ROW
* READ CURRENT DATA BLOCK
* WRITE TRANSIENT JOURNAL
  CHANGE DATA COLUMN
* WRITE DATA BLOCK
* WRITE CYLINDER INDEX
DATA ROW
** READ CURRENT DATA BLOCK, WRITE TRANSIENT JOURNAL
   DELETE the DATA ROW
** WRITE NEW DATA BLOCK, WRITE CYLINDER INDEX
** READ NEW DATA BLOCK, WRITE TRANSIENT JOURNAL
   INSERT the DATA ROW
** WRITE NEW DATA BLOCK, WRITE CYLINDER INDEX
For each USI
* READ INDEX BLOCK
* WRITE TRANSIENT JOURNAL
  UPDATE the INDEX ROW with the new ROW ID
* WRITE NEW INDEX BLOCK
* WRITE CYLINDER INDEX
For each NUSI
* READ INDEX BLOCK
  UPDATE the ROW ID on the ROW ID LIST with the new ROW ID
* WRITE NEW INDEX BLOCK
* WRITE CYLINDER INDEX
[Table: Permanent Journal I/O counts by journal option (NONE, SINGLE, DUAL); the listed counts range from 2 to 8. Cell layout not recoverable.]
The total number of Permanent Journal I/O operations per row is: INSERT: Total PJ I/O = Count + (#USIs * Count)
DATABLOCKSIZE = value in BYTES or KILOBYTES (or KBYTES) | MINIMUM DATABLOCKSIZE | MAXIMUM DATABLOCKSIZE, optionally with IMMEDIATE
FREESPACE: Percent of freespace to keep on cylinder during load operations (0 - 75%).
CHECKSUM = DEFAULT | NONE | LOW | MEDIUM | HIGH | ALL: Disk I/O Integrity Check (V2R5.1 feature).
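A sketch of how these table-level attributes might be written is shown below. The table name Table_1_Demo and the specific values (16384 bytes, 10 percent, 32 kilobytes) are illustrative assumptions, not recommendations; valid ranges depend on the release.

```sql
-- Hypothetical table showing table-level attributes before the column list.
CREATE TABLE Table_1_Demo
  ,DATABLOCKSIZE = 16384 BYTES
  ,FREESPACE = 10 PERCENT
  ,CHECKSUM = DEFAULT
  (col1 INTEGER NOT NULL
  ,col2 INTEGER)
UNIQUE PRIMARY INDEX (col1);

-- Changing the data block size later; IMMEDIATE asks for existing
-- blocks to be repacked now rather than as they are next modified.
ALTER TABLE Table_1_Demo, DATABLOCKSIZE = 32 KILOBYTES IMMEDIATE;
```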
CREATE TABLE Table_2
  (col1 INTEGER NOT NULL
  ,col2 INTEGER NOT NULL
  ,col3 INTEGER
  ,col4 INTEGER );
All constraints are named. All constraints are at column level. PRIMARY KEY columns must have NOT NULL attribute. UNIQUE columns must also have NOT NULL attribute.
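A table matching that description (every constraint named, every constraint at column level) could look like the sketch below. The table names and the constraint names primary_1, unique_1, check_1 and reference_1 are hypothetical, as is the parent table used by the foreign key.

```sql
-- Hypothetical parent table; the referenced column must be unique.
CREATE TABLE Parent_Demo
  (col1 INTEGER NOT NULL)
UNIQUE PRIMARY INDEX (col1);

-- All four constraint types, each named and declared at column level.
CREATE TABLE Table_Col_Demo
  (col1 INTEGER NOT NULL CONSTRAINT primary_1   PRIMARY KEY
  ,col2 INTEGER NOT NULL CONSTRAINT unique_1    UNIQUE
  ,col3 INTEGER          CONSTRAINT check_1     CHECK (col3 > 0)
  ,col4 INTEGER          CONSTRAINT reference_1 REFERENCES Parent_Demo (col1)
  );
```

Note that col1 and col2 carry NOT NULL, as required for PRIMARY KEY and UNIQUE columns.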
Some constraints are named. Some constraints are unnamed. All constraints are at table level.
 . . .
 PRIMARY KEY
,FOREIGN KEY (dept_mgr_number) REFERENCES Employee (employee_number)
,CONSTRAINT dn_1000_plus CHECK (dept_number > 999)
);
Some constraints are named, some are not. Some constraints are at column level. Some are at table level.
Notes: Primary key constraint becomes a named index. Unique constraint becomes a unique index. All constraints are specified at table level.
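For comparison, here is a sketch with every constraint named and specified at table level. Employee_Demo, Department_Demo, and the constraint names other than dn_1000_plus are hypothetical; only the column names dept_number, dept_mgr_number and employee_number come from the fragment above.

```sql
-- Hypothetical parent table for the foreign key.
CREATE TABLE Employee_Demo
  (employee_number INTEGER NOT NULL
  ,last_name       CHAR(20))
UNIQUE PRIMARY INDEX (employee_number);

-- All constraints named and declared at table level.
CREATE TABLE Department_Demo
  (dept_number     INTEGER  NOT NULL
  ,dept_name       CHAR(20) NOT NULL
  ,dept_mgr_number INTEGER
  ,CONSTRAINT pk_dept      PRIMARY KEY (dept_number)
  ,CONSTRAINT uk_dept_name UNIQUE (dept_name)
  ,CONSTRAINT fk_dept_mgr  FOREIGN KEY (dept_mgr_number)
                           REFERENCES Employee_Demo (employee_number)
  ,CONSTRAINT dn_1000_plus CHECK (dept_number > 999)
  );
```

Because no PRIMARY INDEX clause is given for Department_Demo, the primary key is expected to be implemented as the unique primary index; with an explicit PRIMARY INDEX on another column it would become a unique secondary index instead.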
To drop constraints:
ALTER TABLE tablename DROP CONSTRAINT constrname ;
In V2R5, the ALTER TABLE command can also be used to add new columns (up to 2048) to an existing table.
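Adding, modifying and dropping a constraint, plus adding a column, might be sketched as follows. The table Budget_Demo, its columns, and the constraint name budget_positive are hypothetical.

```sql
-- Hypothetical table to alter.
CREATE TABLE Budget_Demo
  (dept_number   INTEGER NOT NULL
  ,budget_amount DECIMAL(10,2))
UNIQUE PRIMARY INDEX (dept_number);

-- Add a named CHECK constraint.
ALTER TABLE Budget_Demo
  ADD CONSTRAINT budget_positive CHECK (budget_amount >= 0);

-- Modify the named CHECK constraint (it must be named to be modified).
ALTER TABLE Budget_Demo
  MODIFY CONSTRAINT budget_positive CHECK (budget_amount > 0);

-- Drop the constraint by name.
ALTER TABLE Budget_Demo
  DROP CONSTRAINT budget_positive;

-- Add a new column to the existing table.
ALTER TABLE Budget_Demo
  ADD dept_location CHAR(30);
```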
Identity Columns can be used to: Guarantee row uniqueness in a table. Guarantee even row distribution for a table. Optimize and simplify the initial port from other databases that use generated keys.
Identity Columns are valid for:
Single inserts
Multi-session concurrent insert requests (e.g., TPump)
INSERT SELECT
Identity Columns Save Overhead/Maintenance Costs:
Reduce need for uniqueness constraints
Reduce manual coding tasks
Generate unique PK values
Comply with the ANSI Standard
GENERATED ALWAYS + NO CYCLE implies uniqueness.
CYCLE restarts numbering after the maximum/minimum number is generated.
A DBSControl setting indicates the number pool size to reserve for generating numbers.
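An Identity column definition using these options might look like the sketch below. The table name Table_B_Demo and the START WITH, MINVALUE and MAXVALUE settings are assumptions chosen for illustration; the column names follow the sample output in this module.

```sql
-- Hypothetical table with a GENERATED ALWAYS identity column.
-- GENERATED ALWAYS plus NO CYCLE means a number is never reused.
CREATE TABLE Table_B_Demo
  (Cust_Number INTEGER GENERATED ALWAYS AS IDENTITY
                 (START WITH 1
                  INCREMENT BY 1
                  MINVALUE 1
                  MAXVALUE 10000000
                  NO CYCLE)
  ,LName       VARCHAR(15)
  ,Zip_Code    INTEGER)
UNIQUE PRIMARY INDEX (Cust_Number);

-- The identity column is left out of the insert; the system supplies it.
INSERT INTO Table_B_Demo (LName, Zip_Code) VALUES ('Tatem', 89714);
INSERT INTO Table_B_Demo (LName, Zip_Code) VALUES ('Kroger', 98101);
```

Because each AMP reserves its own pool of numbers, the generated values are unique but should not be expected to follow insertion order or to be free of gaps.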
SELECT * FROM Table_B ORDER BY 1 DESC;

Cust_Number  LName    Zip_Code
   10000000  Tatem    89714
    9999999  Kroger   98101
    9999998  Yang     77481
    9999997  Miller   45458
          :  :        :
    9900000  Powell   57501
    9899999  Gordan   89714
    9899998  Smoothe  80002
          :  :        :
Typically define the Primary Index. Define as the Primary Index only if it is the primary path. If it is also used as an access path, consider it as a Secondary Index.
Generated By Default Identity Columns
Facilitate copying data from one table into another. Use a numeric type large enough to hold all the values that will ever be required. Never use as a substitute for a good logical database design. May not optimally utilize Teradata join and access capabilities.
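A GENERATED BY DEFAULT column could be sketched as follows. The table Cust_Copy_Demo and the inserted values are hypothetical; the point is that a supplied value is kept, which is what makes copying rows from another table straightforward, while a NULL asks the system to generate one.

```sql
-- Hypothetical table with a GENERATED BY DEFAULT identity column.
CREATE TABLE Cust_Copy_Demo
  (Cust_Number INTEGER GENERATED BY DEFAULT AS IDENTITY
                 (START WITH 1
                  INCREMENT BY 1
                  NO CYCLE)
  ,LName       VARCHAR(15))
PRIMARY INDEX (Cust_Number);

-- A supplied value is stored as given (for example, when copying existing keys).
INSERT INTO Cust_Copy_Demo (Cust_Number, LName) VALUES (500, 'Tatem');

-- A NULL causes the system to generate the number.
INSERT INTO Cust_Copy_Demo (Cust_Number, LName) VALUES (NULL, 'Yang');
```

Since user-supplied values are accepted, BY DEFAULT does not by itself guarantee uniqueness; a unique index or constraint is needed if duplicates must be rejected.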
Restrictions
A table can only have 1 Identity column. FastLoad and MultiLoad do not support Identity columns with Teradata V2R5.0. The ALTER TABLE statement cannot add an Identity Column to an existing table. Cannot be part of a composite primary or a composite secondary index. Cannot be used with Global Temporary or volatile tables. Cannot be used in a join index, hash index, PPI or value-ordered index. Atomic UPSERTs are not supported on a table with an Identity Column as its PI. GENERATED ALWAYS Identity Column value updates are not supported.
Note: With Teradata V2R5.1, Identity columns are supported with the FastLoad, MultiLoad, and Teradata Warehouse Builder (TWB) utilities.
Review Questions
1. Which one of the following situations requires the use of the Transient Journal?
   a. INSERT / SELECT into an empty table
   b. UPDATE all the rows in a table
   c. DELETE all the rows in a table
   d. loading a table with FastLoad
2. What is a negative impact of updating a UPI value?
   ______________________________________________________
   ______________________________________________________
3. What are the 4 types of constraints?
   _____________ _____________ _____________ _____________
4. True or False? A primary key constraint is always implemented as a primary index.
5. True or False? A primary key constraint is always implemented as a unique index.
6. True or False? Multi-column constraints must be coded as table level constraints.
7. True or False? Only named check constraints may be modified.
8. True or False? Named primary key constraints may always be dropped if they are no longer needed.
9. True or False? Using the START WITH 1 and INCREMENT BY 1 options with an Identity column will provide sequential numbering with no gaps for the column.