This self-study exam preparation guide for the Teradata 12 Certified Master certification exam contains everything you need to test yourself and pass the exam. All exam topics are covered. The guide provides insider secrets, complete explanations of all Teradata 12 Certified Master subjects, test tricks and tips, numerous highly realistic sample questions, and exercises designed to strengthen understanding of Teradata 12 Certified Master concepts and prepare you for exam success on the first attempt. Put your knowledge and experience to the test. Achieve Teradata 12 Certified Master certification and accelerate your career. Can you imagine valuing a book so much that you send the author a “Thank You” letter? Tens of thousands of people understand why this is a worldwide best-seller. Is it the author's years of experience? The endless hours of ongoing research? The interviews with those who failed the exam, to identify gaps in their knowledge? Or is it the razor-sharp focus on making sure you don’t waste a single minute of your time studying any more than you absolutely have to? Actually, it’s all of the above. This book includes new exercises and sample questions never before in print. Offering numerous sample questions, critical time-saving tips, and information available nowhere else, this book will help you pass the Teradata 12 Certified Master exam on your first try. Up to speed with the theory? Buy this. Read it. And pass the Teradata 12 Certified Master exam.
Certification Exam Preparation Course in a Book for Passing the
Certified Master
The How To Pass on Your First Try Certification Study Guide
Teradata 12 Certified Master Exam Preparation
This Exam Preparation book is intended for those preparing for the Teradata 12 Certified Master certification. This book is not a replacement for completing the course. It is a study aid to assist those who have completed an accredited course and are preparing for the exam. Do not underestimate the value of your own notes and study aids. The more you have, the more prepared you will be. While it is not possible to pre-empt every question and topic that may be asked in the Teradata exams, this book covers the main concepts within the Database Management discipline. Due to licensing rights, we are unable to provide actual Teradata exams. However, the study notes and sample exam questions in this book will allow you to prepare more easily for Teradata exams.
Ivanka Menken Executive Director The Art of Service
Notice of Rights
All rights reserved. No part of this book may be reproduced or transmitted in any form by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.

Notice of Liability
The information in this book is distributed on an “As Is” basis without warranty. While every precaution has been taken in the preparation of the book, neither the author nor the publisher shall have any liability to any person or entity with respect to any loss or damage caused or alleged to be caused directly or indirectly by the instructions contained in this book or by the products described in it.

Trademarks
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations appear as requested by the owner of the trademark. All other product names and services identified throughout this book are used in editorial fashion only and for the benefit of such companies with no intention of infringement of the trademark. No such use, or the use of any trade name, is intended to convey endorsement or other affiliation with this book.
Contents
Foreword
Teradata 12 Certified Master
Exam Specifics
Teradata Products
Teradata Solution
Teradata Database
Data Warehouse Concepts
Relational Database Concepts
Tables
Views
SQL Stored Procedures
External Stored Procedures
Macros
Triggers
User-Defined Functions
User-Defined Methods
User-Defined Types
Databases and Users
Data Dictionary Views
Teradata RDBMS Components
Platforms
Virtual Processors
Processing Requests
Disk Arrays
Cliques
Hot Standby Nodes
Parallel Database Extensions
Workstations Types
Teradata Database Window
Database Requirements
Fault Tolerance from Software
Fault Tolerance from Hardware
Redundant Array of Inexpensive Disks
Client Communication
Network Attachments
Channel Attached Systems
Data Availability
Concurrency Control
Transactions
Locks
Host Utility Locks
Recovery
Two-Phase Commit Protocol
Teradata Tools and Utilities
Data Archiving Utilities
Load and Extract Utilities
Access Modules
Querying
Session and Configuration Management
Resource and Workload Management
Security and Privacy
Concepts of Security
Users
Database Privileges
Authentication
Logon
Authorization
Data Protection
Security Monitoring
Security Policy
Database Design
Development Planning
Design Considerations for Teradata Database
Data Marts
Data Warehousing
Parallel Processing
Usage Considerations
ANSI/X3/SPARC Three Schema Architecture
Design Phases
Requirements Analysis
Entity-Relationship Models
Normalization Process
Join Modeling
Activity Transaction Modeling Process
Indexing
Primary Indexes
Secondary Indexes
Join Indexes
Hash Indexes
Integrity
Set Theory
Semantic Integrity
Physical Integrity
Database Principles
Missing Values
Capacity Planning
Planning Considerations
Database Sizes
Estimating Space Requirements
Structured Query Language
Overview
SQL Statements
SELECT Statements
SQL Data Types
Recursive Query
SQL Functions
Cursors
Session Modes
SQL Applications
EXPLAIN Request Modifier
Third-Party Development
Database Objects
Databases and Users
Tables
Columns
Data Types
User-Defined Types
Keys
Indexes
Primary Index
Secondary Index
Join Index
Hash Index
Referential Integrity
Views
Triggers
Macros
Stored Procedures
User-Defined Functions
Profiles
Roles
SQL Syntax
Statement Structure
Keywords
Literals
Operators
Functions
Delimiters and Separators
Default Database
Functional Families
Data Definition Language
General DDL Statements
Data Control Language
Data Manipulation Language
Query and Workload Analysis
Database Administration
Physical Database Design
Primary Indexes
Secondary Indexes
Join Indexes
Hashing
Identity Columns
Normalization
Referential Integrity
Database Administration
System Users
Administrator User
Administration Tools
System Administration
Session Management
Utilities
User and Security Management
Databases and Users
User Types
Creating Users
Roles
Profiles
Privileges
Object Maintenance
Data Dictionary
Data Dictionary Views
Capacity Management
Ownership
Space Limits
Spool Space
Temporary Space
Data Compression
Session Management
Session Modes
Monitoring Tools
Accounts
System Accounting
Using Query Bands
Business Continuity
Archive, Restore, and Recover
Required Privileges
Session Control
HUT Locks
Practice Exam
Refresher “Warm up Questions”
Answer Guide
Answers to Questions
References
Index
3 Teradata 12 Certified Master
The Certified Master certification is part of the Teradata Certified Professional Program covering the knowledge and skills required to install, manage, and operate Teradata systems. The entire list of available certifications in recommended order is as follows:
• Teradata 12 Certified Professional
• Teradata 12 Certified Technical Specialist
• Teradata 12 Certified Database Administrator
• Teradata 12 Certified Solutions Developer
• Teradata 12 Certified Enterprise Architect
• Teradata 12 Certified Master
The exams for each certification roughly correspond to the steps in the program. Each step in the program, or certification, builds on the previous step. For the Certified Master certification, the following exams must be passed in sequential order:
• TEO-121 – Teradata 12 Basics
• TEO-122 – Teradata 12 SQL
• TEO-123 – Teradata 12 Physical Design and Implementation
• TEO-124 – Teradata 12 Database Administration
• TEO-125 – Teradata 12 Solutions Development
• TEO-126 – Teradata 12 Enterprise Architecture
• TEO-127 – Teradata 12 Comprehensive Mastery
Each exam covers the following:
• TEO-121 – Teradata 12 Basics
o Product Overview
o Processing Types and Characteristics
o Data Warehouse Architectures
o Relational Database Concepts
o Teradata RDBMS Components and Architecture
o Database Managed Storage
o Data Access Mechanics
o Data Availability Features
o Teradata Tools and Utilities
o Workload Management
o Security and Privacy
• TEO-122 – Teradata 12 SQL
o Teradata Extensions
o Data Definition Language
o Data Manipulation Language
o Data Control Language
o Views and Macros
o Logical and Conditional Expressions
o Data Conversions and Computations
o CASE Expressions
o Subqueries and Correlated Subqueries
o Joins
o Attribute and String Functions
o Set Operations
o Analytical Functions
o Time/Date/Timestamp/Intervals
o Stored Procedures Concepts
o Aggregations
o SQL Optimization Concepts
o Advanced SQL Concepts
• TEO-123 – Teradata 12 Physical Design and Implementation
o Physical Database Design
o Table Attributes
o Column Attributes
o Statistics
o Primary Indexes
o Secondary Indexes
o Transaction Isolation
o Physical Database Operations
o Teradata Query Analysis
o Database Space Management
• TEO-124 – Teradata 12 Database Administration
o System Software Setup and Parameters
o User and Security Management
o Session Management
o Load and Extract
o System Administration Tools
o System Workload Analysis and Management
o Performance Optimization
o Capacity Management and Planning
o Business Continuity
o Object Maintenance
• TEO-125 – Teradata 12 Solutions Development
o Development Process
o Development Considerations
o Development Planning
o Development Strategies
o Optimization
• TEO-126 – Teradata 12 Enterprise Architecture
o System Planning and Space Management
o Optimization
o Data Integration
o Data Protection
o Data Governance
o Information Delivery Strategies
• TEO-127 – Teradata 12 Comprehensive Mastery
o Workload Management
o Performance Management and Query Optimization
o Database Design
o Configuration Management and Capacity Planning
o Data Availability and Security
o Application Integration and Optimization
o Data Management and Integration
4 Exam Specifics
Teradata exams are proctored by Prometric. Scheduling and location of test sites can be obtained at www.prometric.com/teradata. Tests are conducted at a testing center. Two valid forms of ID are required when arriving at the center. Exams are delivered in a secure environment, proctored, and timed. Specifics about the exam include:
• Exam Number: TEO-121 – TEO-127
• Time Limit: 2 hours to 3 hours (based on exam)
• Question Type: Multiple Choice
5 Teradata Products
5.1 Teradata Solution
The core product is the Teradata Database. A comprehensive set of management tools and utilities is included to aid in numerous functions. The tools and their functions are:
• Teradata Manager - database system administration
• Teradata Administration Workstation
• Teradata Parallel Transporter - unified load utility
• Teradata TPump - continuous load
• Teradata Replication Services - capture and delivery of changed data
• Teradata FastLoad - data loading
• Teradata MultiLoad - table loading
• Teradata FastExport - data extraction
• Teradata Archive/Recovery Utility
• Teradata Dynamic Workload Manager
• Teradata Analyst Pack
• Teradata Utility Pack
• Teradata Warehouse Miner Stats
• Teradata Warehouse Miner
• Teradata Profiler - data quality
• Teradata Analytic Data Set Generator
The following functions are provided by the complete Teradata solution:
• Active Load - data is loaded actively and continuously while supporting other workloads. The methods include streaming from a queue, batch updates, and moving changed data.
• Active Access - analytical intelligence can be accessed quickly and consistently,
increasing the speed for processing user and customer requests.
• Active Events - business events are detected automatically and business rules are applied against current and historical data.
• Active Workload Management - mixed workloads are managed dynamically and system resource utilization is optimized to meet business goals.
• Active Enterprise Integration - the data warehouse solution is integrated into the enterprise business and technical architectures to support business users, partners, and customers.
• Active Availability - aids in identifying and fulfilling application-specific availability, recoverability, and performance requirements.

5.1.1 Teradata Database
The Teradata Database is an inexpensive solution which uses mostly standard hardware components. The database has the capacity to store large amounts of detailed data and performs large numbers of instructions every second. Parallel processing makes the Teradata product faster than other relational systems. Additionally, the database can be expanded without sacrificing performance. The architecture supports single-node (Symmetric Multiprocessing) and multinode (Massively Parallel Processing) systems. A fast interconnect structure (BYNET) allows distributed functions to communicate.

Teradata Database is designed to be a single data store for multiple client architectures for the purpose of reducing the data duplication and inaccuracies present when multiple stores are used. This single store is possible through heterogeneous client access. The solution enables data type translation, connections, concurrency, and workload management to allow clients to access a single copy of the data.

The database can use one of two attachment methods to connect to other operational computer systems. The channel attachment method attaches systems directly to a mainframe computer using an I/O channel. The network attachment method is used to attach the system to workstations and other computers and devices through a LAN.

Because the Teradata Database is a relational database, communication utilizes Structured Query Language (SQL). Teradata SQL is compatible with ANSI SQL, but adds Teradata-specific extensions. Either Teradata or ANSI modes are available for transactions to be run. The capabilities of the Teradata Database allow users to view and manage data as collections of related tables.

Fault tolerance capabilities ensure hardware failures are detected and recovered. Data transactions will roll back to a consistent state if a fault occurs.
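The idea that a transaction rolls back to a consistent state when a fault occurs can be sketched in a few lines. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in because no Teradata system is assumed to be available, and the `accounts` table, the `transfer` function, and the simulated fault are hypothetical names invented for this example. Teradata's own recovery mechanisms are considerably more involved.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical account rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def transfer(conn, amount, fail=False):
    """Move `amount` from account 1 to account 2 as one transaction.

    If `fail` is True, a fault is simulated between the two updates.
    """
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = 1", (amount,)
            )
            if fail:
                raise RuntimeError("simulated fault between the two updates")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = 2", (amount,)
            )
    except RuntimeError:
        return False
    return True


def balances(conn):
    """Return the balances ordered by account id."""
    return [row[0] for row in conn.execute("SELECT balance FROM accounts ORDER BY id")]
```

When `transfer` is called with `fail=True`, the first update is undone and both balances keep their original values; a later call without the fault moves the amount in full.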
The database software implements the relational database environment using the following functional modules:
• Database Window - used to control operations.
• Teradata Gateway - validates messages from clients that generate sessions and controls encryption.
• Parallel Database Extensions (PDE) - enables the database to operate in a parallel environment by implementing a software interface layer.
• Teradata Database Management Software - includes a Parsing Engine (PE), Access Module Processor (AMP), and file system.

5.2 Data Warehouse Concepts
A data warehouse is a centralized database storing data that is later used to aid strategic, tactical, and event-driven decision making. The data is typically gathered from other operational databases. This data is:
• Subject oriented
• Integrated
• Timestamped
• Nonvolatile
Data can be captured from several sources, such as:
• Customer orders
• Inventory database
• Shipping processes
• Direct mail
• Electronic mail
• Phone calls
Utilities load the data either in a timely, continuous fashion or through batch jobs.
5.3 Relational Database Concepts
Relational database concepts are grounded in the mathematics of set theory. The implementation of a relational database is a general implementation of set theory relations. In this context, a relation is defined by a table, a tuple is defined by a row, and an attribute is defined by a column. The rows within the table define the cardinality of the relation, and the columns within the table define the degree of the relation. Tables within databases are two-dimensional, consisting of rows and columns. A consistent, predictable outcome is possible when manipulating the table because the operations allowed are well-defined.

5.3.1 Tables
Tables are two-dimensional objects comprised of rows and columns. The table format is used to organize data and present that data to users. Relationships and constraints of data are defined by references between tables.

Columns generally represent entities, relationships, or attributes. Entities are a specific person, place, or thing which has information stored in the tables. An entity is found in a row and the attributes of that entity are found in columns. A row is a specific instance consisting of all columns in the table. A column contains the same type of information. Names, dates, and prices may be found in their own columns, but never in the same column.

Table constraints are simply conditions that must be met before a value is written to a column and can include value ranges, equalities or inequalities, and intercolumn dependencies.

There are several types of tables:
• Permanent - different sessions and users can share table content. These tables have a persistent table definition which is stored in the Data Dictionary.
• Queue - permanent tables with a timestamp, with contents organized in first-in first-out (FIFO) ordering.
• NoPI - permanent tables without a defined primary index, used as staging tables to load data from FastLoad or TPump Array INSERT.
• Error Logging - used to store information about errors, particularly insert and update errors, on a permanent table.
• Global Temporary - session-based tables whose contents are private to the session and dropped automatically at the end of a session. These tables have a persistent table definition which is stored in the Data Dictionary.
• Global Temporary Trace - stores trace output for sessions to be used for debugging SQL stored procedures and external routines.
• Volatile - used when only one session needs a table, only the creator needs access to
a table and better performance than a global temporary table is required, or the table definition is not required after the session ends.
• Derived - a temporary table is created through a subquery of one or more other tables. This table type is specified in an SQL SELECT statement.

5.3.2 Views
Database views are virtual tables, usually presenting only a subset of columns and rows. They are used as if they were physical tables to retrieve data defining columns from other views or tables. A view does not contain data and does not exist without a reference from a DML statement. View definitions are stored in the Data Dictionary.

One or more base tables or views can be used to create a view. Hierarchies of views can be created from other views. When this is done, higher-level views have dependencies on lower-level views. If a lower-level view is deleted, the dependencies become invalidated.

Views provide users a perspective of the data in the database and can restrict access to tables and updates. Views can provide a level of independence to logical data. Through views, access to the data can be well defined and tested, and can deliver high performance. Views can be used as if they are tables in SELECT statements, though some restrictions apply when using INSERT, UPDATE, MERGE, and DELETE statements.

5.3.3 SQL Stored Procedures
SQL stored procedures are a combination of procedural control statements, SQL statements, and control declarations which provide a procedural interface to the Teradata Database. They are executed in the Teradata Database server space. With SQL stored procedures, large and complex database applications can be built. They contain:
• Multiple input and output parameters.
• Local variables and cursors.
• SQL DDL, DCL, and DML statements.
Applications using SQL stored procedures have the following characteristics:
• Stored procedures reside and are executed on the server, reducing network traffic.
• Business rules on the server can be encapsulated and enforced, improving application
maintenance.
• Transaction control is better.
• Security is better because the data access clause can restrict access to the database.
• User access can be granted to procedures instead of the data tables directly.
• Runtime conditions generated by the application can be handled by an exception handling mechanism.
A single CALL statement can be used to execute all SQL and SQL control statements embedded in an SQL stored procedure.

One or more of the following elements can be found in SQL stored procedures:
• SQL control statements - nested or non-nested compound statements.
• Control declarations - initiated by DECLARE HANDLER, DECLARE CURSOR or FOR, and DECLARE statements to provide condition handlers, cursor declarations, and local variable declarations respectively. Condition handlers can be of a CONTINUE or EXIT type, and can be defined for SQLSTATE, SQLEXCEPTION, NOT FOUND, and SQLWARNING.
• SQL transaction statements - DDL, DCL, DML, and SELECT statements, including dynamic SQL statements.
• LOCKING modifiers - used with all supported SQL statements except CALL.
• Comments - notes which are simple or bracketed.

5.3.4 External Stored Procedures
External stored procedures can be written in the C, C++, or Java programming language. They are installed on the database and executed like stored procedures. There are five steps to follow in order to write and utilize an external stored procedure:
1. Write, test, and debug the C, C++, or Java code for the procedure.
2. If using Java, place the class or classes in an archive file (JAR or ZIP) and call the SQLJ.INSTALL_JAR external stored procedure to register the archive file.
3. Create a database object for the external stored procedure using the CREATE PROCEDURE or REPLACE PROCEDURE statement.
4. Grant privileges to authorized users using the GRANT statement.
5. Invoke the procedure using the CALL statement.

5.3.5 Macros
One or more SQL statements can be executed by a single request using the macro database object. Each time a macro is performed, one or more rows of data may be returned. Macros can be created by an individual for personal use or for others by granting execution authorization. Execution of a macro does not require knowledge of database access, the tables affected, or the results of the macro.

Though multiple statements may exist in a macro, it is treated as a single request. The execution of a macro will either process all the statements it contains or none of the statements. A failed macro is aborted, all updates are backed out, and the database is returned to its original state.

The basic SQL statements used with macros include:
• CREATE MACRO - used to incorporate a frequently used SQL statement or series of statements into a macro.
• EXECUTE - runs a macro.
• DROP MACRO - deletes a macro.

5.3.6 Triggers
A trigger defines an action that is initiated because of the occurrence of another event, called the triggering event. The trigger is represented by a database object, which is a stored SQL statement associated with a table called a subject table. Triggers are executed when a specified column or columns in the subject table are modified by a DELETE, INSERT, or UPDATE. Two types of triggers are supported by the Teradata Database:
• Statement
• Row
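A row trigger on a subject table can be illustrated with a short sketch. It is only an approximation: it runs on SQLite through Python's built-in sqlite3 module, whose CREATE TRIGGER syntax differs from Teradata's, and the `employee` and `salary_log` tables, the `log_salary_change` trigger, and the helper functions are hypothetical names made up for this example.

```python
import sqlite3


def build():
    """Create a subject table, a log table, and a row trigger linking them."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employee (id INTEGER PRIMARY KEY, salary INTEGER);
        CREATE TABLE salary_log (emp_id INTEGER, old_salary INTEGER, new_salary INTEGER);

        -- Fires once per modified row when the salary column is updated.
        CREATE TRIGGER log_salary_change
        AFTER UPDATE OF salary ON employee
        FOR EACH ROW
        BEGIN
            INSERT INTO salary_log VALUES (OLD.id, OLD.salary, NEW.salary);
        END;

        INSERT INTO employee VALUES (1, 1000), (2, 2000);
        """
    )
    return conn


def update_salary(conn, emp_id, new_salary):
    """Modify the subject table; the trigger writes the log row."""
    with conn:
        conn.execute("UPDATE employee SET salary = ? WHERE id = ?", (new_salary, emp_id))


def log_rows(conn):
    """Return everything the trigger has recorded so far."""
    return conn.execute("SELECT emp_id, old_salary, new_salary FROM salary_log").fetchall()
```

The application only ever issues the UPDATE; the insert into `salary_log` happens inside the database as a consequence of that triggering event.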
The time at which a trigger is initiated is set by one of the following keywords:
• BEFORE - executed before the completion of a triggering event.
• AFTER - executed after the completion of a triggering event.
When multiple triggers are specified, they are executed in the order determined by the timestamp of each trigger. Triggers are sorted according to this ANSI rule unless the ORDER extension is used. Triggers can initiate other triggers. System performance can be maximized by running triggered and triggering statements in parallel. Triggers allow:
• UPDATE, INSERT, MERGE, and DELETE statements performed on a subject table to be propagated to another table.
• An audit to be performed.
• A threshold to be set.
• Major UPDATE, INSERT, MERGE, or DELETE operations to be disallowed during specific timeframes, such as business hours.
• SQL stored procedures and external stored procedures to be called.

5.3.7 User-Defined Functions
SQL can be extended by writing functions, called user-defined functions (UDFs). Two types of UDFs are supported:
• SQL UDFs
• External UDFs
Regular SQL expressions can be encapsulated into functions and used like a standard SQL function by creating an SQL UDF. This is useful for objectifying SQL expressions that are frequently repeated in queries. An SQL UDF can be created and used with the following steps:
1. Define the UDF using the CREATE FUNCTION or REPLACE FUNCTION statement.
2. Use the GRANT statement to grant privileges to authorized users.
3. Call the function.
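The define-then-call pattern above can be sketched outside Teradata as follows. This is a loose analogy under stated assumptions: SQLite, used here through Python's built-in sqlite3 module, has neither CREATE FUNCTION nor GRANT, so the define step is approximated by registering a Python callable with `create_function` and the grant step is omitted. The `net_price` function and the `item` table are hypothetical names for this example.

```python
import sqlite3


def net_price(price_cents, tax_percent):
    """Expression that would otherwise be repeated in many queries."""
    return price_cents * (100 + tax_percent) // 100


def open_db():
    """Create a small table and register the function under an SQL name."""
    conn = sqlite3.connect(":memory:")
    # Stand-in for the define step: make net_price callable from SQL.
    conn.create_function("net_price", 2, net_price)
    conn.execute("CREATE TABLE item (name TEXT, price_cents INTEGER)")
    conn.executemany("INSERT INTO item VALUES (?, ?)", [("pen", 200), ("book", 1000)])
    conn.commit()
    return conn


def priced_items(conn, tax_percent):
    """Call step: use the function like a standard SQL function."""
    query = "SELECT name, net_price(price_cents, ?) FROM item ORDER BY name"
    return conn.execute(query, (tax_percent,)).fetchall()
```

Once registered, the function is used in the SELECT list exactly as a built-in would be, which is the point of encapsulating the repeated expression.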
External UDFs are functions written in the C, C++, or Java programming language. They are installed on the database and used like standard SQL functions. Three types of external UDFs are supported:
• Scalar - returns a single value result for an input parameter.
• Aggregate - produces summary results.
• Table - invoked in a FROM clause of a SELECT statement and returns a table to the statement.
An external UDF can be created and used with the following steps:
1. Write, test, and debug the C, C++, or Java code for the UDF.
2. If using Java, place the class or classes in an archive file (JAR or ZIP) and call the SQLJ.INSTALL_JAR external stored procedure to register the archive file.
3. Create a database object for the UDF using a CREATE FUNCTION or REPLACE FUNCTION statement.
4. Use the GRANT statement to grant privileges to authorized users.
5. Call the function.

5.3.8 User-Defined Methods
A User-Defined Method (UDM) is a special form of UDF. It is always associated with a UDT. Two types of UDMs are supported:
• Instance - operates on a specific instance of a distinct or structured UDT and can provide transform, ordering, and cast functionality for the UDT.
• Constructor - initializes an instance of a structured UDT.

5.3.9 User-Defined Types
User-Defined Types (UDTs) can be structured or distinct. A distinct UDT is based on a single predefined data type. A structured UDT is a collection of one or more attributes, each defined as a predefined data type or other UDT. Both types of UDTs can define methods. A dynamic UDT is a form of structured UDT. A distinct or structured UDT is defined using a CREATE TYPE statement, but a dynamic UDT is defined using a NEW VARIANT_TYPE
expression. This expression constructs an instance of the dynamic UDT and its attributes at runtime. Dynamic UDTs can only be specified as a data type of input parameters to external UDFs.

5.3.10 Databases and Users
In Teradata Database, a database and a user are nearly identical, the only difference being the ability for the user to log on to the system. When the Teradata Database is installed on a server, the User DBC is created. This single user initially owns all other databases and users in the system and all the space in the entire system. Space is assigned from the User DBC to all other objects. The database administrator manages this user, specifically all free space and the databases and users created after installation.

The database administrator can create a User System Administrator from the User DBC to protect the system tables within Teradata Database, usually assigning all database disk space not required by system tables to the User System Administrator. This user is then used to allocate space.

The Data Dictionary is a set of tables and views associated with the system user, DBC. The tables are used only by the system and contain metadata about objects, privileges, system events, and system usage. Views provide access to the information contained within the tables. The metadata is comprised of current definitions, control information, and general information about:
• Authorization
• Accounts
• Character sets
• Columns
• Constraints
• Databases
• Disk Space
• End users
• Events
• External stored procedures
• Indexes
• JAR and ZIP archive files
• Logs
• Macros
• Privileges
• Profiles
• Resource usage
• Roles
• Rules
• Sessions
• Session attributes
• Statistics
• Stored procedures
• Tables
• Translations
• Triggers
• User-defined functions
• User-defined methods
• User-defined types
• Views
Generally, database administrators will create and update tables referenced by the system views. These views are used by users to obtain information within the table. This information is typically focused on objects in the system, including:
• Tables - information on table location, identification, and version; database name, table name, creator name, and user names; information on columns in the table including column name, data type, length, and phrases; user/creator access privileges; defined indexes and constraints; table backup and protection; and creation date and time for the table.
• Databases - information on database name, creator name, owner name, account name, space allocation, number of fallback tables, collation type, creation time stamp, modify date and time, modifier, and role and profile names.
• View or Macro - information on the view or macro text, attributes of the creation time, and user and creator access privileges.
• Stored procedures - information on the attributes of the creation time, including default character set; parameters including parameter name, parameter type, data type, and default format; and user and creator access privileges.
• External stored procedures - information on source code and object code, external stored procedure name, external name, parameter data types, source file language, data accessing characteristic, parameter passing convention, execution protection mode, and character and platform type.
• Java external stored procedure - information on the Java object code, external stored procedure name, external file reference, parameter data types, source file language, data accessing characteristic, parameter passing convention, execution protection mode, and character and platform type.
• JAR - information including JAR name, external name, platform type, and revision number.
• Trigger - stores the IDs of tables, triggers, databases, and subject table databases; trigger names and conditions; users who created and last updated the trigger; creation text and time stamp; timestamp for the last update; overflow text; indexes; and fallback tables.
• User-defined function - information contains C source code and object code, function call name, specific name, external name, parameter data types, function class, source file language, data accessing characteristics, parameter passing convention, deterministic characteristic, null-call characteristic, execution protection mode, and character and platform type.
• Java user-defined function - information on call name, specific name, external name, parameter data types, function class, source file language, data accessing characteristics, parameter passing convention,
DBC. Some Data Dictionary views are restricted to specific types of users. and privileges related to a user.UDTInfo (each UDT entry). date form. ordering form. type kind. function class. source file language. ToSQL routine ID and FromSQL routine ID. To have access to these views.information contains C source code and object code. instantiability.3. and DBC.information on performance status and statistics. modification name and timestamp. character type. • Security administrator . defining new users. 5. null-call characteristic. including implicit assignment and cast routing. • Operations and Recovery Control . ordering category. database. data accessing characteristics.information includes user name. • Supervisory .information on objects. allocating access privileges.information contains DBC.UDTCast (each cast entry). creator name. and accounting. deterministic characteristic. cast count. and archiving. Information on objects found in the Data Dictionary can be presented through views.deterministic characteristic. specific name. creating indexes. monitoring space usage. default transform group. null-call characteristic.archive and recovery activities 29 .11 Data Dictionary Views Views are used to examine information in tables. including type name. parameter data types. parameter passing convention. types of access granted. collation. DBC. The different types of users and their information needs are: • End . role and profile name. errors. and character and parameter types. • Database administrator .UDTTransform (each transform). • User-defined types .information on creating and organizing database. function call name. external name.information on access logging rules and access checking results. and character and parameter types. rather than querying the actual tables.UDFInfo (each auto-generated default constructor) including the same information as a regular UDF. the database administrator will need to grant privileges. execution protection mode. 
execution protection mode. space allocation. creation timestamp. • User . ordering routine ID. default account. • User-defined methods . owner name. password change date. while other views are accessible by all users. including default transform group name. password string.
The Data Dictionary is accessed every time a user logs into the Teradata Database, uses a password, and makes a SQL query. The views used by the Data Dictionary are based on Unicode, but are backward compatible to Kanji/Latin. The only SQL DML command allowed with the Data Dictionary is the SELECT statement. Other statements cannot be used at all or are limited to specific Data Dictionary tables.

5.4 Teradata RDBMS Components

5.4.1 Platforms

The Teradata Database software is supported by hardware components based on Symmetric Multiprocessing (SMP) technology. When combined with a communications network, the SMP technology can be formed to create a Massively Parallel Processing (MPP) system. The components include:
• Processor node - several tightly coupled CPUs connected to one or more disk arrays (SMP configuration) used as the hardware platform. An MPP configuration is made of two or more loosely coupled SMP nodes.
• BYNET - links nodes on an MPP system to enable point-to-point, multicast, and broadcast messaging between processors. The BYNET resembles a switched network which loosely couples the SMP nodes in a multinode system. It utilizes high-speed logic to allow bi-directional communications and merge functions. Transmissions can be optimized using load balancing software. At least two BYNETs in the multinode system are needed to provide fault tolerant capabilities and enhance communication between processors. Single-node SMP systems will use Boardless BYNET, or virtual BYNET, software to emulate BYNET hardware.

5.4.2 Virtual Processors

The processors are virtual (vprocs). They run software processes on a node under the Parallel Database Extensions (PDE). Each vproc is an independent copy of the processor software, sharing only the physical resources of the node with other vprocs. Communication with vprocs occurs using unique-address messaging driven by BYNET driver software. Though each node will typically have 6-12 vprocs, up to 128 vprocs can be supported at each node. Up to 16,384 vprocs are supported within a single system.

There are several different types of virtual processors:
• Access Module Processor (AMP) - performs database functions: each processor owns a portion of the overall database storage.
• Gateway (GTW) - provides a socket interface to Teradata Database.
• Parsing Engines (PE) - performs any actions related to session control, query parsing, security validation, query optimization, and query dispatch.
• Relay Services Gateway (RSG) - communicates with the Teradata Meta Data Services utility with any dictionary changes and provides a socket interface for the replication agent.
• VSS - manages the database storage. This processor type cannot be manipulated externally.

AMP vprocs manage all interactions between the disk subsystem and the Teradata Database. Each AMP vproc will manage a portion of the disk storage. The vproc performs specific database management tasks like accounting, journaling, locking database objects, and converting output data. A vproc will also manage the disk space in the file system. During the query process, an AMP will sort, merge data rows and aggregate data. Multiple AMPs can be grouped into logical clusters. These clusters enable the fault-tolerant capabilities of the database.

Parsing Engine (PE) vprocs perform communication duties between client systems and AMPs. The communication with AMPs is enabled through BYNET. At the core of the PE is the database software that manages sessions, decomposes SQL statements and returns answers to clients. The following elements comprise the PE software:
• Parser - decomposes SQL into processing steps.
• Optimizer - identifies the most efficient path to access data.
• Generator - used to generate and package steps.
• Dispatcher - receives output from the parser and sends it to the appropriate AMPs, then monitors for completion of the steps or errors.
• Session Control - manages session activities and recovers sessions.
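The division of labour among the PE elements can be pictured as a small pipeline. The following is only a toy sketch, not Teradata code: the function names, the four-AMP system size, the tiny SELECT grammar, and the use of a modulo in place of the real row hash are all assumptions made for illustration.

```python
from dataclasses import dataclass

NUM_AMPS = 4  # hypothetical number of AMPs in the toy system


@dataclass
class Step:
    action: str  # what the AMPs should do
    amps: list   # AMP numbers that must execute the step


def parse(sql):
    """Parser: decompose the request (toy grammar: SELECT ... FROM t [WHERE col = n])."""
    words = sql.rstrip(";").split()
    table = words[words.index("FROM") + 1]
    key = int(words[-1]) if "WHERE" in words else None
    return {"table": table, "key": key}


def optimize(tree):
    """Optimizer: pick the cheapest access path for the parsed request."""
    if tree["key"] is not None:
        amps = [tree["key"] % NUM_AMPS]  # modulo stands in for the real row hash
    else:
        amps = list(range(NUM_AMPS))     # no key given: every AMP scans its rows
    return {**tree, "amps": amps}


def generate(plan):
    """Generator: package the chosen plan into executable steps."""
    return [Step(action=f"retrieve rows from {plan['table']}", amps=plan["amps"])]


def dispatch(steps, run_on_amp):
    """Dispatcher: send each step to its AMPs and check every completion."""
    completed = []
    for step in steps:
        for amp in step.amps:
            if not run_on_amp(amp, step):
                raise RuntimeError(f"AMP {amp} failed: {step.action}")
            completed.append(amp)
    return completed


def handle_request(sql, run_on_amp=lambda amp, step: True):
    """Run one request through Parser, Optimizer, Generator, and Dispatcher."""
    return dispatch(generate(optimize(parse(sql))), run_on_amp)
```

In this sketch a request with a key value is routed to a single AMP, while a request without one is sent to all AMPs, which mirrors the idea that the Optimizer chooses the access path and the Dispatcher only delivers and monitors the resulting steps.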
5.4.3 Processing Requests

The SQL parser processes incoming SQL requests to query the database. The SQL requests are handled in the following way:
1. The syntax of the incoming request is checked by the Syntaxer. If no errors exist, the request is converted into a parse tree and sent on to the Resolver; if errors exist, an error message is sent back to the requester.
2. The Resolver will add information from the Data Dictionary or cache to convert database objects to internal identifiers.
3. Privileges in the Data Dictionary are checked by the security monitor. If the privileges are valid, the request is sent to the Optimizer; if not, the request is aborted and an error message is sent back to the requester.
4. The Optimizer will determine the best way to implement the SQL request.
5. The request is scanned by the Optimizer to determine where to place locks, and the Optimizer then passes the parse tree to the Generator.
6. The optimized parse tree is transformed into plastic steps by the Generator, and the steps are cached and then sent to the gncApply.
7. The gncApply will bind parameterized data into the plastic steps and transform them into concrete steps, which provide directives to the AMPs.
8. The concrete steps are passed to the Dispatcher, which will distribute those steps to the AMP database management software.

The Dispatcher controls the execution sequence of the concrete steps. It will pass the steps to the BYNET. If the next step is dependent on the output of the current step, a response is required before starting it. If no dependency exists, then steps can be processed in parallel. As each step is sent to the BYNET, the Dispatcher will communicate if the step is for one AMP, several AMPs, or all AMPs. The Dispatcher will wait for a completion response. Once completion responses have come from all expected AMPs, the Dispatcher will place the next step on the BYNET. The process continues until all AMP steps related to a request are completed.

When processing requests, AMPs will acquire the required rows. AMP steps are sent to one AMP, all AMPs, or a set of AMPs called a dynamic BYNET group. Messages are transmitted to and from AMPs and PEs by the BYNET.

5.4.4 Disk Arrays

Redundant Array of Independent Disks (RAID) solutions are used to protect data at the disk level. Disk drives are grouped into RAID LUNs to ensure availability of data during a failure.
Drive groups are a set of drives configured into one or more LUNs. Each LUN is uniquely identified. Vdisks are a group of cylinders assigned to an AMP.

5.4.5 Cliques

Nodes of an MPP system are physically linked by multiported access to common disk array units. This is a feature called a clique and supports the migration of vprocs under PDE. The vprocs migrate to other nodes when the original nodes in the clique fail. The physical connections are made using Fibre Channel (FC) buses. PEs for channel-attached connections are dependent on the physically attached hardware. Due to this, they cannot migrate. If they are connected through the LAN, the PEs can migrate.

5.4.6 Hot Standby Nodes

Nodes that are used solely to improve the availability and maintain performance are called hot standby nodes. They are members of a clique and will not normally perform any database operations. They are utilized only when a node within the core system fails.

5.4.7 Parallel Database Extensions

Between the operating system and Teradata Database is a software interface layer called the Parallel Database Extensions (PDE). A PDE allows the Teradata Database to:
• Run in a parallel environment.
• Execute vprocs.
• Apply a priority scheduler.
• Manage memory, I/O and messaging system interfaces.

Multiple operating systems can be run in parallel and PDE provides a number of services for parallel operating systems, including:
• The ability to manage parallel execution of database operations on multiple nodes
• Dynamic distribution of database tasks
• Task execution coordinated between and within nodes

Between the PDE and the Teradata Database is a layer of software called the Teradata Database File System. The file system allows the Teradata Database to store and retrieve data efficiently without bothering with specific low-level operating system interfaces.

5.4.8 Workstation Types

Two types of workstations exist to look into the Teradata Database:
• System Console on the SMP platform.
• Administration Workstation (AWS) on the MPP platform.

The system console provides a mechanism to allow system and database administrators to provide input to the Teradata Database. Within the console, the system status, current system configuration, and performance statistics can be displayed. Various utilities can be controlled through the system console. The Administration Workstation will perform all the functions of the system console. With the AWS, multiple nodes can be viewed through a single system and system performance can be monitored.

5.4.9 Teradata Database Window

The Teradata Database operations are controlled by database administrators, system operators, and support personnel using the Teradata Database Window (DBW). The DBW is a graphical user interface. This interface is specific to the Teradata Console Subsystem (CNS). The CNS is a part of the PDE software on top of which the database runs. The DBW is used to start and control Teradata Database utilities. Database commands can be issued and utilities can be run from the DBW. It can be run from:
• System console
• Administration Workstation
• Remote workstation or computer

5.5 Database Requirements

The critical requirements to be fulfilled by the Teradata Database are characterized as
reliability, availability, serviceability, usability, and installability (RASUI). The fulfillment of these requirements is achieved by combining:
• Multiple microprocessors in an SMP arrangement
• RAID disk storage
• Operational anomaly protection

Fault tolerance is provided through both the hardware and software. Some fault tolerance features are mandatory, while others are optional.

5.5.1 Fault Tolerance from Software

Software fault tolerance is provided through the following Teradata Database facilities:
• Vproc migration
• Fallback tables
• AMP clusters
• Journaling
• Backup/Archive/Recovery
• Table Rebuild Utility

Vprocs can migrate from node to node within the same hardware clique when a failure occurs. The Parsing Engine (PE) and Access Module Processor (AMP) are vprocs which will migrate when a node fails. This migration enables a system to function fully during a failure, though some performance degradation will be experienced because of the nonfunctional hardware.

Fallback tables are a copy of a primary table. Each fallback row in a fallback table is stored on an AMP different from the AMP hashed by the primary row. Should the system lose an AMP, fallback tables can maintain a row's availability. This technique is at the expense of doubling storage space and the I/O used for tables. Fallback can be defined for individual tables, such as those critical for the business, while leaving other tables alone.

AMP clusters are comprised of 2-16 AMPs. They provide fallback capabilities for each other by storing a copy of each row on a separate AMP in the same cluster. Multiple clusters can exist. Smaller clusters can reduce the potential failure of two AMPs in the same cluster, which can cause a system halt, especially in large systems. Clustering can be performed across subpools, logical groupings of AMPs and disks within the same cluster. Subpools can limit the impact of disk failures by dividing AMPs in the same cluster into subpools. If two AMPs in the same cluster fail, the system will not crash if the AMPs are in different subpools.

Journaling is a recording of activity and there are several such mechanisms in the Teradata Database. The system itself will perform some journaling, while other activities can be configured for journaling. The different types of journals include:
• Down AMP recovery - occurs during an AMP failure on fallback tables only and is used to recover the AMP after repair. The journal is discarded after use.
• Transient - used to roll back failed transactions aborted by the user or system because it stores BEFORE images of the transactions by capturing begin/end transaction indicators, before row images for UPDATE and DELETE statements, row IDs for INSERT statements, and control records for CREATE, DROP, DELETE, and ALTER statements. An image is created on the same AMP as the row described, and the image is discarded when the transaction or rollback is completed.
• Permanent - specified by the user for tables or databases. This type of journal can contain before images or after images, or both, to enable rollback, rollforward, or full recovery. The need for frequent, full-table archives is reduced because of permanent journals.

Backup Archive and Restore (BAR) solutions can be found in two different architectures:
• BAR Framework - consists of Teradata Database nodes connected to BAR servers which initiate and run the backup, archive, and restore activities.
• Direct-attached Architecture - uses tape libraries and drives directly connected to Teradata Database nodes which initiate and run the backup, archive, and restore activities.

The Teradata Archive/Recovery (ARC) utility archives data to client tape or files and restores it to the Teradata Database. The objects which are backed up and restored by this utility include:
• Authorization objects
• Databases
• Data Dictionary tables
• External stored procedures
• Hash Indexes
• Join Indexes
• Methods
• Stored procedures
• Tables
• Table Partitions
• Triggers
• UDFs
• UDTs
• Views

The Table Rebuild Utility is used to recreate a table, all tables in an individual AMP, or all tables in a database. The recreation process is performed when the table structure or data is damaged due to a software problem, head crash, power failure, or other malfunction. The affected tables must utilize fallback protection. A table can be rebuilt on an AMP-by-AMP basis for the primary or fallback portions of a table, an entire table, a database, or the entire disk on a single AMP. The utility can be used to remove any inconsistencies in stored procedure tables in a database. The utility is usually run by a system engineer, field engineer, or system support representative.

5.5.2 Fault Tolerance from Hardware

Hardware fault tolerance is provided through the following Teradata Database facilities:
• Multiple BYNETs
• RAID disk units
• Multiple-channel connections
• Isolation from client hardware defects
• Battery backup
• Power supplies and fans
• Hot swap node capabilities
• Cliques
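The fallback rule from 5.5.1 (each fallback row is kept on a different AMP of the same cluster) can be illustrated with a short sketch. This is a simplification under stated assumptions: AMPs are numbered from 0, clusters are consecutive blocks of equal size, and the fallback copy always goes to the next AMP in the cluster. The function names are hypothetical, and the real placement logic is not described in this guide.

```python
def fallback_amp(primary_amp, cluster_size):
    """Return the AMP holding the fallback copy (assumed: next AMP in the same cluster)."""
    if cluster_size < 2:
        raise ValueError("a cluster needs at least two AMPs to provide fallback")
    cluster_start = (primary_amp // cluster_size) * cluster_size
    offset = (primary_amp - cluster_start + 1) % cluster_size
    return cluster_start + offset


def row_available(primary_amp, cluster_size, down_amps):
    """A fallback-protected row stays readable while either copy is on a working AMP."""
    backup = fallback_amp(primary_amp, cluster_size)
    return primary_amp not in down_amps or backup not in down_amps


# With clusters of four AMPs, AMP 3 wraps around to AMP 0 of the same cluster.
print(fallback_amp(3, 4))
print(row_available(3, 4, down_amps={3}))     # one AMP lost: fallback copy still serves the row
print(row_available(3, 4, down_amps={3, 0}))  # both copies lost in the same cluster
```

The last call shows why losing two AMPs in one cluster is the dangerous case, and why smaller clusters make that combination less likely.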
5.6 Redundant Array of Inexpensive Disks

Each storage device involves different technologies. Magnetic disks are the preferred device for primary storage. Tape devices are primarily used to back up large volumes of data. Both technologies have the potential to fail at any point, though they are relatively stable. Redundant Array of Inexpensive Disks (RAID) provides a fault-tolerant array of drives so that data survives a drive failure. RAID is a simplified system for managing and maintaining the storage environment. The system creates a combined large storage device from smaller individual devices. Data is generally stored across different drives, and different levels of RAID provide different levels of redundancy and performance. The level of redundancy provided by a virtual disk ensures that the data is protected from disk failures. Damaged drives can be hot-swapped without disrupting the network functions.

Software implementation of RAID is possible, but the write speeds are typically slower than hardware implementations. The reason for this reduction in speed is the need for the host system to calculate the parity values and perform additional I/O operations to ensure the storage of these values. To minimize host processing, fast RAID arrays have additional hardware caches, multiple buses, and striping schemes.

Back to the different RAID levels, deciding which level to use is one of the most important decisions in SAN designing using RAID. The most basic level is RAID 0, and RAID 5 and RAID 3 options are the most popular choices for large databases. The different levels of RAID include:
• RAID Level 0 - simple level of disk striping which has data stored on all drives. This level does not offer any redundancy and is not recommended for storing data that must be protected. Level 0 is best used when high throughput is desired with the lowest cost possible.
• RAID Level 1 - uses mirroring to replicate data from one drive to the next. Level 1 is excellent when the primary requirements are high availability and high reliability, but is costly since double the storage capacity is required.
• RAID Level 3 - uses parity to store the parity value on a separate drive. The technology is for large database operations. RAID Level 3 provides high data transfer rates and costs less than other levels. It can withstand single drive failures, but performance goes down when a drive fails.
• RAID Level 5 - uses parity to store parity values across different drives. It is most suitable for multiple applications. Level 5 has a high read rate and is reliable, but write performance is low and is unsuitable for frequent transactions using small data transfers.
• RAID Level 6 - parity is stored on striped drives along with the data. Level 6 has high reliability and high read speed. It is best used when the primary requirements are high availability and data security. The costs are high and the write speed is slower than RAID 5.

5.7 Client Communication

The Teradata Database can communicate with client applications through the LAN network or a channel provided by a mainframe server.

5.7.1 Network Attachments

The methods used to communicate over the network include:
• .NET Data Provider for Teradata.
• Java Database Connectivity (JDBC).
• Open Database Connectivity (ODBC).
• OLE DB Provider for Teradata.
• Teradata CLIv2 for NAS.

These methods are used to allow client applications to make requests to the database server and send back responses from the database.

CLIv2 is used by .NET Data Provider for Teradata to connect, execute commands, and retrieve results from the database. Results are either processed directly or cached in an ADO.NET DataSet to generate XML and bridge relational and XML data sources.

The Java Database Connectivity is a specification for an API to allow platform-independent Java applications to access the database using SQL and external stored procedures. The JDBC API provides a standard set of interfaces for opening connections to databases, executing SQL statements, and processing results. With the Teradata JDBC Driver, the Teradata Database can be accessed using the Java language.

OLE DB Provider for the Teradata Database uses service providers to enhance a provider's functionality. Programmers can use the OLE DB Provider to design application programs to access databases and data stores that do not use SQL. The program will request database information from an intermediate program which in turn will access the database. Responses from the database are sent back to the application program through the intermediary.

The ODBC Driver provides an interface to the database using a standard ODBC API. This
driver provides Core-level SQL and Extension-level function call capability. The driver uses the Windows Sockets TCP/IP communications software interface. ODBC and CLI work independently of each other.

Teradata CLIv2 for NAS is a proprietary API and library. It provides an interface between applications on a network-attached client and the Teradata Database server. The API can build parcels that are packaged by MTDP and sent to the Teradata Database using MOSI. The application is given a pointer to each parcel returned.

The Micro Teradata Director Program (MTDP) is the interface between Teradata CLIv2 and MOSI. It allows sessions to be initiated and terminated, as well as enabling logging, verification, recovery, and restart. Providing physical input to and output from the server is another function of MTDP.

Micro Operating System Interface (MOSI) is an interface between the MTDP and the Teradata Database to provide a library of service routines. This library gives clients that access the Teradata Database operating system independence. As a result, only one version of MTDP is required to run all network-attached platforms when using MOSI.

5.7.2 Channel Attached Systems

Teradata CLIv2 for channel-attached systems (CAS) is used to perform channel attachment. It is a collection of callable service routines. The collection provides an interface between applications and the Teradata Director Program (TDP) running on an IBM mainframe client. All versions of IBM operating systems can be operated, including:
• Customer Information Control System (CICS).
• Information Management System (IMS).
• IBM System z Operating System.

Through the TDP, requests are sent to the server and responses are sent back from the server to client applications. Teradata CLIv2 for CAS enables:
• Management of multiple simultaneous sessions to server(s).
• Insulation of the application from communication mechanics with a server.
• Cooperative processing to allow simultaneous operations on client and server by an application.
• Communication with Two-Phase Commit (2PC) coordinators for CICS and IMS transactions.

The TDP enables communications between Teradata CLIv2 for CAS and the Teradata Database server. The program executes on the same mainframe as Teradata CLIv2 for CAS, but as a different job. Individual TDPs are associated with a logical server, but multiple TDPs can operate and be simultaneously accessed by Teradata CLIv2 on the same mainframe. An identifier called the TDPid is attached to each TDP and referred to by applications. The functions of the TDP include initiating and terminating sessions, logging, verification, recovery, and restarting, as well as session balancing and queue maintenance. Also provided is the physical input and output to and from the server.

The Teradata Database server implements the relational database. This database will process requests from Teradata CLIv2 for CAS, which come through the TDP.

5.8 Data Availability

5.8.1 Concurrency Control

Multiple users accessing the same database raises the possibility of the same data being added, deleted, and updated at the same time. To prevent concurrently running processes from causing problems through simultaneous updates, concurrency control is established through two mechanisms:
• Transactions
• Locks

5.8.2 Transactions

Transactions are used to maintain the integrity of the database. A transaction is a logical unit of work and a unit of recovery. Requests are nested inside a transaction and are atomic. That is, all the requests in a transaction either must happen or not happen. Partial transactions cannot occur.

Transactions can be serializable: a condition where a set of concurrently executed transactions produces the same result as some serial execution of the same transactions for arbitrary input. The serializability of transactions is ensured by the Two-Phase Locking (2PL) protocol. The two phases of the protocol are:
• Growing phase - a lock is placed on an object before the object is used.
• Shrinking phase - after a lock is released, no more locks are placed on an object.

Locks are only released after a transaction is completely committed or completely rolled back.

Teradata Database supports ANSI transaction semantics and Teradata transaction semantics
as specified by a system parameter. Either mode, or the two-phase commit protocol, is allowed in a session.

ANSI mode transactions are implicitly opened, typically being opened by the execution of the first SQL in a session or the execution of the first request after the close of a transaction. The transaction closes when a COMMIT, ROLLBACK, or ABORT request is performed by the application. In ANSI mode, DDL statements in a transaction must be the last request before the transaction closing statement. In ANSI mode, the entire transaction will be rolled back if the current request results in a deadlock, performs a DDL statement that aborts, or executes an explicit ROLLBACK or ABORT statement. In ANSI mode, if the BEGIN TRANSACTION or the END TRANSACTION statement is submitted, the database software generates an error.

Teradata mode transactions can either be explicit or implicit. An explicit transaction is generated by the user and consists of a single set of BEGIN TRANSACTION/END TRANSACTION statements. All other requests are implicit transactions. If an error occurs, the entire transaction is rolled back by the system.

5.8.3 Locks

Locks are used to control access to a resource. When a lock is placed on a resource, the resource is either fully or partially inaccessible to other users. If a resource is locked by a user, any subsequent requests are queued until the lock is released. The request can be aborted if a lock is not obtained immediately. Most locks on resources are obtained automatically. The type of lock used is based on the data integrity requirement of the request. The severity of a lock can be upgraded with the LOCKING request modifier, but in general the severity cannot be downgraded.

Locks can be placed on the following objects:
• Database - locks the rows in all tables in the database.
• Table - locks all rows in a table and any associated index and fallback subtables.
• Row hash - locks the primary copy of a row and all rows sharing the same hash code in the same table.

Four different locking severities exist:
• Access - minor inconsistencies in the data are allowed and modifications on the underlying data are allowed while the SELECT operation is in progress.
• Read - allows multiple locks of this type to exist by several users and permits no modification to the resource.
• Write - requester has exclusive privileges to the locked resources except for readers using Access locks.
• Exclusive - requester has exclusive privileges to the locked resource and no other process can write, read, or access the resource.

Most locks are automatic based on the SQL statement used. Below is a list of statements and the lock severity on the appropriate lock level:
• SELECT - read lock on table and row hash.
• UPDATE - write lock on table and row hash.
• DELETE - write lock on table and row hash.
• INSERT - write lock on row hash.
• CREATE DATABASE - exclusive lock on database.
• DROP DATABASE - exclusive lock on database.
• MODIFY DATABASE - exclusive lock on database.
• CREATE TABLE - exclusive lock on table.
• DROP TABLE - exclusive lock on table.
• ALTER TABLE - exclusive lock on table.

Deadlocks are situations where a transaction already has a lock on a resource and needs a lock on another resource, which is already locked by a requester who needs a lock on the original resource. In this situation, neither transaction can move forward until the other transaction is aborted. Teradata Database will generally abort the younger transaction, defined by the shortest length of time a resource is held. If BTEQ is used, the report of the transaction abort is sent to BTEQ. The request that caused the error is resubmitted but not the entire transaction.
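The four lock severities can be summarized as a compatibility check: a new lock is granted only if it is compatible with every lock already held on the resource. The table below is a sketch built from the severity descriptions in this section, reading the "readers" tolerated by a Write lock as Access-lock readers; the names `COMPATIBLE` and `can_grant` are hypothetical and not part of any Teradata interface.

```python
# For each requested severity, the set of already-held severities it can coexist with.
COMPATIBLE = {
    "ACCESS": {"ACCESS", "READ", "WRITE"},  # tolerates concurrent modification
    "READ": {"ACCESS", "READ"},             # many readers, no writers
    "WRITE": {"ACCESS"},                    # exclusive except for Access readers
    "EXCLUSIVE": set(),                     # nothing else may touch the resource
}


def can_grant(requested, held):
    """Return True if `requested` can be granted alongside every lock in `held`."""
    return all(lock in COMPATIBLE[requested] for lock in held)


def request_lock(requested, held, queue):
    """Grant the lock if possible; otherwise queue the request until locks are released."""
    if can_grant(requested, held):
        held.append(requested)
        return True
    queue.append(requested)
    return False
```

For example, a second reader is granted immediately while another Read lock is held, whereas a Write request against the same resource is queued until the readers release their locks.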
5.8.4 Host Utility Locks
The Teradata Archive/Recovery utility located on the client uses a different locking operation than Teradata Database. These locks are commonly called Host Utility (HUT) locks. The different types are:
• Read - any object being archived.
• Group Read - used when the table is defined for an after image permanent journal causing rows of a table to be archived.
• Write - permanent journal table being restored.
• Write - tables in ROLLFORWARD or ROLLBACKWARD during recovery.
• Write - journal table being deleted.
• Exclusive - any object being restored.
HUT locks are:
• Associated with the user currently logged on.
• Placed on objects in the AMPs participating in a utility operation and none other.
• Placed during a CLUSTER dump at the cluster level.
• Active until the RELEASE LOCK option of an ARC command is given or a RELEASE LOCK statement is issued after a utility operation completes its execution.
• Automatically reinstated if not released when Teradata Database restarts.
• Not in conflict with any lock at a different level for the same object for the same user.
5.8.5 Recovery
Recovery in database management handles the process where databases in an inconsistent state are brought back to a consistent state. Since transactions are simply a series of updates to the database, they can be used to take the database to an earlier state or forward to a current state. When a database discovers an error or failure, the system may restart. This is typically the result of:
• AMP or disk failure.
• Software failure.
• Disk parity error.
When errors are found, the affected subtable or region is marked and a snapshot dump is taken. Errors can be isolated to data or index subtables or to a range of rows in a data or index subtable. If any transaction still in progress requires the down subtable or region, the transaction is aborted. If the transaction does not require access to the affected subtable or region, it will continue to process.
The system can perform an automatic transaction recovery. Two types of transaction recovery can occur:
• Single transaction recovery
• Database recovery
A single transaction recovery will occur because of a transaction deadlock, user error, user-initiated abort command, or inconsistent data table. This form of recovery uses the transient journal to perform its operation. A database recovery is performed after initiating a restart of the system. System recovery uses the down AMP Recovery Journal. If an AMP fails to come online during system recovery, the Teradata Database will process requests using fallback data. When the AMP finally does come online, the down AMP recovery procedures are initiated to bring the AMP up to date. If only a few rows need to be recovered, the process is performed while the AMP is online. When a large number of rows need to be processed, the AMP will recover offline. After all updates have been made, the AMP is considered recovered.
5.8.6 Two-Phase Commit Protocol
Update consistency across distributed databases is assured using the Two-Phase Commit protocol. A change is not fully made by any participant until it knows that all participants can commit to the change. In this type of environment, each participant in the transaction commit operation will vote to either commit or abort the changes. A participant is any database manager performing some work related to the transaction. A vote by a participant is simply a declaration that it can either commit or roll back its portion of the transaction work. Participants can also be a coordinator of participants at a lower level.
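The voting idea behind the Two-Phase Commit protocol can be pictured with a small sketch. This is only an illustration under stated assumptions: the `Participant` class, the `Vote` enum, and the `two_phase_commit` function are hypothetical names invented here, not Teradata interfaces, and real implementations also handle logging, timeouts, and nested coordinators.

```python
from dataclasses import dataclass
from enum import Enum


class Vote(Enum):
    COMMIT = "commit"
    ABORT = "abort"


@dataclass
class Participant:
    """Hypothetical stand-in for a database manager doing part of the work."""
    name: str
    can_commit: bool = True
    state: str = "working"

    def prepare(self) -> Vote:
        # Phase 1: declare whether this participant can commit its portion.
        self.state = "prepared" if self.can_commit else "aborting"
        return Vote.COMMIT if self.can_commit else Vote.ABORT

    def finish(self, decision: Vote) -> None:
        # Phase 2: apply the coordinator's global decision.
        self.state = "committed" if decision is Vote.COMMIT else "rolled back"


def two_phase_commit(participants: list[Participant]) -> Vote:
    """Commit only if every participant votes to commit; otherwise abort all."""
    votes = [p.prepare() for p in participants]
    decision = Vote.COMMIT if all(v is Vote.COMMIT for v in votes) else Vote.ABORT
    for p in participants:
        p.finish(decision)
    return decision
```

A single abort vote is enough to roll back every participant, which is why no participant treats its change as final until the second phase.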
5.9 Teradata Tools and Utilities
5.9.1 Data Archiving Utilities
The Teradata Archive/Recovery utility (ARC) works with BAR application software writing and reading sequential files to archive, restore, recover, and copy table data. Teradata ARC supports the archiving and restoration of individual functions, hash indexes, join indexes, macros, stored procedures, triggers, and views. In connection with BAR application software, Teradata ARC will archive to and restore from tape storage devices and backup-to-disk (B2D) storage devices.
Third party software products are supported by Teradata Backup, Archive, and Restore (BAR) for various archiving, backup, and restore functions. Tiered Archive Restore Architecture (TARA) consists of software extensions to connect BAR software to the Teradata Database. BAR software includes:
• Bakbone NetVault and NetVault Plug-in - used to select databases and tables graphically and define the backup types to perform.
• Symantec NetBackup and NetBackup Extension for Teradata - used to schedule automatic, unattended backups for client systems. The framework is comprised of the TARA Server, TARA GUI, and NetBackup Extension for Teradata.
• Tivoli Storage Manager (TSM) and Tivoli Manager Teradata Extension - manages I/O interfaces between Teradata ARCMAIN and IBM TSM. The framework is comprised of the TARA Server, TARA GUI, and TSM Teradata Extension.
5.9.2 Load and Extract Utilities
Teradata Parallel Transporter (Teradata PT) provides scalable, high-speed, parallel data extraction, loading, and updating by using and expanding on the traditional Teradata extract and load utilities, such as FastLoad, MultiLoad, FastExport, and TPump. The prominent features of Teradata PT are:
• Process-specific operators
• Access modules
• Parallel execution structure
• Data stream use
• SQL-like scripting language
• Teradata PT Wizard
Teradata PT utilizes an Application Programming Interface (API) to provide several interfaces to load data into or extract data from the Teradata Database. Teradata PT API is a functional library designed to allow greater control over the load and extraction operations.
FastExport can be used to export tables to client files, and export data to an Output Modification (OUTMOD) routine. The utility is designed to extract large amounts of data from the Teradata Database to the client in parallel. Block transfers can be performed with multisession parallelism. The utility supports client environments and server environments. FastExport is a functional complement of FastLoad and MultiLoad.
FastLoad will load data into unpopulated tables. Only one table is populated per job, requiring multiple FastLoad jobs to be submitted when several tables are loaded. Data from multiple input source files can be loaded. The utility will also perform block transfers with multisession parallelism.
Bulk inserts, updates, and deletes can be performed using Teradata MultiLoad. These activities can be performed against several unpopulated or populated database tables at a time. The utility is supported in client environments and server environments, and supports block transfers with multisession parallelism.
TPump is used to maintain data in tables. Instead of using block transfers, the utility uses standard SQL, as well as multiple SQL statements packed into a single request. Conventional row hash locking is supported. It supports the same restart, portability, and scalability found with MultiLoad. Multiple instances of the tool can be run simultaneously with no limitations.
5.9.3 Access Modules
Providing block-level I/O interfaces is the function of dynamically-linked software components known as access modules. They import data from data sources and return that data to a Teradata utility. Utilities and the access modules are linked through the Teradata Data Connector API. The following access modules will read, not write, data:
• Named Pipes Access Module - will load data into the database from a UNIX OS named pipe (data buffer).
• Teradata WebSphere MQ Access Module - will load data from a message queue using IBM’s WebSphere MQ.
• Teradata Access Module for JMS - will load data from a JMS-enabled message system.
• Teradata OLE DB Access Module - will transfer data between an OLE DB provider and Teradata Database.
• Custom Access Modules - will provide access to specific systems using the DataConnector operator.
Access modules will work with the following tools and utilities on multiple operating systems:
• BTEQ
• FastExport
• FastLoad
• MultiLoad
• Teradata PT
• TPump
5.9.4 Querying
Basic Teradata Query (BTEQ) is a command-based program allowing users to communicate with one or more Teradata Database systems and format reports for print and screen output. SQL queries are submitted to the database: BTEQ will format the results and return them to the screen, as a file, or to a designated printer. During a BTEQ session, the following actions can be taken:
• View, add, modify, and delete data.
• Enter BTEQ commands.
• Enter operating system commands.
• Create and use stored procedures.
5.9.5 Session and Configuration Management
The following tools are used for investigating sessions and configurations:
• Query Session - provides information about active sessions by monitoring the state of selected sessions.
• Query Configuration - provides reports on the current configuration of the database, including node, AMP, and PE identification and status.
• Gateway Global - provides monitoring and control capabilities for sessions of network-connected users.
5.9.6 Resource and Workload Management
Several tools are used to perform system resource and workload management in Teradata Database:
• Write Ahead Logging
• Ferret utility
• Priority Scheduler
• Teradata Active System Management
The Write Ahead Logging (WAL) protocol will record permanent data writes to a log file to provide a record of updates to the database. The log file is written to the disk at key moments. Modifications to permanent data can be batched, even when coming from different transactions. The log contains Redo Records and Transient Journal (TJ) records. The WAL has a simpler structure than a table, but is conceptually similar to tables. Log data is a sequence of WAL records and is not accessible through SQL. In the event of a system failure, the WAL can be used to reconstruct the database. This tool will protect all permanent tables and all system tables, except:
• Transient Journal tables
• User journal tables
• Restartable spool tables
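To make the Write Ahead Logging idea concrete, the following is a minimal sketch, assuming a toy in-memory store. `RedoRecord` and `TinyWal` are hypothetical names chosen for illustration; the real WAL also carries Transient Journal records and commit information, which this sketch leaves out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedoRecord:
    """One logged change: which transaction set which key to which value."""
    txn_id: int
    key: str
    new_value: str


class TinyWal:
    """Toy write-ahead log (illustrative only, not Teradata's format)."""

    def __init__(self):
        self.log = []    # sequential log; stands in for the log file on disk
        self.table = {}  # stands in for permanent data

    def write(self, txn_id, key, new_value):
        # Append the log record first, then modify the data.
        self.log.append(RedoRecord(txn_id, key, new_value))
        self.table[key] = new_value

    def recover(self):
        # Rebuild the data purely by replaying the log in order.
        rebuilt = {}
        for record in self.log:
            rebuilt[record.key] = record.new_value
        return rebuilt
```

Because every change reaches the log before the data, replaying the log in order after a failure yields the latest value for each key, even when records from different transactions are interleaved.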
The Ferret utility will display and set storage space utilization attributes for the database. The utility reconfigures data in the file system while maintaining data integrity during the change. It can do so within vprocs, tables, subtables, WAL log, disk, and cylinder. It includes:
• SCANDISK
• SCOPE
• SHOWBLOCK and SHOWSPACE
Priority Scheduler (schmon) controls access to resources based on different active jobs in the database. The utility is active on all Teradata Database systems. Administrators can define priorities for different types of work. The following capabilities are available:
• Better service for higher priority work.
• Controls resource sharing between different applications.
• Prevents aggressive queries from consuming resources.
• Automates changes.
Teradata Active System Management (Teradata ASM) is a collection of products interacting with each other and a common data source for the purpose of allowing automation in:
• Workload management
• Performance tuning
• Capacity planning
• Performance monitoring
Some of the Teradata ASM products include:
• Teradata Workload Analyzer (Teradata WA) - analyzes DBQL data to group workloads and define rules for managing system performance.
• Teradata Viewpoint - defines rules for filtering, throttling, and defining classes of workloads, and creates events for monitoring system resources.
• Query Bands - defined by the user or middle-tier application to allow sessions or transactions to be tagged with an ID.
• Resource Usage Monitor - used for data collection.
• Open APIs - provides an SQL interface to PMPC through UDFs and external stored procedures.
5.10 Security and Privacy
5.10.1 Concepts of Security
Security in the Teradata Database is grounded in the following concepts:
• Database user - an individual or group of individuals represented as a single user identity.
• Privileges - explicitly or automatically granted permissions to a user or database.
• Logon - the submission of user credentials when requesting access to the database.
• Authentication - the verification of a user’s identity.
• Authorization - determines the permissions to the database associated with a user.
• Security mechanism - provides specific authentication, confidentiality, and integrity services.
• Network traffic protection - secures the message traffic between network attached clients and Teradata Database.
• Message Integrity - ensures the message received is the same as the message sent with no loss or change to the data.
• Access logs - provides a history of users accessing the database and the objects accessed.
5.10.2 Users
Users must be defined in the database or a supported directory. The CREATE USER statement defines permanent database users. Usernames are typically used to represent individuals and must be unique in the database. Directory-based users accessing the database must be defined in a supported directory. The implementation may or may not require the creation of a matching database user. In any case, one or more configurations must be updated to allow directory users to access the database.
Users who access the database through a middle-tier application using trusted sessions are called proxy users. The application will authenticate the user rather than the database. Role-based database privileges are assigned to proxy users using a GRANT CONNECT THROUGH statement. A SET QUERY_BAND statement is submitted when establishing privileges for a particular connection to the database; however, the statement and the rules for applying it must be coded into the application in order to be used.
5.10.3 Database Privileges
If a user has privileges on an object, they can access the object. If no privileges exist, they cannot access the object. The different types of database privileges include:
• Implicit - granted to the owner of the space where the database objects are created.
• Automatic - granted to the creator of a database, user, or database object or to newly created users and databases.
• Explicit - granted to a user or database directly or to a role which is associated to a user, or the default database user PUBLIC.
• Inherited - passed on based on a user’s relationship to another user or role, such as directory users to database users.
Groups of users with similar needs can be granted privileges on database objects by associating them to a role. All objects that a role has privileges to can be accessed with those privileges by a member of that role. The use of roles will decrease the dictionary space required for granting privileges to individuals. The CREATE ROLE statement is used to define a role. The GRANT statement can be used to grant roles to users. The default role for the user must be specified when using the CREATE USER statement. To assign additional user roles, the MODIFY USER statement can be used. Users can switch between multiple roles they are members of using the SET ROLE statement. To access all roles, the user can use the SET ROLE ALL statement. Proxy users are assigned roles using the GRANT CONNECT THROUGH statement.
External roles are used to assign privileges to directory users. External roles and database roles are created in the exact same way. However, since directory users do not exist in the database, the roles are not granted to the directory users directly. Instead, the roles are assigned to groups of which the directory users are members.
Profiles can be defined and assigned to a group of users sharing similar values, such as:
• Default database assignment
• Spool space capacity
• Temporary space capacity
• Account strings permitted
• Password security attributes
5.10.4 Authentication
User authentication is a process where a user’s identity is verified and compared to a list of approved users. The process consists of the following elements:
• Authentication method
• Logon format and controls
• Password format and controls
Users must choose a security mechanism to identify the authentication method used. Security mechanisms are part of the logon string or the system will use the default mechanism. The categories of authentication are:
• Teradata Database authentication
• External authentication
For authentication, the Teradata Database requires the user and privileges for the user to be defined. Teradata Database authentication makes use of the TD2 mechanism by default. The mechanism does not need to be specified at logon unless another mechanism is set as the default.
External authentication utilizes an agent running on the same network as the Teradata Database and its clients. It is dependent on two elements:
• Security mechanism specified in the logon (authenticating agent).
• Logon type.
The different types of logon for external authentication are:
• Single Sign-on - users are authenticated in the domain and do not require subsequent logons.
• Directory Sign-on - user is authenticated by the directory.
• Sign-on As - the logon to Teradata Database is recognized by the domain.
The following logon forms are found in Teradata Database:
• Command line
• GUI
• Logon from channel-attached client
• Logon through a middle-tier application
When using a command line logon, the following information is provided by the user:
• .logmech - the name of the security mechanism is specified to define the method used to authenticate and authorize the user.
• .logdata - used for external authentication to specify a username and password.
• .logon - used for Teradata Database authentication and optionally for external authentication.
A GUI logon is performed within dialog boxes, which include fields and buttons to prompt the user to enter the same logon information provided in a command line logon. When channel-attached clients log into sessions, network security features are not supported. These features include security mechanisms, encryption, and directory management of users. Logons require the submission of username, password, tdpid, and optional account string information. When accessing the database through a middle-tier application, a database username must be used. A connection pool is set up by the application to allow end users to access the database without any formal logon.
Permissions are automatically granted for all users defined in the database to allow them to log on from all connected client systems. The configuration for this automatic process can be modified by administrators, specifically:
• Modify current or default logon privileges.
• Restrict logon from a specific channel or network interface.
• Set the maximum number of unsuccessful logon string submissions.
• Enable external applications to perform authentication.
• Restrict access based on IP address.
Passwords used in logons must conform to password format rules. These rules govern the type and number of characters allowed in the password. Password controls allow administrators to: • Restrict password content, including minimum and maximum password characters, characters allowed, and the use of specific words. • Set the number of days a password is valid. • Assign a temporary password. • Set lockout time after the maximum number of logon attempts is exceeded. • Define time period that a previous password cannot be reused.
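As a rough illustration of how such password controls fit together, the sketch below checks a candidate password against a few rules. The `PasswordPolicy` class, its field names, and its default values are all assumptions made for this example; they do not reflect Teradata's actual settings or defaults.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class PasswordPolicy:
    """Hypothetical policy object; every default here is illustrative."""
    min_chars: int = 8
    max_chars: int = 30
    require_digit: bool = True
    restricted_words: tuple = ("password",)
    valid_days: int = 90

    def violations(self, password: str) -> list:
        # Collect the name of each rule the password breaks.
        problems = []
        if not (self.min_chars <= len(password) <= self.max_chars):
            problems.append("length")
        if self.require_digit and not any(c.isdigit() for c in password):
            problems.append("digit")
        if any(word in password.lower() for word in self.restricted_words):
            problems.append("restricted word")
        return problems

    def is_expired(self, set_on: date, today: date) -> bool:
        # A password is valid for valid_days days after it was set.
        return today > set_on + timedelta(days=self.valid_days)
```

For instance, `PasswordPolicy().violations("short")` reports both the length rule and the digit rule, while an empty list means the password passes every check in this sketch.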
Once a user has been fully authenticated, the privileges defined for that user determine what the user is authorized to do in the database. Permanent users are authorized with the following privileges:
• Directly granted using the GRANT statement.
• Indirectly granted using automatic, implicit, and inherited privileges.
• Granted as a member of a role.
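One way to picture how direct grants and role membership combine is the sketch below. The function name, the `(privilege, object)` pairs, and the use of the string "ALL" to mimic SET ROLE ALL are assumptions for illustration only; indirect privileges (automatic, implicit, inherited) are left out to keep the example short.

```python
def effective_privileges(direct, role_grants, user_roles, current_role=None):
    """Return the (privilege, object) pairs a permanent user can exercise.

    direct       -- set of pairs granted directly to the user
    role_grants  -- dict mapping role name to the set of pairs it holds
    user_roles   -- set of roles the user is a member of
    current_role -- an active role name, "ALL", or None for no active role
    """
    if current_role == "ALL":
        active = set(user_roles)      # every role the user belongs to
    elif current_role in user_roles:
        active = {current_role}       # a single active role
    else:
        active = set()                # not a member: the role adds nothing
    privileges = set(direct)
    for role in active:
        privileges |= role_grants.get(role, set())
    return privileges
```

A role the user does not belong to contributes nothing, so the result for that case is just the directly granted set.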
Directory-based users will be authorized access privileges to the database according to these rules:
• If the directory maps the user to database objects, the directory user is authorized the privileges of those objects.
• If the directory does not map the user to any database object, but the directory username matches a database username, the directory user is authorized all privileges associated with the matching database user.
• If the user is neither mapped to any database object nor matched to a database username, the directory user has no privileges to the database.
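The three rules can be read as an ordered decision, sketched below. The function and its dictionary arguments are hypothetical structures invented for this example; an actual directory integration stores these mappings in the directory and the Data Dictionary.

```python
def directory_user_privileges(username, directory_mappings,
                              object_privileges, database_users):
    """Resolve a directory user's privileges by applying the rules in order.

    directory_mappings -- dict: directory username -> list of mapped objects
    object_privileges  -- dict: mapped object name -> set of privileges
    database_users     -- dict: database username -> set of privileges
    """
    mapped_objects = directory_mappings.get(username, [])
    if mapped_objects:
        # Rule 1: the user takes the privileges of the mapped objects.
        privileges = set()
        for obj in mapped_objects:
            privileges |= object_privileges.get(obj, set())
        return privileges
    if username in database_users:
        # Rule 2: no mapping, but a database user has the same name.
        return set(database_users[username])
    # Rule 3: neither mapped nor matched, so no privileges at all.
    return set()
```

Note that a mapping takes precedence over a name match in this sketch, following the order in which the rules are listed.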
The types of directories certified for use with Teradata Database are:
• Active Directory
• Active Directory Application Mode (ADAM)
• Novell eDirectory
• Sun Java System Directory Server
• LDAPv3-compliant directories
Middle-tier applications will log on to the database, be authenticated as a permanent database user, and establish a connection pool. Authentication of individual application end-users is then performed by the application. All end-users accessing the database through a middle-tier application are authorized privileges to the database, and audited in access logs, based on the permanent database user identity of the application.
Data protection in Teradata Database is enhanced by the following features:
• The logon string is encrypted, by default, to maintain the confidentiality of the username and password.
• Optional encryption of the message maintains the data’s confidentiality.
• Data corruption is prevented through automatic integrity checking.
• BAR encryption will ensure data backup confidentiality between BAR servers and the storage device.
• Systems using LDAP authentication with simple binding utilize SSL/TLS protection.
• Systems using LDAP authentication with Digest-MD5 binding utilize SASL protection.
Two types of user security monitoring are performed by Teradata Database:
• Viewed through DBC.LogOnOffV, all user logon and logoff activity is collected in the Event Log using parameters such as database username, session number, and logon events.
• Viewed through DBC.AccessLogV, user attempts to access the database can be optionally recorded using parameters such as access type, request type, requesting database username, requesting database object, and frequency of access.
Logging can be enabled and disabled using the BEGIN and END LOGGING statements. These statements can also establish the access parameters to log. Directory users are logged by their directory username and not their associated database name. Users accessing a database through a middle-tier application using a trusted session and set up as proxy users are logged using their proxy username. If trusted sessions are not set between the application and end user, the user’s username for the application is shown in logs.
Some security-related system views used by the Data Dictionary:
• DBC.LogOnOffV - identifies all logon and logoff activity.
• DBC.AccessLogV - identifies a privileges check resulting from a user request.
• DBC.LogonRulesV - identifies logon rules resulting from GRANT and REVOKE LOGON statements.
• DBC.AccLogRulesV - identifies the access logging rules contained in each BEGIN and END LOGGING statement.
5.10.9 Security Policy
A security policy implemented for the database should consider:
• Balancing the need for secure data against the need for quick and efficient access to the data.
• Whether the security features of the database meet the current security needs of the business.
• A developed policy which is a mixture of system-enforced and personnel-enforced features.
A security policy document should detail:
• The need for security.
• The benefits of the security policy for users and the business.
• Suggested and required actions related to security.
• A description of security features in the database.
• Contacts for security related questions.
6 Database Design
6.1 Development Planning
6.1.1 Design Considerations for Teradata Database
The primary requirement for any data warehousing effort is a properly designed and configured database. The effort provides long range decision support as well as ad hoc tactical and decision support. Different user types require access to different types of data using different methods of retrieval, from batch jobs to queries to detailed analysis. When considering a database solution, the major areas of interest fall into entering data, storing data, retrieving data, and supporting processes.
Key considerations surrounding entering data into the database include the IT users, the source data, and transformation of the data. The various methods related to data transformation are:
• Accessing
• Capturing
• Extracting
• Filtering
• Scrubbing
• Reconciling
• Conditioning
• Condensing
• Householding
• Loading
The physical aspects of the database consist of the Relational Database Management System (RDBMS), how the data is organized in the RDBMS (Data Marts), and the replication and propagation of the data.
The considerations surrounding retrieving data cover the business users who require the data in the database, the access tools to the database, and the data mining methods used, including:
• Clustering
• Statistical
• Artificial Intelligence
• Neural Nets
Supporting the database solution are several components and processes in place for the entire lifecycle of the database. These include:
• Logical Models
• Middleware
• Metadata
• Data Dictionary
• Network Management
• Database Management
• System Management
• Business and Technology Services
6.1.2 Data Marts
Data marts are relatively small subsets of a data warehouse database which are application or function specific and designed for a narrowly defined user population. Three types of data marts are recognized:
• Independent data marts - also known as data basements, they are isolated entities separated from the enterprise data warehouse and utilize independent data sources.
• Dependent data marts - are part of the enterprise data warehouse, utilizing data from the enterprise data store and permitting users to have full access to the enterprise data store.
• Logical data marts - a form of dependent data mart which is virtually constructed from the physical database.
Data warehouse designs typically “start small” using data marts before transforming into a large data warehouse solution. This approach is nearly always unsuccessful. Most often, the data mart is originally unsuccessful because the design restricts the user’s access to data which they really need, usually because the authority is too restrictive or the data is not present in the data mart. If the new data mart is successful, the user demands placed on the data mart will cause the database to grow faster than expected, increasing response times and adding to user frustration. The result is a failed database solution and, sometimes, the perception that the reasons for failure were always present. The following are popular summaries for data warehouse failures:
• Denormalization.
• Aggregates stored over detail data.
• Focus on a small, preselected set of queries.
6.1.3 Data Warehousing
Though data marts have a place in data warehousing, they are not, and should not be, the focus of a data warehouse solution. The introduction of data warehousing focused on the provision of a historical database containing data derived from an active operational database. Data in this context was oriented to a specific subject, integrated, nonvolatile, and handled time variances. The data was typically added to the data warehouse from the operational database after some defined age. The data was static and was not inserted, updated, deleted, or modified. The data warehouse was only used to perform historical queries of the data.
The Teradata Database solution is a dynamic solution with active data supporting a business’ ongoing, day-to-day operations. The data is volatile; it can be changed, updated, deleted, and added. The development of the Teradata solution fulfills the following goals:
• Large capacity, parallel processing database.
• Fault tolerance.
• Redundant network connectivity.
• Manageable, scalable growth.
• Centralized shared information architecture.
• Faster response times.
• Use of a standard, non-proprietary access language.
6.1.4 Parallel Processing
Parallel processing speaks to the Teradata Database’s ability to allow its file system, message subsystem, lock manager, and query optimizer to work in parallel. Teradata Database is a shared nothing database architecture: the PE and AMP vprocs in the architecture do not share either memory or disk storage across central processing units.
The system supports parallel processing by balancing the database’s workload. The balance is achieved by distributing table rows across AMPs and giving the responsibility of the data to those AMPs. These rows are hashed across AMPs using the row hash value of the primary index. This row hash is used to retrieve the row from the AMP. Each data row is owned by exactly one AMP, which has the ability to create, read, update, or lock its data. As a result, each AMP has exclusive control over its own virtual disk space. All relational operations are performed in parallel, simultaneously, and unconditionally across all AMPs and they are performed independently of the data on other AMPs in the system.
The operations of Teradata Database are supported by internodal communication, including the BYNET. Each system node within the system will have multiple BYNET paths to every other node in the same system. To minimize the cost of this internodal communication, Teradata Database will use a non-collision, point-to-point monocast communication architecture which allows a single sender to connect with a single receiver. The BYNET internodal communication is used when the same information is required by all AMPs. As a result, information is typically broadcast to all AMPs in the system at the same time.
Query optimization is another consideration in a parallel environment. The Teradata Database optimizer is designed specifically to optimize queries in a parallel architecture. The optimizer is aware of the parallel system in the following ways:
• Determines how long each operation takes to perform the query to determine the optimal ordering of the join.
• Determines the need to redistribute rows for a join operation.
• Determines an AMP-to-CPU ratio to define the most efficient query plan.
• Uses statistical and demographical information related to the AMPs, table cardinalities, and the number of rows used in a particular operation.
The hash distribution of rows across AMPs enables request parallelism. Request parallelism in Teradata Database is multidimensional, including the following dimensions:
• Query parallelism
• Within-step parallelism
• Multistep parallelism
• Multistatement request parallelism
When a query plan is generated by the Optimizer, the components of a query are divided into concrete steps and dispatched to appropriate AMPs for execution. Within each concrete step, relational operations are processed in parallel. Multistep parallelism refers to the system’s ability to invoke more than one process for each step in the request. Multiple steps of the same request can be executed simultaneously as long as the results of those steps are not dependent on each other. The Teradata SQL extension, Multi-statement requests, allows a number of distinct SQL statements to be bundled together and treated as a single unit. The statements are executed in parallel.
6.1.5 Usage Considerations
Some design considerations for Teradata Database are:
• Online Transaction Processing (OLTP)
• Decision Support (DSS)
• Summary Data
• Detail Data
• Simple and Complex Queries
• Ad hoc Queries
6.1.6 ANSI/X3/SPARC Three Schema Architecture
The American National Standards Institute/Standards Planning and Requirements Committee (ANSI/X3/SPARC) architecture was developed to define database management systems, before relational databases were commercially available. Three levels of the architecture are specifically defined by ANSI/SPARC:
• External - composed of all views of the underlying physical database.
• Conceptual - transparently maps External level views to the physical storage of the actual data in the database.
• Internal - the physical storage of data on the storage media.
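Returning to the parallel processing discussion, the way hashing spreads rows so that each row is owned by exactly one AMP can be sketched as follows. This is an illustration under loud assumptions: SHA-256 is only a stand-in for the database's own hashing algorithm and hash map, and `amp_for`, `distribute`, and `hash_column` are hypothetical names.

```python
import hashlib
from collections import defaultdict


def amp_for(value, num_amps):
    """Map a column value to an AMP number (stand-in hash, not Teradata's)."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    row_hash = int.from_bytes(digest[:4], "big")  # 32-bit illustrative row hash
    return row_hash % num_amps


def distribute(rows, hash_column, num_amps):
    """Assign every row to exactly one AMP based on its hashed column value."""
    amps = defaultdict(list)
    for row in rows:
        amps[amp_for(row[hash_column], num_amps)].append(row)
    return amps
```

Because the same value always hashes to the same AMP, a lookup by that value only needs to visit one AMP, while a full-table operation can run on all AMPs at once, each working on its own share of the rows.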
6.1.7 Design Phases
The purpose of the logical database design is to formally define the objects and their relationships, as well as the attributes applicable to those objects. The structure of database management is built on normal forms and derivations as applicable to a series of inference rules and formal logical operations derived from set theory. The first step in designing the logical database is normalization.
After the logical database design is achieved, Activity Transaction Modeling (ATM) is introduced. ATM will define physical attributes for the logical data model. The process includes the following efforts:
• Identifying business rules that impact data storage.
• Identifying and defining attribute domains and constraints for physical columns.
• Identifying and modeling database applications.
• Identifying and modeling application transactions.
• Summarizing table and join accesses.
• Creating a preliminary set of data demographics.
The results of these activities act as an input into the design of the physical database.
The physical design phase is the point where commitment is made to the physical attributes of the database. In this phase, the entities, attributes, and relationships derived from the logical model are translated into the physical database design. The phase will identify and create actual databases, base tables, views, indexes, macros, triggers, and other objects.
6.1.8 Requirements Analysis
The purpose of requirements analysis is the development of an enterprise data model: a blueprint for ensuring IT standards exist and are integrated into the business enterprise. In essence, the principal concept behind the enterprise data model is a logical construct defining and controlling interfaces and the integration of all components of the enterprise information structure. Based on an idea created by John Zachman, the basic framework behind the enterprise data model has the following components: perspectives and dimensions. The different perspectives include:
• Planner - defines the scope of the project, including cost constraints and regulations, which results in a scope definition.
• Owner - defines the envisioned product, including the usage constraints and policies, which results in a business model.
• Designer - ensures the scope and requirements are transformed into a product specification, which results in a system model.
• Builder - constructs the product, often defined by generic models, resulting in a technology model.
• Subcontractor - constructs the individual components of the product.
Each perspective is dependent on the product of the perspective preceding it. Therefore, each perspective should be handled in order. The different dimensions include:
• Entities - defines the specific inputs into the process.
• Activities - defines the work to be performed.
• Locations - defines the geographical aspects.
• People - defines the users of the model.
• Time - defines when activities occur.
• Motivation - defines goals promoting the model.
Each dimension is independent of the others and has equal importance to the overall model. Each dimension is addressed by each perspective. The resulting items of each perspective/dimension question become the requirements for the end product.
Analysis of requirements ensures the development of the system is done properly the first time, supports user access to the system, and produces reasonable and reliable estimates of costs. Requirements are generally identified through two activities: interviewing and review of legacy databases. Interviewing can provide the current demands on database requirements. The greatest disadvantage is the possible insertion of hidden agendas or conflicting perspectives on the importance of specific requirements. Legacy databases can provide substantial information on the strengths and weaknesses of the data and how that data is used or misused. The disadvantage is that most legacy systems are developed with little or no direction or guidance, and with little explanation of what currently exists and why.
6.1.9 Entity-Relationship Models

Teradata provides several industry-specific logical data model (LDM) frameworks, including:
• Communications
• Financial Services
• Healthcare
• Insurance
• Manufacturing
• Media
• Retail
• Transportation and Logistics
• Travel Industry

Most of these models are based on the entity-relationship model for database management. Data models are an abstract, logical definition of objects, operators, and supporting elements to the structure and behavior of the data. Relational models for the database model identify connections between data within the database. The entity-relationship (E-R) model extends the semantics of the relational model to obtain greater meaning out of the data. In this model, objects are tangible in the real world and captured as an entity, each with unique characteristics captured as attributes. Entities interact with each other through relationships.

From a design perspective, several types of schemas exist for categorizing entities. For Teradata, two schemas are commonly referred to:
• Major and minor entities - represents entities of large and small cardinality and degree. Major entities are updated frequently while minor entities are updated rarely.
• Supertype and subtype entities - represents a generic entity (supertype) comprised of several specific entities (subtype).

Key definitions within this context are:
• Entity - database objects representing a real world “thing”.
• Attribute - a characteristic of an entity: at least one attribute always exists called the primary key.
• Relationship - an association between two or more entities.
• Derivative Attribute - any attribute derived through a calculation from other data in the model.
• Degree - the number of entities associated in the relationship.
• Connectivity - the mapping of entity occurrences in a relationship (values are 0, 1, many).
• Cardinality - the number of entity instances related through the relationship.
• Existence Dependency - a relationship between entities where the existence of one entity depends on the existence of the other entity.
• 1:1 Relationship - the occurrence of one entity is related to at most one occurrence of another entity and vice versa (mathematically A leads to B and B leads to A).
• 1:M Relationship - the occurrence of one entity is related to 0, 1, or more occurrences of another entity, but in reverse many occurrences of the second entity are related to at most one occurrence of a single entity (mathematically A leads to at least one B and B leads to exactly one A).
• M:M Relationship - the occurrence of one entity is related to 0, 1, or more occurrences of another entity, and vice versa (mathematically A leads to at least one B and B leads to at least one A).
• Prime Table - a table with a single column primary key.
• Non-Prime Table - a table with a composite primary key.
• Associative Table - a non-prime table where all primary key columns are also foreign keys.

6.1.10 Normalization Process

Some definitions used within the context of normalization are:
• Alternate key - any candidate key not selected as a relation’s primary key.
• Attribute - describes the primary key with a unique name, drawn from the domain, and constrained.
• Body - a composite value set of a relation assigned to tuple variables.
• Candidate key - an attribute set that uniquely identifies a tuple.
• Composite key - a key defined on multiple attributes.
• Domain - all possible values specified for a given attribute, or a data type.
• Field - the intersection of a tuple and an attribute.
• Foreign key - attribute sets based on an identical attribute set which is a candidate key for a different relation.
• Heading - attached to each attribute and comprised of a name and a domain.
• Instance - a tuple drawn from a complete set of tuples for a relation.
• Intelligent key - an “overloaded” simple key encoded with more than one fact.
• Key - an attribute set, uniquely identifying each tuple in the relation.
• Natural key - represents a real world identifier for a tuple in a relational database.
• Primary key - an attribute set uniquely identifying each tuple in a relation. Every relation has one and only one primary key.
• Relation - a representation of data in tabular form.
• Relational schema - set of relations in a logical relational model.
• Repeating group - a collection of logically related attributes occurring multiple times in a tuple.
• Simple key - a key defined on a single attribute.
• Superkey - any set of attributes uniquely identifying a tuple.
• Surrogate key - an artificial simple key identifying individual entities in an arbitrary way.
• Tuple - a unique instance of a relation consisting of at least one primary key and any associated attributes.
Since relational databases are based on set theory, logical operators are used to construct and decompose relationships in the database:
• Difference - represents the set of all attributes contained in one relation but not another.
• Divide - the division of one relation of degree m + n by another relation of degree n to produce a quotient relation of degree m.
• Intersection - represents the set of all attributes contained in both relations and only in both relations.
• Join - the boolean product of the concatenation of tuples within relations.
• Product - represents all multiples of all attributes found in associated relations.
• Project - any subset of attributes of a relation is a projection on the relation.
• Restrict/Select - any subset of tuples satisfying specified conditions.
• Union - represents all attributes found in either relation or both relations.

Relations (tables) are decomposed vertically and horizontally. Normalization is performed through identified dependencies between attributes. A relation consists of a primary key and zero or more attributes. The primary key uniquely identifies any tuple, while attributes represent a single-value property of the entity.

Normal forms define a system of constraints placed on a relation. Different layers of normal forms provide greater detail related to the normalization process, eventually reducing a relational database to a single aphorism - one fact in one place. Relational databases are typically broken into six layers as follows:
• First Norm (1NF) - the first layer where all fields of a relation variable are atomic, that is, each field contains one value and one value only. Any relation in a relational database is considered to be in first norm by definition. That is, relations in a relational database meet the constraints of the first norm.
• Second Norm (2NF) - eliminates circular dependencies. Formally, the relationship between a table and its primary key and attributes must be one-to-one.
• Third Norm (3NF) - eliminates nonkey attributes not describing the primary key. The relationship between a table and its primary key and attributes is not one-to-one from primary key to attributes: all nonkey attributes are functionally dependent on the primary key. The relationship between two nonkey attributes is not one-to-one in either direction. A stricter form is called the Boyce-Codd Normal Form (BCNF), which holds for a relation if and only if every determinant is a candidate key.
• Fourth Norm (4NF) - eliminates multivalued dependencies from relations. The relation is in BCNF and has no anomalies associated with multivalued dependencies.
• Fifth Norm (5NF) - also known as projection-join normal form (PJ/NF). Every join dependency is a consequence of the candidate keys of the relation. Any relation that is in 3NF is in 5NF if every candidate key is simple in the relation.
• Sixth Norm (6NF) - the relation satisfies no nontrivial join dependencies.

A primary key uniquely identifies each tuple associated to a relation variable. To identify a primary key:
1. Identify any superkeys - a set of attributes uniquely identifying the tuples of a relation.
2. Eliminate any redundant or unnecessary attributes in the superkeys to create a candidate key set.
3. Select a primary key from the candidate keys: every relation has one and only one primary key.

Any candidate key that is unselected is considered an alternate key. Primary key attributes cannot be null, cannot contain duplicate values, should never be modified, and cannot be defined on columns in the BLOB or CLOB data types. Primary keys do not identify order or an access path. A surrogate key is an artificial simple key used when no natural key exists. When a simple primary key cannot be chosen, surrogate keys can be used. To enforce the Referential Integrity Rule, a UNIQUE constraint can be assigned to any alternate key.

Recommendations for selecting primary keys are:
• Select numeric attributes whenever possible.
• Select system-assigned keys whenever possible.
• Select those attributes that remain unique.
• Select attributes that rarely change.
• Never use intelligent keys.
• Name primary key attributes using a consistent convention.
• In a supertype-subtype relationship, always assign the same primary key to both sides of the relationship.
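The three steps for identifying a primary key can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample relation, its column names, and the helper names `is_superkey` and `candidate_keys` are invented for this example, and uniqueness is judged against the sample rows only, not against the business rules a real design would use.

```python
from itertools import combinations

# Hypothetical sample relation: each dict is one tuple.
ROWS = [
    {"emp_id": 1, "ssn": "111", "dept": "A", "name": "Ann"},
    {"emp_id": 2, "ssn": "222", "dept": "A", "name": "Bob"},
    {"emp_id": 3, "ssn": "333", "dept": "B", "name": "Ann"},
    {"emp_id": 4, "ssn": "444", "dept": "A", "name": "Ann"},
]

def is_superkey(rows, attrs):
    """Step 1: attrs is a superkey if no two rows share the same values."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows):
    """Step 2: keep only superkeys with no redundant attributes."""
    attrs = sorted(rows[0])
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            # Skip any set that contains a smaller key: it is not minimal.
            if any(set(key) <= set(combo) for key in keys):
                continue
            if is_superkey(rows, combo):
                keys.append(combo)
    return keys

keys = candidate_keys(ROWS)
# Step 3: pick one candidate key; taking the first is an arbitrary choice here.
primary_key = keys[0]
alternate_keys = keys[1:]
print(primary_key, alternate_keys)
```

In this sample both `emp_id` and `ssn` are candidate keys; whichever one is not chosen becomes an alternate key, which is where a UNIQUE constraint would apply.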
Foreign keys are attribute sets in one relation found in another relation acting as a primary or alternate key. There are three types of foreign key values:
• Mirror image
• Wholly null
• Partially null

Foreign keys cannot be defined on columns in the BLOB or CLOB data types, nor can they be defined for a global temporary trace table. The Referential Integrity Rule states that if a relation has a foreign key matching a primary key of another relation, every value of the foreign key must be equal to a primary key value or be wholly null. The rule is enforced using the PRIMARY KEY and FOREIGN KEY clauses of the CREATE TABLE statement.

6.1.11 Join Modeling

A join describes when two tables are associated with each other. A relational join describes an operation when data from two or more tables are combined. A join requires any two relations to share a common attribute. Due to the presence of foreign keys, primary index joins can be performed. The result of a join is another relation. There are several types of joins. A binary join defines the joining of two relations. When three or more tables are combined, the expression n-ary join is used (ternary, etc.).

Consider that normalization will decompose relations into smaller relations. If the smaller relations (tables) are joined to recreate the original relation, the join is considered lossless, a property defined by the law of conservation of relations. If the relations are joined and will not result in the original relation, the join is considered lossy. Since joins between relations are also relations themselves, it is possible to have a lossless join relation. A lossless join relation is defined as any table which exists solely to ensure any adhoc query against a set of relations will only use standard operators of the relational algebra.

When a database has been improperly normalized or the result of normalization is misunderstood, semantic disintegrity can occur. A sign of this occurs when an incorrect answer is provided for a user’s query. Attribute mapping provided within join processing can provide an analysis of semantic disintegrity. The following properties apply:
• Mappings are either 1:1 or 1:M.
• A 1:1 mapping between attributes is equivalent to a functional dependency (X and Y).
• 1:1 is loosely defined: X can be mapped to only one Y, but the same value Y can be mapped to different values of X.
• If a mapping is not 1:1, then the mapping must be 1:M.
• A sequence of attribute mappings must meet the lossless join property.

If the above properties hold, semantic disintegrity can be avoided. Dependency preservation is a product of a relation’s decomposition, where all functional dependencies of the original relational schema can be implied by the functional dependencies in decomposed relations.

6.1.12 Activity Transaction Modeling Process

The Activity Transaction Modeling (ATM) Process is the initial step in transforming the logical data model into physical data. The activities of the ATM process include:
1. Define all domains.
2. Define all constraints.
3. Identify all applications.
4. Model identified applications.
5. Model identified transactions.
6. Identify column change ratings.
7. Compile or estimate demographic information of data.
8. Transfer access information.

The ATM process defines the following terms accordingly:
• Domain - a defined, closed set of values and commonly referred to as a data type.
• Constraint - a physical restriction defined for a column or table.
• Column - a unique, atomic attribute of a relational entity.
• Row - an instance of an object in a relational table.
• Table - an abstract representation of an entity constructed of rows (tuples) and columns (attributes).
• Primary key - a column set that uniquely identifies a tuple within a relation.
• Foreign key - a column set identifying a relationship between two or more tables in a database.
• Identity Column - consists of unique and system-generated columns which are frequently used to create surrogate keys.
• Normalization - segregates the attributes of a database into individual tables to allow the attributes to be uniquely modified.

6.2 Indexing

Indexes are used by the Optimizer to allow table access to be more efficient. Indexes of a relational database are tables consisting of columns and rows and referencing base tables in the database. They provide direct access to data typically retrieved by common, planned queries. Full-table scans do not require indexing and represent the activity of unplanned, adhoc queries, reading the table one data block at a time. The number of rows retrieved by the index determines the index’s selectivity. Highly selective indexes retrieve very few rows. A low selectivity identifies those indexes which retrieve many rows.

Subtables store secondary, hash, and join indexes, which use a substantial amount of system storage space. Additional storage space is required when fallback features are defined for a table. The rows are made up of two parts: a data field in the referenced table and a pointer to the location, or possible locations, of the row in the base table. The Teradata Database will update the index subtables every time an indexed column value is updated or deleted in the base table, or a new row is inserted.

Primary indexed tables will distribute rows across multiple AMPs. This distribution is enabled using a hashing algorithm, which will compute a row hash value based on the value from the primary index. The value is a 32-bit value containing either a 16-bit hash bucket number with a 16-bit remainder or a 20-bit hash bucket number with a 12-bit remainder. A Teradata system can have up to 65,536 (16-bit hash buckets) or 1,048,576 hash buckets (20-bit hash buckets), based on the settings for the CurHashBucketSize and NewHashBucketSize flags. Hash buckets are distributed across AMPs as evenly as possible.

Rows are hashed differently between primary indexes and no primary indexes (NoPI). In unique indexes, a 32-bit row hash value of the primary index is stored with the column data for the row. RowIDs will uniquely identify each row, but will change whenever the value of the primary index or partitioning column for the row changes and can be reused after any association with a current row has ended. The NoPI will store the row hash value with the RowID generated by the AMP software after assigning the row to the AMP.
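The split of a 32-bit row hash into a hash bucket number and a remainder is simple bit arithmetic. The sketch below is an illustration only: it assumes the bucket number occupies the high-order bits of the row hash, and the function name `split_row_hash` is invented for this example rather than taken from any Teradata interface.

```python
def split_row_hash(row_hash: int, bucket_bits: int = 16):
    """Split a 32-bit row hash into (hash bucket number, remainder).

    Assumption: the bucket number is stored in the high-order bits.
    """
    if bucket_bits not in (16, 20):
        raise ValueError("bucket_bits must be 16 or 20")
    if not 0 <= row_hash < 2 ** 32:
        raise ValueError("row_hash must fit in 32 bits")
    remainder_bits = 32 - bucket_bits
    bucket = row_hash >> remainder_bits
    remainder = row_hash & ((1 << remainder_bits) - 1)
    return bucket, remainder

def bucket_count(bucket_bits: int) -> int:
    """Number of distinct hash buckets for a given bucket width."""
    return 2 ** bucket_bits

sample = 0xABCD1234  # arbitrary sample row hash
print(split_row_hash(sample, 16), bucket_count(16))
print(split_row_hash(sample, 20), bucket_count(20))
```

The two bucket widths give 2^16 = 65,536 and 2^20 = 1,048,576 buckets, which matches the limits quoted above.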
There are a number of different index types falling under four general categories: primary, secondary, join and hash. The different types include:
• Unique nonpartition primary index.
• Nonunique nonpartition primary index.
• Unique single-level partitioned primary index.
• Unique multilevel partitioned primary index.
• Nonunique single-level partitioned primary index.
• Nonunique multilevel partitioned primary index.
• Unique secondary index.
• Nonunique secondary index hash-ordered on all columns with no ALL option.
• Nonunique secondary index hash-ordered on single column with ALL option.
• Nonunique secondary index hash-ordered on single column with no ALL option.
• Nonunique secondary index value-ordered on single column with ALL option.
• Nonunique secondary index value-ordered on single column with no ALL option.
• Single-table simple join index.
• Single-table aggregate join index.
• Single-table sparse join index.
• Multitable simple join index.
• Multitable aggregate join index.
• Multitable sparse join index.
• Hash Index.

Unique indexes, namely unique primary indexes (UPI) and unique secondary indexes (USI), will enforce a unique value for a particular column set, utilizing the primary key column set constraint. Nonunique indexes, namely non-unique primary indexes (NUPI) and non-unique secondary indexes (NUSI), do not require a unique value. Nonunique primary indexes are typically used to join by defining entities with the same primary index to ensure rows are hashed to the same AMP.
Multitable join indexes are defined for frequently performed join queries, while single-table join indexes define joins by hashing frequently joined subsets of base table columns to the same AMP. Hash indexes are similar to single-table join indexes with the syntax similar to secondary indexes.

6.2.1 Primary Indexes

Tables in the Teradata Database can have zero or one primary index. Primary indexes are used to define the distribution of the rows to the AMPs, ensure access to rows without a full-table scan, provide efficient joins, and aid in efficient aggregation. Primary indexes are created using the CREATE TABLE statement. Only one primary index can be defined for each table. A PRIMARY INDEX may be explicitly defined, NO PRIMARY INDEX may be explicitly defined, or neither may be specified. If a primary index is not explicitly defined, the existence of a default primary index is dependent on a PRIMARY KEY constraint or any UNIQUE constraints and on the setting of the PrimaryIndexDefault Control flag. The primary index definition will specify no more than 64 columns and those columns cannot have data types for BLOB, CLOB, UDT, Period, or Geospatial. Multi-value compression cannot be specified for primary index columns or partitioning columns of a PPI partitioning expression. Primary indexes are unique or non-unique and partitioned or non-partitioned.

As mentioned above, primary indexes can be unique or non-unique. Unique primary indexes are assigned to major entities and subentities in non-temporal tables. Subentities will typically use the same unique primary index as the major entity they are associated with to ensure that related rows in the different tables are hashed to the same AMP. Non-unique primary indexes are assigned to minor entities and defined on the same column as the major entities with which the minor entity is associated. The unique primary index column set for non-temporal tables should always be set with a NOT NULL attribute. A unique primary index cannot be defined for a temporal table, nor is a default primary index assigned for a temporal table. All temporal tables are defined to have only non-unique primary indexes.

Primary indexes can be partitioned or non-partitioned. A non-partitioned primary index (NPPI) is the standard primary index for the Teradata Database. When created with an NPPI, a table will hash its rows to the appropriate AMPs, where they are stored in row hash order. A partitioned primary index (PPI) is an extension to the NPPI. Partitioned primary indexes can have up to 15 levels and be applied to base tables, global temporary tables, volatile tables, and non-compressed join indexes. Partitioned primary indexes cannot be applied to queue tables, NoPI tables, row compressed join indexes, and hash indexes. Both single-level and multilevel PPIs can be defined for global temporary and volatile tables, standard temporal and non-temporal base tables, and non-compressed join indexes.

When a PPI is created with a table or join index, the rows are hashed to AMPs and assigned computed internal partition numbers. These numbers are based on the user defined value of a partitioning expression. PPI rows are grouped into partitions on an AMP using their partition number, then the row hash value for each partition, and then a unique value. NPPIs will be ordered using only the row’s hash value. The partition number field is located in the Row ID and represents the combined partitioning expressions of a PARTITION BY clause. There are two partition numbers, an internal and an external number. The internal number is calculated from the external partition number, while the external number is the partitioning expression for a row in a PPI table. The design of PPIs will optimize range queries.

Non-primary index (NoPI) tables are non-temporal tables which have no primary index and a table type of MULTISET. These tables are used to improve performance of data loading operations. Without a primary index, a NoPI table will not hash rows to an AMP using a primary index value. Instead, the Query ID for a row is used in the hash, or a different algorithm. A Row ID in a NoPI table row is randomly selected using an arbitrary hash bucket owned by an AMP. NoPI tables cannot be used as a temporal table, queue table, error table, or SET table. NoPI tables cannot specify a permanent journal or an identity column, nor can SQL UPDATE or MERGE requests be used to update a NoPI table. Also not allowed are system-derived PARTITION or PARTITION#Ln columns.

6.2.2 Secondary Indexes

Secondary indexes are never required for tables, but are used to improve performance, particularly for repetitive or standard queries. Instead of using the primary index path, the secondary index will specify other access paths to the desired tables. These indexes can be created when a table is created or added later using the CREATE INDEX statement. A Teradata table can have up to 32 secondary, hash, and join indexes. Up to 64 columns can be found within a secondary index definition and the columns defined in the definition cannot contain BLOB, CLOB, Period, Geospatial, or UDT data types.

Secondary indexes can be unique or non-unique. Unique secondary indexes (USI) are assigned to any column constrained by unique values. Non-unique secondary indexes (NUSI) are generally assigned to non-unique column sets which have attributes which are frequently sorted. They will typically appear in conditions identified by the WHERE clause, such as:
• Selection conditions
• Join conditions
• ORDER BY clauses
• GROUP BY clauses
• Foreign keys
• UNION
• DISTINCT

For row access using a single value, USIs are preferred: NUSIs are more preferred for range query access. Usually, USIs are used to access base tables or to enforce data integrity. The Teradata Database will implement a USI when a non-primary index uniqueness constraint is created for a nontemporal table using the PRIMARY KEY or UNIQUE constraints. These constraints are implemented on temporal tables only when a valid time table has a column specified, with a NONSEQUENCED VALIDTIME UNIQUE constraint.

Unique secondary indexes do not allow FastLoad, MultiLoad, or Teradata Parallel Transporter operations (LOAD and UPDATE) to load data if the indexes are associated with the target base tables. Before any load is successful, all associated USIs to the table must be dropped, or other load utilities must be used, such as Teradata Parallel Data Pump, BTEQ, or Teradata Parallel Transporters (INSERT and STREAM). If requests are currently running, the index will place a READ lock on the index subtable, thus blocking any DROP INDEX transaction from completing. If a DROP INDEX transaction is running, the EXCLUSIVE lock is activated and will not allow other processing until completed.

Depending on the type of base table and format, a subtable row will be structured slightly differently. The fields in the row layout include:
• Row Length (2 bytes) - defines the number of bytes in the row, including overhead.
• Row ID (8 bytes) - defines uniquely the USI row by combining the row hash (output of the hashing algorithm) and uniqueness value (system-generated integer).
• Overhead (2 bytes) - unused.
• Secondary index value (up to 65,524) - defines the column values for USI.
• Base table row ID (8 bytes for NPPI tables and 10 bytes for PPI tables) - defines the RowID of the base table row identified by the USI.

The RowID of the PPI table contains the partition number for the row, in addition to the row hash and uniqueness value. The RowID for the NPPI will not contain a partition number.
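The difference between the 8-byte and 10-byte base table row IDs can be pictured with a small Python sketch. The field widths are assumptions inferred from the totals above (a 4-byte row hash, a 4-byte uniqueness value, and a 2-byte partition number for PPI tables); the `RowID` type, `pack_rowid`, and `storage_order` are hypothetical names for illustration, and the byte order is chosen arbitrarily.

```python
import struct
from typing import NamedTuple, Optional

class RowID(NamedTuple):
    row_hash: int                    # 32-bit output of the hashing algorithm
    uniqueness: int                  # system-generated integer
    partition: Optional[int] = None  # None models an NPPI table

def pack_rowid(rid: RowID) -> bytes:
    """Pack a RowID; field widths and byte order are assumptions."""
    if rid.partition is None:
        return struct.pack(">II", rid.row_hash, rid.uniqueness)            # 8 bytes
    return struct.pack(">HII", rid.partition, rid.row_hash, rid.uniqueness)  # 10 bytes

def storage_order(rid: RowID):
    """PPI rows order by partition, then row hash, then uniqueness value."""
    return (rid.partition or 0, rid.row_hash, rid.uniqueness)

rows = [RowID(5, 1, 2), RowID(9, 1, 1), RowID(1, 1, 2)]
ordered = sorted(rows, key=storage_order)
print(len(pack_rowid(RowID(1, 1))), len(pack_rowid(RowID(1, 1, 3))))
print(ordered)
```

Sorting on the partition number first is what keeps the rows of one partition together on an AMP, which is why range queries on the partitioning column benefit from a PPI.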
Teradata Database will distribute a USI row to a different AMP than the base table row the index identifies. As a result, USI access is typically a two-AMP operation. The USI value request is accessed by hashing to its subtable and reading the pointer to the base table in order to access the stored row directly. A three-part BYNET message is specified by USI access. To use a USI to locate a row, the following process is used:
1. The Parser checks the syntax and lexicon of the query.
2. The Parser looks up the Table ID for the USI subtable containing the desired USI value.
3. The USI value is hashed by the hashing algorithm.
4. An AMP steps message is created by the Generator, containing the USI Table ID, USI row hash value, and the USI data value.
5. The USI row hash is used by the Dispatcher to send a message across the BYNET to the appropriate AMP containing the USI subtable row.
6. The AMP locates the USI subtable using the USI Table ID.
7. The AMP locates the index row in the subtable using the USI RowID.
8. The AMP reads the base table RowID from the USI row and distributes a message containing the base table ID and RowID for the requested row.
9. The RowID is used to locate the base table row.

Non-unique secondary indexes are best used for range access with equality and nonequality conditions. NUSI access will use all AMPs in its operations. The subtable access is not hashed for NUSIs, requiring the subtables to be scanned in order to identify the relevant pointers to base table rows. Multiple NUSIs are frequently defined for the same table. How many of those NUSIs are used in the query plan depends on the individual and composite selectivity. While individual NUSIs may not be highly selective, combinations of NUSIs may be highly selective.

Composite set selection with an OR or AND can be used. Other selections can provide conditions for BETWEEN, LESS THAN, GREATER THAN, or LIKE. ORed expressions allow any predicate condition in the WHERE clause to evaluate to TRUE for the specified condition, while ANDed expressions require all predicate conditions to evaluate to TRUE. If any condition evaluates to FALSE, no row is retrieved for the set of conditions. If an OR expression is used, the Optimizer will typically perform a full table scan. If low selectivity is possessed by both indexes, the Optimizer will use bit mapping. The purpose of bit mapping is to drastically reduce the number of base rows to be accessed and is only used when weakly selective indexed conditions are ANDed.
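Why USI access touches at most two AMPs can be shown with a toy model. Everything in this sketch is an assumption made for illustration: `amp_for` stands in for the real hashing algorithm and hash map (it uses CRC32 only because it is deterministic), the `ToySystem` class is hypothetical, and the primary index value plays the role of the base table RowID.

```python
import zlib

N_AMPS = 4  # hypothetical system size

def amp_for(value) -> int:
    """Toy stand-in for the hashing algorithm: map a value to an AMP number."""
    return zlib.crc32(repr(value).encode()) % N_AMPS

class ToySystem:
    def __init__(self):
        # Per-AMP storage: base table rows and USI subtable rows.
        self.base = [dict() for _ in range(N_AMPS)]
        self.usi = [dict() for _ in range(N_AMPS)]

    def insert(self, pi_value, usi_value, row):
        # The base row is placed by its primary index value.
        self.base[amp_for(pi_value)][pi_value] = row
        # The USI row is placed by the USI value and points at the base row.
        self.usi[amp_for(usi_value)][usi_value] = pi_value

    def usi_lookup(self, usi_value):
        """Return (row, set of AMPs touched) for a lookup by USI value."""
        usi_amp = amp_for(usi_value)              # hash the USI value
        pointer = self.usi[usi_amp].get(usi_value)  # read the subtable row
        if pointer is None:
            return None, {usi_amp}
        base_amp = amp_for(pointer)               # follow the pointer
        return self.base[base_amp][pointer], {usi_amp, base_amp}

system = ToySystem()
system.insert(pi_value=1001, usi_value="ann@example.com", row={"name": "Ann"})
system.insert(pi_value=1002, usi_value="bob@example.com", row={"name": "Bob"})
row, amps_touched = system.usi_lookup("bob@example.com")
print(row, len(amps_touched))
```

A NUSI lookup in the same model would have to scan the subtable on every AMP, because the subtable row is stored on the same AMP as its base rows rather than being located by hashing the index value.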
When a query is covered, it means that all columns requested in a query are available in some index subtable without having to access a base table. Partial covering means that only some of the columns requested are available. If a query is not covered by a join index, the Optimizer will often use the index to join underlying base tables to provide greater optimization. A query is covered by a NUSI if the following is true:
• An ALL option is defined by the NUSI.
• All columns referenced in a query are included in the NUSI.
• A character column set is not referenced which is not defined as either CASESPECIFIC or UPPERCASE in the base table. That is, the query does not contain any bad character column sets, or if any bad character columns exist, they are only specified in a COUNT function or UPPERCASE operator in the query, and if not, no CASESPECIFIC condition exists in the conditions of the query.

6.2.3 Join Indexes

Join indexes are designed to permit the resolution of queries by accessing the index without accessing or joining the underlying base tables. Join indexes will join multiple tables in a prejoin table. When the index structure contains all columns referenced by joins, multitable join tables are useful. A single base table will be replicated as a whole, or in part, and its rows partitioned using a different primary index than the base table. Join tables allow one or more columns of a single table to be aggregated as a summary table.

A hash or join index definition cannot be specified with a system-derived PARTITION column, however a user-named column called partition can be specified in the index definition. The join index definition is the only definition which uses the ROWID keyword. It can be used optionally to specify the ROWID for the base table. An alias name for the ROWID can be referenced, or the keyword ROWID. If an alias name is referenced in the select list of the join index definition, the same alias must be referenced in the CREATE INDEX statement creating the secondary index. If multiple tables in the definition are referenced, each ROWID specification must be qualified. Aliases are required to resolve any ambiguities in column names or ROWIDs in the select list of the join index definition.
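The idea of covering used throughout this section comes down to a set comparison between the columns a query references and the columns an index carries. The sketch below is a deliberate simplification: the function name `coverage` and the column names are hypothetical, and it ignores the ALL option and the character-column rules listed for NUSIs.

```python
def coverage(query_columns, index_columns) -> str:
    """Classify a query as covered, partially covered, or not covered."""
    query = set(query_columns)
    index = set(index_columns)
    if query <= index:
        return "covered"            # every referenced column is in the index
    if query & index:
        return "partially covered"  # only some referenced columns are in the index
    return "not covered"

index_cols = ["cust_id", "order_date", "total"]  # hypothetical index columns
print(coverage(["cust_id", "total"], index_cols))
print(coverage(["cust_id", "ship_city"], index_cols))
print(coverage(["ship_city"], index_cols))
```

Only in the first case can the request be resolved from the index alone; in the second the base table still has to be read for the missing column.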
When a join index is created, the system will transfer any column multi-value compression defined in the base table automatically to the join index definition. The following rules must be followed to transfer compression values to join index columns:
• Multi-value compression transfers will occur to a join index definition even if alias names are specified for columns in the join index definition.
• Transfers to the multitable join index definition will continue as long as the maximum header length of the join index is not exceeded.
• A transfer to a join index column will not happen for any column specified with an argument for the functions COUNT, EXTRACT, or SUM.
• A transfer to a join index definition cannot happen if the column is a component of the primary index for the join index.
• A transfer to a join index definition cannot happen if the column is a component of an ORDER BY clause in the definition.
• A transfer to a join index definition cannot happen for any columns which are components of a partitioned primary index or partitioning expression for the join index.
• A transfer to a column cannot happen in a compressed join index which has indexes defined on its column_1 and column_2.

A defined multi-value compression is not inherited by hash indexes.

The different types of join indexes include:
• Simple join indexes - satisfying any query performing a frequent join operation by defining a permanent prejoin table without violating the normalization of the database schema.
• Single-table join indexes - a simple join index on a single table. A single-table join index is a database object created by a CREATE JOIN INDEX statement specifying only one table in its FROM clause. All or some of the columns can be hashed on a foreign key which hashes rows to the same AMP as another large table.
• Aggregate join indexes - a database object created using the CREATE JOIN INDEX statement specifying one or more columns derived from an aggregate expression, particularly specifying a SUM or COUNT aggregate operation. These join indexes cover aggregate queries considering a subset of groups contained in the join table, satisfying conditions such as:
o All columns specified in the grouping clause of a query must be included in the grouping clause.
o All columns in the query WHERE clause must be part of the join index definition.
o The partitioning columns must be members of the column set specified in the GROUP BY clause of the index definition if a PPI is defined for an aggregate join index.
• Sparse join indexes - a sparse join index will use a constant expression in the WHERE clause of its definition to narrowly filter the row population. Any join index can be sparse, whether simple or aggregate, multitable or single-table.

6.2.4 Hash Indexes

Hash indexes are file structures. Their properties are common to single-table join indexes and secondary indexes. Below are the similarities between single-table join indexes and hash indexes:
• Both function to improve query performance.
• Both can be an object of the SQL statements COLLECT STATISTICS, DROP STATISTICS, HELP INDEX, and SHOW HASH INDEX.
• Neither can be directly updated or queried.
• The system maintains the relevant columns of their base table automatically through an update using a DELETE, INSERT, or UPDATE statement.
• Space allocation is received from permanent space and stored in distinct tables.
• They can be hash- or value-ordered.
• They can be row compressed.
• They can be FALLBACK protected.
• A single-table join index can have a partitioned primary index, but a hash index cannot.
• A complex expression can be transformed into a simple index column.
• A partially covered query containing a TOP n or TOP m PERCENT clause is not supported.
• Row compression specifying a UDT in their select list cannot be implemented.
• Restrictions exist for using the MultiLoad, FastLoad, and Archive/Recovery utilities.
• A system-derived PARTITION column cannot be defined.
• Neither index type will allow column multi-value compression to be explicitly defined.

The differences between hash indexes and join indexes are:
• A hash index will index only one table, while join indexes can index multiple tables.
• For a hash index, a logical row can correspond to one and only one row in a referenced base table, while in a join index a logical row can correspond to either one and only one row in a referenced base table or multiple rows in referenced base tables.
• The column list for a join index can specify aggregate functions, while the column list for a hash index cannot specify aggregate or ordered analytical functions.
• Primary indexes cannot be partitioned for hash indexes, while for join indexes they can be partitioned only in noncompressed row forms.
• Base table row pointers in join indexes are explicitly defined using the ROWID keyword, while pointers are implicitly added for hash indexes.
• Hash indexes cannot specify a UDT in their select list, while join indexes can only when the UDT is not row-compressed.
• Hash indexes will transparently add index row compression, while join indexes must explicitly define index row compression.
• Join indexes will transparently add column multi-value compression, while hash indexes do not.
• NoPI tables are supported by join indexes, not hash indexes.
• Hash index columns cannot have UDT, BLOB, CLOB, Period, or Geospatial data types. Join index columns cannot have BLOB, CLOB, or Geospatial data types.

Teradata tables can have up to 32 secondary, hash, and join indexes.
6.3 Integrity
Databases have two types of integrity constraints: semantic and physical. Semantic integrity constraints are used to enforce the logical aspects of the data and their relationships. Declarative semantic constraints are part of the definition of the database and consist of:
• Column-level constraints
• Table-level constraints
• Database constraints
Physical data integrity constraints will check user data as it travels from system memory to disk.

Constraints are the physical implementation of business rules. The purpose of a constraint is twofold: to ensure bad data is not loaded into the database, and to ensure no corruption occurs between tables due to improper deletion or update of data in the existing database. They should never be declared on columns defining BLOB or CLOB data types.

6.3.1 Set Theory
The principles of set theory are at the core of relational database management and the foundation for practical design and administration of relational databases, though the normalization of a database can be drilled down to a point where these principles and formal logic can seem trivial. By default, all rows in a relational table or relational variable are assumed to be true, as integrity constraints would deny their entry if proven false. Some common terms from set theory and formal logic that find themselves in database design are:
• Assertion - also called a proposition, is a statement that can be proven without question to be either true or false.
• Existential quantifier - signifies the logical equivalent of “for some,” “for any,” and “there exists.”
• Identity predicate - signifies the logical equivalent of “is identical to” or “is equal to.”
• Inference rules - determine the steps of reasoning used to prove propositions.
• Predicate - a truth-valued function, commonly representing attributes of a relation (columns) and the relation heading (relation variable).
• Predicate calculus - a set of inference rules where propositions in predicate logic are proven.
• Predicate logic - uses truth-functional operators of propositional calculus, universal and existential quantifiers, and identity predicates to prove validity of a statement.
• Truth-valued function - a function that will evaluate without question to either TRUE or FALSE.
• Universal quantifier - signifies the logical equivalent of “for all”.

6.3.2 Semantic Integrity
Integrity constraints will restrict database updates to a set of specified values or range of values. An integrity constraint is a component of the logical database model that formalizes a component of the business model (business rule) by defining the conditions and ranges permitted in database parameters. Business rules are a component of the business model that defines specific conditional modes of activity. In its strictest sense, a constraint is any predicate which must evaluate to TRUE in order for a DELETE, INSERT, or UPDATE operation to be permitted in a database. In a database, each table must be able to evaluate its own predicate or relational variable (relvar) for its truth value and enforce a set of business rules.

An integrity rule is a set of rules used to ensure data integrity. They can be implemented using constraints or triggers. Each rule is comprised of a name, an integrity constraint set, a checking time, and a violation response. The checking time is the point within the process where the constraint is checked. The ANSI SQL standard supports immediate or deferred checking times, though all Teradata Database constraints are checked immediately and deferred checking time is not permitted. The violation response component defines what action will be taken if the integrity constraint is violated. The ANSI SQL standard supports a response of either reject or compensate. All Teradata Database constraints will respond with a reject, as compensate responses are not permitted. The SQL CREATE TABLE and ALTER TABLE statements are used to specify integrity rules.

Four types of integrity constraints are supported:
• Domain - a data type, acting as a simple constraint by defining the characteristics of values entered into the database.
• Column - specifies a simple predicate which is applied to only one column.
• Table - defines the sets of relationships available to values within a row and can be applied to a single row or multiple rows.
• Database - defines the functional determinant between a key and its dependent attributes.

Each column defined for a table must specify a name and a data type. Additional attributes or constraint definitions can be used to further define a column. Column-level constraints cannot reference other columns in the table, while table-level constraints must reference other columns in the table. In general, semantic constraints cannot be defined with BLOB, CLOB, UDT, Period, or Geospatial data types.

Some column-level integrity constraints for both nontemporal and temporal tables are:
• UNIQUE - defined on a single column as USIs for nontemporal tables or as either single-table join indexes or USIs for temporal tables.
• PRIMARY KEY - defined on a single column as USIs for nontemporal tables or as either single-table join indexes or USIs for temporal tables.
• CHECK - defined on a single column and cannot be implemented as an index.
• FOREIGN KEY...REFERENCES - defined on a single column and cannot be implemented as an index. Not supported on temporal tables.

If a constraint is not named, system-generated names will not be assigned. The names of a constraint can contain the following characters:
• Uppercase and lowercase alphanumeric
• Integers
• Special characters # $

The most general form of SQL constraint specifications is the CHECK constraint. It can be applied to either an individual column or to an entire table depending on the CREATE or ALTER TABLE text. The following rules are applied to all CHECK constraints:
• Can be defined at the column or table-level.
• Can have a temporal dimension.
• The specified predicate can be any simple boolean search condition.
• The current session collation is used to test the constraints on character columns.
• Unnamed constraints with identical text and case are considered duplicates and will result in an error if submitted with a CREATE TABLE or ALTER TABLE statement.
• Cannot be specified for any level of a volatile table.
• Cannot be defined with BLOB, CLOB, UDT, Period, or Geospatial data types.
• Cannot be specified for global temporary trace tables.

Column-level CHECK constraints can have multiple constraints specified for a single column. If multiple unnamed constraints are defined for the same column, the system will combine the constraints into a single-column constraint. Each named constraint is handled separately. Column-level CHECK constraints cannot reference any other column in the table.

Table-level CHECK constraints will reference at least two columns from the table. A maximum of 100 table-level constraints can be defined for a table. Constraint predicates for table-level CHECK cannot reference columns from other tables.

Referential primary key relationships with foreign keys are allowed through the FOREIGN KEY...REFERENCES clause. If FOREIGN KEY is specified, a REFERENCES clause must be specified also. However, the FOREIGN KEY keywords are optional. Referential constraint types supported by FOREIGN KEY...REFERENCES include:
• Standard Referential Integrity
• Batch Referential Integrity
• Referential Constraints
• Temporal Relationship Constraints

Standard referential integrity will test each row which is inserted, deleted, or updated for referential integrity. If a violation occurs, the AMP software will reject the operation and respond with an error message. Batch Referential Integrity will test every inserted, deleted, or updated batch operation for referential integrity. If a violation of referential integrity occurs, the parsing engine will roll back the entire batch and respond with an abort message. Neither process is valid for use with temporal tables.

Referential constraints can be used for either temporal or nontemporal tables. Referential integrity is not tested; the system assumes that it is enforced by the user in some other way than the normal declarative referential integrity constraint mechanism. Temporal relationship constraints (TRCs) also do not perform a test for referential integrity, but assume some other method is used to enforce it. TRC operations are defined at the table level and are valid for referential integrity relationships between single columns belonging to valid time and bitemporal table types and non-temporal or transaction time table types. They cannot be defined on self-referencing tables.
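The reject violation response and immediate checking time described in this section can be illustrated with a small, hedged sketch. It uses Python's built-in sqlite3 module purely as a stand-in engine (it is not Teradata Database, and its error messages differ), and the department and employee tables are hypothetical names invented for the illustration. It shows a column-level CHECK constraint and a REFERENCES clause written without the optional FOREIGN KEY keywords, each rejecting a violating row as soon as it is inserted.

```python
import sqlite3


def demo_reject_response():
    """Try three inserts and report which ones the constraints reject."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE department (dept_no INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE employee ("
        " emp_no INTEGER PRIMARY KEY,"
        " salary INTEGER CHECK (salary > 0),"
        " dept_no INTEGER REFERENCES department (dept_no))"
    )
    conn.execute("INSERT INTO department VALUES (10)")

    attempts = {
        "valid row": (1, 500, 10),
        "negative salary": (2, -5, 10),       # violates the CHECK constraint
        "unknown department": (3, 500, 99),   # violates referential integrity
    }
    outcome = {}
    for label, row in attempts.items():
        try:
            conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
            outcome[label] = "accepted"
        except sqlite3.IntegrityError:
            # The engine rejects the row; it never tries to compensate.
            outcome[label] = "rejected"
    conn.close()
    return outcome


if __name__ == "__main__":
    for label, result in demo_reject_response().items():
        print(f"{label}: {result}")
```

Only the valid row is stored; both violating rows are rejected at the moment of the INSERT, which mirrors the immediate checking and reject-only behaviour the text attributes to Teradata Database constraints.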
6.3.3 Physical Integrity
Checking mechanisms for physical integrity will detect data corruption due to lost writes or errors in bits, bytes, or byte strings. Most data corruption is typically protected against through end-to-end error detection and correction algorithms. If hardware devices do not support these mechanisms, the Teradata Database can implement checksums to enforce physical data integrity at the disk I/O level. Additionally, fallback protection allows the same data to be written to two different AMPs within a cluster in case the primary copy of the data is corrupted or goes down for some reason. If this happens, the user will simply access the second, or fallback, copy of the data.

Disk I/O integrity checking (checksums) will detect and log any errors in the disk I/O, but will not fix the errors. However, when data corruption is detected, the system will remove the affected AMP from service. Checksums can be performed at different levels, including:
• Full end-to-end checksums.
• Statistically sampled partial end-to-end checksums.
• No checksum integrity checking.

Disk I/Os in the Teradata Database will read from and write to different data and file structures, including:
• Data blocks (DB)
• Cylinder indexes (CI)
• Master index (MI)
• File information block (FIB)

A Teradata file system is a generalized B* tree structure with three levels:
• Bottom node - represents the data block and contains data rows.
• Middle node - represents the cylinder index and defines the current state of the cylinder through data block descriptor entries.
• Top node - represents the master index and defines the current state of cylinder indexes through cylinder index descriptor entries.
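The trade-off between full and statistically sampled checksums mentioned above can be sketched in a few lines. This is an illustration only: the additive 32-bit checksum, the 4-byte word size, and the evenly spaced sampling are assumptions made for the example and are not the algorithm Teradata Database actually uses.

```python
def sampled_checksum(sector: bytes, percent: int, word_size: int = 4) -> int:
    """Additive 32-bit checksum over roughly `percent` of the words in a sector.

    Hypothetical scheme: words are sampled at evenly spaced positions.
    """
    words = [sector[i:i + word_size] for i in range(0, len(sector), word_size)]
    if percent <= 0 or not words:
        return 0  # no checksum integrity checking
    count = max(1, len(words) * percent // 100)
    step = len(words) / count
    total = 0
    for k in range(count):
        word = words[int(k * step)]
        total = (total + int.from_bytes(word, "big")) & 0xFFFFFFFF
    return total


if __name__ == "__main__":
    sector = bytes(range(256)) * 2          # one 512-byte sector
    damaged = bytearray(sector)
    damaged[20] ^= 0xFF                     # corrupt one byte in word 5
    for pct in (2, 100):
        detected = sampled_checksum(sector, pct) != sampled_checksum(bytes(damaged), pct)
        print(f"{pct:>3} percent sampled, corruption detected: {detected}")
```

With these assumptions, the sparse sample never reads the damaged word, so the corruption goes unnoticed, while the full checksum detects it. Sampling fewer words costs less per I/O but detects fewer errors.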
The following keywords have predefined integrity levels:
• DEFAULT - the system will use whatever system-wide level of integrity checking is defined for the table type in the DBS Control utility.
• NONE - disables disk I/O integrity checking.
• LOW - user-defined setting which samples one word (2 percent) of the data per disk sector to compute the checksum value.
• MEDIUM - user-defined setting which samples one third (33 percent) of the words of the data per disk sector to compute the checksum value.
• HIGH - user-defined setting which samples two thirds (66 percent) of the words of the data per disk sector to compute the checksum value.
• ALL - 100 percent of the words in each disk sector are sampled to generate a checksum.

6.3.4 Database Principles
Several fundamental principles of relational database management must be adhered to when designing and maintaining databases. They are:
• Entity Integrity Rule - the attributes of the primary key cannot be null. The rule is applied to base relations. This rule will violate the Principle of Interchangeability.
• Referential Integrity Rule - no unmatched foreign key values can exist. The rule allows attributes of the foreign key to be wholly or partially null, which violates the Entity Integrity Rule.
• Information Principle - a database can only contain relation variables: information content can be represented at any given instant as explicit values in attribute positions in tuples.
• Principle of Interchangeability - base relations and virtual relations (tables and views) have no arbitrary or unnecessary distinctions.
• Closed World Assumption - declares that if a tuple could validly appear in a relation variable at a given instant but does not appear in that relvar, then the corresponding logical proposition must evaluate to FALSE.
• Principles of Normalization - five principles declaring: 1) a relation variable not in 5NF should be decomposed into a set of 5NF projections, 2) the decomposition must be non-loss, 3) the decomposition must preserve dependencies, 4) every projection on the non-5NF relvar must be needed to reconstruct the original relvar, and 5) the decomposition
should end when all relation variables are in 5NF.
• Principle of Orthogonal Design - two distinct relation variables in the same database must not have non-loss decompositions such that the relation variable constraints for their projections permit the same tuple to appear in both projections.
• Assignment Principle - after a value is assigned to a variable, a comparison between the two must evaluate to TRUE.
• Golden Rule - no database constraint can evaluate to FALSE as the result of an update operation.
• Principle of the Identity of Indiscernibles - every entity has its own identity: if there is not a way to distinguish between two entities, those entities are identical and cannot be two different entities.

6.3.5 Missing Values
The use of SQL nulls covers a number of situations resulting in missing values in the data. The most common reasons are:
• Value is unknown.
• Value is not applicable.
• Value does not exist.
• Value is not defined.
• Value is not valid.
• Value is not supplied.
• Value is an empty set.
Though SQL treats all these situations the same, the semantics, context, and behavior of different null types are different. Nulls are used to identify where missing information is located, but are not values themselves. A null cannot be resolved to a value because it has none. Nulls are not valid predicate conditions in SQL, except in CASE expressions. To search fields which may or may not contain nulls, the operators IS NULL or IS NOT NULL must be used.
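The need for IS NULL and IS NOT NULL can be seen in a short, hedged sketch. It runs against Python's built-in sqlite3 module as a stand-in SQL engine rather than Teradata Database, and the customer table and its contents are hypothetical.

```python
import sqlite3


def null_search_counts():
    """Count rows matched by three different predicates on a nullable column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER, phone TEXT)")
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?)",
        [(1, "555-0100"), (2, None), (3, None)],  # None is stored as a null
    )

    def count(predicate):
        sql = f"SELECT COUNT(*) FROM customer WHERE {predicate}"
        return conn.execute(sql).fetchone()[0]

    result = {
        "phone = NULL": count("phone = NULL"),        # comparison is never TRUE
        "phone IS NULL": count("phone IS NULL"),
        "phone IS NOT NULL": count("phone IS NOT NULL"),
    }
    conn.close()
    return result


if __name__ == "__main__":
    for predicate, rows in null_search_counts().items():
        print(f"{predicate}: {rows} row(s)")
```

The comparison with = matches no rows even though two rows hold nulls, because a null is not a value that can be compared. Only the IS NULL and IS NOT NULL operators separate the rows as intended.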
The literal NULL represents a placeholder for the logical value in an SQL request. A NULL has no data type, except in the cases where it acts as an explicit SELECT item or operand of a function, where its data type is treated as INTEGER. It can be used in the following ways:
• A CAST source operand.
• A CASE result.
• An item specifying a null in a column during an INSERT or UPDATE operation.
• A default column definition specification.
• An explicit SELECT item.
• An operand of a function.

6.4 Capacity Planning
6.4.1 Planning Considerations
Capacity planning does more than simply ensure that data is accessible and stored appropriately. The design process begins by recognizing that large amounts of historical data can be stored at a given time and that business processes require access to that data. However, it can safely be assumed that different data in the database has different levels of relevance to the business over time. Recent data is typically accessed more frequently than older data. As data ages, the relevance of the data can lower. This rate of descent can be faster or slower than for other data. The access rates are called hot, warm, cool, and icy respectively. Hot and warm data is accessed often and constitutes the operational data store. This set of data is required to be up-to-date, integrated, operational, and maintained regularly. Cool and icy data is accessed less frequently, with some hot spots within the cool data sector. Cool sectors may easily move into greater relevance should new columns or indexes be added which affect the data in the sector, or should the data be recast to make it more relevant to the current business context. Proper design can provide optimal performance for the data.

The format of the base table row will provide the most basic estimate in capacity planning. Teradata Database will utilize two row formats: packed64 or aligned row format. The packed64 format will store data in tables which are generally 3 percent to 9 percent smaller than with the other format and will reduce the number of I/O operations required to access and write
rows. Depending on whether the table has non-partitioned or partitioned primary indexes, the overhead devoted to the row header is 12 or 16 bytes respectively. The maximum row length is always 64,256 bytes (approximately 64KB). Characters can be represented by one byte, two bytes, or a combination of single-byte and multibyte representations. The representation used will determine the number of characters supported.

Table columns fall into three general categories:
• Fixed length
• Compressible
• Variable length
The order above determines how rows are stored. In packed64 format, the columns for each category are stored in field ID order. In aligned row format, the columns for each category are stored in decreasing alignment constraint order.

Table headers are used to internally maintain information about each table. Each table has an associated subtable called a table header which is stored on each AMP in the system. The size of a table header is limited to 1 MB. The components of the table header include the row header and fields 1-9. Field 1 is fixed length while all other fields are variable length. Field 3 is not used. The different fields contain:
• Field 1
o Row 1 header.
o Offset array for locating variable length columns.
o Internal ID of the database and database space charged.
o Table creation timestamp.
o Last table update timestamp.
o Last table archive timestamp.
o Primary index flag.
o Number of backup tables associated with this table.
o Table structure version.
o Table structure version for host utilities.
o Table header row format version.
o Internal ID of the permanent journal table.
o Protection type (fallback, log, and none).
o Journal type (After, Before, Both, Audit, and None).
o User journal flag.
o Hash flag.
o Dropped flag.
o DDL change flag.
o Table kind (permanent, temporary, volatile, or join index).
o Byte count of the number of defined USIs.
o Number of parent tables referenced by this table.
o Number of child tables referencing this table.
o Host character set at table creation.
o Disk I/O integrity checksum.
o Data block size for the table (in bytes).
o Data block size validity.
o Percent free space for table.
o Percent free space validity.
o Merge block ratio for the table.
o Merge block ratio validity.
o Message class of primary step.
o Message kind of primary step.
o Message class of secondary step.
o Message kind of secondary step.
o Host ID.
o Session number.
o Request number.
o Transaction number.
o Restart flag.
o Compression flag.
o Dummy space.
o Row 1 length.
o Row format.
o Table ID of base temporary table.
o Internal ID of replicated error table.
o Replicated table initiation flag.
o Internal ID for primary key index.
• Field 2 - contains primary index descriptor and all secondary index descriptors.
• Field 4 - contains MultiLoad, FastLoad, Archive/Recovery, table rebuild, and replication copy information: is always present for permanent journal tables.
• Field 5 - contains the table column descriptor.
o Internal ID of the first column in the table.
o Number of columns in table.
o Number of varying length columns in the table.
o Number of presence bits in each row.
o Offset in row to the presence bit array.
o Offset in row to the first byte past the presence bit array.
o Index into the field descriptor array.
o Index descriptors list for table.
o Flag to indicate a single row or multiple rows.
o Field 5 type (table descriptor with row hash and unique RowID, index descriptor with row hash and unique RowID, table descriptor with a PPI RowID, or index descriptor with a PPI RowID).
o Duplicate rows flag (Dictionary and non-ANSI tables, ANSI tables without unique indexes, ANSI tables with unique indexes).
o Row format.
o Offset to system code to build rows.
o System code for building rows.
o Field descriptors array.
o Compressed values and UDT contexts.
• Field 6 - up to 128 reference index descriptors.
• Field 7 - contains LOB descriptors.
• Field 8 - contains restartable sort and ALTER TABLE information.
• Field 9 - contains database name, table name, UDT name, name list of unresolved child tables from referential integrity constraint specifications, and the length and offset of database name and table name.

6.4.2 Database Sizes
A Teradata system managing a relational database can be as large as 2,048 CPUs and 2,048 GB of memory, supporting 128 TB or larger databases. A BYNET interconnect can support up to 512 nodes. Database sizing can consider the contents of the system disks and data disks, the data disk space allocation, and the determination of usable disk space.

Disk 0 and Disk 1 are system disks cabled to a controller not associated with the disk arrays. Disk 0 is the boot disk and contains file systems under the control of the operating system root directory. Disk 1 provides additional space for dumps and memory swapping. The contents of the boot disk include:
• AMP identifiers file.
• Configuration maps.
• Open PDE and operating system.
• Executable Teradata software.
• UDF libraries.
• Bad disk sectors file.
• Copy of Teradata GDOs.
• Values.
• Space for diagnostic reports, memory swapping, and operating system and Open PDE dumps.

Data disks are virtual and controlled by the AMPs. Included in the data disk are reserved space and permanent space. Reserved space is used to contain the vprocID of each AMP and optional memory swapping. Permanent space is owned by the system user DBC and used for the following:
• CRASHDUMPS user
• SYSTEMFE user
• SYSADMIN user
• Data dictionary
• WAL log space
• Depot area space
• Spool files
• Temporary space
• Hierarchical ownership of PERM and TEMP spaces
6.4.3 Estimating Space Requirements
Allocation of space on the data disk is reserved for system use, PERM space, or TEMP space owned by the DBC user. User table data space is determined by first defining the nonuser table data space allotments and subtracting the total from the available table space. Nonuser table space allotments include:
• Overhead space
• Depot area
• Tables area, including data dictionary, CRASHDUMPS user, user temporary space, and user spool space.

Cylinder index space requirements are calculated using the formula: (2 times the size of one cylinder index/size of one cylinder) times 100.

PERM space is allocated with the CREATE USER statement. The following guidelines should be used:
• Reserve 25 percent to 35 percent of total space for spool space and spool growth buffer.
• An extra 5 percent of PERM space in the user DBC should be allowed.
• Every new user or database created can have the maximum amount of spool space specified.

Spool space is allocated from the available free space of the owning user. TEMP space is allocated from the available space of the owning user. The number of bytes to be allocated to a database or user is based on the TEMPORARY=n clause in the CREATE DATABASE and CREATE USER statements. Global temporary tables require a minimum of 512 bytes from the PERM space of the containing database or user for the GTT table header.

A WAL log is created and managed to recover data tables as a result of aborted transactions. The size of the WAL log is based on the total number of data rows being updated or deleted. To determine the maximum size of the WAL log, the following steps are used:
1. Determine the maximum row length.
2. Multiply the maximum row length by the total number of rows in application programs, batch jobs, and ad-hoc queries.
3. Double the resulting value.

The Depot area consists of large Depot slots and small Depot slots. A large Depot slot is used
to write multiple blocks by aging routines to the Depot area with a single I/O. Small Depot slots are used when Depot protection is required by individual blocks which are written to the Depot area. The number of cylinders allocated is fixed at startup and allocated per pdisk; pdisks are grouped into subpools and AMPs are individually assigned to those subpools. The system will calculate the average number of pdisks per AMP and multiply by the specified value to determine the total number of depot cylinders for each AMP.

The table area combines the following requirements for space:
• Data dictionary - approximately 80 MB should be reserved for growth of system tables and WAL log.
• CRASHDUMPS User Space - created by the DIP utility with a default allocation of 1 GB.
• User TEMP space - allocates default space for global temporary tables. A minimum of 512 bytes is required for the GTT table header. If no default temporary space is defined, the allocated space is set to the maximum temporary space for the immediate owner.
• User Spool Space - a minimum of 20 percent of the user permanent space should be reserved.

The usable data space for Field mode is calculated using the following equation:
a(RO + RP + n(PH + NF + CF)) + b(SNF + SCF)
Where:
o a - number of rows being selected.
o b - number of numeric columns in the ORDER BY clause.
o n - number of columns.
o RO - row overhead.
o RP - row parcel indicators.
o PH - parcel headers.
o NF - formatted size of each numeric column.
o CF - formatted size of each character column.
o SNF - sorted numeric fields.
o SCF - formatted size of each character column in the ORDER BY clause.
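As a hedged sketch, the Field mode equation can be evaluated directly. The function below only transcribes the formula a(RO + RP + n(PH + NF + CF)) + b(SNF + SCF) as printed; the parameter names follow the symbols in the text, while the byte sizes in the usage example are invented values chosen for illustration and are not Teradata defaults.

```python
def field_mode_space(a, b, n, RO, RP, PH, NF, CF, SNF, SCF):
    """Evaluate a(RO + RP + n(PH + NF + CF)) + b(SNF + SCF).

    All arguments are plain numbers; sizes are assumed to be in bytes.
    """
    per_row = RO + RP + n * (PH + NF + CF)
    sort_part = b * (SNF + SCF)
    return a * per_row + sort_part


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    estimate = field_mode_space(
        a=1000, b=2, n=5,
        RO=14, RP=2, PH=4, NF=8, CF=20,
        SNF=8, SCF=20,
    )
    print(f"Estimated space: {estimate} bytes")
```

Because the row count a multiplies the whole per-row term, the estimate grows almost linearly with the number of rows selected, while the ORDER BY term adds a comparatively small fixed amount.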
To calculate PERM space requirements, estimate the size of each database, the total table storage based on the sum of space estimates for all table sizes, and the space requirement for the application. The estimation from before should be deducted from the remaining space. The space required to accommodate the user DBC and maximum WAL log should be subtracted from the user DBC PERM space. Finally, the PERM space is determined by subtracting the number of cylinders comprising the default free space defined by the DBS Control record.
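Two of the calculations given earlier in this section, the three-step maximum WAL log size and the cylinder index space formula, are simple enough to sketch in code. The function names and the sample figures below are hypothetical; only the arithmetic comes from the text.

```python
def max_wal_log_bytes(max_row_length: int, total_rows: int) -> int:
    """Steps 1-3: maximum row length, times total rows, then doubled."""
    return 2 * (max_row_length * total_rows)


def cylinder_index_space_percent(cylinder_index_size: float, cylinder_size: float) -> float:
    """(2 x size of one cylinder index / size of one cylinder) x 100.

    Both sizes must be expressed in the same unit.
    """
    return (2 * cylinder_index_size / cylinder_size) * 100


if __name__ == "__main__":
    # Hypothetical workload: 1,000-byte maximum row, 50,000 rows touched.
    print("Maximum WAL log size:", max_wal_log_bytes(1000, 50_000), "bytes")
    # Hypothetical geometry: an 8-sector cylinder index in a 2,048-sector cylinder.
    print("Cylinder index space:", cylinder_index_space_percent(8, 2048), "percent")
```

The doubling in the WAL calculation is applied last, so an error in the row count estimate is doubled as well. It is safer to overestimate the number of rows updated or deleted than to underestimate it.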
7 Structured Query Language
7.1 Overview
7.1.1 SQL Statements
Structured Query Language (SQL) is the most commonly used language for relational database management systems. The language uses statements to define database objects, user access to those objects, and update data. SQL statements exist within three primary functional families:
• Data Definition Language (DDL)
• Data Control Language (DCL)
• Data Manipulation Language (DML)
Additional statements are provided by SQL which do not fall clearly into any of these families, such as HELP and SHOW.

Data Definition Language statements define the structure and instance of a database in the form of database objects. Database objects can be a database, user, table, view, trigger, index, macro, stored procedure, UDT, UDF, or UDM. The most commonly used basic DDL statements include:
• CREATE - used to define a new database object.
• DROP - removes a database object.
• ALTER - changes tables, columns, referential constraint, trigger, and indexes.
• ALTER PROCEDURE - recompiles an external stored procedure.
• MODIFY - changes a database or user definition.
• RENAME - changes the name of a table, view, trigger, stored procedure, or macro.
• REPLACE - used to replace a macro, trigger, stored procedure, and view.
• SET - specifies for a session a time zone, collation or character set.
• COLLECT - used to collect optimizer or QCD statistics on a column, group of columns, or index.
• DATABASE - specifies a default database.
• COMMENT - inserts or retrieves a text comment for a database object.

Data Control Language statements are used to grant and revoke access to database objects. Ownership of those objects can be changed from one user to the next. All DCL statement results are recorded in the Data Dictionary. The most commonly used basic DCL statements include:
• GRANT/REVOKE - controls a user’s privileges on an object.
• GRANT LOGON/REVOKE LOGON - controls the logon privileges to a client or host group.
• GIVE - gives a database object to another database object.

Data Manipulation Language statements are used to manipulate and process database values, as well as insert new rows in a table, update values in stored rows, and delete a row. Some of the most commonly used basic DML statements are:
• CHECKPOINT - defines a recovery point in the journal.
• ECHO - echoes a string or command to a client.
• BEGIN TRANSACTION - used to manage transactions.
• END TRANSACTION - used to manage transactions.
• COMMIT - used to manage transactions.
• ROLLBACK - used to manage transactions.
• ABORT - used to manage transactions.
• INSERT - inserts new rows into a table.
• UPDATE - modifies data in one or more rows.
• DELETE - removes row(s) from a table.
• MERGE - combines the UPDATE and INSERT statements into a single SQL statement.
• SELECT - returns specified row data in a result table.
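The transaction-management statements in the DML list can be illustrated with a hedged sketch. It uses Python's built-in sqlite3 module as a stand-in engine, so the statements shown are SQLite's BEGIN, COMMIT, and ROLLBACK rather than Teradata's BEGIN TRANSACTION and END TRANSACTION forms. The account table and the transfer rule are hypothetical.

```python
import sqlite3


def transfer(conn, source, target, amount):
    """Move `amount` between two accounts, rolling back on an overdraft."""
    conn.execute("BEGIN")
    conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, source))
    conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, target))
    (balance,) = conn.execute(
        "SELECT balance FROM account WHERE id = ?", (source,)
    ).fetchone()
    if balance < 0:
        conn.execute("ROLLBACK")  # undo both UPDATE statements
        return False
    conn.execute("COMMIT")        # make both UPDATE statements permanent
    return True


def transaction_demo():
    # isolation_level=None lets the code issue BEGIN/COMMIT/ROLLBACK itself.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO account VALUES (1, 100)")
    conn.execute("INSERT INTO account VALUES (2, 0)")
    results = [transfer(conn, 1, 2, 30), transfer(conn, 1, 2, 500)]
    balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
    conn.close()
    return results, balances


if __name__ == "__main__":
    print(transaction_demo())
```

The first transfer commits and the second is rolled back, so the balances reflect only the first one. The two UPDATE statements inside a transaction succeed or fail as a unit.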
The following is typically found in a SQL request:
• Statement keyword
• One or more column names
• Database name
• Table name
• Optional clauses related to keywords

An executable statement can be invoked in one of the following ways:
• From a terminal through interaction.
• From an application program with embedded code.
• From an embedded application through dynamic creation.
• From embedded code in a stored procedure or external stored procedure.
• From an SQL stored procedure through dynamic creation.
• Through a trigger.
• From embedded code in a macro.

Different parts of the SQL statement are identified or separated using punctuation, as follows:
• Period - separates database names from table names and table names from a particular column name.
• Comma - distinguishes column names in the select list or column names or parameters in an optional clause.
• Apostrophe - delimits boundaries of character string constants.
• Double quotation marks - identifies user names.
• Left and right parenthesis - groups expressions or defines limits of a phrase.
• Semicolon - separates statements in multi-statement requests and terminates requests submitted through some utilities.
• Colon - prefixes reference parameters or client system variables.
7.1.2 SELECT Statements
The most frequently used SQL statement is the SELECT statement. It is used to specify the table columns where the data is obtained, the corresponding database, and the table(s) referenced within the database. The SELECT statement will specify how the system will return a set of result data, in what format and what order. Embedded SQL and stored procedures will use the SELECT INTO statement.

The following options, lists, and clauses can be used with a SELECT statement:
• DISTINCT
• FROM
• WHERE
• GROUP BY
• HAVING
• QUALIFY
• ORDER BY
• WITH
• Query expressions
• Set operators (UNION, INTERSECT, and MINUS/EXCEPT)

Set operators can be used to manipulate the answers to two or more queries. This is done by combining the results of each query into a single result set. Set operators can be used within view definitions, derived tables, and subqueries.

SELECT statements are used to reference data in two or more tables. The referenced data is combined by a relational join. The SELECT statement can be used to specify both inner joins and outer joins. An inner join will select data from two or more tables to meet specific conditions. Each source is specifically named and the common relationship (join condition) can be in an ON or WHERE clause. An outer join is an extension of the inner join. It includes rows that qualify for a simple inner join and a specified set of rows which do not match the join condition.
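The difference between an inner join, an outer join, and a set operator can be shown in a small, hedged sketch. It runs on Python's built-in sqlite3 module as a stand-in engine, not Teradata Database, and the dept and emp tables are hypothetical.

```python
import sqlite3


def join_demo():
    """Return the results of an inner join, a left outer join, and a UNION."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dept (dept_no INTEGER, dept_name TEXT);
        CREATE TABLE emp (emp_name TEXT, dept_no INTEGER);
        INSERT INTO dept VALUES (10, 'Sales');
        INSERT INTO dept VALUES (20, 'Support');
        INSERT INTO emp VALUES ('Ann', 10);
        INSERT INTO emp VALUES ('Bob', NULL);
        """
    )
    inner = conn.execute(
        "SELECT e.emp_name, d.dept_name FROM emp e "
        "INNER JOIN dept d ON e.dept_no = d.dept_no ORDER BY e.emp_name"
    ).fetchall()
    # The outer join keeps employees that have no matching department.
    outer = conn.execute(
        "SELECT e.emp_name, d.dept_name FROM emp e "
        "LEFT OUTER JOIN dept d ON e.dept_no = d.dept_no ORDER BY e.emp_name"
    ).fetchall()
    # UNION combines two query results into one set without duplicates.
    union = conn.execute(
        "SELECT dept_no FROM dept UNION "
        "SELECT dept_no FROM emp WHERE dept_no IS NOT NULL ORDER BY dept_no"
    ).fetchall()
    conn.close()
    return inner, outer, union


if __name__ == "__main__":
    for label, rows in zip(("inner", "outer", "union"), join_demo()):
        print(label, rows)
```

The employee with no department appears only in the outer join result, with a null in place of the department name. This is the "specified set of rows which do not match the join condition" that the text describes.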
7.1.3 SQL Data Types
The SQL data types supported by Teradata Database fall into three groups: Teradata Database data types, ANSI-compliant data types, and user-defined types (distinct and structured). Between them, these groups cover byte, character, graphic, numeric, DateTime, interval, period, geospatial, and large object (LOB) data.

Data types are communicated through phrases. A data type phrase will determine how data is stored and how data is presented. Teradata SQL can be used to define attributes of a data value, to control the internal representation of the stored data (import format) and the presentation of data in a column or expression result (export format). Data type attributes include:
• NOT NULL
• UPPERCASE
• CASESPECIFIC
• NOT CASESPECIFIC
• FORMAT string_literal
• TITLE string_literal
• NAMED name
• DEFAULT value
• DEFAULT USER
• DEFAULT DATE
• DEFAULT TIME
• DEFAULT NULL
• WITH DEFAULT
• CHARACTER SET
7.1.4 Recursive Query
A recursive query is a named query expression which references itself in its definition. This allows a way to search a table using iterative self-join and set operations. This feature can reduce complexity in querying and allows the efficient execution of a class of queries. Use the WITH RECURSIVE clause in the statement to implement a recursive query and the RECURSIVE clause in the CREATE VIEW statement.

7.1.5 SQL Functions
Standard functions can be:
• Scalar
• Aggregate
• Ordered analytical

A scalar function will create a result based on input parameters. The function is invoked as required whenever a stated expression is evaluated for an SQL statement. Upon completion, the function's result is used by the expression referencing the function.

Aggregate functions will produce a result from a set of relational data. This data is typically grouped together using a GROUP BY or ORDER BY clause. Each set of data is processed independently and only one result is produced for each set. The following are aggregate functions:
• AVG - arithmetic average of values in a column.
• COUNT - number of qualified rows.
• MAX - maximum column value.
• MIN - minimum column value.
• SUM - arithmetic sum of values in a column.

Ordered analytical functions perform over a range of data for a particular set of rows in some order. The function is called for each item in a set and produces a result for each detail item. These types of functions allow sophisticated data mining to be performed on the information in the database. The following are ordered analytical functions:
• AVG - arithmetic average of all values for each row in the group.
• RANK - ordered ranking of rows based on the value of the column.
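A recursive query and the aggregate functions above can be tried on a small reporting hierarchy. This is a sketch under assumptions, not Teradata output: SQLite (via Python's sqlite3 module) stands in for the database, and the employee table and its rows are hypothetical. Ordered analytical functions are not shown.

```python
import sqlite3

# SQLite stands in for Teradata here; the table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER, name TEXT, manager_id INTEGER,
                       dept TEXT, salary INTEGER);
INSERT INTO employee VALUES
    (1, 'Ann', NULL, 'Exec', 90000),
    (2, 'Bob', 1, 'Sales', 50000),
    (3, 'Cy', 2, 'Sales', 40000),
    (4, 'Di', 1, 'IT', 60000);
""")

# Recursive query: the named expression "reports" references itself.
# The first SELECT is the seed; the second joins employee back to reports.
chain = conn.execute("""
    WITH RECURSIVE reports (emp_id, name, depth) AS (
        SELECT emp_id, name, 0 FROM employee WHERE manager_id IS NULL
        UNION ALL
        SELECT e.emp_id, e.name, r.depth + 1
        FROM employee AS e
        INNER JOIN reports AS r ON e.manager_id = r.emp_id
    )
    SELECT name, depth FROM reports ORDER BY depth, emp_id
""").fetchall()

# Aggregate functions: one result row is produced for each GROUP BY set.
totals = conn.execute("""
    SELECT dept, COUNT(*), SUM(salary), MIN(salary), MAX(salary), AVG(salary)
    FROM employee
    GROUP BY dept
    ORDER BY dept
""").fetchall()

print(chain)
print(totals)
```

The depth column counts how many self-join iterations were needed to reach each employee from the top of the hierarchy.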
7.1.6 Cursors
A cursor is a pointer used by the application program to move through a result table. It is declared for a SELECT request and a named cursor is opened. Rows can be individually fetched and written to host variables using the FETCH…INTO… statement. These host variables are used by the application to perform computations.

Cursors are used by Teradata Preprocessor to mark or tag the first row accessed by an SQL query, executing the SQL request, then incrementing the cursor as needed. SQL stored procedures use cursors to fetch one result row at a time and execute SQL and SQL control statements as required. Result set cursors are used to return the result of a SELECT statement executed in the stored procedure to the caller.

7.1.7 Session Modes
Two session modes are supported by Teradata Database:
• ANSI
• Teradata

While the ANSI session mode semantics conform fully with the ANSI SQL:2008 standard, the Teradata session mode has several defaults that differ from ANSI semantics.

7.1.8 SQL Applications
The following APIs are used by client applications to communicate with Teradata Database:
• .NET Data Provider
• Java Database Connectivity (JDBC)
• Open Database Connectivity (ODBC)

When SQL requests are inserted into an application program, the requests are considered embedded into the application. An embedded SQL contains extensions to executable SQL to permit declarations. These declarations include:
• Code to encapsulate the SQL from the application language.
• Cursor definition and manipulation.
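The cursor pattern described in section 7.1.6 (open a cursor on a SELECT, fetch one row at a time into program variables, compute with them) can be sketched from a client program. The example assumes Python's sqlite3 module as a stand-in for an embedded SQL application; the orders table and the variable names are hypothetical, and fetchone() plays the role that FETCH…INTO… plays in embedded SQL.

```python
import sqlite3

# SQLite stands in for Teradata here; the orders table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, amount INTEGER);
INSERT INTO orders VALUES (1, 100), (2, 250), (3, 75);
""")

# Open a cursor for a SELECT request.
cursor = conn.cursor()
cursor.execute("SELECT order_id, amount FROM orders ORDER BY order_id")

total = 0
fetched_ids = []
while True:
    row = cursor.fetchone()      # advance the cursor by one row
    if row is None:              # no more rows in the result table
        break
    order_id, amount = row       # comparable to FETCH ... INTO host variables
    fetched_ids.append(order_id)
    total += amount              # the application computes with the values

cursor.close()
print(fetched_ids, total)
```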
Any compiled client application language will support embedded SQL. However, SQL is not defined for any of these languages, so this requires the embedded SQL code to be precompiled for the purpose of translating the SQL into native code. The precompiler tool is Preprocessor2 (PP2) and it will:
• Read application code for defined SQL code fragments.
• Interpret the code's intent after isolating SQL code and translating it into Call-Level Interface (CLI) calls.
• Comment out the SQL source.

The precompiler will produce an output as native source code with CLI calls substituted for the SQL source. This converted source code can be processed using the native language compiler of the application. The PP2 will support the application development languages of C, COBOL, and PL/I. SQL applications can also come in the form of macros and SQL stored procedures.

7.1.9 EXPLAIN Request Modifier
The EXPLAIN request modifier is provided by Teradata SQL to view the execution plan of a query. The modifier will precede any SQL request. With this modifier, an explanation is provided describing how a request will be processed, the estimated number of rows involved, and the performance impact of the request. To display the execution plan, the request is parsed and optimized, the access and join plans generated by the Optimizer are returned as a text file, and the relative cost needed to complete the request is presented. The execution plan for the request is displayed but the request is not submitted for execution. Using EXPLAIN allows complex queries to be evaluated and alternative processing strategies developed. Detailed Optimizer information, such as cost estimates for INSERT, UPDATE, UPSERT, MERGE, and DELETE, is supported in the EXPLAIN request modifier.

7.1.10 Third-Party Development
Third-party software products are supported by Teradata Database, largely consisting of two general categories: transparency series and native interface products. The Transparency Series/Application Program Interface (TS/API) provides a gateway between the Teradata Database and IBM DB2. It allows SQL statements formulated for DB2 to be translated into Teradata SQL. With this translation, DB2 applications are permitted to access data stored in Teradata Database.
The Workload Management API contains interfaces to PM/APIs and open APIs. With these interfaces, the following actions can be taken:
• Monitoring system and session-level activities.
• Managing task priorities.
• Managing Teradata Active System Management (ASM) rules.
• Tracking system usage.
• Updating components stored in the customer TDWM database.

PM/APIs will provide access to PMPC routines found in Teradata Database. A specialized PM/API subset of CLIv2 is used by the MONITOR logon partition to access the PMPC subsystem. With PM/API, the following is possible:
• Access to data by CLIv2 request not usually accessible by resource usage.
• Data retrieval of raw data in an in-memory buffer for real-time analysis or importing into custom reports.
• Frequent in-process performance analysis allowed on CLIv2 data in near real time.

Open APIs provide an SQL interface to System PMPC using user-defined functions and external stored procedures, such as:
• AbortSessions function
• TDWMRuleControl function
• GetQueryBandValue procedure

7.2 Database Objects
Multiple objects are used to create a database. Some general categories are:
• Databases
• Users
• Tables
• Columns
• Data Types
• Keys
• Indexes
• Referential Integrity
• Views
• Triggers
• Macros
• Stored Procedures
• External Stored Procedures
• User-Defined Functions
• Profiles
• Roles
• User-Defined Types

7.2.1 Databases and Users
Databases are a collection of related tables, views, triggers, indexes, stored procedures, user-defined functions, and macros. Databases are created using the CREATE DATABASE statement. The name of the database, amount of storage to allot, and other attributes are specified during the creation process. A database has no password and does not need to log on to the system.

Users are similar to databases. Users are given space by the database to create and maintain objects, or other users or databases. The difference between databases and users is the existence of a password for the user and their need to log on to the system. Users are created using the CREATE USER statement. The user name and password are specified during the creation process.

7.2.2 Tables
A table defines relationships, where columns identify attributes and rows represent tuples. The first row in every relational table consists of column headings, or column names. Each row following the first row consists of unique data values. The number of tuples is referred to as the cardinality of the table and the number of columns refers to the degree or arity of the
table. Base tables are defined using the CREATE TABLE statement. When using the statement, the table name, one or more column names, and column attributes are specified. When a table is defined so is the primary index, and one or more secondary indexes. Indexes are used to store and access rows within a table. When creating a table, a person can also specify the data block size, percent free space, and other physical attributes of the table. If a table is defined and none of the following are specified, the default action by the system is to create the table using the first column as a non-unique primary index:
• PRIMARY INDEX clause
• NO PRIMARY INDEX clause
• PRIMARY KEY constraint
• UNIQUE constraint

Ideally, each row in a table has unique values not shared with any other row in the same table. Though duplicate rows are prohibited in relational tables based on set theory, by the ANSI standard, SQL definition is based on bags, or multisets. A bag or multiset is an unordered group of elements which may be repeated. Therefore if a table prohibits duplicate rows, it is called a SET table. If the table allows duplicate rows, it is called a MULTISET table.

Permanent journal tables can be created through options in the CREATE/MODIFY USER and CREATE/MODIFY DATABASE statements.

Queue tables are like ordinary base tables, but work like an asynchronous first-in-first-out (FIFO) queue. When creating queue tables, a TIMESTAMP column must be defined with a default value of CURRENT_TIMESTAMP. This column is used to identify when a row is inserted into the table. The SELECT statement can be used to view the contents of the queue table. The SELECT AND CONSUME statement can be used to return data from the row with the oldest timestamp, deleting that row after returning data. If no rows reside in the queue table, the transaction enters a delay state until a row is inserted into the queue table or the transaction aborts.

Error logging tables are associated with a permanent base table to log information when:
• Errors occur during SQL INSERT...SELECT operations.
• Errors occur during SQL MERGE operations.

If the error logging facilities can handle the error generated, the request will be completed. If referential integrity (RI) or unique secondary index (USI) violations are detected, then the operation is aborted and rolled back. If the request is completed, the error logging table will contain an error row for each error generated, and a marker row is used to determine the
number of errors generated in a single request. If the marker row exists, the request completed successfully, therefore no marker row indicates the request failed. If the error limit specified by the LOGGING ERRORS option is reached or the error logging facilities cannot handle the generated error, the operation is aborted and rolled back. Within the error logging table, an error row is inserted representing each error that occurred.

No Primary Index (NoPI) tables are used to improve performance when bulk loading data using FastLoad or SQL sessions. The NoPI table acts as a staging table for loading data. The system will store rows on any desired AMP, appending them at the end of the table. Normally staging tables with primary indexes will perform some data redistribution. The NoPI tables have no data redistribution, thus improving performance when an application loads data into a staging table, transforms or standardizes the data, and stores the converted data into another staging table. NoPI tables can also act as log files or sandbox tables for storing data until an indexing method can be defined.

Temporal tables store and maintain information related to time to allow the database to perform operations and queries using time-based reasoning. Temporal tables include one or two special columns to store the time-based information:
• Valid-time - records and maintains the time period the information is valid, called the period of validity (PV).
• Transaction time - records and maintains the time period the database is aware of the information in the row.

Tables with both a valid-time and transaction-time column are considered bitemporal tables. The two columns are independent time dimensions and serve different purposes.

The database will create new rows automatically when transaction-time rows in temporal tables are changed in order to maintain time dimensions. For instance, when a row is changed, the database now has two rows to manage. The original row is "closed": its end transaction time becomes the modification time, and it acts as a history row to represent the information contained before the modification. The new row represents the changed information and has the modification time as its begin transaction time.

When valid-time rows are changed, the time period for which the modification is valid can be specified, which is called the period of applicability (PA). The relationship between the PA and PV will determine the number of rows created as a result of the modification. For instance, if the PA is within the boundaries of the PV, three rows are created: one row with the original information representing the beginning of the PV to the beginning of the PA, a second row representing the modified information within the PA, and a third row with the original information representing the time period from the end of the PA to the end of the PV. If the PA overlaps the PV, only two rows are created.
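The way the PA and PV determine the number of resulting rows can be worked through in a few lines. The function below is a hypothetical model written for this guide, not a Teradata routine: it assumes periods are (begin, end) pairs of dates with the end date excluded, and it returns the rows that would exist after a valid-time change. The cases where the PA covers the whole PV, or misses it entirely, are logical extensions not described in the text above.

```python
from datetime import date


def apply_valid_time_update(pv, pa, old_value, new_value):
    """Model the rows left after changing a valid-time row.

    pv and pa are (begin, end) pairs; the end date is treated as excluded.
    This is an illustrative model, not a Teradata API.
    """
    pv_begin, pv_end = pv
    pa_begin, pa_end = pa

    # The part of the PA that actually falls inside the PV.
    begin = max(pv_begin, pa_begin)
    end = min(pv_end, pa_end)
    if begin >= end:
        # The PA does not touch the PV: the row is left unchanged.
        return [(pv_begin, pv_end, old_value)]

    rows = []
    if pv_begin < begin:
        # Original information from the start of the PV to the start of the PA.
        rows.append((pv_begin, begin, old_value))
    # Modified information for the applicable part of the period.
    rows.append((begin, end, new_value))
    if end < pv_end:
        # Original information from the end of the PA to the end of the PV.
        rows.append((end, pv_end, old_value))
    return rows


pv = (date(2010, 1, 1), date(2011, 1, 1))

# PA inside the PV: three rows.
inside = apply_valid_time_update(
    pv, (date(2010, 4, 1), date(2010, 7, 1)), "old", "new")

# PA overlapping the end of the PV: two rows.
overlap = apply_valid_time_update(
    pv, (date(2010, 7, 1), date(2012, 1, 1)), "old", "new")

print(len(inside), len(overlap))
```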
Temporary tables are used to temporarily store data. There are three types of temporary tables:
• Global temporary - consists of a persistent table definition stored in the data dictionary and used to store intermediate results from multiple queries into working tables used by applications; retained for the session duration.
• Volatile - retained only for the length of the session, but do not have a persistent definition in the data dictionary and must be created by the session.
• Global temporary trace - used to debug external routines; they have persistent definitions but will not retain data across sessions.

Global temporary tables allow table templates to be defined in the database schema. The base definition of the table is created using the CREATE TABLE statement with the GLOBAL TEMPORARY keyword used to describe the table type. When created, the table has no rows and no physical instantiation, existing only as a definition. The tables are only created when they are needed. When an application accesses a table by the same name as the base table and the base table has not already materialized, the new table is materialized using the stored definition. The table is started each time a session requiring it starts and continues until the end of the session. The events that cause an empty global temporary table to materialize are a CREATE INDEX statement and COLLECT STATISTICS statement on the table; if an INSERT statement is used, the table is populated immediately upon being materialized.

Users must have the appropriate privileges on the base table or database (or user) containing the base table in order to materialize a new global temporary table. The space used by the global temporary table is charged to the login user temporary space. No access logging is performed on a materialized global temporary table. A CREATE TABLE statement can be used to manage transient journaling options on the global temporary table definitions. The ALTER TABLE statement can be used to modify transient journaling and ON COMMIT options for the base global temporary table.

Volatile tables are created using the CREATE TABLE statement and the VOLATILE keyword. The definition and contents of the volatile table are not persistent. They are created in the login user space. The database name for the table is the login user's name. No privileges are required to create volatile tables, and no access logging is performed on these tables. Up to 1000 volatile tables can be created for each user session at any given time.
Volatile tables do not permit the following CREATE TABLE options:
• Permanent journaling
• Referential integrity constraints
• Check constraints
• Compressed columns
• DEFAULT clause
• TITLE clause
• Named indexes

Volatile tables are private to each session. If a user logs on to multiple sessions, a volatile table can be created with the same name in each session. However, when a volatile table is created, the name must be unique among all global temporary and permanent table names in the database that has the name of the login user.

7.2.3 Columns
Columns are a structural component of a table. They have a name and a declared type. Exactly one value is found in each column of each row in the table. The value is a value in the declared type of the column and can include nulls. The elements of the table column are defined in the column definition clause of the CREATE TABLE statement. A name and data type must be defined for each column in the table. One or more optional attribute definitions can be added to define the column further, including:
• Data type attribute declaration (NOT NULL, FORMAT, TITLE, and CHARACTER SET).
• Column storage attributes clause (COMPRESS).
• Default value control clauses (DEFAULT, WITH DEFAULT).
• Column constraint attributes clauses (PRIMARY KEY, UNIQUE, REFERENCES, and CHECK).
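The default value and constraint clauses in a column definition can be exercised with a short script. This is a sketch under assumptions: SQLite (via Python's sqlite3 module) stands in for Teradata Database, so Teradata-only attributes such as FORMAT, TITLE, and COMPRESS are left out, and the tables and the is_rejected helper are hypothetical.

```python
import sqlite3

# SQLite stands in for Teradata here; tables and helper names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks REFERENCES only when enabled
conn.executescript("""
CREATE TABLE department (
    dept_id   INTEGER PRIMARY KEY,
    dept_name TEXT NOT NULL UNIQUE
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,                     -- column constraint
    name    TEXT NOT NULL,                           -- data type attribute
    status  TEXT DEFAULT 'active',                   -- default value control
    salary  INTEGER CHECK (salary > 0),              -- column constraint
    dept_id INTEGER REFERENCES department (dept_id)  -- column constraint
);
INSERT INTO department VALUES (10, 'Sales');
INSERT INTO employee (emp_id, name, salary, dept_id) VALUES (1, 'Ann', 50000, 10);
""")

# The DEFAULT clause supplied the status value that the INSERT left out.
default_status = conn.execute(
    "SELECT status FROM employee WHERE emp_id = 1").fetchone()[0]


def is_rejected(sql):
    """Return True when the database refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


insert = "INSERT INTO employee (emp_id, name, salary, dept_id) VALUES "
violations = {
    "NOT NULL": is_rejected(insert + "(2, NULL, 1000, 10)"),
    "CHECK": is_rejected(insert + "(3, 'Bob', -5, 10)"),
    "REFERENCES": is_rejected(insert + "(4, 'Cy', 1000, 99)"),
    "PRIMARY KEY": is_rejected(insert + "(1, 'Di', 1000, 10)"),
}
print(default_status, violations)
```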
Some columns are dynamically generated by the database, such as:
• Identity - specified with the GENERATED ALWAYS AS IDENTITY or GENERATED BY DEFAULT AS IDENTITY option in the table definition.
• Object Identifier (OID) - used when table has LOB columns to store pointers to subtables containing LOB data.
• PARTITION - used when table is defined with a partitioned primary index to provide the partition number.
• PARTITION#L1 through PARTITION#L15 - used when the table is defined with a multilevel PPI to provide the corresponding partition level and partition number.
• ROWID - uniquely identifies the row.

7.2.4 Data Types
Every data value is part of an SQL data type, which is specified for each column when creating a table. The values found in each column belong to one of the following data types:
• Numeric
• Character
• DateTime
• Interval
• Period
• Byte
• UDT
• Geospatial

Numeric data types allow numeric values, either as an exact number (integer or decimal) or an approximate number (floating point). Numeric values are specified using the following SQL data types:
• BIGINT - a signed, binary integer value from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.
• INTEGER - a signed, binary integer value from -2,147,483,648 to 2,147,483,647.
• SMALLINT - a signed, binary integer value from -32,768 to 32,767.
• BYTEINT - a signed, binary integer value from -128 to 127.
• DECIMAL or NUMERIC (n,m) - a decimal number of n digits, with m of those digits to the right of the decimal point.
• REAL, DOUBLE PRECISION, or FLOAT - a value in sign/magnitude form.

Characters of a given character set are represented by character data types. The SQL data types for character data are:
• CHAR[(n)] - fixed length character string for internal character storage.
• VARCHAR(n) - variable length character string of length n for internal character storage.
• LONG VARCHAR - the longest permissible variable length character string.
• CLOB - a large character string, for storing a character large object (CLOB) such as simple text, HTML, or XML documents.

DateTime data types represent dates, times, and timestamps. The SQL data types used to specify DateTime values are:
• DATE - date value including components for year, month, and day.
• TIME [WITH TIME ZONE] - time value including components for hour, minute, second, fractional second, and optional time zone.
• TIMESTAMP [WITH TIME ZONE] - timestamp value including components for year, month, day, hour, minute, second, fractional second, and optional time zone.

Interval data types represent a time period and consist of two mutually exclusive interval type categories: Year-Month and Day-Time. The different SQL data types for each category are:
• YEAR-MONTH
o INTERVAL YEAR
o INTERVAL YEAR TO MONTH
o INTERVAL MONTH
• DAY-TIME
o INTERVAL DAY
o INTERVAL DAY TO HOUR
o INTERVAL DAY TO MINUTE
o INTERVAL DAY TO SECOND
o INTERVAL HOUR
o INTERVAL HOUR TO MINUTE
o INTERVAL HOUR TO SECOND
o INTERVAL MINUTE
o INTERVAL MINUTE TO SECOND
o INTERVAL SECOND

Different from interval data types, which represent a span of time, a period data type represents a set of contiguous time granules from the beginning boundary up to but not including the ending boundary. The SQL data types for the period category are:
• PERIOD(DATE) - an anchored duration of DATE elements including components for year, month, and day.
• PERIOD(TIME[(n)] [WITH TIME ZONE]) - an anchored duration of TIME elements including components for hour, minute, second, fractional second, and optional time zone.
• PERIOD(TIMESTAMP[(n)] [WITH TIME ZONE]) - an anchored duration of TIMESTAMP elements including components for year, month, day, hour, minute, second, fractional second, and optional time zone.

Raw data can be stored as logical bit streams when using byte data types. This type of data is transmitted from the memory of the client system. The SQL data types are:
• BYTE - a fixed-length binary string.
• VARBYTE - a variable-length binary string.
• BLOB - a large binary string of raw bytes, representing a binary large object (BLOB) such as graphics, video clips, files, and documents.

BYTE and VARBYTE are extensions of Teradata to the ANSI SQL:2008 standard, while BLOB is ANSI SQL:2008 compliant.
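The rule that a period runs from its beginning boundary up to but not including its ending boundary has practical consequences that a small model makes concrete. The Period class below is a hypothetical Python illustration of a PERIOD(DATE) value, not a Teradata type; the method names contains, overlaps, and meets are chosen here for readability.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Period:
    """Hypothetical model of PERIOD(DATE): begin is included, end is excluded."""
    begin: date
    end: date

    def __post_init__(self):
        if self.begin >= self.end:
            raise ValueError("a period must begin before it ends")

    def contains(self, day):
        # The ending boundary itself is not part of the period.
        return self.begin <= day < self.end

    def overlaps(self, other):
        # Two periods overlap only if they share at least one granule.
        return self.begin < other.end and other.begin < self.end

    def meets(self, other):
        # Adjacent periods: this one ends exactly where the other begins.
        return self.end == other.begin


q1 = Period(date(2024, 1, 1), date(2024, 4, 1))
q2 = Period(date(2024, 4, 1), date(2024, 7, 1))

print(q1.contains(date(2024, 3, 31)))  # last day inside q1
print(q1.contains(date(2024, 4, 1)))   # ending boundary, not inside q1
print(q1.overlaps(q2), q1.meets(q2))
```

Because the ending boundary is excluded, two consecutive quarters share a boundary date without overlapping.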
Data types can be custom defined to support the structural and behavioral model required by the database. Custom data types are referred to as UDT data types and can be either distinct or structured. Distinct UDTs are based on a single predefined data type, while structured UDTs are a collection of one or more fields called attributes representing predefined data types or other UDTs.

For applications to manage, analyze, and display geographic information, geospatial data types are used. The following data types are used by Teradata Database:
• ST_Geometry - Teradata proprietary internal UDT representing any of the following:
o ST_Point - 0-dimensional geometry representing a single location in 2-dimensional space.
o ST_LineString - 1-dimensional geometry stored as a sequence of points with linear interpolation between points.
o ST_Polygon - 2-dimensional geometry consisting of one exterior boundary and zero or more interior boundaries, each defining a hole.
o ST_GeomCollection - collection of zero or more ST_Geometry values.
o ST_MultiPoint - 0-dimensional geometry collection where elements are restricted to ST_Point values.
o ST_MultiLineString - 1-dimensional geometry collection where elements are restricted to ST_LineString values.
o ST_MultiPolygon - 2-dimensional geometry collection where elements are restricted to ST_Polygon values.
o GeoSequence - extension of ST_LineString to contain tracking information.
• MBR - Teradata proprietary internal UDT for obtaining the minimum bounding rectangle (MBR) for tessellation purposes.
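The idea behind a minimum bounding rectangle is simple enough to compute by hand. The function below is a plain Python sketch of the geometric concept only; it does not reproduce the internal representation of the Teradata MBR type, and the function name and the (xmin, ymin, xmax, ymax) tuple layout are assumptions made for illustration.

```python
def minimum_bounding_rectangle(points):
    """Return (xmin, ymin, xmax, ymax) for a non-empty list of (x, y) points.

    Illustrates the geometric idea only; this is not the Teradata MBR type.
    """
    if not points:
        raise ValueError("at least one point is required")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # The smallest axis-aligned rectangle that encloses every point.
    return (min(xs), min(ys), max(xs), max(ys))


# The vertices of a line string; its rectangle spans x 1..4 and y 2..7.
line_string = [(1, 5), (4, 2), (3, 7)]
mbr = minimum_bounding_rectangle(line_string)
print(mbr)
```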
7. a concept where relationships between tables must remain consistent. connect change.2.7. or a pointer to all possible locations of rows with the same data field value. Which is a file comprised of rows having a data field in the reference table and a pointer to the location of the row in the base table. A column or group of columns which is the primary key in two or more tables in the same database is called a foreign key. The stronger the selectivity. When a user makes a request or query. Row hash values are 117 . Indexes can be created to pull information from full tables on common types of queries.7 Indexes When a user accesses a full table for a small amount of data. a user can create other data types and use them just like a predefined data type. the values defining a primary key must be unique. a query may have to look through a very large number of rows. the values used to link related tables together. a 32-bit row hash value is stored with the row. 7. the performance and processor requirements to operate are much lower. Table rows in Teradata Database are self-indexing based on their primary index.6 Keys A column or group of columns which uniquely identify each row in the table is called a primary key. Each time a row is inserted into a table. Indexes do not support unplanned. A distinct UDT is based on a single predefined data type. if the index is unique. Since the index is relatively smaller than the full table. An index has weak selectivity if many rows are retrieved and has strong selectivity if few rows are retrieved. if the index is non-unique. the index is accessed instead of the full table. ad hoc queries. Several weakly selective non-unique secondary indexes can be combined through bit mapping to create a single strongly selective index. two types are supported: Distinct and Structured.2. and cannot be null. each defined as a predefined data type. 
This operation can consume a large amount of processor usage and reduce the performance of the entire database.5 User-Defined Types In addition to a set of predefined data types provided through SQL. the more useful the index is. Teradata Database indexes are based on row hash values instead of raw table column values. Relational databases utilize a classic index. These are called User-Defined Types (UDTs). Primary and foreign keys are used to ensure referential integrity. A structured UDT is a set of one or more attributes.2. Another form of structured UDT is the dynamic UDT which is associated with external UDFs to provide up to eight input parameters.
not unique, therefore a unique 32-bit numeric value is generated and appended to the row hash value to create a unique RowID. If the table has a partitioned primary index defined to it, the RowID includes the combined partition numbers for each level of the PPI.

The different types of indexes used in Teradata Database are:
• Primary index: unique or non-unique and partitioned or non-partitioned.
• Secondary index: unique or non-unique, hash-ordered or value-ordered.
• Join index: simple or aggregate, single-table or multitable, complete or sparse.
• Hash index

Any primary or secondary index can be defined unique, except partitioned primary indexes if one or more partitioning columns do not exist in the primary index. The CREATE TABLE statement is used to define a primary index and one or more secondary indexes. Secondary indexes can also be defined using the CREATE INDEX statement. Hash indexes and join indexes are created using the CREATE HASH INDEX or the CREATE JOIN INDEX statements respectively.

A hashing algorithm is used to distribute rows across the AMPs. The algorithm computes a 32-bit row hash value based on the primary index. Hash buckets are ordered sections of data, each associated with a different piece of data, and are used to sort and look up the data. The hash bucket size may be 16 bits or 20 bits. Older systems will generally use 16-bit hash bucket sizes, allowing 65,536 hash buckets. Generally, new systems will have a 20-bit hash bucket size, allowing a total number of 1,048,576 hash buckets. Hash buckets are distributed evenly across AMPs on a system. Where the hash buckets are located is maintained in a hash map.

7.2.8 Primary Index
Data is distributed and retrieved for particular tables across AMPs using a primary index. A partitioned primary index (PPI) will partition the data on each AMP based on some set of columns, ordered using a hash of the primary index columns. The properties of the primary index consist of:
• The CREATE TABLE data definition statement defines the primary index.
• The ALTER TABLE data definition statement modifies the primary index.
• If a primary index is not explicitly defined, the CREATE TABLE statement will automatically assign one.
• A primary index can have up to 64 columns.
• A maximum of one primary index is defined for each table.
• A primary index can be unique or non-unique.
• A primary index is defined as non-unique unless it is explicitly defined as unique or is specified for a single column SET table.
• A primary index can be partitioned or non-partitioned.
• The hashing algorithm controls data distribution and retrieval.
• Improves performance when performing single-AMP retrievals, joins between tables with identical primary indexes, or partition elimination.
• Single-level PPIs may only use the following general forms of partitioning expressions: INTEGER or casting to INTEGER, CASE_N and RANGE_N functions.
• Multilevel PPI may only specify CASE_N and RANGE_N functions in their partitioning expressions.

Primary index operations must provide a full primary index value, and retrieval on a single value is always a one-AMP operation. If rows have the same primary index value, they are distributed to the same AMP.

Most tables will require a primary index. Optimal access and uniform distribution of data are the two most important factors in choosing a primary index. When considering optimal data access, the primary index should be on the most frequently used access path. When considering uniform data distribution, the primary index values should be as distinct as possible, and efficient parallel processing is possible through even distribution of table rows across AMPs. The columns used as the primary index have several restrictions:
• Column cannot contain a CLOB, BLOB, UDT, ST_Geometry, MBR, or Period data type.
• Member columns of the primary index column set or partitioning columns cannot be compressed.

7.2.9 Secondary Index
Secondary indexes are created explicitly with the CREATE TABLE and CREATE INDEX statements. Unique secondary indexes are implicitly created when a CREATE TABLE statement specifies a primary index, on each column set specified using PRIMARY KEY or UNIQUE constraints. They are never required by tables, but will often improve performance in the system. When a secondary index is created, a separate internal subtable is built to contain
the index rows, which is updated whenever a table row is inserted, deleted, or updated. If a table is defined with FALLBACK, the secondary index subtables are duplicated. Secondary indexes can be unique or non-unique.

Unique secondary indexes (USI) are built using the following process:
• Each AMP accesses its subset of the base table rows.
• The secondary index value is copied and appended to the RowID of the base table row.
• A Row Hash is created on the secondary index value.
• All three values, RowID, Row Hash, and secondary index value, are placed onto the BYNET.
• The data is received by an appropriate AMP, which creates a row in the index subtable.
• If the row already exists in the index subtable, an error is reported.

Non-unique secondary indexes (NUSI) are built using the following process:
• Each AMP accesses its subset of the base table rows.
• A spool file is built containing each secondary index value followed by the RowID.
• For hash-ordered NUSI, the RowIDs are sorted in ascending order. For value-ordered NUSI, the rows are sorted by NUSI value order.
• For hash-ordered NUSI, a row hash value is created for each secondary index value and a row created in the index subtable. For value-ordered NUSI, the NUSI value is used to determine storage.

Non-unique secondary indexes (NUSIs) are specified as hash-ordered or value-ordered.
• Only a single numeric or DATE column of four or less bytes can be used for value-ordered NUSI.
• Only a single column can be specified for the hash ordering in its covering index.

Secondary index properties include:
• Speed of data retrieval is enhanced.
• Data distribution is not affected.
• A maximum of 32 secondary indexes defined for each table.
• Composed of up to 64 columns.
Further properties and guidelines for secondary indexes:
• Can be unique or non-unique.
• Non-unique secondary indexes can be hash-ordered or value-ordered.
• Can be created or released dynamically as data usage changes.
• Unique secondary indexes are guaranteed to have a unique index value. When the index is used to access data, it involves a two-AMP operation.
• When locating rows with a specific value in the index, an NUSI is the best option. Access using the index involves an all-AMP operation.
• Selection of a small number of rows is highly efficient.
• Secondary index usage can also involve covering columns or partition elimination.
• Additional disk space is required to store subtables.
• Additional I/Os are required on inserts and deletes.
• Should not be defined on columns with frequently changing values.
• Columns which do not enhance selectivity should not be included.
• Composite secondary indexes should not be used when multiple single column indexes and bit mapping can be used.
• Composite secondary indexes are not used by the Optimizer unless explicit values exist for each column.
• Secondary indexes cannot be partitioned, but can be defined on a table with a partitioned primary index.

7.2.10 Join Index

Join indexes are file structures for permitting queries to be resolved by accessing an index rather than its base table. They are used for join queries which are performed with a high frequency. A join index can be defined on one or more tables. To create a join index, the CREATE JOIN INDEX statement is used. Join indexes can be used to define a prejoin table on frequently joined columns, create a full or partial replication of a base table with a primary index, or define a summary table.

Multitable join indexes will store and maintain joined rows of two or more tables and can aggregate selected columns. They are highly useful where the index structure contains all the columns referenced by one or more joins, which allows the multitable join index to cover part of the query.

Single-table join indexes are used to resolve joins on large tables without redistributing the joined rows across the AMPs. These types of join indexes will hash a frequently joined subset of base table columns to the same AMP. As a result, BYNET traffic is eliminated.

Aggregate join indexes are a cost-effective, highly efficient method of resolving queries that frequently specify aggregate operations on the same column or set of columns. Aggregate join indexes can be defined on two or more tables, or on a single table, and include a summary table containing a subset of columns from the base table and additional columns for the aggregate summaries of the base table columns. As a result, aggregate calculations are not required for every query.

Any join index can be sparse, whether it is single-table or multitable, simple or aggregate. This “sparse” index is created using a constant expression to filter rows. The number of rows used in a join index can be limited to only those rows referenced when a small, well known subset of them is referenced in a frequently run query.

7.2.11 Hash Index

Single-table join indexes and hash indexes have the same purpose: to create a full or partial replication of a base table with a primary index. This primary index is on a foreign key column. A hash index can be defined on one table only.

7.2.12 Referential Integrity

Enforcement of referential integrity supports two forms of declarative SQL: a standard method and a batch method. In both standard and batch referential integrity, the referenced columns must be either a unique primary index (UPI) or unique secondary index (USI), and either must be defined as NOT NULL. A referencing table is sometimes called a child table, and the referencing columns are called child columns. Child tables and columns must have parents, which are the tables and columns being referenced.

Referential integrity can:
• Increase productivity in development by bypassing the need to code SQL statements to enforce referential constraints.
• Require fewer written programs by ensuring referential constraints are not violated.
• Improve performance by identifying the most efficient method for enforcing referential constraints.
Referencing tables must have FOREIGN KEY columns which are identical in definition with keys in the referenced table. The following rules are applied to columns assigned as FOREIGN KEYS:
• The COMPRESS option is not permitted on referenced or referencing column(s) for standard referential integrity.
• The foreign and primary key must contain the same number of columns.
• Columns must have the same data type and case sensitivity.
• No comparisons are made between column level constraints.
• If one column value for the foreign key is null, all foreign key values are invalidated.

FOREIGN KEY references can exist on the same table containing the FOREIGN KEY. The referenced and referencing columns must be different columns.

A circular reference exists when one table references another table which has another reference to the original table. When a circular reference exists, at least one set of FOREIGN KEYS must be defined on nullable columns.

The database must maintain the integrity of foreign keys. Here are some actions the database will take to maintain that integrity when data is manipulated:
• Inserting a row into the referencing table with a NOT NULL foreign key column - the system verifies a row exists in the referenced table with the same values as the foreign key columns; if not, an error is returned.
• Foreign key column values are altered to non-null values - the system verifies a row exists in the referenced table with values matching those of the altered values of the foreign key columns; if not, an error is returned.
• Deleting a row in the referenced table - the system verifies no rows exist with the same foreign key values as the deleted row; if a row exists, an error is returned.
• Before updating a referenced column - the system verifies no rows exist with the same foreign key values as the referenced columns; if a row exists, an error is returned.
• Before altering the structure of columns - the system verifies the change will not violate foreign key constraint rules; if a rule is broken, an error is returned when an ALTER TABLE or DROP INDEX statement is made.
• A referenced table is dropped - the system will ensure the referencing table has dropped its foreign key for the referenced table.
• A foreign key reference is added - the system will validate all values in the foreign key columns against columns in the referenced table.
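The insert and delete checks listed above can be tried out with a small sketch. It uses Python's built-in `sqlite3` module as a stand-in, so the syntax and error messages are SQLite's rather than Teradata's; the `department` and `employee` tables and the `rejected` helper are hypothetical names chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE department (dept_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    " emp_no INTEGER PRIMARY KEY,"
    " dept_no INTEGER REFERENCES department (dept_no))"
)
conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 10)")    # parent row exists
conn.execute("INSERT INTO employee VALUES (2, NULL)")  # null foreign key is not checked


def rejected(sql):
    """Return True when the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# A child row without a matching parent row is refused.
insert_orphan_rejected = rejected("INSERT INTO employee VALUES (3, 99)")
# A parent row that is still referenced cannot be deleted.
delete_parent_rejected = rejected("DELETE FROM department WHERE dept_no = 10")
employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

Both violating statements raise an error and leave the tables unchanged, which mirrors the "if not, an error is returned" behaviour described in the list.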
Archiving and restoration of individual tables, as well as copying tables from one database to another, are performed by the Archive/Recovery (ARC) utility. When a single table is restored, the dictionary definition for the table is also restored. This dictionary definition contains the references between parent and child tables. When restoring or copying a table, it is possible to create a reference definition which is inconsistent. The ARC utility will validate the references in both tables after the restoration is complete. Until then, the table is marked inconsistent and no updates, inserts, or deletes to the table are allowed. If a table is the target for a FastLoad or MultiLoad operation, foreign key references are not supported.

7.2.13 Views

A view can be used to see defined portions of one or more tables in a database. Views will be presented as tables to the user, but instead of being physical tables storing data, they are virtual. Data does not exist within a view like it does a table. Only the column definition for the view is stored, and a view does not exist until it is referenced by a statement. At this point, the database will materialize the view. However, a user can work with a view as if they were accessing the physical table.

To define a view, the CREATE VIEW statement is used, which will provide a view name, its columns, a SELECT statement on one or more columns from underlying tables or views, and any conditional expressions or aggregate operators.

Some of the operations used to manipulate tables are not valid for views and some are even restricted. The restrictions are as follows:
• An index cannot be created on a view.
• An ORDER BY clause cannot exist within a view definition.
• View column names must be explicitly specified for any derived columns in the view.
• Tables cannot be updated from a view when the view is a join view, contains derived columns, a DISTINCT clause, or a GROUP BY clause, or defines the same column more than once.
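A short sketch can make the "virtual table" idea concrete. It again uses Python's `sqlite3` module as a stand-in for Teradata, and the `employee` table, the `dept_pay` view, and the `dept_totals` helper are hypothetical. Note that the derived column is given an explicit name, as the restrictions above require.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_no INTEGER, dept_no INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, 10, 4000), (2, 10, 6000), (3, 20, 5000)],
)
# Only the definition is stored; the derived column gets an explicit name.
conn.execute(
    "CREATE VIEW dept_pay AS "
    "SELECT dept_no, SUM(salary) AS total_salary FROM employee GROUP BY dept_no"
)


def dept_totals():
    """Query the view as if it were a table; ordering belongs to the query."""
    return conn.execute(
        "SELECT dept_no, total_salary FROM dept_pay ORDER BY dept_no"
    ).fetchall()


before = dept_totals()
conn.execute("INSERT INTO employee VALUES (4, 20, 1000)")
after = dept_totals()  # the view reflects the new base table row
```

Because the view holds no data of its own, the second query picks up the newly inserted row without the view being touched.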
7.2.14 Triggers

Triggers are active database objects consisting of a stored SQL statement or a block of SQL statements. They execute when an INSERT, UPDATE, DELETE, or MERGE statement is used to modify a specified column(s) in the subject table. The triggered action is typically in the form of SQL statements. Teradata Database triggers conform to ANSI SQL:2008 standards.

The CREATE TRIGGER statement is used to define a trigger. The ALTER TRIGGER statement will enable, disable, or change the creation timestamp of the trigger. To remove a trigger permanently from the system, the DROP TRIGGER statement is used. If a table has an active trigger, most load utilities will not be able to access the table. To disable the trigger and enable the load, the ALTER TRIGGER statement must be used by the application.

A trigger will have one of two types of granularity:
• Row triggers - executed once for each row changed by a triggering event and satisfying any qualifying condition.
• Statement triggers - executed once when the triggering statement is executed.

The following process is initiated when a trigger is executed:
• A triggering event occurs on the subject table.
• Appropriate triggers on the subject table are activated based on the triggering event.
• The trigger action time for qualified triggers is examined to determine whether they fire before or after the triggering event.
• With multiple qualifying triggers, the order of fire is based on the ANSI-specified order of creation timestamp.
• The triggered action is executed.
• Control is passed from one trigger to the next.
• If any action within the triggering process aborts, all the actions will abort.
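Row-trigger granularity can be illustrated with a minimal sketch. It relies on Python's `sqlite3` module as a stand-in (SQLite offers only row triggers and its trigger syntax differs from Teradata's); the `employee` and `salary_log` tables and the `log_raise` trigger are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, dept_no INTEGER, salary INTEGER);
    CREATE TABLE salary_log (emp_no INTEGER, old_salary INTEGER, new_salary INTEGER);

    -- Row trigger: fires once per changed row that satisfies the WHEN condition.
    CREATE TRIGGER log_raise AFTER UPDATE OF salary ON employee
    FOR EACH ROW WHEN NEW.salary > OLD.salary
    BEGIN
        INSERT INTO salary_log VALUES (OLD.emp_no, OLD.salary, NEW.salary);
    END;

    INSERT INTO employee VALUES (1, 10, 4000), (2, 10, 6000), (3, 20, 5000);
    """
)

# One triggering statement changes two rows, so the triggered action runs twice.
conn.execute("UPDATE employee SET salary = salary + 500 WHERE dept_no = 10")
log_rows = conn.execute(
    "SELECT emp_no, old_salary, new_salary FROM salary_log ORDER BY emp_no"
).fetchall()
```

A statement trigger, by contrast, would record a single action for the whole UPDATE regardless of how many rows it changed.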
7.2.15 Macros

Macros are one or more statements that can be executed as a group by performing a single EXECUTE statement. When macros are executed, one or more rows of data can be returned. Macros can be created by a user for use only by the creator or be granted authorization for use by other users.

A macro can contain a data definition statement if it is the only SQL statement in the macro. The exception to this rule is CREATE AUTHORIZATION and REPLACE AUTHORIZATION. The resolution of a data definition statement is not fully reached until the macro is executed. All unqualified database object references are resolved using the default database of the user submitting the EXECUTE statement. If a different result is desired, all object references must be fully qualified in the data definition statement.

Parameters in a macro can be substituted with data values each time a macro is executed. References to a parameter name are prefixed with a COLON character. Parameters cannot be used for data object names. A USING modifier in a macro will allow parameters to be filled with data from an external source. A macro can contain an EXECUTE statement to execute another macro.

Macros can be executed statically or dynamically. If the macro is a single statement that returns no data, a static execution is specified using the EXEC statement and a dynamic execution is specified using both PREPARE and EXECUTE statements. When multiple statements are used and return data, a cursor is used.
• The EXEC statement is used to perform a static macro.
• The PREPARE statement is used to define a dynamic macro execution. The statement string within this PREPARE contains an EXEC macro-name statement.
• The EXECUTE statement is used to perform a dynamic macro.
• The DECLARE CURSOR statement is used to associate a macro cursor with a static SQL macro execution.

The following statements can be used to perform different actions on a macro:
• DROP MACRO
• REPLACE MACRO
• RENAME MACRO
• HELP MACRO
• SHOW MACRO

7.2.16 Stored Procedures

The ANSI SQL:2008 standard calls stored procedures Persistent Stored Modules. They consist of control and condition handling statements written in SQL. Applications have a server-based procedural interface to the database to manage these statements. They are stored as objects within the user database space and executed on the server.

Stored procedures are created using either the COMPILE command from the BTEQ utility or the SQL CREATE PROCEDURE or REPLACE PROCEDURE statements through CLIv2 applications, ODBC, JDBC, and Teradata SQL Assistant. The CREATE PROCEDURE statement does not require the use of the COMPILE command in BTEQ. Modification of a stored procedure definition is done using the REPLACE PROCEDURE statement.

Stored procedures are a set of statements, either single or compound, which as the main tasks of the procedures, comprise the stored procedure body. A single statement stored procedure body will contain only one control statement or one SQL DDL, DML, or DCL statement. Declaration and cursor statements are not allowed. A compound statement stored procedure body will contain a BEGIN-END statement with a set of declarations and statements, such as:
• Local variable declarations
• Cursor declarations
• Condition declarations
• Condition handler declaration statements
• Control statements
• SQL DML, DDL, DCL statements
• Multistatement request

Statements can be nested in compound statements.

Execution of a stored procedure requires the appropriate privileges. If authorized, a user can use the SQL CALL statement from any supporting client utility or interface. External stored procedures written in C, C++, or Java can also execute stored procedures. All the parameters in a stored procedure must have their arguments specified. The output from stored procedures consists of values in INOUT or OUT arguments or result sets of SELECT statements. Result sets are returned when the CREATE PROCEDURE or REPLACE PROCEDURE statements specify the DYNAMIC RESULT SETS clause.

Stored procedures can be recompiled using the ALTER PROCEDURE statement, instead of executing SHOW PROCEDURE and REPLACE PROCEDURE statements. This:
• Allows cross-platform archive and restore operation for stored procedures.
• Allows changes in compile-time attributes such as SPL option and Warnings option.
• Allows stored procedures created in earlier versions of the database to be recompiled with current features.

The following statements perform different operations on stored procedures:
• DROP PROCEDURE
• RENAME PROCEDURE
• HELP PROCEDURE
• SHOW PROCEDURE

If a stored procedure is written in the C, C++, or Java programming language, it is considered an external stored procedure and is installed on the database and executed like a stored procedure. To install an external stored procedure, a user must have the appropriate privilege, namely the CREATE EXTERNAL PROCEDURE privilege on the database. CLIv2 is used to execute C or C++ external stored procedures, while Java external stored procedures are executed through JDBC. Nested stored procedures can be called using the FNC_CallSP library function.

7.2.17 User-Defined Functions

Two types of user-defined functions are supported and allow a user to write their own functions:
• SQL UDFs
• External UDFs
With SQL UDFs, regular SQL expressions can be encapsulated in functions and used as standard SQL functions. As a result, complex SQL expressions can be moved from queries to SQL UDFs, which is the benefit of creating SQL UDFs.

External UDFs allow functions to be written in the C, C++, or Java programming language. They are installed on the database and used like standard SQL functions. External UDFs are categorized into three types:
• Scalar - uses input parameters to return a single value result.
• Aggregate - uses grouped sets of relational data to return a summary result.
• Table - returns a table to the SELECT statement, using a FROM clause.

7.2.18 Profiles

With profiles, the following system parameters are provided defined values:
• Default database
• Spool space
• Temporary space
• Default account and alternate accounts
• Password security attributes
• Optimizer cost profile

Profiles are defined and assigned to a group of users who need to share the same settings. The purpose of profiles is to simplify system administration and control password security.

To create a profile, a CREATE PROFILE statement is used and allows the following to be set:
• Account identifiers.
• Default database.
• Allocation of space for spool files and temporary tables.
• Optimizer cost profile.
• Number of days before password expiration.
• Number of days before password reuse is allowed.
• Number of incorrect logon attempts.
• Minimum and maximum number of characters allowed.
• Allowable characters in password string, including:
o Use of digits and special characters
o At least one numeric character
o At least one alphabet character
o At least one special character
o Exclusion of the user name from the password
o Mixture of character case
• Restriction of specific words from a significant portion of the password string.

7.2.19 Roles

The privileges on objects within the database are defined by roles. Users are assigned roles, which give them the authority to access the objects on which the role has privileges. The primary purpose of roles is to simplify the administration of privileges across the database and reduce the disk space required to assign privileges to individual users.

The process for managing user privileges using roles is as follows:
• Use the CREATE ROLE statement to define a role.
• Add privileges to the newly created role using the GRANT statement.
• Use the GRANT statement to grant a role to users or other roles.
• Assign default roles to users with the DEFAULT ROLE option in the CREATE USER or MODIFY USER statement.
• Change the current role for a session, if required.
7.3 SQL Syntax

7.3.1 Statement Structure

The basic structure of an SQL statement is “statement_keyword action.” A typical SQL statement will have a statement keyword, one or more column names, database name, table name, and one or more optional clauses. The action can be any of the following:
• Expressions - any literals, name references, or operations.
• Functions - the name of a function and its arguments.
• Keywords - any values introducing clauses or phrases or representing special objects.
• Clauses - subordinate statement qualifiers.
• Phrases - data attribute phrases.

7.3.2 Keywords

Keywords are reserved and non-reserved words which have special meaning in SQL statements. The first keyword in an SQL statement is the statement keyword. This keyword is always a verb. Other keywords found in a statement act as modifiers or introducers of clauses.

If a keyword is reserved, it cannot be used to name database objects. Non-reserved keywords can be used in object names, but may cause confusion. As new releases of the database product are introduced, formerly valid object names may be using newly reserved keywords. All keywords must be ASCII compliant.
7.3.3 Expressions

Expressions are literals, name references, or operations using names and literals. Within an expression, a value is specified. Expressions are of two types:
• Scalar - also known as a value expression, it produces a single number, character string, byte string, date, time, timestamp, or interval of exactly one declared type.
• Query - produces rows and tables of data and operates on tables or views.

7.3.4 Literals

Literals are constants coded directly into the text of an SQL statement, view, or macro definition text, or CHECK constraint definition text.

Character literals declare character values in an expression, view, or macro definition text, or CHECK constraint definition text. The three types of character literals are the character literal, hexadecimal character literal, and Unicode character string literal. Object names in a data dictionary can be created and referenced using hexadecimal name literals and Unicode delimited identifiers.

Date and time literals are used to declare date, time, or timestamp values in an SQL expression, view, or macro definition text, or CONSTRAINT definition text. DATE, TIME, and TIMESTAMP are the types of DateTime literals found. Spans of time can be declared using interval literals, which consist of Year-Month and Day-Time categories. Period literals are used to specify a constant value of the Period data type.

A numeric literal is a character string of 1 to 40 characters consisting of 0-9, a plus sign, minus sign, and decimal point. There are three types of numeric literals:
• Integer Literal - literal strings of integer numbers are declared using an optional sign and a sequence of up to 10 digits.
• Decimal Literal - literal strings of decimal numbers are declared consisting of an optional sign, an optional sequence of up to 38 digits, a decimal point, and an optional sequence of digits.
• Floating Point Literal - literal strings of floating point numbers are declared consisting of an optional sign, optional sequence of digits, optional decimal point, optional sequence of digits, the literal character E, another optional sign, and a sequence of digits representing the exponent. If one sequence on either side of the decimal point is missing, the other sequence is required.
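The three numeric literal forms can be told apart with a few regular expressions. The sketch below is only an illustration of the rules as stated in this section, not of Teradata's actual parser: the `classify` function and pattern names are hypothetical, the requirement of a digit on at least one side of the decimal point is applied to decimal literals as an assumption, and strings that fit none of the stated rules are simply reported as `None`.

```python
import re

# Optional sign followed by up to 10 digits.
INTEGER = re.compile(r"[+-]?\d{1,10}")
# Optional sign, a decimal point, and digits on at least one side of it (assumption).
DECIMAL = re.compile(r"[+-]?(\d*\.\d+|\d+\.\d*)")
# Mantissa (digits required on at least one side of the point), E, signed exponent.
FLOAT = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)[Ee][+-]?\d+")


def classify(text):
    """Return 'integer', 'decimal', 'float', or None for a candidate literal."""
    if not 1 <= len(text) <= 40:          # numeric literals are 1 to 40 characters
        return None
    if INTEGER.fullmatch(text):
        return "integer"
    if DECIMAL.fullmatch(text):
        digits = sum(ch.isdigit() for ch in text)
        return "decimal" if digits <= 38 else None
    if FLOAT.fullmatch(text):
        return "float"
    return None


samples = ["42", "-3.14", ".5", "1E6", "-1.5e-3", ".", "abc", "12345678901"]
kinds = [classify(s) for s in samples]
```

For the sample strings, the sketch treats `42` as an integer, `-3.14` and `.5` as decimals, and `1E6` and `-1.5e-3` as floating point literals, while a lone decimal point, alphabetic text, and an eleven-digit integer fall outside the stated rules.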
The NULL keyword is considered to be like a literal. The keyword NULL acts similar to, but not identical as, literals, representing an empty column, an unknown value, or an unknowable value. Instead of representing a value, nulls represent the absence of a value. NULL is ANSI SQL:2008-compliant with extensions. Nulls are used as literals in the following ways:
• CAST source operand
• CASE result
• Insert items
• Update items
• Default column definition specification
• Explicit SELECT item (Teradata extension)
• Function operand (Teradata extension)

7.3.5 Operators

Logical and arithmetic operations are expressed as SQL operators. SQL operators consist of (with order of precedence):
• Numeric (unary plus, unary minus, exponentiation, multiplication, division, modulo operator, addition, subtraction)
• String (concatenation operator)
• Logical (EQ, NE, GT, LE, LT, GE, IN set, NOT IN set, BETWEEN/AND, LIKE, NOT, AND, OR)
• Value
• Set

Precedence refers to the importance of the operator compared to other existing operators and represents three levels. Operators are evaluated from left to right if they have the same precedence. Parentheses are used to control the order of precedence. The innermost parentheses are always evaluated first, and evaluation proceeds outward.

7.3.6 Functions

There are two types of functions used in SQL: scalar and aggregate. Scalar functions use input parameters to return a single value result. Aggregate functions produce summary results from grouped sets of relational data.
7.3.7 Delimiters and Separators

Delimiters are special characters used in different capacities:
• PARENTHESIS - used to group expressions and define limitations on phrases.
• COMMA - separates and distinguishes column names, parameters, or DateTime fields.
• COLON - used to prefix reference parameters or client system variables. Also separates DateTime fields.
• FULLSTOP - used to separate database objects (database.table.column), method names from UDT expressions, and DateTime fields, or to act as a decimal point.
• SEMICOLON - used to separate statements in multi-statement requests, stored procedure bodies, and SQL procedure statements. Also used to terminate requests through utilities and embedded SQL statements.
• APOSTROPHE - defines the boundaries of character string constants and separates DateTime fields.
• QUOTATION MARK - defines the boundaries of nonstandard names.
• SOLIDUS - separates DateTime fields.
• Uppercase B and Lowercase b - separates DateTime fields.
• HYPHEN - separates DateTime fields.

Other forms of separators are lexical and statement separators. Lexical separators are character strings located between words, literals, and delimiters. They will not change the meaning of a statement and are represented as comments, pad characters, and RETURN characters. A newline character is implementation-specific but created by hitting the Enter or Return key.

Comments can be inserted in an SQL request anywhere a pad character can occur. They can be simple or bracketed. Simple comments are delimited by two consecutive HYPHEN characters before the comment text and a newline character at the end. Bracketed comments are a text string of unlimited length delimited by a beginning SOLIDUS and ASTERISK (/*) and an end ASTERISK and SOLIDUS (*/).

The statement separator is the SEMICOLON and allows multiple statements to be distinguished from each other. The SEMICOLON is also used to terminate requests, represented by being the last nonblank character in an input line in BTEQ (not counting ends of comments). The SEMICOLON is a Teradata extension of the ANSI SQL:2008 standard.
7.4 Default Database

A database used by the Teradata Database to look for unqualified object names is called the default database. It will search this database and other databases to resolve names. If an unqualified object name exists in multiple databases, the SQL statement will produce an ambiguous name error.

A permanent default database can be established and invoked each time a user logs on. It can be defined using the DEFAULT DATABASE clause with the CREATE USER statement or, if a profile exists with a defined default database, the PROFILE clause with the CREATE USER statement is used. To change the permanent default database definition or add a default database when none exists, any of the following data definition statements can be used:
• MODIFY USER, with a DEFAULT DATABASE clause.
• MODIFY USER, with a PROFILE clause.
• MODIFY PROFILE, with a DEFAULT DATABASE clause.

A default database can be established for the current session using the DATABASE statement. Some privilege on an object in the database must exist before a user can establish a default database.

7.5 Functional Families

SQL statements can be executed from within client application programs and are referred to as embedded SQL. Teradata Database will distinguish the SQL language statements embedded in the application program from the host programming language using a special prefix, EXEC SQL.

7.5.1 Data Definition Language

Data Definition Language (DDL) is an SQL language subset. All SQL statements that support the definition of database objects fall under the category of DDL. DDL statements perform the following functions:
• Create, drop, rename, and alter tables.
• Create, drop, rename, and replace stored procedures, user-defined functions, views, and macros.
• Create, drop, and alter user-defined types.
• Create, drop, and replace user-defined methods.
• Create and drop indexes.
• Create, drop, and modify users and databases.
• Create, drop, alter, rename, and replace triggers.
• Create, drop, and set roles.
• Create, drop, and modify profiles.
• Create, drop, and alter replication groups.
• Create, drop, and replace rule sets.
• Collect statistics on a column set or index.
• Establish default database.
• Comment on database objects.
• Set a different collation sequence, account priority, DateForm, time zone, and database.
• Set the query band for a transaction or session.
• Begin and end logging.
• Enable and disable online archiving.

When DDL statements are entered, they can be:
• Single statement requests.
• Solitary statement, or last statement, in explicit transactions.
• Solitary statements in a macro.

No DDL statement can be part of a multistatement request. If the execution of a DDL statement is successful, entries in the Data Dictionary will be automatically created and updated.
7.5.2 General DDL Statements

• ALTER FUNCTION
• ALTER METHOD
• ALTER PROCEDURE
• ALTER REPLICATION GROUP
• ALTER TABLE
• ALTER TRIGGER
• ALTER TYPE
• BEGIN LOGGING/END LOGGING
• COMMENT
• CREATE AUTHORIZATION/DROP AUTHORIZATION
• CREATE CAST/DROP CAST/REPLACE CAST
• CREATE DATABASE/DELETE DATABASE/DROP DATABASE/MODIFY DATABASE
• CREATE ERROR TABLE/DROP ERROR TABLE
• CREATE FUNCTION/DROP FUNCTION/REPLACE FUNCTION
• CREATE GLOP SET/DROP GLOP SET
• CREATE HASH INDEX/DROP HASH INDEX
• CREATE INDEX/DROP INDEX
• CREATE JOIN INDEX/DROP JOIN INDEX
• CREATE MACRO/DROP MACRO/RENAME MACRO/REPLACE MACRO
• CREATE METHOD/REPLACE METHOD
• CREATE PROCEDURE/DROP PROCEDURE/RENAME PROCEDURE/REPLACE PROCEDURE
• CREATE PROFILE/DROP PROFILE/MODIFY PROFILE
• CREATE REPLICATION GROUP/DROP REPLICATION GROUP
• CREATE REPLICATION RULESET/DROP REPLICATION RULESET/REPLACE REPLICATION RULESET
• CREATE ROLE/DROP ROLE/SET ROLE
• CREATE TABLE/DROP TABLE/RENAME TABLE
• CREATE TRANSFORM/DROP TRANSFORM/REPLACE TRANSFORM
• CREATE TRIGGER/DROP TRIGGER/RENAME TRIGGER/REPLACE TRIGGER
• CREATE TYPE/DROP TYPE
• CREATE USER/DELETE USER/DROP USER/MODIFY USER
• CREATE VIEW/DROP VIEW/RENAME VIEW/REPLACE VIEW
• DATABASE
• DROP ORDERING/REPLACE ORDERING
• LOGGING ONLINE ARCHIVE ON/OFF
• SET QUERY_BAND
• SET SESSION
• SET TIME ZONE

7.5.3 Data Control Language

Data Control Language (DCL) is a subset of the SQL language. DCL statements define the security authorization required for accessing database objects, specifically granting and revoking privileges and transferring ownership of a database to another user.

A data control statement can be either:
• A single statement request.
• A solitary statement, or last statement, in an explicit transaction.
• A solitary statement in a macro.

The DCL statement cannot be part of a multistatement request. If a DCL statement executes successfully, entries in the data dictionary are created and updated automatically.

DCL statements consist of:
• GIVE
• GRANT
• GRANT CONNECT THROUGH
• GRANT LOGON
• REVOKE
• REVOKE CONNECT THROUGH
• REVOKE LOGON

7.5.4 Data Manipulation Language

Data Manipulation Language (DML) is a subset of the SQL language. DML statements define the manipulation or processing of database objects.

The most common form of DML statement is the SELECT statement, which returns information found in tables of the relational database. The components of the SELECT statement are the referenced table columns, the database other than default, and the table(s) within the database to access. The SELECT statement can be used to select columns, to select rows, and even to select specified columns for all rows or specified rows. A single statement can be used to retrieve data from all rows or specified rows. Greater detail can be obtained by using one of the following clauses:
• FROM
• WHERE
• ORDER BY
• DISTINCT
• WITH
• GROUP BY
• HAVING
• TOP
A special form of the SELECT statement is the zero-table SELECT statement. These statements will return data without accessing any tables. Subqueries are nested SELECT statements.

The INSERT statement can be used to add a new row to a table. An INSERT...SELECT statement can be used to perform a bulk insert of rows retrieved from another table. The CREATE TABLE statement may define defaults and constraints which affect an insert operation, in the following ways:
• If an attempt is made to add a duplicate row for a unique index or to a table defined as SET, an error is returned. The only exception is when an INSERT...SELECT is performed on a table defined as SET and in Teradata mode.
• If a value for a column is omitted when NOT NULL is specified and no default is specified, an error message is returned.
• If a value for a column is omitted requiring a defined default value, the default value for the column is stored.
• If a value which does not satisfy the constraints or violates a defined constraint is supplied, the operation is rejected and an error message is returned.

An UPDATE statement is used to modify data in one or more rows of a table. The column name of the data being modified is specified in the statement. A WHERE clause can be used to identify the rows to be changed. An update operation can be affected by the attributes in the CREATE TABLE statement:
• A value violating some defined constraint will be rejected and an error message returned.
• If a value NULL is supplied and is allowed, any existing data in the column is removed.
• If an update result will violate uniqueness constraints or create a duplicate row, the operation is rejected and an error message is returned.

The DELETE statement allows an entire row or rows to be removed from a table. The WHERE clause in the statement will qualify the rows to be deleted.

The MERGE statement can be used to merge a source row set into a target table. This merge is based on the matching condition between the source row and target rows.
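The way defaults and constraints from CREATE TABLE shape insert and update operations can be tried out with a short sketch. Python's `sqlite3` module is used as a stand-in, so details such as SET tables and Teradata session modes are not modelled; the `account` table and the `rejected` helper are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " acct_no INTEGER NOT NULL UNIQUE,"
    " status  TEXT NOT NULL DEFAULT 'OPEN',"
    " balance INTEGER CHECK (balance >= 0))"
)
# status is omitted, so its defined default value is stored.
conn.execute("INSERT INTO account (acct_no, balance) VALUES (1, 100)")
conn.execute("INSERT INTO account (acct_no, balance) VALUES (2, 50)")


def rejected(sql):
    """Return True when the statement is refused with an integrity error."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


results = {
    "default_status": conn.execute(
        "SELECT status FROM account WHERE acct_no = 1"
    ).fetchone()[0],
    # Duplicate value for a unique column.
    "duplicate_insert": rejected("INSERT INTO account (acct_no, balance) VALUES (1, 5)"),
    # NOT NULL column omitted and no default defined.
    "missing_not_null": rejected("INSERT INTO account (balance) VALUES (5)"),
    # Update violating a CHECK constraint.
    "check_violation": rejected("UPDATE account SET balance = -1 WHERE acct_no = 1"),
    # Update that would break uniqueness.
    "update_to_duplicate": rejected("UPDATE account SET acct_no = 2 WHERE acct_no = 1"),
}

# NULL is allowed for balance, so the existing value is simply removed.
conn.execute("UPDATE account SET balance = NULL WHERE acct_no = 2")
balance_after_null = conn.execute(
    "SELECT balance FROM account WHERE acct_no = 2"
).fetchone()[0]

# A zero-table SELECT returns data without accessing any table.
zero_table_value = conn.execute("SELECT 2 + 3").fetchone()[0]
```

Each rejected statement leaves the table as it was, so the two original rows survive every failed attempt.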
Recursive queries can be used to query hierarchies of data. A recursive query is specified using a WITH RECURSIVE clause preceding a query or a RECURSIVE clause in a CREATE VIEW statement. Recursion is characterized by three steps: initialization, repeated iteration of logic, and termination. The process of performing a recursive query is in three phases:
• Create an initial result set.
• Perform recursion based on the existing result set.
• Return the final result set as the result of the query.

To ensure infinite recursion is not performed, depth control is provided. This is done by:
1. Specifying a depth control column in the column list.
2. Initializing the column value to 0 in seed statements.
3. Incrementing the column value by 1 in each recursive statement.
4. Specifying a limit for the value of the depth control column.

7.5.5 Query and Workload Analysis
The following SQL statements are provided by Teradata to collect and analyze query and data demographics:
• BEGIN QUERY LOGGING
• COLLECT DEMOGRAPHICS
• COLLECT STATISTICS
• DROP STATISTICS
• DUMP EXPLAIN
• END QUERY LOGGING
• INITIATE INDEX ANALYSIS
• INITIATE PARTITION ANALYSIS
• INSERT EXPLAIN
• RESTART INDEX ANALYSIS
• SHOW QUERY LOGGING
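Returning to the depth-control steps for recursive queries described earlier in this section, the following is a minimal sketch of a WITH RECURSIVE query with a depth control column. It runs on Python's sqlite3 module rather than on Teradata Database, and the flights table, the function name, and the data are hypothetical. The numbered comments correspond to the four depth-control steps.

```python
import sqlite3


def reachable_with_depth(edges, start, max_depth):
    """List destinations reachable from `start`, limited by a depth column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE flights (origin TEXT, destination TEXT)")
    conn.executemany("INSERT INTO flights VALUES (?, ?)", edges)
    rows = conn.execute(
        """
        WITH RECURSIVE reach (destination, depth) AS (  -- 1. depth control column
            SELECT destination, 0                        -- 2. seed statement starts at 0
            FROM flights
            WHERE origin = ?
            UNION ALL
            SELECT f.destination, r.depth + 1            -- 3. increment by 1 per iteration
            FROM reach AS r
            JOIN flights AS f ON f.origin = r.destination
            WHERE r.depth < ?                            -- 4. limit on the depth column
        )
        SELECT DISTINCT destination, depth
        FROM reach
        ORDER BY depth, destination
        """,
        (start, max_depth),
    ).fetchall()
    conn.close()
    return rows
```

With cyclic data such as A to B, B to C, and C back to A, the recursion would never terminate on its own; the limit on the depth column is what ends it.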
The results of these query and workload analysis statements can be used by the Optimizer to produce better query plans or to populate user-defined Query Capture Database (QCD) tables.

Diagnostic statements supporting the Teradata Index Wizard can be used to emulate a production environment on a test system. These statements include:
• DIAGNOSTIC HELP PROFILE
• DIAGNOSTIC SET PROFILE
• DIAGNOSTIC COSTPRINT
• DIAGNOSTIC DUMP COSTS
• DIAGNOSTIC HELP COSTS
• DIAGNOSTIC SET COSTS
• DIAGNOSTIC DUMP SAMPLES
• DIAGNOSTIC HELP SAMPLES
• DIAGNOSTIC SET SAMPLES
• DIAGNOSTIC "Validate Index"
8 Database Administration

8.1 Physical Database Design

8.1.1 Primary Indexes
An index is a physical mechanism for storing and accessing rows in a table. Data can be distributed or accessed, and retrieval of data can easily be processed, using indexes. Indexes are used to:
• Locate data rows
• Distribute data rows
• Improve performance
• Ensure index value uniqueness

The different types of indexes include:
• Primary
• Partitioned Primary
• Secondary
• Join
• Hash
• Special (referential integrity)

Tables can be created with a Unique Primary Index (UPI), a Non-Unique Primary Index (NUPI), or No Primary Index (NoPI). With UPIs, columns cannot have duplicate values, guaranteeing uniform distribution of table rows. A table created with a NUPI can have duplicate values in its columns, causing "skewed data" when distributing data. The degree of uniformity in the distribution is highly susceptible to the degree of uniqueness of the index. A primary index is the most efficient method of accessing data and the best primary index has the greatest level of uniqueness.

A Primary Key (PK) defines a column, or columns, used to uniquely identify a row in a table. The values in the column must be unique from each other and cannot be null. The PK values should never be changed, as relationships with other tables may be lost if a PK is changed or re-used.

Foreign Keys (FKs) identify table relationships; specifically, they model a relationship between data values on different tables. For instance: two tables are created to store information about subscription customers, one table lists billing information on customers with the primary key being the customer's account number and the other table listing the customer's address with the address column being the primary key. The core relationship between these two tables is the customer's name, since a column with these values can be found on both tables. In this instance, the customer's name is the foreign key.

A Partitioned Primary Index (PPI) will still provide a path to rows in the base table, global temporary tables, volatile tables, and non-compressed join indexes using PI values. The default PI of Teradata Database is a non-partitioned PI, though both UPIs and NUPIs can be partitioned. When a PPI is used to create a table or join index, rows are hashed to AMPs based on the PI columns and assigned to an appropriate partition. Rows are stored in row hash order when assigned to a partition.

A Multilevel Partitioned Primary Index (MLPPI) allows partitions to be sub-partitioned. There can be up to 15 levels of partitioning. Each level of the MLPPI must define at least two partitions. The product of the number of partitions cannot exceed 65,535. These limits may further restrict the number of levels allowed.

8.1.2 Secondary Indexes
Secondary Indexes (SI) are used to obtain information using alternative paths, thus improving performance. Full table scans are not required, but SIs add to table overhead. The Optimizer can use indexes to improve query performance. Some characteristics of Secondary Indexes:
• Row distribution across AMPs is not affected.
• Can be unique or non-unique.
• Useful for NoPI tables.
• SIs can be dropped and recreated when required.

Subtables for all SIs are built by the system. These subtables have index rows where the SI value is associated with one or more rows. The rows are updated in the subtable when changes are made to column values or new rows are added.
The hash value is computed using the hash of the values of the SI columns. This hash value is recorded by the SI subtable, along with the actual value of the index columns and a list of primary index row identifiers. When an SI value is specified in the SQL, the hash value is used to access the required rows.

8.1.3 Join Indexes
An indexing structure containing columns from one or more base tables is called a Join Index (JI). Teradata Database can support multitable, partially-covering JIs. A multitable JI can be used to predefine a join when queries frequently request the join. A single-table JI contains rows from a single table and is used as an alternative approach to directly accessing data. Join indexes support a primary index.

A query is covered by the JI when all referenced columns are stored in the index and the query only needs to examine the JI. When queries use the JI but also the base tables, the query is considered partially covered by the index. All types of join indexes can be joined to base tables to retrieve columns referenced by the query but not stored in the JI. The exception to this rule is aggregate join indexes. An aggregate JI can be defined on two or more tables or on a single table using:
• SUM function
• COUNT function
• GROUP BY clause

Sparse Join Indexes refer to situations using the WHERE clause in the CREATE JOIN INDEX statement to index a portion of the table. This type of JI will limit the rows indexed.

8.1.4 Hashing
To distribute data for tables with a PI to disk storage, hashing is used. Nearly all indexes are based, in whole or in part, on the row hash values instead of table column values. A row hash is obtained by hashing values of the PI columns. To distinguish between rows with the same row hash within the same table, a sequence number is assigned. A row identifier is applied to uniquely identify each row in a table and is a combination of the row hash and sequence number.
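The relationship between the row hash, the sequence number, and the row identifier can be sketched as follows. This is only an illustration under stated assumptions: Teradata's actual hashing algorithm and hash maps are not reproduced here, so an MD5 digest and a simple modulo over a hypothetical count of four AMPs stand in for them, and all function names are invented for the example.

```python
import hashlib
from collections import Counter, defaultdict

NUM_AMPS = 4  # hypothetical system size, chosen for the example


def row_hash(pi_value):
    """Stand-in row hash: the first 32 bits of an MD5 digest of the PI value."""
    digest = hashlib.md5(str(pi_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def assign_row_ids(pi_values):
    """Return a (row hash, sequence number) row identifier for every row."""
    next_sequence = defaultdict(int)
    row_ids = []
    for value in pi_values:
        hashed = row_hash(value)
        next_sequence[hashed] += 1  # distinguishes rows sharing a row hash
        row_ids.append((hashed, next_sequence[hashed]))
    return row_ids


def rows_per_amp(pi_values, num_amps=NUM_AMPS):
    """Count the rows that land on each AMP (assumed mapping: hash modulo AMPs)."""
    counts = Counter(hashed % num_amps for hashed, _ in assign_row_ids(pi_values))
    return [counts.get(amp, 0) for amp in range(num_amps)]
```

Calling rows_per_amp(range(1000)) models a unique primary index and spreads the rows across all AMPs, while rows_per_amp(["same"] * 1000) models a highly non-unique index and places every row on a single AMP, which is the skew described for NUPIs.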
8.1.5 Identity Columns
The ANSI standard defines a column attribute option called the Identity Column. The attribute is used to generate a unique, table-level number for every row in a table. Identity columns can be used to generate UPIs, USIs, primary keys, and surrogate keys.

8.1.6 Normalization
A complex database schema can be reduced into a simple, stable database schema. The process of this reduction is called normalization. The primary concept behind this process is normal forms, which define a system of constraints. A relation is in normal form if it meets the constraints of a particular normal form. Normal forms are layered based on the Boyce-Codd model, typically taking on first, second, and third normal forms.

A relational database is always defined as normalized to the first normal form (1NF). If all fields within a relation contain one and only one value (atomic), the relation is considered to be in 1NF. No hierarchies of data values are allowed in first normal form.

A relation is considered to be in second normal form (2NF) if it is in 1NF and every non-key attribute is fully dependent on the entire Primary Key. Non-key attributes are any attributes not part of the Primary Key. Where first normal form eliminates repeating groups, second normal form eliminates circular dependencies.

The elimination of non-key attributes that do not describe the Primary Key is the basis of third normal form. If two non-Primary Key columns or groups of columns are not in a one-on-one relation in either direction, the relation is considered to be in third normal form.

8.1.7 Referential Integrity
Relationships between tables are the focus of referential integrity, specifically when the relationship is based on the definition of a primary key and foreign key. Essentially, the concept of referential integrity states that a row cannot exist in a referencing table with a non-null value in a referencing column unless an equal value exists in the referenced table. Columns within a referencing table can be specified as foreign keys for columns in a referenced table which are defined as either primary key columns or unique columns.
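The referential integrity rule just described can be sketched with a referenced table and a referencing table. The example runs on Python's sqlite3 module rather than on Teradata Database, and the customer and invoice tables are hypothetical; note that SQLite enforces foreign keys only after the PRAGMA shown in the code.

```python
import sqlite3


def referential_integrity_demo():
    """Insert rows into a referencing table and record which are accepted."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
    # Referenced table: account_no is the primary key.
    conn.execute(
        "CREATE TABLE customer (account_no INTEGER PRIMARY KEY, name TEXT)")
    # Referencing table: account_no is a foreign key to customer.
    conn.execute(
        "CREATE TABLE invoice ("
        " invoice_no INTEGER PRIMARY KEY,"
        " account_no INTEGER REFERENCES customer (account_no))"
    )
    conn.execute("INSERT INTO customer VALUES (100, 'Ann')")
    results = {}
    for label, account_no in [("matching", 100), ("null", None), ("orphan", 999)]:
        try:
            conn.execute(
                "INSERT INTO invoice (account_no) VALUES (?)", (account_no,))
            results[label] = "accepted"
        except sqlite3.IntegrityError:
            results[label] = "rejected"
    conn.close()
    return results
```

The row with a matching value and the row with a null foreign key are accepted; the row whose non-null value has no equal value in the referenced table is rejected.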
Referential integrity utilizes the following concepts:
• Parent Table - also called the referenced table, this is referred to by a child table.
• Child Table - where referential constraints are defined.
• Parent Key - candidate key in the parent table.
• Primary Key - a parent table column set referred to by a foreign key column set in a child table.
• Foreign Key - a child table column set referring to a primary key column set in a parent table.

8.2 Database Administration
When Teradata Database is installed, it has only one user, called DBC. This user will own all other databases and users in the system. Associated with the user DBC is a default database also called DBC. The user DBC also owns all space for the entire system. As new databases and users are created, permanent space is extracted from the user DBC. The usable disk space in the DBC database will initially reflect the entire system hardware capacity, except for space used for the following system users, databases, and objects:
• Crashdumps user
• SysAdmin user
• SystemFE user (for field engineers)
• TDPUSER user
• Sys_Calendar database
• TD_SYSFNLIB database
• SQLJ database (for external routines and UDTs)
• SYSLIB database (for external routines and UDTs)
• SYSUDTLIB database (for external routines and UDTs)
• SYSSPATIAL database (for geospatial data types)
• DBCExtension database (for GLOP sets)
• System Transient Journal (TJ)
• System catalog tables of the Data Dictionary
• User accessible views

To ensure sufficient space is available in the DBC database for use as spool, permanent space can be set aside and remain unused, or an empty database can be created to hold the space.

8.2.1 System Users
The DIP utility has several executable files for creating system users, databases, and administrative tools. These executable files contain SQL scripts known as DIP scripts. The following are system users and databases created by these scripts:
• SysAdmin - created for internal use in the database to manage and perform system administration functions; contains several views and macros for administrative purposes.
• SystemFE - created for internal use by field engineers for diagnostic and emulation operations.
• Crashdumps - used to log internal errors; created specifically by the DIPCRASH script being run during DIP installation, migration, or upgrade.
• Sys_Calendar - contains the Sys_Calendar.CalDates table and Sys_Calendar.Calendar view.
• TD_SYSFNLIB - used to contain domain-specific functions.

8.2.2 Administrator User
The responsibilities of the administrator user are:
• Establish user management policy with the security administrator.
• Create and manage databases, tables, views, macros, stored procedures, UDTs, journals, query logs, and other database objects.
• Allocate space to users and databases.
• Grant privileges to roles and users.
• Manage users and databases.
• Create and manage accounts.
• Manage space usage.
• Manage data load and export.
• Manage data archive and restore.
• Monitor system performance.
• Troubleshoot user problems.
• Manage database maintenance tasks.

To protect sensitive data and system objects owned by the user DBC, it is recommended that an administrative user is created. The name of the administrative user cannot be the same as any name already reserved by the system. When assigning space to this user, enough space should be allocated to handle growth of system tables, logs, and transient journals.

8.2.3 Administration Tools
To perform administrative tasks, the following methods can be utilized:
• Direct submission of SQL statements through BTEQ, BTEQWin, SQL Assistant, or scripts
• Client-based utilities
• DBS utilities
• Teradata Viewpoint

8.3 System Administration

8.3.1 Session Management
When a user logs on to Teradata Database, a session is established. The session begins after the username, password, and account number are accepted by the database and the database returns a session number to the process. Subsequent requests and responses over the session can be identified by the host id, session number, and request number. The identification used by the database is provided automatically and is unknown to the user.

The actual procedure for establishing the session is slightly different based on the client system, the operating system, and whether the user is an application program, an interactive terminal session with an application, or an interactive user. The logon string used to identify the user can include:
• Tdpid
• User name
• Password
• Optional account number

8.3.2 Utilities
Administrative and maintenance functions can be performed by system administrators using a large number of available utilities for Teradata Database, including:
• aborthost (Abort Host) - used to abort all outstanding transactions on a failed host.
• ampload (AMP Load) - used to identify the load of all AMP vprocs on the system.
• awtmon (AWT Monitor) - used to collect and display a summary of AWT snapshots for users.
• checktable (CheckTable) - used to check for inconsistencies between internal data structures.
• cnsrun (CNS Run) - used to run database utilities from scripts.
• config (Configuration Utility) - used to define AMPs, PEs, and hosts and their interrelationships.
• ctl (Control GDO Editor) - used to display and change PDE Control Parameters GDO fields.
• cufconfig (Cufconfig Utility) - used to display and modify configuration settings for the UDF and external stored procedures subsystem.
• DIP (Database Initialization Program) - executes one or more DIP scripts.
• dbscontrol (DBS Control) - used to display and modify DBS Control Record fields.
• dbschk, nodecheck, or syscheck (Resource Check Tools) - used to identify slowdowns and potential hangs of the database and to display statistics.
• DUL or DULTAPE (Dump Unload/Load) - used to save system dumps to tape and restore from tape.
• dumplocklog (Locking Logger) - used to log transaction identifiers, session identifiers, lock object identifiers, and lock levels associated with SQL statements currently being executed.
• ferret (Ferret Utility) - used to define the scope of, display parameters of, and execute an action.
• filer (Filer Utility) - used to find and correct problems in the file system.
• gtwcontrol (Gateway Control) - used to change default values for the Gateway Control GDO.
• lokdisp (Lock Display) - used to display all real-time database locks and sessions in a snapshot capture.
• modmpplist (Modify MPP List) - used to modify the node list file.
• qryconfig (Query Configuration) - used to report the current Teradata Database configuration.
• qrysessn (Query Session) - used to monitor session state on selected logical host IDs.
• rcvmanager (Recovery Manager) - used to display information to allow monitoring of recovery progress.
• reconfig (Reconfiguration Utility) - used to establish an operational database with the component definitions created by the Configuration utility.
• reconfig_estimator (Reconfiguration Estimator) - used to estimate the elapsed time for reconfigurations and time estimates for redistribution, deletion, and NUSI building.
• schmon (Priority Scheduler) - used to create, modify, and monitor Teradata Database process prioritization parameters.
• showlocks (Show Locks) - used to display locks placed by Archive and Recovery and Table Rebuild operations.
• sysinit (System Initializer) - used to initialize the database.
• rebuild (Table Rebuild) - used to rebuild tables that cannot be automatically recovered by the database.
• tdlocaledef (Tdlocaledef Utility) - used to convert an SDF into a GDO.
• tdnstat (TDN Statistics) - used to perform GetStat/ResetStat operations and display statistics.
• tdntune (TDN Tuner) - used to display and change tunable parameters for Teradata Network Services.
• tpareset (Tpareset) - used to reset the PDE and database components.
• tpccons (Two-Phase Commit Console) - used to perform 2PC related functions.
• tsklist (Task List) - used to display information about PDE processes.
• updatedbc (Update DBC) - used to recalculate the PermSpace and SpoolSpace values in the DBASE table.
• updatespace (Update Space) - used to recalculate the permanent, temporary, or spool space used by database(s).
• verify_pdisks (Verify_pdisks) - used to verify pdisks are accessible and correctly mapped.
• vprocmanager (Vproc Manager) - used to manage vprocs.

8.4 User and Security Management

8.4.1 Databases and Users
Databases and users are uniquely named permanent space. Databases are logical repositories for database objects, privilege definitions, and space limits. They store objects which use space and utilize objects which do not take space but have definitions in the Data Dictionary which use space. Each database and user may optionally contain one permanent journal. Users are similar to databases, but can perform actions. Users will have passwords and startup strings.

To create a new database or user, the CREATE DATABASE or CREATE USER statement can be used. The required privileges must be explicitly granted first. As databases and users are created, a hierarchy is created. At the top of the hierarchy is DBC, the parent of any subsequent databases and users.

The DROP DATABASE and DROP USER statements can be used to remove a specified database or user. In order for the process to be complete, all objects contained in the database or user must be deleted. Any journal table associated with the database or user requires all data table references to it to be removed and the journal itself removed using the DROP DEFAULT JOURNAL TABLE option in a MODIFY DATABASE or MODIFY USER statement.

8.4.2 User Types
Several types of users can be created:
• Permanent database user - product of the CREATE USER statement, has PERM space on disk, and directly logs into the Teradata Database.
• Externally authenticated database user - granted access with a NULL password, with authentication provided externally.
• Directory user - located on a directory server and not the Teradata Database; externally authorized without permanent space allotted, requiring a mapping to a permanent user or to a generic user supplied in Teradata Database called EXTUSER.
• Trusted user - a permanent user granted the privilege to assume a proxy user identity.
• Proxy user - uses a session of the trusted user to access the Teradata Database; can be a permanent user or an application proxy user.

8.4.3 Creating Users
To create a new user:
• Identify the job function of the user to determine the appropriate roles and profiles to assign to the user.
• Resolve any ownership issues.
• Identify the implications for available permanent, temporary, and spool space.
• Define appropriate authorization checks and validation procedures.
• Ensure LOGON, GRANT, REVOKE, and other account and access-related activity is audited.
• Map directory users to database users, if required.

The CREATE USER statement is used to add users to the system. In addition to defining a unique username, the statement requires a password and permanent space to be defined. The following defaults can be optionally defined:
• Default database - used during a session; provides space, permanent or temporary, to store or search new or target objects.
• Default role - users must have an active role, and it must be specified if known and provided.

The default values of the CREATE USER statement, if not defined by the DDL, are:
• FROM database - default database of the creating user.
• SPOOL - If the profile is assigned to the user and has a SPOOL value, the default is the defined limit in the profile. If the profile is assigned but does not have a SPOOL value, the value used is the same SPOOL value as the immediate owner. If the profile is not assigned, the value used is the same SPOOL value as the immediate owner.
• TEMPORARY - If the profile is assigned to the user and has a TEMPORARY value, the value is the defined limit from the profile. If the profile is assigned but does not have a TEMPORARY value, the default is the same TEMPORARY value as the immediate owner of the space. If the profile is not assigned, the default is the same TEMPORARY value as the immediate owner of the space.
• STARTUP - The default is null. The user can enter a startup string during logon.
• ACCOUNT - If the profile is assigned and more than one account exists, the default is the first account in the string. If the profile is assigned and a single account exists, the default is the account defined in the profile. If the profile is assigned and no accounts are defined, the default is the account identifier of the immediate owner of the user. If the profile is not assigned, the default is the account identifier of the immediate owner of the user.
• DEFAULT DATABASE - typically the username, but a SET SESSION DATABASE statement can be submitted to change the default for the current session.
• DEFAULT ROLE - Unless a SET ROLE statement is submitted, no role is used in the privilege validation process.
• PROFILE - no profile exists for the user.

If the user attempts to log into the Teradata Database without a password, the attempt is rejected as incomplete. Null passwords can only be used when external authentication is employed. Temporary passwords can be specified when a user is first created. New passwords can be prompted upon first login if installing a new system or through the enforcement of password requirements.

8.4.4 Roles
Roles can be used to manage user privileges. Privileges are assigned to a specific role and roles are assigned to users. Users can then access all objects associated with the assigned role. Roles can reduce the number of rows added to and deleted from the DBC.AccessRights table; therefore, performance can be improved. When using roles to manage privileges, the following considerations should be taken:
• Different roles should be created for different job functions or responsibilities.
• Specific privileges on database objects for each role must be granted.
• Privileges to objects for each role must be specified.
• Roles can be granted to users or other roles.
• Default roles should be assigned to users.

A role is created using the CREATE ROLE statement. Once a role is created, creator privileges are automatically assigned. The DROP ROLE and WITH ADMIN OPTION privileges are provided during this creation. When given the WITH ADMIN OPTION, creator privileges allow the user to:
• Drop any role created by the user.
• Grant any role created by the user to other users and roles.
• Revoke any role granted.
• Grant a role with the WITH ADMIN OPTION to another user.

The assigned default role will be activated when the user logs on to the system. To change or nullify the current role, a SET ROLE statement can be submitted.
The following rules apply to roles:
• One or more roles can be granted to one or more users or roles. A role can have many users; a user can have many roles.
• Single-level nesting is allowed: a role that has a member role cannot also be a member of another role.
• Privileges granted to a role are inherited by every user or role member of the role.
• Any privilege granted to an existing role will affect any user or role that is specified as a recipient of the GRANT statement.
• The current role of a session may be set by the user to ALL. The SET ROLE ALL statement allows all roles to be available within a session.
• The MODIFY USER...AS DEFAULT ROLE ALL statement requires no roles to be granted by the user.
• The CREATE USER...AS DEFAULT ROLE ALL statement requires no roles to be granted by the creator.

The following privileges cannot be granted to roles:
• CREATE ROLE
• DROP ROLE
• CREATE PROFILE
• DROP PROFILE
• CREATE USER
• DROP USER
• CTCONTROL
• REPLCONTROL

The GRANT OPTION cannot be granted to roles, preventing any members of a role from granting the privileges it contains to other users or roles.
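The role rules above (many-to-many grants, inheritance of role privileges, single-level nesting, and a current role of ALL) can be modelled with a small in-memory sketch. The RoleCatalog class and its method names are hypothetical and only illustrate the rules; they do not correspond to Teradata structures or statements.

```python
class RoleCatalog:
    """Toy model of role grants with single-level nesting."""

    def __init__(self):
        self.privileges = {}  # role -> privileges granted to the role
        self.granted_to = {}  # grantee (user or role) -> roles granted to it

    def create_role(self, role):
        self.privileges[role] = set()

    def grant_privilege(self, privilege, role):
        self.privileges[role].add(privilege)

    def _has_members(self, role):
        # True if `role` has already been granted to some other role.
        return any(role in roles
                   for grantee, roles in self.granted_to.items()
                   if grantee in self.privileges)

    def grant_role(self, role, grantee):
        if grantee in self.privileges:  # role-to-role grant
            role_is_member = bool(self.granted_to.get(role))
            if role_is_member or self._has_members(grantee):
                raise ValueError("only single-level role nesting is allowed")
        self.granted_to.setdefault(grantee, set()).add(role)

    def effective_privileges(self, user, current_role="ALL"):
        roles = self.granted_to.get(user, set())
        active = roles if current_role == "ALL" else roles & {current_role}
        result = set()
        for role in active:
            result |= self.privileges[role]
            # A role inherits the privileges of the roles granted to it.
            for nested in self.granted_to.get(role, set()):
                result |= self.privileges[nested]
        return result
```

For example, if a reader role holding SELECT is granted to an analyst role holding INSERT, a user with the analyst role receives both privileges, while a later attempt to grant analyst to a third role is refused because analyst is already a member of reader.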
8.4.5 Profiles
Profiles are used to simplify the management of user settings. Profiles are assigned to a group of users to ensure all members of the group operate under the same parameters. Profiles allow several attributes to be managed, including:
• Password settings
• Account strings
• Default database assignments
• Spool limits
• Temporary space limits

Profile definitions are assigned to users and will override specifications at the system or user level. Values in a profile will always take precedence over the values defined for the user. All users, except DBC, must be explicitly granted the CREATE PROFILE privileges. These privileges allow a profile to be created, dropped, granted, and implemented. External profiles can be assigned to directory users by the directory administrator.

If profile definitions are not specified when a CREATE PROFILE statement is submitted, the following defaults can be used:
• Password attribute - default value is NULL.
• Account ID - derived from the default account ID of the user.
• Default database - the defined user setting.
• Spool space amount - the same SPOOL value as for the immediate owner of the space.
• Temporary space amount - the defined user setting.
• Performance group - Level M.
• Cost profile - the default system profile.

8.4.6 Privileges
The implementation of security in the database is largely done through the granting of privileges. Sometimes referred to as access rights, permissions, or authorization, privileges are applied to objects to ensure the user has the right to access or manipulate an object within the Teradata Database. Privileges are either explicit or implicit.

Implicit privileges are inherited through ownership of the object and cannot be revoked. Explicit privileges are granted to the user or database directly, through an assigned role, or to an external role. They are either granted to a user or database directly on objects by another user or granted automatically to a user or database on objects created. Explicit privileges can be revoked. A user can have both implicit and explicit privileges, though if they conflict, the explicit privileges will take precedence.

Privileges are maintained in the Data Dictionary, specifically the DBC.AccessRights table containing information on explicitly or automatically granted privileges. Explicit privileges are stored in one row for each user for each privilege on the database object. Rows in the DBC.AccessRights table are inserted or removed each time a CREATE/DROP or GRANT/REVOKE statement is submitted.

The following views can be used to report explicit privileges:
• DBC.AllRightsV - all explicit privileges granted.
• DBC.UserRightsV - any explicit privileges granted to the requesting user.
• DBC.UserGrantedRightsV - any explicit privileges granted by the requesting user to other users.
• DBC.AllRoleRightsV - all explicit privileges granted to each role.
• DBC.UserRoleRightsVX - all explicit privileges granted to each role for the requesting user.
• DBC.RoleMembersV - each role and every user or role to which the role has been granted.
• DBC.RoleMembersVX - all roles directly granted to the requesting user.

Privileges can be granted to a role, which is a collection of privileges on database objects. Privileges can be explicitly granted to PUBLIC. When working with an object, the system checks privileges in the following order:
• The DBC.AccessRights table is checked for the required privilege at the individual level.
• The DBC.AccessRights table is checked for the required privilege at the role level.
• The DBC.AccessRights table is checked for the required privilege for each nested role identified in the DBC.RoleGrants table for the user's current role.
• The system checks whether the required privilege is a PUBLIC privilege.
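The checking order above can be sketched as a lookup that stops at the first level where the privilege is found. The data structures are simplified, hypothetical stand-ins for the DBC.AccessRights and DBC.RoleGrants tables (a set of tuples and a dictionary), and find_privilege is an invented helper name.

```python
def find_privilege(user, privilege, obj, current_role, access_rights, role_grants):
    """Return the level at which the privilege is found, or None.

    access_rights: set of (grantee, privilege, object) tuples.
    role_grants:   dict mapping a role to the roles nested within it.
    """
    # 1. Individual level.
    if (user, privilege, obj) in access_rights:
        return "user"
    if current_role is not None:
        # 2. Role level, for the user's current role.
        if (current_role, privilege, obj) in access_rights:
            return "role"
        # 3. Each nested role of the current role.
        for nested in role_grants.get(current_role, ()):
            if (nested, privilege, obj) in access_rights:
                return "nested role"
    # 4. PUBLIC privileges.
    if ("PUBLIC", privilege, obj) in access_rights:
        return "PUBLIC"
    return None
```

A return value of None corresponds to the case where the required privilege is not found at any level.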
Implicit privileges are implied for an owner of the object or from the ownership hierarchy defined in the DBC table. Implicit privileges cannot be revoked and are not logged in DBC.AccessRights. These implicit privileges are valid for as long as the object is owned by the associated database or user.

If middle-tier applications are being used, proxy users must be associated with trusted users to allow the database to be accessed from an external source. When using a proxy connection, the system checks for privileges in the following way:
• The system will use the privileges of the proxy user and not the trusted user.
• If the proxy user is a permanent database user, the privileges used are from the permanent user, the proxy user's active roles, and PUBLIC.
• If the proxy user is an application proxy user, the privileges used are from the active roles of the proxy user and PUBLIC.
• The privileges of the proxy user will not be used for macros because the appropriate privileges must come from the immediately owning database.
• For stored procedures, either the privileges of the immediate owner or the privileges of the invoker of the procedure are used, depending on the SQL SECURITY clause: that is, if the stored procedure is created with the SQL SECURITY INVOKER clause, the proxy user privileges will be used.

If a CREATE USER/DATABASE statement is submitted, the following privileges are automatically assigned to the creator:
• ANY
• CHECKPOINT
• CREATE AUTHORIZATION
• CREATE DATABASE
• CREATE MACRO
• CREATE TABLE
• CREATE TRIGGER
• CREATE USER
• CREATE VIEW
• DELETE
• DROP AUTHORIZATION
• DROP DATABASE
• DROP FUNCTION
• DROP MACRO
• DROP PROCEDURE
• DROP TABLE
• DROP TRIGGER
• DROP USER
• DROP VIEW
• DUMP
• EXECUTE
• INSERT
• SELECT
• STATISTICS
• RESTORE
• UPDATE

When a user or database is created, it will receive the following privileges automatically:
• ANY
• CHECKPOINT
• CREATE AUTHORIZATION
• CREATE MACRO
• CREATE TABLE
• CREATE TRIGGER
• CREATE VIEW
• DELETE
• DROP AUTHORIZATION
• DROP FUNCTION
• DROP MACRO
• DROP PROCEDURE
• DROP TABLE
• DROP TRIGGER
• DROP VIEW
• DUMP
• EXECUTE
• INSERT
• SELECT
• STATISTICS
• RESTORE
• UPDATE

The following privileges must be explicitly granted to creators of a user or database:
• ALTER EXTERNAL PROCEDURE
• ALTER FUNCTION
• ALTER PROCEDURE
• CREATE EXTERNAL PROCEDURE
• CREATE FUNCTION
• CREATE PROCEDURE
• EXECUTE FUNCTION
• EXECUTE PROCEDURE
• SHOW
The following privileges must be explicitly granted to created users or databases:
• ALTER EXTERNAL PROCEDURE
• ALTER FUNCTION
• ALTER PROCEDURE
• CREATE DATABASE
• CREATE EXTERNAL PROCEDURE
• CREATE FUNCTION
• CREATE PROCEDURE
• CREATE USER
• DROP DATABASE
• DROP USER
• EXECUTE FUNCTION
• EXECUTE PROCEDURE
• SHOW

The user DBC, or a user with the following privileges, can grant the same privileges to another:
• Table level privileges
  o INDEX
  o REFERENCES
• GLOP Data privileges
  o CREATE GLOP
  o DROP GLOP
  o GLOP MEMBER
• Monitor privileges
  o ABORTSESSION
  o MONRESOURCE
AllRightsV DBC.UserRoleRightsV The column will record a two character code representing the privilege granted to the particular object referenced.AllRoleRightsV DBC.UserGrantedRightsV DBC.ALTER EXTERNAL PROCEDURE AF .ALTER PROCEDURE 163 .o o o • MONSESSION SETRESRATE SETSESSRATE System level privileges o o o o o o CREATE PROFILE CREATE ROLE DROP PROFILE DROP ROLE REPLCONTROL CTCONTROL • UDT privileges o o o UDTMETHOD UDTUSAGE UDTTYPE The following views have an AccessRight column: • • • • • DBC.UserRightsV DBC.ALTER FUNCTION AP . The codes are as follows: • • • AE .
• • • • • • • • • • • • • • • • • • • • • • • • • AS .CREATE TRIGGER CM .DELETE DA .DROP TRIGGER DM .CREATE AUTHORIZATION CD .CREATE FUNCTION CG .CREATE EXTERNAL PROCEDURE CF .CREATE USER CV .DROP USER DV .CREATE MACRO CO .CREATE ROLE CT .DROP TABLE DU .ABORT SESSION CA .DROP AUTHORIZATION DD .DROP ROLE DT .DROP PROFILE DP .CREATE TABLE CU .DROP MACRO DO .CREATE VIEW D .CREATE DATABASE CE .DROP FUNCTION DG .CHECKPOINT CR .DUMP DR .DROP DATABASE DF .CREATE PROFILE CP .DROP VIEW 164 .
INDEX MR .EXECUTIVE PROCEDURE R .SET SESSION RATE TH .REPLCONTROL RS .• • • • • • • • • • • • • • • • • • • • • • • • • E .GLOP MEMBER I .EXECUTE EF .RESTORE SA .RETRIEVE/SELECT RF .MONITOR SESSION NT .EXECUTE FUNCTION GC .UPDATE 165 .SET RESOURCE RATE SS .NONTEMPORAL OP .INSERT IX .MONITOR RESOURCE MS .SECURITY CONSTRAINT DEFINITION SH .SECURITY CONSTRAINT ASSIGNMENT SD .REFERENCE RO .DROP PROCEDURE PE .CTCONTROL U .SHOW SR .CREATE OWNER PROCEDURE PC .CREATE PROCEDURE PD .DROP GLOP GM .CREATE GLOP GD .
and privileges are granted or revoked. Some Data Dictionary tables will contain rows not distributed using hash maps and are stored AMP locally.AllTempTablesV .5. • Automatically updated as objects are created. Data Dictionary tables are updated whenever a data definition or data control (DCL) statement. and macros. The Data Dictionary tables are created when the Teradata software is installed through the DIPVIEWS utility. 8. views.Owners table. While some tables are referenced by SQL requests. including ownership hierarchy. or dropped. modified.provides information on all global temporary tables materialized in the system found in the DBC. and types.ChildrenV .5. explicit privileges.1 Data Dictionary The Data Dictionary contains information about the entire database and is comprised of tables. • DBC.provides information on hierarchical relationships in the DBC.2 Data Dictionary Views The following are views of the Data Dictionary: • DBC.UDT Method UT .UDT Usage 8.• • • UM . The Data Dictionary consists of tables. the other tables are used only for system or data recovery. Most Data Dictionary tables are fallback protected: a copy of every table row is maintained on different AMPs in the configuration and will provided automatic recovery.UDT Type UU . and macros stored in the user DBC: • Stores information about created objects.TempTables table. views. 166 .5 Object Maintenance 8. altered.
These views can be accessed using an EXPLAIN modifier preceding a DDL or DCL statement.provides information about tables. • DBC. The processing of the DDL or DCL statement is described.provides information on indexes on the tables found in the DBC. The type of locking used. been granted privileges on.ShowTblChecksV .TriggersTbl table. The only difference between X views and non-X views is the existence of a WHERE clause to ensure a user can view only those objects the user owns.• DBC. specialized actions attached to a single user as found in the DBC. • DBC. • DBC.provides information about database table constraint information found in the DBC. users.TablesV . • DBC. Some views are user-restricted: they are only applied to the user submitting the query acting upon the view. is associated with. • DBC.Dbase table.IndicesV . stored procedures. These views will only report a subset of the available information. triggers.provides information about databases. The Data Dictionary will store object names in Unicode to allow the same set of characters to be available regardless of the character set of the client.TableConstraints table.provides information about event-driven.USersV . These views are identified by an appended X to the system view name and are sometimes called X views. Unicode versions of view are identified by an appended V to the system view name. The operating conditions resulting from an EXPLAIN modifier is: • • • The statement is not executed. or assigned a role with privileges. macros.TriggersV .TVM table. 167 .DatabasesV . views. functions. Indexes table.Dbase table.provides information about event-driven. and replication status found in the DBC. and immediate parents found in the DBC. and on what objects. Most views will reference more than one table and access to Data Dictionary information can be limited. is described. specialized actions attached to a single table as found in the DBC.
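The two-character AccessRight codes listed earlier in this chapter are easier to read once translated back into privilege names. The Python sketch below shows one way a report script might do that. The mapping is only a subset copied from the code list, the row data is made up, and stripping trailing blanks assumes the column is a fixed-width two-character field.

```python
# Subset of the AccessRight codes from the list in this chapter.
ACCESS_RIGHT_CODES = {
    "R": "RETRIEVE/SELECT",
    "I": "INSERT",
    "U": "UPDATE",
    "D": "DELETE",
    "E": "EXECUTE",
    "CT": "CREATE TABLE",
    "DT": "DROP TABLE",
    "IX": "INDEX",
    "RF": "REFERENCE",
    "SH": "SHOW",
}


def describe_rights(rows):
    """Translate (object_name, code) pairs into readable privilege names.

    Codes are stripped first because a one-letter code such as 'R' may
    come back blank-padded (assumption: fixed-width column).
    """
    described = []
    for object_name, code in rows:
        key = code.strip()
        described.append((object_name, ACCESS_RIGHT_CODES.get(key, "UNKNOWN (" + key + ")")))
    return described


# Hypothetical rows, shaped like a result set from one of the rights views.
sample = [("Sales.Orders", "R "), ("Sales.Orders", "IX"), ("Sales.Orders", "ZZ")]
for object_name, privilege in describe_rights(sample):
    print(object_name, "->", privilege)
```

Unknown codes are reported rather than dropped, so gaps in the lookup table show up during review.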
8.6 Capacity Management
Disk space is managed in bytes. There are three types of space:
• Permanent (PERM) - allocated to a user or database as a uniquely defined, logical repository for database objects.
• Spool - stores intermediary query results or formatted answer sets and volatile tables.
• Temporary (TEMP) - used to store data for global temporary tables.

8.6.1 Ownership
The privileges for owner and creator are different and determine the default settings for undefined parameters. The creator of an object is the user submitting the CREATE statement. If a CREATE statement is executed to create an object, the user executing the statement is the creator of the object, but not necessarily the owner or immediate owner of the object. The owner of the object is the database or user just above the object in the hierarchy. If the object is directly below a user or database in the hierarchy, then the object is immediately owned by that user or database. The immediate owner is sometimes referred to as the parent. The parent of a new user or database is the database space where the user or database resides. The default database is the database of the creating user.

If the object is created in the user's database by the user, then the user is both creator and the immediate owner. If a user creates an object in the space of another user or database, the user is the creator of the object but the immediate owner of the object is the other user or database. If a user who owns a second user creates an object in the second user's database, the second user is the immediate owner: the first user is an owner, but not an immediate owner.

Ownership can be transferred from one immediate owner to another for databases and users. To perform this transfer, the GIVE statement is used. The GIVE statement will, in reality, transfer permanent space to the specified recipient. In addition to the specified object, all child objects are transferred. The rules for transferring ownership are as follows:
• Databases and users can only be transferred using the GIVE statement.
• An object cannot be given to its own children.
• The GIVE statement will transfer the permanent space owned by the transferring database or user.
• The explicit DROP DATABASE and DROP USER privileges on the transferred object and the receiving object are required to use the GIVE statement.
• Explicit privileges granted to others on a transferred user are not automatically revoked: explicit privileges must be explicitly revoked as required.
• Implicit privileges are impacted by changed ownership, transferring to the new owner.

8.6.2 Space Limits
Permanent space limits can be set at the database or user level. They are not set at the table level. These limits set the maximum limit with the PERM parameter of a CREATE/MODIFY USER/DATABASE statement. The specified amount of permanent space for each user or database is divided by the number of AMPs in the configuration. Each AMP will record the result, which may not be exceeded on that AMP.

All available permanent space is allocated to the user DBC. The space is then allocated to other users created from the user DBC. As additional new databases and users are created under those users, the space is allocated from the immediate parent of the object being created. PERM space limits are deducted from the available space of the immediate owner of the database or user, resulting in a reduction of actual PERM space available. Unused space is dynamically allocated for temporary or spool space. If no unused space is available, new objects cannot be created until more space is acquired.

The DBC.DiskSpaceV view can be used to determine the amount of PERM space available for each AMP. For all AMPs on the system, the SUM aggregate can be used. The PERM space values tracked in the DBC.DiskSpaceV are:
• CURRENTPERM - the total number of bytes currently allocated to existing tables, index tables, subtables, stored procedures, triggers, and permanent journals.
• PEAKPERM - the largest number of bytes used to store data in a user or database since the last reset.
• MAXPERM - the maximum number of bytes available for storage of all data tables, index tables, subtables, stored procedures, triggers, and permanent journals.

A data block is a disk-resident structure. It contains one or more rows from the same table and acts as the physical I/O unit for the file system. Data blocks are stored in segments and grouped in cylinders. When the system inserts rows, permanent space is dynamically acquired by data blocks and cylinders. Space on cylinders is obtained from a pool of free cylinders.

A number of cylinders can be reserved for transactions requiring permanent space. The number of reserved cylinders is determined by the File System field in DBS Control called "Cylinders Saved for PERM." If a statement requires permanent space and one or more free cylinders exist, the statement will succeed. If the statement requires spool space and more free cylinders exist than the amount specified in Cylinders Saved for PERM, the statement succeeds. If the statement fails in any case, a disk full error is returned. If space required for a transaction is not available, the transaction will be aborted. It is possible to have a greater number of errors for requests requiring SPOOL space than those requiring PERM space.

The percentage of cylinder space left during load operations is called the Free Space Percent (FSP). This space is controlled at the global and table level using:
• FreeSpacePercent field of DBS Control - determines the percentage of space left unused on each cylinder.
• FREESPACE attribute in CREATE/ALTER TABLE statement - determines the percentage of cylinder space left on each cylinder when bulk loading the table.
• DEFAULT FREESPACE of an ALTER TABLE statement - resets the current free space percent for a table at a global or table level.
• FREESPACEPERCENT option in PACKDISK - the percentage of storage space that PACKDISK should leave unoccupied on cylinders.
• "Cylinders Saved for PERM" field of DBS Control - the number of cylinders saved for permanent data only.

Data blocks are segments of one or more rows from a single subtable. To set data block size limits, the global parameters PermDBSize, PermDBAllocUnit, and JournalDBSize are set in the DBS Control utility to determine the maximum size of permanent data blocks holding multiple rows. If the size of multi-row data blocks is exceeded by a row, this row will be placed in its own block. The DATABLOCKSIZE = n [BYTES/KBYTES/KILOBYTES] specification used in the CREATE/ALTER TABLE statement sets the maximum multi-row data block size for a table, up to 127.5 KB.

Teradata Database has the ability to merge small DBs into larger ones during full-table modify operations. In the long term, the number of DBs in a table is reduced, along with the number of I/Os. The frequency of merges, along with the size of resulting data blocks, can be controlled by setting the system-level MergeBlockRatio field in DBS Control or specifying the table-level MergeBlockRatio attribute in the CREATE/ALTER TABLE statement. This field/attribute allows adjustments to be made to ensure merged data blocks are a reasonable size. Proper settings should consider:
• The frequency of accessing or modifying certain tables.
• The amount of data being modified.
• The general size of blocks desired across certain tables.
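Two rules from this section lend themselves to a small worked example: the PERM limit is divided across the AMPs, and the "Cylinders Saved for PERM" reserve treats PERM and spool requests differently. The Python sketch below is a simplified model of those two rules as stated in the text. The function names and the use of integer division are assumptions made for illustration, not Teradata internals.

```python
def per_amp_perm_limit(perm_bytes, amp_count):
    """PERM limit recorded on each AMP: the total divided by the AMP count."""
    if amp_count <= 0:
        raise ValueError("amp_count must be positive")
    return perm_bytes // amp_count  # integer division is an assumption


def request_succeeds(kind, free_cylinders, cylinders_saved_for_perm):
    """Apply the free-cylinder rule described in the text.

    PERM requests succeed if at least one free cylinder exists.
    SPOOL requests succeed only if the number of free cylinders is
    greater than the number saved for PERM.
    """
    if kind == "PERM":
        return free_cylinders >= 1
    if kind == "SPOOL":
        return free_cylinders > cylinders_saved_for_perm
    raise ValueError("kind must be 'PERM' or 'SPOOL'")


# A 100 GB PERM limit on a 100-AMP system gives each AMP a 1 GB ceiling.
print(per_amp_perm_limit(100_000_000_000, 100))

# With 5 free cylinders and 10 saved for PERM, only the PERM request succeeds.
print(request_succeeds("PERM", 5, 10), request_succeeds("SPOOL", 5, 10))
```

This also illustrates why spool requests can see more disk full errors than PERM requests: the reserve is off limits to them. Because the limit is enforced per AMP, a table whose rows are unevenly distributed can hit the ceiling on one AMP while the system as a whole still has free space.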
8.6.3 Spool Space
Spool space is used by the system for response rows of every query run by a user during a session and for intermediate spool tables, used to hold intermediate rows during query execution. The spool space is drawn dynamically from unused system perm space. By default, all unused permanent space in the Teradata Database system is available for use as spool space: a spool reserve database can be used to reserve a minimum amount of spool that cannot be used for storage.

The types of spool space available are:
• Volatile Spool - retained until a transaction is complete, a manual drop of a table during a session, the termination of the session, or a Teradata Database reset.
• Intermediate Spool - retained until intermediate spool results are no longer required by the query or the Teradata Database resets.
• Output Spool - retained until the response rows are returned in the answer set for a query or the rows updated within, inserted into, or deleted from a base table.

Spool space limits can be set for a database, user, or a profile for a user, but not a table. By limiting spool space, the impact of performing bad queries can be reduced.

A system with no fallback, no compression, and a medium workload is recommended to have a required spool for the entire system of 30 percent of the MAXPERM, or 40 percent of the CURRENTPERM size. Fallback, batch loads, and well-designed queries require less spool size. Highly compressed data, real-time loads, high concurrency, and large intermediate spools require more spool space.

The maximum and default limits for databases, users, and profiles are:
• If SPOOL is specified in a CREATE/MODIFY USER/DATABASE statement and a profile does not apply, the limit may not exceed the limit of the immediate owner of the user or database.
• If SPOOL is specified in a CREATE/MODIFY USER/DATABASE statement and a profile does apply, the limit may not exceed the limit of the user submitting the statement and is declared in the profile, the statement, or the limit set for the immediate owner (if no spool is defined).
• If SPOOL is not specified in a CREATE/MODIFY USER/DATABASE statement and a profile does not apply, the limit is inherited from the specification of the immediate owner.
• If SPOOL is not specified in a CREATE/MODIFY USER/DATABASE statement and a profile does apply, the limit is inherited from the profile specification.
• If SPOOL is not specified in a CREATE/MODIFY USER/DATABASE statement and a profile applies but the parameter is NULL or NONE, the limit is from the user submitting the statement and declared in the profile (if one exists), the specification from the statement, or the specification set for the immediate owner (if no spool is defined).

Teradata Viewpoint can be used to view spool trends for each database or user. Available permanent space can be dynamically allocated as spool when required. Reserving permanent space for spool requirements will ensure transaction processing is not impacted. At minimum, the amount reserved should not exceed 40 percent of space relative to CURRENTPERM. To create a spool reserve database, the CREATE DATABASE statement is submitted and the amount of space to reserve is specified in the PERM parameter. No objects should be created in this database, and no data should be stored in it. This ensures the database is available for use as spool space.

The following settings can be assigned to spool space limits when using the CREATE USER, CREATE DATABASE, or CREATE PROFILE statements:
• MAXSPOOL - limits the number of bytes allocated to create spool files for a user.
• CURRENTSPOOL - defines the number of bytes for resolving queries.
• PEAKSPOOL - defines the maximum number of bytes used by a transaction for a user since the last reset.
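The spool sizing guideline and the spool reserve database described in this section can be turned into a quick calculation. The Python sketch below applies the 30 percent of MAXPERM and 40 percent of CURRENTPERM figures from the text, then formats a CREATE DATABASE statement for the reserve. The database name Spool_Reserve and the owner DBC are illustrative assumptions, and the sizes are made-up inputs.

```python
def spool_estimates(max_perm_bytes, current_perm_bytes):
    """Return both guideline figures for a no-fallback, no-compression,
    medium-workload system: 30% of MAXPERM and 40% of CURRENTPERM."""
    return {
        "from_maxperm": max_perm_bytes * 30 // 100,
        "from_currentperm": current_perm_bytes * 40 // 100,
    }


def spool_reserve_ddl(perm_bytes, name="Spool_Reserve", owner="DBC"):
    """Build a CREATE DATABASE statement that sets aside PERM space.

    The database must stay empty so its unused PERM space remains
    available as spool. Name and owner are hypothetical defaults.
    """
    return "CREATE DATABASE {0} FROM {1} AS PERM = {2};".format(name, owner, perm_bytes)


# Example: 10 TB MAXPERM, 6 TB CURRENTPERM (illustrative figures only).
estimates = spool_estimates(10_000_000_000_000, 6_000_000_000_000)
print(estimates)
print(spool_reserve_ddl(estimates["from_currentperm"]))
```

Both figures are shown so they can be compared. Which one to reserve, and how far to adjust it for fallback, compression, or concurrency, remains a judgment call for the administrator.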
8.6.4 Temporary Space
Temporary space holds rows of materialized global temporary tables. It is allocated at the database or user level, but not at the table level. The TEMPORARY parameter within the CREATE/MODIFY PROFILE or CREATE/MODIFY USER/DATABASE statements is used to define a temporary space limit. The maximum and default limits for temporary space allocation follow these rules:
• If TEMPORARY is specified in a CREATE/MODIFY USER/DATABASE statement and a profile does not apply, the limit is inherited from and does not exceed the limit of the immediate owner.
• If TEMPORARY is specified in a CREATE/MODIFY USER/DATABASE statement and a profile does apply, the limit is inherited from the submitting user, cannot exceed the user's limit, and is declared by the profile limit (if one exists), the limit specified in the statement, or the limit of the immediate owner (if no TEMPORARY is defined for the user).
• If TEMPORARY is not specified in a CREATE/MODIFY USER/DATABASE statement and a profile does not apply, the limit is inherited from the immediate owner of the user.
• If TEMPORARY is not specified in a CREATE/MODIFY USER/DATABASE statement and a profile does apply, the limit is inherited from the profile specification.
• If TEMPORARY is not specified in a CREATE/MODIFY USER/DATABASE statement and a profile does apply with a TEMPORARY parameter of NULL or NONE, the limit is inherited from the submitting user and declared by the profile limit (if one exists), the limit specified in the statement, or the limit of the immediate owner (if no TEMPORARY is defined for the user).

The different types of temporary space include:
• CURRENTTEMP - the amount of space currently used by Global Temporary Tables.
• PEAKTEMP - the maximum temporary space used since the last session.
• MAXTEMP - limits the space available for global temporary table rows.
8.6.5 Data Compression
Disk space usage can be reduced using data compression. The most common forms of compression are:
• Multi-Value Compression
• Algorithmic Compression
• Hash/Join Index Row Compression
• Block Level Compression

Within the CREATE/ALTER TABLE statement, the COMPRESS phrase is used to specify a list of frequently occurring values for compression. The phrase is associated with the columns containing the values. For data that matches a value specified in the list, the database will store the value only once in the table header and a smaller substitute value for each affected row. Multi-Value Compression (MVC) as described above has the best cost/benefit ratio of any compression method, requiring minimal resources to decompress the data and having little impact on query/load performance. Despite these benefits, the number of MVC values listed in the statement is limited:
• A maximum of 255 values per column.
• Allowable storage per column for uncompressed values up to:
o 7800 bytes per column of BYTE, KANJI1, and KANJISJIS data.
o 7800 characters per column for GRAPHIC, LATIN, or UNICODE data.
• 1 MB capacity of the table header for storing a list of values being compressed, minus whatever space is needed to store other table-related information.

Standard algorithmic compression (ALC) algorithms are available for Teradata Database, along with a framework for creating custom algorithms for compression and decompression. When data is moved into a specified table, the COMPRESS algorithm is invoked. The DECOMPRESS algorithm is invoked when the data is accessed. ALC and MVC can be used concurrently on the same column. The system will not apply ALC to any value covered by MVC. ALC is most effective when column values are unique. The type of algorithm can yield different performance benefits. ALC use is limited to certain types of table and data. Compression algorithms are found in the form of UDFs, which are used to compress character data by table columns. The different types of Teradata UDFs are:
• CAMSET and CAMSET_L - compresses Unicode or Latin (respectively) character set data using a proprietary compression algorithm from Teradata.
• DECAMSET and DECAMSET_L - decompresses character data compressed by the CAMSET or CAMSET_L algorithm.
• LZCOMP and LZCOMP_L - compresses Unicode or Latin (respectively) character data using the ZLIB compression library, based on the Lempel-Ziv algorithm.
• LZDECOMP and LZDECOMP_L - decompresses character data compressed by the LZCOMP or LZCOMP_L algorithm.
• TRANSUNICODETOUTF8 - compresses Unicode into UTF8 format.
• TRANSUTF8TOUNICODE - decompresses the Unicode data compressed by TRANSUNICODETOUTF8.

Hash/Join Index row compression will divide rows into repeating portions and non-repeating portions. Multiple sets of non-repeating column values are appended to a single set of repeating column values. The result is that the repeating value set is stored once. The non-repeating column values are stored as logical segmental extensions of the base repeating set.
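The row compression idea just described, storing each repeating set of column values once and attaching the non-repeating values to it, can be pictured as a grouping step. The Python sketch below is only a conceptual model with made-up column names. It does not reflect how Teradata physically stores index rows.

```python
def row_compress(rows, repeating_cols):
    """Group rows so each distinct set of repeating values is stored once.

    rows: list of dicts, all with the same keys.
    repeating_cols: names of the columns treated as the repeating portion.
    Returns a dict mapping the repeating values to the list of
    non-repeating value tuples appended to them.
    """
    compressed = {}
    for row in rows:
        repeating = tuple(row[col] for col in repeating_cols)
        non_repeating = tuple(
            value for col, value in row.items() if col not in repeating_cols
        )
        compressed.setdefault(repeating, []).append(non_repeating)
    return compressed


# Hypothetical index rows: customer and region repeat, order_id does not.
rows = [
    {"cust": "A", "region": "East", "order_id": 1},
    {"cust": "A", "region": "East", "order_id": 2},
    {"cust": "B", "region": "West", "order_id": 3},
]
for repeating, extensions in row_compress(rows, ("cust", "region")).items():
    print(repeating, extensions)
```

In this toy example the three input rows collapse to two stored repeating sets, which is where the space saving comes from.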
A pointer exists from the non-repeating column set to each repeating column set. Row compression is the default for hash indexes, but for join indexes, row compression is specified in the SELECT clause of a CREATE JOIN INDEX statement. The SELECT clause allows specific join index columns to be listed separately to identify which will be subject to row compression or not.

Block Level Compression (BLC) stores data blocks in compressed format. Data blocks are physical units of I/O defining how the Teradata Database file system handles data. The Lempel-Ziv algorithm is used for all BLC through the ZLIB library. BLC should be applied to large tables only to balance space benefits with CPU cost. Secondary index subtables are not compressed by BLC. Once a table is enabled for BLC, the system will default to applying BLC to all subsequent updates to the table, unless BLC is:
• Disabled globally by a DBS Control field.
• Disabled for a particular type by a DBS Control field.
• Disabled for a particular table by the FERRET utility UNCOMPRESS command.
• Not meeting DBS Control compression parameters.

The maximum multi-row data block size can be defined using the DATABLOCKSIZE option in the CREATE/ALTER TABLE statement. The maximum setting is 255 sectors and its specification will minimize the data block percentage required for compress logic.

The following are DBS Control fields:
• BlockLevelCompression - enables or disables BLC globally.
• CompressionAlgorithm - specifies the algorithm used to compress DB data.
• CompressionLevel - determines the extent to which compression operations will favor processing speed or degree of data compression.
• CompressCLOBTableDBs - specifies conditions for compressing CLOB subtables.
• CompressGlobalTempDBs - specifies conditions for compressing global temporary table DBs.
• CompressMloadWorkDBs - specifies conditions for compressing data blocks from multiload sort worktables and index maintenance worktables.
• CompressPermDBs - specifies conditions for compressing permanent table DBs.
• CompressPJDBs - specifies if compression is applied to all permanent journal DBs.
• CompressSpoolDBs - specifies conditions for compressing spool DBs.
• MinDBSectsToCompress - identifies the minimum size of DBs to be compressed.
• MinPercentCompReduction - identifies the minimum percentage by which the size of a DB must be reduced by compression.
• UncompressReservedSpace - specifies the minimum percentage of storage space that must remain available after compression.
8.7 Session Management

8.7.1 Session Modes
A session is a logical connection between an application and Teradata Database. A session begins when a Teradata Database accepts the username and password of a user. Sessions are identified by a unique number assigned by TDP. Two session modes are available: Teradata and ANSI. In Teradata session mode, when an error occurs the system rolls back the entire transaction and all locks are released. In ANSI mode, the system will roll back only the request causing an error and not the entire transaction.

Trusted sessions allow user identities and roles to be asserted without establishing a logon session for the user. With these sessions, end users can be authenticated and requests submitted on behalf of the end user. Trusted sessions are handled as follows:
• A session pool created by the middle-tier application will authenticate itself to a Teradata Database.
• The application end user is authenticated with the middle-tier application and requests a service requiring a query to Teradata Database.
• The active session identity and role is set by the middle-tier application for the application end user by submitting a SET QUERY_BAND statement with the PROXYUSER and PROXYROLE name-value pairs.
• The PROXYUSER's (application end user) connection privilege through the database connection is verified.
• Queries to Teradata Database are submitted by the middle-tier application on behalf of the PROXYUSER.
• Privileges for the query are verified by the Teradata Database based on the PROXYUSER's active roles.
• The PROXYUSER's identity is recorded by the Teradata Database in the Access Log and Database Query Log.

A GRANT CONNECT THROUGH statement is used to assert a proxy user identity through a trusted user and create a trusted session. A REVOKE CONNECT THROUGH statement removes this assertion. Both these statements are executed based on the CTCONTROL privilege.

Whether the PROXYUSER is set for the session or transaction will determine how the system handles operations. If set to session, the temporary and volatile tables and result sets for the PROXYUSER are discarded at the end of the proxy connection. If set to transaction, the PROXYUSER is switched to another PROXYUSER within a transaction and the temporary and volatile tables and result sets are not discarded.

8.7.2 Monitoring Tools
Information about a session can be obtained from these tools:
• Teradata Viewpoint - can provide information about queries running in the system for the purpose of identifying problem queries.
• PM/APIs - allows session information to be immediately accessed with the proper privileges in place: the MONITOR SESSION request or the MonitorSession function can be used to collect current status and activity information about the sessions.
• DBQL information - the submission of the BEGIN QUERY LOGGING statement will enable logging to allow monitoring users, accounts, and applications.
• DBC.AmpUsage view and ASE - a single row is generated by ASE into DBC.AmpUsage for every AMP in every session: the rows are not written until after the query completes.

8.7.3 Accounts
An account and a session are always associated together. Typically, the session is associated with the default account of the user at logon, unless the logon or startup string explicitly specifies a different account. A SET SESSION ACCOUNT statement can be explicitly submitted to change the session to another account. Accounts manage workloads, providing a basis for billing, monitoring resource usage, and assigning work priorities. Account strings can be defined at the user level and the profile level. Multiple strings can be separated with commas and enclosed in parentheses.

Each account is provided an Account ID. This ID aids in improving query and workload analysis. The Account ID can be found in the DBC.Acctg table and the log entries in the DBQL tables. The names of account IDs take the following format:

$PG$WDID&S&D&H

Where:
• $PG$ - the performance group identifier
• WDID - the workload definition identifier
• &S - the session number
• &D - the date
• &H - the hour

When a CREATE/MODIFY DATABASE statement is processed, a row is inserted or updated in DBC.Accounts and DBC.Dbase. When a CREATE/MODIFY USER/PROFILE statement is processed, a row is inserted or updated in the DBC.Accounts and DBC.Dbase or DBC.Profiles.

The determination of the default account follows the rules below:
• If no account is defined for the user or a database, the default is the account of the immediate owner of the database.
• If multiple accounts are defined, the default is the first account in the definition string.
• If the user has no profile, the default is the first account in the user definition.
• If the user has a profile with multiple accounts, the default is the first account defined in the profile.
• If the user has a profile with no accounts, the default is the first account defined in the user or database definition.
• If no account is defined for the user without a profile assignment, the default is the account of the immediate owner of the user.
• If no account is defined for the user or members of a profile with a NULL account, the default is the first account defined in the user or database definition; otherwise, the account of the immediate owner of the user.
• If no account is defined for the user of a profile, the default is none for the profile.
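The default-account rules in this section reduce to an order of precedence: accounts in the profile first, then accounts in the user definition, then the immediate owner's account. The Python sketch below models that precedence as a simplified reading of the rules. The function name and the sample account strings are illustrative assumptions.

```python
def default_account(user_accounts, profile_accounts, owner_account):
    """Pick the default account for a session at logon.

    user_accounts:    accounts in the user definition, in declared order.
    profile_accounts: accounts in the user's profile, in declared order
                      (empty or None if there is no profile, or the
                      profile has a NULL account).
    owner_account:    account of the user's immediate owner.
    """
    if profile_accounts:          # profile accounts take precedence
        return profile_accounts[0]
    if user_accounts:             # otherwise first account of the user
        return user_accounts[0]
    return owner_account          # otherwise fall back to the owner


# Hypothetical account strings using the $PG$WDID&S&D&H layout.
print(default_account(["$M$WD1&S&D&H", "$L$WD2&S&D&H"], ["$H$WD9&S&D&H"], "$M$OWNER"))
print(default_account(["$M$WD1&S&D&H"], [], "$M$OWNER"))
print(default_account([], None, "$M$OWNER"))
```

In each case the first account in the winning list is chosen, which matches the "first account in the definition string" rule for multiple accounts.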
Administration of accounts is performed using the following two views:
• DBC.AccountInfoV - used to access the DBC.Dbase, DBC.Profiles, and DBC.Accounts dictionary tables to provide information on all valid accounts for all databases, users, and profiles.
• DBC.AMPUsage - used to view the usage of each AMP for each user and account, as well as the activities of any console utilities.

To monitor database access, the following session-related system views are used:
• DBC.SessionInfoV - used to identify the users currently logged on and other information, such as session source, current partition, collation, role, password status, transaction type, query band, and audit trail ID.
• DBC.LogOnOffV - used to identify the session and duration of user sessions.
• DBC.LogonRulesV - used to view the current logon rules.
• DBC.Software_Event_LogV - used to view system error messages for Teradata Database Field Engineers, system UDFs, and external stored procedures.

8.7.4 System Accounting
Three administrative functions are possible through system accounting:
• Charge-back Billing - allows cost allocations of system resources across all users. The user account string can be used to summarize resource usage based on the account ID.
• Capacity Planning - allows the collection and analysis of resource utilization information, used to understand the workload requirements of the system.
• Workload Management - through the Priority Scheduler, can manage user account priorities, as well as the assignment of Priority Scheduling codes and ASE codes.

8.7.5 Using Query Bands
Query bands are sets of name-value pairs assigned to a session or transaction. They identify the originating source of a query. The identifiers are defined and stored along with other session data in DBC.SessionTbl. To retrieve the query band, use the HELP SESSION statement, query the QueryBand field in the DBC.SessionInfo view, or use query logging (DBQL) or Teradata Viewpoint.
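Since a query band is just a set of name-value pairs, a middle-tier application typically assembles the SET QUERY_BAND statement as a string before submitting it. The Python sketch below builds such a statement and parses a pair string back into a dictionary. The pair names ApplicationName and PROXYUSER and their values are illustrative, and the helpers do no escaping of quotes or semicolons inside values, which a real application would need to handle.

```python
def build_query_band(pairs, scope="SESSION"):
    """Format name-value pairs as a SET QUERY_BAND statement.

    scope is SESSION or TRANSACTION, matching the two levels at which
    a query band can be assigned.
    """
    if scope not in ("SESSION", "TRANSACTION"):
        raise ValueError("scope must be SESSION or TRANSACTION")
    band = "".join("{0}={1};".format(name, value) for name, value in pairs.items())
    return "SET QUERY_BAND = '{0}' FOR {1};".format(band, scope)


def parse_query_band(band):
    """Split a 'name=value;name=value;' string into a dictionary."""
    parsed = {}
    for item in band.split(";"):
        if item.strip():
            name, _, value = item.partition("=")
            parsed[name.strip()] = value.strip()
    return parsed


statement = build_query_band({"ApplicationName": "Billing", "PROXYUSER": "jsmith"})
print(statement)
print(parse_query_band("ApplicationName=Billing;PROXYUSER=jsmith;"))
```

The same builder covers the trusted-session case, where the middle tier sets PROXYUSER and PROXYROLE pairs for the application end user.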
8.8 Business Continuity

8.8.1 Archive, Restore, and Recover
The ARC utility is invoked by Backup Application Software solutions to archive, restore, recover, and copy data. The ARC utility aids in:
• Archiving a database, individual database object, or selected partition to an archive file.
• Restoring a database, individual database object, or selected partition from an archive file.
• Restoring a copied database, table, or selected partitions from an archive file to the same or a different database.
• Copying a database, individual object, or selected partition to a different Teradata Database.
• Recovering a database to an arbitrary checkpoint using before- or after-change images in a permanent journal table.
• Deleting a change image row from a permanent journal table.

8.8.2 Required Privileges
Any object being processed for archive, restore, or copy operations must have specific privileges assigned. If the privileges are missing, ARC will overlook the object and continue with the operation. The following privileges must be explicitly granted:
• Archive operations - ARCHIVE/DUMP
• Restore operations - RESTORE
• Copy operations - RESTORE, CREATE TABLE/VIEW/MACRO/TRIGGER/PROCEDURE/FUNCTION
8.8.3 Session Control
A user must log on and start a session to use the ARC utility. The LOGON statement must specify the name of the Teradata machine connecting to the ARC utility and the username and password used. The user ID used to log on must have privileges for the ARC statements used. The LOGOFF statement ends all sessions logged on by the task and terminates the ARC utility.

Multiple sessions are possible or the default can be used. The SESSIONS runtime parameter is used to set the number of sessions. The optimal number of sessions for archive and restore operations is one session for each AMP. Each session is assigned to a vproc and stays with the vproc until all required data is archived. Complete database blocks are built from each vproc in archive attempts. Data blocks from different vprocs are not combined in the same archive block.

8.8.4 HUT Locks
HUT locks are created by the ARC utility and are applied to any ARC operation performed on an object. They only affect the AMPs participating in the ARC operations and are associated with the user performing the operation. Transaction locks are automatically released when a transaction completes; HUT locks must be explicitly released and will be reinstated automatically should the database reset. To release a HUT lock, the RELEASE LOCK option in an ARC command can be used or the RELEASE LOCK command in a job script can be executed. HUT locks are not used for Online Archive.
9 Practice Exam

9.1 Refresher “Warm up Questions”
The following multiple-choice questions are a refresher.

Question 1
What functional module is used by Teradata Database to control operations of the database environment?
A. Teradata Database Management Software
B. Teradata Gateway
C. Database Windows
D. Parallel Data Extension

Question 2
What type of journal is used to rollback failed transactions?
A. Transient
B. Permanent
C. Down AMP Recovery
D. Any of the above
Question 3
What Active System Management product allows sessions or transactions to be tagged with an ID?
A. Teradata Viewpoint
B. Query Bands
C. Resource Usage Monitor
D. Teradata Workload Analyzer

Question 4
When performing requirements analysis, which perspective will result in a system model?
A. Subcontractor
B. Owner
C. Builder
D. Designer

Question 5
Which level of normal form is also called the Boyce-Codd Normal Form?
A. 3NF
B. 1NF
C. 5NF
D. 2NF
Question 6
What is the typical size, in bits, of a row hash value?
A. 12
B. 16
C. 20
D. 32

Question 7
How many secondary, hash, and join indexes can a Teradata Database support?
A. 12
B. 16
C. 32
D. 64

Question 8
Which of the following data types can be defined in semantic constraints?
A. BLOB
B. Integer
C. UDT
D. Geospatial
Question 9
Which of the following methods will not use a literal NULL?
A. During a MERGE operation
B. As an explicit SELECT item
C. As a CASE result
D. As a default column definition specification

Question 10
Which of the following statements is part of the Data Control Language?
A. SELECT
B. CREATE
C. GIVE
D. ABORT

Question 11
Which of the following can lead to a SQL null?
A. Value does not exist
B. Value not valid
C. Value is an empty set
D. All of the above
Question 12
The logical equivalent for “for any” is signified by what term from formal logic and set theory?
A. Existential quantifier
B. Identity predicate
C. Universal quantifier
D. Predicate logic

Question 13
Which of the following integrity constraints is not considered a semantic constraint?
A. Column
B. Database
C. Physical
D. Table

Question 14
What is a defining characteristic of NoPI tables?
A. Updated using SQL UPDATE requests.
B. Temporal tables with a MULTISET table type.
C. A permanent journal with no primary index.
D. Primary index value is not used to hash rows to an AMP.
Question 15
Which of the following is a cause of semantic disintegrity?
A. Improper normalization
B. Dependencies
C. Relationship mapping
D. None of the above

Question 16
What logical operator represents the set of all attributes which are contained in both relations?
A. Join
B. Intersection
C. Union
D. Product

Question 17
Which of the following phrases best describes the architecture used by the Teradata Database?
A. Interconnected database
B. Intermodal database
C. BYNET database
D. Shared nothing database
Question 18
Which of the following statements will provide a write lock on a row hash?
A. CREATE DATABASE
B. SELECT
C. INSERT
D. ALTER TABLE

Question 19
Which of the following information items is not gathered on databases for management purposes?
A. Fallback tables
B. Indexes and constraints
C. Account name
D. Space allocation

Question 20
Which of the following are characteristics of tables?
A. Cardinality and relation
B. Row and column
C. Tuple and attribute
D. All of the above
Question 21
What is used to control access to resources in a database while that resource is in use?
A. Locks
B. Journals
C. Roles
D. Groups

Question 22
What are users called who access the database using a trusted session between the database and a middle-tier application?
A. Trusted User
B. Proxy User
C. Application User
D. Any of the Above

Question 23
Which of the following is not a dimension of the Zachman enterprise data model?
A. People
B. Entities
C. Locations
D. Policies
Question 24
Which level of normal form focuses on the elimination of circular dependencies?
A. 2NF
B. 5NF
C. 1NF
D. 3NF

Question 25
Which of the following is not a valid index type?
A. Single table aggregate join index
B. Non-unique secondary index hash-ordered on all columns with a not ALL option
C. Unique single-level non-partitioned primary index
D. Multitable sparse join index

Question 26
What is the size of the row length field of a subtable row?
A. 2 bytes
B. 8 bytes
C. 10 bytes
D. 12 bytes
Question 27
Which of the following column-level integrity constraints is not supported on temporal tables?
A. UNIQUE
B. FOREIGN KEY…REFERENCES
C. PRIMARY KEYS
D. CHECK

Question 28
Which principle for relational database management declares that all foreign key values must have a match?
A. Principle of Interchangeability
B. Entity Integrity Rule
C. Principles of Normalization
D. Referential Integrity Rule

Question 29
Which of the following is not a term used to describe the relevance of data?
A. Warm
B. Cool
C. Critical
D. Hot
Question 30
Which of the following components is not part of the space requirements for the table area?
A. User Spool Space
B. Cylinder size
C. Data dictionary
D. User TEMP space

Question 31
Which delimiter is used to separate statements in a multistatement request?
A. SEMICOLON
B. SOLIDUS
C. APOSTROPHE
D. COLON

Question 32
What is the size limit for the table header?
A. 2 KB
B. 5 KB
C. 1 MB
D. 2 MB
Question 33
The DBC Control utility can define the integrity levels used for physical constraints. Which predefined integrity level requires the most processing?
A. HIGH
B. MEDIUM
C. DEFAULT
D. ALL

Question 34
Which of the following rules applies to transferring compression values to join index columns?
A. Transfers to a definition will occur only when columns are a component of the primary index for the join index.
B. Transfers to a definition will occur as long as the maximum header length of the index is not exceeded.
C. Transfers to a definition will occur only when columns are components of a partitioning expression.
D. Transfers to a definition cannot occur if alias names are specified.
Question 35
How many columns can be supported by a primary index definition?
A. 16
B. 32
C. 64
D. 128

Question 36
Which of the following criteria is valid for selecting primary keys?
A. Use numeric attributes whenever possible
B. Always use intelligent keys
C. Select attributes that are likely to change
D. Attributes do not need to be unique

Question 37
What type of key used in the normalization process is encoded with more than one fact?
A. Natural key
B. Composite key
C. Candidate key
D. Intelligent key
Question 38
A data mart is a small subset of a data warehouse database. What type of data mart is virtually constructed from a physical database?
A. Data basement
B. Dependent data mart
C. Logical data mart
D. Independent data mart

Question 39
What is the most restrictive severity provided by a lock?
A. Exclusive
B. Write
C. Access
D. Read

Question 40
What type of table is used to stage data during FastLoad operations?
A. Permanent
B. NoPI
C. Global Temporary
D. Volatile
10 Answer Guide

10.1 Answers to Questions

Question 1
Answer: C
Reasoning: Database Windows is the functional module used to control operations within a Teradata Database.

Question 2
Answer: A
Reasoning: Transient journals are used to roll back failed transactions which are aborted by the user or system.

Question 3
Answer: B
Reasoning: Active System Management is a collection of products used to automate workload management, performance tuning, capacity planning, and performance monitoring. Teradata Viewpoint defines rules for filtering, throttling, and defining classes of workloads. Open APIs provide an SQL interface to PMPC. Query bands allow IDs to be tagged to sessions and transactions as defined by the user or middle-tier application. Resource Usage Monitor will collect relevant data. The Teradata Workload Analyzer is used to analyze DBQL data.

Question 4
Answer: D
Reasoning: The designer perspective transforms scope and requirements into a product specification. The result of the effort is a system model.
Question 5
Answer: A
Reasoning: The BCNF is a stricter form of 3NF which eliminates nonkey attributes not describing the primary key.

Question 6
Answer: D
Reasoning: A row hash value is 32 bits in size. The value contains either a 16-bit hash bucket number with a 16-bit remainder or a 20-bit hash bucket number with a 12-bit remainder.

Question 7
Answer: C
Reasoning: A Teradata Database can support up to 32 secondary, hash, and join indexes.

Question 8
Answer: B
Reasoning: Semantic constraints cannot be defined with BLOB, CLOB, UDT, Period, or Geospatial data types.

Question 9
Answer: A
Reasoning: A literal NULL can be used in an INSERT or UPDATE operation, but not a MERGE operation.
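The row hash layout given in the reasoning for Question 6 can be illustrated with a little bit arithmetic. The sketch below is plain Python, not Teradata code, and it assumes that the hash bucket number occupies the high-order bits of the 32-bit value with the remainder in the low-order bits. The function name split_row_hash and the sample value are hypothetical.

```python
def split_row_hash(row_hash, bucket_bits=16):
    """Split a 32-bit row hash into (hash bucket number, remainder).

    Assumption: the bucket number is stored in the high-order bits.
    bucket_bits is 16 (16-bit remainder) or 20 (12-bit remainder).
    """
    if bucket_bits not in (16, 20):
        raise ValueError("bucket_bits must be 16 or 20")
    if not 0 <= row_hash < 2 ** 32:
        raise ValueError("row_hash must fit in 32 bits")
    remainder_bits = 32 - bucket_bits
    bucket = row_hash >> remainder_bits                   # high-order bits
    remainder = row_hash & ((1 << remainder_bits) - 1)    # low-order bits
    return bucket, remainder


# Arbitrary sample value used only to show both layouts.
value = 0xABCD1234
print([hex(part) for part in split_row_hash(value, 16)])
print([hex(part) for part in split_row_hash(value, 20)])
```

Either way the two parts always add up to 32 bits, which is why the answer to Question 6 does not depend on the bucket size in use.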
Question 10
Answer: C
Reasoning: The most common statements of the Data Control Language are GRANT/REVOKE, GRANT LOGON/REVOKE LOGON, and GIVE.

Question 11
Answer: D
Reasoning: SQL nulls can be used for a number of reasons to identify missing information, but they do not represent the missing value itself. The common reasons for a null are that the value is unknown, not applicable, does not exist, not defined, not supplied, not valid, or an empty set.

Question 12
Answer: A
Reasoning: Existential quantifier is the term used to logically identify “for some,” “for any,” and “there exists.”

Question 13
Answer: C
Reasoning: Integrity constraints for databases fall into two categories: semantic and physical. Semantic constraints are further divided into column-, table-, and database-level constraints.

Question 14
Answer: D
Reasoning: NoPI tables are nontemporal tables with no primary index and a table type of MULTISET. These tables cannot be specified as permanent journals, nor can they be updated using a SQL UPDATE request.
Question 15
Answer: A
Reasoning: Semantic disintegrity is a result of improper normalization or misunderstanding of the normalization.

Question 16
Answer: B
Reasoning: An intersection of two relations covers the attributes that are found in both relations and only within the intersection.

Question 17
Answer: D
Reasoning: The Teradata Database is a shared nothing database architecture. Within this architecture, memory or disk storage is not shared between the PE and AMP vprocs across CPUs, allowing each AMP to have exclusive control over its own virtual space.

Question 18
Answer: C
Reasoning: A write lock can be placed on a row hash using an UPDATE, DELETE, or INSERT statement.

Question 19
Answer: B
Reasoning: Indexes and constraints are information gathered on tables, not databases. Some of the general information stored about a database includes database name, creator name, owner name, account name, space allocation, number of fallback tables, collation type, creation time stamp, role and profile names, modifier, and revision numbers.
Question 20
Answer: D
Reasoning: Tables have two dimensions: tuples and attributes. Tuples are defined by a row which defines the cardinality of the relation. Attributes are defined by a column which defines the degree of the relation.

Question 21
Answer: A
Reasoning: Locks are used to control access to a resource. Different types of objects and their severities will ensure different levels of access.

Question 22
Answer: B
Reasoning: If a user accesses the database through a middle-tier application using a trusted session, they are commonly called proxy users.

Question 23
Answer: D
Reasoning: The dimensions used in the Zachman enterprise data model are entities, activities, locations, people, time, and motivation. Policies are not a component of the data model.

Question 24
Answer: B
Reasoning: Fifth normal form describes the decomposition of the database to eliminate any circular dependencies between data.
Question 25
Answer: C
Reasoning: Only partitioned indexes can be classified as either single-level or multilevel.

Question 26
Answer: A
Reasoning: The row length field in a subtable row is 2 bytes. The RowID is 8 bytes. The base table row ID is 8 bytes for NPPI tables and 10 bytes for PPI tables.

Question 27
Answer: B
Reasoning: FOREIGN KEY…REFERENCES constraints are defined on a single column and not implemented as an index. They are not supported on temporal tables.

Question 28
Answer: D
Reasoning: The Referential Integrity Rule declares that no unmatched foreign key values can exist.

Question 29
Answer: C
Reasoning: Data relevance is typically categorized as hot, warm, cool, and icy.
Question 30
Answer: B
Reasoning: The space requirements for the table area are generated by combining the requirements for the data dictionary, CRASHDUMPS user space, user TEMP space, and user spool space.

Question 31
Answer: A
Reasoning: Semicolons are used to separate statements and to terminate requests.

Question 32
Answer: C
Reasoning: A table header is limited to 1 MB in size.

Question 33
Answer: D
Reasoning: When the ALL keyword is used, a checksum is generated using 100% of the words in the disk sector, which requires the most processing to perform, maintain, and use.

Question 34
Answer: B
Reasoning: As long as the maximum header length is not exceeded, transfers to a multitable join index definition will continue. All other statements are opposite expressions of actual rules.
Question 35
Answer: C
Reasoning: Up to 64 columns can be supported within a primary index definition.

Question 36
Answer: A
Reasoning: The recommendations for selecting primary keys include selecting numeric attributes and system-assigned keys. Attributes should remain unique and rarely change. Intelligent keys should never be used.

Question 37
Answer: D
Reasoning: An intelligent key is a simple key which is overloaded with multiple facts.

Question 38
Answer: C
Reasoning: Logical data marts are logical constructs of the physical database.

Question 39
Answer: A
Reasoning: Locking severities consist of access, read, write, and exclusive, in order from the least restrictive to the most. An exclusive lock provides the requester with sole access to the locked resource.
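The ordering of locking severities in the reasoning for Question 39 can be modelled as an ordered enumeration. This is a minimal sketch in Python for study purposes only, not part of any Teradata interface. The class name LockSeverity, the function most_restrictive, and the numeric values are hypothetical; only the relative order (access, read, write, exclusive) comes from the text.

```python
from enum import IntEnum


class LockSeverity(IntEnum):
    """Lock severities, numbered from least to most restrictive."""
    ACCESS = 1
    READ = 2
    WRITE = 3
    EXCLUSIVE = 4


def most_restrictive(severities):
    """Return the most restrictive severity from a collection."""
    return max(severities)


requested = [LockSeverity.READ, LockSeverity.EXCLUSIVE, LockSeverity.ACCESS]
print(most_restrictive(requested).name)
print([severity.name for severity in sorted(requested)])
```

Using an integer-backed enumeration means ordinary comparisons such as LockSeverity.WRITE > LockSeverity.READ follow the same ranking the answer describes.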
Question 40
Answer: B
Reasoning: NoPI tables are permanent tables which have no primary index. They are used as a staging area to facilitate operations using FastLoad or TPump Array INSERT.