This self-study exam preparation guide for the Teradata 12 Certified Master certification exam contains everything you need to test yourself and pass the exam. All exam topics are covered, with insider secrets, complete explanations of all Teradata 12 Certified Master subjects, test tricks and tips, numerous highly realistic sample questions, and exercises designed to strengthen your understanding of Teradata 12 Certified Master concepts and prepare you for exam success on the first attempt. Put your knowledge and experience to the test. Achieve Teradata 12 Certified Master certification and accelerate your career.

Can you imagine valuing a book so much that you send the author a "Thank You" letter? Tens of thousands of people understand why this is a worldwide best-seller. Is it the author's years of experience? The endless hours of ongoing research? The interviews with those who failed the exam, to identify gaps in their knowledge? Or is it the razor-sharp focus on making sure you don't waste a single minute of your time studying any more than you absolutely have to? Actually, it's all of the above.

This book includes new exercises and sample questions never before in print. Offering numerous sample questions, critical time-saving tips, plus information available nowhere else, this book will help you pass the Teradata 12 Certified Master exam on your FIRST try.

Up to speed with the theory? Buy this. Read it. And pass the Teradata 12 Certified Master exam.
Certification Exam Preparation Course in a Book for Passing the
Certified Master
The How To Pass on Your First Try Certification Study Guide
Teradata 12 Certified Master Exam Preparation
This exam preparation book is intended for those preparing for the Teradata 12 Certified Master certification. This book is not a replacement for completing the course; it is a study aid to assist those who have completed an accredited course and are preparing for the exam. Do not underestimate the value of your own notes and study aids: the more you have, the more prepared you will be. While it is not possible to pre-empt every question and every piece of content that may be asked in the Teradata exams, this book covers the main concepts within the Database Management discipline. Due to licensing rights, we are unable to provide actual Teradata exams; however, the study notes and sample exam questions in this book will allow you to prepare for them more easily.
Ivanka Menken, Executive Director, The Art of Service
Write a review to receive any free eBook from our Catalog - $99 Value!
If you recently bought this book, we would love to hear from you! Benefit by receiving a free eBook from our catalog at http://www.emereo.org/ if you write a review about your last purchase on Amazon (or the online store where you purchased this book)!
How does it work?
To post a review on Amazon, just log in to your account and click on the Create your own review button (under Customer Reviews) on the relevant product page. You can find examples of product reviews on Amazon. If you purchased from another online store, simply follow their procedures.
What happens when I submit my review?
Once you have submitted your review, send us an email at firstname.lastname@example.org with the link to your review and the eBook you would like as our thank-you from http://www.emereo.org/. Pick any book you like from the catalog, up to $99 RRP. You will receive an email with your eBook as a download link. It is that simple!
Notice of Rights: All rights reserved. No part of this book may be reproduced or transmitted in any form by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.

Notice of Liability: The information in this book is distributed on an "As Is" basis without warranty. While every precaution has been taken in the preparation of the book, neither the author nor the publisher shall have any liability to any person or entity with respect to any loss or damage caused or alleged to be caused directly or indirectly by the instructions contained in this book or by the products described in it.

Trademarks: Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations appear as requested by the owner of the trademark. All other product names and services identified throughout this book are used in editorial fashion only and for the benefit of such companies, with no intention of infringement of the trademark. No such use, or the use of any trade name, is intended to convey endorsement or other affiliation with this book.
Contents

Foreword
Teradata 12 Certified Master
Exam Specifics
Teradata Products
    Teradata Solution
        Teradata Database
    Data Warehouse Concepts
    Relational Database Concepts
        Tables
        Views
        SQL Stored Procedures
        External Stored Procedures
        Macros
        Triggers
        User-Defined Functions
        User-Defined Methods
        User-Defined Types
        Databases and Users
        Data Dictionary Views
    Teradata RDBMS Components
        Platforms
        Virtual Processors
        Processing Requests
        Disk Arrays
        Cliques
        Hot Standby Nodes
        Parallel Database Extensions
        Workstations Types
        Teradata Database Window
    Database Requirements
        Fault Tolerance from Software
        Fault Tolerance from Hardware
    Redundant Array of Inexpensive Disks
    Client Communication
        Network Attachments
        Channel Attached Systems
    Data Availability
        Concurrency Control
        Transactions
        Locks
        Host Utility Locks
        Recovery
        Two-Phase Commit Protocol
    Teradata Tools and Utilities
        Data Archiving Utilities
        Load and Extract Utilities
        Access Modules
        Querying
        Session and Configuration Management
        Resource and Workload Management
    Security and Privacy
        Concepts of Security
        Users
        Database Privileges
        Authentication
        Logon Authorization
        Data Protection
        Security Monitoring
        Security Policy
Database Design
    Development Planning
    Design Considerations for Teradata Database
        Data Marts
        Data Warehousing
        Parallel Processing
        Usage Considerations
        ANSI/X3/SPARC Three Schema Architecture
    Design Phases
        Requirements Analysis
        Entity-Relationship Models
        Normalization Process
        Join Modeling
        Activity Transaction Modeling
    Indexing Process
        Primary Indexes
        Secondary Indexes
        Join Indexes
        Hash Indexes
    Integrity
        Set Theory
        Semantic Integrity
        Physical Integrity
        Database Principles
        Missing Values
    Capacity Planning
        Planning Considerations
        Database Sizes
        Estimating Space Requirements
Structured Query Language
    Overview
        SQL Statements
        SELECT Statements
        SQL Data Types
        Recursive Query
        SQL Functions
        Cursors
        Session Modes
        SQL Applications
        EXPLAIN Request Modifier
        Third-Party Development
    Database Objects
        Databases and Users
        Tables
        Columns
        Data Types
        User-Defined Types
        Keys
        Indexes
        Primary Index
        Secondary Index
        Join Index
        Hash Index
        Referential Integrity
        Views
        Triggers
        Macros
        Stored Procedures
        User-Defined Functions
        Profiles
        Roles
    SQL Syntax
        Statement Structure
        Keywords
        Literals
        Operators
        Functions
        Delimiters and Separators
        Default Database
    Functional Families
        Data Definition Language
        General DDL Statements
        Data Control Language
        Data Manipulation Language
        Query and Workload Analysis
Database Administration
    Physical Database Design
        Primary Indexes
        Secondary Indexes
        Join Indexes
        Hashing
        Identity Columns
        Normalization
        Referential Integrity
    Database Administration
        System Users
        Administrator User
        Administration Tools
        System Administration
        Session Management Utilities
    User and Security Management
        Databases and Users
        User Types
        Creating Users
        Roles
        Profiles
        Privileges
    Object Maintenance
    Data Dictionary
        Data Dictionary Views
    Capacity Management
        Ownership
        Space Limits
        Spool Space
        Temporary Space
        Data Compression
    Session Management
        Session Modes
        Monitoring Tools
        Accounts
        System Accounting
        Using Query Bands
Business Continuity
    Archive, Restore, and Recover
        Required Privileges
        Session Control
        HUT Locks
Practice Exam
    Refresher "Warm up Questions"
Answer Guide
    Answers to Questions
References
Index
3 Teradata 12 Certified Master

The Certified Master certification is part of the Teradata Certified Professional Program, covering the knowledge and skills required to install, manage, and operate Teradata systems. Each step in the program builds on the previous step, and the exams for each certification roughly correspond to the steps in the program. The entire list of available certifications in recommended order is as follows:

• Teradata 12 Certified Professional
• Teradata 12 Certified Technical Specialist
• Teradata 12 Certified Database Administrator
• Teradata 12 Certified Solutions Developer
• Teradata 12 Certified Enterprise Architect
• Teradata 12 Certified Master

For the Certified Master certification, the following exams must be passed in sequential order:

• TEO-121 – Teradata 12 Basics
• TEO-122 – Teradata 12 SQL
• TEO-123 – Teradata 12 Physical Design and Implementation
• TEO-124 – Teradata 12 Database Administration
• TEO-125 – Teradata 12 Solutions Development
• TEO-126 – Teradata 12 Enterprise Architecture
• TEO-127 – Teradata 12 Comprehensive Mastery

Each exam covers the following:

• TEO-121 – Teradata 12 Basics
    o Product Overview
    o Processing Types and Characteristics
    o Data Warehouse Architectures
    o Relational Database Concepts
    o Teradata RDBMS Components and Architecture
    o Database Managed Storage
    o Data Access Mechanics
    o Data Availability Features
    o Teradata Tools and Utilities
    o Workload Management
    o Security and Privacy

• TEO-122 – Teradata 12 SQL
    o Teradata Extensions
    o Data Definition Language
    o Data Manipulation Language
    o Data Control Language
    o Views and Macros
    o Logical and Conditional Expressions
    o Data Conversions and Computations
    o CASE Expressions
    o Subqueries and Correlated Subqueries
    o Joins
    o Attribute and String Functions
    o Set Operations
    o Analytical Functions
    o Time/Date/Timestamp/Intervals
    o Stored Procedures Concepts
    o Aggregations
    o SQL Optimization Concepts
    o Advanced SQL Concepts

• TEO-123 – Teradata 12 Physical Design and Implementation
    o Physical Database Design
    o Table Attributes
    o Column Attributes
    o Statistics
    o Primary Indexes
    o Secondary Indexes
    o Transaction Isolation
    o Physical Database Operations
    o Teradata Query Analysis
    o Database Space Management

• TEO-124 – Teradata 12 Database Administration
    o System Software Setup and Parameters
    o User and Security Management
    o Session Management
    o Load and Extract
    o System Administration Tools
    o System Workload Analysis and Management
    o Performance Optimization
    o Capacity Management and Planning
    o Business Continuity
    o Object Maintenance

• TEO-125 – Teradata 12 Solutions Development
    o Development Process
    o Development Considerations
    o Development Planning
    o Development Strategies
    o Optimization

• TEO-126 – Teradata 12 Enterprise Architecture
    o System Planning and Space Management
    o Optimization
    o Data Integration
    o Data Protection
    o Data Governance
    o Information Delivery Strategies

• TEO-127 – Teradata 12 Comprehensive Mastery
    o Workload Management
    o Performance Management and Query Optimization
    o Database Design
    o Configuration Management and Capacity Planning
    o Data Availability and Security
    o Application Integration and Optimization
    o Data Management and Integration
4 Exam Specifics

Teradata exams are proctored by Prometric and conducted at testing centers. Scheduling and locations of test sites can be obtained at www.prometric.com/teradata. Two valid forms of ID are required when arriving at the center. Exams are delivered in a secure environment, proctored, and timed. Specifics about the exams include:

• Exam Number: TEO-121 – TEO-127
• Time Limit: 2 to 3 hours (based on exam)
• Question Type: Multiple Choice
5 Teradata Products

5.1 Teradata Solution

The core product is the Teradata Database. A comprehensive set of management tools and utilities is included to aid in numerous functions. The tools and their functions are:

• Teradata Manager - database system administration
• Teradata Administration Workstation
• Teradata Parallel Transporter - unified load utility
• Teradata TPump - continuous load
• Teradata Replication Services - capture and delivery of changed data
• Teradata FastLoad - table loading
• Teradata MultiLoad - data loading
• Teradata FastExport - data extraction
• Teradata Archive/Recovery Utility
• Teradata Dynamic Workload Manager
• Teradata Analyst Pack
• Teradata Utility Pack
• Teradata Warehouse Miner Stats
• Teradata Warehouse Miner
• Teradata Profiler - data quality
• Teradata Analytic Data Set Generator

The following functions are provided by the complete Teradata solution:

• Active Load - data is loaded actively and continuously while supporting other workloads. The methods include streaming from a queue, batch updates, and moving changed data.
• Active Access - analytical intelligence can be accessed quickly and consistently, increasing the speed of processing user and customer requests.
• Active Enterprise Integration - the data warehouse solution is integrated into the enterprise business and technical architectures to support business users, partners, and customers.
• Active Availability - aids in identifying and fulfilling application-specific availability, recoverability, and performance requirements.
• Active Events - business events are detected automatically and business rules are applied against current and historical data.
• Active Workload Management - mixed workloads are managed dynamically and system resource utilization is optimized to meet business goals.

5.1.1 Teradata Database

The Teradata Database is an inexpensive solution which will use most standard hardware components. The architecture supports single-node (Symmetric Multiprocessing) and multinode (Massively Parallel Processing) systems. A fast interconnect structure (BYNET) allows distributed functions to communicate. The database has the capacity to store large amounts of detailed data and performs large numbers of instructions every second. Parallel processing makes the Teradata product faster than other relational systems. Additionally, the database can be expanded without sacrificing performance.

Teradata Database is designed to be a single data store for multiple client architectures, reducing the data duplication and inaccuracies present when multiple stores are used. This single store is possible through heterogeneous client access: the solution enables data type translation, connections, concurrency, and workload management to allow clients to access a single copy of the data.

Because the Teradata Database is a relational database, communication utilizes Structured Query Language (SQL). The capabilities of the Teradata Database allow users to view and manage data as collections of related tables. Teradata SQL is compatible with ANSI SQL but adds Teradata-specific extensions. Either Teradata or ANSI mode is available for transactions to be run.

Fault tolerance capabilities ensure hardware failures are detected and recovered from, and data transactions will roll back to a consistent state if a fault occurs.

The database can use one of two attachment methods to connect to other operational computer systems. The channel attachment method attaches systems directly to a mainframe computer using an I/O channel. The network attachment method attaches the system to workstations and other computers and devices through a LAN.
The database software implements the relational database environment using the following functional modules:

• Database Window - used to control operations.
• Teradata Database Management Software - includes a Parsing Engine (PE), Access Module Processor (AMP), and file system.
• Parallel Database Extensions (PDE) - enables the database to operate in a parallel environment by implementing a software interface layer.
• Teradata Gateway - validates messages from clients to generate sessions and controls encryption.

5.2 Data Warehouse Concepts

A data warehouse is a centralized database storing data that is later used to aid strategic, tactical, and event-driven decision making. The data is typically gathered from other operational databases. This data is:

• Subject oriented
• Integrated
• Timestamped
• Nonvolatile

Data can be captured from several sources, such as:

• Customer orders
• Inventory databases
• Shipping processes
• Direct mail
• Electronic mail
• Phone calls

Utilities load the data either in a timely, continuous fashion or through batch jobs.
5.3 Relational Database Concepts

Relational database concepts are grounded in the mathematical discipline of set theory. In this context, a relation is defined by a table, a tuple is defined by a row, and an attribute is defined by a column. The rows within a table define the cardinality of the relation, and the columns within the table define the degree of the relation. The implementation of a relational database is a general implementation of set theory relations. A consistent, predictable outcome is possible when manipulating a table because the operations allowed are well-defined. Relationships and constraints of data are defined by references between tables.

5.3.1 Tables

Tables are two-dimensional objects comprised of rows and columns. The table format is used to organize data and present that data to users. A column contains the same type of information; columns generally represent entities, relationships, or attributes. Names, dates, and prices may each be found in their own columns, but never in the same column. A row is a specific instance consisting of all columns in the table. Entities are a specific person, place, or thing about which information is stored in the tables: an entity is found in a row, and the attributes of that entity are found in columns. Table constraints are simply conditions that must be met before a value is written to a column and can include value ranges, equalities or inequalities, and intercolumn dependencies.

There are several types of tables:

• Permanent - different sessions and users can share table content. These tables have a persistent table definition, which is stored in the Data Dictionary.
• Global Temporary - these tables also have a persistent table definition stored in the Data Dictionary.
• Volatile - session-based tables which are private to the session and dropped automatically at the end of a session. They are used when only one session needs a table and better performance than a global temporary table is required, or when the table definition is not required after the session ends; only the creator needs access to the table.
• Derived - a temporary table created through a subquery of one or more other tables. This table type is specified in an SQL SELECT statement.
• NoPI - permanent tables without a defined primary index, used as staging tables to load data from FastLoad or TPump Array INSERT.
• Error Logging - used to store information about errors, particularly insert and update errors, on a permanent table.
• Queue - permanent tables with a timestamp, with contents organized in first-in first-out (FIFO) ordering.
• Global Temporary Trace - stores trace output for sessions, to be used for debugging SQL stored procedures and external routines.
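For illustration, the sketch below shows how a permanent table and a volatile table might be defined. The database, table, and column names are hypothetical, and the exact options available vary by release:

  CREATE TABLE sales.orders, FALLBACK
    (order_id   INTEGER NOT NULL,
     cust_id    INTEGER,
     order_date DATE)
  UNIQUE PRIMARY INDEX (order_id);

  -- Private to the session; dropped automatically when the session ends
  CREATE VOLATILE TABLE session_work
    (order_id INTEGER,
     amount   DECIMAL(10,2))
  ON COMMIT PRESERVE ROWS;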
5.3.2 Views

Database views are virtual tables, usually presenting only a subset of columns and rows. Views provide users a perspective of the data in the database and can restrict access to tables; they also permit updates, though some restrictions apply when using INSERT, MERGE, UPDATE, and DELETE statements. One or more base tables or views can be used to create a view. A view does not contain data, and it does not exist without a reference from a DML statement. View definitions are stored in the Data Dictionary.

Views can be used as if they were tables in SELECT statements: they are treated as physical tables when retrieving data, defining columns from other views or tables. Views can provide a level of independence to logical data, and access to the data can be defined. Hierarchies of views can be created from other views; when this is done, higher-level views have dependencies on lower-level views. If a lower-level view is deleted, the dependencies become invalidated.
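A minimal sketch of a view follows, assuming a hypothetical sales.orders table; the view exposes only a subset of columns and rows and is then queried like a table:

  CREATE VIEW sales.open_orders AS
    SELECT order_id, cust_id, order_date
    FROM sales.orders
    WHERE order_status = 'OPEN';

  SELECT * FROM sales.open_orders;  -- users never touch the base table directly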
5.3.3 SQL Stored Procedures

SQL stored procedures are a combination of procedural control statements, SQL statements, and control declarations which provide a procedural interface to the Teradata Database. They are executed in the Teradata Database server space. They contain:

• Multiple input and output parameters.
• Local variables and cursors.
• SQL DDL, DCL, and DML statements, including dynamic SQL statements.

With SQL stored procedures, large and complex database applications can be built. Applications using SQL stored procedures have the following characteristics:

• Stored procedures reside and are executed on the server, reducing network traffic and improving application maintenance.
• Business rules on the server can be encapsulated and enforced, enabling high-performance capabilities.
• User access can be granted to procedures instead of to the data tables directly.
• Runtime conditions generated by the application can be handled by an exception-handling mechanism.
• Transaction control is better.
• Security is better, due to the ability of the data access clause to restrict access to the database.

One or more of the following elements can be found in SQL stored procedures:

• SQL control statements - nested or non-nested compound statements.
• SQL DDL, DCL, and DML statements, including dynamic SQL statements.
• Control declarations - initiated by DECLARE HANDLER, DECLARE CURSOR, FOR, and DECLARE statements to provide condition handlers, cursor declarations, and local variable declarations respectively. Condition handlers can be of a CONTINUE or EXIT type, and can be defined for SQLSTATE, SQLEXCEPTION, NOT FOUND, and SQLWARNING conditions.
• LOCKING modifiers - used with all supported SQL statements except CALL.
• SQL transaction statements.
• Comments - notes which are simple or bracketed.

A single CALL statement can be used to execute all SQL and SQL control statements embedded in an SQL stored procedure.

5.3.4 External Stored Procedures

External stored procedures can be written in the C, C++, or Java programming language. They are installed on the database and executed like stored procedures. There are five steps to follow in order to write and utilize an external stored procedure:

1. Write, test, and debug the C, C++, or Java code for the procedure.
2. If using Java, place the class or classes in an archive file (JAR or ZIP) and call the SQLJ.INSTALL_JAR external stored procedure to register the archive file.
3. Create a database object for the external stored procedure using the CREATE PROCEDURE or REPLACE PROCEDURE statement.
4. Grant privileges to authorized users using the GRANT statement.
5. Invoke the procedure using the CALL statement.
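Tying together the SQL stored procedure elements of section 5.3.3, the following is a minimal sketch using hypothetical names; it shows parameters, an EXIT condition handler, and the single CALL that executes the whole body as one request:

  CREATE PROCEDURE sales.add_order
    (IN  p_order_id INTEGER,
     IN  p_cust_id  INTEGER,
     OUT p_status   VARCHAR(30))
  BEGIN
    -- EXIT handler: leaves the procedure if any SQL exception is raised
    DECLARE EXIT HANDLER FOR SQLEXCEPTION
      SET p_status = 'FAILED';

    INSERT INTO sales.orders (order_id, cust_id, order_date)
      VALUES (p_order_id, p_cust_id, CURRENT_DATE);
    SET p_status = 'OK';
  END;

  CALL sales.add_order(1001, 42, p_status);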
5.3.5 Macros

One or more SQL statements can be executed by a single request using the macro database object. Though multiple statements may exist in a macro, it is treated as a single request: the execution of a macro will either process all the statements it contains or none of them. A failed macro is aborted, all updates are backed out, and the database is returned to its original state. Each time a macro is performed, one or more rows of data may be returned. Macros can be created by an individual for personal use, or for others by granting execution authorization. Execution of a macro does not require knowledge of database access, the tables affected, or the results of the macro.

The basic SQL statements used with macros include:

• CREATE MACRO - used to incorporate a frequently used SQL statement or series of statements into a macro.
• EXECUTE - runs a macro.
• DROP MACRO - deletes a macro.
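As a hedged sketch (the table and macro names are hypothetical), a parameterized macro is created once and then executed as a single request; parameters are referenced with a leading colon:

  CREATE MACRO sales.orders_for_customer (cust INTEGER) AS
    (SELECT order_id, order_date
     FROM sales.orders
     WHERE cust_id = :cust; );

  EXECUTE sales.orders_for_customer (42);

  DROP MACRO sales.orders_for_customer;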
5.3.6 Triggers

A triggering event is an event which is initiated because of the occurrence of another event. The conditions of the triggering event are represented by a database object, which is a stored SQL statement associated with a table called the subject table. Triggers are executed when a specified column or columns in the subject table are modified by a DELETE, INSERT, or UPDATE statement. Two types of triggers are supported by the Teradata Database:

• Statement
• Row

Triggers are initiated as one of the following:

• BEFORE - executed before the completion of a triggering event.
• AFTER - executed after the completion of a triggering event.

When multiple triggers are specified, they are executed in the order determined by the timestamp of each trigger; they are sorted based on the preceding ANSI rule unless the ORDER extension is used. Triggers can initiate other triggers, and system performance can be maximized by running triggered and triggering statements in parallel.

Triggers allow:

• UPDATE, INSERT, MERGE, and DELETE statements performed on a subject table to be propagated to another table.
• Major operations for UPDATE, INSERT, MERGE, or DELETE to be disallowed during specific timeframes, such as business hours.
• The performance of an audit.
• Thresholds to be set.
• SQL stored procedures and external stored procedures to be called.
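The sketch below shows what a row trigger might look like, assuming hypothetical sales.orders and sales.order_log tables; an AFTER INSERT trigger on the subject table propagates each new row to an audit table:

  CREATE TRIGGER sales.order_audit
  AFTER INSERT ON sales.orders
  REFERENCING NEW AS newrow
  FOR EACH ROW
    (INSERT INTO sales.order_log (order_id, logged_at)
     VALUES (newrow.order_id, CURRENT_TIMESTAMP); );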
5.3.7 User-Defined Functions

SQL can be extended by writing functions, called user-defined functions (UDFs). Two types of UDFs are supported:

• SQL UDFs
• External UDFs

Regular SQL expressions can be encapsulated into functions and used like standard SQL functions by creating an SQL UDF. SQL expressions are objectified when they are frequently repeated in queries. An SQL UDF can be created and used with the following steps:

1. Define the UDF using the CREATE FUNCTION or REPLACE FUNCTION statement.
2. Use the GRANT statement to grant privileges to authorized users.
3. Call the function.

External UDFs are functions written in the C, C++, or Java programming language. They are installed on the database and used like standard SQL functions. Three types of external UDFs are supported:

• Scalar - returns a single value result for the input parameters.
• Aggregate - produces summary results.
• Table - invoked in the FROM clause of a SELECT statement and returns a table to the statement.

An external UDF can be created and used with the following steps:

1. Write, test, and debug the C, C++, or Java code for the UDF.
2. If using Java, place the class or classes in an archive file (JAR or ZIP) and call the SQLJ.INSTALL_JAR external stored procedure to register the archive file.
3. Create a database object for the UDF using a CREATE FUNCTION or REPLACE FUNCTION statement.
4. Use the GRANT statement to grant privileges to authorized users.
5. Call the function.

5.3.8 User-Defined Methods

A User-Defined Method (UDM) is a special form of UDF; it is always associated with a UDT. Two types of UDMs are supported:

• Instance - operates on a specific instance of a distinct or structured UDT and can provide transform, ordering, and cast functionality for the UDT.
• Constructor - initializes an instance of a structured UDT.

5.3.9 User-Defined Types

User-Defined Types (UDTs) can be structured or distinct. A distinct UDT is based on a single predefined data type. A structured UDT is a collection of one or more attributes, each defined as a predefined data type or another UDT. Both types of UDTs can define methods. A distinct or structured UDT is defined using a CREATE TYPE statement.

A dynamic UDT is a form of structured UDT, but a dynamic UDT is defined using a NEW VARIANT_TYPE expression. This expression constructs an instance of the dynamic UDT and its attributes at runtime. Dynamic UDTs can only be specified as the data type of input parameters to external UDFs.
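For example, a distinct UDT wraps a single predefined type and can then be used as a column type. This is a sketch with hypothetical names, assuming the common convention that UDTs are created in the SYSUDTLIB database:

  CREATE TYPE SYSUDTLIB.money_usd AS DECIMAL(12,2) FINAL;

  CREATE TABLE payments
    (pay_id INTEGER,
     amount SYSUDTLIB.money_usd)
  PRIMARY INDEX (pay_id);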
5.3.10 Databases and Users

In the Teradata Database, a database and a user are nearly identical, the only difference being the ability of the user to log on to the system. When the Teradata Database is installed on a server, the user DBC is created. This single user initially owns all other databases and users in the system and all the space in the entire system, specifically all free space and the databases and users created after installation. Space is assigned from user DBC to all other objects. The database administrator manages this user.

The database administrator can create a User System Administrator from user DBC to protect the system tables within the Teradata Database. This user is then used to allocate space, usually by assigning all database disk space not required by system tables to the User System Administrator.

The Data Dictionary is a set of tables and views associated with the system user, DBC. The tables are used only by the system and contain metadata about objects, privileges, system events, and system usage. Views provide access to the information contained within the tables. The metadata is comprised of current definitions, control information, and general information about:

• Authorization
• Accounts
• Character sets
• Columns
• Constraints
• Databases
• Disk space
• End users
• Events
• External stored procedures
• Indexes
• JAR and ZIP archive files
• Logs
• Macros
• Privileges
• Profiles
• Resource usage
• Roles
• Rules
• Sessions
• Session attributes
• Statistics
• Stored procedures
• Tables
• Translations
• Triggers
• User-defined functions
• User-defined methods
• User-defined types
• Views
Generally, database administrators will create and update the tables referenced by the system views. These views are used by users to obtain information within the tables. This information is typically focused on objects in the system, including:

• Tables - information on table location, table backup and protection, defined indexes and constraints, number of fallback tables, and the creation date and time for the table; also information on the columns in the table, including column name, data type, default character set, default format, and phrases.
• View or Macro - information on the view or macro text, creation text and timestamp, collation type, and user and creator access privileges.
• Databases - information on database name, creator name, owner name, account name, space allocation, number of fallback tables, creation timestamp, and role and profile names.
• Trigger - stores the IDs of tables, databases, indexes, and the users who created and last updated the trigger, along with trigger names and conditions, creation timestamp, overflow text, and subject table databases.
• User-defined function - information contains C source code and object code, function call name, function class, specific name, external name, source file language, parameter passing convention, deterministic characteristic, null-call characteristic, data accessing characteristic, execution protection mode, character and platform type, parameters (including parameter name and data type), and user and creator access privileges.
• External stored procedures - information on source code and object code, external stored procedure name, specific name, external name, source file language, parameter passing convention, data accessing characteristics, execution protection mode, character and platform type, parameter data types, and user and creator access privileges.
• Java external stored procedure - information on the Java object code, JAR name, external file reference, data accessing characteristics, parameter passing convention, execution protection mode, platform type, and revision numbers.
• Stored procedures - information including name, creator name, database name, parameters (including parameter name, parameter type, and data types), creation timestamp, attributes of the creation time, modify date and time, and execution protection mode.
• JAR - information on JAR identification, database name, modifier, timestamp of the last update, and revision number.
• User-defined methods - information contains C source code and object code, function call name, specific name, external name, source file language, parameter passing convention, deterministic characteristic, null-call characteristic, data accessing characteristics, execution protection mode, creation timestamp, and character and parameter types.
• User-defined types - information on DBC.UDTInfo (each UDT entry), including type name, type kind, instantiability, ordering form, ordering category, ordering routine ID, cast count, and default transform group; DBC.UDTCast (each cast entry), including implicit assignment and cast routing; DBC.UDTTransform (each transform), including default transform group name, ToSQL routine ID, and FromSQL routine ID; and DBC.UDFInfo (each auto-generated default constructor), including the same information as a regular UDF.

5.3.11 Data Dictionary Views

Information on objects found in the Data Dictionary can be presented through views, rather than by querying the actual tables. Some Data Dictionary views are restricted to specific types of users, while other views are accessible by all users. To have access to these views, the database administrator will need to grant privileges. The different types of users and their information needs are:

• End user - information on objects, errors, and privileges related to the user.
• User - information includes user name, role and profile names, password string, password change date, default account, date form, character type, collation, creator name, and the types of access granted.
• Supervisory - information on creating and organizing databases, defining new users, allocating access privileges, monitoring space usage, and creating indexes.
• Database administrator - information on performance status and statistics, space allocation, errors, accounting, and archiving.
• Security administrator - information on access logging rules and access checking results.
• Operations and Recovery Control - information on archive and recovery activities.
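As a simple illustration of querying through a Data Dictionary view rather than the underlying tables, the hedged sketch below lists the base tables in a hypothetical sales database via the DBC.Tables view (column names may vary slightly by release):

  SELECT TableName, CreatorName, CreateTimeStamp
  FROM DBC.Tables
  WHERE DatabaseName = 'sales'
    AND TableKind = 'T';   -- 'T' marks base tables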
The Data Dictionary is accessed every time a user logs into the Teradata Database, uses a password, and makes an SQL query. The views used by the Data Dictionary are based on Unicode but are backward compatible with Kanji/Latin. The only SQL DML command allowed with the Data Dictionary is the SELECT statement; other statements cannot be used at all or are limited to specific Data Dictionary tables.

5.4 Teradata RDBMS Components

5.4.1 Platforms

The Teradata Database software is supported by hardware components based on Symmetric Multiprocessing (SMP) technology. When combined with a communications network, the SMP technology can be formed into a Massively Parallel Processing (MPP) system. The components include:

• Processor node - several tightly coupled CPUs connected to one or more disk arrays (an SMP configuration) used as the hardware platform.
• BYNET - links nodes on an MPP system to enable point-to-point, multicast, and broadcast messaging between processors. It utilizes high-speed logic to allow bi-directional communications and merge functions. The BYNET resembles a switched network which loosely couples the SMP nodes in a multinode system; an MPP configuration is made of two or more loosely coupled SMP nodes. At least two BYNETs in a multinode system are needed to provide fault-tolerant capabilities and enhance communication between processors. Transmissions can be optimized using load-balancing software. Single-node SMP systems use Boardless BYNET, or virtual BYNET, software to emulate BYNET hardware.

5.4.2 Virtual Processors

The processors are virtual (vprocs). They run as software processes on a node under the Parallel Database Extensions (PDE). There are several different types of virtual processors:

• Access Module Processor (AMP) - performs database functions; each processor owns a portion of the overall database storage.
• Gateway (GTW) - provides a socket interface to the Teradata Database.
• Parsing Engine (PE) - decomposes SQL statements and returns answers to clients.
• Relay Services Gateway (RSG) - communicates with the Teradata Meta Data Services utility for any dictionary changes and provides a socket interface for the replication agent.
• VSS - manages the database storage.

Parsing Engine (PE) vprocs perform communication duties between client systems and AMPs. At the core of the PE is the database software that manages sessions, query parsing, query optimization, query dispatch, and security validation. The following elements comprise the PE software:

• Session Control - manages session activities and recovers sessions.
• Parser - decomposes SQL into processing steps.
• Optimizer - identifies the most efficient path to access data.
• Generator - used to generate and package steps.
• Dispatcher - receives output from the parser and sends it to the appropriate AMPs, then monitors for completion of the steps or errors.

AMP vprocs manage all interactions between the disk subsystem and the Teradata Database. Each AMP vproc manages a portion of the disk storage, and a vproc also manages the disk space in the file system. During the query process, an AMP will sort and merge data rows and aggregate data. The vproc performs specific database management tasks such as accounting, journaling, locking database objects, and converting output data. Multiple AMPs can be grouped into logical clusters; these clusters enable the fault-tolerant capabilities of the database. This processor type cannot be manipulated externally.

The communication with AMPs is enabled through BYNET, and communication with vprocs occurs using unique-address messaging driven by BYNET driver software. Each vproc is an independent copy of the processor software, sharing only the physical resources of the node with other vprocs. Though each node will typically have 6-12 vprocs, up to 128 vprocs can be supported at each node, and up to 16,384 vprocs are supported within a single system.
5.4.3 Processing Requests

The SQL parser processes incoming SQL requests to query the database. The SQL requests are handled in the following way:

1. The syntax of the incoming request is checked by the Syntaxer. If errors exist, an error message is sent back to the requester; if no errors exist, the request is converted into a parse tree and sent on to the Resolver.
2. The Resolver adds information from the Data Dictionary or its cache to convert database objects to internal identifiers.
3. Privileges in the Data Dictionary are checked by the security monitor. If the privileges are valid, the request is sent to the Optimizer; if not, the request is aborted and an error message is sent back to the requester.
4. The Optimizer determines the best way to implement the SQL request. The request is scanned by the Optimizer to determine where to place locks, and the optimized parse tree is then passed to the Generator.
5. The optimized parse tree is transformed into plastic steps by the Generator; the steps are cached and then sent to the gncApply.
6. The gncApply binds parameterized data into the plastic steps and transforms them into concrete steps, which provide directives to the AMPs.
7. The concrete steps are passed to the Dispatcher, which controls their execution sequence. It passes the steps to the BYNET, which distributes them to the AMP database management software. As each step is sent to the BYNET, the Dispatcher communicates whether the step is for one AMP, several AMPs (a set of AMPs called a dynamic BYNET group), or all AMPs. The Dispatcher then waits for a completion response. If the next step is dependent on the output of the current step, a response is required before starting; if no dependency exists, steps can be processed in parallel.
8. The AMPs acquire the required rows. Messages are transmitted to and from the AMPs and PEs by the BYNET. When completion responses come from all expected AMPs, the Dispatcher places the next step on the BYNET. The process continues until all AMP steps related to a request are completed.
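The plan that the Optimizer and Generator produce for a request can be inspected by prefixing the request with the EXPLAIN modifier, which returns the AMP steps in English instead of executing them. A sketch with hypothetical tables:

  EXPLAIN
  SELECT o.order_id, c.cust_name
  FROM sales.orders o
  JOIN sales.customers c
    ON o.cust_id = c.cust_id;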
5.4.4 Disk Arrays

Redundant Array of Independent Disks (RAID) solutions are used to protect data at the disk level. Disk drives are grouped into RAID LUNs to ensure availability of data during a failure. Drive groups are a set of drives configured into one or more LUNs, and each LUN is uniquely identified. Vdisks are a group of cylinders assigned to an AMP. The physical connections are made using Fibre Channel (FC) buses.

5.4.5 Cliques

Nodes of an MPP system are physically linked by multiported access to common disk array units. This feature is called a clique, and it supports the migration of vprocs under PDE. The vprocs migrate to other nodes when the original nodes in the clique fail. PEs connected through the LAN can migrate; PEs that are dependent on physically attached hardware cannot.

5.4.6 Hot Standby Nodes

Nodes that are used solely to improve availability and maintain performance are called hot standby nodes. They are members of a clique and will not normally perform any database operations; they are utilized only when a node within the core system fails.

5.4.7 Parallel Database Extensions

Between the operating system and the Teradata Database is a software interface layer called the Parallel Database Extensions (PDE). A PDE allows the Teradata Database to:

• Run in a parallel environment
• Execute vprocs
• Apply a priority scheduler
• Manage memory, I/O, and messaging system interfaces

Multiple operating systems can be run in parallel, and PDE provides a number of services for parallel operating systems, including:

• The ability to manage parallel execution of database operations on multiple nodes
• Dynamic distribution of database tasks
• Task execution coordinated between and within nodes
Between the PDE and the Teradata Database is a layer of software called the Teradata Database File System. The file system allows the Teradata Database to store and retrieve data efficiently without bothering with specific low-level operating system interfaces.

5.4.8 Workstations Types

Two types of workstations exist to look into the Teradata Database:

• System Console on the SMP platform
• Administration Workstation (AWS) on the MPP platform

The system console provides a mechanism that allows system and database administrators to provide input to the Teradata Database. Various utilities can be controlled through the system console, and within the console the system status, current system configuration, and performance statistics can be displayed. The Administration Workstation performs all the functions of the system console. With the AWS, multiple nodes can be viewed through a single system and system performance can be monitored.

5.4.9 Teradata Database Window

The Teradata Database operations are controlled by database administrators, system operators, and support personnel using the Teradata Database Window (DBW). The DBW is a graphical user interface used to start and control Teradata Database utilities; database commands can be issued and utilities can be run from the DBW. Its interface is specific to the Teradata Console Subsystem (CNS), which is part of the PDE software on top of which the database runs. It can be run from:

• The system console
• The Administration Workstation
• A remote workstation or computer

5.5 Database Requirements

The critical requirements to be fulfilled by the Teradata Database are characterized as reliability, availability, usability, serviceability, and installability (RASUI).
The fulfillment of these requirements is achieved by combining:

• Multiple microprocessors in an SMP arrangement
• RAID disk storage
• Operational anomaly protection

Fault tolerance is provided through both the hardware and the software. Some fault tolerance features are mandatory, while others are optional.

5.5.1 Fault Tolerance from Software

Software fault tolerance is provided through the following Teradata Database facilities:

• Vproc migration
• Fallback tables
• AMP clusters
• Journaling
• Backup/Archive/Recovery
• Table Rebuild Utility

The Parsing Engine (PE) and Access Module Processor (AMP) are vprocs which can migrate from node to node within the same hardware clique when a failure occurs. This migration enables a system to function fully during a failure, though some performance degradation will be experienced because of the nonfunctional hardware.

Fallback tables are a copy of a primary table. Each fallback row in a fallback table is stored on an AMP different from the AMP to which the primary row hashes. Should the system lose an AMP, fallback tables can maintain a row's availability. This technique comes at the expense of doubling the storage space and the I/O used for tables. Fallback can be defined for individual tables, such as those critical for the business, while leaving other tables alone.

AMP clusters are comprised of 2-16 AMPs. They provide fallback capabilities for each other by storing a copy of each row on a separate AMP in the same cluster. Smaller clusters reduce the potential for two AMPs in the same cluster to fail, which can cause a system halt, especially in large systems. Multiple clusters can exist, and clustering can be performed across subpools, which are logical groupings of AMPs and disks. Dividing the AMPs in a cluster across subpools can prevent disk failures from halting the system: if two AMPs in the same cluster fail, the system will not crash if the AMPs are in different subpools.
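Because fallback is defined per table, it can be switched on for business-critical tables after the fact. A hedged sketch with a hypothetical table:

  ALTER TABLE sales.orders, FALLBACK;     -- start keeping a fallback copy of each row
  ALTER TABLE sales.orders, NO FALLBACK;  -- reclaim the doubled space and I/O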
Journaling is a recording of activity, and there are several such mechanisms in the Teradata Database. The system itself will perform some journaling, while other activities can be configured for journaling. The different types of journals include:

• Down AMP recovery - occurs during an AMP failure, on fallback tables only, and is used to recover the AMP after repair.
• Transient - used to roll back failed transactions aborted by the user or the system. It stores BEFORE images of the transactions by capturing begin/end transaction indicators, before-row images for UPDATE and DELETE statements, row IDs for INSERT statements, and control records for CREATE, DROP, DELETE, and ALTER statements. An image is created on the same AMP as the row described, and the image is discarded when the transaction or rollback is completed. The journal is discarded after use.
• Permanent - specified by the user for tables or databases. This type of journal can contain before images, after images, or both, to enable rollback, rollforward, or full recovery. The need for frequent, full-table archives is reduced because of permanent journals.

Backup Archive and Restore (BAR) solutions can be found in two different architectures:

• BAR Framework - consists of Teradata Database nodes connected to BAR servers which initiate and run the backup, archive, and restore activities.
• Direct-attached Architecture - uses tape libraries and drives directly connected to Teradata Database nodes, which initiate and run the backup, archive, and restore activities.

The Teradata Archive/Recovery (ARC) utility archives files to client tape or files and restores them to the Teradata Database. The objects which are backed up and restored by this utility include:

• Authorization objects
• Databases
• Data Dictionary tables
• External stored procedures
• Hash indexes
• Join indexes
• Methods
• Stored procedures
• Tables
• Table partitions
• Triggers
• UDFs
• UDTs
• Views
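As an illustration of how a permanent journal is requested in DDL, the sketch below is hedged: the names are hypothetical and the exact journaling clauses vary by release. A journal table is declared at the database level, and individual tables then opt in:

  CREATE DATABASE sales_hist AS
    PERM = 10e9,
    DEFAULT JOURNAL TABLE = sales_hist.jnl;

  CREATE TABLE sales_hist.orders,
    FALLBACK,
    BEFORE JOURNAL,        -- before-images enable rollback
    DUAL AFTER JOURNAL     -- two after-image copies enable rollforward
    (order_id INTEGER)
  PRIMARY INDEX (order_id);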
The Table Rebuild Utility is used to recreate a table, all tables in an individual AMP, all tables in a database, or the entire disk on a single AMP. A table can be rebuilt on an AMP-by-AMP basis for the primary or fallback portions of a table, an entire table, or all tables in a database. The affected tables must utilize fallback protection. The recreation process is performed when the table structure or data is damaged due to a software problem, head crash, power failure, or other malfunction. The utility can also be used to remove any inconsistencies in stored procedure tables in a database. The utility is usually run by a system engineer, field engineer, or system support representative.

5.5.2 Fault Tolerance from Hardware

Hardware fault tolerance is provided through the following Teradata Database facilities:

• Multiple BYNETs
• RAID disk units
• Multiple-channel connections
• Isolation from client hardware defects
• Battery backup
• Power supplies and fans
• Hot swap node capabilities
• Cliques
5.6 Redundant Array of Inexpensive Disks

Each storage device involves different technologies. Magnetic disks are the preferred device for primary storage, while tape devices are primarily used to back up large volumes of data. Both technologies have the potential to fail at any point, though they are relatively stable.

Redundant Array of Inexpensive Disks (RAID) provides a fault-tolerant array of drives to overcome any possibility of failure. The system creates a combined large storage device from smaller individual devices, and RAID is a simplified system for managing and maintaining this storage environment. The level of redundancy provided by a virtual disk ensures that the data is protected from disk failures, and damaged drives can be hot-swapped without disrupting network functions. To minimize host processing, fast RAID arrays have additional hardware caches, multiple buses, and striping schemes. Software implementation of RAID is possible, but the write speeds are typically slower than in hardware implementations; the reason for this reduction in speed is the need for the host system to calculate the parity values and perform additional I/O operations to store them.

Data is generally stored across different drives, and different levels of RAID provide different levels of redundancy and performance. Deciding which level to use is one of the most important decisions in designing a SAN using RAID; the RAID 5 and RAID 3 options are the most popular choices for large databases. The different levels of RAID include:

• RAID Level 0 - a simple level of disk striping in which data is stored across all drives. The most basic level, RAID 0 is best used when high throughput is desired at the lowest possible cost, but it does not offer any redundancy and is not recommended for storing data.
• RAID Level 1 - uses mirroring to replicate data from one drive to the next. Level 1 is excellent when the primary requirements are high availability and high reliability, but it is costly, since double the storage capacity is required. It is most suitable for multiple applications.
• RAID Level 3 - uses parity, storing the parity value on a separate drive. RAID Level 3 provides the best high data transfer and costs less than other levels, but write performance is low, and it is unsuitable for frequent transactions using small data transfers. The technology is intended for large database operations.
• RAID Level 5 - uses parity, storing parity values across different drives. Level 5 has a high read rate and is reliable, and it can withstand single drive failures, though performance goes down when a drive fails.
• RAID Level 6 - stores parity on striped drives along with the data. Level 6 has high reliability and high read speed, and it is best used when the primary requirements are high availability and data security. The costs are high, and the write speed is slower than RAID 5.
5.7 Client Communication

The Teradata Database can communicate with client applications through a LAN network or through a channel provided by a mainframe server. These methods are used to allow client applications to make requests to the database server and to send responses back from the database.

5.7.1 Network Attachments

The methods used to communicate over the network include:

• Open Database Connectivity (ODBC)
• Java Database Connectivity (JDBC)
• OLE DB Provider for Teradata
• .NET Data Provider for Teradata
• Teradata CLIv2 for NAS

The ODBC Driver provides an interface to the database using a standard ODBC API. This driver provides Core-level SQL and Extension-level function call capability, and uses the Windows Sockets TCP/IP communications software interface. ODBC and CLI work independently of each other.

The Java Database Connectivity is a specification for an API that allows platform-independent Java applications to access the database using SQL and external stored procedures. The JDBC API provides a standard set of interfaces for opening connections to databases, executing SQL statements, and processing results. With the Teradata JDBC Driver, the Teradata Database can be accessed using the Java language.

OLE DB Provider for the Teradata Database uses service providers to enhance a provider's functionality. Programmers can use the OLE DB Provider to design application programs that access databases and data stores that do not use SQL. The program requests database information from an intermediate program, which in turn accesses the database; responses from the database are sent back to the application program through the intermediary.

CLIv2 is used by the .NET Data Provider for Teradata to connect, execute commands, and retrieve results from the database. Results are either processed directly or cached in an ADO.NET DataSet to generate XML and bridge relational and XML data sources.
Teradata CLIv2 for NAS is a proprietary API and library. It provides an interface between applications on a network-attached client and the Teradata Database server, and this library gives clients that access the Teradata Database operating-system independence. It allows sessions to be initiated and terminated, as well as enabling logging, verification, recovery, and restart. The API builds parcels that are packaged by MTDP and sent to the Teradata Database using MOSI.

The Micro Teradata Director Program (MTDP) is the interface between Teradata CLIv2 and MOSI; providing physical input to and output from the server is another function of MTDP. The Micro Operating System Interface (MOSI) is an interface between the MTDP and the Teradata Database that provides a library of service routines. As a result, only one version of MTDP is required to run on all network-attached platforms when using MOSI.

5.7.2 Channel Attached Systems

Teradata CLIv2 for channel-attached systems (CAS) is used to perform channel attachment. It is a collection of callable service routines that provides an interface between applications and the Teradata Director Program (TDP) running on an IBM mainframe client. All versions of IBM operating systems can be operated, including:

• Customer Information Control System (CICS)
• Information Management System (IMS)
• IBM System z Operating System

Teradata CLIv2 for CAS enables:

• Management of multiple simultaneous sessions to server(s).
• Communication with Two-Phase Commit (2PC) coordinators for CICS and IMS transactions.
• Insulation of the application from the mechanics of communication with a server.
• Cooperative processing to allow simultaneous operations on client and server by an application.

The TDP enables communications between Teradata CLIv2 for CAS and the Teradata Database server. Through the TDP, requests are sent to the server and responses are sent back from the server to client applications. Each parcel returned has a pointer for the application. Individual TDPs are associated with a logical server, but multiple TDPs can operate and be simultaneously accessed by Teradata CLIv2 on the same mainframe.
be simultaneously accessed by Teradata CLIv2 on the same mainframe. An identifier called the TDPid is attached to each TDP and referred to by applications. The functions of the TDP include initiating and terminating sessions, logging, verification, recovery, and restarting, as well as session balancing and queue maintenance. Also provided is the physical input and output to and from the server.

The Teradata Database server implements the relational database. The program executes on the same mainframe as Teradata CLIv2 for CAS, but as a different job. This database will process requests from Teradata CLIv2 for CAS, which come through the TDP.

5.8 Data Availability
5.8.1 Concurrency Control
Multiple users accessing the same database raises the possibility of the same data being added, deleted, and updated simultaneously. To prevent concurrently running processes from causing problems with simultaneous updates, concurrency control is established through two mechanisms:
• Transactions
• Locks

5.8.2 Transactions
Transactions are used to maintain the integrity of the database. A transaction is a logical unit of work and a unit of recovery. Requests are nested inside a transaction and are atomic. That is, all the requests in a transaction either must happen or not happen; partial transactions cannot occur.

Transactions can be serializable: a condition where a set of transactions can produce the same result as an arbitrary serial execution of the same transactions for arbitrary input. The serializability of transactions is ensured by the Two-Phase Locking (2PL) protocol. The two phases of the protocol are:
• Growing phase - a lock is placed on an object before the object is used.
• Shrinking phase - after a lock is released, no more locks are placed on an object.
Locks are only released after a transaction is completely committed or completely rolled back.
Teradata Database supports ANSI transaction semantics and Teradata transaction semantics, as specified by a system parameter.

In ANSI mode, transactions are implicitly opened, typically by the execution of the first SQL request in a session or the execution of the first request after the close of a transaction. The transaction closes when a COMMIT, ROLLBACK, or ABORT request is performed by the application. The entire transaction is rolled back by the system if the current request results in a deadlock, performs a DDL statement that aborts, executes an explicit ROLLBACK or ABORT statement, or causes the database software to generate an error.

Teradata mode transactions can be either explicit or implicit. An explicit transaction is generated by the user and consists of a single set of BEGIN TRANSACTION/END TRANSACTION statements; all other requests are implicit transactions. The BEGIN TRANSACTION and END TRANSACTION statements, or the two-phase commit protocol, are allowed in a session. If an error occurs, the entire transaction is rolled back by the system. DDL statements in a transaction must be the last request before the transaction closing statement. A minimal sketch of both modes follows.
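The sketch below contrasts an explicit Teradata-mode transaction with its ANSI-mode equivalent; the accounts table, its columns, and the values are hypothetical:

    -- Teradata (BTET) mode: explicit transaction
    BEGIN TRANSACTION;
    UPDATE accounts SET balance = balance - 100 WHERE acct_id = 1001;
    UPDATE accounts SET balance = balance + 100 WHERE acct_id = 2002;
    END TRANSACTION;

    -- ANSI mode: the first request opens the transaction implicitly
    UPDATE accounts SET balance = balance - 100 WHERE acct_id = 1001;
    UPDATE accounts SET balance = balance + 100 WHERE acct_id = 2002;
    COMMIT;

If either UPDATE fails, the whole unit is rolled back, which is exactly the atomicity described above.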
5.8.3 Locks
Locks are used to control access to a resource. When a lock is placed on a resource, the resource is either fully or partially inaccessible to other users. If a resource is locked by a user, any subsequent requests for it are queued until the lock is released. The request can be aborted if a lock is not obtained immediately. Most locks on resources are obtained automatically; the type of lock used is based on the data integrity requirement of the request.

Locks can be placed on the following objects:
• Database - locks the rows in all tables in the database.
• Table - locks all rows in a table and any associated index and fallback subtables.
• Row hash - locks the primary copy of a row and all rows sharing the same hash code in the same table.

Four different locking severities exist:
• Access - minor inconsistencies in the data are allowed, and modifications on the underlying data are allowed while the SELECT operation is in progress.
• Read - allows multiple locks of this type to exist by several users and permits no modification to the resource.
• Write - the requester has exclusive privileges to the locked resource, except for readers.
• Exclusive - the requester has exclusive privileges to the locked resource, and no other process can write, read, or access the resource.

The severity of a lock can be upgraded with the LOCKING request modifier, but the severity can never be downgraded. Most locks are automatic based on the SQL statement used. Below is a list of statements and the lock severity at the appropriate lock level:
• SELECT - read lock on table and row hash.
• INSERT - write lock on row hash.
• UPDATE - write lock on table and row hash.
• DELETE - write lock on table and row hash.
• CREATE TABLE - exclusive lock on table.
• ALTER TABLE - exclusive lock on table.
• DROP TABLE - exclusive lock on table.
• CREATE DATABASE - exclusive lock on database.
• MODIFY DATABASE - exclusive lock on database.
• DROP DATABASE - exclusive lock on database.

Deadlocks are situations where a transaction already has a lock on a resource and needs a lock on another resource, which is already locked by a requester who in turn needs a lock on the original resource. In this situation, neither transaction can move forward until one of them is aborted. Teradata Database will generally abort the younger transaction, defined by the shortest length of time a resource has been held. If BTEQ is used, the report of the transaction abort is sent to BTEQ, and the request that caused the error is resubmitted, but not the entire transaction.
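The LOCKING request modifier looks like the following sketch; the table name is hypothetical. An access-level request is the common idiom for reporting queries that can tolerate the minor inconsistencies described under the Access severity:

    LOCKING TABLE sales.orders FOR ACCESS
    SELECT order_id, order_total
    FROM sales.orders;

Concurrent UPDATE statements still take their normal write locks, so this query neither blocks the writers nor waits for them.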
5.8.4 Host Utility Locks
The Teradata Archive/Recovery utility located on the client uses a different locking operation than the Teradata Database. These locks are commonly called Host Utility (HUT) locks. HUT locks are:
• Associated with the user currently logged on.
• Placed on objects in the AMPs participating in a utility operation and none other.
• Placed during a CLUSTER dump at the cluster level.
• Not in conflict with any lock at a different level for the same object for the same user.
• Automatically reinstated if not released when Teradata Database restarts.
• Active until the RELEASE LOCK option of an ARC command is given, or until a RELEASE LOCK statement is issued after a utility operation completes its execution.

The different types of HUT locks are:
• Read - any object being archived.
• Group Read - used when the table is defined for an after-image permanent journal, causing rows of a table to be archived.
• Write - any object being restored.
• Write - a permanent journal table being restored.
• Write - a journal table being deleted.
• Exclusive - tables in ROLLFORWARD or ROLLBACKWARD during recovery.
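A minimal sketch of clearing a leftover HUT lock from an ARC script; the TDPid, user, and database name are hypothetical, and the exact command options should be verified against the ARC reference:

    LOGON tdp1/backup_admin,secret;
    RELEASE LOCK (payroll_db);
    LOGOFF;

Including the RELEASE LOCK option on the archive command itself, as in the sketch in the next section, avoids the need for this cleanup step.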
5.8.5 Recovery
Recovery in database management handles the process where databases in an inconsistent state are brought back to a consistent state. Since transactions are simply a series of updates to the database, they can be used to take the database back to an earlier state or forward to a current state. When a database discovers an error or failure, the system may restart. This is typically the result of:
• AMP or disk failure.
• Software failure.
• Disk parity error.

The system can perform an automatic transaction recovery. Two types of transaction recovery can occur:
• Single transaction recovery
• Database recovery

A single transaction recovery will occur because of a transaction deadlock, user error, a user-initiated abort command, or an inconsistent data table. This form of recovery uses the transient journal to perform its operation. A database recovery is performed after initiating a restart of the system.

System recovery uses the down-AMP Recovery Journal. If an AMP fails to come online during system recovery, the Teradata Database will process requests using fallback data. When the AMP finally does come online, the down-AMP recovery procedures are initiated to bring the AMP up-to-date. If only a few rows need to be recovered, the process is performed while the AMP is online; when a large number of rows need to be processed, the AMP will recover offline. After all updates have been made, the AMP is considered recovered.

When errors are found, the affected subtable or region is marked and a snapshot dump is taken. Errors can be isolated to data or index subtables, or to a range of rows in a data or index subtable. If a transaction does not require access to the affected subtable or region, it will continue to process. If any transaction still in progress requires the down subtable or region, the transaction is aborted.

5.8.6 Two-Phase Commit Protocol
Update consistency across distributed databases is assured using the Two-Phase Commit protocol. In this type of environment, each participant in the transaction commit operation will vote to either commit or abort the changes. A participant is any database manager performing some work related to the transaction; participants can also be coordinators of participants at a lower level. A vote by a participant is simply a declaration that it can either commit or roll back its portion of the transaction work. A change is not fully made by any participant until it knows that all participants can commit to the change.
5.9 Teradata Tools and Utilities
5.9.1 Data Archiving Utilities
Third-party software products are supported by Teradata Backup, Archive, and Restore (BAR) for various archiving, backup, and restore functions. BAR software includes:
• BakBone NetVault and NetVault Plug-in - used to select databases and tables graphically and define the backup types to perform.
• Symantec NetBackup and NetBackup Extension for Teradata - used to schedule automatic, high-speed, unattended backups for client systems.
• Tivoli Storage Manager (TSM) and Tivoli Manager Teradata Extension - manages I/O interfaces between Teradata ARCMAIN and IBM TSM.

Tiered Archive Restore Architecture (TARA) consists of software extensions to connect BAR software to the Teradata Database. The framework is comprised of the TARA Server, the TARA GUI, and the vendor extension (the NetBackup Extension for Teradata or the TSM Teradata Extension).

The Teradata Archive/Recovery utility (ARC) works with BAR application software, writing and reading sequential files to archive, restore, recover, and copy table data. Teradata ARC will archive to and restore from tape storage devices and backup-to-disk (B2D) storage devices. Teradata ARC supports the archiving and restoration of individual functions, stored procedures, hash indexes, join indexes, triggers, macros, and views.
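As a sketch of how ARC is driven, the following ARCMAIN-style script archives all tables in one database and releases its HUT locks when done; all names are hypothetical, and real scripts usually add options such as session counts:

    LOGON tdp1/backup_admin,secret;
    ARCHIVE DATA TABLES (sales_db) ALL,
      RELEASE LOCK,
      FILE = ARCHIVE1;
    LOGOFF;

A matching RESTORE DATA TABLES command reads the same sequential file back into the database.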
5.9.2 Load and Extract Utilities
Teradata Parallel Transporter (Teradata PT) provides scalable, high-speed, parallel data extraction, loading, and updating by using and expanding on the traditional Teradata extract and load utilities, such as FastLoad, MultiLoad, FastExport, and TPump. The prominent features of Teradata PT are:
• Process-specific operators
• Access modules
• Parallel execution structure
• Data stream use
• SQL-like scripting language
• Teradata PT Wizard

Teradata PT utilizes an Application Programming Interface (API) to provide several interfaces to load data into or extract data from the Teradata Database. Teradata PT API is a functional library designed to allow greater control over the load and extraction operations.

FastLoad will load data into unpopulated tables. Only one table is populated per job, requiring multiple FastLoad jobs to be submitted to load multiple tables. Data from multiple input source files can be loaded. Block transfers can be performed with multisession parallelism. The utility is supported in client environments and server environments.

Bulk inserts, updates, and deletes can be performed using Teradata MultiLoad. These activities can be performed against several unpopulated or populated database tables at a time. The utility will also perform block transfers with multisession parallelism, and it supports client environments and server environments.

TPump is used to maintain data in tables. Instead of using block transfers, the utility uses standard SQL. Conventional row hash locking is supported, as well as multiple SQL statements packed into a single request. TPump supports the same restart, portability, and scalability found with MultiLoad, and multiple instances of the tool can be run simultaneously with no limitations.

FastExport is a functional complement of FastLoad and MultiLoad. The utility is designed to extract large amounts of data from the Teradata Database to the client in parallel, using block transfers with multisession parallelism. FastExport can be used to export tables to client files or to an Output Modification (OUTMOD) routine.
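A minimal FastLoad script gives the flavor of these utilities; the table, error tables, and input file are hypothetical, and the target table must be empty, as noted above:

    SESSIONS 4;
    LOGON tdp1/loader,secret;
    BEGIN LOADING sandbox.customer_stage
      ERRORFILES sandbox.cust_err1, sandbox.cust_err2;
    DEFINE cust_id   (INTEGER),
           cust_name (VARCHAR(60))
      FILE = customers.txt;
    INSERT INTO sandbox.customer_stage (cust_id, cust_name)
      VALUES (:cust_id, :cust_name);
    END LOADING;
    LOGOFF;

MultiLoad and TPump scripts follow the same general shape, but MultiLoad works in phases with block transfers against populated tables, while TPump issues ordinary row-hash-locked SQL.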
5.9.3 Access Modules
Providing block-level I/O interfaces is the function of dynamically linked software components known as access modules. They import data from data sources and return that data to a Teradata utility. Utilities and the access modules are linked through the Teradata Data Connector API. The following access modules will read, not write, data:
• Named Pipes Access Module - loads data into the database from a UNIX OS named pipe (data buffer).
• Teradata WebSphere MQ Access Module - loads data from a message queue using IBM's WebSphere MQ.
• Teradata Access Module for JMS - loads data from a JMS-enabled message system.
• Teradata OLE DB Access Module - transfers data between an OLE DB provider and the Teradata Database.
• Custom Access Modules - provide access to specific systems through the DataConnector operator.

Access modules will work with the following tools and utilities on multiple operating systems:
• BTEQ
• FastExport
• FastLoad
• MultiLoad
• Teradata PT
• TPump

5.9.4 Querying
Basic Teradata Query (BTEQ) is a command-based program that allows users to communicate with one or more Teradata Database systems and to format reports for both print and screen output. SQL queries are submitted to the database, and BTEQ will format the results and return them to the screen, a file, or a designated printer. During a BTEQ session, the following actions can be taken:
• View, modify, add, and delete data.
• Enter operating system commands.
• Enter BTEQ commands.
• Create and use stored procedures.
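The sketch below shows a short interactive BTEQ session; the logon string, dictionary view, and file name are hypothetical, and view names can vary by release:

    .LOGON tdp1/analyst,secret
    .SET WIDTH 120
    SELECT DatabaseName, TableName
    FROM DBC.TablesV
    WHERE DatabaseName = 'Sales';
    .EXPORT REPORT FILE = sales_tables.txt
    SELECT DatabaseName, TableName
    FROM DBC.TablesV
    WHERE DatabaseName = 'Sales';
    .EXPORT RESET
    .LOGOFF
    .QUIT

Lines beginning with a period are BTEQ commands; everything else is sent to the database as SQL. Between .EXPORT and .EXPORT RESET, the formatted results go to the named file instead of the screen.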
5.9.5 Session and Configuration Management
The following tools are used for investigating sessions and configurations:
• Query Session - provides information about active sessions by monitoring the state of selected sessions.
• Query Configuration - provides reports on the current configuration of the database, including node, AMP, and PE identification and status.
• Gateway Global - provides monitoring and control capabilities for sessions of network-connected users.

5.9.6 Resource and Workload Management
Several tools are used to perform system resource and workload management in Teradata Database:
• Write Ahead Logging
• Ferret utility
• Priority Scheduler
• Teradata Active System Management

The Write Ahead Logging (WAL) protocol will record permanent data writes to a log file to provide an update record for the database. Modifications to permanent data can be batched, even when coming from different transactions, and the log file is written to the disk at key moments. The log contains Redo Records and Transient Journal (TJ) records. Log data is a sequence of WAL records and is not accessible through SQL; the WAL log has a simpler structure than a table, but is conceptually similar to tables. In the event of a system failure, the WAL log can be used to reconstruct the database. This mechanism will protect all permanent tables and all system tables, except:
• Transient Journal tables
• User journal tables
• Restartable spool tables
The Ferret utility will display and set storage space utilization attributes for the database. The utility reconfigures data in the file system while maintaining data integrity during the change. It can do so within vprocs, subtables, tables, disk, the WAL log, and cylinders. The utility is active on all Teradata Database systems. It includes:
• SCANDISK
• SCOPE
• SHOWBLOCK and SHOWSPACE

Priority Scheduler (schmon) controls access to resources based on the different active jobs in the database. Administrators can define priorities for different types of work. The following capabilities are available:
• Better service for higher priority work.
• Controls resource sharing between different applications.
• Prevents aggressive queries from consuming resources.
• Automates changes.

Teradata Active System Management (Teradata ASM) is a collection of products interacting with each other and a common data source for the purpose of allowing automation in:
• Workload management
• Performance tuning
• Capacity planning
• Performance monitoring

Some of the Teradata ASM products include:
• Teradata Workload Analyzer (Teradata WA) - analyzes DBQL data to group workloads and define rules for managing system performance.
• Teradata Viewpoint - defines rules for filtering, throttling, and defining classes of workloads.
• Query Bands - defined by the user or middle-tier application to allow sessions or transactions to be tagged with an ID.
• Open APIs - provides an SQL interface to PMPC through UDFs and external stored procedures.
• Resource Usage Monitor - used for data collection, and creates events for monitoring system resources.
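Query bands are set with plain SQL; a minimal sketch, with hypothetical name/value pairs:

    SET QUERY_BAND = 'ApplicationName=NightlyETL;JobID=1234;' FOR SESSION;

    -- Clear the session's query band when the job finishes
    SET QUERY_BAND = NONE FOR SESSION;

Workload rules and the DBQL log can then group, throttle, or account for all requests carrying these tags.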
5.10 Security and Privacy
5.10.1 Concepts of Security
Security in the Teradata Database is grounded in the following concepts:
• Database user - an individual or group of individuals represented as a single user identity.
• Logon - the submission of user credentials when requesting access to the database.
• Authentication - the verification of a user's identity.
• Authorization - determines the permissions to the database associated with a user.
• Privileges - explicitly or automatically granted permissions to a user or database.
• Security mechanism - provides specific authentication, confidentiality, and integrity services.
• Network traffic protection - secures the message traffic between network-attached clients and the Teradata Database.
• Message Integrity - ensures the message received is the same as the message sent, with no loss or change to the data.
• Access logs - provide a history of users accessing the database and the objects accessed.

5.10.2 Users
Users must be defined in the database or a supported directory. The CREATE USER statement defines permanent database users. Usernames are typically used to represent individuals and must be unique in the database.

Directory-based users accessing the database must be defined in a supported directory. The implementation may or may not require the creation of a matching database user. In any case, one or more configurations must be updated to allow directory users to access the database.

Users who access the database through a middle-tier application using trusted sessions are called proxy users. The application will authenticate the user rather than the database. Role-based database privileges are assigned to proxy users using a GRANT CONNECT THROUGH statement. A SET QUERY_BAND statement is submitted when establishing privileges for a particular connection to the database; however, the statement and the rules for applying it must be coded into the application in order to be used.
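A sketch of defining a permanent user and a proxy arrangement for a middle-tier application; all names, space sizes, and the role are hypothetical:

    CREATE USER hr_app AS
      PERM = 0,
      SPOOL = 500e6,
      PASSWORD = InitialPass123,
      DEFAULT DATABASE = hr;

    -- Let the application logon hr_app act on behalf of end user jsmith
    GRANT CONNECT THROUGH hr_app
      TO PERMANENT jsmith
      WITH ROLE hr_read;

The application then identifies the end user on each request (for example, through a PROXYUSER query band), and that user's work runs with the privileges of the hr_read role.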
Groups of users with similar needs can be granted privileges on database objects by associating them with a role. All objects that a role has privileges to can be accessed with those privileges by a member of that role. The use of roles will decrease the dictionary space required for granting privileges to individuals.

The CREATE ROLE statement is used to define a role, and the GRANT statement can be used to grant roles to users. The default role for the user must be specified when using the CREATE USER statement; to assign additional user roles, the MODIFY USER statement can be used. Users can switch between multiple roles they are members of using the SET ROLE statement. To access all roles, the user can use the SET ROLE ALL statement.

External roles are used to assign privileges to directory users. External roles and database roles are created in the exact same way; however, since directory users do not exist in the database, the roles are not granted to the directory users directly. Instead, the roles are assigned to groups of which the directory users are members, mapping directory users to database users. Proxy users are assigned roles using the GRANT CONNECT THROUGH statement.

5.10.3 Database Privileges
If a user has privileges on an object, they can access the object. If no privileges exist, they cannot access the object. The different types of database privileges include:
• Implicit - granted to the owner of the space where the database objects are created.
• Explicit - granted to a user or database directly, or to a role which is associated to a user.
• Automatic - granted to the creator of a database, user, or database object, or to newly created users and databases.
• Inherited - passed on based on a user's relationship to another user or role, such as directory users to database users.

Profiles can be defined and assigned to a group of users sharing similar values, such as:
• Default database assignment
• Spool space capacity
• Temporary space capacity
• Account strings permitted
• Password security attributes
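A minimal sketch of the role and profile statements just described, with hypothetical names and sizes:

    CREATE ROLE sales_read;
    GRANT SELECT ON sales TO sales_read;

    GRANT sales_read TO jsmith;
    MODIFY USER jsmith AS DEFAULT ROLE = sales_read;

    -- In-session switching between granted roles
    SET ROLE ALL;

    CREATE PROFILE analyst_profile AS
      SPOOL = 2e9,
      TEMPORARY = 1e9,
      DEFAULT DATABASE = sandbox;
    MODIFY USER jsmith AS PROFILE = analyst_profile;

Granting privileges once to sales_read, instead of to each user individually, is what saves the dictionary space mentioned above.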
5.10.4 Authentication
User authentication is a process where a user's identity is verified and compared to a list of approved users. The process consists of the following elements:
• Authentication method
• Logon format and controls
• Password format and controls

The categories of authentication are:
• Teradata Database authentication
• External authentication

For authentication, the Teradata Database requires the user, and the privileges for the user, to be defined. Users must choose a security mechanism to identify the authentication method used. Security mechanisms are part of the logon string, or the system will use the default mechanism. Teradata Database authentication makes use of the TD2 mechanism by default. The mechanisms do not need to be specified at logon unless another mechanism is set as the default.

External authentication utilizes an agent running on the same network as the Teradata Database and its clients. It is dependent on two elements:
• Security mechanism specified in the logon (authenticating agent).
• Logon type.

The different types of logon for external authentication are:
• Single Sign-on - users are authenticated in the domain and do not require subsequent logons.
• Sign-on As - the logon to Teradata Database is recognized by the domain.
• Directory Sign-on - the user is authenticated by the directory.
The following logon forms are found in Teradata Database:
• Command line
• GUI
• Logon from a channel-attached client
• Logon through a middle-tier application
When using a command line logon, the following information is provided by the user:
• logmech - the name of the security mechanism, specified to define the method used to authenticate and authorize the user.
• logdata - used for external authentication to specify a username and password.
• logon - used for Teradata Database authentication and optionally for external authentication.

A GUI logon is performed within dialog boxes, which include fields and buttons to prompt the user to enter the same logon information provided in a command line logon.

When channel-attached clients log on to sessions, network security features are not supported. These features include security mechanisms, encryption, and directory management of users. Logons require the submission of username, password, tdpid, and optional account string information.

When accessing the database through a middle-tier application, a database username must be used. A connection pool is set up by the application to allow end users to access the database without any formal logon.

Permissions are automatically granted for all users defined in the database to allow them to log on from all connected client systems. The configuration for this automatic process can be modified by administrators, specifically to:
• Modify current or default logon privileges.
• Restrict logon from a specific channel or network interface.
• Set the maximum number of unsuccessful logon string submissions.
• Enable external applications to perform authentication.
• Restrict access based on IP address.
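In BTEQ these three items are supplied as dot commands; a sketch of an LDAP-style logon (mechanism availability and the exact logdata format depend on site configuration):

    .LOGMECH LDAP
    .LOGDATA authcid=jsmith password=secret
    .LOGON tdp1/jsmith

For default Teradata Database (TD2) authentication, only the .LOGON command with tdpid/username and password is required.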
Passwords used in logons must conform to password format rules. These rules govern the type and number of characters allowed in the password. Password controls allow administrators to: • Restrict password content, including minimum and maximum password characters, characters allowed, and the use of specific words. • Set the number of days a password is valid. • Assign a temporary password. • Set lockout time after the maximum number of logon attempts is exceeded. • Define time period that a previous password cannot be reused.
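Many of these controls are implemented as password attributes in a profile; a sketch with hypothetical values:

    CREATE PROFILE secure_profile AS
      PASSWORD = (EXPIRE = 90,
                  MINCHAR = 8,
                  MAXLOGONATTEMPTS = 3,
                  LOCKEDUSEREXPIRE = 60,
                  REUSE = 365);
    MODIFY USER jsmith AS PROFILE = secure_profile;

Here passwords expire after 90 days, must be at least 8 characters, three failed logons lock the user for 60 minutes, and an old password cannot be reused for 365 days.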
Once a user has been fully authenticated, their defined privileges determine their authorized database privileges. Permanent users are authorized with the following privileges:
• Directly granted using the GRANT statement.
• Indirectly granted using automatic, implicit, and inherited privileges.
• Granted as a member of a role.
Directory-based users will be authorized access privileges to the database according to these rules:
• Each directory user is authorized the privileges of the objects if the directory maps the user to those objects.
• The directory user is authorized all privileges associated with a matching database user if the directory does not map users to those objects, but the directory username matches a database username.
• The directory user has no privileges to the database if the user is neither mapped to any database object nor has a username matching a database username.
The types of directories certified for use with Teradata Database are:
• Active Directory
• Active Directory Application Mode (ADAM)
• Novell eDirectory
• Sun Java System Directory Server
• LDAPv3-compliant directories
Middle-tier applications will log on to the database, be authenticated as a permanent database user, and establish a connection pool. Authentication of individual application end users is then performed by the application. All end users accessing the database through a middle-tier application are authorized privileges to the database, and audited in access logs, based on the permanent database user identity of the application.
Data protection in Teradata Database is enhanced by the following features:
• The logon string is encrypted, by default, to maintain the confidentiality of the username and password.
• Optional encryption of the message maintains the data's confidentiality.
• Data corruption is prevented through automatic integrity checking.
• BAR encryption will ensure data backup confidentiality between BAR servers and the storage device.
• Systems using LDAP authentication with simple binding utilize SSL/TLS protection.
• Systems using LDAP authentication with Digest-MD5 binding utilize SASL protection.
Two types of user security monitoring are performed by Teradata Database:
• Viewed through DBC.LogOnOffV, all user logon and logoff activity is collected in the Event Log using parameters such as database username, session number, and logon events.
• Viewed through DBC.AccessLogV, user attempts to access the database can be optionally recorded using parameters such as access type, requesting database username, requesting database object, request type, and frequency of access. Logging can be enabled and disabled using the BEGIN and END LOGGING statements. These statements can also establish the access parameters to log.

Users accessing a database through a middle-tier application using a trusted session and set up as proxy users are logged using their proxy username. If trusted sessions are not set up between the application and end user, the application's username is shown in the logs. Directory users are logged by their directory username and not their associated database name.

Some security-related system views used by the Data Dictionary:
• DBC.LogOnOffV - identifies all logon and logoff activity.
• DBC.LogonRulesV - identifies logon rules resulting from GRANT and REVOKE LOGON statements.
• DBC.AccessLogV - identifies a privileges check resulting from a user request.
• DBC.AccLogRulesV - identifies the access logging rules contained in each BEGIN and END LOGGING statement.
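A sketch of turning access logging on for one sensitive table and inspecting the result through the views above; the database, table, and selected columns are hypothetical and should be checked against the Data Dictionary reference:

    BEGIN LOGGING WITH TEXT ON EACH ALL
      ON TABLE hr.payroll;

    SELECT LogDate, UserName, AccessType
    FROM DBC.AccessLogV
    WHERE DatabaseName = 'hr';

    END LOGGING ON ALL
      ON TABLE hr.payroll;

The BEGIN LOGGING options control which occurrences are recorded (first, last, or each) and whether full statement text is kept.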
5.10.9 Security Policy
A security policy implemented for the database should consider:
• Balancing the need for secure data against the need for quick and efficient access to the data.
• A developed policy which is a mixture of system-enforced and personnel-enforced features.
• Whether the security features of the database meet the current security needs of the business.

A security policy document should detail:
• The need for security.
• The benefits of the security policy for users and the business.
• A description of security features in the database.
• Suggested and required actions related to security.
• Contacts for security-related questions.
6 Database Design
6.1 Development Planning
6.1.1 Design Considerations for Teradata Database
The primary requirement for any data warehousing effort is a properly designed and configured database. The effort provides long-range decision support as well as adhoc tactical and decision support. When considering a database solution, the major areas of interest fall into entering data, storing data, retrieving data, and supporting processes.

Key considerations surrounding entering data into the database include the IT users, the source data, and the transformation of the data. The various methods related to data transformation are:
• Accessing
• Capturing
• Extracting
• Filtering
• Scrubbing
• Reconciling
• Conditioning
• Condensing
• Householding
• Loading

The physical aspects of storing the data consist of the Relational Database Management System (RDBMS), how the data is organized in the RDBMS (Data Marts), and the replication and propagation of the data. Different user types require access to different types of data using different methods of retrieval, from batch jobs to queries to detailed analysis.
The considerations surrounding retrieving data cover the business users who require the data in the database, the access tools to the database, and the data mining methods used, including:
• Clustering
• Statistical
• Artificial Intelligence
• Neural Nets

Supporting the database solution are several components and processes in place for the entire lifecycle of the database. These include:
• Logical Models
• Middleware
• Metadata
• Data Dictionary
• Network Management
• Database Management
• System Management
• Business and Technology Services

6.1.2 Data Marts
Data marts are relatively small subsets of a data warehouse database which are application or function specific and designed for a narrowly defined user population. Three types of data marts are recognized:
• Independent data marts - also known as data basements; they are isolated entities separated from the enterprise data warehouse and utilize independent data sources.
• Dependent data marts - are part of the enterprise data warehouse, utilizing data from the enterprise data store and permitting users to have full access to the enterprise data store.
• Logical data marts - a form of dependent data mart which is virtually constructed from the physical database.
6.1.3 Data Warehousing
Data warehouse designs typically "start small" using data marts before transforming into a large data warehouse solution. This approach is nearly always unsuccessful. Most often, the data mart is originally unsuccessful because the design restricts the user's access to data which they really need, usually because the authority is too restrictive or the data is not present in the data mart. If the new data mart is successful, the user demands placed on the data mart will cause the database to grow faster than expected, increasing response times and adding to user frustration. The result is a failed database solution and, sometimes, the perception that the reasons for failure were always present. The following are popular summaries for data warehouse failures:
• Denormalization.
• Aggregates stored over detail data.
• Focus on a small, preselected set of queries.

The introduction of data warehousing focused on the provision of a historical database containing data derived from an active operational database. Data in this context was oriented to a specific subject, integrated, nonvolatile, and handled time variances. The data was typically added to the data warehouse from the operational database after some defined age. The data was static: data was not inserted, updated, deleted, or modified. The data warehouse was only used to perform historical queries of the data.

Though data marts have a place in data warehousing, they are not, and should not be, the focus of a data warehouse solution. The Teradata Database solution is a dynamic solution with active data supporting a business' ongoing, day-to-day operations. The data is volatile: it can be changed, updated, deleted, and added. The development of the Teradata solution fulfills the following goals:
• Large capacity, parallel processing database.
• Centralized shared information architecture.
• Manageable, scalar growth.
• Faster response times.
• Use of a standard, non-proprietary access language.
• Fault tolerance.
• Redundant network connectivity.
6.1.4 Parallel Processing
Parallel processing refers to the Teradata Database's ability to allow its file system, lock manager, message subsystem, and query optimizer to work in parallel. The system supports parallel processing by balancing the database's workload. The balance is achieved by distributing table rows across AMPs and giving the responsibility for the data to those AMPs. Each data row is owned by exactly one AMP, which has the ability to create, read, update, or lock its data. All relational operations are performed in parallel, simultaneously, and unconditionally across all AMPs, and they are performed independently of the data on other AMPs in the system.

Teradata Database is a shared-nothing database architecture: the PE and AMP vprocs in the architecture do not share either memory or disk storage across central processing units. As a result, each AMP has exclusive control over its own virtual disk space.

The hash distribution of rows across AMPs enables request parallelism. These rows are hashed across AMPs using the row hash value of the primary index, and this row hash is used to retrieve the row from the AMP.

The operations of Teradata Database, including the BYNET, are supported by internodal communication. Each system node will have multiple BYNET paths to every other node in the same system. BYNET internodal communication is used when the same information is required by all AMPs; the information is typically broadcast to all AMPs in the system at the same time. To minimize the cost of this internodal communication, Teradata Database will use a non-collision, point-to-point monocast communication architecture which allows a single sender to connect with a single receiver.

Query optimization is another consideration in a parallel environment. The Teradata Database optimizer is designed specifically to optimize queries in a parallel architecture. The optimizer is aware of the parallel system in the following ways:
• Determines how long each operation takes to perform the query, to determine the optimal ordering of joins.
• Uses statistical and demographical information related to the AMPs, table cardinalities, and the number of rows used in a particular operation.
• Determines an AMP-to-CPU ratio to define the most efficient query plan.
• Determines the need to redistribute rows for a join operation.

Request parallelism in Teradata Database is multidimensional, including the following dimensions:
• Query parallelism
• Within-step parallelism
• Multistep parallelism
• Multistatement request parallelism

When a query plan is generated by the Optimizer, the components of a query are divided into concrete steps and dispatched to the appropriate AMPs for execution; within each concrete step, relational operations are processed in parallel. Multistep parallelism refers to the system's ability to invoke more than one process for each step in the request: multiple steps of the same request can be executed simultaneously as long as the results of those steps are not dependent on each other.

The Teradata SQL extension, multi-statement requests, allows a number of distinct SQL statements to be bundled together and treated as a single unit. The statements are executed in parallel.
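As a small illustration: in BTEQ, beginning a line with a semicolon attaches the next statement to the same request, producing a multi-statement request (table names hypothetical):

    SELECT COUNT(*) FROM sales_2010
    ;SELECT COUNT(*) FROM sales_2011;

Both SELECTs are planned and executed as one unit, and their answer sets are returned together.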
6.1.5 Usage Considerations
Some design considerations for Teradata Database are:
• Online Transaction Processing (OLTP)
• Decision Support (DSS)
• Summary Data
• Detail Data
• Simple and Complex Queries
• Adhoc Queries

6.1.6 ANSI/X3/SPARC Three Schema Architecture
The American National Standards Institute/Standards Planning and Requirements Committee (ANSI/X3/SPARC) architecture was developed to define database management systems before relational databases were commercially available. Three levels of the architecture are specifically defined by ANSI/SPARC:
• External - composed of all views of the underlying physical database.
• Conceptual - transparently maps External level views to the physical storage of the actual data in the database.
• Internal - the physical storage of data on the storage media.

6.1.7 Design Phases
The purpose of the logical database design is to formally define the objects and their relationships, as well as the attributes applicable to those objects. The first step in designing the logical database is normalization. The structure of database management is built on normal forms and derivations as applicable to a series of inference rules and formal logical operations derived from set theory.

After the logical database design is achieved, Activity Transaction Modeling (ATM) is introduced. In this phase, ATM will define physical attributes for the logical data model. The process includes the following efforts:
• Identifying business rules that impact data storage.
• Identifying and modeling database applications.
• Identifying and modeling application transactions.
• Summarizing table and join accesses.
• Creating a preliminary set of data demographics.
• Identifying and defining attribute domains and constraints for physical columns.

The results of these activities act as an input into the design of the physical database. The physical design phase is the point where commitment is made to the physical attributes of the database. In essence, the entities, attributes, and relationships derived from the logical model are translated into the physical database design process. The phase will identify and create actual databases, base tables, views, indexes, triggers, macros, and other objects.

6.1.8 Requirements Analysis
The purpose of requirements analysis is the development of an enterprise data model: a blueprint for ensuring IT standards exist and are integrated into the business enterprise, including cost constraints and regulations, as well as usage constraints and policies. The principal concept behind the enterprise data model is a logical construct defining and controlling interfaces and the integration of all components of the enterprise information structure. Based on an idea created by John Zachman, the basic framework behind the enterprise data model has the following components: perspectives and dimensions.

The different perspectives include:
• Planner - defines the scope of the project, which results in a scope definition.
• Owner - defines the envisioned product, which results in a business model.
Locations . each perspective should be handled in order. which results into a system model. Analysis of requirements ensures the development of the system is done properly the first time. Motivation . • Subcontractor . People . Activities .defines the users of the model.defines the geographical aspects.constructs the product. often defined by generic models. The greatest disadvantage is the possible insertion of hidden agendas or conflicting perspectives on the importance of specific requirements.constructs the individual components of the product. Legacy database can provide substantial information on the strengths and weaknesses of the data and how that data is used or misused. Each dimension is addressed by each perspective. The disadvantage is that most legacy systems are developed with little or no direction or guidance. and little explanation on what currently exists and why. • Builder . Each dimension is independent from each other and has equal importance to the overall model. and produces reasonable and reliable estimates of costs.which results into a business model.defines goals promoting the model. resulting into a technology model. The different dimensions include: • • • • • • Entities . interviewing and review of legacy databases.defines the specific inputs into the process. Requirements are generally identified through two activities which are. 65 . Each perspective is dependent on the product of the perspective preceding it.defines the work to be performed. Therefore. • Designer . The resulting items of each perspective/dimension question become the requirements for the end product. Interviewing can provide the current demands on database requirements. supports user access to the system.defines when activities occur.ensures the scope and requirements are transformed into a product specification. Time .
6.1.9 Entity-Relationship Models
Teradata provides several industry-specific logical data model (LDM) frameworks, including:
• Communications
• Financial Services
• Healthcare
• Insurance
• Manufacturing
• Media
• Retail
• Transportation and Logistics
• Travel Industry

Most of these models are based on the entity-relationship model for database management. Relational models for the database identify connections between data within the database. Data models are an abstract, logical definition of objects, operators, and supporting elements to the structure and behavior of the data. The entity-relationship (E-R) model extends the semantics of the relational model to obtain greater meaning out of the data. In this model, objects are tangible in the real world and captured as entities, each with unique characteristics captured as attributes. Entities interact with each other through relationships.

Key definitions within this context are:
• Entity - a database object representing a real world "thing".
• Attribute - a characteristic of an entity; at least one attribute, called the primary key, always exists.
• Relationship - an association between two or more entities.
• Derivative Attribute - any attribute derived through a calculation from other data in the model.

From a design perspective, several types of schemas exist for categorizing entities. For Teradata, two schemas are commonly referred to:
• Major and minor entities - represent entities of large and small cardinality and degree. Major entities are updated frequently, while minor entities are updated rarely.
• Supertype and subtype entities - represent a generic entity (supertype) comprised of several specific entities (subtypes).
Relationships are further characterized by the following terms:
• Cardinality - the number of entity instances related through the relationship.
• Degree - the number of entities associated in the relationship.
• Connectivity - the mapping of entity occurrences in a relationship (values are 0, 1, many).
• Existence Dependency - a relationship between entities where the existence of one entity depends on the existence of the other entity.
• 1:1 Relationship - the occurrence of one entity is related to at most one occurrence of another entity, and vice versa (mathematically, A leads to B and B leads to A).
• 1:M Relationship - the occurrence of one entity is related to 0, 1, or more occurrences of another entity, but in reverse, an occurrence of the second entity is related to at most one occurrence of the first (mathematically, A leads to at least one B and B leads to exactly one A).
• M:M Relationship - the occurrence of one entity is related to 0, 1, or more occurrences of another entity, and vice versa (mathematically, A leads to at least one B and B leads to at least one A).

6.1.10 Normalization Process
Some definitions used within the context of normalization are:
• Alternate key - any candidate key not selected as a relation's primary key.
• Associative Table - a non-prime table where all primary key columns are also foreign keys.
• Attribute - carries a unique name; its values are drawn from the domain and constrained.
• Body - a composite value set of a relation assigned to tuple variables.
• Candidate key - an attribute set that uniquely identifies a tuple.
• Non-Prime table - a table with a composite primary key.
• Prime Table - a table with a single column primary key.
• Composite key - a key defined on multiple attributes.
• Domain - all possible values specified for a given attribute.
• Field - the intersection of a tuple and an attribute.
• Foreign key - an attribute set based on an identical attribute set which is a candidate key for a different relation.
• Heading - attached to each attribute and comprised of a name and a domain, or a data type.
• Instance - a unique instance of a relation, consisting of at least one primary key and any associated attributes.
• Intelligent key - an "overloaded" simple key encoded with more than one fact.
• Key - any set of attributes uniquely identifying a tuple.
• Natural key - represents a real world identifier for a tuple in a relational database.
• Primary key - an attribute set uniquely identifying each tuple in a relation. Every relation has one and only one primary key.
• Relation - a representation of data in tabular form.
• Relational schema - a set of relations in a logical relational model.
• Repeating group - a collection of logically related attributes occurring multiple times in a tuple.
• Simple key - a key defined on a single attribute.
• Superkey - an attribute set uniquely identifying each tuple in the relation.
• Surrogate key - an artificial simple key identifying individual entities in an arbitrary way.
• Tuple - a single row drawn from the complete set of tuples for a relation.
Since relational databases are based on set theory, logical operators are used to construct and decompose relationships in the database:
• Difference - represents the set of all attributes contained in one relation but not another.
• Divide - the division of one relation of degree by another relation of degree to produce a quotient relation of degree.
• Intersection - represents the set of all attributes contained in both relations and only in both relations.
• Join - the boolean product of the concatenation of tuples within relations.
• Product - represents all multiples of all attributes found in associated relations.
• Project - any subset of attributes of a relation is a projection on the relation.
• Restrict/Select - any subset of tuples satisfying specified conditions.
• Union - represents all attributes found in either relation or both relations.

Relations (tables) are decomposed vertically and horizontally. Normal forms define a system of constraints placed on a relation. Normalization is performed through identified dependencies between attributes, eventually reducing a relational database to a single aphorism: one fact in one place. Different layers of normal forms provide greater detail related to the normalization process. Relational databases are typically broken into six layers, as follows:
• First Norm (1NF) - the first layer, where all fields of a relation variable are atomic; that is, each field contains one value and one value only. A relation consists of a primary key and zero or more attributes, where the primary key uniquely identifies any tuple and attributes represent a single-value property of the entity. Formally, relations in a relational database meet the constraints of the first norm; any relation in a relational database is considered to be in first norm by definition.
• Second Norm (2NF) - eliminates nonkey attributes not describing the primary key. The relationship between the primary key and the attributes is one-to-one from primary key to attributes: all nonkey attributes are functionally dependent on the primary key.
• Third Norm (3NF) - eliminates circular dependencies. The relationship between two nonkey attributes is not one-to-one in either direction. A stricter form is called the Boyce-Codd Normal Form (BCNF), which defines a relation if
and only if every determinant is a candidate key.
• Fourth Norm (4NF) - eliminates multivalued dependencies from relations. The relation is in BCNF and has no anomalies associated with multivalued dependencies.
• Fifth Norm (5NF) - also known as projection-join normal form (PJ/NF). Every join dependency is a consequence of the candidate keys of the relation. Any relation that is in 3NF is in 5NF if every candidate key in the relation is simple.
• Sixth Norm (6NF) - the relation satisfies no nontrivial join dependencies.

A primary key uniquely identifies each tuple associated with a relation variable. Primary keys do not identify order or an access path. Primary key attributes cannot be null, cannot contain duplicate values, should never be modified, and cannot be defined on columns with the BLOB or CLOB data types. To identify a primary key:
1. Identify any superkeys - a set of attributes uniquely identifying the tuples of a relation.
2. Eliminate any redundant or unnecessary attributes in the superkeys to create a candidate key set.
3. Select a primary key from the candidate keys: every relation has one and only one primary key.

Any candidate key that is not selected is considered an alternate key. To enforce the Referential Integrity Rule, a UNIQUE constraint can be assigned to any alternate key. Recommendations for selecting primary keys are:
• Select numeric attributes whenever possible.
• Select attributes that remain unique.
• Select attributes that rarely change.
• Select system-assigned keys whenever possible.
• Never use intelligent keys.
• Name primary key attributes using a consistent convention.
• In a supertype-subtype relationship, always assign the same primary key to both sides of the relationship.

When a simple primary key cannot be chosen, surrogate keys can be used. A surrogate key is an artificial simple key used when no natural key exists. A brief DDL sketch of these recommendations follows.
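A sketch under these rules, with a hypothetical employee/department pair: a system-assigned surrogate serves as the primary key, the natural candidate key is kept as an alternate key through a UNIQUE constraint, and the foreign key anticipates the Referential Integrity Rule discussed next:

    CREATE TABLE department (
      dept_id   INTEGER NOT NULL,
      dept_name VARCHAR(60),
      PRIMARY KEY (dept_id));

    CREATE TABLE employee (
      emp_id   INTEGER NOT NULL,      -- system-assigned surrogate key
      ssn      CHAR(9) NOT NULL,      -- natural candidate key, kept as alternate key
      emp_name VARCHAR(60),
      dept_id  INTEGER,
      PRIMARY KEY (emp_id),
      UNIQUE (ssn),
      FOREIGN KEY (dept_id) REFERENCES department (dept_id));

The primary key columns are numeric, NOT NULL, carry no embedded meaning, and would never be modified after assignment.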
Foreign keys are attribute sets in one relation found in another relation, acting as a primary or alternate key. There are three types of foreign key values:
• Mirror image
• Wholly null
• Partially null

The Referential Integrity Rule states that if a relation has a foreign key matching a primary key of another relation, every value of the foreign key must either be equal to a primary key value or be wholly null. The rule is enforced using the PRIMARY KEY and FOREIGN KEY clauses of the CREATE TABLE statement. Foreign keys cannot be defined on columns with the BLOB or CLOB data types, nor can they be defined for a global temporary trace table.

6.1.11 Join Modeling
A join describes when two tables are associated with each other. A relational join describes an operation where data from two or more tables is combined. A join requires any two relations to share a common attribute. The result of a join is another relation, a property defined by the law of conservation of relations. There are several types of joins: a binary join defines the joining of two relations, and when three or more tables are combined, the expression n-ary join is used (tertiary, etc.).

Consider that normalization will decompose relations into smaller relations. If the smaller relations (tables) are joined to recreate the original relation, the join is considered lossless. If the relations are joined and do not result in the original relation, the join is considered lossy. Since joins between relations are also relations themselves, it is possible to have a lossless join relation: any table which exists solely to ensure that any adhoc query against a set of relations will only use standard operators of the relational algebra.

When a database has been improperly normalized, or the result of normalization is misunderstood, semantic disintegrity can occur. A sign of this occurs when an incorrect answer is provided for a user's query. Due to the presence of foreign keys, primary index joins can be performed. Attribute mapping provided within join processing can provide an analysis of semantic disintegrity. The following properties apply:
• Mappings are either 1:1 or 1:M.
• A 1:1 mapping between attributes is equivalent to a functional dependency (X and Y).
• 1:1 is loosely defined: X can be mapped to only one Y, but the same value Y can be mapped to different values of X.
• If a mapping is not 1:1, then the mapping must be 1:M.
• A sequence of attribute mappings must meet the lossless join property.
If the above properties hold, semantic disintegrity can be avoided. Dependency preservation is a product of a relation's decomposition, where all functional dependencies of the original relational schema can be implied by the functional dependencies in the decomposed relations.

6.1.12 Activity Transaction Modeling Process
The Activity Transaction Modeling (ATM) Process is the initial step in transforming the logical data model into physical data. The activities of the ATM process include:
1. Identify all applications.
2. Model identified applications.
3. Model identified transactions.
4. Transfer access information.
5. Identify column change ratings.
6. Compile or estimate demographic information of the data (see the sketch following this list).
7. Define all domains.
8. Define all constraints.
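In Teradata, the demographic information referenced in step 6 is typically gathered or refreshed with COLLECT STATISTICS, so the Optimizer has cardinality and value-distribution data; the table and column names are hypothetical:

    COLLECT STATISTICS ON customer COLUMN cust_id;
    COLLECT STATISTICS ON customer COLUMN region_id;

    HELP STATISTICS customer;

HELP STATISTICS displays the collected demographics for review during the ATM activities.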
The ATM process defines the following terms accordingly:
• Domain - a defined, closed set of values, commonly referred to as a data type.
• Table - an abstract representation of an entity constructed of rows (tuples) and columns (attributes).
• Column - a unique, atomic attribute of a relational entity.
• Row - an instance of an object in a relational table.
• Constraint - a physical restriction defined for a column or table.
• Primary key - a column set that uniquely identifies a tuple within a relation.
• Foreign key - a column set identifying a relationship between two or more tables in a database.
• Identity Column - consists of unique and system-generated columns which are frequently used to create surrogate keys.
• Normalization - segregates the attributes of a database into individual tables to allow the attributes to be uniquely modified.

6.2 Indexing
Indexes are used by the Optimizer to make table access more efficient. Indexes in a relational database are tables consisting of columns and rows that reference base tables in the database. The rows are made up of two parts: a data field in the referenced table and a pointer to the location, or possible locations, of the row in the base table. Indexes provide direct access to data typically retrieved by common, planned queries. Full-table scans do not require indexing and represent the activity of unplanned, adhoc queries, accessing each data block once.

The Teradata Database will update the index subtables every time an indexed column value is updated or deleted in the base table, or a new row is inserted. Subtables store secondary, hash, and join indexes, which use a substantial amount of system storage space. Additional storage space is required when fallback features are defined for a table.

The number of rows retrieved by the index determines the index's selectivity. Highly selective indexes retrieve very few rows; a low selectivity identifies those indexes which retrieve many rows.

Primary indexed tables will distribute rows across multiple AMPs. This distribution is enabled using a hashing algorithm, which computes a row hash value based on the value of the primary index. The value is a 32-bit value containing either a 16-bit hash bucket number with a 16-bit remainder or a 20-bit hash bucket number with a 12-bit remainder. A Teradata system can have up to 65,536 hash buckets (16-bit hash buckets) or 1,048,576 hash buckets (20-bit hash buckets), based on the settings of the CurHashBucketSize and NewHashBucketSize flags. Hash buckets are distributed across AMPs as evenly as possible, and the row hash is used to retrieve the row from the AMP.

Rows are hashed differently between primary index tables and no primary index (NoPI) tables. For tables with a primary index, the 32-bit row hash value of the primary index is stored with the column data for the row. A NoPI table will store the row hash value with the RowID generated by the AMP software after assigning the row to the AMP. RowIDs uniquely identify each row, but will change whenever the value of the primary index or partitioning column for the row changes, and can be reused after any association with a current row has ended.
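The hashing functions exposed in SQL make this distribution observable; a sketch that counts how many rows of a hypothetical table each AMP owns:

    SELECT HASHAMP(HASHBUCKET(HASHROW(cust_id))) AS amp_no,
           COUNT(*) AS row_count
    FROM customer
    GROUP BY 1
    ORDER BY 2 DESC;

A strongly skewed result points to a poor primary index choice, since one AMP would own a disproportionate share of the rows.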
There are a number of different index types falling under four general categories: primary, secondary, join, and hash. Unique indexes, utilizing the primary key column set constraint, will enforce a unique value for a particular column set: unique primary indexes (UPI) and unique secondary indexes (USI). Nonunique indexes do not require a unique value: non-unique primary indexes (NUPI) and non-unique secondary indexes (NUSI). Nonunique primary indexes are typically used to join by defining entities with the same primary index to ensure rows are hashed to the same AMP. The different types include:

• Unique nonpartitioned primary index.
• Unique single-level partitioned primary index.
• Unique multilevel partitioned primary index.
• Nonunique nonpartitioned primary index.
• Nonunique single-level partitioned primary index.
• Nonunique multilevel partitioned primary index.
• Unique secondary index.
• Nonunique secondary index hash-ordered on a single column with no ALL option.
• Nonunique secondary index hash-ordered on all columns with no ALL option.
• Nonunique secondary index value-ordered on a single column with no ALL option.
• Nonunique secondary index hash-ordered on a single column with the ALL option.
• Nonunique secondary index value-ordered on a single column with the ALL option.
• Single-table simple join index.
• Single-table aggregate join index.
• Single-table sparse join index.
• Multitable simple join index.
• Multitable aggregate join index.
• Multitable sparse join index.
• Hash index.
Multitable join indexes are defined for frequently performed join queries, while single-table join indexes define joins by hashing frequently joined subsets of base table columns to the same AMP. Hash indexes are similar to single-table join indexes, with syntax similar to that of secondary indexes.

6.2.1 Primary Indexes

Tables in the Teradata Database can have zero or one primary index; only one primary index can be defined for each table. Primary indexes are used to define the distribution of the rows to the AMPs, ensure access to rows without a full-table scan, provide efficient joins, and aid in efficient aggregation.

Primary indexes are created using the CREATE TABLE statement. A PRIMARY INDEX may be explicitly defined, NO PRIMARY INDEX may be explicitly defined, or neither may be specified. If a primary index is not explicitly defined, the existence of a default primary index is dependent on a PRIMARY KEY constraint or any UNIQUE constraints and on the setting of the PrimaryIndexDefault Control flag. The primary index definition will specify no more than 64 columns, and those columns cannot have BLOB, CLOB, UDT, Period, or Geospatial data types. Multi-value compression cannot be specified for primary index columns or for partitioning columns of a PPI partitioning expression.

As mentioned above, primary indexes can be unique or non-unique. Unique primary indexes are assigned to major entities and subentities in non-temporal tables. The unique primary index column set for non-temporal tables should always be set with a NOT NULL attribute. Subentities will typically use the same unique primary index as the major entity with which they are associated, to ensure that related rows in the different tables are hashed to the same AMP. Non-unique primary indexes are assigned to minor entities and defined on the same column as the major entities with which the minor entity is associated. A unique primary index cannot be defined for a temporal table, nor is a default primary index assigned for a temporal table; all temporal tables are defined to have only non-unique primary indexes.

Primary indexes can be partitioned or non-partitioned. A non-partitioned primary index (NPPI) is the standard primary index for the Teradata Database. When created with an NPPI, a table will hash its rows to the appropriate AMPs and store them in row hash order. A partitioned primary index (PPI) is an extension of the NPPI. Partitioned primary indexes can have up to 15 levels and can be applied to base tables (standard temporal and non-temporal base tables), global temporary tables, volatile tables, and non-compressed join indexes. Both single-level and multilevel PPIs can be defined for global temporary and volatile tables. Partitioned primary indexes cannot be applied to queue tables, NoPI tables, row-compressed join tables, and hash indexes.
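As a concrete illustration of these clauses, here is a minimal sketch of both forms. The tables, columns, and the RANGE_N partitioning expression are hypothetical.

    -- Unique primary index (UPI): rows are hashed on emp_id.
    CREATE TABLE Employee
      (emp_id   INTEGER NOT NULL,
       emp_name VARCHAR(60))
    UNIQUE PRIMARY INDEX (emp_id);

    -- Partitioned primary index (PPI): rows are hashed on store_id and
    -- grouped on each AMP into monthly partitions by sale_date.
    CREATE TABLE Sales
      (store_id  INTEGER NOT NULL,
       sale_date DATE    NOT NULL,
       amount    DECIMAL(10,2))
    PRIMARY INDEX (store_id)
    PARTITION BY RANGE_N(sale_date BETWEEN DATE '2012-01-01'
                                       AND DATE '2012-12-31'
                                       EACH INTERVAL '1' MONTH);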
Non-primary index (NoPI) tables are non-temporal tables which have no primary index and a table type of MULTISET. These tables are used to improve the performance of data loading operations. Without a primary index, a NoPI table will not hash rows to an AMP using a primary index value; instead, the Query ID for a row is used in the hash, or a different algorithm. A Row ID in a NoPI table row is randomly selected using an arbitrary hash bucket owned by an AMP. NoPI tables cannot specify a permanent journal or an identity column, nor can SQL UPDATE or MERGE requests be used to update a NoPI table. NoPI tables cannot be used as a temporal table, queue table, error table, or SET table. Also not allowed are system-derived PARTITION or PARTITION#Ln columns.

When a PPI is created for a table or join index, the rows are hashed to AMPs and assigned internal partition numbers. PPI rows are grouped into partitions on an AMP using their partition number, then ordered by the row hash value within each partition, and then by uniqueness value. NPPIs will be ordered using only the row's hash value. There are two partition numbers, an internal and an external number. The internal number is calculated from the external partition number, while the external number is the partitioning expression for a row in a PPI table. These numbers are based on the user-defined value of a partitioning expression. The partition number field is located in the Row ID and represents the combined partitioning expressions of a PARTITION BY clause. The design of PPIs will optimize range queries.

6.2.2 Secondary Indexes

Secondary indexes are never required for tables, but are used to improve performance. Instead of using the primary index path, the secondary index will specify other access paths to the desired tables, particularly for repetitive or standard queries. These indexes can be created when a table is created, or added later using the CREATE INDEX statement. Up to 64 columns can be found within a secondary index definition, and the columns defined in the definition cannot contain BLOB, CLOB, Period, Geospatial, or UDT data types. A Teradata Database table can have up to 32 secondary, hash, and join indexes.

Secondary indexes can be unique or non-unique. Unique secondary indexes (USI) are assigned to any column constrained by unique values. Non-unique secondary indexes (NUSI) are generally assigned to non-unique column sets whose attributes are frequently sorted. They will typically appear in conditions identified by the WHERE clause, such as:

• Selection conditions
• Join conditions
• ORDER BY clauses
• GROUP BY clauses
• Foreign keys
• UNION
• DISTINCT

For row access using a single value, USIs are preferred; NUSIs are preferred for range query access. USIs are used to access base tables or to enforce data integrity. The Teradata Database will implement a USI when a non-primary index uniqueness constraint is created for a nontemporal table using the PRIMARY KEY or UNIQUE constraints. These constraints are implemented on temporal tables only when a valid time table has a column specified with a NONSEQUENCED VALIDTIME UNIQUE constraint.

Unique secondary indexes do not allow FastLoad, MultiLoad, or Teradata Parallel Transporter operations (LOAD and UPDATE) to load data if the indexes are associated with the target base tables. Before any load is successful, all USIs associated with the table must be dropped, or other load utilities must be used, such as Teradata Parallel Data Pump, BTEQ, or the Teradata Parallel Transporter INSERT and STREAM operators. If requests are currently running, the index will place a READ lock on the index subtable, thus blocking any DROP INDEX transaction from completing. If a DROP INDEX transaction is running, the EXCLUSIVE lock is activated and will not allow other processing until completed.

Depending on the type of base table and format, a subtable row will be structured slightly differently. Usually, the fields in the row layout include:

• Row Length (2 bytes) - defines the number of bytes in the row, including overhead.
• Row ID (8 bytes) - uniquely defines the USI row by combining the row hash (output of the hashing algorithm) and uniqueness value (system-generated integer).
• Overhead (2 bytes) - unused.
• Secondary index value (up to 65,524 bytes) - defines the column values for the USI.
• Base table row ID (8 bytes for NPPI tables and 10 bytes for PPI tables) - defines the RowID of the base table row identified by the USI. The RowID for a PPI table contains the partition number for the row in addition to the row hash and uniqueness value; the RowID for an NPPI table will not contain a partition number.
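As noted above, bulk utilities such as FastLoad require that USIs on the target table be dropped first and re-created afterwards. A hypothetical sequence (table and column names are illustrative):

    -- Drop the USI so FastLoad can run, then restore it afterwards.
    DROP INDEX (order_nbr) ON Orders;
    -- ... run the FastLoad job against Orders here ...
    CREATE UNIQUE INDEX (order_nbr) ON Orders;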
The Teradata Database will distribute a USI row to a different AMP than the base table row the index identifies. The USI value request is accessed by hashing to its subtable and reading the pointer to the base table in order to access the stored row directly. USI access is typically a two-AMP operation. To use a USI to locate a row, the following process is used:

1. The Parser checks the syntax and lexicon of the query.
2. The Parser looks up the Table ID for the USI subtable containing the desired USI value.
3. The USI value is hashed by the hashing algorithm.
4. An AMP steps message is created by the Generator. A three-part BYNET message is specified by USI access, containing the USI Table ID, USI row hash value, and the USI data value.
5. The USI row hash is used by the Dispatcher to send a message across the BYNET to the appropriate AMP containing the USI subtable row.
6. The AMP locates the USI subtable using the USI Table ID.
7. The AMP locates the index row in the subtable using the USI RowID.
8. The AMP reads the base table RowID from the USI row and distributes a message containing the base table ID and RowID for the requested row.
9. The RowID is used to locate the base table row.

NUSI access will use all AMPs in its operations. The subtable access is not hashed for NUSIs, requiring the subtables to be scanned in order to identify the relevant pointers to base table rows. Non-unique secondary indexes are best used for range access with equality and non-equality conditions. Other selections can provide conditions for BETWEEN, GREATER THAN, LESS THAN, or LIKE. Composite set selection with an OR or AND can be used. ORed expressions allow any predicate condition in the WHERE clause to evaluate to TRUE for the specified condition, while ANDed expressions require all predicate conditions to evaluate to TRUE. If any condition evaluates to FALSE, no row is retrieved for the set of conditions.

Multiple NUSIs are frequently defined for the same table. How many of those NUSIs are used in the query plan depends on the individual and composite selectivity. While individual NUSIs may not be highly selective, combinations of NUSIs may be highly selective. If low selectivity is possessed by both indexes, the Optimizer will use bit mapping. The purpose of bit mapping is to drastically reduce the number of base rows to be accessed; it is only used when weakly selective indexed conditions are ANDed. If an OR expression is used, the Optimizer will typically perform a full table scan.
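The CREATE INDEX forms for the secondary index variants discussed above look roughly as follows; the table and column names are hypothetical.

    -- USI: enforces uniqueness and gives two-AMP single-row access.
    CREATE UNIQUE INDEX (order_nbr) ON Orders;

    -- NUSI on a frequently filtered column.
    CREATE INDEX cust_nusi (cust_id) ON Orders;

    -- Value-ordered NUSI, useful for range predicates on a date.
    CREATE INDEX date_nusi (order_date)
      ORDER BY VALUES (order_date) ON Orders;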
A query is covered by a NUSI if the following is true:
• The query references only columns in the NUSI.
• All columns referenced in the query are included in the NUSI.
• An ALL option is defined for the NUSI.
• The query does not reference a character column set that is not defined as either CASESPECIFIC or UPPERCASE in the base table; or, if it does, no CASESPECIFIC condition exists in the conditions of the query; or, if any such character columns exist, they are only specified in a COUNT function or UPPERCASE operator in the query.

When a query is covered, it means that all columns requested in the query are available in some index subtable without having to access a base table. Partial covering means that only some of the columns requested are available.

6.2.3 Join Indexes

Join indexes are designed to permit the resolution of queries by accessing the index without accessing or joining the underlying base tables. If a query is not covered by a join index, in whole or in part, the Optimizer will often use the index to join the underlying base tables to provide greater optimization. Join indexes will join multiple tables in a prejoin table. A single base table can be replicated as a whole, or in part, and its rows partitioned using a different primary index than the base table. Join indexes allow one or more columns of a single table to be aggregated as a summary table. When the index structure contains all columns referenced by joins, multitable join indexes are useful.

A hash or join index definition cannot be specified with a system-derived PARTITION column; however, a user-named column called partition can appear in the index definition. The join index definition is the only definition which uses the ROWID keyword. It can be used optionally to specify the ROWID for the base table. If multiple tables are referenced in the definition, each ROWID specification must be qualified. An alias name for the ROWID can be referenced. If an alias name is referenced in the select list of the join index definition, the same alias must be referenced in the CREATE INDEX statement creating the secondary index. Aliases are required to resolve any ambiguities in column names or ROWIDs in the select list of the join index definition.
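A minimal multitable join index sketch, assuming hypothetical Customer and Orders tables that are frequently joined on cust_id:

    -- Prejoin table maintained automatically by the system.
    CREATE JOIN INDEX cust_ord_jix AS
      SELECT c.cust_id, c.cust_name, o.order_nbr, o.amount
      FROM   Customer c
             INNER JOIN Orders o ON c.cust_id = o.cust_id
    PRIMARY INDEX (cust_id);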
When a join index is created, the system will automatically transfer any column multi-value compression defined in the base table to the join index definition. The following rules govern the transfer of compression values to join index columns:
• Multi-value compression transfers will occur to a join index definition even if alias names are specified for columns in the join index definition.
• Transfers to a multitable join index definition will continue as long as the maximum header length of the join index is not exceeded.
• A transfer to a join index definition cannot happen if the column is a component of the primary index for the join index.
• A transfer to a join index definition cannot happen for any columns which are components of a partitioned primary index or partitioning expression for the join index.
• A transfer to a join index definition cannot happen if the column is a component of an ORDER BY clause in the definition.
• A transfer to a join index column will not happen for any column specified as an argument to the functions COUNT, EXTRACT, or SUM.
• A transfer to a column cannot happen in a compressed join index which has indexes defined on its column_1 and column_2.

The different types of join indexes include:
• Simple join indexes - satisfy any query performing a frequent join operation by defining a permanent prejoin table without violating the normalization of the database schema.
• Single-table join indexes - a simple join index on a single table; a database object created by a CREATE JOIN INDEX statement specifying only one table in its FROM clause. All or some of the columns can be hashed on a foreign key, which hashes rows to the same AMP as another large table.
• Aggregate join indexes - a database object created using the CREATE JOIN INDEX statement specifying one or more columns derived from an aggregate expression, particularly a SUM or COUNT aggregate operation. These join indexes cover aggregate queries considering a subset of groups contained in the join table, satisfying conditions such as:
  o All columns specified in the grouping clause of the query must be included in the grouping clause of the index definition.
  o All columns in the query WHERE clause must be part of the join index definition.
• Sparse join indexes - a sparse join index uses a constant expression in the WHERE clause of its definition to narrowly filter the row population. Any join index can be sparse, whether simple or aggregate, multitable or single-table.

If a PPI is defined for an aggregate join index, the partitioning columns must be members of the column set specified in the GROUP BY clause of the index definition.

6.2.4 Hash Indexes

Hash indexes are file structures whose properties are common to single-table join indexes and secondary indexes. Below are the similarities between single-table join indexes and hash indexes:
• Both function to improve query performance.
• Both can be an object of the SQL statements COLLECT STATISTICS, DROP STATISTICS, HELP INDEX, and SHOW HASH INDEX.
• Neither can be directly updated or queried.
• The system maintains the relevant columns of their base table automatically when the base table is updated using a DELETE, INSERT, or UPDATE statement.
• They can be FALLBACK protected.
• They can be row compressed.
• Row compression specifying a UDT in their select list cannot be implemented.
• A partially covered query containing a TOP n or TOP m PERCENT clause is not supported.
• They can be hash- or value-ordered.
• A complex expression can be transformed into a simple index column.
• Space allocation is received from permanent space and stored in distinct tables.
• A single-table join index can have a partitioned primary index, but a hash index cannot.
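A sketch of a sparse aggregate join index over the hypothetical Sales table from earlier; the constant WHERE filter makes it sparse and the SUM makes it an aggregate join index:

    CREATE JOIN INDEX store_month_jix AS
      SELECT store_id,
             EXTRACT(MONTH FROM sale_date) AS sale_month,
             SUM(amount) AS month_total
      FROM   Sales
      WHERE  sale_date >= DATE '2012-01-01'   -- constant filter: sparse
      GROUP  BY store_id, EXTRACT(MONTH FROM sale_date);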
The differences between hash indexes and join indexes are:
• A hash index will index only one table, while join indexes can index multiple tables.
• Hash indexes will transparently add index row compression, while join indexes must explicitly define index row compression.
• Base table row pointers in join indexes are explicitly defined using the ROWID keyword, while pointers are implicitly added for hash indexes.
• Join indexes will transparently add column multi-value compression, while hash indexes do not; a defined multi-value compression is not inherited by hash indexes. Neither index type allows column multi-value compression to be explicitly defined.
• The column list for a join index can specify aggregate functions, while the column list for a hash index cannot specify aggregate or ordered analytical functions.
• NoPI tables are supported by join indexes, not hash indexes.
• For a hash index, a logical row corresponds to one and only one row in a referenced base table; in a join index, a logical row can correspond to either one and only one row in a referenced base table or multiple rows in referenced base tables.
• Hash indexes cannot specify a UDT in the select list, while join indexes can when the UDT is not row-compressed.
• Hash index columns cannot have UDT, BLOB, CLOB, Period, or Geospatial data types. Join index columns cannot have BLOB, CLOB, or Geospatial data types.
• Primary indexes cannot be partitioned for hash indexes, and must be in non-compressed row form for join indexes.
• A system-derived PARTITION column cannot be defined.
• Restrictions exist for using the MultiLoad, FastLoad, and Archive/Recovery utilities.

Teradata tables can have up to 32 secondary, hash, and join indexes.
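A hash index sketch against the hypothetical Orders table; the BY and ORDER BY clauses control how the index rows are distributed and ordered:

    CREATE HASH INDEX ord_hix
      (cust_id, amount) ON Orders
      BY (cust_id)
      ORDER BY HASH (cust_id);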
6.3 Integrity

Databases have two types of integrity constraints: semantic and physical. Semantic integrity constraints are used to enforce the logical aspects of the data and their relationships. Declarative semantic constraints are part of the definition of the database and consist of:
• Column-level constraints
• Table-level constraints
• Database constraints
Physical data integrity constraints will check user data as it travels from system memory to disk.

Constraints are the physical implementation of business rules. The purpose of a constraint is twofold: to ensure bad data is not loaded into the database, and to ensure no corruption occurs between tables due to improper deletion or update of data in the existing database. Constraints should never be declared on columns defined with BLOB or CLOB data types.

6.3.1 Set Theory

The principles of set theory are at the core of relational database management and the foundation for practical design and administration of relational databases, though the normalization of a database can be drilled down to a point where these principles and formal logic can seem trivial. By default, all rows in a relational table or relation variable are assumed to be true; data whose entry would be denied by integrity constraints is proven false. Some common terms from set theory and formal logic that find their way into database design are:

• Assertion - also called a proposition; a statement that can be proven without question to be either true or false.
• Existential quantifier - signifies the logical equivalent of "for some," "for any," and "there exists."
• Identity predicate - signifies the logical equivalent of "is identical to" or "is equal to."
• Inference rules - determine the steps of reasoning used to prove propositions.
• Predicate - a truth-valued function, commonly representing attributes of a relation (columns) and the relation heading (relation variable).
• Predicate calculus - a set of inference rules by which propositions in predicate logic are proven.
• Predicate logic - uses the truth-functional operators of propositional calculus, universal and existential quantifiers, and identity predicates to prove the validity of a statement.
• Truth-valued function - a function that will evaluate without question to either TRUE or FALSE.
• Universal quantifier - signifies the logical equivalent of "for all."

Business rules are a component of the business model that defines specific conditional modes of activity. An integrity constraint is a component of the logical database model that formalizes a component of the business model (a business rule) by defining the conditions and ranges permitted in database parameters. In a database, each table must be able to evaluate its own predicate or relation variable (relvar) for its truth value and enforce a set of business rules. In its strictest sense, a constraint is any predicate which must evaluate to TRUE in order for a DELETE, INSERT, or UPDATE operation to be permitted in a database.

An integrity rule is a set of rules used to ensure data integrity. Each rule is comprised of a name, an integrity constraint set, a checking time, and a violation response. The checking time is the point within the process where the constraint is checked. The ANSI SQL standard supports immediate or deferred checking times, though all Teradata Database constraints check immediately; a deferred checking time is not permitted. The violation response component defines what action will be taken if the integrity constraint is violated. The ANSI SQL standard supports either a response of reject or compensate; all Teradata Database constraints will respond with a reject, as compensate responses are not permitted.

6.3.2 Semantic Integrity

Integrity constraints will restrict database updates to a set of specified values or a range of values. They can be implemented using constraints or triggers. The SQL CREATE TABLE and ALTER TABLE statements are used to specify integrity rules. Four types of integrity constraints are supported:
• Domain - a data type, acting as a simple constraint by defining the characteristics of values entered into the database.
• Column - specifies a simple predicate which is applied to only one column.
• Table - defines the sets of relationships available to values within a row and can be applied to a single row or multiple rows.
• Database - defines the functional determinant between a key and its dependent attributes.

Each column defined for a table must specify a name and a data type. Additional attributes or constraint definitions can be used to further define a column. In general, semantic constraints cannot be defined with BLOB, CLOB, UDT, Period, or Geospatial data types. Column-level constraints cannot reference other columns in the table, while table-level constraints must reference other columns in the table.

The name of a constraint can contain the following characters:
• Uppercase and lowercase alphanumeric characters
• Integers
• The special characters # and $
If a constraint is not named, system-generated names will not be assigned. Unnamed constraints with identical text and case are considered duplicates and will result in an error if submitted with a CREATE TABLE or ALTER TABLE statement.

Some column-level integrity constraints for both nontemporal and temporal tables are:
• UNIQUE - defined on a single column; implemented as USIs for nontemporal tables, or as either single-table join indexes or USIs for temporal tables.
• PRIMARY KEY - defined on a single column; implemented as USIs for nontemporal tables, or as either single-table join indexes or USIs for temporal tables.
• CHECK - defined on a single column and cannot be implemented as an index. Not supported on temporal tables.
• FOREIGN KEY...REFERENCES - defined on a single column and cannot be implemented as an index. Can have a temporal dimension.

The most general form of SQL constraint specification is the CHECK constraint. It can be applied to either an individual column or to an entire table, depending on the CREATE TABLE or ALTER TABLE text. The following rules apply to all CHECK constraints:
• Can be defined at the column or table level.
• The specified predicate can be any simple boolean search condition.
• The current session collation is used to test the constraints on character columns.
• Cannot be specified for any level of a volatile table.
• Cannot be defined with BLOB, CLOB, UDT, Period, or Geospatial data types.
• Cannot be specified for global temporary trace tables.
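A sketch pulling these constraint forms together; the table, columns, and constraint names are hypothetical:

    CREATE TABLE Employee
      (emp_id    INTEGER NOT NULL,
       dept_id   INTEGER REFERENCES Department (dept_id),
       -- Named column-level CHECK: references only its own column.
       salary    DECIMAL(10,2) CONSTRAINT sal_chk CHECK (salary > 0),
       hire_date DATE NOT NULL,
       term_date DATE,
       -- Named table-level CHECK: references two columns.
       CONSTRAINT term_chk CHECK (term_date IS NULL OR term_date > hire_date))
    UNIQUE PRIMARY INDEX (emp_id);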
Column-level CHECK constraints can have multiple constraints specified for a single column. If multiple unnamed constraints are defined for the same column, the system will combine the constraints into a single column-level constraint; each named constraint is handled separately. Column-level CHECK constraints cannot reference any other column in the table. Table-level CHECK constraints will reference at least two columns from the table. Constraint predicates for table-level CHECK constraints cannot reference columns from other tables. A maximum of 100 table-level constraints can be defined for a table.

Referential primary key relationships with foreign keys are allowed through the FOREIGN KEY...REFERENCES constraint. The FOREIGN KEY keywords are optional; however, if a FOREIGN KEY is specified, a REFERENCES clause must also be specified. Referential constraint types supported by FOREIGN KEY...REFERENCES include:
• Standard Referential Integrity
• Batch Referential Integrity
• Referential Constraints
• Temporal Relationship Constraints

Standard referential integrity will test each row that is inserted, deleted, or updated for referential integrity. If a violation to referential integrity occurs, the AMP software will reject the operation and respond with an error message. Batch referential integrity will test every inserted, deleted, or updated batch operation for referential integrity. If a violation occurs, the parsing engine will roll back the entire batch and respond with an abort message. Neither process is valid for use with temporal tables.

Referential constraints do not test referential integrity; it is assumed that the user enforces it in some way other than the normal declarative referential integrity constraint mechanism. Temporal relationship constraints (TRCs) also do not perform a test for referential integrity, but assume some other method is used to enforce it. TRC operations are defined at the table level and are valid for referential integrity relationships between single columns belonging to valid time and bitemporal table types to non-temporal or transaction table types. They cannot be defined on self-referencing tables. Referential constraints can be used for either temporal or nontemporal tables.
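The three nontemporal variants differ only in the REFERENCES clause. A hypothetical sketch, assuming the Orders and Customer tables from earlier (Teradata uses WITH CHECK OPTION for batch referential integrity and WITH NO CHECK OPTION for soft referential constraints):

    -- Standard RI: every changed row is tested.
    ALTER TABLE Orders ADD CONSTRAINT fk_cust
      FOREIGN KEY (cust_id) REFERENCES Customer (cust_id);

    -- Batch RI: the whole batch is tested and rolled back on violation.
    ALTER TABLE Orders ADD
      FOREIGN KEY (cust_id) REFERENCES WITH CHECK OPTION Customer (cust_id);

    -- Soft referential constraint: declared but not enforced.
    ALTER TABLE Orders ADD
      FOREIGN KEY (cust_id) REFERENCES WITH NO CHECK OPTION Customer (cust_id);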
6.3.3 Physical Integrity

Checking mechanisms for physical integrity will detect data corruption due to lost writes or errors in bits, bytes, or byte strings. Most data corruption is typically protected against through end-to-end error detection and correction algorithms. If hardware devices do not support these mechanisms, the Teradata Database can implement checksums to enforce physical data integrity at the disk I/O level. Disk I/O integrity checking (checksums) will detect and log any errors in the disk I/O, but will not fix the errors. Checksums can be performed at different levels, including:
• Full end-to-end checksums.
• Statistically sampled partial end-to-end checksums.
• No checksum integrity checking.

Additionally, fallback protection allows the same data to be written to two different AMPs within a cluster in case the primary copy of the data is corrupted or goes down for some reason. When data corruption is detected, the user will simply access the second, or fallback, copy of the data. If this happens, the system will remove the affected AMP from service.

Disk I/Os in the Teradata Database will read from and write to different data and file structures, including:
• Data blocks (DB)
• Cylinder indexes (CI)
• Master index (MI)
• File information block (FIB)

A Teradata file system is a generalized B* tree structure with three levels:
• Bottom node - represents the data block and contains data rows.
• Middle node - represents the cylinder index and defines the current state of the cylinder through data block descriptor entries.
• Top node - represents the master index and defines the current state of cylinder indexes through cylinder index descriptor entries.
The following keywords have predefined integrity levels:
• DEFAULT - the system will use whatever system-wide level of integrity checking is defined for the table type in the DBS Control utility.
• NONE - disables disk I/O integrity checking.
• LOW - user-defined setting; samples one word (2 percent) of the data per disk sector to compute the checksum value.
• MEDIUM - user-defined setting; samples one third (33 percent) of the words of the data per disk sector to compute the checksum value.
• HIGH - user-defined setting; samples two thirds (66 percent) of the words of the data per disk sector to compute the checksum value.
• ALL - 100 percent of the words in each disk sector are sampled to generate the checksum.

6.3.4 Database Principles

Several fundamental principles of relational database management must be adhered to when designing and maintaining databases. They are:
• Entity Integrity Rule - the attributes of the primary key cannot be null; applies to base relations.
• Referential Integrity Rule - no unmatched foreign key values can exist. The rule allows attributes of the foreign key to be wholly or partially null, which violates the Entity Integrity Rule; this rule will also violate the Principle of Interchangeability.
• Closed World Assumption - declares that if a tuple could appear in a relation variable at a given instant but does not appear in that relvar, then the corresponding logical proposition must evaluate to FALSE.
• Information Principle - a database can only contain relation variables: information content can be represented at any given instant as explicit values in attribute positions in tuples.
• Principle of Interchangeability - base relations and virtual relations (tables and views) have no arbitrary or unnecessary distinctions.
• Principles of Normalization - five principles declaring: 1) a relation variable not in 5NF should be decomposed into a set of 5NF projections; 2) the decomposition must be non-loss; 3) the decomposition must preserve dependencies; 4) every projection on the non-5NF relvar must be needed to reconstruct the original relvar; and 5) the decomposition should end when all relation variables are in 5NF.
• Assignment Principle - after a value is assigned to a variable, a comparison between the two must evaluate to TRUE.
• Golden Rule - no database constraint can be evaluated to FALSE as the result of an update operation.
• Principle of the Identity of Indiscernibles - every entity has its own identity: if there is no way to distinguish between two entities, those entities are identical and cannot be two different entities.
• Principle of Orthogonal Design - two distinct relation variables in the same database must not have non-loss decompositions such that the relation variable constraints for their projections permit the same tuple to appear in both projections.

6.3.5 Missing Values

The use of SQL nulls covers a number of situations resulting in missing values in the data. The most common reasons are:
• Value is unknown.
• Value is not supplied.
• Value is not applicable.
• Value does not exist.
• Value is not valid.
• Value is not defined.
• Value is an empty set.

Though SQL treats all these situations the same, the semantics, context, and behavior of the different null types are different. Nulls are used to identify where missing information is located, but they are not values themselves. The value of a null cannot be resolved because nulls have no value. Nulls are not valid predicate conditions in SQL, except in CASE expressions. To search fields which may or may not contain nulls, the operators IS NULL or IS NOT NULL must be used.
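For example, against the hypothetical Employee table used earlier:

    -- IS NULL must be used to find rows with a missing value;
    -- an ordinary comparison such as term_date = NULL evaluates
    -- to unknown and qualifies no rows.
    SELECT emp_id, hire_date
    FROM   Employee
    WHERE  term_date IS NULL;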
The literal NULL represents a placeholder for the logical value in an SQL request. A NULL has no data value, except in the cases where it acts as an explicit SELECT item or the operand of a function, where its data type is treated as INTEGER. It can be used in the following ways:
• A CAST source operand.
• A CASE result.
• An explicit SELECT item.
• An operand of a function.
• An item specifying a null in a column during an INSERT or UPDATE operation.
• A default column definition specification.

6.4 Capacity Planning

6.4.1 Planning Considerations

Capacity planning does more than simply ensure that access to data is available and that data is stored appropriately. Proper design can provide optimal performance for the data. Designing processes begin by recognizing that large amounts of historical data can be stored at a given time and that business processes require access to that data. Recent data is typically accessed more frequently than older data. As data ages, the relevance of the data can lower; this rate of descent can be faster or slower than for other data. However, a true assumption can be declared that different data in the database has different levels of relevance to the business over time.

The access rates are called hot, warm, cool, and icy respectively. Hot and warm data is accessed often and constitutes the operational data store. This set of data is required to be up-to-date, integrated, operational, and maintained regularly. Cool and icy data is accessed less frequently, with some hot spots within the cool data sector. Cool sectors may be easily moved into greater relevance should new columns or indexes be added which affect the data in the sector, or be recast to make them more relevant to the current business context.

The format of the base table row provides the most basic estimate in capacity planning. Teradata Database will utilize two row formats: packed64 or aligned row format. The packed64 format will store data in tables which are generally 3 percent to 9 percent smaller than the other format and will reduce the number of I/O operations required to access and write rows.
Depending on whether the table has a partitioned or non-partitioned primary index, the overhead devoted to the row header is 12 or 16 bytes respectively. The maximum row length is always 64,256 bytes (approximately 64 KB). Table columns fall into three general categories:
• Fixed length
• Compressible
• Variable length
The order above determines how rows are stored. In packed64 format, the columns are stored in field ID order. In aligned row format, the columns for each category are stored in decreasing alignment constraint order. Characters can be represented by one byte, two bytes, or a combination of single-byte and multibyte representations. The representation used will determine the number of characters supported.

Each table has an associated subtable called a table header, which is stored on each AMP in the system. Table headers are used to internally maintain information about each table. The size of a table header is limited to 1 MB. The components of the table header include the row header and fields 1-9. Field 1 is fixed length while all other fields are variable length; Field 3 is not used. The different fields contain:

• Field 1
o Row 1 header.
o Table structure version.
o Table structure version for host utilities.
o Internal ID of the database and database space charged.
o Table creation timestamp.
o Last table update timestamp.
o Last table archive timestamp.
o Table header row format version.
o Primary index flag.
o Number of backup tables associated with this table.
o Offset array for locating variable length columns.
o Internal ID of the permanent journal table.
o Journal type (After, Before, Audit, Both, and None).
o Protection type (fallback and none).
o Table kind (permanent, temporary, volatile, log, or join index).
o Data block size for the table (in bytes).
o Data block size validity.
o Percent free space for table.
o Percent free space validity.
o Merge block ratio for the table.
o Merge block ratio validity.
o Disk I/O integrity checksum.
o Hash flag.
o User journal flag.
o Dropped flag.
o DDL change flag.
o Number of parent tables referenced by this table.
o Number of child tables referencing this table.
o Byte count of the number of defined USIs.
o Host character set at table creation.
o Host ID.
o Session number.
o Message class of primary step.
o Message kind of primary step.
o Message class of secondary step.
o Message kind of secondary step.
o Request number.
o Transaction number.
o Restart flag.
o Flag to indicate a single row or multiple rows.
o Internal ID of replicated error table.
o Replicated table initiation flag.
o Table ID of base temporary table.
o Dummy space.

• Field 2 - contains the table column descriptor.
o Internal ID of the first column in the table.
o Number of columns in table.
o Number of varying length columns in the table.
o Compression flag.
o Number of presence bits in each row.
o Offset in row to the presence bit array.
o Offset in row to the first byte past the presence bit array.
o Index into the field descriptor array.

• Field 4 - contains MultiLoad, FastLoad, Archive/Recovery, table rebuild, and replication copy information; it is always present for permanent journal tables.

• Field 5 - contains the primary index descriptor and all secondary index descriptors.
o Field 5 type (table descriptor with row hash and unique RowID, table descriptor with a PPI RowID, index descriptor with row hash and unique RowID, or index descriptor with a PPI RowID).
o Row 1 length.
o Row format.
o Internal ID for primary key index.
o Index descriptors list for table.
o Duplicate rows flag (Dictionary and non-ANSI tables, ANSI tables with unique indexes, or ANSI tables without unique indexes).
o Field descriptors array.
o Offset to system code to build rows.
o System code for building rows.
o Compressed values and UDT contexts.

• Field 6 - contains restartable sort and ALTER TABLE information.
• Field 7 - contains LOB descriptors.
• Field 8 - contains up to 128 reference index descriptors and the name list of unresolved child tables from referential integrity constraint specifications.
• Field 9 - contains the database name, table name, and UDT name, and the length and offset of the database name and table name.

6.4.2 Database Sizes

The Teradata system managing a relational database can be as large as 2,048 CPUs and 2,048 GB of memory supporting 128 TB or larger databases. A BYNET interconnect can support up to 512 nodes. Database sizing can consider the contents of the system disks and data disks, the data disk space allocation, and the determination of usable disk space.

Disk 0 and Disk 1 are system disks cabled to a controller not associated with the disk arrays. Disk 0 is the boot disk and contains file systems under the control of the operating system root directory. Disk 1 provides additional space for dumps and memory swapping. The contents of the boot disk include:
• AMP identifiers file.
• Configuration maps.
• Open PDE and operating system.
• Executable Teradata software.
• Copy of Teradata GDOs.
• Bad disk sectors file.
• UDF libraries.
• Values.
• Space for diagnostic reports, memory swapping, and operating system and Open PDE dumps.

Data disks are virtual and controlled by the AMPs. Included in the data disks are reserved space and permanent space. Reserved space is used to contain the vprocID of each AMP and optional memory swapping. Permanent space is owned by the system user DBC and used for the following:
• CRASHDUMPS user
• SYSTEMFE user
• SYSADMIN user
• Data dictionary
• WAL log space
• Depot area space
• Spool files
• Temporary space
• Hierarchical ownership of PERM and TEMP spaces
6.4.3 Estimating Space Requirements

Allocation of space on the data disk is reserved for system use, including the data dictionary, PERM space, user temporary space, and user spool space.

PERM space is allocated with the CREATE USER statement. The following guidelines should be used:
• Reserve 25 percent to 35 percent of total space for spool space and a spool growth buffer.
• An extra 5 percent of PERM space in the user DBC should be allowed.
• Every new user or database created can have the maximum amount of spool space specified.

Spool space is allocated from the available free space of the owning user. TEMP space is allocated from the available space of the owning user, or from TEMP space owned by the DBC user. The number of bytes to be allocated to a database or user is based on the TEMPORARY=n clause of the CREATE DATABASE and CREATE USER statements. Global temporary tables require a minimum of 512 bytes from the PERM space of the containing database or user for the GTT table header.

A WAL log is created and managed to recover data tables as a result of aborted transactions. The size of the WAL log is based on the total number of data rows being updated or deleted. To determine the maximum size of the WAL log, the following steps are used:
1. Determine the maximum row length.
2. Multiply the maximum row length by the total number of rows in application programs, batch jobs, and ad-hoc queries.
3. Double the resulting value.
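A worked example with purely hypothetical numbers: for a maximum row length of 1,000 bytes and 500,000 rows touched by nightly batch jobs, step 2 gives 1,000 x 500,000 = 500,000,000 bytes (about 500 MB), and doubling it in step 3 yields a WAL log estimate of roughly 1 GB.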
User table data space is determined by first defining the nonuser table data space allotments and subtracting the total from the available table space. Nonuser table space allotments include:
• Overhead space
• Depot area
• Tables area

Cylinder index space requirements are calculated using the formula: (2 times the size of one cylinder index / the size of one cylinder) times 100.

The Depot area consists of large Depot slots and small Depot slots. A large Depot slot is used to write multiple blocks by aging routines to the Depot area with a single I/O. Small Depot slots are used when Depot protection is required by individual blocks which are written to the Depot area. The number of cylinders allocated is fixed at startup and allocated per pdisk; pdisks are grouped into subpools, and AMPs are individually assigned to those subpools. The system will calculate the average number of pdisks per AMP and multiply by the specified value to determine the total number of Depot cylinders for each AMP.

The tables area combines the following requirements for space:
• Data dictionary - approximately 80 MB should be reserved for growth of system tables and the WAL log.
• CRASHDUMPS User Space - created by the DIP utility with a default allocation of 1 GB.
• User TEMP space - allocates default space for global temporary tables. A minimum of 512 bytes is required for the GTT table header. If no default temporary space is defined, the allocated space is set to the maximum temporary space of the immediate owner.
• User Spool Space - a minimum of 20 percent of the user permanent space should be reserved.

The usable data space for Field mode is calculated using the following equation:

a(RO + RP + n(PH + NF + CF)) + b(SNF + SCF)

Where:
o a - number of rows being selected.
o b - number of numeric columns in the ORDER BY clause.
o n - number of columns.
o RO - row overhead.
o RP - row parcel indicators.
o PH - parcel headers.
o NF - formatted size of each numeric column.
o CF - formatted size of each character column.
o SNF - sorted numeric fields.
o SCF - formatted size of each character column in the ORDER BY clause.
To calculate PERM space requirements, estimate the size of each database, the total table storage based on the sum of the space estimates for all table sizes, and the space requirement for the application. The space required to accommodate the user DBC and the maximum WAL log should be subtracted from the user DBC PERM space, and the PERM space is further reduced by the number of cylinders comprising the default free space defined by the DBS Control record. Finally, the estimate from before should be deducted from the remaining space.
7 Structured Query Language

7.1 Overview

7.1.1 SQL Statements

Structured Query Language (SQL) is the most commonly used language for relational database management systems. The language uses statements to define database objects, user access to those objects, and updates to data. Database objects can be a database, user, table, view, macro, trigger, stored procedure, UDF, UDT, or index.

SQL statements exist within three primary functional families:
• Data Definition Language (DDL)
• Data Control Language (DCL)
• Data Manipulation Language (DML)
Additional statements are provided by SQL which do not fall clearly into any of these families, such as HELP and SHOW.

Data Definition Language statements define the structure and instance of a database in the form of database objects. The most commonly used basic DDL statements include:
• CREATE - used to define a new database object.
• ALTER - changes tables, columns, referential constraints, and indexes.
• ALTER PROCEDURE - recompiles an external stored procedure.
• COLLECT - used to collect optimizer or QCD statistics on a column, group of columns, or index.
• DROP - removes a database object.
• MODIFY - changes a database or user definition.
• RENAME - changes the name of a table, view, macro, trigger, or stored procedure.
• REPLACE - used to replace a macro, view, trigger, stored procedure, or UDM.
• SET - specifies a time zone, collation, or character set for a session.
• DATABASE - specifies a default database.
• COMMENT - inserts or retrieves a text comment for a database object.

Data Control Language statements are used to grant and revoke access to database objects. Ownership of those objects can be changed from one user to the next. All DCL statement results are recorded in the Data Dictionary. The most commonly used basic DCL statements include:
• GRANT/REVOKE - controls a user's privileges on an object.
• GIVE - gives a database object to another database object.
• GRANT LOGON/REVOKE LOGON - controls the logon privileges to a client or host group.

Data Manipulation Language statements are used to manipulate and process database values, as well as insert new rows in a table, update values in stored rows, and delete rows. Some of the most commonly used basic DML statements are:
• ABORT - used to manage transactions.
• BEGIN TRANSACTION - used to manage transactions.
• CHECKPOINT - defines a recovery point in the journal.
• COMMIT - used to manage transactions.
• DELETE - removes row(s) from a table.
• ECHO - echoes a string or command to a client.
• END TRANSACTION - used to manage transactions.
• INSERT - inserts new rows into a table.
• MERGE - combines the UPDATE and INSERT statements into a single SQL statement.
• ROLLBACK - used to manage transactions.
• SELECT - returns specified row data in a result table.
• UPDATE - modifies data in one or more rows.
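A short Teradata-mode sketch showing several of these DML statements inside an explicit transaction; the Accounts table is hypothetical:

    BEGIN TRANSACTION;
    UPDATE Accounts SET balance = balance - 100 WHERE acct_id = 1;
    UPDATE Accounts SET balance = balance + 100 WHERE acct_id = 2;
    END TRANSACTION;

In ANSI session mode the explicit BEGIN is omitted and the transaction ends with COMMIT instead.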
The following is typically found in an SQL request:
• Statement keyword
• One or more column names
• Database name
• Table name
• Optional clauses related to keywords

An executable statement can be invoked in one of the following ways:
• From a terminal through interaction.
• From an application program with embedded code.
• From embedded code in a stored procedure or external stored procedure.
• From embedded code in a macro.
• From an embedded application through dynamic creation.
• From an SQL stored procedure through dynamic creation.
• Through a trigger.

Different parts of an SQL statement are identified or separated using punctuation, as follows:
• Period - separates database names from table names, and table names from a particular column name.
• Comma - distinguishes column names in the select list, or column names or parameters in an optional clause.
• Colon - prefixes reference parameters or client system variables.
• Apostrophe - delimits the boundaries of character string constants.
• Double quotation marks - identify user names.
• Semicolon - separates statements in multi-statement requests and terminates requests submitted through some utilities.
• Left and right parentheses - group expressions or define the limits of a phrase.
7.1.2 SELECT Statements

The most frequently used SQL statement is the SELECT statement. It is used to specify the table columns from which the data is obtained, the corresponding database, and the table(s) referenced within the database. The SELECT statement will specify how the system is to return a set of result data, in what format, and in what order. Embedded SQL and stored procedures will use the SELECT INTO statement.

The following options, lists, and clauses can be used with a SELECT statement:
• DISTINCT
• FROM
• WHERE
• GROUP BY
• HAVING
• QUALIFY
• ORDER BY
• WITH
• Query expressions
• Set operators (UNION, INTERSECT, and MINUS/EXCEPT)

Set operators can be used to manipulate the answers to two or more queries by combining the results of each query into a single result set. Set operators can be used within view definitions, derived tables, and subqueries.

SELECT statements are used to reference data in two or more tables. The referenced data is combined by a relational join. The SELECT statement can be used to specify both inner joins and outer joins. An inner join will select data from two or more tables that meets specific conditions. Each source is specifically named, and the common relationship (join condition) can be in an ON or WHERE clause. An outer join is an extension of the inner join: it includes rows that qualify for the simple inner join plus a specified set of rows which do not match the join condition.
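A sketch of the two join forms over the hypothetical Customer and Orders tables:

    -- Inner join: only customers with at least one order.
    SELECT c.cust_name, o.order_nbr
    FROM   Customer c
           INNER JOIN Orders o ON c.cust_id = o.cust_id;

    -- Left outer join: all customers; the order columns are NULL
    -- for customers with no matching order row.
    SELECT c.cust_name, o.order_nbr
    FROM   Customer c
           LEFT OUTER JOIN Orders o ON c.cust_id = o.cust_id;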
7.1.3 SQL Data Types

The SQL data types supported by Teradata Database are:
• Teradata Database data types - byte, numeric, character, graphic, DateTime, interval, Period, and geospatial.
• ANSI-compliant data types - large objects (LOBs).
• User-defined types - distinct and structured.

Data types are communicated through phrases. A data type phrase will determine how data is stored and how data is presented. Teradata SQL can be used to define the attributes of a data value, to control the internal representation of the stored data (import format), and the presentation of data in a column or expression result (export format). Data type attributes include:
• NOT NULL
• UPPERCASE
• CASESPECIFIC
• NOT CASESPECIFIC
• FORMAT string_literal
• TITLE string_literal
• NAMED name
• DEFAULT value
• DEFAULT USER
• DEFAULT DATE
• DEFAULT TIME
• DEFAULT NULL
• WITH DEFAULT
• CHARACTER SET
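A column definition sketch combining several of these attributes; the table, column names, and format strings are hypothetical:

    CREATE TABLE Product
      (prod_id    INTEGER NOT NULL,
       prod_code  CHAR(8) UPPERCASE CHARACTER SET LATIN,
       unit_price DECIMAL(10,2) DEFAULT 0.00
                    FORMAT 'ZZZ9.99' TITLE 'Unit Price',
       added_dt   DATE DEFAULT DATE)
    UNIQUE PRIMARY INDEX (prod_id);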
7.1.4 Recursive Query

A recursive query is a named query expression which references itself in its definition. This provides a way to search a table using iterative self-joins and set operations. The feature can reduce complexity in querying and allows the efficient execution of a class of queries. Use the WITH RECURSIVE clause in a statement to implement a recursive query, or the RECURSIVE clause in the CREATE VIEW statement.

7.1.5 SQL Functions

Standard functions can be:
• Scalar
• Aggregate
• Ordered analytical

A scalar function will create a result based on input parameters. The function is invoked as required whenever a stated expression is evaluated for an SQL statement; upon completion, the function's result is used by the expression referencing the function. The function is called for each item in a set and produces a result for each detail item.

Aggregate functions will produce a result from a set of relational data. This data is typically grouped together using a GROUP BY or ORDER BY clause. Each set of data is processed independently, and only one result is produced for each set. The following are aggregate functions:
• AVG - arithmetic average of values in a column.
• COUNT - number of qualified rows.
• MAX - maximum column value.
• MIN - minimum column value.
• SUM - arithmetic sum of values in a column.

Ordered analytical functions perform over a range of data for a particular set of rows in some order. These types of functions allow sophisticated data mining to be performed on the information in the database. The following are ordered analytical functions:
• AVG - arithmetic average of all values for each row in the group.
• RANK - ordered ranking of rows based on the value of the column.
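A sketch combining an aggregate function with an ordered analytical function over the hypothetical Sales table:

    -- Total sales per store, ranked from highest to lowest.
    SELECT store_id,
           SUM(amount) AS total_sales,
           RANK() OVER (ORDER BY SUM(amount) DESC) AS sales_rank
    FROM   Sales
    GROUP  BY store_id;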
7.1.6 Cursors

A cursor is a pointer used by an application program to move through a result table. It is declared for a SELECT request, and a named cursor is opened. Cursors are used by the Teradata Preprocessor to mark or tag the first row accessed by an SQL query and then increment the cursor as needed. Rows can be individually fetched and written to host variables using the FETCH…INTO… statement. These host variables are used by the application to perform computations. SQL stored procedures use cursors to fetch one result row at a time and execute SQL and SQL control statements as required. Result set cursors are used to return the result of a SELECT statement executed in a stored procedure to the caller.

An embedded SQL application contains extensions to executable SQL to permit declarations. These declarations include:
• Code to encapsulate the SQL from the application language.
• Cursor definition and manipulation.

7.1.7 Session Modes

Two session modes are supported by Teradata Database:
• ANSI
• Teradata
While the ANSI session mode semantics conform fully with the ANSI SQL:2008 standard, the Teradata session mode has several defaults that differ from ANSI semantics.
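Returning to the cursor discussion above, here is a minimal stored procedure sketch using a cursor FOR loop; the table, column, and procedure names are hypothetical:

    CREATE PROCEDURE TotalOpenOrders (OUT total_amt DECIMAL(15,2))
    BEGIN
      SET total_amt = 0;
      -- Cursor loop: fetch one qualifying row at a time.
      FOR ord AS cur CURSOR FOR
          SELECT amount FROM Orders WHERE order_status = 'OPEN'
      DO
        SET total_amt = total_amt + ord.amount;
      END FOR;
    END;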
7.1.8 SQL Applications

The following APIs are used by client applications to communicate with Teradata Database:
• .NET Data Provider
• Java Database Connectivity (JDBC)
• Open Database Connectivity (ODBC)

When SQL requests are inserted into an application program, the requests are considered embedded in the application. Any compiled client application language will support embedded SQL. However, this requires the embedded SQL code to be precompiled in order to translate the SQL into native code. The precompiler tool is Preprocessor2 (PP2) and it will:
• Read application code for defined SQL code fragments.
• Interpret the code's intent after isolating the SQL code and translating it into Call-Level Interface (CLI) calls.
• Comment out the SQL source.
The precompiler will produce its output as native source code with CLI calls substituted for the SQL source. This converted source code can then be processed using the native language compiler of the application. PP2 supports the application development languages C, COBOL, and PL/I; SQL is not defined for any of these languages. SQL applications can also come in the form of macros and SQL stored procedures.

7.1.9 EXPLAIN Request Modifier

The EXPLAIN request modifier is provided by Teradata SQL to view the execution plan of a query. The modifier will precede any SQL request. With this modifier, the request is parsed and optimized, and the access and join plans generated by the Optimizer are returned as a text file; the execution plan for the request is displayed, but the request is not submitted for execution. With EXPLAIN, an explanation is provided describing how a request will be processed, the estimated number of rows involved, and the relative cost needed to complete the request. Detailed Optimizer information, such as cost estimates for INSERT, UPDATE, UPSERT, MERGE, and DELETE, is supported by the EXPLAIN request modifier. Using EXPLAIN allows complex queries to be evaluated, alternative processing strategies to be developed, and the performance impact of a request to be assessed.

7.1.10 Third-Party Development

Third-party software products are supported by Teradata Database, largely consisting of two general categories: transparency series and native interface products. The Transparency Series/Application Program Interface (TS/API) provides a gateway between the Teradata Database and IBM DB2. It allows SQL statements formulated for DB2 to be translated into Teradata SQL. With this translation, DB2 applications are permitted to access data stored in Teradata Database.
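Using the modifier is as simple as prefixing a request; the query below is hypothetical:

    EXPLAIN
    SELECT c.cust_name, SUM(o.amount)
    FROM   Customer c INNER JOIN Orders o ON c.cust_id = o.cust_id
    GROUP  BY c.cust_name;

The returned text describes the lock levels, join steps, estimated row counts, and total estimated time for the plan, without running the query.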
The Workload Management API contains interfaces to PM/APIs and open APIs. PM/APIs provide access to the PMPC routines found in Teradata Database. A specialized PM/API subset of CLIv2 is used by the MONITOR logon partition to access the PMPC subsystem. With PM/API, the following actions can be taken:
• Monitoring system and session-level activities.
• Tracking system usage.
• Managing task priorities.
• Updating components stored in the customer TDWM database.
• Managing Teradata Active System Management (ASM) rules.
With these interfaces, the following is possible:
• Frequent in-process performance analysis on CLIv2 data in near real time.
• Access to data by CLIv2 request that is not usually accessible through resource usage.
• Retrieval of raw data into an in-memory buffer for real-time analysis or for importing into custom reports.
Open APIs provide an SQL interface to the System PMPC using user-defined functions and external stored procedures, such as:
• AbortSessions function
• TDWMRuleControl function
• GetQueryBandValue procedure

7.2 Database Objects
Multiple objects are used to create a database. Some general categories are:
• Databases
• Users
• Tables
• Columns
• Data Types
• Keys
• Indexes
• Referential Integrity
• Views
• Triggers
• Macros
• Stored Procedures
• External Stored Procedures
• User-Defined Functions
• Profiles
• Roles
• User-Defined Types

7.2.1 Databases and Users
Databases are a collection of related tables, views, triggers, indexes, stored procedures, user-defined functions, and macros, or of other users or databases. Databases are created using the CREATE DATABASE statement. The name of the database, the amount of storage to allot, and other attributes are specified during the creation process. Users are similar to databases; the difference between databases and users is the existence of a password for the user and their need to log on to the system. A database has no password and does not log on to the system. Users are created using the CREATE USER statement, and the user name and password are specified during the creation process. Users are given space by the database to create and maintain objects.

7.2.2 Tables
A table defines relationships, where columns identify attributes and rows represent tuples. The first row in every relational table consists of column headings or column names. Each row following the first row consists of unique data values. The number of tuples is referred to as the cardinality of the table, and the number of columns is referred to as the degree or arity of the table.
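A minimal sketch of the two creation statements just described; the names and space amounts are hypothetical, and a real system would size PERM and other attributes to its own needs:

    CREATE DATABASE Sales_DB AS
      PERM = 10E9;                 -- storage to allot, in bytes

    CREATE USER sales_user AS
      PERM = 1E9,
      PASSWORD = samplePw,         -- users have a password; databases do not
      DEFAULT DATABASE = Sales_DB;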
Base tables are defined using the CREATE TABLE statement. When a table is defined, the table name, one or more column names, and column attributes are specified, as are the primary index and one or more secondary indexes. When creating a table, a person can also specify the data block size, percent free space, and other physical attributes of the table. If a table is defined and none of the following is specified, the default action by the system is to create the table using the first column as a non-unique primary index:
• PRIMARY INDEX clause
• NO PRIMARY INDEX clause
• PRIMARY KEY constraint
• UNIQUE constraint
Ideally, by the ANSI standard, each row in a table has unique values not shared with any other row in the same table. Though duplicate rows are prohibited in relational tables based on set theory, the SQL definition is based on bags, or multisets; a bag or multiset is an unordered group of elements which may be repeated. If a table prohibits duplicate rows, it is called a SET table. If the table allows duplicate rows, it is called a MULTISET table.

Queue tables are like ordinary base tables, but work like an asynchronous first-in-first-out (FIFO) queue. When creating queue tables, a TIMESTAMP column must be defined with a default value of CURRENT_TIMESTAMP; this column is used to identify when a row is inserted into the table. The SELECT statement can be used to view the contents of the queue table. The SELECT AND CONSUME statement can be used to return data from the row with the oldest timestamp, deleting that row after returning the data. If no rows reside in the queue table, the transaction enters a delay state until a row is inserted into the queue table or the transaction aborts.

Error logging tables are associated with a permanent base table to log information when:
• Errors occur during SQL MERGE operations.
• Errors occur during SQL INSERT...SELECT operations.
If the error logging facilities can handle the error generated, the request will be completed unless referential integrity (RI) or unique secondary index (USI) violations are detected; in that case, the operation is aborted and rolled back. The error logging table will contain an error row for each error generated. Permanent journal tables can be created through options in the CREATE/MODIFY USER and CREATE/MODIFY DATABASE statements.
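A minimal queue table sketch, assuming a hypothetical order_queue table; the first column is the insertion timestamp described above:

    CREATE TABLE order_queue, QUEUE (
      qits     TIMESTAMP(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
      order_id INTEGER
    ) PRIMARY INDEX (order_id);

    -- Peek at the queue without consuming rows:
    SELECT * FROM order_queue;

    -- Return the oldest row and delete it (waits if the queue is empty):
    SELECT AND CONSUME TOP 1 * FROM order_queue;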
A marker row is used to determine the number of errors generated in a single request. If the marker row exists, the request completed successfully; the absence of a marker row indicates the request failed. Within the error logging table, an error row is inserted representing each error that occurred. If the error limit specified by the LOGGING ERRORS option is reached, or the error logging facilities cannot handle the generated error, the operation is aborted and rolled back.

No Primary Index (NoPI) tables are used to improve performance when bulk loading data using FastLoad or SQL sessions. The NoPI table acts as a staging table for loading data; an application then transforms or standardizes the data and stores the converted data into another staging table. Normally, staging tables with primary indexes perform some data redistribution. NoPI tables have no data redistribution: the system will store rows on any desired AMP, appending them at the end of the table, thus improving performance when applications load data into a staging table. NoPI tables can also act as log files or sandbox tables for storing data until an indexing method can be defined.

Temporal tables store and maintain information related to time, allowing the database to perform operations and queries using time-based reasoning. Temporal tables include one or two special columns to store the time-based information:
• Valid-time - records and maintains the time period the information is valid, called the period of validity (PV).
• Transaction-time - records and maintains the time period the database is aware of the information in the row.
The two columns are independent time dimensions and serve different purposes. Tables with both a valid-time and a transaction-time column are considered bitemporal tables.

When valid-time rows are changed, the time period for which the modification is valid, called the period of applicability (PA), can be specified. The relationship between the PA and the PV determines the number of rows created as a result of the modification. For instance, if the PA is within the boundaries of the PV, three rows are created: one row with the original information, representing the beginning of the PV to the beginning of the PA; a second row representing the modified information within the PA; and a third row with the original information, representing the time period from the end of the PA to the end of the PV. If the PA overlaps the PV, only two rows are created.

The database will create new rows automatically when transaction-time rows in temporal tables are changed, in order to maintain the time dimensions. For instance, when a row is modified, the transaction time for the row is changed and the database now has two rows to manage: the original row is "closed", with its end transaction time set to the modification time, and acts as a history row representing the information contained before the modification; the new row represents the changed information and has its begin transaction time set to the modification time.
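A sketch of a NoPI staging table, using hypothetical names; the explicit NO PRIMARY INDEX clause prevents the default behavior of hashing on the first column:

    CREATE TABLE stage_sales (
      sale_ts  TIMESTAMP(0),
      store_id INTEGER,
      amount   DECIMAL(10,2)
    ) NO PRIMARY INDEX;
    -- Bulk-loaded rows are simply appended on the receiving AMP;
    -- no redistribution by row hash takes place.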
Temporary tables are used to temporarily store data. There are three types of temporary tables:
• Global temporary - consists of a persistent table definition stored in the data dictionary, used to store intermediate results from multiple queries in working tables that are retained only for the length of the session. Global temporary tables allow table templates to be defined in the database schema, existing only as a definition: when created, the table has no rows and no physical instantiation, and the tables are materialized only when they are needed. When an application accesses a table with the same name as the base table and the base table has not already materialized, the new table is materialized using the stored definition. The events that cause an empty global temporary table to materialize are a CREATE INDEX statement and a COLLECT STATISTICS statement on the table; if an INSERT statement is used instead, the table is populated immediately upon being materialized. The materialized table is created when a session requiring it starts to use it and continues until the end of the session. The base definition of the table is created using the CREATE TABLE statement, with the GLOBAL TEMPORARY keywords used to describe the table type. The space used by the global temporary table is charged to the login user's temporary space. Users must have the appropriate privileges on the base table, or on the database (or user) containing the base table, in order to materialize a new global temporary table. No access logging is performed on a materialized global temporary table. A CREATE TABLE statement can be used to manage transient journaling options on global temporary table definitions, and the ALTER TABLE statement can be used to modify transient journaling and ON COMMIT options for the base global temporary table.
• Volatile - retained for the session duration, but with no persistent definition in the data dictionary; the table must be created by the session. Neither the definition nor the contents of a volatile table are persistent. Volatile tables are created using the CREATE TABLE statement with the VOLATILE keyword. They are created in the login user space, and the database name for the table is the login user's name. No privileges are required to create volatile tables, and no access logging is performed on these tables. Up to 1000 volatile tables can be created for each user session at any given time.
• Global temporary trace - used to debug external routines; these have persistent definitions but will not retain data across sessions.
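A minimal sketch contrasting the first two types, using hypothetical names; ON COMMIT PRESERVE ROWS keeps rows past each transaction for the rest of the session:

    -- Persistent definition, per-session contents:
    CREATE GLOBAL TEMPORARY TABLE gt_totals (
      region_id INTEGER,
      total     DECIMAL(12,2)
    ) ON COMMIT PRESERVE ROWS;

    -- No dictionary definition at all; both definition and rows
    -- disappear when the session ends:
    CREATE VOLATILE TABLE vt_scratch (
      k INTEGER,
      v VARCHAR(30)
    ) ON COMMIT PRESERVE ROWS;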
Volatile tables do not permit the following CREATE TABLE options:
• Permanent journaling
• Referential integrity constraints
• Check constraints
• Compressed columns
• DEFAULT clause
• TITLE clause
• Named indexes
If a user logs on to multiple sessions, a volatile table with the same name can be created in each session, because volatile tables are private to each session. However, when a volatile table is created, its name must be distinct from all global temporary and permanent table names in the database that has the same name as the login user.

7.2.3 Columns
Columns are a structural component of a table. They have a name and a declared type. Exactly one value is found in each column of each row in the table; the value is a value in the declared type of the column and can include nulls. A name and data type must be defined for each column in the table. The elements of the table column are defined in the column definition clause of the CREATE TABLE statement. One or more optional attribute definitions can be added to define the column further, including:
• Data type attribute declarations (NOT NULL, FORMAT, TITLE, and CHARACTER SET).
• Default value control clauses (DEFAULT, WITH DEFAULT).
• Column storage attributes clause (COMPRESS).
• Column constraint attributes clauses (PRIMARY KEY, UNIQUE, REFERENCES, and CHECK).
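A sketch showing several of these column attributes together in one definition; the employee table and its values are hypothetical:

    CREATE TABLE employee (
      employee_id INTEGER NOT NULL PRIMARY KEY,
      last_name   VARCHAR(30) NOT NULL TITLE 'Last Name',
      hire_date   DATE FORMAT 'YYYY-MM-DD',
      dept_code   CHAR(3) COMPRESS ('HQ', 'OPS'),   -- multi-value compression
      salary      DECIMAL(10,2) DEFAULT 0 CHECK (salary >= 0)
    );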
Some columns are dynamically generated by the database, such as:
• Identity - specified with the GENERATED ALWAYS AS IDENTITY or GENERATED BY DEFAULT AS IDENTITY option in the table definition.
• Object Identifier (OID) - used when a table has LOB columns, to store pointers to the subtables containing the LOB data.
• ROWID - uniquely identifies the row.
• PARTITION - used when a table is defined with a partitioned primary index, to provide the partition number.
• PARTITION#L1 through PARTITION#L15 - used when a table is defined with a multilevel PPI, to provide the partition number at the corresponding partition level.

7.2.4 Data Types
Every data value is part of an SQL data type, which is specified for each column when creating a table. The values found in each column belong to one of the following data types:
• Numeric
• Character
• DateTime
• Interval
• Period
• Byte
• UDT
• Geospatial
Numeric data types allow numeric values, either as an exact number (integer or decimal) or an approximate number (floating point). Numeric values are specified using the following SQL data types:
• BIGINT - a signed, binary integer value from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.
• INTEGER - a signed, binary integer value from -2,147,483,648 to 2,147,483,647.
• SMALLINT - a signed, binary integer value from -32,768 to 32,767.
• BYTEINT - a signed, binary integer value from -128 to 127.
• DECIMAL or NUMERIC (n,m) - a decimal number of n digits, with m of those digits to the right of the decimal point.
• REAL, DOUBLE PRECISION, or FLOAT - a floating point value in sign/magnitude form.
Characters of a given character set are represented by character data types. The SQL data types for character data are:
• CHAR [(n)] - a fixed length character string for internal character storage.
• VARCHAR(n) - a variable length character string of maximum length n for internal character storage.
• LONG VARCHAR - the longest permissible variable length character string.
• CLOB - a large character string, for storing a character large object (CLOB) such as simple text, HTML, or XML documents.
DateTime data types represent dates, times, and timestamps. The SQL data types used to specify DateTime values are:
• DATE - a date value including components for year, month, and day.
• TIME [WITH TIME ZONE] - a time value including components for hour, minute, second, fractional second, and optional time zone.
• TIMESTAMP [WITH TIME ZONE] - a timestamp value including components for year, month, day, hour, minute, second, fractional second, and optional time zone.
Interval data types represent a time period and consist of two mutually exclusive interval type categories: Year-Month and Day-Time. The different SQL data types for each category are:
• YEAR-MONTH
  o INTERVAL YEAR
  o INTERVAL YEAR TO MONTH
  o INTERVAL MONTH
• DAY-TIME
  o INTERVAL DAY
  o INTERVAL DAY TO HOUR
  o INTERVAL DAY TO MINUTE
  o INTERVAL DAY TO SECOND
  o INTERVAL HOUR
  o INTERVAL HOUR TO MINUTE
  o INTERVAL HOUR TO SECOND
  o INTERVAL MINUTE
  o INTERVAL MINUTE TO SECOND
  o INTERVAL SECOND
Different from interval data types, which represent a span of time, a period data type represents a set of contiguous time granules from the beginning boundary up to, but not including, the ending boundary. The SQL data types for the period category are:
• PERIOD(DATE) - an anchored duration of DATE elements, including components for year, month, and day.
• PERIOD(TIME [(n)] [WITH TIME ZONE]) - an anchored duration of TIME elements, including components for hour, minute, second, fractional second, and optional time zone.
• PERIOD(TIMESTAMP [(n)] [WITH TIME ZONE]) - an anchored duration of TIMESTAMP elements, including components for year, month, day, hour, minute, second, fractional second, and optional time zone.
Raw data can be stored as logical bit streams when using byte data types. This type of data is transmitted directly from the memory of the client system. The SQL data types are:
• BYTE - a fixed length binary string.
• VARBYTE - a variable length binary string.
• BLOB - a large binary string of raw bytes, representing a binary large object (BLOB) such as graphics, video clips, files, and documents.
BYTE and VARBYTE are Teradata extensions to the ANSI SQL:2008 standard, while BLOB is ANSI SQL:2008 compliant.
Data types can be custom defined to support the structural and behavioral model required by the database. Custom data types are referred to as UDTs and can be either distinct or structured. Distinct UDTs are based on a single predefined data type, while structured UDTs are a collection of one or more fields, called attributes, representing predefined data types or other UDTs.

For applications to manage, analyze, and display geographic information, geospatial data types are used. The following data types are used by Teradata Database:
• ST_Geometry - Teradata proprietary internal UDT representing any of the following:
  o ST_Point - 0-dimensional geometry representing a single location in 2-dimensional space.
  o ST_LineString - 1-dimensional geometry stored as a sequence of points with linear interpolation between points.
  o ST_Polygon - 2-dimensional geometry consisting of one exterior boundary and zero or more interior boundaries, each defining a hole.
  o ST_GeomCollection - collection of zero or more ST_Geometry values.
  o ST_MultiPoint - 0-dimensional geometry collection where elements are restricted to ST_Point values.
  o ST_MultiLineString - 1-dimensional geometry collection where elements are restricted to ST_LineString values.
  o ST_MultiPolygon - 2-dimensional geometry collection where elements are restricted to ST_Polygon values.
  o GeoSequence - extension of ST_LineString to contain tracking information.
• MBR - Teradata proprietary internal UDT for obtaining the minimum bounding rectangle (MBR) for tessellation purposes.
7.2.5 User-Defined Types
In addition to the set of predefined data types provided through SQL, a user can create other data types and use them just like a predefined data type. These are called User-Defined Types (UDTs), and two types are supported: Distinct and Structured. A distinct UDT is based on a single predefined data type. A structured UDT is a set of one or more attributes, each defined as a predefined data type. Another form of structured UDT is the dynamic UDT, which is associated with external UDFs to provide up to eight input parameters.

7.2.6 Keys
A column or group of columns which uniquely identifies each row in the table is called a primary key. The values defining a primary key must be unique and cannot be null. A column or group of columns in one table whose values match the primary key of another table in the same database is called a foreign key; foreign keys supply the values used to link related tables together. Primary and foreign keys are used to ensure referential integrity, a concept where relationships between tables must remain consistent.

7.2.7 Indexes
When a user accesses a full table for a small amount of data, a query may have to look through a very large number of rows. This operation can consume a large amount of processor time and reduce the performance of the entire database. Indexes can be created to pull information from full tables for common types of queries. Relational databases utilize a classic index, which is a file comprised of rows having a data field from the reference table and a pointer to the location of the row in the base table: a pointer to a single location if the index is unique, or a pointer to all possible locations of rows with the same data field value if the index is non-unique. When a user makes a request or query, the index is accessed instead of the full table. Since the index is relatively smaller than the full table, the performance and processor requirements are much lower. Indexes do not support unplanned, ad hoc queries. An index has weak selectivity if many rows are retrieved and strong selectivity if few rows are retrieved; the stronger the selectivity, the more useful the index is. Several weakly selective non-unique secondary indexes can be combined through bit mapping to create a single strongly selective index.

Teradata Database indexes are based on row hash values instead of raw table column values. Table rows in Teradata Database are self-indexing based on their primary index. Each time a row is inserted into a table, a 32-bit row hash value is stored with the row. Row hash values are not unique; therefore, a unique 32-bit numeric value is generated and appended to the row hash value to create a unique RowID. If the table has a partitioned primary index, the RowID also includes the combined partition numbers for each level of the PPI.
A hashing algorithm is used to distribute rows across the AMPs. The algorithm computes the 32-bit row hash value based on the primary index. The high-order bits of the row hash identify a hash bucket; hash buckets are used to sort and look up the data, each associated with a different piece of data. Where the hash buckets are located is maintained in a hash map, and hash buckets are distributed evenly across the AMPs on a system. The hash bucket size may be 16 bits or 20 bits: older systems generally use 16-bit hash bucket sizes, which allow 65,536 hash buckets, while newer systems have a 20-bit hash bucket size, allowing a total of 1,048,576 hash buckets.

The different types of indexes used in Teradata Database are:
• Primary index: unique or non-unique, and partitioned or non-partitioned.
• Secondary index: unique or non-unique, hash-ordered or value-ordered.
• Join index: simple or aggregate, single-table or multitable, complete or sparse.
• Hash index.
Any primary or secondary index can be defined as unique, except a partitioned primary index in which one or more partitioning columns are not part of the primary index column set. The CREATE TABLE statement is used to define a primary index and one or more secondary indexes. Secondary indexes can also be defined using the CREATE INDEX statement. Hash indexes and join indexes are created using the CREATE HASH INDEX and CREATE JOIN INDEX statements, respectively.

7.2.8 Primary Index
Data is distributed and retrieved for a particular table across the AMPs using a primary index. A partitioned primary index (PPI) will partition the data on each AMP based on some set of columns, with rows within each partition ordered using a hash of the primary index columns. The properties of the primary index consist of:
• The CREATE TABLE data definition statement defines the primary index; if a primary index is not explicitly defined, one is assigned automatically by the CREATE TABLE statement.
• The ALTER TABLE data definition statement modifies the primary index.
• A primary index can have up to 64 columns.
• A maximum of one primary index is defined for each table.
• A primary index can be unique or non-unique. If the primary index is not explicitly defined as unique, or is specified for a single-column SET table, it is defined as non-unique.
• A primary index can be partitioned or non-partitioned.
• The hashing algorithm controls data distribution and retrieval.
• Member columns of the primary index column set or partitioning columns cannot be compressed.
The columns used as the primary index have several restrictions:
• A column cannot contain a CLOB, BLOB, UDT, ST_Geometry, MBR, or Period data type.
• Single-level PPIs may use only the following general forms of partitioning expressions: INTEGER (or expressions cast to INTEGER) and the CASE_N and RANGE_N functions.
• Multilevel PPIs may specify only CASE_N and RANGE_N functions in their partitioning expressions.
Optimal data access and uniform distribution of data are the two most important factors in choosing a primary index. When considering uniform data distribution, the primary index values should be as distinct as possible: efficient parallel processing is possible through even distribution of table rows across the AMPs, and rows with the same primary index value are distributed to the same AMP. When considering optimal data access, the primary index should be on the most frequently used access path. Primary index operations must provide a full primary index value, and retrieval on a single value is always a one-AMP operation. A well-chosen primary index improves performance for single-AMP retrievals, joins between tables with identical primary indexes, and partition elimination. Most tables will require a primary index.

7.2.9 Secondary Index
Secondary indexes are created explicitly with the CREATE TABLE and CREATE INDEX statements. Unique secondary indexes are created implicitly when a CREATE TABLE statement specifies PRIMARY KEY or UNIQUE constraints on column sets other than the primary index. Secondary indexes are never required by tables, but will often improve performance in the system. When a secondary index is created, a separate internal subtable is built to contain the index rows.
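A sketch of a table with a non-unique primary index and a RANGE_N partitioning expression, using hypothetical names and dates:

    CREATE TABLE sales (
      txn_id    INTEGER NOT NULL,
      store_id  INTEGER NOT NULL,
      sale_date DATE NOT NULL,
      amount    DECIMAL(10,2)
    )
    PRIMARY INDEX (store_id)
    PARTITION BY RANGE_N (
      sale_date BETWEEN DATE '2010-01-01'
                    AND DATE '2010-12-31'
                    EACH INTERVAL '1' MONTH
    );
    -- Rows hash to AMPs by store_id; within each AMP they are grouped
    -- into monthly partitions, enabling partition elimination.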
If a table is defined with FALLBACK, the secondary index subtables are duplicated.

Unique secondary indexes (USIs) are built using the following process:
• Each AMP accesses its subset of the base table rows.
• The secondary index value is copied and appended to the RowID of the base table row.
• A row hash is created on the secondary index value.
• All three values - secondary index value, RowID, and row hash - are placed onto the BYNET.
• The data is received by the appropriate AMP, which creates a row in the index subtable.
• If the row already exists in the index subtable, an error is reported.

Non-unique secondary indexes (NUSIs) are built using the following process:
• Each AMP accesses its subset of the base table rows.
• A row hash value is created for each secondary index value, and a row is created in the index subtable, which is updated whenever a table row is inserted, deleted, or updated.
• A spool file is built containing each secondary index value followed by its RowID.
• For a hash-ordered NUSI, the RowIDs are sorted in ascending order; for a value-ordered NUSI, the rows are sorted in NUSI value order, and the NUSI value is used to determine storage.
• Only a single numeric or DATE column of four or fewer bytes can be used for value ordering.
• Only a single column can be specified for the hash ordering of a covering index.
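Sketches of explicit secondary index definitions on the hypothetical sales table from the previous example; the first is a USI, the second a value-ordered NUSI on a four-byte DATE column:

    -- Unique secondary index (assumes txn_id is unique per row):
    CREATE UNIQUE INDEX (txn_id) ON sales;

    -- Value-ordered non-unique secondary index:
    CREATE INDEX (sale_date) ORDER BY VALUES (sale_date) ON sales;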
Secondary index properties include:
• Speed of data retrieval is enhanced.
• Data distribution is not affected.
• Composed of up to 64 columns.
• A maximum of 32 secondary indexes can be defined for each table.
• Can be unique or non-unique.
• Non-unique secondary indexes can be hash-ordered or value-ordered.
• Dynamic creation or release as data usage changes.
• Additional I/Os are required on inserts and deletes.
• Additional disk space is required to store the subtables.
• Secondary indexes cannot be partitioned, but can be defined on a table with a partitioned primary index.
• Should not be defined on columns with frequently changing values.
• Columns which do not enhance selectivity should not be included.
• Composite secondary indexes are not used by the Optimizer unless explicit values exist for each column.
• Composite secondary indexes should not be used when multiple single-column indexes and bit mapping can be used instead.
Unique secondary indexes are guaranteed to have a unique index value. When locating rows with a specific value through a USI, a two-AMP operation is involved. Access using a NUSI involves an all-AMP operation, but selection of a small number of rows is highly efficient. NUSIs are highly useful where the index structure contains all the columns referenced by one or more joins; these secondary index values are called covering columns, and for queries that reference only covering columns, an NUSI is the best option.

7.2.10 Join Index
Join indexes are file structures that permit queries to be resolved by accessing an index rather than its base table. To create a join index, the CREATE JOIN INDEX statement is used. A join index can be defined on one or more tables. Join indexes can be used to define a prejoin table on frequently joined columns, to create a full or partial replication of a base table with a primary index, or to define a summary table.

Multitable join indexes will store and maintain the joined rows of two or more tables and will aggregate selected columns. They are used for join queries which are performed with a high frequency. When the index is used to access data, the multitable join index will allow part of the query to be covered.

Single-table join indexes are used to resolve joins on large tables without redistributing the joined rows across the AMPs. These types of join indexes will hash a frequently joined subset of base table columns to the same AMP; this primary index is on a foreign key column. As a result, BYNET traffic is eliminated.

Aggregate join indexes can be defined on two or more tables, or on a single table, and will include a summary table containing a subset of columns from the base table and additional columns for the aggregate summaries of the base table columns. Aggregate join indexes are a cost-effective, highly efficient method of resolving queries that frequently specify aggregate operations on the same column or set of columns. As a result, aggregate calculations are not required for every query.

Any join index, whether it is single-table or multitable, simple or aggregate, can be sparse. The number of rows used in a join index can be limited to only those rows needed when a small, well-known subset of rows is referenced in a frequently run query. This "sparse" index is created using a constant expression to filter rows.
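A sketch of a sparse, single-table join index on the hypothetical sales table; the constant expression in the WHERE clause is what limits the index to a well-known subset of rows:

    CREATE JOIN INDEX recent_sales_ji AS
      SELECT store_id, sale_date, amount
      FROM   sales
      WHERE  sale_date >= DATE '2010-10-01'   -- constant filter makes it sparse
    PRIMARY INDEX (store_id);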
7.2.11 Hash Index
Single-table join indexes and hash indexes have the same purpose: to create a full or partial replication of a base table with a primary index. A hash index can be defined on one table only.

7.2.12 Referential Integrity
Enforcement of referential integrity supports two forms of declarative SQL: a standard method and a batch method. In both standard and batch referential integrity, the referenced columns must be either a unique primary index (UPI) or a unique secondary index (USI). A referencing table is sometimes called a child table, and the referencing columns are called child columns. Child tables and columns must have parents, which are the tables and columns being referenced. Referential integrity can:
• Increase productivity in development by bypassing the need to code SQL statements to enforce referential constraints.
• Require fewer written programs by ensuring referential constraints are not violated.
• Improve performance by identifying the most efficient method for enforcing referential constraints.

Referencing tables must have FOREIGN KEY columns which are identical in definition to the keys in the referenced table. If one column value of a foreign key is null, all of the foreign key values are invalidated and the constraint is not checked for that row. The following rules are applied to columns assigned as FOREIGN KEYs:
• The COMPRESS option is not permitted on the referenced or referencing column(s) for standard referential integrity.
• Columns must have the same data type and case sensitivity.
• No comparisons are made between column-level constraints.
• The foreign and primary key must contain the same number of columns.
FOREIGN KEY references can exist on the same table containing the FOREIGN KEY; in that case, the referenced and referencing columns must be different columns. A circular reference exists when one table references another table which has a reference back to the original table. When a circular reference exists, at least one set of FOREIGN KEYs must be defined on nullable columns.

The database must maintain the integrity of foreign keys. Here are some actions the database will take to maintain that integrity when data is manipulated; in each case, if a rule is broken, an error is returned:
• Inserting a row into the referencing table with a NOT NULL foreign key column - the system verifies a row exists in the referenced table with values matching those of the foreign key columns; if not, an error is returned.
• Foreign key column values are altered - the system verifies a row exists in the referenced table with values matching those of the altered values of the foreign key columns; if not, an error is returned.
• A row is deleted from the referenced table - the system verifies no rows exist with the same foreign key values as the deleted row; if such a row exists, an error is returned.
• Before updating a referenced column - the system verifies no rows exist with the same foreign key values as the referenced columns; if such a row exists, an error is returned.
• A foreign key reference is added - the system will validate all values in the foreign key columns against the columns in the referenced table.
• Before altering the structure of columns - the system verifies the change will not violate foreign key constraint rules; an error is returned when an offending ALTER TABLE or DROP INDEX statement is made.
• A referenced table is dropped - the system will ensure the referencing table has first dropped its foreign key reference to the referenced table.
Archiving and restoration of individual tables, as well as copying tables from one database to another, are performed by the Archive/Recovery (ARC) utility. When restoring or copying a table, it is possible to create a reference definition which is inconsistent. When a single table is restored, the dictionary definition for the table is also restored; this dictionary definition contains the references between the parent and child tables. At this point, the table is marked inconsistent, and no updates, inserts, or deletes to the table are allowed. The ARC utility will validate the references in both tables after the restoration is complete. If a table is the target of a FastLoad or MultiLoad operation, foreign key references are not supported.

7.2.13 Views
A view can be used to see defined portions of one or more tables in a database. Views are presented as tables to the user, but instead of being physical tables storing data, they are virtual. Data does not exist within a view like it does in a table: only the column definition for the view is stored, and a view does not exist until it is referenced by a statement. At that point, the database will materialize the view. A user can work with a view as if they were accessing the physical table.

To define a view, the CREATE VIEW statement is used, which will provide a view name, its columns, a SELECT statement on one or more columns from the underlying tables or views, and any conditional expressions or aggregate operators. Some of the operations used to manipulate tables are not valid for views, and some are restricted. The restrictions are as follows:
• An index cannot be created on a view.
• An ORDER BY clause cannot exist within a view definition.
• View column names must be explicitly specified for any derived columns in the view.
• Tables cannot be updated from a view when the view is a join view, contains derived columns, a GROUP BY clause, or a DISTINCT clause, or defines the same column more than once.
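A sketch of a view over the hypothetical employee table; the derived AVG column must be given an explicit view column name, and because of the GROUP BY clause this view cannot be used to update the underlying table:

    CREATE VIEW dept_salaries (dept_code, avg_salary) AS
      SELECT dept_code, AVG(salary)
      FROM   employee
      GROUP BY dept_code;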
7.2.14 Triggers
Triggers are active database objects consisting of a stored SQL statement or a block of SQL statements. They execute when an INSERT, UPDATE, DELETE, or MERGE statement is used to modify a specified column or columns in the subject table. Teradata Database triggers conform to the ANSI SQL:2008 standard.

A trigger will have one of two types of granularity:
• Row triggers - executed once for each row changed by a triggering event and satisfying any qualifying condition.
• Statement triggers - executed once when the triggering statement is executed.

The following process is initiated when a trigger is executed:
• A triggering event occurs on the subject table.
• The appropriate triggers on the subject table are activated based on the triggering event.
• The trigger action time for qualified triggers is examined to determine whether they fire before or after the triggering event.
• With multiple qualifying triggers, the order of firing is based on the ANSI-specified order of creation timestamp.
• The triggered action is executed.
• Control is passed from one trigger to the next.
• If any action within the triggering process aborts, all the actions abort.

The CREATE TRIGGER statement is used to define a trigger. The ALTER TRIGGER statement will enable, disable, or change the creation timestamp of the trigger. To remove a trigger permanently from the system, the DROP TRIGGER statement is used. If a table has an active trigger, most load utilities will not be able to access the table; to disable the trigger and enable the load, the ALTER TRIGGER statement must be used by the application.
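A sketch of a row trigger on the hypothetical employee table; salary_log is a hypothetical audit table, and the exact clause forms should be checked against the CREATE TRIGGER documentation:

    CREATE TRIGGER salary_audit
    AFTER UPDATE OF (salary) ON employee       -- triggering event
    REFERENCING OLD AS oldrow NEW AS newrow
    FOR EACH ROW                               -- row granularity
    WHEN (newrow.salary <> oldrow.salary)      -- qualifying condition
    (INSERT INTO salary_log
     VALUES (oldrow.employee_id, oldrow.salary, newrow.salary);)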
7.2.15 Macros
Macros are one or more statements that can be executed as a group by performing a single EXECUTE statement. Macros can be created by a user for use only by the creator, or authorization for their use can be granted to other users. Parameters in a macro can be substituted with data values each time the macro is executed. Parameters cannot be used for data object names. References to a parameter name are prefixed with a COLON character. A USING modifier in a macro will allow parameters to be filled with data from an external source.

When macros are executed, one or more rows of data can be returned. If the macro is a single statement that returns no data, no cursor is needed; when multiple statements are used and return data, a cursor is used. Macros can be executed statically or dynamically: a static execution is specified using the EXEC statement, and a dynamic execution is specified using both a PREPARE and an EXECUTE statement. The EXEC statement is used to perform a static macro, and the DECLARE CURSOR statement is used to associate a macro cursor with a static SQL macro execution. The PREPARE statement is used to define a dynamic macro execution; the statement string within this PREPARE contains an EXEC macro-name statement, and the EXECUTE statement is then used to perform the dynamic macro.

A macro can contain a data definition statement if it is the only SQL statement in the macro. The resolution of a data definition statement is not fully reached until the macro is executed. All unqualified database object references are resolved using the default database of the user submitting the EXECUTE statement; the exception to this rule is CREATE AUTHORIZATION and REPLACE AUTHORIZATION, for which all object references must be fully qualified in the data definition statement. A macro can also contain an EXECUTE statement to execute another macro.

The following statements can be used to perform different actions on a macro:
• DROP MACRO
• REPLACE MACRO
• RENAME MACRO
• HELP MACRO
• SHOW MACRO

7.2.16 Stored Procedures
The ANSI SQL:2008 standard calls stored procedures Persistent Stored Modules. They consist of control and condition handling statements written in SQL. Applications have a server-based procedural interface to the database to manage these statements.
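A macro sketch using the hypothetical employee table; the parameter is referenced with the COLON prefix, and EXEC performs a static execution:

    CREATE MACRO dept_report (dept CHAR(3)) AS (
      SELECT employee_id, last_name, salary
      FROM   employee
      WHERE  dept_code = :dept;     -- parameter reference
    );

    EXEC dept_report ('OPS');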
Stored procedures are a set of statements, either single or compound, which comprise the stored procedure body. They are stored as objects within the user database space and executed on the server. A single-statement stored procedure body will contain only one control statement or one SQL DDL, DML, or DCL statement; declaration and cursor statements are not allowed. A compound-statement stored procedure body will contain a BEGIN-END block with a set of declarations and statements, which act as the main tasks of the procedure, such as:
• Local variable declarations
• Cursor declarations
• Condition declarations
• Condition handler declaration statements
• Control statements
• SQL DML, DDL, and DCL statements
• Multistatement requests
Statements can be nested in compound statements.

Stored procedures are created using either the COMPILE command from the BTEQ utility or the SQL CREATE PROCEDURE or REPLACE PROCEDURE statements through CLIv2 applications, ODBC, JDBC, and Teradata SQL Assistant. Modification of a stored procedure definition is done using the REPLACE PROCEDURE statement.

Execution of a stored procedure requires the appropriate privileges. If authorized, a user can use the SQL CALL statement from any supporting client utility or interface. All the parameters in a stored procedure must have their arguments specified. External stored procedures written in C, C++, or Java can also execute stored procedures.

The output from stored procedures consists of values in INOUT or OUT arguments, or of result sets from SELECT statements. Result sets are returned when the CREATE PROCEDURE or REPLACE PROCEDURE statement specifies the DYNAMIC RESULT SETS clause.
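A compound-statement sketch against the hypothetical employee table; parameter references inside DML use the COLON prefix, and the CALL shows one way to invoke it with an OUT argument:

    CREATE PROCEDURE give_raise (
      IN  emp_id  INTEGER,
      IN  pct     DECIMAL(5,2),
      OUT new_sal DECIMAL(10,2))
    BEGIN
      UPDATE employee
      SET    salary = salary * (1 + :pct / 100)
      WHERE  employee_id = :emp_id;

      SELECT salary INTO :new_sal        -- return the result via OUT
      FROM   employee
      WHERE  employee_id = :emp_id;
    END;

    CALL give_raise (1001, 5.00, new_sal);  -- the OUT argument names the returned value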
Stored procedures can be recompiled using the ALTER PROCEDURE statement, instead of executing SHOW PROCEDURE and REPLACE PROCEDURE statements. This:
• Allows cross-platform archive and restore operations for stored procedures.
• Allows stored procedures created in earlier versions of the database to be recompiled with current features.
• Allows changes in compile-time attributes such as the SPL option and Warnings option.

The following statements perform different operations on stored procedures:
• DROP PROCEDURE
• RENAME PROCEDURE
• HELP PROCEDURE
• SHOW PROCEDURE

If a stored procedure is written in the C, C++, or Java programming language, it is considered an external stored procedure; external stored procedures are installed on the database and executed like stored procedures. To install an external stored procedure, a user must have the appropriate privilege, namely the CREATE EXTERNAL PROCEDURE privilege on the database. CLIv2 is used to execute C or C++ external stored procedures, while Java external stored procedures are executed through JDBC. Nested stored procedures can be called using the FNC_CallSP library.

7.2.17 User-Defined Functions
Two types of user-defined functions are supported, allowing users to write their own functions:
• SQL UDFs
• External UDFs
Regular SQL expressions can be encapsulated in SQL UDFs and used as standard SQL functions. As a result, complex SQL expressions can be moved from queries into SQL UDFs, which is the benefit of creating them. External UDFs allow functions to be written in the C, C++, or Java programming language; they are installed on the database and used like standard SQL functions. External UDFs are categorized into three types:
• Scalar - uses input parameters to return a single value result.
• Aggregate - uses grouped sets of relational data to return a summary result.
• Table - returns a table to the SELECT statement, via a FROM clause.
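The following is a sketch of an SQL UDF rather than an external UDF; the function name is hypothetical, and the exact set of required attribute clauses is an assumption based on typical SQL UDF definitions:

    CREATE FUNCTION c_to_f (c FLOAT)
    RETURNS FLOAT
    LANGUAGE SQL
    CONTAINS SQL
    DETERMINISTIC
    SQL SECURITY DEFINER
    COLLATION INVOKER
    INLINE TYPE 1
    RETURN (c * 9.0 / 5.0) + 32.0;   -- the encapsulated SQL expression

    SELECT c_to_f(100.0);            -- used like a standard SQL function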
7.2.18 Profiles
With profiles, the following system parameters can be given defined values:
• Default database
• Spool space
• Temporary space
• Default account and alternate accounts
• Password security attributes
• Optimizer cost profile
Profiles are defined and assigned to a group of users who need to share the same settings. The purpose of profiles is to simplify system administration and control password security. To create a profile, a CREATE PROFILE statement is used, which allows the following to be set:
• Account identifiers.
• Default database.
• Allocation of space for spool files and temporary tables.
• Optimizer cost profile.
• Number of incorrect logon attempts.
• Number of days before password expiration.
• Number of days before password reuse is allowed.
• Minimum and maximum number of characters allowed in a password string.
• Allowable characters in the password string.
• Restriction of specific words from a significant portion of the password string.
• Password content rules, including:
  o Use of digits and special characters
  o At least one numeric character
  o At least one alphabetic character
  o At least one special character
  o Removal of the user name from the password
  o Mixture of character case

7.2.19 Roles
The privileges on objects within the database are defined by roles. Users are assigned roles, which provide them the authority to access the objects on which the role has privileges. The primary purpose of roles is to simplify the administration of privileges across the database and to reduce the disk space required to assign privileges to individual users. The process for managing user privileges using roles is as follows:
• Use the CREATE ROLE statement to define a role.
• Add privileges to the newly created role using the GRANT statement.
• Use the GRANT statement to grant the role to users or other roles.
• Assign default roles to users with the DEFAULT ROLE option of the CREATE USER or MODIFY USER statement.
• Change the current role for a session, if required.
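A combined sketch of the profile and role workflow, reusing the hypothetical Sales_DB and sales_user names; the password attribute names shown are assumptions from common profile definitions:

    CREATE PROFILE sales_profile AS
      DEFAULT DATABASE = Sales_DB,
      SPOOL = 1E9,
      PASSWORD = (EXPIRE = 90, MAXLOGONATTEMPTS = 3);

    CREATE ROLE sales_role;
    GRANT SELECT ON Sales_DB TO sales_role;   -- privileges go to the role
    GRANT sales_role TO sales_user;           -- the role goes to users

    MODIFY USER sales_user AS
      PROFILE = sales_profile,
      DEFAULT ROLE = sales_role;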
7.3 SQL Syntax

7.3.1 Statement Structure
The basic structure of an SQL statement is "statement_keyword action." A typical SQL statement will have a statement keyword, one or more column names, a database name, a table name, and one or more optional clauses. The first keyword in an SQL statement is the statement keyword, and it is always a verb. Other keywords found in a statement act as modifiers or introducers of clauses. The action can be any of the following:
• Expressions - any literals, name references, or operations.
• Functions - the name of a function and its arguments.
• Keywords - any words introducing clauses or phrases or representing special objects.
• Clauses - subordinate statement qualifiers.
• Phrases - data attribute phrases.

7.3.2 Keywords
Keywords are reserved and non-reserved words which have special meaning in SQL statements. If a keyword is reserved, it cannot be used to name database objects. Non-reserved keywords can be used in object names, but may cause confusion. All keywords must be ASCII compliant. As new releases of the database product are introduced, formerly valid object names may collide with newly reserved keywords.
7.3.3 Expressions
Expressions are literals, name references, or operations using names and literals. Expressions are of two types:
• Scalar - also known as a value expression; when evaluated, it produces a single number, character string, byte string, date, time, timestamp, or interval of exactly one declared type.
• Query - produces rows and tables of data, and operates on tables and views.

7.3.4 Literals
Literals are constants coded directly into the text of an SQL statement, a view or macro definition text, or CHECK constraint definition text. A numeric literal is a character string of 1 to 40 characters consisting of the digits 0-9, a plus sign, a minus sign, and a decimal point. There are three types of numeric literals:
• Integer literal - declares a literal string of integer numbers, consisting of an optional sign and a sequence of up to 10 digits.
• Decimal literal - declares a literal string of decimal numbers, consisting of an optional sign, an optional sequence of up to 38 digits, an optional decimal point, and an optional sequence of digits.
• Floating point literal - declares a literal string of floating point numbers, consisting of an optional sign, an optional sequence of digits, an optional decimal point, an optional sequence of digits, the literal character E, another optional sign, and a sequence of digits representing the exponent. If the sequence on either side of the decimal point is missing, the other sequence is required.

Character literals declare character values in an expression. The three types of character literals are the character string literal, the hexadecimal character literal, and the Unicode character string literal. Object names in the data dictionary can be created and referenced using hexadecimal name literals and Unicode delimited identifiers.

Date and time literals are used to declare date, time, or timestamp values in an SQL expression; DATE, TIME, and TIMESTAMP are the types of DateTime literals. Spans of time can be declared using interval literals, which consist of Year-Month and Day-Time categories. Period literals are used to specify a constant value of the Period data type.
The NULL keyword is considered to be like a literal. Instead of representing a value, nulls represent the absence of a value: an empty column, an unknown value, or an unknowable value. The keyword NULL acts similar to, but not identical to, a literal. NULL is ANSI SQL:2008-compliant, with extensions. It is used as a literal in the following ways:
• CAST source operand
• CASE result
• Insert items
• Update items
• Default column definition specification
• Explicit SELECT item (Teradata extension)
• Function operand (Teradata extension)

7.3.5 Operators
Logical and arithmetic operations are expressed as SQL operators. SQL operators consist of (in order of precedence):
• Numeric (unary plus, unary minus, exponentiation, multiplication, division, modulo operator, addition, subtraction)
• String (concatenation operator)
• Logical (EQ, NE, GT, LE, LT, GE, IN set, NOT IN set, BETWEEN/AND, LIKE, NOT, AND, OR)
• Value
• Set
Precedence refers to the importance of an operator compared to the other operators, and is represented in three levels. Operators are evaluated from left to right if they have the same precedence. Parentheses are used to control the order of evaluation: the innermost parentheses are always evaluated first, working outward.

7.3.6 Functions
There are two types of functions used in SQL: scalar and aggregate. Scalar functions use input parameters to return a single value result. Aggregate functions produce summary results from grouped sets of relational data.
7.3.7 Delimiters and Separators
Delimiters are special characters used in different capacities:
• PARENTHESES - used to group expressions and define limitations on phrases.
• COMMA - separates and distinguishes column names, parameters, and literals. Also separates DateTime fields.
• COLON - used to prefix reference parameters or client system variables. Also separates DateTime fields.
• FULLSTOP - used to separate database objects (database.table.column) and method names from UDT expressions, or to act as a decimal point. Also separates DateTime fields.
• APOSTROPHE - defines the boundaries of character string constants. Also separates DateTime fields.
• QUOTATION MARK - defines the boundaries of nonstandard names.
• SEMICOLON - used to separate statements in multi-statement requests, stored procedure bodies, and SQL procedure statements. Also used to terminate requests through utilities and embedded SQL statements, represented by being the last nonblank character on an input line in BTEQ (not counting the ends of comments). The SEMICOLON is a Teradata extension of the ANSI SQL:2008 standard.
• HYPHEN - separates DateTime fields.
• SOLIDUS - separates DateTime fields.
• Uppercase B and lowercase b - separates DateTime fields.

Other forms of separators are lexical and statement separators. Lexical separators are character strings located between words, literals, and delimiters; they do not change the meaning of a statement and are represented as comments, pad characters, and RETURN characters. Comments can be inserted in an SQL request anywhere a pad character can occur, and they can be simple or bracketed. Simple comments are delimited by two consecutive HYPHEN characters (--) before the comment text and a newline character at the end; a newline character is implementation-specific, but is created by pressing the Enter or Return key. Bracketed comments are a text string of unlimited length delimited by a beginning SOLIDUS and ASTERISK (/*) and an ending ASTERISK and SOLIDUS (*/). The statement separator is the SEMICOLON, which allows multiple statements in a request to be distinguished from each other.
7.4 Default Database
The database used by Teradata Database to look for unqualified object names is called the default database; the system will search this database, among others, to resolve names. If an unqualified object name exists in multiple databases, the SQL statement will produce an ambiguous name error.

A default database can be established for the current session using the DATABASE statement. Some privilege on an object in the database must exist before a user can establish it as the default database. A permanent default database can be established and invoked each time a user logs on. It can be defined using the DEFAULT DATABASE clause with the CREATE USER statement or, if a profile exists with a defined default database, using the PROFILE clause with the CREATE USER statement. To change the permanent default database definition, or to add a default database when none exists, any of the following data definition statements can be used:
• MODIFY USER, with a DEFAULT DATABASE clause.
• MODIFY USER, with a PROFILE clause.
• MODIFY PROFILE, with a DEFAULT DATABASE clause.

7.5 Functional Families
SQL statements can be executed from within client application programs, where they are referred to as embedded SQL. Teradata Database distinguishes the SQL language statements embedded in the application program from the host programming language using a special prefix, EXEC SQL.

7.5.1 Data Definition Language
Data Definition Language (DDL) is an SQL language subset. All SQL statements supporting the definition of database objects fall under the category of DDL. DDL statements perform the following functions:
• Create, drop, rename, and alter tables.
• Create, drop, rename, and alter user-defined types.
• Create, drop, rename, and replace stored procedures, user-defined functions, views, and macros.
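For example, with the hypothetical names used earlier, the session default and the permanent default are set as follows:

    DATABASE Sales_DB;   -- session default, until logoff or the next DATABASE statement

    MODIFY USER sales_user AS
      DEFAULT DATABASE = Sales_DB;   -- permanent default, applied at each logon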
• Create, drop, rename, and replace triggers.
• Create and drop indexes.
• Create, drop, and modify users and databases.
• Create, drop, and modify profiles.
• Create, drop, and set roles.
• Create, drop, and alter replication groups.
• Create, drop, and replace replication rule sets.
• Create, drop, and replace user-defined methods.
• Collect statistics on a column set or index.
• Comment on database objects.
• Establish a default database.
• Set a different collation sequence, account priority, time zone, or DATEFORM.
• Set the query band for a transaction or session.
• Enable and disable online archiving.
• Begin and end logging.

When DDL statements are entered, they can appear as:
• Single statement requests.
• Solitary statements, or the last statement, in explicit transactions.
• Solitary statements in a macro.
No DDL statement can be part of a multistatement request. If the execution of a DDL statement is successful, entries in the Data Dictionary are automatically created and updated.
7.5.2 General DDL Statements
• ALTER FUNCTION
• ALTER METHOD
• ALTER PROCEDURE
• ALTER REPLICATION GROUP
• ALTER TABLE
• ALTER TRIGGER
• ALTER TYPE
• BEGIN LOGGING/END LOGGING
• COMMENT
• CREATE AUTHORIZATION/DROP AUTHORIZATION
• CREATE CAST/DROP CAST/REPLACE CAST
• CREATE DATABASE/DELETE DATABASE/DROP DATABASE/MODIFY DATABASE
• CREATE ERROR TABLE/DROP ERROR TABLE
• CREATE FUNCTION/DROP FUNCTION/REPLACE FUNCTION
• CREATE GLOP SET/DROP GLOP SET
• CREATE HASH INDEX/DROP HASH INDEX
• CREATE INDEX/DROP INDEX
• CREATE JOIN INDEX/DROP JOIN INDEX
• CREATE MACRO/DROP MACRO/RENAME MACRO/REPLACE MACRO
• CREATE METHOD/REPLACE METHOD
• CREATE PROCEDURE/DROP PROCEDURE/RENAME PROCEDURE/REPLACE PROCEDURE
• CREATE PROFILE/DROP PROFILE/MODIFY PROFILE
• CREATE REPLICATION GROUP/DROP REPLICATION GROUP
• CREATE REPLICATION RULESET/DROP REPLICATION RULESET/REPLACE REPLICATION RULESET
• CREATE ROLE/DROP ROLE/SET ROLE
• CREATE TABLE/DROP TABLE/RENAME TABLE
• CREATE TRANSFORM/DROP TRANSFORM/REPLACE TRANSFORM
• CREATE TRIGGER/DROP TRIGGER/RENAME TRIGGER/REPLACE TRIGGER
• CREATE TYPE/DROP TYPE
• CREATE USER/DELETE USER/DROP USER/MODIFY USER
• CREATE VIEW/DROP VIEW/RENAME VIEW/REPLACE VIEW
• DATABASE
• DROP ORDERING/REPLACE ORDERING
• LOGGING ONLINE ARCHIVE ON/OFF
• SET QUERY_BAND
• SET SESSION
• SET TIME ZONE

7.5.3 Data Control Language
Data Control Language (DCL) is a subset of the SQL language. DCL statements define the security authorization required for accessing database objects, specifically granting and revoking privileges and transferring ownership of a database to another user. A data control statement can be either:
• A single statement request.
• A solitary statement, or the last statement, in an explicit transaction.
• A solitary statement in a macro.
A DCL statement cannot be part of a multistatement request. If a DCL statement executes successfully, entries in the data dictionary are created and updated automatically.
DCL statements consist of:
• GIVE
• GRANT
• GRANT CONNECT THROUGH
• GRANT LOGON
• REVOKE
• REVOKE CONNECT THROUGH
• REVOKE LOGON

7.5.4 Data Manipulation Language
Data Manipulation Language (DML) is a subset of the SQL language. DML statements define the manipulation or processing of database objects. The most common form of DML statement is the SELECT statement, which returns information found in the tables of the relational database. The components of the SELECT statement are the referenced table columns, the database (if other than the default), and the table(s) within the database to access. The SELECT statement can be used to select columns, returning the data stored in the table, and to select rows: a single statement can be used to retrieve data from all rows or specified rows, and even specified columns for all rows or specified rows. Greater detail can be obtained by using one or more of the following clauses:
• FROM
• WHERE
• ORDER BY
• DISTINCT
• WITH
• GROUP BY
• HAVING
• TOP
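A sketch combining several of these clauses against the hypothetical employee table used earlier:

    SELECT   dept_code, SUM(salary) AS total_sal
    FROM     employee
    WHERE    salary > 0
    GROUP BY dept_code
    HAVING   SUM(salary) > 100000
    ORDER BY total_sal DESC;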
A special form of the SELECT statement is the zero-table SELECT statement, which returns data without accessing any tables. Subqueries are nested SELECT statements.

The INSERT statement can be used to add a new row to a table. An INSERT...SELECT statement can be used to perform a bulk insert of rows retrieved from another table. The CREATE TABLE statement may define defaults and constraints which affect an insert operation in the following ways:
• If an attempt is made to add a duplicate row for a unique index, or to a table defined as SET, the operation is rejected and an error message is returned. The only exception is when an INSERT...SELECT is performed on a table defined as SET in Teradata mode.
• If a value for a column is omitted and a defined default value is required, the default value for the column is stored.
• If a value for a column is omitted when NOT NULL is specified and no default is specified, an error is returned.
• If a value which does not satisfy the constraints, or violates a defined constraint, is supplied, the operation is rejected and an error message is returned.

An UPDATE statement is used to modify data in one or more rows of a table. The column name of the data being modified is specified in the statement, and a WHERE clause can be used to identify the rows to be changed. An update operation can be affected by the attributes in the CREATE TABLE statement:
• A value violating a defined constraint will be rejected and an error message returned.
• An update result that would violate uniqueness constraints or create a duplicate row is rejected and an error message returned.
• If a NULL value is supplied and is allowed, any existing data in the column is removed.

The DELETE statement allows an entire row or rows to be removed from a table. The WHERE clause in the statement qualifies the rows to be deleted.

The MERGE statement can be used to merge a source row set into a target table. The merge is based on a matching condition between the source rows and the target rows.
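A sketch of the bulk-insert and merge forms, again using the hypothetical Sales tables (Txn_ID is assumed to be the primary index of the target):

-- Bulk insert of rows retrieved from another table
INSERT INTO Sales.Txn_History
SELECT *
FROM Sales.Daily_Txn
WHERE Txn_Date < DATE '2010-01-01';

-- Merge a source row set into a target table on a matching condition
MERGE INTO Sales.Txn_History AS tgt
USING Sales.Daily_Txn AS src
  ON (tgt.Txn_ID = src.Txn_ID)
WHEN MATCHED THEN
  UPDATE SET Amount = src.Amount
WHEN NOT MATCHED THEN
  INSERT (src.Txn_ID, src.Txn_Date, src.Store_ID, src.Amount);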
7.5.5 Query and Workload Analysis

Recursive queries can be used to query hierarchies of data. A recursive query is specified using a WITH RECURSIVE clause preceding a query, or a RECURSIVE clause in a CREATE VIEW statement. Recursion is characterized by three steps: initialization, repeated iteration of the logic, and termination. The process of performing a recursive query has three phases:
• Create an initial result set.
• Perform recursion based on the existing result set.
• Return the final result set in a final query.

To ensure infinite recursion is not performed, depth control is provided. This is done by:
1. Specifying a depth control column in the column list.
2. Initializing the column value to 0 in the seed statements.
3. Incrementing the column value by 1 in each recursive statement.
4. Specifying a limit for the value of the depth control column.
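A minimal sketch of this pattern, assuming a hypothetical Payroll.Employee table in which Mgr_ID refers to another row's Emp_ID:

WITH RECURSIVE Org_Chart (Emp_ID, Mgr_ID, Depth) AS (
  -- Phase 1: the seed statement creates the initial result set (Depth = 0)
  SELECT Emp_ID, Mgr_ID, 0
  FROM Payroll.Employee
  WHERE Mgr_ID IS NULL
  UNION ALL
  -- Phase 2: recursion on the existing result set (Depth incremented by 1)
  SELECT e.Emp_ID, e.Mgr_ID, o.Depth + 1
  FROM Payroll.Employee e
  JOIN Org_Chart o ON e.Mgr_ID = o.Emp_ID
  WHERE o.Depth < 20   -- the limit on the depth control column stops infinite recursion
)
-- Phase 3: the final query returns the final result set
SELECT Emp_ID, Depth
FROM Org_Chart;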
The following SQL statements are provided by Teradata to collect and analyze query and data demographics:
• BEGIN QUERY LOGGING
• COLLECT DEMOGRAPHICS
• COLLECT STATISTICS
• DROP STATISTICS
• DUMP EXPLAIN
• END QUERY LOGGING
• INITIATE INDEX ANALYSIS
• INITIATE PARTITION ANALYSIS
• INSERT EXPLAIN
• RESTART INDEX ANALYSIS
• SHOW QUERY LOGGING

The results of these statements can be used by the Optimizer to produce better query plans, or to populate user-defined Query Capture Database (QCD) tables. Diagnostic statements supporting the Teradata Index Wizard can be used to emulate a production environment on a test system. These statements include:
• DIAGNOSTIC HELP PROFILE
• DIAGNOSTIC SET PROFILE
• DIAGNOSTIC COSTPRINT
• DIAGNOSTIC DUMP COSTS
• DIAGNOSTIC HELP COSTS
• DIAGNOSTIC SET COSTS
• DIAGNOSTIC DUMP SAMPLES
• DIAGNOSTIC HELP SAMPLES
• DIAGNOSTIC SET SAMPLES
• DIAGNOSTIC "Validate Index"
8 Database Administration

8.1 Physical Database Design

An index is a physical mechanism for storing and accessing rows in a table. Retrieval of data can easily be processed using indexes. Indexes are used to:
• Locate data rows
• Distribute data rows
• Improve performance
• Ensure index value uniqueness

The different types of indexes include:
• Primary
• Partitioned Primary
• Secondary
• Join
• Hash
• Special (referential integrity)

8.1.1 Primary Indexes

Data can be distributed or accessed using a primary index. A primary index is the most efficient method of accessing data, and the best primary index has the greatest level of uniqueness. Tables can be created with a Unique Primary Index (UPI), a Non-Unique Primary Index (NUPI), or No Primary Index (NoPI). With UPIs, columns cannot have duplicate values, guaranteeing uniform distribution of table rows. A table created with a NUPI can have duplicate values in its columns, potentially causing "skewed data" when distributing data. The degree of uniformity in the distribution is highly susceptible to the degree of uniqueness of the index.

A Primary Key (PK) defines a column, or columns, used to uniquely identify a row in a table. The values in the column must be unique from each other and cannot be null. The PK values should never be changed, as relationships with other tables may be lost if a PK is changed or re-used.
A Partitioned Primary Index (PPI) still provides a path to rows in the base table using PI values; both UPIs and NUPIs can be partitioned. The default PI of Teradata Database is a non-partitioned PI. PPIs can be used with tables, global temporary tables, volatile tables, and non-compressed join indexes. When a PPI is used to create a table or join index, rows are hashed to AMPs based on the PI columns and assigned to the appropriate partitions; rows are stored in row hash order within a partition. A Multilevel Partitioned Primary Index (MLPPI) allows partitions to be sub-partitioned. There can be up to 15 levels of partitioning, and each level of the MLPPI must define at least two partitions. The product of the numbers of partitions cannot exceed 65,535, which may further restrict the number of levels allowed.

Foreign Keys (FKs) identify table relationships; specifically, they model a relationship between data values on different tables. For instance, two tables are created to store information about subscription customers: one table lists billing information on customers, with the primary key being the customer's account number, and the other table lists the customer's address, with the address column being the primary key. The core relationship between these two tables is the customer's name, since a column with these values can be found on both tables. In this instance, the customer's name is the foreign key.

8.1.2 Secondary Indexes

Secondary Indexes (SIs) are used to obtain information using alternative paths. Some characteristics of Secondary Indexes:
• Row distribution across AMPs is not affected.
• Can be unique or non-unique.
• Useful for NoPI tables.
• Can be dropped and recreated when required.

Full table scans are not required when an SI is used, thus improving performance, but SIs add to table overhead. Subtables for all SIs are built by the system. These subtables have index rows where the SI value is associated with one or more rows, and the subtable rows are updated when changes are made to column values or new rows are added. The Optimizer can use indexes to improve query performance.
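A minimal sketch of a partitioned table with a unique secondary index — the table and columns are hypothetical:

CREATE TABLE Sales.Daily_Txn
 (Txn_ID   INTEGER NOT NULL,
  Txn_Date DATE NOT NULL,
  Store_ID INTEGER,
  Amount   DECIMAL(10,2))
PRIMARY INDEX (Store_ID)            -- a NUPI; may skew if stores vary widely in volume
PARTITION BY RANGE_N(Txn_Date BETWEEN DATE '2010-01-01'
                     AND DATE '2010-12-31' EACH INTERVAL '1' MONTH);

-- A unique secondary index provides an alternative access path
CREATE UNIQUE INDEX (Txn_ID) ON Sales.Daily_Txn;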
8.1.3 Join Indexes

An indexing structure containing columns from one or more base tables is called a Join Index (JI). Join indexes support a primary index. A single-table JI contains rows from a single table and is used as an alternative approach to directly accessing data. A multitable JI can be used to predefine a join when queries frequently request that join. A query is covered by the JI when all referenced columns are stored in the index and the query only needs to examine the JI. When queries use the JI but also the base tables, the query is considered partially covered by the index. Teradata Database can support multitable, partially-covering JIs. All types of join indexes can be joined to base tables to retrieve columns referenced by the query but not stored in the JI; the exception to this rule is aggregate join indexes.

An aggregate JI can be defined on two or more tables, or on a single table, using:
• The SUM function
• The COUNT function
• The GROUP BY clause

Sparse Join Indexes use the WHERE clause in the CREATE JOIN INDEX statement to index only a portion of the table; this type of JI limits the rows indexed.

8.1.4 Hashing

To distribute data for tables with a PI to disk storage, hashing is used. A row hash is obtained by hashing the values of the PI columns. Nearly all indexes are based, in whole or in part, on the row hash values instead of the table column values. To distinguish between rows with the same row hash within the same table, a sequence number is assigned. A row identifier, a combination of the row hash and the sequence number, uniquely identifies each row in a table.

When an SI value is specified in the SQL, its hash value is used to access the required rows. The hash value is computed by hashing the values of the SI columns, and it is recorded in the SI subtable, along with the actual value of the index columns and a list of primary index row identifiers.
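Returning to join indexes, a minimal sketch of a sparse, multitable join index as described above, assuming hypothetical Customer and Orders tables:

CREATE JOIN INDEX Sales.Cust_Ord_JI AS
SELECT c.Cust_ID, c.Cust_Name, o.Order_ID, o.Order_Date
FROM Sales.Customer c
JOIN Sales.Orders o ON c.Cust_ID = o.Cust_ID
WHERE o.Order_Date >= DATE '2010-01-01'   -- the WHERE clause makes the index sparse
PRIMARY INDEX (Cust_ID);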
8.1.5 Identity Columns

The ANSI standard defines a column attribute option called the Identity Column. The attribute is used to generate a unique, table-level number for every row in a table. Identity columns can be used to generate UPIs, USIs, primary keys, and surrogate keys.

8.1.6 Normalization

A complex database schema can be reduced into a simple, stable database schema. The process of this reduction is called normalization. The primary concept behind this process is normal forms, which define a system of constraints. A relation is in a normal form if it meets the constraints of that particular normal form. Normal forms are layered, based on the Boyce-Codd model, typically taking on first, second, and third normal forms.

Relational databases are defined in the first normal form (1NF): a relational database is always normalized at least to the first form, as that is what makes a database relational. If all fields within a relation contain one and only one value (atomic), the relation is considered to be in 1NF. No hierarchies of data values are allowed in first normal form.

A relation is considered to be in second normal form (2NF) if it is in 1NF and every non-key attribute is fully dependent on the entire Primary Key. Non-key attributes are any attributes not part of the Primary Key. Where first normal form eliminates repeating groups, second normal form eliminates circular dependencies. If, in addition, no two non-Primary Key columns or groups of columns are in a one-to-one relation in either direction, the relation is considered to be in third normal form. The elimination of non-key attributes that do not describe the Primary Key is the basis of third normal form.

8.1.7 Referential Integrity

Relationships between tables are the focus of referential integrity, specifically when the relationship is based on the definition of a primary key and a foreign key. Columns within a referencing table can be specified as foreign keys for columns in a referenced table which are defined as either primary key columns or unique columns. Essentially, the concept of referential integrity states that a row cannot exist in a referencing table with a non-null foreign key value unless an equal value exists in the referenced table.
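A minimal sketch tying these ideas together — an identity column generating the surrogate key of a parent table, and a foreign key reference from a child table (all names hypothetical):

CREATE TABLE Sales.Customer
 (Cust_ID INTEGER GENERATED ALWAYS AS IDENTITY
    (START WITH 1 INCREMENT BY 1 NO CYCLE),
  Cust_Name VARCHAR(60) NOT NULL)
UNIQUE PRIMARY INDEX (Cust_ID);

CREATE TABLE Sales.Orders
 (Order_ID   INTEGER NOT NULL,
  Cust_ID    INTEGER NOT NULL REFERENCES Sales.Customer (Cust_ID),
  Order_Date DATE)
PRIMARY INDEX (Order_ID);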
Referential integrity utilizes the following concepts:
• Parent Table - also called the referenced table; it is referred to by a child table.
• Child Table - the table where the referential constraints are defined.
• Parent Key - a parent table column set referred to by a foreign key column set in a child table.
• Primary Key - a candidate key in the parent table.
• Foreign Key - a child table column set referring to a primary key column set in a parent table.

8.2 Database Administration

When Teradata Database is installed, it has only one user, called DBC. This user will own all other databases and users in the system. Associated with the user DBC is a default database, also called DBC. The user DBC also owns all space for the entire system; as new databases and users are created, permanent space is extracted from the user DBC. The usable disk space in the DBC database will initially reflect the entire system hardware capacity, except for space used by the following system users, databases, and objects:
• Crashdumps user
• SysAdmin user
• SystemFE user (for field engineers)
• TDPUSER user
• Sys_Calendar database
• TD_SYSFNLIB database
• SQLJ database (for external routines and UDTs)
• SYSLIB database (for external routines and UDTs)
• SYSUDTLIB database (for external routines and UDTs)
• SYSSPATIAL database (for geospatial data types)
• DBCExtension database (for GLOP sets)
• System Transient Journal (JT)
• System catalog tables of the Data Dictionary
• User accessible views

To ensure sufficient space is dedicated to the DBC database to allow it to be used as spool, permanent space can be set aside and remain unused, or an empty database can be created to hold the space.

8.2.1 System Users

The DIP utility has several executable files for creating system users, databases, and administrative tools. These executable files contain SQL scripts known as DIP scripts. The following are system users created by these scripts:
• SysAdmin - created for internal use in the database to manage and perform system administration functions; contains several views and macros for administrative purposes.
• SystemFE - created for internal use by field engineers for diagnostic and emulation operations.
• Crashdumps - used to log internal errors; created specifically by a DIPCRASH script run during DIP installation.
• Sys_Calendar - contains the Sys_Calendar.CalDates table and the Sys_Calendar.Calendar view.
• TD_SYSFNLIB - used to contain domain-specific functions.

8.2.2 Administrator User

The responsibilities of the administrator user are:
• Establish user management policy with the security administrator.
• Create and manage databases, tables, views, macros, stored procedures, UDTs, journals, query logs, and other database objects.
• Allocate space to users and databases.
• Grant privileges to roles and users.
• Manage users and databases.
• Create and manage accounts.
• Monitor system performance.
• Manage space usage.
• Troubleshoot user problems.
• Manage data archive and restore.
• Manage data load and export.
• Manage database maintenance tasks.

To protect sensitive data and system objects owned by the user DBC, it is recommended that an administrative user be created. The name of the administrative user cannot be the same as any name already reserved by the system. When assigning space to this user, enough space should be allocated to handle the growth of system tables, logs, and transient journals.

8.2.3 Administration Tools

To perform administrative tasks, the following methods can be utilized:
• Direct submission of SQL statements through BTEQ, BTEQWin, SQL Assistant, or scripts
• Client-based utilities
• DBS utilities
• Teradata Viewpoint

8.3 System Administration

8.3.1 Session Management

When a user logs on to Teradata Database, a session is established. The session begins after the username, password, and account number are accepted by the database and the database returns a session number to the process. Subsequent requests and responses over the session are identified by the host id, session number, and request number. The identification used by the database is provided automatically and is unknown to the user.
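For example, a session can be established interactively from BTEQ — the tdpid, user name, and password here are placeholders:

.LOGON tdp1/hr_admin,secretPass1
SELECT SESSION;    -- returns the session number the database assigned
.LOGOFF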
The actual procedure for establishing the session differs slightly based on the client system, the operating system, and whether the user is an application program, an interactive terminal session with an application, or an interactive user. The logon string used to identify the user can include:
• Tdpid
• User name
• Password
• Optional account number

8.3.2 Utilities

Administrative and maintenance functions can be performed by system administrators using a large number of available utilities for Teradata Database, including:
• aborthost (Abort Host) - used to abort all outstanding transactions on a failed host.
• ampload (AMP Load) - used to identify the load of all AMP vprocs on the system.
• awtmon (AWT Monitor) - used to collect and display a summary of AWT snapshots for users.
• checktable (CheckTable) - used to check for inconsistencies between internal data structures.
• cnsrun (CNS Run) - used to run database utilities from scripts.
• config (Configuration Utility) - used to define AMPs, PEs, and hosts, and their interrelationships.
• ctl (Control GDO Editor) - used to display and change PDE Control Parameters GDO fields.
• cufconfig (Cufconfig Utility) - used to display and modify configuration settings for the UDF and external stored procedures subsystem.
• DIP (Database Initialization Program) - executes one or more DIP scripts.
• dbscontrol (DBS Control) - used to display and modify DBS Control Record fields.
• dbschk, nodecheck, or syscheck (Resource Check Tools) - used to identify slowdowns and potential hangs of the database, and to display statistics.
• DUL or DULTAPE (Dump Unload/Load) - used to save system dumps to tape and restore them from tape.
• dumplocklog (Locking Logger) - used to log the transaction identifiers, session identifiers, lock object identifiers, and lock levels associated with SQL statements currently being executed.
• ferret (Ferret Utility) - used to define the scope of an action, display its parameters, and execute the action.
• filer (Filer Utility) - used to find and correct problems in the file system.
• gtwcontrol (Gateway Control) - used to change default values in the Gateway Control GDO.
• lokdisp (Lock Display) - used to display all real-time database locks and sessions in a snapshot capture.
• modmpplist (Modify MPP List) - used to modify the node list file.
• qryconfig (Query Configuration) - used to report the current Teradata Database configuration.
• qrysessn (Query Session) - used to monitor the session state on selected logical host IDs.
• rcvmanager (Recovery Manager) - used to display information that allows monitoring of recovery progress.
• reconfig (Reconfiguration Utility) - used to establish an operational database with the component definitions created by the Configuration utility.
• reconfig_estimator (Reconfiguration Estimator) - used to estimate the elapsed time for reconfigurations, providing time estimates for redistribution, deletion, and NUSI building.
• schmon (Priority Scheduler) - used to create, modify, and monitor Teradata Database process prioritization parameters.
• showlocks (Show Locks) - used to display locks placed by Archive and Recovery and Table Rebuild operations.
• sysinit (System Initializer) - used to initialize the database.
• rebuild (Table Rebuild) - used to rebuild tables that cannot be automatically recovered by the database.
• tdlocaledef (Tdlocaledef Utility) - used to convert an SDF into a GDO.
• tdnstat (TDN Statistics) - used to perform GetStat/ResetStat operations and display statistics.
• tdntune (TDN Tuner) - used to display and change tunable parameters for Teradata Network Services.
• tpareset (Tpareset) - used to reset the PDE and database components.
• tpccons (Two-Phase Commit Console) - used to perform 2PC-related functions.
• tsklist (Task List) - used to display information about PDE processes.
• updatedbc (Update DBC) - used to recalculate the PermSpace and SpoolSpace values in the DBASE table.
• updatespace (Update Space) - used to recalculate the permanent, temporary, or spool space used by database(s).
• verify_pdisks (Verify_pdisks) - used to verify pdisks are accessible and correctly mapped.
• vprocmanager (Vproc Manager) - used to manage vprocs.

8.4 User and Security Management

8.4.1 Databases and Users

Databases and users are uniquely named permanent spaces. They store objects which use space, and they hold objects which do not take space but have definitions in the Data Dictionary. Databases are logical repositories for database objects. Users are similar to databases, but can perform actions; users have passwords and startup strings. Each database and user may optionally contain one permanent journal. To create a new database or user, the CREATE DATABASE or CREATE USER statement can be used; the necessary privileges must be explicitly granted first. As databases and users are created, a hierarchy is formed; at the top of the hierarchy is DBC, the parent of all subsequent databases and users.
The DROP DATABASE and DROP USER statements can be used to remove a specified database or user. In order for the process to complete, all objects contained in the database or user must be deleted. Any journal table associated with the database or user requires all data table references to it to be removed, and the journal itself removed, using the DROP DEFAULT JOURNAL TABLE option in a MODIFY DATABASE or MODIFY USER statement.

8.4.2 User Types

Several types of users can be created:
• Permanent database user - created as the product of the CREATE USER statement; has PERM space on disk and logs directly into the Teradata Database.
• Externally authenticated database user - granted access with a NULL password provided by external authentication, or externally authorized without permanent space allotted, requiring a mapping to a permanent user or to a generic user supplied in Teradata Database called EXTUSER.
• Directory user - located on a directory server and not in the Teradata Database.
• Trusted user - a permanent user granted the privilege to assume a proxy user identity.
• Proxy user - uses a session of the trusted user to access the Teradata Database; can be a permanent user or an application proxy user.

8.4.3 Creating Users

To create a new user:
• Identify the job function of the user to determine the appropriate roles and profiles to assign.
• Identify the implications for permanent, temporary, and spool space.
• Define appropriate authorization checks and validation procedures.
• Resolve any ownership issues.
• Ensure LOGON, GRANT, REVOKE, session, and other account and access-related activity is audited.
• Map directory users to database users.

The CREATE USER statement is used to add users to the system. In addition to defining a unique username, the statement requires a password — permanent or temporary — and permanent space to be defined. The following defaults can optionally be defined:
• Default database - typically the username, used during a session; a SET SESSION DATABASE statement can be submitted to change the default for the current session.
• Default role - users must have an active role; unless a SET ROLE statement is submitted, no role is used in the privilege validation process.
• Startup string - the user can enter a startup string during logon, if required; the default is null.

The default values of the CREATE USER statement, if not defined by the DDL, are:
• FROM database - the default database of the creating user, which provides the space to store or search for new or target objects.
• ACCOUNT - if a profile is assigned and defines a single account, the default is the account defined in the profile; if the profile defines more than one account, the default is the first account in the string; if no profile is assigned, or the profile defines no accounts, the default is the account identifier of the immediate owner of the user.
• SPOOL - if a profile is assigned to the user and has a SPOOL value, the value is the defined limit from the profile; if the profile does not have a SPOOL value, or no profile is assigned, the value used is the same SPOOL value as the immediate owner.
• TEMPORARY - if a profile is assigned to the user and has a TEMPORARY value, the default is the defined limit from the profile; if the profile does not have a TEMPORARY value, or no profile is assigned, the default is the same TEMPORARY value as the immediate owner of the space.
• PROFILE - no profile exists for the user; one must be specified if known and provided.

Temporary passwords can be specified when a user is first created. If the user attempts to log into the Teradata Database without a password, the statement is rejected as incomplete.
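A minimal sketch of the statement, with hypothetical names and illustrative space values in bytes:

CREATE USER hr_admin FROM DBC AS
  PASSWORD = initialPass1,
  PERM = 5000000000,
  SPOOL = 2000000000,
  TEMPORARY = 1000000000,
  DEFAULT DATABASE = hr_admin,
  ACCOUNT = '$M_HR';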
Null passwords can only be used when external authentication is employed. New passwords can be prompted upon first login when installing a new system, or through the enforcement of password requirements.

8.4.4 Roles

Roles can be used to manage user privileges. Privileges are assigned to a specific role, and roles are assigned to users; users can then access all objects associated with the assigned role. Roles can be granted to users or to other roles. Roles can reduce the number of rows added to and deleted from the DBC.AccessRights table; therefore, performance can be improved. When using roles to manage privileges, the following considerations apply:
• Create different roles for different job functions or responsibilities.
• Specify the privileges to objects for each role.
• Grant the specific privileges on database objects to each role.
• Assign default roles to users.
• Grant each role to the users and roles that require it.

A role is created using the CREATE ROLE statement. Once a role is created, creator privileges are automatically assigned to its creator; the DROP ROLE and WITH ADMIN OPTION privileges are provided during this creation. When given the WITH ADMIN OPTION, creator privileges allow:
• Any role created by the user to be dropped by the user.
• Any role created by the user to be granted to other users and roles.
• A role to be granted with the WITH ADMIN OPTION to another user.
• Any role granted to be revoked.

The assigned default role is activated when the user logs onto the system. To change or nullify the current role, a SET ROLE statement can be submitted.
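A minimal sketch of the role lifecycle, with hypothetical names:

CREATE ROLE hr_read;
GRANT SELECT ON HR_DB TO hr_read;        -- privileges are granted to the role
GRANT hr_read TO hr_analyst01;           -- the role is granted to a user
MODIFY USER hr_analyst01 AS DEFAULT ROLE = hr_read;
SET ROLE hr_read;                        -- change the current role within a session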
The following rules apply to roles:
• One or more roles can be granted to one or more users or roles: a role can have many users, and a user can have many roles.
• Single-level nesting is allowed: a role that has a member role cannot itself be a member of another role.
• Privileges granted to a role are inherited by every user or role that is a member of the role.
• Any privilege granted to an existing role will affect any user or role specified as a recipient of the GRANT statement.
• The current session may be set by the user to ALL: the SET ROLE ALL statement allows all roles to be available within a session.
• CREATE USER...AS DEFAULT ROLE ALL requires no roles to have been granted by the creator.
• MODIFY USER...AS DEFAULT ROLE ALL requires no roles to have been granted by the user.

The following privileges cannot be granted to roles:
• CREATE ROLE
• DROP ROLE
• CREATE PROFILE
• DROP PROFILE
• CREATE USER
• DROP USER
• CTCONTROL
• REPLCONTROL

The GRANT OPTION cannot be granted to roles, preventing members of a role from granting the privileges it contains to other users or roles.
8.4.5 Profiles

Profiles are used to simplify the management of user settings. Profiles are assigned to a group of users to ensure all members of the group operate under the same parameters. Profile definitions are assigned to users and override specifications at the system or user level: values in a profile always take precedence over the values defined for the user. Profiles allow several attributes to be managed, including:
• Password settings
• Account strings
• Default database assignments
• Spool limits
• Temporary space limits

External profiles can be assigned to directory users by the directory administrator. All users, except DBC, must be explicitly granted the CREATE PROFILE privilege; this privilege allows a profile to be created, dropped, granted, and implemented. If profile definitions are not specified when a CREATE PROFILE statement is submitted, the following defaults are used:
• Password attributes - the system default settings.
• Performance group - level M.
• Account ID - derived from the default account ID of the user.
• Default database - the defined user setting.
• Spool space amount - the defined user setting.
• Temporary space amount - the same value as for the immediate owner of the space.
• Cost profile - the default value is NULL.
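A minimal sketch, with hypothetical names and illustrative settings:

CREATE PROFILE analyst_p AS
  ACCOUNT = '$M_ANALYST',
  DEFAULT DATABASE = Sandbox,
  SPOOL = 2000000000,
  TEMPORARY = 1000000000,
  PASSWORD = (EXPIRE = 90, MINCHAR = 8, MAXLOGONATTEMPTS = 3);

MODIFY USER hr_analyst01 AS PROFILE = analyst_p;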
8.4.6 Privileges

The implementation of security in the database is largely done through the granting of privileges. Sometimes referred to as access rights, privileges are applied to objects to ensure the user has the right to access or manipulate an object within the Teradata Database. Privileges are either explicit or implicit. Implicit privileges are inherited through ownership of the object and cannot be revoked. Explicit privileges are granted to a user or database directly: they are either granted on objects by another user, or granted automatically on objects the user or database creates. A user can have both implicit and explicit privileges; if they conflict, the explicit privileges take precedence. Explicit privileges can be revoked.

Privileges are maintained in the Data Dictionary, specifically in the DBC.AccessRights table, which contains information on explicitly or automatically granted privileges. Explicit privileges are stored as one row per user, per privilege, per database object. Rows in the DBC.AccessRights table are inserted or removed each time a CREATE/DROP or GRANT/REVOKE statement is submitted. Privileges can be granted to a role — a collection of privileges on database objects — or to an external role, and they can be explicitly granted to PUBLIC.

The following views can be used to report explicit privileges:
• DBC.AllRightsV - all explicit privileges granted.
• DBC.AllRoleRightsV - all explicit privileges granted to each role.
• DBC.UserRightsV - all explicit privileges granted to the requesting user.
• DBC.UserGrantedRightsV - any explicit privileges granted by the requesting user to other users.
• DBC.UserRoleRightsVX - all explicit privileges granted to each role for the requesting user.
• DBC.RoleMembersV - each role and every user or role the role has been granted to.
• DBC.RoleMembersVX - all roles directly granted to the requesting user.

When working with an object, the system checks privileges in the following order:
• The DBC.AccessRights table is checked for the required privilege at the individual level.
• The DBC.AccessRights table is checked for the required privilege at the role level, including each nested role identified in the DBC.RoleGrants table for the user's current role.
• The required privilege is checked to see if it is a PUBLIC privilege.
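As a brief illustration of explicit privileges — the user and database names are hypothetical:

GRANT SELECT, INSERT ON Sales TO hr_admin WITH GRANT OPTION;

-- A privilege granted to PUBLIC is available to every user
GRANT SELECT ON Sales.Daily_Txn TO PUBLIC;

-- Explicit privileges, unlike implicit ones, can be revoked
REVOKE INSERT ON Sales FROM hr_admin;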
If middle-tier applications are being used, proxy users must be associated with trusted users to allow the database to be accessed from an external source. When using a proxy connection, the system will use the privileges of the proxy user and not those of the trusted user. The system checks for privileges in the following way:
• If the proxy user is a permanent database user, the privileges used are from the permanent user, the proxy user's active roles, and PUBLIC.
• If the proxy user is an application proxy user, the privileges used are from the active roles of the proxy user and PUBLIC.
• The privileges of the proxy user are not used for macros, because the appropriate privileges must come from the immediately owning database.
• For stored procedures, the privileges of the immediate owner, or the privileges of the "invoker" of the procedure, are used depending on the SQL SECURITY clause: if the stored procedure is created with the SQL SECURITY INVOKER clause, the proxy user's privileges are used.

Implicit privileges are implied for an owner of an object through the ownership hierarchy defined in the Data Dictionary. These implicit privileges are valid for as long as the object is owned by the associated database or user. Implicit privileges cannot be revoked and are not logged in DBC.AccessRights.

If a CREATE USER/DATABASE statement is submitted, the following privileges are automatically assigned to the creator:
• ANY
• CHECKPOINT
• CREATE AUTHORIZATION
• CREATE DATABASE
• CREATE MACRO
• CREATE TABLE
• CREATE TRIGGER
• CREATE USER
• CREATE VIEW
• DELETE
• DROP AUTHORIZATION
• DROP DATABASE
• DROP FUNCTION
• DROP MACRO
• DROP PROCEDURE
• DROP TABLE
• DROP TRIGGER
• DROP USER
• DROP VIEW
• DUMP
• EXECUTE
• INSERT
• SELECT
• STATISTICS
• RESTORE
• UPDATE

When a user or database is created, it automatically receives the following privileges:
• ANY
• CHECKPOINT
• CREATE AUTHORIZATION
• CREATE MACRO
• CREATE TABLE
• CREATE TRIGGER
• CREATE VIEW
• DELETE
• DROP AUTHORIZATION
• DROP FUNCTION
• DROP MACRO
• DROP PROCEDURE
• DROP TABLE
• DROP TRIGGER
• DROP VIEW
• DUMP
• EXECUTE
• INSERT
• SELECT
• STATISTICS
• RESTORE
• UPDATE

The following privileges must be explicitly granted to the creators of a user or database:
• ALTER EXTERNAL PROCEDURE
• ALTER FUNCTION
• ALTER PROCEDURE
• CREATE EXTERNAL PROCEDURE
• CREATE FUNCTION
• CREATE PROCEDURE
• EXECUTE FUNCTION
• EXECUTE PROCEDURE
• SHOW
The following privileges must be explicitly granted to created users or databases:
• ALTER EXTERNAL PROCEDURE
• ALTER FUNCTION
• ALTER PROCEDURE
• CREATE DATABASE
• CREATE EXTERNAL PROCEDURE
• CREATE FUNCTION
• CREATE PROCEDURE
• CREATE USER
• DROP DATABASE
• DROP USER
• EXECUTE FUNCTION
• EXECUTE PROCEDURE
• SHOW

The DBC user, or a user holding the following privileges, can grant the same privileges to another:
• Table level privileges
  o INDEX
  o REFERENCES
• GLOP Data privileges
  o CREATE GLOP
  o DROP GLOP
  o GLOP MEMBER
• Monitor privileges
  o ABORTSESSION
  o MONRESOURCE
  o MONSESSION
  o SETRESRATE
  o SETSESSRATE
• System level privileges
  o CREATE PROFILE
  o CREATE ROLE
  o DROP PROFILE
  o DROP ROLE
  o REPLCONTROL
  o CTCONTROL
• UDT privileges
  o UDTMETHOD
  o UDTUSAGE
  o UDTTYPE

The following views have an AccessRight column:
• DBC.AllRightsV
• DBC.AllRoleRightsV
• DBC.UserRightsV
• DBC.UserGrantedRightsV
• DBC.UserRoleRightsV

The column records a two-character code representing the privilege granted on the particular object referenced. The codes are as follows:
• AE - ALTER EXTERNAL PROCEDURE
• AF - ALTER FUNCTION
• AP - ALTER PROCEDURE
• AS - ABORT SESSION
• CA - CREATE AUTHORIZATION
• CD - CREATE DATABASE
• CE - CREATE EXTERNAL PROCEDURE
• CF - CREATE FUNCTION
• CG - CREATE TRIGGER
• CM - CREATE MACRO
• CO - CREATE PROFILE
• CP - CHECKPOINT
• CR - CREATE ROLE
• CT - CREATE TABLE
• CU - CREATE USER
• CV - CREATE VIEW
• D - DELETE
• DA - DROP AUTHORIZATION
• DD - DROP DATABASE
• DF - DROP FUNCTION
• DG - DROP TRIGGER
• DM - DROP MACRO
• DO - DROP PROFILE
• DP - DUMP
• DR - DROP ROLE
• DT - DROP TABLE
• DU - DROP USER
• DV - DROP VIEW
• E - EXECUTE
• EF - EXECUTE FUNCTION
• GC - CREATE GLOP
• GD - DROP GLOP
• GM - GLOP MEMBER
• I - INSERT
• IX - INDEX
• MR - MONITOR RESOURCE
• MS - MONITOR SESSION
• NT - NONTEMPORAL
• OP - CREATE OWNER PROCEDURE
• PC - CREATE PROCEDURE
• PD - DROP PROCEDURE
• PE - EXECUTE PROCEDURE
• R - RETRIEVE/SELECT
• RF - REFERENCES
• RO - REPLCONTROL
• RS - RESTORE
• SA - SECURITY CONSTRAINT ASSIGNMENT
• SD - SECURITY CONSTRAINT DEFINITION
• SH - SHOW
• SR - SET RESOURCE RATE
• SS - SET SESSION RATE
• TH - CTCONTROL
• U - UPDATE
• UM - UDT Method
• UT - UDT Type
• UU - UDT Usage

8.5 Object Maintenance

8.5.1 Data Dictionary

The Data Dictionary contains information about the entire database and is comprised of tables, views, and macros stored in the user DBC. The Data Dictionary:
• Stores information about created objects.
• Is automatically updated as objects are created, altered, modified, or dropped, and as privileges are granted or revoked.

The Data Dictionary tables are created when the Teradata software is installed; the accompanying system views are created through the DIPVIEWS utility. Data Dictionary tables are updated whenever a data definition (DDL) or data control (DCL) statement is processed. While some tables are referenced by SQL requests, the other tables are used only for system or data recovery. Most Data Dictionary tables are fallback protected: a copy of every table row is maintained on different AMPs in the configuration, providing automatic recovery. Some Data Dictionary tables contain rows that are not distributed using hash maps and are stored AMP-locally.

8.5.2 Data Dictionary Views

The following are views of the Data Dictionary:
• DBC.AllTempTablesV - provides information on all global temporary tables materialized in the system, found in the DBC.TempTables table.
• DBC.ChildrenV - provides information on hierarchical relationships, found in the DBC.Owners table.
• DBC.DatabasesV - provides information about databases, users, and immediate parents, found in the DBC.Dbase table.
• DBC.UsersV - provides information about users, found in the DBC.Dbase table.
• DBC.TablesV - provides information about tables, views, macros, stored procedures, functions, triggers, and replication status, found in the DBC.TVM table.
• DBC.IndicesV - provides information on the indexes on tables, found in the DBC.Indexes table.
• DBC.ShowTblChecksV - provides database table constraint information, found in the DBC.TableConstraints table.
• DBC.TriggersV - provides information about event-driven, specialized actions attached to a single table, found in the DBC.TriggersTbl table.

Most views reference more than one table, and access to Data Dictionary information can be limited. Some views are user-restricted: they apply only to the user submitting the query, and report only a subset of the available information. These views are identified by an X appended to the system view name and are sometimes called X views. The only difference between X views and non-X views is the existence of a WHERE clause to ensure a user can view only those objects the user owns, is associated with, has been granted privileges on, or has been assigned a role with privileges on. The Data Dictionary stores object names in Unicode to allow the same set of characters to be available regardless of the character set of the client; Unicode versions of the views are identified by a V appended to the system view name.

The processing of a DDL or DCL statement can be examined by preceding the statement with an EXPLAIN modifier. The operating conditions resulting from an EXPLAIN modifier are:
• The statement is not executed.
• The processing of the DDL or DCL statement is described.
• The type of locking used, and on what objects, is described.
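A short sketch of both ideas, assuming the hypothetical Sales database from earlier examples:

-- Report the tables of one database from the Data Dictionary
SELECT TableName, TableKind, CreatorName
FROM DBC.TablesV
WHERE DatabaseName = 'Sales'
ORDER BY TableName;

-- Describe, without executing, how a DDL statement would be processed
EXPLAIN CREATE INDEX (Store_ID) ON Sales.Daily_Txn;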
8.6 Capacity Management

Disk space is managed in bytes. There are three types of space:
• Permanent (PERM) - allocated to a user or database as a uniquely defined, logical repository for database objects.
• Temporary (TEMP) - used to store data for global temporary tables.
• Spool - stores intermediary query results or formatted answer sets, and volatile tables.

8.6.1 Ownership

The privileges of the owner and the creator are different and determine the default settings for undefined parameters. The creator of an object is the user submitting the CREATE statement; the creator is not necessarily the owner or immediate owner of the object. The owner of an object is the database or user just above the object in the hierarchy; the immediate owner is sometimes referred to as the parent. The parent of a new user or database is the database space where the user or database resides. If the object is directly below a user or database in the hierarchy, the object is immediately owned by that user or database. If the object is created in the user's own database by the user, the user is both creator and immediate owner. If a user creates an object in the space of another user or database, the user is the creator of the object, but the immediate owner is the other user or database. If a user owns a second user, and the second user creates an object in the second user's database, the second user is the immediate owner; the first user is an owner, but not the immediate owner.

Ownership of databases and users can be transferred from one immediate owner to another. To perform this transfer, the GIVE statement is used. The GIVE statement will, in reality, transfer the permanent space owned by the transferring database or user to the specified recipient. The rules for transferring ownership are as follows:
• Databases and users can only be transferred using the GIVE statement.
• An object cannot be given to the children of the object.
• The explicit DROP DATABASE or DROP USER privilege on both the transferred object and the receiving object must be held to use the GIVE statement.
• In addition to the specified object, all child objects are transferred.
• Explicit privileges granted to others on a transferred user are not automatically revoked: explicit privileges must be explicitly revoked as required.
• Implicit privileges are impacted by the changed ownership, transferring to the new owner.

8.6.2 Space Limits

Permanent space limits can be set at the database or user level; they are not set at the table level. These limits set the maximum with the PERM parameter of a CREATE/MODIFY USER/DATABASE statement. PERM space limits are deducted from the available space of the immediate owner of the database or user. All available permanent space is initially allocated to the user DBC. The space is then allocated to other users created from the user DBC; as additional new databases and users are created under those users, the space is allocated from the immediate parent of the object being created.

The specified amount of permanent space for each user or database is divided by the number of AMPs in the configuration. Each AMP records the result, which may not be exceeded on that AMP. Permanent space is used to store all data tables, subtables, index tables, stored procedures, triggers, and permanent journals. Unused space is dynamically allocated for temporary or spool space, resulting in a reduction of the actual PERM space available.

When the system inserts rows, permanent space is dynamically acquired by data blocks and cylinders. A data block is a disk-resident structure containing one or more rows from the same table, and it acts as the physical I/O unit for the file system. Data blocks are stored in segments and grouped in cylinders. Space on cylinders is obtained from a pool of free cylinders. If the space required for a transaction is not available, the transaction is aborted. If no unused space is available, new objects cannot be created until more space is acquired.

The DBC.DiskSpaceV view can be used to determine the amount of PERM space available for each AMP; for all AMPs on the system, the SUM aggregate can be used. The PERM space values tracked in DBC.DiskSpaceV are:
• CURRENTPERM - the total number of bytes currently allocated to existing tables, subtables, index tables, stored procedures, triggers, and permanent journals.
• MAXPERM - the maximum number of bytes available for the storage of all data.
• PEAKPERM - the largest number of bytes used to store data in a user or database since the last reset.
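For example, a per-database summary of these values across all AMPs can be drawn directly from the view:

SELECT DatabaseName,
       SUM(MaxPerm)     AS TotalMaxPerm,
       SUM(CurrentPerm) AS TotalCurrentPerm,
       SUM(PeakPerm)    AS TotalPeakPerm
FROM DBC.DiskSpaceV
GROUP BY DatabaseName
ORDER BY TotalCurrentPerm DESC;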
A number of cylinders can be reserved for transactions requiring permanent space. The number of reserved cylinders is determined by the File System field in DBS Control called "Cylinders Saved for PERM." If a statement requires permanent space and one or more free cylinders exist, the statement succeeds. If the statement requires spool space, it succeeds only when more free cylinders exist than the amount specified in Cylinders Saved for PERM. If the statement fails in either case, a disk full error is returned. It is therefore possible to have a greater number of errors for requests requiring SPOOL space than for those requiring PERM space.

The percentage of cylinder space left unused during load operations is called the Free Space Percent (FSP). This space is controlled at the global and table level using:
• The FreeSpacePercent field of DBS Control - determines the percentage of space left unused on each cylinder.
• The FREESPACE attribute in a CREATE/ALTER TABLE statement - determines the percentage of cylinder space left on each cylinder when bulk loading the table.
• The DEFAULT FREESPACE option of an ALTER TABLE statement - resets the current free space percent for a table to the global setting.
• The FREESPACEPERCENT option in PACKDISK - the percentage of storage space that PACKDISK should leave unoccupied on cylinders.
• The "Cylinders Saved for PERM" field of DBS Control - the number of cylinders saved for permanent data only.

Data blocks are segments of one or more rows from a single subtable. To set data block size limits, the global parameters PermDBSize, PermDBAllocUnit, and JournalDBSize are set in the DBS Control utility to determine the maximum size of permanent data blocks holding multiple rows. The DATABLOCKSIZE = n [BYTES/KBYTES/KILOBYTES] specification in the CREATE/ALTER TABLE statement sets the maximum multi-row data block size, up to 127.5 KB. If a row exceeds the size of a multi-row data block, the row is placed in its own block.

Teradata Database has the ability to merge small DBs into larger ones during full-table modify operations; in the long term, the number of DBs in a table is reduced, along with the number of I/Os. The frequency of merges, along with the size of the resulting data blocks, can be controlled by setting the system-level MergeBlockRatio field in DBS Control or by specifying the table-level MergeBlockRatio attribute in the CREATE/ALTER TABLE statement. This field/attribute allows adjustments to be made to ensure merged data blocks are a reasonable size. Proper settings should consider:
• The frequency of accessing or modifying certain tables.
• The amount of data being modified.
• The general size of blocks desired across certain tables.
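A sketch of the table-level controls mentioned above; the table name is hypothetical and the specific values are purely illustrative:

CREATE TABLE Sales.Txn_History,
  FREESPACE = 10 PERCENT,          -- leave 10% of each cylinder free during bulk loads
  DATABLOCKSIZE = 64 KILOBYTES,    -- maximum multi-row data block size
  MERGEBLOCKRATIO = 60 PERCENT     -- table-level merge control
 (Txn_ID   INTEGER NOT NULL,
  Txn_Date DATE,
  Amount   DECIMAL(10,2))
PRIMARY INDEX (Txn_ID);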
8.6.3 Spool Space

Spool space is used by the system for the response rows of every query run by a user during a session, and for intermediate spool tables. By limiting spool space, the impact of badly written queries can be reduced. Spool space is drawn dynamically from unused system PERM space. By default, all unused permanent space in the Teradata Database system is available for use as spool space; a spool reserve database can be used to reserve a minimum amount of space that cannot be used for storage.

The types of spool space are:
• Volatile Spool - retained until a manual drop of the table during a session, the termination of the session, or a Teradata Database reset.
• Intermediate Spool - used to hold intermediate rows during query execution; retained until the intermediate spool results are no longer required by the query, or the Teradata Database resets.
• Output Spool - retained until a transaction completes, the response rows are returned in the answer set for a query, or the rows are updated within, inserted into, or deleted from a base table.

Spool requirements vary with the workload: fallback, no compression, real-time and batch loads, high concurrency, and large intermediate spools require more spool space; highly compressed data and well-designed queries require less. A system with no fallback and a medium workload is recommended to have a required spool for the entire system of 30 percent of MAXPERM, or 40 percent of the CURRENTPERM size.

Spool space limits can be set for a database, a user, or a profile for a user, but not for a table. The maximum and default limits for databases, users, and profiles are:
• If SPOOL is specified in a CREATE/MODIFY USER/DATABASE statement and a profile does not apply, the limit may not exceed the limit of the immediate owner of the user or database.
• If SPOOL is specified in a CREATE/MODIFY USER/DATABASE statement and a profile does apply, the limit may not exceed the limit of the user submitting the statement and is declared in the profile.
• If SPOOL is not specified in a CREATE/MODIFY USER/DATABASE statement and a profile does not apply, the limit is inherited from the specification of the immediate owner.
• If SPOOL is not specified in a CREATE/MODIFY USER/DATABASE statement and a profile does apply, the limit is inherited from the profile specification.
• If SPOOL is not specified in a CREATE/MODIFY USER/DATABASE statement and a profile does apply but the parameter is NULL or NONE, the limit is taken from the user submitting the statement and declared by the specification in the statement, or the specification set for the immediate owner (if no spool is defined).
Teradata Viewpoint can be used to view spool trends for each database or user. Available permanent space can be dynamically allocated as spool when required. Reserving permanent space for spool requirements ensures transaction processing is not impacted; an amount of up to 40 percent of space relative to CURRENTPERM is the suggested guideline. To create a spool reserve database, the CREATE DATABASE statement is submitted and the amount of space to reserve is specified in the PERM parameter. No objects should be created in this database, and no data should be stored in it; this ensures the database's space remains available for use as spool.

The following settings can be assigned to spool space limits when using the CREATE USER, CREATE DATABASE, or CREATE PROFILE statements:
• MAXSPOOL - limits the number of bytes allocated to create spool files for a user.
• CURRENTSPOOL - the number of bytes in use for resolving queries.
• PEAKSPOOL - the maximum number of bytes used by a transaction for a user since the last reset.
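A minimal sketch of a spool reserve, with illustrative names and sizes:

-- Park PERM in an empty database so it can only ever be used as spool
CREATE DATABASE Spool_Reserve FROM DBC AS PERM = 500000000000;

-- Cap the spool available to an individual user
MODIFY USER hr_admin AS SPOOL = 2000000000;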
8.6.4 Temporary Space
Temporary space holds the rows of materialized global temporary tables. It is allocated at the database or user level, but not at the table level. The TEMPORARY parameter within the CREATE/MODIFY PROFILE or CREATE/MODIFY USER/DATABASE statements is used to define temporary space. The maximum and default limits for temporary space allocation follow these rules:
• If TEMPORARY is specified in a CREATE/MODIFY USER/DATABASE statement and a profile does not apply, the limit is inherited from, and cannot exceed, the limit of the immediate owner.
• If TEMPORARY is specified in a CREATE/MODIFY USER/DATABASE statement and a profile does apply, the limit cannot exceed the submitting user's limit and is declared by the profile limit (if one exists), the limit specified in the statement, or the limit of the immediate owner (if no TEMPORARY is defined for the user).
• If TEMPORARY is not specified in a CREATE/MODIFY USER/DATABASE statement and a profile does not apply, the limit is inherited from the immediate owner of the user.
• If TEMPORARY is not specified in a CREATE/MODIFY USER/DATABASE statement and a profile does apply, the limit is inherited from the profile specification.
• If TEMPORARY is not specified in a CREATE/MODIFY USER/DATABASE statement and a profile does apply with a TEMPORARY parameter of NULL or NONE, the limit is inherited from the submitting user and declared by the profile limit (if one exists), the limit specified in the statement, or the limit of the immediate owner (if no TEMPORARY is defined for the user).

The different types of temporary space include:
• CURRENTTEMP - the amount of space currently used by global temporary tables.
• PEAKTEMP - the maximum temporary space used since the last session.
• MAXTEMP - limits the space available for global temporary table rows.
8.6.5 Data Compression
Disk space usage can be reduced using data compression. The most common forms of compression are:
• Multi-Value Compression
• Algorithmic Compression
• Hash/Join Index Row Compression
• Block Level Compression
Within the CREATE/ALTER TABLE statement, the COMPRESS phrase is used to specify a list of frequently occurring values for compression. The phrase is associated with the columns containing the values. For data matching a value specified in the list, the database stores the value only once in the table header, and a smaller substitute value in each affected row. Multi-Value Compression (MVC), as described above, has the greatest cost/benefit ratio of any compression method, requiring minimal resources to decompress the data and having little impact on query/load performance. Despite these benefits, the number of MVC values listed in the statement is limited to:
• A maximum of 255 values per column.
• Allowable storage per column for uncompressed values up to:
  o 7800 bytes per column for BYTE, KANJI1, and KANJISJIS data.
  o 7800 characters per column for GRAPHIC, LATIN, or UNICODE data.
• The 1 MB capacity of the table header for storing the list of values being compressed, minus whatever space is needed to store other table-related information.

Standard algorithmic compression (ALC) algorithms are available for Teradata Database, along with a framework for creating custom algorithms for compression and decompression. When data is moved into a specified table, the COMPRESS algorithm is invoked; the DECOMPRESS algorithm is invoked when the data is accessed. The system will not apply ALC to any value covered by MVC, although ALC and MVC can be used concurrently on the same column. ALC is most effective when column values are unique. The type of algorithm used can yield different performance benefits, and ALC use is limited to certain types of table data. Compression algorithms are provided in the form of UDFs, which are used to compress character data by table column. The different Teradata UDFs are:
• CAMSET and CAMSET_L - compress Unicode or Latin (respectively) character set data using a proprietary compression algorithm from Teradata.
• DECAMSET and DECAMSET_L - decompress character data compressed by the CAMSET or CAMSET_L algorithm.
• LZCOMP and LZCOMP_L - compress Unicode or Latin (respectively) character data using the ZLIB compression library, based on the Lempel-Ziv algorithm.
• LZDECOMP and LZDECOMP_L - decompress character data compressed by the LZCOMP or LZCOMP_L algorithm.
• TRANSUNICODETOUTF8 - compresses Unicode data into UTF8 format.
• TRANSUTF8TOUNICODE - decompresses Unicode data compressed by TRANSUNICODETOUTF8.

Hash/Join Index row compression divides rows into repeating portions and non-repeating portions. Multiple sets of non-repeating column values are appended to a single set of repeating column values, so the repeating value set is stored only once. The non-repeating column values are stored as logical segmental extensions of the base repeating set, and a pointer exists from each non-repeating column set to the repeating column set. Row compression is the default for hash indexes; for join indexes, row compression is specified in the SELECT clause of a CREATE JOIN INDEX statement, which allows specific join index columns to be listed separately to identify which will be subject to row compression.
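A minimal sketch of the COMPRESS phrase described at the start of this section, with hypothetical columns and value lists:

CREATE TABLE Sales.Customer_Addr
 (Cust_ID INTEGER NOT NULL,
  City    VARCHAR(30) COMPRESS ('New York', 'Chicago', 'Los Angeles'),
  Country VARCHAR(40) COMPRESS 'USA')
PRIMARY INDEX (Cust_ID);

Each listed value is stored once in the table header; qualifying rows carry only the small substitute value.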
Block Level Compression (BLC) stores data blocks in compressed format. Data blocks are the physical units of I/O defining how the Teradata Database file system handles data. The maximum multi-row data block size can be defined using the DATABLOCKSIZE option in the CREATE/ALTER TABLE statement. The Lempel-Ziv algorithm, through the ZLIB library, is used for all BLC. BLC should be applied to large tables only, to balance the space benefits against the CPU cost. Secondary index subtables are not compressed by BLC. Once a table is enabled for BLC, the system defaults to applying BLC to all subsequent updates to the table, unless BLC is:
• Disabled globally by a DBS Control field.
• Disabled for a particular table by the FERRET utility UNCOMPRESS command.
• Disabled for a particular type by a DBS Control field.
• Not meeting DBS Control compression parameters.

The following are DBS Control fields:
• BlockLevelCompression - enables or disables BLC globally.
• CompressionAlgorithm - specifies the algorithm used to compress DB data.
• CompressionLevel - determines whether compression operations will favor processing speed or degree of data compression.
• CompressPermDBs - specifies conditions for compressing permanent table DBs.
• CompressSpoolDBs - specifies conditions for compressing spool DBs.
• CompressGlobalTempDBs - specifies conditions for compressing global temporary table DBs.
• CompressPJDBs - specifies whether compression is applied to all permanent journal DBs.
• CompressCLOBTableDBs - specifies conditions for compressing CLOB subtables.
• CompressMloadWorkDBs - specifies conditions for compressing data blocks from MultiLoad sort worktables and index maintenance worktables.
• MinDBSectsToCompress - identifies the minimum size of DBs to be compressed; the maximum setting is 255 sectors.
• MinPercentCompReduction - identifies the minimum percentage by which the size of a DB must be reduced by compression.
• UncompressReservedSpace - specifies the minimum percentage of storage space that must remain available after compression.
8.7 Session Management

8.7.1 Session Modes
A session is a logical connection between an application and Teradata Database. A session begins when Teradata Database accepts the username and password of a user. Sessions are identified by a unique number assigned by the TDP. Two session modes are available: Teradata and ANSI. In Teradata session mode, an error rolls back the entire transaction and all locks are released. In ANSI mode, the system will roll back only the request causing the error, not the entire transaction.

Trusted sessions allow user identities and roles to be asserted without establishing a logon session for the user. With these sessions, end users can be authenticated and requests submitted on behalf of the end user. A GRANT CONNECT THROUGH statement is used to assert a proxy user identity through a trusted user and create a trusted session. A REVOKE CONNECT THROUGH statement removes this assertion. Both these statements are executed based on the CTCONTROL privilege. Trusted sessions are handled as follows:
• A session pool created by the middle-tier application authenticates itself to Teradata Database.
• The application end user is authenticated with the middle-tier application and requests a service requiring a query to Teradata Database.
• The active session identity and role are set by the middle-tier application for the application end user by submitting a SET QUERY_BAND statement with the PROXYUSER and PROXYROLE name-value pairs.
• The PROXYUSER's (application end user) connection privilege through the database connection is verified.
• Queries to Teradata Database are submitted by the middle-tier application on behalf of the PROXYUSER.
• Privileges for the query are verified by Teradata Database based on the PROXYUSER's active roles.
• The PROXYUSER's identity is recorded by Teradata Database in the Access Log and Database Query Log.
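The statements below sketch this flow end to end. The trusted user, proxy user, and role names are illustrative assumptions:

    -- Allow trusted user AppServer to assert WebUser01 as a proxy user:
    GRANT CONNECT THROUGH AppServer
        TO PERMANENT WebUser01 WITH ROLE Web_Read;

    -- The middle-tier application, logged on as AppServer, asserts the end user:
    SET QUERY_BAND = 'PROXYUSER=WebUser01;PROXYROLE=Web_Read;' FOR SESSION;

    -- Remove the assertion when it is no longer needed:
    REVOKE CONNECT THROUGH AppServer TO PERMANENT WebUser01;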
Whether the PROXYUSER is set for the session or for the transaction determines how the system handles operations. If set to session, the temporary and volatile tables and result sets for the PROXYUSER are discarded at the end of the proxy connection. If set to transaction, the PROXYUSER can be switched to another PROXYUSER within a transaction, and the temporary and volatile tables and result sets are not discarded.

8.7.2 Monitoring Tools
Information about a session can be obtained from these tools:
• Teradata Viewpoint - can provide information about queries running in the system for the purpose of identifying problem queries.
• PM/APIs - allow session information to be immediately accessed with the proper privileges in place: the MONITOR SESSION request or the MonitorSession function can be used to collect current status and activity information about the sessions.
• DBC.AmpUsage view and ASE - a single row is generated by ASE into DBC.AmpUsage for every AMP in every session; the rows are not written until after the query completes.
• DBQL information - the submission of the BEGIN QUERY LOGGING statement will enable logging, allowing the monitoring of users, accounts, and applications.

8.7.3 Accounts
An account and a session are always associated together. Accounts manage workloads, monitor resource usage, provide a basis for billing, and assign work priorities. Account strings can be defined at the user level and the profile level; multiple strings can be separated with commas and enclosed in parentheses. Typically, the session is associated with the default account of the user at logon, unless the logon or startup string explicitly specifies a different account. A SET SESSION ACCOUNT statement can be explicitly submitted to change the session to another account.
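For instance, where the user name and account string are illustrative assumptions:

    -- Begin DBQL logging for one user:
    BEGIN QUERY LOGGING ON ETL_User;

    -- Switch the remainder of the current session to a different account:
    SET SESSION ACCOUNT = '$M_ETL_BATCH' FOR SESSION;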
Each account is provided an Account ID, which aids in query and workload analysis. The Account ID can be found in the DBC.Accounts and DBC.Acctg tables and in the log entries in the DBQL tables. The names of account IDs take the following format: $PG$WDID&S&D&H, where:
• $PG$ - the performance group identifier
• WDID - the workload definition identifier
• &S - the session number
• &D - the date
• &H - the hour

When a CREATE/MODIFY DATABASE statement is processed, a row is inserted or updated in DBC.Accounts and DBC.Dbase. When a CREATE/MODIFY USER/PROFILE statement is processed, a row is inserted or updated in DBC.Accounts and DBC.Dbase or DBC.Profiles.

The determination of the default account follows the rules below:
• If no account is defined for the user or a database, the default is the account of the immediate owner of the user or the database.
• If multiple accounts are defined, the default is the first account defined in the user or database definition.
• If the user has no profile, the default is the first account in the user definition.
• If no account is defined for a user without a profile assignment, the default is the account of the immediate owner of the user.
• If the user has a profile with multiple accounts, the default is the first account defined in the profile.
• If the user has a profile with no accounts, the default is none for the profile; the first account in the user definition applies or, otherwise, the account of the immediate owner of the user.
• If no account is defined for the user or for members of a profile with a NULL account, the default is the first account in the definition string.
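A sketch of a user definition whose account strings carry a performance group and ASE substitution variables; the user name, space figure, password, and account strings are all illustrative assumptions:

    CREATE USER Etl_User AS
        PERM = 1000000000,
        PASSWORD = TempPass01,
        ACCOUNT = ('$M_ETL&S&D&H', '$L_ADHOC&S&D&H');  -- the first entry is the default account

At logon the &S, &D, and &H variables expand to the session number, date, and hour, so resource usage rows can later be summarized by when and where the work ran.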
8.7.4 System Accounting
Three administrative functions are possible through system accounting:
• Charge-back Billing - allows cost allocations of system resources across all users. The user account string can be used to summarize resource usage based on the account ID.
• Capacity Planning - allows the collection and analysis of resource utilization information, used to understand the workload requirements of the system.
• Workload Management - through the Priority Scheduler, can manage user account priorities, as well as the assignment of Priority Scheduling codes and ASE codes.

Administration of accounts is performed using the following two views:
• DBC.AccountInfoV - used to access the DBC.Accounts, DBC.Dbase, and DBC.Profiles dictionary tables to provide information on all valid accounts for all databases, users, and profiles.
• DBC.AMPUsage - used to view the usage of each AMP for each user and account, as well as the activities of any console utilities.

To monitor database access, the following session-related system views are used:
• DBC.LogOnOffV - used to identify the session and duration of user sessions.
• DBC.LogonRulesV - used to view the current logon rules.
• DBC.SessionInfoV - used to identify the users currently logged on and other information, such as session source, transaction type, current partition, collation, role, password status, query band, query logging (DBQL), system UDFs and external stored procedures, and audit trail ID.
• DBC.Software_Event_LogV - used to view system error messages for Teradata Database Field Engineers.

8.7.5 Using Query Bands
Query bands are sets of name-value pairs assigned to a session or transaction. They identify the originating source of a query. The identifiers are defined and stored along with other session data in DBC.SessionTbl. To retrieve the query band, query the QueryBand field in the DBC.SessionInfo view, use the HELP SESSION statement, or use Teradata Viewpoint.
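Two quick sketches of putting these views to work; the column names shown are standard DBC view fields, but treat the exact names as assumptions to verify against your release:

    -- Charge-back: total CPU and I/O by account
    SELECT AccountName,
           SUM(CpuTime) AS TotalCpu,
           SUM(DiskIO)  AS TotalDiskIO
    FROM DBC.AMPUsage
    GROUP BY AccountName
    ORDER BY TotalCpu DESC;

    -- Retrieve the query band of the current session
    SELECT QueryBand
    FROM DBC.SessionInfoV
    WHERE SessionNo = SESSION;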
8.8 Business Continuity

8.8.1 Archive, Restore, and Recover
The ARC utility is invoked by Backup Application Software solutions to archive, restore, recover, and copy data. The ARC utility aids in:
• Archiving a database, individual database object, or selected partition to an archive file.
• Restoring a database, individual database object, or selected partitions from an archive file to the same or a different database.
• Copying a database, individual object, or selected partition to a different Teradata Database.
• Recovering a database to an arbitrary checkpoint using before- or after-change images in a permanent journal table.
• Deleting a changed image row from a permanent journal table.

8.8.2 Required Privileges
Any object being processed by archive, restore, recover, or copy operations must have specific privileges assigned. If the privileges are missing, ARC will overlook the object and continue with the operation. The following privileges must be explicitly granted:
• Archive operations - ARCHIVE/DUMP
• Restore operations - RESTORE
• Copy operations - RESTORE, plus CREATE TABLE/VIEW/MACRO/TRIGGER/PROCEDURE/FUNCTION
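For example, a backup user could be granted archive and restore rights on a single database like this (the database and user names are illustrative assumptions):

    GRANT DUMP    ON Payroll TO Backup_User;
    GRANT RESTORE ON Payroll TO Backup_User;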
8.8.3 Session Control
A user must log on and start a session to use the ARC utility. The LOGON statement must specify the name of the Teradata machine that ARC connects to, along with the username and password to be used. The user ID used to log on must have privileges for the ARC statements being used. The LOGOFF statement ends all sessions logged on by the task and terminates the ARC utility.

Multiple sessions are possible, or the default can be used; the SESSIONS runtime parameter sets the number of sessions. The optimal number of sessions for archive and restore operations is one session for each AMP. Each session is assigned to a vproc and stays with that vproc until all required data is archived. Complete archive blocks are built from each vproc during archive operations, and data blocks from different vprocs are not combined in the same archive block.

8.8.4 HUT Locks
HUT locks are created by the ARC utility and are applied to any ARC operation performed on an object. They affect only the AMPs participating in the ARC operation and are associated with the user performing the operation. HUT locks are not used for Online Archive. Transaction locks are automatically released when a transaction completes; HUT locks must be explicitly released, and they will be reinstated automatically should the database reset. To release a HUT lock, the RELEASE LOCK option in an ARC command can be used, or the RELEASE LOCK command in a job script can be executed.
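A minimal ARC job script illustrating the statements above; the TDP name, credentials, database, and archive file name are illustrative assumptions:

    LOGON prod_tdp/Backup_User,backup_password;
    ARCHIVE DATA TABLES (Payroll) ALL,
        RELEASE LOCK,
        FILE = ARCHIVE1;
    LOGOFF;

Including RELEASE LOCK in the ARCHIVE statement removes the HUT lock as soon as the operation completes successfully, rather than requiring a separate RELEASE LOCK command afterward.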
9 Practice Exam

9.1 Refresher “Warm up Questions”
The following multiple-choice questions are a refresher.

Question 1
What functional module is used by Teradata Database to control operations of the database environment?
A. Teradata Database Management Software
B. Teradata Gateway
C. Database Windows
D. Parallel Data Extension

Question 2
What type of journal is used to roll back failed transactions?
A. Transient
B. Permanent
C. Down AMP Recovery
D. Any of the above
Question 3
What Active System Management product allows sessions or transactions to be tagged with an ID?
A. Teradata Viewpoint
B. Query Bands
C. Resource Usage Monitor
D. Teradata Workload Analyzer

Question 4
When performing requirements analysis, which perspective will result in a system model?
A. Subcontractor
B. Owner
C. Builder
D. Designer

Question 5
Which level of normal form is also called the Boyce-Codd Normal Form?
A. 3NF
B. 1NF
C. 5NF
D. 2NF
Question 6
What is the typical size of a row hash value?
A. 12
B. 16
C. 20
D. 32

Question 7
How many secondary, hash, and join indexes can a Teradata Database support?
A. 12
B. 16
C. 32
D. 64

Question 8
Which of the following data types can be defined in semantic constraints?
A. BLOB
B. Integer
C. UDT
D. Geospatial
Question 9
Which of the following methods will not use a literal NULL?
A. During a MERGE operation
B. As an explicit SELECT item
C. As a CASE result
D. As a default column definition specification

Question 10
Which of the following statements is part of the Data Control Language?
A. SELECT
B. CREATE
C. GIVE
D. ABORT

Question 11
Which of the following can lead to a SQL null?
A. Value does not exist
B. Value not valid
C. Value is an empty set
D. All of the above
Question 12
The logical equivalent for “for any” is signified by what term from formal logic and set theory?
A. Existential quantifier
B. Identity predicate
C. Universal quantifier
D. Predicate logic

Question 13
Which of the following integrity constraints is not considered a semantic constraint?
A. Column
B. Database
C. Physical
D. Table

Question 14
What is a defining characteristic of NoPI tables?
A. A permanent journal with no primary index
B. Temporal tables with a MULTISET table type
C. Updated using SQL UPDATE requests
D. Primary index value is not used to hash rows to an AMP
Question 15
Which of the following is a cause of semantic disintegrity?
A. Improper normalization
B. Dependencies
C. Relationship mapping
D. None of the above

Question 16
What logical operator represents the set of all attributes which are contained in both relations?
A. Join
B. Intersection
C. Union
D. Product

Question 17
Which of the following phrases best describes the architecture used by the Teradata Database?
A. Interconnected database
B. Intermodal database
C. BYNET database
D. Shared nothing database
Question 18
Which of the following statements will provide a write lock on a row hash?
A. CREATE DATABASE
B. SELECT
C. INSERT
D. ALTER TABLE

Question 19
Which of the following information items is not gathered on databases for management purposes?
A. Fallback tables
B. Indexes and constraints
C. Account name
D. Space allocation

Question 20
Which of the following are characteristics of tables?
A. Cardinality and relation
B. Row and column
C. Tuple and attribute
D. All of the above
Question 21
What is used to control access to resources in a database while that resource is in use?
A. Locks
B. Journals
C. Roles
D. Groups

Question 22
What are users called who access the database using a trusted session between the database and a middle-tier application?
A. Trusted User
B. Proxy User
C. Application User
D. Any of the Above

Question 23
Which of the following is not a dimension of the Zachman enterprise data model?
A. People
B. Entities
C. Locations
D. Policies
Question 24
Which level of normal form focuses on the elimination of circular dependencies?
A. 2NF
B. 5NF
C. 1NF
D. 3NF

Question 25
Which of the following is not a valid index type?
A. Single table aggregate join index
B. Non-unique secondary index hash-ordered on all columns with an ALL option
C. Unique single-level non-partitioned primary index
D. Multitable sparse join index

Question 26
What is the size of the row length field of a subtable row?
A. 2 bytes
B. 8 bytes
C. 10 bytes
D. 12 bytes
Question 27
Which of the following column-level integrity constraints is not supported on temporal tables?
A. UNIQUE
B. FOREIGN KEY…REFERENCES
C. PRIMARY KEY
D. CHECK

Question 28
Which principle of relational database management declares that all foreign key values must have a match?
A. Principle of Interchangeability
B. Entity Integrity Rule
C. Principles of Normalization
D. Referential Integrity Rule

Question 29
Which of the following is not a term used to describe the relevance of data?
A. Warm
B. Cool
C. Critical
D. Hot
Question 30
Which of the following components is not part of the space requirements for the table area?
A. User spool space
B. Cylinder size
C. Data dictionary
D. User TEMP space

Question 31
Which delimiter is used to separate statements in a multistatement request?
A. SEMICOLON
B. SOLIDUS
C. APOSTROPHE
D. COLON

Question 32
What is the size limit for the table header?
A. 2 KB
B. 5 KB
C. 1 MB
D. 2 MB
Question 33
The DBS Control utility can define the integrity levels used for physical constraints. Which predefined integrity level requires the most processing?
A. HIGH
B. MEDIUM
C. DEFAULT
D. ALL

Question 34
Which of the following rules applies to transferring compression values to join index columns?
A. Transfers to a definition cannot occur if alias names are specified.
B. Transfers to a definition will occur as long as the maximum header length of the index is not exceeded.
C. Transfers to a definition will occur only when columns are components of a partitioning expression.
D. Transfers to a definition will occur only when columns are a component of the primary index for the join index.
Question 35
How many columns can be supported by a primary index definition?
A. 16
B. 32
C. 64
D. 128

Question 36
Which of the following criteria is valid for selecting primary keys?
A. Use numeric attributes whenever possible
B. Always use intelligent keys
C. Select attributes that are likely to change
D. Attributes do not need to be unique

Question 37
What type of key used in the normalization process is encoded with more than one fact?
A. Natural key
B. Composite key
C. Candidate key
D. Intelligent key
Question 38
A data mart is a small subset of a data warehouse database. What type of data mart is virtually constructed from a physical database?
A. Data basement
B. Dependent data mart
C. Logical data mart
D. Independent data mart

Question 39
What is the most restrictive severity provided by a lock?
A. Exclusive
B. Write
C. Access
D. Read

Question 40
What type of table is used to stage data during FastLoad operations?
A. Permanent
B. NoPI
C. Global Temporary
D. Volatile
10 Answer Guide

10.1 Answers to Questions

Question 1
Answer: C
Reasoning: Database Windows is the functional module used to control operations within a Teradata Database.

Question 2
Answer: A
Reasoning: Transient journals are used to roll back failed transactions which are aborted by the user or the system.

Question 3
Answer: B
Reasoning: Query bands allow IDs to be tagged to sessions and transactions as defined by the user or middle-tier application. Active System Management is a collection of products used to automate workload management, capacity planning, performance tuning, and performance monitoring: Teradata Viewpoint defines rules for filtering, throttling, and defining classes of workloads; Resource Usage Monitor collects relevant data; Open APIs provide an SQL interface to PMPC; and the Teradata Workload Analyzer is used to analyze DBQL data.

Question 4
Answer: D
Reasoning: The designer perspective transforms scope and requirements into a product specification. The result of the effort is a system model.
Question 5
Answer: A
Reasoning: BCNF is a stricter form of 3NF which eliminates nonkey attributes that do not describe the primary key.

Question 6
Answer: D
Reasoning: A row hash value is 32 bits in size. The value contains either a 16-bit hash bucket number with a 16-bit remainder or a 20-bit hash bucket number with a 12-bit remainder.

Question 7
Answer: C
Reasoning: A Teradata Database can support up to 32 secondary, hash, and join indexes.

Question 8
Answer: B
Reasoning: Semantic constraints cannot be defined with BLOB, CLOB, UDT, Period, or Geospatial data types.

Question 9
Answer: A
Reasoning: A literal NULL can be used in an INSERT or UPDATE operation, but not in a MERGE operation.
Question 10
Answer: C
Reasoning: The most common statements of the Data Control Language are GRANT/REVOKE, GRANT LOGON/REVOKE LOGON, and GIVE.

Question 11
Answer: D
Reasoning: SQL nulls can be used for a number of reasons to identify missing information. The common reasons for a null are that the value is unknown, not applicable, nonexistent, not valid, not supplied, or not defined, or that it is an empty set.

Question 12
Answer: A
Reasoning: Existential quantifier is the term that logically identifies “for some,” “for any,” and “there exists.”

Question 13
Answer: C
Reasoning: Integrity constraints for databases fall into two categories: semantic and physical. Semantic constraints are further divided into column-, table-, and database-level constraints.

Question 14
Answer: D
Reasoning: Because NoPI tables have no primary index, a primary index value is not used to hash their rows to an AMP. NoPI tables are nontemporal tables with a table type of MULTISET; they cannot be specified as permanent journal tables, nor can they be updated using an SQL UPDATE request.
Question 15
Answer: A
Reasoning: Semantic disintegrity is a result of improper normalization or a misunderstanding of normalization.

Question 16
Answer: B
Reasoning: An intersection of two relations covers the attributes that are found in both relations, and only those within the intersection.

Question 17
Answer: D
Reasoning: The Teradata database is a shared nothing database architecture. Within this architecture, memory and disk storage are not shared between the PE and AMP vprocs across CPUs, allowing each AMP to have exclusive control over its own virtual space.

Question 18
Answer: C
Reasoning: A write lock can be placed on a row hash using an UPDATE, DELETE, or INSERT statement.

Question 19
Answer: B
Reasoning: Indexes and constraints are information gathered on tables, not databases. The general information stored about a database includes the database name, creation timestamp, creator name, owner name, account name, number of fallback tables, space allocation, collation type, role and profile names, modifier, and revision numbers.
Question 20
Answer: D
Reasoning: Tables have two dimensions: tuples and attributes. Tuples are represented by rows, which define the cardinality of the relation. Attributes are represented by columns, which define the degree of the relation.

Question 21
Answer: A
Reasoning: Locks are used to control access to a resource. Different types of lock objects and severities provide different levels of access.

Question 22
Answer: B
Reasoning: Users who access the database through a middle-tier application using a trusted session are commonly called proxy users.

Question 23
Answer: D
Reasoning: The dimensions used in the Zachman enterprise data model are entities, activities, locations, people, time, and motivation. Policies are not a component of the data model.

Question 24
Answer: B
Reasoning: Fifth normal form describes the decomposition of the database to eliminate any circular dependencies between data.
Question 25
Answer: C
Reasoning: Only partitioned primary indexes can be classified as either single-level or multilevel.

Question 26
Answer: A
Reasoning: The row length field in a subtable row is 2 bytes. The RowID is 8 bytes. The base table row ID is 8 bytes for NPPI tables and 10 bytes for PPI tables.

Question 27
Answer: B
Reasoning: FOREIGN KEY…REFERENCES constraints are defined on a single column and are not implemented as an index. They are not supported on temporal tables.

Question 28
Answer: D
Reasoning: The Referential Integrity Rule declares that no unmatched foreign key values can exist.

Question 29
Answer: C
Reasoning: Data relevance is typically categorized as hot, warm, cool, and icy.
Question 30
Answer: B
Reasoning: The space requirements for the table area are generated by combining the requirements for the data dictionary, CRASHDUMPS user space, user TEMP space, and user spool space.

Question 31
Answer: A
Reasoning: Semicolons are used to separate statements and to terminate requests.

Question 32
Answer: C
Reasoning: A table header is limited to 1 MB in size.

Question 33
Answer: D
Reasoning: When the ALL keyword is used, a checksum is generated using 100% of the words in the disk sector, which requires the most processing to perform.

Question 34
Answer: B
Reasoning: As long as the maximum header length is not exceeded, transfers to a multitable join index definition will continue. All other statements are opposite expressions of the actual rules.
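To make the answer to Question 31 concrete, here is a minimal sketch of a multistatement request as it would be entered in BTEQ, where a leading semicolon continues the request onto the next line; the table name is an illustrative assumption:

    INSERT INTO Audit_Log VALUES (1, CURRENT_TIMESTAMP)
    ; INSERT INTO Audit_Log VALUES (2, CURRENT_TIMESTAMP)
    ; SELECT * FROM Audit_Log;

All three statements are sent to Teradata Database as a single request and treated as one implicit transaction.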
Question 35
Answer: C
Reasoning: Up to 64 columns can be supported within a primary index definition.

Question 36
Answer: A
Reasoning: The recommendations for selecting primary keys include choosing numeric, system-assigned attributes that are easy to create, maintain, and use. Attributes should remain unique and rarely change. Intelligent keys should never be used.

Question 37
Answer: D
Reasoning: An intelligent key is a simple key which is overloaded with multiple facts.

Question 38
Answer: C
Reasoning: Logical data marts are logical constructs of the physical database.

Question 39
Answer: A
Reasoning: Locking severities consist of access, read, write, and exclusive, in order from the least restrictive to the most. An exclusive lock provides the requester with sole access to the locked resource.

Question 40
Answer: B
Reasoning: NoPI tables are permanent tables which have no primary index. They are used as a staging area to facilitate operations using FastLoad or TPump Array INSERT.