Tera-Tom on Teradata Utilities V12-V13
by Tom Coffing Coffing Data Warehousing. (c) 2011. Copying Prohibited.   


All rights reserved. Reproduction and/or distribution in whole or in part in electronic, paper or other forms without written permission is prohibited.

Chapter 1: Introduction

Overview

“It’s not the data load that breaks us down, it’s the way you carry it.” – Tom Coffing

Teradata has been doing data transfers to and from the largest data warehouses in the world for close to two decades. While other databases have allowed the loads to break them down, Teradata has continued to set the standards and break new barriers. The brilliance behind the Teradata load utilities is in their power and flexibility. With six great utilities Teradata allows you to pick the utility for the task at hand. This book is dedicated to explaining these utilities in a complete and easy manner. This book has had contributions from over 10 Certified Teradata Masters with experience at over 125 Teradata sites worldwide. Let our experience be your guide.

We are going to make these difficult Teradata Utilities easy to use and easy to understand. Let me first say that if you are going to write Teradata Utilities they are very difficult and cumbersome in the beginning. Not anymore! We at Coffing Data Warehousing have built SmartScript inside our world famous Nexus Query Chameleon. All you do is point and click and Nexus builds them brilliantly for you. This book will introduce SmartScript, but also show you how to build them by hand, but why would you in most cases? You can download the Nexus Query Chameleon from our website for a free trial so you can use the Nexus to build your scripts until you know what you are doing.

Nexus Query Chameleon - Build Teradata Load Scripts in Seconds

You know how difficult and cumbersome it is to build Teradata Load Scripts such as BTEQ, FastLoad, MultiLoad, TPump, and FastExport. In the beginning of my career it took me an average of about 1 hour to build the scripts, but I now always use SmartScript to build them and average about 1 minute. Use the Nexus to build your Teradata scripts. Just right click on a table and choose SmartScript and then select the script you want to build. In seconds you can create the script. Most of the defaults your script will need will automatically be placed in the script, but you can still change them if you want. The only thing we needed to select in the FastLoad script below was the Source File. Then you hit Build Script and check it out or make any minor changes you want (you shouldn’t need to make any changes). Then hit Execute or use the Nexus Scheduler and run the script when you want!

The Teradata Utilities

“Anything that won’t sell, I don’t want to invent. Its sale is proof of utility, and utility is success.” – Thomas A. Edison

Did you know that Thomas Edison only slept four hours per night? That makes sense with that stupid light always in his eyes! Here is an introduction to the six Teradata Utilities starting with the first three of BTEQ, FastLoad and MultiLoad. Hopefully the light will go on and you will start with a bright and clear fundamental understanding of which utility to use and when.

BTEQ was the first Teradata query tool and first utility because it was built as a report writer, but it also imports and exports data 1 row at a time. BTEQ is not a block utility because it works a row at a time.

FastLoad is used to load empty Teradata tables in 64K blocks. This is a mover and shaker and always feels the need for speed. The only command that FastLoad needs to know is INSERT because it INSERTS into empty Teradata tables by loading 64K blocks of rows (could be hundreds to thousands of rows with a single block load). Sorry though because your table can’t have Secondary Indexes, Join Indexes, Triggers, or Referential Integrity when loading. The great news is that since the table must start empty you won’t need any of these. You can however use FastLoad to load the table and then add your Secondary Indexes, Join Indexes, Triggers, or Referential Integrity.

MultiLoad is used to populate populated tables. MultiLoad is like FastLoad in that it also loads in 64K blocks so it is also considered a block utility. Where FastLoad only understands the word INSERT because it only INSERTS into empty tables, MultiLoad understands the words INSERT, UPDATE, DELETE and UPSERT. The idea is to use FastLoad to load to an empty table the first time and then use MultiLoad each time you want to add to the table.

The Teradata Utilities (Continued)

“Nothing can have value without being an object of utility.” – Karl Marx

I had no idea that Karl Marx was a Teradata Certified Marxist! He must have really known the utilities because this class struggle will be in determining which utility to use!

FastExport is another block utility that works in 64K blocks, just like FastLoad and MultiLoad, but FastExport only exports Teradata data off of Teradata. The only word that FastExport understands is SELECT. You SELECT the data from the table and then FastExport exports it off Teradata in 64K blocks.

TPump is one of the most exciting utilities. I like to think of TPump as MultiLoad, but loading only a row at a time. It works a row at a time so it is slower than FastLoad or MultiLoad, but you can have Secondary Indexes, Join Indexes, Referential Integrity and Triggers on your table while you load it. Why would you use something slower like TPump when you can rapidly load using MultiLoad? Because users can continue to query a table while TPump quietly INSERTS, UPDATES, UPSERTS, or DELETES rows in the background. Think of MultiLoad as a noisy train coming down the tracks disrupting everything in its path and TPump as a quiet truck loading to its destination.

Teradata Parallel Transport (TPT) is Teradata’s newest utility and is designed to use all of the aforementioned utilities in one scripting language. It also improves on the other utilities by taking advantage of Teradata’s Parallel Processing. This soon may be the only utility you need to use, but that story hasn’t been scripted quite yet.
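The rules of thumb for the four load and export utilities described above can be condensed into a small decision helper. This is only a sketch of the text's guidance, written in Python for illustration: the function name and its parameters are hypothetical, BTEQ and TPT are left out because the text gives no selection rule for them, and a real choice would also weigh system load and data volume.

```python
def suggest_utility(direction, table_is_empty=False, keep_table_queryable=False):
    """Suggest a Teradata utility using the rules of thumb from the text.

    direction            -- "import" or "export" (hypothetical labels)
    table_is_empty       -- True when the target table has no rows yet
    keep_table_queryable -- True when users must keep querying the table, or
                            when indexes, triggers or RI must stay in place
    """
    if direction == "export":
        # FastExport only understands SELECT and only moves data off Teradata.
        return "FastExport"
    if direction != "import":
        raise ValueError("direction must be 'import' or 'export'")
    if keep_table_queryable:
        # TPump works a row at a time, so it is slower but less disruptive.
        return "TPump"
    if table_is_empty:
        # FastLoad only INSERTs into empty tables, in 64K blocks.
        return "FastLoad"
    # MultiLoad handles INSERT, UPDATE, DELETE and UPSERT on populated tables.
    return "MultiLoad"


if __name__ == "__main__":
    print(suggest_utility("import", table_is_empty=True))
    print(suggest_utility("import", keep_table_queryable=True))
```

The order of the checks is a design choice: the need to keep the table queryable is tested before emptiness, because a block utility is ruled out whenever indexes, triggers or concurrent queries must be preserved.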

Considerations for using Block at a Time Utilities

“Every block of stone has a statue inside it and it is the task of the sculptor to discover it.” – Michelangelo

I had no idea that Michelangelo was a fan of Teradata block utilities. Back then the ceiling was the limit, but with Block Utilities of today the sky is the limit!

As mentioned above, there are efficiencies associated with using large blocks of data when transferring between computers. So, the logic might indicate that it is always the best approach. However, there is never one best approach. You will learn that efficiency comes at the price of other database capabilities. For instance, when using large blocks to transfer and incorporate data into Teradata the following are not allowed:

- Secondary indices
- Triggers
- Referential integrity
- More than 15 concurrent utilities running at the same time

Therefore, it is important to understand when and where these considerations are present. So, as important as it is to know the language of the utility and database, it is also important to understand when to use the appropriate utility. The capabilities and considerations are covered in conjunction with the commands.

Maximum Amount of Block Utilities has changed!

“A book that is shut is but a block.” – Thomas Fuller

Thomas Fuller must have been a Tera-Tom fan because he really meant to say a Tera-Tom Utility book that is open talks about Block Utilities!

Before Teradata V2R6.0, when Tera-Tom was just a baby in the crib, the DBS Control parameter MaxLoadTasks had a maximum limit of 15. This meant that no more than 15 block utility jobs in total (any combination of FastLoad, MultiLoad and FastExport) could run at the same time. Many companies set this to 5 because these Block Utilities will have major impact on a system and can greatly affect user query performance.

After Teradata V2R6.0 Teradata increased and changed this number. Now there can be up to 30 concurrent FastLoad and MultiLoad jobs. That limit no longer includes FastExport. For FastExport jobs up to 60 can run concurrently. The only caveat here is that 60 FastExports can run simultaneously minus the number of active FastLoad and MultiLoad jobs also running, but remember it is up to each individual company to determine if this is too many because of the performance hit.

This new feature is actually controlled by a new DBS Control parameter named MaxLoadAWT, which controls AMP Worker Tasks (AWT). When MaxLoadAWT is set to zero then it is like going back in time to pre-V2R6.0, where only 15 FastLoad, MultiLoad and FastExport jobs could run simultaneously. When MaxLoadAWT is greater than zero the new feature is active. Let me explain. Each AMP can perform 80 things at once, meaning there are 80 AMP Worker Tasks per AMP. MaxLoadAWT should never exceed 48 or the AMPs would not be able to do much else during the load.
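The concurrency limits described above can be written down as a small admission check. The Python sketch below is only an illustration of the numbers quoted in the text (15, 30 and 60): the function name and arguments are hypothetical, it is not how Teradata itself enforces the limits, and it applies the "60 minus active loads" caveat only when a FastExport is about to start, since that is the only case the text spells out.

```python
LOAD_JOBS = ("FastLoad", "MultiLoad")


def can_start(job_type, active_loads, active_exports, max_load_awt, max_load_tasks=15):
    """Return True if one more block-utility job fits under the limits in the text.

    active_loads   -- FastLoad + MultiLoad jobs currently running
    active_exports -- FastExport jobs currently running
    max_load_awt   -- value of the MaxLoadAWT DBS Control parameter
    max_load_tasks -- value of MaxLoadTasks (at most 15; many sites use 5)
    """
    if job_type not in LOAD_JOBS and job_type != "FastExport":
        raise ValueError("not a block utility: %r" % job_type)

    if max_load_awt == 0:
        # Old behaviour: all three utilities share a single pool.
        return active_loads + active_exports < max_load_tasks

    if job_type == "FastExport":
        # Up to 60 FastExports, minus the active FastLoad and MultiLoad jobs.
        return active_exports < 60 - active_loads

    # FastLoad and MultiLoad share a pool of 30 that excludes FastExport.
    return active_loads < 30


if __name__ == "__main__":
    print(can_start("FastLoad", active_loads=29, active_exports=0, max_load_awt=48))
    print(can_start("FastExport", active_loads=10, active_exports=50, max_load_awt=48))
```

Keeping `max_load_tasks` as a parameter reflects the point made in the text that each company chooses its own ceiling below the maximum.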

Considerations for using Row at a Time Utilities

“Call on God, but row away from the rocks.” – Hunter S. Thompson

I had no idea that Hunter S. Thompson understood the value of the row-level utilities. He made a little typo though because he meant to say “Call on God, but row away from the Blocks!” Block level utilities have speed but so many restrictions.

The opposite of sending a large block of rows at the same time is sending a single row at a time. The primary difference in these approaches is speed. It is always faster to send multiple rows in one operation instead of one row. If it is slower, why would anyone ever use this approach? The reason is that it provides more flexibility with fewer considerations. By this, we mean that the row at a time utilities allow the following:

- Secondary indices
- Triggers
- Referential integrity
- More than 15 concurrent utilities running at the same time

As you can see, they allow all the things that the block utilities do not. With that in mind and for more information, continue reading about the individual utilities and open up a new world of capabilities in working with the Teradata RDBMS. Welcome to the world of the Teradata Utilities.

Fast Path Inserts inside the Teradata Database

“My mother always used to say, ‘There is no path to peace. Peace is the path’.” – Donald Freed

I had no idea that Don’s mom understood the value of the fast path in Teradata. Using this path will definitely bring you peace, which is peace of mind.

The load utilities such as BTEQ, FastLoad, MultiLoad, TPump and TPT are designed to import or export data to and from Teradata, but it is also important that you understand that once you are inside Teradata you can use an INSERT SELECT from one Teradata table to another and get great speed. An INSERT SELECT can also utilize the Fast Path. If the target table you are loading to starts empty then there isn’t a large amount of writing to the Transient Journal. The Transient Journal is designed to roll back bad transactions, but since the table starts empty there is only one write to the Transient Journal and then it is idle. If the Transient Journal needs to roll back the table it just empties it like it started. This is why it is called the Fast Path.

For the Fast Path to be taken both the target and the source table must have the same Primary Index. That way no data has to be moved across the AMPs via the BYNET and Teradata can just copy and insert the blocks directly. We will discuss this later.
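The two Fast Path conditions named above (an empty target table and matching Primary Indexes) amount to a simple predicate. The Python sketch below only restates those two conditions; the function name is hypothetical, the Primary Index is modelled as a plain sequence of column names, and treating column order as significant is an assumption of this sketch rather than something the text states.

```python
def fast_path_eligible(target_row_count, source_primary_index, target_primary_index):
    """Return True when an INSERT SELECT meets both Fast Path conditions in the text.

    target_row_count     -- number of rows already in the target table
    source_primary_index -- Primary Index columns of the source table
    target_primary_index -- Primary Index columns of the target table
    """
    # Condition 1: an empty target means a single Transient Journal write.
    target_is_empty = target_row_count == 0
    # Condition 2: the same Primary Index means no rows cross the BYNET.
    same_primary_index = tuple(source_primary_index) == tuple(target_primary_index)
    return target_is_empty and same_primary_index


if __name__ == "__main__":
    # Statement shape being reasoned about (table names are made up):
    #   INSERT INTO Target_Table SELECT * FROM Source_Table;
    print(fast_path_eligible(0, ["Employee_No"], ["Employee_No"]))
    print(fast_path_eligible(0, ["Employee_No"], ["Dept_No"]))
```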

Fast Path DELETE inside the Teradata Database

“All speech is vain and empty unless it be accompanied by action.” – Demosthenes

I had no idea that Demosthenes understood the value of the fast path for deletes in Teradata during a speech... My Fellow Americans, DELETE that Teradata table’s data! Thank you! This will not be vain and empty. Oops, I mean it will not be in vain, but it will be empty.

That is the idea behind the Fast Path Delete. It deletes not a row at a time or a block at a time, but uses the Teradata Cylinder Index and Master Index to delete the blocks almost instantaneously. The blocks aren’t really physically deleted but logically deleted. There is some different syntax based on if you are using ANSI mode or Teradata mode as you can see on the following page. I have also included multiple statements so you have your choice in both modes.

Freespace Percent and Loading Tables

“Where the press is free and every man able to read, all is safe.” – Thomas Jefferson

I had no idea that Thomas Jefferson understood the value of freedom inside Teradata cylinders. Where the cylinder is free and every user able to INSERT, all is fast.

If a cylinder is filled completely during a load and someone does a simple insert of even a single row then the AMP will complain and Teradata will move data to another cylinder. This is called a cylinder split. It isn’t a big deal and Teradata does this in the background but it does take additional time. That is why Teradata invented Freespace Percent. If nobody on your system ever did an INSERT command then you would not want to use Freespace Percent and you would want to completely fill the cylinder, but if there are individual inserts then you don’t want the cylinder splits.

This is actually set in a DBS Control parameter called FreeSpacePercent so it becomes the default when loading a table, but users can use the ALTER TABLE or CREATE TABLE commands to override the default and set the Freespace Percent they want for that particular table. The following page will show you the utilities that honor and don’t honor the Freespace Percent rule.
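The trade-off behind Freespace Percent is simple arithmetic, sketched here in Python. The model is deliberately simplified and its assumptions are not from the text: every row is the same size, a cylinder's capacity is counted in rows, and the function names are hypothetical. It shows only how many later single-row inserts a cylinder can absorb before it is full and the next insert forces a cylinder split.

```python
def rows_loaded_per_cylinder(cylinder_capacity_rows, freespace_percent):
    """Rows a load places in one cylinder when it leaves freespace_percent empty."""
    if not 0 <= freespace_percent <= 100:
        raise ValueError("freespace_percent must be between 0 and 100")
    return cylinder_capacity_rows * (100 - freespace_percent) // 100


def inserts_before_split(cylinder_capacity_rows, freespace_percent):
    """Single-row inserts a freshly loaded cylinder can take before it is full."""
    loaded = rows_loaded_per_cylinder(cylinder_capacity_rows, freespace_percent)
    return cylinder_capacity_rows - loaded


if __name__ == "__main__":
    # With no free space, the very first insert already needs a cylinder split.
    print(inserts_before_split(1000, 0))
    # With 10 percent reserved, the cylinder absorbs some inserts first.
    print(inserts_before_split(1000, 10))
```

A table that is only ever loaded in bulk is best served by a value of 0, which matches the text's advice to fill the cylinder completely when nobody does individual inserts.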

Referential Integrity and Load Utility Problems

“A single lie destroys a whole reputation of integrity.” – Baltasar Gracian

In Teradata a single lie won’t destroy any Referential Integrity (RI). This is because Teradata provides a Referential Integrity Error Table. If, for example, you populated the Employee_Table, then created the Foreign Key references to place Referential Integrity on it, and there were Referential Integrity errors, a table called Employee_Table_0 would be created to show all the RI errors.

There is a problem if you are using FastLoad or MultiLoad and you expect the utility to take care of any Referential Integrity problems. They won’t! The following page will show you multiple ways to get around this utility situation.

V13 – No Primary Index Tables

“No one is so generous as he who has nothing to give.” – French Proverb

New in Teradata V13 the DBA has the ability to CREATE tables without a Primary Index! They are called NoPI tables, which stands for No Primary Index tables. These tables are designed to merely spread the rows randomly and evenly.

A NoPI table is designed for ETL staging tables so data can be quickly transferred from flat files taken from operational systems such as Oracle or DB2. This might be data that needs to be massaged or transformed. Although you can query or JOIN a NoPI table with a traditional table containing a Primary Index, NoPI tables are really meant to quickly import data inside Teradata temporarily so it can be transformed inside Teradata and then loaded inside the data warehouse tables. Then once the transformation has been completed the DBA can write an INSERT/SELECT command and quickly load the data inside the staging table into a Teradata table that has a Primary Index.

NoPI CREATE Statement

“The Constitution only gives people the right to pursue happiness. You have to catch it yourself.” – Ben Franklin

On the following page you can see the NoPI CREATE statement. The key words to focus on in the following page are NO PRIMARY INDEX, highlighted for your convenience. This is done when you create the table. This can be done with normal SQL as seen on the following page or it can be done with a FastLoad or TPump load utility.
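As a rough illustration of where the NO PRIMARY INDEX key words sit in the statement, the Python sketch below assembles a CREATE TABLE string for a staging table. It is not the statement printed in the book: the helper name, the table name and the columns are all made up for the example, and the helper does no validation or quoting of identifiers.

```python
def build_nopi_create(table_name, columns):
    """Build a CREATE TABLE statement that ends with NO PRIMARY INDEX.

    table_name -- name of the staging table (hypothetical in the example below)
    columns    -- list of (column_name, sql_type) pairs
    """
    if not columns:
        raise ValueError("at least one column is required")
    column_lines = ",\n".join(
        "    %s %s" % (name, sql_type) for name, sql_type in columns
    )
    # NO PRIMARY INDEX takes the place of the usual PRIMARY INDEX clause.
    return "CREATE TABLE %s\n(\n%s\n)\nNO PRIMARY INDEX;" % (table_name, column_lines)


if __name__ == "__main__":
    ddl = build_nopi_create(
        "Stage_Employee",  # made-up staging table
        [("Employee_No", "INTEGER"), ("Last_Name", "CHAR(20)")],
    )
    print(ddl)
```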

NoPI Row-ID Increments the Uniqueness Value

“It’s not the size of the dog in the fight, but the size of the fight in the dog.” – Archie Griffin

Each AMP will receive an equal amount of rows in an attempt by the Parsing Engine to spread the data evenly. On a given AMP, the Row Hash for every row in the NoPI table is the same. Only the Uniqueness Value is incremented. Notice the picture on the following page. NoPI tables don’t need to be sorted and that is another main advantage if you desire to CREATE a staging table.

NoPI Row-Hash Different on each AMP

“When all you have is a hammer, you tend to see every problem as a nail.” – Abraham Maslow

The example on the next page allows you to realize that the Row Hash on each AMP is different, but once the Row Hash is established on each AMP, all rows contain that exact same Row Hash and each AMP only increments the Uniqueness Value.
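The Row-ID behaviour described above (one fixed Row Hash per AMP, with only the Uniqueness Value counting up) can be mimicked with a few lines of Python. This is a toy model under stated assumptions, not Teradata's algorithm: the per-AMP Row Hash values are made-up numbers, rows are dealt out round-robin as a stand-in for "randomly and evenly", and starting the Uniqueness Value at 1 is an assumption of the sketch.

```python
def assign_nopi_row_ids(num_rows, amp_row_hashes):
    """Return a list of (amp, row_hash, uniqueness_value) for each loaded row.

    num_rows       -- number of rows to place
    amp_row_hashes -- one made-up Row Hash per AMP; it stays fixed for that AMP
    """
    if not amp_row_hashes:
        raise ValueError("at least one AMP is required")
    next_uniqueness = [1] * len(amp_row_hashes)  # per-AMP counter (assumed start)
    placed = []
    for row_number in range(num_rows):
        # Round-robin placement: an assumption standing in for even spreading.
        amp = row_number % len(amp_row_hashes)
        placed.append((amp, amp_row_hashes[amp], next_uniqueness[amp]))
        # Only the Uniqueness Value changes from row to row on an AMP.
        next_uniqueness[amp] += 1
    return placed


if __name__ == "__main__":
    for amp, row_hash, uniqueness in assign_nopi_row_ids(6, [111, 222, 333]):
        print(amp, row_hash, uniqueness)
```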

NoPI Options and Facts

“Failure accepts no alibis. Success requires no explanation.” – Robert Rose

The example on the next page describes the options and facts about NoPI Tables.

NoPI Restrictions

“He who asks a question may be a fool for five minutes, but he who never asks a question remains a fool forever.” – Tom Connelly

The example on the next page shows the restrictions of NoPI Tables.

The Nexus Query Chameleon from Coffing Data Warehousing may be the most sophisticated piece of software ever built in the data warehouse industry. Nexus is the point where everything connects, so Times Square is the Nexus of the New York Subway system. That is where Nexus gets its name. The Nexus Query Chameleon is the Nexus for all databases and the Query Chameleon allows Nexus to literally change colors and fit in any enterprise environment.

After evaluating hundreds of tools Microsoft chose the Nexus as their tool of choice for their Parallel Data Warehouse (PDW) customers for 3 years straight, and they still partner with CoffingDW to deliver it internally at Microsoft and to all PDW customers. This is because the Nexus is so pretty, easy-to-use and the only query tool that universally works with all major database vendor platforms including Teradata, Oracle, DB2, Netezza, Greenplum, SQL Server and SQL Server Parallel Data Warehouse. Not only can users query each system simultaneously, but they can also perform advanced analytics, graphing and charting, Pivoting, Cube Building, Database Administration, ETL and thousands of other functions on every database simultaneously!

The end goal of the Nexus Query Chameleon is to be the only enterprise software tool needed to perform all functions on all databases. Imagine having to pay for only one tool and that tool is the best ever seen by the user community, the developers, power users, Database Administrators, load experts and managers.

Download a FREE Trial of the Nexus Query Chameleon at: www.CoffingDW.com
See the Nexus Query Chameleon User Guide at: http://www.coffingdw.com/data/Nexus_Product_Info.pdf

Nexus - A Brilliant Systems Tree

Users can choose their colors for each system and see their system trees as they see fit. Some DBAs need to see all databases and users while users can choose the databases or users they want listed by right clicking on the system and choosing My Databases! You can also right click on any table in the systems tree and choose Quick Select, which was done in the query example below.