Chapter 6: FastExport
"An invasion of armies can be resisted, but not an idea whose time has come."
- Victor Hugo
Why it is Called "FAST" Export
FastExport is known for its lightning speed when it comes to exporting vast amounts of data from Teradata and transferring the data into flat files on either a mainframe or network-attached computer. In addition, FastExport has the ability to use OUTMOD routines, which give the user the capability to write, select, validate, and preprocess the exported data. Part of this speed is achieved because FastExport takes full advantage of Teradata's parallelism.
In this book, we have already discovered how BTEQ can be utilized to export data from Teradata in a variety of formats. As the demand to store data increases, so does the requirement for tools that can export massive amounts of data.
This is the reason why FastExport (FEXP) is brilliant by design. A good rule of thumb is that if you have more than half a million rows of data to export, either to a flat file format or with NULL indicators, then FastExport is the best choice to accomplish this task.
Keep in mind that FastExport is designed as a one-way utility: its sole purpose is to move data out of Teradata. It does this by harnessing the parallelism that Teradata provides.
FastExport is extremely attractive for exporting data because it takes full advantage of multiple sessions, which leverage Teradata's parallelism. By default, FastExport uses 4 sessions, which run on the local computer. FastExport can also export from multiple tables during a single operation. In addition, FastExport utilizes the Support Environment, which provides the capability to restart a job from a checkpoint if an error occurs while an export job is executing.
How FastExport Works
When FastExport is invoked, the utility logs onto the Teradata database, retrieves the rows that are specified in the SELECT statement, and puts them into SPOOL. From there, it must build blocks to send back to the client. In comparison, BTEQ starts sending rows immediately for storage into a file.
If the output data is sorted, FastExport may be required to redistribute the selected data two times across the AMP processors in order to build the blocks in the correct sequence. Remember, a lot of rows fit into a 64K block, and both the rows and the blocks must be sequenced. While all of this redistribution is occurring, BTEQ continues to send rows, so FastExport falls behind in the processing. However, when FastExport starts sending the rows back a block at a time, it quickly overtakes and passes BTEQ's row-at-a-time processing.
The other advantage is that if BTEQ terminates abnormally, all of your rows (which are in SPOOL) are discarded. You must rerun the BTEQ script from the beginning. However, if FastExport terminates abnormally, all the selected rows are in worktables, and it can continue sending them from where it left off in a very smart and very fast manner!
Also, if there is a requirement to manipulate the data before storing it on the computer's hard drive, an OUTMOD routine can be written to modify the result set after it is sent back to the client on either the mainframe or LAN. Just like the BASF commercial states, "We don't make the products you buy, we make the products you buy better". FastExport is designed on the same premise: it does not make the SQL SELECT statement faster, but it does take the SQL SELECT statement and process the request with lightning-fast parallel processing!
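The block-at-a-time idea and the restart behavior described above can be illustrated with a small toy model in Python. This is only a sketch under stated assumptions, not how Teradata or FastExport is actually implemented: the function names `pack_into_blocks` and `send_blocks`, the fixed 100-byte row size, and the simulated failure are all hypothetical and chosen purely for illustration. The only figure taken from the text is the 64K block size.

```python
from typing import Iterable, List, Optional, Tuple

BLOCK_SIZE = 64 * 1024  # the 64K block size mentioned in the text


def pack_into_blocks(rows: Iterable[bytes],
                     block_size: int = BLOCK_SIZE) -> List[List[bytes]]:
    """Group rows, in order, into blocks of at most block_size bytes (hypothetical helper)."""
    blocks: List[List[bytes]] = []
    current: List[bytes] = []
    used = 0
    for row in rows:
        # Start a new block when the next row would overflow the current one.
        if current and used + len(row) > block_size:
            blocks.append(current)
            current, used = [], 0
        current.append(row)
        used += len(row)
    if current:
        blocks.append(current)
    return blocks


def send_blocks(blocks: List[List[bytes]],
                start_at: int = 0,
                fail_at: Optional[int] = None) -> Tuple[List[bytes], int]:
    """Simulate sending blocks to the client (hypothetical helper).

    Returns the rows delivered in this run and a checkpoint: the index of the
    next block still to be sent. fail_at simulates an abnormal termination
    just before that block is sent.
    """
    delivered: List[bytes] = []
    for index in range(start_at, len(blocks)):
        if fail_at is not None and index == fail_at:
            return delivered, index  # stop here; this index is the checkpoint
        delivered.extend(blocks[index])
    return delivered, len(blocks)


# Assumed workload: 10,000 rows of 100 bytes each.
rows = [b"x" * 100 for _ in range(10_000)]
blocks = pack_into_blocks(rows)

# Row-at-a-time needs one transfer per row; block-at-a-time needs one per block.
print("row-at-a-time transfers:  ", len(rows))
print("block-at-a-time transfers:", len(blocks))

# Simulate a failure before block 5, then resume from the checkpoint
# instead of starting over from the first row.
first_run, checkpoint = send_blocks(blocks, fail_at=5)
second_run, finished_at = send_blocks(blocks, start_at=checkpoint)
print("resumed at block", checkpoint, "and finished at block", finished_at)
```

In this toy model the number of transfers drops from one per row to one per block, and a restart only has to resend the blocks after the checkpoint. The real utility does far more (sorting, redistribution across AMPs, multiple sessions), none of which is modeled here.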
Reprinted for email@example.com, IBM
Coffing Data Warehousing, Coffing Publishing (c) 2005, Copying Prohibited