Teradata Best Practices Using a 7[1].1

Published by: api-3820518 on Oct 18, 2008
Copyright:Attribution Non-commercial

Teradata Best Practices with Informatica PowerCenter 7.1.2
 
Informatica Confidential. Do not duplicate. 1 of 23. Revision: 12/2/2011
 
Introduction
This document discusses configuration and how-tos for using PowerCenter 7.1.2 with NCR’s Teradata RDBMS. It covers Teradata basics and describes some “tweaks” that experience has shown may be necessary to deal adequately with some of the “common” practices one may encounter at a Teradata account. The Teradata documentation (especially the MultiLoad, FastLoad and Tpump references) is highly recommended reading, as is the “External Loader” section of the PowerCenter Server Manager Guide.
Additional Information:
All Teradata documentation can be downloaded from the NCR web site (http://www.info.ncr.com/Teradata/eTeradata-BrowseBy.cfm); it is also available on the Informatica Tech Support web site (tsspider.informatica.com/Docs/page1.html). There is a useful Teradata FAQ in the Informatica Tech Support knowledge base (it contains a section on how to handle “timestamp” columns). Finally, there is a “Teradata Forum” that provides a wealth of sometimes useful information (http://www.Teradataforum.com).
Teradata Basics
Teradata is a relational database management system from NCR. It offers high performance for very large database tables because of its highly parallel architecture. It is a major player in the retail space. While Teradata can run on other platforms, it is predominantly found on NCR hardware (which runs NCR’s version of Unix). It is very fast and very scalable.
Teradata Hardware
The NCR computers on which Teradata runs support both MPP (Massively Parallel Processing) and SMP (Symmetric Multi-Processing). Each MPP “node” (or semi-autonomous processing unit) can support SMP.

Teradata can be configured to communicate directly with a mainframe’s I/O channel. This is known as “channel attached”. Alternatively, it can be “network attached”, that is, configured to communicate via TCP/IP over a LAN. Since PowerCenter runs on Unix, most of the time you will be dealing with a “network attached” configuration. However, once in a while, a client will want to use their existing “channel attached” configuration on the grounds of better performance. Do not assume that “channel attached” is always faster than “network attached”. Similar performance has been observed across a channel attachment and a 100MB LAN. In addition, “channel attached” requires an additional sequential data move because the data must be moved from the PowerCenter server to the mainframe before it is moved across the mainframe channel to Teradata.
Teradata Software
In the Teradata world, there are Teradata Director Program Ids (TDPIDs), databases and users. The TDPID is simply the name that one uses to connect from a Teradata client to a Teradata server (think of an Oracle “tnsnames.ora” entry). Teradata also treats databases and users as nearly synonymous. A user has a userid, a password and space to store tables. A database is basically a user without a login and password (or a user is a database with a userid and password).

Teradata AMPs are Access Module Processors. Think of AMPs as Teradata’s parallel database engines. Although they are strictly software (“virtual processors” in NCR terminology), Teradata folks often seem to use AMP and hardware “node” interchangeably because in the “old days” an AMP was a piece of hardware.
Client Configuration Basics for Teradata
The client side configuration is wholly contained in the “hosts” file (/etc/hosts on Unix or winnt\system32\drivers\etc\hosts on Windows). Since INFA does not run on NCR Unix, one should never have to deal with the server side. Teradata uses a naming convention in the “hosts” file. The name of the Teradata instance (i.e. the tdpid, or Teradata Director Program Id) is indicated by the letters and numbers that precede the string “cop1” in a hosts file entry. For example:
127.0.0.1 localhost demo1099cop1
192.168.80.113 curly pcop1
This tells Teradata that when a client tool references the instance “demo1099”, it should direct requests to “localhost” (IP address 127.0.0.1); when a client tool references the instance “p”, it is located on the server “curly” (IP address 192.168.80.113). There is no tie here to any kind of database server specific information (this is not similar to Oracle’s instance id. Tdpid <> Oracle instance id!!!). That is, the tdpid is used strictly to define the name a client uses to connect to a server. You can really call a server whatever you want. Teradata does not care. It simply takes the name you specify, looks in the “hosts” file to map the <name>cop1 (or cop2, etc.) to an IP address, and then attempts to establish a connection with Teradata at that IP address.

Sometimes you’ll see multiple entries in a hosts file with similar tdpids:
127.0.0.1 localhost demo1099cop1
192.168.80.113 curly_1 pcop1
192.168.80.114 curly_2 pcop2
192.168.80.115 curly_3 pcop3
192.168.80.116 curly_4 pcop4
This setup allows load balancing of clients among multiple Teradata nodes. Most Teradata systems have many nodes, and each node has its own IP address. Without the multiple hosts file entries, every client will connect to one node, and eventually this node will be doing more than its “fair share” of client processing. With multiple hosts file entries, if the node specified with the “cop1” suffix (i.e. curly_1) takes too long to respond to the client’s request to connect to “p”, the client will automatically attempt to connect to the node with the “cop2” suffix (i.e. curly_2), and so forth.
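
The naming convention above can be checked mechanically. The following Python sketch is illustrative only and is not part of PowerCenter or the Teradata client: the function names parse_cop_entries and connection_order are hypothetical. It reads hosts file text, picks out aliases of the form <tdpid>cop<n>, and lists the IP addresses in ascending cop order, which is the order of attempts described above. The actual fallback logic lives inside the Teradata client software.

```python
import re
from collections import defaultdict

# An alias such as "pcop2" splits into tdpid "p" and cop number 2.
COP_ALIAS = re.compile(r"^(?P<tdpid>[A-Za-z0-9_]+?)cop(?P<n>\d+)$", re.IGNORECASE)


def parse_cop_entries(hosts_text):
    """Return {tdpid: [(cop_number, ip_address), ...]}, sorted by cop number."""
    entries = defaultdict(list)
    for raw_line in hosts_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # ignore comments and blank lines
        if not line:
            continue
        ip_address, *names = line.split()
        for name in names:
            match = COP_ALIAS.match(name)
            if match:
                tdpid = match.group("tdpid").lower()
                entries[tdpid].append((int(match.group("n")), ip_address))
    return {tdpid: sorted(cops) for tdpid, cops in entries.items()}


def connection_order(entries, tdpid):
    """List the IP addresses for a tdpid in ascending cop order (cop1 first)."""
    return [ip for _, ip in entries.get(tdpid.lower(), [])]


if __name__ == "__main__":
    sample_hosts = """\
127.0.0.1       localhost  demo1099cop1
192.168.80.113  curly_1    pcop1
192.168.80.114  curly_2    pcop2
192.168.80.115  curly_3    pcop3
192.168.80.116  curly_4    pcop4
"""
    parsed = parse_cop_entries(sample_hosts)
    for tdpid in parsed:
        print(tdpid, connection_order(parsed, tdpid))
```

A quick check like this can help spot a missing or misnumbered cop entry before troubleshooting a connection from PowerCenter.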
Informatica / Teradata touch points
Informatica PowerCenter 7.1.2 accesses Teradata through several Teradata tools. Each is defined below, along with how it is configured within PowerCenter.
ODBC:
Teradata provides 32-bit ODBC drivers for Windows and Unix platforms. If possible, use the ODBC driver from Teradata’s TTU7 release (or above) of their client software because this version supports “array reads”. Tests have shown these “new” drivers (3.02) can be 20%-30% faster than the “old” drivers (3.01). The latest release, Teradata’s TTU8.0, uses ODBC v3.0421. Teradata’s ODBC is on a performance par with Teradata’s SQL CLI. In fact, ODBC is Teradata’s recommended SQL interface for their partners.

Do not use ODBC to write to Teradata unless you’re writing very small data sets (and even then, you should probably use Tpump, defined later, instead) because Teradata’s ODBC is optimized for query access and not for writing data. ODBC is good for sourcing and lookups. PowerCenter Designer uses Teradata’s ODBC to import source and target tables.
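
To verify an ODBC data source outside PowerCenter, a small read-only script can help. The sketch below is an assumption-laden illustration, not something this document or PowerCenter prescribes: it uses the third-party pyodbc package, a hypothetical DSN name (td_demo), and placeholder credentials. In keeping with the guidance above, it only reads rows, fetching them in batches.

```python
def build_connection_string(dsn, user, password):
    """Build a basic ODBC connection string from standard DSN/UID/PWD keywords.

    Values containing ';' or braces would need extra quoting; this sketch
    assumes simple values.
    """
    return f"DSN={dsn};UID={user};PWD={password}"


def fetch_in_batches(connection_string, query, batch_size=1000):
    """Yield rows from a read-only query, fetching batch_size rows at a time."""
    import pyodbc  # third-party package; assumed to be installed

    connection = pyodbc.connect(connection_string)
    try:
        cursor = connection.cursor()
        cursor.execute(query)
        while True:
            rows = cursor.fetchmany(batch_size)
            if not rows:
                break
            yield from rows
    finally:
        connection.close()


if __name__ == "__main__":
    # "td_demo", the credentials, and the table name are hypothetical placeholders.
    conn_str = build_connection_string("td_demo", "my_user", "my_password")
    for row in fetch_in_batches(conn_str, "SELECT * FROM my_database.my_table"):
        print(row)
```

If this script can read from the DSN, the same DSN should be usable for sourcing and lookups; loads belong with the Teradata load utilities rather than ODBC.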