Teradata Interview Questions 3

Teradata Interview Questions
TERADATA VERSION DIFFERENCES V2R5 New Features and Eliminated Features 1. Added Identity column features to implement Primary key and Unique key concepts. 2. Statistics can be collected on a. Multiple Non Indexed columns of a table. b. Using sampling data from a table. 3.Partition Primary Indexed Added and types described. 3. Materialized View concept provided through a. Partialy covered join Index. b. Sparse Join Index. 4. Stored procedure features like a. Implementing Cursors using declare cursor statements. b. Altering a Procedure Using Nested computed statements. c. FETCH to get the rows one by one sequentially. d. Get help information on Stored Procedure and Attributes etc. 5. Multi Column Compression for Table Columns. 6. Setting the default DATEFORM form Options. 7. Implementing Multiple Distinct Aggregations in a Select Query etc. V2R5.1 Features 1. ANSI Triggers Introduced. 2.Row Triggers, statement triggers are supported. 3. It allows Recursion and Aggregation inside Triggers. 4. Stored Procedure Enhancement achieved Implemented different recompilation methods of a stored procedure. 5. Recognised requirements for running C compilers in the Teradata UNIX Environment. 6. Provided more Contextual Information for Internal Error 5547. 7. Large Objects supported. a. we can load and unload large objects even in BTEQ(BLOB,CLOB) By Using LargeDatamode.
8. User Defined Functions Added. Note: 1. we cannot write Print Statement inside Procedure. 2. Stored Procedures Created in V2R5.1 require Recompilation. V2R6 FEATURES 1. Queue Tables are added. 2. Triggered Stored Procedure added. 3. Stored Procedure Grammar Merge added. 4. Multi statements request submitted (supported in stored procedures by using BEGIN REQUEST -----END RESQUEST. 5. External Stored Procedures (xSP added). 6. BLOB s are supported by stored procedure and External stored procedure. V2R6.1 FEATURES 1. Stored Procedure OUT Parameter Changes. 2. PPI on Volatile and Global temporary tables. 3. User Defined types Added (UTDs) 4. PPI on Join Index added. V2R6.2 FEATURES 1. Alter Table column Compression added. 2.'CREATE TABLE AS WITH STATISTICS ' ADDED. 3. Auto generated key Retrieval added. TD 12 FEATURES 1. Stored Procedure Enhancements a. Retrieval of one or more results set from a stored procedure. 2. Error Tables Added with external syntax. 3. TERADATA 12 Provides complex error handling capabilities in conjunction with a MERGE INTO (or) INSERT SELECT statements accomplished
through the Usage of new SQL based error tables. 4. Usage of ANSI Merge Operation at the time of loading added. 5. Compare and contrast the advantage of Merge instead of MULTILOAD. 6. TD 12 MERGE plan is different from V2R5 MERGE plan. 7. Multilevel Partitioning provided. 8. Added many changes to the User defined functions and Table functions. 9. Supports SQL Invocation Using External Stored Procedures. 10. Teradata Extended Support for Windows Aggregate Function. example STDDEV_SAMP[standard deviation sample] CORR ---Corelation COVAR_SAMP (Corelation Variable sample ) VAR_SAMP etc. 11. It supports DSW of 20 bits also. 12. BLOB and CLOB Extended till 10 GB. Teradata Interview Questions 1,2,3 (Typical) Entire set of Standard Teradata Questions.This Complete Set will repeat for all Interview sessions.
1.Definition of Teradata? Tell me important components of Teradata? Definition: It is an RDBMS that drives company's Datawarehouse,Datamart,OLTP,OLAP,DSS Appliances. a.An "Open system which executes on UNIX MR-RAS(Discontinued in teradata 13),WIN2K, Suse Linux ETC & Above OS. b.Compatible with ANSI standards c.Runs on Single or Multi nodes. d.Built in Parallelism e.Acts Like a Server. New Featues (Teradata 13): Designed with the below appliances. a.Extreme Data warehouse 1550. d.Active Enterprise Datawarehouse 5550 H,5555 C/H c.Internet and E-business d.Customer Relationship Management. e.Datamart Appliance 2500,2550,2555
Important Components: a.PE(Parsing Engine ) b.Bynet c.Amp (Access Module Processor) d.VSS(Virtual Storage System) (from Teradata13) e.Vdisk(Virtual Disk) 2.How to run the Teradata jobs in unix environment ? At unix prompt,Please execute in the below way. $Sh>BTEQ<[Script Path]>[Logfile Path] or $Sh>BTEQ<[Script Path] TEE [Logfile Path] Not only these ways ,there are many ways. Similarly to other utilities.
3.There is a load to the Table every one hour and 24/7.Morning trafic is high, afternoon trafiic is less, Night trafiic is high.According to this situation which Utility you use and how do you load,which utility used? Tpump is suggestable here By Using packet size increasing or decreasing we can handle traffic.
4.Fast Load Script is failed and error tables are available then how do you restart? There are 2 ways 1.Incase of Old file to Run Dont drop error tables simply rectify the error in the script or file and Run again.so that it runs from last configuration. 2.Incase of new file to Run Drop Error tables Try to run the script with only begin and end loading statements. ,so that it releases the lock on target table If possible remove the record from fastlog table. Run the script with new file freshly.
5.What are the advantages of other ETL tools(Informatica,DataStage,SSIS etc...) than Teradata Utilities or vice versa? TD Utilities run faster than other ETL tools incase of File to Table(Import) or Table to File (Export) Operations. Inmod and Outmod also help us to do better programing while Importing and Exporting. Major advantages of Other ETL tools than TD Utilities: 1.Heterogeneous Sources and Destinations we can Operate. (Oracle,Sqlserver,Excel,Flat Files etc....)
2.As they are providing full GUI support,degugging is so easier. 3.Reusability of components(Informatica:mapplets,worklets etc...) available in ETL tools so if we change in main location automatically other applications(mappings) which are using these components can update instantly. 4.Pivoting(Normalization),Unpivot(Denormalization) we can implement very easily in ETL tools. 5.Caching is one more advantage when we work with a source (Heterogenous) which is not changing frequently.Some times cache can be shared across applications etc....
6.Aborted in Phase 2 data acquisition completed in fastload? Simply take Begin and End loading in scripts and Run again.so that it runs from Amp to table. 7.How do you Generate sequence at the time of Display? By Using CSUM 8.How do you Generate Sequence in Teradata? By Using Identity Column 1-for storing purpose using identity. 2-for display purpose using csum. 9.How do you load Multiple files to a table by using fast load scripts? Remove End Loading statement in the script and Replace the file one by one in the script till last file and submit every time so that data appended in Amp Level.For the last file specify End Loading statement in the script and Run.so that it runs from Amp to table. 10.Why Multiload and Fastload does not supports SI,JI,RI and TRIGGERS? Above all concepts require communication between Multiple Amps. As per Fastload and Multiload doesnot provide any communication b/w multiple Amps and they Operate Independently.As concept is violating as well as it takes time to implement the above operation they are not allowing them. 11.Why Multiload does not supports USI and supports NUSI? Index subtable row is on the same Amp as the data row in NUSI.so it operates each Amp independently and in parallel. 12.While Executing MLOAD Client System Restarted? We need to Manualy Submit the script,so that it loads data from last checkpoint. 13.While Executing MLOAD Teradata Server Restarted? Along with the server MLAOD script will also restarted and Runs from last check point. 14.There is file it contains 100 records,need to load 60 records by skipping first 20 records and last 20 records.
use BTEQ Utility to do this task by Skip = 20 and Repeat 60 in script. 15.How to see current teradata version? .SHOW VERSION 16.what is Node? A Node is nothing but Collection of Hardware and Software components.Typically a Server is called as node. 17.What is PDE? Parallel Data Extension A software interface layer on top of the operating system that enables the database to operate in a parallel environment. It was created by NCR to support the parallel environment. 18.What is Trusted parallel database (TPD)? A database is called TPD if it runs under PDE. Teradata is a database which runs under PDE.so we call Teradata as pure Parallel database or Trusted parallel database. 19.What is channel driver? A Software that communicates between the PEs and applications running on channel-attached clients. 20.What is Teradata Gateway? Terdata Gateway software gives communication between the application and the PEs assigned to network-attached clients. There is one Gateway per node. 21.What is Virtual disk? A collection of Cylinders(physical disks) arranged in an Array fashion is called Vdisk or Virtual disk.Traditionally this is called as disk Array or Array of disk. 22.what is Amp? Access Module ProcessorIt is a Virtual processor responsible for managing one portion of the database(collection of virtual disks).This portion is not sharable by any other AMP.so well call this Architecture as shared nothing Architecture. Amp contains Database Manager subsystem and it performs the below operations a.Performing DDL b.Performing DML c.Implementing Joins,Aggregations. d.Applying and Releasing Locks etc. 23.What is Parsing Engine? PE is type of Vproc it takes SQL request and delivers SQL response.
It has software components to break SQL into steps, and send the steps to the AMPs. Session Control A session is nothing but logical connection between User and Application.Here it controls Authorization if its a Valid it does log on otherwise log off. Parser checks syntactical errors checks semantical errors checks existence of objects. Dispatcher It takes set of request and keep in a queue ,delivers set of responses by keeping the same queue that means it does request response flow control. 24.How many MAX session PE handles at a time? PE handles MAX 120 sessions. 25.What is BYNET? It acts like a "Message Communication" happens between Components.It is responsible for 1. Sending messages 2. Merging data 3.Sorting data 26.what is Clique? It prevents from Node Failure. a.A Clique is a collection of Nodes will Share Common Disk drives. b.whenever any node went down automatically Vprocs all migrate from fail node to other node to retrieve data from common disk drives. 27.List different types of LOCKS in teradata? Teradata can apply four types of LOCKS a.Access Lock b.Read lock c.Write Lock d.Exclusive Lock 28.At what level teradata can apply a LOCK? 1.Database level lock --- All objects inside database will be locked. 2.Table level -- All rows inside a table will be locked. 3,Row hash level lock-- Only Corresponding row will be locked.
29.How many AMPs involved in case of Primary Index? Its always one AMP.
30.what about UPSERT command in teradata? UPSERT means update else insert. In Teradata we have this option. 31.What is Advantage of PPI? Mainly we use for Range based data storing or category based Data storing. Range queries don't have to utilize a Full Table Scan.Its directly goes to the corresponding partition and skips other partitions. Fastload and Multiload work with PPI tables, but not with all Secondary Indexes. 32.what is the size of BYTEINT,SMALLINT,INTEGER? BYTEINT - 1 Bytes - 8 bits -- > -128 to 127 SMALLINT - 2 bytes - 16 bites: ---> -32768 to 32767 INTEGER - 4 BYTES - 32 BITS ---> -2,147,483,648 to 2,147,483,647 33.Difference between user and database in teradata? Database User A database is Passive Repository User is a Active. It stores all database objects It stores any object such as table,macro,view etc. It does not contain Password. It contains password 34..Difference between primary key and Primary Index? Primary Index Primary Key It is mandatory It is optional Limit of 64 columns/table No limit Allows Nulls & Duplicates Doesnt allows
It is physical mechanism Logical mechanism Effects Data Distribution Does not effect Data Distribution 35.What is the use of Spool Space? Teradata spool space is unused Perm space that is used for running queries.Teradata recommend 20 % of the available perm space is allocated for spool space. This is used to hold intermittent results of the queries and volatile tables. 36.What is the need of Performance tuning? To identify bottlenecks and to resolve it we go for Performance tuning. Bottle neck is not an error but it causes system delay in Performance. example: There is a query it is suppose to run in 2 mins but executed for half an hour and finally succeeded. In this situation we need to identify bottlenecks and resolve it. To Identify bottlenecks we go for a.Explain Request Modifier b.Teradata Visual Explain c.Performance monitor d.Teradata Manager 37.Define Explain plan? Explain plan displays the execution plan of SQL statement that is going to be executed by the database.This plan will be specified by the component called optimiser.Generaly it displays below information a.Number of Amps b.Amount of spool memory it is occupying. c.Number of Rows its affecting. d.Type of Join strategy it is taking. e.Time it takes to execute. f.Locks it is Using etc. Syntax : EXPLAIN Example : EXPLAIN SEL * FROM PARTY; 38.What is Collect statistics?Collect stats just derives the data demographics of the table. Collect stats is an important concept in teradata, Collect stats gives PE to come up with a plan with least cost plan for an requested query. Collect stats defines the confidence level of PE in estimating "how many rows it is going to access ? how many unique values does a table have , null values etc and all this info is stored in data dictionary. Once you submit a Explain plan query in TD the parsing engine checks if the stats are available for the requested table .
If collected stats available already PE generates a plan with "high confidence" . If Collect stats unavailable gives "low confidence" . Syntax : COLLECT STATISTICS ON INDEX/COLUMN NAME 39.What is Least Cost Plan? It executes in shortest path with less time. 40.What is Highest Cost Plan? It executes in Longest path with more time. 41.How many Confidence Level present? a.Low b.No c.High d.Join 42. If collect stats is not done on the table what will happen? Teradata uses a cost based optimizer and cost estimates are done based on statistics. So if you donot have statistics collected then optimizer will use a Dynamic AMP Sampling method to get the stats. If your table is big and data was unevenly distributed then dynamic sampling may not get right information and yourperformance will suffer.
43.What are the 5 phases in a MultiLoad Utility? * Preliminary Phase Basic Setup * * DML Phase Get DML steps down on AMPs* * Data Acquisition Phase Send the input data to the AMPs and sort it* * Application Phase Apply the input data to the appropriate Target Tables* * End Phase Basic Cleanup*
44.What are the MultiLoad Utility limitations? MultiLoad is a very powerful utility; it has following limitations: * MultiLoad Utility doesn t support SELECT statement. * Concatenation of multiple input data files is not allowed. * MultiLoad doesn t support Arithmatic Functions i.e. ABS, LOG etc. in Mload Script. * MultiLoad doesn t support Exponentiation and Aggregator Operators i.e. AVG, SUM etc. in Mload Script.
* MultiLoad doesn t support USIs (Unique Secondary Indexes), Refrential Integrity, Join Indexes, Hash Indexes and Triggers. * Import task require use of PI (Primary Index).
45.What are TPUMP Utility Limitations? Following are the limitations of Teradata TPUMP Utility: * Use of SELECT statement is not allowed. * Concatenation of Data Files is not supported. * Exponential & Aggregate Operators are not allowed. * Arithmatic functions are not supported.
46.Explain Teradata Competetive Advantages detail? 1.Automatic,Even Data Distribution In Teradata Even uniform or Parallel or Random distrubution is Automatic. 2.High Scalability If you Increase the number of Nodes or users or work teradata does not sacrifice any performance and it scales Linearly.we are calling this scalabilty as linear scalability. 3.Mature Optimizer As we are having powerful optimiser in teradata it supports 64 joins/query, 64 subquery/query, formating commands and aggregate commands. 4.Models the business Teradata supports any business models for star,snowflake schema,Hybrid schema,Normalisation etc 5.Low Cost Tco(Total cost of Ownership) Its easy to Install,Manage,work. Full Support Gui and Cheaper in price.
47.How do you set the session mode parameters in BTEQ? .set session transaction ANSI-----> this is to set ANSI mode .set session transaction BTET-----> this is to set Teradata transaction mode. These commands have to be entered before logging to the session.
48.How Teradata makes sure that there are no duplicate rows being inserted when its a SET table?
Teradata will redirect the new inserted row as per its PI to the target AMP (on the basis of its row hash value), and if it find same row hash value in that AMP (hash synonyms) then it start comparing the whole row, and find out if duplicate. If it s a duplicate it silently skips it without throwing any error.
49.List types of HASH functions used in teradata? There are HASHROW, HASHBUCKET, HASHAMP and HASHBAKAMP. The SQL hash functions are:
y y y y
HASHROW (column(s)) HASHBUCKET (hashrow) HASHAMP (hashbucket) HASHBAKAMP (hashbucket)
50.what is derived table? a.It stores intermediate results and calculations. b.You can specify derived table in an SQL statement(Preferrably Select). c.This table created and dropped as a part of the query. d.Its stored under spool memory. e.Once the query finishes execution table is not available. f.This types of tables are called as Inline Query tables.
Hi Students In this 2 nd module I covered Important questions from Journals, Indexes,Utilities. Teradata Typical Questions 21.What is Journaling? why teradata requires journaling? Journling is a data protection mechanism in teradata.It prevents from Image failure. Journals are generated to maintain pre-images and post images of a DML transaction starting/ending, at/from a checkpoint. When a DML transaction fails,the table is restored back to the last available checkpoint using the journal Images. There are 3 types of Journals Permanent Transient Recovery Journal
2.How many types of Image supports Journaling? Four types of Images are supported by Journaling they are a. Single Image---->one copy of data will be taken. b.Dual Image----->Two copies of data will be taken. c.Before Image----->Before changes occur on the row data will be taken. d.After Image---->After changes happen on the row taking copy of data. 3.What is Transient Journal? Transient Journal - an area of space in the DBC database which is used primarily for storing of roll-back information during inserts/deletes/updates of tables. Detail Explanation: The Transient Journal maintains a copy of all before images of all rows affected by the transaction. If the event of transaction failure, the before images are reapplied to the affected tables, the images are deleted from the journal and a rollback operation is completed. In the event of transaction success, at the point of transaction commit, the before images for the transaction are discarded from the journal. In Summary, if a Transaction fails (for whatever reason), the before images in the transient journal are used to return the data (in the tables involved in the transaction) to its original state. 4.What is Permanent Journal? The purpose of the Permanent Journal is to provide selective or full database recovery to a specified point in time.It permits recovery from unexpected hardware or software disasters. The Permanent Journal also has the effect of reducing the need for full table backups which can be costly both in time and resource. 5.What are the different image options for Permanent Journal? There are four image options for the Permanent Journal: Before Journal After Journal Dual Before Journal Dual After Journal 6.Permanent Journals are automatically purged in teradata. True / False? False. The Permanent Journal must be manually purged from time to time. 7.Where does TD store transient journal? In perm space -> dbc.transient journal But that special table can grow over dbc's perm limit, until the whole system runs out of perm space. 8.What are the different return codes(severity errors) in Teradata utilities? There are 3 basic return codes (severity errors) in teradata utilities. 4 - Warning 8 - User error 12 - System error
9.How will you connect a database server to other server? We can connect from one server to another server in UNIX using the command ssh or FTP or SU ssh - ssh user_id@server_name 10.What is the meaning of skewness in Teradata? Data or Amp skew occurs in teradata due to uneven distribution of data across all the amps. Often this leads to spool space error too. To avoid skewness try to select a Primary Index which has as many unique values as possible. PI columns like month day etc. will have very few unique values. So during data distribution only a few amps will hold all the data resulting in skew. If a column (or a combination of columns) is chosen a PI which enforces uniqueness on the table then the data distribution will be even and the data will not be skewed. 11.Does Primary Index column choosing is important? The Success of teradata warehouse starts with choosing of correct column to creation of Primary index. Try to Choose a column which has unique values. so that data can be distributed evenly among all AMPs. Otherwise Skewness will come into picture. Primary index is useful to get a straight path to retrieve data. 12.What are the basic rules that define how PI is defined in Teradata? These are the following rules how Primary Index defined in Teradata a. Only one Primary Index per table. b.Its a physical mechanism which assigns Row to AMP. c.A Primary Index value can be unique or non-unique. d.A Primary Index can be composite till 64 columns. e.The Primary Index of a populated table cannot be modified. 13.What are the basic criteria to select Primary Index column for a given table? Answer: A thumb rule of ADV demographics is followed. Access Demographics Identify index candidates that maximize one-AMP operations. Columns most frequently used for access (Value and Join). Distribution Demographics Identify index candidates that optimize parallel processing. Columns that provide good distribution. Volatility Demographics Identify index candidates with low maintenance I/O. 14.can you explain and PE and AMPs communicate? When user connects to teradata database he opened a session with parsing Engine(PE) there after when user submits a query, 1. First PE takes query, checks syntax, verifies his access rigthts 2. If every thing looks okay, PE prepare an action plan for AMP.
Which AMP to respond , what is row ID to read ...etc 3. Then PE sends action plan to AMP via BYNET signals. 4. Then corresponding AMP takes action plan and reads data and send back to PE Then PE sends data to USER 15.Does Permanent Journals and Secondary indexes require Perm Space? Ans: Yes
16.which objects require Perm space in teradata? Tables and Stored Procedures objects require Perm Space Views, Macros , Triggers doesn't require Perm space.
17.What is LOG TABLE? A log table maintains record of all checkpoints related to the load job, it is essential/mandatory to specify a log table in a job. This table will be useful in case you have a job abort or restart due to any reason.
18.what is the use of Partition? If you create PPI on table, then data at AMPs is ordered using Partition column. Example If we have Partition on deptno then all dept 10 records at one place at AMP and dept 20 records at one place. 19.Is it possible to alter NO RANGE and Unknown partition on a table? Yes if table is empty,we can alter NO RANGE and Unknown Partition of a table. 20.can you apply a read lock on table where write lock is already applied? No
21.How many partition in given SQ L? Assume SQ L, Partition by CASE_N ( partycode = 10, partycode = 20,
partycode = 30, partycode = 40 , NO CASE OR UNKNOWN) )
Answer -- 5 partitions 22.How many partions in given SQL?Assume SQL, Partition by CASE_N ( partycode = 10, partycode = 20, partycode = 30, partycode = 40 , NO CASE, UNKNOWN) ) Answer --- 6 partitions Here it creates 6 Partitions If a value partycode entered is not NULL and first 4 cases are not satisfied then data goes to "NO CASE" partition. NULL values goes to UNKNOWN partition.
23.How teradata makes sure that there are no duplicate rows being inserted when its a SET table? Teradata will redirect the new inserted row as per its PI to the target AMP (on the basis of its row hash value), and if it find same row hash value in that AMP (hash synonyms) then it start comparing the whole row, and find out if duplicate. If its a duplicate it silently skips it without throwing any error.
24. Explain What are Low and High Confidentiality in Explain command? The explain generally displays the plan which would be prepared by Optimizer. Confidence levels indicate how well the optimizer is aware about the demographic data for a particular step. High confidence: Means the optimizer know about the no of rows that would be returned as a result of that step. Examples would be when PI statistics exist when the column or range stats exist or no join
involved. Low confidence: Some stats available. Join and stats available on both sides of the join. No confidence: No stats available. Join involved. 25.Why Fastload Utility does not support multiset table and duplication? Multi set table supports duplicate rows. Fastload utility does not support duplicate rows. Restart logic is one of the reason. Fastload is one of the utility which loads data in blocks(64K). During restart Fastload sends some of the rows for the second time which occur after a checkpoint. Due to this Fastload rejects duplicates. example consider 20 rows to be loaded in a table. FL utility is used to load the table and ckpt is 5 rows. If restart occurs after 7th row FL may send 6 and 7th to AMPs during the restart. This records would be considered as duplicates and rejected.
26.Can you Recover the password of a User in Teradata? No , you cant recover the password of a User in Teradata. Passwords are stored in Data Dictionary table (DBC.DBASE).Using a one way encryption method. You can view the encrypted passwords using the following query. SEL * FROM DBC.DBASE; 27.what is the differnce between Sub Query and Corelated Sub Query? Sub Query If Queries written in a nested manner then its termed as a sub query. Inner query executed First and executed Only one time. Corelated Sub Query Co-Related Sub query get executed once for each row of the parent query. Inner Query executed many based on Outer query.
28.what is FILLER command in Teradata?
while running Fastload or Multiload if you dont want to load a particular field from the datafile to the target table then use the FILLER command to achieve.
29.Difference between Access Logging and Query Logging in Teradata? 1.Access Logging is concerned with security ( who is doing what) In access logging you ask the database to log who is doing what on a given object. The information stored is based on the object not the SQL fired or the user who fired it. 2.Query Logging (DBQL) is used for debugging (what is happening around) Incase of DBQL database keep on tracking various parameters like SQLs, Resource, Spool Usage and other things which help you to understand whats going on, the information is fruitful to debug a problem. 30.What is basic Teradata Query language? 1.It allows us to write SQL statements along with BTEQ commands. we can use BTEQ for Importing,Exporting and Reporting Purposes. 2.The commands start with a (.) dot can be terminated by using (;) it is not mandatory to use(;) 3.BTEQ will assume any thing written with out dot as a SQL statement and requeries a (;) to terminate it. 31.How can you track Login parameters of users in teradata? You can view all these parameters in this Data Dictionary Table DBC.LOGONOFF //SELECT LOGDATE,LOGTIME,USERNAME,EVENT FROM DBC.LOGONOFF;//
32.How can you use HASH FUNCTIONS to View Data Distribution across all AMPs in Teradata? Hash Functions can be used to view the data distribution of rows for a chosen Primary Index. SELECT HASHAMP(HASHBUCKET(HASHROW())) AS "AMP#",COUNT(*) FROM GROUP BY 1 ORDER BY 2 DESC; HASHROW --- returns the row hash value for a given value. HASHBUCKET --- the grouping of a specific Hash value. HASHAMP-----the AMP that is associated with the Hash Bucket.
This is realy good, by looking into the result set of above written query you can easily find out the Data distribution across all AMPs in your system and further you can easily identify un-even data distribution.
33.How do you transfer large Amount of Data in Teradata? Transfering of large Amount of data can be done using various Applications like Teradata Utilities. BTEQ,FastLoad,MultiLoad,Tpump and FastExport. BTEQ (Basic Teradata Query) supports all 4 DML s : SELECT, INSERT,UPDATE and DELETE. BTEQ also support IMPORT/EXPORT protocols. Fastload ,Multiload and Tpump transfer the data from Host to Teradata. FastExport is used to export data from Teradata to the Host.
34.How can you determine I/O and CPU usage at a user level in Teradata? You can find out I/O and CPU Usage from this data Dictionary Table DBC.AMPUSAGE; SELECT ACCOUNTNAME,USERNAME,SUM(CPUTIME) AS CPU,SUM (DISKIO) AS DISKIO FROM DBC.AMPUSAGE GROUP BY 1,2 ORDER BY 3 DESC;
35.what is Normalization? Normalization is the process of reducing a complex data structure into a simple, stable one. Generally this process involves removing redundant attributes, keys, and relationships from the conceptual data model. 36. How many types of Indexes are present in teradata? There are 5 different indexes present in Teradata 1. Primary Index a.Unique primary index
b. Non Unique primary index 2. Secondary Index a. Unique Secondary index b. non Unique Secondary index 3. Partitioned Primary Index a. Case partition b. Range partition 4. Join index a. Single table join index b. multiple table join index c. Sparse Join index ( constraint applied on join index in where clause) 5. Hash index 37.what teradata supports 6A's means? Active Load Active Access Active Events Active Workload Management Active Enterprise Integration Active Availability. 38.Which is Faster MultiLoad Delete or Normal Delete command? MultiLoad delete is faster then normal Delete command, since the deletion happens in data blocks of 64Kbytes, where as delete command deletes data row by row. Transient journal maintains entries only for Delete command since Teradata utilities doesn t support Transient journal loading. 39.what tools can be used for Active Load in teradata?
ETL Tools can use queue tables and triggers, and use FastLoad, MultiLoad and TPump utilities .
40.How to Skip or Get first and Last Record from Flat File through MultiLoad?
In .IMPORT command in Mload we have a option to give record no. from which processing should begin. i.e. FROM m m is a logical record number, as an integer, of the record in the identified data source where processing is to begin. You can mention m as 2 and processing will start from second record. THRU k and FOR n are two options in the same Mload command, functions same towards the end of the processing. Adding to the above, if from n"start record" and for n "stop record" are not mentioned, mload considers records from start till the end of the file. 42.what is the use of TEMP space?
TEMPORARY (TEMP) space : A database may or may not have TEMP space, however, it is required if Global Temporary Tables are used.
43. What is the default join strategy in Teradata??? Answer #1 The Teradata Cost based optimizer will decide the join strategy based on the optimum path. The common strategies followed are from Merge, Hash & Nested Join Answer #2 Three strategies followed by optimizer are: 1. Duplication of rows of one table to every amp --> This one is opted by optimizer when the non-PI column is on a small table. 2. Redistribution of the non PI rows of the table to the amp containing the matching PI row amp. --> This one is opted when the tables are large. The non PI column is in one table is redistributed to the amp containing the matching PI. 3. Redistribute both the rows of the table by hash values. --> This is chosen when the join is on a column that is not the PI in either table. This will require the most spool space.
44. what is common practice in creating views and tables in teradata?
It is a common practice in Teradata to have some databases with PERM space that contain only tables. Then, other databases contain only views. These view databases require no PERM space and are the only databases that users have privileges to access. The views in these databases control all access to the real tables in other databases. They insulate the actual tables from user access . 45.PERM space is a pre allocated space? True/ False?
False. PERM space is not pre allocated space. its a maximum allowed space to user or a database. 46.How to handle nulls in Teradata? How many columns can be there in a table? How to find average sal with out using avg function? Answer 1. Use zeroifnull, nullifzero in select and NULL in insert directly. 2. 256 columns max per table. 3. Without using "avg" we can find the avg salary by using sum (sal)/count (sal); sel sum(sal)/count(sal) as avgsal from tablename. 47.what is FALLBACK in teradata? FALLBACK is a table protection feature used in case an AMP fails. You can use FALLBACK on all tables. FALLBACK tables use twice as much disk space as NON-FALLBACK rows . 48. How many types of Skew exist? If you utilized unequally TD resources (CPU,AMP,IO,Disk and etc) this is called skew exists. Major are 3 types of skews (CPU skew, AMP/Data skew, IO Skew). -Data skew? When data is not distributed equally on all the AMPs.
-Cpu skew? Who is taking/consuming more CPU called cpu skew. -IO skew? Who perform more IO Operation? Resulting in IO Skew. 49.What are Differences between Teradata and ANSI Session modes in Teradata? TERADATA MODE ------------1. Comparison is not Case sensitive 2. Create table are default to SET tables 3. Each transaction is committed implicitly 4. Supports all Teradata commands 5. It follows BTET (Begin and End Transaction) Mode ANSI MODE ------------1. Comparison is CASE sensitive 2. Create table are default to MULTISET tables 3. Each transaction has to be committed explicitly 4. Does not support all Teradata commands . 5. It does not follow BTET Mode. 50.How to skip 1st record while using Bteq IMPORT? By using SKIP=1 ; , we can skip first record. .import infile=, skip=1;
Teradata Typical Questions 3
Hi All, 1.Which statement about spool space is true? A.) Spool space must be specified during creation of the user B.) Spool space is used to hold the intermediate result set. C.) Spool space is permanent space. D.) Spool space is limited to the size of the table. Answer--- B 2. Which Utility used to copy a table and restore it to another database? A.) BTEQ B.) Archive Recovery C.) Fastload D.) Table rebuild Answer ---B 3.The Given below, which tool should you use to collect and analyze database performance? A.) BTEQ B.) TDQM C.) Teradata SQL Assistant D.) Teradata Manager Answer --- D 4. In Which way is data distributed with a partitioned primary index? A.) based on the secondary unique index B.) based on the Primary key of the table C.) Based on the Primary Index of the table D.) based on the partitioning column . Answer --- c 5. Batch process block level loading are performed against populated tables by which utility? A.) Multiload B.) bulkload
C.) Tpump D.) Fastload Answer-- A 6. . Teradata Index wizard used for to A.) compare performance of multiple workloads B.) recommend secondary indexes for a table C.) recommend primay indexes for a table D.) compare primay and secondary indexes for a table Answer --B 7. Which statement is true if using NUPIs to accomplish data distribution? A.) The increase of hash buckets has resulted in more hash collisions. B.) Duplicate NUPI values will produce the same row hash. C.) Uniform row distribution is accomplished using UPIs. D.) The low-order 16 bits of the row hash are input to the hash map. Ans -- B 8. Which is the component limits queries based on set thresholds A.) TDQM B.) Teradata Manager C.) BTEQ D.) Query Man Answer --- A 9.How many BYNETs in a teradata system? Well, BYNET is both a hard ware and software BYNET. Hard ware BYNET connects every PE and AMP in the sytem . It can be scalable at any time. Every teradata system has two BYNETS . one for backup in case of one failed However both BYNETS can be used at same time by AMP or PES . 10. What is TENACITY? What is its default value?
TENACITY specifies the amount of time in hours, to retry to obtain a loader slot or to establish all requested sessions to logon. The default for Fast Load is no tenacity , meaning that it will not retry at all. If several FastLoad jobs are executed at the same time, we recommend setting the TENACITY to 4, meaning that the system will continue trying to logon for the number of sessions requested for up to four hours. 11. How do you create materialized view in Teradata? There is no such thing as a "materialized view" in Teradata. The equivalent in Teradata would be a join index (or aggregate index) with a corresponding view put on top of it. The command to create one is "CREATE JOIN INDEX...(lots of options)". Join indexes are maintained when the data is maintained. They cannot be accessed directly, but are invoked when the optimizer determines it is beneficial. Even though you cannot invoke a join index directly, you create a view that looks like the join index and in that way, it would be similar to having direct access. However, this approach does not guarantee that the join index will be used. 12.who will read and write data in teradata?
AMP works like a hard worker. AMP read and write data to their own assigned disks. 13. Failure 7547 Target row updated by multiple source rows.
-updating the same target row by multiple source rows -Remove the duplicate data. 14.How do you execute the given SQL statement repeatedly in BTEQ? Select top 1* from database.table1; =n Here "=n is to run the previous sql statement, "n" number of times. 15. Failure 2646 No more spool space
Check select query well tuned. Collecting stats on the table may fix this problem. 16. How do you declare a variable inside a Teradata macro and Teradata stored procedure? No facility to declare variables inside a Teradata macro, however you can pass values into the macro by means of input parameters. These values will then be embedded in the SQL statements that the macro contains. Teradata stored procedure supports IN, OUT and INOUT parameters, as well as the ability to declare variables within the procedure itself. Such variables could be useful as counters when you have looping logic. 17. What does SLEEP function does in Fast load? The SLEEP command specifies the amount minutes to wait before retrying to logon and establish all sessions. Sleep command can be used with all load utilities not only fastload. This situation can occur if all of the loader slots are used or if the number of requested sessions is not available. The default value is 6 minutes. If tenacity was set to 2 hours and sleep 10 minutes, Sleep command will try to logon for every 10 minutes up to 2 hours duration. 18. Given below, What is the maximum level of data protection?
A.) RAID 1 and Fallback B.) RAID 5 and Fallback C.) RAID 5 and No Fallback D.) RAID 1 and No Fallback ANswer - A 19. When utilizing a unique secondary index, how many AMP(s) are most probably accessed? A.) 1 AMP B.) Multi_AMP C.) 2 AMP D.) All-AMP Answer --- c
20. Difference between count(*) and count(1) ? Count is one of very important functions used in any database . But what many dont know that is the result count we get from COUNT function might be different based on how it is used. Let us consider the example of count(*) and Count(1) and understand how there is some difference the usage. Say for a given table with data
Xy -----------21 31 Null 32 Null 33 41 34 Null Null And this query:
select count(*), count(x), count(1) from table; Results: 5, 2, 4 * Count(*) counts all occurrences including nulls. This is a row count for a given table. * Count(x) counts all non null occurrences of values in column x. If x has null values, count(x) will be less than count(*). * Count(1) is similar to count(x) .It counts all null values in a "pseudo column" created by the constant. Since the constant 1 will never be null, this would, be like count(*) i.e row count . 21.What is use of compress in teradata?Explain? Compression is used to Minimize the table size, for example when the table size is increasing anonymously We can use Compression to reduce the size of the table. Conditions: 1.Compression can be declared at the time of table creation 2.We can compress up to 256 column values(not columns) 3.We can't compress variable length fields (vartext,varchar..)
22. What is the difference between start schema and Fsldm? FSLDM --> Financial Services Logical Data Model (Teradata FS-LDM 7.0) its developed by Teradata for the financial sectors (Specifically for Bank). it can be customised based on the user requirement. StarSchema --> Its a relational database schema for representing multidimensional data. The data is stored in a central fact table, with one or more tables holding information on each dimension. 23. Teradata performance tuning and optimization steps?
1. collecting statistics 2. Explain Statements 3. Avoid Product Joins when possible 4. select appropriate primary index to avoid skewness in storage 5. Avoid Redistribution when possible 6. Use sub-selects instead of big "IN" lists 7. Use derived tables 8. Use GROUP BY instead of DISTINCT ( GROUP BY sorts the data locally on the VPROC. DISTINCT sorts the data after it is redistributed) 9. Use Compression on large tables. 24. Explain command displays the execution plan of the query chosen by the Teradata Optimizer a) False b) True Answer: b
25. Success/Failure/Error responses from Teradata are received by a) Request Parcel b) Data parcel c) Response Parcel Answer: c 26. Teradata Database facilities for software fault tolerance on a) Vproc Migration & Fallback tables b) AMP Clusters & Journaling c) Backup/Archive/Restore d) Table Rebuild Utility e) All of the above f) None of the above Answer: e
27.Total Number of nodes suppport by BYNET are a)510 SMP Nodes a) 512 SMP Nodes b) 513 SMP Nodes c) 514 SMP Nodes Answer: b 28.SMP Stands for a) Symmetric Multi-processor b) Symmetric Massive-Processor c) System Multi-processor d) Symmetric Multi-Processor Answer: a 29.Terabytes represent a) 10 power 15 b) 10 power 12 c) 10 power 18 d) 10 power 24 Answer: b 30.Global Temporary Tables are stored in the TD are a) Perm Space b) Spool Space c) Temp Space Answer : c 31.What are the following object required for Teradata Database-Prem space?. a) Tables & Stored Procedure b) Tables & Views c) Macros & Triggers d) Views & Stored Procedures Answer: a. 32. HASHROW funtions is a. Returns the hash bucket number that corresponds to a specified row hash value. b. Returns the hexadecimal row hash value for an expression c. Returns the AMP number of the primary AMP corresponding to the specified hash bucket number. d. Returns the identification number of the fallback AMP corresponding to the specified hash bucket. Answer: b 33.Teradata A Brief History ?
y
1979
Teradata Corp founded in Los Angeles, California.
Development begins on a massively parallel database computer.

y y
1982
BYNET technology is patented.
1984 Teradata sells first DBC/1012 to Wells Fargo Bank of California. Total revenue for year $3 million. 1986 1989 1990 1991 1992 1995 1996 1996 1997 1997 1998 1999 2000 Product of the Year! Teradata and NCR partner on next generation of DBC. First Terabyte system installed and in production. NCR is acquired by AT&T. Teradata revenue $280M. Teradata is merged into NCR. Teradata Version 2 for UNIX operating systems released. NCR only vendor to run multi-user TPC-D Benchmarks. AT&T spins off NCR Corporation with Teradata product. Teradata database becomes industry leader in scalable data warehousing. 100 Gigabyte TPC-D Benchmark leader. 24 Terabyte system in production. 1 TB TPC-D winner 50+ TB system in production! Teradata Division created.
y y y y y y y y y y y y y
VN:F [1.9.8_1114] 34.What are the functions of a Tera data DBA? Following are the different functions which a DBA can perform: 1. User Management Creation and managing Users, Databases, Roles, Profiles and Accounts. 2. Space Allocation Assigning Permanent Space, Spool Space and Temporary Space. 3. Access of Database Objects Granting and Revoking Access Rights on different database objects. 4. Security Control Handling logon and logoff rules for Users. 5. System Maintenance Specification of system defaults, restart etc. 6. System Performance Use of Performance Monitor(PMON), Priority Scheduler and Job Scheduling.
7. Resource Monitoring Database Query Log(DBQL) and Access Logging. 8. Data Archives, Restores and Recovery ARC Utility and Permanent Journals. 35. what is Full table scan (FTS) ? What are the scenarios in which Full Table Scans occurs? A Full Table Scan (FTS) is a query that reads every row of a table. Scenarios Full table scan occurs 1. Where clause in SELECT statement does not use either primary index or secondary index. 2. SQL Statement which uses a partial value (like or not like), in the WHERE statement. 3. SQL Statement which does not contain where clause. 4.SQL statement using range in where clause. Ex. (col1 > 40 or col1 < =10000) 36. How to identify PPI columns? Select databasename , tablename , columnposition ,columnname from dbc.indexes where indextype ='Q' order by 1 ,2,3 ; 37.Can you create a table with no Primary Index in teradata? NO. Its not possible.... if you dont mention external primary index, teradata creates one for you. Teradata uses this Primary index to distribute data among AMPs. 38.In case of NUSI, does AMP has local subtable? Yes. In case of NUSI, AMP is going to have a subtable which has only information for the rows in the same AMP. But in case of USI, Each row in AMP is hashed and decide AMP number to store its information in Secondary index Sub table. 39. How long ROW ID is? 64 Bits... AMP receives 32 bit hash value and row data from PE. AMP then append a 32 bit unique value to row HASH to make 64 bit ROW ID.
ROW ID is used to sort rows in AMP First AMP checks is there any row with same HASH value? If yes, then it append32 bit 1 to (0000 0000 0000 0000 0000 0000 0000 0001) to row hash and make it 64 bit ROW ID... If one more row comes with same hash value then it appends 2( 0000 0000 0000 0000 0000 0000 0000 0010) to Row HASH. 40.Can a user creates any object who assigned no perm space? A user with No permspace can't create any tables in his user space. But He can create Macro, Views and triggers. Because these objects will be stored in Global Space . 41. When a user query fails because of spool space? Query may fails because of spool space in two cases When there is no empty perm space. i.e complete system is full. User query requires more spool space than user spool space limit . 42.Does AMP will do row by row search once it has Hash value to retrieve a row? AMP receives a hash value from PE for which data is required. AMP applies a Binary Search to find that row.... Its never do linear search. 43.A Table has UPI on EMP no. I wrote a query in which i used deptno in emp table? How many amps gets this data? Even though table has Primary Index , my SQL query written on dept number on which there is not index.... So its a Full table scan. 44.While Multiload is running client system restarted? We need to manually resubmit the script so that it runs from Last configuration check point. 45.While Executing Multiload script teradata server restarted? Along with server multiload script restarted and Runs from last checkpoint. 46.Can we concatenate Multiple files with same structure and load in multiple tables? NO ,only in fastload its possible.
47.How do you load data from a file to multiple tables MLOAD? In this apply class we use where conditions while loading data into multiple tables. Apply Ins_table where Recind = T1 Apply Ins_table where Recind = T2 48.If Multiload is fail in Acquistion or Application phase how do you restart? a.Incase of old file to Run Dont drop ET,UV,WT and log tables Dont Release locks on target tables Rectify errors in the File or script and Run again. b.Incase of new file to Run Drop ET,UV,WT and log table. Release locks on target tables by using Release MLOAD command. RELEASE MLOAD Table1, ...Table 5 RELEASE MLOAD Table1, ...Table 5 IN APPLY 49.Terdata Data Dictonary Directory explain? The DD/D ( Data Dictionary Directory)
y y y y y y y
is an integrated set of system tables contains definitions of and information about all objects in the system is entirely maintained by the RDBMS is data about the data or metadata is distributed across all AMPs like all tables may be queried by administrators or support staff is accessed via Teradata supplied views
Examples of DD/D views: DBC.Tables > information about all tables DBC.Users > information about all users
DBC.AllRights > information about access rights DBC.AllSpace > information about space utilization. 50.Can I use drop statement in Fload utility? YES, But you have to declare it out of the FLOAD Block it means it should not come between .begin loading,.end loading FLOAD also supports DELETE,CREATE,DROP statements which we have to declare out of FLOAD block. In the FLOAD Block we can give only INSERT
1.Fload, Mload and error tables
2. What is the difference between Multiload & Fastload interms of Performance? 3. what is explain in teradata? 4. How many codd's rules are satisfied by teradata database? 5. Teradata performance tuning and optimization 6. What is the difference between Global temporary tables and Volatile temporary tables? 7. How teradata makes sure that there are no duplicate rows being inserted when its a SET table? 8. What are the enhanced features in Teradata V2R5 and V2R6? 9. What is cliques? What is Vdisk and how it will communicate with physical data storage at the time of data retrieval through AMP ? 10. In which phase of the Active Data Warehouse evolution do you use data to determine what will happen? 11. What are two examples of an OLTP environment? 12. After creating tables dynamically in the Teardata, where is the GRANT table option usually done ? When tables are newly created, what is the default role and what the default privileges which get assigned ? 13. what is basic teradata query language? 14. How many types errors will be occur in total SPOOL process and How can we connect a database server from to other server(i.e., i have to connect another server from a present server)
15. How does indexing improve query performance? 16. Does SDLC changes when you use Teradata instead of Oracle? 17. How can we build data marts from FSLDM. is there any standard approach for this? Is FSLDM supports place holder for aggregated data? 18. Which two can be implemented as a centrally located logical architecture? 19. Can we have an unconnected lkp to lookup a DB2 record against a Teradata record? doesnt seem to work. I could be wrong. 20. Which two statements are true about a foreign key? 21. What is the difference between start schma and Fsldm? 22. What is a common data source for the central enterprise data warehouse?

Teradata Interview Questions 3

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Teradata Interview Questions 3

Uploaded by

Copyright:

Available Formats

Teradata Interview Questions

HASHROW (column(s)) HASHBUCKET (hashrow) HASHAMP (hashbucket) HASHBAKAMP (hashbucket)

partycode = 30, partycode = 40 , NO CASE OR UNKNOWN) )

28.what is FILLER command in Teradata?

44. what is common practice in creating views and tables in teradata?

Teradata Typical Questions 3

Xy -----------21 31 Null 32 Null 33 41 34 Null Null And this query:

Teradata Corp founded in Los Angeles, California.

Development begins on a massively parallel database computer.

BYNET technology is patented.

1.Fload, Mload and error tables

You might also like