M.S.Prasad 165916
Contents
1. Introduction to Teradata
2. Teradata Architecture
3. Data Distribution
4. PI Characteristics
5. Data Access
6. Teradata's Scalability
7. Data Protection Features
Introduction to Teradata
Teradata is a Relational Database Management System (RDBMS):
1. Designed to run the world's largest commercial databases.
2. Preferred solution for enterprise data warehousing (OLAP).
3. Executes on UNIX MP-RAS or NT-based system platforms.
4. Compliant with ANSI industry standards.
5. Runs on single-node (SMP) or multi-node (MPP) systems.
6. Acts as a database server to client applications throughout the enterprise.
7. Uses parallelism to manage terabytes of data.
8. Built on a Shared-Nothing architecture.
Advantage Teradata
1. Unlimited, proven scalability.
2. Unlimited parallelism: parallel sorts/aggregations, temporary tables, Shared-Nothing architecture.
3. Mature optimizer: complex queries, many joins per query, ad-hoc processing; it is a cost-based optimizer.
4. Model the business: 3NF, robust view processing, star schemas.
5. Lowest TCO: ease of setup and maintenance, robust parallel utilities, no re-orgs, lowest disk-to-data ratio, robust expansion utility.
6. High availability: no single point of failure, scalable data loading, parallel load utilities.
Note: If the table demographics are well defined, the optimizer will choose the best plan for query execution.
7. Enormous capacity: billions of rows, terabytes of data.
8. High-performance parallel processing.
9. Single database server for multiple clients: a single version of the truth.
10. Network and mainframe connectivity.
11. Industry-standard access language (SQL).
12. Manageable growth via modularity.
13. Fault tolerance at all levels of hardware and software.
14. Data integrity and reliability.
Users know that if their data doubles, the system can expand easily to accommodate it. The workload for creating a table of 100,000 rows is the same as for 1,000,000,000 rows.
[Diagram: operational data sources such as ATM, MVS, and POS systems feed the Teradata data warehouse; end users reach it through access tools such as Cognos and Business Objects.]
Architecture
[Diagram: a channel-attached client application uses CLI, the TDP, and the channel driver to reach the Teradata node; a network-attached client application uses CLI over the LAN and the Teradata Gateway. Inside the node, the TPA and PDE run on the operating system (UNIX or NT), hosting the Parsing Engines and the AMPs with their VDisks.]
Architecture In Detail
Node
1. The node is the basic building block of a Teradata system; it is where the processing for the database occurs.
2. A node is a general-purpose processing unit under the control of a single operating system.
3. A Teradata system contains one or more nodes:
   - Single node: Symmetric Multi-Processing (SMP)
   - Multiple nodes: Massively Parallel Processing (MPP)
Node components:
1. Parsing Engine
2. BYNET
3. AMP
Understanding Node
[Diagram: within a node, the Parsing Engine communicates over the BYNET with four AMPs, each managing its own Vdisk.]
Node Components
Parsing Engine:
1. Managing individual sessions (up to 120)
2. Parsing and optimizing your SQL requests
3. Dispatching the optimized plan to the AMPs
4. ASCII/EBCDIC conversion (if necessary)
5. Sending the answer set response back to the requesting client

AMP:
1. Storing and retrieving rows to and from the disks
2. Lock management
3. Sorting rows and aggregating columns
4. Join processing
5. Output conversion and formatting
6. Creating answer sets for clients
7. Disk space management and accounting
8. Special utility protocols
9. Recovery processing

Vdisk: a vdisk (pronounced "VEE-disk") is the logical disk space that is managed by an AMP.
Other Components

Channel Driver: software that provides communication between the PEs and applications running on channel-attached (mainframe) clients.

Gateway: the Teradata Gateway software provides communication between the PEs and applications running on LAN-attached clients or on a node in the system.

PDE: the Parallel Database Extensions software layer runs on the operating system of each node; it was created by NCR to support the parallel environment.

TPA: a Trusted Parallel Application uses PDE to implement virtual processors (vprocs). The Teradata RDBMS is classified as a TPA.

TDP: the Teradata Director Program is used by the mainframe host to communicate with the Teradata system. It manages all traffic between the Call Level Interface (CLI) and Teradata; its functions include session initiation and termination, logging, verification, recovery, and restart.

MTDP (Micro Teradata Director Program): performs many of the TDP functions, including session management, but not session balancing.

MOSI (Micro Operating System Interface): provides an operating system and network protocol independent interface.
Data Distribution
Hashing Algorithm
1. The Parsing Engine uses the hashing algorithm to distribute data across the AMPs; distribution depends on the hash value of the Primary Index (PI).
2. The algorithm acts like a mathematical "blender": it takes up to 16 columns of mixed data as input and generates a single 32-bit binary value called a Row Hash.
3. Input to the algorithm is the Primary Index (PI) value of a row.
4. Row Hash uniqueness depends directly on PI uniqueness.

Row Hash
1. A 32-bit binary value.
2. The first 16 bits index into the hash map to select the destination AMP.
3. The remaining 16 bits help distinguish rows within the AMP.

[Diagram: hash map buckets directing row hashes to AMPs 0-9.]
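The bucket-to-AMP mapping above can be sketched in Python. This is an illustration only: Teradata's actual hashing algorithm is proprietary, so a generic 32-bit hash (CRC32) stands in for it, and the hash map here is a simple hypothetical round-robin assignment.

```python
import zlib

NUM_AMPS = 10

# Hypothetical hash map: 65,536 hash buckets assigned to AMPs
# round-robin (the real map is maintained by the system).
HASH_MAP = {bucket: bucket % NUM_AMPS for bucket in range(65536)}

def row_hash(pi_value):
    # Stand-in for the proprietary hashing algorithm: any 32-bit hash
    # illustrates the idea (CRC32 is NOT what Teradata really uses).
    return zlib.crc32(str(pi_value).encode()) & 0xFFFFFFFF

def target_amp(pi_value):
    h = row_hash(pi_value)
    bucket = h >> 16           # first 16 bits select the hash bucket
    return HASH_MAP[bucket]    # the hash map names the destination AMP

# A nearly unique PI spreads 10,000 rows roughly evenly over the AMPs.
counts = [0] * NUM_AMPS
for emp_id in range(10_000):
    counts[target_amp("emp%d" % emp_id)] += 1
print(counts)
```

Because the AMP is computed from the PI value alone, any PE can locate a row without consulting the other AMPs, which is what makes single-AMP access possible.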
Data Distribution
[Diagram: rows arrive from the host in random sequence, are converted from ASCII/EBCDIC as needed, hashed on the PI, and stored evenly across AMPs 1-4.]
PI Characteristics
Primary Indexes (UPI and NUPI)
1. A Primary Index may be different from a Primary Key.
2. Every table has exactly one Primary Index.
3. A Primary Index may contain null(s).
4. Single-value access uses ONE AMP and, typically, one I/O.

Unique Primary Index (UPI)
1. Involves at most a single base table row.
2. No spool file is ever required.
3. The system automatically enforces uniqueness on the index value.

Non-Unique Primary Index (NUPI)
1. May involve multiple base table rows.
2. A spool file is created when needed.
3. Duplicate values go to the same AMP and the same data block.
4. Only one I/O is needed if all the rows fit in a single data block.
5. A duplicate-row check for a SET table is required if there is no USI on the table.
PI Considerations
ACCESS - Maximize one-AMP operations: choose the column most frequently used for access; consider both join and value access.
DISTRIBUTION - Optimize parallel processing: choose a column that provides good distribution.
VOLATILITY - Reduce maintenance resource overhead (I/O): choose a column with stable data values.
The column chosen as PI should be at least nearly unique to achieve good distribution of data: the better the distribution, the higher the parallelism.
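The DISTRIBUTION consideration can be demonstrated with a small simulation: a nearly unique PI spreads rows evenly, while a two-valued NUPI (here a hypothetical gender column) sends every row to at most two AMPs. As before, CRC32 stands in for the real, proprietary hashing algorithm.

```python
import zlib
from collections import Counter

NUM_AMPS = 8

def amp_for(pi_value):
    # Illustrative 32-bit hash (not the real algorithm); the top 16
    # bits are reduced modulo the AMP count in place of a hash map.
    h = zlib.crc32(pi_value.encode()) & 0xFFFFFFFF
    return (h >> 16) % NUM_AMPS

# Hypothetical employee rows: (nearly unique id, two-valued gender).
rows = [("emp%05d" % i, "F" if i % 2 else "M") for i in range(8000)]

# PI on the id: even spread, so all 8 AMPs work in parallel.
by_id = Counter(amp_for(emp_id) for emp_id, _ in rows)

# PI on gender: every duplicate hashes to the same AMP, so at most
# two AMPs receive all 8,000 rows (severe skew, little parallelism).
by_gender = Counter(amp_for(gender) for _, gender in rows)

print(sorted(by_id.values()))
print(sorted(by_gender.values()))
```

The skewed case leaves six of the eight AMPs idle for any all-rows operation on this table, which is exactly the loss of parallelism the guideline warns about.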
AMP Operations
There are two types of AMP operations:
1. Single-AMP operation (typical UPI access)
2. Multi-AMP operation

Single-AMP Operation
Application to PE

SQL Request:
SELECT LETTER FROM SAMPLE WHERE NUMBER = 19;

1. APPL 1 establishes a user session on PE 1.
2. APPL 1 sends the SQL request to the PE on the forward channel.
3. PE 1 acknowledges the message on the back channel.
4. PE 1 parses and optimizes the request.

[Diagram: APPL 1 and APPL 2 connect to PE 1 and PE 2; the rows of the SAMPLE table are distributed across the AMPs.]
PE to AMP
1. PE 1 produces a one-step plan as a message to the BYNET.
2. BYNET uses the hash map to determine the destination, AMP 3.
3. BYNET sends the message to AMP 3 on the forward channel.
4. AMP 3 acknowledges the message across the back channel.
AMP to PE
1. AMP 3 sends the answer set to PE 1 on the forward channel.
2. PE 1 acknowledges receipt across the back channel.
PE to Application
Single-AMP Query
1. PE 1 forwards response parcels to APPL 1 on the forward channel.
2. APPL 1 acknowledges the messages on the back channel.
3. APPL 1 processes the response and generates output.
Application to PE
All-AMP Query with Sort
SQL Request
SELECT NUMBER, LETTER FROM SAMPLE WHERE NUMBER > 9 ORDER BY LETTER;
1. APPL 1 establishes a user session on PE 1. 2. APPL 1 sends the SQL request to the PE on the forward channel. 3. PE 1 acknowledges the message on the back channel. 4. PE 1 parses and optimizes the request.
PE to AMPs
1. PE 1 produces a three-step plan.
2. PE 1 gives the first step to the BYNET to send to all AMPs.
3. BYNET sends the step over the forward channel to all AMPs.
4. All AMPs acknowledge receipt over the back channel.
1. PE 1 sends out step 2 over the BYNET.
2. BYNET sends the step to all AMPs.
3. Each AMP sends its first block of sorted data to the BYNET merge process.
AMP to Merge
Plan
1. GET NUMBER, LETTER WHERE NUMBER > 9 2. SORT ON LETTER 3. MERGE ON LETTER
1. The merge process continues to request sorted blocks from the AMPs until every AMP has exhausted its spool.
2. When the merge process has received an EOF from each AMP, the answer set is complete.
Note: Spool is temporary space used by the AMPs to store intermediate results.
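The sort-and-merge steps above amount to a k-way merge over per-AMP sorted spools. A minimal sketch, with assumed spool contents drawn from the SAMPLE rows of the example query:

```python
import heapq

# Assumed per-AMP spools: each AMP has already sorted its qualifying
# (NUMBER, LETTER) rows by LETTER (step 2 of the plan).
amp_spools = [
    sorted([(13, "G"), (20, "B"), (19, "N")], key=lambda r: r[1]),
    sorted([(12, "F"), (15, "J"), (23, "X")], key=lambda r: r[1]),
    sorted([(16, "K"), (11, "D"), (22, "C")], key=lambda r: r[1]),
]

# Step 3: merge on LETTER -- repeatedly take the lowest head row among
# the spools until every spool has reached EOF.
answer_set = list(heapq.merge(*amp_spools, key=lambda r: r[1]))
print(answer_set)   # rows in LETTER order: B, C, D, F, G, J, K, N, X
```

Because each spool is already sorted, the merge only ever compares the current head of each stream, which is why the AMPs can ship their spools a block at a time.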
PE to Application
Scalability

[Diagram: nodes, AMPs, and their disk space can each be added as the system grows.]
Components may be added as requirements grow, without loss of performance:
- Double the number of AMPs while the number of users stays the same, and performance doubles.
- Double the number of AMPs and double the number of users, and performance stays the same.
Data Protection
Teradata provides the following data protection features:

Protection Method     Type
Locks                 Software
Fallback              Software
RAID Protection       Hardware
Cliques               Hardware
Transient Journal     Software
Permanent Journal     Software
Archive and Restore   Software
Locks
There are four types of locks:
- Exclusive: prevents any other type of concurrent access.
- Write: prevents other Read, Write, and Exclusive locks.
- Read: prevents Write and Exclusive locks.
- Access: prevents Exclusive locks only.

Locks may be applied at three database levels:
- Database: applies to all tables/views in the database.
- Table/View: applies to all rows in the table/view.
- Row Hash: applies to all rows with the same row hash.

Lock types are automatically applied based on the SQL command:
- SELECT applies a Read lock.
- UPDATE applies a Write lock.
- CREATE TABLE applies an Exclusive lock.
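The "prevents" rules above can be captured as a small compatibility table: a requested lock is granted only if no lock already held blocks it. A simplified sketch:

```python
# For each held lock type, the set of lock requests it blocks,
# taken directly from the four rules above.
BLOCKS = {
    "Exclusive": {"Access", "Read", "Write", "Exclusive"},
    "Write":     {"Read", "Write", "Exclusive"},
    "Read":      {"Write", "Exclusive"},
    "Access":    {"Exclusive"},
}

def compatible(held, requested):
    # A new lock is granted only if the held lock does not block it.
    return requested not in BLOCKS[held]

print(compatible("Read", "Read"))     # True: concurrent readers are fine
print(compatible("Write", "Read"))    # False: a writer blocks readers
print(compatible("Write", "Access"))  # True: access locks read through writes
```

The last line shows why Access locks are useful in a multi-user environment: they succeed even while a Write lock is held, at the price of possibly reading uncommitted data.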
Access Locks
Advantages of Access locks: 1. Permit quicker access to table in multi-user environment. 2. Have minimal blocking effect on other queries.
Fallback
Fallback is a software mechanism. The fallback row is a copy of a primary row stored on a different AMP. A fallback table is fully available in the event of an unavailable AMP.
[Diagram: primary rows distributed across AMPs 1-4 over the BYNET, with each row's fallback copy stored on a different AMP.]
Benefits of Fallback
1. Permits access to table data during an AMP off-line period.
2. Adds a level of data protection beyond disk array RAID.
3. Automatically restores data changed during the AMP off-line period.
4. Critical for high-availability applications.

Cost of Fallback
1. Twice the disk space is needed for table storage.
2. Twice the I/O is needed for INSERTs, UPDATEs, and DELETEs.
Fallback Cluster
A fallback cluster is a defined number of AMPs treated as a fault-tolerant unit. Fallback rows for AMPs in a cluster reside within the cluster. Loss of one AMP in the cluster permits continued table access; loss of two AMPs in the same cluster causes the RDBMS to halt.
[Diagram: two four-AMP clusters (AMPs 1-4 and AMPs 5-8); each AMP's fallback rows are spread across the other AMPs of its cluster.]
- Lose AMP 3 from its cluster: AMPs 1, 2, and 4 each experience a 33% increase in workload.
- Lose AMP 6 from its cluster: AMPs 5, 7, and 8 each experience a 33% increase in workload.
- Lose AMP 7 as well (a second AMP in the same cluster as AMP 6): the system halts.
System performance can be adversely affected whenever any AMP carries a disproportionate burden.
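The cluster behavior above can be sketched as follows, assuming the same two four-AMP clusters; both the halt rule and the 33% figure fall out of the cluster size:

```python
# Sketch of fallback-cluster behaviour, assuming the two 4-AMP
# clusters from the example above (AMPs 1-4 and AMPs 5-8).
clusters = [{1, 2, 3, 4}, {5, 6, 7, 8}]

def system_state(down_amps):
    # Two or more AMPs down in the same cluster halt the system.
    for cluster in clusters:
        if len(cluster & down_amps) >= 2:
            return "HALT"
    return "RUNNING"

def extra_load(cluster):
    # A down AMP's fallback rows are spread over the remaining AMPs
    # in its cluster, so each one picks up 1/(n-1) extra work.
    return 1 / (len(cluster) - 1)

print(system_state({3}))                     # one AMP down: RUNNING
print(round(extra_load(clusters[0]) * 100))  # 33 (% extra per AMP)
print(system_state({6, 7}))                  # same cluster: HALT
print(system_state({3, 6}))                  # different clusters: RUNNING
```

Larger clusters spread the extra load more thinly (1/(n-1) per surviving AMP) but raise the odds that two failures land in the same cluster, which is the trade-off behind the cluster-size choice.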
NON-FALLBACK TABLES
One AMP down:
- Data is partially available; queries that avoid the down AMP succeed.
Two or more AMPs down:
- In different clusters: data is partially available; queries that avoid the down AMPs succeed.
- In the same cluster: the system halts.
Raid Protection
Two types of disk array protection:

RAID-1 (Mirroring)
1. Each physical disk in the array has an exact copy in the same array.
2. The array controller can read from either disk and writes to both.
3. When one disk of the pair fails, there is no change in performance.
4. Mirroring reduces available disk space by 50%.
5. The array controller reconstructs failed disks quickly.

RAID-5 (Parity)
1. For every 3 blocks of data, there is a parity block on a 4th disk.
2. A parity algorithm is applied to determine the parity block.
3. If a disk fails, any missing block may be reconstructed using the other three disks.
4. Parity reduces available disk space by 25% in a 4-disk rank.
5. Array controller reconstruction of failed disks takes longer than with RAID-1.

[Diagram: data blocks and parity blocks rotated across the four disks of a rank.]

Summary
- RAID-1: good performance with disk failures; higher cost in terms of disk space.
- RAID-5: reduced performance with disk failures; lower cost in terms of disk space.
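The RAID-5 parity algorithm is XOR over the stripe: the parity block is the XOR of the three data blocks, so any one lost block is recoverable from the other three disks. A minimal sketch:

```python
# XOR parity over a 4-disk rank: 3 data blocks + 1 parity block.
def parity(blocks):
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b   # XOR the blocks together byte by byte
    return bytes(out)

stripe = [b"AAAA", b"BBBB", b"CCCC"]   # data blocks on disks 1-3
p = parity(stripe)                     # parity block on disk 4

# Disk 2 fails: its block is the XOR of the surviving blocks + parity.
rebuilt = parity([stripe[0], stripe[2], p])
print(rebuilt == stripe[1])   # True: the lost block is reconstructed
```

Reconstruction works because XOR is its own inverse; it is also why RAID-5 is slower after a failure: every read of the lost disk requires reading the three surviving disks.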
[Diagram: with an AMP in the cluster down, the remaining AMPs record changes to its rows (by Row-ID) in down-AMP recovery journals (RJ) so the AMP can be restored when it returns.]
Cliques
A clique (pronounced "kleek") is a grouping of nodes: two or more TPA nodes that have access to the same disk arrays form a clique.
[Diagram: a clique of four nodes (SMP 1-4) hosting AMPs 1-8, all cabled to shared disk array cabinets.]
AMP vprocs can run on any node within the clique and still have full access to their disk array space. If a node fails, AMPs migrate to another node in the clique.
[Diagram: after SMP 1 fails, its AMPs migrate to the surviving nodes of the clique, which retain full access to the disk arrays.]
Note: Failure of a node within a clique increases the workload of the other nodes in the clique.
Transient Journal
1. Consists of a journal of transaction before-images.
2. Automatically rolls back failed or aborted transactions.
3. Is automatic and transparent.
[Diagram: for a successful transaction, the before-images held in the transient journal are discarded at commit.]
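The before-image mechanism can be sketched as follows. This is purely illustrative: the real journal is maintained internally by each AMP, not by application code.

```python
# Hypothetical table and its transient journal of before-images.
table = {"row1": 100, "row2": 200}
journal = []

def update(key, value):
    journal.append((key, table[key]))   # capture the before-image first
    table[key] = value

def commit():
    journal.clear()      # success: before-images are discarded

def rollback():
    while journal:       # failure: restore before-images, newest first
        key, old = journal.pop()
        table[key] = old

update("row1", 111)
update("row2", 222)
rollback()               # the transaction fails and is rolled back
print(table)             # {'row1': 100, 'row2': 200}
```

Capturing the before-image *before* each change is what makes the journal transparent: whatever the transaction did, replaying the images newest-first returns every row to its pre-transaction state.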
Summary
In today's session we have learnt about:
1. The Teradata architecture and how it achieves parallelism and scalability.
2. The concept of the Shared-Nothing architecture.
3. The way data is distributed using the hashing algorithm.
4. The significance of the PI in row distribution.
5. How data rows are fetched.
6. The various protection features in Teradata.
References
Teradata Basics, official curriculum, published by the NCR Teradata Solutions Group.