
Teradata Past, Present and Future

Todd Walter, CTO, Teradata Labs

2 > 09/2009

Copyright Teradata 2007-2009 All rights

Teradata Company Highlights


Founded 1979, West LA
First product to market: 1984
First terabyte system: 1987
Acquired by AT&T and merged with acquired NCR: 1992
Tri-vested as part of NCR: 1997
Teradata Corporation (re)launched October 1, 2007

Global leader in enterprise data warehousing
> EDW/ADW database technology
> Analytic solutions
> Consulting services

Top 10 U.S. publicly traded software company
> S&P 500 member
> Listed NYSE: TDC
> NYSE Arca Tech 100
> 2007: $1.7B revenue

Positioned in Gartner's Leaders Quadrant in data warehousing since 1999

Global presence and world-class customer list
> 5,500+ associates
> More than 850 customers
> More than 2,000 installations


Continuous (R)evolution

+ Analytic applications
+ Data models and reports
+ Consulting
+ Database
Hardware

Continuous (R)evolution

Sell applications with consulting, SW and HW inside
Sell solving business problems and technology to solve them
Sell the SW with some HW to run on
Sell the HW, give everything else away

Continuous (R)evolution

Xeon Quad Core: 10% R&D, 90% integration
Pentium: 20% R&D, 80% integration
i486: 70% R&D, 30% integration
80286: 90% R&D, 10% integration

[Figure: timeline of company logos and trademarks dated 1901, 1903, 1905, 1906, 1907, 1909, 1920, 1939, 1941, 1950, 1963, 1971, 1985, 1991, 1994, and 1997, including the "An AT&T Company" and "Global Information Solutions" marks]

Scale
Every dimension of the technology must scale to meet today's requirements
> Data, data model complexity, users, performance, queries, data loading

What is a big data warehouse?

Total spinning disk?
> 2.5 Petabytes

Big table?
> 150 billion rows

Number of tables?
> 300,000

Insert/Update per day?
> 5 billion records

Identified users?
> 100,000

Queries per day?
> 5 million

Data turnover rate?
> 1TB per 5 seconds
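
To put the daily figures above in per-second terms, a quick back-of-the-envelope calculation may help. This is only arithmetic on the numbers quoted on the slide, assuming the load is spread evenly over a 24-hour day (real workloads are burstier); the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_second(per_day):
    """Average rate per second, assuming an even spread over 24 hours."""
    return per_day / SECONDS_PER_DAY


queries_per_sec = per_second(5_000_000)      # 5 million queries per day
changes_per_sec = per_second(5_000_000_000)  # 5 billion inserts/updates per day

print(f"{queries_per_sec:.0f} queries/sec, {changes_per_sec:.0f} row changes/sec")
```

Under that even-spread assumption, the system averages roughly 58 queries and about 58,000 row changes every second.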


The Problem
Operational Systems
Accts. Payable Accts. Receivable Invoicing Sales/Orders Finance G/L

Decision Makers
Marketing Supply Chain Finance Risk Management

Customer Support
HR Payroll Purchasing Order Fulfillment Manufacturing Inventory

Maintenance
Sales Operations Inventory Call Center

Proliferation of Data Marts has resulted in fragmented data, higher costs, poor decisions

The EDW Solution


Operational Systems
Accts. Payable Accts. Receivable Invoicing Sales/Orders Finance G/L

Decision Makers
Marketing Supply Chain Finance

Customer Support
HR Payroll Purchasing Order Fulfillment Manufacturing Inventory

Enterprise Data Warehouse (EDW)

Risk Management

Maintenance
Sales Operations Inventory Call Center

Integrated data provides consistency of data, lower costs, better decisions



Active Enterprise Intelligence

An Obvious Trend: More Speed, More Users


Seconds

Days

Strategic Intelligence
> Enterprise Data Warehouse
> BI tools & reports
> Analysis & visualization
> Predictive analytics

Operational Intelligence
> EDW
> Enterprise integration
> Mixed workload management
> SOA, BPMS, IDEs
> Portals/composite applications

Active Enterprise Intelligence enabled by an Active Data Warehouse


Suppliers Customers Call Center Logistics Executive Product/ Services Finance Marketing

OPERATIONAL INTELLIGENCE
Workflow & Applications

STRATEGIC INTELLIGENCE
Business Intelligence Tools and Applications

Active Access
Active Events
Active Enterprise Integration
Active Load
Active Workload Management
Active Availability

Teradata Warehouse

Active Enterprise Intelligence in Retail


Detecting Retail Fraud
Situation
Thieves make copies of cash register receipts, walk into the store, pick up merchandise, and return items for cash.

Problem
Associates in the returns department did not have historical POS receipt retrieval access to verify against previously returned receipts or to do returns without receipts.

Solution
Associates query Teradata to quickly check if a return has already occurred on that receipt number. Also used by analysts to understand and prevent excessive returns.

Impact (for 500-store chain)
> 100% ROI in 5 months
> Stopped a crime ring on the first day of rollout
> Cost savings have been huge
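
The receipt check described in the Solution could look something like the following minimal sketch. It is illustrative only: the slide does not show the real schema or query, so the `returns` table, its columns, and both function names are hypothetical, and an in-memory SQLite database stands in for the Teradata system so the example is self-contained.

```python
import sqlite3


def open_demo_store():
    """Build a tiny in-memory stand-in for the returns history (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE returns (receipt_no TEXT, store_id INTEGER, returned_on TEXT)"
    )
    conn.executemany(
        "INSERT INTO returns VALUES (?, ?, ?)",
        [("R-1001", 12, "2009-08-30")],
    )
    return conn


def already_returned(conn, receipt_no):
    """Return True if any prior return exists for this receipt number."""
    row = conn.execute(
        "SELECT COUNT(*) FROM returns WHERE receipt_no = ?", (receipt_no,)
    ).fetchone()
    return row[0] > 0


conn = open_demo_store()
print(already_returned(conn, "R-1001"))  # receipt with a recorded return
print(already_returned(conn, "R-2002"))  # receipt never returned before
```

The point of the sketch is the shape of the workload: a short, keyed lookup issued at the returns desk, which is the kind of tactical query an active data warehouse has to answer quickly alongside analyst work.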


Active Enterprise Intelligence in Retail

Single View of the Customer Across All Channels


Situation
Needed to add Web channel for selling shoes.

Problem
Too much time and cost to keep multiple customer systems synchronized. Realized they needed just one customer database, not one more for the Web in addition to the Call Center and POS/Store databases.

Solution
Adopted an ADW strategy, moved all customer data to one Teradata system, revised data models to cover all channels, added web channel for commerce, used web services, added TASM to handle multiple workload types.

Impact
> 1M tactical hits to the EDW per day from the POS, Call Center, and Web with 0.11 sec response time
> Runs simultaneously with back-office BI, reports, and ETL workloads
> Eliminated all other customer data systems

Change is Fast and Getting Faster
New Challenges for Database Technology

What is the Measure of a Great Architecture?

Handle huge changes of underlying technologies and dependent components while continuing to deliver the key value proposition.


Processor Roadmap: CPU power radically increasing

[Figure: processor roadmap, time axis 2000, 2004, 2008+. Process generations: 90nm, 65nm, 45nm, 32nm, 22nm, with milestones at 2003, 2005, 2007, 2009, 2011. Feature labels: Hyper-Threading, Dual Core, Multi Core. Vertical axis: SPECInt2000, with curves labeled "DUAL/MULTI-CORE PERFORMANCE" and "SINGLE-CORE PERFORMANCE" and a "5X" marker. Source: Intel Corporation]

What Does Shared Nothing Mean?


1985: Every hardware part, every line of software pure shared nothing
1995: Multiple units of parallelism sharing CPU, memory
2004: Multiple units of parallelism sharing multiple cores, memory
2009: Multiple units of parallelism sharing same physical spindles but still not sharing data
Future: Multiple units of parallelism in virtual machines/cloud, not even knowing what physical machine it is on or sharing

Teradata MPP Server Architecture


Nodes
> Incrementally scalable to 1024 nodes

Operating System
> Linux, Windows, Unix

Storage
> Independent I/O
> Scales per node

BYNET Interconnect
> Fully scalable bandwidth

Connectivity
> Fully scalable
> Channel ESCON/FICON
> LAN, WAN

Server Management
> One console to view the entire system

[Figure: four SMP nodes (SMP Node1 to SMP Node4), each with its own operating system, CPU1, CPU2, and memory, linked by dual BYNET interconnects, with server management attached]

Shared Nothing - Dividing the Work


Virtual processors (vprocs) do the work

Two types
> AMP: owns and operates on the data
> PE: handles SQL and external interaction

Configure multiple vprocs per hardware node
> Take full advantage of SMP CPU and memory

Each vproc has many threads of execution
> Many operations executing concurrently
> Each thread can do work for any user, transaction

Software is equivalent regardless of configuration
> No user changes as system grows from small SMP to huge MPP

Shared Nothing - Dividing the Work


Basis of Teradata scalability
> Each AMP owns an equal slice of the disk
> Only that AMP reads that slice

No single point of control for any operation
> I/O, Buffers, Locking, Logging, Dictionary
> Nothing centralized
> Exponential communication costs avoided

[Figure: coordination cost versus number of nodes, with the Teradata curve labeled; each AMP has its own logs, locks, buffers, and I/O]

Teradata Data Distribution


Rows automatically distributed evenly by hash partitioning
> Even distribution results in scalable performance
> Done in real time as data are loaded, appended, or changed

Primary Index (PI) column(s) are hashed
> Hash is always the same for the same values
> No reorgs, repartitioning, space management

Hash map defined and maintained by the system
> 2**32 hash codes, 64K buckets distributed to AMPs

[Figure: rows of Table A, Table B, and Table C pass through the Primary Index and the Teradata parallel hash function; each row carries a RowHash (hash bucket) plus its data fields, and the rows are spread evenly across AMP1, AMP2, AMP3, AMP4, ... AMPn]
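
The distribution scheme on this slide (hash the Primary Index values to a 32-bit code, map the code to one of 64K buckets, look the bucket up in a system-maintained hash map to find the owning AMP) can be sketched as follows. This is a toy model under stated assumptions, not Teradata's implementation: the slide does not describe the actual hash function, so SHA-256 truncated to 32 bits stands in for it; taking the top 16 bits as the bucket number and assigning buckets to AMPs round-robin are both assumptions; all function names are hypothetical.

```python
import hashlib
from collections import Counter

NUM_BUCKETS = 64 * 1024  # "64K buckets" from the slide


def row_hash(pi_values):
    """32-bit hash of the Primary Index values (stand-in hash, not Teradata's)."""
    key = repr(tuple(pi_values)).encode("utf-8")
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big")


def bucket_of(hash_code):
    """Assumed mapping: the top 16 bits of the 32-bit code select the bucket."""
    return hash_code >> 16


def build_hash_map(num_amps):
    """Assumed hash map: buckets dealt out to AMPs round-robin."""
    return [bucket % num_amps for bucket in range(NUM_BUCKETS)]


def amp_for_row(pi_values, hash_map):
    """The AMP that owns a row depends only on its Primary Index values."""
    return hash_map[bucket_of(row_hash(pi_values))]


hash_map = build_hash_map(num_amps=4)
counts = Counter(amp_for_row((customer_id,), hash_map) for customer_id in range(100_000))
for amp in sorted(counts):
    print(f"AMP{amp + 1}: {counts[amp]} rows")
```

Because the owning AMP is a pure function of the Primary Index values, the same value always lands on the same AMP, and a well-mixed hash gives each AMP close to an equal share of rows. Adding AMPs in this model only requires reassigning buckets in the hash map, which is one way to read the slide's "no reorgs, repartitioning" claim.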


Disk Capacity Exploding with Little Increase in Performance

[Figure: disk drive bandwidth (MB/sec) and performance per capacity (MB/sec/GB) by disk drive capacity. Random I/O; 48K block; 80% read]

Disk drive capacity    Bandwidth (MB/sec)    Performance per capacity (MB/sec/GB)
36 GB                  5.5                   .155
73 GB                  6.0                   .080
146 GB                 6.4                   .044
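
The performance-per-capacity figures follow directly from dividing bandwidth by capacity. The short calculation below reproduces them as a check; it assumes the pairing of the chart values as 36 GB at 5.5 MB/sec, 73 GB at 6.0 MB/sec, and 146 GB at 6.4 MB/sec, which is read from the flattened chart rather than stated explicitly, and the names are illustrative.

```python
# (capacity in GB, bandwidth in MB/sec); the pairing is an assumption read from the chart
DRIVES = [(36, 5.5), (73, 6.0), (146, 6.4)]


def bandwidth_per_gb(capacity_gb, bandwidth_mb_s):
    """Random-I/O bandwidth available per GB stored on the drive."""
    return bandwidth_mb_s / capacity_gb


for capacity_gb, bandwidth in DRIVES:
    print(f"{capacity_gb:>4} GB: {bandwidth_per_gb(capacity_gb, bandwidth):.3f} MB/sec/GB")

capacity_growth = DRIVES[-1][0] / DRIVES[0][0]   # how much capacity grew
bandwidth_growth = DRIVES[-1][1] / DRIVES[0][1]  # how much bandwidth grew
print(f"capacity x{capacity_growth:.2f}, bandwidth x{bandwidth_growth:.2f}")
```

The computed ratios (about 0.153, 0.082, and 0.044 MB/sec/GB) are close to the chart's values. Capacity grows roughly fourfold across these drives while bandwidth grows by under 20%, so each stored gigabyte gets less than a third of the I/O it used to.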

Platform Change
Focus used to be
> Optimization of expensive CPU cycles
> Micro-management of precious disk space

Now
> Manage I/O
> Balance CPU power to the I/O capacity
> Find new ways to optimize I/O, trading for CPU use as necessary
> Pulling 2.5GB/sec per node continuous

Discontinuity coming
> SSDs become price competitive and reliable

File System
Teradata wrote a new rule book
> Old one written by IBM 35 years ago, used by all mainstream DBMSs today - except Teradata

File system built of raw slices

Rows stored in blocks
> Variable length
> Grow and shrink on demand
> Rows located dynamically
  May be moved to reclaim space, defrag
> Maximum block size is configurable
  System default or per table
  8K to 128K
  Change dynamically

Indexes are just rows in tables

Has evolved from direct management of single spindles to completely virtualized storage, not even knowing spindle location

Workload Management Evolution


1984: Pure timeshare
1987: 4 priorities, defined by user
1995: Multiple priorities in multiple partitions
2000: Weighted workload groups
2004: Queuing, reserved resources, focus on tactical work
2009: Visualization and detailed workgroup management
Future: Set service level goals, our job to deliver

Active Workload Management


Manage workloads
> Reduce server congestion

Dynamically adjust in-flight task priority
> Turn the dial, change priorities

Fast active access queries
> Performance, performance, performance

Get maximum throughput

[Figure: an Active Data Warehouse serving Active Events, Active Access, Active Load, and Query and Reporting workloads, each with its own speed dial (readings shown: 60, 75, 10, 25)]
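
One way to picture "turning the dial" is a proportional-share scheme in which each workload receives resources in proportion to a weight, and changing a weight immediately changes every workload's share. The sketch below is a generic illustration of that idea, not Teradata's scheduler: the slides do not describe the actual algorithm, and the workload names, weights, and function name are hypothetical.

```python
def resource_shares(weights):
    """Split resources among workloads in proportion to their weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one workload needs a positive weight")
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical weights for three workload groups
weights = {"tactical": 50, "reporting": 30, "load": 20}
before = resource_shares(weights)

# "Turn the dial": raise the tactical weight while work is in flight
weights["tactical"] = 80
after = resource_shares(weights)

for name in weights:
    print(f"{name}: {before[name]:.0%} -> {after[name]:.0%}")
```

Raising one weight increases that workload's share and shrinks the others without stopping them, which matches the slide's goal of favoring fast active access queries while still keeping load and reporting work moving.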


TASM Reporting/Monitoring - 13.10


Availability Requirements
[Figure: availability requirements versus number of users (axis from 10 to 1,000,000), moving from Strategic Intelligence to Operational Intelligence. Availability levels: Business Critical, Mission Critical, Dual Active. User groups: IT, Finance, Planners, Power Users, Data Miners; Executives, Middle Managers, Marketing; Category Managers, Line Managers, Service Managers, Operational Employees; Consumers, Suppliers, B2B]

Always ON: An Elusive Challenge

Unplanned downtime
> Hardware faults
> Software faults
> Hangs

Planned downtime
> Software upgrade
> Hardware upgrade
> Data center maintenance

Disasters
> Multi-component failures
> Building disasters
> Area disasters

And optimize resource value to the business

And avoid hidden costs and surprises
> E.g. major performance variations

Major opportunity for research but must be holistic
> Reaches far beyond core database

Real time Operational Actions


1. Customer makes multi-segment travel reservation.
2. Flight rerouted, causing missed connections.
3. What is the customer's flying history?
4. How profitable is each customer?
5. Which customers experienced delays or other problems in last 6 months?
6. Customer rebooked and notified.
7. Airport operations adjusted.

[Figure labels: Strategic Intelligence; Operational Intelligence; Active Enterprise Data Warehouse; WebSphere MQ, Oracle AQ, Microsoft MSMQ]

Real Time Customer Management


1. Customer inserts Total Rewards Card at slot machine.
2. What is the customer's past spending history in all our casinos?
3. What is a significant loss for this person based on market segment, past and predicted behavior?
4. Is this customer approaching the predicted loss rate for their segment?
5. What offers are available for this customer?
6. Message sent to floor Luck Ambassador with customer offer to prevent additional losses.

[Figure labels: Strategic Intelligence; Operational Intelligence; Active Enterprise Data Warehouse; TIBCO]

That's a Wrap!

Business requires a new level of decision making
> Many more decisions by many more people, much faster
> Current representation of the state of the enterprise

Data Warehouse must evolve to support the requirements of Active Enterprise Intelligence

Technology must evolve to deal with the new requirements
> Rich area for research and innovation
> Change view of what data warehouse/BI means

Teradata driving an aggressive roadmap to meet real business requirements
