
NVIDIA T4 FOR VIRTUALIZATION

Nicola Sessions, February 2019


AGENDA

• Why Choose NVIDIA T4 for Virtualization?
• NVIDIA T4 Performance for Virtualization Workloads
• Selecting the Right GPU for Your Virtualization Workload

ANNOUNCING NVIDIA T4 FOR VIRTUALIZATION
The New Generation of Computer Graphics on a Quadro Virtual Data Center Workstation

• Virtual Quadro Workstation for the Professional Designer & Data Scientist:
  • Up to 2X graphics performance versus M60
  • 5 Giga Rays per second for real-time, interactive rendering
  • NGC support; run deep learning inferencing workloads 25x faster than CPU on a virtual machine

• Virtual PCs for the Knowledge Worker:
  • Up to 33% improved performance versus CPU-only VMs
  • Support for VP9 decode and H.265 encode and decode for improved CPU offload

DRIVING NEW WORKFLOWS
Empowering the Modern Digital Workplace

• Digital Workplace: Windows 10 & Productivity Apps
• Photorealistic Rendering: Increasingly Complex Designs
• Data Science: Increase in AI/DL & Inference

RTX PERFORMANCE IN A
QUADRO VIRTUAL WORKSTATION
Support for up to 5 Giga Rays/Sec

• Media & Entertainment: Real-time Rendering
• Manufacturing: Simulation, modeling, design
• Architecture: Rendering, design

NVIDIA T4 KEY SPECIFICATIONS

GPU Architecture              NVIDIA Turing
NVIDIA CUDA® Cores            2,560
NVIDIA Turing™ Tensor Cores   320
RT Cores                      40
Giga Rays/second              5
Memory Size                   16 GB GDDR6
Memory BW                     Up to 320 GB/s
vGPU Profiles                 1 GB, 2 GB, 4 GB, 8 GB, 16 GB
Form Factor                   PCIe 3.0 single slot (half height & length)
Power                         70W
Thermal                       Passive
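
The vGPU profile sizes in the specification list divide the T4's 16 GB of memory among virtual machines. The following is a minimal sketch, assuming each VM receives a fixed framebuffer slice equal to its profile size and that all VMs on one GPU share the same profile. The helper name `vms_per_gpu` is hypothetical and is not part of any NVIDIA tool or API.

```python
# Illustrative estimate only: assumes every VM on a GPU gets an equal,
# fixed slice of the 16 GB framebuffer (one profile size per GPU).
T4_MEMORY_GB = 16
T4_PROFILES_GB = [1, 2, 4, 8, 16]  # profile sizes from the specification list


def vms_per_gpu(memory_gb, profile_gb):
    """Hypothetical helper: whole number of equal-size profiles that fit."""
    if profile_gb <= 0:
        raise ValueError("profile size must be positive")
    return int(memory_gb // profile_gb)


density = {p: vms_per_gpu(T4_MEMORY_GB, p) for p in T4_PROFILES_GB}

for profile_gb, count in density.items():
    print(f"{profile_gb} GB profile: up to {count} VMs per T4")
```

Under these assumptions the 1 GB profile yields 16 VMs per T4, which is consistent with the 32-user density quoted for two T4 boards in the GRID vPC/vApps comparison later in this deck. Real deployments may be limited by other factors such as licensing, CPU, or encoder capacity.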

LATEST GENERATION
QUADRO VIRTUAL WORKSTATION
Work Faster with Larger Models

• Continued performance increases with latest generation GPUs
• Added AI support and ray tracing support with Tensor and RT cores

[Chart: Quadro Virtual Workstations, SPECviewperf13. M60: 1; P4: [VALUE]; T4: 1.5. 3D Graphics: 1.5x performance.]

SPECviewperf 13 results tested on a server with Intel Xeon Gold 6154 (18C, 3.0 GHz), Quadro vDWS with T4-16Q, VMware ESXi 6.7, host/guest driver 410.87/412.10, VM config, Windows 10, 8 vCPU, 16GB memory.

HIGHEST GRAPHICS PERFORMANCE
ON A VIRTUAL WORKSTATION
Work Faster with Larger Models

• Up to 2X performance compared to M60
• 2X framebuffer compared to P4 to support larger models
• Professional Performance:
  • Healthcare
  • Oil & Gas
  • Media & Entertainment
  • Manufacturing

[Chart: SPECviewperf13 Relative Performance for M60, P4, and T4. Categories: Geomean; Medical (Healthcare); Energy (Oil & Gas); 3ds Max, Maya (Media & Ent); CATIA, Creo, Siemens NX, SOLIDWORKS (Manufacturing/Product Design).]

SPECviewperf 13 results tested on a server with Intel Xeon Gold 6154 (18C, 3.0 GHz), Quadro vDWS with T4-16Q, VMware ESXi 6.7, host/guest driver 410.87/412.10, VM config, Windows 10, 8 vCPU, 16GB memory.

RUN RTX APPLICATIONS
ON A VIRTUAL WORKSTATION
Quadro vDWS with RTX-Capable NVIDIA T4

• Run applications built on the RTX platform, the most powerful rendering platform, on any device, anywhere
• Real-time ray tracing performance of up to 5 Giga Rays per second
• Accelerate batch rendering for faster time-to-market
• AI-enhanced denoising speeds creative workflows
• Photorealistic design with accurate shadows, reflections & refractions

NVIDIA T4 WITH QUADRO vDWS
Real-Time Inference Performance

• Quadro Virtual Workstation for deep learning inferencing workloads
• Support for NVIDIA GPU Cloud (NGC)
• Ideal for deep learning labs and classrooms

[Chart: Video Inference, ResNet-50 (7ms latency limit). Speedup vs. CPU Server: T4 & Quadro vDWS is 25X faster than the CPU VM.]

Tested on a server with Intel Xeon Gold 6154 (18C, 3.0 GHz), Quadro vDWS with T4-16Q, VMware ESXi 6.7, host/guest driver 410.87/412.10, VM config, Ubuntu 16.04, 8 vCPU, 32GB memory. 25X performance improvement over CPU VM.

NVIDIA T4 FOR VIRTUAL PCs
Optimize Data Center Utilization with Mixed Workloads

• T4 vs. CPU only: Adding NVIDIA GPUs results in 33% better user experience versus CPU only VMs**
• T4 vs. M10: provides same user density with lower power consumption*
• Same user experience & performance**
• Support for VP9 decode
• Support for H.265 (HEVC) 4:4:4 encode and decode
• Support for >1TB system memory

[Chart: Virtual PCs, UX based on Remoted Frames. CPU only VM: 1; T4: [VALUE]. UX: 1.3x better.]

* Two NVIDIA T4 GPUs support the same user density as a single M10 and fit in the same 2 slot PCIe form factor.
** NVIDIA internal benchmark running Microsoft PowerPoint, Word, Excel, Chrome, PDF viewing and video playback.

NVIDIA DATA CENTER GPUs
Recommended for Virtualization

                  V100              P40           T4            M10                         P6
GPUs / Board      1                 1             1             4                           1
Architecture      Volta             Pascal        Turing        Maxwell                     Pascal
CUDA Cores        5,120             3,840         2,560         2,560 (640 per GPU)         2,048
Tensor Cores      640               ---           320           ---                         ---
RT Cores          ---               ---           40            ---                         ---
Memory Size       32 GB/16 GB HBM2  24 GB GDDR5   16 GB GDDR6   32 GB GDDR5 (8 GB per GPU)  16 GB GDDR5
Power             250W/300W         250W          70W           225W                        90W
Thermal           passive           passive       passive       passive                     bare board

vGPU Profiles
  V100: 1 GB, 2 GB, 4 GB, 8 GB, 16 GB, 32 GB
  P40:  1 GB, 2 GB, 3 GB, 4 GB, 6 GB, 8 GB, 12 GB, 24 GB
  T4:   1 GB, 2 GB, 4 GB, 8 GB, 16 GB
  M10:  0.5 GB, 1 GB, 2 GB, 4 GB, 8 GB
  P6:   1 GB, 2 GB, 4 GB, 8 GB, 16 GB

Form Factor
  V100: PCIe 3.0 Dual Slot & SXM2 (rack servers)
  P40:  PCIe 3.0 Dual Slot (rack servers)
  T4:   PCIe 3.0 Single Slot (rack servers)
  M10:  PCIe 3.0 Dual Slot (rack servers)
  P6:   MXM (blade servers)

Categories: PERFORMANCE Optimized, DENSITY Optimized, BLADE Optimized

SELECTING THE RIGHT GPU
NVIDIA Quadro Virtual Data Center Workstation

• NVIDIA T4 (Smaller Profiles, More Users)
  • Use Case: Entry to Midrange Quadro Workstations
  • Workloads: CAD, CAE, Digital Content Creation, Rendering, Inferencing, Training

• Between T4 and P40: "My end users work with larger models or applications"

• NVIDIA P40
  • Use Case: High-end Quadro Workstations
  • Workloads: Large, Complex CAD models, Seismic Exploration, Complex Digital Content Creation, Effects, 3D Medical Imaging

• Between P40 and V100: "My end users use CAE applications, or are experimenting with DL/AI"

• NVIDIA V100 (Larger Profiles, Fewer Users)
  • Use Case: Ultra High-end Quadro Workstations
  • Workloads: Largest CAD models, CAE, Seismic Exploration, GPGPU compute, Deep Learning, Immersive Visualization

[Diagram axes: decreasing user density per server; increasing workflow/model complexity.]

SELECTING THE RIGHT GPU
NVIDIA GRID vPC/vApps

                        2 x NVIDIA T4           1 x NVIDIA M10
Density                 32 users                32 users
Form Factor             PCIe 3.0 single slot    PCIe 3.0 dual slot
Power                   140W (70W per GPU)      225W
Cores Available         CUDA, Tensor, RT        CUDA
CODECs                  VP9, H.265              H.264
System Memory Support   > 1TB                   < 1TB

Use Case
  2 x NVIDIA T4:  Universal GPU for virtual workstations, knowledge workers, rendering, inferencing, training
  1 x NVIDIA M10: Lowest TCO for knowledge workers
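
The power and density figures in this comparison can be turned into per-user numbers. The sketch below uses only the values listed above; the variable and function names are illustrative, and board power here is the listed figure rather than a measured draw under load.

```python
# Values copied from the comparison on this slide; board power is the
# listed figure for each configuration, not a measurement.
configs = {
    "2 x NVIDIA T4": {"users": 32, "power_w": 2 * 70},
    "1 x NVIDIA M10": {"users": 32, "power_w": 225},
}


def watts_per_user(cfg):
    """Listed board power divided by the number of supported users."""
    return cfg["power_w"] / cfg["users"]


t4 = configs["2 x NVIDIA T4"]
m10 = configs["1 x NVIDIA M10"]

# Absolute and relative difference in listed power at equal user density.
saving_w = m10["power_w"] - t4["power_w"]
saving_pct = 100 * saving_w / m10["power_w"]

for name, cfg in configs.items():
    print(f"{name}: {watts_per_user(cfg):.2f} W per user")
print(f"T4 pair: {saving_w} W less listed power ({saving_pct:.0f}% lower)")
```

At equal density the two T4 boards list 85 W less than the single M10. This covers GPU board power only and says nothing about total server power or cost.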
NVIDIA T4 FOR VIRTUALIZATION
Powerful, Versatile Platform for VDI

• Powerful virtual workstation for the engineer, professional designer, and data scientist
• Deep learning inferencing for virtual labs and classrooms
• High density virtual desktops for the best user experience for Windows 10
