RTX PRO Server Partner Deck

The document discusses the necessity of AI factories for enterprises, highlighting that 92% are investing in AI, yet only 1% have mature deployments. It introduces the RTX PRO Server, designed for multi-workload acceleration and optimized for AI and visual computing, showcasing its performance advantages over traditional CPU servers. The RTX PRO Server supports various applications, including AI inference, industrial AI, and enterprise HPC, offering significant improvements in price-performance and energy efficiency.


RTX PRO Server – OEM/Channel

Speaker Name, Title, November 2025


Why do I need an AI Factory?

• 92% of enterprises are investing in AI
• 50% will use AI agents to achieve business value
• 33% cite complexity as the top barrier to adoption
• 1% have mature AI deployments


Every Enterprise Needs an AI Factory
Manufacturing Intelligence at Scale

• Traditional Data Center: electricity & data in, transactional workflows
• AI Factory: electricity & data in, accelerated workloads producing tokens*, intelligence & outcomes

*What is a token? It’s about half a word, so a 10-word ChatGPT query is 20-30 tokens.
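The footnote's rule of thumb can be turned into a quick estimator. The 0.5 words-per-token ratio below is the deck's own heuristic, not a real tokenizer, so treat the result as a rough approximation:

```python
def estimate_tokens(text: str, words_per_token: float = 0.5) -> int:
    """Rough token estimate using the deck's 'a token is about half a word' heuristic."""
    words = len(text.split())
    return round(words / words_per_token)

# A 10-word query comes out to ~20 tokens under this heuristic.
query = "What are the best practices for deploying AI inference servers"
print(estimate_tokens(query))  # -> 20
```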
Not All Factories for Manufacturing are the Same
Not all factories for AI are either

• RTX PRO Server: multi-workload acceleration, enterprise data center compatibility
• HGX B200: AI-optimized performance for training and model inference
• GB200 NVL72: exascale computer in a rack for massive LLM training and inference
RTX PRO Server
NVIDIA Blackwell for Every Enterprise Data Center

• MGX 4U configuration: 8x RTX PRO 6000 Blackwell Server Edition GPUs, ConnectX-8 SuperNIC with PCIe Gen 6 switch
• 4U configuration: 8x RTX PRO 6000 Blackwell Server Edition GPUs
• 2U configuration: 2x RTX PRO 6000 Blackwell Server Edition GPUs
RTX PRO 6000 Blackwell
Server Edition
The most powerful universal GPU for
AI and visual computing in the data center

Breakthrough Multimodal AI Inference


• 5th-Gen Tensor Cores, 2nd-Gen Transformer Engine, FP4
• Full media pipeline: 4x NVENC / NVDEC / NVJPEG

Powerful Graphics and Visual Computing


• 4th-Gen RTX, Neural Shaders, DLSS 4
• vGPU Support, AI Virtual Workstations (vWS)

Data Center Ready


• 96GB GDDR7, 1.6 TB/s memory bandwidth, 128MB L2 cache
• Multi-Instance GPU (MIG), TEE Confidential Compute

RTX PRO 6000 Blackwell Server Edition GPU
Dual slot, FHFL | Up to 600W
RTX PRO Server is the Universal Data Center Platform
NVIDIA Provides Enterprise Software to Enable a Wide Range of Workloads on RTX PRO Servers

• AGENTIC AI: 6X throughput, LLM inference
• INDUSTRIAL & PHYSICAL AI: 4X faster synthetic data generation
• SCIENTIFIC COMPUTING: 7X faster genome sequence alignment
• DATA ANALYTICS & SIMULATION: 3X throughput, engineering simulation
• VISUAL COMPUTING: 4X higher FPS, real-time rendering
• ENTERPRISE APPLICATIONS: 4X concurrent workloads, Multi-Instance GPU

Enterprise software: NVIDIA Run:ai, NVIDIA AI Enterprise, NVIDIA Omniverse Enterprise, NVIDIA RTX Virtual Workstation/Virtual PC

Projected performance, subject to change:
1. Llama 3.1 70B inference; 8K/256, 20 t/s/usr, 2s FTL; RTX PRO 6000 (FP4) vs L40S (FP8)
2. NVIDIA Cosmos 7B; text-to-video generation, 2.5s 720p video; RTX PRO 6000 (FP4) vs L40S (FP8)
Measured performance on RTX PRO Server:
3. Smith-Waterman (TCUPS); 8-bit; RTX PRO 6000 vs L40S
4. Altair UFX; FP32; RTX PRO 6000 vs L40S
5. Omniverse; Debrecan; real-time rendering FPS; RTX PRO 6000 (RT2 + DLSS4 on) vs L40S (DLSS3)
6. Number of concurrent workloads supported with MIG; RTX PRO 6000 vs L40S/L4
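The LLM footnotes pin a service level of 20 tokens/s per user. Under that definition, a node's aggregate decode throughput bounds how many users it can serve concurrently; a minimal sketch (the 2,400 tok/s figure is a made-up example, not a measured number from the deck):

```python
def max_concurrent_users(aggregate_tok_s: float, per_user_tok_s: float = 20.0) -> int:
    # At a fixed tokens/s-per-user service level (20 t/s/usr in the footnotes),
    # aggregate decode throughput divided by the per-user rate bounds concurrency.
    return int(aggregate_tok_s // per_user_tok_s)

print(max_concurrent_users(2400))  # hypothetical node throughput -> 120 users
```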
Enterprise Server for
NVIDIA Accelerated Computing
NVIDIA Blackwell for the Enterprise

RTX PRO Server

Designed for enterprise IT requirements

Ease of Orderability / Install / Deploy / Support

Air-Cooled / PCIe / x86 / 7kW per Node

NEW — LIMITED TIME PROMOTION
RTX PRO Server Software Kit: NVIDIA AI Enterprise + Omniverse Enterprise + vPC and vWS, plus NVIDIA Run:ai

RTX PRO Server 7kW power requirement based on 8x RTX PRO 6000 Blackwell Server Edition GPUs
Announcing New Mainstream
NVIDIA RTX PRO Servers
High-volume servers from global system partners

Introducing New 2U Configurations

Most popular form factor (~57% of rack servers)

2x RTX PRO 6000 Blackwell Server Edition GPUs

Multi-workload acceleration

15X price-performance and energy efficiency vs CPU-only server

NEW — LIMITED TIME PROMOTION
RTX PRO Server Software Kit

NVIDIA AI Enterprise + Omniverse Enterprise + vPC and vWS

Available from global system partners starting Fall 2025

NVIDIA RTX PRO Server (2U chassis, 2x RTX PRO 6000 Blackwell Server Edition GPUs, 2x x86 CPUs, 3kW) vs. 2x x86 CPU server (2U chassis, 1.1kW); multi-workload relative performance; geomean of measured performance speedups for HPC applications (FP32), rendering (VRED), Spark RAPIDS, and encode streams (1080p30, HEVC). Shown for representation only. Please contact partners for pricing.
Accelerate the Shift from CPU Servers
To mainstream 2U servers with RTX PRO 6000 Blackwell Server Edition GPUs

Move CPU workloads to Blackwell: rendering, HPC compute, machine learning, data analytics

RTX PRO Server delivers over 15X price-performance and energy efficiency (2x x86 CPU server vs 2x RTX PRO 6000 server):
• 15.7x Performance/$
• 15.4x Performance/W

Equal-throughput comparison (~$5M CAPEX scenario): 287 CPU-only servers ($4.9M) vs 7 RTX PRO Servers ($320K)
NVIDIA RTX PRO Server (2U Chassis, 2x RTX PRO 6000 Blackwell Server Edition GPU, 2x x86 CPU, 3kW) vs. 2x x86 CPU (2U Chassis, 1.1kW); Multi-Workload Relative Performance; Geomean of measured performance speedups for HPC Applications (FP32), Rendering (VRed), Spark RAPIDS, Encode Streams (1080p30, HEVC). Shown for
representation only. Please contact partners for pricing.
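The footnote's methodology — a geometric mean of per-workload speedups, with Performance/$ defined as performance divided by single-node TCO (server plus power costs) — can be sketched as follows. The speedup and cost numbers below are hypothetical placeholders, not the deck's measurements:

```python
import math

def geomean(speedups):
    # Geometric mean of per-workload speedups, the aggregation named in the footnote.
    return math.exp(sum(math.log(s) for s in speedups) / len(speedups))

def perf_per_dollar(perf, server_cost_usd, power_kw, usd_per_kwh, hours):
    # Performance/$ = performance / TCO for a single node (server + power costs).
    tco = server_cost_usd + power_kw * hours * usd_per_kwh
    return perf / tco

# Hypothetical per-workload speedups of a GPU node over a CPU-only node:
print(round(geomean([20.0, 14.0, 12.0, 18.0]), 1))  # -> 15.7
```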
RTX PRO 6000 Blackwell Server Edition
Leaves L40S and Hopper Behind

Best Server for RTX Graphics and Visual Computing
RTX PRO Server: up to 2.6X better price-performance

Visual computing Performance/$, normalized to L40S, across six workloads: 4K (FPS), 4K DLSS4 (FPS), rendering (time), Omniverse interactive rendering (FPS), encode (streams), decode (streams). Per-workload gains as charted: 2.6X, 2.3X, 1.9X, 1.5X, 1.2X, 1.1X.

Measured performance:
1. SPECviewperf 15, 4K geomean
2. Cyberpunk 2077, RT: Overdrive, SR: Performance, RR: On; RTX PRO 6000 (DLSS4, FG 4x) vs L40S (DLSS3, FG 2x)
3. Autodesk Arnold, Chameleon scene
4. Omniverse EBC Factory, 4K animated; RTX PRO 6000 (DLSS4) vs L40S (DLSS3)
Projected performance:
5. H.264 encode; 1080p30 streams, LL p1
6. H.264 decode; 1080p30 streams, LL p1

Performance/$ = Performance / TCO for a single node (server + power costs) for L40S compared to RTX PRO 6000 Blackwell Server Edition
Best Server for RTX Graphics and Visual Computing
RTX PRO Server: up to 3.3X better performance

Visual computing performance, normalized to L40S, across the same six workloads: 4K (FPS), 4K DLSS (FPS), rendering (time), Omniverse interactive rendering (FPS), encode (streams), decode (streams). Per-workload gains as charted: 3.3X, 2.9X, 2.4X, 1.9X, 1.6X, 1.4X.

Measured performance:
1. SPECviewperf 15, 4K geomean
2. Cyberpunk 2077, RT: Overdrive, SR: Performance, RR: On; RTX PRO 6000 (DLSS4, FG 4x) vs L40S (DLSS3, FG 2x)
3. Autodesk Arnold, Chameleon scene
4. Omniverse EBC Factory, 4K animated; RTX PRO 6000 (DLSS4) vs L40S (DLSS3)
Projected performance:
5. H.264 encode; 1080p30 streams, LL p1
6. H.264 decode; 1080p30 streams, LL p1
Visual Computing Ecosystem
Applications for Creative and Technical Professionals
Best Server for Industrial AI and Physical AI
RTX PRO Server: up to 2.9X better price-performance

Visual simulation Performance/$, normalized to L40S, across four workloads: factory digital twin simulation, factory digital twin layout planning, synthetic data generation (video generation), and synthetic data generation (batch rendering). Per-workload gains as charted: 2.9X, 2.8X, 2.6X, 2.1X.

Measured performance:
• Factory digital twin, animated: real-time rendering with DLSS4
• Factory digital twin, static: real-time rendering with DLSS4
• Synthetic data generation, video generation: Cosmos Predict2 14B Video2World @ 720p
• Synthetic data generation, batch rendering: AV Sim SDG full scene generation @ 1080p

Performance/$ = Performance / TCO for a single node (server + power costs) for L40S compared to RTX PRO 6000 Blackwell Server Edition
Best Server for Industrial AI and Physical AI
RTX PRO Server: up to 4X better performance

Visual simulation performance, normalized to L40S, across the same four workloads. Per-workload gains as charted: 3.7X, 3.3X, 2.8X, 2.6X.

Measured performance:
1. Factory digital twin simulation: real-time rendering with DLSS; EBC Factory, 4K animated; RTX PRO 6000 (DLSS4) vs L40S (DLSS3)
2. Factory digital twin layout planning: real-time rendering with DLSS; Debrecan Factory, 1080p static; RTX PRO 6000 (DLSS4) vs L40S (DLSS3)
3. Synthetic data generation, video generation: Cosmos Predict2 14B Video2World @ 720p
4. Synthetic data generation, batch rendering: AV Sim SDG full scene generation @ 1080p
Industrial and Physical AI Ecosystem
Example RTX PRO Server Applications and Workloads

• CAE Digital Twins: NVIDIA CUDA-X and Omniverse
• Factory Digital Twins: NVIDIA Omniverse and Metropolis
• Robotics Simulation: NVIDIA Omniverse and Isaac Sim
• Synthetic Data Generation: NVIDIA Omniverse and Cosmos
• Vision AI Agents: NVIDIA Metropolis and Cosmos
Best Server for Enterprise HPC
RTX PRO Server: up to 5X better price-performance

Compute workload Performance/$, normalized to HGX H100, across six workloads: discrete element modeling, fluid dynamics, data analytics, molecular dynamics, energy, genomics. Per-workload gains as charted: 5.0X, 3.3X, 3.3X, 2.5X, 1.7X, 1.3X.

Measured performance:
• Discrete element modeling: Altair EDEM, Mill, elapsed time (ms)
• Computational fluid dynamics: Altair NFX
• Data analytics: TPC-DS queries, FP32; graph queries for data science/analysis
• Molecular dynamics: GROMACS, FP32; simulation package for molecular analysis
• Energy: RTM (TTI r8 2-pass), reverse time migration, used for seismic imaging; FP32
• Genomics: Smith-Waterman, FP32

Performance/$ = Performance / TCO for a single node (server + power costs) for HGX H100 compared to RTX PRO 6000 Blackwell Server Edition
Best Server for Enterprise HPC
RTX PRO Server: up to 2.3X better performance

Compute workload performance, normalized to HGX H100, across the same six workloads: discrete element modeling, fluid dynamics, data analytics, molecular dynamics, energy, genomics. Per-workload results as charted: 2.3X, 1.5X, 1.5X, 1.1X, 0.8X, 0.6X.

Performance/$ = Performance / TCO for a single node (server + power costs) for HGX H100 compared to RTX PRO 6000 Blackwell Server Edition
Enterprise HPC Ecosystem
Accelerated Applications for Industries
Best Server for Enterprise AI
RTX PRO Server: up to 5X better price-performance

AI inference Performance/$, normalized to L40S, for L40S, HGX H100, and RTX PRO 6000 across five workloads: Flux 12B (image generation), Cosmos (video generation), Llama 3 8B, Llama 3 70B, and Mixtral 8x7B (LLM inference). Per-workload gains as charted: 4.4X, 3.4X, 2.6X, 2.6X, 1.8X.

Measured performance:
1. Flux 12B, 1024x1024 image
2. Cosmos VFM 7B, text-to-video generation, 720p
Projected performance, subject to change:
3. Llama 3 8B; 2K/256; L40S, HGX H100 (FP8), RTX PRO 6000 (FP4)
4. Llama 3 70B; 8K/256, 20 t/s/usr
5. Mixtral 8x7B; 4K

Performance/$ = Performance / TCO for a single node (server + power costs) for L40S and HGX H100 compared to RTX PRO 6000 Blackwell Server Edition
Best Server for Enterprise AI
RTX PRO Server: up to 6X better performance

AI inference performance, normalized to L40S, for L40S, HGX H100, and RTX PRO 6000 across the same five workloads: Flux 12B (image generation), Cosmos (video generation), Llama 3 8B, Llama 3 70B, and Mixtral 8x7B (LLM inference). Per-workload gains as charted: 5.6X, 3.3X, 3.3X, 2.7X, 2.6X.

Measured performance:
1. Flux 12B, 1024x1024 image
2. Cosmos VFM 7B, text-to-video generation, 720p
Projected performance, subject to change:
3. Llama 3 8B; 2K/256; L40S, HGX H100 (FP8), RTX PRO 6000 (FP4)
4. Llama 3 70B; 8K/256, 20 t/s/usr
5. Mixtral 8x7B; 4K
Enterprise AI Ecosystem
Example RTX PRO Server AI Agents

Supply Chain Agent, Security Agent, Customer Service Agent, Coding Agent, Deep Research Agent, Bio-Medical Agent

Agent Applications
AIOps
Orchestration

Blueprints:
• NVIDIA Data Flywheel Blueprint: refine AI agents through continuous model distillation with data flywheels
• NVIDIA AI-Q Blueprint: build AI agents for enterprise research
Worldwide Customers Deploy Agents with NVIDIA

• LLM as a Service: improve customer personalization
• Call Center Analytics Agent: 84% decrease in cost
• Recommendation Agent: over 1M datapoints and product formulations
• Geoscience Agent: 80% better data utilization improves yields
• Conversational AI Agent: 30% improved accuracy
• Network Agent: automating network processes
• Digital Twin Agent: 10K products converted to digital twins
• Domain-Specific LLM: 30% accuracy increase and 20% less training time
• Trading Insights Agent: 60% faster insights
• Voice AI Agent: enabling AI-powered voice services
• Drug Discovery Agent: improve pharmaceutical development
• Supply Chain Agent: transform operations planning, logistics, and warehouse management
Industry Leaders Transforming Business With RTX PRO Server
Graphics, Industrial AI, Scientific Computing and Agentic AI Use Cases

• Disney: next-generation park guest experiences
• Hitachi: industrial AI, Hitachi rail and energy use cases
• Lilly: drug discovery, computational physics
• SAP: SAP Business AI, embedded AI agents

Industry Leaders Transforming Manufacturing With RTX PRO Server
Physical AI, Digital Twin and Robotics Use Cases

• Foxconn: factory automation, industrial robotics
• Hyundai Motor Group: product design, factory digital twin
• TSMC: fab twin facility, design twin
Which AI Factory does my customer need?
Select the Best GPU for the Job: RTX PRO Server vs HGX B200
Determine the Right Infrastructure Based on Customer Requirements

Large LLM model training and inference:
• Foundational model training → HGX B200
• Fine tuning / inference (>70B parameters) → HGX B200

Compute-heavy workloads, no graphics:
• Double-precision (FP64) workloads → HGX B200
• Memory-dependent workloads (HBM3e required) → HGX B200
• Mid-to-large LLM inference (>70B) / FP64 with <20kW rack power constraints → H200 NVL enterprise server

Mix of accelerated workloads, flexible + lower power:
• Fine-tuning / inference / RAG (<70B parameters) → RTX PRO Server
• Early in AI journey? Data center refresh? → RTX PRO Server
• <20kW data center rack? PCIe flexibility? → RTX PRO Server
• Visual computing workloads (graphics, design, digital twins) → RTX PRO Server
• Additional workloads (data analytics, simulation, FP32 HPC) → RTX PRO Server
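The decision flow above can be encoded as a small helper for a first-pass recommendation. The flag names and branch ordering are an illustrative reading of the slide, not an official NVIDIA sizing tool:

```python
def recommend_platform(training_gt_70b=False, fp64=False, needs_hbm3e=False,
                       inference_gt_70b=False, power_constrained_rack=False,
                       visual_computing=False):
    # Branches follow the slide: training/FP64/HBM3e needs point to HGX B200,
    # large-model inference under rack power limits to H200 NVL, and mixed
    # enterprise workloads (including visual computing) to RTX PRO Server.
    if training_gt_70b or fp64 or needs_hbm3e:
        return "HGX B200"
    if inference_gt_70b and power_constrained_rack:
        return "H200 NVL Enterprise Server"
    return "RTX PRO Server"

print(recommend_platform(visual_computing=True))  # -> RTX PRO Server
```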
NVIDIA Enterprise AI Factory Options
NVIDIA Blackwell Accelerated Computing Comparison

(RTX PRO Server / HGX B200)
• Industrial AI / Physical AI / Digital Twin: ☆☆☆☆☆ / —
• Graphics / Rendering / VDI: ☆☆☆☆☆ / —
• Video Agents (VSS) / Video Analytics: ☆☆☆☆☆ / ☆☆
• Inference (Multimodal AI, Generative AI, Agentic AI), Perf/$: ☆☆☆☆☆ / ☆☆☆☆☆
• Fine Tuning: ☆☆☆ / ☆☆☆☆
• Single/Mixed-Precision HPC¹ Applications, Perf/$: ☆☆☆☆ / ☆☆☆
• GPUs per node: 2-8 / 8
• Power requirement²: up to 7 kW / 14 kW
• Cooling: air-cooled / air- or water-cooled

1. FP64 not supported on RTX PRO
2. RTX PRO Server 7kW power requirement based on 8x RTX PRO 6000 Blackwell Server Edition GPUs
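The 7 kW and 14 kW node figures translate directly into rack capacity under a fixed power budget. A minimal sketch (power only; real sizing must also account for rack units, cooling, and networking):

```python
def nodes_per_rack(rack_kw: float, node_kw: float) -> int:
    # How many nodes fit a rack's power budget (ignores RU count and cooling).
    return int(rack_kw // node_kw)

# Under the <20 kW enterprise rack constraint cited in the deck:
print(nodes_per_rack(20, 7))   # RTX PRO Server nodes -> 2
print(nodes_per_rack(20, 14))  # HGX B200 nodes -> 1
```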
AI Factories Will Run on RTX PRO and HGX Servers
RTX PRO Server Offers Great Value Across Enterprise Workloads

Performance/$ of RTX PRO 6000 relative to HGX B200 across eleven workloads: Llama 3 70B fine tuning, Llama 3 70B inference, Llama 3 8B inference, image generation, video generation, data analytics, molecular dynamics, genomics, video data curation, graphics rendering, and virtual workstations. Charted values range from 0.2X to 1.4X (1.4X, 1.3X, 1.2X, 0.6X, 0.5X, 0.5X, 0.3X, 0.3X, 0.2X), with the graphics and virtual workstation workloads marked "RTX PRO Server only."

Projected performance, RTX PRO 6000 Blackwell Server Edition vs HGX B200:
1. LLM fine tuning: Llama 3 70B; 64x GPU; GBS 128; SL 4K; FP8
2. LLM inference: Llama 3 70B; 2K/256; 20 t/s; 2s FTL; FP4
3. LLM inference: Llama 3 8B; 2K/256; 20 t/s; 2s FTL; FP4
4. Image generation: FLUX 12B, 1024x1024, min latency
5. Video generation: Cosmos VFM 7B; 720p video; min latency
6. Data analytics: GQE queries; FP32
7. Molecular dynamics: GROMACS; Cellulose (ns/day); FP32
Measured performance:
8. Genomics: Smith-Waterman (TCUPS); 8-bit
AI Factory Workload Projection
World-Class AI Factories for Multiple Enterprise Workloads

HGX B200 (1 node) — heavy model training and inference workloads:
• Training (model size: medium → small)
• Fine tuning (model size: medium → small)
• Inference (AI agents, SDG, VSS, CV, RecSys)
• RAG/embedding (multimodal retrieval pipelines)

RTX PRO Server (1 node) — inference and mixed enterprise workloads:
• Data analytics (Spark RAPIDS, data science)
• Simulation (CAE, CFD, scientific simulation)
• Visual computing (digital twins, CAD, rendering, VDI)

Illustrative chart mapping workloads within an AI Factory for a representative enterprise customer.
RTX PRO Server
Example 4U System Configuration

• GPUs: up to 8x NVIDIA RTX PRO 6000 Blackwell Server Edition
• Networking (E/W)¹: 4x ConnectX-7 (1x 400Gb), or 4x BlueField-3 B3140H, or 4x ConnectX-8 SuperNIC with PCIe Gen 6 switch
• Networking (N/S): 1x BlueField-3 (B3220)
• CPUs: 2x Intel® Xeon or 2x AMD EPYC
• Memory: 1.2TB DDR5 ECC² (1 DIMM per channel)
• Node storage: 2x 4TB NVMe², plus 1x 1TB NVMe boot drive

1. Consult with your OEM partner for specific networking (E/W) configurations
2. Recommended minimum.
RTX PRO Server
Example 2U System Configuration

• GPUs: 2x NVIDIA RTX PRO 6000 Blackwell Server Edition
• Networking (E/W)¹: 2x ConnectX-7 (1x 400Gb) or 2x BlueField-3 B3140H
• Networking (N/S): 1x BlueField-3 (B3220)
• CPUs: 2x Intel® Xeon or 2x AMD EPYC
• Memory: 512GB DDR5 ECC² (1 DIMM per channel)
• Node storage: 2x 4TB NVMe², plus 1x 1TB NVMe boot drive

1. Consult with your OEM partner for specific networking (E/W) configurations
2. Recommended minimum.
HGX B200
Example System Configuration

• GPUs: HGX B200 8-GPU baseboard
• Networking (E/W): 8x BlueField-3 SuperNIC (B3140H)
• Networking (N/S): 1x BlueField-3 (B3220)
• CPUs: 2x Intel® Xeon or 2x AMD EPYC
• Memory: 1.5TB (1 DIMM per channel)
• Node storage: 2x 4TB NVMe¹, plus 1x 1TB NVMe boot drive

1. Recommended minimum, per CPU socket.


RESOURCES
Datacenter Solutions For Any AI & Mixed-Use Needs
Enterprise IT-ready Servers

GB200 NVL72 — Rack-Scale Systems — for massive training and inference
• Key use cases: massive LLM training & massive-scale reasoning, Mixture of Experts
• 72-GPU NVLink; full rack, liquid-cooled only; up to 135 kW power per deployable unit
• Scalability: scale-up models within the rack; scale-out between racks with Quantum-X InfiniBand or Spectrum-X Ethernet
• Peak compute (min deployable unit): 1,400 PFLOPS²

HGX B200 — Scale-Up & Out Server — blend of large model training & inference
• Key use cases: mainstream LLM training, many-user reasoning, fine-tuning (>70B), scientific computing (FP64)
• 8-GPU NVLink; 8U-10U server; air- or liquid-cooled; 14 kW power per deployable unit
• Scalability: scale-up or scale-out; 900 GB/s-1.8 TB/s network per GPU for model- or data-parallel workloads
• Peak compute (min deployable unit): 144 PFLOPS²

H200 NVL — flexible performance for enterprise AI & HPC
• Key use cases: LLM inference, fine-tuning (>70B), scientific computing (FP64)
• Up to 8 GPUs, 4-way NVLink; 3U-4U server; air-cooled; 8 kW power per deployable unit
• Scalability: scale-out; 900 GB/s-1.8 TB/s network per GPU for model- or data-parallel workloads
• Peak compute (min deployable unit): 26 PFLOPS FP8¹

RTX PRO Server — universal enterprise AI, industrial AI, visual computing
• Key use cases: multimodal inference, LLM inference, industrial AI, visual computing, scientific computing
• Up to 8 GPUs; 3U-4U server; air-cooled; 7 kW power per deployable unit
• Scalability: scale-out; Spectrum-X AI Ethernet ready; 800 GB/s all-to-all bandwidth with CX8
• Peak compute (min deployable unit): 30 PFLOPS FP4²

1. H200 NVL FP8 peak FLOPS. 2. FP4 peak FLOPS.
NVIDIA Datacenter Line Card

GB200 / GB300 (NVL72): 186GB <1200W / 288GB <1400W
• Graph neural networks; massive-scale LLM training & inference; 405B-1T+ LLM
• AI-optimized rack scale; Quantum InfiniBand / Spectrum-X fabric; dual-plane E/W: CX7, N/S: B3220; RA: 2-8-9-400/800
• Highest model size and performance AI; liquid cooling only

HGX B200 / B300 (NVL8): 186GB <1000W / 288GB <1100W
• LLM, RAG, AI, HPC, data analytics; up to 405B LLM
• AI-optimized servers, 8 GPUs per server; Spectrum-X fabric; E/W: B3140H, N/S: B3220; RA: 2-8-9-400/800
• High-performance AI compute, HPC; power/compute value optimized; power intense; air & liquid options

H200 NVL (NVL2 / NVL4, dual-slot FHFL): 141GB, 600W
• LLM, RAG, AI, HPC, data analytics; up to 175B LLM
• Enterprise AI-optimized servers, 2-8 GPUs per server; Spectrum-X fabric; E/W: B3140H, N/S: B3220; RA: 2-8-5-200, 2-4-5-400, 2-4-3-200
• High-performance AI compute, HPC; power/compute value optimized

RTX PRO 6000 (dual-slot FHFL): 96GB, 600W
• Standard & multimodal generative AI, fine-tune training/inferencing, Omniverse; up to 70B LLM
• Enterprise AI-optimized servers, 2-8 GPUs per server; Spectrum-X fabric; E/W: B3140H, N/S: B3220; RA: 2-8-5-200/400
• AI + graphics performant, universal, video; standard power

L4 (single-slot HHHL): 24GB, 72W
• Edge AI, inference, video; LLM-7B
• Mainstream AI-optimized servers, 1-16 GPUs per server; Ethernet networking
• Entry-level universal; edge, video; power limited

Scale-up: HGX & Grace Blackwell. Scale-out: PCIe.
Data Center Accelerated Computing Portfolio

Columns, left to right: GB300 NVL72¹; GB200 NVL72; HGX B300¹; HGX B200; GB200 NVL4; GH200; H200 NVL; H200; H100; H100 NVL; RTX PRO 6000 Blackwell Server Edition¹; L40S; L4

• Value proposition: highest-perf AI reasoning, training and inference; high-perf LLM training and inference, data processing; highest AI perf HGX; high-perf AI, HPC HGX; single-server AI + HPC; versatile AI and HPC, self-hosted; mainstream LLM training, inference, fine-tuning, HPC; highest-perf inference and HPC; mid-size LLM training, inference, fine-tuning, HPC; mid-size LLM inference, HPC; universal AI, video, and graphics; high-perf universal; universal
• Form factor: MGX rack, 72-GPU; MGX rack, 72-GPU; HGX, 8-SXM; HGX, 8-SXM; MGX server; MGX server; PCIe Gen5 x16, 2-slot FHFL, NVLink-bridged; HGX, 8- or 4-SXM; HGX, 8- or 4-SXM; PCIe Gen5 x16, 2-slot FHFL, NVLink-bridged; PCIe Gen5 x16, 2-slot FHFL; PCIe Gen4 x16, 2-slot FHFL; PCIe Gen4 x16, 1-slot LP
• Single GPU TDP (GPU+CPU+memory), configurable: up to 1400W; 1200W; 1100W; 1000W; 1200W; 1000W; 600W; 700W; 700W; 400W; 600W; 350W; 72W
• FP4 TC TFLOPS²: 15,000 | 20,000; 10,000 | 20,000; 13,000 | 18,000; 9,000 | 18,000; 10,000 | 20,000; -; -; -; -; -; 1,900 | 3,753; -; -
• FP8 TC TFLOPS³: 10,000; 10,000; 9,000; 9,000; 10,000; 3,958; 3,958; 3,958; 3,958; 3,341; 1,900; 1,466; 485
• INT8 TC TOPS³: 315; 10,000; 280; 9,000; 10,000; 3,958; 3,958; 3,958; 3,958; 3,341; NA; 1,466; 485
• FP16 TC TFLOPS³: 5,000; 5,000; 4,400; 4,500; 5,000; 1,979; 1,979; 1,979; 1,979; 1,671; 950; 733; 242
• TF32 TC TFLOPS³: 2,500; 2,500; 2,200; 2,200; 2,500; 989; 989; 989; 989; 835; 475; 366; 120
• FP32 TFLOPS: 80; 80; 72; 75; 80; 67; 67; 67; 67; 60; 117; 92; 30
• FP64 TC | FP64 TFLOPS: 1.3 | 1.3; 40 | 40; 1.2 | 1.2; 37 | 37; 40 | 40; 67 | 34; 67 | 34; 67 | 34; 67 | 34; 60 | 30; NA; NA; NA
• GPU memory | bandwidth: up to 288 GB HBM3e | 8 TB/s; 186 GB HBM3e | 8 TB/s; up to 288 GB HBM3e | 8 TB/s; 186 GB HBM3e | 8 TB/s; 180 GB HBM3e | 7.7 TB/s; up to 144 GB HBM3e | up to 4.9 TB/s; 141 GB HBM3e | 4.8 TB/s; 141 GB HBM3e | 4.8 TB/s; 80 GB HBM3 | 3.35 TB/s; 94 GB HBM3 | 3.9 TB/s; 96 GB GDDR7 | 1.6 TB/s; 48 GB GDDR6 | 864 GB/s; 24 GB GDDR6 | 300 GB/s
• Multi-Instance GPU (MIG): up to 7 on the HBM-class GPUs; up to 4 on RTX PRO 6000; not supported on L40S and L4
• NVLink connectivity: 72; 72; 8; 8; 4; 2; 4 cards bridged; 4 or 8; 4 or 8; 2 cards bridged; -; -; -
• Media acceleration: 7 JPEG decoders and 7 video decoders⁴ on the HBM-class GPUs; RTX PRO 6000: 4 video encoders, 4 video decoders, 4 JPEG decoders; L40S: 3 video encoders, 3 video decoders, 4 JPEG decoders; L4: 2 video encoders, 4 video decoders, 4 JPEG decoders
• Ray tracing: RTX PRO 6000: 4th gen | 352; L40S: 3rd gen | 142; L4: 3rd gen | 60; not available on the HBM-class GPUs
• Transformer Engine: 2nd gen on Blackwell GPUs (including RTX PRO 6000); 1st gen on Hopper and Ada GPUs
• Graphics: for in-situ visualization only on the HBM-class GPUs (no NVIDIA vPC or RTX vWS); RTX PRO 6000: Best; L40S: Better; L4: Good
• vGPU: yes on RTX PRO 6000, L40S, and L4
• NVIDIA AI Enterprise: add-on for most configurations; included with H200 NVL and H100 NVL

1. Preliminary specifications, subject to change.
2. Specification in dense | sparse.
3. Specification in sparse only; dense is ½ the sparse spec shown.
4. Supported formats provide these speed-ups over H100 Tensor Core GPUs: 2X H.264, 1.25X HEVC, 1.25X VP9. AV1 support is new to Blackwell GPUs.
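Footnote 3's dense/sparse relationship is a simple halving; for example, a 3,958 sparse FP8 TFLOPS entry corresponds to 1,979 dense. A one-line sketch:

```python
def dense_from_sparse(sparse_tflops: float) -> float:
    # Footnote 3: dense throughput is half the sparse (structured-sparsity) figure.
    return sparse_tflops / 2

print(dense_from_sparse(3958))  # -> 1979.0
```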
