RTX PRO Server Partner Deck
*What is a token? A token is about half a word, so a 10-word ChatGPT query is roughly 20-30 tokens.
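The rule of thumb above can be written as a quick estimator. This is a minimal sketch using the slide's own approximation of roughly 2-3 tokens per word; real tokenizers vary by model and language:

```python
def estimate_token_range(num_words):
    """Rough token estimate from the slide's rule of thumb:
    a token is about half a word, i.e. roughly 2-3 tokens per word."""
    return 2 * num_words, 3 * num_words

# A 10-word ChatGPT query lands at roughly 20-30 tokens.
low, high = estimate_token_range(10)
```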
Not All Factories for Manufacturing Are the Same
• Agentic AI: 6X throughput for LLM inference
• Industrial & Physical AI: 4X faster synthetic data generation
• Scientific Computing, Data Analytics, & Simulation: 7X faster genome sequence alignment; 3X throughput for engineering simulation
• Visual Computing: 4X higher FPS for real-time rendering
• Enterprise Applications: 4X concurrent workloads with Multi-Instance GPU
NEW - Limited-Time Promotion: RTX PRO Server Software Kit with NVIDIA Run:ai
RTX PRO Server 7kW power requirement based on 8x RTX PRO 6000 Blackwell Server Edition GPUs
Announcing New Mainstream NVIDIA RTX PRO Servers
High-volume servers from global system partners
Multi-Workload Acceleration
Available from global system partners starting Fall 2025. NVIDIA RTX PRO Server (2U chassis, 2x RTX PRO 6000 Blackwell Server Edition GPUs, 2x x86 CPUs, 3kW) vs. 2x x86 CPU server (2U chassis, 1.1kW). Multi-workload relative performance: geomean of measured performance speedups for HPC applications (FP32), rendering (VRED), Spark RAPIDS, and encode streams (1080p30, HEVC). Shown for representation only. Please contact partners for pricing.
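The footnote aggregates per-workload speedups with a geometric mean. A minimal sketch of that aggregation follows; the speedup values in the example are hypothetical, not figures from this deck:

```python
from math import prod

def geomean(speedups):
    """Geometric mean: the n-th root of the product of n per-workload speedups."""
    return prod(speedups) ** (1.0 / len(speedups))

# Hypothetical per-workload speedups vs. the CPU-only baseline:
# one workload at 2x and one at 8x average to 4x, not 5x, because the
# geometric mean damps outliers relative to the arithmetic mean.
combined = geomean([2.0, 8.0])
```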
Accelerate the Shift from CPU Servers
To mainstream 2U servers with RTX PRO 6000 Blackwell Server Edition GPUs
[Chart: Performance/$ and Performance/W relative to a CPU-only 2U server; y-axis 0x to 18x]
[Chart: RTX PRO 6000 vs. L40S across 4K (FPS), 4K DLSS4 (FPS), Rendering (Time), Omniverse Interactive Rendering (FPS), Encode (Streams), Decode (Streams); 1.1X to 2.3X, L40S = 1.0X baseline]
Measured Performance
1. SPECviewperf 15 4K geomean
2. Cyberpunk 2077, RT: Overdrive, SR: Performance, RR: On; RTX PRO 6000 (DLSS4, FG 4x), L40S (DLSS3, FG 2x)
3. Autodesk Arnold, Chameleon scene
4. Omniverse EBC Factory 4K animated; RTX PRO 6000 (DLSS4), L40S (DLSS3)
Projected Performance
5. H.264 encode; 1080p30 streams, LL p1
6. H.264 decode; 1080p30 streams, LL p1
Performance/$ = Performance / TCO for a single node (server + power costs), L40S compared to RTX PRO 6000 Blackwell Server Edition
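The Performance/$ metric divides measured performance by single-node TCO (server plus power costs). A minimal sketch of that calculation; every price, wattage, and energy rate below is hypothetical and not taken from this deck:

```python
def perf_per_dollar(perf, server_cost_usd, avg_power_kw, hours, usd_per_kwh):
    """Performance/$ = performance / single-node TCO (server + power costs)."""
    tco = server_cost_usd + avg_power_kw * hours * usd_per_kwh
    return perf / tco

# Hypothetical example over a 3-year service period (3 * 8760 hours):
# a node that is 3x faster but costs more up front can still win on perf/$.
baseline = perf_per_dollar(perf=1.0, server_cost_usd=50_000,
                           avg_power_kw=1.1, hours=3 * 8760, usd_per_kwh=0.10)
rtx_node = perf_per_dollar(perf=3.0, server_cost_usd=90_000,
                           avg_power_kw=3.0, hours=3 * 8760, usd_per_kwh=0.10)
relative = rtx_node / baseline  # > 1 means better performance per dollar
```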
Best Server for RTX Graphics and Visual Computing
RTX PRO Server Up to 3.3X Better Performance
[Chart: RTX PRO 6000 vs. L40S across 4K (FPS), 4K DLSS (FPS), Rendering (Time), Omniverse Interactive Rendering (FPS), Encode (Streams), Decode (Streams); 1.4X to 3.3X, L40S = 1.0X baseline]
Visual Computing Ecosystem
Applications for Creative and Technical Professionals
Best Server for Industrial AI and Physical AI
RTX PRO Server Up to 2.9X Better Price Performance
[Chart: price performance vs. L40S for Factory Digital Twin (Simulation), Factory Digital Twin (Layout Planning), Synthetic Data Generation (Video Generation), Synthetic Data Generation (Batch Rendering); 2.1X to 2.9X]
Performance/$ = Performance / TCO for a Single Node (Server + Power Costs) for L40S compared to RTX PRO 6000 Blackwell Server Edition
Best Server for Industrial AI and Physical AI
RTX PRO Server Up to 4X Better Performance
[Chart: performance vs. L40S for Factory Digital Twin (Simulation), Factory Digital Twin (Layout Planning), Synthetic Data Generation (Video Generation), Synthetic Data Generation (Batch Rendering); 2.6X to 3.7X]
• CAE Digital Twins: NVIDIA CUDA-X and Omniverse
• Factory Digital Twins: NVIDIA Omniverse and Metropolis
• Robotics Simulation: NVIDIA Omniverse and Isaac Sim
• Synthetic Data Generation: NVIDIA Omniverse and Cosmos
• Vision AI Agents: NVIDIA Metropolis and Cosmos
Best Server for Enterprise HPC
RTX PRO Server Up to 5X Better Price Performance
[Chart: price performance vs. HGX H100 across Discrete Element Modeling, Fluid Dynamics, Data Analytics, Molecular Dynamics, Energy, Genomics; 1.3X to 5.0X]
Measured Performance
• Discrete Element Modeling: Altair EDEM, Mill, elapsed time (ms)
• Computational Fluid Dynamics: Altair NFX
• Data Analytics: TPC-DS queries, FP32; graph queries for data science/analysis
• Molecular Dynamics: GROMACS, FP32; widely used molecular simulation package
• Energy: RTM [TTI r8 2-pass], reverse time migration, used for seismic imaging; FP32
• Genomics: Smith-Waterman, FP32
Performance/$ = Performance / TCO for a single node (server + power costs), HGX H100 compared to RTX PRO 6000 Blackwell Server Edition
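The genomics benchmark above is Smith-Waterman local alignment. A minimal pure-Python sketch of the scoring recurrence follows; the match/mismatch/gap parameters are illustrative, and production kernels run this with vectorized low-precision integer arithmetic rather than a naive double loop:

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # DP matrix, zero-initialized
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: cell scores never drop below zero
            H[i][j] = max(0, H[i - 1][j - 1] + s,
                          H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best
```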
Best Server for Enterprise HPC
RTX PRO Server Up to 2.3X Better Performance
[Chart: performance vs. HGX H100 across Discrete Element Modeling, Fluid Dynamics, Data Analytics, Molecular Dynamics, Energy, Genomics]
AI Inference Performance/$
[Chart: AI inference performance/$ normalized to L40S for Flux 12B, Cosmos, Llama 3 8B, Llama 3 70B, Mixtral 8x7B; 1.8X to 4.4X]
AI Inference Performance
[Chart: AI inference performance normalized to L40S for Flux 12B, Cosmos, Llama 3 8B, Llama 3 70B, Mixtral 8x7B; 2.6X to 5.6X]
Agent Applications: Supply Chain Agent, Security Agent, Customer Service Agent, Coding Agent, Deep Research Agent, Bio-Medical Agent
AIOps
Orchestration
• LLM as a Service: improve customer personalization
• Call Center Analytics Agent: 84% decrease in cost
• Recommendation Agent: over 1M datapoints and product formulations
• Geoscience Agent: 80% better data utilization improves yields
• Trading Insights Agent: 60% faster insights
• Voice AI Agent: enabling AI-powered voice services
• Drug Discovery Agent: improve pharmaceutical development
• Supply Chain Agent: transform operations planning, logistics, and warehouse management
Industry Leaders Transforming Business With RTX PRO Server
Graphics, Industrial AI, Scientific Computing and Agentic AI Use Cases
NVIDIA Enterprise AI Factory Options
[Decision flow: for use cases with additional workloads (data analytics, simulation, FP32 HPC), the RTX PRO Server is the indicated option]
NVIDIA Blackwell Accelerated Computing Comparison
[Chart: performance/$ of RTX PRO Server vs. HGX B200 across Llama3 70B Fine Tuning, Llama3 70B Inference, Llama3 8B Inference, Image Gen, Video Gen, Data Analytics, Molecular Dynamics, Genomics, Video Data Curation, Graphics Rendering, Virtual Workstations; 0.2X to 1.5X, with some workloads marked RTX PRO Server only]
Projected Performance - RTX PRO 6000 Blackwell Server Edition vs. HGX B200
1. LLM fine-tuning: Llama 3 70B; 64x GPU; GBS 128; SL 4K; FP8
2. LLM inference: Llama 3 70B; 2k/256; 20 t/s; 2s FTL; FP4
3. LLM inference: Llama 3 8B; 2k/256; 20 t/s; 2s FTL; FP4
4. Image generation: FLUX 12B; 1024x1024; min latency
5. Video generation: COSMOS VFM 7B; 720p video; min latency
6. Data analytics: GQE queries; FP32
7. Molecular dynamics: GROMACS; Cellulose (ns/day); FP32
Measured Performance
1. Genomics: Smith-Waterman (TCUPS); 8-bit
AI Factory Workload Projection
World-Class AI Factories for Multiple Enterprise Workloads
From heavy model training and inference workloads to inference and mixed enterprise workloads:
• Training: model size medium to small
• Fine Tuning: model size medium to small
• Inference: AI agents, SDG, VSS, CV, RecSys
• RAG/Embedding: multimodal retrieval pipelines
• Data Analytics: Spark RAPIDS, data science
• Simulation: CAE, CFD, scientific simulation
• Visual Computing: digital twins, CAD, rendering, VDI
Illustrative chart for mapping workloads within an AI Factory for a representative enterprise customer
RTX PRO Server
Example 4U System Configuration
Networking (E/W)1:
• 4x ConnectX-7 (1x 400Gb), or
• 4x BlueField 3140H, or
• 4x ConnectX-8 SuperNIC with PCIe Gen 6 switch
1. Consult with your OEM partner for specific networking (E/W) configurations
2. Recommended minimum.
RTX PRO Server
Example 2U System Configuration
1. Consult with your OEM partner for specific networking (E/W) configurations
2. Recommended minimum.
HGX B200
Example System Configuration
GPUs • HGX B200 8-GPU Baseboard
• GB200 / GB300 (NVL72): for massive training and inference; graph neural networks, massive-scale LLM training and inference; 405B-1T+ LLM; 270GB, <1,400W; AI-optimized rack scale; Quantum IB fabric, E/W: CX7, N/S: B3220, RA: 2-8-9-400/800, dual plane; highest model size and performance; liquid cooling only
• HGX B200 / B300 (NVL8): blend of large-model training and inference; LLM, RAG AI, HPC data analytics; up to 405B LLM; 186GB <1,000W / 288GB <1,100W; AI-optimized servers, 8 GPUs per server; Spectrum-X fabric, E/W: B3410H, N/S: B3220, RA: 2-8-9-400/800, dual plane; high-performance AI compute and HPC; power/compute value optimized, power intense; air and liquid options
• H200 NVL (NVL2 / NVL4, dual-slot FHFL): flexible performance for enterprise AI and HPC; LLM, RAG AI, HPC data analytics; up to 175B LLM; 141GB, 600W; enterprise AI-optimized servers, 2-8 GPUs per server; Spectrum-X fabric, E/W: B3410H, N/S: B3220, RA: 2-8-5-200, 2-4-5-400, 2-4-3-200; high-performance AI compute and HPC, power/compute value optimized
• RTX PRO 6000 (dual-slot FHFL): universal enterprise AI, industrial AI, visual computing; standard and multimodal generative AI, fine-tune training/inference, Omniverse; up to 70B LLM; 96GB, 600W; enterprise AI-optimized servers, 2-8 GPUs per server; Spectrum-X fabric, E/W: B3410H, N/S: B3220, RA: 2-8-5-200/400; AI + graphics; performant, universal, video, standard power
• L4 (single-slot HHHL): edge AI, inference, video; LLM-7B; 24GB, 72W; mainstream AI-optimized servers, 1-16 GPUs per server; Ethernet networking; entry-level universal, edge, video, power limited
Data Center Accelerated Computing Portfolio
[Spec table spanning the portfolio from NVL72 rack-scale systems (MGX rack, 72-GPU) through HGX 8-SXM/4-SXM and NVL-bridged PCIe Gen5 servers down to single-slot low-profile PCIe Gen4 cards. Rows: Value Proposition (highest-perf AI reasoning and training through mainstream LLM, universal AI, video, and graphics); Form Factor; Single GPU TDP (72W up to 1,400W); FP4/FP8/INT8/FP16/TF32 tensor core throughput (up to 20,000 FP4 TC TFLOPS at rack scale; RTX PRO 6000: 1,900 | 3,753 FP4 TC TFLOPS); GPU Memory | Bandwidth (24GB GDDR6 at 300 GB/s up to 288GB HBM3e at 8 TB/s; RTX PRO 6000: 96GB GDDR7 at 1.6 TB/s); NVLink Connectivity (72, 8, 4, or 2-4 bridged cards); Media Acceleration (video encoders/decoders and JPEG decoders); Ray Tracing (up to 4th Gen | 352); Transformer Engine (1st/2nd Gen).]