
HPE TekTalk: ProLiant AI Inference Solutions

November 2023

Brad Sweeney, Compute AI Solutions, HPE


Bilal Majeed, DL320 Product Manager, HPE
Devon Daniel, DL380a Product Manager, HPE
Trent Boyer, North America OEM Sales Director, NVIDIA
Confidential | Authorized HPE Partner Use Only

The overall AI market opportunity is large and growing fast

AI stack layers, with 2025 TAM and 2022-2025 CAGR:
• AI apps (applications with AI plug-ins and agents, e.g., digital twins, code bots, chatbots, search): $50B TAM, 25% CAGR
• AI models (customized models with domain-specific, proprietary customer data; open-source and closed-source models; model development and model deployment): $15B TAM, 50% CAGR
• AI model dev tools (data pipeline, data store/lifecycle, data governance): $20B TAM, 30% CAGR
• SW (IT operations management, cloud management, runtime OS/VM): $5B TAM, 25% CAGR
• HW: AI infra (servers & networking, storage): $35B TAM, 25% CAGR (opportunity for HPE Compute)
• HW: silicon (interconnect silicon, processor silicon such as CPU and GPU): $10B TAM, 30% CAGR

Source: IDC / Gartner / HPE Market Model, Jan 2023.

Develop and deploy AI at any scale

Training: build new AI models (HPE Cray)
• Supervised or unsupervised learning
• Use cases: generative AI training, computer vision training, LLM training, scientific discovery, financial market modeling
• Time to results: hours, days, weeks

Inference: run trained models (HPE ProLiant)
• Real-time execution
• Use cases: computer vision, natural language processing, fraud detection, predictive maintenance, medical imaging
• Time to results: microseconds, milliseconds, seconds


Accelerate your AI Inference Initiatives
NEW!

Computer Vision AI at the Edge
• Purpose-built for AI at the edge: loss prevention, smart spaces
• HPE ProLiant DL320 Gen11 with up to four NVIDIA L4 GPUs
• NVIDIA Metropolis ecosystem

Generative Visual AI
• Optimized for visual apps: 3D animation, image/video generation
• HPE ProLiant DL380a Gen11 with up to four NVIDIA L40 GPUs
• NVIDIA AI Enterprise suite

Natural Language Processing AI
• Powering large language models: speech AI, fraud detection
• HPE ProLiant DL380a Gen11 with up to four NVIDIA L40S GPUs
• NVIDIA AI Enterprise suite

Select configurations available for order today

Computer Vision AI at the Edge

HPE ProLiant DL320 Gen11 (NEW!)
Up to four NVIDIA L4 GPUs

Analyzes video footage from cameras to detect and track objects, people, and behaviors for enhanced operations, security, and smart space experiences.

Purpose-built for AI at the edge

HPE ProLiant DL320 Gen11
Up to four NVIDIA L4 GPUs

GPU Support: Front loaded - up to four NVIDIA L4 Tensor Core GPUs (up to 4 SW or 2 DW)
Processor: 4th Generation Intel® Xeon® Scalable Processors
Memory: Up to 2TB of DDR5 up to 4800 MT/s
Drive Count: Up to 4 SFF NVMe/SAS/SATA or up to 8 EDSFF E3.S 1T NVMe SSD
Boot Options: Optional internal RAID1 M.2 NVMe (hot pluggable); 2x SATA/NVMe M.2 connector (onboard)
I/O: Embedded 2x 1GbE networking ports; up to 2 x16 PCIe Gen5 slots; up to 1 x16 OCP slot
Storage Controller: Gen11 controllers (PCIe and OCP) + VROC NVMe/SATA
Management: iLO 6
Chassis Depth: 30.4" / 772mm


NVIDIA L4 for Efficient Video, AI, and Graphics

AI Video: 120X more performance
Graphics: 4X faster graphics with 3rd Gen RT Cores
Generative AI: 2.5X better performance with 4th Generation Tensor Cores

Single slot, low profile: fits any server

Measured Performance:
AI Video: 8x L4 vs. 2S Intel 8380 CPU server performance comparison: end-to-end video pipeline with CV-CUDA pre/post processing, decode, inference (SegFormer), encode, TensorRT 8.6 vs. CPU-only pipeline using OpenCV 4.7
Graphics: real-time rendering: NVIDIA Omniverse performance for real-time rendering at 1080p and 4K with DLSS 3
Generative AI: L4 vs. T4: image generation performance, 512x512 Stable Diffusion, FP16

L4 is powered by the NVIDIA Ada Lovelace architecture.
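
As a rough illustration of the generative AI workload cited above (not NVIDIA's benchmark harness, which used TensorRT-optimized pipelines), the following Python sketch runs 512x512 Stable Diffusion image generation in FP16 with Hugging Face diffusers on a single GPU such as an L4; the checkpoint and prompt are placeholders.

# Minimal sketch of the 512x512 Stable Diffusion FP16 workload referenced above.
# Plain PyTorch via diffusers; NVIDIA's published numbers used TensorRT instead.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base",  # public SD v2.1 (512px) checkpoint
    torch_dtype=torch.float16,                # FP16, as in the cited measurement
)
pipe = pipe.to("cuda")                        # e.g., an NVIDIA L4 in a DL320 Gen11

# Prompt is illustrative only.
image = pipe("retail store aisle, photorealistic", height=512, width=512).images[0]
image.save("sd_sample.png")
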
Vision AI architecture

[Architecture diagram] IP cameras and sensors, video files, and VMS/NVR systems deliver RTSP streams to an HPE ProLiant DL320 Gen11 server with NVIDIA L4 GPUs running video analytics platform software built on NVIDIA Metropolis. The server writes footage to video storage, produces alert and object metadata, sends real-time email alerts, and serves clients through a browser and mobile app.
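
To make the data flow above concrete, here is a simplified, hedged Python sketch of the same pattern: pull an RTSP stream, run a GPU object detector on each frame, and emit alert/object metadata. A production deployment would use NVIDIA Metropolis components (e.g., DeepStream) rather than this OpenCV plus torchvision example; the camera URL and confidence threshold are placeholders.

# Simplified illustration of the pipeline above: decode an RTSP stream, run a
# GPU object detector, and emit alert/object metadata. Not a Metropolis/DeepStream
# implementation; URL and threshold are placeholders.
import cv2
import torch
from torchvision.models.detection import (
    fasterrcnn_resnet50_fpn,
    FasterRCNN_ResNet50_FPN_Weights,
)

weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
model = fasterrcnn_resnet50_fpn(weights=weights).eval().to("cuda")
preprocess = weights.transforms()

cap = cv2.VideoCapture("rtsp://camera.example.local/stream1")  # placeholder camera URL
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    tensor = preprocess(torch.from_numpy(rgb).permute(2, 0, 1)).to("cuda")
    with torch.no_grad():
        detections = model([tensor])[0]
    # Emit "alert and object metadata": here, just print high-confidence hits.
    for label, score in zip(detections["labels"], detections["scores"]):
        if float(score) > 0.8:
            print(weights.meta["categories"][int(label)], float(score))
cap.release()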


The Vaidio AI Vision Platform from IronYun

Solution:
▪ 30+ advanced, AI-enabled video analytics on a single platform
▪ Monitor 1,000s of cameras in real time
▪ Search 1,000s of hours of video in seconds
▪ Mine the video stream for business intelligence data

Key features:
▪ Active real-time perimeter monitoring and intrusion detection
▪ Object recognition, detection & search
▪ Access control
▪ Forensic video search for incident investigation
▪ Works with any new or existing ONVIF IP camera
▪ Built-in integration with 28 video management systems (VMSs)

Instantly find an object from the past 24 hours of footage
▪ Scalable – from 10s to 1,000s of cameras
▪ Open – works with existing IP cameras and VMSs
▪ Comprehensive – 30+ advanced AI video analytics for security, safety, and operations
▪ Flexible – deploy analytics a la carte as needed and multiple analytics per camera


HPE ProLiant DL380a Gen11
Accelerator-optimized and purpose-built for AI
Up to four NVIDIA L40S GPUs

Processor: 4th Generation Intel® Xeon® Scalable Processors
Memory: Up to 3TB of DDR5 up to 4400 MT/s, or up to 2TB of DDR5 up to 4800 MT/s
Front Drive Count: Up to 8 SFF SSD (NVMe), or up to 8 EDSFF E3.S 1T NVMe SSD
Boot Options: Optional RAID1 M.2 NVMe; external/rear access (hot-pluggable) or internal (not hot-pluggable)
GPU Support: Up to 4 DW or 8 SW GPUs (front-loaded); NVIDIA H100, L40S, L40, L4
I/O: Up to 4 x16 PCIe Gen5 slots; up to 2 x16 OCP slots
Storage Controller: Gen11 controllers (PCIe and OCP) + VROC NVMe
Cooling: Air Cooling Module support
Management: iLO 6
Chassis Depth: 32.13" (SFF/EDSFF); LFF not applicable

Generative Visual AI

HPE ProLiant DL380a Gen11 (NEW!)
Up to four NVIDIA L40 GPUs
NVIDIA AI Enterprise

• Generate lifelike and dynamic 3D animations including character movements, physics simulations, and environmental effects.
• Curate new visual content, such as images or videos, that mimics real-world data.

Optimized for visual apps

Natural Language Processing AI

HPE ProLiant DL380a Gen11 (NEW!)
Up to four NVIDIA L40S GPUs
NVIDIA AI Enterprise

Develop and deploy end-to-end AI to power natural language processing for:
• Speech AI
• Fraud detection
• Predictive maintenance
• and more

Powering large language models

NVIDIA L40 GPU Accelerator for Visual AI

Unprecedented visual computing

Generative AI for the data center: 5X better performance with 4th Generation Tensor Cores (3)
Omniverse: 4X more performance with DLSS 3 (1)
Rendering: 5X faster graphics with 3rd Gen RT Cores (2)

Measured Performance:
1. Omniverse: tests run on a system with a 12th Gen Intel® Core™ i9-12900K, 64GB RAM, Windows 10E x64, driver 528.35 (GRID 15.1), 1x NVIDIA L40 vs. NVIDIA A40, DLSS 3 enabled on L40, real-time render mode
2. Rendering: tests run on a 2S system with 2x Xeon Gold 6126, 2.6GHz (3.7GHz Turbo), Win10 x64, NVIDIA driver version 526.08, Blender Cycles 2.93, Chaos Group V-Ray 5.0, Dassault Systèmes Stellar 3.0.45, Autodesk Arnold 7.1.0.0 GA release, Autodesk VRED 2022.2, 1x NVIDIA L40 vs. NVIDIA A40
3. Generative AI: Stable Diffusion v2.1, L4 (TensorRT 8.6.0), T4 (TensorRT 8.5.2), precision: FP16, image size: 512x512
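
The measurements above reference TensorRT-optimized pipelines. As a hedged sketch (not the actual benchmark setup), the snippet below shows how an ONNX model might be compiled into an FP16 TensorRT engine for an Ada-class GPU using the TensorRT Python API; the file names are placeholders.

# Rough sketch: build an FP16 TensorRT engine from an ONNX model, in the spirit
# of the TensorRT-based measurements cited above. Requires the tensorrt package;
# "model.onnx" and "model.plan" are placeholder file names.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:          # placeholder ONNX model
    if not parser.parse(f.read()):
        raise RuntimeError(str(parser.get_error(0)))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)        # enable FP16 kernels on Ada-class GPUs

engine_bytes = builder.build_serialized_network(network, config)
with open("model.plan", "wb") as f:
    f.write(engine_bytes)
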
NVIDIA L40S GPU Accelerator for Universal AI

The highest performance universal GPU for AI, graphics, and video

Fine-tuning LLM: 4 hrs, GPT-175B, 860M tokens (1)
AI training: 1.7X performance vs. HGX A100 (2)
AI inference: 1.5X performance vs. HGX A100 (3)
GPT-3 training: <4 days, GPT-175B, 300B tokens (4)
Image Gen AI: >82 images per minute (5)
Full video pipeline: 184 AV1 encode streams (6)

Preliminary performance projections, subject to change
1. Fine-tuning LoRA (GPT-175B), bs: 128, sl: 256; 64 GPUs: 16 systems with 4x L40S
2. Fine-tuning LoRA (GPT-40B), bs: 128, sl: 256; two systems with 4x L40S vs. HGX A100 8-GPU
3. Hugging Face SWIN Base inference (BS=1, seq 224); L40S vs. A100 80GB SXM
4. GPT-175B, 300B tokens, foundational training; 4K GPUs; 1,000 systems with 4x L40S
5. Image generation, Stable Diffusion v2.1, 512x512 resolution; 1x L40S
6. Concurrent encoding streams; 720p30; 1x L40S
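
Notes 1 and 2 describe LoRA fine-tuning of large GPT models. Those checkpoints are not public, so the sketch below only illustrates the general LoRA setup with Hugging Face PEFT on a small stand-in model (GPT-2); the rank and target modules are illustrative, not the benchmark configuration.

# Hedged sketch of the LoRA fine-tuning pattern referenced in notes 1-2.
# GPT-2 stands in for the non-public GPT-175B/GPT-40B checkpoints.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

lora = LoraConfig(
    r=8,                        # low-rank adapter dimension (illustrative)
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the adapters are trainable
# From here, train as usual (e.g., with transformers.Trainer) on a multi-GPU node.
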
PRODUCT LINE UP SPECIFICATIONS COMPARISON

                         L4                     L40                    L40S¹
GPU Architecture         NVIDIA Ada Lovelace    NVIDIA Ada Lovelace    NVIDIA Ada Lovelace
FP64                     N/A                    N/A                    N/A
FP32                     30 TFLOPS              90.5 TFLOPS            91.6 TFLOPS
RT Core                  73.1 TFLOPS            209 TFLOPS             212 TFLOPS
TF32 Tensor Core²        121 TFLOPS             181 TFLOPS             366 TFLOPS
FP16/BF16 Tensor Core²   242 TFLOPS             362 TFLOPS             733 TFLOPS
FP8 Tensor Core²         484 TFLOPS             724 TFLOPS             1466 TFLOPS
INT8 Tensor Core²        485 TOPS               724 TOPS               1466 TOPS
GPU Memory               24 GB GDDR6 w/ ECC     48 GB GDDR6 w/ ECC     48 GB GDDR6 w/ ECC
GPU Memory Bandwidth     300 GB/s               864 GB/s               864 GB/s
L2 Cache                 48 MB                  96 MB                  96 MB
Media Engines            2 NVENC (+AV1)         3 NVENC (+AV1)         3 NVENC (+AV1)
                         4 NVDEC                3 NVDEC                3 NVDEC
                         4 NVJPEG               4 NVJPEG               4 NVJPEG
Power                    Up to 72 W             Up to 300 W            Up to 350 W
Form Factor              1-slot LP              2-slot FHFL            2-slot FHFL
Interconnect             PCIe Gen4 x16:         PCIe Gen4 x16:         PCIe Gen4 x16:
                         64 GB/s                64 GB/s                64 GB/s

1. Preliminary specifications, subject to change.
2. Specifications with sparsity.
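
As a quick way to confirm which of these GPUs are installed in a server and cross-check the memory and power figures above, the following generic Python sketch (an addition for illustration, not part of the HPE/NVIDIA collateral) queries NVML through the pynvml bindings.

# Generic utility sketch: list installed GPUs with name, memory, and power limit
# via NVML (pip package "nvidia-ml-py" / pynvml), to sanity-check the table above.
import pynvml

pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    name = pynvml.nvmlDeviceGetName(handle)
    if isinstance(name, bytes):              # older pynvml versions return bytes
        name = name.decode()
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
    power_w = pynvml.nvmlDeviceGetPowerManagementLimit(handle) / 1000.0
    print(f"GPU {i}: {name}, {mem.total / 1024**3:.0f} GiB, power limit {power_w:.0f} W")
pynvml.nvmlShutdown()
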
NVIDIA AI Enterprise

Streamlines the development and deployment of production AI, including generative AI, computer vision, speech AI, and more.

• Optimized for inference: 40X faster than CPU-only platforms, lowering infrastructure and energy costs
• Over 100 frameworks, pretrained models, and development tools optimized for building and running AI on NVIDIA GPUs
• Deploy at scale with NVIDIA Triton™ Inference Server

Platform stack: AI workflows, frameworks, and pretrained models (medical imaging, conversational AI, recommenders, video analytics, logistics, robotics, speech AI, customer service, physics ML, communications, autonomous vehicles, cybersecurity); AI and data science development and deployment tools; cloud-native management and orchestration; infrastructure optimization; accelerated infrastructure (cloud, data center, edge, embedded).
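
To illustrate the "deploy at scale with NVIDIA Triton Inference Server" point, here is a minimal client-side sketch using the tritonclient HTTP API; the server URL, model name, and tensor names are hypothetical and must match the model repository configuration on your Triton deployment.

# Minimal client-side sketch for a model already served by NVIDIA Triton
# Inference Server. Model name and tensor names ("INPUT__0", "OUTPUT__0")
# are hypothetical placeholders.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

batch = np.random.rand(1, 3, 224, 224).astype(np.float32)    # placeholder input
infer_input = httpclient.InferInput("INPUT__0", list(batch.shape), "FP32")
infer_input.set_data_from_numpy(batch)

result = client.infer(
    model_name="resnet50",                                   # hypothetical model
    inputs=[infer_input],
    outputs=[httpclient.InferRequestedOutput("OUTPUT__0")],
)
print(result.as_numpy("OUTPUT__0").shape)
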
NVIDIA AI ENTERPRISE 4.0 RELEASE OVERVIEW

• Production-ready generative AI: NeMo Framework and new AI workflows for generative AI
• Inference deployment management: Triton Management Service
• Enterprise-grade manageability: Base Command Manager Essentials for cluster management
• GPU-accelerated Spark on-prem: support for RAPIDS Accelerator on-prem deployment
• Deep learning frameworks: NVIDIA Modulus, TAO, and Deep Graph Library
• Support for AI workstations: rapid AI development with NVIDIA RTX 6000 Ada

Business Value of an Enterprise Subscription
Enterprise-grade support, security, and reliability for production AI

Security & reliability: security reviews; priority notifications & fixes; API compatibility; lifecycle management; long-term support
Exclusive features: stable APIs & security patching; AI workflows; unencrypted pretrained models; AI management solutions (e.g., Triton Management Service)
Industry ecosystem: hardware certifications; software certifications; cloud certifications; MLOps certifications; multi-vendor case ownership
Ongoing delivery: patches; bug fixes; updates; upgrades; feature enhancements
Technical support: 24/7 case filing; 8-5 live NVIDIA agent access; unlimited incidents; specialty-based routing; NVIDIA customer portal/knowledge base


HPE ProLiant AI Inference Solutions

Computer vision AI at the edge
• Smart template UCID: 5134233671-03
• Server: HPE ProLiant DL320 Gen11
• GPU: up to four NVIDIA L4
• Use cases: computer vision & video analytics, loss prevention, smart spaces
• Software: ISVs enabled by NVIDIA Metropolis
• Positioning: purpose-built for AI at the edge
• Description: Ideal for computer vision and video analytics at the edge for customers in the retail, hospitality, and manufacturing sectors, the HPE ProLiant DL320 Gen11 server has a unique compact design purpose-built for edge computing. It can pack up to four NVIDIA L4 GPUs in a 1U form factor to power smart spaces and loss prevention solutions and deliver insights in near-real time. Customers can take advantage of offerings from NVIDIA's Metropolis ecosystem to deploy targeted solutions for their industry and use case.
• Key verticals & functions: retail, hospitality, manufacturing

Generative visual AI
• Smart template UCID: 5133904640-02
• Server: HPE ProLiant DL380a Gen11
• GPU: up to four NVIDIA L40 (up to three today, four in October*)
• Use cases: 3D animation, image generation, video generation
• Software: NVIDIA AI Enterprise (available now in WISE; SKU coming in Oct 2023)
• Positioning: optimized for visual apps
• Description: For generative visual AI to drive product design, 3D animation, or image and video generation, the HPE ProLiant DL380a Gen11 server with four NVIDIA L40 GPUs delivers the rendering and design performance needed by demanding visual applications used in the media and entertainment, healthcare, and manufacturing industries. The NVIDIA AI Enterprise suite offers pre-trained models to streamline development and deployment.
• Key verticals & functions: media & entertainment, healthcare, manufacturing

Natural language processing AI
• Smart template UCID: 5133904763-02
• Server: HPE ProLiant DL380a Gen11
• GPU: up to four NVIDIA L40S
• Use cases: speech AI, fraud detection, predictive maintenance
• Software: NVIDIA AI Enterprise (available now in WISE; SKU coming in Oct 2023)
• Positioning: powering large language models
• Description: To drive enterprise implementations of natural language processing, such as speech AI and fraud detection, HPE designed the ultra-scalable HPE ProLiant DL380a Gen11 server. Together with NVIDIA L40S GPUs and the full suite of NVIDIA AI Enterprise tools, this is a great AI platform for running large language models used by the financial services and manufacturing sectors, as well as customer service across a range of industries.
• Key verticals & functions: financial services, manufacturing, customer service

* Dates subject to change. Learn more: www.hpe.com/ProLiant/AI

Why HPE ProLiant Gen11 for AI?

1. Innovate with more graphical capabilities than ever before
2. Expect more from your infrastructure
3. Control cost by rightsizing and scaling as needed
4. Get an intuitive cloud operating experience
5. Protect your infrastructure with trusted security by design

Thank you

Access the QRG here

bradley.sweeney@hpe.com
bilal.majeed@hpe.com
devon.daniel@hpe.com
tboyer@nvidia.com
Confidential | Authorized HPE Partner Use Only
© 2023 Hewlett Packard Enterprise Development LP
