
HPE TekTalk: ProLiant AI Inference Solutions

November 2023

Brad Sweeney, Compute AI Solutions, HPE


Bilal Majeed, DL320 Product Manager, HPE
Devon Daniel, DL380a Product Manager, HPE
Trent Boyer, North America OEM Sales Director, NVIDIA
Confidential | Authorized HPE Partner Use Only

The overall AI market opportunity is large and growing fast

AI stack layers, with 2025 TAM and 2022-2025 CAGR:
• AI apps (applications with AI plug-ins and agents, e.g., digital twins, code bots, chatbots, search): $50B TAM, 25% CAGR
• AI models (customized models with domain-specific, proprietary customer data; open-source and closed-source models; model development and model deployment): $15B TAM, 50% CAGR
• AI model dev tools (data pipeline, data store/lifecycle, data governance): $20B TAM, 30% CAGR
• SW (IT operations management, cloud management, runtime OS/VM): $5B TAM, 25% CAGR
• HW: AI infra (servers & networking, storage): $35B TAM, 25% CAGR (opportunity for HPE Compute)
• HW: silicon (interconnect silicon, processor silicon such as CPU and GPU): $10B TAM, 30% CAGR

Source: IDC / Gartner / HPE Market Model, Jan 2023.

Develop and deploy AI at any scale

Training: build new AI models (HPE Cray)
• Supervised or unsupervised learning
• Use cases: generative AI training, computer vision training, LLM training, scientific discovery, financial market modeling
• Time to results: hours, days, weeks

Inference: run trained models (HPE ProLiant)
• Real-time execution
• Use cases: computer vision, natural language processing, fraud detection, predictive maintenance, medical imaging
• Time to results: microseconds, milliseconds, seconds


Accelerate your AI Inference Initiatives
NEW!

Computer Vision AI at the Edge
• Purpose-built for AI at the edge: loss prevention, smart spaces
• HPE ProLiant DL320 Gen11 with up to four NVIDIA L4 GPUs
• NVIDIA Metropolis ecosystem

Generative Visual AI
• Optimized for visual apps: 3D animation, image/video generation
• HPE ProLiant DL380a Gen11 with up to four NVIDIA L40 GPUs
• NVIDIA AI Enterprise suite

Natural Language Processing AI
• Powering large language models: speech AI, fraud detection
• HPE ProLiant DL380a Gen11 with up to four NVIDIA L40S GPUs
• NVIDIA AI Enterprise suite

Select configurations available for order today

Computer Vision AI at the Edge

HPE ProLiant DL320 Gen11 (NEW!)
Up to four NVIDIA L4 GPUs

Analyzes video footage from cameras to detect and track objects, people, and behaviors for enhanced operations, security, and smart space experiences.

Purpose-built for AI at the edge

HPE ProLiant DL320 Gen11
Up to four NVIDIA L4 GPUs

GPU Support: Front loaded - up to four NVIDIA L4 Tensor Core GPUs (up to 4 SW or 2 DW)
Processor: 4th Generation Intel® Xeon® Scalable Processors
Memory: Up to 2TB of DDR5 up to 4800 MT/s
Drive Count: Up to 4 SFF NVMe/SAS/SATA or up to 8 EDSFF E3.S 1T NVMe SSD
Boot Options: Optional internal RAID1 M.2 NVMe (hot pluggable); 2x SATA/NVMe M.2 connector (onboard)
I/O: Embedded 2x 1GbE networking ports; up to 2 x16 PCIe Gen5 slots; up to 1 x16 OCP slot
Storage Controller: Gen11 controllers (PCIe and OCP) + VROC NVMe/SATA
Management: iLO 6
Chassis Depth: 30.4" / 772mm


NVIDIA L4 for Efficient Video, AI, and Graphics

AI Video: 120X more performance
Graphics: 4X faster graphics with 3rd Gen RT Cores
Generative AI: 2.5X better performance with 4th Generation Tensor Cores

Single slot, low profile: fits any server

Measured Performance:
AI Video: 8x L4 vs. 2S Intel 8380 CPU server performance comparison: end-to-end video pipeline with CV-CUDA pre/post processing, decode, inference (SegFormer), encode, TensorRT 8.6 vs. CPU-only pipeline using OpenCV 4.7
Graphics: real-time rendering: NVIDIA Omniverse performance for real-time rendering at 1080p and 4K with DLSS 3
Generative AI: L4 vs. T4: image generation performance, 512x512 Stable Diffusion, FP16

L4 is powered by the NVIDIA Ada Lovelace architecture.
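
As a rough illustration of the generative AI workload cited above (not NVIDIA's benchmark harness, which used TensorRT-optimized pipelines), the following Python sketch runs 512x512 Stable Diffusion image generation in FP16 with Hugging Face diffusers on a single GPU such as an L4; the checkpoint and prompt are placeholders.

# Minimal sketch of the 512x512 Stable Diffusion FP16 workload referenced above.
# Plain PyTorch via diffusers; NVIDIA's published numbers used TensorRT instead.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base",  # public SD v2.1 (512px) checkpoint
    torch_dtype=torch.float16,                # FP16, as in the cited measurement
)
pipe = pipe.to("cuda")                        # e.g., an NVIDIA L4 in a DL320 Gen11

# Prompt is illustrative only.
image = pipe("retail store aisle, photorealistic", height=512, width=512).images[0]
image.save("sd_sample.png")
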
Vision AI architecture

[Architecture diagram] IP cameras and sensors, video files, and VMS/NVR systems deliver RTSP streams to an HPE ProLiant DL320 Gen11 server with NVIDIA L4 GPUs running video analytics platform software built on NVIDIA Metropolis. The server writes footage to video storage, produces alert and object metadata, sends real-time email alerts, and serves clients through a browser and mobile app.
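
To make the data flow above concrete, here is a simplified, hedged Python sketch of the same pattern: pull an RTSP stream, run a GPU object detector on each frame, and emit alert/object metadata. A production deployment would use NVIDIA Metropolis components (e.g., DeepStream) rather than this OpenCV plus torchvision example; the camera URL and confidence threshold are placeholders.

# Simplified illustration of the pipeline above: decode an RTSP stream, run a
# GPU object detector, and emit alert/object metadata. Not a Metropolis/DeepStream
# implementation; URL and threshold are placeholders.
import cv2
import torch
from torchvision.models.detection import (
    fasterrcnn_resnet50_fpn,
    FasterRCNN_ResNet50_FPN_Weights,
)

weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
model = fasterrcnn_resnet50_fpn(weights=weights).eval().to("cuda")
preprocess = weights.transforms()

cap = cv2.VideoCapture("rtsp://camera.example.local/stream1")  # placeholder camera URL
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    tensor = preprocess(torch.from_numpy(rgb).permute(2, 0, 1)).to("cuda")
    with torch.no_grad():
        detections = model([tensor])[0]
    # Emit "alert and object metadata": here, just print high-confidence hits.
    for label, score in zip(detections["labels"], detections["scores"]):
        if float(score) > 0.8:
            print(weights.meta["categories"][int(label)], float(score))
cap.release()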


The Vaidio AI Vision Platform from IronYun

Solution:
▪ 30+ advanced, AI-enabled video analytics on a single platform
▪ Monitor 1,000s of cameras in real time
▪ Search 1,000s of hours of video in seconds
▪ Mine the video stream for business intelligence data

Key features:
▪ Active real-time perimeter monitoring and intrusion detection
▪ Object recognition, detection & search
▪ Access control
▪ Forensic video search for incident investigation
▪ Works with any new or existing ONVIF IP camera
▪ Built-in integration with 28 video management systems (VMSs)

Instantly find an object from the past 24 hours of footage
▪ Scalable – from 10s to 1,000s of cameras
▪ Open – works with existing IP cameras and VMSs
▪ Comprehensive – 30+ advanced AI video analytics for security, safety, and operations
▪ Flexible – deploy analytics a la carte as needed and multiple analytics per camera


HPE ProLiant DL380a Gen11
Accelerator-optimized and purpose-built for AI
Up to four NVIDIA L40S GPUs

Processor: 4th Generation Intel® Xeon® Scalable Processors
Memory: Up to 3TB of DDR5 up to 4400 MT/s, or up to 2TB of DDR5 up to 4800 MT/s
Front Drive Count: Up to 8 SFF SSD (NVMe), or up to 8 EDSFF E3.S 1T NVMe SSD
Boot Options: Optional RAID1 M.2 NVMe; external/rear access (hot-pluggable) or internal (not hot-pluggable)
GPU Support: Up to 4 DW or 8 SW GPUs (front-loaded); NVIDIA H100, L40S, L40, L4
I/O: Up to 4 x16 PCIe Gen5 slots; up to 2 x16 OCP slots
Storage Controller: Gen11 controllers (PCIe and OCP) + VROC NVMe
Cooling: Air Cooling Module support
Management: iLO 6
Chassis Depth: 32.13" (SFF/EDSFF); LFF not applicable

Generative Visual AI

HPE ProLiant DL380a Gen11 (NEW!)
Up to four NVIDIA L40 GPUs
NVIDIA AI Enterprise

• Generate lifelike and dynamic 3D animations including character movements, physics simulations, and environmental effects.
• Curate new visual content, such as images or videos, that mimics real-world data.

Optimized for visual apps

Natural Language Processing AI

HPE ProLiant DL380a Gen11 (NEW!)
Up to four NVIDIA L40S GPUs
NVIDIA AI Enterprise

Develop and deploy end-to-end AI to power natural language processing for:
• Speech AI
• Fraud detection
• Predictive maintenance
• and more

Powering large language models

NVIDIA L40 GPU Accelerator for Visual AI

Unprecedented visual computing

Generative AI for the data center: 5X better performance with 4th Generation Tensor Cores (3)
Omniverse: 4X more performance with DLSS 3 (1)
Rendering: 5X faster graphics with 3rd Gen RT Cores (2)

Measured Performance:
1. Omniverse: tests run on a system with a 12th Gen Intel® Core™ i9-12900K, 64GB RAM, Windows 10E x64, driver 528.35 (GRID 15.1), 1x NVIDIA L40 vs. NVIDIA A40, DLSS 3 enabled on L40, real-time render mode
2. Rendering: tests run on a 2S system with 2x Xeon Gold 6126, 2.6GHz (3.7GHz Turbo), Win10 x64, NVIDIA driver version 526.08, Blender Cycles 2.93, Chaos Group V-Ray 5.0, Dassault Systèmes Stellar 3.0.45, Autodesk Arnold 7.1.0.0 GA release, Autodesk VRED 2022.2, 1x NVIDIA L40 vs. NVIDIA A40
3. Generative AI: Stable Diffusion v2.1, L4 (TensorRT 8.6.0), T4 (TensorRT 8.5.2), precision: FP16, image size: 512x512
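
The measurements above reference TensorRT-optimized pipelines. As a hedged sketch (not the actual benchmark setup), the snippet below shows how an ONNX model might be compiled into an FP16 TensorRT engine for an Ada-class GPU using the TensorRT Python API; the file names are placeholders.

# Rough sketch: build an FP16 TensorRT engine from an ONNX model, in the spirit
# of the TensorRT-based measurements cited above. Requires the tensorrt package;
# "model.onnx" and "model.plan" are placeholder file names.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:          # placeholder ONNX model
    if not parser.parse(f.read()):
        raise RuntimeError(str(parser.get_error(0)))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)        # enable FP16 kernels on Ada-class GPUs

engine_bytes = builder.build_serialized_network(network, config)
with open("model.plan", "wb") as f:
    f.write(engine_bytes)
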
NVIDIA L40S GPU Accelerator for Universal AI

The highest performance universal GPU for AI, graphics, and video

Fine-tuning LLM: 4 hrs, GPT-175B, 860M tokens (1)
AI training: 1.7X performance vs. HGX A100 (2)
AI inference: 1.5X performance vs. HGX A100 (3)
GPT-3 training: <4 days, GPT-175B, 300B tokens (4)
Image Gen AI: >82 images per minute (5)
Full video pipeline: 184 AV1 encode streams (6)

Preliminary performance projections, subject to change
1. Fine-tuning LoRA (GPT-175B), bs: 128, sl: 256; 64 GPUs: 16 systems with 4x L40S
2. Fine-tuning LoRA (GPT-40B), bs: 128, sl: 256; two systems with 4x L40S vs. HGX A100 8-GPU
3. Hugging Face SWIN Base inference (BS=1, seq 224); L40S vs. A100 80GB SXM
4. GPT-175B, 300B tokens, foundational training; 4K GPUs; 1,000 systems with 4x L40S
5. Image generation, Stable Diffusion v2.1, 512x512 resolution; 1x L40S
6. Concurrent encoding streams; 720p30; 1x L40S
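
Notes 1 and 2 describe LoRA fine-tuning of large GPT models. Those checkpoints are not public, so the sketch below only illustrates the general LoRA setup with Hugging Face PEFT on a small stand-in model (GPT-2); the rank and target modules are illustrative, not the benchmark configuration.

# Hedged sketch of the LoRA fine-tuning pattern referenced in notes 1-2.
# GPT-2 stands in for the non-public GPT-175B/GPT-40B checkpoints.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

lora = LoraConfig(
    r=8,                        # low-rank adapter dimension (illustrative)
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the adapters are trainable
# From here, train as usual (e.g., with transformers.Trainer) on a multi-GPU node.
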
PRODUCT LINE UP SPECIFICATIONS COMPARISON

                         L4                     L40                    L40S¹
GPU Architecture         NVIDIA Ada Lovelace    NVIDIA Ada Lovelace    NVIDIA Ada Lovelace
FP64                     N/A                    N/A                    N/A
FP32                     30 TFLOPS              90.5 TFLOPS            91.6 TFLOPS
RT Core                  73.1 TFLOPS            209 TFLOPS             212 TFLOPS
TF32 Tensor Core²        121 TFLOPS             181 TFLOPS             366 TFLOPS
FP16/BF16 Tensor Core²   242 TFLOPS             362 TFLOPS             733 TFLOPS
FP8 Tensor Core²         484 TFLOPS             724 TFLOPS             1466 TFLOPS
INT8 Tensor Core²        485 TOPS               724 TOPS               1466 TOPS
GPU Memory               24 GB GDDR6 w/ ECC     48 GB GDDR6 w/ ECC     48 GB GDDR6 w/ ECC
GPU Memory Bandwidth     300 GB/s               864 GB/s               864 GB/s
L2 Cache                 48 MB                  96 MB                  96 MB
Media Engines            2 NVENC (+AV1)         3 NVENC (+AV1)         3 NVENC (+AV1)
                         4 NVDEC                3 NVDEC                3 NVDEC
                         4 NVJPEG               4 NVJPEG               4 NVJPEG
Power                    Up to 72 W             Up to 300 W            Up to 350 W
Form Factor              1-slot LP              2-slot FHFL            2-slot FHFL
Interconnect             PCIe Gen4 x16:         PCIe Gen4 x16:         PCIe Gen4 x16:
                         64 GB/s                64 GB/s                64 GB/s

1. Preliminary specifications, subject to change.
2. Specifications with sparsity.
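
As a quick way to confirm which of these GPUs are installed in a server and cross-check the memory and power figures above, the following generic Python sketch (an addition for illustration, not part of the HPE/NVIDIA collateral) queries NVML through the pynvml bindings.

# Generic utility sketch: list installed GPUs with name, memory, and power limit
# via NVML (pip package "nvidia-ml-py" / pynvml), to sanity-check the table above.
import pynvml

pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    name = pynvml.nvmlDeviceGetName(handle)
    if isinstance(name, bytes):              # older pynvml versions return bytes
        name = name.decode()
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
    power_w = pynvml.nvmlDeviceGetPowerManagementLimit(handle) / 1000.0
    print(f"GPU {i}: {name}, {mem.total / 1024**3:.0f} GiB, power limit {power_w:.0f} W")
pynvml.nvmlShutdown()
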
NVIDIA AI Enterprise

Streamlines the development and deployment of production AI, including generative AI, computer vision, speech AI, and more.

• Optimized for inference: 40X faster than CPU-only platforms, lowering infrastructure and energy costs
• Over 100 frameworks, pretrained models, and development tools optimized for building and running AI on NVIDIA GPUs
• Deploy at scale with NVIDIA Triton™ Inference Server

Platform stack: AI workflows, frameworks, and pretrained models (medical imaging, conversational AI, recommenders, video analytics, logistics, robotics, speech AI, customer service, physics ML, communications, autonomous vehicles, cybersecurity); AI and data science development and deployment tools; cloud-native management and orchestration; infrastructure optimization; accelerated infrastructure (cloud, data center, edge, embedded).
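
To illustrate the "deploy at scale with NVIDIA Triton Inference Server" point, here is a minimal client-side sketch using the tritonclient HTTP API; the server URL, model name, and tensor names are hypothetical and must match the model repository configuration on your Triton deployment.

# Minimal client-side sketch for a model already served by NVIDIA Triton
# Inference Server. Model name and tensor names ("INPUT__0", "OUTPUT__0")
# are hypothetical placeholders.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

batch = np.random.rand(1, 3, 224, 224).astype(np.float32)    # placeholder input
infer_input = httpclient.InferInput("INPUT__0", list(batch.shape), "FP32")
infer_input.set_data_from_numpy(batch)

result = client.infer(
    model_name="resnet50",                                   # hypothetical model
    inputs=[infer_input],
    outputs=[httpclient.InferRequestedOutput("OUTPUT__0")],
)
print(result.as_numpy("OUTPUT__0").shape)
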
NVIDIA AI ENTERPRISE 4.0 RELEASE OVERVIEW

• Production-ready generative AI: NeMo Framework and new AI workflows for generative AI
• Inference deployment management: Triton Management Service
• Enterprise-grade manageability: Base Command Manager Essentials for cluster management
• GPU-accelerated Spark on-prem: support for RAPIDS Accelerator on-prem deployment
• Deep learning frameworks: NVIDIA Modulus, TAO, and Deep Graph Library
• Support for AI workstations: rapid AI development with NVIDIA RTX 6000 Ada

Business Value of an Enterprise Subscription
Enterprise-grade support, security, and reliability for production AI

Security & reliability: security reviews; priority notifications & fixes; API compatibility; lifecycle management; long-term support
Exclusive features: stable APIs & security patching; AI workflows; unencrypted pretrained models; AI management solutions (e.g., Triton Management Service)
Industry ecosystem: hardware certifications; software certifications; cloud certifications; MLOps certifications; multi-vendor case ownership
Ongoing delivery: patches; bug fixes; updates; upgrades; feature enhancements
Technical support: 24/7 case filing; 8-5 live NVIDIA agent access; unlimited incidents; specialty-based routing; NVIDIA customer portal/knowledge base


HPE ProLiant AI Inference Solutions

Computer vision AI at the edge
• Smart template UCID: 5134233671-03
• Server: HPE ProLiant DL320 Gen11
• GPU: up to four NVIDIA L4
• Use cases: computer vision & video analytics, loss prevention, smart spaces
• Software: ISVs enabled by NVIDIA Metropolis
• Positioning: purpose-built for AI at the edge
• Description: Ideal for computer vision and video analytics at the edge for customers in the retail, hospitality, and manufacturing sectors, the HPE ProLiant DL320 Gen11 server has a unique compact design purpose-built for edge computing. It can pack up to four NVIDIA L4 GPUs in a 1U form factor to power smart spaces and loss prevention solutions and deliver insights in near-real time. Customers can take advantage of offerings from NVIDIA's Metropolis ecosystem to deploy targeted solutions for their industry and use case.
• Key verticals & functions: retail, hospitality, manufacturing

Generative visual AI
• Smart template UCID: 5133904640-02
• Server: HPE ProLiant DL380a Gen11
• GPU: up to four NVIDIA L40 (up to three today, four in October*)
• Use cases: 3D animation, image generation, video generation
• Software: NVIDIA AI Enterprise (available now in WISE; SKU coming in Oct 2023)
• Positioning: optimized for visual apps
• Description: For generative visual AI to drive product design, 3D animation, or image and video generation, the HPE ProLiant DL380a Gen11 server with four NVIDIA L40 GPUs delivers the rendering and design performance needed by demanding visual applications used in the media and entertainment, healthcare, and manufacturing industries. The NVIDIA AI Enterprise suite offers pre-trained models to streamline development and deployment.
• Key verticals & functions: media & entertainment, healthcare, manufacturing

Natural language processing AI
• Smart template UCID: 5133904763-02
• Server: HPE ProLiant DL380a Gen11
• GPU: up to four NVIDIA L40S
• Use cases: speech AI, fraud detection, predictive maintenance
• Software: NVIDIA AI Enterprise (available now in WISE; SKU coming in Oct 2023)
• Positioning: powering large language models
• Description: To drive enterprise implementations of natural language processing, such as speech AI and fraud detection, HPE designed the ultra-scalable HPE ProLiant DL380a Gen11 server. Together with NVIDIA L40S GPUs and the full suite of NVIDIA AI Enterprise tools, this is a great AI platform for running large language models used by the financial services and manufacturing sectors, as well as customer service across a range of industries.
• Key verticals & functions: financial services, manufacturing, customer service

* Dates subject to change. Learn more: www.hpe.com/ProLiant/AI

Why HPE ProLiant Gen11 for AI?

1. Innovate with more graphical capabilities than ever before
2. Expect more from your infrastructure
3. Control cost by rightsizing and scaling as needed
4. Get an intuitive cloud operating experience
5. Protect your infrastructure with trusted security by design

Thank you

Access the QRG here

bradley.sweeney@hpe.com
bilal.majeed@hpe.com
devon.daniel@hpe.com
tboyer@nvidia.com
Confidential | Authorized HPE Partner Use Only
© 2023 Hewlett Packard Enterprise Development LP
