Advanced Storage Technologies for High Performance Computing
Sorin Faibish
EMC NAS Senior Technologist

IDC HPC User Forum, April 14-16, Norfolk, VA

IDC HPC User Forum 2008

New HPC Storage Intensive Applications

Storage Challenges*
New algorithms that can scale to search and process massive datasets;
New metadata management of distributed data sources;
New platforms that provide uniform high-speed memory access to multi-terabyte data structures;
Hybrid interconnect architectures to process and filter multi-gigabyte data streams from scientific instruments;
High-performance, high-reliability, petascale distributed file systems;
New approaches to software mobility, so that algorithms can execute on nodes where the data resides;
Flexible and high-performance software integration technologies running on diverse computing platforms;
Data signature generation techniques for data reduction and rapid processing.
*Computer Magazine: http://www.computer.org/portal/cms_docs_computer/computer/homepage/0408/R4gei.pdf

New Storage Technologies for HPC

Storage Technologies
Virtualization to address the multicore problem
CDP and memory snapshots to address storage failures during computation
DR and distributed cache appliances to address computation across geographies
SSD technology to address Data Intensive Super Computing tasks as well as decrease power consumption of storage
pNFS and RDMA technologies to increase the I/O speeds and reduce computation cycles

Storage at Previous HPC User Forum

New Concept: Better Utilization of Multi-cores

Current Implementation
Application split across multiple single-core SMP HW
Use middleware SW (Platform)

New Concept: Better Utilization of Multi-cores

Dual-core support added
Application modified to support dual-core SMP
CPU used: 4 x 100% (100%)
Licenses paid: 4
Licenses used: 4

New Concept: Better Utilization of Multi-cores

Quad-core chips appear
CPU used: 4 x 100% (4/8 = 50%)
Licenses paid: 8
Licenses used: 4
Application must be modified, or
Use VM with CPU affinity
CPU used: 8 x 80% (80%)
Licenses used: 8

New Concept: Better Utilization of Multi-cores

N-core chips are coming
Use VM with VT support
CPU used: 2 x N x 90% (90%)
Licenses paid = used: 2 x N
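
The utilization and license figures on the multi-core slides follow from simple arithmetic. The sketch below reproduces them, assuming utilization is busy cores times per-core load divided by installed cores, and one paid license per installed core. The `Scenario` class, its field names, and the choice of N = 16 are illustrative assumptions, not part of the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One slide's configuration (names are illustrative, not from the talk)."""
    name: str
    total_cores: int       # cores installed; assumed one paid license per core
    busy_cores: int        # cores the application actually runs on
    load_per_core: float   # average load on each busy core, 0.0 to 1.0

    def utilization(self) -> float:
        """Fraction of installed CPU capacity doing useful work."""
        return self.busy_cores * self.load_per_core / self.total_cores

    def idle_licenses(self) -> int:
        """Licenses paid for cores the application never uses."""
        return self.total_cores - self.busy_cores


N = 16  # hypothetical cores per chip for the "N-core" slide

scenarios = [
    Scenario("dual-core, application modified", 4, 4, 1.00),
    Scenario("quad-core, application unmodified", 8, 4, 1.00),
    Scenario("quad-core, VM with CPU affinity", 8, 8, 0.80),
    Scenario("N-core, VM with VT support", 2 * N, 2 * N, 0.90),
]

for s in scenarios:
    print(f"{s.name}: {s.utilization():.0%} utilized, "
          f"{s.idle_licenses()} idle licenses")
```

Under these assumptions the four cases come out at 100%, 50%, 80%, and 90% utilization, with idle licenses only in the unmodified quad-core case.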

New Concept: Better Utilization of Multi-cores

Core-agnostic middleware will work with as many cores as available
Enabled by pNFS access to shared storage
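
As a rough illustration of what "core agnostic" could mean in practice, the sketch below sizes its work partitions from the core count reported at run time instead of a hard-coded number. The function name `split_tasks` is hypothetical; this is not the middleware discussed in the talk, only a sketch of the idea.

```python
import os
from typing import List, Optional


def split_tasks(n_tasks: int, n_cores: Optional[int] = None) -> List[range]:
    """Partition task indices 0..n_tasks-1 into one contiguous chunk per core.

    When n_cores is None the count is discovered at run time, so the same
    code runs unchanged on dual-core, quad-core, or N-core hosts.
    """
    if n_cores is None:
        n_cores = os.cpu_count() or 1  # os.cpu_count() may return None
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    base, extra = divmod(n_tasks, n_cores)
    chunks = []
    start = 0
    for core in range(n_cores):
        size = base + (1 if core < extra else 0)  # spread the remainder
        chunks.append(range(start, start + size))
        start += size
    return chunks


# Ten tasks on four cores: chunk sizes 3, 3, 2, 2.
print([len(chunk) for chunk in split_tasks(10, 4)])
```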

CDP + Memory Snapshots in HPC applications

CDP technology will work with real and virtual infrastructures
VM snapshots on central storage repository
VM and HW host memory snapshots
Any SAN or NAS storage
Recover an HPC job at any point in time (e.g., a last-minute failure after a 2-week run)

Diagram: HPC application platform support; SAN; CDP appliance; CDP journal + memory snapshots; storage from IBM, HDS, EMC, HP, Sun
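
To make the "recover at any point in time" idea concrete, here is a toy in-memory model of a CDP journal: every write is logged with a timestamp, and recovery replays the log up to the chosen moment. The class and method names are hypothetical, and a real CDP appliance would also deal with memory snapshots, persistence, and consistency across volumes, all of which this sketch ignores.

```python
from typing import Dict, List, Tuple


class CdpJournal:
    """Toy continuous-data-protection journal (illustrative only)."""

    def __init__(self) -> None:
        # Each entry is (timestamp, block number, data), kept in time order.
        self._entries: List[Tuple[float, int, bytes]] = []

    def record_write(self, timestamp: float, block: int, data: bytes) -> None:
        if self._entries and timestamp < self._entries[-1][0]:
            raise ValueError("writes must be journaled in time order")
        self._entries.append((timestamp, block, data))

    def restore(self, point_in_time: float) -> Dict[int, bytes]:
        """Rebuild the volume image as it looked at point_in_time."""
        image: Dict[int, bytes] = {}
        for timestamp, block, data in self._entries:
            if timestamp > point_in_time:
                break  # later writes are not part of the requested image
            image[block] = data  # newer writes overwrite older ones
        return image


journal = CdpJournal()
journal.record_write(1.0, 0, b"checkpoint-A")
journal.record_write(2.0, 1, b"results-v1")
journal.record_write(3.0, 1, b"corrupted")  # the failure happens here

# Roll back to just before the bad write.
recovered = journal.restore(2.5)
print(recovered)
```

Restoring to time 2.5 yields block 1 with `b"results-v1"`, so the job could resume from the state just before the failure instead of starting over.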

Continuous Remote Replication in HPC

Distributed cache engines allow distributed access to shared storage
Remote compute nodes access the shared storage

Diagram: Site A and Site B; HPC application platform support; HPC application remote platform; heterogeneous blades (VM+HW); a cache appliance and a SAN at each site; heterogeneous storage (IBM, HDS, EMC, HP, Sun)
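
One way to picture a cache appliance at a remote site is as a read-through cache in front of the shared storage: a remote compute node asks the local cache first, and only a miss crosses the link between sites. This is an assumption about how such an appliance might behave, not a description of any specific product. `SharedStorage` and `CacheAppliance` are hypothetical names, and write handling and cache coherency are left out.

```python
from typing import Dict


class SharedStorage:
    """Stand-in for the shared storage at the primary site."""

    def __init__(self, blocks: Dict[str, bytes]) -> None:
        self._blocks = blocks
        self.remote_reads = 0  # reads that crossed the inter-site link

    def read(self, key: str) -> bytes:
        self.remote_reads += 1
        return self._blocks[key]


class CacheAppliance:
    """Read-through cache placed at the remote site."""

    def __init__(self, backend: SharedStorage) -> None:
        self._backend = backend
        self._cache: Dict[str, bytes] = {}

    def read(self, key: str) -> bytes:
        if key not in self._cache:  # miss: fetch once from the primary site
            self._cache[key] = self._backend.read(key)
        return self._cache[key]     # hit: served locally


site_a = SharedStorage({"input.dat": b"simulation input"})
site_b_cache = CacheAppliance(site_a)

for _ in range(3):  # three reads issued by remote compute nodes
    site_b_cache.read("input.dat")

print(site_a.remote_reads)
```

In this toy run only the first of the three reads reaches the primary site.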

SSDs in HPC applications

Solid State Disks will replace disk drives
Today HPC workloads are mostly compute intensive
Data Intensive Super Computing (DISC) applications are starting to appear (see IEEE Computer Magazine, April 2008)
SSD will balance performance between DISC and compute-intensive HPC applications
EMC DMX has SSD today (25 SSDs = 800K IOPS or 5 GB/sec)

Diagram: HPC application platform support; SAN; EMC DMX + SSD
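
Dividing the aggregate DMX figures by the number of drives gives a rough per-SSD number. The sketch assumes performance scales linearly across the 25 drives and uses decimal units (1 GB = 10^9 bytes); both are simplifications.

```python
SSD_COUNT = 25
TOTAL_IOPS = 800_000         # "800K iops" from the slide
TOTAL_BYTES_PER_SEC = 5e9    # "5 GB/sec" from the slide, decimal GB assumed

iops_per_ssd = TOTAL_IOPS / SSD_COUNT
mb_per_sec_per_ssd = TOTAL_BYTES_PER_SEC / SSD_COUNT / 1e6

print(f"{iops_per_ssd:,.0f} IOPS per SSD")
print(f"{mb_per_sec_per_ssd:.0f} MB/s per SSD")
```

That works out to about 32,000 IOPS or 200 MB/s per drive under the linear-scaling assumption.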

pNFS will deliver very high I/O speeds to HPC

pNFS addresses the storage access issues
Removes the server layer between compute engines (CE) and shared storage
Separates metadata (MD) traffic from data traffic
Asymmetric storage architectures increase scalability
SSD increases I/O speed

Storage must be Networked

Diagram: HPC Architecture: HPC jobs; middleware; compute engines; pNFS / NFS servers; connectivity; SSD storage
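
The separation of metadata traffic from data traffic can be sketched with a toy model: the client asks a metadata server once for a file's layout, then fetches each stripe directly from the data server that holds it. The classes below are hypothetical and far simpler than the real protocol (no layout recall, no locking, no network I/O); they only show which requests go where.

```python
from typing import Dict, List, Tuple


class DataServer:
    """Holds stripes and serves them directly to clients."""

    def __init__(self) -> None:
        self.stripes: Dict[Tuple[str, int], bytes] = {}
        self.data_requests = 0

    def read_stripe(self, path: str, index: int) -> bytes:
        self.data_requests += 1
        return self.stripes[(path, index)]


class MetadataServer:
    """Hands out layouts; client reads never pass file data through it."""

    def __init__(self, servers: List[DataServer]) -> None:
        self._servers = servers
        self._layouts: Dict[str, List[int]] = {}
        self.layout_requests = 0

    def create(self, path: str, data: bytes, stripe_size: int) -> None:
        # Setup helper for the demo: place stripes round-robin on the servers.
        layout = []
        for offset in range(0, len(data), stripe_size):
            index = offset // stripe_size
            server_id = index % len(self._servers)
            chunk = data[offset:offset + stripe_size]
            self._servers[server_id].stripes[(path, index)] = chunk
            layout.append(server_id)
        self._layouts[path] = layout

    def get_layout(self, path: str) -> List[int]:
        self.layout_requests += 1
        return list(self._layouts[path])


def client_read(path: str, mds: MetadataServer,
                servers: List[DataServer]) -> bytes:
    layout = mds.get_layout(path)  # control path: one metadata request
    return b"".join(servers[server_id].read_stripe(path, index)  # data path
                    for index, server_id in enumerate(layout))


servers = [DataServer(), DataServer()]
mds = MetadataServer(servers)
mds.create("/job/out.dat", b"abcdefghij", stripe_size=4)

content = client_read("/job/out.dat", mds, servers)
print(content, mds.layout_requests, [s.data_requests for s in servers])
```

Reading the ten-byte file costs one metadata request and three data requests spread over the two data servers, so adding data servers adds bandwidth without adding load on the metadata path.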

pNFS with InfiniBand RDMA value added to HPC

MD is directed to the single MD server
Data is served by storage servers or storage arrays directly from host to storage
I/O to native IB or 10G storage redirected via RDMA in HW
Storage access controlled by

Diagram: CE cache; NFS (pNFS); control path; data path; RDMA; iSCSI; iSCSI (iSER); metadata cache; NFS/pNFS; file systems; array cache; native IB storage; storage array
