
AMD EPYC™ using OPENFOAM®

Powering the Future of HPC


September 2018

AMD EPYC: The right choice for Computational Fluid Dynamics

Designed from the ground up for a new generation of solutions, AMD EPYC implements a philosophy of choice without restriction. Choose the number of cores and sockets that meet your needs without sacrificing key features like memory and I/O.

Each EPYC processor can have from 8 to 32 cores with access to an exceptional amount of I/O and memory regardless of the number of cores in use, including 128 PCIe lanes and access to 2 TB of high-speed memory per socket.

EPYC’s innovative architecture translates to tremendous performance at a low cost. More importantly, the performance you’re paying for is appropriate to the performance you need.

I/O intensive workloads can utilize the plentiful I/O bandwidth with the right number of cores, avoiding overpaying for unneeded power, while compute-intensive workloads can make use of fully loaded core counts, dual sockets and plenty of memory.

Exceptional Memory Bandwidth
OpenFOAM is a memory-intensive workload that benefits from AMD EPYC’s 8 channels of memory bandwidth and up to 2TB of memory per processor.

Standards Based
AMD is committed to industry standards, offering you a choice in x86 architecture with design innovations that target the evolving needs of modern datacenters.

High Density, Low Cost
Compute requirements are increasing, datacenter space is not. AMD’s EPYC processor offers high core density with full access to all features. Innovative architecture means outstanding performance at a low cost.

Partner Ecosystem
AMD’s broad partner ecosystem and collaborative engineering provide tested and validated solutions that help lower your risk and total cost of ownership.

OpenFOAM
OpenFOAM is free, open source software, offering a long-term and viable complement to proprietary CFD codes.

OPENFOAM® is a registered trade mark of OpenCFD Limited, producer and distributor of the OpenFOAM software via www.openfoam.com.

AMD EPYC processors help enable more performance, flexibility, and security
PERFORMANCE. AMD EPYC processors bring a new balance to the datacenter. Utilizing an x86 architecture, the AMD EPYC
processor brings together high core counts, large memory capacity, ample memory bandwidth and massive I/O with the right
ratios to help performance reach new heights.

FLEXIBILITY. Match core count with application needs without compromising processor features. EPYC’s balanced set of resources
means more freedom to right-size the server configuration to the workload.

SECURITY. AMD EPYC features the industry’s first dedicated security processor embedded in an x86-architecture server processor.
The processor manages secure boot, memory encryption, and secure virtualization on the processor itself. Encryption keys never
leave the processor where they can be exposed to intruders.

SCALABILITY. Scale-up or scale-out, AMD and its ecosystem partners offer high-performance network connectivity options for
applications at massive scale.

2018 © Advanced Micro Devices, Inc.


OpenFOAM

OpenFOAM is free, open source CFD software produced by OpenCFD Limited. It has a large user base across many areas of engineering and science from both commercial and academic organizations, most notably in automotive, energy, and aerospace. OpenFOAM can solve a wide range of problems, from complex fluid flows involving chemical reactions, turbulence and heat transfer, to acoustics, solid mechanics and electromagnetics.

OpenCFD Limited develops and maintains the OpenFOAM software and releases it through the OpenFOAM Foundation. It is licensed under the GNU General Public License (GPL).

OpenFOAM comes with full commercial support from ESI-OpenCFD, including software support, contracted development, engineering services and a program of training courses and community-based development projects. These activities help fund the continued development, maintenance and release of OpenFOAM to make it a strongly viable, commercially supported, open source product.

AMD EPYC for Computational Fluid Dynamics

Memory bandwidth is a critical factor in maximizing performance of computational fluid dynamics workloads. AMD EPYC server processors’ exceptional memory bandwidth ensures that you get the most out of your system, minimizing execution time and increasing overall utilization of your deployment.

The EPYC Advantage: AMD EPYC server processors offer 8 memory channels of DDR4-2666 and up to 2TB of memory per processor, yielding exceptional memory bandwidth and capacity.

Many High-Performance Compute (HPC) workloads require you to balance performance against per-core license costs to manage your overall cost. AMD EPYC processors offer a consistent set of features across the product line, allowing users to optimize the number of cores required for their workloads without sacrificing features, memory channels, memory capacity, or I/O lanes. Whether you need 8, 16, 24, or 32 physical cores per socket, you will have access to 8 channels of memory per processor across all EPYC server processors.

The EPYC Advantage: Performance - The AMD EPYC processor brings new balance to the datacenter. The highest core count yet in an AMD x86-architecture server processor, large memory capacity, memory bandwidth and I/O density are all brought together with the right ratios to help performance reach new heights.

As workloads demand more processor cores, the communication between processor cores becomes critical to efficiently solving the complex problems faced by customers. As cluster sizes increase, the communication requirements between nodes rise quickly and can limit scaling at large node counts.
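
As a rough check on the memory configuration described in this section (8 channels of DDR4-2666 per processor), the theoretical peak bandwidth can be estimated with simple arithmetic. The sketch below assumes a standard 64-bit (8-byte) DDR4 channel, which the text does not state explicitly, and the function name is illustrative only. Sustained bandwidth seen by an application such as OpenFOAM will be lower than this theoretical figure.

```python
# Theoretical peak DRAM bandwidth for the memory configuration quoted in the text.
# Assumption (not stated in the text): each DDR4 channel is 64 bits (8 bytes) wide.

CHANNELS_PER_SOCKET = 8      # from the text: 8 memory channels per processor
TRANSFER_RATE_MT_S = 2666    # from the text: DDR4-2666 (mega-transfers per second)
BYTES_PER_TRANSFER = 8       # assumption: standard 64-bit DDR4 channel


def peak_bandwidth_gb_s(channels,
                        mt_per_s=TRANSFER_RATE_MT_S,
                        bytes_per_transfer=BYTES_PER_TRANSFER):
    """Return theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return channels * mt_per_s * 1e6 * bytes_per_transfer / 1e9


per_socket = peak_bandwidth_gb_s(CHANNELS_PER_SOCKET)
per_dual_socket_node = peak_bandwidth_gb_s(2 * CHANNELS_PER_SOCKET)

print(f"{per_socket:.1f} GB/s per socket")
print(f"{per_dual_socket_node:.1f} GB/s per dual-socket node")
```

Because every EPYC model in the range exposes the same 8 channels, this peak figure does not change with the core count chosen; only the bandwidth available per core does.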



Performance Benchmarks and Testing

The OpenFOAM motorbike tutorial was used to benchmark performance. The mesh resolution can be increased or decreased to affect the accuracy. This in turn affects the compute time, which makes the tutorial a convenient way to demonstrate scalability at different workload sizes.

OpenFOAM testing was performed on a 32-node cluster of dual-socket AMD EPYC 7351 systems. Each AMD EPYC 7351 processor has 16 cores with a base frequency of 2.4 GHz and a boost frequency of 2.9 GHz. Each system has a total of 16 channels of dual-rank DDR4-2666 memory, 8 channels per processor.

Tested Hardware/Software configuration

Compute Nodes
  CPUs                  2 x EPYC 7351
  Cores                 16 cores per socket / 32 cores per node
  Memory                256GB Dual-Rank DDR4-2666
  NIC                   Mellanox ConnectX-5 EDR 100Gb InfiniBand x16 PCIe
  Storage: OS           1 x 256 GB NVMe
  Storage: Data         1 x 1 TB NVMe
Software
  OS                    RHEL 7.5 (3.10.0-862.el7.x86_64)
  Mellanox OFED Driver  MLNX_OFED_LINUX-4.3-3.0.2.1 (OFED-4.3-3.0.2)
  MPI Version           OpenMPI 3.1.1
  Application           OpenFOAM 5.0
Network
  Switch                Mellanox EDR 100Gb/s Managed Switch (MSB7800-ES2F)
Configuration Options
  BIOS Setting          SMT=OFF, Boost=ON, Determinism Slider = Power
  OS Settings           Transparent Huge Pages=ON (Default), Swappiness=0, Governor=Performance
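
The OS settings listed above can be checked on a Linux node by reading the usual sysfs and procfs entries. The following is a minimal sketch, not the procedure used for the tests: it assumes that "Transparent Huge Pages=ON" corresponds to the kernel's `always` mode, uses cpu0 as a representative core for the frequency governor, and all function names are hypothetical. A value of `None` means the setting could not be read on the current system.

```python
from pathlib import Path

# Standard Linux locations for the three OS settings in the table.
# Assumption: cpu0 is representative of all cores for the governor check.
THP_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")
GOVERNOR_PATH = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor")


def active_thp_mode(raw):
    """Return the bracketed (active) mode from e.g. '[always] madvise never'."""
    for token in raw.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_setting(path):
    """Return the stripped file contents, or None if the file is unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def check_os_settings():
    """Compare the current node against the OS settings listed in the table."""
    thp_raw = read_setting(THP_PATH)
    swappiness = read_setting(SWAPPINESS_PATH)
    governor = read_setting(GOVERNOR_PATH)
    return {
        # Assumption: "ON" in the table means the 'always' mode.
        "thp_on": None if thp_raw is None else active_thp_mode(thp_raw) == "always",
        "swappiness_0": None if swappiness is None else int(swappiness) == 0,
        "governor_performance": None if governor is None else governor == "performance",
    }


if __name__ == "__main__":
    for name, ok in check_os_settings().items():
        print(name, "unknown" if ok is None else ok)
```

The BIOS settings (SMT, Boost, Determinism Slider) are configured in firmware and are not covered by this sketch.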

OpenFOAM Compilation:
OpenFOAM version 5.0 was compiled from source on RHEL 7.5 using the AOCC 1.2.1 compiler
(https://developer.amd.com/amd-aocc/) and OpenMPI 3.1.1. The optimization flag “-O3” was used. No
further compile time optimizations were done.



OpenFOAM Performance: Motorbike 100x40x40 (~21m cells)
The scalable motorbike tutorial is commonly used as a standard baseline to compare OpenFOAM
performance across various platforms. The mesh is re-sized as needed to test scalability across higher
numbers of nodes.
Motorbike was first sized to 100x40x40, which is ~21m cells, to provide a large enough model to scale
effectively to 32 nodes. Figure 1 details the performance of the benchmark as nodes are added,
showing that performance continues to improve through 32 nodes (1024 cores). At 1024 cores,
the workload allocates just over 20,000 cells per processor core.
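
The cells-per-core figure follows directly from the model size and the cluster layout. The short sketch below reproduces that arithmetic for each node count in the test, using the approximate ~21m cell count from the text; the function name is illustrative.

```python
# Cells per core for the ~21m-cell motorbike model at each tested node count.
CORES_PER_NODE = 32          # from the text: 2 sockets x 16 cores per socket
TOTAL_CELLS = 21_000_000     # approximate ("~21m cells"), from the text


def cells_per_core(total_cells, nodes, cores_per_node=CORES_PER_NODE):
    """Average number of mesh cells handled by each core."""
    return total_cells / (nodes * cores_per_node)


for nodes in (1, 2, 4, 8, 16, 32):
    cores = nodes * CORES_PER_NODE
    load = cells_per_core(TOTAL_CELLS, nodes)
    print(f"{nodes:2d} nodes ({cores:4d} cores): ~{load:,.0f} cells per core")
```

Because the cell count is approximate, the per-core values are estimates; at 32 nodes the result is a little over 20,000 cells per core, matching the figure quoted above.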

Figure 1: Motorbike 100x40x40 Performance on EPYC 7351. Elapsed time in seconds (lower is better) versus number of nodes, with data points at 32, 64, 128, 256, 512, and 1024 cores.



OpenFOAM Scaling: Motorbike 100x40x40 (~21m cells)
Figure 2 details the scalability of the motorbike benchmark at 100x40x40 through 32 nodes (1024 cores).
The benchmark scales very well up to 16 nodes, but scalability begins to taper off at 32 nodes (1024 cores),
where the overall execution time is only ~45 seconds. This indicated the need to run a larger model to stress
the compute capability of the cluster at 32 nodes. The following sections show improved scalability using a
larger model.
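
Scaling normalized to single-node performance, as plotted in the scaling figures, is the single-node elapsed time divided by the elapsed time on N nodes; dividing that by N gives parallel efficiency. The sketch below shows the calculation. Only the ~45 second figure at 32 nodes comes from the text; the single-node time is a hypothetical placeholder chosen for illustration, not a measured result, and the function names are illustrative.

```python
def speedup(t_single_node, t_n_nodes):
    """Scaling normalized to single-node performance."""
    return t_single_node / t_n_nodes


def parallel_efficiency(t_single_node, t_n_nodes, nodes):
    """Fraction of ideal (linear) scaling achieved on the given node count."""
    return speedup(t_single_node, t_n_nodes) / nodes


# HYPOTHETICAL single-node time, for illustration only (not a measured value).
HYPOTHETICAL_T1_SECONDS = 900.0
# From the text: ~45 seconds at 32 nodes for the 100x40x40 model.
T32_SECONDS = 45.0

s = speedup(HYPOTHETICAL_T1_SECONDS, T32_SECONDS)
e = parallel_efficiency(HYPOTHETICAL_T1_SECONDS, T32_SECONDS, 32)
print(f"speedup at 32 nodes: {s:.1f}x, efficiency: {e:.0%}")
```

An efficiency close to 1.0 indicates near-linear scaling; values that fall as nodes are added reflect the tapering described above.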

Figure 2: Motorbike 100x40x40 Scaling on EPYC 7351. Scaling normalized to single-node performance versus number of nodes, with data points at 32, 64, 128, 256, 512, and 1024 cores.



OpenFOAM Scaling: Motorbike 130x52x52 (~42m cells)
For this test scenario, we increase the model size further. Setting the dimensions to 130x52x52 yields ~42m
cells in the model. At 32 nodes (1024 cores), this larger scenario loads each core with over 41,000 cells.
Figure 3 details the scalability of the motorbike benchmark at 130x52x52. With this size model, the
benchmark now scales exceptionally well all the way through 32 nodes (1024 cores) and has near perfect
scalability up to 16 nodes (512 cores).

Figure 3: Motorbike 130x52x52 Scaling on EPYC 7351. Scaling normalized to single-node performance versus number of nodes, with data points at 32, 64, 128, 256, 512, and 1024 cores.



OpenFOAM Performance: Motorbike 130x52x52 (~42m cells)
Figure 4 details the performance of the same larger motorbike benchmark sized to 130x52x52 as the
workload is scaled from 1 node (32 cores) to 32 nodes (1024 cores). At 32 nodes (~41,000 cells per core),
the execution time of this ~42m cell model is still only ~66 seconds, but that is enough execution time
to exhibit very good scaling.
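
One rough way to compare the two model sizes at 32 nodes is to look at how many cells the cluster works through per second of elapsed time. The sketch below uses only the approximate figures quoted in the text (~21m cells in ~45 seconds, ~42m cells in ~66 seconds); the "cells per second" metric and the names used are illustrative and are not taken from the benchmark itself.

```python
# Approximate 32-node results quoted in the text for the two model sizes.
runs = {
    "100x40x40": {"cells": 21_000_000, "seconds": 45.0},
    "130x52x52": {"cells": 42_000_000, "seconds": 66.0},
}


def cells_per_second(cells, seconds):
    """Mesh cells in the model per second of elapsed run time."""
    return cells / seconds


small = cells_per_second(**runs["100x40x40"])
large = cells_per_second(**runs["130x52x52"])

time_ratio = runs["130x52x52"]["seconds"] / runs["100x40x40"]["seconds"]
throughput_ratio = large / small

print(f"elapsed time grows by {time_ratio:.2f}x for 2x the cells")
print(f"cells per second improves by {throughput_ratio:.2f}x")
```

Doubling the cell count increases the elapsed time by less than a factor of two, which is consistent with the larger model keeping 1024 cores better utilized.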

Figure 4: Motorbike 130x52x52 Performance on EPYC 7351. Elapsed time in seconds (lower is better) versus number of nodes, with data points at 32, 64, 128, 256, 512, and 1024 cores.



Conclusion

Scale-out testing on the EPYC cluster shows impressive results on the scalable motorbike benchmark model. Results showed a general, and expected, trend of better scaling as the model sizes increased.

The OpenFOAM software application is architected to deliver accuracy, performance, and scalability to meet your CFD needs, empowering you to go further and faster as you optimize your product's performance. OpenFOAM includes well-validated physical modeling capabilities to deliver fast, accurate results across a wide range of workloads.

AMD empowers the development of fast, accurate computational fluid dynamics simulations running on cost-effective clustered systems.

For more information about AMD’s EPYC line of processors visit: http://www.amd.com/epyc
For more information about OpenFOAM visit: https://www.openfoam.com/

Authors

This paper is authored by Anre Kashyap in collaboration with Marc Baker, Kevin Mayo, and Narjit Chadha.

DISCLAIMER

The information contained herein is for informational purposes only and is subject to change without notice. While every precaution has been
taken in the preparation of this document, it may contain technical inaccuracies, omissions and typographical errors, and AMD is under no
obligation to update or otherwise correct this information. Advanced Micro Devices, Inc. makes no representations or warranties with respect
to the accuracy or completeness of the contents of this document, and assumes no liability of any kind, including the implied warranties of
noninfringement, merchantability or fitness for particular purposes, with respect to the operation or use of AMD hardware, software or other
products described herein. No license, including implied or arising by estoppel, to any intellectual property rights is granted by this document.
Terms and limitations applicable to the purchase or use of AMD’s products are as set forth in a signed agreement between the parties or in
AMD's Standard Terms and Conditions of Sale. GD-18

©2018 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, EPYC, and combinations thereof are trademarks of
Advanced Micro Devices, Inc. Other product names used in this publication are for identification purposes only and may be trademarks of their
respective companies.

OPENFOAM® is a registered trade mark of OpenCFD Limited, producer and distributor of the OpenFOAM software via www.openfoam.com. This
offering is not approved or endorsed by OpenCFD Limited, owner of the OPENFOAM® and OpenCFD® trademarks.


