
Enabling OpenMP Task Parallelism on Multi-FPGAs

Ramon Nepomuceno
ramon.nepomuceno@ic.unicamp.br

Guido Araujo
guido@unicamp.br

University of Campinas (Unicamp)
Institute of Computing (IC)
Computing Systems Laboratory (LSC)
Heterogeneous Systems

FPGA vs. GPU

GPU: high memory bandwidth
FPGA: low memory bandwidth


FPGA Pipeline

Multi-FPGA Pipeline
How to program such systems?

The OpenMP task-based programming model is a good choice for programming heterogeneous Multi-FPGA systems.
A. Computation offload to accelerators;
B. Explicit data dependencies;
C. Definition of region of code for each specific device.
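The three features listed above (A: offload, B: explicit dependencies, C: per-device code regions) can be illustrated with a minimal sketch in standard OpenMP. The function name `run_pipeline`, the constant `MAX_STEPS`, the update rule `v = 2*v + i`, and the device number `0` are assumptions made for illustration only; they do not come from the slides. When no accelerator is present, a typical OpenMP runtime falls back to running the target regions on the host.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_STEPS 64   /* hypothetical upper bound on pipeline length */

/* Runs `steps` dependent passes over v[0..n-1]; pass i computes v = 2*v + i.
   The result depends on the order of the passes, so the depend clauses matter. */
void run_pipeline(int *v, int n, int steps)
{
    assert(v != NULL && n > 0 && steps >= 0 && steps < MAX_STEPS);

    /* Dependence tokens: only their addresses are used, never their values. */
    char deps[MAX_STEPS + 1] = {0};
    (void)deps;  /* silences "unused" warnings when OpenMP is disabled */

    #pragma omp parallel
    #pragma omp single
    {
        for (int i = 0; i < steps; i++) {
            /* A: `target` offloads the region to an accelerator.
               B: `depend` orders task i after task i-1.
               C: `device(0)` selects the device that runs this region. */
            #pragma omp target nowait device(0) \
                    map(tofrom: v[0:n]) \
                    depend(in: deps[i]) depend(out: deps[i + 1])
            for (int j = 0; j < n; j++)
                v[j] = 2 * v[j] + i;
        }
        #pragma omp taskwait  /* wait for all deferred target tasks */
    }
}
```

Calling `run_pipeline(v, 4, 3)` on `{0, 1, 2, 3}` applies three chained passes. Because each task's `out` token is the next task's `in` token, the runtime sees a linear chain of tasks even though every task is launched with `nowait`.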
This work extends the existing LLVM/OpenMP infrastructure, together with a hardware platform, to help the programmer easily express the offloading and use of an available bitstream in a multi-FPGA cluster.
Enabling OpenMP Task Parallelism on Multi-FPGAs

1st) The flow starts with a program annotated with OpenMP task (target) directives.
2nd) The OpenMP runtime creates a task dependency graph.
3rd) The OpenMP libomptarget plugin maps these tasks to a cluster of FPGAs.

    #pragma omp target nowait \
            map(tofrom:V[:(h*w)]) \
            depend(in:deps[i])
    ...
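The third step, mapping the tasks of the dependency graph onto a cluster of FPGAs, can be pictured with a small sketch. The slides do not specify the mapping algorithm, so the round-robin policy below is an assumption chosen only because it is simple; the names `placement_t`, `map_round_robin`, and `count_fpga_crossings` are hypothetical and are not part of libomptarget.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical placement of one task: which FPGA, and which IP core on it. */
typedef struct {
    int fpga;
    int ip;
} placement_t;

/* Assigns task t of a linear pipeline to slot (t mod total IP cores),
   filling the IP cores of FPGA 0 first, then FPGA 1, and wrapping around. */
void map_round_robin(size_t n_tasks, int n_fpgas, int ips_per_fpga,
                     placement_t *out)
{
    assert(out != NULL && n_fpgas > 0 && ips_per_fpga > 0);
    size_t total = (size_t)n_fpgas * (size_t)ips_per_fpga;

    for (size_t t = 0; t < n_tasks; t++) {
        size_t slot = t % total;
        out[t].fpga = (int)(slot / (size_t)ips_per_fpga);
        out[t].ip   = (int)(slot % (size_t)ips_per_fpga);
    }
}

/* Counts how often consecutive pipeline stages sit on different FPGAs,
   i.e. how often data has to leave one board for another. */
size_t count_fpga_crossings(const placement_t *p, size_t n_tasks)
{
    size_t crossings = 0;
    for (size_t t = 1; t < n_tasks; t++)
        if (p[t].fpga != p[t - 1].fpga)
            crossings++;
    return crossings;
}
```

With six tasks on two FPGAs of two IP cores each, tasks 0 and 1 land on FPGA 0, tasks 2 and 3 on FPGA 1, and tasks 4 and 5 wrap back to FPGA 0, so the data crosses between boards twice. A different policy could trade such crossings against load balance.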
Example

[Figure: example with tasks t1, t2, t3 and t4]
Software Stack
Multi-node

● A-SWT
○ Communication among the IP-cores
● MAC Frame Handler (MFH)
○ Assembles and disassembles MAC frames.
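To make the role of a MAC frame handler more concrete, here is a software sketch of assembling and disassembling an Ethernet II frame (destination address, source address, EtherType, payload). It illustrates the frame layout only and is not the MFH hardware design from the slides; the function names are hypothetical, and padding of short payloads and the frame check sequence are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MAC_ADDR_LEN = 6, MAC_HDR_LEN = 14, MAC_MAX_PAYLOAD = 1500 };

/* Assembles header + payload into `frame`; returns the number of bytes written.
   `frame` must have room for MAC_HDR_LEN + len bytes. */
size_t mac_frame_pack(uint8_t *frame,
                      const uint8_t dst[MAC_ADDR_LEN],
                      const uint8_t src[MAC_ADDR_LEN],
                      uint16_t ethertype,
                      const uint8_t *payload, size_t len)
{
    assert(frame && dst && src && payload && len <= MAC_MAX_PAYLOAD);
    memcpy(frame, dst, MAC_ADDR_LEN);                 /* bytes 0..5   */
    memcpy(frame + MAC_ADDR_LEN, src, MAC_ADDR_LEN);  /* bytes 6..11  */
    frame[12] = (uint8_t)(ethertype >> 8);            /* big-endian   */
    frame[13] = (uint8_t)(ethertype & 0xFF);
    memcpy(frame + MAC_HDR_LEN, payload, len);        /* bytes 14..   */
    return MAC_HDR_LEN + len;
}

/* Disassembles a frame: reports the EtherType and a pointer to the payload
   inside `frame`; returns the payload length. */
size_t mac_frame_unpack(const uint8_t *frame, size_t frame_len,
                        uint16_t *ethertype, const uint8_t **payload)
{
    assert(frame && ethertype && payload && frame_len >= MAC_HDR_LEN);
    *ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    *payload = frame + MAC_HDR_LEN;
    return frame_len - MAC_HDR_LEN;
}
```

A round trip through `mac_frame_pack` and `mac_frame_unpack` returns the original payload bytes; in the sketch's test the EtherType `0x88B5` (a value reserved for local experimental use) is an arbitrary choice.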
FPGA Scalability
Iteration and IP Scalability
Resource Utilization
Final Remarks

● The multi-FPGA system showed promising scalability results.
● More significant results are expected on a more modern infrastructure.
● The mapping algorithm can still be explored.
