
CGRA WITH FLOATING POINT SUPPORT FOR SIGNED ALU OPERATIONS

MAPPED CGRA FOR FLOATING POINT OPERATIONS

Submitted by

P. Venugopal, M.E. [A.E], M.B.A., Asst. Professor, Dept. of ECE, M.N.M. Jain Engineering College, Chennai, Tamil Nadu, India

U. Maheswaran, M.E. [A.E], PG Scholar, M.N.M. Jain Engineering College, Chennai, Tamil Nadu, India

CONTENTS:
Introduction
Problem Definition
Technical Background
Proposed Idea
Design Methodology
Design Aspects
Results
Conclusions
Queries

INTRODUCTION
Why reconfigurable computing?
What are CGRAs?
Why not FPGAs for complex applications?
Reconfigurable computing & mapping applications?

PROBLEM DEFINITION
FPGAs are fine-grained architectures with bit-level granularity, so for complex applications flexibility is reduced and complexity increases. This is where the CGRA comes into the picture: a CGRA has word- or nibble-level granularity. But CGRA applications are domain specific, and a typical CGRA supports only integer arithmetic. We propose a new architecture that supports both integer and floating-point operations.

TECHNICAL BACKGROUND
In the current scenario, systems with reconfigurable logic modules have a great impact on many technical applications. FPGAs are widely used across many technical domains to implement complex algorithms. But for such algorithms FPGAs offer less flexibility and lower efficiency because of their fine-grained architecture.

CONTINUED..
If we use such a fine-grained architecture for complex algorithms, flexibility has to be sacrificed and the system becomes more complex. Coarse-grained architectures have greater granularity: the resource entities (hardware/problem) into which the system is divided have a larger grain size.

CONTINUED..
Hence, when a CGRA is used for complex algorithms, flexibility is preserved. But typical CGRAs come without a floating-point unit, and they are domain specific. To overcome these barriers, heuristic mapping functions are used so that a floating-point unit can be dynamically created by the mapping algorithm.

CONTINUED..
Hence this floating-point-enabled CGRA can be used for complex applications involving floating-point arithmetic, e.g. DSP filter design, graphics accelerators, and many multimedia applications. Thus the hardware flexibility of the system is improved by using high-performance hardware, and programming flexibility is achieved through mapping algorithms.

PROPOSED IDEA
The real challenge before us now is the grain size of the reconfigurable device. Our aim is to improve its granularity by grouping the basic units of the reconfigurable device with a data bus of a particular data width.

EXISTING MODEL- FPGA


Existing FPGAs support algorithms based on integer arithmetic. The FUs can execute common word-level operations. The register files of each computing module are localized. There is no shared-bus communication among the reconfigurable computing modules.

MAPPED CGRA WITH FLOATING POINT SUPPORT

CONTINUED..

The target architecture consists of a reconfigurable computing module (RCM). The RCM executes loop kernel code segments. A general-purpose processor is present to control the RCM. These units are connected by a shared bus. Each processing element (PE) can be dynamically reconfigured to perform arithmetic/logic operations.

DESIGN FLOW

DATA PATH & CONTROL PATH DESIGN


Any ASIC is typically implemented with an FSMD (finite state machine with datapath).

It consists of control path and data path designs.

Control path design: generation of a set of control signals [the control word] at every clock cycle.

Data path design: computational tasks described by the control word.
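The control path / data path split can be illustrated with a small software model. This is only a sketch under assumptions: the state names, the `ControlWord` fields and the example task (summing 1 to n, for n >= 0) are hypothetical and do not come from the design presented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlWord:
    """Control signals issued for one clock cycle (hypothetical fields)."""
    load: bool = False  # initialise the registers
    add: bool = False   # acc <- acc + count
    dec: bool = False   # count <- count - 1
    done: bool = False  # result is valid


def control_path(state, count_is_zero):
    """FSM: return (next_state, control_word) for the current cycle."""
    if state == "INIT":
        return "LOOP", ControlWord(load=True)
    if state == "LOOP":
        if count_is_zero:
            return "DONE", ControlWord()
        return "LOOP", ControlWord(add=True, dec=True)
    return "DONE", ControlWord(done=True)


def data_path(regs, cw, n):
    """Datapath: update (acc, count) as the control word dictates."""
    acc, count = regs
    if cw.load:
        return 0, n
    new_acc = acc + count if cw.add else acc
    new_count = count - 1 if cw.dec else count
    return new_acc, new_count


def run_fsmd(n):
    """Clock the FSMD until 'done'; assumes n >= 0, returns 1 + 2 + ... + n."""
    state, regs = "INIT", (0, 0)
    while True:
        # The status signal (count == 0) flows from datapath to control path.
        next_state, cw = control_path(state, regs[1] == 0)
        if cw.done:
            return regs[0]
        regs = data_path(regs, cw, n)
        state = next_state
```

Each loop iteration stands for one clock cycle: the control path emits one control word, and the data path performs only what that word enables.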

RECONFIGURING THE TARGET ARCHITECTURE


The mapped CGRA contains a coprocessor [KCPSM3 PicoBlaze] on the host FPGA, used to reconfigure the grain size of the FPGA. KCPSM3 [Constant (K) Coded Programmable State Machine] is a free soft processor core from Xilinx for use in their FPGAs.

CONTINUED..
Xilinx documents the PicoBlaze as requiring just 96 FPGA slices. It runs the kernel in a loop and reconfigures the CLBs into the required PEs. Reconfiguration details are stored in configurable caches. The floating-point adder unit is then synthesized on the mapped CGRA and addition is performed.

DESIGN ASPECTS

RESULTING PE OF MAPPED CGRA

FLOATING POINT OPERATIONS WITH MAPPED CGRA


A pair of PEs is used for floating-point operations. One PE computes the mantissa and the other handles the exponent. Steiner tree routing is preferred for faster routing performance. After ILP/QEA, heuristic approaches are followed to increase performance.
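How the work of a floating-point addition might be divided between an exponent PE and a mantissa PE can be sketched in software. This is a simplified model under assumptions, not the actual PE design: the function names and the 8-bit mantissa width are hypothetical, a number is a (mantissa, exponent) pair with value mantissa * 2**exponent, and rounding and special values are ignored (alignment simply truncates).

```python
import math

MANT_BITS = 8  # hypothetical mantissa width, not taken from the slides


def shift_right(man, k):
    """Shift a signed mantissa right by k bits, truncating toward zero."""
    sign = -1 if man < 0 else 1
    return sign * (abs(man) >> k)


def exponent_pe_align(exp_a, exp_b):
    """Exponent PE: choose the larger exponent and the alignment shifts."""
    exp = max(exp_a, exp_b)
    return exp, exp - exp_a, exp - exp_b


def mantissa_pe_add(man_a, man_b, shift_a, shift_b):
    """Mantissa PE: align both mantissas, then add them."""
    return shift_right(man_a, shift_a) + shift_right(man_b, shift_b)


def normalize(man, exp):
    """Bring |man| back into [2**(MANT_BITS-1), 2**MANT_BITS).

    The mantissa is shifted while the exponent is adjusted in step.
    """
    while abs(man) >= (1 << MANT_BITS):
        man, exp = shift_right(man, 1), exp + 1
    while man != 0 and abs(man) < (1 << (MANT_BITS - 1)):
        man, exp = man * 2, exp - 1
    return man, exp


def fp_add(a, b):
    """Add two (mantissa, exponent) pairs; value = mantissa * 2**exponent."""
    exp, shift_a, shift_b = exponent_pe_align(a[1], b[1])
    man = mantissa_pe_add(a[0], b[0], shift_a, shift_b)
    return normalize(man, exp)


def to_float(x):
    """Convert a (mantissa, exponent) pair to a Python float for checking."""
    return math.ldexp(x[0], x[1])
```

For example, `fp_add((192, -7), (128, -6))` adds 1.5 and 2.0: the exponent side selects -6 and a one-bit shift for the first operand, and the mantissa side computes 96 + 128 = 224, which is 3.5 at exponent -6.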

CONTINUED
Thus, each operation in a loop body is spatially mapped to a dedicated PE. The main advantage of spatial mapping is that each PE may not need reconfiguration during execution of a loop because of its fixed functionality. However, it has the disadvantage that spreading all operations of the loop body over the reconfigurable array may require a very large array size.
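The array-size cost of spatial mapping can be seen in a minimal sketch. It assumes a plain row-major placement that ignores routing and data dependences; the function name and the operation labels in the usage note are hypothetical, and this is not the heuristic mapping algorithm of the proposed design.

```python
def spatial_map(ops, rows, cols):
    """Assign each loop-body operation its own PE, in row-major order.

    ops must contain unique operation labels. Raises ValueError when the
    loop body has more operations than the array has PEs.
    """
    if len(ops) > rows * cols:
        raise ValueError(
            f"{len(ops)} operations do not fit a {rows}x{cols} PE array"
        )
    # divmod(i, cols) turns a linear index into (row, column) coordinates.
    return {op: divmod(i, cols) for i, op in enumerate(ops)}
```

A four-operation loop body such as `["ld0", "mul0", "add0", "st0"]` fills a 2x2 array exactly; a fifth operation already forces a larger array, which is the disadvantage noted above.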

CONTINUED
The operations that a PE (or a pair of PEs) in our CGRA can execute are classified into three groups.
1) Arithmetic/logical operations: a PE can execute ALU operations in one clock cycle.
2) Multiply/divide/load/store operations: these operations are executed in several clock cycles by dedicated functional resources located outside the PE array.
3) Floating-point operations: a pair of PEs can execute floating-point operations, taking several clock cycles.
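The three groups can be written down as a small lookup that a mapper might consult when placing an operation. This is a sketch only: the operation mnemonics (`add`, `fadd`, and so on) are hypothetical placeholders, since the slides do not list the instruction set.

```python
from enum import Enum


class OpGroup(Enum):
    ALU = "arithmetic/logical"
    SHARED = "multiply/divide/load/store"
    FLOAT = "floating-point"


# Hypothetical mnemonics chosen for illustration only.
GROUP_OF = {
    **dict.fromkeys(["add", "sub", "and", "or", "xor"], OpGroup.ALU),
    **dict.fromkeys(["mul", "div", "load", "store"], OpGroup.SHARED),
    **dict.fromkeys(["fadd", "fsub", "fmul"], OpGroup.FLOAT),
}

# Where each group executes, following the classification above.
RESOURCE = {
    OpGroup.ALU: "single PE, one clock cycle",
    OpGroup.SHARED: "shared functional unit outside the PE array, several cycles",
    OpGroup.FLOAT: "pair of PEs, several cycles",
}


def classify(op):
    """Return (group, executing resource) for an operation mnemonic."""
    group = GROUP_OF[op]  # KeyError for mnemonics not in the table
    return group, RESOURCE[group]
```

With this table, `classify("fadd")` reports the floating-point group and a PE pair, while `classify("mul")` points to a shared unit outside the array.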

CONCLUSION
Thus increased performance and flexibility, in both programming and hardware, are achieved by this mapping process over a reconfigurable device. Faster, more flexible reconfigurable hardware that is mapped in this way to support floating-point operations can draw good attention in the embedded systems industry.

Queries ?

AUTHORS
U. Maheswaran, M.E. [A.E], PG Scholar, M.N.M. Jain Engineering College, Chennai, Tamil Nadu, India
er.maheswaran@live.com +91-9944215357

P. Venugopal, M.E. [A.E], M.B.A., Asst. Professor, Dept. of ECE, M.N.M. Jain Engineering College, Chennai, Tamil Nadu, India
kpsvenu@gmail.com +91-9444420128
