System-Level Modeling of Dynamically Reconfigurable Co-Processors

Yang Qu1, Kari Tiensyrjä1, Kostas Masselos2
1 VTT Electronics, P.O. Box 1100 (Kaitoväylä 1), FIN-90571 Oulu, FINLAND, E-mail: yang.qu@vtt.fi
2 INTRACOM SA, Emerging Technologies and Markets Division, 19,5km Markopoulou Avenue, P.O.Box 68, GR 19002 Peania, Attika, Greece

Abstract. Dynamically reconfigurable co-processors (DRCs) are interesting design alternatives when both flexibility and performance are concerns. However, it is difficult to study the performance impact of including such devices in a design when using traditional design methods and tools. In this paper, we present easily adaptable system-level techniques that allow fast exploration of different reconfiguration alternatives. A SystemC-based modeling method for DRCs and a high-level synthesis-based estimation tool to support system partitioning are presented.

1 Introduction and Related Work

Technology developments have made it possible to re-program configurable hardware at run time. Such a device is generally referred to as dynamically reconfigurable logic (DRL). Unlike software or hardware implementations, DRLs spread computation over both time and space. This new feature requires various changes in the traditional design flow. At the system level, the problems are how to support HW/SW/DRL partitioning, how to evaluate different reconfiguration alternatives, how to model the DRLs with the aim of fast design space exploration, etc. In an era in which the design level is moving higher and higher, the design of a Reconfigurable System-on-Chip (RSoC) requires an easily adaptable solution that enhances traditional design methods and tools in order to reduce the time-to-market.

The authors of [1] proposed a VHDL modeling technique of the reconfigurable process that is simulatable and takes reconfiguration overhead into account, but the approach is not suitable for design space exploration. In [2], a system-level model of a run-time reconfigurable system was proposed. However, the reconfiguration overhead was not addressed. In [3], the VCC tool was used to evaluate different design options of a reconfigurable platform, but context scheduling was not addressed.

Our research focuses on high-level design methodology of reconfigurable systems, where DRLs are used as co-processors. This paper presents a system-level modeling technique of DRCs and the associated tools. The work is an extension of [4]. The main advantage of the approach is that it can be easily embedded into a SoC design flow to allow fast design space exploration for different reconfiguration alternatives without going into implementation. The system-level model describes the behavior of the reconfiguration process and relates the performance impact of the reconfiguration process to a set of parameters extracted from the reconfiguration technologies of interest. Thus, by tuning the parameters, designers can easily evaluate the trade-offs between different technologies. In simulation, the model can automatically detect the reconfiguration requests and trigger the reconfiguration process. The modeling methodology is supported by an estimation tool for the system partitioning and a transformation tool for reuse of existing SystemC code.

The structure of the paper is as follows. Section 2 introduces the modeling technique and supporting tools. The validation work using an MPEG2 decoder case is described in section 3. Section 4 gives the conclusions.

2 Proposed System-Level Modeling Techniques

The important tasks in system-level design of RSoC are to identify candidate components and to reveal reconfiguration overhead. The candidate components are application functions that are considered to gain benefit from being implemented on DRCs. The reconfiguration overhead is the feature closely related to DRL technologies and the run-time behavior of the candidate components. Our modeling technique focuses on three issues: selection of candidate components, modeling of the reconfiguration overhead for fast design space exploration, and design reuse. They are addressed separately in the following sections.

2.1 Estimation Approach to Support Identifying Candidate Components

The decision whether a task should be a candidate component is clearly application dependent. The criterion is that the task should have two features in combination: flexibility (that would exclude an ASIC implementation) and high computational complexity (that would exclude a software implementation). Flexibility may come either from the point that the task will be upgraded in the future or in view of hardware resource sharing with other tasks with non-overlapping lifetimes for global area optimization.

We developed a high-level synthesis-based estimation tool [5], which can produce estimates of the execution time and hardware resources required for embedded FPGA type DRCs, in order to support the selection of candidate components with the aim of total area reduction. Traditional HW/SW partitioning methods will be involved when making a full HW/SW/DRL partitioning. The input is C code of the tasks to be studied. A SUIF-based front-end preprocessor is used to extract Control-Data Flow Graphs (CDFG), based on which well-known high-level synthesis tasks are carried out to produce the estimates. The current estimator targets a Virtex2-like embedded FPGA in which the main resources are Look-Up Tables (LUTs) and multipliers. As-soon-as-possible (ASAP) and as-late-as-possible (ALAP) scheduling are used to determine the critical paths, from which we estimate the execution time. A modified version of Force-Directed Scheduling (FDS) is used to estimate the hardware resources required for the tasks. Finally, allocation algorithms are used to estimate the hardware resources required for interconnection with multiplexer type of interconnection units.
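The role of ASAP and ALAP scheduling in finding the critical path can be illustrated with a minimal Python sketch. It is not the estimation tool of [5]; the function name asap_alap, the operation names and the cycle latencies are hypothetical and chosen only for illustration. Operations whose ASAP and ALAP start times coincide have zero slack and lie on the critical path, whose length gives the execution-time estimate in cycles.

```python
from collections import defaultdict


def asap_alap(ops, deps):
    """Return ASAP/ALAP start times, schedule length and critical operations.

    ops:  {operation name: latency in cycles}   (hypothetical values)
    deps: list of (producer, consumer) data dependencies in the CDFG
    """
    preds, succs = defaultdict(list), defaultdict(list)
    for src, dst in deps:
        preds[dst].append(src)
        succs[src].append(dst)

    # Topological order of the dependency graph (Kahn's algorithm).
    indeg = {op: len(preds[op]) for op in ops}
    ready = [op for op in ops if indeg[op] == 0]
    order = []
    while ready:
        op = ready.pop()
        order.append(op)
        for nxt in succs[op]:
            indeg[nxt] -= 1
            if indeg[nxt] == 0:
                ready.append(nxt)

    # ASAP: each operation starts as soon as all its producers have finished.
    asap = {}
    for op in order:
        asap[op] = max((asap[p] + ops[p] for p in preds[op]), default=0)
    total = max(asap[op] + ops[op] for op in ops)

    # ALAP: each operation starts as late as possible without extending total.
    alap = {}
    for op in reversed(order):
        alap[op] = min((alap[s] for s in succs[op]), default=total) - ops[op]

    # Zero slack (ASAP == ALAP) marks the critical path.
    critical = [op for op in order if asap[op] == alap[op]]
    return asap, alap, total, critical


# Hypothetical four-operation graph: (load, mul) -> add -> store
ops = {"load": 1, "mul": 2, "add": 1, "store": 1}
deps = [("load", "add"), ("mul", "add"), ("add", "store")]
asap, alap, total, critical = asap_alap(ops, deps)
print(total, critical)
```

In this example the load operation has one cycle of slack, so a resource-constrained scheduler such as FDS would be free to move it without lengthening the schedule.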

2.2 Modeling of Reconfiguration Overhead

The modeling of reconfiguration overhead is divided into two steps. In the first step, different technology-dependent features are mapped onto a set of parameters, which are the size of the configuration data, the clock speed of configuration and the extra delays apart from loading of the configuration data. In the second step, a SystemC module that models the behavior of the run-time reconfiguration process is created and is used in system-level simulation to reveal the reconfiguration overhead.

A general SystemC model of RSoC is shown in Fig. 1. The left side is an overview of the RSoC. The DRC is a single SystemC module, which implements the same bus interfaces in the same way as other HW/SW modules. A configuration memory is modeled, which could be an on-chip or off-chip memory that holds the configuration data. The right side shows the internal structure of the DRC, which is in fact a hierarchical SystemC module.

Fig. 1. System-level Modeling of Reconfigurable SoC

Each candidate component (F1 to Fn) is an individual SystemC module, which implements the top-level bus interfaces with separate system address space, and is instantiated inside the DRC. Each candidate component has two extra ports. One is a DONE signal port routed to the Configuration Scheduler (CS). The port is used to acknowledge to the CS that this task can be safely swapped out. The other is connected to a shared memory that saves the data to be preserved during reconfiguration. The Input Splitter (IS) is an address decoder and it manages all incoming Interface-Method-Calls (IMCs). The CS monitors the operation states of the candidate components and controls the reconfiguration process.

The main idea of the modeling method is as follows. When the IS captures an IMC to a candidate component, it will hold the IMC and pass the control to the CS, which decides if reconfiguration is needed. If so, the CS will call a reconfiguration procedure that uses the parameters specified in step 1 to generate memory traffic and associated delays to mimic the reconfiguration latency. When the CS is done, the IS will dispatch the IMC to the target module. Concerning the practical implementation effort, the pre-emption of a running module is not supported. If the module cannot be activated at the moment, a message of request to reconfigure the target module will be put into a FIFO queue and the IMC will return with the value of FALSE. When a module finishes its operation, it will send a DONE signal to the CS, and the CS will check if there is any waiting message in the FIFO queue. If so and it is possible to activate the waiting module, the CS will call the reconfiguration procedure.

The modeling method is for non-blocking IMCs. The use of blocking IMCs requires the behavior of the system bus to be changed in order to avoid the bus being locked when the called module is off the device.
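The two steps above can be sketched in Python as a parameter record plus a simplified scheduler. This is only an illustration under stated assumptions, not the SystemC model of the paper: the class and method names (ReconfigParams, ConfigScheduler, request, done) are hypothetical, the device is assumed to hold one module at a time, any request arriving while a module is running is queued, and the memory traffic generated by the real model is reduced to an accumulated delay. The numeric values are taken from the case study in section 3 (200k-byte full-device bitstream, 50 MHz configuration clock, 8 bits loaded per cycle).

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class ReconfigParams:
    """Step 1: technology-dependent parameters of one candidate component."""
    config_bytes: int            # size of the configuration data
    clock_hz: float              # configuration clock speed
    bits_per_cycle: int = 8      # bits loaded per configuration cycle
    extra_delay_s: float = 0.0   # delays other than loading the data

    def latency_s(self):
        cycles = self.config_bytes * 8 / self.bits_per_cycle
        return cycles / self.clock_hz + self.extra_delay_s


class ConfigScheduler:
    """Step 2 (simplified): one loaded module at a time, no pre-emption."""

    def __init__(self, params):
        self.params = params     # {module name: ReconfigParams}
        self.loaded = None       # module currently on the device
        self.busy = False        # True while the loaded module is running
        self.pending = deque()   # FIFO queue of waiting requests
        self.reconf_time_s = 0.0 # accumulated reconfiguration latency

    def _reconfigure(self, module):
        self.reconf_time_s += self.params[module].latency_s()
        self.loaded = module

    def request(self, module):
        """IMC forwarded by the input splitter; False means 'retry later'."""
        if self.busy:
            self.pending.append(module)  # cannot be activated at the moment
            return False
        if self.loaded != module:
            self._reconfigure(module)
        self.busy = True
        return True

    def done(self):
        """DONE signal from the running module: serve the FIFO queue."""
        self.busy = False
        if self.pending:
            waiting = self.pending.popleft()
            if self.loaded != waiting:
                self._reconfigure(waiting)


full_device = ReconfigParams(config_bytes=200_000, clock_hz=50e6)
cs = ConfigScheduler({"idct": full_device, "cc": full_device})

first = cs.request("idct")   # device empty: load IDCT, then dispatch
second = cs.request("cc")    # IDCT still running: queued, returns False
cs.done()                    # IDCT finished: CC is loaded from the queue
third = cs.request("cc")     # retry succeeds without a further reload
print(cs.reconf_time_s * 1e3, "ms")
```

With these parameters each full-device load takes 4 ms, so one IDCT load and one CC load add up to 8 ms, which is consistent with the single-context configuration latency reported in Table 1.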

There is a state diagram common to all candidate components, based on which the CS makes reconfiguration decisions. A state diagram of partial reconfiguration is presented in Fig. 2. For single-context and multi-context DRCs, similar state diagrams can be used in the model.

Fig. 2. Reconfiguration state diagram

2.3 Transformation Tool to Support Reuse of Existing SystemC Modules

We developed a tool that can automatically transform SystemC modules, which however must follow a defined modeling pattern, into a SystemC module of a DRC. The inputs are SystemC files of a static architecture and a script file, which gives the names of the modules that are selected as candidate components and the associated design parameters. The outputs are SystemC files of a modified architecture, in which the specified SystemC modules have been replaced with a DRC module. The kernel of the tool contains a C++ parser to analyze the SystemC files, a script file parser and a template module of the DRC.

There are two specific requirements for the input modules. Firstly, modules should implement the bus interface methods with defined names. Otherwise the transformer would not have the knowledge of their meanings. Secondly, a DONE signal port with a specified name should exist in a candidate module in order to let the CS capture its status.

The main advantage of the modeling method is that the rest of the system and the candidate components need not be changed between a static approach and run-time reconfiguration approaches, which makes this method very useful for fast design space exploration.

3 Case Study

An MPEG2 decoder case is chosen to demonstrate that the approach is useful for the task of fast design space exploration. The starting point is a SystemC transaction-level model of a static architecture of the decoding system. Control-oriented tasks, such as variable-length decoding, are assigned to a RISC processor. Motion compensation is assigned to a DSP core. The color converter (CC), which processes 8 pixels in parallel, and the IDCT are assigned to two separate hardwired ASICs. A shared memory and a one-level system bus are used.

The task is to study the possibility of moving the IDCT and the CC from ASIC implementation to a DRC. Partial, single-context and multi-context reconfiguration are to be considered. Features of the target DRC are as follows. The DRC is a Virtex2-like FPGA. There are 3200 LUTs and 40 multipliers available.

The size of the bitstream to configure the full device is 200k bytes. The configuration clock is running at 50 MHz, and 8 bits are loaded every cycle. In the multi-context reconfiguration, there are two layers of programming bits and 5 clock cycles are required for context switching. In partial reconfiguration, the size of configuration data is proportional to the number of LUTs required.

We started with the estimation of the requirement of the configuration data in partial reconfiguration. The estimation tool showed that 2983 LUTs and 2688 LUTs were required for the IDCT and the CC respectively, which correspond to 186k and 168k bytes of configuration data. The estimation tool can produce results within minutes without any manual effort. In the SystemC modeling, the transformation tool can significantly reduce the amount of coding work. Designers need to edit only the script file of the design parameters, which can be easily done within a minute.

Three simulation packages were created using the modeling method described in section 2.2 and the simulation results are given in Table 1. The differences between the three configuration styles are clearly revealed. Designers can easily make design decisions when information on the ASIC area of the two functions and the estimates of design time are available. The case study shows that the approach is useful in helping designers to rapidly perform design space exploration.

Table 1. Comparison of reconfiguration latencies

            Decoding time (ms/fr)   Conf. latency (ms/fr)
Original    15.35                   NA
Single      26.69                   8.00
Multi       18.69                   2e-4
Partial     25.78                   7.09

4 Conclusions

In this paper, we have presented a system-level modeling methodology of DRCs. The use of DRCs will create a flexible system and result in shorter time-to-market when compared with an equivalent ASIC-type SoC implementation. It is very important to have an approach that allows designers in the early phase of design to rapidly explore the differences of using different reconfiguration alternatives. Our easy-to-use approach has been shown with an MPEG2 case to be able to fulfill the task.

References

1. Robinson, D., Lysaght, P.: Methods of exploiting simulation technology for simulating the timing of dynamically reconfigurable logic. IEE Proc., Vol. 147, No. 3 (2000) 175-180
2. Rissa, T., et al.: System-level modeling and implementation technique for run-time reconfigurable systems. Proc. 10th Annual IEEE Symposium on FCCM (2002) 295-296
3. Vanzago, L., et al.: Design space exploration for a wireless protocol on a reconfigurable platform. Proc. DATE (2003) 662-667
4. Pelkonen, A., et al.: System-Level Modeling of Dynamically Reconfigurable Hardware with SystemC. Proc. IPDPS'03 (2003) 174-181
5. Yang Qu, Soininen, J.-P.: Estimating the utilization of embedded FPGA co-processor. Proc. Euromicro Symposium on DSD (2003) 214-221
