Xilinx


Robert K. Anderson, Xilinx DSP Technical Marketing

High-level ESL design methodologies now enable the simultaneous development of both the DSP algorithms and the hardware architectures, resulting in systems with a much higher level of optimization. Traditionally, FPGA implementations have been designed by engineers with limited knowledge of the application domain and, similarly, algorithms have been developed by researchers without insight into FPGA design. This gap results in inefficiencies in both the final hardware and the algorithmic system response. These inefficiencies can be addressed when the algorithmic methods and the hardware architectures are designed in a unified manner. Performing this type of algorithm and hardware co-design provides a new degree of freedom that allows the algorithmic methods to more closely match the capabilities and capacity of the targeted hardware platform.

MATLAB is an abstract language where the algorithmic specification does not necessarily imply any particular hardware implementation. As such, it is used extensively by researchers for algorithm development. These abstractions make it an obvious choice for architectural co-exploration and, combined with an ESL design methodology, provide the foundation for algorithm / hardware co-design.

This paper is the second part of a 3-part series discussing the considerations and possibilities for transitioning from high-level MATLAB into FPGA implementations. The first paper, “Using MATLAB to Create an Executable Specification for Hardware”, discusses how MATLAB can be used as an executable specification for hardware development and verification. This paper builds on those concepts by exploring how MATLAB code, supported by an automated ESL design flow, enables an algorithm developer to tailor the algorithmic methods to match the available resources on an FPGA, and in doing so produce a superior overall design.

These ESL tools are designed with the algorithm developer in mind, and are intended to bridge the gap between high-level algorithmic descriptions and low-level hardware implementation details.

**Introducing the AccelDSP Synthesis Tool**

The AccelDSP synthesis tool is a high-level algorithmic synthesis tool that takes high-level floating-point MATLAB algorithms as input and performs a series of specific transformations to generate an optimized, fixed-point design. These transformations remove the requirement that engineers manually re-code floating-point MATLAB into fixed-point MATLAB using “fi” datatypes, quantizer functions or fixed-point C. Removing this requirement eliminates the design gap that could otherwise create a divergence between the original floating-point MATLAB source and the hardware implementation. It also avoids forcing a design into any specific hardware implementation too early in the overall design process.
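To make the manual re-coding step concrete, here is a minimal sketch of what quantizing floating-point values to a signed fixed-point format involves. It uses plain MATLAB rather than “fi” objects or AccelDSP itself. The helper name `quantize_fx`, the test vector, and the word-length and fraction-length choices are all hypothetical and chosen only for illustration.

```matlab
% Hypothetical helper (not an AccelDSP or toolbox function):
% quantize x to a signed fixed-point value with word length wl and
% fraction length fl, using round-to-nearest and saturation.
quantize_fx = @(x, wl, fl) ...
    min(max(round(x * 2^fl), -2^(wl-1)), 2^(wl-1) - 1) / 2^fl;

x = [0.1, -0.7, 0.333, 1.5];          % made-up floating-point samples

for wl = [8 12 16]
    fl = wl - 2;                      % assumed split: 1 sign bit, 1 integer bit
    xq = quantize_fx(x, wl, fl);
    max_err = max(abs(x - xq));       % worst-case quantization error
    fprintf('wl = %2d, fl = %2d, max abs error = %g\n', wl, fl, max_err);
end
```

Each additional fraction bit halves the bound on the rounding error, which is the kind of precision-versus-area trade-off that has to be weighed when a word length is chosen by hand.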

Figure 1 – AccelDSP Synthesis Tool

AccelDSP is designed to be intuitive to use for algorithm developers who are familiar with the MATLAB programming language but not with RTL design methodologies. This is partly achieved by automating or encapsulating the FPGA synthesis, verification and implementation steps. Key to AccelDSP’s ease of use is a step-by-step flowbar that breaks out each step of the design flow and guides users through the process.

The tool supports a variety of “architectural exploration” capabilities that allow the user to quickly modify an implementation using various numerical implementations and micro-architectures. Once a suitable fixed-point implementation is achieved, the user can use these design exploration techniques to improve the overall area or performance of the design, and subsequently generate an RTL implementation optimized for targeting Xilinx devices. This will be discussed in more detail later in the article.

**MATLAB to Embedded C Design Flow**

In a recent article from The MathWorks on their MATLAB to Embedded C design flow (http://www.dspdesignline.com/showArticle.jhtml?articleID=207800773), a design example was provided that helps illustrate the steps of this development process. This same code example can also be used to illustrate a similar set of steps in the MATLAB to FPGA development process. The MATLAB function “adaptive_stats.m” extracts successively larger regions around a given element of a matrix to compute statistics (minimum, maximum, and median) over these regions. The function outputs a filtered center element of the matrix based on these statistics.

    function Out = adaptive_stats_roi(In)
    %In  : square matrix with odd number of rows/columns
    %Out : Center element kept or treated based on statistics
    smax=size(In,1);
    index=ceil(smax/2);
    center=In(index,index);
    Out=center;
    for s = 3:2:smax
        % Compute region of interest statistics
        [rmin,rmax,rmed]=roi_states(In,smax,s);
        % Conditionally remove noise
        if rmed > rmin && rmed < rmax
            if center <= rmin || center >= rmax
                Out = rmed;
            end
            break;
        end
    end

    function [rmin,rmax,rmed] = roi_states(in,smax,s)
    first = ceil(smax/2)-floor(s/2);
    last = ceil(smax/2)+floor(s/2);
    %initialize large array for sorting
    v = ones(smax*smax,1);
    count = 1;
    for i = first:last
        for j = first:last
            v(count) = in(i,j);
            count = count+1;
        end
    end
    %Sort the sub-matrix only
    n = count-1;
    v = mysort(v,n);
    %Compute statistics on sub-matrix
    rmed = v(ceil(n/s));
    rmin = v(1);
    rmax = v(n);

**AccelDSP Design Example**

Let’s take the same example used above by The MathWorks in their MATLAB to Embedded C design flow and show how it can be targeted for implementation in an FPGA.

    function Outdata = adaptive_stats_roi(Indata1, Indata2, Indata3, Indata4, Indata5, Indata6, Indata7, Indata8, Indata9)
    %%%%%%%% .1 . Matrix Indata is constructed in AccelDSP by concatenating multiple input vectors
    Indata = [Indata1 Indata2 Indata3; Indata4 Indata5 Indata6; Indata7 Indata8 Indata9 ];
    %Indata : square matrix with odd number of rows/columns
    %Out    : Center element kept or treated based on statistics
    smax=size(Indata,1);
    index=ceil(smax/2);
    center=Indata(2,2);
    Outdata=center;
    for s = 3:2:smax
        % Compute region of interest statistics
        [rmin,rmax,rmed]=roi_states(Indata,smax,s);
        % Conditionally remove noise
        if rmed > rmin && rmed < rmax
            if center <= rmin || center >= rmax
                Outdata = rmed;
            end
        end
    end

    function [rmin,rmax,rmed] = roi_states(Indata,smax,s)
    first = ceil(smax/2)-floor(s/2);
    last = ceil(smax/2)+floor(s/2);
    %initialize large array for sorting
    v = ones(smax*smax,1);
    count = 1;
    for i = first:last
        for j = first:last
            v(count) = Indata(i,j);
            count = count+1;
        end
    end
    %Sort the sub-matrix only
    n = count-1;
    v_in = v;
    %%%%%%%% .2 . Substitute in an AccelDSP sorting function
    v = selectsort(v_in);
    %Compute statistics on sub-matrix
    rmed = v(ceil(n/s));
    rmin = v(1);
    rmax = v(n);

**MATLAB Coding Considerations**

Since MATLAB is an abstract language where the algorithmic specification does not necessarily imply any particular hardware architecture, various algorithmic methods may have a large number of numerically equivalent hardware architectures, depending on the MATLAB coding style used. To illustrate some of these items, we list the changes made in the MATLAB source to prepare it for FPGA implementation.

**Construction of internal MATLAB Matrices**

The first coding instance, “%%%%%%%% .1 . Matrix Indata is constructed in AccelDSP by concatenating multiple input vectors”, allows internal matrices to be constructed by passing in vectors and concatenating them together. AccelDSP then takes these vectors and aggregates them into internal arrays.

**The Sort Function**

Next, we used a custom sort function, marked “%%%%%%%% .2 . Substitute in an AccelDSP sorting function”. Sorting functions can be found at a variety of locations, as there is a diverse set of implementations. The definition of the sort function we are using follows:

    % Selection sort: returns the elements of v_in in ascending order
    function v = selectsort(v_in)
    v = v_in;
    n = length(v);
    for k=1:n-1
        index = k;
        for j=k+1:n
            if v(j) < v(index)
                index = j;
            end
        end
        v([k,index]) = v([index,k]);
    end

**Architectural Co-Exploration**

As mentioned before, a single algorithm can have a large number of different implementations which are, for the most part, numerically equivalent. For example, there are many different ways to implement division. Each of these methods is characterized by a set of trade-offs associated with numerical precision, implementation performance, resource utilization, design throughput, and area constraints. Furthermore, designing, testing, and implementing these building blocks can take a significant amount of time and effort.

AccelDSP addresses these issues by providing the user with a library of pre-designed building blocks that can be easily swapped into and out of a design, covering numerical methods, linear algebra, trigonometric functions, square root, transformations (FFT, DCT, DWT, etc.), and others. These building blocks are parameterized so the user can further tailor the final implementation for precision, micro-architectures, and specific FPGA resources such as RAMs, flip-flops and shift registers.

For example, in the function roi_states(), a division function is used immediately following selectsort(). This division is used to compute statistics on a sub-matrix in the design.

    %Compute statistics on sub-matrix
    rmed = v(ceil(n/s));

AccelDSP, through design exploration, gives the user the ability to pick a specific numerical architecture for this function. In this instance, AccelDSP used some basic heuristics to automatically select a Bipartite Table implementation. A number of alternative implementations, such as CORDIC, Newton-Raphson and LUT, also exist and could have been selected by the user using a design directive. This ability to quickly modify micro-architectures for the key building blocks of a design allows an algorithm developer to evaluate a much larger hardware solution space quickly and with minimal hardware design experience.

**Floating- to Fixed-Point Conversion and Co-Verification**

Once the basic hardware architecture for a design has been established, further refinements to the design are possible that help improve overall area, performance and system response. One of these refinements is quantization. FPGAs are not limited to fixed 8, 16 or 32 bit data widths as fixed-point DSP processors are. This additional freedom gives FPGA developers the ability to tailor the quantization of a data-path to an exact set of requirements. This is an iterative process that requires constant feedback.

AccelDSP gives the user the option to generate fixed-point ‘C’, generate fixed-point MATLAB, or perform hardware co-simulation of a MATLAB algorithm. Fixed-point ‘C’ is the default language generated for fixed-point verification because it is significantly faster than running fixed-point MATLAB. When algorithms become sophisticated and vector sets grow very large, hardware co-simulation becomes a viable option and can increase simulation performance by up to 1000x. Hardware co-simulation establishes a simulation connection between a MATLAB system model and a design running on an actual FPGA that is part of a supported hardware platform such as the “XtremeDSP Starter Platform – Spartan-3A DSP Edition” or the “XtremeDSP Development Platform – Virtex-5 SXT Edition.” The table below shows a comparison of simulations performed using just MATLAB or Simulink vs. the improved simulation run times when HW co-simulation is employed.

Simulation Time (sec)

| Design | Non-accelerated | Accelerated (through HW co-sim) | Increase |
|---|---|---|---|
| Beamformer | 113 | 2.5 | 45X |
| OFDM BER Test | 742 | 0.75 | 989X |
| DUC CFR | 731 | 23 | 32X |
| Color Space Conv | 277 | 4 | 69X |
| Video Scalar | 10422 | 92 | 113X |

**Conclusion**

The abstract nature of MATLAB combined with the powerful and easy-to-use hardware design explorations of AccelDSP provides an efficient environment from which to perform algorithm / hardware architecture co-design. This methodology enables an algorithm developer who is not familiar with FPGA design techniques to quickly assess the feasibility of the algorithms on an FPGA and to refine the algorithmic methods to match the available hardware resources. Enabling this insight early in the design process helps refine the overall system and produce a superior overall design.

For more information, please visit www.xilinx.com/dsp and www.xilinx.com/acceldsp.
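For readers who want to try the design example's filtering step in plain MATLAB before targeting an FPGA, the following sketch runs the selection-sort loop and the noise-removal test on a single 3x3 neighbourhood. The input matrix is made-up data, and the median is taken at the conventional position ceil(n/2) rather than the v(ceil(n/s)) index shown in the listing. Both are assumptions for illustration, not AccelDSP output.

```matlab
% 3x3 neighbourhood with an impulsive outlier at the centre (made-up data).
Indata = [10 12 11; 13 255 12; 11 10 13];

% Flatten the region of interest and sort it in ascending order with the
% same selection-sort loop structure as selectsort().
v = Indata(:);
n = length(v);
for k = 1:n-1
    index = k;
    for j = k+1:n
        if v(j) < v(index)
            index = j;
        end
    end
    v([k, index]) = v([index, k]);    % swap the smallest remaining element into place
end

% Region-of-interest statistics on the sorted vector.
rmin = v(1);
rmax = v(n);
rmed = v(ceil(n/2));                  % assumption: conventional median index

% Conditionally remove noise, as in adaptive_stats_roi().
center = Indata(2, 2);
Outdata = center;
if rmed > rmin && rmed < rmax
    if center <= rmin || center >= rmax
        Outdata = rmed;
    end
end

fprintf('center = %d, filtered output = %d\n', center, Outdata);
```

Because the centre value equals the maximum of the neighbourhood, it is treated as noise and replaced by the median. A centre value strictly between rmin and rmax would be passed through unchanged.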
