Muhammad Iram Baig, Farooq Niaz and Adeel Mukhtar.
This paper discusses the technique of implementing shifters on an FPGA using tri-state buffers and compares it against the standard technique of implementing shifters using combinational logic. An overview of the resources used, power and speed is also provided. Furthermore, enhancements that can be made to tri-state buffer based shifters by optimizing the control circuitry, which make the new technique viable in terms of area, speed and power, are also discussed.



Shifters are required in a variety of applications, including arithmetic operations, bit indexing, and variable length coding. The barrel shifter is a common design choice because it can perform multi-bit shifts in a single operation. Many researchers have explored barrel shifters with varying emphasis on area, power [2] and performance [3, 4, 5]. The traditional shifter design is based on multiplexers and, in FPGAs, is mostly implemented using combinational logic. The input data, applied to the multiplexer data lines, is shifted by the amount specified on the selection lines. The number of multiplexer data lines must be greater than the number of data bits in the input number. The number of multiplexers required is equal to the number of shifter output bits. The output may be wider than the input, in which case the shifter can place the input bits anywhere in the output field, according to the amount and direction given by the shift amount, and it zero fills or sign extends the vacated bit positions.
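The multiplexer-based scheme described above can be sketched as a small behavioural model. This is an illustration only, not the authors' implementation: the function name `mux_shift`, its parameters, and the assumption that the output is as wide as the input are all hypothetical choices made for the example.

```python
def mux_shift(data_bits, shift, left=True, arithmetic=False):
    """Behavioural model of a multiplexer-based shifter.

    data_bits is a list of 0/1 values with index 0 as the LSB.
    Assumption for this sketch: output width equals input width.
    """
    n = len(data_bits)
    # Vacated positions get the MSB for an arithmetic right shift, else zero.
    fill = data_bits[-1] if (arithmetic and not left) else 0
    out = []
    for k in range(n):  # one multiplexer per output bit
        # The shift amount acts as the select input: it picks which
        # input bit (or the fill value) appears on output bit k.
        src = k - shift if left else k + shift
        out.append(data_bits[src] if 0 <= src < n else fill)
    return out
```

Each pass through the loop stands for one output multiplexer, which is why the multiplexer count follows the number of output bits.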
This work is supported by the ASIC Design & DSP section at the University of Engineering & Technology, Taxila, Pakistan. Muhammad Iram Baig is a Professor at the Center of Excellence ASIC Design & DSP, University of Engineering & Technology, Taxila, Pakistan. Farooq Niaz is Lab Manager/Assistant Professor at the Center of Excellence ASIC Design & DSP, University of Engineering & Technology, Taxila, Pakistan. Adeel Mukhtar is a final-year BSc student in the Electrical Engineering Department at the University of Engineering & Technology, Taxila, Pakistan.

The logic used in these shifters is highly combinational and based on AND and OR gates, which introduce delay. As the size of the shifter grows, the number of gate stages increases and so does the delay. The larger area also consumes more power, which is critical in power sensitive applications. Moreover, if the number of output bits is greater than the number of input bits, the number of multiplexers increases, and a huge combinational design can result that is wasteful in terms of area and speed. This combinational logic is built using the LUTs of the CLBs. An FPGA also contains a large number of tri-state buffers, which can be used to build the multiplexing logic instead. To utilize the FPGA resources optimally, we have developed a technique to implement shifters efficiently using the built-in FPGA tri-state buffers, leaving the precious CLBs for other logic. This paper presents this technique and also discusses an optimization for implementing the tri-state control efficiently.

II. DESIGN AND COMPARISON OF SHIFTERS

For a tri-state buffer based shifter design, each bit of the input is tied to every bit of the output through tri-state buffers. For a 4x4 shifter array, this is shown for one input bit in Fig. 1.

Fig. 1. This is the basic building block that is used for each bit of the input number. In this figure it is shown for a 4x4 Shifter array. The inverter is optional and could be used based on the tri-state buffer enable condition.
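The building block of Fig. 1 can be modelled in software as a set of source bits that each drive the shared output lines through individually enabled buffers. The sketch below is a hypothetical behavioural model, not a description of the FPGA primitives: the name `drive_outputs`, the use of `None` for a high-impedance line, and the contention check are assumptions made for illustration.

```python
def drive_outputs(sources, enables, width):
    """Resolve each output line from its enabled tri-state drivers.

    sources: list of 0/1 source bits (input bits plus any fill bit).
    enables: one list per source; enables[s][k] == 1 means source s
             drives output bit k.
    Returns one value per output bit; None models an undriven line.
    """
    outputs = []
    for k in range(width):
        drivers = [bit for bit, en in zip(sources, enables) if en[k]]
        if len(drivers) > 1:
            # Two enabled buffers on one line would fight each other.
            raise ValueError(f"bus contention on output bit {k}")
        outputs.append(drivers[0] if drivers else None)
    return outputs


# Example: 4x4 array, shift left by 1, with a zero fill bit as fifth source.
data = [1, 0, 1, 1]            # index 0 is the LSB
sources = data + [0]           # last source is the zero bit
enables = [
    [0, 1, 0, 0],              # input bit 0 -> output bit 1
    [0, 0, 1, 0],              # input bit 1 -> output bit 2
    [0, 0, 0, 1],              # input bit 2 -> output bit 3
    [0, 0, 0, 0],              # input bit 3 is shifted out
    [1, 0, 0, 0],              # zero bit -> output bit 0
]
shifted = drive_outputs(sources, enables, 4)
```

The model makes the key constraint explicit: for any shift amount, exactly one buffer may be enabled on each output line.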

The input bit can be enabled onto any of the output bits by controlling the enable input of the corresponding tri-state buffer. The above module is replicated for all the bits of the input number. There are also two extra bits that may be enabled onto the outputs, mutually exclusively, as required by the specified shift amount. These are the extension bit for right arithmetic shift and the zero for left shift. But we need only one module like Fig. 1 for these two bits, with a single 1-bit 2-to-1 multiplexer as the input to the module. One data line of this multiplexer is tied to zero, which is selected for left shift and for logical right shift, while the other is tied to the MSB extension bit for arithmetic right shift. The control line of this multiplexer is driven by the bit of the shift amount that specifies the direction and type of the shift. As left and right shifts are mutually exclusive, this optimization can be practiced in the design.

The total number of tri-states used in the design of a shifter is given by the formula below:

    N = (r+1) * k

where
    r = number of input bits,
    k = number of output bits,
    N = number of tri-state buffers used in the design.

The added one is for the extra bit that may need to be output on any number of output bit positions based on the shift amount. Note that this is the only bit that can be enabled onto multiple output lines for a specified shift amount, while all other input bits are enabled onto only one output line for a given shift amount.

Given below are the tri-state enable controls for a 4x4 shifter array, using separate modules for the extension bit and the zero bit, for three different shift amounts. Here en0 means enable input bit 0 for the output bit positions dictated by the 1's in the en0 vector. Note that en4 enables the extension bit and en5 enables the zero bit onto the output lines.

Shift Amount = 3'b111 (no shift):
    en0=4'b0001   //enable bit number 0 of input to output bit 0
    en1=4'b0010   //enable bit number 1 of input to output bit 1
    en2=4'b0100   //enable bit number 2 of input to output bit 2
    en3=4'b1000   //enable bit number 3 of input to output bit 3
    en4=4'b0000   //extension bit is disabled for all output lines
    en5=4'b0000   //zero is disabled for all output lines

Shift Amount = 3'b110 (shift left by 1):
    en0=4'b0010   //enable bit number 0 of input to output bit 1
    en1=4'b0100   //enable bit number 1 of input to output bit 2
    en2=4'b1000   //enable bit number 2 of input to output bit 3
    en3=4'b0000   //disable input bit 3 for all output bits
    en4=4'b0000   //disable extension bit to all output bits
    en5=4'b0001   //enable zero to output line 0

Shift Amount = 3'b101 (shift left by two):
    en0=4'b0100   //enable bit number 0 of input to output bit 2
    en1=4'b1000   //enable bit number 1 of input to output bit 3
    en2=4'b0000   //disable input bit 2 for all output bits
    en3=4'b0000   //disable input bit 3 for all output bits
    en4=4'b0000   //disable extension bit for all output bits
    en5=4'b0011   //enable zero to 2 least significant output bits
    ……

NOTE: Separate modules for the sign extension bit and the zero extension bit were used here; they can be replaced by a single module using the above technique.

III. CONTROL OPTIMIZATION FOR TRI-STATE ENABLE LINES

The one drawback that may have been noticed from the above discussion is the large number of tri-state buffers that have to be controlled, and the huge combinational cloud that may result if a decoder or any synthesizer generated combinational circuit is used for controlling the enable lines of these tri-state buffers. But a careful look at the enable vectors reveals that they follow a certain relationship. It can be noticed from the enable codes that, for a particular shift amount, e.g. when en0's bit 0 is enabled, at the same time en1's bit 1, en2's bit 2, en3's bit 3, en4's bit 4 and en5's bit 5 are enabled. Thus we need only detect the condition for en0's bit 0, and for that condition a chain can be driven, greatly simplifying the complex control decoder that might otherwise have resulted.

Below is the comparison of the three shifters in terms of area, speed and power consumption. The shifters discussed are multiplexer based, tri-state based without control optimization and tri-state based with control optimization. The sizes of shifter arrays used are 4x4, 8x8, 16x16 and 32x32. These results have been obtained using the Xilinx synthesis tool with a Spartan 2 (Xc2s100-pq208-6) FPGA as the target device.

TABLE I
MAX. COMBINATIONAL DELAY
Columns: Size; Mux Based; Tri-State Based (without control optimization); Tri-State Based (with control optimization). Rows: 4x4, 8x8, 16x16, 32x32.
[The twelve delay values, all between 9 ns and 31 ns, cannot be matched to their table cells in this copy.]
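The regular pattern in the 4x4 enable codes, and the tri-state count formula N = (r+1) * k, can be checked with a short script. This is a minimal sketch under stated assumptions: it covers left shifts only, it leaves out the extension-bit vector en4, and the names `enable_vectors` and `tristate_count` are hypothetical helpers rather than part of the original design.

```python
def enable_vectors(width, left_shift):
    """Generate the enable codes for a left shift as MSB-first bit strings.

    Returns (en, zero): en[i] is the enable vector for input bit i, and
    zero is the enable vector for the zero-fill bit (en5 in the 4x4 case).
    """
    en = []
    for i in range(width):
        # Chain relationship: input bit i lands i positions above input bit 0,
        # so only the position of input bit 0 has to be decoded.
        pos = i + left_shift
        en.append(format(1 << pos if pos < width else 0, f"0{width}b"))
    # The zero bit fills every vacated low-order position.
    zero_mask = (1 << min(left_shift, width)) - 1
    return en, format(zero_mask, f"0{width}b")


def tristate_count(r, k):
    """N = (r + 1) * k: r input bits plus one shared fill bit, k outputs."""
    return (r + 1) * k
```

For example, `enable_vectors(4, 2)` yields the en0 to en3 codes and the en5 code listed for a shift left by two, and `tristate_count(4, 4)` gives 20 buffers for the 4x4 array with a single shared fill module.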

IV. ANALYSIS OF RESULTS

TABLE II
COMPARISON IN TERMS OF AREA

Size     Mux Based    Tri-State Based (without    Tri-State Based (with
                      control optimization)       control optimization)
4x4      14 LUTs      13 LUTs                     13 LUTs
8x8      52 LUTs      79 LUTs                     29 LUTs
16x16    183 LUTs     950 LUTs                    81 LUTs
32x32    535 LUTs     3618 LUTs                   179 LUTs

TABLE III
COMPARISON IN TERMS OF POWER CONSUMPTION

Size     Mux Based    Tri-State Based (without    Tri-State Based (with
                      control optimization)       control optimization)
4x4      11 mW        11 mW                       7 mW
8x8      11 mW        11 mW                       7 mW
16x16    11 mW        11 mW                       7 mW

Fig. 2. This is a plot of the maximum combinational delay in nanoseconds for a multiplexer based shifter, a tri-state based shifter without control optimization and a tri-state based shifter with control optimization, with the shifter array size on the X-axis.

Fig. 3. This figure is a plot of the number of 4-input LUTs occupied for a multiplexer based shifter, a tri-state buffer based shifter without control optimization and a tri-state buffer based shifter with control optimization, with the shifter array size on the X-axis.

As shown by the above results, as the size of the shifter array increases, the multiplexer based shifters become infeasible in terms of both area and timing. If the control optimization is not used with the tri-state based shifter design, then the results in terms of area and timing grow even worse. On the other hand, a shifter using tri-state buffers along with the control optimization technique described above gives the most feasible results in terms of area, power consumption and timing requirements.

REFERENCES

[1] M. Morris Mano and Charles R. Kime, "Logic & Computer Design Fundamentals," pp. 364-366, Prentice Hall, 2000.
[2] K. Acken, M. J. Irwin, and R. Owens, "Power comparisons for barrel shifters," in ISLPED, pp. 209-212, 1996.
[3] R. Pereira et al., "Fully pipelined TSPC barrel shifter for high speed applications," IEEE Journal of Solid-State Circuits, pp. 686-690, June 1995.
[4] G. M. Tharakan and S. M. Kang, "A new design of a fast barrel switch network," IEEE Journal of Solid-State Circuits, pp. 217-221, Feb. 1992.
[5] S.-J. Yih, M. Cheng, and W. Feng, "Multilevel barrel shifter for CORDIC design," IEE Electronics Letters, pp. 1178-1179, June 1996.
[6] Jan M. Rabaey, Anantha Chandrakasan, and Borivoje Nikolic, "Digital Integrated Circuits, A Design Perspective," pp. 594-595, Pearson Education Electronics and VLSI series.
[7] Peter A. Beerel, Sangyun Kim, Pei-Chuan Yeh, and Kyeounsoo Kim, "Statistically Optimized Asynchronous Barrel Shifters for Variable Length Codecs," IEEE Journal of Low Power Electronics and Design, pp. 261-263, 1999.
