
CHAPTER 1

INTRODUCTION
Arithmetic operations are important components of Digital Signal Processing (DSP)
and image processing applications [Naziri et al. (2014); Klinefelter et al. (2015)].
In particular, the Discrete Cosine Transform (DCT), the Fast Fourier Transform (FFT)
and Finite Impulse Response (FIR) filters need to be designed with an efficient multiplier
[Kouretas et al. (2013)]. Nowadays, a large number of handheld, portable, battery-operated
devices require efficient, error-free and low-power arithmetic operations [Abid et al.
(2016)]. Researchers have observed that noise signals are embedded in the inputs of such
applications [Ryu and Nishimura (2009)]. The 2-D Gaussian filter is an example of a
specialized filter that is frequently used for blurring (smoothing) and noise reduction
in image processing applications [Hsiao et al. (2007)]. Such filters have also been used
for edge detection in the human visual system. However, it is a well-known fact that the
multiplier has always been a limiting factor in terms of accuracy, speed and area. Considerable
research effort has been directed toward designing an efficient multiplier, and this must be
addressed at different levels of the design hierarchy. It has also been found that, at the
algorithmic level, the overall efficiency can be improved significantly.
The real world is full of applications of Logarithm Number System (LNS) based
multipliers, which motivates researchers to design an efficient LNS multiplier. A few of these
applications are briefly discussed here. Approximate computing has become an
applicable technique in areas where exact output is not a major concern but fast
processing is important, such as image processing and video-signal
processing and transmission [Mahalingam and Ranganathan (2006)]. It is well known
that arithmetic computations (floating-point or fixed-point) should be performed at
a reasonable speed [Hall et al. (1970); Duncan (2003); Sun and Kim (2011)]. The large
demand for high-speed DSP applications can be fulfilled with the use of high-order
FIR filters. Hence, an efficient Very Large Scale Integration (VLSI) architecture for a
high-speed FIR filter has been an active research topic. An FIR filter consists of multipliers,
adders and delay units. A logarithmic multiplier may be employed to achieve high
system performance. Various research fields such as astronomy, geography and medicine
frequently apply the concepts of Digital Image Processing (DIP) [Cabello et
al. (2015)]. These areas often require efficient, real-time results. A 32-bit
Logarithmic Arithmetic Unit (LAU) has been employed in a 3-D mobile graphics system.
With the LAU, performance was improved by about five times compared to the
complex radix-4 method. The LNS is broadly applicable in video compression, especially
in motion vector calculation [Babic et al. (2011)]. Babic's multiplier has been applied
to motion vector calculations; its block-matching algorithm was applied to
successive Computed Tomography (CT) scan frames for a selected region.
The 2-D Gaussian filter is widely used for smoothing and
noise removal. It requires substantial computational resources, and its implementation
efficiency has been a motivating research area. The Gaussian operator is a
convolution operator, and Gaussian smoothing is achieved by applying
convolution [Mitra Basu (2002)]. The convolution operation used for blurring
(smoothing) and noise reduction can be performed using the Mitchell algorithm and
Operand Decomposition (OD) multiplications, as sketched below. The Kalman filter used for
object tracking also employs logarithmic multipliers, and neural network processing
comprises a huge number of multiplication operations. However, multiplication circuits
consume a lot of area, time and power; various LNS techniques may be used to overcome
these issues. The Gaussian filter is composed of addition and multiplication operations,
and convolution is a multiplication-based process. Hence, an efficient logarithmic
multiplier is required to improve the accuracy of Gaussian filters. The numerous
applications of the logarithmic multiplier in DSP motivate researchers to contribute
further research efforts.
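To make the Gaussian smoothing step concrete, here is a minimal Python sketch that convolves a small image with an assumed 3x3 Gaussian kernel using plain nested loops; the kernel weights and test image are illustrative assumptions, and in an LNS-based design each product in the inner loop would be computed by the logarithmic multiplier.

    # Minimal sketch of 2-D Gaussian smoothing by direct convolution.
    # The kernel and image below are illustrative assumptions only.

    # 3x3 Gaussian kernel (sigma ~ 1), normalized so the weights sum to 1.
    KERNEL = [[1/16, 2/16, 1/16],
              [2/16, 4/16, 2/16],
              [1/16, 2/16, 1/16]]

    def gaussian_smooth(image):
        """Convolve a 2-D list of pixel values with the 3x3 Gaussian kernel."""
        rows, cols = len(image), len(image[0])
        output = [[0.0] * cols for _ in range(rows)]
        for r in range(1, rows - 1):          # skip the 1-pixel border
            for c in range(1, cols - 1):
                acc = 0.0
                for i in range(3):
                    for j in range(3):
                        # Each product here is a candidate for an LNS multiplier.
                        acc += KERNEL[i][j] * image[r + i - 1][c + j - 1]
                output[r][c] = acc
        return output

    # Example: a small image with an isolated bright pixel is blurred.
    img = [[0, 0, 0, 0], [0, 255, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
    print(gaussian_smooth(img)[1][1])  # ~63.75, the centre weight 4/16 * 255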
The next section introduces the Logarithm Number System, its importance, and the logarithmic multiplier.

1.1 LOGARITHM NUMBER SYSTEM

Various number systems (such as binary, BCD, decimal, hexadecimal, octal and
logarithmic) exist in the literature and are frequently used for the implementation of
arithmetic units. Most of these number systems fall into the categories of weighted
and non-weighted number systems. Multiplication and division operations introduce
delay and require extra hardware. For these reasons, most weighted binary arithmetic
units may have to compromise on circuit speed and area. The binary number system
suffers from large area, delay and high power consumption when multiplication, division,
root and power operations are performed in a DSP application. However, the primary goal
of DSP applications is to
process operations quickly with efficient hardware, which may not be achievable due to
the above-mentioned issues of the binary number system. To overcome these issues,
the Residue Number System (RNS) has been used in the realization of general-purpose
computers because, in RNS, carry digits do not propagate between moduli, which
reduces delay; an example is sketched below. However, RNS is problematic for division.
The LNS addresses these issues and bridges the technical gaps: it provides a new approach
to speed up the slow multiplication and division of conventional weighted numbers while
avoiding the issues inherent in RNS. The LNS is not intended to replace conventional
arithmetic units in general-purpose computers; rather, it is intended to enhance the
implementation of special-purpose processors for specialized applications
[Swartzlander and Alexopoulos (1975)].
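To illustrate the carry-free property of RNS mentioned above, the following minimal Python sketch represents integers by their residues with respect to an assumed modulus set {3, 5, 7} and applies addition and multiplication independently per modulus; the moduli are illustrative choices, not taken from the cited works.

    # Minimal RNS sketch with an assumed modulus set {3, 5, 7} (range 0..104).
    MODULI = (3, 5, 7)

    def to_rns(x):
        """Represent x by its residues; no carries link the three digits."""
        return tuple(x % m for m in MODULI)

    def rns_op(a, b, op):
        """Apply op independently per modulus -- no carry propagates between moduli."""
        return tuple(op(ai, bi) % m for ai, bi, m in zip(a, b, MODULI))

    def from_rns(r):
        """Recover the integer by brute-force search over the dynamic range."""
        rng = 1
        for m in MODULI:
            rng *= m
        return next(x for x in range(rng) if to_rns(x) == tuple(r))

    a, b = to_rns(17), to_rns(5)
    print(from_rns(rns_op(a, b, lambda p, q: p * q)))  # 85 = 17 * 5
    print(from_rns(rns_op(a, b, lambda p, q: p + q)))  # 22 = 17 + 5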
Mathematicians have used logarithms to simplify mathematical operations such as
multiplication and division, because these operations can then be performed using
addition and subtraction. LNS multipliers are advantageous in terms of speed and
accuracy over Fixed-Point (FXP) multipliers and Floating-Point (FLP) multipliers
[Mitchell (1962); Kostopoulos (1991)]. An LNS multiplier supports the two families of
native data types: integer or FXP data, and FLP data. FXP multiplication is frequently
used in general-purpose DSP applications because of its simpler algorithm, faster
implementation and clearer interpretation. However, the FLP representation is the suitable
choice where non-integer numbers must be represented. In 1985, floating-point arithmetic
was standardized by the IEEE 754 standard with two sizes, 32 bits and 64 bits. A
floating-point number can be represented as:
(-1)^s · 2^x · y                                                    (1.1)
where 's' is the sign bit, 'x' is the exponent, and 'y' is the mantissa.
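As a concrete illustration of Equation (1.1), the following minimal Python sketch unpacks a 32-bit IEEE 754 single-precision value into its sign, exponent and mantissa fields and reassembles it; the test value 6.5 is arbitrary, and the bias of 127 with the implicit leading one follows the standard normalized-number convention.

    import struct

    def decode_float32(value):
        """Split an IEEE 754 single-precision number into (sign, exponent, mantissa)."""
        bits = struct.unpack('>I', struct.pack('>f', value))[0]  # raw 32-bit pattern
        s = bits >> 31                    # 1 sign bit
        x = (bits >> 23) & 0xFF           # 8 exponent bits (biased by 127)
        y = bits & 0x7FFFFF               # 23 mantissa (fraction) bits
        return s, x, y

    s, x, y = decode_float32(6.5)
    # Reassemble in the spirit of Equation (1.1): (-1)^s * 2^(x-127) * (1 + y/2^23)
    print((-1) ** s * 2 ** (x - 127) * (1 + y / 2 ** 23))  # 6.5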
The floating-point logarithmic numbers are stored in the following format [Fu et al.
(2010)]:

Flags (2 bits) | Sign (1 bit) | Integer (x bits) | Fraction (y bits)

Here, the first 2 bits are reserved for a flag code indicating special exceptions (such as
negative, zero, Not a Number (NaN) and +/- infinity) [Detrey and Dinechin (2003)].
Following the IEEE 754 standard, two formats are used for LNS: single precision with a
total size of 32 bits and double precision with a total size of 64 bits. The LNS
single-precision equivalent has a range of approximately ±1.5×10^-39 to ±3.4×10^38, and the double-precision
equivalent has a range of approximately ±2.8×10^-309 to ±1.8×10^308 [Johansson et al. (2008)].
Floating-point operations give faster and more accurate results than integer operations
when performing multiplications wider than 32 bits. However, an integer representation
may provide more accuracy in cases where the range of the data fits well within a
floating-point representation with a known exponent. On the other hand, the representation
of infinities is possible in the floating-point number system [Maire et al. (2016)].
The primary LNS arithmetic operations (multiplication, division, addition and
subtraction) on two binary numbers A1 and A2 are shown in Table 1.1.
Table 1.1: Logarithmic arithmetic operations

Binary operation        Logarithmic operation
C = A1 × A2             Lg C = Lg A1 + Lg A2
C = A1 ÷ A2             Lg C = Lg A1 - Lg A2
C = A1 + A2             Lg C = Lg A1 + Lg(1 + 2^(Lg A2 - Lg A1))
C = A1 - A2             Lg C = Lg A1 + Lg(1 - 2^(Lg A2 - Lg A1))
The variables Lg C, Lg A1 and Lg A2 represent the logarithm values of C, A1 and A2,
respectively [Johansson et al. (2008); Das et al. (1995)]. In a conventional logarithmic
number system, the logarithm of the product of two numbers is given by the sum of
their logarithms. Suppose two numbers A1 and A2 are to be multiplied; then the sum of
Lg A1 and Lg A2 is equivalent to the logarithm of the product of A1 and A2, denoted Lg C.
In the same way, division is performed by subtracting the logarithms of the given numbers.
LNS addition and subtraction are essentially extensions of multiplication. Take C as the
sum of two numbers A1 and A2:

C = A1 + A2
C = A1 (1 + A2/A1)                                                  (1.2)

Taking the logarithm Lg of both sides of Equation (1.2) and noting that
Lg(A2/A1) = Lg A2 - Lg A1, so that A2/A1 = 2^(Lg A2 - Lg A1), gives the LNS addition equation:

Lg C = Lg A1 + Lg(1 + 2^(Lg A2 - Lg A1))                            (1.3)

In the same way, the LNS subtraction equation is obtained:

Lg C = Lg A1 + Lg(1 - 2^(Lg A2 - Lg A1))                            (1.4)
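The identities of Table 1.1 and Equations (1.3) and (1.4) can be checked numerically; the following minimal Python sketch does so with exact base-2 logarithms purely to illustrate the algebra, not any particular hardware realization (the operand values are arbitrary).

    from math import log2

    def lns_mul(lg_a1, lg_a2):
        """Product in the log domain: Lg C = Lg A1 + Lg A2."""
        return lg_a1 + lg_a2

    def lns_div(lg_a1, lg_a2):
        """Quotient in the log domain: Lg C = Lg A1 - Lg A2."""
        return lg_a1 - lg_a2

    def lns_add(lg_a1, lg_a2):
        """Sum in the log domain, Equation (1.3)."""
        return lg_a1 + log2(1 + 2 ** (lg_a2 - lg_a1))

    def lns_sub(lg_a1, lg_a2):
        """Difference in the log domain, Equation (1.4); assumes A1 > A2."""
        return lg_a1 + log2(1 - 2 ** (lg_a2 - lg_a1))

    a1, a2 = 20.0, 6.0
    lg_a1, lg_a2 = log2(a1), log2(a2)
    print(2 ** lns_mul(lg_a1, lg_a2))   # ~120.0 (= 20 * 6)
    print(2 ** lns_div(lg_a1, lg_a2))   # ~3.333 (= 20 / 6)
    print(2 ** lns_add(lg_a1, lg_a2))   # ~26.0  (= 20 + 6)
    print(2 ** lns_sub(lg_a1, lg_a2))   # ~14.0  (= 20 - 6)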
Logarithmic multiplication involves the conversion of an operand from the binary
number system to the LNS and vice versa. The logarithmic multiplication of two
operands A1 and A2 is performed in three phases: calculating the logarithm of each
operand, adding the two logarithms, and then taking the antilogarithm of the sum, which
becomes the product of the two operands. An example is given below.
Let A1 = b'(00010100) = d'20 and A2 = b'(00111100) = d'60;
Lg A1 = 0100.0100 and Lg A2 = 0101.11100;
Sum = Lg A1 + Lg A2 = (0100.0100) + (0101.11100) = 1010.00100;
Log^-1(Sum) = Log^-1(Lg A1 + Lg A2) = 00000010010000000;
A1 × A2 ≈ Log^-1(Sum) = 1152;
(The exact product 20 × 60 is 1200; the difference reflects the error of the approximate
logarithm and antilogarithm conversions.)
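To make the three phases concrete, the following minimal Python sketch models an approximate logarithmic multiplication in the spirit of Mitchell's method (leading-one position as the characteristic, remaining bits as the fraction); it reproduces the 20 × 60 ≈ 1152 result above, but it is only a behavioural illustration, not a hardware design.

    def approx_log2(n):
        """Approximate log2(n) as k + fraction, where k is the leading-one position."""
        k = n.bit_length() - 1            # characteristic: position of the leading one
        frac = (n - (1 << k)) / (1 << k)  # remaining bits interpreted as the fraction
        return k + frac

    def approx_antilog2(v):
        """Approximate 2**v as 2**k * (1 + fraction)."""
        k = int(v)
        frac = v - k
        return (1 << k) * (1 + frac)

    def log_multiply(a1, a2):
        """Three phases: logarithms, addition, antilogarithm."""
        return approx_antilog2(approx_log2(a1) + approx_log2(a2))

    print(approx_log2(20), approx_log2(60))  # 4.25 and 5.875, as in the example
    print(log_multiply(20, 60))              # 1152.0 (exact product is 1200)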
Thus, arithmetic computation based on the LNS involves three steps: (1) conversion
of binary numbers into logarithmic numbers, (2) the respective arithmetic operations
performed on the logarithmic numbers, and (3) antilogarithmic conversion. LNS
multipliers fall into two subcategories: (a) Lookup Table (LUT) and interpolation based
multipliers [Brubaker and Becker (1975); Tang (1991); Muscedere et al. (2005);
Arnold et al. (2003)] and (b) Mitchell's algorithm based multipliers [Mitchell (1962)].
LUT and interpolation based LNS multipliers achieve high accuracy, but their
drawback is hardware complexity, which increases design cost. Mitchell's method,
in contrast, is simple and area-efficient but suffers from low accuracy
[Mitchell (1962)]. Improvements in terms of area, delay and power are among the main
concerns for an LNS multiplier. Such LNS multipliers can be used in DSP applications
where accuracy is not critical, such as approximate computing and FIR filters.
This motivates researchers to work on faster, simpler and cheaper alternatives
while keeping an eye on accuracy.
A simple block diagram of logarithm-based multiplication is shown below in Figure
1.1. The implementations of the logarithmic converter and the antilogarithmic converter
play a significant role in deciding the accuracy and performance of LNS-based circuits
[Mitchell (1962)]. Many methods for logarithmic and antilogarithmic
conversion have been presented in recent years [Johansson et al. (2008); Das et al.
(1995); Abed and Sifred (2003); Joshua and Jong (2015); Juang et al. (2009); Juang et
al. (2011); Chaudhary and Lee (2014); Liu et al. (2016); SanGregory et al. (1999); Kim
et al. (2006); Paul et al. (2009); Kuo et al. (2012); Juang et al. (2015); Durgesh
Nandan et al. (2016); Kuo et al. (2016)]. Since logarithmic and antilogarithmic
converters play such a crucial role, efficient converter architectures are needed to
perform logarithmic arithmetic operations with the best possible
accuracy. The next section explains the various components of the logarithmic multiplier
and logarithmic converter. It is well known that the Leading-One Detector (LOD) is one of
the building blocks of the logarithmic multiplier and logarithmic converter; the next
section introduces the design and applications of the LOD.

[Figure 1.1 shows inputs X and Y each passing through a logarithmic converter, whose outputs feed an arithmetic circuit followed by an antilogarithmic converter.]

Figure 1.1: Block diagram of computation of arithmetic based on LNS

1.2 LEADING-ONE DETECTOR

The design of the Leading-One Detector (LOD) is important because of the normalization
process in logarithmic multiplication and logarithmic conversion for DSP applications
[Suzuki et al. (1996); Abed & Siferd (2006); Kunaraj & Seshasayanan (2013)]. The LOD is
used to find the position of the leading one bit in logarithmic multiplication and
logarithmic conversion. With the help of the LOD, the integer portion and the fractional
portion of the binary logarithm are obtained, as sketched below. Low-power and efficient
LOD architectures are in demand for logarithmic converters, where the LOD is a key
component of the shifting and normalization process [Abed & Siferd (2006)]. Researchers
are working to develop efficient architectures for the LOD
[Oklobdzija (1992); Lee & Sartori (1998)].
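As an illustration of what an LOD computes, the following minimal Python sketch scans an 8-bit word for its leading one and derives the integer and fractional parts of the approximate binary logarithm from it; the bit width and scan order are illustrative assumptions, and the code is a behavioural model rather than a proposed architecture.

    def leading_one_position(word, width=8):
        """Return the index of the most significant set bit, or -1 if word == 0."""
        for pos in range(width - 1, -1, -1):   # scan from MSB to LSB
            if (word >> pos) & 1:
                return pos
        return -1

    def binary_log_parts(word, width=8):
        """Split the approximate binary logarithm into integer and fraction parts."""
        k = leading_one_position(word, width)
        integer_part = k                                   # characteristic
        fraction_bits = word & ((1 << k) - 1)              # bits after the leading one
        return integer_part, fraction_bits / (1 << k)

    print(leading_one_position(0b00010100))   # 4
    print(binary_log_parts(0b00010100))       # (4, 0.25), matching Lg 20 ~ 0100.0100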
The reported LOD designs are either slow or hardware-inefficient and exhibit large
power dissipation. Existing methods (shifting-and-counting, bit-by-bit serial
evaluation, etc.) have been used to design LODs, but an effective technique is still
required to obtain a faster, hardware-efficient and low-power LOD circuit. This motivates
the exploration of new approaches. The next section describes the concept of reversibility
and its use in low-power LOD design.

1.3 REVERSIBILITY

Day by day, advances in technology give computers better performance at lower
cost. However, every technology comes with some impediments and penalties; here, the
penalty comes in terms of power dissipation. As indicated by
Moore [Thapliyal et al. (2005)], the essential reasons behind the rise in power dissipation
are the increase in clock frequency and the growing number of transistors packed
onto a silicon chip. In 1961, Landauer [Landauer (1961)] showed that irreversible hardware
computation results in power dissipation due to information loss: every change of state
from 'reset' to 'set' and vice versa entails a corresponding loss of energy.
In 1973, Bennett [Bennett (1973)] demonstrated that power dissipation can be close to
zero if the circuit can return to its initial state from its final state. Circuits
that lose information, and hence dissipate power, are called irreversible
circuits, whereas digital gates that do not lose information are called reversible logic
gates or lossless circuits [Toffoli (1980)].
Reversibility implies a one-to-one mapping between the inputs and outputs of a logic gate;
a small sketch illustrating this mapping is given below. It is divided
into two sub-categories: (1) logical reversibility and (2) physical reversibility.
Logical reversibility states that no information is lost as data travel from
input to output. The benefits of logical reversibility can be fully gained only by also
employing physical reversibility, which states that no energy is dissipated
in the form of heat while data travel from input to output. In practice, physical
reversibility is not achievable: computing systems give off heat when voltage levels
change and bits switch between 'zero' and 'one'. Reversible logic nevertheless has the
ability to produce outputs with less power dissipation, so it is nowadays an
emerging field for designing ultra-low-power and high-speed arithmetic computation
[Yelekar and Chiwande (2012)].
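To illustrate the one-to-one mapping, the following minimal Python sketch contrasts an irreversible AND gate with the reversible Feynman (controlled-NOT) gate by checking whether distinct inputs always produce distinct outputs; the exhaustive check is an illustrative device, not a circuit-level design.

    from itertools import product

    def and_gate(a, b):
        """Irreversible: two inputs collapse to one output, so information is lost."""
        return (a & b,)

    def feynman_gate(a, b):
        """Reversible Feynman (CNOT) gate: outputs (a, a XOR b), a bijection on 2 bits."""
        return (a, a ^ b)

    def is_reversible(gate, n_inputs):
        """A gate is logically reversible iff its input-to-output mapping is one-to-one."""
        outputs = [gate(*bits) for bits in product((0, 1), repeat=n_inputs)]
        return len(set(outputs)) == len(outputs)

    print(is_reversible(and_gate, 2))      # False: (0,0), (0,1), (1,0) all map to (0,)
    print(is_reversible(feynman_gate, 2))  # True: all four outputs are distinct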
Reversible logic, Quantum-Dot Cellular Automata (QCA) and quantum computing are
active research fields aimed at reducing power dissipation. Reversible computing has many
possible applications, including quantum computers, nanotechnology,
communication, QCA, computer graphics, information security and the design of low-power
arithmetic for DSP and Field Programmable Gate Arrays (FPGAs) [Nielsen & Chuang
(2000); Vedral (1996); Vos & Rentergem (2005); Ercan & Anderson (2011); Anderson
et al. (2012); Stearns & Anderson (2013)]. The next section describes the
implementation of logarithmic multiplication.

1.4 LOGARITHM MULTIPLICATION IMPLEMENTATION

Logarithmic multiplication based algorithms and architectures can be implemented
on general-purpose systems. However, it is a known fact that a general-purpose processor
occupies more area and consumes more power. VLSI design techniques make it possible
to design hardware-efficient, low-cost systems. The characteristic of a VLSI
system is that it offers concurrency and an enormous amount of computing power within a
small silicon area [Weste and Eshraghian (1993)]. Global on-chip communication is
not only expensive but also dissipates high power; thus, a high degree of parallelism is
crucial for the realization of a high-performance VLSI system [Kung (1982)]. High-
performance Application Specific Integrated Circuit (ASIC) systems (dedicated VLSI
systems) have been evolving rapidly in recent years. Therefore, VLSI logarithmic
multipliers are implemented to meet the requirements of real-time applications. Keeping
this fact in view, several design schemes have been suggested over the last fifty years for
the efficient design of logarithmic multipliers and logarithm and antilogarithm converters.
Researchers have applied different algorithms and architectural designs to increase the
accuracy and hardware performance of logarithmic multipliers. A detailed study of the
reported LNS methods, available design algorithms and design limitations is presented
in Chapter 2. An appropriate design strategy to improve the accuracy and hardware
performance of logarithmic multiplier structures, along with their possible applications,
is developed in this thesis.
