CHAPTER 1
ADDERS COMPARISON
1.0) INTRODUCTION
Any digital filter can be realized using adders, delay elements and multipliers, and the same building blocks can be extended to realize any system. To increase the speed of a system, we therefore need to increase the speed of addition and multiplication.
In the conventional binary number system, when two numbers are added the carry may propagate all the way from the least significant digit to the most significant digit. The addition time therefore depends on the word length (linearly, in ripple carry adders).
In this chapter we design various adders based on different addition techniques in Active-HDL and study their delay and complexity characteristics. Our focus is on adders in which the addition time is independent of the word length, such as the RBSD adder and the QSD adder. Carry-free addition in RBSD and QSD arithmetic is achieved by exploiting the redundancy of RBSD and QSD numbers, which allows multiple representations of any integer quantity. Carry-free addition involves two steps. The first step generates an intermediate carry and sum from the addend and augend. The second step combines the intermediate sum of the current digit with the carry of the next lower significant digit.
By designing adders in which the carry does not propagate, we can considerably increase the speed of addition. In a system involving a large number of adders and multipliers, the response time can thus be considerably improved.
1.1) FULL ADDER
1.1.1) Introduction
A full adder is a logic circuit that adds three binary digits and produces a sum and a carry value, both of which are binary digits. It can be combined with other full adders (see below) or work on its own. [1]
A full adder adds binary numbers and accounts for values carried in as well as out. A one-bit full adder adds three one-bit numbers, often written as A, B, and Cin; A and B are the operands, and Cin is a bit carried in (in theory from a past addition).
Figure 1.1: Full adder
The delay through a digital circuit is measured in gate delays, as this allows the delay of a design to be calculated for different devices. AND and OR gates have a nominal delay of 1 gate delay, and XOR gates have a delay of 2, because they are really made up of a combination of ANDs and ORs.
A full adder block has the following worst-case propagation delays:
* From A or B to Cout: 4 gate delays (XOR → AND → OR)
* From A or B to S: 4 gate delays (XOR → XOR)
* From Cin to Cout: 2 gate delays (AND → OR)
* From Cin to S: 2 gate delays (XOR)
The worst-case propagation delay of a 1-bit full adder is therefore 4 gate delays, assuming that both the normal and the complemented forms of the inputs are available.
1.1.2) VHDL code of full adder
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity fullader is
port(
a, b, cin : in bit;
sum, cout : out bit
);
end fullader;
-- }} End of automatically maintained section
architecture fullader of fullader is
begin
-- Sum and carry-out, each modelled with the worst-case delay (4 ns here).
sum <= a xor b xor cin after 4 ns;
cout <= (a and b) or (b and cin) or (a and cin) after 4 ns;
end fullader;
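The full adder can be exercised with a small self-checking testbench. The following is only a sketch: the entity name fullader_tb and the 10 ns settling time are assumptions chosen for illustration (any wait longer than the 4 ns modelled delay would do), and the expected values are computed from the arithmetic sum of the three input bits.

```vhdl
-- Self-checking testbench sketch for the fullader entity (name fullader_tb is hypothetical).
entity fullader_tb is
end fullader_tb;

architecture sim of fullader_tb is
  component fullader
    port (a, b, cin : in bit;
          sum, cout : out bit);
  end component;
  signal a, b, cin, sum, cout : bit;
begin
  dut : fullader port map (a => a, b => b, cin => cin, sum => sum, cout => cout);

  stimulus : process
    variable total : integer;
  begin
    for i in 0 to 7 loop
      -- Drive the three inputs from the three bits of the loop index.
      if (i / 4) mod 2 = 1 then a <= '1'; else a <= '0'; end if;
      if (i / 2) mod 2 = 1 then b <= '1'; else b <= '0'; end if;
      if i mod 2 = 1 then cin <= '1'; else cin <= '0'; end if;
      wait for 10 ns;  -- assumed settling time, longer than the 4 ns adder delay
      -- Number of inputs that are '1' (0 to 3).
      total := (i / 4) mod 2 + (i / 2) mod 2 + i mod 2;
      assert (sum = '1') = (total mod 2 = 1)
        report "sum mismatch" severity error;
      assert (cout = '1') = (total >= 2)
        report "cout mismatch" severity error;
    end loop;
    wait;  -- suspend the process once all 8 cases are checked
  end process;
end sim;
```

Named association is used in the port map so that the check does not depend on the order in which the ports are declared.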
1.2) RIPPLE CARRY ADDER
1.2.1) Introduction
It is possible to create a logic circuit that adds N-bit numbers by using multiple full adders. Each full adder takes as its Cin the Cout of the previous adder. This kind of adder is called a ripple carry adder, since each carry bit "ripples" to the next full adder. [2]
Figure 1.2: Ripple Carry Adder
Because the carry-out of one stage is the carry-in of the next, the worst-case propagation delay is:
* 4 gate delays to generate the first carry signal ($A_0/B_0 \to C_1$).
* 2 gate delays per intermediate stage ($C_i \to C_{i+1}$).
* 2 gate delays at the last stage to produce both the sum and carry-out outputs ($C_{n-1} \to C_n$ and $S_{n-1}$).
So for an n-bit adder, the total propagation delay $t_p$ is:
$t_p = 4 + 2(n - 2) + 2 = 2n + 2$ (1.1)
This is linear in n; a 32-bit addition would take 66 gate delays to complete. This is rather slow and restricts the word length in our device somewhat, so we would like to find ways to speed it up.
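As a quick check of equation (1.1), the delay can be evaluated term by term for two word lengths (the 4-bit case matches the 4-bit adder coded in the next section; the values follow directly from the formula):

```latex
\begin{align*}
t_p(4)  &= 4 + 2(4 - 2) + 2  = 10 \text{ gate delays} \\
t_p(32) &= 4 + 2(32 - 2) + 2 = 66 \text{ gate delays}
\end{align*}
```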
1.2.2) VHDL code of ripple carry adder
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity ripplecarry is
port( a, b: in bit_vector(3 downto 0); ci: in bit;
s: out bit_vector(3 downto 0); co: out bit
);
end ripplecarry;
-- }} End of automatically maintained section
architecture ripplecarry of ripplecarry is
component fullader
port (a, b, cin: in bit;
cout, sum: out bit);
end component;
signal c: bit_vector(3 downto 1);
begin
fa0: fullader port map (a(0), b(0), ci, c(1), s(0));
fa1: fullader port map (a(1), b(1), c(1), c(2), s(1));
fa2: fullader port map (a(2), b(2), c(2), c(3), s(2));
fa3: fullader port map (a(3), b(3), c(3), co, s(3));
end ripplecarry;
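The same structure extends to any word length with a generate statement. The version below is a sketch rather than part of the original design: the entity name ripplecarry_n and the generic n are hypothetical, and it assumes the fullader entity from section 1.1.2 is compiled into the same library.

```vhdl
-- Generic n-bit ripple carry adder sketch (entity name and generic are assumptions).
entity ripplecarry_n is
  generic (n : positive := 4);  -- word length; 4 matches the fixed-width listing
  port (a, b : in  bit_vector(n - 1 downto 0);
        ci   : in  bit;
        s    : out bit_vector(n - 1 downto 0);
        co   : out bit);
end ripplecarry_n;

architecture structural of ripplecarry_n is
  component fullader
    port (a, b, cin : in bit;
          sum, cout : out bit);
  end component;
  -- c(i) is the carry into stage i; c(n) is the final carry-out.
  signal c : bit_vector(n downto 0);
begin
  c(0) <= ci;
  stages : for i in 0 to n - 1 generate
    fa : fullader port map (a => a(i), b => b(i), cin => c(i),
                            sum => s(i), cout => c(i + 1));
  end generate;
  co <= c(n);
end structural;
```

Because each stage waits for c(i) from the stage below it, the carry chain still grows linearly with n, which is the behaviour described by equation (1.1).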
Figure 1.3: VHDL simulation of ripple carry adder
1.3) CARRY LOOK AHEAD ADDER [2]
1.3.1) Introduction
The generate function, $G_i$, indicates whether stage i produces a carry-out signal $C_{i+1}$ even when no carry-in signal exists. This occurs if both addends contain a 1 in that bit:
$G_i = A_i \cdot B_i$ (1.2)
The propagate function, $P_i$, indicates whether a carry-in to the stage is passed on to the carry-out of the stage. This occurs if either of the addends has a 1 in that bit:
$P_i = A_i + B_i$ (1.3)
Note that both of these values can be calculated from the inputs in a constant time of a single gate delay. The carry-out from a stage occurs if that stage generates a carry ($G_i = 1$) or if there is a carry-in and the stage propagates the carry ($P_i \cdot C_i = 1$):
$C_{i+1} = A_i B_i + A_i C_i + B_i C_i$ (1.4)
$C_{i+1} = A_i B_i + (A_i + B_i) C_i$ (1.5)
$C_{i+1} = G_i + P_i C_i$ (1.6)
$C_{i+1} = G_i + P_i (G_{i-1} + P_{i-1} C_{i-1})$ (1.7)
$C_{i+1} = G_i + P_i G_{i-1} + P_i P_{i-1} (G_{i-2} + P_{i-2} C_{i-2})$ (1.8)
$\vdots$
$C_{i+1} = G_i + P_i G_{i-1} + P_i P_{i-1} G_{i-2} + P_i P_{i-1} P_{i-2} G_{i-3} + \dots + P_i P_{i-1} \cdots P_1 G_0 + P_i P_{i-1} \cdots P_1 P_0 C_0$ (1.9)
Note that this does not require the carry-out signals from the previous stages, so we do not have to wait for changes to ripple through the circuit. In fact, a given stage's carry signal can be computed once the propagate and generate signals are ready with only two more gate delays (one AND and one OR). Thus the carry-out for a given stage can be calculated in constant time, and therefore so can the sum.
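For a 4-bit block, unrolling the recurrence $C_{i+1} = G_i + P_i C_i$ from $C_0$ upward gives the four carries explicitly. This is a worked expansion of the equations above, with no new assumptions:

```latex
\begin{align*}
C_1 &= G_0 + P_0 C_0 \\
C_2 &= G_1 + P_1 G_0 + P_1 P_0 C_0 \\
C_3 &= G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_0 \\
C_4 &= G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_0
\end{align*}
```

Every carry is a two-level AND-OR function of the $G$ and $P$ signals and $C_0$ only, which is why each costs two gate delays regardless of its position.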
Operation | Required data | Gate delays
Produce stage generate and propagate signals | Addends (a and b) | 1
Produce stage carry-out signals, C1 to Cn | P and G signals, and C0 | 2
Produce sum result, S | Carry signals and addends | 3
Total | | 6
Figure 1.4: Carry Look Ahead adder
A basic carry-lookahead adder is very fast but has the disadvantage that it takes a very large amount of logic hardware to implement. In fact, the amount of hardware needed grows approximately quadratically with n, and becomes very complicated for n greater than 4.
Because of this, most CLAs are built from 4-bit CLA "blocks", which are in turn cascaded to produce a larger CLA.
1.3.2) VHDL code for carry look ahead adder
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity claadder is
port (x,y : in bit_vector (3 downto 0); cin : in bit;
s : out bit_vector (3 downto 0); cout,gout,pout : out bit);
end claadder;
architecture claadder of claadder is
component gpfulladder
port (a,b,cin : in bit;
g,p,so : out bit);
end component;
component clalogic
port (g,p : in bit_vector (3 downto 0); ci : in bit;
c : out bit_vector (3 downto 1) ; co,go,po : out bit);
end component;
signal g,p : bit_vector (3 downto 0);
signal c : bit_vector (3 downto 1);
begin
-- clalogic declares its last outputs in the order co, go, po, so gout precedes pout.
carrylogic : clalogic port map (g,p,cin,c,cout,gout,pout);
gpfa0 : gpfulladder port map (x(0),y(0),cin,g(0),p(0),s(0));
gpfa1 : gpfulladder port map (x(1),y(1),c(1),g(1),p(1),s(1));
gpfa2 : gpfulladder port map (x(2),y(2),c(2),g(2),p(2),s(2));
gpfa3 : gpfulladder port map (x(3),y(3),c(3),g(3),p(3),s(3));
end claadder;
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity gpfulladder is
port (a,b,cin : in bit;
g,p,so : out bit);
end gpfulladder;
-- }} End of automatically maintained section
architecture gpfulladder of gpfulladder is
signal p_int : bit;
begin
g <= a and b;
p <= p_int;
p_int <= a xor b;
so <= p_int xor cin;
end gpfulladder;
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity clalogic is
port (g,p : in bit_vector (3 downto 0); ci : in bit;
c : out bit_vector (3 downto 1) ; co,go,po : out bit);
end clalogic;
-- }} End of automatically maintained section
architecture clalogic of clalogic is
signal go_int,po_int : bit;
begin
-- Each carry is the two-level expansion of c(i+1) = g(i) or (p(i) and c(i)).
c(1) <= g(0) or (p(0) and ci);
c(2) <= g(1) or (p(1) and g(0)) or (p(1) and p(0) and ci);
c(3) <= g(2) or (p(2) and g(1)) or (p(2) and p(1) and g(0)) or (p(2) and p(1) and p(0)
and ci);
po_int <= p(3) and p(2) and p(1) and p(0);
go_int <= g(3) or (p(3) and g(2)) or (p(3) and p(2) and g(1)) or (p(3) and p(2) and p(1)
and g(0));
co <= go_int or (po_int and ci);
po <= po_int;
go <= go_int;
end clalogic;
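A 4-bit adder is small enough to verify exhaustively. The testbench below is a sketch under stated assumptions: the names claadder_tb and to_bv4 are hypothetical helpers introduced here, and the 10 ns wait is an arbitrary settling time. It applies all 512 input combinations and reports any case where the sum or carry-out differs from integer addition.

```vhdl
-- Exhaustive self-checking testbench sketch for claadder (names are hypothetical).
entity claadder_tb is
end claadder_tb;

architecture sim of claadder_tb is
  component claadder
    port (x, y : in bit_vector(3 downto 0); cin : in bit;
          s : out bit_vector(3 downto 0); cout, gout, pout : out bit);
  end component;
  signal x, y, s : bit_vector(3 downto 0);
  signal cin, cout, gout, pout : bit;

  -- Convert a natural number (0 to 15) to a 4-bit vector.
  function to_bv4 (v : natural) return bit_vector is
    variable r : bit_vector(3 downto 0);
    variable t : natural := v;
  begin
    for k in 0 to 3 loop
      if t mod 2 = 1 then r(k) := '1'; else r(k) := '0'; end if;
      t := t / 2;
    end loop;
    return r;
  end to_bv4;
begin
  dut : claadder port map (x => x, y => y, cin => cin, s => s,
                           cout => cout, gout => gout, pout => pout);

  check : process
    variable total : natural;
    variable expected_cout : bit;
  begin
    for i in 0 to 15 loop
      for j in 0 to 15 loop
        for k in 0 to 1 loop
          x <= to_bv4(i);
          y <= to_bv4(j);
          if k = 1 then cin <= '1'; else cin <= '0'; end if;
          wait for 10 ns;  -- assumed settling time
          total := i + j + k;
          if total >= 16 then expected_cout := '1'; else expected_cout := '0'; end if;
          assert s = to_bv4(total mod 16)
            report "sum mismatch" severity error;
          assert cout = expected_cout
            report "carry-out mismatch" severity error;
        end loop;
      end loop;
    end loop;
    wait;  -- suspend once every combination has been applied
  end process;
end sim;
```

An exhaustive comparison of this kind catches a wrong product term in any of the carry equations, which is easy to introduce when the expansions are typed by hand.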
Figure 1.5: VHDL simulation of Carry Look Ahead adder
1.4) REDUNDANT BINARY SIGNED ADDER [3]
1.4.1) Introduction
In a redundant binary signed-digit (RBSD) number system, a “carry-free” addition can be performed, where “carry-free” in this context means that carry propagation is limited to a single digit position. In other words, the carry propagation length is fixed irrespective of the word length. The addition consists of two steps. In the first step, an intermediate sum $s_i$ and a carry $c_i$ are generated from the operand digits $x_i$ and $y_i$ at each digit position i. This is done in parallel for all digit positions. In the second step, the summation $z_i = s_i + c_{i-1}$ is carried out to produce the final sum digit $z_i$. The important point is that it is always possible to select the intermediate sum $s_i$ and carry $c_{i-1}$ such that the summation in the second step does not generate a carry. Hence, the second step can also be executed in parallel for all digit positions, yielding a fixed addition time that is independent of the word length.
Figure 1.6 shows an example of an 8-bit redundant binary addition. In the figure, X and Y are n-digit redundant binary integers, ISum and ICin are the intermediate sum and carry-in, and the final sum (FSum) is obtained by adding ISum and ICin. Note that no carry is generated when ISum and ICin are added, which satisfies the carry-free condition, and that the LSB of ICin is set to logic zero.
Figure 1.6: Signed addition
(1.10)
Figure 1.7: Signed adder cell [4]
If the delay of a NAND or NOR gate is taken as $t_o$, the delay of the circuit becomes
$T_{delay} = t_o + 2t_o + 2t_o + t_o + t_o = 7t_o$ (1.11)
1.4.2) Rules For Redundant Binary Addition
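Type | Augend digit (x_i) | Addend digit (y_i) | Digits at the next lower-order position (x_{i-1}, y_{i-1}) | Intermediate carry (c_i) | Intermediate sum (s_i)
1 | 1 | 1 | - | 1 | 0
2 | 1 (or 0) | 0 (or 1) | Both are nonnegative | 1 | -1
2 | 1 (or 0) | 0 (or 1) | Otherwise | 0 | 1
3 | 0 | 0 | - | 0 | 0
4 | 1 (or -1) | -1 (or 1) | - | 0 | 0
5 | 0 (or -1) | -1 (or 0) | Both are nonnegative | 0 | -1
5 | 0 (or -1) | -1 (or 0) | Otherwise | -1 | 1
6 | -1 | -1 | - | -1 | 0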
Figure 1.8: Rules table for intermediate carry and intermediate sum
The addition of two signed digits takes place in two steps. In the first step, the intermediate carry and intermediate sum are determined from the above table. In the second step, the intermediate sum and the intermediate carry are added to obtain the final sum. The table is designed such that adding the intermediate sum digit and the intermediate carry digit never produces a carry.
Figure 1.9: Steps of RBSD addition
1.4.3) VHDL code of RBSD adder
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity rbsdadder is
port (a,b,c,d,e,f,g,h : in bit;
c2,c1,s2,s1 : out bit);
end rbsdadder;
-- }} End of automatically maintained section
architecture rbsdadder of rbsdadder is
begin

c2 <= (e and f and ((a and (not d)) or ((not b) and c))) or (a and f and ((b and c) or ((not d)
and g)))
or (g and (((not b) and c and f)
or ( a and (not d) and (not f)))) or (c and (((not b)and (not f) and g) or ( a and b
and (not h))))
or (a and b and c and (not f));

c1 <= (e and f and ((a and (not d)) or ((not b) and c))) or (a and f and ((b and c) or ((not d)
and g)))
or (g and (((not b) and c and f) or ( a and (not d) and (not f)))) or (c and (((not b)and (not
f) and g) or ( a and b and (not h))))
or (b and((a and c and (not f)) or ( (not a) and (not c) and d and f)))
or ((not a) and b and(not c) and (not h) and (d or ( not e )))
or ((not a) and b and(not c) and (not f) and (d or (not g)))
or ((not b) and (not c ) and d and (((not e) and (not h)) or ((not f) and (not g))))
or ((not e) and f and (not g ) and (((not a) and b and (not d)) or ((not b) and (not c) and
d)));

s2 <= (b and (not d) and (((not e) and (not h)) or ((not f) and (not g))))
or ((not b) and d and (((not e) and (not h)) or ((not f) and ( not g))))
or ((not e) and f and (not g) and (b xor d));

s1 <= (f and (b xor d)) or (b and (not d) and ((not h) or (not f)))
or ((not b) and d and ((not h) or (not f)));
end rbsdadder;
Figure 1.10: Waveform of RBSD adder cell
Figure 1.11: VHDL simulation of RBSD adder cell
1.5) HYBRID SIGNED DIGIT ADDER [5]
1.5.1) Introduction
Here, instead of insisting that every digit be a signed digit, we let some of the digits be signed
and leave the others unsigned. For example, every alternate or every third or fourth digit can be
signed; all the remaining ones are unsigned. We refer to this representation as a Hybrid Signed
Digit (HSD) representation. In the following, we show that such a representation can limit the
maximum length of carry propagation chains to any desired value. In particular, we prove that
the maximum length of a carry propagation chain equals (d + 1), where d is the longest distance
between neighboring signed digits.
Figure 1.12: Signed and unsigned adder cells (panels: unsigned digit position, signed digit position) [5]
In HSD with d = 1 (the distance between signed digit positions), the delay is:
(1.12)
Here, the two delays of 1.5 units in parentheses are due to the two complex gates in the lower
order signed digit cell. The last 1.5 units of delay (shown within the square brackets) is
associated with the XNOR gate at the higher order signed digit where the carry propagation
terminates. The terms in between are proportional to d since the carry ripples through all the
unsigned digit positions.
Figure 1.13: Critical path delay vs. distance between signed digits [5]
Figure 1.14: Transistor count vs. Distance between signed digits [5]
Figure 1.15: Transistor count *Delay vs. distance between signed digits [5]
1.5.2) VHDL code of unsigned position adder cell
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity unsigned_new is
port(
ai_1 : in STD_LOGIC;
bi_1 : in STD_LOGIC;
vi_2 : in STD_LOGIC;
wi_2 : in STD_LOGIC;
vi_1 : out STD_LOGIC;
wi_1 : out STD_LOGIC;
ei_1 : out STD_LOGIC
);
end unsigned_new;
architecture unsigned_new of unsigned_new is
component nor2
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);
end component;
component and2
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);
end component;
component not1
port(
a : in STD_LOGIC;
y : out STD_LOGIC );
end component;
component xnor2
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);
end component;
component xor2
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);
end component;
component or2
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);
end component;
signal s1,s2,s3,s4,s5,s6,s7: STD_LOGIC;
begin
N1: not1 port map (wi_2,s1);
N2: xnor2 port map (ai_1,bi_1,s2);
N3: and2 port map (vi_2,s1,s3);
N4: or2 port map (s1,vi_2,s4);
N5: and2 port map (s2,s4,s5);
N6: or2 port map (s3,s5,vi_1);
N7: nor2 port map (ai_1,bi_1,wi_1);
N8: xor2 port map (vi_2,wi_2,s6);
N9: xor2 port map (ai_1,bi_1,s7);
N10: xor2 port map (s7,s6,ei_1);
end unsigned_new;
Figure 1.16: Waveform of HSD unsigned position adder cell
Figure 1.17: VHDL simulation of HSD unsigned position adder cell
1.5.3) VHDL code of signed position adder cell
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity signedaddercell is
port(
xis_c : in STD_LOGIC;
yis_c : in STD_LOGIC;
xia : in STD_LOGIC;
yia : in STD_LOGIC;
vi_1_c : in STD_LOGIC;
wi_1 : in STD_LOGIC;
vi : out STD_LOGIC;
wi : out STD_LOGIC;
zia : out STD_LOGIC;
zis_c : out STD_LOGIC
);
end signedaddercell;
-- }} End of automatically maintained section
architecture signedaddercell of signedaddercell is
component nor2
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);
end component;
component and2
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);
end component;
component xnor2
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);
end component;
component nand2
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);
end component;
component xor2
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);
end component;
component nor3
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
c : in STD_LOGIC;
y : out STD_LOGIC
);
end component;
signal s1,s2,s3,s4,s5,s6: STD_LOGIC;
begin
N1: nand2 port map (xis_c,yis_c,wi);
N2: nor2 port map (xis_c,yis_c,s1);
N3: nor2 port map (xia,yia,s2);
N4: xor2 port map (xia,yia,s3);
N5: and2 port map (s3,wi_1,s5);
N6: xor2 port map (wi_1,s3,s6);
N7: nand2 port map (s6,vi_1_c,zis_c);
N8: xnor2 port map (vi_1_c,s6,zia);
N9: nor3 port map (s1,s2,s5,vi);
end signedaddercell;
Figure 1.18: Waveform of HSD signed position adder cell
Figure 1.19: VHDL simulation of HSD signed position adder cell
1.6) QUATERNARY SIGNED DIGIT NUMBERS
1.6.1) Introduction
QSD numbers are represented using 3-bit 2’s complement notation. Each number can be represented by:
$D = \sum_{i=0}^{n-1} x_i 4^i$ (1.13)
where $x_i$ can be any value from the set {-3, -2, -1, 0, 1, 2, 3} for producing an appropriate decimal representation. In a digital implementation, a large number of digits, such as 64, 128 or more, can be handled with constant delay. High-speed and area-effective adders and multipliers can be implemented using this technique.
1.6.2) QSD ADDER
We can achieve carry-free addition by exploiting the redundancy of QSD numbers and the QSD addition. The redundancy allows multiple representations of any integer quantity. There are two steps involved in the carry-free addition. The first step generates an intermediate carry and sum from the addend and augend. The second step combines the intermediate sum of the current digit with the carry of the next lower significant digit.
To prevent the carry from rippling further, we define two rules.
1) The first rule states that the magnitude of the intermediate sum must be less than or equal to 2 (that is, it lies between -2 and 2).
2) The second rule states that the magnitude of the carry must be less than or equal to 1 (that is, it lies between -1 and 1).
Consequently, the magnitude of the second-step output cannot be greater than 3, which can be represented by a single-digit QSD number; hence no further carry is required. In step 1, all possible input pairs of the addend and augend are considered. The output ranges from -6 to 6, as shown in Figure 1.20.
Figure 1.20: QSD representation
Both inputs and outputs can be encoded as 3-bit 2’s-complement binary numbers. The mapping between the inputs (addend and augend) and the outputs (intermediate carry and sum) is shown in binary format in Table 1.1. Since the intermediate carry always lies between -1 and 1, it requires only a 2-bit binary representation. Finally, five 6-variable Boolean expressions can be extracted. The intermediate carry and sum circuit is shown in Figure 1.21.
Figure 1.21: The intermediate carry and sum generator
Figure 1.22: The second step QSD adder
In step 2, the intermediate carry from the next lower significant digit is added to the intermediate sum of the current digit to produce the final result. The addition in this step produces no carry because the current digit can always absorb the carry-in from the lower digit.
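The two rules can be illustrated with a behavioural model on integer digits. This is a sketch, not the gate-level circuit: the names qsd_rules_check and qsd_step1 are hypothetical, and one choice is an assumption made here, namely that a digit sum of +2 or -2 is kept as the intermediate sum with a zero carry (the rules also allow a carry of +1 or -1 with a sum of -2 or +2 in that case).

```vhdl
-- Behavioural check of the two QSD addition rules (all names are hypothetical).
entity qsd_rules_check is
end qsd_rules_check;

architecture sim of qsd_rules_check is
  -- Step 1: intermediate carry c and sum s such that a + b = 4*c + s,
  -- with |s| <= 2 and |c| <= 1.
  procedure qsd_step1 (a, b : in integer range -3 to 3;
                       c : out integer range -1 to 1;
                       s : out integer range -2 to 2) is
    variable t : integer range -6 to 6;
  begin
    t := a + b;
    if t >= 3 then
      c := 1; s := t - 4;
    elsif t <= -3 then
      c := -1; s := t + 4;
    else
      c := 0; s := t;  -- assumption: sums of +2 and -2 keep a zero carry
    end if;
  end qsd_step1;
begin
  check : process
    variable c_hi, c_lo : integer range -1 to 1;
    variable s_hi, s_lo : integer range -2 to 2;
    variable z : integer;
  begin
    -- (a1, b1) is the current digit position, (a0, b0) the next lower one.
    for a1 in -3 to 3 loop
      for b1 in -3 to 3 loop
        for a0 in -3 to 3 loop
          for b0 in -3 to 3 loop
            qsd_step1(a1, b1, c_hi, s_hi);
            qsd_step1(a0, b0, c_lo, s_lo);
            -- Step 1 must preserve the value of the digit pair.
            assert a1 + b1 = 4 * c_hi + s_hi
              report "step 1 changes the value" severity error;
            -- Step 2 result must be a single QSD digit, so no further carry.
            z := s_hi + c_lo;
            assert z >= -3 and z <= 3
              report "step 2 result is not a single QSD digit" severity error;
          end loop;
        end loop;
      end loop;
    end loop;
    wait;  -- suspend after all combinations
  end process;
end sim;
```

The range constraints on c and s enforce the two rules directly: a simulator would raise a range error if step 1 ever produced a carry outside -1 to 1 or a sum outside -2 to 2.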
Figure 1.23: Ndigit QSD adder
An N-digit adder can be built by using N such cells in parallel. The delay of this N-digit adder is constant and equal to the delay of a single-digit adder.
Table 1.1: The mapping between the inputs and outputs of the intermediate carry and sum
1.6.3) VHDL code of QSD adder
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity qsdadder is
port (a0,a1,a2,b0,b1,b2 : in bit;
c0,c1,s0,s1,s2 : out bit);
end qsdadder;
architecture qsdadder of qsdadder is
begin
c1 <= (a2 and b2 and(not b1)) or (a2 and (not a1) and b2) or
(a2 and b2 and (not b0))or (a2 and (not a0) and b2) or (b2 and (not a1)
and (not a0) and (not b1)) or (a2 and (not a1) and (not b1) and (not b0));
c0 <= (a2 and b2 and (not b1)) or (a2 and (not a1) and b2) or
(a2 and b2 and (not b0)) or (a2 and (not a0) and b2) or
((not a1) and (not a0) and b2 and (not b1))
or ((not a2)and a1 and (not b2) and b1) or ((not a2) and a0 and (not b2) and b1)
or((not a2) and (not b2) and b1 and b0) or ((not a2) and a1 and (not b2) and b0)
or (a2 and (not a1) and (not b1) and (not b0)) or ((not a2) and a1 and a0 and (not b2));
s2 <= ((not a1) and b2 and b0) or (a2 and (not a0) and (not b1))
or ((not a1) and a0 and b2 and (not b1)) or ((not a1) and (not a0) and b2 and b1)
or ((not a1) and a0 and b1 and (not b0)) or ((not a1) and (not a0) and b1 and b0)
or (a2 and (not a1 ) and (not b1) and b0) or (a1 and (not a0) and (not b1) and b0)
or (a2 and a1 and (not b1) and (not b0)) or (a1 and a0 and (not b1 ) and (not b0))
or (a2 and a1 and a0 and b2 and b1 and b0);
s1 <= ((not a1) and b1 and (not b0)) or ((not a1) and (not a0) and b1 )
or (a1 and (not a0) and (not b1)) or (a1 and (not b1) and (not b0))
or ( a1 and a0 and b1 and b0) or ((not a1) and a0 and (not b1) and b0);
s0 <= (a0 and (not b0)) or ((not a0) and b1 and b0) or ((not a2) and (not a0) and b0)
or ((not a0 ) and (not b2) and b0);
end qsdadder;
Figure 1.24: Waveform of QSD adder cell
Figure 1.25: VHDL simulation of QSD adder cell
Simulation in Xilinx gives a delay of 13.931 ns for the QSD adder.
1.6.4) Single Digit QSD Multiplier
There are generally two methods for a multiplication operation: parallel and iterative. QSD multiplication can be implemented in both ways, requiring a QSD partial product generator and a QSD adder as basic components. A partial product $M_i$ is the result of multiplying an n-digit input, $A_{n-1} \dots A_0$, by a single-digit input $B_i$, where i = 0 … n-1.
The primitive component of the partial product generator is a single-digit multiplication unit. The single-digit multiplication produces M as a result and C as a carry to be combined with the M of the next digit. The range of the output is from -9 to 9, which can be represented with M and C in QSD form. The values of M and C should lie between -2 and 2.
The mapping between the inputs A (multiplicand) and B (multiplier) and the outputs M and C is shown in Table 1.2.
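Input A (QSD) | Input B (QSD) | A (binary) | B (binary) | Product (decimal) | Output C (QSD) | Output M (QSD) | C (binary) | M (binary)
3 | 3 | 011 | 011 | 9 | 2 | 1 | 010 | 001
-3 | -3 | 101 | 101 | 9 | 2 | 1 | 010 | 001
3 | 2 | 011 | 010 | 6 | 1 | 2 | 001 | 010
2 | 3 | 010 | 011 | 6 | 1 | 2 | 001 | 010
-3 | -2 | 101 | 110 | 6 | 1 | 2 | 001 | 010
-2 | -3 | 110 | 101 | 6 | 1 | 2 | 001 | 010
2 | 2 | 010 | 010 | 4 | 1 | 0 | 001 | 000
-2 | -2 | 110 | 110 | 4 | 1 | 0 | 001 | 000
3 | 1 | 011 | 001 | 3 | 1 | -1 | 001 | 111
-3 | -1 | 101 | 111 | 3 | 1 | -1 | 001 | 111
1 | 3 | 001 | 011 | 3 | 1 | -1 | 001 | 111
-1 | -3 | 111 | 101 | 3 | 1 | -1 | 001 | 111
2 | 1 | 010 | 001 | 2 | 0 | 2 | 000 | 010
-2 | -1 | 110 | 111 | 2 | 0 | 2 | 000 | 010
1 | 2 | 001 | 010 | 2 | 0 | 2 | 000 | 010
-1 | -2 | 111 | 110 | 2 | 0 | 2 | 000 | 010
1 | 1 | 001 | 001 | 1 | 0 | 1 | 000 | 001
-1 | -1 | 111 | 111 | 1 | 0 | 1 | 000 | 001
3 | 0 | 011 | 000 | 0 | 0 | 0 | 000 | 000
2 | 0 | 010 | 000 | 0 | 0 | 0 | 000 | 000
1 | 0 | 001 | 000 | 0 | 0 | 0 | 000 | 000
0 | 1 | 000 | 001 | 0 | 0 | 0 | 000 | 000
0 | 2 | 000 | 010 | 0 | 0 | 0 | 000 | 000
0 | 3 | 000 | 011 | 0 | 0 | 0 | 000 | 000
0 | 0 | 000 | 000 | 0 | 0 | 0 | 000 | 000
-3 | 0 | 101 | 000 | 0 | 0 | 0 | 000 | 000
-2 | 0 | 110 | 000 | 0 | 0 | 0 | 000 | 000
-1 | 0 | 111 | 000 | 0 | 0 | 0 | 000 | 000
0 | -1 | 000 | 111 | 0 | 0 | 0 | 000 | 000
0 | -2 | 000 | 110 | 0 | 0 | 0 | 000 | 000
0 | -3 | 000 | 101 | 0 | 0 | 0 | 000 | 000
1 | -1 | 001 | 111 | -1 | 0 | -1 | 000 | 111
-1 | 1 | 111 | 001 | -1 | 0 | -1 | 000 | 111
2 | -1 | 010 | 111 | -2 | 0 | -2 | 000 | 110
-1 | 2 | 111 | 010 | -2 | 0 | -2 | 000 | 110
1 | -2 | 001 | 110 | -2 | 0 | -2 | 000 | 110
-2 | 1 | 110 | 001 | -2 | 0 | -2 | 000 | 110
3 | -1 | 011 | 111 | -3 | -1 | 1 | 111 | 001
-1 | 3 | 111 | 011 | -3 | -1 | 1 | 111 | 001
-3 | 1 | 101 | 001 | -3 | -1 | 1 | 111 | 001
1 | -3 | 001 | 101 | -3 | -1 | 1 | 111 | 001
2 | -2 | 010 | 110 | -4 | -1 | 0 | 111 | 000
-2 | 2 | 110 | 010 | -4 | -1 | 0 | 111 | 000
3 | -2 | 011 | 110 | -6 | -1 | -2 | 111 | 110
-2 | 3 | 110 | 011 | -6 | -1 | -2 | 111 | 110
-3 | 2 | 101 | 010 | -6 | -1 | -2 | 111 | 110
2 | -3 | 010 | 101 | -6 | -1 | -2 | 111 | 110
3 | -3 | 011 | 101 | -9 | -2 | -1 | 110 | 111
-3 | 3 | 101 | 011 | -9 | -2 | -1 | 110 | 111
Table 1.2: The mapping between multiplicand and multiplier
1.6.5) VHDL code for single digit multiplier
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity QSD_SINGLE_DIGIT_MULT is
port(
a2 : in STD_LOGIC;
a1 : in STD_LOGIC;
a0 : in STD_LOGIC;
b2 : in STD_LOGIC;
b1 : in STD_LOGIC;
b0 : in STD_LOGIC;
c2 : inout STD_LOGIC;
c1 : inout STD_LOGIC;
c0 : inout STD_LOGIC;
m2 : inout STD_LOGIC;
m1 : inout STD_LOGIC;
m0 : inout STD_LOGIC
);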
end QSD_SINGLE_DIGIT_MULT;
-- }} End of automatically maintained section
architecture QSD_SINGLE_DIGIT_MULT of QSD_SINGLE_DIGIT_MULT is
begin
c2<= (a2 and(not b2)and b0 and((not b1)nand a1)) or (a2 and(not b2)and b1 and(a1 nand
a0)) or ((not a2)and a0 and b2 and ((not a1)nand b1)) or ((not a2) and a1 and b2 and(b1 nand
b0));
c1<= c2 or ( a2 and(not a1)and b2 and(not b1)) or (a1 and a0 and (not b2)and b1 and b0);
c0<= (a1 and(a0 nor b2)and b1) or (a1 and b1 and(a2 nor b0)) or (a2 and b2 and(a1 xor b1)) or
( a2 and b2 and(a0 nor b0)) or (a2 and b1 and(a1 nor b0)) or (a1 and b2 and(a0 nor b1)) or ((a1
nor b1)and a2 and(not b2)and b0) or ((a1 nor b1)and(not a2)and a0 and b2) or (a2 and a1 and(not
b2)and b1 and b0) or ((not a2)and a1 and a0 and b2 and b1) or ((a2 nor b2)and a0 and b0 and(a1
xor b1));
m2<= (a2 and b1 and(a1 nor b2)) or (a1 and b2 and(a2 nor b1)) or (a1 and a0 and b2 and(not
b1)) or(a2 and(not a1)and b1 and b0) or (a0 and b2 and(a2 nor b0)) or (a0 and b0 and(a1 xor b1))
or (a2 and b0 and(a0 nor b2)) or (a2 and a0 and b1 and(b2 nor b0)) or (a1 and b2 and b0 and(a2
nor a0));
m1<= (a0 and b1 and(a1 nand b0)) or (a1 and b0 and(b1 nand a0));
m0<= a0 and b0;
end QSD_SINGLE_DIGIT_MULT;
Figure 1.26: Single digit QSD multiplier
Simulation of the QSD single-digit multiplier in Xilinx gives a delay of 11.348 ns.
1.7) COMPARATIVE RESULT OF DIFFERENT ADDERS
Figure 1.27: Delay vs. Number of bits for addition for different adding schemes
Figure 1.28: complexity vs. number of bits for addition of different adding schemes
Chart data labels (series: 2 bit, 4 bit, 8 bit; vertical axis 0 to 600):
Adding scheme | 2 bit | 4 bit | 8 bit
Ripple carry addition | 10 | 20 | 40
Carry look ahead addition | 13 | 30 | 500
Redundant binary addition | 106 | 212 | 424
Hybrid signed addition | 14 | 28 | 56
Quaternary signed digit addition | 130 | 260 | 520
CHAPTER 2
ADAPTIVE FILTER [6]
2.1) INTRODUCTION
An adaptive filter is a filter that self-adjusts its transfer function according to an optimization algorithm driven by an error signal. Because of the complexity of the optimization algorithms, most adaptive filters are digital filters. By way of contrast, a non-adaptive filter has a static transfer function. Adaptive filters are required for some applications because some parameters of the desired processing operation (for instance, the locations of reflective surfaces in a reverberant space) are not known in advance. The adaptive filter uses feedback in the form of an error signal to refine its transfer function to match the changing parameters.
Generally speaking, the adaptive process involves the use of a cost function, which is a criterion for optimum performance of the filter, to feed an algorithm that determines how to modify the filter transfer function to minimize the cost on the next iteration.
As the power of digital signal processors has increased, adaptive filters have become much more
common and are now routinely used in devices such as mobile phones and other communication
devices, camcorders and digital cameras, and medical monitoring equipment.
The block diagram, shown in the following figure, serves as a foundation for particular adaptive
filter realizations, such as Least Mean Squares (LMS) and Recursive Least Squares (RLS). The
idea behind the block diagram is that a variable filter extracts an estimate of the desired signal.
Figure 2.1: Adaptive filter
To start the discussion of the block diagram, we make the following assumptions:
* The input signal is the sum of a desired signal d(n) and interfering noise v(n):
$x(n) = d(n) + v(n)$ (2.1)
* The variable filter has a Finite Impulse Response (FIR) structure. For such structures the impulse response is equal to the filter coefficients. The coefficients for a filter of order p are defined as
$\mathbf{w}_n = [w_n(0), w_n(1), \dots, w_n(p)]^T$ (2.2)
* The error signal or cost function is the difference between the desired and the estimated signal:
$e(n) = d(n) - \hat{d}(n)$ (2.3)
The variable filter estimates the desired signal by convolving the input signal with the impulse response. In vector notation this is expressed as
$\hat{d}(n) = \mathbf{w}_n * \mathbf{x}(n)$ (2.4)
where
$\mathbf{x}(n) = [x(n), x(n-1), \dots, x(n-p)]^T$ (2.5)
is an input signal vector. Moreover, the variable filter updates the filter coefficients at every time instant:
$\mathbf{w}_{n+1} = \mathbf{w}_n + \Delta\mathbf{w}_n$ (2.6)
where $\Delta\mathbf{w}_n$ is a correction factor for the filter coefficients. The adaptive algorithm generates this correction factor based on the input and error signals. LMS and RLS define two different coefficient update algorithms.
2.2) LEAST MEAN SQUARE ADAPTIVE FILTER [6]
2.2.1) Introduction
Adaptive algorithms are a mainstay of Digital Signal Processing (DSP). They are used in a variety of applications including acoustic echo cancellation, radar guidance systems, and wireless channel estimation, among many others.
An adaptive algorithm is used to estimate a time-varying signal. There are many adaptive algorithms, such as Recursive Least Squares (RLS) and Kalman filters, but the most commonly used is the Least Mean Square (LMS) algorithm. It is a simple but powerful algorithm that can be implemented to take advantage of Lattice FPGA architectures. Developed by Widrow and Hoff, the algorithm uses gradient descent to estimate a time-varying signal. The gradient descent method finds a minimum, if it exists, by taking steps in the direction of the negative gradient. It does so by adjusting the filter coefficients to minimize the error.
The LMS reference design consists of two main functional blocks: an FIR filter and the LMS algorithm. The FIR filter is implemented serially using a multiplier and an adder with feedback. The FIR result is normalized to minimize saturation. The LMS algorithm iteratively updates the coefficients and feeds them to the FIR filter. The FIR filter then uses the updated coefficients along with the input reference signal x(n) to generate the output y(n). The output y(n) is then subtracted from the desired signal d(n) to generate an error, which is used by the LMS algorithm to compute the next set of coefficients.
Figure 2.2 is a block diagram of system identification using adaptive filtering. The objective is to change (adapt) the coefficients of an FIR filter, W, to match as closely as possible the response of an unknown system, H. The unknown system and the adapting filter process the same input signal x[n] and have outputs d[n] (also referred to as the desired signal) and y[n].
Figure 2.2: Least Mean Square adaptive filter
2.2.2) GRADIENTDESCENT ADAPTATION [6]
The adaptive filter, W, is adapted using the least meansquare algorithm, which is the most
widely used adaptive filtering algorithm. First the error signal, e[n], is computed as
e[n]=d[n]−y[n], which measures the difference between the output of the adaptive filter and the
output of the unknown system. On the basis of this measure, the adaptive filter will change its
coefficients in an attempt to reduce the error. The coefficient update relation is a function of the
error signal squared and is given by
   
 
2
n 1 n
n
( )
h i h i
2 h i
e µ
+
 
c
= + ÷


c
\ .
(2.7)
The term inside the parentheses represents the gradient of the squared-error with respect to the
i-th coefficient. The gradient is a vector pointing in the direction of the change in filter
coefficients that will cause the greatest increase in the error signal. Because the goal is to
minimize the error, however, Equation (2.7) updates the filter coefficients in the direction
opposite the gradient; that is why the gradient term is negated. The constant μ is a step-size,
which controls the amount of gradient information used to update each coefficient. After
repeatedly adjusting each coefficient in the direction opposite to the gradient of the error, the
adaptive filter should converge; that is, the difference between the unknown and adaptive systems
should get smaller and smaller. To express the gradient descent coefficient update equation in a
more usable manner, we can rewrite the derivative of the squared-error term as
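$$\frac{\partial\,(|e|)^2}{\partial h[i]} = 2\left(\frac{\partial e}{\partial h[i]}\right) e \tag{2.8}$$
Or,
$$\frac{\partial\,(|e|)^2}{\partial h[i]} = 2\left(\frac{\partial\,(d - y)}{\partial h[i]}\right) e \tag{2.9}$$
$$\frac{\partial\,(|e|)^2}{\partial h[i]} = 2\left(\frac{\partial \left(d - \sum_{i=0}^{N-1} h[i]\,x[n-i]\right)}{\partial h[i]}\right) e \tag{2.10}$$
$$\frac{\partial\,(|e|)^2}{\partial h[i]} = 2\left(-x[n-i]\right) e \tag{2.11}$$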
which in turn gives us the final LMS coefficient update,
$$h_{n+1}[i] = h_n[i] + \mu\, e\, x[n-i] \tag{2.12}$$
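The gradient expression (2.11) and the resulting update (2.12) can be checked numerically. The sketch below compares the analytic gradient with a central finite-difference estimate and then applies one update; the helper names, the three-tap example values and the step size are illustrative assumptions, not values from this design.

```python
def filter_output(h, x_window):
    """y = sum_i h[i] * x[n-i]; x_window[i] holds x[n-i]."""
    return sum(hi * xi for hi, xi in zip(h, x_window))


def squared_error(h, x_window, d):
    e = d - filter_output(h, x_window)
    return e * e


def analytic_gradient(h, x_window, d):
    """Equation (2.11): d(e^2)/dh[i] = 2 * (-x[n-i]) * e."""
    e = d - filter_output(h, x_window)
    return [2.0 * (-xi) * e for xi in x_window]


def numeric_gradient(h, x_window, d, eps=1e-6):
    """Central finite differences of the squared error, one coefficient at a time."""
    grad = []
    for i in range(len(h)):
        h_plus = list(h)
        h_plus[i] += eps
        h_minus = list(h)
        h_minus[i] -= eps
        diff = squared_error(h_plus, x_window, d) - squared_error(h_minus, x_window, d)
        grad.append(diff / (2.0 * eps))
    return grad


def lms_update(h, x_window, d, mu):
    """Equation (2.12): h_{n+1}[i] = h_n[i] + mu * e * x[n-i]."""
    e = d - filter_output(h, x_window)
    return [hi + mu * e * xi for hi, xi in zip(h, x_window)]


# Illustrative values (assumed for this sketch only).
h = [0.2, -0.1, 0.4]
x_window = [1.0, -0.5, 0.25]
d = 0.7

error_before = d - filter_output(h, x_window)
h_next = lms_update(h, x_window, d, mu=0.1)
error_after = d - filter_output(h_next, x_window)
print(analytic_gradient(h, x_window, d), numeric_gradient(h, x_window, d))
print(error_before, error_after)
```

For a single input window the error after the update equals the error before it multiplied by (1 − μ·Σx²), so a small positive μ shrinks the error on that window.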
The step-size μ directly affects how quickly the adaptive filter will converge toward the
unknown system. If μ is very small, then the coefficients change only a small amount at each
update, and the filter converges slowly. With a larger step-size, more gradient information is
included in each update, and the filter converges more quickly; however, when the step-size is
too large, the coefficients may change too quickly and the filter will diverge. (It is possible in
some cases to determine analytically the largest value of μ ensuring convergence.)
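The effect of the step-size can be seen in a small system-identification experiment that applies update (2.12) with three different values of μ. Everything specific here is an assumption made for illustration: the three-tap "unknown" system, the uniformly distributed input, the seed, the iteration counts and the step sizes themselves.

```python
import random


def run_lms(mu, n_iter, seed=0):
    """Identify a hypothetical 3-tap system; return the squared coefficient error."""
    rng = random.Random(seed)
    unknown = [0.8, -0.4, 0.2]           # hypothetical unknown system H
    h = [0.0] * len(unknown)             # adaptive filter W, started at zero
    x_window = [0.0] * len(unknown)      # x[n], x[n-1], x[n-2]
    for _ in range(n_iter):
        x_window = [rng.uniform(-1.0, 1.0)] + x_window[:-1]
        d = sum(u * x for u, x in zip(unknown, x_window))   # desired signal d[n]
        y = sum(hi * x for hi, x in zip(h, x_window))       # adaptive output y[n]
        e = d - y
        h = [hi + mu * e * x for hi, x in zip(h, x_window)]  # update (2.12)
    return sum((hi - u) ** 2 for hi, u in zip(h, unknown))


initial_error = 0.8 ** 2 + 0.4 ** 2 + 0.2 ** 2   # error of the all-zero start
slow = run_lms(mu=0.01, n_iter=500)
fast = run_lms(mu=0.2, n_iter=500)
unstable = run_lms(mu=5.0, n_iter=100)
print(initial_error, slow, fast, unstable)
```

Under these assumptions the small step-size should still be some distance from the unknown system after 500 updates, the moderate one should be much closer, and the large one should end further away than it started.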
2.2.3) CONVERGENCE AND STABILITY [6]
Assume that the true filter H(n) = H is constant, and that the input signal x(n) is wide-sense
stationary. Then E{W(n)} converges to H as n→∞ if and only if
$$0 < \mu < \frac{2}{\lambda_{\max}} \tag{2.13}$$
where λ_max is the greatest eigenvalue of the autocorrelation matrix. If this condition is not
fulfilled, the algorithm becomes unstable and W(n) diverges.
Maximum convergence speed is achieved when
$$\mu = \frac{2}{\lambda_{\max} + \lambda_{\min}} \tag{2.14}$$
where λ_min is the smallest eigenvalue of the autocorrelation matrix. Given that μ is less than or
equal to this optimum, the convergence speed is determined by μ·λ_min, with a larger value yielding
faster convergence. This means that faster convergence can be achieved when λ_max is close to
λ_min, that is, the maximum achievable convergence speed depends on the eigenvalue spread of the
autocorrelation matrix.
A white noise signal has autocorrelation matrix R = σ²I, where σ² is the variance of the signal. In
this case all eigenvalues are equal, and the eigenvalue spread is the minimum over all possible
matrices. The common interpretation of this result is therefore that the LMS converges quickly
for white input signals, and slowly for colored input signals, such as processes with low-pass or
high-pass characteristics.
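The white versus colored comparison can be made concrete for a two-tap filter, where the autocorrelation matrix [[r0, r1], [r1, r0]] has the closed-form eigenvalues r0 − |r1| and r0 + |r1|. The sketch below is an illustration under assumed values (unit variance, and a lag-one correlation of 0.9 for the colored case); the function names are hypothetical.

```python
def eigenvalues_2x2(r0, r1):
    """Eigenvalues of the 2x2 autocorrelation matrix [[r0, r1], [r1, r0]]."""
    return r0 - abs(r1), r0 + abs(r1)    # (lambda_min, lambda_max)


def spread_and_speed(r0, r1):
    """Return the eigenvalue spread and the speed factor mu * lambda_min at the optimum mu."""
    lam_min, lam_max = eigenvalues_2x2(r0, r1)
    mu_opt = 2.0 / (lam_max + lam_min)   # equation (2.14)
    return lam_max / lam_min, mu_opt * lam_min


white = spread_and_speed(1.0, 0.0)       # R = sigma^2 * I with sigma^2 = 1
colored = spread_and_speed(1.0, 0.9)     # strongly correlated (low-pass like) input
print(white, colored)
```

With these assumed numbers the white input has a spread of 1 and a speed factor of 1, while the colored input has a spread of about 19 and a speed factor of about 0.1, which is the slower convergence the paragraph describes.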
It is important to note that the above upper bound on μ only enforces stability in the mean, but
the coefficients of W(n) can still grow infinitely large, i.e. divergence of the coefficients is
still possible. A more practical bound is
$$0 < \mu < \frac{2}{\mathrm{tr}[R]} \tag{2.15}$$
where tr[R] denotes the trace of the autocorrelation matrix. This bound guarantees that the
coefficients of W(n) do not diverge (in practice, the value of μ should not be chosen close to this
upper bound, since it is somewhat optimistic due to approximations and assumptions made in the
derivation of the bound).
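The two bounds (2.13) and (2.15) can be compared on a small example. The sketch below estimates λ_max by power iteration and evaluates both limits; the 3×3 autocorrelation matrix (lag correlations 1, 0.5, 0.25) and the function name are assumptions chosen for illustration.

```python
def largest_eigenvalue(matrix, n_iter=200):
    """Estimate the largest eigenvalue of a symmetric matrix by power iteration."""
    n = len(matrix)
    v = [1.0] * n
    for _ in range(n_iter):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(c * c for c in w) ** 0.5
        v = [c / norm for c in w]        # keep the iterate at unit length
    rv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(a * b for a, b in zip(v, rv))   # Rayleigh quotient v^T R v


# Assumed autocorrelation matrix of a mildly colored input.
R = [
    [1.0, 0.5, 0.25],
    [0.5, 1.0, 0.5],
    [0.25, 0.5, 1.0],
]

lam_max = largest_eigenvalue(R)
eigenvalue_bound = 2.0 / lam_max                        # equation (2.13)
trace_bound = 2.0 / sum(R[i][i] for i in range(len(R)))  # equation (2.15)
print(lam_max, eigenvalue_bound, trace_bound)
```

Because the trace is the sum of all (non-negative) eigenvalues, it is never smaller than λ_max, so the trace bound is always the tighter of the two and needs no eigenvalue computation.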
CHAPTER 3
IMPLEMENTATION OF LMS ADAPTIVE FILTER [7]
3.1) INTRODUCTION
In LMS the weight vector is updated from sample to sample as follows:
$$h_{k+1} = h_k - \mu \nabla_k \tag{3.1}$$
where h_k and ∇_k are the weight and the true gradient vectors, respectively, at the k-th sampling
instant, and μ controls the stability and the rate of convergence.
The LMS algorithm for updating the weights from sample to sample is
$$h_{k+1} = h_k + 2\mu\, e_k\, x_k \tag{3.2}$$
where
$$e_k = y_k - h_k^{T} x_k \tag{3.3}$$
3.2) IMPLEMENTATION OF LMS ALGORITHM [7]
1) Initially, set each weight h_k(i), for i = 0, 1, 2, …, N−1, to an arbitrary fixed value such
as 0.
For each subsequent sampling instant, k = 1, 2, …, carry out steps (2) to (4) below.
2) Compute the filter output as
$$n_k = \sum_{i=0}^{N-1} h_k(i)\, x_{k-i} \tag{3.4}$$
3) Compute the error estimate
$$e_k = y_k - n_k \tag{3.5}$$
4) Update the next filter weights
$$h_{k+1}(i) = h_k(i) + 2\mu\, e_k\, x_{k-i} \tag{3.6}$$
The LMS algorithm requires approximately 2N+1 multiplications and 2N+1 additions for each
new set of input and output samples.
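Steps (1) to (4) and the operation count can be sketched in software as follows. This is an illustration rather than the hardware implementation: the two-tap target system, the random input, the seed and the value 2μ = 0.1 are assumptions, and the counters treat 2μ as a precomputed constant and include the accumulation from zero in step 2.

```python
import random


def lms_iteration(h, x_hist, y_k, two_mu, counts):
    """One pass of steps (2)-(4); x_hist[i] holds x_{k-i}; h is updated in place."""
    n_taps = len(h)
    n_k = 0.0
    for i in range(n_taps):              # step 2, equation (3.4)
        n_k += h[i] * x_hist[i]
        counts["mul"] += 1
        counts["add"] += 1
    e_k = y_k - n_k                      # step 3, equation (3.5)
    counts["add"] += 1
    factor = two_mu * e_k                # 2*mu*e_k, with 2*mu assumed precomputed
    counts["mul"] += 1
    for i in range(n_taps):              # step 4, equation (3.6)
        h[i] += factor * x_hist[i]
        counts["mul"] += 1
        counts["add"] += 1
    return e_k


rng = random.Random(1)
target = [0.5, -0.3]                     # hypothetical system that produces y_k
h = [0.0, 0.0]                           # step 1: weights start at zero
x_hist = [0.0, 0.0]
counts = {"mul": 0, "add": 0}
n_samples = 2000
for _ in range(n_samples):
    x_hist = [rng.uniform(-1.0, 1.0)] + x_hist[:-1]
    y_k = sum(t * x for t, x in zip(target, x_hist))
    lms_iteration(h, x_hist, y_k, 0.1, counts)

print(h, counts["mul"] // n_samples, counts["add"] // n_samples)
```

With N = 2 taps each sample costs 2N+1 = 5 multiplications and, under the counting convention stated above, 5 additions, which matches the estimate in the text.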
3.3) FLOWCHART FOR THE LMS ADAPTIVE FILTER [7]
Initialize h_k(i) and x_{k−i}
→ Read x_k and y_k from ADC
→ Filter x_k: n_k = Σ w_k(i)·x_{k−i}
→ Compute Error: e_k = y_k − n_k
→ Compute Factor: 2μe_k
→ Update Coefficient: w_{k+1}(i) = w_k(i) + 2μe_k·x_{k−i}
3.4) IMPLEMENTATION OF DIFFERENT ORDERS LMS ADAPTIVE FILTER
3.4.1) Introduction [7]
The LMS algorithm is a linear adaptive filtering algorithm, which, in general, consists of two
basic processes:
1) A filtering process, which involves (a) computing the output of a linear filter in response
to an input signal and (b) generating an estimation error by comparing this output with a
desired response.
2) An adaptive process, which involves the automatic adjustment of the parameters of the
filter in accordance with the estimation error.
The combination of these two processes working together constitutes a feedback loop. First, we
have a transversal filter, around which the LMS algorithm is built; this component is responsible
for performing the filtering process. Second, we have a mechanism for performing the adaptive
control process on the tap weights of the transversal filter, hence called the adaptive weight
control mechanism.
Figure 3.1: LMS filter
3.4.2) 1st order LMS adaptive filter
3.4.2.1) Introduction
Figure 3.2: 1st order LMS adaptive filter
d_out is the output of the transversal filter
y_n is the desired signal
e(n) is the estimation error, the desired signal minus the filter output, given as
$$e(n) = y(n) - d_{out}(n) \tag{3.7}$$
$$w(n+1) = w(n) + 2\mu\, e(n)\, x_{in}(n) \tag{3.8}$$
w(n+1) is the updated weight and w(n) is the previous weight
Components required for designing the 1st order LMS adaptive filter are:
Number of delay elements required = 1
Number of multipliers in transversal filter = 2
Number of multipliers in adaptive weight control mechanism = 3
Number of adders in transversal filter = 1
Number of adders in adaptive weight mechanism = 3
Here the total number of multipliers is 5 and the total number of adders is 4. The delay of the
QSD adder is 13.931 ns and the delay of the QSD multiplier is 11.348 ns, so summing the delays of
all components gives a total delay for the 1st order LMS adaptive filter of
5 × 11.348 ns + 4 × 13.931 ns = 112.464 ns.
3.4.2.2) VHDL implementation of 1st order LMS adaptive filter
Here we are using μ=0.5.
3.4.2.2.1) VHDL code for 1st order LMS adaptive filter
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity first_order_filter is
port( x2,x1,x0:in std_logic ;
y5,y4,y3,y2,y1,y0:in std_logic ;
q2,q1,q0:in std_logic ;
w02,w01,w00: in std_logic ;
w12,w11,w10: in std_logic;
d5,d4,d3,d2,d1,d0:inout std_logic);
end first_order_filter;
-- }} End of automatically maintained section
architecture first_order_filter of first_order_filter is
component delay_unit
port(
a ,b,c: in STD_LOGIC;
d ,e,f: out STD_LOGIC
);
end component ;
component qsdadder
port (b2,b1,b0,a2,a1,a0 : in std_logic;
c1,c0,s2,s1,s0 : out std_logic);
end component ;
component qsdadder2bit
port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0:in std_logic ;
z5,z4,z3,z2,z1,z0:out std_logic );
end component ;
component QSD_SINGLE_DIGIT_MULT
port(
a2 : in STD_LOGIC;
a1 : in STD_LOGIC;
a0 : in STD_LOGIC;
b2 : in STD_LOGIC;
b1 : in STD_LOGIC;
b0 : in STD_LOGIC;
c2 : inout STD_LOGIC;
c1 : inout STD_LOGIC;
c0 : inout STD_LOGIC;
m2 : inout STD_LOGIC;
m1 : inout STD_LOGIC;
m0 : inout STD_LOGIC
);
end component ;
component adaptationunit_first_order
port (d5,d4,d3,d2,d1,d0:in std_logic ;
y5,y4,y3,y2,y1,y0:in std_logic ;
q2,q1,q0:in std_logic ;
x12,x11,x10,x22,x21,x20: in std_logic ;
w12,w11,w10,w22,w21,w20: in std_logic ;
wo12,wo11,wo10,wo22,wo21,wo20: out std_logic );
end component ;
signal xd12,xd11,xd10:std_logic ;
signal xd22,xd21,xd20 :std_logic ;
signal nk02,nk01,nk00: std_logic ;
signal nk12,nk11,nk10: std_logic ;
signal nki02,nki01,nki00: std_logic ;
signal nki12,nki11,nki10: std_logic ;
signal do4,do3: std_logic ;
signal ws02,ws01,ws00,ws12,ws11,ws10:std_logic ;
begin
delay1: delay_unit port map (x2,x1,x0,xd12,xd11,xd10);
-- weight operand and outputs listed in the same positional order as in mul2
mul1: QSD_SINGLE_DIGIT_MULT port map
(x2,x1,x0,ws02,ws01,ws00,nki02,nki01,nki00,nk02,nk01,nk00);
mul2: QSD_SINGLE_DIGIT_MULT port map
(xd12,xd11,xd10,ws12,ws11,ws10,nki12,nki11,nki10,nk12,nk11,nk10);
add1: qsdadder2bit port map (
nki02,nki01,nki00,nk02,nk01,nk00,nki12,nki11,nki10,nk12,nk11,nk10,d5,d4,d3,d2,d1,d0);
adaptation: adaptationunit_first_order port map
(d5,d4,d3,d2,d1,d0,y5,y4,y3,y2,y1,y0,q2,q1,q0,x2,x1,x0,xd12,xd11,xd10,
w02,w01,w00,w12,w11,w10,ws02,ws01,ws00,ws12,ws11,ws10);
-- enter your statements here --
end first_order_filter;
Figure 3.3: VHDL simulation of 1st order LMS adaptive filter
3.4.2.2.2) VHDL code for one digit QSD adder
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity qsdadder is
port (b2,b1,b0,a2,a1,a0 : in std_logic;
c1,c0,s2,s1,s0 : out std_logic);
end qsdadder;
-- }} End of automatically maintained section
architecture qsdadder of qsdadder is
begin
c1 <= (a2 and b2 and(not b1)) or (a2 and (not a1) and b2) or
(a2 and b2 and (not b0))or (a2 and (not a0) and b2) or (b2 and (not a1)
and (not a0) and (not b1)) or (a2 and (not a1) and (not b1) and (not b0)) after 2 ns;
c0 <= (a2 and b2 and (not b1)) or (a2 and (not a1) and b2) or
(a2 and b2 and (not b0)) or (a2 and (not a0) and b2) or
((not a1) and (not a0) and b2 and (not b1))
or ((not a2)and a1 and (not b2) and b1) or ((not a2) and a0 and (not b2) and b1)
or((not a2) and (not b2) and b1 and b0) or ((not a2) and a1 and (not b2) and b0)
or (a2 and (not a1) and (not b1) and (not b0)) or ((not a2) and a1 and a0 and (not b2))
after 2 ns;
s2 <= ((not a1) and b2 and b0) or (a2 and (not a0) and (not b1))
or ((not a1) and a0 and b2 and (not b1)) or ((not a1) and (not a0) and b2 and b1)
or ((not a1) and a0 and b1 and (not b0)) or ((not a1) and (not a0) and b1 and b0)
or (a2 and (not a1 ) and (not b1) and b0) or (a1 and (not a0) and (not b1) and b0)
or (a2 and a1 and (not b1) and (not b0)) or (a1 and a0 and (not b1 ) and (not b0))
or (a2 and a1 and a0 and b2 and b1 and b0) after 2 ns;
s1 <= ((not a1) and b1 and (not b0)) or ((not a1) and (not a0) and b1 )
or (a1 and (not a0) and (not b1)) or (a1 and (not b1) and (not b0))
or ( a1 and a0 and b1 and b0) or ((not a1) and a0 and (not b1) and b0) after 2 ns;
s0 <= (a0 and (not b0)) or ((not a0) and b1 and b0) or ((not a2) and (not a0) and b0)
or ((not a0 ) and (not b2) and b0) after 2 ns;
-- enter your statements here --
end qsdadder;
Figure 3.4: 1 digit QSD adder
3.4.2.2.3) VHDL code for two digit QSD adder
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity qsdadder2bit is
port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0:in std_logic ;
z5,z4,z3,z2,z1,z0:out std_logic );
end qsdadder2bit;
-- }} End of automatically maintained section
architecture qsdadder2bit of qsdadder2bit is
signal ci5,ci4,ci3:std_logic ;
signal s5,s4,s3,s2,s1:std_logic ;
signal si5,si4:std_logic ;
component qsdadder
port (b2,b1,b0,a2,a1,a0 : in std_logic;
c1,c0,s2,s1,s0 : out std_logic);
end component ;
begin
ci5<='0' ;
add1: qsdadder port map ( x2,x1,x0,y2,y1,y0 ,ci4,ci3,z2,z1,z0);
add2: qsdadder port map ( x5,x4,x3,y5,y4,y3 ,s5,s4,s3,s2,s1);
add3: qsdadder port map ( s3,s2,s1,ci5,ci4,ci3,si5,si4,z5,z4,z3);
-- enter your statements here --
end qsdadder2bit;
Figure 3.5: 2 digit QSD adder
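The two-step, carry-free structure of the two digit adder (a first stage producing an intermediate carry and sum per digit, and a second stage adding each intermediate sum to the carry from the next lower digit) can be modelled behaviourally. The sketch below is an assumption-laden illustration, not a translation of the VHDL: digits are plain integers in the range −3 to 3 rather than 3-bit codes, the carry thresholds of ±3 are one common choice for QSD addition, and the function names are hypothetical.

```python
from itertools import product


def qsd_value(digits):
    """Integer value of radix-4 signed digits, least significant digit first."""
    return sum(d * 4 ** i for i, d in enumerate(digits))


def qsd_add(a, b):
    """Carry-free addition of two equal-length QSD numbers (digits in -3..3)."""
    carries, sums = [], []
    # Step 1: every digit position independently produces a carry in {-1, 0, 1}
    # and an intermediate sum in {-2, ..., 2}.
    for da, db in zip(a, b):
        t = da + db
        if t >= 3:
            c, w = 1, t - 4
        elif t <= -3:
            c, w = -1, t + 4
        else:
            c, w = 0, t
        carries.append(c)
        sums.append(w)
    # Step 2: add the carry of the next lower digit; the result stays in -3..3,
    # so no carry can propagate any further.
    result = []
    for i, w in enumerate(sums):
        carry_in = carries[i - 1] if i > 0 else 0
        result.append(w + carry_in)
    result.append(carries[-1])           # carry out of the most significant digit
    return result


a = [3, 2]                               # 3 + 2*4 = 11
b = [3, 3]                               # 3 + 3*4 = 15
print(qsd_add(a, b), qsd_value(qsd_add(a, b)))  # [2, 2, 1] 26
```

Each output digit depends only on its own position and the one below it, which is why the addition time does not grow with the number of digits.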
3.4.2.2.3) VHDL code for single digit multiplier
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity QSD_SINGLE_DIGIT_MULT is
port(
a2 : in STD_LOGIC;
a1 : in STD_LOGIC;
a0 : in STD_LOGIC;
b2 : in STD_LOGIC;
b1 : in STD_LOGIC;
b0 : in STD_LOGIC;
c2 : inout STD_LOGIC;
c1 : inout STD_LOGIC;
c0 : inout STD_LOGIC;
m2 : inout STD_LOGIC;
m1 : inout STD_LOGIC;
m0 : inout STD_LOGIC
);
end QSD_SINGLE_DIGIT_MULT;
-- }} End of automatically maintained section
architecture QSD_SINGLE_DIGIT_MULT of QSD_SINGLE_DIGIT_MULT is
begin
c2<= (a2 and(not b2)and b0 and((not b1)nand a1)) or (a2 and(not b2)and b1 and(a1 nand
a0)) or ((not a2)and a0 and b2 and ((not a1)nand b1)) or ((not a2) and a1 and b2 and(b1 nand
b0));
c1<= c2 or ( a2 and(not a1)and b2 and(not b1)) or (a1 and a0 and (not b2)and b1 and b0);
c0<= (a1 and(a0 nor b2)and b1) or (a1 and b1 and(a2 nor b0)) or (a2 and b2 and(a1 xor b1)) or
( a2 and b2 and(a0 nor b0)) or (a2 and b1 and(a1 nor b0)) or (a1 and b2 and(a0 nor b1)) or ((a1
nor b1)and a2 and(not b2)and b0) or ((a1 nor b1)and(not a2)and a0 and b2) or (a2 and a1 and(not
b2)and b1 and b0) or ((not a2)and a1 and a0 and b2 and b1) or ((a2 nor b2)and a0 and b0 and(a1
xor b1));
m2<= (a2 and b1 and(a1 nor b2)) or (a1 and b2 and(a2 nor b1)) or (a1 and a0 and b2 and(not
b1)) or(a2 and(not a1)and b1 and b0) or (a0 and b2 and(a2 nor b0)) or (a0 and b0 and(a1 xor b1))
or (a2 and b0 and(a0 nor b2)) or (a2 and a0 and b1 and(b2 nor b0)) or (a1 and b2 and b0 and(a2
nor a0));
m1<= (a0 and b1 and(a1 nand b0)) or (a1 and b0 and(b1 nand a0));
m0<= a0 and b0;
-- enter your statements here --
end QSD_SINGLE_DIGIT_MULT;
Figure 3.6: Single digit QSD multiplier
Simulating the QSD single digit multiplier in Xilinx gives a delay of 11.348 ns.
3.4.2.2.4) VHDL code for complement generator of two digit QSD number
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity complement_genrator is
port(a5,a4,a3,a2,a1,a0: in std_logic;
b5,b4,b3,b2,b1,b0: inout std_logic);
end complement_genrator;
-- }} End of automatically maintained section
architecture complement_genrator of complement_genrator is
signal f5,f4,f3,f2,f1,f0: std_logic;
signal n2,n1: std_logic ;
signal n0: std_logic ;
component qsdadder
port (a0,a1,a2,b0,b1,b2 : in std_logic;
c0,c1,s0,s1,s2 : out std_logic);
end component;
begin
n2<='0';
n1<='0';
n0<='1';
f5<='0';
-- sensitivity list covers every input read inside the process
process(a5,a4,a3,a2,a1,a0)
begin
if a2='0' and a1 ='0' and a0='0' then
b2<='0' ; b1<='1' ; b0<='1';
end if;
if a2='0' and a1 ='0' and a0='1' then
b2<='0' ; b1<='1' ; b0<='0';
end if ;
if a2='0' and a1 ='1' and a0='0' then
b2<='0' ; b1<='0' ; b0<='1';
end if ;
if a2='0' and a1 ='1' and a0='1' then
b2<='0' ; b1<='0' ; b0<='0';
end if ;
if a5='0' and a4 ='0' and a3='0' then
b5<='0' ; b4<='1' ; b3<='1';
end if ;
if a5='0' and a4 ='0' and a3='1' then
b5<='0' ; b4<='1' ; b3<='0';
end if ;
if a5='0' and a4 ='1' and a3='0' then
b5<='0' ; b4<='0' ; b3<='1';
end if ;
if a5='0' and a4 ='1' and a3='1' then
b5<='0' ; b4<='0' ; b3<='0';
end if ;
end process;
add1: qsdadder port map (n0,n1,n2,b0,b1,b2,f3,f4,f0,f1,f2) ;
add2: qsdadder port map (f3,f4,f5,b3,b4,b5,f4,f5,b3,b4,b5) ;
-- enter your statements here --
end complement_genrator;
Figure 3.7: Two digit QSD number complement generator
3.4.2.2.5) VHDL code for delay unit
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity delay_unit is
port(
a ,b,c: in STD_LOGIC;
d ,e,f: out STD_LOGIC
);
end delay_unit;
-- }} End of automatically maintained section
architecture delay_unit of delay_unit is
begin
d<=a after 100 ns;
e<=b after 100 ns;
f<=c after 100 ns;
-- enter your statements here --
end delay_unit;
Figure 3.8: Delay unit
3.4.2.2.6) VHDL code for adaptive weight control mechanism
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity adaptationunit_first_order is
port (d5,d4,d3,d2,d1,d0:in std_logic ;
y5,y4,y3,y2,y1,y0:in std_logic ;
q2,q1,q0:in std_logic ;
x12,x11,x10,x22,x21,x20: in std_logic ;
w12,w11,w10,w22,w21,w20: in std_logic ;
wo12,wo11,wo10,wo22,wo21,wo20: out std_logic );
end adaptationunit_first_order;
-- }} End of automatically maintained section
architecture adaptationunit_first_order of adaptationunit_first_order is
component complement_genrator
port(a5,a4,a3,a2,a1,a0: in std_logic;
b5,b4,b3,b2,b1,b0: inout std_logic);
end component ;
component qsdadder2bit
port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0:in std_logic ;
z5,z4,z3,z2,z1,z0:out std_logic );
end component ;
component QSD_SINGLE_DIGIT_MULT
port(
a2 : in STD_LOGIC;
a1 : in STD_LOGIC;
a0 : in STD_LOGIC;
b2 : in STD_LOGIC;
b1 : in STD_LOGIC;
b0 : in STD_LOGIC;
c2 : inout STD_LOGIC;
c1 : inout STD_LOGIC;
c0 : inout STD_LOGIC;
m2 : inout STD_LOGIC;
m1 : inout STD_LOGIC;
m0 : inout STD_LOGIC
);
end component ;
component qsdadder
port (b2,b1,b0,a2,a1,a0 : in std_logic;
c1,c0,s2,s1,s0 : out std_logic);
end component ;
signal dc5,dc4,dc3,dc2,dc1,dc0:std_logic ;
signal e5,e4,e3,e2,e1,e0:std_logic ;
signal f5,f4,f3,f2,f1,f0:std_logic ;
signal g15,g14,g13,g12,g11,g10:std_logic ;
signal g25,g24,g23,g22,g21,g20:std_logic ;
signal wo14,wo13,wo24,wo23:std_logic ;
begin
complement: complement_genrator port map (
d5,d4,d3,d2,d1,d0,dc5,dc4,dc3,dc2,dc1,dc0);
add1: qsdadder2bit port map (
dc5,dc4,dc3,dc2,dc1,dc0,y5,y4,y3,y2,y1,y0,e5,e4,e3,e2,e1,e0);
mul1: QSD_SINGLE_DIGIT_MULT port map (e2,e1,e0,q2,q1,q0,f5,f4,f3,f2,f1,f0);
mul21:QSD_SINGLE_DIGIT_MULT port map
(f2,f1,f0,x12,x11,x10,g15,g14,g13,g12,g11,g10);
mul22:QSD_SINGLE_DIGIT_MULT port map
(f2,f1,f0,x22,x21,x20,g25,g24,g23,g22,g21,g20) ;
add2: qsdadder port map (w12,w11,w10, g12,g11,g10 ,wo14,wo13,wo12,wo11,wo10);
add3: qsdadder port map (w22,w21,w20, g22,g21,g20, wo24,wo23,wo22,wo21,wo20);
-- enter your statements here --
end adaptationunit_first_order;
Figure 3.9: Adaptive weight control mechanism of 1st order LMS adaptive filter
3.4.3) 2nd order LMS adaptive filter
3.4.3.1) Introduction
Figure 3.10: 2nd order LMS adaptive filter
d_out is the output of the transversal filter
y_n is the desired output
e(n) is the estimation error, the desired output minus the filter output, given as
$$e(n) = y(n) - d_{out}(n) \tag{3.9}$$
$$w(n+1) = w(n) + 2\mu\, e(n)\, x_{in}(n) \tag{3.10}$$
w(n+1) is the updated weight and w(n) is the previous weight
Components required for designing the 2nd order LMS adaptive filter are:
Number of delay elements required = 2
Number of multipliers in transversal filter = 3
Number of multipliers in adaptive weight control mechanism = 4
Number of adders in transversal filter = 2
Number of adders in adaptive weight mechanism = 4
Here the total number of multipliers is 7 and the total number of adders is 6. The delay of the
QSD adder is 13.931 ns and the delay of the QSD multiplier is 11.348 ns, so summing the delays of
all components gives a total delay for the 2nd order LMS adaptive filter of
7 × 11.348 ns + 6 × 13.931 ns = 163.022 ns.
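The total-delay figures quoted for the two filters follow from multiplying the component counts by the per-component delays and adding the results. The short sketch below reproduces that arithmetic; the function and constant names are hypothetical, and the model simply sums every component delay as the text does, without analysing which components lie on the same signal path.

```python
QSD_ADDER_DELAY_NS = 13.931        # delay quoted for the QSD adder
QSD_MULTIPLIER_DELAY_NS = 11.348   # delay quoted for the QSD multiplier


def total_delay_ns(n_multipliers, n_adders):
    """Sum of all multiplier and adder delays, in nanoseconds."""
    return n_multipliers * QSD_MULTIPLIER_DELAY_NS + n_adders * QSD_ADDER_DELAY_NS


first_order = total_delay_ns(n_multipliers=5, n_adders=4)
second_order = total_delay_ns(n_multipliers=7, n_adders=6)
print(round(first_order, 3), round(second_order, 3))  # 112.464 163.022
```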
3.4.3.2) VHDL implementation of 2nd order LMS adaptive filter
3.4.3.2.1) VHDL code for 2nd order LMS adaptive filter
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity second_order_filter is
port( x2,x1,x0:in std_logic ;
y5,y4,y3,y2,y1,y0:in std_logic ;
q2,q1,q0:in std_logic ;
w02,w01,w00: in std_logic ;
w12,w11,w10: in std_logic;
w22,w21,w20: in std_logic;
d5,d4,d3,d2,d1,d0:inout std_logic);
end second_order_filter;
-- }} End of automatically maintained section
architecture second_order_filter of second_order_filter is
component delay_unit
port(
a ,b,c: in STD_LOGIC;
d ,e,f: out STD_LOGIC
);
end component ;
component qsdadder
port (b2,b1,b0,a2,a1,a0 : in std_logic;
c1,c0,s2,s1,s0 : out std_logic);
end component ;
component qsdadder2bit
port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0:in std_logic ;
z5,z4,z3,z2,z1,z0:out std_logic );
end component ;
component QSD_SINGLE_DIGIT_MULT
port(
a2 : in STD_LOGIC;
a1 : in STD_LOGIC;
a0 : in STD_LOGIC;
b2 : in STD_LOGIC;
b1 : in STD_LOGIC;
b0 : in STD_LOGIC;
c2 : inout STD_LOGIC;
c1 : inout STD_LOGIC;
c0 : inout STD_LOGIC;
m2 : inout STD_LOGIC;
m1 : inout STD_LOGIC;
m0 : inout STD_LOGIC
);
end component ;
component adaptation_unit
port (d5,d4,d3,d2,d1,d0:in std_logic ;
y5,y4,y3,y2,y1,y0:in std_logic ;
q2,q1,q0:in std_logic ;
x12,x11,x10,x22,x21,x20,x32,x31,x30: in std_logic ;
w12,w11,w10,w22,w21,w20,w32,w31,w30: in std_logic ;
wo12,wo11,wo10,wo22,wo21,wo20,wo32,wo31,wo30: out std_logic );
end component ;
signal xd12,xd11,xd10:std_logic ;
signal xd22,xd21,xd20 :std_logic ;
signal nk02,nk01,nk00: std_logic ;
signal nk12,nk11,nk10: std_logic ;
signal nki22,nki21,nki20,nk22,nk21,nk20:std_logic ;
signal nki02,nki01,nki00: std_logic ;
signal nki12,nki11,nki10: std_logic ;
signal do4,do3: std_logic ;
signal di5,di4,di3,di2,di1,di0:std_logic ;
signal ws02,ws01,ws00,ws12,ws11,ws10,ws22,ws21,ws20:std_logic ;
begin
delay1: delay_unit port map (x2,x1,x0,xd12,xd11,xd10);
delay2: delay_unit port map (xd12,xd11,xd10,xd22,xd21,xd20);
-- weight operand and outputs listed in the same positional order as in mul2 and mul3
mul1: QSD_SINGLE_DIGIT_MULT port map
(x2,x1,x0,ws02,ws01,ws00,nki02,nki01,nki00,nk02,nk01,nk00);
mul2: QSD_SINGLE_DIGIT_MULT port map
(xd12,xd11,xd10,ws12,ws11,ws10,nki12,nki11,nki10,nk12,nk11,nk10);
mul3: QSD_SINGLE_DIGIT_MULT port map
(xd22,xd21,xd20,ws22,ws21,ws20,nki22,nki21,nki20,nk22,nk21,nk20);
add1: qsdadder2bit port map (
nki02,nki01,nki00,nk02,nk01,nk00,nki12,nki11,nki10,nk12,nk11,nk10,di5,di4,di3,di2,di1,di0);
add2: qsdadder2bit port map
(di5,di4,di3,di2,di1,di0,nki22,nki21,nki20,nk22,nk21,nk20,d5,d4,d3,d2,d1,d0);
adaptation: adaptation_unit port map
(d5,d4,d3,d2,d1,d0,y5,y4,y3,y2,y1,y0,q2,q1,q0,x2,x1,x0,xd12,xd11,xd10,xd22,xd21,xd20,
w02,w01,w00,w12,w11,w10,w22,w21,w20,ws02,ws01,ws00,ws12,ws11,ws10,ws22,ws21,ws20);
-- enter your statements here --
end second_order_filter;
Figure 3.11: VHDL simulation of 2nd order LMS adaptive filter
3.4.3.2.2) VHDL code for adaptive weight control mechanism of 2nd order LMS filter
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity adaptation_unit is
port (d5,d4,d3,d2,d1,d0:in std_logic ;
y5,y4,y3,y2,y1,y0:in std_logic ;
q2,q1,q0:in std_logic ;
x12,x11,x10,x22,x21,x20,x32,x31,x30: in std_logic ;
w12,w11,w10,w22,w21,w20,w32,w31,w30: in std_logic ;
wo12,wo11,wo10,wo22,wo21,wo20,wo32,wo31,wo30: out std_logic );
end adaptation_unit;
-- }} End of automatically maintained section
architecture adaptation_unit of adaptation_unit is
component complement_genrator
port(a5,a4,a3,a2,a1,a0: in std_logic;
b5,b4,b3,b2,b1,b0: inout std_logic);
end component ;
component qsdadder2bit
port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0:in std_logic ;
z5,z4,z3,z2,z1,z0:out std_logic );
end component ;
component QSD_SINGLE_DIGIT_MULT
port(
a2 : in STD_LOGIC;
a1 : in STD_LOGIC;
a0 : in STD_LOGIC;
b2 : in STD_LOGIC;
b1 : in STD_LOGIC;
b0 : in STD_LOGIC;
c2 : inout STD_LOGIC;
c1 : inout STD_LOGIC;
c0 : inout STD_LOGIC;
m2 : inout STD_LOGIC;
m1 : inout STD_LOGIC;
m0 : inout STD_LOGIC
);
end component ;
component qsdadder
port (b2,b1,b0,a2,a1,a0 : in std_logic;
c1,c0,s2,s1,s0 : out std_logic);
end component ;
signal dc5,dc4,dc3,dc2,dc1,dc0:std_logic ;
signal e5,e4,e3,e2,e1,e0:std_logic ;
signal f5,f4,f3,f2,f1,f0:std_logic ;
signal g15,g14,g13,g12,g11,g10:std_logic ;
signal g25,g24,g23,g22,g21,g20:std_logic ;
signal g35,g34,g33,g32,g31,g30:std_logic ;
signal wo14,wo13,wo34,wo33,wo24,wo23:std_logic ;
begin
complement: complement_genrator port map (
d5,d4,d3,d2,d1,d0,dc5,dc4,dc3,dc2,dc1,dc0);
add1: qsdadder2bit port map (
dc5,dc4,dc3,dc2,dc1,dc0,y5,y4,y3,y2,y1,y0,e5,e4,e3,e2,e1,e0);
mul1: QSD_SINGLE_DIGIT_MULT port map (e2,e1,e0,q2,q1,q0,f5,f4,f3,f2,f1,f0);
mul21:QSD_SINGLE_DIGIT_MULT port map
(f2,f1,f0,x12,x11,x10,g15,g14,g13,g12,g11,g10);
mul22:QSD_SINGLE_DIGIT_MULT port map
(f2,f1,f0,x22,x21,x20,g25,g24,g23,g22,g21,g20) ;
mul23:QSD_SINGLE_DIGIT_MULT port map
(f2,f1,f0,x32,x31,x30,g35,g34,g33,g32,g31,g30);
add2: qsdadder port map (w12,w11,w10, g12,g11,g10 ,wo14,wo13,wo12,wo11,wo10);
add3: qsdadder port map (w22,w21,w20, g22,g21,g20, wo24,wo23,wo22,wo21,wo20);
add4: qsdadder port map (w32,w31,w30, g32,g31,g30, wo34,wo33,wo32,wo31,wo30);
-- enter your statements here --
end adaptation_unit;
Figure 3.12: Adaptive weight control mechanism of 2nd order LMS adaptive filter
CHAPTER 4
CONCLUSION
We have implemented 1st and 2nd order adaptive filters using the LMS algorithm for the adaptive
weight control mechanism. To implement these adaptive filters we have used the non-conventional
quaternary signed digit (QSD) number system. For this we have designed and implemented addition
and multiplication blocks for the QSD number system, and used these blocks to implement our
adaptive filter. We have shown above that in the QSD number system the addition takes place in
parallel, so the delay is constant and does not depend on the number of bits to be added; the
delay of the QSD adder is 13.931 ns and the delay of the QSD multiplier is 11.348 ns.
The LMS algorithm requires approximately 2N+1 multiplications and 2N+1 additions for each
new set of input and output samples, where N is the order of the filter. So the delay depends
upon the number of multiplications and additions.
Here we have implemented the adaptive filter using QSD adders and multipliers; the total delay
of the 1st order LMS adaptive filter is 112.464 ns and the total delay of the 2nd order LMS
adaptive filter is 163.022 ns. So the delay is much less in comparison to the implementation of
the adaptive filter using conventional adders and multipliers.
APPENDIX
1. Xilinx report for QSD adder
Release 9.2i - xst J.36
Copyright (c) 1995-2007 Xilinx, Inc. All rights reserved.
--> Parameter TMPDIR set to ./xst/projnav.tmp
CPU : 0.00 / 0.16 s | Elapsed : 0.00 / 0.00 s
--> Parameter xsthdpdir set to ./xst
CPU : 0.00 / 0.16 s | Elapsed : 0.00 / 0.00 s
--> Reading design: qsdadder.prj
TABLE OF CONTENTS
1) Synthesis Options Summary
2) HDL Compilation
3) Design Hierarchy Analysis
4) HDL Analysis
5) HDL Synthesis
5.1) HDL Synthesis Report
6) Advanced HDL Synthesis
6.1) Advanced HDL Synthesis Report
7) Low Level Synthesis
8) Partition Report
9) Final Report
9.1) Device utilization summary
9.2) Partition Resource Summary
9.3) TIMING REPORT
=========================================================================
* Synthesis Options Summary *
=========================================================================
---- Source Parameters
Input File Name : "qsdadder.prj"
Input Format : mixed
Ignore Synthesis Constraint File : NO
---- Target Parameters
Output File Name : "qsdadder"
Output Format : NGC
Target Device : xc2s15-6-cs144
---- Source Options
Top Module Name : qsdadder
Automatic FSM Extraction : YES
FSM Encoding Algorithm : Auto
Safe Implementation : No
FSM Style : lut
RAM Extraction : Yes
RAM Style : Auto
ROM Extraction : Yes
Mux Style : Auto
Decoder Extraction : YES
Priority Encoder Extraction : YES
Shift Register Extraction : YES
Logical Shifter Extraction : YES
XOR Collapsing : YES
ROM Style : Auto
Mux Extraction : YES
Resource Sharing : YES
Asynchronous To Synchronous : NO
Multiplier Style : lut
Automatic Register Balancing : No
---- Target Options
Add IO Buffers : YES
Global Maximum Fanout : 100
Add Generic Clock Buffer(BUFG) : 4
Register Duplication : YES
Slice Packing : YES
Optimize Instantiated Primitives : NO
Convert Tristates To Logic : Yes
Use Clock Enable : Yes
Use Synchronous Set : Yes
Use Synchronous Reset : Yes
Pack IO Registers into IOBs : auto
Equivalent register Removal : YES
---- General Options
Optimization Goal : Speed
Optimization Effort : 1
Library Search Order : qsdadder.lso
Keep Hierarchy : NO
RTL Output : Yes
Global Optimization : AllClockNets
Read Cores : YES
Write Timing Constraints : NO
Cross Clock Analysis : NO
Hierarchy Separator : /
Bus Delimiter : <>
Case Specifier : maintain
Slice Utilization Ratio : 100
BRAM Utilization Ratio : 100
Verilog 2001 : YES
Auto BRAM Packing : NO
Slice Utilization Ratio Delta : 5
=========================================================================
=========================================================================
* HDL Compilation *
=========================================================================
Compiling vhdl file "C:/Xilinx92i/lma/adaptive.vhd" in Library work.
Architecture qsdadder of Entity qsdadder is up to date.
=========================================================================
* Design Hierarchy Analysis *
=========================================================================
Analyzing hierarchy for entity <qsdadder> in library <work> (architecture <qsdadder>).
=========================================================================
* HDL Analysis *
=========================================================================
Analyzing Entity <qsdadder> in library <work> (Architecture <qsdadder>).
Entity <qsdadder> analyzed. Unit <qsdadder> generated.
=========================================================================
* HDL Synthesis *
=========================================================================
Performing bidirectional port resolution...
Synthesizing Unit <qsdadder>.
Related source file is "C:/Xilinx92i/lma/adaptive.vhd".
Unit <qsdadder> synthesized.
=========================================================================
HDL Synthesis Report
Found no macro
=========================================================================
=========================================================================
* Advanced HDL Synthesis *
=========================================================================
Loading device for application Rf_Device from file '2s15.nph' in environment C:\Xilinx92i.
=========================================================================
Advanced HDL Synthesis Report
Found no macro
=========================================================================
=========================================================================
* Low Level Synthesis *
=========================================================================
Optimizing unit <qsdadder> ...
Mapping all equations...
Building and optimizing final netlist ...
Found area constraint ratio of 100 (+ 5) on block qsdadder, actual ratio is 3.
Final Macro Processing ...
=========================================================================
Final Register Report
Found no macro
=========================================================================
=========================================================================
* Partition Report *
=========================================================================
Partition Implementation Status

No Partitions were found in this design.

=========================================================================
* Final Report *
=========================================================================
Final Results
RTL Top Level Output File Name : qsdadder.ngr
Top Level Output File Name : qsdadder
Output Format : NGC
Optimization Goal : Speed
Keep Hierarchy : NO
Design Statistics
# IOs : 11
Cell Usage :
# BELS : 17
# LUT2 : 1
# LUT3 : 2
# LUT4 : 10
# MUXF5 : 3
# MUXF6 : 1
# IO Buffers : 11
# IBUF : 6
# OBUF : 5
=========================================================================
Device utilization summary:

Selected Device : 2s15cs144-6
Number of Slices: 7 out of 192 3%
Number of 4 input LUTs: 13 out of 384 3%
Number of IOs: 11
Number of bonded IOBs: 11 out of 86 12%

Partition Resource Summary:

No Partitions were found in this design.

=========================================================================
TIMING REPORT
NOTE: THESE TIMING NUMBERS ARE ONLY A SYNTHESIS ESTIMATE.
FOR ACCURATE TIMING INFORMATION PLEASE REFER TO THE TRACE REPORT
GENERATED AFTER PLACE-and-ROUTE.
Clock Information:

No clock signals found in this design
Asynchronous Control Signals Information:

No asynchronous control signals found in this design
Timing Summary:

Speed Grade: -6
Minimum period: No path found
Minimum input arrival time before clock: No path found
Maximum output required time after clock: No path found
Maximum combinational path delay: 13.931ns
Timing Detail:

All values displayed in nanoseconds (ns)
=========================================================================
Timing constraint: Default path analysis
Total number of paths / destination ports: 59 / 5

Delay: 13.931ns (Levels of Logic = 6)
Source: a0 (PAD)
Destination: c0 (PAD)
Data Path: a0 to c0
                              Gate     Net
Cell:in->out       fanout    Delay    Delay   Logical Name (Net Name)
-----------------------------------------------------------------------
IBUF:I->O              10    0.776    1.980   a0_IBUF (a0_IBUF)
LUT4:I0->O              1    0.549    1.035   c138 (c1_map15)
LUT3:I2->O              1    0.549    1.035   c149 (c1_map17)
LUT4:I0->O              2    0.549    1.206   c157 (c1_OBUF)
LUT4:I3->O              1    0.549    1.035   c0 (c0_OBUF)
OBUF:I->O                    4.668            c0_OBUF (c0)
-----------------------------------------------------------------------
Total                       13.931ns (7.640ns logic, 6.291ns route)
                                     (54.8% logic, 45.2% route)
=========================================================================
CPU : 3.12 / 3.31 s | Elapsed : 3.00 / 3.00 s
-->
Total memory usage is 161748 kilobytes
Number of errors : 0 ( 0 filtered)
Number of warnings : 0 ( 0 filtered)
Number of infos : 0 ( 0 filtered)
2. Xilinx report for QSD multiplier
Release 9.2i - xst J.36
Copyright (c) 1995-2007 Xilinx, Inc. All rights reserved.
--> Parameter TMPDIR set to ./xst/projnav.tmp
CPU : 0.00 / 0.16 s | Elapsed : 0.00 / 0.00 s
--> Parameter xsthdpdir set to ./xst
CPU : 0.00 / 0.16 s | Elapsed : 0.00 / 0.00 s
--> Reading design: QSD_SINGLE_DIGIT_MULT.prj
TABLE OF CONTENTS
1) Synthesis Options Summary
2) HDL Compilation
3) Design Hierarchy Analysis
4) HDL Analysis
5) HDL Synthesis
5.1) HDL Synthesis Report
6) Advanced HDL Synthesis
6.1) Advanced HDL Synthesis Report
7) Low Level Synthesis
8) Partition Report
9) Final Report
9.1) Device utilization summary
9.2) Partition Resource Summary
9.3) TIMING REPORT
=====================================================================
====
* Synthesis Options Summary *
=====================================================================
====
 Source Parameters
Input File Name : "QSD_SINGLE_DIGIT_MULT.prj"
Input Format : mixed
Ignore Synthesis Constraint File : NO
 Target Parameters
Output File Name : "QSD_SINGLE_DIGIT_MULT"
Output Format : NGC
Target Device : xc2s15-6-cs144
 Source Options
Top Module Name : QSD_SINGLE_DIGIT_MULT
Automatic FSM Extraction : YES
FSM Encoding Algorithm : Auto
Safe Implementation : No
FSM Style : lut
RAM Extraction : Yes
RAM Style : Auto
ROM Extraction : Yes
Mux Style : Auto
Decoder Extraction : YES
Priority Encoder Extraction : YES
Shift Register Extraction : YES
Logical Shifter Extraction : YES
XOR Collapsing : YES
ROM Style : Auto
Mux Extraction : YES
Resource Sharing : YES
Asynchronous To Synchronous : NO
Multiplier Style : lut
Automatic Register Balancing : No
 Target Options
Add IO Buffers : YES
Global Maximum Fanout : 100
Add Generic Clock Buffer(BUFG) : 4
Register Duplication : YES
Slice Packing : YES
Optimize Instantiated Primitives : NO
Convert Tristates To Logic : Yes
Use Clock Enable : Yes
Use Synchronous Set : Yes
Use Synchronous Reset : Yes
Pack IO Registers into IOBs : auto
Equivalent register Removal : YES
 General Options
Optimization Goal : Speed
Optimization Effort : 1
Library Search Order : QSD_SINGLE_DIGIT_MULT.lso
Keep Hierarchy : NO
RTL Output : Yes
Global Optimization : AllClockNets
Read Cores : YES
Write Timing Constraints : NO
Cross Clock Analysis : NO
Hierarchy Separator : /
Bus Delimiter : <>
Case Specifier : maintain
Slice Utilization Ratio : 100
BRAM Utilization Ratio : 100
Verilog 2001 : YES
Auto BRAM Packing : NO
Slice Utilization Ratio Delta : 5
=====================================================================
====
=====================================================================
====
* HDL Compilation *
=====================================================================
====
Compiling vhdl file
"C:/Xilinx92i/QSD_SINGLE_DIGIT_MULT/QSD_SINGLE_DIGIT_MULT.vhd" in Library
work.
Entity <QSD_SINGLE_DIGIT_MULT> compiled.
Entity <QSD_SINGLE_DIGIT_MULT> (Architecture <QSD_SINGLE_DIGIT_MULT>)
compiled.
=====================================================================
====
* Design Hierarchy Analysis *
=====================================================================
====
Analyzing hierarchy for entity <QSD_SINGLE_DIGIT_MULT> in library <work> (architecture
<QSD_SINGLE_DIGIT_MULT>).
=====================================================================
====
* HDL Analysis *
=====================================================================
====
Analyzing Entity <QSD_SINGLE_DIGIT_MULT> in library <work> (Architecture
<QSD_SINGLE_DIGIT_MULT>).
Entity <QSD_SINGLE_DIGIT_MULT> analyzed. Unit <QSD_SINGLE_DIGIT_MULT>
generated.
=====================================================================
====
* HDL Synthesis *
=====================================================================
====
Performing bidirectional port resolution...
Synthesizing Unit <QSD_SINGLE_DIGIT_MULT>.
Related source file is
"C:/Xilinx92i/QSD_SINGLE_DIGIT_MULT/QSD_SINGLE_DIGIT_MULT.vhd".
Found 1-bit xor2 for signal <m2$xor0000> created at line 55.
Unit <QSD_SINGLE_DIGIT_MULT> synthesized.
=====================================================================
====
HDL Synthesis Report
Macro Statistics
# Xors : 1
1bit xor2 : 1
=====================================================================
====
=====================================================================
====
* Advanced HDL Synthesis *
=====================================================================
====
Loading device for application Rf_Device from file '2s15.nph' in environment C:\Xilinx92i.
=====================================================================
====
Advanced HDL Synthesis Report
Macro Statistics
# Xors : 1
1bit xor2 : 1
=====================================================================
====
=====================================================================
====
* Low Level Synthesis *
=====================================================================
====
Optimizing unit <QSD_SINGLE_DIGIT_MULT> ...
Mapping all equations...
Building and optimizing final netlist ...
Found area constraint ratio of 100 (+ 5) on block QSD_SINGLE_DIGIT_MULT, actual ratio is 4.
Final Macro Processing ...
=====================================================================
====
Final Register Report
Found no macro
=====================================================================
====
=====================================================================
====
* Partition Report *
=====================================================================
====
Partition Implementation Status

No Partitions were found in this design.

=====================================================================
====
* Final Report *
=====================================================================
====
Final Results
RTL Top Level Output File Name : QSD_SINGLE_DIGIT_MULT.ngr
Top Level Output File Name : QSD_SINGLE_DIGIT_MULT
Output Format : NGC
Optimization Goal : Speed
Keep Hierarchy : NO
Design Statistics
# IOs : 12
Cell Usage :
# BELS : 23
# LUT2 : 1
# LUT3 : 1
# LUT4 : 14
# MUXF5 : 5
# MUXF6 : 2
# IO Buffers : 12
# IBUF : 6
# OBUF : 6
=====================================================================
====
Device utilization summary:

Selected Device : 2s15cs144-6
Number of Slices: 8 out of 192 4%
Number of 4 input LUTs: 16 out of 384 4%
Number of IOs: 12
Number of bonded IOBs: 12 out of 86 13%

Partition Resource Summary:

No Partitions were found in this design.

=====================================================================
====
TIMING REPORT
NOTE: THESE TIMING NUMBERS ARE ONLY A SYNTHESIS ESTIMATE.
FOR ACCURATE TIMING INFORMATION PLEASE REFER TO THE TRACE REPORT
GENERATED AFTER PLACE-and-ROUTE.
Clock Information:

No clock signals found in this design
Asynchronous Control Signals Information:

No asynchronous control signals found in this design
Timing Summary:

Speed Grade: -6
Minimum period: No path found
Minimum input arrival time before clock: No path found
Maximum output required time after clock: No path found
Maximum combinational path delay: 11.348ns
Timing Detail:

All values displayed in nanoseconds (ns)
=====================================================================
====
Timing constraint: Default path analysis
Total number of paths / destination ports: 71 / 6

Delay: 11.348ns (Levels of Logic = 5)
Source: a1 (PAD)
Destination: c1 (PAD)
Data Path: a1 to c1
                        Gate     Net
Cell:in->out    fanout  Delay    Delay   Logical Name (Net Name)
------------------------------------------------------------------
IBUF:I->O         13    0.776    2.250   a1_IBUF (a1_IBUF)
LUT4:I1->O         2    0.549    1.206   c2_SW0 (N21)
LUT3:I0->O         1    0.549    0.000   c1_F (N41)
MUXF5:I0->O        1    0.315    1.035   c1 (c1_OBUF)
OBUF:I->O               4.668            c1_OBUF (c1)
------------------------------------------------------------------
Total                   11.348ns (6.857ns logic, 4.491ns route)
                                 (60.4% logic, 39.6% route)
=====================================================================
====
CPU : 3.03 / 3.21 s  Elapsed : 3.00 / 3.00 s
>
Total memory usage is 162324 kilobytes
Number of errors : 0 ( 0 filtered)
Number of warnings : 0 ( 0 filtered)
Number of infos : 0 ( 0 filtered)
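The totals quoted in the two timing reports can be cross-checked by summing the per-cell gate and net delays along each critical path. The sketch below copies the figures from the "Data Path" tables of the QSD adder and QSD multiplier reports; the names QSD_ADDER_PATH, QSD_MULT_PATH and path_summary are illustrative only and are not part of the Xilinx tool flow.

```python
# Delay figures copied from the "Data Path" tables of the two XST reports.
# Each entry is (cell, gate_delay_ns, net_delay_ns); the output buffer has
# no net delay listed in the report, so 0.0 is used for it.
QSD_ADDER_PATH = [
    ("IBUF", 0.776, 1.980),
    ("LUT4", 0.549, 1.035),
    ("LUT3", 0.549, 1.035),
    ("LUT4", 0.549, 1.206),
    ("LUT4", 0.549, 1.035),
    ("OBUF", 4.668, 0.0),
]

QSD_MULT_PATH = [
    ("IBUF", 0.776, 2.250),
    ("LUT4", 0.549, 1.206),
    ("LUT3", 0.549, 0.000),
    ("MUXF5", 0.315, 1.035),
    ("OBUF", 4.668, 0.0),
]


def path_summary(path):
    """Return (logic_ns, route_ns, total_ns, logic_percent) for a data path."""
    logic = round(sum(gate for _, gate, _ in path), 3)
    route = round(sum(net for _, _, net in path), 3)
    total = round(logic + route, 3)
    logic_percent = round(100 * logic / total, 1)
    return logic, route, total, logic_percent


if __name__ == "__main__":
    for name, path in (("QSD adder", QSD_ADDER_PATH),
                       ("QSD multiplier", QSD_MULT_PATH)):
        logic, route, total, pct = path_summary(path)
        print(f"{name}: {total} ns total, {logic} ns logic, "
              f"{route} ns route, {pct}% logic, {len(path)} levels")
```

The number of entries in each list equals the "Levels of Logic" value given in the corresponding report (6 for the adder, 5 for the multiplier).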
REFERENCES
1) M. Morris Mano, Digital Design, 2nd edition, pp. 119-121.
2) Charles H. Roth and Lizy Kurian John, Principles of Digital System Design, pp. 66-69 and 186-190.
3) Iljoo Choo and R. G. Deshmukh, "A Novel Fast Parallel Signed-Digit Hybrid Multiplication Scheme for Digital Systems", 0-7803-5957-7/00, 2000 IEEE.
4) Krishna Raj and Suman Lata, "Fast Processing Using Signed Digit Number System", International Journal of Electronics Engineering, 2(1), 2010, pp. 173-175.
5) Dhananjay S. Phatak and Israel Koren, "Hybrid Signed-Digit Number Systems: A Unified Framework for Redundant Number Representations with Bounded Carry Propagation Chains", IEEE Transactions on Computers, Vol. 43, No. 8, pp. 880-891, August 1994.
6) S. Haykin, Adaptive Filter Theory, 3rd edition, pp. 231-240, Prentice Hall, 1996.
7) Paulo S. R. Diniz, Adaptive Filtering: Algorithms and Practical Implementation, Kluwer Academic Publishers, 1997.
1.1) FULL ADDER
1.1.1) Introduction
A full adder is a logic circuit that performs an addition operation on three binary digits. The full adder produces a sum and a carry value, which are both binary digits. It can be combined with other full adders (see below) or work on its own. [1]
A full adder adds binary numbers and accounts for values carried in as well as out. A one-bit full adder adds three one-bit numbers, often written as A, B, and Cin; A and B are the operands, and Cin is a bit carried in (in theory from a past addition).
Figure 1.1: Full adder
The delay through a digital circuit is measured in gate delays, as this allows the delay of a design to be calculated for different devices. AND and OR gates have a nominal delay of 1 gate delay, and XOR gates have a delay of 2, because they are really made up of a combination of ANDs and ORs. A full adder block has the following worst-case propagation delays:
From A or B to Cout: 4 gate delays (XOR → AND → OR)
From A or B to S: 4 gate delays (XOR → XOR)
From Cin to Cout: 2 gate delays (AND → OR)
From Cin to S: 2 gate delays (XOR)
The longest of these paths takes 4 gate delays, so the worst-case propagation delay of a 1-bit full adder is 4 gate delays, assuming that both the normal and complemented forms of the inputs are available.
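The path delays listed above follow directly from the nominal gate delays. As a minimal sketch (the names GATE_DELAY, PATHS, path_delay and full_adder are illustrative, not taken from the report), the following adds up the gates on each path and checks the full adder logic implied by the XOR → AND → OR carry path against ordinary addition:

```python
# Nominal gate delays as stated in the text: AND/OR = 1, XOR = 2.
GATE_DELAY = {"AND": 1, "OR": 1, "XOR": 2}

# Gates traversed on each input-to-output path of the full adder.
PATHS = {
    ("A/B", "Cout"): ["XOR", "AND", "OR"],
    ("A/B", "S"): ["XOR", "XOR"],
    ("Cin", "Cout"): ["AND", "OR"],
    ("Cin", "S"): ["XOR"],
}


def path_delay(gates):
    """Sum the nominal delays of the gates along one path."""
    return sum(GATE_DELAY[g] for g in gates)


def full_adder(a, b, cin):
    """One-bit full adder built from two XORs, two ANDs and one OR."""
    p = a ^ b                    # first XOR
    s = p ^ cin                  # second XOR gives the sum
    cout = (a & b) | (p & cin)   # AND-OR stage gives the carry-out
    return s, cout


if __name__ == "__main__":
    for (src, dst), gates in PATHS.items():
        print(f"{src} -> {dst}: {path_delay(gates)} gate delays")
    worst = max(path_delay(g) for g in PATHS.values())
    print("worst case:", worst, "gate delays")
```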
1.1.2) VHDL code of full adder
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity fullader is
    port( a, b, cin : in bit;
          sum, cout : out bit
    );
end fullader;

--}} End of automatically maintained section

architecture fullader of fullader is
begin
    -- sum and carry-out, each modelled with a 4 ns delay
    sum  <= a xor b xor cin after 4 ns;
    cout <= (a and b) or (b and cin) or (a and cin) after 4 ns;
end fullader;
1.2) RIPPLE CARRY ADDER
1.2.1) Introduction
It is possible to create a logical circuit using multiple full adders to add N-bit numbers. Each full adder inputs a Cin, which is the Cout of the previous adder. This kind of adder is a ripple carry adder, since each carry bit "ripples" to the next full adder. [2]
Figure 1.2: Ripple Carry Adder
Because the carry-out of one stage is the next stage's input, the worst-case propagation delay is then:
4 gate delays from generating the first carry signal (A0/B0 → C1).
2 gate delays per intermediate stage (Ci → Ci+1).
2 gate delays at the last stage to produce both the sum and carry-out outputs (Cn-1 → Cn and Sn-1).
So for an n-bit adder, we have a total propagation delay, tp, of:
tp = 4 + 2(n − 2) + 2 = 2n + 2 (1.1)
This is linear in n, and for a 32-bit number it would take 66 gate delays to complete the calculation. This is rather slow, and restricts the word length in our device somewhat. We would like to find ways to speed it up.
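Equation (1.1) and the rippling of the carry can be illustrated with a short behavioural model. This is only a sketch under the gate-delay assumptions stated above; the function names ripple_carry_delay and ripple_carry_add are hypothetical.

```python
def ripple_carry_delay(n: int) -> int:
    """Worst-case delay, in gate delays, of an n-bit ripple carry adder."""
    if n < 2:
        raise ValueError("the formula assumes at least two stages")
    first_carry = 4              # A0/B0 -> C1
    intermediate = 2 * (n - 2)   # Ci -> Ci+1 for each middle stage
    last_stage = 2               # Cn-1 -> Cn and Sn-1
    return first_carry + intermediate + last_stage


def ripple_carry_add(a: int, b: int, n: int, cin: int = 0):
    """Add two n-bit unsigned integers one bit at a time.

    Returns (sum, carry_out). Each stage needs the carry produced by the
    previous stage, which is why the delay grows linearly with n.
    """
    carry = cin
    total = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        total |= (ai ^ bi ^ carry) << i
        carry = (ai & bi) | (bi & carry) | (ai & carry)
    return total, carry


if __name__ == "__main__":
    for width in (4, 8, 16, 32):
        print(width, "bits:", ripple_carry_delay(width), "gate delays")
    print(ripple_carry_add(0b1011, 0b0110, 4))
```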
1.2.2) VHDL code of ripple carry adder
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity ripplecarry is
    port( a, b : in bit_vector(3 downto 0);
          ci   : in bit;
          s    : out bit_vector(3 downto 0);
          co   : out bit
    );
end ripplecarry;

--}} End of automatically maintained section

architecture ripplecarry of ripplecarry is
    component fullader
        port (a, b, cin : in bit;
              cout, sum : out bit);
    end component;
    -- internal carries between the four full adder stages
    signal c : bit_vector(3 downto 1);
begin
    fa0: fullader port map (a(0), b(0), ci,   c(1), s(0));
    fa1: fullader port map (a(1), b(1), c(1), c(2), s(1));
    fa2: fullader port map (a(2), b(2), c(2), c(3), s(2));
    fa3: fullader port map (a(3), b(3), c(3), co,   s(3));
end ripplecarry;
Figure 1.3: VHDL simulation of ripple carry adder
1.3) CARRY LOOK AHEAD ADDER [2]
1.3.1) Introduction
The generate function, $G_i$, indicates whether that stage produces a carry-out signal $C_{i+1}$ even if no carry-in signal exists. This occurs if both the addends contain a 1 in that bit:
$G_i = A_i \cdot B_i$ (1.2)
The propagate function, $P_i$, indicates whether a carry-in to the stage is passed to the carry-out for the stage. This occurs if either of the addends has a 1 in that bit:
$P_i = A_i + B_i$ (1.3)
Note that both these values can be calculated from the inputs in a constant time of a single gate delay. Now, the carry-out from a stage occurs if that stage generates a carry ($G_i = 1$) or there is a carry-in and the stage propagates the carry ($P_i \cdot C_i = 1$):
$C_{i+1} = A_i B_i + A_i C_i + B_i C_i$ (1.4)
$C_{i+1} = A_i B_i + (A_i + B_i) C_i$ (1.5)
$C_{i+1} = G_i + P_i C_i$ (1.6)
$C_{i+1} = G_i + P_i (G_{i-1} + P_{i-1} C_{i-1})$ (1.7)
$C_{i+1} = G_i + P_i G_{i-1} + P_i P_{i-1} (G_{i-2} + P_{i-2} C_{i-2})$ (1.8)
...
$C_{i+1} = G_i + P_i G_{i-1} + P_i P_{i-1} G_{i-2} + P_i P_{i-1} P_{i-2} G_{i-3} + \dots + P_i P_{i-1} \cdots P_1 G_0 + P_i P_{i-1} \cdots P_1 P_0 C_0$ (1.9)
Note that this does not require the carry-out signals from the previous stages, so we don't have to wait for changes to ripple through the circuit. In fact, a given stage's carry signal can be computed once the propagate and generate signals are ready with only two more gate delays (one AND and one OR). Thus the carry-out for a given stage can be calculated in constant time, and therefore so can the sum.

Operation                                      Required Data                 Gate Delays
Produce stage generate and propagate signals   Addends (a and b)             1
Produce stage carry-out signals, C1 to Cn      P and G signals, and C0       2
Produce sum result, S                          Carry signals and addends     3
Total                                                                        6

Figure 1.4: Carry Look Ahead adder
A basic carry-lookahead adder is very fast but has the disadvantage that it takes a very large amount of logic hardware to implement. In fact, the amount of hardware needed is approximately quadratic with n, and begins to get very complicated for n greater than 4. Due to this, most CLAs are constructed out of "blocks" comprising 4-bit CLAs, which are in turn cascaded to produce a larger CLA.
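The expanded carry expression (1.9) can be checked with a small behavioural model that computes every carry directly from the generate and propagate signals and C0, without using the carry of the previous stage. This is a sketch only; cla_carries and cla_add are hypothetical names, and the propagate signal is taken as A OR B as in equation (1.3).

```python
def cla_carries(a: int, b: int, n: int, c0: int = 0):
    """Return [C0, C1, ..., Cn] computed with the expanded form (1.9)."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(n)]  # generate
    p = [((a >> i) & 1) | ((b >> i) & 1) for i in range(n)]  # propagate
    carries = [c0]
    for i in range(n):
        carry = g[i]          # G_i
        chain = p[i]          # running product P_i P_{i-1} ...
        for j in range(i - 1, -1, -1):
            carry |= chain & g[j]   # + P_i ... P_{j+1} G_j
            chain &= p[j]
        carry |= chain & c0         # + P_i ... P_0 C_0
        carries.append(carry)
    return carries


def cla_add(a: int, b: int, n: int, c0: int = 0):
    """Add two n-bit unsigned integers using the look-ahead carries."""
    carries = cla_carries(a, b, n, c0)
    total = 0
    for i in range(n):
        total |= (((a >> i) & 1) ^ ((b >> i) & 1) ^ carries[i]) << i
    return total, carries[n]


if __name__ == "__main__":
    print(cla_carries(0b1011, 0b0110, 4))
    print(cla_add(0b1011, 0b0110, 4))
```

Each carry depends only on the inputs, which is what allows the hardware to evaluate all carries in parallel; the inner loop here merely enumerates the product terms of (1.9).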
1.3.2) VHDL code for carry look ahead adder
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity claadder is
    port (x, y : in bit_vector (3 downto 0);
          cin : in bit;
          s : out bit_vector (3 downto 0);
          cout, gout, pout : out bit);
end claadder;

architecture claadder of claadder is
    component gpfulladder
        port (a, b, cin : in bit;
              g, p, so : out bit);
    end component;
    component clalogic
        port (g, p : in bit_vector (3 downto 0);
              ci : in bit;
              c : out bit_vector (3 downto 1);
              co, go, po : out bit);
    end component;
    signal g, p : bit_vector (3 downto 0);
    signal c : bit_vector (3 downto 1);
begin
    carrylogic : clalogic port map (g, p, cin, c, cout, gout, pout);
    gpfa0 : gpfulladder port map (x(0), y(0), cin,  g(0), p(0), s(0));
    gpfa1 : gpfulladder port map (x(1), y(1), c(1), g(1), p(1), s(1));
    gpfa2 : gpfulladder port map (x(2), y(2), c(2), g(2), p(2), s(2));
    gpfa3 : gpfulladder port map (x(3), y(3), c(3), g(3), p(3), s(3));
end claadder;

library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity gpfulladder is
    port (a, b, cin : in bit;
          g, p, so : out bit);
end gpfulladder;

--}} End of automatically maintained section

architecture gpfulladder of gpfulladder is
    signal p_int : bit;
begin
    g <= a and b;           -- stage generate
    p_int <= a xor b;       -- stage propagate
    p <= p_int;
    so <= p_int xor cin;    -- stage sum
end gpfulladder;

library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity clalogic is
    port (g, p : in bit_vector (3 downto 0);
          ci : in bit;
          c : out bit_vector (3 downto 1);
          co, go, po : out bit);
end clalogic;

--}} End of automatically maintained section

architecture clalogic of clalogic is
    signal go_int, po_int : bit;
begin
    -- carries follow equation (1.9) expanded for each stage
    c(1) <= g(0) or (p(0) and ci);
    c(2) <= g(1) or (p(1) and g(0)) or (p(1) and p(0) and ci);
    c(3) <= g(2) or (p(2) and g(1)) or (p(2) and p(1) and g(0))
            or (p(2) and p(1) and p(0) and ci);
    -- block generate and block propagate
    go_int <= g(3) or (p(3) and g(2)) or (p(3) and p(2) and g(1))
              or (p(3) and p(2) and p(1) and g(0));
    po_int <= p(3) and p(2) and p(1) and p(0);
    co <= go_int or (po_int and ci);
    po <= po_int;
    go <= go_int;
end clalogic;
Figure 1.5: VHDL simulation of Carry Look Ahead adder
1.4) REDUNDANT BINARY SIGNED ADDER [3]
1.4.1) Introduction
In such a system, a "carry-free" addition can be performed, where the term "carry-free" in this context means that the carry propagation is limited to a single digit position. In other words, the carry propagation length is fixed irrespective of the word length. The addition consists of two steps. In the first step, an intermediate sum $s_i$ and a carry $c_i$ are generated, based on the operand digits $x_i$ and $y_i$ at each digit position i. This is done in parallel for all digit positions. In the second step, the summation $z_i = s_i + c_{i-1}$ is carried out to produce the final sum digit $z_i$. The important point is that it is always possible to select the intermediate sum $s_i$ and carry $c_{i-1}$ such that the summation in the second step does not generate a carry. Hence, the second step can also be executed in parallel for all the digit positions, yielding a fixed addition time, independent of the word length.
Figure 1.6 shows an example for an 8-bit redundant binary addition. In the figure, X and Y are n-digit redundant binary integers, ISum and ICin are the intermediate sum and carry-in, and the Final Sum (FSum) is obtained by adding ISum and ICin. Note that there is no carry generation in the addition of ISum and ICin, which satisfies the carry-free condition, and the LSB of ICin is set to logic zero.
(1.10)
Figure 1.6: Signed addition
Figure 1.7: Signed adder cell [4]
If the delay of a NAND or NOR gate is taken as $t_o$, the delay of the circuit becomes
$T_{delay} = t_o + 2t_o + 2t_o + t_o + t_o = 7t_o$ (1.11)
1.4.2) Rules For Redundant Binary Addition

Type   Augend, addend digits   Digits at the next lower      Intermediate   Intermediate
       (xi, yi)                position (xi-1, yi-1)         carry (ci)     sum (si)
1      (1, 1)                  -                              1              0
2      (1, 0) or (0, 1)        Both are nonnegative           1             -1
                               Otherwise                      0              1
3      (0, 0)                  -                              0              0
4      (1, -1) or (-1, 1)      -                              0              0
5      (0, -1) or (-1, 0)      Both are nonnegative           0             -1
                               Otherwise                     -1              1
6      (-1, -1)                -                             -1              0

Figure 1.8: Rules table for intermediate carry and intermediate sum
The addition of two signed digits takes place in two steps. In the first step the intermediate carry and intermediate sum are written using the above table; in the second step the intermediate sum and intermediate carry are added to obtain the final sum. The above table is designed such that the addition of the intermediate sum digit and the intermediate carry digit does not produce a carry.
Figure 1.9: Steps of RBSD addition
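The two-step procedure and the rules table can be exercised with a short behavioural model. The following is a minimal sketch, assuming digits are stored least significant first as integers in {-1, 0, 1} and that the digits below position 0 are treated as zero; the names rbsd_step1, rbsd_add, rbsd_value and check_all are illustrative only.

```python
from itertools import product


def rbsd_step1(x, y, x_low, y_low):
    """Return (carry, sum) for one digit position using the rules table."""
    total = x + y
    low_nonneg = x_low >= 0 and y_low >= 0
    if total == 2:                       # type 1
        return 1, 0
    if total == 1:                       # type 2
        return (1, -1) if low_nonneg else (0, 1)
    if total == 0:                       # types 3 and 4
        return 0, 0
    if total == -1:                      # type 5
        return (0, -1) if low_nonneg else (-1, 1)
    return -1, 0                         # type 6


def rbsd_add(xs, ys):
    """Carry-free addition of two equal-length RBSD numbers (LSB first)."""
    n = len(xs)
    carries, sums = [], []
    for i in range(n):                   # step 1: all positions independent
        x_low = xs[i - 1] if i > 0 else 0
        y_low = ys[i - 1] if i > 0 else 0
        c, s = rbsd_step1(xs[i], ys[i], x_low, y_low)
        carries.append(c)
        sums.append(s)
    # step 2: z_i = s_i + c_{i-1}; the top carry becomes the extra digit
    z = [sums[i] + (carries[i - 1] if i > 0 else 0) for i in range(n)]
    z.append(carries[n - 1])
    return z


def rbsd_value(digits):
    """Integer value of an RBSD digit list (LSB first, radix 2)."""
    return sum(d * 2 ** i for i, d in enumerate(digits))


def check_all(n):
    """Exhaustively confirm that step 2 never leaves the digit set."""
    for xs in product((-1, 0, 1), repeat=n):
        for ys in product((-1, 0, 1), repeat=n):
            z = rbsd_add(xs, ys)
            if any(d not in (-1, 0, 1) for d in z):
                return False
            if rbsd_value(z) != rbsd_value(xs) + rbsd_value(ys):
                return False
    return True


if __name__ == "__main__":
    print(rbsd_add([1, 0, -1, 1], [1, 1, 0, -1]))
    print(check_all(4))
```

Because every final digit stays within {-1, 0, 1}, no second carry is ever needed, which is the property the rules table is designed to guarantee.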
1.4.3) VHDL code of RBSD adder
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity rbsdadder is
    port (a, b, c, d, e, f, g, h : in bit;
          c2, c1, s2, s1 : out bit);
end rbsdadder;

--}} End of automatically maintained section

architecture rbsdadder of rbsdadder is
begin
    c2 <= (e and f and ((a and (not d)) or ((not b) and c)))
          or (a and f and ((b and c) or ((not d) and g)))
          or (g and (((not b) and c and f) or (a and (not d) and (not f))))
          or (c and (((not b) and (not f) and g) or (a and b and (not h))))
          or (a and b and c and (not f));

    c1 <= (e and f and ((a and (not d)) or ((not b) and c)))
          or (a and f and ((b and c) or ((not d) and g)))
          or (g and (((not b) and c and f) or (a and (not d) and (not f))))
          or (c and (((not b) and (not f) and g) or (a and b and (not h))))
          or (b and ((a and c and (not f)) or ((not a) and (not c) and d and f)))
          or ((not a) and b and (not c) and (not h) and (d or (not e)))
          or ((not a) and b and (not c) and (not f) and (d or (not g)))
          or ((not b) and (not c) and d and (((not e) and (not h)) or ((not f) and (not g))))
          or ((not e) and f and (not g) and (((not a) and b and (not d)) or ((not b) and (not c) and d)));

    s2 <= (b and (not d) and (((not e) and (not h)) or ((not f) and (not g))))
          or ((not b) and d and (((not e) and (not h)) or ((not f) and (not g))))
          or ((not e) and f and (not g) and (b xor d));

    s1 <= (f and (b xor d))
          or (b and (not d) and ((not h) or (not f)))
          or ((not b) and d and ((not h) or (not f)));
end rbsdadder;
Figure 1.10: Waveform of RBSD adder cell
Figure 1.11: VHDL simulation of RBSD adder cell
1.5) HYBRID SIGNED DIGIT ADDER [5]
1.5.1) Introduction
Here, instead of insisting that every digit be a signed digit, we let some of the digits be signed and leave the others unsigned. For example, every alternate or every third or fourth digit can be signed; all the remaining ones are unsigned. We refer to this representation as a Hybrid Signed-Digit (HSD) representation. In the following, we show that such a representation can limit the maximum length of carry propagation chains to any desired value. In particular, we prove that the maximum length of a carry propagation chain equals (d + 1), where d is the longest distance between neighboring signed digits.
Figure 1.12: Signed and unsigned adder cell (unsigned digit position, signed digit position) [5]
In HSD for d = 1 (the distance between signed digit positions) the delay is,
(1.12)
Here, the two delays of 1.5 units in parentheses are due to the two complex gates in the lower order signed digit cell. The terms in between are proportional to d since the carry ripples through all the unsigned digit positions. The last 1.5 units of delay (shown within the square brackets) is associated with the XNOR gate at the higher order signed digit where the carry propagation terminates.
Figure 1.13: Critical path delay vs. distance between signed digits [5]
Figure 1.14: Transistor count vs. distance between signed digits [5]
Figure 1.15: Transistor count * Delay vs. distance between signed digits [5]
architecture unsigned_new of unsigned_new is component nor2 port( a : in STD_LOGIC. vi_1 : out STD_LOGIC.1. wi_1 : out STD_LOGIC. bi_1 : in STD_LOGIC.all. vi_2 : in STD_LOGIC. b : in STD_LOGIC. use IEEE.STD_LOGIC_1164. wi_2 : in STD_LOGIC.5. end unsigned_new. 20 . entity unsigned_new is port( ai_1 : in STD_LOGIC. ei_1 : out STD_LOGIC ).2) VHDL code of unsigned position adder cell library IEEE. y : out STD_LOGIC ).
end component. component not1 port( a : in STD_LOGIC. b : in STD_LOGIC. component and2 port( a : in STD_LOGIC. y : out STD_LOGIC ). end component. b : in STD_LOGIC. y : out STD_LOGIC ). y : out STD_LOGIC ). end component. component xor2 21 . end component. component xnor2 port( a : in STD_LOGIC.
N5: and2 port map (s2.s3).s5). end component. N6: or2 port map (s3. end component.vi_2. begin N1: not1 port map (wi_2.s1). N2: xnor2 port map (ai_1. 22 .s2. N4: or2 port map (s1.s1. component or2 port( a : in STD_LOGIC.s7: STD_LOGIC. signal s1. N3: and2 port map (vi_2. y : out STD_LOGIC ). b : in STD_LOGIC.s2).s3.bi_1.vi_1).s4.s5.s4.s6.s5.port( a : in STD_LOGIC. y : out STD_LOGIC ).s4). b : in STD_LOGIC.
end unsigned_new. N10: xor2 port map (s7. N8: xor2 port map (vi_2. N9: xor2 port map (ai_1.N7: nor2 port map (ai_1.s6.bi_1.16: Waveform of HSD unsigned position adder cell Figure 1.s7).wi_2.ei_1).wi_1). Figure 1.17: VHDL simulation of HSD unsigned position adder cell 23 .s6).bi_1.
vi : out STD_LOGIC. yis_c : in STD_LOGIC. use IEEE. }} End of automatically maintained section 24 .all. entity signedaddercell is port( xis_c : in STD_LOGIC.1. zia : out STD_LOGIC. zis_c : out STD_LOGIC ). wi_1 : in STD_LOGIC.5. vi_1_c : in STD_LOGIC.STD_LOGIC_1164.3) VHDL code of signed position adder cell library IEEE. xia : in STD_LOGIC. yia : in STD_LOGIC. end signedaddercell. wi : out STD_LOGIC.
component and2 port( a : in STD_LOGIC. component xnor2 port( a : in STD_LOGIC. end component. y : out STD_LOGIC ). b : in STD_LOGIC.architecture signedaddercell of signedaddercell is component nor2 port( a : in STD_LOGIC. end component. y : out STD_LOGIC ). 25 . b : in STD_LOGIC. y : out STD_LOGIC ). b : in STD_LOGIC.
y : out STD_LOGIC ). end component. b : in STD_LOGIC. component nor3 port( a : in STD_LOGIC. b : in STD_LOGIC. y : out STD_LOGIC 26 . component xor2 port( a : in STD_LOGIC.end component. c : in STD_LOGIC. y : out STD_LOGIC ). component nand2 port( a : in STD_LOGIC. b : in STD_LOGIC. end component.
N8: xnor2 port map (vi_1_c. signal s1.s3. N2: nor2 port map (xis_c.s3. N5: and2 port map (s3.s1). N4: xor2 port map (xia. .s5.yia.wi_1.zia).vi).enter your statements here  end signedaddercell.yia.s2.vi_1_c.yis_c.zis_c). end component.yis_c. N6: xor2 port map (wi_1. 27 .).s6: STD_LOGIC.s5. N3: nor2 port map (xia.s3).wi).s2.s6).s2).s6. N7: nand2 port map (s6. N9: nor3 port map (s1.s5). begin N1: nand2 port map (xis_c.s4.
Figure 1.18: Waveform of HSD signed position adder cell
Figure 1.19: VHDL simulation of HSD signed position adder cell
1.6) QUATERNARY SIGNED DIGIT NUMBERS
1.6.1) Introduction
QSD numbers are represented using 3-bit 2's complement notation. Each number can be represented by:
$D = \sum_{i=0}^{n-1} x_i 4^i$ (1.13)
where $x_i$ can be any value from the set {-3, -2, -1, 0, 1, 2, 3} for producing an appropriate decimal representation. High-speed and area-effective adders and multipliers can be implemented using this technique. For digital implementation, a large number of digits such as 64, 128, or more can be handled with constant delay.
1.6.2) QSD ADDER
We can achieve carry-free addition by exploiting the redundancy of QSD numbers and the QSD addition. The redundancy allows multiple representations of any integer quantity. There are two steps involved in the carry-free addition. The first step generates an intermediate carry and sum from the addend and augend. The second step combines the intermediate sum of the current digit with the carry of the lower significant digit. To prevent the carry from rippling further, we define two rules.
1) The first rule states that the magnitude of the intermediate sum must be less than or equal to 2, i.e. the intermediate sum lies between -2 and 2.
2) The second rule states that the magnitude of the carry must be less than or equal to 1, i.e. the carry lies between -1 and 1.
Consequently, the magnitude of the second step output cannot be greater than 3, which can be represented by a single-digit QSD number; hence no further carry is required. In step 1, all possible input pairs of the addend and augend are considered. The output ranges from -6 to 6, as shown in Figure 1.20.
Figure 1.20: QSD representation
Both inputs and outputs can be encoded as 3-bit 2's-complement binary numbers. The mapping between the inputs, addend and augend, and the outputs, the intermediate carry and sum, is shown in binary format in Table 1.1. Since the intermediate carry always lies between -1 and 1, it requires only a 2-bit binary representation. Finally, five 6-variable Boolean expressions can be extracted. The intermediate carry and sum circuit is shown in Figure 1.21.
Figure 1.21: The intermediate carry and sum generator
Figure 1.22: The second step QSD adder
In step 2, the intermediate carry from the lower significant digit is added to the sum of the current digit to produce the final result. The addition in this step produces no carry because the current digit can always absorb the carry-in from the lower digit.
Figure 1.23: N-digit QSD adder
By using N cells in parallel we can build an N-digit adder. The delay of this N-digit adder is constant and equal to the delay of a single-digit adder.
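The two rules of the QSD adder can be illustrated with a behavioural model. The sketch below assumes digits are stored least significant first as integers in -3..3 and uses one possible split of the digit sum into carry and intermediate sum that satisfies both rules; the exact assignment in Table 1.1 may differ for sums that admit more than one valid split. The names qsd_step1, qsd_add, qsd_value and check_all are hypothetical.

```python
from itertools import product

DIGITS = range(-3, 4)   # the QSD digit set {-3, ..., 3}


def qsd_step1(a, b):
    """Split a + b (range -6..6) into (carry, sum) with |carry| <= 1, |sum| <= 2."""
    t = a + b
    if t >= 3:
        c = 1
    elif t <= -3:
        c = -1
    else:
        c = 0
    return c, t - 4 * c


def qsd_add(xs, ys):
    """Carry-free addition of two equal-length QSD numbers (LSB first)."""
    n = len(xs)
    step1 = [qsd_step1(x, y) for x, y in zip(xs, ys)]   # all digits in parallel
    z = []
    for i in range(n):
        carry_in = step1[i - 1][0] if i > 0 else 0
        z.append(step1[i][1] + carry_in)                # step 2, no new carry
    z.append(step1[n - 1][0])                           # carry out of the top digit
    return z


def qsd_value(digits):
    """Integer value per equation (1.13), digits given LSB first."""
    return sum(d * 4 ** i for i, d in enumerate(digits))


def check_all(n):
    """Exhaustively confirm the result digits stay within -3..3."""
    for xs in product(DIGITS, repeat=n):
        for ys in product(DIGITS, repeat=n):
            z = qsd_add(xs, ys)
            if any(abs(d) > 3 for d in z):
                return False
            if qsd_value(z) != qsd_value(xs) + qsd_value(ys):
                return False
    return True


if __name__ == "__main__":
    x, y = [3, -1, 2], [2, 3, -3]
    z = qsd_add(x, y)
    print(z, qsd_value(x), qsd_value(y), qsd_value(z))
```

Since step 1 of each digit looks only at its own operand pair and step 2 only at its neighbour's carry, the work per digit does not depend on the number of digits, which matches the constant-delay claim above.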
Table 1.1: The mapping between the inputs and outputs of the intermediate carry and sum
1.6.3) VHDL code of QSD adder
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity qsdadder is
    port (a0, a1, a2, b0, b1, b2 : in bit;
          c0, c1, s0, s1, s2 : out bit);
end qsdadder;

architecture qsdadder of qsdadder is
begin
    -- intermediate carry (2 bits)
    c1 <= (a2 and b2 and (not b1)) or (a2 and (not a1) and b2)
          or (a2 and b2 and (not b0)) or (a2 and (not a0) and b2)
          or (b2 and (not a1) and (not a0) and (not b1))
          or (a2 and (not a1) and (not b1) and (not b0));

    c0 <= (a2 and b2 and (not b1)) or (a2 and (not a1) and b2)
          or (a2 and b2 and (not b0)) or (a2 and (not a0) and b2)
          or ((not a1) and (not a0) and b2 and (not b1))
          or ((not a2) and a1 and (not b2) and b1)
          or ((not a2) and a0 and (not b2) and b1)
          or ((not a2) and (not b2) and b1 and b0)
          or ((not a2) and a1 and (not b2) and b0)
          or (a2 and (not a1) and (not b1) and (not b0))
          or ((not a2) and a1 and a0 and (not b2));

    -- intermediate sum (3 bits)
    s2 <= ((not a1) and b2 and b0) or (a2 and (not a0) and (not b1))
          or ((not a1) and a0 and b2 and (not b1))
          or ((not a1) and (not a0) and b2 and b1)
          or ((not a1) and a0 and b1 and (not b0))
          or ((not a1) and (not a0) and b1 and b0)
          or (a2 and (not a1) and (not b1) and b0)
          or (a1 and (not a0) and (not b1) and b0)
          or (a2 and a1 and (not b1) and (not b0))
          or (a1 and a0 and (not b1) and (not b0))
          or (a2 and a1 and a0 and b2 and b1 and b0);

    s1 <= ((not a1) and b1 and (not b0)) or ((not a1) and (not a0) and b1)
          or (a1 and (not a0) and (not b1)) or (a1 and (not b1) and (not b0))
          or (a1 and a0 and b1 and b0)
          or ((not a1) and a0 and (not b1) and b0);

    s0 <= (a0 and (not b0)) or ((not a0) and b1 and b0)
          or ((not a2) and (not a0) and b0)
          or ((not a0) and (not b2) and b0);
end qsdadder;
Figure 1.24: Waveform of QSD adder cell
Figure 1.25: VHDL simulation of QSD adder cell
Synthesis in Xilinx gives a maximum combinational path delay of 13.931 ns for the QSD adder.
1.6.4) Single Digit QSD Multiplier
There are generally two methods for a multiplication operation: parallel and iterative. QSD multiplication can be implemented in both ways, requiring a QSD partial product generator and a QSD adder as basic components. A partial product Mi is the result of multiplying an n-digit input, An-1 ... A0, by a single-digit input Bi, where i = 0 ... n-1. The primitive component of the partial product generator is a single-digit multiplication unit. The single-digit multiplication produces M as a result and C as a carry to be combined with M of the next digit. The output ranges from -9 to 9, which can be represented with M and C in QSD form. The values of M and C should lie between -2 and 2. The mapping between the inputs A (multiplicand) and B (multiplier) and the outputs M and C is shown in Table 1.2.

           INPUT                              OUTPUT
   QSD        Binary      Decimal      QSD        Binary
  A    B     A     B      Product     C    M     C     M
  3    3    011   011        9        2    1    010   001
 -3   -3    101   101        9        2    1    010   001
  3    2    011   010        6        1    2    001   010
  2    3    010   011        6        1    2    001   010
 -3   -2    101   110        6        1    2    001   010
 -2   -3    110   101        6        1    2    001   010
  2    2    010   010        4        1    0    001   000
 -2   -2    110   110        4        1    0    001   000
  3    1    011   001        3        1   -1    001   111
 -3   -1    101   111        3        1   -1    001   111
  1    3    001   011        3        1   -1    001   111
 -1   -3    111   101        3        1   -1    001   111
  2    1    010   001        2        0    2    000   010
 -2   -1    110   111        2        0    2    000   010
  1    2    001   010        2        0    2    000   010
 -1   -2    111   110        2        0    2    000   010
  1    1    001   001        1        0    1    000   001
 -1   -1    111   111        1        0    1    000   001
  3    0    011   000        0        0    0    000   000
  2    0    010   000        0        0    0    000   000
  1    0    001   000        0        0    0    000   000
  0    1    000   001        0        0    0    000   000
  0    2    000   010        0        0    0    000   000
  0    3    000   011        0        0    0    000   000
  0    0    000   000        0        0    0    000   000
 -3    0    101   000        0        0    0    000   000
 -2    0    110   000        0        0    0    000   000
 -1    0    111   000        0        0    0    000   000
  0   -1    000   111        0        0    0    000   000
  0   -2    000   110        0        0    0    000   000
  0   -3    000   101        0        0    0    000   000
  1   -1    001   111       -1        0   -1    000   111
 -1    1    111   001       -1        0   -1    000   111
  2   -1    010   111       -2        0   -2    000   110
 -1    2    111   010       -2        0   -2    000   110
  1   -2    001   110       -2        0   -2    000   110
 -2    1    110   001       -2        0   -2    000   110
  3   -1    011   111       -3       -1    1    111   001
 -1    3    111   011       -3       -1    1    111   001
 -3    1    101   001       -3       -1    1    111   001
  1   -3    001   101       -3       -1    1    111   001
  2   -2    010   110       -4       -1    0    111   000
 -2    2    110   010       -4       -1    0    111   000
  3   -2    011   110       -6       -1   -2    111   110
 -2    3    110   011       -6       -1   -2    111   110
 -3    2    101   010       -6       -1   -2    111   110
  2   -3    010   101       -6       -1   -2    111   110
  3   -3    011   101       -9       -2   -1    110   111
 -3    3    101   011       -9       -2   -1    110   111

Table 1.2: The mapping between multiplicand and multiplier
1.6.5) VHDL code for single digit multiplier
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity QSD_SINGLE_DIGIT_MULT is
    port( a2 : in STD_LOGIC;
          a1 : in STD_LOGIC;
          a0 : in STD_LOGIC;
          b2 : in STD_LOGIC;
          b1 : in STD_LOGIC;
          b0 : in STD_LOGIC;
          c2 : inout STD_LOGIC;
          c1 : inout STD_LOGIC;
          c0 : inout STD_LOGIC;
          m2 : inout STD_LOGIC;
          m1 : inout STD_LOGIC;
          m0 : inout STD_LOGIC );
38 . . m1<= (a0 and b1 and(a1 nand b0)) or (a1 and b0 and(b1 nand a0)). c1<= c2 or ( a2 and(not a1)and b2 and(not b1)) or (a1 and a0 and (not b2)and b1 and b0). m2<= (a2 and b1 and(a1 nor b2)) or (a1 and b2 and(a2 nor b1)) or (a1 and a0 and b2 and(not b1)) or(a2 and(not a1)and b1 and b0) or (a0 and b2 and(a2 nor b0)) or (a0 and b0 and(a1 xor b1)) or (a2 and b0 and(a0 nor b2)) or (a2 and a0 and b1 and(b2 nor b0)) or (a1 and b2 and b0 and(a2 nor a0)).end QSD_SINGLE_DIGIT_MULT. m0<= a0 and b0. c0<= (a1 and(a0 nor b2)and b1) or (a1 and b1 and(a2 nor b0)) or (a2 and b2 and(a1 xor b1)) or ( a2 and b2 and(a0 nor b0)) or (a2 and b1 and(a1 nor b0)) or (a1 and b2 and(a0 nor b1)) or ((a1 nor b1)and a2 and(not b2)and b0) or ((a1 nor b1)and(not a2)and a0 and b2) or (a2 and a1 and(not b2)and b1 and b0) or ((not a2)and a1 and a0 and b2 and b1) or ((a2 nor b2)and a0 and b0 and(a1 xor b1)).enter your statements here – end QSD_SINGLE_DIGIT_MULT. }} End of automatically maintained section architecture QSD_SINGLE_DIGIT_MULT of QSD_SINGLE_DIGIT_MULT is begin c2<= (a2 and(not b2)and b0 and((not b1)nand a1)) or (a2 and(not b2)and b1 and(a1 nand a0)) or ((not a2)and a0 and b2 and ((not a1)nand b1)) or ((not a2) and a1 and b2 and(b1 nand b0)).
Figure 1.26: Single digit QSD multiplier
On simulation of the QSD single digit multiplier in Xilinx we get a delay of 11.348ns.
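The mapping in Table 1.2 can be cross-checked with a small software model. The following Python sketch is not part of the VHDL design; the function names `qsd_digit_multiply` and `to_bits` are illustrative, and the rule used (split the product as 4C + M, replacing a remainder of 3 by -1 with one extra carry) is inferred from the rows of the table.

```python
def qsd_digit_multiply(a, b):
    """Multiply two QSD digits (each in -3..3); return (c, m) with a*b == 4*c + m."""
    if not (-3 <= a <= 3 and -3 <= b <= 3):
        raise ValueError("QSD digits must lie in -3..3")
    product = a * b
    c, m = divmod(abs(product), 4)
    if m == 3:
        # 3 is outside the allowed range -2..2, so write it as 4 - 1.
        c, m = c + 1, -1
    if product < 0:
        # Negative products mirror the positive rows of the table.
        c, m = -c, -m
    return c, m


def to_bits(digit):
    """Encode a digit in -4..3 as a 3-bit two's-complement string."""
    return format(digit & 0b111, "03b")


if __name__ == "__main__":
    for a, b in [(3, 3), (3, -2), (-1, 3)]:
        c, m = qsd_digit_multiply(a, b)
        print(a, b, a * b, c, m, to_bits(c), to_bits(m))
```

Running the loop over all 49 digit pairs is a quick way to compare a simulated waveform of the multiplier against the expected C and M values.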
1.7) COMPARATIVE RESULT OF DIFFERENT ADDERS
Figure 1.27: Delay vs. number of bits (2 bit, 4 bit, 8 bit) for addition for different adding schemes (ripple carry addition, carry look ahead addition, redundant binary signed digit addition, hybrid addition, quaternary signed digit addition)
Figure 1.28: Complexity vs. number of bits for addition of different adding schemes
CHAPTER 2
ADAPTIVE FILTER [6]
2.1) INTRODUCTION
An adaptive filter is a filter that self-adjusts its transfer function according to an optimization algorithm driven by an error signal. Because of the complexity of the optimization algorithms, most adaptive filters are digital filters. By way of contrast, a non-adaptive filter has a static transfer function. Adaptive filters are required for some applications because some parameters of the desired processing operation (for instance, the locations of reflective surfaces in a reverberant space) are not known in advance. The adaptive filter uses feedback in the form of an error signal to refine its transfer function to match the changing parameters.
Generally speaking, the adaptive process involves the use of a cost function, which is a criterion for optimum performance of the filter, to feed an algorithm, which determines how to modify filter transfer function to minimize the cost on the next iteration.
As the power of digital signal processors has increased, adaptive filters have become much more common and are now routinely used in devices such as mobile phones and other communication devices, camcorders and digital cameras, and medical monitoring equipment.
The block diagram, shown in the following figure, serves as a foundation for particular adaptive filter realizations, such as Least Mean Squares (LMS) and Recursive Least Squares (RLS). The idea behind the block diagram is that a variable filter extracts an estimate of the desired signal.
Figure 2.1: Adaptive filter
To start the discussion of the block diagram we take the following assumptions:
* The input signal is the sum of a desired signal d(n) and interfering noise v(n)
    $x(n) = d(n) + v(n)$    (2.1)
* The variable filter has a Finite Impulse Response (FIR) structure. For such structures the impulse response is equal to the filter coefficients. The coefficients for a filter of order p are defined as
    $w_n = [w_n(0), w_n(1), \ldots, w_n(p)]^T$    (2.2)
* The error signal or cost function is the difference between the desired and the estimated signal
    $e(n) = d(n) - \hat{d}(n)$    (2.3)
The variable filter estimates the desired signal by convolving the input signal with the impulse response. In vector notation this is expressed as
    $\hat{d}(n) = w_n * x(n)$    (2.4)
where
    $x(n) = [x(n), x(n-1), \ldots, x(n-p)]^T$    (2.5)
is an input signal vector. Moreover, the variable filter updates the filter coefficients at every time instant
    $w_{n+1} = w_n + \Delta w_n$    (2.6)
where $\Delta w_n$ is a correction factor for the filter coefficients. The adaptive algorithm generates this correction factor based on the input and error signals. LMS and RLS define two different coefficient update algorithms.
2.2) LEAST MEAN SQUARE ADAPTIVE FILTER [6]
2.2.1) Introduction
Adaptive algorithms are a mainstay of Digital Signal Processing (DSP). They are used in a variety of applications including acoustic echo cancellation, radar guidance systems, and wireless channel estimation, among many others. An adaptive algorithm is used to estimate a time varying signal. There are many adaptive algorithms such as Recursive Least Square (RLS) and Kalman filters, but the most commonly used is the Least Mean Square (LMS) algorithm. It is a simple but powerful algorithm that can be implemented to take advantage of Lattice FPGA architectures. Developed by Widrow and Hoff, the algorithm uses a gradient descent to estimate a time varying signal. The gradient descent method finds a minimum, if it exists, by taking steps in the direction negative of the gradient. It does so by adjusting the filter coefficients to minimize the error.
The LMS reference design consists of two main functional blocks, a FIR filter and the LMS algorithm. The FIR filter is implemented serially using a multiplier and an adder with feedback. The FIR result is normalized to minimize saturation. The LMS algorithm iteratively updates the coefficient and feeds it to the FIR filter. The FIR filter then uses the updated coefficient along with the input reference signal x(n) to generate the output y(n). The output y(n) is then subtracted from the desired signal d(n) to generate an error, which is used by the LMS algorithm to compute the next set of coefficients.
Figure 2.2 is a block diagram of system identification using adaptive filtering. The objective is to change (adapt) the coefficients of an FIR filter, W, to match as closely as possible the response of an unknown system, H. The unknown system and the adapting filter process the same input signal x[n] and have outputs d[n] (also referred to as the desired signal) and y[n].
Figure 2.2: Least Mean Square adaptive filter
2.2.2) GRADIENT-DESCENT ADAPTATION [6]
The adaptive filter, W, is adapted using the least mean-square algorithm, which is the most widely used adaptive filtering algorithm. First the error signal, e[n], is computed as e[n] = d[n] - y[n], which measures the difference between the output of the adaptive filter and the output of the unknown system. On the basis of this measure, the adaptive filter will change its coefficients in an attempt to reduce the error. The coefficient update relation is a function of the error signal squared and is given by
    $h_{n+1}[i] = h_n[i] + \frac{\mu}{2}\left(-\frac{\partial (|e|)^2}{\partial h_n[i]}\right)$    (2.7)
The term inside the parentheses represents the gradient of the squared-error with respect to the ith coefficient. The gradient is a vector pointing in the direction of the change in filter coefficients that will cause the greatest increase in the error signal. Because the goal is to minimize the error, however, Equation (2.7) updates the filter coefficients in the direction opposite the gradient; that is why the gradient term is negated. The constant μ is a step-size, which controls the amount of gradient information used to update each coefficient. After repeatedly adjusting each coefficient in the direction opposite to the gradient of the error, the adaptive filter should converge; that is, the difference between the unknown and adaptive systems should get smaller and smaller.
To express the gradient descent coefficient update equation in a more usable manner, we can rewrite the derivative of the squared-error term as
    $\frac{\partial (|e|)^2}{\partial h[i]} = 2\,\frac{\partial e}{\partial h[i]}\,e$    (2.8)
Or,
    $\frac{\partial (|e|)^2}{\partial h[i]} = 2\,\frac{\partial (d - y)}{\partial h[i]}\,e$    (2.9)
    $\frac{\partial (|e|)^2}{\partial h[i]} = 2\,\frac{\partial \left(d - \sum_{i=0}^{N-1} h[i]\,x[n-i]\right)}{\partial h[i]}\,e$    (2.10)
    $\frac{\partial (|e|)^2}{\partial h[i]} = 2\,(-x[n-i])\,e$    (2.11)
which in turn gives us the final LMS coefficient update,
    $h_{n+1}[i] = h_n[i] + \mu\,e\,x[n-i]$    (2.12)
The step-size μ directly affects how quickly the adaptive filter will converge toward the unknown system. If μ is very small, then the coefficients change only a small amount at each update, and the filter converges slowly. With a larger step-size, more gradient information is included in each update, and the filter converges more quickly; however, when the step-size is too large, the coefficients may change too quickly and the filter will diverge. (It is possible in some cases to determine analytically the largest value of μ ensuring convergence.)
2.2.3) CONVERGENCE AND STABILITY [6]
Assume that the true filter H(n) = H is constant, and that the input signal x(n) is wide-sense stationary. Then E{W(n)} converges to H as n→∞ if and only if
    $0 < \mu < \frac{2}{\lambda_{\max}}$    (2.13)
where λmax is the greatest eigenvalue of the autocorrelation matrix. If this condition is not fulfilled, the algorithm becomes unstable and W(n) diverges.
Maximum convergence speed is achieved when
    $\mu = \frac{2}{\lambda_{\max} + \lambda_{\min}}$    (2.14)
where λmin is the smallest eigenvalue of the autocorrelation matrix. Given that μ is less than or equal to this optimum, the convergence speed is determined by $\mu\lambda_{\min}$, with a larger value yielding faster convergence. This means that faster convergence can be achieved when λmax is close to λmin, that is, the maximum achievable convergence speed depends on the eigenvalue spread of the autocorrelation matrix.
A white noise signal has autocorrelation matrix R = σ²I, where σ² is the variance of the signal. In this case all eigenvalues are equal, and the eigenvalue spread is the minimum over all possible matrices. The common interpretation of this result is therefore that the LMS converges quickly for white input signals, and slowly for colored input signals, such as processes with low-pass or high-pass characteristics.
It is important to note that the above upper bound on μ only enforces stability in the mean, but the coefficients of W(n) can still grow infinitely large, i.e. divergence of the coefficients is still possible. A more practical bound is
    $0 < \mu < \frac{2}{\mathrm{tr}[R]}$    (2.15)
where tr[R] denotes the trace of the autocorrelation matrix. This bound guarantees that the coefficients of W(n) do not diverge (in practice, the value of μ should not be chosen close to this upper bound, since it is somewhat optimistic due to approximations and assumptions made in the derivation of the bound).
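The practical bound (2.15) is easy to estimate from data, because for a wide-sense stationary input the diagonal of the N×N autocorrelation matrix holds the input power r(0), so tr[R] = N·r(0). The Python sketch below illustrates this under that assumption; the function name `lms_step_size_bound` is hypothetical, and the bound applies to the update form of (2.12), in which the step multiplies e·x directly.

```python
def lms_step_size_bound(samples, num_taps):
    """Estimate the upper bound 2 / tr[R] on mu for an N-tap LMS filter.

    Assumes a wide-sense stationary input, so tr[R] = num_taps * r(0),
    where r(0) is the mean-square value of the input samples.
    """
    if not samples or num_taps < 1:
        raise ValueError("need at least one sample and one tap")
    power = sum(x * x for x in samples) / len(samples)
    if power == 0.0:
        raise ValueError("input power is zero; the bound is undefined")
    trace_r = num_taps * power
    return 2.0 / trace_r


if __name__ == "__main__":
    # Unit-power input and a 4-tap filter: tr[R] = 4.
    print(lms_step_size_bound([1.0, -1.0, 1.0, -1.0], num_taps=4))
```

In line with the remark above, a working step-size would normally be chosen well below the returned value.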
CHAPTER 3
IMPLEMENTATION OF LMS ADAPTIVE FILTER [7]
3.1) INTRODUCTION
In LMS the weight vector is updated from sample to sample as follows
    $h_{k+1} = h_k - \mu \nabla_k$    (3.1)
h_k and ∇_k are the weights and the true gradient vectors respectively, at the kth sampling instant. μ controls the stability and the rate of convergence.
LMS algorithm for updating the weights from sample to sample is
    $h_{k+1} = h_k + 2\mu e_k x_k$    (3.2)
where
    $e_k = y_k - h_k^T x_k$    (3.3)
3.2) IMPLEMENTATION OF LMS ALGORITHM [7]
1) Initially, set each weight h_k(i), i = 0, 1, ..., N-1, to an arbitrary fixed value such as 0. For each subsequent sampling instant, k = 1, 2, ..., carry out steps (2) to (4) below.
2) Compute filter output
    $n_k = \sum_{i=0}^{N-1} h_k(i)\, x_{k-i}$    (3.4)
3) Compute the error estimate
    $e_k = y_k - n_k$    (3.5)
4) Update the next filter weights
    $h_{k+1}(i) = h_k(i) + 2\mu e_k x_{k-i}$    (3.6)
The LMS algorithm requires approximately 2N+1 multiplications and 2N+1 additions for each new set of input and output samples.
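The four steps above can be sketched in floating-point Python as a behavioural reference for the hardware. This is only an illustration: the function name `lms_identify`, the test system `true_h`, the seed, and the value μ = 0.05 are assumptions chosen for the example and do not come from the VHDL design, which works on QSD digits.

```python
import random


def lms_identify(x, y, num_taps, mu):
    """Run the LMS steps (1)-(4): x is the input, y the desired signal."""
    h = [0.0] * num_taps                      # step 1: initial weights
    errors = []
    for k in range(len(x)):
        # x_{k-i} for i = 0..N-1; samples before the start are taken as 0
        window = [x[k - i] if k - i >= 0 else 0.0 for i in range(num_taps)]
        n_k = sum(h[i] * window[i] for i in range(num_taps))   # step 2, (3.4)
        e_k = y[k] - n_k                                        # step 3, (3.5)
        for i in range(num_taps):                               # step 4, (3.6)
            h[i] += 2.0 * mu * e_k * window[i]
        errors.append(e_k)
    return h, errors


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    true_h = [0.5, -0.25]                     # hypothetical unknown system
    y = [sum(true_h[i] * (x[k - i] if k - i >= 0 else 0.0) for i in range(2))
         for k in range(len(x))]
    h, errors = lms_identify(x, y, num_taps=2, mu=0.05)
    print([round(w, 3) for w in h], abs(errors[-1]))
```

With a noise-free desired signal the weights approach the coefficients of the unknown system and the error shrinks towards zero, which is the behaviour the hardware filter is expected to reproduce within its quantisation limits.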
3.3) FLOWCHART FOR THE LMS ADAPTIVE FILTER [7]
Initialize h_k(i) and x_{k-i}
  -> Read x_k and y_k from ADC
  -> Filter x_k: n_k = ∑ w_k(i) x_{k-i}
  -> Compute Error: e_k = y_k - n_k
  -> Compute Factor: 2μe_k
  -> Update Coefficient: w_{k+1}(i) = w_k(i) + 2μe_k x_{k-i}
3.4) IMPLEMENTATION OF DIFFERENT ORDERS LMS ADAPTIVE FILTER
3.4.1) Introduction [7]
The LMS algorithm is a linear adaptive filtering algorithm, which, in general, consists of two basic processes:
1) A filtering process, which involves (a) computing the output of a linear filter in response to an input signal and (b) generating an estimation error by comparing this output with a desired response.
2) An adaptive process, which involves the automatic adjustment of the parameters of the filter in accordance with the estimation error.
The combination of these two processes working together constitutes a feedback loop. First we have a transversal filter, around which the LMS algorithm is built; this component is responsible for performing the filtering process. Second, we have a mechanism for performing the adaptive control process on the tap weights of the transversal filter, hence it is called the adaptive weight-control mechanism.
Figure 3.1: LMS filter
3.4.2) 1st order LMS adaptive filter
3.4.2.1) Introduction
Figure 3.2: 1st order LMS adaptive filter
dout is the output of transversal filter
yn is the desired signal
e(n) is the estimation error given as
    e(n) = y(n) - dout(n)    (3.7)
    w(n+1) = w(n) + 2μe(n)xin(n)    (3.8)
w(n+1) is the updated weight and w(n) is the previous weight
Components required for designing of 1st order LMS adaptive filter are
Number of delay elements required = 1
Number of multipliers in transversal filter = 2
Number of multipliers in adaptive weight control mechanism = 3
Number of adders in transversal filter = 1
Number of adders in adaptive weight mechanism = 3
Here the total number of multipliers is 5 and the total number of adders is 4. The delay of QSD adder is 13.931ns and the delay of QSD multiplier is 11.348ns, so the total delay of 1st order LMS adaptive filter is 112.464ns.
3.4.2.2) VHDL implementation of 1st order LMS adaptive filter
Here we are using μ=0.5.
3.4.2.2.1) VHDL code for 1st order LMS adaptive filter
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity first_order_filter is
    port( x2,x1,x0 : in std_logic;
          y5,y4,y3,y2,y1,y0 : in std_logic;
          q2,q1,q0 : in std_logic;
          w02,w01,w00 : in std_logic;
          w12,w11,w10 : in std_logic;
          d5,d4,d3,d2,d1,d0 : inout std_logic );
end first_order_filter;

--}} End of automatically maintained section

architecture first_order_filter of first_order_filter is
    component delay_unit
        port( a,b,c : in STD_LOGIC;
              d,e,f : out STD_LOGIC );
    end component;
    component qsdadder
        port (b2,b1,b0,a2,a1,a0 : in std_logic;
              c1,c0,s2,s1,s0 : out std_logic);
    end component;
    component qsdadder2bit
        port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0 : in std_logic;
              z5,z4,z3,z2,z1,z0 : out std_logic);
    end component;
    component QSD_SINGLE_DIGIT_MULT
        port( a2 : in STD_LOGIC; a1 : in STD_LOGIC; a0 : in STD_LOGIC;
              b2 : in STD_LOGIC; b1 : in STD_LOGIC; b0 : in STD_LOGIC;
              c2 : inout STD_LOGIC; c1 : inout STD_LOGIC; c0 : inout STD_LOGIC;
              m2 : inout STD_LOGIC; m1 : inout STD_LOGIC; m0 : inout STD_LOGIC );
    end component;
    component adaptationunit_first_order
        port (d5,d4,d3,d2,d1,d0 : in std_logic;
              y5,y4,y3,y2,y1,y0 : in std_logic;
              q2,q1,q0 : in std_logic;
              x12,x11,x10,x22,x21,x20 : in std_logic;
              w12,w11,w10,w22,w21,w20 : in std_logic;
              wo12,wo11,wo10,wo22,wo21,wo20 : out std_logic);
    end component;
    signal nk02,nk01,nk00 : std_logic;
    signal nk12,nk11,nk10 : std_logic;
    signal nki02,nki01,nki00 : std_logic;
    signal nki12,nki11,nki10 : std_logic;
    signal xd12,xd11,xd10 : std_logic;
    signal xd22,xd21,xd20 : std_logic;
    signal do4,do3 : std_logic;
    signal ws02,ws01,ws00,ws12,ws11,ws10 : std_logic;
begin
    -- enter your statements here --
    -- delayed input sample
    delay1: delay_unit port map (x2,x1,x0,xd12,xd11,xd10);
    -- transversal filter: tap products (carry digit nki*, result digit nk*)
    mul1: QSD_SINGLE_DIGIT_MULT port map (x2,x1,x0,ws02,ws01,ws00,nki02,nki01,nki00,nk02,nk01,nk00);
    mul2: QSD_SINGLE_DIGIT_MULT port map (xd12,xd11,xd10,ws12,ws11,ws10,nki12,nki11,nki10,nk12,nk11,nk10);
    -- filter output d = sum of the two tap products
    add1: qsdadder2bit port map (nki02,nki01,nki00,nk02,nk01,nk00,nki12,nki11,nki10,nk12,nk11,nk10,d5,d4,d3,d2,d1,d0);
    -- adaptive weight control mechanism producing the updated weights ws*
    adaptation: adaptationunit_first_order port map (d5,d4,d3,d2,d1,d0,y5,y4,y3,y2,y1,y0,q2,q1,q0,x2,x1,x0,xd12,xd11,xd10,w02,w01,w00,w12,w11,w10,ws02,ws01,ws00,ws12,ws11,ws10);
end first_order_filter;
Figure 3.3: VHDL simulation of 1st order LMS adaptive filter
3.4.2.2.2) VHDL code for one digit QSD adder
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity qsdadder is
    port (b2,b1,b0,a2,a1,a0 : in std_logic;
          c1,c0,s2,s1,s0 : out std_logic);
end qsdadder;

--}} End of automatically maintained section

architecture qsdadder of qsdadder is
begin
    -- enter your statements here --
    c1 <= (a2 and b2 and(not b1)) or (a2 and (not a1) and b2) or (a2 and b2 and (not b0))or (a2 and (not a0) and b2) or (b2 and (not a1) and (not a0) and (not b1)) or (a2 and (not a1) and (not b1) and (not b0)) after 2 ns;
    c0 <= (a2 and b2 and (not b1)) or (a2 and (not a1) and b2) or (a2 and b2 and (not b0)) or (a2 and (not a0) and b2) or ((not a1) and (not a0) and b2 and (not b1)) or ((not a2)and a1 and (not b2) and b1) or ((not a2) and a0 and (not b2) and b1) or((not a2) and (not b2) and b1 and b0) or ((not a2) and a1 and (not b2) and b0) or (a2 and (not a1) and (not b1) and (not b0)) or ((not a2) and a1 and a0 and (not b2)) after 2 ns;
    s2 <= ((not a1) and b2 and b0) or (a2 and (not a0) and (not b1)) or ((not a1) and a0 and b2 and (not b1)) or ((not a1) and (not a0) and b2 and b1) or ((not a1) and a0 and b1 and (not b0)) or ((not a1) and (not a0) and b1 and b0) or (a2 and (not a1 ) and (not b1) and b0) or (a1 and (not a0) and (not b1) and b0) or (a2 and a1 and (not b1) and (not b0)) or (a1 and a0 and (not b1 ) and (not b0)) or (a2 and a1 and a0 and b2 and b1 and b0) after 2 ns;
    s1 <= ((not a1) and b1 and (not b0)) or ((not a1) and (not a0) and b1 ) or (a1 and (not a0) and (not b1)) or (a1 and (not b1) and (not b0)) or ( a1 and a0 and b1 and b0) or ((not a1) and a0 and (not b1) and b0) after 2 ns;
    s0 <= (a0 and (not b0)) or ((not a0) and b1 and b0) or ((not a2) and (not a0) and b0) or ((not a0 ) and (not b2) and b0) after 2 ns;
end qsdadder;
Figure 3.4: 1 digit QSD adder
3.4.2.2.3) VHDL code for two digit QSD adder
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity qsdadder2bit is
    port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0 : in std_logic;
          z5,z4,z3,z2,z1,z0 : out std_logic);
end qsdadder2bit;

--}} End of automatically maintained section

architecture qsdadder2bit of qsdadder2bit is
    component qsdadder
        port (b2,b1,b0,a2,a1,a0 : in std_logic;
              c1,c0,s2,s1,s0 : out std_logic);
    end component;
    signal ci5,ci4,ci3 : std_logic;
    signal si5,si4 : std_logic;
    signal s5,s4,s3,s2,s1 : std_logic;
begin
    -- enter your statements here --
    ci5 <= '0';
    -- lower digit: sum goes to z2..z0, carry to ci4,ci3
    add1: qsdadder port map (x2,x1,x0,y2,y1,y0,ci4,ci3,z2,z1,z0);
    -- upper digit: intermediate sum s3..s1
    add2: qsdadder port map (x5,x4,x3,y5,y4,y3,s5,s4,s3,s2,s1);
    -- combine the upper intermediate sum with the carry of the lower digit
    add3: qsdadder port map (s3,s2,s1,ci5,ci4,ci3,si5,si4,z5,z4,z3);
end qsdadder2bit;
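The two-step carry-free addition that these adder blocks are built around can be modelled in a few lines of Python. The sketch below is a behavioural illustration only, not a translation of the logic equations: the function names are hypothetical, it assumes the usual QSD rule that a digit sum above 2 emits carry +1 and a digit sum below -2 emits carry -1, and, unlike the six-bit output of the two digit adder, it appends the final carry as an extra digit.

```python
def qsd_add(a_digits, b_digits):
    """Add two QSD numbers given as equal-length digit lists (digits in -3..3),
    least significant digit first. Returns one more digit than the inputs."""
    if not a_digits or len(a_digits) != len(b_digits):
        raise ValueError("operands must be non-empty and of equal length")
    carries, sums = [], []
    # Step 1: each position independently forms an intermediate carry and sum.
    for a, b in zip(a_digits, b_digits):
        total = a + b                      # lies in -6..6
        if total > 2:
            carries.append(1)
            sums.append(total - 4)
        elif total < -2:
            carries.append(-1)
            sums.append(total + 4)
        else:
            carries.append(0)
            sums.append(total)
    # Step 2: add the carry of the next lower digit; no further carry can arise
    # because the intermediate sum is in -2..2 and the carry is in -1..1.
    result = [sums[i] + (carries[i - 1] if i > 0 else 0) for i in range(len(sums))]
    result.append(carries[-1])
    return result


def qsd_value(digits):
    """Decimal value of a QSD digit list, least significant digit first."""
    return sum(d * 4 ** i for i, d in enumerate(digits))


if __name__ == "__main__":
    a, b = [3, 2], [3, -1]                 # 11 and -1 in decimal
    z = qsd_add(a, b)
    print(z, qsd_value(a), qsd_value(b), qsd_value(z))
```

Because step 2 never produces a new carry, the addition time does not grow with the number of digits, which is the property the QSD adder relies on.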
Figure 3.5: 2 digit QSD adder
3.4.2.2.3) VHDL code for single digit multiplier
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity QSD_SINGLE_DIGIT_MULT is
    port(
        a2 : in STD_LOGIC; a1 : in STD_LOGIC; a0 : in STD_LOGIC;
        b2 : in STD_LOGIC; b1 : in STD_LOGIC; b0 : in STD_LOGIC;
        c2 : inout STD_LOGIC; c1 : inout STD_LOGIC; c0 : inout STD_LOGIC;
        m2 : inout STD_LOGIC; m1 : inout STD_LOGIC; m0 : inout STD_LOGIC
    );
end QSD_SINGLE_DIGIT_MULT;

--}} End of automatically maintained section

architecture QSD_SINGLE_DIGIT_MULT of QSD_SINGLE_DIGIT_MULT is
begin
    -- enter your statements here --
    c2 <= (a2 and(not b2)and b0 and((not b1)nand a1)) or (a2 and(not b2)and b1 and(a1 nand a0)) or ((not a2)and a0 and b2 and ((not a1)nand b1)) or ((not a2) and a1 and b2 and(b1 nand b0));
    c1 <= c2 or ( a2 and(not a1)and b2 and(not b1)) or (a1 and a0 and (not b2)and b1 and b0);
    c0 <= (a1 and(a0 nor b2)and b1) or (a1 and b1 and(a2 nor b0)) or (a2 and b2 and(a1 xor b1)) or ( a2 and b2 and(a0 nor b0)) or (a2 and b1 and(a1 nor b0)) or (a1 and b2 and(a0 nor b1)) or ((a1 nor b1)and a2 and(not b2)and b0) or ((a1 nor b1)and(not a2)and a0 and b2) or (a2 and a1 and(not b2)and b1 and b0) or ((not a2)and a1 and a0 and b2 and b1) or ((a2 nor b2)and a0 and b0 and(a1 xor b1));
    m2 <= (a2 and b1 and(a1 nor b2)) or (a1 and b2 and(a2 nor b1)) or (a1 and a0 and b2 and(not b1)) or(a2 and(not a1)and b1 and b0) or (a0 and b2 and(a2 nor b0)) or (a0 and b0 and(a1 xor b1)) or (a2 and b0 and(a0 nor b2)) or (a2 and a0 and b1 and(b2 nor b0)) or (a1 and b2 and b0 and(a2 nor a0));
    m1 <= (a0 and b1 and(a1 nand b0)) or (a1 and b0 and(b1 nand a0));
    m0 <= a0 and b0;
end QSD_SINGLE_DIGIT_MULT;

Figure 3.6: Single digit QSD multiplier
On simulation of the QSD single digit multiplier in Xilinx we get a delay of 11.348ns.
3.4.2.2.4) VHDL code for complement generator of two digit QSD number
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity complement_genrator is
    port(a5,a4,a3,a2,a1,a0 : in std_logic;
         b5,b4,b3,b2,b1,b0 : inout std_logic);
end complement_genrator;

--}} End of automatically maintained section

architecture complement_genrator of complement_genrator is
    signal f5,f4,f3,f2,f1,f0 : std_logic;
    signal n0 : std_logic;
    signal n2,n1 : std_logic;
    component qsdadder
        port (a0,a1,a2,b0,b1,b2 : in std_logic;
              c0,c1,s0,s1,s2 : out std_logic);
    end component;
begin
    n2 <= '0';
    n1 <= '0';
    n0 <= '1';
    f5 <= '0';
    process(a0)
    begin
b1<=’0’ . end if . b1<=’0’ . b3<=’1’.if a2=’0’ and a1 =’0’ and a0=’0’ then b2<=’0’ end if. if a2=’0’ and a1 =’1’ and a0=’0’ then b2<=’0’ . b0<=’1’. b0<=’0’. b3<=’1’. b0<=’0’. b4<=’0’ . if a2=’0’ and a1 =’1’ and a0=’1’ then b2<=’0’ . end if . b4<=’1’ . if a5=’0’ and a4 =’1’ and a3=’1’ then 62 . if a2=’0’ and a1 =’0’ and a0=’1’ then b2<=’0’ . b3<=’0’. if a5=’0’ and a4 =’1’ and a3=’0’ then b5<=’0’ . end if . if a5=’0’ and a4 =’0’ and a3=’0’ then b5<=’0’ . end if . . b1<=’1’ . b4<=’1’ . b0<=’1’. if a5=’0’ and a4 =’0’ and a3=’1’ then b5<=’0’ . end if . end if . b1<=’1’ .
b5) .f0.b1.b2. b4<=’0’ .f5.f2) . .f1.b3.enter your statements here – end complement_genrator.f4.b0.f3.n1. add2: qsdadder port map (f3.b4.n2. add1: qsdadder port map (n0. b3<=’0’.b5<=’0’ . Figure 3.b5.f4.b3.f5. end if .f4. end process.7: Two digit QSD number complement generator 63 .b4.
3.4.2.2.5) VHDL code for delay unit
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity delay_unit is
    port( a,b,c : in STD_LOGIC;
          d,e,f : out STD_LOGIC );
end delay_unit;

--}} End of automatically maintained section

architecture delay_unit of delay_unit is
begin
    -- enter your statements here --
    d <= a after 100 ns;
    e <= b after 100 ns;
    f <= c after 100 ns;
end delay_unit;
Figure 3.8: Delay unit
3.4.2.2.6) VHDL code for adaptive weight control mechanism
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity adaptationunit_first_order is
    port (d5,d4,d3,d2,d1,d0 : in std_logic;
          y5,y4,y3,y2,y1,y0 : in std_logic;
          q2,q1,q0 : in std_logic;
          x12,x11,x10,x22,x21,x20 : in std_logic;
          w12,w11,w10,w22,w21,w20 : in std_logic;
          wo12,wo11,wo10,wo22,wo21,wo20 : out std_logic);
end adaptationunit_first_order;

--}} End of automatically maintained section

architecture adaptationunit_first_order of adaptationunit_first_order is
    component complement_genrator
        port(a5,a4,a3,a2,a1,a0 : in std_logic;
             b5,b4,b3,b2,b1,b0 : inout std_logic);
    end component;
    component qsdadder2bit
        port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0 : in std_logic;
              z5,z4,z3,z2,z1,z0 : out std_logic);
    end component;
    component QSD_SINGLE_DIGIT_MULT
        port( a2 : in STD_LOGIC; a1 : in STD_LOGIC; a0 : in STD_LOGIC;
              b2 : in STD_LOGIC; b1 : in STD_LOGIC; b0 : in STD_LOGIC;
              c2 : inout STD_LOGIC; c1 : inout STD_LOGIC; c0 : inout STD_LOGIC;
              m2 : inout STD_LOGIC; m1 : inout STD_LOGIC; m0 : inout STD_LOGIC );
    end component;
    component qsdadder
        port (b2,b1,b0,a2,a1,a0 : in std_logic;
              c1,c0,s2,s1,s0 : out std_logic);
    end component;
    signal dc5,dc4,dc3,dc2,dc1,dc0 : std_logic;
    signal e5,e4,e3,e2,e1,e0 : std_logic;
    signal f5,f4,f3,f2,f1,f0 : std_logic;
    signal g15,g14,g13,g12,g11,g10 : std_logic;
    signal g25,g24,g23,g22,g21,g20 : std_logic;
    signal wo14,wo13,wo24,wo23 : std_logic;
begin
    -- enter your statements here --
    -- complement of the filter output d
    complement: complement_genrator port map (d5,d4,d3,d2,d1,d0,dc5,dc4,dc3,dc2,dc1,dc0);
    -- error e = y + complement(d)
    add1: qsdadder2bit port map (dc5,dc4,dc3,dc2,dc1,dc0,y5,y4,y3,y2,y1,y0,e5,e4,e3,e2,e1,e0);
    -- factor f = q * e (lower digit of e)
    mul1: QSD_SINGLE_DIGIT_MULT port map (e2,e1,e0,q2,q1,q0,f5,f4,f3,f2,f1,f0);
    -- weight corrections g = f * x for each tap
    mul21: QSD_SINGLE_DIGIT_MULT port map (f2,f1,f0,x12,x11,x10,g15,g14,g13,g12,g11,g10);
    mul22: QSD_SINGLE_DIGIT_MULT port map (f2,f1,f0,x22,x21,x20,g25,g24,g23,g22,g21,g20);
    -- updated weights wo = w + g
    add2: qsdadder port map (w12,w11,w10,g12,g11,g10,wo14,wo13,wo12,wo11,wo10);
    add3: qsdadder port map (w22,w21,w20,g22,g21,g20,wo24,wo23,wo22,wo21,wo20);
end adaptationunit_first_order;

Figure 3.9: Adaptive weight control mechanism of 1st order LMS adaptive filter
3.4.3) 2nd order LMS adaptive filter
3.4.3.1) Introduction
Figure 3.10: 2nd order LMS adaptive filter
dout is the output of transversal filter
yn is the desired output
e(n) is the estimation error given as
    e(n) = y(n) - dout(n)    (3.9)
    w(n+1) = w(n) + 2μe(n)xin(n)    (3.10)
w(n+1) is the updated weight and w(n) is the previous weight
Components required for designing of 2nd order LMS adaptive filter are
Number of delay elements required = 2
Number of multipliers in transversal filter = 3
Number of multipliers in adaptive weight control mechanism = 4
Number of adders in transversal filter = 2
Number of adders in adaptive weight mechanism = 4
Here total number of multipliers are 7 and total number of adders are 6. The delay of QSD adder is 13.931ns and the delay of QSD multiplier is 11.348ns, so the total delay of 2nd order LMS adaptive filter is 163.022ns.
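The totals quoted for both filter orders follow from adding up the delays of all multipliers and adders. The short Python sketch below reproduces that arithmetic; the helper name `total_delay_ns` is illustrative, and it deliberately follows the summation used in the text rather than a critical-path analysis, which could give a different figure.

```python
QSD_ADDER_DELAY_NS = 13.931        # simulated delay of one QSD adder
QSD_MULTIPLIER_DELAY_NS = 11.348   # simulated delay of one QSD multiplier


def total_delay_ns(num_multipliers, num_adders):
    """Sum of all multiplier and adder delays, as computed in the text."""
    return (num_multipliers * QSD_MULTIPLIER_DELAY_NS
            + num_adders * QSD_ADDER_DELAY_NS)


if __name__ == "__main__":
    # Component counts taken from the lists for the two filter orders.
    print(f"1st order: {total_delay_ns(5, 4):.3f} ns")
    print(f"2nd order: {total_delay_ns(7, 6):.3f} ns")
```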
3.4.3.2) VHDL implementation of 2nd order LMS adaptive filter
3.4.3.2.1) VHDL code for 2nd order LMS adaptive filter
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity second_order_filter is
    port( x2,x1,x0 : in std_logic;
          y5,y4,y3,y2,y1,y0 : in std_logic;
          q2,q1,q0 : in std_logic;
          w02,w01,w00 : in std_logic;
          w12,w11,w10 : in std_logic;
          w22,w21,w20 : in std_logic;
          d5,d4,d3,d2,d1,d0 : inout std_logic );
end second_order_filter;

--}} End of automatically maintained section

architecture second_order_filter of second_order_filter is
    component delay_unit
        port( a,b,c : in STD_LOGIC;
              d,e,f : out STD_LOGIC );
    end component;
    component qsdadder
        port (b2,b1,b0,a2,a1,a0 : in std_logic;
              c1,c0,s2,s1,s0 : out std_logic);
    end component;
    component qsdadder2bit
        port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0 : in std_logic;
              z5,z4,z3,z2,z1,z0 : out std_logic);
    end component;
    component QSD_SINGLE_DIGIT_MULT
        port( a2 : in STD_LOGIC; a1 : in STD_LOGIC; a0 : in STD_LOGIC;
              b2 : in STD_LOGIC; b1 : in STD_LOGIC; b0 : in STD_LOGIC;
              c2 : inout STD_LOGIC; c1 : inout STD_LOGIC; c0 : inout STD_LOGIC;
              m2 : inout STD_LOGIC; m1 : inout STD_LOGIC; m0 : inout STD_LOGIC );
    end component;
    component adaptation_unit
        port (d5,d4,d3,d2,d1,d0 : in std_logic;
              y5,y4,y3,y2,y1,y0 : in std_logic;
              q2,q1,q0 : in std_logic;
              x12,x11,x10,x22,x21,x20,x32,x31,x30 : in std_logic;
              w12,w11,w10,w22,w21,w20,w32,w31,w30 : in std_logic;
              wo12,wo11,wo10,wo22,wo21,wo20,wo32,wo31,wo30 : out std_logic);
    end component;
    signal nk02,nk01,nk00 : std_logic;
    signal nk12,nk11,nk10 : std_logic;
    signal nki02,nki01,nki00 : std_logic;
    signal nki12,nki11,nki10 : std_logic;
    signal nki22,nki21,nki20,nk22,nk21,nk20 : std_logic;
    signal xd12,xd11,xd10 : std_logic;
    signal xd22,xd21,xd20 : std_logic;
    signal do4,do3 : std_logic;
    signal di5,di4,di3,di2,di1,di0 : std_logic;
    signal ws02,ws01,ws00,ws12,ws11,ws10,ws22,ws21,ws20 : std_logic;
begin
    -- enter your statements here --
    -- delayed input samples
    delay1: delay_unit port map (x2,x1,x0,xd12,xd11,xd10);
    delay2: delay_unit port map (xd12,xd11,xd10,xd22,xd21,xd20);
    -- transversal filter: tap products (carry digit nki*, result digit nk*)
    mul1: QSD_SINGLE_DIGIT_MULT port map (x2,x1,x0,ws02,ws01,ws00,nki02,nki01,nki00,nk02,nk01,nk00);
    mul2: QSD_SINGLE_DIGIT_MULT port map (xd12,xd11,xd10,ws12,ws11,ws10,nki12,nki11,nki10,nk12,nk11,nk10);
    mul3: QSD_SINGLE_DIGIT_MULT port map (xd22,xd21,xd20,ws22,ws21,ws20,nki22,nki21,nki20,nk22,nk21,nk20);
    -- filter output d = sum of the three tap products
    add1: qsdadder2bit port map (nki02,nki01,nki00,nk02,nk01,nk00,nki12,nki11,nki10,nk12,nk11,nk10,di5,di4,di3,di2,di1,di0);
    add2: qsdadder2bit port map (di5,di4,di3,di2,di1,di0,nki22,nki21,nki20,nk22,nk21,nk20,d5,d4,d3,d2,d1,d0);
    -- adaptive weight control mechanism producing the updated weights ws*
    adaptation: adaptation_unit port map (d5,d4,d3,d2,d1,d0,y5,y4,y3,y2,y1,y0,q2,q1,q0,x2,x1,x0,xd12,xd11,xd10,xd22,xd21,xd20,w02,w01,w00,w12,w11,w10,w22,w21,w20,ws02,ws01,ws00,ws12,ws11,ws10,ws22,ws21,ws20);
end second_order_filter;
Figure 3.11: VHDL simulation of 2nd order LMS adaptive filter
3.4.3.2.2) VHDL code for adaptive weight control mechanism of 2nd order LMS filter
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity adaptation_unit is
    port (d5,d4,d3,d2,d1,d0 : in std_logic;
          y5,y4,y3,y2,y1,y0 : in std_logic;
          q2,q1,q0 : in std_logic;
          x12,x11,x10,x22,x21,x20,x32,x31,x30 : in std_logic;
          w12,w11,w10,w22,w21,w20,w32,w31,w30 : in std_logic;
          wo12,wo11,wo10,wo22,wo21,wo20,wo32,wo31,wo30 : out std_logic);
end adaptation_unit;

--}} End of automatically maintained section

architecture adaptation_unit of adaptation_unit is
    component complement_genrator
        port(a5,a4,a3,a2,a1,a0 : in std_logic;
             b5,b4,b3,b2,b1,b0 : inout std_logic);
    end component;
    component qsdadder2bit
        port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0 : in std_logic;
              z5,z4,z3,z2,z1,z0 : out std_logic);
    end component;
    component QSD_SINGLE_DIGIT_MULT
        port( a2 : in STD_LOGIC; a1 : in STD_LOGIC; a0 : in STD_LOGIC;
              b2 : in STD_LOGIC; b1 : in STD_LOGIC; b0 : in STD_LOGIC;
              c2 : inout STD_LOGIC; c1 : inout STD_LOGIC; c0 : inout STD_LOGIC;
              m2 : inout STD_LOGIC; m1 : inout STD_LOGIC; m0 : inout STD_LOGIC );
    end component;
    component qsdadder
        port (b2,b1,b0,a2,a1,a0 : in std_logic;
              c1,c0,s2,s1,s0 : out std_logic);
    end component;
    signal dc5,dc4,dc3,dc2,dc1,dc0 : std_logic;
    signal e5,e4,e3,e2,e1,e0 : std_logic;
    signal f5,f4,f3,f2,f1,f0 : std_logic;
    signal g15,g14,g13,g12,g11,g10 : std_logic;
    signal g25,g24,g23,g22,g21,g20 : std_logic;
    signal g35,g34,g33,g32,g31,g30 : std_logic;
    signal wo14,wo13,wo24,wo23,wo34,wo33 : std_logic;
begin
    -- enter your statements here --
    -- complement of the filter output d
    complement: complement_genrator port map (d5,d4,d3,d2,d1,d0,dc5,dc4,dc3,dc2,dc1,dc0);
    -- error e = y + complement(d)
    add1: qsdadder2bit port map (dc5,dc4,dc3,dc2,dc1,dc0,y5,y4,y3,y2,y1,y0,e5,e4,e3,e2,e1,e0);
    -- factor f = q * e (lower digit of e)
    mul1: QSD_SINGLE_DIGIT_MULT port map (e2,e1,e0,q2,q1,q0,f5,f4,f3,f2,f1,f0);
    -- weight corrections g = f * x for each tap
    mul21: QSD_SINGLE_DIGIT_MULT port map (f2,f1,f0,x12,x11,x10,g15,g14,g13,g12,g11,g10);
    mul22: QSD_SINGLE_DIGIT_MULT port map (f2,f1,f0,x22,x21,x20,g25,g24,g23,g22,g21,g20);
    mul23: QSD_SINGLE_DIGIT_MULT port map (f2,f1,f0,x32,x31,x30,g35,g34,g33,g32,g31,g30);
    -- updated weights wo = w + g
    add2: qsdadder port map (w12,w11,w10,g12,g11,g10,wo14,wo13,wo12,wo11,wo10);
    add3: qsdadder port map (w22,w21,w20,g22,g21,g20,wo24,wo23,wo22,wo21,wo20);
    add4: qsdadder port map (w32,w31,w30,g32,g31,g30,wo34,wo33,wo32,wo31,wo30);
end adaptation_unit;

Figure 3.12: Adaptive weight control mechanism of 2nd order LMS adaptive filter
CHAPTER 4
CONCLUSION
We have implemented 1st and 2nd order adaptive filters using the LMS algorithm for the adaptive weight control mechanism. For implementation of the above adaptive filter we have used the non-conventional quaternary signed digit number system. For this we have designed and implemented addition and multiplication blocks for the QSD number system. By use of these blocks we have implemented our adaptive filter.
The LMS algorithm requires approximately 2N+1 multiplications and 2N+1 additions for each new set of input and output samples, where N is the order of the filter. So the delay depends upon the number of multiplications and additions. We have shown above that in the QSD number system the addition takes place in parallel, so the delay is constant and does not depend on the number of bits to be added.
Here we have implemented the adaptive filter using QSD adders and multipliers. The delay of the QSD adder is 13.931ns and the delay of the QSD multiplier is 11.348ns; the total delay of the 1st order LMS adaptive filter is 112.464ns and the total delay of the 2nd order LMS adaptive filter is 163.022ns. So the delay is much less in comparison to the implementation of the adaptive filter using conventional adders and multipliers.
APPENDIX
1. Xilinx report for QSD adder

Release 9.2i - xst J.36
Copyright (c) 1995-2007 Xilinx, Inc. All rights reserved.
--> Parameter TMPDIR set to ./xst/projnav.tmp
CPU : 0.00 / 0.16 s | Elapsed : 0.00 / 0.00 s
--> Parameter xsthdpdir set to ./xst
CPU : 0.00 / 0.16 s | Elapsed : 0.00 / 0.00 s
--> Reading design: qsdadder.prj

TABLE OF CONTENTS
  1) Synthesis Options Summary
  2) HDL Compilation
  3) Design Hierarchy Analysis
  4) HDL Analysis
  5) HDL Synthesis
     5.1) HDL Synthesis Report
  6) Advanced HDL Synthesis
     6.1) Advanced HDL Synthesis Report
  7) Low Level Synthesis
  8) Partition Report
  9) Final Report
     9.1) Device utilization summary
     9.2) Partition Resource Summary
     9.3) TIMING REPORT

=========================================================================
*                      Synthesis Options Summary                        *
=========================================================================
---- Source Parameters
Input File Name                    : "qsdadder.prj"
Input Format                       : mixed
Ignore Synthesis Constraint File   : NO

---- Target Parameters
Output File Name                   : "qsdadder"
Output Format                      : NGC
Target Device                      : xc2s15-6-cs144

---- Source Options
Top Module Name                    : qsdadder
Automatic FSM Extraction           : YES
FSM Encoding Algorithm             : Auto
Safe Implementation                : No
FSM Style                          : lut
RAM Extraction                     : Yes
RAM Style                          : Auto
ROM Extraction                     : Yes
Mux Style                          : Auto
Decoder Extraction                 : YES
Priority Encoder Extraction        : YES
Shift Register Extraction          : YES
Logical Shifter Extraction         : YES
XOR Collapsing                     : YES
ROM Style                          : Auto
Mux Extraction                     : YES
Resource Sharing                   : YES
Asynchronous To Synchronous        : NO
Multiplier Style                   : lut
Automatic Register Balancing       : No

---- Target Options
Add IO Buffers                     : YES
Global Maximum Fanout              : 100
Add Generic Clock Buffer(BUFG)     : 4
Register Duplication               : YES
Slice Packing                      : YES
Optimize Instantiated Primitives   : NO
Convert Tristates To Logic         : Yes
Use Clock Enable                   : Yes
Use Synchronous Set                : Yes
Use Synchronous Reset              : Yes
Pack IO Registers into IOBs        : auto
Equivalent register Removal        : YES

---- General Options
Optimization Goal                  : Speed
Optimization Effort                : 1
Library Search Order               : qsdadder.lso
Keep Hierarchy                     : NO
RTL Output                         : Yes
Global Optimization                : AllClockNets
Read Cores                         : YES
Write Timing Constraints           : NO
Cross Clock Analysis               : NO
Hierarchy Separator                : /
Bus Delimiter                      : <>
Case Specifier                     : maintain
Slice Utilization Ratio            : 100
BRAM Utilization Ratio             : 100
Verilog 2001                       : YES
Auto BRAM Packing                  : NO
Slice Utilization Ratio Delta      : 5

=========================================================================
*                          HDL Compilation                              *
=========================================================================
Compiling vhdl file "C:/Xilinx92i/lma/adaptive.vhd" in Library work.
Architecture qsdadder of Entity qsdadder is up to date.

=========================================================================
*                     Design Hierarchy Analysis                         *
=========================================================================
Analyzing hierarchy for entity <qsdadder> in library <work> (architecture <qsdadder>).

=========================================================================
*                            HDL Analysis                               *
=========================================================================
Analyzing Entity <qsdadder> in library <work> (Architecture <qsdadder>).
Entity <qsdadder> analyzed. Unit <qsdadder> generated.

=========================================================================
*                           HDL Synthesis                               *
=========================================================================
Performing bidirectional port resolution...
Synthesizing Unit <qsdadder>.
    Related source file is "C:/Xilinx92i/lma/adaptive.vhd".
Unit <qsdadder> synthesized.

=========================================================================
HDL Synthesis Report
Found no macro
=========================================================================

=========================================================================
*                       Advanced HDL Synthesis                          *
=========================================================================
Loading device for application Rf_Device from file '2s15.nph' in environment C:\Xilinx92i.

=========================================================================
Advanced HDL Synthesis Report
Found no macro
=========================================================================

=========================================================================
*                         Low Level Synthesis                           *
=========================================================================
Optimizing unit <qsdadder> ...
Mapping all equations...
Building and optimizing final netlist ...
Found area constraint ratio of 100 (+ 5) on block qsdadder, actual ratio is 3.
Final Macro Processing ...

=========================================================================
Final Register Report
Found no macro
=========================================================================

=========================================================================
*                          Partition Report                             *
=========================================================================
Partition Implementation Status
  No Partitions were found in this design.

=========================================================================
*                            Final Report                               *
=========================================================================
Final Results
RTL Top Level Output File Name     : qsdadder.ngr
Top Level Output File Name         : qsdadder
Output Format                      : NGC
Optimization Goal                  : Speed
Keep Hierarchy                     : NO

Design Statistics
# IOs                              : 11

Cell Usage :
# BELS                             : 17
#      LUT2                        : 1
#      LUT3                        : 2
#      LUT4                        : 10
#      MUXF5                       : 3
#      MUXF6                       : 1
# IO Buffers                       : 11
#      IBUF                        : 6
#      OBUF                        : 5
=========================================================================

Device utilization summary:
Selected Device : 2s15cs144-6
 Number of Slices:                 7  out of  192    3%
 Number of 4 input LUTs:          13  out of  384    3%
 Number of IOs:                   11
 Number of bonded IOBs:           11  out of   86   12%

Partition Resource Summary:
  No Partitions were found in this design.

=========================================================================
TIMING REPORT
NOTE: THESE TIMING NUMBERS ARE ONLY A SYNTHESIS ESTIMATE.
      FOR ACCURATE TIMING INFORMATION PLEASE REFER TO THE TRACE REPORT
      GENERATED AFTER PLACE-and-ROUTE.

Clock Information:
No clock signals found in this design

Asynchronous Control Signals Information:
No asynchronous control signals found in this design

Timing Summary:
Speed Grade: -6
   Minimum period: No path found
   Minimum input arrival time before clock: No path found
   Maximum output required time after clock: No path found
   Maximum combinational path delay: 13.931ns

Timing Detail:
All values displayed in nanoseconds (ns)
=========================================================================
Timing constraint: Default path analysis
  Total number of paths / destination ports: 59 / 5
Delay:               13.931ns (Levels of Logic = 6)
  Source:            a0 (PAD)
  Destination:       c0 (PAD)

  Data Path: a0 to c0
                          Gate     Net
    Cell:in->out  fanout  Delay   Delay  Logical Name (Net Name)
    ------------------------------------------------------------
    IBUF:I->O         10  0.776   1.980  a0_IBUF (a0_IBUF)
    LUT4:I0->O         1  0.549   1.035  c138 (c1_map15)
    LUT3:I2->O         1  0.549   1.035  c149 (c1_map17)
    LUT4:I0->O         2  0.549   1.206  c157 (c1_OBUF)
    LUT4:I3->O         1  0.549   1.035  c0 (c0_OBUF)
    OBUF:I->O             4.668          c0_OBUF (c0)
    ------------------------------------------------------------
    Total                13.931ns (7.640ns logic, 6.291ns route)
                                  (54.8% logic, 45.2% route)
=========================================================================
CPU : 3.12 / 3.31 s | Elapsed : 3.00 / 3.00 s
-->
Total memory usage is 161748 kilobytes
Number of errors   : 0 ( 0 filtered)
Number of warnings : 0 ( 0 filtered)
Number of infos    : 0 ( 0 filtered)

2. Xilinx report for QSD multiplier

Release 9.2i - xst J.36
Copyright (c) 1995-2007 Xilinx, Inc. All rights reserved.
--> Parameter TMPDIR set to ./xst/projnav.tmp
CPU : 0.00 / 0.16 s | Elapsed : 0.00 / 0.00 s
--> Parameter xsthdpdir set to ./xst
CPU : 0.00 / 0.16 s | Elapsed : 0.00 / 0.00 s
--> Reading design: QSD_SINGLE_DIGIT_MULT.prj

TABLE OF CONTENTS
  1) Synthesis Options Summary
  2) HDL Compilation
  3) Design Hierarchy Analysis
  4) HDL Analysis
  5) HDL Synthesis
     5.1) HDL Synthesis Report
  6) Advanced HDL Synthesis
     6.1) Advanced HDL Synthesis Report
  7) Low Level Synthesis
  8) Partition Report
  9) Final Report
     9.1) Device utilization summary
     9.2) Partition Resource Summary
     9.3) TIMING REPORT

=========================================================================
*                      Synthesis Options Summary                        *
=========================================================================
---- Source Parameters
Input File Name                    : "QSD_SINGLE_DIGIT_MULT.prj"
Input Format                       : mixed
Ignore Synthesis Constraint File   : NO

---- Target Parameters
Output File Name                   : "QSD_SINGLE_DIGIT_MULT"
Output Format                      : NGC
Target Device                      : xc2s15-6-cs144

---- Source Options
Top Module Name                    : QSD_SINGLE_DIGIT_MULT
Automatic FSM Extraction           : YES
FSM Encoding Algorithm             : Auto
Safe Implementation                : No
FSM Style                          : lut
RAM Extraction                     : Yes
RAM Style                          : Auto
ROM Extraction                     : Yes
Mux Style                          : Auto
Decoder Extraction                 : YES
Priority Encoder Extraction        : YES
Shift Register Extraction          : YES
Logical Shifter Extraction         : YES
XOR Collapsing                     : YES
ROM Style                          : Auto
Mux Extraction                     : YES
Resource Sharing                   : YES
Asynchronous To Synchronous        : NO
Multiplier Style                   : lut
Automatic Register Balancing       : No

---- Target Options
Add IO Buffers                     : YES
Global Maximum Fanout              : 100
Add Generic Clock Buffer(BUFG)     : 4
Register Duplication               : YES
Slice Packing                      : YES
Optimize Instantiated Primitives   : NO
Convert Tristates To Logic         : Yes
Use Clock Enable                   : Yes
Use Synchronous Set                : Yes
Use Synchronous Reset              : Yes
Pack IO Registers into IOBs        : auto
Equivalent register Removal        : YES

---- General Options
Optimization Goal                  : Speed
Optimization Effort                : 1
Library Search Order               : QSD_SINGLE_DIGIT_MULT.lso
Keep Hierarchy                     : NO
RTL Output                         : Yes
Global Optimization                : AllClockNets
Read Cores                         : YES
Write Timing Constraints           : NO
Cross Clock Analysis               : NO
Hierarchy Separator                : /
Bus Delimiter                      : <>
Case Specifier                     : maintain
Slice Utilization Ratio            : 100
BRAM Utilization Ratio             : 100
Verilog 2001                       : YES
Auto BRAM Packing                  : NO
Slice Utilization Ratio Delta      : 5

=========================================================================
*                          HDL Compilation                              *
=========================================================================
Compiling vhdl file "C:/Xilinx92i/QSD_SINGLE_DIGIT_MULT/QSD_SINGLE_DIGIT_MULT.vhd" in Library work.
Entity <QSD_SINGLE_DIGIT_MULT> compiled.
Entity <QSD_SINGLE_DIGIT_MULT> (Architecture <QSD_SINGLE_DIGIT_MULT>) compiled.

=========================================================================
*                     Design Hierarchy Analysis                         *
=========================================================================
Analyzing hierarchy for entity <QSD_SINGLE_DIGIT_MULT> in library <work> (architecture <QSD_SINGLE_DIGIT_MULT>).

=========================================================================
*                            HDL Analysis                               *
=========================================================================
Analyzing Entity <QSD_SINGLE_DIGIT_MULT> in library <work> (Architecture <QSD_SINGLE_DIGIT_MULT>).
Entity <QSD_SINGLE_DIGIT_MULT> analyzed. Unit <QSD_SINGLE_DIGIT_MULT> generated.

=========================================================================
*                           HDL Synthesis                               *
=========================================================================
Performing bidirectional port resolution...
Synthesizing Unit <QSD_SINGLE_DIGIT_MULT>.
    Related source file is "C:/Xilinx92i/QSD_SINGLE_DIGIT_MULT/QSD_SINGLE_DIGIT_MULT.vhd".
    Found 1-bit xor2 for signal <m2$xor0000> created at line 55.
Unit <QSD_SINGLE_DIGIT_MULT> synthesized.

=========================================================================
HDL Synthesis Report
Macro Statistics
# Xors                             : 1
 1-bit xor2                        : 1
=========================================================================

=========================================================================
*                       Advanced HDL Synthesis                          *
=========================================================================
Loading device for application Rf_Device from file '2s15.nph' in environment C:\Xilinx92i.

=========================================================================
Advanced HDL Synthesis Report
Macro Statistics
# Xors                             : 1
 1-bit xor2                        : 1
=========================================================================

=========================================================================
*                         Low Level Synthesis                           *
=========================================================================
Optimizing unit <QSD_SINGLE_DIGIT_MULT> ...
Mapping all equations...
Building and optimizing final netlist ...
Found area constraint ratio of 100 (+ 5) on block QSD_SINGLE_DIGIT_MULT, actual ratio is 4.
Final Macro Processing ...

=========================================================================
Final Register Report
Found no macro
=========================================================================

=========================================================================
*                          Partition Report                             *
=========================================================================
Partition Implementation Status
  No Partitions were found in this design.

=========================================================================
*                            Final Report                               *
=========================================================================
Final Results
RTL Top Level Output File Name     : QSD_SINGLE_DIGIT_MULT.ngr
Top Level Output File Name         : QSD_SINGLE_DIGIT_MULT
Output Format                      : NGC
Optimization Goal                  : Speed
Keep Hierarchy                     : NO

Design Statistics
# IOs                              : 12

Cell Usage :
# BELS                             : 23
#      LUT2                        : 1
#      LUT3                        : 1
#      LUT4                        : 14
#      MUXF5                       : 5
#      MUXF6                       : 2
# IO Buffers                       : 12
#      IBUF                        : 6
#      OBUF                        : 6
=========================================================================

Device utilization summary:
Selected Device : 2s15cs144-6
 Number of Slices:                 8  out of  192    4%
 Number of 4 input LUTs:          16  out of  384    4%
 Number of IOs:                   12
 Number of bonded IOBs:           12  out of   86   13%

Partition Resource Summary:
  No Partitions were found in this design.

=========================================================================
TIMING REPORT
NOTE: THESE TIMING NUMBERS ARE ONLY A SYNTHESIS ESTIMATE.
      FOR ACCURATE TIMING INFORMATION PLEASE REFER TO THE TRACE REPORT
      GENERATED AFTER PLACE-and-ROUTE.

Clock Information:
No clock signals found in this design

Asynchronous Control Signals Information:
No asynchronous control signals found in this design

Timing Summary:
Speed Grade: -6
   Minimum period: No path found
   Minimum input arrival time before clock: No path found
   Maximum output required time after clock: No path found
   Maximum combinational path delay: 11.348ns

Timing Detail:
All values displayed in nanoseconds (ns)
=========================================================================
Timing constraint: Default path analysis
  Total number of paths / destination ports: 71 / 6
Delay:               11.348ns (Levels of Logic = 5)
  Source:            a1 (PAD)
  Destination:       c1 (PAD)

  Data Path: a1 to c1
                          Gate     Net
    Cell:in->out  fanout  Delay   Delay  Logical Name (Net Name)
    ------------------------------------------------------------
    IBUF:I->O         13  0.776   2.250  a1_IBUF (a1_IBUF)
    LUT4:I1->O         2  0.549   1.206  c2_SW0 (N21)
    LUT3:I0->O         1  0.549   0.000  c1_F (N41)
    MUXF5:I0->O        1  0.315   1.035  c1 (c1_OBUF)
    OBUF:I->O             4.668          c1_OBUF (c1)
    ------------------------------------------------------------
    Total                11.348ns (6.857ns logic, 4.491ns route)
                                  (60.4% logic, 39.6% route)
=========================================================================
CPU : 3.03 / 3.21 s | Elapsed : 3.00 / 3.00 s
-->
Total memory usage is 162324 kilobytes
Number of errors   : 0 ( 0 filtered)
Number of warnings : 0 ( 0 filtered)
Number of infos    : 0 ( 0 filtered)