Gabe Rowe: EE 471 Lab 2

Gabe Rowe
EE 471
Lab 2
Abstract
This 32 bit ALU has two notable features: minimized logic for speed, and carry
look-ahead for even more speed. The goal of this project was to make an ALU that looks
like the figure below. My ALU can perform all operations within approximately 14 gate
delays, assuming the zero detect is one gate delay.
[Block diagram: Bus A (32 bits), Bus B (32 bits), and ALU Control feed the 32 bit ALU; its outputs are Output (32 bits), Zero, Overflow, and CarryOut.]
Figure 1. 32 bit ALU block diagram

Overflow
To calculate the overflow, we need to know the carry into and the carry out of the
highest order bit. However, I wanted to minimize logic, so I made tables of the
overflow cases and derived the logic equation from them.
To minimize logic of overflow:

Overflow=Cout xor Cin
All Overflow Cases

Example      Binvert  A  B  Bmux  Cin  Cout  Sum  Overflow
A+B < 0         0     0  0   0     1    0     1      1
-A+-B > 0       0     1  1   1     0    1     0      1
A - -B < 0      1     0  1   0     1    0     1      1
-A - B > 0      1     1  0   1     0    1     0      1
Simplified Overflow Cases

Example      Binvert  A  B  Bmux  Cin  Cout  Sum  Overflow
A+B < 0         X     0  X   0     1    0     1      1
-A+-B > 0       X     1  X   1     0    1     0      1
Logic Equation based on table:

Overflow = ~A*~Bmux*Cin + A*Bmux*~Cin
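
The equivalence between the minimized equation and Overflow = Cout xor Cin can be checked exhaustively, since there are only eight input combinations at the highest order bit. The following is a minimal testbench sketch, not part of the original lab code; the module name overflow_check_tb and its signal names are hypothetical, and Cout is assumed to be the usual full-adder carry out of A, Bmux, and Cin.

```verilog
// Sketch: compare the minimized overflow equation against Cout xor Cin.
// overflow_check_tb and its signal names are illustrative assumptions.
module overflow_check_tb;
  reg a, bmux, cin;                                   // highest order bit inputs
  wire cout = (a & bmux) | (a & cin) | (bmux & cin);  // full-adder carry out
  wire ref_overflow = cout ^ cin;                     // definition from the text
  wire min_overflow = (~a & ~bmux & cin) | (a & bmux & ~cin); // minimized form
  integer i, errors;

  initial begin
    errors = 0;
    for (i = 0; i < 8; i = i + 1) begin
      {a, bmux, cin} = i;   // low three bits of i drive the inputs
      #1;                   // let the continuous assignments settle
      if (ref_overflow !== min_overflow) begin
        errors = errors + 1;
        $display("MISMATCH a=%b bmux=%b cin=%b", a, bmux, cin);
      end
    end
    if (errors == 0) $display("overflow equations agree on all 8 inputs");
    $finish;
  end
endmodule
```

Running this in any Verilog simulator should report no mismatches, because the two product terms are exactly the two rows of the simplified table where the carry in and carry out differ.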
Set Less Than
To calculate the set less than output from the highest order bit, we would typically
XOR the sum output and the overflow. However, this is costly in time, and the logic can
be minimized. So I made a truth table and derived the logic equation from it using a
k-map.
Set=Overflow xor Sum

I expanded this, and realized that it was too much work to minimize the logic
algebraically, so I used a k-map.
A  B  Cin  Sum  Overflow  Set
0  0   0    0      0       0
0  0   1    1      1       0
0  1   0    1      0       1
0  1   1    0      0       0
1  0   0    1      0       1
1  0   1    0      0       0
1  1   0    0      1       1
1  1   1    1      0       1
K-map
         AB=00  AB=01  AB=10  AB=11
Cin=0      0      1      1      1
Cin=1      0      0      0      1
This gives us the minimized equation for set of:

Set=AB+A~Cin+B~Cin
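
As a cross-check of the k-map result, the minimized Set equation can be compared against Overflow xor Sum for every row of the truth table. This is a small testbench sketch under the same assumptions as the table above (B here stands for the Bmux input of the highest order bit); the module name set_check_tb is hypothetical and not part of the lab code.

```verilog
// Sketch: compare Set = AB + A~Cin + B~Cin against Overflow xor Sum.
// set_check_tb and its signal names are illustrative assumptions.
module set_check_tb;
  reg a, b, cin;                                     // b is the Bmux value
  wire sum      = a ^ b ^ cin;                       // full-adder sum
  wire overflow = (~a & ~b & cin) | (a & b & ~cin);  // overflow equation
  wire ref_set  = overflow ^ sum;                    // definition from the text
  wire min_set  = (a & b) | (a & ~cin) | (b & ~cin); // k-map result
  integer i, errors;

  initial begin
    errors = 0;
    for (i = 0; i < 8; i = i + 1) begin
      {a, b, cin} = i;   // low three bits of i drive the inputs
      #1;                // let the continuous assignments settle
      if (ref_set !== min_set) begin
        errors = errors + 1;
        $display("MISMATCH a=%b b=%b cin=%b", a, b, cin);
      end
    end
    if (errors == 0) $display("set equations agree on all 8 inputs");
    $finish;
  end
endmodule
```

The minimized form is a majority function of A, B, and ~Cin, which is why it needs only two gate levels plus one inverter.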
1 bit ALU
I decided to go beyond just minimizing logic for the ALU and do carry look-ahead. This
meant that I would need to employ a partial full adder (PFA). Since the PFA uses an OR
gate for the propagate output and an AND gate for the generate output, I simply re-used
those for the AND and OR operations required of the ALU. I also decided to use an XOR
gate instead of a 2 to 1 mux for the b invert, since the XOR gate simplified the logic.
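[Figure: 1-bit ALU built around the PFA. Inputs are a, b, b invert, Cin, Less, and the 2-bit OP code. The PFA outputs g, p, and Sum; a 4 to 1 mux selects Result from AND (input 0), OR (input 1), Sum (input 2), and Less (input 3). The highest order bit also has the Overflow and Set logic:
Overflow = A*Bmux*~Cin + ~A*~Bmux*Cin
Set = A*Bmux + A*~Cin + Bmux*~Cin]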
Figure 2. 1-bit ALU
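
The XOR-for-mux substitution on the b input can be sanity-checked with a four-row comparison. The sketch below assumes the replaced 2 to 1 mux would have chosen between b and ~b under control of b invert; the module name bmux_check_tb is hypothetical and not part of the lab code.

```verilog
// Sketch: an XOR with binv behaves like a 2-to-1 mux between b and ~b.
// bmux_check_tb and its signal names are illustrative assumptions.
module bmux_check_tb;
  reg b, binv;
  wire xor_out = b ^ binv;        // same function as b_mux in Appendix A
  wire mux_out = binv ? ~b : b;   // assumed mux: pass b, or ~b when binv is 1
  integer i, errors;

  initial begin
    errors = 0;
    for (i = 0; i < 4; i = i + 1) begin
      {b, binv} = i;   // low two bits of i drive the inputs
      #1;              // let the continuous assignments settle
      if (xor_out !== mux_out) begin
        errors = errors + 1;
        $display("MISMATCH b=%b binv=%b", b, binv);
      end
    end
    if (errors == 0) $display("xor matches the 2-to-1 mux on all 4 inputs");
    $finish;
  end
endmodule
```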
I decided to use the 2 gate level version of the XOR in the 1 bit ALU. However, in my
code, to compete with my friends, I decided to use XOR gates with 50ps delays, the
same as the other gates. In reality, the XOR is two gate levels, and thus should
be slower. A 4 to 1 mux was used to select the input we want to look at, whether it’s the
AND, the OR, the sum, or the set less than. This is shown below.
[Figure: 4 to 1 mux with data inputs in0, in1, in2, and in3 (select values 0 to 3), select lines sel1 and sel0, and output out.]
Figure 3. 4 to 1 Mux
Carry Look-Ahead
The next main module I decided to use in this 32 bit ALU was carry look-ahead for my
adder. This increased the speed of my adder by approximately a factor of 8. The entire
add process takes 13 gate delays to calculate the slowest bit, the 32nd bit’s sum. The
carry look-ahead modules are shown on the following pages. They illustrate how I used
the same 4-bit carry look-ahead module to create a 32-bit carry look-ahead module from
cascaded 4-bit sections. I first show the main 4-bit carry look-ahead section in the
4-bit adder, then the block diagrams used to create the 16-bit and 32-bit adders, and
finally the actual gate-level design of the 16 and 32 bit adders.
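
To illustrate what the 4-bit look-ahead section computes, the sketch below restates its carry equations with continuous assignments and compares them against carries obtained by ordinary addition, over all 512 combinations of two 4-bit operands and a carry in. It is an illustrative testbench, not the lab's gate-level module: the name cla4_check_tb and the reference wires s1 to s4 are assumptions, while the sum-of-products terms follow the cla_4_bit module in Appendix A, with p as the OR and g as the AND of each bit pair.

```verilog
// Sketch: 4-bit carry look-ahead equations checked against plain addition.
// cla4_check_tb, s1..s4, and r1..r4 are illustrative assumptions.
module cla4_check_tb;
  reg  [3:0] a, b;
  reg        cin;
  wire [3:0] p = a | b;   // propagate (OR, as in the PFA)
  wire [3:0] g = a & b;   // generate (AND)

  // Look-ahead carries: every carry depends only on p, g, and cin
  wire c1 = g[0] | (p[0] & cin);
  wire c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & cin);
  wire c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
                 | (p[2] & p[1] & p[0] & cin);
  wire gg = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
                 | (p[3] & p[2] & p[1] & g[0]);   // group generate
  wire pg = &p;                                   // group propagate
  wire c4 = gg | (pg & cin);                      // carry out of the group

  // Reference carries: top bit of each partial sum
  wire [1:0] s1 = a[0]   + b[0]   + cin;
  wire [2:0] s2 = a[1:0] + b[1:0] + cin;
  wire [3:0] s3 = a[2:0] + b[2:0] + cin;
  wire [4:0] s4 = a      + b      + cin;
  wire r1 = s1[1], r2 = s2[2], r3 = s3[3], r4 = s4[4];

  integer i, errors;
  initial begin
    errors = 0;
    for (i = 0; i < 512; i = i + 1) begin
      {a, b, cin} = i;   // low nine bits of i drive the inputs
      #1;                // let the continuous assignments settle
      if ({c4, c3, c2, c1} !== {r4, r3, r2, r1}) begin
        errors = errors + 1;
        $display("MISMATCH a=%b b=%b cin=%b", a, b, cin);
      end
    end
    if (errors == 0) $display("look-ahead carries match on all 512 inputs");
    $finish;
  end
endmodule
```

Because gg and pg have the same form as a single bit's g and p, the same equations can be applied one level up to combine four 4-bit groups, which is the cascading the paragraph above describes.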

Insert carry look-ahead pages here.
Conclusion
I expect this to work, and to be the fastest in the class. If I were to take the two gate level
XORs and make them one gate level, this would be unstoppable. I had a lot of fun
doing this lab, and I look forward to the next labs.

Testing Output Waveforms
1 bit ALU testing
32 bit ALU testing

Appendix A
Verilog Code
/*
Gabe Rowe
EE 471
Lab #2
32 bit ALU with Carry Look Ahead, and minimized logic on set less than and overflow.
*/
module alu_32_bit(bus_a,bus_b,op_code,result_bus,zero_detect,overflow_detect,carryout);
input [31:0] bus_a, bus_b;
input [2:0] op_code;
output [31:0] result_bus;
output zero_detect,overflow_detect,carryout;
wire [15:0] p0,p1,g0,g1,ci0,ci1;
wire gnd=0;
set set0(bus_a[31],bmux31,ci1[15],set_less_than);
overflow overflow0(bus_a[31],bmux31,ci1[15],overflow_detect);
nor nor0(zero_detect,
result_bus[31],result_bus[30],result_bus[29],result_bus[28],result_bus[27],result_bus[26],result_bus[25],result_bus[24],
result_bus[23],result_bus[22],result_bus[21],result_bus[20],result_bus[19],result_bus[18],result_bus[17],result_bus[16],
result_bus[15],result_bus[14],result_bus[13],result_bus[12],result_bus[11],result_bus[10],result_bus[9],result_bus[8],
result_bus[7],result_bus[6],result_bus[5],result_bus[4],result_bus[3],result_bus[2],result_bus[1],result_bus[0]);
//These two 16 bit carry look ahead blocks make up a 32 bit carry lookahead block
cla_16_bit cla_16_bit0(op_code[2],p0,g0,ci0,gg0,pg0);
cla_16_bit cla_16_bit1(c16,p1,g1,ci1,gg1,pg1);
and #50 and0(pre_c16,pg0,op_code[2]);
and #50 and1(pre_c32_1,pg1,gg0);
and #50 and2(pre_c32_2,pg0,pg1,op_code[2]);
or #50 or0(c16,pre_c16,gg0);
or #50 or1(carryout,pre_c32_1,pre_c32_2,gg1);
alu_1_bit alu_1_bit0(bus_a[0],bus_b[0],op_code[2],op_code[2],
{op_code[1],op_code[0]},set_less_than,p0[0],g0[0],bmux0,outsum0,result_bus[0]);
alu_1_bit alu_1_bit1(bus_a[1],bus_b[1],ci0[1],op_code[2],
{op_code[1],op_code[0]},gnd,p0[1],g0[1],bmux1,outsum1,result_bus[1]);
alu_1_bit alu_1_bit16(bus_a[16],bus_b[16],c16,op_code[2],
endmodule
module alu_1_bit(a,b,cin,binv,op,less,p,g,bmux,sum,result);
input [1:0] op;
input a,b,cin,binv,less;
output p,g,bmux,sum,result;
b_mux b_mux0(b,binv,bmux);
pfa pfa0(a,bmux,cin,g,p,sum);
mux_4_to_1 mux_4_to_1_0(op,g,p,sum,less,result);
endmodule
module pfa(a,b,cin,g,p,sum);
input a,b,cin;
output g,p,sum;
and #50 and0(g,a,b);
or #50 or0(p,a,b);
xor #50 xor0(sum,a,b,cin);
endmodule
module cla_16_bit(cin,p,g,ci,gg_out,pg_out);
input cin;
input [15:0] p,g;
output [15:0] ci;
output gg_out,pg_out;
wire [3:0] gg,pg,ci_main;
cla_4_bit cla_4_bit0(cin,{p[3],p[2],p[1],p[0]},{g[3],g[2],g[1],g[0]},ci[1],ci[2],ci[3],gg[0],pg[0]);
cla_4_bit cla_4_bit1(ci[4],{p[7],p[6],p[5],p[4]},{g[7],g[6],g[5],g[4]},ci[5],ci[6],ci[7],gg[1],pg[1]);
cla_4_bit cla_4_bit_main(cin,pg,gg,ci[4],ci[8],ci[12],gg_out,pg_out);
endmodule
module cla_4_bit(cin,p,g,c1,c2,c3,gg,pg);
input cin;
input [3:0] p,g;
output c1,c2,c3;
output gg,pg;
and #50 and0(c1_and0,p[0],cin);
and #50 and1(c2_and0,p[1],g[0]);
and #50 and2(c2_and1,p[1],p[0],cin);
and #50 and3(c3_and0,p[2],g[1]);
and #50 and4(c3_and1,p[2],p[1],g[0]);
and #50 and5(c3_and2,p[2],p[1],p[0],cin);
and #50 and6(c4_and0,p[3],g[2]);
and #50 and7(c4_and1,p[3],p[2],g[1]);
and #50 and8(c4_and2,p[3],p[2],p[1],g[0]);
and #50 and9(pg,p[3],p[2],p[1],p[0]);
or #50 or0(c1,g[0],c1_and0);
or #50 or1(c2,g[1],c2_and0,c2_and1);
or #50 or2(c3,g[2],c3_and2,c3_and1,c3_and0);
or #50 or3(gg,g[3],c4_and2,c4_and1,c4_and0);
endmodule
module overflow(a,b,cin,overflow_detect);
input a,b,cin;
output overflow_detect;
not not0(not_a, a);
not not1(not_b, b);
not not2(not_cin, cin);
and #50 and0(and_a_b_not_cin,a,b,not_cin);
and #50 and1(and_not_a_not_b_cin,not_a,not_b,cin);
or #50 or0(overflow_detect,and_a_b_not_cin,and_not_a_not_b_cin);
endmodule
module set(a,b,cin,set_less_than);
input a,b,cin;
output set_less_than;
not not0(not_cin, cin);
and #50 and0(a_and_b,a,b);
and #50 and1(a_and_not_cin,a,not_cin);
and #50 and2(b_and_not_cin,b,not_cin);
or #50 or0(set_less_than,a_and_b,a_and_not_cin,b_and_not_cin);
endmodule
module mux_4_to_1(sel, in0, in1, in2, in3, out);
input [1:0] sel;
input in0, in1, in2, in3;
output out;
wire [1:0] not_sel;
not not0(not_sel[0], sel[0]);
not not1(not_sel[1], sel[1]);
and #50 and0(sel_in0, in0, not_sel[1], not_sel[0]); //00
and #50 and1(sel_in1, in1, not_sel[1], sel[0]); //01
and #50 and2(sel_in2, in2, sel[1], not_sel[0]); //10
and #50 and3(sel_in3, in3, sel[1], sel[0]); //11
or #50 or0(out, sel_in0, sel_in1, sel_in2, sel_in3);
endmodule
module b_mux(b, binv, bmux);
input b, binv;
output bmux;
xor #50 xor0(bmux, binv, b);
endmodule
