Verilog HDL
Project IT Autumn 2016
Mahdad Davari
<mahdad.davari@it.uu.se>
Programmable Devices
Since 1969: PROM, (E)EPROM, PAL, PLA, GAL, CPLD, FPGA
Key players in the programmable-device industry:
Altera (first CPLD)
Xilinx (first FPGA)
FPGA from a Birds-Eye View
FPGA in a Nutshell
Logic Slice
Look-Up Table (LUT)
[Figure: a 3-input LUT; the inputs a, b, c select one of eight SRAM cells holding 0, 1, 0, 1, 0, 1, 1, 1]
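A look-up table of this kind can be modelled in Verilog as a constant that is indexed by the inputs. The following is only a minimal sketch, not vendor code: the module name lut3 and the parameter INIT are hypothetical, and it assumes that the eight SRAM cells are listed from address 0 to address 7 with a as the most significant address bit.

```verilog
// Hypothetical 3-input LUT; the eight SRAM cells are modelled as a parameter.
// Assumption: the cell values 0,1,0,1,0,1,1,1 are listed from address 0 to
// address 7, so INIT[0] = 0 ... INIT[7] = 1, which is 8'b1110_1010.
module lut3 #(parameter [7:0] INIT = 8'b1110_1010)
             (input a, input b, input c, output y);
  wire [7:0] cells = INIT;        // stands in for the configuration SRAM
  // The inputs form the address of the cell that drives the output.
  assign y = cells[{a, b, c}];
endmodule
```

Under these assumptions the table implements y = c | (a & b); loading different cell contents turns the same hardware into any other 3-input function.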
FPGA Birds-Eye View
Roadmap
Programmable Devices
FPGA Design Flow
FPGA vs GP-CPU vs ASIC
Accelerator Design Example
Verilog HDL Example
FPGA Design Flow
Design Entry (RTL design using HDL)
Behavioral Simulation (ModelSim): Behaviour OK? If NO, revise the design
Synthesis (Quartus II)
Timing Analysis (Quartus II): Speed OK? If NO, revise the design
CPU vs FPGA vs ASIC
[Figure: CPU, FPGA, and ASIC ordered along an axis running from High to Low]
One Monday Morning
FFT algorithm on CPU
Butterfly Operation
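The 2-point butterfly forms the sum and the difference of its two inputs. A minimal sketch, assuming real, signed operands and leaving out the twiddle-factor multiplication and complex arithmetic of a full FFT; the module name bf2 and the parameter WIDTH are hypothetical.

```verilog
// Hypothetical 2-point butterfly on real, signed operands.
// Twiddle factors and complex arithmetic are deliberately left out.
module bf2 #(parameter WIDTH = 8)
            (input  signed [WIDTH-1:0] x0, x1,
             output signed [WIDTH:0]   y0, y1);
  // The outputs are one bit wider than the inputs so that
  // neither the sum nor the difference can overflow.
  assign y0 = x0 + x1;
  assign y1 = x0 - x1;
endmodule
```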
4-Point Butterfly Operation
[Figure: 4-point butterfly built from four 2-Point BF blocks in two stages; inputs X0, X2, X1, X3 from top to bottom, outputs Y0, Y1, Y2, Y3]
8-Point Butterfly Operation
16-Point Butterfly Operation
32-Point Butterfly Operation
Speedup
Top-Down Design
[Figure: 8-Point FFT as a single block with inputs 0 to 7 and outputs O0 to O7]
Top-Down Design
[Figure: Top: 8-Point FFT decomposed into three columns of four 2-Point BF blocks each; every BF has inputs i0, i1 and outputs o0, o1; signal lines 0 to 7 lead to outputs O0 to O7]
Top-Down Design
[Figure: Top: 8-Point FFT; the three columns of 2-Point BF blocks form FFT:Stage 1, FFT:Stage 2, and FFT:Stage 3, with pipeline registers Pipe1, Pipe2, and Pipe3]
Top-Down Design
[Figure: Top: 8-Point FFT with FFT:Stage 1 to 3 and Pipe1 to Pipe3; the inputs enter in bit-reversed order I0, I4, I2, I6, I1, I5, I3, I7 on lines 0 to 7, and the outputs O0 to O7 leave in natural order]
Top-Down Design
Top: 8-Point FFT, leaf module 2-Point BF
// behavioural description
assign y0 = x0 + x1;
assign y1 = x0 - x1;
// equivalent structural description
// Adder Add1 (y0, x0, x1);
// Subtractor Sub1 (y1, x0, x1);
endmodule
Bottom-Up Implementation
[Figure: 2-Point BF blocks on lines 0 to 3; each BF takes X0, X1 on ports i0, i1 and produces Y0, Y1 on ports o0, o1]
module fft (i0, i1, i2, i3, i4, i5, i6, i7,
            o0, o1, o2, o3, o4, o5, o6, o7);
input i0, i1, i2, i3, i4, i5, i6, i7;
output o0, o1, o2, o3, o4, o5, o6, o7;
Top-Down Design
Top: 8-Point FFT
endmodule
wire [7:0] w1;
wire [7:0] w2;
wire [7:0] w3;
// Stage outputs are connected bit by bit: fft has eight separate output ports,
// and w1[0:7] is not a legal part-select of a wire declared as [7:0].
fft stage1 (i[0], i[4], i[2], i[6], i[1], i[5], i[3], i[7],
            w1[0], w1[1], w1[2], w1[3], w1[4], w1[5], w1[6], w1[7]);
fft stage2 (pipe1[0], pipe1[2], pipe1[1], pipe1[3], pipe1[4], pipe1[6], pipe1[5], pipe1[7],
            w2[0], w2[1], w2[2], w2[3], w2[4], w2[5], w2[6], w2[7]);
fft stage3 (pipe2[0], pipe2[4], pipe2[2], pipe2[6], pipe2[1], pipe2[5], pipe2[3], pipe2[7],
            w3[0], w3[1], w3[2], w3[3], w3[4], w3[5], w3[6], w3[7]);
Bottom-Up Implementation
// continued from the previous slide
always @ (posedge clk)
begin
  if (rst)
  begin
    // three equivalent ways of writing a 9-bit zero
    pipe1 <= 9'b000000000;
    pipe2 <= 9'd0;
    pipe3 <= 0;
  end
  else
  begin
    // bit 8 carries the valid flag along with the stage result
    pipe1 <= {valid, w1};
    pipe2 <= {pipe1[8], w2};
    pipe3 <= {pipe2[8], w3};
  end
end
endmodule
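The slides do not show how pipe1 to pipe3 are declared or how the outputs are taken from the last register. The following is a minimal sketch under stated assumptions: each register is 9 bits wide, bit 8 carries the valid flag and bits 7:0 the data, ready is taken from pipe3[8], and the FFT stage logic is replaced by a plain pass-through. The names valid_pipe, d, and q are hypothetical.

```verilog
// Hypothetical 3-stage pipeline in which a valid bit travels with the data.
// The stage logic between the registers is replaced by a pass-through.
module valid_pipe (input clk, input rst, input valid,
                   input [7:0] d, output [7:0] q, output ready);
  reg [8:0] pipe1, pipe2, pipe3;   // bit 8 = valid flag, bits 7:0 = data

  always @(posedge clk) begin
    if (rst) begin                 // synchronous, active-high reset
      pipe1 <= 9'd0;
      pipe2 <= 9'd0;
      pipe3 <= 9'd0;
    end else begin
      pipe1 <= {valid, d};         // the flag enters together with the data
      pipe2 <= pipe1;
      pipe3 <= pipe2;
    end
  end

  assign q     = pipe3[7:0];
  assign ready = pipe3[8];         // assumed: ready marks a valid result
endmodule
```

Because the flag is registered alongside the data, ready rises exactly three clock edges after valid was sampled, without a separate counter.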
Testbench
[Figure: Testbench; an Input Generator supplies the Input, and the Output is compared (==) with the Expected Result; a match reports "Test OK!"]
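Testbench
module fft_tb;
  reg clk, rst, valid;
  reg [8:0] i;
  wire [8:0] o;
  wire ready;
  // free-running clock with a period of 10 time units
  always
    #5 clk = ~clk;
  initial
  begin
    rst = 1'b1; clk = 0; valid = 0;  // start in reset (the design tests if (rst))
    rst = #20 1'b0;                  // release the reset after 20 time units
    i = #20 8'hff;
    valid = 1'b1;
    valid = #10 1'b0;                // valid is high for one clock period
    #50 $finish;
  end
endmodule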
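The compare-with-expected scheme from the testbench diagram can be sketched as a self-checking testbench. This is an illustration only: it tests a small hypothetical 2-point butterfly (bf2, included here so the block is self-contained) rather than the full FFT, and the names bf2_tb, e0, e1, and errors are assumptions.

```verilog
`timescale 1ns/1ps

// Hypothetical design under test: 2-point butterfly on signed operands.
module bf2 #(parameter WIDTH = 8)
            (input  signed [WIDTH-1:0] x0, x1,
             output signed [WIDTH:0]   y0, y1);
  assign y0 = x0 + x1;
  assign y1 = x0 - x1;
endmodule

module bf2_tb;
  reg  signed [7:0] x0, x1;      // inputs from the input generator
  wire signed [8:0] y0, y1;      // outputs of the design under test
  reg  signed [8:0] e0, e1;      // expected results
  integer k, errors;

  bf2 dut (.x0(x0), .x1(x1), .y0(y0), .y1(y1));

  initial begin
    errors = 0;
    for (k = 0; k < 20; k = k + 1) begin
      x0 = $random;              // input generator (truncated to 8 bits)
      x1 = $random;
      e0 = x0 + x1;              // reference model
      e1 = x0 - x1;
      #10;                       // let the outputs settle
      if (y0 !== e0 || y1 !== e1) begin
        $display("FAIL: x0=%0d x1=%0d y0=%0d y1=%0d", x0, x1, y0, y1);
        errors = errors + 1;
      end
    end
    if (errors == 0) $display("Test OK!");
    $finish;
  end
endmodule
```

Using !== instead of != makes the check fail on X or Z values as well, which catches unconnected or undriven outputs.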
Net Types in Verilog
Wire
Used only as a connector, or
on the left-hand side of assign, e.g. assign w = a & b;
Reg
Implements combinatorial or sequential logic
Assigned inside always blocks
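The difference can be sketched with one AND gate described twice. This is a minimal illustration; the module and signal names are hypothetical.

```verilog
// The same AND gate, once driven through a wire and once through a reg.
module net_types (input a, input b, output w, output reg r);
  // wire (the default type of an output): driven by a continuous assignment
  assign w = a & b;

  // reg: assigned inside an always block; because the block is sensitive
  // to all of its inputs, this is still purely combinatorial logic
  always @(*)
    r = a & b;
endmodule
```

Both outputs synthesize to the same gate: declaring a signal as reg does not by itself create a flip-flop.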
Combinatorial vs. Sequential
// combinatorial // sequential
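A minimal sketch of the two styles, shown on a 2-to-1 multiplexer. It is an illustration rather than the original slide code, and the module and signal names are hypothetical.

```verilog
// The same multiplexer, once combinatorial and once registered.
module comb_vs_seq (input clk, input sel, input a, input b,
                    output reg y_comb, output reg y_seq);
  // combinatorial: re-evaluated whenever any input changes,
  // written with blocking assignments (=)
  always @(*)
    y_comb = sel ? a : b;

  // sequential: updated only on the rising clock edge,
  // written with non-blocking assignments (<=)
  always @(posedge clk)
    y_seq <= sel ? a : b;
endmodule
```

y_comb follows its inputs immediately, while y_seq holds its value between clock edges and therefore synthesizes to a flip-flop.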
Two-Dimensional Input Ports
module myModule (input [7:0] i [0:3], output [7:0] o [0:3]);
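Array ports like the one above are SystemVerilog syntax; plain Verilog-2001 does not accept unpacked arrays in a port list. One common workaround, sketched here as an assumption rather than as the course's recommended style, flattens the four bytes into one 32-bit vector and slices it with indexed part-selects. The names my_module_flat, i_flat, o_flat, and elem are hypothetical, and the body only passes the data through.

```verilog
// Verilog-2001 alternative to a port declared as [7:0] i [0:3]:
// the four 8-bit elements are packed into one 32-bit vector.
module my_module_flat (input [4*8-1:0] i_flat, output [4*8-1:0] o_flat);
  genvar k;
  generate
    for (k = 0; k < 4; k = k + 1) begin : slice
      // element k occupies bits 8*k+7 down to 8*k
      wire [7:0] elem = i_flat[8*k +: 8];
      assign o_flat[8*k +: 8] = elem;   // placeholder: pass-through
    end
  endgenerate
endmodule
```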
Useful References
https://www.doulos.com/knowhow/verilog_designers_guide/ (good starting point into Verilog)
https://inst.eecs.berkeley.edu/~cs150/Documents/Nets.pdf (net types in Verilog, wire vs. reg)
http://www.asic-world.com/tidbits/blocking.html (blocking vs. non-blocking assignments, see the example)
http://web.mit.edu/6.111/www/f2007/handouts/L06.pdf (another reference for blocking vs. non-blocking assignments and finite-state-machine design; slides 1 to 7 and slides 11 to 15)
http://www.asic-world.com/verilog/art_testbench_writing1.html (writing testbenches in Verilog)
http://www.rfwireless-world.com/source-code/ (useful source code examples; jump to the Verilog part)
http://www.fpl2016.org/slides/Gupta%20--%20Accelerating%20Datacenter%20Workloads.pdf (HARP-related material)
http://web.cs.ucla.edu/~haoyc/pdf/dac16.pdf (HARP-related paper)
https://pdfs.semanticscholar.org/8b8f/8cb7885bc751fa919d216d96caf4a0234717.pdf (HARP-related paper)
Thank you!