Richard E. Haskell and Darrin M. Hanna CSE Dept., Oakland University, Rochester, Michigan 48309
ABSTRACT
Everyone agrees that engineering programs should teach the fundamentals. The disagreement comes in deciding what the fundamentals are. Perhaps we can agree that the fundamentals are those concepts, theories, and practices that have long-term staying power and will still be useful decades into the future. How does such a viewpoint apply to the teaching of digital design? In this paper we argue that it is the behavioral specifications of digital components and systems that have not changed and are therefore the fundamentals that should be the centerpiece of courses in digital design. This paper describes a sophomore/junior course in computer hardware design that we have taught at Oakland University for the past six years. In this course the students begin by studying basic logic gates and circuits and then proceed to design a complete 16-bit stack-based microprocessor using VHDL and implement it in a Xilinx FPGA. During the last three weeks of the course the students, working in groups of 3-4, complete a project in which they write a software program and compile it to execute on the custom microprocessor that they have designed. These projects have ranged from video games to a real-time software debugger. The only way that all of this material can be included in a single course is to focus on the fundamentals.
Introduction
The first digital circuits used relays, the original means of speaking binary, the fundamental language of computing. But no one would argue that relays, or the electromagnetic theory on which they rely, are fundamental to digital design today. The same could be said of vacuum tubes, transistors, TTL, CMOS, PLDs, CPLDs, FPGAs, ASICs, or any other implementation technology that is popular at a particular time.
Because any logic circuit can be made using AND, OR, NAND, NOR, and NOT gates, which represent a higher level of logic than the transistors from which they are built, these gates have been considered fundamental for some time. In fact, any logic circuit can be made from only NAND gates or from only NOR gates, which gives them the status of universal gates. Yet if one peeks inside a modern silicon device such as an FPGA, one looks in vain for any of these gates: the logic is implemented by the lookup tables within the FPGA's configurable logic blocks (CLBs).
Most digital logic courses spend considerable time on reducing logic equations, particularly by using Karnaugh maps to derive minimized Boolean expressions. But engineers who design modern digital circuits for a living never use Karnaugh maps. The vast majority of digital systems today are designed using either Verilog or VHDL, the dominant present-day hardware description languages. Surely the syntax and semantics of a particular hardware description language, no matter how universally used, cannot be truly fundamental, since that language may be replaced by a better one in the future.
So what is it that has remained constant over the last several decades of digital design? In this paper we will argue that it is the behavioral specifications of digital components and systems that have not changed and are therefore the fundamentals that should be the centerpiece of courses in digital design.
Proceedings of the 2005 ASEE North Central Conference Copyright 2005, American Society for Engineering Education
For example, the important thing to know about a 2 x 1 multiplexer with inputs A, B, and S and output Z is that Z is equal to A when S = 0 and equal to B when S = 1. It is much less important to know what particular arrangement of gates can implement this multiplexer, because a particular implementation, for example in an FPGA, may not contain any gates at all. This viewpoint has important implications for what topics should be taught in a digital design course and how they should be taught.
In this paper we describe a sophomore/junior course in computer hardware design that we have taught at Oakland University for the past six years. In this course the students begin by studying basic logic gates, combinational logic circuits, and sequential circuits by focusing on the behavior of these circuits. The students then proceed to design a complete 16-bit stack-based microprocessor using VHDL and implement it in a Xilinx FPGA. At the end of the course the students, working in groups of 3-4, complete a project in which they write a software program and compile it to execute on the custom microprocessor that they have designed. These projects have ranged from video games to a real-time software debugger. The only way that all of this material can be included in a single course is to focus on the fundamentals.
Behavior, Implementation, and History
The course, CSE 378, Computer Hardware Design, is a junior-level course taken by all computer engineering and computer science majors at Oakland University.
At the beginning of the course we tell the students that everything we cover will fall into one of three categories: 1) the behavior of a digital circuit, system, or component (which we argue is the most fundamental in the sense of having long-term staying power); 2) the implementation of a particular digital circuit, system, or component (which is fun and uses the latest technology, which will likely be replaced by a newer technology that is even more fun next year); and 3) history (such as how the totem-pole output of a TTL chip works), which is, well, history. At each stage in the course we urge the students to decide into which category the particular topic of the day falls. Learning material in the third category (history) gives the students the perspective of understanding how we got to the current state of affairs. Learning material in the second category (implementation) will help the student get a job next year. Learning material in the first category (behavior) will help the student get a job ten years from now.
How do we describe the behavior of digital circuits? We describe it in words. For example: The output of an AND gate is HIGH only if all inputs are HIGH. The output of an OR gate is LOW only if all inputs are LOW. The output of a NAND gate is LOW only if all inputs are HIGH. The output of a NOR gate is HIGH only if all inputs are LOW. For larger circuits it is convenient to use some type of hardware description language (HDL) to describe the behavior of the circuit. We chose to use VHDL, but Verilog (or even C) could also be used. Using VHDL or Verilog has the advantage that the designs can be simulated and synthesized using widely available tools. We use Aldec Active-HDL for simulation and the Xilinx ISE Project Navigator for synthesis to Xilinx FPGAs. In this course, each student purchases the Spartan-3 board from Digilent.1
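As a small illustration of describing behavior before implementation, the four word descriptions of the AND, OR, NAND, and NOR gates given above can be transcribed almost literally into executable form. The sketch below uses Python only as an assumed stand-in for a behavioral notation (the course itself uses VHDL); the function names and the truth_table helper are hypothetical and are not part of the course materials.

```python
from itertools import product

def and_gate(*inputs):
    """Output is HIGH (1) only if all inputs are HIGH."""
    return int(all(inputs))

def or_gate(*inputs):
    """Output is LOW (0) only if all inputs are LOW."""
    return int(any(inputs))

def nand_gate(*inputs):
    """Output is LOW only if all inputs are HIGH."""
    return int(not all(inputs))

def nor_gate(*inputs):
    """Output is HIGH only if all inputs are LOW."""
    return int(not any(inputs))

def truth_table(gate, n_inputs=2):
    """Return (inputs, output) rows for every input combination."""
    return [(bits, gate(*bits)) for bits in product((0, 1), repeat=n_inputs)]
```

Calling truth_table(nand_gate) enumerates all four input pairs, and the same functions accept three or more inputs without change, which reflects that the word descriptions say nothing about how many inputs a gate has.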
The Fundamentals of Combinational and Sequential Circuits
The fundamentals of basic combinational and sequential circuits are introduced by describing their behavior in terms of VHDL statements within a VHDL architecture. For example, an 8-line, 2 x 1 multiplexer is described in Figure 1, a 4-bit adder is described in Figure 2, and a 4-bit shifter is described in Figure 3. The VHDL code in all of these examples can be simulated and synthesized to an FPGA directly.
architecture mux2g_arch of mux2g is
begin
  mux2_1: process(a, b, sel)
  begin
    if sel = '0' then
      y <= a;
    else
      y <= b;
    end if;
  end process mux2_1;
end mux2g_arch;
[Fig. 1: multiplexer block symbol with inputs a(n-1:0), b(n-1:0), and sel and output y(n-1:0)]
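The claim that the behavior of the multiplexer matters more than any particular arrangement of gates can be checked mechanically. In the sketch below, mux2 states the behavior of Figure 1, while mux2_gates is one assumed AND/OR/NOT arrangement; both names, the 8-bit WIDTH, and the use of Python instead of VHDL are illustrative assumptions.

```python
WIDTH = 8                      # assumed bus width for this example
MASK = (1 << WIDTH) - 1

def mux2(a, b, sel):
    """Behavior: y = a when sel = 0, otherwise y = b."""
    return a if sel == 0 else b

def mux2_gates(a, b, sel):
    """One possible gate arrangement: y = (a AND NOT sel) OR (b AND sel)."""
    sel_bus = MASK if sel else 0          # sel fanned out to every bit
    not_sel_bus = ~sel_bus & MASK         # NOT sel on every bit
    return (a & not_sel_bus) | (b & sel_bus)

def implementations_agree():
    """Exhaustively compare behavior and gate arrangement for all inputs."""
    return all(
        mux2(a, b, s) == mux2_gates(a, b, s)
        for a in range(MASK + 1)
        for b in range(MASK + 1)
        for s in (0, 1)
    )
```

Any other arrangement, such as a lookup table, could replace mux2_gates, and the same exhaustive check would still be the acceptance criterion.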
architecture adder4 of adder4 is
begin
  process(A,B)
    variable temp: STD_LOGIC_VECTOR(4 downto 0);
  begin
    temp := ('0' & A) + ('0' & B);
    S <= temp(3 downto 0);
    carry <= temp(4);
  end process;
end adder4;
[Fig. 2: adder4 block symbol with inputs A(3:0) and B(3:0) and outputs S(3:0) and carry]
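The adder of Figure 2 widens both operands by one bit so that the carry appears as the fifth bit of the sum. A minimal software sketch of that same behavior, assuming Python as the modeling language and a hypothetical function name adder4, is:

```python
def adder4(a, b):
    """Return (s, carry) for two 4-bit unsigned operands."""
    if not (0 <= a <= 0xF and 0 <= b <= 0xF):
        raise ValueError("operands must fit in 4 bits")
    temp = a + b               # at most 30, so it fits in 5 bits like temp(4 downto 0)
    s = temp & 0xF             # low four bits, temp(3 downto 0)
    carry = (temp >> 4) & 1    # fifth bit, temp(4)
    return s, carry
```

For every pair of operands the outputs satisfy 16 * carry + s = a + b, which is a compact way of stating what the adder must do regardless of how it is built.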
architecture shifter_arch of shifter is
begin
  shift_1: process(D, s)
  begin
    case s is
      when "00" =>    -- no shift
        Y <= D;
      when "01" =>    -- U2/
        Y <= '0' & D(width-1 downto 1);
      when "10" =>    -- 2*
        Y <= D(width-2 downto 0) & '0';
      when "11" =>    -- 2/
        Y <= D(width-1) & D(width-1 downto 1);
      when others =>  -- no shift
        Y <= D;
    end case;
  end process shift_1;
end shifter_arch;
[Shifter block symbol with inputs D3, D2, D1, D0, s1, and s0]
s1  s0  Operation  Y3  Y2  Y1  Y0
0   0   noshift    D3  D2  D1  D0
0   1   U2/        0   D3  D2  D1
1   0   2*         D2  D1  D0  0
1   1   2/         D3  D3  D2  D1
Fig. 3 Describing the behavior of a 4-bit shifter
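The four cases of the shifter in Figure 3 correspond to no shift, a logical right shift (U2/), a left shift (2*), and a right shift that copies the top bit (2/). The following sketch models that table on integers; the function name shifter and the default width of 4 are assumptions made for illustration, and Python stands in for the VHDL used in the course.

```python
def shifter(d, s, width=4):
    """Model of the shifter: s selects no shift, U2/, 2*, or 2/."""
    mask = (1 << width) - 1
    d &= mask
    msb = (d >> (width - 1)) & 1
    if s == 0b01:                              # U2/: shift right, fill with 0
        return d >> 1
    if s == 0b10:                              # 2*: shift left, drop the top bit
        return (d << 1) & mask
    if s == 0b11:                              # 2/: shift right, keep the top bit
        return (d >> 1) | (msb << (width - 1))
    return d                                   # "00" and all other codes: no shift
```

With D = 1011, the four select codes give 1011, 0101, 0110, and 1101, which matches the columns of the table above row by row.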
As another example, the Shift and Add 3 algorithm for converting an 8-bit binary number to BCD is shown in Figure 4. The VHDL description of this behavior is shown in Figure 5 and the result of the simulation of this code is shown in Figure 6.
Binary-to-BCD Conversion:
1. Shift the binary number left one bit.
2. If 8 shifts have taken place, the BCD number is in the Hundreds, Tens, and Units columns.
3. If the binary value in any of the BCD columns is 5 or greater, add 3 to that value in that BCD column.
4. Go to 1.
Operation  Hundreds  Tens   Units  Binary
HEX                                F    F
Start                              1111 1111
Shift 1                         1  1111 111
Shift 2                        11  1111 11
Shift 3                       111  1111 1
Add 3                        1010  1111 1
Shift 4                 1    0101  1111
Add 3                   1    1000  1111
Shift 5                11    0001  111
Shift 6               110    0011  11
Add 3                1001    0011  11
Shift 7           1  0010    0111  1
Add 3             1  0010    1010  1
Shift 8          10  0101    0101
BCD               2     5       5
P               9:8   7:4     3:0
z             17:16 15:12    11:8  7:0
[Fig. 4: Shift and Add 3 example converting hex FF to BCD 255]
architecture binbcd_arch of binbcd is
begin
  bcd1: process(B)
    variable z: STD_LOGIC_VECTOR (17 downto 0);
  begin
    -- clear z, then load B already shifted left three places
    for i in 0 to 17 loop
      z(i) := '0';
    end loop;
    z(10 downto 3) := B;
    -- five remaining iterations: add 3 where needed, then shift left
    for i in 0 to 4 loop
      if z(11 downto 8) > 4 then        -- Units
        z(11 downto 8) := z(11 downto 8) + 3;
      end if;
      if z(15 downto 12) > 4 then       -- Tens
        z(15 downto 12) := z(15 downto 12) + 3;
      end if;
      z(17 downto 1) := z(16 downto 0);
    end loop;
    P <= z(17 downto 8);
  end process bcd1;
end binbcd_arch;
Fig. 5 VHDL behavior of Shift and add 3 algorithm in Fig. 4
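The Shift and Add 3 steps of Figure 4 can be cross-checked in software against ordinary decimal arithmetic. The sketch below is an assumed Python model, not the authors' code: the function name bin_to_bcd is hypothetical, and it performs all eight shifts in the loop instead of starting with three shifts already applied as the VHDL of Figure 5 does. The bit layout of z follows Figure 4.

```python
def bin_to_bcd(value):
    """Convert an 8-bit number to (hundreds, tens, units) by shift and add 3."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in 8 bits")
    # z layout: bits 7..0 binary, 11..8 units, 15..12 tens, 17..16 hundreds
    z = value
    for _ in range(8):
        # Add 3 to any BCD column holding 5 or more (never true before the
        # first shift, so checking first is equivalent to checking after).
        if (z >> 8) & 0xF >= 5:
            z += 3 << 8
        if (z >> 12) & 0xF >= 5:
            z += 3 << 12
        z = (z << 1) & 0x3FFFF          # shift the whole 18-bit word left
    return (z >> 16) & 0x3, (z >> 12) & 0xF, (z >> 8) & 0xF
```

For the example of Figure 4, bin_to_bcd(0xFF) returns (2, 5, 5). The hundreds column needs no correction because an 8-bit input never produces a hundreds digit above 2.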
The behavior of an R-S latch is given in Figure 7 and its simulation is shown in Figure 8. Note that this behavior differs from that of an R-S latch made from cross-coupled NOR gates in that it doesn't have a disallowed state in which both Q and NOT Q are zero when R and S are both 1. In our case the output Q remains unchanged when R and S are both 1. The students make a truth table for this version of an R-S latch and show how to implement one in terms of AND, OR, and NOT gates.
architecture rslatch of rslatch is
begin
  process(R,S)
  begin
    if S = '1' and R = '0' then
      Q <= '1';
    elsif S = '0' and R = '1' then
      Q <= '0';
    end if;
  end process;
end rslatch;
[Fig. 7: R-S latch block symbol with inputs R and S]
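The truth-table exercise described for this R-S latch can be sketched in software. In the assumed Python model below, rs_next states the behavior of Figure 7 and rs_next_gates is one possible AND/OR/NOT expression, Q+ = S·R' + Q·(S + R'), derived here from that truth table. The names are hypothetical and the expression is not necessarily the one the students produce.

```python
def rs_next(q, r, s):
    """Behavior: set on S=1,R=0; reset on S=0,R=1; otherwise hold Q."""
    if s == 1 and r == 0:
        return 1
    if s == 0 and r == 1:
        return 0
    return q

def rs_next_gates(q, r, s):
    """One AND/OR/NOT form: Q+ = (S AND NOT R) OR (Q AND (S OR NOT R))."""
    not_r = 1 - r
    return (s & not_r) | (q & (s | not_r))

class RSLatch:
    """Holds Q between updates, like the latch inferred from the VHDL."""
    def __init__(self, q=0):
        self.q = q

    def update(self, r, s):
        self.q = rs_next(self.q, r, s)
        return self.q
```

Checking the two functions against each other for all eight combinations of Q, R, and S confirms that the gate form keeps Q unchanged when R and S are both 1.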
The behavior of an n-bit register with an asynchronous clear and a load input is given in Figure 9. The behavior of an n-bit counter that can be used as a program counter is given in Figure 10. This counter has an asynchronous clear input and synchronous load and inc inputs. In addition to these examples the students use VHDL to describe the behavior of 7-segment decoders, comparators, decoders, Gray code converters, arithmetic logic units, ROMs, D latches, D flip-flops, and shift registers. For each of these cases they can synthesize the design to a Xilinx Spartan-3 FPGA by adding a wrapper that includes the switch and pushbutton inputs and the LED and 4-digit 7-segment display of their Digilent Spartan-3 board.
architecture reg_arch of reg is
begin
  process(clk, clr)
  begin
    if clr = '1' then
      for i in width-1 downto 0 loop
        q(i) <= '0';
      end loop;
    elsif rising_edge(clk) then
      if load = '1' then
        q <= d;
      end if;
    end if;
  end process;
end reg_arch;
[Fig. 9: reg block symbol with inputs d(n-1 downto 0), clr, load, and clk and output q(n-1 downto 0)]
architecture PC_arch of PC is
  signal q1: STD_LOGIC_VECTOR(width-1 downto 0);
begin
  process(clk, clr)
  begin
    if clr = '1' then
      for i in width-1 downto 0 loop
        q1(i) <= '0';
      end loop;
    elsif rising_edge(clk) then
      if load = '1' then
        q1 <= d;
      elsif inc = '1' then
        q1 <= q1 + 1;
      end if;
    end if;
  end process;
  q <= q1;
end PC_arch;
[Fig. 10: PC block symbol with inputs d(n-1:0), clr, clk, load, and inc and output q(n-1:0)]
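The program counter of Figure 10 gives load priority over inc and changes only on a rising clock edge or on clear. A minimal cycle-level sketch of that behavior follows; the class name ProgramCounter, its method names, and the choice of Python are assumptions made for illustration. The model assumes that the fixed-width addition wraps around modulo 2^width, and it treats the asynchronous clear as a separate call.

```python
class ProgramCounter:
    """Cycle-level model of an n-bit counter with clear, load, and inc."""

    def __init__(self, width=16):
        self.mask = (1 << width) - 1
        self.q = 0

    def clear(self):
        """Asynchronous clear: takes effect without a clock edge."""
        self.q = 0

    def clock(self, d=0, load=0, inc=0):
        """One rising clock edge; load has priority over inc."""
        if load:
            self.q = d & self.mask
        elif inc:
            self.q = (self.q + 1) & self.mask
        return self.q
```

Calling clock() with neither load nor inc asserted leaves q unchanged, which is also the complete behavior of the register in Figure 9 when inc is never used.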
The Design of a Stack-Based Microprocessor Core
Forth is a programming language that uses a data stack and postfix notation. Everything in Forth is a word, and every word is a module that does something useful. Forth words accept parameters on the data stack, execute, and return their results on the data stack. In CSE 378, students design and implement the FC16 Forth core shown in Figure 11. This microprocessor core contains four main components: the data stack, DataStack; the function unit, Funit16; the return stack, ReturnStack; and the controller, FC16_control. The FC16 also contains a program counter, PC, whose output, P, containing the address of the next instruction, is the input to the program ROM outside the FC16. The output of the ROM is the signal M, which can be loaded into the instruction register, IR; pushed onto the data stack through the multiplexer Tmux; or loaded into the program counter, PC, through the multiplexer Pmux. A detailed description of this Forth core is given in the references.2,3
[Fig. 11 block diagram: the program counter PC with its input multiplexer Pmux, the return stack ReturnStack with its input multiplexer Rmux, the instruction register IR, the data stack DataStack with its input multiplexer Tmux, the function unit Funit16, and the controller FC16_control; signals shown include P(15:0), M(15:0), SW(15:0), E1(15:0), E2(15:0), BN(3:0), T(15:0), N(15:0), y(15:0), y1(15:0), and Fcode(5:0)]
Fig. 11 The FC16 Core that executes Forth instructions
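The statement that Forth words take their parameters from the data stack and leave their results there can be made concrete with a very small postfix evaluator. The sketch below is an assumed Python model covering only six standard Forth words (+, -, *, DUP, SWAP, DROP) with 16-bit wraparound. The function name run_forth is hypothetical, and the model does not represent the FC16 instruction set or its encoding.

```python
MASK = 0xFFFF   # 16-bit cells, matching the width of the FC16 data path

def run_forth(source):
    """Evaluate a whitespace-separated postfix program; return the data stack."""
    stack = []
    for token in source.split():
        word = token.upper()
        if word == "+":
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) & MASK)
        elif word == "-":
            b, a = stack.pop(), stack.pop()
            stack.append((a - b) & MASK)
        elif word == "*":
            b, a = stack.pop(), stack.pop()
            stack.append((a * b) & MASK)
        elif word == "DUP":
            stack.append(stack[-1])
        elif word == "SWAP":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif word == "DROP":
            stack.pop()
        else:
            stack.append(int(token) & MASK)   # anything else is a number literal
    return stack
```

For example, run_forth("2 3 + 4 *") leaves the single value 20 on the stack, because + consumes 2 and 3 and * then consumes the sum and 4.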
The FC16 data stack, whose architecture is shown in Figure 12, is a modified 32x16 stack. It consists of two 16-bit registers holding the top and second elements of the data stack, followed by a 32x16 stack implemented using a dual-port RAM. These registers, Treg and Nreg, serve as false top and false second elements in the data stack, respectively. This architecture is necessary to support single-clock-cycle execution of Forth instructions, such as ROT, that involve the top three stack elements.
Forth programs can easily be compiled to hardware by translating the program to VHDL code for a ROM that contains the corresponding FC16 instructions. A C++ program that translates Forth programs to 68HC12 assembly language is described in Haskell.4 A modification of this C++ program has been used to produce a VHDL ROM array directly from a Forth program. This makes it easy to change programs quickly, compile them to a VHDL ROM, and download them to the FPGA for testing.
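The two-register data stack organization described above can be modeled in a few lines to suggest why ROT can complete in a single step. The sketch below is an assumed Python model, not the authors' VHDL: the class and method names are hypothetical, the third element N2 is taken to be the top of the RAM stack, and the 32-entry depth limit is ignored for brevity.

```python
class DataStack:
    """T and N registers in front of a RAM stack whose top entry is N2."""

    def __init__(self):
        self.t = 0        # Treg: top element
        self.n = 0        # Nreg: second element
        self.ram = []     # RAM stack: third and deeper elements

    def push(self, value):
        self.ram.append(self.n)
        self.n = self.t
        self.t = value & 0xFFFF

    def pop(self):
        value = self.t
        self.t = self.n
        self.n = self.ram.pop() if self.ram else 0
        return value

    def rot(self):
        """ROT ( n1 n2 n3 -- n2 n3 n1 ): all three updates happen together."""
        if not self.ram:
            raise IndexError("ROT needs three stack elements")
        n2 = self.ram[-1]
        # The right-hand side is evaluated first, like registers on one clock edge.
        self.t, self.n, self.ram[-1] = n2, self.t, self.n

    def top3(self):
        return self.t, self.n, (self.ram[-1] if self.ram else 0)
```

After pushing 1, 2, and 3, a call to rot() leaves T = 1, N = 3, and N2 = 2. Because the three elements are all directly visible, the rotation is a single simultaneous assignment instead of a sequence of pops and pushes.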
[Fig. 12 block diagram: registers Treg and Nreg, multiplexers Nmux and Smux, and the stack32x16 component with full and empty flags; inputs Tin(15:0) and y1(15:0); control signals tload, nload, ssel, dpush, dpop, clr, and clk; outputs T(15:0), N(15:0), and N2(15:0)]
Fig. 12 The data stack
The FC16 microprocessor core contains over 60 Forth instructions, 75% of which execute in a single clock cycle. Forth programs written for the FC16 typically execute at 25 MHz. After completing the design and implementation of the FC16 core the students work in groups of 2 or 3 to design and implement a project that involves the use of the FC16, perhaps with the addition of new hardware and new instructions.
Student Projects
Using the FC16 core developed throughout the course, CSE 378 students produce final projects in groups of three or four during the last three weeks. The project guidelines require students to fully develop software for a derivative of the FC16 core, made with whatever modifications they need to implement their project. Because Forth is a modular high-level language, and because of the knowledge and experience the students have gained in this FPGA-based hardware design course, substantial projects have been designed, implemented, tested, and demonstrated in only three weeks. In addition to creating the completed, working project, each group must deliver a 15-minute PowerPoint presentation, demonstrate the project in class, submit a written report detailing the processor and their project, and construct a creative poster for public display. Some of these projects are described below.
A Real-time Forth Compiler and Debugger
Developing Forth programs for the FC16 microprocessor requires that the compiled software be implemented in a program ROM. The ROM and FC16 project must then be resynthesized with ISE after every program change. In this project, students made hardware modifications and developed a Visual Basic application to streamline software development for the FC16.
Through the use of parallel FC16 processors implemented on the Spartan2E FPGA, together with the Visual Basic application, real-time compilation, execution, and debugging are realized for rapid development and testing of Forth code. Figure 13 shows a screenshot of the Visual Basic interactive debugger. This design required a modified FC16 for the compiler-debugger and only minor modifications to a slave FC16 that executes the programmer's code. This makes it easy for future students to replace the FC16 with a different processor and still use this interactive programmer and debugger. Figure 14 shows a diagram of the modifications made to the FC16 processor shown previously in Figure 11 for the real-time programmer and debugger. The CD component is the compiler-debugger. The SLAVE component is the FC16 processor shown in Figure 11 with some extra output signals connected to the CD. All of these modifications were designed and implemented by the student group. Figure 15 shows the hardware required inside the compiler-debugger, CD. This component is also a modified FC16 core.
[Fig. 14 block diagram: the CD and SLAVE components, the dpbram and ldreg components, and the debounce blocks db1 to db4; board signals shown include BN_IN, SW(1:8), LD(1:8), A(1:4), AtoG(6:0), RX, and TX]
Fig. 14 The Hardware for the Real-time Forth Compiler-Debugger and FC16 Microcontroller
[Fig. 15 block diagram of the compiler-debugger, CD: components prom, ireg, pcount, rstack, dstack, astack, acount, fram, alu, txreg, and ctrl, with multiplexers dmux, rmux, pmux, tmux, asmux, pamux, and pdmux; signals shown include RX, TX, pAddr(15:0), pData(15:0), pwe, and the debug inputs dbT, dbN, dbN2, dbR, dbR2, dbY, dbY1, dbP, dbM, and dbIcode]
A Video Game: Tetris
In this project, students implemented a hardware video driver that controls the horizontal sync, vertical sync, red, green, and blue signals for a VGA monitor and implemented the game of Tetris. They used a Nintendo game controller interfaced to the Digilent Digilab IIE Protoboard using open pins to allow players to rotate game pieces and accelerate the piece downward. These students also used the FPGA block RAM and created a Tetris Control Component to implement the nontrivial game logic. Figure 16 shows a block diagram of the hardware developed for this project and a screenshot of the VGA game in action.
Fig. 16 A Hardware Block Diagram for the FC16 Tetris Video Game
The Card Gallery
Students implemented a card gallery consisting of a generic card gaming engine and four card-game modules. The games are played using the buttons on the Digilent Digilab IIE Protoboard and a VGA screen controlled by a video driver that the students designed and implemented in the FPGA. The four games implemented were High-Low, Black Jack, In Between, and War. Figure 17 shows a block diagram of the Card Gallery including a screenshot of one of the games, In Between, in action.
[Fig. 17 block diagram of the Card Gallery: components WC16, Program ROM, PtrRAM, RNG, clkdiv, DigDisplay, and VGA Controller; board signals shown include SW(1:8), LD(1:8), A(1:4), AtoG(6:0), Data(7:0), LCD_RW, LCD_RS, and LCD_E]
All of these projects represent computer hardware design employing the fundamentals of digital logic, that is, the behavior of digital systems. These student projects were implemented on modern RAM-based FPGA technology using VHDL, simulated with Aldec's Active-HDL, and synthesized using Xilinx's ISE Foundation Tools.
Summary
In this paper we have argued that it is the behavioral specification of digital circuits that remains constant over time and is therefore more fundamental than any particular implementation technology. We have shown how this approach allows students in a junior-level course in computer hardware design to design and implement a complete 16-bit microprocessor core from scratch, then write a sophisticated high-level software program that performs some useful task, and execute this program on the microprocessor core they designed themselves.
References
1. www.digilentinc.com.
2. Haskell, R. E. and D. M. Hanna, Rapid Prototyping using a Microprocessor Core on a Spartan II FPGA, Proc. of the International Conference on Embedded Systems and Applications, ESA'03, pp. 49-55, Las Vegas, Nevada, USA, June 23-26, 2003.
3. Haskell, R. E. and D. M. Hanna, A VHDL Forth Core for FPGAs, Microprocessors and Microsystems, Vol. 28/3, pp. 115-125, Apr 2004.
4. Haskell, R. E., Design of Embedded Systems Using 68HC12/11 Microcontrollers, Prentice Hall, Upper Saddle River, NJ, 2000.
RICHARD E. HASKELL is Professor of Engineering in the Department of Computer Science and Engineering at Oakland University. He is the author of 15 books and has taught numerous undergraduate and graduate courses, including courses in microprocessors, embedded systems, and digital design using VHDL.
DARRIN M. HANNA is Assistant Professor of Engineering in the Department of Computer Science and Engineering at Oakland University. His primary areas of research are pattern recognition, artificial intelligence, and embedded systems.