
Department of Engineering

Australian National University

ENGN3213

Digital Systems & Microprocessors

Project

V1.0
Copyright 2009 ANU Engineering

Contents

1 Introduction

2 Reverse Polish Notation
2.1 History of Reverse Polish
2.2 Reverse Polish Notation
2.3 The Memory Device in Reverse Polish
2.4 The HP-35 Reverse Polish Algorithm

3 RPC Design and Specification
3.1 General
3.2 The PS/2 Keyboard Interface
3.3 The Seven Segment Display Interface
3.4 RTL design

4 Project Details
4.1 Keyboard and Display Interfaces
4.2 Implementation Levels of the RP Engine
4.2.1 RP Engine Level I
4.2.2 RP Engine Level II
4.2.3 RP Engine Level III
4.2.4 PEGASUS Board Peripherals
4.3 Project Rules
4.4 Assessment

A Appendix: A Description of the MU0 Microprocessor
A.1 Introduction
A.2 The Control Path Finite State Machine
A.3 MU0 in action
A.4 Running a Program on MU0
A.5 MU0 Assembly Language?

1 Introduction
In this course I have emphasised the Register Transfer Level (RTL) description of com-
plex digital systems. An early example of the technique was demonstrated in the design
and implementation of the MU0 microprocessor (described in detail in the appendix).
By now you should be familiar with the details of the operation of MU0 and ready to
apply the same approach to other designs. The design and operation of MU0 is the best
indicator so far of how you should approach the present project. In the labs, you have
also been learning many new things about the implementation of hardware in VERILOG
that can now be applied in a real design.
The project will give you the opportunity to use the RTL technique for the design of a
system of modest complexity: a reverse polish calculator with 4 significant decimal
digits. The project has various milestones among the specifications to allow you to do a
top-down design and to tackle the project at various levels of complexity with plenty of
scope for individual creativity. A major aspect of the project will be to explore different
approaches to developing the different hardware blocks, taking special account of meeting
spec and of synthesis in hardware.
Additional information can be found on the course website:

http://engnet.anu.edu.au/DEcourses/engn3213/Documents/PROJECT

2 Reverse Polish Notation

2.1 History of Reverse Polish


Reverse Polish notation or RPN is an arithmetic notation derived from the prefix (Polish)
notation introduced by the Polish mathematician Jan Łukasiewicz in the 1920s. During the
1960s and 1970s, RPN had some currency even among the general public, as it was widely
used in desktop calculators of the time. The Hewlett-Packard HP-35, shown in Figure 1,
was the world’s first handheld scientific calculator (1972) and was based on RPN ([1]).

Figure 1: The HP 35 calculator.

The arrival of the HP-35 was a significant event given the market dominance of slide rules
and mechanical calculators for engineering computations. The HP-35 used a traditional
floating decimal display that automatically switched to scientific notation. The fifteen
digit LED display was capable of displaying a 10 digit mantissa plus its sign and a dec-
imal point and a two digit exponent plus its sign. The display was unique in that the
multiplexing was designed to illuminate a single LED segment at a time, rather than a
single LED digit, because HP research had shown that this method was perceived by the
human eye as brighter for equivalent power.
Architecturally, the calculator was a bit-serial machine that processed 56-bit floating-point
numbers, representing 14-digit BCD (Binary Coded Decimal) numbers. Figure 2 shows
the main board of the HP-35. As you can see, integrated circuits in dual in-line packages
were the technology of the day.

Figure 2: The HP 35 main board.

2.2 Reverse Polish Notation


RPN is a simpler and more practical alternative to the conventional procedure for per-
forming arithmetic calculations that we learned in school. The latter method is reliant
on the use of parentheses and equals signs and is sometimes referred to as infix notation.
RPN is often referred to as postfix notation.
RPN is easiest to explain by example. Consider the following simple operation,

4 + 5 =

In RPN this expression is written,

4 ENTER 5 +

There is just one operation key referred to as “ENTER”. Computations are performed
incrementally and results are stored in memory as we proceed. Here is a more complex
example,

(4 + 5 × 2) / 7 =

In RP we would do,

4 ENTER 5 ENTER 2 × + ENTER 7 /

Note the logical manner in which the calculation proceeds and how parentheses and equals
signs are eliminated. To do the project, you will need to familiarise yourself with
Reverse Polish notation.
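
To check your own understanding of the two examples above, a few lines of software are enough. The following is only a minimal sketch in Python, not part of the project specification: the function name `eval_rpn` is made up for illustration, and ENTER is treated simply as a separator between operands.

```python
def eval_rpn(keys):
    """Evaluate a sequence of RPN key presses given as strings."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "×": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    stack = []
    for key in keys:
        if key == "ENTER":
            continue  # in this simple model ENTER only separates operands
        if key in ops:
            b = stack.pop()  # operand entered last
            a = stack.pop()  # operand entered before it
            stack.append(ops[key](a, b))
        else:
            stack.append(int(key))
    return stack[-1]


print(eval_rpn("4 ENTER 5 +".split()))                      # 9
print(eval_rpn("4 ENTER 5 ENTER 2 × + ENTER 7 /".split()))  # 2.0
```

Note that no parentheses or equals signs are needed: the order of the keys alone fixes the order of evaluation.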

2.3 The Memory Device in Reverse Polish


RP calculations require some form of limited memory to store variables and results. The
RP algorithm is suited to a special memory device referred to as a stack. A stack is a
computer term for a memory in which data is stored on a pile of registers. A stack is
analogous to a filing system in which the latest document to be filed is placed on top of
the document pile.
A stack is a Last In First Out (LIFO) memory. When a variable is stored, it is pushed
onto the stack. When a variable is to be retrieved, variables higher on the stack have to
be popped until we reach the desired variable. The stack does not need to be very deep
i.e. have many memory locations. The HP 35 stack has only four levels.
To see how the stack would be used in RP, consider the following examples,
Example 1

(4 + 2 × 5) / (1 + 2 × 3)

In RPN this is described by,

4 ENTER 2 ENTER 5 × + ENTER 1 ENTER 2 ENTER 3 × + /

In the following table the S1-S4 refer to the stack register levels. The register S1 would
be at the top of the stack in the document filing analogy. Hewlett-Packard [1] referred to
it as the bottom of the stack. From now on I refer to this as the input to the stack in
order to avoid confusion.
In the following example it is convenient to introduce an additional register that we
refer to as the KEY HOLDING register (KHR). Though the KHR has no role in
RP per se, it has several practical purposes here. One is to provide a register where
final output from the keyboard can be temporarily stored. Keyboards only provide one
digit at a time. Operands will therefore have to be built from digits prior to arithmetic
processing. Another application of the KHR is that it makes it easier to implement the
above RP algorithm.
Exactly how you handle input from the keyboard for RP processing is one of
the design decisions you will have to make in your project.

Input:   KHR   S4   S3   S2   S1
4        4     .    .    .    .
ENTER    4     4    .    .    .
2        2     4    .    .    .
ENTER    2     2    4    .    .
5        5     2    4    .    .
*        10    4    .    .    .
+        14    .    .    .    .
ENTER    14    14   .    .    .
1        1     14   .    .    .
ENTER    1     1    14   .    .
2        2     1    14   .    .
ENTER    2     2    1    14   .
3        3     2    1    14   .
*        6     1    14   .    .
+        7     14   .    .    .
/        2     .    .    .    .

The effect of ENTER is to push numbers onto the stack while leaving the current number
in the KHR. Note that in RPN operators are never stored on the stack. In the algorithm
described here the effect of an operator is that the RP controller pops the stack, triggers
the operation and places the result in the KHR.
In this implementation of RP, the ENTER key has to be pressed whenever
there is further input after an operator so that the last result is stored on the
stack and not overwritten by new input to the KHR. As we shall see, this choice
of implementation is by no means unique: the HP-35 handles the storage of prior results
in a different manner.
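
The behaviour traced in the table of Example 1 can be reproduced with a small software model. This is a sketch under stated assumptions rather than a reference design: the class name `RPCalculator`, the handling of a full stack (oldest entry dropped) and of an empty stack (an error) are choices made for this illustration only, and each operand is loaded into the KHR in one step instead of digit by digit.

```python
class RPCalculator:
    """Behavioural model of the KHR plus stack scheme traced in the table."""

    DEPTH = 4  # assumed number of stack levels (S1-S4)

    def __init__(self):
        self.khr = 0     # key holding register
        self.stack = []  # stack[0] is the input end of the stack

    def press(self, key):
        if key == "ENTER":
            # Push a copy of the KHR; the KHR itself is left unchanged.
            # Assumption: when the stack is full the oldest entry is lost.
            self.stack = ([self.khr] + self.stack)[: self.DEPTH]
        elif key == "CHS":
            self.khr = -self.khr
        elif key in ("+", "-", "*", "/"):
            # Pop the input end of the stack and combine it with the KHR.
            # Assumption: an empty stack is an error (IndexError).
            left = self.stack.pop(0)
            if key == "+":
                self.khr = left + self.khr
            elif key == "-":
                self.khr = left - self.khr
            elif key == "*":
                self.khr = left * self.khr
            else:
                self.khr = left / self.khr
        else:
            # A complete operand; building it from digits is not modelled.
            self.khr = int(key)


def trace(keys):
    """Return (key, KHR, stack) after every key press."""
    calc = RPCalculator()
    rows = []
    for key in keys.split():
        calc.press(key)
        rows.append((key, calc.khr, list(calc.stack)))
    return rows


for row in trace("4 ENTER 2 ENTER 5 * + ENTER 1 ENTER 2 ENTER 3 * + /"):
    print(row)
```

Each printed row corresponds to one row of the table: the key pressed, the KHR contents and the stack contents starting from the input end.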
Example 2

(-4 + 54) / (1 + 3 × (7+1))

In RPN this is described by,

4 CHS ENTER 54 + ENTER 1 ENTER 3 ENTER 7 ENTER 1 + × + /

Input:   KHR   S4   S3   S2   S1
4        4     .    .    .    .
CHS      -4    .    .    .    .
ENTER    -4    -4   .    .    .
54       54    -4   .    .    .
+        50    .    .    .    .
ENTER    50    50   .    .    .
1        1     50   .    .    .
ENTER    1     1    50   .    .
3        3     1    50   .    .
ENTER    3     3    1    50   .
7        7     3    1    50   .
ENTER    7     7    3    1    50
1        1     7    3    1    50
+        8     3    1    50   .
*        24    1    50   .    .
+        25    50   .    .    .
/        2     .    .    .    .

It should be clear from this example that in any RP calculation you will never
need to seek variables lower than the top level of the stack.

2.4 The HP-35 Reverse Polish Algorithm


The following discussion follows articles from the Hewlett-Packard journal describing the
HP-35 calculator ([1], [2]). Fig. 3 shows the instructions sticker posted on the back of
the calculator and Fig. 4 shows the HP-35 implementation of the RP algorithm.

Figure 3: The HP 35 instruction sticker.

As you can see, the HP implementation differs in a couple of ways from the version
presented above. Firstly, the KHR in an HP-35 is actually the input register to the stack,
X (see Fig. 4), the display being connected to this register. Secondly, after an
operation is executed, results are pushed onto the stack without the need for an ENTER
key. Actually the HP-35 algorithm does allow the user to press the ENTER key after an
operator. However this has exactly the same effect as not pressing the ENTER key, so
it is probably ignored. The reason for this design decision appears to be to reduce the
number of ENTER key strokes used in lengthy calculations. In my experience one of the
greatest weaknesses in the engineering of the HP RP calculators was the tendency of keys
to stick after extended use.

Figure 4: The HP-35 RP implementation explained.

3 RPC Design and Specification

3.1 General
The overall block diagram of the project is shown in Figure 5.

Figure 5:

In addition to the reverse polish calculator RTL system (RP engine), the project also
involves interfacing to a PS/2 AT keyboard and a seven segment display. You should
treat the keyboard and seven segment interfaces as separate design projects from the
RTL design of the RP engine itself. In designing these three subsystems you will have
to decide how they will interface to each other. How you present data to the RP engine
from the keyboard affects the form and timing of the inputs to the RP engine. Similarly
how you process data in the RP engine affects what type of decimal encoding has to be
done before the seven segment display.

3.2 The PS/2 Keyboard Interface


The keyboard is a simple FSM design involving a serial to parallel converter. A good
introduction to the PS/2 AT keyboard and its communications protocol can be found at

http://www.beyondlogic.org/keyboard/keybrd.htm

As described in this article when a key is pressed, the keyboard sends data frames referred
to as scan codes via an asynchronous serial protocol. As shown in Figs 6 and 7 these
scan codes are mostly 8 bit but in some cases (e.g. the Del, Ins, / and ENTER keys)
they are 16 bit. As we are using a subset of all available keys, you will need to design
your keyboard interface to deal with unused keys in a friendly way.
The PS/2 keyboard interface will involve two main parts.

Figure 6: Keyboard showing the RP function keys.

Figure 7: Keyboard scan codes.

1. A serial to parallel converter FSM that handles the PS/2 protocol.

2. An output buffer to make KEY data available in a suitable format to the RP engine.

A sample serial protocol FSM is available on the project website. You may use this as
a starting point to develop code to receive data from the keyboard. This code follows
Wakerly’s coding style for VERILOG coding of FSMs. You will have to adapt it to the
specific PS/2 keyboard protocol shown in Fig. 8.
The form in which you provide the keyboard output influences the design of the RP RTL
controller. For example we will provide the key identity in a coded format, KEY, (if a key
is pressed) or as a NOKEY symbol (if no key or a wrong key is pressed) on the posedge
of the RP system CLOCK. Using this technique, unused scan codes can be replaced with
the NOKEY code.
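
Before writing the FSM it can help to model the frame handling in software. The sketch below is illustrative only and assumes the usual PS/2 keyboard-to-host frame of 11 bits: a start bit (0), eight data bits sent least significant bit first, an odd parity bit and a stop bit (1). Check this against Fig. 8 before relying on it. The function names and the one-entry lookup table are hypothetical; the real scan codes must be taken from Fig. 7.

```python
NOKEY = 0x1E  # NOKEY code from the key ID table in section 4.1


def make_frame(byte):
    """Build the 11 bits a keyboard would send for one scan code byte."""
    data = [(byte >> i) & 1 for i in range(8)]  # least significant bit first
    parity = 1 - (sum(data) % 2)  # odd parity: data plus parity has an odd number of ones
    return [0] + data + [parity, 1]  # start bit, data, parity, stop bit


def decode_frame(bits):
    """Return the received byte, or None on a framing or parity error."""
    if len(bits) != 11 or bits[0] != 0 or bits[10] != 1:
        return None
    data = bits[1:9]
    if (sum(data) + bits[9]) % 2 != 1:
        return None
    return sum(bit << i for i, bit in enumerate(data))


def to_key_id(scan_code, table):
    """Replace any scan code that is not in the table with NOKEY."""
    return table.get(scan_code, NOKEY)


table = {0x5A: 0x0A}  # hypothetical entry only; use the scan codes of Fig. 7
code = decode_frame(make_frame(0x5A))
print(hex(code), hex(to_key_id(code, table)), hex(to_key_id(0x1C, table)))
```

The last function shows the KEY/NOKEY idea in its simplest form: anything the calculator does not use is mapped to NOKEY before it reaches the RP engine.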

Question: Read through the above article and explain how the IDLE mode pull up to
+5V is obtained.

Figure 8: Keyboard serial protocol.

3.3 The Seven Segment Display Interface


The seven segment display driver will also include two parts.

1. A BCD (binary coded decimal) or other encoder to convert the RP engine’s chosen
number format into a form suitable for driving the display.

2. A display driver that drives the anodes and cathodes of the seven segment display
given the decimal values of the digits to be displayed.

Dealing with the former issue is a big part of the project and the output format depends
on the implementation level you are trying to achieve. You should already be familiar
with code capable of implementing the display anode and cathode driver.
Further descriptions of the keyboard and seven segment displays can be found in the
PEGASUS and BASYS manuals on the project website:

http://engnet.anu.edu.au/DEcourses/engn3213/Documents/PROJECT
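
If the RP engine keeps its numbers in binary, the encoder in part 1 has to turn a binary value into decimal digits. One well known way to do this with shifts and adders only is the shift-and-add-3 ("double dabble") method, sketched below in Python as a behavioural model. This is one possible approach, not a required one. The segment patterns assume active-high outputs with bit 0 driving segment a and bit 6 driving segment g; the polarity and bit order of the actual board must be checked against the manual.

```python
# Segment patterns for the digits 0-9 (assumed active high, bit 0 = segment a).
SEGMENTS = [0x3F, 0x06, 0x5B, 0x4F, 0x66, 0x6D, 0x7D, 0x07, 0x7F, 0x6F]


def to_bcd_digits(value, digits=4):
    """Convert an unsigned binary value to BCD digits by shift-and-add-3."""
    if not 0 <= value < 10 ** digits:
        raise ValueError("value does not fit on the display")
    width = value.bit_length() or 1
    bcd = [0] * digits  # bcd[0] is the most significant digit
    for i in reversed(range(width)):
        # Add 3 to every digit that is 5 or more, so that doubling it carries.
        bcd = [d + 3 if d >= 5 else d for d in bcd]
        # Shift the whole digit register left by one bit, taking in the next
        # bit of the input at the least significant end.
        carry = (value >> i) & 1
        for j in reversed(range(digits)):
            shifted = (bcd[j] << 1) | carry
            carry = shifted >> 4
            bcd[j] = shifted & 0xF
    return bcd


digits = to_bcd_digits(1234)
print(digits)                                       # [1, 2, 3, 4]
print([format(SEGMENTS[d], "07b") for d in digits])
```

In hardware the loop over the input bits can be unrolled into combinational logic or run over several clock cycles, which is one of the area versus timing trade-offs worth discussing in the report.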

3.4 RTL design


Given the implementation of the RP algorithm described above and following the RTL
description of MU0 (the appendix), one may propose the RTL architecture of the RP
calculator shown in Fig. 9.
In this figure, the control path is a FSM (at the left) which has two inputs: the keyID
code and a reset. The keyID and reset are synchronised to the RP engine system clock.
From the above example of the KHR, exactly one keyID appears per positive clock edge
for each valid key. Otherwise a NOKEY is produced. The reset must be provided via
a separate input such as a PEGASUS board pushbutton and not from the keyboard so
that the calculator can be manually forced into the INIT state (commonly referred to as
“switching on the calculator”).

Figure 9: Simplified RPN RTL control and data paths. The FSM receives the Reset and the
keyID from the keyboard interface; the data path comprises the Key Holding Register (KHR),
the Arithmetic Logic Unit and the stack (Stack In, Stack Out), with an output to the
display interface.

The keyID inputs are analogous to the commands stored in memory in MU0. These
determine the state transitions of the RP controller FSM.
The outputs of the controller are a set of enable and reset signals that control the
hardware blocks of the data path. As is the case for MU0, there should be no need to
send the data buses through the controller FSM (see Fig. 9).
In addition, the data path consists of well defined hardware blocks. In the
present example these are the key holding register, an arithmetic logic unit and a
stack.

4 Project Details
Now that we have an idea of what the project involves, let’s see what you have to do.
The project will consist of a design and implementation of various RP calculators for
the PEGASUS board with an XC2S50 FPGA (our target hardware). The aim will be to
design and implement a keyboard interface, a seven segment display interface and up to 3
designs and implementations of increasing complexity and functionality of the RP engine.

4.1 Keyboard and Display Interfaces


These have to be fairly universal. The display interface must be capable of displaying
digits, decimal points and any other characters you think may be necessary at some
level of RP engine implementation. The main requirement is that you meet the
precision standards specified so that the same outputs are produced for a given
combination of inputs to the calculator.
The keyboard interface should be capable of sending any key, but you will have to take care
of filtering unwanted keys. Allow me to suggest the KEY/NOKEY system described above
if you wish. This will allow a standardised input into all the RP calculators developed by
the class.
The KEY/NOKEY system is described in Fig. 10.

Figure 10: Keyboard input and output formats. The keyboard interface takes the keyboard
data, the keyboard clock and the RP system clock as inputs and outputs KEY / NOKEY. The
timing diagram shows how the keyboard interface outputs keys (for example KEY, NOKEY,
NOKEY, KEY) on the posedge of the RP system clock.

The following encodings may be used for the key IDs.

KeyID    Representation

ZERO     5'h00
ONE      5'h01
TWO      5'h02
THREE    5'h03
FOUR     5'h04
FIVE     5'h05
SIX      5'h06
SEVEN    5'h07
EIGHT    5'h08
NINE     5'h09
ENTER    5'h0A
CHS      5'h1A
CLX      5'h0B
CLR      5'h1B
PLUS     5'h0C
MINUS    5'h1C
TIMES    5'h0D
DIV      5'h1D
DP       5'h0E
NOKEY    5'h1E
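
With these encodings the digit keys are simply the codes 0 to 9, so building an operand from successive key IDs is a multiply-by-ten-and-add operation. The following Python sketch shows the idea; the names `is_digit` and `build_operand` and the decision to ignore digits beyond the fourth are assumptions made for the illustration, not requirements.

```python
KEY_ID = {
    "ZERO": 0x00, "ONE": 0x01, "TWO": 0x02, "THREE": 0x03, "FOUR": 0x04,
    "FIVE": 0x05, "SIX": 0x06, "SEVEN": 0x07, "EIGHT": 0x08, "NINE": 0x09,
    "ENTER": 0x0A, "CHS": 0x1A, "CLX": 0x0B, "CLR": 0x1B,
    "PLUS": 0x0C, "MINUS": 0x1C, "TIMES": 0x0D, "DIV": 0x1D,
    "DP": 0x0E, "NOKEY": 0x1E,
}


def is_digit(key_id):
    """Digit keys are encoded as their own value, 0x00 to 0x09."""
    return key_id <= 0x09


def build_operand(key_ids, max_digits=4):
    """Accumulate digit keys into a decimal number until a function key arrives."""
    value = 0
    count = 0
    for key_id in key_ids:
        if key_id == KEY_ID["NOKEY"]:
            continue  # nothing was pressed on this clock edge
        if not is_digit(key_id):
            break  # a function key ends the operand
        if count < max_digits:  # assumption: extra digits are ignored
            value = value * 10 + key_id
            count += 1
    return value


keys = [KEY_ID["FIVE"], KEY_ID["NOKEY"], KEY_ID["FOUR"], KEY_ID["ENTER"]]
print(build_operand(keys))  # 54
```

In hardware the multiplication by ten need not use a multiplier: 10x is (x << 3) + (x << 1), or the digits can be kept in BCD and shifted by one digit position.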

4.2 Implementation Levels of the RP Engine


In the following sections different levels or versions of the RP engine are described. The
levels correspond to increasingly complex implementations and improvements in function-
ality of the calculator. The changes mainly affect the design of the arithmetic logic unit.
It is not compulsory to attempt to design and implement each of these in the project, but
level I is compulsory and levels II and III do attract 12 additional marks out of 40.
Make sure that you implement either the direct or the HP Reverse Polish
algorithm previously described.
In short these levels are

1. A signed decimal integer calculator that does addition and subtraction.

2. A fixed point signed decimal calculator that also does multiplication and division.

3. A floating point signed decimal calculator that also does multiplication and division.

In order to obtain full marks in the project it will be necessary to complete
both of the more complex fixed and floating point implementations of the
ALU. Much of this requires a good understanding of number systems and
representations. Those who have not done the COMP2300 course may find
the following lecture notes useful.

http://cs.anu.edu.au/student/comp2300/lectures/

4.2.1 RP Engine Level I

This is the simplest level. We confine ourselves to decimal integer addition and subtrac-
tion. Key functions will be entered from the PS/2 keyboard and will be displayed on
the seven segment display. To illustrate the functionality at this level consider Figure 11
showing the front panel of a HP-35 calculator. The relevant keys are shown inside the
yellow squares.

Figure 11: HP-35 functionality for the level I system.

The large blue key on the top left is the ENTER key. The operator keys − and + are in
blue at the left. The CHS button changes the sign of the current number on the display
and CLX clears the display to a 0. The CLR key clears the stack. At this level we will
not implement the keys that have a red cross through them. These include, among many
others, the EEX key which converts a number to scientific notation, the PI key which
stores the number π and the decimal point key.
The following table shows the meaning and AT keyboard designations of the HP-35 keys
of Figure 11.

RP FUNCTION   KEYBOARD DESIGNATION   Result

ENTER         Enter key              Store on stack
−             “−”                    Subtract
+             “+”                    Add
CHS           “Num”                  Change sign of last number entered
CLX           “Del”                  Clear the display to 0.
CLR           “Ins”                  Clear all stack levels to 0.

At this level you develop your basic RTL design. This is the most important
project milestone. Try to make it extensible to the more complex designs.
The precision is to be the full 4.0.

4.2.2 RP Engine Level II

The aim is to implement fixed point arithmetic with fractional decimals: a very common
functionality in digital systems such as data radios and MPEG codecs. Fig. 12 shows
the HP-35 keys.
At this level we include multiplication, division and a fixed decimal point in
the middle of the display. The precision is 2.2.
The following table shows the meaning and keyboard designations of the HP-35 keys of
Figure 12.

Figure 12: HP-35 functionality for the level II and III systems.

RP FUNCTION   KEYBOARD DESIGNATION   Result

ENTER         Enter key              Store on stack
−             “−”                    Subtract
+             “+”                    Add
×             “*”                    Multiply
/             “/”                    Divide
.             “.”                    Decimal point
CHS           “Num”                  Change sign of last number entered
CLX           “Del”                  Clear the display to 0.
CLR           “Ins”                  Clear all stack levels to 0.
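
One common way to realise a 2.2 fixed point format is to store every number as an integer count of hundredths, so that 3.75 is held as 375. Addition and subtraction then work unchanged, while multiplication and division need a rescaling step. The Python sketch below illustrates this under stated assumptions: the function names are made up, results are truncated toward zero (rounding is a design decision left to you) and overflow is only detected, not handled.

```python
SCALE = 100   # two decimal digits after the point (2.2 format)
LIMIT = 9999  # largest magnitude that fits on the display: 99.99


def div_trunc(n, d):
    """Integer division that truncates toward zero."""
    q = abs(n) // abs(d)
    return q if (n < 0) == (d < 0) else -q


def fx_mul(a, b):
    """The raw product is in units of 1/10000, so rescale by 1/100."""
    return div_trunc(a * b, SCALE)


def fx_div(a, b):
    """Pre-scale the dividend so that the quotient is in hundredths."""
    return div_trunc(a * SCALE, b)


def overflows(a):
    return abs(a) > LIMIT


def fx_str(a):
    """Format a stored value as it would appear with the fixed decimal point."""
    sign = "-" if a < 0 else ""
    whole, frac = divmod(abs(a), SCALE)
    return f"{sign}{whole:02d}.{frac:02d}"


print(fx_str(fx_mul(150, 250)))     # 03.75  (1.50 × 2.50)
print(fx_str(fx_div(1000, 300)))    # 03.33  (10.00 / 3.00)
print(overflows(fx_mul(5000, 300))) # True   (50.00 × 3.00 = 150.00)
```

Note that the intermediate product needs roughly twice as many bits as the operands, which has a direct effect on the width of the ALU data path.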

4.2.3 RP Engine Level III

This is the most difficult level. We aim to implement floating point arithmetic (albeit
without scientific notation). The floating decimal point in the result adjusts itself to
the appropriate position on the display. Floating point will allow us to multiply decimal
numbers with larger dynamic range than fixed point.
The precision is 4 digits maximum before the decimal point and 3 digits max-
imum after the decimal point.
Since we will not be trying to implement exponents, the HP-35 functionality is the same
as in Fig. 12 followed by the same table above showing the keyboard designations.

4.2.4 PEGASUS Board Peripherals

Fig. 13 shows the PEGASUS board peripherals that will be used in the project.

Figure 13: PEGASUS board showing the peripherals.

You may also have noticed that it will not be possible to accurately represent
results that include minus signs if we confine ourselves to the four digit seven
segment display. In this project we will use the four decimal digits of the
display for numerical data and an illuminated LED for a negative result as
shown in Fig. 13.
I will leave it to you to decide how to deal with overflows. For example you may decide to
display a row of four minus signs. Interestingly the HP-35 fails to handle overflow properly.
The HP-35 rounds overflowing results down to the maximum number 9.999999999 × 10^99.
Dividing any two numbers larger than this by each other produces a 1.

4.3 Project Rules


This project leaves plenty of scope for individual creativity. You do not have to follow
the exact procedure described above for the RTL design. If you do choose to be creative
in your coding style then I expect a solid justification in terms of theory and synthesised
hardware.
The following are the project rules.

1. You may work in groups of one or two (maximum). Let me know the group members.
2. You hand in one report per group.

3. The length of the report should be < 30 pages. There is no pressure to produce a
big report and there will be no penalties for exceeding the limit.
4. Hand in the report in hardcopy and include the code and a softcopy of the report
on an attached CD.
5. You may not use any third party code. All VERILOG code is to be the
original work of the group save code offered for general use on the course
website.
6. You must follow the design conventions introduced in this document.

7. The project is worth 40% of the final mark and must be handed in to me
by C.O.B Friday June 5.

4.4 Assessment
Assessment will consist of the following tasks to be described in the report. The project
report will be worth 30 marks in total. In addition there will be a hardware test lab at
which marks will be awarded for successful implementations. The lab demonstrations will
be worth a maximum of 10 marks.

1. A short introduction to the project and the approach taken. A short description of
how the labour was divided amongst team members (if relevant). 2 marks

2. A description of the design of the keyboard and display interfaces. 4 marks

3. A description of the design of the RTL control path of the level I system including
next state tables and/or state diagrams, Karnaugh maps and appropriate excerpts
from the VERILOG. 4 marks

4. A description of the level I hardware blocks in the datapath. A description of the
arithmetic logic unit, how it does its calculations and what design trade-offs you
have had to make to meet timing constraints and space limitations. 4 marks

5. For the level I system provide test benches and simulation traces (e.g. GTKWAVE
or ISE XST) demonstrating individual working hardware blocks and a complete
working system. 4 marks

6. A description of the level II and III hardware blocks in the datapath. A detailed
description of the arithmetic logic unit, how it does its calculations and what design
trade-offs you have had to make to meet timing constraints and space limitations.
3 further marks each

7. For each level you attempt, provide the FPGA resource consumption through the
ISE synthesis reports. Provide and discuss the floorplanner output(s). flat 3 marks
total

8. Overall VERILOG coding style - 3 marks

9. A hardware demonstration of level I. Does the calculator work? Does the hardware
meet spec? 4 marks

10. A hardware demonstration of levels II and III. Does the calculator work? Does the
hardware meet spec? - 3 further marks each

Lab sessions to test hardware will be assigned closer to the final date.

A Appendix: A Description of the MU0 Microprocessor

A.1 Introduction
The definition of the instruction set shown in Fig. 14 and the requirement of two clock
cycles per instruction form the specification of MU0.

Figure 14: MU0 assembly language instructions

The first thing to do is to understand what goes on with these instructions. Note the
syntax of the commands. The symbol S refers to a memory address. The notation [S]
refers to the contents of the memory location.
Consider the datapath of Fig. 15. It shows the following hardware systems,

1. A program counter register (PC) which stores the address in the memory of the
current instruction. Exactly what is the current instruction and what is the next
instruction we’ll see in a minute. The addresses count from 0 upwards and in any
program, the instructions are stored in the first contiguous memory locations while
the data is stored in the subsequent locations. This is the basis of the Von Neumann
architecture wherein program and data are stored sequentially in memory.

Figure 15: MU0 architecture.

2. An instruction register (IR) which contains the instruction while it is being exe-
cuted.

3. An accumulator (ACC) which provides intermediate storage of data during
instruction execution. The ACC is sometimes referred to as the working register.

4. An Arithmetic Logic Unit (ALU)

5. Several multiplexers

In MU0 the data has 16 bits and the memory has storage locations that are 16 bits wide.
The data in memory is stored at locations that can be located by their address. These
addresses are represented by words that are 12 bits wide. That is, MU0’s memory has
2^12 = 4096 memory locations where data can be stored.
It is interesting and entirely pertinent to note that an instruction word consists
of 16 bits and can therefore be stored in memory. The most significant four bits
([15:12] in VERILOG parlance) of the instruction word are referred to as the opcode. This
is the machine language symbol that represents an instruction. It is the hex number
F in the left most column of Fig. 14. There are 16 possible opcodes but only 8 are
implemented in MU0. The meanings of the instructions are also described in Fig. 14.
The remaining 12 bits in the instruction word are the address in memory of either the
operand that the instruction operates on (in the case of LDA, ADD and SUB) or the
destination of the data in the ACC (STO) or the address of the next instruction in the
case of the JUMP commands (JMP, JGE, JNE).
Fig. 16 shows the instruction word format.

Figure 16: MU0 instruction format.
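
The split of the instruction word into a 4 bit opcode and a 12 bit address can be written down in a few lines. The Python sketch below is for illustration only; the function names are made up, and the mnemonic table lists just the four opcode values quoted in section A.4 (the remaining ones are defined in Fig. 14).

```python
# Opcode values quoted in section A.4; see Fig. 14 for the full set.
MNEMONICS = {0x0: "LDA", 0x1: "STO", 0x2: "ADD", 0x7: "STOP"}


def decode(word):
    """Split a 16 bit instruction word into (opcode, address)."""
    opcode = (word >> 12) & 0xF  # bits [15:12]
    address = word & 0xFFF       # bits [11:0]
    return opcode, address


def encode(opcode, address):
    """Assemble a 16 bit instruction word from its two fields."""
    return ((opcode & 0xF) << 12) | (address & 0xFFF)


opcode, address = decode(0x2005)
print(MNEMONICS[opcode], address)  # ADD 5
print(hex(encode(0x1, 0x006)))     # 0x1006
```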

The program counter is incremented every instruction. It is only controlled by the active
edges of the clock. Consequently MU0 automatically runs sequentially through the
addresses in memory. Reading an instruction occurs when that instruction appears in the IR
in the EXEC state.
The cunning in the design of the MU0 architecture is that each hardware block in the
datapath of Fig 15 is configured to execute these instructions by appropriate changes in
their control inputs. Examples of controls are PCen (enable PC), ACCen (enable
the ACC), Asel (choose the input that connects to the output of the address
MUX, a-mux), M (choose the function to be performed by the ALU), etc. The
bit values of these controls are the outputs of the control path FSM.

A.2 The Control Path Finite State Machine


The next state diagram of the controller in Fig. 17 describes how the FSM works.
The first column shows the states of which there are just two,

0 FETCH (fetch instruction and store instruction in IR) and,

1 EXEC (decode instruction in the IR and execute instruction).

The second column shows the opcode, F. The opcode is MU0’s only input. MU0 obtains
the opcode when the controller FSM reads bits [15:12] of the IR. The third column is
the next state that the controller jumps to. Note that the opcode is not needed in the
FETCH state (0) because in this state the only steps are to mux the PC contents onto
the address bus through the a-mux and to set up the ALU input control, M, for a PC
increment. As a result, regardless of the F or opcode value, the next state is EXEC (1).
This explains the don’t cares, “XXX”, in the F cell in the table.
If you look at the next-state diagram you should be able to confirm the interconnects of
MU0 in Fig. 18 for the FETCH state. The grey tracks indicate connected paths in the
datapath.

Figure 17: MU0 next state diagram

A.3 MU0 in action


Try to follow the following verbal description. Remember that all register updates (PC, IR
and ACC) and state transitions occur on the positive edge of the clock but memory reads and
writes occur on negative clock transitions.
The sequence of events that occur in the FETCH state from the first positive transition
of the system clock are as follows.

1. The FETCH cycle occurs at the first positive edge of the clock.

2. In the FETCH state the a-mux input is connected to the PC output. The MUX is
a combinational device and so the PC contents should already be pointing to the
address of the next instruction in memory.

3. At the ensuing negative clock transition the memory transfers the contents of the
location whose address is in the PC to the Dbus. The Dbus is the output data line
of the memory whereas the Xbus is the input data line.

4. The contents of the Dbus are now present at the input to the IR.
5. In the FETCH state the PC contents are also presented to the ALU input through
the x-mux. The ALU M-value is set so that the ALU increments the value on
this input. Since both the x-mux and the ALU are combinational devices, the
incremented PC value appears at the PC input without waiting for a clock edge. On the
next positive clock transition the contents of the PC will be incremented ready for the
next time the FSM is in the FETCH state.

From the second positive clock transition we are in the EXEC state. The sequence of
events that occur in this state are as follows.

1. At this transition the PC increments its contents as discussed previously.


2. The IR registers the contents of the Dbus to its output.
3. The controller reads the [15:12] bits (the opcode) from the IR.
4. Depending on the opcode value, several function controls in the datapath may be
enabled or disabled as follows.

• If the opcode is LDA then the ACC is enabled and the y-mux is set so that
the Dbus is connected to the ALU input on the Ybus. The ALU M value is set
for a through connection on its Ybus input. At the positive edge of the next
clock transition into the FETCH state, the ACC output will store the contents
of the Dbus.
• If the opcode is for ADD or SUB then the ACC is enabled and the Dbus is
again connected to the ALU via the Ybus through y-mux. The x-mux is set to
allow the contents of the ACC onto the Xbus and the ALU M value is set for
ADD or SUB. On the subsequent negative clock transition the contents of the
memory are transferred onto the Dbus. At the positive edge of the next clock
transition into the FETCH state, the ACC output will store the sum (ADD) or
difference (SUB) of its previous value and that in the memory location.
• If the opcode is STO, the x-mux places the contents of the ACC on the input
to the Xbus which is also the memory input data line. The last 12 bits of the
contents of the IR are sent via the a-mux to the address bus of the memory.
On the next negative clock transition the memory stores the contents of the
Xbus (the contents of the ACC).
• In the case of the JUMP instructions, the last 12 bits of the instruction register
are sent via the y-mux to the Ybus and the ALU. The ALU is set for straight
through so that this new memory address is fed to the PC. At the ensuing
posedge of clock (FETCH) the PC is changed to the address which is the
operand of the JUMP instruction.
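
The opcode-dependent behaviour listed above can be summarised as a small Python
function. This is a behavioural sketch only. Opcodes 0 (LDA), 1 (STO), 2 (ADD) and
7 (STOP) are taken from the program listing in Section A.4; the values used here
for SUB (3) and for an unconditional JUMP (4) are assumptions, as are the function
name and the dictionary layout. Conditional jumps are not modelled.

```python
# Behavioural sketch of the EXEC state. SUB = 3 and JMP = 4 are ASSUMED encodings;
# 0, 1, 2 and 7 are the opcodes that appear in the A.4 program listing.
LDA, STO, ADD, SUB, JMP, STOP = 0, 1, 2, 3, 4, 7

def exec_state(regs, memory):
    """Apply the effect of the instruction held in the IR; return False on STOP."""
    opcode = (regs["IR"] >> 12) & 0xF     # bits [15:12] read by the controller
    address = regs["IR"] & 0x0FFF         # last 12 bits: the address operand
    if opcode == LDA:                     # Dbus -> y-mux -> ALU (through) -> ACC
        regs["ACC"] = memory[address]
    elif opcode == ADD:                   # ACC on Xbus, Dbus on Ybus, ALU adds
        regs["ACC"] = (regs["ACC"] + memory[address]) & 0xFFFF
    elif opcode == SUB:                   # as ADD, but the ALU subtracts
        regs["ACC"] = (regs["ACC"] - memory[address]) & 0xFFFF
    elif opcode == STO:                   # ACC -> Xbus -> memory data input
        memory[address] = regs["ACC"]
    elif opcode == JMP:                   # IR[11:0] -> y-mux -> ALU (through) -> PC
        regs["PC"] = address
    elif opcode == STOP:
        return False
    return True                           # other opcodes are ignored in this sketch

memory = [0] * 16
memory[5] = 0x0001
regs = {"PC": 2, "IR": 0x2005, "ACC": 0x000A}
exec_state(regs, memory)
print(f"ACC={regs['ACC']:04X}")           # ACC=000B
```

Results are masked to 16 bits because the data words in this appendix are four hex
digits wide.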

Figure 18: The MU0 datapath interconnects during FETCH and EXEC.

A.4 Running a Program on MU0


The following is a machine code listing of a MU0 program which adds the contents (000A)
of memory location 4 to the contents (0001) of memory location 5. The first hex digit in
each command is the opcode. The opcodes used are 0 (LDA), 1 (STO), 2 (ADD) and
7 (STOP). The remaining 3 hex digits are the address operands as discussed above.

0004 (load (LDA) the contents of memory address 4 into the ACC)
2005 (add (ADD) the contents of memory address 5 to that in the ACC)
1006 (store (STO) the contents of the ACC in memory location 6)
7000 (STOP)
000A (data stored in memory location 4)
0001 (data stored in memory location 5)
0000 (data stored in memory location 6)
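
As a quick check on the listing, the split of each word into an opcode digit and a
3-digit address can be reproduced in Python. This is a minimal sketch: the function
name `decode` and the `MNEMONICS` table are assumptions for illustration, and only
the four opcodes used in this program are included.

```python
# Split each 16-bit instruction word into its opcode and address fields.
# Only the opcodes that appear in the listing are included in this table.
MNEMONICS = {0x0: "LDA", 0x1: "STO", 0x2: "ADD", 0x7: "STOP"}

def decode(word):
    """Return (opcode, operand): the first hex digit and the remaining three."""
    return word >> 12, word & 0x0FFF

program = [0x0004, 0x2005, 0x1006, 0x7000]
for address, word in enumerate(program):
    opcode, operand = decode(word)
    print(f"{address}: {word:04X} -> {MNEMONICS[opcode]} {operand:03X}")
```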

Notice how execution occurs in purely sequential fashion. MU0 does not know which
memory addresses contain instructions and which data. Its proper operation depends
entirely on proper programming and the march of the PC contents. The STOP command
terminates execution and prevents the processor from trying to interpret the first
hex digit of the data at memory location 4 as an opcode.
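
The sequential behaviour can be illustrated with a very small instruction-level
simulation in Python. It is a sketch under stated assumptions rather than a model of
the real datapath: the function `run` and its `max_steps` guard are invented for the
example, and only the four opcodes used in the listing are handled.

```python
def run(memory, max_steps=100):
    """Fetch and execute from address 0 until STOP; return (ACC, addresses executed)."""
    acc, pc, executed = 0, 0, []
    for _ in range(max_steps):             # guard against programs with no STOP
        word = memory[pc]
        executed.append(pc)
        pc += 1                            # the PC simply marches through memory
        opcode, address = word >> 12, word & 0x0FFF
        if opcode == 0x0:                  # LDA
            acc = memory[address]
        elif opcode == 0x1:                # STO
            memory[address] = acc
        elif opcode == 0x2:                # ADD
            acc = (acc + memory[address]) & 0xFFFF
        elif opcode == 0x7:                # STOP
            break
    return acc, executed

memory = [0x0004, 0x2005, 0x1006, 0x7000, 0x000A, 0x0001, 0x0000]
acc, executed = run(memory)
print(f"mem[6]={memory[6]:04X} executed={executed}")   # mem[6]=000B executed=[0, 1, 2, 3]
```

Only addresses 0 to 3 are ever treated as instructions. Without the STOP word the
loop would carry on to address 4 and treat the data word 000A as an LDA instruction.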
Fig. 19 shows the complete GTKWAVE output from running MU0 with ICARUS VERILOG.
Fig. 20 expands the traces around the FETCH and EXEC states when the instruction
2005 is being executed. For instruction 2005 the PC is pointing to address 1 in memory.

Figure 19: GTKWAVE traces of the MU0 data during execution of the above program

During this instruction the contents of memory address 5 (0001) are added to the contents
of the accumulator, which is by now 000A. Notice that the actual instruction 2005 does
not appear in the IR until the EXEC state is reached and that the contents of the ACC
do not register the sum, 000B, until the FETCH cycle of the following instruction.
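
The register timing described here can be mimicked with a short Python generator
that reports PC, IR and ACC just after each positive clock edge. It is a simplified
sketch, not a substitute for the GTKWAVE traces: the function `trace` and the
`acc_next` variable are assumptions, the registers are assumed to start at zero, and
only the opcodes of the A.4 program are handled.

```python
def trace(memory):
    """Yield (state, PC, IR, ACC) as seen just after each positive clock edge."""
    pc = ir = acc = 0                      # assumed reset values
    acc_next = acc
    while True:
        # Posedge into FETCH: the ACC registers the ALU result formed during EXEC.
        acc = acc_next
        yield ("FETCH", pc, ir, acc)
        # Posedge into EXEC: the IR registers the Dbus and the PC takes PC + 1.
        ir, pc = memory[pc], pc + 1
        yield ("EXEC", pc, ir, acc)
        opcode, address = ir >> 12, ir & 0x0FFF
        if opcode == 0x7:                  # STOP
            return
        if opcode == 0x0:                  # LDA
            acc_next = memory[address]
        elif opcode == 0x2:                # ADD
            acc_next = (acc + memory[address]) & 0xFFFF
        elif opcode == 0x1:                # STO
            memory[address] = acc

memory = [0x0004, 0x2005, 0x1006, 0x7000, 0x000A, 0x0001, 0x0000]
rows = list(trace(memory))
for state, pc, ir, acc in rows:
    print(f"{state:5} PC={pc:04X} IR={ir:04X} ACC={acc:04X}")
```

In this model the IR first shows 2005 in the EXEC row and the ACC first shows 000B
in the FETCH row that follows it, in line with the description above.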

A.5 MU0 Assembly Language?


The mnemonics in Fig. 14 (LDA, STO, etc.) are referred to as assembly language
instructions. Normally when writing programs for a microprocessor one only has to use
these commands and some variables representing the data. This is clearly much easier to
read than the column of numbers that forms the machine code. However, use of assembly
language presumes the existence of an assembly language compiler, or assembler for short,
which translates the assembly language into machine code. Unfortunately (to the best
of my knowledge) MU0 does not have an assembler written for it (though you may be
tempted to write one in JAVA or C; I don't think it would be difficult). We have already

FETCH EXEC FETCH
CLOCK
ACC 0000 000A 000A 000A 000A 000B
IR  1004 1004 1004 2005 2005 2005
PC  0001 0001 0001 0002 0002 0002

Figure 20: Expected and GTKWAVE traces of MU0 ACC, IR and PC registers around
the execution of the 2005 instruction

seen a little assembly language with PICOBLAZE, and later in the course we will see
some more.
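
To give a feel for what such an assembler involves, here is a minimal sketch in
Python (chosen here only for brevity). It is not an existing MU0 tool: the function
`assemble`, the source syntax and the `DATA` directive for raw data words are all
hypothetical, and only the opcodes from the Section A.4 listing are supported.

```python
# Hypothetical mini-assembler for the four opcodes used in Section A.4.
OPCODES = {"LDA": 0x0, "STO": 0x1, "ADD": 0x2, "STOP": 0x7}

def assemble(source):
    """Translate lines such as 'ADD 5' or 'DATA 000A' into 16-bit words."""
    words = []
    for line in source.strip().splitlines():
        fields = line.split()
        mnemonic = fields[0].upper()
        operand = int(fields[1], 16) if len(fields) > 1 else 0   # operands are hex
        if mnemonic == "DATA":             # hypothetical directive: raw data word
            words.append(operand & 0xFFFF)
        else:                              # opcode in bits [15:12], address below
            words.append((OPCODES[mnemonic] << 12) | (operand & 0x0FFF))
    return words

source = """
LDA 4
ADD 5
STO 6
STOP
DATA 000A
DATA 0001
DATA 0000
"""
print(" ".join(f"{w:04X}" for w in assemble(source)))
# 0004 2005 1006 7000 000A 0001 0000
```

The output reproduces the machine code listing of Section A.4. A practical assembler
would also need labels, comments and error reporting, which are left out here.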

References
[1] Thomas M. Whitney, France Rode and Chung C. Tung The ’Powerful Pocketful’: an
Electronic Calculator Challenges the Slide Rule Hewlett Packard Journal, 1972.

[2] David S. Cochran Algorithms and Accuracy in the HP-35 Hewlett Packard Journal,
1972.

[3] M. Ercegovac, T. Lang, and J.H. Moreno Introduction to Digital Systems Wiley,
1999.

[4] John F. Wakerly Digital Design: Principles and Practices Prenitce-Hall, 2000.

31