
AT LEAST IT’S NOT LOAD STORE

- ALINLS -
Group 2A
Abby Abernathy
Worden Barr
Alexander Bradshaw
Chris Comeau
CSSE232-02
Group 2A - 2

Table of Contents

Introduction to the ALINLS Processor

ALINLS Detailed Overview
  Instruction Set Design
  Implementation and Testing Methodologies
  Xilinx Model
  Milestone 6 Performance Results
  Unique Features
    Assembler
    FPGA Board Input

Conclusion

Appendix A: Design Documentation
  ARCHITECTURE
  INSTRUCTION SET
    Machine Language Instruction Format Types
    Addressing Modes
    Instructions
  MEMORY
    Reserved Spots in Data Memory
  REGISTER TRANSFER LANGUAGE (RTL) SPECIFICATION
    RTL Symbols
    RTL Breakdown
  VERIFYING THE DESIGN PROCESS
  SYSTEM TEST PLAN
  DATAPATH
    Single-cycle Block Diagram
    Hardware Components
    Integration Plan
  CONTROL SIGNALS
    Descriptions
    Test Plan
    Control Signal Bits by Opcode
    Special Control Unit Logic
  DESIGN PERFORMANCE
  SAMPLE CODE
    Load Address
    Iteration
    Conditional Statement
    Assembly RelPrime and Euclid’s Algorithm
    Machine Code RelPrime and Euclid’s Algorithm

Appendix B: Control Signal Bits by Opcode

Appendix C: Design Process Journals
  Group 2A
  Abby Abernathy
  Worden Barr
  Alexander Bradshaw
  Chris Comeau

Introduction to the ALINLS Processor


ALINLS is an accumulator-based architecture with some stack-like functionality. One memory
unit acts as a stack and can be used to temporarily save data or a position in the instruction
list. There are also two registers and a separate data memory unit that the user can work in.
The user must keep track of which register is active, because some instructions operate on a
“current register”. The current register is not a separate hardware component; it is a concept
that lets the same instruction be used with more than one register. The processor uses a
single-cycle datapath.

ALINLS Detailed Overview

Instruction Set Design


Our instruction set is made up of instructions drawn from both the MIPS and PIC
instruction sets, along with a few of our own. They all use a single instruction format, A-Type.
In this format, the first five bits hold the opcode/function code and the last eleven bits hold an
immediate value. Some opcodes ignore the immediate because it is not needed. Each
instruction follows one of four addressing modes, which dictate how the immediate is handled:
Immediate Addressing, Immediate Operating, Pseudo-Direct, and Inherent (see Appendix A for
more details).

Implementation and Testing Methodologies


The working register (WR) holds the data you are currently working with. The addressing
register (AR) holds an address that you can then use to load and store values. The stack
pointer register (SP) holds the current position in the stack. The program counter register (PC)
holds the current position in the instruction list. The current register (CR) referred to
throughout the document is whichever register you are currently working in. It is not a separate
register in the implementation; instead, it is implemented by control bits. The programmer
chooses which register to work in by inserting specific instructions: SAR to switch to the
addressing register or SWR to switch to the working register. The @ symbol is used in code
when the item following it should be translated into an address when converted to machine
code.

When you make a procedure call, the address of the next instruction (PC+1) is
automatically put on the stack. The stack is then available for general use and data storage, but
the processor trusts the programmer to allocate and deallocate stack space properly. Anything
not on the stack, meaning the working register, the addressing register, or data memory such
as inputs and outputs, could be overwritten by the called procedure. These values can be
preserved by storing them on the stack before making a call. When a procedure finishes, it
executes a return instruction, which reads the last item on the stack (which should be a return
address) and jumps there.
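
The call and return convention described above can be sketched as a small Python model. This is only an illustration, not the Verilog implementation: the class name `Machine`, the list-backed stack memory, and the starting value of SP are assumptions, and the direction of stack growth follows the RTL in Appendix A (SP decreases on a push).

```python
class Machine:
    """Toy model of PC, SP, and stack memory (all names are illustrative)."""

    def __init__(self, stack_size=16):
        self.pc = 0
        self.sp = stack_size            # assumed: empty stack, SP decreases on push
        self.stack = [0] * stack_size

    def call(self, target):
        return_addr = self.pc + 1       # address of the next instruction
        self.sp -= 1
        self.stack[self.sp] = return_addr
        self.pc = target

    def push(self, value):
        self.sp -= 1
        self.stack[self.sp] = value & 0xFFFF

    def pop(self):
        value = self.stack[self.sp]
        self.sp += 1
        return value

    def ret(self):
        # Trusts that the item on top of the stack is a return address.
        self.pc = self.stack[self.sp]
        self.sp += 1


m = Machine()
m.pc = 12
m.call(42)   # saves 13 on the stack, jumps to 42
m.push(7)    # the procedure saves a value...
m.pop()      # ...and must remove it again before returning
m.ret()
print(m.pc)  # 13
```

If the `pop` is omitted, `ret` jumps to 7 instead of 13, which is the kind of failure the "trust" in the text refers to.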

To implement our processor, we first decided on a single-cycle datapath design. To create
our datapath, we worked through all of the instructions to verify that every part and connection
needed was present (further details on the datapath design can be found in Appendix A). To
create this datapath in Verilog, we first separated the parts into six subsystems and then
combined the subsystems into one datapath. We tested at each step along the way. Starting
with each component, we tested that it did everything it was supposed to do, which made
building the larger system easier. When we combined the components into subsystems, we
checked that each group’s function was executed properly. This made it easier to find the
specific sections of the datapath that were wrong, rather than debugging the entire datapath at
once. Finally, once we combined the subsystems, we tested each instruction to ensure that
they all worked properly. We verified test results through a combination of waveform inspection
and display-line output.

Xilinx Model
INSERT PICTURE HERE

Milestone 6 Performance Results


Memory Usage:
- Instruction Memory: 77 Instructions = 154 bytes
- Data Memory: 3 Memory variables = 6 bytes
Number of instructions executed when relPrime is called with 0x13B0:
- 204,141 instructions
Number of cycles required to run relPrime when called with 0x13B0:
- 204,141 cycles
Average cycles per instruction:
- 1 cycle per instruction
Cycle time for ALINLS:
- 28.813 ns = 34.71 MHz
Total execution time when relPrime is called with 0x13B0:
- 28.813 ns * 204141 = 0.005882 seconds
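
The figures above follow from the standard relation execution time = instruction count × CPI × cycle time. A short Python check of the arithmetic, using only the numbers reported in this section (the function and constant names are illustrative):

```python
CYCLE_TIME_NS = 28.813      # reported cycle time
INSTRUCTIONS = 204_141      # reported instruction count for relPrime(0x13B0)
CPI = 1                     # single-cycle datapath: one cycle per instruction
BYTES_PER_INSTRUCTION = 2   # 16-bit instructions


def clock_mhz(cycle_time_ns):
    """Clock frequency in MHz for a cycle time given in nanoseconds."""
    return 1_000.0 / cycle_time_ns


def execution_time_s(instructions, cpi, cycle_time_ns):
    """Total run time in seconds."""
    return instructions * cpi * cycle_time_ns * 1e-9


print(77 * BYTES_PER_INSTRUCTION)                                    # 154
print(round(clock_mhz(CYCLE_TIME_NS), 2))                            # 34.71
print(round(execution_time_s(INSTRUCTIONS, CPI, CYCLE_TIME_NS), 6))  # 0.005882
```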

Device Utilization Summary:
- Selected Device: 3s500efg320-4
- Number of Slices: 1071 out of 4656 (23%)
- Number of Slice Flip Flops: 273 out of 9312 (2%)
- Number of 4 input LUTs: 1970 out of 9312 (21%)
  - Number used as logic: 1458
  - Number used as RAMs: 512
- Number of IOs: 26
- Number of bonded IOBs: 26 out of 232 (11%)
- Number of GCLKs: 1 out of 24 (4%)

Unique Features

Assembler
ALINLS’s assembler takes a file of assembly code and translates each line into its
equivalent machine code. Each line must either be an instruction in the format “mnemonic
immediate” or a label in the format “label name:”. The spacing and format must be followed
exactly, or the assembler may generate incorrect machine code. The machine code is written
to the instruction memory coe file, but you must regenerate the core in Xilinx for the memory
unit to update. The assembler can be found at
“2A-....-comeaucm\Implementation\RelPrime\CSSE232Assembler2A” in our repository.
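
The core translation step can be sketched in Python as follows. This is not the group’s assembler: `assemble_line` and `OPCODES` are hypothetical names, only a subset of the opcodes from Appendix A is included, and label resolution and @ address symbols are deliberately left out. It only shows how a “mnemonic immediate” line maps onto the 5-bit opcode and 11-bit immediate of the A-Type format.

```python
# Subset of the opcode table from Appendix A.
OPCODES = {
    "add": 0b00000, "sub": 0b00001, "call": 0b00100, "goto": 0b00101,
    "push": 0b00111, "load": 0b01000, "store": 0b01001, "addi": 0b01010,
    "loadi": 0b01110, "loadui": 0b01111, "pop": 0b10010, "return": 0b10111,
    "swr": 0b11011, "sar": 0b11100,
}


def assemble_line(line):
    """Translate one 'mnemonic immediate' line into a 16-bit binary string."""
    parts = line.split("#", 1)[0].split()   # drop a trailing comment, then split
    if not parts:
        raise ValueError("empty line")
    mnemonic = parts[0]
    if mnemonic not in OPCODES:
        raise ValueError(f"unknown mnemonic: {mnemonic}")
    # Numeric immediates only; a missing immediate is encoded as zero.
    imm = int(parts[1], 0) if len(parts) > 1 else 0
    if not -1024 <= imm <= 2047:
        raise ValueError(f"immediate does not fit in 11 bits: {imm}")
    word = (OPCODES[mnemonic] << 11) | (imm & 0x7FF)
    return format(word, "016b")


print(assemble_line("addi 2"))    # 0101000000000010
print(assemble_line("call 42"))   # 0010000000101010
print(assemble_line("sar"))       # 1110000000000000
```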

FPGA Board Input


We started with the provided ALU input/output project from the CSSE 232 resources
page, which used the LCD and rotary switch to enter two numbers and an operation into an
ALU and continuously updated the LCD with the result. We replaced the ALU with our ALINLS
processor. We used the A input (the first four-digit hex number) on the LCD, ignored the
operation and B input, and rewired the display that showed the ALU output to show the WR
value of our processor. We also added a reset switch that, when toggled, pauses the datapath
and lets us load an input into WR.

Conclusion
Overall, we feel that we made a decent processor. We improved our efficiency by
changing several inefficient inherent instructions so that they add an immediate to the
addressing register as part of the instruction, instead of requiring a separate instruction
beforehand. While it uses more cycles than most processors running a similar-sized instruction
set, the individual instruction execution speed is good. We believe that our processor would
have benefited from being multi-cycle, because our instructions finished relatively quickly and a
multi-cycle design would have shortened our overall runtime. If we could do the project over
again, we would try to simplify our control unit logic, especially the current register logic.

Appendix A: Design Documentation

ARCHITECTURE
The ALINLS architecture is an accumulator-based system.
A working register (WR) holds the data you are currently working with. An addressing
register (AR) holds an address that you can then use to load and store values. A stack pointer
register (SP) holds the current position in the stack. A program counter register (hereafter
referred to as PC) holds the current position in the instruction list. The current register (CR)
referred to throughout the document is whichever register you are currently working in. It is not
a separate register in the implementation; instead, it is implemented by control bits. The
programmer can specify which register to work in (the addressing register or the working
register).
The @ symbol is used in code when the item following it should be translated into an
address when converted to machine code.
When you make a procedure call, the address of the next instruction (PC+1) is
automatically put on the stack. The stack is then available for general use and data storage, but
the processor trusts the programmer to allocate and deallocate stack space properly. Anything
not in memory or on the stack, namely the working register and the addressing register, could
be overwritten by the called procedure. When a procedure finishes, it executes a return
instruction, which reads the last item on the stack (which should be a return address) and
jumps there.

INSTRUCTION SET

Machine Language Instruction Format Types


A-Type: The first five bits hold the opcode/function code. The last eleven bits hold an
immediate value. This type is used for all instructions, though some opcodes ignore the
immediate because it isn’t needed.

Bits 15-11: OPCODE
Bits 10-0: IMMEDIATE
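
As an illustration of the A-Type layout, the following Python sketch splits a 16-bit instruction word into its two fields and shows the sign and zero extension of the 11-bit immediate to 16 bits. The function names are illustrative and do not come from the Verilog source.

```python
def decode(word):
    """Split a 16-bit A-Type word into (opcode, immediate)."""
    opcode = (word >> 11) & 0x1F    # bits 15-11
    imm = word & 0x7FF              # bits 10-0
    return opcode, imm


def zero_extend(imm11):
    """ZE: upper five bits of the 16-bit result are zero."""
    return imm11 & 0x7FF


def sign_extend(imm11):
    """SE: copy bit 10 of the immediate into bits 15-11 of the result."""
    imm11 &= 0x7FF
    return (imm11 | 0xF800) if (imm11 & 0x400) else imm11


opcode, imm = decode(0b0101000000000010)   # addi 2
print(format(opcode, "05b"), imm)          # 01010 2
print(hex(sign_extend(0x7FF)))             # 0xffff
print(hex(zero_extend(0x7FF)))             # 0x7ff
```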

Addressing Modes
Immediate Addressing: Adds the immediate to the addressing register in addition to performing
the operation.
Immediate Operating: Uses a specified immediate from the instruction in order to perform an
operation.
Pseudo-Direct: Effective address is stated explicitly in the instruction, then is used to perform an
operation with that address’ value.
Inherent: Immediate is ignored, uses the opcode to tell the processor to perform an operation.

Instructions

Mnemonic  ZE/SE  Opcode  Description

Pseudo-Direct Addressing
call      ZE     00100   Call procedure (jal).
goto      ZE     00101   Go to address (j).

Immediate Addressing
load      SE     01000   Set addressing register to addressing register + immediate. Load the value from address register into working register.
store     SE     01001   Set addressing register to addressing register + immediate. Store value from working register to address obtained in address register.
sub       SE     00001   Set addressing register to addressing register + immediate. Subtract value at address in addressing register from working register and put into working register.
clearf    SE     00110   Set addressing register to addressing register + immediate. Clear contents at address in addressing register.
seq       SE     10100   Set addressing register to addressing register + immediate. Compare value at address in addressing register to working register; if equal, skip next instruction.
sne       SE     10101   Set addressing register to addressing register + immediate. Compare value at address in addressing register to working register; if not equal, skip next instruction.
slt       SE     10110   Set addressing register to addressing register + immediate. Compare value at address in addressing register to working register; if working register < value at addressing register, store TRUE in working register.
add       SE     00000   Set addressing register to addressing register + immediate. Add value at working register to value at address in addressing register and put result in working register.
and       SE     00010   Set addressing register to addressing register + immediate. Bitwise AND value at address in addressing register with working register and put result in working register.
or        SE     00011   Set addressing register to addressing register + immediate. Bitwise OR value at address in addressing register with working register and put result in working register.

Immediate Operating
addi      SE     01010   Add immediate to the current register and store in current register.
andi      SE     01011   Bitwise AND immediate with current register and store in current register.
ori       ZE     01100   Bitwise OR immediate with current register and store in current register.
slti      SE     01101   Compare immediate to working register; if working register < immediate, store TRUE in working register.
loadi     SE     01110   Load immediate into lower 11 bits of the current register.
loadui    ZE     01111   Load immediate into upper 11 bits of the current register.
sll       SE     10000   Shift bits left in current register by signed immediate and store in current register.

Inherent Addressing
push      N/A    00111   Add 2 to the stack pointer register, then push current register’s value onto the stack.
pop       N/A    10010   Pop the last item off the stack into the current register, then add 2 to the stack pointer register.
peek      N/A    10011   Look at the last item on the stack and put it in the current register, but don’t take it off the stack.
return    N/A    10111   Return to address at the top of stack.
clearc    N/A    11000   Clear current register (set it to 0).
swr       N/A    11011   Switch the current register to the working register.
sar       N/A    11100   Switch the current register to the addressing register.
sez       N/A    11001   Compare contents of working register with zero; if equal, skip next instruction.
snz       N/A    11010   Compare contents of working register with zero; if not equal, skip next instruction.

MEMORY
Our datapath has three memory units. One acts as a stack: the user can add to it using the
push instruction and read from it using pop and peek. Some other instructions also read from
and write to the stack. It is important to note that our stack starts at zero and grows up as you
add more to it. Another memory holds the instruction list; PC holds our current place in this
memory. The last memory is used when the user wants to store data for later. The user can
use any address except those reserved for function arguments and return values.

Reserved Spots in Data Memory


- @IN0 (0x0000): used for function arguments
- @IN1 (0x0001): used for function arguments
- @IN2 (0x0002): used for function arguments
- @IN3 (0x0003): used for function arguments
- @IN4 (0x0004): used for function arguments
- @OUT0 (0x0005): used for function return values
- @OUT1 (0x0006): used for function return values
- @OUT2 (0x0007): used for function return values

REGISTER TRANSFER LANGUAGE (RTL) SPECIFICATION

RTL Symbols

PC = program counter
AR = addressing register
WR = working register
CR = current register
SP = stack pointer register
Mem = memory
SE = sign extended operation
ZE = zero extended operation

RTL Breakdown

The steps are done sequentially.

AR Immediate Addressing Other
    instr = Mem[PC]
    PC = PC + 1
    imm = SE(instr[10-0])
    ARtemp = AR + imm
    AR = ARtemp
    load:    WR = Mem[ARtemp]
    store:   Mem[ARtemp] = WR
    sub:     V = Mem[ARtemp]
             WR = WR - V
    clearf:  Mem[ARtemp] = 0
    slt:     V = Mem[ARtemp]
             If (V < WR) { WR = 1 }
    add:     V = Mem[ARtemp]
             WR = WR + V
    and:     V = Mem[ARtemp]
             WR = WR & V
    or:      V = Mem[ARtemp]
             WR = WR | V

AR Immediate Addressing Skips
    instr = Mem[PC]
    imm = SE(instr[10-0])
    ARtemp = AR + imm
    AR = ARtemp
    V = Mem[ARtemp]
    seq:     if (WR == V) { PC = PC + 2 }
             else { PC = PC + 1 }
    sne:     if (WR != V) { PC = PC + 2 }
             else { PC = PC + 1 }

SE Immediate Operating
    instr = Mem[PC]
    PC = PC + 1
    imm = SE(instr[10-0])
    addi:    CR = CR + imm
    andi:    CR = CR & imm
    slti:    If (imm < WR) { WR = 1 }
    loadi:   CR = imm
    sll:     CR = CR << imm

ZE Immediate Operating
    instr = Mem[PC]
    PC = PC + 1
    imm = ZE(instr[10-0])
    ori:     CR = CR | imm
    loadui:  CR = imm

ZE Pseudo Direct
    instr = Mem[PC]
    PC = PC + 1
    imm = PC[15:12] ^._.^ (instr[10-0])
    call:    SPtemp = SP - 1
             Mem[SPtemp] = PC
             SP = SPtemp
             PC = imm
    goto:    PC = imm

Inherent Addressing Skips
    instr = Mem[PC]
    sez:     if (WR == 0) { PC = PC + 2 }
             else { PC = PC + 1 }
    snz:     if (WR != 0) { PC = PC + 2 }
             else { PC = PC + 1 }

Inherent Addressing Other
    instr = Mem[PC]
    PC = PC + 1
    swr:     CR = WR
    sar:     CR = AR
    clearc:  CR = 0
    return:  PC = Mem[SP]
             SP = SP + 1
    peek:    CR = Mem[SP]
    pop:     CR = Mem[SP]
             SP = SP + 1
    push:    SPtemp = SP - 1
             Mem[SPtemp] = CR
             SP = SPtemp
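
To make the shared prologue of the "AR Immediate Addressing Other" group concrete, here is a minimal Python sketch of that RTL for load, store, add, sub, and clearf. The dictionary-based state, the function names, and the choice to read uninitialised memory as 0 are assumptions made for illustration; register values are wrapped to 16 bits.

```python
MASK = 0xFFFF


def sign_extend11(imm):
    """SE of an 11-bit immediate to 16 bits."""
    imm &= 0x7FF
    return (imm | 0xF800) if (imm & 0x400) else imm


def step_ar_immediate(state, mnemonic, imm11):
    """Apply one AR Immediate Addressing instruction to state (pc, ar, wr, mem)."""
    state["pc"] = (state["pc"] + 1) & MASK
    ar_temp = (state["ar"] + sign_extend11(imm11)) & MASK
    state["ar"] = ar_temp                    # AR is updated by every instruction here
    mem = state["mem"]
    value = mem.get(ar_temp, 0)              # assumed: unwritten memory reads as 0
    if mnemonic == "load":
        state["wr"] = value
    elif mnemonic == "store":
        mem[ar_temp] = state["wr"]
    elif mnemonic == "add":
        state["wr"] = (state["wr"] + value) & MASK
    elif mnemonic == "sub":
        state["wr"] = (state["wr"] - value) & MASK
    elif mnemonic == "clearf":
        mem[ar_temp] = 0
    else:
        raise ValueError(f"not modelled: {mnemonic}")
    return state


state = {"pc": 0, "ar": 0, "wr": 5, "mem": {}}
step_ar_immediate(state, "store", 3)      # AR = 3, Mem[3] = 5
step_ar_immediate(state, "add", 0)        # AR = 3, WR = 5 + Mem[3] = 10
step_ar_immediate(state, "sub", 0x7FF)    # imm = -1, AR = 2, WR = 10 - Mem[2] = 10
print(state["ar"], state["wr"], state["pc"])   # 2 10 3
```

Because AR keeps the summed address after each instruction, a following instruction with immediate 0 reuses the same location, as the second step shows.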

VERIFYING THE DESIGN PROCESS


To test that our RTL is designed correctly, we would start by checking that every
instruction begins by fetching the instruction from memory and then incrementing the PC by 1
in order to have the next instruction address ready in the PC. Next, we would pull the
immediate from the instruction (bits 10 through 0). Then, we would need to decide what to do
with the immediate. The inherent and immediate addressing modes would either sign extend,
zero extend, or do nothing to the immediate based on the opcode. We would make sure that
the gotos and calls (pseudo-direct) shift it by one and then concatenate it with the top 4 bits of
the PC. Finally, we would look at each instruction individually and make sure that each step in
the RTL does what it should, such as using the right registers and using memory correctly.
We use this process to check each instruction type with a known set of inputs and verify
that the actual output matches the expected output.

SYSTEM TEST PLAN


After testing the individual parts and subsystems, we integrated them all together. To test the
system integration, we started by writing an instruction list that uses all of the instructions. We
ran that through the system, then verified that the instructions were all decoded correctly. We
then checked the waveform to verify that all of the registers and memory units held the
expected values.

DATAPATH

Single-cycle Block Diagram

Hardware Components
3 Memory Units
- Inputs: Address (16 bits), WriteData (16 bits)
- Outputs: MemData (16 bits)
- Control: MemRead (1 bit), MemWrite (1 bit), StackWrite (1 bit)
- Description: A value is always read from the unit. If the write enable bit is on, the data
from WriteData is put into memory at the address given. Otherwise, nothing is written to
memory.
- Implements: Mem
- Hardware: Pull from class resources - Distributed Memory.
- Unit Tests: Test capability of writing to and reading from a specified address depending
on the enable bits
16 bit ALU
- Inputs: ALUIn1 (16 bits), ALUIn2 (16 bits)
- Outputs: ALUOut (16 bits)
- Control: ALUOp (3 bit), ALUSrcA (1 bit), ALUSrcB (1 bit)
- ALUOp Choices
  - And: 000
  - Or: 001
  - Addition: 010
  - Subtraction: 110
  - Set Less Than: 111
- Description: Based on the ALUOp, the ALU will perform an operation (e.g. and, or, add,
subtract, less than) on ALUIn1 and ALUIn2 and output it through ALUOut.
- Implements: Operations
- Hardware: Pull from class resources - ALU and LCD, and modify if needed.
- Unit Tests: Test each operation with multiple inputs including each edge case for
example overflow, and checking the output and status bits.
4 16 Bit Registers
- Inputs: WriteData (16 bits)
- Outputs: ReadData (16 bits)
- Control: EnWRWrite, EnARWrite, UseCur, SelWR, SelAR, SPWrite (all 1 bit)
- Description: Based on the control signals, registers will have data written to them, read
from them, or both in some cases.
- Implements: PC, AR, WR, SP, CR
- Hardware: Pull from class resources - Register.
- Unit Tests: Test if it can store only when designated as well as read the stored value.
11 to 16 bit Shifter
- Inputs: Input (11 bits)
- Outputs: Output (16 bits)
- Control: none
- Description: The input is shifted to the left by 5 bits.
- Implements: << 5
- Hardware: Write in Verilog ourselves.
- Unit Tests: Test if it can shift the bits by the proper amount.
11 to 16 bit Sign Extender
- Inputs: Input (11 bits)
- Outputs: Output (16 bits)
- Control: none
- Description: The input is sign extended from 11 bits to 16 bits.
- Implements: SE
- Hardware: Write in Verilog ourselves.
- Unit Tests: Tests if it can properly sign extend both positive and negative immediates.
11 to 16 bit Zero Extender
- Inputs: Input (11 bits)
- Outputs: Output (16 bits)
- Control: none
- Description: The input is zero extended from 11 bits to 16 bits.
- Implements: ZE
- Hardware: Write in Verilog ourselves.
- Unit Tests: Tests if it can properly zero extend non-signed immediates.
4 16 bit Adders
- Inputs: PC & 1, PC & 2, AR & SE(imm), SP & 1 or -1 (All 16 bits)
- Output: Output (16 bits)
- Control: none
- Description: Each adder sums its two 16 bit inputs and outputs a 16 bit result; for example,
the current PC is added to 2 and outputted as a 16 bit number.
- Implements: +
- Hardware: Write in Verilog ourselves.
- Unit Tests: Test if it can add properly including edge cases such as overflow.
2 bit Latch
- Inputs: None
- Outputs: CurReg (1 bit)
- Control: SelWR (1 bit) and SelAR (1 bit)
- Description: No inputs are needed. The controls change the value of the latch, which
then is outputted to the register bus.
- Implements: CR
- Hardware: Write in Verilog ourselves.
- Unit Tests: Tests whether or not it stores and outputs the correct bit depending on all
four scenarios of inputs.
Variable Left Shifter
- Inputs: Value (16 bits)
- Outputs: Shifted Value (16 bit)
- Control: SelWR (1 bit) and SelAR (1 bit)
- Description: Shifts Value by the immediate input
- Implements: <<
- Hardware: Write in Verilog ourselves.
- Unit Tests: Tests if it can properly shift an input by the other input (an immediate).
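
The ALUOp encodings listed for the 16 bit ALU can be illustrated with a small behavioural model in Python. This is a sketch, not the class-provided ALU: the function names are illustrative, results are wrapped to 16 bits, and treating Set Less Than as a signed two’s-complement comparison is an assumption.

```python
MASK = 0xFFFF

# ALUOp encodings from the hardware component list.
ALU_AND, ALU_OR, ALU_ADD, ALU_SUB, ALU_SLT = 0b000, 0b001, 0b010, 0b110, 0b111


def to_signed(x):
    """Interpret a 16-bit value as two's complement."""
    x &= MASK
    return x - 0x10000 if x & 0x8000 else x


def alu(op, a, b):
    """Return the 16-bit ALUOut for inputs ALUIn1 (a) and ALUIn2 (b)."""
    a &= MASK
    b &= MASK
    if op == ALU_AND:
        return a & b
    if op == ALU_OR:
        return a | b
    if op == ALU_ADD:
        return (a + b) & MASK          # overflow wraps around
    if op == ALU_SUB:
        return (a - b) & MASK
    if op == ALU_SLT:
        return 1 if to_signed(a) < to_signed(b) else 0   # assumed signed compare
    raise ValueError(f"unused ALUOp: {op:03b}")


print(hex(alu(ALU_ADD, 0xFFFF, 1)))   # 0x0
print(hex(alu(ALU_SUB, 0, 1)))        # 0xffff
print(alu(ALU_SLT, 0xFFFF, 1))        # 1
```

The overflow cases in the last three lines are the kind of edge cases the unit test plan calls for.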

Integration Plan
1. Instruction fetching and decoding
a. Connect a memory unit with a 11-16 bit shifter, a sign extender, and a zero
extender, along with a control unit.
b. Testing
i. Feed in various addresses
ii. Verify that the outputs (opcode and immediates) are correct
1. Based on the coe file we set up
2. PC
a. Connect a 16 bit register, a 4 to 1 mux, and two 16 bit adders.
b. Testing
i. Increment and update PC (with each option from PCSrc)
ii. Receive addresses from the stack
iii. Receive addresses from the immediate based on the PCSrc control bit.
3. Stack
a. Connect a 16 bit register, a 16 bit adder, three 2 to 1 muxes, and a memory unit
b. Testing
i. Increment and decrement the value at SP (verify that it is what we expected)
ii. Test that reading and writing, using both input sources, to stack memory
works
4. Register Operations
a. Connect two 16 bit registers, a 16 bit ALU, a variable left shifter, two 2 to 1
muxes, a 4 to 1 mux, and a 7 to 1 mux.
b. Testing
i. Test instructions that use different combinations of components
ii. Verify output is what was expected
5. Memory Operations
a. Connect a memory unit, a 16 bit adder, and a 2 to 1 mux.
b. Testing
i. Test that reading and writing, using all inputs, to data memory works
6. Control Unit
a. Combine the two control subsystems with a system that adds opcode mapping
b. Testing
i. See below
7. Combine Subsystems
a. Combine all subsystems.
b. Testing
i. Test different instructions, increasing the difficulty of the combinations
throughout the tests

CONTROL SIGNALS

Descriptions
SPWrite (1 bit)
- Description: Enables writing to SP register
SPStep (1 bit)
- Description: Selects the step direction for the stack pointer (SP) to change by
SPSelect (1 bit)
- Description: Selects between SP and SP-2
StackSrc (1 bit)
- Description: Selects the source to write into the stack between PC+1 and Current
Register
StackWrite (1 bit)
- Description: Enables writing to the stack
PCSrc (2 bits)
- Description: Selects the source to write to PC from PC+1, PC+2, return address, or jump
address
IsSkip (1 bit)
- Description: Selects if we use skip logic or not when deciding the PCSrc
SkipE (1 bit)
- Description: Tells the logic if we’re doing a skip equal or skip not equal
MemSrc (1 bit)
- Description: Selects between Working register(WR) or zero to write to memory
MemWrite (1 bit)
- Description: Enables writing to memory
MemRead (1 bit)
- Description: Enables reading from memory
ALUSrcB (2 bits)
- Description: Selects between immediate, 0, or the data from memory to pass into the B
port of ALU
ALUOp (3 bit)
- Description: Tells the ALU which operation it should perform
WRCRSrc (3 bits)
- Description: Selects the source to use for either the Working Register or Current
Register between the immediate, data from memory, ALU output, or the data from the
stack
SelWR (1 bit)
- Description: Selects the Working Register (WR) as the Current Register (CR)
SelAR (1 bit)
- Description: Selects the Addressing Register (AR) as the Current Register (CR)
EnWRWrite (1 bit)
- Description: Enables writing to the Working Register (WR) directly without the use of
current register
EnARWrite (1 bit)
- Description: Enables writing to the Addressing Register (AR) directly without the use of
current register
UseCur (1 bit)
- Description: Selects the current register for reading and writing
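
One way to picture how SelWR, SelAR, UseCur, EnWRWrite, and EnARWrite could interact is the following Python sketch. It is a guess at the behaviour rather than a transcription of the control unit: the encoding of the latch output, the reset state, holding the old value when neither or both select lines are asserted, and the way UseCur is combined with the direct write enables are all assumptions.

```python
WR, AR = 0, 1   # assumed encoding of the 1-bit current-register latch output


class CurrentRegisterLatch:
    """Remembers which register the last swr/sar instruction selected."""

    def __init__(self):
        self.cur_reg = WR               # assumed reset state

    def clock(self, sel_wr, sel_ar):
        if sel_wr and not sel_ar:       # swr
            self.cur_reg = WR
        elif sel_ar and not sel_wr:     # sar
            self.cur_reg = AR
        # Neither or both asserted: hold the previous value (assumption).
        return self.cur_reg


def write_enables(cur_reg, use_cur, en_wr_write, en_ar_write):
    """Return (WR write enable, AR write enable) for one instruction."""
    wr_we = bool(en_wr_write or (use_cur and cur_reg == WR))
    ar_we = bool(en_ar_write or (use_cur and cur_reg == AR))
    return wr_we, ar_we


latch = CurrentRegisterLatch()
latch.clock(sel_wr=0, sel_ar=1)         # sar
# An instruction such as addi writes whichever register is current:
print(write_enables(latch.cur_reg, use_cur=1, en_wr_write=0, en_ar_write=0))   # (False, True)
# An instruction such as load writes WR directly, regardless of the latch:
print(write_enables(latch.cur_reg, use_cur=0, en_wr_write=1, en_ar_write=0))   # (True, False)
```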

Test Plan
We will input each opcode and verify that every control bit is correct. To verify the control bits,
we will follow our control signals table. We will also check the two diagrams below separately,
performing more in-depth checks before integrating them with the rest of the controls. These
checks will still input signals and verify that the output is correct.

Control Signal Bits by Opcode


Refer to Appendix B.

Special Control Unit Logic

DESIGN PERFORMANCE
Memory Usage:
- Instruction Memory: 77 Instructions = 154 bytes
- Data Memory: 3 Memory variables = 6 bytes
Number of instructions executed when relPrime is called with 0x13B0:
- 204,141 instructions
Number of cycles required to run relPrime when called with 0x13B0:
- 204,141 cycles
Average cycles per instruction:
- 1 cycle per instruction
Cycle time for ALINLS:
- 28.813 ns = 34.71 MHz
Total execution time when relPrime is called with 0x13B0:
- 28.813 ns * 204141 = 0.005882 seconds

Device Utilization Summary:


Selected Device : 3s500efg320-4
Number of Slices: 1071 out of 4656 23%
Number of Slice Flip Flops: 273 out of 9312 2%
Number of 4 input LUTs: 1970 out of 9312 21%
Number used as logic: 1458
Number used as RAMs: 512
Number of IOs: 26
Number of bonded IOBs: 26 out of 232 11%
Number of GCLKs: 1 out of 24 4%

SAMPLE CODE
Load Address
sar
loadui @IN1^
store @IN1

Iteration
primeLoop:
call gcd
swr
loadi 1
sar
loadui @OUT0^
swr
sne @OUT0
goto end
pop
addi 1
sar
loadui @IN1^
store @IN1
swr
peek
sar
loadui @IN0^
store @IN0
loadui @IN1^
load @IN1
swr
push
goto primeLoop

Conditional Statement
snz
goto gcdIF
goto gcdELSE

Assembly RelPrime and Euclid’s Algorithm


relPrime:
sar
loadui @IN0^
load @IN0
swr
push # save n
loadi 2 # put 2 in wr
push # save m
sar # switch to addressing register (ar)
loadui @IN1^ # load upper part of in1 addr into ar
store @IN1 # store wr in ar + lower part of in1
primeLoop:
call gcd # jal to gcd
swr # switch to wr
loadi 1 # puts 1 into wr
sar # switch to ar
loadui @OUT0^ # load upper out0
swr # switch to wr
sne @OUT0 # compare wr(1) to ar(out0)
goto end # if they are equal goto return
pop # pop m off stack into wr
addi 1 # m + 1
sar
loadui @IN1^ # load upper IN1
store @IN1 # store wr in ar + lower IN1
swr
peek # peek in wr
sar
loadui @IN0^ # load upper IN0
store @IN0 # store wr in ar + lower IN0
loadui @IN1^ # load upper IN1
load @IN1 # load IN1 into wr
swr
push # push m+1 onto stack
goto primeLoop # loop
end:
pop # get m off of the stack
sar
loadui @OUT0^ # load upper out0
store @OUT0 # store wr (m) in ar + lower out0
pop # get n off stack that way
return # the top of stack is return addr
gcd:
sar
loadui @IN1^ # load upper in1
load @IN1 # load in1(b) into wr
loadui @OUT0^ # load upper out0
store @OUT0 # store wr (b) into OUT0
loadui @IN0^ # load upper in0
load @IN0 # load in0 (a) into wr
snz # compare wr to Zero
return # return because a == 0
gcdLoop:
sar
loadui @IN0^ # load upper in0
load @IN0 # put in0 (a) in wr
loadui @OUT0^ # load upper out0
store @OUT0 # put wr (a) in out0
loadui @IN1^ # load upper in1
load @IN1 # load in1 (b) into wr
snz # compare in1 (b) to 0
return # if in1 == 0, return
load 0 # load in1 (b) into wr
loadui @IN0^ # load upper in0
slt @IN0 # is wr (in1 b) < ar (in0 a)?
sez # skip if wr == 0 (ie b >= a)
goto gcdIf # wr != 0
goto gcdElse # Otherwise
gcdIf:
loadui @IN0^ # load upper in0
load @IN0 # put in0 (a) in wr
loadui @IN1^ # load upper in1
sub @IN1 # subtract in1 (b) from wr
loadui @IN0^ # load upper in0
store @IN0 # store wr in in0
goto gcdLoop
gcdElse:
loadui @IN1^ # load upper in1
load @IN1 # load in1 (b) into wr
loadui @IN0^ # load upper in0 (a)
sub @IN0 # subtract in0 (a) from wr (b)
loadui @IN1^
store @IN1 # store wr in IN1
goto gcdLoop
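
For reference, the control flow of the assembly above corresponds to the following Python version of relPrime and the subtraction form of Euclid’s algorithm. This is a reading of the listing, not code from the project; in the assembly, the arguments and result travel through @IN0, @IN1, and @OUT0 rather than function parameters.

```python
def gcd(a, b):
    """Euclid's algorithm by repeated subtraction, as in the gcd routine."""
    if a == 0:
        return b
    while b != 0:
        if a > b:          # gcdIf: a = a - b
            a = a - b
        else:              # gcdElse: b = b - a
            b = b - a
    return a


def rel_prime(n):
    """Return the smallest m >= 2 that is relatively prime to n."""
    m = 2
    while gcd(n, m) != 1:  # primeLoop
        m += 1
    return m


print(rel_prime(0x13B0))   # 11
```

0x13B0 is 5040 = 2^4 * 3^2 * 5 * 7, so every candidate from 2 through 10 shares a factor with it and 11 is the first value that does not.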

Machine Code RelPrime and Euclid’s Algorithm

Part 1: Part 2:

1101100000000000 0101000000000001

1100000000000000 1110000000000000

0101000000000010 0111100000000000

1110000000000000 0100100000000001

0111100000000000 1101100000000000

0100000000000000 1001100000000000

0011100000000000 1110000000000000

1101100000000000 0111100000000000

0011100000000000 0100100000000000

1110000000000000 0111100000000000

0111100000000000 0110000000000001

0100100000000001 1101100000000000

0010000000101010 0011100000000000

1100000000000000 0010100000001100

1101100000000000 1001000000000000

0101000000000001 1110000000000000

1110000000000000 0111100000000000

0111100000000000 0100100000000101

1101100000000000 1001000000000000

1010100000000101 1011100000000000

0010100000100100 1110000000000000

1001000000000000 0111100000000000

Part 3: Part 4:

0100000000000001 1011000000000000

0111100000000000 1100100000000000

0100100000000101 0010100000111111

0111100000000000 0010100001000011

0100000000000000 0000100000000000

1101000000000000 0111100000000000

1011100000000000 0100100000000000

1110000000000000 0010100000110011

0111100000000000 0100000000000000

0100000000000000 0011100000000000

0111100000000000 0111100000000000

0100100000000101 0000100000000000

0111100000000000 1001000000000000

1010100000000001 0100100000000000

1011100000000000 0010100000110011

Appendix B: Control Signal Bits by Opcode



Appendix C: Design Process Journals

Group 2A
1.7.19 at 5:30pm for 2.25 hours
- Worked on Milestone 1 Design Document
- We first discussed our preferences for the architecture we were going to build.
Going through what each of us knew, we found that we all had experience with
load-store architectures and that Worden and Chris had experience with an
accumulator architecture. This led us to an accumulator architecture design with
“mips-like” functionalities.
- We determined which instructions to include by looking through lists of MIPS
and PIC instructions and deciding which would be necessary for our architecture
and how they would be used in our design.
- Decided to meet again on 1.8.19 at 9pm in the Percopo 3 study room
1.8.19 at 9:00pm for 3 hours
- Worked on Milestone 1 Design Document
- We decided what addressing modes we would need in order to run each
instruction. From here we looked at the MIPS and PIC addressing modes and
used them as the basis for the ones we chose.
- We also went through each of the instructions and decided what the immediate
would stand for (address/number/ignored).
- Elected Worden SpaceMaster
- Decided to meet again on 1.9.19 at 2:30pm in the Percopo 3 study room.
1.9.19 at 2:30pm for 3.75 hours
- Worked on Milestone 1 Design Document
- We made decisions about the memory layout (i.e. stack vs instruction vs
reserved) and how these spaces can be used by instructions and users.
- We worked on turning our instructions into machine code.
1.13.19 at 5:15pm for 5.25 hours
- Worked on fixing the Milestone 1 Design Document based on Micah’s comments
- We fixed the layout of the document so it was less of a list of requirements and
more of a document documenting our architecture.
- The first major decision we made was how to deal with the fact that we can’t
access all of memory with load and store the way we have it set up right now. We
tossed around a few ideas including adding an addressing register with an
instruction that switches between our registers or an instruction that moves stuff
from the WR to an addressing register, and breaking our memory into chunks
that we then can move in between in order to access more memory.
- We also decided to add push/pop functionalities to use with our stack.
- We decided to meet on 1.14.19 at 5pm in the Percopo 3 study room.
1.14.19 at 5:30 for 2.5 hours
- Worked on fixing Milestone 1
- Converted relprime and gcd to our assembly language using new decisions
- Started on the RTL for our instructions
- We decided to meet on 1.15.19 at 4:15 in the Percopo 3 study room.
1.15.19 at 4:30 for 5 hours
- Worked on Milestone 2
- Added RTL for all instructions.
- Added in a general hardware list for the datapath.
- We decided to use a register file controlled by a latch output in order to
determine which register to do work with.
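The "current register" idea from this entry can be pictured with a small behavioural model. This is only a sketch under stated assumptions: the class name, the `switch` operation, and the 16-bit register width are hypothetical illustrations, not the team's Verilog design.

```python
class RegisterFile:
    """Two working registers plus a one-bit 'current register' latch."""

    def __init__(self):
        self.regs = [0, 0]   # the two user-visible registers
        self.current = 0     # latch output: index of the register in use

    def switch(self):
        # Hypothetical operation: flip the latch to select the other register.
        self.current ^= 1

    def read(self):
        # Instructions read whichever register the latch currently selects.
        return self.regs[self.current]

    def write(self, value):
        # Assumed 16-bit registers, matching the 16-bit machine words.
        self.regs[self.current] = value & 0xFFFF


rf = RegisterFile()
rf.write(5)      # lands in register 0
rf.switch()
rf.write(9)      # lands in register 1
rf.switch()
print(rf.read())  # prints 5
```

The point of the latch is that the same instruction encoding can act on either register, so the programmer has to keep track of which one is selected.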
1.20.19 at 9:00 for 2 hours
- Worked on fixing Milestone 2
- We fixed the descriptions of each instruction making sure that they were very
explicit in their wording.
- We decided to make a new register to hold the stack pointer instead of a
reserved place in memory. This is so we don’t have to access memory twice
when we need to know the stack pointer.
- Updated the hardware components list to make it more of a “shopping list”.
- Worked on Milestone 3
- We decided to tentatively do a multicycle datapath. We started to work on a
drawing for this.
- We decided to meet on 1.21.19 at 5:15 in the Percopo 3 study room.
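The stack pointer decision in this entry can be illustrated by counting memory accesses in a toy model. The sketch below is an assumption-laden illustration: `SP_SLOT`, the upward-growing stack, and the function names are hypothetical, and the counts apply to this model only, not to measurements of the real datapath.

```python
class Memory:
    """Word-addressed memory that counts every load and store."""

    def __init__(self, size):
        self.cells = [0] * size
        self.accesses = 0

    def load(self, addr):
        self.accesses += 1
        return self.cells[addr]

    def store(self, addr, value):
        self.accesses += 1
        self.cells[addr] = value


SP_SLOT = 0  # hypothetical reserved address that would hold the stack pointer


def push_sp_in_memory(mem, value):
    sp = mem.load(SP_SLOT)       # extra access: fetch the stack pointer
    mem.store(sp, value)         # the actual push
    mem.store(SP_SLOT, sp + 1)   # extra access: write the pointer back


def push_sp_in_register(mem, sp, value):
    mem.store(sp, value)         # the only memory access
    return sp + 1                # new pointer stays in a register


mem_a = Memory(16)
mem_a.store(SP_SLOT, 1)
mem_a.accesses = 0               # ignore the setup store
push_sp_in_memory(mem_a, 42)

mem_b = Memory(16)
sp = push_sp_in_register(mem_b, 1, 42)

print(mem_a.accesses, mem_b.accesses)  # prints "3 1"
```

With the pointer in its own register, a push touches memory once, which matters in a datapath where each instruction gets a single memory access per cycle.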
1.21.19 at 5:15 for 2.75 hours
- Worked on Milestone 3
- Worked on the datapath drawing.
- We decided to switch to a single cycle datapath. We decided to switch based on
our comfort level.
1.22.19 at 3:30 for 3 hours
- Worked on Milestone 3
- Finished the datapath drawing.
- We created test plans to test components, subsystems, and the entire system.
- We decided to meet on 1.23.19 in class.
- We implemented the SE in Verilog as well as its associated unit test.
1.23.19 at 9:00am for 2 hours
- Finished Milestone 3
- Implemented and tested bit shifter
- Finished updating the design document and the design journal
- In our tests, we attempted to cover a few valid example cases, and applicable
edge cases.
- We have not found any errors in our tests or components yet.
- Our architecture made the datapath slightly more spread out than an
architecture like MIPS in terms of the number of components. We have many
sections that work individually without many interactions with other sections, but
each section requires more complex choices internally.
1.27.19 at 9:00 for 2 hours
- Fixed Milestone 3
- Cleaned up the datapath
- Combined muxes, erased lines to control bits, fixed pc adders
- Noted which parts come from the course website
- Added in more details about integration of parts
1.28.19 at 5:30 for 2.5 hours
- Worked on Milestone 4
- Worked on creating parts and tests
1.29.19 at 7:00 for 4 hours
- Worked on Milestone 4
- Created a control table that had all control bits and all instructions
- Created the last parts and wrote tests for them.
- We had to figure out the skip instructions’ control logic. This was difficult because
we only needed to skip the next instruction based on specific conditions including
the Zero output on the ALU. We decided to add two new control bits and change
PCSrc’s least significant control bit if it needed to skip an instruction (PC+2
instead of PC+1). The final design we chose can be found in the design
document.
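The skip logic described in this entry can be sketched as a small next-PC function. The names `skip_if_zero` and `skip_if_not_zero` and the two-bit `pc_src` encoding (0b00 selects PC+1, 0b01 selects PC+2) are assumptions made for illustration; the actual control bits are defined in the design document.

```python
def next_pc(pc, pc_src, skip_if_zero, skip_if_not_zero, alu_zero):
    """Return the next PC for sequential and skip cases only.

    pc_src encoding is assumed: 0b00 -> PC+1, 0b01 -> PC+2.
    skip_if_zero / skip_if_not_zero stand in for the two added control bits.
    """
    skip = (skip_if_zero and alu_zero) or (skip_if_not_zero and not alu_zero)
    if skip:
        pc_src |= 0b01  # force the least significant PCSrc bit to select PC+2
    if pc_src == 0b00:
        return pc + 1
    if pc_src == 0b01:
        return pc + 2
    raise ValueError("other PCSrc encodings are not modelled in this sketch")


# A skip-if-zero instruction with the ALU Zero output asserted skips one instruction.
print(next_pc(10, 0b00, True, False, True))   # prints 12
# The same instruction with Zero deasserted falls through to the next one.
print(next_pc(10, 0b00, True, False, False))  # prints 11
```

Overriding a single PCSrc bit keeps the control table unchanged for every other instruction, since only the skip conditions can alter the selected PC input.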
1.30.19 at 5:30 for 4.5 hours
- Worked on Milestone 4
- Wrote a plan to test the control signals.
- Started to create the control unit.
- Created some subsystems based on our integration plan.
2.4.19 at 6:45 for 3 hours
- Worked on Milestone 4 and 5
- Finished implementing our control unit.
- Updated control table based on discrepancies.
- Started creating the Register Operations subsystem.
2.5.19 at 6:00 for 4 hours
- Worked on Milestone 5
- Worked on testing our various subsystems.
2.8.19 at 2:00 for 7 hours
- Worked on Milestone 5
- Finished testing all of the subsystems.
- Started to integrate them into one project.
2.10.19 at 7:00 for 5 hours
- Finished testing the system for Milestone 5.
- Started writing an assembler.
- Started working on the final report.
2.12.19 at 4:00 for 2.5 hours
- Finished coding the assembler
2.13.19 at 5:00 for 3.75 hours
- Finished debugging RelPrime and GCD assembly code
- Started testing efficiency of system
2.14.19 at 8:00 for 5 hours
- Worked on I/O state machine for our processor.
2.15.19 at 10:00 for 2 hours and 2:00 for 2.5 hours
- Worked on getting FPGA board input to our processor.
2.17.19 at 12:00 for 10 hours
- Worked on getting input from FPGA board.
- Worked on presentation and final report.

Abby Abernathy
1.7.19 [2.25 hours]
- Worked on Milestone 1 Design Document
- Decided on architecture type and instructions
1.8.19 [3 hours]
- Worked on Milestone 1 Design Document
- Worked on writing a program in our language
- Defined addressing modes and determined which addressing mode goes with
each instruction
1.9.19 [1.5 hours]
- Worked on Milestone 1 Design Document
- Converted our instructions into machine code
- Decided how our main memory would be separated
1.13.19 [5.25 hours]
- Worked on fixing Milestone 1 Design Document
- Fixed layout of file
- Figured out how to fix problems Micah brought up with our current design
1.14.19 [2.5 hours]
- Worked on fixing Milestone 1 Design Document
- Started RTL
1.15.19 [5 hours]
- Worked on Milestone 2 Design Document
1.20.19 [2 hours]
- Worked on fixing Milestone 2 Design Document
- Worked on Milestone 3 Design Document
- Started drawing a multicycle datapath.
1.21.19 [2.75 hours]
- Worked on Milestone 3 Design Document
- Worked on single-cycle datapath drawing.
1.23.19 [2 hours]
- Worked on Milestone 3 Design Document
- Finalized the document, adding in the tests for a few components.
1.27.19 [2 hours]
- Worked on fixing Milestone 3 Design Document
- Fixed datapath drawing.
- Added more detail to the document (integration and implementation).
1.28.19 [2 hours]
- Worked on Milestone 4 Design Document
- Wrote tests for the memory unit and register parts from the course page.
1.29.19 [4 hours]
- Worked on Milestone 4 Design Document
- Created a control unit table and updated datapath as needed.
1.30.19 [4.5 hours]
- Worked on Milestone 4 Implementation
- Worked on creating the PC and Stack subsystem.
2.4.19 [.75 hours]
- Worked on Milestone 5 Design Document
- Fixed some of the errors in the design document.
2.5.19 [3 hours]
- Worked on Milestone 5 Implementation
- Worked on the PC Subsystem.
2.8.19 [6 hours]
- Worked on Milestone 5.
- Finished testing the PC and Stack Subsystems.
- Updated the design document.
- Worked on some of the Control Subsystem testing.
2.10.19 [4 hours]
- Worked on the final report and presentation.
2.13.19 [2 hours]
- Updated documentation with information from the last week.
2.14.19 [5 hours]
- Worked on creating and testing an I/O state machine.
2.15.19 [3 hours]
- Worked on getting FPGA input.
2.17.19 [10 hours]
- Worked on getting input from FPGA board.
- Worked on presentation and final report.

Worden Barr
Monday, January 7, 2019
Began preliminary work on the design document and established the team’s schedule
[2.25 hrs]

Tuesday, January 8, 2019


Milestone 1: Worked on instruction set and programmed Euclid’s algorithm in our language
[3.00 hrs]

Wednesday, January 9, 2019


Finished machine code conversion and planned out storage design
[3.75 hrs]

Sunday, January 13, 2019


Reworked Milestone 1 based on instructor feedback
[5.25 hrs]

Monday, January 14, 2019


Continued reworking instruction set
[2.00 hrs]

Tuesday, January 15, 2019


Added RTL descriptions and specified hardware inputs, outputs, and control bits.
[5.00 hrs]

Sunday, January 20, 2019


Started laying out the datapath and fixed sections of the RTL based on feedback
[2.00 hrs]

Monday, January 21, 2019


Worked some more on laying out the datapath
[2.50 hrs]

Tuesday, January 22, 2019 (Morning)


Continued datapath design and partial control signals
[1.50 hrs]

Tuesday, January 22, 2019 (Afternoon)


Finished datapath design, created control signal descriptions, implemented Verilog tests of a
couple of hardware components, and began work on lab 06
[3.25 hrs]

Wednesday, January 23, 2019


Finished implementing the hardware components from yesterday and continued work on lab 06
[1.75 hrs]

Sunday, January 27, 2019


Fixed problems with Milestone 3 according to weekly feedback
[2.25 hrs]

Monday, January 28, 2019


Started building processor components and test benches in verilog
[1.50 hrs]

Monday, January 28, 2019


Built more parts and test benches in verilog and helped work on subsystem testing
[2.50 hrs]

Tuesday, January 29, 2019


Built more parts and test benches in verilog and helped work on assigning control bits to all the
instructions
[3.75 hrs]

Wednesday, January 30, 2019


Built the instruction decode subsystem, helped build the PC subsystem, and helped build the
control
[4.50 hrs]

Monday, February 4, 2019


Began building the register and operation subsystem
[3.00 hrs]

Tuesday, February 5, 2019


Finished building the register and operation subsystem and built test benches for all the
subsystems I created
[4.00 hrs]

Friday, February 8, 2019


Integrated all the subsystems
[7.00 hrs]

Sunday, February 10, 2019


Finished integrating and testing the system
[5.00 hrs]

Monday, February 11, 2019


Started debugging RelPrime
[4.33 hrs]

Wednesday, February 13, 2019


Finished debugging RelPrime and obtained a timing for an input of 0h13B0
[3.25 hrs]

Thursday, February 14, 2019


Fixed datapath to work with the soon-to-be-finished I/O state machine
[3.25 hrs]

Friday, February 15, 2019


Began work on FPGA implementation
[4.67 hrs]

Sunday, February 17, 2019


Fairly successfully got input to work for the FPGA, but not output…
[10.00 HOURS]

Alexander Bradshaw
1/7/19
Monday January 7, 2019
Met with team to discuss design documents. [2.5 hours]

1/8/19
Tuesday January 8, 2019
Worked on milestone 1 and finished the machine language translations. [3 hours]

1/9/19
Wednesday January 9, 2019
Finished putting the assembly code into machine code [3 hours]

1/13/19
Sunday January 13, 2019
Mainly fixed the problems that Micah brought up during the discussion [2.5 hours]

1/14/19
Monday January 14, 2019
Worked on fixing milestone 1
Started on RTL [4 hours]

1/15/19
Tuesday January 15, 2019
Worked on Milestone 2, wrote a lot of RTL, and named the hardware requirements [5 hours]

1/20/19
Sunday January 20, 2019
Fixed the Milestone 2 problems and started working on the datapath. [2 hours]

1/21/19
Monday January 21, 2019
Started the datapaths for milestone 3 and wrote the data path. [3 hours]

1/22/19
Tuesday January 22, 2019
Fixed the datapath mistakes found at the last meeting [3 hours]

1/23/19
Wednesday January 23, 2019
Checked and made sure we hit all of the checkpoints for the last milestone [2 hours]

1/27/19
Sunday January 27, 2019
Cleaned up the datapath and fixed everything that Micah said to fix in the last meeting [2 hours]

1/28/19
Monday January 28, 2019
Started on Milestone 4 and made tests for components in Verilog [3 hours]

1/29/19
Tuesday January 29, 2019
Worked on the control units and made a table for it. Finished all of the control units. [4 hours]

1/30/19
Wednesday January 30, 2019
Worked on the control units and implemented the pc and stack subsystem. [4.5 hours]

2/4/19
Monday February 4, 2019
Fixed the errors from milestone 4 and implemented control unit. [3 hours]

2/8/19
Friday February 8, 2019
Worked on the Stack subsystem and helped write a test bench for it [4 hours]

2/10/19
Sunday February 10, 2019
Worked on making an assembler to convert the assembly code into machine code [4 hours]

2/12/19
Tuesday February 12, 2019
Finished making the assembler and turned the rel prime code into machine code [2.5 hours]

2/13/19
Wednesday February 13, 2019
Finished rel prime and got it working with our processor. [3 hours]

2/14/19
Thursday February 14, 2019
Worked on the input for the project [5 hours]

2/15/19
Friday February 15, 2019
Tried to get the relprime working on the fpga board [3 hours]

Chris Comeau
Monday, January 7, 2019 Established schedule and initial project steps [135 min]

Tuesday, January 8, 2019 Contributed towards instruction set and developed Euclidean
algorithm in assembly [180 min]

Wednesday, January 9, 2019 Made decisions towards storage design and finished
machine code conversions with both PC addressing and User Mode allocation [225 min]

Sunday, January 13, 2019 Helped rework problems and move towards a useful and unique
solution [315 min]

Monday, January 14, 2019 Refined instruction set and began defining RTL for each instruction
[120 min]

Tuesday, January 15, 2019 Helped add RTL descriptions as well as hardware inputs, outputs,
and control signals [300 minutes]

Sunday, January 20, 2019 Helped update RTL descriptions based on feedback as well as start
basic datapath layout [110 minutes]

Monday, January 21, 2019 Continued datapath design and helped implement most instructions’
requirements in hardware [150 minutes]

Tuesday, January 22, 2019 (Morning) Continued datapath design and partial control signals [90
minutes]

Tuesday, January 22, 2019 (Afternoon) Finished datapath design and helped create control
signal descriptions and describe unit tests and integration plans [225 minutes]

Wednesday, January 23, 2019 Helped implement some of the components as well as the unit
tests for said units [120 minutes]

Sunday, January 27, 2019 Fixed datapath and integration plan problems from milestone 3 [135
minutes]

Monday, January 28, 2019 (Morning) Fixed integration plan and helped implement and test
parts [90 minutes]

Monday, January 28, 2019 (Afternoon) Implemented more parts for integration and testing and
began breaking down subsystem testing [150 minutes]

Tuesday, January 29, 2019 Updated and assigned control bits to each instruction and updated
RTL accordingly for each, along with skipping control logic [230 minutes]

Wednesday, January 30, 2019 Helped implement PC and Instruction fetching subsystems, and
started to create the Control Unit [270 minutes]

Monday, February 4, 2019 Helped fix problems from Milestone 4 and finished implementation of
Control Unit [180 minutes]

Tuesday, February 5, 2019 Fixed issues with Control Unit and began testing control using
multiple testbenches [240 minutes]

Friday, February 8, 2019 Fully tested Control unit with each opcode to verify instructions and
assisted integrating Stack and Register operations subsystems [420 minutes]

Sunday, February 10, 2019 Helped with issues in full system integration and began work on
assembler [260 minutes]

Monday, February 11, 2019 Finished basic assembler and began debugging relprime [260
minutes]

Wednesday, February 13, 2019 Debugged relprime and obtained initial timing [225 minutes]

Thursday, February 14, 2019 Created and tested input state machine to use for I/O [300
minutes]

Friday, February 15, 2019 Debugged and tested datapath input capability and began FPGA I/O
[280 minutes]

Sunday, February 17, 2019 Debugged FPGA I/O and reworked to support working register
input, continued debugging and helped update final design document [600 minutes]
