
BEE 425 Microprocessor System Design

Spring 2020
Lab 3: ARM Assembly language CPU diagnostic programming
Note: this is the class project, Milestone 2
1. Lab Objectives:
• Using the Keil MDK integrated development environment, learned in Lab 2
• Programming in ARM assembly language
• Designing and testing CPU diagnostic code to use in Lab 4/Milestone 3

2. Materials Needed:
• Keil MDK version 4.7
• Assembly code listed in the textbook Figure 7.60
• Your own class project Milestone 1 report on CPU enhancements

You should have installed Keil MDK on your home machine to complete Lab 2. You will reuse it here.
1. The machine code is stored in a hexadecimal file called memfile.dat, which is loaded by the
testbench during simulation. The file consists of the machine code for the instructions, one
instruction per line. The testbench, top-level ARM module, and external memory HDL code are given
in the following examples. The memories in this example hold 64 words each.

Code:
Addr  Label   Instruction         Comment                       Binary encoding                           Hex
00    MAIN    SUB R0, R15, R15    ; R0 = 0                      1110 000 0010 0 1111 0000 0000 0000 1111  E04F000F
04            ADD R2, R0, #5      ; R2 = 5                      1110 001 0100 0 0000 0010 0000 0000 0101  E2802005
08            ADD R3, R0, #12     ; R3 = 12                     1110 001 0100 0 0000 0011 0000 0000 1100  E280300C
0C            SUB R7, R3, #9      ; R7 = 3                      1110 001 0010 0 0011 0111 0000 0000 1001  E2437009
10            ORR R4, R7, R2      ; R4 = 3 OR 5 = 7             1110 000 1100 0 0111 0100 0000 0000 0010  E1874002
14            AND R5, R3, R4      ; R5 = 12 AND 7 = 4           1110 000 0000 0 0011 0101 0000 0000 0100  E0035004
18            ADD R5, R5, R4      ; R5 = 4 + 7 = 11             1110 000 0100 0 0101 0101 0000 0000 0100  E0855004
1C            SUBS R8, R5, R7     ; R8 = 11 - 3 = 8, set Flags  1110 000 0010 1 0101 1000 0000 0000 0111  E0558007
20            BEQ END             ; shouldn't be taken          0000 1010 0000 0000 0000 0000 0000 1100   0A00000C
24            SUBS R8, R3, R4     ; R8 = 12 - 7 = 5             1110 000 0010 1 0011 1000 0000 0000 0100  E0538004
28            BGE AROUND          ; should be taken             1010 1010 0000 0000 0000 0000 0000 0000   AA000000
2C            ADD R5, R0, #0      ; should be skipped           1110 001 0100 0 0000 0101 0000 0000 0000  E2805000
30    AROUND  SUBS R8, R7, R2     ; R8 = 3 - 5 = -2, set Flags  1110 000 0010 1 0111 1000 0000 0000 0010  E0578002
34            ADDLT R7, R5, #1    ; R7 = 11 + 1 = 12            1011 001 0100 0 0101 0111 0000 0000 0001  B2857001
38            SUB R7, R7, R2      ; R7 = 12 - 5 = 7             1110 000 0010 0 0111 0111 0000 0000 0010  E0477002
3C            STR R7, [R3, #84]   ; mem[12+84] = 7              1110 010 1100 0 0011 0111 0000 0101 0100  E5837054
40            LDR R2, [R0, #96]   ; R2 = mem[96] = 7            1110 010 1100 1 0000 0010 0000 0110 0000  E5902060
44            ADD R15, R15, R0    ; PC = PC+8 (skips next)      1110 000 0100 0 1111 1111 0000 0000 0000  E08FF000
48            ADD R2, R0, #14     ; shouldn't happen            1110 001 0100 0 0000 0010 0000 0000 1110  E280200E
4C            B END               ; always taken                1110 1010 0000 0000 0000 0000 0000 0001   EA000001
50            ADD R2, R0, #13     ; shouldn't happen            1110 001 0100 0 0000 0010 0000 0000 1101  E280200D
54            ADD R2, R0, #10     ; shouldn't happen            1110 001 0100 0 0000 0010 0000 0000 1010  E280200A
58    END     STR R2, [R0, #100]  ; mem[100] = 7                1110 010 1100 0 0000 0010 0000 0110 0100  E5802064
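
The hexadecimal words in the listing can be rebuilt from their bit fields as a sanity check. The Python sketch below is only an illustration, not part of the lab materials: the helper names (encode_dp_imm, encode_dp_reg, branch_target) are hypothetical, and it assumes the immediate rotate field is zero, as it is for every instruction in the listing.

```python
# Minimal sketch: rebuild machine words from the listing's bit fields.
# All helper names here are hypothetical, chosen for illustration only.

COND = {"EQ": 0x0, "GE": 0xA, "LT": 0xB, "AL": 0xE}   # condition field, bits 31:28
CMD = {"AND": 0x0, "SUB": 0x2, "ADD": 0x4, "ORR": 0xC}  # cmd field, bits 24:21


def encode_dp_imm(cmd, rd, rn, imm8, cond="AL", s=0):
    """Data-processing instruction with an 8-bit immediate (rotate assumed 0)."""
    return ((COND[cond] << 28) | (1 << 25) | (CMD[cmd] << 21) | (s << 20)
            | (rn << 16) | (rd << 12) | (imm8 & 0xFF))


def encode_dp_reg(cmd, rd, rn, rm, cond="AL", s=0):
    """Data-processing instruction with an unshifted register operand."""
    return ((COND[cond] << 28) | (CMD[cmd] << 21) | (s << 20)
            | (rn << 16) | (rd << 12) | rm)


def branch_target(addr, word):
    """Byte address a branch at `addr` jumps to: (addr + 8) + 4 * imm24."""
    imm24 = word & 0xFFFFFF
    if imm24 & 0x800000:          # sign-extend the 24-bit offset
        imm24 -= 1 << 24
    return addr + 8 + 4 * imm24


print(f"{encode_dp_imm('ADD', rd=2, rn=0, imm8=14):08X}")        # ADD R2, R0, #14
print(f"{encode_dp_reg('SUB', rd=8, rn=5, rm=7, s=1):08X}")      # SUBS R8, R5, R7
print(f"{branch_target(0x20, 0x0A00000C):02X}")                  # BEQ END
```

The same arithmetic shows that both BEQ END at address 20 and B END at address 4C resolve to address 58, the final STR.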

Explanation:
Summary of the process:

A datapath contains all the functional units and connections necessary to implement an
instruction set architecture. Our single-cycle implementation uses two separate memories,
an ALU, some extra adders, and several multiplexers. MIPS is a 32-bit machine, so most of the
buses are 32 bits wide. The control unit tells the datapath what to do, based on the instruction
that is currently being executed. Our processor has ten control signals that regulate the
datapath; they can be generated by a combinational circuit that takes the instruction’s 32-bit
binary encoding as input. Several factors contribute to a processor’s execution time, and they
determine the performance limitations of the single-cycle machine. Pipelining is one way to
improve on the single-cycle machine’s performance.

$$\text{CPU time}_{X,P} = \text{Instructions executed}_{P} \times \text{CPI}_{X,P} \times \text{Clock cycle time}_{X}$$
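
As a rough illustration of this equation, the Python sketch below plugs in numbers for the diagnostic program. The instruction count of 19 is obtained by following the listing's control flow (the four skipped instructions are not counted), and CPI = 1 is the usual assumption for a single-cycle machine. The 10 ns clock period is a made-up value used only for the example, and the function name cpu_time_ns is hypothetical.

```python
def cpu_time_ns(instructions_executed, cpi, clock_cycle_time_ns):
    """CPU time = instructions executed * CPI * clock cycle time."""
    return instructions_executed * cpi * clock_cycle_time_ns


INSTRUCTIONS_EXECUTED = 19   # counted from the listing's control flow
CPI = 1                      # single-cycle assumption: one cycle per instruction
CLOCK_CYCLE_TIME_NS = 10     # hypothetical clock period, for illustration only

print(cpu_time_ns(INSTRUCTIONS_EXECUTED, CPI, CLOCK_CYCLE_TIME_NS), "ns")
```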

The program first clears R0: SUB R0, R15, R15 subtracts R15 from itself, so R0 = 0.
ADD R2, R0, #5 then gives R2 = 5, and ADD R3, R0, #12 gives R3 = 12. SUB R7, R3, #9
subtracts 9 from R3, so R7 = 3. ORR R4, R7, R2 takes the logical OR of R7 and R2, so
R4 = 3 OR 5 = 7. AND R5, R3, R4 gives R5 = 12 AND 7 = 4, and ADD R5, R5, R4 gives
R5 = 4 + 7 = 11.
SUBS R8, R5, R7 subtracts and sets the flags: R8 = 11 - 3 = 8. The result is not zero,
so BEQ END is not taken. SUBS R8, R3, R4 gives R8 = 12 - 7 = 5 and sets the flags again.
The result is not negative, so BGE AROUND is taken and ADD R5, R0, #0 is skipped, leaving
R5 = 11. SUBS R8, R7, R2 gives R8 = 3 - 5 = -2 and sets the flags. The ADDLT instruction
then executes because the LT condition is fulfilled when N != V (the negative and overflow
bits in the CPSR differ), so R7 = 11 + 1 = 12. SUB R7, R7, R2 gives R7 = 12 - 5 = 7.
STR R7, [R3, #84] stores R7 at address R3 + 84, so MEM[12 + 84] = MEM[96] = 7.
LDR R2, [R0, #96] loads the same word back, so R2 = 7.
ADD R15, R15, R0 writes PC + 8 to the PC (R15 reads as the address of the current
instruction plus 8), which skips ADD R2, R0, #14. B END is always taken and skips
ADD R2, R0, #13 and ADD R2, R0, #10. Finally, STR R2, [R0, #100] stores R2, so
MEM[100] = 7.
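
The expected values can be cross-checked with a small behavioural model. The Python sketch below is only an illustration and is not the lab's HDL or the Keil simulator: the names run, cond_passes, and PROGRAM are hypothetical, and the model handles only the instruction forms that appear in the listing (ADD, SUB, AND, ORR with an unrotated immediate or unshifted register, LDR/STR with a positive immediate offset, and B with the EQ, GE, LT, and AL conditions).

```python
# Hypothetical behavioural model of the instruction subset used by the listing.
PROGRAM = [
    0xE04F000F, 0xE2802005, 0xE280300C, 0xE2437009, 0xE1874002, 0xE0035004,
    0xE0855004, 0xE0558007, 0x0A00000C, 0xE0538004, 0xAA000000, 0xE2805000,
    0xE0578002, 0xB2857001, 0xE0477002, 0xE5837054, 0xE5902060, 0xE08FF000,
    0xE280200E, 0xEA000001, 0xE280200D, 0xE280200A, 0xE5802064,
]
MASK = 0xFFFFFFFF
SIGN = 0x80000000


def cond_passes(cond, n, z, c, v):
    """Evaluate the four condition codes that the listing uses."""
    return {0x0: z, 0xA: n == v, 0xB: n != v, 0xE: True}[cond]


def run(program):
    regs, mem = [0] * 16, {}
    n = z = c = v = False
    pc, executed = 0, 0
    while 0 <= pc < 4 * len(program):
        word = program[pc // 4]
        next_pc = pc + 4
        regs[15] = pc + 8                       # reads of R15 see PC + 8
        executed += 1
        if cond_passes(word >> 28, n, z, c, v):
            op = (word >> 26) & 0x3
            rn, rd = (word >> 16) & 0xF, (word >> 12) & 0xF
            if op == 0b00:                      # data processing
                a = regs[rn]
                b = (word & 0xFF) if (word >> 25) & 1 else regs[word & 0xF]
                cmd = (word >> 21) & 0xF
                carry, over = c, v              # logical ops leave C and V alone here
                if cmd == 0x4:                  # ADD
                    result = (a + b) & MASK
                    carry = a + b > MASK
                    over = ((a ^ result) & (b ^ result) & SIGN) != 0
                elif cmd == 0x2:                # SUB
                    result = (a - b) & MASK
                    carry = a >= b              # carry set means no borrow
                    over = ((a ^ b) & (a ^ result) & SIGN) != 0
                elif cmd == 0x0:                # AND
                    result = a & b
                elif cmd == 0xC:                # ORR
                    result = a | b
                else:
                    raise ValueError(f"unsupported cmd {cmd:#x}")
                if (word >> 20) & 1:            # S bit: update the flags
                    n, z, c, v = bool(result & SIGN), result == 0, carry, over
                if rd == 15:
                    next_pc = result            # writing R15 changes the PC
                else:
                    regs[rd] = result
            elif op == 0b01:                    # LDR / STR, immediate offset
                addr = (regs[rn] + (word & 0xFFF)) & MASK
                if (word >> 20) & 1:
                    regs[rd] = mem.get(addr, 0)
                else:
                    mem[addr] = regs[rd]
            elif op == 0b10:                    # branch
                imm24 = word & 0xFFFFFF
                if imm24 & 0x800000:
                    imm24 -= 1 << 24
                next_pc = pc + 8 + 4 * imm24
        pc = next_pc
    return regs, mem, executed


regs, mem, executed = run(PROGRAM)
print("mem[96] =", mem[96], " mem[100] =", mem[100], " executed =", executed)
```

The model stops when the PC runs past the last word, which happens right after the store at address 58.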
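Which parts of the CPU are validated if the results match expectations?
The final store of 7 to MEM[100] depends directly on R0 and R2, and indirectly on R3, R4,
R5, R7, and R15, so these registers are validated along with the instructions that use them.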

Which parts of the CPU are not tested at all? E.g. any unused registers.
R1, R6, and R9 through R14 are never used. R8 is written but never read back.

Which parts of the ALU are tested?
The ADD, SUB, AND, and ORR operations, plus the flag outputs produced by SUBS.

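Have we tested all bits of the ALU in all operations? If not, which bits in which functions?
No. The operands are small constants, so AND and ORR exercise only bits 3:0. ADD and SUB
also work mostly on the low-order bits; the upper result bits change only in
SUBS R8, R7, R2, where the result is -2.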

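Which ALU status bits are checked? Which are not checked?
Z is checked by BEQ, and N and V are checked together by BGE and ADDLT. The C (carry) flag
is set by SUBS but no instruction in the program reads it, so it is not checked.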
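
The flag values behind the three SUBS instructions can be worked out explicitly. The Python sketch below is a hedged illustration with hypothetical names (subs_flags, condition); it assumes 32-bit unsigned register values and covers only the EQ, GE, and LT conditions used in the listing.

```python
MASK = 0xFFFFFFFF


def subs_flags(a, b):
    """Return (result, N, Z, C, V) for a 32-bit SUBS computing a - b."""
    result = (a - b) & MASK
    n = bool(result >> 31)                        # negative: bit 31 of the result
    z = result == 0                               # zero
    c = a >= b                                    # carry set means no borrow
    v = bool((((a ^ b) & (a ^ result)) >> 31) & 1)  # signed overflow
    return result, n, z, c, v


def condition(name, n, z, c, v):
    """Evaluate one of the condition codes used by the listing."""
    return {"EQ": z, "GE": n == v, "LT": n != v}[name]


for label, a, b, cond in [("SUBS R8, R5, R7", 11, 3, "EQ"),
                          ("SUBS R8, R3, R4", 12, 7, "GE"),
                          ("SUBS R8, R7, R2", 3, 5, "LT")]:
    result, n, z, c, v = subs_flags(a, b)
    print(f"{label}: N={int(n)} Z={int(z)} C={int(c)} V={int(v)} "
          f"{cond} -> {condition(cond, n, z, c, v)}")
```

None of the three conditions depends on C, so a fault in the carry flag would not change the program's final result.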

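How have we tested memory access?
By using STR and LDR operations: STR R7, [R3, #84] writes 7 to address 96,
LDR R2, [R0, #96] reads it back into R2, and STR R2, [R0, #100] writes the final result to
address 100.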

If we were to test memory access more thoroughly, how could we do so?
If your computer's CPU had to constantly access the hard drive to retrieve every piece of
data it needs, it would operate very slowly. When the information is kept in memory, the
CPU can access it much more quickly. Most forms of memory are intended to store data
temporarily.

How have we tested branch instructions?
By using BEQ END (not taken), BGE AROUND (taken), and B END (always taken).
