
Example: Translate the following C statement into MIPS machine language code.

Assume that $t1 holds the base address of array A and $s2 corresponds to h:

A[300] = h + A[300];

This compiles into the following (the offset 1200 is 300 × 4, since each array element is a 4-byte word):

lw  $t0, 1200($t1)   # t0 = A[300]
add $t0, $s2, $t0    # t0 = h + A[300]
sw  $t0, 1200($t1)   # A[300] = h + A[300]

First, the machine language instructions using decimal numbers:

op   rs   rt   rd    shamt/address   funct
35   9    8    N/A   1200            N/A
0    18   8    8     0               32
43   9    8    N/A   1200            N/A

The binary equivalent of the decimal form is as follows:

1000 1101 0010 1000 0000 0100 1011 0000
0000 0010 0100 1000 0100 0000 0010 0000
1010 1101 0010 1000 0000 0100 1011 0000

In hex:

0x8D2804B0
0x02484020
0xAD2804B0
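
The packing of the decimal fields into 32-bit words can be checked mechanically. The following is a minimal sketch, not part of the original notes; the helper names encode_r and encode_i are made up for illustration, and the field widths are the standard R-type and I-type layouts.

```python
def encode_r(rs, rt, rd, shamt, funct, op=0):
    """Pack R-type fields: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, imm):
    """Pack I-type fields: op(6) rs(5) rt(5) imm(16); imm is masked to 16 bits."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


T0, T1, S2 = 8, 9, 18  # register numbers of $t0, $t1, $s2

words = [
    encode_i(35, T1, T0, 1200),   # lw  $t0, 1200($t1)
    encode_r(S2, T0, T0, 0, 32),  # add $t0, $s2, $t0
    encode_i(43, T1, T0, 1200),   # sw  $t0, 1200($t1)
]
for w in words:
    print(f"0x{w:08X}")
# prints 0x8D2804B0, 0x02484020, 0xAD2804B0 (one per line)
```

Each value follows directly from the decimal fields in the table: shifting a field left by the number of bits to its right places it in the correct position.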

J-type
This instruction format is for the jump instructions (i.e., j and jal). It consists of 6 bits for the operation
field and the rest of the bits for the address field. These instructions are identified and differentiated by
their opcode numbers (2 and 3 respectively).

Addresses in Branches and Jumps


Branches and jumps belong to different instruction formats: branches (e.g., beq, bne, etc.) come in the
I-format, which has a 16-bit immediate, and jumps (e.g., j, jal) come in the J-format, which has a
26-bit immediate. Since addresses are 32 bits wide, how do we fit such an address into
either a 16-bit or a 26-bit field? The answer is simple: We cannot do that.

Instead MIPS handles it all by making a few assumptions and using a certain type of addressing mode
to calculate the effective memory address of an operand. What MIPS does is the following depending
on the type of instruction being executed:

● PC-relative addressing: Treat branches as relative offsets which we add to the PC, meaning that
we add the sign-extended 16-bit immediate, counted in words, to the address of the instruction
following the branch (PC + 4). This allows us to move forward (i.e., offset is positive) and
backward (i.e., offset is negative). Since the immediate field is 16 bits, we can move ±2^15
words/instructions away (i.e., the sign-extended 16-bit offset value provides offsets between
-32768 and 32767 words). MIPS calculates the branch target of a branch instruction as
branch target = (PC + 4) + (sign-extended 16-bit immediate × 4).
● Pseudo-direct addressing: Treat jumps as an absolute value by replacing some bits
in the PC. MIPS calculates the target of a jump instruction by concatenating the PC’s 4 leftmost
bits, the 26-bit immediate, and 00.

Why do we have 00 at the end of the immediate when we calculate the new instruction address for both
the branch and jump instructions?

Instructions are word-aligned, i.e., they're 4 bytes each and start at addresses that are multiples of 4,
so an instruction address never has anything other than 00 in its 2 rightmost bits. This is similar to how
any multiple of 100 in the decimal system has two 0s as its rightmost digits. We always jump by full
instructions, since it doesn't make sense to jump into the middle of an instruction, and we make sure of
this by interpreting the offset or target field as a count of full instructions. To convert that count to a
byte address we multiply by 4 (i.e., shift left by 2), because each instruction is 4 bytes and memory
addresses are in bytes. Thus we jump in terms of instructions and not in terms of bytes.
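
The two address calculations can be written out as a short sketch. This is an illustration under stated assumptions, not the author's code: the function names and the address 0x00400020 are hypothetical, and the jump takes its upper 4 bits from PC + 4 (the address of the following instruction), which agrees with the PC's own upper bits except at the very end of a 256 MB region.

```python
def sign_extend16(imm16):
    """Interpret a 16-bit field as a signed two's-complement integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def branch_target(pc, imm16):
    """PC-relative: (PC + 4) + (sign-extended word offset shifted left by 2)."""
    return (pc + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF


def jump_target(pc, target26):
    """Pseudo-direct: upper 4 bits of PC + 4, then the 26-bit field, then 00."""
    return ((pc + 4) & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)


PC = 0x00400020  # hypothetical address of the branch or jump instruction
print(hex(branch_target(PC, 0x0003)))   # 0x400030 (3 instructions forward)
print(hex(branch_target(PC, 0xFFFD)))   # 0x400018 (3 instructions backward)
print(hex(jump_target(PC, 0x0100004)))  # 0x400010
```

In both functions the shift left by 2 is what appends the 00 bits, turning a count of instructions into a byte address.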

From Assembly to Machine Code


There are three examples under this heading:

1. Example: Walkthrough - This example goes over each single instruction, determining its
instruction format, and filling out the different fields. This culminates with the machine code
representing the assembly code.
2. Example: Branches and Jumps - Since most instructions are straightforward, this example
focuses only on discussing branch and jump instructions of a particular example. In other words,
how to calculate the offset/target and how MIPS uses them to find out the address of the
instruction that’s being branched/jumped to. This example fills out some of the gaps left by
Example: Walkthrough.
3. Example: To the Point - Pretty much self-explanatory. This example assumes you already
know how to calculate MIPS fields of whatever type and it’s just here to prompt you to say
“Wow, I understand where that’s coming from”.

Example: Walkthrough
Given the following C code, translate into MIPS assembly and then into machine code. Assume the first
line of your MIPS assembly code resides in memory address 0xABCD1234:
n = 0;
for (i = 0; i < 5; i++) {
    n += i;
}

To MIPS assembly code


Assuming that $s0 stores the variable n, the MIPS assembly code is as follows (ignore the numbers in
front of each instruction; they’re for reference):

# Alloc: $s0 = n, $t0 = i

(1)       add  $s0, $zero, $zero   # n = 0
(2)       add  $t0, $zero, $zero   # i = 0
(3) LOOP: slti $t1, $t0, 5         # t1 = (i < 5) ? 1 : 0
(4)       beq  $t1, $zero, EXIT    # goto EXIT if i >= 5
(5)       add  $s0, $s0, $t0       # n = n + i
(6)       addi $t0, $t0, 1         # i = i + 1
(7)       j    LOOP                # repeat the loop
(8) EXIT:

To Machine Code
We’ll translate the above MIPS assembly code line by line, starting with line (1). For each
instruction, we’ll identify its instruction format and then fill out its fields accordingly. Appendix
A.10, MIPS R2000 Assembly Language, in Patterson and Hennessy describes each MIPS instruction with
its mnemonic, its instruction format, and a description of what it does.

● Line (1): add $s0, $zero, $zero

The add instruction is an R-type instruction, so its 32-bit word is divided into six fields: op (6 bits),
rs (5 bits), rt (5 bits), rd (5 bits), shamt (5 bits), and funct (6 bits).

The opcode for most R-type instructions is 0, so that’s what we have in the upper 6 bits of our 32-bit
word. The funct value for add is 0x20 (or decimal 32), so that’s what we have in the lower 6 bits (the
funct field). This instruction doesn’t perform any type of shift, so the shamt field is assigned 0. The
general form is add $rd, $rs, $rt; matching things up, we get rd = $s0, rs = $zero, and rt = $zero. The
numbers for registers $s0 and $zero are 16 and 0 respectively. Therefore:

         op      rs     rt     rd     shamt  funct
Decimal  0       0      0      16     0      32
Binary   000000  00000  00000  10000  00000  100000

Thus, the machine code for the add $s0, $zero, $zero is
0000 0000 0000 0000 1000 0000 0010 0000
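
One way to double-check a hand-assembled word is to split it back into its fields. The sketch below does this for the R-type layout; decode_r is a hypothetical helper name, not something from the original notes.

```python
def decode_r(word):
    """Split a 32-bit word into (op, rs, rt, rd, shamt, funct)."""
    return (
        (word >> 26) & 0x3F,  # op:    bits 31..26
        (word >> 21) & 0x1F,  # rs:    bits 25..21
        (word >> 16) & 0x1F,  # rt:    bits 20..16
        (word >> 11) & 0x1F,  # rd:    bits 15..11
        (word >> 6) & 0x1F,   # shamt: bits 10..6
        word & 0x3F,          # funct: bits 5..0
    )


# Machine code for add $s0, $zero, $zero, written in groups of four bits.
word = int("0000 0000 0000 0000 1000 0000 0010 0000".replace(" ", ""), 2)
print(decode_r(word))  # (0, 0, 0, 16, 0, 32)
```

The decoded tuple matches the Decimal row of the table, which confirms that regrouping the fields into nibbles did not move any bits.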

● Line (2): add $t0, $zero, $zero

This is the same instruction as before. The only difference is that $t0 is now the destination register.
Its number is 8. Thus we have

         op      rs     rt     rd     shamt  funct
Decimal  0       0      0      8      0      32
Binary   000000  00000  00000  01000  00000  100000

● Line (3): slti $t1, $t0, 5

The slti instruction is an I-type instruction, so its 32-bit word is divided into four fields: op (6 bits),
rs (5 bits), rt (5 bits), and imm (16 bits).

The opcode for slti is 0xa (10 in decimal). The general form is slti $rt, $rs, imm; matching things up:
rt = $t1, rs = $t0, and imm = 5 (stored as a 16-bit two’s-complement value). Thus,

         op      rs     rt     imm
Decimal  10      8      9      5
Binary   001010  01000  01001  0000000000000101

● Line (4): beq $t1, $zero, EXIT

The beq instruction is also an I-type instruction, so it uses the same op, rs, rt, and imm fields.

The opcode for this instruction is 4. Matching up the general form beq $rs, $rt, label, we have
rs = $t1, rt = $zero, and label = EXIT.

To get the complete machine code for this instruction, we must find out the memory address the label
EXIT represents; the offset stored in the imm field is derived from this address. In the problem
statement, we’re told to assume the first line (namely line (1)) of the assembly code resides at
0xABCD1234.

Remember that MIPS instructions have byte addresses, so addresses of sequential words differ by 4,
the number of bytes in a word. Knowing that the instruction add $s0, $zero, $zero sits at memory
location 0xABCD1234, we can add 4 bytes (i.e., one word or instruction) successively to get to the next
word (or next instruction). Thus

Memory address   Instruction/label
0xABCD1234       Line 1
0xABCD1238       Line 2
0xABCD123C       Line 3 (LOOP)
0xABCD1240       Line 4
0xABCD1244       Line 5
0xABCD1248       Line 6
0xABCD124C       Line 7
0xABCD1250       Line 8 (EXIT)
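
The address column is just repeated addition of 4 to the base address. A minimal sketch that regenerates it, with address_of_line as a hypothetical helper name:

```python
BASE = 0xABCD1234  # address of line (1), given in the problem statement


def address_of_line(line, base=BASE):
    """Address of the 1-based assembly line, at 4 bytes per instruction."""
    return base + 4 * (line - 1)


for line in range(1, 9):
    print(f"0x{address_of_line(line):08X}  Line {line}")
# the last row printed is: 0xABCD1250  Line 8
```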

Clearly, line (8)’s full address is 0xABCD1250. The problem is that the imm field for the beq instruction is
only 16 bits and that address is a full word (32 bits). The bottom line is that we cannot fit a 32-bit label
into a 16-bit field. MIPS takes care of this problem by using a form of branching known as PC-relative
addressing.

The program counter (PC) is just an implicit register that holds the address of the instruction
currently being executed.

How does PC-relative addressing work though? Given that the program counter (PC) contains the
address of the current instruction, we can branch within ±2^15 words of it if we use the PC as the
register to be added to the offset. More precisely, the branch target is the sum of the address of the
instruction after the one the PC points to (i.e., PC + 4) and a constant in the instruction, counted in
words.

In the case of beq $t1, $zero, EXIT (the instruction currently being executed), the PC points
to this same instruction, and we must find out how many words lie between here and the label EXIT
(i.e., the branch address). Counting from 0 at the instruction after the PC, which is line (5), it takes
3 words to get from line (5) to line (8); equivalently, (0xABCD1250 - 0xABCD1244) / 4 = 3. Thus, 3 is
the constant value we use in the imm field. Therefore,

         op      rs     rt     imm
Decimal  4       9      0      3
Binary   000100  01001  00000  0000000000000011

NOTE: Remember we can branch ±2^15 words away from our current address to a branch address,
meaning the offset can be either positive or negative: 1) if we branch from a lower address to a higher
one, then the offset is positive (as demonstrated with the previous instruction), and 2) if we branch from
a higher address to a lower one, then the offset is negative.
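
The offset calculation, including the sign and the range limit, can be sketched as follows. The helper name branch_offset is made up for illustration, and the backward branch from line (7) to LOOP is a hypothetical case (the example program uses j there, not a branch).

```python
def branch_offset(branch_addr, target_addr):
    """Word offset stored in imm: (target - (branch + 4)) / 4."""
    delta = target_addr - (branch_addr + 4)
    if delta % 4 != 0:
        raise ValueError("target is not a whole number of words away")
    offset = delta // 4
    if not -2**15 <= offset < 2**15:
        raise ValueError("target is out of range for a 16-bit offset")
    return offset


# Forward branch from the walkthrough: beq on line (4) to EXIT on line (8).
forward = branch_offset(0xABCD1240, 0xABCD1250)
print(forward)  # 3

# Hypothetical backward branch: from line (7) back to LOOP on line (3).
backward = branch_offset(0xABCD124C, 0xABCD123C)
print(backward)                     # -5
print(f"0x{backward & 0xFFFF:04X}")  # 0xFFFB, the 16-bit two's-complement field
```

A negative offset is stored in the imm field in two's-complement form, which is why the hardware sign-extends the field before adding it to PC + 4.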

● Line (5): add $s0, $s0, $t0

We already know about this instruction format. We’ll just make the corresponding replacement for the
registers:

         op      rs     rt     rd     shamt  funct
Decimal  0       16     8      16     0      32
Binary   000000  10000  01000  10000  00000  100000

● Line (6): addi $t0, $t0, 1

The addi instruction is an I-type instruction, so it uses the op, rs, rt, and imm fields.

The opcode for addi is 0x8 (8 in decimal). The remaining fields are as follows: rs = $t0, rt = $t0,
and imm = 1. Therefore,

         op      rs     rt     imm
Decimal  8       8      8      1
Binary   001000  01000  01000  0000000000000001

● Line (7): j LOOP

The j instruction is a J-type instruction, so its 32-bit word is divided into two fields: op (6 bits) and
target (26 bits).

The opcode for this instruction is 2. As for the target, we cannot just place the 32-bit
address of the label we’re jumping to; after all, a 32-bit address cannot fit into a 26-bit field. MIPS
takes care of this by adjusting the label’s address to fit into a 26-bit field in the following manner:

Given a 32-bit address, we know the rightmost 2 bits are 0 (since instruction addresses are multiples
of 4), so we don’t bother storing them. We take the next 26 bits to represent our target; these 26 bits
fit into the 26-bit field. This leaves us with 4 extra bits, which are dealt with by assuming the upper
4 bits of the target of the j instruction are the same as the upper 4 bits of the PC.

This type of addressing is known as pseudo-direct addressing.

For this instance, the address of the target (i.e., LOOP) of the jump instruction is 0xABCD123C. In
binary,

1010 1011 1100 1101 0001 0010 0011 1100

Discarding the 2 rightmost bits and assuming the 4 leftmost bits are the same as the PC’s leaves us
with the following 26 bits, which is what we use for the instruction’s target’s machine code.

10 1111 0011 0100 0100 1000 1111

Therefore,

         op      target
Decimal  2       49497231
Binary   000010  10111100110100010010001111
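
Extracting the 26-bit target field is a shift and a mask. The sketch below reproduces the bits above for LOOP; jump_field is a hypothetical helper name.

```python
def jump_field(target_addr):
    """26-bit J-type target: drop the 2 low zero bits, then the 4 high bits."""
    return (target_addr >> 2) & 0x03FFFFFF


LOOP = 0xABCD123C
field = jump_field(LOOP)
print(f"{field:026b}")  # 10111100110100010010001111
print(field)            # 49497231

word = (2 << 26) | field  # opcode 2 (j) in the upper 6 bits
print(f"0x{word:08X}")    # 0x0AF3448F
```

Shifting right by 2 discards the two zero bits, and the mask keeps only the low 26 bits of what remains, which drops the 4 leftmost bits that the PC supplies at run time.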

Everything Together
Taking the machine code (i.e., binary representation) for each of the instructions yields the following
binary and hex dumps:

Binary                             Hex
00000000000000001000000000100000   0x00008020
00000000000000000100000000100000   0x00004020
00101001000010010000000000000101   0x29090005
00010001001000000000000000000011   0x11200003
00000010000010001000000000100000   0x02088020
00100001000010000000000000000001   0x21080001
00001010111100110100010010001111   0x0AF3448F
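
The whole walkthrough can be replayed in a few lines. This is a sketch under the walkthrough's assumptions (base address 0xABCD1234, one instruction per numbered line); the helper names r_type, i_type, j_type, and addr are hypothetical, and only the four registers used by this program are listed.

```python
REG = {"$zero": 0, "$t0": 8, "$t1": 9, "$s0": 16}
BASE = 0xABCD1234  # address of line (1)


def addr(line):
    """Address of the 1-based assembly line."""
    return BASE + 4 * (line - 1)


def r_type(rd, rs, rt, funct):
    """R-type word with op = 0 and shamt = 0."""
    return (REG[rs] << 21) | (REG[rt] << 16) | (REG[rd] << 11) | funct


def i_type(op, rt, rs, imm):
    """I-type word; imm is masked to its 16-bit two's-complement form."""
    return (op << 26) | (REG[rs] << 21) | (REG[rt] << 16) | (imm & 0xFFFF)


def j_type(op, target_addr):
    """J-type word; the field keeps bits 27..2 of the target address."""
    return (op << 26) | ((target_addr >> 2) & 0x03FFFFFF)


LOOP, EXIT = addr(3), addr(8)
beq_offset = (EXIT - (addr(4) + 4)) // 4  # words from line (5) to EXIT

program = [
    r_type("$s0", "$zero", "$zero", 0x20),      # (1) add  $s0, $zero, $zero
    r_type("$t0", "$zero", "$zero", 0x20),      # (2) add  $t0, $zero, $zero
    i_type(0x0A, "$t1", "$t0", 5),              # (3) slti $t1, $t0, 5
    i_type(0x04, "$zero", "$t1", beq_offset),   # (4) beq  $t1, $zero, EXIT
    r_type("$s0", "$s0", "$t0", 0x20),          # (5) add  $s0, $s0, $t0
    i_type(0x08, "$t0", "$t0", 1),              # (6) addi $t0, $t0, 1
    j_type(0x02, LOOP),                         # (7) j    LOOP
]
for word in program:
    print(f"{word:032b}  0x{word:08X}")
```

Each printed row pairs the 32-bit pattern with its hex form, so the output can be compared line by line against the dump.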

Example: Branches and Jumps


The following C loop

for (j = 0; j < 10; j++) {
    b = b + j;
}

translates into the MIPS assembly code below:

# Alloc: $s0 = b, $s1 = j

      add  $s1, $zero, $zero   # j = 0
      addi $t0, $zero, 10      # t0 = 10
LOOP: beq  $s1, $t0, DONE      # goto DONE if j == 10
      add  $s0, $s0, $s1       # b = b + j
      addi $s1, $s1, 1         # j = j + 1
      j    LOOP                # iterate
DONE:

Assuming that the first line of code starts at memory location 0x00000000, translate the branch and
jump instructions into machine code.

This example is based on David B. Schaffer’s ISA 2.4 MIPS: Addresses in branches and jumps
YouTube video.

Branches
Let’s start with beq $s1, $t0, DONE, which compares j (in $s1) against 10 (in $t0). The beq instruction
is an I-type instruction and, aside from having to determine the offset, everything else is given to us by
a MIPS reference sheet. To make it easier to reason about, let’s add the address of each
instruction/label to its left:
