A[300] = h + A[300];
op  rs  rt  rd  shamt/address  funct
0   18  8   8   0              32
In hex:
0x8D2804B0
0x02484020
0xAD2804B0
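As a quick check on encodings like these, a 32-bit word can be split back into its fields. The sketch below is in Python rather than MIPS, and decode_fields is a hypothetical helper name, not part of any MIPS toolchain. It only applies the standard field widths (6/5/5/5/5/6 bits, with the low 16 bits doubling as the I-type immediate).

```python
def decode_fields(word):
    """Split a 32-bit MIPS word into its raw bit fields (hypothetical helper)."""
    return {
        "op":    (word >> 26) & 0x3F,   # bits 31..26
        "rs":    (word >> 21) & 0x1F,   # bits 25..21
        "rt":    (word >> 16) & 0x1F,   # bits 20..16
        "rd":    (word >> 11) & 0x1F,   # bits 15..11 (R-type only)
        "shamt": (word >> 6) & 0x1F,    # bits 10..6  (R-type only)
        "funct": word & 0x3F,           # bits 5..0   (R-type only)
        "imm":   word & 0xFFFF,         # bits 15..0  (I-type only)
    }


fields = decode_fields(0x8D2804B0)
print(fields["op"], fields["rs"], fields["rt"], fields["imm"])
```

For 0x8D2804B0 this yields op = 35, rs = 9, rt = 8, and imm = 1200, i.e., 300 × 4, the byte offset of A[300].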
J-type
This instruction format is for the jump instructions (i.e., j and jal). It consists of 6 bits for the operation
field and the rest of the bits for the address field. These instructions are identified and differentiated by
their opcode numbers (2 and 3 respectively).
Instead, MIPS handles it all by making a few assumptions and using a particular addressing mode
to calculate the effective memory address of an operand. What MIPS does depends on the type of
instruction being executed:
● PC-relative addressing: Treat branches as relative offsets added to the PC, meaning that the
16-bit immediate is sign-extended and added to the PC to allow us to move forward (i.e., offset is
positive) and backward (i.e., offset is negative). Since the immediate field is 16 bits, we can
move ±2^15 words/instructions away from the PC (i.e., the sign-extended 16-bit offset value
provides offsets between -32768 and 32767 words). MIPS calculates the branch target from the PC
and the 16-bit immediate as: target = (PC + 4) + (sign-extended immediate << 2), where PC + 4 is
the address of the instruction after the branch.
● Pseudo-direct addressing: Treat jumps as an absolute address by replacing all but the upper
bits of the PC. MIPS builds the jump target from the 4 leftmost bits of the PC (strictly, of
PC + 4), followed by the 26-bit immediate, followed by 00.
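The two addressing modes can be sketched in a few lines of Python. This is a minimal illustration, not how any particular simulator does it: the function names are made up for this example, and the sample addresses (0x00400000 and friends) are arbitrary assumptions.

```python
def sign_extend_16(imm16):
    """Interpret the low 16 bits of imm16 as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def branch_target(pc, imm16):
    """PC-relative: (PC + 4) + (sign-extended immediate << 2), kept to 32 bits."""
    return (pc + 4 + (sign_extend_16(imm16) << 2)) & 0xFFFFFFFF


def jump_target(pc, imm26):
    """Pseudo-direct: upper 4 bits of PC + 4, then the 26-bit field, then 00."""
    return ((pc + 4) & 0xF0000000) | ((imm26 & 0x03FFFFFF) << 2)


# Arbitrary sample values, chosen only for illustration.
print(hex(branch_target(0x00400000, 3)))        # forward branch
print(hex(branch_target(0x00400000, 0xFFFD)))   # backward branch (offset -3)
print(hex(jump_target(0x00400000, 0x0100004)))  # jump
```

Note that both functions shift the immediate left by 2 before using it, which is where the trailing 00 bits come from.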
Why do we have 00 at the end of the immediate when we calculate the new instruction address for
both the branch and jump instructions?
Instructions are word-aligned, i.e., each one is 4 bytes long and starts at an address that is a
multiple of 4, so an instruction address never has anything other than 00 in its two rightmost bits.
This is similar to how any multiple of 100 in decimal has two 0s as its rightmost digits; in our case,
every word address is a multiple of 4, which means all of them have 00 in the 2 least significant bits.
We always jump by full instructions, since it doesn't make sense to jump into the middle of an
instruction, and we guarantee this by interpreting the offset as a count of full instructions. To
convert that count to bytes we multiply by 4 (i.e., shift left by 2), because each instruction is
4 bytes and memory addresses are in bytes. Thus we jump in terms of instructions and not in terms
of bytes.
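The alignment argument can be checked numerically. The short Python sketch below uses illustrative helper names (is_word_aligned and words_to_bytes are not from the source) to confirm that multiples of 4 always end in 00 and to show how far a 16-bit word offset reaches in bytes.

```python
WORD_BYTES = 4  # every MIPS instruction is one 4-byte word


def is_word_aligned(address):
    """True when the two least significant bits of the address are 00."""
    return (address & 0b11) == 0


def words_to_bytes(word_offset):
    """Convert an offset counted in instructions to one counted in bytes."""
    return word_offset << 2  # same as multiplying by 4


# Every multiple of 4 is word-aligned.
print(all(is_word_aligned(a) for a in range(0, 64, WORD_BYTES)))

# Reach of a signed 16-bit word offset, expressed in bytes.
print(words_to_bytes(-(1 << 15)), words_to_bytes((1 << 15) - 1))
```

Storing the offset in words rather than bytes therefore extends the reach of the same 16 bits by a factor of 4.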
1. Example: Walkthrough - This example goes over each single instruction, determining its
instruction format, and filling out the different fields. This culminates with the machine code
representing the assembly code.
2. Example: Branches and Jumps - Since most instructions are straightforward, this example
focuses only on discussing branch and jump instructions of a particular example. In other words,
how to calculate the offset/target and how MIPS uses them to find out the address of the
instruction that’s being branched/jumped to. This example fills out some of the gaps left by
Example: Walkthrough.
3. Example: To the Point - Pretty much self-explanatory. This example assumes you already
know how to calculate MIPS fields of whatever type and it’s just here to prompt you to say
“Wow, I understand where that’s coming from”.
Example: Walkthrough
Given the following C code, translate it into MIPS assembly and then into machine code. Assume the
first line of your MIPS assembly code resides at memory address 0xABCD1234:
n = 0;
for (i = 0; i < 5; i++) {
    n += i;
}
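# Alloc: $s0 = n, $t0 = i, $t1 = result of (i < 5)
      add  $s0, $zero, $zero    # (1) n = 0
      add  $t0, $zero, $zero    # (2) i = 0
LOOP: slti $t1, $t0, 5          # (3) $t1 = 1 if i < 5, else 0
      beq  $t1, $zero, EXIT     # (4) leave the loop when i >= 5
      add  $s0, $s0, $t0        # (5) n += i
      addi $t0, $t0, 1          # (6) i++
      j    LOOP                 # (7) next iteration
EXIT:                           # (8)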
To Machine Code
We’ll translate the above MIPS assembly code line by line, starting with line (1). For each
instruction, we’ll identify its instruction format and then fill out the fields of that format.
Appendix A.10, MIPS R2000 Assembly Language, in Patterson and Hennessy describes each MIPS
instruction with its mnemonic, its instruction format, and a description of what it does.
The add instruction is an R-type instruction, so its 32 bits are split into six fields:
op (6 bits), rs (5), rt (5), rd (5), shamt (5), and funct (6).
The opcode for most R-type instructions is 0, so that’s what we have in the upper 6 bits of our
32-bit word. The funct value for add is 0x20 (or decimal 32), so that’s what we have in the lower
6 bits (the funct field). This instruction doesn’t perform any type of shift, so the shamt field is
assigned 0. The general form is add $rd, $rs, $rt; matching things up, we get rd = $s0, rs = $zero,
and rt = $zero. The numbers for registers $s0 and $zero are 16 and 0 respectively. Therefore:
         op  rs  rt  rd  shamt  funct
Decimal   0   0   0  16      0     32
Thus, the machine code for add $s0, $zero, $zero is
0000 0000 0000 0000 1000 0000 0010 0000
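The field packing for add $s0, $zero, $zero can be reproduced with shifts and ORs. This Python sketch assumes the R-type layout described above; encode_r is an illustrative name, not a library function.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six R-type fields (6/5/5/5/5/6 bits) into one 32-bit word."""
    return (
        (op << 26)       # bits 31..26
        | (rs << 21)     # bits 25..21
        | (rt << 16)     # bits 20..16
        | (rd << 11)     # bits 15..11
        | (shamt << 6)   # bits 10..6
        | funct          # bits 5..0
    )


# add $s0, $zero, $zero: op = 0, rs = 0, rt = 0, rd = 16, shamt = 0, funct = 32
word = encode_r(0, 0, 0, 16, 0, 32)
print(f"{word:032b}")
print(f"0x{word:08X}")
```

The printed bit string should match the 32-bit pattern above once the spaces are removed.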
Line (2), add $t0, $zero, $zero, is the same kind of instruction as before. The only difference is
that $t0 is now the destination register. Its number is 8. Thus we have
         op      rs     rt     rd     shamt  funct
Decimal  0       0      0      8      0      32
Binary   000000  00000  00000  01000  00000  100000
The slti instruction is an I-type instruction, so its 32 bits are split into four fields:
op (6 bits), rs (5), rt (5), and imm (16).
The opcode for slti is 0xa (10 in decimal). The general form is slti $rt, $rs, imm; matching
things up, rt = $t1, rs = $t0, and imm = 5 (sign-extended to 16 bits if needed). Thus,
         op  rs  rt  imm
Decimal  10   8   9    5
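The I-type packing works the same way as the R-type one, just with fewer fields. A minimal Python sketch, assuming the 6/5/5/16 layout above (encode_i is an illustrative name):

```python
def encode_i(op, rs, rt, imm):
    """Pack the four I-type fields (6/5/5/16 bits) into one 32-bit word."""
    # Masking with 0xFFFF keeps negative immediates in 16-bit two's complement.
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


# slti $t1, $t0, 5: op = 10, rs = 8 ($t0), rt = 9 ($t1), imm = 5
word = encode_i(10, 8, 9, 5)
print(f"{word:032b}")
print(f"0x{word:08X}")
```

Note the operand order: the assembly syntax lists rt first, but rs comes first in the encoded word.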
The beq instruction is also an I-type instruction, so it uses the same four fields.
The opcode for this instruction is 4. The general form is beq $rs, $rt, label; matching things up,
we have rs = $t1, rt = $zero, and label/offset/imm = EXIT.
To get the complete machine code for this instruction, we must find the memory address that the
label EXIT represents; the offset is computed from this address. In the problem statement, we’re told
to assume that the first line (namely line (1)) of the code resides at address 0xABCD1234.
Remember that MIPS uses byte addresses, so the addresses of sequential words differ by 4,
the number of bytes in a word. Knowing that the instruction add $s0, $zero, $zero sits at memory
location 0xABCD1234, we can repeatedly add 4 bytes (i.e., one word or instruction) to get the
address of each following word (or instruction). Thus
Memory address Instruction/label
0xABCD1234 Line 1
0xABCD1238 Line 2
0xABCD123C Line 3
0xABCD1240 Line 4
0xABCD1244 Line 5
0xABCD1248 Line 6
0xABCD124C Line 7
0xABCD1250 Line 8
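The address column is just the base address plus 4 bytes per preceding line. A small Python sketch that regenerates the table (line_address is an illustrative helper; the base address is the one given in the problem statement):

```python
BASE = 0xABCD1234   # address of line (1), from the problem statement
WORD_BYTES = 4      # each instruction occupies one 4-byte word


def line_address(line_no, base=BASE):
    """Address of the given 1-based line, assuming one instruction per line."""
    return base + WORD_BYTES * (line_no - 1)


for n in range(1, 9):
    print(f"0x{line_address(n):08X}  Line {n}")
```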
Clearly, line (8)’s full address is 0xABCD1250. The problem is that the imm field for the beq instruction is
only 16 bits and that address is a full word (32 bits). The bottom line is that we cannot fit a 32-bit label
into a 16-bit field. MIPS takes care of this problem by using a form of branching known as PC-relative
addressing.
The program counter (PC) is just an implicit register that holds the address of the current instruction
being executed.
How does PC-relative addressing work, though? Given that the program counter (PC) contains the
address of the current instruction, we can branch within ±2^15 words of the current instruction if
we use the PC as the register to be added to the offset. More precisely, the branch address is the
sum of the address of the instruction after the current one (PC + 4) and a constant in the
instruction (the word offset, multiplied by 4 to convert it to bytes).
In the case of beq $t1, $zero, EXIT (the instruction currently being executed), the PC points
to this same instruction, and we must find out how many words lie between here and the label EXIT
(i.e., the branch address). Counting from 0 at the instruction after the PC, it takes 3 words to get
from line (5) to line (8); thus, 3 is the constant value we use in the imm field. Therefore,
         op  rs  rt  imm
Decimal   4   9   0    3
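The counting argument is equivalent to a subtraction and a division by 4. The Python sketch below computes the offset for this beq and then packs the word; branch_offset is an illustrative name, and the addresses are those of lines (4) and (8) in the table above.

```python
def branch_offset(branch_addr, target_addr):
    """Word offset from the instruction after the branch to the target."""
    delta = target_addr - (branch_addr + 4)   # measured from PC + 4
    if delta % 4 != 0:
        raise ValueError("target is not word-aligned relative to the branch")
    offset = delta // 4
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("target is out of range for a 16-bit offset")
    return offset


BEQ_ADDR = 0xABCD1240    # line (4)
EXIT_ADDR = 0xABCD1250   # line (8)

offset = branch_offset(BEQ_ADDR, EXIT_ADDR)
# beq: op = 4, rs = 9 ($t1), rt = 0 ($zero)
word = (4 << 26) | (9 << 21) | (0 << 16) | (offset & 0xFFFF)
print(offset, f"0x{word:08X}")
```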
NOTE: Remember we can branch ±2^15 words away from our current address to a branch address,
meaning the offset can be either positive or negative: 1) if we branch from a lower address to a
higher one, then the offset is positive (as demonstrated with the previous instruction) and 2) if we
branch from a higher address to a lower one, then the offset is negative.
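A negative offset is stored in the 16-bit imm field in two's complement. The following Python sketch shows the conversion in both directions. The helper names are illustrative, and the backward offset of -5 is a hypothetical case (a branch at line (7) going back to line (3)); the walkthrough itself uses j for that step.

```python
def to_imm16(word_offset):
    """Encode a signed word offset as a 16-bit two's-complement field."""
    if not -(1 << 15) <= word_offset < (1 << 15):
        raise ValueError("offset does not fit in 16 bits")
    return word_offset & 0xFFFF


def from_imm16(field):
    """Decode a 16-bit field back into a signed word offset."""
    return field - 0x10000 if field & 0x8000 else field


forward = to_imm16(3)     # lower address to higher address
backward = to_imm16(-5)   # higher address to lower address (hypothetical)
print(f"0x{forward:04X} 0x{backward:04X}")
print(from_imm16(forward), from_imm16(backward))
```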
This is another R-type add, so we already know the instruction format. We’ll just substitute the
corresponding registers, rd = $s0 (16), rs = $s0 (16), and rt = $t0 (8):
         op  rs  rt  rd  shamt  funct
Decimal   0  16   8  16      0     32
The addi instruction is an I-type instruction, so it uses the op, rs, rt, and imm fields.
The opcode for addi is 0x8 (8 in decimal). The remaining fields are as follows: rs = $t0, rt = $t0,
and imm = 1. Therefore,
         op  rs  rt  imm
Decimal   8   8   8    1
Given a 32-bit address, we know the rightmost 2 bits are 0 (since instruction addresses are
multiples of 4), so we don’t bother storing them. We take the next 26 bits to represent our target;
these 26 bits fit exactly into the 26-bit field. This leaves us with 4 extra bits, which are dealt with
by assuming the upper 4 bits of the target of the j instruction are the same as the upper 4 bits of
the PC.
For this instance, the address of the target (i.e., LOOP) of the jump instruction is 0xABCD123C. In
binary,
1010 1011 1100 1101 0001 0010 0011 1100
Discarding the 2 rightmost bits and assuming the 4 leftmost bits are the same as the PC’s leaves us
with the following 26 bits, which is what we use for the instruction’s target field:
10 1111 0011 0100 0100 1000 1111
Therefore,
         op  target
Decimal   2  49497231
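The bit surgery on the jump target can be written as a mask and a shift. This Python sketch assumes the j instruction sits at line (7), address 0xABCD124C, per the address table; jump_field is an illustrative name.

```python
def jump_field(pc, target_addr):
    """26-bit target field for a j/jal at address pc jumping to target_addr."""
    if target_addr % 4 != 0:
        raise ValueError("jump target must be word-aligned")
    # The upper 4 bits are not stored, so they must match those of PC + 4.
    if ((pc + 4) & 0xF0000000) != (target_addr & 0xF0000000):
        raise ValueError("target is outside the current 256 MB region")
    return (target_addr & 0x0FFFFFFF) >> 2   # drop top 4 bits and low 2 bits


J_ADDR = 0xABCD124C      # line (7)
LOOP_ADDR = 0xABCD123C   # line (3)

field = jump_field(J_ADDR, LOOP_ADDR)
word = (2 << 26) | field   # opcode 2 is j
print(f"{field:026b}")
print(f"0x{word:08X}")
```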
Everything Together
Taking the machine code (i.e., binary representation) for each of the instructions yields the following
binary and hex dumps:
Binary Hex
00000000000000001000000000100000 0x00008020
00000000000000000100000000100000 0x00004020
00101001000010010000000000000101 0x29090005
00010001001000000000000000000011 0x11200003
00000010000010001000000000100000 0x02088020
00100001000010000000000000000001 0x21080001
00001010111100110100010010001111 0x0AF3448F
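The whole dump can be regenerated from the field values used in the walkthrough, which is a convenient way to cross-check each row. The Python sketch below is a hand-rolled illustration, not a real assembler: the helper names are made up, and the register numbers, opcodes, and base address are the ones assumed throughout this example.

```python
def r_type(rs, rt, rd, funct, shamt=0, op=0):
    """Pack an R-type word: op, rs, rt, rd, shamt, funct."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def i_type(op, rs, rt, imm):
    """Pack an I-type word: op, rs, rt, 16-bit immediate."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def j_type(op, target_addr):
    """Pack a J-type word: op, then bits 27..2 of the target address."""
    return (op << 26) | ((target_addr & 0x0FFFFFFF) >> 2)


ZERO, T0, T1, S0 = 0, 8, 9, 16   # register numbers
BASE = 0xABCD1234                # address of line (1)
LOOP = BASE + 4 * 2              # line (3)
EXIT = BASE + 4 * 7              # line (8)


def assemble():
    """Return the seven instruction words of the loop, in program order."""
    beq_addr = BASE + 4 * 3                   # line (4)
    offset = (EXIT - (beq_addr + 4)) // 4     # words from PC + 4 to EXIT
    return [
        r_type(ZERO, ZERO, S0, 0x20),         # (1) add  $s0, $zero, $zero
        r_type(ZERO, ZERO, T0, 0x20),         # (2) add  $t0, $zero, $zero
        i_type(0x0A, T0, T1, 5),              # (3) slti $t1, $t0, 5
        i_type(0x04, T1, ZERO, offset),       # (4) beq  $t1, $zero, EXIT
        r_type(S0, T0, S0, 0x20),             # (5) add  $s0, $s0, $t0
        i_type(0x08, T0, T0, 1),              # (6) addi $t0, $t0, 1
        j_type(0x02, LOOP),                   # (7) j    LOOP
    ]


for word in assemble():
    print(f"{word:032b}  0x{word:08X}")
```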
Example: Branches and Jumps
Assuming that the first line of code starts at memory location 0x00000000, translate the branch and
jump instructions into machine code.
This example is based on David B. Schaffer’s ISA 2.4 MIPS: Addresses in branches and jumps
YouTube video.
Branches
Let’s start with beq $s0, $t0, DONE. The beq instruction is an I-type instruction and, aside from
having to determine the offset, everything else is given to us by a MIPS reference sheet. To make it
easier to reason about, let’s write the address of each instruction/label to its left: