General Instruction Format

Opcode-Field Address-Field
• •

Op-field: specifies the operation to be performed; Address-field: provides operands or the CPU register/MM addresses of the operands. Example:






are five main memory locations representing five variables;

3-address format: Assume variables A, B, C, D, and X are stored in MM locations labeled by their names. ADD R1 A B ADD R2 C # R1 [A] + [B] [C] + [D] [R1] [R2]

D # R2 X

MUL X R1 R2 #

Note: here we assume an instruction


where src1 and src2 are the source operand, dst is the destination operand, and * represents the operation specified in Op-code field OP.

2-address format: MOV R1 A # R1 ADD R1 B MOV R2 C # R1 # R2

[A] [B] + [R1] [C] [D] + [R2] [R1] [R2] [R2]

ADD R2 D # R2 MUL R2 R1 # R2

MOV X R2 # X
• • •

• • •

where src is the source operand, dst is the destination operand, and * represents the operation specified in Op-code field OP. 1-address format: Always use an implied accumulator (AC). LOAD ADD LOAD ADD MUL A # AC B # AC C # AC D # AC R # AC [A] [AC] + [B] [AC] [C] [AC] + [D] [AC] [AC] [R]



0-address format: used in stack-organized computer. First, the given notation of the operation is converted into ``reversed Polish notation (RPN)'' PUSH A PUSH B ADD PUSH C PUSH D ADD MUL POP X then execute this program:

o o

Stack A last-in, first-out (LIFO) data structure. Queue A first-in, first-out (FIFO) data structure.

Instruction Format:
An instruction is one, two, or three halfwords in length and must be located in main storage on an integral halfword boundary. Each instruction is in one of five basic formats: RR, RX, RS, SI, and S. Some instructions contain fields that vary slightly from the basic format, and in some instructions the operation performed does not follow the general rules stated in this section. All such exceptions are explicitly identified in the individual instruction descriptions. The format names express, in general terms, the classes of operands which participate in the operation: RR denotes a register-to-register operation; RX. a register-and-indexed-storage operation; RS, a register-and-storage operation; SI, a storage-and- immediate operation. The S format denotes an operation using all implied operand and storage. The first byte and, in the S format, the first two bytes of an instruction

contain the operation code (op code). For some instructions in the S format, all or a portion of the second byte is ignored. The first two bits of the operation code specify the length and format of an instruction, as follows:
Bit Positions Instruction Instruction 0-1 Length Format 00 One halfword RR 01 Two halfwords RX 10 Two halfwords RS/SI/S/RX

In the format illustration for each individual instruction description. the opcode field shows the opcode in hexadecimal representation. The hexadecimal representation uses one graphic for a four-bit code, and therefore two graphics for an eight-bit code. The graphics 0-9 are used for the codes 0000 - 1001 ; the graphics A-F are used for codes 1010 - 1111. The remaining fields in the format illustration for each instruction are designated by code names, consisting of a letter and possibly a subscript number. The subscript number denotes the operand to which the field applies.

Instruction Format
The MIPS R2000/R3000 ISA has fixed-width 32 bit instructions. Fixed-width instructions are common for RISC processors because they make it easy to fetch instructions without having to decode. These instructions must be stored at word-aligned addresses (i.e., addresses divisible by 4). The MIPS ISA instructions fall into three categories: R-type, I-type, and J-type. Not all ISAs divide their instructions this neatly. This is one reason to study MIPS as a first assembly language. The format is simple.

R-type instructions refer to register type instructions. Of the three formats, the R-type is the most complex. This is the format of the R-type instruction, when it is encoded in machine code. B31-26 opcode B25-21 B20-16 B15-11 B10-6 B5-0 register s register t register d shift amount function

The prototypical R-type instruction is:
add $rd, $rs, $rt

where $rd refers to some register d (d is shown as a variable, however, to use the instruction, you must put a number between 0 and 31, inclusive for d). $rs, $rt are also registers. The semantics of the instruction are;
R[d] = R[s] + R[t]

where the addition is signed addition. the order of the registers in the instruction is the destination register ($rd), followed by the two source registers ($rs and $rt). However, the actual binary format (shown in the table above) stores the two source registers first, then the destination register. Thus, how the assembly language programmer uses the instruction, and how the instruction is stored in binary, do not always have to match. Let's explain each of the fields of the R-type instruction.

• • • • •

opcode (B31-26) Opcode is short for "operation code". The opcode is a binary encoding for the instruction. Opcodes are seen in all ISAs. In MIPS, there is an opcode for add. The opcode in MIPS ISA is only 6 bits. Ordinarily, this means there are only 64 possible instructions. Even for a RISC ISA, which typically has few instructions, 64 is quite small. For R-type instructions, an additional 6 bits are used (B50) called the function. Thus, the 6 bits of the opcode and the 6 bits of the function specify the kind of instruction for R-type instructions. rd (B25-21) This is the destination register. The destination register is the register where the result of the operation is stored. rs (B20-16) This is the first source register. The source register is the register that holds one of the arguments of the operation. rt (B15-11) This is the second source register. shift amount (B10-6) The amount of bits to shift. Used in shift instructions. function (B5-0) An additional 6 bits used to specify the operation, in addition to the opcode.

I-type instructions
I-type is short for "immediate type". The format of an I-type instuction looks like: B31-26 opcode B25-21 B20-16 B15-0 immediate register s register t

The prototypical I-type instruction looks like:
add $rt, $rs, immed

In this case, $rt is the destination register, and $rs is the only source register. It is unusual that $rd is not used, and that $rd does not appear in bit positions B25-21 for both R-type and I-type instructions. Presumably, the designers of the MIPS ISA had their reasons for not making the destination register at a particular location for R-type and I-type. The semantics of the addi instruction are;
R[t] = R[s] + (IR15)16 IR15-0

where IR refers to the instruction register, the register where the current instruction is stored. (IR15)16 means that bit B15 of the instruction register (which is the sign bit of the immediate value) is repeated 16 times. This is then followed by IR15-0, which is the 16 bits of the immediate value. Basically, the semantics says to sign-extend the immediate value to 32 bits, add it (using signed addition) to register R[s], and store the result in register $rt.

J-type instructions
J-type is short for "jump type". The format of an J-type instuction looks like: B31-26 opcode The prototypical I-type instruction looks like:
j target

B25-0 target

The semantics of the j instruction (j means jump) are:

PC <- PC31-28 IR25-0 00

where PC is the program counter, which stores the current address of the instruction being executed. You update the PC by using the upper 4 bits of the program counter, followed by the 26 bits of the target (which is the lower 26 bits of the instruction register), followed by two 0's, which creates a 32 bit address. The jump instruction will be explained in more detail in a future set of notes.

Why Five Bits?
If you look at the R-type and I-type instructions, you will see 5 bits reserved for each register. You might wonder why. MIPS supports 32 integer registers. To specify each register, the register are identified with a number from 0 to 31. It takes log2 32 = 5 bits to specify one of 32 registers. If MIPS has 64 register, you would need 6 bits to specify the register. The register number is specified using unsigned binary. Thus, 00000 refers to $r0 and 11111 refers to register $r31.

Why Study Instruction Formats
You might wonder why it's important to study instruction formats. They seem to be arbitrarily constructed. Yet, they aren't. For example, it's quite useful to have the opcode be the same size and the same location. It's useful to know the exact bits used for the immediate value. This makes decoding much quicker, and the hardware to handle instruction decoding that much simpler. Furthermore, you begin to realize what information the instructions store. For example, it's not all that obvious that immediate values are stored as part of the instruction for I-type instructions. If you know that, for example, addi does signed addition, then you can also conclude that the immediate value is represented in 2C. Also, to add the immediate value to a 32-bit register value would mean sign-extending the immediate value to 32 bits. However, not all I-type instructions encode the 16 bit immediate in 2C. For example, addiu (add immediate unsigned) interprets the 16 bits as UB. It zero-extends the immediate and then adds it to the value stored in a 32 bit register.

Three Operand Instructions
Also, notice that the R-type instructions use three operands (i.e., arguments). In earlier, pre-RISC ISAs, memory was expensive, so ISA designers tried to minimize the number of bits used in an instruction. This meant that there were often two, one, or no operands. How did they manage that? Here's an example of an instruction
cisc_add $r1, $r2 # R[1] = R[1] + R[2]

One way to reduce the total number of operands is to make one operand both a source and a destination register. Another approach is to use an implicit register.
acc_add $r2 # Acc = Acc + R[2]

For example, there may be a special register called the accumulator. This register is not mentioned explicitly in the instruction. Instead, it is implied by the opcode. Early personal computers such as the Apple 2, used ISAs with 1 or 2 registers, and those registers were often part of most instructions, thus they didn't have to be specified. With memory becoming cheaper, and memory access becoming cheaper, it's become easier to devote more bits to an instruction, and to specify three operands instead of two. This makes it more convenient for the assembly language programmer.

Sign up to vote on this title
UsefulNot useful