
CS1104: Computer Organisation
http://www.comp.nus.edu.sg/~cs1104
School of Computing, National University of Singapore

PII Lecture 6: Processor: Datapath and Control


Datapath:
  Single-bus Organization
  Multiple-bus Organization
MIPS: Multicycle Datapath and Control
Stages of Instructions
Datapath Walkthroughs
Processor and Logic Design

CS1104-P2-6

Processor: Datapath and Control

Reading:
  Chapter 9 of the textbook, which is Chapter 7 in Computer Organization by Hamacher, Vranesic and Zaky.
  Optional reading: Chapter 5 in Computer Organization & Design by Patterson and Hennessy.

Datapath


Recap: Organisation
[Figure: block diagram of computer organisation, with the labels Bus, Processor (Control, Datapath, Registers), Cache, Memory, and Devices (Input, Output).]

Fundamental Concepts

Processor (CPU): the active part of the computer, which does all the work (data manipulation and decision-making).
Datapath: the portion of the processor that contains the hardware necessary to perform all operations required by the computer (the brawn).
Control: the portion of the processor (also in hardware) that tells the datapath what needs to be done (the brain).

Fundamental Concepts (2)

Instruction execution cycle: fetch, decode, execute.
  Fetch: fetch the next instruction (using PC) from memory into IR.
  Decode: decode the instruction.
  Execute: execute the instruction.

[Figure: instruction cycle. Instruction Fetch, Instruction Decode, Operand Fetch, Execute, Result Store, Next Instruction.]

Fundamental Concepts (3)

Fetch: fetch the next instruction into IR (Instruction Register).
  Assume each word is 4 bytes, each instruction is stored in one word, and the memory is byte addressable.
  PC (Program Counter) contains the address of the next instruction.
    IR <- [[PC]]
    PC <- [PC] + 4
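The fetch step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: memory is modelled as a dictionary from byte address to instruction word, and the instruction names are made up for the example.

```python
WORD_BYTES = 4  # each instruction occupies one 4-byte word (as assumed on the slide)

def fetch(memory, pc):
    """Return (IR, new PC): IR <- [[PC]], PC <- [PC] + 4."""
    ir = memory[pc]              # contents of the location addressed by PC
    return ir, pc + WORD_BYTES   # PC now points to the next instruction

# Hypothetical program: three instruction words at byte addresses 0, 4 and 8.
memory = {0: "instr A", 4: "instr B", 8: "instr C"}
ir, pc = fetch(memory, 0)
ir2, pc2 = fetch(memory, pc)
```

Because the memory is byte addressable and each instruction fills one word, consecutive fetches advance the PC by 4 rather than by 1.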

Single-bus Organization
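[Figure: single-bus organization of the datapath. The internal processor bus connects PC, MAR (address lines to the memory bus), MDR (data lines to the memory bus), IR, Y, Z, TEMP and the general-purpose registers R0 to R(n-1). A MUX, controlled by Select, feeds either Constant 4 or Y to ALU input A; input B comes from the bus. The instruction decoder and control logic generates the control signals, including the ALU control lines (Add, Sub, ..., XOR) and Carry-in.]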

Instruction Execution

An instruction can be executed by performing one or more of the following operations in some specified sequence:
  Transfer a word of data from one register to another or to the ALU (Arithmetic Logic Unit).
  Perform an arithmetic or a logic operation and store the result in a register.
  Fetch the contents of a given memory location and load them into a register.
  Store a word of data from a register into a given memory location.

Register Transfer

Register-to-register transfer:
  For each register Ri, there are two control signals:
    Riin loads the data on the bus into the register.
    Riout places the register's contents on the bus.
  Example: to transfer the contents of R1 to R4:
    Set R1out to 1. This places the contents of R1 on the bus.
    Set R4in to 1. This loads data from the processor bus into R4.
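A register transfer over a single bus might be modelled as below. This is a sketch, not a hardware description: the function name `bus_transfer` and the dictionary of registers are assumptions made for illustration.

```python
def bus_transfer(regs, out_reg, in_regs):
    """One clock step on a single bus.

    out_reg is the register whose 'out' signal is 1 (it drives the bus);
    in_regs lists the registers whose 'in' signal is 1 (they load the bus).
    """
    bus = regs[out_reg]          # e.g. R1out = 1
    for name in in_regs:         # e.g. R4in = 1
        regs[name] = bus
    return bus

regs = {"R1": 25, "R4": 0}
bus_transfer(regs, "R1", ["R4"])   # transfer the contents of R1 to R4
```

Only one register may drive the bus in a step, which is why the function takes a single `out_reg` but a list of `in_regs`.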

Register Transfer (2)


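[Figure: input and output gating for the registers. Each register Ri has a switch controlled by Riin between the internal processor bus and its input, and a switch controlled by Riout between its output and the bus; Y has Yin, and Z has Zin and Zout. A MUX, controlled by Select, feeds either Constant 4 or Y to ALU input A; input B comes from the bus, and the ALU output goes to Z.]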

Arithmetic/Logic Operation
ALU: performs arithmetic and logic operations on its A and B inputs.
To perform R3 <- [R1] + [R2]:
  1. R1out, Yin
  2. R2out, SelectY, Add, Zin
  3. Zout, R3in

[Figure: registers Ri, Y and Z with their in/out gating on the internal processor bus; a MUX selects Constant 4 or Y for ALU input A, and the ALU output goes to Z.]
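The three control steps for R3 <- [R1] + [R2] can be traced with a small Python sketch. The dictionary model of the registers and the name `run_add` are illustrative assumptions; the comments give the control signals taken from the slide.

```python
def run_add(regs):
    """R3 <- [R1] + [R2] on a single-bus datapath, one bus transfer per step."""
    state = dict(regs, Y=0, Z=0)

    # Step 1: R1out, Yin
    bus = state["R1"]
    state["Y"] = bus

    # Step 2: R2out, SelectY, Add, Zin
    # ALU input A is Y (through the MUX), input B is the bus.
    bus = state["R2"]
    state["Z"] = state["Y"] + bus

    # Step 3: Zout, R3in
    bus = state["Z"]
    state["R3"] = bus
    return state

result = run_add({"R1": 7, "R2": 5, "R3": 0})
```

Three steps are needed because the single bus can carry only one value per step, so Y and Z act as temporary holding registers around the ALU.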

Arithmetic/Logic Operation (2)

If there are n operations, do we need n ALU control lines?
  We could use encoding, which requires only ceil(log2 n) control lines for n operations.
  However, this increases complexity and hardware (an additional decoder is needed).

[Figure: ALU with inputs A and B, Carry-in, and one control line per operation (Add, Sub, ..., XOR).]
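The trade-off between one line per operation and an encoded scheme can be made concrete with a short Python sketch. The helper names `encoded_lines` and `decode` are hypothetical; the decoder here stands in for the extra hardware mentioned on the slide.

```python
def encoded_lines(n_ops):
    """Control lines needed when n_ops operations are binary-encoded: ceil(log2 n)."""
    return max(1, (n_ops - 1).bit_length())

def decode(code, n_ops):
    """Decoder: expand the binary code back into one line per operation (one-hot)."""
    return [1 if i == code else 0 for i in range(n_ops)]

lines_for_8 = encoded_lines(8)    # 8 operations fit in 3 encoded lines
lines_for_9 = encoded_lines(9)    # a 9th operation needs a 4th line
one_hot = decode(2, 4)            # code 2 selects the third of four operations
```

The unencoded scheme needs no decoder but uses n lines; the encoded scheme saves lines at the cost of the decoding step.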

Reading a Word from Memory


Move (R1), R2    /* R2 <- [[R1]] */
  1. MAR <- [R1]
  2. Start a Read operation on the memory bus
  3. Wait for the MFC response from the memory
  4. Load MDR from the memory bus
  5. R2 <- [MDR]

MDR has four control signals: MDRin, MDRout, MDRinE and MDRoutE.

[Figure: MDR connected to the internal processor bus through MDRin and MDRout, and to the memory-bus data lines through MDRinE and MDRoutE.]

Reading a Word from Memory (2)


Move (R1), R2    /* R2 <- [[R1]] */
Sequence of control steps:
  1. R1out, MARin, Read
  2. MDRinE, WMFC
  3. MDRout, R2in

WMFC: wait for the arrival of the MFC (Memory-Function-Completed) signal.
MFC: to accommodate variability in response time, the processor waits until it receives an indication that the Read/Write operation has been completed. The addressed device sets MFC to 1 to indicate this.
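The control sequence for Move (R1), R2 can be followed step by step in Python. This is a simplified sketch: memory is a dictionary, and the wait for MFC is modelled as the memory answering immediately, which a real memory need not do.

```python
def move_indirect_load(regs, memory):
    """Move (R1), R2, i.e. R2 <- [[R1]], as three control steps."""
    state = dict(regs, MAR=0, MDR=0)

    # Step 1: R1out, MARin, Read
    state["MAR"] = state["R1"]

    # Step 2: MDRinE, WMFC
    # The memory responds (MFC = 1) and MDR is loaded from the memory bus.
    state["MDR"] = memory[state["MAR"]]

    # Step 3: MDRout, R2in
    state["R2"] = state["MDR"]
    return state

state = move_indirect_load({"R1": 100, "R2": 0}, {100: 42})
```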

Storing a Word in Memory


Move R2, (R1)    /* [R1] <- [R2]: the memory location addressed by R1 receives the contents of R2 */
Sequence of control steps:
  1. R1out, MARin
  2. R2out, MDRin, Write
  3. MDRoutE, WMFC

Executing a Complete Instruction


Add (R3), R1    /* R1 <- [R1] + [[R3]] */
Adds the contents of the memory location pointed to by R3 to register R1.
Sequence of control steps:
  1. PCout, MARin, Read, Select4, Add, Zin
  2. Zout, PCin, Yin, WMFC
  3. MDRout, IRin
  4. R3out, MARin, Read
  5. R1out, Yin, WMFC
  6. MDRout, SelectY, Add, Zin
  7. Zout, R1in, End
Steps 1 to 3: instruction fetch.
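The seven control steps for Add (R3), R1 can be traced in Python as a rough sketch. The dictionary-based state, the function name `add_indirect` and the example memory contents are assumptions for illustration; memory reads are modelled as completing in the step that contains WMFC.

```python
def add_indirect(regs, memory, pc):
    """Add (R3), R1: R1 <- [R1] + [[R3]], following the seven control steps."""
    s = dict(regs, PC=pc, MAR=0, MDR=0, IR=None, Y=0, Z=0)

    # 1. PCout, MARin, Read, Select4, Add, Zin
    s["MAR"] = s["PC"]
    s["Z"] = s["PC"] + 4
    # 2. Zout, PCin, Yin, WMFC
    s["PC"] = s["Z"]
    s["Y"] = s["Z"]
    s["MDR"] = memory[s["MAR"]]     # instruction word arrives from memory
    # 3. MDRout, IRin
    s["IR"] = s["MDR"]
    # 4. R3out, MARin, Read
    s["MAR"] = s["R3"]
    # 5. R1out, Yin, WMFC
    s["Y"] = s["R1"]
    s["MDR"] = memory[s["MAR"]]     # operand arrives from memory
    # 6. MDRout, SelectY, Add, Zin
    s["Z"] = s["Y"] + s["MDR"]
    # 7. Zout, R1in, End
    s["R1"] = s["Z"]
    return s

# Hypothetical contents: the instruction at address 0, the operand at address 200.
memory = {0: "Add (R3), R1", 200: 30}
final = add_indirect({"R1": 12, "R3": 200}, memory, pc=0)
```

Note how step 1 overlaps the PC increment with the start of the memory read, so the fetch costs three steps rather than four.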

Multiple-Bus Organization
Single-bus structure: control sequences are long, as only one data item can be transferred over the bus in a clock cycle.
The figure on the next slide shows a three-bus structure.
  All registers are combined into a single block called the register file, with three ports: two outputs, allowing two registers to be accessed simultaneously and their contents put on buses A and B, and one input, allowing data on bus C to be loaded into a third register.
  Buses A and B transfer the source operands to the A and B inputs of the ALU, and the result is transferred to the destination over bus C.

Multiple-Bus Organization (2)


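[Figure: three-bus organization of the datapath. The register file, PC (with an incrementer), IR (feeding the instruction decoder), MDR and MAR are connected to buses A, B and C. A MUX selects Constant 4 or bus A for ALU input A; the ALU output R drives bus C. MAR drives the address lines and MDR connects to the memory-bus data lines.]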

Multiple-Bus Organization (3)


For the ALU, R=A (or R=B) means that its A (or B) input is passed unmodified to bus C.
Add R4, R5, R6    /* R6 <- [R4] + [R5] */
Adds the contents of R4 and R5 and places the sum in R6.
Sequence of control steps:
  1. PCout, R=B, MARin, Read, IncPC
  2. WMFC
  3. MDRoutB, R=B, IRin
  4. R4outA, R5outB, SelectA, Add, R6in, End
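The execute step of the three-bus sequence (step 4) can be sketched in Python to show why it needs only one step. The function name `three_bus_add` and the dictionary register file are assumptions made for the example.

```python
def three_bus_add(regfile, src_a, src_b, dest):
    """Step 4 of the sequence: R4outA, R5outB, SelectA, Add, R6in."""
    bus_a = regfile[src_a]     # first read port drives bus A
    bus_b = regfile[src_b]     # second read port drives bus B
    bus_c = bus_a + bus_b      # ALU result travels on bus C
    regfile[dest] = bus_c      # write port loads the destination register
    return regfile

regfile = {"R4": 3, "R5": 4, "R6": 0}
three_bus_add(regfile, "R4", "R5", "R6")
```

With two read ports and one write port, both operands and the result move in the same step, so the temporary registers Y and Z of the single-bus design are not needed.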

Control
Hardwired control or microprogrammed control.
Hardwired control:

[Figure: hardwired control unit. A clock (CLK) drives the control step counter; the decoder/encoder combines the step count with IR, the external inputs and the condition codes to generate the control signals.]

Control (2)
Microprogrammed control:
  Control signals are generated by a program.
  A control word (CW) is a microinstruction whose individual bits represent the various control signals.
  Vertical organization: highly encoded schemes that use compact codes to specify only a small number of control functions in each microinstruction.
  Horizontal organization: minimally encoded schemes in which many resources can be controlled with a single microinstruction.
  Popular in Complex Instruction Set Computer (CISC) architectures, because complex instruction sets require complex controllers, which are more easily implemented as microprograms.

Control (3)
Example of a horizontal organization scheme, for the control sequence:
  1. PCout, MARin, Read, Select4, Add, Zin
  2. Zout, PCin, Yin, WMFC
  3. MDRout, IRin
  4. R3out, MARin, Read
  5. R1out, Yin, WMFC
  6. MDRout, SelectY, Add, Zin
  7. Zout, R1in, End

Microinstructions (one row per step; columns for the remaining control signals are not shown):

Step  PCin  PCout  MARin  Read  MDRout  IRin  Yin  Select  Add  Zin  Zout  R1out  R1in  R3out  WMFC  End
1     0     1      1      1     0       0     0    1       1    1    0     0      0     0      0     0
2     1     0      0      0     0       0     1    0       0    0    1     0      0     0      1     0
3     0     0      0      0     1       1     0    0       0    0    0     0      0     0      0     0
4     0     0      1      1     0       0     0    0       0    0    0     0      0     1      0     0
5     0     0      0      0     0       0     1    0       0    0    0     1      0     0      1     0
6     0     0      0      0     1       0     0    0       1    1    0     0      0     0      0     0
7     0     0      0      0     0       0     0    0       0    0    1     0      1     0      0     1

Select=0: SelectY
Select=1: Select4
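As an illustration of horizontal organization, the seven control steps of Add (R3), R1 can be turned into control words with one bit per signal. The column order in `SIGNALS` is an assumption made for this sketch, as is the helper name `control_word`; Select=1 stands for Select4 and Select=0 for SelectY.

```python
SIGNALS = ["PCin", "PCout", "MARin", "Read", "MDRout", "IRin", "Yin", "Select",
           "Add", "Zin", "Zout", "R1out", "R1in", "R3out", "WMFC", "End"]

# Active signals in each control step of Add (R3), R1.
STEPS = [
    ["PCout", "MARin", "Read", "Select", "Add", "Zin"],  # Select=1: Select4
    ["Zout", "PCin", "Yin", "WMFC"],
    ["MDRout", "IRin"],
    ["R3out", "MARin", "Read"],
    ["R1out", "Yin", "WMFC"],
    ["MDRout", "Add", "Zin"],                            # Select=0: SelectY
    ["Zout", "R1in", "End"],
]

def control_word(active):
    """Build one microinstruction: a '1' for each asserted signal, else '0'."""
    return "".join("1" if name in active else "0" for name in SIGNALS)

words = [control_word(step) for step in STEPS]
```

Each word is as wide as the number of control signals, which is what makes the scheme minimally encoded: no decoding is needed, but most bits in any one word are 0.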

MIPS: Multicycle Datapath and Control


Adapted from D. Patterson's CS61C, http://www.cs.berkeley.edu/~pattrsn/61CF00. Copyright 2000 UCB.

Stages of a Datapath

Problem: a single, atomic block that executes an instruction (performs all necessary operations, beginning with fetching the instruction) would be too bulky and inefficient.
Solution: break up the process of executing an instruction into stages, and then connect the stages to create the whole datapath.
  Smaller stages are easier to design.
  It is easy to optimize (change) one stage without touching the others.

Stages of a Datapath (2)

There is a wide variety of MIPS instructions, so what general steps do they have in common?
Stages:
  1. Instruction Fetch
  2. Instruction Decode
  3. ALU
  4. Memory Access
  5. Register Write

Stages of a Datapath (3)

Stage 1: Instruction Fetch.
  No matter what the instruction is, the 32-bit instruction word must first be fetched from memory (the cache-memory hierarchy).
  This is also where we increment PC (that is, PC = PC + 4, to point to the next instruction; memory is byte addressed, hence + 4).

Stages of a Datapath (4)

Stage 2: Instruction Decode.
  Upon fetching the instruction, we next gather data from the fields (decode all necessary instruction data).
  First, read the opcode to determine the instruction type and field lengths.
  Second, read in data from all necessary registers.
    For add, read two registers.
    For addi, read one register.
    For jal, no read is necessary.
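Gathering data from the fields amounts to shifting and masking the 32-bit instruction word. The sketch below uses the standard MIPS field positions (opcode in bits 31-26, rs 25-21, rt 20-16, rd 15-11, shamt 10-6, funct 5-0, immediate 15-0); the function name `decode_fields` is an assumption for the example.

```python
def decode_fields(word):
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
        "imm": word & 0xFFFF,     # only meaningful for I-format instructions
    }

# 0x00221820 encodes add $3, $1, $2 (opcode 0, funct 0x20).
fields = decode_fields(0x00221820)
```

All fields are extracted unconditionally; the opcode (and funct, for R-format) then tells the control which of them are meaningful for this instruction.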

Stages of a Datapath (5)

Stage 3: ALU (Arithmetic-Logic Unit).
  The real work of most instructions is done here: arithmetic (+, -, *, /), shifting, logic (&, |), comparisons (slt).
  What about loads and stores?
    lw $t0, 40($t1)
    The address we are accessing in memory is the value in $t1 plus the value 40.
    We do this addition at this stage.

Stages of a Datapath (6)

Stage 4: Memory Access.
  Only the load and store instructions do anything during this stage; the other instructions remain idle.
  Since loads and stores have this unique step, we need the extra stage to account for them.
  Thanks to the cache system, this stage is expected to be just as fast (on average) as the others.

Stages of a Datapath (7)

Stage 5: Register Write.
  Most instructions write the result of some computation into a register.
  Examples: arithmetic, logical, shifts, loads, slt.
  What about stores, branches and jumps?
    They do not write anything into a register at the end.
    They remain idle during this fifth stage.

Datapath: Generic Steps


[Figure: generic five-stage datapath. 1. Instruction Fetch (PC, +4, instruction memory); 2. Decode/Register Read (fields rd, rs, rt, imm; registers); 3. Execute (ALU); 4. Memory (data memory); 5. Reg. Write.]

Datapath Walkthroughs: add

add $r3,$r1,$r2    # r3 = r1+r2
  Stage 1: Fetch this instruction, increment PC.
  Stage 2: Decode to find that it is an add instruction, then read registers $r1 and $r2.
  Stage 3: Add the two values retrieved in stage 2.
  Stage 4: Idle (nothing to write to memory).
  Stage 5: Write the result of stage 3 into register $r3.

Datapath Walkthroughs: add (2)


[Figure: datapath walkthrough for add r3, r1, r2. The register fields 1 and 2 select reg[1] and reg[2], the ALU computes reg[1]+reg[2], the data memory is not used, and the result is written to register 3.]

Datapath Walkthroughs: slti

slti $r3,$r1,17
  Stage 1: Fetch this instruction, increment PC.
  Stage 2: Decode to find that it is an slti, then read register $r1.
  Stage 3: Compare the value retrieved in stage 2 with the integer 17.
  Stage 4: Go idle.
  Stage 5: Write the result of stage 3 into register $r3.

Datapath Walkthroughs: slti (2)


[Figure: datapath walkthrough for slti r3, r1, 17. The register field 1 selects reg[1], the immediate field supplies 17, the ALU computes reg[1]-17 for the comparison, the data memory is not used, and the result is written to register 3.]

Datapath Walkthroughs: sw

sw $r3, 20($r1)
  Stage 1: Fetch this instruction, increment PC.
  Stage 2: Decode to find that it is an sw, then read registers $r1 and $r3.
  Stage 3: Add 20 to the value in register $r1 (retrieved in stage 2).
  Stage 4: Write the value in register $r3 (retrieved in stage 2) into the memory address computed in stage 3.
  Stage 5: Go idle (nothing to write into a register).

Datapath Walkthroughs: sw (2)


[Figure: datapath walkthrough for sw r3, 20(r1). The register fields 1 and 3 select reg[1] and reg[3], the immediate field supplies 20, the ALU computes reg[1]+20, and the data memory performs MEM[r1+20]<-r3. No register is written.]

Why Five Stages?

Could we have a different number of stages?
  Yes, and other architectures do.
So why does MIPS have five stages, if instructions tend to go idle for at least one stage?
  There is one instruction that uses all five stages: the load.

Datapath Walkthroughs: lw

lw $r3, 40($r1)
  Stage 1: Fetch this instruction, increment PC.
  Stage 2: Decode to find that it is a lw, then read register $r1.
  Stage 3: Add 40 to the value in register $r1 (retrieved in stage 2).
  Stage 4: Read the value from the memory address computed in stage 3.
  Stage 5: Write the value found in stage 4 into register $r3.

Datapath Walkthroughs: lw (2)


[Figure: datapath walkthrough for lw r3, 40(r1). The register field 1 selects reg[1], the immediate field supplies 40, the ALU computes reg[1]+40, the data memory is read, and r3<-MEM[r1+40] is written to register 3.]
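The four walkthroughs (add, slti, sw, lw) can be compared with a small Python model that reports which stages do real work for each instruction. This is a sketch under stated assumptions: instructions are written as tuples, registers and memory are dictionaries, and the function name `execute` is made up for the example. Idle stages are simply not recorded; every instruction still passes through all five.

```python
def execute(instr, regs, mem):
    """Run one instruction through the five stages; return the stages that did work."""
    op, x, y, z = instr
    used = {1, 2}                        # stage 1: fetch; stage 2: decode, register read
    if op == "add":                      # ("add", rd, rs, rt)
        alu = regs[y] + regs[z]          # stage 3: add the two register values
        dest = x
    elif op == "slti":                   # ("slti", rt, rs, imm)
        alu = 1 if regs[y] < z else 0    # stage 3: compare with the immediate
        dest = x
    else:                                # ("lw" or "sw", rt, imm, rs)
        alu = regs[z] + y                # stage 3: address = base + offset
        dest = x if op == "lw" else None
    used.add(3)

    value = alu
    if op == "lw":
        value = mem[alu]                 # stage 4: read memory
        used.add(4)
    elif op == "sw":
        mem[alu] = regs[x]               # stage 4: write memory
        used.add(4)

    if dest is not None:
        regs[dest] = value               # stage 5: register write
        used.add(5)
    return used

regs = {1: 100, 2: 5, 3: 0}
mem = {140: 77}
add_stages = execute(("add", 3, 1, 2), regs, mem)    # r3 = r1 + r2
sw_stages = execute(("sw", 3, 20, 1), regs, mem)     # MEM[r1+20] <- r3
lw_stages = execute(("lw", 3, 40, 1), regs, mem)     # r3 <- MEM[r1+40]
```

In this model only lw reports work in all five stages, which matches the observation that the load is the instruction that justifies having five.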

What Hardware Is Needed?

PC: a register that keeps track of the address of the next instruction.
General-purpose registers:
  Used in stages 2 (read) and 5 (write).
  We are currently working with 32 of these.
Memory:
  Used in stages 1 (fetch) and 4 (R/W).
  The cache system makes these two stages as fast as the others, on average.

Datapath: Summary
Construct the datapath based on the register transfers required to perform instructions.
The control part causes the right transfers to happen.

[Figure: the five-stage datapath (PC, +4, instruction memory, registers with fields rd, rs, rt and imm, ALU, data memory) with a controller driven by the opcode and funct fields.]

Where is Logic Design Used?

Combinational circuits: for the ALU and other parts of the datapath.
Sequential circuits: different control signals are needed in different clock cycles and for different instructions, for the ALU, registers and other parts of the datapath.

[Figure: ALU with its ALU Control block.]

Where is Logic Design Used? (2)


[Figure: high-level view of finite state machine control. From Start, the instruction fetch/decode and register fetch state leads to the memory access instructions, R-type instructions, branch instruction and jump instruction states.]

Sequential logic design can be used to assert the correct control signals at the correct times.

Summary

Datapath is the hardware that performs the operations necessary to execute programs.
Control instructs the datapath on what to do next.
Datapath needs:
  access to storage (general-purpose registers and memory)
  computational ability (ALU)
  helper hardware (local registers and PC)

Summary (2)

Five stages of the datapath (executing an instruction):
  1: Instruction Fetch (Increment PC)
  2: Instruction Decode (Read Registers)
  3: ALU (Computation)
  4: Memory Access
  5: Write to Registers
ALL instructions must go through ALL five stages.
Datapath designed in hardware.

End of file
