IT3030E CA Chap5 CPU

Chapter 5: The Processor
Ngo Lam Trung, Pham Ngoc Hung
[with materials from Computer Organization and Design, 4th Edition,

Patterson & Hennessy, © 2008, MK
and M.J. Irwin’s presentation, PSU 2008]
IT3030E, Fall 2022 1
Review
Performance metric
CPU time = CPI * CC * IC
CPI: cycle per instruction

CC: clock cycle
IC: instruction count
How to improve?
• IC:
• CC:
• CPI:
In this chapter
• Implementation of data path
• How to get CPI < 1
IT3030E, Fall 2022 2

Overview
❑ We will examine two MIPS implementations

l A simplified version
l A more realistic pipelined version
❑ Limit to a simple subset of MIPS ISA

l Memory reference: lw, sw
l Arithmetic/logical: add, sub, and, or, slt
l Control transfer: beq, j
❑ Implementation of real CPU with other instructions are
similar to the simplified version (theoretically!)
IT3030E, Fall 2022 3

General instruction cycle
❑ Generic implementation
l use the program counter (PC) to supply Fetch
the instruction address and fetch the PC = PC+4
instruction from memory (and update the PC)
Exec Decode
l decode the instruction (and read registers)
l execute the instruction
❑ All instructions (except j) use the ALU after reading the

registers
l ALU: Arithmetic and Logic Unit, where the arithmetic and logic
operations are executed
❑ In this chapter: implementation of CPU that can execute

the simple subset of MIPS ISA
IT3030E, Fall 2022 4

CPU implementation with MUXes and Control
❑ Multiplexer
❑ Control
Don’t panic! We’ll build this incrementally.

IT3030E, Fall 2022 5
Fetching Instructions
❑ Fetching instructions involves
l reading the instruction from the Instruction Memory
l updating the PC value to be the address of the next instruction
in memory
clock Add
4
Fetch
PC = PC+4 Instruction
Memory
Exec Decode Read
PC Instruction
Address
IT3030E, Fall 2022 6

Decoding Instructions
❑ Decoding instructions involves
sending the fetched instruction’s opcode and function field
bits to the control unit
The control unit send appropriate control signals to other
parts inside CPU to execute the operations corresponds to
the instruction
Fetch Control
PC = PC+4 Unit
Exec Decode
Read Addr 1
Register Read
Read Addr 2 Data 1
Instruction
File
Write Addr Read
Data 2
Write Data
• Example: reading two values from the Register File

→Register File addresses are contained in the instruction
IT3030E, Fall 2022 7
Executing R Format Operations
❑ R format operations (add, sub, slt, and, or)
31 25 20 15 10 5 0
R-type: op rs rt rd shamt funct
read two register operands rs and rt
perform operation (op and funct) on values in rs and rt
store the result back into the Register File (into location rd)
Fetch
PC = PC+4
Example: add s1, s2, s3
- Value of s2 and s3 are sent to ALU
Exec Decode
- ALU execute the s2 + s3 operation
- Result is store into s1
IT3030E, Fall 2022 8

31 25 20 15 10 5 0
Fetch
PC = PC+4
Exec Decode
add s1, s2, s3
Draw connection between a and b to form the execution unit?

IT3030E, Fall 2022 9
31 25 20 15 10 5 0
RegWrite ALU control
Read Addr 1
Fetch Register Read
PC = PC+4 Read Addr 2 Data 1 overflow
Instruction
File zero
ALU
Write Addr
Exec Decode Read
Data 2
Write Data
We need the write control signal to control when the result is

written to Register File
IT3030E, Fall 2022 10
Executing Load and Store Operations
❑ Load and store operations involves
read register operands (including one base register)
compute memory address by adding the base to the offset
- The 16-bit offset field in the instruction is sign-extended to 32 bit
store: read from the Register File, write to the Data Memory
load: read from the Data Memory, write to the Register File
RegWrite ALU control MemWrite
overflow
Read Addr 1 zero
Register Read Address
Read Addr 2 Data 1
Instruction Data
File Memory Read Data
ALU
Write Addr Read
Data 2 Write Data
Write Data
Sign MemRead
Extend
IT3030E, Fall 2022

Draw necessary connections to form execution unit? 11
Executing Load and Store Operations
❑ Load and store operations involves
read register operands (including one base register)
compute memory address by adding the base to the offset
- The 16-bit offset field in the instruction is signed-extended to 32 bit
store: read from the Register File, write to the Data Memory
load: read from the Data Memory, write to the Register File
RegWrite ALU control MemWrite
overflow
Read Addr 1 zero
Read Addr 2 Data 1
Instruction Data
File Memory Read Data
ALU
Write Addr Read
Data 2 Write Data
Write Data
Sign MemRead
16 Extend 32
IT3030E, Fall 2022 12

Executing Branch Operations
❑ Branch operations involves
read register operands
compare the operands (subtract, check zero ALU output)
compute the branch target address: adding the updated PC to the
16-bit signed-extended offset field in the instr
Add Branch
4 Add target
Shift
address
left 2
ALU control
PC
Read Addr 1 zero (to branch

Register Read control logic)
Instruction Read Addr 2Data 1
File ALU
Write Addr Read
Draw necessary Data 2
Write Data
connections to form
execution unit?
Sign
16 Extend 32
IT3030E, Fall 2022 13
Executing Jump Operations
❑ Jump operation involves
keep 4 highest bits of PC
replace the lower 28 bits of the PC by
- the lower 26 bits of the fetched instruction shifted left by 2 bits
Add
4
4
Jump
Instruction Shift address
Memory
left 2 28
Read
PC Instruction
Address 26
IT3030E, Fall 2022 14

Creating a Single Datapath from the Parts
❑ Assemble the datapath segments and add control lines

and multiplexors as needed
❑ Single cycle design – fetch, decode and execute each
instructions in one clock cycle
separate Instruction Memory and Data Memory, though they
are both in main memory
multiplexors needed at the input of shared elements with
control lines to do the selection
write signals to control writing to the Register File and Data
Memory
IT3030E, Fall 2022 15

Fetch, R, and Memory Access Portions
Add
RegWrite ALUSrc ALU control MemWrite MemtoReg
4
ovf
zero
Read Addr 1
Instruction
Memory
Read Addr 2 Data 1 Data
Read File
PC Instruction ALU Memory Read Data
Address Write Addr
Read
Data 2 Write Data
Write Data
MemRead
Sign
16 Extend 32
IT3030E, Fall 2022 16

Designing ALU
❑ Input/output
Two data input: a, b
ALU control signal a
Data out
Flags out out
❑ Operations
And, or, nor b
flag
Add, subtract
ALU control
IT3030E, Fall 2022 17

1-bit ALU with logic operation
❑ What do we have if
Operation = 0:
Operation = 1:
IT3030E, Fall 2022 18

1-bit full-adder
➔ We already designed this in prev. chapter
IT3030E, Fall 2022 20

1-bit ALU with AND, OR, ADD
❑ Operation = 00:
❑ Operation = 01:
❑ Operation = 10:
IT3030E, Fall 2022 21
How about 1-bit ALU with AND, OR, ADD, SUB?
❑ a-b = a + (-b) = a + (2’s complement of b)
❑ For SUB operation

Operation =
Binvert =
CarryIn =
IT3030E, Fall 2022 22
How to add NOR operation?
❑ ത 𝑏ത
𝑎 + 𝑏 = 𝑎.
❑ Find control signal for NOR operation:
IT3030E, Fall 2022 23

Adding SLT operation
❑ Check highest bit of (a-b)
1: a < b → set
0: a >= b → clear
IT3030E, Fall 2022 24

Building 32-bit ALU from 1-bit ALUs
IT3030E, Fall 2022 25

Final ALU with AND, OR, NOR, ADD, SUB, SLT
IT3030E, Fall 2022 26

ALU symbol and interconnection
IT3030E, Fall 2022 27

Adding Control Unit to build single datapath
0
Add
Add 1
4 Shift
left 2 PCSrc
ALUOp Branch
MemRead
Instr[31-26] Control MemtoReg
Unit MemWrite
ALUSrc
RegWrite
RegDst
ovf
Instr[25-21] Read Addr 1
Instruction
Memory Instr[20-16] Read Addr 2 Data 1 zero
Data
Read File
PC Instr[31-0] 0 ALU Memory Read Data 1
Address Write Addr
1 Read 0
Instr[15 Data 2 Write Data 0
Write Data
-11] 1
Instr[15-0] Sign ALU

16 Extend 32 control
Instr[5-0]
IT3030E, Fall 2022 29

R-type Instruction Data/Control Flow
0
Add
Add 1
4 Shift
left 2 PCSrc
ALUOp Branch
MemRead
Unit MemWrite
ALUSrc
RegWrite
RegDst
ovf
Instruction
Data
Read File
Address Write Addr
1 Read 0
Write Data
-11] 1

Instr[5-0]
IT3030E, Fall 2022 30

Load Word Instruction Data/Control Flow
0
Add
Add 1
4 Shift
left 2 PCSrc
ALUOp Branch
MemRead
Unit MemWrite
ALUSrc
RegWrite
RegDst
ovf
Instruction
Data
Read File
Address Write Addr
1 Read 0
Write Data
-11] 1
Mark active Instr[15-0] Sign ALU

connections during 16 Extend 32 control
execution flow Instr[5-0]
IT3030E, Fall 2022 31

Load Word Instruction Data/Control Flow
0
Add
Add 1
4 Shift
left 2 PCSrc
ALUOp Branch
MemRead
Unit MemWrite
ALUSrc
RegWrite
RegDst
ovf
Instruction
Data
Read File
Address Write Addr
1 Read 0
Write Data
-11] 1

Instr[5-0]
IT3030E, Fall 2022 32

Branch Instruction Data/Control Flow
0
Add
Add 1
4 Shift
left 2 PCSrc
ALUOp Branch
MemRead
Unit MemWrite
ALUSrc
RegWrite
RegDst
ovf
Instruction
Data
Read File
Address Write Addr
1 Read 0
Write Data
-11] 1

IT3030E, Fall 2022 34

Branch Instruction Data/Control Flow
0
Add
Add 1
4 Shift
left 2 PCSrc
ALUOp Branch
MemRead
Unit MemWrite
ALUSrc
RegWrite
RegDst
ovf
Instruction
Data
Read File
Address Write Addr
1 Read 0
Write Data
-11] 1

IT3030E, Fall 2022 35

Adding the Jump Operation
Instr[25-0]
Shift 1
28 32
26 left 2
PC+4[31-28]
0
Add 0
Add 1
4 Shift
left 2 PCSrc
Jump
ALUOp Branch
MemRead
Unit MemWrite
ALUSrc
RegWrite
RegDst
ovf
Instruction
Data
Read File
Address Write Addr
1 Read 0
Write Data
-11] 1
Mark active
execution flow
Instr[5-0]
IT3030E, Fall 2022 36

Adding the Jump Operation
Instr[25-0]
Shift 1
28 32
26 left 2
PC+4[31-28]
0
Add 0
Add 1
4 Shift
left 2 PCSrc
Jump
ALUOp Branch
MemRead
Unit MemWrite
ALUSrc
RegWrite
RegDst
ovf
Instruction
Data
Read File
Address Write Addr
1 Read 0
Write Data
-11] 1
Mark active
execution flow
Instr[5-0]
IT3030E, Fall 2022 37

Instruction Times (Critical Paths)
❑ What is the clock cycle time assuming negligible
delays for muxes, control unit, sign extend, PC access,
shift left 2, wires, setup and hold times except:
Instruction and Data Memory (200 ps)
ALU and adders (200 ps)
Register File access (reads or writes) (100 ps)
Instr. I Mem Reg Rd ALU Op D Mem Reg Wr Total

R-
type
load
store
beq
jump
IT3030E, Fall 2022 38
Instruction Critical Paths for Single cycle CPU
❑ What is the clock cycle time assuming negligible
delays for muxes, control unit, sign extend, PC access,
shift left 2, wires, setup and hold times except:
Instruction and Data Memory (200 ps)
ALU and adders (200 ps)
Register File access (reads or writes) (100 ps)
Instr. I Mem Reg Rd ALU Op D Mem Reg Wr Total

R- 200 100 200 100 600
type
load 200 100 200 200 100 800
store 200 100 200 200 700
beq 200 100 200 500
jump 200 200
IT3030E, Fall 2022 39
Single Cycle Disadvantages & Advantages
❑ Uses the clock cycle inefficiently – the clock cycle must
be timed to accommodate the slowest instruction
l especially problematic for more complex instructions like
floating point multiply
Cycle 1 Cycle 2
Clk
lw sw Waste
❑ May be wasteful of area since some functional units

(e.g., adders) must be duplicated since they can not be
shared during a clock cycle
but
❑ Is simple and easy to understand
IT3030E, Fall 2022 40

How Can We Make The Computer Faster?
❑ Divide instruction cycles into smaller cycles
❑ Executing instructions in parallel
l With only one CPU?
❑ Pipelining:
l Start fetching and executing the next instruction before the
current one has completed
l Overlapping execution
IT3030E, Fall 2022 41

Pipeline in real life
IT3030E, Fall 2022 42

A more serious example: laundry work
❑ Pipelined laundry boots performance up to 4 times
◼ With 4 loads
Tnormal = 4*2 = 8 hours
Tpipeline = 3.5 hours
◼ With n loads
Tnormal = n*2 hours
Tpipeline = (3+n)/2 hours
4 stages: washing, drying, ironing, folding

When n →  : Tnormal → 4*Tpipeline
IT3030E, Fall 2022 43
MIPS Pipeline
❑ Five stages, one step per stage
l IFetch: Instruction Fetch and Update PC
l Dec: Registers Fetch and Instruction Decode
l Exec: Execute R-type; calculate memory address
l Mem: Read/write the data from/to the Data Memory
l WB: Write the result data into the register file
Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5
IFetch Dec Exec Mem WB
Execution time for a single instruction is always 5 cycles, regardless

of instruction operation
IT3030E, Fall 2022 44

Instruction pipeline
Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5
IFetch Dec Exec Mem WB IFetch Dec Exec Mem WB
Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8
lw IFetch Dec Exec Mem WB Instructions in

pipeline
sw IFetch Dec Exec Mem WB
R-type IFetch Dec Exec Mem WB
Start fetching and executing the More than one instruction are
next instruction before the current executed at a time
one has completed
IT3030E, Fall 2022 45
Single Cycle versus Pipeline
Single Cycle Implementation (CC = 800 ps):

Cycle 1 Cycle 2
Clk
lw sw Waste
Pipeline Implementation (CC = 200 ps): 400 ps

lw IFetch Dec Exec Mem WB
sw IFetch Dec Exec Mem WB
R-type IFetch Dec Exec Mem WB
❑ To complete an entire instruction in the pipelined case

takes 1000 ps (as compared to 800 ps for the single
cycle case). Why ?
❑ How long does each take to complete 1,000,000 adds ?
IT3030E, Fall 2022 47
Example with lw instructions
Single-cycle (Tc= 800ps)
Pipelined (Tc= 200ps)
IT3030E, Fall 2022 48

Pipeline hazards
❑ Pipeline can lead us into troubles!!!
❑ Hazards: situations that prevent starting the next
instruction in the next cycle
l structural hazards: attempt to use the same resource by two
different instructions at the same time
l data hazards: attempt to use data before it is ready
- An instruction’s source operand(s) are produced by a prior
instruction still in the pipeline
l control hazards: attempt to make a decision about program
control flow before the condition has been evaluated and the
new PC target address calculated
- branch and jump instructions, exceptions
❑ In most cases, hazard can be solved simply by waiting

l but we need better solutions to take advantages of pipeline
IT3030E, Fall 2022 50

Structural hazard
❑ Conflict for use of a resource
❑ In MIPS pipeline with a single memory
l Load/store requires data access
l Instruction fetch would have to stall for that cycle
- Would cause a pipeline “bubble”
❑ Hence, pipelined datapath require separate

instruction/data memories
l Or separate instruction/data caches
IT3030E, Fall 2022 51

A Single Memory Would Be a Structural Hazard
Time (clock cycles)
Reading data from

lw
ALU
I Mem Reg Mem Reg
memory
n
s
ALU
t Inst 1 Mem Reg Mem Reg
r.
ALU
O Inst 2 Mem Reg Mem Reg
r
d
ALU
e Inst 3 Mem Reg Mem Reg
r
ALU
Inst 4 Mem Reg Mem Reg
Reading instruction
from memory
❑ Fix with separate instr and data memories (I$ and D$)
IT3030E, Fall 2022 52
How About Register File Access?
Time (clock cycles)
Fix register file

add $1,
ALU
I IM Reg DM Reg access hazard by
n doing reads in the
s second half of the
ALU
t Inst 1 IM Reg DM Reg
cycle and writes in
r. the first half
ALU
O Inst 2 IM Reg DM Reg
r
d
ALU
e add $2,$1, IM Reg DM Reg
r
clock edge that controls clock edge that controls

register writing loading of pipeline state
IT3030E, Fall 2022
registers 53
Data hazard
❑ An instruction depends on completion of data access by a

previous instruction
add $s0, $t0, $t1
sub $t2, $s0, $t3
CPU must wait

until data in s0
becomes valid
IT3030E, Fall 2022 54

Example
❑ Dependencies backward in time cause hazards
ALU
add $1, IM Reg DM Reg
ALU
sub $4,$1,$5 IM Reg DM Reg
ALU
and $6,$1,$7 IM Reg DM Reg
ALU
or $8,$1,$9 IM Reg DM Reg
ALU
xor $4,$1,$5 IM Reg DM Reg
❑ Read before write data hazard

IT3030E, Fall 2022 55
Example
ALU
I lw $1,4($2) IM Reg DM Reg
n
s
ALU
t sub $4,$1,$5 IM Reg DM Reg
r.
ALU
O and $6,$1,$7 IM Reg DM Reg
r
d
ALU
e or $8,$1,$9 IM Reg DM Reg
r
ALU
xor $4,$1,$5 IM Reg DM Reg
❑ Load-use data hazard

IT3030E, Fall 2022 56
Solving hazard with forwarding
❑ Use result when it is computed

Don’t wait for it to be stored in a register
Requires extra connections in the datapath
❑ Forward from EX to EX (output to input)
IT3030E, Fall 2022 57

Load-Use Data Hazard
❑ One cycle stall is necessary

❑ Forward from MEM (output) to EX (input)
IT3030E, Fall 2022 58

Code Scheduling to Avoid Stalls
❑ Reorder code to avoid use of load result in the next

instruction
❑ C code: A = B + E;
C = B + F;
lw $t1, 0($t0) lw $t1, 0($t0)

lw $t2, 4($t0) lw $t2, 4($t0)
stall add $t3, $t1, $t2 lw $t4, 8($t0)
sw $t3, 12($t0) add $t3, $t1, $t2
lw $t4, 8($t0) sw $t3, 12($t0)
stall add $t5, $t1, $t4 add $t5, $t1, $t4
sw $t5, 16($t0) sw $t5, 16($t0)
13 cycles 11 cycles
IT3030E, Fall 2022 59

Control Hazards
❑ Branch determines flow of control
Fetching next instruction depends on branch outcome
Pipeline can’t always fetch correct instruction
- Still working on ID stage of branch
❑ In MIPS pipeline
Need to compare registers and compute target early in the
pipeline
Add hardware to do it in ID stage
IT3030E, Fall 2022 60

Branch Instructions Cause Control Hazards
beq
ALU
I IM Reg DM Reg
n
s
ALU
t lw IM Reg DM Reg
r.
ALU
r
d
ALU
e Inst 4 IM Reg DM Reg
r
IT3030E, Fall 2022 61

Stall on Branch
❑ Naïve approach: Wait until branch outcome determined

before fetching next instruction
Performance affect: assume that 17% of instructions in program are

branches, if each branch take one cycle for the stall, then performance
will be 17% slower. (CPI = 1.17)
IT3030E, Fall 2022 62

Branch Prediction
❑ Predict outcome of branch
❑ Only stall if prediction is wrong
❑ In MIPS pipeline
Can predict branches not taken
Fetch instruction after branch, with no delay
IT3030E, Fall 2022 63

MIPS with Predict Not Taken
Prediction
correct
Prediction
incorrect
IT3030E, Fall 2022 64

More-Realistic Branch Prediction
❑ Static branch prediction

Based on typical branch behavior
Example: loop and if-statement branches
- Predict backward branches taken
- Predict forward branches not taken
❑ Dynamic branch prediction

Hardware measures actual branch behavior
- e.g., record recent history of each branch
Assume future behavior will continue the trend
- When wrong, stall while re-fetching, and update history
As good as > 90% accuracy
IT3030E, Fall 2022 65

Summary: Pipeline Operation
Time (clock cycles)
Once the
ALU
I Inst 0 IM Reg DM Reg pipeline is full,
n one instruction
s is completed
ALU
t Inst 1 IM Reg DM Reg
every cycle, so
r. CPI = 1
ALU
r
d
ALU
e Inst 3 IM Reg DM Reg
r
ALU
Inst 4 IM Reg DM Reg
Time to fill the pipeline
IT3030E, Fall 2022 66

Exercise 1
Assume that the following instructions are executed in a 5-stage-pipelined MIPS CPU.
Draw the timeline of each intruction
IF ID EX MEM WB
add $s0, $s1, $s2

lw $t0, 0($t1)
sw $t2, 0($t3)
bne $s0, $s1, EXIT
IT3030E, Fall 2022 67

Pipelined datapath
❑ How to share/isolate data between different stages?
IT3030E, Fall 2022 68

Pipelined datapath
❑ Adding pipeline registers
❑ 5 stages, why only 4 registers?

IT3030E, Fall 2022 69
Instruction fetch in pipeline
IT3030E, Fall 2022 70

Instruction decode
IT3030E, Fall 2022 71

Execution
IT3030E, Fall 2022 72

Memory access
IT3030E, Fall 2022 73

Write-back
IT3030E, Fall 2022 74

Correction to support lw instruction
IT3030E, Fall 2022 75

Pipeline diagram
IT3030E, Fall 2022 76

Pipeline diagram
IT3030E, Fall 2022 77

Control signals in pipeline
IT3030E, Fall 2022 78

Pipelined control
IT3030E, Fall 2022 79

Solving read-after-write hazard
IT3030E, Fall 2022 80

Solving read-after-write hazard
IT3030E, Fall 2022 81

Solving load-use hazard
IT3030E, Fall 2022 82

IT3030E, Fall 2022 83

IT3030E, Fall 2022 84

Solving control hazard
IT3030E, Fall 2022 85

Pipeline when
Branch taken
IT3030E, Fall 2022 86

Summary
❑ All modern-day processors use pipelining

❑ Pipelining doesn’t help latency of single task, it helps
throughput of entire workload
❑ Potential speedup: a CPI of 1 and a fast CC
❑ Must detect and resolve hazards
l Stalling negatively affects CPI (makes CPI less than the ideal
of 1)
IT3030E, Fall 2022 87

IT3030E CA Chap5 CPU

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

IT3030E CA Chap5 CPU

Uploaded by

Copyright:

Available Formats

Chapter 5: The Processor

Ngo Lam Trung, Pham Ngoc Hung

[with materials from Computer Organization and Design, 4th Edition,

CPI: cycle per instruction

IT3030E, Fall 2022 2

❑ We will examine two MIPS implementations

❑ Limit to a simple subset of MIPS ISA

IT3030E, Fall 2022 3

❑ All instructions (except j) use the ALU after reading the

❑ In this chapter: implementation of CPU that can execute

IT3030E, Fall 2022 4

Don’t panic! We’ll build this incrementally.

IT3030E, Fall 2022 6

• Example: reading two values from the Register File

IT3030E, Fall 2022 8

add s1, s2, s3

Draw connection between a and b to form the execution unit?

We need the write control signal to control when the result is

RegWrite ALU control MemWrite

IT3030E, Fall 2022

RegWrite ALU control MemWrite

IT3030E, Fall 2022 12

Read Addr 1 zero (to branch

IT3030E, Fall 2022 14

❑ Assemble the datapath segments and add control lines

IT3030E, Fall 2022 15

IT3030E, Fall 2022 16

IT3030E, Fall 2022 17

IT3030E, Fall 2022 18

➔ We already designed this in prev. chapter

IT3030E, Fall 2022 20

❑ For SUB operation

❑ Find control signal for NOR operation:

IT3030E, Fall 2022 23

IT3030E, Fall 2022 24

IT3030E, Fall 2022 25

IT3030E, Fall 2022 26

IT3030E, Fall 2022 27

Instr[15-0] Sign ALU

IT3030E, Fall 2022 29

Instr[15-0] Sign ALU

IT3030E, Fall 2022 30

Mark active Instr[15-0] Sign ALU

execution flow Instr[5-0]

IT3030E, Fall 2022 31

Instr[15-0] Sign ALU

IT3030E, Fall 2022 32

Mark active Instr[15-0] Sign ALU

execution flow Instr[5-0]

IT3030E, Fall 2022 34

Mark active Instr[15-0] Sign ALU

execution flow Instr[5-0]

IT3030E, Fall 2022 35

IT3030E, Fall 2022 36

IT3030E, Fall 2022 37

Instr. I Mem Reg Rd ALU Op D Mem Reg Wr Total

Instr. I Mem Reg Rd ALU Op D Mem Reg Wr Total

❑ May be wasteful of area since some functional units

IT3030E, Fall 2022 40

IT3030E, Fall 2022 41

IT3030E, Fall 2022 42

4 stages: washing, drying, ironing, folding

Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5