
Lecture 11

Instruction Pipelining
Control Hazards

Zelalem Birhanu, AAiT 1


In this lecture:

Pipelining hazards
Control hazards



Review

5-stage pipeline (time runs left to right)

I1  FI DI FO EI WO
I2     FI DI FO EI WO
I3        FI DI FO EI WO
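
The staggered layout of the 5-stage pipeline can be generated mechanically. The following is a minimal Python sketch, assuming an ideal pipeline with no hazards; the names `pipeline_diagram` and `ideal_cycles` are chosen here for illustration and do not come from the lecture.

```python
STAGES = ["FI", "DI", "FO", "EI", "WO"]  # the five stages used in these slides


def pipeline_diagram(instructions):
    """Return a space-time diagram for an ideal (hazard-free) pipeline."""
    width = max(len(name) for name in instructions)
    rows = []
    for i, name in enumerate(instructions):
        # Instruction i enters the pipeline i cycles after the first one.
        cells = ["  "] * i + STAGES
        rows.append(f"{name:<{width}}  " + " ".join(cells))
    return "\n".join(rows)


def ideal_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles needed when one instruction completes per cycle after fill."""
    return n_stages + n_instructions - 1


print(pipeline_diagram(["I1", "I2", "I3"]))
print(ideal_cycles(3))  # 7
```

With three instructions the last WO stage falls in cycle 7, which is the baseline that the hazards below add to.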

Pipeline hazards
    Resource hazards
    Data hazards
    Control hazards



Control Hazards

Arise from the need to make a decision based on the results of one instruction while others are executing
Occur with branch instructions

e.g.
100: JMP 200
     ADD R1,R2
200: SUB R1,R2

Time ->
JMP 200    FI DI FO EI WO      PC=200
ADD R1,R2     FI DI FO EI      Wrong instruction is fetched



Control Hazards (cont'd)

Approaches for handling control hazards


Detect and stall
Delayed branch
Branch prediction



Detect and Stall

When a branch is detected during instruction decoding, stop processing of next instructions until PC is updated with next instruction address

e.g.
100: JMP 200
     ADD R1,R2
200: SUB R1,R2

Time ->
JMP 200    FI   DI   FO   EI   WO
ADD R1,R2       FI   idle idle idle
SUB R1,R2                           FI   DI

Branch detected at the DI stage of JMP 200; PC=200 once the target is known.
ADD R1,R2 is held idle: wait until next address is determined.



Detect and Stall (cont'd)

Typically there are a large number of branch instructions in a program (e.g. if-else statements, loops)

Due to this, performance will degrade rapidly if the detect and stall method is used, especially if we have conditional branches

e.g.
for (i=0;i<1000;i++)
{
    j=j+i;
}

     MOV R1,0
100: ADD R2,R1
     INC R1
     CMP R1,1000    ; compare R1 with 1000
     JL 100         ; jump if R1<1000

The branch JL 100 executes once per iteration, so there will be 1000 stalls for this code
We need better solutions
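
To get a feel for the cost, the stall count can be turned into a rough cycle estimate. This is a back-of-the-envelope sketch, not a figure from the lecture: it assumes the 5-stage pipeline from the review, one instruction completing per cycle when there are no hazards, and a fixed penalty per branch. The value of 3 cycles per stall is an assumption for illustration, and `total_cycles` is a hypothetical helper.

```python
def total_cycles(n_instructions, n_branches, stall_per_branch, n_stages=5):
    """Ideal pipelined cycle count plus a fixed stall penalty per branch."""
    ideal = n_stages + n_instructions - 1
    return ideal + n_branches * stall_per_branch


iterations = 1000
n_instructions = 1 + 4 * iterations  # MOV once, then ADD/INC/CMP/JL per iteration
n_branches = iterations              # JL 100 executes once per iteration

ideal = total_cycles(n_instructions, n_branches, stall_per_branch=0)
stalled = total_cycles(n_instructions, n_branches, stall_per_branch=3)  # assumed penalty
print(ideal, stalled)  # 4005 7005
```

Under these assumptions the loop takes about 75% longer than the hazard-free case, which is why the remaining approaches try to avoid stalling on every branch.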
Delayed Branch

Compiler automatically rearranges instructions so that other instructions execute until a branch address is determined

Before:
     MOV R3,8       ; not related to the branch instruction
     MOV R4,9       ; not related to the branch instruction
     MOV R1,0
100: ADD R2,R1
     INC R1
     CMP R1,1000
     JL 100

After:
     MOV R1,0
100: ADD R2,R1
     INC R1
     CMP R1,1000
     JL 100
     MOV R3,8
     MOV R4,9

Insert No Operation (NOP) instructions if no useful instructions can be placed after branch operation
Delayed Branch (cont'd)

Time ->
JL 100          FI DI FO EI WO                  PC=100
MOV R3,8           FI DI FO EI WO
MOV R4,9              FI DI FO EI WO
NOP                      FI DI FO EI
NOP                         FI DI FO
100: ADD R2,R1                 FI DI

Not very efficient since usually there are no instructions to insert after branch instructions
Branch Prediction

Speculate (predict) what the next instruction will be instead of waiting (stalling) until a branch instruction is executed

Different types:
Static prediction: Prediction does not change
    Predict a branch is never taken
    Predict a branch is always taken
    Predict backward is taken, forward is not taken
Dynamic prediction: Prediction changes based on branch history
    Last time prediction (single bit)
    Two-bit counter based prediction
Branch Never Taken

Assume a branch is never taken and fetch the instruction next to the branch instruction

e.g.
i=1;
j=1;
if(i==j)
    k=0;
else
    k=1;

        MOV R1,1
        MOV R2,1
        CMP R1,R2
        JNE else       ; jump if R1!=R2
        MOV R3,0
        JMP else+1
else:   MOV R3,1
else+1:



Branch Never Taken (cont'd)

JNE else   FI DI FO EI WO      Branch detected, fetch next instruction
MOV R3,0      FI DI FO EI WO

When prediction is wrong (a branch should be taken), drop the fetched instruction and fetch the instruction at the target address (the address specified by the branch instruction)

Low accuracy (30-40%), e.g. during loops, where the loop branch is taken on almost every iteration



Branch Always Taken

Assume a branch is always taken and fetch the instruction at the target address
Better accuracy (60-70%)

     MOV R1,0
100: ADD R2,R1
     INC R1
     CMP R1,1000
103: JL 100

BTB
Branch instr. address    Branch target
103                      100

JL 100     FI DI FO EI WO      Branch detected, fetch instruction from address 100
ADD R2,R1     FI DI FO EI WO

How does the processor know the target address?
Store the target address in a Branch Target Buffer (BTB) after the first iteration, and use this address for subsequent branch operations
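
The BTB lookup described here can be sketched as a small table keyed by the branch instruction's address. This is an illustrative model only, not a hardware design: the class and method names are hypothetical, the table is unbounded, and it assumes instruction addresses step by 1, as the labels 100 to 103 in the example suggest.

```python
class BranchTargetBuffer:
    """Maps a branch instruction's address to its last known target address."""

    def __init__(self):
        self._targets = {}

    def predict_next_pc(self, branch_addr):
        # Hit: fetch from the stored target ("always taken").
        # Miss: the target is not known yet, so fall back to the next
        # sequential address (assumes addresses step by 1).
        return self._targets.get(branch_addr, branch_addr + 1)

    def record(self, branch_addr, target_addr):
        # Called once the branch has executed and its target is known.
        self._targets[branch_addr] = target_addr


btb = BranchTargetBuffer()
first = btb.predict_next_pc(103)  # first iteration: no entry for JL 100 yet
btb.record(103, 100)              # target learned after the branch executes
later = btb.predict_next_pc(103)  # later iterations use the stored target
print(first, later)  # 104 100
```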
Backward Taken, Forward Not Taken

Predict backward branches (loop branches) as taken and forward branches as not taken

     MOV R1,0
100: ADD R2,R1
     INC R1
     CMP R1,1000
103: JL 100         ; backward branch, branch is taken

Better accuracy compared to the previous two approaches
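
The three static schemes can be compared on a small branch trace. The sketch below is illustrative only: the loop branch uses the addresses from the example (103 jumping back to 100, taken 999 times and then not taken once), while the forward branch at address 203 with target 206 is a made-up stand-in for an if-else branch that is not taken. The accuracy figures it produces apply to this toy trace, not to real programs.

```python
def never_taken(branch_addr, target_addr):
    return False


def always_taken(branch_addr, target_addr):
    return True


def btfn(branch_addr, target_addr):
    # Backward branch (target before the branch) is predicted taken,
    # forward branch is predicted not taken.
    return target_addr < branch_addr


def accuracy(predictor, trace):
    """Fraction of branches in the trace whose outcome was predicted correctly."""
    correct = sum(predictor(b, t) == taken for b, t, taken in trace)
    return correct / len(trace)


# Each entry: (branch address, target address, actually taken?)
loop_branch = [(103, 100, True)] * 999 + [(103, 100, False)]
forward_branch = [(203, 206, False)]  # hypothetical addresses
trace = loop_branch + forward_branch

for predictor in (never_taken, always_taken, btfn):
    print(predictor.__name__, round(accuracy(predictor, trace), 3))
```

On this trace the backward-taken, forward-not-taken rule gets both kinds of branch mostly right, whereas each of the other two rules is wrong for one of them.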
Dynamic: Last time prediction

If a branch was taken last time, then take the branch, else do not take branch
A single bit is used to store previous history

State 0 (predict not taken): actually taken -> state 1; actually not taken -> stay in state 0
State 1 (predict taken):     actually taken -> stay in state 1; actually not taken -> state 0



Last time prediction (cont'd)

e.g. What is the accuracy of a last time branch prediction for the following actual branch history (loop)?

Actual:      T T T T T N      T: Taken
Prediction:  N T T T T T      N: Not taken

The first prediction (N) is carried over from the previous loop. After the final N the stored bit is N again.
66.7% accuracy (4 of 6 correct); (N-2)/N, for N iterations

Ex. How about for this?
Actual TNTNTNTNTN
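
The last time scheme is easy to check by simulation. Here is a minimal sketch, assuming the history is given as a string of `T` and `N` and that the stored bit starts as `N` (carried over from a previous loop, as in the worked example); `last_time_accuracy` is a name chosen for illustration.

```python
def last_time_accuracy(history, initial="N"):
    """Accuracy of a one-bit predictor that repeats the previous outcome."""
    state = initial  # the single stored bit: last observed outcome
    correct = 0
    for actual in history:
        if state == actual:  # the prediction is simply the stored bit
            correct += 1
        state = actual       # remember this outcome for next time
    return correct / len(history)


print(round(last_time_accuracy("TTTTTN"), 3))  # 0.667
```

The same function can be called on any other history string, including the one in the exercise.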
Dynamic: Two-bit Counter Based Prediction

A prediction is given two chances before it is changed (two bits used to store previous history)

[State diagram: four states, 00, 01, 10 and 11. States 00 and 01 predict not taken; states 10 and 11 predict taken. Each state has one "actually taken" and one "actually not taken" transition.]
Two-bit Counter Based Prediction (cont'd)

e.g. What is the accuracy of a two-bit counter based branch prediction for the following actual branch history (loop)?

Actual:      T T T T T N
Prediction:  T T T T T T

The first prediction (T) is carried over from the previous loop. After the final N the prediction is still T.
83.3% accuracy (5 of 6 correct); (N-1)/N, for N iterations

Ex. How about for this?
Actual TNTNTNTNTN
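
A two-bit predictor can be simulated in the same way. The sketch below assumes the common saturating-counter variant: the counter moves up on a taken branch and down on a not-taken branch, and values 2 and 3 (binary 10 and 11) predict taken. The exact transitions in the lecture's state diagram may differ from this, and the starting value of 3 (prediction T carried over from a previous loop) is also an assumption; `two_bit_accuracy` is a hypothetical name.

```python
def two_bit_accuracy(history, initial=3):
    """Accuracy of a two-bit saturating-counter predictor on a T/N history."""
    counter = initial  # 0..3, i.e. binary 00..11
    correct = 0
    for actual in history:
        predicted = "T" if counter >= 2 else "N"
        if predicted == actual:
            correct += 1
        # Saturating update: never goes above 3 or below 0.
        if actual == "T":
            counter = min(counter + 1, 3)
        else:
            counter = max(counter - 1, 0)
    return correct / len(history)


print(round(two_bit_accuracy("TTTTTN"), 3))  # 0.833
```

A single not-taken outcome only moves the counter from 3 to 2, so the prediction stays T for the next loop, which is the "two chances" behaviour.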



More Readings

1. Computer Organization and Architecture, William Stallings, 8th edition (section 12.4)

