Handout_7
Branch Prediction
Static Branch Prediction
When a prediction is not based on the run-time behavior of the branch, i.e. the predictor always
predicts the branch in the same direction (taken or not-taken), it is called static branch prediction.
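As a minimal sketch of this definition, the Python snippet below measures how a fixed-direction predictor fares on a sequence of branch outcomes. The trace (a loop branch taken nine times, then not taken once) and the function name `static_accuracy` are illustrative assumptions, not part of the handout.

```python
def static_accuracy(outcomes, predict_taken):
    """Fraction of branches predicted correctly by a fixed-direction predictor.

    outcomes: list of booleans, True = the branch was taken.
    predict_taken: the single direction the static predictor always guesses.
    """
    correct = sum(1 for taken in outcomes if taken == predict_taken)
    return correct / len(outcomes)


# Hypothetical trace: a loop-closing branch taken 9 times, then falling through.
loop_trace = [True] * 9 + [False]

always_taken = static_accuracy(loop_trace, predict_taken=True)
always_not_taken = static_accuracy(loop_trace, predict_taken=False)
print(always_taken, always_not_taken)
```

On this assumed trace the same static scheme is either very good or very poor depending only on the fixed direction chosen, since it never looks at what the branch actually did.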
Dynamic Branch Prediction
• The goal of dynamic branch prediction is to use the run-time behavior of a branch to predict
more accurately which direction (taken or not-taken) that branch will follow.
• To achieve this, the following hardware structures are employed.
Branch Prediction Buffer
• A branch prediction buffer (BPB), also called a branch history table (BHT), holds the
required run-time information.
o The BHT is indexed by the lower 16 bits of the branch instruction address. Each entry
is a 1-bit or 2-bit value, depending on whether a 1-bit or 2-bit predictor is used.
[Figure: the lower 16 bits of the PC index a table of predictions, one per entry (T, NT, T, T, NT, NT, ...)]
o The indexed entry gives the predicted direction of the branch.
o In the 1-bit prediction scheme, the entry is inverted if the branch was predicted incorrectly
and left unchanged otherwise.
o In effect, the predictor guesses that a branch will behave the same way as it did the last time.
o The following state diagram shows the operation of a 1-bit branch predictor.
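[Figure: state diagram of a 1-bit branch predictor]
State 0 (Predict Not-Taken): Not-Taken Branch -> stay in state 0; Taken Branch -> go to state 1
State 1 (Predict Taken): Taken Branch -> stay in state 1; Not-Taken Branch -> go to state 0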
o The drawback of a 1-bit branch predictor is that it reverses its prediction after a single
misprediction. This lowers prediction accuracy, especially for highly regular branches
that strongly favor taken or not-taken, as most branches do (e.g. backward branches
used as loop tests are mostly taken).
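The drawback can be made concrete with a small simulation. The following is a minimal sketch, not the handout's own implementation: the class and function names, the branch address 0x4000, the initial entry value of 0, and the loop trace are all assumptions chosen for illustration.

```python
INDEX_BITS = 16                      # the handout indexes the BHT by the lower 16 bits of the PC
INDEX_MASK = (1 << INDEX_BITS) - 1


class OneBitPredictor:
    """1-bit BHT: each entry remembers the last outcome of the branch mapped to it."""

    def __init__(self):
        # Sparse table: untouched entries default to 0 (Predict Not-Taken).
        # The initial value is an assumption; the handout does not specify one.
        self.table = {}

    def predict(self, pc):
        return self.table.get(pc & INDEX_MASK, 0) == 1   # True = predict taken

    def update(self, pc, taken):
        # Storing the actual outcome is equivalent to inverting the bit on a
        # misprediction and leaving it unchanged on a correct prediction.
        self.table[pc & INDEX_MASK] = 1 if taken else 0


def count_mispredictions(predictor, pc, outcomes):
    wrong = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            wrong += 1
        predictor.update(pc, taken)
    return wrong


# Hypothetical loop branch: taken 9 times, then not taken; the loop runs 10 times.
loop_trace = ([True] * 9 + [False]) * 10
one_bit_misses = count_mispredictions(OneBitPredictor(), 0x4000, loop_trace)
print(one_bit_misses, "mispredictions out of", len(loop_trace))
```

Under these assumptions the predictor is wrong twice per execution of the loop, once at the loop exit and once again on re-entry, even though the branch is not taken only once per execution.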
CS-421 Parallel Processing BE (CIS) Batch 2004-05
[Figure: state diagram of a 2-bit branch predictor; the recoverable portion shows states 00 and 01, both Predict Not-Taken, linked by Taken Branch and Not-Taken Branch transitions]
o As discussed previously in the context of backward branches in loops, 2-bit prediction
performs better than the 1-bit scheme, because a strongly biased branch must be
mispredicted twice in a row before its prediction changes.
o In practice, a 2-bit branch predictor gives fairly good performance (93% accuracy),
and therefore many architectures rely on 2-bit predictors for branch prediction.
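A minimal sketch of a 2-bit predictor follows. It assumes the common saturating-counter encoding (states 00 and 01 predict not-taken, 10 and 11 predict taken, counting up on a taken branch and down on a not-taken one); the handout's exact transitions may differ. The names, the address 0x4000, the initial state 00, and the loop trace are likewise illustrative assumptions.

```python
class TwoBitPredictor:
    """2-bit saturating counter per entry: 0,1 predict not-taken; 2,3 predict taken."""

    def __init__(self, index_bits=16, initial=0):
        self.mask = (1 << index_bits) - 1   # lower 16 bits of the PC by default
        self.initial = initial              # assumed starting state 00
        self.table = {}

    def predict(self, pc):
        return self.table.get(pc & self.mask, self.initial) >= 2

    def update(self, pc, taken):
        idx = pc & self.mask
        state = self.table.get(idx, self.initial)
        # Count up on taken, down on not-taken, saturating at 0 and 3.
        self.table[idx] = min(state + 1, 3) if taken else max(state - 1, 0)


def count_mispredictions(predictor, pc, outcomes):
    wrong = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            wrong += 1
        predictor.update(pc, taken)
    return wrong


# Hypothetical loop branch: taken 9 times, then not taken; the loop runs 10 times.
loop_trace = ([True] * 9 + [False]) * 10
two_bit_misses = count_mispredictions(TwoBitPredictor(), 0x4000, loop_trace)
print(two_bit_misses, "mispredictions out of", len(loop_trace))
```

With these assumptions there are two warm-up mispredictions and then one per execution of the loop (the exit). The single not-taken outcome only moves the counter from 11 to 10, which still predicts taken on re-entry.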
Downside of BPB
o Aliasing may occur, because the lower 16 bits of two different branch addresses may match.
This is tolerable because the entry is only a guess: a wrong guess costs cycles, not correctness.
o The BHT is accessed in the IF stage, yet even a correctly predicted taken branch incurs a
1-cycle penalty, because the branch target address is not known before the ID stage.
o Can we avoid this penalty?
Yes, by employing a Branch Target Buffer (BTB).
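The aliasing case can be illustrated with a short sketch. The two branch addresses are hypothetical, chosen to differ only above bit 15, and the shared 1-bit entry starting at 0 is an assumption.

```python
INDEX_MASK = (1 << 16) - 1   # lower 16 bits of the PC, as in the handout


def bht_index(pc):
    return pc & INDEX_MASK


# Two hypothetical branch addresses that differ only in the upper bits.
branch_a = 0x0001_2340
branch_b = 0x0005_2340
same_entry = bht_index(branch_a) == bht_index(branch_b)
print(hex(bht_index(branch_a)), hex(bht_index(branch_b)), same_entry)

# Worst case for a shared 1-bit entry: branch_a is always taken, branch_b is
# never taken, and they execute alternately, overwriting each other's history.
entry = 0                    # shared 1-bit entry, assumed to start at 0
mispredictions = 0
for _ in range(5):
    for taken in (True, False):          # branch_a, then branch_b
        if (entry == 1) != taken:
            mispredictions += 1
        entry = 1 if taken else 0
print(mispredictions, "mispredictions out of 10")
```

The program still executes correctly in this scenario; the cost of aliasing is only the extra mispredictions.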
Branch Target Buffer (BTB)
• Indexed during the IF stage using the PC of the instruction being fetched.
• Each entry holds:
o Predicted target address (used if the branch is taken)
o Prediction (taken/not-taken). This field is optional. If it is omitted, every entry in the
BTB represents a taken branch (this handout assumes that approach).
• The branch target address is known before the ID stage, so a correct prediction incurs no penalty.
• Not every reference is a hit. (This is in contrast to the BPB, where every reference is a hit.)
• In a variation of the BTB, the target instruction is stored in the BTB rather than the target address.
• Figure 2 shows the steps followed when a BTB is used with no prediction field.
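A minimal sketch of how a BTB without a prediction field might behave is given below. It assumes the commonly described flow: a hit predicts taken and supplies the target, a miss predicts not-taken, a taken branch is entered after it resolves, and a branch that turns out not taken is removed. The class and method names, the use of the full PC as the lookup key, and the 4-byte instruction size are assumptions for illustration.

```python
class BranchTargetBuffer:
    """BTB without a prediction field: only taken branches are kept."""

    def __init__(self):
        self.entries = {}            # branch PC -> predicted target address

    def lookup(self, pc):
        """IF stage: return the predicted next PC (fall-through on a miss)."""
        if pc in self.entries:
            return self.entries[pc]  # hit: predict taken, fetch from the target
        return pc + 4                # miss: predict not-taken (assumes 4-byte instructions)

    def resolve(self, pc, taken, target):
        """Once the branch outcome is known, keep the BTB consistent with it."""
        if taken:
            self.entries[pc] = target      # insert or refresh the taken branch
        else:
            self.entries.pop(pc, None)     # a not-taken branch must not stay in the BTB


btb = BranchTargetBuffer()
first = btb.lookup(0x100)                  # miss: falls through to 0x104
btb.resolve(0x100, taken=True, target=0x80)
second = btb.lookup(0x100)                 # hit: predicted target 0x80
btb.resolve(0x100, taken=False, target=0x80)
third = btb.lookup(0x100)                  # entry was removed: 0x104 again
print(hex(first), hex(second), hex(third))
```

Because the predicted target comes straight out of the lookup in IF, a correctly predicted taken branch does not have to wait for the ID stage.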
The following table lists branch penalties for various branches using BTB (assuming that BTB
only stores taken branches):
Branch in BTB   Branch Prediction   Actual Branch   Penalty (clock cycles)
Yes             Taken               Taken           0
Yes             Taken               Not-Taken       2
No              Not-Taken           Taken           2
No              Not-Taken           Not-Taken       0
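The table can be expressed as a small function and applied to a sequence of branches. This is a sketch only: the function name `btb_penalty` and the list of events are hypothetical, while the penalty values are taken from the table.

```python
def btb_penalty(in_btb, taken):
    """Penalty in clock cycles, following the table above.

    A BTB hit means the branch is predicted taken and a miss means it is
    predicted not-taken, so the penalty is 0 when the prediction matches
    the actual outcome and 2 otherwise.
    """
    predicted_taken = in_btb
    return 0 if predicted_taken == taken else 2


# Hypothetical sequence of (branch found in BTB, branch actually taken) pairs.
events = [(True, True), (True, False), (False, True), (False, False), (True, True)]
total_penalty = sum(btb_penalty(hit, taken) for hit, taken in events)
print(total_penalty, "penalty cycles over", len(events), "branches")
```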
******