
Branch Prediction

Easiest (static prediction)


Always taken, always not taken
Opcode based
Displacement based (forward not taken, backward taken)
Compiler directed (branch likely, branch not likely)

Next easiest (dynamic prediction)


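1-bit predictor: remember the last outcome (taken or not taken) of each branch
Use a branch-prediction buffer (branch-history table) with 1-bit entries
Use part of the PC (the low-order bits) to index the buffer/table. Why? The table has far fewer entries than there are possible branch addresses.
Consequence: multiple branches may share the same bit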
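The indexing scheme above can be sketched in a few lines. This is only an illustration: the function name `bht_index`, the table size of 1024 entries, the 4-byte instruction size, and the two example addresses are assumptions, not taken from the slides.

```python
def bht_index(pc: int, entries: int, instr_bytes: int = 4) -> int:
    """Index a branch-history table with the low-order bits of the PC.

    Assumes `entries` is a power of two and fixed-size, aligned
    instructions of `instr_bytes` bytes (illustrative assumptions).
    """
    word_addr = pc // instr_bytes  # drop the byte-offset bits (always zero when aligned)
    return word_addr % entries     # keep only the low-order bits


# Two branches whose addresses differ only in the high-order bits
# map to the same entry, so they share one prediction bit.
a = bht_index(0x0040_1008, entries=1024)
b = bht_index(0x0080_1008, entries=1024)
print(a, b, a == b)
```

Neighbouring branches get different entries, while branches that are far apart in memory can collide. That is the sharing the slide refers to.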


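Invert the bit if the prediction is wrong
Backward branches for loops will be mispredicted twice per execution of the loop
Ex: If a loop branch is taken 9 times in a row and then not taken once, what is the prediction accuracy?
Ans: The branch is mispredicted on the first iteration (the bit still says "not taken" from the previous loop exit) and on the last iteration (the exit). That is 2 mispredictions out of 10, or 80% prediction accuracy, although the branch is taken 90% of the time.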
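The 80% figure can be checked with a small simulation. This is a minimal sketch: the function name `one_bit_accuracy` and the choice to repeat the loop 100 times are assumptions made for illustration.

```python
def one_bit_accuracy(outcomes, initial=False):
    """Run a 1-bit predictor over a list of outcomes (True = taken)."""
    bit = initial      # the single prediction bit; False means "predict not taken"
    correct = 0
    for taken in outcomes:
        if bit == taken:
            correct += 1
        bit = taken    # same effect as inverting the bit on a misprediction
    return correct / len(outcomes)


# One execution of the loop: branch taken 9 times, then not taken once.
loop = [True] * 9 + [False]
trace = loop * 100     # the loop is entered 100 times
print(one_bit_accuracy(trace))
```

Each execution of the loop contributes two mispredictions, the first iteration and the exit, so the accuracy settles at 8 out of 10.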

Bimodal or 2-bit Branch Prediction


Has 4 states instead of 2, allowing for more information about tendencies
A prediction must miss twice before it is changed
Good for backward branches of loops
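A rough sketch of why two bits help with loop branches, assuming the common saturating-counter encoding (states 0 and 1 predict not taken, 2 and 3 predict taken). The function name `two_bit_accuracy` and the warmed-up starting state are illustrative assumptions.

```python
def two_bit_accuracy(outcomes, counter=3):
    """Run a 2-bit saturating counter over a list of outcomes (True = taken).

    States 0 and 1 predict not taken; states 2 and 3 predict taken.
    """
    correct = 0
    for taken in outcomes:
        predicted_taken = counter >= 2
        if predicted_taken == taken:
            correct += 1
        # Move one state toward the observed outcome, saturating at 0 and 3.
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return correct / len(outcomes)


# Same loop as in the 1-bit example: taken 9 times, then not taken once.
loop = [True] * 9 + [False]
trace = loop * 100
print(two_bit_accuracy(trace))
```

Starting from the strongly-taken state, the single not-taken outcome at the loop exit only weakens the counter, so the first iteration of the next execution is still predicted taken. Only the exit is mispredicted, one miss in ten.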

Prediction accuracy ranges from 99% down to 82%, i.e., a misprediction rate of 1% to 18%

Correlating or Two-level Predictors


Correlating branch predictors also look at other branches for clues. Consider the following example.

    if (aa == 2)      /* branch b1 */
        aa = 0;
    if (bb == 2)      /* branch b2 */
        bb = 0;
    if (aa != bb) {   /* branch b3 */
    }

Branch b3 clearly depends on the results of b1 and b2: if both conditions hold, aa and bb are both set to 0, so aa != bb is false.
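The dependence of b3 on b1 and b2 can be checked by brute force. The sketch below mirrors the example in Python. The function name `run_branches` and the small value range are assumptions. It records whether each condition held, not the direction of the compiled branch instruction.

```python
def run_branches(aa, bb):
    """Evaluate the three conditions of the example in order.

    Returns (b1, b2, b3): whether each if-condition was true.
    """
    b1 = aa == 2
    if b1:
        aa = 0
    b2 = bb == 2
    if b2:
        bb = 0
    b3 = aa != bb
    return b1, b2, b3


results = [run_branches(aa, bb) for aa in range(4) for bb in range(4)]

# Whenever both b1 and b2 held, b3 is known in advance to be false.
print(all(not b3 for b1, b2, b3 in results if b1 and b2))
```

A predictor that only looks at b3's own history cannot exploit this. One that also sees the outcomes of b1 and b2 can.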

A (1,1) predictor uses the history of 1 branch and a 1-bit predictor. Each entry holds two prediction bits:
Prediction if the last branch is NT (not taken)
Prediction if the last branch is T (taken)
Hardware is needed to improve the accuracy of prediction.

Performance of Correlating Branch Prediction


With the same number of state bits, a (2,2) predictor performs better than a noncorrelating 2-bit predictor.
It even outperforms a 2-bit predictor with an infinite number of entries.

Figure: a (2,2) predictor. The branch address (4 bits) and the 2-bit recent global branch history (01 = not taken then taken) together select one of the 2-bit per-branch local predictors, which supplies the prediction.
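To make "same number of state bits" concrete, the storage of an (m,n) predictor can be counted as 2^m history patterns times n bits times the number of entries selected by the branch address. The helper name `predictor_bits` is an assumption introduced for this sketch.

```python
def predictor_bits(m: int, n: int, address_bits: int) -> int:
    """Total state bits of an (m, n) predictor.

    2**m history patterns, n bits per counter, and 2**address_bits
    entries selected by the branch address.
    """
    return (2 ** m) * n * (2 ** address_bits)


# The (2,2) predictor from the figure: 4 address bits, 2 history bits.
print(predictor_bits(m=2, n=2, address_bits=4))

# A plain 2-bit predictor (no history) needs 6 address bits,
# i.e. 64 entries, to use the same amount of state.
print(predictor_bits(m=0, n=2, address_bits=6))
```

Both configurations come to 128 bits, so the comparison above is between a 16-entry (2,2) predictor and a 64-entry 2-bit predictor.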

General (m,n) Branch Predictors


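An (m,n) predictor uses the behavior of the last m branches to choose among 2^m predictors, each of which is an n-bit predictor.
The global history register (GHR) is an m-bit shift register that records the outcomes of the last m branches encountered by the processor.
Usually both the PC address and the GHR are used (2-level).

Figure: the PC and the m-bit GHR feed a combining function, which selects one of the n-bit predictors.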
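A minimal sketch of a general (m,n) predictor follows. The slides do not say what the combining function is, so this sketch assumes simple concatenation of the low-order PC bits with the GHR. The class name `CorrelatingPredictor`, the helper `accuracy`, and the test trace are likewise assumptions.

```python
class CorrelatingPredictor:
    """Sketch of an (m, n) predictor: m bits of global history, n-bit counters."""

    def __init__(self, m, n, address_bits):
        self.m = m
        self.address_bits = address_bits
        self.ghr = 0                      # m-bit global history, newest outcome in bit 0
        self.max_count = (1 << n) - 1     # largest value of an n-bit counter
        # One n-bit counter per (branch-address index, history pattern) pair.
        self.table = [0] * (1 << (address_bits + m))

    def _index(self, pc):
        # Assumed combining function: concatenate low-order PC bits with the GHR.
        pc_bits = (pc >> 2) & ((1 << self.address_bits) - 1)
        return (pc_bits << self.m) | self.ghr

    def predict(self, pc):
        # Predict taken when the counter is in the upper half of its range.
        return self.table[self._index(pc)] > self.max_count // 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.table[i] = min(self.table[i] + 1, self.max_count)
        else:
            self.table[i] = max(self.table[i] - 1, 0)
        # Shift the outcome into the history register, keeping only m bits.
        self.ghr = ((self.ghr << 1) | int(taken)) & ((1 << self.m) - 1)


def accuracy(predictor, trace):
    """Fraction of correct predictions over (pc, taken) pairs."""
    correct = 0
    for pc, taken in trace:
        correct += predictor.predict(pc) == taken
        predictor.update(pc, taken)
    return correct / len(trace)


# A single branch that alternates taken / not taken.
trace = [(0x1000, i % 2 == 0) for i in range(1000)]
print(accuracy(CorrelatingPredictor(0, 2, 4), trace))  # no history
print(accuracy(CorrelatingPredictor(1, 2, 4), trace))  # one bit of history
```

On this trace the history-free (0,2) predictor is right only about half the time, while the (1,2) predictor is right more than 99% of the time after a short warm-up, because one bit of history is enough to tell the two cases apart.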

Tournament Predictors
Multiple predictors, with buffer sizes of 8 to 32 Kb.
Tournament predictors use two predictors, one based on global information and one based on local information, and combine them with a selector.
Tournament predictors using about 30 Kbits are found in processors such as the P5 and P4.
The Alpha 21264 uses 4K 2-bit counters to choose between a global predictor and a local predictor.
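The selector idea can be sketched as follows. This is a simplified illustration, not the Alpha 21264 design: the local predictor here is a plain table of 2-bit counters, the selector is assumed to be indexed by the branch address, and all class and function names are hypothetical.

```python
class TwoBitTable:
    """Table of 2-bit saturating counters; values 2 and 3 read as True."""

    def __init__(self, index_bits):
        self.mask = (1 << index_bits) - 1
        self.counters = [1] * (1 << index_bits)

    def read(self, index):
        return self.counters[index & self.mask] >= 2

    def train(self, index, up):
        i = index & self.mask
        c = self.counters[i]
        self.counters[i] = min(c + 1, 3) if up else max(c - 1, 0)


class TournamentPredictor:
    def __init__(self, index_bits=12):
        self.local = TwoBitTable(index_bits)     # indexed by branch address
        self.global_ = TwoBitTable(index_bits)   # indexed by global history
        self.selector = TwoBitTable(index_bits)  # True means "trust the global predictor"
        self.history = 0
        self.history_mask = (1 << index_bits) - 1

    def predict(self, pc):
        if self.selector.read(pc >> 2):
            return self.global_.read(self.history)
        return self.local.read(pc >> 2)

    def update(self, pc, taken):
        local_ok = self.local.read(pc >> 2) == taken
        global_ok = self.global_.read(self.history) == taken
        # Train the selector only when the two predictors disagree.
        if local_ok != global_ok:
            self.selector.train(pc >> 2, up=global_ok)
        self.local.train(pc >> 2, up=taken)
        self.global_.train(self.history, up=taken)
        self.history = ((self.history << 1) | int(taken)) & self.history_mask


def accuracy(predictor, trace):
    """Fraction of correct predictions over (pc, taken) pairs."""
    correct = 0
    for pc, taken in trace:
        correct += predictor.predict(pc) == taken
        predictor.update(pc, taken)
    return correct / len(trace)


# An alternating branch defeats the local counters but not the global history.
trace = [(0x1000, i % 2 == 0) for i in range(1000)]
print(accuracy(TournamentPredictor(), trace))
```

On this trace the selector moves toward the global predictor after a few disagreements, and the combined accuracy ends up well above what the local 2-bit counters alone would achieve.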


Note: 12 bits of global branch history. SPECfp95: 1 misprediction per 1000 completed instructions (< 1%). SPECint95: 11.5 mispredictions per 1000 completed instructions.

Accuracy v. Size (SPEC89)


Figure: conditional branch misprediction rate (0% to 10%) versus total predictor size (0 to 128 Kbits) for three predictors: local 2-bit counters, correlating (2,2) scheme, and tournament.
