Branch Prediction
While a conditional jump instruction is making it’s way through the
pipeline it is not known yet which instruction will be the next one, since
it will be determined by the jump instruction itself. So it is unclear
which next instructions should be fed into the pipeline. (3 nop operations
would have to be inserted to make sure the result is ready)
Branch prediction guesses one of the two possible outcomes of the jump and puts them in the pipeline at the risk of choosing the wrong one and having to undo the work it did on the instructions so far.
But, if it guesses correctly then there is no performance hit due to the jump.
Static branch prediction #
“The branch will always be predicted the same way”
- Hardware
- backward jumps are predicted to be always taken (loops are run much more often than they are exited)
Software: 1. ISA could define a bit in the jump’s upcode used for prediction -> compilers can make use of it
2. <span class="underline">Feedback directed compilation</span> Compilers have intel over profiling data
of the program and can try to optimize the branches
Dynamic branch prediction #
“The CPU tries to learn which branch should be taken at runtime”
- Have a Branch Prediction Buffer (BPB) of fixed size
- Can come to collisions
Single Bit Predictor #
- Not optimal for nested loops:
- Every time when done in the inner loop and bouncing back to the outer loop the last jump in the inner loop was “not taken” (breaking the loop) and it will always predict the next inner loop to then also be “not taken” which is often wrong.
Two Bit Predictor #
- 11
- strongly taken
- 10
- weakly taken
- 01
- weakly not taken
- 00
- strongly not taken
Or with saturation scheme:
Correlation Predictor #
if (aa==2)
aa=0;
if (bb==2)
bb=0;
if (aa!=bb){
...
}
Motivation: If both of the first conditions are true then the 3. jump can always be predicted correctly.
Possible implementation:
- (m, n)-Predictors
- History of last
jumps to select one of
$n$-bit predictors
- Branch History Register (BHR): - Globally store the outcome of the last
jumps and always shift by one to insert a new one
-bit shift register
- Branch History Register (BHR): - Globally store the outcome of the last
Branch Target Buffer #
Try to cache the branch target given the address of the branching instruction.
- Can be done in the IF phase (since the table lookup uses the address whic is knon in IF)
- Optionally stores prediction bits; If not then an entry in the table implies the jump. (And is only written into the table if jup was done, else it is not written / removed)
- This one does not have prediction bits (entry means jump, no entry means no jump)
Hybrid Branch Predictor #
- Different Branch predictors preform differently in different circumstances.
- Correlation predictors need warmup (after every context switch)
- Solution: Have multiple branch predictors, and have one “selector” which decides which predictor to pick -> Local predictor after context switch and correlation after warmup
- Selector is updated according to the correctness of the predictors