Performance Issues in Correlated Branch Prediction Schemes
Nicolas Gloy, Michael D. Smith, Cliff Young
Abstract
Accurate static branch prediction is the key to many techniques for
exposing, enhancing, and exploiting Instruction Level Parallelism
(ILP). The initial work on static correlated branch prediction (SCBP)
demonstrated improvements in branch prediction accuracy, but did not
address overall performance. In particular, SCBP expands the size of
executable programs, which negatively affects the performance of the
instruction memory hierarchy. Using the profile information available
under SCBP, we can minimize these negative performance effects through
the application of code layout and branch alignment techniques. We
evaluate the performance effect of SCBP and these profile-driven
optimizations on instruction cache misses, branch mispredictions, and
branch misfetches for a number of recent processor implementations. We
find that SCBP improves performance over (traditional) per-branch
static profile prediction. We also find that SCBP improves the
performance benefits gained from branch alignment. As expected, SCBP
gives larger benefits on machine organizations with high
mispredict/misfetch penalties and low cache miss penalties. Finally, we
find that the application of profile-driven code layout and branch
alignment techniques (without SCBP) can also improve the performance of
dynamic correlated branch prediction techniques.
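
The trade-off described above, fewer mispredictions and misfetches
against more instruction cache misses from the larger executable, can
be sketched with a simple additive penalty model. This is only an
illustration and not the evaluation methodology of the paper: the
model, the names (Machine, EventCounts, penalty_cycles, scbp_benefit),
and all event counts and penalties are hypothetical values chosen to
make the point.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    """Per-event penalties in cycles (hypothetical values, not from the paper)."""
    icache_miss_penalty: int
    mispredict_penalty: int
    misfetch_penalty: int


@dataclass
class EventCounts:
    """Event counts for one program run (hypothetical values)."""
    icache_misses: int
    mispredicts: int
    misfetches: int


def penalty_cycles(machine: Machine, events: EventCounts) -> int:
    """Total stall cycles under a simple additive model (an assumption)."""
    return (events.icache_misses * machine.icache_miss_penalty
            + events.mispredicts * machine.mispredict_penalty
            + events.misfetches * machine.misfetch_penalty)


def scbp_benefit(machine: Machine, baseline: EventCounts, scbp: EventCounts) -> int:
    """Cycles saved by SCBP relative to the baseline; negative means a slowdown."""
    return penalty_cycles(machine, baseline) - penalty_cycles(machine, scbp)


# Made-up counts: SCBP removes some mispredicts and misfetches but adds
# instruction cache misses because the executable grows.
baseline = EventCounts(icache_misses=1000, mispredicts=5000, misfetches=2000)
scbp = EventCounts(icache_misses=1400, mispredicts=4000, misfetches=1800)

# Two made-up machine organizations.
deep_pipeline = Machine(icache_miss_penalty=5, mispredict_penalty=10, misfetch_penalty=2)
slow_memory = Machine(icache_miss_penalty=30, mispredict_penalty=3, misfetch_penalty=1)

for name, machine in [("deep_pipeline", deep_pipeline), ("slow_memory", slow_memory)]:
    print(name, scbp_benefit(machine, baseline, scbp))
```

With these made-up numbers, the machine with a high mispredict penalty
and a low cache miss penalty sees a net saving, while the machine with
the opposite balance sees a net loss. Both results follow the direction
of the trend stated in the abstract.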
Keywords
branch prediction, branch correlation, code layout,
instruction cache performance