Decoupling Integer Computation in Superscalar Processors
Subbarao Palacharla, J.E. Smith
Abstract
Current processor microarchitectures comprise a fetch unit that feeds
instructions to integer and floating point subsystems, each containing
some number of computation units. While this division into integer and
floating point units eases implementation and permits integer and
floating point operations to execute in parallel in scientific codes,
it leaves the floating point units idle while the processor is
executing integer-intensive code. In this paper, we study the
fundamental division of programs into branching, addressing, and
computation functions and use the resulting data to suggest an
alternative microarchitecture that better utilizes the floating point
units while the processor is executing integer code. If the floating
point computation units are extended to perform simple integer
instructions, a significant number of instructions can be naturally
executed in the augmented floating point subsystem. The set of
instructions executing in the integer subsystem consists of all
load/store instructions and the instructions that are involved in
computing the corresponding address registers. The set of instructions
executing in the augmented floating point subsystem consists of
instructions not involved in computing addresses. Branch instructions
and instructions that contribute to branch outcomes are split between
the two units, depending on whether they use register values from the
addressing operations.
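The partitioning rule described above can be illustrated with a small
sketch. This is not the analysis tool used in the paper; it is a minimal
illustration under several assumptions: the code is a single
straight-line block, the Instr record, the opcode names ("ld", "st",
"br") and the labels "INT" and "FPA" (augmented floating point) are
hypothetical, a loaded value is treated as residing in the integer
subsystem, and a branch is sent to the integer subsystem if any of its
source registers was produced there. A backward pass collects the
instructions that feed address registers; a forward pass then places
the branches.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Instr:
    # Hypothetical instruction record, not the paper's representation.
    op: str                      # e.g. "add", "ld", "st", "br"
    dest: Optional[str] = None   # register written, if any
    srcs: tuple = ()             # registers read as data operands
    addr: tuple = ()             # registers read to form a memory address

MEM_OPS = {"ld", "st"}
BRANCH_OPS = {"br"}

def partition(block):
    """Label each instruction of a straight-line block "INT" or "FPA"."""
    labels = [None] * len(block)
    needed = set()  # registers whose values feed a later memory address

    # Backward pass: loads/stores and everything in their address slice
    # go to the integer subsystem; the rest goes to the augmented FP side.
    for i in range(len(block) - 1, -1, -1):
        ins = block[i]
        if ins.op in BRANCH_OPS:
            continue  # branches are placed in the forward pass
        in_slice = ins.op in MEM_OPS or ins.dest in needed
        if ins.dest is not None:
            needed.discard(ins.dest)  # this instruction defines dest
        if ins.op in MEM_OPS:
            labels[i] = "INT"
            needed.update(ins.addr)   # only address operands join the slice
        elif in_slice:
            labels[i] = "INT"
            needed.update(ins.srcs)
        else:
            labels[i] = "FPA"

    # Forward pass: a branch follows the subsystem that produced its
    # operands (assumption: any INT-produced operand pulls it to INT).
    producer = {}
    for i, ins in enumerate(block):
        if ins.op in BRANCH_OPS:
            uses_int = any(producer.get(r) == "INT" for r in ins.srcs)
            labels[i] = "INT" if uses_int else "FPA"
        elif ins.dest is not None:
            producer[ins.dest] = labels[i]
    return labels

example = [
    Instr("add", "r1", ("r1", "r2")),      # updates an address register
    Instr("ld", "r3", (), ("r1",)),        # load through r1
    Instr("add", "r4", ("r3", "r5")),      # pure computation
    Instr("sub", "r6", ("r4", "r7")),      # feeds the branch only
    Instr("br", None, ("r6",)),            # branch on a computed value
    Instr("st", None, ("r4",), ("r1",)),   # store r4 through r1
]
print(partition(example))
```

In this example only the address update, the load, and the store are
labelled "INT"; the two arithmetic instructions and the branch that
depends on them are labelled "FPA". The sketch does not move the
instructions that contribute to a branch outcome: they stay wherever the
address slice places them, and the branch follows its operands.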
The analysis of benchmark programs compiled for the SPARC architecture
indicates that such a division is indeed worthwhile. Between 10% and
39% of the instructions in our integer benchmarks can be executed in
the augmented floating point units. Furthermore, these instructions are
all simple add, subtract and logical instructions.
Keywords
processor microarchitecture, decoupling, program
analysis, slicing, superscalar processors.