Decoupling Integer Computation in Superscalar Processors

Subbarao Palacharla, J.E. Smith


Current processor microarchitectures comprise a fetch unit that feeds instructions to integer and floating point subsystems, each containing some number of computation units. While this division into integer and floating point units eases implementation and permits integer and floating point operations to execute in parallel in scientific codes, it leaves the floating point units idle while the processor is executing integer-intensive code. In this paper, we study the fundamental division of programs into branching, addressing, and computation functions and use the resulting data to suggest an alternative microarchitecture that better utilizes the floating point units while the processor is executing integer code. If the floating point computation units are extended to perform simple integer instructions, a significant number of instructions can be naturally executed in the augmented floating point subsystem. The integer subsystem executes all load/store instructions and the instructions involved in computing the corresponding address registers. The augmented floating point subsystem executes the instructions not involved in computing addresses. Branch instructions and instructions that contribute to branch outcomes are split between the two subsystems, depending on whether they use register values from the addressing operations.

The analysis of benchmark programs compiled for the SPARC architecture indicates that such a division is indeed worthwhile. Between 10% and 39% of the instructions in our integer benchmarks can be executed in the augmented floating point units. Furthermore, these instructions are all simple add, subtract and logical instructions.
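
The partitioning described in the abstract can be illustrated with a minimal sketch. It is not the analysis tool used in the paper: it assumes a single straight-line block (no control-flow merges or loop-carried dependences), models the condition codes as an ordinary register named cc, and treats a branch-related instruction as belonging to the integer subsystem when any of its sources was last written there. The names Instr and partition, and the sample instructions, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    op: str                        # mnemonic, for display only
    kind: str                      # "mem", "alu", or "branch"
    dest: Optional[str] = None     # register written, if any
    srcs: Tuple[str, ...] = ()     # registers read as data operands
    addr: Tuple[str, ...] = ()     # registers read to form a memory address


def partition(block: List[Instr]) -> List[str]:
    """Assign each instruction to "int" or "fp" (assumed straight-line code)."""
    n = len(block)
    in_addr = [False] * n          # load/store, or feeds an address register
    in_br = [False] * n            # branch, or feeds a branch outcome
    addr_needed, br_needed = set(), set()

    # Backward pass: grow the address slice and the branch slice.
    for i in range(n - 1, -1, -1):
        ins = block[i]
        in_addr[i] = ins.kind == "mem" or ins.dest in addr_needed
        in_br[i] = ins.kind == "branch" or ins.dest in br_needed
        # This definition satisfies any pending use of its destination.
        addr_needed.discard(ins.dest)
        br_needed.discard(ins.dest)
        if in_addr[i]:
            # A memory op contributes only its address registers; store data
            # and load results are values, not addresses.
            addr_needed.update(ins.addr if ins.kind == "mem" else ins.srcs)
        if in_br[i]:
            br_needed.update(ins.srcs + ins.addr)

    # Forward pass: place branch-slice instructions by where their inputs live.
    unit: List[str] = []
    producer: Dict[str, str] = {}
    for i, ins in enumerate(block):
        if in_addr[i]:
            u = "int"
        elif in_br[i] and any(producer.get(r) == "int" for r in ins.srcs):
            u = "int"
        else:
            u = "fp"
        unit.append(u)
        if ins.dest is not None:
            producer[ins.dest] = u
    return unit


# Hypothetical block: sum += a[++i]; t = sum ^ mask; branch if i < limit.
block = [
    Instr("add", "alu", dest="r1", srcs=("r1",)),         # index update
    Instr("sll", "alu", dest="r2", srcs=("r1",)),         # scale index
    Instr("ld", "mem", dest="r3", addr=("r4", "r2")),     # load a[i]
    Instr("add", "alu", dest="r5", srcs=("r5", "r3")),    # accumulate
    Instr("xor", "alu", dest="r7", srcs=("r5", "r8")),    # pure computation
    Instr("cmp", "alu", dest="cc", srcs=("r1", "r6")),    # compare index
    Instr("bl", "branch", srcs=("cc",)),                  # loop branch
]

for ins, u in zip(block, partition(block)):
    print(f"{ins.op:4s} -> {u}")
```

In this sketch the index update, the shift, and the load fall in the address slice, the compare and branch follow them into the integer subsystem because they read the index register, and the accumulate and xor are left for the augmented floating point subsystem. A real analysis would have to follow dependences across basic blocks and loop iterations, which this sketch does not attempt.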


Keywords: processor microarchitecture, decoupling, program analysis, slicing, superscalar processors.
