Exploiting Short-Lived Variables in Superscalar Processors

Luis A. Lozano C., Guang R. Gao

Abstract

Superscalar processors may employ out-of-order instruction execution (dynamic scheduling) and aggressive register renaming to exploit instruction-level parallelism. To support these features, they use complex hardware mechanisms such as reorder buffers, which serve as a "parking lot" for instructions that have already passed the decode stage but cannot be executed because of data dependencies or resource constraints.

In this paper, we present experimental evidence showing that a significant number of program variables are short-lived, in the sense that their live ranges fall entirely within the reorder buffer. The values of these short-lived variables therefore do not need to be written back (committed) to the register file. On the basis of this observation, we propose a scheme that combines a compiler analysis, which we call short-live-range analysis, with a simple architecture extension to avoid the useless commits of the values generated for these short-lived variables. Moreover, we propose an extension to the existing register allocation mechanism so that these short-lived variables are not assigned to locations in the register file. Instead, they are confined to locations in the reorder buffer using the features provided by the register renaming mechanism. This decreases register pressure and improves the performance of the generated code by reducing the amount of spill code required.
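The notion of a short-lived variable can be illustrated with a minimal sketch. It is not the authors' short-live-range analysis, whose details the abstract does not give. It assumes a single basic block, a hypothetical `Instr` record, a known set of names live at block exit, and a deliberately crude criterion: a value counts as short-lived if it is dead at block exit and its last use lies fewer than `rob_size` instructions after its definition. Static instruction distance is only a proxy for reorder buffer occupancy, which in a real processor depends on dynamic fetch and commit behavior.

```python
from typing import List, NamedTuple, Optional, Set, Tuple


class Instr(NamedTuple):
    """Hypothetical three-address instruction: dest = op(srcs)."""
    dest: Optional[str]
    srcs: Tuple[str, ...]


def short_lived_defs(block: List[Instr], live_out: Set[str], rob_size: int) -> List[int]:
    """Return indices of instructions whose result is short-lived.

    Assumed criterion (illustrative only): the value does not escape the
    block, and the distance from its definition to its last use is smaller
    than rob_size, the assumed number of reorder buffer entries.
    """
    result = []
    for i, ins in enumerate(block):
        if ins.dest is None:
            continue
        last_use = i
        redefined = False
        for j in range(i + 1, len(block)):
            # Check uses first so that "a = a + 1" counts as a use of the old a.
            if ins.dest in block[j].srcs:
                last_use = j
            if block[j].dest == ins.dest:
                redefined = True
                break
        # Without a later redefinition, a live-out name carries this value out.
        escapes = (not redefined) and ins.dest in live_out
        if not escapes and last_use - i < rob_size:
            result.append(i)
    return result


# Example block: only z is assumed to be live at block exit.
example_block = [
    Instr("t1", ("a", "b")),   # 0: t1 = a + b
    Instr("t2", ("t1", "c")),  # 1: t2 = t1 * c
    Instr("x", ("t2",)),       # 2: x = t2
    Instr("y", ("a",)),        # 3: y = a
    Instr("z", ("x", "y")),    # 4: z = x + y
]
short_with_small_rob = short_lived_defs(example_block, {"z"}, rob_size=2)
short_with_large_rob = short_lived_defs(example_block, {"z"}, rob_size=4)
```

Under these assumptions, results flagged by the function are candidates for skipping the commit to the register file and for being kept out of the register allocator's working set. In the example, `x` qualifies only with the larger buffer, because its last use is two instructions after its definition.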

We have implemented this scheme using the McCAT testbed, which contains the McCAT C compiler and the SuperDLX superscalar simulator. Our simulation results show that: (1) the short-live-range analysis and the proposed architecture feature can be successfully used to avoid the useless commit of instructions to the register files; (2) this mechanism can reduce the number of write ports to the register files without affecting performance; (3) the allocation of short-lived variables to locations in the reorder buffer can significantly reduce the introduction of spill code and improve overall performance.

Keywords

Superscalar architectures, register allocation, short-lived variables, register renaming, reorder buffer.
