Improving Instruction-Level Parallelism by Loop Unrolling and Dynamic Memory Disambiguation

Jack W. Davidson, Sanjay Jinturkar

Abstract

Exploitation of instruction-level parallelism is an effective mechanism for improving the performance of modern super-scalar/VLIW processors. Various software techniques can be applied to increase instruction-level parallelism. This paper describes and evaluates a software technique, dynamic memory disambiguation, that permits loops containing stores to be scheduled more aggressively, thereby exposing more instruction-level parallelism. We have implemented this technique in a production-quality compiler system, vpcc-vpo. The results of our evaluation show that when dynamic memory disambiguation is applied in conjunction with loop unrolling, register renaming, and static memory disambiguation, the ILP of memory-intensive benchmarks can be increased by as much as 300 percent over loops where only loop unrolling, register renaming, and static memory disambiguation have been performed. Like other optimizations, loop unrolling, register renaming, and dynamic memory disambiguation use register resources. Our measurements indicate that for programs that benefit the most from these optimizations, the register usage does not exceed the number of registers found on most high-performance processors.
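The idea behind dynamic memory disambiguation can be illustrated at the source level. The sketch below is not the vpcc-vpo implementation, which operates inside the compiler; it is a hand-written C analogue under stated assumptions. The names `ranges_disjoint`, `add_one_safe`, `add_one_fast`, and `add_one` are hypothetical, as is the unroll factor of 4. A run-time address check selects between a copy of the loop in which loads are moved above earlier stores and a copy that preserves the original memory order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero when the n-element ranges starting at
 * p and q share no bytes. Addresses are compared as integers. */
static int ranges_disjoint(const double *p, const double *q, size_t n)
{
    uintptr_t ps = (uintptr_t)p;
    uintptr_t qs = (uintptr_t)q;
    uintptr_t len = (uintptr_t)(n * sizeof(double));
    return ps + len <= qs || qs + len <= ps;
}

/* Original loop: each load stays behind the preceding store, so the
 * result is correct even when dst and src overlap. */
static void add_one_safe(double *dst, const double *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] + 1.0;
}

/* Unrolled by 4 (an assumed factor) with all four loads issued before
 * any store. This reordering is valid only when dst and src do not
 * overlap. */
static void add_one_fast(double *dst, const double *src, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        double t0 = src[i];
        double t1 = src[i + 1];
        double t2 = src[i + 2];
        double t3 = src[i + 3];
        dst[i]     = t0 + 1.0;
        dst[i + 1] = t1 + 1.0;
        dst[i + 2] = t2 + 1.0;
        dst[i + 3] = t3 + 1.0;
    }
    for (; i < n; i++)          /* leftover iterations */
        dst[i] = src[i] + 1.0;
}

/* Run-time disambiguation: pick the aggressive copy only when the
 * address check proves the two ranges cannot alias. */
void add_one(double *dst, const double *src, size_t n)
{
    if (ranges_disjoint(dst, src, n))
        add_one_fast(dst, src, n);
    else
        add_one_safe(dst, src, n);
}
```

Calling `add_one(buf + 1, buf, 7)` on an overlapping buffer falls back to the safe copy, while calls on two separate arrays take the unrolled copy. The check costs a few integer operations per loop entry, which is the price paid for removing the store-to-load ordering constraints from the loop body.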

Keywords

Instruction-level parallelism, Loop unrolling, Register renaming, Dynamic memory disambiguation, VLIW
