Improving Instruction-Level Parallelism by Loop Unrolling and Dynamic Memory Disambiguation
Jack W. Davidson, Sanjay Jinturkar
Exploitation of instruction-level parallelism is an effective mechanism
for improving the performance of modern super-scalar/VLIW processors.
Various software techniques can be applied to increase
instruction-level parallelism. This paper describes and evaluates a
software technique, dynamic memory disambiguation, that permits loops
containing stores to be scheduled more aggressively, thereby exposing
more instruction-level parallelism. We have implemented this technique
in a production-quality compiler system, vpcc-vpo. The results of our
evaluation show that when dynamic memory disambiguation is applied in
conjunction with loop unrolling, register renaming and static memory
disambiguation, the ILP of memory-intensive benchmarks can be increased
by as much as 300 percent over loops where only loop unrolling,
register renaming and static memory disambiguation have been performed.
Like other optimizations, loop unrolling, register renaming, and
dynamic memory disambiguation use register resources. Our measurements
indicate that for programs that benefit the most from these
optimizations, the register usage does not exceed the number of
registers found on most high-performance processors.
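
To make the idea concrete, the following is a minimal source-level sketch in C of what dynamic memory disambiguation amounts to. It is not the vpcc-vpo implementation, which is not shown here; the function names (scale, scale_safe, scale_unrolled, ranges_overlap) and the choice of a simple scaling loop are assumptions made for illustration. A run-time check decides whether the stored-to and loaded-from regions overlap. If they do not, an unrolled copy of the loop runs in which all loads are hoisted above the stores, giving a scheduler more independent instructions to work with. If they do, the original ordering is preserved.

```c
#include <stddef.h>
#include <stdint.h>

/* Safe copy of the loop: each load and store stays in program order,
 * so the result is correct even when dst and src overlap. */
static void scale_safe(double *dst, const double *src, size_t n, double k)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}

/* Aggressive copy: unrolled by 4, with the four loads moved above the
 * four stores. The temporaries t0..t3 stand in for renamed registers.
 * Only valid when the two regions are known not to overlap. */
static void scale_unrolled(double *dst, const double *src, size_t n, double k)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        double t0 = src[i];
        double t1 = src[i + 1];
        double t2 = src[i + 2];
        double t3 = src[i + 3];
        dst[i]     = t0 * k;
        dst[i + 1] = t1 * k;
        dst[i + 2] = t2 * k;
        dst[i + 3] = t3 * k;
    }
    for (; i < n; i++)          /* leftover iterations */
        dst[i] = src[i] * k;
}

/* Run-time disambiguation test: do [a, a+bytes) and [b, b+bytes) intersect?
 * Comparing addresses through uintptr_t assumes a flat address space. */
static int ranges_overlap(const void *a, const void *b, size_t bytes)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    return pa < pb + bytes && pb < pa + bytes;
}

/* Dispatch on the check. Returns 1 if the aggressive copy ran, 0 otherwise
 * (the return value exists only to make the chosen path observable). */
int scale(double *dst, const double *src, size_t n, double k)
{
    if (ranges_overlap(dst, src, n * sizeof *dst)) {
        scale_safe(dst, src, n, k);
        return 0;
    }
    scale_unrolled(dst, src, n, k);
    return 1;
}
```

The cost of this scheme is the extra compare-and-branch before the loop and the duplicated loop body. The temporaries in the unrolled copy also illustrate why the technique consumes additional registers.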
Instruction-level parallelism, Loop unrolling, Register
renaming, Dynamic memory disambiguation, VLIW