Unrolling-Based Optimizations for Software Pipelining
Daniel M. Lavery, Wen-mei W. Hwu
Abstract
Modulo scheduling is a method for overlapping successive iterations of
a loop in order to find sufficient instruction-level parallelism to
fully utilize high-issue-rate processors. The achieved throughput of a
software pipeline generated by modulo scheduling depends on the
resource requirements, the dependence pattern, and the register
requirements of the computation in the loop body. Traditionally,
unrolling followed by acyclic scheduling of the unrolled body has been
an alternative to modulo scheduling. However, there are benefits to
unrolling even if the loop is to be modulo scheduled. This paper
describes unrolling-based optimizations that reduce the resource
requirements of the loop and reduce the height of the critical path.
The resource reductions described can only be achieved by unrolling.
The performance benefits of these optimizations for five SPEC-CFP92
programs are reported. In addition, some ideas for controlling the
optimizations to balance the constraints on the throughput are
discussed.
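
As a rough illustration of why unrolling can lower the resource-constrained bound on throughput, the sketch below computes a simplified resource-constrained minimum initiation interval. It is not the paper's algorithm. It assumes a machine with identical function units, where every operation occupies one unit for one cycle, and it assumes that a fixed number of loop-overhead operations (for example an induction-variable update and the loop-back branch) need not be replicated when the body is unrolled. The names res_mii, per_iteration_ii, body_ops, overhead_ops, units, and unroll are hypothetical.

```python
from fractions import Fraction


def res_mii(ops: int, units: int) -> int:
    """Simplified resource-constrained lower bound on the initiation interval:
    the number of cycles needed to issue `ops` operations on `units`
    identical function units (assumption: one unit-cycle per operation)."""
    return -(-ops // units)  # integer ceiling of ops / units


def per_iteration_ii(body_ops: int, overhead_ops: int, units: int, unroll: int) -> Fraction:
    """Bound on cycles per ORIGINAL iteration after unrolling `unroll` times.

    Assumption: the body operations are replicated `unroll` times, while the
    loop-overhead operations appear only once in the unrolled loop.
    """
    total_ops = unroll * body_ops + overhead_ops
    return Fraction(res_mii(total_ops, units), unroll)


if __name__ == "__main__":
    # Hypothetical loop: 4 body operations, 2 overhead operations, 4 units.
    for u in (1, 2, 4):
        print(u, per_iteration_ii(4, 2, 4, u))
```

Under these assumptions two effects combine: the overhead operations are amortized over several original iterations, and the integer rounding of the initiation interval is spread across the unrolled copies, so the bound per original iteration can become fractional.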
Keywords
modulo scheduling, software pipelining, optimization,
loop unrolling, instruction-level parallelism