Unrolling-Based Optimizations for Software Pipelining

Daniel M. Lavery, Wen-mei W. Hwu


Modulo scheduling is a method for overlapping successive iterations of a loop in order to find sufficient instruction-level parallelism to fully utilize high-issue-rate processors. The achieved throughput of a software pipeline generated by modulo scheduling depends on the resource requirements, the dependence pattern, and the register requirements of the computation in the loop body. Traditionally, unrolling followed by acyclic scheduling of the unrolled body has been an alternative to modulo scheduling. However, there are benefits to unrolling even if the loop is to be modulo scheduled. This paper describes unrolling-based optimizations that reduce the resource requirements of the loop and reduce the height of the critical path. The resource reductions described can only be achieved by unrolling. The performance benefits of these optimizations are reported for five SPEC-CFP92 programs. In addition, some ideas for controlling the optimizations to balance the constraints on the throughput are discussed.
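As a rough illustration of how unrolling can lower the resource constraint on throughput, the sketch below computes a resource-constrained lower bound on the initiation interval, normalized per original iteration. It is a simplified model and not the method of the paper: it assumes a single class of issue slots, and it assumes that the per-iteration overhead operations (for example an induction-variable update and the loop-back branch) are paid only once per unrolled body. The function names and the operation counts are hypothetical.

```python
def res_mii(num_ops, issue_width):
    """Resource-constrained lower bound on the initiation interval, in cycles.

    Assumes one uniform pool of issue slots: ceil(num_ops / issue_width).
    """
    return -(-num_ops // issue_width)  # integer ceiling division


def cycles_per_original_iteration(body_ops, overhead_ops, unroll, issue_width):
    """Lower bound on cycles per original iteration after unrolling.

    Assumption (illustrative only): body_ops are replicated in every copy,
    while overhead_ops are needed once per unrolled loop body.
    """
    total_ops = body_ops * unroll + overhead_ops
    return res_mii(total_ops, issue_width) / unroll


if __name__ == "__main__":
    # Hypothetical loop: 3 useful ops plus 2 overhead ops, 4-issue machine.
    for factor in (1, 2, 4):
        bound = cycles_per_original_iteration(3, 2, factor, 4)
        print(f"unroll x{factor}: at least {bound} cycles per original iteration")
```

In this toy setting, unrolling by two halves the bound, partly because the overhead operations are shared and partly because the rounding loss of a non-integer bound is amortized over more iterations. The dependence and register constraints named in the abstract are not modeled here.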


modulo scheduling, software pipelining, optimization, loop unrolling, instruction-level parallelism
