Improving CISC Instruction Decoding Performance Using a Fill Unit
Mark Smotherman, Manoj Franklin
Abstract
Current superscalar processors require substantial instruction fetch
and decode bandwidth to keep multiple functional units utilized. A
hardware assist, called a fill unit, can collect microoperations into a
decoded instruction cache, so that later fetches of the same code can
bypass the decoding logic. This approach is investigated using the x86
architecture, and a speedup of approximately a factor of two over a
P6-like decoding structure is seen for the three SPEC benchmarks
investigated. This design is accompanied by a microengine-register
allocation scheme that prevents the increased supply of microoperations
from placing excessive demands on the register renaming hardware.
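
The bypass idea described above can be illustrated with a minimal sketch. It is not the authors' design: the class name DecodedCacheFrontEnd, the toy_decode function, the per-address cache granularity, and the unbounded cache size are all assumptions made for illustration. The sketch only shows that once microoperations for an address have been stored, a repeated fetch of that address no longer uses the decoder.

```python
from typing import Callable, Dict, List, Tuple


class DecodedCacheFrontEnd:
    """Toy front end: a decoder plus a decoded instruction cache.

    Hypothetical model. The cache is keyed by fetch address and is
    unbounded, which a real hardware structure would not be.
    """

    def __init__(self, decode: Callable[[int], List[str]]) -> None:
        self.decode = decode                      # slow path: the decoding logic
        self.decoded_cache: Dict[int, Tuple[str, ...]] = {}
        self.decoder_uses = 0                     # fetches that needed decoding
        self.cache_hits = 0                       # fetches that bypassed decoding

    def fetch(self, address: int) -> Tuple[str, ...]:
        """Return the microoperations for the instruction at `address`."""
        uops = self.decoded_cache.get(address)
        if uops is not None:
            # Hit: the decoding logic is bypassed entirely.
            self.cache_hits += 1
            return uops
        # Miss: decode, then let the "fill unit" record the result.
        self.decoder_uses += 1
        uops = tuple(self.decode(address))
        self.decoded_cache[address] = uops
        return uops


def toy_decode(address: int) -> List[str]:
    """Stand-in decoder (assumption): every instruction yields two uops."""
    return [f"uop@{address:#x}.{i}" for i in range(2)]


if __name__ == "__main__":
    front_end = DecodedCacheFrontEnd(toy_decode)
    # A small loop body fetched three times: addresses 0x10 and 0x14.
    for _ in range(3):
        front_end.fetch(0x10)
        front_end.fetch(0x14)
    print("decoder uses:", front_end.decoder_uses)
    print("cache hits:", front_end.cache_hits)
```

In this toy model the decoder is used only on the first pass through the loop; how much of that saving turns into speedup on real code depends on factors the sketch does not model, such as cache capacity and line organization.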
Keywords
CISC, instruction decoding, fill unit, dynamic execution, register renaming
Talk
Overheads (140394 bytes)