Exploiting large amounts of instruction level parallelism requires extensive connectivity between the shared register file and the function units; this connectivity is expensive and increases the cycle time.
This paper shows that the new class of transport triggered architectures requires fewer ports on the shared register file than traditional operation triggered architectures. The reduction is achieved by programming data transports instead of operations.
Experiments with our extended basic block scheduler show that the reduction in the required number of register file ports is substantial. The average requirement for scalar applications is 0.50 read and 0.35 write ports per operation, instead of the 2 read ports and 1 write port that an operation triggered architecture needs. Due to this reduction, it is possible to execute 2 operations per cycle with a two-ported register file and 3.6 operations per cycle with a six-ported register file.
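The arithmetic behind these figures can be illustrated with a minimal sketch. It assumes that the average port demand scales linearly with the number of operations per cycle; the constant and function names are hypothetical and only the per-operation port counts come from the text above.

```python
# Per-operation register file port demand, as quoted in the abstract.
OTA_PORTS_PER_OP = 2 + 1        # operation triggered: 2 read + 1 write ports
TTA_PORTS_PER_OP = 0.50 + 0.35  # transport triggered: average read + write ports


def average_port_demand(ops_per_cycle, ports_per_op):
    """Average number of register file ports used per cycle.

    Assumption (not from the paper): demand grows linearly with issue rate.
    """
    return ops_per_cycle * ports_per_op


for ops in (2.0, 3.6):
    ota = average_port_demand(ops, OTA_PORTS_PER_OP)
    tta = average_port_demand(ops, TTA_PORTS_PER_OP)
    print(f"{ops} ops/cycle: OTA {ota:.2f} ports, TTA {tta:.2f} ports on average")
```

Under this assumption, 2 operations per cycle use about 1.7 ports on average and 3.6 operations per cycle about 3.06, against 6 and 10.8 ports for the operation triggered case. Both averages stay below the two-ported and six-ported register files mentioned above; the remaining headroom presumably absorbs cycles in which the demand is above average.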