A Limit Study of Memory Requirements Using Value Reuse Profiles
Andrew S. Huang, John P. Shen
Abstract
In this paper, we introduce our concept of a perfect memory system.
The perfect memory system is an omniscient and autonomous memory system
that allows programs to execute at full speed without stalling on
memory accesses. By measuring the speedup of executing programs
on the perfect memory system, we obtain the maximum performance gain
that can ever be achieved by improving current memory system designs
and compiler storage allocation. We report speedups of 15% to 102% for
a group of benchmarks running on a DEC Alpha 21064 with a perfect
memory system. We also introduce the Value Reuse Profile (VRP), a
method for collecting the dynamic value reuse characteristics of
programs. The VRP provides a way to measure the minimum amount of local
memory that is needed by a program to achieve zero-cycle effective
memory latency on a perfect memory system. Our results show that most
of the benchmarks require less local memory and slightly more bandwidth
than the 21064 already provides.
Keywords
Memory hierarchy, Locality, Value reuse, Memory bandwidth,
Superscalar processors