A Limit Study of Memory Requirements Using Value Reuse Profiles

Andrew S. Huang, John P. Shen

Abstract

In this paper, we introduce the concept of a perfect memory system: an omniscient and autonomous memory system that allows programs to execute at full speed without being slowed by memory accesses. By measuring the speedup of programs executing on the perfect memory system, we obtain an upper bound on the performance gain that can be achieved by improving current memory system designs and compiler storage allocation. We report speedups of 15% to 102% for a group of benchmarks running on a DEC Alpha 21064 with a perfect memory system. We also introduce the Value Reuse Profile (VRP), a method for collecting the dynamic value reuse characteristics of programs. The VRP provides a way to measure the minimum amount of local memory that a program needs to achieve zero-cycle effective memory latency on a perfect memory system. Our results show that most of the benchmarks require less local memory, and slightly more bandwidth, than the 21064 already provides.

Keywords

Memory hierarchy, Locality, Value reuse, Memory bandwidth, Superscalar processors
