We have developed simple compiler heuristics to identify load
instructions that are likely to cause a cache miss. Experiments on a
set of benchmarks show that the heuristics identify 85% of the cache
misses. Using these heuristics, we have
also developed an instruction scheduling algorithm to hide memory
latency by preloading. Simulation on a set of SPEC92 benchmarks
suggests that the technique hides memory latency and improves
overall performance. We also show the effect of
preloading on several CPU configurations, in particular when varying
the number of instructions issued and when adding a branch target
buffer (BTB).