Theoretical Modeling of Superscalar Processor Performance
Derek B. Noonburg, John P. Shen
derekn@ece.cmu.edu, shen@ece.cmu.edu
Abstract
Trace-driven simulation is widely used to determine superscalar processor
performance, but it has shortcomings. Modern benchmarks
generate extremely long traces, which cause data storage problems and
very long simulation run times. More fundamentally, simulation
generally does not provide significant insight into the factors that determine
performance or a characterization of their interactions. This paper proposes a
theoretical model of superscalar processor performance that addresses these
shortcomings. Performance is viewed as an interaction of program parallelism
and machine parallelism. Both program and machine parallelisms are decomposed
into multiple component functions. Methods for measuring or computing these
functions are described. The functions are combined to provide a model of the
interaction between program and machine parallelisms and an accurate estimate
of the performance. The computed performance, based on this model, is compared
to simulated performance for six benchmarks from the SPEC92 suite on several
configurations of the IBM RS/6000 instruction set architecture.