- [Thor64]
-
J. E. Thornton,
"Parallel Operation in the Control Data 6600",
AFIPS Proceedings of the Spring Joint Computer Conference,
part II, vol. 26 (1964), pp. 33-40.
- [Moor65]
-
G. E. Moore,
"Cramming more components onto integrated circuits",
Electronics, pp. 114-117, April 1965.
- [Toma67]
-
R. M. Tomasulo,
"An Efficient Algorithm for Exploiting Multiple Arithmetic Units",
IBM Journal of Research and Development, Vol. 11 Issue 1 (January 1967), pp. 25-33.
- [AnST67]
-
D. W. Anderson, F. J. Sparacio, and R. M. Tomasulo,
"IBM System/360 Model 91: Machine Philosophy and Instruction-handling",
IBM Journal of Research and Development, Vol. 11 Issue 1 (January 1967), pp. 8-24.
- [Amda67]
-
G. M. Amdahl,
"Validity of the single-processor approach to achieving large scale computing
capabilities",
AFIPS Conference Proceedings,
April 1967, pp. 483-485.
- [Thor70]
-
J. E. Thornton,
"Design of a Computer: The Control Data 6600",
Glenview, IL: Scott Foresman, 1970.
- [CRAY77]
-
"CRAY-1 Computer System Hardware Reference Manual",
CRAY Research Incorporated,
Publication No. 224004, Rev. C, November 1977.
- [Russ78]
-
R. M. Russell,
"The CRAY-1 Computer System",
Communications of the ACM, vol. 21, no. 1 (January 1978), pp. 63-72.
- [Smit81]
-
J. E. Smith,
"A Study of Branch Prediction Strategies",
Proceedings of the 8th Annual International Symposium on Computer
Architecture,
May 1981, pp. 135-148.
- [Kolo81]
-
J. S. Kolodzey,
"The CRAY-1 Computer Technology",
IEEE Transactions on Components, Hybrids, and Manufacturing Technology,
vol. CHMT-4, no. 2 (June 1981), pp. 181-186.
- [EmCl84]
-
J. S. Emer and D. W. Clark,
"A Characterization of Processor Performance in the VAX-11/780",
Proceedings of the 11th Annual International Symposium on Computer
Architecture,
June 1984, pp. 301-310.
- [SmPl85]
-
J. E. Smith and A. R. Pleszkun,
"Implementing Precise Interrupts in Pipelined Processors",
Proceedings of the 12th Annual International Symposium on Computer
Architecture,
Boston, MA (June 1985), pp. 36-44.
- [PaHS85]
-
Y. N. Patt, W. M. Hwu, and M. Shebanow,
"HPS, a new microarchitecture: rationale and introduction",
Proceedings of the 18th Annual Workshop on Microprogramming,
Pacific Grove, CA (December 1985), pp. 103-108.
- [SoVa87]
-
G. S. Sohi and S. Vajapeyam,
"Instruction Issue Logic for High-Performance, Interruptable Pipelined Processors",
Proceedings of the 14th Annual International Symposium on Computer
Architecture,
Pittsburgh, PA (June 1987), pp. 27-34.
- [RaFi92]
-
B. R. Rau and J. A. Fisher,
"Instruction-Level Parallel Processing: History, Overview, and Perspective",
Hewlett-Packard Laboratories Tech Report HPL-92-132,
October 1992.
Instruction Sets
- [PaDi80]
-
D. A. Patterson and D. R. Ditzel,
"The Case for the Reduced Instruction Set Computer",
ACM SIGARCH Computer Architecture News,
Vol. 8 (October 1980), pp. 25-33.
- [Wulf81]
-
W. A. Wulf,
"Compilers and Computer Architecture",
IEEE Computer, Vol. 14, issue 7 (July 1981), pp. 41-47.
- [Radi82]
-
G. Radin,
"The 801 Minicomputer",
Proceedings of the 1st International Conference on Architectural Support for Programming Languages and Operating Systems,
Palo Alto, CA (March 1982), pp. 39-47.
- [CHJS86]
-
R. P. Colwell, C. Y. Hitchcock III, E. D. Jensen, H. M. B. Sprunt and C. P. Kollar,
"Instruction Sets and Beyond: Computers, Complexity, and Controversy",
IEEE Computer, Vol. 18, issue 9 (September 1985), pp. 8-19.
Decoupled Processing
- [Rich99]
-
Kevin Rich,
"Compiler Techniques for Evaluating and Extending Decoupled Architectures",
(Ph.D. Dissertation) University of California
at Davis, Davis, California (December 1999).
- [Tyso97]
-
Gary Tyson,
"Evaluation of a Scalable Decoupled Microprocessor Design",
(Ph.D. Dissertation)
University of California at Davis, Davis, California (August 1997).
- [TyFa94]
-
G. Tyson and M. Farrens,
"Code Scheduling for Multiple Instruction Stream Architectures",
International Journal of Parallel Programming, vol. 22, no. 3 (1994),
pp. 243-272.
- [TyFa93]
-
G. Tyson and M. Farrens,
"Techniques for Extracting Instruction Level Parallelism on MIMD
Architectures",
Proceedings of the 26th Annual International Symposium on
Microarchitecture, Austin, Texas
(December 1-3, 1993), pp. 128-137.
- [SmWP86]
-
J. E. Smith, S. Weiss and N. Y. Pang,
"A Simulation Study of Decoupled Architecture Computers",
IEEE Transactions on Computers, vol. C-35,
no. 8 (August 1986), pp. 692-702.
Methods
- [DeBK01]
-
Rajagopalan Desikan, Doug Burger, and Stephen Keckler,
"Measuring Experimental Error in Microprocessor Simulation",
Proceedings of the 28th Annual International Symposium on Computer
Architecture (ISCA01),
Goteborg, Sweden (July 1-4th, 2001), pp. 266-277.
- [OsCF00]
-
Mark Oskin, Frederic T. Chong, Matthew Farrens,
"HLS: Combining Statistical and Symbolic Simulation to Guide Microprocessor
Designs",
Proceedings of the 27th Annual International Symposium on Computer
Architecture (ISCA00),
Vancouver, Canada (June 10-14th, 2000), pp. 71-82.
Branch Prediction
- [YeP91]
-
T. Yeh and Y. Patt,
"Two-level adaptive training branch prediction",
Proceedings of the 24th Annual International Symposium on
Microarchitecture,
Albuquerque, New Mexico (November 18-20, 1991), pp. 51-61.
- [YeP92]
-
T. Yeh and Y. Patt,
"Alternative Implementations of Two-Level Adaptive Training Branch Prediction",
Proceedings of the Nineteenth Annual International Symposium on
Computer Architecture,
Queensland, Australia (May 19-21, 1992), pp. 124-134.
- [McFa93]
-
Scott McFarling,
"Combining branch predictors",
Digital Equipment Corporation WRL Technical Note TN-36, June 1993.
- [CHYP94]
-
P. Chang, E. Hao, T. Yeh and Y. Patt,
"Branch Classification: A New Mechanism for Improving Branch Predictor
Performance",
Proceedings of the 27th Annual International Symposium on Microarchitecture,
San Jose, CA (November 30-December 2, 1994), pp. 22-31.
- [YoGS95]
-
C. Young, N. Gloy and M. D. Smith,
"A Comparative Analysis of Schemes for Correlated Branch Prediction",
Proceedings of the 22nd Annual International Symposium on Computer Architecture,
Santa Margherita Ligure, Italy (June 22-24, 1995), pp. 276-286.
- [ChCM96]
-
I. K. Chen, J. T. Coffey and T. N. Mudge,
"Analysis of Branch Prediction via Data Compression",
Proceedings of the 7th International Conference on Architectural Support for Programming Languages and Operating Systems,
Cambridge, MA (October 1996), pp. 128-137.
- [EmGl97]
-
Joel Emer and Nikolas Gloy,
"A Language for Describing Predictors and its Application to Automatic Synthesis",
Proceedings of the 24th Annual International Symposium on Computer Architecture,
Denver, Colorado (June 2-4, 1997), pp. 304-314.
- [EPCP98]
-
M. Evers, S. J. Patel, R. S. Chappell and Y. N. Patt,
"An Analysis of Correlation and Predictability: What Makes Two-Level Branch
Predictors Work",
Proceedings of the 25th Annual International Symposium on Computer Architecture,
Barcelona, Spain (June 29-July 1, 1998), pp. 52-61.
- [JuSN98]
-
T. Juan, S. Sanjeevan and J. J. Navarro,
"Dynamic History-Length Fitting: A third level of adaptivity for branch
prediction",
Proceedings of the 25th Annual International Symposium on Computer Architecture,
Barcelona, Spain (June 29-July 1, 1998), pp. 155-166.
- [StEP98]
-
J. Stark, M. Evers and Y. N. Patt,
"Variable Length Path Branch Prediction",
Proceedings of the 8th International Conference on Architectural Support for
Programming Languages and Operating Systems,
San Jose, CA (October 3-7, 1998), pp. 170-179.
- [EdMu98]
-
A. N. Eden and T. Mudge,
"The YAGS Branch Prediction Scheme",
Proceedings of the 31st Annual International Symposium on Microarchitecture,
Dallas, Texas (November 30-December 2, 1998), pp. 69-77.
- [KiT98]
-
S. P. Kim and G. S. Tyson,
"Analyzing the Working Set Characteristics of Branch Execution",
Proceedings of the 31st Annual International Symposium on Microarchitecture,
Dallas, Texas (November 30-December 2, 1998), pp. 49-58.
- [HeSS99]
-
T. Heil, Z. Smith and J. E. Smith,
"Improving Branch Predictors by Correlating on Data Values",
Proceedings of the 32nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 32),
Haifa, Israel (November 16-18, 1999), pp. 28-37.
- [HaSF00]
-
Michael Haungs, Phil Sallee and Matthew Farrens,
"Branch Transition Rate: A New Metric for Improved Branch Classification
Analysis",
Proceedings of the 6th International Symposium on High-Performance Computer Architecture,
Toulouse, France (January 8-12, 2000), pp. 241-250.
- [SkMC00]
-
Kevin Skadron, Margaret Martonosi, and Douglas Clark,
"A Taxonomy of Branch Mispredictions, and Alloyed Prediction as a Robust
Solution to Wrong-History Mispredictions",
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques,
Philadelphia, PA (October 15-19, 2000), pp. 199-206.
- [ERSM01]
-
A. Eden, J. Ringenberg, S. Sparrow, and T. Mudge,
"Hybrid myths in branch prediction",
Proceedings of the 5th World Multiconference on Systemics, Cybernetics and
Informatics (SCI 2001)
and the 7th International Conference on Information Systems Analysis and Synthesis (ISAS 2001),
Orlando, FL, July 2001.
Confidence Predictors
- [JaRS96]
-
Erik Jacobsen, Eric Rotenberg, and James E. Smith,
"Assigning Confidence to Conditional Branch Predictions",
Proceedings of the 29th Annual International Symposium on Microarchitecture,
Paris, France (December 2-4, 1996), pp. 142-152.
- [GKMP98]
-
D. Grunwald, A. Klauser, S. Manne and A. Pleszkun,
"Confidence Estimation for Speculation Control",
Proceedings of the 25th Annual International Symposium on Computer
Architecture,
Barcelona, Spain (June 29-July 1, 1998), pp. 122-131.
- [AGGG01]
-
J. L. Aragon, J. Gonzalez, J. M. Garcia and A. Gonzalez,
"Selective Branch Prediction Reversal by Correlating with Data Values
and Control Flow",
Proceedings of the 19th IEEE International Conference on Computer Design,
Austin, Texas (September 24-26, 2001), pp. 228-233.
Advanced Caching Techniques
- [BuGK96]
-
D. Burger, J. R. Goodman and A. Kagi,
"Memory Bandwidth Limitations of Future Microprocessors",
Proceedings of the 23rd Annual International Symposium on Computer Architecture,
Philadelphia, PA (May 22-24, 1996), pp. 78-89.
- [TFMP97]
-
G. Tyson, M. Farrens, J. Matthews and A. Pleszkun,
"Managing Data Caches using Selective Cache Line Replacement",
International Journal of Parallel Programming,
vol. 25, no. 3 (June 1997), pp. 213-242.
- [KuWi98]
-
S. Kumar and C. Wilkerson,
"Exploiting Spatial Locality in Data Caches using Spatial Footprints",
Proceedings of the 25th Annual International Symposium on Computer Architecture,
Barcelona, Spain (June 29-July 1, 1998), pp. 357-368.
- [VTGN99]
-
A. Veidenbaum, W. Tang, R. Gupta, A. Nicolau, and X. Ji,
"Adapting Cache Line Size to Application Behavior",
Proceedings of the 13th ACM International Conference on Supercomputing,
Rhodes, Greece (June 20-25, 1999), pp. 145-154.
- [TRST99]
-
Edward S. Tam, Jude A. Rivers, Vijayalakshmi Srinivasan, Gary S. Tyson and Edward S. Davidson,
"Active Management of Data Caches by Exploiting Reuse Information",
IEEE Transactions on Computers,
vol. 48, no. 11 (November 1999), pp. 1244-1259.
- [HaRe00]
-
Erik G. Hallnor and Steven K. Reinhardt,
"A Fully Associative Software-Managed Cache Design",
Proceedings of the 27th Annual
International Symposium on Computer Architecture,
Vancouver, British Columbia (June 10-14, 2000), pp. 107-116.
- [JaMu01]
-
B. Jacob and T. Mudge,
"Uniprocessor virtual memory without TLBs",
IEEE Transactions on Computers,
vol. 50, no. 5, May 2001, pp. 482-499.
Instruction Fetch Issues
- [CMMP95]
-
T. M. Conte, K. N. Menezes, P. M. Mills and B. A. Patel,
"Optimization of Instruction Fetch Mechanisms for High Issue Rates",
Proceedings of the 22nd Annual International Symposium on Computer Architecture,
Santa Margherita Ligure, Italy (June 22-24, 1995), pp. 333-344.
- [RoBS96]
-
E. Rotenberg, S. Bennett and J. E. Smith,
"Trace Cache: a Low Latency Approach to High Bandwidth Instruction Fetching",
Proceedings of the 29th Annual International Symposium on Microarchitecture,
Paris, France (December 2-4, 1996), pp. 24-35.
- [PoTM99]
-
M. Postiff, G. Tyson and T. Mudge,
"Performance Limits of Trace Caches",
Journal of Instruction Level Parallelism, vol. 1, no. 5 (October 1999).
- [BlRS99]
-
B. Black, B. Rychlik and J. P. Shen,
"The Block-based Trace Cache",
Proceedings of the 26th Annual International Symposium on Computer Architecture,
Atlanta, GA (May 2-4, 1999), pp. 196-207.
- [Rein01]
-
G. Reinman,
"Hardware Optimizations Enabled by a Decoupled Fetch Architecture",
(Ph.D. Dissertation)
University of California at San Diego, San Diego, California (August 2001).
Interesting Ideas
- [Aust99]
-
Todd Austin,
"DIVA: A Reliable Substrate for Deep Submicron Microarchitecture Design",
Proceedings of the 32nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 32),
Haifa, Israel (November 16-18, 1999), pp. 196-207.
- [Aust00]
-
Todd Austin,
"DIVA: A Dynamic Approach to Microprocessor Verification",
Journal of Instruction Level Parallelism,
Vol. 2, no. 11 (May 2000).
- [EKDP03]
-
Dan Ernst, Nam Sung Kim, Shidhartha Das, Sanjay Pant, Rajeev Rao, Toan Pham,
Conrad Ziesler, David Blaauw, Todd Austin, Krisztian Flautner, Trevor Mudge,
"Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation",
Proceedings of the 36th Annual International Symposium on Microarchitecture,
San Diego, CA (Dec. 3-5, 2003), pp. 7-18.
Branch Elimination
- [MLCH92]
-
S. A. Mahlke, D. C. Lin, W. Y. Chen, R. E. Hank and R. A. Bringmann,
"Effective Compiler Support for Predicated Execution Using the Hyperblock",
Proceedings of the 25th Annual International Symposium on Microarchitecture,
Portland, Oregon (December 1-4, 1992), pp. 45-54.
- [HMC93]
-
W. W. Hwu, S. A. Mahlke, W. Y. Chen, P. P. Chang, N. J. Warter, R. A. Bringmann,
R. G. Ouellette, R. E. Hank, T. Kiyohara, G. E. Haab, J. G. Holm and D. M. Lavery,
"The Superblock: An Effective Technique for VLIW and Superscalar Compilation",
Journal of Supercomputing, vol. 7, no. 1/2 (1993), pp. 229-248.
- [MuWh95]
-
F. Mueller and D. B. Whalley,
"Avoiding Conditional Branches by Code Replication",
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation,
La Jolla, CA (June 18-21, 1995), pp. 56-66.
- [BoGS97]
-
R. Bodik, R. Gupta and M. L. Soffa,
"Interprocedural Conditional Branch Elimination",
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation,
Las Vegas, Nevada (June 15-18, 1997), pp. 146-158.
- [YaUW98]
-
M. Yang, G. Uh and D. B. Whalley,
"Improving Performance by Branch Reordering",
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation,
Montreal, Canada (June 17-19, 1998), pp. 130-141.
- [ASPM99]
-
D. I. August, J. W. Sias, J. Puiatti, S. A. Mahlke, D. A. Conners, K. M. Crozier and W. W. Hwu,
"The Program Decision Logic Approach to Predicated Execution",
Proceedings of the 26th Annual International Symposium on Computer Architecture,
Atlanta, GA (May 2-4, 1999), pp. 208-219.
Prefetching
- [Joup90b]
-
N. Jouppi,
"Improving Direct-Mapped Cache Performance by the Addition of a Small
Fully-Associative Cache and Prefetch Buffers",
Proceedings of the Seventeenth Annual International Symposium on Computer Architecture,
vol. 18, no. 2 (May 1990), pp. 364-373.
- [Joup90a]
-
N. Jouppi,
"Reducing Compulsory and Capacity Misses",
Digital Western Research Laboratory Technical Note TN-53 (August 1990).
- [ChBa95]
-
T. Chen and J. Baer,
"Effective Hardware Based Data Prefetching for High-Performance Processors",
IEEE Transactions on Computers, vol. 44, no. 5 (May 1995), pp. 609-623.
- [Eben98]
-
A. Ebenezer,
"Hardware Based Prefetching Methods",
(Master's Thesis)
Department of Electrical and Computer Engineering,
University of California-Davis, Davis, California (December 1998).
- [Joup98]
-
N. Jouppi,
"Retrospective: Improving Direct-Mapped Cache Performance by the Addition of a
Small Fully-Associative Cache and Prefetch Buffers",
25 Years of the International Symposium on Computer Architecture - Selected Papers (1998),
pp. 71-73.
- [JoGr99]
-
D. Joseph and D. Grunwald,
"Prefetching Using Markov Predictors",
IEEE Transactions on Computers, vol. 48, no. 2 (February 1999), pp. 121-133.
Value Prediction