Reading List for ECS 201B

Decoupled Processing


[Rich99]
        Kevin Rich, "Compiler Techniques for Evaluating and Extending Decoupled Architectures",
        (Ph.D. Dissertation) University of California at Davis, Davis, California (December 1999).

[Tyso97]
        Gary Tyson, "Evaluation of a Scalable Decoupled Microprocessor Design",
        (Ph.D. Dissertation) University of California at Davis, Davis, California (August 1997).

[TyFa94]
        G. Tyson and M. Farrens, "Code Scheduling for Multiple Instruction
        Stream Architectures", International Journal of Parallel Programming,
        vol. 22, no. 3 (1994), pp. 243-272.

[TyFa93]
        G. Tyson and M. Farrens, "Techniques for Extracting Instruction Level
        Parallelism on MIMD Architectures", Proceedings of the 26th Annual
        International Symposium on Microarchitecture, Austin, Texas (December
        1-3, 1993), pp. 128-137.

[SmWP86]
        J. E. Smith, S. Weiss and N. Y. Pang, "A Simulation Study of Decoupled
        Architecture Computers", IEEE Transactions on Computers,  vol. C-35,
        no. 8  (August 1986), pp. 692-702.
 

Famous Machines

[Thor64]
        J. E. Thornton, "Parallel Operation in the Control Data 6600", AFIPS
        Proceedings of the Spring Joint Computer Conference, part II,  vol. 26
        (1964), pp. 33-40.

[Toma67]
        R. M. Tomasulo, "An Efficient Algorithm for Exploiting Multiple
        Arithmetic Units", IBM Journal,  vol. 11 (January 1967), pp. 25-33.

[Russ78]
        R. M. Russell, "The CRAY-1 Computer System", Communications of the
        ACM,  vol. 21, no. 1  (January 1978), pp. 63-72.
 

Methods

[DeBK01]
        Rajagopalan Desikan, Doug Burger, and Stephen W. Keckler,
        "Measuring Experimental Error in Microprocessor Simulation",
        Proceedings of the 28th Annual International Symposium on Computer Architecture,
        Goteborg, Sweden (July 1-4, 2001), pp. 266-277.
 
[OsCF00]
        Mark Oskin, Frederic T. Chong, Matthew Farrens,
        "HLS: Combining Statistical and Symbolic Simulation to Guide Microprocessor Designs",
        Proceedings of the 27th Annual International Symposium on Computer Architecture (ISCA00),
        Vancouver, Canada (June 10-14, 2000), pp. 71-82.

Branch Prediction

[YeP92]
        T. Yeh and Y. Patt, "Alternative Implementations of Two-Level Adaptive
        Training Branch Prediction", Proceedings of the Nineteenth Annual
        International Symposium on Computer Architecture, Queensland,
        Australia (May 19-21, 1992), pp. 124-134.

[McFa93]
        Scott McFarling, "Combining branch predictors," Digital Equipment
        Corporation WRL Technical Note TN-36, June 1993.

[CHYP94]
        P. Chang, E. Hao, T. Yeh and Y. Patt,
        "Branch Classification:  A New Mechanism for Improving Branch Predictor Performance",
        Proceedings of the 27th Annual International Symposium on Microarchitecture, San
        Jose, Ca.  (November 30 - December 2, 1994), pp. 22-31.

[YoGS95]
        C. Young, N. Gloy and M. D. Smith,
        "A Comparative Analysis of Schemes for Correlated Branch Prediction",
        Proceedings of the 22nd Annual International Symposium on Computer Architecture, Santa Margherita
        Ligure, Italy (June 22-24, 1995), pp. 276-286.
 

[ChCM96]
        I. K. Chen, J. T. Coffey and T. N. Mudge,
        "Analysis of Branch Prediction via Data Compression",
        Proceedings of the Seventh International Conference on Architectural Support for Programming
        Languages and Operating Systems, Cambridge, MA (October 1996), pp. 128-137.

[EmGl97]
        Joel Emer and Nikolas Gloy,
        "A Language for Describing Predictors and its Application to Automatic Synthesis",
        Proceedings of the 24th Annual International Symposium on Computer Architecture,
        Denver, Colorado (June 2-4, 1997), pp. 304-314.

[EPCP98]
        M. Evers, S. J. Patel, R. S. Chappell and Y. N. Patt,
        "An Analysis of Correlation and Predictability:  What Makes Two-Level Branch Predictors Work",
        Proceedings of the 25th Annual International Symposium on Computer Architecture, Barcelona,
        Spain (June 29-July 1, 1998), pp. 52-61.

[JuSN98]
        T. Juan, S. Sanjeevan and J. J. Navarro,
        "Dynamic History-Length Fitting: A third level of adaptivity for branch prediction",
        Proceedings of the 25th Annual International Symposium on Computer Architecture, Barcelona,
        Spain (June 29-July 1, 1998), pp. 155-166.

[StEP98]
        J. Stark, M. Evers and Y. N. Patt, "Variable Length Path Branch Prediction",
        Proceedings of the Eighth International Conference on Architectural Support for Programming
        Languages and Operating Systems,  San Jose, CA (October 3-7, 1998), pp. 170-179.

[EdMu98]
        A. N. Eden and T. Mudge, "The YAGS Branch Prediction Scheme",
        Proceedings of the 31st Annual International Symposium on
        Microarchitecture, Dallas, Texas (November 30-December 2, 1998), pp. 69-77.

[KiT98]
        S. P. Kim and G. S. Tyson,
        "Analyzing the Working Set Characteristics of Branch Execution",
        Proceedings of the 31st Annual International Symposium on Microarchitecture,
        Dallas, Texas (November 30-December 2, 1998), pp. 49-58.

[HeSS99]
        T. Heil, Z. Smith and J. E. Smith, "Improving Branch Predictors by Correlating on Data Values",
        Proceedings of the 32nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 32),
        pages 28-37, December 1999.

[HaSF00]
        Michael Haungs, Phil Sallee and Matthew Farrens,
        "Branch Transition Rate: A New Metric for Improved Branch Classification Analysis",
        Proceedings of the 6th International Symposium on High-Performance Computer Architecture,
        Toulouse, France (January 8-12, 2000), pp. 241-250.

[SkMC00]
        Kevin Skadron, Margaret Martonosi, and Douglas Clark,
        "A Taxonomy of Branch Mispredictions, and Alloyed Prediction as a Robust Solution to Wrong-History Mispredictions",
        Proceedings of the International Conference on Parallel Architectures and Compilation Techniques,
        Philadelphia, PA (October 15-19, 2000), pp.

[ERSM01]
        A. Eden, J. Ringenberg, S. Sparrow, and T. Mudge.
        "Hybrid myths in branch prediction." Proc.  5th World Multiconference on Systemics,  Cybernetics and
        Informatics (SCI 2001) and the 7th Int. Conf.  on Information Systems Analysis and Synthesis (ISAS 2001),
        Orlando, FL, July 2001, to appear.

Confidence Predictors

[JaRS96]
        Erik Jacobsen, Eric Rotenberg, and James E. Smith,
        "Assigning Confidence to Conditional Branch Predictions",
        Proceedings of the 29th Annual International Symposium on Microarchitecture,
        Paris, France (December 2-4, 1996), pp. 142-152.

[GKMP98]
        D. Grunwald, A. Klauser, S. Manne and A. Pleszkun,
        "Confidence Estimation for Speculation Control", Proceedings of the 25th Annual International
        Symposium on Computer Architecture, Barcelona, Spain (June 29-July 1, 1998), pp. 122-131.

[AGGG01]
        J.L. Aragon, J. Gonzalez, J.M. Garcia and A. Gonzalez,
        "Selective Branch Prediction Reversal by Correlating with Data Values and Control Flow",
        Proceedings of the 19th IEEE International Conference on Computer Design,
        Austin, Texas (September 24-26, 2001), pp. 228-233.

Advanced Caching Techniques

[BuGK96]
        D. Burger, J. R. Goodman and A. Kagi,
        "Memory Bandwidth Limitations of Future Microprocessors",
        Proceedings of the 23rd Annual International Symposium on Computer Architecture,
        Philadelphia, PA (May 22-24, 1996), pp. 78-89.

[TFMP97]
        G. Tyson, M. Farrens, J. Matthews and A. Pleszkun,
        "Managing Data Caches using Selective Cache Line Replacement",
        International Journal of Parallel Programming, vol. 25, no. 3 (June 1997), pp. 213-242.

[KuWi98]
        S. Kumar and C. Wilkerson, "Exploiting Spatial Locality in Data Caches using Spatial Footprints",
        Proceedings of the 25th Annual International Symposium on Computer Architecture,
        Barcelona, Spain (June 29-July 1, 1998), pp. 357-368.

[PeHS99]
        J. Peir, W. W. Hsu and A. J. Smith, "Functional Implementation
        Techniques for CPU Cache Memories", IEEE Transactions on Computers,
        vol. 48, no. 2  (February 1999), pp. 100-110.

[VTGN99]
        A. Veidenbaum, W. Tang, R. Gupta, A. Nicolau, X. Ji,
        "Adapting Cache Line Size to Application Behavior."
        Proceedings of the 13th ACM International Conference on Supercomputing (ICS '99), 1999.

[TRST99]
        Edward S. Tam, Jude A. Rivers, Vijayalakshmi Srinivasan, Gary S. Tyson and Edward S. Davidson,
        "Active Management of Data Caches by Exploiting Reuse Information",
        IEEE Transactions on Computers, vol. 48, no. 11 (November 1999), pp. 1244-1259.

[HaRe00]
        Erik G. Hallnor and Steven K. Reinhardt,
        "A Fully Associative Software-Managed Cache Design",
        Proceedings of the 27th Annual International Symposium on Computer Architecture,
        Vancouver, British Columbia (June 10-14, 2000), pp. 107-116.

[JaMu01]
        B. Jacob and T. Mudge.  "Uniprocessor virtual memory without TLBs"
        IEEE Trans.  Computers, vol. 50, no. 5, May 2001, pp. xx-xx.
 

Instruction Fetch Issues

[CMMP95]
        T. M. Conte, K. N. Menezes, P. M. Mills and B. A. Patel, "Optimization
        of Instruction Fetch Mechanisms for High Issue Rates", Proceedings of
        the 22nd Annual International Symposium on Computer Architecture,
        Santa Margherita Ligure, Italy (June 22-24, 1995), pp. 333-344.

[RoBS96]
        E. Rotenberg, S. Bennett and J. E. Smith,
        "Trace Cache: a Low Latency Approach to High Bandwidth Instruction Fetching",
        Proceedings of the 29th Annual International Symposium on Microarchitecture,
        Paris, France (December 2-4, 1996), pp. 24-35.

[PoTM99]
        M. Postiff, G. Tyson and T. Mudge, "Performance Limits of Trace Caches",
        Journal of Instruction Level Parallelism, vol. 1, no. 5 (October 1999).

[BlRS99]
        B. Black, B. Rychlik and J. P. Shen, "The Block-based Trace Cache",
        Proceedings of the 26th Annual International Symposium on Computer
        Architecture, Atlanta, GA (May 2-4, 1999), pp. 196-207.
 

Future Processor Models

[EKDP03]
        Dan Ernst, Nam Sung Kim, Shidhartha Das, Sanjay Pant, Rajeev Rao, Toan Pham,
        Conrad Ziesler, David Blaauw, Todd Austin, Krisztian Flautner, Trevor Mudge,
        "Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation,"
        Proceedings of the 36th Annual International Symposium on Microarchitecture,
        San Diego, CA (Dec. 3-5, 2003), pp. 7-18.

[Aust00]
        Todd Austin,  "DIVA: A Dynamic Approach to Microprocessor Verification,"
        Journal of Instruction Level Parallelism, vol. 2, no. 11 (May 2000).
 

Branch Elimination

[MLCH92]
        S. A. Mahlke, D. C. Lin, W. Y. Chen, R. E. Hank and R. A. Bringmann,
        "Effective Compiler Support for Predicated Execution Using the
        Hyperblock", Proceedings of the 25th Annual International Symposium on
        Microarchitecture, Portland, Oregon (December 1-4, 1992), pp. 45-54.

[HMC93]
        W. W. Hwu, S. A. Mahlke, W. Y. Chen, P. P. Chang, N. J. Warter, R. A.
        Bringmann, R. G. Ouellette, R. E. Hank, T. Kiyohara, G. E. Haab, J. G.
        Holm and D. M. Lavery, "The Superblock: An Effective Technique for
        VLIW and Superscalar Compilation", Journal of Supercomputing, vol.
        7, no. 1/2 (1993), pp. 229-248.

[MuWh95]
        F. Mueller and D. B. Whalley, "Avoiding Conditional Branches by Code
        Replication", Proceedings of the ACM SIGPLAN Conference on
        Programming Language Design and Implementation, La Jolla, CA (June
        18-21, 1995), pp. 56-66.

[BoGS97]
        R. Bodik, R. Gupta and M. L. Soffa, "Interprocedural Conditional
        Branch Elimination", Proceedings of the ACM SIGPLAN Conference
        on Programming Language Design and Implementation, Las Vegas, Nevada
        (June 15-18, 1997), pp. 146-158.

[YaUW98]
        M. Yang, G. Uh and D. B. Whalley, "Improving Performance by Branch
        Reordering", Proceedings of the ACM SIGPLAN Conference on
        Programming Language Design and Implementation, Montreal, Canada (June
        17-19, 1998), pp. 130-141.

[ASPM99]
        D. I. August, J. W. Sias, J. Puiatti, S. A. Mahlke, D. A. Conners, K.
        M. Crozier and W. W. Hwu, "The Program Decision Logic Approach to
        Predicated Execution", Proceedings of the 26th Annual International
        Symposium on Computer Architecture, Atlanta, GA (May 2-4, 1999), pp.
        208-219.
 

Prefetching

[Joup90b]
        N. Jouppi, "Improving Direct-Mapped Cache Performance by the Addition
        of a Small Fully-Associative Cache and Prefetch Buffers", Proceedings
        of the Seventeenth Annual International Symposium on Computer
        Architecture,  vol. 18, no. 2  (May 1990), pp. 364-373.

[Joup90a]
        N. Jouppi, "Reducing Compulsory and Capacity Misses", Digital Western
        Research Laboratory Technical Note TN-53 (August 1990).

[ChBa95]
        T. Chen and J. Baer, "Effective Hardware Based Data Prefetching for
        High-Performance Processors", IEEE Transactions on Computers,  vol.
        44, no. 5  (May 1995), pp. 609-623.

[Eben98]
        A. Ebenezer, "Hardware Based Prefetching Methods", Master's Thesis,
        Department of Electrical and Computer Engineering, University of
        California-Davis, Davis, California, (December 1998).

[Joup98]
        N. Jouppi, "Retrospective:  Improving Direct-Mapped Cache Performance
        by the Addition of a Small Fully-Associative Cache and Prefetch
        Buffers", 25 Years of the International Symposium on Computer
        Architecture - Selected Papers (1998), pp. 71-73.

[JoGr99]
        D. Joseph and D. Grunwald, "Prefetching Using Markov Predictors", IEEE
        Transactions on Computers,  vol. 48, no. 2  (February 1999), pp. 121-133.
 
 

Value Prediction