Reading List for ECS 250B

Decoupled Processing


[Rich99]
        Kevin Rich, "Compiler Techniques for Evaluating and Extending Decoupled Architectures",
        (Ph.D. Dissertation) University of California at Davis, Davis, California (December 1999).

[Tyso97]
        Gary Tyson, "Evaluation of a Scalable Decoupled Microprocessor Design",
        (Ph.D. Dissertation) University of California at Davis, Davis, California (August 1997).

[TyFa94]
        G. Tyson and M. Farrens, "Code Scheduling for Multiple Instruction
        Stream Architectures", International Journal of Parallel Programming,
        vol. 22, no. 3  (1994), pp. 243-272.

[TyFa93]
        G. Tyson and M. Farrens, "Techniques for Extracting Instruction Level
        Parallelism on MIMD Architectures", Proceedings of the 26th Annual
        International Symposium on Microarchitecture, Austin, Texas (December
        1-3, 1993), pp. 128-137.

[SmWP86]
        J. E. Smith, S. Weiss and N. Y. Pang, "A Simulation Study of Decoupled
        Architecture Computers", IEEE Transactions on Computers,  vol. C-35,
        no. 8  (August 1986), pp. 692-702.
 

Famous Machines

[Thor64]
        J. E. Thornton, "Parallel Operation in the Control Data 6600", AFIPS
        Proceedings of the Spring Joint Computer Conference, part II,  vol. 26
        (1964), pp. 33-40.

[Toma67]
        R. M. Tomasulo, "An Efficient Algorithm for Exploiting Multiple
        Arithmetic Units", IBM Journal,  vol. 11 (January 1967), pp. 25-33.

[Russ78]
        R. M. Russell, "The CRAY-1 Computer System", Communications of the
        ACM,  vol. 21, no. 1  (January 1978), pp. 63-72.
 

Methods

[DeBK01]
        Rajagopalan Desikan, Doug Burger and Stephen W. Keckler,
        "Measuring Experimental Error in Microprocessor Simulation",
        Proceedings of the 28th Annual International Symposium on Computer Architecture,
        Goteborg, Sweden (July 1-4th,  2001), pp.
 

Branch Prediction

[YeP92]
        T. Yeh and Y. Patt, "Alternative Implementations of Two-Level Adaptive
        Branch Prediction", Proceedings of the Nineteenth Annual
        International Symposium on Computer Architecture, Queensland,
        Australia (May 19-21, 1992), pp. 124-134.

[McFa93]
        Scott McFarling, "Combining Branch Predictors", Digital Equipment
        Corporation WRL Technical Note TN-36 (June 1993).
 
[CHYP94]
        P. Chang, E. Hao, T. Yeh and Y. Patt, "Branch Classification:  A New
        Mechanism for Improving Branch Predictor Performance", Proceedings of
        the 27th Annual International Symposium on Microarchitecture, San
        Jose, CA (November 30-December 2, 1994), pp. 22-31.

[YoGS95]
        C. Young, N. Gloy and M. D. Smith, "A Comparative Analysis of Schemes
        for Correlated Branch Prediction", Proceedings of the 22nd Annual
        International Symposium on Computer Architecture, Santa Margherita
        Ligure, Italy (June 22-24, 1995), pp. 276-286.

[ChCM96]
        I. K. Chen, J. T. Coffey and T. N. Mudge, "Analysis of Branch
        Prediction via Data Compression", Proceedings of the Seventh
        International Conference on Architectural Support for Programming
        Languages and Operating Systems, Cambridge, MA (October 1996), pp.
        128-137.

[KiT98]
        S. P. Kim and G. S. Tyson, "Analyzing the Working Set Characteristics
        of Branch Execution", Proceedings of the 31st Annual International
        Symposium on Microarchitecture, Dallas, Texas (November 30-December 2,
        1998), pp. 49-58.

[EPCP98]
        M. Evers, S. J. Patel, R. S. Chappell and Y. N. Patt, "An Analysis of
        Correlation and Predictability:  What Makes Two-Level Branch
        Predictors Work", Proceedings of the 25th Annual International
        Symposium on Computer Architecture, Barcelona, Spain (June 29-July 1,
        1998), pp. 52-61.

[JuSN98]
        T. Juan, S. Sanjeevan and J. J. Navarro, "Dynamic History-Length Fitting:
        A third level of adaptivity for branch prediction", Proceedings of the 25th
        Annual International Symposium on Computer Architecture, Barcelona,
        Spain (June 29-July 1, 1998), pp. 155-166.

[StEP98]
        J. Stark, M. Evers and Y. N. Patt, "Variable Length Path Branch
        Prediction", Proceedings of the Eighth International Conference on
        Architectural Support for Programming Languages and Operating Systems,
        San Jose, CA (October 3-7, 1998), pp. 170-179.

[EdMu98]
        A. N. Eden and T. Mudge, "The YAGS Branch Prediction Scheme",
        Proceedings of the 31st Annual International Symposium on
        Microarchitecture, Dallas, Texas (November 30-December 2, 1998), pp.
        69-77.

[GKMP98]
        D. Grunwald, A. Klauser, S. Manne and A. Pleszkun, "Confidence Estimation for
        Speculation Control", Proceedings of the 25th Annual International
        Symposium on Computer Architecture, Barcelona, Spain (June 29-July 1,
        1998), pp.

[HeSS99]
        T. Heil, Z. Smith and J. E. Smith, "Improving Branch Predictors by
        Correlating on Data Values", Proceedings of the 32nd Annual
        IEEE/ACM International Symposium on Microarchitecture (MICRO 32)
        (December 1999), pp. 28-37.


Prefetching

[Joup90b]
        N. Jouppi, "Improving Direct-Mapped Cache Performance by the Addition
        of a Small Fully-Associative Cache and Prefetch Buffers", Proceedings
        of the Seventeenth Annual International Symposium on Computer
        Architecture,  vol. 18, no. 2  (May 1990), pp. 364-373.

[Joup90a]
        N. Jouppi, "Reducing Compulsory and Capacity Misses", Digital Western
        Research Laboratory Technical Note TN-53 (August 1990).

[ChBa95]
        T. Chen and J. Baer, "Effective Hardware Based Data Prefetching for
        High-Performance Processors", IEEE Transactions on Computers,  vol.
        44, no. 5  (May 1995), pp. 609-623.

[Eben98]
        A. Ebenezer, "Hardware Based Prefetching Methods", Master's Thesis,
        Department of Electrical and Computer Engineering, University of
        California-Davis, Davis, California (December 1998).

[Joup98]
        N. Jouppi, "Retrospective:  Improving Direct-Mapped Cache Performance
        by the Addition of a Small Fully-Associative Cache and Prefetch
        Buffers", 25 Years of the International Symposium on Computer
        Architecture - Selected Papers (1998), pp. 71-73.

[JoGr99]
        D. Joseph and D. Grunwald, "Prefetching Using Markov Predictors", IEEE
        Transactions on Computers,  vol. 48, no. 2  (February 1999), pp. 121-133.
 

Advanced Caching Techniques

[GLLG90]
        K. Gharachorloo, D. Lenoski, J. Laudon, P. Gibbons, A. Gupta and J.
        Hennessy, "Memory Consistency and Event Ordering in Scalable Shared-
        Memory Multiprocessors", Proceedings of the 17th Annual International
        Symposium on Computer Architecture, Seattle, WA (May 29-June 2, 1990),
        pp. 15-26.

[BuGK96]
        D. Burger, J. R. Goodman and A. Kagi, "Memory Bandwidth Limitations of
        Future Microprocessors", Proceedings of the 23rd Annual International
        Symposium on Computer Architecture, Philadelphia, PA (May 22-24,
        1996), pp. 78-89.

[TFMP97]
        G. Tyson, M. Farrens, J. Matthews and A. Pleszkun, "Managing Data
        Caches using Selective Cache Line Replacement", International Journal
        of Parallel Programming, vol. 25, no. 3 (June 1997), pp. 213-242.

[KuWi98]
        S. Kumar and C. Wilkerson, "Exploiting Spatial Locality in Data Caches
        using Spatial Footprints", Proceedings of the 25th Annual
        International Symposium on Computer Architecture, Barcelona, Spain
        (June 29-July 1, 1998), pp. 357-368.

[PeHS99]
        J. Peir, W. W. Hsu and A. J. Smith, "Functional Implementation
        Techniques for CPU Cache Memories", IEEE Transactions on Computers,
        vol. 48, no. 2  (February 1999), pp. 100-110.

[ShAR99]
        X. Shen, Arvind and L. Rudolph, "Commit-Reconcile & Fences (CRF):  A
        New Memory Model for Architects and Compiler Writers", Proceedings of
        the 26th Annual International Symposium on Computer Architecture,
        Atlanta, GA (May 2-4, 1999), pp. 150-161.

[GnFV99]
        C. Gniady, B. Falsafi and T. N. Vijaykumar, "Is SC + ILP = RC?",
        Proceedings of the 26th Annual International Symposium on Computer
        Architecture, Atlanta, GA (May 2-4, 1999), pp. 162-171.

[LaF99]
        A. Lai and B. Falsafi, "Memory Sharing Predictor:  The Key to a
        Speculative Coherent DSM", Proceedings of the 26th Annual
        International Symposium on Computer Architecture, Atlanta, GA
        (May 2-4, 1999), pp. 161-182.
 

Instruction Fetch Issues


[CMMP95]
        T. M. Conte, K. N. Menezes, P. M. Mills and B. A. Patel, "Optimization
        of Instruction Fetch Mechanisms for High Issue Rates", Proceedings of
        the 22nd Annual International Symposium on Computer Architecture,
        Santa Margherita Ligure, Italy (June 22-24, 1995), pp. 333-344.

[RoBS96]
        E. Rotenberg, S. Bennett and J. E. Smith, "Trace Cache: a Low Latency
        Approach to High Bandwidth Instruction Fetching", Computer Sciences
        Department Technical Report CS-TR-96-1310, University of
        Wisconsin-Madison, Madison, Wisconsin (April 11, 1996).

[PoTM99]
        M. Postiff, G. Tyson and T. Mudge, "Performance Limits of Trace
        Caches", Journal of Instruction Level Parallelism, vol. 1, to appear
        (June 1999).

[BlRS99]
        B. Black, B. Rychlik and J. P. Shen, "The Block-based Trace Cache",
        Proceedings of the 26th Annual International Symposium on Computer
        Architecture, Atlanta, GA (May 2-4, 1999), pp. 196-207.
 

Branch Elimination

[MLCH92]
        S. A. Mahlke, D. C. Lin, W. Y. Chen, R. E. Hank and R. A. Bringmann,
        "Effective Compiler Support for Predicated Execution Using the
        Hyperblock", Proceedings of the 25th Annual International Symposium on
        Microarchitecture, Portland, Oregon (December 1-4, 1992), pp. 45-54.

[HMC93]
        W. W. Hwu, S. A. Mahlke, W. Y. Chen, P. P. Chang, N. J. Warter, R. A.
        Bringmann, R. G. Ouellette, R. E. Hank, T. Kiyohara, G. E. Haab, J. G.
        Holm and D. M. Lavery, "The Superblock: An Effective Technique for
        VLIW and Superscalar Compilation", Journal of Supercomputing, vol.
        7, no. 1/2 (1993), pp. 229-248.

[MuWh95]
        F. Mueller and D. B. Whalley, "Avoiding Conditional Branches by Code
        Replication", Proceedings of the ACM SIGPLAN Conference on
        Programming Language Design and Implementation, La Jolla, CA (June
        18-21, 1995), pp. 56-66.

[BoGS97]
        R. Bodik, R. Gupta and M. L. Soffa, "Interprocedural Conditional
        Branch Elimination", Proceedings of the ACM SIGPLAN Conference
        on Programming Language Design and Implementation, Las Vegas, Nevada
        (June 15-18, 1997), pp. 146-158.

[YaUW98]
        M. Yang, G. Uh and D. B. Whalley, "Improving Performance by Branch
        Reordering", Proceedings of the ACM SIGPLAN Conference on
        Programming Language Design and Implementation, Montreal, Canada (June
        17-19, 1998), pp. 130-141.

[ASPM99]
        D. I. August, J. W. Sias, J. Puiatti, S. A. Mahlke, D. A. Conners, K.
        M. Crozier and W. W. Hwu, "The Program Decision Logic Approach to
        Predicated Execution", Proceedings of the 26th Annual International
        Symposium on Computer Architecture, Atlanta, GA (May 2-4, 1999), pp.
        208-219.
 

Value Prediction