[Rich99]
Kevin Rich, "Compiler
Techniques for Evaluating and Extending Decoupled Architectures",
(Ph.D. Dissertation) University
of California at Davis, Davis, California (December 1999).
[Tyso97]
Gary Tyson, "Evaluation
of a Scalable Decoupled Microprocessor Design",
(Ph.D. Dissertation) University
of California at Davis, Davis, California (August 1997).
[TyFa94]
G. Tyson and M. Farrens,
"Code Scheduling for Multiple Instruction
Stream Architectures", International
Journal of Parallel Programming,
vol. 22, no. 3 (1994),
pp. 243-272.
[TyFa93]
G. Tyson and M. Farrens,
"Techniques for Extracting Instruction Level
Parallelism on MIMD Architectures",
Proceedings
of the 26th Annual
International Symposium
on Microarchitecture, Austin, Texas (December
1-3, 1993), pp. 128-137.
[SmWP86]
J. E. Smith, S. Weiss and
N. Y. Pang, "A Simulation Study of Decoupled
Architecture Computers",
IEEE
Transactions on Computers, vol. C-35,
no. 8 (August 1986),
pp. 692-702.
[Toma67]
R. M. Tomasulo, "An Efficient
Algorithm for Exploiting Multiple
Arithmetic Units", IBM
Journal, vol. 11 (January 1967), pp. 25-33.
[Russ78]
R. M. Russell, "The CRAY-1
Computer System", Communications of the
ACM, vol. 21,
no. 1 (January 1978), pp. 63-72.
[McFa93]
Scott McFarling, "Combining
branch predictors," Digital Equipment
Corporation WRL Technical
Note TN-36, June 1993.
[CHYP94]
P. Chang, E. Hao, T. Yeh
and Y. Patt, "Branch
Classification: A New
Mechanism for Improving Branch Predictor Performance", Proceedings
of
the 27th Annual International
Symposium on Microarchitecture, San
Jose, CA (November
30 - December 2, 1994), pp. 22-31.
[YoGS95]
C. Young, N. Gloy and M.
D. Smith, "A
Comparative Analysis of Schemes
for Correlated Branch Prediction",
Proceedings of the 22nd Annual
International Symposium
on Computer Architecture, Santa Margherita
Ligure, Italy (June 22-24,
1995), pp. 276-286.
[ChCM96]
I. K. Chen, J. T. Coffey
and T. N. Mudge, "Analysis
of Branch
Prediction via Data Compression",
Proceedings of the Seventh
International Conference
on Architectural Support for Programming
Languages and Operating
Systems, Cambridge, MA (October 1996), pp.
128-137.
[KiT98]
S. P. Kim and G. S. Tyson, "Analyzing
the Working Set Characteristics
of Branch Execution", Proceedings of the 31st Annual International
Symposium on Microarchitecture,
Dallas, Texas (November 30-December 2,
1998), pp. 49-58.
[EPCP98]
M. Evers, S. J. Patel, R.
S. Chappell and Y. N. Patt, "An Analysis of
Correlation and Predictability: What Makes Two-Level Branch
Predictors Work", Proceedings of the 25th Annual International
Symposium on Computer
Architecture, Barcelona, Spain (June 29-July 1,
1998), pp. 52-61.
[JuSN98]
T. Juan, S. Sanjeevan and
J. J. Navarro, "Dynamic History-Length Fitting:
A third level of adaptivity for branch prediction", Proceedings
of the 25th
Annual International
Symposium on Computer Architecture, Barcelona,
Spain (June 29-July 1, 1998),
pp. 155-166.
[StEP98]
J. Stark, M. Evers and Y.
N. Patt, "Variable Length Path Branch
Prediction", Proceedings of the Eighth International Conference
on
Architectural Support
for Programming Languages and Operating Systems,
San Jose, CA (October 3-7,
1998), pp. 170-179.
[EdMu98]
A. N. Eden and T. Mudge,
"The YAGS Branch Prediction Scheme",
Proceedings of the 31st
Annual International Symposium on
Microarchitecture,
Dallas, Texas (November 30-December 2, 1998), pp.
69-77.
[GKMP98]
D. Grunwald, A. Klauser,
S. Manne and A. Pleszkun, "Confidence Estimation
for
Speculation Control", Proceedings of the 25th Annual International
Symposium on Computer
Architecture, Barcelona, Spain (June 29-July 1,
1998), pp.
[HeSS99]
T. Heil, Z. Smith and J.
E. Smith, "Improving Branch Predictors by
Correlating
on Data Values", In Proceedings of the 32nd Annual
IEEE/ACM International
Symposium on Microarchitecture (MICRO 32),
pages 28-37, December 1999.
[Joup90a]
N. Jouppi, "Reducing Compulsory
and Capacity Misses", Digital Western
Research Laboratory Technical
Note TN-53 (August 1990).
[ChBa95]
T. Chen and J. Baer, "Effective
Hardware Based Data Prefetching for
High-Performance Processors",
IEEE
Transactions on Computers, vol.
44, no. 5 (May 1995),
pp. 609-623.
[Eben98]
A. Ebenezer, "Hardware
Based Prefetching Methods", Master's Thesis,
Department of Electrical
and Computer Engineering, University of
California-Davis, Davis,
California, (December 1998).
[Joup98]
N. Jouppi, "Retrospective:
Improving Direct-Mapped Cache Performance
by the Addition of a Small
Fully-Associative Cache and Prefetch
Buffers", 25 Years of
the International Symposium on Computer
Architecture - Selected
Papers (1998), pp. 71-73.
[JoGr99]
D. Joseph and D. Grunwald,
"Prefetching Using Markov Predictors", IEEE
Transactions on Computers,
vol. 48, no. 2 (February 1999), pp. 121-133.
[BuGK96]
D. Burger, J. R. Goodman
and A. Kagi, "Memory Bandwidth Limitations of
Future Microprocessors",
Proceedings
of the 23rd Annual International
Symposium on Computer
Architecture, Philadelphia, PA (May 22-24,
1996), pp. 78-89.
[TFMP97]
G. Tyson, M. Farrens, J.
Matthews and A. Pleszkun, "Managing Data
Caches using Selective Cache
Line Replacement", International Journal
of Parallel Programming,
vol. 25, no. 3 (June 1997), pp. 213-242.
[KuWi98]
S. Kumar and C. Wilkerson,
"Exploiting Spatial Locality in Data Caches
using Spatial Footprints",
Proceedings
of the 25th Annual
International Symposium
on Computer Architecture, Barcelona, Spain
(June 29-July 1, 1998),
pp. 357-368.
[PeHS99]
J. Peir, W. W. Hsu and A.
J. Smith, "Functional Implementation
Techniques for CPU Cache
Memories", IEEE Transactions on Computers,
vol. 48, no. 2 (February
1999), pp. 100-110.
[ShAR99]
X. Shen, Arvind and L. Rudolph,
"Commit-Reconcile & Fences (CRF): A
New Memory Model for Architects
and Compiler Writers", Proceedings of
the 26th Annual International
Symposium on Computer Architecture,
Atlanta, GA (May 2-4, 1999),
pp. 150-161.
[GnFV99]
C. Gniady, B. Falsafi and
T. N. Vijaykumar, "Is SC + ILP = RC?",
Proceedings of the 26th
Annual International Symposium on Computer
Architecture, Atlanta,
GA (May 2-4, 1999), pp. 162-171.
[LaF99]
A. Lai and B. Falsafi, "Memory Sharing Predictor: The
Key to a
Speculative Coherent DSM",
Proceedings
of the 26th Annual
International Symposium
on Computer Architecture, Atlanta, GA (May 2-4,
1999), pp. 161-182.
[CMMP95]
T. M. Conte, K. N. Menezes,
P. M. Mills and B. A. Patel, "Optimization
of Instruction Fetch Mechanisms
for High Issue Rates", Proceedings of
the 22nd Annual International
Symposium on Computer Architecture,
Santa Margherita Ligure,
Italy (June 22-24, 1995), pp. 333-344.
[RoBS96]
E. Rotenberg, S. Bennett
and J. E. Smith, "Trace Cache: a Low Latency
Approach to High Bandwidth
Instruction Fetching", Computer Sciences
Department Technical
Report CS-Technical Report-96-1310, University of
Wisconsin-Madison, Madison,
Wisconsin (April 11, 1996).
[PoTM99]
M. Postiff, G. Tyson and
T. Mudge, "Performance Limits of Trace
Caches", Journal of Instruction
Level Parallelism, vol. 1, no. (to
appear)
(June 1999).
[BlRS99]
B. Black, B. Rychlik and
J. P. Shen, "The Block-based Trace Cache",
Proceedings of the 26th
Annual International Symposium on Computer
Architecture, Atlanta,
GA (May 2-4, 1999), pp. 196-207.
[HMC93]
W. W. Hwu, S. A. Mahlke, W. Y. Chen, P. P. Chang, N. J. Warter,
R. A.
Bringmann, R. G. Ouellette,
R. E. Hank, T. Kiyohara, G. E. Haab, J. G.
Holm and D. M. Lavery, "The
Superblock: An Effective Technique for
VLIW and Superscalar Compilation",
Journal
of Supercomputing, vol.
7, no. 1/2
(1993), pp. 229-248.
[MuWh95]
F. Mueller and D. B. Whalley,
"Avoiding Conditional Branches by Code
Replication", Proceedings
of the ACM SIGPLAN Conference on
Programming Language
Design and Implementation, La Jolla, CA (June
18-21, 1995), pp. 56-66.
[BoGS97]
R. Bodik, R. Gupta and M.
L. Soffa, "Interprocedural Conditional
Branch Elimination", Proceedings
of the ACM SIGPLAN Conference
on Programming Language
Design and Implementation, Las Vegas, Nevada
(June 15-18, 1997), pp.
146-158.
[YaUW98]
M. Yang, G. Uh and D. B.
Whalley, "Improving Performance by Branch
Reordering", Proceedings
of the ACM SIGPLAN Conference on
Programming Language
Design and Implementation, Montreal, Canada (June
17-19, 1998), pp. 130-141.
[ASPM99]
D. I. August, J. W. Sias,
J. Puiatti, S. A. Mahlke, D. A. Conners, K.
M. Crozier and W. W. Hwu,
"The Program Decision Logic Approach to
Predicated Execution", Proceedings
of the 26th Annual International
Symposium on Computer
Architecture, Atlanta, GA (May 2-4, 1999), pp.
208-219.