Project description for ECS201A
Proposal due: Friday, November 2nd
Status Report due: Tuesday, November 21st
Final Report due: Midnight Wednesday, December 13th
Intro/Overviews
The purpose of this project is to help you develop your research,
independent thinking, and presentation skills. Your assignment is
to pick some topic (ideally one that you find interesting) and study it
in more detail. For example, you might choose to evaluate some
proposal of your own, or examine an extension to a paper studied in
class, or re-validate the data in some paper by writing your own
simulator. Keep in mind that there is
an inverse relationship between creativity and detail - if you choose a
project with very little creative contribution, such as re-validating
an existing work, then a more detailed evaluation will be
expected. If the project has a high creativity factor, then
correspondingly less rigor will be acceptable. You can work in
groups of 2 or 3, but I expect approximately equal (and substantial)
contributions from all members - therefore, if you are doing a
re-validation, for example, your group size will have to be small (2)
since there is not that much work to do.
These projects will be graded on roughly 4 different things:
- How well the problem is defined and motivated
- How extensive the survey of previous work is
- The experimental technique used
- The quality of the presentation of the results
The paper should be similar in style to the conference papers that we
will read in class or that are referenced in the back of each chapter of
the text. Your goal should be to produce a publishable-quality paper.
However, since many conference papers represent a significant part of a
Ph.D. student's graduate work, conference-quality originality and results
are not expected. They are desired, but not required.
For those of you who are not familiar with what a conference paper looks
like, here are 5 examples - reading these, you can get the feel for how a
paper should be put together.
- A. N. Eden and T. Mudge, "The YAGS Branch Prediction Scheme",
Proceedings of the 31st Annual International Symposium on
Microarchitecture, Dallas, Texas (November 30-December 2, 1998),
pp. 69-77.
In addition, there are many more examples at this web site.
There are three
milestones associated with this task: The Proposal, the Status Report,
and the Final Report.
Milestone 1 - The Proposal
Proposals should be 1 to 2 pages long and should include:
- A description of the topic
- A statement of why the topic is interesting or important
- A description of the methods to be used for evaluating the
proposed idea (for projects with original research)
- References to at least 3 relevant papers you have obtained and
read. The course text and readings cite many papers. Some other
important venues for publishing relevant work on Architecture:
- Proceedings of the International Symposium on Computer
Architecture (ISCA)
- Proceedings of the Conference on Architectural Support for
Programming Languages and Operating Systems (ASPLOS)
- Proceedings of the International Symposium on
Microarchitecture (MICRO)
- Proceedings of the High Performance Computer Architecture
Symposium (HPCA)
- International Journal of Parallel Processing
- ACM Transactions on Computer Systems
- IEEE Transactions on Computers
- IEEE Computer Magazine
- IEEE Micro
- Microprocessor Report
I will read these proposals and give
you feedback regarding the acceptability of the proposal. For
example, your proposal may be too ambitious to get done in the given
time frame, or it may be too easy to be a 3-person project but
acceptable as a 2-person project, or it may be already done and a
different "spin" will be required ... The proposal deadline is given above.
However, proposals turned in earlier than the deadline will get
feedback sooner. (Remember - up to means less than! :-)
Milestone 2 - The Status Report
In order to help ensure work on the projects is moving forward in a
timely fashion, a 1 to 2 page status report is due midway between the
proposal submission and Final Report due dates. This report should
clearly describe the progress you are making, so that I can provide some
feedback on how you are doing and suggest any mid-course corrections that
might be advisable. The status report will not be graded, but should be
viewed as an important part of the project.
Milestone 3 - The Final Report
As stated above, your Final Report
should be similar in style to a conference paper - an abstract, body,
and optional appendices. The abstract should summarize the
contributions of the report in one or two paragraphs, while the length
of the body should be limited to approximately 5000 words (15-20 pages of
double-spaced 10-point text). If you need more space, you can put
additional supporting material in appendices.
Project Talks
15-25 minute presentations of your results will be
scheduled during finals week, with the in-class finals time being the
latest possible available time (2006 might be different, since I have a
conference M-W of finals week ...). This should be viewed as an opportunity
to practice your presentation skills - the ability to convey your ideas
and results to your peers is critically important in our communication
age, and a central part of the research process that should be of
interest to those pursuing an advanced degree.
Possible Research Topics
Ideally, you should come up with your own topic, one that you find
particularly interesting and related to your own interests. For
example, if you have an interest in compilers, then code scheduling for
instruction level parallelism might be a good topic. If you are more
interested in Operating Systems, then the design of a processor to
support the OS might be more to your liking.
However, I realize that often at this point you do not yet know what you
find most interesting, so to help you along a list of example projects
follows. This is by no means an exhaustive list, nor is it a particularly
good one. Examples, mainly.
- Currently, almost all machines use 32-bit instructions. What if
you had a 64-bit instruction? What could you do with that? (add hints
to the instruction set to support potential underlying hardware, for
example).
- There is a big problem with fetching sufficient instructions to
feed machines with high ILP - propose a new way (or evaluate existing
ways) to deal with this
problem.
- Write a cycle-level simulator for an existing architecture, and
then evaluate various performance enhancements. For example, what
happens to the performance if Out Of Order (OOO) issue
is added (or removed)?
- You can now fabricate 500 million transistors on a chip. What is
the best use of these transistors?
- Compare and contrast different approaches to exploiting
instruction level parallelism - for example, decoupled vs.
VLIW, vectors vs. superscalar, VLIW vs. superscalar, decoupled vs.
superscalar, etc.
- Suggest modifications to the decoupled architecture approach
that might help provide prefetch capabilities.
- Evaluate the maximum amount of parallelism available in a
representative set of benchmark programs.
- Look at ways to increase the effective bandwidth between
processor and external memory.
- Study the "bursty" nature of pipelines; are averages really
useful? Is there a way to more accurately model bursty behaviour?
- (interesting note - 99% of the human population has more than the
average number of legs ...)
- Analyze program basic block size, and look at the branch
problem. Evaluate the technique of predicated execution, and give some
examples of how it can be used to increase basic block size.
- Architectures/implementations for non-load/store architectures.
For example, how might a stack or
accumulator architecture be implemented to go fast? Can performance
advantages be identified?
- Look at instruction set enhancements and their effect on
performance (e.g., update-mode addressing, conditional
register-to-register moves, and multiply-add instructions)
- Analyze the static and dynamic instruction frequencies for 3-4
different architectures. Also look at instruction couples and triples.
Based on this information, can you propose any new instructions?
- Architectural support of operating systems (e.g., user-level
traps for lightweight threads)
- Revisit the concept of an OS co-processor. What should such a
co-processor look like? (what OS tasks could use specific hardware
support, how often would it have to be used to be effective, etc.)
"Design" the processor (define the instruction set, word size,
datapath, number of ALUs, registers, etc.). What does this specially
designed OS co-processor give you that
a 68000 used in a similar manner wouldn't?
- What would an OS for a decoupled machine like MISC look like?
- Programs exhibit a lot of predictability and redundancy. Often
entire blocks of code have the same inputs each time, meaning they do
not have to be reexecuted. How might you identify these blocks and
exploit this
information?
- Is it really necessary to use all the existing transistors just to
improve performance? Currently it takes months/years to develop software
packages/systems, which are full of bugs and potential security holes.
What kind of hardware support might you add to a processor in order to
help facilitate the job of writing correct programs?
- What does the distribution of data values look like?
- What is the average lifetime of a cache location?
- What is the distribution of hard to predict branches? Do they
cluster or are they evenly distributed? Can you use this
information?
- Study cache implementations, especially non-blocking caches --
design methods and performance, for example
- Various memory system enhancements, including victim caches,
stream buffers, address hashing, etc.
- Extend the current research that has been done on new ways to
manage a cache (evaluate and improve the effectiveness of C/NA, for
example)
- Look at what are called spatial/temporal caches - does it make
sense to treat data differently based on the type of locality it
exhibits?
- How about a "compressed" cache? We would expect there to be lots
of redundancy in the cache itself - could you maybe have 1 cache that
is really small that holds compressed data, and another cache that
holds uncompressed data?
- Some load instructions are more "important" than others - in other
words, some load instructions need to find their data in the first level
cache, while others can afford to have the data be in the second level
cache with no impact on performance. How might you identify these
different types of loads, and do something useful with that information?
- Methods and performance of various predictors, both value and
branch, including ones you propose yourself
- A study of confidence predictors, why they are important and how
they might be improved
- The importance of and techniques for predicting multiple
branches in a single cycle
- Is there really a "memory wall", and if so, what do we do about
it?
- What's all this noise about Processors In Memory (PIM), anyway?
- What is speculative execution, how important is it, how is it
implemented, what kind of performance can it provide, etc.
- Compiler transformations to improve pipeline/superscalar
performance
- Compiler transformations to improve memory behavior
- The effect of changing technology on architecture (e.g. flash
memories, fiber optics), and the most likely technology changes in the
near future.
- High-performance I/O (e.g. RAID and ATM networks)
- Prefetching, both data and instruction.
- Value Speculation, what it is and how it works
- Power-aware processing, what challenges designers are facing and
how these problems might be overcome
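Several of the topics above (the predictor studies in particular) come
down to writing a small simulator and measuring it. As a rough sense of
scale, here is a minimal Python sketch of a table of 2-bit saturating
counters run over a synthetic branch trace. The class name
`TwoBitPredictor`, the helper functions, the table size, and the trace
itself are all illustrative assumptions, not part of any course-provided
tool; a real project would use traces from actual benchmark programs.

```python
import random


class TwoBitPredictor:
    """Table of 2-bit saturating counters indexed by low bits of the branch PC."""

    def __init__(self, index_bits=10):
        # index_bits is an arbitrary illustrative table size (2**10 entries).
        self.mask = (1 << index_bits) - 1
        # Counter values 0-1 predict not taken, 2-3 predict taken.
        # Every counter starts at 1 (weakly not taken).
        self.table = [1] * (1 << index_bits)

    def predict(self, pc):
        """Return True if the branch at pc is predicted taken."""
        return self.table[pc & self.mask] >= 2

    def update(self, pc, taken):
        """Move the counter one step toward the actual outcome, saturating at 0 and 3."""
        i = pc & self.mask
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


def accuracy(predictor, trace):
    """Fraction of (pc, taken) pairs in trace that were predicted correctly."""
    correct = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(trace)


def synthetic_trace(n=10000, seed=0):
    """Hypothetical workload: a loop branch taken 9 times in 10, plus a coin-flip branch."""
    rng = random.Random(seed)
    trace = []
    for i in range(n):
        trace.append((0x400, i % 10 != 9))         # loop back-edge
        trace.append((0x404, rng.random() < 0.5))  # data-dependent branch
    return trace


if __name__ == "__main__":
    print(f"accuracy: {accuracy(TwoBitPredictor(), synthetic_trace()):.3f}")
```

Swapping in a different `predict`/`update` pair while keeping `accuracy`
unchanged is one simple way to compare schemes on the same trace.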
OR:
Take any paper in any of the major conferences (ISCA, MICRO, ASPLOS, HPCA, etc.) and
extend, expand, rebut, or verify it. You will need to tell me which
paper you are working on, so that we don't wind up with multiple groups
working on the same paper. The list of papers referenced above
(this one) is a good place to start. There is also a link on the 201A main
web page to the home pages of several of these conferences. The papers are
almost always available online, and if not I have both hard and electronic
copies of ISCA and MICRO papers (and hard copies of most ASPLOS and several
HPCA).
OR:
You may write a survey paper of an area within computer
architecture. These papers should contain:
- A summary of previous work in an area, including extensive
references
- A presentation of opinions of other authors both for and against
various options (again, with references)
- A conclusion containing your opinion of the strengths and
weaknesses of the arguments presented above
Since a survey paper has no creative content and therefore is less
risky than a research project, the survey papers will be expected to meet a
much higher standard (both of completeness and of analysis of the
literature). A survey paper is also an individual project - no team
survey papers will be accepted. You need to read at
least 10 papers on the subject you are surveying.
Here is an example of a survey paper, written by a Masters Degree
student a few years back.