BASIC SYSTEM PRINCIPLES
- List the three categories of computer systems.
- Define computer architecture.
- List the three subcategories of computer architecture.
- Compare and contrast the
three subcategories of computer architecture: instruction set
architecture, microarchitecture, and system architecture.
- List the five classic components of a computer.
- Define processor in the context of the five
classic components of a computer.
- Compare and contrast microprocessors
and microcontrollers.
- Define the stored-program concept.
- Compare and contrast
primary and secondary memory.
- Draw the classic computer architecture
memory hierarchy.
- List the classic memory and storage size prefixes
(KB, MB, GB, etc.) and state how their values differ between memory and storage.
- Draw the Princeton (von Neumann) system
architecture.
- Draw the Harvard system architecture.
- Describe the Princeton bottleneck.
- List the advantages and disadvantages of the Princeton
and Harvard system architectures.
- Compare and contrast
execution time, throughput, CPU time, user CPU time, and system CPU time.
- Calculate the average clocks per instruction (CPI)
given instruction timing information.
- Calculate CPU execution time using IC * CPI * period.
- Compare the performance of different computers by
applying the performance equations to system specifications.
- Comment on the SPEC benchmarks and their importance
in performance analysis.
- Define instruction set architecture.
- List the four categories of instructions.
- Describe the difference between instruction set
architecture and microarchitecture.
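
The CPI and CPU execution time objectives above can be sketched in a few lines of Python. This is a minimal sketch, not the course's own worked example: the function names and the instruction mix are hypothetical, and it assumes the mix is given as (fraction of instructions, cycles per instruction) pairs whose fractions sum to 1.

```python
def average_cpi(mix):
    """Weighted average clocks per instruction.

    mix: list of (fraction_of_instructions, cycles) pairs.
    The fractions are assumed to sum to 1.
    """
    return sum(fraction * cycles for fraction, cycles in mix)


def cpu_time(instruction_count, cpi, clock_period_s):
    """CPU execution time = IC * CPI * period (period in seconds)."""
    return instruction_count * cpi * clock_period_s


# Hypothetical instruction mix: 50% take 1 cycle, 30% take 2, 20% take 3.
example_mix = [(0.5, 1), (0.3, 2), (0.2, 3)]
example_cpi = average_cpi(example_mix)

# Hypothetical program: one million instructions on a 1 GHz clock (1 ns period).
example_time = cpu_time(1_000_000, example_cpi, 1e-9)

print(f"average CPI = {example_cpi:.2f}")
print(f"CPU time    = {example_time * 1e3:.2f} ms")
```

To compare two computers, evaluate cpu_time for each set of specifications and take the ratio of the two results.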
MIPS ARCHITECTURE REVIEW
- State the MIPS magic number.
- State the number of MIPS registers.
- Categorize the MIPS registers into assembly language
groups.
- Compare and contrast saved temporary
and temporary registers.
- List some core MIPS assembly language instructions.
- Write MIPS assembly language instructions.
- List the three MIPS instruction formats.
- Write the MIPS R-format ALU architectural equation.
- Write the MIPS I-format ALU architectural equation.
- Write the MIPS I-format lw architectural equation.
- Write the MIPS I-format sw architectural equation.
- Write the MIPS I-format beq architectural equation.
- Write the MIPS I-format bne architectural equation.
- Draw the R-format instruction binary number showing
the fields and associated bit positions.
- Draw the I-format instruction binary number showing
the fields and associated bit positions.
- Draw the J-format instruction binary number showing
the fields and associated bit positions.
- Compare and contrast
the single-phase and double-phase clocking strategies
(using one edge versus using two edges) for calculation and register update.
- Describe the use of multiplexers to select data flow
in the single-cycle MIPS processor designed in lecture.
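
The R-, I-, and J-format field layouts named in the objectives above can be checked by packing the fields into a 32-bit word. The sketch below assumes the standard MIPS32 layouts (R: op/rs/rt/rd/shamt/funct with widths 6/5/5/5/5/6; I: op/rs/rt/immediate with widths 6/5/5/16; J: op/address with widths 6/26). The function names are hypothetical.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """R-format: op[31:26] rs[25:21] rt[20:16] rd[15:11] shamt[10:6] funct[5:0]."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, imm):
    """I-format: op[31:26] rs[25:21] rt[20:16] imm[15:0].

    The immediate is masked to 16 bits so negative offsets are stored
    in two's complement form.
    """
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_j(op, address):
    """J-format: op[31:26] address[25:0] (a word address, masked to 26 bits)."""
    return (op << 26) | (address & 0x3FFFFFF)


# add $t0, $s1, $s2  ->  op=0, rs=$s1(17), rt=$s2(18), rd=$t0(8), funct=0x20
add_word = encode_r(0, 17, 18, 8, 0, 0x20)

# lw $t0, 32($s3)    ->  op=0x23, rs=$s3(19), rt=$t0(8), imm=32
lw_word = encode_i(0x23, 19, 8, 32)

print(f"add: 0x{add_word:08X}")
print(f"lw:  0x{lw_word:08X}")
```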
PIPELINING REVIEW
- Describe how pipelined microarchitectures
exploit instruction level parallelism to improve throughput.
- State the number of instructions in flight every
clock cycle when a basic MIPS pipeline is full.
- Describe how pipelining improves throughput.
- State the theoretical pipeline speedup (compared to
single-cycle) for a pipeline of length n.
- State why the theoretical pipeline speedup cannot be achieved.
- Use the pipeline speedup equation to calculate
real pipeline speedup based on a stall-cycle mix.
- Describe the use of interstage registers in pipeline
microarchitecture.
- Justify the statement "Pipeline microarchitectures
make very efficient time usage of components."
- List the three types of pipeline hazards.
- List examples of structural hazards.
- State how structural hazards are eliminated when
implementing pipelined microarchitectures.
- Justify the Harvard organization in pipelined
implementations.
- List the two broad categories of data hazards.
- Describe the hazard window for MIPS pipeline
microarchitectures if hazard-protection is not implemented.
- Identify load-use and register-use data hazards
in code segments.
- List the three principal techniques used to remove
data hazards. Describe the advantages and disadvantages
of each technique.
- Describe how the hazard window suggests forwarding
paths for data hazard prevention.
- State the two causes of control hazards.
- Describe the effect of unconditional and conditional
branches on pipeline performance.
- List the two principal techniques used to handle
control hazards.
- Justify advancing jump circuitry into earlier pipeline
stages such as IF or ID.
- Draw pipeline flight plans showing instructions
stalling through branch decision and then flushing after a taken branch.
- Compare and contrast the simple
(predict branch not taken) and complex (statistical prediction) branch
prediction techniques.
- Describe the Lee and Smith study on branch
predictors. Summarize the key result.
- Describe how the Nair study differs from the Lee
and Smith study. Summarize the key result.
- Comment on the success of state-of-the-art branch
predictors.
- Describe the use of branch target buffers and
branch history buffers as more advanced branch prediction techniques.
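
The pipeline speedup objectives above can be explored numerically. This is a sketch under stated assumptions, since the form of the equation used in lecture is not given here: it uses the common textbook form, speedup = n / (1 + average stall cycles per instruction), which assumes balanced stages and an ideal CPI of 1. The function names and the stall mix are hypothetical.

```python
def average_stalls(stall_mix):
    """Average stall cycles per instruction.

    stall_mix: list of (fraction_of_instructions, stall_cycles) pairs
    covering only the instructions that stall.
    """
    return sum(fraction * stalls for fraction, stalls in stall_mix)


def pipeline_speedup(n_stages, stalls_per_instruction=0.0):
    """Speedup over single-cycle, assuming balanced stages and ideal CPI of 1.

    With no stalls this returns the theoretical speedup, n_stages.
    """
    return n_stages / (1.0 + stalls_per_instruction)


# Hypothetical mix for a 5-stage pipeline:
# 15% of instructions suffer a 1-cycle load-use stall,
# 10% suffer a 1-cycle branch stall.
mix = [(0.15, 1), (0.10, 1)]
stalls = average_stalls(mix)

print(f"theoretical speedup = {pipeline_speedup(5):.2f}")
print(f"real speedup        = {pipeline_speedup(5, stalls):.2f}")
```

Raising the stall fractions in the mix shows why the theoretical speedup of n is an upper bound rather than an achievable value.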
SUPERPIPELINING BASICS
- Justify deepening a pipeline.
- State the type of parallelism (spatial, temporal,
or both) exploited by deepening a pipeline.
- State how deepening a pipeline affects
the pipeline speedup equation.
- Explain how deeper pipelines are affected by hazards.
- Explain how forwarding complexity increases in
deeper pipelines.
- List challenges faced by deeper pipelines.
MIPS R4000 SUPERPIPELINED PROCESSOR
- Draw an organizational sketch of the MIPS R4000
processor.
- Draw a pipeline flight plan that shows the stages
of the MIPS R4000 processor using the correct stage names.
- Describe the hazard response of the MIPS R4000
processor.
SUPERSCALAR BASICS
- Justify extending a microarchitecture to superscalar.
- State the type of parallelism (spatial, temporal,
or both) exploited by superscalar processors.
- Describe the challenges that superscalarism
introduces that are not challenges in simple pipelines.
- List and describe compile-time
techniques, such as loop unrolling and predicated instructions, that can
assist in superscalar dispatch.
- List and describe run-time
techniques, such as reservation stations and register renaming, that can
assist in superscalar dispatch.
- State how superscalarism improves on the IPC
when compared to a pipelined processor.
INTEL PENTIUM PROCESSOR
- Draw a basic organizational sketch of the
Intel Pentium processor.
- Draw a pipeline flight plan that shows the stages
of the Intel Pentium processor using the correct stage names.
- Describe the hazard response of the Intel Pentium
processor.
- List the four requirements for Intel Pentium
dual-issue.
- Describe the forwarding mechanism of the Intel
Pentium processor.
- Draw the Intel Pentium dynamic branch prediction
state machine.
- Determine the prediction made by the Intel Pentium
dynamic branch prediction state machine for a given code sequence.
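
Tracing a prediction state machine over a branch outcome sequence, as the last objective above asks, can be rehearsed with a small simulator. The sketch below models a generic 2-bit saturating counter (states 0 to 3, predict taken in states 2 and 3). That is an assumption for illustration: the state diagram drawn in lecture for the Intel Pentium may differ in its transitions or initial state, so adjust the code to match it. All names are hypothetical.

```python
def predict_sequence(outcomes, initial_state=0):
    """Simulate a generic 2-bit saturating-counter branch predictor.

    outcomes: list of booleans, True if the branch was actually taken.
    initial_state: 0 (strongly not taken) through 3 (strongly taken).
    Returns the list of predictions (True = predict taken), one per outcome.
    """
    state = initial_state
    predictions = []
    for taken in outcomes:
        # Predict taken in the two upper states.
        predictions.append(state >= 2)
        # Move one step toward the actual outcome, saturating at 0 and 3.
        if taken:
            state = min(state + 1, 3)
        else:
            state = max(state - 1, 0)
    return predictions


# Hypothetical outcome history for one branch: T, T, N, T, T
history = [True, True, False, True, True]
guesses = predict_sequence(history)
correct = sum(g == h for g, h in zip(guesses, history))

print("predictions:", ["T" if g else "N" for g in guesses])
print(f"correct: {correct} of {len(history)}")
```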
MOTOROLA 88110 PROCESSOR
- Draw a basic organizational sketch of the
Motorola 88110 processor.
- Draw a pipeline flight plan that shows the stages
of the Motorola 88110 processor using the correct stage names.
- Describe the hazard response of the Motorola 88110
processor.
- Describe the Motorola 88110 dispatch algorithm.
- State why the Motorola 88110 includes two integer
units.
- Summarize the types of calculations completed by each
of the ten functional pipes in the Motorola 88110 microprocessor.
- State the cycle delay for each of the ten
functional pipes in the Motorola 88110 microprocessor.
- Describe the use of the history buffer in the
Motorola 88110 microprocessor. Hint: Have you searched for an 88110
paper in IEEE Xplore?
- State how the Motorola 88110 handles temporal ordering
of writes to the register files.
PENTIUM PRO CASE STUDY
- Place the Pentium Pro into historical context
by describing the state of the industry and its competition.
- Describe why the Pentium Pro was revolutionary
rather than simply evolutionary.
- Describe architectural key points for the Pentium Pro.
What makes it interesting to advanced superscalar architects?
- Sketch the basic organizational diagram for the
Pentium Pro.
- Draw the Pentium Pro flight plan diagram.
- Describe the Pentium Pro fetch-decode process.
- Describe the basic instruction format for a u-op.
- Summarize the u-op translation process.
- Translate reg-mem or mem-mem x86 instructions to
example u-ops.
- State the average number of u-ops per x86 instruction.
- State the primary purpose of the reorder buffer.
- Describe what gets allocated for each instruction
in the reorder buffer.
- Summarize the register renaming process implemented
inside the reorder buffer.
- State the primary purpose of the reservation
station.
- Describe the dispatch process out of the reservation
station.
- Compare and contrast the Pentium
Pro main and secondary arithmetic units.
- Describe the dual-issue limitation imposed by the
Pentium Pro secondary arithmetic unit.
- Summarize the Pentium Pro integer unit performance.
Compare it to the P5 architecture.
- List the three result types that can
potentially return to the reorder buffer on each clock cycle.
- Describe the instruction events that occur as
instruction results enter the reorder buffer.
- Summarize the Pentium Pro cache behavior.
- State the best-case Pentium Pro instruction flight
time.
- State the average-case Pentium Pro instruction
flight time.
- Describe why the Pentium Pro has an average-case
flight time rather than a fixed-case flight time.
- Justify the use of advanced control hazard strategies
on the Pentium Pro.
- Summarize the conditional move instruction category.
- Describe how conditional moves help eliminate
specific types of control hazards.
- Summarize example static and dynamic branch prediction
strategies.
- State the success rate of the APNT strategy for
static branch prediction.
- State the success rate of the BTFNT strategy for
static branch prediction.
- Compare and contrast the Lee and Yeh
approaches to dynamic branch prediction.
- Justify the statement "The Yeh approach tracks
patterns of branches and not just a specific branch."
- Summarize the Pentium Pro branch prediction approach
and the success rate.
MULTIPROCESSING AND MULTICORE PROCESSORS
- State reasons why drop-in cores began
emerging as a strong design theme in 21st-century processors.
- Compare and contrast symmetric and
asymmetric (distributed memory) multiprocessors.
- Describe cluster computing.
- List the four categories in the Flynn taxonomy for
multiprocessor systems.
- Describe each of the four categories in the Flynn
taxonomy for multiprocessor systems.
- Give examples
of each of the four categories in the Flynn taxonomy for multiprocessor
systems. For example, the classic PC of the 1990s is which
type?
- Sketch a basic organizational diagram for each of
the four categories in the Flynn taxonomy for multiprocessor systems.
- Describe the two subcategories of MIMD systems.
- State the design goal set by the company consortium
that developed the multicore processor case-study examined in lecture
(Cell).
- Describe the basic organizational features of the
Cell multiprocessor.
- Sketch a basic organizational diagram of the Cell
multiprocessor.
- List key architectural characteristics and strategies
implemented by the Cell multiprocessor Power Processing Element (PPE).
- List key architectural characteristics and strategies
implemented by the Cell multiprocessor synergistic processor elements (SPE).
- Describe how the Cell multiprocessor SPEs implement
SIMD mathematics by varying bitwidths.
- Describe vector processors
including their history, Flynn taxonomy, and places where they have
been most successful in the marketplace.
- Describe graphics processing units (GPUs)
including their history, Flynn taxonomy, and their success in today's
marketplace.
- Describe VLIW processors
including their history, Flynn taxonomy, and their success rate in the
marketplace.
- Describe current trends in microarchitecture
including technology greening, the return of hyperthreading, the
asymptotic leveling of core count, and licensable intellectual
property (IP) cores.