
ECE 463/521

Overview
Prof. Eric Rotenberg
Computer Architecture & Systems
Computer Architecture
Processor Architecture (CPU, microprocessor)

Hard:
Correct & Fast CPU

Easy:
Correct CPU
Simple Processor Pipeline
[Figure: pipeline diagram with instruction 1 in flight. Components: Register File; pipeline stages IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback); Memory (DRAM & Disk).]


Invention #1 Pipelining
[Figure: pipeline diagram with instructions 1 to 6 in flight. Components: Register File; pipeline stages IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback); Memory (DRAM & Disk).]
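A minimal sketch of why pipelining helps, assuming an idealized machine in which every stage takes one cycle and nothing ever stalls. The function names are illustrative only and do not come from the slides.

```python
def unpipelined_cycles(num_instructions: int, num_stages: int = 5) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int = 5) -> int:
    """The first instruction fills the pipeline; each later one completes one cycle after its predecessor."""
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)


# Six instructions on the five-stage pipeline (assumed ideal, no stalls).
print(unpipelined_cycles(6))  # 30
print(pipelined_cycles(6))    # 10
```

For long instruction streams the ideal pipelined machine approaches one completed instruction per cycle; the problems on the following slides are the reasons real pipelines fall short of this.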


Problem: Dependent Instructions
[Figure: pipeline diagram with instructions 1 to 6 in flight. Components: Register File; pipeline stages IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback); Memory (DRAM & Disk).]


Invention #2 Data Bypasses
[Figure: pipeline diagram with instructions 1 to 6 in flight, annotated "Flea-Flicker!". Components: Register File; pipeline stages IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback); Memory (DRAM & Disk).]
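A rough sketch of what a bypass buys for a dependent pair of ALU instructions. It rests on common textbook assumptions that the slides do not state: registers are read in ID and written in WB, an ALU result exists at the end of EX, and (optionally) the register file can be written and then read within the same cycle. Loads are not modeled, and `stall_cycles` is a hypothetical helper name.

```python
def stall_cycles(distance: int, bypass: bool, same_cycle_rf: bool = True) -> int:
    """Stall cycles for a consumer issued `distance` instructions after an ALU producer.

    Stage positions in the assumed five-stage pipeline: IF=0, ID=1, EX=2, MEM=3, WB=4.
    """
    if bypass:
        # The result is forwarded from the end of the producer's EX
        # straight into the consumer's EX, so a gap of one cycle suffices.
        needed_gap = 1
    else:
        # The consumer's ID (stage 1) must not come before the producer's WB (stage 4).
        # If the register file cannot be written and read in one cycle, add a cycle.
        needed_gap = 3 if same_cycle_rf else 4
    return max(0, needed_gap - distance)


print(stall_cycles(1, bypass=False))  # 2
print(stall_cycles(1, bypass=True))   # 0
```

Under these assumptions, back-to-back dependent ALU instructions lose two cycles without bypasses and none with them.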


Problem: Branch Decisions
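[Figure: pipeline diagram with instruction 1 in flight and a "?" marking two candidates for instruction 2 at instr. fetch. Components: Register File; pipeline stages IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback); Memory (DRAM & Disk).]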


Invention #3 Branch Prediction
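[Figure: pipeline diagram with a Branch Predictor added and instructions 1 to 4 in flight; the "?" and the two candidates for instruction 2 at instr. fetch are still shown. Components: Register File; Branch Predictor; pipeline stages IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback); Memory (DRAM & Disk).]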
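The slide does not say which prediction scheme is used. As one possible illustration, here is a sketch of a widely taught scheme, a table of 2-bit saturating counters indexed by the branch address. The table size, the 4-byte instruction assumption, and all names are hypothetical.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters: 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self, num_entries: int = 16):
        self.num_entries = num_entries
        self.counters = [2] * num_entries  # start in the "weakly taken" state

    def _index(self, pc: int) -> int:
        return (pc >> 2) % self.num_entries  # assumes 4-byte instructions

    def predict(self, pc: int) -> bool:
        return self.counters[self._index(pc)] >= 2

    def update(self, pc: int, taken: bool) -> None:
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def accuracy(outcomes, pc: int = 0x400) -> float:
    """Fraction of correct predictions for one branch with the given outcome history."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(outcomes)


# A loop branch: taken 9 times, then not taken once, repeated 10 times.
loop_branch = ([True] * 9 + [False]) * 10
print(accuracy(loop_branch))  # 0.9
```

The only mispredictions in this example are the loop exits; the 2-bit counter keeps predicting "taken" when the loop is re-entered.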


Problem: “Memory Wall”
[Figure: pipeline diagram. Components: Register File; Branch Predictor; pipeline stages IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback); Memory (DRAM & Disk).]


Invention #4 Caches
[Figure: pipeline diagram with caches added. Components: Register File; Branch Predictor; pipeline stages IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback); Instr. Cache; Data Cache; Memory (DRAM & Disk).]


Caches (cont.)
• Locality of reference
– Temporal locality: if you access an item, you are likely to access it again in the near future
– Spatial locality: if you access an item, you are likely to access a nearby item in the near future
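To make the two kinds of locality concrete, here is a small sketch of a direct-mapped cache model. The organization (64 blocks of 16 bytes) and the function name `hit_rate` are assumptions chosen for illustration, not parameters from the course.

```python
def hit_rate(addresses, num_blocks: int = 64, block_size: int = 16) -> float:
    """Hit rate of a direct-mapped cache over a sequence of byte addresses."""
    tags = [None] * num_blocks           # one tag per cache block; None means empty
    hits = 0
    for addr in addresses:
        block_addr = addr // block_size  # which memory block holds this byte
        index = block_addr % num_blocks  # cache block that memory block maps to
        tag = block_addr // num_blocks   # identifies which memory block is resident
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag            # miss: bring the block into the cache
    return hits / len(addresses)


# Spatial locality: consecutive 4-byte words, four per 16-byte block.
print(hit_rate(range(0, 4096, 4)))             # 0.75
# Temporal locality: the same 256 bytes swept ten times.
print(hit_rate(list(range(0, 256, 4)) * 10))   # 0.975
# No locality: every access maps to the same cache block with a new tag.
print(hit_rate(range(0, 1024 * 100, 1024)))    # 0.0
```

In the first case each miss brings in a block whose remaining three words then hit; in the second, only the first sweep misses.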
Problem: Stalled Instructions
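[Figure: pipeline diagram with instructions 1 to 4 in flight and a "cache miss" label. Components: Register File; Branch Predictor; pipeline stages IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback); Instr. Cache; Data Cache; Memory (DRAM & Disk).]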


Invention #5 Out-of-Order Execution
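[Figure: pipeline diagram with a Dynamic Scheduler added, instructions 1 to 7 in flight, and a "cache miss" label. Components: Register File; Branch Predictor; pipeline stages IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback); Dynamic Scheduler; Instr. Cache; Data Cache; Memory (DRAM & Disk).]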
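A toy model of the idea, not the scheduler used in the course project: one instruction may issue per cycle, destination registers are assumed unique (as if already renamed), and in-order retirement is ignored. The instruction sequence and the 10-cycle miss latency are made up for illustration.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    dest: str              # register written (assumed unique, i.e. already renamed)
    srcs: Tuple[str, ...]  # registers read
    latency: int           # cycles from issue until the result is available


def total_cycles(program: List[Instr], out_of_order: bool) -> int:
    """Cycle at which the last result is available, issuing at most one instruction per cycle."""
    ready_at = {}            # register -> cycle its value becomes available
    pending = list(program)  # not-yet-issued instructions, oldest first
    cycle = 0
    finish = 0
    while pending:
        # In-order issue may only consider the oldest pending instruction.
        window = len(pending) if out_of_order else 1
        for pos in range(window):
            instr = pending[pos]
            older = pending[:pos]
            waits_on_older = any(o.dest in instr.srcs for o in older)
            operands_ready = all(ready_at.get(s, 0) <= cycle for s in instr.srcs)
            if not waits_on_older and operands_ready:
                ready_at[instr.dest] = cycle + instr.latency
                finish = max(finish, cycle + instr.latency)
                del pending[pos]
                break        # at most one issue per cycle
        cycle += 1
    return finish


program = [
    Instr("r1", ("r9",), 10),  # load that misses in the cache (hypothetical latency)
    Instr("r2", ("r1",), 1),   # depends on the load
    Instr("r3", ("r4",), 1),   # independent of the load
    Instr("r5", ("r6",), 1),   # independent of the load
]
print(total_cycles(program, out_of_order=False))  # 13
print(total_cycles(program, out_of_order=True))   # 11
```

With in-order issue the two independent instructions wait behind the stalled one; with out-of-order issue they execute during the miss.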


Superscalar Execution
[Figure: pipeline diagram with instructions 1 to 9 in flight. Components: Register File; Branch Predictor; pipeline stages IF (instr. fetch), ID (instr. decode), EX (execute), MEM (memory), WB (writeback); Dynamic Scheduler; Instr. Cache; Data Cache; Memory (DRAM & Disk).]
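As a back-of-the-envelope sketch, the ideal cycle count for a superscalar pipeline can be estimated by letting several instructions advance through each stage together. The width of 3 is an assumption read off the nine instructions in the figure; the model ignores all dependences and stalls, and the function name is illustrative.

```python
import math


def superscalar_cycles(num_instructions: int, width: int = 3, num_stages: int = 5) -> int:
    """Ideal cycles when up to `width` instructions move through each stage together."""
    if num_instructions == 0:
        return 0
    groups = math.ceil(num_instructions / width)  # number of issue groups
    return num_stages + (groups - 1)


print(superscalar_cycles(9, width=3))  # 7
print(superscalar_cycles(9, width=1))  # 13
```

Real programs rarely sustain the full width, because dependent instructions cannot share an issue group.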


Deep Pipelining
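[Figure: pipeline diagram with each stage split in two. Components: Register File; Branch Predictor; pipeline stages IF1, IF2, ID1, ID2, EX1, EX2, M1, M2, W1, W2; Dynamic Scheduler; Instr. Cache; Data Cache; Memory (DRAM & Disk).]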


[Figure: processor chip with labeled regions: BRANCH PREDICTION; L1 Instr. Cache; OOO EXECUTION SUPPORT; R.F. BYPASSES; L1 Data Cache.]
Computer System

Application
Operating System
Compiler, Firmware
Instruction Set Architecture
Instr. Set Proc., I/O system
Datapath & Control
Digital Design
Circuit Design
Layout
What is Computer Architecture?
Computer Architecture = Instruction Set Architecture + Machine Organization

Instruction Set Architecture:
-- Organization of Programmable Storage (Registers, Memory)
-- Data Types: Encodings & Representations
-- Instruction Set
-- Instruction Formats
-- Modes of Addressing and Accessing Data and Instructions
-- Exceptional Conditions

Machine Organization:
-- Capabilities & Performance Characteristics of Principal Functional Units (FUs) (e.g., Registers, ALUs, Shifters, Logic Units, ...)
-- Ways in which these components are interconnected
-- Information flows between components
-- Logic and means by which such information flow is controlled
-- Choreography of FUs to realize the ISA
-- Register Transfer Level (RTL) Description
Role of Architecture
• Responsible for hardware specification:
– Instruction set design
• Responsible for hardware implementation:
– Microarchitecture design
• Interacts with everyone
– Mainly compiler and logic/circuit level designers
• Cannot do a good job without knowledge of both sides
Overview of Topics in 463/521
1. Measuring Performance and Cost
2. Caches and Memory Hierarchies
3. Instruction-Set Architecture (ISA)
– Defines software/hardware interface
4. Simple Pipelining
– Data and control (branch) dependences
– Data bypasses
– Branch prediction
Overview of Topics in 463/521 (cont.)
5. Complex Pipelining and Instruction-Level Parallelism (ILP)
– Data hazards
– Dynamic instruction scheduling, register renaming, Tomasulo’s algorithm
– Precise Interrupts
– Superscalar, VLIW, and vector processors
Projects
• Three projects
– Cache simulator
– Branch predictor simulator
– Dynamic instruction scheduling pipeline simulator
• Programming for projects is harder than anything many of you have encountered before
Course Grading
• Breakdown
– 40% projects
– 10% homeworks (approx. 4 to 6 homeworks)
– 25% Midterm
• Covers Performance/Cost, Caches
– 25% Final
• Covers ISA, Simple & Complex Pipelining, ILP
Course Web Page
• Two identical pages
– www.courses.ncsu.edu/ece521
– www.courses.ncsu.edu/ece463
• Contains contact info., office hours, syllabus, schedule, assignment/project handouts, etc.
• Check frequently
