Computer Architecture and Organization
Lecture 4
CPU Structure and Function
Lecture Objective :-
After this lecture you will understand:
Processor organization
Register Roles
Register design issues
Instruction cycle with indirect addressing
Processor organization
CPU must:
Fetch instructions
Interpret instructions
Fetch data
Process data
Write data
Processor Organization
To do these things, the CPU needs a small internal memory: registers
To perform computation: ALU
To control the movement of data and the operation of the ALU: control unit
For data transfer and control logic: internal bus
Processor Organization
Internal structure of the CPU
Register Organization
CPU must have some working space (temporary storage), called registers
Number and function vary between processor designs
One of the major design decisions
Top level of memory hierarchy
CPU registers perform two roles:
User visible registers: enable the programmer to minimize main memory references by optimizing register use
Control and status registers: used by control unit to
control operation of the processor
Register Organization
General Purpose Registers:
Can be assigned to variety of functions by the
programmer
May be truly general purpose
Sometimes any general-purpose register can contain the
operand for any opcode.
May be restricted
May be used for data or addressing
In some cases, general-purpose registers can be used for addressing functions (e.g., register indirect, displacement).
In other cases, there is a partial or clean separation between data registers and address registers.
Register Organization
Condition codes:
Hold condition codes (also referred to as flags)
Bits set by processor hardware as result of operations
E.g. result of last operation was zero
Can be read (implicitly) by programs
e.g. Jump if zero
Cannot (usually) be set by programs
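The behaviour of condition codes can be sketched in a few lines of Python. This is an illustrative model only: the flag names Z, N and C, the 8-bit width, and the functions `alu_sub` and `jump_if_zero` are assumptions for the example, not part of any particular processor.

```python
def alu_sub(a, b, bits=8):
    """Return (result, flags) for a - b on a register `bits` wide.

    The flags are set by the "hardware" as a side effect of the operation.
    """
    mask = (1 << bits) - 1
    result = (a - b) & mask
    flags = {
        "Z": result == 0,                 # zero: result of last operation was zero
        "N": bool(result >> (bits - 1)),  # negative: top bit of the result
        "C": a < b,                       # borrow out of the top bit
    }
    return result, flags


def jump_if_zero(pc, target, flags):
    """Conditional branch: reads the Z flag implicitly, never writes it."""
    return target if flags["Z"] else pc + 1


result, flags = alu_sub(5, 5)
next_pc = jump_if_zero(pc=10, target=40, flags=flags)
```

Here the program never sets a flag directly; it only branches on what the ALU left behind.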
Register Organization
Design issues:
General purpose or specialized use:
Make them general purpose
Increase flexibility and programmer options
Increase instruction size & complexity
Make them specialized
Smaller (faster) instructions
Less flexibility
Register Organization
Number of registers:
Between 8 and 32 is generally considered optimum
Fewer registers = more memory references
More registers do not noticeably reduce memory references and take up processor real estate
Register length:
Large enough to hold full address
Large enough to hold full word
Often possible to combine two data registers to hold a double-length value
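As a rough illustration of combining two data registers, the sketch below treats two 16-bit registers as one 32-bit value. The 16-bit width and the helper names `combine` and `split` are assumptions made for the example.

```python
def combine(high, low, bits=16):
    """Treat two `bits`-wide registers as one double-length value."""
    mask = (1 << bits) - 1
    return ((high & mask) << bits) | (low & mask)


def split(value, bits=16):
    """Split a double-length value back into (high, low) register contents."""
    mask = (1 << bits) - 1
    return (value >> bits) & mask, value & mask


double_word = combine(0x1234, 0x5678)
high, low = split(double_word)
```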
Instruction Cycle
To recall, an instruction cycle includes the
following stages:
Fetch: Read the next instruction from memory into
the processor.
Execute: Interpret the opcode and perform the
indicated operation.
Interrupt: If interrupts are enabled and an interrupt
has occurred, save the current process state and
service the interrupt.
The Indirect Cycle
The execution of an instruction may involve one or more operands in memory, each of which requires a memory access.
Further, if indirect addressing is used, then additional memory accesses are required.
Data Flow (Instruction Fetch)
The exact sequence of events during an instruction cycle
depends on the design of the processor.
In general:
Fetch
PC contains address of next instruction
Address moved to MAR
Address placed on address bus
Control unit requests memory read
Result placed on data bus, copied to MBR, then to IR
Meanwhile PC incremented by 1
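The fetch steps above can be sketched as register transfers. This is a minimal model, assuming a word-addressed memory held in a Python dict and instructions stored as strings; the dict layout and the function name `fetch` are hypothetical.

```python
# Hypothetical memory contents: one instruction per word.
memory = {0: "LOAD 100", 1: "ADD 101", 2: "STORE 102"}
regs = {"PC": 0, "MAR": None, "MBR": None, "IR": None}


def fetch(regs, memory):
    """One instruction fetch: PC -> MAR -> memory read -> MBR -> IR."""
    regs["MAR"] = regs["PC"]            # address moved to MAR (and onto the address bus)
    regs["MBR"] = memory[regs["MAR"]]   # memory read: data bus contents copied to MBR
    regs["PC"] += 1                     # PC incremented (overlaps the read on real hardware)
    regs["IR"] = regs["MBR"]            # instruction moved from MBR to IR
    return regs["IR"]


first = fetch(regs, memory)
second = fetch(regs, memory)
```

After two fetches, PC holds 2 and IR holds the second instruction.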
Data Flow (Indirect Fetch)
IR is examined
If indirect addressing, indirect cycle is performed
Rightmost N bits of MBR (the address reference) transferred to MAR
Control unit requests memory read
Result (address of operand) moved to MBR
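A minimal sketch of the indirect cycle, assuming an instruction word whose rightmost N = 12 bits hold the address field. The field width, the opcode value, and the memory contents are illustrative assumptions.

```python
ADDR_BITS = 12                        # assumed width N of the address field
ADDR_MASK = (1 << ADDR_BITS) - 1


def indirect_cycle(regs, memory):
    """Replace the address reference in MBR with the operand's actual address."""
    regs["MAR"] = regs["MBR"] & ADDR_MASK   # rightmost N bits of MBR -> MAR
    regs["MBR"] = memory[regs["MAR"]]       # memory read: address of operand -> MBR
    return regs["MBR"]


# Location 0x020 holds the address of the operand (0x300), not the operand itself.
memory = {0x020: 0x300, 0x300: 42}
# MBR holds the fetched instruction: hypothetical opcode 0x1, address field 0x020.
regs = {"MAR": None, "MBR": (0x1 << ADDR_BITS) | 0x020}

operand_address = indirect_cycle(regs, memory)
operand = memory[operand_address]     # the extra memory access needed to reach the operand
```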
Data Flow (Execute)
May take many forms
Depends on instruction being executed
May include
Memory read/write
Input/output
Register transfers
ALU operations
Data Flow (Interrupt)
Data Flow (Interrupt Diagram)
Questions?
Next:
Pipelining
Computer Architecture and
Organization
Lecture 5
Instruction Pipelining
Lecture Objective :-
Instruction Pipelining
Consider subdividing instruction processing into two stages: fetch and execute
During the execution of an instruction there are times when main memory is not being accessed; this time can be used to fetch the next instruction in parallel with the execution of the current one
While the second stage is executing, the first stage fetches and buffers the next instruction
This is called instruction prefetch or fetch overlap
This will speed up instruction execution
Instruction Pipelining
Two stage instruction pipeline
Instruction Pipelining
Improved Performance
But not doubled: why?
The execution time will generally be longer than the fetch time, so the fetch stage must wait before it can empty its buffer
A conditional branch instruction makes the address of the next instruction to be fetched unknown. Thus, the fetch stage must wait until it receives the next instruction address from the execute stage.
To gain further speed up, the pipeline must have more stages.
In principle, the greater the number of stages in a pipeline, the faster the execution rate.
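A small timing model shows why the two-stage pipeline does not double performance when execution takes longer than fetch. This is a sketch under stated assumptions: no branches, the fetch stage prefetches back to back, and the stage times (1 and 2 units) are invented for illustration.

```python
def two_stage_time(n, fetch, execute):
    """Total time for n instructions on a two-stage (fetch/execute) pipeline."""
    fetch_done = 0
    exec_done = 0
    for _ in range(n):
        fetch_done += fetch                       # fetch stage works back to back
        exec_start = max(fetch_done, exec_done)   # needs the instruction and a free execute stage
        exec_done = exec_start + execute
    return exec_done


n, fetch, execute = 10, 1, 2
sequential = n * (fetch + execute)                # no overlap at all
pipelined = two_stage_time(n, fetch, execute)
speedup = sequential / pipelined
```

With these numbers the sequential time is 30 units and the pipelined time is 21 units, a speed up of about 1.43 rather than 2. Only when fetch and execute take equal time does the ratio approach 2 for large n.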
Instruction Pipelining
Decomposition of instruction processing:
Fetch instruction (FI): read the next expected instruction into a buffer
Decode instruction (DI): determine the opcode and the operand specifiers
Calculate operands (CO): calculate the effective address of each source operand
Fetch operands (FO): fetch each operand from memory
Execute instruction (EI): perform the indicated operation and store the result
Write operand (WO): store the result in memory
Six Stage Instruction Pipeline
A six-stage pipeline can reduce the execution time for 9 instructions from 54 time units to 14 time units.
The diagram assumes that:
Each instruction goes through all six stages of the pipeline.
All of the stages can be performed in parallel.
There are no memory conflicts.
For example, the FI, FO, and WO stages each involve a memory access; the diagram assumes these accesses can occur simultaneously.
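The timing diagram for the ideal six-stage pipeline can be regenerated with a short script. It is a sketch under the same assumptions as the diagram (every instruction uses all six stages, no stalls, no memory conflicts); the function name `timing_diagram` is hypothetical.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]


def timing_diagram(n, stages=STAGES):
    """Rows are instructions, columns are time units; ideal pipeline, no stalls."""
    k = len(stages)
    total = k + (n - 1)
    rows = []
    for i in range(n):
        row = ["  "] * total
        for s, name in enumerate(stages):
            row[i + s] = name          # instruction i occupies stage s at time unit i + s
        rows.append(row)
    return rows


rows = timing_diagram(9)
for i, row in enumerate(rows, start=1):
    print(f"I{i}: " + " ".join(row))

pipelined_time = len(rows[0])          # time units with pipelining
sequential_time = 9 * len(STAGES)      # time units without pipelining
```

Each row is shifted one time unit to the right of the previous one, which is where the 14 versus 54 time units comes from.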
Instruction Pipelining
The effect of conditional branch
Logic to account for branches and interrupts:
Alternative Pipeline Depiction
Pipeline Performance
Some of the IBM S/360 designers pointed out two factors
that frustrate this seemingly simple pattern for high
performance design.
1. At each stage of the pipeline, there is some overhead
involved in moving data from buffer to buffer
2. The amount of control logic required to handle memory
and register dependencies and to optimize the use of the
pipeline increases enormously with the number of stages.
Pipeline Performance
$\tau = \max_i[\tau_i] + d = \tau_m + d, \quad 1 \le i \le k$
Where:
τ = cycle time of the pipeline
τi = time delay of the circuitry in the ith stage
τm = maximum stage delay (delay through the stage that experiences the largest delay)
k = number of stages in the instruction pipeline
d = time delay of a latch, needed to advance signals and data from one stage to the next
In general, the time delay d is equivalent to a clock pulse and τm >> d
Pipeline Performance
Now, suppose n instructions are processed, with NO
branches:
Total time required for a pipeline with k stages:
$T_{k,n} = [k + (n - 1)]\tau$
First instruction: – requires k cycles to complete
Remaining (n – 1) instructions: – require n – 1 cycles
E.g. The ninth instruction completes at time cycle 14:
14 = [6 + (9 – 1)]
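The cycle count can be checked numerically. The sketch below also computes a speed up factor by comparing against n·k cycles for a non-pipelined processor; that comparison is a standard textbook assumption added here, not something stated on the slide, and the function names are hypothetical.

```python
def pipeline_cycles(k, n):
    """Cycles to complete n instructions on a k-stage pipeline with no branches."""
    return k + (n - 1)


def total_time(k, n, tau):
    """Total time: number of cycles times the cycle time tau."""
    return pipeline_cycles(k, n) * tau


def speedup(k, n):
    """Assumed comparison: a non-pipelined processor needs n * k cycles."""
    return (n * k) / pipeline_cycles(k, n)


cycles = pipeline_cycles(k=6, n=9)
ratio = speedup(k=6, n=9)
```

For k = 6 and n = 9 this gives 14 cycles, matching the example, and a speed up of 54/14. As n grows, the speed up approaches k.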
Pipeline Performance
Consider a pipeline having 4 phases with duration 60, 50, 90 and 80 ns.
Given latch delay is 10 ns. Calculate-
Pipeline cycle time
Non-pipeline execution time
Speed up ratio
Solution
Four stage pipeline is used
Delay of stages = 60, 50, 90 and 80 ns
Latch delay or delay due to each register = 10 ns
Part-01: Pipeline Cycle Time-
Cycle time
= Maximum delay due to any stage + Delay due to its register
= Max { 60, 50, 90, 80 } + 10 ns
= 90 ns + 10 ns
= 100 ns
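The remaining two parts of the exercise can be worked the same way. The sketch below is one possible completion, not the original solution: it assumes that non-pipelined execution of one instruction takes the sum of the stage delays with no latch delay, and that the speed up ratio compares that time with the pipeline cycle time.

```python
stage_delays_ns = [60, 50, 90, 80]
latch_delay_ns = 10

# Part 1: pipeline cycle time = slowest stage + latch delay
cycle_time_ns = max(stage_delays_ns) + latch_delay_ns

# Part 2 (assumption): non-pipelined time = sum of stage delays, no latches
non_pipelined_ns = sum(stage_delays_ns)

# Part 3 (assumption): speed up = non-pipelined time / pipeline cycle time
speedup_ratio = non_pipelined_ns / cycle_time_ns

print(cycle_time_ns, non_pipelined_ns, speedup_ratio)
```

Under these assumptions the non-pipelined time is 280 ns and the speed up ratio is 2.8.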
Pipeline Hazards
2. Data Hazard: occurs when there is a conflict in the
access of an operand location.
Two instructions in a program are to be executed in
sequence and both access a particular memory or register
operand.
Example: ADD A, B ; A=A+B
SUB C, A ; C=C-A
To maintain correct operation, the pipeline must stall for
two clock cycles.
Pipeline Hazards
[Figure: pipeline timing diagram for ADD A, B; SUB C, A; I3; I4, illustrating the data hazard]
The ADD instruction does not update register A until the end of stage
5, which occurs at clock cycle 5.
But the SUB instruction needs that value at the beginning of its stage
3, which occurs at clock cycle 4.
To maintain correct operation, the pipeline must stall for two clock cycles.
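The two-cycle stall can be derived from the stage positions. This sketch assumes a five-stage pipeline with stages named FI, DI, FO, EI, WO, that the result is written at the end of the fifth stage and read at the start of the third, and that there is no forwarding; the stage names and the function `stall_cycles` are illustrative assumptions.

```python
STAGES = ["FI", "DI", "FO", "EI", "WO"]   # assumed five-stage pipeline


def stall_cycles(producer_issue, consumer_issue,
                 write_stage="WO", read_stage="FO", stages=STAGES):
    """Cycles the consumer must wait so that its read follows the producer's write.

    Issue times are the 1-based clock cycles in which each instruction
    enters the first stage. No forwarding is assumed.
    """
    write_cycle = producer_issue + stages.index(write_stage)  # value ready at END of this cycle
    read_cycle = consumer_issue + stages.index(read_stage)    # value needed at START of this cycle
    return max(0, write_cycle + 1 - read_cycle)


# ADD A, B enters the pipeline in cycle 1; SUB C, A enters in cycle 2.
stalls = stall_cycles(producer_issue=1, consumer_issue=2)
```

ADD writes A at the end of cycle 5, SUB would read it at the start of cycle 4, so SUB has to be delayed until cycle 6: a stall of two cycles.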
Pipeline Hazards