
Chapter 2: CPU Architecture

 Contents
 Basic Operational Concepts
 Organization of ALU
 Stack Organization
 Instruction Execution
 Instruction Cycle
 Pipelining
 Addressing Modes
 Instruction Formats
 Hardwired and microprogrammed control unit
 RISC vs CISC

Organization of ALU
Let’s see the CPU first

(Figures: CPU block diagrams)
Cont.…
 The basic function of a CPU is to:
◦ Fetch,
◦ Decode, and
◦ Execute instructions held in ROM or RAM.
 This is the process by which a computer retrieves a program instruction from its memory, determines what actions the instruction requires, and carries out those actions.
 The whole process is done in an instruction cycle (sometimes called
fetch-decode-execute (FDX))

Cont.…
 Hence, the instruction cycle is the amount of time needed to fetch, decode and execute a single instruction.
 In simpler CPUs, the instruction cycle is executed sequentially:
◦ each instruction is completely processed before the next one is
started.
 In most modern CPUs, the instruction cycle is instead executed
concurrently in parallel, as an instruction pipeline:
◦ the next instruction starts being processed before the previous
instruction is finished
 Typically, clock signals are generated by a quartz crystal, which generates a constant signal wave while power is applied.

Cont.…
(Figure: the instruction cycle)
Cont.…
The circuits used in the CPU during the cycle are:
 Program counter (PC) - an incrementing counter that keeps track of the memory address of the instruction that is to be executed next.
 Memory address register (MAR) - holds the address of a memory block to be read from or written to.
 Memory data register (MDR) - a two-way register that holds data fetched from memory (and ready for the CPU to process) or data waiting to be stored in memory.
 Instruction register (IR) - a temporary holding ground for the instruction that has just been fetched from memory.
 Control unit (CU) - decodes the program instruction in the IR,
selecting machine resources such as a data source register and a
particular arithmetic operation, and coordinates activation of those
resources
 Arithmetic logic unit (ALU) - performs mathematical and logical
operations.
◦ Hence the ALU is the basic part of the computer system that performs
mathematical and logical operations in every instruction cycle.
Cont.…
◦ For example, let's see a one-bit ALU that performs arithmetic and logical operations.
Block diagram of 1-bit ALU
(Figure: block diagram of a 1-bit ALU)
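As a rough illustration (not part of the original slides), here is a minimal Python sketch of what a 1-bit ALU does; the operation names AND, OR and ADD and the function name one_bit_alu are assumptions made only for this example.

# Minimal sketch of a 1-bit ALU (illustrative only, not the slide's exact design).
# Inputs a, b and carry_in are single bits (0 or 1); 'op' selects the function.
def one_bit_alu(a, b, carry_in, op):
    """Return (result, carry_out) for one 1-bit ALU operation."""
    if op == "AND":
        return a & b, 0
    if op == "OR":
        return a | b, 0
    if op == "ADD":                      # full adder: sum bit and carry out
        total = a + b + carry_in
        return total & 1, total >> 1
    raise ValueError("unsupported op")

print(one_bit_alu(1, 1, 0, "ADD"))       # (0, 1): 1 + 1 = 0 with carry out 1
print(one_bit_alu(1, 0, 0, "AND"))       # (1, 0)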
Steps in Program Execution (Summary)
 Fetch the instruction (address in the Program Counter, PC)
 Increment the PC (ready to fetch the next instruction)
 Decode the instruction (find out the tasks to do)
 Fetch the operands (data needed for the tasks)
 Execute the operation (do the tasks; may involve the ALU)
 Store the results (in a register or in memory)
 Repeat for the next instruction
Execution Cycle
 Instruction Fetch - obtain the instruction from program storage
 Instruction Decode - determine the required actions and the instruction size
 Operand Fetch - locate and obtain the operand data
 Execute - compute the result value or status
 Result Store - deposit the results in storage for later use
 Next Instruction - determine the successor instruction
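The cycle above can be modelled as a small loop, as in the sketch below; the toy instruction set (LOAD, ADD, STORE, HALT) and the memory layout are invented for illustration and do not correspond to any real ISA.

# Toy fetch-decode-execute loop (illustrative only; the tiny instruction set is invented).
memory = [
    ("LOAD", 10),    # ACC <- memory[10]
    ("ADD", 11),     # ACC <- ACC + memory[11]
    ("STORE", 12),   # memory[12] <- ACC
    ("HALT", 0),
    0, 0, 0, 0, 0, 0,
    5, 7, 0,         # data at addresses 10, 11 and 12
]

pc, acc = 0, 0
while True:
    instr = memory[pc]          # fetch (the PC holds the address of the next instruction)
    pc += 1                     # increment the PC
    op, addr = instr            # decode
    if op == "LOAD":
        acc = memory[addr]      # operand fetch
    elif op == "ADD":
        acc += memory[addr]     # execute (uses the ALU)
    elif op == "STORE":
        memory[addr] = acc      # store the result
    elif op == "HALT":
        break

print(memory[12])               # 12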
Stack Organization
 A stack is a sequence of items that are accessible at only one end of the sequence.
(Figure: a stack)
Stack operations
There are three basic stack terms we should know
1) The PUSH instruction
◦ puts the value of the data in a register onto the top of the stack.
Cont’d…
2) The POP instruction
◦ takes the value from the top of the stack memory.
Cont’d…
3) Top of stack
◦ The place in the stack memory that is ready to be accessed.
A general block diagram of stack operation

Summary of stack
 It is a LIFO (Last In, First Out) system.
 Example: see how the elephant comes to the first location and the bird ends up last, from the stack's point of view.

(Figures: accessing the stack, step by step)
 In a stack we can only access the data on the top.
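A minimal Python sketch of the push, pop and top-of-stack operations described above; using a Python list as the stack memory is an assumption made purely for illustration (a real stack lives in memory and is tracked by a stack pointer register).

# Minimal stack sketch: the end of the list plays the role of the top of the stack.
stack = []

def push(value):
    stack.append(value)          # place the value on top of the stack

def pop():
    return stack.pop()           # remove and return the value on top

def top():
    return stack[-1]             # inspect the top without removing it

for item in ["bird", "dog", "cow", "elephant"]:
    push(item)

print(top())    # elephant (last in)
print(pop())    # elephant (first out)
print(pop())    # cow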
Computer programming languages

 Programming languages are the languages that we use to write instructions for a computer to perform functions.
 Classification of programming languages:
◦ Machine language
◦ Assembly language
◦ High level language
MACHINE CODE
A program running on a computer is simply a sequence
of bits.
A program in this format is said to be in machine code.
We can write programs in machine code, but it is bulky to write. For example:
23fc 0000 0001 0000 0040
0cb9 0000 000a 0000 0040
6e0c
06b9 0000 0001 0000 0040
60e8
ASSEMBLY LANGUAGE
 Assembly language (or assembler code) was our first attempt at producing a mechanism for writing programs that was more palatable to ourselves.
 Of course, a program written in assembly code, in order to "run", must first be translated into machine code. This is done by an assembler.
mov R0, R1
ADD R0, R3
compare:
cmp #0xa,n
cgt end_of_loop
addl #0x1,n
bra compare
end_of_loop:
HIGH LEVEL LANGUAGE
From the foregoing we can see that assembly language is
not much of an improvement on machine code!
A more problem-oriented (rather than machine-oriented)
mechanism for creating computer programs would also
be desirable.
Hence the advent of high(er) level languages, commencing with the introduction of "Autocodes" and going on to Fortran, Pascal, Basic, C, C++, Java, etc.
An HLL program must be converted into machine language to be executed. This is done by a compiler.
PIPELINING

What is Pipelining ?
 A technique used in advanced microprocessors where the
microprocessor begins executing a second instruction before the
first has been completed.
 A Pipeline is a series of stages, where some work is done at each
stage. The work is not finished until it has passed through all stages.
 With pipelining, the computer architecture allows the next instructions to be fetched while the processor is performing arithmetic operations, holding them in a buffer close to the processor until each instruction operation can be performed.
Pipelining (continued)
 Consider an example with 6 stages:
◦ FI = fetch instruction
◦ DI = decode instruction
◦ CO = calculate location of operand
◦ FO = fetch operand
◦ EI = execute instruction
◦ WO = write operand (store result)
Pipelining Example

 Executes 9 instructions in 14 cycles rather than 54 for sequential execution


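The 14-cycle figure can be checked with the usual pipeline estimate, cycles = k + (n - 1) for k stages and n instructions, assuming no hazards or stalls; a quick sketch:

# Check the pipelined vs sequential cycle counts for k stages and n instructions.
k, n = 6, 9                      # 6 stages (FI DI CO FO EI WO), 9 instructions

sequential_cycles = k * n        # each instruction finishes before the next one starts
pipelined_cycles = k + (n - 1)   # first instruction fills the pipe, then one completes per cycle

print(sequential_cycles)         # 54
print(pipelined_cycles)          # 14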
Example
(Figure: four sample instructions, executed linearly)
Four Pipelined Instructions
(Figure: the four instructions overlapped in the pipeline, each passing through the IF, ID, EX, M and W stages in successive cycles)
Description of each step
 The instruction Fetch (IF) stage is responsible for obtaining
the requested instruction from memory.
 The Instruction Decode (ID) stage is responsible for
decoding the instruction and sending out the various control
lines to the other parts of the processor.
 The Execution (EX) stage is where any calculations are performed. The main component in this stage is the ALU, which provides the arithmetic and logic capabilities.
Cont’d…

 The Memory and IO (MEM) stage is responsible for storing and loading values to and from memory. It is also responsible for input to and output from the processor. If the current instruction is not of a memory or IO type, then the result from the ALU is passed through to the write back stage.
 The Write Back (WB) stage is responsible for writing the result of a calculation, memory access or input into the register file.
Operation Timings
 Estimated timings for each of the stages:
Instruction Fetch   2 ns
Instruction Decode  1 ns
Execution           2 ns
Memory and IO       2 ns
Write Back          1 ns
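A hedged sketch of how such stage times are usually combined: the pipeline clock period is set by the slowest stage (2 ns here), so the assumed 5-stage pipeline completes n instructions in roughly (5 + n - 1) clock periods, versus about 8 ns per instruction without pipelining.

# Rough timing comparison based on the estimated stage times above (no hazards assumed).
stage_times = {"IF": 2, "ID": 1, "EX": 2, "MEM": 2, "WB": 1}   # nanoseconds

n = 100                                        # number of instructions
clock = max(stage_times.values())              # pipeline clock limited by the slowest stage: 2 ns
per_instruction = sum(stage_times.values())    # unpipelined time per instruction: 8 ns

unpipelined_time = n * per_instruction                  # 800 ns
pipelined_time = (len(stage_times) + n - 1) * clock     # (5 + 99) * 2 = 208 ns

print(unpipelined_time, pipelined_time)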
Advantages/Disadvantages of pipelining

Advantages:
 More efficient use of processor
 Quicker time of execution of large number of
instructions
Disadvantages:
 Pipelining involves adding hardware to the chip
 Inability to continuously run the pipeline at full speed, because of pipeline hazards which disrupt the smooth execution of the pipeline.
Pipeline Hazards
 Data Hazards – an instruction uses the result of the previous
instruction. A hazard occurs exactly when an instruction tries
to read a register in its ID stage that an earlier instruction
intends to write in its WB stage.

 Control Hazards – the location of the next instruction depends on a previous instruction.
 Structural Hazards – two instructions need to access the same resource.
Data Hazards Example
ADD R1, R2, R3   IF ID EX M WB
◦ ID: select R2 and R3 for the ALU operation; WB: store the sum in R1
SUB R4, R1, R5      IF ID EX M WB
◦ ID: select R1 and R5 for the ALU operation, before ADD has written its result to R1
Solution: Stalling
 Stalling involves halting the flow of instructions until the required result is ready to be used. However, stalling wastes processor time by doing nothing while waiting for the result.
ADD R1, R2, R3   IF ID EX M WB
STALL            IF ID EX M WB
STALL            IF ID EX M WB
STALL            IF ID EX M WB
SUB R4, R1, R5   IF ID EX M WB
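A minimal sketch of how a read-after-write hazard like the ADD/SUB pair above can be detected; the tuple representation of instructions and the 3-cycle stall count (result available only after WB, with no forwarding) are assumptions made for illustration.

# Detect read-after-write (RAW) hazards between adjacent instructions and report the
# stalls needed if results are only available after WB, as in the example above.
# Instructions are (text, destination_register, source_registers) tuples.
program = [
    ("ADD R1, R2, R3", "R1", ["R2", "R3"]),
    ("SUB R4, R1, R5", "R4", ["R1", "R5"]),
]

STALLS_NEEDED = 3   # assumed gap between the reader's ID and the writer's WB (no forwarding)

for i in range(1, len(program)):
    text, _, sources = program[i]
    prev_text, prev_dest, _ = program[i - 1]
    if prev_dest in sources:
        print(f"RAW hazard: {text} reads {prev_dest} written by {prev_text}; "
              f"insert {STALLS_NEEDED} stall cycles")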
Addressing modes

 Addressing modes define how machine language instructions in an architecture identify the operand (or operands) of each instruction.
 An addressing mode specifies how to calculate the effective memory address of an operand by using information held in registers and/or constants contained within a machine instruction or elsewhere.
Addressing Modes
The basic addressing modes include:
◦ Immediate
◦ Direct
◦ Indirect
◦ Register
◦ Register Indirect
◦ Displacement (Indexed)
◦ Stack
Immediate Addressing

 Operand is part of instruction
 Operand = address field
 e.g. ADD 5
◦ Add 5 to contents of accumulator
◦ 5 is operand
 No memory reference to fetch data
 Fast
Direct Addressing

 Address field contains address of operand
 Effective address (EA) = address field (A)
 e.g. ADD A
◦ Add contents of cell A to accumulator
◦ Look in memory at address A for operand
 Single memory reference to access data
 No additional calculations to work out effective address
 Limited address space
Direct Addressing Diagram

(Figure: the instruction's address field A points directly to the operand in memory)
Indirect Addressing (1)

 Memory cell pointed to by address field contains the address of (pointer to) the operand
 EA = (A)
◦ Look in A, find address (A) and look there for operand
 e.g. ADD (A)
◦ Add contents of cell pointed to by contents of A to accumulator
Indirect Addressing (2)

 Large address space: 2^n, where n = word length
 May be nested, multilevel, cascaded
◦ e.g. EA = (((A)))
◦ Draw the diagram yourself
 Multiple memory accesses to find operand
◦ Hence slower
Indirect Addressing Diagram

(Figure: the address field A selects a memory cell holding a pointer to the operand; a second memory access fetches the operand)
Register Addressing (1)

 Operand is held in the register named in the address field
 EA = R
 Limited number of registers
 Very small address field needed
◦ Shorter instructions
◦ Faster instruction fetch
Register Addressing (2)

 No memory access
 Very fast execution
 Very limited address space
 Multiple registers help performance
◦ Requires good assembly programming or compiler writing
Register Addressing Diagram

(Figure: the instruction's register address field R selects a register that holds the operand)
Register Indirect Addressing

 EA = (R)
 Operand is in the memory cell pointed to by the contents of register R
 Large address space (2^n)
 One fewer memory access than indirect addressing
Register Indirect Addressing Diagram

(Figure: the register address field R selects a register whose contents point to the operand in memory)
Displacement Addressing

 EA = A + (R)
 Address field holds two values:
◦ A = base value
◦ R = register that holds the displacement
◦ or vice versa
Displacement Addressing Diagram

(Figure: the register field R and address field A are combined; the contents of R are added to A to form the operand's address)
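Before moving on to stack addressing, here is a hedged sketch contrasting how the operand is obtained under the modes above; the register names and memory contents are invented just to make the example runnable.

# Illustrative effective-address calculation for the basic addressing modes.
# The memory and register contents below are assumptions made only for this example.
memory = {100: 555, 200: 100, 300: 999}
registers = {"R1": 100}

# Immediate: the operand is the address field itself (e.g. ADD 5 -> operand is 5)
immediate = 5

# Direct: EA = A
direct = memory[100]                          # 555

# Indirect: EA = (A); memory[200] holds a pointer (100) to the operand
indirect = memory[memory[200]]                # 555

# Register: the operand is held in the named register
register = registers["R1"]                    # 100

# Register indirect: EA = (R1)
register_indirect = memory[registers["R1"]]   # 555

# Displacement: EA = A + (R1) = 200 + 100 = 300
displacement = memory[200 + registers["R1"]]  # 999

print(immediate, direct, indirect, register, register_indirect, displacement)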
Stack Addressing

 Operand is (implicitly) on top of stack
 e.g. ADD: pop the top two items from the stack and add them
 Example: in the following stack, ADD will add D + C
D   (top of stack)
C
B
A
Comparison of Microprogrammed and Hardwired CU
(Figure: comparison of microprogrammed and hardwired control units)
Basic CPU Architectures

 There are two fundamental types of CPU architecture:
◦ 1) Complex Instruction Set Computers (CISC)
◦ 2) Reduced Instruction Set Computers (RISC)
 The difference between the two architectures is the relative complexity of the instruction sets and of the underlying electronic and logic circuits, which is greater in CISC microprocessors.
 For example, the original RISC I prototype had just 31 instructions, while the RISC II had 39.
RISC and CISC

1. RISC and CISC
2. Instruction Execution Characteristics
3. The Use of a Large Register File
4. Reduced Instruction Set Architecture
5. RISC versus CISC Controversy
1. RISC and CISC
RISC
◦ Reduced Instruction Set Computer
◦ Key features
 Large number of general purpose registers
 use of compiler technology to optimize register use
 Limited and simple instruction set
 Emphasis on optimising the instruction pipeline
Cont…
CISC
◦ Complex Instruction Set Computer
◦ Software costs far exceed hardware costs
◦ Increasingly complex high level languages
◦ Large instruction sets
◦ More addressing modes
◦ Hardware implementations of HLL statements
◦ Ease compiler writing
◦ Improve execution efficiency
◦ Support more complex HLLs
2. Instruction Execution Characteristics

 One of the most significant evolutions associated with computers is the programming language.
 As the cost of hardware has dropped, the relative cost of software has risen.
Execution Characteristics
◦ Operations performed
◦ Operands used
◦ Execution sequencing
Studies have been done based on programs written in
HLLs
Dynamic studies are measured during the execution of
the program
Cont…
Operations
◦ Assignments
◦ Conditional statements (IF, LOOP)
◦ Procedure call-return is very time consuming
◦ Some HLL instructions lead to many machine code operations
Operands
◦ Mainly local scalar variables
◦ Optimisation should concentrate on accessing local variables
Cont…
Procedure Calls (Execution sequencing)
◦ Very time consuming
◦ Depends on number of parameters passed
◦ Depends on level of nesting
◦ Most programs do not do a lot of calls followed by lots of
returns
◦ Most variables are local
Cont…
Implications of RISC
◦ Best support is given by optimising most used and most time
consuming features
◦ Large number of registers
 Operand referencing
◦ Careful design of pipelines
 Branch prediction
◦ Simplified (reduced) instruction set
3. The Use of a Large Register File

 Registers are the fastest available storage devices
 The register file is physically small
 We need a strategy that allows the most frequently accessed operands to be kept in registers
1. Software solution
 Require compiler to allocate registers
 Allocate based on most used variables in a given time
 Requires sophisticated program analysis
2. Hardware solution
 Have more registers
 Thus more variables will be in registers
Cont…
Registers for Local Variables
◦ Store local scalar variables in registers
◦ Reduces memory access
◦ Every procedure (function) call changes locality
◦ Parameters must be passed
◦ Results must be returned
◦ Variables from calling programs must be restored
Cont…
Register Windows
◦ Only a few parameters are typically passed
◦ Limited range of depth of calls
◦ Use multiple small sets of registers
◦ Calls switch to a different set of registers
◦ Returns switch back to a previously used set of registers
Cont…
 Three areas within a register set
1. Parameter registers
2. Local registers
3. Temporary registers
◦ Temporary registers from one set overlap parameter
registers from the next
◦ This allows parameter passing without moving data
Overlapping Register Windows
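A hedged sketch (not from the slides) of the overlapping-window idea: sliding the window on a call makes the caller's temporary registers become the callee's parameter registers, so parameters are passed without moving data. The window sizes and register-file size are assumptions.

# Sketch of overlapping register windows: each window is a view into one large
# physical register file, and a call slides the window forward so that the
# caller's temporary registers overlap the callee's parameter registers.
PARAMS, LOCALS, TEMPS = 4, 4, 4         # invented window layout
STEP = PARAMS + LOCALS                  # sliding by 8 makes temps overlap the next params

physical = [0] * 64                     # the physical register file
current_base = 0                        # base of the current window

def call():
    """Slide the window forward; the caller's temps become the callee's params."""
    global current_base
    current_base += STEP

def ret():
    global current_base
    current_base -= STEP

def reg(i):                             # physical index of register i in the current window
    return current_base + i

physical[reg(PARAMS + LOCALS)] = 42     # caller writes an argument into its first temp register
call()
print(physical[reg(0)])                 # 42: the callee reads it as its first parameter, no copy
ret()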
Cont…
Global Variables
◦ Allocated by the compiler to memory
 Inefficient for frequently accessed variables
◦ Have a set of registers for global variables
Registers v Cache

Large Register File                            | Cache
All local scalar variables                     | Recently-used local scalar variables
Individual variables                           | Blocks of memory
Compiler-assigned global variables             | Recently-used global variables
Save/Restore based on procedure nesting depth  | Save/Restore based on cache replacement algorithm
Register addressing                            | Memory addressing


4. Reduced Instruction Set Architecture

RISC Characteristics
◦ One instruction per cycle
◦ Register to register operations
◦ Few, simple addressing modes
◦ Few, simple instruction formats
◦ Fixed instruction format
5. RISC versus CISC Controversy
Quantitative
◦ compare program sizes and execution speeds
Qualitative
◦ examine issues of high level language support and use of VLSI
real estate
Problems
◦ No pair of RISC and CISC that are directly comparable
◦ No definitive set of test programs
◦ Difficult to separate hardware effects from compiler effects
◦ Most commercial devices are a mixture
Comparison of CISC and RISC

CISC:
 Large instruction set
 Compact and versatile register set
 Complex, powerful instructions
 Numerous memory addressing options for operands
 Have microprogrammed CU
 CISC systems shorten execution time by reducing the number of instructions per program

RISC:
 Compact instruction set
 Numerous registers
 Simple hard-wired machine code and control unit
 Compiler and IC developed simultaneously
 Have hardwired CU
 RISC systems shorten execution time by reducing the clock cycles per instruction
Exercise
1) What is the advantage of a microprogrammed CU over a hardwired CU, and vice versa?
2) a) How many types of fundamental CPU architecture do we have? Mention them.
   b) Which of these architectures is faster? Why?
3) If a MOV instruction needs 1 cycle, ADD needs 1 cycle and LOOP needs 1 cycle:
   A) What is the total cycle count for the following program?
      Mov Ax,5
      Mov Cx,3
      again: Add Ax,Ax
      Loop again
   B) What are the values of registers Ax and Cx after the execution of the program?
Overview of CU and timing diagram

A Simplified Control Unit
(Figure: the control unit drives a Fetch Unit, Decode Unit, Execution Unit and Write Back Unit with Fetch, Decode, Execute and Write Back signals)
Timing Diagram
(Figure: timing diagram showing the CLK signal and the Fetch, Decode, Execute and Write Back control signals asserted one after another on successive clock cycles)
Let's Sample The Signals
 Sampling the Fetch, Decode, Execute and Write Back signals on four successive clock cycles gives:
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
Another Way to Generate Signals

1000
0100
0010
0001
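One common way to generate such a repeating 1000/0100/0010/0001 pattern is a ring counter, in which a single 1 circulates through a chain of flip-flops; the slides do not name the circuit, so the sketch below is an assumption.

# Ring-counter sketch: a single 1 circulates through four flip-flops, asserting
# Fetch, Decode, Execute and Write Back in turn, one per clock cycle.
state = [1, 0, 0, 0]                    # start with the Fetch signal asserted

for cycle in range(8):                  # simulate 8 clock cycles (two full passes)
    print("".join(str(bit) for bit in state))
    state = [state[-1]] + state[:-1]    # on each clock edge, rotate the pattern by one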

