Introduction To Processor Design & The ARM Architecture

ARM Applications
History
 RISC idea came from Stanford & Berkeley universities (1980)
 Stored-program digital computer (principle)
 Concept originated in the 1940s (Princeton)
 First implemented in 1948 on the 'Baby' machine, which ran at Manchester University, England
Computer Architecture
 Describes the user's view of the computer
 E.g.
– Instruction Set,
– Visible Registers,
– Memory Management Table Structure,
– Exception Handling Models, etc.
Computer Organization
 Describes the user-invisible implementation of the architecture
 E.g.
– Pipeline Structure,
– Transparent Cache,
– Translation Look-aside Buffers, etc.
What is a Processor?
 Finite-State Automaton
 Executes instructions held in memory
 State depends on the values held in registers & memory
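The finite-state view can be sketched in a few lines of Python. This is a toy model rather than any real instruction set: the tuple encoding, the opcode names (`li`, `add`, `halt`) and the helpers `step` and `run` are all assumptions made for illustration.

```python
# Toy model: the machine state is the PC, the registers and the memory.
# The instruction encoding (tuples) and opcode names are hypothetical.

def step(state):
    """Fetch the instruction at PC, execute it, return False on halt."""
    op, *args = state["mem"][state["pc"]]   # fetch from memory
    state["pc"] += 1
    regs = state["regs"]
    if op == "li":                            # load immediate: rd := value
        rd, value = args
        regs[rd] = value
    elif op == "add":                         # rd := rs1 + rs2 (32-bit)
        rd, rs1, rs2 = args
        regs[rd] = (regs[rs1] + regs[rs2]) & 0xFFFFFFFF
    elif op == "halt":
        return False
    return True

def run(program):
    """Run a program from address 0 until it halts; return the final state."""
    state = {"pc": 0, "regs": [0] * 4, "mem": list(program)}
    while step(state):
        pass
    return state

final = run([("li", 0, 2), ("li", 1, 3), ("add", 2, 0, 1), ("halt",)])
print(final["regs"], final["pc"])
```

Each call to `step` moves the machine from one state to the next, and the next state depends only on the current register and memory contents.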
Instruction Set Design
 4-Address Instructions, add d, s1, s2, nextAdd
 3-Address Instructions, add d, s1, s2
 2-Address Instructions, add d, s1
 1-Address Instructions, add s1
 0-Address Instructions, add ; top of stack
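To make the difference between the formats concrete, here is a rough Python sketch of what each form of `add` does. The register names, the `acc` accumulator and the list used as a stack are illustrative assumptions; the 4-address form is omitted because it only adds an explicit next-instruction address to the 3-address form.

```python
# Each function models one "add" under a different instruction format.
# Register names, the accumulator and the stack layout are hypothetical.

def add3(regs, d, s1, s2):
    """3-address: d := s1 + s2"""
    regs[d] = regs[s1] + regs[s2]

def add2(regs, d, s1):
    """2-address: d := d + s1 (the destination is also a source)"""
    regs[d] = regs[d] + regs[s1]

def add1(regs, s1):
    """1-address: accumulator := accumulator + s1"""
    regs["acc"] = regs["acc"] + regs[s1]

def add0(stack):
    """0-address: pop two values from the stack and push their sum"""
    stack.append(stack.pop() + stack.pop())

regs = {"acc": 10, "r0": 0, "r1": 1, "r2": 2}
add3(regs, "r0", "r1", "r2")   # r0 := r1 + r2
add2(regs, "r0", "r1")         # r0 := r0 + r1
add1(regs, "r2")               # acc := acc + r2
stack = [5, 7]
add0(stack)
print(regs, stack)
```

The fewer addresses an instruction names, the more of its operands are implicit (accumulator or stack).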
Instruction Types
 Data Processing
 Data Movement
 Control Flow
 Special Instructions
– E.g. switching to privileged mode
How will you improve the Processor Performance?
Instruction Type    Dynamic Usage
Data Movement       43%
Control Flow        23%
Arithmetic          15%
Comparisons         13%
Logical             5%
Other               1%
How will you improve the Processor Performance?
 Pipeline
 Cache
 Superscalar architecture (multiple instructions are executed in parallel by dispatching them to multiple functional units)
Pipelines
 Fetch
 Decode
 Register Access
 ALU
 Memory, if necessary
 Write Back
Pipeline Hazards
Instruction 1: Fetch Dec Reg ALU Mem Res
Instruction 2:       Fetch Dec Reg ALU Mem Res
Branch Instructions?
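The benefit of overlapping instructions, and the cost of a hazard, can be estimated with a small sketch. It assumes the six stages listed above take one cycle each and that every hazard simply adds stall cycles; the function names and the `stalls` parameter are hypothetical.

```python
STAGES = ["Fetch", "Dec", "Reg", "ALU", "Mem", "Res"]

def pipelined_cycles(n_instructions, stalls=0, n_stages=len(STAGES)):
    """Cycles to finish n instructions when a new one starts every cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1 + stalls

def unpipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles when each instruction finishes before the next one starts."""
    return n_stages * n_instructions

def timeline(n_instructions):
    """One row per instruction; instruction i enters Fetch in cycle i."""
    return [["."] * i + STAGES for i in range(n_instructions)]

for row in timeline(3):
    print(" ".join(f"{stage:5}" for stage in row))
print(pipelined_cycles(3), pipelined_cycles(3, stalls=1), unpipelined_cycles(3))
```

Under these assumptions every stall cycle, whether from a data dependency or from a branch, is added directly to the total.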
RISC Architecture
RISC                            CISC
Fixed-width instructions        Variable-length instructions
Few instruction formats         Several instruction formats
Load/store architecture         Memory values can be used as operands in instructions
Large register bank             Small register bank
Instructions are pipelinable    Little or no pipelining
RISC Organization
RISC                                      CISC
Hardwired instruction decode              Microcode ROMs in the instruction decoder
Single-cycle execution of instructions    Multi-cycle execution of instructions
RISC Advantages
 A smaller die size
 A shorter development time
 Higher performance (a bit tricky)
– Smaller things have higher natural frequencies
RISC Disadvantages
 Generally poor code density (fixed-length instructions)
ARM History
 ARM – Acorn RISC Machine (1983–1985)
– Acorn Computers Limited, Cambridge, England
 ARM – Advanced RISC Machine (1990)
– ARM Limited, 1990
– ARM has been licensed to many semiconductor manufacturers
ARM Architect
Steve Furber
sfurber@cs.man.ac.uk
Father of ARM
Architectural Inheritance
 When the first ARM chip was designed, the examples of other RISC architectures were
– Berkeley RISC I & II
– MIPS
 Earlier machines did share some of the features
– PDP-8
– Cray-1
– IBM 801
Semiconductor Partners
Features Used from Berkeley RISC
 A Load/Store Architecture
 Fixed-Length 32-bit Instructions
 3-Address Instruction Formats
Features Rejected from Berkeley RISC
 Delayed Branches
– Branches cause problems in pipelines
– Most RISC processors use delayed branches, where the branch takes effect after the following instruction has executed
– The original ARM did not use delayed branching because it makes exception handling complex
– This later helped simplify the re-implementation of the architecture
Features Rejected from Berkeley RISC
 Single-Cycle Execution of ALL Instructions
– Single memory for instructions & data
– Even a simple load/store will require at least two cycles
– Separate data & instruction memories were the solution, but were too costly at the time
– The ARM designers used the extra cycle for something useful, such as supporting auto-indexing
The ARM Programmer's Model
 When writing user-level programs, only
– the 15 general-purpose 32-bit registers (r0-r14),
– the Program Counter (r15) &
– the CPSR (Current Program Status Register) need to be considered
 The remaining registers are used only for system-level programming & for handling exceptions
Support for ARM Modes of Operations
CPSR
 User-level programs use the CPSR to store the condition code bits
– N Negative
– Z Zero
– C Carry
– V Overflow
 The bottom bits are protected from user-level programs
– I, F, T, mode[4:0]
CPSR – Current Program Status Register
Bits     31-28   27-8     7   6   5   4-0
Field    NZCV    unused   I   F   T   mode
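As a rough illustration of the layout above, the following Python sketch splits a 32-bit CPSR value into its fields. The function name `decode_cpsr` and the sample value are assumptions made for the example.

```python
def decode_cpsr(cpsr):
    """Split a 32-bit CPSR value into the fields of the layout above."""
    return {
        "N": (cpsr >> 31) & 1,   # negative
        "Z": (cpsr >> 30) & 1,   # zero
        "C": (cpsr >> 29) & 1,   # carry
        "V": (cpsr >> 28) & 1,   # overflow
        "I": (cpsr >> 7) & 1,    # control bit 7
        "F": (cpsr >> 6) & 1,    # control bit 6
        "T": (cpsr >> 5) & 1,    # control bit 5
        "mode": cpsr & 0x1F,     # mode[4:0]
    }

# Sample value chosen for illustration: Z and C set, I and F set.
fields = decode_cpsr(0x600000D3)
print(fields)
```

Only the top four bits can be changed by a user-level program; the sketch reads the bottom bits but does not model that protection.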
The Memory System
 Memory may be viewed as a linear array of bytes numbered from 0 to 2^32 – 1
 Data items may be 8-bit bytes (B), 16-bit half-words (HW) or 32-bit words (W)
 Words are always aligned on 4-byte boundaries, i.e. the two least significant address bits are zero
 Half-words are aligned on even byte boundaries
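The alignment rules amount to a simple check on the address. The following is a minimal Python sketch; the helper names `is_aligned` and `check_access` are hypothetical.

```python
def is_aligned(address, size):
    """True if `address` is a multiple of `size` (1, 2 or 4 bytes)."""
    return address % size == 0

def check_access(address, size):
    """Validate the address range, then report whether the access is aligned."""
    if not 0 <= address <= 2**32 - 1:
        raise ValueError("address outside the 32-bit byte-addressed space")
    return is_aligned(address, size)

word_ok = check_access(0x1000, 4)    # low two address bits are 00
word_bad = check_access(0x1002, 4)   # low two address bits are 10
half_ok = check_access(0x1002, 2)    # even address, fine for a half-word
print(word_ok, word_bad, half_ok)
```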
ARM Memory Organization
bit 31 .......... bit 0
23 22 21 20
19 18 17 16    word16
15 14 13 12    half-word14, half-word12
11 10  9  8    word8
 7  6  5  4    byte6, half-word4
 3  2  1  0    byte3, byte2, byte1, byte0
(the numbers are byte addresses)
Load-Store Architecture
 Data Processing Instructions
 Data Transfer Instructions
 Control Flow Instructions
ARM Exceptions
 ARM supports a range of interrupts, traps and supervisor calls, all grouped under the general heading of exceptions
 On an exception, the PC is saved in r14 (the link register) and the CPSR in the SPSR of the mode for that exception type
Exception Priorities
1. Reset (Highest Priority)
2. Data Abort
3. FIQ
4. IRQ
5. Prefetch Abort
6. SWI, Undefined Instruction (including absent coprocessor)
Exception Vector Addresses
Exception                Mode     Vector Address
Reset                    SVC      0x00000000
Undefined Instruction    UND      0x00000004
Software Interrupt       SVC      0x00000008
Prefetch Abort           Abort    0x0000000C
Data Abort               Abort    0x00000010
IRQ (Normal Interrupt)   IRQ      0x00000018
FIQ (Fast Interrupt)     FIQ      0x0000001C
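The priority list and the vector table from the slides above can be combined in a small lookup. The Python sketch below is only a model: the dictionary keys and the function `next_exception` are hypothetical names, and the undefined-instruction trap is assumed to share the lowest priority level with SWI.

```python
# Vector addresses and priority order as listed on the slides above.
VECTORS = {
    "reset": 0x00, "undefined": 0x04, "swi": 0x08,
    "prefetch_abort": 0x0C, "data_abort": 0x10, "irq": 0x18, "fiq": 0x1C,
}
# Highest priority first; "undefined" is assumed to sit with "swi" at the bottom.
PRIORITY = ["reset", "data_abort", "fiq", "irq", "prefetch_abort", "swi", "undefined"]

def next_exception(pending):
    """Return (name, vector address) of the highest-priority pending exception."""
    for name in PRIORITY:
        if name in pending:
            return name, VECTORS[name]
    return None

chosen = next_exception({"irq", "fiq"})
print(chosen)
```

When several exceptions are pending at once, the processor handles the one nearest the top of the list first; here a simultaneous IRQ and FIQ resolves to the FIQ vector.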
The I/O System
 Handles I/O as memory-mapped devices with interrupt support
 Internal device registers appear as addressable locations
 Devices attract the attention of the ARM with a normal interrupt (IRQ) or a fast interrupt (FIQ)
Agenda
 Data Transfer Instructions
 Control Flow Instructions
 Writing Simple Assembly ...
Data Processing Instructions
 ... the data values in ARM ...
 Typically require two operands & produce ...
Rules for Data Processing Instructions
 The result, if there is any, is 32 bits wide and is placed in a register (Exception: Long Multiplies ...)
Operands in Data Processing
 Register Operands
 Immediate Operands
 Shifted Register Operands
Data Processing Operations
 Arithmetic Operations
 Bit-wise Operations
 Register Movement Operations
Arithmetic Operations
ADD r0, r1, r2 ; r0 := r1 + r2
ADC r0, r1, r2 ; r0 := r1 + r2 + C
SUB r0, r1, r2 ; r0 := r1 - r2
SBC r0, r1, r2 ; r0 := r1 - r2 + C - 1
RSB r0, r1, r2 ; r0 := r2 - r1
RSC r0, r1, r2 ; r0 := r2 - r1 + C - 1
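The arithmetic operations above can be modelled on 32-bit values as follows. This is a sketch in Python, not ARM code; the function names mirror the mnemonics, and `add64` is a hypothetical helper showing why the carry-in variants exist (a 64-bit add built from a low-word add that produces a carry and a high-word ADC).

```python
MASK = 0xFFFFFFFF  # results are truncated to 32 bits

def add(r1, r2):
    return (r1 + r2) & MASK

def adc(r1, r2, c):
    return (r1 + r2 + c) & MASK

def sub(r1, r2):
    return (r1 - r2) & MASK

def sbc(r1, r2, c):
    return (r1 - r2 + c - 1) & MASK

def rsb(r1, r2):
    return (r2 - r1) & MASK

def rsc(r1, r2, c):
    return (r2 - r1 + c - 1) & MASK

def add64(a_hi, a_lo, b_hi, b_lo):
    """64-bit add from 32-bit halves: add the low words, then ADC the high words."""
    total = a_lo + b_lo
    carry = 1 if total > MASK else 0      # carry out of the low-word add
    return adc(a_hi, b_hi, carry), total & MASK

result = add64(0x00000001, 0xFFFFFFFF, 0x00000000, 0x00000001)
print([hex(word) for word in result])
```

With C = 1, `sbc` behaves like a plain subtract; with C = 0 it subtracts one more, which is how a borrow propagates between words.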
Bit-wise Logical Operations
AND r0, r1, r2 ; r0 := r1 and r2
ORR r0, r1, r2 ; r0 := r1 or r2
EOR r0, r1, r2 ; r0 := r1 xor r2
BIC r0, r1, r2 ; r0 := r1 and not r2
Register Movement Operations
MOV r0, r2 ; r0 := r2
MVN r0, r2 ; r0 := not r2
Comparison Operations
CMP r1, r2 ; set cc on r1 - r2
CMN r1, r2 ; set cc on r1 + r2
TST r1, r2 ; set cc on r1 and r2
TEQ r1, r2 ; set cc on r1 xor r2
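"Set cc on r1 - r2" means the subtraction is performed only to update N, Z, C and V; the result itself is discarded. A minimal Python sketch for CMP, assuming `r1` and `r2` are given as unsigned 32-bit values (the function name `cmp_flags` is hypothetical):

```python
MASK = 0xFFFFFFFF

def cmp_flags(r1, r2):
    """N, Z, C, V as set by CMP r1, r2 (computing r1 - r2 and discarding it)."""
    result = (r1 - r2) & MASK
    n = result >> 31                       # sign bit of the result
    z = int(result == 0)
    c = int(r1 >= r2)                      # on ARM, C = 1 means no borrow
    # Overflow: the operands differ in sign and the result's sign differs from r1's.
    v = ((r1 ^ r2) & (r1 ^ result)) >> 31
    return {"N": n, "Z": z, "C": c, "V": v}

equal = cmp_flags(5, 5)
smaller = cmp_flags(3, 5)
overflow = cmp_flags(0x80000000, 1)
print(equal, smaller, overflow)
```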
Immediate Operands
– ADD r3, r3, #1 ; r3 := r3 + 1
– AND r8, r7, #&ff ; r8 := r7 and &ff
Shifted Register Operands
 The second register operand is subjected to a shift before it is combined with the first operand
ADD r3, r2, r1, LSL #3 ; r3 := r2 + (r1*8)
ARM Shift Operations
 LSR- Logical Shift Right
 ASL- Arithmetic Shift Left
 ASR- Arithmetic Shift Right
LSL, LSR, ASL, ASR, ROR, RRX
[Figure: 32-bit shift diagrams (bit 31 to bit 0). LSL #5 and LSR #5 shift in five zeros; ASR #5 shifts in zeros for a positive operand and ones for a negative operand; ROR #5 rotates bits around; RRX rotates right by one bit through the carry flag C]
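The shift operations can be modelled on 32-bit values as follows. This is a Python sketch with hypothetical function names that mirror the mnemonics; ASL is not shown separately because it produces the same result as LSL.

```python
MASK = 0xFFFFFFFF

def lsl(x, n):
    """Logical shift left: zeros enter at bit 0."""
    return (x << n) & MASK

def lsr(x, n):
    """Logical shift right: zeros enter at bit 31."""
    return (x & MASK) >> n

def asr(x, n):
    """Arithmetic shift right: copies of the sign bit (bit 31) enter from the left."""
    x &= MASK
    if x & 0x80000000:
        x -= 1 << 32              # reinterpret as a negative Python int
    return (x >> n) & MASK

def ror(x, n):
    """Rotate right: bits leaving bit 0 re-enter at bit 31."""
    n %= 32
    x &= MASK
    return ((x >> n) | (x << (32 - n))) & MASK

def rrx(x, carry):
    """Rotate right by one bit through carry; returns (result, new carry)."""
    x &= MASK
    return (carry << 31) | (x >> 1), x & 1

# ADD r3, r2, r1, LSL #3 with r2 = 10 and r1 = 2
r3 = (10 + lsl(2, 3)) & MASK
print(r3, hex(asr(0x80000000, 5)), hex(ror(0x1F, 5)))
```

The last lines show the shifted-operand form: the second operand is shifted first and then combined with the first operand in the same instruction.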
Shift Value in Register
 ... specify the number of bits the second operand should be shifted ...
Setting the Condition Codes
 ... comparisons ... a special request needs to be made
 At assembly level the request is made ...
Multiplies
– ... second operand not supported
– The result register must ...
Data Transfer Instructions
 Single Register Load & Store
– transfer of a data item (byte, half-word, word) between ARM registers and memory
 Multiple Register Load & Store
– enable transfer of large quantities of data
– used for procedure entry and exit, to save/restore workspace registers, to copy blocks of data around memory
 Single Register Swap Instructions
– allow exchange between a register and memory in one instruction
– used to implement semaphores to ensure mutual exclusion on accesses to shared data in multis...
Register-Indirect Addressing
LDR r0, [r1] ; r0 := mem32[r1]
STR r0, [r1] ; mem32[r1] := r0
Note: r1 holds a word address (2 LSBs are 0)
Offset up to 4 KBytes
Pre Indexed Addressing
LDR r0, [r1, #4] ; r0 := mem32[r1 + 4]
Post Indexed Addressing
LDR r0, [r1], #4 ; r0 := mem32[r1]
                 ; r1 := r1 + 4
Auto Indexing Addressing
LDR r0, [r1, #4]! ; r0 := mem32[r1 + 4]
                  ; r1 := r1 + 4
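The three indexed forms differ only in which address is used for the load and whether the base register is updated. The Python sketch below models them on a toy memory; the dictionary-based memory, the register dictionary and the function names are assumptions made for illustration.

```python
def ldr_offset(mem, regs, rd, rn, offset=0):
    """LDR rd, [rn, #offset]  : load from rn + offset, rn is not modified."""
    regs[rd] = mem[regs[rn] + offset]

def ldr_post(mem, regs, rd, rn, offset):
    """LDR rd, [rn], #offset  : load from rn, then rn := rn + offset."""
    regs[rd] = mem[regs[rn]]
    regs[rn] += offset

def ldr_auto(mem, regs, rd, rn, offset):
    """LDR rd, [rn, #offset]! : rn := rn + offset, then load from rn."""
    regs[rn] += offset
    regs[rd] = mem[regs[rn]]

mem = {0x100: 11, 0x104: 22, 0x108: 33}    # toy memory, keyed by word address
regs = {"r0": 0, "r1": 0x100}

ldr_offset(mem, regs, "r0", "r1", 4)
after_offset = (regs["r0"], regs["r1"])
ldr_post(mem, regs, "r0", "r1", 4)
after_post = (regs["r0"], regs["r1"])
ldr_auto(mem, regs, "r0", "r1", 4)
after_auto = (regs["r0"], regs["r1"])
print(after_offset, after_post, after_auto)
```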
Where do I use this?

Exercise
 Algorithm:
– Pointer to Table1
– Pointer to Table2
– Load [Table1]
– Store ...
Answer
COPY:   ADR r1, TABLE1   ; r1 points to TABLE1
        ADR r2, TABLE2   ; r2 points to TABLE2
LOOP:   LDR r0, [r1]     ; load the next word from TABLE1
        STR r0, [r2]     ; store it into TABLE2
        ADD r1, r1, #4   ; step both pointers to the next word
        ADD r2, r2, #4
        ...
TABLE1: ...
TABLE2: ...
Better Answer
COPY:   ADR r1, TABLE1   ; r1 points to TABLE1
        ADR r2, TABLE2   ; r2 points to TABLE2
LOOP:   LDR r0, [r1], #4 ; load a word, then step r1 (post-indexed)
        STR r0, [r2], #4 ; store the word, then step r2
        ...
TABLE1: ...
TABLE2: ...
Multiple Register Transfer
 ... quantity of data needs to be transferred
 But there is a trade-off, i.e. ...
Example Multiple Transfer
LDMIA r1, {r0, r2, r5} ; r0 := mem32[r1]
                       ; r2 := mem32[r1 + 4]
                       ; r5 := mem32[r1 + 8]
The base address should be word aligned
The order of the registers in the list does not matter
Normal practice is to specify them in increasing order
Including r15 is also possible
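A rough Python model of LDMIA shows why the order of the register list does not matter: registers are always loaded in ascending register number from ascending addresses. The function name, the dictionary-based memory and the choice to raise an error on a misaligned base are assumptions of this sketch, not a description of the hardware.

```python
def ldmia(mem, regs, base, reg_list, writeback=False):
    """LDMIA base, {reg_list}: load consecutive words starting at the base address.

    The lowest-numbered register always gets the lowest address, whatever
    order the list is written in.
    """
    address = regs[base]
    if address % 4 != 0:
        raise ValueError("base address must be word aligned")
    for reg in sorted(reg_list, key=lambda name: int(name[1:])):
        regs[reg] = mem[address]
        address += 4
    if writeback:                 # models the optional '!' after the base register
        regs[base] = address

mem = {0x200: 1, 0x204: 2, 0x208: 3}       # toy memory, keyed by word address
regs = {"r0": 0, "r1": 0x200, "r2": 0, "r5": 0}
ldmia(mem, regs, "r1", ["r5", "r0", "r2"])  # list deliberately out of order
print(regs)
```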
Exercise
... locations 0x8000-2000 & 0x8000-2001?
Check the question ...
Exercise
... convert the following C statements
– X = A + B
– X = A – B
– X = B – A
– X = A + B*4
– X = ...
