Professional Documents
Culture Documents
3ARM Teaching Material
3ARM Teaching Material
ARMv7
version
ARM1156T2F-S™
ARM1136JF-S™
ARMv6
ARM1176JZF-S™
ARM102xE XScale TM ARM1026EJ-S™
ARMv5
ARM9x6E ARM926EJ-S™
SC200™
ARM7TDMI-S™ StrongARM
®
ARM92xT
V4
SC100™ ARM720T™
1
Data Sizes and Instruction Sets
The ARM is a 32-bit architecture.
2
3
Processor Modes
The ARM has seven basic operating modes:
cpsr
spsr spsr spsr spsr spsr spsr
5
Exception Handling
6
Program Status Registers
31 28 27 24 23 16 15 8 7 6 5 4 0
N Z C V Q J U n d e f i n e d I F T mode
f s x c
039v12 7
Program Counter (r15)
039v12 8
Conditional Execution and Flags
ARM instructions can be made to execute conditionally by postfixing them with the
appropriate condition code field.
This improves code density and performance by reducing the number of forward
branch instructions.
CMP r3,#0 CMP r3,#0
BEQ skip ADDNE r0,r1,r2
ADD r0,r1,r2
skip
By default, data processing instructions do not affect the condition code flags but
the flags can be optionally set by using “S”. CMP does not need “S”.
loop
…
SUBS r1,r1,#1 decrement r1 and set flags
BNE loop if Z flag clear then branch
9
Condition Codes
10
Conditional execution examples
5 instructions 3 instructions
5 words 3 words
5 or 6 cycles 3 cycles
11
Examples of conditional execution
Use a sequence of several conditional instructions
if (a==0) func(1);
CMP r0,#0
MOVEQ r0,#1
BLEQ func
Set the flags, then use various condition codes
if (a==0) x=0;
if (a>0) x=1;
CMP r0,#0
MOVEQ r1,#0
MOVGT r1,#1
Use conditional compare instructions
if (a==4 || a==10) x=0;
CMP r0,#4
CMPNE r0,#10
MOVEQ r1,#0
12
Data Processing Instructions
Consist of :
Arithmetic: ADD ADC SUB SBC RSB RSC
Logical: AND ORR EOR BIC
Comparisons: CMP CMN TST TEQ
Data movement: MOV MVN
CF Destination 0 Destination CF
Destination CF
Immediate value
8 bit number, with a range of 0-
255.
ALU Rotated right through even
number of positions
Allows increased range of 32-bit
constants to be loaded directly
Result into registers
16
Data Processing Exercise
1. How would you load the two’s complement
representation of -1 into Register 3 using one
instruction?
17
Data Processing Solutions
1. MVN r3, #0
18
Immediate constants (1)
No ARM instruction can contain a 32 bit immediate constant
All ARM instructions are fixed as 32 bits long
The data processing instruction format has 12 bits available
for operand2
11 8 7 0
rot immed_8
Quick Quiz:
x2 0xe3a004ff
Shifter
ROR MOV r0, #???
20
Loading 32 bit constants
To allow larger constants to be loaded, the assembler offers a pseudo-
instruction:
LDR rd, =const
This will either:
Produce a MOV or MVN instruction to generate the value (if possible).
or
Generate a LDR instruction with a PC-relative address to read the
constant from a literal pool (Constant data area embedded in the
code).
For example
LDR r0,=0xFF => MOV r0,#0xFF
LDR r0,=0x55555555 => LDR r0,[PC,#Imm12]
…
…
DCD 0x55555555
This is the recommended way of loading constants into a register
21
Single register data transfer
LDRSTR Word
LDRB STRB Byte
LDRH STRH Halfword
LDRSB Signed byte load
LDRSH Signed halfword load
Syntax:
LDR{<cond>}{<size>} Rd, <address>
STR{<cond>}{<size>} Rd, <address>
e.g. LDREQB
22
Address accessed
23
Pre or Post Indexed Addressing?
Pre-indexed: STR r0,[r1,#12]
Offset r0
Source
12 0x20c 0x5 0x5 Register
for STR
r1
Base
Register 0x200
0x200
24
C language example
int *ptr;
int x, y;
ptr = &y;
x = *ptr++;
LDR r0,[r1],#4
25
Single register data transfer
26
Single register data transfer
27
Single register data transfer: examples
28
Load/Store Exercise
Assume an array of 25 words. A compiler associates
y with r1. Assume that the base address for the array
is located in r2. Translate this C
statement/assignment using just three instructions:
array[10] = array[5] + y;
29
Load/Store Exercise Solution
array[10] = array[5] + y;
30
Load/Store halfword
31
Load/Store signed halfword
32
Load and Store Multiples
Syntax:
<LDM|STM>{<cond>}<addressing_mode> Rb{!}, <register list>
4 addressing modes:
LDMIA / STMIA increment after
LDMIB / STMIB increment before
LDMDA / STMDA decrement after
LDMDB / STMDB decrement before
IA IB DA DB
LDMxx r10, {r0,r1,r4} r4
STMxx r10, {r0,r1,r4} r4 r1
r1 r0 Increasing
Base Register (Rb) r10 r0 r4 Address
r1 r4
r0 r1
r0
33
Load and Store Multiples
34
Multiply and Divide
There are 2 classes of multiply - producing 32-bit and 64-bit results
32-bit versions on an ARM7TDMI will execute in 2 - 5 cycles
35
Branch instructions
Branch : B{<cond>} label
Branch with Link : BL{<cond>} subroutine_label
31 28 27 25 24 23 0
Cond 1 0 1 L Offset
The processor core shifts the offset field left by 2 positions, sign-extends it
and adds it to the PC
± 32 Mbyte range
How to perform longer branches?
36
Register Usage
Register
Arguments into function The compiler has a set of rules known as a
r0 Procedure Call Standard that determine how to
Result(s) from function r1 pass parameters to a function (see AAPCS)
otherwise corruptible r2
(Additional parameters r3
passed on stack) CPSR flags may be corrupted by function call.
Assembler code which links with compiled code
r4
must follow the AAPCS at external interfaces
r5
r6 The AAPCS is part of the new ABI for the ARM
Register variables r7 Architecture
Must be preserved r8
r9/sb - Stack base
r10/sl - Stack limit if software stack checking selected
r11
37
A hardware stack in memory
039v12 38
ARM Branches and Subroutines
B <label>
PC relative. ±32 Mbyte range.
BL <subroutine>
Stores return address in LR
Returning implemented by restoring the PC from LR
For non-leaf functions, LR will have to be stacked
func1 func2
STMFD sp!, :
: {regs,lr}
:
: :
:
BL func1 BL func2
:
: :
:
: LDMFD sp!,
{regs,pc} MOV pc, lr
039v12 39
PSR access
31 28 27 24 23 19 16 15 10 9 8 7 6 5 4 0
N Z C V Q de J GE[3:0] IT cond_abc E
I A
F T mode
f s x c
In User Mode, all bits can be read but only the condition flags (_f) can be modified
40
Agenda
Introduction to ARM Ltd
Fundamentals, Programmer’s Model, and Instructions
Core Family Pipelines
AMBA
41
Pipeline changes for ARM9TDMI
ARM7TDMI
ARM decode
Instruction Thumb®ARM Reg Reg
Shift ALU
Fetch decompress Read Write
Reg Select
ARM9TDMI
ARM or Thumb
Instruction Inst Decode Memory Reg
Shift + ALU Write
Fetch Reg Reg Access
Decode Read
FETCH DECODE EXECUTE MEMORY WRITE
42
ARM10 vs. ARM11 Pipelines
ARM10
Branch ARM or Memory
Prediction Shift + ALU
Thumb Reg Read Access Reg
Instruction Write
Instruction
Decode Multiply
Fetch Multiply
Add
FETCH ISSUE DECODE EXECUTE MEMORY WRITE
ARM11
Data Data
Address Cache Cache
1 2
43
Agenda
Introduction to ARM Ltd
Fundamentals, Programmer’s Model, and Instructions
Core Family Pipelines
AMBA
44
Example ARM-based System
Interrupt
Controller
Peripherals I/O
nIRQ nFIQ
ARM
Core
8 bit ROM
039v12 45
An Example AMBA System
High Performance
APB
ARM processor UART
High
Bandwidth AHB Timer
APB
External
Bridge
Memory Keypad
Interface
039v12 46
AHB Structure
Arbiter
HADDR
HADDR HWDATA Slave
Master HWDATA
#1
HRDATA
#1
HRDATA
Address/Control
Slave
#2
Master
#2
Write Data
Slave
Read Data #3
Master
#3
Slave
#4
Decoder
039v12 47