The ARM Architecture
Data Sizes and Instruction Sets
The ARM is a 32-bit architecture.
When used in relation to the ARM:
Byte means 8 bits
Halfword means 16 bits (two bytes)
Word means 32 bits (four bytes)
Most ARMs implement two instruction sets:
32-bit ARM Instruction Set
16-bit Thumb Instruction Set
Jazelle cores can also execute Java bytecode.
39v10 The ARM Architecture
TM
Processor Modes
The ARM has seven basic operating modes:
User : unprivileged mode under which most tasks run
FIQ : entered when a high priority (fast) interrupt is raised
IRQ : entered when a low priority (normal) interrupt is raised
Supervisor : entered on reset and when a Software Interrupt instruction is executed
Abort : used to handle memory access violations
Undef : used to handle undefined instructions
System : privileged mode using the same registers as user mode
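As a rough illustration of how the current mode is recorded, the sketch below decodes the low five bits of a cpsr value into one of the seven modes listed above. The bit encodings come from the ARM architecture rather than from these slides, and the names MODE_BITS, current_mode and is_privileged are made up for the example.

```python
# Mode field encodings held in cpsr bits [4:0] (from the ARM architecture,
# not from the slide text). All names here are illustrative.
MODE_BITS = {
    "User": 0b10000,
    "FIQ": 0b10001,
    "IRQ": 0b10010,
    "Supervisor": 0b10011,
    "Abort": 0b10111,
    "Undef": 0b11011,
    "System": 0b11111,
}


def current_mode(cpsr: int) -> str:
    """Return the mode name selected by the low five bits of a cpsr value."""
    bits = cpsr & 0x1F
    for name, encoding in MODE_BITS.items():
        if encoding == bits:
            return name
    raise ValueError(f"reserved mode encoding {bits:#07b}")


def is_privileged(cpsr: int) -> bool:
    """Every mode except User is privileged."""
    return current_mode(cpsr) != "User"
```

For instance, a cpsr value of 0xD3 decodes to Supervisor mode, the mode entered on reset.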
The ARM Register Set
[Figure: the ARM register set, showing the current visible registers beside the registers banked out in the other modes.]
Current visible registers: r0 to r12, r13 (sp), r14 (lr), r15 (pc), cpsr, and spsr (exception modes only).
Banked registers by mode:
User : r8 to r12, r13 (sp), r14 (lr)
FIQ : r8 to r12, r13 (sp), r14 (lr), spsr
IRQ : r13 (sp), r14 (lr), spsr
SVC : r13 (sp), r14 (lr), spsr
Undef : r13 (sp), r14 (lr), spsr
Abort : r13 (sp), r14 (lr), spsr
Changing mode on an Exception
There are 37 registers in total in the register file. Of those, 20 are
hidden from a program running in user mode. These registers are called
banked registers.
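The count of 37 registers, 20 of them banked, can be checked with a small tally. This is only a sketch: the register names with mode suffixes (r13_irq and so on) are a naming convention chosen for the example, and USER_VISIBLE and BANKED are illustrative names.

```python
# Registers visible in User (and System) mode: r0-r15 plus the cpsr.
USER_VISIBLE = [f"r{i}" for i in range(16)] + ["cpsr"]

# Banked registers: FIQ has its own r8-r14 and spsr; the other exception
# modes each have their own r13, r14 and spsr.
BANKED = {"FIQ": [f"r{i}_fiq" for i in range(8, 15)] + ["spsr_fiq"]}
for suffix in ("irq", "svc", "undef", "abt"):
    BANKED[suffix.upper()] = [f"r13_{suffix}", f"r14_{suffix}", f"spsr_{suffix}"]

banked_total = sum(len(regs) for regs in BANKED.values())
register_file_total = len(USER_VISIBLE) + banked_total

print(banked_total, register_file_total)  # 20 37
```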
Interrupt Masks
Interrupt masks are used to stop specific interrupt requests from
interrupting the processor.
There are two interrupt request levels available on the ARM
processor core: interrupt request (IRQ) and fast interrupt request
(FIQ).
The cpsr has two interrupt mask bits, 7 and 6 (or I and F), which
control the masking of IRQ and FIQ, respectively.
The I bit masks IRQ when set to binary 1, and similarly the F bit
masks FIQ when set to binary 1.
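A minimal sketch of the two mask bits, treating the cpsr as a plain 32-bit integer. The helper names (set_masks, is_irq_masked, is_fiq_masked) are assumptions for illustration; on real hardware the cpsr is changed with the program status register instructions, not by a function call.

```python
I_BIT = 1 << 7  # cpsr bit 7: IRQ is masked when this bit is 1
F_BIT = 1 << 6  # cpsr bit 6: FIQ is masked when this bit is 1


def set_masks(cpsr: int, irq_masked: bool, fiq_masked: bool) -> int:
    """Return a copy of cpsr with the I and F bits set as requested."""
    cpsr &= ~(I_BIT | F_BIT)
    if irq_masked:
        cpsr |= I_BIT
    if fiq_masked:
        cpsr |= F_BIT
    return cpsr & 0xFFFFFFFF


def is_irq_masked(cpsr: int) -> bool:
    return bool(cpsr & I_BIT)


def is_fiq_masked(cpsr: int) -> bool:
    return bool(cpsr & F_BIT)
```

Masking only IRQ on a cpsr of 0x10 gives 0x90; clearing both masks on 0xD3 gives 0x13.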
Condition Flags
Condition flags are updated by comparisons and the result of ALU operations
that specify the S instruction suffix.
For example, if a SUBS subtract instruction results in a register value of zero,
then the Z flag in the cpsr is set. This particular subtract instruction specifically
updates the cpsr.
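The effect of SUBS on the flags can be modelled as below. This is a sketch on 32-bit unsigned integers, not a cycle-accurate model; the function name subs and the returned dictionary are illustrative. It uses the ARM convention that C is set after a subtraction when no borrow occurs.

```python
MASK32 = 0xFFFFFFFF


def subs(rn: int, rm: int):
    """Model SUBS Rd, Rn, Rm: return (result, flags) for 32-bit operands."""
    rn &= MASK32
    rm &= MASK32
    result = (rn - rm) & MASK32
    flags = {
        "N": result >> 31,                          # bit 31 of the result
        "Z": int(result == 0),                      # result is zero
        "C": int(rn >= rm),                         # set when there is no borrow
        "V": ((rn ^ rm) & (rn ^ result)) >> 31,     # signed overflow
    }
    return result, flags
```

Subtracting 1 from 1 leaves zero in the destination and sets Z, which matches the SUBS case described above.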
PIPELINE
A pipeline is the mechanism a RISC processor uses to execute instructions.
Using a pipeline speeds up execution by fetching the next instruction
while other instructions are being decoded and executed.
PIPELINE
In the first cycle the core fetches the ADD instruction from memory.
In the second cycle the core fetches the SUB instruction and decodes the
ADD instruction.
In the third cycle, both the SUB and ADD instructions are moved along
the pipeline.
The ADD instruction is executed, the SUB instruction is decoded, and
the CMP instruction is fetched. This procedure is called filling the
pipeline.
The pipeline allows the core to execute an instruction every cycle.
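The filling of the pipeline described above can be traced with a short simulation. It assumes a three-stage fetch, decode, execute pipeline with no stalls or branches; run_pipeline is a hypothetical helper name.

```python
def run_pipeline(program, cycles):
    """Return, for each cycle, the instruction held in each pipeline stage."""

    def at(index):
        # None marks a stage that holds no instruction in this cycle.
        return program[index] if 0 <= index < len(program) else None

    trace = []
    for cycle in range(cycles):
        trace.append({
            "fetch": at(cycle),
            "decode": at(cycle - 1),
            "execute": at(cycle - 2),
        })
    return trace


for cycle, stages in enumerate(run_pipeline(["ADD", "SUB", "CMP"], 3), start=1):
    print(cycle, stages)
```

In the third cycle the trace shows CMP in fetch, SUB in decode and ADD in execute; from then on one instruction completes per cycle.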
PIPELINE EXECUTION
In the execute stage, the pc always points to the address of the
instruction plus 8 bytes.
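One place where the plus 8 matters is branch encoding: the offset stored in an ARM B instruction is relative to the pc value seen during execute, not to the address of the branch itself. The sketch below assumes ARM state (4-byte instructions) and the 24-bit word offset field of the B instruction; the function names are illustrative.

```python
def pc_during_execute(instr_addr: int) -> int:
    """In ARM state the pc reads as the executing instruction's address + 8."""
    return instr_addr + 8


def branch_offset_field(instr_addr: int, target: int) -> int:
    """Return the 24-bit offset field for a B at instr_addr jumping to target."""
    delta = target - pc_during_execute(instr_addr)
    if delta % 4 != 0:
        raise ValueError("branch target must be word aligned")
    words = delta >> 2  # arithmetic shift, so negative offsets stay negative
    if not -(1 << 23) <= words < (1 << 23):
        raise ValueError("branch target out of range")
    return words & 0xFFFFFF  # two's complement in 24 bits
```

A branch to itself therefore needs an offset of minus two words, and a branch to the instruction two slots ahead needs an offset of zero.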
EXCEPTIONS/INTERRUPTS
Reset vector is the location of the first instruction executed by the processor
when power is applied.
Undefined instruction vector is used when the processor cannot decode an
instruction.
Software interrupt vector is called when you execute a SWI instruction.
Prefetch abort vector is used when the processor attempts to fetch an
instruction from an address without the correct access permissions. The actual
abort occurs in the decode stage.
Data abort vector is similar to a prefetch abort but is raised when an instruction
attempts to access data memory without the correct access permissions.
Interrupt request vector is used by external hardware to interrupt the normal
execution flow of the processor.
Fast interrupt request vector is similar to the interrupt request but is reserved
for hardware requiring faster response times.
THE VECTOR TABLE
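A minimal sketch of the vector table as a lookup, using the standard ARM offsets 0x00 to 0x1C for the vectors described earlier. The names VECTOR_TABLE and vector_address are illustrative, and the high_vectors option assumes a core that supports relocating the table to 0xFFFF0000.

```python
# Offset of each exception vector from the base of the vector table.
VECTOR_TABLE = {
    "Reset": 0x00,
    "Undefined instruction": 0x04,
    "Software interrupt": 0x08,
    "Prefetch abort": 0x0C,
    "Data abort": 0x10,
    "Reserved": 0x14,
    "Interrupt request": 0x18,
    "Fast interrupt request": 0x1C,
}


def vector_address(exception: str, high_vectors: bool = False) -> int:
    """Return the address the core branches to when the exception is taken."""
    base = 0xFFFF0000 if high_vectors else 0x00000000
    return base + VECTOR_TABLE[exception]
```

Each entry is one word wide, so it normally holds a branch or a load into the pc that transfers control to the real handler.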
Core Extensions
The hardware extensions are standard components placed next to the ARM core.
They improve performance, manage resources, and provide extra functionality and
are designed to provide flexibility in handling particular applications.
There are three hardware extensions ARM wraps around the core:
Cache and tightly coupled memory (TCM)
Memory management
Coprocessor interface.
CACHE & TCM
The cache is a block of fast memory placed between main memory and the
core.
ARM has two forms of cache. The first is found attached to the Von
Neumann-style cores. It combines both data and instruction into a single
unified cache. The second, attached to the Harvard-style cores, has
separate caches for data and instruction.
CACHE & TCM
TCM is fast SRAM located close to the core.
[Figure: A simplified Harvard architecture with TCMs.]
MEMORY MANAGEMENT
ARM cores have three different types of memory management hardware:
No extensions providing no protection
Memory protection unit (MPU) providing limited protection
Memory management unit (MMU) providing full protection
COPROCESSORS
Coprocessors can be attached to the ARM processor. A
coprocessor extends the processing features of a core by
extending the instruction set or by providing
configuration registers.
More than one coprocessor can be added to the ARM
core via the coprocessor interface.
THE ARM INSTRUCTION SET
DATA PROCESSING
INSTRUCTIONS
The data processing instructions manipulate data within registers.
They are:
Move instructions
Arithmetic instructions
Logical instructions
Comparison instructions
Multiply instructions
Most data processing instructions can process one of their operands using
the barrel shifter.
If you use the S suffix on a data processing instruction, then it updates
the flags in the cpsr.
Move Instructions
Move Instructions
Barrel Shifter
Move Instructions
Barrel shifter operations
Move Instructions
Logical shift left by one.
Example: a MOVS instruction that shifts register r1 left by one bit.
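A sketch of what MOVS r0, r1, LSL #1 computes, modelling the barrel shifter's logical shift left and the flags the S suffix updates. The function name movs_lsl is illustrative, and only immediate shifts of 1 to 31 are handled.

```python
MASK32 = 0xFFFFFFFF


def movs_lsl(rm: int, shift: int):
    """Model MOVS Rd, Rm, LSL #shift for 1 <= shift <= 31."""
    if not 1 <= shift <= 31:
        raise ValueError("this sketch only handles shifts of 1 to 31")
    rm &= MASK32
    result = (rm << shift) & MASK32
    carry = (rm >> (32 - shift)) & 1  # last bit shifted out of bit 31
    flags = {"N": result >> 31, "Z": int(result == 0), "C": carry}
    return result, flags


result, flags = movs_lsl(0x80000004, 1)
print(hex(result), flags)  # 0x8 {'N': 0, 'Z': 0, 'C': 1}
```

The top bit of r1 falls out of the register and lands in the C flag, while the remaining bits move up one position.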
Arithmetic Instructions
The arithmetic instructions implement addition and subtraction of
32-bit signed and unsigned values.
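The same 32-bit result serves both signed and unsigned arithmetic because the values wrap modulo 2^32. The sketch below models ADD, SUB and RSB (reverse subtract) on Python integers; the function names and the to_signed helper are illustrative.

```python
MASK32 = 0xFFFFFFFF


def add(rn: int, op2: int) -> int:
    """ADD Rd, Rn, op2: Rd = Rn + op2 (wrapping at 32 bits)."""
    return (rn + op2) & MASK32


def sub(rn: int, op2: int) -> int:
    """SUB Rd, Rn, op2: Rd = Rn - op2."""
    return (rn - op2) & MASK32


def rsb(rn: int, op2: int) -> int:
    """RSB Rd, Rn, op2: Rd = op2 - Rn (operands reversed)."""
    return (op2 - rn) & MASK32


def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a two's complement signed value."""
    return value - (1 << 32) if value & 0x80000000 else value
```

For example, rsb(0x77, 0) yields 0xFFFFFF89, which reads as -0x77 when interpreted as signed, so RSB with a zero operand negates a register.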
Arithmetic Instructions
Example 1
Example 2
Arithmetic Instructions
Example 3
Using the Barrel Shifter with Arithmetic Instructions
Logical Instructions
Logical instructions perform bitwise logical operations on
the two source registers.
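As a sketch, the four ARM logical instructions AND, ORR, EOR and BIC can be modelled on 32-bit integers as follows. The Python function names are illustrative (and_ carries a trailing underscore only because and is a reserved word).

```python
MASK32 = 0xFFFFFFFF


def and_(rn: int, rm: int) -> int:
    """AND Rd, Rn, Rm: bitwise AND."""
    return rn & rm & MASK32


def orr(rn: int, rm: int) -> int:
    """ORR Rd, Rn, Rm: bitwise OR."""
    return (rn | rm) & MASK32


def eor(rn: int, rm: int) -> int:
    """EOR Rd, Rn, Rm: bitwise exclusive OR."""
    return (rn ^ rm) & MASK32


def bic(rn: int, rm: int) -> int:
    """BIC Rd, Rn, Rm: clear in Rn every bit that is set in Rm."""
    return rn & ~rm & MASK32
```

BIC is the one without a direct C-style operator: bic(0b1111, 0b0101) gives 0b1010, which makes it handy for clearing individual status bits.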
Example 1
Logical Instructions
Example 2
Comparison Instructions
Comparison Instructions
Example
Multiply Instructions
The multiply instructions multiply the contents of a pair of registers
and, depending upon the instruction, accumulate the result with
another register. The long multiplies accumulate onto a pair of
registers representing a 64-bit value. The final result is placed in a
destination register or a pair of registers.
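A rough model of the 32-bit and long (64-bit) multiplies, using MUL, MLA, UMULL and UMLAL as representatives. Flags and the signed variants are left out, and the Python function names and argument order are illustrative rather than the assembler syntax.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def mul(rm: int, rs: int) -> int:
    """MUL: Rd = low 32 bits of Rm * Rs."""
    return (rm * rs) & MASK32


def mla(rm: int, rs: int, rn: int) -> int:
    """MLA: Rd = low 32 bits of Rm * Rs + Rn (multiply and accumulate)."""
    return (rm * rs + rn) & MASK32


def umull(rm: int, rs: int):
    """UMULL: return (RdLo, RdHi), the unsigned 64-bit product split in two."""
    product = (rm & MASK32) * (rs & MASK32)
    return product & MASK32, product >> 32


def umlal(rd_lo: int, rd_hi: int, rm: int, rs: int):
    """UMLAL: add Rm * Rs onto the 64-bit value held in (RdLo, RdHi)."""
    acc = ((rd_hi << 32) | rd_lo) + (rm & MASK32) * (rs & MASK32)
    acc &= MASK64
    return acc & MASK32, acc >> 32
```

Multiplying 0xF0000002 by 2 with umull returns the pair (0xE0000004, 0x1): the low word goes to one register and the overflow into the other.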
Multiply Instructions
Example 1
Multiply Instructions
Example 2
Branch Instructions
Example 1
Example 2
Load-Store Instructions
Load-store instructions transfer data between memory and
processor registers. There are three types of load-store instructions:
single-register transfer
multiple-register transfer
swap
Load-Store Instructions
Example 1
Single-Register Load-Store
Addressing Modes
Single-Register Load-Store
Addressing Modes
Example 1
Multiple-Register Transfer
Load-store multiple instructions can transfer multiple registers between memory and
the processor in a single instruction. The transfer occurs from a base address register
Rn pointing into memory. Multiple-register transfer instructions are more efficient
than single-register transfers for moving blocks of data around memory.
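A sketch of one load-multiple form, LDMIA (increment after), to show how a single instruction walks memory from the base register Rn. Memory is modelled as a dictionary from word address to value and registers as a dictionary from register number to value; both representations and the function name ldmia are assumptions for illustration. The case where the base register also appears in the register list is not modelled.

```python
def ldmia(memory: dict, regs: dict, base: int, reg_list, writeback: bool = False):
    """Model LDMIA Rn{!}, {reg_list}: load consecutive words starting at Rn."""
    addr = regs[base]
    # The lowest-numbered register is always loaded from the lowest address.
    for reg in sorted(reg_list):
        regs[reg] = memory[addr]
        addr += 4
    if writeback:
        regs[base] = addr  # the "!" suffix updates Rn to the next free word


memory = {0x80010: 1, 0x80014: 2, 0x80018: 3}
regs = {0: 0x80010, 1: 0, 2: 0, 3: 0}
ldmia(memory, regs, base=0, reg_list=[1, 2, 3], writeback=True)
print(regs)
```

After the call r1, r2 and r3 hold 1, 2 and 3, and r0 has advanced by 12 bytes to 0x8001C.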
Stack Operations
Stack Operations
When you use a full stack (F), the stack pointer sp points to an address that is
the last used or full location (i.e., sp points to the last item on the stack).
Stack Operations
If you use an empty stack (E), the sp points to an address that is the
first unused or empty location.
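The difference between a full and an empty stack shows up in the order of the pointer update and the memory access. The sketch below assumes descending stacks (growing toward lower addresses) and one-word pushes; memory is a dictionary keyed by address, and the function names are illustrative.

```python
def push_fd(memory: dict, sp: int, value: int) -> int:
    """Full descending push: move sp down first, then store at sp."""
    sp -= 4
    memory[sp] = value
    return sp


def pop_fd(memory: dict, sp: int):
    """Full descending pop: load from sp, then move sp up."""
    return memory[sp], sp + 4


def push_ed(memory: dict, sp: int, value: int) -> int:
    """Empty descending push: store at sp, then move sp down."""
    memory[sp] = value
    return sp - 4


def pop_ed(memory: dict, sp: int):
    """Empty descending pop: move sp up first, then load from sp."""
    sp += 4
    return memory[sp], sp
```

Starting from sp = 0x80018, a full push stores the item at 0x80014 and leaves sp pointing at it, whereas an empty push stores the item at 0x80018 and leaves sp at the free slot 0x80014.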
Swap Instruction
Swap Instruction
Example 1
Software Interrupt Instruction
Example 1
Program Status Register
Instructions
Coprocessor Instructions
Count Leading Zeros Instruction
The count leading zeros instruction counts the number of zeros between
the most significant bit and the first bit set to 1.
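A one-line model of the count, written for a 32-bit register value. The function name clz mirrors the instruction mnemonic; using Python's int.bit_length is simply a convenient way to find the position of the highest set bit.

```python
MASK32 = 0xFFFFFFFF


def clz(value: int) -> int:
    """Count the zero bits above the highest set bit of a 32-bit value."""
    value &= MASK32
    # bit_length() is the index of the highest set bit plus one (0 for zero).
    return 32 - value.bit_length()


print(clz(0x00010000))  # 15
```

A value of zero has no set bit at all, so the count is the full register width, 32.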
ARM Processor Exceptions and
Modes
Exception Priorities
Interrupts
IRQ & FIQ Exceptions
Interrupt Handling Schemes
A nonnested interrupt handler handles and services individual interrupts
sequentially. It is the simplest interrupt handler.
A nested interrupt handler handles multiple interrupts without a priority
assignment.
A reentrant interrupt handler handles multiple interrupts that can be prioritized.
A prioritized simple interrupt handler handles prioritized interrupts.
A prioritized standard interrupt handler handles higher-priority interrupts in a
shorter time than lower-priority interrupts.
A prioritized direct interrupt handler handles higher-priority interrupts in a
shorter time and goes directly to a specific service routine.
A prioritized grouped interrupt handler is a mechanism for handling interrupts
that are grouped into different priority levels.
A VIC based interrupt service routine shows how the vector interrupt
controller (VIC) changes the design of an interrupt service routine.
Non-nested Interrupt Handler
Nested Interrupt Handler
Reentrant Interrupt Handler
Prioritized Simple Interrupt
Handler
Prioritized Standard Interrupt
Handler
Prioritized Direct Interrupt
Handler
Prioritized Grouped Interrupt
Handler