
CSE413 Microprocessor-Based Systems

Cortex-M0+ CPU

Topics

▪ Cortex-M0+ Processor Core Registers

▪ Memory System and Addressing

▪ Thumb Instruction Set

▪ Reference
▪ DDI0419C Architecture ARMv6-M Reference Manual

ARMv6-M architecture Core Registers
The Cortex-M0+ core implements the ARMv6-M architecture.

All core registers are 32 bits

ARMv6-M architecture Core Registers

▪ R0-R12 - General purpose registers: For data processing (Hold data or addresses)

▪ R13 - Stack pointer (SP): Points to the top element of the stack.
▪ Can refer to one of two SPs
◦ Main Stack Pointer (MSP): used after reset and always in Handler mode
◦ Process Stack Pointer (PSP): can be used only in Thread mode
▪ In Thread mode, MSP or PSP is selected using the SPSEL flag in the CONTROL register.

▪ R14 - Link Register (LR): Holds the return address from functions.
▪ Holds return address when called with Branch & Link instruction (BL).

▪ R15 - program counter (PC): Points to the next instruction to be executed.

ARMv6-M architecture Core Registers
▪ General Purpose Registers:
R0-R12 registers hold data or addresses.
Examples:
MOV R0, R1 ; Copy content of R1 to R0
MOVS R3, #200 ; R3 receives the decimal value 200 (ARMv6-M moves an immediate only with MOVS)
ADDS R2, R0, R1 ; Add R1 to R0, result in R2 (the three-register form is flag-setting in ARMv6-M)

LDR R2, [R1] ; Load R2 from location whose address is in R1

STR R0, [R2] ; Store R0 to location whose address is in R2

BX R0 ; Branch to location whose address is in R0

Note:
A register must be loaded with the address of the memory location before it is used to reference data or
instructions in memory.
ARMv6-M architecture Core Registers
▪ The Program Status Register (PSR) is a 32-bit register that comprises three subregisters:
▪ Application PSR (APSR)
◦ Condition code flag bits Negative, Zero, Carry, oVerflow
▪ Interrupt PSR (IPSR)
◦ When the processor is executing an exception handler, holds exception number of currently executing ISR,
otherwise it holds zeros.
▪ Execution PSR (EPSR)
◦ Thumb state

xPSR
ARMv6-M architecture Core Registers

PRIMASK Register
▪ PRIMASK - Exception mask register (only one bit used)

▪ Bit 0: PM flag
▪ Set to 1 to prevent activation of all exceptions with configurable priority
▪ Use to prevent data race conditions with code needing atomicity
▪ Access using instructions:
✓ CPS Change Processor State
✓ MSR Move to Special Register from general-purpose register
✓ MRS Move to general-purpose Register from Special register
ARMv6-M architecture Core Registers

CONTROL Register
▪ CONTROL - Control register (bits [31:2] are reserved)

▪ Bit 0: nPRIV flag
◦ Defines whether Thread mode is privileged (0) or unprivileged (1)

▪ Bit 1: SPSEL flag
◦ Selects which SP to use when in Thread mode:
◦ 0: MSP
◦ 1: PSP
▪ With OS environment,
◦ Threads use PSP
◦ OS and exception handlers use MSP

ARM Cortex-M Operating Modes
An M-profile processor supports two operating modes:
1. Thread mode
• Is entered on Reset.
• Can be entered as a result of an exception return.
• Can be Privileged or Unprivileged.
• Uses MSP or PSP.
2. Handler mode
• Is entered as a result of an exception.
• Is always Privileged.
• Uses MSP.
The processor must be in Handler mode to issue an exception return.

Privilege levels:
1. Privileged execution: has access to all resources
2. Unprivileged execution: has access to limited resources

[Figure: mode transitions. Reset (power-on reset or reset button) enters Thread mode; starting exception processing enters Handler mode; exception return goes back to Thread mode.]

▪ Which SP is active depends on operating mode, and SPSEL (CONTROL register bit 1)
▪ SPSEL == 0: MSP
▪ SPSEL == 1: PSP
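
The stack-pointer selection rule can be sketched as a small Python model. This is only an illustration of the rule stated on this slide, not processor code; the function name active_stack_pointer and its arguments are hypothetical.

```python
def active_stack_pointer(handler_mode, spsel):
    """Return the name of the stack pointer that R13 currently refers to."""
    if handler_mode:
        return "MSP"  # Handler mode always uses the main stack pointer
    # Thread mode: CONTROL bit 1 (SPSEL) selects the stack pointer
    return "PSP" if spsel & 1 else "MSP"


# Thread mode with SPSEL set, as an OS would configure its threads
thread_sp = active_stack_pointer(handler_mode=False, spsel=1)
```
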
ARM Cortex-M Operating Modes
After reset:
– Processor enters Thread mode, ISR_NUMBER = 0.
– It runs in privileged mode and uses the main SP (MSP).
– The 32-bit value in location 0 is loaded into the SP.
– The 32-bit value in location 4 is loaded into the PC. (This value is called the reset vector.)
– The T bit in the PSR is initialized to 1.
– The LR is initialized to 0xFFFFFFFF.

Whenever the processor is executing an ISR, it switches to Handler mode and ISR_NUMBER specifies which
interrupt is being processed.
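
As a rough illustration of the reset sequence, the following Python sketch reads the initial SP and the reset vector from the first two words of a memory image. The image contents (stack top 0x20003000, handler address 0x000000C1) and the function name reset_state are made-up values for the example only.

```python
import struct


def reset_state(image):
    """Return (initial MSP, reset vector) from a memory image that starts at address 0."""
    # Two little-endian 32-bit words: location 0 -> SP, location 4 -> PC
    initial_sp, reset_vector = struct.unpack_from("<II", image, 0)
    return initial_sp, reset_vector


# Hypothetical image: the reset vector has bit 0 set to indicate Thumb state
image = struct.pack("<II", 0x20003000, 0x000000C1)
initial_sp, reset_vector = reset_state(image)
```
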

Memory Maps For Cortex M0+ and MCU
KL25Z128VLK4 implementation (4G address space):

Region | Size | Address range
Flash | 128 KB | 0x0000_0000 to 0x0001_FFFF
SRAM_L (1/4 of SRAM) | 4 KB | 0x1FFF_F000 to 0x1FFF_FFFF
SRAM_U (3/4 of SRAM) | 12 KB | 0x2000_0000 to 0x2000_2FFF
SRAM_L and SRAM_U together form the 16 KB SRAM.

In the PPB address space, a 4kB block, 0xE000E000 to 0xE000EFFF, is assigned as the SCS that supports:
• processor ID registers.
• general control and configuration, including the vector table base address.
• system handler support, for system interrupts and exceptions.
• an optional system timer, SysTick.
• a Nested Vectored Interrupt Controller (NVIC), that supports up to 32 discrete external interrupts.
• processor debug, optional in ARMv6-M.
• MPU registers, in systems that implement the Unprivileged/Privileged Extension.

Memory Maps For Cortex M0+ and MCU
System Control Space (SCS)
The SCS is a memory-mapped 4KB address space that provides 32-bit registers for configuration,
status reporting and control.
The SCS registers divide into the following groups:
• system control and identification.
• the CPUID processor identification space.
• system configuration and status.
• an optional system timer, SysTick.
• a Nested Vectored Interrupt Controller (NVIC).
• system debug

System Control Space, address range 0xE000E000 to 0xE000EFFF

Memory Address Space
➢ The ARMv6-M architecture uses a single, flat address space of 2^32 = 4G bytes.
➢ Addresses are treated as unsigned numbers, running from 0 to 2^32 - 1.

This address space is regarded as consisting of:

1. 2^30 32-bit words that are word-aligned (address is divisible by 4).
The word whose word-aligned address is A consists of the four bytes with addresses A, A+1, A+2 and A+3.
2. 2^31 16-bit halfwords that are halfword-aligned (address is divisible by 2).
The halfword whose halfword-aligned address is A consists of the two bytes with addresses A and A+1.

• Instruction fetches are always halfword-aligned.
• Data accesses are always naturally aligned.

➢ Normal sequential execution of instructions effectively calculates:
(address_of_current_instruction) + (size_of_executed_instruction)

➢ Address calculation is modulo 2^32: Addresses wrap around if they overflow or underflow the address space.
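
The alignment and wrap-around rules can be checked with a few lines of Python. This is a minimal sketch; the helper names next_address, is_aligned and word_bytes are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # addresses are unsigned 32-bit values


def next_address(addr, size):
    """Address of the next sequential instruction, computed modulo 2^32."""
    return (addr + size) & MASK32


def is_aligned(addr, size):
    """True if addr is naturally aligned for an access of `size` bytes."""
    return addr % size == 0


def word_bytes(addr):
    """Byte addresses A, A+1, A+2, A+3 that make up the word at aligned address A."""
    return [(addr + i) & MASK32 for i in range(4)]
```

For example, next_address(0xFFFFFFFE, 2) wraps around to 0, and 0x1002 is halfword-aligned but not word-aligned.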

Endianness
▪ For a multi-byte value, in what order are the bytes stored in memory?

▪ Little-Endian: Start with least-significant byte

▪ Big-Endian: Start with most-significant byte

‾ Instructions are always little-endian
‾ Loads and stores to Private Peripheral Bus are always little-endian
‾ Data: Depends on implementation, or from reset configuration
Kinetis processors are little-endian
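
The two byte orders can be compared with Python's standard struct module. The value 0x12345678 is an arbitrary example.

```python
import struct

value = 0x12345678

# Byte sequence as it would appear at increasing memory addresses
little = struct.pack("<I", value)  # least-significant byte first
big = struct.pack(">I", value)     # most-significant byte first

# Reading the little-endian bytes back as a 32-bit word recovers the value
restored = struct.unpack("<I", little)[0]
```
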
ARM, Thumb and Thumb-2 Instructions
▪ ARM instructions optimized for resource-rich high-performance computing systems
▪ Deeply pipelined processor, high clock rate, wide 32-bit memory bus

▪ Low-end embedded computing systems are different
▪ Slower clock rates, shallow pipelines
▪ Different cost factors, e.g. code size matters much more, bit and byte operations critical

▪ Modifications to ARM Instruction Set Architecture (ISA) to fit low-end embedded computing

▪ 1995: Thumb instruction set
➢ 16-bit instructions
➢ Reduces memory requirements but reduces performance slightly

▪ 2003: Thumb-2 instruction set
➢ 16-bit + some 32-bit instructions
➢ Improves speed with little memory overhead

▪ CPU decodes instructions based on whether in Thumb state or ARM state - controlled by T bit
Thumb & ARM Instruction Sets
The Thumb instruction set is a subset of the 32-bit ARM instructions.
Thumb instructions are 16 bits long and have corresponding 32-bit ARM instructions.
➢ Cortex-M0+ core implements ARMv6-M Thumb instructions, including a few 32-bit instructions that
use Thumb-2 technology: BL, DMB, DSB, ISB, MRS and MSR.

➢ Cortex-M0+ core is always in Thumb state (T bit = 1). An address loaded into the PC by BX, BLX or POP must be odd (LSB = 1) to indicate Thumb state.
▪ Branching in this way to an even address will cause an exception.

▪ Most 16-bit instructions can only access low registers (R0-R7), but a small number of 16-bit instructions can
access high registers (R8-R15)

▪ Conditional execution only supported for 16-bit branch

▪ Half-word aligned instructions

▪ 32-bit address space

See ARMv6-M Architecture Reference Manual for specifics per instruction (Section A6.7)
Assembler Instruction Format
➢ All instructions have an op-code <operation> field.
➢ Most instructions have reference to one, two or three operands:
<operation> <operand1>, <operand2>, <operand3>
▪ First operand is typically destination (<Rd>)
▪ Other operands are sources (<Rn>, <Rm>)
▪ Examples:
▪ ADDS R0, R1, R2 ; Add registers: R0 = R1 + R2
▪ ANDS R1, R2 ; Bitwise AND: R1 = R1 & R2
▪ CMP R1, R2 ; Compare: Set condition flags based on result of R1 - R2 (R1 and R2 not changed)

➢ Instructions with labels:
<label> <operation> <operand1>, <operand2>, <operand3>
A label is used if the instruction is referenced by other pieces of the code.
▪ Example:
▪ LOOP BEQ LOOP ; Branch back to LOOP while Z == 1 (wait for a non-zero condition). How would such a condition happen to exit LOOP?

Where Can the Operands Be Located?
1. In a general-purpose register R
Reference manual notation: ADDS <Rd>,<Rn>,<Rm>
▪ Destination: Rd
▪ Source: Rm, Rn
▪ Both source and destination: Rdn
▪ Target: Rt
▪ Source for shift amount: Rs
The register number is encoded in 3 bits. Therefore, similar 16-bit Thumb instructions can reach only low registers, R0-R7.

2. In the instruction word as an immediate value
ADD <Rd>,SP,#<imm8>

3. In a condition code flag
B<c> <label>
Refer to Table A6-1 Condition codes in the ARMv6-M Architecture Reference Manual for possible values of <c>.

4. In memory
LDR <Rt>,[<Rn>,<Rm>]
▪ Only for load, store, push and pop instructions
Addressing Modes
➢ The way the instruction references the operand(s) is called the addressing mode.

Two modes of addressing:

1. Immediate

2. Base/Offset/Indexed

Addressing Modes: 1- Immediate
The operand is part of the instruction:
(No need to reference the memory to get the operand)

All versions of the Thumb instruction set:

MOVS <Rd>,#<imm8>

Example: MOVS R0, #3 ; R0 receives the value 3, N and Z flags updated

Addressing Modes: 2- Base/Offset/Indexed
To access an operand, or fetch an instruction, in memory we need 32-bit address.

Problem: A 32-bit address cannot fit in the operand field in a 16 or 32-bit instruction.

Solution: Use a 32-bit register to hold the address to point to memory.

➢ The address value should be prepared in a 32-bit register (base register) before it is used by the
instruction that references the operand.

➢ Optionally, an offset in the instruction that references the operand can be added to or subtracted
from the address in the base register which facilitate referencing elements in arrays and structures.

Memory address syntax (square brackets): [<Rn>,<offset>]

‒ Rn is the base, or index, register
‒ The <offset> can be:
1. An immediate constant, such as <imm3> or <imm8>
2. An index register, <Rm>
Addressing Modes: 2- Base/Offset/Indexed
Forms of base addressing mode:
In each example assume register contents: R1 = 0x1000, R2 = 0x4
In each case, Rn is the base register and offset can be:
• An immediate constant.
• An index register, Rm.
• A shifted index register, such as Rm, LSL #shift.

1. Base/Offset addressing: [Rn{, offset}]
• The memory address is [Rn]+offset. The base register is unchanged.
Examples: LDR R0, [R1] ; load R0 from location 0x1000
LDR R0, [R1, #8] ; load R0 from location 0x1008 (a word load needs a word-aligned address)
LDR R0, [R1, R2] ; load R0 from location 0x1004

2. Pre-indexed addressing: [Rn, offset]!
• The memory address is [Rn]+offset.
• [Rn]+offset is written back into Rn.
Example: LDR R0, [R1, #4]! ; load R0 from location 0x1004, change [R1] to 0x1004

3. Post-indexed addressing: [Rn], offset
• The memory address is [Rn].
• [Rn]+offset is written back into Rn.
Example: LDR R0, [R1], #4 ; load R0 from location 0x1000, change [R1] to 0x1004

Note: ARMv6-M (Cortex-M0+) load/store instructions support only the base/offset form, with an immediate or an unshifted index register. The pre-indexed, post-indexed and shifted-index forms belong to the ARM and Thumb-2 instruction sets.
Thumb Instruction Set

Data Transfer
Data Processing
Flow Control
Other…

Instruction Set Summary
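Instruction Category | Instruction Type | Instructions
Data Transfer | Move | MOV
Data Transfer | Load/Store | LDR, LDRB, LDRH, LDRSH, LDRSB, LDM, STR, STRB, STRH, STM
Data Transfer | Stack operations | PUSH, POP
Data Processing (Arithmetic & Logical Operations) | Add, Subtract, Multiply | ADD, ADDS, ADCS, ADR, SUB, SUBS, SBCS, RSBS, MULS
Data Processing (Arithmetic & Logical Operations) | Compare | CMP, CMN
Data Processing (Arithmetic & Logical Operations) | Logical | ANDS, EORS, ORRS, BICS, MVNS, TST
Data Processing (Arithmetic & Logical Operations) | Shift and Rotate | LSLS, LSRS, ASRS, RORS
Data Processing (Arithmetic & Logical Operations) | Extend | SXTH, SXTB, UXTH, UXTB
Data Processing (Arithmetic & Logical Operations) | Reverse | REV, REV16, REVSH
Flow Control | Branch | B, BL, B{<c>}, BX, BLX
Processor State | Processor State | SVC, CPSID, CPSIE, BKPT
No Operation | No Operation | NOP
Hint | Hint | SEV, WFE, WFI, YIELD
(IT and SETEND are not part of the ARMv6-M instruction set and are omitted.)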
Move (Register or immediate), No reference to memory

▪ MOV:
Copy data from source register to destination register. No update to condition flags

▪ MOVS: Same as MOV instruction but update condition flags; also accepts an 8-bit immediate

Syntax: MOVS Rd, Rm ; copy Rm to Rd, update N and Z flags
MOVS Rd, #imm8 ; copy #imm8 to Rd, update N and Z flags
MVNS Rd, Rm ; copy 1's complement of Rm to Rd, update N and Z flags

Examples: MOV R0, R1 ; copy content of R1 to R0.
MOVS R0, R1 ; copy content of R1 to R0, update N and Z flags
MOVS R0, #200 ; R0 receives the decimal value 200, N=0, Z=0

MOV (Move with shifts)
▪ Move shifted register:
The assembler translates MOV pseudo-instructions with a shifted operand into the equivalent shift or rotate instructions.

[Table: MOV pseudo-instructions and their equivalent instructions]
Load/Store Memory Addressing
➢ Most load and store instructions use base register, possibly with an offset, to point to memory.

➢ A register should be initialized with the proper address before being used as a base register.

Offset can be (curly braces mean the operand is optional):

1. Immediate offset
Syntax: LDR Rt, [Rn {,#±imm}] ; Memory address = content(Rn) ±imm
STR Rt, [Rn {,#±imm}] ; Memory address = content(Rn) ±imm
In ARMv6-M, permitted values of imm for LDR and STR are non-negative multiples of 4: 0-124 for encoding T1, or 0-1020 for encoding T2 (SP-relative).

2. Register (possibly shifted) offset
Syntax: LDR Rt, [Rn, Rm{, LSL #0}] ; Memory address = content(Rn) + content(Rm)
STR Rt, [Rn, Rm{, LSL #0}] ; Memory address = content(Rn) + content(Rm)

Load/Store Addressing Modes Variations
1. Offset addressing:
LDR Rt, [Rn, offset]
The Memory address = content(Rn) + offset.
The base register content is unchanged.

2. Pre-indexed addressing: (! Mark)


LDR Rt, [Rn, offset]!
The Memory address = content(Rn) + offset.
The Memory address is written back to the base register.

3. Post-indexed addressing: (offset outside the brackets)


LDR Rt, [Rn], offset
The Memory address = content(Rn).
content(Rn) + offset is written back to the base register.

Load/Store Addressing Examples
Examples

LDR R2, [R1]; Memory address = content(R1)

LDR R0, [R1, #8] ; Memory address = content(R1) + 8 (offset addressing, R1 not changed)

LDR R0, [R1, #8]! ; Memory address = content(R1) + 8, (pre-index addressing, writeback R1=R1+8)

LDR R0, [R1], #8 ; Memory address = content(R1), then R1=R1+8 (post-index addressing)

The writeback suffix ! means that the value in the base register plus the offset is written back into the base register.
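
The three addressing variations differ only in which address is used for the access and what is written back to the base register. A minimal Python sketch, with hypothetical helper names, returns both values as a pair (memory address, new base register value):

```python
MASK32 = 0xFFFFFFFF


def offset_addressing(base, offset):
    """[Rn, offset]: access at base+offset, base register unchanged."""
    return (base + offset) & MASK32, base


def pre_indexed(base, offset):
    """[Rn, offset]!: access at base+offset, then write that address back."""
    addr = (base + offset) & MASK32
    return addr, addr


def post_indexed(base, offset):
    """[Rn], offset: access at base, then write base+offset back."""
    return base, (base + offset) & MASK32
```

With R1 = 0x1000 and an offset of 8, the three forms give (0x1008, 0x1000), (0x1008, 0x1008) and (0x1000, 0x1008) respectively.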

Labels
A label is a symbol that represents the memory address of an:
• instruction or
• data

The address can be:


1. PC-relative,
2. register-relative, or
3. absolute.

➢ The address of a label is calculated at assembly time relative to the origin of the section where the label is
defined. A reference to a label within the same section can use the PC plus or minus an offset.
This is called PC-relative addressing.

➢ The addresses of labels in other sections are calculated at link time, when the linker has allocated specific
locations in memory for each section.

Loading The Address into a Register
It is often necessary to load an address into a register.

You might have to load the address of:


• a variable
• a string literal, or
• the start location of a jump table.
There are several ways to do this.

Addresses are normally expressed as offsets from:


• a label
• the current PC, or
• other register.

You can load an address into a register either:

• Using the instruction ADR (Address to Register).
• Using the pseudo-instruction ADRL.
• Using the pseudo-instruction MOV32.
• From a literal pool using the pseudo-instruction LDR Rd,=Label.

Loading The Address into a Register (ADR Instruction)
➢ The ADR instruction loads into a register an address that lies within a certain range of the instruction itself.
It adds an immediate value to the word-aligned PC value, Align(PC,4), and writes the result to the destination register.
▪ Syntax:
1. ADR Rd, label ; Normal syntax
label is the label of an instruction or literal data item whose address is to be loaded into Rd.
The assembler calculates the value of label in the instruction as the offset from this instruction.
2. ADD Rd, PC, #<const> ; Alternative syntax
Notice that the alternative syntax is an ADD instruction, not ADR.

➢ label must be within the same code section as the ADR instruction. The assembler faults references to labels
that are out of range.

➢ The available range of addresses for the ADR instruction depends on the instruction set and encoding.
For 16-bit encoding it is 0 to 1020 bytes and word-aligned. (You can use the ALIGN directive to ensure this)
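
The address arithmetic behind the 16-bit ADR encoding can be sketched in Python. The sketch assumes that, in Thumb state, the PC value read by an instruction is the instruction's own address plus 4; the function names adr_target and adr_imm8 are hypothetical.

```python
def adr_target(instr_addr, imm8):
    """Address produced by a 16-bit ADR located at instr_addr."""
    pc = instr_addr + 4            # PC value as seen by the instruction
    return (pc & ~3) + imm8 * 4    # Align(PC, 4) + word offset


def adr_imm8(instr_addr, label_addr):
    """imm8 field an assembler would need, or ValueError if not encodable."""
    base = (instr_addr + 4) & ~3
    offset = label_addr - base
    if offset < 0 or offset > 1020 or offset % 4 != 0:
        raise ValueError("label out of range or not word-aligned")
    return offset // 4
```

The error case corresponds to the assembler faulting a label that is out of range or not word-aligned.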

ADR Instruction Example
Here, the ADR instruction loads the address of the jump table.
[Figure: code listing in which ADR loads the address of a jump table]

Loading The Address into a Register (from Literal Pool)
▪ Load the register Rt from the word in memory specified by label.
▪ Syntax: LDR Rt, label ; label is a PC-relative expression.
The assembler calculates the value of “label” in the instruction as the offset from this instruction to the
data item labeled “label”.

▪ Examples:
LDR R0, LookUpTable ; Load R0 with a word from an address labelled as LookUpTable.

LDR R3, [PC, #100] ; Load R3 with a word from address (PC + 100). Called PC-relative.

▪ Restrictions:
The label LookUpTable must be within 1020 bytes of the current PC and word aligned (multiple of 4).

▪ Condition flags: These instructions do not change the flags.

Load Literal Value into Register
Assembly pseudo-instruction: Not a machine instruction

LDR Rd, =value

Assembler generates code to load Rd with value, and selects best approach depending on value:
▪ Load immediate: MOVS instruction provides 8-bit unsigned immediate operand (0-255)

▪ Load and shift immediate values: Can use MOV, shift, rotate, sign-extend instructions

▪ Load from literal pool: (literal pool is a portion of memory embedded in the code to hold constant values).
1. Place value as a 32-bit literal in the program’s literal pool
2. Use instruction LDR Rd, [PC,#offset] where offset indicates position of literal relative to PC

Loading/Storing Smaller Data Sizes
▪ LDR/STR load register/store into memory a whole word (32 bits).

▪ Some load/store instructions can handle half-word (16 bits) and byte (8 bits):
▪ Syntax: LDRH <Rt>, [<Rn> {, #±<imm>}] ; load half word (16 bits)
LDRB <Rt>, [<Rn> {, #±<imm>}] ; load one byte (8 bits)
STRH <Rt>, [<Rn> {, #±<imm>}] ; store half word (16 bits)
STRB <Rt>, [<Rn> {, #±<imm>}] ; store one byte (8 bits)

▪ Loading a byte or half-word in a 32-bit register requires padding or extension.
What do we put in the upper bits of the register?
▪ Example: How do we extend 0x80 into a full word?
◦ Unsigned? Then 0x80 = +128, so zero-pad to extend to word 0x0000_0080 = 128
◦ Signed? Then 0x80 = −128, so sign-extend to word 0xFFFF_FF80 = −128

Size | Op Code for Signed | Op Code for Unsigned
Byte | LDRSB | LDRB
Half-word | LDRSH | LDRH
In-Register Size Extension
▪ Can also extend byte or half-word already in a register
▪ Signed or unsigned (zero-pad)

▪ How do we extend 0x80 into a full word?


▪ Unsigned? Then 0x80 = +128, so zero-pad to extend to word 0x0000_0080 = 128
▪ Signed? Then 0x80 = −128, so sign-extend to word 0xFFFF_FF80 = −128

Signed Unsigned
Byte SXTB <Rd>,<Rm> UXTB <Rd>,<Rm>
Half-word SXTH <Rd>,<Rm> UXTH <Rd>,<Rm>
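
The four extension operations can be modelled on 32-bit register values in Python. This is an illustrative sketch; the lowercase function names mirror the mnemonics, and as_signed is a hypothetical helper that interprets a 32-bit word as a two's complement number.

```python
def uxtb(x):
    """Zero-extend the low byte to 32 bits."""
    return x & 0xFF


def uxth(x):
    """Zero-extend the low halfword to 32 bits."""
    return x & 0xFFFF


def sxtb(x):
    """Sign-extend the low byte to 32 bits."""
    x &= 0xFF
    if x & 0x80:
        x |= 0xFFFFFF00
    return x


def sxth(x):
    """Sign-extend the low halfword to 32 bits."""
    x &= 0xFFFF
    if x & 0x8000:
        x |= 0xFFFF0000
    return x


def as_signed(word):
    """Interpret a 32-bit word as a signed two's complement integer."""
    return word - (1 << 32) if word & 0x80000000 else word
```
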

Load/Store Multiple
▪ LDM (Load Multiple): Load registers from consecutive locations starting from base register:

▪ Syntax: LDM Rn!, {reglist} ; ! Means update base register after. (by what value?)
LDM Rn, {reglist}
▪ Example: LDM R0 , {R0,R3,R4} ; ! is not used because the base register R0 is in reglist

▪ STM (Store Multiple): Store registers to consecutive locations starting at [base register]
▪ Syntax: STM Rn!, {reglist}
STM Rn, {reglist}
▪ Example: STM R1! , {R2-R4,R6}; R1 will be updated by 16

These instructions do not change the flags.

Load/Store Multiple
The pseudo instructions LDMIA/STMIA, load multiple and increment after/store multiple and
increment after are synonyms for LDM/STM, translated by assembler.

Restrictions:
▪ Rn and reglist are limited to R0-R7.

▪ For LDM, the writeback suffix must be used unless the reglist also contains Rn, in which case
the writeback suffix must not be used.
▪ The value in Rn (address) must be word aligned.

▪ For STM, if Rn appears in reglist, then it must be the first register in the list.

▪ Incorrect Example:
STM R5! , {R4,R5,R6}; Value stored for R5 is unpredictable
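
A small Python model shows how STM and LDM walk through consecutive words and how the base register is updated. Registers are a dict keyed by register number and memory is a dict keyed by word address; both representations, and the function names stm and ldm, are assumptions made for this sketch.

```python
def stm(regs, mem, rn, reglist):
    """STM Rn!, {reglist}: store registers to ascending addresses, then update Rn."""
    addr = regs[rn]
    for r in sorted(reglist):          # lowest register goes to the lowest address
        mem[addr] = regs[r]
        addr += 4
    regs[rn] = addr                    # base advances by 4 * number of registers


def ldm(regs, mem, rn, reglist):
    """LDM Rn{!}, {reglist}: load registers from ascending addresses."""
    addr = regs[rn]
    for r in sorted(reglist):
        regs[r] = mem[addr]
        addr += 4
    if rn not in reglist:              # no writeback when Rn itself is loaded
        regs[rn] = addr


regs = {n: 0 for n in range(8)}
regs.update({1: 0x20000000, 2: 0xA, 3: 0xB, 4: 0xC, 6: 0xD})
mem = {}
stm(regs, mem, 1, [2, 3, 4, 6])        # STM R1!, {R2-R4,R6}
```

After the STM, R1 has advanced by 16, which answers the "by what value?" question: 4 bytes per register in the list.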

Stack Operations
▪ A common usage of stack is to create temporary workspace for subroutines.

o When a subroutine begins, any registers that are to be preserved can be pushed onto the stack.

o When the subroutine ends, it restores the saved registers by popping them in reverse order off
the stack before returning to the caller.

▪ The following code segment shows the general pattern of a subroutine (ARM instruction set syntax):

STMFD SP!, {R0-R12, LR} ; save all registers and return address
{ Code of subroutine } ; subroutine code
LDMFD SP!, {R0-R12, PC} ; restore saved registers and return by loading the saved LR value into PC

In Thumb code, PUSH and POP instructions are used to save and restore registers on the stack when calling functions.

Stack Operations (PUSH & POP)
▪ PUSH: Push some, or all, of registers (R0-R7, LR) to stack, decrement SP by 4 × number of registers saved.

▪ Syntax: PUSH {reglist}

The reglist specifies the set of registers to be saved.

The registers are stored in sequence, the lowest-numbered register to the lowest memory address,
through to the highest-numbered register to the highest memory address.

▪ Example: PUSH {R1, R2, LR} ; Save R1, R2, and LR on stack, decrement SP by 12.
With [SP] denoting the value of SP before the PUSH:
R1 is saved at address [SP]−12
R2 is saved at address [SP]−8
LR is saved at address [SP]−4
Pushing LR saves return address.

Stack Operations (PUSH & POP)
▪ POP: Pop some, or all, of registers (R0-R7, PC) from stack, increment SP by 4 × number of registers restored.

▪ Syntax: POP {reglist}

Registers are always popped in the same fixed order, mirroring PUSH.
The reglist specifies the set of registers to be loaded.
The lowest-numbered register is loaded from the lowest memory address, through to the highest-numbered
register from the highest memory address.

▪ Example: POP {R1, R2, PC} ; Load R1, R2, and PC from stack, increment SP by 12.
With [SP] denoting the value of SP before the POP:
R1 is loaded from address [SP]
R2 is loaded from address [SP]+4
PC is loaded from address [SP]+8

If the PC is specified in reglist, the instruction loads PC from stack and causes a branch to the address
loaded into the PC (return address).
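
The address bookkeeping of PUSH and POP can be traced with a short Python model. Registers and memory are plain dicts and the starting SP value 0x20001000 is arbitrary; these, and the names push, pop and ORDER, are assumptions for the sketch.

```python
ORDER = ["R0", "R1", "R2", "R3", "R4", "R5", "R6", "R7", "LR", "PC"]


def push(regs, mem, reglist):
    """PUSH {reglist}: SP drops by 4 per register, lowest register at lowest address."""
    ordered = sorted(reglist, key=ORDER.index)
    sp = regs["SP"] - 4 * len(ordered)
    for i, name in enumerate(ordered):
        mem[sp + 4 * i] = regs[name]
    regs["SP"] = sp


def pop(regs, mem, reglist):
    """POP {reglist}: load from ascending addresses, then SP rises by 4 per register."""
    ordered = sorted(reglist, key=ORDER.index)
    sp = regs["SP"]
    for i, name in enumerate(ordered):
        regs[name] = mem[sp + 4 * i]
    regs["SP"] = sp + 4 * len(ordered)


regs = {"SP": 0x20001000, "R1": 0x11, "R2": 0x22, "LR": 0x101}
mem = {}
push(regs, mem, ["R1", "R2", "LR"])    # PUSH {R1, R2, LR}
sp_after_push = regs["SP"]
pop(regs, mem, ["R1", "R2", "PC"])     # POP {R1, R2, PC}: saved LR lands in PC
```

Popping into PC the word that was pushed from LR is what turns the POP into a subroutine return.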

Update Condition Codes in APSR?

▪ Condition flags are set by comparison (CMP) and test (TST) instructions.

▪ By default, data processing instructions in the ARM and 32-bit Thumb-2 instruction sets do not affect the condition flags.

▪ To cause the condition codes (flags) in APSR to be updated, a data processing instruction is
suffixed with S, which sets the S bit in the instruction encoding. In ARMv6-M, most data processing instructions are available only in the S form.
▪ Examples:
▪ ADD vs. ADDS
▪ ADC vs. ADCS
▪ SUB vs. SUBS
▪ MOV vs. MOVS

Add Instructions
▪ ADD: Add register, or immediate value, and another register, result in a destination register.

▪ Syntax: ADD{S} {Rd,} Rn, Operand2 ; Add Operand2 and Rn, result in Rd if specified, otherwise in Rn.

Operand2 can be either of the following:

• A constant.

• A register with optional shift.

▪ Examples:

ADDS R2, R0, R1 ; Add R1 to value in R0, result in R2, update condition flags.

ADD R0, R0, R1 ; Add R1 to value in R0, condition flags unchanged.

ADDS R1, #5 ; Add 5 to R1, update condition flags.

Add Instructions (Add with Carry)
▪ ADC: Add a register, or immediate value, to another register together with the carry flag, result in the destination register.

▪ Syntax: ADC{S} {Rd,} Rn, Operand2 ; Add Operand2 and value in Rn, together with the carry flag,
; result in Rd if specified, otherwise in Rn.

Used for multi-precision arithmetic: two instructions add two 64-bit integers.

▪ Example:
Add the 64-bit integer in R3-R2 to the 64-bit integer in R1-R0, and place the result in R5-R4:
ADDS r4, r0, r2 ; adding the least significant words
ADC r5, r1, r3 ; adding the most significant words with carry from previous addition.
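
The carry chain of the ADDS/ADC pair can be followed in Python, where each helper returns the 32-bit result together with the carry-out. The function names are hypothetical and the operands are arbitrary test values.

```python
MASK32 = 0xFFFFFFFF


def adds(a, b):
    """32-bit add; returns (result, carry flag)."""
    total = a + b
    return total & MASK32, total >> 32


def adc(a, b, carry):
    """32-bit add with carry-in; returns (result, carry flag)."""
    total = a + b + carry
    return total & MASK32, total >> 32


def add64(lo_a, hi_a, lo_b, hi_b):
    """64-bit addition built from two 32-bit steps, as in ADDS followed by ADC."""
    lo, carry = adds(lo_a, lo_b)       # least significant words
    hi, _ = adc(hi_a, hi_b, carry)     # most significant words plus carry
    return lo, hi
```

Adding 1 to 0x00000000_FFFFFFFF, for instance, produces a low word of 0 and carries 1 into the high word.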

Subtract Instructions, Like Add instructions
▪ SUB: Subtract register, or immediate value, from a register, result in a destination register.

▪ Syntax: SUB{S} {Rd,} Rn, Operand2 ; Subtract Operand2 from Rn, result in Rd if specified, otherwise in Rn.

Operand2 can be either of the following:

• A constant.

• A register with optional shift.

▪ Examples:

SUBS R2, R0, R1 ; Subtract R1 from value in R0, result in R2, update condition flags.

SUBS R0, R0, R1 ; Subtract R1 from value in R0, update condition flags.

SUBS R7, #5 ; Subtract 5 from R7, update condition flags.

(ARMv6-M provides register and immediate subtraction on general-purpose registers only in the flag-setting SUBS form.)

Subtract Instructions (Subtract with Carry)
▪ SBC: Subtract a register, or immediate value, from another register, together with the borrow (inverted carry flag), result in the destination register.

▪ Syntax: SBC{S} {Rd,} Rn, Operand2 ; Subtract Operand2 from Rn; if the carry flag is clear, the result is reduced
by 1; result in Rd if specified, otherwise in Rn.

Used for multi-precision arithmetic: a sequence of instructions subtracts integers wider than 32 bits.

▪ Example:
Subtract the 96-bit integer in R11-R9 from the 96-bit integer in R8-R6, and place the result in R5-R3:
SUBS r3, r6, r9 ; subtract the least significant words
SBCS r4, r7, r10 ; subtract the middle words with borrow
SBC r5, r8, r11 ; subtract the most significant words with borrow
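
Subtraction with carry can be modelled as adding the bitwise complement of the second operand plus the carry flag, so a clear carry means a borrow. The Python sketch below, with hypothetical function names, chains three such steps for a 96-bit subtraction.

```python
MASK32 = 0xFFFFFFFF


def sbc(a, b, carry):
    """a - b - (1 - carry) on 32 bits; returns (result, carry flag)."""
    total = a + (~b & MASK32) + carry
    return total & MASK32, total >> 32   # carry == 1 means no borrow occurred


def subs(a, b):
    """Plain subtract: behaves like SBC with the carry flag set."""
    return sbc(a, b, 1)


def sub96(a_words, b_words):
    """Subtract two 96-bit values given as three words, least significant first."""
    carry = 1
    result = []
    for a, b in zip(a_words, b_words):
        word, carry = sbc(a, b, carry)
        result.append(word)
    return result
```

For example, subtracting 1 from 2^64 borrows through the two lower words and clears the top word.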

Multiply
▪ Multiply source registers, signed or unsigned, save least significant word of result in destination register,
{update condition flags}.

▪ Syntax: MUL{S} {Rd,} Rn, Rm ; Rd = Rm × Rn; if Rd is omitted, Rn receives the result.
In ARMv6-M the only form is MULS Rd, Rn, Rd: the destination must be the same register as the second source.

Note: the upper word of the 64-bit product is discarded!
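
The truncation can be seen in a short Python model of a flag-setting multiply that keeps only the low 32 bits of the product and derives the N and Z flags from them. The function name muls and its return layout are assumptions of the sketch.

```python
MASK32 = 0xFFFFFFFF


def muls(rn, rm):
    """Return (low 32 bits of rn * rm, N flag, Z flag)."""
    result = (rn * rm) & MASK32        # upper word of the product is dropped
    n = result >> 31
    z = int(result == 0)
    return result, n, z
```

Multiplying 0x10000 by 0x10000 gives 2^32, whose low word is 0, so the Z flag is set even though the true product is non-zero.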

Logical Operations: AND – OR - EOR
▪ Bitwise AND/OR/EOR of two registers, result in destination register, update condition flags.

▪ Syntax: AND{S} Rd, Rn, operand2


ORR{S} Rd, Rn, operand2
EOR{S} Rd, Rn, operand2

▪ Examples: AND R2, R0, R1 ; R1 AND R0, result in R2


ORRS R0, R1 ; R1 OR R0, result in R0, update flags

Logical Operations
▪ Bit clear register: Bitwise AND register and complement of second register, result in destination register,
update condition flags.
▪ Syntax: BIC{S} {Rd,} Rn, operand2 ; Rn AND NOT(operand2), result in Rd, {update condition flags}
▪ Example: BICS R2, R3; if value in R3 is 7, clear the least three bits in R2.

▪ Test selected bits in a register, update condition flags by ANDing two registers, discarding result.

▪ Syntax: TST Rn, operand2; Bitwise AND between Rn AND operand2, update condition flags, discard
result.
▪ Example: TST R2, R3; If value in R3 is 7, then Z flag is set if none of the least three bits in R2 is 1.
Affects Z, N, and C flags.
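
The two examples on this slide can be reproduced with a small Python sketch. It models only the result and the N and Z flags; the function names bics and tst are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def bics(rn, rm):
    """Bit clear: return (rn AND NOT rm, N flag, Z flag)."""
    result = rn & ~rm & MASK32
    return result, result >> 31, int(result == 0)


def tst(rn, rm):
    """Test bits: return (N flag, Z flag) of rn AND rm; the result is discarded."""
    result = rn & rm & MASK32
    return result >> 31, int(result == 0)
```

With a mask of 7, bics clears the three least significant bits, and tst reports Z = 1 exactly when none of those bits is set.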

Compare
▪ Compare a register value to another register, or immediate value, (as if subtracting second value from
first), update condition flags, discard result.

▪ Syntax: CMP Rn, operand2 ; Compare value in Rn to operand2 (as if operand2 is subtracted from Rn),
update flags, discard result. Permitted values of an immediate operand2 are 0-255 for encoding T1.

▪ Example: CMP R5, #5; Compare R5 to 5: (R5 – 5), update flags according to result, discard result:

▪ Compare negative - adds values in Rn and operand2, update condition flags, discards result.

▪ Syntax: CMN Rn, Rm ; Compare Rn to –Rm (Rn + Rm), update flags according to result, discard result.

▪ Example: CMN R5, R6; Add values in R5 and R6, update flags according to result, discard result.
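
How CMP derives the four condition flags from the discarded subtraction can be sketched in Python. The function name cmp_flags and the tuple layout (N, Z, C, V) are choices made for this example.

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(rn, op2):
    """Return (N, Z, C, V) as set by CMP rn, op2 on 32-bit operands."""
    total = rn + (~op2 & MASK32) + 1          # rn - op2 as a two's complement add
    result = total & MASK32
    n = result >> 31                          # sign bit of the result
    z = int(result == 0)                      # result is zero
    c = total >> 32                           # carry set means no borrow
    v = ((rn ^ op2) & (rn ^ result)) >> 31    # signed overflow
    return n, z, c, v
```

Comparing equal values sets Z and C; comparing 3 with 5 sets N and clears C, which is what an unsigned "lower" condition tests for.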

Shift and Rotate: Logical Shift
▪ Logical shift left a register value a number of bits, shift in zeroes on right, result in destination register, update flags.
▪ Syntax: LSLS Rd, Rm, #imm5
LSLS Rdn, Rm

▪ Examples: LSLS R0, R5, #3 ; Multiply value in R5 by 8, result in R0, update condition flags.
LSLS R5, R2 ; If R2 contains the value 3, this instruction multiplies R5 by 8.

▪ Logical shift right a register value a number of bits, shift in zeroes on left, result in destination register, update flags.
▪ Syntax: LSRS Rd, Rm, #imm5
LSRS Rdn, Rm

▪ Examples: LSRS R0, R5, #3 ; Divide unsigned value in R5 by 8, result in R0, update condition flags.
LSRS R5, R2 ; If R2 contains the value 3, this instruction divides the unsigned value in R5 by 8.

Shift and Rotate: Arithmetic Shift
▪ Arithmetic shift right a register value a number of bits, shift in copies of sign bit (to maintain arithmetic sign), result
in destination register, update flags.
▪ Syntax: ASRS Rd, Rm, #imm5
ASRS Rdn, Rm

▪ Example: ASRS R0, R5, #4 ; Divide signed value in R5 by 16, result in R0, update condition flags.

▪ Rotate right a register value a number of bits. The bits that are rotated off the right end are inserted into the
vacated bit positions on the left, result in destination register, update flags.
▪ Syntax: RORS Rdn, Rm

▪ Example: RORS R5, R2 ; if R5 contains 0x00000003 and R2 contains 4, R5 will have the value 0x30000000.
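
The shift and rotate operations can be modelled on 32-bit values in Python for shift amounts from 0 to 31. The function names are hypothetical and flag updates are left out of this sketch.

```python
MASK32 = 0xFFFFFFFF


def lsl(x, n):
    """Logical shift left: zeroes enter on the right."""
    return (x << n) & MASK32


def lsr(x, n):
    """Logical shift right: zeroes enter on the left."""
    return (x & MASK32) >> n


def asr(x, n):
    """Arithmetic shift right: copies of the sign bit enter on the left."""
    x &= MASK32
    if x & 0x80000000:
        x -= 1 << 32                   # reinterpret as a negative number
    return (x >> n) & MASK32


def ror(x, n):
    """Rotate right: bits leaving on the right re-enter on the left."""
    n %= 32
    x &= MASK32
    return ((x >> n) | (x << (32 - n))) & MASK32
```

Shifting left by 3 multiplies by 8, shifting right by 3 divides by 8, and rotating 0x00000003 right by 4 moves the two set bits to the top nibble.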

Shift and Rotate

Reversing Bytes
▪ These instructions have no effect on condition flags.
▪ REV - reverses the byte order in a register.
▪ Syntax: REV Rd, Rm
▪ Example: REV R0, R1 ; if R1 has 0x12345678, then R0 gets 0x78563412.

▪ REV16 - reverses the byte order in each 16-bit halfword in a register.
▪ Syntax: REV16 Rd, Rm
▪ Example: REV16 R0, R1 ; if R1 has 0x12345678, then R0 gets 0x34127856.

▪ REVSH - reverses the byte order in the lower 16-bit halfword of a register, and sign extends the result to 32 bits.
▪ Syntax: REVSH Rd, Rm
▪ Example: REVSH R0, R1 ; if R1 has 0x8012, then R0 gets 0x00001280; if R1 has 0x1280, then R0 gets 0xFFFF8012.
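
The byte-reversal instructions work on whole bytes, not on hexadecimal digits. A minimal Python model, with hypothetical function names, makes this concrete:

```python
MASK32 = 0xFFFFFFFF


def rev(x):
    """Reverse the order of the four bytes in a 32-bit word."""
    return int.from_bytes((x & MASK32).to_bytes(4, "little"), "big")


def rev16(x):
    """Swap the two bytes inside each 16-bit halfword."""
    x &= MASK32
    return ((x & 0x00FF00FF) << 8) | ((x & 0xFF00FF00) >> 8)


def revsh(x):
    """Swap the bytes of the low halfword, then sign-extend to 32 bits."""
    half = ((x & 0xFF) << 8) | ((x >> 8) & 0xFF)
    if half & 0x8000:
        half |= 0xFFFF0000
    return half
```
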

Changing Program Flow - Branches
Normal program flow:
➢ Instructions are stored in memory sequentially.
➢ Instructions are executed sequentially.
➢ The PC is incremented by the size of the current instruction to point to the next instruction.

The need to change program flow:
• IF or SELECT statements
• Loops like FOR or WHILE

A conditional branch instruction is needed to branch to CodeT when Condition is true.

An unconditional branch instruction is needed at the end of CodeF to avoid overrunning CodeT.

[Figure: instructions are stored sequentially in memory (Code1, set condition, conditional branch, CodeF, unconditional branch, CodeT), while the flow of instruction execution takes different branches: after Code1 the condition is tested and execution continues with CodeT if it is true (T) or CodeF if it is false (F).]
Changing Program Flow - Branches
We need to specify a 32-bit address to branch to an instruction in memory.

Problem: A 32-bit instruction address cannot fit in the operand field in a 16- or 32-bit instruction with the
op-code.
Solution: The 32-bit instruction address is formed in one of two ways:

1. Relative to the PC:
Use immediate value in the operand field of the instruction to be added to the PC

2. Use register addressing.

Changing Program Flow – Branches
Unconditional Branches: cause a branch to a target address:

1. Relative to the PC

Syntax: B <label> ; Branch to the instruction labeled <label>.

Addressing is PC-relative: Supports limited address space.
• imm8 (-256 to +254) for encoding T1 (used for conditional branches)
• imm11 (-2048 to +2046) for encoding T2 (used for unconditional branches)
The assembler calculates the required offset from the PC value of the B instruction to this label.
Example: LOOP B LOOP ; infinite loop using PC-relative addressing.

2. In a register.

Syntax: BX <Rm> ; Branch to the instruction whose address is in Rm. Supports full 4 GB address space.
LSB of target address must be set to 1 to ensure continued execution in Thumb state.
Example: BX R0 ; unconditional branch to location whose address is in R0.
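
The offset an assembler has to encode, and whether it fits a given encoding, can be worked out as follows. The sketch assumes that the PC value used by a Thumb branch is the branch's own address plus 4; the function names are hypothetical.

```python
def branch_offset(instr_addr, target_addr):
    """Offset encoded in a PC-relative branch located at instr_addr."""
    return target_addr - (instr_addr + 4)   # PC reads as instruction address + 4


def fits_encoding(offset, imm_bits):
    """True if offset fits imm_bits:'0' as a signed value (8 for T1, 11 for T2)."""
    limit = 1 << imm_bits                   # 256 for imm8, 2048 for imm11
    return offset % 2 == 0 and -limit <= offset <= limit - 2


# LOOP B LOOP: the branch targets itself, so the encoded offset is -4
self_loop = branch_offset(0x100, 0x100)
```
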

Changing Program Flow – Conditional Branches
Conditional Branches: cause a branch to a target address only if cond is satisfied.

Syntax: B<cond> <label> ; <cond> is the condition for branching.

Example: BNE SKIP ; if Z==0 then branch to SKIP, else continue with next instruction.

▪ Append <cond> to branch instruction (B) to make a conditional branch.
Examples: BEQ, BNE, BPL

▪ Full ARM instructions (not Thumb or Thumb-2) support conditional execution of arbitrary instructions.
Changing Program Flow – Subroutines
➢ A subroutine is a block of code that performs a task based on some arguments and
optionally returns a result.

➢ Register usage in subroutine calls:


(The Procedure Call Standard for the ARM Architecture defines how to use registers in
subroutine calls)

• You use branch instructions to call and return from subroutines.

• By convention, you use registers R0 to R3 to pass arguments to subroutines, and R0 to pass a result back to the callers.

• A subroutine that requires more than four inputs uses the stack for the additional inputs.

Changing Program Flow – Subroutine Call
Call
1. BL <label> - branch with link. 32-bit instruction.

Call subroutine at <label>, PC-relative, range limited to PC±16MB (24-bit relative address).

• Save return address in LR.
• Add the PC-relative offset of <label> to PC.

2. BLX <Rm> - branch with link and exchange. 16-bit instruction.

Call subroutine at address in register Rm. Full 4GB address range.

• Save return address in LR.
• Copy Rm to PC.
• LSB of target address copied to PC must be 1 to ensure continued execution in Thumb state.
Changing Program Flow – Subroutine Return
Return
When a subroutine is called, the return address is placed in the link register (LR) by the (BL) or (BLX) instruction.
Therefore, control can be returned to the calling function by executing an unconditional branch via the LR register:
▪ BX LR - Return from subroutine

Problem:
If this subroutine calls another subroutine, then the return address in LR will be overwritten by the second call.

To prevent this, a subroutine that may call another subroutine saves the LR register on the stack with a PUSH {LR}
instruction before calling the other subroutine.

In this case, the return instruction will be POP {PC}, not a BX LR instruction, which pops the return address from the
stack into the program counter.

Note that if other registers were pushed onto the stack, the POP instruction may have a list of multiple registers.

Subroutine Call & Return Example
AREA subrout, CODE, READONLY ; Name this block of code
THUMB ; Cortex-M0+ executes Thumb instructions only
ENTRY ; Mark first instruction to execute
start MOVS r0, #10 ; Set up parameters
MOVS r1, #3
BL doadd ; Call subroutine
stop MOVS r0, #0x18 ; angel_SWIreason_ReportException
LDR r1, =0x20026 ; ADP_Stopped_ApplicationExit
BKPT 0xAB ; Semihosting call on ARMv6-M (ARM state uses SVC #0x123456)
doadd ADDS r0, r0, r1 ; Subroutine code
BX lr ; Return from subroutine
END ; Mark end of file

Special Register Instructions
▪ Move to Register from Special Register
▪ MRS <Rd>, <spec_reg>

▪ Move to Special Register from Register
▪ MSR <spec_reg>, <Rn>

▪ Change Processor State - Modify PRIMASK register
▪ CPSIE i - Interrupt enable (clears PRIMASK)
▪ CPSID i - Interrupt disable (sets PRIMASK)

Other
▪ No Operation - does nothing! - Can be used for code alignment purposes and for timing loops.
▪ NOP

▪ Breakpoint - causes hard fault or debug halt - used to implement software breakpoints
▪ BKPT #<imm8>

▪ Wait for interrupt - Pause program, enter low-power state until a WFI wake-up event occurs (e.g. an
interrupt)
▪ WFI

▪ Wait for event - Permits the processor to enter a low-power state until one of a number of events occurs.
▪ WFE

▪ Supervisor call - Generates a call to a system supervisor. Same as software interrupt.


▪ SVC #<imm>
