
EE 3002.01
Embedded Systems

Lecture 3
ARM Assembly Programming

G. Ablay 1
Outline
ARM Assembly Basics Tutorial:
• Assembling An ARM Program
• ARM Instruction Set
– Memory Access (Data movement)
– Data Processing (Arithmetic, shift, logic, bit)
– Branch and Control (Compare, branch)
• Writing simple examples
• Examples with Keil

Why assembly?
• Why learning assembly language is still important
and relevant:
– Assembly language makes it possible to manipulate
hardware directly
– Understanding processor and memory function
– It eases debugging

The Top Programming Languages
(Figure: top programming languages for embedded systems, IEEE Spectrum 2020.)
https://spectrum.ieee.org/static/interactive-the-top-programming-languages-2020
Assembling An ARM Program
• The design of the machine language encoding is called
the instruction set architecture (ISA)
– Processors supporting ARM's ISA are distributed quite widely,
usually for low-power devices such as cellphones, digital music
players, and handheld game systems
– The iPhone, Kindle and Nintendo DS are all prominent examples
of devices that incorporate an ARM processor

• Other ISAs
– Desktop and laptop processors manufactured by Intel, AMD, and VIA
implement the IA-32 (x86) ISA
– Another well-known ISA is PowerPC, used in automobiles
and gaming consoles (Wii, PlayStation 3, Xbox 360)

Levels of Program Code
C Program (easy to read and write):
int main(void) {
    int i;
    int total = 0;
    for(i = 0; i < 10; i++) {
        total += i;
    }
    while(1); // dead loop
}
→ Compiler →
Assembly Program (harder to read and write than high-level languages):
        MOVS r1, #0
        MOVS r0, #0
        B    check
loop    ADD  r1, r1, r0
        ADDS r0, r0, #1
check   CMP  r0, #10
        BLT  loop
self    B    self
→ Assembler →
Machine Program (hex format and binary code):
2100  0010000100000000
2000  0010000000000000
E001  1110000000000001
4401  0100010000000001
1C40  0001110001000000
280A  0010100000001010
DBFB  1101101111111011
BF00  1011111100000000
E7FE  1110011111111110
Because hex is much more readable and useful than binary, hex code is often used and shown.
The machine program is loaded into the microprocessor and run; the microprocessor reads input devices (sensors, button) and drives output devices (LCD, motor).
Assembling An ARM Program
• The steps to create an executable Assembly language program:

Editor program → myfile.asm → Assembler program → myfile.hex, myfile.eep, myfile.map, myfile.lst, myfile.obj → Download to the ARM's Flash
Because hex is much more readable and useful
than binary – hex code is often used and shown.
ARM Assembler
Anatomy of an Assembly Program:

Assembler Directives

ARM Assembler
Assembler Directives:
• They give the assembler the key information it needs to assemble the source program, such as
declaring constants and symbolic names, defining data layout, allocating memory
space, and specifying the program structure and entry point.
AREA Make a new block of data or code
ENTRY Declare an entry point where the program execution starts
ALIGN Align data or code to a memory boundary
DCB Allocate one or more bytes (8 bits) of data
DCW Allocate one or more halfwords (16 bits) of data
DCD Allocate one or more words (32 bits) of data
DCFS Allocate single-precision (32 bits) floating-point numbers
DCFD Allocate double-precision (64 bits) floating-point numbers
SPACE Allocate a zeroed block of memory
FILL Allocate a block of memory and fill with a given value
EQU Give a symbol name to a numeric constant
RN Give a symbol name to a register
EXPORT Declare a symbol and make it referable by other source files
IMPORT Provide a symbol defined outside the current source file
PROC Declare the start of a procedure
ENDP Designate the end of a procedure
END Designate the end of a source file


Assembler Directives
• Instructions (e.g. ADD, MOV) tell the CPU what to do
• Assembler directives tell the assembler what to do
– AREA
– IMPORT and EXPORT
– END
– DCD, DCW, DCB
– EQU
– INCLUDE

Assembler Directives
• AREA
– AREA directive indicates to the assembler the start of a new
data or code section.
– AREA sectionName, attribute1, attribute2, …
Code:
        AREA myCode, CODE, READONLY
Data:
        AREA myData1, DATA, READWRITE
        AREA myConst, DATA, READONLY
Example:
        AREA MY_PROG,CODE,READONLY
__main
        MOV R4, #6
        ADD R1,R1,R2
        ….
myFunc
        ADD R2,R3,R4
        …
Cortex-M3 memory map (4G address space, each address holds 8 bits):
0xE000 0000 – 0xFFFF FFFF   Cortex-M3 internal peripherals
from 0x6000 0000            FSMC
0x4000 0000 – 0x5FFF FFFF   Peripherals
0x2000 0000 – 0x3FFF FFFF   SRAM (READWRITE areas)
0x0000 0000 – 0x1FFF FFFF   Flash (READONLY areas)
Boundaries: 1G = 0x4000 0000, 2G = 0x8000 0000, 3G = 0xC000 0000, 4G ends at 0xFFFF FFFF
Assembler Directives
• IMPORT and EXPORT
– EXPORT declares a symbol and makes this symbol visible to the
linker.
– IMPORT gives the assembler a symbol that is not defined locally
in the current assembly file (similar to the ‘extern’ keyword in C).
file1.s
; from the main program:
IMPORT MY_FUNC
...
BL MY_FUNC ;call MY_FUNC function
...

file2.s
AREA EXAMPLE, CODE, READONLY
EXPORT MY_FUNC
IMPORT DATA1
MY_FUNC
LDR R1,=DATA1
...
Assembler Directives
DCD, DCW, and DCB
Data sizes: nibble = 4 bits, byte = 8 bits, half-word = 16 bits, word = 32 bits, double-word = 64 bits
• DCB allocates one or more bytes (8-bit) of memory and initializes them.
– Examples:
• MYVALUE DCB 5
• FIBO DCB 1,1,2,3,5,8
• MY_MSG DCB "Hello World!"
• DCW allocates one or more half-words (16-bit) and initializes them.
• MYVALUE DCW 25425
• DCD allocates one or more words (32-bit) and initializes them.
• MYDATA DCD 0x200000, 0x30F5, 5000000
• SPACE reserves a block of memory and fills it with zeros.
• BETA SPACE 255 ;Allocate 255 bytes of zeroed memory space

Assembler Directives
• Example: Storing Fixed Data in Program Memory
AREA Example1, CODE, READONLY
ENTRY

LDR R2, =FIXED_DATA ; point to FIXED_DATA
LDRB R0, [R2] ; load R0 with the byte stored at the address in R2
ADD R1, R1, R0 ; add R0 to R1
HERE B HERE ; stay here forever
AREA LOOKUP_EXAMPLE, DATA, READONLY
FIXED_DATA
DCB 0x55, 0x33, 1, 2, 3, 4, 5, 6
DCD 0x23222120, 0x30
DCW 0x4540, 0x50
END

Assembler Directives
• ALIGN
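– ALIGN is used to align data on a 16-bit or 32-bit boundary: ALIGN 2 advances to
the next 2-byte boundary and ALIGN 4 to the next 4-byte boundary.
(1) No alignment: 0x22 is stored in the byte immediately after 0x55.
DTA     DCB 0x55
        DCB 0x22
        END
(2) ALIGN 2: 0x22 is stored at the next halfword (2-byte) boundary.
DTA     DCB 0x55
        ALIGN 2
        DCB 0x22
        END
(3) ALIGN 4: 0x22 is stored at the next word (4-byte) boundary.
DTA     DCB 0x55
        ALIGN 4
        DCB 0x22
        END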

Assembler Directives
• EQU and RN
– EQU associates a symbolic name with a numeric constant (like #define in C)
COUNT     EQU 0x25
GPIOA_ODR EQU 0x4001080C
          MOV R1, #COUNT ; R1 = 0x25
– RN gives a symbolic name to a register
RESULT    RN R2
          MOV RESULT,#23 ; R2 = 23

Assembler Directives
• INCLUDE
– The INCLUDE directive inserts the contents of another assembly source file
at that point in the current source file.
hFile.inc
GPIOA_CRL EQU 0x40010800
GPIOA_CRH EQU 0x40010804
GPIOA_IDR EQU 0x40010808
GPIOA_ODR EQU 0x4001080C
....

Program.s
          INCLUDE hFile.inc

ARM Instruction Set

ARM Instruction Set
• There are two instruction sets: ARM and Thumb (Thumb-1 and 2).
• The Cortex-M architectures only implement the Thumb
instruction set.

Code comparison (if-then instruction)

ARM Instruction Set
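• ARM Cortex-M devices use the Thumb-2 instruction set, which mixes 16-bit and 32-bit instructions.
– On the smallest cores (Cortex-M0/M0+) most instructions are 16-bit and only six
(BL, DMB, DSB, ISB, MRS, MSR) are 32-bit; Cortex-M3 and above add many more 32-bit instructions
– This mix of 16 and 32-bit instructions improves code density while
maintaining performance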

ARM Instruction Set

ARM Instruction Set
• Instruction grouping based on functionality along with supported
processor architecture information.
Thumb-2 instruction groups: Memory Access, Data Processing, Multiply-Divide, Bit Fields, Branch-Control, Saturating, Packing-Unpacking, Miscellaneous, Floating Point
Figure legend (processor support per group): Cortex M3, M4, M4F; Cortex M4, M4F, M7; Cortex M4F, M7
ARM Instruction Set
• ARM assembly format
– The “label” is used as a reference to an address location. It is
optional.
– The “mnemonic” is the name of the instruction (add, mov, …).
– The number of operands varies, depending on each specific
instruction.
• operand1 is the destination register, and operand2 and operand3
are source operands.
– Everything after semicolon ; is a comment

label mnemonic operand1, operand2, …. ; Comments

ARM Instruction Set
• A simple program: Adding numbers
– We want to add the numbers from 1 to 10
C language
int total;
int i;

total = 0;
for (i = 10; i > 0; i--) {
total += i;
}
ARM assembly
MOV R0, #0 ; R0 accumulates total
MOV R1, #10 ; R1 counts from 10 down to 1
again ADD R0, R0, R1
SUBS R1, R1, #1
BNE again
halt B halt ; infinite loop to stop computation

ARM Registers
• CPUs use many registers to store data temporarily
• 16 registers, each 32-bit (4-byte), for arithmetic and logic operations
• The Program Status Register (xPSR) holds the flags
Instruction Execution Cycle

1. The PC (Program Counter) register contains the value 44
2. The IR (Instruction Register) contains the previous instruction, which has just been
executed
3. The PSR (Program Status Register) contains some arbitrary value denoted by XXX
4. The registers R0, R1, R2 are important because the next instruction that the CPU
executes uses them as input and output
ARM Instruction Set
• Immediate data: prefixed with “#”
MOVS R0, #0x1F ; Set R0 = 0x1F (R0 ← 0x1F)
MOVS R0, #'A' ; Set R0 = 0x41 (ASCII code)
• MOV instruction is used for simple data transfers within the processor.

• Constant definition:
CLK_BA_base EQU 0x50000200 ; 32-bit value
PWRCON EQU 0x00 ; small value (fits in 8 bits)

S is an optional suffix.
An instruction with the S suffix updates the flags (N, Z, C, and V) in the xPSR.
ARM Instruction Set
• Embedded data:
– (LDR Rd, =label is a pseudo-instruction, i.e., the assembler translates it to one or
more actual machine instructions)

LDR R0, =MyData ; Load the memory address of MyData into R0
LDR R1, [R0] ; Load the value at the memory [address] in R0 into R1 (R1 = 0x12345678)
LDR R0, =MyText ; R0 = the starting address of MyText
LDR R1, [R0] ; R1 = 0x61434241 (bytes 'A', 'B', 'C', 'a' in little-endian order)
MyData DCD 0x12345678
MyText DCB "ABCabc0123\n",0 ; Null terminated string

OpCode S Operands

(1) Data Movement Instructions


Read data memory:
LDR (load word), LDRB (load byte), LDRH (load halfword), LDRD (load double-word), LDRSB (load signed
byte), LDRSH (load signed halfword), LDM, LDMDB, LDMFD (load multiple words),
LDREXB, LDREXH, LDREX (load register exclusive with a byte, halfword, and word), LDRT (load with
unprivileged access), POP (load from stack)

Write data memory:
STR (store word), STRB (store byte), STRH (store halfword), STRD (store double-word),
STM, STMDB, STMFD (store multiple words),
STREXB, STREXH, STREX (store register exclusive with a byte, halfword, and word), STRT (store with
unprivileged access), PUSH (store into stack)
(Stores have no signed variants, because no sign extension is needed when writing to memory.)

Data copy instructions:
MOV (move), MOVT (move top), MOVW (move wide, 16-bit immediate), MRS (move from special register),
MSR (move to special register)
ARM Instruction Set: Memory Access
Target memory address = (address of the current instruction + 4) + Offset, because in Thumb state the program counter (PC) reads as the address of the current instruction plus 4.

• ADR: Generates a PC-relative address
ADR R0, MyData ; write the address of MyData (here 0x0000016C) to R0
               ; R0 = 0x0000016C

• LDR (load word) and STR (store word), immediate offset
LDR R0, [R5] ; Loads R0 from the [address] found in R5.
STR R1, [R6] ; Store R1 at the [address] found in R6.
STR R1, [R6, #const] ; const is a constant in the range 0-1020
                     ; (the exact range depends on the instruction encoding),
                     ; memory[R6+const] = R1.
• LDR and STR, register offset
STR R0, [R5, R1] ; Store value of R0 into an [address] equal to
                 ; sum of R5 and R1
LDRSH R1, [R2, R3] ; Load a halfword from the memory address
                   ; specified by (R2 + R3), sign extend to 32-bits
                   ; and write to R1.
ARM Instruction Set: Memory Access
LDR (in the form LDR Rd, =label) and ADR are pseudo-instructions: the assembler translates
them to one or more actual machine instructions when it builds the program into an
executable.

LDR R0, =myData ;load the memory address of label myData into R0
LDR R1, [R0] ;load the value (0x01) at the memory address found in R0 to R1
STR R2, [R0] ;store the value found in R2 (0x19) to the memory address found in R0

Registers: R0 = 0x00010100, R1 = 0x01, R2 = 0x19
Memory:
0x00010108  0x05
0x00010104  0x02  (myText)
0x00010100  0x01, becomes 0x19 after the STR  (myData)
Instruction Set: Memory Access
• LDR, PC-relative address

LDR R0, MyData ; Load R0 with the word of data stored at the address
               ; MyData, R0 = 0x12345678

Note: LDR R0, =255 → MOV R0, #255

• LDM and STM: Load and Store Multiple registers
LDM R0, {R0,R3,R4}
; LDMIA and LDMFD are synonyms for LDM
; R0=memory[R0], R3=memory[R0+4], R4=memory[R0+8]
; (no write-back "!" here, because the base register R0 is also in the register list)

STMIA R1!, {R2-R4,R6}
; memory[R1]=R2, memory[R1+4]=R3, memory[R1+8]=R4,
; memory[R1+12]=R6, and update R1 (R1 = R1 + 16)
Instruction Set: Memory Access
• MOV and MVN: Move and Move NOT.
MOVS R0, #0x000B
; Write value of 0x000B to R0, flags get updated
MOVS R1, #0x0
; Write value of zero to R1, flags are updated
MOV R10, R12
; Write value in R12 to R10, flags are not updated
MOVS R3, #23
; Write value of 23 to R3
MOV R8, SP
; Write value of stack pointer (SP) to R8
MVNS R2, R0 ; R2 ← ~R0
; Write the bitwise inverse of R0 to R2 and update flags

Example: MVN R2, R0
R0: 0101 0011 1010 1111 1101 1010 0110 1011
R2: 1010 1100 0101 0000 0010 0101 1001 0100
ARM Instruction Set: Memory Access
• Example: Assuming R2 has the x value range of 0–6, the program calculates 10
to the power of R2 and stores the result in R3.
AREA lookup_example, CODE, READONLY
ENTRY

LDR R1, =LOOKUP ; point to LOOKUP
LDR R3, [R1, R2, LSL #2] ; R3 = LOOKUP[R2] (each entry is 4 bytes, so the index is scaled by 4)
HERE B HERE ; stay here forever
LOOKUP DCD 1, 10, 100, 1000, 10000, 100000, 1000000
END
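
The table lookup above corresponds to plain array indexing in C. The following is a minimal sketch, not part of the original slides; the function name pow10_lookup and the range check are assumptions added for illustration (the assembly version performs no such check).

```c
#include <stdint.h>

/* Same values as the LOOKUP table in the assembly example: 10^0 .. 10^6. */
static const uint32_t LOOKUP[7] = {1, 10, 100, 1000, 10000, 100000, 1000000};

/* Hypothetical helper: returns 10^x for x in 0..6.
   LOOKUP[x] reads the word at (base + 4*x), which is what
   LDR R3, [R1, R2, LSL #2] does with R1 = base and R2 = x.
   Returns 0 for an out-of-range index (a guard the assembly lacks). */
uint32_t pow10_lookup(uint32_t x)
{
    if (x > 6) {
        return 0;
    }
    return LOOKUP[x];
}
```

For example, pow10_lookup(3) reads the fourth table entry.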

• Example: Write a program that copies the contents of location 0x80 into
location 0x88.
        AREA load_store_ex, CODE, READONLY
        ENTRY
        LDR R2, =0x80 ;R2 = 0x80
        LDR R1, [R2] ;R1 = [0x80]
        LDR R2, =0x88 ;R2 = 0x88
        STR R1, [R2] ;[0x88] = R1
HERE    B HERE ;stay here forever
        END
Instruction Set: Memory Access
Stack figure: memory addresses 0x2000050, 0x2000054, 0x2000058, 0x200005C, 0x2000060, 0x2000064, with SP pointing at 0x2000064. A PUSH does SP = SP – 4; a POP does SP = SP + 4.

• The stack is a memory region within the program/process.
• We use the stack for storing temporary data such as local variables of some
function, environment variables, etc.
• We interact with the stack using PUSH and POP instructions.

• PUSH (store into stack) and POP (load from stack):
PUSH {R0, R4-R7}
; Push R0,R4,R5,R6,R7 onto the stack
PUSH {R2, LR}
; Push R2 and the link-register onto the stack
POP {R0, R6, PC}
; Pop r0,r6 and PC from the stack, then branch to the new PC.
Instruction Set: Memory Access
• Example: PUSH (store into stack) and POP (load from stack):
– Examine SP and the other registers and watch the values of locations
0x20000090–0x200000F0.

AREA Example, CODE, READONLY


ENTRY
LDR R0,=0x112233
LDR R1,=0x455
LDR R2,=0x6677
PUSH {R0} ; store into stack
PUSH {R1} ; store into stack
PUSH {R2} ; store into stack
MOV R0,#0
MOV R1,#0
MOV R2,#0
POP {R2} ; load from stack
POP {R1} ; load from stack
POP {R0} ; load from stack
L1 B L1
END
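
The push/pop sequence can be modeled in C to see why the pops must come in reverse order. This is a sketch under stated assumptions: the array stack_mem, the pointer sp, and the function names are hypothetical, the stack is modeled as full-descending (as on Cortex-M), and no overflow or underflow checks are made.

```c
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical model of a small stack region; sp starts at the top (empty stack). */
static uint32_t stack_mem[STACK_WORDS];
static uint32_t *sp = stack_mem + STACK_WORDS;

/* PUSH: SP = SP - 4 (one word), then store the value at SP. */
void push(uint32_t value) { *--sp = value; }

/* POP: load the value at SP, then SP = SP + 4 (one word). */
uint32_t pop(void) { return *sp++; }

/* Mirrors the assembly example; returns 1 if all three "registers"
   are restored and the stack is empty again. */
int push_pop_demo(void)
{
    uint32_t r0 = 0x112233, r1 = 0x455, r2 = 0x6677;
    push(r0);
    push(r1);
    push(r2);
    r0 = r1 = r2 = 0;          /* overwrite the registers */
    r2 = pop();                /* last pushed, first popped */
    r1 = pop();
    r0 = pop();
    return r0 == 0x112233 && r1 == 0x455 && r2 == 0x6677
        && sp == stack_mem + STACK_WORDS;
}
```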

OpCode S Operands

(2) Arithmetic Instructions

Addition:
ADD, ADC (add with carry)
Subtraction:
SUB, RSB (reverse subtract), SBC (subtract with carry)

Multiplication:
MUL (multiply), MLA (multiply with accumulate), MLS (multiply with subtract), SMULL (signed long multiply),
UMULL (unsigned long multiply), SMLAL (signed long multiply, with accumulate), UMLAL (unsigned long
multiply, with accumulate)
Division: SDIV (signed), UDIV (unsigned)
Saturation: SSAT (signed), USAT (unsigned)
Arithmetic Instruction Set: Data Processing
• ADC, ADD, RSB, SBC, and SUB: Add with carry, Add, Reverse
Subtract, Subtract with carry, and Subtract.

• Example: two instructions that add a 64-bit integer
contained in R0 and R1 to another 64-bit integer contained in
R2 and R3, and place the result in R0 and R1.
[R1:R0] = [R1:R0] + [R3:R2]

ADDS R0, R0, R2 ; add the least significant words (R0 ← R0 + R2) and set the carry flag
ADCS R1, R1, R3 ; add the most significant words with carry (R1 ← R1 + R3 + C)

S is an optional suffix.
An instruction with the S suffix updates the flags (N, Z, C, and V) in the xPSR.
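
The carry propagation in the ADDS/ADCS pair can be checked with a small C model. This is a sketch, not the author's code; the function name add64 and its parameter order are assumptions chosen to match the register names.

```c
#include <stdint.h>

/* Hypothetical model of ADDS R0,R0,R2 followed by ADCS R1,R1,R3.
   [r1:r0] and [r3:r2] are 64-bit values held as two 32-bit halves. */
uint64_t add64(uint32_t r1, uint32_t r0, uint32_t r3, uint32_t r2)
{
    uint32_t low = r0 + r2;                 /* ADDS: low words, wraps modulo 2^32 */
    uint32_t carry = (low < r0) ? 1u : 0u;  /* C flag: set if the low add wrapped */
    uint32_t high = r1 + r3 + carry;        /* ADCS: high words plus carry */
    return ((uint64_t)high << 32) | low;
}
```

With [R1:R0] = 0x1_FFFFFFFF and [R3:R2] = 0x2_00000001, the low-word addition wraps to zero and the carry is added into the high word.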
Arithmetic Instruction Set: Data Processing
• ADC, ADD, RSB, SBC, and SUB: Add with carry, Add, Reverse
Subtract, Subtract with carry, and Subtract.
RSB R7, R7, R7, LSL #3 ; R7 = 8*R7 - R7 = (8-1)*R7
SUB R7, R7, R7, LSL #3 ; R7 = R7-8*R7

• Example: the RSB instruction used to negate a single register
(its 2's complement).
RSBS R7, R7, #0 ; subtract R7 from 0 (R7 ← 0 − R7)

S is an optional suffix.
An instruction with the S suffix updates the flags (N, Z, C, and V) in the xPSR.
NOTE: 2’s Complement
• Why use 2’s Complement
– 2’s complement simplifies hardware
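– Addition and subtraction use the same hardware for signed and unsigned numbers
(as does the low half of a multiplication result)

1’s complement → flip every bit (0 becomes 1 and 1 becomes 0)
2’s complement → add 1 to the 1’s complement
1's complement of 7 (0111) is 1000 (8 when read as unsigned)
2's complement of 7 (0111) is 1001 (9 when read as unsigned), which represents −7 in 4 bits

Example (5-bit):
-9 + 6
10111 + 00110 = 11101 (29 when read as unsigned)
The sign bit is 1, so the result is negative; take the 2's complement to find its magnitude:
Flip : 00010
Add 1 : 00011
Result is −3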
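
The 5-bit example can be reproduced in C by masking to five bits. This is an illustrative sketch; the helper names twos_complement5 and add5 are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: 2's complement of a 5-bit pattern (flip the bits, add 1). */
uint8_t twos_complement5(uint8_t x)
{
    return (uint8_t)((~x + 1u) & 0x1Fu);
}

/* Hypothetical helper: 5-bit addition; any carry out of bit 4 is discarded. */
uint8_t add5(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b) & 0x1Fu);
}
```

Here twos_complement5(9) gives the pattern 10111 for −9, add5 of that pattern and 6 gives 11101, and taking the 2's complement of 11101 gives the magnitude 3.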
Example
Example: Write an ARM assembly program to compute 4 + 5 − 19. Save
the result in R1.

mov r1, #4 ; R1=4
add r1, r1, #5 ; R1=R1+5
sub r1, r1, #19 ; R1=R1-19 (result: -10)
Arithmetic Instruction set: data processing
• MUL(S): Multiply using 32-bit operands, and producing a 32-
bit result.
MULS R0, R2, R0 ; Multiply with flag update, R0 = R2 x R0
MLA R0, R1, R2, R3 ; R0 = (R1 x R2) + R3, multiply with accumulate
MLS R0, R1, R2, R3 ; R0 = R3 – (R1 x R2) , multiply with subtract

Example
Example: Compute 12^3 + 1, and save the result in R3.

; load test values
mov r0, #12
mov r1, #1
; perform the computation
mul r4, r0, r0 ; r4 = 12*12
mla r3, r4, r0, r1 ; r3 = (12*12)*12 + 1 = 1729

Example: Sign extension with LDRSB (load signed byte).

        LDR R0, =val1
        LDRSB R1, [R0] ; read val1, sign extended to 32 bits (R1 = -25)
        LDR R0, =val2
        LDRSB R2, [R0] ; read val2, sign extended to 32 bits (R2 = -2)
        SDIV R3,R1,R2 ; R3 = R1 / R2 = 12 (signed division, truncated toward zero)
HERE    B HERE
val1    DCB -25
val2    DCB -2
(3) Shift, Bit and Logic Instructions
Shift:
LSL (logical shift left), LSR (logical shift right), ASR (arithmetic shift right), ROR (rotate right), RRX (rotate right
with extend)

Logic:
AND (bitwise and), ORR (bitwise or), EOR (bitwise exclusive or), ORN (bitwise or not), MVN (move not)

Bit set/clear: BIC (bit clear), BFC (bit field clear), BFI (bit field insert), CLZ (count leading zeros)
Bit/byte reordering: REV (reverse byte order in a word), RBIT (reverse bit order in a word)
Logic-Bit Instruction Set: Data Processing
• AND, ORR, EOR, and BIC: Logical AND, OR, Exclusive OR, and
Bit Clear.
ANDS R2, R2, R1 (R2 ← R2 AND R1)
ORRS R2, R2, R5 (R2 ← R2 OR R5)
EORS R7, R7, R6 (R7 ← R7 XOR R6)
BICS R0, R0, R5 (R0 ← R0 AND ~R5)
(R5 = 0010 → ~R5 = 1101)

S is an optional suffix.
An instruction with the S suffix updates the flags (N, Z, C, and V) in the xPSR.
Example: Setting and Clearing bits

OR can be used to set specific bit(s) of a byte
      04   0 0 0 0 0 1 0 0
ORR   30   0 0 1 1 0 0 0 0
      34   0 0 1 1 0 1 0 0

AND can be used to clear specific bit(s) of a byte
      35   0 0 1 1 0 1 0 1
AND   0F   0 0 0 0 1 1 1 1
      05   0 0 0 0 0 1 0 1

EOR can be used to toggle specific bit(s) of a byte
      44   0 1 0 0 0 1 0 0
EOR   06   0 0 0 0 0 1 1 0
      42   0 1 0 0 0 0 1 0
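
The same masking operations map directly onto the C bitwise operators. This is a minimal sketch; the function names are hypothetical and chosen to echo ORR, AND, BIC, and EOR.

```c
#include <stdint.h>

/* ORR: set every bit that is 1 in mask. */
uint8_t set_bits(uint8_t value, uint8_t mask)    { return (uint8_t)(value | mask); }

/* AND: keep only the bits that are 1 in mask (clears the rest). */
uint8_t keep_bits(uint8_t value, uint8_t mask)   { return (uint8_t)(value & mask); }

/* BIC: clear every bit that is 1 in mask (value AND NOT mask). */
uint8_t clear_bits(uint8_t value, uint8_t mask)  { return (uint8_t)(value & (uint8_t)~mask); }

/* EOR: toggle every bit that is 1 in mask. */
uint8_t toggle_bits(uint8_t value, uint8_t mask) { return (uint8_t)(value ^ mask); }
```

The three worked examples above correspond to set_bits(0x04, 0x30), keep_bits(0x35, 0x0F), and toggle_bits(0x44, 0x06).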

Example

Example: Write an ARM assembly program to compute NOT(A + B), the NOR of A and B, where
A=0 and B=1. Save the result in R0.

mov r0, #0x0 ; r0 = A
orr r0, r0, #0x1 ; r0 = A OR B
mvn r0, r0 ; r0 = NOT(A OR B)

Shift Instruction Set: Data Processing
• ASR, LSL, LSR, and ROR: Arithmetic Shift Right, Logical Shift
Left, Logical Shift Right, and Rotate Right.
ASRS R7, R5, #9 ; R7 = R5 >> 9, signed
; Arithmetic shift right by 9 bits and update flags
LSRS R4, R5, #6 ; R4 = R5 >> 6, unsigned
; Logical shift right by 6 bits and update flags
LSLS R1, R2, #3 ; R1 = R2 << 3
; Logical shift left by 3 bits with flag update
RORS R4, R5, R6 ; R4 = rotate R5 by R6 bits
; Rotate right by the value in the bottom byte of R6.

B = A << 3; (Left shift 3 bits)
A 1 0 1 0 1 1 0 1
B 0 1 1 0 1 0 0 0

B = A >> 2; (Right shift 2 bits)
A 1 0 1 1 0 1 0 1
B 0 0 1 0 1 1 0 1

LSL by n is equivalent to multiplication by 2^n
LSR by n is equivalent to unsigned division by 2^n
ASR by n is equivalent to signed division by 2^n (rounding toward negative infinity)
ROR: all 32 bits are shifted right simultaneously, and the bits shifted out at the right end re-enter at the left end
Shift Instruction Set: Data Processing
• ASR, LSL, LSR, and ROR: Arithmetic Shift Right, Logical Shift
Left, Logical Shift Right, and Rotate Right.
ASR #3 (n = 3)
moves all bits of a register value right by n bits, and copies of the leftmost bit
(the sign bit) are shifted in at the left end

LSR #3 (n = 3)
moves all bits of a register value right by n bits, and
zeros are shifted in at the left end
Shift Instruction Set: Data Processing
• ASR, LSL, LSR, and ROR: Arithmetic Shift Right, Logical Shift
Left, Logical Shift Right, and Rotate Right.
LSL #3 (n = 3)
moves all bits of a register value left by n bits, and
zeros are shifted in at the right end

• The C language does not provide rotate operations (ROR).

ROR #3 (n = 3)
all 32 bits are shifted right simultaneously, and the n bits shifted out at the right end re-enter at the left end
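
Because C has no rotate operator, a rotate is usually written as two shifts combined with OR. The following is a common idiom shown as a sketch; the name ror32 is an assumption, not a standard library function.

```c
#include <stdint.h>

/* Hypothetical helper: rotate a 32-bit value right by n bits, like ROR.
   The rotate amount is reduced modulo 32, and n == 0 is handled separately
   because shifting a 32-bit value by 32 is undefined in C. */
uint32_t ror32(uint32_t x, unsigned n)
{
    n &= 31u;
    if (n == 0) {
        return x;
    }
    return (x >> n) | (x << (32u - n));
}
```

Many compilers recognize this pattern and emit a single ROR instruction for it, although that depends on the compiler and optimization level.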


Examples
Example: Write an ARM assembly program to compute R1=R2/4.

mov r1, r2, asr #2 ; R1 = R2 >> 2 (arithmetic shift, so the sign is preserved)

Example: Write an ARM assembly program to compute R1=R2+R3*4.

add r1, r2, r3, lsl #2 ; R1 = R2 + (R3 << 2), faster than using a MUL instruction

Example: Write an ARM assembly program to compute R0 = 5 ∗ R1.

add r0, r1, r1, LSL #2 ; R0 = R1 + 4*R1
Bit/Byte Instruction set: data processing

• REV, REV16, and REVSH: Reverse bytes.

REV R3, R7 ; Reverse byte order of value in R7 and write it to R3

Bit/Byte Instruction set: data processing
• REV, REV16, and REVSH: Reverse bytes.
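REV16 R0, R1 ; Reverse byte order of each 16-bit halfword in R1 and write the result to R0

REVSH R0, R5 ; Reverse the bytes of the lower halfword of R5, sign extend to 32 bits, and write the result to R0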
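
The three byte-reversal instructions can be modeled with shifts and masks. This is a sketch for illustration; the function names rev, rev16, and revsh are hypothetical C helpers, not library calls, and the final cast in revsh assumes the usual two's-complement conversion to int16_t.

```c
#include <stdint.h>

/* REV: reverse the order of all four bytes in a word. */
uint32_t rev(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* REV16: reverse the two bytes inside each 16-bit halfword independently. */
uint32_t rev16(uint32_t x)
{
    return ((x >> 8) & 0x00FF00FFu) | ((x << 8) & 0xFF00FF00u);
}

/* REVSH: reverse the bytes of the lower halfword, then sign extend to 32 bits. */
int32_t revsh(uint32_t x)
{
    uint16_t swapped = (uint16_t)(((x & 0xFFu) << 8) | ((x >> 8) & 0xFFu));
    return (int32_t)(int16_t)swapped;
}
```

These operations are typically used to convert between little-endian and big-endian data, for example when handling network byte order.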

Instruction set: data processing
• SXT and UXT: Sign extend and Zero extend.
SXTH R4, R6
; Obtain the lower halfword of the value in R6 and then sign extend to
; 32 bits and write the result to R4.

UXTB R3, R1
; Extract the lowest byte of the value in R1, zero
; extend it, and write the result to R3
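
In C, sign and zero extension happen through casts between integer types of different widths. The sketch below mirrors SXTH and UXTB; the function names are hypothetical, and the cast to int16_t assumes the usual two's-complement behavior.

```c
#include <stdint.h>

/* SXTH: take the lower halfword and sign extend it to 32 bits. */
int32_t sxth(uint32_t x)
{
    return (int32_t)(int16_t)(uint16_t)x;
}

/* UXTB: take the lowest byte and zero extend it to 32 bits. */
uint32_t uxtb(uint32_t x)
{
    return x & 0xFFu;
}
```

Only the low 16 bits (SXTH) or low 8 bits (UXTB) of the source influence the result; the upper bits of the input are ignored.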

(4) Compare and Branch Instructions

Data compare instructions:


CMP (compare), CMN (compare negative), TST (test), TEQ (test equivalent), IT (if-then)

Branch instructions:
B (branch), CBZ (compare and branch on zero), CBNZ (compare and branch on non-zero), TBB (table
branch byte), TBH (table branch halfword)

Subroutine instructions:
BL (branch with link), BLX (branch with link and exchange), BX (branch and exchange)

Branch Instruction Set: Branch and Control
• B, BL, BX, and BLX: Branch instructions.
B loopA ; Branch to loopA -- Unconditional branch

BL funC
; Branch with link (Call) to function funC, return address stored in LR

BX LR ; Return from function call

BLX R0
; Branch with link and exchange (Call) to an address stored in R0

BEQ labelD
; Conditionally branch to labelD
; if last flag setting instruction set the Z flag, else do not branch.

Branch Instruction Set: Branch and Control
• Use of unconditional branch instruction, where `loop1' is the
label used by the program

loop1 LDR R2 , [R1] ; load R2 with data from memory pointed to
                    ; by register R1
...
CBZ R5 , label1 ; Compare R5 with zero. If R5 is zero,
                ; branch to label1
...
B loop1 ; jump to the memory location labeled as loop1
label1
MOV R3 , #0x034

Branch Instruction Set: Branch and Control
Compare Instructions:
CMP R1, R2 ; Compare , set flag after computing R1-R2
CMN R1, R2 ; Compare Negative , set flag after computing R1+R2

TST: Test bits.


TST R0, R1
; Perform bitwise AND of R0 value and R1 value (r0 AND r1),
; condition code flags are updated but result is discarded

CPSR/APSR (Current/Application Program Status Register)


N Z C V
Branch Instruction Set: Branch and Control
Signed Numbers:

8-bit Signed numbers

16-bit Signed numbers

32-bit Signed
numbers
Decimal Binary Hex
-2,147,483,648 10000000000000000000000000000000 80000000
... ... ...
-1 11111111111111111111111111111111 FFFFFFFF
0 00000000000000000000000000000000 00000000
+1 00000000000000000000000000000001 00000001
... ... ...
+2,147,483,647 01111111111111111111111111111111 7FFFFFFF

The same 8-bit addition, read as signed and as unsigned numbers:
    Signed    Binary       Unsigned
    -11       1111 0101    245
  +  +7     + 0000 0111  +   7
     -4       1111 1100    252
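
The point of the table is that the adder produces one bit pattern, and only the interpretation differs. A small C sketch makes this concrete; the helper names add8 and as_signed8 are hypothetical, and the conversion to int8_t assumes two's-complement representation.

```c
#include <stdint.h>

/* 8-bit addition: the bit pattern of the result is the same
   whether the operands are viewed as signed or unsigned. */
uint8_t add8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Reinterpret an 8-bit pattern as a signed (2's complement) number. */
int as_signed8(uint8_t bits)
{
    return (int)(int8_t)bits;
}
```

Adding the patterns 0xF5 and 0x07 yields 0xFC, which reads as 252 unsigned and as −4 signed.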
Condition code suffixes

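Suffix    Meaning                         Flags tested
EQ        == (equal)                      Z = 1
NE        != (not equal)                  Z = 0
HS (CS)   ≥ unsigned (higher or same)     C = 1
LO (CC)   < unsigned (lower)              C = 0
HI        > unsigned (higher)             C = 1 and Z = 0
LS        ≤ unsigned (lower or same)      C = 0 or Z = 1
GE        ≥ signed                        N = V
LT        < signed                        N ≠ V
GT        > signed                        Z = 0 and N = V
LE        ≤ signed                        Z = 1 or N ≠ V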
List of unconditional and conditional branch instructions
(Table: each branch instruction jumps to Label if its condition on the N, Z, C, V flags holds.)

• When two numbers are unsigned integers, branch instructions should use an unsigned condition suffix.
• When these two numbers are signed integers, branch instructions should use a signed condition suffix.
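
The difference between signed and unsigned condition suffixes can be explored with a small C model of the flags produced by CMP. This is a sketch under stated assumptions: the Flags struct and the function names are hypothetical, and only four of the condition codes are modeled.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical model of the N, Z, C, V flags. */
typedef struct { bool n, z, c, v; } Flags;

/* Flags produced by CMP a, b, which computes a - b and discards the result. */
Flags cmp_flags(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;
    Flags f;
    f.n = (r >> 31) != 0;                       /* result is negative */
    f.z = (r == 0);                             /* result is zero */
    f.c = (a >= b);                             /* no borrow occurred */
    f.v = ((((a ^ b) & (a ^ r)) >> 31) != 0);   /* signed overflow */
    return f;
}

bool cond_hi(Flags f) { return f.c && !f.z; }          /* BHI: unsigned >  */
bool cond_ls(Flags f) { return !f.c || f.z; }          /* BLS: unsigned <= */
bool cond_gt(Flags f) { return !f.z && (f.n == f.v); } /* BGT: signed >    */
bool cond_lt(Flags f) { return f.n != f.v; }           /* BLT: signed <    */
```

Comparing 1 with 0xFFFFFFFF shows the distinction: as unsigned numbers 1 is lower, so cond_ls holds, while as signed numbers 0xFFFFFFFF is −1, so cond_gt holds for the same flags.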
Branch Instruction Set: Branch and Control
Conditions:

Example: Write a program that if R0 is equal to R1 then R2 increases.

Flowchart: if R0 == R1 (Yes), increment R2; otherwise (No), skip the increment.

        SUBS R0,R0,R1 ;Z will be set if R0 == R1 (CMP R0, R1 sets the same flags without changing R0)
        BNE next ;if Not Equal jump to next (R0 ≠ R1)
        ADD R2,R2,#1
next
Branch Instruction Set: Branch and Control
Conditions:

Example: Write a program that if R6 < R4 then R2 increases.

Flowchart: if R6 < R4 (Yes), increment R2; otherwise (No), skip the increment.

        SUBS R6,R6,R4 ;C will be set when R6 >= R4 (CMP R6, R4 sets the same flags without changing R6)
        BHS next ;if higher or same (C set), jump to next (R6 ≥ R4)
        ADD R2,R2,#1
next
Branch Instruction Set: Branch and Control
Conditions:

Example: Write a program that if R6 >= R4 then R2 increases.

Flowchart: if R6 >= R4 (Yes), increment R2; otherwise (No), skip the increment.

        SUBS R6,R6,R4 ;C will be set when R6 >= R4 (CMP R6, R4 sets the same flags without changing R6)
        BLO next ;if lower (C clear), jump to next (R6 < R4)
        ADD R2,R2,#1
next
Branch Instruction Set: Branch and Control
Conditions:
Example: IF and ELSE
int main ( )
{
    R7 = 5;
    if (R0 > R1)
        R2++;
    else
        R2--;
    R7++;
}
Flowchart: R7 = 5; if R0 > R1 (Yes), increment R2; otherwise (No), decrement R2; then increment R7.

        MOV R7,#5
        SUBS R1,R1,R0 ;C is set when R1 >= R0 (CMP R1, R0 sets the same flags without changing R1)
        BHS else ;jump to else if C is set (R1 ≥ R0)
        ADD R2,R2,#1
        B next
else
        SUB R2,R2,#1
next
        ADD R7,R7,#1
Branch Instruction Set: Branch and Control
Loop:

Example: Write a program that executes the
instruction “ADD R3,R3,R1” 9 times.
for (init; condition; calculation)
{
    do something
}
Flowchart: R6 = 9; repeat { R3 = R3 + R1; R6 = R6 - 1 } while R6 > 0 (Yes); otherwise (No), stop.

        MOV R6,#9 ;R6 = 9
L1      ADD R3,R3,R1 ;R3=R3+R1
        SUBS R6,R6,#1 ;R6 = R6 - 1
        BNE L1 ;if Z = 0 (R6 ≠ 0), jump to L1
L2      B L2 ;Wait here forever
        END
Examples
Example: Go to the labeled instruction if two numbers are equal.

        CMP r1, r2 ; set flags after computing R1-R2
        BEQ Label ; branch if equal (if flag Z=1) (r1 = r2)

Example: For x = 0x00000001; y = 0xFFFFFFFF; if x>y then z=1, else z=0
(x and y are treated as unsigned numbers).
Write a program (implementation of an if-statement).
        MOV r5, #0x00000001 ; r5 = x
        MOV r6, #0xFFFFFFFF ; r6 = y
        CMP r5, r6 ; set flags after computing R5-R6
        BLS else ; branch if <= unsigned (Z=1 or C=0) (r5 ≤ r6)
then    MOV r7, #1 ; z=1
        B endif ; skip next instruction
else    MOV r7, #0 ; z=0
endif
Examples
Example: Add two long (64-bit) values stored in R2,R1 and R4,R3 (each register holds 32 bits).

adds r5, r1, r3 ; r5=r1+r3 (the least significant words), sets the carry flag
adc r6, r2, r4 ; r6=r2+r4+carry (the most significant words)

Example: Write an ARM assembly program to compute the factorial of a positive
number stored in R0, and save the result in R1.
        mov r0, #5 ; input n = 5
        mov r1, #1 ; product = 1
        mov r3, #1 ; index = 1
loop
        mul r1, r3, r1 ; product = product*index
        cmp r3, r0 ; compare index with the input
        add r3, r3, #1 ; index++ (ADD without S leaves the flags from CMP unchanged)
        bne loop ; if Z=0 (index ≠ n before the increment), then continue
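
The control flow of the factorial loop is easier to verify against a C version with the same structure. This is a sketch, not the author's code: the function name factorial and the guard for n == 0 are assumptions (the assembly assumes a positive input), and the 32-bit result overflows for n greater than 12.

```c
#include <stdint.h>

/* Hypothetical C counterpart of the assembly loop.
   product plays the role of r1, index of r3, n of r0. */
uint32_t factorial(uint32_t n)
{
    uint32_t product = 1;
    uint32_t index = 1;
    int equal;

    if (n == 0) {
        return 1;              /* guard added here; not in the assembly */
    }
    do {
        product *= index;      /* mul r1, r3, r1 */
        equal = (index == n);  /* cmp r3, r0 */
        index++;               /* add r3, r3, #1 (flags unchanged) */
    } while (!equal);          /* bne loop */
    return product;
}
```

The comparison is made before the increment, just as CMP precedes ADD in the assembly, so the loop body runs exactly n times.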
(5) Miscellaneous Instructions
WFE (wait for event), WFI (wait for interrupt),
BKPT (breakpoint), NOP (no operation), SEV (set event),
CPSID (interrupt disable), CPSIE (interrupt enable),
DMB (data memory barrier), DSB (data synchronization barrier), ISB (instruction synchronization barrier)
MRS (move from special register to general-purpose register), MSR (move from general-purpose register to special register)
Instruction set: Miscellaneous
• BKPT: Breakpoint.
BKPT #0 ; Breakpoint with immediate value set to 0x0.

• CPS: Change Processor State.

CPSID i ; Disable all interrupts except NMI (set PRIMASK)


CPSIE i ; Enable interrupts (clear PRIMASK)

• DMB: Data Memory Barrier.


DMB ; Data Memory Barrier

Instruction set: Miscellaneous
• MRS: Move the contents of a special register to a general-
purpose register.
MRS R0, PRIMASK ; Read PRIMASK value and write it to R0

• MSR: Move the contents of a general-purpose register into


the specified special register.
MSR CONTROL, R1
; Read R1 value and write it to the CONTROL register

• WFE: Wait For Event.


WFE ; Wait For Event

• WFI: Wait for Interrupt.


WFI ; Wait For Interrupt
The Most Common Instructions
Instruction  Description                      Instruction  Description
MOV          Move data                        EOR          Bitwise XOR
MVN          Move NOT (bitwise complement)    LDR          Load
ADD          Addition                         STR          Store
SUB          Subtraction                      LDM          Load Multiple
MUL          Multiplication                   STM          Store Multiple
LSL          Logical Shift Left               PUSH         Push on Stack
LSR          Logical Shift Right              POP          Pop off Stack
ASR          Arithmetic Shift Right           B            Branch
ROR          Rotate Right                     BL           Branch with Link
CMP          Compare                          BX           Branch and eXchange
AND          Bitwise AND                      BLX          Branch with Link and eXchange
ORR          Bitwise OR                       SWI/SVC      System Call
Examples

Assembly language template
• Program 1: Assembly language template.

; --- assembler directives ---
        THUMB
        AREA DATA, ALIGN=2
; global variables go here (e.g., GPIO_PORTE_DATA_R EQU 0x400243FC)
        ALIGN
        AREA |.text|, CODE, READONLY, ALIGN=2
        EXPORT Start
; --- user code ---
Start
        ; initialization code goes here
loop
        ; put your main engine here
        B loop
        ; put any subroutines here
        ALIGN ; make sure the end of this section is aligned
        END ; end of file
Example-1
• Addition: The problem: P = Q + R + S
• Let Q = 2, R = 4, S = 5. Assume that r1 = Q, r2 = R, r3 = S. The result P will go in r0.

        ADD r0, r1, r2 ; P=Q+R
        ADD r0, r3 ; P=P+S
Stop    B Stop ; infinite loop
        END ; This ends the program
Example-1
• The complete assembly program that can be assembled to build the executable.

; --- assembler directives ---
        THUMB ; Marks the THUMB mode of operation
StackSize EQU 0x00000100 ; Define stack size of 256 bytes
        AREA STACK , NOINIT , READWRITE , ALIGN =3 ; Allocate space for stack
MyStackMem SPACE StackSize
        AREA RESET , READONLY ; initialize two entries of vector table
        EXPORT __Vectors
__Vectors
        DCD MyStackMem + StackSize ; stack pointer for empty stack
        DCD Reset_Handler ; reset vector
        AREA MYCODE , CODE , READONLY ; user code is placed in CODE AREA
        ENTRY ; starting point of the code execution
        EXPORT Reset_Handler
Reset_Handler ; user code starts from next line
; --- user code ---
        MOV r1, #2 ; load r1 with the constant Q
        MOV r2, #4 ; load r2 with the constant R
        MOV r3, #5 ; load r3 with the constant S
        ADD r0, r1, r2 ; P=Q+R
        ADD r0, r3 ; P=P+S
Stop    B Stop ; infinite loop
        END ; This ends the program
Example-1
• Or you can add a «startup.s» file to execute the assembly code:
example1.s
AREA example1, CODE, READONLY
ENTRY

MOV r1, #2 ; load r1 with the constant Q
MOV r2, #4 ; load r2 with the constant R
MOV r3, #5 ; load r3 with the constant S
ADD r0, r1, r2 ; P=Q+R
ADD r0, r3 ; P=P+S
Stop B Stop ; infinite loop
END ; This ends the program

startup.s
THUMB ; Marks the THUMB mode of operation
StackSize EQU 0x00000100 ; Define stack size of 256 bytes
AREA STACK , NOINIT , READWRITE , ALIGN =3 ; Allocate space for stack
MyStackMem SPACE StackSize
AREA RESET , READONLY ; initialize two entries of vector table
EXPORT __Vectors
__Vectors
DCD MyStackMem + StackSize ; stack pointer for empty stack
DCD Reset_Handler ; reset vector
AREA MYCODE , CODE , READONLY ; user code is placed in CODE AREA
ENTRY ; starting point of the code execution
EXPORT Reset_Handler
Reset_Handler ; user code starts from next line
Example-2
• Addition: This problem is the same as Example 1. P = Q + R + S
• Once again, let Q = 2, R = 4, S = 5 and assume r1 = Q, r2 = R, r3 = S. In this case, we
define the data as symbolic constants with EQU, so the values are fixed before the program runs.

AREA Example2, CODE, READONLY


ENTRY

MOV r1, #Q ; load r1 with the constant Q


MOV r2, #R
MOV r3, #S
ADD r0, r1, r2 ; P=Q+R
ADD r0, r0, r3 ; P=P+S
Stop B Stop ; infinite loop
Q EQU 2 ;Equate the symbolic name Q to the value 2
R EQU 4 ;
S EQU 5 ;
END ; This ends the program

Example-3
• Addition: P = Q + R + S
• Once again, let Q = 2, R = 4, S = 5 and assume r1 = Q, r2 = R, r3 = S.
• In this case, we will put the data in memory as constants before the program runs.
First we use load register LDR to reach the memory location.

AREA Example3, CODE, READONLY


ENTRY

LDR r1, Q ; load r1 with Q


LDR r2, R ; load r2 with R
LDR r3, S ; load r3 with S
ADD r0, r1, r2 ; P=Q+R
ADD r0, r3 ; P=P+S
LDR r4, =P ; load r4 with the address of P
STR r0, [r4] ; store the result (r0) at the address held in r4
Stop B Stop ; infinite loop

AREA Example3, CODE, READWRITE


P SPACE 4 ; save one word of storage
Q DCD 2 ; create variable Q with initial value 2
R DCD 4 ; create variable R with initial value 4
S DCD 5 ; create variable S with initial value 5
END
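
The load, add, store pattern of Example-3 can be cross-checked with a small C model. This is only a sketch under stated assumptions: the array `mem` and the index names `P_IDX`, `Q_IDX`, `R_IDX`, `S_IDX` are hypothetical stand-ins for the words reserved by SPACE and DCD, and the function name `example3_sum` is not from the slides.

```c
#include <stdint.h>

/* Hypothetical word indices standing in for the labels P, Q, R, S. */
enum { P_IDX = 0, Q_IDX = 1, R_IDX = 2, S_IDX = 3 };

/* Mirrors Example-3: load three words, add them, store the result in P. */
int32_t example3_sum(int32_t mem[4]) {
    int32_t r1 = mem[Q_IDX];   /* LDR r1, Q */
    int32_t r2 = mem[R_IDX];   /* LDR r2, R */
    int32_t r3 = mem[S_IDX];   /* LDR r3, S */
    int32_t r0 = r1 + r2;      /* ADD r0, r1, r2 ; P = Q + R */
    r0 = r0 + r3;              /* ADD r0, r3     ; P = P + S */
    mem[P_IDX] = r0;           /* store the result in the word reserved for P */
    return r0;
}
```

With Q = 2, R = 4, S = 5 the model leaves 11 in the word for P, which is the value to look for in r0 when stepping through the assembly version.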

Example: Arithmetic Expressions
• Assume that we wish to evaluate (A + 8B + 7C - 27)/4, where A
= 25, B = 19, and C = 99.

AREA Example4, CODE, READONLY


ENTRY

MOV r0, #25 ;Load register r0 with A which is 25


MOV r1, #19 ;Load register r1 with B which is 19
ADD r0, r0, r1, LSL #3 ;Add 8 x B to A in r0
MOV r1, #99 ;Load register r1 with C which is 99 (reuse of r1)
MOV r2, #7 ;Load register r2 with 7
MLA r0, r1, r2, r0 ;Add 7 x C to total in r0 (=R1*R2+R0)
SUB r0, r0, #27 ;Subtract 27 from the total
MOV r0, r0, ASR #2 ;Divide the total by 4
Stop B Stop ;infinite loop
END
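
As a cross-check of the register values, the same sequence can be written step by step in C. This is a minimal sketch, not part of the slides: the function name `eval_expression` is hypothetical, and the final right shift matches ASR #2 only under the assumption that the total is non-negative (ASR rounds toward minus infinity, while right-shifting a negative signed value in C is implementation-defined).

```c
#include <stdint.h>

/* Evaluates (A + 8B + 7C - 27) / 4 one instruction at a time.
   Assumes the intermediate total is non-negative. */
int32_t eval_expression(int32_t a, int32_t b, int32_t c) {
    int32_t r0 = a;        /* MOV r0, #25 */
    int32_t r1 = b;        /* MOV r1, #19 */
    r0 = r0 + r1 * 8;      /* ADD r0, r0, r1, LSL #3 ; A + 8B */
    r1 = c;                /* MOV r1, #99 (r1 is reused) */
    int32_t r2 = 7;        /* MOV r2, #7 */
    r0 = r1 * r2 + r0;     /* MLA r0, r1, r2, r0 ; add 7C */
    r0 = r0 - 27;          /* SUB r0, r0, #27 */
    r0 = r0 >> 2;          /* MOV r0, r0, ASR #2 ; divide by 4 */
    return r0;
}
```

For A = 25, B = 19, C = 99 the total before the shift is 843, so the result is 210: the shift discards the remainder of 3.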

Example: Logical Operations
• Let's compute the bitwise Boolean expression F = AB + NOT(CD), where AB is a bitwise AND and + is a bitwise OR. Assume that A, B, C, D
are in r1, r2, r3, r4, respectively.
AREA Example5, CODE, READONLY
ENTRY
LDR r1, =2_0000000011111111010101011110000 ; setup A
LDR r2, =2_0000000000000000010101011111111 ; setup B
LDR r3, =2_1100000011111111010101011110000 ; setup C
LDR r4, =2_1111000011111111010101011110000 ; setup D

AND r0, r1, r2 ;r0 = AB


AND r3, r3, r4 ;r3 = CD
MVN r3, r3 ;r3 = NOT(CD)
ORR r0, r0, r3 ;r0 = AB +NOT(CD)
Stop B Stop ;infinite loop
END
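
The four logical instructions map directly onto C's bitwise operators, which gives a quick way to predict r0. A minimal sketch, assuming 32-bit registers; the function name `logic_f` is hypothetical and not from the slides.

```c
#include <stdint.h>

/* Computes F = (A AND B) OR NOT(C AND D) on 32-bit words. */
uint32_t logic_f(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
    uint32_t r0 = a & b;   /* AND r0, r1, r2 ; r0 = AB */
    uint32_t r3 = c & d;   /* AND r3, r3, r4 ; r3 = CD */
    r3 = ~r3;              /* MVN r3, r3     ; r3 = NOT(CD) */
    return r0 | r3;        /* ORR r0, r0, r3 ; r0 = AB + NOT(CD) */
}
```

For example, `logic_f(0xFF, 0x0F, 0xF0, 0xF0)` gives `0xFFFFFF0F`: every bit that is clear in CD is set by the complement, regardless of AB.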

Example
• Write a program that determines the sum of the first five even
numbers.
AREA example, CODE, READONLY
ENTRY
MOV R0 , #0 ; R0 will accumulate the sum
MOV R1 , #2 ; R1 will have the updated even number
MOV R2 , #5 ; the counter for the loop
lbegin
CBZ R2 , lend ; CBZ = compare and branch on zero: branch to lend if R2 == 0, otherwise continue with the next instruction
ADD R0 , R1 ; R0 = R0 + R1
ADD R1 , #2 ; Generate next even number
SUB R2 , #1 ; R2 = R2 - 1
B lbegin ; branch unconditionally to lbegin
lend B lend ; loop finished, R0 holds the sum; stay here (infinite loop)
END

• The first five even numbers are 2, 4, 6, 8, 10. When we add them up,
the sum is 30 (or 1E in hex).
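
The counting loop can be mirrored in C to confirm the expected sum. A minimal sketch: the function name `sum_even` and its `count` parameter are hypothetical (the slide fixes the count at 5), and the `while` test plays the role of CBZ.

```c
/* Sums the first `count` even numbers: 2 + 4 + 6 + ... */
unsigned sum_even(unsigned count) {
    unsigned r0 = 0;       /* MOV R0, #0 ; running sum */
    unsigned r1 = 2;       /* MOV R1, #2 ; current even number */
    unsigned r2 = count;   /* MOV R2, #5 ; loop counter */
    while (r2 != 0) {      /* CBZ R2, lend ; leave the loop when R2 == 0 */
        r0 += r1;          /* ADD R0, R1 */
        r1 += 2;           /* ADD R1, #2 */
        r2 -= 1;           /* SUB R2, #1 */
    }                      /* B lbegin */
    return r0;
}
```

Calling `sum_even(5)` returns 30, matching the value expected in R0.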
Example: function
Write an ARM assembly program with a function call.
Method 1 (not preferred):

C program:
int foo() {
return 2;
}
void main () {
int x=3;
int y=x+foo();
}

ARM Assembly:
foo:
mov r0, #2 ; return value is passed back in r0
mov pc, lr ; pc ← lr (return to the caller)
main:
mov r1, #3 ; x=3
bl foo ; call foo
add r2, r0, r1 ; y=x+foo()

bl foo : (1) jump unconditionally to the function at foo
(2) save the return address (the address of the instruction following bl) in the lr register

Example: function
Write an ARM assembly program with a function call.
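Method 2 (preferred):

C program:
int foo() {
return 2;
}
void main () {
int x=3;
int y=x+foo();
}

ARM Assembly:
foo:
mov r0, #2 ; return value is passed back in r0
bx lr ; pc ← lr (return to the caller)
main:
mov r1, #3 ; x=3
bl foo ; call foo
add r2, r0, r1 ; y=x+foo()

bx lr : jump unconditionally to the address contained in the lr register.
Instead of «mov pc, lr», use «bx lr»: bx also selects the instruction set (ARM or Thumb) from bit 0 of lr, so the return works from either state.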

Example: arrays
Example with Arrays.

C program:
void addNumbers (int a[100]) {
int index;
int sum = 0;
for (index=0; index<100; index++){
sum=sum+a[index];
}
}

ARM Assembly (r0 holds the base address of a):
mov r1, #0 ; sum = 0
mov r2, #0 ; index = 0
loop
ldr r3, [r0, r2, lsl #2] ; r3 = a[index], read from address r0 + 4*index
add r2, r2, #1 ; index++
add r1, r1, r3 ; sum += r3 (the element just loaded)
cmp r2, #100 ; loop condition
bne loop ; if Z=0 (index != 100), repeat the loop
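
The scaled-register addressing in the `ldr` instruction (base address plus index shifted left by 2) can be spelled out in C with explicit byte offsets. This is a sketch under stated assumptions: the function name `sum_words` and the length parameter `n` are hypothetical (the slide fixes the length at 100), elements are 4-byte words, and `n` is at least 1 because the assembly loop tests its condition only after the first pass.

```c
#include <stdint.h>
#include <stddef.h>

/* Sums n 32-bit words using the same address arithmetic as
   ldr r3, [r0, r2, lsl #2]. Assumes n >= 1. */
int32_t sum_words(const int32_t *a, uint32_t n) {
    const unsigned char *r0 = (const unsigned char *)a; /* base address of a */
    int32_t r1 = 0;                                     /* mov r1, #0 ; sum */
    uint32_t r2 = 0;                                    /* mov r2, #0 ; index */
    do {
        /* byte address = base + (index << 2), i.e. base + 4*index */
        int32_t r3 = *(const int32_t *)(r0 + ((size_t)r2 << 2));
        r2 = r2 + 1;                                    /* add r2, r2, #1 */
        r1 = r1 + r3;                                   /* add r1, r1, r3 */
    } while (r2 != n);                                  /* cmp r2, #100 ; bne loop */
    return r1;
}
```

The shift by 2 is what converts an element index into a byte offset for 4-byte integers; an array of bytes would use no shift, and halfwords would use `lsl #1`.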
• ARM Assembly Language Tutorial - Setup ARM Tools

Assembling An ARM Program
• ARM Software Tool
– MDK-ARM (Keil µvision5)

https://www.youtube.com/watch?v=Sm6v9UyhCkA

Create «New uVision Project»

Create a new folder and write the project name.

It is important that each project is in a separate folder.
Select «STM32F103C8»

Click «OK»
Add «Asm» file

Write your assembly code here.

Then: Translate → Build → (if no errors) Start/Stop Debug


Click «Options for Target» ->
«Debug» -> «Use Simulator»

Click «Step Over» to run step-by-step

See changes in «Registers»

See changes in «Memory» (4 bytes)
