Lecture 3 ARM Assembly
Embedded Systems
Lecture 3
ARM Assembly Programming
G. Ablay 1
Outline
ARM Assembly Basics Tutorial:
• Assembling An ARM Program
• ARM Instruction Set
– Memory Access (Data movement)
– Data Processing (Arithmetic, shift, logic, bit)
– Branch and Control (Compare, branch)
• Writing simple examples
• Examples with Keil
Why assembly?
• Why learning assembly language is still important
and relevant:
– Assembly language makes it possible to manipulate
hardware directly
– Understanding processor and memory function
– It eases debugging
The Top Programming Languages: Embedded Systems
Source: https://spectrum.ieee.org/static/interactive-the-top-programming-languages-2020
Assembling An ARM Program
• The design of the machine language encoding is called
the instruction set architecture (ISA)
– Processors supporting ARM's ISA are very widely used, usually in
low-power devices such as cellphones, digital music players, and
handheld game systems
– The iPhone, Kindle and Nintendo DS are all prominent examples
of devices that incorporate an ARM processor
• Other ISAs
– Desktop and laptop processors manufactured by Intel, AMD, and
VIA support IA-32
– Another well-known ISA is the PowerPC, used in automobiles
and gaming consoles (Wii, PlayStation 3, Xbox 360)
Levels of Program Code
C program (easy to read and write):
    int main(void) {
        int i;
        int total = 0;
        for (i = 0; i < 10; i++) {
            total += i;
        }
        while (1); // dead loop
    }
Compiler ↓
Assembly program (harder than high-level languages):
            MOVS r1, #0
            MOVS r0, #0
            B    check
    loop    ADD  r1, r1, r0
            ADDS r0, r0, #1
    check   CMP  r0, #10
            BLT  loop
    self    B    self
Assembler ↓
Machine program (hex format and binary code):
    2100    0010000100000000
    2000    0010000000000000
    E001    1110000000000001
    4401    0100010000000001
    1C40    0001110001000000
    280A    0010100000001010
    DBFB    1101101111111011
    BF00    1011111100000000
    E7FE    1110011111111110
Load and run: the machine program is loaded into the microprocessor, which reads the input devices (sensors, button) and drives the output devices (LCD, motor).
Assembling An ARM Program
• The steps to create an executable assembly language program:
Editor program → myfile.asm → Assembler program → Download to ARM's FLASH
Because hex is much more readable and useful
than binary – hex code is often used and shown.
ARM Assembler
Anatomy of an Assembly Program:
Assembler Directives
Assembler Directives:
• They provide the key information needed to assemble the source program, such as
declaring constants and symbolic names, defining data layout, allocating memory
space, and specifying the program structure and entry point.
Directive   Purpose
AREA        Make a new block of data or code
ENTRY       Declare an entry point where the program execution starts
ALIGN       Align data or code to a memory boundary
DCB         Allocate one or more bytes (8 bits) of data
DCW         Allocate one or more halfwords (16 bits) of data
DCD         Allocate one or more words (32 bits) of data
DCFS        Allocate single-precision (32 bits) floating-point numbers
DCFD        Allocate double-precision (64 bits) floating-point numbers
SPACE       Allocate a zeroed block of memory
FILL        Allocate a block of memory and fill with a given value
EQU         Give a symbol name to a numeric constant
RN          Give a symbol name to a register
EXPORT      Declare a symbol and make it referable by other source files
IMPORT      Provide a symbol defined outside the current source file
Assembler Directives
• AREA
– The AREA directive indicates to the assembler the start of a new
data or code section.
– AREA sectionName, attribute1, attribute2, …
Code: AREA myCode, CODE, READONLY
Data: AREA myData1, DATA, READWRITE
Example code section:
__main
        MOV R4, #6
        ADD R1,R1,R2
        ….
myFunc
        ADD R2,R3,R4
        …
Memory map (each location is 8 bits wide, 4G locations in total):
Address range                Region
0xE000 0000 - 0xFFFF FFFF    Cortex-M3 internal peripherals
0x6000 0000 - 0xDFFF FFFF    (0xC000 0000 marks 3G)
0x4000 0000 - 0x5FFF FFFF    Peripherals (0x4000 0000 marks 1G)
0x2000 0000 - 0x3FFF FFFF    SRAM (READWRITE sections)
0x0000 0000 - 0x1FFF FFFF    Flash (READONLY sections)
Assembler Directives
• IMPORT and EXPORT
– EXPORT declares a symbol and makes it visible to the linker.
– IMPORT tells the assembler about a symbol that is not defined
in the current assembly file (similar to the 'extern' keyword in C).
file1.s:
        ; from the main program:
        IMPORT MY_FUNC
        ...
        BL MY_FUNC ; call MY_FUNC function
        ...
file2.s:
        AREA EXAMPLE, CODE, READONLY
        EXPORT MY_FUNC
        IMPORT DATA1
MY_FUNC
        LDR R1,=DATA1
        ...
Assembler Directives
DCD, DCW, and DCB
• Data sizes: nibble = 4 bits, byte = 8 bits, half-word = 16 bits, word = 32 bits,
double-word = 64 bits
• DCB allocates bytes (8-bit) of memory and initializes them.
– Examples:
• MYVALUE DCB 5
• FIBO DCB 1,1,2,3,5,8
• MY_MSG DCB "Hello World!"
• DCW allocates half-words (16-bit) and initializes them.
• MYVALUE DCW 25425
• DCD allocates words (32-bit) and initializes them.
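The storage each directive reserves can be checked with a rough C analogue. This is only a sketch: the array and function names are made up for illustration, and the 32-bit value is a hypothetical DCD example that does not appear on the slide.

```c
#include <stddef.h>
#include <stdint.h>

/* C counterparts of the directive examples (names are illustrative only). */
static const uint8_t  fibo[]     = {1, 1, 2, 3, 5, 8}; /* FIBO    DCB 1,1,2,3,5,8 */
static const uint16_t my_value_h = 25425;              /* MYVALUE DCW 25425       */
static const uint32_t my_word    = 0x12345678u;        /* hypothetical DCD value  */

/* Number of bytes each definition occupies in memory. */
size_t fibo_size_bytes(void)     { return sizeof fibo; }
size_t halfword_size_bytes(void) { return sizeof my_value_h; }
size_t word_size_bytes(void)     { return sizeof my_word; }
```

Six DCB values take six bytes, one DCW value takes two bytes, and one DCD value takes four bytes.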
Assembler Directives
• Example: Storing Fixed Data in Program Memory
AREA Example1, CODE, READONLY
ENTRY
Assembler Directives
• ALIGN
– ALIGN is used to align data on a 32-bit or 16-bit boundary.
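What aligning to a boundary means for an address can be sketched in C. The helper name align_up is an assumption for illustration; it is not an assembler feature, only the arithmetic the assembler performs when it pads to the next boundary.

```c
#include <stdint.h>

/* Round addr up to the next multiple of boundary.
   boundary must be a power of two (2 for halfword, 4 for word alignment). */
uint32_t align_up(uint32_t addr, uint32_t boundary)
{
    return (addr + boundary - 1u) & ~(boundary - 1u);
}
```

For example, align_up(0x20000001, 4) gives 0x20000004, while an address that is already word-aligned is returned unchanged.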
Assembler Directives
• EQU and RN
– EQU associates a symbolic name with a numeric constant (like #define in C)
COUNT EQU 0x25
GPIOA_ODR EQU 0x4001080C
MOV R1, #COUNT ; R1 = 0x25
– RN associates a symbolic name with a register
Assembler Directives
• INCLUDE
– The INCLUDE directive includes an assembly source file within
another source file.
hFile.inc:
GPIOA_CRL EQU 0x40010800
GPIOA_CRH EQU 0x40010804
GPIOA_IDR EQU 0x40010808
GPIOA_ODR EQU 0x4001080C
....
Program.s:
        include "hFile.inc"
ARM Instruction Set
• There are two instruction sets: ARM and Thumb (Thumb-1 and 2).
• The Cortex-M architectures only implement the Thumb
instruction set.
ARM Instruction Set
• ARM Cortex-M devices use the Thumb-2 instruction set.
– Thumb-2 mixes 16-bit and 32-bit instructions. On the smallest cores
(e.g., Cortex-M0), all but six instructions are 16-bit.
– This mix of 16 and 32-bit instructions improves code density while
maintaining performance
ARM Instruction Set
• Instruction grouping based on functionality along with supported
processor architecture information.
Thumb-2 instruction groups: Memory Access, Data Processing, Multiply-Divide,
Bit Fields, Branch-Control, Saturating, Packing-Unpacking, Miscellaneous,
Floating Point
Architecture legend: Cortex M3, M4, M4F / Cortex M4, M4F, M7 / Cortex M4F, M7
ARM Instruction Set
• ARM assembly format:
label mnemonic operand1, operand2, operand3 ; comments
– The "label" is used as a reference to an address location. It is
optional.
– The "mnemonic" is the name of the instruction (add, mov, …).
– The number of operands varies, depending on each specific
instruction.
• operand1 is the destination register, and operand2 and operand3
are source operands.
– Everything after a semicolon (;) is a comment
ARM Instruction Set
• A simple program: Adding numbers
– We want to add the numbers from 1 to 10
C language
int total;
int i;
total = 0;
for (i = 10; i > 0; i--) {
total += i;
}
ARM assembly
MOV R0, #0 ; R0 accumulates total
MOV R1, #10 ; R1 counts from 10 down to 1
again ADD R0, R0, R1
SUBS R1, R1, #1
BNE again
halt B halt ; infinite loop to stop computation
ARM Registers
• CPUs use many registers to store data temporarily
• 16 registers, each 32-bit (4-byte), for arithmetic and logic operations
• A status register holds the flags
Instruction Execution Cycle
• Constant definition:
CLK_BA_base EQU 0x50000200 ; 32-bits
PWRCON EQU 0x00 ; 8-bits?
• S is an optional suffix: an instruction with the S suffix updates the xPSR
flags (N, Z, C, and V).
ARM Instruction Set
• Embedded data:
– (LDR Rd, =value is a pseudo-instruction, i.e., the assembler translates it
into one or more actual machine instructions)
Format: OpCode{S} Operands
LDR R0, =myData ; load the memory address of label myData into R0
LDR R1, [R0] ; load the value (0x01) at the memory address found in R0 into R1
STR R2, [R0] ; store the value found in R2 (0x19) to the memory address found in R0
Registers:
R0    0x00010100
R1    0x01
R2    0x19
Memory:
Address       Value          Label
0x00010108    0x05
0x00010104    0x02           myText
0x00010100    0x01 → 0x19    myData
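The load and store sequence above behaves like pointer access in C. The following is a minimal sketch under that analogy; the struct and function names are hypothetical, and a local variable stands in for the memory word labeled myData.

```c
#include <stdint.h>

typedef struct {
    uint32_t r1;   /* value loaded into R1 */
    uint32_t mem;  /* memory word at myData after the store */
} load_store_result;

/* initial is the word stored at myData; r2 is the value to store. */
load_store_result load_then_store(uint32_t initial, uint32_t r2)
{
    uint32_t my_data = initial;   /* the memory word labeled myData          */
    uint32_t *r0 = &my_data;      /* R0 <- address of myData                 */
    load_store_result out;
    out.r1 = *r0;                 /* LDR R1, [R0] : read the word R0 points to */
    *r0 = r2;                     /* STR R2, [R0] : overwrite that word        */
    out.mem = my_data;
    return out;
}
```

With initial = 0x01 and r2 = 0x19, R1 receives 0x01 and the memory word becomes 0x19, matching the register and memory values listed above.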
Instruction Set: Memory Access
• LDR, PC-relative address
Instruction Set: Memory Access
• MOV and MVN: Move and Move NOT.
MOVS R0, #0x000B
; Write value of 0x000B to R0, flags get updated
MOVS R1, #0x0
; Write value of zero to R1, flags are updated
MOV R10, R12
; Write value in R12 to R10, flags are not updated
MOVS R3, #23
; Write value of 23 to R3
MOV R8, SP
; Write value of stack pointer (SP) to R8
MVNS R2, R0 ; R2 ← ~R0
; Write the bitwise inverse of R0 to R2 and update flags
• Example: Write a program that copies the contents of location 0x80 into
location 0x88.
AREA load_store_ex, CODE, READONLY
ENTRY
Format: OpCode{S} Operands
Addition:
ADD, ADC (add with carry)
Subtraction:
SUB, RSB (reverse subtract), SBC (subtract with carry)
Multiplication:
MUL (multiply), MLA (multiply with accumulate), MLS (multiply with subtract), SMULL (signed long multiply),
UMULL (unsigned long multiply), SMLAL (signed long multiply, with accumulate), UMLAL (unsigned long
multiply, with accumulate)
Division: SDIV (signed), UDIV (unsigned)
Saturation: SSAT (signed), USAT (unsigned)
Arithmetic Instruction Set: Data Processing
• ADC, ADD, RSB, SBC, and SUB: Add with carry, Add, Reverse
Subtract, Subtract with carry, and Subtract.
RSB R7, R7, R7, LSL #3 ; R7 = 8*R7 - R7 = 7*R7
S is an optional suffix.
An instruction with the S suffix updates the xPSR flags (N, Z, C, and V).
NOTE: 2’s Complement
• Why use 2's complement?
– 2's complement simplifies hardware
– Addition, subtraction, and multiplication use the same operations for
signed and unsigned numbers
Example (5-bit numbers):
-9 + 6
10111 + 00110 = 11101 (29 if read as unsigned)
The sign bit is 1, so the result is negative. To find its magnitude:
Flip : 00010
Add 1 : 00011
Result is -3
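The 5-bit example can be replayed in C as a quick check. This is a sketch only; the helper names add5, to_signed5 and from_signed5 are invented for illustration.

```c
#include <stdint.h>

/* Add two 5-bit patterns, discarding any carry out of bit 4. */
uint8_t add5(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b) & 0x1Fu);
}

/* Interpret a 5-bit pattern as a signed two's-complement number (-16..15). */
int to_signed5(uint8_t bits)
{
    unsigned value = bits & 0x1Fu;
    return (value & 0x10u) ? (int)value - 32 : (int)value;
}

/* Encode a small signed value (-16..15) as a 5-bit pattern. */
uint8_t from_signed5(int value)
{
    return (uint8_t)((unsigned)value & 0x1Fu);
}
```

Here from_signed5(-9) is 10111, adding 00110 gives 11101, and to_signed5 reads that pattern back as -3. The adder itself never needs to know whether the operands are signed.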
Example
Example: Write an ARM assembly program to compute 4 + 5 − 19. Save
the result in R1.
Arithmetic Instruction set: data processing
• MUL(S): Multiply using 32-bit operands, producing a 32-bit result.
MULS R0, R2, R0 ; Multiply with flag update, R0 = R2 x R0
MLA R0, R1, R2, R3 ; R0 = (R1 x R2) + R3, multiply with accumulate
MLS R0, R1, R2, R3 ; R0 = R3 - (R1 x R2), multiply with subtract
Example
Example: Compute 123 + 1, and save the result in R3.
(3) Shift, Bit and Logic Instructions
Shift:
LSL (logical shift left), LSR (logical shift right), ASR (arithmetic shift right), ROR (rotate right), RRX (rotate right
with extend)
Logic:
AND (bitwise and), ORR (bitwise or), EOR (bitwise exclusive or), ORN (bitwise or not), MVN (move not)
Bit set/clear: BIC (bit clear), BFC (bit field clear), BFI (bit field insert), CLZ (count leading zeros)
Bit/byte reordering: REV (reverse byte order in a word), RBIT (reverse bit order in a word)
Logic-Bit Instruction Set: Data Processing
• AND, ORR, EOR, and BIC: Logical AND, OR, Exclusive OR, and
Bit Clear.
ANDS R2, R2, R1 (R2 ← R2 AND R1)
ORRS R2, R2, R5 (R2 ← R2 OR R5)
EORS R7, R7, R6 (R7 ← R7 XOR R6)
BICS R0, R0, R5 (R0 ← R0 AND ~R5)
(R5 = 0010 → ~R5 = 1101)
S is an optional suffix.
An instruction with the S suffix updates the xPSR flags.
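Each of these instructions corresponds to a single C bitwise operator, which can help when reading them. A minimal sketch (function names are illustrative, flag updates are not modeled):

```c
#include <stdint.h>

uint32_t and_op(uint32_t rn, uint32_t rm) { return rn & rm; }   /* AND Rd, Rn, Rm */
uint32_t orr_op(uint32_t rn, uint32_t rm) { return rn | rm; }   /* ORR Rd, Rn, Rm */
uint32_t eor_op(uint32_t rn, uint32_t rm) { return rn ^ rm; }   /* EOR Rd, Rn, Rm */
uint32_t bic_op(uint32_t rn, uint32_t rm) { return rn & ~rm; }  /* BIC Rd, Rn, Rm */
```

BIC clears exactly the bits that are set in its second operand: bic_op(0xF, 0x2) gives 0xD, in line with the R5 = 0010 note above.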
Example: Setting and Clearing bits
       Hex   Binary
       35    0 0 1 1 0 1 0 1
AND    0F    0 0 0 0 1 1 1 1
  =    05    0 0 0 0 0 1 0 1

       44    0 1 0 0 0 1 0 0
EOR    06    0 0 0 0 0 1 1 0
  =    42    0 1 0 0 0 0 1 0
Shift Instruction Set: Data Processing
• ASR, LSL, LSR, and ROR: Arithmetic Shift Right, Logical Shift
Left, Logical Shift Right, and Rotate Right.
ASRS R7, R5, #9 ; R7 = R5 >> 9, signed
; Arithmetic shift right by 9 bits and update flags
LSRS R4, R5, #6 ; R4 = R5 >> 6, unsigned
; Logical shift right by 6 bits and update flags
LSLS R1, R2, #3 ; R1 = R2 << 3
; Logical shift left by 3 bits with flag update
RORS R4, R5, R6 ; R4 = rotate R5 by R6 bits
; Rotate right by the value in the bottom byte of R6.
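The four shift types differ only in what fills the vacated bits. The C sketch below models them for shift amounts of 0 to 31; the function names are illustrative, flag updates are not modeled, and the arithmetic shift is written out by hand because right-shifting a negative signed value is implementation-defined in C.

```c
#include <stdint.h>

/* LSL: zeros enter from the right. n must be 0..31. */
uint32_t lsl(uint32_t x, unsigned n) { return x << n; }

/* LSR: zeros enter from the left (unsigned divide by 2^n). n must be 0..31. */
uint32_t lsr(uint32_t x, unsigned n) { return x >> n; }

/* ASR: copies of the sign bit enter from the left (signed divide by 2^n). */
uint32_t asr(uint32_t x, unsigned n)
{
    uint32_t shifted = x >> n;
    if ((x & 0x80000000u) && n > 0)
        shifted |= ~(0xFFFFFFFFu >> n);   /* fill the top n bits with ones */
    return shifted;
}

/* ROR: bits shifted out on the right re-enter on the left. */
uint32_t ror(uint32_t x, unsigned n)
{
    n &= 31u;
    return n ? (x >> n) | (x << (32u - n)) : x;
}
```

For instance, asr(0x80000000, 9) yields 0xFFC00000, whereas lsr with the same arguments yields 0x00400000.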
Bit/Byte Instruction set: data processing
• REV, REV16, and REVSH: Reverse bytes.
REV16 R0, R1 ; Reverse byte order of each 16-bit halfword of R1 and write the result to R0
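A C sketch of the byte reversals may make the difference between REV and REV16 concrete. The function names are illustrative; REVSH is left out.

```c
#include <stdint.h>

/* REV: reverse the order of all four bytes in a word. */
uint32_t rev(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* REV16: swap the two bytes inside each 16-bit halfword. */
uint32_t rev16(uint32_t x)
{
    return ((x >> 8) & 0x00FF00FFu) | ((x << 8) & 0xFF00FF00u);
}
```

With the input 0x12345678, rev gives 0x78563412 and rev16 gives 0x34127856.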
Instruction set: data processing
• SXT and UXT: Sign extend and Zero extend.
SXTH R4, R6
; Obtain the lower halfword of the value in R6 and then sign extend to
; 32 bits and write the result to R4.
UXTB R3, R1
; Extract lowest byte of the value in R1 and zero
; extend it, and write the result to R3
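Sign extension and zero extension can be sketched in C as follows. The function names mirror the mnemonics but are otherwise illustrative.

```c
#include <stdint.h>

/* UXTB: keep the lowest byte and clear bits 8..31. */
uint32_t uxtb(uint32_t x)
{
    return x & 0xFFu;
}

/* SXTH: keep the lower halfword and copy its bit 15 into bits 16..31. */
uint32_t sxth(uint32_t x)
{
    uint32_t half = x & 0xFFFFu;
    return (half & 0x8000u) ? (half | 0xFFFF0000u) : half;
}
```

A halfword of 0x8000 has its top bit set, so sxth turns it into 0xFFFF8000, while 0x7FFF stays 0x00007FFF.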
(4) Compare and Branch Instructions
Branch instructions:
B (branch), CBZ (compare and branch on zero), CBNZ (compare and branch on non-zero), TBB (table
branch byte), TBH (table branch halfword)
Subroutine instructions:
BL (branch with link), BLX (branch with link and exchange), BX (branch and exchange)
Branch Instruction Set: Branch and Control
• B, BL, BX, and BLX: Branch instructions.
B loopA ; Branch to loopA -- Unconditional branch
BL funC
; Branch with link (Call) to function funC, return address stored in LR
BLX R0
; Branch with link and exchange (call) to an address stored in R0
BEQ labelD
; Conditionally branch to labelD
; if the last flag-setting instruction set the Z flag; otherwise do not branch.
Branch Instruction Set: Branch and Control
• Use of unconditional branch instruction, where `loop1' is the
label used by the program
Branch Instruction Set: Branch and Control
Compare Instructions:
CMP R1, R2 ; Compare: set flags after computing R1 - R2
CMN R1, R2 ; Compare Negative: set flags after computing R1 + R2
32-bit signed numbers:
Decimal           Binary                              Hex
-2,147,483,648    10000000000000000000000000000000    80000000
...               ...                                 ...
-1                11111111111111111111111111111111    FFFFFFFF
0                 00000000000000000000000000000000    00000000
+1                00000000000000000000000000000001    00000001
...               ...                                 ...
+2,147,483,647    01111111111111111111111111111111    7FFFFFFF
Condition suffixes (U = unsigned, S = signed):
Suffix    Meaning    Flags tested
EQ        ==         Z = 1
NE        !=         Z = 0
HS/CS     ≥ U        C = 1
LO/CC     < U        C = 0
HI        > U        C = 1 and Z = 0
LS        ≤ U        C = 0 or Z = 1
GE        ≥ S        N = V
LT        < S        N ≠ V
GT        > S        Z = 0 and N = V
LE        ≤ S        Z = 1 or N ≠ V
List of unconditional and conditional branch instructions: each conditional branch tests
the N, Z, C, V flags and jumps to the label if its condition holds.
• When two numbers are unsigned integers, branch instructions should use an unsigned condition suffix.
• When these two numbers are signed integers, branch instructions should use a signed condition suffix.
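Why the suffix choice matters can be seen by computing the flags by hand. The sketch below models the flags that CMP produces and four of the conditions; the type and function names are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { bool n, z, c, v; } flags_t;

/* Flags produced by CMP a, b, which computes a - b and discards the result. */
flags_t cmp_flags(uint32_t a, uint32_t b)
{
    uint32_t diff = a - b;
    flags_t f;
    f.n = (diff >> 31) != 0;                      /* result negative        */
    f.z = (diff == 0);                            /* result zero            */
    f.c = (a >= b);                               /* carry set = no borrow  */
    f.v = (((a ^ b) & (a ^ diff)) >> 31) != 0;    /* signed overflow        */
    return f;
}

bool cond_hi(flags_t f) { return f.c && !f.z; }        /* BHI: unsigned >  */
bool cond_lo(flags_t f) { return !f.c; }               /* BLO: unsigned <  */
bool cond_gt(flags_t f) { return !f.z && f.n == f.v; } /* BGT: signed >    */
bool cond_lt(flags_t f) { return f.n != f.v; }         /* BLT: signed <    */
```

Comparing 0xFFFFFFFF with 1 satisfies both HI and LT: the same bit pattern is the largest unsigned value but is -1 when read as signed, so the two suffix families give opposite answers.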
Branch Instruction Set: Branch and Control
Conditions (flowchart examples):
• R0 == R1? Yes: increment R2. No: skip the increment.
• R6 < R4? Yes: increment R2. No: skip the increment.
• R6 >= R4? Yes: increment R2. No: skip the increment.
Branch Instruction Set: Branch and Control
Example: IF and ELSE
int main ( )
{
    R7 = 5;
    if (R0 > R1)
        R2++;
    else
        R2--;
    R7++;
}
Flowchart: R7 = 5, then test R0 > R1. Yes: increment R2. No: decrement R2. Then increment R7.
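One way to see how compare and branch instructions implement IF and ELSE is to rewrite the C code with explicit jumps. The sketch below does this with goto; the instruction sequence in the comments is one plausible translation assuming signed values, not necessarily the one shown in the lecture.

```c
#include <stdint.h>

typedef struct { int32_t r2; int32_t r7; } regs_t;

/* Branch-style version of: R7 = 5; if (R0 > R1) R2++; else R2--; R7++; */
regs_t if_else(int32_t r0, int32_t r1, int32_t r2)
{
    regs_t out;
    int32_t r7 = 5;                      /*            MOV  R7, #5       */
    if (!(r0 > r1)) goto else_part;      /*            CMP  R0, R1       */
                                         /*            BLE  else_part    */
    r2 = r2 + 1;                         /*            ADD  R2, R2, #1   */
    goto done;                           /*            B    done         */
else_part:
    r2 = r2 - 1;                         /* else_part  SUB  R2, R2, #1   */
done:
    r7 = r7 + 1;                         /* done       ADD  R7, R7, #1   */
    out.r2 = r2;
    out.r7 = r7;
    return out;
}
```

The branch uses the opposite condition (LE) of the C test (>), so the "then" part falls through and the "else" part is reached by the jump.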
Branch Instruction Set: Branch and Control
Loop: repeat { R3 = R3 + R1; R6 = R6 - 1 } while R6 > 0
        MOV R6,#9 ;R6 = 9
L1      ADD R3,R3,R1 ;R3 = R3 + R1
        SUBS R6,R6,#1 ;R6 = R6 - 1, update flags
        BNE L1 ;if Z = 0 (R6 ≠ 0), jump to L1
L2      B L2 ;wait here forever
        END
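The counted loop maps onto a C do-while, because the test comes after the body. A minimal sketch (the function name is illustrative):

```c
#include <stdint.h>

/* Adds r1 to r3 nine times, mirroring the SUBS/BNE counted loop. */
uint32_t loop_add(uint32_t r3, uint32_t r1)
{
    uint32_t r6 = 9;          /*     MOV  R6, #9      */
    do {
        r3 = r3 + r1;         /* L1  ADD  R3, R3, R1  */
        r6 = r6 - 1;          /*     SUBS R6, R6, #1  */
    } while (r6 != 0);        /*     BNE  L1          */
    return r3;
}
```

Starting from R3 = 0 and R1 = 5, the loop leaves 45 in R3, since the body runs exactly nine times.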
Examples
Example: Go to the labeled instruction if two numbers are equal.
Instruction set: Miscellaneous
• BKPT: Breakpoint.
BKPT #0 ; Breakpoint with immediate value set to 0x0.
Instruction set: Miscellaneous
• MRS: Move the contents of a special register to a general-
purpose register.
MRS R0, PRIMASK ; Read PRIMASK value and write it to R0
Assembly language template
• Program 1: Assembly language template.
        THUMB
        AREA DATA, ALIGN=2
; global variables go here (e.g., GPIO_PORTE_DATA_R EQU 0x400243FC)
        ALIGN
        AREA |.text|, CODE, READONLY, ALIGN=2
        EXPORT Start
Start
Example-1
• Addition: The problem: P = Q + R + S
• Let Q = 2, R = 4, S = 5. Assume that r1 = Q, r2 = R, r3 = S. The result P will go in r0.
Example-1
• The complete assembly program that can be assembled to build the executable.
startup.s
THUMB ; Marks the THUMB mode of operation
StackSize EQU 0x00000100 ; Define stack size of 256 bytes
AREA STACK , NOINIT , READWRITE , ALIGN =3 ; Allocate space for stack
MyStackMem SPACE StackSize
AREA RESET , READONLY ; initialize two entries of vector table
EXPORT __Vectors
__Vectors
DCD MyStackMem + StackSize ; stack pointer for empty stack
DCD Reset_Handler ; reset vector
AREA MYCODE , CODE , READONLY ; user code is placed in CODE AREA
ENTRY ; starting point of the code execution
EXPORT Reset_Handler
Reset_Handler ; user code starts from next line
Example-2
• Addition: This problem is the same as Example 1. P = Q + R + S
• Once again, let Q = 2, R = 4, S = 5 and assume r1 = Q, r2 = R, r3 = S. In this case, we
will put the data in memory in the form of constants before the program runs.
Example-3
• Addition: P = Q + R + S
• Once again, let Q = 2, R = 4, S = 5 and assume r1 = Q, r2 = R, r3 = S.
• In this case, we will put the data in memory as constants before the program runs.
First we use load register LDR to reach the memory location.
Example: Arithmetic Expressions
• Assume that we wish to evaluate (A + 8B + 7C - 27)/4, where A
= 25, B = 19, and C = 99.
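The expression can be evaluated with shifts instead of multiply and divide instructions. The sketch below is one way to do it, assuming non-negative values; the instruction sequence in the comments is an assumption for illustration, not the lecture's own solution.

```c
#include <stdint.h>

/* (A + 8B + 7C - 27) / 4 using only add, subtract and shift. */
uint32_t evaluate(uint32_t a, uint32_t b, uint32_t c)
{
    uint32_t r0 = a;             /* MOV  r0, A                          */
    r0 = r0 + (b << 3);          /* ADD  r0, r0, r1, LSL #3  (8B)       */
    uint32_t r4 = (c << 3) - c;  /* RSB  r4, r2, r2, LSL #3  (7C)       */
    r0 = r0 + r4;                /* ADD  r0, r0, r4                     */
    r0 = r0 - 27;                /* SUB  r0, r0, #27                    */
    return r0 >> 2;              /* shift right by 2 (divide by 4)      */
}
```

With A = 25, B = 19 and C = 99 the numerator is 843, and the shift gives 210; the fractional part (0.75) is discarded.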
Example: Logical Operations
• Let's perform a simple Boolean operation: compute the bitwise
expression F = AB + CD. Assume that A, B, C, D
are in r1, r2, r3, r4, respectively.
AREA Example5, CODE, READONLY
ENTRY
LDR r1, =2_0000000011111111010101011110000 ; setup A
LDR r2, =2_0000000000000000010101011111111 ; setup B
LDR r3, =2_1100000011111111010101011110000 ; setup C
LDR r4, =2_1111000011111111010101011110000 ; setup D
Example
• Write a program that determines the sum of the first five even
numbers.
        AREA example, CODE, READONLY
        ENTRY
        MOV R0 , #0 ; R0 will accumulate the sum
        MOV R1 , #2 ; R1 will have the updated even number
        MOV R2 , #5 ; the counter for the loop
lbegin
        CBZ R2 , lend ; CBZ = compare and branch on zero: if R2 != 0, continue with the next instruction
        ADD R0 , R1 ; R0 = R0 + R1
        ADD R1 , #2 ; Generate next even number
        SUB R2 , #1 ; R2 = R2 - 1
        B lbegin ; branch unconditionally to lbegin
lend
        END
• The first five even numbers are 2, 4, 6, 8, 10. When we add them up,
the sum is 30 (or 1E in hex).
Example: function
Write an ARM assembly program with a function call.
Method 1: not preferred !
C program ARM Assembly
Example: function
Write an ARM assembly program with a function call.
Method 2: Preferred method
C program ARM Assembly
Example: arrays
Example with Arrays.
Assembling An ARM Program
• ARM Software Tool
– MDK-ARM (Keil µVision5)
https://www.youtube.com/watch?v=Sm6v9UyhCkA
• Create «New uVision Project», then click «OK»
• Add an «Asm» file
• Translate, then Build
• Click «Step Over» to run step-by-step
• See changes in «Registers» (4 bytes each)
• See changes in «Memory»