Chapter 2
Sepehr Naimi
www.NicerLand.com
Topics
ARM’s CPU
Its architecture
Some simple programs
Data Memory access
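[Figure: microcontroller block diagram. The CPU connects to the program memory over the program bus, and to the RAM, EEPROM, timers, interrupt unit, oscillator (OSC), ports (I/O pins), and other peripherals over the data bus.]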
ARM’s CPU
ALU
16 general-purpose registers (R0 to R15)
R13 is the stack pointer (SP)
R15 is the program counter (PC)
CPSR (status register; holds the N, Z, C, and V flags)
Instruction decoder
Instruction register
CPU
Some simple instructions
1. MOV (MOVE)
LDR pseudo-instruction (loading 32-bit values)
LDR Rd, =k
Rd = k
k is a 32-bit value
Example:
LDR R5,=5543
R5 = 5543
LDR R9,=0x123456
R9 = 0x123456
LDR R4,=2_10110110011011001
R4 = 0x16CD9 (the 2_ prefix marks a binary literal)
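The three literal notations used in the LDR examples (decimal, 0x hexadecimal, and the n_ base prefix such as 2_ for binary) can be checked with a small Python model. This is only a sketch: parse_literal is a hypothetical helper introduced here, not part of any assembler, and it assumes the value must fit in 32 bits.

```python
def parse_literal(text):
    """Convert a literal written as decimal, 0x hex, or n_digits (base n) to an int."""
    text = text.strip()
    if text.lower().startswith("0x"):
        value = int(text, 16)          # hexadecimal, e.g. 0x123456
    elif "_" in text:
        base, digits = text.split("_", 1)
        value = int(digits, int(base))  # base prefix, e.g. 2_1011 is binary
    else:
        value = int(text, 10)          # plain decimal, e.g. 5543
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("literal does not fit in 32 bits")
    return value


for literal in ("5543", "0x123456", "2_10110110011011001"):
    print(literal, "=", hex(parse_literal(literal)))
```

Running it on the binary literal is a quick way to confirm the value that ends up in R4.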
Some simple instructions
2. Arithmetic calculation
Opcode destination, source1, source2
Opcodes: ADD, SUB, AND, etc.
Examples:
ADD R5,R2,R1 ; R5 = R2 + R1
SUB R5,R9,#23 ; R5 = R9 - 23

Instruction         Description
ADC Rd, Rn,Op2      ADD Rn to Op2 with Carry and place the result in Rd
ADD Rd, Rn,Op2 *    ADD Rn to Op2 and place the result in Rd
AND Rd, Rn,Op2      AND Rn with Op2 and place the result in Rd
BIC Rd, Rn,Op2      AND Rn with NOT of Op2 and place the result in Rd
CMP Rn,Op2          Compare Rn with Op2 and set the status bits of CPSR**
CMN Rn,Op2          Compare Rn with negative of Op2 and set the status bits
EOR Rd, Rn,Op2      Exclusive OR Rn with Op2 and place the result in Rd
MVN Rd,Op2          Store the bitwise NOT (one's complement) of Op2 in Rd
MOV Rd,Op2          Move (Copy) Op2 to Rd
ORR Rd, Rn,Op2      OR Rn with Op2 and place the result in Rd
RSB Rd, Rn,Op2      Subtract Rn from Op2 and place the result in Rd
RSC Rd, Rn,Op2      Subtract Rn from Op2 with carry and place the result in Rd
SBC Rd, Rn,Op2      Subtract Op2 from Rn with carry and place the result in Rd
SUB Rd, Rn,Op2      Subtract Op2 from Rn and place the result in Rd
TEQ Rn,Op2          Exclusive-OR Rn with Op2 and set the status bits of CPSR
TST Rn,Op2          AND Rn with Op2 and set the status bits of CPSR
* Op2 can be an immediate 8-bit value #K which can be 0–255 in decimal (00–FF in hex).
Op2 can also be a register Rm. Rd, Rn, and Rm are any of the general purpose registers.
** CPSR is discussed later in this chapter.
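As a rough illustration of the instruction table, the following Python sketch models a few of the instructions on 32-bit values. The OPS table and the execute function are hypothetical names introduced here; the status flags and the carry input are deliberately ignored.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide

# Each entry maps a mnemonic to a function of (Rn, Op2). Hypothetical model only.
OPS = {
    "ADD": lambda rn, op2: rn + op2,
    "SUB": lambda rn, op2: rn - op2,
    "RSB": lambda rn, op2: op2 - rn,    # reverse subtract
    "AND": lambda rn, op2: rn & op2,
    "ORR": lambda rn, op2: rn | op2,
    "EOR": lambda rn, op2: rn ^ op2,
    "BIC": lambda rn, op2: rn & ~op2,   # clear the bits that are set in Op2
    "MVN": lambda rn, op2: ~op2,        # bitwise NOT of Op2, Rn is unused
}


def execute(mnemonic, rn, op2):
    """Return the value written to Rd, truncated to 32 bits."""
    return OPS[mnemonic](rn, op2) & MASK


print(execute("SUB", 100, 23))           # SUB Rd, Rn, #23 with Rn = 100
print(hex(execute("BIC", 0xFF, 0x0F)))
print(hex(execute("MVN", 0, 0)))
```

Masking with 0xFFFFFFFF is what makes a negative Python result wrap around the way a 32-bit register would.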
A simple program
Write a program that calculates 19 + 95
A simple program
Write a program that calculates 19 + 95 - 5
MOV R1, #19 ;R1 = 19
MOV R2, #95 ;R2 = 95
MOV R3, #5 ;R3 = 5
ADD R6, R1,R2 ;R6 = R1 + R2
SUB R6, R6,R3 ;R6 = R6 - R3
Status Register (CPSR)
Bit:   D31 D30 D29 D28 | D27 … D8 | D7 D6 D5 | D4 D3 D2 D1 D0
CPSR:  N   Z   C   V   | Reserved | I  F  T  | M4 M3 M2 M1 M0
Z is the zero flag and C is the carry flag.

Example: Show the status of the C and Z flags after the subtraction of 0x23 from 0xA5 in the following instructions:
LDR R0,=0xA5
LDR R1,=0x23
SUBS R0,R0,R1 ;subtract R1 from R0
Solution:
  0xA5  1010 0101
- 0x23  0010 0011
  0x82  1000 0010    R0 = 0x82
Z = 0 because R0 has a value other than zero after the subtraction.
C = 1 because R1 is not bigger than R0 and there is no borrow from the D32 bit.

Example: Show the status of the C and Z flags after the subtraction of 0x73 from 0x52 in the following instructions:
LDR R0,=0x52
LDR R1,=0x73
SUBS R0,R0,R1 ;subtract R1 from R0
Solution:
  0x52  0101 0010
- 0x73  0111 0011
  0xDF  1101 1111    R0 = 0xFFFFFFDF (the upper 24 bits are all 1s)
Z = 0 because R0 has a value other than 0 after the subtraction.
C = 0 because R1 is bigger than R0 and there is a borrow from the D32 bit.

Example: Show the status of the C and Z flags after the subtraction of 0x9C from 0x9C in the following instructions:
LDR R0,=0x9C
LDR R1,=0x9C
SUBS R0,R0,R1 ;subtract R1 from R0
Solution:
  0x9C  1001 1100
- 0x9C  1001 1100
  0x00  0000 0000    R0 = 0x00
Z = 1 because R0 is zero after the subtraction.
C = 1 because R1 is not bigger than R0 and there is no borrow from the D32 bit.

Example: Show the status of the C and Z flags after the addition of 0x38 and 0x2F in the following instructions:
MOV R6, #0x38 ;R6 = 0x38
MOV R7, #0x2F ;R7 = 0x2F
ADDS R6, R6,R7 ;add R7 to R6
Solution:
  0x38  0011 1000
+ 0x2F  0010 1111
  0x67  0110 0111    R6 = 0x67
C = 0 because there is no carry beyond the D31 bit.
Z = 0 because R6 (the result) has a value other than 0 after the addition.

Example: Show the status of the C and Z flags after the addition of 0x0000009C and 0xFFFFFF64 in the following instructions:
LDR R0,=0x9C
LDR R1,=0xFFFFFF64
ADDS R0,R0,R1 ;add R1 to R0
Solution:
    0x0000009C
+   0xFFFFFF64
  1 0x00000000    R0 = 0x00000000
C = 1 because there is a carry beyond the D31 bit.
Z = 1 because R0 (the result) has a value 0 in it after the addition.
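The flag rules used in these examples can be reproduced with a short Python model. It is a sketch under simple assumptions: only the Z and C flags are modelled, adds and subs are hypothetical helper names, and operands are taken to be unsigned 32-bit values.

```python
MASK = 0xFFFFFFFF  # 32-bit register width


def adds(a, b):
    """Model ADDS: return (result, flags). C is set on a carry beyond bit D31."""
    total = a + b
    result = total & MASK
    return result, {"Z": int(result == 0), "C": int(total > MASK)}


def subs(a, b):
    """Model SUBS a - b: return (result, flags). C = 1 means no borrow occurred."""
    result = (a - b) & MASK
    return result, {"Z": int(result == 0), "C": int(a >= b)}


for name, func, a, b in [
    ("SUBS", subs, 0xA5, 0x23),
    ("SUBS", subs, 0x52, 0x73),
    ("SUBS", subs, 0x9C, 0x9C),
    ("ADDS", adds, 0x38, 0x2F),
    ("ADDS", adds, 0x9C, 0xFFFFFF64),
]:
    result, flags = func(a, b)
    print(f"{name} {a:#x}, {b:#x} -> {result:#010x} {flags}")
```

Note that for subtraction the carry flag works as an inverted borrow, which is why C = 1 when the second operand is not bigger than the first.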
Harvard in ARM9 and Cortex
Memory Map in STM32F103
Each memory location is 8 bits wide.
Address range               Region
0xE000 0000 - 0xFFFF FFFF   Cortex-M3 internal peripherals
0x6000 0000 and above       FSMC
0x4000 0000 - 0x5FFF FFFF   Peripherals
0x2000 0000 - 0x3FFF FFFF   SRAM
0x0000 0000 - 0x1FFF FFFF   Flash
The 1G, 2G, and 3G marks fall at 0x4000 0000, 0x8000 0000, and 0xC000 0000; the 4G address space ends at 0xFFFF FFFF.

STR (Store register)
STR Rx,[Rd] ;[Rd]=Rx
Example:
LDR R5,=0x12345678
LDR R2,=0x20000000
STR R5,[R2] ;[R2] = R5, so [0x20000000]=0x12345678
After running the STR instruction, locations 0x20000000 through 0x20000003 will be loaded with 0x78, 0x56, 0x34, and 0x12, respectively.

LDR (Load register)
LDR Rd,[Rx] ;Rd = [Rx]
Example:
LDR R4,=0x20000000
LDR R1,[R4]

Example: Write a program that copies the contents of location 0x80 into location 0x88.
Solution:
LDR R2,=0x80 ;R2 = 0x80
LDR R1,[R2] ;R1 = [0x80]
LDR R2,=0x88 ;R2 = 0x88
STR R1,[R2] ;[0x88] = R1

Example: Add contents of location 0x90 to contents of location 0x94 and store the result in location 0x20000300.
Solution:
LDR R6,=0x90 ;R6 = 0x90
LDR R1,[R6] ;R1 = [0x90]
LDR R6,=0x94 ;R6 = 0x94
LDR R2,[R6] ;R2 = [0x94]
ADD R2,R2,R1 ;R2 = R2 + R1
LDR R6,=0x20000300 ;R6 = 0x20000300
STR R2,[R6] ;[0x20000300] = R2
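The byte order produced by STR in the 0x12345678 example can be illustrated with a small Python model of byte-addressed memory. This is a sketch only: the memory dictionary and the str_word and ldr_word helpers are hypothetical, and little-endian ordering is assumed because it matches the byte layout given in the example.

```python
memory = {}  # hypothetical byte-addressed memory: address -> byte value


def str_word(value, address):
    """Model STR: store a 32-bit value as four bytes, least significant byte first."""
    for offset, byte in enumerate(value.to_bytes(4, "little")):
        memory[address + offset] = byte


def ldr_word(address):
    """Model LDR: rebuild a 32-bit value from four consecutive bytes."""
    return int.from_bytes(bytes(memory[address + i] for i in range(4)), "little")


str_word(0x12345678, 0x20000000)           # STR R5,[R2]
for address in range(0x20000000, 0x20000004):
    print(f"{address:#010x}: {memory[address]:#04x}")
print(hex(ldr_word(0x20000000)))            # LDR R1,[R4]
```

The lowest address receives the least significant byte, so a later LDR from the same address recovers the original word.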
LDRB, LDRH, STRB, STRH
Data Size Bits Load instruction used Store instruction used
Byte 8 LDRB STRB
Half-word 16 LDRH STRH
Word 32 LDR STR
Assume that R5 = 0x40000200 and R1 = 0x41526374.
After running the following instruction:
STRB R1, [R5]
location 0x40000200 will be loaded with 0x74.

Assume that R5 = 0x40000200 and locations 0x40000200 through 0x40000203 contain 0x78, 0x56, 0x34, and 0x12, respectively.
After running the following instruction:
LDRH R7, [R5]
R7 will be loaded with 0x00005678.
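The two sub-word examples can be checked with the following Python sketch. The strb and ldrh helpers are hypothetical names; the model assumes little-endian memory and that LDRH zero-extends the half-word, which is what the 0x00005678 result implies.

```python
def strb(memory, value, address):
    """Model STRB: store only the least significant byte of the register."""
    memory[address] = value & 0xFF


def ldrh(memory, address):
    """Model LDRH: load two bytes (little-endian) and zero-extend to 32 bits."""
    return memory[address] | (memory[address + 1] << 8)


R5 = 0x40000200

# LDRH example: memory holds 0x78, 0x56, 0x34, 0x12 at R5 .. R5+3.
memory = {R5: 0x78, R5 + 1: 0x56, R5 + 2: 0x34, R5 + 3: 0x12}
R7 = ldrh(memory, R5)
print(f"R7 = {R7:#010x}")

# STRB example: only the low byte of R1 reaches memory.
strb(memory, 0x41526374, R5)
print(f"[{R5:#x}] = {memory[R5]:#04x}")
```

Only the addressed byte changes after STRB; the neighbouring three bytes keep their previous contents.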
Some ARM addressing modes
Immediate
MOV R1, #0x25 ; machine code: F04F0125
ADD R6, R6, #0x40
Register addressing mode
MOV R2, R4
ADD R3, R2, R1 ; machine code: EB020301
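To see where the operands sit inside the two machine codes shown above, here is a Python sketch that pulls the fields out of F04F0125 and EB020301. It assumes these are the 32-bit Thumb-2 encodings of MOV (immediate) and ADD (register); the function names are hypothetical, and only the simplest forms (a plain 8-bit immediate, no shift) are handled.

```python
def decode_mov_imm(word):
    """Decode a 32-bit Thumb-2 MOV Rd, #imm8 and return (Rd, imm8)."""
    hw1, hw2 = word >> 16, word & 0xFFFF
    if (hw1 & 0xFBEF) != 0xF04F or (hw2 & 0x8000):
        raise ValueError("not a MOV (immediate) encoding")
    i, imm3 = (hw1 >> 10) & 1, (hw2 >> 12) & 0x7
    if i or imm3:
        raise NotImplementedError("modified immediates are not handled in this sketch")
    return (hw2 >> 8) & 0xF, hw2 & 0xFF


def decode_add_reg(word):
    """Decode a 32-bit Thumb-2 ADD Rd, Rn, Rm and return (Rd, Rn, Rm)."""
    hw1, hw2 = word >> 16, word & 0xFFFF
    if (hw1 & 0xFFE0) != 0xEB00:
        raise ValueError("not an ADD (register) encoding")
    if hw2 & 0x70F0:
        raise NotImplementedError("shifted operands are not handled in this sketch")
    return (hw2 >> 8) & 0xF, hw1 & 0xF, hw2 & 0xF


rd, imm = decode_mov_imm(0xF04F0125)
print(f"MOV R{rd}, #{imm:#x}")
rd, rn, rm = decode_add_reg(0xEB020301)
print(f"ADD R{rd}, R{rn}, R{rm}")
```

In immediate addressing the constant itself is part of the instruction word, while in register addressing the word carries only register numbers.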
Assembler Directives
Assembler
[Figure: the editor program produces the assembly file myfile.a; the assembler program translates it into machine language; the linker produces myfile.map and myfile.hex; myfile.hex is downloaded to the program memory.]
Assembler directives vs. Instructions
Instructions (e.g. ADD, MOV) tell the CPU what to do
Assembler directives tell the assembler what to do
AREA
IMPORT and EXPORT
END
DCD, DCW, DCB
EQU
INCLUDE
AREA
AREA sectionName, attribute1, attribute2, …
Code:
AREA myCode, CODE, READONLY
Data:
AREA myData1, DATA, READWRITE
AREA myConst, DATA, READONLY
Example:
        AREA MY_PROG,CODE,READONLY
__main
        MOV R4, #6
        ADD R1,R1,R2
        ….
myFunc
        ADD R2,R3,R4
        …
In the STM32F103 memory map, READWRITE areas are placed in the SRAM region and READONLY areas are placed in the Flash region.
IMPORT and EXPORT
File1.s
        ; from the main program:
        IMPORT MY_FUNC
        ...
        BL MY_FUNC ;call MY_FUNC function
        ...
File2.s
        AREA OUR_EXAMPLE,CODE,READONLY
        EXPORT MY_FUNC
        IMPORT DATA1
MY_FUNC
        LDR R1,=DATA1
        ...
First Assembly Program
        EXPORT __main
        AREA PROG_2_1, CODE, READONLY
__main
        MOV R1, #0x25 ; R1 = 0x25
        MOV R2, #0x34 ; R2 = 0x34
        ADD R3, R2, R1 ; R3 = R2 + R1
HERE    B HERE ; stay here forever
        END ; end of source file
Defining Const. Values using DCD, DCW, and DCB
Storing Fixed Data in Program Memory
        EXPORT __main
        AREA PROG2_2, CODE, READONLY
__main  LDR R2, =OUR_FIXED_DATA ; point to OUR_FIXED_DATA
        LDRB R0, [R2] ; load R0 with the contents
                      ; of memory pointed to by R2
        ADD R1, R1, R0 ; add R0 to R1
HERE    B HERE ; stay here forever
        AREA LOOKUP_EXAMPLE, DATA, READONLY
OUR_FIXED_DATA
        DCB 0x55, 0x33, 1, 2, 3, 4, 5, 6
        DCD 0x23222120, 0x30
        DCW 0x4540, 0x50
        END
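The byte layout that the DCB, DCD, and DCW lines in this program would produce can be sketched in Python with the struct module. The dcb, dcw, and dcd helpers are hypothetical stand-ins for the directives. The sketch assumes little-endian storage and no padding between the items, which holds here because the eight DCB bytes leave the following DCD word-aligned.

```python
import struct


def dcb(*values):
    """Model DCB: one byte per value."""
    return bytes(values)


def dcw(*values):
    """Model DCW: one 16-bit half-word per value, little-endian."""
    return struct.pack("<%dH" % len(values), *values)


def dcd(*values):
    """Model DCD: one 32-bit word per value, little-endian."""
    return struct.pack("<%dI" % len(values), *values)


image = (
    dcb(0x55, 0x33, 1, 2, 3, 4, 5, 6)
    + dcd(0x23222120, 0x30)
    + dcw(0x4540, 0x50)
)

# Print the image as offset: byte pairs, relative to OUR_FIXED_DATA.
for offset, byte in enumerate(image):
    print(f"+{offset:02d}: {byte:#04x}")
```

The byte at offset 0 is the 0x55 that LDRB R0, [R2] loads in the program above.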
Allocating memory using SPACE
SPACE allocates memory without initializing.
Example 1: Allocating 4 bytes of memory:
MY_LONG SPACE 4
Example 2: Allocating 2 bytes:
ALFA SPACE 2
Example 3: Allocating an array of 20 bytes:
MY_ARRAY SPACE 20
Defining 3 variables A, B, and C
        EXPORT __main
        AREA OUR_PROG, CODE, READONLY
__main
        ; A = 5
        LDR R0, =A ; R0 = Addr. of A
        MOV R1, #5 ; R1 = 5
        STR R1, [R0] ; init. A with 5
        ; B = 4
        LDR R0, =B ; R0 = Addr. of B
        MOV R1, #4 ; R1 = 4
        STR R1, [R0] ; init. B with 4
        ; R1 = A
        LDR R0, =A ; R0 = Addr. of A
        LDR R1, [R0] ; R1 = value of A
        ; R2 = B
        LDR R0, =B ; R0 = Addr. of B
        LDR R2, [R0] ; R2 = value of B
        ; C = R1 + R2 (C = A + B)
        ADD R3, R1, R2 ; R3 = A + B
        LDR R0, =C ; R0 = Addr. of C
        STR R3, [R0] ; C = R3
loop    B loop

        AREA OUR_DATA, DATA, READWRITE
        ; Allocates the following in SRAM
A       SPACE 4
B       SPACE 4
C       SPACE 4
        END

Equivalent C program:
int main()
{
    int a = 5;
    int b = 4;
    int c = a + b;
    while(1)
    {
    }
}
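The way the three SPACE 4 lines reserve consecutive words, and the way the program then fills them, can be traced with a short Python sketch. The allocate helper is hypothetical, and the base address 0x20000000 is an assumption made for illustration; the real addresses are chosen by the linker.

```python
def allocate(items, base):
    """Assign consecutive addresses to (name, size) pairs, as successive SPACE lines would."""
    symbols, address = {}, base
    for name, size in items:
        symbols[name] = address
        address += size
    return symbols


# Assumed SRAM base address; hypothetical, for illustration only.
symbols = allocate([("A", 4), ("B", 4), ("C", 4)], 0x20000000)

memory = {}                          # word-granular model: address -> 32-bit value
memory[symbols["A"]] = 5             # STR R1,[R0] with R0 = address of A
memory[symbols["B"]] = 4             # STR R1,[R0] with R0 = address of B
r1 = memory[symbols["A"]]            # LDR R1,[R0]
r2 = memory[symbols["B"]]            # LDR R2,[R0]
memory[symbols["C"]] = r1 + r2       # ADD R3,R1,R2 then STR R3,[R0]

for name, address in symbols.items():
    print(f"{name} at {address:#010x} = {memory[address]}")
```

Each variable access goes through its address first, which mirrors the LDR R0, =name followed by LDR or STR pattern in the assembly program.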
ALIGN
ALIGN is used to align data on a 32-bit or 16-bit boundary.
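What aligning to a boundary means for an address can be shown with one line of arithmetic. The align_up function below is a hypothetical Python helper, and it assumes the boundary is a power of two given in bytes (4 for a 32-bit boundary, 2 for a 16-bit boundary).

```python
def align_up(address, boundary):
    """Round address up to the next multiple of boundary (a power of two, in bytes)."""
    return (address + boundary - 1) & ~(boundary - 1)


for address in (0x20000000, 0x20000001, 0x20000003):
    print(f"{address:#010x} -> 32-bit boundary: {align_up(address, 4):#010x}, "
          f"16-bit boundary: {align_up(address, 2):#010x}")
```

An address that is already on the boundary is left unchanged; any other address moves up to the next multiple.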
Assembler Directives
EQU and RN
name EQU value
Example:
RESULT RN R2
MOV RESULT,#23
Example 2:
ProgCounter RN R15
Assembler Directives
INCLUDE
INCLUDE "filename.ext"
hFile.inc
GPIOA_CRL EQU 0x40010800
GPIOA_CRH EQU 0x40010804
GPIOA_IDR EQU 0x40010808
GPIOA_ODR EQU 0x4001080C
....
Program.s
INCLUDE "hFile.inc"
Power up in Cortex-M
Startup and main files
Startup_stm32f10x.s
__initial_sp
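Flash memory and PC register
main.lst:
Address      Machine code   Instruction
0x08000200   F04F0125       MOV R1, #0x25
0x08000204   F04F0234       MOV R2, #0x34
0x08000208   EB020301       ADD R3, R2, R1
0x0800020C   E7FE           HERE B HERE
0x0800020E
In the slide animation the PC takes the values 0x08000200, 0x08000204, 0x08000208, 0x0800020C, and 0x0800020E.
[Figure: the CPU fetches instructions from the Flash ROM over the 32-bit program bus and accesses the RAM over the 32-bit data bus.]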
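The addresses in this listing follow from the instruction sizes: the first three instructions occupy 4 bytes each and the last one 2 bytes. The Python sketch below walks the machine code as 16-bit half-words and assumes the Thumb-2 rule that a half-word whose top five bits are 11101, 11110, or 11111 starts a 32-bit instruction. The function name is hypothetical.

```python
def instruction_addresses(halfwords, start):
    """Return (list of instruction start addresses, first free address)."""
    addresses, address, index = [], start, 0
    while index < len(halfwords):
        addresses.append(address)
        is_32bit = (halfwords[index] >> 11) in (0b11101, 0b11110, 0b11111)
        size = 4 if is_32bit else 2
        address += size
        index += size // 2          # skip the second half-word of a 32-bit instruction
    return addresses, address


# Machine code from the listing, split into half-words.
code = [0xF04F, 0x0125, 0xF04F, 0x0234, 0xEB02, 0x0301, 0xE7FE]
addresses, next_free = instruction_addresses(code, 0x08000200)
print([hex(a) for a in addresses], hex(next_free))
```

This is also why the address after the final 2-byte branch is 0x0800020E rather than 0x08000210.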
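Pipeline
Non-pipeline
The CPU just fetches, decodes, or executes at any given time
Pipeline
The CPU fetches one instruction, decodes another, and executes a third at the same time
Pipeline (Cont.)
LDR R2, [R4] ; R2 = [R4]
ADD R0,R0,R1 ; R0 = R0 + R1
SUB R3,R3,R4 ; R3 = R3 - R4
[Figure: the three instructions moving through the Fetch, Decode, and Execute stages, one stage apart.]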
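The overlap of the three stages can be tabulated with a short Python sketch. It is an idealized model: every stage is assumed to take exactly one cycle with no stalls, and pipeline_schedule is a hypothetical helper name.

```python
STAGES = ("Fetch", "Decode", "Execute")


def pipeline_schedule(instructions):
    """Return one dict per clock cycle, mapping each stage to the instruction in it."""
    cycles = len(instructions) + len(STAGES) - 1
    schedule = []
    for cycle in range(cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            index = cycle - depth   # instruction that entered the pipe 'depth' cycles ago
            row[stage] = instructions[index] if 0 <= index < len(instructions) else None
        schedule.append(row)
    return schedule


program = ["LDR R2,[R4]", "ADD R0,R0,R1", "SUB R3,R3,R4"]
for number, row in enumerate(pipeline_schedule(program), start=1):
    print(number, row)
```

With three instructions the model needs 5 cycles, whereas doing the nine fetch, decode, and execute steps one at a time would need 9.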
Harvard Architecture
separate buses for opcodes and operands
Advantage: opcodes and operands can go in and out of the CPU together.
Disadvantage: Using Harvard architecture in motherboards leads to more cost in general purpose computers.
Changing the architecture
RISC vs. CISC
CISC (Complex Instruction Set Computer)
Put as many instructions as you can into the CPU
RISC (Reduced Instruction Set Computer)
Reduce the number of instructions, and use the available hardware resources more effectively.
RISC architecture
Feature 1 (fixed instruction size)
RISC processors have a fixed instruction size. This makes the task of the instruction decoder easier.
In ARM the instructions are 4 bytes.
In Thumb2 the instructions are either 2 or 4 bytes.
In CISC processors instructions have different lengths
E.g. in 8051
CLR C ; a 1-byte instruction
ADD A, #20H ; a 2-byte instruction
LJMP HERE ; a 3-byte instruction
RISC architecture
Feature 2: reduce the number of instructions
Pros: Reduces the number of used transistors
Cons:
Can make the assembly programming more difficult
Can lead to using more memory
RISC architecture
Feature 3: limit the addressing modes
Advantage
The control logic can be hardwired
Disadvantage
Can make the assembly programming more difficult
RISC architecture
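Feature 4: Load/Store (only load and store instructions access memory; all other instructions operate on registers)
LDR R8,=0x20
LDR R0,[R8] ; R0 = [0x20]
LDR R8,=0x220
LDR R1,[R8] ; R1 = [0x220]
ADD R0, R0,R1 ; R0 = R0 + R1
LDR R8,=0x230
STR R0,[R8] ; [0x230] = R0
[Figure: the CPU (ALU, PC, instruction decoder) connected to the Flash ROM over the program bus and to the RAM, USART, timers, interrupt unit, oscillator, ports (I/O pins), and other peripherals over the data bus.]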
RISC architecture
Feature 5: more than 95% of instructions are
executed in 1 machine cycle
RISC architecture
Feature 6
RISC processors have at least 32 registers.
This decreases the need for stack and memory usage.
In ARM there are 16 general purpose registers (R0 to R15).