Professional Documents
Culture Documents
-C
IT
-V
0 6
2 3
( 5
ik
m
Advanced RISC Machines
o
Introductionw and Architecture
h
. B
.r P Day 1
C
ARM History -C
IT
-V
• Key component of many 32 – bit
0 6
embedded systems
2 3
• Portable Consumer devices
( 5
ik
• ARM1 prototype in 1985
m
• One of the ARM’s most successful cores is
w
the ARM7TDMI,provides high code density
o
h
and low power consumption
. B
. P
r
Day 1
C
ARM History -C
IT
V -
6
• ARM – Acorn RISC Machine(1983–1985)
0
3
– Acorn Computers Limited, Cambridge,
2
5
England
(
• ARM – Advanced RISCikMachine 1990
m
w
– ARM Limited, 1990
o
h manufacturers
– ARM has been licensed to many
B
semiconductor
.
.r P Day 1
C
Advanced RISC Machines - C
I T
- V
• ARM Core uses a RISC architecture
0 6
• ARM is Physical hardware
2 3 design
company.
( 5
• ARM licenses its k
i cores out and other
m
companies make processors based on its
cores
o w
h
. B
.r P Day 1
C
Companies licensing with ARM -C
I T
• 3com • Motorola
- V
• Agilent Technologies • Panasonic
0 6
• Sharp 3
• Altera • Qualcomm
• Sanyo2
5
• Epson
• Freescale
(
•kSony
• Sun Microsystems
• Fijitsu
i• Symbian
m • Texas Instruments
• NEC
• Nokia
• Intel
o w
h
• Toshiba
• IBM
• Microsoft B
• Wipro
.r Day 1
C
The RISC Design Philosophy- C
I T
- V
• RISC is characterized by limited
0 6 number
of instructions
3
• A complex instruction 2is obtained as a
5
sequence of simple (instructions.In RISC
processor softwareik is complex but the
m
processor architecture is simple
• Large number w
h o of registers are required.
• Pipelined instruction execution
Ex : ARM,. BATMEL AVR, MIPS, Power PC etc
.r P Day 1
C
The CISC Design Philosophy- C
I T
V
• CISC is characterized by large- instruction
0 6
set.
2 3
• The aim of designing CISC
(
reduce software complexity5 processors is to
by increasing
i k
the complexity of processor architecture.
• Very small numberm of registers are
available.
o w
h family, Motorola 68000
Ex : Intel X86
. B
series.
.r P Day 1
C
CISC vs. RISC -C
I T
- V
CISC
0 6
RISC
Compiler
2 3Compiler
Greater
5
Complexity
(
Code Generation
ik Code Generation
m
Processor o w
Greater
Processor
h Complexity
. B
. P
r
Day 1
C
RISC Design Rules -C
IT
-V
• Instructions
0 6
• Pipelines
2 3
• Registers
( 5
•
ik
Load – Store Architecture
m
o w
h
. B
. P
r
Day 1
C
RISC – 4 major design rules- C
I T
- V
• Instructions
0 6
3
– Reduced Number of Instructions
– Execute in a single cycle 2
5
( complicated
ik
– The compiler synthesizes
operations
– Each instruction m
o w is a fixed length
h
. B
.r P Day 1
C
RISC – 4 major design rules- C
I T
- V
• Pipelines
0 6
2 3
– The processing of instructions is broken down
parallel by pipelines (5
into smaller units that can be executed in
o w
h
. B
.r P Day 1
C
RISC – 4 major design rules- C
I T
- V
• Registers 6
– Have a large general purpose0register set
2 3
( 5
– Any register can contain either data or
ik
address
m
o w
h
. B
.r P Day 1
C
Load Store Architecture - C
I T
V
• Memory can be accessed only6-through
two dedicated instructions 0
– LDR ; move word from 2
3
5 memory to register
( register to memory
k
– STR ; move word from
i
m
• All other instructions have to work on
registers only.
o w
h
. B
.r P Day 1
C
ARM Design Policy -C
IT
-V
• Reduce power consumption
0 6
• High code density
2 3
• Price sensitive
( 5
•
ik
Reduce the area of the die taken up by
m
the embedded processor
o w
• ARM Incorporated hardware debug
technology h
. B
. P
r
Day 1
C
Instruction set for Embedded Systems -C
I T
• Variable cycle execution 6for
V
- certain
instructions
3 0
2
• Inline barrel shifter 5leading to more
complex instructions (
i k
m
• Thumb 16 – bit instructions
w
• Conditional execution
o
h
• Enhanced Instructions
. B
.r P Day 1
C
-C
IT
-V
0 6
2 3
( 5
ARM Processor Fundamentals
ik
m
o w
h
. B
.r P Day 1
C
ARM core dataflow model - C
I T
- V
Data
6
Instruction
0
Decoder
3
Sign Extend
2
Read
( 5
r15 Register File Rd
r0 – r15
pc Result
ik
A B
Rn Rm
A B Acc
m
Barrel Shifter
w
N MAC
o
ALU
h
B
Address Register
.
Incrementer
P
Address
r. Day 1
C
Registers -C
IT
-V
• ARM has Load Store Architecture
0 6
• General Purpose Registers can hold data
2 3
or address
( 5
ik
• Total of 37 Registers, each of 32 bit
m
• There are 17 or 18 active Registers
o w
– 16 data registers
h
– 2 status registers
. B
.r P Day 1
C
Registers – User Mode -C
T I
- V
• 32 bit in size - Hold either data or address
• 16 data registers(r0 – r15) and 26processor
0
r0
r1
2
r3
i k
r7
o
r12
r13 sp
B
r15 pc
.
cpsr
.r Day 1
C
-C
Current Program Status Register
I T
- V
• To monitor and control internal
0 6
operations
2 3
• Some ARM Processor core have extra bits
( 5
ik
allocated
m
Flags Status Extension Control
Fields
w
Bit 31 30 29 28 7 6 5 4 0
NZ CV
h o I F T Mode
Function
. B
Condition Flags Interrupt Processor Mode
P
Masks
.
Thumb State
r
Day 1
C
Processor Modes -C
IT
- V
• Determines which registers are active and the
0 6
3
access rights to the cpsr register itself
• Privileged & Nonprivileged
5 2
– Abort (
–
ik
Fast Interrupt Request
– Interrupt Request
m
– Supervisor
o w Privileged
– System
h
– Undefined
. B
P
– User Nonprivileged
r. Day 1
C
Banked Registers -C
I T
- V
r0
r1
Banked Registers
0 6
3
r2
2
r3
5
r4
(
r5
Fast
r6 Interrupt
ik
r7 Request
r8 r8_fiq
m
r9 r9_fiq
r10 r10_fiq Interrupt
w
r11 r11_fiq Request Supervisor Undefined
o
r12 r12_fiq
r13 sp r13_fiq r13_irq r13_svc r13_undef r13_abt
h
r14 lr r14_fiq r14_irq r14_svc r14_undef r14_abt
B
r15 pc
.
cpsr
P
- spsr_fiq spsr_irq spsr_svc spsr_undef spsr_abt
r . Day 1
C
Banked Registers -C
IT
-V
6
• Banked registers are available only when the processor
is in a particular mode
30
5 2
• Every processor mode except user mode can change
(
mode by writing directly to the mode bits of the cpsr
ik
• Banked registers are a subset of the main 16 registers
m
the new mode will w
• If we change processor mode, a banked register from
. B
• Exceptions and Interrupts cause a mode change
.r P Day 1
C
-C
Changing mode on an exception
IT
User Mode
-V
r0
r1
0 6
• This change causes user
3
register r13 and r14 to
r2
2
r3
be banked
5
r4
(
r5
r6
• The user registers are
ik
r7
r8
replaced with registers
m
r9
Interrupt
r10
Request r13_irq and r14_irq
w
r11
Mode
• spsr stores the previous
o
r12
r13 sp r13_irq
h mode cpsr
r14 lr r14_irq
B
r15 pc
.
cpsr
P
- spsr_irq
r . Day 1
C
Processor Mode -C
I T
- V
6
Mode Abbr: Privileged Mode[4:0]
30 10111
2
Fast Interrupt Request fiq yes 10001
Interrupt Request irq
( 5
yes 10010
ik
Supervisor svc yes 10011
System sys yes 11111
m
Undefined und yes 11011
w
User usr no 10000
o
h due to a program writing
cpsr is not copied into the spsr when a mode
directly to B
change is forced
P . the cpsr.
.r Day 1
C
Operation Modes -C
IT
-V
Mode Registers
6 CPSR[4:0]
0 10000
User User
2 3
FIQ _fiq
( 5 10001
i_svck
IRQ _irq 10010
Supervisor Mode
m_abt 10011
Abort
o w 10111
h
Undefined Instruction _und 11011
System . B User 11111
. P
r
Day 1
C
Processor Modes -C
IT
User
-V
Unprivileged mode for most applications to
run
0 6
FIQ Fast Interrupt Routine
2 3
IRQ Interrupt Routines
5
Supervisor Entered on reset and( when there is a
exception ik
Entered whenm
aborted w
Abort data or instruction prefetch
Undefined When an
o
h undefined instructions is executed
. B
P
System Privileged user mode for operating system
.r Day 1
C
State and Instruction Sets - C
I T
- V
• There are three instruction sets
0 6
– ARM
2 3
– Thumb
( 5
– Jazelle
i k
m
The Jazelle instruction set is a closed instruction set and is
not openly available.
o w
licensed from B
h
To take advantage of Jazelle extra software has to be
.
both ARM Limited and Sun Microsystems.
.r P Day 1
C
State and Instruction Sets - C
I T
- V ARM Thumb
0 6(cpsr T = 0) (cpsr T = 1)
3
Instruction Size 32 bit 16 bit
Core Instruction
5
58
2 30
(
ik
Conditional Execution Most Only branch instructions
m
Data Processing Instructions Access to barrel shifter Separate barrel and ALU
and ALU instructions
Program Status Register
h
Register Usage 15 GPR + PC 8 GPR + 7 high registers
+ PC
. B
. P
r
Day 1
C
State and Instruction Sets - C
I T
- V
0 6
3
Jazelle
5 2 (cpsr T = 0, J = 1)
(
Instruction Size 8 bit
ik
Core Instruction Over 60% of the java bytecodes are
implemented in hardware; the rest of the codes
m
are implemented in software
o w
h
. B
. P
r
Day 1
C
Interrupt Masks -C
IT
-V
• Are used to stop specific interrupt
0 6
3
requests from interrupting the processor
2
– IRQ
( 5
– FIQ
i k
m
• The I bit masks IRQ when set to binary 1,
w
and F bit masks FIQ when set to binary 1
o
h
. B
.r P Day 1
C
Condition Flags -C
I T
- V
Flag Flag Name
0
Set when
6
2 3
5
Q Saturation The result causes an overflow and / or saturation
(
ik
V oVerflow The result causes a signed overflow
m
w
Z Zero The result is zero, frequently used to indicate the
equality
N Negative
h o
Bit 31 of the result is a binary 1
. B
. P
r
Day 1
C
Condition Flags -C
IT
-V
• Condition flags are updated by
0 6
comparisons and the result of ALU
2 3
operations that specify the S instruction
suffix ( 5
ik
– If SUBS results in a register value of zero,
m
then the Z flag in the CPSR is set
o w
h
. B
. P
r
Day 1
C
CPSR -C
I T
- V
Flags Status Extension
0 6 Control
3
Fields
Bit 31 30 29 28 27 24 7 6 5 4 0
5 2
0 0 1 0 0 0
( 0 1 0 10011
Function nzCvq j
ik i F t svc
m
w
cpsr = nzCvqjiFt_SVC
o
h
. B
. P
r
Day 1
C
CPSR
-C
I T
31 28 24 23 16 15 8
-
7
V
6 5 4 0
6
N Z C V J U n d e f i n e d I F T mode
3 0
hold information about the most recently performed ALU operation
2
set the processor operating mode
ik
– Z = Zero result from ALU – I = 1: Disables the IRQ.
– F = 1: Disables the FIQ.
m
– C = ALU operation Carried out
• T Bit
w
– V = ALU operation oVerflowed
o
• J bit – Architecture xT only
h
– Architecture 5TEJ only – T = 0: Processor in ARM state
B
– J = 1: Processor in Jazelle state – T = 1: Processor in Thumb state
.
• Mode bits
P
– Specify the processor mode
r . Day 1
C
Pipeline -C
IT
-V
• Is a mechanism a RISC processor uses to
0 6
execute instructions
2 3
5
• Using a pipeline speeds up execution by
(
ik
fetching the next instruction while other
instructions are being decoded and
m
executed
o w
h
. B
. P
r
Day 1
C
ARM7 Three stage pipeline - C
I T
- V
Fetch Decode
0 6
Execute
2 3
5
• Fetch loads an instruction from memory
• Decode identifies the (instruction to be
i k
m
executed
• Execute processes
o w the instruction and
h
writes the result back to a register
. B
.r P Day 1
C
Pipelined instruction sequence -C
I T
- V
Fetch Decode
0 6 Execute
Time
Cycle 1 ADD
2 3
Cycle 2 SUB
( 5
ADD
.r P Day 1
C
ARM9 Five stage pipeline - C
I T
- V
Fetch Decode Execute
6
Memory
0
Write
0
– The instruction is fetched from memory and placed in the instruction
3
pipeline
2
• Decode
5
– The instruction is decoded and register operands read from the
(
register file
• Execute
ik
– An operand is shifted and the ALU result generated
• Memory (Buffer/Data)
m
– Data memory is accessed if required. Otherwise the ALU result is
w
buffered for one clock cycle to give the same pipeline flow for all
o
instructions
h
• Write (Write-Back)
– The results generated by the instruction are written back to the
B
register file, including any data loaded from memory
.
. P
r
Day 1
C
ARM10 Six stage pipeline - C
I T
- V
Fetch Issue Decode
0 6
Execute Memory Write
2 3
5
• Increase in instruction throughput
around 34% in 6 stage (pipeline
by
Cycle 2 ADD (
MSR
cpsr
k
IFt_SVC
i cpsr
m
Cycle 3 AND ADD MSR iFt_SVC
Cycle 4 SUB w
h o AND ADD
. B
.r P Day 1
C
Pipeline Characteristics - C
I T
V
- will
• An instruction in the execute6stage
3 0
complete even though an interrupt has
been raised
5 2
( instruction or
i k
• The execution of a branch
branching by the direct modification of
mARM core to flush its
pipeline o w
the PC causes the
h
. B
.r P Day 1
C
ARM Exceptions -C
IT
-V
• ARM supports range of Interrupts, Traps,
0 6
Supervisor Calls, all grouped under
2 3
general heading of Exceptions
( 5
ik
m
o w
h
. B
. P
r
Day 1
C
Exceptions, Interrupts, and the Vector
-C
Table
I T
• When an exception or interrupt-Voccurs, the
0 6
processor set the PC to a specific memory address
2 3
• The address is within a special address range called
the vector table
5
( are instructions that
k
• The entries in the vector table
i
branch to specific routines designed to handle a
m
particular exception or interrupt
w
• When an exception or interrupt occurs,the
o
h
processor suspends normal execution and starts
B
loading instructions from the exception vector
table
P .
.r Day 1
C
Vector Addresses -C
IT
- V
6
Exception / Interrupt Shorthand Address High Address
Reset RESET
30
0x00000000 0xffff0000
5 2
0x00000004 0xffff0004
Data Abort
m
DABT 0x000000010 0xffff0010
Reserved
o
-
w 0x000000014 0xffff0014
h
Interrupt Request IRQ 0x000000018 0xffff0018
B
Fast Interrupt Request FIQ 0x00000001C 0xffff001C
P .
r. Day 1
C
Exceptions, Interrupts, and the Vector
-C
Table
I T
• RESET – when power is applied,-Vbranches to
initialization code 6
0 decode an
• UNDEF – when the processor 3cannot
2
• SWI – when a SWI instruction (is5called
instruction
w
• DABT –attempts to access
• IRQ – by externalo
correct access permissions
h hardware
. B
• FIQ – by external hardware requiring faster response
P
time
.r Day 1
C
Exception Priorities -C
IT
-V
1. Reset (Highest Priority)
0 6
2. Data Abort
2 3
3. FIQ
( 5
4. IRQ
ik
5. Prefetch Abort m
6. SWI, Undefined o w
h
. B
. P
r
Day 1
C
Core Extensions -C
IT
-V
• Standard components placed next to the
0 6
ARM core
2 3
5
• Improve performance, manage resources,
(
ik
provide extra functionality
m
• Three hardware extensions
– Caches
o w
h
– Memory Management
. B
– Coprocessors
. P
r
Day 1
C
-C
Caches
IT
-V
• Cache is a block of fast memory placed
0 6
between main memory and the core
2 3
( 5
• Cache provides an overall increase in
performance
ik
m
• ARM has two forms of cache
o w
– Single unified cache for data and
h
instruction
. B
– Separate caches for data and instruction
.r P Day 1
C
Memory Management -C
IT
V
-
• MMU is a class of processor6hardware
components for handling 3 0
memory
5 2
accesses requested by
( the CPU.
i k
• The functions of MMU’s are
– Translation ofm
virtual address to physical
address. w
h o
– Memory protection
– CacheBcontrol etc
P .
.r Day 1
C
Coprocessors -C
IT
-V
• Coprocessors can be attached to the ARM
processor
0 6
2 3
• A separate chip,that performs lot of calculations
( 5
for the microprocessor,relieving the CPU some
of its work and thus enhancing overall speed of
ik
system.
• A secondary processor used to speed up
m
operation by taking over a specific part of main
processors work.
o w
h
• The ARM processor uses coprocessor 15 registers
B
to control cache, TCMs, and memory
.
management
. P
r
Day 1
C
Description of cpsr -C
I T
-V
6
Parts Bits Architecture Description
Mode 4:0 all
30 processor mode
2
T 5 ARMv4T Thumb state
5
I&F 7:6 all interrupt masks
J 24 ARMv5TEJ
( Jazelle state
ik
Q 27 ARMv5TE condition flag
V 28 all condition flag
C 29
m
all condition flag
Z
N
30
31
o wall
all
condition flag
condition flag
h
. B
. P
r
Day 1
C
ARM processor families - C
I T
- V
• ARM7, ARM9, ARM10 and ARM11
0 6
2 3
• 7, 9, 10, 11 indicate different core
designs
( 5
ik
m
o w
h
. B
.r P Day 1
C
-C
ARM family attribute comparison
IT
-V
ARM7 ARM9
6
ARM10
0
ARM11
Pipeline depth
Typical MHz
three-stage five-stage
80 150
2
260 3
six-stage eight-stage
335
mW/MHz
( 5
0.06 mW/MHz 0.19 mW/MHz 0. 5 mW/MHz 0.4 mW/MHz
(+ cache)
ik
(+ cache) (+ cache)
MIPS/MHz 0.97 1.1 1.3 1.2
m
Architecture Von Neumann Harvard Harvard Harvard
Multiplier 8 x 32 8 x 32 16 x 32 16 x 32
o w
h
. B
. P
r
Day 1
C
ARM Processor Families - C
I T
- V
0 6
2 3
( 5
ik
m
o w
h
. B
. P
r
Day 1
C
Architecture Revisions - C
I T
- V
6
Revision Example core ISA enhancement
ARMv1
Implementation
ARM1
3 0
First ARM Processor
ARMv2 ARM2 2
26 – bit addressing
5
32 – bit multiplier
ik
ARMv2a ARM3 On chip cache
Atomic swap instruction
ARMv3
m
ARM6 & ARM7DI 32 – bit addressing
w
Separate cpsr & spsr
o
New modes – UNDEF, ABORT
MMU support – virtual memory
ARMv3M ARM7M
h Signed & unsigned long multiply
ARMv4
. B
StrongARM Load – store instruction
New Mode - System
.r P Day 1
C
Architecture Revisions - C
I T
- V
6
Revision Example core ISA enhancement
ARMv4T
Implementation
ARM7TDMI & ARM9T Thumb
3 0
ARMv5TE ARM9E & ARM10E
2
Superset of the ARMv4T
5
(
Extra inst. added for changing state
between ARM & Thumb
ik
Enhanced multiply instructions
Extra DSP type instructions
m
Faster multiply accumulate
w
ARMv5TEJ ARM7EJ & ARM926EJ Java acceleration
ARMv6 ARM11
. B
.r P Day 1
C
Instruction Set Architecture- C
I T
- V
6
Thumb
0
Architecture ® DSP Jazelle Media TrustZone Thumb-2
v4T *
2 3
v5TE * *
( 5
ik
m
v5TEJ * * *
o w
h
v6 * * * *
v6Z
.
*
B * * * *
P
v6T2 * * * * *
r. Day 1
C
ARM Processors -C
I T
ARM11 Family
- V
6
• ARM7 Family
ARM1136J-S
0
– ARM7EJ-S
ARM1136JF-S
3
– ARM7TDMI
– ARM7TDMI-S ARM1156T2(F)-S
2
– ARM720T ARM1176JZ(F)-S
5
ARM11 MPCore
(
• ARM9/9E Families
– ARM920T
ik
– ARM922T Cortex Family
– ARM926EJ-S Cortex-A8
m
– ARM940T Cortex-M1
– ARM946E-S Cortex-M3
w
– ARM966E-S Cortex-R4
o
– ARM968E-S
h
• Vector Floating Point Families Other Processors/Microarchitectures
StrongARM (DEC-Intel)
B
– VFP10
Xscale (Intel- Marvell Tech)
.
• ARM10 Family
– ARM1020E Other
– ARM1022E
. P
r
– ARM1026EJ-S
Day 1
C
Cortex Family -C
I T
• ARM Cortex-A Series - Application
- V processors
for complex OS and user applications
0 6
3
– ARM Cortex-A8, ARM Cortex-A9
• ARM Cortex-R Series Embedded
5 2 - processors
for real-time systems
(
•
–
i k
ARM Cortex-R4(F)
–
h o
ARM Cortex-M0, ARM Cortex-M1, ARM Cortex-M3
. B
.r P Day 1
C
Switching States -C
IT
-V
• ARM to Thumb
0 6
2 3
– Execute the BX instruction with state bit=1
• Thumb to ARM
( 5
i k
– Execute the BX instruction with state bit=0
m
– An interrupt or exception occurs
o w
h
. B
.r P Day 1