You are on page 1of 62

C

-C
IT
-V
0 6
2 3
( 5
ik
m
Advanced RISC Machines

o
Introductionw and Architecture
h
. B
.r P Day 1
C
ARM History -C
IT
-V
• Key component of many 32 – bit
0 6
embedded systems
2 3
• Portable Consumer devices
( 5
ik
• ARM1 prototype in 1985
m
• One of the ARM’s most successful cores is
w
the ARM7TDMI,provides high code density
o
h
and low power consumption
. B
. P
r
Day 1
C
ARM History -C
IT
V -
6
• ARM – Acorn RISC Machine(1983–1985)
0
3
– Acorn Computers Limited, Cambridge,
2
5
England
(
• ARM – Advanced RISCikMachine 1990
m
w
– ARM Limited, 1990
o
h manufacturers
– ARM has been licensed to many
B
semiconductor
.
.r P Day 1
C
Advanced RISC Machines - C
I T
- V
• ARM Core uses a RISC architecture
0 6
• ARM is Physical hardware
2 3 design
company.
( 5
• ARM licenses its k
i cores out and other
m
companies make processors based on its
cores
o w
h
. B
.r P Day 1
C
Companies licensing with ARM -C
I T
• 3com • Motorola
- V
• Agilent Technologies • Panasonic
0 6
• Sharp 3
• Altera • Qualcomm

• Sanyo2
5
• Epson
• Freescale
(
•kSony
• Sun Microsystems
• Fijitsu
i• Symbian
m • Texas Instruments
• NEC
• Nokia
• Intel
o w
h
• Toshiba
• IBM
• Microsoft B
• Wipro

P . ………….And many more

.r Day 1
C
The RISC Design Philosophy- C
I T
- V
• RISC is characterized by limited
0 6 number
of instructions
3
• A complex instruction 2is obtained as a
5
sequence of simple (instructions.In RISC
processor softwareik is complex but the
m
processor architecture is simple
• Large number w
h o of registers are required.
• Pipelined instruction execution
Ex : ARM,. BATMEL AVR, MIPS, Power PC etc
.r P Day 1
C
The CISC Design Philosophy- C
I T
V
• CISC is characterized by large- instruction
0 6
set.
2 3
• The aim of designing CISC
(
reduce software complexity5 processors is to
by increasing
i k
the complexity of processor architecture.
• Very small numberm of registers are
available.
o w
h family, Motorola 68000
Ex : Intel X86
. B
series.

.r P Day 1
C
CISC vs. RISC -C
I T
- V
CISC
0 6
RISC

Compiler
2 3Compiler
Greater

5
Complexity

(
Code Generation
ik Code Generation

m
Processor o w
Greater
Processor
h Complexity

. B
. P
r
Day 1
C
RISC Design Rules -C
IT
-V
• Instructions
0 6
• Pipelines
2 3
• Registers
( 5

ik
Load – Store Architecture
m
o w
h
. B
. P
r
Day 1
C
RISC – 4 major design rules- C
I T
- V
• Instructions
0 6
3
– Reduced Number of Instructions
– Execute in a single cycle 2
5
( complicated
ik
– The compiler synthesizes
operations
– Each instruction m
o w is a fixed length

h
. B
.r P Day 1
C
RISC – 4 major design rules- C
I T
- V
• Pipelines
0 6
2 3
– The processing of instructions is broken down

parallel by pipelines (5
into smaller units that can be executed in

– Pipeline advances byi kone step on each cycle


m
for maximum throughput

o w
h
. B
.r P Day 1
C
RISC – 4 major design rules- C
I T
- V
• Registers 6
– Have a large general purpose0register set
2 3
( 5
– Any register can contain either data or

ik
address

m
o w
h
. B
.r P Day 1
C
Load Store Architecture - C
I T
V
• Memory can be accessed only6-through
two dedicated instructions 0
– LDR ; move word from 2
3
5 memory to register
( register to memory
k
– STR ; move word from
i
m
• All other instructions have to work on
registers only.
o w
h
. B
.r P Day 1
C
ARM Design Policy -C
IT
-V
• Reduce power consumption
0 6
• High code density
2 3
• Price sensitive
( 5

ik
Reduce the area of the die taken up by
m
the embedded processor

o w
• ARM Incorporated hardware debug
technology h
. B
. P
r
Day 1
C
Instruction set for Embedded Systems -C
I T
• Variable cycle execution 6for
V
- certain
instructions
3 0
2
• Inline barrel shifter 5leading to more
complex instructions (
i k
m
• Thumb 16 – bit instructions
w
• Conditional execution
o
h
• Enhanced Instructions
. B
.r P Day 1
C
-C
IT
-V
0 6
2 3
( 5
ARM Processor Fundamentals
ik
m
o w
h
. B
.r P Day 1
C
ARM core dataflow model - C
I T
- V
Data

6
Instruction

0
Decoder

3
Sign Extend

2
Read

( 5
r15 Register File Rd
r0 – r15
pc Result

ik
A B
Rn Rm
A B Acc

m
Barrel Shifter

w
N MAC

o
ALU

h
B
Address Register

.
Incrementer

P
Address

r. Day 1
C
Registers -C
IT
-V
• ARM has Load Store Architecture
0 6
• General Purpose Registers can hold data
2 3
or address
( 5
ik
• Total of 37 Registers, each of 32 bit
m
• There are 17 or 18 active Registers

o w
– 16 data registers
h
– 2 status registers

. B
.r P Day 1
C
Registers – User Mode -C
T I
- V
• 32 bit in size - Hold either data or address
• 16 data registers(r0 – r15) and 26processor
0
r0
r1

status register (cpsr & spsr)


3
r2

2
r3

5 the stack in the


• r13, r14, r15 – Special functions r4

• r13 (sp) –stores the head(of


r5
r6

i k
r7

current processor mode r8

• r14 (lr) – the core m


r9

puts the return address r10

whenever it callswa subroutine


r11

o
r12
r13 sp

instruction tohbe fetched by the processor


• r15 (pc) – contains the address of the next r14 lr

B
r15 pc

.
cpsr

dependPupon the current mode of the processor


• Which register are visible to the programmer -

.r Day 1
C
-C
Current Program Status Register
I T
- V
• To monitor and control internal
0 6
operations
2 3
• Some ARM Processor core have extra bits
( 5
ik
allocated

m
Flags Status Extension Control
Fields

w
Bit 31 30 29 28 7 6 5 4 0

NZ CV
h o I F T Mode

Function

. B
Condition Flags Interrupt Processor Mode

P
Masks

.
Thumb State

r
Day 1
C
Processor Modes -C
IT
- V
• Determines which registers are active and the
0 6
3
access rights to the cpsr register itself
• Privileged & Nonprivileged
5 2
– Abort (

ik
Fast Interrupt Request
– Interrupt Request
m
– Supervisor
o w Privileged

– System
h
– Undefined
. B
P
– User Nonprivileged

r. Day 1
C
Banked Registers -C
I T
- V
r0
r1
Banked Registers
0 6
3
r2

2
r3

5
r4

(
r5
Fast
r6 Interrupt

ik
r7 Request
r8 r8_fiq

m
r9 r9_fiq
r10 r10_fiq Interrupt

w
r11 r11_fiq Request Supervisor Undefined

o
r12 r12_fiq
r13 sp r13_fiq r13_irq r13_svc r13_undef r13_abt

h
r14 lr r14_fiq r14_irq r14_svc r14_undef r14_abt

B
r15 pc

.
cpsr

P
- spsr_fiq spsr_irq spsr_svc spsr_undef spsr_abt

r . Day 1
C
Banked Registers -C
IT
-V
6
• Banked registers are available only when the processor
is in a particular mode
30
5 2
• Every processor mode except user mode can change
(
mode by writing directly to the mode bits of the cpsr

ik
• Banked registers are a subset of the main 16 registers
m
the new mode will w
• If we change processor mode, a banked register from

h o replace an existing register

. B
• Exceptions and Interrupts cause a mode change

.r P Day 1
C
-C
Changing mode on an exception
IT
User Mode
-V
r0
r1

0 6
• This change causes user

3
register r13 and r14 to
r2

2
r3
be banked
5
r4

(
r5
r6
• The user registers are
ik
r7
r8
replaced with registers
m
r9
Interrupt
r10
Request r13_irq and r14_irq
w
r11
Mode
• spsr stores the previous
o
r12
r13 sp r13_irq

h mode cpsr
r14 lr r14_irq

B
r15 pc

.
cpsr

P
- spsr_irq

r . Day 1
C
Processor Mode -C
I T
- V
6
Mode Abbr: Privileged Mode[4:0]

Abort abt yes

30 10111

2
Fast Interrupt Request fiq yes 10001
Interrupt Request irq

( 5
yes 10010

ik
Supervisor svc yes 10011
System sys yes 11111

m
Undefined und yes 11011

w
User usr no 10000

o
h due to a program writing
cpsr is not copied into the spsr when a mode

directly to B
change is forced

P . the cpsr.

.r Day 1
C
Operation Modes -C
IT
-V
Mode Registers
6 CPSR[4:0]
0 10000
User User
2 3
FIQ _fiq
( 5 10001

i_svck
IRQ _irq 10010
Supervisor Mode
m_abt 10011
Abort
o w 10111

h
Undefined Instruction _und 11011
System . B User 11111

. P
r
Day 1
C
Processor Modes -C
IT
User
-V
Unprivileged mode for most applications to
run
0 6
FIQ Fast Interrupt Routine
2 3
IRQ Interrupt Routines
5
Supervisor Entered on reset and( when there is a
exception ik
Entered whenm
aborted w
Abort data or instruction prefetch

Undefined When an
o
h undefined instructions is executed
. B
P
System Privileged user mode for operating system

.r Day 1
C
State and Instruction Sets - C
I T
- V
• There are three instruction sets
0 6
– ARM
2 3
– Thumb
( 5
– Jazelle
i k
m
The Jazelle instruction set is a closed instruction set and is
not openly available.
o w
licensed from B
h
To take advantage of Jazelle extra software has to be

.
both ARM Limited and Sun Microsystems.

.r P Day 1
C
State and Instruction Sets - C
I T
- V ARM Thumb

0 6(cpsr T = 0) (cpsr T = 1)

3
Instruction Size 32 bit 16 bit

Core Instruction
5
58
2 30

(
ik
Conditional Execution Most Only branch instructions

m
Data Processing Instructions Access to barrel shifter Separate barrel and ALU
and ALU instructions
Program Status Register

o w R/W in privileged mode No direct access

h
Register Usage 15 GPR + PC 8 GPR + 7 high registers
+ PC

. B
. P
r
Day 1
C
State and Instruction Sets - C
I T
- V
0 6
3
Jazelle

5 2 (cpsr T = 0, J = 1)

(
Instruction Size 8 bit

ik
Core Instruction Over 60% of the java bytecodes are
implemented in hardware; the rest of the codes

m
are implemented in software

o w
h
. B
. P
r
Day 1
C
Interrupt Masks -C
IT
-V
• Are used to stop specific interrupt
0 6
3
requests from interrupting the processor
2
– IRQ
( 5
– FIQ
i k
m
• The I bit masks IRQ when set to binary 1,
w
and F bit masks FIQ when set to binary 1
o
h
. B
.r P Day 1
C
Condition Flags -C
I T
- V
Flag Flag Name

0
Set when
6
2 3
5
Q Saturation The result causes an overflow and / or saturation

(
ik
V oVerflow The result causes a signed overflow

C Carry The result causes an unsigned carry

m
w
Z Zero The result is zero, frequently used to indicate the
equality
N Negative

h o
Bit 31 of the result is a binary 1

. B
. P
r
Day 1
C
Condition Flags -C
IT
-V
• Condition flags are updated by
0 6
comparisons and the result of ALU
2 3
operations that specify the S instruction
suffix ( 5
ik
– If SUBS results in a register value of zero,
m
then the Z flag in the CPSR is set

o w
h
. B
. P
r
Day 1
C
CPSR -C
I T
- V
Flags Status Extension

0 6 Control

3
Fields
Bit 31 30 29 28 27 24 7 6 5 4 0

5 2
0 0 1 0 0 0
( 0 1 0 10011

Function nzCvq j
ik i F t svc

m
w
cpsr = nzCvqjiFt_SVC
o
h
. B
. P
r
Day 1
C
CPSR
-C
I T
31 28 24 23 16 15 8

-
7

V
6 5 4 0

6
N Z C V J U n d e f i n e d I F T mode

3 0
hold information about the most recently performed ALU operation

2
 set the processor operating mode

• Condition code flags


– N = Negative result from ALU
( 5
• Interrupt Disable bits.

ik
– Z = Zero result from ALU – I = 1: Disables the IRQ.
– F = 1: Disables the FIQ.

m
– C = ALU operation Carried out
• T Bit

w
– V = ALU operation oVerflowed

o
• J bit – Architecture xT only

h
– Architecture 5TEJ only – T = 0: Processor in ARM state

B
– J = 1: Processor in Jazelle state – T = 1: Processor in Thumb state

.
• Mode bits

P
– Specify the processor mode

r . Day 1
C
Pipeline -C
IT
-V
• Is a mechanism a RISC processor uses to
0 6
execute instructions
2 3
5
• Using a pipeline speeds up execution by
(
ik
fetching the next instruction while other
instructions are being decoded and
m
executed
o w
h
. B
. P
r
Day 1
C
ARM7 Three stage pipeline - C
I T
- V
Fetch Decode
0 6
Execute

2 3
5
• Fetch loads an instruction from memory
• Decode identifies the (instruction to be
i k
m
executed
• Execute processes
o w the instruction and
h
writes the result back to a register

. B
.r P Day 1
C
Pipelined instruction sequence -C
I T
- V
Fetch Decode
0 6 Execute

Time
Cycle 1 ADD
2 3
Cycle 2 SUB
( 5
ADD

Cycle 3 CMP ik SUB ADD


m
o w
• Filling the pipeline
• Allows the h
.
every cycle
B core to execute an instruction

.r P Day 1
C
ARM9 Five stage pipeline - C
I T
- V
Fetch Decode Execute
6
Memory

0
Write

• Higher operating frequency


2 3  higher
performance
( 5
• Latency increases k
i
m throughput by
• Increase in instruction
around 13% in w
o 5 stage pipeline
h MIPS per MHz
B
• 1.1 Dhrystone
.
.r P Day 1
C
ARM9 Five stage pipeline - C
I T
- V
6
• Fetch

0
– The instruction is fetched from memory and placed in the instruction

3
pipeline

2
• Decode

5
– The instruction is decoded and register operands read from the

(
register file
• Execute

ik
– An operand is shifted and the ALU result generated
• Memory (Buffer/Data)

m
– Data memory is accessed if required. Otherwise the ALU result is

w
buffered for one clock cycle to give the same pipeline flow for all

o
instructions

h
• Write (Write-Back)
– The results generated by the instruction are written back to the

B
register file, including any data loaded from memory

.
. P
r
Day 1
C
ARM10 Six stage pipeline - C
I T
- V
Fetch Issue Decode
0 6
Execute Memory Write

2 3
5
• Increase in instruction throughput
around 34% in 6 stage (pipeline
by

• 1.3 Dhrystone MIPSik per MHz


m
w
• Code written for the ARM7 will execute
on ARM9 andoARM10
h
. B
.r P Day 1
C
ARM Instruction Sequence - C
I T
- V
Fetch Decode
0 6 Execute

Time Cycle 1 MSR


cpsr
2 3
5
IFt_SVC

Cycle 2 ADD (
MSR
cpsr

k
IFt_SVC
i cpsr

m
Cycle 3 AND ADD MSR iFt_SVC

Cycle 4 SUB w

h o AND ADD

. B
.r P Day 1
C
Pipeline Characteristics - C
I T
V
- will
• An instruction in the execute6stage
3 0
complete even though an interrupt has
been raised
5 2
( instruction or
i k
• The execution of a branch
branching by the direct modification of
mARM core to flush its
pipeline o w
the PC causes the

h
. B
.r P Day 1
C
ARM Exceptions -C
IT
-V
• ARM supports range of Interrupts, Traps,
0 6
Supervisor Calls, all grouped under
2 3
general heading of Exceptions
( 5
ik
m
o w
h
. B
. P
r
Day 1
C
Exceptions, Interrupts, and the Vector
-C
Table
I T
• When an exception or interrupt-Voccurs, the
0 6
processor set the PC to a specific memory address

2 3
• The address is within a special address range called
the vector table
5
( are instructions that
k
• The entries in the vector table
i
branch to specific routines designed to handle a
m
particular exception or interrupt
w
• When an exception or interrupt occurs,the
o
h
processor suspends normal execution and starts

B
loading instructions from the exception vector
table
P .
.r Day 1
C
Vector Addresses -C
IT
- V
6
Exception / Interrupt Shorthand Address High Address

Reset RESET
30
0x00000000 0xffff0000

Undefined Instruction UNDEF

5 2
0x00000004 0xffff0004

Software Interrupt SWI


( 0x00000008 0xffff0008

Prefetch Abort PABT


ik 0x0000000C 0xffff000C

Data Abort
m
DABT 0x000000010 0xffff0010
Reserved

o
-
w 0x000000014 0xffff0014

h
Interrupt Request IRQ 0x000000018 0xffff0018

B
Fast Interrupt Request FIQ 0x00000001C 0xffff001C

P .
r. Day 1
C
Exceptions, Interrupts, and the Vector
-C
Table
I T
• RESET – when power is applied,-Vbranches to
initialization code 6
0 decode an
• UNDEF – when the processor 3cannot
2
• SWI – when a SWI instruction (is5called
instruction

• PABT – attempts to fetch k


i access permissions
an instruction from an

m data memory without the


address without the correct

w
• DABT –attempts to access

• IRQ – by externalo
correct access permissions
h hardware

. B
• FIQ – by external hardware requiring faster response

P
time

.r Day 1
C
Exception Priorities -C
IT
-V
1. Reset (Highest Priority)
0 6
2. Data Abort
2 3
3. FIQ
( 5
4. IRQ
ik
5. Prefetch Abort m
6. SWI, Undefined o w
h
. B
. P
r
Day 1
C
Core Extensions -C
IT
-V
• Standard components placed next to the
0 6
ARM core
2 3
5
• Improve performance, manage resources,
(
ik
provide extra functionality

m
• Three hardware extensions
– Caches
o w
h
– Memory Management

. B
– Coprocessors

. P
r
Day 1
C
-C
Caches
IT
-V
• Cache is a block of fast memory placed
0 6
between main memory and the core
2 3
( 5
• Cache provides an overall increase in
performance
ik
m
• ARM has two forms of cache

o w
– Single unified cache for data and
h
instruction

. B
– Separate caches for data and instruction

.r P Day 1
C
Memory Management -C
IT
V
-
• MMU is a class of processor6hardware
components for handling 3 0
memory
5 2
accesses requested by
( the CPU.

i k
• The functions of MMU’s are
– Translation ofm
virtual address to physical
address. w

h o
– Memory protection
– CacheBcontrol etc
P .
.r Day 1
C
Coprocessors -C
IT
-V
• Coprocessors can be attached to the ARM
processor
0 6
2 3
• A separate chip,that performs lot of calculations

( 5
for the microprocessor,relieving the CPU some
of its work and thus enhancing overall speed of

ik
system.
• A secondary processor used to speed up
m
operation by taking over a specific part of main
processors work.
o w
h
• The ARM processor uses coprocessor 15 registers

B
to control cache, TCMs, and memory

.
management

. P
r
Day 1
C
Description of cpsr -C
I T
-V
6
Parts Bits Architecture Description
Mode 4:0 all
30 processor mode

2
T 5 ARMv4T Thumb state

5
I&F 7:6 all interrupt masks
J 24 ARMv5TEJ
( Jazelle state

ik
Q 27 ARMv5TE condition flag
V 28 all condition flag
C 29
m
all condition flag
Z
N
30
31
o wall
all
condition flag
condition flag

h
. B
. P
r
Day 1
C
ARM processor families - C
I T
- V
• ARM7, ARM9, ARM10 and ARM11
0 6
2 3
• 7, 9, 10, 11 indicate different core
designs
( 5
ik
m
o w
h
. B
.r P Day 1
C
-C
ARM family attribute comparison
IT
-V
ARM7 ARM9
6
ARM10
0
ARM11
Pipeline depth
Typical MHz
three-stage five-stage
80 150
2
260 3
six-stage eight-stage
335
mW/MHz
( 5
0.06 mW/MHz 0.19 mW/MHz 0. 5 mW/MHz 0.4 mW/MHz
(+ cache)

ik
(+ cache) (+ cache)
MIPS/MHz 0.97 1.1 1.3 1.2

m
Architecture Von Neumann Harvard Harvard Harvard
Multiplier 8 x 32 8 x 32 16 x 32 16 x 32

o w
h
. B
. P
r
Day 1
C
ARM Processor Families - C
I T
- V
0 6
2 3
( 5
ik
m
o w
h
. B
. P
r
Day 1
C
Architecture Revisions - C
I T
- V
6
Revision Example core ISA enhancement
ARMv1
Implementation
ARM1
3 0
First ARM Processor
ARMv2 ARM2 2
26 – bit addressing

5
32 – bit multiplier

(32 – bit coprocessor support

ik
ARMv2a ARM3 On chip cache
Atomic swap instruction
ARMv3
m
ARM6 & ARM7DI 32 – bit addressing

w
Separate cpsr & spsr

o
New modes – UNDEF, ABORT
MMU support – virtual memory
ARMv3M ARM7M
h Signed & unsigned long multiply
ARMv4

. B
StrongARM Load – store instruction
New Mode - System

.r P Day 1
C
Architecture Revisions - C
I T
- V
6
Revision Example core ISA enhancement
ARMv4T
Implementation
ARM7TDMI & ARM9T Thumb
3 0
ARMv5TE ARM9E & ARM10E
2
Superset of the ARMv4T

5
(
Extra inst. added for changing state
between ARM & Thumb

ik
Enhanced multiply instructions
Extra DSP type instructions

m
Faster multiply accumulate

w
ARMv5TEJ ARM7EJ & ARM926EJ Java acceleration
ARMv6 ARM11

h o New multimedia instructions

. B
.r P Day 1
C
Instruction Set Architecture- C
I T
- V
6
Thumb

0
Architecture ® DSP Jazelle Media TrustZone Thumb-2
v4T *

2 3
v5TE * *
( 5
ik
m
v5TEJ * * *

o w
h
v6 * * * *

v6Z

.
*
B * * * *

P
v6T2 * * * * *

r. Day 1
C
ARM Processors -C
I T
 ARM11 Family
- V
6
• ARM7 Family
 ARM1136J-S

0
– ARM7EJ-S
 ARM1136JF-S

3
– ARM7TDMI
– ARM7TDMI-S  ARM1156T2(F)-S

2
– ARM720T  ARM1176JZ(F)-S

5
 ARM11 MPCore

(
• ARM9/9E Families
– ARM920T

ik
– ARM922T  Cortex Family
– ARM926EJ-S  Cortex-A8

m
– ARM940T  Cortex-M1
– ARM946E-S  Cortex-M3

w
– ARM966E-S  Cortex-R4

o
– ARM968E-S

h
• Vector Floating Point Families  Other Processors/Microarchitectures
 StrongARM (DEC-Intel)

B
– VFP10
 Xscale (Intel- Marvell Tech)

.
• ARM10 Family
– ARM1020E  Other
– ARM1022E

. P
r
– ARM1026EJ-S
Day 1
C
Cortex Family -C
I T
• ARM Cortex-A Series - Application
- V processors
for complex OS and user applications
0 6
3
– ARM Cortex-A8, ARM Cortex-A9
• ARM Cortex-R Series Embedded
5 2 - processors
for real-time systems
(

i k
ARM Cortex-R4(F)

ARM Cortex-M Series – Embedded processors


optimized for cost m
Mobile devices w
sensitive applications, as


h o
ARM Cortex-M0, ARM Cortex-M1, ARM Cortex-M3

. B
.r P Day 1
C
Switching States -C
IT
-V
• ARM to Thumb
0 6
2 3
– Execute the BX instruction with state bit=1
• Thumb to ARM
( 5
i k
– Execute the BX instruction with state bit=0

m
– An interrupt or exception occurs

o w
h
. B
.r P Day 1

You might also like