
CISC
Complex instruction set computer
Emphasis on hardware
Includes multi-clock complex instructions
Memory-to-memory: "LOAD" and "STORE" incorporated in instructions
Small code sizes, high cycles per second
Transistors used for storing complex instructions

Instructions are often of variable size and take many cycles to execute
Has dedicated registers for specific purposes
Data processing operations can act on memory directly
A traditional CISC processor core operates at lower clock frequencies because of complex hardware design rules

RISC
Reduced instruction set computer
Emphasis on software
Single-clock, reduced instructions only
Register-to-register: "LOAD" and "STORE" are independent instructions
Low cycles per second, large code sizes
Spends more transistors on memory registers

Instructions are often of fixed size and take a single cycle to execute
Has a large general-purpose register set
Operates on data held in registers
A traditional RISC processor core can operate at higher clock frequencies because of simple hardware design rules

In most respects, the ARM ISA reflects a RISC-style architecture, but it has some CISC-style features.
RISC-style Aspects
• All instructions have a fixed length of 32 bits.
• Only Load and Store instructions access memory.
• All arithmetic and logic instructions operate on operands in processor registers.
CISC-style Aspects
• Auto-increment, Auto-decrement, and PC-relative addressing modes are provided.
• Condition codes (N, Z, V, and C) are used for branching and for conditional execution of instructions.
• Multiple registers can be loaded from a block of consecutive memory words, or stored in a block, using a single instruction.

The ARM Design Philosophy
Portable embedded systems require some form of battery power. The ARM processor has been specifically designed to be small to reduce power consumption and extend battery operation, which is essential for applications such as mobile phones and personal digital assistants.
High code density is another major requirement since embedded systems have limited memory due to cost and/or physical size restrictions. High code density is useful for applications that have limited on-board memory, such as mobile phones and mass storage devices.
ARM has incorporated hardware debug technology within the processor so that software engineers can view what is happening while the processor is executing code. With greater visibility, software engineers can resolve issues faster, which has a direct ...

The ARM architecture has a number of features not generally found in modern processors.
Conditional execution: An instruction is only executed when a specific condition has been satisfied. This feature improves performance and code density by reducing branch instructions.
Thumb 16-bit instruction set: ARM enhanced the processor core by adding a second 16-bit instruction set called Thumb that permits the ARM core to execute either 16- or 32-bit instructions. The 16-bit instructions improve code density by about 30% over 32-bit fixed-length instructions.
Enhanced instructions: The enhanced digital signal processor (DSP) instructions were added to the standard ARM instruction set to support fast 16×16-bit multiplier operations and saturation. These instructions allow a faster-performing ARM processor in some cases to replace the traditional combination of a processor plus a DSP.

Variable cycle execution for certain instructions: Not every ARM instruction executes in a single cycle. For example, load-store-multiple instructions vary in the number of execution cycles depending upon the number of registers being transferred. The transfer can occur on sequential memory addresses, which increases performance since sequential memory accesses are often faster than random accesses. Code density is also improved since multiple register transfers are common operations at the start and end of functions.
Inline barrel shifter leading to more complex instructions: The inline barrel shifter is a hardware component that preprocesses one of the input registers before it is used by an instruction. This expands the capability of many instructions to improve core performance and code density.


ARM 32-bit processor
Data width = 32 bits
Processor registers = 32 bits
8 bits = 1 byte
16 bits = half word
32 bits = word
Supports both little endian and big endian

Reg = 0A0B0C0D
Address    Little endian    Big endian
a          0D               0A
a+1        0C               0B
a+2        0B               0C
a+3        0A               0D
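The two byte orders can be checked with a small sketch. It uses the register value 0A0B0C0D from the slide; the helper name `byte_layout` is made up for illustration, and Python's `int.to_bytes` stands in for what the memory system does when a word is stored.

```python
REG = 0x0A0B0C0D  # register value used on the slide


def byte_layout(value, byteorder):
    """Return the four bytes of a 32-bit word in increasing address order."""
    return [f"{b:02X}" for b in value.to_bytes(4, byteorder)]


# Little endian stores the least significant byte at the lowest address,
# big endian stores the most significant byte there.
little = byte_layout(REG, "little")
big = byte_layout(REG, "big")

for offset in range(4):
    print(f"a+{offset}: little={little[offset]} big={big[offset]}")
```

Reading the printed rows top to bottom gives the same two columns as the table above.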

Leading provider of 32-bit embedded RISC microprocessors
75% of market
High performance
Low power consumption
Low system cost


StrongARM
Modified by Intel Corporation.
•Harvard architecture
•caches
•five-stage pipeline
•includes MMU
•does not support the Thumb instruction set

XScale
Modified by Intel Corporation.
•XScale executes architecture v5TE instructions
•Harvard architecture
•up to 1 GHz
•includes MMU

SecurCore SC100
The SC100 is the first SecurCore and is based on an ARM7TDMI core with an MPU.

ARM7TDMI-S
The ARM7TDMI-S has the same operating characteristics as a standard ARM7TDMI but is also synthesizable.
ARM720T is the most flexible member of the ARM7 family because it includes an MMU. The presence of the MMU means the ARM720T is capable of handling the Linux and Microsoft embedded platform operating systems. The processor also includes a unified 8K cache.
Another variation is the ARM7EJ-S processor, also synthesizable. ARM7EJ-S is quite different since it includes a five-stage pipeline and executes ARMv5TEJ instructions.


All ARM cores after the ARM7TDMI include the TDMI features even though they may not include those letters after the "ARM" label.
JTAG is described by IEEE 1149.1 Standard Test Access Port and boundary scan architecture. It is a serial protocol used by ARM to send and receive debug information between the processor core and test equipment.


ARM7 family features
Version : ARMv4T
Pipeline depth : 3 stages
Typical MHz : 80 MHz
mW/MHz : 0.06
MIPS/MHz : 0.97
Architecture : Von Neumann
Multiplier : 8x32

Registers
•General-purpose registers hold either data or an address.
•They are identified with the letter r prefixed to the register number.
•The figure shows the active registers available in user mode.

•All the registers shown are 32 bits in size.
•There are up to 18 active registers: 16 data registers and 2 processor status registers, CPSR and SPSR (the current and saved program status registers, respectively).
The ARM processor has three registers assigned to a particular task or special function: r13, r14, and r15. They are frequently given different labels to differentiate them from the other registers.
■ Register r13 is traditionally used as the stack pointer (sp) and stores the head of the stack in the current processor mode.
■ Register r14 is called the link register (lr) and is where the core puts the return address whenever it calls a subroutine.
■ Register r15 is the program counter (pc).

Current Program Status Register
•The ARM core uses the CPSR to monitor and control internal operations.
•The CPSR is a dedicated 32-bit register and resides in the register file.
•The CPSR is divided into four fields, each 8 bits wide: flags, status, extension, and control. In current designs the extension and status fields are reserved for future use.
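The four-field split can be sketched in a few lines. The slide gives the field names and widths but not their bit positions; the sketch assumes the usual ARM layout (flags in bits 31:24, status in 23:16, extension in 15:8, control in 7:0). The function name `split_cpsr` and the sample value are illustrative only.

```python
# Assumed bit position of the low end of each 8-bit field (not given on the slide).
FIELD_SHIFTS = {"flags": 24, "status": 16, "extension": 8, "control": 0}


def split_cpsr(cpsr):
    """Split a 32-bit CPSR value into its four 8-bit fields."""
    return {name: (cpsr >> shift) & 0xFF for name, shift in FIELD_SHIFTS.items()}


# Hypothetical CPSR value chosen for illustration.
fields = split_cpsr(0x600000D3)
for name, value in fields.items():
    print(f"{name:9s} = 0x{value:02X}")
```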


Each processor mode is either privileged or nonprivileged.
Privileged mode: allows full read-write access to the CPSR.
Non-privileged mode: allows read access to the control field in the CPSR but still allows read-write access to the condition flags.
•The processor enters abort mode when there is a failed attempt to access memory.
•Fast interrupt request and interrupt request modes correspond to the two interrupt levels available on the ARM processor.
•Supervisor mode is the mode that the processor is in after reset and is generally the mode that an operating system kernel operates in.
•System mode is a special version of user mode that allows full read-write access to the CPSR.
•Undefined mode is used when the processor encounters an instruction that is undefined or not supported by the implementation.

Interrupt Masks
•Interrupt masks are used to stop specific interrupt requests from interrupting the processor.
•There are two interrupt request levels available on the ARM processor core: interrupt request (IRQ) and fast interrupt request (FIQ).
•The CPSR has two interrupt mask bits, 7 and 6 (or I and F), which control the masking of IRQ and FIQ, respectively.
•The I bit masks IRQ when set to binary 1.
•The F bit masks FIQ when set to binary 1.
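As a rough illustration of the two mask bits, the sketch below tests and changes bits 7 and 6 of a CPSR value held in a plain integer. The helper names and the sample value 0x53 are hypothetical; on real hardware the CPSR is changed with dedicated instructions in a privileged mode, not by ordinary arithmetic.

```python
I_BIT = 1 << 7  # masks IRQ when set
F_BIT = 1 << 6  # masks FIQ when set
MASK32 = 0xFFFFFFFF


def irq_masked(cpsr):
    return bool(cpsr & I_BIT)


def fiq_masked(cpsr):
    return bool(cpsr & F_BIT)


def mask_irq(cpsr):
    """Return the CPSR value with the I bit set (IRQ disabled)."""
    return (cpsr | I_BIT) & MASK32


def unmask_irq(cpsr):
    """Return the CPSR value with the I bit cleared (IRQ enabled)."""
    return cpsr & ~I_BIT & MASK32


cpsr = 0x00000053  # hypothetical value: F set, I clear
print(irq_masked(cpsr), fiq_masked(cpsr))
print(hex(mask_irq(cpsr)))
```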

REGISTER FILE
•The register file contains 37 registers.
•20 registers are hidden from a program at different times. These registers are called banked registers and are identified by the shading in the diagram.
•They are available only when the processor is in a particular mode; for example, abort mode has banked registers r13_abt, r14_abt and spsr_abt.
•Every processor mode except user mode can change mode by writing directly to the mode bits of the CPSR.
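The counts of 37 and 20 can be checked by listing the registers. Only the abort-mode bank is named on the slide; the other banks below follow the classic ARM register model (fast interrupt request mode banks r8 to r14, the remaining exception modes bank r13 and r14, and each has its own SPSR) and should be read as an assumption, as should the short mode labels.

```python
# Registers visible in user mode: r0..r15 plus the CPSR.
USER_VISIBLE = [f"r{n}" for n in range(16)] + ["cpsr"]

# Banked registers per mode (assumed from the classic ARM register model;
# only the abort-mode bank is listed on the slide).
BANKED = {
    "fiq": [f"r{n}_fiq" for n in range(8, 15)] + ["spsr_fiq"],
    "irq": ["r13_irq", "r14_irq", "spsr_irq"],
    "svc": ["r13_svc", "r14_svc", "spsr_svc"],
    "abt": ["r13_abt", "r14_abt", "spsr_abt"],
    "und": ["r13_und", "r14_und", "spsr_und"],
}

banked_total = sum(len(regs) for regs in BANKED.values())
total = len(USER_VISIBLE) + banked_total

print("banked registers:", banked_total)
print("registers in the file:", total)
```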

ARM core dataflow model
•The dataflow model provides an overview of the ARM core as functional units connected by data buses. The arrows represent the flow of data, the lines represent the buses, and the boxes represent either an operation unit or a storage area.
•Data enters the processor core through the Data bus. The data may be an instruction to execute or a data item.
•The instruction decoder translates instructions before they are executed. Each instruction executed belongs to a particular instruction set.
•The ARM processor uses a load-store architecture. This means it has two instruction types for transferring data in and out of the processor: load instructions copy data from memory to registers in the core, and conversely store instructions copy data from registers to memory.
•Data items are placed in the register file, a storage bank made up of 32-bit registers. Since the ARM core is a 32-bit processor, most instructions treat the registers as holding signed or unsigned 32-bit values.
•The sign extend hardware converts signed 8-bit and 16-bit numbers to 32-bit values as they are read from memory and placed in a register.
•ARM instructions typically have two source registers, Rn and Rm, and a single result or destination register, Rd. Source operands are read from the register file using the internal buses A and B, respectively.
•The ALU (arithmetic logic unit) or MAC (multiply-accumulate unit) takes the register values Rn and Rm from the A and B buses and computes a result. Data processing instructions write the result in Rd directly to the register file.
•Load and store instructions use the ALU to generate an address to be held in the address register and broadcast on the Address bus.
•Register Rm alternatively can be preprocessed in the barrel shifter before it enters the ALU. Together the barrel shifter and ALU can calculate a wide range of expressions and addresses.
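The sign-extension step described above can be sketched in software. This is only a model of what the sign extend hardware produces, with 32-bit register contents held in Python integers; the function name `sign_extend` is made up for the example.

```python
def sign_extend(value, bits):
    """Extend a signed `bits`-wide value to a 32-bit register pattern."""
    mask = (1 << bits) - 1
    value &= mask
    # If the top bit of the narrow value is set, fill the upper bits with ones.
    if value & (1 << (bits - 1)):
        value |= 0xFFFFFFFF & ~mask
    return value


print(hex(sign_extend(0x80, 8)))     # signed 8-bit -128
print(hex(sign_extend(0x7F, 8)))     # signed 8-bit +127
print(hex(sign_extend(0xFFFE, 16)))  # signed 16-bit -2
```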

Pipeline
•A pipeline is the mechanism a RISC processor uses to execute instructions.
•Using a pipeline speeds up execution by fetching the next instruction while other instructions are being decoded and executed.

■ Fetch loads an instruction from memory.
■ Decode identifies the instruction to be executed.
■ Execute processes the instruction and writes the result back to a register.
•The figure shows a sequence of three instructions being fetched, decoded, and executed by the processor.
•Each instruction takes a single cycle to complete after the pipeline is filled.

•The three instructions are placed into the pipeline sequentially.
Cycle 1: the core fetches the ADD instruction from memory.
Cycle 2: the core fetches the SUB instruction and decodes the ADD instruction.
Cycle 3: the ADD instruction is executed, the SUB instruction is decoded, and the third instruction is fetched.
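The cycle-by-cycle behaviour can be reproduced with a toy model of a three-stage pipeline. This is a sketch only: it assumes every instruction spends exactly one cycle in each stage, and the third instruction name, CMP, is an assumption since the slide names only ADD and SUB.

```python
STAGES = ("fetch", "decode", "execute")


def simulate(instructions):
    """Return one dict per cycle mapping each stage to the instruction in it."""
    rows = []
    n_cycles = len(instructions) + len(STAGES) - 1
    for cycle in range(n_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            # An instruction fetched in cycle c reaches stage `depth` in cycle c + depth.
            idx = cycle - depth
            row[stage] = instructions[idx] if 0 <= idx < len(instructions) else None
        rows.append(row)
    return rows


rows = simulate(["ADD", "SUB", "CMP"])  # CMP is a hypothetical third instruction
for number, row in enumerate(rows, start=1):
    print(f"Cycle {number}: {row}")
```

Once the pipeline is full (cycle 3 onward), one instruction leaves the execute stage every cycle.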


Barrel Shifter
A barrel shifter is a digital circuit that can shift a data word by a specified number of bits in one clock cycle.

•A unique and powerful feature of the ARM processor is the ability to shift the 32-bit binary pattern in one of the source registers left or right by a specific number of positions before it enters the ALU.
•Pre-processing or shift occurs within the cycle time of the instruction.
•This is particularly useful for loading constants into a register and achieving fast multiplies or division by a power of 2.
•There are five different shift operations that you can use within the barrel shifter.

Before: r5 = 5, r7 = 8
MOV r7, r5, LSL #2
After: r5 = 5, r7 = 20
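The MOV example can be modelled in software. The sketch below is a model of the arithmetic only, not of the hardware: it implements four of the shift operations on 32-bit values using the standard ARM mnemonics LSL, LSR, ASR and ROR as function names (only LSL appears on the slide; the others, and the omitted RRX, are brought in from the ARM instruction set as an assumption).

```python
MASK32 = 0xFFFFFFFF


def lsl(value, n):
    """Logical shift left: multiply by 2**n, discarding bits above bit 31."""
    return (value << n) & MASK32


def lsr(value, n):
    """Logical shift right: unsigned divide by 2**n."""
    return (value & MASK32) >> n


def asr(value, n):
    """Arithmetic shift right: divide by 2**n, preserving the sign bit."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32  # reinterpret the pattern as a negative number
    return (value >> n) & MASK32


def ror(value, n):
    """Rotate right: bits shifted out at the bottom re-enter at the top."""
    n %= 32
    value &= MASK32
    return ((value >> n) | (value << (32 - n))) & MASK32


r5 = 5
r7 = lsl(r5, 2)  # models MOV r7, r5, LSL #2
print(r5, r7)
```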

Exceptions, Interrupts, and the Vector Table
•When an exception or interrupt occurs, the processor sets the PC to a specific memory address.
•The address is within a special address range called the vector table.
•The memory map address 0x00000000 is reserved for the vector table.
•On some processors the vector table can be optionally located at a higher address in memory (starting at the offset 0xffff0000). Operating systems such as Linux and Microsoft's embedded products can take advantage of this feature.
•When an exception or interrupt occurs, the processor suspends normal execution and starts loading instructions from the exception vector table.
•Each vector table entry contains a form of branch instruction.

■ Reset vector is the location of the first instruction executed by the processor when power is applied. This instruction branches to the initialization code.
■ Undefined instruction vector is used when the processor cannot decode an instruction.
■ Software interrupt vector is called when you execute a SWI instruction. The SWI instruction is frequently used as the mechanism to invoke an operating system routine.
■ Prefetch abort vector occurs when the processor attempts to fetch an instruction from an address without the correct access permissions. The actual abort occurs in the decode stage.
■ Data abort vector is similar to a prefetch abort but is raised when an instruction attempts to access data memory without the correct access permissions.
■ Interrupt request vector is used by external hardware to interrupt the normal execution flow of the processor. It can only be raised if IRQs are not masked in the CPSR.
■ Fast interrupt request vector is similar to the interrupt request but is reserved for hardware requiring faster response times. It can only be raised if FIQs are not masked in the CPSR.
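How a vector address is formed can be sketched as a base plus an offset. The slides give the two possible base addresses but not the per-exception offsets; the offsets below are the standard ARM values and are an addition, as are the dictionary keys and the function name `vector_address`.

```python
LOW_BASE = 0x00000000   # default vector table location
HIGH_BASE = 0xFFFF0000  # optional high location on some processors

# Standard ARM vector offsets (not listed on the slides); 0x14 is reserved.
VECTOR_OFFSETS = {
    "reset": 0x00,
    "undefined instruction": 0x04,
    "software interrupt": 0x08,
    "prefetch abort": 0x0C,
    "data abort": 0x10,
    "interrupt request": 0x18,
    "fast interrupt request": 0x1C,
}


def vector_address(exception, high_vectors=False):
    """Return the address the PC is set to when `exception` is taken."""
    base = HIGH_BASE if high_vectors else LOW_BASE
    return base + VECTOR_OFFSETS[exception]


for name in VECTOR_OFFSETS:
    print(f"{name:24s} 0x{vector_address(name):08X} 0x{vector_address(name, True):08X}")
```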

Memory Management
•Embedded systems often use multiple memory devices.
•It is usually necessary to have a method to help organize these devices and protect the system from applications trying to make inappropriate accesses to hardware. This is achieved with the assistance of memory management hardware.
ARM cores have three different types of memory management hardware.
Non-protected memory is fixed and provides very little flexibility. It is normally used for small, simple embedded systems that require no protection from rogue applications.
Memory protection units (MPUs) employ a simple system that uses a limited number of memory regions. This type of memory management is used for systems that require memory protection but don't have a complex memory map.

Memory management units (MMUs) are the most comprehensive memory management hardware available on the ARM. The MMU uses a set of translation tables to provide fine-grained control over memory. MMUs are designed for more sophisticated ...
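The region-based protection an MPU provides can be sketched as a lookup over a short list of regions. Everything concrete here is hypothetical: the `Region` type, the two example regions and their addresses, and the first-match rule. Real ARM MPUs define their own region registers, permission encodings and priority rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    base: int
    size: int
    readable: bool
    writable: bool


# Hypothetical memory map with a limited number of regions.
REGIONS = [
    Region(base=0x00000000, size=0x10000, readable=True, writable=False),  # read-only code
    Region(base=0x20000000, size=0x08000, readable=True, writable=True),   # read-write data
]


def access_allowed(address, write):
    """Return True if the access falls in a region that permits it."""
    for region in REGIONS:
        if region.base <= address < region.base + region.size:
            return region.writable if write else region.readable
    return False  # addresses outside every region are rejected


print(access_allowed(0x00000100, write=False))
print(access_allowed(0x00000100, write=True))
print(access_allowed(0x40000000, write=False))
```

An MMU differs in that it looks the address up in translation tables rather than a fixed list, which is what gives it finer-grained control.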