You are on page 1of 45

Announcement

Embedded Systems Design: A Unified 1


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Embedded Systems Design: A Unified
Hardware/Software Introduction

Chapter 3 General-Purpose Processors:


Software

2
Introduction

• General-Purpose Processor
– Processor designed for a variety of computation tasks
– Low unit cost, in part because manufacturer spreads NRE
over large numbers of units
• Motorola sold half a billion 68HC05 microcontrollers in 1996 alone
– Carefully designed since higher NRE is acceptable
• Can yield good performance, size and power
– Low NRE cost, short time-to-market/prototype, high
flexibility
• User just writes software; no processor design
– a.k.a. “microprocessor” – “micro” used when they were
implemented on one or a few chips rather than entire rooms
Embedded Systems Design: A Unified 3
Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Basic Architecture

• Control unit and Processor


datapath Control unit Datapath

– Note similarity to ALU


Controller Control
single-purpose /Status
processor
• Key differences Registers

– Datapath is general
– Control unit doesn’t PC IR
store the algorithm –
the algorithm is
I/O
“programmed” into the
Memory
memory

Embedded Systems Design: A Unified 4


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Datapath Operations
• Load Processor
– Read memory location Control unit Datapath
into register ALU
• ALU operation Controller Control +1
/Status
– Input certain registers through
ALU, store back in register
Registers

• Store 10 11
– Write register to PC IR

memory location
I/O
Memory
...
10
...
11

Embedded Systems Design: A Unified 5


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Control Unit
• Control unit: configures the datapath
operations Processor
– Sequence of desired operations Control unit Datapath
(“instructions”) stored in memory –
“program” ALU
Controller Control
• Instruction cycle – broken into /Status
several sub-operations, each one
clock cycle, e.g.: Registers
– Fetch: Get next instruction into IR
– Decode: Determine what the
instruction means
– Fetch operands: Move data from PC IR R0 R1
memory to datapath register
– Execute: Move data through the
ALU I/O
– Store results: Write data from 100 load R0, M[500] Memory
...
register to memory 500 10
101 inc R1, R0
102 store M[501], R1
501
...
Embedded Systems Design: A Unified 6
Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Control Unit Sub-Operations

• Fetch Processor

– Get next instruction Control unit Datapath

ALU
into IR Controller Control
/Status
– PC: program
counter, always Registers

points to next
instruction PC IR
100 R0 R1
load R0, M[500]
– IR: holds the
fetched instruction I/O

100 load R0, M[500] Memory


...
500 10
101 inc R1, R0
102 store M[501], R1
501
...
Embedded Systems Design: A Unified 7
Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Control Unit Sub-Operations

• Decode Processor

– Determine what the Control unit Datapath

ALU
instruction means Controller Control
/Status

Registers

PC 100 IR R0 R1
load R0, M[500]

I/O

100 load R0, M[500] Memory


...
500 10
101 inc R1, R0
102 store M[501], R1
501
...
Embedded Systems Design: A Unified 8
Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Control Unit Sub-Operations

• Fetch operands Processor

– Move data from Control unit Datapath

ALU
memory to datapath Controller Control
register /Status

Registers

10
PC 100 IR R0 R1
load R0, M[500]

I/O

100 load R0, M[500] Memory


...
500 10
101 inc R1, R0
102 store M[501], R1
501
...
Embedded Systems Design: A Unified 9
Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Control Unit Sub-Operations

• Execute Processor

– Move data through Control unit Datapath

ALU
the ALU Controller Control
/Status
– This particular
instruction does Registers

nothing during this


sub-operation PC IR
10
100 R0 R1
load R0, M[500]

I/O

100 load R0, M[500] Memory


...
500 10
101 inc R1, R0
102 store M[501], R1
501
...
Embedded Systems Design: A Unified 10
Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Control Unit Sub-Operations

• Store results Processor

– Write data from Control unit Datapath

ALU
register to memory Controller Control
/Status
– This particular
instruction does Registers

nothing during this


sub-operation PC IR
10
100 R0 R1
load R0, M[500]

I/O

100 load R0, M[500] Memory


...
500 10
101 inc R1, R0
102 store M[501], R1
501
...
Embedded Systems Design: A Unified 11
Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Instruction Cycles

PC=100 Processor
Fetch Exec. Store
Fetch Decode ops results
Control unit Datapath

ALU
clk Controller Control
/Status

Registers

10
PC 100 IR R0 R1
load R0, M[500]

I/O

100 load R0, M[500] Memory


...
500 10
101 inc R1, R0
102 store M[501], R1
501
...
Embedded Systems Design: A Unified 12
Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Instruction Cycles

PC=100 Processor
Fetch Exec. Store
Fetch Decode ops results
Control unit Datapath

ALU
clk Controller Control +1
/Status

PC=101
Fetch Exec. Store Registers
Fetch Decode ops results
clk
10 11
PC 101 IR R0 R1
inc R1, R0

I/O

100 load R0, M[500] Memory


...
500 10
101 inc R1, R0
102 store M[501], R1
501
...
Embedded Systems Design: A Unified 13
Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Instruction Cycles

PC=100 Processor
Fetch Exec. Store
Fetch Decode ops results
Control unit Datapath

ALU
clk Controller Control
/Status

PC=101
Fetch Exec. Store Registers
Fetch Decode ops results
clk
10 11
PC 102 IR R0 R1
store M[501], R1

PC=102
Fetch Exec. Store
Fetch Decode ops results I/O

100 load R0, M[500] Memory


...
clk 500 10
101 inc R1, R0
102 store M[501], R1
501 11
...
Embedded Systems Design: A Unified 14
Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Architectural Considerations

• N-bit processor Processor

– N-bit ALU, registers, Control unit Datapath

ALU
buses, memory data Controller Control
interface /Status

– Embedded: 8-bit, 16- Registers

bit, 32-bit common


– Desktop/servers: 32- PC IR

bit, even 64
• PC size determines I/O

address space Memory

Embedded Systems Design: A Unified 15


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Architectural Considerations

• Clock frequency Processor

– Inverse of clock Control unit Datapath

ALU
period Controller Control
/Status
– Must be longer than
longest register to Registers

register delay in
entire processor PC IR
– Memory access is
often the longest I/O
Memory

Embedded Systems Design: A Unified 16


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Pipelining: Increasing Instruction
Throughput

Wash 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
Non-pipelined Pipelined
Dry 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8

non-pipelined dish cleaning Time pipelined dish cleaning Time

Fetch-instr. 1 2 3 4 5 6 7 8

Decode 1 2 3 4 5 6 7 8

Fetch ops. 1 2 3 4 5 6 7 8 Pipelined

Execute 1 2 3 4 5 6 7 8
Instruction 1
Store res. 1 2 3 4 5 6 7 8

Time
pipelined instruction execution

Embedded Systems Design: A Unified 17


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Superscalar and VLIW Architectures

• Performance can be improved by:


– Faster clock (but there’s a limit)
– Pipelining: slice up instruction into stages, overlap stages
– Multiple ALUs to support more than one instruction stream
• Superscalar
– Scalar: non-vector operations
– Fetches instructions in batches, executes as many as possible
• May require extensive hardware to detect independent instructions
– VLIW: each word in memory has multiple independent instructions
• Relies on the compiler to detect and schedule instructions
• Currently growing in popularity

Embedded Systems Design: A Unified 18


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Two Memory Architectures

Processor Processor
• Princeton
– Fewer memory
wires
• Harvard
– Simultaneous Program
memory
Data memory Memory
(program and data)
program and data
memory access
Harvard Princeton

Embedded Systems Design: A Unified 19


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Cache Memory

• Memory access may be slow Fast/expensive technology, usually on


the same chip
• Cache is small but fast
memory close to processor
Processor

– Holds copy of part of memory


– Hits and misses Cache

Memory

Slower/cheaper technology, usually on


a different chip

Embedded Systems Design: A Unified 20


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Programmer’s View

• Programmer doesn’t need detailed understanding of architecture


– Instead, needs to know what instructions can be executed
• Two levels of instructions:
– Assembly level
– Structured languages (C, C++, Java, etc.)
• Most development today done using structured languages
– But, some assembly level programming may still be necessary
– Drivers: portion of program that communicates with and/or controls
(drives) another device
• Often have detailed timing considerations, extensive bit manipulation
• Assembly level may be best for these

Embedded Systems Design: A Unified 21


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Assembly-Level Instructions
Instruction 1 opcode operand1 operand2

Instruction 2 opcode operand1 operand2

Instruction 3 opcode operand1 operand2

Instruction 4 opcode operand1 operand2

...

• Instruction Set
– Defines the legal set of instructions for that processor
• Data transfer: memory/register, register/register, I/O, etc.
• Arithmetic/logical: move register through ALU and back
• Branches: determine next PC value when not just PC+1
Embedded Systems Design: A Unified 22
Hardware/Software Introduction, (c) 2000 Vahid/Givargis
A Simple (Trivial) Instruction Set

Assembly instruct. First byte Second byte Operation

MOV Rn, direct 0000 Rn direct Rn = M(direct)

MOV direct, Rn 0001 Rn direct M(direct) = Rn

MOV @Rn, Rm 0010 Rn Rm M(Rn) = Rm

MOV Rn, #immed. 0011 Rn immediate Rn = immediate

ADD Rn, Rm 0100 Rn Rm Rn = Rn + Rm

SUB Rn, Rm 0101 Rn Rm Rn = Rn - Rm

JZ Rn, relative 0110 Rn relative PC = PC+ relative


(only if Rn is 0)
opcode operands

Embedded Systems Design: A Unified 23


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Addressing Modes
Addressing Register-file Memory
mode Operand field contents contents

Immediate Data

Register-direct
Register address Data

Register
Register address Memory address Data
indirect

Direct Memory address Data

Indirect Memory address Memory address

Data

Embedded Systems Design: A Unified 24


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Sample Programs
C program Equivalent assembly program

0 MOV R0, #0; // total = 0


1 MOV R1, #10; // i = 10
2 MOV R2, #1; // constant 1
3 MOV R3, #0; // constant 0

Loop: JZ R1, Next; // Done if i=0


int total = 0; 5 ADD R0, R1; // total += i
for (int i=10; i!=0; i--) 6 SUB R1, R2; // i--
total += i; 7 JZ R3, Loop; // Jump always
// next instructions...
Next: // next instructions...

• Try some others


– Handshake: Wait until the value of M[254] is not 0, set M[255] to 1, wait
until M[254] is 0, set M[255] to 0 (assume those locations are ports).
– (Harder) Count the occurrences of zero in an array stored in memory
locations 100 through 199.

Embedded Systems Design: A Unified 25


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Programmer Considerations

• Program and data memory space


– Embedded processors often very limited
• e.g., 64 Kbytes program, 256 bytes of RAM (expandable)
• Registers: How many are there?
– Only a direct concern for assembly-level programmers
• I/O
– How communicate with external signals?
• Interrupts

Embedded Systems Design: A Unified 26


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Microprocessor Architecture Overview

• If you are using a particular microprocessor, now is a


good time to review its architecture

Embedded Systems Design: A Unified 27


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Example: parallel port driver

LPT Connection Pin I/O Direction Register Address


Pin 13
1 Output 0th bit of register #2
Switch
PC Parallel port
2-9 Output 0th bit of register #2
Pin 2 LED
10,11,12,13,15 Input 6,7,5,4,3 bit of register #1
th

14,16,17 Output 1,2,3th bit of register #2

• Using assembly language programming we can configure a PC


parallel port to perform digital I/O
– write and read to three special registers to accomplish this table provides
list of parallel port connector pins and corresponding register location
– Example : parallel port monitors the input switch and turns the LED
on/off accordingly

Embedded Systems Design: A Unified 28


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Parallel Port Example
; This program consists of a sub-routine that reads extern “C” CheckPort(void); // defined in
; the state of the input pin, determining the on/off state // assembly
; of our switch and asserts the output pin, turning the LED void main(void) {
; on/off accordingly while( 1 ) {
.386 CheckPort();
}
CheckPort proc }
push ax ; save the content
push dx ; save the content
mov dx, 3BCh + 1 ; base + 1 for register #1
in al, dx ; read register #1
and al, 10h ; mask out all but bit # 4 Pin 13
cmp al, 0 ; is it 0?
Switch
jne SwitchOn ; if not, we need to turn the LED on
PC Parallel port
SwitchOff:
mov dx, 3BCh + 0 ; base + 0 for register #0 Pin 2 LED
in al, dx ; read the current state of the port
and al, f7h ; clear first bit (masking)
out dx, al ; write it out to the port
jmp Done ; we are done

SwitchOn:
LPT Connection Pin I/O Direction Register Address
mov dx, 3BCh + 0 ; base + 0 for register #0 1 Output 0th bit of register #2
in al, dx ; read the current state of the port
or al, 01h ; set first bit (masking) 2-9 Output 0th bit of register #2
out dx, al ; write it out to the port
10,11,12,13,15 Input 6,7,5,4,3th bit of register #1
Done: pop dx ; restore the content
pop ax ; restore the content
14,16,17 Output 1,2,3th bit of register #2
CheckPort endp

Embedded Systems Design: A Unified 29


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Operating System

• Optional software layer


providing low-level services to
a program (application).
– File management, disk access
– Keyboard/display interfacing
– Scheduling multiple programs for DB file_name “out.txt” -- store file name

execution MOV
MOV
R0, 1324
R1, file_name
--
--
system call “open” id
address of file-name

• Or even just multiple threads from INT


JZ
34
R0, L1
--
--
cause a system call
if zero -> error

one program . . . read the file


JMP L2 -- bypass error cond.

– Program makes system calls to L1:


. . . handle the error

the OS L2:

Embedded Systems Design: A Unified 30


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Development Environment

• Development processor
– The processor on which we write and debug our programs
• Usually a PC
• Target processor
– The processor that the program will run on in our embedded
system
• Often different from the development processor

Development processor Target processor

Embedded Systems Design: A Unified 31


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Software Development Process

• Compilers
C File C File Asm.
– Cross compiler
File
• Runs on one
Compiler Assembler
processor, but
generates code for
Binary
File
Binary
File
Binary
File
another
• Assemblers
Linker
Debugger
Library

Exec.
• Linkers
Profiler
File
• Debuggers
Verification Phase

Implementation Phase
Profilers
Embedded Systems Design: A Unified 32
Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Running a Program

• If development processor is different than target, how


can we run our compiled code? Two options:
– Download to target processor
– Simulate
• Simulation
– One method: Hardware description language
• But slow, not always available
– Another method: Instruction set simulator (ISS)
• Runs on development processor, but executes instructions of target
processor

Embedded Systems Design: A Unified 33


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Instruction Set Simulator For A Simple
Processor
#include <stdio.h> }
typedef struct { }
unsigned char first_byte, second_byte; return 0;
} instruction; }

instruction program[1024]; //instruction memory int main(int argc, char *argv[]) {


unsigned char memory[256]; //data memory
FILE* ifs;
void run_program(int num_bytes) {
If( argc != 2 ||
int pc = -1; (ifs = fopen(argv[1], “rb”) == NULL ) {
unsigned char reg[16], fb, sb; return –1;
}
while( ++pc < (num_bytes / 2) ) { if (run_program(fread(program,
fb = program[pc].first_byte; sizeof(program) == 0) {
sb = program[pc].second_byte; print_memory_contents();
switch( fb >> 4 ) { return(0);
case 0: reg[fb & 0x0f] = memory[sb]; break; }
case 1: memory[sb] = reg[fb & 0x0f]; break; else return(-1);
case 2: memory[reg[fb & 0x0f]] = }
reg[sb >> 4]; break;
case 3: reg[fb & 0x0f] = sb; break;
case 4: reg[fb & 0x0f] += reg[sb >> 4]; break;
case 5: reg[fb & 0x0f] -= reg[sb >> 4]; break;
case 6: pc += sb; break;
default: return –1;

Embedded Systems Design: A Unified 34


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Testing and Debugging
(a) (b) • ISS
Implementation Implementation – Gives us control over time –
Phase Phase set breakpoints, look at
register values, set values,
Verification step-by-step execution, ...
Phase Development processor
– But, doesn’t interact with real
Debugger/ ISS
environment
• Download to board
Emulator – Use device programmer
– Runs in real environment, but
not controllable
External tools
• Compromise: emulator
– Runs in real environment, at
Programmer speed or near
Verification – Supports some controllability
Phase
from the PC

Embedded Systems Design: A Unified 35


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Application-Specific Instruction-Set
Processors (ASIPs)
• General-purpose processors
– Sometimes too general to be effective in demanding
application
• e.g., video processing – requires huge video buffers and operations
on large arrays of data, inefficient on a GPP
– But single-purpose processor has high NRE, not
programmable
• ASIPs – targeted to a particular domain
– Contain architectural features specific to that domain
• e.g., embedded control, digital signal processing, video processing,
network processing, telecommunications, etc.
– Still programmable
Embedded Systems Design: A Unified 36
Hardware/Software Introduction, (c) 2000 Vahid/Givargis
A Common ASIP: Microcontroller
• For embedded control applications
– Reading sensors, setting actuators
– Mostly dealing with events (bits): data is present, but not in huge amounts
– e.g., VCR, disk drive, digital camera (assuming SPP for image
compression), washing machine, microwave oven
• Microcontroller features
– On-chip peripherals
• Timers, analog-digital converters, serial communication, etc.
• Tightly integrated for programmer, typically part of register space
– On-chip program and data memory
– Direct programmer access to many of the chip’s pins
– Specialized instructions for bit-manipulation and other low-level
operations

Embedded Systems Design: A Unified 37


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Another Common ASIP: Digital Signal
Processors (DSP)
• For signal processing applications
– Large amounts of digitized data, often streaming
– Data transformations must be applied fast
– e.g., cell-phone voice filter, digital TV, music synthesizer
• DSP features
– Several instruction execution units
– Multiple-accumulate single-cycle instruction, other instrs.
– Efficient vector operations – e.g., add two arrays
• Vector ALUs, loop buffers, etc.

Embedded Systems Design: A Unified 38


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Trend: Even More Customized ASIPs
• In the past, microprocessors were acquired as chips
• Today, we increasingly acquire a processor as Intellectual
Property (IP)
– e.g., synthesizable VHDL model
• Opportunity to add a custom datapath hardware and a few
custom instructions, or delete a few instructions
– Can have significant performance, power and size impacts
– Problem: need compiler/debugger for customized ASIP
• Remember, most development uses structured languages
• One solution: automatic compiler/debugger generation
– e.g., www.tensillica.com
• Another solution: retargettable compilers
– e.g., www.improvsys.com (customized VLIW architectures)

Embedded Systems Design: A Unified 39


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Selecting a Microprocessor
• Issues
– Technical: speed, power, size, cost
– Other: development environment, prior expertise, licensing, etc.
• Speed: how evaluate a processor’s speed?
– Clock speed – but instructions per cycle may differ
– Instructions per second – but work per instr. may differ
– Dhrystone: Synthetic benchmark, developed in 1984. Dhrystones/sec.
• MIPS: 1 MIPS = 1757 Dhrystones per second (based on Digital’s VAX
11/780). A.k.a. Dhrystone MIPS. Commonly used today.
– So, 750 MIPS = 750*1757 = 1,317,750 Dhrystones per second
– SPEC: set of more realistic benchmarks, but oriented to desktops
– EEMBC – EDN Embedded Benchmark Consortium, www.eembc.org
• Suites of benchmarks: automotive, consumer electronics, networking, office
automation, telecommunications

Embedded Systems Design: A Unified 40


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
General Purpose Processors
Processor Clock speed Periph. Bus Width MIPS Power Trans. Price
General Purpose Processors
Intel PIII 1GHz 2x16 K 32 ~900 97W ~7M $900
L1, 256K
L2, MMX
IBM 550 MHz 2x32 K 32/64 ~1300 5W ~7M $900
PowerPC L1, 256K
750X L2
MIPS 250 MHz 2x32 K 32/64 NA NA 3.6M NA
R5000 2 way set assoc.
StrongARM 233 MHz None 32 268 1W 2.1M NA
SA-110
Microcontroller
Intel 12 MHz 4K ROM, 128 RAM, 8 ~1 ~0.2W ~10K $7
8051 32 I/O, Timer, UART
Motorola 3 MHz 4K ROM, 192 RAM, 8 ~.5 ~0.1W ~10K $5
68HC811 32 I/O, Timer, WDT,
SPI
Digital Signal Processors
TI C5416 160 MHz 128K, SRAM, 3 T1 16/32 ~600 NA NA $34
Ports, DMA, 13
ADC, 9 DAC
Lucent 80 MHz 16K Inst., 2K Data, 32 40 NA NA $75
DSP32C Serial Ports, DMA

Sources: Intel, Motorola, MIPS, ARM, TI, and IBM Website/Datasheet; Embedded Systems Programming, Nov. 1998

Embedded Systems Design: A Unified 41


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Designing a General Purpose Processor
FSMD

• Not something an embedded Declarations:


bit PC[16], IR[16]; Reset PC=0;
bit M[64k][16], RF[16][16];
system designer normally Fetch IR=M[PC];
PC=PC+1

would do Decode from states


below
– But instructive to see how Mov1 RF[rn] = M[dir]

simply we can build one top op = 0000


to Fetch

down 0001
Mov2 M[dir] = RF[rn]
to Fetch

– Remember that real processors Mov3 M[rn] = RF[rm]


0010 to Fetch
aren’t usually built this way
Mov4 RF[rn]= imm
• Much more optimized, much 0011 to Fetch

more bottom-up design Add RF[rn] =RF[rn]+RF[rm]


0100 to Fetch

Sub RF[rn] = RF[rn]-RF[rm]


Aliases: 0101 to Fetch
op IR[15..12] dir IR[7..0]
rn IR[11..8] imm IR[7..0]
Jz PC=(RF[rn]=0) ?rel :PC
rm IR[7..4] rel IR[7..0]
0110 to Fetch

Embedded Systems Design: A Unified 42


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Architecture of a Simple Microprocessor
• Storage devices for each Datapath
declared variable Control unit To all
input RFs
1
2x1 mux
0

– register file holds each of the control


signals
RFwa
variables Controller RFw
(Next-state and RFwe
• Functional units to carry out control
logic; state register)
From all
output RFr1a
RF (16)

the FSMD operations control


signals RFr1e

– One ALU carries out every PCld


16
Irld
RFr2a

required operation PCinc


PC IR RFr2e
RFr1 RFr2

• Connections added among the PCclr ALUs


ALU
components’ ports 2 1 0
ALUz

corresponding to the operations Ms


required by the FSM 3x1 mux Mre Mwe

• Unique identifiers created for


every control signal A Memory D

Embedded Systems Design: A Unified 43


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
A Simple Microprocessor

Reset PC=0; PCclr=1;

Fetch IR=M[PC]; MS=10; Datapath 1


PC=PC+1 Irld=1;
Control unit To all 0
input RFs
Mre=1; 2x1 mux
Decode from states PCinc=1; control
below signals
RFwa RFw
Mov1 RF[rn] = M[dir] RFwa=rn; RFwe=1; RFs=01; Controller
op = 0000
to Fetch Ms=01; Mre=1; (Next-state and RFwe
control From all RF (16)
Mov2 M[dir] = RF[rn] RFr1a=rn; RFr1e=1; output
0001
logic; state RFr1a
to Fetch Ms=01; Mwe=1; control
register)
signals RFr1e
Mov3 M[rn] = RF[rm] RFr1a=rn; RFr1e=1;
0010 Ms=10; Mwe=1;
to Fetch 16 RFr2a
PCld Irld
RF[rn]= imm RFwa=rn; RFwe=1; RFs=10; PC IR RFr1 RFr2
Mov4 RFr2e
0011
to Fetch PCinc

RFwa=rn; RFwe=1; RFs=00; PCclr ALUs


Add RF[rn] =RF[rn]+RF[rm]
0100 RFr1a=rn; RFr1e=1; ALU
to Fetch RFr2a=rm; RFr2e=1; ALUs=00 ALUz
RFwa=rn; RFwe=1; RFs=00;
2 1 0
Sub RF[rn] = RF[rn]-RF[rm] RFr1a=rn; RFr1e=1;
0101
to Fetch RFr2a=rm; RFr2e=1; ALUs=01
Ms
PCld= ALUz; 3x1 mux Mre Mwe
Jz PC=(RF[rn]=0) ?rel :PC RFrla=rn;
0110
to Fetch RFrle=1;

FSM operations that replace the FSMD


FSMD
operations after a datapath is created Memory
A D
You just built a simple microprocessor!

Embedded Systems Design: A Unified 44


Hardware/Software Introduction, (c) 2000 Vahid/Givargis
Chapter Summary
• General-purpose processors
– Good performance, low NRE, flexible
• Controller, datapath, and memory
• Structured languages prevail
– But some assembly level programming still necessary
• Many tools available
– Including instruction-set simulators, and in-circuit emulators
• ASIPs
– Microcontrollers, DSPs, network processors, more customized ASIPs
• Choosing among processors is an important step
• Designing a general-purpose processor is conceptually the same
as designing a single-purpose processor
Embedded Systems Design: A Unified 45
Hardware/Software Introduction, (c) 2000 Vahid/Givargis

You might also like