Embedded Systems: An Overview
Santanu Chattopadhyay
Dept. of Electronics & Elec. Comm. Engg.
Indian Institute of Technology, Kharagpur
India 721302
Email: santanu@ece.iitkgp.ernet.in
What is an Embedded System
Computing systems embedded within larger
electronic devices, repeatedly carrying out a
particular function, often going completely
unrecognized by the device's user, are
embedded systems
Any computing system other than a desktop
Nearly any device that runs on electricity
either already has or will soon have a
computing system embedded within it
Application areas
Consumer electronics: cellphones, pagers,
digital cameras
Home appliances: microwave oven, washing
machine
Office automation: fax, copiers, printers
Business equipment: cash register, alarm
system, card reader
Automobiles: transmission control, cruise
control, fuel injection, antilock brakes
Characteristics of Embedded Systems
Single functioned: Executes a specific function
repeatedly, e.g., pager. Desktop systems execute a
variety of programs
Tightly constrained:
Cheap
Fit on a single chip
Fast enough for real-time
Consume minimum power for extended battery life
No cooling arrangement
Reactive and real-time
Common Design Metrics
Non-recurring engineering (NRE) cost
Unit cost
Size: bytes, gates, transistors
Performance
Power
Flexibility: ability to change
Time to prototype
Time to market: design + manufacturing + testing
Maintainability
Correctness
Safety
Trade-offs
Power, size, performance, and NRE cost compete with one another
Pushing one, others pop out
To Meet the Optimization Challenge
Designer must be comfortable with a variety of
hardware and software implementation technologies
Must be able to migrate from one technology to another
to find the best implementation subject to the given set
of constraints
A designer cannot simply be a hardware expert or a
software expert, but should have expertise in both
Hardware-software codesign is the field that emphasizes a
unified view of hardware and software, and develops
synthesis tools and simulators that enable
co-development of systems using hardware and software
Embedded Computer Architecture
Smaller embedded systems use microcontrollers having
a CPU, small internal memory, and some I/O
[Figure: microcontroller block diagram with CPU, ROM, RAM, ADC, digital I/O, counter/timer, serial port (UART), SPI, I2C, and bus interface]
Embedded Program
Almost all embedded programs end with an
infinite loop, surrounding a significant
portion of the program functionality
Most embedded systems have a single piece
of software running
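The "initialise once, then loop forever" shape described above can be sketched as follows. This is only an illustration: read_sensor, the threshold, and the max_ticks stop condition are hypothetical names added so the sketch can be run and stopped on a desktop; real firmware would never leave the loop.

```python
def read_sensor(tick):
    """Hypothetical input: a fake temperature that rises by one degree per tick."""
    return 20 + tick


def run_super_loop(max_ticks=None, threshold=25):
    """One-time setup followed by a loop that holds most of the functionality.

    max_ticks is an assumption made for this sketch only; passing None
    gives the truly infinite loop an embedded program would use.
    """
    fan_on = False                                  # one-time initialisation
    log = []
    tick = 0
    while max_ticks is None or tick < max_ticks:    # the "infinite" loop
        temperature = read_sensor(tick)             # sense
        fan_on = temperature > threshold            # decide
        log.append((tick, temperature, fan_on))     # act (here: just record)
        tick += 1
    return log


for entry in run_super_loop(max_ticks=8):
    print(entry)
```

Everything the program does after start-up happens inside the loop body, which is why a single piece of software is usually enough.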
Embedded System Design Technology
Should be a top-down approach, rather than
bottom-up
Should be supported by tools and libraries
at each level
Reuse philosophy is used extensively
[Figure: refinement flow from system specification to behavioural specification to register transfer specification to logic specification to final implementation]
System specification
Describes design functionality in some
language like C or natural English
Specification technique should be powerful enough
to express real-life synchronism and
parallelism in the operation
Should ideally be executable
Behavioural specification
Obtained by refining system specification
Portions of system are distributed among
several software (general purpose processor)
and hardware (single purpose processor)
modules
Yields behavioural specification as HDL
code (e.g. VHDL, Verilog) for hardware and
software code (e.g. C) for software modules
Register-Transfer specification
Hardware:
Refined to a connection of Register-Transfer
components and a state machine controlling it
Structural HDLs (structural VHDL, Verilog etc.)
are used
Software:
Refined to assembly code for the
general-purpose processor
Can be run directly on the processor
Logic specification
For hardware, register transfer specification
is converted into a logical specification
consisting of a set of Boolean equations
Results in flattening of RT components to do
better optimization
No refinement needed for software
Final implementation
Machine code for the software
Gate-level netlist for the hardware
Communication mechanism
Compilation / Synthesis tools used
System synthesis tool converts system specification
into a set of sequential programs on general- and
single-purpose processors
Behavioural synthesis tool converts a sequential
program into finite-state machines and register
transfers. A software compiler converts a sequential
program to assembly code
Register-Transfer synthesis tool converts finite-state
machines and register transfers into a datapath of RT
components and a controller of Boolean equations
Logic synthesis tool converts Boolean expressions into
a connection of logic gates, called a netlist
Libraries / IP
System-level library: may hold complete
system solutions, such as an Ethernet network
Behavioural-level library: commonly used
components, such as bus interface, display
controller, cores
RT-level library: layout of RT components,
such as register, multiplexer, decoder,
functional units
Logic-level library: layout for gates and cells
Test and Verification
Ensuring that functionality is correct
Can prevent time-consuming debugging at
low abstraction levels and iterating back to
high abstraction levels
Simulation is the most common method
Formal verification techniques are also used
Summary of Development Environment
Level                      Compilation / Synthesis   Libraries / IP   Test / Verification
System specification       System synthesis          HW/SW/OS         Model simulators / checkers
Behavioural specification  Behaviour synthesis       Cores            HW-SW cosimulators
RT specification           RT synthesis              RT components    HDL simulators
Logic specification        Logic synthesis           Gates / Cells    Gate simulators
Development Environment
Embedded Processor Alternatives
General-purpose processors
Microcontrollers
Digital Signal Processors
Field Programmable Gate Arrays (FPGAs)
Application Specific Integrated Circuit
(ASIC)
Comparison between alternatives
ARM: An Advanced Microcontroller
Introduction to ARM
32-bit RISC architecture
Developed by ARM Corporation, previously
known as Acorn RISC Machine
Licensed to companies that want to
manufacture ARM-based CPUs or SoC
products
Helps the licensees to develop their own
processors compliant with the ARM instruction
set architecture
Features that make ARM the most popular
embedded architecture
ARM cores are very simple and require relatively few
transistors, leaving enough space on the die to realize other
functionalities on the silicon
Instruction set architecture and the pipeline design are aimed at
minimizing energy consumption
Also capable of running the 16-bit Thumb instruction set: greater
code density and enhanced power saving
Higher performance
Highly modular architecture: the only mandatory part is the
integer pipeline, all other components are optional
Built-in JTAG debug port and on-chip embedded in-circuit
emulator (ICE) that allows programs to be downloaded and
fully debugged in-system
Improved Features
Each instruction controls the ALU and shifter, making
the instructions more powerful
Auto-increment and auto-decrement addressing modes
supported
Multiple load/store instructions that allow loading or storing
up to 16 registers at once
Conditional execution of instructions introduced.
Instruction opcode is preceded by a 4-bit condition code.
For the instruction to execute, the condition must be met.
Eliminates small branches and thus pipeline stalls
Arithmetic operations may or may not affect the status
bits
ARM Registers in Different Modes
Conditional Execution
ARM allows all instructions to be executed
conditionally
Most significant 4 bits of each instruction
are reserved to hold one of 16 condition codes
Instruction is executed only if the condition
is met by the flags in the CPSR
Example:
ADDEQ R0, R1, R2 ; R0 = R1 + R2
only if zero flag is set
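The condition check can be sketched in a few lines. This is a simplified model, not an ARM simulator: the function names and the flag dictionary are assumptions made for illustration, and the unused sixteenth code (0b1111) is simply treated as "never execute" here.

```python
def condition_passed(cond, n, z, c, v):
    """Evaluate a 4-bit ARM condition code against the CPSR flags N, Z, C, V."""
    checks = {
        0x0: z,                      # EQ: equal (Z set)
        0x1: not z,                  # NE: not equal
        0x2: c,                      # CS: carry set
        0x3: not c,                  # CC: carry clear
        0x4: n,                      # MI: negative
        0x5: not n,                  # PL: positive or zero
        0x6: v,                      # VS: overflow
        0x7: not v,                  # VC: no overflow
        0x8: c and not z,            # HI: unsigned higher
        0x9: (not c) or z,           # LS: unsigned lower or same
        0xA: n == v,                 # GE: signed greater or equal
        0xB: n != v,                 # LT: signed less than
        0xC: (not z) and n == v,     # GT: signed greater than
        0xD: z or n != v,            # LE: signed less or equal
        0xE: True,                   # AL: always
    }
    return bool(checks.get(cond, False))  # 0xF treated as "never" (assumption)


def conditional_add(word, regs, flags, rd, rn, rm):
    """Perform Rd = Rn + Rm only if the condition in bits 31..28 of word holds."""
    cond = (word >> 28) & 0xF
    if condition_passed(cond, **flags):
        regs[rd] = (regs[rn] + regs[rm]) & 0xFFFFFFFF
    return regs


ADDEQ_R0_R1_R2 = 0x00810002   # ADD R0, R1, R2 with condition field 0000 (EQ)
regs = conditional_add(ADDEQ_R0_R1_R2, [0, 5, 7], dict(n=False, z=True, c=False, v=False), 0, 1, 2)
print(regs)
```

With the zero flag clear the register file is returned unchanged, which is how a short branch around the instruction is avoided.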
Interfacing
Plays an important role in connecting processors to
peripherals
Interfacing requirements of devices vary a lot
It is necessary to make the devices and
processors compatible with each other
Wide range of interfacing standards available
Example: SPI, I2C, RS232 family, USB, CAN,
IrDA, Bluetooth etc.
SPI Features
Can be used for interfacing memory, ADC,
DAC, real-time clock, LCD drivers, sensors,
audio chips, even other processors
Compared to standard serial port, it is
synchronous
All transfers referenced to a common clock,
generated by master
More than one peripheral may be connected
to the same master through SPI interface.
Slaves selected by chip select
SPI Features (Contd.)
Both master and slave contain a serial shift register
Contents of these shift registers are exchanged for data
transfer
Master initiates the transfer by writing a byte to its SPI
shift register
Register transmits the byte on MOSI line to the slave
Slave transfers the content of its shift register to the
master on MISO line
To only write, master ignores the byte read
To only read, master transfers a dummy byte to slave
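The register exchange can be modelled bit by bit. The sketch below assumes 8-bit registers shifted most-significant bit first; spi_exchange and the dummy value 0xFF are illustrative choices, not part of any particular device's interface.

```python
def spi_exchange(master_byte, slave_byte):
    """Swap the contents of two 8-bit shift registers over eight clock pulses."""
    master = master_byte & 0xFF
    slave = slave_byte & 0xFF
    for _ in range(8):                       # one iteration per clock pulse
        mosi = (master >> 7) & 1             # bit leaving the master
        miso = (slave >> 7) & 1              # bit leaving the slave
        master = ((master << 1) & 0xFF) | miso
        slave = ((slave << 1) & 0xFF) | mosi
    return master, slave


DUMMY = 0xFF                                 # assumed filler byte for read-only transfers

# Write only: the master sends 0xA5 and ignores what comes back.
_, slave_now_holds = spi_exchange(0xA5, 0x00)

# Read only: the master clocks out a dummy byte to fetch the slave's data.
master_received, _ = spi_exchange(DUMMY, 0x3C)
print(hex(slave_now_holds), hex(master_received))
```

Because every clock moves one bit in each direction, a read and a write always happen together; the master simply chooses which half to ignore.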
Data Transfer Through SPI
Inter Integrated Circuit (I2C)
Very cheap, yet effective network to connect peripherals
in small scale embedded systems
Uses two wires to connect multiple devices in a
multidrop bus
Bus is bidirectional, synchronous to a common clock
Achievable data rate: 100-400 Kbps
Two wires are:
SDA: Serial data
SCL: Serial clock
Both lines are bidirectional and open-drain
Data Transfer in I2C
Controller Area Network (CAN)
Originally designed for automotive electronics to allow
microcontrollers and devices to communicate
Highly noisy environment
CAN uses a broadcast, differential serial bus standard
Data transmission uses an automatic arbitration-free
mechanism
When multiple devices transmit simultaneously, the one
transmitting a dominant bit where the others transmit a recessive bit wins. Thus it has higher
priority
Nodes transmitting lower priority messages will sense it and
back off and wait
A 0 bit is dominant, 1 is recessive
Bus is physically an open-collector wired-AND connection
Higher priority message never delayed
Lower priority node attempts retransmission 6 bit-clocks
after the end of dominant message
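The wired-AND behaviour can be sketched as a bit-by-bit contest between message identifiers. The sketch assumes 11-bit identifiers sent most-significant bit first and distinct identifiers on every node; arbitrate is an illustrative name, not part of any CAN driver.

```python
def arbitrate(identifiers, width=11):
    """Return the identifier that keeps the bus when all nodes start together."""
    contenders = list(identifiers)
    for bit in range(width - 1, -1, -1):
        sent = {ident: (ident >> bit) & 1 for ident in contenders}
        bus = min(sent.values())             # wired-AND: any dominant 0 pulls the bus to 0
        # A node that sent recessive 1 but reads dominant 0 backs off and waits.
        contenders = [ident for ident in contenders if sent[ident] == bus]
    return contenders[0]


winner = arbitrate([0x65, 0x64, 0x123])
print(hex(winner))
```

The winner never notices the contest, since every bit it sent matched the bus, which is why the higher priority message is not delayed.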
CAN in Vehicle
Bluetooth Networking
Transmits data through low-power radio waves at a
frequency of about 2.45 GHz (between 2.402 and 2.480 GHz)
Frequency band has been internationally earmarked for
use of industrial, scientific, and medical devices
Avoids interfering with other systems by sending weak
signals of about 1 mW, making range of Bluetooth
transmission restricted to 10 m
Walls cannot stop a Bluetooth signal, making it useful for
controlling several devices in different rooms
Can connect up to 8 devices simultaneously
Devices do not interfere with each other, due to the use
of spread spectrum frequency hopping
A device uses 79 individual, randomly chosen
frequencies within a designated range, changing from
one frequency to another 1600 times per second on a
regular basis
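The hopping figures translate into a short calculation. The sketch below assumes the 79 channels are spaced 1 MHz apart starting at 2402 MHz, and uses Python's seeded random generator purely as a stand-in for the real hop-selection rule, which is not described here; hop_sequence and shared_seed are hypothetical names.

```python
import random

CHANNELS_MHZ = [2402 + k for k in range(79)]     # 2402 MHz .. 2480 MHz
HOPS_PER_SECOND = 1600
SLOT_US = 1_000_000 / HOPS_PER_SECOND            # time spent on each frequency


def hop_sequence(shared_seed, hops):
    """Pseudo-random channel list; devices sharing a seed hop together."""
    rng = random.Random(shared_seed)
    return [rng.choice(CHANNELS_MHZ) for _ in range(hops)]


print(SLOT_US, "microseconds per hop")
print(hop_sequence(shared_seed=7, hops=5))
```

Two devices using the same seed stay on the same channel in every slot, while an unrelated pair with a different seed only rarely lands on that channel at the same moment.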
Hardware-Software Partitioning
Hardware/software partitioning
No need to consider special purpose hardware in the long run?
Correct for fixed functionality, but wrong in general, since
by the time MPEG-n can be implemented in software, MPEG-n+1
has been invented
Functionality to be implemented in software or in hardware?
Decision based on hardware/software partitioning, a special
case of hardware/software codesign.
Hardware/software codesign: approach
[Figure: a specification is mapped onto processors P1 and P2 and onto hardware]
[Niemann, Hardware/Software Co-Design for Data Flow Dominated Embedded Systems, Kluwer Academic
Publishers, 1998 (comprehensive mathematical model)]
Steps of the COOL partitioning algorithm (1)
1. Translation of the behaviour into an internal graph
model
2. Translation of the behaviour of each node from VHDL
into C
3. Compilation
All C programs compiled for the target processor,
computation of the resulting program size,
estimation of the resulting execution time
(simulation input data might be required)
4. Synthesis of hardware components:
For leaf nodes, application-specific hardware is synthesized.
High-level synthesis is sufficiently fast.
Steps of the COOL partitioning algorithm (2)
5. Flattening of the hierarchy:
Granularity used by the designer is maintained.
Cost and performance information added to the nodes.
Precise information required for partitioning is
pre-computed
6. Generating and solving a mathematical model
of the optimization problem:
Integer programming (IP) model for optimization.
Optimal with respect to the cost function (approximates
communication time)
Steps of the COOL partitioning algorithm (3)
7. Iterative improvements:
Adjacent nodes mapped to the same hardware
component are now merged.
Steps of the COOL partitioning algorithm (4)
8. Interface synthesis:
After partitioning, the glue logic required for interfacing processors,
application-specific hardware and memories is created.
An integer programming model for HW/SW partitioning
Notation:
Index set I denotes task graph nodes.
Index set L denotes task graph node types,
e.g. square root, DCT or FFT
Index set KH denotes hardware component types,
e.g. hardware components for the DCT or the FFT.
Index set J denotes hardware component instances
Index set KP denotes processors.
All processors are assumed to be of the same type
An IP model for HW/SW partitioning
\(X_{i,k} := 1\) if node \(v_i\) is mapped to hardware
component type \(k \in KH\), and 0 otherwise.
\(Y_{i,k} := 1\) if node \(v_i\) is mapped to processor \(k \in KP\),
and 0 otherwise.
\(NY_{\ell,k} := 1\) if at least one node of type \(\ell\) is mapped to
processor \(k \in KP\), and 0 otherwise.
\(T\) is a mapping from task graph nodes to their
types: \(T: I \rightarrow L\)
The cost function accumulates the cost of
hardware units:
C = cost(processors) + cost(memories) +
cost(application-specific hardware)
Constraints
Operation assignment constraints
\[ \forall i \in I: \sum_{k \in KH} X_{i,k} + \sum_{k \in KP} Y_{i,k} = 1 \]
All task graph nodes have to be mapped either in software or in hardware.
Variables are assumed to be integers.
Additional constraints to guarantee they are either 0 or 1:
\[ \forall i \in I, \forall k \in KH: X_{i,k} \le 1 \]
\[ \forall i \in I, \forall k \in KP: Y_{i,k} \le 1 \]
Operation assignment constraints (2)
\[ \forall \ell \in L, \forall i: T(v_i) = c_\ell, \forall k \in KP: NY_{\ell,k} \ge Y_{i,k} \]
For all types \(\ell\) of operations and for all nodes i of this
type:
if i is mapped to some processor k, then that processor
must implement the functionality of \(\ell\).
Decision variables must also be 0/1 variables:
\[ \forall \ell \in L, \forall k \in KP: NY_{\ell,k} \le 1 \]
Resource & design constraints
\(\forall k \in KH\), the cost (area) used for components of that type is
calculated as the sum of the costs of the components of that type.
This cost should not exceed its maximum.
\(\forall k \in KP\), the cost for associated data storage area should not
exceed its maximum.
\(\forall k \in KP\), the cost for storing instructions should not exceed its
maximum.
The total cost (\(\sum_{k \in KH}\)) of HW components should not exceed its
maximum
The total cost of data memories (\(\sum_{k \in KP}\)) should not exceed its
maximum
The total cost of instruction memories (\(\sum_{k \in KP}\)) should not exceed its
maximum
Other constraints
Timing constraints
These constraints can be used to
guarantee that certain time
constraints are met.
Some less important constraints
omitted.
Example
HW types H1, H2 and H3 with
costs of 20, 25, and 30.
Processors of type P.
Tasks T1 to T5.
Execution times ("-" means the task cannot run on that component):
T    H1   H2   H3   P
1    20   -    -    100
2    -    20   -    100
3    -    -    12   10
4    -    -    12   10
5    20   -    -    100
Operation assignment constraints (1)
\[ \forall i \in I: \sum_{k \in KH} X_{i,k} + \sum_{k \in KP} Y_{i,k} = 1 \]
\(X_{1,1} + Y_{1,1} = 1\) (task 1 mapped to H1 or to P)
\(X_{2,2} + Y_{2,1} = 1\)
\(X_{3,3} + Y_{3,1} = 1\)
\(X_{4,3} + Y_{4,1} = 1\)
\(X_{5,1} + Y_{5,1} = 1\)
Operation assignment constraints (2)
Assume types of tasks are \(\ell\) = 1, 2, 3, 3, and 1.
\[ \forall \ell \in L, \forall i: T(v_i) = c_\ell, \forall k \in KP: NY_{\ell,k} \ge Y_{i,k} \]
Functionality 3 to be implemented
on processor if node 4 is mapped
to it.
Other equations
Time constraints leading to: application-specific
hardware required for time
constraints under 100 time units.
Cost function:
C = 20 #(H1) + 25 #(H2) + 30 #(H3) + cost(processor) + cost(memory)
Result
For a time constraint of 100 time units and cost(P) < cost(H3):
T    H1   H2   H3   P
1    20   -    -    100
2    -    20   -    100
3    -    -    12   10
4    -    -    12   10
5    20   -    -    100
Solution (educated guessing):
T1 → H1
T2 → H2
T3 → P
T4 → P
T5 → H1
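The guessed solution can be cross-checked by trying all 32 mappings. This is a brute-force sketch rather than the integer programming model itself, and it rests on several assumptions not stated on the slides: tasks run one after another so the total time is the sum of the individual times, one instance of each hardware type is shared by all tasks mapped to it, memory cost is ignored, and the processor costs 25 (any value below cost(H3) = 30 gives the same answer).

```python
from itertools import product

HW_COST = {"H1": 20, "H2": 25, "H3": 30}
PROCESSOR_COST = 25      # assumption: only cost(P) < cost(H3) is given
TIME_LIMIT = 100

# task -> (hardware type, time in hardware, time on processor P)
TASKS = {
    "T1": ("H1", 20, 100),
    "T2": ("H2", 20, 100),
    "T3": ("H3", 12, 10),
    "T4": ("H3", 12, 10),
    "T5": ("H1", 20, 100),
}


def evaluate(mapping):
    """Return (cost, total time) of a mapping under the assumptions above."""
    hw_types = {TASKS[t][0] for t, target in mapping.items() if target == "HW"}
    uses_processor = any(target == "P" for target in mapping.values())
    cost = sum(HW_COST[h] for h in hw_types)
    cost += PROCESSOR_COST if uses_processor else 0
    time = sum(TASKS[t][1] if target == "HW" else TASKS[t][2]
               for t, target in mapping.items())
    return cost, time


def best_partition():
    """Enumerate every HW/P choice and keep the cheapest feasible one."""
    best = None
    for choice in product(("HW", "P"), repeat=len(TASKS)):
        mapping = dict(zip(TASKS, choice))
        cost, time = evaluate(mapping)
        if time <= TIME_LIMIT and (best is None or cost < best[0]):
            best = (cost, time, mapping)
    return best


cost, time, mapping = best_partition()
print(cost, time, mapping)
```

Under these assumptions the search agrees with the slide: T1, T2 and T5 go to hardware, while T3 and T4 stay on the processor. Enumeration only works because the example is tiny; larger task graphs need the IP formulation or one of the heuristic approaches.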
Other approaches
Iterative Algorithms
Genetic Search
Particle Swarm Optimization
Power-aware Partitioning
Books on Embedded Systems
SANTANU CHATTOPADHYAY
Publisher: PHI Learning
ISBN PB: 9788120340244
Pages: 192
Year of Publication: 2010
PETER MARWEDEL
Publisher: Springer
ISBN: 0387292373
Pages: 241
Year of Publication: 2005
Thank you
