
QUALITY ATTRIBUTES OF EMBEDDED SYSTEMS
Quality attributes are the characteristics that together determine the quality of an embedded system.
There are two types of quality attributes:
1. Operational Quality Attributes
These are attributes related to the operation or functioning of an embedded system. The way an embedded system operates affects its overall quality.
2. Non-Operational Quality Attributes
These are attributes not related to the operation or functioning of an embedded system. They are associated with the embedded system before it is put into operation.
 
Operational Attributes
A. Response
Response is a measure of the quickness of the system. It indicates how fast the system tracks changes in its input variables. Most embedded systems demand a fast response, which should be real-time.
B. Throughput
Throughput deals with the efficiency of the system. It can be defined as the rate of production or processing of a defined process over a stated period of time. In the case of a card reader like the ones used in buses, throughput means how many transactions the reader can perform in a minute, an hour, or a day.
C. Reliability
Reliability is a measure of how much you can rely upon the proper functioning of the system, usually expressed as a percentage.
Mean Time Between Failures (MTBF) and Mean Time To Repair (MTTR) are the terms used in defining system reliability.
MTBF is the average time the system functions before a failure occurs.
MTTR is the average time the system spends in repair.
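
Mean Time Between Failures and Mean Time To Repair are often combined into a single steady-state availability figure, MTBF / (MTBF + MTTR). The sketch below illustrates that calculation; the function names and the field data are hypothetical and are not taken from these notes.

```python
def mtbf(total_operating_hours, failure_count):
    """Mean Time Between Failures: average operating time per failure."""
    return total_operating_hours / failure_count


def mttr(total_repair_hours, failure_count):
    """Mean Time To Repair: average repair time per failure."""
    return total_repair_hours / failure_count


def availability(mtbf_hours, mttr_hours):
    """Fraction of time the system is expected to be operational."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical field data: 990 h of operation, 10 failures, 10 h spent in repair.
example_mtbf = mtbf(990.0, 10)
example_mttr = mttr(10.0, 10)
example_availability = availability(example_mtbf, example_mttr)
print(example_mtbf, example_mttr, example_availability)
```

With these made-up numbers the system runs 99 hours between failures on average and needs 1 hour per repair, so it is available about 99% of the time.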
D. Maintainability
Maintainability deals with the support and maintenance provided to the end user or client in case of technical issues and product failures, or as part of a routine system checkup.
It can be classified into two types:
D.1. Scheduled or Periodic Maintenance
This is the maintenance that is required regularly after a fixed time interval.
Example: periodic cleaning of air conditioners; refilling of printer cartridges.
D.2. Maintenance due to Unexpected Failure
This is the maintenance required after a sudden breakdown in the functioning of the system.
Example: an air conditioner not powering on; a printer not taking paper in spite of a full paper stack.
E. Security
Confidentiality, Integrity and Availability are the three cornerstones of information security.
Confidentiality deals with the protection of data from unauthorized disclosure.
Integrity gives protection from unauthorized modification.
Availability ensures that the system remains accessible to authorized users when it is needed.
Certain embedded systems have to make sure they conform to these security measures.
Example: an electronic safety deposit locker can be opened only with a PIN, which acts like a password.
F. Safety
Safety deals with the possible damage that can happen to the operator and the environment due to the breakdown of an embedded system or due to the emission of hazardous materials from the embedded product.
A safety analysis is a must in product engineering to evaluate the anticipated damage and determine the best course of action to bring the consequences down to an acceptable level.
 
Non-Operational Attributes

A. Testability and Debug-ability
Testability deals with how easily one can test the design and application, and by which means it can be tested.
Hardware testing verifies that the peripherals and the total hardware function in the designed manner.
Firmware testing verifies that the firmware functions in the expected way.
Debug-ability is the means of debugging the product to figure out the probable sources of unexpected behaviour in the total system.
 
B. Evolvability
For an embedded system, the quality attribute "Evolvability" refers to the ease with which the embedded product can be modified to take advantage of new firmware or hardware technology.
 
C. Portability
Portability is a measure of "system independence".
An embedded product can be called portable if it is capable of performing its intended operation in various environments, irrespective of the processor or controller and the embedded operating system used.
 
D. Time to prototype and market
Time to market is the time elapsed between the conceptualization of a product and the time at which the product is ready for selling or use. Product prototyping helps in reducing time to market.
Prototyping is an informal kind of rapid product development in which the important features of the product under consideration are developed.
In order to shorten the time to prototype, make use of all possible options, such as reuse and off-the-shelf components.
 
E. Per unit and total cost
Cost is an important factor which needs to be carefully monitored. A proper market study and cost-benefit analysis should be carried out before deciding on the per-unit cost of the embedded product.

DESIGN METRICS OF EMBEDDED SYSTEMS


A design metric is a measurable feature of the system, such as its performance, cost, time for implementation or safety. Most of these are conflicting requirements, i.e. optimizing one usually does not optimize the others: e.g. a cheaper processor may have poor performance as far as speed and throughput are concerned.

1. NRE cost (nonrecurring engineering cost)


It is the one-time monetary cost of designing the system. Once the system is designed, any number of units can be manufactured without incurring any additional design cost; hence the term nonrecurring.
 
2. Unit cost
The monetary cost of manufacturing each copy of the
system, excluding NRE cost.
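
The interplay between NRE cost and unit cost can be illustrated with a short calculation: the total cost is the NRE cost plus the unit cost times the number of units, and the NRE cost is spread over all units produced. This is a minimal sketch; the function names and the dollar figures are assumptions chosen for illustration only.

```python
def total_cost(nre_cost, unit_cost, units):
    """One-time design cost plus the manufacturing cost of every unit."""
    return nre_cost + unit_cost * units


def per_product_cost(nre_cost, unit_cost, units):
    """Total cost divided by volume: the NRE share shrinks as volume grows."""
    return total_cost(nre_cost, unit_cost, units) / units


# Hypothetical figures: NRE cost of 2000, unit cost of 100.
for volume in (10, 100, 1000):
    print(volume, total_cost(2000, 100, volume), per_product_cost(2000, 100, volume))
```

At low volume the NRE cost dominates the per-product cost, while at high volume the per-product cost approaches the unit cost. This is why a design with a high NRE cost but a low unit cost can still be the cheaper choice for mass production.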

3. Size
The physical space required by the system, often
measured in bytes for software, and gates or
transistors for hardware.
 
4. Performance
The execution time or throughput of the system.
 
5. Power Consumption
It is the amount of power consumed by the system,
which may determine the lifetime of a battery, or the
cooling requirements of the IC, since more power
means more heat.
 
6. Flexibility
The ability to change the functionality of the system
without incurring heavy NRE cost. Software is
typically considered very flexible.
 
7. Time-to-prototype
The time needed to build a working version of the
system, which may be bigger or more expensive than
the final system implementation, but it can be used to
verify the system’s usefulness and correctness and to
refine the system’s functionality.
 
8. Time-to-market
The time required to develop a system to the point
that it can be released and sold to customers. The
main contributors are design time, manufacturing
time, and testing time. This metric has become
especially demanding in recent years. Introducing an
embedded system to the marketplace early can make
a big difference in the system’s profitability.
 
9. Maintainability
It is the ability to modify the system after its initial
release, especially by designers who did not
originally design the system.
 
10. Correctness
This is the measure of the confidence that we have
implemented the system’s functionality correctly. We
can check the functionality throughout the process of
designing the system, and we can insert test circuitry
to check that manufacturing was correct.
 
The Performance Design Metric
Performance of a system is a measure of how long the
system takes to execute our desired tasks.
The two main measures of performance are:
Latency or response time
This is the time between the start of the task’s
execution and the end. For example, processing an
image may take 0.25 second.
Throughput
This is the number of tasks that can be processed per unit time. For example, a camera may be able to process 4 images per second.
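
Latency and throughput are related but are not simply reciprocals of each other. The sketch below uses the figures above (0.25 second per image) and then assumes, purely for illustration, that the work is split into two overlapping pipeline stages of 0.125 second each; the stage split and the function names are hypothetical.

```python
def sequential_throughput(latency_s):
    """Tasks per second when each task must finish before the next starts."""
    return 1.0 / latency_s


def pipelined_throughput(stage_times_s):
    """Tasks per second when stages overlap: limited by the slowest stage."""
    return 1.0 / max(stage_times_s)


def pipelined_latency(stage_times_s):
    """A single task still has to pass through every stage."""
    return sum(stage_times_s)


stages = [0.125, 0.125]  # assumed split of the 0.25 s image-processing task
print(sequential_throughput(0.25))   # one image at a time
print(pipelined_throughput(stages))  # stages overlap on successive images
print(pipelined_latency(stages))     # latency of one image is unchanged
```

In this sketch, overlapping the two stages doubles the throughput while the latency of an individual image stays at 0.25 second, which is why the two measures are reported separately.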
Need for RTOS in Embedded Systems
• Meeting deadlines
• Deterministic behaviour
• Physical and memory size
• Prioritized tasks
• Minimum interrupt latency
• Watchdog timer & vectored interrupt
• Small footprint
• Reliable system

Issues in Real-time System Design


The following sections discuss these issues:
• Realtime Response
• Recovering from Failures
• Working with Distributed Architectures
• Asynchronous Communication
• Race Conditions and Timing
Realtime Response
Realtime systems have to respond to external
interactions in a predetermined amount of time.
Successful completion of an operation depends upon
the correct and timely operation of the system.
Design the hardware and the software in the system
to meet the Realtime requirements. For example, a
telephone switching system must feed dial tone to
thousands of subscribers within a recommended limit
of one second. To meet these requirements, the off
hook detection mechanism and the software message
communication involved have to work within the
limited time budget. The system has to meet these
requirements for all the calls being set up at any
given time.
The designers have to focus very early on the Real-time response requirements. During the architecture design phase, the hardware and software engineers work together to select the right system architecture that will meet the requirements. This involves deciding the interconnectivity of the processors, link speeds, processor speeds, etc. The main questions to be asked are:
• Is the architecture suitable? If message communication involves too many nodes, the system may fail to meet the Realtime requirement under even mild congestion. Thus a simpler architecture has a better chance of meeting the Realtime requirements.
• Are the link speeds adequate? Generally, loading a link more than 40-50% is a bad idea. A higher link utilization causes queues to build up on different nodes, thus causing variable amounts of delay in message communication.
• Are the processing components powerful enough? A CPU with really high utilization will lead to unpredictable Real-time behaviour. Also, it is possible that the high priority tasks in the system will starve the low priority tasks of any CPU time. This can cause the low priority tasks to misbehave. As with links, keep the peak CPU utilization below 50%.
• Is the Operating System suitable? Assign high priority to tasks that are involved in processing Realtime critical events. Consider pre-emptive scheduling if Realtime requirements are stringent. When choosing the operating system, verify its interrupt latency and scheduling variance.
o Scheduling variance refers to the
predictability in task scheduling times. For
example, a telephone switching system is
expected to feed dialtone in less than 500
ms. This would typically involve scheduling
three to five tasks within the stipulated time.
Most operating systems would easily meet
these numbers as far as the mean dialtone
delay is concerned. But general purpose
operating systems would have much higher
standard deviation in the dialtone numbers.
o Interrupt Latency refers to the delay with
which the operating system can handle
interrupts and schedule tasks to respond to
the interrupt. Again, real-time operating
systems would have much lower interrupt
latency.
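
The advice to keep link and CPU utilization around or below 50% can be motivated with a simple queueing estimate. The sketch below assumes the textbook M/M/1 model (random arrivals, a single server), which these notes do not specify; under that assumption the mean time a message spends queued plus being served is S / (1 - utilization), where S is the service time on an idle resource. The 1 ms service time is a hypothetical value.

```python
def mean_time_in_system(service_time_ms, utilization):
    """Mean queueing plus service time for an M/M/1 queue: S / (1 - rho)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in the range [0, 1)")
    return service_time_ms / (1.0 - utilization)


SERVICE_TIME_MS = 1.0  # hypothetical time to handle one message when idle

for rho in (0.1, 0.5, 0.8, 0.9, 0.95):
    delay = mean_time_in_system(SERVICE_TIME_MS, rho)
    print(f"utilization {rho:4.0%}: mean delay {delay:5.1f} ms")
```

In this model the mean delay at 50% utilization is twice the idle service time, but at 90% it is ten times as large and keeps growing steeply, so small load increases near saturation produce large and variable delays.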
Recovering from Failures
Realtime systems must function reliably in the event of failures. These failures can be internal as well as external. The following sections discuss the issues involved in handling these failures.
Internal Failures
Internal failures can be due to hardware and software
failures in the system. The different types of failures
you would typically expect are:
• Software Failures in a Task: Unlike desktop applications, Realtime applications do not have the luxury of popping up a dialog box and exiting on detecting a failure. Design the tasks to safeguard against error conditions. This becomes even more important in a Realtime system because sequences of events can result in a large number of scenarios, and it may not be possible to test all the cases in the laboratory environment. Thus apply defensive checks to recover from error conditions. Also, some software error conditions might lead to a task hitting a processor exception. In such cases, it might sometimes be possible to just roll back the task to its previously saved state.
• Processor Restart: Most Realtime systems are made up of multiple nodes. It is not possible to bring down the complete system on failure of a single node, so design the software to handle independent failure of any of the nodes. This involves two activities:
1. Handling Processor Failure: When a
processor fails, other processors have to be
notified about the failure. These processors
will then abort any interactions with the
failed processor node. For example, if a
control processor fails, the telephone switch
clears all calls involving that processor.
2. Recovering Context for the Failed Processor:
When the failed processor comes back up, it
will have to recover all its lost context from
other processors in the system. There is
always a chance of inconsistencies between
different processors in the system. In such
cases, the system runs audits to resolve any
inconsistencies. Taking our switch example,
once the control processor comes up it will
recover the status of subscriber ports from
other processors. To avoid any
inconsistencies, the system initiates audits to
crosscheck data-structures on the different
control processors.
• Board Failure: Realtime systems are expected to recover from hardware failures. The system should be able to detect and recover from board failures. When a board fails, the system notifies the operator about it. Also, the system should be able to switch in a spare for the failed board (if the board has a spare).
• Link Failure: Most of the communication in Realtime systems takes place over links connecting the different processing nodes in the system. Again, the system isolates a link failure and reroutes messages so that the link failure does not disturb the message communication.
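
The idea of rolling a task back to its previously saved state after a software error can be sketched as follows. This is only an illustration under simple assumptions: the task state is a plain dictionary, a checkpoint is taken before every message, and any exception triggers the rollback. The class name, the handler and the message values are hypothetical.

```python
import copy


class CheckpointedTask:
    """Task that restores its last saved state when a handler fails."""

    def __init__(self, initial_state):
        self.state = copy.deepcopy(initial_state)

    def handle(self, message, handler):
        checkpoint = copy.deepcopy(self.state)  # save state before processing
        try:
            handler(self.state, message)
            return True
        except Exception:
            self.state = checkpoint             # roll back partial updates
            return False


def collect_digit(state, digit):
    """Hypothetical handler: updates the state first, then validates."""
    state["digits"].append(digit)
    if not digit.isdigit():
        raise ValueError("not a digit")


task = CheckpointedTask({"digits": []})
ok_first = task.handle("5", collect_digit)   # accepted
ok_second = task.handle("x", collect_digit)  # fails and is rolled back
print(ok_first, ok_second, task.state)
```

The failing message leaves no trace in the task state, so the task can keep serving later messages instead of exiting.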
External Failures
Realtime systems have to perform in the real world.
Thus they should recover from failures in the external
environment. Different types of failures that can take
place in the environment are:
• Invalid Behavior of External Entities: When a Realtime system interacts with external entities, it should be able to handle all possible failure conditions from these entities. A good example of this is the way a telephone switching system handles calls from subscribers. In this case, the system is interacting with humans, so it should handle all kinds of failures, like:
1. Subscriber goes off hook but does not dial
2. Toddler playing with the phone!
3. Subscriber hangs up before completing
dialing.
• Inter Connectivity Failure: Many times a Realtime system is distributed across several locations. External links might connect these locations. Handling failures of these links is similar to handling internal link failures. The major difference is that such failures might last for an extended duration, and many times it might not be possible to reroute the messages.
Working with Distributed Architectures
Most Realtime systems involve processing on several
different nodes. The system itself distributes the
processing load among several processors. This
introduces several challenges in design:
• Maintaining Consistency: Maintaining data-structure consistency is a challenge when multiple processors are involved in feature execution. Consistency is generally maintained by running data-structure audits.
• Initializing the System: Initializing a system with multiple processors is far more complicated than bringing up a single machine. In most systems the software release is resident on the Operations and Maintenance Center (OMC). The node that is directly connected to the OMC will initialize first. When this node finishes initialization, it will initiate software downloads for the child nodes directly connected to it. This process goes on in a hierarchical fashion till the complete system is initialized.
• Inter-Processor Interfaces: One of the biggest headaches in Realtime systems is defining and maintaining message interfaces. Defining interfaces is complicated by the different byte ordering and padding rules of processors. Maintaining interfaces is complicated by backward compatibility issues. For example, if a cellular system changes the air interface protocol for a new breed of phones, it will still have to support interfaces with older phones.
• Load Distribution: When multiple processors and links are involved in message interactions, distributing the load evenly can be a daunting task. If the system has an evenly balanced load, the capacity of the system can be increased by adding more processors. Such systems are said to scale linearly with increasing processing power. But often designers find themselves in a position where a single processor or link becomes a bottleneck. This leads to costly redesign of the features to improve system scalability.
• Centralized Resource Allocation: Distributed systems may be running on multiple processors, but they have to allocate resources from a shared pool. Shared pool allocation is typically managed by a single processor. If the system is not designed carefully, the shared resource allocator can become a bottleneck in achieving full system capacity.
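
The byte ordering and padding problem in inter-processor interfaces is usually handled by fixing the wire format explicitly instead of relying on each processor's native layout. The sketch below uses Python's standard `struct` module with the `!` prefix, which selects network (big-endian) byte order and disables padding. The message layout (1-byte type, 2-byte port, 4-byte sequence number) and the function names are hypothetical.

```python
import struct

# Hypothetical message: 1-byte type, 2-byte port number, 4-byte sequence number.
# '!' means network byte order (big-endian), standard sizes, no padding.
WIRE_FORMAT = "!BHI"


def encode(msg_type, port, sequence):
    """Serialize a message into the agreed on-the-wire layout."""
    return struct.pack(WIRE_FORMAT, msg_type, port, sequence)


def decode(data):
    """Parse bytes received from another processor."""
    return struct.unpack(WIRE_FORMAT, data)


wire_bytes = encode(1, 0x1234, 5)
print(wire_bytes.hex())                # bytes appear in big-endian order
print(struct.calcsize(WIRE_FORMAT))    # size with no padding inserted
print(struct.calcsize("@BHI"))         # native layout; may include padding
print(decode(wire_bytes))
```

Because both ends agree on the explicit format, the same bytes are produced and understood regardless of whether a given processor is little-endian or big-endian, or how its compiler pads structures.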
Asynchronous Communication
Remote procedure calls (RPC) are used in computer
systems to simplify software design. RPC allows a
programmer to call procedures on a remote machine
with the same semantics as local procedure calls.
RPCs really simplify the design and development of
conventional systems, but they are of very limited use
in Realtime systems. The main reason is that most
communication in the real world is asynchronous in
nature, i.e. very few message interactions can be
classified into the query response paradigm that
works so well using RPCs.
Thus most Realtime systems support state machine
based design where multiple messages can be
received in a single state. The next state is determined
by the contents of the received message. State
machines provide a very flexible mechanism to
handle asynchronous message interactions. The
flexibility comes with its own complexities. We will
be covering state machine design issues in future
additions to the Realtime Mantra.
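
A state machine of the kind described here can be sketched as a transition table keyed by the current state and the received message. The states and messages below are a simplified, hypothetical model of call setup and are not taken from any real switching system.

```python
# (current state, received message) -> next state
TRANSITIONS = {
    ("IDLE", "OFF_HOOK"): "AWAITING_DIGITS",
    ("AWAITING_DIGITS", "DIGIT"): "AWAITING_DIGITS",
    ("AWAITING_DIGITS", "DIALING_COMPLETE"): "RINGING",
    ("AWAITING_DIGITS", "ON_HOOK"): "IDLE",
    ("AWAITING_DIGITS", "TIMEOUT"): "IDLE",
    ("RINGING", "ANSWER"): "CONNECTED",
    ("RINGING", "ON_HOOK"): "IDLE",
    ("CONNECTED", "ON_HOOK"): "IDLE",
}


class CallStateMachine:
    def __init__(self):
        self.state = "IDLE"

    def on_message(self, message):
        """Move to the next state, or ignore a message that is not expected."""
        next_state = TRANSITIONS.get((self.state, message))
        if next_state is None:
            return False          # unexpected message: stay in current state
        self.state = next_state
        return True


fsm = CallStateMachine()
for msg in ("OFF_HOOK", "DIGIT", "DIGIT", "DIALING_COMPLETE", "ANSWER"):
    fsm.on_message(msg)
print(fsm.state)
accepted = fsm.on_message("OFF_HOOK")  # makes no sense while connected
print(accepted, fsm.state)
```

Note how the AWAITING_DIGITS state accepts four different messages, arriving in any order and at any time, which is hard to express as a single request followed by a single response.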
Race Conditions and Timing
It is said that the three most important things in
Realtime system design are timing, timing and
timing. A brief look at any protocol will underscore
the importance of timing. All the steps in a protocol
are described with exact timing specification for each
stage. Most protocols will also specify how the
timing should vary with increasing load. Realtime
systems deal with timing issues by using timers.
Timers are started to monitor the progress of events.
If the expected event takes place, the timer is
stopped. If the expected event does not take place, the
timer will timeout and recovery action will be
triggered.
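
The start, stop and timeout behaviour of such a timer can be sketched as below. To keep the example deterministic, the timer is driven by explicit time values rather than a real clock; the class name, the 10 second duration and the recovery action are hypothetical.

```python
class ProtocolTimer:
    """One-shot timer driven by an explicit clock value in seconds."""

    def __init__(self, duration_s, on_timeout):
        self.duration_s = duration_s
        self.on_timeout = on_timeout
        self.deadline = None              # None means the timer is not running

    def start(self, now_s):
        self.deadline = now_s + self.duration_s

    def stop(self):
        self.deadline = None

    def poll(self, now_s):
        """Run the recovery action once if the deadline has passed."""
        if self.deadline is not None and now_s >= self.deadline:
            self.deadline = None
            self.on_timeout()
            return True
        return False


log = []
first_digit_timer = ProtocolTimer(10.0, lambda: log.append("release line"))

# Case 1: the expected event (a dialed digit) arrives in time.
first_digit_timer.start(now_s=0.0)
first_digit_timer.stop()                  # digit received, timer stopped
first_digit_timer.poll(now_s=12.0)        # no action: timer is not running

# Case 2: the subscriber goes off hook but never dials.
first_digit_timer.start(now_s=20.0)
first_digit_timer.poll(now_s=25.0)        # still within the time limit
first_digit_timer.poll(now_s=30.0)        # deadline reached: recovery runs
print(log)
```

Only the second case triggers the recovery action, and it does so exactly once.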
A race condition occurs when the state of a resource depends on timing factors that are not predictable. This is best explained with an example. Telephone exchanges have two-way trunks which can be used by either of the two exchanges connected by the trunk. The problem is that both ends can allocate the trunk at more or less the same time, thus resulting in a race condition: the same trunk has been allocated for an incoming and an outgoing call. This race condition can be easily resolved by defining rules on who gets to keep the resource when such a clash occurs. The race condition can be made less likely by requiring the two exchanges to work from different ends of the pool. Then there will be no clashes under low load. Under high load, race conditions will still be hit, and these will be resolved by the pre-defined rules.
A more conservative design would partition the two
way trunk pool into two one way pools. This would
avoid the race condition but would fragment the
resource pool.
The main issue here is identifying race conditions.
Most race conditions are not as simple as this one.
Some of them are subtle and can only be identified by
careful examination of the design.
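
The two-way trunk example can be sketched in a few lines. The model below assumes that both exchanges pick a trunk at the same instant, before either sees the other's choice, that exchange A searches from the low end of the pool and exchange B from the high end, and that the pre-defined rule lets exchange A keep the trunk on a clash. All names and the tie-break rule are hypothetical.

```python
def pick_trunk(free_trunks, from_low_end):
    """Choose a free trunk, searching from one end of the pool."""
    if not free_trunks:
        return None
    return min(free_trunks) if from_low_end else max(free_trunks)


def allocate_simultaneously(free_trunks):
    """Both exchanges seize a trunk at once; returns (a_trunk, b_trunk, clash)."""
    a_choice = pick_trunk(free_trunks, from_low_end=True)
    b_choice = pick_trunk(free_trunks, from_low_end=False)
    if a_choice is None:
        return None, None, False
    if a_choice == b_choice:
        # Clash: assumed rule is that exchange A keeps the trunk, B backs off.
        free_trunks.discard(a_choice)
        return a_choice, None, True
    free_trunks.discard(a_choice)
    free_trunks.discard(b_choice)
    return a_choice, b_choice, False


low_load_pool = {0, 1, 2, 3}
low_load_result = allocate_simultaneously(low_load_pool)
print(low_load_result, low_load_pool)

high_load_pool = {2}                      # only one trunk left
high_load_result = allocate_simultaneously(high_load_pool)
print(high_load_result, high_load_pool)
```

With several free trunks the two ends never collide, while with a single trunk left the clash occurs and the rule decides the winner, matching the low-load and high-load behaviour described above.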
