Safety Critical Systems

- A safety-critical computer system is a computer system whose failure may cause injury or death to human beings, or damage to the environment.
- Examples:
  - Aircraft control systems (fly-by-wire, ...)
  - Nuclear power station control systems
  - Control systems in cars (anti-lock brakes, ...)
  - Health systems (heart pacemakers, ...)
  - Railway control systems
  - Communication systems
  - Wireless sensor network applications?

What is Safety?

- "The avoidance of death, injury or poor health to customers, employees, contractors and the general public; also avoidance of damage to property and the environment."
- Safety is also defined as "freedom from unacceptable risk of harm".
- A basic concept in system safety engineering is the avoidance of "hazards".
- Safety is NOT an absolute quantity!

Safety vs. Security

- These two concepts are often mixed up.
- In German, there is just one word for both: "Sicherheit"!

                System
   Security                    Safety
   = protection against        = does not cause harm
     attacks

SILs and Dangerous Failure Probability

Safety Integrity   High demand mode of operation
Level              (probability of dangerous failure per hour)

SIL 4              10^-9 ≤ P < 10^-8
SIL 3              10^-8 ≤ P < 10^-7
SIL 2              10^-7 ≤ P < 10^-6
SIL 1              10^-6 ≤ P < 10^-5

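The band boundaries above translate directly into a lookup. A minimal Python sketch (the function name is my own):

```python
def sil_from_dangerous_failure_rate(p):
    """Map a probability of dangerous failure per hour (high demand
    mode) to a Safety Integrity Level, per the table above.
    Returns None if p lies outside all four SIL bands."""
    bands = [(1e-9, 1e-8, 4), (1e-8, 1e-7, 3),
             (1e-7, 1e-6, 2), (1e-6, 1e-5, 1)]
    for low, high, sil in bands:
        if low <= p < high:
            return sil
    return None  # beyond SIL 4, or too unreliable to be safety-rated

print(sil_from_dangerous_failure_rate(5e-9))  # SIL 4
```

Note that the bands are half-open (lower bound inclusive, upper bound exclusive), matching the ≤ / < notation in the table.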
Railway Signalling Systems

- Signalling and switching
- Axle counters
- Applications for ETCS (European Train Control System)

- An incorrect output may lead to an incorrect signal, causing a major accident!
- Safety Integrity Level 4 (highest)

(Old) Interlocking Systems

- Mechanical / electromechanical systems

Signal Box / Interlocking Tower

- Electric system with some electronics

Modern Signal Box / Interlocking Tower

- Lots of electronics and computer systems

What is a Hazard?

- Hazard:
  - a physical condition of the platform that threatens the safety of personnel or the platform, i.e. can lead to an accident
  - a condition of the platform that, unless mitigated, can develop into an accident through a sequence of normal events and actions
  - "an accident waiting to happen"

- Examples:
  - oil spilled on a staircase
  - failed train detection system at an automatic railway level crossing
  - loss of thrust control on a jet engine
  - loss of communication
  - distorted communication
  - undetectably incorrect output

Hazard Severity Level (Example)

Category      Id   Definition
CATASTROPHIC  I    General: a hazard which may cause death, system loss, or severe property or environmental damage.
CRITICAL      II   General: a hazard which may cause severe injury, or major system, property or environmental damage.
MARGINAL      III  General: a hazard which may cause marginal injury, or marginal system, property or environmental damage.
NEGLIGIBLE    IV   General: a hazard which does not cause injury, or system, property or environmental damage.

Hazard Probability Level (Example)

Level       Probability [h^-1]   Definition                                                Occurrences per year
Frequent    P ≥ 10^-3            may occur several times a month                           more than 10
Probable    10^-3 > P ≥ 10^-4    likely to occur once a year                               1 to 10
Occasional  10^-4 > P ≥ 10^-5    likely to occur in the life of the system                 10^-1 to 1
Remote      10^-5 > P ≥ 10^-6    unlikely but possible to occur in the life of the system  10^-2 to 10^-1
Improbable  10^-6 > P ≥ 10^-7    very unlikely to occur                                    10^-3 to 10^-2
Incredible  P < 10^-7            extremely unlikely, if not inconceivable, to occur        less than 10^-3

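The probability levels form a simple threshold cascade. A sketch in Python (function name is my own; thresholds are taken from the table above):

```python
def hazard_probability_level(p):
    """Classify a per-hour hazard probability into the qualitative
    levels of the table above."""
    if p >= 1e-3: return "Frequent"
    if p >= 1e-4: return "Probable"
    if p >= 1e-5: return "Occasional"
    if p >= 1e-6: return "Remote"
    if p >= 1e-7: return "Improbable"
    return "Incredible"

print(hazard_probability_level(3e-5))  # Occasional
```
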
Risk Classification Scheme (Example)

Hazard       Hazard Severity
Probability  CATASTROPHIC  CRITICAL  MARGINAL  NEGLIGIBLE
Frequent     A             A         A         B
Probable     A             A         B         C
Occasional   A             B         C         C
Remote       B             C         C         D
Improbable   C             C         D         D
Incredible   C             D         D         D

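The matrix is a direct two-level lookup. A minimal sketch (the dictionary layout and function name are my own; the entries are copied from the matrix above):

```python
RISK_CLASS = {
    "Frequent":   {"CATASTROPHIC": "A", "CRITICAL": "A", "MARGINAL": "A", "NEGLIGIBLE": "B"},
    "Probable":   {"CATASTROPHIC": "A", "CRITICAL": "A", "MARGINAL": "B", "NEGLIGIBLE": "C"},
    "Occasional": {"CATASTROPHIC": "A", "CRITICAL": "B", "MARGINAL": "C", "NEGLIGIBLE": "C"},
    "Remote":     {"CATASTROPHIC": "B", "CRITICAL": "C", "MARGINAL": "C", "NEGLIGIBLE": "D"},
    "Improbable": {"CATASTROPHIC": "C", "CRITICAL": "C", "MARGINAL": "D", "NEGLIGIBLE": "D"},
    "Incredible": {"CATASTROPHIC": "C", "CRITICAL": "D", "MARGINAL": "D", "NEGLIGIBLE": "D"},
}

def risk_class(probability, severity):
    """Look up the risk class (A-D) for a hazard probability level
    and a hazard severity category."""
    return RISK_CLASS[probability][severity]

print(risk_class("Remote", "CATASTROPHIC"))  # B
```
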
Risk Class Definition (Example)

Risk Class  Interpretation
A           Intolerable.
B           Undesirable; shall only be accepted when risk reduction is impracticable.
C           Tolerable with the endorsement of the authority.
D           Tolerable with the endorsement of the normal project reviews.

Risk Acceptability

- Having identified the level of risk for the product, we must determine how acceptable and tolerable that risk is to:
  - Regulator / Customer
  - Society
  - Operators

- Decision criteria for risk acceptance / rejection:
  - Absolute vs. relative risk (compare with previous systems, background risk)
  - Risk-cost trade-offs
  - Risk-benefit of technological options

Risk Tolerability

A hazard's severity and probability together determine the risk.
The risk is assessed against risk criteria:
- Risk tolerable? Yes: accept the risk.
- Risk tolerable? No: apply risk reduction measures and re-assess.

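The "assess, reduce, re-assess" loop above can be sketched numerically. The mitigation model (each measure cuts the dangerous-failure probability by a constant factor) and all names are illustrative assumptions, not part of the slide:

```python
def reduce_until_tolerable(p, tolerable_limit, reduction_factor=0.1,
                           max_measures=10):
    """Apply risk reduction measures (modelled, as an assumption, as a
    tenfold per-hour probability reduction each) until the probability
    meets the tolerable limit, or give up after max_measures."""
    measures = 0
    while p >= tolerable_limit and measures < max_measures:
        p *= reduction_factor   # one more risk reduction measure
        measures += 1
    return p, measures

p, n = reduce_until_tolerable(1e-4, 1e-7)
print(n, "measures applied")
```

In practice the re-assessment step is qualitative (a new pass through the severity/probability tables), not a fixed multiplicative factor.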
What are Safety Requirements?

- The system requirements specification (or sub-system/equipment specification, as appropriate) may be considered in two parts:
  1. Requirements which are not related to safety
  2. Requirements which are related to safety

- Requirements which are related to safety are usually called safety requirements. These may be contained in a separate safety requirements specification.

- Safety integrity relates to the ability of a safety-related system to achieve its required safety functions. The higher the safety integrity, the lower the likelihood that the system will fail to carry out its required safety functions.

- Safety integrity comprises two parts:
  1. Systematic failure integrity
  2. Random failure integrity

- Systematic failure integrity is the non-quantifiable part of safety integrity and relates to hazardous systematic faults (hardware or software). Systematic faults are caused by human errors in the various stages of the system/sub-system/equipment life-cycle.

- Examples of systematic faults:
  1. Specification errors
  2. Design errors
  3. Manufacturing errors
  4. Installation errors
  5. Operation errors
  6. Maintenance errors
  7. Modification errors

- Random failure integrity is that part of safety integrity which relates to hazardous random faults, in particular random hardware faults, which are the result of the finite reliability of hardware components.

- Examples of random faults:
  1. Failure of a resistor
  2. Failure of an IC, etc.

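Random hardware faults are commonly quantified with the exponential reliability model (a standard assumption added here for illustration, not stated on the slide): with a constant failure rate λ per hour, the probability of failure within t hours is P(t) = 1 − e^(−λt).

```python
import math

def failure_probability(lmbda, hours):
    """Probability that a component with constant failure rate lmbda
    (failures per hour) fails within the given number of hours, under
    the standard exponential reliability model (an assumption)."""
    return 1.0 - math.exp(-lmbda * hours)

# E.g. a resistor with lambda = 1e-9 /h over one year (8760 h) of
# continuous operation:
print(failure_probability(1e-9, 8760))
```

For λ·t ≪ 1 this is approximately λ·t, which is why per-hour failure rates can be compared directly against the SIL bands.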
Diversity

- Goal: fault tolerance / fault detection
- Diversity is "a means of achieving all or part of the specified requirements in more than one independent and dissimilar manner."
- Can tolerate/detect a wide range of faults

"The most certain and effectual check upon errors which arise in the process of computation, is to cause the same computations to be made by separate and independent computers; and this check is rendered still more decisive if they make their computations by different methods."
Dionysius Lardner, 1834

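Lardner's "separate and independent computers" check is, in modern terms, N-version programming with a voter. A minimal sketch (the voter and the three toy versions are my own illustrations):

```python
from collections import Counter

def majority_vote(results):
    """Vote over the outputs of independently developed versions of
    the same computation; return the majority value, or raise if the
    diverse versions fail to agree."""
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        raise RuntimeError("no majority - diverse versions disagree")
    return value

# Three hypothetical, dissimilar implementations of squaring:
def sq_a(x): return x * x
def sq_b(x): return x ** 2
def sq_c(x): return sum(x for _ in range(x))  # iterative; non-negative x only

print(majority_vote([sq_a(5), sq_b(5), sq_c(5)]))  # 25
```

A single faulty version is outvoted; simultaneous identical faults in a majority of versions are exactly what the independence requirement is meant to make unlikely.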
Layers of Diversity

Abstraction layer                            Diversity examples
Concept of operation (e.g. specifications)   e.g. two different paradigms, such as rule-based and functional
Design (e.g. design descriptions)            e.g. N-version design
Implementation (e.g. source code)            e.g. N-version coding
Realisation (e.g. object code)               e.g. diverse compilers
Hardware (CPU, memory, ...)                  e.g. diverse CPUs

Examples for Diversity

- Specification diversity
- Design diversity
- Data diversity
- Time diversity
- Hardware diversity
- Compiler diversity
- Automated systematic diversity
- Testing diversity
- Diverse safety arguments
- ...

Some faults to be targeted: programming bugs, specification faults, compiler faults, CPU faults, random hardware faults (e.g. bit flips), security attacks, ...

Compiler Diversity

- Use of two diverse compilers to compile one common source code

Common source code:

    Module A
    {
        int i;
        int end;
        get(end);
        for i = 1 to end
            result = func(i, result);
            POS[i] = result;
        next
    }

Diverse compilers (Compiler A / Compiler B):
- different manufacturer
- different version
- different compiler options

Compiler A and Compiler B each translate the common source into diverse object code (?).

Compiler Diversity: Issues

- Targeted faults:
  - Systematic compiler faults
  - Some systematic and permanent hardware faults (if executed on one board)
- Issues:
  - To some degree possible with one compiler and different compile options (optimization on/off, ...)
  - If compilers from different manufacturers are used, independence must be ensured

Systematic Automatic Diversity

- What can be "diversified":
  - memory usage
  - execution sequence
  - statement structures
  - array references
  - data coding
  - register usage
  - addressing modes
  - pointers
  - mathematical and logic rules

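"Data coding" from the list above can be made concrete with a classic complement coding: each value is stored twice, once plainly and once bit-inverted, and every read cross-checks the two copies so that a random bit flip is detected. The class name and API are illustrative assumptions:

```python
class CodedInt:
    """A 32-bit integer stored in two diverse codings (value and
    bitwise complement); a mismatch on read signals data corruption."""
    MASK = 0xFFFFFFFF

    def __init__(self, value):
        self._v = value & self.MASK
        self._inv = ~value & self.MASK  # diverse (complement) coding

    def read(self):
        if self._v != (~self._inv & self.MASK):
            raise RuntimeError("bit flip detected: codings disagree")
        return self._v

x = CodedInt(42)
print(x.read())   # 42
x._v ^= 0x10      # simulate a random hardware bit flip
# x.read() would now raise RuntimeError
```

The same idea generalizes to arithmetic codes (e.g. AN codes) where the redundant coding even survives computation, not just storage.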
