Safety Critical Systems

- A safety-critical computer system is a computer system whose failure may cause injury or death to human beings, or damage to the environment.
- Examples:
  - Aircraft control systems (fly-by-wire, ...)
  - Nuclear power station control systems
  - Control systems in cars (anti-lock brakes, ...)
  - Health systems (heart pacemakers, ...)
  - Railway control systems
  - Communication systems
  - Wireless sensor network applications?

What is Safety?

- "The avoidance of death, injury or poor health to customers, employees, contractors and the general public; also avoidance of damage to property and the environment."
- Safety is also defined as "freedom from unacceptable risk of harm".
- A basic concept in system safety engineering is the avoidance of "hazards".
- Safety is NOT an absolute quantity!

Safety vs. Security

- These two concepts are often mixed up.
- In German, there is just one word for both: "Sicherheit"!

                System
   Security                    Safety
   = protection against        = does not cause harm
     attacks

SILs and Dangerous Failure Probability

Safety Integrity   High demand mode of operation
Level              (probability of dangerous failure per hour)

SIL 4              10^-9 ≤ P < 10^-8
SIL 3              10^-8 ≤ P < 10^-7
SIL 2              10^-7 ≤ P < 10^-6
SIL 1              10^-6 ≤ P < 10^-5

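The band boundaries above translate directly into a lookup. A minimal Python sketch (the function name is my own):

```python
def sil_from_dangerous_failure_rate(p):
    """Map a probability of dangerous failure per hour (high demand
    mode) to a Safety Integrity Level, per the table above.
    Returns None if p lies outside all four SIL bands."""
    bands = [(1e-9, 1e-8, 4), (1e-8, 1e-7, 3),
             (1e-7, 1e-6, 2), (1e-6, 1e-5, 1)]
    for low, high, sil in bands:
        if low <= p < high:
            return sil
    return None  # beyond SIL 4, or too unreliable to be safety-rated

print(sil_from_dangerous_failure_rate(5e-9))  # SIL 4
```

Note that the bands are half-open (lower bound inclusive, upper bound exclusive), matching the ≤ / < notation in the table.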
Railway Signalling Systems

- Signalling and switching
- Axle counters
- Applications for ETCS (European Train Control System)

- An incorrect output may lead to an incorrect signal, causing a major accident!
- Safety Integrity Level 4 (highest)

(Old) Interlocking Systems

- Mechanical / electromechanical systems

Signal Box / Interlocking Tower

- Electric system with some electronics

Modern Signal Box / Interlocking Tower

- Lots of electronics and computer systems

What is a Hazard?

- Hazard:
  - a physical condition of the platform that threatens the safety of personnel or the platform, i.e. can lead to an accident
  - a condition of the platform that, unless mitigated, can develop into an accident through a sequence of normal events and actions
  - "an accident waiting to happen"

- Examples:
  - oil spilled on a staircase
  - failed train detection system at an automatic railway level crossing
  - loss of thrust control on a jet engine
  - loss of communication
  - distorted communication
  - undetectably incorrect output

Hazard Severity Level (Example)

Category      Id   Definition
CATASTROPHIC  I    General: a hazard which may cause death, system loss, or severe property or environmental damage.
CRITICAL      II   General: a hazard which may cause severe injury, or major system, property or environmental damage.
MARGINAL      III  General: a hazard which may cause marginal injury, or marginal system, property or environmental damage.
NEGLIGIBLE    IV   General: a hazard which does not cause injury, or system, property or environmental damage.

Hazard Probability Level (Example)

Level       Probability [h^-1]   Definition                                                Occurrences per year
Frequent    P ≥ 10^-3            may occur several times a month                           more than 10
Probable    10^-3 > P ≥ 10^-4    likely to occur once a year                               1 to 10
Occasional  10^-4 > P ≥ 10^-5    likely to occur in the life of the system                 10^-1 to 1
Remote      10^-5 > P ≥ 10^-6    unlikely but possible to occur in the life of the system  10^-2 to 10^-1
Improbable  10^-6 > P ≥ 10^-7    very unlikely to occur                                    10^-3 to 10^-2
Incredible  P < 10^-7            extremely unlikely, if not inconceivable, to occur        less than 10^-3

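The probability levels form a simple threshold cascade. A sketch in Python (function name is my own; thresholds are taken from the table above):

```python
def hazard_probability_level(p):
    """Classify a per-hour hazard probability into the qualitative
    levels of the table above."""
    if p >= 1e-3: return "Frequent"
    if p >= 1e-4: return "Probable"
    if p >= 1e-5: return "Occasional"
    if p >= 1e-6: return "Remote"
    if p >= 1e-7: return "Improbable"
    return "Incredible"

print(hazard_probability_level(3e-5))  # Occasional
```
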
Risk Classification Scheme (Example)

Hazard       Hazard Severity
Probability  CATASTROPHIC  CRITICAL  MARGINAL  NEGLIGIBLE
Frequent     A             A         A         B
Probable     A             A         B         C
Occasional   A             B         C         C
Remote       B             C         C         D
Improbable   C             C         D         D
Incredible   C             D         D         D

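The matrix is a direct two-level lookup. A minimal sketch (the dictionary layout and function name are my own; the entries are copied from the matrix above):

```python
RISK_CLASS = {
    "Frequent":   {"CATASTROPHIC": "A", "CRITICAL": "A", "MARGINAL": "A", "NEGLIGIBLE": "B"},
    "Probable":   {"CATASTROPHIC": "A", "CRITICAL": "A", "MARGINAL": "B", "NEGLIGIBLE": "C"},
    "Occasional": {"CATASTROPHIC": "A", "CRITICAL": "B", "MARGINAL": "C", "NEGLIGIBLE": "C"},
    "Remote":     {"CATASTROPHIC": "B", "CRITICAL": "C", "MARGINAL": "C", "NEGLIGIBLE": "D"},
    "Improbable": {"CATASTROPHIC": "C", "CRITICAL": "C", "MARGINAL": "D", "NEGLIGIBLE": "D"},
    "Incredible": {"CATASTROPHIC": "C", "CRITICAL": "D", "MARGINAL": "D", "NEGLIGIBLE": "D"},
}

def risk_class(probability, severity):
    """Look up the risk class (A-D) for a hazard probability level
    and a hazard severity category."""
    return RISK_CLASS[probability][severity]

print(risk_class("Remote", "CATASTROPHIC"))  # B
```
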
Risk Class Definition (Example)

Risk Class  Interpretation
A           Intolerable.
B           Undesirable; shall only be accepted when risk reduction is impracticable.
C           Tolerable with the endorsement of the authority.
D           Tolerable with the endorsement of the normal project reviews.

Risk Acceptability

- Having identified the level of risk for the product, we must determine how acceptable and tolerable that risk is to:
  - Regulator / Customer
  - Society
  - Operators

- Decision criteria for risk acceptance / rejection:
  - Absolute vs. relative risk (compare with previous systems, background risk)
  - Risk-cost trade-offs
  - Risk-benefit of technological options

Risk Tolerability

A hazard's severity and probability together determine the risk.
The risk is assessed against risk criteria:
- Risk tolerable? Yes: accept the risk.
- Risk tolerable? No: apply risk reduction measures and re-assess.

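The "assess, reduce, re-assess" loop above can be sketched numerically. The mitigation model (each measure cuts the dangerous-failure probability by a constant factor) and all names are illustrative assumptions, not part of the slide:

```python
def reduce_until_tolerable(p, tolerable_limit, reduction_factor=0.1,
                           max_measures=10):
    """Apply risk reduction measures (modelled, as an assumption, as a
    tenfold per-hour probability reduction each) until the probability
    meets the tolerable limit, or give up after max_measures."""
    measures = 0
    while p >= tolerable_limit and measures < max_measures:
        p *= reduction_factor   # one more risk reduction measure
        measures += 1
    return p, measures

p, n = reduce_until_tolerable(1e-4, 1e-7)
print(n, "measures applied")
```

In practice the re-assessment step is qualitative (a new pass through the severity/probability tables), not a fixed multiplicative factor.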
What are Safety Requirements?

- The system requirements specification (or sub-system/equipment specification, as appropriate) may be considered in two parts:
  1. Requirements which are not related to safety
  2. Requirements which are related to safety

- Requirements which are related to safety are usually called safety requirements. These may be contained in a separate safety requirements specification.

- Safety integrity relates to the ability of a safety-related system to achieve its required safety functions. The higher the safety integrity, the lower the likelihood that the system will fail to carry out its required safety functions.

- Safety integrity comprises two parts:
  1. Systematic failure integrity
  2. Random failure integrity

- Systematic failure integrity is the non-quantifiable part of safety integrity and relates to hazardous systematic faults (hardware or software). Systematic faults are caused by human errors in the various stages of the system/sub-system/equipment life-cycle.

- Examples of systematic faults:
  1. Specification errors
  2. Design errors
  3. Manufacturing errors
  4. Installation errors
  5. Operation errors
  6. Maintenance errors
  7. Modification errors

- Random failure integrity is that part of safety integrity which relates to hazardous random faults, in particular random hardware faults, which are the result of the finite reliability of hardware components.

- Examples of random faults:
  1. Failure of a resistor
  2. Failure of an IC, etc.

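Random hardware faults are commonly quantified with the exponential reliability model (a standard assumption added here for illustration, not stated on the slide): with a constant failure rate λ per hour, the probability of failure within t hours is P(t) = 1 − e^(−λt).

```python
import math

def failure_probability(lmbda, hours):
    """Probability that a component with constant failure rate lmbda
    (failures per hour) fails within the given number of hours, under
    the standard exponential reliability model (an assumption)."""
    return 1.0 - math.exp(-lmbda * hours)

# E.g. a resistor with lambda = 1e-9 /h over one year (8760 h) of
# continuous operation:
print(failure_probability(1e-9, 8760))
```

For λ·t ≪ 1 this is approximately λ·t, which is why per-hour failure rates can be compared directly against the SIL bands.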
Diversity

- Goal: fault tolerance / fault detection
- Diversity is "a means of achieving all or part of the specified requirements in more than one independent and dissimilar manner."
- Can tolerate/detect a wide range of faults

"The most certain and effectual check upon errors which arise in the process of computation, is to cause the same computations to be made by separate and independent computers; and this check is rendered still more decisive if they make their computations by different methods."
Dionysius Lardner, 1834

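Lardner's "separate and independent computers" check is, in modern terms, N-version programming with a voter. A minimal sketch (the voter and the three toy versions are my own illustrations):

```python
from collections import Counter

def majority_vote(results):
    """Vote over the outputs of independently developed versions of
    the same computation; return the majority value, or raise if the
    diverse versions fail to agree."""
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        raise RuntimeError("no majority - diverse versions disagree")
    return value

# Three hypothetical, dissimilar implementations of squaring:
def sq_a(x): return x * x
def sq_b(x): return x ** 2
def sq_c(x): return sum(x for _ in range(x))  # iterative; non-negative x only

print(majority_vote([sq_a(5), sq_b(5), sq_c(5)]))  # 25
```

A single faulty version is outvoted; simultaneous identical faults in a majority of versions are exactly what the independence requirement is meant to make unlikely.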
Layers of Diversity

Abstraction layer                            Diversity examples
Concept of operation (e.g. specifications)   e.g. two different paradigms, such as rule-based and functional
Design (e.g. design descriptions)            e.g. N-version design
Implementation (e.g. source code)            e.g. N-version coding
Realisation (e.g. object code)               e.g. diverse compilers
Hardware (CPU, memory, ...)                  e.g. diverse CPUs

Examples for Diversity

- Specification diversity
- Design diversity
- Data diversity
- Time diversity
- Hardware diversity
- Compiler diversity
- Automated systematic diversity
- Testing diversity
- Diverse safety arguments
- ...

Some faults to be targeted: programming bugs, specification faults, compiler faults, CPU faults, random hardware faults (e.g. bit flips), security attacks, ...

Compiler Diversity

- Use of two diverse compilers to compile one common source code

Common source code:

    Module A
    {
        int i;
        int end;
        get(end);
        for i = 1 to end
            result = func(i, result);
            POS[i] = result;
        next
    }

Diverse compilers (Compiler A / Compiler B):
- different manufacturer
- different version
- different compiler options

Compiler A and Compiler B each translate the common source into diverse object code (?).

Compiler Diversity: Issues

- Targeted faults:
  - Systematic compiler faults
  - Some systematic and permanent hardware faults (if executed on one board)
- Issues:
  - To some degree possible with one compiler and different compile options (optimization on/off, ...)
  - If compilers from different manufacturers are used, independence must be ensured

Systematic Automatic Diversity

- What can be "diversified":
  - memory usage
  - execution sequence
  - statement structures
  - array references
  - data coding
  - register usage
  - addressing modes
  - pointers
  - mathematical and logic rules

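"Data coding" from the list above can be made concrete with a classic complement coding: each value is stored twice, once plainly and once bit-inverted, and every read cross-checks the two copies so that a random bit flip is detected. The class name and API are illustrative assumptions:

```python
class CodedInt:
    """A 32-bit integer stored in two diverse codings (value and
    bitwise complement); a mismatch on read signals data corruption."""
    MASK = 0xFFFFFFFF

    def __init__(self, value):
        self._v = value & self.MASK
        self._inv = ~value & self.MASK  # diverse (complement) coding

    def read(self):
        if self._v != (~self._inv & self.MASK):
            raise RuntimeError("bit flip detected: codings disagree")
        return self._v

x = CodedInt(42)
print(x.read())   # 42
x._v ^= 0x10      # simulate a random hardware bit flip
# x.read() would now raise RuntimeError
```

The same idea generalizes to arithmetic codes (e.g. AN codes) where the redundant coding even survives computation, not just storage.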
