Professional Documents
Culture Documents
EUROCONTROL
introduction
In January 2007, a project was launched by EUROCONTROL with the aim of understanding the new area of Resilience
Engineering and its relevance to Air Traffic Management (ATM). Resilience Engineering is developing important tools
and methods for both system developers and people responsible for the maintenance and management of system
safety, in a number of industries.
This White Paper provides an overview of the work carried out and seeks to answer the following questions:
n
How mature is Resilience Engineering and what is the added value for ATM?
The added value of an Resilience Engineering approach is that it provides a way to address the issues of emergent
accidents and the often disproportionate consequences that are an outcome of ever more complex technologies
and ever more integrated organisations. Resilience methods are likely to be important tools in the management and
assurance of ATM safety in the future.
Performance variability:
The ways in which individual and collective
performances are adjusted to match current
demands and resources, in order to ensure
that things go right.
Positive
OUTCOME
Serendipity
Normal outcomes
(things that go right)
Neutral
Good luck
Incidents
Near misses
Negative
Accidents
Disasters
Mishaps
PREDICTABILITY
Very low
Very high
Today, most ANSPs have safety nets, Safety Management Systems (SMS), safety assessment and assurance
processes, and many are also improving their safety
culture. These efforts add layers of safety to already safe
systems. While this will reduce the likelihood of acci-
RCA, ATHEANA
HEAT
TRIPOD
MTO
Swiss cheese
HPES
FRAM
STEP
HERA
HCR
Root cause
Domino
FMEA
Fault tree
AcciMap
THERP
HAZOP
MERMOS
CSNI
TRACEr
FMECA
MORT
1900
1910
1920
1930
Technical
1940
1950
1960
Human Factors
1970
1980
CREAM
1990
Organisational
STAMP
AEB
2000
2010
Systemic
Resonance:
A principle that explains how disproportionate large consequences can arise from
seemingly small variations in performance
and conditions.
Organisation
(management)
Downstream
Design
Maintenance
Upstream
Technology
Organisation
(management)
Organisation
(management)
Downstream
This has, however, not always been so. The concern for
Maintenance
how human factors could affect safety began with the
accident at the Three Mile Island nuclear power plant
in 1979. At that time, the focus was naturally on control
Upstream
room operations (the sharp end), and methods were
Technology
therefore
developed to support that. The situation in
the 1980s with regard to the focus of safety is depicted
in Figure 4.
Design
Downstream
Design
Maintenance
Upstream
Technology
Tractable system
Intractable system
Number of details
Comprehensibility
Stability
Independence
Interdependence
Metaphor
Clockwork
Teamwork
Tight
Power
grids
Dams
Financial
systems
NPPs
Aircraft
Military
operations
Railways
Emergency
management
Coupling
Marine
transport
Integrated
operations
ATM
Mining
Healthcare
Space
missions
Manufacturing
Loose
Post
offices
Tractable
Universities Public
services
Assembly
lines
Manageability
Intractable
How is it done?
Resilience Engineering provides the conceptual basis for
a new perspective on safety. The leading prototype currently is the Functional Resonance Assessment Method
(FRAM). This method is based on four principles:
FRAM is a developing method. The following is an illustration of what a FRAM analysis might look like for an
overflight scenario:
A FRAM analysis consists of five steps:
The principle of emergence. Both failures and normal performance are emergent phenomena: neither
can be attributed to or explained simply by referring to the (mal)functions of specific components or
parts.
The principle of functional resonance. FRAM replaces the traditional cause-effect relation with resonance. This explains how the variability of a number
of functions every now and then may resonate, i.e.,
reinforce each other, leading to excessive variability
in one or more downstream functions. The consequences may spread through the system by means of
tight couplings rather than easily identifiable causeeffect links.
Time
Input
Function/
Process
Precondition
Control
Output
Resource
A/C 1: radio
contact established
C
9
Issue
clearance
Coordination
An example of what this looks like, applied to an overflight scenario, is in the following. The analysis of system
functions (Step 2) produced the following list:
n
n
n
n
n
n
n
n
n
n
Function/
Process
Input
Output
Issue
clearance
to pilot
I
T
Provide
MET data
MET data
Monitoring
FDPS updated
C
Strip
marking
Update
FDPS
T
FDPS updated
Planning
R
I
Team
coordination
Sector to
sector
comm.
10
T
P
C
Provide
flight &
radar
data
A/C 1: radio
contact established
I
T
C
Pilot ATCO
communication
Clearance A/C 1:
Climb to FL100
T
Flight & radar data
T
Team
coordination
Coordination
C
P
Request from
next sector:
Climb to FL100
O
ty
tional Safe
Organisa
Technical Safety
Human F
actors
Resilie
nce E
engin
eerin
11
Performance variability enables individuals and organisations to adjust what they do to match current
resources and demands. Performance variability is
the reason why things go right, and sometimes also
the reason why things go wrong.
Conclusion
This White Paper has presented the main concepts and practical principles of Resilience Engineering, a developing
field which will be important for ATM safety in the future. ATM is among the socio-technical systems that have developed
so rapidly that established safety assessment approaches are increasingly challenged to address all the issues and
identify all the risks. In complex socio-technical systems, things may go wrong in the absence of manifest failures
and malfunctions, and outcomes may often be disproportionately large. To safeguard against such developments,
it is necessary to have tools and methods that can deal with the underspecification and tight couplings of complex,
highly interactive systems such as ATM. Resilience Engineering offers a conceptual and methodological basis for
achieving that goal it is one to watch. The research and development efforts will continue and further results will
be documented and disseminated as a way of demonstrating the added value of the approach. Special emphasis
will be put on case studies and guidance on how a smooth integration with conventional safety assurance schemes
can be accomplished.
Some additional reading is suggested below.
12
Further Reading:
http://www.resilience-engineering.org/
Hollnagel, E. (2004).
Barriers and accident prevention. Chapter 5 & 6. Aldershot, UK: Ashgate publishers.
Hollnagel, E., Woods, D. D. & Leveson, N. (2006).
Resilience engineering: Concepts and precepts. Aldershot, UK: Ashgate publishers.
Hollnagel, E. (2009).
The ETTO principle: Efficiency-thoroughness trade-off. Why things that go right sometimes go wrong.
Aldershot, UK: Ashgate publishers.
Woltjer, R. & Hollnagel, E. (2008).
Modeling and evaluation of air traffic management automation using the functional resonance accident model (FRAM).
8th International Symposium of the Australian Aviation Psychology Association, Sydney, Australia.
GLOSSARY
Intractable:
A system which cannot be described in every detail and where the functioning therefore is not completely understood. Intractable systems are only partly predictable.
Performance variability: The ways in which individual and collective performances are adjusted to match current demands and resources, in order to ensure that things go right.
Resilience:
Resonance:
A principle that explains how disproportionate large consequences can arise from
seemingly small variations in performance and conditions.
Safety culture:
Serendipity:
The making of happy and unexpected discoveries by accident or when looking for
something else; such as discovery.
13
Dr Barry Kirwan leads Safety Research and Development in EUROCONTROL. He has degrees
in Psychology, Human Factors and Human Reliability Assessment. He has worked in the nuclear, chemical, petrochemical, marine and air
traffic sectors of industry, and lectured at the
University of Birmingham in Human Factors.
He was formerly Head of Human Reliability at
BNFL in the UK nuclear industry, and Head of
Human Factors at NATS (UK). For the past nine
years he has been working for EUROCONTROL,
managing a team of safety researchers and
safety culture specialists at the EUROCONTROL
Experimental Centre in Bretigny, near Paris. He
has published four books and around 200 articles. He is also a visiting Professor of Human
Reliability & Safety at Nottingham University
in the UK.
barry.kirwan@eurocontrol.int
EUROCONTROL
European Organisation for the Safety of Air Navigation (EUROCONTROL). September 2009
This document is published by EUROCONTROL for information purposes. It may be copied in
whole or in part, provided that EUROCONTROL is mentioned as the source and it is not used for
commercial purposes (i.e. for financial gain). The information in this document may not be modified
without prior written permission from EUROCONTROL.
www.eurocontrol.int