Techneau, June 2009

Methods for risk analysis of drinking water systems from source to tap
- Guidance report on Risk Analysis


TECHNEAU

© 2009 TECHNEAU
TECHNEAU is an Integrated Project funded by the European Commission under the Sixth Framework Programme, Sustainable Development, Global Change and Ecosystems Thematic Priority Area (contract number 018320). All rights reserved. No part of this book may be reproduced, stored in a database or retrieval system, or published, in any form or in any way, electronically, mechanically, by print, photoprint, microfilm or any other means without prior written permission from the publisher.


Colophon

Title: Methods for risk analysis of drinking water systems from source to tap - Guidance report on Risk Analysis
Authors: P. Hokstad1, J. Røstum1, S. Sklet1, L. Rosén2, T.J.R. Pettersson2, A. Linde2, S. Sturm3, R. Beuken4, D. Kirchner6, C. Niewersch6
Affiliations: 1 SINTEF, 2 Chalmers University of Technology, 3 TZW, 5 KWR, 6 RWTH Aachen
Quality Assurance: By LNEC and KWR
Deliverable number: D 4.2.4

This report is: PU = Public


Contents

Summary
1 Introduction
   1.1 Objective and scope
   1.2 Content of the report
   1.3 The TECHNEAU generic framework for risk management
   1.4 Definitions
   1.5 Abbreviations
2 Risk analysis of drinking water systems – “From source to tap”
   2.1 Initiation and organisation of a complete risk analysis
   2.2 Relevant decision situations for water utilities
   2.3 System description
   2.4 Hazardous events
   2.5 Safety barriers – Causes and consequences of hazardous events
   2.6 Risk estimation
3 Coarse risk analysis of water supply systems
   Identification of hazardous events
   Various approaches for hazard identification
   TECHNEAU Hazard Data Base (THDB)
   Risk estimation in Coarse Risk Analysis (CRA)
   Tool for Coarse Risk Analysis (CRA)
4 Quantification of risk
   4.1 The dimensions of risk and various ways to quantify risk
   4.2 Qualitative versus quantitative expressions for risk
   4.3 Risk measures for loss of water quality
   4.4 Risk measures for loss of water quantity (supply)
   4.5 Risk measured in monetary units
5 Data for risk analysis
   Introduction
   Data needs
   Data sources
   Types of data sources
   Failure event data bases
6 More advanced risk analysis methods for water supply systems
   6.1 Risk modelling and choice of a risk analysis method
   6.2 HAZOP
   6.3 Failure Modes, Effects and Criticality Analysis (FMECA)
   6.4 Removal efficiency of the water treatment system
   6.5 Fault Tree Analysis (FTA)
   6.6 Reliability Block Diagram (RBD)
   6.7 Human Reliability Analysis (HRA)
   6.8 Markov Analysis
   6.9 Cause-effect relations - Bayesian Networks
   6.10 Event Tree Analysis (ETA) - analysing consequences
   6.11 Methods for estimation of risk to human health (QMRA and QCRA)
   6.12 Methods for risk analysis of water quantity (supply)
   6.13 GIS as a tool in risk analysis
      6.13.1 Introduction
      6.13.2 GIS in catchment risk management
      6.13.3 GIS Assisted Risk Analysis – Description and application
      6.13.4 Main requirements
      6.13.5 Concluding comment
7 Summary of risk analysis methods
8 References
Appendix A: Main steps of a risk analysis
Appendix B: DALY and a generalisation
   B.1 DALY – An overall risk measure of health effects
   B.2 Generalisation of DALY. Combined measure for water quantity and quality
Appendix C: Two examples of FTA
   C.1 A Fault Tree Analysis of an UV system
   C.2 Integrated risk analysis: Fault-tree analysis to investigate causes of failures
Appendix D: Procedure and example of FMECA
Appendix E: Analyses to establish treatment and monitoring system
Appendix F: Some fundamental reliability concepts

Summary

This report is a deliverable of Work Area 4 (WA4) – Risk Assessment and Risk Management in the TECHNEAU project. The main objective of WA4 is to integrate risk assessments of the separate parts in drinking water supplies into a comprehensive decision support framework for cost-efficient risk management in safe and sustainable drinking water supply.

The present report gives an introduction into risk analysis methods for water supply, from source to tap. This report focuses on informing staff of water utilities on the available methods for risk analysis. It describes which problems can be addressed by the various methods. Consequences with respect to both water quality and water quantity are considered. The capabilities and limitations of the methods and typical results of the analyses are presented. The report will refer to the TECHNEAU case studies and other literature to give details on the various risk analysis methods. It also refers to the TECHNEAU Hazard Data Base (THDB) and to the TECHNEAU report “Generic framework and method for integrated risk management in water safety plans”.

Various decision situations where use of risk analysis is relevant are presented, demonstrating typical objectives for carrying out risk analyses. There could for instance be a need of:
• An initial analysis, prior to the start up of a water utility, as a basis for design of the supply system.
• An analysis imposed by rebuilding or operational changes of the utility.
• An analysis driven by identified problems related to water quality or water availability.
• Cost/benefit considerations to identify the best risk reducing alternative.

The main steps of a complete risk analysis are discussed:
• Scope / analysis objective
• System description
• Identification of hazards and hazardous events
• Estimation of risk (probabilities and consequences).

The report describes a Coarse Risk Analysis (CRA), being a total risk analysis, including hazard identification and risk estimation, with the results presented in a risk matrix. This is a relatively simple analysis, and could be carried out by most water utilities with some assistance from risk analysts. However, several of the other risk analysis methods will require deeper knowledge and experience with the risk analysis techniques, and should be carried out by professionals in close cooperation with water utility personnel. Several of these more complex analyses are described in the last part of the report. Possible objectives of these more advanced methods are described. It can for instance be to analyse the causes of hazardous/undesired events, the consequences of these events, to analyse the effect of human errors on system reliability, to assess the availability of water to various consumers (distribution network analysis), and to perform maintenance “optimisation”, etc. In Chapter 7 a summary of all risk analysis methods is presented.

The report discusses various ways to quantify relevant aspects of risk for water supply. The data needed to perform risk analyses are also described. This risk analysis report will not in any detail treat the risk acceptance and risk evaluation steps of risk assessment.

1 Introduction

1.1 Objective and scope

The main objective of Work Area 4 (WA4) – Risk Assessment and Risk Management in TECHNEAU is [1]: to integrate risk assessments of the separate parts into a comprehensive decision support framework for cost-efficient risk management in safe and sustainable drinking water supply. The goals in WA 4 are also to provide tools and guiding documents for water utilities carrying out risk assessment and risk management. A number of the tools are also tested in case studies and are disseminated through training seminars. A conceptual schematic of the framework, guidance reports (like this) and tools that are to be produced in WA 4 is presented in Figure 1, and the present guidance report is the Methods for Analysing Risk.

Figure 1. A conceptual schematic of the framework, guides and tools that are produced in WA 4, with this report, Methods for Analysing Risks, put into the context.

This report is based on the TECHNEAU report [2] “Generic Framework and Methods for Risk Management in Water Safety Plans”, where risk management is discussed, and also some risk analysis methods are described. The present report provides a more complete overview of risk analysis methods for water utilities.

The present report aims at demonstrating the application of various risk analysis methods for water utility systems. The report is intended to give insight into the capabilities, the use and potential results of the various methods for risk analysis. Thus, it can serve as a guide to the management of a water utility regarding which analyses are most relevant in various decision situations. However, the report will not give all details of the various methods, as this would require a full text book, which is not the scope here.

The main target groups of the report are management and operational personnel of water utilities with some basic understanding of the main concepts of risk management. Thus, the water utilities (at least the smaller ones) may need support from risk analysts/consultants in order to apply many of the methods described here.

Within TECHNEAU we define the term water safety as “Water supply that protects water availability and human health with a high degree of practical certainty”, which comprises both loss of water quality and water quantity, and the report covers both these aspects of risk. The report focuses on risk analysis, and will not go into any detail on risk evaluation and risk reduction and control. Risk evaluation has already been briefly discussed within WA4 [2], but will be presented in later reports.

Within WA4 six case studies are carried out at different water supply systems. In these case studies the applicability of various methods for risk analysis (RA) is tested, notably:
• Göteborg – Quantitative and probabilistic method based on a fault tree analysis.
• Bergen – Coarse Risk Analysis (CRA).
• Amsterdam – Network simulation model and Bayesian belief networks.
• Freiburg-Ebnet – GIS (Geographic Information System) Assisted Risk Analysis (GARA-method).
• Březnice – CRA and a Failure Modes and Effect Analysis (FMEA).
• Upper Nyameni – CRA and the South African Risk Evaluation Guidelines.
The results of these case studies have been integrated into this report and are described in more detail in the various case study reports [31], [63], [65], [73], [74], [75].

1.2 Content of the report

Figure 2 gives an overview of the various topics described in this report (Section/Chapter no. in parenthesis). First some basic background information is given (see top box in Figure 2):
• The TECHNEAU framework for risk management (Section 1.3).
• Definitions of risk analysis terms (Section 1.4).
• Abbreviations used (Section 1.5).

Further, the main tasks of risk analysis (middle boxes in Figure 2):
• Chapter 2 gives an overview of the integrated approach to a complete risk analysis of a water utility. First some relevant decision situations for carrying out a risk analysis are given. The analysis will include description of the total system, identification of hazardous events and finally risk estimation.
• Chapter 4 describes various ways to quantify and measure (see footnote 1) different risks to a water utility.
• Chapter 5 discusses data needed to carry out a risk analysis.

Footnote 1: In a risk analysis we may apply risk measures to quantify (express) the risk. Thus, here we are not referring to measurements.

In the lower box in Figure 2 various risk analysis methods are listed and described in the report.

First, Chapter 3 describes a complete “coarse” risk analysis (CRA), including both hazard identification and risk estimation. In the CRA the risk estimation is often restricted to a semi-quantitative estimate, giving probability and consequence categories. A CRA is often the first risk analysis to be carried out for a utility, and when a CRA has been carried out, the water utility should have a rough picture of the main risks. Thus, the need of more detailed analyses of critical events/subsystems can be identified.

In Chapter 6 we describe several of the more advanced risk analysis methods that could be relevant for a more detailed investigation of the risk related to a water utility. However, these more advanced risk analysis techniques are not aimed directly at ordinary water utility personnel, but at the interested reader with an advanced level of risk analysis knowledge. Therefore, Chapter 6 can be seen as a Part II of the report, for readers that have some additional background in the risk and reliability concepts.

Figure 2. Overview of topics covered in the report (relevant Section/Chapter no. in parenthesis).

The advanced methods are (illustrated in the lower part of Figure 2):

• Hazard and operability (HAZOP) analysis, being a rather elaborate method for hazard identification (Section 6.2).
• Failure Mode, Effects and Criticality Analysis (FMECA), a systematic way to identify and document the failure modes (and consequences of failure) of a specified system (Section 6.3).
• Analysis of the efficiency of treatment systems, in order to identify proper method for water treatment applying FMECA (Section 6.4).
• Fault Tree Analysis (FTA), a systematic approach to “break down” a failure event into its “causes”/contributors (Section 6.5). Appendix C gives two examples of the use of FTA.
• Reliability Block Diagram, which gives very much the same information as a fault tree, but gives a different graphical presentation of the result (Section 6.6).
• Human Reliability Analyses (HRA), various analyses to identify human errors and assess the consequences of these (Section 6.7).
• Markov Analysis, a somewhat detailed analysis that can be carried out to analyse a system that can pass through various (performance) states. Can be relevant for maintenance analyses (Section 6.8).
• Bayesian networks, a mathematical technique used to analyse dependencies between variables. It can be used for instance to assess the effect of factors influencing the risk (Section 6.9).
• Event Tree Analysis, used to evaluate various outcomes (consequences) of an undesired (hazardous) event (Section 6.10).
• QMRA/QCRA, analysis techniques to estimate the effects on human health of microbiological or chemical hazards (Section 6.11).
• Distribution network analyses, methods to assess the performance of a distribution network. This type of analysis is mostly relevant for analysing water quantity, although in TECHNEAU (WA 5) research is executed on modelling water quality (Section 6.12).
• GIS tools, allowing geographical representation and analysis of infrastructure assets and the tracking of associated hazards/risks (Section 6.13).

Note that a couple of these methods (“Treatment efficiency” and “GIS tools”) are usually not considered as risk analysis methods, but are included here as being important in risk analysis of water utilities.

Conclusions, including an overview of the various risk analysis methods and the applicability of these methods, are given in Chapter 7. Finally note that an overview of the main steps of a risk analysis is presented in Appendix A, for each step giving references to the relevant sections of the report.

1.3 The TECHNEAU generic framework for risk management

The risk management process is illustrated in Figure 3 (see [2]). This presents the TECHNEAU generic framework for integrated risk management, which includes the following main components:
• Risk analysis. In a risk analysis the various hazardous events related to the water utility are identified, and the corresponding risks are estimated. This is done by estimating e.g. the frequency of hazardous events and various consequences of these events.
• Risk evaluation. The risk evaluation requires that a risk acceptance/tolerability criterion is defined (by the water utility). The estimated risk is then compared with this acceptance criterion in order to decide whether the risk is acceptable (tolerable). Further, various risk reduction options are considered to evaluate their cost-effectiveness.
• Risk control. Risk reduction options have to be decided on and then implemented. In particular, risks above the acceptance criterion must be treated. Further, the risk is monitored during operation of the utility. (This activity will essentially be treated in a forthcoming report of TECHNEAU, WP4.3.)

[Figure 3 box labels: Risk Analysis (Define scope; Identify hazards; Estimate risks: qualitative, quantitative). Risk Evaluation (Define tolerability criteria: water quality, water quantity; Analyse risk reduction options: ranking, cost-efficiency, cost-benefit). Risk Reduction/Control (Make decisions; Treat risks; Monitor). Get new information; Update; Analyse sensitivity; Develop supporting programmes; Document and assure quality; Report and communicate; Review, approve and audit.]

Figure 3. The main components of the TECHNEAU generic framework for integrated risk management in WSP [2].

The first component of this framework, risk analysis (the scope of this report), includes the following three steps:

1. Definition of scope of risk analysis
A complete risk analysis will start by defining the scope of the analysis (see Section 2.2 below). For a water utility the objective of the analysis could be related to one or more of the following topics:
• Water quality.
• Water quantity (and availability).
• Economy.
• Environmental impact.
• Consumer trust.
The present report focuses on water quality and water quantity. Important considerations are whether qualitative, semi-quantitative or quantitative measures of risk are needed, and whether the risk analysis comprises the complete water utility or some subsystem(s) of it. System definition/description and limitations of the analysis are also given in this initial step. Further, an appropriate team needs to be assembled according to the scope of the analysis.

2. Hazard identification
The next step is the identification of all hazards and hazardous events. A hazard is usually given as a source of potential harm (e.g. the existence of a farming or industrial activity in the catchment area). A hazardous event is an event which can cause harm (e.g. the existence of hazardous agents in the drinking water source). Various methods exist for identifying hazards and hazardous events, e.g. checklists [20], experience from the past and expert judgements.

3. Risk estimation
A lot of methods exist for modelling and estimating the various risks to a water utility. A proper method needs to be selected with respect to the specific scope of the risk analysis. Risk estimation consists of the following steps: frequency analysis, consequence analysis, and their integration.

Various activities are required in order to carry out a risk analysis, risk evaluation and risk control. These are indicated in the rightmost box of Figure 3.

1.4 Definitions

The following definitions of terms are applied in the TECHNEAU project, cf. [2]:
• Hazard is a source of potential harm or a situation with a potential of harm.
• Hazardous agent is for example a biological, chemical, physical or radiological agent that has the potential to cause harm.
• Hazardous event is an event which can trigger a hazard and cause harm.
• Hazard identification is the process of recognizing that a hazard exists and defining its characteristics.
• Risk is a combination of the frequency, or probability (see footnote 2), of occurrence and the consequences of a specified hazardous event.
  Footnote 2: When we give the mean number of events during a fixed period of time (e.g. per year), we talk about a frequency, f (e.g. f = 3/year). We can also give the probability, p, that the event will occur during one year. The probability is “dimensionless” and is always a number between 0 and 1. For instance p = 0.1 means that on the average the event will occur in one out of 10 years (and frequency, f = 0.1/year). Likelihood can be used as a common word for probability and frequency.
• Risk analysis is the systematic use of available information to identify hazards and to estimate the risk to individuals or populations, property or the environment.
• Risk estimation is the process used to produce a measure of the level of risk being analysed.
• Risk evaluation is the process in which judgements are made on the tolerability of the risk on the basis of risk analysis and taking into account factors such as socioeconomic and environmental aspects.

• Risk assessment is the overall process of risk analysis and risk evaluation.
• Risk management is the systematic application of management policies, procedures and practices to the tasks of analysing, evaluating and controlling risk.
• Water safety is defined (within TECHNEAU) as: “Water supply that protects water availability and human health with a high degree of practical certainty”.

1.5 Abbreviations

ALARP - As Low As Reasonably Practicable
BN - Bayesian Network
CCP - Critical Control Points
CML - Customer Minutes Lost
CRA - Coarse Risk Analysis
DALY - Disability Adjusted Life Years
ETA - Event Tree Analysis
FTA - Fault Tree Analysis
FMEA - Failure Modes and Effect Analysis
FMECA - Failure Modes, Effects and Criticality Analysis
GARA - GIS Assisted Risk Analysis
GIS - Geographic Information System
HACCP - Hazard Analysis and Critical Control Points
HAZID - Hazard Identification
HAZOP - Hazard and Operability analysis
HCI - Hydraulic Criticality Index
HIA - Health Impact Assessment
HRA - Human Reliability Assessment
MTTF - Mean Time To Failure
MTTR - Mean Time To Repair
PFD - Probability of Failure on Demand
PHA - Preliminary Hazard Analysis
PSF - Performance Shaping Factors
QCRA - Quantitative Chemical Risk Assessment
QMRA - Quantitative Microbiological Risk Assessment
RBD - Reliability Block Diagram
ROS - Risk and Vulnerability Analysis (ROS is the Norwegian abbreviation)
RPN - Risk Priority Number
SSM - Substandard Supply Minutes
THDB - TECHNEAU Hazard Database
WHO - World Health Organisation
WSP - Water Safety Plans
WSS - Water Supply Structure
YLD - Years Lived with Disability
YLL - Years of Life Lost
YLQ - Years of Life with water supply of bad Quality
YLS - Years of Life without water Supply
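The distinction made in Section 1.4 between a frequency f and a probability p can be illustrated numerically. The sketch below assumes that hazardous events occur independently at a constant rate (a Poisson process); this assumption and the function names are illustrative only and are not part of the TECHNEAU framework. Under that assumption the probability of at least one event in a period t is p = 1 - exp(-f·t), so p is close to f only when f is small.

```python
import math


def annual_probability(frequency_per_year: float, years: float = 1.0) -> float:
    """Probability of at least one event within `years`.

    Assumes a Poisson process with constant rate (illustrative assumption).
    """
    if frequency_per_year < 0 or years < 0:
        raise ValueError("frequency and period must be non-negative")
    return 1.0 - math.exp(-frequency_per_year * years)


def frequency_from_probability(p_annual: float) -> float:
    """Inverse conversion: event rate (per year) that gives annual probability p."""
    if not 0.0 <= p_annual < 1.0:
        raise ValueError("probability must be in [0, 1)")
    return -math.log(1.0 - p_annual)


if __name__ == "__main__":
    # The two example frequencies used in the definitions footnote.
    for f in (0.1, 3.0):
        print(f"f = {f}/year -> p = {annual_probability(f):.3f}")
```

For f = 0.1/year the annual probability is roughly 0.095, which is close to f. For f = 3/year it is roughly 0.95: a frequency may exceed 1, whereas a probability cannot.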


2 Risk analysis of drinking water systems – “From source to tap”

This chapter describes the overall structure and main elements of the risk analysis process, including study initiation/organisation, system description and assembling a team. The process is described in detail in Appendix A (also briefly described in Section 1.3) and includes the following main steps:
1. Scope definition, including the motivation for carrying out a risk analysis.
2. The identification of hazardous events.
3. The risk estimation.
Some details of the process are given below. Appendix A presents a summary of the various steps of the risk analysis.

2.1 Initiation and organisation of a complete risk analysis

A risk analysis should be initiated by a general objective on how to reduce the risk for the public or the water utility. Further, a clear scope of the specific analysis should always be formulated (e.g. see [4]). As discussed below, in Section 2.2, the risk analysis could be initiated by making an overview of the overall risk situation within the supply system for the specific decision situations. If the water utility is in a decision situation it should consider the questions:
• What is the problem?
• What are the alternatives?
• Who is affected by the decision?
• Who is making the decision? (For whom shall we carry out the analysis?)
• Which aspects are considered when making the decision?
• What are the wishes and priorities of the various stakeholders?

When risk analyses are utilised as decision support there are several ways to express (quantify) the various aspects of the risk. So if there are various benefits and losses (potential consequences) involved, the comparison of these benefits/losses may represent (ethical) problems, which must be handled by decision makers. One typical difficulty is how to give value to human life.

When assembling the risk (analysis) team relevant stakeholders are to be identified, e.g. water utility owners, safety managers, consumers, municipalities and health authorities. These decide whether any restrictions should be imposed on the work, for instance whether only a subsystem of the utility should be considered, or whether to include only specific types of hazardous events or risk reduction options. Critical stakeholders, for example hospitals, kindergartens and schools, have to be identified and given special attention during the analysis.

Further, it must be decided who shall participate in the analysis work: risk analyst(s), various experts and generalists. Thus the initial part of the analysis process is to organise and make a plan for the work. So, an analysis team must be selected. The team should consist of water works experts (operators, planners, laboratory personnel etc.), and some outside specialists (e.g. researchers, consultants etc.) that may introduce new perspectives in the risk analysis process. The working process must also be organised in a combination of meetings (with information gathering and evaluation) and analysis work. In this respect it is important to stress the importance of having commitment from all professional categories of the water company in order to achieve real risk reductions as a result of the work.

2.2 Relevant decision situations for water utilities

The scope of a risk analysis should describe the purpose of the analysis and the problems that initiated it. Drinking water supply is subjected to many different risks and it is important to focus the risk control on the most important areas. Below we list some typical decision situations for water companies, which could initiate risk analysis work. Practical examples of these are included.

• Initial risk analyses, required prior to the start up of a plant/water utility (or modifications, such as rebuilding or operational changes). Relevant objectives to initiate a risk analysis could simply be a need to:
  o Identify and rank all hazards (in order to control risk).
  o Estimate the risk to identify any need of additional Critical Control Points (CCP).
  o Evaluate cost/benefit of risk reduction options to achieve an acceptable risk.
  Examples of typical (specific) questions that could launch a risk analysis exercise could be:
  o Which out of all chemicals and microbial substances are most critical to health aspects for drinking water consumers?
  o How do we compare the risks of shipping petroleum products on the raw water source with cattle grazing next to it?

• Analyses carried out to “optimise” operational maintenance and emergency procedures. The protection against water-borne diseases by implementing new barriers in the plant may be a long-term action for many water utilities, and it may take some years before the required barrier function is implemented. So, in the meantime:
  o How can the protection be improved by optimizing the present treatment? Which risks can be reduced by process optimization?
  o How important are periods with suboptimal performance?

• Analyses initiated by a specific operational problem. The water utility may have a deviation reporting system that gives support to the handling of specific problems. Such a system gives information on the acute actions. It is also designed to sort out the need for improvements in order to avoid similar events or to reduce the consequences of them. For instance, it has been experienced that deviations related to a very rainy autumn can include simultaneous pollution in the main raw water source and the back-up water source. This can be combined with humic contents in the raw water reservoir and inadequate sludge removal in the treatment. So, relevant questions are:
  o What is the likelihood of such combinations in the future?
  o How can they be detected and avoided, or how can the consequences be reduced?
  More generally, risk analyses could be initiated by problems like:
  o Delivered water is observed not to comply with required quality standards (e.g. unacceptable level of some bacteria)
  o Reduced availability of water delivery observed (to some group of users)
  o Observed security problems
  o Occurrence of an unwanted event (accident investigation)

• Analyses to update initial risk analyses, in order to include possible new hazards. For instance, a water treatment plant could be designed for having a multi-barrier protection, while according to new knowledge formerly unknown microbial agents are pointed out as an important hazard. For instance the parasitic protozoa Cryptosporidium was not described until the 1970s as infecting humans [8], and not recognized until 1984 as a waterborne infection. The recognition of new hazards can result in new risk reduction options, e.g. improved barriers against Cryptosporidium. So relevant questions to initiate further analyses could be:
  o Are the barriers in the water treatment plant sufficient for emerging microbial contamination?
  o Does the raw water contain micro-organisms that can harm human health?
  o How will the present treatment cope with the predicted climate change?

• Analysis to obtain acceptable risk with respect to supply (major delivery failures). Water utilities may have acceptance levels for interruption of supply that take into consideration the number of consumers without water and the time without water. The risk of limited delivery failures can be calculated from statistical data, but little information is available on the larger failures. Relevant problems:
  o Is it raw water, treatment or distribution system, or a combination of these, which is the limiting factor to achieve acceptable risk?
  o Where are the bottlenecks?

It is observed that the above questions could be related to various life cycle phases (e.g. design or operational phase), and the questions can be related both to strategic and operational decisions.

2.3 System description

One of the first tasks is to provide a system description, from source to tap, also describing the functions of the various subsystems. Each water supply system is unique and a description of the system is therefore an important part of a risk analysis. An important aspect is that the risk analysts shall get familiarised with the analysis object. The description should include both illustrations (drawings) and written text. Important documents are rules and regulations, standards, maps, drawings, operating procedures, statistics, etc. As an example Figure 4 is an illustration of a water supply system from source to tap carried out according to the WSP guidance.

[Figure 4 labels: Source water, Treatment, Distribution; Tank, WTP 1, WTP 2, Reservoir]

Figure 4. Illustration of system flowchart.

The system description should include a description of the system boundaries, the technical systems, operational conditions and the environment. The system description should include detailed knowledge of the following three subsystems (in case the total system is analysed):
1. Water source (groundwater and/or surface water) and the catchment area.
2. Water treatment systems and monitoring systems.
3. Distribution network, including plumbing system and consumers.

Some generic information is also seen as a part of the system description, such as the total number of consumers linked to the distribution system and their consumption demand.

For an identification of hazardous events it is also important to point out important support systems, which the water utility is depending on for successful operation (e.g. power supply, supply of chemicals, IT-systems, training and employment of personnel).

The system description illustrates a “normal operational situation”. In particular, after the treatment process and control points have been decided, it is specified which concentrations of various contaminants the treatment system is designed to handle. So specification of this normal operational situation is an important part of the system description.

Many risk analysis methods require some structured way to break down the system into manageable parts. The system should be broken down into suitable subsystems that can be handled effectively in an analysis (i.e. splitting Figure 4 into subsystems like source, treatment, distribution). A common way to break down a system in an analysis is a hierarchical model which reflects how the system is designed. Each subsystem can further be broken down into modules, and each module into components etc.

there could be a risk of counting hazardous events twice. Water treatment is malfunctioning: Treatment system having reduced ability to handle “normal contamination” or to detect contamination above “normal levels”. incl. 2. Identification of hazardous events is described in detail in the TECHNEAU Hazard Database (THDB) [20]. and events that result in a biological/chemical agents entering the drinking water source. (for additional information see [5]). However. In this case. Events causing insufficient water supply to consumers. to avoid such “double counting”.4 Hazardous events Step 2 (on page 15) of the risk analysis is to identify hazardous events. liability) A typical example of a hazardous event is the presence of a contamination (hazard) in the source of a drinking water supply system. Note that if the objective of the analysis is to assess the total risk caused by all identified hazardous events. 21 . which the existing treatment system is not designed to handle. Secondary faecal contamination.77. (e.microrisk.com resulted in a number on reports on QMRA. because both events must occur at the same time in order for the drinking water to be contaminated. The Microrisk project3. it is suggested to start from the above four categories of events. Primary faecal contamination: Events that cause significant contamination of the source water. with microbial levels much above “normal levels”. wastewater intrusion due to cross-connections or backflow). The presence of Giardia in the water source is another. in the various parts of the system.Methods for risk analysis of drinking water systems from source to tap 2. In TECHNEAU we add the following type of hazardous events: 4. it is not necessarily correct not to add the frequencies of these events to achieve the total risk. see http://217. faecal contaminations that are not originating from the source water.141.e. 
Following groups of hazards are normally considered: • • • • • • Biological Chemical Radiological or physical Unavailability (insufficient availability of water supply to consumers) Safety (safety to personnel) External damage (external damage to third parties.pdf . 3. see www. 3 The EU project Microrisk (contract EVK1-CT-2002-00123). failure of the treatment system to handle/detect Giardia is one hazardous event. A hazardous event is an event which can cause harm. refers to the following types of hazardous events from a microbiological point of view: 1. QMRA was applied to 12 systems across Europe and Australia. In principle all types of unwanted events should be included.80/clueadeau/microrisk/uploads/microrisk_how_to_implement_qmra.g. (which should in principle be handled by the existing treatment system). For instance. i.
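The note on double counting in Section 2.4 can be illustrated with a minimal sketch. It assumes that the two events (Giardia present in the source, and treatment unable to handle or detect it) are independent, and that the treatment failure is expressed as a probability of being in a failed state when the source event occurs. The function name and the numbers are hypothetical and are not taken from the report.

```python
def contaminated_delivery_frequency(source_events_per_year: float,
                                    p_treatment_failed: float) -> float:
    """Frequency (per year) of contaminated water being delivered.

    Assumption (not from the report): the source event and the treatment
    failure are independent, so both must coincide and the frequencies are
    combined by multiplication rather than by addition.
    """
    if not 0.0 <= p_treatment_failed <= 1.0:
        raise ValueError("p_treatment_failed must be a probability in [0, 1]")
    return source_events_per_year * p_treatment_failed


# Hypothetical numbers: 2 source contamination events per year, and the
# treatment being unable to handle them 1 % of the time.
joint = contaminated_delivery_frequency(2.0, 0.01)
naive_sum = 2.0 + 0.01  # adding the two figures would overstate the risk
print(joint, naive_sum)
```

Under these assumptions the combined frequency is much lower than either figure alone, which is why the two events should not simply be added to the total risk.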


Different approaches for identifying hazardous events are discussed in Section 3.1.

2.5 Safety barriers – Causes and consequences of hazardous events

When hazardous events are identified we may want to analyse both causes and possible consequences. The so-called Bow-Tie diagram can be used to illustrate this, see the example in Figure 5, with the hazardous event “Giardia in water source”. The chain of events goes from left to right with the causes (and hazards) on the left, the hazardous event in the middle and the consequences on the right.

Figure 5. Bow-Tie diagram and barriers. A practical example.

In Figure 5 some safety barriers are also introduced, which are implemented to reduce the risk. In the left part of the bow-tie diagram we have barriers (1, 2, 3) that prevent the hazardous event from occurring or mitigate the hazardous event; to the right we see barriers (4, 5) for preventing or reducing unwanted consequences (e.g. people being infected by contaminated water). In general safety barriers can either:
• prevent the undesired event from occurring (reduce probability), e.g. by introducing restrictions on the use of the catchment area, or
• reduce the consequences, e.g. by water treatment systems, thus preventing contaminated water from being delivered to consumers.

So introducing a barrier actually means implementing a risk reducing option. Figure 6 illustrates the concept of hygienic (safety) barriers in a water supply system considering all elements from source to tap, see also the Bergen Case study [65].


Figure 6. Illustration of hygienic barriers in a water supply system from source to tap (modification of a figure taken from: SA Water - Drinking water quality report 2004-2005).

2.6 Risk estimation

The risk estimation can be carried out at various levels of detail. An analysis of the hazardous events should include estimation of likelihood (probability) and consequence. Often a semi-quantitative approach is chosen, giving only categories of likelihood and consequence. The combined likelihood-consequence categories can then be inserted in a risk matrix; see the example in Figure 7. As an example, we indicate corresponding risk values ranging from 1 (likelihood = rare; consequence = insignificant) to 9 (likelihood = almost certain; consequence = catastrophic). This is just an example of how to rank the risks related to the various hazardous events. The categories (e.g. “catastrophic”) can be defined in various ways (see Figure 11).
Likelihood        | Severity of consequences
                  | Insignificant | Minor | Moderate | Major | Catastrophic
Almost certain    | 5             | 6     | 7        | 8     | 9
Likely            | 4             | 5     | 6        | 7     | 8
Moderately likely | 3             | 4     | 5        | 6     | 7
Unlikely          | 2             | 3     | 4        | 5     | 6
Rare              | 1             | 2     | 3        | 4     | 5

Figure 7. An example of a Risk matrix.
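The example matrix in Figure 7 follows a simple pattern: the risk value equals the likelihood rank plus the consequence rank minus one. A minimal sketch of this lookup is given below. The function names and the example events are hypothetical; only the category names and the resulting values are taken from Figure 7.

```python
# Category names as used in Figure 7, ordered from lowest to highest rank.
LIKELIHOOD = ["Rare", "Unlikely", "Moderately likely", "Likely", "Almost certain"]
CONSEQUENCE = ["Insignificant", "Minor", "Moderate", "Major", "Catastrophic"]


def risk_score(likelihood: str, consequence: str) -> int:
    """Return the Figure 7 risk value (1-9) for a likelihood/consequence pair."""
    l_rank = LIKELIHOOD.index(likelihood) + 1    # 1 = Rare ... 5 = Almost certain
    c_rank = CONSEQUENCE.index(consequence) + 1  # 1 = Insignificant ... 5 = Catastrophic
    return l_rank + c_rank - 1


def rank_events(events):
    """Sort (name, likelihood, consequence) tuples by descending risk value."""
    return sorted(events, key=lambda e: risk_score(e[1], e[2]), reverse=True)


# Hypothetical hazardous events, for illustration only.
events = [
    ("Pipe burst", "Likely", "Minor"),
    ("Pathogen in source water", "Unlikely", "Catastrophic"),
    ("Valve operated incorrectly", "Rare", "Moderate"),
]
for name, lik, con in rank_events(events):
    print(risk_score(lik, con), name)
```

A separate list of events would be scored for each risk matrix (e.g. one for quality and one for quantity).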

One should make a separate risk matrix for loss of quality and another for loss of quantity (and possibly one for e.g. economic losses). When performing the risk estimation, one should note the possible links between poor water quality and reduced water quantity. Water quality problems can occur due to low pressure resulting from a pipe burst (a water quantity problem). Further, a drought (a quantity problem) will often also result in a decrease of water quality.


In more advanced analyses risk can be fully quantified, see Chapter 6. Often the input data to these quantifications are rather uncertain, and the results involve considerable uncertainty. Then it is recommended to carry out a sensitivity analysis; i.e. calculating risk with various input values to demonstrate the range of “probable results”.
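The sensitivity analysis recommended above can be sketched as follows. The sketch assumes that risk is expressed as frequency times consequence (the "mean loss" form) and that the analyst supplies a few candidate values for each uncertain input; the function names and the numbers are hypothetical.

```python
from itertools import product


def expected_loss(frequency_per_year: float, consequence: float) -> float:
    """Risk as mean loss per year: frequency times consequence (an assumption)."""
    return frequency_per_year * consequence


def sensitivity_range(frequencies, consequences):
    """Evaluate the risk for every combination of input values.

    Returns the lowest and highest result, i.e. the range of
    "probable results" spanned by the chosen input values.
    """
    results = [expected_loss(f, c) for f, c in product(frequencies, consequences)]
    return min(results), max(results)


# Hypothetical inputs: low / best / high estimates for each uncertain quantity.
frequencies = [0.01, 0.05, 0.1]   # events per year
consequences = [100, 500, 1000]   # e.g. number of consumers affected per event
low, high = sensitivity_range(frequencies, consequences)
print(f"Expected number affected per year lies between {low:g} and {high:g}")
```

Reporting the range rather than a single figure helps to avoid suggesting more precision than the input data support.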


Use of experience from the past. Checklists are limited by their author’s knowledge and experience and should be viewed as living documents.g.2) 3. (Section 3. 5. accident and reliability data.1). In this case. and possibly compare to risk acceptance criteria.e. Checklists may be databases. Section 3. HAZOP is another commonly used method (Section 6. and in this chapter we first give a description of the hazard identification of an overall CRA (Section 3. • • 25 . A brief description of some of the methods is presented in this section. A general discussion is given first.e. (Section 3. Assess the need for risk reduction options or more detailed analyses.Methods for risk analysis of drinking water systems from source to tap 3 Coarse risk analysis of water supply systems The Coarse Risk Analysis (CRA) is a method for semi-quantitative risk analysis. The scope of an overall CRA – including risk evaluation and risk control .1 Various approaches for hazard identification There are various techniques for identification of hazards or hazardous events within a system.1 Identification of hazardous events There are various approaches for the identification of hazardous events. Hazard Identification (HAZID) is a collective term often used for such techniques. Identify hazardous events related either to the total water supply system. 3. i.1. 21. may also be used to identify potential problem areas and provide input into frequency analysis (probability estimation). or to a specific part (or in general to some category of undesired events). then THDB is shortly described.1. reviewed regularly and updated when necessary. A traditional checklist comprises a list of specific items to identify known types of hazards and potential accidents scenarios associated to a system. Next. the task is to identify hazards or hazardous events in a water supply system. The descriptions are primarily based on [18] and [19]. the risk estimation in a CRA is discussed (Section 3. 22]): 1. 
such as the TECHNEAU Hazard database (THDB.typically consists of (see [18.2). • Brainstorming is a main method of problem solving or idea generation in which members of a group contribute ideas spontaneously. Checklists may vary widely in level of detail. 4. Risk estimation. i.1) 2. Rank the hazardous events with respect to their risk. using. Brainstorming. experience from the past.1). Finally an example is given (Section 3.2). The first two steps relate to risk analysis. Present these risks in risk matrices. Experience from the past is often used as input to the methods described in this chapter. 3.2). e. “What-if” analysis is a specific and effective brainstorming approach [2].3). estimate the probability and consequence for each hazardous event.1. checklists (Section 3.

The hazards identified in the THDB are both internal and external. see below. geographical or human origin for the whole part of the system.Methods for risk analysis of drinking water systems from source to tap Experience from the past could be experience from the actual (or similar) water utility. a more specific list may be described for the various parts of the system. cf. Checklists can be applied at any stage of the life-cycle of a water supply system and can be used to evaluate conformance with codes and standards. Internal hazards are mostly related to functional failures or the absence of infrastructure. The database has a generic set-up. A form. etc. shown in Table 1.2 TECHNEAU Hazard Data Base (THDB) The THDB. It does not cover all possible specific operational hazards. but should be regarded as a checklist to assess possible risks of the supply system. may be used for doing this. [20]. A list of generic hazardous events can be formulated by considering characteristics such as [18]: • • • • • Materials used or produced and their reactivity Equipment employed Operating environment Layout Interfaces among system components. polluting the water source near by the inlets Cause Vulnerable locality . External hazards are for instance source water contamination. 26 . provided by operational personnel of these utilities. Chapter 5. Hazardous event Tanker containing 20 m³ of gasoline tips over near intersection XX. A checklist is easy to use and is a cost-effective way to identify common and customarily recognized hazards. This method is rather similar to the brainstorming session. degradation of mains due to aggressive soils or terrorist actions. applies a holistic view on hazard identification within the water supply system and provides a list of hazards and hazardous events for each element in the water system. Table 1. Specific list of hazardous events.Slippery conditions Possible effects: 3. Based on the general hazardous events identified. 
One could also utilise statistics and data on events that have been recorded in various data sources. One could then go through the total system and record operational problems and concerns that are experienced. The TECHNEAU Hazard Database.Sudden illness of Intersection XX tank driver . [20] presents a comprehensive list of hazards and hazardous events that can serve as a checklist for water utilities. The objective of the database is to help water supply utilities with the identification of relevant hazards by providing a catalogue with potential hazards of technical.1.

O: operation-related .Safety: safety to personnel . A source of potential harm or a situation with a potential of harm (e.Chemic. The tables are divided into components and elements.: radiological or physical (including turbidity) . and for each element the most relevant hazards are given in combination with a description of the cause of the hazard. Examples of hazardous events from the THDB are shown in Table 3. .Unavail./phys.g.: biological . At component level the most important elements are given. The hazard database uses the definitions given in Table 2. The hazard database is presenting the identified hazards in a table at the subsystem level. what can happen and how). Ref. Indication of the origin of the hazardous event.Methods for risk analysis of drinking water systems from source to tap The water supply system is subdivided into 12 sub-systems. Element: Hazard: Lowest level of the system at which hazards are described. Table 2. Column to be used by the end-user for marking the identified hazards. an incident or situation that can lead to the presence of a hazard. of which 10 are physical subsystems representing the infrastructure.E: external-related . [20].: chemical .D: design-related .OS: consequence of a hazard in other sub-system . one is a non-physical sub-system representing organizational aspects and one is a sub-system representing future hazards.: insufficient availability of water supplied to consumers . Reference number (id. Reference of the sub-system affected by the hazard.Biolog.) of the hazard. a biological. including liability Description of potential consequences of the hazard to other sub-systems and to consumers. Definitions applied in the TECHNEAU database (THDB). chemical. system: 27 . OS: Reference of other sub-system Indication of the type of hazard. . 
physical or radiological agent or undesired event that has the potential to have a negative effect on the supply of safe and sufficient water).: Hazardous event: Type of hazardous event: Type of hazard: Consequence description: Consequence to sub-system: Rel.Rad. The THDB focuses on both water quality and water quantity.g. An event which can cause harm (e.External damage: external damage to third parties. the hazard type and the consequences.Ref.

g. improper pH control.Improper coagulant mixing and/or flocculation. (6.1. water quality.1. and is referred to as a ROS (Risk and Vulnerability) analyses. industrial accidents or forest fire.2) . in order to identify the most serious threats and then to make the right priorities with respect to implementing risk reduction options.3) .2) . (1. As an example we have taken the hazardous event 6. they are also used for analysing existing systems. The probability categories are denoted e. in brackets) . and similarly consequence categories.Malfunctioning valves. However.Poor hygiene during repair. Note that the consequences can be evaluated with respect to several “dimensions”.g. The risk estimation in a CRA usually restricts to presenting categories of probability and consequence. 28 . Then there is little information on design details and operating procedures. (1. (8. In such a situation water utilities can carry out a Coarse Risk Analysis (CRA). (6. connections to different water qualities (industrial water. e. or in the launching of a WSP implementation in an existing system.4. The CRA can also be used to prepare emergency preparedness plans for the water supply companies. g.Industrial discharge of biological matter. water quantity (supply) or reputation/economic loss. (8. Several variations of this form are used.Decrease of UV lamp performance due to ageing or colour sediments on quartz tube.Industrial discharges of chemicals. These analyses are often carried out early in the development of a utility.3) .6. Electrical disruptions.Methods for risk analysis of drinking water systems from source to tap Table 3. (1. The main objective of the CRA is to identify hazardous events (as described above). cf.1. System Source/ Catchment Examples of hazardous events from the THDB – (hazard id. the causes of the event. C1-C4. One example of a worksheet used to document the results of the analysis are shown in Table 4.7 in the THDB. 
In Norway this type of analysis is commonly applied.14) Water treatment plant Distribution and plumbing 3.7) . and to make a coarse evaluation of likelihoods (probabilities) and consequences of these events. and the analysis can be a precursor to further studies. Each hazardous event identified is inserted in the list and analysed. Table 4. sewers). Examples of hazardous events in the TECHNEAU hazard database (THDB) [20].1. P1–P4. this method is similar to the Preliminary Hazard Analysis (PHA).1) .Emissions during accidents (fire or explosions) e.6.2 Risk estimation in Coarse Risk Analysis (CRA) It is a rather common situation that a water utility wants to have a coarse overview of the main risks for its activities. inappropriate flocculant or flocculation agent. [20]. These pairs of values are later inserted in the appropriate cell of the risk matrix.1. The results are normally displayed in a list of hazardous events (in a worksheet form). or specific subsystem.

Table 4. Example of a CRA-worksheet.
System: Treatment
Ref. Hazard: 6.6.7 Pathogen in water source
Operating mode: Normal operation
Analyst: NN
Date: 2008-10-10

Hazardous event | Causes | Probability | Consequence | Preventive actions | Comments
Too low UV dose | Ageing or colour sediments on quartz tube | P2 1) | C2 2) | Online measurement of UV intensity to verify correct intensity |

1) Probability category
2) Consequence category

According to the resulting risk-score of the various hazardous events in the risk matrix, the most serious hazardous events are identified. Risk reduction options to prevent the hazardous event or to neutralize its consequences are identified. The needed efforts (in terms of costs, time, organization, training, etc.) and the reduction of risk of the various risk reduction options are roughly evaluated. Finally a priority list for risk reduction options (with deadlines) is formulated.

In summary, a CRA is a rather simple semi-quantitative risk analysis method. The main focus is on identifying major hazardous events, and then ranking these with respect to their contribution to risk. Note that the CRA does not provide a score of the total risk of the water utility. No detailed modelling and calculations are needed, and the analysis may be carried out by professionals with good system knowledge. However, the CRA requires good information and knowledge about the system including surroundings. Hazard identification is usually based on some kind of expert judgement, e.g. using experience from the past, check lists, or a combination of these. If statistics about hazards are not available the CRA will rely on expert judgements to estimate the risk and define appropriate risk reduction options. Normally a CRA is not very time consuming. However, this depends on the size and complexity of the system to be analysed.

3.3 Tool for Coarse Risk Analysis (CRA)

A tool for carrying out a coarse risk analysis was developed. The tool is applicable for small, medium and large water companies and does not require computational skills. The tool itself is also an aid for organizing the data generated as a part of the coarse risk analysis. The structure of the tool is a database, which makes future updating of the tool easier. The user interface for carrying out the analysis is shown in Figure 8. By clicking on the acronym of a potential hazardous event the corresponding risk registering dialog box for the relevant hazardous event appears, as shown in this figure. The various fields of the interface are explained in Table 5.

Figure 8. User interface for the CRA tool (registration of potential hazardous events).

Likelihood (probability) and consequence are given as categories. The consequence classes can be specified by two dimensions. In the example below (in Figure 9) duration and exposure are chosen. Duration of e.g. illness or lack of supply can, for example, be classified as:
1. 0-6 hrs
2. 6-24 hrs
3. 1-7 days
4. 1-4 weeks
5. 1-6 months
6. > 6 months

Exposure, i.e. number of affected persons, can be given as:
1. 1-10
2. 10-100
3. 100-1 000
4. 1 000-10 000
5. 10 000-100 000
6. > 100 000

As seen in Figure 9 this can be used to define four consequence categories (from Low to Very high).
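The duration and exposure classes listed above can be sketched as a simple lookup. Since adjacent classes share a boundary value (e.g. "1-10" and "10-100"), the sketch assumes that each upper bound belongs to the lower class, and it takes one month as 30 days; both are assumptions, not statements from the report. How the two class numbers combine into the four consequence categories is defined in Figure 9 and is not reproduced here.

```python
from bisect import bisect_left

# Upper bounds of classes 1-5; anything above the last bound is class 6.
# Assumptions: upper bounds are inclusive, one month = 30 days.
DURATION_UPPER_HOURS = [6, 24, 7 * 24, 4 * 7 * 24, 6 * 30 * 24]
EXPOSURE_UPPER_PERSONS = [10, 100, 1_000, 10_000, 100_000]


def duration_class(hours: float) -> int:
    """Return the duration class (1-6) for a duration given in hours."""
    return bisect_left(DURATION_UPPER_HOURS, hours) + 1


def exposure_class(persons: int) -> int:
    """Return the exposure class (1-6) for a number of affected persons."""
    return bisect_left(EXPOSURE_UPPER_PERSONS, persons) + 1


# Hypothetical event: supply interrupted for 36 hours, affecting 2 500 persons.
print(duration_class(36), exposure_class(2_500))
```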

The probabilities of occurrence are defined as small (P1). source. Waterworks 31 . Components A description of the components (e. measuring remaining pipe wall thickness by non-destructive testing) Risk reduction Based on the resulting risk matrixes. (e. Comment The user selects which waterworks the analysis belongs to. It will also serve as a justification for the assessed consequences making it easier to review the estimated values. The user must select event from a drop down text. for treatment plant the following detailed elements might be analysed: coagulation. medium (C2). A check list of event/hazardous possible events is available from the “Hazard database” developed as a event part of Techneau.g. Probability The probability for the undesired event to occur. Cause The underlying cause for undesired event. The user must select cause from a drop down text. large (P3) or very large (P4). catchment. Cause description A more detailed description of the underlying cause of event can be given. medium (P2). intake. Manageability Description on how the risk can be managed i. water quality and loss of reputation/ reputation/direct economic loss. pH).g. The probability must be estimated by the user either based on available data or expert evaluation. Consequences The possible consequences resulting from the event are described as small (quality. The barriers can be existing barriers and possible future barriers. These can either be physical options or implementation of critical control points (CCP) for controlling the risk in real time. Analysis object Describes which element in the water supply system is analysed (e.g. A checklist for possible causes can be found in the “Hazard database” developed in Techneau. One water company might have several waterworks and some hazardous events might be unique for one of the waterworks.e. UV. The user must select analysis object from a drop down text. how and what can be modelled and/or measured to control the process. 
For the terms quality and delivery/quantity the economic) duration and the number of involved persons Barriers Identification of barriers (c.Methods for risk analysis of drinking water systems from source to tap Table 5. the need for risk reduction options for options /CCP each of the undesired events might be introduced. Undesired Description of the undesired event or hazardous event. large (C3) and very large C4). Some guidance on assessing the probability is given within the tool.f Bow-tie diagram) reducing both the probability and consequences for the event. Vulnerability Description of how vulnerable the system is if the analysed elements fails (e. filtration. Description of user interface (CRA tool) in Figure 8. chlorination. (C1). consist of 3 elements: quantity/delivery.g. Might also be used indirectly for assessing the consequences. Detailed Detailed description of the analysis object (e. Assessing whether the barriers reduces the Probability (P) or the Consequences (C) might be useful. The consequences delivery/quantity. CO2.g. The user must select detailed analysis object from a drop down text. two pumps in parallel) Description A more detailed description of the consequences of the event might be given here. water treatment plant). if the water company has alternative sources of backup supply the water supply will be less vulnerable).

Figure 9. Specification of the consequence categories by duration and exposure (CRA tool).

So, one outcome of the analysis of a specific hazardous event is the risk, given by probability (likelihood), P, and consequence, C, for the various “dimensions” of risk considered. This set (P, C) is to be inserted in the risk matrices, see Figure 10. There can for instance be one matrix for quality (life and health), one for quantity (delivery) and one for reputation/economic. Each hazardous event will be shown in the risk matrices by a symbol.

In Figure 10 “green” risk indicates that the risk is tolerable and there is no need for risk reduction options, while “red” risks indicate that the risk is not tolerable and there is need for risk reduction options. “Yellow” risks indicate that the need for risk reduction options should be discussed. By this splitting of outcomes in three categories, i.e. red, yellow and green, we adopt the ALARP principle, ref. [2]. The decisions on the distribution of the red, yellow and green areas of the risk matrix reflect the applied risk acceptance criteria of the water utility. Decisions on these acceptance criteria are to be taken by the management. Note that the colour coding within the risk matrix in the CRA tool is defined by the user by editing the individual cells of the risk matrix.
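The use of a colour-coded risk matrix can be sketched as follows. The colour assigned to each cell below is purely hypothetical: in the CRA tool the cells are edited by the user, and the allocation reflects the acceptance criteria decided by the management of the utility. Only the categories P1-P4 and C1-C4 and the meaning of the three colours are taken from the text.

```python
# Hypothetical colour coding; rows are probability categories, and the four
# entries in each row correspond to consequence categories C1..C4.
COLOUR_MATRIX = {
    "P4": ["yellow", "red", "red", "red"],
    "P3": ["green", "yellow", "red", "red"],
    "P2": ["green", "green", "yellow", "red"],
    "P1": ["green", "green", "green", "yellow"],
}

# Interpretation of the colours as described for Figure 10.
ACTION = {
    "green": "tolerable, no need for risk reduction options",
    "yellow": "need for risk reduction options should be discussed",
    "red": "not tolerable, risk reduction options needed",
}


def evaluate(p_category: str, c_category: str):
    """Return (colour, action) for a (P, C) pair such as ("P2", "C3")."""
    column = int(c_category[1:]) - 1  # "C1" -> 0 ... "C4" -> 3
    colour = COLOUR_MATRIX[p_category][column]
    return colour, ACTION[colour]


print(evaluate("P2", "C3"))
```

Keeping the colour allocation in a separate table mirrors the idea that the acceptance criteria can be changed without touching the analysis of the individual hazardous events.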

Figure 10. The risk matrix for water quantity (delivery). Identical matrices are given for quality (life and health) and loss of reputation/economy (CRA tool).

An example of the application of the tool is given in the Bergen case study report, [65]. In that case the following undesired events, which might take place in the distribution system, were identified:

I Failures in hygienic barriers (water quality)/ intrusion of contaminated water into network:
• Contamination in water tanks (water surface)
• Intrusion due to low pressure/non-pressurised network
  o Operational and maintenance situations (e.g. valve operations)
  o Power failure
  o Work on non-pressurised network (e.g. repair, rehabilitation, construction)
  o Fire (huge water demands might lead to low pressure)
  o Water mains failure (might lead to non-pressurised system)
  o Incorrect operation of valves
  o Failure at pumping stations in zones without water tanks
  o Water hammer
  o Pipe fracture, valve closes without intention
  o Water tanks emptied due to communication error
  o Extraordinary water demand/tapping
  o In-pipe processes
• Cross-connection/backflow
  o Unintended backflow from building
  o Sabotage (intended backflow from building)

II Failures of water delivery/quantity:
  o Operational and maintenance situations (e.g. valve operations)
  o Pipe failures
  o Rockslides/rockfall in tunnel
  o Water tanks emptied due to communication error
  o Failure at pumping stations
  o Failure of equipment (e.g. valves)

4 Quantification of risk

Depending on which aspects of risk are considered, risk can be quantified in various ways. In this chapter various ways to quantify (measure) risk are discussed.

4.1 The dimensions of risk and various ways to quantify risk

The TECHNEAU project applies the very common definition of risk (Section 1.4), as a combination of the probability (frequency) of the occurrence of specified hazardous events and the consequence(s) of these events. Risk is often expressed in terms of these probabilities and consequences. Several types of potential consequences can be considered in a risk analysis of a water supply system. One refers to the various “dimensions” of risk, representing the different types of consequences, and each of these risk dimensions can be quantified. For the consumer it is important to be supplied with water of good quality, but there should also be enough water. So the TECHNEAU project focuses on the quality and quantity of the water supply, both aspects essential for the consumers’ risk.

In order to quantify risk related to water quality, the complete water supply chain should be considered. Some examples of risk measures for water quality are:
1. Probability of a specific degree of contamination/pollution of the water source.
2. Probability of a specific failure of the treatment system, resulting in contaminated water entering the distribution network.
3. Probability that one litre of drinking water at tap contains a certain parasite (meaning that contaminated water is delivered to consumer).
4. Mean number of consumers getting adverse health effects caused by drinking water (due to a certain hazardous event).

Risk related to water quality is not necessarily measured in terms of the quality of water delivered to consumers (item 3 on the list), or as the actual health effects for the consumers (item 4). For a water utility it can be useful also to estimate the risk of a water source being polluted or of a failure of the treatment system (items 1 and 2). In item 4, “Mean number of persons getting adverse health effects”, a quantification of risk is applied where probability and consequence are combined into one figure. This means that a rather traditional definition of risk as the “mean loss” (probability x consequence)⁴ is applied. The estimated risk of the various hazardous events can also be aggregated, in order to give an expression of the total risk of a water supply system. Various measures for water quality are discussed in Section 4.3.

When risk related to water quantity shall be quantified, it should be noted that loss with respect to water quantity/availability depends on (see Section 4.4):
• frequency of interruptions of water supply,
• duration of the interruption,
• exposure, i.e. number of consumers being affected.

⁴ Note that some fundamental probability concepts are discussed in Appendix F.
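The three factors listed above (frequency, duration and exposure) can be combined into a single figure. The sketch below uses expected consumer-hours without supply per year as that figure; this particular measure, the class and function names, and all numbers are assumptions made for illustration and are not taken from the report.

```python
from dataclasses import dataclass


@dataclass
class InterruptionEvent:
    """A hypothetical hazardous event causing interruption of supply."""
    name: str
    frequency_per_year: float   # expected number of occurrences per year
    duration_hours: float       # mean duration of one interruption
    consumers_affected: int     # exposure: consumers affected per interruption


def expected_consumer_hours(event: InterruptionEvent) -> float:
    """Mean loss per year: frequency x duration x exposure."""
    return event.frequency_per_year * event.duration_hours * event.consumers_affected


def total_expected_consumer_hours(events) -> float:
    """Aggregate the mean loss over several hazardous events."""
    return sum(expected_consumer_hours(e) for e in events)


# Hypothetical events, for illustration only.
events = [
    InterruptionEvent("Pipe failure in zone A", 0.5, 8, 2_000),
    InterruptionEvent("Failure at pumping station", 0.02, 48, 50_000),
]
for e in events:
    print(e.name, expected_consumer_hours(e))
print("Total:", total_expected_consumer_hours(events))
```

A frequent, short, local interruption and a rare, long, widespread one can give similar totals, so it can be useful to report the three factors alongside the aggregated figure.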

Note that even without interruption of the water flow at the consumer's tap, the water may be delivered at a pressure which is too low (e.g. for appliances to work). So excessively low water pressure is also a risk to water quantity (and excessively high pressure is a hazard, potentially causing leakage or bursts in plumbing installations).

Thus, loss of water quality and loss of water quantity are the two most important “dimensions” of risk for a water utility. But note that if the analysis of water quality is restricted to the effect on human health, then environmental impact is another dimension of the risk. This risk can also be measured in various ways, e.g. in terms of the frequency of polluting events and the exposure (e.g. number of affected species/animals). In addition, the water utility can experience loss of reputation (consumer trust), which is more difficult to measure, but these losses can also have economic consequences. Further, consumers (e.g. certain industries) and the water utility itself may experience economic losses, which are most reasonably expressed in monetary units (e.g. Euro). In principle, it is possible to measure all losses, related to water quality, quantity and environment, as economic losses, and in this way give an overall measure of the total risk.

Finally, we mention societal risk, which is the risk related to major events, e.g. events causing main functions of society to be at risk. This is certainly relevant for a major infrastructure like the water supply (either lack of water or polluted water, affecting many consumers or an institution like a hospital). Specific risk measures could be designed to express these risks as well. However, the present report focuses on risks for the consumers and for the water utility.

Various ways to measure (quantify) risk will be discussed in more detail below. Some measures are “common”, i.e. can be used for various dimensions of risk, and others are related to a specific dimension, such as quality or quantity.

4.2 Qualitative versus quantitative expressions for risk

As stated above, risk is usually measured by the severity of some unwanted consequence, C, and the likelihood (i.e. probability, p, or frequency, f) that this consequence occurs. Various types of consequences (losses) can be considered. Often we want to rank various risks, and so the C- and p-values are quantified to give an overall measure of the risk, e.g. R = p × C. This quantification can be time consuming. Also note that a risk expressed in ‘detailed’ numbers suggests an exactness that may not be justified, because it has been derived from assumed probabilities or ranges of numbers described in the literature. So there is a danger of creating a false sense of precision of the result.

A ranking can also be carried out qualitatively, without specifying p- and C-values for each risk. One possibility is to apply paired ranking, i.e. comparing pairs of risks: each risk is compared to every other risk, specifying which of the two is greater [7]. This should give an explicit weighting, but again with the danger of giving a false sense of precision. The process could also be very time consuming and complicated, because “experts” are not always consistent (agreeing) in their evaluations of paired comparisons.

A common qualitative approach is to apply a classification of risk. Probabilities and consequences are divided into categories. For probability, categories such as ‘rare’ and ‘frequent’ are used. Consequences could be categorised as ‘small’, ‘medium’ and ‘catastrophic’. These categories are a ranking of likelihood and consequences. The categories can also be defined by intervals; for instance, the probability category ‘rare’ could be defined as ‘less than once a month’. Similarly, the consequence category ‘small’ with respect to health effects could be defined as ‘at most 10 consumers with minor health effects’, etc. In this case the term semi-quantitative approach is used (not fully quantitative, but placed into pre-determined categories).

Based on categories for probability and consequence, a risk matrix can be made; for an example based on WHO [8], see Figure 11. In this figure risk categories (1-9) are given (note that this is an example). Also observe that the WHO definition of the likelihood (probability) category “Almost certain” equals “Once per day”. Events that frequent are seldom included in a risk analysis.
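The quantitative ranking R = p × C described at the start of this section can be sketched in a few lines. This is only an illustration: the event names, frequencies and consequence values below are hypothetical, and the consequence is assumed to be expressed as the number of persons affected per event.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    frequency_per_year: float  # likelihood expressed as a frequency f
    consequence: float         # severity C, assumed here: persons affected per event


def rank_by_expected_loss(risks):
    """Return (name, R) pairs sorted by R = f * C, largest first."""
    scored = [(r.name, r.frequency_per_year * r.consequence) for r in risks]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical hazardous events; the numbers are illustrative, not data.
example_risks = [
    Risk("pipe burst", frequency_per_year=2.0, consequence=50.0),
    Risk("treatment failure", frequency_per_year=0.1, consequence=5000.0),
    Risk("source contamination", frequency_per_year=0.01, consequence=20000.0),
]

for name, r in rank_by_expected_loss(example_risks):
    print(f"{name}: R = {r:g} persons affected per year")
```

The ranking is only as good as the assumed p- and C-values, which is the false sense of precision the text warns about; rerunning the sketch with the low and high ends of each assumed range shows how stable the order is.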
                     Severity of consequences
Likelihood           Insignificant   Minor   Moderate   Major   Catastrophic
Almost certain       5               6       7          8       9
Likely               4               5       6          7       8
Moderately likely    3               4       5          6       7
Unlikely             2               3       4          5       6
Rare                 1               2       3          4       5

Examples of definitions of likelihood (probability) and severity (consequence) categories that can be used in risk scoring

Item                 Definition
Likelihood categories
Almost certain       Once per day
Likely               Once per week
Moderately likely    Once per month
Unlikely             Once per year
Rare                 Once every 5 years
Severity category
Catastrophic         Mortality expected from consuming water
Major                Morbidity expected from consuming water
Moderate             Major aesthetic impact possibly resulting in use of alternative but unsafe water sources
Minor                Minor aesthetic impact possibly resulting in use of alternative but unsafe water sources
Insignificant        Not detectable impact

Figure 11. Example of a risk matrix and definitions of likelihood (probability) and severity (consequence) categories to be used in risk scoring in WSP (WHO, [76]). Suggested risk categories, 1-9, are added here as an example (not included in the WHO report).
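The example risk categories in Figure 11 can be looked up programmatically. The sketch below relies on an observation rather than on anything stated in the report: every number in the matrix equals the likelihood rank plus the severity rank minus one. The function and constant names are hypothetical.

```python
# Category names as used in Figure 11, ordered from lowest to highest rank.
LIKELIHOOD = ["Rare", "Unlikely", "Moderately likely", "Likely", "Almost certain"]
SEVERITY = ["Insignificant", "Minor", "Moderate", "Major", "Catastrophic"]


def risk_category(likelihood: str, severity: str) -> int:
    """Return the example risk category (1-9) shown in Figure 11.

    Assumption: the matrix is reproduced by rank(likelihood) + rank(severity) - 1.
    Raises ValueError for an unknown category name.
    """
    l_rank = LIKELIHOOD.index(likelihood) + 1  # 1 = Rare ... 5 = Almost certain
    s_rank = SEVERITY.index(severity) + 1      # 1 = Insignificant ... 5 = Catastrophic
    return l_rank + s_rank - 1


# Scoring two hypothetical hazardous events.
print(risk_category("Unlikely", "Major"))
print(risk_category("Likely", "Minor"))
```

A utility using a differently weighted matrix would replace the additive rule with an explicit lookup table.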


A risk matrix is the most common way to present risk when a semi-quantitative approach is chosen, see Chapter 3. In particular it is used to prioritize and distinguish between important and less important hazardous events, as the risk for each hazardous event is assessed and inserted in the risk matrix. As there are various dimensions of risk, we can design one matrix for each dimension; e.g. one for health effects (loss of quality) and one for water availability (loss of quantity).

4.3 Risk measures for loss of water quality

Considering the total system, from source to tap, there could be various quantifications related to loss of water quality, i.e. the measures could be related to:
1. Quality of source water, treatment technology and distribution network.
2. Health effects for consumers.
3. Effects on the consumers’ acceptability.
4. Effects on the distribution and plumbing systems and equipment (e.g. corrosion).

Here the measures related to 1 will be interesting only as they say something about the potential to avoid the consequences 2, 3 and 4. Some examples are given below.

1. Quality of water source, treatment technology and distribution network. For a water utility it can be useful to estimate the risks related e.g. to pollution of raw water or to treatment failures, and so the following are examples of risk measures:
• Probability (frequency) of specific degrees of contamination/pollution of the water source.
• Probability of failure of specific treatment systems.
• The probability of one litre of treated water containing a certain parasite.
• Probability of pollution entering the distribution network.

2. Health effects for consumers. The risk of contaminated water to human health can be characterised in a number of ways. For instance, one can give the risk per person and then in addition the number of persons exposed. The risk per person can be described by a probability distribution, and the measure could be given by the mean, the median or e.g. the 95th percentile; cf. the Microrisk project (www.microrisk.com). Thus, the following are some risk measures related to health effects for consumers:
• Mean number of consumers who during one year have serious health effects caused by bad drinking water.
• Frequency, f, of events resulting in at least N consumers getting ill (adverse health effects), with, say, N = 1000.
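As a minimal sketch of the two kinds of health-related measures just listed, the code below summarises a sample of per-person risk values by mean, median and 95th percentile, and counts the frequency of events with at least N persons ill. The function names and the input data are hypothetical; how the per-person risk sample is obtained (e.g. by simulation) is outside the sketch.

```python
import statistics


def summarise_individual_risk(samples):
    """Mean, median and 95th percentile of sampled per-person annual risk."""
    ordered = sorted(samples)
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = statistics.quantiles(ordered, n=20, method="inclusive")[-1]
    return {
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p95": p95,
    }


def frequency_of_large_events(ill_per_event, years_observed, n_threshold=1000):
    """Frequency f (per year) of events with at least n_threshold persons ill."""
    large = sum(1 for ill in ill_per_event if ill >= n_threshold)
    return large / years_observed


# Hypothetical inputs, for illustration only.
risk_sample = [1e-6, 2e-6, 2e-6, 5e-6, 1e-5, 3e-5, 1e-4, 4e-4]
print(summarise_individual_risk(risk_sample))
print(frequency_of_large_events([10, 1500, 2000, 50], years_observed=20))
```

For skewed risk distributions the mean and the median can differ by orders of magnitude, which is one reason to report more than one summary value.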

Finally, note that a general risk measure for overall health effects is DALY (= Disability Adjusted Life Years), which is the measure of health effects used by WHO. The use of DALY is discussed in Appendix C. However, DALY is probably too complex for most water utilities to use.

3. Effects on the consumers’ acceptability. This may arise from the occurrence of taste, odour, colour or turbidity (and may further entail economic risks, such as loss of reputation). Some risk measures are:
• Probability that water delivered to the consumer has unacceptable odour/smell.

4. Effects on the distribution and plumbing system equipment. There could for instance be an effect on pumps and appliances, due to water aggressiveness (e.g. corrosion) or hardness (e.g. incrustations). The risk measure should quantify this damage.

4.4 Risk measures for loss of water quantity (supply)

Generally, there should be a high availability of the water supply (i.e. a high probability of every consumer being supplied), and further, supply should be provided with proper flow and pressure. In addition to the consequences (i.e. how deficient flow and pressure are), risk measures related to water quantity could consider either the number of affected consumers, the frequency and the duration, or a combination of these. To give some examples, loss of water quantity can be measured e.g. as
• Probability (fraction of the time) that an arbitrary consumer is without water supply (or supply is insufficient).
• Frequency of events resulting in failure to supply water to at least 1000 consumers.
• Mean number of consumers affected by shortage (when supply is insufficient).
• Volume of water missing (when supply is insufficient).
• Customer Minutes Loss (CML), i.e. the average number of minutes that drinking water is not delivered to an average consumer.

Here also a generalisation is introduced, Substandard Supply Minutes (SSM), i.e. the number of minutes the average consumer is supplied with drinking water that does not comply with existing quality and/or quantity standards, which gives an overall measure for loss of both water quality and water quantity.

In general, the average water unavailability (fraction of time without water) for a consumer should be a reasonable measure for water quantity. For instance, the mean number of days without water supply, aggregated over all consumers, could be such a measure of risk. The frequency of interruption of supply (e.g. affecting at least 1000 consumers) is another. However, one long delivery interruption does not necessarily represent the same risk as ten small interruptions, even if the total time without supply is the same. For specific types of industries a short interruption might have approximately the same consequences as a longer one, even though the interruption’s contribution to the yearly unavailability can be small compared to a more long-lasting stop. A similar argument could apply for residential consumers: 500 persons losing water supply for one month (30 days) may be considered worse than 15 000
losing water for 1 day, even if both events give the same contribution to overall water unavailability. So, measuring water quantity by the average unavailability of supply may not be sufficient. Thus, in more advanced approaches we could distinguish between long and short durations of the interruptions in water supply, and then both the frequency and durations of interruptions should be given. The unavailability of water should be evaluated both with respect to planned and unplanned activities. Examples of unplanned activities are pipe bursts, wrong valve operation etc.

4.5 Risk measured in monetary units

Risks and risk reduction can be valued in monetary units in order to (1) express all risks in a common unit and (2) facilitate economic analyses, e.g. cost-effectiveness or cost-benefit analyses, for prioritising between risk reduction options. An overview of different valuation principles and methods is presented in a TECHNEAU report on risk management [2].

Economic valuation of market goods, i.e. goods traded in the common market, does not usually pose any large problems. Economic valuation of non-market goods, such as the reduced risks to human health from drinking-water consumption, is generally more problematic. Several studies [10-13] provide detailed and extensive information on economic valuation methods for non-market goods. Three groups of valuation methods for non-market goods can be distinguished [14]:
1. Revealed preference methods (RPM)
2. Stated preference methods (SPM)
3. Methods that are less strongly founded in economic theory

We will describe the first two groups. First, the revealed preference methods (RPMs) are based on individuals’ actual behaviour on an existing market, making use of a relation between the market and the non-market goods. Thus, the relationship between goods on a market and, for example, the reduced health risk from consuming drinking water is used for indirect valuation of the risk reduction. RPMs include (a) the production function method, (b) the travel cost method, (c) the hedonic price method, and (d) the replacement cost method and the restoration cost method. An example of a revealed preference valuation is to investigate the decrease in sales of bottled water after installing a new treatment system to decrease the health risks of the drinking water. A shortcoming of the RPMs is that they are capable of valuing only parts of the total economic value (TEV) of e.g. a risk reduction.

In the second group we describe the stated preference methods (SPMs), which are capable of measuring the TEV, including both use and non-use values (see e.g. [15]). The principle of SPMs is that a scenario is presented to a randomly selected group of individuals. Each individual has to decide on the scenario through interviews or questionnaires. The most common SPM is the contingent valuation method, where the individuals are asked about their willingness to pay (WTP) for a suggested change in the scenario. A closely related SPM is choice experiments, where the individuals have to make a choice among different
situations. Based on these experiments, it is possible to derive a willingness-to-pay (WTP) model, for example of how much individuals are willing to pay for different well-defined levels of drinking water safety.

Economic valuation of non-market goods is still to some extent controversial. However, extensive research and applications in the field of environmental economics over the last decades have resulted in greatly increased knowledge regarding the possibilities and limitations of valuations of e.g. saving a statistical life and ecological improvements, and important improvements have been made on various types of SPMs; see e.g. [16].

In the drinking-water sector, economic valuation is being increasingly used in order to achieve cost-effective asset management. Especially in the UK, where the drinking-water industry is privatised, economic valuation is common. A successful example often referred to is the Yorkshire Water utility, which uses economic valuation of risks, based on stated preference surveys, as an integral part of its asset management (see [17]).
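Returning to the quantity measures of Section 4.4, the comparison of 500 persons without water for 30 days against 15 000 persons without water for 1 day can be checked numerically. The sketch below is illustrative only: the class and function names are hypothetical, and the total number of consumers (100 000) is an assumed figure that is not taken from the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interruption:
    consumers_affected: int
    duration_days: float


def average_unavailability(events, consumers_total, period_days=365.0):
    """Fraction of the period an arbitrary consumer is without supply."""
    lost = sum(e.consumers_affected * e.duration_days for e in events)
    return lost / (consumers_total * period_days)


def customer_minutes_lost(events, consumers_total):
    """Minutes without supply per average consumer (a CML-style measure)."""
    lost_minutes = sum(
        e.consumers_affected * e.duration_days * 24 * 60 for e in events
    )
    return lost_minutes / consumers_total


CONSUMERS_TOTAL = 100_000  # assumed size of the supply area

one_long = [Interruption(consumers_affected=500, duration_days=30)]
one_wide = [Interruption(consumers_affected=15_000, duration_days=1)]

for label, events in (("500 x 30 days", one_long), ("15 000 x 1 day", one_wide)):
    print(
        label,
        average_unavailability(events, CONSUMERS_TOTAL),
        customer_minutes_lost(events, CONSUMERS_TOTAL),
        max(e.duration_days for e in events),  # longest single interruption
    )
```

Both scenarios give the same average unavailability and the same minutes lost per consumer; only the duration column separates them, which is why the text recommends reporting frequency and duration alongside the average.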


5 Data for risk analysis

5.1 Introduction

Available and accurate data are essential for achieving reliable results from a risk analysis. Data is needed for the system description, hazard identification, risk estimation and risk reduction option identification and implementation. The level of detail of the needed data depends on the methods used for risk analysis, the required level of detail of the analysis and the need for accuracy of the results. Data requirements for a coarse qualitative risk analysis differ from the requirements for a detailed quantitative risk analysis. Some relevant types of data are listed below.
• Environmental and geographical data are necessary to identify possible hazards and to obtain an understanding of the environment where the water system is located. This information can allow evaluation of the dose and frequency of a contamination of the source, and identification of possible contamination points.
• Information about the specific layout of the system is essential to establish a system model and to gain an understanding of the system as a whole.
• Technical data are needed to understand the functions of the technical systems and to identify the barriers.
• Operational and maintenance data are needed to determine availability and reliability of components, subsystems or the entire system.
• Specific data about the reliability of barriers in the system is essential. The treatment systems will be of special importance.
• Knowledge about removal efficiencies.
• Knowledge about the effects of the identified hazard on consumers is also required.
• Guideline values and national standards.

5.2 Data needs

The various types of data needed can be considered in three categories:
• Generic data: Data from external data sources (not data from the water utility under investigation). Data could relate to e.g. effectiveness of different types of water treatment or the effect that different types of pollutants have on humans (cf. dose-response results).
• System data: Data describing the entire system of the water supply (from source to tap) in question, e.g. raw water sources, layout of the plant, water treatment methods being used, number of consumers and local conditions relevant to adjust the generic failure data.
• Event data: Monitored data of hazardous events or system failures that have occurred in the past.

Some data needs for risk analysis are summarized in Table 6, following this categorisation.

Table 6. Data needed for risk analysis.

Generic data
  Type of data:
  • Data on health effects of various doses of various pollutants on humans, cf. dose-response (QMRA)
  • Effectiveness of treatment systems for various types of contamination
  • Weights to be used in DALY calculations
  Use:
  • Efficiency of treatment systems (i.e. level of contamination in source being unacceptable)
  • Calculations of risk in terms of DALY
  Data sources:
  • Microrisk website (www.microrisk.com)
  • WHO website
  • Databases available on USEPA websites provide additional information (e.g. for health risk assessment) in comparison to the WHO or Microrisk websites.

System data
  Type of data:
  • Geographical data
  • Layout of the catchment area and source
  • Possible hazards in the catchment area, water source and the distribution system
  • GIS data on hazards
  • Environmental data
  • Treatment systems
  • Water distribution network
  • Number and types of consumers connected to water utility
  • Volume of water consumed per consumer (per day)
  Use: System description is used throughout risk analysis to assess e.g.
  • Hazards
  • Hazardous events
  • Treatment system reliability
  • Exposure and consequences to water quality and human health
  Data sources:
  • Maps
  • Water utility/plant data:
    o Technical drawings
    o Layout drawings
    o Asset databases
    o Maintenance systems
  • Municipality, water utility (GIS maps, water distribution networks etc)
  • Local knowledge
  • On-site inspection

Event data
  Type of data:
  • Failure data for various subsystems (treatment systems / barriers)
  • Data on erroneous operation (human errors)
  • Events that have resulted in contaminated water
  • Preventive and corrective maintenance data
  Use:
  • Reliability and failure rate of equipment and systems
  • Type and frequency of hazardous events
  Data sources:
  • Failure data base of water utility
  • Maintenance system
  • Generic failure data bases
  • Vendor information (e.g. on failures)
  • Reporting system for hazardous/undesired events
  • Local knowledge (e.g. maintenance personnel)

5.3 Data sources

5.3.1 Types of data sources

There are different sources that can be utilized to obtain data. These can be grouped in different categories [67]:
1. External data sources
2. Internal data sources
3. Expert judgement
4. Test data
5. Literature and publications

External data sources can be used for the reliability of technical components and systems and for obtaining the effect different types of pollutants have on humans. In the case of component reliability this type of data source could give valuable data because the operational time where failures are registered is often extensive. It is important to consider the relevance of the data for the specific system in question before utilizing external data sources. Similar systems or barriers might have different external conditions and maintenance, which may affect the reliability (and effectiveness) of the barriers. The effect that different pollutants have on humans is in most cases independent of local conditions, but the structure and sensitivity of the population supplied may be site specific (e.g. hospital or baby sanatorium connected to the network).

Internal data sources can be data monitored in a CMMS system (Computerized Maintenance Management System) or a SCADA (Supervisory Control And Data Acquisition) system, which can be important sources for reliability data. Expert judgement and testing can either be from external sources or internal. Expert judgement is used where no reliable data is available. Testing of systems can also be used, either in operation or in laboratories.

5.3.2 Failure event data bases

Information about reliability of equipment is defined by Rausand and Høyland (2004) [23] as information about the failure/error modes and time to failure distributions for hardware, software and humans. The reliability of specific systems or components is often not site specific. It is therefore possible to collect reliability data from different sites (systems) into a common database. This is done in other industries than the drinking water industry, such as in the offshore industry in Offshore REliability DAta [68]. Such a database should contain the following information:
• (Hazardous) events
• Failures of various components/equipment (incl. failure mode, repair time etc.)
• Inventory of equipment, giving number of various types of components, operational times, etc.
• Various environmental and operational data that are (assumed) relevant for the performance of the systems/components.

Important tasks when collecting data are to
• Establish a common format for such a database (making it easy to transfer data)
• Encourage exchange of data across water utilities (and countries)
• Develop analysis techniques to better utilise the information provided by such a database.

Such a database will make risk analyses more reliable, and once a database is established the risk analysis will be less costly and time consuming.
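One simple use of a common failure event database is to estimate a failure rate from pooled records: the number of failures divided by the accumulated operational time. The sketch below assumes a constant failure rate and a hypothetical record layout (`SiteRecord` and its fields are not taken from any existing database); pooling is only reasonable when the sites are comparable, as noted in Section 5.3.1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteRecord:
    site: str
    component_type: str
    units_in_service: int
    years_observed: float
    failures: int


def pooled_failure_rate(records, component_type):
    """Failures per unit-year for one component type, pooled over sites."""
    selected = [r for r in records if r.component_type == component_type]
    exposure = sum(r.units_in_service * r.years_observed for r in selected)
    failures = sum(r.failures for r in selected)
    if exposure == 0:
        raise ValueError(f"no operational time recorded for {component_type!r}")
    return failures / exposure


# Hypothetical records from two utilities, for illustration only.
records = [
    SiteRecord("utility A", "pump", units_in_service=4, years_observed=5, failures=2),
    SiteRecord("utility B", "pump", units_in_service=6, years_observed=10, failures=6),
    SiteRecord("utility B", "valve", units_in_service=20, years_observed=10, failures=1),
]

print(pooled_failure_rate(records, "pump"))
```

With few failures the estimate is very uncertain, so a real analysis would report a confidence interval alongside it.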

6 More advanced risk analysis methods for water supply systems

This chapter gives an overview of various methods for risk assessments of a water supply system. The objective is to explain the capabilities of the various methods and the situations where they can be beneficially applied. The methods presented here give analyses of different levels of detail and complexity, but they are more advanced than the simple CRA described in Chapter 3. Most risk analysis methods described in this chapter are quantitative, but there will also be qualitative analyses. The risk analysis could be a detailed analysis of a specific subsystem or process, and could also include evaluation of human and operational activities. Further, an overall analysis to assess the total risk of the water utility can be carried out, possibly with respect to both quality and quantity. First a general introduction to some main models and approaches is given; next the various methods are described in some detail.

6.1 Risk modelling and choice of a risk analysis method

In order to investigate consequences, event trees (Section 6.10) are often used (see footnote 5). A simple example is found in Figure 12. Here the hazardous event “microbiological contamination of source water” has the meaning “Either there is an unacceptable concentration of this contamination in source water”, or “there is a microbiological hazard that the treatment system is not designed for”. In this example there are two barriers to reduce consequences to consumers: disinfection and monitoring. A branch of the event tree indicates whether the preceding barrier functions or not (the first branching point of the event tree corresponds to “Is disinfection system of water OK?” If “yes”, the upper branch is chosen). Depending on whether the barriers are effective or not, there are then three possible consequences for the consumers:
• Contaminated water is disinfected (no effect for consumers).
• Disinfection does not work, but contamination is revealed by monitoring, resulting in corrective action.
• Neither disinfection nor monitoring works; thus, contaminated water is delivered to consumers.

The causes of the hazardous event or the failure of a barrier can be analysed using fault trees (Section 6.5). This is also illustrated in Figure 12, here considering a failure of the barrier disinfection. The fault tree uses specific symbols to “break down” the causes of such an event, and in this example it is assumed that there are two systems for disinfection that both must fail in order to cause water not being disinfected. Further, there are two events (Event 1 and Event 2) that both can cause failure of system 1 (similar for system 2).

Footnote 5: In this Section we will refer to some analysis techniques that will be properly defined later in the Chapter.

Figure 12. Two simple examples of risk analysis methods

An overall risk model related to a hazardous event can then be provided, as illustrated in Figure 13:
• Analysing the causes of the hazardous event (here using a fault tree), see left part of the figure.
• Analysing the consequences of the undesired events (using an event tree), see right part of the figure.
• Analysing one or more of the safety barriers of the event tree (see Barrier 3, again using a fault tree).

In Figure 13 the undesired (hazardous) event is the “top event” in a fault tree, describing the causes of the undesired event. The causes could be a combination of hazards, hazardous events and malfunctioning barriers. After the undesired event has occurred, the further development of the event chain depends on the functioning of safety barriers. This is illustrated by an event tree in the right part of Figure 13. Each safety barrier in the event tree may also be influenced by a set of hazards or hazardous events, as illustrated for barrier 3, where a malfunction of the barrier is represented by the top event in a fault tree. So during the analysis we consider events that may cause a malfunctioning water supply system, and also the effectiveness of different barrier systems for treatment or detection.
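The Figure 12 example described in Section 6.1 can be quantified with a short sketch that combines the fault tree (causes of disinfection failure) with the event tree (disinfection, then monitoring). Several assumptions go beyond the text and are labelled in the code: all basic events are independent, the two disinfection systems have identical event probabilities and no common-cause failure, and the input probabilities are hypothetical.

```python
def or_gate(*probabilities):
    """Probability that at least one of several independent events occurs."""
    none_occur = 1.0
    for p in probabilities:
        none_occur *= 1.0 - p
    return 1.0 - none_occur


def and_gate(*probabilities):
    """Probability that all of several independent events occur."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result


def figure12_outcomes(p_event1, p_event2, p_monitoring_fails):
    """Outcome probabilities, conditional on contaminated source water."""
    # Fault tree: a disinfection system fails if Event 1 OR Event 2 occurs.
    # Assumption: system 2 has the same event probabilities as system 1.
    p_system_fails = or_gate(p_event1, p_event2)
    # The disinfection barrier fails only if system 1 AND system 2 both fail
    # (assumed independent, i.e. no common-cause failure).
    p_disinfection_fails = and_gate(p_system_fails, p_system_fails)

    # Event tree: first branch on disinfection, then on monitoring.
    return {
        "disinfected, no effect for consumers": 1.0 - p_disinfection_fails,
        "revealed by monitoring, corrective action":
            p_disinfection_fails * (1.0 - p_monitoring_fails),
        "contaminated water delivered to consumers":
            p_disinfection_fails * p_monitoring_fails,
    }


# Hypothetical probabilities, for illustration only.
for outcome, p in figure12_outcomes(0.1, 0.1, 0.5).items():
    print(f"{outcome}: {p:.5f}")
```

Multiplying the last outcome by the frequency of the hazardous event gives the frequency of contaminated water reaching consumers, which is the bow-tie reading of Figure 13.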

Figure 13. Outline of an overall risk model, including cause analysis and consequence (C) analysis

The direct consequences are given at the rightmost part of the figure, by five increasing categories (C1-C5), the most serious consequence (C5) usually located at the bottom. Various types of consequences can be considered, e.g. degree of contamination of water to consumer, health effect for consumers, lack of delivery to consumer etc. Note that the overall risk model in the figure above illustrates the bow-tie model from Figure 5 in Section 2.5.

The following are some other examples where the use of more advanced risk analysis methods can be required:
• If a thorough investigation/identification of all hazards is required for a complex system, the use of a Hazard and Operability Analysis (HAZOP) could be relevant (Section 6.2).
• If a system during design is identified to be critical or during operation is observed to have a negative development, this should be analysed to identify possible failure modes and their causes and effects. This could be carried out by a Failure Modes, Effects, and Criticality Analysis (FMECA) (Section 6.3).
• If a component/system goes through various stages of degradation, then a Markov analysis can be carried out to identify a cost-effective preventive maintenance program (Section 6.8).
• If there is a public discussion about the risk to consumers’ health due to inadequate water quality, an assessment of health impacts of microbiological threats in drinking
water could be required, e.g. a QMRA (Quantitative Microbiological Risk Assessment) (Section 6.11).

6.2 HAZOP

Hazard and operability (HAZOP) is a detailed and systematic technique for identifying hazards and operability problems throughout an entire treatment plant or facility. A HAZOP analysis is particularly useful in identifying unforeseen hazards designed into facilities due to lack of information, or introduced into existing facilities due to changes in process conditions or operating procedures. The basic objectives of the analysis are to:
a. Provide a full description of the facility or process, including the intended design conditions.
b. Reveal how deviations from the intention of the design can occur.
c. Decide whether these deviations can lead to hazards or operability problems.

All parts of the systems are evaluated to see how deviations can occur and whether they can cause problems. The approach is briefly described by the following steps:
1. Split the system or process into study nodes.
2. At each study node specify a relevant set of process parameters, such as:
   – temperature,
   – pressure,
   – flow level, and
   – chemical composition.
3. All of the process parameters are used together with a set of predefined guide words (see Table 7) to review the process in a systematic way in order to identify possible deviations that may affect water quantity or quality.

The steps of a HAZOP analysis are illustrated in Figure 14.

Table 7. HAZOP guidewords to review processes

Terms         Definitions (and examples)
No or not     No part of the intended result is achieved (e.g. no flow)
More          Quantitative increase (e.g. high pressure)
Less          Quantitative decrease (e.g. low pressure)
As well as    Qualitative increase (e.g. additional material)
Part of       Qualitative decrease (e.g. only one or two components in a mixture)
Reverse       Opposite (e.g. backflow)
Other than    No part of the intention is achieved, something completely different happens (e.g. flow of wrong material)
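Steps 1 to 3 amount to crossing study nodes, process parameters and guide words to obtain a checklist of candidate deviations for the HAZOP team to assess. The sketch below generates such a checklist; the guide words and parameters are taken from the text and Table 7, while the function name and the node name are hypothetical. Not every combination is physically meaningful, and discarding the irrelevant ones remains a judgement for the team.

```python
from itertools import product

# Guide words and their meanings, as listed in Table 7.
GUIDE_WORDS = {
    "No or not": "no part of the intended result is achieved",
    "More": "quantitative increase",
    "Less": "quantitative decrease",
    "As well as": "qualitative increase",
    "Part of": "qualitative decrease",
    "Reverse": "opposite",
    "Other than": "something completely different happens",
}

# Example process parameters named in step 2.
PARAMETERS = ["temperature", "pressure", "flow level", "chemical composition"]


def candidate_deviations(nodes, parameters=PARAMETERS, guide_words=GUIDE_WORDS):
    """Yield one prompt per (study node, parameter, guide word) combination."""
    for node, parameter, word in product(nodes, parameters, guide_words):
        yield {
            "node": node,
            "parameter": parameter,
            "guide_word": word,
            "meaning": guide_words[word],
        }


# Hypothetical study node, for illustration only.
for item in candidate_deviations(["chlorination"]):
    print(f'{item["node"]}: {item["guide_word"]} / {item["parameter"]}')
```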

Figure 14. Flow diagram for the HAZOP analysis

The HAZOP study is documented in a HAZOP worksheet. An example for a water treatment system (chlorination for water disinfection) is given in Table 8.

Table 8. Example of HAZOP analysis (use of guidewords) for a water treatment system.

Process unit: Water treatment, chlorination
Process parameter: Flow (of chlorine)

Guide word: No
  Deviation: No flow
  Causes: 1. Chlorine supply is empty; 2. Leaking pipe or tank; 3. Valve failed in closed position
  Consequences: Water not disinfected
  Action/solution: (left blank in the example)

Guide word: More
  Deviation: More (too much) flow
  Causes: Miscalibration of equipment
  Consequences: High chlorine concentration in water
  Action/solution: (left blank in the example)

Guide word: Less
  Deviation: Less flow
  Causes: 1. Limited supply; 2. Miscalibration of equipment
  Consequences: Water not disinfected
  Action/solution: (left blank in the example)

Guide word: Reverse
  Deviation: Flow in opposite direction
  Causes: (left blank in the example)
  Consequences: (left blank in the example)
  Action/solution: (left blank in the example)

A HAZOP study may highlight specific deviations for which risk reduction options need to be developed (and implemented).

6.3 Failure Modes, Effects and Criticality Analysis (FMECA)

A good understanding of the functioning of the various subsystems or modules is a prerequisite for safe operation of any system. Therefore a Failure Modes, Effects and Criticality Analysis (FMECA) is often a first step in a reliability analysis. It can also be very useful as part of a risk analysis. In a FMECA subsystems or modules are reviewed to identify failure modes of components (i.e. ways in which they can fail) and the causes and effects of these failures. The analysis aims to answer the following questions:
1. How can each part of the system conceivably fail?
2. What mechanisms might produce these modes of failure?
3. What could the effects be if the failure occurs? How critical is it?
4. Is the failure in the safe or unsafe direction?
5. How is the failure detected?
6. What inherent provisions are provided in the design to compensate for the failure?

The system should first be broken down into a suitable level to adjust the level of detail of the analysis to the purpose and the resources available. The analysis can be carried out for the whole system or restricted to some subsystems or modules. The FMECA is mainly a qualitative analysis, but it can be expanded down to a level of detail where estimates of a failure rate can be obtained.

The FMECA is often carried out during the design phase of a system in order to reveal weaknesses and potential failures at an early stage. The results from the FMECA may also be useful during modifications of the system and for maintenance planning. This risk analysis method is most suited to be applied to the treatment system and distribution network of a water supply system. The method can also be used to evaluate redesign and extension of water supply systems, e.g. when a new consumer with special requirements for water quality or quantity is connected to an existing system. The results of a FMECA are risk reduction options of various types, e.g. redesign, better maintenance, new procedures.

The analysis results are recorded in a specific FMECA worksheet (see example Table 9). There are many variations of FMECA sheets, but this is a rather typical example.
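One row of an FMECA worksheet could be held as a simple record whose fields follow the six questions above. This is only a sketch: the class, the field names and the screening rule in `rows_needing_attention` are hypothetical and not part of the FMECA method itself; the sample values are loosely based on a pump in a treatment system.

```python
from dataclasses import dataclass, field


@dataclass
class FmecaRow:
    unit: str
    function: str
    operational_mode: str
    failure_mode: str
    failure_causes: list
    detection: str
    local_effect: str
    system_effect: str
    failures_per_year: float
    consequence: str
    risk_reduction_options: list = field(default_factory=list)


def rows_needing_attention(rows, max_failures_per_year):
    """Rows above a chosen failure-rate limit that list no reduction options."""
    return [
        row for row in rows
        if row.failures_per_year > max_failures_per_year
        and not row.risk_reduction_options
    ]


# Illustrative rows; the second one has no risk reduction options yet.
rows = [
    FmecaRow("pump", "pump water", "running", "stop while running",
             ["degradation", "corrosion"], "alarm", "pressure loss",
             "reduced water quality from treatment", 0.1, "medium",
             ["maintenance", "stand-by pump", "condition monitoring"]),
    FmecaRow("dosing valve", "disinfectant addition", "running",
             "fails closed", ["fatigue"], "none", "no dosing",
             "water not disinfected", 0.5, "medium"),
]

for row in rows_needing_attention(rows, max_failures_per_year=0.2):
    print(row.unit, row.failure_mode)
```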

g.g. to pump required flow etc.g. The following information is given. failure to start.: xx Analysed unit Description of failure (module) Ref. a drawing. This could also just be the units name or tag number if it exists. Failure cause and mechanisms: The possible failure mechanisms and/or events that may cause the identified failure modes are recorded (e. Condition monitoring In Table 9 the heading gives the name of the system or subsystem. Membrane filtration Ref.Methods for risk analysis of drinking water systems from source to tap Table 9. tion onal mode cause or of failure mode mechanism xxy Pump Running Stop Degradation Alarm water while Corrosion running Performed by: NN Date: 2008-10-10 Effect of failure On (sub)system Pressure Reduced loss water quality from treatment On the module Page: 1 of 4 Failure Conse. corrosion. A pump can e. Func. Effect of failure on the system or subsystem function: The effects that the failure mode has on the system are recorded. and other general information.).Operati. pumping water.no: Reference to e. Function: The function of the unit is explained (e. fatigue etc.Failure Failure Detection no.Criticality Risk rate quence (risk) reduction options 1 per 10 years Medium Medium Maintenance. disinfectant addition) Operational mode: The unit can have different operation modes. erosion. This can affect the possible failure modes. drawing no. be running or be in standby. Effect of failure on unit/module: The “local” effect of the failure mode is recorded. Then information is successively provided for each unit (module) of the relevant subsystem. One line for each failure mode should be inserted in the form. Failure rate: The rate of failure (frequency) is recorded for each failure mode. see columns in the worksheet: Ref.g. Consequence: The consequence category (severity) of failure mode is recorded.g. Failure mode: All failure modes should be recorded. Stand by pump. Example of FMECA worksheet System: Treatment system. 53 .) 
Detection of failure: The way in which the failure mode is detected is recorded. The failure mode should be formulated as failure to perform a main function (e.

Criticality: The (combined) evaluation of failure rate and consequence related to the failure mode.
Risk reduction option: Possible actions to reduce the consequence or the frequency of the failure mode are recorded.

In addition there is usually also a Comments field in the form.

The FMECA method is a sort of structured brainstorming. The method is quite easy to understand and there is no need for much training. It is however recommended to have a facilitator that is familiar and experienced with the FMECA method. The most important competence of the people that are involved in the analysis is knowledge about the system in question. If one is to perform a FMECA of a complete water supply system, from source to tap, the need for man hour resources will be extensive. The FMECA can be documented by using a spreadsheet as shown earlier, but also many different FMECA software packages exist that will help structuring and executing the analysis.

More detailed presentation of the FMECA method can be found in several standards [77, 78, 79]. An introduction to FMECA is also given e.g. in [23]. A more thorough introduction to and description of FMECA is given in Appendix D.

6.4 Removal efficiency of the water treatment system

When the hazards and hazardous events are identified during the design phase of a drinking water treatment plant, one should also specify the requirements for the plant's treatment system. In particular, there should be an analysis to identify the needed removal efficiency of the treatment system with respect to the hazardous agents identified. This analysis is described in more detail in Appendix E and a short review is given below.

The final objective of the method is to obtain the probability that the drinking water quality is insufficient (despite the drinking water treatment system), i.e. we obtain the probability that the concentrations of certain parameters are exceeded in the produced drinking water. Therefore guideline concentration values are determined for each parameter of a predefined list of relevant parameters (e.g. E.coli, lead, turbidity etc.).

An important step of this analysis is to carry out a FMECA in the design phase of the treatment and monitoring system, and then give a list of failure modes of the planned system. The system functionality is expressed in terms of removal efficiencies for each planned treatment step, and for a fixed set of predetermined water quality parameters. First the removal efficiencies are assessed for each treatment step during normal operation. Then the reduced removal efficiencies can be identified for each failure mode, each treatment step and each parameter. Based on the removal efficiencies of the various treatment steps, we can for each water quality parameter and each failure mode calculate the overall removal efficiency for the whole system.

Data has to be provided about the probability of the occurrence of certain concentrations of these parameters in the raw water. When the removal efficiencies are determined we can calculate raw water parameter concentrations that must not be exceeded, in order to comply with guideline or threshold values of the produced drinking water. Then we should estimate the probability that these parameter concentrations in the raw water are exceeded. The combination of probabilities, the occurrence of failure modes and the exceedance of the raw water concentrations, results in the calculation of the final probability that the concentration thresholds in the drinking water for certain parameters are not complied with, cf. illustration in Figure 15.

Figure 15. Event combinations for the parameters pesticides and bacteria.

In the design phase this approach can help to evaluate options for the design of the treatment system. The method is based on published studies [24], [25], and described in more detail in Appendix E.

6.5 Fault Tree Analysis (FTA)

As part of a total risk analysis we may need to carry out a dedicated analysis of the causes of some hazardous events. For example it could be needed to analyse the effectiveness of a specific treatment system (UV disinfection, filtration, CO2, etc.) in relation to possible failures of its elements. There could for instance be a critical safety barrier, which needs to be analysed in more detail, in order to investigate the possible events causing it to fail. Further, we could look at the entire water utility, to get an overall view of all possible ways that this system can fail (or experience a specific type of failure). In all these cases we can apply a Fault Tree Analysis (FTA).

A FTA [23] is particularly suited to identify and analyse systematically the various failure causes of a (sub)system. The specific strength of this method is that it identifies any combination of single failures, which alone are not critical, but together are critical (causing system failure). The main part of a FTA is to construct a fault tree related to a specified undesired (hazardous) event. The fault tree is a logic diagram, which provides a model of the failure causes. It displays the interrelationships between the undesired event and the causes of this event. Any event can be "broken down" step by step, starting with the main undesired event, which in the FTA is referred to as the "top event".

The fault tree applies specific symbols, in particular "AND-gates" and "OR-gates" (Figure 16). In Figure 16 the AND-gate is used to model that both "Event 1" and "Event 2" must occur in order for the "Event A" to occur. Similarly, the OR-gate is used to show that "Event B" occurs if either "Event 1" or "Event 2" occurs (or both).

Figure 16. AND and OR gates of a fault tree.

A fault tree analysis is normally carried out in five steps:
1. Definition of the problem, incl. the top event and boundary conditions
2. Construction of the fault tree
3. Identification of minimal "cut sets" 6
4. Qualitative analysis of the fault tree (e.g. evaluating the criticality of the cut sets)
5. Quantitative analysis of the fault tree (estimate probability of top event)

6 A cut set is a combination of basic events that, if they occur, will cause the top event to occur.

We will present a simple FTA where four of the five steps above are considered. In this example, there are two redundant pumps, i.e. it is sufficient that one is functioning in order to avoid the undesired event. First (step 1), we will investigate the "top event" (the undesired/hazardous event):
• Failure to pump water (at a specific location)

Next step (step 2) is to create the fault tree and we have three possible events that can cause the top event to occur (see Figure 17):
• The pumps do not receive water ("no input")
• None of the two pumps are working
• Common motor of pumps fails, loss of power to motor

The second event is broken further down into
• Pump 1 fails to pump water
• Pump 2 fails to pump water

At this stage we stop to further break the fault tree down and at the bottom of the fault tree we have four "basic events", identified as:
NI = No input to pumps
P1 = Pump 1 fails
P2 = Pump 2 fails
CP = Common motor/power failure, causing both pumps to fail

Figure 17. Example of a simple fault tree.

The further analysis will be based on these basic events. First there can be a qualitative analysis, identifying the "cut sets" (step 3). For the top event to occur it is sufficient that NI or the CP event occur alone. Finally, the top event occurs if both P1 and P2 occur. So in the above example there are three (minimal) cut sets:
S1 = {NI}
S2 = {CP}
S3 = {P1, P2}

Next (step 4) we can carry out a quantitative analysis. If we can assess the probability of all basic events, we can, using the fault tree, also quantify the probability that the top event occurs. In the above example:

P(top event) ≈ P(NI) + P(CP) + P(P1) · P(P2)

The quantitative analysis (step 5) is not described here, but an example on this is presented in Appendix C.

In summary, a fault tree is a logic diagram that displays the interrelationships between an undesired event in a system and the causes of this event. The causes may be technical failures, human errors, normal events, and environmental conditions. A properly constructed fault tree provides a good illustration of the various combinations of (component) failures, human errors, normal events, and environmental factors that may result in a critical event for the system. The undesired/hazardous event is called the top event of the fault tree, and the events on the lowest level are called basic events. The various events in a fault tree are connected through logic gates. A fault tree may be broken down to the preferred level of resolution. A more extensive example is given in Appendix C.
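The top event approximation for the pump example, P(top event) ≈ P(NI) + P(CP) + P(P1) · P(P2), can be checked numerically. The following is a minimal sketch, not part of the report: the basic event probabilities are hypothetical values chosen only for illustration, the function names are invented here, and the basic events are assumed to be independent. It compares the sum over the minimal cut sets with the value obtained when the three cut sets are treated as independent events.

```python
from math import prod

# Hypothetical basic-event probabilities, chosen only for illustration.
basic_event_prob = {
    "NI": 0.01,   # no input to pumps
    "CP": 0.005,  # common motor/power failure
    "P1": 0.05,   # pump 1 fails
    "P2": 0.05,   # pump 2 fails
}

# Minimal cut sets of the pump example: S1 = {NI}, S2 = {CP}, S3 = {P1, P2}.
minimal_cut_sets = [{"NI"}, {"CP"}, {"P1", "P2"}]


def cut_set_probability(cut_set, probs):
    """Probability that all basic events of one cut set occur (independence assumed)."""
    return prod(probs[event] for event in cut_set)


def top_event_rare_event(cut_sets, probs):
    """Approximation used in the text: sum of the cut set probabilities."""
    return sum(cut_set_probability(cs, probs) for cs in cut_sets)


def top_event_disjoint_cut_sets(cut_sets, probs):
    """Exact value if basic events are independent and no event is shared between cut sets."""
    return 1.0 - prod(1.0 - cut_set_probability(cs, probs) for cs in cut_sets)


approx = top_event_rare_event(minimal_cut_sets, basic_event_prob)
exact = top_event_disjoint_cut_sets(minimal_cut_sets, basic_event_prob)
print(f"sum over cut sets:       {approx:.6f}")
print(f"independent-event value: {exact:.6f}")
```

With small basic event probabilities the two values are close, and the sum over the cut sets is slightly on the high (conservative) side, which is why the text writes the relation with "≈".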

The FTA is seen as a rather advanced method. The fault tree construction should be carried out in co-operation with risk analysts and with utility personnel that are well acquainted with the operation of the system. The fault trees can be quite complex for big systems. The analysis based on the fault tree can also be quite complex. Various software packages are available to draw fault trees, identify cut sets and perform quantifications. The use of fault trees is described e.g. in [23, 26-29]. The FTA is also the main approach of the Göteborg case study, which is an overall analysis of a water utility, see Appendix C (example C2) and [3, 30, 31].

6.6 Reliability Block Diagram (RBD)

A reliability block diagram (RBD) is an alternative to a fault tree. The way n components are interconnected to fulfil a specified system function may be illustrated by a RBD, see Figure 18. Each component is illustrated by a block in the diagram. When a component fails it breaks the connection at that point. There can be various ways to connect the end points a and b. As long as there is at least one connection between the end points a and b, the specified system functions.

Figure 18. Illustration of a simple reliability block diagram.

The RBD in Figure 18 corresponds to the fault tree (Figure 17) in the previous section. It is usually an easy task to convert a fault tree into a RBD. In Figure 18 we see that the system fails if either
• Input water is not available.
• Both water pumps have failed.
• Pump motor has failed.

The cut sets of the top event (see previous section) are easily seen from Figure 18. Further, the probability of the system failing, i.e. P(water pump has failed to pump water), can be derived either from the FTA of previous section or from the RBD, using P(water pump is functioning) = 1 – P(water pump has failed). A more comprehensive example is given in Appendix C (example C1).

Two important structures of a reliability block diagram are a series structure and a parallel structure. These are illustrated in Figure 19 below. A system that is functioning if and only if all of its n components are functioning, is called a series structure. A parallel structure is a system that is functioning if at least one of its n components is functioning.

Figure 19. Illustration of a series structure (left) and parallel structure (right) of a RBD.

A detailed description of RBD is found in the IEC standard [80]. Analysis of RBD is presented e.g. in [23].

6.7 Human Reliability Analysis (HRA)

It is well known that human errors are very important (often the most important) sources of failures. Human errors occur both during operation and maintenance, and it is an important task for any operator to reduce the number of these errors. Human reliability assessment (HRA) deals with the impact of human operators and maintainers on system performance and can be used to evaluate human error influences on water quality and water quantity in the water supply system. HRA can for instance be used for analysing work processes carried out by human operators.

HRA is a collective term for various methods (see e.g. [32] for descriptions of HRA methods). The main steps of HRA methods are:
1. Task analysis
2. Human error identification
3. Human reliability quantification

Task analysis is the study of what an operator (or team of operators) is required to do, in terms of actions and/or cognitive processes, to achieve a system goal [33]. The objective of the task analysis is to describe and characterize the task to be analysed in sufficient detail to perform human error identification and/or human error quantification. Task analysis methods can also document the information and control facilities used to carry out the task. Task analysis covers a range of techniques used to describe, and in some cases to evaluate, the human-machine and human-human interaction in systems.

The human error identification identifies and describes possible erroneous actions while the human reliability quantification estimates the probability of erroneous actions.

There exist several methods to carry out task analysis. The techniques for task analyses are divided into five groups [33]:
• Techniques for the collection of task data on human-system interactions.
• Task description techniques which structure the information collected into a systematic format.
• Task simulation methods which are aimed at "compiling" data on human involvements to create a more dynamic model of what actually happens during the execution of a task.
• Task behaviour assessment methods which are largely concerned with system performance evaluation, usually from a safety perspective.
• Task requirement evaluation methods which are utilized to assess the adequacy of the facilities which the operator(s) have available to support the execution of the task, and directly describe and assess the interface (displays, controls, tools, etc.) and documentation such as procedures and instructions.

An example of a top level hierarchical task analysis for water sampling is shown in Figure 20.

0. Water sample
   1. Warning about water sample
   2. Prepare for water sample
   3. Carry out water sample
   4. Transport sample to laboratory
   5. Analyse water sample
   6. Report result from water sample

Figure 20. Hierarchical task analysis – water sample.

The task analysis is followed by the human error identification. At least, the human error identification should consider the following types of error [32]:
• Error of omission, i.e. acts omitted or not carried out.
• Error of commission, i.e. acts carried out inadequately, in wrong sequence, too late or too early.
• Extraneous act, i.e. wrong (avoidable) act performed.
• Error-recovery opportunities (possibility to correct errors before a critical event has occurred).

The human error identification is usually documented in a tabular task analysis. An example is illustrated in Table 10. There are several types of error recovery:
• Internal recovery, i.e. the operator having committed an error realises this immediately, or later, and corrects the situation.

• External recovery, i.e. the operator having committed an error, is prompted by a signal from the environment (e.g. an alarm or an error message).
• Independent human recovery, i.e. another operator monitors the first operator, detects the error and either corrects it or brings it to the attention of the first operator, who corrects it.
• System recovery, i.e. the system itself recovers from the human error. This implies a degree of error tolerance, or of error detection and automatic recovery.

Table 10. Human error identification – some examples.

Task 1. Warning about water sample
  Human error: Warning not sent
  Human error: Warning sent too late
  Human error: Warning not understood
Task 2. Prepare for water sample
  Human error: Test tube not disinfected
Task 3. Carry out water sample
  Human error: Action omitted
  Human error: Action not carried out correctly (according to rules)
  Human error: Test tube not sealed
Task 4. Transport sample to laboratory
  Human error: Sample sent to wrong address
  Human error: Sample not properly packed
Task 5. Analyse water sample
  Human error: Action omitted
  Human error: Analysis not carried out correctly
Task 6. Report result from water sample
  Human error: Results misinterpreted

Dependent on the purpose of the analysis, it may be possible to quantify the likelihood of the errors involved and then determine the overall effect of human error on system safety or reliability. Human reliability quantification techniques all quantify the human error probability (HEP), which is the metric of human reliability assessment. The human error probability is defined as follows:

HEP = (Number of errors occurred) / (Number of opportunities for error)

Further description of the quantification process is given in [32].

Although numerous HRA quantification techniques have been developed and applied over the years, there does not exist one universally accepted methodology with a firm theoretical basis [34]. According to [34], three approaches to quantification can be distinguished: (1) Decomposition or Database Techniques, involving decomposition of tasks to a level for which some reference data are available and can be adjusted according to the specifics of the task, (2) Time Dependent Methods, assuming that human error probability is a function of the time available to respond to an event, (3) Expert Judgement Based Techniques utilising expert knowledge.

Some known and recognized HRA methods are:
• THERP – Technique for Human Error Rate Prediction [35].
• SLIM – Success Likelihood Index Method [36].
• HEART – Human Error Assessment and Reduction Technique [37].

• CREAM – Cognitive Reliability and Error Analysis Method [38].
• ATHEANA – A Technique for Human Event Analysis [39].
• MERMOS – Methode d'Evaluation de la Realisation des Missions Opérateur pour la Sûreté [40].
• SPAR-H – Standardized Plant Analysis Risk HRA Method [41].

The following knowledge is necessary to carry out human reliability assessments:
• In depth knowledge about the work process in order to carry out the task analysis.
• Knowledge about human factors and human errors in order to identify human errors.
• In depth knowledge about HRA methods and knowledge about human reliability data in order to carry out a quantitative human reliability analysis.

The time consumption of a HRA depends on the scope of the analysis. It may be time consuming to analyse all work processes involving human actions in the water supply system quantitatively. Access to relevant human reliability data as basis for quantitative human reliability analyses may be a problem. If no specific human reliability data for water supply systems are known, the analyses probably have to be based on generic human reliability data from other types of industries.

6.8 Markov Analysis

There are situations where a more detailed analysis of the reliability of a system (or component) should be carried out. For instance, a preliminary analysis can have identified a critical component, or serious problems are actually observed with the operation of a specific subsystem. As a consequence, an analysis should be carried out to identify the right level of redundancy and/or preventive maintenance for the system (component).

Example A: In order to carry out a certain operation, two pumps are needed. One can decide either to install just two pumps, or to have a third pump in stand-by. A Markov analysis can then be carried out to compare the performance of these two options. The analysis results for each option are e.g.
• The probability (fraction of time) that it is necessary to operate with just one pump.
• The mean number of times (during one year) when it is necessary to operate with just one pump (or mean number of times that both pumps fail).
Thus, a Markov analysis will in combination with an analysis of costs help us to make a choice between the two options.

Example B: Markov analysis can also be used to model deterioration processes, for example consider wastewater pipes. Equipment for direct measurement of the remaining pipe wall thickness exists for water networks (e.g. in WP 5.6 in the TECHNEAU project). Based on such measurements one can identify the state of degradation for various pipe sections. In a Markov analysis we define various performance states (or levels of deterioration) for the system. A Markov model can then be developed to describe transitions between deteriorating states, and such a model can help us to analyse and predict the time until the occurrence of pipe failure (i.e. leaks of a certain size). This can help us to plan repairs and replacements of pipes.

In general the analysis will start by formulating a Markov model for the system in question. The analyst will in co-operation with personnel who are familiar with the system define various states, corresponding to various levels of performance/deterioration. For instance, we can allow the system to have N+1 states, and let the set of all possible states equal S = {0, 1, …, N}. Here state 0 could denote a perfect system (i.e. system being "as good as new"), and state N could represent a completely failed system. These states, and possible transitions between these, are illustrated by a Markov diagram, see Figure 21 for an example with N=2 (i.e. three states). In this specific diagram it is possible to make transitions between state 0 and state 1 (both ways), and between state 1 and state 2 (both ways), but not between states 0 and 2.

Figure 21. Markov state diagram for example C (illustrating a Markov model).

Example C: The Markov diagram in Figure 21 could illustrate the model for a redundant system of two components (say two pumps). The states of the model correspond to
State 0: Both components (pumps) are working
State 1: One component (pump) has failed and the other is working (but system is still working)
State 2: Both components have failed (and so system has failed).

Note that in this example it is assumed to be sufficient that one component is working for the system to be OK. Now assume that an analysis should be carried out to derive the repair strategy to follow when both components have failed. Transitions between the states here occur according to the following constant rates:
λ = Failure rate of a component (that is operating) (failures per unit time)
μ1 = Repair rate when one component has failed (repairs per unit time)
μ2 = Repair rate when both components have failed (repairs per unit time)

This means that the Mean Time To Failure (MTTF) of one component equals MTTF = 1/λ (Appendix F). When the system is in state 0, both components can fail. Thus, the total rate of state 0 equals 2λ (and mean time until a failure occurs equals 1/(2λ)). Similar results apply for repair times. The repair time of a single component has a mean MTTR = 1/μ1. When both components have failed, the mean time until one repair is accomplished equals MTTR = 1/μ2. Note that these MTTR are interpreted as the total time elapsing from when a failure occurs until a component is fully restored.

The basic task of a Markov analysis is to derive the probabilities p0, p1 and p2 that the system is in state 0, 1 and 2 respectively, i.e. px is the probability that the system (process) is in state x (x = 0, 1, 2). Obviously p0 + p1 + p2 = 1, and by using some equilibrium equations all three probabilities can be derived. For simplicity assume μ2 = μ1 = μ (this actually means that just one component is repaired at a time, also when both have failed). Then the following probabilities are derived:

p0 = μ²/(μ² + 2λμ + 2λ²)
p1 = 2λμ/(μ² + 2λμ + 2λ²)
p2 = 2λ²/(μ² + 2λμ + 2λ²)

Here it is particularly interesting to find the probability, p2, that the system is failed. A numerical example is given: Let time be measured in years, and assume that it is experienced that the average life time (MTTF) of a component equals 2 years, i.e. λ = 0.5 year⁻¹ (cf. Appendix F), and then

p2 = 1/(2μ² + 2μ + 1),

which shows how the probability to be in the failed state (p2) depends on the repair rate μ (given the value λ = 0.5). One strategy could be not to repair a failed component before the next overhaul. If there is an overhaul every year, this could imply that on the average MTTR = 0.5 years, i.e. μ = 2 year⁻¹. This gives p2 ≈ 0.077, that is, the system is in the failed state 7.7% of the time. Changing the repair strategy and reducing MTTR to 0.1 year (≈ 37 days), i.e. μ = 10 year⁻¹, will give p2 ≈ 0.0045 ≈ 0.5%. Reducing MTTR further to 0.01 year (≈ half a week), i.e. μ = 100 year⁻¹, gives p2 ≈ 5·10⁻⁵. This type of results can help us to choose a sensible MTTR to use for this system. One could also calculate other parameters, like the frequency (rate) of system failures, i.e. p1·λ (the rate at which the failed state is entered), and demonstrate how this rate depends on μ.

In summary, the Markov analysis is seen as a rather advanced method, and it will require skilled personnel to carry out such an analysis. Some experience is required both to define the states and the relevant transitions between these. The need of data (i.e. transition rates) is significant. Also note that if the system has several states (say more than 3-4), the analysis can become rather complex and time consuming, and the use of a data tool is recommended. Such tools are commercially available.

Finally it is pointed out that a Markov analysis is based on two specific assumptions:
• All transition rates are constant in time. This implies that all involved failure times and repair times are assumed to have an exponential distribution.
• The Markov model is based on the system having a "lack of memory", meaning that all transition rates depend only on the current state, and not on the previous history of the system.
So in particular, the transition rates will be independent of how long the system has been in the current state, and will not depend on the time elapsed since the current state was entered.

Rausand and Høyland [23] provide a good introduction to the use of Markov analyses in reliability and risk analyses. There is also an IEC standard for the method which could be helpful for a concise description [81].
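The numerical example for the two-component system can be reproduced directly from the equilibrium (balance) equations of the three-state model. The following is a minimal sketch, not part of the report: the function and variable names are invented here, and it assumes the constant-rate model described above, with transitions 0 → 1 at rate 2λ, 1 → 0 at rate μ1, 1 → 2 at rate λ and 2 → 1 at rate μ2.

```python
def steady_state(failure_rate, repair_rate_one, repair_rate_both):
    """Return (p0, p1, p2) for the two-component model with states 0, 1 and 2."""
    # Balance between states 0 and 1: 2 * lambda * p0 = mu1 * p1
    ratio_p1 = 2.0 * failure_rate / repair_rate_one
    # Balance between states 1 and 2: lambda * p1 = mu2 * p2
    ratio_p2 = ratio_p1 * failure_rate / repair_rate_both
    # Normalisation: p0 + p1 + p2 = 1
    p0 = 1.0 / (1.0 + ratio_p1 + ratio_p2)
    return p0, ratio_p1 * p0, ratio_p2 * p0


failure_rate = 0.5  # per year, i.e. MTTF = 2 years
results = {}
for repair_rate in (2.0, 10.0, 100.0):  # per year, i.e. MTTR = 0.5, 0.1 and 0.01 year
    results[repair_rate] = steady_state(failure_rate, repair_rate, repair_rate)
    p2 = results[repair_rate][2]
    print(f"mu = {repair_rate:5.1f} per year -> p2 = {p2:.2e}")
```

For μ1 = μ2 = μ and λ = 0.5 the third probability reduces to 1/(2μ² + 2μ + 1), so the three repair rates give 1/13, 1/221 and 1/20201, in line with the values 0.077, 0.0045 and 5·10⁻⁵ quoted in the example.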

A discrete variable is one with a well defined finite set of possible values (called states). Nodes choice decision utility Figure 22. If there is a dependency from node A to node B.9 Cause. A simple influence diagram [45] and node types Figure 22 contains three types of nodes: chance nodes represented by an ellipse. The chance node ‘Weather’ represents whether or not it actually rains during the day (states: “rain” or “no_rain”). The graphical representation is useful for intuitively defining dependencies and independencies of complex problems and for communicating about these problems. The decision node stands for the decision 65 .effect relations . Bayesian networks consisting of decision nodes or utility nodes are called influence diagrams or decision networks. Introduction A probabilistic network is a graphical and qualitative representation of a problem. a decision node represented by a rectangle and a utility node represented by a flattened hexagon. consisting of parameters. The chance node ‘Forecast’ represents the weather forecast in the morning (states: “sunny”. B is described as a child of A and A as a parent of B. “cloudy” or “rainy”). Probabilistic networks have become an increasingly popular tool for reasoning under uncertainty. A continuous variable is one which can take on a value between any other two values. Bayesian networks are a specific subclass of probabilistic networks where the connectors are represented as quantitative probabilistic dependencies between variables (cause-effect relations) in one specific direction.Methods for risk analysis of drinking water systems from source to tap 6. such as: indoor temperature or volume of consumed water. A specific condition for Bayesian networks is that it consists only of directed acyclic graphs. represented by nodes and their interactions represented by connectors [42-44]. The direction of the dependency defines the hierarchy between nodes. 
Bayesian Networks
The Bayesian network is an advanced analysis technique for modelling how various factors affect the performance of relevant systems and thereby the resulting risk. Bayesian networks contain nodes that represent a probability (chance nodes). The underlying variable can be continuous or discrete, such as the numbers 1 to 6 on a die or a statement which is either “true” or “false”. The network is acyclic, meaning that it may contain no path that leads from a node through other nodes back to itself.

Bayesian networks can be augmented with decision nodes or utility nodes; the result is called an influence diagram. A decision node represents a variable (or choice) that is under the control of the decision maker. A utility node represents the expected value that is to be maximized while searching for the best decision rule for each of the decision nodes. In Figure 22 an example of an influence diagram is given, in which the decision is whether or not to take an umbrella. If the decision maker knew for certain what the weather was going to be, it would be easy to decide whether or not to take the umbrella. There is a link from ‘Forecast’ to ‘Decide_Umbrella’, indicating that the decision maker will know the forecast when he makes the decision, but no link from ‘Weather’ to ‘Decide_Umbrella’. At the utility node the decision maker’s level of satisfaction is calculated.

Where in the remainder of this paragraph reference is made to Bayesian networks, this is also valid for influence diagrams.

All nodes have an underlying Conditional Probability Table (CPT) containing the probabilities of occurrence. For non-modifying nodes (i.e. nodes without parents) the CPT describes the probability of occurrence of the given states (e.g. “rain” or “no_rain”). For modifying nodes (i.e. nodes with parents) the CPT describes the probability of the occurrence of a state, given the state of its parent(s). Nodes with multiple parents and states will have a large CPT. The probabilities of the dependencies can be based on:
• raw data collected by direct measurement,
• raw data (mostly perceptions) collected through stakeholder elicitation,
• output from models,
• expert opinions based on theoretical calculation or best judgement.

The most important advantages of Bayesian networks are:
• A subtle modelling approach is possible, as probabilities can be obtained from various types of data and various states can be differentiated per variable (node) (e.g. a pipe burst can be characterized as “No_burst”, “Small_burst” or “Big_burst”).
• A large freedom of programming, especially compared to fault and event trees. Both tree types can be integrated into a Bayesian network, with the possibility to link different branches of a tree or to link variables in a fault tree directly to variables in an event tree.
• The network configuration and its calculation are integrated into one model, facilitating the communication of the approach and results.
• The possibility of making sensitivity and what-if analyses (see Section 3.1).
• Adapting Bayesian networks to new situations is relatively easy.
• The model can be updated with new data, and become a learning model.

An important disadvantage of Bayesian networks is that with increasing complexity the amount of input grows exponentially. This high demand for data may require a significant effort, especially when data is obtained through stakeholder elicitation or expert knowledge.

Use of Bayesian networks in practice
Bayesian networks result in a quantitative outcome, representing the probability that a certain occurrence will happen. Specialist skills for building Bayesian networks are not required because well documented software packages are available. However, especially for complex problems a certain amount of analytical knowledge is required. The graphical layout (the nodes and the dependencies) is drawn first. In these software packages CPTs can be generated by applying predefined rules. These rules can be based on statistical analysis or expert knowledge. Bayesian networks have the ability to import data from other software, spreadsheets, Matlab etc.
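The growth of the input with network complexity can be made concrete by counting the entries of a CPT. The following is a minimal sketch only: the function name `cpt_entries` and the example node sizes are assumptions chosen for illustration and do not come from the report.

```python
from math import prod


def cpt_entries(node_states, parent_states):
    """Number of probabilities in the CPT of one node.

    node_states   -- number of states of the node itself
    parent_states -- list with the number of states of each parent
    One probability is needed per node state for every combination
    of parent states.
    """
    return node_states * prod(parent_states)


# A pipe-burst node with the three states No_burst / Small_burst / Big_burst.
# The parent counts below are hypothetical.
no_parents = cpt_entries(3, [])           # node without parents
two_parents = cpt_entries(3, [2, 2])      # two parents with two states each
five_parents = cpt_entries(3, [3] * 5)    # five parents with three states each
```

With these hypothetical sizes the same three-state node needs 3, 12 and 729 probabilities respectively, which illustrates why elicitation effort rises quickly when parents are added.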

For the calculation of Bayesian networks different commercial software packages exist. Two well-known packages are Netica (www.norsys.com) and Hugin (www.hugin.dk). For a comprehensive list of Bayesian network packages see www.cs.ubc.ca/~murphyk/Bayes/bnsoft.html. Most of them have free demo versions.

Example of a Bayesian network
In Figure 23 a Bayesian network is given for a system containing a power source feeding two bulbs. The network consists of three non-modifying nodes and one modifying node with a result depending on the condition of its three parents. Each node has two states (“fails” and “works”), meaning that the condition of the node ‘Light’ depends on eight (2^3 = 8) conditional probabilities. These conditional probabilities have to be defined in a CPT. In this example it is assumed that ‘no light’ is the result of either a failing power source or two failing bulbs. Table 11 gives the CPT for the node ‘Light’.

Figure 23. Bayesian network of a system consisting of a power source and two bulbs (nodes: ‘Power source fails’, ‘Bulb 1 fails’, ‘Bulb 2 fails’, ‘Light’).

Table 11. Conditional probability table for node ‘Light’.
Power source   Bulb 1   Bulb 2   No light   Light
works          works    works    0          1
works          works    fails    0          1
works          fails    works    0          1
works          fails    fails    1          0
fails          works    works    1          0
fails          works    fails    1          0
fails          fails    works    1          0
fails          fails    fails    1          0

With this CPT the Bayesian network models an uncertain condition (the availability of light). It is also possible to model uncertain relations between conditions. If the states in the node ‘Light’ are replaced by “insufficient_light” and “sufficient_light”, and it is assumed that the probability of sufficient light, given that only one bulb is working, is 0.7, then the CPT of the node ‘Light’ looks like the one represented by Table 12.

Figure 22 and Figure 23 show simple Bayesian networks with a limited number of nodes. More complex networks can be made, such as in Figure 24, where an example of a Bayesian network is given for diagnosing the probability that a car starts.
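The power source and two bulbs example can also be evaluated by brute-force enumeration over the eight parent combinations. The sketch below is illustrative only. The prior failure probabilities in `P_FAIL` are hypothetical values that do not appear in the report, and the two CPT functions encode the deterministic case (no light if the power source or both bulbs fail) and the variant with a probability of 0.7 for sufficient light when only one bulb works.

```python
from itertools import product

# Hypothetical prior failure probabilities of the three parent nodes
# (assumed for illustration; not taken from the report).
P_FAIL = {"power": 0.01, "bulb1": 0.05, "bulb2": 0.05}


def p_light(power, bulb1, bulb2):
    """Deterministic CPT: light unless the power source or both bulbs fail."""
    if power == "fails" or (bulb1 == "fails" and bulb2 == "fails"):
        return 0.0
    return 1.0


def p_sufficient_light(power, bulb1, bulb2):
    """Uncertain CPT: one working bulb gives sufficient light with probability 0.7."""
    if power == "fails":
        return 0.0
    working = [bulb1, bulb2].count("works")
    return {2: 1.0, 1: 0.7, 0: 0.0}[working]


def marginal(cpt):
    """Sum P(parent states) * P(child state | parent states) over all 2^3 combinations."""
    total = 0.0
    for states in product(("works", "fails"), repeat=3):
        prior = 1.0
        for name, state in zip(P_FAIL, states):
            prior *= P_FAIL[name] if state == "fails" else 1.0 - P_FAIL[name]
        total += prior * cpt(*states)
    return total


prob_light = marginal(p_light)
prob_sufficient = marginal(p_sufficient_light)
```

The parents are treated as independent, which matches a network in which the three parent nodes have no parents of their own. Software packages use more efficient inference algorithms, but for a network of this size enumeration gives the same marginal probabilities.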

Table 12. Conditional probability table for node ‘Light’.
Power source   Bulb 1   Bulb 2   Insufficient light   Sufficient light
works          works    works    0                    1
works          works    fails    0.3                  0.7
works          fails    works    0.3                  0.7
works          fails    fails    1                    0
fails          works    works    1                    0
fails          works    fails    1                    0
fails          fails    works    1                    0
fails          fails    fails    1                    0

Figure 24. A Bayesian network for diagnosing the probability a car starts [45].

6.10 Event Tree Analysis (ETA) - analysing consequences
Event tree analysis (ETA) is the most commonly used method for analysing the progression of a hazardous event from being initiated to the final consequences. The hazardous event considered in an ETA is often denoted the initiating event. The event sequence is influenced by safety barriers (or control measures), and the consequences are determined by assuming failure or success of the existing safety barriers (or control measures). An event tree is a logic tree diagram that starts from the initiating event and provides a systematic coverage of the time sequence of event propagation to its potential consequences. Each barrier outcome in the tree is conditional on the occurrence of the previous safety barrier outcomes in the event propagation. The outcomes of the barriers are most often assumed to be binary (the barrier is either functioning or not), but may also include multiple outcomes (e.g. the barrier fully functions, partly functions or fails).

ETA is a method used to analyse the consequences of hazardous events (the right part of the overall risk model outlined in Figure 13). A simple event tree was illustrated in Figure 12. The event tree displays the chronological development of event chains (from left to right), starting with the initiating event and proceeding through successes and/or failures of the safety barriers that respond to the initiating event. At every barrier the event tree splits into one upper and one lower branch. If the barrier functions successfully, follow the upper branch; if not, follow the lower branch. Hence, the consequences will be ranked in increasing order of severity, with the worst consequence lowest on the list.

ETA can be carried out both qualitatively and quantitatively. The qualitative part of the event tree analysis is usually carried out in the following steps:
1. Identification of relevant initiating events that may give rise to unwanted consequences. The initiating event may be identified by other risk analysis methods presented in this chapter, like FMECA, PHA or HAZOP.
2. Identification of safety barriers provided to stop or mitigate the unwanted consequences (discussed in section 2.5). Safety functions, or barriers, are provided to stop or mitigate the consequences of the hazardous events. The safety functions may comprise technical equipment, human interventions, emergency procedures, and combinations of these.
3. Construction of the event tree (Figure 12).
4. Description of the resulting consequences.

If experience data are available for the initiating event and for all the relevant safety barriers and hazards, a quantitative analysis of the event tree may be carried out to give probabilities or frequencies of the resulting consequences. The conditional probability that each safety barrier will function properly, given that the previous event sequence has occurred, must be estimated. The reliability of a safety function may be estimated by e.g. an FTA (see Figure 13, where failure of barrier 3 is a “top event” in a fault tree). Then the probability of each consequence category for the specified hazardous event is estimated by multiplying all probabilities in the event sequence from the initiating event to the consequence class under consideration. The frequency of each consequence is estimated by the product of the frequency of the initiating event and the probability of the consequence category. Finally, by quantifying the consequence categories, the total risk can be evaluated.
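The quantitative step described above (multiplying the conditional probabilities along each event sequence and then multiplying by the frequency of the initiating event) can be sketched as follows. This is a simplified illustration, not the report's tool: it assumes binary barriers whose conditional success probability is the same on every branch, and the function name and all numbers are hypothetical.

```python
from itertools import product


def event_tree(initiating_frequency, barrier_success_probs):
    """Enumerate the branches of a binary event tree.

    initiating_frequency  -- frequency of the initiating event (e.g. per year)
    barrier_success_probs -- for each barrier, the conditional probability that
                             it functions given the preceding event sequence
                             (assumed identical on all branches in this sketch)
    Returns a list of (outcomes, probability, frequency), where outcomes is a
    tuple of booleans (True = barrier functions). Branches are listed with the
    all-success sequence first and the all-failure sequence last.
    """
    branches = []
    for outcomes in product((True, False), repeat=len(barrier_success_probs)):
        probability = 1.0
        for functions, p in zip(outcomes, barrier_success_probs):
            probability *= p if functions else 1.0 - p
        branches.append((outcomes, probability, initiating_frequency * probability))
    return branches


# Hypothetical example: initiating event 0.5 times per year, two barriers
# with success probabilities 0.90 and 0.95.
tree = event_tree(0.5, [0.90, 0.95])
worst_outcomes, worst_probability, worst_frequency = tree[-1]
```

In a real analysis the conditional probabilities usually differ between branches, and later barriers are often not demanded once an earlier barrier has stopped the event; both refinements require a tree structure rather than the flat enumeration used here.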

unife. duration of event. EPANET. and is denoted CARE-W REL. (MTTF = ∞). One example of such a model is developed in the CAREW project (see http://care-w. where the bottlenecks are marked red. The joint model considers the probability of leak or bursts as well as reduction of flow capacity and the consequences for water supply measured as flow and pressure to the consumers. QMRA is a specific tool for risk assessment of microbiological quality of drinking water (QMRA is considered here to be a subtype of Health Impact Assessment. including identification of hazards (pathogens) and hazardous events. numbers of consumers affected). The principles behind QMRA has been described in [47] and further developed and applied during recent years in relation to risk management [48-50]. this 0-1 scale allows one to compare pipes which belong to different hydraulically independent networks. A Hydraulic Criticality Index (HCI ) equal to zero means that the pipe has no effect on reliability (with respect to water supply). Risk characterisation. either because the pipe has forecasted failure rate is equal to zero. QMRA is derived from the chemical risk assessment (QCRA) paradigm that encompasses four basic elements: • • • • A characterisation of the problem setting (system description). The EU project MicroRisk resulted in a number on reports [51] describing the use of QMRA. Effect assessment (dose-response curves for specific pathogens).it/ and [82]). Available dose–response data have been obtained mainly from studies using healthy adult volunteers. Exposure assessment (e. i. Figure 25 shows an example of a CARE-W REL analysis. for instance Health Impact Assessment and Health Risk Assessment.g. integrating failure rate and mean time to repair (MTTR). 70 . Even if this value cannot be reached. These are referred to in [2]. and just a short review is given here. or HRA) – see TECHNEAU report for details [46]. The model identifies “reliability bottlenecks” in the network. 
The pipe availability is also considered. is combined with a routine forecasting the probability of failure for each pipe. Here a hydraulic network simulation model. These methods aim to assess the actual health effects to consumers. 6. or because its unavailability (repair of failure for instance) has no effect on consumer interruptions to supply. This project included pathogen samplings and risk assessments involving 12 systems across Europe and in Australia. It has been used and tested in several cities to assess the reliability of water supply to sensitive consumers and entire water supply districts.12 Methods for risk analysis of water quantity (supply) The loss of water supply due to bursts and leaks can be analysed by a combination of hydraulic and reliability models.e.Methods for risk analysis of drinking water systems from source to tap 6. due to undesired events at the water supply system.11 Methods for estimation of risk to human health (QMRA and QCRA) There are general approaches for assessment of various risks to human health. A HCI equal to 1 means that the pipe is totally unavailable and that its unavailability will result in supply interruptions for all consumers served.
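How failure rate and MTTR combine into a pipe availability can be illustrated with the standard steady-state relation, unavailability = MTTR / (MTTF + MTTR) with MTTF = 1 / failure rate. The report does not state which formula CARE-W REL uses, so the sketch below is an assumption for illustration only, and the function name and input values are hypothetical.

```python
def pipe_unavailability(failure_rate, mttr):
    """Steady-state unavailability of a repairable pipe.

    failure_rate -- forecasted failures per year
    mttr         -- mean time to repair, in years
    A failure rate of zero corresponds to MTTF = infinity,
    i.e. the pipe is never unavailable.
    """
    if failure_rate == 0:
        return 0.0
    mttf = 1.0 / failure_rate          # mean time to failure, in years
    return mttr / (mttf + mttr)


# Hypothetical pipe: 0.2 failures per year, one day to repair.
u = pipe_unavailability(0.2, 1.0 / 365.0)
availability = 1.0 - u
```

For this hypothetical pipe the unavailability is 1/1826, roughly half an hour of outage per year per 24 hours of repair time scaled by the failure rate; the hydraulic consequence of that outage still has to come from the network simulation.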

Figure 25. Finding “reliability bottlenecks” of water through combination of hydraulic capacity and failure probability (example from CARE-W REL).

Figure 26 provides an example of a map displaying the hydraulic criticality of water mains in a particular network.

Figure 26. Map displaying the hydraulic criticality of water mains: bold lines refer to a high criticality (example from CARE-W).

Other models than CARE-W REL exist. In the Netherlands all water companies have for many years applied a reliability analysis of the total system. This has been integrated into the national policy.

6.13 GIS as a tool in risk analysis
6.13.1 Introduction
Risk analysis strategies and techniques for application in the water utility sector have been surveyed at the strategic, program, and operational levels of decision making by [52]. Based on that, a comprehensive review has been presented, and analyses show that Geographical Information Systems (GIS) have the potential to become part of the risk assessment portfolio [53]. The risk analysis portfolio using GIS techniques is compiled in Table 13. Due to the spatial context of risk, GIS assisted risk analyses have a broad application not only in public health protection but also for asset management and potential threats to the security of supplies.

GIS is used at the program risk analysis level to optimise the total cost of owning and operating the infrastructure assets of a water utility. GIS provide the visualisation of infrastructure assets and the tracking of their associated risk factors [53]. GIS technologies allow utilities to convert data displayed on paper maps into digital format. Furthermore, applications of GIS technologies offer the capabilities to spatially analyse data, for example to examine infrastructure deterioration induced by spatially variable risk factors (e.g. [54] [55] [56]).

Table 13. Program and Operational level Risk portfolio including the use of GIS (extracted from [53]).
Context                             Tool / Technique                       Application
PROGRAM RISK ANALYSIS LEVEL
Asset management                    GIS risk tracking                      Infrastructure risk-tracking, visualisation and communication
                                    GIS spatial analysis                   Risk-mapping of infrastructure
                                    GIS risk simulation                    Evaluating degradation risk
Catchment management                GIS risk mapping                       Mapping areas of catchment critical to water quality
                                    Contaminant flow/transport modelling   Projecting degradation patterns / assessing risk of water quality violation
                                    Kriging                                Projecting degradation patterns with limited sample data (e.g. groundwater)
                                    GIS risk simulation                    Quantified risk mapping over space and time
OPERATIONAL RISK ANALYSIS LEVEL
Public Health and Compliance Risk   GIS simulation                         Assessing risk of distribution system water quality degradation

6.13.2 GIS in catchment risk management
Because of the spatial context of risk and the classical capabilities of Geographical Information Systems for spatial data analysis, many GIS applications deal with catchment or watershed management. The catchment is often referred to as the ‘first barrier’ within the so-called ‘multi-barrier-system’. The natural properties and the anthropogenic land-use patterns in the catchment play a decisive role for the raw water quality. Risk analysis of water supply at catchment scale has to consider a multitude of possible sources of hazardous events, caused by natural or human factors such as wild animals, agricultural activities, traffic, industry, and recreational activities. GIS can support the identification of hazardous events in the catchment areas by querying the natural boundary conditions and land-use patterns which may lead to the release of hazardous substances or pathogens. Therefore GIS is an appropriate instrument for the preventive protection and for the management of risks in the catchment area, as the first barrier to water supply.

GIS techniques in catchment management [53] include:
• Mapping of data and attributes that are spatially variable in nature and considered to play a significant role in pollutant transport (e.g. rainfall, geology, soil type, land use, erosion, etc.).
• Spatial risk-ranking methodologies of these attributes, using the GIS attribute table, according to predefined formulas (e.g. weighted runoff-potential index).
• Map-overlay techniques to identify areas critical to catchment water quality and to inform the prioritisation of catchment management activities, specifically monitoring programs.
• Using geo-statistical inference (kriging methods) to characterise the extent and severity of source contamination (e.g. contaminant concentration).

The hydraulic and geological settings determine the natural protective function of the catchment. For groundwater resources the degree of this protective function can be expressed as ‘intrinsic vulnerability’, which is independent of the chemical or physical properties of the specific hazards (‘specific vulnerability’) [57]. Groundwater contamination risk assessments and mappings that use GIS are often based on vulnerability maps. Such approaches have been applied by [58] [59] or for karst groundwater [60] [61] [62].

6.13.3 GIS Assisted Risk Analysis – Description and application
During the German TECHNEAU case study in Ebnet-Freiburg, TZW researchers used a GIS based approach for catchment risk assessment, combining hazard mapping, hazard ranking and groundwater vulnerability mapping (‘GIS Assisted Risk Analysis’, ‘GARA-method’). A short summary is given below; see the case study report [63].

The TECHNEAU Hazard Database THDB [20] is applied as basis for hazard identification. Information on the exact location of hazards (including map coordinates) and land take (areal extent) is added. It is then possible to display the hazards and assess them using a Geographical Information System.

The hazard identification is supported by brainstorming, an analysis of earlier incidents, and a field survey. During that field survey the identified hazardous events in the catchment area are revised and described more closely. The weighting of the hazard in the GIS attribute table is done by assigning a score ranging from 0 to 100, expressing the ‘harmfulness’ of a hazard. The weighting is based on literature data [60] and on the expert judgement and experience of the water utilities’ personnel.

In porous aquifers, the degree of groundwater contamination depends on the travel time and the dilution with uncontaminated groundwater. Therefore a second ranking procedure modifies this score slightly by reducing or increasing it according to the distance to the water abstraction wells. Each groundwater protection zone is assigned a ranking factor ranging from 0.8 (outer protection zone) to 1.2 (inner catchment zone). By overlaying the GIS hazard layer and the GIS protection zone layer, it is possible to calculate the final ‘Hazard Level’, depending on the hazards’ properties and their location within the catchment. To consider overlapping hazards, the single weighted hazard scores are added in the overlapping area. The result is a semi-quantitative hazard ranking, expressed as a catchment hazard map (Figure 27).

Figure 27. Schematic illustration of the procedure of hazard mapping and hazard ranking used in Cost Action 620 (2004) [60]

The next step is the digital mapping of the intrinsic groundwater vulnerability, based on its natural protective function, using soil data and the hydrogeological setting of the catchment area (like groundwater recharge, infiltration of surface waters, depth to groundwater table etc.). The ‘PI-method for groundwater vulnerability assessment’ [61] was applied and adapted in the TECHNEAU case study in Freiburg in Germany [63].

The final product, the Risk Intensity map, is the result of overlaying the vulnerability map and the hazard map with the GIS. The multiplication of the reciprocal value of the Hazard Level with the PI-factor expressing the vulnerability level is performed with the GIS. The result, the ‘Risk Intensity Index’, is displayed in a map, visualizing the risk associated with the hazardous events. Figure 28 demonstrates how the hazard level and the vulnerability factor can be combined to express the risk intensity. This Risk Intensity map can be the basis for future considerations on risk management and risk reduction options in the catchment area.
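The ranking arithmetic of the GARA-method (hazard score times protection zone factor, summation where hazards overlap, and multiplication of the reciprocal Hazard Level with the PI-factor) can be sketched for individual map cells as below. This is an illustration under stated assumptions, not the case study's GIS workflow: the function names, the cell data and the PI-factor values are hypothetical, and it is assumed, following the PI method, that a low PI-factor means low natural protection, so that a low index indicates a high risk intensity.

```python
def hazard_level(cell_hazards):
    """Hazard Level of one map cell.

    cell_hazards -- list of (score, zone_factor) for every hazard covering the
                    cell; score is 0..100, zone_factor is 0.8..1.2.
    Overlapping hazards are handled by adding their weighted scores.
    """
    return sum(score * zone_factor for score, zone_factor in cell_hazards)


def risk_intensity_index(level, pi_factor):
    """Reciprocal of the Hazard Level multiplied by the PI-factor.

    Returns None for cells without any hazard (Hazard Level 0), where the
    reciprocal is undefined.
    """
    if level == 0:
        return None
    return pi_factor / level


# Hypothetical cells: (hazards in the cell, PI-factor of the cell).
cells = {
    "A": ([(60, 1.2), (30, 1.2)], 2.0),   # two overlapping hazards, inner zone
    "B": ([(60, 0.8)], 4.0),              # one hazard, outer zone, better protected
    "C": ([], 3.0),                       # no hazard mapped
}

index = {
    name: risk_intensity_index(hazard_level(hazards), pi)
    for name, (hazards, pi) in cells.items()
}
```

In a GIS the same calculation would typically be carried out with an overlay of the hazard and vulnerability layers and a field calculation on the attribute table; the assignment of index values to risk classes is not reproduced here because the class boundaries are not given in the text.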

Figure 28. Diagram of risk intensity index with five different classes assigned to build risk classes. [64]

6.13.4 Main requirements
In the German case study [63] a commercially available GIS software (Desktop ArcGIS 9.2, ESRI) was used. Other GIS software is available, either commercially or as open source. To apply GIS analysis, a certain degree of competence and training is necessary. But the decisive factor for the use of any GIS is the availability of appropriate information or digital data. Depending on the complexity of the project’s objective and the available data basis, some effort may be necessary to gather the data needed. In some cases, CAD, tabular, or GIS data may be available from authorities. If no digital data exist, they have to be created by geo-referencing paper maps and digitising information obtained from maps and field surveys. Digitising can take place at a desktop PC or, GPS-assisted, with tablet PCs directly in the field.

In the case study the utilities’ CAD datasets on water supply structures, well locations etc. were used. Maps and aerial pictures were used as topographic base. Land-use patterns already existed in GIS shape file format, due to the catchment management activities of the water utility. Information on soil structure, hydrogeology, groundwater recharge and infiltration of surface waters has been derived from the local hydraulic groundwater model and transformed to GIS shape files. Tabular data, for example additional information on hazards from the THDB [20], have been linked to the GIS attribute table using the ‘Join’ function.

6.13.5 Concluding comment
The use of GIS Assisted Risk Analysis (GARA-method) in the case study [63] illustrated that risk management is an iterative process of continuous updating as new information becomes available and as the preconditions change. By storing and updating all spatial information in a GIS, the entire risk management process, from the identification of hazards and the estimation of risks, through the evaluation of risk tolerability and identification of potential risk reduction options, to the selection and implementation of appropriate risk reduction and monitoring measures, can be operated regularly. This outcome is in line with the TECHNEAU Generic framework [2]. The plain visualisation of Risk Intensity in a map is a very vivid and convenient tool for communication of risks between the various involved stakeholders.

7 Summary of risk analysis methods
The report provides an overview of main risk analysis methods for a water utility. The objective is to describe the tasks of a risk analysis, and to demonstrate the applicability and capabilities of the various methods, and thus support the implementation of the “Generic framework and methods for integrated risk management in water safety plans” [2]. Main steps of a risk analysis are (see Appendix A for details):
1. Define objectives and scope of the risk analysis. Make a system description and plan the work
2. Identify hazards and hazardous events
3. Estimate risk

A risk analysis is an important action to identify the hazards and hazardous events, which could be e.g.
• Contamination of the catchment area (biological/chemical).
• Failure of the treatment systems (technical/human).
• Failure of the distribution system (water leakage out, intrusion of contaminated water into pipes).
So the “risk picture” for a water utility is quite complex, including e.g. technical, biological and human aspects of a large and diverse system. It is also required to consider and balance the risks related both to water quality and quantity, a fact that further increases the complexity. Risk analyses provide useful tools for the management to control the variety of hazards and hazardous events of the water utility, as a basis for identifying and implementing effective risk reduction options.

The Coarse Risk Analysis (CRA) will often be the basic risk analysis method for the water utilities. In addition to supporting the identification of hazards and hazardous events, this method will in a rough way estimate the related risks. In this respect the THDB is a useful tool. However, this application must be supplemented by a systematic use of experts with a thorough knowledge of the water supply system. It is expected that most water utilities have the competence to carry out a CRA with some assistance from risk analysts. Most of the other risk analysis methods, which are required in various situations besides the CRA, are rather complex, and these should most often be carried out by risk analysts in close cooperation with personnel from the water utility.

Table 14 gives an overview of some applications of risk analysis methods for several decision situations. In the column “analysis complexity” we use the following symbols:
H: High complexity of the analysis.
M: Medium complexity of the analysis.
L: Low complexity of the analysis.
This classification is given as an indication only, and we note that it refers mainly to modelling complexity. For instance FMECA is given complexity L. However, this method also requires extensive system knowledge.

11 3.1.5 6. recovery.2 3. Life cycle Decision / Purpose of analysis phase Method Name HAZOP/Hazid FMECA Removal efficiency Analysis Complexity M/L L H H M L/M L H M H H M L M Comments / Examples Section Select type of water treatment Hazards to water source/catchment area 6.3 Specification of treatment system For distribution only Establish monitoring system. Identify hazards / hazardous events for water source Prioritise risk reduction options Improve procedures Identify causes of failure events Consequences of undesired events Effect of risk influencing factors More complete picture of hazards/vunerabilitie Optimise water availability for consumers Causes of network failures E.9 6. animals.4 6.12 3. to investigate redundant systems E. Maintenance optimisation Identify threats and vulnerable points New buildings. of source Hazid Hazid/HAZOP Modifications / Life extension FTA RBD 78 . HAZOP/hazid Changes in environm.6 6.2 3. Overview of risk analysis methods. …. (primarily for treatment?) E.g.1 6.2 6.13 6.5 6. roads.Methods for risk analysis of drinking water systems from source to tap Table 14.g.g. 6.7 6.12 6. obtain substitute of delivery. (capacity.2.3 6.2 6. 6. New hazards appear? Identify “new” failure causes 6.2 6. …. food industry.5 6. to investigate redundant systems Analyse potential for human errors causing maloperation Analyse (effects of) microbial/chemical contaminations Plans for warning consumers.5. hospital.10 6. new threats. 3.2 6.1 3. Analyse hazardous events of construct. 
redundancy) Identification of control points Hazard identification CRA (HACCP) Hazid/HAZOP FMECA Plan for risk reduction/avoidance FTA RBD HRA QMRA/QCRA Develop emergency plans Could be based on CRA CRA HAZOP Production and/or construction Avoid construction work to pollute water source Protect against undesired events CRA (HACCP) HRA FTA L/M H H M H H H H M/L H M/L L L/M H M 3.7 6.6 Extend risk analyses to cope with specific problems ETA Bayes Network GIS Operation Changes in network capacity or Network model reliability FTA New (type of) users to be HAZOP/hazid connected Unreliable equipment observed Markov Security problems.1 Reliability of treatment systems 6.2 Design and development Select/design distribution Network model system. Primarily for source & treatment Identify need for risk reduction options Technical failures.8 6.2 3. etc.

A major problem in the performance of risk analyses is the scarcity or lack of relevant data, for instance regarding failure events. Generic data are not so useful and may not be available. Therefore water utilities should design their own database and record undesired events with causes and consequences, such that various failure probabilities and rates can be estimated. However, it would be useful if water utilities applied a similar design of their databases and allowed exchange of data with other water utilities.


Cambridge University Press. L. Pettersson. DOE Handbook chemical process hazard analysis. Contingent Valuation: Controversies and Evidence. Risk valuation in selection of remedial strategies. T. Economic Concepts and Approaches.-O. Hokstad. Bondelind. J. A. Description of methods and examples (summary in English). L. 92 (2007) 433 . USDOE.. E. and Wiencke.. 19 (2001) 173-210. A. 2008.. Rosén. 2004. Cost-Benefit Analysis of Environmental Change. Risk modelling handbook... A. Grahn. Šašek... Contract no. S.. F. 2003. J. and Roser. Niewersch. J. Cambridge. F. Switzerland. A. World Health Organization (WHO). Sklet... P... 2005. Å. T..1. W. Evaluating Health Risks: An Economic Approach. Hokstad.. The Drinking Water Directive.. B. B. Runštuk. QMRA methodology. Soutukorva. A decision framework for risk management. D.. Johansson. P. 15.. N. M... Beuken. P. Tuhovčák. A View on risks.. 2001... T.. E. Pumann... Pettersson. S. Norberg. US_EPA.1 (2005). L.R.J. Norwegian Technology Centre. R. 2004. Røstum.. Guidelines for Preparing Economic Analyses. Deliverable Number D4. D. 1.. Freeman III. Törnqvist. D. IEC. IEC. National Academy Press: Washington DC. 1998.5g. Aven. J... Vinnem.. R. Guidelines for Drinking-water Quality.. 018320-02. P.. EU Council Directive 98/83/EC. J. C. 1997. Swedish Environmental Protection Agency. E. Report 5537. Sturm. and Reinoso. Hokstad. Cambridge University Press. Ashbolt. D.. S. J. I. Dependability management Part 3: Application guide. Kozisek. 1995... 2006. 1993. NRC... Swartz. Bosch. Sklet. Environmental and Resource Economics. Berenschot Process Management. Section 9: Risk analysis of technological systems. EPA 240-R-00-003. Johansson. Geneva. Back. J. Melin. van Mil.. Generic framework and methods for integrated risk management in water safety plans. Bondelind. Stockholm.. M. T. S.. R. Dijkzeul. J. Washington DC. National Research Council.. Gari. IEC 300-3-9: 1995. Reliability Engineering & System Safety..C... Kiefer. Eikebrokk. 
[4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] 81 . T. Åström. H. Technology enabled universal access to safe water Annex I "Description of work".. Norsok Z-013. Risk and emergency preparedness analysis. L.. Beuken. A.. J. S. Utrecht. P. MICRORISK. Kirchner. M. M.Methods for risk analysis of drinking water systems from source to tap 8 References [1] [2] [3] TECHNEAU. J. Kožíšek.. 2006. and Meade. Reinoso. Ručka. S. H.. T.. R. Rosén. P.-E. L. Cambridge. Washington D.. Rosén.. P. R. R. Rosén. 2000. Sturm. M. B... Kiefer.... Åström.. DOE-HDBK-1100-2004.-O. Machenbach. Lindhe. US Department of Energy. C. Flores. The Measurement of Environmental and Resource Values: Theory and Methods. Ball. P. Signor. : . N.. Smith. M. Capital maintenance: a good practice guide.. Lindhe. A. Petterson. 2007. Weyessa Gari. N.. L. Söderqvist. with application to the offsore oil and gas industry. A. Carson.. A. Røstum.448. Lindhe. T. EC. 1995. and van der Pennen. M. and Eklund. Leading Edge Asset Decisions Assessment (LEADA). Öfverström. . T. Thorsen. D. Valuing Ground Water. J. Water Asset Management International. Summary report: Risk assessment case studies.. Røstum. 2006... A. Resources for the Future. F. TECHNEAU.. J. E.

Lisbon. Sweden. Water Science and Technology: Water Supply 2. Safety of machinery . J. F.. A. 1310786000. E. C.1. Fault-tree analysis for integrated and probabilistic risk analysis of drinking-water systems. 2008. K. A guide to task analysis. Pettersson. Journal of Water and Health. L. R. T.. Chalmers. Norberg. CSNI Technical Opinion Papers No. and Bergstedt O. 2004. Lindhe A. Contract no. Bergstedt. and Norberg. S. Kirwan. E.S. Risebro. 5. Fault tree analysis of the causes of waterborne outbreaks. B. statistical methods. Albuquerque. 82 . 1983.R.. A guide to practical human reliability assessment. Human Reliability Analysis in Probabilistic Safety Assessment for Nuclear Power Plants. Ainsworth. and Guttmann. Hirschberg.. Rosén. 2002. Deliverable D4. and Charles. K. Johansson. K.1. and Allioux..J. Bondelind M.Guidance) (In Norwegian)... H. and Menaia. London. P.. and applications.4. R. L.. W. System reliability theory.. T. 1992. W.1. Gothenburg.. T. Integrated risk analysis from source to tap: Case study Göteborg. Økt sikkerhet og beredskap i vannforsyningen . International Standardization Organization. Kirwan. Rosén L. B. J. M... J. R. Rosa. SAND80-200. Démotier. SWECO VIAK AB. Y. London. TECHNEAU. J. A.. SLIM-MAUD: An approach to assessing human error probabilities using structured expert judgment. 2007. O. ISO 14121-1. 2008. Charles. M..-F. 2006. Démotier. A.. Pettersson..Risk assessment. Nordic Drinking Water Conference. 4.. 1984. 2008. D. Sweden. Norberg. G. A. and Hunter. S. In prep.. A. T. Dn oria. R.Methods for risk analysis of drinking water systems from source to tap [21] [22] [23] [24] [25] [26] [27] [28] [29] [30] [31] [32] [33] [34] [35] [36] Swartz... Åström. J.. B. K.. Lindhe. OECD. Risk assessment for drinking water production process. T. In Swedish. J. L. France.. Åström J.A. P. 2008. D. Footohi. A. 2006.. Wiley-Interscience. Sandia National Laboratories Statistics Computing and Human Factors Division.. Lyon. S. TECHNEAU. and Bondelind... Embrey. 
In proceeding of European Conference on System Dependability and Safety.. Part 1: Principles. Handbook of human reliability analysis with emphasis on nuclear power plant applications: Final report NUREG CR-1278. Odeh.. Identification and description of hazards for water supply systems .. Risk assessment of water quantity..2 and 4. Swain.. 1994. P. Laîné..1. K. and Rea.. Water Safety Plans: Global Experiences and Future Trends. Deliverable 4. Schön. and Bondelind.. K. M.Veiledning (Improved safety and emergency preparedness in water supply . P.. Humphreys. U. Nuclear Energy Agency. Rausand. 2004.. updated Version August 2008.. Norberg T. L... M. Andersson. J. Comparing raw water options to reach water safety targets using an integrated fault tree model.. International Water Association Conference. Odeh. Kirwan. New Jersey. Oslo. 55-63. Consulting report. O. No 3 (2002). Norwegian Food Safety Authority. O. H. Høyland. A. F. Models. Lindhe. Schön.. Pettersson T. Taylor & Francis. Gothenburg. 2008. L.. T.. Sweden. Oslo. 1 (2007) 1-18. M. Rosén.5a... Bergstedt. and Steier. Risk assessment for drinking water production: assessing the potential risk due to the presence of Cryptosporidium oocysts in water..A catalogue of today's hazards and possible future hazards. L.. Osborn. Merdema. 4.1. Risk assessment case study – Göteborg. Schlosser. Taylor & Francis. Rosén. Göteborg... Lindhe. E. NFSA. J.. Rosén.. Pettersson. Department of Energy. Åström. USA.

University of Bradford. Strutt. N. Quantitative Microbial Risk Assessment. and Pollard.. Linköping University. B. Blackman.. v. Risicoanalysemethoden ten behoeve van infrastructuur voor drinkwaterdistributie. Accessed on 2008-0429.microrisk. Journal AWWA. S. Netica. Wiel. J. 2006. C. P. Literature review. Marble. J. T.shtml>. 94.. guidelines for using Bayesian networks to support the planning and management of development programmes in the water sector and beyond. B. John and Wiley & Sons. Niewersch. NUREG/CR-6883.. Medema. Kjaerulff. Probabilistic networks for practitioners – A guide to construction and analysis of Bayesian networks and influence diagrams. Hollnagel..com. Hijnen.. HEART . L. S. 95 [5]. and Bonnet. D. and Wintgens. MacGillivray.. Environmental Science and Technology. Cara.79. 83 . G. 2000.. MERMOS: EdF’s new advanced HRA method.. 2001.. Bieder. Hamilton. and Grabinsky.. J. Critical Reviews. Wallingford.Methods for risk analysis of drinking water systems from source to tap [37] [38] [39] [40] [41] [42] [43] [44] [45] [46] [47] [48] [49] [50] [51] [52] [53] [54] [55] Williams. J. C. Rietveld. MacGillivray. J. 1999. Planning improvements in natural resources management.S.104.com/publish/cat_index_6. P. Pollard. Linköping. D. New York City. T. Assessing intrusion susceptibility in distribution systems. Netica Helpfile. Applying GIS to a water main corrosion study. Medema. Nuclear Regulatory Commission. 90-104. Risk analysis and management in the water utility sector. New York. Crowmarch Gilford. The 4th international conference on probabilistic safety assessment and management (PSAM 4). USA. E. 2006. 2008.norsys. 82. Cognitive reliability and error analysis method : CREAM. Application of risk assessment in the drinking water sector. The SPAR-H Human Reliability Analysis Method. J. Haas.. and Hrudey. C. The Microrisk consortium.microrisk.a proposed method for assessing and reducing human error. B.. Available: <http://www. L. 
Risk Analysis Strategies in the Water Utility Sector: An Inventory of Applications for Better and More Credible Decision Making. " Available: <http://www.. MicroRisk. Oxford. QMRA: its value for risk management. 6 (2002) 66 . J. G.. and Stenström. Delft University of Technology. and Madsen. C. 95. Aalborg University.. Hamilton.. Strutt. The 9th Advances in Reliability Technology Symposium. Byers. and Smith... internal report TECHNEAU. Desmares. 2007. Microrisk. Stochastic modelling of drinking water treatment in quantitative microbial risk assessment.. U.. G.. Delft.. P.. 1996. Center for Ecology & Hydrology. Inc. Idaho National Laboratory. Gertman. 1998. U. Kirchner. Accessed on 2008-04-29. 5 (2003) 90 . Microbiological risk assessment: a scientific basis for managing drinking water safety from source to tap...pdf>. R. Microbial risk assessment and its implications for risk management in urban water systems. H.. C. N. B6 (2004) 453-462. d. Netica version 3. Technical Basis and Implementation Guidelines for A Technique for Human Event Analysis (ATHEANA). W.com/uploads/microrisk_value_of_qmra_for_risk_manage ment.. L. Le Bot. TNO-D-R0355/A.. (2006) 85-139. T. Delft. 2004. Westrell. and Gerba. and Buchberger. 2007. Doyle. A. ISBN 0903741009. www.. S.25. NRC. J. J..-A. F. P.. Caine. S. B. Lindley.. 1998. NUREG-1624. Smeets. E. and Ashbolt. Journal AWWA. Process Safety and Environmental Protection. 2006. 2006. W. M. T. Elsevier. 2005.

G.-L. 2009. M.. Assessing Groundwater contamination risk using ArcInfo via GRID function http://gis. 2009. Applying GIS to assess the vulnerability of the Päijänne waterconveyance tunnel in Finland. Report no. Deutsche Fassung EN 60812:2006. Risk assessment case study – Amsterdam.. M-L... Essink-Bot. 46. 8 (2000) 1241-1247. Kiefer J. Analysetechniken für die Funktionsfähigkeit von Systemen Verfahren für die Fehlzustandsart. The PI method . Törnqvist M.5f. Rosén. Geol. South Africa. The Netherlands. S. GIS techniques for mapping groundwater contamination risk. 2007. Sturm. edited by Zwahlen F. European Safety and Reliability Association and Society for Risk Analysis Europe Conference. 3 (2000) 157-166. Vietnam.. International Burden of Disease Network (IBDN). M. TECHNEAU.. 84 . Geneva... Norway. N. P. Risk analysis (In Norwegian: Risikoanalyse). World Health Organization.5b.. Final Report. (2008).. Runštuk J. M. T..: Vulnerability and Risk Mapping for the Protection of Carbonate (Karst) Aquifers. Ručka J.5d.. C..1. Reinoso M. Final Report . M. TECHNEAU. Öfverström B.. Hydrogeology Journal. Vietnam. V. F. Klute. 2004.. A. Disability weights for diseases: a modified proyocol and results for a Western European region.5e. Risk mapping. (GBD 2000 project) 2000. Goldscheider. and its first application in a tropical karst area. Nguyet. Hötzl. 2004. 2005. Cost Action 620. M. Risk assessment case study – Freiburg-Ebnet.294. 14 (2006) 1666-1675. Global Burden of Disease (GBD) Study.. Rausand. Risk assessment case study – Upper Mnyameni. and Swartz C... D.. and Ball T. Beuken R. Hötzl. Kožíšek F. J. Meerkerk M. and Papírník V.und auswirkungsanalyse (FMEA) (IEC 60812:2006. Risk assessment case study – Březnice. D 4.. Report no. M. Risk assessment case study – Bergen. Ducci. TECHNEAU. and Hötzl. Added value in fault tree analyses. and Goldscheider. 2006.1. A simplified methodology for mapping groundwater vulnerability and contamination risk.. 
Vulnerability and Risk Mapping for the Protection of Carbonate (Karst) Aquifers. Z. A. C. and Sappa. and Mesman G. Bonsel.. C.. 2004. Environmental Geology. Delporte. OREDA. A. Springer-Verlag (2007). SINTEF.J. Valencia. D 4. Šašek J.. Neukum. Journal of Public Health..G.esri. Malik. Water Safety Plans – Managing drinking-water quality from catchment to consumer. 2008.. Norberg. P. D.Methods for risk analysis of drinking water systems from source to tap [56] [57] [58] [59] [60] [61] [62] [63] [64] [65] [66] [67] [68] [69] [70] [71] [72] [73] [74] [75] [76] Lipponen. American Journal of Public Health. Eur. De Ketelaere. 90. 113-120.5c. Melse. Pumann P. Røstum J.N. Natural Hazards 20. angew.1. Cost Action 620. Germany. Cost Action 620. H. Offshore Reliability Data.. and Eikebrokk B.com/library. N. Kramers. 2008.. TECHNEAU. Deliverable 4.. and Lindhe... Neukum. M. Final Report. Tapir Forlag.. Civita. L.. Liesch. 2-3 (1999) 279 . and Svasta.1. Czech Republic... Weyessa Gari D. 2002. G. E. J. T. Hoeymans. Civita. Bosch A.. Essink-Bot. A National Burden of Disease Calculation: Dutch Disability-Adjusted Life-Years. Deliverable 4.. 1991. Zwahlen. TECHNEAU. M.a GIS-based approach to mapping groundwater vulnerability with special consideration of karst aquifers.1. 2008. Tuhovčák L. Sturm S. DIN EN 60812. Stouthart... H. Vulnerability and Risk Mapping for the Protection of Carbonate (Karst) Aquifers. H. and De Maio. WHO. 10 (2000) 24-30. Hazard Analysis and Mapping. Deliverable 4. N.

2006. Analysis techniques for dependability . Analysis techniques for system reliability . IEC 60812. 1991.pdf 85 . Application of Markov techniques. BS5760-5: 1991 .Reliability of systems.sintef. Guide to failure modes. Description of technical tools for failure forecasting and network reliability (WP2) (Deliverable D3). equipment and components. IEC 61078. CARE-W. Recommended Failure Modes and Effects Analysis (FMEA) Practices for Non-Automobile Applications.Procedures for failure mode and effect analysis (FMEA). 2006. 2005. IEC 61165. http://www. 2006. effects and criticality analysis (FMEA and FMECA).Reliability block diagram and boolean methods.Methods for risk analysis of drinking water systems from source to tap [77] [78] [79] [80] [81] [82] SAE ARP 5580.no/upload/24472/D03%20Models_Description.


Appendix A: Main steps of a risk analysis

There are three main steps of a risk analysis, cf. Section 1.3. Some details of these steps are given in Figure 29, and references to relevant parts of the report are also given.

1. Definition of scope of analysis
In defining the scope of the risk analysis, the following steps should be included:
a. Select a team to involve in the analysis, including both water utility experts and outside professionals (e.g. risk analysts), and organise the working process (Section 2.1).
b. Define the scope of the risk analysis (Sections 2.1, 2.2):
   • Why is it carried out? Is it aimed at a limited part of the system, or is it an integrated risk analysis in concordance with the Water Safety Plan, comprising a source to tap approach?
   • Are there restrictions related to scope, for instance with respect to
     - hazardous events to consider (e.g. exclude terrorist acts?), or
     - types of consequences to investigate (e.g. only consider risks to water quality)?
   • Which dimensions of risk shall be treated? (Section 4.1).
c. Describe the system to be analysed and its main functions. Specify system boundaries (Section 2.3).

2. Identify hazards and hazardous events
Should include:
a. Collect available data on hazards/hazardous events (Section 3.1):
   • Generic data, e.g. using THDB (Section 3.1.2), and
   • Site specific data (i.e. “experience from the past”).
b. Perform an expert session to identify a list of site specific hazards and hazardous events:
   • Brainstorming and/or use of checklists (Section 3.1.1).
c. Documentation of results (Section 3.3 gives example).

Main steps of a risk analysis:

1. Define scope
   1a. Select team to involve; organise working process.
   1b. Define scope of risk analysis.
       - Why is it carried out?
       - Are there restrictions, e.g. on type of hazardous events, or type of consequences to investigate?
   1c. Describe system, subsystem and main functions. Give system boundaries.
   Main references: Section 2.1 Initiation and organisation; Section 2.2 Decision situations; Section 2.3 System description.

2. Identify hazards and hazardous events
   2a. Collect available event data.
   2b. Perform expert sessions to identify site specific hazards/events.
   2c. Document results (form).
   Main references: Section 2.4 Hazardous events; Section 3.1 Identification of hazardous events.

3. Risk estimation
   3a. Identify safety barriers: assess causes and consequences of hazardous events.
   3b. Decide on qualitative (semi-quantitative) or quantitative analysis.
   3c. Estimate risk, either by
       - Qualitative analysis (using risk matrix), or
       - Quantitative analysis.
   Main references: Section 2.5 Safety barriers, causes and consequences; Section 2.6 Risk estimation; Section 3.2 Risk estimation in CRA; Chapter 4 Quantification of risk; Chapter 5 Data for risk analysis.

Figure 29. Main steps of a risk analysis.

3. Risk estimation
Should include:
a. Qualitative analysis: Identification of safety barriers. Specifying causes and possible consequences of hazardous events (Section 2.5).
b. Decision on whether to carry out a qualitative (semi-quantitative) or quantitative analysis (Section 4.2).
c. Perform risk estimation, either
   - Qualitative (semi-quantitative): use of risk matrix (Section 2.6, Section 3.2), or
   - Quantitative (Section 2.6, Sections 4.3-4.5). Data requirements: Chapter 5.

Note that the main references above refer to semi-quantitative risk analyses (covering the entire system), which are less advanced and can be carried out without high competence within risk analysis. However, in addition to this, Chapter 6 describes a number of more advanced/detailed risk analysis methods. Several of these are (also) quantitative, and risk quantification is described in Chapters 4 and 5.


Appendix B. DALY and a generalisation

This section focuses on DALY, which is the measure for health effects suggested by WHO. In TECHNEAU the DALY concept has been further developed by also including quantity aspects.

Appendix B.1. DALY – An overall risk measure of health effects

Disability adjusted life years (DALY)
In their Guidelines for Drinking-water Quality, WHO [8] applies the Disability-Adjusted Life-Years (DALYs) as a risk measure. The basic principle of the DALY is to weight each health effect for its severity from 0 (normal good health) to 1 (death). The DALY measure combines information on
1. years of life lost due to premature death
2. years “lost” due to disability and other non-fatal consequences, i.e. years in states of less than full health.
So, the DALY combines in one measure the time lost due to premature mortality and the time lived with disability. Thus, one DALY can be thought of as one lost year of ‘healthy’ life.

The DALY is calculated for a specific population, related to a specific disease or health condition or, in our case, to a specific event related to water safety. The DALY for the population in question is calculated as the sum of
1. the years of life lost due to premature mortality (YLL) in the population, and
2. the ‘equivalent years lost’ due to disability (YLD) of the health condition.
That is

DALY = YLL + YLD

DALY could in principle be calculated for each undesired (hazardous) event of a water utility. By adding the DALY of all relevant hazardous events, we get the overall DALY, which is then a measure of the risk related to the water quality for this utility. However, there would hardly be data available to carry out such an exercise. The DALY approach is therefore most likely to be used at national level and not on water utility/company scale.

Years of Life Lost (YLL)
The years of life lost (YLL) basically corresponds to the number of deaths multiplied by the standard life expectancy at the age at which death occurs. In order to calculate the YLL, we may split the population according to factors like age (groups) and sex. The basic formula for YLL (for a given cause) is the following:

YLL = N x L

where
N = number of deaths
L = standard life expectancy minus age of death in years.
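As a minimal sketch of the YLL formula, the snippet below sums N x L over population groups, as suggested by the remark that the population may be split by age group and sex. The class name, the group labels and all numbers are hypothetical and only serve as an illustration; they are not taken from any TECHNEAU case study.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    """One population group (e.g. an age group and sex) for a given cause."""
    label: str
    deaths: int                  # N: number of deaths in the group
    years_lost_per_death: float  # L: standard life expectancy minus age at death


def years_of_life_lost(strata):
    """YLL = sum over all groups of N x L."""
    return sum(s.deaths * s.years_lost_per_death for s in strata)


# Hypothetical figures, for illustration only.
strata = [
    Stratum("children 0-4", deaths=1, years_lost_per_death=78.0),
    Stratum("adults 65+", deaths=3, years_lost_per_death=12.0),
]
yll = years_of_life_lost(strata)
print(yll)  # 1 x 78 + 3 x 12 = 114.0
```

Keeping one record per group makes it easy to replace the standard life expectancy by a country specific one without changing the calculation.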

Years Lived with Disability (YLD) (= Years of life lost due to disability)
To estimate YLD for a particular event, the number of incidents in that period is multiplied by the average duration of the disease/disability and a weight factor that reflects the severity of the disability on a scale from 0 (perfect health) to 1 (dead). The basic formula for YLD is the following:

YLD = I x DW x L

where
• I = number of affected persons
• DW = disability weight
• L = average duration of the case until remission or death (years)

So YLD is a measure of the burden of disability, given as a measurement of the gap between current health status (due to a hazardous event) and an ideal situation where everyone lives into old age free of disease and disability. There may of course be various contributions to YLD: Let I1 be the number of affected persons getting a disability corresponding to the weight DW1, with an average duration L1. Similarly, let I2 be the number of affected persons having disability weight DW2, with average duration L2. Then in total

YLD = I1 x DW1 x L1 + I2 x DW2 x L2

Disability Weights (DW)
A crucial point in this approach is the determination of the DW corresponding to various disabilities. A large amount of work has been carried out to assess these, e.g. [69, 70]. As described there, a large number of disease stages were evaluated by panels of medical experts. The required distribution of the prevalence over the disease stages was obtained through consulting experts for each disorder. Then an average disability weight of a disease was calculated, to which the stage disability weights contributed according to their share in the disease prevalence. Some examples of DW given in [70] are:

Upper respiratory infections      DW = 0.003
Influenza                         DW = 0.01
Intestinal infectious diseases    DW = 0.02
Lower respiratory infections      DW = 0.04
STD (Bacterial only)              DW = 0.07
Contact eczema                    DW = 0.07
Inflammatory bowel disease        DW = 0.20
Tuberculosis                      DW = 0.23
Stomach cancer                    DW = 0.33
Parkinson’s disease               DW = 0.68

Such weights must necessarily be rather controversial. In effect, they imply a relative value of life (VOL), which always results in controversy. Here, the judgement is that 50 persons having an inflammatory bowel disease (DW = 0.2) of duration 1 year corresponds to one person losing 10 years of healthy life (in both cases DALY = 10).
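The YLD formula with several contributions can be sketched as follows. The function name is hypothetical. The first call reproduces the example of 50 persons with inflammatory bowel disease; the second contribution (1000 persons with an intestinal infection lasting 0.02 years) is an invented case count and duration used only to show how contributions add up.

```python
import math


def years_lived_with_disability(contributions):
    """YLD = sum of I x DW x L over (I, DW, L) tuples.

    I  = number of affected persons
    DW = disability weight (0 = perfect health, 1 = dead)
    L  = average duration in years
    """
    return sum(persons * weight * years for persons, weight, years in contributions)


# 50 persons with inflammatory bowel disease (DW = 0.2) for 1 year.
yld_ibd = years_lived_with_disability([(50, 0.2, 1.0)])

# Adding a second, hypothetical contribution: 1000 persons with an
# intestinal infectious disease (DW = 0.02) lasting 0.02 years each.
yld_total = years_lived_with_disability([(50, 0.2, 1.0), (1000, 0.02, 0.02)])

# One person losing 10 years of healthy life gives the same burden
# as the 50 persons with inflammatory bowel disease.
yll_one_person = 1 * 10.0
print(math.isclose(yld_ibd, yll_one_person))  # True
```

The comparison at the end makes the value judgement behind the weights explicit: the two situations count as the same burden only because DW = 0.2 has been accepted.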

Inflammatory bowel disease is a serious disease; it is not just “normal” diarrhoea (i.e. intestinal infectious diseases, with DW 0.02, 10 times lower than inflammatory bowel disease). So the weighting is a rational (not emotional) simplification, but not nonsense.

TECHNEAU use of DALY
An important input to DALY is life expectancy (expected time until death at various ages). Note that WHO explicitly built egalitarian principles into the DALY, and the Global Burden of Disease (GBD) Study [71] used the same values for all regions of the world. The WHO has an objective e.g. to identify main health problems and then make priorities worldwide. Therefore an egalitarian principle with a fixed life expectancy is appropriate. However, when a water utility (national authority) is carrying out the analysis, one should rather apply the life expectancy of the relevant country (or the public to which the utility delivers water).

The GBD study also used a 3% time discounting and non-uniform age weights (i.e. giving less weight to years lived at young and older ages). The use of discounting has been somewhat controversial even in WHO applications. So in order to keep the approach as simple as possible, we suggest not to apply such discounting (corresponding to the formulas given above). We also suggest that all ages have equal weight, but age and sex was taken into consideration when lost years of healthy life is calculated [69].

Consequently, if DALY is used in such a study, it is suggested that the values used both for the life expectancy and the DW are adapted to the country in question. Values of DW used for the most typical diseases caused by drinking water could be collected. Still it can be hard to apply the method for a water utility, in particular choosing appropriate weights.

Note that the objective of using DALY in a risk assessment study within a TECHNEAU framework will not be the same as in the global WHO studies. When a water utility shall carry out a risk assessment, the objective would rather be to identify the specific hazards and to prioritise risk reducing measures relevant for this specific water utility. However, the question is how applicable DALY is for the drinking water producers. Most likely, DALY would not be used as a routine tool in risk assessment of water supply made by a utility, due to difficulties in communicating DALY and in weighting different types of consequences.

Appendix B.2. Generalisation of DALY. Combined measure for water quantity and quality

For the consumer it is important both to have water of good quality and enough water. We could now extend the main idea behind DALY to include also loss of water quantity, i.e.:
• Water is known to be seriously polluted, and can not be used for drinking
• Water delivery is interrupted for some time
The risk measures on quantity are typically calculated for the distribution network. Then the number of consumers affected (exposure) is estimated. A risk measure could now be obtained by adding a term taking into account
• Years lived with water supply of bad quality (YLQ)
• Years lived without water supply (YLS).

These parameters are in their simplest form obtained by multiplying the number of consumers affected by the duration (number of years) of the bad supply/interruption, together with some weighting factor. That is

YLQ = J x QW x LQ
YLS = K x SW x LS

where
J  = Number of people receiving water of bad quality (undrinkable)
QW = Weight of inconvenience of receiving bad water
LQ = Duration of delivery of bad water (in years)
K  = Number of people not receiving water
SW = Weight of inconvenience of not receiving water
LS = Duration of lack of delivery of water (in years)

Also LQ and LS should be given in years, in order to make the measure compatible with DALY. In this way we can combine all losses to consumers in one single measure. We could also include the adverse effects for industries dependent on water delivery, consumer trust etc. Such a measure as indicated here would of course have to be specified further.

To give a simple numerical example: Assume a total of K = 500 000 people do not receive water during a period of LS = 0.02 years (approximately 1 week). Assume (as an illustration) the weight SW = 0.0003 (see footnote 8). Then

YLS = 500 000 x 0.0003 x 0.02 = 3

If we shall combine this with DALY to give an overall measure of risk, the main problem is the weight, SW. But if it is agreed that a weight SW = 0.0003 is comparable with the weights DW, we could add this YLS to DALY, and get a total measure of risk.

However, this is not pursued here, as the use of DALY is probably not very relevant for a water utility: From the point of view of the water works, the risk is to fail national or internal standards (water quality and quantity). Water quality standards set up by health authorities are already existing health based targets, expressed in quality standards, so it is not the water utilities’ duty to calculate these values. TECHNEAU should support the end user (i.e. the water utility) to assess the risk of failing its own or legal requirements.

Footnote 8: This is not a recommended value, but just used as an example.
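The numerical example above can be checked with a few lines of code. The function name is hypothetical; K, SW and LS are the illustrative values from the text (SW is explicitly not a recommended value), while the YLQ inputs J, QW and LQ are invented here purely to show how the two terms would be added.

```python
import math


def weighted_years(people, weight, duration_years):
    """Common form of YLQ = J x QW x LQ and YLS = K x SW x LS."""
    return people * weight * duration_years


# Values from the text: 500 000 people without water for 0.02 years,
# with the illustrative (not recommended) weight SW = 0.0003.
yls = weighted_years(people=500_000, weight=0.0003, duration_years=0.02)

# Hypothetical YLQ inputs, for illustration only.
ylq = weighted_years(people=20_000, weight=0.0001, duration_years=0.01)

# The terms are only comparable if durations are in years and the
# weights are accepted as comparable with the disability weights DW.
combined = ylq + yls
print(round(yls, 6), round(combined, 6))
```

Expressing an interruption of one week as 7/365 years gives about 0.019, which the text rounds to 0.02.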

Appendix C. Two examples of FTA

Fault Tree Analysis (FTA) was discussed in Section 6.5. We here give two examples of a fault tree analysis: first an analysis of one specific system (UV treatment), and then an overall system analysis.

C.1 A Fault Tree Analysis of an UV system
The first example is taken from the Bergen Case (Section 3.3). Here an FTA was applied to analyse the UV-system, which is one of the treatment systems. The top event (i.e. the “unwanted event”) is defined as ”UV-system delivers water that is not treated according to requirements”. The fault tree is illustrated in Figure 30. For this top event to occur we must have that both:
• UV applies a too low dose, and
• Automatic shut-down fails (there are sensors that shall activate a valve to shut down water production if UV-dose is too low)
This is shown by an AND-gate directly above these two events. These two events can then be evaluated further, as indicated in the figure.

Figure 30. Fault tree for the top event ”UV-system delivers water that is not treated according to requirements”.

Too low UV dose can occur if either:
• Sensors measuring the UV-dose give too high values, or

• Water flow is too high (so that water does not stay long enough to get sufficient UV radiation), or
• UV intensity is too low.
These three events can then be evaluated further, as indicated in the figure. The automatic shut down fails if either:
• Sensor measures a too high UV-intensity, or
• Control system fails to activate shut-down valve, or
• Shut-down fails to shut down when required by sensor/control system, or
• Shut-down system is disabled by operator (e.g. after a wash).
Also these events could be further developed (not given in fault tree). We observe that “Sensor(s) giving too high values” can result in both “Too low UV dose applied” and “Automatic shut down fails”. Therefore it is realised that the sensors appear to be a critical component.

Various software allow us to analyse this fault tree further, both
• A qualitative analysis: Find minimum combinations of events that result in the top event (“Cut sets”).
• A quantitative analysis: Assessing probability of top event.
This fault tree can be further analysed to give the “cut sets”, i.e. combinations of the basic events that result in the top event. This is helpful to identify the most critical failures.

C.2 Integrated risk analysis: Fault-tree analysis to investigate causes of failures
This section shortly presents an example (from the city of Göteborg, Sweden) of an integrated risk analysis of the water supply system, see [28, 66]. The entire system was considered, i.e. from source to tap, and water quantity as well as quality aspects were included. The analysis is further described by [31] and [30].

Method
To carry out the analysis an integrated and probabilistic fault tree method was used. The method is based on the fault tree technique, and a Markovian approach was applied to be able to calculate not only the probability of failure, but also the failure rate and mean down time of the system. The main failure event studied in the analysis was supply failure, defined as including: (1) quantity failure, i.e. no water is delivered to the consumer; and (2) quality failure, i.e. water is delivered, but unfit for human consumption according to existing water-quality standards. By including the number of people affected by different events in the fault tree, the risk levels could be calculated in terms of Customer Minutes Lost (CML). The estimated levels of risk were compared to politically established performance targets that can be considered as acceptable levels of risk.

The drinking-water system was modelled as a supply chain composed of the following main sub-systems: raw water, treatment and distribution. Failure events in any of the three sub-systems may cause supply failure. However, the system has an inherent ability to compensate for failure. For example, failure of the treatment plant to produce drinking water may be compensated for by reservoirs in the distribution system.

To construct the fault tree four types of logic gates were used. The logic gates illustrate the interaction between different events. The following logic gates were used:
• OR-gate: only one of the input events has to occur to cause system failure.
• AND-gate: all input events have to occur to cause the system to fail.
• 1st variant of AND-gate: one or several events may compensate for one failure event during a limited time period.
• 2nd variant of AND-gate: similar to the 1st variant, but with the important difference that the compensating event may recover and start to compensate when it has failed.

In Figure 31 a schematic fault tree structure including the main events is illustrated. The figure shows that the drinking water system was divided into its three main sub-systems. The distribution was assumed to not be able to compensate for quality failure in previous sub-systems, but has been included here to illustrate the general thinking. All events were further developed in the analysis using also the traditional AND-gate and the second variant of it. The entire fault tree for the Göteborg system included 116 basic events, 100 intermediate events and 101 logic gates.

[Figure 31: fault tree diagram. Top event: Supply failure, with the branches Raw water failure, Treatment failure and Distribution failure. Each branch is split into a quantity failure (Q = 0) and a quality failure (Q > 0, C'); raw water and treatment failures are combined with the events “Treatment fails to compensate” and “Distribution fails to compensate”. Legend: OR-gate; 1st variant of AND-gate; Q = flow (Q = 0 no water is delivered to the consumer, Q > 0 water is delivered); C' = the drinking water does not comply with existing water-quality standards.]

Figure 31. Schematic fault tree including the main events.
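The traditional OR- and AND-gates listed above can be illustrated on the small UV fault tree of Section C.1. The sketch below derives the minimal cut sets and an exact top-event probability. It is not the software used in the case studies: it covers only the two traditional gates (not the compensating AND-gate variants), treats the two sensor events as one shared basic event, assumes independent basic events, and uses event names and probabilities that are purely hypothetical.

```python
import math
from itertools import product

OR, AND = "OR", "AND"

# A gate is (type, child, child, ...); a basic event is a string.
UV_TREE = (AND,
           (OR, "sensor_reads_too_high", "flow_too_high", "uv_intensity_too_low"),
           (OR, "sensor_reads_too_high", "control_fails", "valve_fails",
            "disabled_by_operator"))


def cut_sets(node):
    """All cut sets of a node: union for OR, cross-combination for AND."""
    if isinstance(node, str):
        return [frozenset([node])]
    gate, *children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == OR:
        return [cs for sets in child_sets for cs in sets]
    return [frozenset().union(*combo) for combo in product(*child_sets)]


def minimal_cut_sets(node):
    """Drop every cut set that strictly contains another cut set."""
    unique = set(cut_sets(node))
    keep = [s for s in unique if not any(other < s for other in unique)]
    return sorted(keep, key=lambda s: (len(s), sorted(s)))


def occurs(node, state):
    """Evaluate the tree for one true/false assignment of the basic events."""
    if isinstance(node, str):
        return state[node]
    gate, *children = node
    results = [occurs(child, state) for child in children]
    return any(results) if gate == OR else all(results)


def top_probability(node, p):
    """Exact top-event probability by enumerating all basic-event states."""
    names = sorted(p)
    total = 0.0
    for values in product([False, True], repeat=len(names)):
        state = dict(zip(names, values))
        if occurs(node, state):
            total += math.prod(p[n] if state[n] else 1 - p[n] for n in names)
    return total


# Hypothetical probabilities, for illustration only.
P = {"sensor_reads_too_high": 0.01, "flow_too_high": 0.02,
     "uv_intensity_too_low": 0.05, "control_fails": 0.01,
     "valve_fails": 0.02, "disabled_by_operator": 0.005}

mcs = minimal_cut_sets(UV_TREE)
p_top = top_probability(UV_TREE, P)
for cs in mcs:
    print(sorted(cs))
```

The only cut set with a single event is the sensor failure, which matches the observation in Section C.1 that the sensors appear to be a critical component.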

By using a Markovian approach the probability of failure as well as the failure rate and mean down time of the system could be calculated for the top event as well as for all intermediate events. By defining the number of people affected by different main types of events in the fault tree, for quantity and quality failure, the risk levels could be calculated in terms of CML. Both the risk levels related to quantity and quality failures were expressed using CML. However, they were presented separately in order to retain transparency. Since a probabilistic approach was used, uncertainties were considered by defining all estimates by means of probability distributions. The calculations were performed using Monte Carlo simulations. Hence, the uncertainties of the results could be analysed.

Analysis procedure
To facilitate the analysis work a team of water utility personnel and researchers, i.e. people with different knowledge about the system and the risk analysis method, was set up. Analysing the drinking water system by means of the fault tree method was an iterative process. The main analysis steps were: scope definition, system description, hazard identification, fault-tree construction, evaluation of available data, expert judgements, risk estimation, uncertainty analysis and evaluation of results. The input data to the fault tree model was based on hard data (e.g. measurements and statistics on events), expert judgments and combinations of these.

Results
For the top event as well as all intermediate events the fault tree analysis provides information on the probability of failure, failure rate and mean down time. Also quantity and quality related CML is estimated for the entire system as well as its three main sub-systems. All parameters are estimated as probability distributions, which means that information on the uncertainties is provided. The CML illustrates the level of risk, and the probability of failure, failure rate and mean down time provide information on the dynamic behaviour of the system. Two subsystems may cause the same number of CML, but have different failure rate and mean down time, and affect a different number of people.

Some parts of the results were compared to performance targets defined by the City of Göteborg. These targets were considered as acceptable levels of risk and one of the targets was: Duration of interruption in delivery to the average consumer shall, irrespective of the reason, totally be less than 10 days in 100 years. This criterion was translated to 144 annual CML for the average consumer. When compared to the results of the fault tree analysis (mean value: 608 annual CML) it was concluded that the probability of exceeding the criterion was 0.84.
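The translation of the target into annual CML, and the comparison of a Monte Carlo result with it, can be sketched as follows. The conversion of 10 days in 100 years into 144 minutes per year follows directly from the text. The uncertainty distribution is an assumption: a lognormal shape with parameters chosen only so that its mean is close to 608 annual CML, since the source does not report the distribution of the Göteborg result. The sketch therefore does not reproduce the reported probability of 0.84.

```python
import math
import random

MINUTES_PER_DAY = 24 * 60


def target_annual_cml(days, years):
    """Convert 'at most `days` of interruption in `years` years' to minutes per year."""
    return days * MINUTES_PER_DAY / years


def exceedance_probability(samples, criterion):
    """Share of Monte Carlo samples that exceed the criterion."""
    return sum(1 for x in samples if x > criterion) / len(samples)


criterion = target_annual_cml(days=10, years=100)
print(criterion)  # 144.0

# Hypothetical uncertainty distribution for annual CML (lognormal shape and
# spread are assumptions; mu is set so that the mean is about 608).
sigma = 1.0
mu = math.log(608) - sigma ** 2 / 2
rng = random.Random(1)  # fixed seed for repeatability
samples = [rng.lognormvariate(mu, sigma) for _ in range(20_000)]

p_exceed = exceedance_probability(samples, criterion)
print(round(p_exceed, 2))
```

Reporting the exceedance probability rather than only the mean keeps the information on uncertainty that the probabilistic approach was meant to provide.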

Appendix D. Procedure and example of FMECA

Scope and Overview
The Failure Modes and Effects Analysis (FMEA) and the Failure Modes, Effects and Criticality Analysis (FMECA) are both risk analysis techniques used to identify possible failure modes with their causes and effects on a system, as well as measures to avoid or reduce the failures. A system can be software, hardware or a process. FMEA/FMECA can also be used to identify human failure modes and effects. The FMEA also classifies the severity of the effects of a certain failure mode and estimates the probability of occurrence of that failure mode, but the two are not combined into a measure of risk. The FMECA is a further development of the FMEA and additionally contains the criticality analysis, with a combined measure of risk, in order to evaluate the risks and prioritise the risk reduction options. Therefore the FMEA is a qualitative technique, while the FMECA can be qualitative, semiquantitative or quantitative, depending on the kind of data used.

The application of FMEA/FMECA is advantageous in an early design phase of a process or product, since reducing failures at this stage is cost efficient. The techniques can also be applied during the normal operation phase of a process. Qualitative as well as quantitative results from an FMEA/FMECA can be an input to other analysis techniques such as fault tree analysis.

Procedure
- Making a reliability plan, containing the definition of the analysed system, the specific purposes of the analysis, its scope and objectives as well as all conducted actions and measures.
- Assembling a team which is sufficiently qualified to identify and assess the effects of events or failures and their severity.
- Definition and comprehension of the analysed system, including all information about characteristics, performances, roles and functions of all considered system elements at all system levels, the logical connections between the elements, their redundancy, the inputs and outputs of the system as well as the set-up of the system with the corresponding operating conditions.
- Breaking down the system into its elements and structuring it from the highest to the lowest system level with the help of block diagrams (IEC 61078). The system structure with the functional connections between elements, input and output information and redundancies should be illustrated in the block diagram in order to reconstruct function failures. The FMEA/FMECA always begins on the lowest level. This consideration of the system levels is important because the effect of one failure mode on a lower level can be the cause of a failure mode at the next or highest level (Figure 32).

Figure 32. Block diagram for the connections between failure modes and effects in a system [72]. (The diagram shows four levels: the system with subsystems 1 to 5; subsystem 4 with modules 1 to 4; module 3 with parts 1 to 5; part 2 with failure modes 1 to 3; and causes 1 to 3 for failure mode 3 of part 2.)

For instance: Causes 1 and 2 lead to the effect that failure mode 3 occurs. Failure mode 3 causes the failure of part 2. The effect of the failure of part 2 is the failure of module 3. The failure of module 3 in turn is the cause of the failure of subsystem 4.
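The way an effect on one level becomes the cause of a failure on the next level can be sketched as a simple walk up the hierarchy. The dictionary below is a hypothetical encoding of the chain described for Figure 32; the names `PARENT` and `effect_chain` are illustrative and not part of the FMEA/FMECA procedure itself.

```python
# Hypothetical "effect of" links mirroring the chain described for Figure 32.
PARENT = {
    "Cause 1": "Failure mode 3",
    "Cause 2": "Failure mode 3",
    "Failure mode 3": "Part 2",
    "Part 2": "Module 3",
    "Module 3": "Subsystem 4",
    "Subsystem 4": "System",
}


def effect_chain(start, parent=PARENT):
    """Follow the links upwards until the highest system level is reached."""
    chain = [start]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


print(" -> ".join(effect_chain("Cause 1")))
# Cause 1 -> Failure mode 3 -> Part 2 -> Module 3 -> Subsystem 4 -> System
```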

On each level, the effect of a failure mode on the next higher level should be assessed.

Analysis (Figure 33)
1. Selection of a component of the system for the analysis.
2. Identification of all failure modes for the selected component. Very general categories of failure modes could for instance be the following: failure during operation, failure to operate at a prescribed time, failure to cease operation at a prescribed time, premature operation. More detailed failure modes can be identified by considering the function of the component, its performance specification and the stresses under appropriate test conditions.
3. Selection of the failure mode to be analysed.
4. Identification of the immediate effect and the possible final effect of the failure mode on a higher level. An effect is the consequence of one failure mode regarding the operation, the function or the status of a system. An effect occurs due to one or more failure modes of one or more components.
5. Determination of the severity of the final effect. The severity is the assessment of the significance of the effect on the system operation. The classification of the effects depends on the kind of FMEA implementation. It is often qualitative and similar to the risk matrices proposed by the WSP.
6. Identification of the potential causes of that failure mode. This step is important for the evaluation of the failure mode. It is not always useful to describe all causes; the level of detail is determined by the severity of the corresponding effects. The causes have to be independent of each other. They can be determined by expert opinions or by analysis of field failures and failures in test units.
7. Estimation of the frequency or probability that the failure mode occurs during the pre-determined period. The probability has to be estimated for a time period, for instance the warranty period or the predetermined life period. The probability can be estimated with the help of data from life testing of the components, from available databases containing failure rates, from field failure data and from data on failures of similar components. For an FMECA a special analysis is carried out to assess the criticality as the combination of the severity of an effect and the probability of occurrence. The procedure of an FMECA is described further below (see: The Criticality Analysis, in this Appendix).
8. Proposal of an appropriate risk reduction method, corrective measure or compensating provision if this failure mode has to be avoided or reduced.
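The eight analysis steps above can be pictured as filling in one worksheet row per failure mode. The following is a minimal sketch under stated assumptions: the field names, the 1 to 5 severity scale, the limits in `warrants_action` and the example entry are all hypothetical and only serve to show how the information from the steps fits together.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WorksheetEntry:
    component: str                 # step 1
    failure_mode: str              # steps 2 and 3
    immediate_effect: str          # step 4
    final_effect: str              # step 4
    severity: int                  # step 5, rank on an assumed 1-5 scale
    causes: List[str] = field(default_factory=list)  # step 6
    probability: float = 0.0       # step 7, for the pre-determined period
    risk_reduction: str = ""       # step 8, filled in only if action is needed


def warrants_action(entry, severity_limit=3, probability_limit=1e-3):
    """Assumed decision rule: act if either limit is reached."""
    return (entry.severity >= severity_limit
            or entry.probability >= probability_limit)


# Hypothetical worksheet row, for illustration only.
entry = WorksheetEntry(
    component="Pump",
    failure_mode="failure to operate at a prescribed time",
    immediate_effect="no feed flow",
    final_effect="interrupted supply",
    severity=4,
    causes=["power loss"],
    probability=1e-4,
)
if warrants_action(entry):
    entry.risk_reduction = "to be proposed by the analysis team"
print(entry.risk_reduction)
```

Whether severity, probability or both trigger an action is a choice of the analysis team; the rule used here is only one possibility.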

Figure 33. Procedure of the FMEA/FMECA analysis [72]. (Flow chart: initiate FMEA or FMECA of an item; select a component of the item to analyse; identify failure modes of the selected component; select the failure mode to analyse; identify the immediate effect and the final effect of the failure mode; determine the severity of the final effect; identify potential causes of that failure mode; estimate the frequency or probability of occurrence for the failure mode during the predetermined time period; decide whether severity and/or probability of occurrence warrant the need for action. If yes: propose mitigation method, corrective actions or compensating provisions, and identify actions and responsible personnel. Document notes, recommendations, actions and remarks. Repeat while there are more failure modes of the component and other components to analyse. Finally, complete the FMEA and determine the next revision date as appropriate.)

- Compilation of a report containing all analysed details of the system with the source of data, the assumptions made in the analysis, the procedure of the FMEA/FMECA with its elaborated system diagrams, worksheets and risk matrices, as well as all recommendations for further analyses, design changes etc.

- Review of the system by conducting a new FMECA to assess the failure modes after implementing the risk reduction options.

The Criticality Analysis
One possible quantitative measure in the Criticality Analysis is the Risk Priority Number (RPN), a calculation based on the equation of risk $R = S \cdot P$, where $S$ is the severity (consequence) and $P$ is the probability:

$RPN = S \cdot O \cdot D$

where
$O$ = probability of the occurrence of the failure during a given period
$D$ = an estimate for the time needed to detect the failure

If real data are available, the RPN is a quantitative measure; otherwise it is a semiquantitative calculation using ranks for each parameter. The failure mode with the highest RPN is the prioritised failure.

Another possible measure is the so-called Criticality Number, which includes a combination of the failure rate for one failure mode and the operation time of the system. Based on this measure, a special approach is to calculate the probability of occurrence of one failure mode with the following equation:

$P_i = 1 - e^{-C_i}$

where
$P_i$ = the probability of occurrence of failure mode $i$
$C_i$ = the Criticality Number for failure mode $i$

One way to illustrate the risk (criticality) is to combine severity and probability of occurrence of a failure mode in a matrix analogous to the risk matrix described in the WSP.

Example for the application of the FMEA procedure:
As an example, the risk assessment method FMEA is applied to a drinking water treatment system with the treatment steps roughing filters, flocculation, membrane filtration and disinfection. The first step of the FMEA is to build up a block diagram of the whole treatment system that should be analysed (Figure 34). In the following, the structure of such a block diagram for identifying all possible failure modes is shown for the subsystem membrane filtration with the chosen module membrane module. It has to be compiled for all subsystems and all modules. Subsequently one part of the chosen module has to be selected to start the identification of failure modes. In this example the part membrane capillary has been selected.
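The two measures from the Criticality Analysis can be sketched as follows. The product form of the Criticality Number (failure rate times operation time) is an assumption made here for illustration, since the text only states that the number combines the two quantities. The ranks assigned to the three membrane failure modes are hypothetical.

```python
import math


def risk_priority_number(severity, occurrence, detection):
    """RPN = S * O * D."""
    return severity * occurrence * detection


def criticality_number(failure_rate, operation_time):
    """Assumed form: failure rate of the mode times operation time."""
    return failure_rate * operation_time


def occurrence_probability(criticality):
    """P_i = 1 - exp(-C_i)."""
    return 1.0 - math.exp(-criticality)


# Hypothetical (S, O, D) ranks, for illustration only.
ranks = {
    "Fibre breakage": (4, 2, 3),
    "Membrane fouling": (2, 4, 1),
    "Membrane material imperfections": (3, 1, 4),
}
rpn = {mode: risk_priority_number(*sod) for mode, sod in ranks.items()}
prioritised = max(rpn, key=rpn.get)
print(rpn)          # RPN per failure mode
print(prioritised)  # failure mode with the highest RPN

# For small C_i the probability is close to C_i itself.
c = criticality_number(failure_rate=1e-6, operation_time=1000.0)
print(occurrence_probability(c))
```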

Figure 34. Compilation of a block diagram for an exemplary drinking water treatment process. (System: Treatment, with roughing filters, flocculation, membrane filtration (ultrafiltration) and disinfection. Subsystem 4: Membrane filtration, with feed pipe, pump, membrane modules, permeate tube and cleaning system. Module 3: Membrane module, with feed junction, resin sealing, membrane capillaries, permeate junction and module mantling. Part 3: Membrane capillaries.)

The identification of failure modes which can occur in membrane capillaries is conducted by using expert judgements and data available for the selected part. In this example three failure modes are identified (Table 15). The result of this step can be illustrated by integrating the failure modes in the block diagram (Figure 35).

Table 15. Identified failure modes for the part membrane capillaries.
Failure mode 1: Fibre breakage
Failure mode 2: Membrane fouling
Failure mode 3: Membrane material imperfections

Figure 35. Integration of the identified failure modes in the block diagram. (The block diagram of Figure 34, extended by failure modes 1 to 3 for the part membrane capillaries.)

In the next step one failure mode is chosen for further analysis. In this example failure mode 3, "membrane material imperfections", is analysed. The immediate and final effects are identified and the severity of the final consequences is determined. After the identification of all possible causes of the failure mode, the corresponding frequency or probability that the failure mode occurs during the pre-determined period is estimated. In the final block diagram (Figure 36) the identified causes are integrated. After this step a risk matrix can be applied in order to assess the risk that the failure mode occurs and causes harm. The risk matrix often combines the severity and the probability or frequency of a failure mode. Using the results of the risk matrix, the failure modes can be ranked according to their risk. Now it has to be determined which risks are acceptable. After that, risk reduction options, corrective actions or compensating provisions can be proposed and established for the failure modes with an unacceptable risk (Table 16). For a complete analysis all the above-mentioned steps have to be repeated for each part of the system and for each failure mode.

Table 16. Identification of causes of a failure mode, assessment of severity and probability of occurrence as well as proposal of risk reduction options.
Failure mode: Membrane material imperfections (failure mode 3)
Immediate effect: membrane capillaries have imperfections
Final effect: contaminants in the filtered water
Possible causes: failure in the manufacturing process; exceeding of pressure; aging; harmful chemicals in the water before the filtration
Estimation of severity: 3
Estimation of frequency: 0.1 × 10⁻⁶; 0.5 × 10⁻⁶; 1 × 10⁻⁶; 0.2 × 10⁻⁶
Result risk matrix:
Risk reduction options:

Figure 36. Integration of the identified failure mode causes in the block diagram. (The block diagram of Figure 35, extended by causes 1 to 4 for failure mode 3 of the part membrane capillaries.)

Appendix E. Analyses to establish treatment and monitoring system

During the design phase of a drinking water treatment plant, after the identification of hazards and hazardous events, there is a need to determine the requirements of the treatment system depending on the identified hazards. A possible approach to establish the treatment steps is a risk assessment method described in the literature [24, 25]. The approach is a comprehensive method for calculating the probability that water quality parameters exceed certain threshold values in an existing drinking water treatment and monitoring system. This method is also suitable for giving specifications for a treatment and monitoring system, with focus on assessment in the planning phase of a drinking water supply. In the following the procedure of this approach is described.

The basis is the definition of relevant parameters for the drinking water quality and corresponding threshold values that must not be exceeded in order to have a sufficient drinking water quality. Laîné et al. [24] mentioned 63 parameters to be taken into account in drinking water supply. The choice of parameters and values can, however, be made individually by the person responsible for the risk assessment. The thresholds can be based on national guideline values as well as on internal standards of the water utility.

For existing treatment systems, removal efficiencies are identified for each treatment step for normal operation (= nominal mode). In order to establish a treatment and monitoring system, optional treatment steps can be assessed with this method to support the process development. Therefore the removal efficiencies have to be generated for the optional and possibly suitable treatment methods. With these removal efficiencies, transfer functions are defined for each treatment step, inverted and finally combined into one overall inverse transfer function:

$C_{Crt}(FM_i, P_j) = \dfrac{C_{GV}}{\prod_{k=0}^{m} (1 - r_k)}$

where
$FM_i$ = failure mode $i$
$FM_0$ = nominal mode
$P_j$ = parameter $j$
$C_{GV}$ = guideline value
$r_k$ = reduction factor for treatment step $k$
$C_{Crt}$ = critical concentration in the raw water that may not be exceeded to comply with the guideline values in the drinking water

With this function, critical raw water concentrations are calculated that must not be exceeded in the raw water to make sure that the drinking water complies with the thresholds when the treatment works in the normal operation mode.
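The overall inverse transfer function can be sketched in a few lines. The function names are illustrative. The numbers in the usage lines are the pesticide values used in the example later in this appendix (threshold 0.03, reduction factor 0.9 for the activated carbon filter and 0 for disinfection). Returning infinity for complete removal is a convention chosen here.

```python
import math


def treated_concentration(c_raw, reduction_factors):
    """Forward transfer: each step k passes on the fraction (1 - r_k)."""
    return c_raw * math.prod(1.0 - r for r in reduction_factors)


def critical_raw_concentration(guideline_value, reduction_factors):
    """Overall inverse transfer function C_Crt = C_GV / prod(1 - r_k)."""
    passed = math.prod(1.0 - r for r in reduction_factors)
    if passed == 0.0:
        # Complete removal: no finite raw water concentration is critical.
        return math.inf
    return guideline_value / passed


c_crt = critical_raw_concentration(0.03, [0.9, 0.0])
print(round(c_crt, 6))  # 0.3

# Round trip: a raw water at the critical concentration ends at the threshold.
print(round(treated_concentration(c_crt, [0.9, 0.0]), 6))  # 0.03
```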

To assess the probability of exceeding thresholds in the case of a deviation from the nominal mode (= failure mode), a Failure Mode, Effects and Criticality Analysis (FMECA) has to be performed for the treatment system. FMECA is especially suitable for application in the design phase of a process and is therefore appropriate for a treatment and monitoring system that is to be established. The result of the FMECA is a set of failure mode arrays for all treatment steps and all identified failure modes. The arrays contain the kind and a description of the failure modes as well as an estimated probability of occurrence of the corresponding failure mode. Additionally the reduced reduction efficiencies are determined, i.e. the reduction efficiencies in the case of a specific failure mode. With these reduced removal efficiencies, critical raw water concentrations can be calculated analogously to those for the nominal mode. These critical raw water concentrations must not be exceeded to make sure that the drinking water complies with the thresholds when the corresponding failure modes occur.

To calculate the probabilities that the threshold for a certain parameter is exceeded in a certain operational mode, the combination of the events "failure mode or nominal mode" and "exceedance of the critical raw water concentration" is considered (Figure 37). The event combinations for the exceedance for one specific parameter in any operational mode (Figure 37a), and for one specific mode and any parameter (Figure 37b), can be illustrated by fault trees.

Figure 37. Fault trees for the event combination failure modes/nominal mode and exceeding critical raw water concentration. (a) Exceedance of the guideline value for the parameter $k$: OR over $i = 0, \dots, n$ of the combinations ($FM_i$ AND $e_{ik}$). (b) Exceedance of any guideline value due to failure mode $i$: OR over $k = 0, \dots, m$ of the combinations ($FM_i$ AND $e_{ik}$). Here $FM_i$ = failure mode $i$, $FM_0$ = nominal mode without failure, and $e_{ik}$ = exceedance of the critical concentration for $FM_i$ in the raw water for parameter $k$.

Example for the application of this method:
In a simple example the method is applied to a situation where a drinking water supply is designed with a simple treatment including the processes activated carbon filtration and disinfection.

Reduction factors (removal efficiencies) have to be defined for each treatment step, in this example for the two parameters pesticide "x" and bacteria "y". For the nominal mode, Table 17 shows the calculation of the critical raw water concentrations which must not be exceeded in order to comply with the threshold values of the drinking water standards. Using these reduction factors and the threshold values for drinking water, the inverse transfer function gives critical raw water concentrations of 0.3 mg/L and 5*10⁻⁶ 1/L respectively.

Table 17. Calculation of the critical raw water concentration C for the nominal mode.
                                                      P1                          P2
Parameter                                             Pesticide "x"               Bacteria "y"
Threshold                                             0.03                        1.0E-6
Activated carbon filter
  reduction factor r1                                 0.9                         0
  transfer function                                   Coutput = Cinput*(1-0.9)    Coutput = Cinput*(1-0)
  inverse transfer function                           Cinput = Coutput/(1-0.9)    Cinput = Coutput/(1-0)
Disinfection
  reduction factor r2                                 0                           0.8
  transfer function                                   Coutput = Cinput*(1-0)      Coutput = Cinput*(1-0.8)
  inverse transfer function                           Cinput = Coutput/(1-0)      Cinput = Coutput/(1-0.8)
Inverse transfer function disinfection:
  C input(threshold) = threshold / (1-r2)             0.03                        5.0E-6
Combination of both inverse transfer functions:
  C raw water(threshold) = (threshold/(1-r2))/(1-r1)  0.3                         5.0E-6

The application of the FMECA method leads to the identification of failure modes. It is necessary to assess the reduced reduction factors for both parameters and both treatment steps in the FMECA. One example of a failure mode and the information derived from the FMECA method is shown in Table 18.

Table 18. Example of failure mode array for the activated carbon filter.
Failure mode 1 (FM1) for activated carbon filter
Failure: Activated carbon filter failure
Failure rate (1/h): 0.002
Latency (h): 0.0002
Unavailability UFM1: 4.0E-7
r1d1, removal efficiency for P1: 0.6
r1d2, removal efficiency for P2: 0
r2d1, removal efficiency for P1: 0
r2d2, removal efficiency for P2: 0.8

The critical raw water concentrations in the case of a failure can be calculated by analogy with the nominal mode using the reduced reduction factors from the failure mode array: for each failure mode the inverse transfer function is compiled. Then the critical concentrations in the raw water can be determined by using the thresholds for the drinking water and the reduced reduction factors. This calculation is shown for two exemplary failure modes in Table 19 and Table 20. In both tables the transfer function of a treatment step is Coutput = Cinput*(1-r) and the inverse transfer function is Cinput = Coutput/(1-r), with r the (reduced) reduction factor of that step.

Table 19. Calculation of the critical raw water concentration C for the Failure Mode 1 in the activated carbon filter.
Failure mode FM1 (for activated carbon filter)        P1                P2
Parameter                                             Pesticide "x"     Bacteria "y"
Threshold                                             0.03              1.0E-6
Activated carbon filter, reduction factor r1          0.6               0
Disinfection, reduction factor r2                     0                 0.8
C input(threshold) = threshold / (1-r2)               0.03              5.0E-6
C raw water(threshold) = (threshold/(1-r2))/(1-r1)    0.075             5.0E-6

Table 20. Calculation of the critical raw water concentration C for the Failure Mode 2 in the disinfection.
Failure mode FM2 (for disinfection)                   P1                P2
Parameter                                             Pesticide "x"     Bacteria "y"
Threshold                                             0.03              1.0E-6
Activated carbon filter, reduction factor r1          0.9               0
Disinfection, reduction factor r2                     0                 0.5
C input(threshold) = threshold / (1-r2)               0.03              2.0E-6
C raw water(threshold) = (threshold/(1-r2))/(1-r1)    0.3               2.0E-6
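The calculations of Tables 17, 19 and 20 follow the same pattern and can be reproduced in one loop. The sketch below uses the thresholds and reduction factors as read from those tables. The dictionary layout and the names are illustrative assumptions.

```python
import math

# Thresholds for the drinking water (pesticide "x", bacteria "y").
THRESHOLD = {"pesticide": 0.03, "bacteria": 1.0e-6}

# (r1 activated carbon filter, r2 disinfection) per mode and parameter,
# as read from Tables 17, 19 and 20.
REDUCTION = {
    "nominal": {"pesticide": (0.9, 0.0), "bacteria": (0.0, 0.8)},
    "FM1":     {"pesticide": (0.6, 0.0), "bacteria": (0.0, 0.8)},
    "FM2":     {"pesticide": (0.9, 0.0), "bacteria": (0.0, 0.5)},
}


def critical_raw_concentration(threshold, reduction_factors):
    """C raw water(threshold) = threshold / ((1 - r1) * (1 - r2))."""
    return threshold / math.prod(1.0 - r for r in reduction_factors)


critical = {
    mode: {
        parameter: critical_raw_concentration(THRESHOLD[parameter], factors)
        for parameter, factors in per_parameter.items()
    }
    for mode, per_parameter in REDUCTION.items()
}

for mode, values in critical.items():
    print(mode, {p: f"{v:.3g}" for p, v in values.items()})
```

A reduced removal efficiency lowers the critical raw water concentration, which is why failure mode 1 gives 0.075 instead of 0.3 for the pesticide.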

Another part of the method is the assessment of the probability that the calculated critical concentrations occur in the raw water, as shown in Table 21. In this table, probabilities for the existence of the calculated concentration values in the raw water for certain parameters have to be estimated according to measurements, expert judgements or information from literature.

Table 21. Estimated probabilities for the parameters pesticide and bacteria in nominal and failure mode.
                  Parameter 1: Pesticide                   Parameter 2: Bacteria
                  Critical values   Assessed probability   Critical values   Assessed probability
Nominal mode      0.3               1.0E-6                 5.0E-6            2.0E-5
Failure mode 1    0.075             1.7E-6                 5.0E-6            2.0E-5
Failure mode 2    0.3               1.9E-6                 2.0E-6            2.0E-5

The last step of this method is a fault tree used to combine the events "failure mode" and "exceedance of critical concentrations in the raw water". If such an event combination occurs, the given threshold concentrations for the drinking water will be exceeded. Figure 38 shows these possibilities of event combinations leading to such an exceedance. The calculation of the probability for the occurrence of the event combinations is shown in Table 22.

Figure 38. Fault trees with event combinations for the parameters pesticides and bacteria. (Pesticide threshold of 0.03 exceeded if: raw water > 0.3, or FM1 and raw water > 0.075, or FM2 and raw water > 0.3. Bacteria threshold exceeded if: raw water > 5 x 10⁻⁶, or FM1 and raw water > 5 x 10⁻⁶, or FM2 and raw water > 2 x 10⁻⁶.)

Table 22. Calculation of the probability of the event combinations.
Overview of the possible events with their probability values
                                                       P1            P2
Raw water event in the nominal mode                    0.000001      0.00002
Raw water event in the failure mode 1                  0.0000017     0.00002
Event of failure mode 1                                0.0000004     0.0000004
Combination of raw water event in FM1 and event FM1    6.8E-13       8E-12
Raw water event in the failure mode 2                  0.0000019     0.00002
Event of failure mode 2                                0.00000025    0.00000025
Combination of raw water event in FM2 and event FM2    4.75E-13      5E-12
Calculation of the final probabilities                 1E-06         0.00002
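The arithmetic behind Table 22 can be retraced with a short sketch. Two assumptions are made here and are not stated in the source: the nominal mode is treated as having probability 1, and the branches of the fault tree are combined by simple addition (the rare-event approximation of the OR gate). The variable names are illustrative.

```python
# Probabilities that the critical raw water concentration is exceeded.
RAW_EVENT = {
    "pesticide": {"nominal": 1.0e-6, "FM1": 1.7e-6, "FM2": 1.9e-6},
    "bacteria":  {"nominal": 2.0e-5, "FM1": 2.0e-5, "FM2": 2.0e-5},
}

# Probabilities of the operational modes (nominal mode assumed to be ~1).
MODE_EVENT = {"nominal": 1.0, "FM1": 4.0e-7, "FM2": 2.5e-7}


def combination(parameter, mode):
    """AND gate: the mode occurs and the raw water event occurs."""
    return RAW_EVENT[parameter][mode] * MODE_EVENT[mode]


def final_probability(parameter):
    """OR gate over all modes, using the rare-event approximation (sum)."""
    return sum(combination(parameter, mode) for mode in MODE_EVENT)


for parameter in RAW_EVENT:
    print(parameter,
          f"FM1: {combination(parameter, 'FM1'):.3g}",
          f"FM2: {combination(parameter, 'FM2'):.3g}",
          f"final: {final_probability(parameter):.3g}")
```

The failure mode branches contribute several orders of magnitude less than the nominal branch, so the final probabilities are practically equal to the raw water event probabilities in the nominal mode.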

Appendix F: Some fundamental reliability concepts

Here we present some fundamental probabilistic concepts. When the reliability of a system (component) is quantified, it is most often given as the probability that the system fails within a given period of time. There are many methods, e.g. mathematical models for modelling different types of fault development (for a thorough introduction see [23]).

The reliability is directly given by the failure rate (often denoted λ, or rather λ(t), as this rate can depend on the age of the barrier). This is the rate at which failures occur for a barrier of a given age. A mental model for the failure rate that is common in the literature is the bathtub curve. The model indicates that many systems and components will have a high failure rate in the initial period (1), where there often are some start-up problems, a "useful" life period (2), where the failure rate is constant and only random failures occur, and a wear-out period (3), where the failure rate increases because of wear and tear of the system components. Very often the failure rate shows a bathtub-like behaviour as illustrated in Figure 39.

Figure 39. Bath tub curve (Failure rate as a function of age). (Axes: failure rate versus time, with periods 1, 2 and 3.)

In this report the failure rate is expected to be in the period of useful life and a constant failure rate is assumed. If the probability of failure is to be determined, a time interval needs to be defined. The probability of a barrier "surviving" a time period t is given by the reliability $R(t) = e^{-\lambda t}$, where t is the elapsed time period (which can be interpreted as the "age"). The probability of failure within time t is: P(failure) = 1 - R(t). The mean time to failure (MTTF) for a barrier with a constant failure rate (i.e. λ being independent of the age t) is MTTF = 1/λ.

We often want to determine the unavailability of the barrier. Then we also need the time during which the barrier is not functioning each time it has failed. This mean time from the failure of the barrier until it is operational again is denoted Mean Time To Repair (MTTR).
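For a constant failure rate, the reliability, the probability of failure and the MTTF follow directly from the formulas above. A minimal sketch, using λ = 0.1 per year purely as an illustrative value:

```python
import math


def reliability(failure_rate, t):
    """R(t) = exp(-lambda * t): probability of surviving the period t."""
    return math.exp(-failure_rate * t)


def failure_probability(failure_rate, t):
    """P(failure within t) = 1 - R(t)."""
    return 1.0 - reliability(failure_rate, t)


def mean_time_to_failure(failure_rate):
    """MTTF = 1 / lambda, valid for a constant failure rate."""
    return 1.0 / failure_rate


rate = 0.1  # failures per year (illustrative value)
print(mean_time_to_failure(rate))               # 10.0 years
print(round(reliability(rate, 10.0), 3))        # 0.368
print(round(failure_probability(rate, 1.0), 3)) # 0.095
```

Note that the probability of surviving one full MTTF is only about 37 %, not 50 %, because the exponential distribution is skewed.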

If failures are detected immediately (not requiring a test), the mean fraction of time the barrier is in a failed state (i.e. is unavailable) equals:

$U \approx MTTR / (MTTF + MTTR)$

If a barrier is a sensor/detector, giving an alarm when a certain level (of concentration) is too high or low, the functioning of the barrier can usually not be decided without testing it (or by an actual "random" demand). Thus, the barrier can have a dormant failure, and its performance is measured by the probability of failure on demand (PFD) (footnote 9). If there is a constant failure rate (λ), the PFD approximately equals:

$PFD \approx (\lambda \tau)/2$

where τ is the test interval. So, if λ = 0.1 per year, and the component is tested every six months (τ = 0.5 year), we get PFD ≈ 0.025 (= 2.5%). If the barrier consists of a redundant system of components, the calculation increases in complexity (see [23]).

Within the water sector a lot of effort has been put into modelling the bathtub curve (failure rate for repairable systems). Special focus has been on the right end of the curve, i.e. the deterioration phase, where failures occur more and more frequently. The models have developed from simple regression models to more advanced stochastic models. The tools use knowledge of past, observed and recorded failures. One example of tools for failure forecasting of pipe breaks in a water network is the project CARE-W (see http://care-w.unife.it/). A failure is defined in this case as a break or detected leak that has necessitated repair to the pipe.

9 This is the probability that the barrier will fail to function when needed. Also often denoted Mean Fractional Dead Time (MFDT).
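The unavailability and PFD formulas can be checked numerically. The sketch below reproduces the worked example (λ = 0.1 per year, τ = 0.5 year) and, as an addition not found in the text, compares the approximation λτ/2 with the exact time average of 1 - exp(-λt) over one test interval. The MTTF and MTTR values in the last lines are hypothetical.

```python
import math


def unavailability(mttf, mttr):
    """U ~ MTTR / (MTTF + MTTR), for failures that are detected immediately."""
    return mttr / (mttf + mttr)


def pfd_approx(failure_rate, test_interval):
    """PFD ~ lambda * tau / 2, for dormant failures revealed only by testing."""
    return failure_rate * test_interval / 2.0


def pfd_average(failure_rate, test_interval):
    """Exact average of 1 - exp(-lambda * t) over one test interval."""
    x = failure_rate * test_interval
    return 1.0 - (1.0 - math.exp(-x)) / x


print(round(pfd_approx(0.1, 0.5), 6))   # 0.025
print(round(pfd_average(0.1, 0.5), 4))  # 0.0246

# Hypothetical barrier: MTTF of 10 years, MTTR of 0.01 year.
print(round(unavailability(mttf=10.0, mttr=0.01), 6))  # 0.000999
```

The approximation is close as long as λτ is small, which is the usual situation for regularly tested barriers.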
