
Construction Management and Economics (July 2003) 21, 495–510


A risk-based maintenance management model for toll road/tunnel operations


M. F. NG1, V. M. RAO TUMMALA2* and RICHARD C. M. YAM3
1 Engineering Department, Route 3 (CPS) Company Limited, NT, Hong Kong
2 College of Business, Eastern Michigan University, Ypsilanti, MI, USA
3 Department of Manufacturing Engineering and Engineering Management, City University of Hong Kong, Kowloon, Hong Kong

Received 16 May 2002; accepted 20 February 2003

Preventive maintenance (PM) has long been recognized as a method to increase equipment reliability and availability. However, for equipment in complex plant installations like toll road/tunnel systems, carrying out PM on all components may not be feasible, or may result in excessive maintenance costs. This paper describes how a risk-based maintenance management model was formulated to systematically prioritize PM activities. The model was based on the five core elements of the risk management process (RMP): identification, measurement, assessment, evaluation, and control and monitoring. This model was applied to a toll road/tunnel company in Hong Kong to enhance the PM operations of its lighting system. The improvements recommended in this case study show that the application of RMP in preventive maintenance could effectively identify and assess potential risks for equipment and facilities. The RMP results provide quantified information for decision-makers to select the best course of action for implementing a more cost-effective risk-based PM system.

Keywords: Risk management process, preventive maintenance, toll road/tunnel, operations

Introduction
Maintenance management for toll road/tunnel operations is not new in Hong Kong. The primary objectives of a toll road/tunnel management company are to provide reliable, safe, fast and cost-effective journeys for tunnel users. The failure of any critical equipment, such as the power supply, tunnel ventilation, tunnel lighting, sump pump, or traffic control and surveillance systems, may cause disasters or hazards to users and operators. Although a toll road/tunnel management company may adopt an expensive preventive maintenance programme to keep the equipment and facilities in good working condition at all times, there is no formal and consistent method currently used for setting up preventive maintenance programmes in tunnel operations. It should be noted that the allocation
*Author for correspondence. E-mail: rao.tummala@emich.edu

of resources and planning of schedules for effective preventive maintenance programmes are normally determined by the company's Engineering Department according to the requirements set by the equipment manufacturer or the experienced maintenance staff. The failures and effects of equipment (risk factors) and the corresponding preventive actions are not communicated well between different departments in the company. Moreover, there are increasing demands for tighter regulatory requirements, shorter allowable maintenance times and lower maintenance budgets, which have significantly increased the complexity and difficulty of maintenance operations. As such, new approaches need to be considered that would help management to choose the best course of action for reducing or eliminating the potential risks of equipment failures. Tomic (1993) proposed the use of risk-focused maintenance in improving system reliability or availability through systematically identifying the applicable and

Construction Management and Economics ISSN 0144-6193 print/ISSN 1466-433X online 2003 Taylor & Francis Ltd http://www.tandf.co.uk/journals DOI: 10.1080/0144619032000089616

effective course of action for each failure mode of a system. The major advantage of employing a risk management approach is that it provides a thorough assessment of the risk factors of equipment failures. Vaughan (1997) defined the fundamental part of the risk management function as the design and implementation of procedures to minimize the occurrence of loss or the financial impact of the loss. According to him, the objective of risk management is to reduce and eliminate certain types of risks facing organizations by avoiding, reducing, and transferring risks. Similarly, several authors have developed different risk management approaches based on different objectives. For example, the approach adopted by the Engineering Council (1994) is a general one suitable for most kinds of engineering activities. The European Community promotes a comprehensive risk management methodology, RISKMAN, which provides a framework to enumerate and assess potential risk factors associated with a project. RISKMAN focuses on project management issues, and emphasizes the active management of risks rather than their identification and assessment (Carter et al., 1994). On the other hand, both Raffia (1994) and Hayes et al. (1986) defined risk management as a process consisting of several steps, as against what Hertz and Thomas (1984) referred to as risk analysis. Charette (1989) defined risk engineering as consisting of two separate but interdependent concepts: risk analysis and risk management. As described by Cooper and Chapman (1987), risk management involves a multi-phase risk analysis approach, which covers the

Ng et al.
identification, evaluation, control and management of risks from the perspective of social hazard management. Rowe's (1993) approach does not consider the phase of controlling and monitoring. Thus, a lot of confusion exists among practitioners in applying different risk management approaches. Through comparison of these several risk management approaches, Tummala et al. (1994) developed the risk management process (RMP) consisting of five core elements. As shown in Figure 1, the five core elements are: risk identification (finding and understanding risks); risk measurement (measuring the severity of risks); risk assessment (assessing the likelihood of occurrence of risks); risk evaluation (ranking the identified risk factors according to the management objectives and available resources, and implementing risk response action plans); and risk control and monitoring (tracking the progress made and the results achieved by the risk response actions taken as a result of the risk evaluation phase, and taking corrective actions). The RMP is a comprehensive, detailed and easy-to-apply approach to managing risks. Several successful applications demonstrate the viability of the RMP approach in the construction and maintenance fields. Burchett and Tummala (1998) studied the need and feasibility of employing the RMP to assess risks in capital investment for extra-high voltage (EHV) transmission line construction projects (Tummala et al., 1999). On the other hand, Tummala and Lo (forthcoming) and Tummala and Mak (2001) applied the RMP in developing a risk management model for improving electricity supply reliability and transmission operation and maintenance, respectively. In addition, Yu

Figure 1 Risk management process framework

Risk-based maintenance for tunnel


(1996) developed a knowledge-based system applying RMP in tackling schedule risks in project management for an EHV substation construction project. Similarly, a knowledge-based expert system was developed by Leung (1997), who used RMP and applied it to an EHV transmission line construction project to identify, evaluate and manage project cost (Leung et al., 1998). Another risk management model was developed by Mok (1994) to apply RMP in preparing cost estimates for building services installation of the building construction projects administered by the Building Services of Architectural Services Department of the Hong Kong Government. In the field of maintenance, Leung (1994) developed a framework by integrating the system hazard analysis with RMP to make it more applicable to assess safety and reliability risks associated with the door system of a train car for the Mass Transit Railway Corporation (MTRC) in Hong Kong (Tummala et al., 1996). It should be noted that either the RMP or other risk management models can assist project managers/decision makers in identifying and assessing potential risk factors to develop and implement the best course of actions in eliminating or reducing the identified risk factors. Even though they may not be able to identify all the potential risk factors, they can still provide an effective means to quantify and manage risks as opposed to other nonquantifying approaches. Burchett et al. (1999) carried out a worldwide survey within the context of electrical power supply projects and confirmed that there is a drive towards a more thorough assessment of risks. They also pointed out that a formal risk management process would meet the expectations of business growth and project sponsors and ensure that all risks are actively managed throughout the life cycle of a project. However, the issues of risks are not just technical, e.g. 
on hazard or failure processes, they are concerned with decision making and management support systems as well. Understanding risks and their control processes may still need further R&D, especially in some industries. Each industry should therefore review its own situation in the light of the relevant experiences of others and develop its own appropriate risk management systems. This paper aims to describe the development and implementation of an effective risk-based maintenance management model for a toll road/tunnel company to eliminate or reduce risks of equipment failure. The proposed model is developed to integrate RMP with the generic maintenance processes of planning, scheduling, executing, analysing and improving (Figure 1). The application of RMP in maintenance modelling overcomes the deficiency of most maintenance models by considering the consequences of faults, their likelihood of occurrence and the cost of implementing risk response actions in a meaningful fashion. Moreover, suitable maintenance strategies can be determined based on the

identified risks. The formulated model was then applied to a real case of toll road/tunnel operations to examine its applicability. The results obtained and the effectiveness of the proposed risk-based maintenance model are described later in this paper.

The risk-based maintenance management model (RBMMM)


A risk-based maintenance management model is formulated using the RMP as shown in Figure 2. The model begins with the identification of the strategic importance of the project. The mission, aim and objectives of the company are the driving forces behind the model leading to the improvement of quality and effectiveness of maintenance operations under different internal and external factors facing the company. The potential risk factors are identified for each critical unit that may affect the success of the project. Subsequently, the consequences of all identified risk factors are determined and the magnitudes of the impact of their consequences (consequence severity) are enumerated. Depending upon the probability distributions of all identified risk factors, the likelihood of occurrence of consequences is assessed. Checklists, event tree analysis, fault tree analysis, Failure Mode and Effects Analysis (FMEA), HazOp analysis and Cause-and-Effect (C-E) Diagrams are some of the well known and widely used techniques to identify potential risk factors (risk identification) (Sundararajan, 1991; Tummala et al., 1994; FMEA, 1995). The System Hazard Analysis technique along with the FMEA is useful in enumerating and assessing the consequences of the identified risk factors (risk measurement) (Military Standard, 1993). The System Hazard Analysis technique is also suitable in assessing the severity of consequences and risk probability levels through qualitative analyses (risk assessment). Several cases have been reported on the successful application of the System Hazard Analysis technique (Leung, 1994; Tummala et al., 1994, 2001). Monte Carlo Simulation is another popular simulation technique used to generate probability distributions for project success factors by observing the probability distributions of all risk factors affecting them (Hammersley and Handscombe, 1967; Schmidt and Taylor, 1970). 
Other tools, such as five-point estimation and probability encoding can also be used if data are not sufficient. If sufficient data are available, one can use the Bestfit software to determine the best fitted distribution (@Risk, 1992; BestFit, 1993). All these techniques are complementary to each other. In selecting the suitable techniques for risk identification, measurement and assessment, the following factors should be considered: the objectives of the study, the nature of the problem, the complexity of the process, the


Figure 2 Risk-based maintenance management model for toll road/tunnel operations

data requirements of the study, the resources available for the study and the level of expertise required in the use of these techniques (IEEE Spectrum, 1989). After reviewing these factors, the System Hazard Analysis technique (Military Standard, 1993) and FMEA (1995) were selected in this model for risk identification, risk measurement and risk assessment. The risk evaluation phase is to rank and prioritize the identified risk factors and to determine the risk acceptance levels according to the aim, objectives and available resources of the project. The risk severity and probability levels generated in the risk identification, risk measurement and risk assessment phases can be used to calculate the risk exposure values (risk severity × risk probability) for

each, or each group of, risk factor(s). All such information could be used to determine the acceptable risk exposure levels, the appropriate preventive maintenance programmes and the risk control actions. The Hazard Totem Pole (HTP) approach proposed by Grose (1987) can be used to systematically evaluate the identified risk factors and to integrate the severity, likelihood of occurrence and cost of preventive action into a format for easy decision-making by management. The advantage of HTP is that it simultaneously assesses the three fundamental management concerns: performance, schedule and cost. When the three variables are known, an HTP diagram can be plotted. Finally, the cut-off points or risk acceptance levels can be determined based on the identified risks, and



the aims, objectives and available resources of the project, and suitable maintenance activities can be determined. The risk identification, measurement, assessment and evaluation phases are iterative, so that when a new situation occurs, such as a change of government regulation or a decrease in performance level resulting from system or component failure or malfunction, the HTP analysis will indicate the risk levels of the respective risk factors to alert management. Based on such information, management can then revise the existing acceptance levels and formulate appropriate maintenance strategies to improve the performance to meet the revised acceptable levels. The execution phase is the actual implementation of the preventive maintenance tasks according to the planned schedule. Suitable check sheets should be used for a proper control and monitoring system. During the execution phase, appropriate feedback channels should be established to report deviations from the planned activities or changes in environmental factors. The risk control and monitoring phase reviews the progress of the project continuously and recommends necessary corrective actions to management for accomplishing the project objectives. Moreover, it serves to ensure that the training of staff, the auditing of risk management activities and the established emergency plans are properly executed

and coordinated among various parties in an effective and efficient manner. It is useful to generate information regarding major events/milestones, project status and project summary reports throughout the lifetime of the project to facilitate information distribution to staff and management. Finally, as shown in Figure 2, the risk-based maintenance management process should be supported by a computerized maintenance information system (MIS). The MIS includes information storage, data processing and analysis and report generation. Basically, the MIS system consists of several databases to keep track of all maintenance activities. This maintenance information is useful for future risk measurement, assessment and the determination of the best courses of actions for reducing or eliminating the identified risk factors. It is also useful for planning the contingency measures and training of staff in an organization.
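
The risk exposure calculation and cut-off described for the risk evaluation phase can be illustrated with a minimal sketch. It assumes that severity and probability are both expressed as integer level indices and that exposure is their product; the class name `RiskFactor`, the function `prioritize`, the factor names and all numeric values are hypothetical and are not taken from the case study. The sketch does not implement the Hazard Totem Pole, which also integrates the cost of preventive action.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskFactor:
    """One identified risk factor (hypothetical structure for illustration)."""
    name: str
    severity: int      # consequence severity level index
    probability: int   # likelihood-of-occurrence level index

    @property
    def exposure(self) -> int:
        # Risk exposure taken as severity x probability.
        return self.severity * self.probability


def prioritize(factors, acceptance_level):
    """Rank factors by exposure (highest first) and keep those above the cut-off."""
    ranked = sorted(factors, key=lambda f: f.exposure, reverse=True)
    return [f for f in ranked if f.exposure > acceptance_level]


# Hypothetical values, chosen only to show the ranking.
factors = [
    RiskFactor("factor A", severity=4, probability=2),
    RiskFactor("factor B", severity=2, probability=5),
    RiskFactor("factor C", severity=1, probability=3),
]

for f in prioritize(factors, acceptance_level=5):
    print(f.name, f.exposure)
```

With these illustrative inputs, factors B and A exceed the assumed acceptance level and would be candidates for preventive action, while factor C falls below it. Raising or lowering `acceptance_level` corresponds to management revising the risk acceptance level.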

The case study


Reliability of a tunnel lighting system is crucial for tunnel users, and its continuous operation without interruption must be assured. As illustrated in Figure 3, the tunnel

Figure 3 Tunnel lighting configuration for one tunnel tube

lighting configuration can be divided into three sections (entrance, interior and exit) in a tunnel tube. The entrance section is the most critical area, because without sufficient portal brightness, the entrance will appear to approaching drivers as a 'black hole'. The most severe visual task is not when the driver is passing through the plane of the portal shadow, but when he or she is outside of it and is trying to see within the portal shadow. The entrance section comprises the threshold and transition zones, which are installed to provide sufficient reinforcement lighting to reduce the 'black hole' effect by gradually decreasing the luminance level until it matches the basic lighting in the tube section. The interior section simply provides an adequate luminance level for safe driving. In order to ensure reliable tunnel lighting, the power supply of the basic lighting is provided by two independent uninterrupted power sources connected from the two ends of the tunnel. The odd-numbered lighting sets are connected to one power supply and the even-numbered lighting sets are connected to the other. A failure or power outage of a single power supply system will therefore not cause a total or sectional blackout of the tunnel lighting that would endanger the drivers in the tunnel. The exit section on the other hand appears as a bright hole to the

motorists. Usually, all obstacles will be discernible by silhouette against the bright exit and thus they will be clearly visible. However, for the visual comfort of drivers, reinforcement lighting similar to that of the entrance section is also provided. The reinforcement lighting at the exit section is also designed for bi-directional traffic conditions. The reinforcement and basic lighting are divided into six control stages. Depending on the photometer reading, an appropriate lighting set-up will be selected by the central monitoring and control system (CMCS) or manually by the operator in the central control centre. Figure 4 shows the basic control circuit schematic (Ng, 1998).

Driver process

As shown in Figure 2, the model begins with the driver process. In line with the vision, mission and the overall corporate business strategy of the company, the driver process identifies the strategic importance of the project under different internal and external environmental factors. The purpose of the driver process is to translate the aims and objectives of the project into several project success factors that can be used as guidelines for and

Figure 4 Basic control circuit diagram of tunnel lighting control



understood by the project team. This process also enables top management to recognize the importance of the project so as to obtain their commitment and involvement in supporting the project. The internal factors are influenced by two external factors: government and customer requirements. Government requirements are concerned mainly with changes in standards or ordinances, while customer requirements place more emphasis on service quality, safety and cost. For the toll road/tunnel company, the corporate business plan and the toll road/tunnel management plan are the two major internal factors for developing the mission, aims and objectives of the operations. The aim of this case study was to apply the formulated risk-based maintenance management model to the toll road/tunnel company for selecting the best course of action in improving its existing preventive maintenance activities (Ng, 1998). In order to achieve this, the following objectives were established: (1) to reduce the breakdown duration and frequency of the tunnel lighting system; and (2) to minimize hazards to drivers in case of breakdowns of the tunnel lighting system. These objectives were in line with the aims and objectives of the toll road/tunnel company. The outcome of the case study was an action list proposed for decision by the management of the toll road/tunnel company. The action list should include the priority of preventive actions and improvement works that would eliminate or reduce the identified risks in the tunnel lighting system so as to achieve a more cost-effective maintenance operation.

System decomposition

Before identifying potential risk factors, the tunnel lighting system was decomposed into a controllable hierarchy. The system decomposition involved the categorization of the equipment and the identification of the objectives and performance criteria of maintenance for each unit in the hierarchy.
All the correspondence, manuals, drawings and schematics were collected at this stage to form the detailed equipment information database for the tunnel lighting system. Hierarchical/top-down techniques were used to construct the component list. The power-supply system, the central monitoring and control system (CMCS) and the dimming control system were the three major sub-systems of the tunnel lighting system (Ng, 1998). All the units of these sub-systems were grouped together and listed in different functional parts as shown in Table 1.

Risk identification

From the component list created in the system decomposition stage, the potential risk factors for equipment

Table 1 System decomposition for the tunnel lighting system

Item   Component name
1.     Power supply
1.1    Control/protection relay
1.2    Isolator
1.3    Contactor
1.4    Booster transformer
1.5    MCCB
2.     System control
2.1    Dimming controller
2.2    Dimming output control unit
2.3    Dimming input module
2.4    CMCS central computer
2.5    CMCS field control unit
2.6    CMCS programmable logic controller
3.     Field equipment
3.1    Basic lighting fittings (fluorescent tube)
3.2    Reinforcement lighting fittings (son lamp)
3.3    Photometer
3.4    Electronic control ballast

failure of the tunnel lighting system were identified. According to McAndrew and O'Sullivan (1993), failure mode and effects analysis (FMEA) is a simple technique used to identify potential risks and it is also suitable for service industries such as toll road/tunnel operations. In addition to FMEA, the following tools and documents were also used to assist the risk identification process: the instrumentation diagram, schematic and block diagrams, logic diagram, process flow diagram, installation drawing, inventory parts list, manufacturer's manual, flow charts, etc. The possible failure modes, their symptoms and the possible causes were identified and entered in the FMEA check sheet as shown in Table 2 for the three functional parts: the power supply, system control and field equipment. Subsequently, two different kinds of failure effects (hazards to drivers and traffic blockage) were listed. The means of detecting each failure and the actions recommended to prevent the recurrence of the breakdown or failure are shown in the FMEA check sheet (Ng, 1998). From the FMEA analysis, the failures of the dimming controller, dimming output control unit, dimming input module and electronic control ballast were found to have no impact on, and pose no hazard to, drivers. Moreover, failure of these components would not cause a serious or total breakdown of the tunnel lighting system. These components were, therefore, eliminated from the subsequent analysis. Table 3 lists the potential risk factors that cause traffic blockage and hazard to drivers. For easy reference, an identification code was assigned to


Table 2 FMEA report for tunnel lighting system

Failure mode: Control/protection relay breakdown
Failure symptom: No power supply to a section of lighting sets
Failure cause: Relay coil open circuit; contact stuck; mis-operation
Failure effect: A section of lighting blackout; reinforcement lighting out of control; traffic may slow down due to different lighting level; traffic accident may occur due to sudden blackout
Failure detection: By routine inspection
Failure deterrents: Do not overload the contact; check clearly before switch operation
Preventive actions: Check loading of the component; improve operating procedure to prevent mis-operation; increase stock level of spare parts

Failure mode: Isolator/contactor breakdown
Failure symptom: No power supply to a section of lighting sets
Failure cause: Contact stuck; mis-operation
Failure effect: A section of lighting blackout; reinforcement lighting out of control; traffic may slow down due to different lighting level; traffic accident may occur due to sudden blackout
Failure detection: By routine inspection
Failure deterrents: Do not overload the contact; check clearly before switch operation
Preventive actions: Check loading of the component; improve operating procedure to prevent mis-operation; increase stock level of spare parts

Failure mode: Booster transformer breakdown
Failure symptom: No power supply to a few sections of lighting sets
Failure cause: Transformer coil open circuit; transformer short circuit; cable termination loosen
Failure effect: A few sections of lighting blackout; tunnel tube must be closed; traffic accident may occur due to sudden blackout
Failure detection: By routine inspection
Failure deterrents: Do not overload the transformer; maintain operating temperature
Preventive actions: Check loading of the component; add ventilation fans; increase stock level of spare parts

Failure mode: MCCB breakdown
Failure symptom: No power supply to a few lighting sets
Failure cause: Contact burnt; contact stuck; mis-operation
Failure effect: A few lighting blackout; a few reinforcement lighting out of order; traffic may slow down due to different lighting level
Failure detection: By routine inspection
Failure deterrents: Do not overload the contact; check clearly before switch operation
Preventive actions: Check loading of the component; improve operating procedure to prevent mis-operation; increase stock level of spare parts

Failure mode: Dimming controller
Failure symptom: Dimming function to basic lighting malfunction
Failure cause: Power supply unit failure; cable termination loosen; control card malfunction; mis-operation
Failure effect: Dimming function of about 1 km lighting out of order; marginally increase energy wastage
Failure detection: By routine inspection
Failure deterrents: Check clearly before carrying out maintenance work
Preventive actions: Improve operating procedure to prevent mis-operation

Failure mode: Dimming output control unit
Failure symptom: Dimming function to basic lighting malfunction
Failure cause: Power supply unit failure; cable termination loosen; control card malfunction; mis-operation
Failure effect: Dimming function of about 200 m lighting out of order; slightly increase energy wastage
Failure detection: By routine inspection
Failure deterrents: Check clearly before carrying out maintenance work
Preventive actions: Improve operating procedure to prevent mis-operation

Failure mode: Dimming input module
Failure symptom: Dimming function to basic lighting malfunction
Failure cause: Power supply unit failure; cable termination loosen; control card malfunction; mis-operation
Failure effect: The whole dimming function of the system out of order; seriously increase energy wastage
Failure detection: By routine inspection
Failure deterrents: Check clearly before carrying out maintenance work
Preventive actions: Improve operating procedure to prevent mis-operation

Failure mode: CMCS central computer
Failure symptom: Dimming function to basic lighting malfunction; reinforcement lighting control malfunction
Failure cause: Software halt; power supply unit failure; communication cable fault; control card malfunction; mis-operation
Failure effect: The whole dimming function of the system out of order; seriously increase energy wastage; reinforcement lighting out of control; traffic may slow down due to different lighting level; traffic accident may occur due to sudden blackout
Failure detection: By routine inspection
Failure deterrents: Check clearly before carrying out operation or maintenance work
Preventive actions: Improve operating procedure to prevent mis-operation; training may be provided to operations and maintenance staff; increase stock level of spare parts


Table 2 (contd)
Failure mode CMCS field control unit Failure symptom Dimming function to basic lighting malfunction Reinforcement lighting control malfunction Failure cause Software halt Power supply unit failure Communication cable fault Control card malfunction Failure effect Partial of dimming function of the system out of order Marginally increase energy wastage One or two portals reinforcement lighting out of control Traffic may slow down due to different lighting level Traffic accident may occur due to sudden black out Partial of dimming function of the system out of order Marginally increase energy wastage One or two portals reinforcement lighting out of control Traffic may slow down due to different lighting level Traffic accident may occur due to sudden black out Negligible effect to normal traffic Failure detection By routine inspection Failure deterrents Check clearly before carrying out operation or maintenance work Preventive actions Improve operating procedure to prevent mis-operation Training may be provided to operations and maintenance staffIncrease stock level of spare parts Improve operating procedure to prevent mis-operation Training may be provided to operations and maintenance staff

CMCS programmable logic controller

Dimming function to basic lighting malfunction Reinforcement lighting control malfunction

Software halt Power supply unit failure Communication cable fault Control module malfunction

By routine inspection

Check clearly before carrying out operation or maintenance work

Basic lighting fittings fluorescent tube Reinforcement lighting fittings son lamp

Basic lighting set black out

Fluorescent tube burnt out Ballast malfunction Out of power supply Cable termination loosen Son lamp burnt out Ballast malfunction Out of power supply Contactor failure Cable termination loosen

By routine inspection

Reinforcement lighting set black out

Negligible effect to normal traffic

Photometer

Reinforcement lighting out of control

Photometer malfunction Photometer mis-adjustment Out of power supply CMCS component failure Cable termination loosen

Reinforcement lighting out of control Traffic may slow down due to different lighting level Traffic accident may occur due to sudden black out

Check condition monitoring device through CMCS By routine inspection Check condition monitoring device through CMCS By routine inspection

Predictive maintenance can be applied to change the tube before its failure Predictive maintenance can be applied to change the tube before its failure

Develop suitable Predictive algorithm

Develop suitable Predictive algorithm

Predictive maintenance can be applied to change the tube before its failure

Improve operating procedure to prevent mis-operation Training may be provided to operations and maintenance staff Sufficient spare parts must be ready Develop suitable Predictive algorithm

Electronic control ballast

Basic lighting set black out

Ballast malfunction Dimming O/P Control Unit failure Cable termination loosen

Negligible effect to normal traffic

By routine inspection

Predictive maintenance can be applied to change the tube before its failure

each potential risk factor as shown in the last column of Table 3. The data generated in the risk identification phase were also stored in the maintenance information system (MIS) for analysis at a later stage.

Risk measurement

Risk measurement involves the enumeration of the consequences and the magnitude of impacts for all potential risk factors identified in the risk identification phase. The four-severity category scale recommended by the US Military Standard 882C was used for assessing the level of severity of consequences. After reviewing the specific requirements of the toll road/tunnel operations, an additional severity category called 'significant' was added between the original severity categories of 'critical' and 'marginal'. As such, a five-severity category scale (catastrophic, critical, significant, marginal and negligible) was formed to assess the severity levels of the consequences for the hazard to drivers and the duration of traffic blockage failure effects (see Table 4).
Table 3 Potential risk factors Item 1 2 3 4 5 6 7 8 9 10 Risk factor Control/protection relay breakdown Isolator/contactor breakdown Booster transformer breakdown MCCB breakdown CMCS central computer CMCS field control unit CMCS programmable logic controller Basic lighting fittings fluorescent tube Reinforcement lighting fittings sun lamp Photometer Identification code CP IC BT MC CC FC PL BL RL PH

The failure effects reported in the FMEA were used to determine the severity level of the consequences. For example, according to Table 2, a control/protection relay breakdown (CP) would cause the tunnel illumination to decrease to a level that is uncomfortable for drivers; hence, consequence severity level 2 for the hazard to drivers was assigned to CP (x symbol in Table 5). Similarly, in consultation with experienced operations staff, it was judged that the same failure would also cause an outage of less than 50 m of basic lighting, which would slightly affect the traffic. As such, consequence severity level 2 for the duration of traffic blockage was assigned to CP (# symbol in Table 5). As another illustrative example, consider the booster transformer breakdown (BT). As shown in Table 2, a failure in BT might cause a major accident, which would be critical; therefore, consequence severity level 4 for the hazard to drivers was assigned to BT (x symbol in Table 5). The same failure would also lead to the closure of the affected tunnel tube, and the other tube would have to be operated with single-tube two-way traffic, causing a critical traffic jam, and hence

Table 4 Severity categories for hazard to drivers and duration of traffic blockage

| Consequence severity category | Hazard to drivers | Duration of traffic blockage | Assigned index |
|---|---|---|---|
| Catastrophic | Serious traffic accident | Both tunnel tubes lighting outage. Traffic stopped, more than 45 min delay in travelling time | 5 |
| Critical | Major traffic accident | Less than 500 m of basic lighting outage or one entrance portal reinforcement lighting outage. Traffic jam, 15 to 45 min delay in travelling time | 4 |
| Significant | Minor traffic accident | Less than 200 m or either odd or even no. of basic lighting outage or more than two stages of reinforcement lighting outage. Traffic slowed, 5 to 15 min delay in travelling time | 3 |
| Marginal | Illumination in tunnel decreases to an uncomfortable level, very difficult to see objects | Less than 50 m of basic lighting outage or one stage of reinforcement lighting outage. Traffic flow slightly affected, less than 5 min delay in travelling time | 2 |
| Negligible | The eyes feel twinkle | Minor effect to traffic flow | 1 |
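As a rough illustration of how the delay bands of Table 4 could be applied in software, the following Python sketch maps a travel-time delay to a severity index. The function name `blockage_severity` is hypothetical. The handling of the band edges (exactly 5, 15 and 45 minutes) and the treatment of zero delay as "negligible" are assumptions, since Table 4 does not specify them.

```python
def blockage_severity(delay_min: float) -> int:
    """Map a travel-time delay (minutes) to a Table 4 severity index, 1 to 5.

    Band edges and the zero-delay rule are assumptions made for this sketch.
    """
    if delay_min < 0:
        raise ValueError("delay cannot be negative")
    if delay_min > 45:
        return 5  # catastrophic: traffic stopped, more than 45 min delay
    if delay_min >= 15:
        return 4  # critical: traffic jam, 15 to 45 min delay
    if delay_min >= 5:
        return 3  # significant: traffic slowed, 5 to 15 min delay
    if delay_min > 0:
        return 2  # marginal: slightly affected, less than 5 min delay
    return 1      # negligible: assumed to mean no measurable delay


if __name__ == "__main__":
    for delay in (60, 30, 10, 2, 0):
        print(delay, blockage_severity(delay))
```

In the case study the levels were assigned by judgement from the FMEA results and staff consultation, so a rule like this would only be a starting point once measured delay data are available.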



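Table 5 Consequence severity levels for hazard to drivers (x) and duration of traffic blockage (#)

| Consequence severity category | Level | CP | IC | BT | MC | CC | FC | PL | BL | RL | PH |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Catastrophic | 5 | | | | | | | | | | |
| Critical | 4 | | # | x # | | x | | | | | |
| Significant | 3 | | x | | | # | x | x | | | |
| Marginal | 2 | x # | | | # | | # | # | | | x # |
| Negligible | 1 | | | | x | | | | x # | x # | |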

the consequence severity level 4 on the duration of traffic blockage was assigned (# symbol in Table 5). Table 5 shows the consequence severity levels for hazard to drivers and duration of traffic blockage for all the other identified risk factors (Ng, 1998).

Risk assessment
Risk assessment involves the determination of the likelihood of occurrence (probability) of each identified risk factor. Occurrence (frequency) is the rating value corresponding to the estimated expected frequency, or cumulative number of failures, that would occur for a given cause over the lifetime of the equipment. Depending on the available information, the likelihood of occurrence may be expressed in either qualitative or quantitative terms. The five-level risk occurrence scale of US Military Standard 882C (frequent, probable, occasional, remote and improbable) was used. Table 6 shows the qualitative and quantitative descriptions of the risk occurrence probability categories (failure rates) for component failures. As with the consequence severity categories, a numerical level was also assigned to each risk occurrence category, as shown in Table 6. At the time of conducting the risk assessment, the equipment had been operating for less than one year; hence, sufficient failure data were not available. Therefore, the qualitative approach suggested in Military Standard 882C was adopted, and the risk probabilities were determined as shown in Table 7 (Ng, 1998).

Table 6 Probability categories

| Risk probability category | Qualitative description | Quantitative description | Level |
|---|---|---|---|
| Frequent | Likely to occur frequently | The probability is greater than 0.1 | 5 |
| Probable | Will occur several times in the life of an item | The probability is between 0.1 and 0.01 | 4 |
| Occasional | Likely to occur some time in the life of an item | The probability is between 0.01 and 0.001 | 3 |
| Remote | Unlikely but possible to occur in the life of an item | The probability is between 0.001 and 0.000001 | 2 |
| Improbable | So unlikely, it can be assumed occurrence may not be experienced | The probability is less than 0.000001 | 1 |

Table 7 Risk probabilities on tunnel lighting component breakdown

| Risk probability category | Level | CP | IC | BT | MC | CC | FC | PL | BL | RL | PH |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Frequent | 5 | | | | | | | | + | + | |
| Probable | 4 | | + | | + | | + | + | | | + |
| Occasional | 3 | + | | | | + | | | | | |
| Remote | 2 | | | + | | | | | | | |
| Improbable | 1 | | | | | | | | | | |

Risk evaluation
The risk evaluation process begins with determining the risk exposure values (Grose, 1987).

Risk exposure value
The risk exposure value for each identified risk factor is calculated as follows:

$$\text{Risk exposure value} = \text{Consequence severity level (Table 5)} \times \text{Risk probability level (Table 7)}$$

The risk exposure values for risk factors on hazard to drivers (x) and duration of traffic blockage (#) were calculated using the consequence severity levels of Table 5 and the corresponding risk probability levels of Table 7, and are tabulated in Table 8. For simplicity and easy reference, the risk exposure values were grouped into four risk exposure classes, each with a designated class code and risk exposure level, as shown in Table 9. The FMEA check sheet shown in Table 2 has already listed the possible preventive actions to eliminate or reduce the identified risk factors. The costs of each of the preventive actions for the toll road/tunnel operations were calculated and are described in Table 10 with a designated cost category, cost level and cost class code. Subsequently, the total cost of preventive actions for each identified risk factor could be determined as shown in Table 11 (Ng, 1998).

With all this cost and risk information, the next step is to determine which risks are acceptable, tolerable or unacceptable. The hazard (class) codes and the numerical levels for the individual risk factors are tabulated in Table 12 for the three variables, i.e. the duration of traffic blockage and the hazard to drivers (Table 9) and the cost of preventive actions (Table 11). According to the Hazard Totem Pole (HTP) algorithm, priority is given to high severity, high likelihood and low cost (Grose, 1987). The hazard index (HTP score) is determined as the sum of the numerical levels of the three variables. More preventive maintenance actions should be carried out for those risk factors with higher hazard index values (HTP scores). For easy reference, Table 12 shows the prioritized

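Table 8 Risk exposure values for risk factors on hazard to drivers (x) and duration of traffic blockage (#)

| Risk factor | x: Consequence severity (A) | x: Risk probability (B) | x: Risk exposure value (A×B) | #: Consequence severity (A) | #: Risk probability (B) | #: Risk exposure value (A×B) |
|---|---|---|---|---|---|---|
| CP | 2 | 3 | 6 | 2 | 3 | 6 |
| IC | 3 | 4 | 12 | 4 | 4 | 16 |
| BT | 4 | 2 | 8 | 4 | 2 | 8 |
| MC | 1 | 4 | 4 | 2 | 4 | 8 |
| CC | 4 | 3 | 12 | 3 | 3 | 9 |
| FC | 3 | 4 | 12 | 2 | 4 | 8 |
| PL | 3 | 4 | 12 | 2 | 4 | 8 |
| BL | 1 | 5 | 5 | 1 | 5 | 5 |
| RL | 1 | 5 | 5 | 1 | 5 | 5 |
| PH | 2 | 4 | 8 | 2 | 4 | 8 |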
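The arithmetic behind Table 8 is a simple product of two levels, and a short Python sketch can make it explicit. The dictionaries below hold the severity and probability levels as they appear in Table 8; the names `SEVERITY`, `PROBABILITY` and `risk_exposure` are illustrative choices and not part of the original model.

```python
# Consequence severity levels per risk factor, as listed in Table 8:
# (hazard to drivers, duration of traffic blockage).
SEVERITY = {
    "CP": (2, 2), "IC": (3, 4), "BT": (4, 4), "MC": (1, 2), "CC": (4, 3),
    "FC": (3, 2), "PL": (3, 2), "BL": (1, 1), "RL": (1, 1), "PH": (2, 2),
}

# Risk probability level per risk factor, as listed in Table 8.
PROBABILITY = {
    "CP": 3, "IC": 4, "BT": 2, "MC": 4, "CC": 3,
    "FC": 4, "PL": 4, "BL": 5, "RL": 5, "PH": 4,
}


def risk_exposure(code: str) -> tuple:
    """Return (hazard exposure, blockage exposure) = severity x probability."""
    hazard, blockage = SEVERITY[code]
    probability = PROBABILITY[code]
    return hazard * probability, blockage * probability


if __name__ == "__main__":
    for code in SEVERITY:
        print(code, risk_exposure(code))
```

Because both inputs range from 1 to 5, the exposure value always lies between 1 and 25, which is the range that the classification in Table 9 covers.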

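Table 9 Risk exposure value classification

| Variable | Risk exposure class | Risk exposure level | Risk exposure value | Risk factor identification codes | Number of risk factors | Cumulative number of risk factors |
|---|---|---|---|---|---|---|
| Hazard to drivers | J | 4 | 16-25 | | 0 | 0 |
| | K | 3 | 9-15 | IC CC FC PL | 4 | 4 |
| | L | 2 | 4-8 | BT PH BL RL CP MC | 6 | 10 |
| | M | 1 | 1-3 | | 0 | 10 |
| Duration of traffic blockage | A | 4 | 16-25 | IC | 1 | 1 |
| | B | 3 | 9-15 | CC | 1 | 2 |
| | C | 2 | 4-8 | BT MC FC PL PH CP BL RL | 8 | 10 |
| | | 1 | 1-3 | | 0 | 10 |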
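The grouping in Table 9 amounts to a threshold lookup on the risk exposure value. The following Python sketch is one possible way to express it. The function name `exposure_level` and the dictionary names are hypothetical, and the class code for the lowest duration-of-blockage class is left out because it is not legible in the table.

```python
def exposure_level(value: int) -> int:
    """Map a risk exposure value (1 to 25) to the Table 9 exposure level."""
    if not 1 <= value <= 25:
        raise ValueError("risk exposure value must lie between 1 and 25")
    if value >= 16:
        return 4
    if value >= 9:
        return 3
    if value >= 4:
        return 2
    return 1


# Class codes per exposure level, taken from Table 9.
HAZARD_CLASS = {4: "J", 3: "K", 2: "L", 1: "M"}
# The code for level 1 of the blockage variable is not given here on purpose.
BLOCKAGE_CLASS = {4: "A", 3: "B", 2: "C"}


if __name__ == "__main__":
    # IC: hazard exposure 12, blockage exposure 16 (Table 8).
    print(HAZARD_CLASS[exposure_level(12)], BLOCKAGE_CLASS[exposure_level(16)])
```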



Table 10 Cost categories on preventive actions

| Cost category | Preventive action | Preventive action cost | Cost range | Cost level | Cost class code |
|---|---|---|---|---|---|
| Substantial | Spare parts for booster transformer | $250 000 | > $200 000 | 1 | S |
| High | Adding ventilation fan | $160 000 | Between $100 000 and $200 000 | 2 | R |
| | Spare parts for CMCS central computer | $150 000 | | | |
| | Spare parts for CMCS field control unit | $125 000 | | | |
| | Developing predictive algorithm | $120 000 | | | |
| Low | Providing training to operations and maintenance staff | $70 000 | Between $10 000 and $100 000 | 3 | Q |
| | Spare parts for CMCS programmable logic controller | $60 000 | | | |
| | Spare parts for photometer | $50 000 | | | |
| | Carrying out power supply loading test | $45 000 | | | |
| Trivial | Spare parts for control/protection relay | $9000 | < $10 000 | | |
| | Spare parts for isolator/contactor | $6000 | | | |
| | Spare parts for MCCB | $3000 | | | |
| | Improving operating procedures | $2000 | | | |

Table 11 Summary of cost of preventive actions

| Risk factor identification code | Total cost of preventive actions | Cost level | Cost class code |
|---|---|---|---|
| CP | $45 000 + $2000 + $9000 = $56 000 | 3 | Q |
| IC | $45 000 + $2000 + $6000 = $53 000 | 3 | Q |
| BT | $45 000 + $160 000 + $250 000 = $455 000 | 1 | S |
| MC | $45 000 + $2000 + $3000 = $50 000 | 3 | Q |
| CC | $2000 + $70 000 + $150 000 = $222 000 | 1 | S |
| FC | $2000 + $70 000 + $125 000 = $197 000 | 2 | R |
| PL | $2000 + $70 000 + $60 000 = $132 000 | 2 | R |
| BL | $120 000 | 2 | R |
| RL | $120 000 | 2 | R |
| PH | $2000 + $70 000 + $50 000 = $122 000 | 2 | R |
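The totals and cost levels in Table 11 can be checked with a few lines of Python. The sketch below copies the cost components from Table 11 and applies the cost ranges of Table 10; the names `COST_ITEMS`, `total_cost` and `cost_level` are illustrative. Two points are assumptions: totals of exactly $100 000 or $10 000 are placed in the higher-cost band, and totals below $10 000 are rejected because the level and class code for that band are not legible in Table 10 (no risk factor in Table 11 falls into it).

```python
# Cost components per risk factor, in dollars, copied from Table 11.
COST_ITEMS = {
    "CP": (45_000, 2_000, 9_000),
    "IC": (45_000, 2_000, 6_000),
    "BT": (45_000, 160_000, 250_000),
    "MC": (45_000, 2_000, 3_000),
    "CC": (2_000, 70_000, 150_000),
    "FC": (2_000, 70_000, 125_000),
    "PL": (2_000, 70_000, 60_000),
    "BL": (120_000,),
    "RL": (120_000,),
    "PH": (2_000, 70_000, 50_000),
}


def total_cost(code: str) -> int:
    """Sum the preventive action costs for one risk factor."""
    return sum(COST_ITEMS[code])


def cost_level(total: int) -> tuple:
    """Return (cost level, cost class code) using the Table 10 ranges."""
    if total > 200_000:
        return 1, "S"  # substantial
    if total >= 100_000:
        return 2, "R"  # high (edge at exactly 100 000 is an assumption)
    if total >= 10_000:
        return 3, "Q"  # low (edge at exactly 10 000 is an assumption)
    raise ValueError("level and code for totals below 10 000 are not given")


if __name__ == "__main__":
    for code in COST_ITEMS:
        print(code, total_cost(code), cost_level(total_cost(code)))
```

Note that a lower cost maps to a higher level number, so that cheap preventive actions raise the HTP score in the same direction as high severity and high likelihood.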

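Table 12 Prioritized hazard index of risk factors

| Priority | Risk factor identification code | Hazard code: duration of traffic blockage | Hazard code: hazard to drivers | Hazard code: cost | Level: duration of traffic blockage | Level: hazard to drivers | Level: cost | HTP score | Cost of preventive actions |
|---|---|---|---|---|---|---|---|---|---|
| 1 | IC | A | K | Q | 4 | 3 | 3 | 10 | $53 000 |
| 2 | CP | C | L | Q | 2 | 2 | 3 | 7 | $56 000 |
| 3 | MC | C | L | Q | 2 | 2 | 3 | 7 | $50 000 |
| 4 | CC | B | K | S | 3 | 3 | 1 | 7 | $222 000 |
| 5 | FC | C | K | R | 2 | 3 | 2 | 7 | $197 000 |
| 6 | PL | C | K | R | 2 | 3 | 2 | 7 | $132 000 |
| 7 | BL | C | L | R | 2 | 2 | 2 | 6 | $120 000 |
| 8 | RL | C | L | R | 2 | 2 | 2 | 6 | $120 000 |
| 9 | PH | C | L | R | 2 | 2 | 2 | 6 | $122 000 |
| 10 | BT | C | L | S | 2 | 2 | 1 | 5 | $455 000 |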

hazard index values (HTP scores) in descending order. Figure 5 shows the HTP diagram constructed from Table 12 (Ng, 1998). Figure 5 is easy to interpret and ready for management to use in making decisions on maintenance activities. It helps management to re-arrange and re-schedule existing maintenance tasks according to the objectives of the organization. All the data and information generated in this risk evaluation stage should be stored in the maintenance information system (MIS) for monitoring purposes.
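The HTP scoring and ranking described above can be sketched in Python as follows. The three levels per risk factor are taken from Table 12, and the names `LEVELS`, `htp_score` and `prioritize` are illustrative. The paper does not state a tie-breaking rule; this sketch simply leaves tied factors in their Table 3 order, which is an assumption that happens to reproduce the order of Table 12.

```python
# (blockage level, hazard level, cost level) per risk factor, from Table 12.
# The dictionary is written in Table 3 order, which matters for tie-breaking.
LEVELS = {
    "CP": (2, 2, 3), "IC": (4, 3, 3), "BT": (2, 2, 1), "MC": (2, 2, 3),
    "CC": (3, 3, 1), "FC": (2, 3, 2), "PL": (2, 3, 2), "BL": (2, 2, 2),
    "RL": (2, 2, 2), "PH": (2, 2, 2),
}


def htp_score(code: str) -> int:
    """Hazard index: the sum of the three numerical levels."""
    return sum(LEVELS[code])


def prioritize() -> list:
    """Rank risk factors by descending HTP score.

    Python's sort is stable, so tied factors keep their Table 3 order
    (an assumed tie-breaking rule, not one stated in the paper).
    """
    return sorted(LEVELS, key=htp_score, reverse=True)


if __name__ == "__main__":
    for rank, code in enumerate(prioritize(), start=1):
        print(rank, code, htp_score(code))
```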


Figure 5 HTP diagram for risk evaluation of the tunnel lighting system
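Reading the priority list behind Figure 5 from the top down until the available resources run out is a cumulative sum over the costs in Table 12. The Python sketch below illustrates this for a given budget, using the HK$800 000 allocation that the paper discusses as an example. The name `select_within_budget` is hypothetical, and stopping at the first action that no longer fits (rather than skipping it and trying cheaper ones further down) is an assumption about how the cut-off is applied.

```python
# (risk factor, cost of preventive actions) in Table 12 priority order.
PRIORITIZED_COSTS = [
    ("IC", 53_000), ("CP", 56_000), ("MC", 50_000), ("CC", 222_000),
    ("FC", 197_000), ("PL", 132_000), ("BL", 120_000), ("RL", 120_000),
    ("PH", 122_000), ("BT", 455_000),
]


def select_within_budget(budget: int) -> tuple:
    """Return (selected risk factors, total spent) for a strict cut-off."""
    selected = []
    spent = 0
    for code, cost in PRIORITIZED_COSTS:
        if spent + cost > budget:
            break  # assumed rule: stop at the first action that does not fit
        selected.append(code)
        spent += cost
    return selected, spent


if __name__ == "__main__":
    print(select_within_budget(800_000))
```

With a budget of 800 000, the first six actions are selected for a total of 710 000; adding the seventh (BL, 120 000) would exceed the budget.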

Maintenance activities execution
According to the outcomes generated in the risk evaluation stage, appropriate preventive maintenance activities can be recommended. Figure 5 shows that it would be most cost-effective to conduct preventive actions for the isolator/contactor of the power supply (IC). However, the resources made available by management should determine the cut-off point. If, for example, HK$800 000 were allocated to implement the improvement works, the first six preventive actions listed in Figure 5 could be carried out to eliminate or reduce the corresponding risks. On the other hand, the consequence severity level of the booster transformer (BT) was found to be critical for both the hazard to drivers and the duration of traffic blockage (see Table 5). Because of its low occurrence probability (see Table 7) and high preventive maintenance costs (see Table 11), the priority for BT was determined to be the lowest in the HTP diagram. If such a low-priority risk must be eliminated, top management must allocate extra resources to carry out the required preventive actions, which might not be cost-effective. Alternatively, a contingency plan can be implemented and the staff concerned can be trained beforehand to deal with such high-severity risk factors when resources are limited. Furthermore, the HTP diagram also indicated that the basic and reinforcement lighting fittings and the photometer had relatively little effect on the normal operations of the tunnel. The preventive maintenance frequency for these items, therefore, should be reduced. The HTP diagram, which consolidates all the results of the risk-based preventive maintenance management model, is a simple tool that helps management to make effective decisions more easily. It should be noted that the risk profiles and the related information generated by the proposed model are useful for understanding the impact of equipment failures. More importantly, such information should be shared within the organization through proper training so that the maintenance activities can be implemented effectively and efficiently.

Risk control and monitoring
The risk control and monitoring processes continuously review the effectiveness and the degree of compliance of the maintenance activities through periodic checks or audits. These control mechanisms provide feedback to management for taking corrective actions, and signals to staff and the public regarding the effectiveness of the implementation of the risk-based maintenance management system. The risk control and monitoring processes must be perceived by staff as a means to determine possible preventive measures and to provide guidelines for further improvement, rather than as a search for a scapegoat. In the control and monitoring stage, deviations from specifications or requirements, abnormal cases and accidents are all reported. For example, if the reduction of maintenance frequency of the basic and reinforcement lighting fittings creates lighting blackouts, the maintenance frequency must be revised. For the isolator/contactor, if the increased frequency of preventive maintenance creates an unacceptable workload, additional manpower needs to be provided. As such, the purpose of risk control and monitoring is to check the quality of the work performed and to take appropriate corrective actions, if necessary.

Maintenance information system
From the system decomposition stage to the risk control and monitoring stage of the risk-based maintenance management cycle, a large amount of information must be processed, shared and stored in different processes. As shown in Figure 2, the maintenance information system (MIS), consisting of five different modules, is designed to facilitate information processing in the maintenance management system. The system/equipment risk database module is one of these modules and was developed to support the implementation of the risk-based maintenance management model. Risk information related to the following is stored in this module and updated as needed:

- the identified risk factors;
- the consequence severity levels;
- the risk probabilities; and
- the Hazard Totem Pole.

The other four modules that comprise the MIS are the document module, the maintenance record module, the work order system module and the material and labour resource module. The computerized MIS supports various processes of the risk-based maintenance management system. It is useful for building up a comprehensive failure rate database for the implementation of a quantitative and objective risk-based analysis. A proper MIS can also generate useful management reports for control, monitoring and auditing purposes.

Conclusion
A risk-based maintenance management model has been developed and applied to a real-life case in a toll road/tunnel company for enhancing preventive maintenance activities. The advantage of the model is that it helps operators to establish and determine suitable maintenance strategies for selecting the best courses of action in managing identified risks. The model also requires the participation of different departments of the company to determine the failure modes and effects of equipment and the corresponding preventive actions. Therefore, it improves the understanding of the impact of equipment failures (risk factors) across different departments. The model starts with identifying all potential risk factors due to equipment failures (risk identification). Then, all the possible consequences and their magnitudes are enumerated (risk measurement). Subsequently, the probability of occurrence for each of the identified equipment failure modes is assessed (risk assessment). Afterwards, the identified risk factors are ranked according to their exposure values and costs of preventive actions. By combining the quantified data of the variables, a priority table and the corresponding HTP diagram can be created for management to decide on the best courses of action to contain and manage the identified risks (risk evaluation). The results of the case study clearly indicate that the formulated model can be applied effectively in implementing appropriate risk-based maintenance strategies to reduce the risks due to equipment failures. More importantly, it is easy to understand and to apply to similar kinds of maintenance improvement projects. The application of RMP in maintenance modelling overcomes the deficiency of most maintenance models by considering the consequences of faults, their likelihood of occurrence and the costs of implementing risk response actions in a meaningful fashion. Moreover, if the risk-based maintenance model is used repeatedly, it will generate a rich risk profile of each component of the system. Based on this information, contingency measures and training for staff can be implemented much more effectively.

References
@Risk (1992) Risk Analysis and Simulation Add-In for Lotus 1-2-3 Version 2.01, Palisade Corporation, New York.
BestFit (1993) User's Guide, Palisade Corporation, New York.
Burchett, J.F. and Tummala, V.M.R. (1998) An application of the risk management process (RMP) in capital investment decisions for an EHV transmission line construction project. Construction Management and Economics, 16(2), 235-44.
Burchett, J.F., Tummala, V.M.R. and Leung, H.M. (1999) A world-wide survey of current practices in the management of risk within electrical supply projects. Construction Management and Economics, 17, 77-90.
Carter, B., Hancock, T., Morin, J. and Robin, N. (1994) Introducing RISKMAN: The European Project Risk Management Methodology, NCC Blackwell Limited, Manchester.
Charette, R.N. (1989) Software Engineering Risk Analysis and Management, Intertext Publications/McGraw-Hill Company, New York.
Cooper, D.F. and Chapman, C.B. (1987) Risk Analysis for Large Projects: Models, Methods and Cases, John Wiley & Sons, Chichester.
Engineering Council (1994) Guidelines and Risk Issues, Lloyd's Register, London.
FMEA (1995) Potential Failure Mode and Effects Analysis, Automotive Industry Action Group (AIAG), FMEA, Southfield, MI.
Grose, V.L. (1987) Managing Risk: Systematic Loss Prevention for Executives, Prentice Hall, Englewood Cliffs, NJ.
Hammersley, J.M. and Handscombe, D.C. (1967) Monte Carlo Methods, Methuen & Company Limited, London.
Hayes, R.W., Perry, J.G., Thompson, P.A. and Willmer, G. (1986) Risk Management in Engineering Construction, Implications for Project Managers, Thomas Telford Limited, London.
Hertz, D.B. and Thomas, H. (1984) Practical Risk Analysis: An Approach Through Case Histories, John Wiley, Chichester.
IEEE Spectrum (1989) Report on Risk, June, 26-7.
Leung, H.M. (1997) Knowledge-based project risk management. MPhil thesis, MSc Engineering Management Dissertation, Department of Manufacturing Engineering and Engineering Management, City University of Hong Kong, Kowloon, Hong Kong.
Leung, H.K., Tummala, V.M.R. and Chuah, K.B. (1998) A knowledge-based system for identifying potential project risks. OMEGA, The International Journal of Management Science, 26(5), 623-38.
Leung, M.Y.H. (1994) The application of risk management process to project appraisal in rolling stock section of the MTRC. MSc Engineering Management Dissertation, Department of Manufacturing Engineering and Engineering Management, City University of Hong Kong, Kowloon, Hong Kong.
McAndrew, I. and O'Sullivan, J. (1993) FMEAS: A Manager's Handbook, TQM Practitioner Series, Technical Communications (Publishing) Limited, Hitchin.
Military Standard (1993) System Safety Program Requirements, MIL-STD-882C, AMSC Number F686.
Mok, C.K. (1994) The application of risk management process in building services cost estimation. MSc Engineering Management Dissertation, Department of Manufacturing Engineering and Engineering Management, City University of Hong Kong, Kowloon, Hong Kong.
Ng, M.F. (1998) The application of risk management process in maintenance activities for toll road/tunnel operations. MSc Engineering Management Dissertation, Department of Manufacturing Engineering and Engineering Management, City University of Hong Kong, Hong Kong.
Raiffa, H. (1994) Science and policy: their separation and integration in risk analysis. The American Statistician, 36(3), 225-37.
Rowe, W.D. (1993) An Anatomy of Risk, John Wiley and Sons, New York.
Schmidt, J.W. and Taylor, R.E. (1970) Simulation and Analysis of Industrial Systems, Irwin, Homewood, IL.
Sundararajan, C. (1991) Guide to Reliability Engineering Data, Analysis, Applications, Implementation, and Management, Van Nostrand Reinhold, New York.
Tomic, B. (1993) Risk Based Optimization of Maintenance: Methods and Approaches, Safety and Reliability Assessment: An Integral Approach, Elsevier Science Publishers B.V., New York.
Tummala, V.M.R. and Burchett, J.F. (1999) Applying a risk management process (RMP) to manage cost risk for an EHV transmission line project. International Journal of Project Management, 17(4), 223-35.
Tummala, V.M.R. and Leung, Y.H. (1996) A risk management model to assess safety and reliability risks. International Journal of Quality & Reliability Management, 13(8), 53-62.
Tummala, V.M.R. and Lo, C.K. (forthcoming) A risk management model for improving electricity supply reliability. International Journal of Business and Economics.
Tummala, V.M.R. and Mak, C.L. (2001) A risk management model for improving operation and maintenance activities in electricity transmission networks. Journal of the Operational Research Society, 52, 125-34.
Tummala, V.M.R., Nkasu, M.M. and Chuah, K.B. (1994) A systematic approach to risk management. Journal of Mathematical Modeling and Scientific Computing, 4, 174-84.
Vaughan, E.J. (1997) Risk Management, John Wiley and Sons, New York.
Yu, C.M. (1996) Managing project schedule risks for an EHV substation construction project with expert system. MSc Engineering Management Dissertation, Department of Manufacturing Engineering and Engineering Management, City University of Hong Kong, Kowloon, Hong Kong, 1996.