
Reliability Engineering and System Safety 82 (2003) 257–273

www.elsevier.com/locate/ress

Applying RCM in large scale systems: a case study with railway networks
Jesús Carretero (a,*), José M. Pérez (a), Félix García-Carballeira (a), Alejandro Calderón (a),
Javier Fernández (a), Jose D. García (a), Antonio Lozano (b), Luis Cardona (b),
Norberto Cotaina (c), Pierre Prete (c)

(a) Department of Computer Science and Engineering, Universidad Carlos III de Madrid, Avda Universidad 30, Leganés, 28911 Madrid, Spain
(b) Infrastructure Maintenance División, RENFE (Red Nacional de Ferrocarriles Españoles), Edificio 22, Estación de Chamartín, Madrid, Spain
(c) ADEPA, Rue Perier 17, Montrouge, Paris, France

Received 22 November 2002; revised 28 January 2003; accepted 26 June 2003

Abstract
In 2000, the European Union funded a project named ‘RAIL: Reliability centered maintenance approach for the infrastructure and logistics
of railway operation’, aimed at studying the application of Reliability centered maintenance (RCM) techniques to the railway infrastructure. In this
paper, we present the results obtained in the RAIL project, including an RCM methodology adapted to large infrastructure networks and an
RCM toolkit to perform the RCM analysis, including cost aspects and maintenance planning guidance. This paper addresses the problem of
applying RCM to large scale railway infrastructure networks to achieve an efficient and effective maintenance concept. Railways nowadays use
very traditional preventive maintenance (PM) techniques, relying mostly on ‘blind’ periodic inspection and the ‘know-how’ of maintenance
staff. RCM was seen as a promising technique from the beginning of the RAIL project because of several factors. First, the technical insights
obtained were better than the existing ones, so that several maintenance processes could be revised and adjusted. Second, the interdisciplinary
approach used to make the analysis was very enriching and very encouraging for the maintenance staff consulted. Third, using the RCM structured
approach made it possible to achieve well-documented analysis and clear decision diagrams. Our methodology includes some new features to
overcome the problems of RCM observed in other projects. As a whole, our methodology and Computerized Maintenance Management
Systems have produced two short-term benefits: reduction of time and paperwork, because databases and tools are accessible through the Internet,
and creation of a permanent, accurate, and better collection of information. It will also have some long-term benefits: better PM will increase
equipment life and will help to reduce corrective maintenance costs; production will increase as unscheduled downtime decreases; purchase
costs of parts and materials will be reduced; inventory/stores reports will be recorded more effectively and kept up to date; and better knowledge
of the systems will help the company to choose those systems with the best LCC. The results have been corroborated by applying our
methodology to signal equipment in several railway network sections, as shown in this paper. Because of the successful conclusion of the
project, the Spanish railway company (RENFE) and the German railway company (DB A.G.) not only decided to adopt RCM to enhance PM,
but have also started a large project to implement Total Preventive Maintenance relying on the implementation of the RCM methodology.
© 2003 Elsevier Ltd. All rights reserved.
Keywords: Reliability centered maintenance; Railway maintenance; Reliability; Computerized maintenance management systems; Maintenance planning

* Corresponding author. Fax: +34-91-62-49-129.
E-mail address: jcarrete@arcos.inf.uc3m.es (J. Carretero).
0951-8320/$ - see front matter © 2003 Elsevier Ltd. All rights reserved.
doi:10.1016/S0951-8320(03)00167-4

1. Introduction

In 2000, the European Union funded a project named ‘RAIL: Reliability centered maintenance approach for the
infrastructure and logistics of railway operation’, aimed at studying the application of Reliability Centered Maintenance
(RCM) techniques [2,14] to the railway infrastructure, following the success of RCM in other industrial fields [1,4],
such as aviation [17], the oil industry [19] or ships [26]. The reason for the project was that railway maintenance had
traditionally been planned using the knowledge and experience of each company, but without any kind of
reliability-based methodology to support the maintenance plans and works. Some attempts to use RCM in railways,
like the REMAIN project [32] or the Norwegian railways [29], were not adopted by the companies, probably due to their
very ambitious goals.

As a result, each railway company has today its own maintenance procedures, which, in general, would not be

accepted by other companies and are not economically optimized [25]. This situation arises because railway
companies are very traditional in their procedures and because, historically, railway infrastructures have been
designed to provide a significant level of safety. As a result, the maintenance developed was Breakdown
Maintenance, devoted to bringing the systems back to a perfect state before they malfunction, and train delays or
revenues, for example, were not the major issue. To maintain an efficiently operating infrastructure and to avoid
failure of critical equipment, especially signaling equipment, the focus has clearly shifted over the years to Preventive
Maintenance (PM), devoted to fixing the equipment according to a planned maintenance schedule. Nowadays, many railway
companies have to satisfy rules set by safety regulation authorities that, in several countries, define
maintenance procedures and even the frequencies for PM.

However, railways are competing with other forms of transport today, and railway companies are being split to
provide transport services on one side and infrastructure services on the other. Customers now want the best quality of
service at the lowest cost, forcing the railway companies to optimize every stage of the process, including maintenance.
The railway companies themselves are starting to outsource some maintenance services, entering that business without
any kind of methodology to test the correctness of maintenance procedures.

In this paper, we present the results obtained in the RAIL project, including an RCM methodology [15] adapted
to large infrastructure networks and an RCM toolkit to accomplish the RCM analysis, including cost aspects.
A description of the RCM methodology, template design, Computerized Maintenance Management Systems (CMMS)
tools and databases, and a test case of maintenance planning are presented in this paper. Section 2 briefly describes
the RCM methodology. Section 3 shows the RCM methodology adapted to railways in the RAIL project.
Section 4 briefly describes the CMMS toolkit developed in the RAIL project. Section 5 shows some implementation
results and maintenance optimization for a test case. Finally, Section 6 highlights the major conclusions drawn
from our experience.

2. What is RCM?

The concepts behind RCM are not new, having their origin in the airline industry back in the 1960s. After several
years of experience, in 1978, the US Department of Defense issued the MSG-3 [16], an Airline/Manufacturers
Maintenance Program Planning Document. That year, Nowlan and Heap wrote a comprehensive document on
the relationships among Maintenance, Reliability and Safety, entitled Reliability Centered Maintenance [18],
creating the RCM methodology. RCM spread throughout industries, especially those needing safety and reliability,
during the 1980s and the 1990s, and has now been extended to several industry fields.

But, what is RCM? There are very good definitions of RCM in the literature [2,15,23,27]. In short, RCM can be
defined as a systematic approach to systems functionality, failures of that functionality, causes and effects of failures,
and infrastructure affected by failures. Once the failures are known, their consequences must be taken into
account. Consequences are classified as: safety and environmental, operational (delays), non-operational,
and hidden failure consequences. Later, those categories are used as the basis of a strategic framework for
maintenance decision-making. The decision-making process is used to select the most appropriate task
to maintain a system by filtering the proposed classification of consequences through a logic decision tree. In the 1970s,
and still today, RCM was a major challenge in many industries because it changed the focus of PM from bringing
the systems back to a ‘perfect’ state to maintaining the system in a good ‘functional’ state (within some defined
operational limits). Through this approach, it provides an understanding of how infrastructure works, what it can
(or cannot) achieve, and the causes of failures. Fig. 1 shows the major steps of RCM methodology and their outcomes.

Fig. 1. RCM methodology and outcomes.

RCM methodology [14,15] has three major goals. The first is to enhance safety and reliability of systems by
focusing on the most important functions. RCM is concerned mainly with what we want the equipment to
do, not what it actually does. The second is to prevent or mitigate the consequences of failures, not to prevent the
failures themselves. The consequences of a failure differ depending on where and how items are installed and
operated. The third is to reduce maintenance costs by avoiding or removing maintenance actions that are not

strictly necessary. It is no longer assumed that all failures can be prevented by PM, or that even if they could be
prevented, it would be desirable to do so.

Of the thousands of possible failure modes on any facility or installation, each has a different effect on
safety, environment, operations, or other costs not related to operations. The failure consequences determine what,
if any, resources will be used to prevent their occurrence. For example, in railway signaling infrastructure, there are
thousands of kilometers of underground wires for communications, power supply, etc. Actually, the failure rate
of a wire is very low, and failures are mostly due to construction near or along the tracks. As a result,
wires are not periodically maintained, because the effort to do so would be larger than the benefits obtained.

As RCM provides a ranking of maintenance tasks for a system, it can be used as a good technique for
developing a PM program [8]. A formal review of failure consequences focuses attention on maintenance tasks that
are more effective, diverting energy away from those which have little or no effect [10]. This helps to ensure
that whatever is spent on maintenance is spent where it is most necessary to ensure that the inherent
reliability of the equipment is enhanced.

Today, RCM tools are integrated with CMMS [9], such as Relex [24] or ASPs. The latest trend encompasses
asset management and maintenance, supported by various methods of Condition Based Maintenance Systems and
in-service inspection processes [20,7,12].

So, are there no problems? Yes! A lot of them. First, and most important, RCM initiatives involve a tremendous
amount of resources, time, and energy. It is usually a long-term goal with a short-term expectation. It is known
that many projects to deploy RCM in manufacturing plants failed because large projects require 2 or 3 years
to implement the process, which means expenditures but no proven benefits. Third, the best methodology in the
world will fail if management, staff, and workers do not support it. The success of any initiative depends on
the credibility of the expert’s knowledge, on showing the benefits to the groups that can be affected by the success
of the initiative, and on creating working groups willing to drive the initiative by involving themselves in the
project. Last, but not least, RCM was conceived to study exhaustively only a small part of the equipment of
a factory or system, because making an RCM analysis of every system of the infrastructure would be complicated
and time consuming. RCM experts usually ask for data such as failure rate, mean time between failures (MTBF),
detailed costs, etc. that are often not known for a specific system. As may be seen, most of the former
problems are not related to the methodology itself, but to its implementation aspects.

In Section 3 we describe how we have overcome the former problems and how the RCM methodology can be
adapted to large scale systems, such as railways.

3. Adapting RCM methodology for railway infrastructure

The RAIL project had three major goals. First, to harmonize European maintenance procedures to satisfy the
European interoperability rules aimed at safe trans-national train circulation. Second, to adapt the RCM methodology
to the railway infrastructure and the railway companies. Third, to optimize maintenance costs while keeping the
safety levels.

Those goals had to be satisfied while overcoming the initial constraints found at the beginning of the RAIL project:
safety organisms, which define a safety level and even maintenance procedures; existing maintenance plans,
which are defined by years of practice; and historical habits, because railways are usually very old and large companies
with a lot of history.

Satisfying the former goals was difficult for several reasons. First, each railway company had a very
different understanding of the infrastructure and its particular maintenance rules and procedures. Second,
RCM is a systematic approach aimed at maintaining system functionality rather than restoring the equipment to the ideal
condition. This approach needed a new state of mind from maintenance teams, because systems must be studied from a
functional point of view, and not from a mere structural description. Moreover, the method is very time consuming
and needs a management commitment. Fourth, the scale of a railway network is too large to apply the standard RCM
methodology to the whole network.

To deal with the former problems, we carefully analyzed the railway network and concluded that:

1. The scale is very large: there are hundreds of thousands of assets, but the number of different models is small.
   There are thousands of track circuits, but they can be grouped into a few models. Thus, we decided to make a
   ‘generic’ RCM analysis of each model to create RCM templates, including system description, FMECA
   analysis, estimated criticality, and tasks proposed to solve failures. As RAIL was a European project, strong
   emphasis was placed on unifying the templates among different companies to have maintenance procedures
   accepted in several countries, which is important to promote interoperability.
2. There was a need for a unified cost model to compare data among different companies to promote future
   comparisons of maintenance performance.

There should be several levels of analysis: line level, section level, elemental section level and specific system
level. Each level had different possibilities, but the final goal was to optimize maintenance planning using the manpower
available and increasing, or at least keeping, safety and reliability levels through criticality and FMECA analysis
made on each level.

The decision tree should be modified to demonstrate to the companies that applying the methodology will not only
increase reliability, but will also reduce the maintenance costs of the organization.

The former conclusions were fundamental to convince the company managers that the outcome of the RCM
methodology could be worthwhile for the companies. However, to satisfy those goals, we found that it was
necessary to adapt the RCM methodology to apply it in a large scale system like a railway network. To start with, we
decided to extend the RCM methodology to apply it to functional ‘machines’ on several levels. Our methodology
has four steps: infrastructure breakdown, computing criticality and state of the systems, classical RCM analysis
(FMECA analysis and selection of maintenance type based on criticality), and maintenance planning. Not all the steps
are applied in every level of the methodology, as shown below.

3.1. Infrastructure breakdown

The first step of a standard RCM methodology is to identify the systems to study. From the beginning, it was
very clear that our methodology and tools were going to have two types of users:

• Managing staff: who wanted to use the RAIL approach to redefine the maintenance tasks, and their standard
  frequency, and to compute the manpower needed to maintain the infrastructure with a defined reliability level.
• Maintenance staff: who wanted to use the analyzed results (task + frequency) to define the actual
  maintenance tasks, their frequency and the Maintenance Work Order.

Both types of users have a dramatically different view of what a system is. For managers, the ‘machine’ to study is
not a single asset, like a track circuit, but a whole portion of the network (called section) or eventually the whole
network. For maintenance staff, the systems are the assets to be maintained. Thus, it was meaningful to talk of track
circuits and their components. However, can you do an RCM analysis for all the systems in a network with a realistic
approach? The answer was very clear: no. Former experiences [21] showed that the RCM methodology has
been discarded many times because of its cost and lack of immediate benefits. Moreover, the managers did not
want to analyze all the machines on the network, but the network itself as a machine. They wanted to know the
importance of each part of the railway network from a functional point of view. Maintenance staff, however, were
on the opposite side. They wanted to know in detail the behavior of a system installed on the track, including
failures, causes, tasks, etc.

To satisfy both approaches, we have extended the RCM methodology to be applied to all the former entities, totally
or partially, as shown in Fig. 2. A machine can be a line, a section of the line, or a system installed on the track.
Classical RCM analysis is not done for the upper levels; instead a criticality analysis is made in order to detect what
parts of the railway network are more critical from a functional point of view. This hierarchical analysis, dividing
the network into ‘logical machines’, is our first contribution to the modified RCM methodology. Logical machines can be
formally defined as:

$$N = \bigcup_{i=1}^{n} L_i, \quad \text{where } L \text{ means line and } N \text{ is network} \qquad (1)$$

$$L = \bigcup_{j=1}^{m} J_j, \quad \text{where } J \text{ means section} \qquad (2)$$

$$J = \bigcup_{i=1}^{p} S_i, \quad \text{where } S \text{ means system} \qquad (3)$$

$$S = \bigcup_{i=1}^{k} P_i, \quad \text{where } P \text{ means subsystem} \qquad (4)$$

$$P = \bigcup_{i=1}^{l} M_i, \quad \text{where } M \text{ means maintainable item} \qquad (5)$$

where a section $(J_i)$ may belong to several lines $(L_1, L_2, \ldots, L_k)$, and thus all the machines below, which
defines a graph like the one shown in Fig. 3.

Fig. 2. Railway infrastructure organization.

Table 1 shows two examples of high level ‘machines’ analyzed as a test case in RENFE. The first one is a line with
25 km, eight sections, and more than 250 systems. The second one is a section of the line with only 2 km,
but almost 40 systems. These numbers show another feature of the railway signaling infrastructure: it is not
uniformly distributed along the tracks. We have considered two kinds of sections: stations and tracks between two
stations. The stations usually concentrate most of the railway signaling devices, usually grouped in some kind of
electronic interlocking system, while sections between stations only have some track circuits and signals. Table 2
shows a simplified decomposition of a signal.
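
The hierarchy of Eqs. (1)-(5) can be illustrated with a small data structure. The following is only a minimal sketch, not the RAIL toolkit: the class `Machine`, the helper `parents_of` and the second line are hypothetical names introduced here, while the line, section, system, subsystem and item names come from Tables 1 and 2. A section is stored as one shared object so that it can belong to several lines, which turns the breakdown into a graph rather than a tree.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """A 'logical machine' at any level of the breakdown (Eqs. (1)-(5))."""
    name: str
    level: str  # "network", "line", "section", "system", ...
    children: list["Machine"] = field(default_factory=list)

    def add(self, child: "Machine") -> "Machine":
        self.children.append(child)
        return child


def parents_of(root: Machine, target: Machine) -> list[Machine]:
    """Return every machine under `root` that directly contains `target`."""
    found, stack, seen = [], [root], set()
    while stack:
        node = stack.pop()
        if id(node) in seen:  # a shared section is visited only once
            continue
        seen.add(id(node))
        if any(child is target for child in node.children):
            found.append(node)
        stack.extend(node.children)
    return found


# Example data: names from Tables 1 and 2, plus one hypothetical second line.
network = Machine("Test network", "network")
line_a = network.add(Machine("Villalba-Cercedilla", "line"))
line_b = network.add(Machine("Hypothetical second line", "line"))
section = Machine("Los Molinos-Guadarrama", "section")
line_a.add(section)
line_b.add(section)  # same object: the section belongs to two lines
signal = section.add(Machine("Four lights signal", "system"))
head = signal.add(Machine("Signal head", "subsystem"))
head.add(Machine("Lens", "maintainable item"))
head.add(Machine("Lamp", "maintainable item"))

parent_names = sorted(m.name for m in parents_of(network, section))
```

Sharing the section object, instead of copying it under each line, keeps a single place where criticality and state can later be stored for that section.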

Table 2
Low level machines

| Machine | Example |
|---|---|
| System | Four lights signal |
| Subsystem | Signal head |
| Maintainable item | Lens, lamp, etc. |

Fig. 3. Schematic railway network graph.

However, in RENFE, there are more than 600 general lines, 4000 sections (tracks between stations), and almost
2000 stations. Station tracks should also be included, adding up to almost 8000. The number of systems is very large.
There are more than 20 classes of systems (track circuits, signals, joints, level crossings, locks, etc.), which expand
to more than 100 when the different existing technologies are distinguished. The number of items of each system varies,
but there are more than 10,000 track circuits belonging to four large categories and 21 subcategories (for example, low
frequency, 50 Hz track circuits), more than 9000 signals belonging to eight major categories, and so on. In total,
more than 70,000 systems and 250,000 subsystems. Moreover, note that some entities that we call ‘low level’
are so complex that they are comparable to the machines traditionally studied in many RCM systems as large and complex
[22]; thus a complete RCM analysis is only done for those systems identified as more functionally important
(critical) for their logical machines.

In addition, we have extended the RCM methodology to make an analysis of the state of the critical systems, as we
are coping with safety critical components, whose failure may cause accidents, injuries and deaths. It is important to
know not only the performance state of an asset, but also the asset state, to compute the risk of accident probability.

Table 1
High level machines

| Machine | Example |
|---|---|
| Line | Villalba-Cercedilla |
| Section | Los Molinos-Guadarrama |

3.2. Computing criticality and state of the machines

Criticality is the basis for ranking the ‘machines’. What is criticality? It is a measure of the importance of the system
from a functional point of view. Once criticality is computed for several systems, those can be classified
according to their importance for the whole railway network.

Criticality is computed for the whole hierarchical decomposition of the infrastructure: line, section, and
system level. A set of factors is defined in order to compute the criticality, which is the sum of all the factor values
(see Eq. (6)).

$$c = \sum_{i=1}^{n} F_i \qquad (6)$$

The factors to be taken into account were defined by a team of RCM experts, railway maintenance engineers, and
railway managers. They concluded that the factors should be the same for lines and sections, but should be different for
systems. Why? Lines and sections are classified using functional criteria specified mostly by the client
(the transport companies running trains along the rails), adding some criteria related to infrastructure and
environment. The RAIL consortium agreed on the criteria shown in Table 3 to evaluate the criticality of lines and
sections. Some factors may have subjective values or objective values that are not easy to measure, but we have always
tried to refer the factors to numerical values whenever possible. For example, traffic density 4 means more than
200 trains/day, 3 means between 200 and 60 trains/day, 2 means between 60 and 20 trains/day and 1 means less than
20 trains/day. However, it was impossible to define all of them numerically. For example, maintenance costs are very
different for each company. To ease criticality analysis, we allow each company to tailor the methodology with their
own values, grading each factor from 1 to 4, which means from low to very high. The meaning of each value is
factor-dependent, but it can be seen as a scale from less to more critical.

In most cases, the importance of each factor is not the same. This may be imposed by the policies of the company or of
regulation entities. To accommodate this fact, we have introduced a weight for each factor in the formula and had

Table 3
Criticality estimators for lines and sections

| Factor | Description | Value 1 | Value 2 | Value 3 | Value 4 |
|---|---|---|---|---|---|
| Technology | Kind of technology of the line or section | Mechanic | Electro-mechanic | Electric | Electronic |
| Traffic density | Number of circulation per day | [1,20] | (20,60] | (60,200] | >200 |
| Revenues | Revenues obtained from exploitation | Low | Medium | High | Very high |
| Availability | Number of hours that the line must be available per day | 6 | 12 | 18 | 24 |
| Exploitation | Number of passengers or dangerous freights | Low | Medium | High | Very high |
| Maintainability | Maintenance process complexity | Low | Medium | High | Very high |
| Costs | Costs associated to maintenance | Low | Medium | High | Very high |
| Environmental risk | Risk of environmental damage generated by an installation failure | Low | Medium | High | Very high |
| Safety risk | Risk of people damage generated by an installation failure | Low | Medium | High | Very high |

normalized the values between 1 and 4n.

$$c = \sum_{i=1}^{n} \frac{W_i}{\sum_{j=1}^{n} W_j} F_i \qquad (7)$$

or simplifying the notation:

$$c = \sum_{i=1}^{n} W'_i \times F_i \qquad (8)$$

Weights are the same for the whole company. They cannot be modified by analysts, because that would lead to a different
analysis for each set of systems and would reduce the robustness of criticality with respect to the subjective
variability of the weights. Thus, criticality is mostly affected by the factors and the company policy criteria followed to
define the weights.

For systems like track circuits and signals, criticality is computed similarly, but adding some different criteria such
as safety. The criteria used are mainly defined by engineers, maintenance staff, and safety regulation
authorities, and they are related to MTBF, reliability, availability, etc. A top-down classification was established
for each criterion, following risk category, frequency of failures, hazard security levels, and decision criteria
defined in the RAMS standard [6]. Moreover, we allow a threshold to be defined to consider a component critical,
bearing in mind that the classification could be different depending on the operational environment. The RCM
methodology must be applied initially to those significant items above the threshold.

Depending on the criticality, lines, sections and systems are classified in four levels, or classes of criticality, from A
to D, which are visualized in RAIL CMMS using colors ranging from red to green, respectively. The resulting value
is distributed in the range $[1, 4n]$, where $n$ is the number of factors involved in the computation of the criticality.
To compute the range of the classes, a statistical study was made using the test case shown below. Initially, all the
intervals were similar, but after computing the criticality of the test case systems, we asked the experts in which interval
every system should be placed, thus getting a more realistic distribution of criticality.

However, computing the criticality of thousands of systems is not straightforward. How does the RAIL
methodology cope with the criticality analysis of such a number of systems? By applying criticality inheritance.
Criticality inheritance is our second contribution to the traditional RCM methodology: to reduce the manpower
and time needed to analyze the railway network, criticality inheritance is applied in a top-down manner from lines
to systems. Criticality is computed for high level machines (for instance, lines) and passed down to low level machines
(sections inherit criticality information from lines) as a starting point, applying equations similar to Eq. (9) for
every level (criticality for a section J). The criteria are clear: if a line has a certain criticality, initially it seems
reasonable to apply the same criticality to the systems of that line.

$$c_j = \max(lc_i), \quad lc \text{ being line criticality}, \quad l \in L,\ \forall L \text{ and } J \in L \qquad (9)$$

When a line criticality is applied for the first time, the same criticality is applied to sections and systems of that line.
When a specific section of the line is analyzed, if it belongs to several lines, its criticality is the maximum criticality of
its parent lines (see Eq. (9)). Once computed, the criticality of the section is applied to its systems. Line criticality can
be assigned at the managerial level, while the criticality of the sections can be assigned by several distributed
management teams starting with the inherited criticality, thus reducing the time needed for the analysis of the systems.
Obviously, afterwards, each machine can be analyzed carefully, if needed, to study its situation. For example,
secondary tracks of a station are less critical than primary ones, but those decisions can be taken closer to the section
and system evaluated. This way, the identification of functionally significant items (FSI) can be made quickly
at first glance, while a more detailed evaluation can be made by every maintenance team, which is in charge of a few
systems only.
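
As an illustration of the weighted criticality of Eqs. (7) and (8) and of the inheritance rule of Eq. (9), a minimal sketch is given below. It is not the RAIL CMMS implementation: the function names, the weights and the factor grades are hypothetical, only the factor names are taken from Table 3, and the normalization is the one printed in Eq. (7), $W'_i = W_i / \sum_j W_j$.

```python
def criticality(factors: dict[str, int], weights: dict[str, float]) -> float:
    """Weighted criticality, Eq. (8): sum of W'_i * F_i.

    W'_i is computed as W_i / sum_j W_j, following Eq. (7) as printed.
    """
    for name, value in factors.items():
        if not 1 <= value <= 4:
            raise ValueError(f"factor {name!r} must be graded 1..4, got {value}")
    total_weight = sum(weights[name] for name in factors)
    return sum(weights[name] / total_weight * value
               for name, value in factors.items())


def inherited_section_criticality(parent_line_criticalities: list[float]) -> float:
    """Eq. (9): a section starts with the maximum criticality of its parent lines."""
    return max(parent_line_criticalities)


# Hypothetical company-wide weights and grades for two lines sharing a section.
company_weights = {"Traffic density": 2.0, "Safety risk": 1.0, "Costs": 1.0}
line_a = {"Traffic density": 4, "Safety risk": 3, "Costs": 2}
line_b = {"Traffic density": 2, "Safety risk": 2, "Costs": 1}

c_a = criticality(line_a, company_weights)
c_b = criticality(line_b, company_weights)
c_section = inherited_section_criticality([c_a, c_b])
```

The weights are passed in as a single company-wide dictionary rather than per analysis, mirroring the requirement that analysts cannot modify them; the inherited value is only a starting point that a maintenance team may later refine.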

Table 4
Factors defining the state of a system

| Factor | Description |
|---|---|
| Safety | Risk of accidents because of the state of the ‘machine’ |
| Technology | Type of technology of the system |
| Reliability | Number of failures affecting train circulations |
| Maintainability | Effort associated to system maintenance, economic or in man-power |
| Environmental risk | Environmental risk generated by an installation failure |

3.3. Computing the state of a system

In addition to criticality, we have extended the RCM methodology to evaluate the state of the systems in order to
identify those that are in the worst condition in the network, thus requiring more maintenance and generating a higher
risk of accident. State is used in combination with criticality to choose those systems that must undergo RCM analysis
first, to identify safety and cost issues involved in the maintenance task selection phase, and to establish an
infrastructure replacement plan.

As with the criticality, several factors are involved in state determination (see Table 4), but the formula used is
different:

$$e = \prod_{i=1}^{n} F_i \qquad (10)$$

In this case, we use a product-based formula because we want factors with a low value (which means a bad state)
to influence heavily the global value of $e$. For the same reason, we have defined the factor values from 0 to 4. As can
be observed, if some factor is ranked with the lowest value (0), the total state obtained (applying Eq. (10)) is 0.
The values obtained for the state range from 0 (a very bad state) to $4^n$ (perfect state), $n$ being the number of factors
involved.

As in the criticality case, different factors may have different relevance in different organizations, thus a set of
weights is used again to customize the state analysis. But in this case, the normalization is more complex, as shown in
the following equation:

$$e = \prod_{i=1}^{n} F_i^{\left(n W_i / \sum_{j=1}^{n} W_j\right)} \qquad (11)$$

After computing criticality and state, the infrastructure of the railway network is classified down to the system
level, so that decisions can be taken to help in planning maintenance and investments, as will be shown later.
Including the ‘state of the system’ to complement the RCM analysis is our third contribution to the RCM methodology.

3.4. Selecting the systems to apply FMECA

After computing criticality and state of systems, some criteria must be used to choose the FSI that will go through
the following steps of the RCM methodology. As those steps include very time consuming procedures (FMECA,
decision tree, etc.), two thresholds must be defined for the criticality $(T_c)$ and state $(T_e)$ parameters. Initially, only
systems $(s)$ over the criticality threshold will be chosen $(S_c)$:

$$S_c = \{s_1, s_2, \ldots, s_j, \ldots, s_n\} \quad \forall s_i \in S \text{ and } c(s_i) \geq T_c \qquad (12)$$

But, if time or economic resources are not enough to analyze all the resulting systems, those systems needing more
maintenance can be deduced using also the state $(e)$ value:

$$S_c = \{s_1, s_2, \ldots, s_j, \ldots, s_n\} \quad \forall s_i \in S,\ (c(s_i) \geq T_c) \text{ and } (e(s_i) \geq T_e) \qquad (13)$$

The same criteria can be used to decide how to distribute the maintenance and replacement budget for the railway
network, as replacing the most critical elements with the worst state will optimize the results obtained with
the same budget. The thresholds must be defined at the company level, so that there cannot be subjectivity
depending on the criteria of the analyst applying the equations.

3.5. FMECA analysis

Following Moubray [15], the traditional RCM method may be summarized by the following characteristic
Table 5
Failure mode analysis of a system

| System | Subsystem | Function | Functional failure | Maintainable items | Condition of failure |
|---|---|---|---|---|---|
| FTG track circuit | Transmitter | Filtering the amplified signal | Wrong voltage levels | Cabin filter card, filter card | Electric discharges and overvoltages |
| | | Amplifying the modulated signal | Wrong voltage levels | Cabin amplifier, amplifier card | Electric discharges and overvoltages, wrong connection of the amplifier card |
| | | Selecting the resistor value as a function of the wire length | Track circuit failure | Resistor, wire compensation | Electric discharges and overvoltages, changing weather, ballast connected |
| | | Providing power supply to the FTG | Track circuit failure | Power source | Electric discharges and overvoltages, background batteries empty |
| | | | Track circuit failure | Fuse | Electric discharges and overvoltages |

Table 6
Classification of failure consequences. Punctuality criteria are for general lines

| Failure consequences | Safety criteria | Value | Economic criteria | Value | Punctuality criteria | Value |
|---|---|---|---|---|---|---|
| Catastrophic | Several dead | 100 | > 0.2M Euro | 25 | > 8 h. More than one train seriously delayed | 40 |
| Critical | One dead or seriously injured people | 60 | > 0.1M Euro | 10 | > 8 h. More than one train delayed | 25 |
| Marginal | Temporarily injured people | 20 | > 6000 Euro | 5 | 30 min delay | 8 |
| Insignificant | Not injured, but aesthetic | 5 | < 6000 Euro | 1 | 10 min delay | 2 |

elements: establishing a register of all equipment functions; identifying functional failures (FF) and their causes; making a failure effect classification (FMEA) to identify the most significant items for further analysis; selecting the maintenance tasks based on economic and technical arguments applied to a decision tree; and defining a maintenance plan to implement the tasks.

In railway infrastructure, systems ($S$) are composed of several subsystems ($P$) that are maintained as a whole. Thus, after careful evaluation with experts, we decided to carry out the FMEA at the subsystem level [22]. The methodology applied was auditing some railway company experts, studying existing descriptions of the infrastructure (databases, functional diagrams, etc.), and using the railway system decomposition as a starting point. The first stage was decomposing each system and subsystem, including their functional description with levels of quantitative performance (e.g. frequency 50 Hz). Qualitative criteria must also be quantified (e.g. risk classification in RAMS). For each subsystem, it is mandatory to define a primary function (e.g. to send a 50 Hz signal through the rail), but secondary functions can also be added (e.g. must work 24 h). A secondary goal was to achieve a unified functional description among all the railway companies. Table 5 shows a portion of the FMEA of a transmitter of an FTG track circuit.

Each function may have several FF, and each FF may have several failure modes (FM). The effects of each FM must be defined and measured, and also its consequences and costs, using metrics like MTTR, downtime, cost, safety, etc. The consequences of a failure have been evaluated for the following consequence classes: safety ($S$), costs ($C$), and punctuality ($D$), using Eq. (14), where $P$ is the probability of the risk if the failure occurs, and MTBF is the mean time between failures:

$$R = \frac{P}{\mathrm{MTBF}}\,(S + C + D) \qquad (14)$$

We have assigned a severity ($R$) to each consequence class, getting a consequence distribution. Tables 6 and 7 show the severity and classification of probability used in RAIL–RCM.

The outcome of the FMEA analysis is normalized in the interval [0,1] for each failure and consequence class (as shown in Fig. 4) and the resulting value is filtered through the failure classification defined in the RAMS standard (see Table 8). Both tables provide a classification of the failures that may range from intolerable to negligible.

Once classified, failure consequences are matched using a logical decision tree to choose the best maintenance task (MT) to be applied to a maintenance item ($M$). As our fourth contribution to the RCM methodology, the logical decision tree has also been adapted to reflect the reality of the infrastructure and to introduce RAMS terms. Regarding safety installations, most signaling infrastructure cannot afford ‘by law’ any significant failure related to safety, but experience shows that bad or insufficient maintenance generates accidents, injuries and deaths. Consequently, we have adapted the logical decision tree to include also the system status, and not only criticality, to choose the maintenance tasks. If the failure or the status of the system may affect safety with a certain probability (values of state near to zero), the only solution recommended is ‘restoration’ of the whole system. A decision like this cannot be obtained from the traditional RCM methodology, because only functional features, and not the state of systems, are considered. Thus, not only do intolerable failures lead to redesign of the subsystem; an intolerable state can also lead to replacement as the appropriate task (see Fig. 5). Note that classical RCM only detects structural or design failures, but not a system status that may compromise safety.

Table 7
Classification of probability of failures

| Classification | Values | Probability |
|---|---|---|
| Frequent | 10 | Most probable result if the failure occurs (> 2 × 10^-1) |
| Probable | 6 | It may happen probably (< 2 × 10^-1) |
| Occasional | 3 | Not usual sequence of failures, but it has happened (< 10^-2) |
| Remote | 2 | Not in years, but possible (< 10^-4) |
| Improbable | 1 | Never happened, but possible (< 10^-6) |
| Incredible | 0.5 | Almost impossible (< 10^-8) |

Fig. 4. Failure consequence distribution.
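
As a rough illustration of the scoring used in the FMECA step, the following Python sketch evaluates Eq. (14) and looks up the RAMS class of Table 8. The names `risk_score` and `rams_class`, the dictionary layout, and the MTBF value in the demo are assumptions made for this example; the numeric values are copied from Tables 6, 7 and 8. The sketch does not reproduce the normalization to [0,1] or the logic of the RAIL–RCM Toolkit.

```python
# Illustrative sketch only: names and data layout are hypothetical,
# numeric values are copied from Tables 6, 7 and 8 of the paper.

# Table 6: consequence values (safety S, economic C, punctuality D).
SEVERITY = {
    "Catastrophic":  {"S": 100, "C": 25, "D": 40},
    "Critical":      {"S": 60,  "C": 10, "D": 25},
    "Marginal":      {"S": 20,  "C": 5,  "D": 8},
    "Insignificant": {"S": 5,   "C": 1,  "D": 2},
}

# Table 7: probability value P for each frequency class.
PROBABILITY = {
    "Frequent": 10, "Probable": 6, "Occasional": 3,
    "Remote": 2, "Improbable": 1, "Incredible": 0.5,
}

# Table 8: RAMS class for each (frequency, safety consequence) pair.
_COLUMNS = ("Insignificant", "Marginal", "Critical", "Catastrophic")
_ROWS = {
    "Frequent":   ("Undesirable", "Intolerable", "Intolerable", "Intolerable"),
    "Probable":   ("Undesirable", "Intolerable", "Intolerable", "Intolerable"),
    "Occasional": ("Tolerable", "Undesirable", "Undesirable", "Intolerable"),
    "Remote":     ("Negligible", "Tolerable", "Undesirable", "Intolerable"),
    "Improbable": ("Negligible", "Tolerable", "Undesirable", "Undesirable"),
    "Incredible": ("Negligible", "Negligible", "Tolerable", "Undesirable"),
}


def risk_score(p, mtbf, s, c, d):
    """Eq. (14): R = (P / MTBF) * (S + C + D)."""
    if mtbf <= 0:
        raise ValueError("MTBF must be positive")
    return p / mtbf * (s + c + d)


def rams_class(frequency, safety):
    """Return the Table 8 class for a frequency and a safety consequence."""
    return _ROWS[frequency][_COLUMNS.index(safety)]


if __name__ == "__main__":
    sev = SEVERITY["Critical"]
    # The MTBF value (and its unit) is a made-up figure for the demo.
    r = risk_score(PROBABILITY["Occasional"], 2.0, sev["S"], sev["C"], sev["D"])
    print(r, rams_class("Occasional", "Critical"))
```

Keeping the three tables as plain dictionaries mirrors the idea that the thresholds and values are fixed at company level, so an analyst only supplies the classes and the MTBF.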

If a failure is not against safety, the remaining criteria are then filtered for environmental risk, availability, punctuality, and costs. If several suitable maintenance tasks are found, the cheapest maintenance task is chosen, as shown in Eq. (15). Thus, cost ($B$) is the last criterion to classify the maintenance tasks, with a branch of the logical decision tree adapted to get the most efficient maintenance task:

$$t \in \{t_1, t_2, \ldots, t_j, \ldots, t_n\}, \quad \forall t_i, t_j \in MT,\ (c(t_i) = c(t_j)) \text{ and } (b(t_i) < b(t_j)) \qquad (15)$$

The resulting tasks are classified further in time to optimize the maintenance plan.

Table 8
RAMS classification for failures used in the LDT

| Frequency | Safety: Insignificant | Safety: Marginal | Safety: Critical | Safety: Catastrophic |
|---|---|---|---|---|
| Frequent | Undesirable | Intolerable | Intolerable | Intolerable |
| Probable | Undesirable | Intolerable | Intolerable | Intolerable |
| Occasional | Tolerable | Undesirable | Undesirable | Intolerable |
| Remote | Negligible | Tolerable | Undesirable | Intolerable |
| Improbable | Negligible | Tolerable | Undesirable | Undesirable |
| Incredible | Negligible | Negligible | Tolerable | Undesirable |

3.6. Root cause failure analysis

The FMECA analysis is made firstly on the system class template. Through the analysis, we observed that every operation has failures that occur repeatedly without any observable cause. Those symptoms show chronic hidden failures that could be the root of many other detected ones. Thus, their elimination complements RCM, increasing the performance of the methodology [11]. Solving a root cause eliminates not just one, but a multiplicity of problems, which will not recur because their deepest root causes have been corrected. Combining root cause failure analysis (RCFA) and RCM has other benefits as well. If an area in which RCM has been completed still experiences some failures, some failure mechanisms have been missed. Thus, we have combined both RCFA and RCM as a good technique to detect hidden failures and to achieve new stages of reliability.

A good example of a root cause was detected when analyzing a certain model of low frequency track circuit, which used to have unexpected random failures, even when the major failures were being inspected as regulated. There were several similar failures whose causes seemed to originate in several components. However, after analyzing the system with a multidisciplinary team including several maintenance people, we found that the problems were created by the increased conductivity of a wire due to vibrations against the rail. Analyzing the problem, we saw that those problems were present in several railway sections. When the maintenance staff of the line was interviewed, we found that one technician had detected the hidden failure and solved it by adding several silent blocks to the wire. This solution was recommended to the manufacturer and the installer of the system. As another example, in recent months

Fig. 5. A branch of the logical decision tree.

maintenance staff of one new line observed that the contact wire was ‘eroding’ very fast when the speed was increased to 220 km/h. The failure was initially attributed to wire quality, which led to changing some wire sections, without any benefit. Finally, it was discovered that the erosion was due to the excessive hardening of a model of ‘springs’, which were changed to eliminate the failure.

RCFA has been encouraged in those lines and sections experiencing undocumented failures or random failures without any ‘rational’ explanation, to look for hidden failure causes and to solve the problem. As we still do not have a complete RCM database, RCFA is being accomplished by exploiting the staff knowledge and the existing databases of each railway company. In spite of that, the experience has been really successful, because it has allowed the maintenance staff to be more closely involved with creative aspects of their work.

4. RAIL Internet CMMS

To implement the methodology, we have developed a user-friendly CMMS tool [9] and a database to support the RCM process and the maintenance schedule. The tool, named ‘RAIL–RCM Toolkit’, has been programmed using the Java programming language, and can be used through the Internet. Using this tool, the analysis can be made separately in the different territorial divisions of the railway companies, thus reducing time and management costs related to the expertise needed to cope with the FMECA analysis problem. This section does not provide an exhaustive description of our CMMS, but only a small view of its possibilities.

The RAIL–RCM Toolkit provides railway maintenance professionals with an easy-to-use library of railway infrastructure components and a complete RCM analysis for them. The results of the analysis are collected into a PM database, developed with the same architecture for all the railway company members of the RAIL project, which allows them to share maintenance data and to promote interoperability and the introduction of shared maintenance methods (Fig. 6).

Our CMMS uses the railway company databases to get the inventory and some legacy data related to historic maintenance performance, MTBF, maintenance procedure costs, etc. The RAIL database stores lines and their sections and the evaluations of their criticality and status. It also stores:

• Lines, sections and systems, and their criticality values.
• Systems analysis, their FF, detection of the FF and evaluation for each FF of the frequency, severity, and the criticality criteria.

• Decision-making help and maintenance tasks planning.
• Reports on different aspects of the system (lines, components, functions, etc.), and statistics of the RCM analysis.

Fig. 6. Architecture of the RAIL CMMS.

Our tool has a user-friendly graphical user interface that has been designed jointly with railway maintenance managers and staff. The infrastructure is shown as it is on the railway network. There are quick-access buttons (Fig. 7) to equipment histories and a flexible, detailed, and graphical reporting mechanism for LCC, organic decomposition of components, FMECA analysis, etc. Our system fully integrates and takes advantage of RCM, providing an interactive method for problem resolution.

5. Test case

To estimate the potential benefits of our methodology, several parameters such as risk, probability of failure and availability must be evaluated [5,13,28]. However, our first goal was to demonstrate that the application of the RAIL–RCM methodology could satisfy the needs of railway PM by increasing reliability with the same budget. Secondly, we wanted to test that our toolkit was user-friendly and easy to understand for the maintenance staff.

To test the methodology, the RAIL project analyzed four test cases, including a commuter line near Madrid (Spain), a general line near Frankfurt, a heavy mixed traffic line near Amsterdam and a commuter line near Dublin. Only signaling equipment was considered to reduce the size of the test. In this paper, we will show the results of the Spanish test case: a commuter line 23 km long near Madrid (see Fig. 8) including 253 signaling systems. The average traffic was medium (8 trains/h) and the UIC classification of the line was class 5.

Fig. 7. RAIL–RCM Toolkit example.



Fig. 8. The Spanish test case: line Villalba-Cercedilla.

The RCM project organization consisted of two committees:

• RCM management group. The responsibility of the RCM management group was to control the overall project performance.
• RCM working committees. The working committees were responsible for carrying out the analysis and controlling the technical aspects in the project. The working committees had a permanent group of six people (two from RENFE, two from UCIIIM and two from ADEPA). For specific topics, more RENFE experts assisted them.

As may be seen in Fig. 9, only five kinds of critical signaling components were found: switches, TPS, track circuit, level crossings, and signals. After the criticality analysis, 167 systems were considered critical for the section [5]. In this test case, the most frequent type of components were the signals, and, excepting the level crossings, they were also considered the most critical ones (this pattern can be extrapolated along the railway network). As may be seen, there are few systems with high criticality.

The next step was to elaborate from scratch an RCM template for each kind of system, including the breakdown to maintainable items and the functional breakdown. Those RCM templates were applied to each kind of critical system.

5.1. Maintenance planning and risk reduction by using the RAIL–RCM methodology

Applying reliability to plan maintenance is one of the major goals of RAIL–RCM. Currently, railway maintenance is executed periodically following the rulebook (yearly PM rate, $f_T$ in Table 9), applying a reduction when the available manpower is less than the prescribed one. The problem is that PM

Fig. 9. Total number and criticality classes of the test case systems.

is executed ‘blindly’, the rate being equal for each kind of component, independently of their state, location, or criticality.

RAIL–RCM allows preventive periodic maintenance to be planned with two new criteria:

1. Risk factor
2. Criticality

Let us assume that we only have 1700 h/yr to maintain the test line systems. If we apply ‘blind’ planning, the number of inspections of each type of equipment is computed using the following equation system:

$$\sum_{i=1}^{167} f_i e_i \le 1700 \qquad (16)$$

$$f_1 \ge 3 f_2, \quad f_4 = f_1, \quad f_3 = f_2, \quad f_5 = f_1$$

where $e_i$ is the inspection effort for system $i$, $f_i$ is the number of inspections/year for system $i$, and the equation should respect the restrictions defined, which are consistent with the frequencies defined by the company (see Table 9 column $f_T$).

As we have several pieces of equipment of the same type (as shown in Fig. 8), we can develop the equation to:

$$21 \times f_1 \times 2 + 76 \times f_2 \times 1 + 31 \times f_3 \times 1 + 37 \times f_4 \times 2 + 2 \times f_5 \times 3 \le 1700 \qquad (17)$$

And solving Eq. (17), we get that:

$$f_1 = f_4 = f_5 = 12, \quad f_2 = f_3 = 4$$

Applying the RAIL–RCM methodology, the number of inspections of each type of equipment is computed using the following equation system, where the elements are distinguished by type and criticality class (10 high, 122 medium, and 35 low criticality):

$$\sum_{i=1}^{10} f_{i1} e_i + \sum_{i=1}^{122} f_{i2} e_i + \sum_{i=1}^{35} f_{i3} e_i \le 1700 \qquad (18)$$

$$16 f_{12} e_1 + 5 f_{13} e_1 + 9 f_{21} e_2 + 38 f_{22} e_2 + 29 f_{23} e_2 + 31 f_{32} e_3 + 37 f_{42} e_4 + 1 f_{51} e_5 + 1 f_{53} e_5 \le 1700$$

$$f_{1j} \ge 3 f_{2j}, \quad f_{4j} = f_{1j}, \quad f_{3j} = f_{2j}, \quad f_{5j} = f_{1j} \qquad \forall j \text{ criticality class}$$

$$f_{13} \le \tfrac{2}{3} f_{12}, \quad f_{22} \le \tfrac{3}{4} f_{21}, \quad f_{23} \le \tfrac{1}{2} f_{21}, \quad f_{53} \le \tfrac{1}{2} f_{51}$$

where $e_i$ is the inspection effort for system $i$, $f_{ij}$ is the number of inspections/year for system $i$ and criticality class $j$, and the equation should respect the restrictions defined, where the last four are new and due to the relation of criticalities ($c_{ij}/c_{\max_i}$). Simplifying and solving the equation system again, we get that:

$$f_{51} = 16, \quad f_{12} = f_{42} = 12, \quad f_{13} = f_{53} = 8, \quad f_{21} = 6, \quad f_{22} = f_{32} = 4, \quad f_{23} = 3$$

Thus, what is the risk reduction achieved by applying RAIL–RCM? Following [28], an estimated maintenance frequency can be obtained as

$$f_i = \frac{2A}{uR} \qquad (19)$$

where $A = 10^{-3}$ is a factor to normalize the values, $R$ is the acceptable risk factor for that system computed using

Table 9
Theoretic and RAIL–RCM preventive maintenance frequency

| Component | Criticality (c) | Theoretic yearly PM rate (f_T) | Yearly PM rate using RAIL–RCM (f_RM) | PM rate for 1700 hours (f_R) | Yearly PM rate for 1700 h and RAIL–RCM (f_RA) | Hours/op per inspection (t) |
|---|---|---|---|---|---|---|
| Track circuit | Medium | 24 | 12 | 12 | 12 | 2 |
| Switch | Medium | 24 | 12 | 12 | 12 | 2 |
| | Low | 24 | 6 | 6 | 8 | 2 |
| Signals | Medium | 8 | 4 | 4 | 4 | 1 |
| | Low | 8 | 2 | 4 | 3 | 1 |
| | High | 8 | 8 | 4 | 6 | 1 |
| ASFA | Medium | 8 | 4 | 4 | 4 | 1 |
| Level crossing | Medium | 24 | 12 | 12 | 16 | 3 |
| | High | 24 | 24 | 12 | 8 | 3 |
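
The yearly workload implied by a column of Table 9 can be checked with a few lines of Python. This is a sketch under assumptions: the number of systems per row is not listed in the table and is taken here from the coefficients of Eqs. (17) and (18) (for instance 37 track circuits, and one level crossing for each of the two level-crossing rows), and the names `ROWS`, `COLUMN` and `yearly_hours` are invented for the example.

```python
# Sketch only: the equipment counts per row are inferred from the
# coefficients of Eqs. (17) and (18); Table 9 itself does not list them.

# (component, criticality, count, f_T, f_RM, f_R, f_RA, hours per inspection)
ROWS = [
    ("Track circuit",  "Medium", 37, 24, 12, 12, 12, 2),
    ("Switch",         "Medium", 16, 24, 12, 12, 12, 2),
    ("Switch",         "Low",     5, 24,  6,  6,  8, 2),
    ("Signals",        "Medium", 38,  8,  4,  4,  4, 1),
    ("Signals",        "Low",    29,  8,  2,  4,  3, 1),
    ("Signals",        "High",    9,  8,  8,  4,  6, 1),
    ("ASFA",           "Medium", 31,  8,  4,  4,  4, 1),
    ("Level crossing", "Medium",  1, 24, 12, 12, 16, 3),
    ("Level crossing", "High",    1, 24, 24, 12,  8, 3),
]

# Position of each frequency column inside a row tuple.
COLUMN = {"f_T": 3, "f_RM": 4, "f_R": 5, "f_RA": 6}


def yearly_hours(column):
    """Sum count * inspections per year * hours per inspection over all rows."""
    idx = COLUMN[column]
    return sum(row[2] * row[idx] * row[7] for row in ROWS)


if __name__ == "__main__":
    for name in ("f_T", "f_RM"):
        print(name, yearly_hours(name))
```

Under these assumptions the f_T column adds up to 3784 h and the f_RM column to 1846 h, which are the two totals quoted in Section 5.2.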

Fig. 10. Risk variation by using RAIL–RCM.

Eq. (14), and $u$ is the number of initiating events per year for the system. From the former equation, the risk variation $r$ can be defined as the ratio between the current risk and the risk existing after applying RAIL–RCM:

$$r = 1 - \frac{R_1}{R_0} = 1 - \frac{2A/(u f_1)}{2A/(u f_0)} = 1 - \frac{f_0}{f_1} \qquad (20)$$

where Eq. (20) provides a distribution function of $r$, varying in the interval $(1, -1)$, for systems of the same type and the same number of initiating events, depending only on the new and old number of inspections and a normalization factor to harmonize the distribution. Fig. 10 shows the risk variation when using RAIL–RCM. Systems not present have the same risk before and after applying RAIL–RCM. As can be seen, the risk factor of high-criticality systems has been considerably reduced (50% for signals), while the risk factor of low-criticality systems is slightly increased (5% for signals). This conclusion is very important because most of the accidents in the Spanish railways during the last five years were due to signaling systems and level crossings. Most of the deaths were due to accidents at level crossings.

Obviously, there should be a minimum ‘threshold’ for inspecting each system. That threshold would define additional restrictions for the linear equations used to plan maintenance, in the form of minimum values for the frequencies found as solutions. If the manpower is not available to satisfy the minimum safety standards, an alarm should be sent to the railway company and railway safety authorities.

5.2. Maintenance planning and risk reduction by using the RAIL–RCM methodology

Section 5.1 has described how to optimize maintenance using RCM to reduce risk given a fixed cost for the test case line. However, we also tested RAIL–RCM to optimize costs while satisfying a certain risk factor. The acceptable risk can be a technical or managerial decision. Currently most railway companies use the ALARP paradigm for infrastructure, but what is the maintenance level needed to achieve that?

To compute the ‘optimum’ maintenance intervals, Eq. (18) must be applied, but including more restrictions to guarantee the risk level, as shown in Eq. (21). The major goal is to minimize the manpower needed for maintenance. The $f_T$ column of Table 9 shows the theoretic inspection values recommended for maximum reliability. Currently those values are applied depending on the line classification, without discriminating by criticality. With those premises, the maximum theoretical time ($T_T$) to be devoted to PM would be 3784 h. However, after applying the RAIL–RCM methodology, using our RCM tool with the company experts, we obtained a new PM rate ($f_{RM}$) for each type of equipment and criticality, as shown in the $f_{RM}$ column of Table 9, thus requiring only 1846 h to maintain the section of the line with the same performance and reliability standards. Thus PM

time could be reduced to 48% without affecting safety or availability.

$$\min(pm) = \min\left( \sum_{i=1}^{10} f_{i1} e_i + \sum_{i=1}^{122} f_{i2} e_i + \sum_{i=1}^{35} f_{i3} e_i \right) \qquad (21)$$

$$\min(pm) = \min\left( 16 f_{12} e_1 + 5 f_{13} e_1 + 9 f_{21} e_2 + 38 f_{22} e_2 + 29 f_{23} e_2 + 31 f_{32} e_3 + 37 f_{42} e_4 + 1 f_{51} e_5 + 1 f_{53} e_5 \right)$$

$$f_{1j} \ge 3 f_{2j}, \quad f_{4j} = f_{1j}, \quad f_{3j} = f_{2j}, \quad f_{5j} = f_{1j} \qquad \forall j \text{ criticality class}$$

$$f_{13} \le \tfrac{2}{3} f_{12}, \quad f_{22} \le \tfrac{3}{4} f_{21}, \quad f_{23} \le \tfrac{1}{2} f_{21}, \quad f_{53} \le \tfrac{1}{2} f_{51}$$

$$r_{ij} \le \frac{2A}{u_{ij} f_{ij}} \qquad \forall j \text{ criticality class}, \ \forall i \text{ system type}$$

Experts agreed that the number of inspections could be reduced for most equipment because they had been estimated with maximum frequency of usage. Including the criticality, the restrictions allow the inspection intervals to be adjusted to minimize manpower while maintaining reliability.

Following Nakajima [30], the effectiveness of a system can be measured as:

$$\text{Effectiveness} = \text{Availability} \times \text{Performance Rate} \times \text{Quality Rate} \qquad (22)$$

According to him, effectiveness is an operating performance measure, which combines availability, productivity, and quality rate into a single quantitative measure in order to evaluate a system’s performance. Assuming that performance and quality rate are not reduced by applying RAIL–RCM criteria in Eq. (22), as the experts and the first two months of tests indicate, the only parameter influenced is the availability, because of the reduction of the line downtime or delays of trains due to PM. Defining the achieved availability as [31]:

$$\text{Availability} = \frac{MTBM}{MTBM + PMT} \qquad (23)$$

where MTBM is the mean time between maintenance, and PMT is the PM time.

We can compare the effectiveness of our methodology by comparing the situation before ($E_a$) and after ($E_r$) its application, assuming an MTBM for the line of a year (8760 h):

$$E_a = \frac{MTBM}{MTBM + 3784} = 0.70 \qquad (24)$$

$$E_r = \frac{MTBM}{MTBM + 1846} = 0.83$$

where $E_r$ is nearer to the industrial standards (85%) than $E_a$, the actual PM theoretically recommended in the railway network. Actually, the manpower currently devoted to PM in the test case is more similar to the 1700 h used in the former section. The PM manpower has been reduced over the years while the number of accidents was very low; however, experience was the only source for that decision. The RAIL–RCM methodology provides an analytical tool to compute ‘optimum’ values considering reliability.

5.3. Manpower needed to develop the test case

The manpower devoted to the test case was 1440 h, as shown in Table 10. However, the effort was not equally distributed. The biggest part of the effort was devoted to elaborating the RCM template of each system (720 h), mostly because we were training the experts at the same time. Surprisingly, they were happy with the methodology and very receptive from the very first stages. Once the templates were made, analyzing the line was easy and it was done in 200 h. The rest of the time was devoted to choosing maintenance tasks and elaborating the maintenance plan.

As the RAIL–RCM Internet toolkit allows cooperative work, we hope to have the network analyzed to the system level in one year.

Table 10
Manpower devoted to the test case

| Personnel | Number of hours |
|---|---|
| Maintenance engineer RENFE | 290 |
| Signaling specialist RENFE | 480 |
| ADEPA assistance | 320 |
| UCIIIM assistance | 480 |
| Total | 1470 |

6. Conclusions

This paper addressed the problem of applying RCM to large scale railway infrastructure networks to achieve an efficient and effective maintenance concept. Railways nowadays use very traditional PM techniques, relying on ‘blind’ periodic inspection and the ‘know-how’ of maintenance staff. RCM was seen as a promising technique from the beginning of the RAIL project because of several factors. First, the technical insights obtained were better than the existing ones, so that several maintenance processes could be revisited and adjusted. Second, the interdisciplinary approach used to make the analysis was very enriching

and very encouraging for maintenance staff, which was consulted for the first time. Third, using the RAIL–RCM structured approach allowed us to achieve well-documented analysis and clear decision diagrams.

However, there were also some drawbacks we had to overcome. First, the method was very time consuming and, initially, it was not suitable for large scale infrastructures. Second, the RCM methodology, even using RCM II, is designed for industry. The criteria to calculate criticality and the decision charts presented some limitations when applied to railway infrastructure. Our methodology has included the following new features to overcome the former problems:

• New concept of ‘machine’, defined as any entity functionally significant to the organization. We consider several types of ‘machines’ (lines, sections, functional elements such as track circuits, etc.).
• Downsizing of the problem by making a ‘generic’ RCM analysis of each machine on a template. The thousands of elements could be reduced to fewer than 120 classes. Each template carefully studied one of those classes, including a complete FMECA analysis.
• Criticality inheritance, so that the criticality of a higher-level machine is propagated to lower-level machines if the experts choose this option. Using this approach, the RCM methodology can be easily used for different purposes: planning and optimizing maintenance at a large scale, planning maintenance of each system separately, or making an exhaustive RCM analysis of a section of a line to study its behavior in a more detailed way.
• Root cause analysis, which allowed us to find recurrent failures, to solve them for all systems of the same class, and to modify maintenance procedure books accordingly.
• Using criticality and state to classify critical systems.

Our methodology has been implemented in a CMMS, named RAIL–RCM Toolkit, which allows the RCM methodology to be executed collaboratively through the Internet. With our tool, the managing staff makes a first classification of the criticality of infrastructure to the line, or even section, level. This classification is provided to the territorial managers, who can study, using our CMMS, each one of the sections in their areas to compute criticality reflecting the specific features of each one. Finally, each maintenance team can influence the system by applying the same criteria to each system they are maintaining. This structured approach, helped by the RAIL–RCM Toolkit, had a very good acceptance among the users.

Our methodology and CMMS have produced some short-term benefits: reduction of time and paperwork because of databases and tools accessible through the Internet, and creation of a more accurate collection of maintenance information. It will also have some long-term benefits: better PM will increase equipment life and help to reduce corrective maintenance costs; production will increase as unscheduled downtime decreases; purchase costs of parts and materials will be reduced; inventory/stores reports will be more effective and up to date; and better knowledge of the systems will help the company to choose the systems with the best LCC.

Even though field tests are still in progress, the tests executed in several countries show that our RCM methodology is suitable for railway PM, enhancing the effectiveness of PM to reach almost the industry standards. As a result of the successful conclusion of the project, the Spanish railway company (RENFE) and the German railway company (DB A.G.) not only decided to adopt RCM to enhance PM, but have also started a large project to implement Total Productive Maintenance (TPM) [3] relying on the implementation of RAIL–RCM.

Acknowledgements

This work was partially funded by the European Union project 2000-RD-10819 and the Spanish Ministry of Science under the project TIC2000-1995-CE. We would like to acknowledge the railway partners of the RAIL project (RENFE, DB A.G., Iarnrod Eireann and Netherlanden Spoorwegen) and other research partners (Bast&Roost and FIR) for their contribution to this work.

References

[1] Abdul-Nour G, et al. A reliability based maintenance policy: a case study. Comput Ind Engng 1998;33(3/4):591–4.
[2] Anderson R, Lewis N. Reliability centered maintenance: management and engineering methods. The Netherlands: Elsevier; 1990.
[3] Ben-Daya M. You may need RCM to enhance TPM implementation. J Qlty Maintenance Engng 2002;6(2).
[4] Carretero J, Garcia F, Perez JM, Perez M, Cotaina N, Prete P. Study of existing reliability centered maintenance (RCM) approaches used in different industries. Technical report FIM/110.1/DATSI/00, Spain: Universidad Politécnica de Madrid; 2000.
[5] Carretero J, Garcia F, Perez JM. Planning preventive maintenance in railway networks using RCM. Working With Display Units (WWDU) Conference, Germany; May 2002.
[6] EN 50126. Railway applications - the specification and demonstration of reliability, availability, maintainability and safety (RAMS), CENELEC; 2000.
[7] Deshpande VS, Modak JP. Application of RCM to a medium scale industry. Reliab Engng Syst Safety 2002;77(1):31–43.
[8] Hall R. Optimizing preventive maintenance using RCM. Maintenance 1992;7(4).
[9] Hounsell D. Tomorrow’s CMMS. Plant Maintenance Resource Center, USA: Trade Press Publishing Corporation; 1996. http://www.plant-maintenance.com/maintenance_articles_cmms.shtml.
[10] IEC, Dependability management. Part 3–10. Application guide: maintainability. IEC; 1982.

[11] Latino M. RCFA + RCM = formula for successful maintenance. Plant Engng Mag 1999.
[12] McCall J. Maintenance policies for stochastically failing equipment: a survey. Mgmt Sci 1965;11(5).
[13] Morsmann A. Reliability-based maintenance ‘breakthrough’ in maintenance improvement. Machine Plant Syst Monit 2000;21–3.
[14] Moubray J. Maintenance management: a new paradigm. Maintenance January 1996;11(1).
[15] Moubray J. Reliability centered maintenance RCM II. Oxford, UK: Butterworth/Heinemann; 1997.
[16] ATA. MSG-3, revision 2. ATA of America; 1978.
[17] NAVAIR, Guidelines for the naval aviation reliability-centered maintenance. Direction of Commander, Naval Air Systems Command, USA; 1996.
[18] Nowlan FS, Heap HF. Reliability-centered maintenance. Technical report AD/A066-579. National Technical Information Service, US Department of Commerce, Springfield, Virginia; 1978.
[19] OREDA Consortium, Reliability data handbook, 3rd ed.; 1997.
[20] Parida S, Kotu NR, Prasad MM. Development and implementation of reliability centered maintenance using vibration analysis: experiences at Rourkela steel plant. 15th WCNDT, Rome; 2000.
[21] Pintelon L, Nagarur N, Van Puyvelde F. Case study: RCM - yes, no or maybe? J Qlty Maintenance Engng March 1999;5(3).
[22] Pujadas W, Chen F. A reliability centered maintenance strategy for a discrete part manufacturing facility. Comput Ind Engng June 1996;31(1/2).
[23] Rausand M. Reliability centered maintenance. Reliab Engng Syst Safety 1998;60:121–32.
[24] RELEX. Computerized maintenance management systems: RCM tools. RELEX Software. http://www.relexsoftware.com, 2001.
[25] Red Nacional de los Ferrocarriles Españoles. Procedimientos de mantenimiento de instalaciones de señalización; septiembre de 2000.
[26] Schlkins N. Application of the reliability centered maintenance structures methods to ships and submarines. Maintenance June 1996;11(3).
[27] Smith AM. Reliability-centered maintenance. New York: McGraw-Hill; 1993.
[28] Svee H, Jorgen H, Vatn J. Estimating the potential benefit of introducing reliability centered maintenance on railway infrastructure. SINTEF technical report; 2000.
[29] Vatn J, Hokstad P, Bodsberg L. An overall model for maintenance optimization. Reliab Engng Syst Safety 1996;51:241–57.
[30] Nakajima S. Total productive maintenance. Productivity Press; 1988.
[31] Ireson WG, Clyde FC. Handbook of reliability engineering and management. New York: McGraw-Hill; 1988.
[32] REMAIN consortium. Final consolidated progress report, European Union; February 1998.
