European Journal of Operational Research 125 (2000) 398–409

www.elsevier.com/locate/dsw

Theory and Methodology

A discrete semi-Markov decision model to determine the optimal repair/replacement policy under general repairs

C.E. Love a,*,1, Z.G. Zhang b, M.A. Zitron c, R. Guo d

a Faculty of Business Administration, Simon Fraser University, Burnaby, BC, Canada V5A 1S6
b University College of the Fraser Valley, Abbotsford, BC, Canada
c Faculty of Business Administration, Simon Fraser University, Burnaby, Canada
d Department of Statistical Sciences, University of Cape Town, Cape Town, South Africa

Received 1 July 1997; accepted 1 October 1998

Abstract

The state of a machine (system) that may experience failures is characterized by the real age of the machine and the number of failures incurred to date. On failure, the unit may undergo a repair which can partially reset the failure intensity of the unit. In this paper, it is assumed that repairs can, at most, shift back the failure intensity so as to remove the most recent run-time sojourn. The other alternative at a failure is to conduct a major overhaul that serves to refresh the failure intensity of the unit. General cost structures, depending upon both real age and number of failures, are permitted. The decision on failure, to repair or to renew, is formulated as a discrete semi-Markov decision process with real age and number of failures as the state space. Optimal decisions are of the threshold type. That is, on the nth failure, if the real age is above a predetermined threshold value, refresh; otherwise conduct an imperfect repair. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: Markov processes; Semi-Markov; General repair; Renewals

* Corresponding author. Tel.: +1 604 291 3708; fax: +1 604 291 4920.
1 This research was supported by NSERC Grant #1228.

0377-2217/00/$ – see front matter © 2000 Elsevier Science B.V. All rights reserved.
PII: S0377-2217(99)00009-0

1. Introduction

We consider a machine (system) that is subject to failure (breakdown). At any failure instant, there are two types of actions to be taken. Upon breakdown, the machine can be repaired or can be replaced by a new identical one. In general, a repair can bring the state of a failed machine to a level which is somewhere between new and prior to failure (assuming the repair does not damage the machine). The minimal repair (i.e. bad as old) and perfect repair (good as new) are two special (extreme) cases of this imperfect repair model. Kijima [4] proposed that the state of the machine just after repair can be described by its so-called virtual age, which is smaller (younger) than the real age. In his framework, the failure rate depends on the
virtual age of the system. The repair, following a failure, serves to reset the virtual age, which determines the revised failure intensity. In his original paper, Kijima proposed two repair effect models, although in this paper we will focus only on his Type I imperfect repair model, leaving attention to his Type II imperfect repair model (and extensions) to a subsequent paper.

Consider a system that has experienced its $(n-1)$th failure and has been repaired. After the repair, denote its virtual age as $v^{+}(t_{n-1})$, where the $+$ denotes virtual age at failure but after repair. Upon restart, if the $n$th system failure occurs after a sojourn of $x_n$ time units, then the virtual age before repair at the $n$th failure can be denoted as

\[ v^{-}(t_n) = v^{+}(t_{n-1}) + x_n, \tag{1} \]

where $t_n$ is the real age of the system at the $n$th repair, and $-$ denotes virtual age at failure but before repair.

In Kijima's Type I imperfect repair model, he suggested that upon failure, the repair undertaken could serve to reset the intensity only as far back as the virtual age at the start of the last sojourn. That is:

\[ v^{+}(t_n) = v^{+}(t_{n-1}) + h_n x_n. \tag{2} \]

Here, $h_n$ reflects the impact of the $n$th repair ($0 \le h_n \le 1$). If $h_n = 1$, then we have a minimal repair. If $h_n = 0$, then there has been no aging from the $(n-1)$th repair. It is obvious from the above equation that, since a repair can reset, at a maximum, the most recent sojourn time, the post-repair virtual age is non-decreasing, as is the real age. Furthermore, if $h_n = h$ for all $n$, one can immediately see that $v^{+}(t_n) = h \sum_{j=1}^{n} x_j = h t_n$. Thus, in this case, there is a one-to-one correspondence between real age and post-repair virtual age.

The purpose of the original paper of Kijima [4] was to provide a procedure whereby bounds on the expected real age of a system could be determined when repairs follow Type I or Type II processes. In a subsequent paper Kijima [5] sought to determine the optimal cycle time between replacements for a system subject to a sequence of Type I repairs. To solve this problem, utilizing his notion of virtual aging, Kijima [5] formulated the sequence of failure/Type I repairs as a g-renewal density. Integrating this density up to an arbitrary time, $T$, yields a g-renewal function. The concept of a g-renewal density and its integral, the g-renewal function, follows directly from the well-known ordinary renewal functions of renewal processes (see Refs. [1,2]). Determination of the g-renewal function provides a direct estimate of the expected number of failures from time 0 to time $T$ and hence an optimal cost rate $C(T)$ can be found by minimizing the function $\frac{1}{T}\,(C_0 + C_1 H(T))$ with respect to $T$, where $C_0$ is the cost of a single replacement, $C_1$ is the cost per repair and $H(T)$ is the expected number of failures in $(0, T]$ (i.e. the g-renewal function).

To determine the g-renewal function, Kijima proposed a numerical approximation procedure and applied the scheme to an assumed gamma failure process with Type I repairs. We return to his numerical work later in this paper by way of comparing his results to the semi-Markov decision analysis developed in this paper.

Makis and Jardine [7] formulated this failure/Type I repair process as a semi-Markov decision process (for a thorough discussion of such structures, see Ref. [11]) and demonstrated that, under suitable conditions, an optimal stationary policy exists. A two dimensional state space was utilized in their formulation, defined as (number of failures, real age). This provides considerable flexibility in modelling such problems since, in addition to the failure process itself, the costs of repair and replacement can depend upon this state space. A further distinction between the Kijima [5] analysis and that of Makis and Jardine [7] is the assumption regarding cycle lengths. Kijima assumed that if the prescribed time $T$ had been reached then the system would be immediately shut down and replaced. We will refer to this as a fixed-T policy. Makis and Jardine in contrast assumed that replacement occurs at the first failure after time $T$. This latter assumption follows from the observation of Phelps [9,10], following work of Muth [8], that if replacements upon failure cost the same as replacements at scheduled times, then it is always optimal to wait until the first failure following $T$ to carry out the replacement. Such a policy we will refer to as a T-plus policy, in contrast to the
fixed-T policy utilized by Kijima. Up to $T$, with Phelps, failures received minimal repairs whereas the proposal of Makis and Jardine was that they should receive Type I repairs.

To generate a minimum expected average cost per unit time following a T-plus policy with Type I imperfect repairs, Makis and Jardine reformulated their cost function in terms of Kijima's g-renewal function and thereby generated a numerical solution for any set of presumed data. In this way they were able to identify the optimum stationary solution of their semi-Markov model.

Utilizing a g-renewal function approach, while providing a straightforward numerical approximation procedure, does limit the range of structures that can be analyzed. A g-renewal approach limits the analysis to purely time-dependent failure densities. Furthermore, the cost structures generating unique minimum expected average costs must be monotonically increasing and thus cannot depend upon states other than time itself.

Developing repair/replacement policies directly from a semi-Markov decision framework, however, does provide a method of analysis that allows for considerably more flexibility in accommodating structures that might occur in practice. Our specific purpose in this paper is to develop a discrete semi-Markov decision structure and to propose a numerical search procedure that will generate optimal repair/replacement policies for stochastically failing systems subject to Kijima Type I imperfect repairs. We will utilize the two dimensional state space proposed by Makis and Jardine in order to permit state dependent cost structures and, if needed, state dependent failure rates as well.

The remainder of this paper is as follows. In Section 2, we will formulate a discrete SMDP for such failing systems including an appropriate state space definition as well as the necessary transition structures. In Section 3 we will present a search algorithm in order to identify optimal control-limit type policies. In Section 4 we will present the results of some numerical analysis in order to compare this solution procedure with results of Makis and Jardine, Kijima as well as Phelps when utilizing renewal functions.

2. Model formulation

2.1. State space

We note that, in Type I imperfect repairs, the failure intensity of the system is identified uniquely by its virtual age. Furthermore, we assume that repair costs, in general, depend upon failure number and real age. We will thus describe the state of the system in terms of two variables $(n, t_n)$, where $n$ denotes the $n$th failure and $t_n$ the real age at that instant. The state space then is $S = \{(n, t_n) \mid n = 1, 2, \ldots;\ 0 \le t_n \le \infty\}$. The machine failure instants are decision epochs. At each decision epoch, there are two actions, repair ($a = 1$) or replacement ($a = 0$). Although not necessary, for consistency with Kijima [5] as well as Makis and Jardine [7] the replacement cost is considered here to be fixed at $C_0$. On the other hand, again for consistency with previous authors, we treat the repair cost as a bounded non-decreasing function of $n$ and $t_n$ (i.e. $C_1(n, t_n) \le K$; $n \ge 1$; $t_n \ge 0$). Note that the state space of this problem is a two dimensional infinite state space with one discrete state variable ($n$) and one continuous state variable ($t_n$).

In this paper, we utilize a discrete semi-Markov decision process (SMDP) formulation to determine the optimal replacement policy. In order to treat the infinite and continuous state space of ($t_n$) we divide the real age axis into a set of equally spaced age slices and regard the $n$th failure as occurring at age slice $i_n$. To relate time slice $i_n$ to real time $t_n$, we introduce a scaling parameter, $\xi$, defined as the number of time slices in a real time unit (i.e. $i_n/\xi \le t_n < (i_n + 1)/\xi$). Notice then that the state at the instant $t_n$ is $(n, i_n)$ and the failure occurring in this range is assumed to have taken place precisely at $i_n$. In doing so, the state space can be discretized. While this remains an approximation to the real age at this epoch, nonetheless, if need be we can increase the accuracy of this approximation by increasing the value of $\xi$ (and thereby increasing the number of time slices $i_n$). In Appendix A we indicate how this can be accomplished for specific failure models.

Utilizing the main theorem of Makis and Jardine [7], the optimal replacement policy for this
type of system is of the control-limit form (threshold type). That is, for each $n$, given an imperfect repair level $h$ (assumed here to be a constant), there exists a threshold number $s_n(h)$ such that iff $i_n \ge s_n(h)$ the optimal action is to replace at the $n$th failure. For convenience henceforth, we will suppress the parameter $h$ in reference to these threshold numbers since a constant $h$ is implicit in the development of any threshold policy. Furthermore, to simplify the notation, we will also suppress the subscript ($n$) on the age slices and simply refer to the age slice as $i$. Because $C_1(n, i)$ (the repair cost at the discretized points) is a non-decreasing convex function in $n$, it is reasonable to assume a maximum number of failures before replacement, $N$. (This value can be determined numerically.) Hence, given that $C_1(n, i)$ is non-decreasing in $n$, then, as established by Makis and Jardine [7], the optimal replacement policy can be characterized by a set of positive integers ($s_1 \ge s_2 \ge \cdots \ge s_N$). This means that if at the $n$th failure the real age $i \ge s_n$, the machine should be replaced. Since the control limit is a finite value, we can choose an upper bound on the age, $M$ ($\gg s_1$), as illustrated in Fig. 1, beyond which the system is automatically replaced. Bounding the state space in this way, we can reasonably assume a finite, discrete state space $I = \{(n, i) \mid 1 \le n \le N;\ 0 \le i \le M\}$. The form of a typical control-limit policy being sought is sketched in Fig. 1.

2.2. Transition probabilities and times

Since decision epochs are limited to instances of machine failure, only policies of a control limit form need be considered (Ref. [7]). Notice that this eliminates the type of policy analyzed in Kijima [5] wherein the system is replaced at $T$, not the first failure after $T$. The state transition graph in Fig. 1 illustrates the possible transitions among the states. We can identify the action to be undertaken by the location of the system at failure. For any state $(n, i)$:

Failure in Region 1: ($n < N$; $i < s_n$). Action: $a = 1$ (repair). Transition: $(n, i) \to (n+1, j)$ where $i \le j \le M$. Here we are below the maximum number of failures, $N$, and below the control limit.

Failure in Region 2: ($n < N$; $i \ge s_n$). Action: $a = 0$ (replace). Transition: $(n, i) \to (1, j)$ where $0 \le j \le M$. Here we are below the maximum number of failures but above the control limit.

Failure in Region 3: $(N, i)$. Action: $a = 0$ (replace). Transition: $(N, i) \to (1, j)$ where $0 \le j \le M$. Here we have reached the maximum number of failures.

In Fig. 1, the control limit policy we seek to establish is marked by the heavy line, $s_n$. Region 1 denotes those states in which, if a failure occurs, we conduct a Type I imperfect repair. A transition that moves the machine into Region 2 (where it fails) results in a renewal. Finally, transitions occurring in Region 3 denote where an upper bound on number of failures has been reached and again result in a renewal (replacement). It is of course possible for a machine to make a transition from state $(0, 0)$, a renewed state, and reach age $M$ without incurring a failure. However, in our search algorithm below, we would expand the state space (in the time dimension) such that the probability of such an event occurring is effectively 0. (That is, we ensure $M \gg s_1 \ge s_2 \ge \cdots \ge s_N$.)

A typical sample path is marked in Fig. 1. Notice that two lines are placed in Fig. 1. The dotted line is the sample path in terms of (real age, number of failures). The solid line below it is the sample path in terms of (virtual age, number of failures). A description of how such a sample path is generated is included in the figure.

To construct an SMDP model, let $x_n$ represent the discrete sojourn time between the $(n-1)$th and the $n$th failure. We will assume here that the distribution of $x_n$ depends only upon the failure rate function and the imperfect repair model used, noting that our structure would permit failure rates to be state dependent if we desired a more general process. For Type I imperfect repairs, the probability density function (p.d.f.) of $x_{n+1}$ is the conditional p.d.f. of the time to first failure in a replacement cycle. The condition is on the virtual age after the repair, which is $v_n = h t_n$, where $t_n$ is the real age (chronological time) at the $n$th failure. In our discrete Markovian model, we use $v_n = hi/\xi$, with $\xi$ the number of time slices per real time unit. (If the determination of the virtual age of the system depends upon the complete history, as in $\{v_n = \sum_{j=1}^{n} h_j i_j/\xi\}$, then, of course, we lose the
Fig. 1. Control limit policy and sample failure/repair path.
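The two sample paths drawn in Fig. 1 (real age and virtual age against number of failures) follow from the recursions in Eqs. (1) and (2). As a minimal sketch, assuming a constant repair effect h and a list of sojourn times supplied by the caller, the bookkeeping can be written in Python as follows. The function name `kijima_type1_path` and the example numbers are illustrative and do not come from the paper; in a full model the sojourns would themselves be drawn from the failure distribution conditioned on the current virtual age.

```python
def kijima_type1_path(sojourns, h):
    """Track real and virtual age over a sequence of failures (Kijima Type I).

    sojourns: run times x_1, x_2, ... between successive failures (assumed given).
    h:        constant repair effect, 0 <= h <= 1 (h = 1 is a minimal repair).
    Returns a list of tuples (n, real_age, v_before_repair, v_after_repair).
    """
    path = []
    real_age = 0.0
    v_after = 0.0  # a new unit starts with virtual age 0
    for n, x in enumerate(sojourns, start=1):
        real_age += x
        v_before = v_after + x      # Eq. (1): virtual age just before the n-th repair
        v_after = v_after + h * x   # Eq. (2): repair removes part of the last sojourn
        path.append((n, real_age, v_before, v_after))
    return path


if __name__ == "__main__":
    # Hypothetical sojourn times, chosen only to make the output easy to follow.
    for n, t, vb, va in kijima_type1_path([1.0, 2.0, 0.5], h=0.8):
        print(f"failure {n}: real age {t:.2f}, v- {vb:.2f}, v+ {va:.2f}")
```

With a constant h, the post-repair virtual age returned at every failure equals h times the real age, which is the one-to-one correspondence noted in the Introduction.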


Markovian property of the system and the dimensionality of the problem would become unmanageable.)

If at state $(n, i)$ a repair action ($a = 1$) is taken, the (discretized) p.d.f. of $x_{n+1}$ can be written as

\[ f_{hi/\xi}(x) = \frac{f(x + hi/\xi)}{\bar{F}(hi/\xi)}, \tag{3} \]

where $f(x)$ is the p.d.f. of the time to first failure, $\bar{F}(x)$ is the survival function, and $\xi$ is the scaling parameter (the number of time slices per real time unit). The transition probabilities generated when the system begins in each of the three regions are defined as follows. First we have transitions which begin and strictly end within the state space defined for the analysis:

\[ P_{(n,i)(n+1,j)}(a = 1) = \int_{j-i}^{j-i+1} f_{hi/\xi}(x)\, dx, \qquad n \le N-1,\ i \le j \le M-1. \tag{4} \]

We also have transitions arising when the system, following a repair, restarts in Region 1 but next fails on or beyond the maximum age $M$. These are defined by

\[ P_{(n,i)(n+1,M)}(a = 1) = \int_{M-i}^{\infty} f_{hi/\xi}(x)\, dx, \qquad n \le N-1. \tag{5} \]

Notice here that we have consolidated all cases where $j \ge M$ into one of the states $(n, M)$, such that the failures occurring beyond $M$ are put at $M$.

Next we have a machine that has traversed into Regions 2 or 3 and fails. This initiates a replacement. One case is that the machine, on replacement, has its first failure before the maximum age ($M$) is reached. This probability is

\[ P_{(n,i)(1,j)}(a = 0) = \int_{j}^{j+1} f(x)\, dx, \qquad 0 \le j \le M-1. \tag{6} \]

The other situation, again beginning with a machine that has traversed into Regions 2 or 3 and failed, is that the machine on replacement has its first failure after the maximum age ($M$). The probability of making such a transition is

\[ P_{(n,i)(1,M)}(a = 0) = \int_{M}^{\infty} f(x)\, dx. \tag{7} \]

The expected one-step transition times (mean durations) for the two possible actions (repair, replacement) are:

\[ s_{(n,i)}(a = 1) = \int_{0}^{\infty} x\, f_{hi/\xi}(x)\, dx, \tag{8} \]

\[ s_{(n,i)}(a = 0) = \int_{0}^{\infty} x\, f(x)\, dx. \tag{9} \]

In Section 3, a numerical search algorithm to determine the optimal control-limit policy is developed.

3. A search algorithm

Utilizing state transitions defined by Eqs. (4)–(7) and mean durations defined by Eqs. (8) and (9), it is clear that we have a semi-Markov state transition process. Since in each state a decision to either repair or replace must be taken, the resultant structure is a uni-chain SMDP. Choose an initial set of $s_n$ values. Define this set as an initial policy ($R$). For this arbitrary initial feasible policy ($R$), solving the following set of simultaneous equations will yield the value of such a policy in terms of the cost per unit time (real age):

\[ w(n, i) = C_1(n, i) - g(R)\, s_{(n,i)}(1) + \sum_{k=i}^{M} P_{(n,i)(n+1,k)}(1)\, w(n+1, k), \tag{10} \]

where $i \le s_n - 1$, $1 \le n \le N-1$;

\[ w(n, i) = C_0 - g(R)\, s_{(n,i)}(0) + \sum_{k=0}^{M} P_{(n,i)(1,k)}(0)\, w(1, k), \tag{11} \]

with $i \ge s_n$, $1 \le n \le N-1$;

\[ w(N, i) = C_0 - g(R)\, s_{(N,i)}(0) + \sum_{k=0}^{M} P_{(N,i)(1,k)}(0)\, w(1, k), \tag{12} \]

\[ w(1, 0) = 0, \tag{13} \]

where $g(R)$ represents the gain or cost per unit time of this policy and $w(n, i)$ represents the value (relative to state $(1, 0)$) of beginning in state $(n, i)$ and following this policy for an infinite amount of time.

With this as an initial policy (over all states), we can proceed to determine a policy improvement algorithm to identify improved policies. One approach would be to proceed by determining, for each state, whether it is optimal to repair ($a = 1$) or replace ($a = 0$). However, since optimal policies must be of the control-limit (threshold) type, it is clear that one can search directly for optimal control limits ($s_n$). The following search algorithm can be used to find the parameter values of the optimal control-limit policy.

3.1. Algorithm

Step 0: Choose an initial control limit policy $R$ (i.e. parameter values $s_1 \ge s_2 \ge s_3 \ge \cdots \ge s_{N-1}$). Select a bound for the control limits, ensuring that $M \gg s_1$, and a limit on number of repairs, $N$. (Note, we could choose separate bounds $M_1, M_2, M_3, \ldots, M_N$ for each repair number, although the added complexity does not appear to be justified by any computational savings.)

Step 1: For the current control limit policy $R$, compute $g(R)$ and $w(n, i)$ by solving the linear equation system Eqs. (10)–(13). Set $n = 1$.

Step 2: Test for state $(n, \bar{s}_n)$, where $\bar{s}_n = s_n - 1$, to find if:

\[ C_0 - g(R)\, s_{(n,\bar{s}_n)}(0) + \sum_{k=0}^{M} P_{(n,\bar{s}_n)(1,k)}(0)\, w(1, k) < w(n, \bar{s}_n) \tag{14} \]

provided $s_n \ne 0$. If $s_n = 0$ or Eq. (14) does not apply for $(n, \bar{s}_n)$, then go to Step 3. Otherwise, determine the smallest integer $\bar{s}_n$ with $0 \le \bar{s}_n < s_n$ such that Eq. (14) holds for $\bar{s}_n < s_n$ and go to Step 4.

Step 3: Test for state $(n, \bar{s}_n)$, where $\bar{s}_n = s_n$, to find if:

\[ C_1(n, \bar{s}_n) - g(R)\, s_{(n,\bar{s}_n)}(1) + \sum_{k=\bar{s}_n}^{M} P_{(n,\bar{s}_n)(n+1,k)}(1)\, w(n+1, k) < w(n, \bar{s}_n). \tag{15} \]

If Eq. (15) does not apply for $(n, \bar{s}_n)$, then go to Step 4. Otherwise, determine the largest $\bar{s}_n$ with $s_n < \bar{s}_n \le M$ such that Eq. (15) holds. (If $\bar{s}_n = M$, enlarge the bound $M$ and retest.)

Step 4: $n = n + 1$; if $n \le N - 1$ go to Step 2, otherwise go to Step 5.

Step 5: The new control limit rule $\bar{R}$ consists of the parameter values $\bar{s}_1, \bar{s}_2, \ldots, \bar{s}_{N-1}$. If $\bar{R} = R$, the algorithm is stopped. Otherwise, go to Step 1 with $R$ replaced by $\bar{R}$. (If $s_{N-1} \ne 0$, enlarge the bound $N$.)

With this algorithm, we can numerically determine the optimal control-limit (threshold-type) policy for the replacement problem under general repair. In Fig. 2, we present a flowchart of this algorithm.

4. Numerical analysis

Makis and Jardine [7], by way of comparison with analyses previously presented by Kijima [5], assumed that the lifetime distribution of a new system followed a Gamma distribution with density $f(t) = \frac{k^{a} t^{a-1}}{\Gamma(a)} \exp(-kt)$, distribution function $\frac{1}{\Gamma(a)} \int_{0}^{kt} \exp(-u)\, u^{a-1}\, du$, and mean first passage time $a/k$, where $a$ is a shape parameter and $k$ is a scale parameter. $\Gamma(a)$ is the gamma function with parameter $a$. Kijima, as well as Makis and Jardine, assumed for the purposes of their analyses that $a = k$. They further assumed $C_1(n, i) = 1$ for all states ($i \ge 0$; $n \ge 1$). They presented analyses varying both the replacement cost, $C_0$, and the Gamma parameter, $k$.

In order to structure this problem as a discrete SMDP, both the transition probabilities (Eqs. (4)–(7)) as well as the mean duration times (Eqs. (8) and (9)) must be determined. In Appendix A we develop these equations for the Gamma (as well as the Weibull) failure process subject to Type I imperfect repairs.
Fig. 2. Algorithm flowchart.
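Step 1 of the algorithm obtains g(R) by solving the linear system (10)–(13). As an independent cross-check, and not the authors' procedure, the cost rate of a given control-limit policy can also be approximated by a renewal-reward argument: expected cost per replacement cycle divided by expected cycle length, both accumulated by pushing probability mass through the states (n, i). The sketch below assumes an integer Gamma shape and measures the cycle length at the start of the failure slice, so it slightly overstates the cost rate relative to the mean durations of Eqs. (8) and (9). All names (`cost_rate`, `erlang_cdf`, `xi`) are hypothetical.

```python
import math


def erlang_cdf(shape, x):
    """Regularized incomplete Gamma function for integer shape (Erlang CDF)."""
    if x <= 0.0:
        return 0.0
    term, total = 1.0, 1.0
    for m in range(1, shape):
        term *= x / m
        total += term
    return 1.0 - math.exp(-x) * total


def cost_rate(s, h, shape, rate, xi, M, C0, C1):
    """Approximate cost per unit of real time for a control-limit policy.

    s:  thresholds in slices; s[n-1] applies at the n-th failure, n = 1..N-1,
        and the unit is always replaced at failure N = len(s) + 1.
    C1: callable C1(n, i) giving the repair cost in state (n, i).
    """
    N = len(s) + 1

    def R(y):  # lifetime CDF with the argument given in slices
        return erlang_cdf(shape, rate * y / xi)

    # Probability mass over the age slice of the first failure after a renewal.
    mass = [R(j + 1) - R(j) for j in range(M)] + [1.0 - R(M)]
    exp_cost, exp_len = 0.0, 0.0
    for n in range(1, N + 1):
        nxt = [0.0] * (M + 1)
        for i, p in enumerate(mass):
            if p == 0.0:
                continue
            if n == N or i == M or i >= s[n - 1]:
                exp_cost += p * C0           # Regions 2 and 3: replace, cycle ends
                exp_len += p * i / xi
            else:
                exp_cost += p * C1(n, i)     # Region 1: Type I repair
                surv = 1.0 - R(h * i)
                off = (1.0 - h) * i
                for j in range(i, M):
                    nxt[j] += p * (R(j + 1 - off) - R(j - off)) / surv
                nxt[M] += p * (1.0 - R(M - off)) / surv
        mass = nxt
    return exp_cost / exp_len
```

For example, `cost_rate([0], 0.8, 3, 3.0, 20, 200, 2.0, lambda n, i: 1.0)` evaluates the policy that replaces at every failure; its value should sit just above C0 divided by the mean lifetime, and the gap shrinks as `xi` grows.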

Following the parameter assumptions of Makis and Jardine as well as Kijima, we present analyses for the case of $k = 3$, a fixed replacement cost ($C_0$) of 2 and a constant repair cost ($C_1(n, i)$) of 1 (i.e. repair costs do not vary with $n$ or $i$). In Table 1, we provide a comparison of our discrete SMDP model with the g-renewal approach of Makis and Jardine. These restrictions on repair
cost imply we are seeking a T-optimal policy (Refs. [9,7]).

Table 1
$a = k = 3$; $C_0 = 2$; $C_1(n, i) = 1$

h      T_h     W(T_h)   s_n     g(R)
0.0    n/a     n/a      30      1.036
0.1    3.69    1.399    3.6     1.391
0.2    2.52    1.529    2.45    1.518
0.3    2.01    1.615    1.9     1.601
0.4    1.71    1.679    1.65    1.673
0.5    1.51    1.729    1.50    1.721
0.6    1.35    1.769    1.35    1.760
0.7    1.23    1.802    1.225   1.792
0.8    1.13    1.829    1.15    1.819
0.9    1.04    1.851    1.05    1.842
1.0    0.97    1.870    0.95    1.859

$T_h$ is defined in the Makis and Jardine paper as the optimum replacement time given a particular repair effect ($h$). $W(T_h)$ is the minimum cost per unit time determined via their procedure. For $h = 0$ in Table 1, the threshold value of 30 was selected (i.e. $s_n(0) = 30$) as the largest age used in the calculations. This effectively means that if each repair is a renewal ($h = 0$) then the control limit is $s_n(0) = \infty$. Such a renewal model has an average cost per unit time, $g(R)$, of 1.036. Makis and Jardine did not report a result for this case.

Note in Table 1 that, for each value of $h$, a constant threshold value, $s_n(h)$, results over all values of $n$ since the repair cost is a constant. From Table 1 it is clear that our discrete semi-Markov approach agrees quite closely with the results of Makis and Jardine.

An important advantage in using this SMDP approach is, of course, the ability to explore the situation when the repair cost is not constant. In Table 2 we assume a repair cost of the form $C_1(n, i) = p\,(i/\xi)^{1.5} + 1$, where $i/\xi$ is the real age ($\xi$ being the number of time slices per unit of time). Here $p$ is a parameter to provide a family of increasing cost curves. We fix the repair effect at 0.8 ($h = 0.8$) and, to see the impact of a rising cost curve on the optimal solution, we select three values of $p$ (0.05, 0.2, 0.5).

As in Table 1, for each case a constant threshold value ($s_n$) results over all values of $n$, since the repair cost is a constant with respect to $n$. For the second case ($p = 0.2$), the threshold values switch between 0.75 and 0.8 due to end effects. Running with a larger number of time slices would no doubt eliminate such switching.

In Table 3 we assume a repair cost increasing with the number of failures, of the form $C_1(n, i) = 0.1\, n^{r} + 1$. The repair effect, $h$, is again set to 0.8. Results for two values of $r$ (1.05 and 1.2) are presented.

As expected, $s_n$ is now no longer a constant but decreases towards 0 as $n$ increases. Notice that at low failure numbers the threshold values are quite similar because the repair cost is quite similar in this range. For higher failure numbers, $n$, a separation in threshold values is apparent.

5. Conclusions

For a machine (system) subject to failures and which may either be replaced or receive a (Kijima) Type I imperfect repair, we have proposed the use of a discrete SMDP to determine optimal policies. The structure of the decision model utilizes two state variables (real age, number of failures incurred to date). Should a machine be repaired, the time to next failure given any real age is a function of the virtual age of the unit, which can be determined from the knowledge of the unit's real age

Table 2
$a = k = 3$; $C_0 = 2$; $C_1(n, i) = p\,(i/\xi)^{1.5} + 1$

p      s1     s2     s3     s4     s5     s6     s7     s8 (a)   g(R)
0.05   1.0    1.0    1.0    1.0    1.0    1.0    1.0    0        1.836
0.2    0.75   0.75   0.75   0.80   0.80   0.80   0.80   0        1.869
0.5    0.65   0.65   0.65   0.65   0.65   0.65   0.65   0        1.903

(a) With the assumed data, the largest failure number, N, needed was 8. Thus on the 8th failure the threshold real time is 0, implying replacement.
Table 3
$a = k = 3$; $C_0 = 2$; $C_1(n, i) = 0.1\, n^{r} + 1$

r      s1     s2     s3     s4     s5     s6     s7     s8 (a)   g(R)
1.05   1.15   0.80   0.55   0.35   0.20   0.10   0      0        1.839
1.2    1.15   0.80   0.60   0.45   0.30   0.20   0.15   0        1.836

(a) With the assumed data, the largest failure number, N, needed was 8.

and the degree of repair ($h$). We see that, in general, the form of the optimal (repair/replacement) policy is of the threshold type such that at the $n$th failure, if the real age is above a critical value, replace; otherwise undertake an imperfect repair. We also see that this discrete SMDP is very flexible, permitting state-dependent cost structures for both repair and replacement costs. In this paper we have assumed the failure intensity depends only on the virtual age of the machine. Although we have not explored the issue in this paper, with few changes the failure intensity could be made to depend on the state (age, number of failures) as well.

We have also limited this paper to finding optimal T-plus policies [9]. It is clear, however, that with few modifications to the transition probabilities and duration times (Eqs. (4)–(9)) we could determine optimal fixed-T policies as well.

We have only addressed the case of Type I imperfect repairs. Kijima also proposed a Type II imperfect repair model in which the repair effect served to operate on the full virtual age of the system up to that time. Other virtual age models are also possible and may be important in particular situations (see Refs. [3,6]). It is clear that for Type II repairs, the semi-Markov construction would require the introduction of a virtual age variable into the state space. The general formulation here then would require two state variables, real age and virtual age, and possibly a third state variable (number of failures to date) in order to properly characterize the system. Kijima Type II imperfect repairs and extensions appear to have considerable potential in capturing repair effects. Research into the construction and computational demands of appropriate discrete SMDP models for this case is ongoing and the results will be presented in a subsequent paper.

Appendix A. Transition structure for failure processes with Type I repairs

We assume that repairs are of Type I such that the virtual age of the system following the $n$th repair is $h t_n$. In order to discretize our process we introduce a parameter $\xi$ to represent the number of time slices in a unit of time. With this parameter, real age $t_n$ is discretized into $i_n$ non-overlapping ages such that the discretized virtual age after repair is denoted as $hi/\xi$ (where here we have suppressed the failure number, $n$, for ease of presentation). Clearly then, increasing $\xi$ allows us to increase the state space along the time dimension ($i_n$) and to achieve any level of accuracy desired. Of course there is no need to discretize $n$ (the number of failures to date) since it is a discrete valued function.

Our numerical work in Section 4 utilizes the Gamma distribution. Utilizing this $\xi$ discretization approach, Eq. (3) for the Gamma distribution becomes

\[ f_{hi/\xi}(x) = \frac{1}{\Gamma(a)} \cdot \frac{k^{a} \left( \frac{1}{\xi}(hi + x) \right)^{a-1} \exp\!\left( -\frac{k}{\xi}(x + hi) \right)}{1 - \frac{1}{\Gamma(a)} \int_{0}^{khi/\xi} \exp(-u)\, u^{a-1}\, du}. \]

The transition probabilities (Eqs. (4)–(7)) become:

\[ P_{(n,i),(n+1,j)}(1) = \frac{R\!\left(a, \frac{k}{\xi}\,(j + 1 - (1-h)i)\right) - R\!\left(a, \frac{k}{\xi}\,(j - (1-h)i)\right)}{1 - R\!\left(a, \frac{khi}{\xi}\right)}, \]

where $R(a, x)$ is the incomplete Gamma function defined by $\frac{1}{\Gamma(a)} \int_{0}^{x} \exp(-u)\, u^{a-1}\, du$;

\[ P_{(n,i),(n+1,M)}(1) = 1 - \frac{R\!\left(a, \frac{k}{\xi}\,(M - (1-h)i)\right)}{1 - R\!\left(a, \frac{khi}{\xi}\right)} + \frac{R\!\left(a, \frac{khi}{\xi}\right)}{1 - R\!\left(a, \frac{khi}{\xi}\right)}; \]
\[ P_{(n,i),(1,j)}(0) = R\!\left(a, \frac{k}{\xi}\,(j + 1)\right) - R\!\left(a, \frac{kj}{\xi}\right), \]

\[ P_{(n,i),(1,M)}(0) = 1 - R\!\left(a, \frac{kM}{\xi}\right). \]

The mean duration times (Eqs. (8) and (9)) are:

\[ s_{(n,i)}(1) = \frac{\xi}{1 - R\!\left(a, \frac{khi}{\xi}\right)} \left[ \frac{\left(\frac{khi}{\xi}\right)^{a} \exp\!\left(-\frac{khi}{\xi}\right)}{k\,\Gamma(a)} + \left( \frac{a}{k} - \frac{hi}{\xi} \right) \left( 1 - R\!\left(a, \frac{khi}{\xi}\right) \right) \right], \]

\[ s_{(n,i)}(0) = \frac{a\,\xi}{k}. \]

These transition probabilities and times for the Gamma distribution can be used directly in our search algorithm of Section 3 and are utilized in Section 4 of the paper to determine optimal policies and optimal cost per unit time for the system.

One sees very often in the published literature the use of the Weibull distribution in reliability studies. For example, Phelps [9] utilized this distribution in his comparisons of various repair/replacement policies under the assumption of minimal repairs. Given its frequent application, in the interests of completeness, we briefly present here the necessary transition structure for this distribution as well. We assume the lifetime distribution of a new system follows a Weibull distribution with density $f(t) = k^{a} a\, t^{a-1} \exp(-(kt)^{a})$, distribution function $F(t) = 1 - \exp(-(kt)^{a})$ and mean first passage time $(1/k)\,\Gamma((a + 1)/a)$, where $k$ is the scale parameter of the Weibull, $a$ is the shape parameter and, again, $\Gamma(\cdot)$ is the Gamma function.

Utilizing the same discretization approach (with $\xi$ time slices per unit of time), Eq. (3) for the Weibull distribution becomes

\[ f_{hi/\xi}(x) = k a \left( \frac{k}{\xi}(x + hi) \right)^{a-1} \exp\!\left( -\left( \frac{k}{\xi}(x + hi) \right)^{a} + \left( \frac{khi}{\xi} \right)^{a} \right). \]

The transition probabilities equivalent to Eqs. (4)–(7) are:

\[ P_{(n,i),(n+1,j)}(1) = \exp\!\left( \left( \frac{khi}{\xi} \right)^{a} \right) \left[ \exp\!\left( -\left( \frac{k\,(j - i + hi)}{\xi} \right)^{a} \right) - \exp\!\left( -\left( \frac{k\,(j - i + 1 + hi)}{\xi} \right)^{a} \right) \right], \]

\[ P_{(n,i),(n+1,M)}(1) = \exp\!\left( \left( \frac{khi}{\xi} \right)^{a} \right) \exp\!\left( -\left( \frac{k\,(M - i + hi)}{\xi} \right)^{a} \right), \]

\[ P_{(n,i),(1,j)}(0) = \exp\!\left( -\left( \frac{kj}{\xi} \right)^{a} \right) - \exp\!\left( -\left( \frac{k\,(j + 1)}{\xi} \right)^{a} \right), \]

\[ P_{(n,i),(1,M)}(0) = \exp\!\left( -\left( \frac{kM}{\xi} \right)^{a} \right). \]

Finally, the mean duration times (Eqs. (8) and (9)) are:

\[ s_{(n,i)}(1) = \exp\!\left( \left( \frac{khi}{\xi} \right)^{a} \right) \frac{\xi}{ka}\, \Gamma\!\left( \frac{1}{a} \right) \left( 1 - R\!\left( \frac{1}{a}, \left( \frac{khi}{\xi} \right)^{a} \right) \right) \]

and

\[ s_{(n,i)}(0) = \frac{\xi}{ka}\, \Gamma\!\left( \frac{1}{a} \right), \]

respectively.

References

[1] H. Ascher, H. Feingold, Repairable Systems Reliability: Lecture Notes in Statistics, Marcel Dekker, New York, 1984, pp. 33–34.
[2] C.L. Chiang, An Introduction to Stochastic Processes and their Applications, Krieger Publishing, New York, 1980, pp. 172–206.
[3] R. Guo, C.E. Love, Simulating non-homogeneous Poisson processes with proportional intensities, Naval Research Logistics Quarterly 41 (1994) 507–522.
[4] M. Kijima, Some results for repairable systems with general repair, Journal of Applied Probability 26 (1989) 89–102.
[5] M. Kijima, H. Morimura, Y. Suzuki, Periodical replacement problem without assuming minimal repair, European Journal of Operational Research 37 (1988) 194–203.
[6] C.E. Love, R. Guo, Simulation strategies to identify the failure parameters of repairable systems under the influence of general repair, Quality and Reliability Engineering International 10 (1994) 37–47.
[7] V. Makis, A.K.S. Jardine, A note on optimal replacement policy under general repair, European Journal of Operational Research 69 (1993) 75–82.
[8] E.J. Muth, An optimal decision rule for repair vs replacement, IEEE Transactions on Reliability R26 (1977) 179–181.
[9] R.I. Phelps, Replacement policies under minimal repair, Journal of the Operational Research Society 32 (1981) 549–554.
[10] R.I. Phelps, Optimal policy for minimal repair, Journal of the Operational Research Society 34 (1983) 425–427.
[11] H.C. Tijms, Stochastic Models: An Algorithmic Approach, Wiley, New York, 1994, pp. 218–248.
