

Darko Louit and Rodrigo Pascual¹, Centro de Minería, Pontificia Universidad Católica de Chile
Dragan Banjevic and Andrew K.S. Jardine, Department of Mechanical and Industrial Engineering, University of Toronto

In industries characterized by heavy utilization of equipment and machinery, such as mining, oil & gas, utilities, transportation, adequate stockholding of critical spare parts becomes essential. Insufficient stocks affect overall performance of physical assets, as lack of spares may result in gross penalties, lower availability or increased operational risks. On the other hand, oversized inventories lead to inefficient use of capital and may imply severe expenditures. This paper presents various approaches for the determination of the optimal stock size, when the stock is composed of (i) non-repairable or (ii) repairable parts. The paper is focused on spares for relatively expensive, highly reliable components, rather than on fast-moving spare parts. Optimization criteria considered are minimization of costs, maximization of equipment availability, and the achievement of a desired stock reliability (probability that a spare part request will not be rejected due to lack of spares in stock). For stock reliability, instantaneous and interval reliability calculations are considered. In addition, models directed to the estimation of the remaining life of a given stock of spare parts (at a certain stock reliability level) are introduced. The paper describes several models subject to practical industrial application, and presents case studies from utilities and mining to illustrate their use.

Keywords: Inventory, spare parts, stochastic processes, non-repairable parts, repairable parts.

¹ Corresponding author

1. INTRODUCTION

Inventories represent about one third of all assets in a typical company (Díaz and Fu, 1997). Of these, spare parts are of particular importance for industries characterized by heavily utilized and relatively expensive equipment. This paper presents several models for spare parts inventory optimization from a reliability engineering perspective and illustrates their use through industrial case studies. This perspective differs from that of general inventory control primarily in that no infinite populations are assumed, so the demand rate for spares depends on the number of units currently in operation. In a maintenance environment, the focus of inventory models should be that of supporting the operation of a single unit or component (or a fleet or group of components), ensuring that operational requirements are achieved. Thus, maintenance actions that affect forecasting of spares demand constitute a major component of inventory control in this context.

In most industrial applications, expensive parts need to be stocked to guard the operation from unwanted (and costly) stock-outs. Typically, each demand is satisfied from a part taken from stock (if available), or else a backorder occurs. The lost sales rule that is common in final product inventory problems is usually not applicable to the spare parts area, because if no spares are available, extended equipment downtime is generated. The question of how many spare parts to stock has been addressed by numerous researchers and has given rise to a wide variety of models (for a recent survey of the literature the reader is referred to Kennedy et al., 2002; a comprehensive review of repairable spare parts models is provided by Guide and Srivastava, 1997). In this paper we consider single echelon models that are stochastic in nature.
Single echelon systems are those in which the inventory operates at one location, so that spare parts are stocked at the local level only (consider an industrial plant where the inventory supports the operation of the equipment in that plant only). On the other hand, using the definition given by Rustenburg et al. (2000), when the same inventory system operates at different locations, so that repair facilities and spare parts may be required at both the local level and a central level, the problem is called the two-echelon case (multi-echelon refers to two or more echelons in the inventory system structure).

We will concentrate only on relatively expensive, critical parts, which are subject to field failures with operational consequences, i.e. coverage from unpredictable failures is required, as the costs associated with a stock-out are usually significant. This does not mean that the authors disregard the analysis of cheaper, less critical parts. In fact, these may represent large fractions of the total stock held, but in many cases it is the complex, expensive parts that account for most of the total value of the spare parts stock in a company. Determination of stock levels and order quantities for cheaper, frequently used parts with low associated shortage costs is thus left out of the scope of this paper. References such as Tersine (1988) include several models directed to the determination of optimal stockholding policies for this class of spares.

Spare parts can be generally classified into non-repairable and repairable. After an initial applied study with the Canadian oil producer Syncrude (see Wong et al., 1997), researchers at the Condition-Based Maintenance Laboratory at the University of Toronto have investigated and developed models to calculate the optimal stock size in the cases of non-repairable and repairable components (the terms components and parts are used interchangeably throughout the paper). A repairable part is one that, upon removal from operation (due to a preventive replacement or failure), is sent to a repair or reconditioning facility, where it is returned to an operational (ready-to-operate) state. Non-repairable parts, on the other hand, have to be discarded once they have been removed from operation (as it is uneconomical or physically impossible to repair them). Inventory control models used in each case are different, thus they will be treated here separately.

In this paper we present risk and cost models for the optimization of inventories of both non-repairable and repairable spares [some of the models contained in this work were also included in an earlier paper (Louit et al., 2005)]. In the case of repairable parts, we consider the scenarios of unlimited and limited number of repair servers available. In addition, models for the calculation of the remaining life of spares inventories are introduced. Case studies based on applications in an electrical utility, an open-pit mining operation, and the defence sector have been included to illustrate the use of the models described. Also, the estimation of spares demand when no data is available (through the use of Bayesian updating) is briefly discussed in the conclusion.

Many models discussed in the literature assume that demand for spares follows a Poisson process, where the number of failures (or replacements) per unit time for a population of m components in operation follows a Poisson distribution with mean $m\lambda$, where $\lambda$ is the failure (or replacement) rate of an individual component. This assumption is less restrictive than it initially seems, as the number of identical units in operation is often relatively large.
When this occurs, the superposed demand process for all the units converges rapidly to a Poisson process, independently of the underlying time to failure distribution (see e.g. Cox, 1962) (if the failure distribution is exponential, the number of failures in an interval follows exactly the Poisson distribution, for any number of components). Because of this, the use of the Poisson distribution in spare parts inventory modeling has found wide application (see, e.g. Birolini, 1999).

The remainder of the paper is structured as follows: in section 2 a brief definition of terms is presented. Section 3 introduces risk models used for non-repairable components. Section 4 introduces risk models for repairable components. In section 5, a brief discussion of cost models is presented. Case studies for non-repairable and repairable components are included in section 6. Finally, section 7 summarizes the paper and provides some final remarks.

2. DEFINITION OF TERMS

In determining the number of spare components needed to protect an industrial operation from costly stock-out situations, first it is necessary to define under which criterion the stock level is optimal. Of course, the criterion will not be the same for every application, though in industrial practice cost minimization is typically preferred. This makes it necessary to generate reliable estimates for the cost associated with running out of spares. The latter is not always an easy task, as a spares shortage might imply complex consequences that are seldom quantifiable in monetary terms. When shortage costs are unknown, the optimization criterion is usually shifted toward an inventory performance measure such as the probability of having a spare at hand when demand is generated. The user could also be interested in maximizing availability of the equipment supported by the inventory under analysis. In this paper, we consider four criteria for optimization, defined as follows:

(i) Instantaneous reliability (of stock): this is the probability that a spare is available at any given moment in time. It is equivalent to the fraction of demands that can be immediately satisfied from stock at hand. In the literature, instantaneous reliability is commonly referred to as fill rate (see e.g. Petrovic et al., 1986), or less often as availability of stock or point availability in the long run.

(ii) Interval reliability (of stock): this is the probability of not running out of stock at any moment over a specified period of time, such as one year. Because reliability has to be maintained for every moment during the interval, this criterion is more demanding than instantaneous reliability.

(iii) Cost: perhaps the most common optimization criterion, this takes into account all costs associated with purchasing and stocking spares, together with the cost of running out of a spare part where applicable.

(iv) Availability: this is the percentage of non-downtime of a system/unit, considering only the lack of spare parts as cause for downtime (i.e. only extended downtime due to shortage of spares is considered in the calculations; all other sources of downtime, such as time to conduct regular or preventive maintenance of the system, are not included in this case).

In the following sections, mathematical models are introduced. The presentation of these models is consistent with the following situation (and basic notation).
Consider that a group or fleet of m independent components is required to operate for an interval of length T, with the mean time to replacement (due to failure or preventive removal) of one component denoted by $\mu$ and its standard deviation denoted by $\sigma$. Let $N(T,m)$ be a random variable representing the total number of replacements in the interval [0,T] and $S(k,m)$ the time until the kth replacement. Then, the probability of having fewer than k replacements in [0,T] is equal to the probability that the time until the kth replacement is greater than T, that is,

$$P(N(T,m) < k) = P(S(k,m) > T). \tag{1}$$


The calculation of this probability is key to the determination of the optimal stock size, according to all four criteria considered. Note that the term replacement is used, rather than failure, to identify an event generating demand for spares. The reason for this distinction is provided in the following paragraphs.

The inventory models we propose assume that demand follows a Poisson process. Under this assumption, and if no preventive replacement policy is applied, the replacement rate $\lambda$ can be calculated as 1/MTBF, where MTBF is the mean time between failures obtained from the failure time distribution F(x). When an age-based preventive replacement policy is underway, with preventive replacement time $t_p$, then $\lambda$ has to be calculated as 1/MTBR, where MTBR is the mean time between replacements, that is:

$$\mathrm{MTBR} = \int_0^{t_p} [1 - F(x)]\,dx. \tag{2}$$

This model assumes that demand for spares comes from both the preventive and failure replacements, thus planning the total number of replacements during [0,T]. Note that in the case of non-repairable components, handling of components needed for the preventive replacements can be planned in advance (including ordering of the required spares), thus eliminating the need to stock these items. In this case only a safety stock to cover field failures should be maintained, so the replacement rate for one component will be 1/MTBFR (MTBFR: mean time between failure replacements), where

$$\mathrm{MTBFR} = \frac{1}{F(t_p)} \int_0^{t_p} [1 - F(x)]\,dx. \tag{3}$$
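The two integrals for MTBR and MTBFR are easy to evaluate numerically. The following is a minimal sketch, assuming a Weibull time-to-failure distribution with purely illustrative parameters (shape 2, scale 1000 hours, preventive replacement at 600 hours); the function names and the trapezoidal rule are choices made for this illustration and are not part of the models described in the paper.

```python
import math

def weibull_cdf(x, shape, scale):
    """F(x) for a Weibull time-to-failure distribution (illustrative assumption)."""
    return 1.0 - math.exp(-((x / scale) ** shape))

def integral_of_survival(cdf, upper, steps=20000):
    """Trapezoidal estimate of the integral of 1 - F(x) over [0, upper]."""
    h = upper / steps
    total = 0.5 * ((1.0 - cdf(0.0)) + (1.0 - cdf(upper)))
    for n in range(1, steps):
        total += 1.0 - cdf(n * h)
    return total * h

def mtbr(cdf, tp):
    """Mean time between replacements (failure or preventive)."""
    return integral_of_survival(cdf, tp)

def mtbfr(cdf, tp):
    """Mean time between failure replacements only."""
    return integral_of_survival(cdf, tp) / cdf(tp)

# Hypothetical component: Weibull shape 2, scale 1000 h, replaced at age 600 h.
shape, scale, tp = 2.0, 1000.0, 600.0
cdf = lambda x: weibull_cdf(x, shape, scale)
print("MTBR  =", mtbr(cdf, tp))
print("MTBFR =", mtbfr(cdf, tp))
```

For an exponential distribution the second function returns the mean of the distribution whatever the value of the preventive replacement time, which gives a quick sanity check of the implementation.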


Obviously, MTBFR > MTBR, and if $t_p$ is large so that $F(t_p) \approx 1$, then MTBFR $\approx$ MTBR $\approx$ MTBF. In the case of repairable components, on the other hand, if a preventive replacement policy is being enforced it always has to be accounted for in the demand estimations, as it is assumed that both failed and preventively removed items undergo a reconditioning or repair process before being sent back to stock. In other words, in the steady state (and assuming that repairable components or parts can always be repaired), we will have a pool of components, from which preventive and corrective replacements will be satisfied.

3. RISK MODELS FOR NON-REPAIRABLE COMPONENTS

When the task of repair is extremely difficult (or physically impossible) or when the cost of repair exceeds the cost of purchasing a working component, the components are considered non-repairable. Thus, whenever a failure occurs or a component is preventively removed from operation, a spare one from stock is installed and the removed component is discarded (see Figure 1).



Figure 1. Representation of non-repairable spares

In this case, we are interested in determining the stock level required to ensure that no stock-outs occur (at a defined level of reliability) over a selected interval of time, typically the time required to receive a component on site after an order has been placed (this is commonly called lead-time). That is, in the case of non-repairable components we are interested in an interval reliability of the stock. Of course, the length of the interval to be evaluated might be different from the lead-time, depending on particular provisioning conditions of the company or on the duration of the mission of the operating system (in the case of equipment operated in missions). For example, consider an operation in a remote location, where due to contract specifications spares for a component are delivered to the site every six months. In this case, the interval of interest will be six months, as within that time frame no possibility exists to procure these components. An example of mission-operated equipment is that of a navy ship leaving port for an assignment of three months. The interval of interest in this case is, then, three months. Additionally, sometimes companies are forced to place one final order for a particular type of spare part, due to a decision by the manufacturer to cease its production. Then, the interval of interest will be defined by the expected operational lifetime of the equipment supported by this spare part, as after the final order is placed parts will not be available to be procured. Thus in this case, the interval of interest is given, and the required number of spares is calculated.

The opposite problem can also be considered: the number of spares is given, and the supportability interval is what we are interested in calculating. Consider, for example, a company that has a certain number of spares in stock and the manufacturer has decided to discontinue production of these parts.
The company is interested in calculating how much longer it can support the system that requires these parts, using only the current stock, and ensuring a desired level of stock reliability. The following two approaches are considered for the calculation of the probability of running out of spares in the case of non-repairable components, over a given interval T. Corresponding models for finding the supportability interval (when stock level and desired reliability are specified) are also presented for each case.

3.1 Normal Distribution Approach (Non-Repairable Components)

If T is large in comparison with the mean time to failure/demand, $\mu$, it will imply a large number of failures, k, and then $S(k,1)$ will be asymptotically normally distributed. According to Cox (1962), for the superposition of m independent components, $S(k,m)$ is also asymptotically normally distributed, with mean $k\mu/m$ and variance $k\sigma^2/m^2$. This approximation is independent of the underlying time to failure distribution, and is valid when T and k are large. In this case, approximately,

$$P(S(k,m) > T) = 1 - \Phi\left(\frac{T - k\mu/m}{\sigma\sqrt{k}/m}\right), \tag{4}$$

where $\Phi(\cdot)$ is the cumulative standard normal distribution. If a certain desired reliability p for the stock is specified, the minimum stock that ensures that reliability, k*, can be found from the equation $P(S(k^*,m) > T) = p$, so that

$$k^* = \left(\frac{z_p\sigma}{2\mu} + \sqrt{\left(\frac{z_p\sigma}{2\mu}\right)^2 + \frac{Tm}{\mu}}\right)^2 = \frac{Tm}{\mu} + \frac{z_p\sigma}{\mu}\left(\frac{z_p\sigma}{2\mu} + \sqrt{\left(\frac{z_p\sigma}{2\mu}\right)^2 + \frac{Tm}{\mu}}\right), \tag{5}$$

where $z_p$, the value satisfying $\Phi(z_p) = p$, is obtained from a standard cumulative normal distribution table. Another approximation for k*, asymptotically equivalent to equation (5) and valid only for the case where $m = 1$, is given by Sheikh et al. (2000). Ghodrati and Kumar (2005) provide an example of application of that approximation (note that equation (5) is also valid for $m > 1$). $Tm/\mu$ is the average stock size required (with approximately 50% reliability) and the remainder is the stock size required to achieve higher reliability levels (notice how it increases with $\sigma$). So, if the initial inventory is smaller than k*, a shortage during [0, T] is expected to occur with probability at least $1 - p$.

It is worth mentioning that the normal distribution approach discussed here incorporates a measure of variability of demand for spares, through the standard deviation of demand for an individual component, which might be unknown. For relatively expensive (and usually highly reliable) equipment, conditions required for the application of this approach are less likely to be met in a regular operational environment (large k is unlikely for highly reliable components), and commonly a Poisson process approach is adopted. Nevertheless, if the normal distribution approximation can be applied and we are interested in the calculation of the maximum supportability interval, T* (for a longer interval, reliability will be less than what is required), given a certain stock level, S, and a desired reliability, p, then we have that

$$T^* = \frac{S\mu - z_p\sigma\sqrt{S}}{m}. \tag{6}$$
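The normal-approximation results for k* and T* can be evaluated directly. The sketch below assumes Python's standard-library `statistics.NormalDist` for the quantile $z_p$; the function names and the numerical inputs (10 components, mean time to replacement 500 hours, standard deviation 100 hours) are hypothetical values chosen only for illustration.

```python
import math
from statistics import NormalDist

def stock_size_normal(T, m, mu, sigma, p):
    """Real-valued k* under the normal approximation; round up to get a stock level."""
    z = NormalDist().inv_cdf(p)          # z_p such that Phi(z_p) = p
    half = z * sigma / (2.0 * mu)
    root_k = half + math.sqrt(half ** 2 + T * m / mu)
    return root_k ** 2

def support_interval_normal(S, m, mu, sigma, p):
    """Maximum supportability interval T* for a given stock level S."""
    z = NormalDist().inv_cdf(p)
    return (S * mu - z * sigma * math.sqrt(S)) / m

# Hypothetical fleet: m = 10, mu = 500 h, sigma = 100 h, interval T = 1000 h.
k_real = stock_size_normal(1000.0, 10, 500.0, 100.0, 0.95)
print("k* =", math.ceil(k_real))
print("T* for S = 25:", support_interval_normal(25, 10, 500.0, 100.0, 0.95))
```

The two functions are inverses of each other in the sense that feeding the computed T* back into the stock-size formula returns the original stock level, which is a convenient consistency check.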


3.2 Poisson Process Approach (Non-Repairable Components)

We have mentioned in the introduction that the pooled output of several independent processes tends to be (or is at least well approximated by) a Poisson process. The theoretical foundation of this statement is given by Palm's theorem (see, e.g. Khintchine, 1969; or Carrillo, 1991), which states that under mild assumptions, the distribution of the total number of events (demands) coming from different independent renewal processes, in a fixed time interval, tends to a Poisson distribution when the number of processes grows. In that case inter-demand times for the superposed process are exponentially distributed. The maintenance system can therefore be described using a Markov process such as the one presented in Figure 2, where the state of the system is defined by i, the number of units that have failed (or have been replaced) up to a particular moment in time, t. In the case of non-repairable components, state i can only be reached from the previous state (i-1) or by remaining in the same state, as only two alternatives are possible: either a new failure (demand) occurs or it does not (as we assume no simultaneous failures in a small time interval). For a fleet of m components, each with demand rate $\lambda = 1/\mu$, transition rates r(i, j) are therefore given by (as long as m components are operating):

$$r(i,j) = \begin{cases} m\lambda, & j = i+1, \\ 0, & j \neq i+1. \end{cases} \tag{7}$$


The probability of being in a state i at time T is equivalent to the probability of having i demands in the interval [0, T], which by Palm's theorem can be approximated by

$$p_i(T) = P(N(T,m) = i) = \frac{a^i}{i!} e^{-a}, \tag{8}$$

where a is the expected number of demands in [0,T], $a = m\lambda T$. That is, the number of failed (replaced) units up to time T follows the Poisson distribution with mean $m\lambda T = mT/\mu$. Now it is possible to calculate k* for which:

$$P(N(T,m) \le k^*) = P(S(k^*+1,m) > T) = \sum_{i=0}^{k^*} \frac{a^i}{i!} e^{-a} \ge p. \tag{9}$$


The obtained value of k* will be the minimum stock level that ensures at least a reliability p of the stock over the interval (incorporating the case when no demand arrives while the stock is empty). The stock level k* - 1 provides a reliability which is less than p, but it may be closer to p than that for k*, so it is up to the user to select the most convenient value. This approximation does not assume that a is a very large number (this is reasonable for relatively expensive, highly reliable components).
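The search for the minimum stock level under the Poisson approach amounts to accumulating Poisson probabilities until the target reliability is reached. The following is a minimal sketch; the function names and the example fleet (10 components, individual demand rate 0.001 per hour, interval of 200 hours) are assumptions made for illustration only.

```python
import math

def poisson_cdf(k, a):
    """P(N <= k) for a Poisson variable N with mean a."""
    term = math.exp(-a)      # probability of zero demands
    total = term
    for i in range(1, k + 1):
        term *= a / i        # a^i / i! * exp(-a), built up recursively
        total += term
    return total

def min_stock_poisson(T, m, lam, p):
    """Smallest k with P(N(T, m) <= k) >= p, where the mean demand is a = m * lam * T."""
    a = m * lam * T
    k = 0
    while poisson_cdf(k, a) < p:
        k += 1
    return k

# Hypothetical case: expected demand a = 10 * 0.001 * 200 = 2 over the interval.
k_star = min_stock_poisson(200.0, 10, 0.001, 0.95)
print("k* =", k_star, "reliability =", poisson_cdf(k_star, 2.0))
print("reliability with k* - 1 =", poisson_cdf(k_star - 1, 2.0))
```

Printing the reliability for both k* and k* - 1 makes the trade-off mentioned above visible, since the smaller stock may fall only marginally short of the target.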

If we are interested in the calculation of T*, given a certain stock level, S, and a desired reliability, p, then we first need to find the mean A of a Poisson distribution such that

$$\sum_{i=0}^{S} \frac{A^i}{i!} e^{-A} = p. \tag{10}$$

Using a bisection algorithm with an initial value $x = S + z_p^2/2 - z_p\sqrt{z_p^2/4 + S}$, based on the normal approximation of the Poisson distribution, A can be found without excessive calculations. T* can then be obtained from

$$T^* = \frac{A}{m\lambda}. \tag{11}$$
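The bisection step can be sketched as follows. The source only specifies bisection with a starting value taken from the normal approximation; the way the starting value is used here (as the initial upper end of the bracket, doubled until it is valid), the tolerance and the function names are assumptions of this illustration.

```python
import math
from statistics import NormalDist

def poisson_cdf(k, a):
    """P(N <= k) for a Poisson variable N with mean a."""
    term = math.exp(-a)
    total = term
    for i in range(1, k + 1):
        term *= a / i
        total += term
    return total

def poisson_mean_for_reliability(S, p, tol=1e-10):
    """Find A such that P(Poisson(A) <= S) = p; the CDF decreases as A grows."""
    z = NormalDist().inv_cdf(p)
    guess = S + z * z / 2.0 - z * math.sqrt(z * z / 4.0 + S)  # normal approximation
    lo, hi = 0.0, max(guess, 1.0)
    while poisson_cdf(S, hi) > p:      # widen the bracket until it contains A
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if poisson_cdf(S, mid) > p:
            lo = mid                   # reliability still above target: A is larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

def support_interval_poisson(S, m, lam, p):
    """Maximum supportability interval T* = A / (m * lam)."""
    return poisson_mean_for_reliability(S, p) / (m * lam)

# Hypothetical case: 3 spares, 10 components, demand rate 0.001 per hour each.
print("T* =", support_interval_poisson(3, 10, 0.001, 0.95), "hours")
```

With no spares at all (S = 0) the equation reduces to $e^{-A} = p$, so the routine can be checked against the closed form $A = -\ln p$.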




Figure 2. Representation of the Markov process for non-repairable components


4. RISK MODELS FOR REPAIRABLE COMPONENTS

In many applications, components can be returned to an operational state by means other than just complete renewal, that is, they are repairable. For repairable components, whenever a component fails or is preventively removed from operation it is replaced by a spare, and the removed component is sent to a repair shop for repair or reconditioning. Once repair has been completed, the component is returned to the stock, where it waits until it is needed for operation again (see Figure 3). It is assumed that repair is perfect (i.e. the component is returned to an as-new state every time it is repaired) and that components can always be repaired (there is no condemnation). In practice, the latter assumption may not impose a serious restriction, as the expected number of repairs for the same component over its lifetime is small. We first describe models that make a second assumption, namely unlimited repair capacity (i.e. there is no limit on the number of repairs that can be performed simultaneously at the shop). Extension to the case of limited repair capacity (known as the machine-repair problem, see e.g. Barlow and Proschan, 1965) is presented afterwards.

[Figure 3 diagram; labels: Failures, time, repaired units, Repair shop]

Figure 3. Representation of repairable spares

When dealing with repairable components, we are usually interested in determining the stock level required to ensure a certain fill rate, that is, the probability of having a spare at hand when needed. This is the instantaneous reliability of the stock. In other applications, though, it might be of interest to determine the stock level required to ensure that, with a certain probability, no shortage will occur over a complete interval of time, particularly if consequences of a stock-out are catastrophic. Thus, in these cases we are interested in the interval reliability of the stock.

It is assumed that a Poisson process is a good approximation for spares demand and that the whole system can be modelled using a Markov process such as the one presented in Figure 4. In this model, the state of the system, $M(t)$, is defined by the number of units, i, undergoing repair at a particular moment in time, t. This state uniquely defines all other variables of interest in the system: if $i \le S$, then the number of units in operation is still m and the current stock size is $S - i$; if $i > S$, then the number of units in operation is $m - i + S$ and the stock is depleted. Transition into state i is now possible either from state (i-1) or state (i+1) in a short interval of time. For a fleet of m components, each with demand rate $\lambda = 1/\mu$ and repair rate $\mu_r$, transition rates r(i, j) are given by

$$\begin{aligned}
r(i,i+1) &= m\lambda, & i &= 0,1,\ldots,S,\\
r(i,i+1) &= (m+S-i)\lambda, & i &= S+1,\ldots,m+S-1,\\
r(i+1,i) &= (i+1)\mu_r, & i &= 0,1,\ldots,m+S-1,\\
r(i,j) &= 0, & j &\neq i+1,\ i,\ i-1,
\end{aligned} \tag{12}$$


where S is the initial size of the stock and $\mu_r$ denotes the repair rate of an individual component. It is assumed that the time to repair an individual component is exponentially distributed. The assumption of exponential repair times can be adopted since it has been suggested that the repair time distribution primarily affects performance of inventory systems through its mean, thus results are not sensitive to the choice of distribution (Sherbrooke, 2004). In addition, the choice of exponentially distributed repair times allows for much easier calculation of the performance measures of the inventory system. Verrijdt et al. (1998) present some simulation results that confirm this assumption. Inman (1999), based on empirical evidence, concludes that the widely used assumption of exponential distribution for the time between failures and the time to repair is acceptable in many cases. Klein (1984), based on the analysis of 70 data sets for electronic and mechanical systems and equipment, states that in over 70% of the cases, repair times are better represented by a log-normal distribution. However, this reference also concludes that the use of exponential times to repair does not generate significant errors in the mean time to repair, when the distribution is in fact log-normal. In view of this, the choice of exponentially distributed repair times is less restrictive than it may appear initially.
[Figure 4 diagram; transition labels: Min{m, m+S-(i-1)} and Min{m, m+S-i}]

Figure 4. Representation of the Markov process for repairable components

4.1 Instantaneous Reliability Model for Repairable Components - Unlimited Repair Capacity

We are interested in the probability of not running out of spares at any given moment t. This is just the sum of the probabilities $p_i(t)$ of being in a state i at instant t, where $i \le S$. If the number of components undergoing repair at time t is $M(t)$, then the probability of not running out of spares is

$$P(M(t) \le S) = \sum_{i=0}^{S} p_i(t). \tag{13}$$


A stock-out occurs if there are more failed components at moment t than the initial stock size. Now, if we perform this analysis in the steady state, that is, when $t \to +\infty$, the probabilities $p_i(t)$ converge to positive limiting values $p_i$ and the distribution of the number of components in repair does not depend on time (as the embedded Markov chain is ergodic, i.e. all states communicate, see e.g. Barlow and Proschan, 1965; Ross, 2003 or Cocozza-Thivent, 1997). If $a = m\lambda/\mu_r$ (with $\lambda$ the demand rate and $\mu_r$ the repair rate of a single component) is the average number of demands during one repair, then it follows from the general solution (see Ross, 2003, p. 223) that the instantaneous reliability of the stock is

$$P(M(t) \le S) = \frac{1}{C}\sum_{i=0}^{S} \frac{a^i}{i!}, \tag{14}$$

where C is a normalizing constant,

$$C = \sum_{i=0}^{S} \frac{a^i}{i!} + \sum_{i=S+1}^{m+S} \frac{a^i}{i!}\left(1 - \frac{1}{m}\right)\left(1 - \frac{2}{m}\right)\cdots\left(1 - \frac{i-(S+1)}{m}\right). \tag{15}$$

Notice that C depends on S, in general (for $i = S+1$ the product in the second sum is empty and equals one). For large m, C tends to $e^{a}$, so that a simpler approximation can be used when the number of components in the fleet is relatively large. In this case, instantaneous reliability is

$$P(M(t) \le S) = \sum_{i=0}^{S} \frac{a^i}{i!} e^{-a}. \tag{16}$$

The last expression implies that the number of components under repair can be approximated by a Poisson distribution with mean equal to a. Using either approximation (equations (14) or (16)), the minimum stock size S* can be calculated so that a certain reliability p is ensured, or $P(M(t) \le S^*) \ge p$.
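Both the finite-fleet expression and its Poisson limit can be computed with a few lines of code. The sketch below builds the unnormalised steady-state weights term by term; the function names are hypothetical, and `a` stands for the average number of demands during one repair as defined above.

```python
import math

def state_weights(m, S, a):
    """Unnormalised steady-state weights for i = 0..m+S units in repair."""
    weights = []
    term = 1.0     # a^i / i!
    shrink = 1.0   # (1 - 1/m)(1 - 2/m)...(1 - (i-(S+1))/m); empty product up to i = S+1
    for i in range(m + S + 1):
        if i > 0:
            term *= a / i
        if i > S + 1:
            shrink *= 1.0 - (i - (S + 1)) / m
        weights.append(term * shrink)
    return weights

def fill_rate_exact(m, S, a):
    """Instantaneous reliability P(M <= S) for a finite fleet of m components."""
    w = state_weights(m, S, a)
    return sum(w[:S + 1]) / sum(w)

def fill_rate_poisson(S, a):
    """Large-fleet approximation: Poisson CDF with mean a evaluated at S."""
    term = total = math.exp(-a)
    for i in range(1, S + 1):
        term *= a / i
        total += term
    return total

def min_stock(m, a, p, exact=True):
    """Smallest S whose instantaneous reliability is at least p."""
    S = 0
    while (fill_rate_exact(m, S, a) if exact else fill_rate_poisson(S, a)) < p:
        S += 1
    return S

# Hypothetical fleet of 20 components with a = 2 demands per repair time.
print("S* (finite fleet) =", min_stock(20, 2.0, 0.95))
print("S* (Poisson)      =", min_stock(20, 2.0, 0.95, exact=False))
```

For a single component and no spares the exact expression collapses to 1/(1 + a), the familiar two-state availability, which is a simple way to validate the weights.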

4.2 Interval Reliability Model for Repairable Components - Unlimited Repair Capacity

In the interval reliability case, we are interested in the probability of not running out of spares over a complete interval of time [0,T]. Let $p_{ij}^{(D)}(T)$ denote the probability of going from state i into state j over the interval of length T, assuming in addition that all the states the system goes into throughout the interval must be from the subset D of states, that is $M(t) \in D$ for all t, $0 \le t \le T$. This probability can be considered as a case of taboo probability, where the taboo set is the complement of D (for theory of taboo probability see Chung (1967) or Latouche and Ramaswami (1999)). Our case of interest is the subset of states $D = \{0,1,2,\ldots,S\}$, where S represents the initial stock size. That is, we are interested in the probability of never going into a state where the number of units in repair exceeds the stock size, over the complete interval (as such a state implies a stock-out). Note that in this case the starting state i, or the number of units undergoing repair at the beginning, is given.

Probabilities $p_{ij}^{(D)}(T)$ are the elements of the probability matrix $P^{(D)}(T) = [p_{ij}^{(D)}(T)]$ that we have to obtain in order to calculate the interval reliability for a given stock size. Solving a system of differential equations obtained from the Chapman-Kolmogorov equations (see e.g. Barlow and Proschan, 1965, for details) of the Markov process with transition rates given by equation (12), we can calculate the probability matrix $P^{(D)}$ as

$$P^{(D)}(T) = e^{TQ^{(D)}} = \sum_{k=0}^{\infty} \frac{\left(TQ^{(D)}\right)^k}{k!}, \tag{17}$$

where $Q^{(D)}$ is the transition rate matrix for the subset of states D. Equation (17) involves exponentiation of a matrix, which will not be discussed here (see Bhat and Miller, 2002, for further details). In our case we can obtain $Q^{(D)}$ from the original transition rate matrix Q of the process by truncation at S, that is, by considering only the first (S+1) rows and columns of Q. Then, using equation (17), $P^{(D)}$ can be obtained. Summing each row of $P^{(D)}$ gives the interval reliability for the corresponding initial number of components in repair, i. Then for any fixed moment in time t (and also if $t \to \infty$),

$$\text{Interval reliability in } [t, t+T] \text{, given initial state } i \text{ at } t \;=\; P(M(u) \le S,\ t \le u \le t+T \mid M(t) = i) = \sum_{j \le S} p_{ij}^{(D)}(T). \tag{18}$$
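The row sums of the truncated matrix exponential can be computed without a linear-algebra library. The sketch below uses uniformization, a standard technique for Markov chain transients that is swapped in here for illustration and is not described in the paper; `mu_r` denotes the repair rate of one component, the function names are hypothetical, and the plain Poisson weights assume a moderate value of the uniformization rate times T.

```python
import math

def truncated_generator(m, S, lam, mu_r):
    """First S+1 rows and columns of Q (unlimited repair capacity)."""
    n = S + 1
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i + 1 < n:
            Q[i][i + 1] = m * lam            # demand arrives, one more unit in repair
        if i > 0:
            Q[i][i - 1] = i * mu_r           # one of i repairs is completed
        Q[i][i] = -(m * lam + i * mu_r)      # includes the exit from state S to S+1
    return Q

def interval_reliability(m, S, lam, mu_r, T, eps=1e-12):
    """Row sums of exp(T * Q_D): P(no stock-out in [0, T] | i units in repair at 0)."""
    Q = truncated_generator(m, S, lam, mu_r)
    n = S + 1
    q = max(-Q[i][i] for i in range(n))      # uniformization rate
    Pi = [[(1.0 if i == j else 0.0) + Q[i][j] / q for j in range(n)]
          for i in range(n)]
    v = [1.0] * n                            # Pi^k applied to a vector of ones
    weight = math.exp(-q * T)                # Poisson(q*T) probability of k jumps
    covered = weight
    result = [weight * x for x in v]
    k = 0
    while 1.0 - covered > eps and k < 100000:
        k += 1
        v = [sum(Pi[i][j] * v[j] for j in range(n)) for i in range(n)]
        weight *= q * T / k
        covered += weight
        for i in range(n):
            result[i] += weight * v[i]
    return result

# Hypothetical case: 5 components, 2 spares, lam = 0.01/h, mu_r = 0.1/h, T = 10 h.
print(interval_reliability(5, 2, 0.01, 0.1, 10.0))
```

With no spares the only admissible state is i = 0 and the result reduces to $e^{-m\lambda T}$, the probability of no demand in the interval.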


In the unconditional case, with an unspecified number of components initially in repair, the limiting probabilities of being in state i at any given moment, $p_i$ (as calculated in the previous section), are used, such that the unconditional interval reliability is defined by

$$\text{Unconditional interval reliability in } [0,T] = \lim_{t \to \infty} \sum_{i \le S} P(M(u) \le S,\ t \le u \le t+T \mid M(t) = i)\, P(M(t) = i) = \sum_{i=0}^{S} p_i \sum_{j \le S} p_{ij}^{(D)}(T), \tag{19}$$

where $p_i$ is calculated as $p_i = \frac{a^i}{i!}\frac{1}{C}$, and C is calculated from (15). As before, the minimum stock size S* can be calculated so that a certain reliability p is ensured. This procedure is computationally intensive, and requires a computer program.

Sometimes it is of interest to find the interval for which a certain reliability level can be ensured, given a specified number of spare components in stock. Consider the following situation: a ship is at sea and due to safety reasons it is required that no stock-outs occur for a component until the ship returns to port. There exists the capability of performing repairs to this component on board the ship, and a small stock is also maintained on board while out at sea. We might be interested in finding the maximum possible length of time the ship can remain at sea, so that the desired reliability of the stock is accomplished without increasing the size of the on-board inventory (which might be impossible if the stock storage space is limited). Formally, we want to find the maximum interval, T*, so that the interval reliability is greater than or equal to a certain requirement p, given that there are S spares in total. By the use of equations (18) or (19), depending on the initial status (i.e. whether the number of units undergoing repair at the beginning of the interval is known), and using a bisection algorithm, T* can be found, such that


( p = pijD ) T * , if the initial state i is known; jS

( )

( p = pi pijD ) T * , if the initial state is unknown. i =0 jS S

( )
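The bisection search for T* mentioned above can be sketched generically. The fragment below is an illustration under stated assumptions, not the authors' code: `max_interval` is a hypothetical helper that accepts any non-increasing function of T, and the stand-in reliability used in the example is the closed-form case with no spares, where the interval reliability is the probability of no failure, exp(-m·lam·T). The values m = 78 and a mean time to replacement of 6,420.3 operating hours are borrowed from the haul truck case study.

```python
import math

def max_interval(reliability, p, t_hi, tol=1e-6):
    """Largest T in [0, t_hi] with reliability(T) >= p.

    Assumes reliability(0) >= p and that reliability is non-increasing in T.
    """
    if reliability(t_hi) >= p:
        return t_hi                     # requirement met over the whole range
    lo, hi = 0.0, t_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if reliability(mid) >= p:
            lo = mid                    # still reliable enough: move right
        else:
            hi = mid                    # too long an interval: move left
    return lo

# Stand-in example: S = 0, so reliability is the chance of no failure at all.
m, lam = 78, 1.0 / 6420.3

def no_spares_reliability(T):
    return math.exp(-m * lam * T)

t_star = max_interval(no_spares_reliability, 0.95, t_hi=1000.0)
```

In a full application the stand-in function would be replaced by the interval reliability of equation (18) or (19) evaluated for the stock size of interest.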


Note that interval reliability in the unconditional case is bounded from above by the instantaneous reliability $\sum_{i=0}^{S} p_i$ (as when T tends to zero, $\sum_{j \le S} p_{ij}^{(D)}(T)$ tends to one, so that p should be less than $\sum_{i=0}^{S} p_i$).

4.3 Instantaneous Reliability Model for Repairable Components: Limited Repair Capacity

If we now assume that the number of available repair channels is not infinite, but finite (and known), then the inventory system can still be represented by a Markov process such as the one depicted in Figure 4 (state i can only be accessed in a short interval of time from state (i-1) when a failure occurs, or from state (i+1) whenever a repair is completed) but the transition rates will not have the same structure. For a fleet of m components, each with demand rate $\lambda = 1/\mu$, if the repair rate is $\mu_R$, and there exists a limited number of repair channels, c, the transition rates r(i, j) are given by

$$
\begin{aligned}
r(i, i+1) &= m\lambda, && i = 0, 1, \ldots, S,\\
r(i, i+1) &= (m + S - i)\lambda, && i = S+1, \ldots, m+S-1,\\
r(i+1, i) &= (i+1)\mu_R, && i = 0, 1, \ldots, c-1,\\
r(i+1, i) &= c\,\mu_R, && i = c, c+1, \ldots, m+S-1,\\
r(i, j) &= 0, && j \ne i+1,\ i,\ i-1,
\end{aligned}
\qquad (21)
$$

where S is the initial size of the stock. For calculations of the instantaneous reliability of the stock, we are interested in the steady-state probabilities $p_i$ for this Markov process. The following two cases are identified:

Case I: $c \le m$ (number of repair channels is less than or equal to the number of units in the fleet). This is the most likely case for industrial operations. Within this case, we have to consider the sub-cases: a. $0 \le S < c$, b. $S \ge c$.

Case II: $c > m$ (number of repair channels is greater than the number of units in the fleet). This is unlikely in an industrial setting. Remember, though, that as long as c < m+S, repair capacity is not infinite in practice. Different sub-cases are: c. $0 \le S < c-m$, d. $c-m \le S < c$, e. $S \ge c$.

Considering that the average number of demands during one repair is $a = m\lambda/\mu_R$, from the general solution we can find the expressions for the limiting probabilities $p_i$. The limiting probabilities for the more realistic Case I ($c \le m$) are presented as follows:
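$$p_i = b_i\, p_0, \qquad i = 0, 1, \ldots, m+S,$$

where $b_0 = 1$ and $p_0 = 1 \Big/ \sum_{i=0}^{m+S} b_i$. The $b_i$'s, $1 \le i \le m+S$, are given by the following expressions:

For $0 \le S < c$,

$$
b_i =
\begin{cases}
\dfrac{a^i}{i!}, & 1 \le i \le S,\\[2ex]
\dfrac{a^i\, m!}{i!\; m^{\,i-S}\,(m-i+S)!}, & S < i \le c,\\[2ex]
\dfrac{a^i\, m!}{c!\; c^{\,i-c}\; m^{\,i-S}\,(m-i+S)!}, & c < i \le m+S;
\end{cases}
$$

for $S \ge c$,

$$
b_i =
\begin{cases}
\dfrac{a^i}{i!}, & 1 \le i \le c,\\[2ex]
\dfrac{a^i}{c!\; c^{\,i-c}}, & c < i \le S,\\[2ex]
\dfrac{a^i\, m!}{c!\; c^{\,i-c}\; m^{\,i-S}\,(m-i+S)!}, & S < i \le m+S.
\end{cases}
$$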

As in the unlimited repair capacity case, the probability of not running out of spares at time t (using a steady state analysis) is $P(M(t) \le S) = \sum_{i=0}^{S} p_i$. Necessary conditions to apply limiting probabilities are detailed in Cocozza-Thivent (1997) or Ross (2003).

The minimum stock size S* can therefore be calculated so that a certain instantaneous reliability p is ensured, or $P(M(t) \le S^*) \ge p$.
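The steady-state calculation of this section can be sketched numerically. The fragment below is an illustration, not the authors' software: instead of the closed-form $b_i$ expressions it applies the equivalent birth-death recursion $b_{i+1} = b_i\, r(i, i+1)/r(i+1, i)$ to the rates of equation (21). The function names and the arguments `lam` (demand rate per component) and `mu_r` (repair rate) are assumptions introduced here.

```python
def steady_state(m, S, c, lam, mu_r):
    """Limiting probabilities p_0..p_{m+S} for the rates of equation (21)."""
    b = [1.0]                                   # b_0 = 1
    for i in range(m + S):
        # failures: all m units run while spares remain (i <= S)
        birth = m * lam if i <= S else (m + S - i) * lam
        # repairs: at most c channels work in parallel
        death = min(i + 1, c) * mu_r
        b.append(b[-1] * birth / death)
    total = sum(b)
    return [x / total for x in b]

def instantaneous_reliability(m, S, c, lam, mu_r):
    """P(M(t) <= S): probability that at most S components are in repair."""
    return sum(steady_state(m, S, c, lam, mu_r)[: S + 1])

def min_stock(m, c, lam, mu_r, target, s_max=200):
    """Smallest S with instantaneous reliability >= target (None if not found)."""
    for S in range(s_max + 1):
        if instantaneous_reliability(m, S, c, lam, mu_r) >= target:
            return S
    return None

# Haul truck data from section 6.2; a very large c mimics unlimited capacity.
lam, mu_r = 1.0 / 6420.3, 1.0 / 452.0
s_star = min_stock(78, 10**6, lam, mu_r, target=0.95)
```

Run with the haul truck data and an effectively unlimited number of channels, this sketch gives a minimum stock of ten spares for a 95% goal, in line with the instantaneous reliability row of Table IV. Reducing `c` lowers the reliability obtained for the same stock.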

4.4 Interval Reliability Model for Repairable Components: Limited Repair Capacity

The procedure for obtaining the interval reliability of the stock for repairable components when the number of repair channels is limited is the same as the one described in section 4.2, except that in this case the transition rate matrix $Q^{(D)}$ is constructed using equation (21). Then equations (17) and (18) can be applied to determine the minimum stock level that ensures the required reliability when the number of spares undergoing repair at the beginning of the interval is known. For the unconditional case, the limiting probabilities $p_i$ from section 4.3 should be used as an input for equation (19), in order to calculate the minimum stock that ensures a certain interval reliability requirement. If we are interested in finding T*, we can use the conditions specified in equation (20), for the unconditional or conditional cases, now applied to the situation of limited repair capacity.
4.5 Availability Model for Repairable Components

Assuming steady-state operation of a group of m components, we can calculate the probability of being in state i at any moment, through the limiting probabilities $p_i$. If $i \le S$, all m components are in operation. If $i > S$, then i-S components out of the original m are not operating (i.e. there is downtime of i-S components, due to shortage of spares). Then, the expected number of non-operating components due to the system running out of spares, at any given moment in time, given that the initial stock is S, is

$$\text{Expected number of unavailable components} = U_S = \sum_{i=S+1}^{m+S} (i - S)\, p_i. \qquad (23)$$


Dividing this value by m, we obtain the proportion of non-operating components for the entire fleet. In addition, as equation (23) applies for any given moment (i.e. is instantaneous), $U_S/m$ is also the proportion of downtime per unit time attributable to spares shortage. From this, it is possible to find S* so that the proportion of downtime per unit time is limited to a certain value 1-A, where A is the desired availability. That is,

$$S^* = \min\left\{ S : \frac{U_S}{m} \le 1 - A \right\}.$$
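A short numerical sketch of the availability criterion follows. It is an illustration under stated assumptions rather than the authors' implementation: the limiting probabilities are obtained from a birth-death recursion with failure rate m·`lam` while spares remain and (m+S-i)·`lam` afterwards, and repair rate min(i, c)·`mu_r`, with unlimited capacity when `c` is omitted. All function and argument names are hypothetical.

```python
def steady_state(m, S, lam, mu_r, c=None):
    """Limiting probabilities p_0..p_{m+S}; c=None means unlimited repair capacity."""
    channels = m + S if c is None else c
    b = [1.0]
    for i in range(m + S):
        birth = m * lam if i <= S else (m + S - i) * lam
        death = min(i + 1, channels) * mu_r
        b.append(b[-1] * birth / death)
    total = sum(b)
    return [x / total for x in b]

def expected_unavailable(m, S, lam, mu_r, c=None):
    """U_S: expected number of units down because no spare is available."""
    p = steady_state(m, S, lam, mu_r, c)
    return sum((i - S) * p[i] for i in range(S + 1, m + S + 1))

def min_stock_for_availability(m, lam, mu_r, A, c=None, s_max=200):
    """Smallest S with U_S / m <= 1 - A (None if not found up to s_max)."""
    for S in range(s_max + 1):
        if expected_unavailable(m, S, lam, mu_r, c) / m <= 1.0 - A:
            return S
    return None

# Haul truck data from section 6.2 (78 components, unlimited repair capacity).
lam, mu_r = 1.0 / 6420.3, 1.0 / 452.0
s_star = min_stock_for_availability(78, lam, mu_r, A=0.99)
availability = 1.0 - expected_unavailable(78, s_star, lam, mu_r) / 78
```

With the haul truck data this sketch returns six spares and an availability of about 99.14%, which agrees with the availability row reported in Table IV.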


With the models presented so far, we can optimize the inventory level for non-repairable or repairable parts using stock reliability (instantaneous or interval) or equipment availability as optimization criteria. The combination of risk models with cost considerations will allow for the determination of stock levels leading to a minimization of inventory related costs, which will be discussed in the following section.


Costs that need to be considered in the analysis of a spare parts inventory should include all costs that vary as the level of inventory changes, or costs that are incurred according to the inventory policy (and that will be affected by the choice of policy). In general, these costs are (i) acquisition costs, (ii) inventory holding costs, and (iii) stock-out or shortage costs. Acquisition costs consider the purchase cost of the items themselves (calculated simply as the unit price times the number of spares bought), and the ordering costs associated with the processing of a purchase, from creation to receipt. Inventory holding costs are related to the costs of managing the inventory and are regularly expressed per item, per unit time. They include capital costs of the investment tied up in inventory, operational costs of warehousing, and deterioration (or monetary depreciation) of the items. In some cases, an obsolescence cost also needs to be considered. Inventory holding costs are a function of inventory on hand and it is commonly assumed that their value ranges between 20% and 40% of the value of the components stocked per year (Tersine, 1988). Finally, stock-out or shortage costs are incurred whenever demand cannot be routinely satisfied from inventory, due to lack of spares. In the maintenance environment, shortage costs are often large, if a stock-out of the component results in lost production or valuable downtime of a system or piece of equipment. Shortage costs are also regularly expressed per item, per unit time. The extent to which these costs are to be included in a particular model and the breakdown which is required to define them will vary considerably between different companies and applications; thus their precise meaning and quantification is not trivial. A large body of literature is available, where numerous inventory cost models are introduced (see, e.g. Chikan, 1990). 
In this paper we will briefly describe two cost models: one for non-repairable parts and another for repairable parts, selected due to their simplicity. These models have been applied in the case studies presented in section 6 of the paper.
5.1 A Cost Model for Non-Repairable Components, with no Downtime and Immediate Delivery

In the case of non-repairable spares, one alternative is to assume that whenever the stock is depleted, an urgent requisition order can be placed (at a premium in cost) with immediate delivery. This implies that in this model no downtime is allowed. As we are interested in determining the optimal stock to support the operation of a system for an interval of length T, we assume that a certain quantity of spare components is bought at the beginning of the interval, each with a cost of $C_r$. If stock is depleted, we can buy additional spares (i.e. emergency spares) at a cost of $C_e$ each. Then the total acquisition costs are given by $C_r S + C_e (N_T - S)^+$, where $N_T$ is the number of replacements in the interval of length T and $(\cdot)^+ = \max\{0, \cdot\}$. Note that the model considers the acquisition cost per unit as a single quantity, and does not distinguish between purchase price and ordering costs (for relatively expensive parts, the ordering cost is typically small compared to the purchase price). To account for inventory holding costs, we will initially only consider depreciation of the items, which can be defined equivalently by a rate of value lost per unit time (denoted by r) or by the resale value of an unused spare at the end of the interval, $D_T$. This model assumes that any unused spares left in stock at the end of the interval can be resold at this value. If depreciation is defined as a rate, then the value of an unused spare at the end of the interval can be calculated as $D_T = (1 - r)^T C_r$. The total cost over the complete interval is then

$$C_T = C_r S + C_e (N_T - S)^+ - D_T (S - N_T)^+. \qquad (25)$$

The expected total cost will be

$$E(C_T) = (C_r - D_T)\, S + D_T\, E(N_T) + (C_e - D_T)\, E(N_T - S)^+. \qquad (26)$$

Optimal stock size S* is the stock level which minimizes $E(C_T)$. For the calculation of $E(N_T - S)^+$, we can use the following approximations:

(i) $$E(N_T - S)^+ = \sum_{k=S}^{\infty} P(N_T > k) = \sum_{k=S+1}^{\infty} P(S_k < T) = \sum_{k=S+1}^{\infty} \Phi\!\left( \frac{T - k\mu/m}{\sigma\sqrt{k}/m} \right)$$

if the normal distribution approach is adopted (with $\Phi$ the standard normal distribution function, and the time of the kth replacement, $S_k$, taken as normal with mean $k\mu/m$ and variance $\sigma^2 k/m^2$), or

(ii) $$E(N_T - S)^+ = \sum_{k > S} (k - S)\, P(N_T = k) = a\big(1 - P(N_T \le S-1)\big) - S\big(1 - P(N_T \le S)\big) = a\left(1 - \sum_{k=0}^{S-1} \frac{a^k}{k!} e^{-a}\right) - S\left(1 - \sum_{k=0}^{S} \frac{a^k}{k!} e^{-a}\right),$$

if the Poisson process approach is adopted. $E(N_T)$ can be obtained from (i) or (ii), using S = 0. When additional components of holding cost (other than depreciation, such as capital costs or warehousing costs) are considered, they should be included in the cost model. Then new terms should also be included in equations (25) and (26). The stock held during [0, T] is
$$H_T = (S - N_T)^+\, T + \sum_{i=1}^{\min\{S,\, N_T\}} S_i = (S - N_T)^+\, T + \sum_{i=1}^{S} S_i\, I(S_i \le T),$$

where $S_i = S(i, m)$ is the time of the ith replacement. If $C_h$ is the additional holding cost of one spare per unit time, then $C_h H_T$ has to be included in equation (25), and $C_h E(H_T)$ has to be included in equation (26). Further details on the calculation of $E(H_T)$ are given in the Appendix.
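The expected cost of equation (26) under the Poisson approach (ii) is simple enough to evaluate directly. The following sketch is illustrative only: the function names are hypothetical, depreciation is the only holding cost considered (no $C_h$ term), and the example figures are those of the transformer case study in section 6.1.

```python
import math

def poisson_cdf(k, a):
    """P(N <= k) for a Poisson variable with mean a (0 for k < 0)."""
    if k < 0:
        return 0.0
    term = math.exp(-a)
    total = term
    for j in range(1, k + 1):
        term *= a / j
        total += term
    return total

def expected_excess(S, a):
    """E(N_T - S)^+ for Poisson demand with mean a, approach (ii)."""
    return a * (1.0 - poisson_cdf(S - 1, a)) - S * (1.0 - poisson_cdf(S, a))

def expected_total_cost(S, a, c_r, c_e, d_t):
    """Equation (26), using E(N_T) = a."""
    return (c_r - d_t) * S + d_t * a + (c_e - d_t) * expected_excess(S, a)

def optimal_stock(a, c_r, c_e, d_t, s_max=50):
    """Stock level in 0..s_max with the lowest expected total cost."""
    return min(range(s_max + 1),
               key=lambda S: expected_total_cost(S, a, c_r, c_e, d_t))

# Section 6.1 data: 150 units, 1/300 failures per unit per year, 10 months.
a = 150 * (1.0 / 300.0) * (10.0 / 12.0)
best = optimal_stock(a, c_r=30000.0, c_e=60000.0, d_t=25000.0)
cost_best = expected_total_cost(best, a, 30000.0, 60000.0, 25000.0)
```

For these inputs the minimum is reached with one spare at an expected cost of roughly $18,073, and two spares cost roughly $20,760, matching the figures of Table II.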
5.2 A Cost Model for Repairable Components

In the case of repairable components, where the stock can be interpreted as a pool of spare components, regularly no new units will be bought if a backorder occurs, thus we will present a cost model that considers downtime. Let $C_d$ be the downtime cost for one component per unit time, and $C_h$ the total inventory holding costs of one spare per unit time. Then the total expected cost per unit time, $E(C_S)$, associated with the inventory system can be calculated as

$$E(C_S) = U_S\, C_d + S\, C_h, \qquad (28)$$

where U S is given by equation (23) and S is the initial stock size. Note that this cost model assumes that total inventory holding costs are paid for the entire stock. More complex models can be used, where capital, depreciation and obsolescence costs (if applicable) are paid for the entire stock, whereas operational holding costs are only paid for the stock at hand. Based on the model of equation (28), the optimal stock size S* can be found so that expected costs per unit time, E (C S ) , are minimized.
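The minimization of equation (28) can be sketched by direct enumeration over S. The fragment is an illustration, not the authors' software: it assumes unlimited repair capacity (repair rate i·`mu_r` in state i) and failure rates m·`lam` while spares remain and (m+S-i)·`lam` afterwards. Function and argument names are hypothetical, and the inputs are the haul truck figures of section 6.2.

```python
def expected_unavailable(m, S, lam, mu_r):
    """U_S of equation (23), assuming unlimited repair capacity."""
    b = [1.0]
    for i in range(m + S):
        birth = m * lam if i <= S else (m + S - i) * lam
        death = (i + 1) * mu_r
        b.append(b[-1] * birth / death)
    total = sum(b)
    return sum((i - S) * b[i] / total for i in range(S + 1, m + S + 1))

def expected_cost(m, S, lam, mu_r, c_d, c_h):
    """Equation (28): expected downtime cost plus holding cost per unit time."""
    return expected_unavailable(m, S, lam, mu_r) * c_d + S * c_h

def optimal_stock(m, lam, mu_r, c_d, c_h, s_max=40):
    """Stock level in 0..s_max with the lowest expected cost per unit time."""
    return min(range(s_max + 1),
               key=lambda S: expected_cost(m, S, lam, mu_r, c_d, c_h))

# Section 6.2 data, costs per operating hour.
lam, mu_r = 1.0 / 6420.3, 1.0 / 452.0
best = optimal_stock(78, lam, mu_r, c_d=2173.3, c_h=1.51)
cost_best = expected_cost(78, best, lam, mu_r, 2173.3, 1.51)
```

With these inputs the enumeration selects 14 spares at about $23.00 per operating hour, which agrees with the cost minimization row of Table IV.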
6. CASE STUDIES

6.1 Case Study for Non-Repairable Components: Electricity Transformers

This first case study is based on data coming from an example originally generated by the Independent Electricity Market Operator in the province of Ontario, Canada. It relates to the determination of the optimal number of spare transformers to be stocked by a metering service provider that operates 150 single-phase instrument transformers (IT). These transformers can be replaced by spare transformers held in stock, or by new units bought at a higher price from the manufacturers during emergencies. The 150 transformers are assumed to fail completely at random (i.e. exponential time between failures). It is also assumed that no preventive replacement policy is enforced. In addition, since the transformers are replaced with new ones either obtained from stock or purchased from manufacturers, the spares are considered non-repairable. The interval of interest is a ten-month period, because this is the time between re-stockings used by the company. The regular cost of each transformer is $30,000, and it is assumed that the emergency cost of a transformer is $60,000. The value of an unused transformer at the end of the ten-month period is estimated at $25,000. The failure rate for one transformer is λ = 1/300 [failures/IT/year]. Table I summarizes the data used for this case study.


Table II presents the optimal stock calculated according to two different criteria for optimization: interval reliability and cost (using the cost model of equation (26)). Since the failure rate is low relative to the length of the interval examined, the expected number of failures during the ten-month period is small. Thus, the optimal number of spares to be kept in stock in both cases is also small. For the interval reliability case, assuming a goal of achieving 99% reliability for the stock (i.e. acceptable risk of shortage is less than 1%), two spares are required. On the other hand, because depreciation of the transformers is quite large (19.65% per year), only one spare should be kept in stock in order to minimize total costs. The associated interval reliability in the latter case is 93.39%.

Table I. Summary of Data for Case Study 1: Non-Repairable Components

| Parameter | Value |
|---|---|
| Spare type | Non-repairable transformers |
| Components in operation | 150 |
| Rate of replacements | 1/300 [failures/IT/year] |
| Planning horizon | 10 months (0.833 years) |
| Regular cost per IT | $30,000 |
| Emergency cost per IT | $60,000 |
| Future value of unused spare | $25,000 |

Table II. Main Results for Case Study 1: Non-Repairable Components

| Optimization Criteria | Optimal Stock Level | Associated Values |
|---|---|---|
| Interval Reliability (goal = 99%) | 2 | Reliability = 99.12%; Total cost per year = $20,760 |
| Cost Minimization | 1 | Total cost per year = $18,073; Reliability = 93.39% |
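The reliability figures and the depreciation rate quoted for this case study can be reproduced with a few lines. This is a sketch under the assumptions stated in the case study (exponential failures, hence Poisson demand over the interval) plus the relation $D_T = (1 - r)^T C_r$ from section 5.1; the function names are hypothetical.

```python
import math

def poisson_cdf(k, a):
    """P(N <= k) for a Poisson variable with mean a."""
    term = math.exp(-a)
    total = term
    for j in range(1, k + 1):
        term *= a / j
        total += term
    return total

def min_stock_for_reliability(a, target, s_max=100):
    """Smallest S with P(N_T <= S) >= target (None if not found)."""
    for S in range(s_max + 1):
        if poisson_cdf(S, a) >= target:
            return S
    return None

T_years = 10.0 / 12.0
a = 150 * (1.0 / 300.0) * T_years            # expected failures in ten months
s_star = min_stock_for_reliability(a, 0.99)  # stock for the 99% goal
rel_two = poisson_cdf(2, a)                  # reliability with two spares
rel_one = poisson_cdf(1, a)                  # reliability with one spare

# Annual depreciation rate implied by a $25,000 resale value after ten months.
r = 1.0 - (25000.0 / 30000.0) ** (1.0 / T_years)
```

The expected number of failures in the interval is about 0.42, which is why one or two spares already give reliabilities of roughly 93.4% and 99.1%. The implied depreciation rate is close to the 19.65% per year quoted in the text.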
6.2 Case Study for Repairable Components: Haul Trucks in a Mining Operation

This case study is based on data coming from a copper mining operation in South America. At the site, a fleet of 39 haul trucks is used to transport ore and waste from the pit to the crusher and waste dumps around the mine. In the propulsion system of these trucks, there exists one type of component (that we will refer to as component X) which is critical for the adequate operation of the system. Every time one of these components is removed from a truck, a spare is put into work, and the failed (or preventively removed) one is sent to the shop for repair or reconditioning. Once the repair is performed, the component is stocked back in inventory. Each truck utilizes two units of component X, thus there are 78 such components in operation at the mine site. Normally, a haul truck has an operating load of 6,600 hours per year, and the preventive maintenance (reconditioning) interval for component X is 9,000 operating hours. Perfect repair of the components is assumed.


We will only consider the situation when there is no limit on the number of repair servers (unlimited repair capacity). We are interested in determining the optimal initial number of spares in stock, according to different optimization criteria such as instantaneous and interval reliability, availability and cost minimization. Databases of maintenance events (PMs, failures, etc.), including date, truck age (measured in operating hours), component age (operating hours), along with a brief description of the event and its cause, were available. The resulting data set for component X consisted of 171 events (86 failures and 85 suspensions corresponding to preventive replacements). After testing for possible trends or correlation, a two-parameter Weibull distribution was fitted to the data (shape parameter $\beta$ = 0.8565, scale parameter $\eta$ = 14,650 operating hours). As the number of components in operation is relatively large (m = 78), demand for spares can be approximated by a Poisson process, independently of the underlying failure distribution (note in addition that in this case $\beta$ is relatively close to one, so an exponential distribution for the time between failures could have been assumed directly, i.e. an exponential distribution is not rejected by appropriate goodness of fit tests). Note that the company enforced a preventive replacement policy in spite of the data showing a shape parameter close to one. To calculate $\mu$, the mean time to replacement of a component, we take into account the preventive replacement age policy in place. In this case $\mu$ was calculated using equation (2), and found to be 6,420.3 operating hours. As no data were available to calculate the MTTR (Mean Time to Repair), it was obtained from estimates provided by maintenance personnel, at MTTR = 452 operating hours (repair rate $\mu_R$ = 0.002212 repairs per operating hour). Note that it is expressed in terms of operating hours for consistency (conversion of calendar time into operating hours was performed using the actual utilization factor for the fleet).
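The derived inputs of this case study can be checked numerically. The sketch below rests on an assumption that is not confirmed by this part of the paper: that equation (2) is the usual expected cycle length under an age-based replacement policy, i.e. the integral of the reliability function from 0 to the preventive replacement age. The function names, the use of Simpson's rule and the variable names are likewise choices made for this illustration.

```python
import math

def weibull_reliability(t, beta, eta):
    """R(t) = exp(-(t/eta)^beta) for a two-parameter Weibull distribution."""
    return math.exp(-((t / eta) ** beta))

def mean_time_to_replacement(beta, eta, t_p, n=20000):
    """Integral of R(t) over [0, t_p] by composite Simpson's rule (n even)."""
    h = t_p / n
    total = weibull_reliability(0.0, beta, eta) + weibull_reliability(t_p, beta, eta)
    for k in range(1, n):
        weight = 4 if k % 2 else 2
        total += weight * weibull_reliability(k * h, beta, eta)
    return total * h / 3.0

# Fitted parameters and the 9,000-hour reconditioning interval from the text.
mu = mean_time_to_replacement(beta=0.8565, eta=14650.0, t_p=9000.0)

mu_r = 1.0 / 452.0                    # repair rate from MTTR = 452 op. hours
a = 78 * (1.0 / 6420.3) / mu_r        # average demands during one repair
```

Under that assumption the integral comes out close to the 6,420.3 operating hours reported above. The offered load `a`, about 5.49 demands per repair time, is the quantity that drives the stock levels obtained for this fleet.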
The cost of downtime was estimated based on the average cycle time and load of a haul truck in the mine, the average ore grade and the metallurgical recovery in downstream processes. No consideration of the mine-plant stockpile was made for this case study (i.e. loss of production is a direct consequence of truck downtime). The cost of downtime per operational hour lost, for one truck, was estimated at $2,173.3. The total holding cost of one spare was estimated at $1.51 per operational hour (or 25% of the value of a new part per annum, with an average utilization of the truck of 6,600 hours per year). For this case study, a planning horizon of one year (6,600 operating hours) was selected (for the interval reliability calculations). Table III summarizes the main parameters for this application example.

Table III. Summary of Data for Case Study 2: Repairable Components

| Parameter | Value |
|---|---|
| Spare type | Repairable component X |
| Components in operation | 78 |
| Rate of replacements | 1/6,420.3 [replacements/component/op. hour] |
| Planning horizon | 6,600 op. hours |
| Rate of repair | 0.002212 [repairs/op. hour] |
| Cost per spare component X | $40,000 |
| Holding cost for one spare | $1.51 per op. hour |
| Cost of downtime for one component | $2,173.3 per op. hour |

Four optimization criteria were considered for this case study. Table IV presents the main results obtained. As expected, optimal stock size varies depending on the goal of the optimization. It can be observed that interval reliability for the stock is more demanding than instantaneous reliability, as expected. If the interval of interest were very small, the results obtained using the interval reliability approach would be close to those obtained using instantaneous reliability.

Table IV. Main Results for Case Study 2: Repairable Components

| Optimization Criteria | Optimal Stock Level | Associated Values |
|---|---|---|
| Interval Reliability (goal = 95%) | 15 | Reliability = 98.05% (for Stock = 14, Reliability = 94.99%) |
| Instantaneous Reliability (goal = 95%) | 10 | Reliability = 97.53% (for Stock = 9, Reliability = 94.75%) |
| Availability (goal = 99%) | 6 | Availability = 99.14% |
| Cost minimization | 14 | Total cost per unit time = $23.00; Instantaneous Reliability = 99.94% |

In order to achieve 99% availability only six spares are required (recall that availability refers to non-downtime due to spares shortage for component X only, and it is not the overall availability of the fleet). Nonetheless, due to the high costs associated with downtime, the optimal stock size when minimization of costs is pursued is much larger. This is because (almost) no spares shortages are to be allowed in order to minimize cost or, in other words, it is better to stock a larger number of components than to accept the (small) risk of spares shortage: the instantaneous reliability associated with a stock size of 14 is 99.94%.

We have presented a number of basic spares inventory models used to determine the optimal stock size for the cases of non-repairable and repairable critical components, according to different optimization criteria, namely: (i) reliability of the stock (instantaneous or interval, depending on the application), (ii) availability (in the case of repairable components), and (iii) cost. In addition, procedures to find the interval of supportability given a stock level and desired reliability are introduced. Three brief case studies were reviewed, illustrating industrial spares stockholding problems. Most of the models discussed have been incorporated into a prototype software called SMS (Spares Management Software), developed by the Condition-Based Maintenance Laboratory at the University of Toronto. Most of the models included in this paper are based on the widely accepted use of a Poisson process approximation for spares demand (in the case of slow-moving, critical parts).


In the opinion of the authors, the models are simple enough to facilitate their implementation in practice. Note that the models discussed assume that a demand rate has been estimated based on actual failure or replacement observations. When real failure (demand) observations are lacking, as in the pre-operation stages of an industrial project, the use of a Bayesian procedure to estimate demand is justified. For further details on Bayesian analysis in the context of inventory control see Sherbrooke (2004). A recent and complete application study of a Bayesian procedure in the determination of the stock size for circuit packs at a communications company is presented in Aronis et al. (2004). They show the benefits of using such an updating procedure to adjust stock levels after observations have been recorded year after year. Earlier studies in the area of Bayesian analysis in inventory problems can be found in Kaplan (1988) (and references therein), where a Bayesian procedure is used to update the forecast for demand of repair parts of new weapon systems, and Hill (1999), who advocates the use of Bayesian methods when faced with limited data. He reviews the application of Bayesian analysis to a single-period inventory model with compound Poisson demand, where demand size is known, but demand rate is not.

The authors wish to acknowledge the Natural Sciences and Engineering Research Council (NSERC) of Canada, Materials and Manufacturing Ontario (MMO) of Canada, and the CBM Consortium members for their financial support. The authors would also like to thank Ms. Samantha Chan, a student of Industrial Engineering at the University of Toronto, for preparation of the case study using electrical transformers. They also acknowledge the Fondo Nacional de Desarrollo Científico y Tecnológico (FONDECYT) of the Chilean government (project 1090079).

REFERENCES

1. Aronis, K.P., Magou, I., Dekker, R. and Tagaras, G., 2004. Inventory control of spare parts using a Bayesian approach: a case study, European Journal of Operational Research, 154: 730-739.
2. Barlow, R.E. and Proschan, F., 1965. Mathematical Theory of Reliability, Society for Industrial and Applied Mathematics, Philadelphia.
3. Bhat, U.N. and Miller, G.K., 2002. Elements of Applied Stochastic Processes, Third Edition, Wiley, Hoboken.
4. Birolini, A., 1999. Reliability Engineering, Third Edition, Springer, Berlin.
5. Carrillo, M.J., 1991. Extensions of Palm's theorem: a review, Management Science, 37: 739-744.
6. Chikan, A. (Ed.), 1990. Inventory Models. Theory and Decision Library Series B: Mathematical and Statistical Methods, Kluwer, Dordrecht.
7. Chung, K.L., 1967. Markov Chains with Stationary Transition Probabilities, Second Edition, Springer-Verlag, Berlin.
8. Cocozza-Thivent, C., 1997. Processus Stochastiques et Fiabilité des Systèmes, Springer, Berlin.
9. Cox, D.R., 1962. Renewal Theory, Methuen, London.
10. Díaz, A. and Fu, M.C., 1997. Models for multi-echelon repairable item inventory systems with limited repair capacity, European Journal of Operational Research, 97: 480-492.
11. Feller, W., 1966. An Introduction to Probability Theory and its Applications, Vol. II, Wiley, New York.
12. Ghodrati, B. and Kumar, U., 2005. Reliability and operating environment-based spare parts estimation approach: a case study in Kiruna mine, Sweden, Journal of Quality in Maintenance Engineering, 11: 169-184.
13. Guide, V.D.R. and Srivastava, R., 1997. Repairable inventory theory: models and applications, European Journal of Operational Research, 102: 1-20.
14. Hill, R., 1999. Bayesian decision-making in inventory modelling, IMA Journal of Mathematics Applied in Business and Industry, 10: 147-163.
15. Inman, R.R., 1999. Empirical evaluation of exponential and independence assumptions in queueing model of manufacturing systems, Production and Operations Management, 8(4): 409-432.
16. Kaplan, A.J., 1988. Bayesian approach to inventory control of new parts, IIE Transactions, 20: 151-156.
17. Kennedy, W.J., Patterson, J.W. and Fredendall, L.D., 2002. An overview of recent literature on spare parts inventories, International Journal of Production Economics, 76: 201-215.
18. Khintchine, A.Y., 1969. Mathematical methods in the theory of queuing. Griffin's Statistical Monographs and Courses, Charles Griffin & Co., London.
19. Klein, M.B., 1984. Suitability of the lognormal distribution for corrective maintenance repair times, Reliability Engineering, 9: 65-80.
20. Latouche, G. and Ramaswami, V., 1999. Introduction to Matrix Analytic Methods in Stochastic Modeling, ASA-SIAM Series on Statistics and Applied Probability, Society for Industrial and Applied Mathematics, Philadelphia.
21. Louit, D., Banjevic, D. and Jardine, A.K.S., 2005. Optimization of spare parts inventories composed of non-repairable or repairable parts, Proceedings of the International Conference of Maintenance Societies, ICOMS, paper 23, Hobart, Australia.
22. Petrovic, R., Senborn, A. and Vujosevic, M., 1986. Hierarchical Spare Parts Inventory Systems. Studies in Production and Engineering Economics 5, Elsevier, Amsterdam.
23. Ross, S.M., 2003. Introduction to Probability Models, Eighth Edition, Academic Press, San Diego.
24. Rustenburg, W.D., van Houtum, G.J. and Zijm, W.H.M., 2000. Spare parts management for technical systems: resupply of spare parts under limited budgets, IIE Transactions, 32: 1013-1026.
25. Sheikh, A.K., Younas, M. and Raouf, A., 2000. Reliability based spare parts forecasting and procurement strategies, in Ben-Daya, M., Duffuaa, S.O. and Raouf, A. (Eds.) Maintenance, Modeling and Optimization, Kluwer Academic Publishers, Boston.
26. Sherbrooke, C.C., 2004. Optimal Inventory Modelling of Systems: multi-echelon techniques, Second Edition, Kluwer, Boston.
27. Tersine, R., 1988. Inventory Control and Materials Management, North Holland, New York.
28. Verrijdt, J., Adan, I. and de Kok, T., 1998. A trade off between emergency repair and inventory investment, IIE Transactions, 30: 119-132.
29. Wong, J.Y.F., Chung, D.W.C., Ngai, B.M.Y., Banjevic, D. and Jardine, A.K.S., 1997. Evaluation of spares requirements using statistical and probability analysis techniques, Transactions of Mechanical Engineering, I.E. Aust., 22: 77-84.

APPENDIX

We want to calculate $E(H_T)$ in section 5.1, where $H_T = (S - N_T)^+\, T + \sum_{i=1}^{S} S_i\, I(S_i \le T)$ is the holding time of spares. We should first notice that

$$E(S - N_T)^+ = \sum_{k=1}^{S} P(N_T < k) = \sum_{k=1}^{S} P(S_k > T) = S - \sum_{k=1}^{S} P(S_k \le T)$$

and that

$$E(H_T) = E(S - N_T)^+\, T + \sum_{i=1}^{S} E(S_i;\, S_i \le T) = T S - \sum_{i=1}^{S} \big[\, T\, P(S_i \le T) - E(S_i;\, S_i \le T) \,\big].$$

Also, for a continuous random variable X, and $X^* = (X - E(X))/\sigma(X)$, it is easy to show that

$$E(X;\, X \le T) = \int_{-\infty}^{T} x f(x)\, dx = E(X)\, P(X^* \le T^*) + \sigma(X)\, E(X^*;\, X^* \le T^*).$$

(i) For the standardized normal distribution with $\varphi(x) = \exp(-x^2/2)/\sqrt{2\pi}$, it is easy to see that $E(X^*;\, X^* \le t) = \int_{-\infty}^{t} x\, \varphi(x)\, dx = -\varphi(t)$. If we use the normal approximation for the distribution of $S_i$ with parameters $E(S_i) = i\mu/m$ and $Var(S_i) = \sigma^2 i/m^2$, then $E(S_i;\, S_i \le T) = E(S_i)\, \Phi(T^*) - \sigma(S_i)\, \varphi(T^*)$, where $T^* = (T - E(S_i))/\sigma(S_i)$. From this,

$$E(H_T) = T S - \sum_{i=1}^{S} \frac{\sigma \sqrt{i}}{m}\; g\!\left( \frac{mT - i\mu}{\sigma \sqrt{i}} \right),$$

where $g(x) = x\, \Phi(x) + \varphi(x)$.

(ii) If we use a Poisson distribution with parameter a, $a = mT/\mu$, for $N_T$, we have, after some algebra,

$$E(S - N_T)^+ = \sum_{i=0}^{S} (S - i)\, P(N_T = i) = S\, P(N_T \le S) - a\, P(N_T \le S-1).$$

Noticing that $S_i$ follows an Erlang distribution with rate parameter $a/T$, with some algebra we have

$$\sum_{i=1}^{S} E(S_i;\, S_i \le T) = \sum_{i=1}^{S} \int_0^T x f_i(x)\, dx = \sum_{i=1}^{S} \int_0^T x\, \frac{(a/T)^i\, x^{i-1}}{(i-1)!}\, e^{-a x/T}\, dx = \sum_{i=1}^{S} \frac{iT}{a}\, P(S_{i+1} \le T) = \frac{T}{a} \sum_{i=1}^{S} i\, P(N_T \ge i+1)$$

$$= \frac{T}{a} \sum_{i=1}^{S} i\, \big(1 - P(N_T \le i)\big) = \frac{T}{a} \left[ \frac{S(S+1)}{2} - \sum_{i=1}^{S} i \sum_{j=0}^{i} \frac{a^j}{j!}\, e^{-a} \right] = \frac{T}{a} \left[ \frac{S(S+1)}{2} \big(1 - P(N_T \le S)\big) + \frac{a^2}{2}\, P(N_T \le S-2) \right].$$

Then, combining the previous formulas, we have

$$E(H_T) = T \big( S\, P(N_T \le S) - a\, P(N_T \le S-1) \big) + \frac{T}{a} \left[ \frac{S(S+1)}{2} \big(1 - P(N_T \le S)\big) + \frac{a^2}{2}\, P(N_T \le S-2) \right].$$

We may also notice that in general for any two random variables X and Y, $(X - Y)^+ + Y = X + (Y - X)^+$, so that $E(S - N_T)^+ + E(N_T) = S + E(N_T - S)^+$.
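The Poisson expression for $E(H_T)$ can be cross-checked numerically. The sketch below is an illustration, not part of the original derivation. It relies on the observation that the holding time equals the stock on hand integrated over the interval, so that $E(H_T) = \int_0^T E(S - N_t)^+\, dt$ with $N_t$ Poisson with mean $at/T$. The function names and the use of Simpson's rule are choices made here.

```python
import math

def poisson_cdf(k, a):
    """P(N <= k) for a Poisson variable with mean a (0 for k < 0)."""
    if k < 0:
        return 0.0
    term = math.exp(-a)
    total = term
    for j in range(1, k + 1):
        term *= a / j
        total += term
    return total

def expected_holding_closed(S, a, T):
    """E(H_T) from the closed-form Poisson expression of case (ii)."""
    first = T * (S * poisson_cdf(S, a) - a * poisson_cdf(S - 1, a))
    second = (T / a) * (0.5 * S * (S + 1) * (1.0 - poisson_cdf(S, a))
                        + 0.5 * a * a * poisson_cdf(S - 2, a))
    return first + second

def expected_holding_numeric(S, a, T, n=2000):
    """Same quantity as the integral of E(S - N_t)^+ over [0, T] (Simpson, n even)."""
    def expected_stock(t):
        at = a * t / T                     # Poisson mean up to time t
        return S * poisson_cdf(S, at) - at * poisson_cdf(S - 1, at)
    h = T / n
    total = expected_stock(0.0) + expected_stock(T)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * expected_stock(k * h)
    return total * h / 3.0
```

For a single spare the closed form reduces to $T(1 - e^{-a})/a$, the expected time until the first demand truncated at T, which gives a quick sanity check alongside the numerical integral.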