Journal of Rajasthan Academy of Physical Sciences
ISSN: 0972-6306; URL: http://raops.org.in
Vol. 12, No. 2, June 2013, pp. 199-212

RELIABILITY MODELING OF A COMPUTER SYSTEM WITH PRIORITY TO PM OVER H/W REPLACEMENT SUBJECT TO MOT AND MRT

ASHISH KUMAR AND S.C. MALIK
Department of Statistics, M.D. University, Rohtak-124001, Haryana (India)
Email: ashishbarak2020@gmail.com, sc_malik@rediffmail.com

Received: March 29, 2013; Revised: May 04, 2013

Abstract: In this paper we discuss a reliability model of a computer system with two identical units, one operative and the other kept as cold standby, considering the maximum operation time (MOT) and maximum repair time (MRT) of the unit. In each unit the h/w and s/w components fail independently. There is a single server who visits the system immediately to perform PM, h/w repair, h/w replacement and s/w replacement. The unit undergoes preventive maintenance after a maximum operation time at normal mode. The h/w components undergo repair at their failure and are replaced by new ones if they are not repaired within a maximum repair time. The s/w components, however, are only replaced when the s/w fails to meet the requirements. Priority is given to the preventive maintenance (PM) of the unit over replacement of the h/w components. The failure times of the components follow negative exponential distributions, whereas the distributions of preventive maintenance, repair and replacement times are taken as arbitrary with different probability density functions. Several reliability measures are derived using the semi-Markov process and regenerative point technique. Graphs are drawn to depict the behaviour of the results.

Key Words: Computer System, Reliability Model, Preventive Maintenance, Maximum Operation and Repair Times, Priority and Replacement.

2010 Mathematics Subject Classification: 90B25 and 60K10

1.
Introduction

The evaluation of computer system reliability is an important aspect of the design of new systems and the further development of old ones. A major challenge for industrialists nowadays is to provide reliable h/w and s/w components for computer systems. For this purpose, most scientists and academicians are trying to explore new techniques for improving the reliability of computer systems. In spite of these efforts, little work has been dedicated to the reliability modeling of computer systems. Moreover, most of the research carried out so far on s/w and h/w reliability has been limited to the consideration of either the h/w subsystem alone or the s/w subsystem alone. But there are many complex systems in which h/w and s/w components work together to provide computer functionality. Friedman and Tran [1] and Welke et al. [6] tried to establish a combined reliability model for the whole system in which hardware and software components work together. Recently, Malik and Anand [3] and Malik and Kumar [2] suggested reliability models of a computer system with independent failure of h/w and s/w components.

© 2013 RAOPS, All rights reserved

Further, the continued operation and ageing of these systems gradually reduce their performance, reliability and safety. A breakdown of such systems is costly and dangerous, and may create confusion in our society.
It is, therefore, of great importance to operate such systems with high reliability. It has been shown that preventive maintenance can slow the deterioration process of a repairable system and restore the system to a younger age or state. Thus, preventive maintenance can be used to improve the reliability and profit of a system. Malik and Nandal [4] proposed a reliability model for complex systems introducing the concept of preventive maintenance of the unit after a maximum operation time. Further, the reliability of a system can be increased by replacing the components by new ones in case the repair time is too long, i.e. if it extends to a pre-specified time. Singh and Agrafiotis [5] stochastically analyzed a two-unit cold standby system subject to maximum operation and repair time.

In view of the above facts, and to fill this gap, the present paper is designed to evaluate some reliability measures of a computer system in which h/w and s/w components fail independently. A reliability model of two identical units is developed, considering the computer system as a single unit. Initially one unit is operative and the other unit is kept as cold standby. In each unit the h/w and s/w components fail independently. There is a single server who visits the system immediately to perform PM, h/w repair, s/w replacement and h/w replacement of the components. The unit undergoes preventive maintenance after a maximum operation time at normal mode. The h/w components undergo repair at their failure and are replaced by new ones if they are not repaired within a maximum repair time. For the s/w components, however, there is only a replacement or upgradation facility.
Priority is given to the preventive maintenance (PM) of the unit over replacement of the h/w components. The failure times of the components follow negative exponential distributions, whereas the distributions of preventive maintenance, repair and replacement times are taken as arbitrary with different probability density functions. Several reliability measures, such as mean time to system failure (MTSF), availability, busy period of the server due to PM, busy period of the server due to h/w repair, busy period of the server due to h/w replacement, busy period of the server due to s/w replacement, expected number of h/w replacements, expected number of s/w replacements, expected number of visits of the server and the profit function, are obtained using the semi-Markov process and regenerative point technique. The graphical behaviour of the results is also shown for a particular case to highlight the importance of the study.

Notations

$E$ : The set of regenerative states
NO : The unit is operative and in normal mode
Cs : The unit is in cold standby
$a/b$ : Probability that the system has hardware / software failure
$\lambda_1/\lambda_2$ : Constant hardware / software failure rate
$\alpha_0$ : Maximum constant rate of operation time
$\beta_0$ : Maximum constant rate of repair time
Pm/PM : The unit is under preventive maintenance / under preventive maintenance continuously from previous state
WPm/WPM : The unit is waiting for PM / waiting for preventive maintenance continuously from previous state
HFur/HFUR : The unit is failed due to hardware and is under repair / under repair continuously from previous state
HFurp/HFURP : The unit is failed due to h/w and is under replacement / under replacement continuously from previous state
HFwr/HFWR : The unit is failed due to h/w and is waiting for repair / waiting for repair continuously from previous state
SFurp/SFURP : The unit is failed due to the s/w and is under replacement / under replacement continuously from previous state
SFwrp/SFWRP : The unit is failed due to the software and is waiting for replacement / waiting for replacement continuously from previous state
$h(t)/H(t)$ : pdf / cdf of replacement time of the unit due to software
$g(t)/G(t)$ : pdf / cdf of repair time of the hardware
$m(t)/M(t)$ : pdf / cdf of replacement time of the hardware
$f(t)/F(t)$ : pdf / cdf of the time for PM of the unit
$q_{ij}(t)/Q_{ij}(t)$ : pdf / cdf of passage time from regenerative state $i$ to a regenerative state $j$ or to a failed state $j$ without visiting any other regenerative state in $(0, t]$
pdf/cdf : Probability density function / cumulative distribution function
$q_{ij.kr}(t)/Q_{ij.kr}(t)$ : pdf / cdf of direct transition time from regenerative state $i$ to a regenerative state $j$ or to a failed state $j$ visiting states $k$, $r$ once in $(0, t]$
$M_i(t)$ : Probability that the system, up initially in state $S_i \in E$, is up at time $t$ without visiting any regenerative state
$W_i(t)$ : Probability that the server is busy in the state $S_i$ up to time $t$ without making any transition to any other regenerative state or returning to the same state via one or more non-regenerative states
$m_{ij}$ : Contribution to mean sojourn time ($\mu_i$) in state $S_i$ when the system transits directly to state $S_j$, so that $\mu_i = \sum_j m_{ij}$ and $m_{ij} = \int t\, dQ_{ij}(t) = -q_{ij}^{*\prime}(0)$
Ⓢ/© : Symbol for Laplace-Stieltjes convolution / Laplace convolution
~/* : Symbol for Laplace-Stieltjes Transform (LST) / Laplace Transform (LT)
' (dash) : Used to represent alternative result

Transition Probabilities and Mean Sojourn Times

Simple probabilistic considerations yield the following expressions for the non-zero elements

$$p_{ij} = Q_{ij}(\infty) = \int_0^{\infty} q_{ij}(t)\, dt \qquad (1)$$

as [individual expressions for $p_{ij}$, written in terms of the transforms $f^*$, $g^*$, $h^*$ and $m^*$ evaluated at $A$ and $B$, not legible in the source]

where

$$A = b\lambda_2 + a\lambda_1 + \alpha_0 \quad \text{and} \quad B = b\lambda_2 + a\lambda_1 + \alpha_0 + \beta_0 \qquad (2)$$

It can be easily verified that the transition probabilities out of each regenerative state sum to 1.

The mean sojourn times ($\mu_i$) in the state $S_i$ are [expressions not legible in the source], and the contributions $m_{ij}$ out of each state add up to the corresponding mean sojourn time.
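As a rough illustration of how such branch probabilities arise, consider a state that is left by whichever of several independent exponential events occurs first. For the initial state this would plausibly be the elapse of the maximum operation time (rate α0), a hardware failure (rate aλ1) or a software failure (rate bλ2). That reading, the helper name `competing_exponential` and all numerical values below are assumptions made for illustration, not taken from the paper. Under it, each branch probability is the event's rate divided by the total rate A, and the mean sojourn time is 1/A.

```python
def competing_exponential(rates):
    """Branch probabilities and mean sojourn time for a state that is left
    by the first of several independent exponential events.

    rates: dict mapping a (hypothetical) event label to its rate.
    """
    total = sum(rates.values())                      # plays the role of A
    probs = {name: r / total for name, r in rates.items()}
    mean_sojourn = 1.0 / total                       # mean of the minimum
    return probs, mean_sojourn


# Hypothetical parameter values, chosen only for illustration.
a, b = 0.6, 0.4              # probability of h/w vs. s/w failure
lam1, lam2 = 0.002, 0.001    # h/w and s/w failure rates
alpha0 = 0.005               # rate at which the maximum operation time elapses

probs, mu0 = competing_exponential({
    "pm": alpha0,
    "hw_failure": a * lam1,
    "sw_failure": b * lam2,
})

for name, p in probs.items():
    print(f"{name}: {p:.4f}")
print(f"mean sojourn time: {mu0:.2f}")
```

The branch probabilities sum to one by construction, which mirrors the row-sum check stated in the text.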
Reliability and Mean Time to System Failure (MTSF)

Let $\phi_i(t)$ be the c.d.f. of first passage time from the regenerative state $i$ to a failed state. Regarding the failed state as an absorbing state, we have the following recursive relation for $\phi_i(t)$:

$$\phi_i(t) = \sum_j Q_{i,j}(t)\, Ⓢ\, \phi_j(t) + \sum_k Q_{i,k}(t) \qquad (6)$$

where $j$ is an un-failed regenerative state to which the given regenerative state $i$ can transit and $k$ is a failed state to which the state $i$ can transit directly. Taking LST of relation (6) and solving for $\tilde{\phi}_0(s)$, we have

$$R^*(s) = \frac{1 - \tilde{\phi}_0(s)}{s} \qquad (7)$$

The reliability of the system model can be obtained by taking the inverse Laplace transform of (7). The mean time to system failure (MTSF) is given by

$$\text{MTSF} = \lim_{s \to 0} \frac{1 - \tilde{\phi}_0(s)}{s} = \frac{N_1}{D_1} \qquad (8)$$

where $N_1$ is a combination of the mean sojourn times $\mu_i$ weighted by transition probabilities and $D_1$ is a polynomial in the transition probabilities [expressions not legible in the source].

Steady State Availability

Let $A_i(t)$ be the probability that the system is in up-state at instant $t$ given that the system entered regenerative state $i$ at $t = 0$. The recursive relations for $A_i(t)$ are given as

$$A_i(t) = M_i(t) + \sum_j q_{i,j}^{(n)}(t)\, ©\, A_j(t) \qquad (9)$$

where $j$ is any successive regenerative state to which the regenerative state $i$ can transit through $n \ge 1$ (natural number) transitions. $M_i(t)$ is the probability that the system, up initially in state $S_i \in E$, is up at time $t$ without visiting any other regenerative state; we have

[expressions for $M_i(t)$ not legible in the source] (10)

Taking LT of relations (9) and solving for $A_0^*(s)$, the steady state availability is given by

$$A_0(\infty) = \lim_{s \to 0} s A_0^*(s) = \frac{N_2}{D_2} \qquad (11)$$

where $N_2$ and $D_2$ are [expressions not legible in the source].

Busy Period Analysis for Server

Let $B_i^P(t)$, $B_i^R(t)$, $B_i^S(t)$ and $B_i^{HRp}(t)$ be the probabilities that the server is busy in preventive maintenance of the system, repairing the unit due to hardware failure, replacement of the software and replacement of the hardware components, respectively, at an instant $t$ given that the system entered state $i$ at $t = 0$. The recursive relations for $B_i^P(t)$, $B_i^R(t)$, $B_i^S(t)$ and $B_i^{HRp}(t)$ are as follows:

$$B_i^P(t) = W_i(t) + \sum_j q_{i,j}^{(n)}(t)\, ©\, B_j^P(t)$$
$$B_i^R(t) = W_i(t) + \sum_j q_{i,j}^{(n)}(t)\, ©\, B_j^R(t)$$
$$B_i^S(t) = W_i(t) + \sum_j q_{i,j}^{(n)}(t)\, ©\, B_j^S(t)$$
$$B_i^{HRp}(t) = W_i(t) + \sum_j q_{i,j}^{(n)}(t)\, ©\, B_j^{HRp}(t) \qquad (12)$$

where $j$ is any successive regenerative state to which the regenerative state $i$ can transit through $n \ge 1$ (natural number) transitions.
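The availability and busy-period recursions above all lead to a long-run fraction of the form N/D. For a toy semi-Markov process in which every state is regenerative, that fraction can be computed directly from the embedded chain: the stationary probabilities weight the time credited per visit in the numerator and the mean sojourn times in the denominator. The sketch below assumes a hypothetical three-state chain with made-up numbers. It is not the paper's model, which also contains non-regenerative states, and the function names are illustrative only.

```python
def stationary(P, iters=500):
    """Stationary distribution of an irreducible embedded chain.

    Uses the 'lazy' update pi <- (pi + pi P) / 2, which also converges
    for periodic chains.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        step = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        pi = [0.5 * (x + y) for x, y in zip(pi, step)]
    return pi


def long_run_fraction(P, mu, credited):
    """Renewal-reward ratio: sum(pi_i * credited_i) / sum(pi_i * mu_i)."""
    pi = stationary(P)
    num = sum(p * c for p, c in zip(pi, credited))
    den = sum(p * m for p, m in zip(pi, mu))
    return num / den


# Hypothetical chain: state 0 = up, state 1 = under PM, state 2 = under repair.
P = [
    [0.0, 0.7, 0.3],   # from "up": PM falls due or a failure occurs
    [1.0, 0.0, 0.0],   # PM completes, back to "up"
    [1.0, 0.0, 0.0],   # repair completes, back to "up"
]
mu = [100.0, 5.0, 20.0]          # mean sojourn times (made-up values)

availability = long_run_fraction(P, mu, credited=[100.0, 0.0, 0.0])
busy_pm = long_run_fraction(P, mu, credited=[0.0, 5.0, 0.0])
print(f"availability = {availability:.4f}, busy (PM) = {busy_pm:.4f}")
```

Both measures share the same denominator, which matches the way a single D appears under different numerators in the steady-state results.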
$W_i(t)$ is the probability that the server is busy in state $S_i$ due to preventive maintenance, hardware or software failure up to time $t$ without making any transition to any other regenerative state or returning to the same state via one or more non-regenerative states, and so

[expressions for $W_i(t)$ not legible in the source]

Taking LT of relations (12) and solving for $B_0^{P*}(s)$, $B_0^{R*}(s)$, $B_0^{S*}(s)$ and $B_0^{HRp*}(s)$, the time for which the server is busy due to PM, h/w repair, s/w replacement and h/w replacement respectively is given by

$$B_0^P = \lim_{s \to 0} s B_0^{P*}(s) = \frac{N_3^P}{D_2}, \quad B_0^R = \lim_{s \to 0} s B_0^{R*}(s) = \frac{N_3^R}{D_2}, \quad B_0^S = \lim_{s \to 0} s B_0^{S*}(s) = \frac{N_3^S}{D_2}, \quad B_0^{HRp} = \lim_{s \to 0} s B_0^{HRp*}(s) = \frac{N_3^{HRp}}{D_2} \qquad (13)$$

where $D_2$ is already mentioned and the numerators $N_3^P$, $N_3^R$, $N_3^S$ and $N_3^{HRp}$ are [expressions not legible in the source].

Expected Number of Replacements of the Units

Let $R_i^H(t)$ and $R_i^S(t)$ be the expected number of replacements of the failed hardware and software components by the server in $(0, t]$ given that the system entered the regenerative state $i$ at $t = 0$. The recursive relations for $R_i^H(t)$ and $R_i^S(t)$ are given as

$$R_i^H(t) = \sum_j Q_{i,j}^{(n)}(t)\, Ⓢ\, \left[\delta_j + R_j^H(t)\right], \qquad R_i^S(t) = \sum_j Q_{i,j}^{(n)}(t)\, Ⓢ\, \left[\delta_j + R_j^S(t)\right] \qquad (14)$$

where $j$ is any regenerative state to which the given regenerative state $i$ transits and $\delta_j = 1$ if $j$ is the regenerative state where the server does the job afresh, otherwise $\delta_j = 0$. Taking LST of relations (14) and solving for $\tilde{R}_0^H(s)$ and $\tilde{R}_0^S(s)$, the expected numbers of replacements per unit time due to hardware and software failures are respectively given by

$$R_0^H(\infty) = \lim_{s \to 0} s \tilde{R}_0^H(s) = \frac{N_4^H}{D_2}, \qquad R_0^S(\infty) = \lim_{s \to 0} s \tilde{R}_0^S(s) = \frac{N_4^S}{D_2} \qquad (15)$$

where $D_2$ is already mentioned and the numerators $N_4^H$ and $N_4^S$ are [expressions not legible in the source].

Expected Number of Visits by the Server

Let $N_i(t)$ be the expected number of visits by the server in $(0, t]$ given that the system entered the regenerative state $i$ at $t = 0$. The recursive relations for $N_i(t)$ are given as

$$N_i(t) = \sum_j Q_{i,j}^{(n)}(t)\, Ⓢ\, \left[\delta_j + N_j(t)\right]$$

where $j$ is any regenerative state to which the given regenerative state $i$ transits and $\delta_j = 1$ if $j$ is the regenerative state where the server does the job afresh, otherwise $\delta_j = 0$. Take the LST of the relation above and solve for $\tilde{N}_0(s)$.
The expected number of visits per unit time by the server is given by

$$N_0(\infty) = \lim_{s \to 0} s \tilde{N}_0(s) = \frac{N_5}{D_2}$$

where $D_2$ is already mentioned and $N_5$ is [expression not legible in the source].

Economic Analysis

The profit incurred to the system model in steady state can be obtained as

$$P = K_0 A_0 - K_1 B_0^P - K_2 B_0^R - K_3 B_0^S - K_4 B_0^{HRp} - K_5 R_0^H - K_6 R_0^S - K_7 N_0$$

$K_0$ = Revenue per unit up-time of the system
$K_1$ = Cost per unit time for which server is busy due to preventive maintenance
$K_2$ = Cost per unit time for which server is busy due to hardware failure
$K_3$ = Cost per unit replacement of the failed software component
$K_4$ = Cost per unit replacement of the failed hardware component
$K_5$ = Cost per unit replacement of the failed hardware
$K_6$ = Cost per unit replacement of the failed software
$K_7$ = Cost per unit visit by the server

Conclusion

For a particular case, the numerical results for some reliability and economic measures are obtained for a computer system of two identical units in which h/w and s/w components fail independently. The graphs for these measures are drawn with respect to preventive maintenance rate (α) for fixed values of other parameters as shown in figs. 2 to 4. It is revealed that MTSF, availability and profit increase with the increase of PM rate (α) and repair rate (θ) of the hardware components, but their values decline with the increase of the rate of maximum operation time ($\alpha_0$). However, if we increase the constant rate of repair time ($\beta_0$), the value of MTSF becomes larger while the values of availability and profit decline. Thus the study suggests that the reliability of a computer system of identical units can be improved by
(i) Reducing the repair time of the h/w components.
(ii) Giving priority to replacement of the h/w components over preventive maintenance of the unit.

Fig. 1: State Transition Diagram (legend: operative state, failed state, regenerative point)

Fig.: Availability vs. Preventive Maintenance Rate (α)

Acknowledgement

The authors are thankful to the worthy referee for his useful suggestions for the improvement of the paper.

References

[1] Friedman, M. A. and Tran, P. (1992). Reliability techniques for combined hardware/software systems, Proc. of Annual Reliability and Maintainability Symposium, 290-298.
[2] Malik, S. C. and Ashish Kumar (2011). Profit analysis of a computer system with priority to software replacement over hardware repair subject to maximum operation and repair times, International Journal of Engineering Science & Technology 3(10), 7452-7468.
[3] Malik, S. C. and Jyoti Anand (2010). Reliability and economic analysis of a computer system with independent hardware and software failures, Bulletin of Pure and Applied Sciences 29 E (Math. & Stat.)(1), 141-153.
[4] Malik, S. C. and Nandal, P. (2010). Cost-Analysis of Stochastic Models with Priority to Repair Over Preventive Maintenance Subject to Maximum Operation Time, Edited Book, Learning Manual on Modeling, Optimization and Their Applications, Excel India Publishers, 165-178.
[5] Singh, S. K. and Agrafiotis, G. K. (1995). Stochastic analysis of a two-unit cold standby system subject to maximum operation and repair time, Microelectron. Reliab. 35(12), 1489-1493.
[6] Welke, S. R.; Labib, S. W. and Ahmed, A. M. (1995). Reliability modeling of hardware/software system, IEEE Transactions on Reliability 44(3), 413-418.
