
Preface

This is the seventh volume in the series 'Handbook of Statistics' started by the
late Professor P. R. Krishnaiah to provide comprehensive reference books in
different areas of statistical theory and applications. Each volume is devoted to
a particular topic in statistics; the present one is on 'Quality Control and Relia-
bility', a modern branch of statistics dealing with the complex problems in the
production of goods and services, maintenance and repair, and management and
operations. The accent is on quality and reliability in all these aspects.
The leading chapter in the volume is written by W. Edwards Deming, a pioneer
in statistical quality control, who spearheaded the quality control movement in
Japan and helped the country in its rapid industrial development during the
postwar period. He gives a 14-point program for management to keep a country
on the ascending path of industrial development.
Two main areas of concern in practice are the reliability of the hardware and
of the process control software. The estimation of hardware reliability and its uses
are discussed under a variety of models for reliability by R. A. Johnson in
Chapter 3, M. Mazumdar in Chapter 4, L. F. Pau in Chapter 15, H. L. Harter
in Chapter 22, A. P. Basu in Chapter 23, and S. Iyengar and G. Patwardhan in
Chapter 24. The estimation of software reliability is considered by F. B. Bastani
and C. V. Ramamoorthy in Chapter 2 and T. A. Mazzuchi and N. D. Singpur-
walla in Chapter 5.
The main concepts and theory of reliability are discussed in Chapters 10, 12,
13, 14 and 21 by F. Proschan in collaboration with P. J. Boland, F. Guess, R. E.
Barlow, G. Mimmack, E. El-Neweihi and J. Sethuraman.
Chapter 6 by N. R. Chaganty and K. Joag-dev, Chapter 7 by B. W. Woodruff
and A. H. Moore, Chapter 9 by S. S. Gupta and S. Panchapakesan, Chapter 11
by M. C. Bhattacharjee and Chapter 16 by W. J. Padgett deal with some
statistical inference problems arising in reliability theory.
Several aspects of quality control of manufactured goods are discussed in
Chapter 17 by F. B. Alt and N. D. Smith, in Chapter 18 by B. Hoadley, in
Chapter 20 by M. Csörgő and L. Horváth and in Chapter 19 by P. R. Krishnaiah
and B. Q. Miao.
All the chapters are written by outstanding scholars in their fields of expertise
and I wish to thank all of them for their excellent contributions. Special thanks
are due to Elsevier Science Publishers B.V. (North-Holland) for their patience and
cooperation in bringing out this volume.

C. R. Rao
Contributors

F. B. Alt, Dept. of Management Science & Stat., University of Maryland, College
Park, MD 20742, USA (Ch. 17)
F. B. Bastani, Dept. of Computer Science, University of Houston, University Park,
Houston, TX 77004, USA (Ch. 2)
A. P. Basu, Dept. of Statistics, University of Missouri-Columbia, 328 Math. Science
Building, Columbia, MO 65201, USA (Ch. 23)
M. C. Bhattacharjee, Dept. of Mathematics, New Jersey Inst. of Technology, Newark,
NJ 07102, USA (Ch. 11)
H. W. Block, Dept. of Mathematics & Statistics, University of Pittsburgh, Pittsburgh,
PA 15260, USA (Ch. 8)
P. J. Boland, Dept. of Mathematics, University College, Belfield, Dublin 4, Ireland
(Ch. 10)
R. E. Barlow, Operations Research Center, University of California, Berkeley, CA
94720, USA (Ch. 13)
N. R. Chaganty, Math. Dept., Old Dominion University, Hampton Blvd., Norfolk, VA
23508, USA (Ch. 6)
M. Csörgő, Dept. of Mathematics & Statistics, Carleton University, Ottawa, Ontario,
Canada K1S 5B6 (Ch. 20)
W. Edwards Deming, Consultant in Statistical Studies, 4924 Butterworth Place,
Washington, DC 20016, USA (Ch. 1)
F. M. Guess, Department of Statistics, University of South Carolina, Columbia,
South Carolina 29208, USA (Ch. 12)
S. Gupta, Dept. of Statistics, Math./Science Building, Purdue University, Lafayette,
IN 47907, USA (Ch. 9)
H. L. Harter, 32 S. Wright Ave., Dayton, OH 45403, USA (Ch. 22)
B. Hoadley, Bell Laboratories, HP 1A-250, Holmdel, NJ 07733, USA (Ch. 18)
L. Horváth, Bolyai Institute, Szeged University, Aradi Vértanúk tere 1, H-6720
Szeged, Hungary (Ch. 20)
S. Iyengar, Dept. of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213,
USA (Ch. 24)
K. Joag-dev, Dept. of Mathematics, University of Illinois at Urbana-Champaign,
Urbana, IL 61801, USA (Ch. 6)
R. A. Johnson, Dept. of Statistics, 1210 West Dayton Street, Madison, WI 53706,
USA (Ch. 3)

M. Mazumdar, Dept. of Industrial Engineering, University of Pittsburgh, Benedum
Hall 1048, Pittsburgh, PA 15260, USA (Ch. 4)
T. A. Mazzuchi, c/o N. D. Singpurwalla, Operations Research & Statistics, George
Washington University, Washington, DC 20052, USA (Ch. 5)
B. Miao, Dept. of Math. & Stat., University of Pittsburgh, Pittsburgh, PA 15260,
USA (Ch. 19)
G. M. Mimmack, c/o F. Proschan, Statistics Department, Florida State University,
Tallahassee, FL 32306, USA (Ch. 14)
A. H. Moore, AFIT/ENC, Wright-Patterson AFB, OH 45433, USA (Ch. 7)
E. El-Neweihi, Dept. of Math., Stat. & Comp. Sci., University of Illinois, Chicago,
IL 60680, USA (Ch. 21)
W. J. Padgett, Math. & Stat. Department, University of South Carolina, Columbia,
SC 29208, USA (Ch. 16)
G. Patwardhan, Dept. of Mathematics, Pennsylvania State University at Altoona,
Altoona, PA 16603, USA (Ch. 24)
S. Panchapakesan, Mathematics Department, Southern Illinois University, Carbon-
dale, IL 62901, USA (Ch. 9)
L. F. Pau, 7 Route de Drize, CH 1227 Carouge, Switzerland (Ch. 15)
F. Proschan, Statistics Department, Florida State University, Tallahassee, FL 32306,
USA (Ch. 10, 12, 13, 14, 21)
C. V. Ramamoorthy, Dept. of Electrical Engineering & Comp. Sci., University of
California at Berkeley, Berkeley, CA 94720, USA (Ch. 2)
T. H. Savits, Dept. of Mathematics & Statistics, University of Pittsburgh, Pittsburgh,
PA 15260, USA (Ch. 8)
J. Sethuraman, Dept. of Statistics, Florida State University, Tallahassee, FL 32306,
USA (Ch. 21)
N. D. Singpurwalla, Operations Research & Statistics, George Washington Uni-
versity, Washington, DC 20052, USA (Ch. 5)
N. D. Smith, Dept. of Management Sci. & Stat., University of Maryland, College
Park, MD 20742, USA (Ch. 17)
B. Woodruff, Directorate of Mathematical & Inf. Service, AFOSR/NM, Bolling Air
Force Base, DC 20332, USA (Ch. 7)

Transformation of Western Style of Management*

W. Edwards Deming

1. The crisis of Western industry

The decline of Western industry, which began in 1968 and 1969, a victim of
competition, has reached little by little a stage that can only be characterized as
a crisis. The decline is caused by Western style of management, and it will
continue until the cause is corrected. In fact, the decline may be ready for a nose
dive. Some companies will die a natural death, victims of Charles Darwin's
inexorable law of survival of the fittest. In others, there will be awakening and
conversion of management.
What happened? American industry knew nothing but expansion from 1950 till
around 1968. American goods had the market. Then, one by one, many American
companies awakened to the reality of competition from Japan.
Little by little, one by one, the manufacture of parts and materials moves out
of the Western world into Japan, Korea, Taiwan, and now Brazil, for reasons of
quality and price. More business is carried on now between the U. S. and the
Pacific Basin than across the Atlantic Ocean.
A sudden crisis like Pearl Harbor brings everybody out in full force, ready for
action, even if they have no idea what to do. But a crisis that creeps in catches
its victims asleep.

2. A declining market exposes weaknesses

Management in an expanding market is fairly easy. It is difficult to lose when
business simply drops into the basket. But when competition presses into the
market, knowledge and skill are required for survival. Excuses ran out. By 1969,
the comptroller and the legal department began to take charge for survival, fight-
ing a defensive war, backs to the wall. The comptroller does his best, using only
visible figures, trying to hold the company in the black, unaware of the importance
for management of figures that are unknown and unknowable. The legal depart-
ment fights off creditors and predators that are on the lookout for an attractive
takeover. Unfortunately, management by the comptroller and the legal department
only brings further decline.

* Parts of this Chapter are extracts from the author's book Out of the Crisis (Center for Advanced
Engineering Study, Massachusetts Institute of Technology, 1985).

3. Forces that feed the decline

The decline is accelerated by the aim of management to boost the quarterly
dividend, and to maximize the price of the company's stock. Quick returns,
whether by acquisition, or by divestiture, or by paper profits or by creative
accounting, are self-defeating. The effect in the long run erodes investment and
ends up as just the opposite to what is intended.
A far better plan is to protect investment by plans and methods by which to
improve product and service, accepting the inevitable decrease in costs that accom-
panies improvement of quality and service, thus reversing the decline, capturing the
market with better quality and lower price. As a result, the company stays in
business and provides jobs and more jobs.
For years, price tag and not total cost of use governed the purchase of materials
and equipment.
Numerical goals and M.B.O. have made their contribution to the decline. A
numerical goal outside the capability of a system can be achieved only by impair-
ment or destruction of some other part of the company. Work standards more
than double costs of production. Worse than that, they rob people of their pride
of workmanship. Quotas of production are a guarantee of poor quality. Exhorta-
tions are directed at the wrong people. They should be directed at the manage-
ment, not at the workers.
Other forces are still more destructive.
(1) Lack of constancy of purpose to plan product and service that will have
a market and keep the company in business, and provide jobs.
(2) Emphasis on short-term profits: short-term thinking (just the opposite from
constancy of purpose to stay in business), fed by fear of unfriendly takeover, and
by push from bankers and owners for dividends.
(3) Personal review system, or evaluation of performance, merit rating, annual
review, or annual appraisal, by whatever name, for people in management, the
effects of which are devastating.
(4) Mobility of management; job hopping from one company to another.
(5) Use of visible figures only for management, with little or no consideration
of figures that are unknown or unknowable.
Peculiar to industry in the United States:
(6) Excessive medical costs.
(7) Excessive costs of liability.*

* Eugene L. Grant, interview in the journal Quality, Chicago, March 1984.



Anyone could add more inhibitors. One, for example, is the choking of business
by laws and regulations; also by legislation brought on by groups of people with
special interests, the effect of which is too often to nullify the work of standard-
izing committees of industry, government, and consumers.
Still another force is the system of detailed budgets which leaves a division
manager no leeway. In contrast, the manager in Japan is not bothered by detail.
He has complete freedom except for one item; he can not transfer to other uses
his expenditure for education and training.

4. Remarks on evaluation of performance, or the so-called merit rating

Many companies in America have systems by which everyone in management
or in research receives from his superiors a rating every year. Some government
agencies have a similar system. The merit system leads to management by fear.
The effect is devastating.
- It nourishes short-term performance, annihilates long-term planning, builds
fear, demolishes teamwork; nourishes rivalry and politics.
- It leaves people bitter, others despondent and dejected, some even depressed,
unfit for work for weeks after receipt of rating, unable to comprehend why they
are inferior. It is unfair, as it ascribes to the people in a group differences that
may be caused largely if not totally by the system that they work in.
The idea of a merit rating is alluring. The sound of the words captivates the
imagination: pay for what you get; get what you pay for; motivate people to do
their best, for their own good.
The effect of the merit rating is exactly the opposite of what the words promise.
Everyone propels himself forward, or tries to, for his own good, on his own life
preserver. The organization is the loser.
Moreover, a merit rating is meaningless as a predictor of performance, whether
in the same job or in one that he might be promoted into. One may predict
performance only for someone that falls outside the limits of differences attributa-
ble to the system that the people work in.

5. Modern principles of leadership

Modern principles of leadership will in time replace the annual performance
review. The first step in a company will be to provide education in leadership.
This education will include the theory of variation, also known as statistical
theory. The annual performance review may then be abolished. Leadership will
take its place. Suggestions follow.
(1) Institute education in leadership; obligations, principles, and methods.
(2) More careful selection of the people in the first place.
(3) Better training and education after selection.
(4) A leader, instead of being a judge, will be a colleague, counseling and
leading his people on a day-to-day basis, learning from them and with them.
(5) A leader will discover who, if any, of his people is (a) outside the system on
the good side, (b) outside on the poor side, (c) belonging to the system. The
calculations required are fairly simple if numbers are used for measures of per-
formance; a minimal sketch of such a calculation appears at the end of this
section. Ranking of people (outstanding down to unsatisfactory) that belong to
the system violates scientific logic and is ruinous as a policy.
In the absence of numerical data, a leader must make subjective judgment. A
leader will spend hours with every one of his people. They will know what kind
of help they need. There will sometimes be incontrovertible evidence of excellent
performance, such as patents, publication of papers, invitations to give lectures.
People that are on the poor side of the system will require individual help.
Monetary reward for outstanding performance outside the system, without
other, more satisfactory recognition, may be counterproductive.
(6) The people of a group that form a system will all be subject to the com-
pany's formula for privileges and for raises in pay. This formula may involve (e.g.)
seniority. It is important to note that privilege will not depend on rank within the
system. (In bad times, there may be no raise for anybody.)
(7) Figures on performance should be used not to rank the people in a group
that fall within the system, but to assist the leader to accomplish improvement of
the system. These figures may also point out to him some of his own weaknesses.
(8) Have a frank talk with every employee, up to three or four hours, at least
once a year, not for criticism, but to learn from each of them about the job and
how to work together.
The day is here when anyone deprived of a raise or of any privilege through
misuse of figures for performance (as by ranking the people in a group) may with
justice file a grievance.
Improvement of the system will help everybody, and will decrease the spread
between the figures for the performances of people.
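
As noted under point (5), the calculation that separates people who belong to the
system from those outside it is simple. The sketch below is a hypothetical
illustration only: Deming prescribes no formula here, and the limits used (mean
plus or minus k standard deviations, with k chosen for the sketch) are an
assumption of this example.

```python
# Hypothetical sketch: compare performance figures against system limits.
# The mean +/- k*sigma limits are an assumption of this illustration,
# not a formula from Deming's text.

def classify(figures, k=2.0):
    n = len(figures)
    mean = sum(figures) / n
    sigma = (sum((x - mean) ** 2 for x in figures) / (n - 1)) ** 0.5
    lower, upper = mean - k * sigma, mean + k * sigma
    return {
        "limits": (round(lower, 1), round(upper, 1)),
        "outside_good": [x for x in figures if x > upper],
        "outside_poor": [x for x in figures if x < lower],
        "in_system": [x for x in figures if lower <= x <= upper],
    }

# Eight people's figures: only the 29 falls outside the system's limits,
# so only that person may meaningfully be singled out.
print(classify([12, 15, 11, 14, 13, 29, 12, 14]))
```

Everyone inside the limits belongs to the system; ranking them, as the text argues,
ascribes to individuals differences produced by the system itself.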

6. Other obstacles

(1) Hope for quick results (instant pudding).
(2) The excuse that 'our problems are different'.
(3) Inept teaching in schools of business.
(4) Failure of schools of engineering to teach statistical theory.
(5) Statistical teaching centres fail to prepare students for the needs of industry.
Students learn statistical theory for enumerative studies, then see it applied in
class and in textbooks, without justification or explanation, to analytic problems.
They learn to calculate estimates of standard errors of the result of an experiment
and in other analytic problems where there is no such thing as a standard error.
They learn tests of hypothesis, null hypothesis, and probability levels of signifi-
cance. Such calculations and the underlying theory are excellent mathematical
exercises, but they provide no basis for action, no basis for evaluation of the risk
of prediction of the results of the next experiment, nor of tomorrow's product,
which is the only question of interest in a study aimed at improvement of
performance of a process or of a product.
(6) The supposition by management that the work-force could turn out quality
if they would apply their skill and effort in full force. The fact is that nearly everyone
in Western industry, management and work-force, is impeded by barriers to pride
of workmanship.
(7) Reliance on QC-Circles, employee involvement, employee participation
groups, quality of work life, anything to get rid of the problems of people. These
shams, without management's participation, deteriorate and break up after a few
months. The big task ahead is to get the management involved in management
for quality and productivity. The work-force has always been involved. There will
then be quality of work life, pride of workmanship, and quality. Applications of
techniques within the system as it exists often accomplish great improvements in
quality, productivity and reduction of waste.

7. Remarks on use of visible figures

The comptroller runs the company on visible figures. This is a sure road to
decline. Why? Because the most important figures for management are not visible:
they are unknown and unknowable. Do courses in finance teach students the
importance of the unknown and unknowable loss
- from a dissatisfied customer?
- from a dissatisfied employee, one that, because of correctible faults of the
system, can not take pride in his work?
- from the annual rating on performance, the so-called merit rating?
- loss from absenteeism (purely a function of supervision)?
Do courses in finance teach their students about the increase in productivity
that comes from people that can take pride in their work?
Unfortunately, the answer is no.

8. Condensation of the 14 points for management

There is now a theory of management. No one can say now that there is
nothing about management to teach. If experience by itself would teach manage-
ment how to improve, then why are we in this predicament? Everyone doing his
best is not the answer that will halt the decline. It is necessary that everyone know
what to do; then for everyone to do his best.
The 14 points apply anywhere, to small organizations as well as to large ones,
to the service industry as well as to manufacturing.
(1) Create constancy of purpose toward improvement of product and service,
with the aim to excel in quality of product and service, to stay in business, and
to provide jobs.
(2) Adopt the new philosophy. We are in a new economic age, created by
Japan. Transformation of Western style of management is necessary to halt the
continued decline of industry.
(3) Cease dependence on inspection to achieve quality. Eliminate the need for
inspection on a mass basis by building quality into the product in the first place.
(4) End the practice of awarding business on the basis of price tag. Purchasing
must be combined with design of product, manufacturing, and sales, to work with
the chosen supplier, the aim being to minimize total cost, not initial cost.
(5) Improve constantly and forever every activity in the company, to improve
quality and productivity, and thus constantly decrease costs. Improve design of
product.
(6) Institute training on the job, including management.
(7) Institute supervision. The aim of supervision should be to help people and
machines and gadgets to do a better job.
(8) Drive out fear, so that everyone may work effectively for the company.
(9) Break down barriers between departments. People in research, design,
sales, and production must work as a team, to foresee problems of production
and in use that may be encountered with the product or service.
(10) Eliminate slogans, exhortations, and targets for the work force asking for
fewer defects and new levels of productivity. Such exhortations only create adver-
sarial relationships, as the bulk of the causes of low quality and low productivity
belong to the system and thus lie beyond the power of the work force.
(11) Eliminate work standards that prescribe numerical quotas for the day.
Substitute aids and helpful supervision.
(12a) Remove the barriers that rob the hourly worker of his right to pride of
workmanship. The responsibility of supervisors must be changed from sheer
numbers to quality.
(b) Remove the barriers that rob people in management and in engineering of
their right to pride of workmanship. This means, inter alia, abolishment of the
annual or merit rating and of management by objective.
(13) Institute a vigorous program of self-improvement and education.
(14) Put everybody in the company to work in teams to accomplish the trans-
formation. Teamwork is possible only where the merit rating is abolished, and
leadership put in its place.

9. What is required for change?

The first step is for Western management to awaken to the need for change.
It will be noted that the 14 points as a package, plus removal of the deadly
diseases and obstacles to quality, are the responsibility of management.
Management in authority will explain by seminars and other means to a critical
mass of people in the company why change is necessary, and that the change will
involve everybody. Everyone must understand the 14 points, the deadly diseases,
and the obstacles. Top management and everyone else must have the courage to
change. Top management must break out of line, even to the point of exile
amongst their peers.

Software Reliability

F. B. Bastani and C. V. Ramamoorthy

1. Introduction

Process control systems, such as nuclear power plant safety control systems,
air-traffic control systems and ballistic missile defense systems, are embedded
computer systems. They are characterized by severe reliability, performance and
maintainability requirements. The reliability criterion is particularly crucial since
any failures can be catastrophic. Hence, the reliability of these systems must be
accurately measured prior to actual use.
The theoretical basis for methods of estimating the reliability of the hardware
is well developed (Barlow and Proschan, 1975). In this paper we discuss methods
of estimating the reliability of process control software.
Program proving techniques can, in principle, establish whether the program is
correct with respect to its specification or whether it contains some errors. This
is the ideal approach since there is no physical deterioration or random mal-
functions in software. However, the functions expected of process control systems
are usually so complex that the specifications themselves can be incorrect and/or
incomplete, thus limiting the applicability of program proofs.
One approach is to use statistical methods in order to assess the reliability of
the program based on the set of test cases used. Since the early 1970's, several
models have been proposed for estimating software reliability and some related
parameters, such as the mean time to failure (MTTF), residual error content, and
other measures of confidence in the software. These models are based on three
basic approaches to estimating software reliability. Firstly, one can observe the
error history of a program and use this in order to predict its future behavior.
Models in this category are applicable during the testing and debugging phase. It
is often assumed that the correction of errors does not introduce any new errors.
Hence, the reliability of the program increases and, therefore, these models are
often called reliability growth models. A problem with these models is the dif-
ficulty in modelling realistic testing processes. Also, they cannot incorporate pro-
gram proofs, cannot be applied prior to the debugging phase and have to be
modified significantly in order to be applicable to programs developed using
iterative enhancement.
The second approach attempts to predict the reliability of a program on the
basis of its behavior for a sample of points taken from its input domain. These
software reliability models are applicable during the validation phase (Ramamoor-
thy and Bastani, 1982; TRW, 1976). Errors found during this phase are not
corrected. In fact, if errors are discovered the software may be rejected. The size
of the sample required for a given confidence in the reliability estimate can be
reduced by using some knowledge about the relationship between different points
in the input domain. However, general modelling of the nature of the input
domain results in mathematically intractable derivations.
The third method which can be used to estimate software reliability is based
on error seeding (Mills, 1973; Schick and Wolverton, 1978). In this approach the
program is seeded with artificial errors without the knowledge of the team
responsible for testing and debugging the software. At the conclusion of the
testing and debugging phase, the correctness of the program is estimated by
comparing the number of artificial and actual errors found by the test team.
The rest of this paper is organized as follows: Section 2 defines software
reliability and classifies some of the models which have been proposed over the
past several years. Section 3 discusses the concept of error size and testing
process. It states the assumptions of software reliability growth models and
reviews error-counting and non-error-counting models. Section 4 discusses the
measurement of software reliability/correctness using Nelson's model (TRW,
1976) and an input domain based model (Ramamoorthy and Bastani, 1979).
Section 5 summarizes the paper and outlines some research issues in this area.

2. Definition and classification

In this section we first give a formal definition of software reliability and then
present a classification of the models proposed for estimating the reliability of a
program.

2.1. Definition
Software reliability has been defined as the probability that a software fault
which causes deviation from the required output by more than the specified
tolerances, in a specified environment, does not occur during a specified exposure
period (TRW, 1976). Thus, the software needs to be correct only for inputs for
which it is designed (specified environment). Also, if the output is correct within
the specified tolerances in spite of an error, then the error is ignored. This may
happen in the evaluation of complicated floating point expressions where many
approximations are used (e.g., polynomial approximations for cosine, sine, etc.).
It is possible that a failure may be due to errors in the compiler, operating
system, microcode or even the hardware. These failures are ignored in estimating
the reliability of the application program. However, the estimation of the overall
system reliability will include the correctness of the supporting software and the
reliability of the hardware.
In some cases it may be desirable to classify software faults into several
categories, ranging from trivial errors (e.g., minor misspellings on a hardcopy
output) to catastrophic errors (e.g., resulting in total loss of control). Then, one
could specify different reliability requirements for the various types of faults. Most
software reliability models can be easily adapted for errors in a given class by
merely ignoring other types of errors when using the model. However, this
decreases the confidence in the reliability estimate since the sample size available
for estimating the parameters of the model is reduced.
The exposure period should be independent of extraneous factors like machine
execution time, programming environment, etc. For many applications the appro-
priate unit of exposure period is a run corresponding to the selection of a point
from the input domain (specified environment) of the program. However, for some
programs (e.g., an operating system), it is difficult to determine what constitutes
a 'run'. In such cases, the unit of exposure period is time. One has to be careful
in measuring time in these cases (Musa, 1975). For example, if a multiuser,
interactive data base system is being accessed by five users, should the exposure
period be five times the observed time? This may be reasonable if the system is
not saturated since then five users are likely to generate approximately five times
as much work in the observed time as would a single user. However, this is not
true if the system is saturated.
Thus, we have:

(1) R(i) = reliability over i runs = P{no failure over i runs}


or
(2) R(t) = reliability over t seconds = P{no failure in interval [0, t)}.

(P{E} denotes the probability of the event E.)

Definition (1) leads to an intuitive measure of software reliability. Assuming
that inputs are selected independently according to some probability distribution
function, we have:

$$R(i) = [R(1)]^i = R^i,$$

where $R = R(1)$. We can define the reliability, $R$, as follows:

$$R = 1 - \lim_{n \to \infty} \frac{n_f}{n},$$

where $n$ = number of runs and $n_f$ = number of failures in $n$ runs.


This is the operational definition of software reliability. We can estimate the
reliability of a program by observing the outcomes (success/failure) of a number
of runs under its operating environment. If we observe $n_f$ failures out of $n$ runs,
the estimate of $R$, denoted by $\hat{R}$, is:

$$\hat{R} = 1 - \frac{n_f}{n}.$$

This method of estimating $R$ is the basis of the Nelson model (TRW, 1976).
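
As a minimal illustration (ours, not from the TRW report), the operational
estimate and an approximate confidence interval can be computed directly from
the observed run outcomes. The normal-approximation binomial interval used
below is one standard choice, assumed for the sketch:

```python
import math

def estimate_reliability(outcomes, z=1.96):
    """Operational estimate R_hat = 1 - n_f / n from run outcomes
    (True = success, False = failure), with an approximate binomial
    confidence interval on R (normal approximation, assumed here)."""
    n = len(outcomes)
    n_f = sum(1 for ok in outcomes if not ok)
    r_hat = 1.0 - n_f / n
    half_width = z * math.sqrt(r_hat * (1.0 - r_hat) / n)
    return r_hat, (max(0.0, r_hat - half_width), min(1.0, r_hat + half_width))

# 1000 runs with 3 failures gives R_hat = 0.997 and its ~95% interval.
runs = [True] * 997 + [False] * 3
print(estimate_reliability(runs))
```

The width of the interval shrinks only as the square root of n, which is why a
high-confidence demonstration of high reliability requires very many runs.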

2.2. Classification
In this subsection we present a classification of some of the software reliability
models proposed over the past fifteen years. The classification scheme is based
on the three different methods of estimating software reliability discussed in Section 1.
The main features of a model serve as a subclassification.
After a program has been coded, it enters a testing and debugging phase.
During this phase, the implemented software is tested till an error is detected.
Then the error is located and corrected. The error history of the program is defined
to be the realization of a sequence of random variables $T_1, T_2, \ldots, T_n$, where $T_i$
denotes the time spent in testing the program after the $(i-1)$-th error was
corrected till the i-th error is detected. One class of software reliability models
attempts to predict the reliability of a program on the basis of its error history.
It is frequently assumed that the correction of errors does not introduce any new
errors. Hence, the reliability of the program increases, and therefore such models
are called software reliability growth models.
Software reliability growth models can be further classified according to whether
they express the reliability in terms of the number of errors remaining in the
program or not. These constitute error-counting and nonerror-counting models,
respectively.
Error-counting models estimate both the number of errors remaining in the
program as well as its reliability. Both deterministic and stochastic models have
been proposed. Deterministic models assume that if the model parameters are
known then the correction of an error results in a known increase in the reliability.
This category includes the Jelinski-Moranda (1972), Shooman (1972), Musa
(1975), and Schick-Wolverton (1978) models. The general Poisson model (Angus
et al., 1980) is a generalization of these four models. Stochastic models include
Littlewood's Bayesian model (Littlewood, 1980a) which models the (usual) case
where larger errors are detected earlier than smaller errors, and the Goel-
Okumoto Nonhomogeneous Poisson Process Model (NHPP) (Goel and Okumoto,
1979a) which assumes that the number of faults to be detected is a random
variable whose observed value depends on the test and other environmental
factors. Extensions to the Goel-Okumoto NHPP model have been proposed by
Ohba (1984) and Yamada et al. (Yamada et al., 1983; Yamada and Osaki, 1985).
The number of errors remaining in the program is useful in estimating the
maintenance cost. However, with these models it is difficult to incorporate the
case where new errors may be introduced in the program as a result of imperfect
debugging. Further, for some of these models the reliability estimate is unstable
if the estimate of the number of remaining errors is low (Forman and Sing-
purwalla, 1977; Littlewood and Verrall, 1980b).
Nonerror-counting models only estimate the reliability of the software. The
Jelinski-Moranda geometric de-eutrophication model (Moranda, 1975) and a
simple model used in the Halden project (Dahl and Lahti, 1978) are deterministic
models in this category. Stochastic models consider the situation where different
errors have different effects on the failure rate of the program. The correction of
an error results in a stochastic increase in the reliability. Examples include a
stochastic input domain based model (Ramamoorthy and Bastani, 1980), Littlewood and Verrall's
Bayesian model (Littlewood and Verrall, 1973), and the Musa-Okumoto loga-
rithmic model (Musa and Okumoto, 1984).
All the models described above treat the program as a black box. That is, the
reliability is estimated without regard to the structure of the program. The validity
of their assumptions usually increases as the size of the program increases. Since
programs for critical control systems may be of medium size only, these models
are mainly used to obtain a preliminary estimate of the software reliability.
Several variants of software reliability growth models can be obtained by con-
sidering various orthogonal factors such as (1) the development of calendar time
expressions for predictions of MTTF, stopping time, etc. (Musa, 1975; Musa and
Okumoto, 1984); (2) the consideration of the time spent in locating and correcting
errors; this aspect is modelled as a Markov process by Trivedi and Shooman
(1975); and, (3) the possibility of imperfect debugging, including the introduction
of new errors (Goel and Okumoto, 1979b).
The second class of software reliability models, called sampling models, estimates
the reliability of a program on the basis of its behavior for a set of points selected
from its input domain. These models are especially attractive for estimating the
reliability of programs developed for critical applications, such as air-traffic con-
trol programs, which must be shown to have a high reliability prior to actual use.
At the end of the testing and debugging phase, the software is subjected to a large
amount of testing in order to assess its reliability. Errors found during this phase
are not corrected. In fact, if errors are discovered then the software may be
rejected.
One sampling model is the Nelson model developed at TRW (1976). It assumes
that the software is tested with test cases having the same distribution as the
actual operating environment. The operational definition discussed earlier is used
to obtain the reliability estimate.
The only disadvantage of the Nelson model is that a large number of test cases
is required in order to have a high confidence in the reliability estimate. The
approach developed in (Ramamoorthy and Bastani, 1979) reduces the number of
test cases by exploiting the nature of the input domain of the program. An
important feature of this model is that the testing need not be random: any type
of test-selection strategy can be used. However, the model is difficult mathemati-
cally and difficult to validate experimentally.
The third approach to assessing software reliability is to insert several known
errors into the program prior to the testing and debugging phase. At the end of
this phase the number of errors remaining in the program can be computed on
the basis of the number of known and unknown errors detected. Models based
on this approach have been proposed by Mills and Basin (Mills and Basin, 1973;
Schick and Wolverton, 1978) and, more recently, by Duran and Wiorkowski
(1981). The major problem is that it is difficult to select errors which have the
same distribution (such as ease of detectability) as the actual errors in the
program. An alternate approach is to let two different teams independently debug
a program and then estimate the number of errors remaining in the program on
the basis of the number of common and disjoint errors found by them. Besides
the extra cost, this method may underestimate the number of errors remaining in
the program since many errors are easy to detect and, hence, are more likely to
be detected by both the teams. DeMillo, Lipton and Sayward (1978) discuss a
related technique called 'program mutation' for systematically seeding errors into
a program.
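
The arithmetic behind error seeding is a capture-recapture estimate. The sketch
below is an illustrative reconstruction, not code from the sources cited; it uses
the usual Mills-type estimator, in which s seeded errors and detection of k of
them alongside n indigenous errors give the estimate N_hat = n s / k:

```python
def seeded_error_estimate(seeded, seeded_found, indigenous_found):
    """Mills-type capture-recapture estimate of indigenous error content.

    seeded           -- artificial errors inserted (s)
    seeded_found     -- seeded errors detected during testing (k)
    indigenous_found -- real errors detected during testing (n)
    """
    if seeded_found == 0:
        raise ValueError("no seeded errors found; estimate undefined")
    n_hat = indigenous_found * seeded / seeded_found
    return n_hat, n_hat - indigenous_found   # total and remaining real errors

# 20 errors seeded; testing finds 16 of them plus 45 real errors:
# about 56 real errors in total, so about 11 still remain.
print(seeded_error_estimate(20, 16, 45))
```

The estimate is only as good as the assumption, criticized above, that seeded and
actual errors are equally easy to detect.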
In this section we have classified many software reliability models without
describing them in detail. References (Bologna and Ehrenberger, 1978; Dahl and
Lahti, 1978; Schick and Wolverton, 1978; Tal, 1976; Ramamoorthy and Bastani,
1982; Goel and Okumoto, 1985) contain a detailed survey of most of these
models. In the next two sections we discuss a few software reliability growth
models and sampling models, respectively.

3. Software reliability growth models

In this section we first discuss the concepts of error size and testing process.
We develop a general framework for software reliability growth models using these
concepts. Then we briefly discuss some error-counting and nonerror-counting
models. The section concludes with a discussion on the practical application of
such models.

3.1. Error sizes

A program P maps its input domain, I, into its output space, O. Each element
in I is mapped to a unique element in O if we assume that the state variables (i.e.,
output variables whose values are used during the next run, as in process control
software) are considered a part of both I and O. Software reliability models used
during the development phase are intimately concerned with the size of an error.
This is defined as follows:

DEFINITION. The size of an error is the probability that an element selected
from I according to the test case selection criterion results in failure due to that
error.

An error is easily detected if it has a large size since then it affects many input
elements. Similarly, if it has a small size, then it is relatively more difficult to
detect the error. The size of an error depends on the way the inputs are selected.
Good test case selection strategies, like boundary value testing, path testing and
range testing, magnify the size of an error since they exercise error-prone con-
structs. Likewise, the observed (effective) error size is lower if the test cases are
randomly chosen from the input domain.
We can generalize the notion of 'error size' by basing it on the different
methods of observing programs. For example, an error has a large size visually
if it can be easily detected by code reading. Similarly, an error is difficult to detect
by code review if it has a small size (e.g., when only one character is missing).
The development phase is assumed to consist of the following cycle:
(1) The program is tested till an error is found;
(2) The error is corrected and step (1) is repeated.
As we have noted above, the error history of a program depends on the testing
strategy employed, so that the reliability models must consider the testing process
used. This is discussed in the following subsection.

3.2. Testing process


As a simple example of a case where the error history is strongly dependent
on the testing process used, consider a program which has three paths, thus
partitioning the input domain into three disjoint subsets. If each input is con-
sidered as equally likely, then initially errors are frequently detected. As these are
corrected, the interval between error detection increases since fewer errors remain.
If a path is tested 'well' before testing another path, then whenever a switch is
made to a new path the error detection rate increases. Similarly, if we switch from
random testing to boundary value testing, the error detection rate can increase.
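
A small simulation makes this dependence concrete. Everything in the sketch
below (the number of errors per path, the detection probability, the run budget)
is invented for illustration; it merely mimics the test-and-correct cycle of
Section 3.1 under two test-selection rules:

```python
import random

random.seed(1)

# Hypothetical setup: each of three paths starts with 5 errors; a run through
# a path detects (and removes) each remaining error with probability 0.01.
DETECT_PROB = 0.01

def simulate(select_path, runs=3000, window=500):
    remaining = [5, 5, 5]              # errors left in each path
    counts = []                        # detections per window of runs
    for i in range(runs):
        if i % window == 0:
            counts.append(0)
        p = select_path(i)
        # probability this run exposes at least one of the path's errors
        if random.random() < 1 - (1 - DETECT_PROB) ** remaining[p]:
            remaining[p] -= 1          # error corrected (step (2) of the cycle)
            counts[-1] += 1
    return counts

uniform = simulate(lambda i: random.randrange(3))      # inputs equally likely
path_by_path = simulate(lambda i: min(i // 1000, 2))   # one path 'well' at a time

print("uniform:     ", uniform)       # detections cluster early, then taper off
print("path-by-path:", path_by_path)  # rate jumps back up at each path switch
```

The error history, and hence any estimate built on it, differs between the two
rules even though the program and its errors are identical.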
The major assumption of all software reliability growth models is:

ASSUMPTION. Inputs are selected randomly and independently from the input
domain according to the operational distribution.

This is a very strong assumption and will not hold in general, especially so in
the case of process control software where successive inputs are correlated in time
during system operation. For example, if an input corresponds to a temperature
reading then it cannot change very rapidly. To complicate the issue further, most
process control software systems maintain a history of the input variables. The
input to the program is not only the current sensor inputs, but also their history.
This further reduces the validity of the above assumption. The assumption is
necessary in order to keep the analysis and data requirements simple. However,
it is possible to relax it as follows:

ASSUMPTION. Inputs are selected randomly and independently from the input
domain according to some probability distribution (which can change with time).
This means that the effective error size varies with time even though the
program is not changed. This permits a straightforward modelling of the testing
process as discussed in the following subsection.

3.3. General growth model


Let
$j$ = number of failures experienced;
$k$ = number of runs since the j-th failure;
$T_j(k)$ = testing process for the k-th run after j failures;
$V_j(k)$ = size of residual errors for the k-th run after j failures; this can be random.
Now,

$$P\{\text{success on the } k\text{-th run} \mid j \text{ failures}\} = 1 - V_j(k) = 1 - f(T_j(k))\,\lambda_j,$$

where $\lambda_j$ = error size under operational inputs (this can be a random variable),
$0 \le \lambda_j \le 1$; and $f(T_j(k))$ = severity of the testing process relative to the operational
inputs, $0 \le f(T_j(k)) \le 1/\lambda_j$.
Hence,

$$R_j(k \mid \lambda_j) = P\{\text{no failure over } k \text{ runs} \mid \lambda_j\} = \prod_{i=1}^{k} P\{\text{no failure on the } i\text{-th run} \mid \lambda_j\},$$

since successive test cases have independent failure probability. Hence,

$$R_j(k \mid \lambda_j) = \prod_{i=1}^{k} \left[1 - f(T_j(i))\,\lambda_j\right],$$

i.e.,

$$R_j(k) = E_{\lambda_j}\!\left[\prod_{i=1}^{k} \left[1 - f(T_j(i))\,\lambda_j\right]\right],$$

where $E_{\lambda_j}[\cdot]$ is the expectation over $\lambda_j$.


For cases where it is difficult to identify 'runs', such as operating systems and
real-time process control systems, it is simpler to work in continuous time. The
above relation becomes:

$$R_j(t) = E_{\lambda_j}\!\left[e^{-\lambda_j \int_0^t f(T_j(s))\,ds}\right],$$

where
$\lambda_j$ = failure rate after the j-th failure, $0 \le \lambda_j \le \infty$;
$T_j(s)$ = testing process at time $s$ after the j-th failure;
$f(T_j(s))$ = severity of the testing process relative to the operational distribution, $0 \le f(T_j(s)) \le \infty$.
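
As an illustration of how the discrete-time relation can be evaluated, the
expectation over $\lambda_j$ can be approximated by Monte Carlo. The sketch below
is ours, under assumed forms: a beta distribution for the error size and a
constant testing severity, neither of which comes from the chapter:

```python
import random

random.seed(0)

def growth_model_reliability(k, severity, lam_draws):
    """Monte Carlo estimate of R_j(k) = E_lambda prod_{i=1}^{k} [1 - f(T_j(i)) lambda],
    with the severity f(T_j(i)) held constant across runs (an assumption)."""
    total = 0.0
    for lam in lam_draws:
        prob = 1.0
        for _ in range(k):
            prob *= 1.0 - severity * lam
        total += prob
    return total / len(lam_draws)

# Assumed: residual error size lambda_j ~ Beta(1, 50); operational severity f = 1.
draws = [random.betavariate(1, 50) for _ in range(10000)]
print(growth_model_reliability(k=20, severity=1.0, lam_draws=draws))
```

Averaging over draws of $\lambda_j$ is exactly the expectation $E_{\lambda_j}[\cdot]$ in the
formula above.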

REMARKS. (1) As we have noted above, $f(T_j(\cdot))$ is the severity of the testing
process relative to the operational distribution, where the testing severity is the
ratio of the probability that a run based on the test case selection strategy detects
an error to the probability that a failure occurs on a run selected according to the
operational distribution. Obviously, during the operational phase, $f(T_j(\cdot)) = 1$. In
general it is difficult to determine the severity of the test cases, and most models
assume that $f(T_j(\cdot)) = 1$. However, for some testing strategies we can quantify
$f(T_j(\cdot))$. For example, in functional testing, the severity increases as we switch to
new functions since these are more likely to contain errors than functions which
have already been tested.
(2) Even the weaker assumption is difficult to justify for programs developed
using incremental top-down or bottom-up integration (Myers, 1978), since the
input domain keeps on changing. Further, the assumption ignores other methods
of debugging programs, such as code reviews, static analysis, program proofs, etc.
(3) In the continuous case, the time is the CPU time (Musa, 1975).
(4) Software reliability growth models can be applied (in principle) to any type
of software. However, their validity increases as the size of the software and the
number of programmers involved increases.
(5) This process is a type of doubly stochastic process; these processes were
originally studied by Cox in 1955 (Cox, 1966).

3.4. Error-counting models


These models attempt to estimate the software reliability in terms of the esti-
mated number of errors remaining in the program. The Jelinski-Moranda model
(1972) was the first error-counting model. The Shooman model (1972) underwent
some changes and is now similar to the Jelinski-Moranda model. The
Schick-Wolverton model (1978) extended the Jelinski-Moranda model by incor-
porating a factor representing the severity of the test cases. The Musa model
(1975) is equivalent to the Jelinski-Moranda model. However, it is better devel-
oped and is the first model to insist on execution time data rather than the
calendar time data used in the earlier models. These early models assumed that
all the errors had the same error rate. This is clearly unsatisfactory since one
would expect that errors which are detected later should have smaller (opera-
tional) error rates than those which are detected earlier. This is rectified by
Littlewood's model (1980a) which incorporates the case where the failure rate of
successive errors is stochastically decreasing. The Goel-Okumoto NHPP model
(1979a) makes another departure from the other models by treating the number
of faults to be detected as a random variable instead of a fixed unknown constant.
Two additional assumptions made by most error-counting models are:
(a) The failure rates of the errors remaining in the program are independently
identically distributed random variables.
(b) The program failure rate is the sum of the individual failure rates.
Taken together these assumptions are not true in general since the error dis-
tribution across modules is often skewed (Myers, 1978), so that a few complex,
error-prone modules contain a large proportion of the errors. Since there is likely
to be a considerable overlap in the elements (in the input domain) affected by
such closely related errors, the removal of each error, except the last error,
decreases the failure rate by less than its own failure rate. This can result in an
incorrect estimate of the reliability of the program since each detected error would
be perceived as having a failure rate smaller than its actual failure rate. Further,
a common testing strategy is to direct subsequent test cases at the module in
which an error was most recently detected till sufficient confidence is restored in
its correctness. However, this would mean that the failure rates are no longer
independently identically distributed.
In order to illustrate models in this category, we now present the details of the
general Poisson model (GPM) discussed in (Angus et al., 1980). It generalizes the
Jelinski-Moranda linear de-eutrophication model, the Shooman model, and the
Schick-Wolverton model. The key parts of the Musa model are also generalized
by this model.
The inputs to the model are (1) $t_1, t_2, \ldots, t_n$, where $t_j$ is the time required to
detect the j-th failure after the error(s) causing the (j-1)-th failure has (have)
been corrected, and (2) $m_1, m_2, \ldots, m_n$, where $m_j$ is the number of errors fixed
as a result of the j-th failure.
The GPM model assumes that

$$f(T_j(s)) = \alpha s^{\alpha - 1}, \qquad \lambda_j = (N - M_j)\,\phi,$$

where $N$ is the number of errors originally present, $M_j = \sum_{i=1}^{j} m_i$, and $\alpha$, $\phi$ are
constants. Hence

$$R_j(t) = e^{-\phi (N - M_j)\, t^{\alpha}}.$$

The assumptions of the GPM model are as follows:
(1) consecutive inputs have independent failure probabilities,
(2) all errors have the same disjoint failure rate $\phi$,
(3) the severity of the testing process is proportional to a power of the elapsed
CPU time,
(4) no new errors are introduced.
Assumption (1) has already been discussed above. Assumption (2) is a major
drawback of these models (Littlewood, 1980a): earlier errors are likely to have a
larger failure rate since they are detected more easily. Assumption (3) depends to
a large extent on the testing strategy used. Intuitively, as time increases, the
severity of the testing increases (Schick and Wolverton, 1978). Assumption (4) is
not true in general and can lead to invalid estimates (Angus et al., 1980). Musa
(1975) partly overcomes this by estimating the total number of errors to be
eventually detected.
The Maximum Likelihood Estimates (MLE) for the parameters of the model
can be derived as follows:

$$\text{failure PDF}_j(t) = -\frac{dR_j(t)}{dt} = \phi (N - M_j)\,\alpha\, t^{\alpha-1} e^{-\phi (N - M_j) t^{\alpha}}.$$

The likelihood function is

$$L = \prod_{j=1}^{n} \text{PDF}_{j-1}(t_j).$$

Hence, the log likelihood function is

$$\log L = n \log \phi + n \log \alpha + \sum_{j=1}^{n} \log(N - M_{j-1}) + (\alpha - 1) \sum_{j=1}^{n} \log t_j - \phi \sum_{j=1}^{n} (N - M_{j-1})\, t_j^{\alpha}.$$

The MLE's can be computed by numerically solving the equations obtained by
equating the partial derivatives of $\log L$ with respect to $N$, $\alpha$, and $\phi$ to 0. The final
equations are as follows:

$$\sum_{j=1}^{n} \frac{1}{\hat{N} - M_{j-1}} - \hat{\phi} \sum_{j=1}^{n} t_j^{\hat{\alpha}} = 0,$$

$$\frac{n}{\hat{\alpha}} + \sum_{j=1}^{n} \log t_j - \hat{\phi} \sum_{j=1}^{n} (\hat{N} - M_{j-1})\, t_j^{\hat{\alpha}} \log t_j = 0,$$

$$\frac{n}{\hat{\phi}} - \sum_{j=1}^{n} (\hat{N} - M_{j-1})\, t_j^{\hat{\alpha}} = 0.$$

These are discussed further in (Angus et al., 1980).
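
These three equations can be handed directly to a numerical root-finder. The
sketch below is a minimal illustration using scipy; the inter-failure times, the
errors fixed per failure, and the starting point are all invented, and none of the
code is taken from Angus et al. (1980):

```python
import numpy as np
from scipy.optimize import fsolve

# Illustrative data: inter-failure times t_j and errors fixed per failure m_j.
t = np.array([10.0, 12.0, 15.0, 21.0, 26.0, 34.0, 51.0, 75.0])
m = np.ones_like(t)                                  # one error fixed each time
M_prev = np.concatenate(([0.0], np.cumsum(m)[:-1]))  # M_{j-1}
n = len(t)

def gpm_equations(params):
    """Partial derivatives of log L with respect to N, alpha, phi, set to zero."""
    N, alpha, phi = params
    ta = t ** alpha
    return [
        np.sum(1.0 / (N - M_prev)) - phi * np.sum(ta),
        n / alpha + np.sum(np.log(t)) - phi * np.sum((N - M_prev) * ta * np.log(t)),
        n / phi - np.sum((N - M_prev) * ta),
    ]

# The starting point matters; N must exceed the number of errors already fixed.
N_hat, alpha_hat, phi_hat = fsolve(gpm_equations, x0=[12.0, 1.0, 0.005])
print(N_hat, alpha_hat, phi_hat)
```

With the estimates in hand, R_j(t) = exp(-phi_hat (N_hat - M_j) t^alpha_hat)
gives the predicted reliability after the j-th fix.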

3.5. Nonerror-counting models


These models only estimate the reliability of the software. They consider the
effect of a debugging action on the error size or on the failure rate without concern
as to the number of errors detected at a time. For example, in the Jelinski-
Moranda Geometric De-eutrophication model, we have

$$\lambda_j = \frac{\lambda_{j-1}}{D},$$

where $\lambda_j$ is the error rate and $D$ is a constant to be estimated. An interesting
observation is that the estimate of the parameters of this model may exist even
in cases where those of the linear de-eutrophication model do not exist, i.e., fail
to converge (Dahl and Lahti, 1978; Tal, 1976). Similarly, for the Littlewood-
Verrall Bayesian model (1973) we have

$$\lambda_j \sim \text{gamma}(\alpha, \psi(j)).$$
This models the case where there is a possibility that a debugging action may
introduce new errors into the program. For the stochastic input domain based
model (Ramamoorthy and Bastani, 1980) we have:

$$\lambda_{j-1} - \lambda_j = \lambda_{j-1} X,$$

where $\lambda_j$ is the error size and $X$ is a random variable having a piecewise con-
tinuous distribution. This models the case where errors detected later have
(stochastically) smaller sizes than those detected earlier.
In order to illustrate models in this category, we present details of the Musa-
Okumoto Logarithmic model (1984). The inputs to the model are $t_1, t_2, \ldots, t_n$,
where $t_j$ is the time (not interval) at which the j-th error was detected. In this
model:

$$f(T_j(s)) = 1, \qquad \lambda(t) = \frac{\lambda_0}{\lambda_0 \theta t + 1}.$$

Thus, the model assumes that the failure rate decreases continuously over the
testing and debugging phase, rather than at discrete points corresponding to error
correction times. Further, the rate of decrease in $\lambda(t)$ itself decreases with time,
thus modelling the decrease in the size of errors detected as debugging proceeds:

$$R_j(t) = e^{-\int_{t_j}^{t_j+t} \lambda(s)\,ds} = \left\{\frac{\lambda_0 \theta t_j + 1}{\lambda_0 \theta (t_j + t) + 1}\right\}^{1/\theta}.$$

From this, the failure probability density function is

$$\text{failure PDF}_j(t) = \lambda(t_j + t)\, e^{-\int_{t_j}^{t_j+t} \lambda(s)\,ds}.$$

Hence,

$$L = \left\{\prod_{j=1}^{n} \lambda(t_j)\right\} e^{-\int_0^{t_n} \lambda(s)\,ds}.$$

Taking the logarithm of the likelihood function, we get

$$\log L = n \log \lambda_0 - \sum_{j=1}^{n} \log(\lambda_0 \theta t_j + 1) - \frac{1}{\theta} \log(\lambda_0 \theta t_n + 1).$$

Setting the derivatives of $\log L$ with respect to $\lambda_0$ and $\theta$ to 0 yields two equations
which can be solved numerically for the maximum likelihood estimates of $\lambda_0$ and
$\theta$, i.e., $\hat{\lambda}_0$ and $\hat{\theta}$:

$$\frac{n}{\hat{\lambda}_0} - \sum_{j=1}^{n} \frac{\hat{\theta} t_j}{\hat{\lambda}_0 \hat{\theta} t_j + 1} - \frac{t_n}{\hat{\lambda}_0 \hat{\theta} t_n + 1} = 0,$$

$$-\sum_{j=1}^{n} \frac{\hat{\lambda}_0 t_j}{\hat{\lambda}_0 \hat{\theta} t_j + 1} + \frac{1}{\hat{\theta}^2} \log(\hat{\lambda}_0 \hat{\theta} t_n + 1) - \frac{\hat{\lambda}_0 t_n}{\hat{\theta}(\hat{\lambda}_0 \hat{\theta} t_n + 1)} = 0.$$

Experience has shown that this model is more accurate than the earlier model
proposed by Musa (1975). Further discussions concerning the application of the
new model appear in (Musa and Okumoto, 1984).
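
The two likelihood equations can be solved numerically in the same way as for
the GPM above. The sketch below is ours, with invented detection times; it is not
code from Musa and Okumoto:

```python
import numpy as np
from scipy.optimize import fsolve

# Illustrative data: cumulative failure detection times t_1 < t_2 < ... < t_n.
t = np.array([12.0, 30.0, 55.0, 90.0, 140.0, 210.0, 320.0, 480.0])
n, t_n = len(t), t[-1]

def mo_equations(params):
    """The two MLE equations of the logarithmic model, set to zero."""
    lam0, theta = params
    denom = lam0 * theta * t + 1.0
    eq1 = n / lam0 - np.sum(theta * t / denom) - t_n / (lam0 * theta * t_n + 1.0)
    eq2 = (-np.sum(lam0 * t / denom)
           + np.log(lam0 * theta * t_n + 1.0) / theta ** 2
           - lam0 * t_n / (theta * (lam0 * theta * t_n + 1.0)))
    return [eq1, eq2]

lam0_hat, theta_hat = fsolve(mo_equations, x0=[0.1, 0.05])
# Current failure rate after t_n hours of testing follows directly:
lam_now = lam0_hat / (lam0_hat * theta_hat * t_n + 1.0)
print(lam0_hat, theta_hat, lam_now)
```

R_j(t) then follows from the closed form given at the start of this subsection.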

3.6. Summary
We can view $\lambda$ as a random walk process in the interval $(0, c)$. Each time the
program is changed (due to error corrections or other modifications) $\lambda$ changes.
In the formulation of the general model, $\lambda_j$ denotes the state of $\lambda$ after the j-th
change to the program. Let $Z_j$ denote the time between failures after the j-th
change. $Z_j$ is a random variable whose distribution depends on $\lambda_j$. In all the above
continuous (discrete) time models, we have assumed that this distribution is the
exponential (geometric) distribution with parameter $\lambda_j$, provided that $f(T_j(\cdot)) = 1$.
We do not know anything about the random walk process of $\lambda$ other than a
sample of times between failures. Hence, one approach is to construct a model for
$\lambda$ and fit the parameters of the model to the sample data. Then we assume that
the future behavior of $\lambda$ can be predicted from the behavior of the model.
Some of the models for $\lambda$ which have been developed are as follows:
General Poisson Model (Angus et al., 1980): The set of possible states is
$(0, c/N, 2c/N, \ldots, c)$; $\lambda_j = (N - j)\,c/N$; the parameters are $c$ and $N$; there is a finite
number of states.
Geometric De-Eutrophication Model (Moranda, 1975): The set of possible states is
$(c, cd, cd^2, cd^3, \ldots)$, where $d < 1$; $\lambda_j = c d^j$; the parameters are $c$ and $d$; there is
an infinite (although countable) number of states.
Stochastic (Input Domain) Model (Ramamoorthy and Bastani, 1980): The state is
continuous over the interval $(0, c)$; $\lambda_j = \lambda_{j-1} - \Delta_j$, where $\Delta_j \sim \lambda_{j-1} X$, $X \sim \beta(r, s)$;
the parameters are $r$ and $s$.
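
The three views of $\lambda$ are easy to compare by simulation. In the sketch below
every parameter value (c, N, d, r, s) is invented for illustration:

```python
import random

random.seed(2)

c, N = 1.0, 10        # assumed maximum failure rate and initial error count
d = 0.8               # ratio of the geometric model
r, s = 1.0, 4.0       # beta parameters of the stochastic model

def gpm_state(j):             # General Poisson Model: finitely many states
    return (N - j) * c / N

def geometric_state(j):       # Geometric De-Eutrophication: countably many states
    return c * d ** j

def stochastic_trajectory(steps):   # Stochastic model: continuous state space
    lam, path = c, [c]
    for _ in range(steps):
        lam -= lam * random.betavariate(r, s)   # lambda_j = lambda_{j-1}(1 - X)
        path.append(lam)
    return path

print("GPM:       ", [round(gpm_state(j), 3) for j in range(6)])
print("geometric: ", [round(geometric_state(j), 3) for j in range(6)])
print("stochastic:", [round(x, 3) for x in stochastic_trajectory(5)])
```

Only the third process is random; the first two are deterministic once their
parameters are fixed.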
An alternative approach is the Bayesian approach advocated by Littlewood
(1979). In this method, we postulate a prior distribution for each of $\lambda_1, \lambda_2, \ldots, \lambda_j$.
Then, based on the sample data, we compute the posterior distribution of $\lambda_{j+1}$.
Some additional discussions appear in (Ramamoorthy and Bastani, 1980).
Over 50 different software reliability growth models have been proposed so far.
These models yield widely varying predictions for the same set of failure data
(Abdel-Ghaly et al., 1986). Further, any given model gives reasonable predictions
for one set of data and incorrect predictions for other sets of data. This has led
some researchers to propose that for each project several models should be used
and then goodness-of-fit tests should be performed prior to selecting a model that
is valid for the given set of failure data (Goel, 1985; Abdel-Ghaly et al., 1986).

A basic problem with all software reliability growth models is that their assump-
tion that errors are detected as a result of random testing is not true for modern
software development methods. Models which have been validated using data
gathered over a decade ago are not necessarily valid for current projects that use
more systematic methods and tools. As an analogy, consider the task of reviewing
a technical paper. There are (at least) three major types of errors which can creep
into a manuscript. These are (1) spelling, typographical, and other context inde-
pendent errors, (2) grammatical, organization, style, and other context dependent
errors, and (3) correctness of equations, significance of the contribution, and other
technical errors. Context dependent errors can be detected by random testing (i.e.,
by selecting anyone familiar with the language to review the paper) while three
carefully selected referees are vastly superior to a thousand randomly selected
referees in their ability to detect technical errors. Also, the failure process
observed when all the errors are detected by human beings (testing) is different
from that observed when automated tools such as spelling and grammar checkers
are used. Similarly, in software development we now have tools that can detect
most context independent errors (syntax errors, incorrect procedure calls, etc.)
and context dependent errors (undefined variables, invalid pointers, inaccessible
code segments, etc.). These tools include strongly typed languages and their
compilers, data flow analyzers, etc. The remaining errors are generally the result
of misunderstanding of specifications. These are best detected by formal code
review and walk-through, simulation, verification where possible, and systematic
testing which can be either incremental bottom-up or top-down and which
emphasizes error prone regions of the input domain, such as boundary and
special value points. Again, the failure process when these methods are used is
completely different from that obtained when only random testing is used.
In summary, software reliability growth models treat the program as a black
box. That is, the reliability is estimated without regard to the structure of the
program, number of procedures which have been formally proved/derived, etc.
Their assumption of random testing generally does not hold for
modern program development methods. Experience shows that with systematic
validation techniques, errors are initially detected in quick succession with an
abrupt transition to an (almost) error free state. Thus, these models can only be
used for obtaining an approximate estimate of the reliability of programs.

4. Sampling models

Software developed for critical applications, like air-traffic control, must be
shown to have a high reliability prior to actual use. Since the possibility of
specification errors exists, program testing must be used in addition to program
proofs. At the end of the development phase, the software is subjected to a large
amount of testing in order to estimate its reliability. Errors found during this
phase are not corrected. In fact, if errors are discovered the software may be
rejected (Ramamoorthy, 1979).

In this section we discuss methods of measuring the reliability of a program
based on the sample selected. We first discuss Nelson's method (MacWilliams,
1973; Nelson, 1978; TRW, 1976) and then a model for estimating the correctness
probability of a program based on its input domain.

4.1. The Nelson model


This model (TRW, 1976) is based on the operational definition of software
reliability given earlier. It is the only model whose theoretical foundations are
sound. However, it suffers from a number of practical drawbacks:
(1) In order to have a high confidence in the reliability estimate, a large number
of test cases must be used.
(2) It does not take into account 'continuity' in the input domain. For example,
if the program is correct for a given test case, then it is likely that it is correct
for all test cases executing the same sequence of statements.
(3) It assumes random sampling of the input domain. Thus, it cannot take
advantage of testing strategies which have a higher probability of detecting errors,
e.g., boundary value testing, etc. Further, for most real-time control systems, the
successive inputs are correlated if the inputs are sensor readings of physical
quantities, like temperature, which cannot change rapidly. In these cases we
cannot perform random testing.
(4) It does not consider any complexity measure of the program, e.g., number
of paths, statements, etc. Generally, a complex program should be tested more
than a simple program for the same confidence in the reliability estimate.
In order to overcome these drawbacks, the model has been extended (Nelson,
1978) as follows: The input domain is divided into several equivalence classes.
The division can be based on paths or some other criteria when the number of
paths is too large (e.g., program sub-functions). It is assumed that there is some
continuity among the elements in an equivalence class, i.e., if the program
executes correctly for an input from the j-th equivalence class, then it will execute
correctly for any randomly selected input from the same equivalence class with
probability 1 − b_j, where b_j ≪ 1. Then:

$$R(1) = \sum_{j=1}^{m} P_j (1 - b_j)$$

where m = number of equivalence classes, and P_j = probability of selecting an
input from the j-th equivalence class during actual operation.
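As a small illustration of the extended model, the following sketch evaluates R(1) for a hypothetical operational profile; the P_j and b_j values are assumptions, since the model itself gives no way to assign them (see drawback (1) below).

```python
# Minimal sketch of the extended Nelson estimate R(1) = sum_j P_j (1 - b_j).
P = [0.5, 0.3, 0.2]      # hypothetical operational profile over m = 3 classes
b = [1e-3, 1e-2, 5e-2]   # hypothetical per-class failure probabilities, b_j << 1

R1 = sum(p_j * (1 - b_j) for p_j, b_j in zip(P, b))
print(f"R(1) = {R1:.4f}")   # 0.9865 for these assumed values
```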

DISCUSSION. This model is a big improvement over the original model. Some
comments are:
(1) The assignment of values to b_j is ad hoc; no theoretical justification is given
for the assignment (Nelson, 1978).
(2) The model uses only one type of complexity measure, namely, number of
paths, functions, etc. However, it does not consider the relative complexity of
each path, function, etc.

Many other interesting aspects of the Nelson model are discussed in (TRW,
1976).

4.2. Input domain based model


This model is discussed in detail in (Ramamoorthy and Bastani, 1979). It
removes most of the objections to the Nelson model. The price is the increased
complexity of the model. The model was developed for assessing the quality of
critical real-time process control programs. In such systems no failures should be
detected during the reliability estimation phase, so that the reliability estimate is
one. Hence, the important metric of concern is the confidence in the reliability
estimate. This model provides an estimate of the conditional probability that the
program is correct for all possible inputs given that it is correct for a given set
of inputs. The basic assumption is that the outcome of each test case provides
at least some stochastic information about the behavior of the program for points
which are close to the test point. The model uses the concept of probabilistic
equivalence classes which is defined as follows: E is a probabilistic equivalence
class if E is a subset o f / , where I is the input domain of the program P, and
P is correct for all elements in E, with probability P(X~, . . . , Xa}, if P is correct
for each X,. in E, i = 1. . . . . d. Then, P { I IX) is the correctness probability of P
based on the set of test cases X. (Obviously, the program must be correct for each
element in X.) Probabilistic equivalence classes are derived from the requirements
specification and the program source code in order to minimize control flow
errors. A suggested selection criterion (Ramamoorthy and Bastani, 1979) is:
Let E be a probabilistic equivalence class. X is in E if an error in the program
which affects any element in E can affect X, and vice versa. The results of this
classification scheme are:
(1) It includes all paths without loops since distinct paths differ in at least one
statement.
(2) Multiple conditions are treated separately since an error in one condition
need not affect the other conditions.
(3) Loops are restricted to a finite number of repetitions.
In order to further minimize control flow errors, these classes should be inter-
sected with classes derived from the requirements specification (Weyuker and
Ostrand, 1980). Finally, we can estimate the correctness probability of the pro-
gram using the continuity assumption, namely, closely related points in the input
domain are 'correlated' with respect to the implementation of the function. This
is true in general for algebraic programs where errors usually affect an interval of
nearby points. These regions correspond to high probability equivalence classes,
such as those formed on the basis of program paths. A specific model is devel-
oped in (Ramamoorthy and Bastani, 1979). The main result of this model is
$$P\{\text{program is correct for all points in } [a, a+V] \mid \text{it is correct for test cases having successive distances } x_j,\ j = 1, \ldots, n-1\} = e^{-\lambda V} \prod_{j=1}^{n-1} \frac{2}{1 + e^{-\lambda x_j}}$$

where λ is a parameter which is deduced from some measure of the complexity
of the source code.
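A direct transcription of this result, under the reconstruction of the formula given above, is sketched below; λ, V and the gaps x_j are hypothetical inputs.

```python
import numpy as np

def correctness_probability(lam, V, gaps):
    """Correctness probability for an interval of length V given test cases
    with successive distances `gaps`, per the formula reconstructed above:
    exp(-lam*V) * prod_j 2/(1 + exp(-lam*x_j))."""
    gaps = np.asarray(gaps, dtype=float)
    return np.exp(-lam * V) * np.prod(2.0 / (1.0 + np.exp(-lam * gaps)))

# Hypothetical values: spreading 5 tests evenly over a domain of length 4.
print(correctness_probability(lam=0.5, V=4.0, gaps=[1.0] * 4))
```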

DISCUSSION. The advantages of this model are:


(1) Any test case selection strategy can be used. This will minimize the testing
effort since we can choose test cases which exercise error-prone constructs.
(2) It does not assume random sampling.
(3) It takes into account the complexity of the program: A simple program is
tested less than a complicated program for the same correctness probability. The
model also yields the optimal testing strategy to be used. Specifically, for algebraic
programs the test cases should be spread out over the input domain for higher
correctness probability.
The disadvantages of the model are:
(1) It is relatively expensive to determine the equivalence classes and their
complexity.
(2) Incorporation of more general continuity assumptions (e.g., boundary value
relationships) results in mathematically intractable derivations.

4.3. Summary
The models discussed in this section are especially attractive for medium size
programs whose reliability cannot be accurately estimated by using reliability
growth models. These models also have the advantage of considering the structure
of the program. This enables the joint use of program proving and testing in order
to validate the program and assess its reliability (Long et al., 1977).

5. Conclusion

We first defined software reliability and discussed three methods of measuring
it. Then we developed a general framework for software reliability growth models
using the concept of error size and testing process. We distinguished between
error counting and nonerror counting models. If only the reliability estimate is
required, then the nonerror counting models are preferable since they can model
the debugging process more realistically. Error counting models should be used
when an estimate of the number of remaining errors is needed. This may be
required if resources have to be allocated for the maintenance phase (assuming
that the average resource per error correction is known). It is also possible to
estimate the number of errors remaining in a program by using error seeding
techniques. Finally, we briefly discussed two sampling models, namely, the Nelson
model and its extension and an input domain based model.
At the present time no specific software reliability model has found wide acceptance.
This is partly due to the cost involved in gathering failure data and partly because
of the difficulty in modelling the testing process. In the following, we outline a
method combining well established proof procedures with software reliability esti-
mation methods. It is particularly suitable for critical process control systems.

(1) During the testing and debugging phase at least two different software
reliability growth models should be used, primarily for helping the manager to
make decisions such as when to stop testing, etc. Goodness-of-fit tests should be
performed in order to select the model which is most appropriate for the failure
data obtained from the project.
(2) After the reliability growth models indicate that the reliability objective has
been achieved, a sampling model is used in order to get a more accurate estimate
of the reliability of the program.
(a) At first equivalence classes are determined based on the paths in the
program using the selection criterion discussed in Section 4.2. Boundary value
and range testing are performed in order to ensure that the classes are chosen
properly.
(b) If the path corresponding to each equivalence class can be verified (e.g., by
using symbolic execution) then the correctness probability of the class is 1.
(c) If the correctness of the path cannot be verified, then the degree of the
equivalence class is estimated. Next, as many test cases as necessary are used so
as to achieve a desired confidence in the correctness of the software.
During the first decade of software reliability research the major emphasis was
on developing models based on various assumptions. This resulted in the pro-
liferation of models, most of which were neither used nor validated. Currently the
consensus appears to be that perhaps there is no single model which can be
applied to all types of projects. Hence, one area of active research is to investigate
whether a set of models can be combined so as to achieve more accurate
reliability estimates for various situations. Other research topics include (1) devel-
oping methods of analyzing the confidence in the predictions of a model, and
(2) using software reliability theory to assist with the management of a project
throughout its life cycle.

References

Abdel-Ghaly, A. A., Chan, P. Y. and Littlewood, B. (1986). Evaluation of competing software
reliability predictions. IEEE Trans. Softw. Eng. 12(9).
Angus, J. E., Schafer, R. E. and Sukert, A. (1980). Software reliability model validation. In: Proc.
Annu. Rel. and Maintainability Symp., San Francisco, CA, Jan. 1980, 191-199.
Barlow, R. E. and Proschan, F. (1975). Statistical Theory of Reliability and Life Testing. Holt, Rinehart
and Winston, New York.
Bologna, S. and Ehrenberger, W. (1978). Applicability of statistical models for reactor safety software
verification. Unpublished report.
Cox, D. R. and Lewis, P. A. W. (1966). The Statistical Analysis of Series of Events. Methuen, London.
Dahl, G. and Lahti, J. (1978). Investigation of methods for production and verification of computer
programmes with high requirements for reliability. OECD Halden Reactor Project, Preliminary
Report.
DeMillo, R. A., Lipton, R. J. and Sayward, F. G. (1978). Hints on test data selection: Help for the
practicing programmer. Computer (IEEE), April, 34-41.
Duran, J. W. and Wiorkowski, J. J. (1981). Capture-recapture sampling for estimating software error
content. IEEE Trans. Softw. Eng. 7(1).

Forman, E. H. and Singpurwalla, N. D. (1977). An empirical stopping rule for debugging and testing
computer software. J. Amer. Stat. Ass. 72, 750-757.
Goel, A. L. and Okumoto, K. (1979a). A time-dependent error-detection rate model for software
reliability and other performance measures. IEEE Trans. Rel. 28(3), 206-211.
Goel, A. L. and Okumoto, K. (1979b). A Markovian model for reliability and other performance
measures for software systems. In Proc. Nat. Comput. Conf., New York 48, 767-774.
Goel, A. L. (1985). Software reliability models: Assumptions, limitations, and applicability. IEEE
Trans. Softw. Eng. 11(12), 1411-1423.
Jelinski, Z. and Moranda, P. (1972). Software reliability research. In: W. Freiberger, ed., Statistical
Computer Performance Evaluation. Academic Press, New York, 465-484.
Littlewood, B. and Verrall, J. L. (1973). A Bayesian reliability growth model for computer software.
J. Roy. Stat. Soc. 22(3), 332-346.
Littlewood, B. (1979). How to measure software reliability and how not to... IEEE Trans. Rel. 28,
103-110.
Littlewood, B. (1980a). A Bayesian differential debugging model for software reliability. In: Proc.
COMPSAC '80, Chicago, IL, 511-519.
Littlewood, B. and Verrall, J. L. (1980b). On the likelihood function of a debugging model for
computer software reliability. Dep. Math., City Univ., London.
Long, A. B. et al. (1977). A methodology for the development and validation of critical software for
nuclear power plants. In: Proc. 1st Int. Conf. Comp. Softw. & Appl. (COMPSAC '77), Chicago, IL.
MacWilliams, W. H. (1973). Reliability of large real-time control software systems. In: Rec. 1973
IEEE Symp. Comput. Softw. Rel., New York, 1-6.
Mills, H. D. (1973). On the development of large reliable software. In: Rec. IEEE Symp. Comput.
Softw. Rel., New York, 155-159.
Moranda, P. B. (1975). Prediction of software reliability during debugging. In: Proc. 1975 Annu. Rel.
and Maintainability Symp. Washington, DC, 327-332.
Musa, J. D. (1975). A theory of software reliability and its applications. IEEE Trans. Softw. Eng.
1(3), 312-327.
Musa, J. D. and Okumoto, K. (1984). A logarithmic Poisson execution time model for software
reliability measurement. In: Proc. 7th Int. Conf. Softw. Eng., Orlando, FL, 230-237.
Myers, G. J. (1978). The Art of Software Testing. Wiley, New York.
Nelson, E. (1978). Estimating software reliability from test data. Microelectronics and Reliability 17,
67-74.
Ohba, M. (1984). Software reliability analysis models. IBM J. Res. Develop. 28, 428-443.
Ramamoorthy, C. V. and Bastani, F. B. (1979). An input domain based approach to the quantitative
estimation of software reliability. In: Proc. Taipei Sem. on Softw. Eng., Taipei.
Ramamoorthy, C. V. and Bastani, F. B. (1980). Modelling of the software reliability growth process.
In: Proc. COMPSAC '80, Chicago, IL, 161-169.
Ramamoorthy, C. V. and Bastani, F. B. (1982). Software reliability--Status and perspectives. IEEE
Trans. Softw. Eng. 8(4), 354-371.
Schick, G. J. and Wolverton, R. W. (1978). An analysis of competing software reliability models.
IEEE Trans. Softw. Eng. 4(2), 104-120.
Shooman, M. L. (1972). Probability models for software reliability prediction. In: W. Freiberger, ed.,
Statistical Computer Performance Evaluation. Academic Press, New York, 485-502.
Tal, J. (1976). Development and evaluation of software reliability estimators. UTEC SR 77-013, Univ.
of Utah, Elect. Eng. Dep., Salt Lake City, UT.
Trivedi, A. K. and Shooman, M. L. (1975). A many-state Markov model for the estimation and
prediction of computer software performance parameters. In: Proc. 1975 Int. Conf. Rel. Softw., Los
Angeles, CA, 208-220.
TRW Defense and Space Systems Group (1976). Software Reliability Study. Rep. No. 76-2260.1-9-5,
TRW, Redondo Beach, CA.
Weyuker, E. J. and Ostrand, T. J. (1980). Theories of program testing and the application of revealing
subdomains. IEEE Trans. Softw. Eng. 6(3), 236-246.
Yamada, S., Ohba, M. and Osaki, S. (1983). S-shaped reliability growth modeling for software error
detection. IEEE Trans. Rel. 32, 475-478.
Yamada, S. and Osaki, S. (1985). Software reliability growth modeling: Models and applications.
IEEE Trans. Softw. Eng. 11(12), 1431-1437.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 27-54

Stress-Strength Models for Reliability

Richard A. Johnson

1. Introduction

It is a well accepted fact that the strength of a manufactured unit is a variable
quantity that should be modeled as a random variable. This fact forms the basis
for all of reliability modeling. A second source of variability may also have to be
taken into account. When ascertaining the reliability of equipment or the viability
of a material, it is also necessary to take into account the stress conditions of the
operating environment. That is, uncertainty about the actual environmental stress
to be encountered should be modeled as random. The terminology stress-strength
model makes explicit that both stress and strength are treated as random
variables.
Let X be the stress placed on a unit by its operating environment. In many
applications, X is taken to represent the maximum value attained by a critical kind
of stress. Lloyd and Lipow (1962) describe an application where X is the maxi-
mum chamber pressure generated by the ignition of a solid propellant in a rocket
engine. Kececioglu (1972) discusses a case where a torsion stress is the most
critical type of stress for a rotating steel shaft on a computer. Typically, the stress
variable is the most difficult to model accurately because of the lack of sufficient
data.
In the simplest stress-strength model, X is the stress placed on the unit by the
operating environment and Y is the strength of the unit. A unit is able to perform
its intended function if its strength is greater than the stress imposed upon it. In
this context, we define reliability (R) as

R = probability that the unit performs its task satisfactorily.

That is, reliability is the probability that the unit is strong enough to overcome
the stress.
Let the stress X have continuous distribution F(x) and strength Y have continuous distribution G(y). When X and Y can be treated as independent,

$$R = \int F(y)\,dG(y) = \int [1 - G(x)]\,dF(x) = P[Y > X].$$

This model, first considered by Birnbaum (1956), has found an increasing number
of applications in civil, mechanical and aerospace engineering.
The following examples help to delineate the versatility of the model.

EXAMPLE 1.1 (Rocket engines). Let X represent the maximum chamber pres-
sure generated by ignition of a solid propellant, and Y be the strength of the
rocket chamber. Then R is the probability of a successful firing of the engine.

EXAMPLE 1.2 (Comparing two treatments). A standard design for the com-
parison of two drugs is to assign Drug A to one group of subjects and Drug B
to another group. Denote by X and Y the remission times with Drug A and
Drug B, respectively. Inferences about R = P[Y > X], based on the remission
time data X₁, X₂, ..., X_m and Y₁, Y₂, ..., Y_n, are of primary interest to the
experimenter. Although the name 'stress-strength' is not appropriate in the
present context, our target of inference is the parameter R, which has the same
structure as in Example 1.1.

EXAMPLE 1.3 (Threshold response model). A unit, say a receptor in the human
eye, operates only if it is stimulated by a source whose random magnitude, Y, is
greater than a (random) lower threshold X for the unit. Here

P[Y > X] = P[unit operates]

is again of the form described above in the stress-strength context.

2. Nonparametric inference about stress-strength reliability

Let the data consist of a random sample of size m of stresses X₁, X₂, ..., X_m
and an independent random sample of size n of strengths Y₁, Y₂, ..., Y_n.
Birnbaum (1956) proposed the point estimate

$$\hat R = U/mn$$

where U = number of pairs (X_i, Y_j) with Y_j > X_i. Alternatively, we can express R̂ as

$$\hat R = \int_{-\infty}^{\infty} F_m(y)\,dG_n(y)$$

where F_m(·) and G_n(·) are the empirical cdfs of the X's and Y's, respectively. Now

$$E(\hat R) = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} P[Y_j > X_i] = P[Y > X] = R$$

so R̂ is an unbiased estimator of reliability. Under the assumption that the
underlying cdfs F(·) and G(·) are continuous, the order statistics
X₍₁₎ ≤ ··· ≤ X₍ₘ₎ and Y₍₁₎ ≤ ··· ≤ Y₍ₙ₎ are complete sufficient statistics, so that R̂
is the unique uniform minimum variance unbiased estimator of R.
Another equivalent expression for R̂ takes advantage of the relation between the
Mann-Whitney and Wilcoxon forms of the two-sample rank statistics. That is,

$$U = mn + m(m+1)/2 - \sum_{i=1}^{m} \mathrm{rank}(X_i).$$

Owen, Craswell and Hansen (1964) point out that R̂ remains unbiased even if
F(·) and G(·) are not continuous.
It is also possible to obtain a distribution-free lower confidence bound on
R = P[Y > X] based on R̂. First note that

$$\hat R - R = \int F_m(y)\,dG_n(y) - \int F(y)\,dG(y) = \int [F_m(y) - F(y)]\,dG_n(y) + \int F(y)\,[dG_n(y) - dG(y)]. \qquad (2.1)$$

Birnbaum and McCarty (1958) bound the right hand side of relation (2.1) by

$$\sup_x [F_m(x) - F(x)] + \sup_y [G(y) - G_n(y)] = D_m^+ + D_n^-$$

where D_m^+ (D_n^-) is the one-sided Smirnov statistic based on a random sample of size m (n).
Consequently,

$$P[\hat R - R \le c] \ge P[D_m^+ + D_n^- \le c] \qquad (2.2)$$

so a conservative lower confidence bound can be obtained for R from the distribution of D_m^+ + D_n^-. If c is selected so that 1 − α ≤ P[D_m^+ + D_n^- ≤ c], then

$$P[\hat R - c \le R] \ge 1 - \alpha. \qquad (2.3)$$

Thus R̂ − c is a conservative 100(1 − α)% lower confidence bound for R.


Under the transformation Z = F(X), F_m(x) − F(x) = H_m(z) − z, where
H_m(z) = (number of F(X_i) ≤ z)/m. Since the Z_i = F(X_i) are independent uniform
random variables, the distributions of D_m^+ and D_n^- are free of F(·) and
G(·). Furthermore, P[D_m^- < d] = P[D_m^+ < d] and it is well known that
P[D_m^+ < d] − L(d√m) → 0 uniformly in d, where L(z) = 1 − exp(−2z²). This
suggests the approximation

$$P[D_m^+ + D_n^- \le c] \approx \int L((c - u)\sqrt n)\,dL(u\sqrt m) = 1 - \frac{n}{m+n}\,e^{-2mc^2} - \frac{m}{m+n}\,e^{-2nc^2} - \frac{2\sqrt{2\pi}\,mnc\,e^{-2mnc^2/(m+n)}}{(m+n)^{3/2}}\,\frac{1}{\sqrt{2\pi}} \int_{-2mc/\sqrt{m+n}}^{2nc/\sqrt{m+n}} e^{-t^2/2}\,dt. \qquad (2.4)$$

Since D_m^+ and D_n^- are independent, both P[D_m^+ + D_n^- ≤ c] = P[D_m^- + D_n^+ ≤ c]
and the approximation are symmetric in the sample sizes m and n. Owen,
Craswell and Hansen (1964) present tables, based on approximation (2.4), which
extend those presented by Birnbaum and McCarty (1958). The tables are entered
using the confidence level and λ, where λ = m/(n + m), to obtain δ = c√(m + n) in
our notation. Note that their upper bound on P[X > Y] yields our lower bound
on R = P[Y > X].
EXAMPLE 2.1. Suppose we have m = 20 values of maximum rocket pressures
and n = 30 observations on the strength of the chambers. Counting cases of strict
inequality in our samples, we obtain U = 591, so R̂ = 591/((30)(20)) = 0.985 and
λ = 20/50 = 0.4. For a two-sided 90% confidence level (one-sided 95%), the tables give
δ = 2.69289 = c√(m + n), so c = 0.381 and R̂ − c = 0.985 − 0.381 = 0.604 is the 95%
lower confidence bound on R.
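The computation in Example 2.1 is easy to reproduce; the sketch below (an illustration, not from the chapter) counts the Mann-Whitney pairs directly and applies a table value of δ supplied by the user.

```python
import numpy as np

def birnbaum_estimate_and_bound(x, y, delta):
    """R_hat = U/mn and the conservative lower bound R_hat - c, where
    delta = c*sqrt(m+n) is read from the Owen-Craswell-Hansen tables."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    m, n = len(x), len(y)
    U = np.sum(y[None, :] > x[:, None])   # number of pairs with Y_j > X_i
    R_hat = U / (m * n)
    c = delta / np.sqrt(m + n)
    return R_hat, R_hat - c

# With m = 20, n = 30 and U = 591 as in Example 2.1, delta = 2.69289 gives
# R_hat = 0.985, c = 0.381 and the 95% lower bound 0.604.
```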

Govindarajulu (1968) reports investigating two-sided confidence bounds based
on the inequality

$$|\hat R - R| \le \sup_x |F_m - F| + \sup_y |G_n - G| = D_m + D_n$$

that follows directly from (2.1). However, the resulting intervals were very wide,
so he suggests employing the large sample normal approximation for R̂.

THEOREM 2.1. If m, n → ∞,

$$\sqrt{\min(m, n)}\;\frac{\hat R - R}{\sigma} \xrightarrow{d} N(0, 1)$$

where

$$\sigma^2 = \min(m, n)\left[\frac{2}{m} \iint_{x<y} G(x)[1 - G(y)]\,dF(x)\,dF(y) + \frac{2}{n} \iint_{x<y} F(x)[1 - F(y)]\,dG(x)\,dG(y)\right].$$

The asymptotic normality follows directly from the representation as a
U-statistic. Invoking van Dantzig's bound Var(R̂) ≤ R(1 − R)/min(m, n)
≤ 1/(4 min(m, n)), one can choose c to satisfy

$$c = \frac{\Phi^{-1}(1 - \gamma)}{2\sqrt{\min(m, n)}} \qquad (2.5)$$

where Φ⁻¹(·) is the inverse of the standard normal cdf. Then R̂ − c is the
100(1 − γ)% lower confidence bound on R. For the equal sample size situation,
Govindarajulu (1968) shows the 95% confidence bound takes √(m + n) c = 1.17,
whereas √(m + n) c = 2.93 for the Birnbaum and McCarty approach.
Alternatively, σ² in Theorem 2.1 can be estimated by replacing F by F_m and
G by G_n:

$$\hat\sigma^2 = \min(m, n)\left\{\frac{1}{n}\left[\int F_m^2(x)\,dG_n(x) - \left(\int F_m(x)\,dG_n(x)\right)^2\right] + \frac{1}{m}\left[\int G_n^2(x)\,dF_m(x) - \left(\int G_n(x)\,dF_m(x)\right)^2\right]\right\}$$

Sen (1967) gives essentially the same result as Govindarajulu, although he
derives slightly different estimates of σ². One of his estimates of σ² can be
described in terms of ranks.
Let R₁, ..., R_m be the ranks of the X_i and S₁, ..., S_n the ranks of the Y_j in
the combined sample. Set

$$S_{10}^2 = \frac{1}{m-1}\left[\sum_{i=1}^{m}(R_i - i)^2 - m\left(\bar R - \frac{m+1}{2}\right)^2\right],$$

$$S_{01}^2 = \frac{1}{n-1}\left[\sum_{j=1}^{n}(S_j - j)^2 - n\left(\bar S - \frac{n+1}{2}\right)^2\right].$$

The rank estimator of σ² is

$$\hat\sigma^2 = \left(\frac{S_{10}^2}{mn^2} + \frac{S_{01}^2}{m^2 n}\right)\min(m, n).$$
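In code, Sen's rank estimate is a few lines; the sketch below follows the scaling reconstructed above (the constants in the source are partly illegible, so this scaling, which matches the U-statistic variance, is our assumption).

```python
import numpy as np
from scipy.stats import rankdata

def sen_rank_variance(x, y):
    """Rank-based estimate of Var(R_hat): S10^2/(m n^2) + S01^2/(m^2 n),
    using combined-sample ranks (scaling as reconstructed above)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    m, n = len(x), len(y)
    ranks = rankdata(np.concatenate([x, y]))
    R = np.sort(ranks[:m])   # ordered combined ranks of the X's
    S = np.sort(ranks[m:])   # ordered combined ranks of the Y's
    i, j = np.arange(1, m + 1), np.arange(1, n + 1)
    S10 = (np.sum((R - i) ** 2) - m * (R.mean() - (m + 1) / 2) ** 2) / (m - 1)
    S01 = (np.sum((S - j) ** 2) - n * (S.mean() - (n + 1) / 2) ** 2) / (n - 1)
    return S10 / (m * n ** 2) + S01 / (m ** 2 * n)
```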

The normality of R̂ should be a reasonable approximation if m, n ≥ 50, unless
the reliability in question is extremely high. In this latter case, a conservative
bound can be obtained based on (2.2), but using the exact distribution of
D_m^+ + D_n^- rather than the approximation (2.4).
The nonparametric approach has one serious drawback. In return for its
distribution-free property, it is not possible to establish high reliability with even
moderate sample sizes.
3. Parametric inference procedures

Given any parametric families {F(x | θ₁), θ₁ ∈ Θ₁} for stress and {G(y | θ₂),
θ₂ ∈ Θ₂} for strength, the reliability is

$$R_{\theta_1, \theta_2} = \int F(x \mid \theta_1)\,dG(x \mid \theta_2). \qquad (3.1)$$

Among the numerous choices for stress and strength distributions, only a few
special cases are amenable to exact small sample inference procedures. We first
treat the normal and then the Weibull stress-strength models before discussing the
general case.

3.1. Normal theory stress-strength models

Suppose F(·) is N(μ₁, σ₁²) and G(·) is N(μ₂, σ₂²). Then

$$R = P[Y > X] = P[Y - X > 0] = \Phi\!\left(\frac{\mu_2 - \mu_1}{\sqrt{\sigma_1^2 + \sigma_2^2}}\right)$$

where Φ(·) is the standard normal cdf with pdf φ(·). Without further assumptions,
it is not possible to obtain exact confidence procedures.

3.1. I. The general normal-normal model


Let Xl, X 2 . . . . , X m be a random sample of stress values and Y~, I"2. . . . . Y,
be an independent random sample of strengths. Downton (1973) obtained an
expression for the uniform minimum variance unbiased (UMVU) estimate of R
by conditioning the indicator I[X < Y] on the complete sufficient statistic x, y,
S~ = ~ ( X i -X')2/(m - 1), s~ = Y~(yj- y)2/(n - 1). In particular

×
f l f d(v) (1 -- /)2)(n - 4)/2(1 _ U2)(m- 4)/2 d u d v
-1 ~/-1
with
(y - X) x//m s2(n - 1) fmm
d(v) =
s l ( m - 1)--+I) sl(m 1) ~]n "
We take
1~=~0 i f d ( v ) < ~ - 1 , all I r i s < l ,
(3.2)
h ifd(v)>f 1, all [v[~<l.

When the sample sizes m and n are both large, confidence bounds for R can
be set using the approximate normality of x̄, σ̂₁² = Σ(x_i − x̄)²/m, ȳ and
σ̂₂² = Σ(y_j − ȳ)²/n. The maximum likelihood estimator of reliability is
R̂ = Φ((ȳ − x̄)/√(σ̂₁² + σ̂₂²)). Since

$$\frac{\bar Y - \bar X}{\sqrt{\hat\sigma_1^2 + \hat\sigma_2^2}} - \frac{\mu_2 - \mu_1}{\sqrt{\sigma_1^2 + \sigma_2^2}} = \frac{1}{\sqrt{\sigma_1^2 + \sigma_2^2}}[(\bar Y - \mu_2) - (\bar X - \mu_1)] - \frac{1}{2}\,\frac{\mu_2 - \mu_1}{(\sigma_1^2 + \sigma_2^2)^{3/2}}[(\hat\sigma_1^2 - \sigma_1^2) + (\hat\sigma_2^2 - \sigma_2^2)] + o_p\!\left(\frac{1}{\sqrt{\min(m, n)}}\right) \qquad (3.3)$$

we obtain the following asymptotic result.

THEOREM 3.1. If m, n → ∞ and m/(m + n) → λ (0 < λ < 1), then

$$\sqrt{m+n}\left[\frac{\bar Y - \bar X}{\sqrt{\hat\sigma_1^2 + \hat\sigma_2^2}} - \frac{\mu_2 - \mu_1}{\sqrt{\sigma_1^2 + \sigma_2^2}}\right] \xrightarrow{d} N(0, \sigma_R^2) \qquad (3.4)$$

where σ_R² can be estimated by

$$\hat\sigma_R^2 = \frac{m+n}{\hat\sigma_1^2 + \hat\sigma_2^2}\left[\frac{\hat\sigma_1^2}{m} + \frac{\hat\sigma_2^2}{n} + \frac{(\bar y - \bar x)^2}{2(\hat\sigma_1^2 + \hat\sigma_2^2)^2}\left(\frac{\hat\sigma_1^4}{m} + \frac{\hat\sigma_2^4}{n}\right)\right]. \qquad (3.5)$$

As a consequence of Theorem 3.1, an approximate 100(1 − α)% lower confidence bound for R is given by

$$\Phi\!\left(\frac{\bar y - \bar x}{\sqrt{\hat\sigma_1^2 + \hat\sigma_2^2}} - \frac{z_\alpha\,\hat\sigma_R}{\sqrt{m+n}}\right) \qquad (3.6)$$

where 1 − α = Φ(z_α).
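The estimator and bound of Theorem 3.1 are straightforward to compute; the following sketch implements (3.4)-(3.6) as reconstructed above.

```python
import numpy as np
from scipy.stats import norm

def normal_mle_and_bound(x, y, alpha=0.05):
    """MLE R_hat = Phi((ybar-xbar)/sqrt(v1+v2)) and the approximate
    100(1-alpha)% lower bound (3.6), with the variance estimate (3.5)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    m, n = len(x), len(y)
    v1, v2 = x.var(), y.var()          # MLE variances (divisors m and n)
    diff = y.mean() - x.mean()
    delta_hat = diff / np.sqrt(v1 + v2)
    var_R = (m + n) / (v1 + v2) * (
        v1 / m + v2 / n
        + diff ** 2 / (2 * (v1 + v2) ** 2) * (v1 ** 2 / m + v2 ** 2 / n))
    z = norm.ppf(1 - alpha)
    return norm.cdf(delta_hat), norm.cdf(delta_hat - z * np.sqrt(var_R / (m + n)))
```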

3.1.2. Equal variances: σ₁² = σ₂²

Suppose it can be assumed that the stress and strength distributions have equal
variances. Estimating the common σ² by

$$s_p^2 = \frac{\sum_{i=1}^{m}(x_i - \bar x)^2 + \sum_{j=1}^{n}(y_j - \bar y)^2}{m + n - 2}$$

leads to the non-central t-variable

$$T = (\bar Y - \bar X)\Big/\left[\left(\frac{1}{m} + \frac{1}{n}\right)s_p^2\right]^{1/2}$$

whose noncentrality parameter is δ = (μ₂ − μ₁)/[σ(1/m + 1/n)^{1/2}]. Since P_δ[T ≤ t] is
monotonically decreasing in δ, a 100(1 − α)% lower confidence bound for δ is
given by δ̲, where

$$P_{\underline\delta}[T \le T_{\mathrm{obs}}] = 1 - \alpha \qquad (3.7)$$

(see Lehmann (1959), Corollary 3, p. 80 and also pp. 223-224).


Next,

$$R = \Phi\!\left(\frac{\mu_2 - \mu_1}{\sqrt 2\,\sigma}\right) = \Phi\!\left(\frac{\delta}{\sqrt 2}\sqrt{\frac{1}{m} + \frac{1}{n}}\right),$$

so the 100(1 − α)% lower confidence bound on R is

$$\underline R = \Phi\!\left(\frac{\underline\delta}{\sqrt 2}\sqrt{\frac{1}{m} + \frac{1}{n}}\right) \qquad (3.8)$$

where δ̲ is a solution to (3.7). Govindarajulu (1967) gives two-sided limits.
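Solving (3.7) requires only the noncentral t cdf; a minimal sketch of the equal-variance bound (3.8):

```python
import numpy as np
from scipy.stats import nct, norm
from scipy.optimize import brentq

def equal_variance_lower_bound(x, y, alpha=0.05):
    """Lower bound (3.8): find delta_low with P_delta[T <= T_obs] = 1 - alpha,
    then return Phi(delta_low * sqrt(1/m + 1/n) / sqrt(2))."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    m, n = len(x), len(y)
    df = m + n - 2
    sp2 = (np.sum((x - x.mean()) ** 2) + np.sum((y - y.mean()) ** 2)) / df
    t_obs = (y.mean() - x.mean()) / np.sqrt((1 / m + 1 / n) * sp2)
    # nct.cdf(t_obs, df, d) decreases in the noncentrality d, so bracket and solve.
    g = lambda d: nct.cdf(t_obs, df, d) - (1 - alpha)
    d_low = brentq(g, t_obs - 20, t_obs + 20)
    return norm.cdf(d_low * np.sqrt(1 / m + 1 / n) / np.sqrt(2))
```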

3.1.3. Known stress distribution

Mazumdar (1970) derived minimum variance unbiased estimates of R when the
stress distribution is known and when σ₂² is either known or unknown. Downton
(1973) gives an alternative integral expression for the estimator. Church and
Harris (1970) suggested a closely related estimator and derive the large sample
approximate confidence interval. In the notation of Section 3.1.1, their
100(1 − α)% lower confidence bound is asymptotically equivalent to

$$\Phi\!\left(\frac{\bar y - \mu_1}{\sqrt{\sigma_1^2 + \hat\sigma_2^2}} - \frac{z_\alpha\,\hat\sigma}{\sqrt n}\right) \qquad (3.9)$$

where

$$\hat\sigma^2 = \frac{n}{\sigma_1^2 + \hat\sigma_2^2}\left[\frac{\hat\sigma_2^2}{n} + \frac{(\bar y - \mu_1)^2\,\hat\sigma_2^4}{2(n-1)(\sigma_1^2 + \hat\sigma_2^2)^2}\right].$$

The fact that μ₁ and σ₁ are known does not seem to lead to exact inference
procedures. Mazumdar (1970) does obtain an exact, but inefficient, confidence
bound by introducing m pseudo random numbers for the first sample.

3.1.4. Some sample size considerations

Owen, Craswell and Hansen (1964) also treat paired observations and cases
where the variances and covariances are known. They then obtain some upper
bounds on the sample size required to achieve specified confidence bounds. We
present an extension of their approach.
For independent samples, when σ₁² = σ₂² = σ², R = Φ((μ₂ − μ₁)/(√2 σ)) and σ²
is estimated by s_p². Given a fixed precision c and reliability R, required sample
sizes can be obtained by solving

$$1 - \alpha = P[\hat R - c < R] = P\!\left[\frac{\bar Y - \bar X}{\sqrt 2\,s_p} \le z_{1-R-c}\right] = P\!\left[T(\delta_{m,n}) \le \sqrt 2\left(\frac{1}{m} + \frac{1}{n}\right)^{-1/2} z_{1-R-c}\right] \qquad (3.10)$$

where T(δ_{m,n}) is a non-central t-variable with m + n − 2 degrees of freedom and
non-centrality parameter

$$\delta_{m,n} = \sqrt 2\left(\frac{1}{m} + \frac{1}{n}\right)^{-1/2} z_{1-R}\,,$$

and z_q denotes the upper q-th percentile of the standard normal distribution.
Note that the sample sizes m and n enter the non-centrality parameter, the degrees
of freedom and the percentile √2(1/m + 1/n)^{−1/2} z_{1−R−c}. The values of m and n do,
however, enter (3.10) symmetrically. In an application, the solutions m, n must be
maximized over the range of R of interest. Owen, Craswell and Hansen (1964) give
a table of values for the case of equal sample sizes.

3.2. Exponential and Weibull distributions with equal shape

When the stress and strength distributions are both Weibull and their shape
parameters are equal,

$$R_{\theta_1, \theta_2, p} = 1 - \int_0^\infty e^{-(x/\theta_1)^p}\,\frac{p}{\theta_2}\left(\frac{x}{\theta_2}\right)^{p-1} e^{-(x/\theta_2)^p}\,dx = \frac{1}{1 + (\theta_1/\theta_2)^p}\,. \qquad (3.11)$$

This Weibull expression includes the negative exponential distribution when p = 1
and the Rayleigh distribution when p = 2. Unless the common shape parameter
is known, only large sample approaches to inference are available.
When both distributions are negative exponential, some exact procedures are
available. With independent random samples, the likelihood is
$$\theta_1^{-m}\,e^{-\sum_{i=1}^{m} x_i/\theta_1}\;\theta_2^{-n}\,e^{-\sum_{j=1}^{n} y_j/\theta_2} \qquad (3.12)$$

and the maximum likelihood estimator of R = 1/(1 + θ₁/θ₂) is R̂ = 1/(1 + X̄/Ȳ).
The bias is relatively small and (m + n)(R̂ − R̃) = O(1), where R̃ is the UMVU
estimator. Since (X̄/θ₁)/(Ȳ/θ₂) is distributed F_{2m,2n}, a 100(1 − α)% lower confidence bound on R is given by

$$1\Big/\left(1 + \frac{\bar X}{\bar Y}\,F_{2n,2m}(\alpha)\right) < R \qquad (3.13)$$

where F_{2n,2m}(α) is the upper α-th point of the F_{2n,2m}-distribution.
Alternatively, since (n/m)F_{2n,2m}/(1 + (n/m)F_{2n,2m}) has a beta distribution with
parameters n and m, the lower confidence bound can also be expressed as

$$\frac{1 - \eta_{1-\alpha}}{1 - \eta_{1-\alpha} + (m\bar x/n\bar y)\,\eta_{1-\alpha}} < R$$

where η_{1−α} is the 100(1 − α)-th percentile of the beta distribution. The case of
known stress parameter, θ₁, can be treated by the same methods.
Basu (1980) considers the Marshall-Olkin bivariate exponential distribution.
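For the negative exponential case, the exact bound (3.13) needs only an F percentile; a minimal sketch:

```python
import numpy as np
from scipy.stats import f as f_dist

def exponential_lower_bound(x, y, alpha=0.05):
    """Exact 100(1-alpha)% lower bound (3.13) on R = 1/(1 + theta1/theta2)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    m, n = len(x), len(y)
    F_alpha = f_dist.ppf(1 - alpha, 2 * n, 2 * m)  # upper alpha point of F(2n, 2m)
    return 1.0 / (1.0 + (x.mean() / y.mean()) * F_alpha)
```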

3.3. General parametric families

Given point estimates θ̂₁ and θ̂₂, the point estimate of R,

$$\hat R = \int F_{\hat\theta_1}(x)\,dG_{\hat\theta_2}(x)\,, \qquad (3.14)$$

can usually be evaluated by numerical methods. Notice that R̂ is the MLE if
θ̂₁ and θ̂₂ are MLE's. Except for the normal and exponential cases, confidence bounds must be based on large sample theory.
Suppose

$$\sqrt m(\hat\theta_1 - \theta_1) \xrightarrow{d} N(0, I_1^{-1}(\theta_1))$$

independent of θ̂₂, and

$$\sqrt n(\hat\theta_2 - \theta_2) \xrightarrow{d} N(0, I_2^{-1}(\theta_2))$$

where I₁(θ₁) and I₂(θ₂) are the Fisher information matrices for the stress and strength
distributions, respectively. Then, if the derivatives are smooth,

$$\sqrt{m+n}\,(R_{\hat\theta_1, \hat\theta_2} - R_{\theta_1, \theta_2}) = \sqrt{\frac{m+n}{m}}\;\sqrt m(\hat\theta_1 - \theta_1)'\,\frac{\partial}{\partial\theta_1}\int F_{\theta_1}(x)\,dG_{\theta_2}(x) + \sqrt{\frac{m+n}{n}}\;\sqrt n(\hat\theta_2 - \theta_2)'\,\frac{\partial}{\partial\theta_2}\int F_{\theta_1}(x)\,dG_{\theta_2}(x) + O_p\!\left(\frac{1}{\sqrt{\min(m, n)}}\right) \qquad (3.15)$$

under suitable regularity conditions, including the interchange of integration and
differentiation. Here

$$a_{\theta_1} = \frac{\partial}{\partial\theta_1}\int F_{\theta_1}(x)\,dG_{\theta_2}(x) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{x}\frac{\partial f(u \mid \theta_1)/\partial\theta_1}{f(u \mid \theta_1)}\,f(u \mid \theta_1)\,g(x \mid \theta_2)\,du\,dx = \int_{-\infty}^{\infty}[1 - G_{\theta_2}(u)]\,\frac{\partial f(u \mid \theta_1)/\partial\theta_1}{f(u \mid \theta_1)}\,f(u \mid \theta_1)\,du\,,$$

$$b_{\theta_2} = \int_{-\infty}^{\infty} F_{\theta_1}(x)\,\frac{\partial g(x \mid \theta_2)/\partial\theta_2}{g(x \mid \theta_2)}\,g(x \mid \theta_2)\,dx. \qquad (3.16)$$

Notice that a_{θ₁} and b_{θ₂} are expressions for covariances involving the score functions.

THEOREM 3.2. If m, n → ∞ and m/(m + n) → λ (0 < λ < 1), then

$$\sqrt{m+n}\,(R_{\hat\theta_1, \hat\theta_2} - R_{\theta_1, \theta_2}) \xrightarrow{d} N(0, \sigma_{R,\lambda}^2) \qquad (3.17)$$

where σ_{R,λ}² may be estimated as

$$\hat\sigma_{R,\lambda}^2 = \frac{1}{\hat\lambda}\,a_{\hat\theta_1}'\,I_1^{-1}(\hat\theta_1)\,a_{\hat\theta_1} + \frac{1}{1 - \hat\lambda}\,b_{\hat\theta_2}'\,I_2^{-1}(\hat\theta_2)\,b_{\hat\theta_2}\,, \qquad \hat\lambda = \frac{m}{m+n}\,.$$

As a consequence of Theorem 3.2, an approximate large sample 100(1 − α)%
lower confidence bound on R_{θ₁,θ₂} is given by

$$R_{\hat\theta_1, \hat\theta_2} - z_\alpha\,\hat\sigma_{R,\lambda}/\sqrt{m+n}\,. \qquad (3.18)$$

3.4. Drawback of the parametric model

Only moderate sample sizes are required for estimates of μ₁, μ₂ and
σ (= σ₁ = σ₂) in the normal model. However, estimates of reliability and the lower
confidence bound make strong use of the assumptions that the upper tail of the
stress distribution and the lower tail of the strength distribution are normal. If the sample sizes
are not large enough to produce observations in these tails, we cannot even check
this assumption.
If a small fraction of the population of units contain major defects of material
or workmanship, even a moderate sample of strengths will not show these 'rare'
causes of failure. In this situation, use of an assumed parametric form for the
strength distribution will typically lead to estimates of P[Y > X] which are, incorrectly, very high.
Even without such extreme departures from the postulated models, tail areas
remain very difficult to estimate. The choice between normal, Weibull or lognormal tails can change the estimated reliability by several orders of magnitude
when R is extremely large.

4. Stress-strength models for system reliability

System models have been discussed by Bhattacharyya and Johnson (1974,
1975, 1977), McCarthy and Orringer (1975), and Chandra and Owen (1975).
Bhattacharyya and Johnson (1975) study the situation where a system, consisting
of k components, functions when at least s (1 ≤ s ≤ k) of the components survive
a common shock of random magnitude. This formulation includes all series
(k-out-of-k) and parallel (1-out-of-k) systems.

EXAMPLE 4.1. A panel consisting of k identical solar cells maintains an adequate power output if at least s of the cells are active during the duration of the
mission. The external force interfering with the operation of the cells may be
extreme temperatures, and the strength of a cell, in this context, may be taken as
its capacity to withstand the external temperatures.
Under an 'identical component' model, the strengths of the components are
assumed to be independent and identically distributed random variables with cdf
G(y). The stress, common to all components, is a random variable having cdf
F(x). The system reliability is then a function of F(.) and G('). In particular, the
reliability of an s-out-of-k system, R_{s,k}, is given by

$$R_{s,k} = \sum_{j=s}^{k}\binom{k}{j}\int_{-\infty}^{\infty}[1 - G(x)]^j\,G^{k-j}(x)\,dF(x) = 1 - \int_{-\infty}^{\infty}\Psi[G(x)]\,dF(x) \qquad (4.1)$$

where Ψ(·) is the cdf of the beta distribution having density proportional to u^{k−s}(1 − u)^{s−1}.

4.1. Nonparametric estimation of system reliability

Let X₁, ..., X_m be a random sample from F(·) and Y₁, ..., Y_n be a random
sample from G(·), where F(·) and G(·) are assumed to be continuous. Replacing
F(·) and G(·) in (4.1) by the empirical cdf's F_m(·) and G_n(·) gives rise to the
intuitive estimator

$$R^*_{s,k} = 1 - \int_{-\infty}^{\infty}\Psi[G_n(x)]\,dF_m(x) = \int_{-\infty}^{\infty} F_m(x)\,d\Psi[G_n(x)] = \frac{1}{m}\sum_{j=1}^{n}(S_{(j)} - j)\left[\Psi\!\left(\frac{j}{n}\right) - \Psi\!\left(\frac{j-1}{n}\right)\right] \qquad (4.2)$$

where S₍₁₎ ≤ S₍₂₎ ≤ ··· ≤ S₍ₙ₎ are the ordered ranks of the Y's in the combined
sample. Bhattacharyya and Johnson (1975) also derive the UMVU estimator as
a generalized U-statistic based on the kernel

h(x₁; y₁, ..., y_k) = 1 if s or more of y₁, ..., y_k exceed x₁,
                   = 0 otherwise.   (4.3)

After some simplification, the UMVU estimator R̂_{s,k} can be expressed as

$$\hat R_{s,k} = \frac{1}{m\binom{n}{k}}\sum_{i=k-s+1}^{n-s+1}\binom{i-1}{k-s}\binom{n-i}{s-1}(S_{(i)} - i)\,. \qquad (4.4)$$

Note that R̂_{s,k} is similar in form to R*_{s,k} but that it has the feature of a trimmed
mean.
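The intuitive estimator (4.2) is simple to compute, since Ψ is a beta cdf; the sketch below evaluates 1 − ∫Ψ[G_n]dF_m directly.

```python
import numpy as np
from scipy.stats import beta

def intuitive_Rsk(x, y, s, k):
    """Empirical estimate (4.2) of s-out-of-k reliability:
    R* = 1 - mean_i Psi(G_n(X_i)), with Psi the Beta(k-s+1, s) cdf."""
    x, y = np.asarray(x, float), np.sort(np.asarray(y, float))
    n = len(y)
    Gn_at_x = np.searchsorted(y, x, side='right') / n   # G_n at each X_i
    return 1.0 - np.mean(beta.cdf(Gn_at_x, k - s + 1, s))
```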
Bhattacharyya and Johnson (1977) establish the following large sample result.

THEOREM 4.1. Let m, n → ∞ with m/(m + n) → λ (0 < λ < 1). Then pointwise

$$(m + n)(\hat R_{s,k} - R^*_{s,k}) = O(1)$$

and

$$\sqrt{m+n}\,(\hat R_{s,k} - R_{s,k}) \xrightarrow{d} N\!\left(0,\;\frac{\sigma_1^2}{\lambda} + \frac{\sigma_2^2}{1-\lambda}\right)$$

where

$$\sigma_1^2 = \mathrm{Var}_F[\Psi(G(X))] = \int \Psi^2(G)\,dF - \left[\int \Psi(G)\,dF\right]^2, \qquad (4.5)$$

$$\sigma_2^2 = \iint \psi[G(x)]\,\psi[G(y)]\,\{G(\min(x, y)) - G(x)\,G(y)\}\,dF(x)\,dF(y),$$

and ψ(u) is the pdf associated with Ψ.


From Theorem 4.1 we conclude that a large sample 100(1 − α)% lower confidence bound for R_{s,k} is given by

$$\hat R_{s,k} - \frac{z_\alpha}{\sqrt{m+n}}\sqrt{\frac{\hat\sigma_1^2}{\lambda} + \frac{\hat\sigma_2^2}{1-\lambda}} \qquad (4.6)$$

where σ̂₁², σ̂₂² are obtained by replacing F and G by F_m and G_n in the
expressions for σ₁², σ₂². Clearly R*_{s,k} could replace R̂_{s,k} in the confidence
bound (4.6).
When the stress distribution F is known, the intuitive estimator has the form

$$\hat R_{s,k}(F) = \sum_{i=1}^{n}\left[\Psi\!\left(\frac{i}{n}\right) - \Psi\!\left(\frac{i-1}{n}\right)\right]F(Y_{(i)}) \qquad (4.7)$$

and the UMVU estimator is

$$\hat R^*_{s,k}(F) = \frac{1}{\binom{n}{k}}\sum_{i=k-s+1}^{n-s+1}\binom{i-1}{k-s}\binom{n-i}{s-1}F(Y_{(i)})\,. \qquad (4.8)$$

Bhattacharyya and Johnson (1977) also establish

$$\sqrt n\,(\hat R_{s,k}(F) - R) \xrightarrow{d} N(0, \sigma_2^2)\,, \qquad n(\hat R^*_{s,k}(F) - \hat R_{s,k}(F)) = O(1)\ \text{pointwise}, \qquad (4.9)$$

so confidence bounds similar to (4.6) are immediate. When F(·) is known, the
100(1 − α)% lower confidence bound on R_{s,k} is

$$\hat R_{s,k}(F) - \frac{z_\alpha}{\sqrt n}\,\hat\sigma_2\,. \qquad (4.10)$$
4.2. Exponential distributions for stress and strength

When F(x) = 1 − e^{−x/θ₁} and G(x) = 1 − e^{−x/θ₂},

$$R_{s,k} = 1 - \frac{k!}{(s-1)!}\prod_{j=s}^{k}\frac{1}{j + \theta_2/\theta_1} = 1 - \frac{1}{B(s, k-s+1)}\sum_{j=0}^{k-s}(-1)^j\binom{k-s}{j}\frac{1}{s + j + \theta_2/\theta_1} \qquad (4.11)$$

where the last expression is obtained by expanding the product into partial
fractions. Here B(s, k − s + 1) is the beta function. We note that (Σ_{i=1}^m X_i,
Σ_{j=1}^n Y_j) is a complete sufficient statistic and that (s + j)^{−1} u[(s + j)X₁ − Y₁], with
u(x) = 1 (0) if x > (≤) 0, is an unbiased estimator of (s + j + θ₂/θ₁)^{−1}. The
Rao-Blackwell method leads to the UMVU estimator, but its form is complicated
and depends on the hypergeometric function of the second kind. The maximum
likelihood estimator, R̂_{s,k}, has the considerably simpler form

$$\hat R_{s,k} = 1 - \frac{k!}{(s-1)!}\prod_{j=0}^{k-s}\frac{1}{j + s + \bar Y/\bar X}\,. \qquad (4.12)$$
Asymptotically, R̂_{s,k} is normally distributed.

THEOREM 4.2. Let m, n → ∞ and m/(m + n) → λ, 0 < λ < 1. Then

$$\sqrt{m+n}\,(\hat R_{s,k} - R_{s,k}) \xrightarrow{d} N(0, \sigma_R^2)$$

where

$$\sigma_R^2 = \left[(1 - R_{s,k})\,\frac{\theta_2}{\theta_1}\sum_{j=s}^{k}\frac{1}{j + \theta_2/\theta_1}\right]^2\left(\frac{1}{\lambda} + \frac{1}{1-\lambda}\right). \qquad (4.13)$$

As a consequence of Theorem 4.2, lower confidence bounds are obtained using
R̂_{s,k} to estimate R and Ȳ/X̄ to estimate θ₂/θ₁ in the expression for σ_R².
The asymptotic relative efficiency of the nonparametric estimator (4.2) or (4.4)
versus the exponential maximum likelihood estimator (4.12) is given by

$$e = \frac{\left[(1 - R_{s,k})(\theta_2/\theta_1)\sum_{j=s}^{k}(j + \theta_2/\theta_1)^{-1}\right]^2}{(1-\lambda)\sigma_1^2 + \lambda\sigma_2^2}\,. \qquad (4.14)$$

Bhattacharyya and Johnson (1975) tabulate values of e.

4.3. Further generalizations of the s-out-of-k model

The foregoing results are concerned with the reliability of an s-out-of-k system
where the underlying assumptions are that the component strengths Y₁, ..., Y_k
are iid random variables and all the components are subjected to a common
random stress X which is independent of the Y's. We outline here some extensions of the model for representing the reliability structure of more complex
systems.

(a) Non-identical component strength distributions. When the components of a
system are of different structure, the assumption of identical strength distributions
may not be realistic. This is often the case with systems having standby com-

ponents. Suppose that out of the k components, k₁ are of one category and their
strengths can be reasonably assumed to have a common distribution G₁. The
remaining k₂ = k − k₁ components are of a different category and their common
strength distribution is denoted by G₂. All the k components are exposed to a
common stress X having the distribution F, and the system operates successfully
if at least s of the k components withstand the stress. This corresponds to the
same structure function (4.3). Here, however, Y₁, ..., Y_{k₁} are iid G₁,
Y_{k₁+1}, ..., Y_k are iid G₂, and X is distributed as F. The system reliability is a
functional of the triplet (F, G₁, G₂) and it can be formally expressed as

$$R = \sum\binom{k_1}{j_1}\binom{k_2}{j_2}\int (1 - G_1)^{j_1}\,G_1^{k_1 - j_1}\,(1 - G_2)^{j_2}\,G_2^{k_2 - j_2}\,dF \qquad (4.15)$$

where the sum extends over 0 ≤ j₁ ≤ k₁, 0 ≤ j₂ ≤ k₂ such that s ≤ j₁ + j₂ ≤ k.
When F, G₁ and G₂ are exponential with the scale parameters θ, β₁ and β₂, the
integral in (4.15) can be simplified to a linear function of terms of the form
[a₁β₁ + a₂β₂ + θ]^{−1}, where the known constants a₁ and a₂ vary from term to
term. With independent random samples {X₁, ..., X_m}, {Y_{11}, ..., Y_{1n₁}} and
{Y_{21}, ..., Y_{2n₂}} from F, G₁ and G₂ respectively, one can easily obtain the maximum likelihood estimator of R. The UMVU estimator can also be worked out
along the lines of Section 4.2.
Nonparametric estimators of R can be constructed by either of two procedures. For instance, a nonparametric estimator R̂* is obtained by replacing F, G₁
and G₂ in (4.15) by the empirical cdfs. Alternatively, defining the kernel function

h(X₁; Y_{11}, ..., Y_{1k₁}; Y_{21}, ..., Y_{2k₂}) = 1 if at least s of the (k₁ + k₂) Y's exceed X₁,
                                        = 0 otherwise,   (4.16)

and averaging h over all m·(n₁ choose k₁)·(n₂ choose k₂) choices of the ordered subscripts, one obtains
the UMVU estimator of R.

EXAMPLE 4.2. Consider a system with k = 2 and s = 1 where the two components have strength distributions G₁ and G₂ and are subjected to a common
stress with distribution F. From (4.15) with k₁ = k₂ = 1, we obtain

$$R = \int (1 - G_1)G_2\,dF + \int G_1(1 - G_2)\,dF + \int (1 - G_1)(1 - G_2)\,dF = 1 - \int G_1 G_2\,dF.$$

The nonparametric UMVU estimator, based on random samples {X₁, ..., X_m},
{Y_{11}, ..., Y_{1n₁}} and {Y_{21}, ..., Y_{2n₂}} from F, G₁ and G₂ respectively, is given by

$$\hat R_{NP} = (T_1 + T_2 + T_3)/(m n_1 n_2)$$

where T₁, T₂ and T₃ are the numbers of triplets {X_i, Y_{1j₁}, Y_{2j₂}} satisfying
(Y_{1j₁} < X_i < Y_{2j₂}), (Y_{2j₂} < X_i < Y_{1j₁}) and (X_i < Y_{1j₁}, X_i < Y_{2j₂}), respectively. The
estimator based on the empirical cdf's is given by

$$\hat R^* = 1 - \int G_{1n_1}\,G_{2n_2}\,dF_m = 1 - [m n_1 n_2]^{-1}\sum_{i=1}^{m}(Q_i - i)(Q_i' - i)$$

where Q_i is the rank of the i-th order statistic X₍ᵢ₎ within the combined X and Y₁
samples, and Q_i' is the rank of X₍ᵢ₎ within the combined X and Y₂ samples.
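The UMVU estimator of Example 4.2 amounts to counting triplets in which at least one strength beats the stress; a brute-force sketch:

```python
import numpy as np

def umvu_1_of_2(x, y1, y2):
    """UMVU estimate of R = 1 - integral G1 G2 dF for Example 4.2:
    (T1 + T2 + T3)/(m n1 n2), i.e., the fraction of triplets with
    X_i < max(Y1_j, Y2_l)."""
    x, y1, y2 = (np.asarray(a, float) for a in (x, y1, y2))
    pair_max = np.maximum.outer(y1, y2)          # n1 x n2 grid of max(Y1, Y2)
    hits = sum(np.sum(pair_max > xi) for xi in x)
    return hits / (len(x) * len(y1) * len(y2))
```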

(b) Subsystems with independent stresses. In a more complex situation a system
may consist of a number of independent subsystems performing different
functions. Within each subsystem, the components have independent and identically distributed strengths and are subjected to a common stress, so that each
subsystem has the structure of an s-out-of-k stress-strength model. The strength
and stress distributions as well as s and k may vary among the subsystems. The
following diagram illustrates such a system where the two subsystems A and B
are serially connected.

[Diagram: subsystem A (2 out of 3) connected in series with subsystem B (1 out of 2).]
Fig. 1. Serially connected subsystems with independent stresses.



The subsystem A functions when at least two of the three components survive
the stress X. The component strengths are iid with distribution G₁ and the
common stress X has distribution F₁. Similarly, the subsystem B has the structure
of a 1-out-of-2 stress-strength model where the strength and stress distributions
are G₂ and F₂ respectively. The system reliability R is given by

$$R = R^A_{2,3}\,R^B_{1,2}$$

where the factors on the right-hand side are the stress-strength reliability functions for the
subsystems and they have the same forms as given in (4.1). Using the methods
of Section 4.1, one can obtain the UMVU estimator for each of R^A_{2,3} and R^B_{1,2}
and, due to independence, their product will give the nonparametric UMVU
estimator R̂ of R. The limiting normal distribution of R̂ and the form of the
asymptotic variance can then be obtained from the subsystem results.

(c) Binomial data on components. Often, components are tested under the random
stress conditions that prevail, and only the number of survivors is recorded
rather than the measurements of stresses and strengths. In the context of a single
component stress-strength model where our objective is to estimate the probability
R₁ = P[Y > X] = ∫(1 − G)dF, the present sampling process yields the count Z_n,
which is the number of pairs (X_i, Y_i), i = 1, ..., n, such that Y_i > X_i. The numerical measurements of Y_i and X_i are not recorded. The problem then reduces to
estimating a binomial probability from the number of successes in n trials. More
generally, consider a system consisting of c subsystems where each subsystem has
the structure of a single component stress-strength model. The system reliability
is then a function

R = g(p₁, p₂, ..., p_c)

where p_i = ∫(1 − G_i)dF_i, G_i and F_i are the strength and stress distributions for the
i-th subsystem, and the functional form of g is determined by the manner in which
the system is structured. Methods of estimating the system reliability from
binomial count data have been developed by Myhre and Saunders (1968),
Madansky (1965), Easterling (1972), and many others. The stress-strength formulation of the model loses its distinctive features when only the count data are
recorded and the subsystems have single components.
For a k (≥ 2) component stress-strength system where all the components are
exposed to a common stress X in their natural operating environment, some care
is needed in using binomial count data of the component failures for estimating
the system reliability. Intuitively, one might interpret the reliability of an s-out-of-k
system as the probability of obtaining s or more successes in k Bernoulli trials
and proceed to estimate this binomial probability from the count data. In this
process, one would be estimating the functional

$$\psi(F, G) = \sum_{j=s}^{k}\binom{k}{j}R_1^j(1 - R_1)^{k-j} \qquad (4.17)$$

where R₁ = ∫(1 − G)dF. This is not the same as the system reliability for an s-out-of-k system, which is given by

$$R_{s,k} = \sum_{j=s}^{k}\binom{k}{j}\int (1 - G)^j\,G^{k-j}\,dF. \qquad (4.18)$$

Notice, in particular, that when s = k and k ≥ 2,

$$\psi(F, G) = \left[\int (1 - G)\,dF\right]^k < \int (1 - G)^k\,dF = R_{k,k}\,.$$

Bhattacharyya (1977) explores estimation procedures in this context. He considers
data in the form of failure counts when m components are subjected to a common
stress, and this experiment is repeated n times. Efficiencies are also calculated
relative to the exponential model.

5. Extensions of the basic stress-strength model

Two recent developments merit further attention.

5.1. Stochastic process formulation

A more sophisticated stress-strength model allows the stress X(t) and strength
Y(t) to vary over time. Specifically, let {X(t): t > 0} and {Y(t): t > 0} be independent stochastic processes. Consonant with our initial formulation of the
stress-strength model in Section 1, we would define reliability for the period
(0, t₀] as

$$R_1(t_0) = P\Big[\inf_{t \le t_0} Y(t) > \sup_{t \le t_0} X(t)\Big]. \qquad (5.1)$$

Alternative definitions are also plausible. We could require only that current
strength exceed the maximum stress thus far encountered,

$$R_2(t_0) = P\Big[Y(t) > \sup_{s \le t} X(s),\ \text{all } t \le t_0\Big]. \qquad (5.2)$$

Even less stringent, the requirement could be that current strength exceeds current
stress,

$$R_3(t_0) = P[Y(t) > X(t),\ \text{all } t \le t_0]. \qquad (5.3)$$

Using definition (5.3), Basu and Ebrahimi (1983) consider the case where X(t)
and Y(t) are Brownian motion processes with means μ₁, μ₂ and covariances
σ₁² min(s, t), σ₂² min(s, t). They show that

$$R_3(t_0) = \Phi\!\left(\frac{\mu_2 - \mu_1}{\sqrt{(\sigma_1^2 + \sigma_2^2)\,t_0}}\right) \qquad (5.4)$$

which is of the same form as the normal theory model in Section 3.1. Expression
(5.4) would not be expected to apply for large t₀ since R₃(t₀) ≥ 0.5 for all t₀ when
μ₂ > μ₁.

5.2. Stress-strength models with covariates

Strength can usually only be determined by testing a unit to destruction.
However, it is often possible to measure covariates of strength without damaging
the unit.

EXAMPLE 5.1. A 2 × 4 to be used in the frame of a house has bending strength
Y which can be observed only by destructive testing. Yet stiffness (the modulus
of elasticity), which can be used to predict strength, is easily measured by a
non-destructive test. Data for some species suggest that strength is related to
stiffness z according to the linear relation

Y = α₂ + β₂z + e₂

where e₂ is distributed N(0, σ₂²). For a specimen whose stiffness is z, with stress X
distributed N(μ₁, σ₁²), the conditional reliability becomes

$$R(z) = P[Y > X \mid z] = \Phi\!\left(\frac{\alpha_2 + \beta_2 z - \mu_1}{\sqrt{\sigma_1^2 + \sigma_2^2}}\right). \qquad (5.5)$$

EXAMPLE 5.2. Refer to Example 1.2 where the purpose is to compare remission
time X using Drug A with remission time Y using Drug B. Suppose that the age
z of the subject influences the remission time. We postulate the linear regression
relations

X = α₁ + β₁z + e₁,  Y = α₂ + β₂z + e₂

where e₁ is distributed N(0, σ₁²) independent of e₂, which is distributed N(0, σ₂²).
For a new subject of given age z, we should provide information about

$$R(z) = P[Y > X \mid z] = \Phi\!\left(\frac{\alpha_2 - \alpha_1 + (\beta_2 - \beta_1)z}{\sqrt{\sigma_1^2 + \sigma_2^2}}\right). \qquad (5.6)$$

The models in Example 5.1 and Example 5.2 were introduced by Bhattacharyya
and Johnson (1981). Initially, we consider the more general model where X and
Y may depend on possibly different covariates. Set

z₁ = [z_{11}, z_{12}, ..., z_{1q₁}]′  and  z₂ = [z_{21}, z_{22}, ..., z_{2q₂}]′

and assume

X | z₁ ~ N(α₁ + β₁′z₁, σ₁²)

independent of

Y | z₂ ~ N(α₂ + β₂′z₂, σ₂²).

We are then interested in making inferences concerning the reliability

$$R(z_1, z_2) = P[Y > X \mid z_1, z_2] = \Phi\!\left(\frac{\alpha_2 - \alpha_1 + \beta_2' z_2 - \beta_1' z_1}{\sqrt{\sigma_1^2 + \sigma_2^2}}\right). \qquad (5.7)$$

Some exact inference procedures are available when the variances are equal.
Set σ₁² = σ₂² = σ², so

$$R(z_1, z_2) = \Phi\!\left(\frac{\alpha_2 - \alpha_1 + \beta_2' z_2 - \beta_1' z_1}{\sqrt 2\,\sigma}\right).$$

We have available data of the form

(X₁, z_{11}), (X₂, z_{12}), ..., (X_m, z_{1m}),
(Y₁, z_{21}), (Y₂, z_{22}), ..., (Y_n, z_{2n}).

Given the covariate values z₁₀, z₂₀, we note that

$$\bar y + \hat\beta_2'(z_{20} - \bar z_2) - \bar x - \hat\beta_1'(z_{10} - \bar z_1)$$

is normally distributed with mean α₂ + β₂′z₂₀ − α₁ − β₁′z₁₀ and standard deviation
c₀σ, where

$$c_0^2 = \frac{1}{m} + \frac{1}{n} + (z_{10} - \bar z_1)'\left[\sum_{j=1}^{m}(z_{1j} - \bar z_1)(z_{1j} - \bar z_1)'\right]^{-1}(z_{10} - \bar z_1) + (z_{20} - \bar z_2)'\left[\sum_{j=1}^{n}(z_{2j} - \bar z_2)(z_{2j} - \bar z_2)'\right]^{-1}(z_{20} - \bar z_2). \qquad (5.8)$$

Here β̂₁ and β̂₂ are the least squares estimators. Also

$$(m + n - q_1 - q_2 - 2)\,s^2 = \sum_{j=1}^{m}(x_j - \bar x - \hat\beta_1'(z_{1j} - \bar z_1))^2 + \sum_{j=1}^{n}(y_j - \bar y - \hat\beta_2'(z_{2j} - \bar z_2))^2$$

is independently distributed as σ²χ²_{m+n−2−q₁−q₂}. Consequently,

$$T = \frac{\bar y - \bar x + \hat\beta_2'(z_{20} - \bar z_2) - \hat\beta_1'(z_{10} - \bar z_1)}{c_0 s}$$

has a non-central t-distribution with m + n − 2 − q₁ − q₂ degrees of freedom and
noncentrality parameter

$$\eta = \frac{\alpha_2 + \beta_2' z_{20} - \alpha_1 - \beta_1' z_{10}}{c_0 \sigma}\,.$$

A lower 95% confidence bound, η̲, is obtained by solving F_{η̲}(T_obs) = 0.95 for
η̲. Consequently, a 95% lower confidence bound for R(z₁₀, z₂₀) is given by

$$\underline R(z_{10}, z_{20}) = \Phi(c_0 \underline\eta/\sqrt 2). \qquad (5.9)$$


Guttman, Johnson, Bhattacharyya and Reiser (1988) discuss the unequal variance case.

6. Bayesian inference procedures

Given the random sample X₁, ..., X_m from f(· | θ₁) and an independent random sample Y₁, ..., Y_n from g(· | θ₂), together with a prior density p(θ₁, θ₂), in principle one
can obtain the posterior distribution

$$h(\theta_1, \theta_2 \mid x_1, \ldots, x_m, y_1, \ldots, y_n) \propto p(\theta_1, \theta_2)\prod_{i=1}^{m} f(x_i \mid \theta_1)\prod_{j=1}^{n} g(y_j \mid \theta_2) \qquad (6.1)$$

for (θ₁, θ₂). This distribution could then be transformed to the posterior distribution of R_{θ₁,θ₂} = ∫F(y | θ₁)dG(y | θ₂). Enis and Geisser (1971) obtained analytical
results for negative exponential distributions and for normal distributions with
equal variances.

6.1. Bayesian analysis with exponential distributions

Enis and Geisser (1971) assume that the negative exponential scale parameters
θ₁ and θ₂ are independent, a priori. In particular they make the choice of conjugate prior distributions

$$p_1(\theta_1) \propto \theta_1^{-s_1-1}\,e^{-c_1/\theta_1}, \qquad p_2(\theta_2) \propto \theta_2^{-s_2-1}\,e^{-c_2/\theta_2}. \qquad (6.2)$$

Combining these with the likelihood (3.12) of the samples of sizes m and n, we obtain the
joint posterior density

$$h(\theta_1, \theta_2 \mid \bar x, \bar y) \propto \left(\frac{1}{\theta_1}\right)^{m+s_1+1} e^{-(c_1 + m\bar x)/\theta_1}\left(\frac{1}{\theta_2}\right)^{n+s_2+1} e^{-(c_2 + n\bar y)/\theta_2}. \qquad (6.3)$$

Transforming to r = θ₂/(θ₁ + θ₂) and v = θ₁⁻¹ + θ₂⁻¹ and then integrating out v
produces the marginal posterior distribution of R,

$$h(r) \propto r^{m+s_1-1}(1 - r)^{n+s_2-1}(1 - cr)^{-(m+n+s_1+s_2)} \qquad (6.4)$$

where

$$c = \frac{c_2 - c_1 + n\bar y - m\bar x}{c_2 + n\bar y} < 1. \qquad (6.5)$$

The transformed variable ρ = (1 − r)/(1 − cr) has a beta distribution with parameters n + s₂ and m + s₁, so

$$P[\underline r < R] = P\!\left[\rho < \frac{1 - \underline r}{1 - c\underline r}\right] = \frac{1}{B(n + s_2, m + s_1)}\int_0^{(1 - \underline r)/(1 - c\underline r)} u^{n+s_2-1}(1 - u)^{m+s_1-1}\,du. \qquad (6.6)$$
A 100(1 − α)% Bayesian lower bound on R is given by

$$\underline r = \frac{1 - \eta_{1-\alpha}}{1 - c\,\eta_{1-\alpha}} \qquad (6.7)$$

where η_{1−α} is the 100(1 − α)-th percentile of the beta distribution with parameters
n + s₂ and m + s₁. Comparing (6.7) with the alternative form for the bound below
(3.13), we see that the choice of 'informationless' priors, s₁ = s₂ = 0 and
c₁ = c₂ = 0, leads to the same bound as the classical procedure.
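The Bayesian bound (6.7) requires only a beta percentile; a minimal sketch (with s₁ = s₂ = 0 and c₁ = c₂ = 0 it reproduces the classical bound, as noted above):

```python
import numpy as np
from scipy.stats import beta

def bayes_exponential_bound(x, y, s1=0, s2=0, c1=0.0, c2=0.0, alpha=0.05):
    """100(1-alpha)% Bayesian lower bound (6.7) under the priors (6.2)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    m, n = len(x), len(y)
    c = (c2 - c1 + n * y.mean() - m * x.mean()) / (c2 + n * y.mean())
    eta = beta.ppf(1 - alpha, n + s2, m + s1)   # 100(1-alpha)-th percentile
    return (1 - eta) / (1 - c * eta)
```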

6.2. Bayesian analysis with normal distributions

For the case of independent samples, Enis and Geisser (1971) restrict their
treatment to normal populations with equal variances. They employ the conjugate
prior

$$p(\mu_1, \mu_2, \sigma) \propto \sigma^{-(b+3)}\exp\left\{-\frac{1}{2\sigma^2}\left(b c_0 + c_1(\mu_1 - m_1)^2 + c_2(\mu_2 - m_2)^2\right)\right\} \qquad (6.8)$$

where b, c₀, c₁, c₂ > 0. The likelihood is

$$\frac{1}{(2\pi)^{(m+n)/2}}\,\frac{1}{\sigma^{m+n}}\exp\left\{-\frac{1}{2\sigma^2}\left[(m + n - 2)s_p^2 + m(\mu_1 - \bar x)^2 + n(\mu_2 - \bar y)^2\right]\right\}. \qquad (6.9)$$

Since the reliability R = Φ(δ) where δ = (μ₂ − μ₁)/(√2 σ), it is convenient to
determine the joint posterior distribution of δ and of a quadratic form divided by σ².
The joint posterior density for (μ₁, μ₂, σ) can then be written as

$$h(\mu_1, \mu_2, \sigma) \propto \sigma^{-(b+1+m+n)}\exp\left\{-\frac{1}{2\sigma^2}\left[b c_0 + (m+n-2)s_p^2 + \frac{m c_1(\bar x - m_1)^2}{m + c_1} + \frac{n c_2(\bar y - m_2)^2}{n + c_2}\right]\right\}$$
$$\times\;\sigma^{-1}\exp\left\{-\frac{c_1 + m}{2\sigma^2}\left(\mu_1 - \frac{m\bar x + c_1 m_1}{c_1 + m}\right)^2\right\}\;\times\;\sigma^{-1}\exp\left\{-\frac{c_2 + n}{2\sigma^2}\left(\mu_2 - \frac{n\bar y + c_2 m_2}{c_2 + n}\right)^2\right\}. \qquad (6.10)$$

From (6.10) it is readily seen that, a posteriori,

$$\delta \mid \sigma \;\sim\; N\!\left(\frac{1}{\sqrt 2\,\sigma}\left[\frac{n\bar y + c_2 m_2}{n + c_2} - \frac{m\bar x + c_1 m_1}{m + c_1}\right],\;\frac{1}{2}\left(\frac{1}{m + c_1} + \frac{1}{n + c_2}\right)\right)$$

independent of

$$z = \frac{v}{\sigma^2}\,, \qquad v = (m+n-2)s_p^2 + b c_0 + \frac{m c_1(\bar x - m_1)^2}{m + c_1} + \frac{n c_2(\bar y - m_2)^2}{n + c_2}\,,$$

which is distributed as χ²_{m+n+b}. Setting

$$c = \frac{1}{2}\left(\frac{1}{m + c_1} + \frac{1}{n + c_2}\right), \qquad k = \frac{1}{\sqrt{2v}}\left[\frac{n\bar y + c_2 m_2}{n + c_2} - \frac{m\bar x + c_1 m_1}{m + c_1}\right],$$

the joint posterior distribution of δ, z is

$$\frac{1}{\sqrt{2\pi c}}\,e^{-(\delta - k\sqrt z)^2/2c}\;\frac{z^{(m+n+b)/2 - 1}\,e^{-z/2}}{2^{(m+n+b)/2}\,\Gamma\!\left(\frac{m+n+b}{2}\right)}$$

and the marginal posterior distribution of δ is given by

$$h(\delta \mid x_1, \ldots, x_m, y_1, \ldots, y_n) \propto (c^{-1}k^2 + 1)^{-(m+n+b)/2}\,e^{-\delta^2/2c}\sum_{j=0}^{\infty}\left[\sqrt 2\,c^{-1}k\delta\,(1 + c^{-1}k^2)^{-1/2}\right]^j\,\frac{\Gamma\!\left(\tfrac12(m + n + b + j)\right)}{\Gamma(j + 1)}\,. \qquad (6.11)$$

Although expression (6.11) is rather formidable, lower bounds on δ can be
obtained via one-dimensional numerical integration.
In addition to their usual interpretation as information from earlier samples,
some guidance in the choice of the prior parameters is provided by the prior expected value

$$E_p[R] = P\!\left[t_b < \frac{m_2 - m_1}{\sqrt{c_0\left(2 + \frac{1}{c_1} + \frac{1}{c_2}\right)}}\right]$$

where t_b is a Student-t variable with b degrees of freedom.
Enis and Geisser (1971) show that the choice of a vague prior ∝ σ⁻¹ produces
a posterior distribution whose expectation E(R) is closer to 0.5 than is the
maximum likelihood estimator. Finally, it should be remarked that they treat the
slightly more general case of estimating P[a₁X₁ + a₂X₂ + ··· + a_pX_p > 0] and
that one of their formulations includes paired stress-strength data.

6.3. The Bayesian stress-strength model in risk analysis

The stress-strength reliability model is also an integral part of many risk


analyses. At the component level, for instance, it may be necessary to make an
assessment of the reliability of a motor operated value in a nuclear power plant.
This application of the stress-strength model has one dominant feature. Little or
no data are available on either the critical stress or even on the strength of the
component.
With regard to estimating the strength distribution, one method is to gather expert opinions from several persons. The elicited information could be in the form of percentiles such as the 10-th, 50-th and 90-th percentiles. A lognormal, or other distribution, could be fit to each person's percentiles. These must then be combined, possibly in a weighted fit, to provide an estimated strength distribution. Estimation of the stress distribution is usually approached via mathematical models which convert phenomena like earthquake magnitude to the stress on a component located at a given site. Random quantities, like ground motion from the earthquake and parameters of the structure housing the component, are then introduced. The resulting process is studied by simulation to produce an

estimated stress distribution. The component reliability, R = P[Y > X], given an earthquake, can then be estimated using the estimated stress and strength distributions determined above. Mensing (1984) provided the following example.

EXAMPLE 6.1. One important component in the operation of a nuclear power plant is the steam generator. In a study of the risk of a nuclear power plant due to earthquakes, it is necessary to assess the ability of the generator to withstand the stresses imposed by the ground motion due to an earthquake. Almost no data exist for estimating the strength of steam generators with respect to ground motion, so expert opinions were elicited. It was first determined that the most likely cause of generator failure would be failure of its supports. Five experts were asked to estimate the 10-th, 50-th and 90-th percentiles for the strength of the steam generator supports. Their responses are summarized in Table 5.1, where the strength variable is the peak acceleration in ft/sec².

Table 5.1
Expert opinions concerning percentiles of the strength distributions (ft/sec²)

              Percentile
Expert    10-th    50-th    90-th
1         80.71    96.85    103.31
2         77.48    83.94     96.85
3         29.06    59.72     96.85
4         19.37    29.06     48.43
5         32.28    43.58     61.34

Assuming the strength of the generator supports can be approximated by a lognormal distribution, a weighted least squares procedure was used to estimate the mean, θ, and standard deviation, σ, of the natural logarithm of the strength distribution. The resulting estimates are θ̂ = 4.06 and σ̂ = 0.29.
Mathematical modeling can be used to estimate the distribution of stress at the base of the steam generator. Suppose the stress distribution is modeled by a lognormal distribution where the natural logarithm of stress has mean θ_s = 2.32 ft/sec² and standard deviation σ_s = 0.40 ft/sec². Then it is clear that the reliability of the steam generator is nearly 1.0. Specifically, R = P[ln Y > ln X] = Φ(3.52) = 0.99978.
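The arithmetic of this example is easy to reproduce: for lognormal stress X and strength Y, R = P[ln Y > ln X] = Φ((θ - θ_s)/√(σ² + σ_s²)). A minimal check using the estimates above:

```python
# Check of Example 6.1: lognormal stress-strength reliability
# R = Phi((theta_Y - theta_X) / sqrt(sigma_Y^2 + sigma_X^2)).
from math import sqrt
from scipy.stats import norm

theta_y, sigma_y = 4.06, 0.29   # log-strength estimates (fit to Table 5.1)
theta_x, sigma_x = 2.32, 0.40   # log-stress model

z = (theta_y - theta_x) / sqrt(sigma_y**2 + sigma_x**2)
print(round(z, 2), norm.cdf(z))   # ~3.52 and ~0.99978, as in the text
```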

Since the primary source of information about the random variation in stress and strength is expert opinion and engineering judgement, it is a more difficult problem to obtain lower bounds for R. In the context of the nuclear power plant, the lower bound on R converts to an upper bound on the probability of failure and subsequent radioactive release. Some attempts have been made to quantify the uncertainty experts have in formulating their opinions and to use this quantified uncertainty to develop bounds for the probability of failure. See Bohn et al. (1983) for more information.
A risk analysis of a system is considerably more complicated than one for a single component. With a nuclear power plant, failure can occur in numerous ways. From a fault-tree analysis, each separate failure path is determined. Data are typically available on some component strengths, but it is mostly expert opinion that must be combined in order to obtain an estimate of the failure path probabilities and, ultimately, the system reliability. The calculation of an estimate of system reliability can involve as many as 300 to 400 components, and the probability of an accident sequence is calculated from, say, a multivariate normal distribution. In this setting it is possible to include a stress, such as an earthquake, as a common stress to numerous components.

References

Basu, A. (1980). The estimation of P[X < Y] for distributions useful in life testing. Naval Res. Log. Quart. 3, 383-392.
Basu, A. and Ebrahimi, N. (1983). On computing the reliability of stochastic systems. Statistics and Probability Letters 1, 265-268.
Bhattacharyya, G. K. (1977). Reliability estimation from survivor count data in a stress-strength setting. IAPQR Transactions--Journal of the Indian Association for Productivity, Quality and Reliability 2, 1-15.
Bhattacharyya, G. K. and Johnson, R. A. (1974). Estimation of reliability in a multicomponent stress-strength model. J. Amer. Statist. Assoc. 69, 966-970.
Bhattacharyya, G. K. and Johnson, R. A. (1975). Stress-strength models for system reliability. Proc. Symp. on Reliability and Fault-tree Analysis, SIAM, 509-532.
Bhattacharyya, G. K. and Johnson, R. A. (1977). Estimation of system reliability by nonparametric techniques. Bulletin of the Mathematical Society of Greece (Memorial Volume), 94-105.
Bhattacharyya, G. K. and Johnson, R. A. (1981). Stress-strength models for reliability: Overview and recent advances. Proc. 26th Design of Experiments Conference, 531-546.
Bhattacharyya, G. K. and Johnson, R. A. (1983). Some reliability concepts useful in materials testing. Reliability in the Acquisitions Process. Marcel Dekker, New York, 115-131.
Birnbaum, Z. W. (1956). On a use of the Mann-Whitney statistic. Proc. Third Berkeley Symp. Math. Statist. Prob. 1, 13-17.
Birnbaum, Z. W. and McCarthy, R. C. (1958). A distribution-free upper confidence bound for P(Y < X) based on independent samples of X and Y. Ann. Math. Statist. 29, 558-562.
Bohn, M. P. et al. (1983). Application of the SSMRP methodology to the seismic risk at the Zion Nuclear Power Plant. NUREG/CR-3428, Nuclear Regulatory Commission, Nov.
Chandra, S. and Owen, D. B. (1975). On estimating the reliability of a component subject to several different stresses (strengths). Naval Res. Log. Quart. 22, 31-40.
Church, J. D. and Harris, B. (1970). The estimation of reliability from stress-strength relationships. Technometrics 12, 49-54.
Downton, F. (1973). The estimation of P(Y < X) in the normal case. Technometrics 15, 551-558.
Easterling, R. (1972). Approximate confidence limits for system reliability. J. Amer. Statist. Assoc. 67, 220-222.
Enis, P. and Geisser, S. (1971). Estimation of the probability that Y < X. J. Amer. Statist. Assoc. 66, 162-168.
Govindarajulu, Z. (1967). Two-sided confidence limits for P(Y < X) for normal samples of X and Y. Sankhyā B 29, 35-40.
Govindarajulu, Z. (1968). Distribution-free confidence bounds for P(X < Y). Ann. Inst. Statist. Math. 20, 229-238.
Guttman, I., Johnson, R. A., Bhattacharyya, G. K. and Reiser, B. (1988). Confidence limits for stress-strength models with explanatory variables. Technometrics (in press).
Kececioglu, D. (1972). Reliability analysis of mechanical components and systems. Nuclear Eng. Des. 9, 257-290.
Lehmann, E. (1959). Testing Statistical Hypotheses. Wiley, New York.
Lloyd, D. K. and Lipow, M. (1962). Reliability: Management, Methods and Mathematics. Prentice-Hall, Englewood Cliffs, NJ.
Madansky, A. (1965). Approximate confidence limits for the reliability of series and parallel systems. Technometrics 7, 495-503.
Mazumdar, M. (1970). Some estimates of reliability using interference theory. Naval Res. Log. Quart. 17, 159-165.
McCarthy, J. F. and Orringer, O. (1975). Some approaches to assessing failure probabilities of redundant structures. Composite Reliability, ASTM STP 580, American Society for Testing and Materials, 5-31.
Mensing, R. (1984). Personal communication.
Myhre, J. M. and Saunders, S. C. (1968). On confidence limits for the reliability of systems. Ann. Math. Statist. 39, 1463-1472.
Owen, D. B., Craswell, K. J. and Hanson, D. L. (1964). Nonparametric upper confidence bounds for P(Y < X) and confidence limits for P(Y < X) when X and Y are normal. J. Amer. Statist. Assoc. 59, 906-924.
Sen, P. K. (1967). A note on asymptotically distribution-free confidence bounds for P(X < Y) based on two independent samples. Sankhyā A 29, 95-102.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 55-72

Approximate Computation of Power Generating


System Reliability Indexes

M. Mazumdar

1. Introduction

An electric power system is a massive energy conversion and transmission facility. Its function is to convert chemical, nuclear, or kinetic potential into a more useful electrical potential and to transmit electrical energy to its consumers. Power systems tend to have generation concentrated in specific locations, whereas demand is spread over a large geographic region. The problem of providing power to widely scattered demands from remote generating stations is solved by the electric utility companies through a three-tiered system. Elements of this system are: the power generation subsystem, the transmission subsystem, and the distribution subsystem. In the power generation system, electric power is produced from a number of different types of generating plants (fossil, nuclear, hydroelectric, etc.). Transmission systems carry large amounts of power over long distances at a high voltage level. From the transmission sources, distribution systems carry the load to a service area by forming a fine network.
The reliability of an electric power system has been defined as the probability of providing the user with continuous service of satisfactory quality [8]. By satisfactory quality, it is meant that the frequency and the voltage of the power supply remain within prescribed bounds. There are several reasons why reliability is very important to the electric power industry. The public has grown accustomed to a very reliable supply of electricity, and it would not accept lower standards. The occurrence of power failure is expensive to the customer as well as to the utilities. The social costs of power failure have also been well documented. There has been increasing concern in recent years about the risks to public health and safety associated with the different energy sources that are used to produce electricity. In relation to nuclear power, these risks are largely contingent on the probability and severity of infrequent system failures. Therefore, reliability considerations have come to play a major role in the planning, design, operation and maintenance of electric power plants. To achieve a high degree of reliability at the customer's level, it is necessary that each of the three components of the power system--generation, transmission and distribution--provide an even higher degree of reliability.

The performance of electric power systems is influenced by a large number of random phenomena. First, the demand for electric power has a large stochastic component, which is strongly influenced by weather. The outdoor equipment, such as transmission lines, is subject to damage from natural causes, e.g., storms, lightning and floods, as well as to inadvertent man- and animal-caused damage. The equipment used to generate and transmit electricity fails randomly. The time to restore the failed equipment is also a random variable. It is thus necessary to construct probability models which can be used to predict the performance of the power systems as they are influenced by these random variables. These probability models are used to compute standard reliability parameters such as mean time to failure, availability, etc., as well as reliability indexes which are special to the electric utility industry. Concurrently, one needs to pay attention to proper collection and analysis of 'outage' data so that one has appropriate confidence in the output of these reliability studies.
Early studies in power system reliability evaluation were confined to the determination of generating system reserve capacity. Only comparatively recently have such studies been extended to cover the transmission and distribution systems. Consequently, the state of the art with regard to generation reliability models is much more advanced than that for transmission and distribution systems.
Reliability models play an important role in the determination of the required installed generation reserves of a given electric utility company, and in its long-term planning for generation capacity expansion. The quantity determined here is
the amount of installed reserve capacity required such that the probability of
load-loss does not exceed a prescribed small amount. These studies thus help the
planner in scheduling generating unit additions as the load grows over time. Such
models also play an important role in the evaluation of expected generation
system production costs. An electric utility system typically consists of many
generating units of different capacities, availabilities and operating costs. Not all
the units within a system experience equal utilization over time because of such
differences, and units with high running costs are pressed into operation only if the load is high and/or the cheaper units have failed at the time. Therefore, the
computation of expected overall production costs of a utility system needs to
account for the stochastic characteristics of the generating unit failures and the
system load. Power generation reliability models include only the generating units
within a given system, and the rest of the system is assumed to be perfectly
reliable. Thus, according to these models, a system failure occurs when the total
power generated by the system falls short of the system load. In order to contrast
the available system capacity with the demand, two sets of models are required,
one for the states (e.g., failed and non-failed) of the generating units, and the other
for load variations in a given system. When one combines these two stochastic
models, one arrives at an overall system model whose solution provides the
required reliability indexes which can then be used as engineering tools for plan-
ning and operating decisions.
In this paper, we will confine ourselves to an examination of the computational
aspects of two important power generation system reliability indexes which are

used in power system planning and production costing evaluation. In particular, we provide additional results on the use of an effective approximation scheme which was proposed in a recent paper [14]. This scheme uses a transformation proposed by Esscher [10] for computing actuarial risks. Section 2 describes the reliability models used for the power generating system in connection with the determination of the risk due to load loss and the expected production cost. We define here the reliability indexes of interest, and point out the difficulties in their computation. Section 3 gives a description of Esscher's approximation method as well as that of a more common approximation known in the recent power system literature as the method of cumulants [15]. We derive here the necessary formulas in connection with the application of Esscher's method. Section 4 provides numerical estimates of the accuracy of this method by applying it to several prototype systems. Section 5 states the conclusions.
For a more detailed discussion of generation reliability models and their uses,
the reader is referred to the monographs by Billinton [3], Billinton, Ringlee and
Wood [4], Endrenyi [8], and Sullivan [16].

2. Generating system model and the reliability indexes

Generating unit representation


It is assumed that the generating system under consideration is composed of n independent generating units. That is, each can fail, and be repaired, independently of the failures and repairs of the other units. This is usually a realistic assumption except when a single boiler supplies steam to several turbogenerators through a common header. In the course of their operation, generating units may suffer complete failures or partial failures, where they lose a part of their capacity. If a simple two-state model is assumed for the generating unit, such that it alternates between two operating states, 'up' (operating) or 'down' (under repair), a measure of the unit performance is given by its unavailability Ā, which is defined as follows [2]:

Ā = Mean downtime / (Mean downtime + Mean uptime).   (1)

The index Ā measures the fraction of the time that a unit is unavailable for service during periods when it is not on planned outage. Endrenyi [8] has shown that this index is meaningful even when maintenance lasting short lengths of time is involved, provided that maintenance itself does not contribute to failure.
In the power system vocabulary, the term used for unavailability is the forced outage rate (FOR), which unfortunately is a misnomer, since the index represents a pure number and not a rate. The FOR is defined as

FOR = Forced outage time / (Forced outage time + In-service time),   (2)

where the times appearing in the numerator and denominator refer to a reasonably long period of observation. The index (2) is equivalent to (1) when the period in question is long enough. The above definition of the forced outage rate or the unavailability assumes that the generating unit has only two states--operating at 100% capacity or completely failed. The intermediate capacity states are usually accounted for by defining an index called the equivalent forced outage rate (EFOR), which is given by the following equation:

EFOR = (FOH + EFOH) / (SH + FOH),

where FOH, EFOH and SH denote respectively full forced outage hours, equivalent forced outage hours and service hours. The quantity, equivalent forced outage hours, is obtained by multiplying the actual partial outage hours for each partial outage event by the fractional capacity reduction and then summing the products. The introduction of the index EFOR enables one to approximate a unit with several capacity states by one having only two states. In this two-state equivalent representation, the index EFOR estimates the long-run probability that the unit is fully out, and the quantity (1 - EFOR) estimates the long-run probability that it is available at full capacity. Data on EFOR are presented for a variety of sizes and types of generating units in reports published by the Edison Electric Institute; see, for example, [7].
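As a small numerical illustration of the EFOR definition, the sketch below derates each partial outage by its fractional capacity reduction; all the hour counts are hypothetical.

```python
# Sketch: equivalent forced outage rate, EFOR = (FOH + EFOH)/(SH + FOH).
# EFOH weights each partial-outage duration by its fractional capacity loss.
# All hour counts below are hypothetical.
partial_outages = [(120.0, 0.25), (40.0, 0.50)]   # (hours, fraction of capacity lost)

FOH = 300.0                                       # full forced outage hours
SH = 7000.0                                       # service hours
EFOH = sum(h * frac for h, frac in partial_outages)

EFOR = (FOH + EFOH) / (SH + FOH)
print(EFOH, round(EFOR, 4))   # 50.0 equivalent hours; EFOR ~ 0.0479
```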

Load models

An hourly load duration curve is obtained by first plotting on the vertical axis the power demand forecasted for each hour in a planning period, in chronological order along the horizontal axis. The load duration curve (LDC) is then obtained by reordering the demands in descending order of magnitude. Here, the length of time during which the load exceeds a given value is plotted as the abscissa, with the forecasted load value as the ordinate. Assume that the forecasted peak demand occurs for one hour during each of the days in a 20-day planning period. Then one can say that the peak load occurred in a fraction equal to 1/24 of the planning period. Figure 1 shows that the system load was expected to be above 100 MW during 50% of the time. When the abscissa is normalized to 1, the figure can be read to denote the fraction of the time the load is expected to be above a given value. Thus it is possible to give a probabilistic interpretation to the load duration curve. The horizontal axis of the curve yields the survivor function of the load when it is treated as a random variable. It gives the probability that the observed load will exceed a specified value as denoted by the ordinate.
In some studies on generation reliability, notably when unit production costs
are evaluated, it is a practice to merge the individual generating unit failure models
and the load probability distribution by defining the so-called equivalent load
Approximate computation of power generating system reliability indexes 59

150 MW-

100 MW

75 MW

0 240 480 (in hours)

Fig. 1. H o u r l y l o a d d u r a t i o n curve: An example (abscissa normalized to 1 for LOLP calculations).

duration curve, abbreviated as E L D C [15]. This definition rests on the observa-


tion that the outages caused by plant unreliability can be thought of as additions
to the true load on the system. Suppose that all n units within a given system are
candidates for operation to meet a given load, L. Then

Available capacity = c 1 Jr- c 2 --1- • • • --1- C n - ( X 1 "Jr"X 2 -1- " ' " "~ X n "1- L ) ,

where ci is the installed capacity of unit i, and X; is the capacity on outage for
unit i. Notice that the quantity, (X 1 + X 2 + • • • + X, + L), plays the role of an
equivalent load that confronts the n units of the system. A curve which shows the
proportion of times that the observed equivalent load will exceed given specified
values is called the equivalent load duration curve (ELDC). It is clear from the
foregoing discussion that separate sets of such curves can be drawn for all the
n individual units of the system.

Loss of load probability index

Two different sets of generation reliability indexes are used by the electric utility industry--one in the context of long-range planning and the other for short-term operational planning. The former provides inputs to decisions in generation expansion planning and the scheduling of new unit additions. The latter indexes are of use to the operating engineer in the daily operation of a power system. The loss of load probability (LOLP) index is used in the long-range planning context, and it measures the probability that a given system's available capacity is insufficient to meet the system peak load on a given day. It estimates the fraction of time the utility system will have a generation deficit, with no consideration given to the magnitude of the deficit.
Consider a system consisting of n units such that the installed capacity of unit i is c_i and its (equivalent) forced outage rate is p_i, i = 1, 2, ..., n. Define X_i as the unavailable capacity or the capacity on outage for unit i on a given day. We

assume that X_1, X_2, ..., X_n are independent random variables. Thus X_i is a random variable with the two-point distribution

X_i = c_i  with probability p_i,
    = 0   with probability 1 - p_i.   (3)

Let L denote the system peak load. Then the loss-of-load probability (LOLP) index is measured by

LOLP = Pr{X_1 + X_2 + ... + X_n > c_1 + c_2 + ... + c_n - L}.   (4)

In the situation where the LOLP index is being estimated for future time periods, as is typically done in power generation planning, the forecasted peak load will be uncertain and regarded as a random variable. We usually regard L as normally distributed with mean μ and variance σ², its distribution being independent of the X_i random variables. If the peak load is regarded as known, σ² = 0 and L = μ; but otherwise, σ² > 0, and departures from normality may also be anticipated. Let Y denote the deviation of the peak load from its mean μ. Then we can also express (4) as follows:

LOLP = Pr{X_1 + X_2 + ... + X_n + Y > z},   (5)

where z = c_1 + c_2 + ... + c_n - μ, and Y is normally distributed with mean 0 and variance σ². The electric utilities in the United States plan their operation so as to meet a targeted value of the LOLP index of the order of 10^{-4}. Thus, the LOLP measure represents the extreme tail probability in the distribution of X_1 + X_2 + ... + X_n.
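For moderate n, the enumeration implicit in (4) can be organized as a discrete convolution on a capacity grid, which is how 'exact' benchmark values such as those in Section 4 can be produced. A minimal sketch, with a hypothetical subset of units and the peak load treated as known:

```python
# Sketch: exact LOLP of (4) with a known peak load, obtained by convolving
# the two-point outage distributions of (3) on a 1 MW grid. Unit data are
# hypothetical, for illustration only.
import numpy as np

def lolp_exact(caps, fors, margin_z):
    """Pr{X1+...+Xn > z}, where X_i = c_i w.p. p_i, else 0 (integer MW)."""
    dist = np.array([1.0])                   # point mass at 0 MW on outage
    for c, p in zip(caps, fors):
        new = np.zeros(len(dist) + c)
        new[:len(dist)] += dist * (1.0 - p)  # unit available
        new[c:] += dist * p                  # unit on outage: shift by c
        dist = new
    outage = np.arange(len(dist))
    return dist[outage > margin_z].sum()

caps = [12]*5 + [20]*4 + [50]*6              # illustrative subset of units
fors = [0.02]*5 + [0.10]*4 + [0.01]*6
print(lolp_exact(caps, fors, margin_z=100))
```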

Production costing index

For the evaluation of the expected operating costs of a given utility, we assume, somewhat simplifying the real-life situation, that (a) there are n units in the system, (b) the units are brought into operation in accordance with some specified measure of economy of operation (e.g., marginal cost), and (c) the unit i, in decreasing order of economy of operation, has capacity c_i and EFOR p_i, i = 1, 2, ..., n. Let U denote the system load, and let F̄(x) = Pr{U > x}. Thus F̄(x) represents the load-duration curve or LDC.
Consider now the i-th unit in the loading order and let W_i denote the energy unserved (i.e., the unmet demand) after it has been loaded. Let, as before, X_i denote the unavailable capacity for unit i, whose probability distribution is given by (3), and let U denote the system load. We define

C_i = c_1 + c_2 + ... + c_i,   i = 1, 2, ..., n,   (6)

Z_i = U + X_1 + X_2 + ... + X_i,   i = 1, 2, ..., n.   (7)

Thus, Z_i represents the equivalent load on the first i units. Let g_i(·) and G_i(·) denote the probability density and distribution functions, respectively, of Z_i. Clearly,

W_i = 0            if Z_i ≤ C_i,
    = Z_i - C_i    if Z_i > C_i.   (8)

Thus,

E(W_i) = ∫_{C_i}^∞ (z - C_i) g_i(z) dz.   (9)

Now denote by e_i the energy produced by unit i. Then it follows from (9) that

E(e_i) = E(W_{i-1}) - E(W_i)
       = ∫_{C_{i-1}}^∞ (z - C_{i-1}) g_{i-1}(z) dz - ∫_{C_i}^∞ (z - C_i) g_i(z) dz
       = ∫_{C_{i-1}}^∞ Ḡ_{i-1}(z) dz - ∫_{C_i}^∞ Ḡ_i(z) dz
       = (1 - p_i) ∫_{C_{i-1}}^{C_i} Ḡ_{i-1}(z) dz,   i = 1, 2, ..., n,   (10)

where Ḡ_i(z) = 1 - G_i(z), i = 1, 2, ..., n, and Ḡ_0(z) = F̄(z). In the above, we interpret C_0 = 0. The development of (10) is due to Baleriaux et al. [1]. We define the capacity factor for unit i to be

CF(i) = E(e_i)/c_i,   i = 1, 2, ..., n.   (11)

This index gives the ratio of the expected output to the maximum possible output for each unit. An accurate estimate of this index is needed by the utilities for the purposes of evaluating expected system operating costs and optimizing generation planning.
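Equation (10) translates directly into the standard recursive computation of the Baleriaux type: carry the survivor function of the equivalent load forward one unit at a time, integrating Ḡ_{i-1} between C_{i-1} and C_i. A discretized sketch follows; the two-unit system and the toy LDC are hypothetical, and the integral is taken by a simple rectangle rule on a 1 MW grid.

```python
# Sketch: E(e_i) = (1 - p_i) * integral_{C_{i-1}}^{C_i} Gbar_{i-1}(z) dz,
# eq. (10), and CF(i) = E(e_i)/c_i, eq. (11). System data are hypothetical.
import numpy as np

grid = np.arange(0, 401)                                  # 1 MW grid
load_survivor = np.clip(1.0 - grid / 200.0, 0.0, 1.0)     # toy LDC: Fbar(x)

units = [(100, 0.05), (100, 0.10)]        # (capacity c_i, EFOR p_i), loading order
gbar = load_survivor.copy()               # Gbar_0 = Fbar
C_prev, results = 0, []
for c, p in units:
    C_next = C_prev + c
    energy = (1.0 - p) * gbar[C_prev:C_next].sum()        # rectangle rule
    results.append(energy / c)                            # capacity factor
    # equivalent-load update: Gbar_i(z) = p*Gbar_{i-1}(z-c) + (1-p)*Gbar_{i-1}(z)
    shifted = np.concatenate([np.ones(c), gbar[:-c]])
    gbar = p * shifted + (1.0 - p) * gbar
    C_prev = C_next

print([round(cf, 3) for cf in results])
```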

Computational difficulties

In its planning process, a given utility needs to compute the LOLP and CF indexes for various combinations of the system load and mix of generating units. Thus it is necessary that an inexpensive method of computation be used for the purpose of computing these indexes. Examining (4), we observe that when the c_i's and the p_i's are all different, at least 2^n arithmetic operations will be required to evaluate one LOLP index. Thus, the total number of arithmetic operations in the computation of one LOLP index varies exponentially with the number of generating units in the system, and it might become prohibitively large for large values of n. From (10), we observe that the expected energy output of a given unit is proportional to an average LOLP value over a range of z between C_{i-1} and C_i. Thus, it is not feasible for a power system planner to engage in a direct computation of (4) or (10), and he has to resort to approximations which require much less computer time.

3. Approximate procedures

Method of cumulants

From an uncritical application of the central limit theorem, one could make the convenient assumption that the distribution of X_1 + X_2 + ... + X_n in (5) or the survivor function Ḡ_{i-1}(x) in (10) will be approximately normal. While this assumption may not cause problems when computing probabilities near the central region of the probability distribution, the 'tail' probabilities may be inaccurately estimated. A typical generation mix within a given utility usually contains several large units and otherwise mostly small units, thus violating the spirit of the Lindeberg condition [11] of the central limit theorem. An approach to the problem of near-normality is that of making small corrections to the normal distribution approximation by using asymptotic expansions (Edgeworth or Gram-Charlier) based on the central limit theorem. Use of these expansions in evaluating power generating system reliability indexes has come to be known in the recent power-system literature as the method of cumulants. For details on the use of these expansions in the computation of LOLP, see [13], and for their use in computing the capacity factor index, see [5]. In the evaluation of LOLP, one first obtains the cumulants of X_1 + X_2 + ... + X_n + Y by summing the corresponding cumulants of the X_i's and of Y. These are then used in the appropriate Edgeworth or Gram-Charlier expansion. Similarly, for the purpose of evaluating E(e_i) in (10), one first obtains the cumulants of Z_i for each i = 1, 2, ..., n, by summing up the cumulants of X_1, X_2, ..., X_i and U. Next, one writes the formal expansion for Ḡ_i(x) using these cumulants up to a given order. One then integrates the series term by term in (10) to obtain an approximation for E(e_i). Caramanis et al. [5] have made a detailed investigation of this approximation in the computation of the capacity factor indexes. Their results have cast favorable light on the efficiency of this method.
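A sketch of the cumulant bookkeeping for the LOLP case: the cumulants of the two-point variable X_i in (3) are those of a scaled Bernoulli variable, they add across independent units (the normal load deviation Y contributes only to the second cumulant), and an Edgeworth-corrected normal tail gives the estimate. The unit data and margin below are hypothetical.

```python
# Sketch: method of cumulants for the LOLP of (5). Cumulants of each
# two-point outage variable (3) are summed across units (and the load),
# then an Edgeworth-corrected normal tail is evaluated. Data hypothetical.
import numpy as np
from scipy.stats import norm

c = np.array([12.0]*5 + [20.0]*4 + [50.0]*6)   # unit capacities (MW)
p = np.array([0.02]*5 + [0.10]*4 + [0.01]*6)   # unit FOR's
sigma_load = 10.0                              # peak-load std dev (MW)

q = 1 - p                                      # scaled-Bernoulli cumulants
k1 = np.sum(c * p)
k2 = np.sum(c**2 * p * q) + sigma_load**2      # normal Y adds only to k2
k3 = np.sum(c**3 * p * q * (1 - 2 * p))
k4 = np.sum(c**4 * p * q * (1 - 6 * p * q))

z = 150.0                                      # margin z of eq. (5)
x = (z - k1) / np.sqrt(k2)
g1, g2 = k3 / k2**1.5, k4 / k2**2
tail = (norm.sf(x) + norm.pdf(x) * (g1 / 6 * (x**2 - 1)
        + g2 / 24 * (x**3 - 3 * x) + g1**2 / 72 * (x**5 - 10 * x**3 + 15 * x)))
print(tail)
```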

Esscher's approximation: Computation of LOLP

We illustrate this method first with respect to the computation of LOLP. We assume that the peak load is non-random and known, i.e., σ = 0. As demonstrated in [14], this is the worst case for the peak load distribution insofar as the relative accuracy of the different approximation methods is concerned. We use the symbols F_i and F* to denote the distribution functions of the random variables X_i and X_1 + X_2 + ... + X_n, respectively. The moment generating functions of F_i and F* are respectively given by

F̂_i(s) = e^{s c_i} p_i + (1 - p_i),   (12)

F̂*(s) = Π_{i=1}^n F̂_i(s) ≡ e^{ψ(s)}, say.   (13)

In order to provide a notation which covers continuous as well as discrete variables, we use the symbol F(dx) to denote the 'density' of the distribution function F(·) (see Feller [11, p. 139] for a mathematical explanation of the symbol F(dx)). We now define, for some s,

V_i(dx) = e^{sx} F_i(dx) / F̂_i(s).   (14)

Further, let V* denote the convolution of V_1, V_2, ..., V_n. With these definitions, it is seen that the LOLP index may be expressed as follows:

LOLP = ∫_z^∞ F*(dx) = F̂*(s) ∫_z^∞ e^{-sx} V*(dx).   (15)

We now choose s such that z equals the mean of V*(·). Thus, although in practical applications z will lie in the right hand tail of F*(·), it will now be at the center of the d.f. V*(·). We also expect the distribution V*(·) to be much closer to the normal distribution in the central portion of the distribution (much more so as compared to the tails). Thus, in the second integral of (15), the integration is being done in a region where V*(·) can be accurately approximated by a normal distribution or an expansion of the Edgeworth type. The effect of the multiplier e^{-sx} for s > 0 is to de-emphasize the contribution of V*(dx) for values of x in the tail. Esscher's approximation technique consists in replacing V*(dx) by an appropriate normal distribution or an Edgeworth expansion, and evaluating (15).
It can be shown [9] that, corresponding to a given s, the first four cumulants of V*(·) are given by

ψ'(s) = Σ_{i=1}^n p_i c_i / [p_i + (1 - p_i) e^{-s c_i}],   (16a)

ψ''(s) = Σ_{i=1}^n p_i (1 - p_i) c_i² e^{-s c_i} / [p_i + (1 - p_i) e^{-s c_i}]²,   (16b)

ψ'''(s) = Σ_{i=1}^n c_i³ p_i (1 - p_i) e^{-s c_i} [-p_i + (1 - p_i) e^{-s c_i}] / [p_i + (1 - p_i) e^{-s c_i}]³,   (16c)

ψ^{(4)}(s) = Σ_{i=1}^n c_i⁴ p_i (1 - p_i) e^{-s c_i} [p_i² - 4 p_i (1 - p_i) e^{-s c_i} + (1 - p_i)² e^{-2s c_i}] / [p_i + (1 - p_i) e^{-s c_i}]⁴.   (16d)

In applying Esscher's approximation, we first solve the equation (in s)

ψ'(s) = z.   (17)

Call this unique root s_0. We now replace V*(dx) in (15) by the normal density or an appropriate Edgeworth expansion. For a random variable X whose first four cumulants are k_1, k_2, k_3 and k_4, its density F(dx) is approximated by the Edgeworth expansion [6] formula as follows:

F(dx) ≈ (1/k_2^{1/2}) [φ(t) - (γ_1/6) φ^{(3)}(t) + (γ_2/24) φ^{(4)}(t) + (γ_1²/72) φ^{(6)}(t)] dx,   (18)

where

φ(t) = (1/√(2π)) e^{-t²/2},   φ^{(k)}(t) = d^k φ(t)/dt^k,   t = (x - k_1)/k_2^{1/2},

γ_1 = k_3/k_2^{3/2},   γ_2 = k_4/k_2².
Now, if we replace V*(dx) in (15) by the appropriate normal and Edgeworth expansions (18) using first and second order terms, the following formulas result:

LOLP = Pr{X_1 + X_2 + ... + X_n > z}
     ≈ LOLP_1 = e^{ψ(s_0) - s_0 z} E_0(u),   (19a)

LOLP_2 = LOLP_1 [1 - (γ_1/6) v],   (19b)

LOLP_3 = LOLP_2 + LOLP_1 [(γ_2/24) u v + (γ_1²/72) u³ v],   (19c)

where

u = s_0 √(ψ''(s_0)),   E_0(u) = e^{u²/2} [1 - Φ(u)]   (Φ(u) = ∫_{-∞}^u φ(t) dt),

γ_1 = ψ'''(s_0)/[ψ''(s_0)]^{3/2},   γ_2 = ψ^{(4)}(s_0)/[ψ''(s_0)]²,   w = √(2π) E_0(u),

v = u³ - (u² - 1)/w,   ψ(s_0) = log_e F̂*(s_0).
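The whole procedure (solve (17) for s_0 and then apply (19a)-(19c)) is compact enough to sketch directly. The formulas used are the reconstructed (16)-(19); the bisection bracket and the unit data are hypothetical, and the stabilized form ψ(s) = Σ[s c_i + log(p_i + (1 - p_i)e^{-s c_i})] is used to avoid overflow.

```python
# Sketch of Esscher's approximation for LOLP: solve psi'(s0) = z by
# bisection (17), then apply (19a)-(19c). Unit data are hypothetical.
import numpy as np
from scipy.stats import norm

def psi_derivs(s, c, p):
    d = p + (1 - p) * np.exp(-s * c)                 # common denominator, eq. (16)
    psi = np.sum(s * c + np.log(d))                  # stable form of log F*(s)
    d1 = np.sum(p * c / d)
    d2 = np.sum(p * (1 - p) * c**2 * np.exp(-s * c) / d**2)
    d3 = np.sum(c**3 * p * (1 - p) * np.exp(-s * c)
                * (-p + (1 - p) * np.exp(-s * c)) / d**3)
    d4 = np.sum(c**4 * p * (1 - p) * np.exp(-s * c)
                * (p**2 - 4 * p * (1 - p) * np.exp(-s * c)
                   + (1 - p)**2 * np.exp(-2 * s * c)) / d**4)
    return psi, d1, d2, d3, d4

def esscher_lolp(c, p, z, s_hi=50.0):
    lo, hi = 0.0, s_hi                               # requires psi'(0) < z < psi'(s_hi)
    for _ in range(100):                             # bisection for eq. (17)
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if psi_derivs(mid, c, p)[1] < z else (lo, mid)
    s0 = 0.5 * (lo + hi)
    psi, _, d2, d3, d4 = psi_derivs(s0, c, p)
    u = s0 * np.sqrt(d2)
    E0 = np.exp(u**2 / 2) * norm.sf(u)               # E0(u) = e^{u^2/2}[1 - Phi(u)]
    g1, g2 = d3 / d2**1.5, d4 / d2**2
    w = np.sqrt(2 * np.pi) * E0
    v = u**3 - (u**2 - 1) / w
    lolp1 = np.exp(psi - s0 * z) * E0                # (19a)
    lolp2 = lolp1 * (1 - g1 / 6 * v)                 # (19b)
    lolp3 = lolp2 + lolp1 * (g2 / 24 * u * v + g1**2 / 72 * u**3 * v)  # (19c)
    return lolp1, lolp2, lolp3

c = np.array([12.0]*5 + [20.0]*4 + [50.0]*6)         # illustrative units
p = np.array([0.02]*5 + [0.10]*4 + [0.01]*6)
print(esscher_lolp(c, p, z=150.0))
```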

Esscher's approximation: Computation of unit capacity factors

A typical load distribution curve is multimodal, and it cannot be approximated by a standard distribution. For the purpose of applying the present approximation, we discretize the load-duration curve into a representation having probability masses at a given number (say, m) of discrete points. That is, we obtain a discrete approximation of the load duration curve where the load points are l_1, l_2, ..., l_m with the corresponding probabilities r_1, r_2, ..., r_m, where r_j = Pr{U = l_j}. With this approximation, one can evaluate Ḡ_{i-1}(x) in (10) as follows:

Ḡ_{i-1}(x) = Pr{U + X_1 + X_2 + ... + X_{i-1} > x}
           = Σ_{j=1}^m Pr{X_1 + X_2 + ... + X_{i-1} > z_j} r_j,   (20)

where z_j = x - l_j. The expression Pr{X_1 + X_2 + ... + X_{i-1} > z_j} can be evaluated using the formulas given in (19).
It can be seen from (16b) that ψ'(s) is an increasing function of s, and we have defined s_0 to be the root of the equation ψ'(s) = z. From (16a), we observe that ψ'(0) = Σ_{i=1}^n c_i p_i. Thus, in (20), if z_j < E[X_1 + X_2 + ... + X_{i-1}], s_0(z_j) will be negative. Now consider equation (15). If s_0 < 0, the effect of the multiplier e^{-sx} is to amplify the error in the approximation of V*(dx) for large x--a clearly undesirable situation. Thus, it appears appropriate in this situation to express

Pr{X_1 + X_2 + ... + X_{i-1} > z_j} = 1 - Pr{X_1 + X_2 + ... + X_{i-1} ≤ z_j},   (21)

and use Esscher's method on the right hand side of (21). We define the complementary quantity

LOLP̄ = Pr{X_1 + X_2 + ... + X_n ≤ z}.   (22)

Corresponding to (19), we obtain the following approximations for LOLP̄:

LOLP̄ ≈ LOLP̄_1 = e^{ψ(s_0) - s_0 z} Ē_0(u),   (23a)

LOLP̄_2 = LOLP̄_1 [1 - (γ_1/6) v'],   (23b)

LOLP̄_3 = LOLP̄_2 + LOLP̄_1 [(γ_2/24) u v' + (γ_1²/72) u³ v'],   (23c)

where

Ē_0(u) = e^{u²/2} Φ(u),   w' = -√(2π) Ē_0(u),   v' = u³ - (u² - 1)/w'.

For the purpose of evaluating E(e_i) in (10), the integration can be done using an appropriate numerical integration routine after evaluating Ḡ_{i-1}(x) at as many points as the quadrature formula requires. In the numerical work reported in Section 4, we used the trapezoidal rule for numerical integration.

4. Numerical results

This section applies the formulas obtained in the preceding section to two prototype systems. System A is the prototype generating system provided by the Reliability Test System Task Force of the IEEE Power Engineering Application of Probabilistic Methods Subcommittee [12]. Table 1 gives the assumed generation mix of the 32 units comprising the system--their installed capacities and FOR's.

Table 1
Unit power ratings for a prototype generating system, and their assumed FOR's (System A)

Unit size (MW)   Number of units   Forced outage rate
 12               5                0.02
 20               4                0.10
 50               6                0.01
 76               4                0.02
100               3                0.04
155               4                0.04
197               3                0.05
350               1                0.08
400               2                0.12
Total            32

Table 2 provides a comparison of the estimated LOLP corresponding to different values of the system margin obtained with the use of Esscher's approximation formulas (19) and the method of cumulants. For the latter method, we use the Edgeworth expansion formula keeping terms up to the first four cumulants only. Usually, such expansions are sufficient to provide close enough approximations in cases where the use of such an expansion is appropriate. We also display in this table the exact LOLP values for benchmarking and comparison. Figure 2 shows the percentage relative errors resulting from using the two approximations for a wide range of values of the system margin.

Table 2
Comparison of algorithms for LOLP estimation (System A). An entry a (-b) denotes a × 10^{-b}.

z (MW)   Exact value ᵃ   (19a)        (19b)        (19c)        Cumulants
500      1.23 (-1)       1.35 (-1)    1.26 (-1)    1.23 (-1)    1.19 (-1)
600      6.21 (-2)       7.75 (-2)    7.45 (-2)    7.22 (-2)    6.30 (-2)
700      4.25 (-2)       4.23 (-2)    4.16 (-2)    4.03 (-2)    3.48 (-2)
800      2.47 (-2)       2.19 (-2)    2.19 (-2)    2.13 (-2)    2.18 (-2)
900      1.16 (-2)       1.07 (-2)    1.09 (-2)    1.06 (-2)    1.34 (-2)
1000     4.34 (-3)       4.94 (-3)    5.07 (-3)    4.95 (-3)    6.82 (-3)
1100     2.35 (-3)       2.13 (-3)    2.20 (-3)    2.16 (-3)    2.74 (-3)
1200     7.91 (-4)       8.51 (-4)    8.85 (-4)    8.75 (-4)    8.57 (-4)
1300     4.01 (-4)       3.12 (-4)    3.24 (-4)    3.24 (-4)    2.10 (-4)
1400     1.02 (-4)       1.03 (-4)    1.07 (-4)    1.08 (-4)    4.07 (-5)
1500     4.04 (-5)       3.01 (-5)    3.11 (-5)    3.16 (-5)    6.29 (-6)
1600     8.06 (-6)       7.69 (-6)    7.85 (-6)    7.99 (-6)    7.78 (-7)
1700     1.58 (-6)       1.69 (-6)    1.72 (-6)    1.73 (-6)    7.77 (-8)
1800     2.91 (-7)       3.19 (-7)    3.23 (-7)    3.23 (-7)    6.28 (-9)
1900     4.69 (-8)       5.16 (-8)    5.23 (-8)    5.21 (-8)    4.12 (-10)
2000     7.25 (-9)       7.07 (-9)    7.08 (-9)    6.90 (-9)    2.21 (-11)
2100     8.43 (-10)      8.29 (-10)   8.49 (-10)   8.49 (-10)   9.63 (-13)
2200     9.27 (-11)      8.29 (-11)   9.98 (-11)   1.45 (-10)   3.44 (-14)
2300     7.97 (-12)      6.27 (-12)   5.54 (-12)   1.68 (-12)   0

ᵃ Excerpted from [12].

Table 2 and Figure 2 impress one with the accuracy of Esscher's approximation in the region of our interest, i.e., for values of LOLP in the range between 10^{-3} and 10^{-5} and beyond. There is very little difference between the three formulas, and perhaps formula (19b) represents the overall best choice. The cumulants method does not fare too badly in the probability range 10^{-1} to 10^{-3}; but below this range, the Esscher approximations appear to be decidedly superior to the method of cumulants. Similar comparisons for several other systems are given in a research report [9]. The results of this report, as well as those given in [14], show that Esscher's method, while very accurate, is also speedy enough to be adopted in routine utility practice.
For the purpose of evaluating the accuracy of Esscher's approximation in providing production costing expressions, we use the data provided by Caramanis et al. [5] with respect to a second synthetic system, referred to as the EPRI system D. Tables 3 and 4 give respectively the capacity mix of the system with the associated FOR's and the load duration curve. Table 5 gives the derived probability distribution (l_j, r_j) obtained from Table 4. Here, l_j is the interval midpoint for the j-th load class interval in Table 4, and r_j is the associated probability mass obtained by differencing. Table 6 gives the estimated capacity factors using the three versions of Esscher's approximation based on the normal,

Fig. 2. Graph of relative error for the Esscher and cumulants approximations for LOLP (System A); the plotted symbols distinguish equations (19a), (19b), (19c) and the method of cumulants.

Table 3
EPRI system D. Unit power ratings in loading order

Power rating (MW)   No. of units   Availability ᵃ
1200                 6             0.85320
 800                 1             0.85320
 800                 2             0.75910
 600                 6             0.78750
 400                 7             0.87420
 200                56             0.92564
  50                96             0.76000

ᵃ Availability = 1 - FOR.

first and second order Edgeworth expansions. These estimates are compared with a numerical analytic algorithm (denoted by SC-16), which is considered as an industry benchmark, and P3, an algorithm based on the method of cumulants.

Table 4
EPRI system D. Description of the LDC

Load (MW) x   Load duration value F̄(x)   Load (MW) x   Load duration value F̄(x)
    0.0       1.000000                    18944.0       0.469293
12288.0       1.000000                    19456.0       0.475412
12800.0       0.974227                    19968.0       0.445531
13312.0       0.962347                    20480.0       0.409888
13824.0       0.911022                    20992.0       0.390267
14336.0       0.855419                    21504.0       0.350484
14848.0       0.804035                    22016.0       0.320782
15360.0       0.752947                    22528.0       0.291080
15872.0       0.677207                    23040.0       0.243557
16384.0       0.635624                    23552.0       0.190093
16896.0       0.570880                    24064.0       0.124749
17408.0       0.522756                    24576.0       0.071285
17920.0       0.493054                    25088.0       0.035643
18432.0       0.475233                    25600.0       0.0

Table 5
Discrete version of LDC (EPRI system D)

Interval (j)   Load (MW) (l_j)   Probability (r_j)   Interval (j)   Load (MW) (l_j)   Probability (r_j)
 1             12 544            0.025773            14             19 200            0.011881
 2             13 056            0.011880            15             19 712            0.011881
 3             13 568            0.051325            16             20 224            0.035643
 4             14 080            0.055603            17             20 736            0.017821
 5             14 592            0.051384            18             21 248            0.041583
 6             15 104            0.051088            19             21 760            0.029702
 7             15 616            0.075740            20             22 272            0.029702
 8             16 128            0.041583            21             22 784            0.047523
 9             16 640            0.064744            22             23 296            0.053464
10             17 152            0.048124            23             23 808            0.065344
11             17 664            0.029702            24             24 320            0.053464
12             18 176            0.017821            25             24 832            0.035642
13             18 688            0.005940            26             25 344            0.035643

The latter two algorithms are considered to be the best in their respective categories by Caramanis et al. [5].
When one regards the values provided by SC-16 as benchmark values, as Caramanis et al. [5] do, one observes that Esscher's method provides excellent approximations to the capacity factors for each unit in the loading order of EPRI System D. Especially, the LD-2 and LD-3 approximations uniformly outperform

Table 6
Comparison of algorithms for capacity factors (EPRI system D)

                              Esscher's approximation
Unit no.   SC-16 ᵃ   P3 ᵃ     LD-1    LD-2    LD-3 ᵇ
1-7        0.853     0.853    0.853   0.853   0.853
8-9        0.759     0.759    0.759   0.759   0.759
10-13      0.788     0.787    0.788   0.788   0.788
14         0.787     0.787    0.787   0.787   0.787
15         0.786     0.787    0.786   0.786   0.786
16         0.870     0.874    0.871   0.870   0.870
17         0.866     0.874    0.868   0.866   0.866
18         0.861     0.864    0.862   0.861   0.861
19         0.852     0.844    0.854   0.852   0.852
20         0.841     0.831    0.844   0.841   0.841
21         0.827     0.816    0.831   0.827   0.827
22         0.809     0.799    0.815   0.810   0.810
23-28      0.809     0.800    0.816   0.809   0.809
29         0.758     0.753    0.766   0.759   0.759
30-33      0.717     0.716    0.726   0.718   0.718
34-39      0.634     0.641    0.642   0.634   0.633
40         0.578     0.590    0.585   0.578   0.578
41-44      0.544     0.557    0.549   0.544   0.544
45-49      0.492     0.503    0.496   0.492   0.492
50         0.463     0.471    0.467   0.464   0.463
51-55      0.439     0.442    0.442   0.439   0.439
56-59      0.403     0.399    0.406   0.403   0.403
60         0.382     0.376    0.386   0.382   0.382
61-67      0.344     0.334    0.349   0.344   0.344
68-69      0.294     0.283    0.300   0.295   0.295
70         0.276     0.264    0.281   0.276   0.276
71-78      0.213     0.207    0.220   0.213   0.213
79-89      0.116     0.118    0.121   0.116   0.116
90         0.102     0.106    0.107   0.102   0.102
91-99      0.092     0.097    0.096   0.092   0.092
100        0.082     0.088    0.086   0.082   0.081
101-102    0.079     0.085    0.083   0.079   0.078
103-109    0.070     0.078    0.074   0.070   0.070
110        0.063     0.071    0.067   0.063   0.063
111-114    0.059     0.067    0.062   0.059   0.059
115-125    0.048     0.056    0.050   0.048   0.048
126        0.040     0.048    0.042   0.040   0.040
127-131    0.036     0.044    0.038   0.036   0.036
132        0.033     0.041    0.034   0.033   0.033
133-138    0.029     0.036    0.031   0.029   0.029
139        0.026     0.033    0.027   0.026   0.026
140        0.025     0.032    0.026   0.025   0.025
141-150    0.021     0.026    0.021   0.020   0.020
151-159    0.014     0.019    0.014   0.014   0.014
160        0.012     0.015    0.012   0.012   0.011
161-162    0.011     0.014    0.011   0.011   0.011
163-164    0.010     0.013    0.011   0.011   0.011

Table 6 (continued)

                              Esscher's approximation
Unit no.   SC-16 ᵃ   P3 ᵃ     LD-1    LD-2    LD-3 ᵇ
165        0.009     0.012    0.010   0.009   0.009
166        0.009     0.012    0.009   0.009   0.009
167        0.009     0.011    0.009   0.009   0.009
168        0.008     0.010    0.008   0.008   0.008
169        0.008     0.010    0.008   0.008   0.008
170        0.008     0.009    0.008   0.008   0.007
171        0.007     0.009    0.007   0.007   0.007
172        0.007     0.006    0.007   0.007   0.007
173        0.007     0.000    0.007   0.007   0.007
174        0.006     0.000    0.006   0.006   0.006

ᵃ Excerpted from [5].
ᵇ LD-1, LD-2, LD-3: Esscher's approximations using normal and first and second order Edgeworth expansions.

the method of cumulants. We conjecture that the performance of the Esscher


approximation will be more convincingly superior to the method of cumulants for
systems with lower unit FOR values.

5. Summary and conclusions

Reliability of the electrical power supply is of utmost importance to the public. To ensure an adequate and reliable power supply, the electric power industry spends considerable effort in long-term generation planning. In this connection, several reliability indexes are used by power system planners. The loss-of-load probability (LOLP) index for a power generating system measures the probability that the system load exceeds its available capacity. Direct numerical computation of this index proves infeasible, and one needs to resort to approximate methods. We adapt an approximation scheme proposed by Esscher in an actuarial context for evaluating the LOLP index. Numerical results given in this article demonstrate that this approximation is very accurate.
A second problem considered is estimating the capacity factors of the various units, which experience different rates of utilization within the system. These indexes are used to determine the expected operating costs of an electric utility company. The computation of these indexes involves difficulties similar to those for LOLP. Here also, for a typical system evaluated, Esscher's method provides very accurate results.

References

[1] Baleriaux, H., Jamoulle, E. and Linard de Guertechin, Fr. (1967). Simulation de l'exploitation d'un parc de machines thermiques de production d'électricité couplé à des stations de pompage. Revue E (SRBE ed.) 5, 3-24.
[2] Barlow, R. E. and Proschan, F. (1975). Statistical Theory of Reliability and Life Testing: Probability Models. Holt, Rinehart and Winston, New York.
[3] Billinton, R. (1970). Power System Reliability Evaluation. Gordon and Breach, New York.
[4] Billinton, R., Ringlee, R. J. and Wood, A. J. (1973). Power System Reliability Calculations. MIT Press, Cambridge, MA.
[5] Caramanis, M., Stremmel, J. V., Fleck, W. and Daniel, S. (1983). Probabilistic production costing. International Journal of Electrical Power and Energy Systems 5, 75-86.
[6] Cramér, H. (1946). Mathematical Methods of Statistics. Princeton University Press, Princeton, NJ.
[7] EEI Equipment Availability Task Force (1976). Report on equipment availability for the ten-year period, 1966-1975. Edison Electric Institute, New York.
[8] Endrenyi, J. (1978). Reliability Modeling in Electric Power Systems. Wiley, New York.
[9] Electric Power Research Institute (1985). Large-deviation approximation to computation of generating-system reliability and production costs. EPRI EL-4567, Palo Alto, CA.
[10] Esscher, F. (1932). On the probability function in the collective theory of risk. Skandinavisk Aktuarietidskrift 15, 175-195.
[11] Feller, W. (1971). An Introduction to Probability Theory and its Applications, Vol. II, 2nd ed. Wiley, New York.
[12] IEEE reliability test system (1979). A report prepared by the Reliability Test System Task Force of the Application of Probability Methods Subcommittee. IEEE Transactions on Power Apparatus and Systems 98, 2047-2064.
[13] Levy, D. J. and Kahn, E. P. (1982). Accuracy of the Edgeworth expansion of LOLP calculations in small power systems. IEEE Transactions on Power Apparatus and Systems 101, 986-994.
[14] Mazumdar, M. and Gaver, D. P. (1984). On the computation of power-generating system reliability indexes. Technometrics 26, 173-185.
[15] Stremmel, J. P., Jenkins, R. T., Babb, R. A. and Bayless, W. D. (1980). Production costing using the cumulant method of representing the equivalent load curve. IEEE Transactions on Power Apparatus and Systems 99, 1947-1953.
[16] Sullivan, R. L. (1976). Power Systems Planning. McGraw-Hill, New York.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 73-98

Software Reliability Models

Thomas A. Mazzuchi and Nozer D. Singpurwalla

1. Introduction

In the past ten years or so, there has been considerable effort in what has been
termed software reliability modeling. The generally accepted definition of software
reliability is 'the probability of failure-free operation of a computer program in a
specified environment for a specified period of time' (Musa and Okumoto, 1982).
This area has begun to receive much attention for several reasons. Today the computer is used in many vital areas where a failure could have costly, even catastrophic, consequences. Due to recent advances in hardware modeling and technology, the main cause of computer system failure is now likely to lie in the software sector. At the other end of the spectrum, software production is costly and time consuming, with much of the time and cost being devoted to testing, correcting and retesting the software. The software producer needs to know the benefits of testing and must be able to present some tangible evidence of software quality.
The issues concerning the quality and performance of software which are of
interest to the statistician (see Barlow and Singpurwalla, 1985) are:
(1) The quantification and measurement of software reliability.
(2) The assessment of the changes in software reliability over time.
(3) The analysis of software failure data.
(4) The decision of whether to continue or stop testing the software.
The problem of software reliability is different from that of hardware reliability for several reasons. The cause of software failures is human error, not mechanical or electrical imperfection, or the wearing of components. Also, once all the errors are removed, the software is 100% reliable and will continue to be so. Furthermore, unlike hardware errors, there is no process which generates failures. Rather, software 'bugs' which are in the program due to human error are uncovered by certain program inputs, and it is these inputs which are randomly generated as part of some operational environment.

Research supported by Contract N00014-85-K-0202, Project NR 347-128-410, Office of Naval


Research and Grant DAAG 29-84-K-0160, U.S. Army Research Office.


A more formal discussion of the software failure process is given in Musa and Okumoto (1982). A computer program is a 'set of complete instructions (operations with operands specified) that executes within a single computer some major function', and undergoes several runs, where a run is associated with 'the accomplishment of a user function'. Each run is characterized by its input variables, which are 'any data element that exists external to the run and is used by the run or any externally-initiated interrupt'. The environment of a computer program is the complete set of input variables for each run and the probability of occurrence of each input during operation. A failure is 'a departure of program operation from program requirements' and is usually described in terms of the output variables, which are 'any data element that exists external to the run and is set by the run or any interrupt generated by the run and intended for use by another program'. A fault or bug is the 'defect in implementation that is associated with a failure'. The 'act or set of acts of omission or commission by an implementor or implementors that results in a fault' is an error.
For a more in-depth treatment of software terminology the reader is referred to Musa and Okumoto (1982). For further clarification of the types of software errors and their causes, see Amster and Shooman (1975).
Software reliability models may be classified by their attributes (Musa and Okumoto, 1982; Shanthikumar, 1983) or by the phase of the software life cycle where they may be used (Ramamoorthy and Bastani, 1982). The latter approach will be used here. There are four main phases of the software life cycle: the testing and debugging phase, the validation phase, the operational phase, and the maintenance phase. Currently no models exist for use in the maintenance phase, and thus this phase will not be discussed.

2. Models of the testing and debugging phase

In the testing and debugging phase the software is tested for errors. In this phase an attempt is made to correct any bugs which are discovered. The discovery of a software bug is a function of its size (the probability that an input will result in the bug's discovery) and the testing intensity, which reflects the way in which inputs are selected. Another issue in this phase is the treatment of the error correction process. The simpler models assume all errors are corrected with certainty and without introducing new errors, while others account for imperfect debugging. Models in this phase may be classified into two main categories: error counting models and non-error counting models. Models may be further classified by their approach (Bayesian or classical), their treatment of the effect of error removal on reliability (deterministic or stochastic) and their consideration of the time it takes to find and fix software bugs.

2.1. Error counting models

Error counting models are based on the assumption that
(A1) The failure rate of the software at any point in time is a function of the residual number of errors in the program.
Thus effort centers around estimating the residual number of errors and using this to obtain other reliability measures. Furthermore, deterministic models are based on the assumption that
(A2) Conditional on the model parameters, the correction of an error results in a known improvement in the reliability of the software.
The simplest, most cited and most criticized model is that of Jelinski-Moranda (henceforth JM). In addition to (A1) and (A2) this model is based on the assumptions
(JM1) Each undetected error contributes an equal amount to the failure rate of the software, which is proportional to the number of remaining errors,
(JM2) Conditional on the model parameters, the times between successive software failures are independent,
(JM3) Once discovered, errors are removed in a minimal amount of time without introducing any new errors.
Given the above, the reliability function for T_i, the time between the (i - 1)st and ith failure, is given by

R(t_i | φ, N) = exp{-φ(N - i + 1) t_i}   for i = 1, ..., N,   (2.1.1)

where N is the initial number of software bugs and φ is the failure rate contribution of an individual error. This model may be used to make inferences about the software once estimates of N and φ are obtained. Given that the software is tested and n ≤ N software failures have occurred, the parameters N and φ can be obtained by maximum likelihood techniques. They are obtained as the simultaneous solution to

φ̂ = n / [N̂ T - Σ_{i=1}^n (i - 1) t_i],   (2.1.2a)

Σ_{i=1}^n 1/(N̂ - i + 1) = n / [N̂ - (1/T) Σ_{i=1}^n (i - 1) t_i],   (2.1.2b)

where T = Σ_{i=1}^n t_i.
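A short numerical sketch of the simultaneous solution of (2.1.2a)-(2.1.2b): scan integer values N ≥ n for a sign change of the defining equation (2.1.2b), then back out φ̂ from (2.1.2a). The interfailure times below are hypothetical.

```python
# Sketch: maximum likelihood solution of the JM equations (2.1.2a)-(2.1.2b).
# N is scanned over integers > n; the interfailure times are hypothetical.
t = [10.0, 14.0, 12.0, 18.0, 16.0, 20.0, 25.0, 19.0, 30.0, 32.0]  # t_1..t_n
n, T = len(t), sum(t)
S = sum((i - 1) * ti for i, ti in enumerate(t, start=1))          # sum (i-1)t_i

def g(N):
    """Left side minus right side of (2.1.2b); its root gives N-hat."""
    return sum(1.0 / (N - i + 1) for i in range(1, n + 1)) - n / (N - S / T)

# crude integer scan for a sign change of g (a finite root exists here)
N_hat = next(N for N in range(n + 1, 5000) if g(N) * g(N + 1) <= 0)
phi_hat = n / (N_hat * T - S)                                     # (2.1.2a)
print(N_hat, phi_hat)
```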


The authors note that the above model assumes equal amounts of testing in all periods. They suggest normalizing the time scale by using a time dependent parameter e(t), the exposure rate. This parameter would reflect the testing intensity at any time. The model could thus be modified by normalizing the time between failures as

t_i* = ∫_{t'_{i-1}}^{t'_i} e(u) du,

where t'_i is the time of the ith failure.


Shooman (1972) develops a model similar to JM and further elaborates on the
notion of 'testing intensity'. Shooman suggests treating the number of corrected
software bugs as a continuous function of debugging time, say e(z). The function
~(z) would relate the cumulative number of corrected errors/number of program
instructions/debugging time. Once e(v) was established, future software questions,
such as when to stop testing could be answered. Analogous to (2,1.1) the reliability
function for software which has undergone ~ months of debugging is

R ( t [ N , I, 8(z)) = exp { - C ( N / I - ~(z))t} (2.1.3)

where I is the total number of program instructions and C is an unknown


constant. In Shooman (1973) the function $\varepsilon(\tau)$ is defined simply as 'the total number of errors corrected by time $\tau$ normalized with respect to $I$'. This assumption essentially makes the Shooman and JM models different in notation only. Shooman (1973) and (1975), however, suggest a different technique for estimating $C$ (and thus $\varphi$) and $N$. The debugging process is divided into $k$ intervals of lengths $H_1, \ldots, H_k$. The end of the $i$th debugging interval is denoted $\tau_i$. In the $i$th interval $n_i$ failures are recorded but they are not fixed until the end of the interval. A similar approach is undertaken using the JM model in Lipow (1974). The parameters $C$ and $N$ may be obtained by the method of moments by choosing times $\tau_i < \tau_j$ such that $\varepsilon(\tau_i) < \varepsilon(\tau_j)$ and solving

$$\frac{H_i}{n_i} = \frac{1}{C[N/I - \varepsilon(\tau_i)]}\,, \qquad (2.1.4a)$$

$$\frac{H_j}{n_j} = \frac{1}{C[N/I - \varepsilon(\tau_j)]}\,, \qquad (2.1.4b)$$

or by using the method of maximum likelihood estimation and solving

$$\hat C = \frac{\sum_{i=1}^{k} n_i}{\sum_{j=1}^{k} [N/I - \varepsilon(\tau_j)]H_j}\,, \qquad (2.1.5a)$$

$$\hat C = \frac{\sum_{i=1}^{k} n_i/[N/I - \varepsilon(\tau_i)]}{\sum_{j=1}^{k} H_j}\,. \qquad (2.1.5b)$$

If large sample theory is applicable, asymptotic variances of the MLE's are obtained as

$$\mathrm{Var}(\hat C) = \frac{\hat C^2}{\sum_{i=1}^{k} n_i}\,, \qquad (2.1.6a)$$

$$\mathrm{Var}(\hat N) = \frac{I^2}{\sum_{i=1}^{k} n_i[\hat N/I - \varepsilon(\tau_i)]^{-2}}\,. \qquad (2.1.6b)$$
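The moment equations (2.1.4a)-(2.1.4b) are linear in $N/I$ and $1/C$ and so admit a closed-form solution. A minimal sketch follows, assuming two debugging intervals $i$ and $j$ with $\varepsilon(\tau_i) < \varepsilon(\tau_j)$; all names are illustrative:

```python
def shooman_moments(H_i, n_i, eps_i, H_j, n_j, eps_j, I):
    """Sketch: method-of-moments estimates (N, C) from (2.1.4a)-(2.1.4b).
    eps_i, eps_j are epsilon(tau_i), epsilon(tau_j)."""
    a_i, a_j = H_i / n_i, H_j / n_j          # mean time per failure in each interval
    # a_i * C * (N/I - eps_i) = 1 = a_j * C * (N/I - eps_j); solve for N/I, then C
    rho = (a_i * eps_i - a_j * eps_j) / (a_i - a_j)     # rho = N / I
    C = 1.0 / (a_i * (rho - eps_i))
    return rho * I, C
```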

Schick and Wolverton (1973) actually specify and incorporate a testing intensity in their modification of the JM model by assuming the failure rate of the software is a linear function of testing time. The resulting distribution for the interfailure times is the Rayleigh distribution

$$R(t_i \mid \varphi, N) = \exp\{-\varphi(N - i + 1)t_i^2/2\} \qquad (2.1.7)$$

and the resulting MLE's for $N$ and $\varphi$ are obtained by solving

$$\hat N = \frac{2n/\hat\varphi + \sum_{i=1}^{n}(i-1)t_i^2}{\sum_{i=1}^{n} t_i^2}\,, \qquad (2.1.8a)$$

$$\hat\varphi = \frac{\sum_{i=1}^{n} 2/(\hat N - i + 1)}{\sum_{i=1}^{n} t_i^2}\,. \qquad (2.1.8b)$$

Several alterations of this have appeared. Wagoner (1973) fit a Weibull distribution to software failure data using least squares estimation for the parameters. Lipow (1974) suggested using a linear term which would be a function of the most recent failure time. Schick and Wolverton (1978) discuss the use of a parabolic function to model testing intensity. Sukert (1976) also adapted the model to include the case of more than one failure occurring in a debugging interval.
Musa (1975) was the first to point out that software reliability models should
be based on execution time rather than calendar time. Musa's model is essentially
the same as the JM model but he attempts to model the debugging process in a
more realistic fashion. The model undergoes some alterations in Musa (1979).
Here, the expected net number of corrected software bugs is expressed as an
exponential function of execution time, and the fault correction occurrence rate is
assumed proportional to the failure occurrence rate.
The reliability of software tested for $\tau$ units of execution time is $R(t) = \exp\{-t/T\}$, where $T$, the mean time to failure (in execution time), is given by

$$T = T_0 \exp\!\left(\frac{C\tau}{M_0 T_0}\right). \qquad (2.1.9)$$

In the above, $T_0$ is the mean time to failure before debugging, $M_0$ is the total number of possible software failures in the maintained life of the software, and $C$ is a testing compression factor. The value $T_0$ is further expressed by $T_0 = 1/(fKN_0)$, where $f$ is a ratio of average instruction execution rate to the number of instructions, called the linear execution frequency, and $K$ is an error exposure ratio relating error exposure frequency to linear execution frequency. The value $N_0$ is the initial number of software errors in the program and is related to $M_0$ by $M_0 = N_0/B$. The parameter $B$ is called the fault reduction factor. This gives the model the additional characteristic of being able to handle the possibility of more than one error being found at one time or the possibility of imperfect debugging. The value $C$ is a ratio relating the rate of failures during testing to that during use.
From the parameter relationships, two central measures are obtained. The additional number of software errors which needs to be corrected to increase the mean time to failure for the program from $T_1$ to $T_2$ is given as

$$\Delta m = M_0 T_0\left(\frac{1}{T_1} - \frac{1}{T_2}\right) \qquad (2.1.10)$$

and the additional execution time required to increase the mean time to failure from $T_1$ to $T_2$ is given as

$$\Delta\tau = \frac{M_0 T_0}{C}\log(T_2/T_1)\,. \qquad (2.1.11)$$

Musa derives an execution to calendar time conversion by pointing out that testing time is a function of three limited resources: failure identification personnel (I), failure correction personnel (F) and computer time (C). The resource expenditure associated with a change in mean time to failure is approximated by

$$\Delta X_k = \theta_k \Delta\tau + \mu_k \Delta m \qquad (2.1.12)$$

for $k = I, F, C$, where $\Delta\tau$ and $\Delta m$ are the additional execution time needed and the additional errors corrected to bring about the change, and $\theta_k$ and $\mu_k$ are the average resource expenditure rates per unit execution time and per failure, respectively.
Assuming resources remain constant throughout testing, the testing phase may be divided into three distinct phases. In each phase only one of the resources is limiting and the other two are not fully utilized. Thus the additional calendar time required to increase the mean time to failure from $T_1$ to $T_2$ is given as

$$\Delta t = \sum_{k}\frac{M_0 T_0}{P_k \rho_k}\left[\mu_k\left(\frac{1}{T_{k_1}} - \frac{1}{T_{k_2}}\right) + \frac{\theta_k}{C}\log\!\left(\frac{T_{k_2}}{T_{k_1}}\right)\right] \qquad (2.1.13)$$

where $k = C, F, I$ corresponding to the appropriate resource limiting phase, $P_k$ is the amount of resource available, $\rho_k$ is the resource utilization factor, and $\theta_k$ and

$\mu_k$ are as previously defined. The quantities $T_{k_1}$ and $T_{k_2}$ are the mean times to failure at the boundaries of each resource limiting phase. These boundaries are at the present and desired mean time to failure and the transition points which lie in this range. The mean time to failure for a transition point is derived as

$$T_{kk'} = \frac{C\,[P_k\rho_k\mu_{k'} - P_{k'}\rho_{k'}\mu_k]}{[P_{k'}\rho_{k'}\theta_k - P_k\rho_k\theta_{k'}]} \qquad (2.1.14)$$

for $(k, k') = (C, F), (F, I), (I, C)$. Musa notes that it is generally true that $\theta_F = 0$ and $\rho_I = 1$, and discusses a method for obtaining $\rho_F$ by treating the failure correction process as a truncated M/M/$P_F$ queueing model.
Most of the parameters of Musa's model must be obtained from past data on similar projects. The parameters $M_0$ and $T_0$ (and thus $K$ and $N_0$) may be obtained by using maximum likelihood techniques. The MLE's are obtained by solving

$$T_0 = \frac{C}{n}\sum_{i=1}^{n}\left(1 - \frac{i-1}{M_0}\right)z_i\,, \qquad (2.1.15a)$$

$$\sum_{i=1}^{n}\frac{1}{M_0 - i + 1} = \frac{C}{M_0 T_0}\sum_{i=1}^{n} z_i\,, \qquad (2.1.15b)$$

where $z_i$, $i = 1, \ldots, n$, is the execution time between the $(i-1)$st and $i$th failure.


An exact expression for the variance of $\hat T_0$ is obtained as

$$\mathrm{Var}(\hat T_0) = \hat T_0^2/n\,, \qquad (2.1.16)$$

yielding a coefficient of variation of $1/n^{1/2}$. Though an exact expression for the variance of $\hat M_0$ is not available, confidence bounds for $M_0$ are obtained using Chebychev's inequality. Based on the distribution of the failure moment statistic $\gamma = M_0/n - 1/\Delta\psi$, where $\Delta\psi = \psi(M_0 + 1) - \psi(M_0 + 1 - n)$ and $\psi$ is the digamma function, a $(1-\alpha)\%$ confidence interval for $M_0$ is obtained by determining the values of $M_0$ which correspond to the values of $\gamma$ such that

$$\gamma = \hat\gamma \pm \mathrm{SD}(\hat\gamma) \qquad (2.1.17)$$

where $\hat\gamma$ is the observed value of the statistic and $\mathrm{SD}(\hat\gamma)$ is its standard deviation, which involves $\Delta\psi$, $(\Delta\psi)^2$ and the trigamma difference $\Delta\psi' = \psi'(M_0 + 1) - \psi'(M_0 + 1 - n)$, where $\psi'$ is the trigamma function.


The Musa model was one of the first to suggest that the number of software failures was governed by a Poisson distribution. Another model which adopted this approach was the Generalized Poisson Model (GPM) of Angus, Schafer and

Sukert (1980). This model is also based on the JM assumptions but includes the additional assumption that the severity of the testing process is proportional to an unknown power of elapsed test time. In the $i$th debugging time interval of length $H_i$, the number of errors observed, $N_i$, is given by a Poisson distribution with mean value $E[N_i] = \varphi(N - M_{i-1})H_i^{\alpha}$, where $M_{i-1}$ is the number of errors removed before the start of the $i$th debugging interval and $\alpha$ is an unknown constant. As in the debugging scenario of Shooman it is assumed that if bugs are corrected they are corrected at the conclusion of the debugging interval.
Parameters $N$, $\varphi$ and $\alpha$ may be obtained by solving the maximum likelihood equations. In Ramamoorthy and Bastani (1982) these are given for the case $H_i = t_i$, the time between the $(i-1)$st and $i$th failure, as

$$\sum_{i=1}^{n}\frac{1}{N - M_{i-1}} - \sum_{i=1}^{n}\varphi t_i^{\alpha} = 0\,, \qquad (2.1.18a)$$

$$\frac{n}{\alpha} + \sum_{i=1}^{n}\log t_i - \sum_{i=1}^{n}\varphi(N - M_{i-1})t_i^{\alpha}\log t_i = 0\,, \qquad (2.1.18b)$$

$$\frac{n}{\varphi} - \sum_{i=1}^{n}(N - M_{i-1})t_i^{\alpha} = 0\,. \qquad (2.1.18c)$$

The extra parameter gives the GPM flexibility but also difficulties in terms of parameter estimation. Once the parameter estimates are obtained they may be used with the model to draw conclusions regarding the software. One important expression obtained in Angus, Schafer and Sukert (1980) is the expected time until the removal of an additional $k \le N - M$ faults, given $M$ faults have already been removed. The expression is

$$\bar T_k = \hat\alpha^{-1}\Gamma(\hat\alpha^{-1})\sum_{i=M+1}^{M+k}\{\hat\varphi[\hat N - i + 1]\}^{-1/\hat\alpha} \qquad (2.1.19)$$

where $\Gamma(\cdot)$ is the gamma function and $\hat\alpha$, $\hat\varphi$, and $\hat N$ are the MLE's of $\alpha$, $\varphi$ and $N$. The use of least squares estimates is also discussed by the aforementioned authors.
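Once the MLEs are in hand, (2.1.19) is a direct computation. A small sketch follows; names are illustrative, and `math.gamma` supplies $\Gamma(\cdot)$:

```python
import math

def gpm_expected_removal_time(k, M, N_hat, phi_hat, alpha_hat):
    """Sketch of (2.1.19): expected time to remove k additional faults
    in the Generalized Poisson Model, given M faults already removed."""
    gamma_term = math.gamma(1.0 / alpha_hat) / alpha_hat   # alpha^{-1} Gamma(alpha^{-1})
    total = sum((phi_hat * (N_hat - i + 1)) ** (-1.0 / alpha_hat)
                for i in range(M + 1, M + k + 1))
    return gamma_term * total
```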
There has been much comparison and criticism of the early models in terms of their assumptions and their parameter estimation. (See for example Forman and Singpurwalla (1977), Schick and Wolverton (1978), Forman and Singpurwalla (1979), Sukert (1979), Musa (1979), Littlewood (1979), Littlewood (1980a), Littlewood (1980b), Angus, Schafer and Sukert (1980), Littlewood (1981a), Littlewood (1981b), Keiller, Littlewood, Miller and Sofer (1982), Musa and Okumoto (1982), Ramamoorthy and Bastani (1982), Stefanski (1982), Singpurwalla and Meinhold (1983), Langberg and Singpurwalla (1985).) The parameter estimation of the JM model has been most criticized. Forman and Singpurwalla (1977) and (1979), Littlewood and Verrall (1981) and Joe and Reid (1983) have all illustrated that the solution of the maximum likelihood equations for the JM model can produce unreasonably large, even non-finite, estimates for $N$. In Forman and Singpurwalla

(1977) the authors found that when n is small relative to N, the likelihood function
of N is very unstable and may not have a finite optimum. Littlewood and Verrall
(1981) found that the estimate of $N$ is finite if and only if

$$\frac{\sum_{i=1}^{n}(i-1)t_i}{\sum_{i=1}^{n}t_i} > \frac{\sum_{i=1}^{n}(i-1)}{n}\,. \qquad (2.1.20)$$

The authors note that violation of the above implies that no reliability growth is taking place as a result of the debugging process. In Joe and Reid (1983) the authors show that $\hat N$ is an unsatisfactory point estimate because its median is negatively biased and can be infinite with substantial probability. The authors advocate the use of likelihood interval estimates.
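Condition (2.1.20) is cheap to verify before attempting to solve the likelihood equations. A short check, offered as a sketch with illustrative names:

```python
import numpy as np

def jm_mle_is_finite(t):
    """Littlewood-Verrall condition (2.1.20): True iff the JM MLE of N
    is finite, i.e. iff the data exhibit reliability growth."""
    t = np.asarray(t, dtype=float)
    w = np.arange(len(t))                     # (i - 1), i = 1, ..., n
    return (w * t).sum() / t.sum() > w.mean()
```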
Forman and Singpurwalla (1977) and (1979) develop an estimation procedure to insure against unreasonably large estimates. They propose a stopping rule based on the comparison of the relative likelihood function for $N$ with the 'approximate normal relative likelihood' for $N$:

$$R_{\mathrm{normal}}(N) = \exp\{-\tfrac{1}{2}(N - \hat N)^2/\mathrm{Var}(\hat N)\} \qquad (2.1.21)$$

where

$$\mathrm{Var}(\hat N) = n\Bigg/\left[n\sum_{i=1}^{n}\left(\frac{1}{\hat N - i + 1}\right)^2 - \left(\sum_{i=1}^{n}\frac{1}{\hat N - i + 1}\right)^2\right].$$

The above function may be used to give an indication of the appropriateness of the large sample theory for estimating $N$. When appropriate, plots of the relative likelihood function and that of $R_{\mathrm{normal}}(N)$ compare favorably. Thus to get a meaningful estimate of $N$, the authors suggest the following stopping rule. After testing the software to $n$ failures:
(1) Compute $\hat N$, the MLE of $N$, using (2.1.2a) and (2.1.2b).
(2) If $\hat N \approx n$, go to step 3; if not, continue testing until another failure occurs and return to step 1.
(3) Compute the relative likelihood function for $N$ and compare it with $R_{\mathrm{normal}}(N)$. If plots of the two functions display a large discrepancy, this estimate is misleading; continue testing until another failure occurs, then go to step 1. If the plots are in good agreement, stop testing.
Furthermore, if the large sample theory appears appropriate, then inference concerning $N$ (and in an analogous manner $\varphi$) may be obtained using the normal distribution.
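The comparison in step 3 can be sketched as follows (an illustration, not the authors' code): the profile relative likelihood of $N$ is computed from the JM likelihood with $\varphi$ profiled out via (2.1.2a), and compared against $R_{\mathrm{normal}}(N)$ of (2.1.21).

```python
import numpy as np

def jm_relative_likelihoods(t, N_hat, N_grid):
    """Sketch: profile relative likelihood of N for the JM model and the
    normal approximation (2.1.21), evaluated on a grid of N values."""
    t = np.asarray(t, dtype=float)
    n, T = len(t), t.sum()
    s = np.sum(np.arange(n) * t)              # sum (i-1) t_i

    def profile_loglik(N):
        phi = n / (N * T - s)                 # (2.1.2a) for this N
        return n * np.log(phi) + np.sum(np.log(N - np.arange(n))) - n

    l_max = profile_loglik(N_hat)
    rel = np.array([np.exp(profile_loglik(N) - l_max) for N in N_grid])

    inv = 1.0 / (N_hat - np.arange(n))
    var_N = n / (n * np.sum(inv**2) - np.sum(inv)**2)
    rel_normal = np.exp(-0.5 * (np.asarray(N_grid) - N_hat)**2 / var_N)
    return rel, rel_normal                    # plot both; large gaps are a warning
```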
Meinhold and Singpurwalla (1983) suggest the adoption of the Bayesian point of view when considering the likelihood function of the JM model. In so doing, the conclusion to be drawn from ridiculous parameter estimates is that it is the method of inference, specifically maximum likelihood estimation, rather than the

model, that needs to be questioned. A Bayesian approach to inference on $N$ and $\varphi$ is discussed.
Goel and Okumoto (1979) treat the cumulative number of software failures by time $t$, $N(t)$, as a nonhomogeneous Poisson process with mean value function

$$m(t) = a(1 - e^{-bt}) \qquad (2.1.22)$$

where the unknown constants $a$ and $b$ represent the expected number of failures eventually discovered and the occurrence rate of an individual error, respectively. Thus for any $t \ge 0$

$$\Pr\{N(t) = n \mid a, b\} = \frac{[a(1 - e^{-bt})]^n\, e^{-a(1 - e^{-bt})}}{n!} = \mathrm{poim}(n\,;\,a(1 - e^{-bt}))\,, \quad n = 0, 1, 2, \ldots \qquad (2.1.23)$$

From (2.1.23) the distribution for the total error content is $\mathrm{poim}(n\,;\,a)$, and the conditional distribution of the number of remaining errors at time $t'$, $\bar N(t') = N(\infty) - N(t')$, is

$$\Pr\{\bar N(t') = n' \mid N(t') = n, a, b\} = \mathrm{poim}(n'\,;\,ae^{-bt'})\,, \quad n' = 0, 1, 2, \ldots \qquad (2.1.24)$$

The reliability function for the interfailure time $T_i$ is given by

$$R(t_i \mid t'_{i-1}, a, b) = \exp\{-a[e^{-bt'_{i-1}} - e^{-b(t'_{i-1} + t_i)}]\} \qquad (2.1.25)$$

where $t'_i = \sum_{j=1}^{i} t_j$ is the time until the $i$th failure. Thus, in contrast to (JM2), software interfailure times are not independent. Also note that due to this dependence, the Goel-Okumoto model is of the stochastic type.
Estimators of $a$ and $b$ are obtained via the solution of the maximum likelihood equations

$$n/a = 1 - \exp\{-bt'_n\}\,, \qquad (2.1.26a)$$

$$n/b = \sum_{k=1}^{n} t'_k + a\,t'_n\exp\{-bt'_n\}\,. \qquad (2.1.26b)$$

A $(1-\alpha)\%$ confidence region for $a$ and $b$ may be established using the approximation

$$L(\hat a, \hat b \mid t'_1, \ldots, t'_n) - L(a, b \mid t'_1, \ldots, t'_n) \le \tfrac{1}{2}\chi^2_{2,\alpha}\,. \qquad (2.1.27)$$

Goel and Okumoto (1980) also discuss the use of the asymptotic normality of $\hat a$ and $\hat b$ for constructing confidence intervals. Here, model results are based

on execution rather than calendar time. This approach represents an extension of the basic model derived in Schneidewind (1975) and is itself extended in Shanthikumar (1981) using a nonhomogeneous Markov process.
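The pair (2.1.26a)-(2.1.26b) reduces to a single equation in $b$ after substituting $a = n/(1 - e^{-bt'_n})$. A numerical sketch follows (illustrative names; a bracketing root finder is assumed to suffice when the data show reliability growth, since otherwise no root exists in the bracket):

```python
import numpy as np
from scipy.optimize import brentq

def go_mle(failure_times):
    """Sketch: Goel-Okumoto MLEs (a, b) from cumulative failure times
    t'_1 <= ... <= t'_n, via (2.1.26a)-(2.1.26b)."""
    tp = np.asarray(failure_times, dtype=float)
    n, t_n, S = len(tp), tp[-1], tp.sum()

    def h(b):  # (2.1.26b) with a eliminated through (2.1.26a)
        return n / b - S - n * t_n * np.exp(-b * t_n) / (1.0 - np.exp(-b * t_n))

    b_hat = brentq(h, 1e-8 / t_n, 1e3 / t_n)  # raises if no sign change
    a_hat = n / (1.0 - np.exp(-b_hat * t_n))
    return a_hat, b_hat
```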
A combination of the Musa model and the Goel and Okumoto model is given in Musa and Okumoto (1984). This model incorporates the use of execution time with the analytical ease of the nonhomogeneous Poisson process. Furthermore, the authors define the failure intensity in such a way as to reflect the fact that errors with larger size are found earlier. If $\lambda_0$ and $\theta$ are the initial failure intensity and the rate of reduction in the normalized failure intensity per failure, the failure intensity is defined in terms of execution time as

$$\lambda(\tau) = \lambda_0 e^{-\theta m(\tau)} \qquad (2.1.28)$$

where $m(\tau)$ is the mean value function for $N(\tau)$. Given the above, the mean value function is given by

$$m(\tau) = \frac{1}{\theta}\log(\lambda_0\theta\tau + 1) \qquad (2.1.29)$$

and the distribution for $N(\tau)$ is given by $\mathrm{poim}(n\,;\,(1/\theta)\log(\lambda_0\theta\tau + 1))$. Expressions analogous to (2.1.22)-(2.1.25) are obtained by substituting $(1/\theta)\log(\lambda_0\theta\tau + 1)$ for $a(1 - e^{-bt})$. Musa and Okumoto obtain further functions of interest by exploiting the relationship between the time until the $i$th failure, $T'_i$, and the number of failures in a given time. Using this notion

$$P\{T'_i \le \tau\} = \sum_{j=i}^{\infty}\frac{[m(\tau)]^j\,e^{-m(\tau)}}{j!} \qquad (2.1.30)$$

and

$$P\{T'_i \le \tau \mid N(\tau_1) = n_1\} = \sum_{j=i}^{\infty}\frac{[m(\tau) - m(\tau_1)]^{j-n_1}\,e^{-[m(\tau) - m(\tau_1)]}}{(j - n_1)!}\,, \qquad (2.1.31)$$

where $T'_i = \sum_{j=1}^{i} T_j$ is the time of the $i$th software failure.


Maximum likelihood estimation is discussed for both cases, where failure times or numbers of failures are used. The complexity of the estimation procedure is reduced by estimating the parameter $\varphi = \lambda_0\theta$ and solving for $\lambda_0$ and $\theta$ by setting the mean number of failures equal to the number of software failures encountered. When the software is tested for a time $\tau'_n$ and $n$ failures are recorded at times $\tau'_1, \ldots, \tau'_n$, $\varphi$ may be obtained by solving

$$\sum_{i=1}^{n}\frac{1}{\hat\varphi\tau'_i + 1} - \frac{n\,\tau'_n}{(\hat\varphi\tau'_n + 1)\log(\hat\varphi\tau'_n + 1)} = 0\,. \qquad (2.1.32)$$

Given $\hat\varphi$, estimates $\hat\lambda_0$ and $\hat\theta$ may be obtained by setting $m(\tau) = (1/\theta)\ln[\hat\varphi\tau'_n + 1] = n$; thus $\hat\theta = (1/n)\ln[\hat\varphi\tau'_n + 1]$ and $\hat\lambda_0 = \hat\varphi/\hat\theta$. When the software is tested over an interval $[0, x_p]$ and this is partitioned into intervals $(0, x_1], (x_1, x_2], \ldots, (x_{p-1}, x_p]$, with $n_i$ denoting the number of failures recorded in $(0, x_i]$, $i = 1, \ldots, p$, then the maximum likelihood equation for $\varphi$ is given as

$$\sum_{i=1}^{p} \bar n_i\,\frac{\dfrac{x_i}{\hat\varphi x_i + 1} - \dfrac{x_{i-1}}{\hat\varphi x_{i-1} + 1}}{\log(\hat\varphi x_i + 1) - \log(\hat\varphi x_{i-1} + 1)} \;-\; \frac{n_p\,x_p}{(\hat\varphi x_p + 1)\log(\hat\varphi x_p + 1)} = 0 \qquad (2.1.33)$$

where $\bar n_i = n_i - n_{i-1}$. Again using the same approach as before, $\hat\theta$ and $\hat\lambda_0$ may be obtained as $\hat\theta = (1/n_p)\log[\hat\varphi x_p + 1]$ and $\hat\lambda_0 = \hat\varphi/\hat\theta$.
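A sketch of this estimation scheme for the failure-times case follows; names are illustrative and the bracket for the root finder is heuristic:

```python
import numpy as np
from scipy.optimize import brentq

def musa_okumoto_fit(tau):
    """Sketch: fit the logarithmic Poisson model from execution-time
    failure epochs tau_1 <= ... <= tau_n by solving (2.1.32) for
    phi = lambda_0 * theta, then recovering theta and lambda_0."""
    tau = np.asarray(tau, dtype=float)
    n, tau_n = len(tau), tau[-1]

    def f(phi):  # left-hand side of (2.1.32)
        return (np.sum(1.0 / (phi * tau + 1.0))
                - n * tau_n / ((phi * tau_n + 1.0) * np.log(phi * tau_n + 1.0)))

    phi_hat = brentq(f, 1e-10 / tau_n, 1e6 / tau_n)
    theta_hat = np.log(phi_hat * tau_n + 1.0) / n   # from m(tau_n) = n
    lam0_hat = phi_hat / theta_hat
    return phi_hat, theta_hat, lam0_hat
```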

In the above model, times are in terms of execution time rather than calendar time. The conversion to calendar time follows the developments in Musa (1975).
The Musa (1975) model was also one of the first models to address the notion
of imperfect debugging. Goel and Okumoto (1979) suggested the use of a Markov
process to model imperfect debugging. Kremer (1983) uses a multidimensional
birth-death process to account for imperfect debugging and the introduction of
new errors as the result of debugging. Kremer (K) begins by assuming that the
failure rate of the software is a product of its fault content and an exposure rate,
h(t). To account for imperfect debugging he further assumes

(K) When a failure occurs, the repair effort is instantaneous and results in one
of three mutually exclusive outcomes
(i) the fault content is reduced by 1 with probability p;
(ii) the fault content remains unchanged with probability q;
(iii) the fault content is increased by 1 with probability r.
Thus the author defines a birth-death process with birth rate rh(t) and death rate
ph(t).
A multidimensional process is defined with X(t) denoting the fault content of
the software at time t and N ( t ) the number of failures to time t. Though reliability
measures are obtained from N(t), the failure rate of the software is a function of
X(t), which is changing in a stochastic manner.
Given the initial fault content of $N$, the expected number of faults in the program by time $t$ is

$$E[X(t) \mid N, p, r] = N e^{-\rho(t)} \qquad (2.1.34)$$

where $\rho(t) = (p - r)\int_0^t h(u)\,\mathrm{d}u$, and the expected number of failures by time $t$ is

$$E[N(t) \mid N, p, r] = \begin{cases} \dfrac{N}{p - r}\left(1 - e^{-\rho(t)}\right), & p \ne r\,,\\[2ex] N\displaystyle\int_0^t h(u)\,\mathrm{d}u\,, & p = r\,. \end{cases} \qquad (2.1.35)$$

Thus in the life of the software (if $p > r$) the expected number of failures will be $N/(p - r)$. Thus $p - r$ is similar to Musa's constant $B$. Given $n$ failures obtained by time $t_0$, conditional expectations may be obtained as

$$E[X(t_0 + t) \mid N, p, r, N(t_0) = n] = [N - (p - r)n]\,e^{-\rho(t_0, t)} \qquad (2.1.36)$$

where $\rho(t_0, t) = \rho(t_0 + t) - \rho(t_0)$. Using (2.1.36) the conditional expectation for the number of failures in $(t_0, t_0 + t]$ is

$$E[N(t_0 + t) - N(t_0) \mid N, p, r, N(t_0) = n] = \begin{cases} \dfrac{N - (p - r)n}{p - r}\left[1 - e^{-\rho(t_0, t)}\right], & \text{if } p \ne r\,,\\[2ex] N\displaystyle\int_{t_0}^{t_0 + t} h(u)\,\mathrm{d}u\,, & \text{if } p = r\,. \end{cases} \qquad (2.1.37)$$

The birth-death differential-difference equations may be solved for $P_m(t) = P\{X(t) = m\}$ as

$$P_0(t) = [\alpha(t)]^N\,, \qquad (2.1.38a)$$

$$P_m(t) = \sum_{j=0}^{\min(N, m)}\binom{N}{j}\binom{N + m - j - 1}{m - j}(\alpha(t))^{N-j}(\beta(t))^{m-j}(1 - \alpha(t) - \beta(t))^{j}\,, \qquad (2.1.38b)$$

where

$$\alpha(t) = 1 - \frac{1}{e^{\rho(t)} + \Lambda(t)}\,, \qquad \beta(t) = 1 - \frac{e^{\rho(t)}}{e^{\rho(t)} + \Lambda(t)}\,,$$

and

$$\Lambda(t) = \int_0^t r\,h(u)\,e^{\rho(u)}\,\mathrm{d}u\,.$$

From these, the reliability of a program tested for $t_0$ units of time may be obtained as

$$R(t \mid N, p, r) = \sum_{m} P_m(t_0)\,[S_{t_0}(t)]^m \qquad (2.1.39)$$

where

$$S_{t_0}(t) = \exp\left\{-\int_{t_0}^{t_0 + t} h(u)\,\mathrm{d}u\right\}$$

is the reliability attribute of each remaining fault. Given $n$ failures by time $t_0$ the reliability may be expressed as

$$R(t \mid n, p, r, N(t_0) = n) = \sum_{m}\bar P_m(t_0)\,[S_{t_0}(t)]^{N - m} \qquad (2.1.40)$$

where $\bar P_m(t_0) = P\{X(t_0) = N - m \mid N(t_0) = n\}$ is given by

$$\bar P_m(t_0) = \sum_{\substack{i + j + k = n\\ i - k = m}}\frac{n!}{i!\,j!\,k!}\,p^i q^j r^k\,.$$

This model is dependent on the parameters N, p, q, r and h(t). Maximum


likelihood estimates may be used for N, p, q, r and the parameters of h(t). The
amount of data required and the accuracy of the estimates have not been
investigated. Estimates of p, q and r could be obtained from experience or best
prior guesses. The author also suggests a Bayesian approach for estimating h(t),
which closely resembles that pursued in Littlewood (1981).
The models of Goel and Okumoto (1979) and Musa and Okumoto (1984) represent a step towards a Bayesian analysis of the problem. In Singpurwalla and Kyparisis (1984) a fully Bayesian approach is taken using the nonhomogeneous Poisson process with failure intensity function $\lambda(t) = (\beta/\alpha)(t/\alpha)^{\beta - 1}$ for $t \ge 0$. Due to the resemblance of $\lambda(t)$ to the failure rate function of the Weibull distribution, the model is referred to as the Weibull process. Thus $N(t)$ again is assumed to be a nonhomogeneous Poisson process, with mean value function $m(t) = (t/\alpha)^{\beta}$. In the true Bayesian context, uncertainty concerning $\alpha$ and $\beta$ is expressed by their respective prior densities

1
go(a) = - - , 0 < ~ ~< 7o, (2.1.41a)
~0
r ( k , + k2) (~ - ~ , Y " - 1(~2 - ~y,2-,
fo(/~) -
r(kl)r(k2) (/~2 -/~IY' +k2-1
O~fll "(fl(fl2 ; kl, k2)O- (2.1.41b)

For convenience it is assumed that the prior distributions for $\alpha$ and $\beta$ are independent. Posterior inference concerning the number of future failures in an interval or the time until the next failure may be obtained once the posterior distributions of $\alpha$ and $\beta$ are computed. The posterior distribution of $\beta$ is of interest in its own right, as it may be used to assess the extent of reliability growth. Reliability growth would be taking place if $\beta \in (0, 1)$; by observing the posterior density one may examine the extent to which this is true.
Posterior analysis is conducted for both the case where only the number of failures per interval is recorded and the case where the actual failure times are recorded. In both cases the posterior distributions of $\alpha$ and $\beta$ are intractable. An approximation is given for the posterior of $\beta$. Due to the intractability of the posterior distributions of $\alpha$ and $\beta$, posterior inference concerning the number of failures in future intervals and the time to the next failure is conducted numerically via a computer code described in Kyparisis, Soyer and Daryanani (1984).
When only the number failed in each interval is recorded over a period $[0, x_p]$, the posterior distribution of $N_k$, the number of failures in $(x_{k-1}, x_k]$, $k = p + 1, p + 2, p + 3, \ldots$, is given by

$$\Pr\{N_k = n_k \mid n_1, \ldots, n_p\} = \int_0^{\alpha_0}\!\!\int_{\beta_1}^{\beta_2}\frac{[m(x_k) - m(x_{k-1})]^{n_k}}{n_k!}\exp\{-[m(x_k) - m(x_{k-1})]\}\; g_1(\alpha, \beta \mid n_1, \ldots, n_p)\,\mathrm{d}\alpha\,\mathrm{d}\beta \qquad (2.1.42)$$

where $g_1(\alpha, \beta \mid n_1, \ldots, n_p)$ is the joint posterior density of $\alpha$ and $\beta$. The approximate marginal posterior density of $\beta$ is obtained as

$$g_1(\beta \mid n_1, \ldots, n_p) \propto (\beta - \beta_1)^{k_1 - 1}(\beta_2 - \beta)^{k_2 - 1}\,\frac{S(\beta)^{1/\beta}}{\beta}\,\Gamma(n_p - 1/\beta)\prod_{i=1}^{p}\left[\frac{x_i^{\beta} - x_{i-1}^{\beta}}{S(\beta)}\right]^{\bar n_i} \qquad (2.1.43)$$

where $S(\beta) = \sum_{i=1}^{p}(x_i^{\beta} - x_{i-1}^{\beta})$. The approximate posterior distribution for $\beta$ is based on the approximation

$$\int_0^{\alpha_0}\alpha^{-\beta n_p}\exp\left\{-\frac{S(\beta)}{\alpha^{\beta}}\right\}\mathrm{d}\alpha \;\approx\; \frac{\Gamma(n_p - 1/\beta)}{\beta\,S(\beta)^{\,n_p - 1/\beta}} \qquad (2.1.44)$$

which works well if $\alpha_0 \ge S(\beta)^{1/\beta}$.


When the software is tested over a period $(0, T)$ and failure times $t'_1 \le t'_2 \le \cdots \le t'_n$ are recorded, then the joint posterior distribution of $\alpha$ and $\beta$ is given by

$$g_2(\alpha, \beta \mid t'_1, \ldots, t'_n) \propto (\beta - \beta_1)^{k_1 - 1}(\beta_2 - \beta)^{k_2 - 1}(\beta/\alpha)^n\prod_{i=1}^{n}(t'_i/\alpha)^{\beta - 1}\exp\{-(T/\alpha)^{\beta}\} \qquad (2.1.45)$$

and the marginal posterior of $\beta$ is given by

$$g_2(\beta \mid t'_1, \ldots, t'_n) \propto (\beta - \beta_1)^{k_1 - 1}(\beta_2 - \beta)^{k_2 - 1}\,\beta^{\,n - 1}\,\Gamma(n - 1/\beta)\left[\prod_{i=1}^{n}t'_i\right]^{\beta - 1} T^{\,1 - n\beta} \qquad (2.1.46)$$

using an approximation similar to (2.1.44), which works well provided $\alpha_0 \ge t'_n$.


Posterior inference concerning the number of failures in future intervals may be obtained using (2.1.42) in conjunction with (2.1.45). Posterior inference concerning $Z_k$, the time to the $(n + k)$th failure measured from $t'_n$, is obtained by noting that

given $\alpha$ and $\beta$, the failure times $(t'_1/\alpha)^{\beta}, (t'_2/\alpha)^{\beta}, \ldots$ can be viewed as being generated from a homogeneous Poisson process. The posterior conditional distribution of $Z_k$ given $t'_n$ is obtained from

$$\Pr\{Z_k \le z \mid t'_1, \ldots, t'_n\} = \int_0^{\alpha_0}\!\!\int_{\beta_1}^{\beta_2}\!\!\int_0^{v(t'_n, z)}\frac{v^{k - 1}\,e^{-v}}{(k - 1)!}\,\mathrm{d}v\; g_2(\alpha, \beta \mid t'_1, \ldots, t'_n)\,\mathrm{d}\alpha\,\mathrm{d}\beta \qquad (2.1.47)$$

where

$$v(t'_n, z) = \left(\frac{t'_n + z}{\alpha}\right)^{\beta} - \left(\frac{t'_n}{\alpha}\right)^{\beta}.$$

Littlewood (1980) also initiates a Bayesian approach to error counting, but expresses uncertainty about the software's performance through $\lambda_i$, the failure rate of the software given that $i - 1$ failures have occurred. This Littlewood model embraces the assumptions of the JM model except for (JM1). Arguing that errors with the largest size (and thus greater failure contribution) will be discovered first, Littlewood instead views $\lambda_i = \varphi_1 + \varphi_2 + \cdots + \varphi_{N - i + 1}$, where $\varphi_i$ is the failure contribution of the $i$th remaining error. Uncertainty about the $\varphi_i$ is expressed via the prior density

$$\frac{\beta^{\alpha}\varphi_i^{\alpha - 1}e^{-\beta\varphi_i}}{\Gamma(\alpha)}\,, \qquad \varphi_i \ge 0\,, \qquad (2.1.48)$$

which is denoted $\varphi_i \sim G(\alpha, \beta)$.


Because the uncertainty is the same for all $\varphi_i$, $i = 1, \ldots, N$, initially, the prior distributions will all be identical. The failure contribution for an error which has not been observed by the $(i - 1)$st failure is given by $\varphi_i \sim G(\alpha, \beta + t'_{i-1})$ where, as usual, $t'_{i-1}$ is the time of the $(i - 1)$st failure. Thus the uncertainty about the failure rate of the software after the $(i - 1)$st failure is expressed via $\lambda_i \sim G((N - i + 1)\alpha, \beta + t'_{i-1})$. The reliability function of $T_i$ may be expressed as

$$R(t_i \mid \alpha, \beta) = \left[\frac{\beta + t'_{i-1}}{\beta + t'_{i-1} + t_i}\right]^{(N - i + 1)\alpha}\,, \qquad (2.1.49)$$

a Pareto distribution. Unlike the exponential distribution, the Pareto distribution permits the possibility of very large error free intervals. Also it is interesting to note that the failure rate function, given by

$$\lambda(t_i) = (N - i + 1)\alpha/(\beta + t'_{i-1} + t_i)\,, \qquad (2.1.50)$$

displays a decreasing failure rate, and this property can be shown to be independent of the prior distribution for the $\varphi_i$.
Littlewood discusses the use of (2.1.48) and (2.1.49) in determining other reliability measures. The author suggests the use of maximum likelihood estimation (similar to that used in the JM model) in order to obtain estimates of $N$, $\alpha$, and $\beta$. A purely Bayesian approach would determine the parameters from elicited prior information.
All models thus far have ignored the time required to find and correct software
errors. While this keeps the model derivation simple, it may not be adequate and
does not enable the measurement of an important reliability parameter, availability.
Shooman and Trivedi (1976) introduced the use of the Markov Process to
account for the time to find and correct software bugs in large software systems.
The thrust of this analysis is to estimate availability rather than reliability. In Kim,
Kim, and Park (1982) (KKP) this model is developed and extended. As with the
JM model it is assumed that the failure rate of the software is directly proportional
to the number of errors and that each error contributes an equal amount to the
failure rate. To account for the debugging process the following additional as-
sumption is made
(KKP) When a failure occurs, errors are corrected perfectly with rate $\mu_0$, or are corrected but with the addition of a new error with rate $\mu_1$.
Given the above assumptions, the differential-difference equations for $p_n(t) = P\{N(t) = n\}$ when the computer is up, and $q_n(t) = P\{N(t) = n\}$ when the computer is down, are given by

$$p_N(t) = \frac{A_N + \mu_0 + \mu_1}{A_N - B_N}\,e^{A_N t} + \frac{B_N + \mu_0 + \mu_1}{B_N - A_N}\,e^{B_N t}\,, \qquad (2.1.51a)$$

$$p_{N-k}(t) = (\varphi\mu_0)^k\frac{N!}{(N - k)!}\sum_{j=0}^{k}\left[\frac{(A_{N-j} + \mu_0 + \mu_1)\,e^{A_{N-j}t}}{\prod_{i=0,\,i\ne j}^{k}(A_{N-j} - A_{N-i})\prod_{i=0}^{k}(A_{N-j} - B_{N-i})} + \frac{(B_{N-j} + \mu_0 + \mu_1)\,e^{B_{N-j}t}}{\prod_{i=0}^{k}(B_{N-j} - A_{N-i})\prod_{i=0,\,i\ne j}^{k}(B_{N-j} - B_{N-i})}\right], \quad k = 1, \ldots, N\,, \qquad (2.1.51b)$$

and

$$q_{N-k}(t) = \varphi(\varphi\mu_0)^k\frac{N!}{(N - k - 1)!}\sum_{j=0}^{k}\left[\frac{e^{A_{N-j}t} - e^{-(\mu_0 + \mu_1)t}}{\prod_{i=0,\,i\ne j}^{k}(A_{N-j} - A_{N-i})\prod_{i=0}^{k}(A_{N-j} - B_{N-i})} + \frac{e^{B_{N-j}t} - e^{-(\mu_0 + \mu_1)t}}{\prod_{i=0}^{k}(B_{N-j} - A_{N-i})\prod_{i=0,\,i\ne j}^{k}(B_{N-j} - B_{N-i})}\right], \quad k = 0, 1, \ldots, N - 1\,, \qquad (2.1.52)$$

where

$$\left.\begin{matrix}A_{N-k}\\[0.3ex] B_{N-k}\end{matrix}\right\} = \tfrac{1}{2}\left\{-[\mu_0 + \mu_1 + (N - k)\varphi] \pm \sqrt{[\mu_0 + \mu_1 + (N - k)\varphi]^2 - 4(N - k)\varphi\mu_0}\right\}. \qquad (2.1.53)$$

Once estimates of $N$, $\varphi$, $\mu_0$ and $\mu_1$ are obtained, the availability of the system is given by $\sum_{k=0}^{N}p_{N-k}(t)$. The authors specify no means for estimating the parameters; however, $N$ and $\varphi$ could be estimated using methods applied to the JM model, while $\mu_0$ and $\mu_1$ could be estimated from past experience or from correction times.

2.2. Non-error counting models


Non-error counting models are not designed to provide estimates of the number of residual errors, but only estimates of the effects of the residual errors on software reliability. Deterministic models are represented by the Halden Project model (Dahil and Lahti, 1978) and by a modification of the JM model called the Jelinski-Moranda Geometric De-Eutrophication model, presented in Moranda (1975) and (1979). This model was designed to handle the case where groups of errors are removed at one time, but can also be used to account for the case
Okumoto (1984). The model assumes that 2; = D U - 1 where D is the initial
detection rate, and k is the ratio between the ( i - 1)st and ith failure. These
parameters may be estimated from the maximum likelihood equations

iUti E kite = (n + 1)/2, (2.2.1a)


i= 1 //\i= 1

D = kn k it; . (2.2.1b)
i 1
Moranda also suggests using this formulation in conjunction with the nonhomogeneous Poisson process. Sukert (1977) generalizes the model to include more than one failure per debugging interval.
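Equations (2.2.1a)-(2.2.1b) again reduce to one-dimensional root finding in $\hat k$. A sketch (illustrative; the bracket $k \in (0, 1)$ presumes reliability growth and should be widened otherwise):

```python
import numpy as np
from scipy.optimize import brentq

def moranda_geometric_mle(t):
    """Sketch: MLEs (D, k) for the geometric de-eutrophication model,
    lambda_i = D k^(i-1), from (2.2.1a)-(2.2.1b)."""
    t = np.asarray(t, dtype=float)
    n = len(t)
    i = np.arange(1, n + 1)

    def g(k):  # (2.2.1a): weighted mean index minus (n+1)/2
        w = k**i * t
        return np.sum(i * w) / np.sum(w) - (n + 1) / 2.0

    k_hat = brentq(g, 1e-3, 1.0 - 1e-9)
    D_hat = n * k_hat / np.sum(k_hat**i * t)        # (2.2.1b)
    return D_hat, k_hat
```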
In Littlewood and Verrall (1973) a stochastic Bayesian model is presented. In this approach the authors attempt to model the debugging behavior of the programmer or programmers involved. As each error is encountered, it is the intent of the programmer to correct the error and thus increase the reliability of the software. Though this is always the intent, it is not always achieved. Often new errors are created which reduce the reliability of the software. To model this situation in a Bayesian context, Littlewood suggests expressing the uncertainty about $\lambda_i$ by assuming a priori that $\lambda_i \sim G(\alpha, \psi(i))$, where $\psi(i)$ is an increasing function reflecting the complexity of the program and the quality of the programmer. Defining $\psi(i)$ as an increasing function of $i$ incorporates the assumption that the programmer's intent is always to improve the software's reliability, since $\psi(i) > \psi(i - 1)$ implies that

$$P\{\lambda_i < l\} \le P\{\lambda_{i-1} < l\} \qquad (2.2.2)$$

for $l > 0$. The above implies that the $\lambda_i$ are stochastically ordered.
Combining the usual assumption that, given $\lambda_i$, the variables $T_i$, $i = 1, \ldots, n$, are independent exponential random variables, with the prior distributions for $\lambda_i$, the posterior reliability for $T_i$ can be obtained as

$$R(t_i) = \left[\frac{\psi(i)}{\psi(i) + t_i}\right]^{\alpha}\,, \qquad (2.2.3)$$

which is a Pareto distribution.


The authors suggest trying several parametric families for $\psi(i)$, notably $\psi(i) = \beta_0 + \beta_1 i$ and $\psi(i) = \beta_0 + \beta_1 i^2$. The authors also discuss the possibility of using a prior distribution for $\alpha$, but Littlewood (1980) suggests maximum likelihood estimation for the model parameters, thus making this model a hybrid approach.
Further analysis along the lines of modeling the stochastic ordering of the $\lambda_i$ is pursued in Ramamoorthy and Bastani (1980). Specifically, these models are referred to as the mixed gamma model and the stochastic input domain model.
Bayesian time series analysis is used to assess software reliability growth and
other reliability parameters in Horigome, Singpurwalla and Soyer (1984) (HSS)
and Singpurwalla and Soyer (1985). The authors assume a power law relationship
between T,. and T;_ 1 where T; is defined as the failure time at the ith testing stage
(note if a testing stage consists of testing to the first system failure the T; is as
previously defined). The relationship assumed is

T; = Ti~ , bi (2.2.4)

where 0; reflects the effects of the changes made as a result of the (i - 1)st stage
of testing and bI is an error term to account for uncertainty. Note that reliability
growth will have taken place as a result of changes made in the (i - 1)st stage
of testing if 0; > 1; 0; = 1 indicates no improvement and 0; < 1 indicates reliability
decay.
The model is developed based on the following assumptions:
(HSS1) The variables Ti, i = 1. . . . . n, are lognormally distributed with T;~< 1
assumed for all i.
(HSS2) The values b;, i = 1, . . . , n, are lognormally distributed with known
parameters 0 and a 2.
(HSS3) The quantities 0i, i = 1. . . . , n, are exchangeable and are distributed
according to some distribution G with density g.
Taking the logarithm of both sides of (2.2.4) yields

r, -- o , r ; _ 1 + ( 2.2.5)

where $Y_i = \log T_i$ and $\varepsilon_i = \log\delta_i$ are normally distributed, the latter with mean 0 and variance $\sigma_1^2$. The sequence $\{Y_i\}$ is thus given by a first order autoregressive process with a random coefficient $\theta_i$.
By assuming further that $\theta_i \sim N(\lambda, \sigma_2^2)$, where $\sigma_2^2$ is known, and $\lambda \sim N(\mu, \sigma_3^2)$ with $\mu$ and $\sigma_3^2$ known, the following posterior results are obtained:

(i) $(\lambda \mid y_1, \ldots, y_n) \sim N(\mu_n, \sigma_n^2)$ with

$$\mu_n = \left(\frac{\mu}{\sigma_3^2} + \sum_{i=1}^{n}\frac{y_i y_{i-1}}{W_{i-1}}\right)\sigma_n^2\,, \qquad \sigma_n^2 = \left[\frac{1}{\sigma_3^2} + \sum_{i=1}^{n}\frac{y_{i-1}^2}{W_{i-1}}\right]^{-1}, \qquad W_{i-1} = \sigma_2^2 y_{i-1}^2 + \sigma_1^2\,;$$

(ii) $(\theta_n \mid y_1, \ldots, y_n) \sim N\!\left(\dfrac{\sigma_1^2\mu_n + \sigma_2^2 y_n y_{n-1}}{W_{n-1}},\; \dfrac{\sigma_1^2(W_{n-1}\sigma_2^2 + \sigma_1^2\sigma_n^2)}{W_{n-1}^2}\right)$;

(iii) $(Y_{n+1} \mid y_1, \ldots, y_n) \sim N(\mu_n y_n,\; \sigma_n^2 y_n^2 + W_n)$;

(iv) $(\theta_{n+1} \mid y_1, \ldots, y_n) \sim N(\mu_n,\; \sigma_n^2 + \sigma_2^2)$.

Note that $\sigma_2^2$ reflects the views about the consistency of policies regarding modifications and design changes made. Using the above, posterior inference can be obtained for any relevant quantity. For example, Bayes probability intervals can be constructed for the next failure time, or reliability growth at each stage can be assessed by plotting $E[\theta_i \mid y_1, \ldots, y_i]$ vs. $i$. Overall reliability growth can be examined via $E[\lambda \mid y_1, \ldots, y_i]$, $i = 1, \ldots, n$. In Singpurwalla and Soyer (1985) this basic model is extended by assuming various dependence structures for the sequence $\{\theta_i\}$. Three additional models are developed using the structure of the Kalman filter model.
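Because all distributions involved are normal, the posterior quantities in (i)-(iv) are simple sums and can be computed in a few lines. A sketch with illustrative names follows; `y[0]` plays the role of the starting value $y_0$:

```python
import numpy as np

def hss_posterior(y, mu, s1sq, s2sq, s3sq):
    """Sketch of the HSS model: given logged failure times
    y = (y_0, y_1, ..., y_n) and the known variances sigma_1^2, sigma_2^2,
    sigma_3^2, return the posterior (mu_n, sigma_n^2) of lambda from (i)
    and the one-step predictive mean and variance of Y_{n+1} from (iii)."""
    y = np.asarray(y, dtype=float)
    y_prev, y_curr = y[:-1], y[1:]
    W = s2sq * y_prev**2 + s1sq                        # W_{i-1}

    var_n = 1.0 / (1.0 / s3sq + np.sum(y_prev**2 / W))
    mu_n = var_n * (mu / s3sq + np.sum(y_curr * y_prev / W))

    y_n = y[-1]
    pred_mean = mu_n * y_n                             # (iii)
    pred_var = var_n * y_n**2 + s2sq * y_n**2 + s1sq   # sigma_n^2 y_n^2 + W_n
    return mu_n, var_n, pred_mean, pred_var
```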

2.3. Model unification


Though highly criticized, the JM model remains central to the topic of software
reliability. Langberg and Singpurwalla (1985) provide an alternative motivation for
the JM model using shock models. Stefanski (1982) provides another motivation
for the JM model using renewal theoretic arguments. Both works allude to the
centrality of the model. Langberg and Singpurwalla further provide a unification
of software reliability models by illustrating that many other well known models
such as Littlewood-Verrall (1973) and Goel and Okumoto (1979) can be obtained
by specifying prior distributions for the parameters of the JM model. Extensions
to the basic Bayes model and the discussion of the use of posterior modes as
point estimates is given in Jewell (1985).

3. Models of the validation phase

When a decision is made to stop testing the software (see Forman and
Singpurwalla, 1977; Okumoto and Goel, 1979, 1980; Krten and Levy, 1980;
Shanthikumar and Tufekci, 1981, 1983; Koch and Kubat, 1983; Chow and
Schechner, 1985, for decision criteria), the software enters the validation phase.
In this phase the software undergoes intensive testing in its operational environ-
ment with a goal of obtaining some measurement of its reliability. Software errors
are not corrected in this phase and, in fact, a software failure could result in the
rejection of the software.
Nelson (1978) introduced a simple reliability estimate based on probabilistic laws. Letting $e_r$ denote the size of the remaining errors in the program, and noting that errors are not removed, the number of runs until a software failure is a geometric random variable with parameter $e_r$. Thus the maximum likelihood estimate of $e_r$ can be used to determine an estimate of reliability. This is given as

$$\hat R = 1 - n_f/n \qquad (3.1)$$

where $n$ is the total number of sample runs and $n_f$ is the number of sample runs which ended in failure.
The above model suffers from several drawbacks (Ramamoorthy and Bastani,
1982) stemming from its simplicity.
(1) A large number of sample runs is required to obtain meaningful estimates.
(2) The model is based on the assumption that inputs are randomly selected
from the input domain and thus does not consider the correlation of runs from
adjacent segments of the input domain.
(3) The model does not consider any measure of complexity of the program.
Extensions to the basic model have attempted to reduce the number of sample
runs by specifying equivalence classes for the input domain (Nelson, 1978;
Ramamoorthy and Bastani, 1979). This goal is achieved at the cost of an increase
in model complexity.
Crow and Singpurwalla (1984) address the issue of correlation of inputs using a Fourier series model. The authors observe that in many cases software failures
occur in clusters and thus the usual assumption that the times between failures
are independent may not be valid. Rather they assume that the time between
failures is given by

$$T_i = f(i) + \varepsilon_i \qquad (3.2)$$

where $\varepsilon_i$ is a disturbance term with mean 0 and constant variance and $f(i)$ is some cyclical trend. To identify the cyclical pattern (if any) with which failures occur, the authors fit the Fourier series model

$$f(i) = \alpha_0 + \sum_{j=1}^{q}\left[\alpha(k_j)\cos\frac{2\pi k_j i}{n} + \beta(k_j)\sin\frac{2\pi k_j i}{n}\right] \qquad (3.3)$$

where $n$ (the number of observed times between failures) is assumed odd, $q = (n - 1)/2$ and $k_j = j$, $j = 1, \ldots, q$. Using the method of least squares the model parameters are obtained as

$$\hat\alpha_0 = \frac{1}{n}\sum_{i=1}^{n}t_i\,, \qquad (3.4a)$$

$$\hat\alpha(k_j) = \frac{2}{n}\sum_{i=1}^{n}t_i\cos\frac{2\pi k_j i}{n}\,, \qquad j = 1, \ldots, q\,, \qquad (3.4b)$$

$$\hat\beta(k_j) = \frac{2}{n}\sum_{i=1}^{n}t_i\sin\frac{2\pi k_j i}{n}\,, \qquad j = 1, \ldots, q\,. \qquad (3.4c)$$

The spectrogram is used to identify the period of the series, and thus the clustering behavior. A parsimonious model may also be obtained by using only those weights $\hat\alpha(k_j)$ and $\hat\beta(k_j)$ for which $\rho^2(k_j) = \hat\alpha^2(k_j) + \hat\beta^2(k_j)$ is large.
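The least squares fit (3.4a)-(3.4c) amounts to computing the discrete Fourier coefficients of the interfailure times. A minimal sketch (illustrative names; $n$ assumed odd as in the text):

```python
import numpy as np

def fourier_fit(t):
    """Sketch of (3.4a)-(3.4c): a0 and the weights alpha(k_j), beta(k_j),
    j = 1, ..., q, for interfailure times t_1, ..., t_n (n odd)."""
    t = np.asarray(t, dtype=float)
    n = len(t)
    q = (n - 1) // 2
    i = np.arange(1, n + 1)
    a0 = t.mean()                                      # (3.4a)
    alpha = np.array([2.0 / n * np.sum(t * np.cos(2 * np.pi * k * i / n))
                      for k in range(1, q + 1)])       # (3.4b)
    beta = np.array([2.0 / n * np.sum(t * np.sin(2 * np.pi * k * i / n))
                     for k in range(1, q + 1)])        # (3.4c)
    return a0, alpha, beta
```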
This model was applied to three sets of failure data from each of two software systems. The model was found to adequately represent the failure behavior. One potential problem of the model is that, due to the dependence of $\hat\alpha(k_j)$ and $\hat\beta(k_j)$ on trigonometric functions, negative values of $f(i)$ may be produced. When such is the case, the authors interpret this as an implication of a very small time between failures.
Though the intent of the authors in this paper is data analysis, the model can be used to predict future times between failures and future failure clusters. Also, by specifying a functional form for $\varepsilon_i$ (such as the usual normal assumption), inference can be made.

4. Models of the operational phase

Models in this phase are used to illustrate the behavior of the software in its operating environment. Both Littlewood (1979) and Cheung (1980) obtain the software reliability by assuming the software program is divided into modules.
Cheung suggests a combination of the deterministic properties of the structure of the software with the stochastic properties of module failure behavior, via a Markov process. He assumes
(C1) Reliabilities of the modules are independent.
(C2) Transfer of control among program modules is a Markov process.
(C3) The program begins and ends with a single module, denoted $N_1$ and $N_n$ respectively.
The state space is divided into $N_1, \ldots, N_n, C, F$, where the $N_i$ are the modules, $C$ indicates successful completion, and $F$ indicates an encountered failure. States $C$ and $F$ are absorbing. Transition probabilities from $N_i$ to $N_j$ ($i \ne j$) are given by

$R_i p_{ij}$, where $R_i$ is the reliability of module $i$ and $p_{ij}$ is the usual transition probability from module $i$ to module $j$. The transition probability from $N_i$ to $F$ is $1 - R_i$ and the transition probability from $N_n$ to $C$ is given by $R_n$. Thus the reliability of the software is obtained as the probability of being absorbed into state $C$ given that the initial state is $N_1$. This is obtained as

$$R = S(1, n)R_n \qquad (4.1)$$

where $S(i, j)$ is the $(i, j)$th entry in the matrix $S = (I - Q)^{-1}$ and $Q$ is the transition matrix of the process with the rows and columns of $C$ and $F$ deleted. The module reliabilities $R_i$ may be determined before system integration by techniques of Section 2 or 3. Transition probabilities may be estimated by running test cases. Cheung further discusses the use of this model in determining testing strategies and the expected error cost of the software. The latter may be used in place of system reliability in determining the acceptance of the software.
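Equation (4.1) is a standard absorbing Markov chain computation. A sketch follows, with illustrative module data; only the fundamental matrix $S = (I - Q)^{-1}$ is needed:

```python
import numpy as np

def cheung_reliability(P, R):
    """Sketch of (4.1): reliability of a modular program under Cheung's
    model.  P holds the inter-module transition probabilities p_ij, R the
    module reliabilities; module 1 is the entry, module n the exit."""
    n = len(R)
    Q = np.diag(R) @ P                   # N_i -> N_j occurs w.p. R_i p_ij
    S = np.linalg.inv(np.eye(n) - Q)     # S = (I - Q)^{-1}
    return S[0, n - 1] * R[n - 1]        # R = S(1, n) R_n

# Illustrative three-module pipeline: the result is R_1 R_2 R_3
P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
R = np.array([0.99, 0.95, 0.98])
print(cheung_reliability(P, R))          # approx 0.9216
```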
Littlewood (1979) assumes a semi-Markov process and takes into account the time spent in each module. The model further incorporates two sources of failure: within-module failure with rate $\gamma_i$, $i = 1, \ldots, n$, and failure associated with the transfer from module $i$ to module $j$, which occurs with rate $\lambda_{ij}$ ($i \ne j$). Assuming that these individual failure rates are small in comparison to the switching rates between modules, Littlewood states that the failure point process of the integrated program is asymptotically a Poisson process with rate parameter

$$\lambda = \frac{\sum_{i,j}\Pi_i p_{ij}(\mu_{ij}\gamma_i + \lambda_{ij})}{\sum_{i,j}\Pi_i p_{ij}\mu_{ij}}\,. \qquad (4.2)$$
In the above, $\Pi = (\Pi_1, \ldots, \Pi_n)$ is the equilibrium vector of the imbedded Markov chain, and $\mu_{ij}$ is the expected sojourn time in module $i$ before transferring to module $j$. An estimate of overall program availability is given as

$$\frac{\sum_{i,j}\Pi_i p_{ij}\mu_{ij}}{\sum_{i,j}\Pi_i p_{ij}[\mu_{ij} + \mu_{ij}\gamma_i m_i + \lambda_{ij}m_{ij}]} \qquad (4.3)$$

where $m_i$ and $m_{ij}$ are the expected downtimes due to failure in module $i$ and due to transfer from module $i$ to module $j$, respectively.
As with Cheung's model, individual module failure rates can be obtained before interfacing takes place, and all other parameter values may be estimated from test cases or experience with similar programs. Estimation of expected costs of failures is also discussed by Littlewood.

5. Closing comments

Though there is a large body of literature on software reliability (see Schick and Wolverton, 1978; Ramamoorthy and Bastani, 1982; Shanthikumar, 1983) several

issues remain. First, there is a lack of models for the validation, operational and maintenance phases of the software. Additional models are needed to address such issues as software design and testing criteria for release of software. Furthermore, the vast number of models for the testing and development phase has left the user somewhat confused. Criteria for comparison and selection of software models need to be developed, as is done initially in Musa and Okumoto (1982), Keiller, Littlewood, Miller and Sofer (1982), Iannino, Musa, Okumoto and Littlewood (1984), and Soyer and Singpurwalla (1985).

References

Amster, S. J. and Shooman, M. L. (1975). Software reliability: An overview. In: R. E. Barlow, J. B.


Fussell and N. D. Singpurwalla, eds., Reliability and Fault Tree Analysis: Theoretical and Applied
Aspects of System Reliability and Safety Assessment. SIAM, Philadelphia, PA, 455-485.
Angus, J. E., Schafer, R. E. and Sukert, A. (1980). Software reliability model validation. Proceedings
of the 1980 Annual Reliability and Maintainability Symposium, 191-198.
Barlow, R. E. and Singpurwalla, N. D. (1985). Assessing the reliability of computer software and
computer networks: An opportunity for partnership with computer scientists. The American
Statistician 39, 88-94.
Cheung, R. C. (1980). A user-oriented software reliability model. IEEE Transactions on Software
Engineering 6, 118-125.
Chow, C. and Schechner, Z. (1985). On simple statistical stopping rules for software debugging
processes. Technical Report. Columbia University.
Crow, L. H. and Singpurwalla, N. D. (1984). An empirically developed Fourier series model for
describing software failures. IEEE Transactions on Reliability 33, 176-183.
Dahil, D. and Lahti, J. (1978). Investigation of methods for production and verification of computer
programs with high requirements for reliability. OECD Halden Reactor Project Preliminary
Report.
Forman, E. H. and Singpurwalla, N. D. (1977). An empirical stopping rule for debugging and testing
computer software. Journal of the American Statistical Association 72, 750-757.
Forman, E. H. and Singpurwalla, N. D. (1979). Optimal time intervals for testing hypotheses on
computer software errors. IEEE Transactions on Reliability 28, 250-253.
Goel, A. L. (1980). Software error detection model with application. The Journal of Systems and
Software 1, 243-249.
Goel, A. L. (1980). A summary of the discussion on 'An analysis of competing software reliability
models'. IEEE Transactions on Software Engineering 6, 501-502.
Goel, A. L. and Okumoto, K. (1979). Time-dependent error-detection rate model for software
reliability and other performance measures. IEEE Transactions on Reliability 28, 206-211.
Goel, A. L. and Okumoto, K. (1979). A Markovian model for reliability and other performance
measures. Proceedings of the National Computer Conference, 769-774.
Horigome, M., Singpurwalla, N. D. and Soyer, R. (1984). A Bayes empirical Bayes approach for
(software) reliability growth. In: L. Billard, ed., Computer Science and Statistics: Proceedings of the
16th Symposium on the Interface. North-Holland, Amsterdam, 45-56.
Iannino, A., Musa, J. D., Okumoto, K. and Littlewood, B. (1984). Criteria for software reliability
model comparisons. IEEE Transactions on Software Engineering 10, 687-691.
Jelinski, Z. and Moranda, P. (1972). Software reliability research. In: W. Freiberger, ed., Statistical
Computer Performance Evaluation. Academic Press, New York, 465-484.
Jewell, W. S. (1985). Bayesian extensions to a basic model of software reliability. Technical Report,
Operations Research Center, University of California in Berkeley.
Joe, H. and Reid, N. (1983). Estimating the number of faults in a system. Submitted to JASA.
Keiller, P. A., Littlewood, B., Miller, D. R. and Sofer, A. (1982). On the quality of software reliability
prediction. In: J. K. Skwirzynski, ed., Electronic Systems Effectiveness and Life Cycle Costing.
Springer, New York, 441-460.
Kim, J. H., Kim, Y. H. and Park, C. J. (1982). A modified Markov model for the estimation of
computer software performance. Operations Research Letters 1, 253-257.
Koch, H. S. and Kubat, P. (1983). Optimal Release Time of Computer Software. IEEE Transactions
on Software Engineering 9, 323-327.
Kremer, W. (1983). Birth-death and bug counting. IEEE Transactions on Reliability 32, 37-46.
Krten, O. J. and Levy, J. (1980). Software modeling from optimal field energy. Proceedings of the
Annual Reliability and Maintainability Symposium, 410-414.
Kyparisis, J. and Singpurwalla, N. D. (1984). Bayesian inference for the Weibull process. In: L.
Billard, ed., Computer Science and Statistics: Proceedings of the 16th Symposium on the Interface.
North-Holland, Amsterdam, 57-64.
Kyparisis, J., Soyer, R. and Daryanani, S. (1984). Computer programs for inference from the Weibull
process. Institute for Reliability and Risk Analysis Technical Report, The George Washington
University, Washington, DC.
Langberg, N. and Singpurwalla, N. D. (1985). Unification of some software reliability models via the
Bayesian approach. SIAM Journal on Scientific and Statistical Computing 6, 781-790.
Lipow, M. (1974). Some variations of a model for software time-to-failure. Correspondence ML-74-
2260.1, TRW Systems Group.
Littlewood, B. (1979). How to measure software reliability and how not to. IEEE Transactions on
Reliability 28, 103-110.
Littlewood, B. (1979). Software reliability model for modular program structure. IEEE Transactions
on Reliability 28, 241-246.
Littlewood, B. (1980). The Littlewood-Verrall model for software reliability compared with some
rivals. The Journal of Systems and Software 1, 251-258.
Littlewood, B. (1980). Theories of software reliability: How good are they and how can they be
improved. IEEE Transactions on Software Engineering 6, 489-500.
Littlewood, B. (1981). A critique of the Jelinski-Moranda model for software reliability. Proceedings
of the 1981 Annual Reliability and Maintainability Symposium, 357-362.
Littlewood, B. (1981). Stochastic reliability growth: a model for fault-removal in computer-programs
and hardware design. IEEE Transactions on Reliability 30, 313-320.
Littlewood, B. and Verrall, J. L. (1973). A Bayesian reliability growth model for computer software.
Applied Statistics 22, 332-346.
Littlewood, B. and Verrall, J. L. (1981). Likelihood function of a debugging model for computer
software reliability. IEEE Transactions on Reliability 30, 145-148.
Meinhold, R. J. and Singpurwalla, N. D. (1983). Bayesian analysis of a commonly used model for
describing software failures. The American Statistician 37, 168-173.
Moranda, P. B. (1975). Prediction of software reliability during debugging. Proceedings of the 1975
Annual Reliability and Maintainability Symposium, 327-332.
Moranda, P. B. (1979). Event-altered rate models for general reliability analysis. IEEE Transactions
on Reliability 28, 376-381.
Musa, J. D. (1975). A theory of software reliability and its application. IEEE Transactions on Software
Engineering 1, 312-327.
Musa, J. D. (1979). Validity of execution-time theory of software reliability. IEEE Transactions on
Reliability 28, 181-191.
Musa, J. D. and Okumoto, K. (1982). Software reliability models: Concepts classification, compari-
sons, and practice. In: J. K. Skwirzynski, ed., Electronic Systems Effectiveness and Life Cycle Costing.
Springer, New York, 395-423.
Musa, J. D. and Okumoto, K. (1984). A logarithm Poisson execution time model for software
reliability measurement. Proceedings of the 1984 Reliability and Maintainability Symposium.
Nelson, E. (1978). Estimating software reliability from test data. Microelectron. Reliab. 17, 67-74.
Okumoto, K. and Goel, A. L. (1979). Optimal release time for software systems. Proceedings of
COMPSAC, 500-503.

Okumoto, K. and Goel, A. L. (1980). Optimal release time for software systems based on reliability
and cost criteria. Journal of Systems and Software 1, 315-318.
Petroski, C. M. (1984). A survey of software reliability. Student Report, The George Washington
University.
Ramamoorthy, C. V. and Bastani, F. B. (1979). An input domain based approach to the quantitative
estimation of software reliability. Proceedings of the Taipei Seminar on Software Engineering, Taipei,
Taiwan.
Ramamoorthy, C. V. and Bastani, F. B. (1980). Modeling the software reliability growth process.
Proceedings of COMPSAC, Chicago, IL, 161-169.
Ramamoorthy, C. V. and Bastani, F. B. (1982). Software reliability: Status and perspectives. IEEE
Transactions on Software Engineering 8, 354-371.
Schick, G. J. and Wolverton, R. W. (1978). An analysis of competing software reliability models.
IEEE Transactions on Software Engineering 4, 104-120.
Schick, G. J. and Wolverton, R. W. (1973). Assessment of software reliability. Proceedings Operations
Research, Physica-Verlag, Würzburg-Wien, 395-422.
Schneidewind, N. F. (1975). An analysis of computer processes in computer software. Proceedings
of the International Conference on Reliable Software, 337-346.
Shanthikumar, J. G. (1981). A general software reliability model for performance prediction.
Microelectron. Reliab. 27, 671-682.
Shanthikumar, J. G. (1983). Software reliability models: A review. Microelectron. Reliab. 23, 903-943.
Shanthikumar, J. G. and Tufekci, S. (1981). Optimal release time using generalized decision trees.
Proceedings of the Fourteenth Annual Hawaii International Conference on System Sciences, 58-65.
Shanthikumar, J. G. and Tufekci, S. (1983). Application of a software reliability model to describe
software release time. Microelectron. Reliab. 23, 41-59.
Shooman, M. L. (1972). Probabilistic models for software reliability prediction. In: W. Freiberger,
ed., Statistical Computer Performance Evaluation. Academic Press, New York, 485-502.
Shooman, M. L. (1973). Operational testing and software reliability estimation during program
development. Record of the 1973 IEEE Symposium on Computer Software Reliability, 51-57.
Shooman, M. L. (1975). Software reliability: Measurement and models. Proceedings of the 1975
Annual Reliability and Maintainability Symposium, 485-489.
Shooman, M. L. and Trivedi, A. K. (1976). A many state Markov model for computer software
performance parameters. IEEE Transactions on Reliability 25, 66-68.
Singpurwalla, N. D. and Soyer, R. (1985). Assessing (software) reliability growth using a random
coefficient autoregressive process and its ramifications. To appear in IEEE Transactions on Software
Engineering.
Sukert, A. N. (1977). An investigation of software reliability models. Proceedings of the Annual
Reliability and Maintainability Symposium, 78-84.
Sukert, A. N. (1979). Empirical validation of three software prediction models. IEEE Transactions on
Reliability 28, 199-205.
Stefanski, L. A. (1982). An application of renewal theory to software reliability. Proceedings of the
Twenty-Seventh Conference on the Design of Experiments in Army Research Development Testing. ARO
Report 82-2, 101-118.
Wagoner, W. L. (1973). The final report on a software reliability measurement study. Report TOR-
0074-(41221)-1, The Aerospace Corp., El Segundo, CA.

Dependence Notions in Reliability Theory

Narasinga R. Chaganty and Kumar Joag-dev

1. Introduction

The concepts of stochastic dependence play an important role in many statisti-


cal applications. Although in reliability theory it is rare that new dependence
concepts are created, the well known concepts such as Markov dependence, total
positivity, stochastic monotonicity and some others related to positive dependence
are quite important. The study of their significance and relevance in reliability
theory is the main object of the present chapter. The definitions and some immediate consequences of the concepts which we use in the following have already appeared in two articles of the Handbook: Boland and Proschan (this volume, Chapter 10), and Joag-dev (see Vol. 4, Chapter 4). We briefly review these for the sake of completeness.

Part I

The first part of our study will consist of the effects of dependence on the classification of life distributions according to the properties of aging. Most of these concepts originate in the bivariate case, and due to its importance and simplicity we will study this case in more detail. The major sources for the material covered in this part are the articles by Freund (1961), Harris (1970), Brindley and Thompson (1972), Shaked (1977) and the book by Barlow and Proschan (1981).

1.1. Definitions
Let $(X, Y)$ be a pair of real valued random variables defined on a fixed probability space. The joint distribution function and the marginals of $(X, Y)$ will be denoted by $F_{X,Y}$, $F_X$ and $F_Y$, and the corresponding density functions by $f_{X,Y}$, $f_X$, $f_Y$ respectively. We write $I(A)$ for the indicator of an event $A$. Many of the concepts of positive and negative dependence can be defined in terms of conditions on covariances of functions restricted to certain classes. Thus conditions

(a) $\mathrm{Cov}[X, Y] \ge 0$, (1.1)
(b) $\mathrm{Cov}[g_1(X), h_1(Y)] \ge 0$, where $g_1$ and $h_1$ are nondecreasing,
(c) $\mathrm{Cov}[g_2(X, Y), h_2(X, Y)] \ge 0$, where $g_2$ and $h_2$ are co-ordinatewise nondecreasing,
define successively (strictly) stronger positive dependence conditions. Condition (b) is known as positive quadrant dependence (PQD); it can be seen to be equivalent to
(b') $\mathrm{Cov}[I(X > x), I(Y > y)] \ge 0$.
Condition (c) is known as association. A condition stronger than (c), known as positive regression dependence, is obtained by requiring
(d) $E[f_1(X) \mid Y = y]$ to be nondecreasing in $y$, for every nondecreasing function $f_1$.
Note that this condition is non-symmetric. A condition known as 'monotone likelihood ratio' or 'totally positive of order 2 (TP$_2$)' is even stronger and is given by
(e) $f_{X,Y}(x_2, y_2)f_{X,Y}(x_1, y_1) \ge f_{X,Y}(x_2, y_1)f_{X,Y}(x_1, y_2)$ for $x_2 > x_1$ and $y_2 > y_1$.
Some of the concepts above have multivariate analogs. We mention some of
these. Corresponding to PQD, two non-equivalent multivariate generalizations
can be described. First one is called 'positive upper orthant dependence' (PUOD)
and the second one is labeled as 'positive lower orthant dependence' (PLOD).
These are defined by the conditions:

$$P[X_i \ge x_i,\; i = 1, \ldots, k] \ge \prod_{i=1}^{k}P[X_i \ge x_i] \qquad (1.2)$$

for every $x = (x_1, \ldots, x_k) \in \mathbb{R}^k$, and

$$P[X_i \le x_i,\; i = 1, \ldots, k] \ge \prod_{i=1}^{k}P[X_i \le x_i] \quad \text{for every } x \in \mathbb{R}^k\,. \qquad (1.3)$$

The condition of 'association' for $X = (X_1, \ldots, X_k)$, which is stronger than PUOD and PLOD, is given by

$$\mathrm{Cov}[g_k(X), h_k(X)] \ge 0\,, \qquad (1.4)$$

for every co-ordinatewise nondecreasing pair of functions $g_k, h_k : \mathbb{R}^k \to \mathbb{R}$. This condition was first introduced and studied by Esary, Proschan and Walkup (1967) (see Boland and Proschan's article in this volume).
A version of regression dependence similar to (d) above would be to require, for every $i = 1, \ldots, k$,

$$E[f_1(X_i) \mid X_j = x_j,\; j = 1, \ldots, (i - 1)] \qquad (1.5)$$

to be nondecreasing in each $x_j$, for every $f_1$ nondecreasing. This is sometimes known as 'positive regression dependence in sequence'. It can be shown that this implies association.
The property of association is important for obtaining bounds on the survival
probabilities of coherent systems. For example, for a series system with
component lives $T_i$, the system life is $\min_i T_i$ and association provides the bound

$$P[\min_i T_i > t] \ge \prod_i P[T_i > t]. \tag{1.6}$$

A similar bound can be obtained for a parallel system. These two bounds can be
combined to obtain bounds for a general coherent system.
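To make the bound (1.6) concrete, here is a minimal simulation sketch (ours, not from the chapter): the component lives share an independent common environment term, so they are co-ordinatewise increasing functions of independent variables and hence associated. All names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, t = 200_000, 2.0

# Associated lives: T_i = E + X_i with a shared environment E
# (increasing functions of independent variables are associated).
E = rng.exponential(1.0, n)
X1 = rng.exponential(1.0, n)
X2 = rng.exponential(1.0, n)
T1, T2 = E + X1, E + X2

lhs = (np.minimum(T1, T2) > t).mean()      # P[min_i T_i > t], series system
rhs = (T1 > t).mean() * (T2 > t).mean()    # product of marginal survivals
print(lhs, rhs, lhs >= rhs)                # the association bound (1.6)
```

With the shared term removed, the two sides would agree up to sampling error; the common environment makes the inequality strict.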
An analog of the TP$_2$ dependence given in (e) above is obtained by imposing this
condition on every pair of the arguments of the joint density in $\mathbb{R}^k$, while the other
arguments are kept fixed. This condition, known as MTP$_2$, implies association
(see for example Barlow and Proschan, 1981).
Finally, some of these conditions, with appropriate changes, may be used to
define negative dependence. For example, see Block, Savits and Shaked (1982)
and Joag-dev and Proschan (1983). The components of a vector
$X = (X_1, X_2, \ldots, X_k)$ are negatively associated if for every nonempty subset $A$ of
$\{1, 2, \ldots, k\}$ and every pair of co-ordinatewise nondecreasing functions $g$ and $h$,
the covariance $\mathrm{Cov}(g(X_A), h(X_{A^c}))$ is nonpositive, where $A^c$ denotes the complement of $A$.
Negative dependence is relevant in systems defined in closed environments. For
example, a given number of species competing in an ecosystem with a fixed
amount of resources may have their life lengths negatively associated.

1.2. Dependence and aging classification


We adopt the usual notation. A life distribution function $F$ is said to be
increasing failure rate (IFR) if the ratio $r(x) = f(x)/\bar F(x)$ is nondecreasing in $x$. We
say $F$ is decreasing failure rate (DFR) if $r(x)$ is nonincreasing in $x$. Here $f$ is the
density corresponding to $F$ and $\bar F = 1 - F$ is the survival function. The function
$r(x)$ is known as the failure rate. The distribution function $F$ is said to be
increasing failure rate on the average (IFRA) if $[\bar F(x)]^{1/x}$ is nonincreasing in $x \ge 0$,
and $F$ is new better than used (NBU) if $\bar F(x + y) \le \bar F(x)\bar F(y)$ for all $x, y \ge 0$.
Let $X, Y$ be the life-lengths of two components. We examine some dependence
relations which have interpretations in terms of failure rates. First note that the IFR
property is equivalent to $\bar F$ being log concave. Thus the conditional failure rate
$r(x \mid Y = y)$ being increasing in $x$ for every $y$ is equivalent to the conditional
survival function $\bar F_c(x \mid y)$ being log concave in $x$ for every $y$. Suppose now that $r(x \mid y)$
is decreasing in $y$ for every $x$, in addition to the conditional IFR property. This
would imply

$$-\frac{\partial}{\partial y}\, r(x \mid y) = \frac{\partial^2}{\partial y\, \partial x} \log \bar F_c(x \mid y) \ge 0, \tag{1.7}$$

or equivalently $\bar F_c(x \mid y)$ considered as a function of $x$ and $y$ is TP$_2$. Note that
if the joint density $f_{X,Y}(x, y)$ is TP$_2$, so is the conditional density $f_c(x \mid y)$; this
in turn implies that $\bar F_c(x \mid y)$ is TP$_2$. This is analogous to the univariate case,
where log-concavity implies IFR.
Another quantity of interest is the 'mean residual life', $m(x)$, which is the
conditional expectation of the remaining life at age $x$. This is given by

$$m(x) = \int_x^\infty (t - x) f(t)\, dt \Big/ \bar F(x) = \int_x^\infty \bar F(t)\, dt \Big/ \bar F(x). \tag{1.8}$$

The life distribution $F$ is said to be increasing mean residual life (IMRL) if $m(x)$
is increasing in $x \ge 0$. We say $F$ is decreasing mean residual life (DMRL) if $m(x)$
is decreasing in $x \ge 0$. To obtain the monotone behaviour of the conditional mean
residual life $m_c(x \mid y)$, it can be shown that it suffices to have

$$h(x, y) = \int_x^\infty (t - x) f_c(t \mid y)\, dt \tag{1.9}$$

be TP$_2$. Again it can be shown that this condition is weaker than that needed for
the monotonicity of $r(x \mid y)$. These results and some extensions were derived by
Shaked (1977). In the same article, Shaked (1977) also introduced the concept of
dependence by total positivity (DTP) for bivariate distributions. Recently Lee
(1985a) generalized the DTP concepts to the multivariate case and obtained a
number of inequalities and monotonicity properties of conditional hazard rate and
mean residual life functions of some multivariate distributions satisfying the DTP
property. In a subsequent paper Lee (1985b) introduced the concept of depen-
dence by reverse regular (DRR) rule, which is the mirror image of DTP, and
studied the relationship of DRR with other concepts of negative dependence.
Harris (1970) defined the IHR (increasing hazard rate) property for a multivariate
distribution by requiring
(a) $\bar F(x + t\mathbf{1})/\bar F(x)$ nonincreasing in $x$, and
(b) $P[X > u \mid X > x]$ nondecreasing in $x$ for every fixed vector $u$. (1.10)
The geometric interpretation of (b) has prompted its name 'right corner set increasing'
(RCSI). Condition (a) is clearly a 'wear out' condition, while, as we shall see, (b)
describes positive dependence.
Brindley and Thompson (1972) studied the class of distributions where only (a)
is satisfied. In order to distinguish between these two classes based on the aging
property, one satisfying (a) is called IFR, while the subclass with the additional
requirement of (b) is called IHR (H is for hazard or Harris!). In both cases the
classes can be seen to be closed under (a) taking subsets, (b) unions of indepen-
dent sets of variables, and (c) taking minimums over subsets. Note that the importance of
the minimums stems from their role in series systems. Both definitions, when
restricted to the univariate case, yield the usual IFR distribution. For the univariate case
(b) is trivially satisfied.

To see that RCSI implies positive dependence, let $K$ and $M$ be arbitrary subsets
(not necessarily disjoint) of $\{1, 2, \ldots, n\}$. Denoting the appropriate subvectors by $x_K$
and $x_M$ etc., it can be seen readily that (1.10b) implies that

$$P[X_M > u_M \mid X_K > x_K] \tag{1.11}$$

is a co-ordinatewise nondecreasing function of $x_K$, for every fixed $u_M$. Repeated
application of condition (1.11) with a singleton $K$ yields

$$\bar F(x) \ge \prod_{i=1}^{n} \bar F_i(x_i), \tag{1.12}$$

which is PUOD.
It would be worthwhile to mention examples of distributions where the above
dependency concepts are manifested in a natural way. If the components are
independent, then most of the conditions are trivially satisfied, and hence we
consider those having dependent components.
Let $U, X_1, X_2$ be independent random variables. Consider $Y_1 = \min(U, X_1)$,
$Y_2 = \min(U, X_2)$; such functions determine the life of a system where the com-
ponent corresponding to $U$ is connected in series. These functions are also impor-
tant when $U$ represents the arrival time of a shock which disables the components
corresponding to $X_1, X_2$. This model, when $U, X_1, X_2$ each has an exponential
distribution, was studied by Marshall and Olkin (1967). They also studied its
multivariate analog where different shocks disable $2, 3, \ldots, n$ components. It
should be noted that the property of association is preserved here because the
minimum of random variables is a co-ordinatewise increasing function.
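A small simulation sketch (ours) of the Marshall-Olkin construction just described; the rates are hypothetical, and the check below is the PQD condition (b$'$) at one pair of points.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
lam_u, lam1, lam2 = 0.5, 1.0, 1.5   # hypothetical shock/component rates

U = rng.exponential(1 / lam_u, n)   # common shock arrival time
X1 = rng.exponential(1 / lam1, n)
X2 = rng.exponential(1 / lam2, n)
Y1, Y2 = np.minimum(U, X1), np.minimum(U, X2)   # Marshall-Olkin pair

# Positive quadrant dependence: Cov[I(Y1 > y1), I(Y2 > y2)] >= 0
y1, y2 = 0.5, 0.8
cov = np.mean((Y1 > y1) & (Y2 > y2)) - np.mean(Y1 > y1) * np.mean(Y2 > y2)
print(cov)  # positive: the common shock induces positive dependence
```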
Gumbel (1960) discussed a simple model with a bivariate distribution whose
survival function is given by

$$\bar G(x, y) = \exp(-x - y - bxy), \quad x, y \ge 0, \tag{1.13}$$

where $0 \le b \le 1$. It is clear that the marginals are exponential; since the pair exhibits
negative regression dependence, the model is only appropriate when the two variables have
such dependence. Freund (1961) describes a bivariate model of a two component
system where the joint survival function is the same as that of two independent
exponentially distributed random variables with failure rates $\alpha$ and $\beta$, as long
as both components have not failed. Upon failure of one item, the failure rate
of the life distribution of the other component is changed to $\alpha'$ or to $\beta'$,
respectively. The joint survival function can be written as

$$\bar F(x, y) = \exp(-(\alpha + \beta)x)\left[\frac{\beta - \beta'}{\alpha + \beta - \beta'}\, e^{-(\alpha + \beta)(y - x)} + \frac{\alpha}{\alpha + \beta - \beta'}\, e^{-\beta'(y - x)}\right], \quad x \le y,$$

$$\bar F(x, y) = \exp(-(\alpha + \beta)y)\left[\frac{\alpha - \alpha'}{\alpha + \beta - \alpha'}\, e^{-(\alpha + \beta)(x - y)} + \frac{\beta}{\alpha + \beta - \alpha'}\, e^{-\alpha'(x - y)}\right], \quad y \le x. \tag{1.14}$$

The marginal distributions are not exponential but are certain mixtures of
exponentials, and the nature of the dependence is determined by the relative magni-
tudes of the parameters. In fact,

$$\bar F_1(x) = \frac{\alpha - \alpha'}{\alpha + \beta - \alpha'} \exp(-(\alpha + \beta)x) + \frac{\beta}{\alpha + \beta - \alpha'} \exp(-\alpha' x) \tag{1.15}$$

and

$$\bar F_2(y) = \frac{\beta - \beta'}{\alpha + \beta - \beta'} \exp(-(\alpha + \beta)y) + \frac{\alpha}{\alpha + \beta - \beta'} \exp(-\beta' y). \tag{1.16}$$

It is easy to verify that $F_1(x)$ is IFR if and only if $\alpha < \alpha'$, and $F_2(y)$ is IFR if
and only if $\beta < \beta'$.
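The following sketch (ours) simulates Freund's model by the memoryless description above, with the first failure arriving at rate $\alpha + \beta$ and the survivor's rate switching afterwards, and compares the empirical survival of $X$ with formula (1.15) as reconstructed above. The parameter values are arbitrary, chosen so that $\alpha < \alpha'$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
a, b, a1, b1 = 1.0, 2.0, 1.5, 0.5   # alpha, beta, alpha', beta' (hypothetical)

first = rng.exponential(1 / (a + b), n)          # time of the first failure
one_first = rng.random(n) < a / (a + b)          # did component 1 fail first?
extra = rng.exponential(1.0, n)                  # standard exponential residuals

X = np.where(one_first, first, first + extra / a1)  # life of component 1
Y = np.where(one_first, first + extra / b1, first)  # life of component 2

# Empirical check of the marginal survival function (1.15) at x = 1:
x = 1.0
emp = (X > x).mean()
theo = (a - a1) / (a + b - a1) * np.exp(-(a + b) * x) \
     + b / (a + b - a1) * np.exp(-a1 * x)
print(emp, theo)   # should agree up to Monte Carlo error
```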

Part II

The second part of our study deals with dependence concepts relevant to
models which consider repair and replacement of the components of a system.
These dependence concepts arise from the theory of stochastic processes.
Some of the classical types of stochastic processes characterized by different
dependence relationships are Markov processes, renewal processes and
Markov renewal processes; the latter includes the previous two as special cases.
The dependence relations such as total positivity, association and stochastic
monotonicity studied in Part I occur naturally among these processes. Needless
to say, the vast number of results in the study of the above processes
have wide applications in reliability theory. In the next few sections we shall
examine some of these processes and their applicability in characterizing the
failure rate of the life distributions of systems, as well as in obtaining bounds on
some other quantities of interest in reliability theory. The organization of this part
is as follows. In Section 2.1, we define totally positive Markov processes and
discuss some useful theorems related to them. A concept weaker than
total positivity is stochastic monotonicity; that is, all totally positive Markov
processes are stochastically monotone but not vice versa. This is discussed in
Section 2.2.

Many of the models in reliability theory which consider replacement of items
as they fail can be described by a renewal process. The renewal function is
defined as the expected number of renewals by a given instant of time. We
can obtain lower and upper bounds for the renewal function when the life
distribution of the items is assumed to be in one of the reliability classes of life
distributions. These results are discussed in Section 2.3.

2.1. Totally positive Markov processes

DEFINITION 1. A stochastic process $\{X_t,\ t \in [0, \infty)\}$ is said to be a Markov
process with state space $S$ if for any $t, s \ge 0$ and $j$ in $S$,

$$P[X_{t+s} = j \mid X_u,\ u \le t] = P[X_{t+s} = j \mid X_t]. \tag{2.1}$$

The Markov process is said to be a time-homogeneous Markov process when
the conditional probability

$$P[X_{t+s} = j \mid X_t = i] = P_s(i, j) \tag{2.2}$$

is independent of $t \ge 0$, for all $i, j$ in $S$ and $s \ge 0$. The collection of matrices
$P_t = (P_t(i, j))$, $t > 0$, is simply called the transition function of the Markov pro-
cess.

DEFINITION 2. A Markov process with transition matrix $P_t$ is said to be totally
positive (TP) if for $i_1 < i_2 < \cdots < i_n$ and $j_1 < j_2 < \cdots < j_n$, the determinant

$$P_t\!\begin{pmatrix} i_1, \ldots, i_n \\ j_1, \ldots, j_n \end{pmatrix} = \begin{vmatrix} P_t(i_1, j_1) & \cdots & P_t(i_1, j_n) \\ \vdots & & \vdots \\ P_t(i_n, j_1) & \cdots & P_t(i_n, j_n) \end{vmatrix} \tag{2.3}$$

is strictly positive when $t > 0$, for all $n \ge 1$. If (2.3) holds for $n \le r$, we say that the
Markov process is totally positive of order $r$ (TP$_r$).

When the state space $S$ is a countable set and the parameter set is the set of
integers, the Markov process is known as a Markov chain. The Markov chain is
said to be time-homogeneous if the transition function $P_n$ is independent of $n$, in
which case we simply write $P$. The Markov chain is totally positive if $P$ satisfies
condition (2.3). Karlin and McGregor (1959a, b) have shown that several
Markov chains and Markov processes are indeed totally positive, the prominent one
being the birth and death process.
An excellent treatment of totally positive Markov chains and totally positive
Markov processes, together with applications in several domains of mathematics
including reliability theory, is given in Karlin (1964). Typical of the results of
Karlin (1964) are the following theorems regarding inheritance of the TP character.
106 N. R. Chaganty and K. Joag-dev

THEOREM 3. Let the transition matrix $P$ of a Markov chain $\{X_k,\ k \ge 1\}$ be TP$_r$.
Define for $i > j$,

$$Q(n, i) = P[j < X_k \le i,\ 1 \le k \le n-1,\ X_n = j \mid X_0 = i]. \tag{2.4}$$

Then $Q(n, i)$ is TP$_r$ in the variables $n \ge 0$ and $i > j$.

The TP property is also prevalent when the initial state of the Markov chain
is fixed. We state this in the theorem below.

THEOREM 4. Assume the hypothesis of Theorem 3. Define for $j > i$,

$$Q_1(n, j) = P[i < X_k < j,\ 1 \le k \le n-1,\ X_n = j \mid X_0 = i]. \tag{2.5}$$

Then $Q_1$ is TP$_r$ in the variables $n \ge 0$ and $j > i$.

The above Theorem 4 was used by Brown and Chaganty (1983) to show that
the first passage time distribution from an initial state to a higher state in a birth
and death process is IFR. This result was also obtained by Keilson (1979) and by
Derman, Ross and Schechner (1979) using other methods. Another application of
Theorem 4 is given by Assaf, Shaked and Shanthikumar (1985). They have
shown that the time to failure of some systems subject to shocks and
damages, where the damages are not necessarily nonnegative, is IFR.

2.2. Stochastic monotonicity in Markov processes


A useful notion weaker than total positivity is stochastic monotonicity. This
concept was introduced by Kalmykov (1962) and later was discussed in detail by
Veinott (1965), Daley (1968), O'Brien (1972) and Kirstein (1976). A detailed study
of stochastic monotonicity in Markov processes can be found in the book by
Keilson (1979). Stochastic monotonicity is a structural property of the Markov
process. The random variables in such processes are associated and this con-
nection gives rise to many interesting inequalities in reliability theory. We define
below stochastic monotonicity for Markov chains and then extend the definition
to Markov processes.

DEFINITION 5. A Markov chain $\{X_k,\ k \ge 0\}$ is said to be stochastically mono-
tone if $X_{k+1}$ given $X_k = i$ is stochastically larger than $X_{k+1}$ given $X_k = j$, for
all $k \ge 0$ and $i > j$.

The extension of the stochastic monotonicity property to continuous time Markov
processes is straightforward.

DEFINITION 6. A time-homogeneous Markov process $\{X_t,\ t \ge 0\}$ is said to be
stochastically monotone if $X_t$ given $X_0 = x_1$ is stochastically larger than $X_t$ given
$X_0 = x_2$, for all $t > 0$ and $x_1 > x_2$.

Numerous Markov processes are indeed stochastically monotone. These in-
clude Markov diffusion processes. More generally, the class of totally positive
Markov processes is a proper subset of the class of stochastically monotone
Markov processes.
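For a finite state space, Definition 5 reduces to comparing tail sums of the rows of the transition matrix. A minimal sketch (ours; the matrices are hypothetical examples):

```python
import numpy as np

def is_stochastically_monotone(P, tol=1e-12):
    """Check Definition 5 for a finite transition matrix P: the tail sums
    sum_{m >= l} P[i, m] must be nondecreasing in the starting state i,
    for every level l."""
    tails = np.cumsum(P[:, ::-1], axis=1)[:, ::-1]  # tails[i, l] = P(next >= l | now = i)
    return bool(np.all(np.diff(tails, axis=0) >= -tol))

# A birth-and-death-type chain (totally positive, hence monotone):
P1 = np.array([[0.7, 0.3, 0.0],
               [0.2, 0.5, 0.3],
               [0.0, 0.4, 0.6]])
# A chain that jumps past its neighbour and is not monotone:
P2 = np.array([[0.1, 0.0, 0.9],
               [0.8, 0.2, 0.0],
               [0.0, 0.1, 0.9]])
print(is_stochastically_monotone(P1), is_stochastically_monotone(P2))
```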
Stochastically monotone Markov chains with partially ordered state spaces
were introduced by Kamae, Krengel and O'Brien (1977), and their applications to
problems in reliability theory were studied by Brown and Chaganty (1983). We
discuss these after introducing some notation. Let $S$ be a countable set with a
partial ordering denoted by $\ge$. A subset $C$ of $S$ is said to be an increasing set if $i$
belongs to $C$ and $j \ge i$ implies $j$ is in $C$. A time-homogeneous Markov chain
$\{X_n,\ n \ge 0\}$ with state space $S$ is said to be stochastically monotone if for $j \ge i$
the transition probability from $j$ to $C$ is larger than that from $i$ to $C$, for all increasing
sets $C$. The Markov chain is said to have monotone paths if $P(X_{n+1} \ge X_n) = 1$
for all $n \ge 0$. The following theorem characterizes the class of IFRA distributions
via stochastically monotone Markov chains.

THEOREM 7. Let $S$ be a partially ordered countable set. Let $\{X_n,\ n \ge 0\}$ be a
stochastically monotone Markov chain with monotone paths and state space $S$. Let $C$
be an increasing subset of $S$ with finite complement. Then the first passage time from
state $i$ to the set $C$ is IFRA.

Shaked and Shanthikumar (1984) generalized the above theorem by removing


the restriction that the complement of C is finite. As a converse to Theorem 7 we
have the following result.

THEOREM 8. Every IFRA distribution in discrete time is either the first passage
time distribution to an increasing set for a stochastically monotone Markov chain with
monotone paths on a partially ordered finite set, or the limit of a sequence of such
distributions.

Analogous theorems in the continuous time frame also hold. The above
theorems were used by Brown and Chaganty (1983) to show that the convolution
of two IFRA distributions is IFRA. Various other applications of the above
theorems, to shock models in reliability theory and to sampling with and without replace-
ment, can also be found in Brown and Chaganty (1983).
Stochastically monotone Markov chains also play an important role in
obtaining optimum control limit rules. The following formulation is due to Derman
(1963). Suppose that a system is inspected at regular intervals of time and that
after each inspection it is classified into one of $(m + 1)$ states denoted by $0, 1,
2, \ldots, m$. A control limit rule $l$ simply says to replace the system if the observed
state is one of the states $k, k+1, \ldots, m$, for some predetermined state $k$. The
state $k$ is called the control limit of $l$. Let $X_n$ denote the observed state of the
system in use at time $n \ge 0$. We assume that $\{X_n,\ n \ge 0\}$ is a stationary Markov
chain. Let $c(j)$ denote the cost incurred when the system is in state $j$. Let $L$
denote the class of all possible control limit rules. For $l \in L$, the asymptotic
expected average cost is defined as $A(l) = \lim_{n \to \infty} (1/n) \sum_{t=1}^{n} c(X_t)$. The following
theorem was proved by Derman (1963).

THEOREM 9. Let the Markov chain $\{X_n,\ n \ge 0\}$ be stochastically monotone. Then
there exists a control limit rule $l^*$ such that

$$A(l^*) = \min_{l \in L} A(l). \tag{2.6}$$
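A toy illustration (ours) of the control limit formulation: for a small deteriorating chain we simulate the average cost $A(l)$ of each control limit $k$. The cost structure, the replacement convention (restart in state 0, replacement cost charged when the limit is reached) and all numbers are our own assumptions, not Derman's.

```python
import numpy as np

def average_cost(P, cost, repl_cost, k, steps=200_000, seed=3):
    """Simulated long-run average cost of the control limit rule with
    limit k: whenever the observed state is >= k, pay repl_cost and
    replace (the system restarts in state 0)."""
    rng = np.random.default_rng(seed)
    m = P.shape[0]
    state, total = 0, 0.0
    for _ in range(steps):
        if state >= k:
            total += repl_cost
            state = 0
        total += cost[state]
        state = rng.choice(m, p=P[state])
    return total / steps

# Hypothetical deteriorating (stochastically monotone) chain on {0, 1, 2, 3}:
P = np.array([[0.6, 0.3, 0.1, 0.0],
              [0.0, 0.6, 0.3, 0.1],
              [0.0, 0.0, 0.7, 0.3],
              [0.0, 0.0, 0.0, 1.0]])
cost = np.array([0.0, 1.0, 4.0, 10.0])
for k in range(1, 5):                 # k = 4 means "never replace"
    print(k, round(average_cost(P, cost, repl_cost=5.0, k=k), 3))
```

Scanning the printout over $k$ locates the optimal control limit $l^*$ of Theorem 9 for this toy chain.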

2.3. Renewal theory in reliability


Let $\{X_i,\ i \ge 1\}$ be a sequence of nonnegative, independent and identically distri-
buted random variables. Let $S_n = X_1 + \cdots + X_n$ be the $n$th partial sum and let
$N_t$ be the maximum value of $n$ for which $S_n \le t$. In the context of reliability theory
we can think of the $X_i$'s as the lifetimes of items being replaced. The
partial sum $S_n$ represents the time at which the $n$th renewal takes place and $N_t$
is the number of renewals that will have occurred by time $t$. The counting
process $\{N_t,\ t \ge 0\}$ is known as a renewal process. The object of renewal theory
is to derive properties of certain random variables associated with $N_t$ from the
knowledge of the distribution function $F$ of $X_1$. In this section we shall discuss
the important results when the underlying distribution $F$ is assumed to belong to
one of the reliability classes of life distributions. For an extensive study of the
general theory of renewal processes we refer the reader to the expository article by
Smith (1958) and to the books by Cox (1962), Feller (1966) and Karlin and
Taylor (1975).
The renewal function $M(t) = E[N_t]$ plays a central role in reliability, especially
in maintenance models. It is useful to get bounds on $M(t)$ for finite $t$, since in
most cases computing $M(t)$ may be difficult. One such bound is given by
$M(t) \ge t/\mu_1 - 1$, where $\mu_1$ is the mean of $F$. Under the additional assumption that
$F$ is IFR, Obretenov (1974) obtained the following sharper bound:

$$M(t) \ge \frac{t}{\mu_1} + \frac{\delta}{\mu_1} - 1, \tag{2.7}$$

where $\delta = \lim_{n \to \infty} \mu_{n+1}/((n+1)\mu_n)$ and $\mu_n = E(X_1^n)$. Barlow and Proschan (1964),
while studying replacement policies when the life distribution of the unit is IFR,
obtained the following lower and upper bounds for the renewal random variable
$N_t$.

THEOREM 10. Let $R(t) = -\log \bar F(t)$. If $F$ is IFR with mean $\mu_1$, then

$$\text{(a)}\quad P(N_t \ge n) \le \sum_{j=n}^{\infty} \frac{(t/\mu_1)^j}{j!} \exp(-t/\mu_1), \quad \text{for } 0 \le t < \mu_1,$$

$$\text{(b)}\quad P(N_t \ge n) \ge \sum_{j=n}^{\infty} \frac{(nR(t/n))^j}{j!} \exp(-nR(t/n)),$$

for $t \ge 0$, $n \ge 1$.
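A Monte Carlo sketch (ours) of the upper bound in Theorem 10(a), taking $F$ to be a gamma distribution with shape 2 (which is IFR); scipy's `poisson.sf` supplies the Poisson tail sum, and all sample sizes are arbitrary.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(4)
reps, t, n = 100_000, 1.5, 2
shape = 2.0                      # gamma(2, 1) is IFR, with mean mu1 = 2

# N_t: number of renewals by time t, via cumulative sums of gamma lives
lives = rng.gamma(shape, 1.0, size=(reps, 8))   # 8 renewals is ample for t = 1.5
arrivals = np.cumsum(lives, axis=1)
N_t = (arrivals <= t).sum(axis=1)

lhs = (N_t >= n).mean()                  # P(N_t >= n)
mu1 = shape                              # mean of gamma(2, 1)
rhs = poisson.sf(n - 1, t / mu1)         # Poisson tail sum_{j >= n} in Theorem 10(a)
print(lhs, rhs, lhs <= rhs)              # the bound requires 0 <= t < mu1
```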

Under weaker conditions on F we have the following theorem.

THEOREM 11. Let $R(t) = -\log \bar F(t)$. If $F$ is NBU with finite mean, then

$$\text{(a)}\quad P(N_t \ge n) \le \sum_{j=n}^{\infty} \frac{(R(t))^j}{j!} \exp(-R(t)),$$

(b) $M(h) \le M(t + h) - M(t)$,

(c) $\mathrm{Var}(N_t) \le M(t)$,

for $t \ge 0$, $h \ge 0$, $n \ge 1$.

The reverse inequalities in the above theorem are valid for $F$ new worse than
used (NWU), that is, $\bar F(x + y) \ge \bar F(x)\bar F(y)$ for all $x, y \ge 0$. In a two-paper series,
Brown (1980, 1981) obtained nice properties of the renewal function $M(t)$ when
the underlying distribution $F$ is assumed to be DFR or IMRL. Let
$Z(t) = S_{N_t + 1} - t$ denote the forward recurrence time at time $t$ and $A(t) = t - S_{N_t}$
the renewal age at $t$. The following theorem can be found in Brown (1980, 1981).

THEOREM 12. (a) If the underlying distribution $F$ of the renewal process is DFR,
then the renewal density $M'(t)$ exists on $(0, \infty)$ and is decreasing, that is, $M(t)$ is
concave. Furthermore, $Z(t)$ and $A(t)$ are both stochastically increasing in $t \ge 0$.
(b) If $F$ is IMRL then $M(t) - t/\mu_1$ is increasing in $t \ge 0$ and $E[\phi(Z(t))]$ is
increasing in $t \ge 0$ for increasing convex functions $\phi$.

In the case where $F$ is IMRL, Brown (1981) provides counterexamples to show
that $Z(t)$ is not necessarily stochastically increasing, $E[A(t)]$ is not necessarily
increasing, and $M(t)$ need not be concave. An example of Berman (1978) shows
that the analogous results do not hold for IFR and DMRL distributions. As an
application of Theorem 12, Brown (1980) obtained sharp bounds for the renewal
function $M(t)$ for $F$ IMRL, with improved bounds for $F$ DFR. These results are
given in the next theorem.

THEOREM 13. Let $\mu_n = E(X_1^n)$, $n \ge 1$. Let $U(t) = t/\mu_1 + \mu_2/(2\mu_1^2)$. Let $\mu_{k+2}$ be
finite for some $k \ge 0$. If $F$ is IMRL then

$$U(t) \ge M(t) \ge U(t) - \min_{0 \le i \le k} d_i t^{-i}, \tag{2.8}$$

where the constant $d_i$ is a simple function of $\mu_1, \ldots, \mu_{i+2}$. Furthermore, if $F$ is DFR
then

$$U(t) \ge M(t) \ge U(t) - \min_{0 \le i \le k} \alpha_i d_i t^{-i}, \tag{2.9}$$

where $\alpha_0 = 1$ and $\alpha_i = (i/(i+1))^i$ for $i \ge 1$.
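The sketch below (ours) checks the upper bound $U(t)$ of Theorem 13 against a simulated renewal function for a hyperexponential (hence DFR) life distribution; the mixture parameters and sample sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(5)
reps, t = 100_000, 5.0

# Hyperexponential lifetime (mixture of exponentials) -- DFR
p, r1, r2 = 0.5, 0.5, 2.0
def sample(size):
    which = rng.random(size) < p
    return np.where(which, rng.exponential(1/r1, size), rng.exponential(1/r2, size))

mu1 = p / r1 + (1 - p) / r2                    # first moment, here 1.25
mu2 = 2 * (p / r1**2 + (1 - p) / r2**2)        # second moment, here 4.25
U = t / mu1 + mu2 / (2 * mu1**2)               # upper bound of Theorem 13

# Simulate M(t) = E[N_t]
k = 40                                         # ample number of renewals for t = 5
arrivals = np.cumsum(sample((reps, k)), axis=1)
M = (arrivals <= t).sum(axis=1).mean()
print(M, U, M <= U)
```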



Marshall and Proschan (1972) obtained the following characterization of the
NBU class of life distributions in terms of the renewal process $N_t$.

THEOREM 14. The distribution function $F$ is NBU (NWU) if and only if
$N(s + t) \ge (\le)\ N(s) * N(t)$ for all $s, t \ge 0$, where $*$ denotes the convolution operation.

Esary, Marshall and Proschan (1973) established the following IFRA property
for the renewal process, while studying some shock models.

THEOREM 15. Let $\{N_t,\ t \ge 0\}$ be a renewal process. Then $P[N_t \ge k]^{1/k}$ is
decreasing in $k \ge 1$, that is, $N_t$ possesses the discrete IFRA property.

References

Assaf, D., Shaked, M. and Shanthikumar, J. G. (1985). First passage times with PF$_r$ densities.
Journal of Appl. Prob. 22, 185-196.
Barlow, R. E. and Proschan, F. (1964). Comparison of replacement policies, and renewal theory
implications. Ann. Math. Statist. 35, 577-589.
Barlow, R. E. and Proschan, F. (1981). Statistical Theory of Reliability and Life Testing. To Begin With,
Silver Spring, Maryland.
Berman, M. (1978). Regenerative multivariate point processes. Adv. Appl. Probability 10, 411-430.
Block, H. W., Savits, T. H. and Shaked, M. (1982). Some concepts of negative dependence. Ann.
of Probability 10, 765-772.
Brindley, E. C. Jr. and Thompson, W. A. Jr. (1972). Dependence and aging aspects of multivariate
survival. Journal of Amer. Stat. Assoc. 67, 822-830.
Brown, M. (1980). Bounds, inequalities, and monotonicity properties for some specialized renewal
processes. Ann. of Probability 8, 227-240.
Brown, M. (1981). Further monotonicity properties for specialized renewal processes. Ann. of Proba-
bility 9, 891-895.
Brown, M. and Chaganty, N. R. (1983). On the first passage time distribution for a class of Markov
Chains. Ann. of Probability 11, 1000-1008.
Cox, D. R. (1962). Renewal Theory. Methuen, London.
Daley, D. J. (1968). Stochastically monotone Markov chains. Z. Wahrsch. verw. Gebiete 10, 305-317.
Derman, C. (1963). On optimal replacement rules when changes of state are Markovian. In: Richard
Bellman, ed., Mathematical Optimization Techniques. Univ. of California Press, 201-210.
Derman, C., Ross, S. M. and Schechner, Z. (1979). A note on first passage times in birth and death
and negative diffusion processes. Unpublished manuscript.
Esary, J. D., Marshall, A. W. and Proschan, F. (1973). Shock models and wear processes. Ann. of
Probability 1, 627-649.
Esary, J. D., Proschan, F. and Walkup, D. W. (1967). Association of random variables, with
applications. Ann. Math. Stat. 38, 1466-1474.
Feller, W. (1966). An Introduction to Probability Theory and Its Applications, Vol. II. Wiley, New York.
Freund, J. E. (1961). A bivariate extension of the exponential distribution. Journal of Amer. Stat.
Assoc. 56, 971-977.
Gumbel, E. J. (1960). Bivariate exponential distributions. Journal of Amer. Stat. Assoc. 55, 698-707.
Harris, R. (1970). A multivariate definition for increasing hazard rate distribution functions. Ann.
Math. Statist. 41, 713-717.
Joag-dev, K. and Proschan, F. (1983). Negative association of random variables with applications.
Ann. Statist. 11, 286-295.
Karlin, S. (1964). Total positivity, absorption probabilities and applications. Trans. Amer. Math. Soc.
111, 33-107.
Karlin, S. and McGregor, J. (1959a). Coincidence properties of birth and death processes. Pacific
Journal of Math. 9, 1109-1140.
Karlin, S. and McGregor, J. (1959b). Coincidence probabilities. Pacific Journal of Math. 9, 1141-1164.
Karlin, S. and Taylor, H. M. (1975). A First Course in Stochastic Processes, 2nd edition. Academic
Press, New York.
Kalmykov, G. I. (1962). On the partial ordering of one-dimensional Markov processes, Theor. Prob.
Appl. 7, 456-459.
Kamae, T., Krengel, U. and O'Brien, G. C. (1977). Stochastic inequalities on partially ordered
spaces. Ann. of Probability 5, 899-912.
Keilson, J. (1979). Markov Chain Models--Rarity and Exponentiality. Springer, New York.
Kirstein, B. M. (1976). Monotonicity and comparability of time homogeneous Markov processes with
discrete state space. Math. Operations Forschung. Stat. 7, 151-168.
Lee, Mei-Ling Ting (1985a). Dependence by total positivity. Ann. of Probability 13, 572-582.
Lee, Mei-Ling Ting (1985b). Dependence by reverse regular rule. Ann. of Probability 13, 583-591.
Marshall, A. W. and Proschan, F. (1972). Classes of distributions applicable in replacement, with
renewal theory implications. Proceedings of the 6th Berkeley Symposium on Math. Stat. and Prob.
I. Univ. of California Press, Berkeley, CA, 395-415.
Marshall, A. W. and Olkin, I. (1967). A multivariate exponential distribution. Journal of Amer. Stat.
Assoc. 62, 30-44.
Obretenov, A. (1974). An estimation for the renewal function of an IFR distribution. In: Colloq.
Math. Soc. Janos Bolyai 9. North-Holland, Amsterdam, 587-591.
O'Brien, G. (1972). A note on comparisons of Markov processes. Ann. of Math. Stat. 43, 365-368.
Shaked, M. (1977). A family of concepts of dependence for bivariate distributions. Journal of Amer.
Stat. Assoc. 72, 642-650.
Shaked, M. and Shanthikumar, J. G. (1984). Multivariate IFRA properties of some Markov jump
processes with general state space. Preprint.
Smith, W. L. (1958). Renewal theory and its ramifications. J. Roy. Statist. Soc., Series B 20, 243-302.
Veinott, A. F. (1965). Optimal policy in a dynamic, single product, nonstationary inventory model
with several demand classes. Operations Research 13, 761-778.

Application of Goodness-of-Fit Tests in Reliability

B. W. Woodruff and A. H. Moore

1. Introduction

Prior to using a probability model to represent the population underlying data,
it is important to test the adequacy of the model. One way to do this is by a
goodness-of-fit test. However, one must make an initial selection of models to be
tested. Several avenues are available for an initial screening of the data. One could
construct histograms, frequency polygons or more sophisticated non-parametric
density estimates [4, 23]. Another very useful initial screening device is the use
of a probability plot on special graph paper available for a variety of common
distributions used in life testing. Nelson [19] gives extensive coverage to the
use of probability plots in his book on reliability theory. After one has selected
a model to be tested further, an initial screening of the model could be done by
a χ² goodness-of-fit test discussed below. If the χ² test rejects at a suitable
significance level, then one can proceed to test other reasonable models. However,
if one fails to reject the model, then one should consider, if possible, other more
powerful goodness-of-fit tests.

2. χ² goodness-of-fit tests

This classical test is an almost universal goodness-of-fit test since it can be
applied to discrete, continuous or mixed distributions, with grouped or ungrouped
data, with the model completely specified or with the parameters estimated. It can also be
adapted for use with censored data or truncated distributions.

The test is an approximate test since the sample statistic is only asymptotically
χ² distributed. Several authors have shown it to have lower power than other
applicable tests. In applying the test, the data must be grouped into intervals.
Since several statisticians may group the data differently, this may lead to a
change in the reject or accept decision, and hence the test is not unique. It also
requires moderate to large sample sizes.


2.1. χ² test procedure

$H_0$: $F(x) = F_0(x)$,
$H_A$: $F(x) \ne F_0(x)$.

Take a random (or censored) sample from the unknown distribution and divide
the support set into a set of $k$ subsets. Now, under the null hypothesis, determine
the expected number of observations in each subset, denoted by $E_i$ ($i = 1, \ldots, k$).
The observed number of sample observations in each subset is denoted by $O_i$. A
usual rule is to choose the subsets so that the expected number of observations
in each subset is greater than or equal to 5. The test statistic is

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}.$$

We reject $H_0$ if $\chi^2 > \chi^2_{\alpha,\, k-p-1}$, where $p$ is the number of parameters esti-
mated in the specification of the null hypothesis $F_0(x)$.
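A minimal sketch (ours) of this procedure for an exponential null with its mean estimated from the data; the equiprobable-cell choice, $k = 8$ and the use of scipy for quantiles are our own conventions.

```python
import numpy as np
from scipy.stats import chi2, expon

def chi_square_gof(x, k=8):
    """Chi-square goodness-of-fit test of H0: exponential, with the mean
    estimated from the data (so p = 1 estimated parameter). Cells are
    equiprobable intervals under the fitted model; E_i >= 5 is the usual
    rule of thumb when choosing k."""
    n = len(x)
    scale = x.mean()                              # ML estimate of the mean
    edges = expon.ppf(np.linspace(0, 1, k + 1), scale=scale)
    O = np.histogram(x, bins=edges)[0]            # observed counts
    E = np.full(k, n / k)                         # expected counts
    stat = ((O - E) ** 2 / E).sum()
    crit = chi2.ppf(0.95, df=k - 1 - 1)           # alpha = 0.05, p = 1
    return stat, crit, stat > crit                # True => reject H0

rng = np.random.default_rng(6)
print(chi_square_gof(rng.exponential(2.0, 400)))  # should usually accept
print(chi_square_gof(rng.weibull(3.0, 400)))      # should usually reject
```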

3. Graphical techniques

A probability plot is a very useful way to provide a preliminary examination of
how well a particular distribution fits the data. It is fast and easy to use and can
provide parameter and percentile estimates of the distribution. It can be applied
to complete and censored data and to grouped data. There are probability graph
papers for the normal, lognormal, exponential, Weibull, extreme-value and chi-square
distributions. Weibull graph paper may be used for the Rayleigh distribution by
taking the shape parameter to be two.

3.1. Procedure for graphical techniques


(i) Order the observations from smallest to largest: $x_{(i)}$ ($1 \le i \le n$).
(ii) Assign a value of the cdf at each order statistic, $F(x_{(i)})$. A reasonable
value of the cdf at the $i$th order statistic is its median rank $(i - 0.3)/(n + 0.4)$.
Exact tables of median ranks are available for the smaller values of $i$ and $n$
(where $n$ is the sample size). Harter [8, 10] recently wrote several papers in which
he studied various plotting positions.
(iii) Plot the values of $x_{(i)}$ vs. $F(x_{(i)})$ on the probability paper. The papers are
constructed so that if a particular distribution fits the data, then the graph will
be approximately a straight line. A curved line would indicate that the chosen
distribution is inadequate to model the sample. Probability plots can also
uncover mixtures of distributions in modeling the sample. Mardia [11] states:

'The importance of the graphical method should not be underestimated and it is


always worthwhile to supplement a test procedure with a plot.'
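In place of special graph paper, the same construction can be carried out by rescaling the axes. A sketch (ours) of a Weibull probability plot using the median-rank plotting positions of step (ii); the data and shape parameter are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(7)
x = np.sort(rng.weibull(2.0, 50))            # ordered sample, Weibull shape 2

i = np.arange(1, len(x) + 1)
F = (i - 0.3) / (len(x) + 0.4)               # median-rank plotting positions

# Weibull paper: ln(-ln(1 - F)) vs ln(x) is linear iff the data are Weibull;
# the slope of the line estimates the shape parameter.
plt.plot(np.log(x), np.log(-np.log(1 - F)), "o")
slope = np.polyfit(np.log(x), np.log(-np.log(1 - F)), 1)[0]
plt.title(f"Weibull probability plot (slope ~ shape = {slope:.2f})")
plt.xlabel("ln x"); plt.ylabel("ln(-ln(1 - F))")
plt.show()
```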

4. Modified goodness-of-fit tests

Goodness-of-fit tests based on the empirical distribution function (EDF) fall
into two categories: (a) tests where the probability model to be tested is com-
pletely specified, in which case a single table may be used for all continuous distributions
for each test statistic, and (b) tests where the parameters are estimated, called
modified goodness-of-fit tests, in which case a different table must be used for each family of
distributions. Occasions where the null hypothesis may be completely specified are
rare and, except for one case, that situation will not be pursued further in this paper. If
one foolishly used tables for the completely specified case when the parameters
are estimated, then the actual α error would be much smaller than the specified value,
so strongly biasing the test towards acceptance that it is almost equivalent to accept-
ing $H_0$ without testing. See Lawless [12] for an extensive coverage of goodness-
of-fit tests.

4.1. Modified test statistics based on the EDF

To use a modified goodness-of-fit test based on the EDF, one has to choose
a family of cdfs of the form $F[(x - c)/\theta]$, where $c$ is a location parameter and $\theta$
is a scale parameter. The estimators of the nuisance parameters must be scale and
location invariant. Usual estimators having this property are maximum likelihood
estimators. When the estimators are inserted in the cdf, we will denote the
cdf evaluated at each order statistic under the null hypothesis, $F_0[(x_{(i)} - \hat c)/\hat\theta]$,
by $\hat F_i$.

Consider the following test statistics:
(i) The Kolmogorov-Smirnov statistic $\hat K$:

$$\hat K = \max(D^+, D^-), \quad \text{where } D^+ = \max_{1 \le i \le n} (i/n - \hat F_i), \quad D^- = \max_{1 \le i \le n} (\hat F_i - (i-1)/n).$$

(ii) The Anderson-Darling statistic $\hat A^2$:

$$\hat A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1)\left[\ln \hat F_i + \ln(1 - \hat F_{n+1-i})\right].$$

(iii) The Cramer-von Mises statistic $\hat W^2$:

$$\hat W^2 = \sum_{i=1}^{n} \left[\hat F_i - \frac{2i - 1}{2n}\right]^2 + \frac{1}{12n}.$$

(iv) The Kuiper statistic $\hat V$:

$$\hat V = D^+ + D^-.$$

(v) The Watson statistic $\hat U^2$:

$$\hat U^2 = \hat W^2 - n(\bar F - 1/2)^2, \quad \text{where } \bar F = \frac{1}{n}\sum_{i=1}^{n} \hat F_i.$$

When the parameters are estimated by location and scale invariant estimators, the
null distribution of the test statistic, and hence its percentage points, do not depend
on $c$ and $\theta$. However, in using the tables one must use the same estimators as
were used in the construction of the table. The table of critical values and the
power of the test are affected by the invariant estimators chosen.
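A sketch (ours) computing the five statistics above from a sample with ML-estimated location and scale under a normal null; as the text stresses, the resulting values must be referred to tables built with these same estimators.

```python
import numpy as np
from scipy.stats import norm

def edf_statistics(x, cdf):
    """Modified EDF statistics of Section 4.1. `cdf` maps the ordered
    sample to F0[(x_(i) - c_hat)/theta_hat]; here a fitted cdf is
    supplied directly."""
    F = cdf(np.sort(x))
    n = len(F)
    i = np.arange(1, n + 1)
    Dp = np.max(i / n - F)                       # D+
    Dm = np.max(F - (i - 1) / n)                 # D-
    KS = max(Dp, Dm)                             # Kolmogorov-Smirnov
    V = Dp + Dm                                  # Kuiper
    W2 = np.sum((F - (2 * i - 1) / (2 * n)) ** 2) + 1 / (12 * n)  # Cramer-von Mises
    A2 = -n - np.mean((2 * i - 1) * (np.log(F) + np.log(1 - F[::-1])))  # Anderson-Darling
    U2 = W2 - n * (F.mean() - 0.5) ** 2          # Watson
    return KS, V, W2, A2, U2

rng = np.random.default_rng(8)
x = rng.normal(10.0, 2.0, 40)
# Location-scale fit by ML (sample mean and the ML standard deviation):
print(edf_statistics(x, lambda t: norm.cdf(t, x.mean(), x.std())))
```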

4.2. Normal (and lognormal)


Mardia [ 11] gave an extensive discussion on tests of univariate and multivariate
normality. Many of the techniques discussed are applicable to other distributions.
In Table 1, he summarized the main univariate test statistics. If the distribution
is a two-parameter lognormal, then if we transform the data by taking the
logarithm of each observation, then we have a sample from the normal distribu-
tion with mean/~ and variance 02. If in a test for normality with the transformed
data we accept H o, then we are accepting that the original data was lognormal
with parameters /~ and 02. Lilliefors [13] derived by Monte Carlo simulation
tables for a modified goodness-of-fit test for the normal using the
Kolmogorov-Smirnov (KS) statistic and pointed out the difference in the per-
centage points for the modified test and standard test for a completely specified
Ho. He tabled critical values for n = 4(1)20(5)30 for significance levels ~ = 0.01,
0.05(0.05)0.20. He performed a power study for n = 10 and 20 with ~ -- 0.05 and
= 0.10 using four alternate distributions. In the power study he demonstrated
that the modified KS test had considerably higher power than the Z 2 test. Green
and Hegazy [6] derived tables for the modified goodness-of-fit test for the normal
among other distributions using Cramer-von Mises (CvM) and Anderson-Darling
(AD) statistics for n = 5, 10, 20, 40, 80, 160. Their power study showed improved
power over other known tests.
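Tables like Lilliefors' can be reproduced by direct simulation, since the null distribution of the modified statistic is parameter-free. A sketch (ours; replication count and seed are arbitrary):

```python
import numpy as np
from scipy.stats import norm

def lilliefors_critical(n, alpha=0.05, reps=20_000, seed=9):
    """Monte Carlo critical value for the modified KS statistic for the
    normal with both parameters estimated (Lilliefors-style). The null
    distribution does not depend on mu or sigma, so N(0, 1) suffices."""
    rng = np.random.default_rng(seed)
    stats = np.empty(reps)
    i = np.arange(1, n + 1)
    for r in range(reps):
        x = np.sort(rng.standard_normal(n))
        F = norm.cdf(x, x.mean(), x.std())       # plug in the ML estimates
        stats[r] = max(np.max(i / n - F), np.max(F - (i - 1) / n))
    return np.quantile(stats, 1 - alpha)

print(lilliefors_critical(20))   # roughly 0.19, well below the
                                 # completely-specified KS value ~0.29
```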

4.3. Exponential and Rayleigh distributions


Lilliefors [14] derived tables for a modified KS goodness-of-fit test for the
exponential distribution with unknown mean. He tabled critical values for
$n = 3(1)20(5)30$ and for significance levels $\alpha = 0.01$, $0.05(0.05)0.20$, and for
$n = 10$, 20, and 50 he conducted a power study with two alternative distributions.
Woodruff et al. [24] and Bush et al. [2] derived tables of modified KS, CvM
and AD tests for the two-parameter negative-exponential (Weibull with shape
parameter 1.0) for $n = 5(1)15(5)30$ and significance levels as above. A power
study was done for seven alternate distributions. It was shown that the CvM test
had the highest power for most of the alternative distributions studied when the
null hypothesis was the two-parameter negative exponential.
Woodruff et al. [24] and Bush et al. [2] also derived tables for the Rayleigh
distribution (Weibull shape parameter 2.0) for the same sample sizes and signifi-
cance levels given above.
The papers by Woodruff and Bush also studied a range of other Weibull shape
parameters, from 0.5(0.5)4.0. A second power study with seven alternate distribu-
tions showed that the AD statistic was the most powerful when the null distribu-
tion was a Weibull with shape parameter 3.5. A relationship between critical
values and the inverse of the shape parameter was presented for the range of
shape parameters studied.

4.4. Extreme-value and Weibull distributions


Nancy Mann [16] used the fact that two-parameter Weibull distributions (with
known location parameter) may be transformed, by taking the logarithm of the
observations, to the extreme-value distribution. After the transformation, one has
a family with unknown scale and location parameters. By deriving
the variance-covariance matrix of the standardized order statistics from the extreme-
value distribution, she was able to obtain best linear unbiased (BLUE) and best linear invariant
(BLIE) estimators for the unknown parameters, and hence estimates of the
parameters of the original Weibull distribution. It should be noted that the
estimators of the parameters of the extreme-value distribution are invariant scale and location
parameter estimators. In a following paper [17], she derived a goodness-of-fit test
for the extreme-value distribution of smallest values. Accepting the smallest
extreme-value distribution as the model for the transformed data is equivalent to
accepting the Weibull distribution as the model for the original data.
The test is not an EDF test, but several papers based on the EDF followed that
used the same principle of transforming the Weibull into the extreme-value dis-
tribution.
Littell et al. [15] derived, by Monte Carlo techniques, tables of critical values
for the modified KS, CvM and AD statistics for the extreme-value distribution for
$n = 10(5)40$ and $\alpha = 0.01$, $0.05(0.05)0.20$. They used maximum likelihood estimators
for the parameters. A power study compared the three new goodness-of-fit tests
with several earlier ones. In a later paper, Chandra et al. [3] derived tables of
critical values of modified goodness-of-fit statistics for the KS and the Kuiper
tests for testing the fit to the extreme-value distribution with unknown parameters.
The unknown parameters were estimated by the method of maximum likelihood.

4.5. Gamma distribution

Woodruff et al. [25] derived tables of the percentage points of the modified
KS, AD and CvM statistics for goodness-of-fit tests for the gamma distribution
with unknown scale and location parameters and known shape parameter, for
$n = 5(5)30$ and $\alpha = 0.01$, $0.05(0.05)0.20$.

A power study indicated that for larger sample sizes the CvM was the most
powerful of the three tests. The equation $C = a_0 + a_1(1/\beta^2)$ describes the form of
the relationship between the critical values $C$ and the shape parameter $\beta$ derived
for each of the statistics studied. Again ML estimators were used.

4.6. Logistic distribution

Woodruff et al. [26] derived tables of critical values for the modified KS, AD
and CvM goodness-of-fit statistics for the logistic distribution with unknown
scale and location parameters. ML estimators were used to obtain estimates of
the unknown parameters. The statistics were tabled for sample sizes $n = 5(5)30$
and significance levels $\alpha = 0.01$, $0.05(0.05)0.20$. A power study indicated quite good
power against uniform and exponential alternatives. The modified KS test had
lower power than the other two tests studied.

4.7. Pareto distribution

Porter and Moore [20] derived tables of critical values for the modified KS,
AD, and CvM goodness-of-fit statistics for the Pareto distribution with unknown
location and scale parameters and known shape parameter. Best linear unbiased
estimators were used to obtain the parameter estimates. The critical values were
tabled for sample sizes $n = 5(5)30$, significance levels $\alpha = 0.01$, $0.05(0.05)0.20$, and
Pareto shape parameters 0.5(0.5)4.0. The powers were investigated for eight alter-
native distributions. A functional relation between the critical values of the test
statistics and the Pareto shape parameter was derived.

4.8. Laplace distribution

Yen and Moore [28] derived tables of critical values for the modified AD and
CvM goodness-of-fit statistics for the Laplace distribution. The critical values
were tabled for sample sizes $n = 5(5)50$ and significance levels $\alpha = 0.01$,
$0.05(0.05)0.20$. The AD test generally yielded higher power than the CvM test.

5. Modifications of the EDF

One way to improve the power of a goodness-of-fit test is to improve the
non-parametric estimate of the distribution function. Harter, Khamis and Lamb
[7] modified the definition of the cdf at the $i$th order statistic to obtain a
(modified) KS test statistic for the case where the probability model to be tested
is completely specified. They have shown that the test obtained in this fashion is
substantially more powerful than the usual KS tests for small to moderate sample
sizes. Harter [9] also developed asymptotic formulas for the critical values of the
above test statistic.

6. New modified goodness-of-fit tests

New goodness-of-fit tests for symmetric alternatives were obtained by Moore
et al. [18], Woodruff et al. [27] and Yen and Moore [29] for the normal, uni-
form, and Laplace distributions, respectively. A reflection technique, in which the
data points are reflected about an invariant estimate of the mean, is used to double
the sample size. The new sample is used to obtain a better estimate of the
distribution function to be used in the goodness-of-fit statistics. New tables were
derived for the KS, AD and CvM statistics. The new goodness-of-fit statistics are
still invariant with respect to a change in scale or location parameters. Extensive
power studies showed that the new tests yielded considerably higher power for
sample sizes greater than or equal to 25 for all symmetric or nearly symmetric
alternative distributions. For non-symmetric alternative distributions, the new tests
showed a decrease in power, which was expected since Schuster [21] showed that
the reflection technique gives a poorer estimate of the distribution function in this
case.
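A sketch (ours) of the reflection device itself: the sample is doubled by reflecting about the sample mean (an invariant location estimate) before forming the EDF.

```python
import numpy as np

def reflected_edf(x, t):
    """EDF estimate at points t using the reflection technique of
    Section 6: the sample is doubled by reflecting about the sample
    mean, which improves the estimate when the underlying distribution
    is symmetric (Schuster, 1975)."""
    center = x.mean()
    doubled = np.concatenate([x, 2 * center - x])   # reflect about the mean
    return np.searchsorted(np.sort(doubled), t, side="right") / len(doubled)

rng = np.random.default_rng(10)
x = rng.normal(0.0, 1.0, 25)
t = np.linspace(-2, 2, 5)
print(reflected_edf(x, t))       # compare with the ordinary EDF:
print(np.searchsorted(np.sort(x), t, side="right") / len(x))
```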

7. Likelihood ratio tests

When a goodness-of-fit test fails to reject two families of distributions, one can
use a likelihood ratio test to discriminate between them. Bain [1] gives an
extensive coverage of likelihood ratio tests. He lists the test statistics to be used
to discriminate between normal vs. two-parameter exponential, normal vs. double
exponential, normal vs. Cauchy, Weibull vs. lognormal, and extreme-value vs.
normal. For large samples, the asymptotic likelihood ratio test could be used. For
small samples from other distributions, Monte Carlo techniques can be used to
obtain the percentage points of the sample statistic for the likelihood ratio test.

References

[1] Bain, L. J. (1978). Statistical Analysis of Reliability and Life-Testing Models (Theory and Methods).
Marcel Dekker, New York and Basel.
[2] Bush, J. G., Woodruff, B. W., Moore, A. H. and Dunne, E. J. (1983). Modified Cramer-von
Mises and Anderson-Darling tests for Weibull distribution with unknown location and scale
parameters. Commun. Statist.-- Theor. Meth. A 12, 2463-2476.
[3] Chandra, M., Singpurwalla, N. D. and Stephens, M. A. (1981). Kolmogorov statistics for tests of
fit for the extreme-value and Weibull distributions. J. Amer. Statist. Assoc. 76, 729-731.
[4] Devroye, L. and Györfi, L. (1985). Non-Parametric Density Estimation: the L1 View. Wiley, New
York.
[5] Durbin, J. (1975). Kolmogorov-Smirnov tests when parameters are estimated with applications
to tests of exponentiality and tests on spacings. Biometrika 62, 5-22.
[6] Green, J. R. and Hegazy, Y. A. S. (1976). Powerful modified goodness-of-fit tests. J. Amer.
Statist. Assoc. 71, 204-209.
[7] Harter, H. L., Khamis, H. T. and Lamb, R. E. (1984). Modified Kolmogorov-Smirnov tests
of goodness-of-fit. Commun. Statist.--Simula. Computa. 13, 293-323.

[8] Harter, H. L. (1984). Another look at plotting positions. Commun. Statist.--Theor. Method. 13,
1613-1633.
[9] Harter, H. L. (1984). Asymptotic formulas for critical values of a modified Kolmogorov test
statistic. Communications in Statistics B 13, 719-721.
[10] Harter, H. L. and Wiegand, R. P. (1985). A Monte Carlo study of plotting positions. Commun.
Statist.--Simula. Computa. 14, 317-343.
[11] Krishnaiah, P. R. (1980). Handbook of Statistics I, North-Holland, Amsterdam.
[12] Lawless, J. F. (1982). Statistical Models and Methods for Lifetime Data. Wiley, New York.
[13] Lilliefors, H. W. (1967). On the Kolmogorov test for normality with mean and variance
unknown. J. Am. Statist. Assoc. 62, 143-147.
[14] Lilliefors, H. W. (1969). On the Kolmogorov test for the exponential distribution with mean
unknown. J. Am. Statist. Assoc. 64, 387-389.
[15] Littell, R. D., McClave, J. T. and Offen, W. W. (1979). Goodness-of-fit tests for the two-
parameter Weibull distribution. Commun. Statist.--Simula. Computa. B 8, 257-269.
[16] Mann, N. R. (1968). Point and interval estimation procedures for the two-parameter Weibull
and extreme-value distributions. Technometrics 10, 231-256.
[17] Mann, N. R., Scheuer, E. M. and Fertig, K. W. (1973). A new goodness-of-fit test for the two
parameter Weibull or extreme-value distribution with unknown parameters. Communications in
Statistics 2, 383-400.
[18] Moore, A. H., Ream, T. J. and Woodruff, B. W. A new goodness-of-fit test for normality with
mean and variance unknown. (Submitted for publication.)
[19] Nelson, W. (1982). Applied Life Data Analysis. Wiley, New York.
[20] Porter, J. E., Moore, A. H. and Coleman, J. W. Modified Kolmogorov, Anderson-Darling and
Cramer-von Mises tests for the Pareto distribution with unknown location and scale parame-
ters. (Submitted for publication.)
[21] Schuster, E. F. (1975). Estimating the distribution function of a symmetric distribution. Bio-
metrika 62, 631-635.
[22] Stephens, M. A. (1977). Goodness-of-fit for the extreme-value distribution. Biometrika 64,
583-588.
[23] Tapia, R. A. and Thompson, J. R. (1978). Nonparametric Probability Density Estimation. The
Johns Hopkins University Press, Baltimore and London.
[24] Woodruff, B. W., Moore, A. H., Dunne, E. J. and Cortes, R. (1983). A modified
Kolmogorov-Smirnov test for Weibull distributions with unknown location and scale parame-
ters. IEEE Transactions on Reliability 32, 209-213.
[25] Woodruff, B. W., Viviano, P. J., Moore, A. H. and Dunne, E. J. (1984). Modified goodness-of-fit
tests for gamma distributions with unknown location and scale parameters. IEEE Transactions
on Reliability 33, 241-245.
[26] Woodruff, B. W., Moore, A. H., Yoder, J. D. and Dunne, E. J. (1986). Modified goodness-of-fit
tests for logistic distribution with unknown location and scale parameters. Commun.
Statist.--Simula. Computa. 15(1), 77-83.
[27] Woodruff, B. W., Woodbury, L. B. and Moore, A. H. A new goodness-of-fit test for the uniform
with unspecified parameters. (Submitted for publication.)
[28] Yen, V. C. and Moore, A. H. Modified goodness-of-fit tests for the Laplace distribution.
(Submitted for publication.)
[29] Yen, V. C. and Moore, A. H. New modified goodness-of-fit tests for the Laplace distribution.
(Submitted for publication.)

Multivariate Nonparametric Classes in Reliability

Henry W. Block* and Thomas H. Savits*

1. Introduction

This paper is a sequel to the survey paper of Hollander and Proschan (1984)
who examine univariate nonparametric classes and methods in reliability. In this
paper we will examine multivariate nonparametric classes and methods in relia-
bility.
Hollander and Proschan (1984) describe the various univariate nonparametric
classes in reliability. The classes of adverse aging described include the IFR,
IFRA, NBU, NBUE and DMRL classes. The dual classes of beneficial aging are
also covered. Several new univariate classes have been introduced since that time.
One that we will briefly mention is the HNBUE class, since we are aware of
several multivariate generalizations of this class.
The univariate classes in reliability are important in applications concerning
systems where the components can be assumed to be independent. In this case
the components are often assumed to experience wearout or beneficial aging of
a similar type. For example, it is often reasonable to assume that components
have increasing failure rate (IFR). In making this IFR assumption it is implicit
that each component separately experiences wear and no interactions among
components occur. However, in many realistic situations, adverse wear on one
component will promote adverse wear on other components. From another
point of view, a common environment will cause components to behave similarly.
In either situation it is clear that an assumption of independence of the components
would not be valid. Consequently, multivariate concepts of adverse or beneficial
aging are required.
Multivariate nonparametric classes have been proposed as early as 1970. For
background and references as well as some discussion of univariate classes with
multivariate generalizations in mind see Block and Savits (1981). In the present
paper we shall only describe a few fundamental developments prior to 1981 and

* Supported by Grant No. AFOSR-84-0113 and ONR Contract N00014-84-K-0084.



focus on developments since then. The coverage will not be exhaustive but will
emphasize the topics which we feel are most important.
Section 2 deals with multivariate nonparametric classes. In Section 2.1 multi-
variate IFRA is discussed with emphasis on the Block and Savits (1980) class.
Multivariate NBU is covered in Section 2.2 and multivariate NBUE classes are
mentioned in Section 2.3. New developments in multivariate IFR are considered
in Section 2.4, and in Section 2.5 the topics of multivariate DMRL and HNBUE
are touched on.
Familiarity with the univariate classes is assumed. The basic reference for the
IFR, IFRA, NBU and NBUE classes is Barlow and Proschan (1981). See also
Block and Savits (1981). For information on the DMRL class see Hollander and
Proschan (1984). The HNBUE class is relatively recent and the best references
are the original articles. See for example Klefsjö (1982) and the references
contained there.

2. Multivariate nonparametric classes

Many multivariate versions of the univariate classes were proposed using
generalizations of various failure rate functions. These multivariate classes were
extensively discussed in Block and Savits (1981). Other classes were proposed by
attempting to imitate univariate definitions in a multivariate setting. (See also
Block and Savits, 1981.) One of the most important of these extensions is due
to Block and Savits (1980), who generalized the IFRA class. This multivariate
class was proposed to parallel the development of the univariate case, where the
IFRA class possesses many important closure properties. As in the univariate
case, the following multivariate class of IFRA distributions, designated the MIFRA class,
satisfies important closure properties. First, as in the univariate case, monotone
systems with MIFRA lifetimes have MIFRA lifetimes and independent sums of
MIFRA lifetimes are MIFRA. From the multivariate point of view, subfamilies
of MIFRA are MIFRA, conjunctions of independent MIFRA are MIFRA, scaled
MIFRA lifetimes are MIFRA, and various other properties are satisfied. We
discuss this extension first since several other classes have been defined using
similar techniques.

2.1. Multivariate IFRA
Using a characterization of the univariate IFRA class in Block and Savits
(1976), the following definition can be made.

DEFINITION 2.1.1. Let $T = (T_1, \ldots, T_n)$ be a nonnegative random vector of lifetimes. The
random vector $T$ is said to be MIFRA if

$$E^\alpha[h(T)] \le E[h^\alpha(T/\alpha)]$$

for all continuous nonnegative nondecreasing functions $h$ and all $0 < \alpha \le 1$.

This definition, as mentioned above, implies all of the properties one would
desire for a multivariate analog of the univariate IFRA class. Part of the reason
for this is that the definition is equivalent to many other properties which are both
theoretically and intuitively appealing. The statements and proofs of these results
are given below; the form in which they are presented is influenced by the paper
of Marshall and Shaked (1982), who defined a similar MNBU class.
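Definition 2.1.1 can also be probed numerically. The sketch below (ours) checks the defining inequality for independent exponential lifetimes, which are MIFRA (independent IFRA lifetimes are MIFRA, see Example 2.1.6 below), with one particular $h$ and $\alpha$; all choices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 400_000
alpha = 0.5

# Independent exponential lifetimes are MIFRA (cf. Example 2.1.6);
# h below is continuous, nonnegative and nondecreasing.
T = rng.exponential(1.0, size=(n, 2))
h = lambda t: t[:, 0] + 2 * t[:, 1]

lhs = np.mean(h(T)) ** alpha                 # E^alpha[h(T)]
rhs = np.mean(h(T / alpha) ** alpha)         # E[h^alpha(T / alpha)]
print(lhs, rhs, lhs <= rhs)                  # the inequality of Definition 2.1.1
```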

NOTES. (1) Obviously in Definition 2.1.1 we need only consider $h$ defined on
$\mathbb{R}_+^n = \{x \mid x \ge 0\}$. Hence all of the functions and sets mentioned below are
assumed to be Borel measurable in $\mathbb{R}_+^n$.
(2) We say a function $g$ is homogeneous (subhomogeneous) on $\mathbb{R}_+^n$ if

$$\alpha g(t) = (\le)\ g(\alpha t) \quad \text{for all } 0 \le \alpha \le 1,\ t \ge 0.$$

(3) $A$ is an upper set if $x \in A$ and $x \le y$ implies $y \in A$.

THEOREM 2.1.2. The following conditions are all equivalent to $T$ being MIFRA.
(i) $P^\alpha\{T \in A\} \le P\{T \in \alpha A\}$ for all open upper sets $A$ in $\mathbb{R}_+^n$, all $0 < \alpha \le 1$.
(ii) $P^\alpha\{T \in A\} \le P\{T \in \alpha A\}$ for all upper sets $A$ in $\mathbb{R}_+^n$, all $0 < \alpha \le 1$ (i.e.
$E^\alpha(\phi(T)) \le E(\phi^\alpha(T/\alpha))$ for all nonnegative, binary, nondecreasing $\phi$ on $\mathbb{R}_+^n$).
(iii) $E^\alpha(h(T)) \le E(h^\alpha(T/\alpha))$ for all nonnegative, nondecreasing $h$ on $\mathbb{R}_+^n$, all
$0 < \alpha \le 1$.
(iv) For all nonnegative, nondecreasing, subhomogeneous $h$ on $\mathbb{R}_+^n$, $h(T)$ is IFRA.
(v) For all nonnegative, nondecreasing, homogeneous $h$ on $\mathbb{R}_+^n$, $h(T)$ is IFRA.

PROOF. (i) ⇒ (ii). By Theorem 3.3 of Esary, Proschan and Walkup (1967), for
an upper set $A$ and any $\varepsilon > 0$ there is an open upper set $A_\varepsilon$ such that $A \subset A_\varepsilon$ and
$P\{T \in \alpha A_\varepsilon\} \le P\{T \in \alpha A\} + \varepsilon$. Thus

$$P^\alpha\{T \in A\} \le P^\alpha\{T \in A_\varepsilon\} \le P\{T \in \alpha A_\varepsilon\} \le P\{T \in \alpha A\} + \varepsilon.$$

(ii) ⇒ (iii). Let $h_k$, $k = 1, 2, \ldots$, be an increasing sequence of increasing step
functions such that $\lim_{k \to \infty} h_k = h$. Specifically take

$$h_k(t) = \begin{cases} \dfrac{i-1}{2^k} & \text{if } \dfrac{i-1}{2^k} \le h(t) < \dfrac{i}{2^k},\quad i = 1, 2, \ldots, k2^k, \\[4pt] k & \text{if } h(t) \ge k, \end{cases}$$

i.e.

$$h_k(t) = \sum_{i=1}^{k2^k} \frac{1}{2^k}\, I_{A_{i,k}}(t),$$

where $I_{A_{i,k}}$ is the indicator function of the upper set $A_{i,k} = \{t \mid h(t) \ge i/2^k\}$. Thus
we need only prove the result for functions of the form

$$h(t) = \sum_{i=1}^{m} a_i I_{A_i}(t), \quad a_i \ge 0,$$

where $A_1, \ldots, A_m$ are upper sets, since the remainder follows by the monotone
convergence theorem. We have

$$E^\alpha\Big(\sum_{i=1}^m a_i I_{A_i}(T)\Big) = \Big[\sum_{i=1}^m a_i P\{T \in A_i\}\Big]^\alpha \le \Big[\sum_{i=1}^m a_i P^{1/\alpha}\{T \in \alpha A_i\}\Big]^\alpha$$

$$= \Big[\sum_{i=1}^m \Big(\int a_i^\alpha I_{A_i}(t/\alpha)\, dF(t)\Big)^{1/\alpha}\Big]^\alpha \le \int \Big(\sum_{i=1}^m a_i I_{A_i}(t/\alpha)\Big)^\alpha dF(t) = E\big(h^\alpha(T/\alpha)\big),$$

where the last inequality is due to Minkowski.

(iii) ⇒ Def. Obvious.
Def. ⇒ (i). From Esary, Proschan and Walkup (1967), for any open upper
set $A$ there exist nonnegative, nondecreasing, continuous functions $h_k$ such that
$h_k \uparrow I_A$. Then apply the monotone convergence theorem.
(iii) ⇒ (iv). Let $h$ be nonnegative, nondecreasing and subhomogeneous. Then

$$P^\alpha\{h(T) > t\} = E^\alpha\big(I_{(t,\infty)}(h(T))\big) \le E\big(I_{(t,\infty)}(h(T/\alpha))\big) \le P\{h(T) > \alpha t\},$$

where the first inequality follows from (iii) and the second by the subhomogeneity.
(iv) ⇒ (v). Obvious.
(v) ⇒ (i). Let $A$ be an open upper set and define

$$h(t) = \begin{cases} \sup\{\theta > 0 : t/\theta \in A\} & \text{if } \{\theta > 0 : t/\theta \in A\} \ne \emptyset, \\ 0 & \text{otherwise.} \end{cases}$$

Then $h$ is nonnegative, nondecreasing and homogeneous. Thus

$$P^\alpha\{T \in A\} = P^\alpha\{h(T) > 1\} \le P\{h(T) > \alpha\} = P\{T \in \alpha A\}.$$



NOTE 2.1.3. The following two alternate conditions could also have been added
to the above list of equivalent conditions (provided $\bar F(0) = 1$).
(vi) $P^\alpha\{T \in A\} \le P\{T \in \alpha A\}$ for each set of the form $A = \bigcup_{i=1}^{m} A_{x_i}$, where
$A_{x_i} = \{x \mid x > x_i\}$, $x_i \in \mathbb{R}_+^n$, and for all $0 < \alpha < 1$.
(vii) For each $k = 1, 2, \ldots$, for each $a_{ij}$, $i = 1, \ldots, k$, $j = 1, \ldots, n$,
$0 \le a_{ij} \le \infty$, and for each coherent life function $\tau$ of order $kn$,
$\tau(a_{11}T_1, \ldots, a_{k1}T_1,\ a_{12}T_2, \ldots, a_{k2}T_2, \ldots, a_{1n}T_n, \ldots, a_{kn}T_n)$ is IFRA. (See Block and Savits (1980) for
the definition of a coherent life function and for some details of the proof.)
In conjunction with the preceding result the following lemma makes it easy to
demonstrate that a host of different lifetimes are MIFRA.

LEMMA 2.1.4. Let T be MIFRA and let $\psi_1, \ldots, \psi_m$ be any continuous, nondecreasing, subhomogeneous functions of n variables. Then, if $S_i = \psi_i(T)$ for $i = 1, \ldots, m$, the vector $S = (S_1, \ldots, S_m)$ is MIFRA.

PROOF. This follows easily by considering a nonnegative, increasing, continuous function h of m variables and applying the MIFRA property of T and the monotonicity of the $\psi_i$.

COROLLARY 2.1.5. Let $\tau_1, \ldots, \tau_m$ be coherent life functions and let T be MIFRA. Then $(\tau_1(T), \ldots, \tau_m(T))$ is MIFRA.

PROOF. Since coherent life functions are homogeneous, this follows easily from Lemma 2.1.4.

EXAMPLE 2.1.6. Let $X_1, \ldots, X_n$ be independent IFRA lifetimes and let $\emptyset \ne S_i \subset \{1, 2, \ldots, n\}$, $i = 1, \ldots, m$. Since it is not hard to show that independent IFRA lifetimes are MIFRA, it follows that $(T_1, \ldots, T_m)$, where $T_i = \min_{j \in S_i} X_j$, $i = 1, \ldots, m$, is MIFRA. Since many different types of multivariate IFRA can be generated in the above way, the example shows that any of these are MIFRA. See Esary and Marshall (1979), where various types of multivariate IFRA of the type in this example are defined. See Block and Savits (1982) for relationships among these various definitions.
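The construction in this example is easy to experiment with numerically. The sketch below (ours, not from the chapter) draws independent Weibull lifetimes with shape parameters at least 1 (hence IFRA), forms the minima $T_i = \min_{j \in S_i} X_j$, and checks the upper-set inequality of Theorem 2.1.2(i) by Monte Carlo; the shape parameters, the index sets $S_i$, and the upper set A are all illustrative assumptions.

```python
import numpy as np

# Monte Carlo illustration of Example 2.1.6: minima over subsets of
# independent IFRA lifetimes form a MIFRA vector.  We check the inequality
# P^a{T in A} <= P{T in aA} of Theorem 2.1.2(i) for one arbitrarily chosen
# open upper set A and several values of a in (0, 1).

rng = np.random.default_rng(0)
n_rep = 200_000
shapes = np.array([1.5, 2.0, 3.0])        # Weibull shape >= 1, hence IFRA
X = rng.weibull(shapes, size=(n_rep, 3))  # independent IFRA lifetimes
S = [(0, 1), (1, 2)]                      # index sets S_1, S_2 (assumed)
T = np.stack([X[:, s].min(axis=1) for s in S], axis=1)  # T_i = min over S_i

def prob_in_A(V, level=0.8):
    """A = {(t1, t2) : t1 > level, t2 > level} is an open upper set."""
    return np.mean((V[:, 0] > level) & (V[:, 1] > level))

for a in (0.25, 0.50, 0.75):
    lhs = prob_in_A(T) ** a     # P^a{T in A}
    rhs = prob_in_A(T / a)      # P{T in aA} = P{T/a in A}
    print(f"a = {a:4.2f}:  {lhs:.4f} <= {rhs:.4f}")
```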

Multivariate shock models with multivariate IFRA properties have been treated in Marshall and Shaked (1979) and in Savits and Shaked (1981).

2.2. Multivariate NBU


As with all of the multivariate classes, the need for each of them is evident
because of the usefulness of the corresponding univariate class. The only dif-
ference is that in the multivariate case, the independence of the components is
lacking. In particular the concept of NBU is fundamental in discussing maintenance policies in a single component system. For a multicomponent system, where the components are dependent, the components marginally satisfy the univariate NBU property under various maintenance protocols. However, a joint concept

describing the interaction of all the components is necessary. Hence multivariate NBU concepts are required.
Most of the earliest definitions of multivariate NBU (see for example Buchanan and Singpurwalla, 1977) consisted of various generalizations of the defining property of the univariate NBU class. For a survey of these see definitions (1)-(5) of Section 5 of Block and Savits (1981). For shock models which satisfy these definitions see Marshall and Shaked (1979), Griffith (1982), Ebrahimi and Ghosh (1981) and Klefsjö (1982). Other definitions involving generalizations of properties of univariate NBU distributions are given by (7)-(9) of the same reference. These are similar to definitions used by Esary and Marshall (1979) to define multivariate IFRA distributions. Definitions (7) and (8) of the Block and Savits (1981) reference represent a certain type of definition and bear repeating here. The vector T is said to be multivariate NBU if:

$\tau(T_1, \ldots, T_n)$ is NBU for all $\tau$ in a certain class of life functions; (2.2.1)

There exist independent NBU lifetimes $X_1, \ldots, X_k$ and life functions $\tau_i$, $i = 1, \ldots, n$, in a certain class such that $T_i = \tau_i(X)$, $i = 1, \ldots, n$. (2.2.2)

El-Neweihi, Proschan and Sethuraman (1983) have considered a special case of (2.2.2) where the $\tau_i$ are minima and have related this case to some other definitions, including the special case of (2.2.1) where $\tau$ is any minimum.
As shown in Theorem 2.1.2, definitions involving increasing functions can be given equivalently in terms of upper (or open upper) sets. Two multivariate NBU definitions which were given in terms of upper sets were those of El-Neweihi (1981) and Marshall and Shaked (1982). These are, respectively: for every upper set $A \subset \mathbb{R}^n_+$ and for every $0 < \alpha < 1$,

$P\{T \in A\} \le P\{\min(T'/\alpha,\ T''/(1 - \alpha)) \in A\}$ (2.2.3)

where T, T', T'' are independent and have the same distribution; and, for every upper set $A \subset \mathbb{R}^n_+$ and for every $\alpha > 0$, $\beta > 0$,

$P\{T \in (\alpha + \beta)A\} \le P\{T \in \alpha A\}\ P\{T \in \beta A\}.$ (2.2.4)

Relationships among these definitions are given in El-Neweihi (1981). A more restrictive definition than either of the above has been given in Berg and Kesten (1984): for every pair of upper sets $A, B \subset \mathbb{R}^n_+$,

$P\{T \in A + B\} \le P\{T \in A\}\ P\{T \in B\}.$ (2.2.5)

This definition was shown to be useful in percolation theory as well as reliability theory.

A general framework involving generalizations of the concept (2.2.1), called taking the C-closure of $\mathscr{F}$, and of the concept (2.2.2), called C-generating from $\mathscr{F}$ (where $\mathscr{F}$ is the class of univariate NBU lifetimes in (2.2.1) and (2.2.2)), has been given by Marshall and Shaked (1984). Many of the previous NBU definitions are organized within this framework. A similar remark applies when the class $\mathscr{F}$ is the exponential, IFR, IFRA or NBUE class. See Marshall and Shaked (1984).

2.3. Multivariate NBUE


Along with their multivariate NBU versions, Buchanan and Singpurwalla (1977) give integrated versions of these definitions: they propose three versions of multivariate NBUE. The relations among these and closure properties are discussed in Ebrahimi and Ghosh (1981). Furthermore, the latter authors relate these multivariate NBUE definitions to four definitions of multivariate NBU (i.e. definitions (1)-(4) of Section 5 of Block and Savits (1981)).
Some other multivariate NBUE classes are mentioned by Block and Savits (1981) and Marshall and Shaked (1984). One extension of a univariate characterization of the NBUE class mentioned in Block and Savits (1978) has been proposed by Savits (1983).

2.4. Multivariate IFR


Perhaps the most important univariate concept in reliability is that of increasing
failure rate. One reason for this is that in a very simple and compelling way this
idea describes the wearout of a component. Many engineers, biologists and
actuaries find this description fundamental. The monotonicity of the failure rate
function is simple and intuitive and occurs in many physical situations. This also
is crucial in the multicomponent case where the components are dependent.
Several authors have attempted to describe the action of the failure rates
increasing for n components simultaneously. These cases were discussed in Block
and Savits (1981) and in the references contained therein.
A recent definition of multivariate IFR was given by Savits (1985) and is in
the spirit of the classes defined by Block and Savits (1980) and Marshall and
Shaked (1982). For shock models involving multivariate IFR concepts see Ghosh
and Ebrahimi (1981).
It is shown in Savits (1985) that a univariate lifetime T is IFR if and only if $E[h(x, T)]$ is log concave in x for all functions h(x, t) which are log concave in (x, t) and are nondecreasing in t for each fixed $x \ge 0$. This leads to the following multivariate definition.

DEFINITION 2.4.1. Let T be a nonnegative random vector. Then T has an MIFR distribution if $E[h(x, T)]$ is log concave in x for all functions h(x, t) which are log concave in (x, t) and nondecreasing in $t \ge 0$ for each fixed $x \ge 0$.
This class enjoys many closure properties. Among these are that all marginals are MIFR, conjunctions of independent MIFR are MIFR, convolutions of MIFR

are MIFR, scaled MIFR are MIFR, nonnegative nondecreasing concave functions of MIFR are MIFR, and weak convergence preserves MIFR. See Savits (1985) for details. From these results it follows that the multivariate exponential distribution of Marshall and Olkin (1967) is MIFR, as are all distributions with log concave densities. Since the multivariate folded normal has a log concave density, it also is MIFR.
The technique used in Definition 2.4.1 for the MIFR class extends to other
multivariate classes. In particular, if we replace log concave with log subhomo-
geneous, we get the same multivariate IFRA class as in Definition 2.1.1; if we
replace log concave with log subadditive, we get a new multivariate NBU class
which is between that of (2.2.3) and (2.2.4). For more details see Savits (1983,
1985).

2.5. Multivariate DMRL and HNBUE

Few definitions of multivariate DMRL have been discussed in the literature, although E. El-Neweihi has privately communicated one to us. Since developments with respect to this class are premature, we will not go into details.
Multivariate extensions of the HNBUE class have been proposed by Basu and Ebrahimi (1981) and Klefsjö (1980). The extensions of the former authors are similar in spirit to the multivariate NBUE classes of Ghosh and Ebrahimi (1981). The latter author's definition extends the univariate definition by replacing the univariate exponential distribution with the bivariate Marshall and Olkin (1967) distribution, and considers various multivariate versions of the definition.
Basu and Ebrahimi (1981) show relationships among their definitions and Klefsjö's, give some closure properties, and also point out relations with multivariate NBUE classes.

References

Barlow, R. E. and Proschan, F. (1981). Statistical Theory of Reliability and Life Testing: Probability
Models. To Begin With, Silver Spring, MD.
Basu, A. P. and Ebrahimi, N. (1981). Multivariate HNBUE distributions. University of Missouri-
Columbia, Technical Report # 110.
Berg, J. and Kesten, H. (1984). Inequalities with applications to percolation and reliability. Unpublished report.
Block, H. W. and Savits, T. H. (1976). The IFRA closure problem. Ann. Prob. 4, 1030-1032.
Block, H. W. and Savits, T. H. (1978). Shock models with NBUE survival. J. Appl. Prob. 15, 621-628.
Block, H. W. and Savits, T. H. (1980). Multivariate increasing failure rate average distributions. Ann.
Prob. 8, 793-801.
Block, H. W. and Savits, T. H. (1981). Multivariate classes in reliability theory. Math. Oper. Res. 6, 453-461.
Block, H. W. and Savits, T. H. (1982). The class of MIFRA lifetimes and its relation to other classes. NRLQ 29, 55-61.
Buchanan, B. and Singpurwalla, N. D. (1977). Some stochastic characterizations of multivariate survival. In: C. P. Tsokos and I. Shimi, eds., The Theory and Applications of Reliability, Vol. I, Academic Press, New York, 329-348.

Ebrahimi, N. and Ghosh, M. (1981). Multivariate NBU and NBUE distributions. The Egyptian
Statistical Journal 25, 36-55.
El-Neweihi, E. (1981). Stochastic ordering and a class of multivariate new better than used distributions. Comm. Statist.-Theor. Meth. A 10(16), 1655-1672.
El-Neweihi, E., Proschan, F. and Sethuraman, J. (1983). A multivariate new better than used class derived from a shock model. Operations Research 31, 177-183.
Esary, J. D. and Marshall, A. W. (1979). Multivariate distributions with increasing hazard rate
averages. Ann. Prob. 7, 359-370.
Esary, J. D., Proschan, F. and Walkup, D. W. (1967). Association of random variables, with
applications. Ann. Math. Stat. 38, 1466-1474.
Ghosh, M. and Ebrahimi, N. (1981). Shock models leading to increasing failure rate and decreasing
mean residual life survival. J. Appl. Prob. 19, 158-166.
Griffith, W. (1982). Remarks on a univariate shock model with some bivariate generalizations. NRLQ
29, 63-74.
Hollander, M. and Proschan, F. (1984). Nonparametric concepts and methods in reliability. In: P.
R. Krishnaiah and P. K. Sen, eds., Handbook of Statistics, Vol. 4, Elsevier, Amsterdam.
Klefsjö, B. (1980). Multivariate HNBUE. Unpublished report.
Klefsjö, B. (1982). NBU and NBUE survival under the Marshall-Olkin shock model. IAPQR Transactions 7, 87-96.
Klefsjö, B. (1982). The HNBUE and HNWUE classes of life distributions. NRLQ 29, 331-344.
Marshall, A. W. and Olkin, I. (1967). A generalized bivariate exponential distribution. J. Appl. Prob.
4, 291-302.
Marshall, A. W. and Shaked, M. (1979). Multivariate shock models for distributions with increasing
hazard rate average. Ann. Prob. 7, 343-358.
Marshall, A. W. and Shaked, M. (1982). A class of multivariate new better than used distributions.
Ann. Prob. 10, 259-264.
Marshall, A. W. and Shaked, M. (1984). Multivariate new better than used distributions. Un-
published report.
Savits, T. H. and Shaked, M. (1981). Shock models and the MIFRA property. Stoch. Proc. Appl.
11, 273-283.
Savits, T. H. (1983). Multivariate life classes and inequalities. In: Y. L. Tong, ed., Inequalities in Statistics and Probability. IMS, Hayward, CA.
Savits, T. H. (1985). A multivariate IFR class. J. Appl. Prob., 22, 197-204.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 131-156

9

Selection and Ranking Procedures in Reliability Models*

Shanti S. Gupta and S. Panchapakesan

1. Introduction

Situations abound in practice where the aim of the statistical analyst is to


compare two or more populations in some fashion with a view to rank them or
select the best one(s) among them. For example, a purchasing firm may want to
determine which one of several competing suppliers of components for a certain
computer is producing the highest quality product. Typically, the populations that
are compared will be life length distributions of the components from the com-
peting manufacturers. The best population could be defined as the one with the
largest mean life or with the largest quantile (percentile) of a given order. In such
situations, the classical tests of homogeneity are not designed to answer efficiently
several possible questions of interest to the experimenter. Selection and ranking
procedures were initially devised in the early 1950's to provide the analyst appro-
priate tools to answer these questions. Most of the investigations in the last thirty
odd years have adopted one or the other of two basic formulations. One of them
is the so-called indifference zone (IZ) formulation of Bechhofer (1954) and the
other is the subset selection (SS) approach of Gupta (1956).
Our main purpose in this paper is to describe some important selection proce-
dures that are relevant to reliability models. Selection procedures are available in
the literature for various parametric families of distributions. Many of these dis-
tributions serve as appropriate models for the life length of a unit. However, we
will be concerned with only a few of these such as exponential, gamma, and
Weibull distributions. Besides some nonparametric and distribution-free proce-
dures, we emphasize selection procedures for restricted families of distributions
such as the increasing failure rate (IFR) and increasing failure rate on the average
(IFRA) families which are of importance in reliability problems. In dealing with
these procedures, we mainly use the SS approach.

* This research was supported by the Office of Naval Research Contract N00014-84-C-0167 at
Purdue University. Reproduction in whole or in part is permitted for any purpose of the United
States Government.

In the last three decades and more, the literature on selection and ranking
procedures has grown enormously. Several books have appeared exclusively
dealing with selection and ranking procedures. Of these, the monograph of
Bechhofer, Kiefer and Sobel (1968) deals with sequential procedures with special
emphasis on the Koopman-Darmois family. Gibbons, Olkin and Sobel (1977) deal
with methods and techniques mostly under the IZ formulation. Gupta and
Panchapakesan (1979) provide a comprehensive survey of the developments in the
field of ranking and selection, with a special chapter on Guide to Tables. They
deal with all aspects of the problem and provide an extensive bibliography.
Büringer, Martin and Schriever (1980) and Gupta and Huang (1981) have discussed some specific aspects of the problem. A fairly comprehensive categorized
bibliography is provided by Dudewicz and Koo (1982). For a critical review and
an assessment of developments in subset selection theory and techniques, refer-
ence may be made to Gupta and Panchapakesan (1985).
Section 2 discusses the formulation of the basic problem of selecting the best
population using the IZ and SS approaches. Section 3 deals with selection from
gamma, exponential and Weibull populations. Procedures for different generalized
goals are discussed using both IZ and SS approaches. Nonparametric procedures
are discussed in Section 4 for selecting in terms of $\alpha$-quantiles. This section also
discusses procedures for Bernoulli distributions. These serve as distribution-free
procedures for selecting from life distributions in terms of reliability at an arbi-
trarily chosen time. Procedures for selection from restricted families of distribu-
tions are described in Section 5. These include procedures for IFR and IFRA
families in particular. A brief discussion of selection in comparison with a
standard or control follows in Section 6.

2. Selection and ranking procedures

Let $\pi_1, \ldots, \pi_k$ be k given populations, where $\pi_i$ has the associated distribution function $F_{\theta_i}$, $i = 1, \ldots, k$. The $\theta_i$ are real-valued parameters taking values in the set $\Theta$. It is assumed that the $\theta_i$ are unknown. The ordered $\theta_i$ are denoted by $\theta_{[1]} \le \theta_{[2]} \le \cdots \le \theta_{[k]}$ and the (unknown) population associated with $\theta_{[i]}$ by $\pi_{(i)}$, $i = 1, \ldots, k$. The populations are ranked according to their $\theta$-values. To be specific, $\pi_{(j)}$ is defined to be better than $\pi_{(i)}$ if $i < j$. No prior information is assumed regarding the true pairing between $(\theta_1, \ldots, \theta_k)$ and $(\theta_{[1]}, \ldots, \theta_{[k]})$.

2.1. Indifference zone (IZ) formulation


The goal in the basic problem in the IZ approach is to select the best population, namely, the one associated with $\theta_{[k]}$. A procedure is required to choose one of the populations. A correct selection (CS) is a selection of population(s) satisfying the goal. Here CS corresponds to choosing the best population. Any selection procedure is required to guarantee a minimum probability of a correct selection (PCS). In the IZ formulation, this requirement is that, for any rule R,
(PCS). In the IZ formulation, this requirement is that, for any rule R,

$P(\mathrm{CS}\,|\,R) \ge P^*$ whenever $\delta(\theta_{[k]}, \theta_{[k-1]}) \ge \delta^*$, (2.1)

where $P(\mathrm{CS}\,|\,R)$ denotes the PCS using R, and $\delta(\theta_{[k]}, \theta_{[k-1]})$ is an appropriate measure of separation of the best population $\pi_{(k)}$ from the next best $\pi_{(k-1)}$. The constants $P^*$ and $\delta^*$ are specified by the experimenter in advance. The statistical problem is to define a selection rule which really consists of a sampling rule, a stopping rule for sampling, and a decision rule. If we consider taking a single sample of fixed size n from each population, then the minimum value of n is determined subject to (2.1). A crucial step involved in this is to evaluate the infimum of the PCS over $\Omega_{\delta^*} = \{\theta = (\theta_1, \ldots, \theta_k) : \delta(\theta_{[k]}, \theta_{[k-1]}) \ge \delta^*\}$. Any configuration of $\theta$ where this infimum is attained is called a least favorable configuration (LFC). Between two valid (i.e. satisfying (2.1)) single sample procedures, the sample size n is an obvious criterion for efficiency comparison. The region $\Omega_{\delta^*}$ is called the preference zone. No requirement is made regarding the PCS when $\theta$ belongs to the complement of $\Omega_{\delta^*}$ which, in fact, is the indifference zone.

2.2. Subset selection (SS) approach


In the SS approach for selecting the best population, the goal is to select a
nonempty subset of the k populations which includes the best population. The
size of the selected subset is not fixed in advance; it is rather determined by the
data themselves. Selection of any subset consistent with the goal (i.e. including
the best population) is a correct selection. It is required that, for any rule R,

$P(\mathrm{CS}\,|\,R) \ge P^*$ for all $\theta \in \Omega$ (2.2)

where $\Omega = \{\theta\}$ is the whole parameter space. It should be noted that there is no indifference zone specification in this formulation. As is to be expected, a crucial step is the evaluation of the infimum of the PCS over $\Omega$. Any subset selection rule that satisfies (2.2) meets the criterion of validity. Denoting the selected subset by S and its size by |S|, the expected value of |S| serves as a reasonable measure for efficiency comparison between valid procedures. Besides E(|S|), possible performance characteristics include E(|S|) - PCS and E(|S|)/PCS. The former represents the expected number of nonbest populations included in the selected subset. As an overall measure, one can also consider the supremum of E(|S|) over $\Omega$.

2.3. Some general remarks


The probability requirement, (2.1) or (2.2) as the case may be, is usually
referred to as the basic probability requirement, or the P*-requirement, or the
P*-condition. There are several modifications and generalizations of the basic goal
and requirements on the procedures in both IZ and SS formulations. These will
be described as the necessity arises during our discussion of several procedures.

For details on these aspects of the problem, reference may be made to Gupta and
Panchapakesan (1979).
Suppose that the best population is the one associated with the largest $\theta_i$. A procedure R is said to be monotone if the probability of selecting $\pi_i$ is at least as large as that of selecting $\pi_j$ whenever $\theta_i \ge \theta_j$.

2.4. Two types of subset selection rules


Let $T_i$ be the statistic associated with the sample from $\pi_i$ ($i = 1, \ldots, k$) with distribution function $F(x, \theta_i)$; the $\theta_i$ are the parameters to be ranked. Most of the rules that have been studied in the literature are of one of the following types:

R1: Select $\pi_i$ if and only if

$T_i \ge \max_{1 \le j \le k} T_j - d$ (2.3)

and

R2: Select $\pi_i$ if and only if

$T_i \ge c \max_{1 \le j \le k} T_j$ (2.4)

where $d > 0$ and $c \in (0, 1)$ are to be determined so that the P*-requirement is satisfied.
These rules $R_1$ and $R_2$ have been typically proposed when $\theta_i$ is a location and a scale parameter, respectively. When $\theta_i$ is neither a location nor a scale parameter (e.g. a noncentrality parameter), usually one of these two rules has been proposed depending on the nature of the support of $T_i$. Most of the rules that are discussed in this paper come under one of these two types. Treatment of $R_1$ and $R_2$ in the location and scale case, respectively, is given in Gupta (1965). The following properties hold for $R_1$ in the location case and $R_2$ in the scale case.
(1) The procedure is monotone (Gupta, 1965).
(2) If the distribution $F(x, \theta)$ possesses a density $f(x, \theta)$ having a monotone likelihood ratio (MLR) in x, then E(|S|) is maximized when $\theta_1 = \cdots = \theta_k$, and this maximum is $kP^*$ (Gupta, 1965).
(3) Under the MLR assumption, the rule is minimax when the loss is measured by |S| or the number of non-best populations selected (Berger, 1979).
(4) In a fairly large class of rules, the procedure is minimax when the loss is measured by the maximum probability of including a non-best population (Berger and Gupta, 1980).
A comprehensive unified theory is due to Gupta and Panchapakesan (1972), who have considered a class of rules which includes $R_1$ and $R_2$ as special cases; see Gupta and Panchapakesan (1979, Section 11.2). Gupta and Huang (1980) have obtained an optimal rule in the class of rules for which the PCS is at least $\gamma$ by minimizing the supremum of E(|S|).
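For concreteness, here is a minimal sketch (ours, not the authors') of the two generic rules; the sample statistics are placeholders, and the constants d and c would in practice come from tables constructed to meet the P*-requirement.

```python
import numpy as np

def subset_R1(T, d):
    """Location-type rule R1 of (2.3): select pi_i iff T_i >= max_j T_j - d."""
    T = np.asarray(T, dtype=float)
    return np.flatnonzero(T >= T.max() - d)

def subset_R2(T, c):
    """Scale-type rule R2 of (2.4): select pi_i iff T_i >= c * max_j T_j."""
    T = np.asarray(T, dtype=float)
    return np.flatnonzero(T >= c * T.max())

# Hypothetical sample statistics for k = 4 populations.
T = [4.1, 5.6, 5.2, 3.0]
print(subset_R1(T, d=1.0))   # indices of the selected populations
print(subset_R2(T, c=0.9))
```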

3. Selection from parametric families

Numerous parametric models are employed in the analysis of life length data and in problems connected with the modeling of aging or failure processes. Among univariate models, a few particular distributions, namely, the exponential, Weibull, and gamma, stand out in view of their proven usefulness in a wide range of situations. Of course, these distributions are related to each other. In this section, we will discuss a few typical procedures for selection from these populations.

3.1. Selection from gamma populations


Let $\pi_1, \ldots, \pi_k$ denote k given gamma populations with density functions

$f(x, \theta_i) = \dfrac{x^{\alpha - 1}}{\Gamma(\alpha)\,\theta_i^{\alpha}}\, \exp(-x/\theta_i), \quad x > 0;\ \theta_i, \alpha > 0;\ i = 1, \ldots, k,$ (3.1)

with a common known shape parameter $\alpha$. For the goal of selecting a subset containing the best population, namely, the one associated with $\theta_{[k]}$, Gupta (1963a) proposed a rule based on the sample means $\bar X_i$, $i = 1, \ldots, k$, arising from n independent observations from each population. The rule of Gupta (1963a) is
Table 1a
Values of the constant c of Rule R3 satisfying equation (3.3); P* = 0.90

v\k 2 3 4 5 6 7 8 9 10 11

1 0.111 0.072 0.059 0.052 0.047 0.044 0.041 0.039 0.038 0.036
2 0.244 0.183 0.159 0.145 0.135 0.128 0.123 0.119 0.116 0.113
3 0.327 0.260 0.232 0.215 0.203 0.195 0.188 0.183 0.178 0.174
4 0.386 0.317 0.286 0.268 0.255 0.246 0.239 0.232 0.228 0.223
5 0.430 0.360 0.329 0.310 0.297 0.287 0.279 0.273 0.268 0.263
6 0.466 0.396 0.364 0.345 0.332 0.321 0.313 0.307 0.301 0.296
7 0.494 0.426 0.394 0.374 0.361 0.350 0.342 0.336 0.330 0.325
8 0.519 0.451 0.419 0.400 0.386 0.376 0.367 0.360 0.355 0.350
9 0.539 0.472 0.441 0.422 0.408 0.398 0.389 0.382 0.376 0.371
10 0.558 0.492 0.460 0.441 0.428 0.417 0.409 0.402 0.396 0.391
11 0.573 0.508 0.478 0.459 0.445 0.434 0.426 0.419 0.414 0.408
12 0.588 0.524 0.493 0.474 0.461 0.450 0.442 0.435 0.429 0.424
13 0.600 0.537 0.507 0.488 0.475 0.465 0.456 0.450 0.444 0.439
14 0.612 0.550 0.520 0.502 0.488 0.478 0.470 0.463 0.457 0.452
15 0.622 0.561 0.532 0.514 0.500 0.490 0.482 0.475 0.469 0.464
16 0.632 0.572 0.543 0.525 0.511 0.501 0.493 0.486 0.481 0.476
17 0.641 0.582 0.553 0.535 0.522 0.512 0.504 0.497 0.491 0.486
18 0.649 0.591 0.562 0.544 0.532 0.522 0.514 0.507 0.501 0.496
19 0.657 0.599 0.571 0.553 0.540 0.531 0.523 0.516 0.510 0.506
20 0.664 0.607 0.579 0.562 0.549 0.539 0.531 0.525 0.519 0.514

Table 1b
Values of the constant c of Rule R3 satisfying equation (3.3); P* = 0.95

v\k 2 3 4 5 6 7 8 9 10 11

1 0.053 0.035 0.028 0.025 0.023 0.021 0.020 0.019 0.018 0.018
2 0.156 0.119 0.104 0.095 0.089 0.085 0.082 0.079 0.076 0.074
3 0.233 0.188 0.168 0.156 0.148 0.142 0.138 0.134 0.131 0.128
4 0.291 0.242 0.220 0.206 0.197 0.190 0.184 0.180 0.176 0.173
5 0.336 0.285 0.261 0.247 0.237 0.229 0.223 0.218 0.214 0.210
6 0.372 0.320 0.296 0.281 0.271 0.263 0.256 0.251 0.247 0.243
7 0.403 0.350 0.326 0.310 0.300 0.291 0.285 0.279 0.275 0.271
8 0.428 0.376 0.351 0.336 0.325 0.316 0.310 0.304 0.300 0.296
9 0.451 0.399 0.374 0.358 0.347 0.339 0.332 0.326 0.322 0.317
10 0.471 0.419 0.394 0.378 0.367 0.359 0.352 0.346 0.341 0.337
11 0.488 0.437 0.412 0.396 0.385 0.377 0.370 0.364 0.359 0.355
12 0.504 0.453 0.428 0.413 0.402 0.393 0.386 0.380 0.376 0.371
13 0.518 0.468 0.443 0.428 0.417 0.408 0.401 0.395 0.390 0.386
14 0.531 0.481 0.457 0.442 0.430 0.422 0.415 0.409 0.404 0.400
15 0.543 0.494 0.470 0.454 0.443 0.434 0.428 0.422 0.417 0.413
16 0.554 0.505 0.481 0.466 0.455 0.446 0.439 0.434 0.429 0.424
17 0.564 0.516 0.492 0.477 0.466 0.457 0.450 0.445 0.440 0.436
18 0.574 0.526 0.502 0.487 0.476 0.468 0.461 0.455 0.450 0.446
19 0.582 0.535 0.512 0.497 0.486 0.477 0.470 0.465 0.460 0.456
20 0.591 0.544 0.520 0.506 0.495 0.486 0.480 0.474 0.469 0.465

R3: Select $\pi_i$ if and only if

$\bar X_i \ge c \max_{1 \le j \le k} \bar X_j$ (3.2)

where c is the largest number with $0 < c < 1$ for which the P*-requirement is met. The LFC is given by $\theta_1 = \cdots = \theta_k$ and the constant c is determined by

$\int_0^\infty G_v^{k-1}(x/c)\, g_v(x)\, dx = P^*,$ (3.3)

where $v = n\alpha$ and $G_v$ and $g_v$ are the cdf and the density, respectively, of a standardized gamma random variable (i.e. with $\theta = 1$) with shape parameter v. Gupta (1963a) has tabulated the values of c for v = 1(1)25, k = 2(1)11, and P* = 0.75, 0.90, 0.95, 0.99. Tables 1a and 1b are excerpted from the tables of Gupta (1963a) and they provide c-values for k = 2(1)11, v = 1(1)20, and P* = 0.90 and 0.95, respectively.
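The constant c of (3.3) is also easy to compute directly. The following sketch (ours, assuming SciPy is available) evaluates the integral numerically and root-finds in c; the left-hand side decreases from 1 to 1/k as c runs from 0 to 1, so a root exists whenever $1/k < P^* < 1$. For k = 2, v = 5, P* = 0.90 it should reproduce the Table 1a entry 0.430 up to rounding.

```python
import numpy as np
from scipy import integrate, optimize, stats

def lhs(c, k, v):
    """Integral in (3.3), with G_v and g_v the cdf and density of a
    standardized gamma (scale 1) random variable with shape parameter v."""
    f = lambda x: stats.gamma.cdf(x / c, v) ** (k - 1) * stats.gamma.pdf(x, v)
    return integrate.quad(f, 0.0, np.inf)[0]

def solve_c(k, v, p_star):
    """Largest c in (0, 1) with lhs(c) >= P*; lhs is decreasing in c."""
    return optimize.brentq(lambda c: lhs(c, k, v) - p_star, 1e-6, 1 - 1e-9)

print(round(solve_c(k=2, v=5, p_star=0.90), 3))   # compare with Table 1a
```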
Depending on the physical nature of the problem, we may be interested in selecting the population associated with $\theta_{[1]}$, which is the best population now.

In this case, the procedure analogous to $R_3$ is

R4: Select $\pi_i$ if and only if

$\bar X_i \le \dfrac{1}{c'} \min_{1 \le j \le k} \bar X_j$ (3.4)

where $0 < c' < 1$ is the largest number for which the P*-condition is met. The constant c' is given by

$\int_0^\infty [1 - G_v(c'x)]^{k-1}\, g_v(x)\, dx = P^*$

where $v = n\alpha$. The values of the constant c' have been tabulated for v = 1(1)25, k = 2(1)11, and P* = 0.75, 0.90, 0.95, 0.99 by Gupta and Sobel (1962b), who have studied rule $R_4$ in the context of selecting from k normal populations the one with the smallest variance in a companion paper (1962a).
It is known that the gamma family $\{F(x, \theta)\}$, with a common shape parameter $\alpha$, is stochastically increasing in $\theta$, i.e., $F(x, \theta_i)$ and $F(x, \theta_j)$ are distinct for $\theta_i \ne \theta_j$, and $F(x, \theta_i) \ge F(x, \theta_j)$ for all x when $\theta_i < \theta_j$. This implies that ranking the populations in terms of $\theta$ is equivalent to ranking in terms of the $\alpha$-quantile for any $0 < \alpha < 1$.

3.2. Selection from exponential (one-parameter) populations


We first note that this is a special case of gamma populations with densities $f(x, \theta_i)$ in (3.1) with $\alpha = 1$. Thus the rules $R_3$ and $R_4$ are applicable. Now consider a life testing situation where a sample of n items from each population is put on test and the sample is censored (type II) at the rth failure. Let $X_{i1} < X_{i2} < \cdots < X_{ir}$ denote the r complete lives in the sample from $\pi_i$, $i = 1, \ldots, k$. Define

$T_i = \sum_{j=1}^r X_{ij} + (n - r)X_{ir}, \quad i = 1, \ldots, k.$ (3.5)

The $T_i$ are the so-called total life statistics. It is well known that $2T_i/\theta_i$ has a chi-square distribution with 2r degrees of freedom. In other words, $T_i$ has a gamma distribution with scale parameter $\theta_i$ and shape parameter r. Thus for selecting the population with the largest mean life $\theta_i$, the procedure $R_3$ (stated in terms of the $T_i$) will be

R3: Select $\pi_i$ if and only if

$T_i \ge c \max_{1 \le j \le k} T_j$ (3.6)

where c is given by (3.3) with v = r.
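A small sketch of this selection rule on hypothetical type II censored data (all numbers are invented for illustration; the constant c is read from Table 1a with v = r):

```python
import numpy as np

def total_life(failures, n):
    """Total life statistic (3.5): the r observed failure times plus
    (n - r) copies of the largest observed failure time."""
    x = np.sort(np.asarray(failures, dtype=float))
    r = x.size
    return x.sum() + (n - r) * x[-1]

# k = 3 populations, n = 10 items each on test, censored at the r = 4th failure.
samples = {
    "pi_1": [12.1, 30.5, 44.0, 59.8],
    "pi_2": [25.3, 41.2, 70.7, 88.4],
    "pi_3": [8.9, 15.6, 21.3, 33.0],
}
T = {name: total_life(x, n=10) for name, x in samples.items()}
c = 0.317                      # Table 1a entry for k = 3, v = r = 4, P* = 0.90
selected = [name for name, t in T.items() if t >= c * max(T.values())]
print(T, selected)
```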



3.3. Selection from two-parameter exponential distributions

Let $\pi_i$ have density

$f(x, \theta_i, \sigma) = \dfrac{1}{\sigma} \exp\Big\{-\dfrac{x - \theta_i}{\sigma}\Big\}, \quad x > \theta_i;\ \theta_i, \sigma > 0;\ i = 1, \ldots, k.$ (3.7)

The density (3.7) provides a model for life length data when we assume a minimum guaranteed life $\theta_i$, which is here a location parameter. It is assumed that all the k populations have a common scale parameter $\sigma$. The $\theta_i$ are unknown and our interest is in selecting the population associated with the largest $\theta_i$. We will discuss some procedures under the IZ formulation. Consider the generalized goal of selecting a subset of fixed size s so that the t best populations ($1 \le t \le s < k$) are included in the selected subset. This generalized goal was introduced by Desu and Sobel (1968). The special case of t = s, namely, that of choosing t populations so that they are the t best, was considered originally by Bechhofer (1954). When s = t = 1, we get the basic goal of selecting the best population. The probability requirement is that

$\mathrm{PCS} \ge P^*$ whenever $\theta_{[k-t+1]} - \theta_{[k-t]} \ge \theta^* > 0$ (3.8)

where $\theta^*$ and $P^*$ are specified in advance and a correct selection occurs when a subset of s populations is selected consistent with the goal. Also, for a meaningful problem, we should have $\binom{s}{t}/\binom{k}{t} < P^* < 1$. In describing several procedures, we will adopt either the generalized goal or one of its special cases. We will consider the two cases of known and unknown $\sigma$ separately.

Case A: Known $\sigma$. We can assume without loss of generality that $\sigma = 1$. Let $X_{ij}$, $j = 1, \ldots, n$, denote a sample of n observations from $\pi_i$, $i = 1, \ldots, k$. Define $Y_i = \min_{1 \le j \le n} X_{ij}$, $i = 1, \ldots, k$.
Raghavachari and Starr (1970) considered the goal of selecting the t best populations (i.e. $1 \le s = t < k$) and they studied the 'natural' rule

R5: Select the t populations associated with $Y_{[k-t+1]}, \ldots, Y_{[k]}$. (3.9)

The LFC for this rule is given by

$\theta_{[1]} = \cdots = \theta_{[k-t]};\quad \theta_{[k-t+1]} = \cdots = \theta_{[k]};\quad \theta_{[k-t+1]} - \theta_{[k-t]} = \theta^*.$ (3.10)

The minimum sample size required to satisfy (3.8) is the smallest integer n for which

$(1 - e^{-n\theta^*})^{k-t} + (k - t)\, e^{n\theta^* t}\, I(e^{-n\theta^*};\ t + 1,\ k - t) \ge P^*$ (3.11)

where

$I(z; \alpha, \beta) = \int_0^z u^{\alpha - 1}(1 - u)^{\beta - 1}\, du, \quad \alpha, \beta > 0;\ 0 \le z \le 1.$ (3.12)

Equivalently, we need the smallest integer n such that

$n\theta^* \ge -\log v,$ (3.13)

where v ($0 < v < 1$) is the solution of the equation

$(1 - v)^{k-t} + (k - t)\, v^{-t}\, I(v;\ t + 1,\ k - t) = P^*.$ (3.14)

Raghavachari and Starr (1970) have tabulated the v-values for k = 2(1)15, t = 1(1)k - 1, and P* = 0.90, 0.95, 0.975, 0.99.
In particular, for selecting the best population (t = 1), the equation (3.14) reduces to

$(vk)^{-1}[1 - (1 - v)^k] = P^*.$ (3.15)
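For this basic goal (t = 1), the minimum sample size follows directly from (3.15) and (3.13); a sketch (ours, assuming SciPy), with $\theta^*$ and $P^*$ as inputs:

```python
import math
from scipy import optimize

def min_sample_size(k, theta_star, p_star):
    """Smallest n with n * theta_star >= -log(v), where v solves (3.15)."""
    g = lambda v: (1.0 - (1.0 - v) ** k) / (v * k) - p_star
    v = optimize.brentq(g, 1e-12, 1.0 - 1e-12)   # LHS decreases from 1 to 1/k
    return math.ceil(-math.log(v) / theta_star)

print(min_sample_size(k=5, theta_star=0.1, p_star=0.95))
```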

For the generalized goal, Desu and Sobel (1968) studied the following rule $R_6$:

R6: Select the s populations associated with $Y_{[k-s+1]}, \ldots, Y_{[k]}$.

Given n, k, t, $\theta^*$, and $P^*$, they have shown that the smallest s for which the probability requirement (3.8) is satisfied is the smallest integer s such that

$\binom{s}{t} \ge P^* \binom{k}{t}\, e^{-nt\theta^*}.$ (3.16)

It should be pointed out that Desu and Sobel (1968) have obtained general results for the location parameter family. They have also considered the dual problem of selecting a subset of size s ($s \le t$) so that all the selected populations are among the t best.

Case B: Unknown $\sigma$. In this case, we consider the basic goal of selecting the best population. Since $\sigma$ is unknown, it is not possible to determine in advance the sample size needed for a single sample procedure in order to guarantee the P*-condition. This is similar to the situation that arises in selecting the population with the largest mean from several normal populations with a common unknown variance. For this latter problem, Bechhofer, Dunnett and Sobel (1954) proposed a non-elimination type two-stage procedure in which the first stage samples are utilized purely for estimating the variance without eliminating any population from further consideration. A similar procedure was proposed by Desu, Narula and Villarreal (1977) for selecting the best exponential population. Kim and Lee (1985) have studied an elimination type two-stage procedure analogous to that of Gupta and Kim (1984) for the normal means problem. In their procedure, the first stage is used not only to estimate $\sigma$ but also to possibly eliminate non-contenders. Their Monte Carlo study shows that, when $\theta_{[k]} - \theta_{[k-1]}$ is sufficiently large, the elimination type procedure performs better than the other type procedure in terms of the expected total sample size.
The procedure $R_7$ of Kim and Lee (1985) consists of two stages as follows.

Stage 1: Take $n_0$ independent observations from each $\pi_i$ ($1 \le i \le k$), and compute $Y_i^{(1)} = \min_{1 \le j \le n_0} X_{ij}$ and a pooled estimate $\hat\sigma$ of $\sigma$, namely,

$\hat\sigma = \sum_{i=1}^k \sum_{j=1}^{n_0} \big(X_{ij} - Y_i^{(1)}\big) \big/ k(n_0 - 1).$

Determine a subset I of $\{1, \ldots, k\}$ defined by

$I = \big\{i :\ Y_i^{(1)} \ge \max_{1 \le j \le k} Y_j^{(1)} - (2k(n_0 - 1)\hat\sigma h/n_0 - \theta^*)^+\big\}$

where the symbol $a^+$ denotes the positive part of a, and h (> 0) is a design constant to be determined.
(a) If I has only one element, stop sampling and assert that the population associated with $Y_{[k]}^{(1)}$ is the best.
(b) If I has more than one element, go to the second stage.

Stage 2: Take $N - n_0$ additional observations $X_{ij}$ from each $\pi_i$ for $i \in I$, where

$N = \max\{n_0,\ \langle 2k(n_0 - 1)\hat\sigma h/\theta^* \rangle\},$

and the symbol $\langle y \rangle$ denotes the smallest integer greater than or equal to y. Then compute, for the overall sample, $Y_i = \min_{1 \le j \le N} X_{ij}$ and choose the population associated with $\max_{i \in I} Y_i$ as the best.
The constant h used in the procedure $R_7$ is given by

$\int_0^\infty \{1 - (1 - \alpha(x))^k\}^2 \big/ \{k^2 \alpha^2(x)\}\ f_v(x)\, dx = P^*$ (3.17)

where $\alpha(x) = \exp(-hx)$ and $f_v(x)$ is the chi-square density with $v = 2k(n_0 - 1)$ degrees of freedom. The h-values have been tabulated by Kim and Lee (1985) for P* = 0.95, k = 2(1)5(5)20, and $n_0$ = 2(1)30.
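The design constant h can also be computed from (3.17) by numerical integration and root finding; the sketch below is ours, with the bracketing interval for the root an assumption. The bracketed factor is evaluated via expm1/log1p for numerical stability when $\alpha(x)$ is near 0 or 1.

```python
import math
from scipy import integrate, optimize, stats

def factor(a, k):
    """(1 - (1 - a)^k) / (k a), computed stably; it tends to 1 as a -> 0+
    and equals 1/k at a = 1."""
    if a <= 0.0:
        return 1.0
    if a >= 1.0:
        return 1.0 / k
    return -math.expm1(k * math.log1p(-a)) / (k * a)

def lhs(h, k, n0):
    """Left-hand side of (3.17), with alpha(x) = exp(-h x) and f_v the
    chi-square density on v = 2k(n0 - 1) degrees of freedom."""
    v = 2 * k * (n0 - 1)
    f = lambda x: factor(math.exp(-h * x), k) ** 2 * stats.chi2.pdf(x, v)
    return integrate.quad(f, 0.0, math.inf, limit=200)[0]

def solve_h(k, n0, p_star=0.95):
    """lhs increases from 1/k^2 (h = 0) toward 1, so root-find on h."""
    return optimize.brentq(lambda h: lhs(h, k, n0) - p_star, 1e-4, 10.0)

print(solve_h(k=3, n0=10))
```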

3.4. Selection from Weibull distributions


Let $\pi_i$ have a two-parameter Weibull distribution given by the cdf

$F_i(x) = F(x;\ \theta_i, c_i) = 1 - \exp\{-(x/\theta_i)^{c_i}\}, \quad x > 0;\ \theta_i, c_i > 0;\ i = 1, \ldots, k.$ (3.18)

The $c_i$ and $\theta_i$ are unknown. Kingston and Patel (1980a, b) have considered the problem of selecting from Weibull distributions in terms of their reliabilities (survival probabilities) at an arbitrary but specified time L > 0. The reliability at L for $F_i$ ($i = 1, \ldots, k$) is given by

$p_i = 1 - F_i(L) = \exp\{-(L/\theta_i)^{c_i}\}.$ (3.19)

We can without loss of generality assume that L = 1, because the observed failure times can be scaled so that L = 1 time unit. Further, letting $(\theta_i)^{c_i} = \lambda_i$, we get $p_i = \exp\{-\lambda_i^{-1}\}$. Obviously, ranking the populations in terms of the $p_i$ is equivalent to ranking in terms of the $\lambda_i$, and the best population is the one associated with $\lambda_{[k]}$, the largest $\lambda_i$. Kingston and Patel (1980a) considered the problem of selecting the best one under the IZ formulation using the natural procedure based on estimates of the $\lambda_i$ constructed from type II censored samples. They also considered the problem of selecting the best in terms of the $\alpha$-quantiles for a given $\alpha \in (0, 1)$, $\alpha \ne 1 - e^{-1}$, in the case where $\theta_1 = \cdots = \theta_k = \theta$ (unknown). The $\alpha$-quantile of $F_i$ is given by $\xi_i = \theta[-\log(1 - \alpha)]^{1/c_i}$, so that ranking in terms of the $\alpha$-quantiles is equivalent to ranking in terms of the shape parameter. It should be noted that the ranking of the $c_i$ is in the same order as that of the associated $\xi_i$ if $\alpha < 1 - e^{-1}$, and is in the reverse order if $\alpha > 1 - e^{-1}$. The procedures discussed above are based on maximum likelihood estimators as well as on the simplified linear estimators (SLE) considered by Bain (1978, p. 265). For further details on these procedures, see Kingston and Patel (1980a).
In another paper, Kingston and Patel (1980b) considered the goal of selecting a subset of restricted size. This formulation, usually referred to as the restricted subset selection (RSS) approach, is due to Gupta and Santner (1973) and Santner (1975). In the usual SS approach of Gupta (1956), it is possible that the procedure selects all the k populations. In the RSS approach, we restrict the size of the selected subset by specifying an upper bound m ($1 \le m \le k - 1$); the size of the selected subset is still a random variable taking on values 1, 2, ..., m. Thus it is a generalization of the usual approach (m = k). However, in doing so, an indifference zone is introduced. The selection goal can be more general than selecting the best. We now consider a generalized goal in the RSS approach for selection from Weibull populations, namely, to select a subset of the k given populations not exceeding m in size such that the selected subset contains at least s of the t best populations. As before, the populations are ranked in terms of their $\lambda$-values. Note that $1 \le s \le \min(t, m) \le k$. The probability requirement now is that

$\mathrm{PCS} \ge P^*$ whenever $\lambda = (\lambda_1, \ldots, \lambda_k) \in \Omega_{\lambda^*}$ (3.20)

where

$\Omega_{\lambda^*} = \{\lambda :\ \lambda^* \lambda_{[k-t]} \le \lambda_{[k-t+1]},\ \lambda^* \ge 1\}.$ (3.21)

When t = s = m and $\lambda^* > 1$, the problem reduces to selecting the t best populations using the IZ formulation. When $s = t < m = k$ and $\lambda^* = 1$, the problem reduces to selecting a subset of random size containing the t best populations (the usual SS approach). Thus the RSS approach integrates the formulations of Bechhofer (1954), Gupta (1956), and Desu and Sobel (1968). General theory under the RSS approach is given by Santner (1975).
Returning to the Weibull selection problem with the generalized RSS goal, Kingston and Patel (1980b) studied a procedure based on type II censored samples from each population. It is defined in terms of the maximum likelihood estimators (or the SLE estimators) $\hat\lambda_i$. This procedure is

R8: Include $\pi_i$ in the selected subset if and only if

$\hat\lambda_i \ge \max\{\hat\lambda_{[k-m+1]},\ c\,\hat\lambda_{[k-t+1]}\},$ (3.22)

where $c \in [0, 1]$ is suitably chosen to satisfy (3.20).

Let n denote the common sample size and consider censoring each sample at the rth failure. For given k, r, n, s, t, and m, we have three quantities associated with the procedure $R_8$, namely, $P^*$, c, and $\lambda^* > 0$. Given two of these, one can find the third; however, the solution may not be admissible. For example, for some $P^*$ and $\lambda^*$, there may not be a constant $c \in [0, 1]$ so that (3.20) is satisfied unless m = k. Kingston and Patel (1980b) have given a few tables of $\lambda^*$-values for selected values of the other constants. Their table values are based on Monte Carlo techniques and the choice of SLEs.
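Given the estimates $\hat\lambda_i$ and a constant c, applying rule $R_8$ is mechanical; a sketch with invented estimates and an arbitrary c:

```python
import numpy as np

def subset_R8(lam_hat, m, t, c):
    """RSS rule R8 of (3.22): select pi_i iff
    lam_i >= max(lam_[k-m+1], c * lam_[k-t+1])."""
    lam = np.asarray(lam_hat, dtype=float)
    k = lam.size
    srt = np.sort(lam)                       # lam_[1] <= ... <= lam_[k]
    threshold = max(srt[k - m], c * srt[k - t])
    return np.flatnonzero(lam >= threshold)  # never more than m indices

# Hypothetical estimates of lambda_i = theta_i^{c_i} for k = 5 populations.
print(subset_R8([0.8, 2.3, 1.9, 2.2, 0.5], m=3, t=2, c=0.6))
```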

4. Nonparametric and distribution-free procedures

Parametric families of distributions serve as life models in situations where


there are strong reasons to select a particular family. For example, the model may
fit data on hand well, or there may be a good knowledge of the underlying aging
or failure process that indicates the appropriateness of the model. But there are
many situations in which it becomes desirable to avoid strong assumptions about
the model. Nonparametric or distribution-free procedures are important in this
context.
Gupta and McDonald (1982) have surveyed nonparametric selection and rank-
ing procedures applicable to one-way classification, two-way classification, and
paired-comparison models. These procedures are based on rank scores and/or
robust estimators such as the Hodges-Lehmann estimator. For the usual types
of procedures based on ranks, the LFC is not always the one corresponding to
identical distributions. Since all these nonparametric procedures are relevant in
the context of selection from life length distributions, the reader is best referred
to the survey papers of Gupta and McDonald (1982), Gupta and Panchapakesan
(1985), and Chapters 8 and 15 of Gupta and Panchapakesan (1979).

There have been some investigations of subset selection rules based on ranks
while still assuming that the distributions associated with the populations are
known. This is appealing especially in situations in which the order of the observa-
tions is more readily available than the actual measurements themselves due,
perhaps, to excessive cost or other physical constraints. Under this setup, Nagel
(1970), Gupta, Huang and Nagel (1979), Huang and Panchapakesan (1982), and
Gupta and Liang (1987) have investigated locally optimal subset selection rules
which satisfy the validity criterion that the infimum of the PCS is P* when the
distributions are identical. They have used different optimality criteria in some
neighborhood of an equiparameter point in the parameter space. An account of
these rules is given in Gupta and Panchapakesan (1985).
Characterizations of life length distributions are provided in many situations by
so-called restricted families of distributions which are defined by partial order
relations with respect to known distributions. Well-known examples of such
families are those with increasing (decreasing) failure rate and increasing (decreas-
ing) failure rate average. Selection procedures for such families will be discussed
in the next section.
In the remaining part of this section, we will be mainly concerned with non-
parametric procedures for selection in terms of a quantile and selection from
several Bernoulli distributions. Though the Bernoulli selection problem could have
been discussed under a parametric model, it is discussed here to emphasize the fact
that we can use the Bernoulli selection procedures as distribution-free procedures
for selecting from unknown continuous (life) distributions in terms of reliability at
any arbitrarily chosen time point L.

4.1. Selection in terms of quantiles

Let $\pi_1, \ldots, \pi_k$ be k populations with continuous distributions $F_i(x)$, $i = 1, \ldots, k$, respectively. Given $0 < \alpha < 1$, let $x_\alpha(F)$ denote the $\alpha$th quantile of F. It is assumed that the $\alpha$-quantiles of the k populations are unique. The populations are ranked according to their $\alpha$-quantiles. The population associated with the largest $\alpha$-quantile is defined to be the best. Rizvi and Sobel (1967) proposed a procedure for selecting a subset containing the best. Let n denote the common size of the samples from the given populations and assume n to be sufficiently large so that $1 \le (n+1)\alpha \le n$. Let r be a positive integer such that $r \le (n+1)\alpha < r + 1$. It follows that $1 \le r \le n$. Let $Y_{j,i}$ denote the jth order statistic in the sample from $\pi_i$, $i = 1, \ldots, k$. The procedure of Rizvi and Sobel (1967) is

R9: Select $\pi_i$ if and only if

$Y_{r,i} \ge \max_{1 \le j \le k} Y_{r-c,\,j}$ (4.1)

where c is the smallest integer with $1 \le c \le r - 1$ for which the P*-condition is satisfied.
For the procedure $R_9$, the infimum of the PCS is attained when the distributions $F_1, \ldots, F_k$ are identical and it is shown by Rizvi and Sobel (1967) that c

is the smallest integer with $1 \le c \le r - 1$ satisfying

$\int_0^1 G_{r-c}^{k-1}(u)\, dG_r(u) \ge P^*$ (4.2)

where

$G_r(u) = \dfrac{n!}{(r-1)!\,(n-r)!} \int_0^u t^{r-1}(1 - t)^{n-r}\, dt, \qquad 0 \le u \le 1,$ (4.3)

is the cdf of the rth order statistic in a sample of n from the uniform distribution on (0, 1).

Rizvi and Sobel have shown that the maximum permissible value of $P^*$ such that a c-value satisfying (4.2) exists is $P_1 = P_1(n, \alpha, k)$, the value of the integral in (4.2) at c = r - 1, given by

$P_1 = \sum_{i=0}^{k-1} (-1)^i \binom{k-1}{i}\, \dfrac{n!\,(n - r + ni)!}{(n-r)!\,(n + ni)!}.$ (4.4)

A short table of $P_1$-values is given by Rizvi and Sobel for $\alpha = 0.5$ and k = 2(1)10. The n-values range from 1 in steps of 2 to a value (depending on k) for which $P_1$ gets very close to 1. Also given by them is a table of the largest value of r - c for $\alpha = 1/2$ (which means that r = (n + 1)/2), k = 2(1)10, n = 5(10)95(50)495, and P* = 0.75, 0.90, 0.95, 0.975, 0.99. For the IZ approach to this selection problem, see Sobel (1967).
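Since $G_j$ in (4.3) is the Beta(j, n - j + 1) cdf, the smallest c satisfying (4.2) can be found by direct search; a sketch (ours, assuming SciPy):

```python
from scipy import integrate, stats

def pcs_inf(c, k, n, r):
    """Infimum of the PCS for rule R9: the integral in (4.2), with G_j the
    Beta(j, n - j + 1) cdf (the cdf of the j-th uniform order statistic)."""
    g_low = stats.beta(r - c, n - (r - c) + 1)   # G_{r-c}
    g_r = stats.beta(r, n - r + 1)               # G_r
    f = lambda u: g_low.cdf(u) ** (k - 1) * g_r.pdf(u)
    return integrate.quad(f, 0.0, 1.0)[0]

def smallest_c(k, n, r, p_star):
    """Smallest integer c, 1 <= c <= r - 1, with pcs_inf >= P*, if any."""
    for c in range(1, r):
        if pcs_inf(c, k, n, r) >= p_star:
            return c
    return None                                  # P* exceeds the bound P_1

# Median selection (alpha = 1/2): n = 25, r = 13.
print(smallest_c(k=3, n=25, r=13, p_star=0.90))
```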

4.2. Distribution-free procedures using Bernoulli model


Let $\pi_1, \ldots, \pi_k$ be k populations with the associated continuous (life) distributions $F_1, \ldots, F_k$, respectively. The reliability of $\pi_i$ at L is $p_i = 1 - F_i(L)$. Let $X_{ij}$, $j = 1, \ldots, n$, be sample observations from $\pi_i$, $i = 1, \ldots, k$. Define

$Y_{ij} = 1$ if $X_{ij} > L$, and $Y_{ij} = 0$ otherwise, $\quad i = 1, \ldots, k;\ j = 1, \ldots, n.$ (4.1)

The $Y_{i1}, \ldots, Y_{in}$ are independent and identically distributed Bernoulli random variables with success probability $p_i$, $i = 1, \ldots, k$. We are interested in selecting the population associated with the largest $p_i$.
Gupta and Sobel (1960) proposed a subset selection rule based on $Y_i = \sum_{j=1}^n Y_{ij}$, $i = 1, \ldots, k$. Their rule is

R10: Select $\pi_i$ if and only if

$Y_i \ge \max_{1 \le j \le k} Y_j - D$ (4.2)

where D is the smallest nonnegative integer for which the P*-requirement is met.
An interesting feature of Procedure $R_{10}$ is that the infimum of the PCS occurs when $p_1 = \cdots = p_k = p$ (say) but it is not independent of their common value p.

For k = 2, Gupta and Sobel (1960) showed that the infimum takes place when p = 1/2. When k > 2, the common value $p_0$ for which the infimum takes place is not known. However, it is known that this common value $p_0 \to 1/2$ as $n \to \infty$. An improvement in the situation is provided by Gupta, Huang and Huang (1976), who investigated conditional selection rules and, using the conditioning argument, obtained a conservative value of D. Their conditional procedure is

R11: Select $\pi_i$ if and only if

$Y_i \ge \max_{1 \le j \le k} Y_j - D(t)$ (4.3)

given $T = \sum_{i=1}^k Y_i = t$, where $D(t) > 0$ is chosen to satisfy the P*-condition. An exact result for the infimum of the PCS is obtained only for k = 2; in this case, the infimum is attained when $p_1 = p_2 = p$ and is independent of the common value p. For k > 2, Gupta, Huang and Huang (1976) obtained a conservative value for D(t) and also for D of Rule $R_{10}$. They have shown that $\inf P(\mathrm{CS}\,|\,R_{11}) \ge P^*$ if D(t) is chosen such that

$D(t) = d(t)$ for k = 2, and $D(t) = \max\{d(r) : r = 0, 1, \ldots, \min(t, 2n)\}$ for k > 2, (4.4)

where d(r) is defined as the smallest value such that

$N(2;\ d(r), r, n) \ge P^* \binom{2n}{r}$ for k = 2, and $N(2;\ d(r), r, n) \ge [1 - (1 - P^*)(k - 1)^{-1}]\binom{2n}{r}$ for k > 2, (4.5)

and $N(k;\ d(t), t, n) = \sum \binom{n}{s_1} \cdots \binom{n}{s_k}$, with the summation taken over the set of all nonnegative integers $s_i$ such that $\sum_{i=1}^k s_i = t$ and $s_k \ge \max_{1 \le j \le k-1} s_j - d(t)$.
A conservative constant D for Procedure $R_{10}$ is given by $D = \max_{0 \le t \le kn} d(t)$. Gupta, Huang and Huang (1976) have tabulated the smallest value d(t) satisfying (4.5) for k = 2, 4(1)10, n = 1(1)10, t = 1(1)20, and P* = 0.75, 0.90, 0.95, 0.99. They have also tabulated the D-values (conservative) for Procedure $R_{10}$ for P* = 0.75, 0.90, 0.95, 0.99, and n = 1(1)4 when k = 3(1)15, and n = 5(1)10 when k = 3(1)5.
Under the IZ formulation, one can use the procedure of Sobel and Huyett (1957) for selecting the population associated with the largest $p_i$, which guarantees a minimum PCS $P^*$ whenever $p_{[k]} - p_{[k-1]} \ge \Delta^* > 0$. Based on samples of size n from each population, their procedure based on the $Y_i$ defined in (4.1) is

R12: Select the population associated with the largest $Y_i$, using randomization to break ties, if any. (4.6)

The sample size required is the smallest n for which the PCS $\ge P^*$ when $p_{[1]} = \cdots = p_{[k-1]} = p_{[k]} - \Delta^*$, the LFC in this case. Sobel and Huyett (1957) have tabulated the sample sizes (exact and approximate) for k = 2, 3, 4, 10; $\Delta^* = 0.05(0.05)0.50$, and P* = 0.50, 0.60, 0.75(0.05)0.95, 0.99.

When n is large, the normal approximation to the PCS yields

$n \approx c^2(1 - \Delta^{*2})/4\Delta^{*2}$ (4.7)

where $c = c(k, P^*)$ is the constant satisfying

$\int_{-\infty}^{\infty} \Phi^{k-1}(x + c)\,\varphi(x)\, dx = P^*$ (4.8)

and $\Phi$ and $\varphi$ denote, respectively, the cdf and density of the standard normal distribution. The c-value can be obtained from tables of Bechhofer (1954), Gupta (1963b), Milton (1963) and Gupta, Nagel and Panchapakesan (1973) for several selected values of k and P*.
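Both c of (4.8) and the approximate sample size (4.7) can equally be computed directly; a sketch assuming SciPy:

```python
import math
from scipy import integrate, optimize, stats

def solve_c(k, p_star):
    """c from (4.8): integral of Phi^{k-1}(x + c) phi(x) over the real line."""
    def lhs(c):
        f = lambda x: stats.norm.cdf(x + c) ** (k - 1) * stats.norm.pdf(x)
        return integrate.quad(f, -math.inf, math.inf)[0]
    return optimize.brentq(lambda c: lhs(c) - p_star, 0.0, 10.0)

def approx_n(k, delta_star, p_star):
    """Normal-approximation sample size (4.7) for the Bernoulli rule R12."""
    c = solve_c(k, p_star)
    return math.ceil(c * c * (1.0 - delta_star ** 2) / (4.0 * delta_star ** 2))

print(approx_n(k=4, delta_star=0.10, p_star=0.90))
```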
The Bernoulli selection problem has applications to the drug selection problem and to clinical trials. This fact has spurred much research activity involving investigations of selection procedures using sampling schemes such as the play-the-winner (PW) sampling rule (introduced by Robbins, 1952 and 1956) and the vector-at-a-time (VT) rule with a variety of stopping rules. One of the main considerations in many of these procedures is to design the sampling rule so as to minimize the expected total number of observations and/or the expected number of observations from the worst population. Some of these procedures suffer from one drawback or another. For excellent review, survey and comprehensive assessments of these (and other) procedures, reference should be made to Bechhofer and Kulkarni (1982), Büringer, Martin and Schriever (1980), Gupta and Panchapakesan (1979, Sections 4.2 through 4.6), and Hoel, Sobel and Weiss (1975). For corresponding developments in subset selection theory, see Gupta and Panchapakesan (1979, Section 13.2).

5. Selection from restricted families of distributions

A restricted family of probability distributions is defined by a partial order


relation with respect to a known distribution. As we have pointed out earlier, such
families provide characterizations of life length distributions. Selection rules for
such restricted families were first considered by Barlow and Gupta (1969). We
define below the binary partial order relations (≺) that have been used in studying selection procedures. These are partial orderings in the sense that they enjoy only the reflexivity and transitivity properties, that is, (1) F ≺ F for all distributions F, and (2) F ≺ G, G ≺ H implies F ≺ H. Note that F ≺ G and G ≺ F do not necessarily imply F ≡ G.

DEFINITION 5.1. (1) F is said to be convex with respect to G (F ≺_c G) if and only if $G^{-1}F(x)$ is convex on the support of F.
(2) F is said to be star-shaped with respect to G (F ≺_* G) if and only if F(0) = G(0) = 0, and $G^{-1}F(x)/x$ is increasing in $x \ge 0$ on the support of F.
(3) F is said to be r-ordered with respect to G (F ≺_r G) if and only if F(0) = G(0) = 1/2 and $G^{-1}F(x)/x$ is increasing (decreasing) in x positive (negative).
(4) F is said to be tail-ordered with respect to G (F ≺_t G) if and only if F(0) = G(0) = 1/2 and $G^{-1}F(x) - x$ is increasing on the support of F.

It is well known that convex ordering implies star ordering. Further, when $G(x) = 1 - e^{-x}$ ($x \ge 0$), F ≺_c G is equivalent to saying that F has an increasing failure rate (IFR) and F ≺_* G is equivalent to saying that F has an increasing failure rate on the average (IFRA). Of course, if F is IFR, then it is also IFRA. IFR distributions were first studied in detail by Barlow, Marshall and Proschan (1963) and IFRA distributions by Birnbaum, Esary and Marshall (1966). The r-ordering was investigated by Lawrence (1975). Doksum (1969) used the tail-ordering. The convex ordering and s-ordering (not defined here) have been studied by van Zwet (1964). Without the assumption of the common median zero, Definition 5.1(4) has been used by Bickel and Lehmann (1979) to define an ordering by spread, with the germinal concept attributed to Brown and Tukey (1946). Saunders and Moran (1978) have also perceived this kind of ordering (called ordering by dispersion by them) in the context of a neurobiological problem.
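The star-shape ordering of Definition 5.1(2) is easy to check numerically in the case that motivates this section: for F Weibull with shape parameter c and G the standard exponential, $G^{-1}F(x) = x^c$, so $G^{-1}F(x)/x = x^{c-1}$ is increasing precisely when $c \ge 1$ (the IFRA case). A small check (ours, for illustration):

```python
import numpy as np

def g_inv_f_over_x(x, shape):
    """G^{-1}F(x)/x for F Weibull(shape) and G(x) = 1 - exp(-x)."""
    F = -np.expm1(-(x ** shape))      # F(x) = 1 - exp(-x^shape)
    return -np.log1p(-F) / x          # G^{-1}(u) = -log(1 - u)

x = np.linspace(0.1, 5.0, 50)
for shape in (0.7, 1.0, 2.0):
    ratio = g_inv_f_over_x(x, shape)
    increasing = bool(np.all(np.diff(ratio) >= -1e-9))  # tolerate rounding
    print(f"shape {shape}: star-shaped w.r.t. exponential? {increasing}")
```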
Gupta and Panchapakesan (1974) have defined a general partial ordering through a class of real-valued functions, which provides a unified way to handle selection problems for star-ordered and tail-ordered families. Their ordering is defined as follows.

DEFINITION 5.2. Let $\mathscr{H} = \{h(x)\}$ be a class of real-valued functions h(x). Let F and G be distributions such that F(0) = G(0). F is said to be $\mathscr{H}$-ordered with respect to G (F ≺_{$\mathscr{H}$} G) if $G^{-1}F(h(x)) \ge h(G^{-1}F(x))$ for all $h \in \mathscr{H}$ and all x on the support of F.

It is easy to see that we get star-ordering and tail-ordering as special cases of $\mathscr{H}$-ordering by taking $\mathscr{H} = \{ax,\ a \ge 1\}$, F(0) = G(0) = 0, and $\mathscr{H} = \{x + b,\ b \ge 0\}$, F(0) = G(0) = 1/2, respectively. Hooper and Santner (1979) have used a modified definition of $\mathscr{H}$-ordering. For some useful probability inequalities involving $\mathscr{H}$-ordering, see Gupta, Huang and Panchapakesan (1984).

5.1. Selection in terms of quantiles from star-ordered distributions


Let $\pi_1, \ldots, \pi_k$ have the associated absolutely continuous distributions $F_1, \ldots, F_k$, respectively. All the $F_i$ are star-shaped with respect to a known continuous distribution G. The population having the largest $\alpha$-quantile ($0 < \alpha < 1$) is defined as the best population. It is assumed that the best population is stochastically larger than any of the other populations. Under this setup, Barlow and Gupta (1969) proposed a procedure for selecting a subset containing the best. Let $T_{j,i}$ denote the jth order statistic in a sample of n independent observations from $\pi_i$, $i = 1, \ldots, k$, where n is assumed to be large enough so that $j \le (n+1)\alpha < j + 1$ for some j. The Barlow-Gupta procedure is

R13: Select $\pi_i$ if and only if

$T_{j,i} \ge c \max_{1 \le r \le k} T_{j,r}$ (5.1)

where $c = c(k, P^*, n, j)$ is the largest number in (0, 1) for which the P*-condition is satisfied. The constant c is given by

$\int_0^\infty G_j^{k-1}(x/c)\, g_j(x)\, dx = P^*$ (5.2)

where $G_j$ denotes the cdf of the jth order statistic in a sample of n observations from G, and $g_j$ is the corresponding density function. The values of c satisfying (5.2) are tabulated by Barlow, Gupta and Panchapakesan (1969) in the special case of exponential G, i.e. for selecting from IFRA populations, for P* = 0.75, 0.90, 0.95, 0.99, and the following values of k, n, and j: (i) j = 1, k = 2(1)11 (in this case, c is independent of n), (ii) k = 2(1)6, j = 2(1)n, and n = 5(1)10 or 12 or 15 depending on k. Table 2a is excerpted from the tables of Barlow, Gupta and Panchapakesan (1969). It gives the values of c for P* = 0.90, 0.95, k = 2(1)5, n = 5(1)12, and j such that $j \le (n+1)/2 < j+1$ (i.e. appropriate for selection in terms of the median).
Table 2a
Values of the constant c of Rule R13 satisfying equation (5.2) for selecting
the IFRA distribution with the largest median; $G(x) = 1 - e^{-x}$, $x \ge 0$,
$j \le (n+1)/2 < j+1$, P* = 0.90 (top entry), 0.95 (bottom entry)

n\k 2 3 4 5

5 0.32197 0.25464 0.22607 0.20924


0.22871 0.18353 0.16388 0.15215
6 0.32397 0.25665 0.22808 0.21123
0.23045 0.18521 0.16551 0.15377
7 0.38021 0.31045 0.27994 0.26164
0.28527 0.23611 0.21406 0.20068
8 0.38198 0.31228 0.28179 0.26351
0.28692 0.23774 0.21568 0.20229
9 0.42434 0.35398 0.32257 0.30353
0.32973 0.27855 0.25515 0.24079
10 0.42587 0.35559 0.32422 0.30519
0.33121 0.28005 0.25665 0.24228
I1 0.45939 0.38927 0.35750 0.33808
0.36592 0.31377 0.28958 0.27461
12 0.46071 0.39069 0.35896 0.33956
0.36724 0.31512 0.29094 0.27597

Table 2b
Values of the constant d of Rule R₁₄ satisfying equation (5.4) for selecting
the IFRA distribution with the smallest median; G(x) = 1 − e^{−x}, x ≥ 0,
j ≤ (n + 1)/2 < j + 1, P* = 0.90 (top entry), 0.95 (bottom entry)

 n \ k      2        3        4        5

  5      0.32197  0.23711  0.19983  0.17752
         0.22871  0.17100  0.14516  0.12953
  6      0.32397  0.23881  0.20134  0.17891
         0.23045  0.17244  0.14643  0.13060
  7      0.38021  0.29477  0.25597  0.23226
         0.28527  0.22441  0.19623  0.17883
  8      0.38198  0.29636  0.25744  0.23365
         0.28692  0.22585  0.19755  0.18007
  9      0.42434  0.33988  0.30072  0.27650
         0.32972  0.26775  0.23845  0.22014
 10      0.42587  0.34131  0.30208  0.27779
         0.33121  0.26909  0.23971  0.22134
 11      0.45939  0.37647  0.33748  0.31315
         0.36592  0.30378  0.27399  0.25521
 12      0.46071  0.37775  0.33871  0.31433
         0.36724  0.30501  0.27516  0.25634

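The constants in Tables 2a and 2b need not be read from tables alone: equation (5.2) is a one-dimensional root-finding problem once G_j and g_j are written down. The following Python sketch is our own illustration (none of the function names come from the original work); it uses the fact that, for G(x) = 1 − e^{−x}, the cdf of the jth order statistic is a regularized incomplete beta function evaluated at G(x).

    import numpy as np
    from scipy import integrate, optimize, stats

    def order_stat_cdf(x, j, n):
        # G_j(x) = P(jth order statistic of n iid G-observations <= x)
        #        = I_{G(x)}(j, n - j + 1), with G(x) = 1 - exp(-x) here
        return stats.beta.cdf(1.0 - np.exp(-x), j, n - j + 1)

    def order_stat_pdf(x, j, n):
        # g_j(x) = Beta(j, n - j + 1) density at G(x), times g(x) = exp(-x)
        return stats.beta.pdf(1.0 - np.exp(-x), j, n - j + 1) * np.exp(-x)

    def lhs(c, k, n, j):
        # left-hand side of (5.2); decreases from 1 (c -> 0) to 1/k (c = 1)
        f = lambda x: order_stat_cdf(x / c, j, n) ** (k - 1) * order_stat_pdf(x, j, n)
        return integrate.quad(f, 0.0, np.inf)[0]

    def solve_c(k, n, j, p_star):
        return optimize.brentq(lambda c: lhs(c, k, n, j) - p_star, 1e-6, 1.0 - 1e-9)

    # median selection for n = 5 uses j = 3; Table 2a lists 0.32197 here
    print(solve_c(k=2, n=5, j=3, p_star=0.90))

The same routine, with the integrand replaced by that of (5.4) below, yields the constant d of Table 2b.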
For the selection of the population with the smallest α-quantile (assumed to be
stochastically smaller than any other F_i) the analogous procedure is

R₁₄: Select π_i if and only if

d T_{j,i} ≤ min_{1≤r≤k} T_{j,r},   (5.3)

where d = d(k, P*, n, j) is the largest number in (0, 1) satisfying the P*-condition
and is given by

∫₀^∞ [1 − G_j(xd)]^{k−1} g_j(x) dx = P*,   (5.4)

where G_j and g_j are defined as in (5.2). Barlow, Gupta and Panchapakesan (1969)
have tabulated the values of d in the case of exponential G for P* = 0.75, 0.90,
0.95, 0.99 and the following values of k, n, and j: (i) j = 1, k = 2(1)11 (d is
independent of n), (ii) k = 2(1)6, j = 2(1)n, n = 5(1)12 for k = 6, and n = 5(1)15

for other k values. Table 2b is excerpted from the tables of Barlow, Gupta and
Panchapakesan (1969). It gives the values of d for P* = 0.90, 0.95, k = 2(1)5,
n = 5(1)12, and j such that j ≤ (n + 1)/2 < j + 1 (i.e. appropriate for selection in
terms of the median).
Suppose that G is the Weibull distribution with cdf G(x) = 1 − exp{−(x/θ)^λ},
x ≥ 0, and θ, λ > 0. It is assumed that λ is known. Then it is easy to see that
the new constant c₁ is given by c₁ = c^{1/λ}, where c is the constant in the exponen-
tial case (λ = 1). Another interesting special case of G is the half-normal distribu-
tion obtained by folding N(0, σ²) at the origin, where σ is assumed to be known.
The class of distributions which are star-shaped with respect to this folded normal
is a subclass of the IFRA distributions. Selection in terms of quantiles in this case
has been considered by Gupta and Panchapakesan (1975), who have tabulated
the constant c associated with R₁₃ for k = 2(1)10, n = 5(1)10, j = 1(1)n, and
P* = 0.75, 0.90, 0.95, 0.99.

5.2. Selection in terms of medians from tail-ordered distributions


Barlow and Gupta (1969) also considered the selection of the population with
the largest median (assumed to be stochastically larger than the other populations)
from a set of distributions F_i, i = 1, ..., k, which have lighter tails than a specified
distribution G with G(0) = 1/2. This means that, for each i, F_i centered at its
median Δ_i is r-ordered with respect to G, and (d/dx)F_i(x + Δ_i)|_{x=0}
≥ (d/dx)G(x)|_{x=0}. This definition of F_i having a lighter tail than G used by them
implies that F_i centered at Δ_i is tail-ordered with respect to G. The procedure of
Barlow and Gupta (1969) has been shown by Gupta and Panchapakesan (1974)
to work for this wider class defined using tail-ordering. Actually, Gupta and
Panchapakesan have also shown a generalized version of this by considering
tail-ordering of F_i and G when both are centered at their respective α-quantiles.
For selection in terms of medians, the procedure of Barlow and Gupta is

R₁₅: Select π_i if and only if

T_{j,i} ≥ max_{1≤r≤k} T_{j,r} − D,   j ≤ (n + 1)/2 < j + 1,   (5.5)

where the T_{j,r} are defined as in the case of the procedure R₁₃, and the appropriate
constant D = D(k, P*, n) > 0 is given by

∫_{−∞}^{∞} G_j^{k−1}(t + D) g_j(t) dt = P*.   (5.6)

Here, G_j and g_j are the cdf and the density of the jth order statistic in a sample
of n independent observations from G. The values of D are given by Gupta and
Panchapakesan (1974) in the special case where G is the logistic distribution,
G(x) = [1 + e^{−x}]^{−1}, for k = 2(1)10, n = 5(2)15, and P* = 0.75, 0.90, 0.95, 0.99.
Using the ℋ-ordering (Definition 5.2) with the functions h satisfying certain
properties, Gupta and Panchapakesan (1974) have discussed a class of proce-
dures for selecting the best (i.e. the one which is stochastically larger than any
other, assumed to exist) of k distributions F_i, i = 1, ..., k, which are ℋ-ordered with
respect to G. The procedures R₁₃ and R₁₅ are special cases of their procedure.
Hooper and Santner (1979) considered selection of good populations in terms
of α-quantiles for star- and tail-ordered distributions using the RSS approach. Let
π_i have the distribution F_i and let F_{[i]} denote the distribution having the ith
smallest α-quantile. Denoting the α-quantile of any distribution F by x_α(F), π_i is
called a good population if x_α(F_i) > c* x_α(F_{[k−t+1]}), 0 < c* < 1, in the case of
star-ordered families, and if x_α(F_i) > x_α(F_{[k−t+1]}) − d*, d* > 0, in the case of
tail-ordered families. The goal of Hooper and Santner (1979) is to select a subset
of size not exceeding m (1 ≤ m ≤ k − 1) that contains at least one good popula-
tion. They have also considered the problem of selecting a subset of fixed size s
so as to include at least r good populations (r ≤ t, r ≤ s < k − t + r) using the IZ
approach.
Selection of one or more good populations as a goal is a relaxation of the goal
of selecting the best population(s). A good population is defined suitably to reflect
the fact that it is 'nearly' as good as the best. In some form or other it has been
considered by several authors; mention should be made of Fabian (1962),
Lehmann (1963), Desu (1970), Carroll, Gupta and Huang (1975), and Pancha-
pakesan and Santner (1977). A discussion of this can be found in Gupta and
Panchapakesan (1985, Section 4.2).

5.3. Selection from convex ordered distributions


Let π₁, ..., π_k have absolutely continuous distributions F₁, ..., F_k, respectively,
of which one is assumed to be stochastically larger than the rest. This distribution,
denoted by F_{[k]}, is defined to be the best. It is assumed that F_{[k]} <_c G, where G
is a known continuous distribution. All distributions in this context are assumed
to have the positive real line as their support. Let X_{j,n}^{(i)} (Y_{j,n}) denote the jth order
statistic in a random sample of size n from F_i (G). Considering samples of size n
from F₁, ..., F_k, each censored at the rth failure, define

T_i = Σ_{j=1}^{r} a_j X_{j,n}^{(i)},   i = 1, ..., k,   (5.7)

where

a_j = g(G⁻¹((j − 1)/n)) − g(G⁻¹(j/n)),   j = 1, ..., r − 1,
a_r = g(G⁻¹((r − 1)/n)),   (5.8)

and g is the density associated with G.


If G(y) = 1 − e^{−y}, y ≥ 0, then a₁ = ··· = a_{r−1} = 1/n and a_r = (n − r + 1)/n.
Consequently, n T_i = Σ_{j=1}^{r−1} X_{j,n}^{(i)} + (n − r + 1) X_{r,n}^{(i)}, the well-known total life statistic
until the rth failure from F_i.
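As a small illustration (ours, with helper names that are not from the text), the coefficients (5.8) and the statistic (5.7) can be computed for any smooth G; the exponential case reproduces the total life coefficients just noted.

    import numpy as np

    def coefficients(n, r, g, G_inv):
        # equation (5.8): a_j = g(G^{-1}((j-1)/n)) - g(G^{-1}(j/n)) for
        # j = 1, ..., r-1, and a_r = g(G^{-1}((r-1)/n))
        h = g(G_inv(np.arange(r) / n))          # g(G^{-1}(j/n)), j = 0, ..., r-1
        a = np.empty(r)
        a[:r - 1] = h[:r - 1] - h[1:]
        a[r - 1] = h[r - 1]
        return a

    def T_statistic(first_r_order_stats, n, g, G_inv):
        # T_i of equation (5.7), from a sample of size n censored at the rth failure
        x = np.sort(first_r_order_stats)
        return coefficients(n, len(x), g, G_inv) @ x

    # exponential G: expect a_1 = ... = a_{r-1} = 1/n and a_r = (n - r + 1)/n
    g = lambda y: np.exp(-y)
    G_inv = lambda u: -np.log(1.0 - u)
    print(coefficients(n=10, r=4, g=g, G_inv=G_inv))    # [0.1 0.1 0.1 0.7]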

Now, for selecting a subset containing F_{[k]}, Gupta and Lu (1979) proposed the
rule

R₁₆: Select π_i if and only if

T_i ≥ c max_{1≤j≤k} T_j,   (5.9)

where c is the largest number in (0, 1) satisfying the P*-condition. They have
shown that, if a_j ≥ 0 for j = 1, ..., r, a_r ≥ c, and g(0) ≤ 1, then

inf_Ω P(CS | R₁₆) = ∫₀^∞ G_T^{k−1}(y/c) dG_T(y),   (5.10)

where G_T is the distribution of T = Σ_{j=1}^{r} a_j Y_{j,n}, and Ω is the space of all k-tuples
(F₁, ..., F_k) such that there is one among them which is stochastically larger than
the others and is convex with respect to G. Thus, the constant c = min(a_r, c*),
where c* is the solution for c obtained by equating the right-hand side of (5.10)
to P*. For the special case of G(y) = 1 − e^{−y}, y ≥ 0, we get c = min(c*,
(n − r + 1)/n). This special case is a slight generalization of the results of Patel
(1976).

6. Comparison with a standard or control

Although the experimenter is generally interested in selecting the best of k (≥ 2)
competing categories, in some situations even the best one among them may not
be good enough to warrant its selection. Such a situation arises when the
goodness of a population is defined in comparison with a standard (known) or
a control population. For convenience, we may refer to either one as the control.
Let π₁, ..., π_k be the k (experimental) populations with associated distribution
functions F(x, θ_i), i = 1, ..., k, respectively. The θ_i are unknown. Let θ₀ be the
specified standard or the unknown parameter associated with the control popula-
tion π₀ whose distribution function is F(x, θ₀). Several different goals have been
considered in the literature. For example, one may want to select the best experi-
mental population (i.e. the one associated with θ_{[k]}, the largest θ_i) provided that
it is better than the control (i.e. θ_{[k]} > θ₀), and not to select any of them other-
wise. An alternative goal is to select a subset (of random size) of the k popula-
tions which includes all those populations that are better than the control. Some
of the early papers dealing with these problems are Paulson (1952), Dunnett
(1955), and Gupta and Sobel (1958).
One can define a good population in different ways using comparison with a
control. For example, π_i may be called good if θ_i > θ₀ + Δ, or if |θ_i − θ₀| ≤ Δ, for
some Δ > 0. Several procedures have been investigated with the goal of selecting
good populations or those better than the control; these will not be described
here. A good account of these can be had from Gupta and Panchapakesan (1979,
Chapter 20). A review of subset selection procedures in this context, including
recent developments, is contained in Gupta and Panchapakesan (1985).
An important aspect of the recent developments is the so-called isotonic proce-
dures, which become relevant in situations where it is known that
θ₁ ≤ θ₂ ≤ ··· ≤ θ_k although the values of the θ_i are unknown. This is typical, for
example, of experiments involving different dose levels of a drug, so that the
treatment effects have a known ordering. Suppose that a population π_i is
defined to be good if θ_i ≥ θ₀ and bad otherwise. For the goal of selecting all the
good populations, any reasonable procedure R should have the property: if R
selects π_i, then it selects all populations π_j for j > i. This is the isotonic behavior
of R. Naturally, one would consider procedures based on isotonic estimators of the
θ_i. Such procedures have recently been studied by Gupta and Yang (1984) in the
case of normal means (common variance σ², known or unknown), by Gupta and
Huang (1984) in the case of binomial populations with success probabilities θ_i,
and by Gupta and Leu (1986) in the case of two-parameter exponential popula-
tions with guarantee times (location parameters) θ_i and common (known or
unknown) scale parameter. All these papers deal with both cases of known and
unknown θ₀.

7. Concluding remarks

In the preceding sections, we have described several selection procedures that
have special significance in reliability studies. However, we have confined our
attention to the classical type procedures since they are of common interest to a
wide variety of users. We have also generally restricted ourselves to single-stage
procedures. There is ample literature on two-stage and sequential procedures.
Further, we have not discussed decision-theoretic formulations and Bayes and
empirical Bayes procedures. There have been substantial developments in these
directions, especially using the subset selection approach, in the last ten years. For a
comprehensive survey of developments until the late 1970's, we refer to Gupta
and Panchapakesan (1979). A critical review of developments in subset selec-
tion theory, including very recent developments, is given by Gupta and Pancha-
pakesan (1985).

References
Bain, L. (1978). Statistical Analysis of Reliability and Life-Testing Models, Theory and Methods. Marcel
Dekker, New York.
Barlow, R. E. and Gupta, S. S. (1969). Selection procedures for restricted families of distributions.
Ann. Math. Statist. 40, 905-917.
Barlow, R. E., Gupta, S. S. and Panchapakesan, S. (1969). On the distribution of the maximum and
minimum of ratios of order statistics. Ann. Math. Statist. 40, 918-934.
Barlow, R. E., Marshall, A. W. and Proschan, F. (1963). Properties of probability distributions with
monotone hazard rate. Ann. Math. Statist. 34, 375-389.

Bechhofer, R. E. (1954). A single-sample multiple decision procedure for ranking means of normal
populations with known variances. Ann. Math. Statist. 25, 16-39.
Bechhofer, R. E., Dunnett, C. W. and Sobel, M. (1954). A two-sample multiple-decision procedure
for ranking means of normal populations with a common unknown variance. Biometrika 41,
170-176.
Bechhofer, R. E., Kiefer, J. and Sobel, M. (1968). Sequential Identification and Ranking Procedures
(with special reference to Koopman-Darmois populations). The University of Chicago Press, Chicago.
Bechhofer, R. E. and Kulkarni, R. V. (1982). Closed adaptive sequential procedures for selecting the
best of k ≥ 2 Bernoulli populations. In: S. S. Gupta and J. O. Berger, eds., Statistical Decision
Theory and Related Topics--Ill, Vol. 1, Academic Press, New York, 61-108.
Berger, R. L. (1979). Minimax subset selection for loss measured by subset size. Ann. Statist. 7,
1333-1338.
Berger, R. L. and Gupta, S. S. (1980). Minimax subset selection rules with applications to unequal
variance (unequal sample size) problems. Scand. J. Statist. 7, 21-26.
Bickel, P. J. and Lehmann, E. L. (1979). Descriptive statistics for nonparametric models IV. Spread.
In: Jana Jureckova, ed., Contributions to Statistics: Jaroslav Hajek Memorial Volume, Reidel, Boston,
3-40.
Birnbaum, Z. W., Esary, J. D. and Marshall, A. W. (1966). A stochastic characterization of wear-out
for components and systems. Ann. Math. Statist. 37, 816-825.
Brown, G. and Tukey, J. W. (1946). Some distributions of sample means. Ann. Math. Statist. 17, 1-12.
Büringer, H., Martin, H. and Schriever, K.-H. (1980). Nonparametric Sequential Selection Procedures.
Birkhäuser, Boston, MA.
Carroll, R. J., Gupta, S. S. and Huang, D.-Y. (1975). On selection procedures for the t best
populations and some related problems. Comm. Statist. 4, 987-1008.
Desu, M. M. (1970). A selection problem. Ann. Math. Statist. 41, 1596-1603.
Desu, M. M., Narula, S. C. and Villarreal, B. (1977). A two-stage procedure for selecting the best
of k exponential distributions. Comm. Statist. A--Theory Methods 6, 1223-1230.
Desu, M. M. and Sobel, M. (1968). A fixed-subset size approach to a selection problem. Biometrika
55, 401-410. Corrections and amendments: 63 (1976), 685.
Doksum, K. (1969). Starshaped transformations and the power of rank tests. Ann. Math. Statist. 40,
1167-1176.
Dudewicz, E. J. and Koo, J. O. (1982). The Complete Categorized Guide to Statistical Selection and
Ranking Procedures. Series in Mathematical and Management Sciences, Vol. 6, American Sciences
Press, Columbus, OH.
Dunnett, C. W. (1955). A multiple comparison procedure for comparing several treatments with a
control. J. Amer. Statist. Assoc. 50, 1096-1121.
Fabian, V. (1962). On multiple decision methods for ranking population means. Ann. Math. Statist.
33, 248-254.
Gibbons, J. D., Olkin, I. and Sobel, M. (1977). Selecting and Ordering Populations: A New Statistical
Methodology. Wiley, New York.
Gupta, S. S. (1956). On a decision rule for a problem in ranking means. Mimeograph Series No.
150, Institute of Statistics, University of North Carolina, Chapel Hill, NC.
Gupta, S. S. (1963a). On a selection and ranking procedure for gamma populations. Ann. Inst. Statist.
Math. 14, 199-216.
Gupta, S. S. (1963b). Probability integrals of the multivariate normal and multivariate t. Ann. Math.
Statist. 34, 792-828.
Gupta, S. S. (1965). On some multiple decision (selection and ranking) rules. Technometrics 7,
225-245.
Gupta, S. S. and Huang, D.-Y. (1980). A note on optimal subset selection procedures. Ann. Statist.
8, 1164-1167.
Gupta, S. S. and Huang, D.-Y. (1981). Multiple Decision Theory: Recent Developments. Lecture Notes
in Statistics, Vol. 6, Springer, New York.
Gupta, S. S., Huang, D.-Y. and Huang, W.-T. (1976). On ranking and selection procedures and tests
of homogeneity for binomial populations. In: S. Ikeda, T. Hayakawa, H. Hudimoto, M. Okamoto,

M. Siotani and S. Yamamoto, eds., Essays in Probability and Statistics, Shinko Tsusho Co. Ltd.,
Tokyo, Japan, Chapter 33, 501-533.
Gupta, S. S., Huang, D.-Y. and Nagel, K. (1979). Locally optimal subset selection procedures based
on ranks. In: J. S. Rustagi, ed., Optimizing Methods in Statistics, Academic Press, New York,
251-260.
Gupta, S. S., Huang, D.-Y. and Panchapakesan, S. (1984). On some inequalities and monotonicity
results in selection and ranking theory. In: Y. L. Tong, ed., Inequalities in Statistics and Probability,
IMS Lecture Notes--Monograph Series, Vol. 5, 211-217.
Gupta, S. S. and Huang, W.-T. (1984). On isotonic selection rules for binomial populations better than
a standard. In: A. M. Abuammoh, E. A. Ali, E. A. El-Neweihi and M. Q. El-Osh, eds.,
Developments in Statistics and Its Applications, King Saud Univ. Library, Riyadh, 89-112.
Gupta, S. S. and Kim, W.-C. (1984). A two-stage elimination type procedure for selecting the largest
of several normal means with a common unknown variance. In: T. J. Santner and A. C. Tamhane,
eds., Design of Experiments: Ranking and Selection, Marcel Dekker, New York, 77-93.
Gupta, S. S. and Leu, L.-Y. (1986). Isotonic procedures for selecting populations better than a
standard: two-parameter exponential distributions. In: A. P. Basu, ed., Reliability and Quality
Control, Elsevier Science Publishers B.V., Amsterdam, 167-183.
Gupta, S. S. and Liang, T.-C. (1987). Locally optimal subset selection rules based on ranks under
joint type II censoring. Statistics and Decisions 5, 1-13.
Gupta, S. S. and Lu, M.-W. (1979). Subset selection procedures for restricted families of probability
distributions. Ann. Inst. Statist. Math. 31, 235-252.
Gupta, S. S. and McDonald, G. C. (1982). Nonparametric procedures in multiple decisions (ranking
and selection procedures). In: B. V. Gnedenko, M. L. Puri and I. Vincze, eds., Colloquia
Mathematica Societatis Janos Bolyai, 32: Nonparametric Statistical Inference, Vol. I, North-Holland,
Amsterdam, 361-389.
Gupta, S. S., Nagel, K. and Panchapakesan, S. (1973). On the order statistics from equally cor-
related normal random variables. Biometrika 60, 403-413.
Gupta, S. S. and Panchapakesan, S. (1972). On a class of subset selection procedures. Ann. Math.
Statist. 43, 814-822.
Gupta, S. S. and Panchapakesan, S. (1974). Inference for restricted families: (a) multiple decision
procedures; (b) order statistics inequalities. In: F. Proschan and R. J. Serfling, eds., Reliability and
Biometry: Statistical Analysis of Lifelength, SIAM, Philadelphia, 503-596.
Gupta, S. S. and Panchapakesan, S. (1975). On a quantile selection procedure and associated
distribution of ratios of order statistics from a restricted family of probability distributions. In: R.
E. Barlow, J. B. Fussell and N. D. Singpurwalla, eds., Reliability and Fault Tree Analysis: Theoretical
and Applied Aspects of System Reliability and Safety Assessment, SIAM, Philadelphia, 557-576.
Gupta, S. S. and Panchapakesan, S. (1979). Multiple Decision Procedures: Theory and Methodology of
Selecting and Ranking Populations. Wiley, New York.
Gupta, S. S. and Panchapakesan, S. (1985). Subset selection procedures: review and assessment.
Amer. J. Management Math. Sci. 5, 235-311.
Gupta, S. S. and Santner, T. J. (1973). On selection and ranking procedures--a restricted subset
selection rule. Proceedings of the 39th Session of the International Statistical Institute, Vol. 45, Book I,
478-486.
Gupta, S. S. and Sobel, M. (1958). On selecting a subset which contains all populations better than
a standard. Ann. Math. Statist. 29, 235-244.
Gupta, S. S. and Sobel, M. (1960). Selecting a subset containing the best of several binomial
populations. In: I. Olkin, S. G. Ghurye, W. Hoeffding, W. G. Madow and H. B. Mann, eds.,
Contributions to Probability and Statistics, Stanford University Press, Stanford, Chapter 20, 224-248.
Gupta, S. S. and Sobel, M. (1962a). On selecting a subset containing the population with the smallest
variance. Biometrika 49, 495-507.
Gupta, S. S. and Sobel, M. (1962b). On the smallest of several correlated F-statistics. Biometrika 49,
509-523.
Gupta, S. S. and Yang, H.-M. (1984). Isotonic procedures for selecting populations better than a
control under ordering prior. In: J. K. Ghosh and J. Roy, eds., Statistics: Applications and New

Directions: Proceedings of the Indian Statistical Institute Golden Jubilee International Conference,
Indian Statistical Institute, Calcutta, 279-312.
Hoel, D. G., Sobel, M. and Weiss, G. H. (1975). A survey of adaptive sampling for clinical trials.
In: R. M. Elashoff, ed., Perspectives in Biometry, Academic Press, New York, 29-61.
Hooper, J. H. and Santner, T. J. (1979). Design of experiments for selection from ordered families
of distributions. Ann. Statist. 7, 615-643.
Huang, D.-Y. and Panchapakesan, S. (1982). Some locally optimal subset selection rules based on
ranks. In: S. S. Gupta and J. O. Berger, eds., Statistical Decision Theory and Related Topics--III,
Vol. 2, Academic Press, New York, 1-14.
Kim, W.-C. and Lee, S.-H. (1985). An elimination type two-stage selection procedure for exponential
distributions. Comm. Statist.--Theor. Meth. 14, 2563-2571.
Kingston, J. V. and Patel, J. K. (1980a). Selecting the best one of several Weibull populations. Comm.
Statist. A--Theory Methods 9, 383-398.
Kingston, J. V. and Patel, J. K. (1980b). A restricted subset selection procedure for Weibull
distributions. Comm. Statist. A--Theory Methods 9, 1371-1383.
Lawrence, M. J. (1975). Inequalities for s-ordered distributions. Ann. Statist. 3, 413-428.
Lehmann, E. L. (1963). A class of selection procedures based on ranks. Math. Annalen 150, 268-275.
Milton, R. C. (1963). Tables of equally correlated multivariate normal probability integral. Technical
Report No. 27, Department of Statistics, University of Minnesota, Minneapolis, MN.
Nagel, K. (1970). On subset selection rules with certain optimality properties. Ph.D. Thesis (also
Mimeograph Series No. 222), Department of Statistics, Purdue University, West Lafayette, IN.
Panchapakesan, S. and Santner, T. J. (1977). Subset selection procedures for Δp-superior popula-
tions. Comm. Statist. A--Theory Methods 6, 1081-1090.
Patel, J. K. (1976). Ranking and selection of IFR populations based on means. J. Amer. Statist. Assoc.
71, 143-146.
Paulson, E. (1952). On the comparison of several experimental categories with a control. Ann. Math.
Statist. 23, 239-246.
Raghavachari, M. and Starr, N. (1970). Selection problems for some terminal distributions. Metron
28, 185-197.
Rizvi, M. H. and Sobel, M. (1967). Nonparametric procedures for selecting a subset containing the
population with the largest α-quantile. Ann. Math. Statist. 38, 1788-1803.
Robbins, H. (1952). Some aspects of the sequential design of experiments. Bull. Amer. Math. Soc.
58, 527-535.
Robbins, H. (1956). A sequential design problem with a finite memory. Proc. Nat. Acad. Sci. U.S.A.
42, 920-923.
Santner, T. J. (1975). A restricted subset selection approach to ranking and selection problems. Ann.
Statist. 3, 334-349.
Saunders, I. W. and Moran, P. A. P. (1978). On the quantiles of the gamma and F distributions.
J. AppL Prob. 15, 426-432.
Sobel, M. (1967). Nonparametric procedures for selecting the t populations with the largest
α-quantiles. Ann. Math. Statist. 38, 1804-1816.
Sobel, M. and Huyett, M. J. (1957). Selecting the best one of several binomial populations. Bell
System Tech. J. 36, 537-576.
Zwet, W. R. van (1964). Convex Transformations of Random Variables. Mathematical Center,
Amsterdam.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 157-174

The Impact of Reliability Theory on Some
Branches of Mathematics and Statistics

Philip J. Boland and Frank Proschan*

0. Introduction

It is obvious that reliability theory has used a great variety of mathematical and
statistical tools to help achieve needed results. These include: total positivity,
majorization and Schur functions, renewal theory, Bayesian statistics, isotonic
regression, Markov and semi-Markov processes, stochastic comparisons and
bounds, convexity theory, rearrangement inequalities, optimization theory--the
list is almost endless.
The question now arises: Has reliability theory reciprocated--that is, has
reliability theory made any contributions to the development of any of the
mathematical and statistical disciplines listed above? The answer is a definite Yes.
In this article we shall show that in the course of solving reliability problems,
theoreticians have developed new results in some of the disciplines above, of
direct value to the discipline and having application in other branches of statistics
and mathematics•

1. Total positivity and Pólya frequency functions

A function K(x, y) of two real variables ranging over linearly ordered sets X
and Y, respectively, is said to be totally positive of order r (TP_r) if for all 1 ≤ m ≤ r,
x₁ < x₂ < ··· < x_m, y₁ < y₂ < ··· < y_m (x_i ∈ X, y_j ∈ Y), we have the determinantal
inequalities

K(x₁, ..., x_m; y₁, ..., y_m) = det[K(x_i, y_j)]_{i,j=1,...,m} ≥ 0.

* Research supported by the Air Force Office of Scientific Research Grant AFOSR 82-K-0007.

Typically, X is an interval of the real line, or a countable set of discrete values
on the real line such as the set of all integers or the set of nonnegative integers;
similarly for Y. When X or Y is a set of integers, we may use the term 'sequence'
rather than 'function'.
If a TP_r function K(x, y) is a probability density in one of the variables, say
x, with respect to a σ-finite measure μ(x) for each fixed value of y, and is expressible
as a function K(x, y) = f(x − y) of the difference of x and y, then f is said to be
a Pólya frequency function (or density) of order r (PF_r). The argument of f traverses
the real line. If the argument is confined to the integers, we shall speak of a Pólya
frequency sequence of order r (PF_r sequence). Note that if f is a density function
on ℝ, then K(x, y) = f(x − y) is TP₂ if and only if the family of density functions
{f(x − y): y ∈ Y} has the monotone likelihood ratio property.
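The TP₂ property can be verified numerically on a finite grid by testing every 2 × 2 determinant. A small Python sketch (ours, purely illustrative): the standard normal density is PF₂ (indeed PF_∞), so the kernel K(x, y) = f(x − y) below should pass every test up to roundoff.

    import itertools
    import numpy as np
    from scipy.stats import norm

    def is_tp2(kernel, xs, ys, tol=-1e-12):
        # K(x1,y1)K(x2,y2) - K(x1,y2)K(x2,y1) >= 0 for all x1 < x2, y1 < y2
        for (x1, x2), (y1, y2) in itertools.product(
                itertools.combinations(sorted(xs), 2),
                itertools.combinations(sorted(ys), 2)):
            det = kernel(x1, y1) * kernel(x2, y2) - kernel(x1, y2) * kernel(x2, y1)
            if det < tol:                       # tolerate floating-point noise
                return False
        return True

    K = lambda x, y: norm.pdf(x - y)            # f(x - y), f the N(0, 1) density
    print(is_tp2(K, np.linspace(-3, 3, 9), np.linspace(-3, 3, 9)))   # True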
Many totally positive kernels (functions) may be generated by the judicious use
of the following convolution result:

THEOREM 1.1. If K is TP_r, L is TP_s, and μ is a σ-finite measure, then the
convolution

M(x, y) = ∫ K(x, z) L(z, y) dμ(z)

is TP_{min(r, s)}.

PROOF. The result follows from the 'Basic Composition Formula' (see Karlin
(1968) for a proof):

M(x₁, ..., x_m; y₁, ..., y_m)
  = ∫···∫_{z₁<z₂<···<z_m} K(x₁, ..., x_m; z₁, ..., z_m) L(z₁, ..., z_m; y₁, ..., y_m) dμ(z₁)···dμ(z_m),

whenever M(x, y) = ∫ K(x, z) L(z, y) dμ(z) converges absolutely with respect to
the σ-finite measure μ.

An important feature of totally positive functions is their variation diminishing
property. Suppose that K(x, y) is TP_r and that h(y) changes sign j times, where
j ≤ r − 1. Let g(x) = ∫ K(x, y) h(y) dμ(y), an absolutely convergent integral with
μ a σ-finite measure. Then g(x) changes sign at most j times. Moreover, if g(x)
actually changes sign j times, then g(x) must have the same arrangement of signs
as h(y) does, as x and y traverse their respective domains from left to right. The
variation diminishing property is actually equivalent to the (inequalities) definition
we have given of TP_r (see Karlin and Proschan, 1960; Karlin, 1968, Chapter 5).

The theory of totally positive kernels and Pólya frequency functions has been
extensively applied in several domains of mathematics, statistics, economics, and
mechanics. To give but a few examples in the theory of reliability: the
techniques of total positivity have been useful in developing properties of life
distributions with monotone failure rates and in the study of some notions of
component dependence; Pólya frequency functions of order 2 have been helpful
in determining optimal inspection policies; and the variation diminishing property
has been used in establishing characteristics of certain shock models. Reliability
theory has in turn, however, been the motivating force behind some important
developments in the theory of total positivity itself. A good example is the follow-
ing result (see Karlin and Proschan, 1960):

THEOREM 1.2. Let f₁, f₂, ... be any sequence of densities of nonnegative random
variables, where each f_i is PF_r. Then the n-fold convolution g(n, x) = f₁ * f₂ * ··· *
f_n(x) is TP_r in the variables n and x, where n ranges over 1, 2, ... and x traverses
the positive real line.

A similar total positivity result for the first passage time probabilities of the
partial sum process can be proved in the more general case when the random
variables range over the whole real line.

THEOREM 1.3. Let f₁, f₂, ... be any sequence of PF_r densities of random variables
X₁, X₂, ..., respectively, which are not necessarily nonnegative. Consider the first
passage probability for x positive:

h(n, x) = P[X₁ + ··· + X_n ≥ x; X₁ + ··· + X_j < x for j = 1, ..., n − 1]

for n = 1, 2, .... Then h(n, x) is TP_r, where n ranges over 1, 2, ..., and x traverses
the positive real line.

Theorems 1.2 and 1.3 were initially inspired by certain models in reliability and
inventory theory, and these results in turn motivated Karlin (1964) to characterize
new classes of totally positive kernels and develop new applications in (for ex-
ample) discrete Markov chains. Typical of the results of Karlin (1964) are the
following two propositions (see also Karlin, 1968, p. 43):

PROPOSITION 1.4. Let {X_n} be a temporally homogeneous TP_r Markov chain (the
transition probability matrix P is TP_r) whose state space is the set of nonnegative
integers. Then the n-step transition function P^n_{ij} is TP_r in the variables 0 ≤ j < ∞ and
n ≥ 0.

PROPOSITION 1.5. Let {X_n} be a TP_r Markov chain. Let F^n_{i,j₀} denote the probability
that the first passage into the set of states ≤ j₀ occurs at the nth transition when the
initial state of the process is i > j₀. Then F^n_{i,j₀} is TP_r in the variables n ≥ 1 and i > j₀.

We now briefly trace the development leading to Theorems 1.2 and 1.3,
beginning with a basic problem in reliability theory that Black and Proschan
(1959) consider. (See also Proschan (1960) and Barlow and Proschan (1965) for
related problems.)
Suppose that a system is required to operate for the period [0, t₀]. When a
component fails, it is immediately replaced by a spare component of the same
type if one is available. The system fails if no spare is available. Only the originally
supplied spares may be used for replacement during the period [0, t₀]. Assume
that the system uses k different types of components. At time 0, for each
i = 1, ..., k there are d_i 'positions' in the system which are filled with components
of type i. By 'position (i, j)' we mean the jth location in the system where a
component of type i is used. Components of the same type in different positions
may be subject to varying stresses, and so we assume that the life of a component
in position (i, j) has density function f_{ij}. Each replacement has the same life
distribution as its predecessor, and component lives are assumed to be mutually
independent. Let P_i(n_i) be the reliability during [0, t₀] of the ith part of the system
(that is, the subsystem consisting of the d_i components of type i), assuming that
n_i spares of type i are available for replacement. The problem is to determine the
'spares kit' n = (n₁, ..., n_k) which will maximize the reliability of the system
P(n) = ∏_{i=1}^k P_i(n_i) during [0, t₀] subject to a cost constraint of the form
Σ_{i=1}^k c_i n_i ≤ C (where c_i > 0 for all i = 1, ..., k).
A vector n⁰ = (n₁⁰, n₂⁰, ..., n_k⁰) is an undominated spares allocation if whenever
P(n) > P(n⁰), then Σ_{i=1}^k c_i n_i > Σ_{i=1}^k c_i n_i⁰. Black and Proschan (1959) consider
methods for quickly generating families of undominated spares allocations, which
can then be used to solve (approximately) the above problem. One of their
procedures is to start with the cheapest cost allocation (0, 0, ..., 0), and succes-
sively generate more expensive allocations as follows: If the present allocation is
n, determine the index i₀ for which

[log P_i(n_i + 1) − log P_i(n_i)]/c_i   (i = 1, ..., k)

is a maximum (in the case of ties the lowest such index is taken). The next
allocation is then n′ = (n₁, ..., n_{i₀−1}, n_{i₀} + 1, n_{i₀+1}, ..., n_k). Black and Proschan
observe that the procedures they describe generate undominated allocations if each
P_i(n) is log concave in n. They are able to verify this directly in the case where
the component lives in the ith part of the system are exponentially distributed with
parameter λ_i.
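A sketch of the marginal allocation step in this exponential special case may be helpful (our illustration; the names are ours, and stopping once the next spare no longer fits the budget is our own convention, whereas Black and Proschan simply generate the sequence of allocations). Replacements at a type-i position form a Poisson process, so the total demand for type-i spares over [0, t₀] is Poisson with mean d_i λ_i t₀ and P_i(n) is a Poisson cdf, which is log concave in n.

    import numpy as np
    from scipy.stats import poisson

    def marginal_allocation(means, costs, budget):
        # means[i] = d_i * lambda_i * t0: Poisson mean of type-i spare demand
        k = len(means)
        n = np.zeros(k, dtype=int)
        while True:
            # gain in log-reliability per unit cost from one more type-i spare
            gains = np.array([(poisson.logcdf(n[i] + 1, means[i])
                               - poisson.logcdf(n[i], means[i])) / costs[i]
                              for i in range(k)])
            i0 = int(np.argmax(gains))          # ties go to the lowest index
            if costs[i0] > budget:
                return n                        # each allocation visited is undominated
            n[i0] += 1
            budget -= costs[i0]

    print(marginal_allocation(means=np.array([2.0, 0.5, 1.0]),
                              costs=np.array([1.0, 3.0, 2.0]),
                              budget=12.0))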
Note that log P_i(n) is concave in n if and only if P_i(n + 1)/P_i(n) is a decreasing
function of n, or equivalently that P_i(n) is a PF₂ sequence. Let N_{ij} for j = 1, ..., d_i
be the random variable indicating the number of replacements of type i needed
at position (i, j) in the interval [0, t₀]. Proschan (1960) is able to show that if f_{ij}(t)
satisfies the monotone likelihood ratio property for translations (equivalently, that
f_{ij}(t) is a PF₂ function), then f_{ij}^{(n)}(t) is a TP₂ function in the variables n and t
(where f_{ij}^{(n)} is the n-fold convolution of f_{ij} with itself). Judiciously using
Theorem 1.1 on convolutions of totally positive functions, one is then able to
show that Prob(N_{ij} = n) is a PF₂ sequence and finally that P_i(n) = Prob
(N_{i1} + ··· + N_{id_i} ≤ n) is a PF₂ sequence. The key tool is of course to show that
when f(t) is a PF₂ function, then f^{(n)}(t) is TP₂ in n and t. Theorems 1.2 and 1.3
are natural generalizations of this result.
One may further generalize the 'spares kit' procedure above to show that when
each life distribution function F_{ij} (for position (i, j)) is IFR (equivalently, that
F̄ = 1 − F is PF₂), then the procedure generates undominated allocations (Barlow
and Proschan, 1965).

2. Association of random variables

The notion of associated random variables is one of the most valuable contribu-
tions to statistics that has been generated as a result of reliability theory con-
siderations.
We consider two random variables to be in some sense associated if they are
positively correlated, that is, cov(S, T) ≥ 0. A stronger requirement is cov(f(S),
g(T)) ≥ 0 for all nondecreasing f and g. Finally, if cov(f(S, T), g(S, T)) ≥ 0 for
all f and g nondecreasing in each argument, we have a still stronger version of
association. Esary, Proschan and Walkup (1967) generalize this strongest version
of association to the multivariate case in defining random variables T₁, ..., T_n to
be associated if cov(f(T), g(T)) ≥ 0 for all nondecreasing functions f and g for
which the covariance in question exists. Equivalent definitions of associated
random variables result if the functions f and g are taken to be increasing and
either (i) binary or (ii) bounded and continuous.
Association of random variables satisfies the following desirable multivariate
properties:
(P1) Any subset of a set of associated random variables is a set of associated
random variables.
(P2) If two sets of associated random variables are independent of one another,
then the union of the two sets is a set of associated random variables.
(P3) Any set consisting of a single random variable is a set of associated random
variables.
(P4) Increasing functions of associated random variables are associated.
(P5) A limit in distribution of a sequence of sets of associated random variables
is a set of associated random variables.
Note that properties P3 and P2 imply that any set of independent random varia-
bles is associated. This fact, together with property P4, enables one to generate
many practical examples of associated random variables. In the special case when
dealing with binary random variables, one can readily show that the binary
random variables X₁, ..., X_n are associated if and only if 1 − X₁,
1 − X₂, ..., 1 − X_n are associated.
Many interesting applications may be obtained as a consequence of the follow-
ing result about associated random variables:

THEOREM 2.1. Let T₁, ..., T_n be associated, and let S_i = f_i(T) be a nondecreasing
function for each i = 1, ..., k. Then

P[S₁ ≤ s₁, ..., S_k ≤ s_k] ≥ ∏_{i=1}^k P[S_i ≤ s_i]
and
P[S₁ > s₁, ..., S_k > s_k] ≥ ∏_{i=1}^k P[S_i > s_i]

for all s = (s₁, ..., s_k) ∈ ℝ^k.

The following two corollaries are immediate consequences of this theorem.

COROLLARY 2.2 (Robbins, 1954). Let T₁, ..., T_n be independent random varia-
bles, and let S_i = Σ_{j=1}^i T_j be the ith partial sum for i = 1, ..., n. Then

P[S₁ ≤ s₁, ..., S_n ≤ s_n] ≥ ∏_{i=1}^n P(S_i ≤ s_i)

for all s = (s₁, ..., s_n) ∈ ℝⁿ.

COROLLARY 2.3. Let T_{[1]}, ..., T_{[n]} be the order statistics in a random sample
T₁, ..., T_n. Then

P[T_{[i₁]} ≤ t_{i₁}, ..., T_{[i_k]} ≤ t_{i_k}] ≥ ∏_{j=1}^k P[T_{[i_j]} ≤ t_{i_j}]
and
P[T_{[i₁]} > t_{i₁}, ..., T_{[i_k]} > t_{i_k}] ≥ ∏_{j=1}^k P[T_{[i_j]} > t_{i_j}]

for every choice of 1 ≤ i₁ < ··· < i_k ≤ n and t_{i₁} < ··· < t_{i_k}.

Marshall and Olkin (1967) consider the multivariate exponential distribution
with joint survival function

F̄(s₁, ..., s_m) = exp[−Σ_{i=1}^m λ_i s_i − Σ_{i<j} λ_{ij} max(s_i, s_j)
                       − Σ_{i<j<k} λ_{ijk} max(s_i, s_j, s_k) − ···
                       − λ_{12···m} max(s₁, s₂, ..., s_m)].

They point out that if this is the distribution of the random variables
S₁, ..., S_m, then there exist independent exponential random variables
T₁, ..., T_n such that S_j = min{T_i : i ∈ A_j}, where A_j ⊂ {1, 2, ..., n}. The random
variables S₁, ..., S_m are associated and therefore, using Theorem 2.1, we can

show that

F(s₁, ..., s_m) ≥ ∏_{i=1}^m F_i(s_i)
and
F̄(s₁, ..., s_m) ≥ ∏_{i=1}^m [1 − F_i(s_i)],

where F_i is the marginal distribution of S_i. The multivariate exponential distribu-
tion is also useful in studying shock models.
Another application of Theorem 2.1 can be made in the analysis of
variance when two hypotheses are tested using the same error variance for each
test. Consider the case in which the effects of both rows and columns are to be
tested. The standard procedure is to calculate three quadratic forms q₁, q₂, q₃
which are independently distributed as χ² with n₁, n₂, and n₃ degrees of freedom
respectively, where q₁ represents the sum of squares between rows, q₂ the sum of
squares between columns, and q₃ the error sum of squares. The likelihood ratio
test statistics for testing the two hypotheses are

F₁ = (q₁/n₁)/(q₃/n₃) and F₂ = (q₂/n₂)/(q₃/n₃).

The probability of making no errors of the first kind is P[F₁ ≤ F_{1α}, F₂ ≤ F_{2α}], where
F_{1α} (F_{2α}) is the 100α per cent point of the distribution of F₁ (F₂). Kimball (1951)
proves

P[F₁ ≤ F_{1α}, F₂ ≤ F_{2α}] > P[F₁ ≤ F_{1α}] P[F₂ ≤ F_{2α}],

or in other words that the chance of no errors of the first kind is greater following
the standard experimental procedure than if separate experiments had been per-
formed. This result is an immediate consequence of Theorem 2.1 once it is
observed that F₁ and F₂ are nondecreasing functions of the associated random
variables q₁, q₂, q₃⁻¹.
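Because F₁ and F₂ share the same denominator, Kimball's inequality is easy to confirm by simulation. A Monte Carlo sketch (ours; the cutoffs below are rough 5% points chosen only for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    n1, n2, n3, N = 4, 6, 20, 200_000
    q1, q2, q3 = (rng.chisquare(df, N) for df in (n1, n2, n3))
    F1 = (q1 / n1) / (q3 / n3)
    F2 = (q2 / n2) / (q3 / n3)
    a, b = 2.87, 2.60                           # approximate 5% points of F1, F2
    joint = np.mean((F1 <= a) & (F2 <= b))
    product = np.mean(F1 <= a) * np.mean(F2 <= b)
    print(joint, product, joint > product)      # the joint probability is larger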
The concept of associated random variables has proved to be a useful tool in
various areas of operations research. Shogan (1977) uses properties of associated
random variables to construct bounds for the stochastic activity duration of a
PERT network. Heidelberger and Iglehart (1979) use association to construct
a set of sufficient conditions which guarantee that the dependent simulations of
a stochastic system produce a variance reduction over independent simulations.
Niu (1981) makes use of association in studying queues with dependent inter-
arrival and service times.
The notion of association of random variables is just one of many notions of
multivariate dependence. Lehmann (1966) introduces several concepts of bivariate
dependence, the strongest of which is TP₂ dependence ((S, T) are TP₂ dependent
if the joint probability density (or, in the discrete case, the joint frequency function)
f(s, t) is totally positive of order 2). For a discussion concerning the relationship
among several notions of multivariate dependence see Barlow and Proschan
(1981). Newman and Wright (1981) obtain limit theory results for sequences of
associated random variables.
In applications it is often easier to verify that one of the alternative notions
which imply association holds, instead of verifying association directly. For
example, if T = (T₁, ..., T_n) has density f(t₁, ..., t_n) which is TP₂ in every pair
of variables when the remaining variables are kept fixed and which is everywhere
positive on a rectangular support, then T₁, ..., T_n are associated (see Kemper-
man, 1977). Pitt (1982) proves the following important characterization of associa-
tion for the multivariate normal case (for a simpler proof see also Joag-dev, Perlman
and Pitt (1983)):

THEOREM 2.4. Let T = (T₁, ..., T_n) be multivariate normal. Then T₁, ..., T_n are
associated if and only if cov(T_i, T_j) ≥ 0 for all i, j = 1, ..., n.

A related result of particular importance in statistical mechanics is the FKG
inequality. Let T = (T₁, ..., T_n) be a random vector with density f(t₁, ..., t_n).
For s = (s₁, ..., s_n) and t = (t₁, ..., t_n), let

s ∨ t = (max(s₁, t₁), max(s₂, t₂), ..., max(s_n, t_n))
and
s ∧ t = (min(s₁, t₁), min(s₂, t₂), ..., min(s_n, t_n)).

f is said to satisfy the FKG condition (or to be multivariate totally positive of order 2
(MTP₂)) if

f(s ∨ t) f(s ∧ t) ≥ f(s) f(t) for all s, t ∈ ℝⁿ.

The FKG inequality, obtained by Fortuin, Kasteleyn and Ginibre (1971), says
that if the density f of T satisfies the FKG condition, then T₁, ..., T_n are
associated. For an excellent discussion of the application of the FKG inequality
in statistics see Kemperman (1977).
The notion of association of random variables which Esary, Proschan and
Walkup (1967) develop has its origins in a problem of Esary and Proschan (1963)
concerning coherent structures. Moore and Shannon (1956) investigate the relia-
bility of relay circuits and show that arbitrarily reliable circuits can be constructed
from arbitrarily unreliable relays. They prove that if h(p) is the probability of
closure of a relay network plotted as a function of the common probability p of
the closure of a simple relay, then

p(1 − p) h′(p) > h(p)(1 − h(p)) for 0 < p < 1.

Therefore h(p) is s-shaped (crosses the diagonal at most once and always from
below), a property which is crucial in constructing relay circuits of arbitrarily high
reliability. Birnbaum, Esary and Saunders (1961) generalize this result of Moore
and Shannon to coherent structures of independent components with identical
reliability. Esary and Proschan (1963) in turn generalize to coherent structures
with independent components not necessarily of the same reliability. The main
tool in their paper is the following specialized version of an inequality of
Tchebichev (see Hardy, Littlewood and Pólya, 1952), which may be regarded as
a 'forerunner' to the definition of association of random variables:

THEOREM 2.5. Let X₁, ..., X_n be independent binary random variables. Let f_i(X),
i = 1, 2, be increasing functions. Then cov[f₁(X), f₂(X)] ≥ 0.

Esary and Proschan also use Theorem 2.5 to construct upper and lower bounds
for the reliability of a coherent structure in terms of the minimal paths and
minimal cut sets of the structure.

3. Renewal theory

Renewal theory has its origins in the study of self-renewing aggregates and
especially in actuarial science. Today we view the subject more generally as the
study of functions of independent identically distributed nonnegative random
variables which represent the successive intervals between renewals of a process.
The theory is applied to a wide variety of fields such as risk analysis, counting
processes, fatigue analysis, inventory theory, queuing theory, traffic flow, and
reliability theory. We will summarize a few of the more important and basic ideas
in renewal theory (for a more complete treatment consult Smith (1958), Cox
(1962), Feller (1966), Ross (1970), or Karlin and Taylor (1975)) and then indicate
some of the contributions to this area arising from reliability theory.
By a renewal process we will mean a sequence of independent identically
distributed nonnegative random variables X₁, X₂, ..., which are not all zero with
probability one. We let F be the distribution function of X₁, and F^{(k)} will denote
the k-fold convolution of F with itself. The kth partial sum S_k = X₁ + ··· + X_k is
the kth renewal point and has distribution function F^{(k)}. For convenience we
define F^{(0)} by F^{(0)}(t) = 1 for t ≥ 0 and zero otherwise. Renewal theory is primarily
concerned with the number N(t) of renewals in the interval [0, t]. N(t), the
renewal random variable, is the maximum value of k for which S_k ≤ t, with the
understanding that N(t) = 0 if X₁ > t. It is clear that P(N(t) = n) =
F^{(n)}(t) − F^{(n+1)}(t) and P(N(t) ≥ n) = F^{(n)}(t). The process {N(t): t ≥ 0} is known
as a renewal counting process.
The renewal function M(t) is defined to be the expected number of renewals
in [0, t], that is, M(t) = E(N(t)). Since M(t) = Σ_{k=1}^∞ k P[N(t) = k] =
Σ_{k=1}^∞ P[N(t) ≥ k], it follows that M(t) = Σ_{k=1}^∞ F^{(k)}(t) and moreover that M(t) =
∫₀^t [1 + M(t − x)] dF(x) (this latter identity being known as the fundamental
renewal equation). In spite of the fact that a closed functional form for M(t) is
known for only a few special distributions F, the renewal function M(t) plays a
central role in renewal theory.

If F is the distribution function of X₁, F is nonlattice if there exists no h > 0
such that the range of X₁ ⊂ {h, 2h, 3h, ...}. The following basic results were
proved in the early stages of renewal theory development.

THEOREM 3.1. If F has mean μ₁, then N(t)/t → 1/μ₁ almost surely as t → ∞.

THEOREM 3.2. Let F have mean μ₁. Then
(i) M(t) ≥ t/μ₁ − 1 for all t ≥ 0;
(ii) (Blackwell) if F is nonlattice,

lim_{t→∞} [M(t + h) − M(t)] = h/μ₁ for any h > 0;

(iii) if F is nonlattice with second moment μ₂ < +∞,

M(t) = t/μ₁ + μ₂/(2μ₁²) − 1 + o(1) as t → ∞.

Note that, important as these results may be, they are, with the exception of
Theorem 3.2(i), asymptotic in nature.
Barlow and Proschan (1964) obtain several new renewal theory inequalities. An
age replacement policy is one whereby a unit is replaced upon failure or at age T,
a specified constant, whichever comes first. Under a block replacement policy a
unit is replaced upon failure and at times T, 2T, 3T, .... It is assumed that
failures occur independently and that the replacement time is negligible. There are
advantages for both types of policy, and hence it is of interest to compare the two
types stochastically with respect to numbers of failures, planned replacements and
removals (a removal is a failure or a planned replacement). In many situations it
will be assumed that the life distribution of a unit belongs to a monotone class
such as the IFR (DFR) class (F is IFR if it has increasing (decreasing) failure
rate). It is clear that the evaluation of replacement policies depends heavily on the
theory of renewal processes.
Suppose we let N(t) indicate the number of renewals in [0, t] when replacement
occurs at failure only, N*_A(t) the number of failures in [0, t] under an age policy,
and N*_B(t) the number of failures in [0, t] under a block policy. Barlow and
Proschan (1964) prove the following result stochastically comparing these random
variables:

THEOREM 3.3. If F is IFR (DFR), then

P(N(t) ≥ n) ≥ (≤) P(N*_A(t) ≥ n) ≥ (≤) P(N*_B(t) ≥ n)

for t ≥ 0 and n = 0, 1, 2, ....

The following bounds on the renewal function M(t) = E(N(t)) are an immediate
consequence:

COROLLARY 3.4. If F is IFR (DFR), then
(i) M(t) ≥ (≤) E(N*_A(t)) ≥ (≤) E(N*_B(t));
(ii) M(t) ≥ (≤) k M(t/k), k = 1, 2, ...;
(iii) M(t) ≤ (≥) t/μ₁;
(iv) M(h) ≤ (≥) M(t + h) − M(t) for all h, t ≥ 0.

By considering the number of failures and the number of removals per unit of
time as the duration of the replacement operation becomes indefinitely large,
Barlow and Proschan (1964) obtain the following simple, useful bounds on the
renewal function for any F, and an improvement on these bounds for the IFR
(DFR) case (these bounds were conjectured by Bazovsky (1962)):

THEOREM 3.5. (i) M(t) ≥ t/∫₀^t F̄(x) dx − 1 ≥ t/μ₁ − 1 for all t ≥ 0.
(ii) If F is IFR (DFR), then M(t) ≤ (≥) t F(t)/∫₀^t F̄(x) dx ≤ (≥) t/μ₁ for all t ≥ 0.

As a consequence of this result, it follows that when F is IFR the expected
numbers of failures per unit of time under block and age replacement policies do
not differ by more than 1/T in the limit as t → ∞.
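The bounds of Theorem 3.5 are easy to illustrate numerically. The sketch below (ours; the gamma shape, horizon and sample sizes are arbitrary choices) estimates M(t) by simulating renewal counts for a gamma life distribution with shape 2, which is IFR with μ₁ = 2, and compares the estimate with the bounds.

    import numpy as np
    from scipy import integrate, stats

    rng = np.random.default_rng(1)
    dist = stats.gamma(2.0)                     # IFR life distribution, mu_1 = 2
    t = 5.0

    # estimate M(t): count renewals along cumulative sums of iid lifetimes
    x = dist.rvs(size=(20_000, 64), random_state=rng)
    m_hat = np.mean(np.sum(np.cumsum(x, axis=1) <= t, axis=1))

    sbar = integrate.quad(dist.sf, 0, t)[0]     # integral of the survival function
    lower = t / sbar - 1.0                      # Theorem 3.5 (i)
    upper = t * dist.cdf(t) / sbar              # Theorem 3.5 (ii), itself <= t/mu_1
    print(lower, m_hat, upper, t / dist.mean())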
Feller (1948) shows that lim_{t→∞} Var(N(t))/M(t) = σ²/μ₁², which is at most 1
in the IFR case since the coefficient of variation of an IFR distribution is at
most 1. Barlow and Proschan (1964) partially generalize this result in proving the
following:

THEOREM 3.6. If F is IFR (DFR), then Var(N(t)) ≤ (≥) M(t), and this inequality
is sharp.

The renewal theory implications of the work of Barlow and Proschan (1964)
provide the key tool in the probabilistic interpretation of Miner's rule given by
Birnbaum and Saunders (1968) and Saunders (1970). Miner's rule (Miner, 1945)
is a deterministic formula extensively used in engineering practice for the cumula-
tive damage due to fatigue. Prior to the work of Birnbaum and Saunders, Miner's
rule was supported by empirical evidence but had very little theoretical justifi-
cation. Birnbaum and Saunders investigate models for stochastic crack growth
with incremental extensions having an increasing failure rate distribution. The
result that for an IFR distribution function F the inequality t/μ₁ − 1 ≤ M(t) ≤ t/μ₁
holds is used to prove that τ/μ₁ − 1 ≤ γ ≤ τ/μ₁, where μ₁ is the expected crack
increment per cycle, τ is the expected crack length at which failure occurs, and γ
is the expected number of loading cycles to failure. This in turn is used to show
that under certain conditions of dependence on load, Miner's rule does yield the
mathematical expectation of fatigue life. Saunders (1970) extends some of these
results by weakening the model assumptions, in particular by showing that the
IFR assumption for the crack growth can be relaxed to assuming that F be new
better than used in expectation (NBUE), that is, μ₁ ≥ ∫₀^∞ F̄(t + x)/F̄(t) dx for all
t ≥ 0 such that F̄(t) > 0.
Marshall and Proschan (1972) determine the largest classes of life distributions
for which age and block replacement policies diminish, either stochastically or in
expected value, the number of failures in service. In doing so, they give the first
systematic treatment of the NBU, NWU, NBUE and NWUE classes of life
distributions, which are now widely used in statistics. A life distribution function
F is new better than used (NBU) if F̄(x + y) ≤ F̄(x)F̄(y) for all x, y ≥ 0. The new
worse than used (NWU) class of distributions is similarly defined by reversing the
inequality. In their investigation they obtain important renewal
quantity inequalities, many of which generalize results from Barlow and Proschan
(1964). For example, they show that if F is NBU (NWU) then Var N(t) ≤ (≥) M(t)
and M(h) ≤ (≥) M(t + h) − M(t) for all h, t ≥ 0, while if F is NBUE (NWUE)
then M(t) ≤ (≥) t/μ₁. The following interesting characterization of the NBU class
in terms of the renewal random variable is obtained. Let * denote convolution.

THEOREM 3.7. N(s) * N(t) ≤ (≥) N(s + t) for all s, t ≥ 0 ⇔ F is NBU (NWU).
Straub (1970) is interested in bounding the probability that the total amount of
insurance claims arising in a fixed period of time does not exceed the amount t
of premiums collected. Letting F(t) be the distribution function for the individual
claim amounts, Straub desires bounds for F̄^{(n)}(t) = P(N(t) < n). Here we may
interpret N(t) as the maximum value of k such that the first k claims sum to a
total ≤ t. Motivated by the use of tools in reliability theory, and in particular by
the work of Barlow and Marshall on bounds for classes of monotone distribu-
tions, Straub establishes the following important result (see Barlow and Proschan,
1981):
THEOREM 3.8. Let F be a continuous distribution function with hazard function
R(t) = −log F̄(t).
(a) If F is NBU (NWU), then

P(N(t) < n) ≥ (≤) Σ_{j=0}^{n−1} e^{−R(t)} (R(t))^j/j!   for t ≥ 0, n = 1, 2, ....

(b) If F is IFR (DFR), then

P(N(t) < n) ≤ (≥) Σ_{j=0}^{n−1} e^{−nR(t/n)} [nR(t/n)]^j/j!   for t ≥ 0, n = 1, 2, ....
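Both bounds in Theorem 3.8 are Poisson tail sums and so are trivial to evaluate. A sketch (ours) for a Weibull claim-amount distribution with shape 1.5 and scale 2, which is IFR and hence also NBU, so that both parts apply and bracket P(N(t) < n):

    from scipy.stats import poisson

    R = lambda t: (t / 2.0) ** 1.5              # hazard function R(t) = -log Fbar(t)

    def nbu_lower(t, n):                        # part (a): lower bound on P(N(t) < n)
        return poisson.cdf(n - 1, R(t))

    def ifr_upper(t, n):                        # part (b): upper bound on P(N(t) < n)
        return poisson.cdf(n - 1, n * R(t / n))

    print(nbu_lower(10.0, 8), ifr_upper(10.0, 8))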

The bounds for the renewal function established by Barlow and Proschan
(1964) motivate Marshall (1973) to investigate the existence of 'best' linear bounds
for M(t) ('best' is interpreted to mean the sharpest bounds which, when iterated
in the fundamental renewal equation, converge monotonically to M(t) for all t).
Esary, Marshall and Proschan (1973) establish properties of the survival
function of a device subject to shocks and wear. One of their principal tools is
the result that [F^{(k)}(x)]^{1/k} is decreasing in k = 1, 2, ..., for any distribution
function F such that F(x) = 0 for x < 0. This result, which is equivalent to the
following property of the renewal random variable N(t), can be used to demon-
strate monotonicity properties of first passage time distributions for certain
Markov processes.

THEOREM 3.9. Let N(t) denote the number of renewals in [0, t] for a renewal
process. Then [P(N(t) ≥ k)]^{1/k} is decreasing in k = 1, 2, ....

Another class of monotone distributions used for modeling in reliability theory
is the increasing mean residual life (IMRL) class. Let X₁ have life distribution F.
Then F is IMRL if E(X₁ − t | X₁ > t) is nondecreasing in t ≥ 0. A DFR distribution
function F with finite mean μ₁ is IMRL. Mixtures of DFR distributions are DFR,
and DFR distributions are used to model the lifetimes of units which improve
with age, such as blast furnaces and work-hardening materials. Keilson (1975)
shows that a large class of first passage time distributions for Markov processes are
DFR. Brown (1980, 1981) proves some very nice renewal quantity results
for the DFR and IMRL classes, among which is the following:

THEOREM 3.10. (a) If F is DFR, then the renewal function M(t) is concave. (b) If
F is IMRL, then M(t) − (t/μ₁ − 1) is increasing in t ≥ 0. (Note, however, that M(t)
is not necessarily concave.)

4. Majorization and Schur functions

The theory of inequalities has played a fundamental role in developing new
results in reliability theory. In attempting to compare and establish bounds for
probability distributions and systems, workers in reliability have been discovering
new inequalities. Many of these inequalities are of a general nature and can be
presented using the techniques of majorization and Schur functions.
Given a vector x = (x₁, ..., x_n), let x_{[1]} ≤ x_{[2]} ≤ ··· ≤ x_{[n]} denote an in-
creasing rearrangement of x₁, ..., x_n. The vector x is said to majorize the vector
y (we write x ≻_m y) if

Σ_{i=j}^n x_{[i]} ≥ Σ_{i=j}^n y_{[i]} for j = 2, ..., n, and Σ_{i=1}^n x_{[i]} = Σ_{i=1}^n y_{[i]}.

Hardy, Littlewood, and Pólya (1952) show that $x >^m y$ if and only if there exists a doubly stochastic matrix $\Pi$ such that $y = x\Pi$. Schur functions are real valued functions which are monotone with respect to the partial ordering of majorization. A function h with the property that $x >^m y \Rightarrow h(x) \geq (\leq)\ h(y)$ is called Schur-convex (Schur-concave). A convenient characterization of Schur-convexity (-concavity) is provided by the Schur-Ostrowski condition, which states that a differentiable permutation invariant function h defined on $\mathbb{R}^n$ is Schur-convex (Schur-concave) if and only if
$$(x_i - x_j)\left(\frac{\partial h}{\partial x_i} - \frac{\partial h}{\partial x_j}\right) \geq (\leq)\ 0 \quad \text{for all } i, j \text{ and } x \in \mathbb{R}^n.$$

For an excellent and extensive treatment of the theory of majorization, the reader
should consult Marshall and Olkin (1979).
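As a small numerical sketch of these definitions (ours, with arbitrary test vectors), majorization can be checked from partial sums of the decreasing rearrangements, and Schur-concavity of a symmetric function such as the product can then be observed directly.

```python
import numpy as np

def majorizes(x, y, tol=1e-9):
    # x majorizes y: equal totals, and every upper partial sum of the
    # decreasing rearrangement of x dominates the corresponding sum for y.
    xs, ys = np.sort(x)[::-1], np.sort(y)[::-1]
    if abs(xs.sum() - ys.sum()) > tol:
        return False
    return bool(np.all(np.cumsum(xs) >= np.cumsum(ys) - tol))

x, y = [0.9, 0.5, 0.1], [0.5, 0.5, 0.5]
print(majorizes(x, y))              # True: any vector majorizes its mean vector
print(np.prod(y) >= np.prod(x))     # True: the product is Schur-concave
```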

A k out of n system is a system with n components which functions if and only if k or more of the components function. Systems of this type are frequently encountered in practice. A one out of n system is a parallel system and an n out of n system is a series system. We assume that the n components of the system function independently. Let $h_k(p)$ denote the reliability of a k out of n system in which the component reliabilities are given by $p = (p_1, \ldots, p_n)$. Computing the reliability function $h_k(p)$ is often difficult, particularly when a large number of unlike component probabilities are involved. Some interesting inequalities with applications in other areas of statistics have resulted from efforts to obtain more computable bounds for the system reliability $h_k(p)$.
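For moderate n, $h_k(p)$ can nevertheless be computed exactly; the following sketch (ours, with arbitrary reliabilities) evaluates the distribution of the number of functioning components by dynamic programming.

```python
def hk(p, k):
    # Reliability of a k-out-of-n system with independent components:
    # P(at least k of the n components function).
    n = len(p)
    dist = [1.0] + [0.0] * n            # dist[j] = P(exactly j function)
    for pi in p:
        for j in range(n, 0, -1):
            dist[j] = dist[j] * (1 - pi) + dist[j - 1] * pi
        dist[0] *= (1 - pi)
    return sum(dist[k:])

p = [0.95, 0.80, 0.70, 0.60]
print(hk(p, 2))                          # 2-out-of-4 system reliability
```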
For component reliability $p_i$ we define the corresponding component hazard $R_i$ by $R_i = -\log p_i$. Pledger and Proschan (1971) obtain the following comparisons for $h_k(p)$:

THEOREM 4.1. Let $R = (R_1, \ldots, R_n)$ be a vector of component hazards which majorizes $R' = (R'_1, \ldots, R'_n)$, a second vector of component hazards. Then for the corresponding component reliability vectors p and p' (note that $\prod_1^n p_i = \prod_1^n p'_i$ since $\sum_1^n R_i = \sum_1^n R'_i$) we have
$$h_k(p) \geq h_k(p') \quad \text{for } k = 1, \ldots, n - 1$$
and
$$h_n(p) = h_n(p') \quad \text{(that is, the two systems are equally good in series).}$$

Considering the particular case where $R'_1 = \cdots = R'_n$, one obtains the useful bound $h_k(p_1, \ldots, p_n) \geq h_k(p_G, \ldots, p_G)$ for $k = 1, \ldots, n$, where $p_G$ is the geometric mean $(\prod_1^n p_i)^{1/n}$.
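The geometric-mean bound is easy to confirm numerically; the brute-force check below is our own sketch with arbitrarily chosen component reliabilities.

```python
import math
from itertools import product

p = [0.95, 0.80, 0.70, 0.60]
n = len(p)
pG = math.prod(p) ** (1 / n)                     # geometric mean of reliabilities

def hk(q, k):
    # P(at least k of n independent components function), by enumeration.
    return sum(math.prod(q[i] if x[i] else 1 - q[i] for i in range(n))
               for x in product([0, 1], repeat=n) if sum(x) >= k)

for k in range(1, n + 1):
    assert hk(p, k) >= hk([pG] * n, k) - 1e-12
print("h_k(p) >= h_k(pG,...,pG) holds for k = 1,...,n")
```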
Although a large collection of theory and methods exists for order statistics from a single underlying distribution, a relatively small set of results is available for the case of order statistics from underlying heterogeneous distributions. Inasmuch as the time to failure of a k out of n system of independent components with respective life distributions $F_1, \ldots, F_n$ corresponds to the $(n - k + 1)$th order statistic from the set of underlying heterogeneous distributions $\{F_1, \ldots, F_n\}$, results about k out of n systems may be interpreted in terms of order statistics from heterogeneous distributions.
Let us assume that $Y_i$ ($Y'_i$) is an observation from distribution $F_i$ ($F'_i$) and that $R_i(x) = -\log \bar F_i(x)$ ($R'_i(x) = -\log \bar F'_i(x)$) is the corresponding hazard function for $i = 1, \ldots, n$. The ordered observations are denoted by $Y_{[1]} \leq \cdots \leq Y_{[n]}$ ($Y'_{[1]} \leq \cdots \leq Y'_{[n]}$). A random variable Y is stochastically larger than Y' ($Y \geq^{st} Y'$) if $F_Y(x) \leq F_{Y'}(x)$ for all x. In the realm of order statistics, Theorem 4.1 yields the following result:

THEOREM 4.2. Let $(R_1(x), \ldots, R_n(x)) >^m (R'_1(x), \ldots, R'_n(x))$ for all $x \geq 0$. Then
$$Y_{[1]} =^{st} Y'_{[1]} \quad \text{and} \quad Y_{[k]} \geq^{st} Y'_{[k]} \quad \text{for } k = 2, \ldots, n.$$

Pledger and Proschan (1971) obtain further results of this type for the case of proportional hazards. We say that the distributions $F_1, \ldots, F_n, F'_1, \ldots, F'_n$ have proportional hazards with constants of proportionality $\lambda_1, \ldots, \lambda_n, \lambda'_1, \ldots, \lambda'_n$ if $R_i(x) = \lambda_i R(x)$ and $R'_i(x) = \lambda'_i R(x)$ for some hazard function R(x) and all $i = 1, \ldots, n$. A consequence of Theorem 4.2 is the following:

COROLLARY 4.3. Let $F_1, \ldots, F_n, F'_1, \ldots, F'_n$ have proportional hazard functions with $\lambda_1, \ldots, \lambda_n, \lambda'_1, \ldots, \lambda'_n$ as constants of proportionality. If $(\lambda_1, \ldots, \lambda_n) >^m (\lambda'_1, \ldots, \lambda'_n)$, then $Y_{[1]} =^{st} Y'_{[1]}$ and $Y_{[k]} \geq^{st} Y'_{[k]}$ for $k = 2, \ldots, n$.

Proschan and Sethuraman (1976) generalize Corollary 4.3 and show that under the same stated conditions, $Y = (Y_1, \ldots, Y_n) \geq^{st} Y' = (Y'_1, \ldots, Y'_n)$ ($Y \geq^{st} Y'$ if and only if $f(Y) \geq^{st} f(Y')$ for all real valued increasing functions f of n variables). For more on stochastic ordering the interested reader should consult Kamae, Krengel and O'Brien (1977). Proschan and Sethuraman apply their result to study the robustness of standard estimates of the parameter λ in an exponential distribution ($F(x) = 1 - e^{-\lambda x}$) when the observations actually come from a set of heterogeneous exponential distributions.
Other comparisons for k out of n systems are given by Gleser (1975) and Boland and Proschan (1983). While investigating the distribution of the number of successes in independent but not necessarily identical Bernoulli trials, Hoeffding shows that
$$1 \geq h_k(1, \ldots, 1,\ \textstyle\sum_1^n p_i - [\sum_1^n p_i],\ 0, \ldots, 0) \geq h_k(p_1, \ldots, p_n)$$
whenever $\sum_1^n p_i \geq k$, and
$$0 = h_k(1, \ldots, 1,\ \textstyle\sum_1^n p_i - [\sum_1^n p_i],\ 0, \ldots, 0) \leq h_k(p_1, \ldots, p_n) \leq h_k(\bar p, \ldots, \bar p)$$
whenever $\sum_1^n p_i \leq k$. Here $\bar p = \sum_1^n p_i / n$ and $[\sum_1^n p_i]$ is the integer part of $\sum_1^n p_i$. Gleser generalizes this in showing the following:

THEOREM 4.4. $h_k(p)$ is Schur convex in the region where $\sum_1^n p_i \geq k + 1$ and Schur concave in the region where $\sum_1^n p_i \leq k - 2$.

In further research on the reliability of k out of n systems, Boland and Proschan (1983) show the following related result:

THEOREM 4.5. $h_k(p)$ is Schur convex in $[(k - 1)/(n - 1), 1]^n$ and Schur concave in $[0, (k - 1)/(n - 1)]^n$.

Theorems 4.4 and 4.5 represent inequalities which have practical use in the
study of k out of n systems. However it should be clear that they are of more
general interest and have applications in particular in the areas of order statistics
and independent Bernoulli trials.
Barlow and Proschan (1965) show that the mean life of a series system with
IFR components exceeds (is greater than or equal to) the mean life of a similar
system with exponential components, assuming component mean lives match in
the two systems. The reverse ordering is shown to hold in the parallel case.
Solovyev and Ushakov (1967) extend these results to include comparisons with
systems of degenerate and truncated exponential distributions. Marshall and
Proschan (1970) more generally show that if the life distributions $F_i$ and $G_i$ of corresponding components of a pair of series systems satisfy $\int_t^\infty \bar F_i(x)\,dx \geq \int_t^\infty \bar G_i(x)\,dx$ for all $t \geq 0$, then the same kind of inequality holds for the system life distribution. Similarly they show that the domination $\int_0^t \bar F(x)\,dx \geq \int_0^t \bar G(x)\,dx$ for all $t \geq 0$ is preserved under the formation of parallel systems, and that both of these types of domination are preserved under convolutions. Marshall and Proschan (1970) are (implicitly) working with the concept of continuous majorization (see Marshall and Olkin (1979)). We say the life distribution function F majorizes the life distribution function G (written $F >^m G$) if $\mu_F = \int_0^\infty \bar F(x)\,dx = \int_0^\infty \bar G(x)\,dx = \mu_G$ and $\int_t^\infty \bar F(x)\,dx \geq \int_t^\infty \bar G(x)\,dx$ for all $t \geq 0$. As a by-product of their work on the mean life of series and parallel systems, Marshall and Proschan establish the following result in the theory of majorization.

THEOREM 4.6. Suppose that $F_i >^m G_i$ for each $i = 1, \ldots, n$, where $F_i$ and $G_i$ are life distribution functions. Let $F(t) = F_1 * \cdots * F_n(t)$ and $G(t) = G_1 * \cdots * G_n(t)$ be n-fold convolutions, with respective means $\mu_F$ and $\mu_G$. Then
$$F >^m G.$$

Many elementary inequalities of general interest have been generated through optimization problems in reliability theory. Derman, Lieberman and Ross (1972) consider the problem of how to assemble J systems with n different components in order to maximize the expected number of functioning systems. They extend a basic inequality of Hardy, Littlewood, and Pólya and 'rediscover' (their extension is a special case of a result of Lorentz (1953)) the following inequality:

THEOREM 4.7. Let $F(x_1, \ldots, x_n)$ be a joint distribution function. If $x_i^1 \leq \cdots \leq x_i^J$ for $i = 1, \ldots, n$, then
$$\sum_{j=1}^{J} F(x_1^j, \ldots, x_n^j) \geq \sum_{j=1}^{J} F(x_1^j, x_2^{\pi_2(j)}, \ldots, x_n^{\pi_n(j)})$$
whenever $\pi_i$ ($i = 2, \ldots, n$) are permutations of $1, 2, \ldots, J$.
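A numerical check of Theorem 4.7 (our own construction) can use any joint distribution function; below, an empirical bivariate CDF plays that role, and the aligned arrangement indeed maximizes the sum over all permutations.

```python
import itertools, random

random.seed(1)
pts = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]

def F(x1, x2):
    # Empirical joint CDF of the sample: a valid distribution function.
    return sum(a <= x1 and b <= x2 for a, b in pts) / len(pts)

xs1 = sorted(random.gauss(0, 1) for _ in range(4))   # x_1^1 <= ... <= x_1^J
xs2 = sorted(random.gauss(0, 1) for _ in range(4))   # x_2^1 <= ... <= x_2^J
aligned = sum(F(xs1[j], xs2[j]) for j in range(4))
for perm in itertools.permutations(range(4)):
    assert aligned >= sum(F(xs1[j], xs2[perm[j]]) for j in range(4)) - 1e-12
print("aligned sum dominates every permuted sum (n = 2, J = 4)")
```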



References

Barlow, R. E. and Proschan, F. (1964). Comparison of replacement policies, and renewal theory
implications. Ann. Math. Statist. 35, 577-589.
Barlow, R. E. and Proschan, F. (1965). Mathematical Theory of Reliability. Wiley, New York.
Barlow, R. E. and Proschan, F. (1981). Statistical Theory of Reliability and Life Testing. To Begin With,
Silver Spring, MD.
Bazovsky, I. (1962). Study of maintenance cost optimization and reliability of shipboard machinery.
ONR Contract No. Nonr-374000(00) (FBM), United Control Corp., Seattle, WA.
Birnbaum, Z. W., Esary, J. D. and Saunders, S. C. (1961). Multi-component systems and structures
and their reliability. Technometrics 3, 55-77.
Birnbaum, Z. W. and Saunders, S. C. (1968). A probabilistic interpretation of Miner's rule. SIAM J. Appl. Math. 16, 637-652.
Black, G. and Proschan, F. (1959). On optimal redundancy. Oper. Res. 7, 581-588.
Boland, P. J. and Proschan, F. (1983). The reliability of k out of n systems. Ann. Prob. 11, 760-764.
Boland, P. J. and Proschan, F. (1984). An integral inequality with applications to order statistics.
To appear.
Brown, M. (1980). Bounds, inequalities, and monotonicity properties for some specialized renewal
processes. Ann. Probability 8, 227-240.
Brown, M. (1981). Further monotonicity properties for specialized renewal processes. Ann. Proba-
bility. 9, 891-895.
Cox, D. R. (1962). Renewal Theory. Wiley, New York.
Derman, C., Lieberman, G. J. and Ross, S. M. (1972). On optimal assembly of systems. Nav. Res. Log. Quart. 19, 569-574.
Esary, J. D., Marshall, A. W. and Proschan, F. (1973). Shock models and wear processes. Ann. Prob.
1, 627-649.
Esary, J. D. and Proschan, F. (1963). Coherent structures of non-identical components. Technometrics
5, 191-209.
Esary, J. D., Proschan, F. and Walkup, D. W. (1967). Association of random variables, with
applications. Ann. Math. Stat. 38, 1466-1474.
Feller, W. (1948). On probability problems in the theory of counters. Courant Anniversary Volume.
Interscience, New York.
Feller, W. (1966). An Introduction to Probability Theory and Its Applications, Vol. II. Wiley, New York.
Fortuin, C. M., Kasteleyn, P. W. and Ginibre, J. (1971). Correlation inequalities on some partially
ordered sets. Comm. Math. Phys. 22, 89-103.
Gleser, L. (1975). On the distribution of the number of successes in independent trials. Ann. Prob.
3, 182-188.
Hardy, G. H., Littlewood, J. E. and Pólya, G. (1952). Inequalities. Cambridge University Press, New York.
Heidelberger, P. and Iglehart, D. L. (1979). Comparing stochastic systems using regenerative
simulation with common random numbers. Adv. Appl. Prob. 11, 804-819.
Hoeffding, W. (1956). On the distribution of the number of successes in independent trials. Ann.
Math. Stat. 27, 713-721.
Joag-dev, K., Perlman, M. D. and Pitt, L. D. (1983). Association of normal random variables and
Slepian's inequality. Ann. Prob. 11, 451-455.
Kamae, T., Krengel, U. and O'Brien, G. L. (1977). Stochastic inequalities on partially ordered spaces.
Ann. Probab. 5, 899-912.
Karlin, S. (1964). Total positivity, absorption probabilities and applications. Trans. Amer. Math. Soc. 111, 33-107.
Karlin, S. (1968). Total Positivity. Stanford University Press, Stanford, CA.
Karlin, S. and Proschan, F. (1960). Pólya type distributions of convolutions. Ann. Math. Stat. 31,
721-736.
Karlin, S. and Taylor, H. M. (1975). A First Course in Stochastic Processes, 2nd edition. Academic
Press, New York.

Keilson, J. (1975). Systems of independent Markov components and their transient behavior. In: R. E. Barlow, J. B. Fussell and N. D. Singpurwalla, eds., Reliability and Fault Tree Analysis. SIAM,
Philadelphia, PA, 351-364.
Kemperman, J. H. B. (1977). On the FKG-inequality for measures on a partially ordered space. Indag. Math. 39, 313-331.
Kimball, A. W. (1951). On dependent tests of significance in the analysis of variance. Ann. Math. Stat. 22, 600-602.
Lehmann, E. L. (1966). Some concepts of dependence. Ann. Math. Stat. 37, 1137-1153.
Lorentz, G. G. (1953). An inequality for rearrangements. Amer. Math. Mon. 60, 176-179.
Marshall, A. W. and Olkin, I. (1967). A multivariate exponential distribution. J. Amer. Stat. Assoc.
62, 30-44.
Marshall, A. W. and Olkin, I. (1979). Inequalities: Theory of Majorization and Its Applications.
Academic Press, New York.
Marshall, A. W. and Proschan, F. (1970). Mean life of series and parallel systems. J. App. Prob. 7,
165-174.
Marshall, A. W. and Proschan, F. (1972). Classes of distributions applicable in replacement, with
renewal theory implications. In: L. LeCam, J. Neyman and E. L. Scott, eds., Proceedings of the 6th
Berkeley Symposium on Mathematical Statistics and Probability, Vol. I, University of California Press,
Berkeley, CA, 395-415.
Marshall, K. T. (1973). Linear bounds on the renewal function. SIAM J. App. Math. 24, 245-250.
Miner, M. A. (1945). Cumulative damage in fatigue. J. AppL Mech. 12, A159-A164.
Moore, E. F. and Shannon, C. E. (1956). Reliable circuits using less reliable relays. J. Franklin
Institute 262, part I 191-208 and part II 281-297.
Newman, C. M. and Wright, A. L. (1981). An invariance principle for certain dependent sequences.
Ann. Prob. 9, 671-675.
Niu, S. C. (1981). On queues with dependent interarrival and service times. Nav. Res. Log. Quart.
28, 497-501.
Pitt, L. D. (1982). Positively correlated normal random variables are associated. Ann. Prob. 10,
496-499.
Pledger, G. and Proschan, F. (1971). Comparisons of order statistics and of spacings from hetero-
geneous distributions. In: J. S. Rustagi, ed., Optimizing Methods in Statistics. Academic Press, New
York, 89-113.
Proschan, F. (1960). Pólya Type Distributions in Renewal Theory, with an Application to an Inventory Problem. Prentice-Hall, Englewood Cliffs, NJ.
Proschan, F. and Sethuraman, J. (1976). Stochastic comparisons of order statistics from hetero-
geneous populations, with applications in reliability theory. J. Mult. Anal. 6, 608-616.
Robbins, H. (1954). A remark on the joint distribution of cumulative sums. Ann. Math. Stat. 25,
614-616.
Ross, S. M. (1970). Applied Probability Models with Optimization Applications, Holden-Day, San
Francisco.
Saunders, S. C. (1970). A probabilistic interpretation of Miner's rule. II. SIAM J. Appl. Math. 19, 251-265.
Shogan, A. W. (1977). Bounding distributions for a stochastic PERT network. Networks 7, 359-381.
Smith, W. L. (1958). Renewal theory and its ramifications. J. Roy. Statist. Soc., Series B 20, 243-302.
Solovyev, A. D. and Ushakov, I. A. (1967). Some estimates for systems with components 'wearing out'. (In Russian). Avtomat. i Vycisl. Tehn. 6, 38-44.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 175-213

Reliability Ideas and Applications in Economics and Social Sciences

M. C. Bhattacharjee*

0. Introduction and summary

0.1. In recent times, Reliability theoretic ideas and methods have been used
successfully in several other areas of investigation with a view towards exploiting
concepts and tools, which have their roots in Reliability Theory, in other
settings to draw useful conclusions. For a purely illustrative list of some of these
areas and corresponding problems which have been so addressed, one may
mention: demography (bounds on the 'Malthusian parameter', reproductive value
and other related parameters in population growth models--useful when the
age-specific birth and death-rates are unknown or subject to error: Barlow and
Saboia (1973)), queueing theory (probabilistic structure of and bounds on the
stationary waiting time and queue lengths in single server queues: Kleinrock
(1975), Bergmann and Stoyan (1976), Kollerström (1976), Daley (1983)) and economics ('inequality of distribution' and associated problems: Chandra and Singpurwalla (1981), Klefsjö (1984), Bhattacharjee and Krishnaji (1985)). In each
of these problems, the domain of primary concern and immediate reference is not
the lifelengths of physical devices/systems of such components or their failure-
logic structure per se but some phenomenon, possibly random, evolving in time
and space. Nevertheless, the basic reason behind the success of cross-fertilization
of ideas and methods in each of the examples listed above is that the concepts
and tools which owe their origin to traditional Reliability theory are in principle
applicable to non-negative (random) variables and (stochastic) processes generated
by such variables.

0.2. Rather than attempt to provide a bibliography of all known applications of Reliability in widely diverse areas, our purpose in this paper is more modest. We review recent work on such applications to some problems in economics and social sciences--which is illustrative of the non-traditional applications of Reliability ideas that are finding increasing use. In Section 1, 'social choice functions' and

* Work done while the author was visiting the University of Arizona.


the celebrated 'impossibility theorem' of Arrow (1951) are considered as an appli-


cation of 'monotone-structure' ideas. Section 2 considers 'voting games' and
'power indices' which are among the best known quantitative models of group
behavior in political science, to show they can be modeled via the theory of
structure functions. Besides providing new viewpoints and alternative proofs of
well known classic results which these situations illustrate, reliability ideas can
also lead to new insights. Sections 3 and 4, which exploit appropriate parametric
and nonparametric 'life distribution' ideas, are in the latter category. Section 3
considers alternatives to the traditional Lorenz-coefficient and Gini-index for
measuring 'inequality of distribution' in economics by exploiting mean residual life
and TTT-transform concepts. Section 4 describes an approach to modeling some
aspects of the 'economics of innovation and R & D rivalry' by considering the
'reliability characteristics' of the time to innovation of a technologically feasible
product or process among a competing group of entrepreneurs or firms which are
in the race to be the first to innovate.
In each of the four themes, a summary of the problem formulation and basic
results of interest precedes the reliability analogies and arguments which can be
brought to bear on the problems. No detailed proofs are given except for Arrow's
theorem (Section 1.2) from an unpublished technical report whose succint argu-
ments are reviewed to illustrate how the reliability approach can be constructive
in clarifying the role of underlying assumptions and an alternative insight. The role
of interpretation of appropriate reliability theoretic concepts and results for such
an interplay cannot be minimized and are interspersed throughout our presen-
tation. The format is mainly expository in nature, although some results are new.
In each section, we also indicate some possible directions of further development
that would be interesting from the point of view of the themes addressed and that
of reliability theory and applications.

1. The 'Impossibility Theorem' of Arrow

1.1. Arrow (1951) considered the problem of aggregating 'individual preference orderings' to form a 'social preference ordering'. In the conceptual framework of social decision making and particularly in the context of voting theory, his celebrated 'impossibility theorem' is a landmark result which essentially states that there is no social preference ordering which obeys two reasonable axioms and four conditions that one would expect all reasonable ways of aggregating individual preferences to a collective one to satisfy. Pechlivanides (1975), in a paper investigating some aspects of social decision structures, has given an alternative proof of Arrow's theorem using coherent-structure arguments of reliability theory which appears to have remained unpublished and which we believe is a very apt illustration of the reliability arguments for many modeling problems in the social sciences. His arguments are somewhat succinct; we review and amplify them below.
Before reviewing Pechlivanides' proof, we take up a brief description and
formal statement of Arrow's theorem which may not be entirely familiar to relia-

bility researchers. Central to this is the idea of a preference ordering R among the elements x, y, ... of a finite set F. R is a relation among the elements of F such that for any x, y ∈ F, we say: x R y iff x is at least as preferred as y. Such a relation R is required to satisfy the two axioms:
(A1) Transitivity: For all x, y, z ∈ F: x R y and y R z ⇒ x R z.
(A2) Connectedness: For all x, y ∈ F: either x R y or y R x or both.
Technically R is a complete pre-order on F; it is analogous to a relation such as 'at least as tall as' among a set of persons. Notice that we can have both x R y and y R x but x ≠ y. For a given F, it is sometimes easier to understand the relation R through two other relations P, I defined as: x P y ⇔ x is strictly preferred to y, while x I y ⇔ x and y are equally preferred (indifference). Then note, (i) x R y ⇔ not (y P x), i.e., x R y is the negation of y P x, and (ii) the axiom (A2) says: either x P y or y P x or x I y.
Now consider a society S = {1, 2, ..., n} of n individuals (voters), n ≥ 2, and a finite set A of alternatives consisting of k choices (candidates/policies/actions), k > 2. Each individual i ∈ S has a personal preference ordering $R_i$ on A satisfying the axioms (A1) and (A2). The problem is to aggregate all the individual preferences into a choice for S as a whole. To put it another way, since $R_i$ indicates how i 'votes', an 'election' ℰ is a complete set of 'votes' (formally, ℰ = {$R_i$ : i ∈ S}) and since the result of any such election must amalgamate its elements (i.e., the individual voter-preferences) in a reasonable manner into a well-defined collective preference of the society S, such a result can be thought of as another relation R* on A which, to be reasonable, must again satisfy the same two axioms (A1) and (A2) with F = A.
Arrow conceptualizes the definition of a 'voting system' as the specification of a social preference ordering R* given S, A. There are many possible R* that one can define, including highly arbitrary ones such as R* = $R_i$ for some i ∈ S (such an individual i, if it exists, is called a 'dictator'). To model real-world situations, we require to exclude such unreasonable voting systems and confine ourselves to those R* which satisfy some intuitive criteria of fairness and consistency. Arrow visualized four such conditions, namely:
(C1) (Well-definedness). A voting system R* must be capable of a decision. For any pair of alternatives a, b, there exists an 'election' for which the society prefers a to b. [R* must be defined on the set of all n-tuples ℰ = ($R_1$, ..., $R_n$) of individual preferences and be such that for all a, b in A there exists an ℰ for which a R* b and not b R* a.]
(C2) (Independence of Irrelevant Alternatives). R* must be invariant under addition or deletion of alternatives. [If A' ⊂ A and ℰ = {$R_i$ : i ∈ S} is any election, then R* restricted to A' should depend only on {$R_i$ restricted to A' : i ∈ S}.]
(C3) (Positive Responsiveness). An increasing (i.e., nondecreasing) preference for an alternative between two elections does not decrease its social preference. [Formally, given S and A, let ℰ = {$R_i$ : i ∈ S} and ℰ' = {$R'_i$ : i ∈ S} be two elections. If there exists an a ∈ A such that

(i) a $R_i$ a' ⇒ a $R'_i$ a' for all i ∈ S, and a' ≠ a;
(ii) for all pairs (a', b') ∈ A × A with a' ≠ a, b' ≠ a, a' ≠ b': {(a', b') : a' $R_i$ b'} = {(a', b') : a' $R'_i$ b'},
then a R* a' ⇒ a R*' a' for all a' ≠ a. In other words, if each voter looks on a ∈ A at least as favorably under ℰ' as he does under ℰ and if the individual preferences between any other pair of alternatives remain the same under both elections, then the society looks on a at least as favorably under ℰ' as it does under ℰ.]
(C4) (No Dictator). There is no individual whose preference ('vote') always coincides with the social preference regardless of the other individual preferences. [There does not exist i ∈ S with R* = $R_i$, i.e., such that for all (a, b), a $R_i$ b ⇒ a R* b and (not a $R_i$ b) ⇒ (not a R* b).]
Call a voting system (social preference ordering) R* admissible iff it satisfies the axioms (A1), (A2) and the conditions (C1)-(C4). Arrow's impossibility theorem then claims that for a society of at least two individuals and more than two alternatives, an admissible voting system does not exist.

1.2. The 'reliability' argument. The traditional proof of Arrow's theorem depends heavily on the properties of complete pre-orders. To see the relevance of reliability ideas for proving Arrow's theorem, Pechlivanides imagines the society S as a system and each voter i ∈ S as one of its components. For every pair (a, b) of alternatives with a ≠ b, associate a binary variable $x_i : A_2 \to \{0, 1\}$, where $A_2 = \{(a, b) : a \in A,\ b \in A,\ a \neq b\}$ is the set A × A devoid of its diagonal, by
$$x_i(a, b) = \begin{cases} 1 & \text{if } a R_i b, \\ 0 & \text{otherwise}. \end{cases} \tag{1.1}$$
Relative to b, every $x_i(a, b)$ is a vote for a if $x_i(a, b) = 1$ and is a vote against a if it equals zero. Thus $x_i$ defines i's vote and is an equivalent description of his individual preference ordering $R_i$. The vote-vector $x = (x_1, \ldots, x_n) : A_2 \to \{0, 1\}^n$ is an equivalent description of an election ℰ = ($R_1$, ..., $R_n$). A voting system (social preference ordering) R* is similarly equivalent to specifying a social choice function $F_A : A_2 \to \{0, 1\}$ such that
$$F_A(a, b) = \begin{cases} 1 & \text{if } a R^* b, \\ 0 & \text{otherwise}. \end{cases} \tag{1.2}$$
Each $x_i(a, b) = 1$ or 0 ($F_A(a, b) = 1$ or 0, respectively) according as the individual i (society S, respectively) does not/does prefer b to a. Formally, Arrow's result is
then:

IMPOSSIBILITY THEOREM (Arrow). There does not exist a social choice function
FA satisfying (A1), (A2) and (C1)-(C4).

To argue that the two axioms and four conditions are collectively inconsistent,
the first step is to show:

LEMMA 1. (C1)-(C3) hold ⇔ $F_A = \varphi(x)$ for some monotone structure function φ.

PROOF. Recall that a monotone structure function in reliability theory is any function $\varphi : \{0, 1\}^n \to \{0, 1\}$ such that φ is non-decreasing in each argument and φ(0) = 0, φ(1) = 1, where 0 = (0, ..., 0) and 1 = (1, ..., 1) (viz., Barlow and Proschan, 1975).
First note (C2) ⇒ $F_A(a, b)$ depends only on (a, b) and not on all of A. Hence we will simply write F for $F_A$. The condition (C1) ⇒ F(a, b) = φ(x(a, b)) for all (a, b) ∈ $A_2$, for some binary structure function φ. Next, (C3) ⇒ this φ(x) is monotone non-decreasing in each coordinate $x_i$. Finally (C1) and (C3) together ⇒ φ(0) = 0, φ(1) = 1; viz., since by (C1) there exist vote-vectors $x_0$ and $x_1$ such that $\varphi(x_0) = 0$, $\varphi(x_1) = 1$, by the monotonicity hypothesis (C3) for φ we get
$$0 \leq \varphi(0) \leq \varphi(x_0) = 0, \qquad 1 = \varphi(x_1) \leq \varphi(1) \leq 1.$$
Thus the conditions (C1)-(C3) imply F = φ(x) for some monotone structure function φ. The converse is trivial. □

The axioms (A1) and (A2) for voting systems translated to requirements on the social choice function F(a, b) = φ(x(a, b)) become
(A1) Transitivity: F(a, b) = 1 = F(b, c) ⇒ F(a, c) = 1.
(A2) Connectedness: F(a, b) = 1 or 0.

Consider a pair of alternatives (a, b) ∈ $A_2$ such that F(a, b) = φ(x(a, b)) = 1. Borrowing the terminology of reliability theory, we will say
$$P(a, b) := \{i \in S : x_i(a, b) = 1\} = \{i \in S : a R_i b\} \tag{1.3}$$
is an (a, b)-path. Similarly, if F(a, b) = 0, call the set of individuals
$$C(a, b) := \{i \in S : x_i(a, b) = 0\} = \{i \in S : b P_i a\} \tag{1.4}$$
as an (a, b)-cut. Thus an (a, b)-path ((a, b)-cut, respectively) is any coalition, i.e.,
subset of individuals whose common 'non-preference of b relative to a' ('pre-
ference of b over a', respectively) is inherited by the whole society S. Obviously
such paths (cuts) always exist since the whole society S is always a path as well
as a cut for every pair of alternatives. When the relevant pair of alternatives (a, b)
is clear from the context, we drop the prefix (a, b) for simplicity and just refer
to (1.3) and (1.4) as path and cut. A minimal path (cut) is a coalition of which
no proper subset is a path (cut).

To return to the main proof, notice that Lemma 1 limits the search for social choice functions F = φ(x) to those monotone structure functions φ which satisfy (A1), (A2) and (C4). A social choice function satisfies the connectedness axiom (A2) iff for every pair of alternatives (a, b) there exists either a path or a cut, according as F(a, b) = 1 or 0, whose members' common vote agrees with the social choice F(a, b). The transitivity axiom (A1), that F(a, b) = 1 = F(b, c) ⇒ F(a, c) = 1 for each triple of alternatives (a, b, c), can be similarly translated as: for each of the pairs (a, b), (b, c), (a, c) there exists a path, not necessarily the same, which allows the cycle of alternatives a, b, c to pass.
Let $\mathcal{M}$ be the class of monotone structure functions and set
$$\mathcal{F} := \{\varphi \in \mathcal{M} : \text{no two paths are disjoint}\},$$
$$\mathcal{F}^* := \{\varphi \in \mathcal{M} : \text{the intersection of all paths is nonempty}\},$$
$$\mathcal{D} := \{\varphi \in \mathcal{M} : \varphi = \varphi^d\},$$
where $\varphi^d$ is the dual structure function
$$\varphi^d(x) := 1 - \varphi(1 - x).$$
$\mathcal{F}$ ($\mathcal{F}^*$, respectively) are those monotone structures for which there is at least one common component shared by any two paths (all paths, respectively). $\mathcal{D}$ is the class of self-dual monotone structures for which every path (cut) is also a cut (path).
Clearly $\mathcal{F}^* \subset \mathcal{F}$. Also $\mathcal{D} \subset \mathcal{F}$; for if not, then there exist two paths $P_1$, $P_2$ (which are also cuts by self-duality) which are disjoint, so that we then have a cut $P_1$ disjoint from a path $P_2$. This contradicts the fact that any two coalitions of which one is a path and the other a cut must have at least one common component, for otherwise it would be possible for a structure φ to fail (φ(x) = 0) and not fail (φ(x) ≠ 0) simultaneously, violating the well-definedness condition (C1). Thus
$$\mathcal{D} \subset \mathcal{F}, \qquad \mathcal{F}^* \subset \mathcal{F}. \tag{1.5}$$

To see if there is an admissible social choice function F, we are asking if there exists a φ ∈ $\mathcal{M}$ satisfying (A1), (A2) and (C4). To check that the answer is no, the underlying argument is as follows. First check
$$\text{(A2)} \Rightarrow \varphi \in \mathcal{D} \tag{1.6}$$
and hence φ ∈ $\mathcal{F}$ by (1.5). Which are the structures in $\mathcal{F}$ that satisfy (A1)? We show this is precisely $\mathcal{F}^*$, i.e., claim
$$\mathcal{F} \cap \text{(A1)} = \mathcal{F}^* \tag{1.7}$$
so that any admissible F = φ(x) ∈ $\mathcal{F}^*$. The final step is to show that the property defining $\mathcal{F}^*$ and the no-dictator hypothesis (C4) are mutually inconsistent.

The following outlines the steps of the argument. For any pair (a, b) of alternatives, the society S obeying axiom (A2) must either decide 'b is not preferred to a' (F(a, b) = φ(x(a, b)) = 1) or its negation 'b is preferred to a' (F(a, b) = φ(x(a, b)) = 0). If the individual votes x(a, b) result in either of these two social choices, as they must, the dual response 1 − x(a, b) (which changes every individual vote in x(a, b) to its negation) must induce the other; i.e., for each x,
$$\varphi(x) = 0\ (1, \text{resp.}) \iff \varphi(1 - x) = 1\ (0, \text{resp.}) \iff \varphi^d(x) = 0\ (1, \text{resp.}) = \varphi(x).$$
Thus (A2) restricts us to $\mathcal{D}$.


To argue (1.7), consider a φ ∈ $\mathcal{F}^*$. If $i_0$ is a component individual common to all paths for all pairs of alternatives, then $\{i_0\}$ is necessarily a cut; i.e., systems in $\mathcal{F}^*$ have a singleton cut $\{i_0\}$. Since this component $i_0$ obeys the transitivity axiom, so does φ. Thus systems in $\mathcal{F}^*$ satisfy (A1), so that together with $\mathcal{F}^* \subset \mathcal{F}$ we see $\mathcal{F}^*$ is contained in $\mathcal{F}$ ∩ (A1). One thus has only to argue the reverse inclusion: systems in $\mathcal{F}$ obeying transitivity must be in $\mathcal{F}^*$. Consider any such system φ ∈ $\mathcal{F}$ and the set of all of its paths for all alternative pairs (a, b). Now
(i) if there is only a single path, then φ ∈ $\mathcal{F}^*$ trivially and hence satisfies (A1) since $\mathcal{F}^*$ does.
(ii) If there are exactly two paths in all, then since no two paths are disjoint they share a common component; so again φ ∈ $\mathcal{F}^*$, satisfying (A1).
(iii) If there are at least three paths, choose any three, say P1, P2, P3. Let i*(1, 2) be a component in P1 ∩ P2. Suppose i*(1, 2) ∉ P3, if possible. Then there exist distinct components i*(2, 3), i*(1, 3) in P2 ∩ P3 and P1 ∩ P3 respectively. Choose the component-votes (individual preference orderings) of these components, and the system-votes (social choices) by appropriate choices of the votes for the remaining components in the three paths, for an arbitrary but fixed cycle of alternatives (a, b, c), as shown in Table 1 (for simplicity, the component preferences and votes are generically denoted by P and x(·, ·), suppressing the individual identity subscript; thus for i*(1, 2), the preference $P = P_{i^*(1,2)}$, $x(a, b) = x_{i^*(1,2)}(a, b)$, etc.).

Table 1

Paths    Common       Individual   Equivalent               Suitable choices     Corresponding
         component    preference   component-votes          of votes for other   social choice
                                                            components in
P1, P2   i*(1, 2)     a P b P c    x(c, b) = x(b, a) = 0    P1                   F(c, b) = 0
P2, P3   i*(2, 3)     c P a P b    x(b, a) = x(a, c) = 0    P2                   F(b, a) = 0
P1, P3   i*(1, 3)     b P c P a    x(a, c) = x(c, b) = 0    P3                   F(a, c) = 0

Since F = φ(x) is self-dual, we have
$$F(a, b) = 1 - F(b, a), \quad \text{all } (a, b) \in A_2;$$
viz., $x_i(a, b) = 1 - x_i(b, a)$, all i ∈ S, all (a, b); hence $F(a, b) = \varphi(x(a, b)) = \varphi^d(x(a, b)) = 1 - \varphi(1 - x(a, b)) = 1 - \varphi(x(b, a)) = 1 - F(b, a)$. Hence, for the cycle of alternatives (a, b, c), from the last column of the above table we have: F(b, c) = 1 = F(c, a), but F(b, a) = 0, thus contradicting the transitiveness axiom (A1). Hence all three paths must share a common component.
In the spirit of the above construction, an inductive argument can now similarly show that if there are (j + 1) paths in all and if every set of j paths has a common component, then so does the set of all (j + 1) paths, j = 1, 2, ..., if (A1) is to hold. Thus there is a component common to all paths, i.e., φ ∈ $\mathcal{F}^*$. Let i* be such a component. Since i* belongs to every path, it is a one-component cut. It is also a one-component path, by the self-duality of φ. That {i*} is both a path and a cut says
$$x_{i^*} = 1\ (0) \Rightarrow \varphi(x) = 1\ (0),$$
irrespective of the votes $x_i$ of all other individuals i ∈ S, i ≠ i*. Hence i* is a dictator. But this contradicts (C4). □

While the problem of aggregation is vacuous unless there are at least two individuals (n ≥ 2), notice the role of the assumption that there are at least three choices (k > 2 alternatives), which places the transitiveness axiom in perspective. There are real-life voting systems (social choice functions) which do not satisfy (A1). One such example is the majority system R* such that
$$a R^* b \iff N(a, b) \geq N(b, a)$$
where
$$N(a, b) = \#\{\text{voters } i \in S \text{ with } a R_i b\} = \sum_{i=1}^{n} x_i(a, b).$$

Since each individual is a one-component self-dual system (viz., $x_i(a, b) = 1 - x_i(b, a)$, all (a, b)), the social choice function F corresponding to the majority voting system R* is
$$F(a, b) = \varphi(x(a, b)) = 1\ (0) \iff \sum_{i=1}^{n} x_i(a, b) \geq\ (<)\ \tfrac{1}{2}n.$$
Thus F is the so-called (m, n)-structure φ in reliability theory, where
$$m = [\tfrac{1}{2}n] + 1 \text{ if } n \text{ odd}, \qquad m = \tfrac{1}{2}n \text{ if } n \text{ even}.$$

This F = φ(x) is monotone, indeed a coherent structure; but F and the corresponding voting system R* are not transitive, since with three choices (a, b, c) we may have a majority (≥ n/2) of voters not preferring 'c to b' and 'b to a' but strictly less than a majority not preferring 'c to a'. Formally, $\sum_{i=1}^n x_i(a, b) \geq n/2$, $\sum_{i=1}^n x_i(b, c) \geq n/2$ but $\sum_{i=1}^n x_i(a, c) < n/2$; correspondingly F(a, b) = F(b, c) = 1 but F(a, c) = 0.
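This failure of transitivity is the classic Condorcet paradox, which is easy to reproduce numerically; the three cyclic voter rankings below are our own illustrative choice.

```python
# Three voters with cyclic strict rankings (most preferred first).
rankings = [("a", "b", "c"), ("b", "c", "a"), ("c", "a", "b")]
n = len(rankings)

def x(i, a, b):
    # x_i(a, b) = 1 iff voter i does not prefer b to a (ranks a above b).
    return 1 if rankings[i].index(a) < rankings[i].index(b) else 0

def F(a, b):
    # Majority social choice: 1 iff at least half the voters rank a above b.
    return 1 if sum(x(i, a, b) for i in range(n)) >= n / 2 else 0

print(F("a", "b"), F("b", "c"), F("a", "c"))   # 1 1 0: transitivity fails
```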
The non-transitiveness of majority systems is a telling example of the impossi-
bility of meeting conflicting requirements each of which is desirable by itself.
Pechlivanides (ibid.) also shows that if we replace axiom (A1) by symmetry of components (i.e., require φ(x) to be permutation-invariant in the coordinates of x) but retain all other assumptions in Arrow's theorem, the only possible resulting structures are the odd-majority systems. In this sense, majority voting systems with an odd number (n = 2m + 1) of voters are a reasonable system. While transitiveness is essentially a consistency requirement, the symmetry hypothesis is an assumption of irrelevance of the identity of individuals in that any mutual exchange of their identities does not affect the collective choice. One can ponder the implications of the trade-off between these assumptions for any theory of democratic behavior for social decision making.

1.3. The monotone structures φ in Lemma 1 are referred to as coherent structures in Pechlivanides (1975). In accepted contemporary use (viz., Barlow and Proschan, 1975), however, coherence requires substituting the assumption φ(x) = x for x = 0, 1 for monotone structures by the assumption that all components are 'relevant'. A component (voter) i ∈ S is irrelevant if its (the person's) functioning or non-functioning (individual preference for or against an alternative) does not affect the system's performance (social choice), i.e., φ(x) is constant in all $x_i$, equivalently
$$\varphi(1_i, x) - \varphi(0_i, x) = 0, \quad \text{all } x,$$
where $(0_i, x) := (x_1, \ldots, x_{i-1}, 0, x_{i+1}, \ldots, x_n)$ and $(1_i, x)$ is defined similarly. Hence $\varphi(\cdot_i, x)$ is the social choice given i's vote, i ∈ S. Thus,

i ∈ S is relevant ⇔ $\varphi(1_i, x) - \varphi(0_i, x) \neq 0$ for some x
⇔ $\varphi(1_i, x(a, b)) - \varphi(0_i, x(a, b)) \neq 0$ for some (a, b)
when relevance is translated in terms of social choice given i's vote; while
i ∈ S is a dictator ⇔ $\varphi(1_i, x(a, b)) = 1$, $\varphi(0_i, x(a, b)) = 0$ for all (a, b).

Let
$$S_{a, b} = \{i \in S : \varphi(1_i, x(a, b)) - \varphi(0_i, x(a, b)) = 0\}.$$


Then the set of dictators, if any, is
$$D = \{i \in S : \varphi(1_i, x) - \varphi(0_i, x) \neq 0, \text{ all } x\} = \bigcap_{(a, b) \in A_2} \bar S_{a, b},$$
while the set of irrelevant components is
$$D_0 = \{i \in S : \varphi(1_i, x) - \varphi(0_i, x) = 0, \text{ all } x\} = \bigcap_{(a, b) \in A_2} S_{a, b}.$$

Note, φ is coherent ⇔ φ is coordinatewise monotone nondecreasing and $D_0 = \emptyset$; while the 'no dictator hypothesis' holds ⇔ $D = \emptyset$.
In the context of the social choice problem, we may call $D_0$ the set of 'dummy' voters, those whose individual preferences are of no consequence for the social choice. An assumption of no dummies ($D_0$ empty), which together with (C1)-(C3) then leads to a coherent social choice function F = φ(x), would require that for every individual there is some pair of alternatives (a, b) for which the social preference agrees with his own. By contrast, Arrow's no-dictator hypothesis is the other side of the coin: i.e., for every individual there is some (a, b) for which his preference is immaterial as a determinant of the society's choice. While the coherence assumption of reliability theory has yielded rich dividends for modeling aging/wear and tear of physical systems, it is also clear that the 'no dummy' interpretation of the 'all components are relevant' assumption is certainly not an unreasonable one to require of social choice functions. What are the implications, for traditional reliability theory, of replacing the condition of relevance of each component for coherent structures by the no-dictator hypothesis? Conversely, in the framework of social choice, it may be interesting to pursue the ramifications of substituting the no-dictator hypothesis (C4) by the condition of 'no dummy voters'--themes which we will not pursue here, but which may lead to new insights.

2. Voting games and political power

We turn to 'voting games' as another illustration of the application of reliability


ideas in other fields. Of interest to political scientists, these are among the better
known mathematical models of group behavior which attempt to explain the
processes of decision for or against an issue in the social setting of a committee
of n persons and formalize the notion of political power. For an excellent over-
view of literature and recent research in this area, see Lucas (1978), Deegan and
Packel (1978), and Straffin (1978)--all in Brams, Lucas and Straffin (1978a).

2.1. The model and basic results. Denote a committee of n persons by N. Elements of N are called players. We can take N = {1, 2, ..., n} without loss of generality. A coalition is any subset S of players, S ⊆ N. Each player votes yes or no, i.e., for or against the proposition. A winning (blocking) coalition is any coalition whose individual yes (no) votes collectively ensure that the committee passes (fails) the proposition. Let W be the set of winning coalitions and $v : 2^N \to \{0, 1\}$ the binary coalition-value function
$$v(S) = \begin{cases} 1 & \text{if } S \in W \text{ (S winning)}, \\ 0 & \text{if } S \notin W \text{ (S not winning)}. \end{cases} \tag{2.1}$$

Formally, a simple voting game G (also referred to as a simple game) is an ordered pair G = (N, W) such that
(i) ∅ ∉ W, N ∈ W and (ii) S ∈ W, S ⊂ T ⇒ T ∈ W
(if everyone votes 'no' ('yes'), the proposition fails (wins); and any coalition containing a winning coalition is also a winning coalition) or, equivalently, by an ordered pair (N, v) where
(i) v(∅) = 0, v(N) = 1 and (ii) v is nondecreasing.

The geometry and analysis of winning coalitions in voting games, as conceptual


models of real life committee situations, provides insights into the decision pro-
cesses involved within a group behavior setting for accepting or rejecting a pro-
position. The theoretical framework invoked for such analysis is that of multi-
person cooperative games in which the games G are a special class. To formulate
notions of political power we view a measure of individual player's ability to
influence the result of a voting game G as a measure of such power. Two such
power indices have been advanced. To describe these we need the notions of a
pivot and a swing. For any permutation ordering π = (π(1), ..., π(n)) of the players N = {1, ..., n}, let $J_i(\pi) = \{j \in N : \pi(j) \text{ precedes } \pi(i)\}$ be the set of predecessors of i. The player i is a pivot in π if $J_i(\pi) \notin W$ but $J_i(\pi) \cup \{i\} \in W$; i.e., player i is a pivot if i's vote is decisive in the sense that, given the votes are cast sequentially in the order π, his vote turns a losing coalition into a winning one. A coalition S is a swing for i if $i \in S$, $S \in W$ but $S \setminus \{i\} \notin W$; i.e., if his vote is critical in turning a winning coalition into a losing one by changing his vote. Then we have the following two power indices for each player i ∈ N:

(Shapley-Shubik)
$$\Phi_i := P(i \text{ is pivotal when all permutations are equiprobable}) = \sum \frac{(s - 1)!(n - s)!}{n!}, \tag{2.2}$$
where s := |S| is the number of voters in S and the sum is over all S such that S is a swing for i.

(Banzhaf)
$$\beta_i := \text{proportion of swings for } i \text{ among all coalitions in which } i \text{ votes 'yes'} = \frac{\gamma_i}{2^{n-1}}, \tag{2.3}$$
where $\gamma_i$ is the number of swings for i (a normalized version divides instead by $\sum_{j \in N} \gamma_j$). The Banzhaf power index also has a probability interpretation that we shall see later (Section 2.4).
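Both indices are simple to compute by enumeration for small committees. The weighted majority game below is an assumed example of ours (weights 3, 2, 1, 1 and quota 4), not one discussed in the text.

```python
from itertools import chain, combinations, permutations
from math import factorial

weights = {1: 3, 2: 2, 3: 1, 4: 1}
quota = 4
players = sorted(weights)

def wins(S):
    # A coalition wins iff its total weight meets the quota.
    return sum(weights[i] for i in S) >= quota

def shapley_shubik(i):
    # (2.2): fraction of the n! voting orders in which player i is pivotal.
    count = 0
    for order in permutations(players):
        pred = set(order[:order.index(i)])
        if not wins(pred) and wins(pred | {i}):
            count += 1
    return count / factorial(len(players))

def banzhaf(i):
    # (2.3): gamma_i / 2^(n-1), the proportion of swings for i among the
    # 2^(n-1) coalitions containing i.
    others = [j for j in players if j != i]
    subsets = chain.from_iterable(combinations(others, r)
                                  for r in range(len(others) + 1))
    gamma = sum(wins(set(s) | {i}) and not wins(set(s)) for s in subsets)
    return gamma / 2 ** (len(players) - 1)

for i in players:
    print(i, round(shapley_shubik(i), 4), round(banzhaf(i), 4))
```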
If the indicator variable
$$x_i = \begin{cases} 1 & \text{if player } i \text{ votes 'yes'}, \\ 0 & \text{if player } i \text{ votes 'no'}, \end{cases} \tag{2.4}$$
denotes i's vote and $C_1(x) = \{i : x_i = 1\}$ is the coalition of assenting players for a realization $x = (x_1, \ldots, x_n)$ of the $2^n$ such voting configurations, then the outcome function $\psi : \{0, 1\}^n \to \{0, 1\}$ of the voting game is
$$\psi(x) = v(C_1(x)),$$
where v is as defined in (2.1) and tells us whether the proposition passes or fails in the committee. Note ψ models the decision structure in the committee given its rules, i.e., given the winning coalitions. In the stochastic version of a simple game, the voting configuration $X = (X_1, \ldots, X_n)$ is a random vector whose joint distribution determines the voting function
$$v := E\psi(X) = P\{\psi(X) = 1\},$$
the win probability of the proposition in the voting game. Sensitivity of v to the parameters of the distribution of X captures the effects of individual players' and their different possible coalitions' voting attitudes on the collective committee decision for a specified decision structure ψ.
When the players act independently with probabilities $p = (p_1, \ldots, p_n)$ of voting 'yes', the voting function is
$$v = h(p) \tag{2.5}$$
for some $h : [0, 1]^n \to [0, 1]$. The function h is called Owen's multilinear extension and satisfies (Owen, 1981):
$$h(p) = p_i h(1_i, p) + (1 - p_i) h(0_i, p), \qquad h_i(p) := \frac{\partial h}{\partial p_i} = h(1_i, p) - h(0_i, p), \tag{2.6}$$
since the outcome function can be seen to obey the decomposition


$$\psi(x) = x_i \psi(1_i, x) + (1 - x_i)\psi(0_i, x), \tag{2.7}$$
where $(\cdot_i, x)$ is the same as x except that $x_i$ is specified, and $h(\cdot_i, p) := E\psi(\cdot_i, X) = h(p_1, \ldots, p_{i-1}, \cdot, p_{i+1}, \ldots, p_n)$. These identities are reminiscent of well known results in reliability theory on the reliability function of coherent structures of independent components, a theme we return to in Section 2.2.
If, as a more realistic description of voting behavior, one wants to drop the assumption of independent players, the modeling choices become literally too wide to draw meaningful conclusions. The problem of assigning suitable joint distributions to the voting configuration $X = (X_1, \ldots, X_n)$ which would capture and mimic some of the essence of real life voting situations has been considered by Straffin (1978a) and others. Straffin assumes the players to be homogeneous in the sense that they have a common 'yes' voting probability p chosen randomly in [0, 1]. Thus, according to Straffin's homogeneity assumption, the players agree to collectively or through a third party select a random number p in the unit interval and then, given the choice of p, vote independently. The fact that p has a prior, in this case the uniform distribution, makes $(X_1, \ldots, X_n)$ mutually dependent with joint distribution
$$P(X_{\pi(1)} = \cdots = X_{\pi(k)} = 1,\ X_{\pi(k+1)} = \cdots = X_{\pi(n)} = 0) = \frac{k!(n - k)!}{(n + 1)!} \tag{2.8}$$
for any permutation (π(1), ..., π(n)) of the players. (2.8) is a description of homogeneity of the players which Straffin uses to formulate (i) a power index and (ii) an agreement index which is a measure of the extent to which a player's vote and the outcome function coincide. He also considers the relationship between these indices corresponding to the uniform prior and the prior f(p) = const·p(1 − p); results we will find more convenient to describe in a more general format in the next section.
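Formula (2.8) follows by integrating $p^k(1-p)^{n-k}$ against the uniform prior. A short check (ours) confirms that the pattern probabilities sum to one:

```python
from math import comb, factorial

def pattern_prob(n, k):
    # (2.8): probability that a fixed set of k players vote yes and the
    # remaining n - k vote no, under the uniform prior on p.
    return factorial(k) * factorial(n - k) / factorial(n + 1)

n = 5
print(sum(comb(n, k) * pattern_prob(n, k) for k in range(n + 1)))  # 1.0
```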

2.2. Implications of the reliability framework for voting games. From the above discussions, it is clear that voting games are conceptually equivalent to systems of components in reliability theory. Table 2 is a list of the dual interpretations of several theoretical concepts in the two contexts:

Table 2

Voting games                  Reliability structures

player                        component
committee                     system
winning (losing) coalition    path (cut)
blocking coalition            complement of a cut
outcome function              structure function
voting function               reliability function
multilinear extension         reliability function with independent components

Thus every voting game has an equivalent reliability network representation and
can consequently be analysed using methods of the latter. As an illustration
consider the following:

EXAMPLE. The simple game (N, W) with a five player committee N = {1, 2, 3, 4, 5} and winning coalitions W as the sets

(1,3,5), (2,3,5), (1,2,3,5), (1,3,4,5), (1,2,3,4,5),
(1,4,5), (2,4,5), (1,2,4,5), (2,3,4,5).

This voting game is equivalent to a coherent structure

      ┌─ 1 ─┐   ┌─ 3 ─┐
   ───┤     ├───┤     ├─── 5 ───
      └─ 2 ─┘   └─ 4 ─┘

of two parallel subsystems of two components each and a fifth component all in series. We see that to win in the corresponding voting game, a proposition must pass through each of two subcommittees with a '50% majority wins' voting rule and then also be passed by the chairperson (component 5). The voting function of this game when committee members vote 'yes' independently with a probability p (i.e., the version of Owen's multilinear extension in the i.i.d. case) is thus given by the reliability function
$$h(p) = p^3(2 - p)^2$$

of the above coherent structure. The minimal path sets of this structure are the smallest possible winning coalitions, which are the four 3-player coalitions in W. Since the minimal cut sets are (1, 2), (3, 4) and (5), their complements

(3,4,5), (1,2,5), (1,2,3,4)

are the minimal blocking coalitions, which are the smallest possible coalitions B with veto power in the sense that their complements N\B are not winning coalitions.
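The identity $h(p) = p^3(2 - p)^2$ can be confirmed by brute-force enumeration of all $2^5$ voting patterns; the sketch below (ours) does exactly that.

```python
from itertools import product

# The nine winning coalitions of the example, as sets.
W = [{1, 3, 5}, {2, 3, 5}, {1, 2, 3, 5}, {1, 3, 4, 5}, {1, 2, 3, 4, 5},
     {1, 4, 5}, {2, 4, 5}, {1, 2, 4, 5}, {2, 3, 4, 5}]

def psi(x):
    # Outcome function: 1 iff the assenting coalition contains a winning one.
    yes = {i for i in range(1, 6) if x[i - 1] == 1}
    return 1 if any(yes >= S for S in W) else 0

def h(p):
    # Voting function for i.i.d. players: expectation of psi over all patterns.
    return sum(psi(x) * p**sum(x) * (1 - p)**(5 - sum(x))
               for x in product([0, 1], repeat=5))

for p in (0.3, 0.5, 0.8):
    print(round(h(p), 6), round(p**3 * (2 - p)**2, 6))   # the columns agree
```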

To pursue the reliability analogy further, we proceed as follows. Although it is not the usual way, we may look at a voting game (N, W) as the social choice problem of Section 1 when there are only two alternatives A = {a, b}. Set a = fail the proposition, and b = pass the proposition. Player i's personal preference ordering $R_i$ is then defined by
$$a R_i b\ (\text{not } a R_i b) \iff i \text{ does not (does) prefer } b \text{ to } a \iff i \text{ votes no (yes)}.$$
If $x_i$ is i's 'vote' as in (2.4) and $y_i = y_i(a, b) = 1$ or 0 according as $a R_i b$ or not (as in Section 1) is the indicator of preference, then $y_i = 1 - x_i$, i ∈ N, and clearly ψ(x) = 0 (1) ⇔ proposition fails (passes) ⇔ φ(y) = φ(1 − x) = 1 (0), where φ is the social choice and ψ the outcome function. Hence
$$\psi(x) = 1 - \varphi(1 - x) = \varphi^d(x) = \varphi(x)$$
since φ is self-dual. Thus ψ = φ and hence ψ is also self-dual. The latter in particular implies the existence of a player who must be present in every winning coalition (viz. (1.7)).
With the choice set restricted to two alternatives, Arrow's condition (C1) is trivial, condition (C2) of irrelevant alternatives is vacuously true and so is the transitivity axiom (A1). Since ψ = φ, the condition (C1) says ψ(x) must be defined for all x, while axiom (A2) says ψ is binary. The condition of positive responsiveness (C3) holds ⇔ all supersets of winning coalitions are winning, built into the definition of a voting game. Lemma 1 thus implies:

LEMMA 2. The outcome function ψ of a voting game is a monotone structure function. ψ is a coherent structure iff there are no 'dummies'.

The first part of the above result is due to Ramamurthy and Parthasarathy (1984). The social choice function analogy of the outcome function and its coherence in the absence of dummies is new.
A dummy player is one whose exclusion from a winning coalition does not destroy the winning property of the reduced coalition, i.e.,
$$i \in N \text{ is a dummy} \iff (i \in S,\ S \in W \Rightarrow S \setminus \{i\} \in W).$$
Equivalently, i is not a dummy iff there is a swing S for i. The coherence conclusion in Lemma 2 holds since in a voting game the 'no dummy hypothesis' says all components are relevant in the equivalent reliability network, viz. for any i ∈ N,
i is relevant ⇔ there exists $x^0$ such that $\psi(1_i, x^0) - \psi(0_i, x^0) \neq 0$
⇔ $S_0 := \{j \in N : j \neq i,\ x^0_j = 1\} \cup \{i\}$ is a swing for i
⇔ player i is not a dummy.
An equivalent characterization of a dummy i ∈ N is that i belongs to no minimal winning coalition. On the other hand, in the social choice scenario of Section 1, a player i ∈ N is a dictator if {i} is a winning as well as a blocking coalition.
When the players act independently in a stochastic voting game, we recognize the identities (2.6), (2.7) on the outcome function and Owen's multilinear extension as reproducing standard decomposition results in coherent structure theory, as they must. The voting function h(p), being a monotone (coherent) structure's reliability function, must be coordinatewise monotone: p ≤ p' ⇒ h(p) ≤ h(p'), which has been independently recognized in the voting game context (Owen, 1982). The Banzhaf power index (2.3) is none other than the structural importance of components in ψ. Since research in voting games and reliability structures has evolved largely independently, this general lack of recognition of their dualism has been the source of some unnecessary duplication of effort. Every result in either theory has a dual interpretation in the other, although they may not be equally meaningful in both contexts. The following are some further well known reliability ideas in the context of independent or i.i.d. components which have appropriate and interesting implications for voting games. With the exception of 2.2.1 below, we believe the impact of these ideas has not yet been recognized in the literature on voting games with independent or i.i.d. players.

2.2.1. The reliability importance
$$v_i = E\{\psi(1_i, X) - \psi(0_i, X)\} \tag{2.9}$$
measures how crucial i's vote is in a game with outcome function ψ and random voting probabilities. As an index of i's voting power, $v_i$ is defined for any stochastic voting configuration X and has been used by Straffin within the homogeneity framework ($(X_1, \ldots, X_n)$ conditionally i.i.d. given p). We may call $v_i$ the voting importance of i. If the players are independent, then
$$v_i = h_i(p)$$
in the notation of Section 2.1 (viz. (2.6)). Thus, e.g., in the stochastic unanimity game where all players must vote yes to pass a proposition, the player least likely to vote in favor has the most voting importance. Similarly, in other committee decision structures, one can use $v_i$ to rank the players in order of their voting importance. For a game with i.i.d. players, i's voting importance becomes the function $v_i = h_i(p)$, where $h_i(p) = h(1_i, p) - h(0_i, p)$ and $h(\cdot_i, p)$, $h(p)$ denote the corresponding versions of the multivariate $h(\cdot_i, p)$, $h(p)$ when $p = (p, \ldots, p)$. Since in this case $h'(p) = \sum_{i \in N} h_i(p)$, one can also use the proportional voting importance
$$v_i^* = \frac{v_i}{\sum_{j \in N} v_j} = \frac{h_i(p)}{h'(p)}$$
as a normalized power index in the i.i.d. case.
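For instance, in the unanimity game $h(p) = p_1 \cdots p_n$, so $v_i$ is the product of the other players' probabilities; a short sketch (ours, with arbitrary probabilities) makes the ranking visible:

```python
from math import prod

def unanimity_importance(p):
    # v_i = h(1_i, p) - h(0_i, p) = product of p_j over j != i.
    return [prod(p) / pi for pi in p]

print(unanimity_importance([0.9, 0.6, 0.3]))  # least likely voter ranks highest
```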

2.2.2. The fault-tree-analysis algorithm of reliability theory will systematically enumerate the smallest cut sets, and hence the minimal blocking coalitions, of a voting game through its reliability network representation. The dual event-tree algorithm will similarly produce all minimal winning coalitions, the Banzhaf power indices and the voting importances.

2.2.3. S-shapedness of the voting function for i.i.d. players with no dummies. This follows from the Moore-Shannon inequality (Barlow and Proschan, 1965)
$$p(1 - p)\,\frac{dh}{dp} \geq h(p)(1 - h(p))$$
for the reliability function of a coherent structure with i.i.d. components. The implications of this fact in the voting game context are probably not well known. In particular, the S-shapedness of the voting function implies that among all committees of a given size n, the k-out-of-n structures (100k/n% majority voting games) have the sharpest rate of increase of the probability of a committee of n i.i.d. players passing a bill as the players' common yes-voting probability increases.
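The inequality is easy to verify numerically for a k-out-of-n structure; the grid check below is our own sketch using a central-difference derivative.

```python
from math import comb

def h_koutofn(p, n, k):
    # Reliability function of a k-out-of-n structure with i.i.d. components.
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

n, k, eps = 5, 3, 1e-6
for p in [0.1 * j for j in range(1, 10)]:
    dh = (h_koutofn(p + eps, n, k) - h_koutofn(p - eps, n, k)) / (2 * eps)
    assert p * (1 - p) * dh >= h_koutofn(p, n, k) * (1 - h_koutofn(p, n, k)) - 1e-8
print("Moore-Shannon inequality holds on the grid")
```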

2.2.4. Component duplication is more effective than system duplication. This property of a structure function implies: replicating committees is less effective, in the sense of resulting in a smaller outcome/voting function, than replicating committee members by subcommittees (modules) which mimic the original committee structure ψ. This may be useful in the context of designing representative bodies when such choices are available.

2.2.5. Composition of coherent structures. Suppose a voting game (N, W) has no dummies and is not a unanimity game (series structure) or its dual (any single yes vote is enough: parallel structure). Suppose each player in this committee N with structure ψ is replaced by a subcommittee whose structure replicates the original committee, and this process is repeated k times, k = 1, 2, .... With i.i.d. players, the voting function $h_k(p)$ of the resulting expanded committee is then the reliability function of the k-fold composition of the coherent structure ψ, which has the property
$$h_k(p) \downarrow 0,\ = p_0,\ \uparrow 1 \quad \text{according as} \quad p <,\ =,\ > p_0$$
as $k \uparrow \infty$ (Barlow and Proschan, 1965), where $p_0$ is the unique value satisfying $h(p_0) = p_0$, guaranteed by S-shapedness. When we interpret the above
for voting games, the first conclusion is perhaps not surprising, although the role
of the critical value $p_0$ is not fully intuitive. The other two run counter to crude
intuition; particularly the last one which says that by expanding the original
committee through enough repeated compositions, one can almost ensure winning
any proposition which is sufficiently attractive individually. The dictum 'too many
cooks spoil the broth' does not apply here.
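The trichotomy is visible after only a few compositions. The sketch below (ours) uses a 2-out-of-3 majority committee, for which $h(p) = 3p^2 - 2p^3$ and $p_0 = 1/2$:

```python
def h(p):
    # Voting function of a 2-out-of-3 majority committee; fixed point p0 = 1/2.
    return 3 * p**2 - 2 * p**3

for p in (0.45, 0.5, 0.55):
    x = p
    for _ in range(20):              # h_k for k = 20 compositions
        x = h(x)
    print(p, round(x, 6))            # -> ~0, 0.5, ~1 according as p <, =, > 1/2
```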
2.2.6. Compound voting games and modular decomposition. If $(N_j, W_j)$, j = 1,
2, ..., k, are simple games with pairwise disjoint player sets and (M, V) is a
simple game with $|M| = k$ players, the compound voting game (N, W) is defined
as the game with $N = \bigcup_{j=1}^{k} N_j$ and

$$W = \{S \subseteq N : \{j \in M : S \cap N_j \in W_j\} \in V\}.$$

(M, V) is called the master-game and $(N_j, W_j)$ the modules of the compound game
(N, W). The combinatorial aspects of compound voting games have been extensively
studied. Considering the equivalent reliability networks it is clear, however,
that if the component games $(N_j, W_j)$ have structures $\psi_j$, j = 1, ..., k, and the
master game (M, V) has structure $\varphi$, then the compound voting game (N, W) has
structure

$$\psi = \varphi(\psi_1, \ldots, \psi_k).$$

Conversely the existence of some $\varphi, \psi_1, \ldots, \psi_k$ satisfying this representation for
a given $\psi$ can be taken as an equivalent definition of the corresponding master
game, component subgames and the accompanying player sets as the modular
sets of the original voting game. E.g., in the 5-player example at the beginning of
this section, clearly both subcommittees $J_1 = \{1, 2\}$, $J_2 = \{3, 4\}$ are modular sets
and the corresponding parallel subsystems are the subgame modules. Ramamurthy
and Parthasarathy (1983) have recently exploited the results on modular
decomposition of coherent systems to investigate voting games in relation to their
component subgames (modules) and to decompose a compound voting game into
its modular factors (player sets obtained by intersecting maximal modular sets or
their complements with each other). Modular factors decompose a voting game
into its largest disjoint modules. The following is typical of the results which can
be derived via coherent structure arguments (Ramamurthy and Parthasarathy,
1983).

THREE MODULES THEOREM. Let $J_i$, i = 1, 2, 3, be coalitions in a voting game
(N, W) with structure $\psi$ such that $J_1 \cup J_2$ and $J_2 \cup J_3$ are both modular. Then each
$J_i$ is modular, i = 1, 2, 3, and $\bigcup_{i=1}^{3} J_i$ is either itself modular or the full committee
N. The modules $(J_i, \psi_i)$, i = 1, 2, 3, which appear in $(N, \psi)$ are either in series or in
parallel, i.e., the three-player master game is either a unanimity game, or a trivial
game where the only blocking coalition is the full committee.
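To make the representation $\psi = \varphi(\psi_1, \ldots, \psi_k)$ concrete, here is a small Python sketch, our own hypothetical compound game with two parallel modules $\{1, 2\}$ and $\{3, 4\}$ joined in series (mirroring the modular sets mentioned above), that enumerates its minimal winning coalitions:

from itertools import product

parallel = lambda votes: max(votes)   # any single yes wins the module
series   = lambda votes: min(votes)   # unanimity: all inputs must be yes

def compound(x):                      # psi = series(parallel(x1,x2), parallel(x3,x4))
    return series((parallel(x[0:2]), parallel(x[2:4])))

for x in product((0, 1), repeat=4):
    if compound(x) == 1:
        members = [j for j, vote in enumerate(x, start=1) if vote]
        drop = lambda m: tuple(0 if j == m else vote
                               for j, vote in enumerate(x, start=1))
        if all(compound(drop(m)) == 0 for m in members):
            print("minimal winning coalition:", members)

Running it prints the four coalitions {1,3}, {1,4}, {2,3}, {2,4}, i.e., one member from each parallel module, exactly the cut-set structure the reliability network predicts.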

2.3. The usual approach in modeling coherent structures of dependent components
is to assume the components are associated (Barlow and Proschan, 1975).
By contrast, the prevalent theoretical approach in voting games, as suggested by
Straffin (1978) when the players are not independent, assumes a special form of
dependence according to (2.8). One can show that (2.8) implies $X_1, \ldots, X_n$ are
associated. Thus voting game results under Straffin's model and its generalized
version suggest an approach for modeling dependent coherent structures. These
results are necessarily stronger than those that can be derived under the associatedness
hypothesis alone.
The remarkable insight behind Straffin's homogeneity assumption is that it
amounts to the voting configuration X being a finite segment of a special sequence
of exchangeable variables. The effect of this assumption is that the probability of
any voting pattern $x = (x_1, \ldots, x_n)$ depends only on the size of the assenting and
dissenting coalitions and not on the identity of the players, as witness (2.8). One
can reproduce this homogeneity of players through an assumption more general
than Straffin's. Ramamurthy and Parthasarathy (1984) exploit appropriate reliability
ideas to generalize many results of Straffin and others, by considering the
following weakening of Straffin's assumption.

GENERAL HOMOGENEITY HYPOTHESIS. The random voting configuration
$X = (X_1, \ldots, X_n)$ is a finite segment of an infinite exchangeable sequence.

Since $X_1, X_2, \ldots$ are binary, by de Finetti's well-known theorem the voting
configuration's joint distribution has a representation

$$P(X_{\pi(1)} = \cdots = X_{\pi(k)} = 1,\; X_{\pi(k+1)} = \cdots = X_{\pi(n)} = 0) = \int_0^1 p^k (1 - p)^{n-k} \, dF(p) \qquad (2.10)$$

for some prior distribution F on [0, 1]; and the votes $X_1, \ldots, X_n$ are conditionally
independent given the 'yes' voting probability p. Straffin's homogeneity assumption
corresponds to a uniform prior for p, leading to (2.8). For a stochastic
voting game defined by its outcome (structure) function $\psi$, consider the power
index

$$v_i := E\{\psi(1_i, X) - \psi(0_i, X)\},$$

defined in (2.9), and the agreement indices

$$A_i := P\{X_i = \psi(X)\}, \qquad \rho_i := \operatorname{cov}(X_i, \psi(X)), \qquad \sigma_i := \int_0^1 \operatorname{cov}(X_i, \psi(X) \mid p) \, dF(p).$$

Also, let

$$b := \operatorname{cov}(P, h(P)).$$

Here P is the randomized probability of voting 'yes' with prior F in (2.10). Note
$b$, $\sigma_i$ are defined only under the general homogeneity assumption, while $v_i$, $A_i$ and
$\rho_i$ are well defined for every joint distribution of the voting configuration X. Recall
that a power index measures the extent of change in the voting game's outcome
as a consequence of a player's switching his vote and an agreement index
measures the extent of coincidence of a player's vote and the final outcome. Thus
any measure of mutual dependence between two variables reflecting the voting
attitudes of a player and the whole committee respectively qualifies as an
agreement index. An analysis of the interrelationships of these indices provides an
insight into the interactions between players' individual level of command over the
game and the extent to which they are in tume with the committee decision and
ride the decisive bandwagon.
The agreement index $A_i$ is due to Rae (1979). Under (2.8), $v_i$ becomes Straffin's
power index and $\sigma_i$ is proportional to an agreement index also considered by
Straffin. Note all the coefficients are non-negative. This is clear for $v_i$ and $A_i$, and
follows for $\rho_i$, $\sigma_i$ and $b$ from standard facts for associated r.v.s (Barlow and
Proschan, 1975), associatedness being weaker than the general homogeneity (GH) hypothesis.
The interesting results under the assumption of general homogeneity (Ramamurthy
and Parthasarathy, 1984) are

$$\rho_i = \sigma_i + b,$$

$$2b \le \sum_{i \in N} \sigma_i, \qquad \sum_{i \in N} \sigma_i \ge \int_0^1 h(p)\{1 - h(p)\} \, dF(p),$$

$$E X_i = \tfrac{1}{2} \;\Rightarrow\; A_i = 2\sigma_i + 2b + \tfrac{1}{2}. \qquad (2.11)$$

The equality in the second assertion holds only under Straffin's homogeneity (SH)
assumption. This assertion follows by noting $\sigma_i = \int_0^1 p(1-p) h_i(p) \, dF(p)$ under
GH, $h'(p) = \sum_i h_i(p)$, termwise integration by parts in $\sum_i \sigma_i$ with uniform prior to
conclude the equality, and invoking the S-shapedness of $h(p)$ for the bound.
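As a concrete check of (2.11), here is our own sketch, not part of the original development: the 2-out-of-3 majority game, the Beta prior, and the grid integration are all assumptions of the example. It enumerates the exchangeable joint law and verifies $\rho_i = \sigma_i + b$ and $A_i = 2\sigma_i + 2b + \tfrac12$ under the uniform prior.

import numpy as np
from itertools import product
from math import gamma

B = lambda a, b: gamma(a) * gamma(b) / gamma(a + b)   # Euler beta function
psi = lambda x: 1 if sum(x) >= 2 else 0               # 2-out-of-3 majority game

a, b, n = 1.0, 1.0, 3                 # Beta(1,1) = uniform prior = Straffin's SH
# Exchangeable joint law (2.10)/(2.15): P(x) depends on x only through sum(x)
P = {x: B(a + sum(x), b + n - sum(x)) / B(a, b)
     for x in product((0, 1), repeat=n)}
E = lambda f: sum(p * f(x) for x, p in P.items())

A_i   = E(lambda x: float(x[0] == psi(x)))
rho_i = E(lambda x: x[0] * psi(x)) - E(lambda x: x[0]) * E(psi)

# sigma_i and b need the prior itself; integrate on a midpoint grid over (0,1)
N  = 100000
ps = (np.arange(N) + 0.5) / N
w  = ps**(a - 1) * (1 - ps)**(b - 1) / (B(a, b) * N)  # prior weights
h   = 3 * ps**2 - 2 * ps**3          # voting function h(p) of the game
h_i = 2 * ps * (1 - ps)              # h_i(p) = h(1_i, p) - h(0_i, p)
m = lambda f: float(np.sum(f * w))
sigma_i = m(ps * (1 - ps) * h_i)     # int p(1-p) h_i(p) dF(p)
b_cov   = m(ps * h) - m(ps) * m(h)   # b = cov(P, h(P))

print(rho_i, sigma_i + b_cov)              # rho_i = sigma_i + b
print(A_i, 2 * sigma_i + 2 * b_cov + 0.5)  # A_i = 2 sigma_i + 2b + 1/2

Replacing (a, b) by other values leaves the first identity intact (it holds under GH), while the second requires the equal-odds condition $E X_i = \tfrac12$, i.e., a = b.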
The above relations in particular imply:
(i) Under GH, i is a dummy $\Leftrightarrow \sigma_i = 0$. If the odds of each player voting yes and
no are equal under GH, i.e., if the marginal probability $P(X_i = 1) = \tfrac12$, then we
also have: i dummy $\Leftrightarrow \rho_i = b \Leftrightarrow A_i = 2b + \tfrac12$. Thus, since $b$ is in a sense the
minimal affinity between a player's vote and the committee's decision, Straffin
suggests using $2\sigma_i$ ($= A_i - 2b - \tfrac12$) as an agreement index.
(ii) Let $w$, $l = 2^n - w$ be the numbers of winning and losing coalitions. Since
$h_i(\tfrac12) = \beta_i$ (structural importance = Banzhaf power index) and $h(\tfrac12) = w/2^n$, taking
F as a point-mass at $\tfrac12$, (2.11) gives

$$\sum_{i \in N} \beta_i \ge 2^{-2(n-1)} \, wl.$$

Without the equal odds condition, the last relation in (2.11) has a more general
version that we may easily develop. Let $\pi_i := \int_0^1 p \, dF(p) = E X_i$ be the marginal
probability of i voting yes under general homogeneity. Then

$$A_i = \sum_{j=0}^{1} P(X_i = \psi(X) = j) = E X_i \psi(1_i, X) + E\{(1 - X_i)(1 - \psi(0_i, X))\}$$
$$= E X_i \psi(X) + E(1 - X_i)(1 - \psi(X))$$
$$= 2 \operatorname{cov}(X_i, \psi(X)) + E\psi(X)\{2 E X_i - 1\} + 1 - E X_i$$
$$= 2\rho_i + v(2\pi_i - 1) + (1 - \pi_i)$$
$$= 2\rho_i + \{\pi_i v + (1 - \pi_i)(1 - v)\},$$

where $v := E\psi(X)$ denotes the value of the game; this reduces to the stated
relationship whenever $\pi_i = \tfrac12$. Notice
that the convex combination term in braces, which measures the marginal contribution
to $A_i$ of a player's voting probability $\pi_i$, depends on the game's value $v$ via
an interaction term unless $\pi_i = \tfrac12$.

2.4. Influence indices and stochastic compound voting games. There are some
interesting relationships among members of a class of voting games via their
power and agreement indices. In the spirit of (2.10), consider a compound voting
game consisting of the two game modules:

(i) a voting game $G = (N, W)$ with $N = \{1, \ldots, n\}$, and

(ii) a simple majority voting game $G_m = (N_m, W_m)$ of $(2m + 1)$ players with

$$N_m = \{n+1, \ldots, n+2m, n+2m+1\}, \qquad W_m = \{S \subseteq N_m : |S| \ge m+1\}, \qquad (2.12)$$

i.e., any majority (at least $(m + 1)$ players) coalition wins. Replacing the player
$(n + 2m + 1)$ in the majority game by the game $G = (N, W)$, define the
compound game $G^* = (N^*, W^*)$, where

$$N^* = N \cup \{n+1, \ldots, n+2m\} = \{1, \ldots, n, n+1, \ldots, n+2m\},$$
$$W^* = \{S \subseteq N^* : \text{either } |S \setminus N| \ge m+1, \text{ or/and } |S \setminus N| \ge m, \; S \cap N \in W\}. \qquad (2.13)$$

$G^*$ models the situation where the player $(n + 2m + 1)$ in the majority game $G_m$
is bound by the wishes of a constituency N, as determined by the outcome of the
constituency voting game $G = (N, W)$, which he represents in the committee $N_m$.
The winning coalitions in the composite game $G^*$ are those which either have
enough members to win the majority game $G_m$, or are at most a single vote short
of winning $G_m$ when the player representing the constituency N is not
counted but contain a winning coalition for the constituency game $G = (N, W)$.
The winning coalitions in the latter category are precisely those S such that
(i) $|S \setminus N| = m$, i.e., for any $i \notin S \setminus N$, $\{i\} \cup (S \setminus N)$ is a swing for every such player
i in the majority game $G_m$, and (ii) using appropriate players in S also wins the
constituency voting game G. With an i.i.d. voting configuration, if $h_i(p)$ and $h_i^*(p)$
respectively denote the voting importance of $i \in N$ in G and $G^*$, then clearly

$$h_i^*(p) = \binom{2m}{m} p^m (1-p)^m \, h_i(p), \qquad i \in N. \qquad (2.14)$$

Under general homogeneity, the class of priors

$$F_{a,b}(p) = \frac{(a+b-1)!}{(a-1)!\,(b-1)!} \int_0^p u^{a-1} (1-u)^{b-1} \, du, \qquad a > 0, \; b > 0,$$

which leads to the voting configuration distribution

$$P(X_1 = \cdots = X_k = 1, \; X_{k+1} = \cdots = X_n = 0) = \frac{a^{(k)} \, b^{(n-k)}}{(a+b)^{(n)}}, \qquad (2.15)$$

can reflect different degrees of mutual dependence (tendency of alignments and
formation of voting blocks) of players for different choices of a, b. Player i's vote
$X_i$ in the model (2.15) is described by the result of the i-th drawing in the well-known
Polya-urn model which starts with a white and b black balls and adds a
ball of the same color as the one drawn in successive random drawings. For any
voting game G with a Polya-urn prior $F_{a,b}$, denote the associated influence indices
of power/agreement by writing $v_i = v_i(G: a, b)$, etc. Notice that
Straffin's original homogeneity assumption corresponds to the prior $F_{1,1}$. Using
$v_i(G: a, b) = \int_0^1 h_i(p) \, dF_{a,b}(p)$ and (2.14), Ramamurthy and Parthasarathy (1984)
have shown:

$$v_i(G: 1, 1) = \phi_i,$$

$$\sigma_i(G: a, b) = \frac{ab}{(a+b)(a+b+1)} \, v_i(G: a+1, b+1),$$

and, in the framework of the compound voting game $G^*$ in (2.13),

$$v_i(G: m+1, m+1) = (2m+1) \, v_i(G^*: 1, 1), \qquad i \in N, \qquad (2.16)$$

extending the corresponding results of Straffin (1978), which can be recovered
from the above by setting a = b = m = 1. The second assertion above shows that
the apparently distinct influence notions of 'agreement' and 'power' are not
unrelated, and one can capture either one from the other by modifying the degree
of dependence among the voters as modeled by (a, b) to (a + 1, b + 1) or
(a - 1, b - 1) as may be appropriate. The first assertion states the equivalence of
the Shapley-Shubik index with voting importance under uniform prior (Straffin's
power index), while the third assertion shows a relationship between voting impor-
tances in the compound game in (2.13) and the corresponding constituency game
under appropriate choice of voter-dependence in the two games.
Notice $v_i(G: m+1, m+1) \to \beta_i$, the Banzhaf power index in the constituency
game, since the case of players voting yes or no independently with equal odds
($p = \tfrac12$) can be obtained by letting $m \to \infty$ in the prior $F_{m+1, m+1}$. Hence by
(2.16), in the composite game $G^*$ with $(2m + 1)$ players,

$$(2m+1) \, v_i(G^*: 1, 1) \to \beta_i \quad \text{as } m \to \infty, \quad i \in N,$$

i.e., Straffin's power index in the compound game $G^*$ multiplied by the number
of players approaches the Banzhaf power index (structural importance) in the
constituency game $G = (N, W)$.
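A similar numeric check (again our sketch, under the same hypothetical 2-out-of-3 majority game and midpoint-grid integration) of the second assertion, recovering 'agreement' $\sigma_i(G: a, b)$ from 'power' $v_i(G: a+1, b+1)$:

import numpy as np
from math import gamma

B = lambda a, b: gamma(a) * gamma(b) / gamma(a + b)
N  = 200000
ps = (np.arange(N) + 0.5) / N                     # midpoint grid on (0, 1)
h_i = 2 * ps * (1 - ps)                           # h_i(p), 2-out-of-3 majority game

def dens(a, b):                                   # weights of the prior F_{a,b}
    return ps**(a - 1) * (1 - ps)**(b - 1) / (B(a, b) * N)

v_i     = lambda a, b: float(np.sum(h_i * dens(a, b)))                  # power
sigma_i = lambda a, b: float(np.sum(ps * (1 - ps) * h_i * dens(a, b)))  # agreement

a, b = 2.0, 3.0
print(sigma_i(a, b))
print(a * b / ((a + b) * (a + b + 1)) * v_i(a + 1, b + 1))   # same value

The agreement holds exactly here because $p(1-p)\,dF_{a,b}(p)$ is, up to the constant $ab/\{(a+b)(a+b+1)\}$, the $F_{a+1,b+1}$ prior itself.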
The priors $F_{a,b}$, under the general homogeneity hypothesis, reflect progressively
less and less voter interdependence with increasing (a, b); in this sense Straffin's
homogeneity, with a = b = 1, the minimal values for a Polya-urn, models the
maximum possible such dependence. To emphasize the conceptual
difference as well as similarity of the Shapley-Shubik and Banzhaf indices of
power, we may note that they are the two extreme cases of the voting importance
$v_i$ (viz. (2.9)) corresponding to a = b = 1 and the limiting case $a = b \to \infty$.
It is interesting to contrast the probability interpretations of the Shapley-Shubik
and Banzhaf power indices. A player $i \in N$ is crucial if, given the others' votes, his
voting makes the difference between winning or losing the proposition in the
committee. While the Shapley-Shubik index $\phi_i$ in (2.2) is the probability that $i \in N$
is crucial under Straffin's homogeneity (players' votes are conditionally i.i.d. given
p), the Banzhaf index $\beta_i$ in (2.3) is the probability that i is crucial when the players
choose 'yes'-voting probabilities $p_i$, $i \in N$, independently and the $p_i$, $i \in N$, are
uniformly distributed. The probability of individual-group agreement under this
independence assumption is

$$\beta_i \cdot (1) + (1 - \beta_i) \cdot (\tfrac12) = \tfrac12 (1 + \beta_i).$$

The right hand side can be used as an agreement index. These results are due to
Straffin (1978).

2.5. While we have argued that several voting game concepts and results are
variants of system reliability ideas in a different guise, others, in particular the
general homogeneity assumption and its implications, may contain important
lessons for reliability theory. For example, in systems in which the status of some
or all components may not be directly observable except via perfect or highly
reliable monitors, such as hazardous components in a nuclear installation, the
agreement indices can serve as alternative or surrogate indices of reliability
importance of inaccessible components. The general homogeneity assumption in
system reliability would amount to considering coherent structures of exchangeable
components, a strengthening of the concept of associatedness as a measure
of component dependence; an approach which we believe has not been fully
exploited and which should lead to more refined results than under associatedness
of components alone.

3. 'Inequality' of distribution of wealth

3.1. One of the chief concerns of development economists is the measurement
of inequality of income or other economic variables distributed over a population
that reflects the degree of disparity in ownership of wealth among its members.
The usual tool kit used by economists to measure such inequality of distribution
is the well-known Lorenz curve and the Gini index for the relevant distribution
of income or other similar variables, traditionally assumed to follow a log-normal
distribution for which there is substantial empirical evidence and some theoretical
arguments. Some studies however have questioned the universality of the log-normal
assumption; see, e.g., Salem and Mount (1974), MacDonald and Ransom
(1979). Mukherjee (1967) has considered some stochastic models leading to
gamma distributions for the distribution of wealth variables such as landholding.
Bhattacharjee and Krishnaji (1985) have considered a model for the landholding
process across generations, allowing for acquisition and disposal of land in each
generation and where ownership is inherited, to argue that the equilibrium distribution
of landholding, when it exists, must be NWU ('new worse than used') in the
sense of reliability theory, i.e., the excess residual holding $X - t \mid X > t$ over any
threshold t stochastically dominates the original landholding variable X in the
population. The NWU property is a fairly picturesque description of the relative
abundance of 'rich' landowners (those holding X > t) compared to the total population
of landowners across the entire size scale.
In practice, even stronger evidence of disparity has been found. In an attempt
to empirically model the distribution of landholdings in India, it has been found
(Bhattacharjee and Krishnaji, 1985) that either the log-gamma and/or the DFR
gamma laws provide a better approximation to the landholding data for each state

Table 3
Landholding in the State of W. Bengal, India (1961-1962) and model estimates

Landholding size (acres)    NSS    Log-normal    DFR gamma    Log-gamma on (1, ∞)
0-1                         1896   2285          1832         -
1-5                         1716   1350          1745         1794
5-10                         482    333           515          422
10-20                        164    189           165          132
>20                           39    138            40           52
in India based on National Sample Survey (NSS) figures. Table 3 is typical of
the relatively better approximations provided by the gamma and the log-gamma
on (1, ∞) relative to the log-normal. While the log-gamma is known to have an
eventually decreasing failure rate, the estimated shape parameters of the gammas
were all less than one and typically around ½ for every state, and hence all had
decreasing failure rates.
For landholdings, the NWU argument and the empirical DFR evidence above
(everywhere with gammas, or in the long range as with the log-gamma) are
suggestive of the possibility of exploiting reliability ideas. If $X \ge 0$ is the amount
of wealth, such as land, owned with distribution F, it is then natural to invoke
appropriate life-distribution concepts for the holding distribution F in an
attempt to model the degree of inequality present in the pattern of ownership of
wealth. The residual holding $X - t \mid X > t$ in excess of t with distribution
$F_t(x) = 1 - \{\bar F(t+x)/\bar F(t)\}$ and the mean residual holding

$$g(t) := E(X - t \mid X > t)$$

correspond respectively to the notions of the residual life and the mean residual
life in reliability theory. In particular the extent of wealth which the 'rich' command
is described by the behavior of g(t) for large values of t. More generally,
the nature of $F_t$ and the excess average holding g(t) over an affluence threshold
t, as a function of the threshold, provides a more detailed description of the pattern
of ownership across different levels of affluence in the population.
Using the above interpretations of $F_t$ and g(t), the notion of skew and heavy-tailed
distributions of wealth as being symptomatic of the social disparity of
ownership can be captured in fairly picturesque ways, with varying degrees of
strength, by the different anti-aging classes (DFR, IMRL, NWU, NWUE) of 'life
distributions' well known in reliability theory. For example, a holding distribution
F is DFR (decreasing failure rate: $F_t$ stochastically increasing in t) if the proportion
of the progressively 'rich' with residual holding in excess of any given
amount increases with the level of affluence. The other weaker anti-aging hypotheses:
IMRL (increasing mean residual life: $g(t) \uparrow$), NWU (new worse than used:
$F_t \ge_{\mathrm{st}} F$, all t) and NWUE (new worse than used in expectation: $g(t) \ge g(0+)$) can
be similarly interpreted as weaker descriptions of disparity.
Motivated by these considerations, Bhattacharjee and Krishnaji (1985) have
suggested using

$$I_1 = g^*/\mu, \quad \text{where } g^* = \lim_{t \to \infty} g(t), \; \mu = g(0+),$$

$$I_2 = \lim_{t \to \infty} E\!\left(\frac{X}{t} \,\Big|\, X > t\right) = 1 + \lim_{t \to \infty} \frac{g(t)}{t}, \qquad (3.1)$$

when they exist, as indices of inequality in the distribution of wealth. They also
consider a related measure $I_0 = g^* - \mu = \mu(I_1 - 1)$ which is a variant of $I_1$, but
is not dimension-free as $I_1$, $I_2$ are. The assumption that the limits in (3.1) exist
is usually not a real limitation in practice. In particular the existence of $g^* \le \infty$
is free under IMRL and DFR assumptions, with $g^*$ finite for reasonably nice
subfamilies such as the DFR gammas. More generally, the holding distributions
for which $g^* \le \infty$ ($g^* < \infty$ respectively) exists is the family of 'age-smooth' life
distributions, which are those F for which the residual-life hazard function
$-\ln \bar F_t(x)$ converges on $[0, \infty]$ ($(0, \infty]$ respectively) for each x as $t \to \infty$
(Bhattacharjee, 1986).
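A numeric sketch (our own, not from the cited papers) of the indices in (3.1) for a DFR gamma holding distribution with shape c = ½, the value reported as typical above, and unit scale; it assumes SciPy's gamma survival function and computes $g(t) = \int_t^\infty \bar F(u)\,du / \bar F(t)$ on a grid.

import numpy as np
from scipy.stats import gamma

c = 0.5
u = np.linspace(0.0, 200.0, 400001)
du = u[1] - u[0]
sf = gamma.sf(u, c)                          # survival function Fbar(u)
tail = np.cumsum(sf[::-1])[::-1] * du        # int_t^inf Fbar(u) du, on the grid

g = lambda t: tail[int(round(t / du))] / sf[int(round(t / du))]

mu = c                                       # mean of gamma(c), unit scale
print(g(0.0), mu)                            # g(0+) = mu = 0.5
print(g(50.0) / mu)                          # ~ I1 = g*/mu = 1/c = 2 (g* -> 1)
print(1 + g(50.0) / 50.0)                    # ~ I2 = 1, since g(t) = o(t) here

A Pareto holding distribution would instead give $1 < I_2 < \infty$, in line with case (b) of (iv) above.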
$I_1$ and $I_2$ are indicators of aggregate inequality of the distribution of wealth in
two different senses. $I_1$ measures the relative preponderance of the wealth of the
super-rich, while $I_2$ indicates in a sense how rich they are. The traditional index
of aggregate inequality, on the other hand, as measured by the classical Gini index
(Lorenz measure) G, can be expressed as

$$G = P(Y > X) - P(Y \le X) = 1 - 2 \int_0^\infty F_1(x) \, dF(x), \qquad (3.2)$$

where X is the amount of wealth with holding distribution F and Y has the so-called
'share distribution'

$$F_1(x) := \mu^{-1} \int_0^x t \, dF(t),$$

the share of the population below x. A somewhat pleasantly surprising but not
fully understood feature of the three indices $I_1$, $I_2$ and G is that they turn out to
be monotone increasing in the coefficient of variation for many holding distributions
F. Such is the case with G under log-normal, $I_1$ under gamma and $I_2$ under
log-gamma (Bhattacharjee and Krishnaji, 1985). Note also that whenever the
holding distribution is anti-aging in the DFR, IMRL, NWU or NWUE sense, the
coefficient of variation (c.v.) is at least one (Barlow and Proschan, 1975); a
skewness feature aptly descriptive of the disproportionate share of the rich.
Recently the author has considered other inequality indices which share this
monotonicity in c.v. under weak anti-aging hypotheses and has re-examined the
appropriateness of $I_1$, $I_2$ and measures of aggregate inequality to show
(Bhattacharjee, 1986a):

(i) The non-trivial case $1 < I_2 < \infty$ implies $I_1 = \infty$ necessarily, and then

$$I_2 = (1 + \eta^2) \lim_{t \to \infty} \frac{I_{F_1}(t)}{I_1(t)} \qquad (3.3)$$

where $\eta$ is the coefficient of variation of the holding distribution F,
$I_1(t) = g(t)/\mu = \int_t^\infty \bar F(u) \, du \big/ \{\mu \bar F(t)\} \uparrow I_1 = \infty$, and $I_{F_1}(t)$ is the inequality function
$I_1(t)$ computed for the share distribution $F_1$ associated with F.
(ii) The ratio of the hazard functions of the holding and share distributions
converges to $I_2$:

$$I_2 = \lim_{t \to \infty} \frac{\ln(1 - F(t))}{\ln(1 - F_1(t))}. \qquad (3.4)$$

Clearly $I_1 \ge 1$ if the holding distribution F is NWUE, with equality iff F is
exponential. Similarly by (3.1), $I_2 \ge 1$ with equality iff $g(t) = o(t)$ or, equivalently,
a condition on hazard functions via (3.4). The question of when $I_1$ and $I_2$ are finite,
so as to be meaningful for purposes of comparison across populations, has the
following answers (Bhattacharjee, 1986a):
(iii) $I_1 < \infty \;\Leftrightarrow\; 1 - F(\ln x)$ is $(-\rho)$-varying, for some $\rho \in (0, \infty]$. F is strictly
NWUE $\Rightarrow I_1 > 1$.
(iv) For any holding distribution F, $1 \le I_2 \le \infty$. The different possibilities are
characterized by:
(a) If F is DFR, then $I_2 = 1 \Leftrightarrow$ the residual holding scaled by its mean converges
to exponential, i.e.,

$$P(X > t + x g(t) \mid X > t) \to e^{-x}.$$

This condition is necessary for $I_2 = 1$, without the DFR hypothesis.

(b) $1 < I_2 < \infty \;\Leftrightarrow\;$ the 'excess holding factor' over an affluence threshold t converges
to the Pareto distribution:

$$P\left(\frac{X}{t} > x \,\Big|\, X > t\right) \to x^{-\alpha},$$

with $\alpha = I_2/(I_2 - 1)$.

(c) $I_2 = \infty \;\Leftrightarrow\; P(X - t > x \mid X > t) \to t/(t + x)$ as $t \to \infty$. Notice that the distribution
on the right hand side is DFR with infinite mean.
The n.s.c. in (iii) is the condition of generalized regular variation (Feller, 1966;
Seneta, 1976): a real-valued function h(x) on the half-line is regularly varying if
$h(xy)/h(y)$ converges as $y \to \infty$, and then $h(xy)/h(y) \to x^\alpha$ for some $\alpha \in (-\infty, \infty)$.
With an obvious interpretation of $x^\alpha$ when $\alpha = \pm\infty$, such an h(x) is called
$\alpha$-varying.

3.2. The Lorenz curve and TTT-transform. While $I_1$, $I_2$ and the classical Gini
index are all aggregate measures of inequality, it is also useful to have a more
dynamic measure of inequality which will describe the variation of the disparity
of ownership with changing levels of affluence. This is classically modeled by the
Lorenz curve

$$L(p) = \mu^{-1} \int_0^p F^{-1}(u) \, du, \qquad 0 \le p \le 1,$$

where $\mu$ is the average holding and $F^{-1}(u) = \inf\{t : F(t) \ge u\}$; L(p) measures the proportion
of total wealth owned by the poorest 100p% of the population, and is thus
a variant of the share distribution $F_1$ in (3.2), namely $L(F(t)) = F_1(t)$. As remarked
earlier, the ratio $g(t)/\mu$ of the mean residual holding to the average holding can
also serve such a purpose. The Lorenz curve L and its inverse $L^{-1}$ are both
distribution functions on the unit interval. The relevance of reliability ideas for
modeling inequality and relationships of the Lorenz curve to some well-known
functionals of life distributions was first indicated by Chandra and Singpurwalla
(1981) and further studied by Klefsjö (1984). If

$$W(p) := \mu^{-1} \int_0^{F^{-1}(p)} \bar F(t) \, dt$$

is the scaled total time on test (TTT) transform of the holding distribution F, viewed
as a life distribution with mean $\mu$, and $V := \int_0^1 W(p) \, dp$ is the cumulative
TTT-transform, then

$$L(p) = W(p) - (1 - p) \, \mu^{-1} F^{-1}(p), \qquad V = 1 - G$$

(Chandra and Singpurwalla, 1981), where the Gini index

$$G = 1 - 2 \int_0^\infty F_1(t) \, dF(t) = 1 - 2 \int_0^1 L(p) \, dp = 2 \int_0^1 \{p - L(p)\} \, dp \qquad (3.5)$$

is scale-equivalent to the area bounded by the diagonal and the Lorenz curve, as
is well known. Based on a random sample with order statistics $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$
from F, the estimated sample Lorenz curve and the Gini statistic

$$G_n := \frac{\sum_{j=1}^{n-1} j(n-j) \, (X_{(j+1)} - X_{(j)})}{(n-1) \sum_{j=1}^{n} X_{(j)}}$$

are similarly related to the total time on test statistic and its cumulative version:

$$L_n(j/n) = W_n(j/n) - (n - j) \, X_{(j)} \Big/ \sum_{i=1}^{n} X_{(i)}, \qquad G_n = 1 - V_n.$$
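These sample quantities are a few lines of code; the sketch below (ours, on simulated exponential data, which is an assumption of the example) computes $W_n$, the sample Lorenz ordinates, and checks the identity $G_n = 1 - V_n$, where $V_n$ is the average of $W_n(j/n)$, $j = 1, \ldots, n-1$:

import numpy as np

rng = np.random.default_rng(0)
x = np.sort(rng.exponential(size=400))
n, S = len(x), x.sum()

gaps = np.diff(np.concatenate(([0.0], x)))      # X_(i) - X_(i-1)
ttt  = np.cumsum((n - np.arange(n)) * gaps)     # total time on test after j failures
W    = ttt / S                                  # scaled TTT statistic W_n(j/n)
L    = np.cumsum(x) / S                         # sample Lorenz curve L_n(j/n)

G_n = sum(j * (n - j) * (x[j] - x[j - 1]) for j in range(1, n)) / ((n - 1) * S)
V_n = W[:-1].mean()                             # cumulative TTT statistic
print(G_n, 1 - V_n)                             # equal up to rounding
print("L_n(1/2) =", L[n // 2 - 1])              # sample Lorenz ordinate at p = 1/2

For exponential data both printed Gini values hover near ½, the population Gini of the exponential noted below.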

Chandra and Singpurwalla (1981), Klefsjö (1984) and Taillie (1981) have used
partial orderings of life distributions to compare the Lorenz curves of holding
distributions which are so ordered. For the partial ordering notions

(i) $H <_c F$ if $F^{-1}H$ is convex,
(ii) $H <_* F$ if $F^{-1}H$ is star-shaped ($F^{-1}H(t)/t$ is increasing),
(iii) $H <_T F$ if $F^{-1}/H^{-1}$ is increasing,
(iv) $H <_m F$ if $\int_x^\infty \{\bar F(t) - \bar H(t)\} \, dt \ge 0$, all $x > 0$, with equality at $x = 0$;

they show

$$H <_* F \text{ or } H <_T F \;\Rightarrow\; L_H^{-1}(p) <_T L_F^{-1}(p) \;\Rightarrow\; L_F(p) \le L_H(p),$$
$$H <_c F \;\Rightarrow\; L_H^{-1}(p) <_c L_F^{-1}(p),$$
$$H <_m F \;\Rightarrow\; L_F(p) \le L_H(p). \qquad (3.6)$$

In particular, taking H to be exponential, the distribution F in (i) above corresponds
to DFR, (ii) to DFRA and (iv) to HNWUE (Klefsjö, 1982). Reversing
the roles of H and F leads to the dual aging classes. (3.6) implies that

$$L(p) \le p + (1 - p)\ln(1 - p), \qquad (3.7)$$

the Lorenz curve of the exponential, whenever the holding distribution is HNWUE
with a finite mean. This bound obviously remains valid for the smaller classes of
NWU and DFR distributions, for which we have earlier found some theoretical
and empirical evidence respectively as plausible models of landholding distributions.
In a more general vein, Klefsjö (1984) remarks that, in the spirit of (3.5),
contrasting the Lorenz curve against the uniform distribution on (0, 1), the quantities

$$J_k := (k+1) \int_0^1 p^{k-1} \{p - L(p)\} \, dp, \qquad k \ge 1,$$

$$L_k := k(k-1) \int_0^1 (1-p)^{k-2} \{p - L(p)\} \, dp, \qquad k \ge 2, \qquad (3.8)$$

can be used as generalized indices of inequality. The Gini index is the special case
$G = J_1 = L_2$. Notice in view of (3.7) we have $J_k \ge 0$, $L_k \ge 0$ for all anti-aging
holding distributions F or their 'aging' duals; and $J_k = L_k = 0$ only in the
egalitarian case $L(p) = p$ where everybody owns the same amount of wealth (F
is degenerate). By a suitable re-expression of $J_k$,
Klefsjö (1984) implicitly notes that $J_k$ can be interpreted as the excess over $k - 1$
of the ratio of the mean life of a parallel system of (k + 1) i.i.d. components with
life distribution F to that of a similar system with exponential lives. Similarly, we
note
$$L_k = (k-1) \int_0^1 (1-u)^{k-2} \{1 - W(u)\} \, du = 1 - \mu^{-1} \int_0^\infty \bar F^k(t) \, dt$$

measures the relative advantage of a component with life F against a series system
of k such i.i.d. components, as measured by the difference of the corresponding
mean lives as a fraction of the component mean life. These interpretations bring
into sharper focus the relationships of the notion of 'inequality of distribution' in
economics to measures of system effectiveness in reliability.

3.3. Applications to statistical analysis of lifelengths. The reliability approach to
modeling 'inequality of distributions' suggests applications to reliability inference.
Using weak convergence of the empirical Lorenz process $\{L_n(t) : 0 \le t \le 1\}$,

$$L_n(t) := \sqrt{n}\left\{L_n\!\left(\frac{j}{n}\right) - L(t)\right\} \quad \text{if } \frac{j-1}{n} < t \le \frac{j}{n}, \qquad L_n(0) := 0,$$

to a process related to the Brownian bridge (Goldie, 1977), it is thus possible to
construct a test of exponentiality, a theme of central interest in reliability and life
testing. However the difficulty of evaluating the exact distribution of $L_n(t)$ to
determine the critical points of the goodness-of-fit test based on the sample
Lorenz curve has in practice required simulation even in large samples (Gail and
Gastwirth, 1978). In contrast, the critical cut-off values of the corresponding test
based on the sampled TTT-process $W_n(t) := \sqrt{n}\{W_n(j/n) - W(t)\}$, $0 \le t \le 1$,
(Barlow and Campo, 1975) are the usual Kolmogorov-Smirnov statistics; since,
under the null hypothesis of exponentiality ($W(t) = t$), $W_n(t)$ converges weakly to
the Brownian bridge.
If the alternatives belong to a more restricted family, such as the well-known
non-parametric life distribution classes in reliability, then there are other possibilities.
Klefsjö (1983) has used a variant of the aggregate inequality index $L_k$ in
(3.8) to construct a test of exponentiality against HNBUE (HNWUE) alternatives.
His test statistic is based on an estimate of $B_k := k L_k - (k-1)$, noting
$B_k \ge (\le)\, 0$ if F is HNBUE (HNWUE), with $B_k = 0$ only if F is exponential.
Estimation and tests of monotonicity and a turning point of the mean residual
life function g(t) have been considered by Hollander and Proschan (1975) and
Guess, Hollander and Proschan (1983). Our inequality indices $I_1$ and $I_2$ suggest a related open
problem: estimation and tests for $I_1$, $I_2$, which are parameters descriptive of the
tail behavior of the mean residual life. The question of estimating $I_1$ is well defined
within the family of age-smooth life distributions (Bhattacharjee, 1986). On the
other hand the domains of attraction results (Bhattacharjee, 1986a) described
earlier, which characterize possible values of $I_2$, imply that estimating $I_2$ and
testing $I_2 = 1$ against $1 < I_2 < \infty$ are problems of independent interest for reliability
theory.
4. R & D rivalry and the economics of innovation

4.1. Innovations and accompanying technological breakthroughs have changed


the lot of mankind throughout history and noticeably more so in the present
century at an accelerating pace. Since technological change affects market struc-
ture through altering the means of production, economists began to be interested
in the subject of technical advance around the fifties. Although there are some
earlier references to the economic aspects of technological advance (Taussig, 1915;
Hicks, 1932), the stage for serious inquiry on the economics of such advance was
set by Schumpeter (1961, 1964, 1975) who emphasized the role of innovation as
an economic activity. Since then, the recognition of technical advance as a major
source of economic growth has been the subject of many studies, mostly empiri-
cal. These studies deal with empirical relationships of industrial innovations to
firm size and concentration as indicators of market structure, the 'technology-
push' and 'demand-pull' factors (Arrow, 1962) as incentives for innovation, and
such other relevant variables. Collectively they point to the need for a conceptual
framework, and recently an economic theory of technical advance has begun to
emerge (Kamien and Schwartz, 1982). In this view, the economic agents are firms
or entrepreneurs and an act of product- or process-innovation straddles all
activities from basic research through invention to development, production, dis-
tribution and collection of consequent revenues against the backdrop of industrial
rivalry in the competition to gain market supremacy. Schumpeter recognized that
acts of invention and innovational entrepreneurship are distinct as are the corre-
sponding risks; and it is only the latter which can lead to the diffusion of benefits
of invention to its ultimate consumers.
Innovation and entrepreneurship in this framework is viewed as a race to be
the first with the incentive of commanding extraordinary profits at least until
imitators appear when such monopoly profits will begin to be eroded. The
'Schumpeterian hypothesis', that the opportunity to realize monopoly profits spurs
invention and the presence of some monopoly power has a similar effect, the latter
also stressed by Galbraith (1952), forms the basis of a modern economic theory
of technical advance. The accent is on competition through innovation rather than
through price alone, and is thus contrary to the traditional tenets of the western
economic doctrine of 'perfect competition' which would eliminate any excess
profit of an innovation by immediate imitation.

4.2. The presence of identified or potential rivals who are in the race to be the
first to innovate constitutes the major source of uncertainty for an entrepreneur.
It is this aspect of innovational ( R & D ) rivalry on which reliability ideas can be
brought to bear that is of interest to us. Even within the context of such applica-
tions, there are a host of issues in modeling the economics of innovation which
can be so addressed within the Schumpeterian framework. Kamien and Schwartz
(1982) provide a definitive account of contemporary research on the economics
of technical advance, where reliability researchers will recognize the potential to
exploit reliability ideas through modeling the uncertainty associated with
innovational rivalry and possible duration of monopoly between successful


innovation and rivals' imitation. These ideas do not appear to have been explicitly
recognized and are only implicit in Kamien and Schwartz (1982). We will
consider one such model to focus on the relevance of reliability concepts in
modeling the economics of technical advance which may lead to deeper insights
into the role of innovational rivalry as a determinant of technological progress.
In this simplified model of innovation as an economic activity under the
Schumpeter scenario; our entrepreneur or firm has either only one product
(economic 'good') or none at all (breaking in as a newcomer), and is competing
against rivals to develop an innovation. We assume there is no essential resource
constraint and no major uncertainty important enough to warrant stochastic
modelling of the entrepreneur's time to complete development. Any desired com-
pletion time r can be achieved by spending a required amount C(v) representing
the net present value of the cost stream incurred to complete development at
time ~. Although it is usual to assume that 0 < C(x) is convex decreasing, for our
purposes the latter assumption is unnecessary, and only assuming C(0) sufficiently
large to prevent instantaneous development will suffice. Assume a market growth
rate 7; 7>, = or < 0 according as the market is growing, stationary or
decreasing.
The development process is assumed to be contractual in the sense that the
innovation will be seen through to its completion by the entrepreneur as well as the
rivals, either as a pioneer or as an imitator. The entrepreneur has only an incomplete
knowledge about the rivals' introduction time T, reflected by its d.f.
$H(t) = P(T \le t)$, about which more will be said later.

The current rate of the entrepreneur's return $r(t; \tau, T)$ at time t depends not
only on when the innovation is introduced in the market but also on whether our
entrepreneur is a winner succeeding first or an imitator of the rivals. Let this be
$r_0$ (receipt on current good) until introduction of the innovation changes it to $r_1$
or $\rho_0$ according as some rival or the entrepreneur succeeds first. These rates
remain in effect until the moment both the innovating pioneer and the imitator
appear. Once the entrepreneur and the rivals are both in the market, the former's
rate of return changes again. The current value of its contribution to the total
return is a function $P(\tau, T)$, the current capitalized value of the stream of future
receipts, which depends on $\tau$ and T, typically through $|\tau - T|$: the lag between
innovation and imitation. The structure of P also depends on whether the rivals
win ($\tau \ge T$; correspondingly $P =: P_1(\cdot)$, say) or imitate ($T > \tau$, when $P =: P_0(\cdot)$).
Accordingly,

$$P(\tau, T) = P_0(T - \tau) \quad \text{if } \tau < T, \qquad = P_1(\tau - T) \quad \text{if } \tau \ge T;$$
and the flow of receipts can be schematically described as below (return rates over
the successive time segments separated at min(τ, T) and max(τ, T), with the
capitalized value P received at max(τ, T)):

$\tau < T$ (rival imitates):  $r_0$ on $[0, \tau)$,  $\rho_0$ on $[\tau, T)$,  $P_0(T - \tau)$ at T;
$\tau \ge T$ (rival precedes):  $r_0$ on $[0, T)$,  $r_1$ on $[T, \tau)$,  $P_1(\tau - T)$ at $\tau$.

The expected net present value of the entrepreneur's returns, with a market
interest rate i, as a consequence of the decision to choose an introduction time
τ, is

$$U(\tau) = \int_0^\infty E\{e^{-(i-\gamma)t} \, r(t; \tau, T)\} \, dt + E\{e^{-(i-\gamma)\max(\tau, T)} P(\tau, T)\}$$
$$= \int_0^\tau e^{-(i-\gamma)t} \{r_0 \bar H(t) + r_1 H(t)\} \, dt + \rho_0 \int_\tau^\infty e^{-(i-\gamma)t} \bar H(t) \, dt$$
$$\quad + e^{-(i-\gamma)\tau} \int_0^\tau P_1(\tau - t) \, dH(t) + \int_\tau^\infty e^{-(i-\gamma)t} P_0(t - \tau) \, dH(t). \qquad (4.1)$$

The optimal introduction time $\tau^*$ is of course the solution which maximizes the
expected value of profit

$$V(\tau) = U(\tau) - C(\tau). \qquad (4.2)$$

While $\tau^* = 0$ can be ruled out by taking C(0) to be sufficiently large, it is possible
to have $\tau^* = \infty$ (best not to undertake development at all) depending on the
relative values of the economic parameters. In the remaining cases there is a finite
economically best introduction time. It is usual, but not necessary, to have
$\rho_0 \ge r_0 \ge r_1$ and $P_0' \ge 0$, $P_1' \le 0$, which are easily interpreted: (i) rival precedence,
should it occur, does not increase the rate of return from the old good, which further
increases if the entrepreneur succeeds first; (ii) in the post-innovation-cum-imitation
period, the greater the lag of rival entry if we succeed first (the greater
the lag in our following, if the rivals succeed first), the greater (the smaller) is
our return from the remaining market. Various special cases may occur within
these constraints, e.g., rivals' early success may make our current good obsolete
($r_1 = 0$); or the entrepreneur may be a new entrant with no current good to be
replaced ($r_0 = r_1 = 0$). Sensitivity of the optimal introduction time to these and
other parameters in the model is of obvious economic interest and is easily
derived (Kamien and Schwartz, 1982).
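The optimization in (4.2) is easy to carry out numerically. The following sketch is ours, not Kamien and Schwartz's: it assumes hypothetical parameter values, exponential rival entry $\bar H(t) = e^{-ht}$, constant lump values $P_0 > P_1$ (so $P_0' = P_1' = 0$), and a cost function $C(\tau) = 30 e^{-0.8\tau}$, then locates $\tau^*$ on a grid.

import numpy as np

i_, g_, h = 0.10, 0.02, 0.5          # interest rate, market growth, rivalry
r0, r1, rho0 = 1.0, 0.2, 1.5         # rates: before; rival preceded; we won
P0, P1 = 8.0, 3.0                    # lumps: rival imitates / rival precedes
C = lambda tau: 30.0 * np.exp(-0.8 * tau)

t = np.linspace(0.0, 100.0, 100001)
dt = t[1] - t[0]
disc = np.exp(-(i_ - g_) * t)        # discount factor e^{-(i - gamma) t}
Hbar = np.exp(-h * t)
dH = h * Hbar                        # rival-entry density

def U(tau):                          # the four terms of (4.1), truncated grids
    pre  = np.sum((disc * (r0 * Hbar + r1 * (1 - Hbar)))[t < tau]) * dt
    post = rho0 * np.sum((disc * Hbar)[t >= tau]) * dt
    lump = (np.exp(-(i_ - g_) * tau) * P1 * (1 - np.exp(-h * tau))
            + P0 * np.sum((disc * dH)[t >= tau]) * dt)
    return pre + post + lump

taus = np.linspace(0.0, 20.0, 401)
V = np.array([U(tau) - C(tau) for tau in taus])
print("tau* =", taus[V.argmax()], " V(tau*) =", round(V.max(), 3))

Sweeping h in this sketch gives a quick view of the sensitivity of $\tau^*(h)$ to the intensity of rivalry, the theme taken up next.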

4.3. Intensity of rivalry as a reliability idea and its implications. What interests us
more is how the speed of development, as reflected by the economic $\tau^*$, is
affected by the extent of innovational rivalry which is built into the rivals' introduction
time distribution H. Kamien and Schwartz (1982) postulate

$$\bar H(t) := P(T > t) = e^{-h \Lambda(t)}$$

and propose $h > 0$ as a degree of innovational hazard. To avoid confusion with
the notion of hazard in reliability theory, we call h the intensity of innovational
rivalry. Setting $F(t) = 1 - e^{-\Lambda(t)}$, it is clear that

$$\bar H(t) = \bar F^h(t), \qquad (4.3)$$

i.e., the rival introduction time d.f. H belongs to a family of distributions with
proportional hazards, which are of considerable interest in reliability. We may
think of F as the distribution of rivals' development time under unit rivalry (h = 1)
for judging how fast the rivals may complete development, as indicated by H.
Since the hazard function $\Lambda_H(t) := -\ln \bar H(t)$ is a measure of time-varying innovational
risk of rival pre-emption, the proportional hazards hypothesis
$\Lambda_H(t) = h\Lambda(t)$ in (4.3) says the effects of time and rivalry on the entrepreneur's
innovational hazards are separable and multiplicative. If F has a density and
correspondingly a hazard rate (i.e., 'failure rate') $\lambda(t)$, then so does H, with failure
rate $h\lambda(t)$. It is the innovational rate of hazard at time t from the viewpoint of
our entrepreneur; and by the standard reliability-theoretic interpretation of failure
rates, the conditional probability of rivals' completion soon after t, given completion
has not occurred within time t, is

$$P(T \le t + \delta \mid T > t) = h \, \delta \, \lambda(t) + o(\delta).$$

As the intensity of rivalry increases by a factor from h to ch, this probability, for
each fixed t and small δ, also increases essentially by the same factor c.
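A one-line check of this scaling (our sketch, with a hypothetical Weibull benchmark F, shape $\alpha = 1.5$):

import numpy as np

Hbar = lambda t, h: np.exp(-h * t**1.5)    # (4.3) with Weibull F, alpha = 1.5
t, d, h, c = 2.0, 1e-4, 0.7, 3.0
cond = lambda hh: (Hbar(t, hh) - Hbar(t + d, hh)) / Hbar(t, hh)
print(cond(c * h) / cond(h), c)            # ratio of conditional completion
                                           # probabilities ~ c for small d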
To examine the effect of the intensity of rivalry on the speed of development,
assume that having imitators is preferable to being one ($P_0 > P_1$) and, as a
simplifying assumption, that the corresponding rewards are independent of the
'innovation-imitation lag' ($P_0' = P_1' = 0$). By (4.1) and (4.2), the optimal introduction
time $\tau^*$ is then the implicit solution of

$$\frac{\partial V}{\partial \tau} = e^{-(i-\gamma)\tau}\left[\{r_0 - \rho_0 + h(P_1 - P_0)\lambda(\tau)\}\,\bar H(\tau) + \{r_1 - (i-\gamma)P_1\}\,H(\tau)\right] - C'(\tau) = 0, \qquad (4.4)$$

satisfying the second derivative condition for a maximum at $\tau^*$. (4.4) defines
$\tau^* = \tau^*(h)$ implicitly as a function of the rivalry intensity. Kamien and Schwartz
(1982) show that if

$$\lambda(t) \uparrow \text{ in } t \quad \text{and} \quad \lambda(t)/\Lambda(t) \downarrow \text{ in } t, \qquad (4.5)$$

then either (i) $\tau^*(h) \uparrow$ in h, or (ii) $\tau^*(h)$ is initially $\downarrow$ and then $\uparrow$ in h. The crux of their
argument is the following. If $\tau_0(h)$ is implicitly defined by the equation

$$\lambda(t)\left\{\frac{1}{\Lambda(t)} - h\right\} = \frac{\rho_0 - r_0 + r_1 - (i-\gamma)P_1}{P_0 - P_1}, \qquad (4.6)$$

i.e., the condition for the left hand side of (4.4) to have a local extremum as a
function of h, then $\tau^*(h)$ is decreasing, stationary or increasing in h according as
$\tau^*(h) >$, $=$ or $< \tau_0(h)$. Accordingly, since (4.5) implies that $\tau_0(h)$ is decreasing in
h, either $\tau^*(h)$ behaves according to one of the two possibilities mentioned, or
(iii) $\tau^*(h) < \tau_0(h)$ for all $h \ge 0$. The last possibility can be ruled out by the continuity
of $V = V(\tau, h)$ in (4.2), $V(0, h) < 0$, $V(\tau^*, h) > 0$ and the condition
$P_0 > P_1$. Which one of the two possibilities obtains of course depends on the
model parameters. In case (i), the optimal introduction time $\tau^*(h)$ increases with
increasing rivalry, and the absence of rivalry (h = 0) yields the smallest such
optimal introduction time. The other case (ii), that depending on the rates of
return and other relevant parameters there may be an intermediate degree of
rivalry for which the optimal development is quickest possible, is certainly not
obvious a priori and highlights the non-intuitive effects of rivalry on decisions to
innovate.

4.4. Further reliability ramifications. From a reliability point of view, Kamien
and Schwartz's assumption (4.5) says

$$F \in \{\text{IFR}\} \cap \mathscr{K} \qquad (4.7)$$

and hence so does H, where $\mathscr{K}$ is the set of life distributions with a log-concave
hazard function. The IFR hypothesis is easy to interpret. It says: the composite
rivals' residual time to development is stochastically decreasing, so that if they
have not succeeded so far, then completion of their development within any
additional deadline becomes more and more likely with elapsed time. This reflects
the accumulation of efforts positively reinforcing the chances of success in the future.
The other condition, that F, and thus H, also has a log-concave hazard function,
is less open to such interpretation; it essentially restricts the way in which the
time-dependent component of the entrepreneur's innovational hazard from competing
rivals grows with time t.
The proportional hazards model (4.3) can accommodate different configurations
of market structure as special cases, an argument clearly in its favor. By (4.3), as
$h \to 0$, $P(T > t) \to 1$ for all t > 0, and in the limiting case T is an improper r.v. with
all its mass at infinity. Thus h = 0 corresponds to absence of rivalry. Similarly as
$h \to \infty$, $P(T > t) \to 0$ for all t > 0; in the limit the composite rivals' appearance is
immediate and this prevents the possibility of entrepreneurial precedence. If our
entrepreneur had a head start with no rivals until a later time when rivals appear
with a very large h, then even if our entrepreneur innovates first, his supernormal
profits from innovation will very quickly be eliminated by rival imitation with high
probability within a very short time as a consequence of high rivalry intensity h,
which shrinks to instantaneous imitation as h approaches infinity. In this sense
the case $h = \infty$ reflects the traditional economists' dream of 'perfect competition'.
Among the remaining possibilities $0 < h < \infty$ that reflect more realism, Barzel
(1968) distinguishes between moderate and intense rivalry, the latter corresponding
to the situation when the intensity of rivalry exceeds the market growth rate
(h > γ). If rivalry is sufficiently intense, no development becomes best
($h \gg \gamma \Rightarrow \tau^*(h) = \infty$). In other cases, the intense rivalry and non-rivalrous solutions
provide vividly contrasting benchmarks for understanding the innovation process
under varying degrees of moderate to intense rivalry.
Our modeling to illustrate the use of reliability ideas has been limited to a
relatively simplified situation. It is possible to introduce other variations and
features of realism, such as modification of rivals' effort as a result of the entrepreneur's
early success, budget constraints, non-contractual development which
allows the option of stopping development under rival precedence, and game-theoretic
formulations which incorporate technical uncertainty. There is now substantial
literature on these various aspects of innovation as an economic process
(DasGupta and Stiglitz, 1980, 1980a; Kamien and Schwartz, 1968, 1971, 1972,
1974, 1975, 1982; Lee and Wilde, 1980; Loury, 1979). It appears to us that there
are many questions, interesting from a reliability application viewpoint, which can
be profitably asked and would lead to a deeper understanding of the economics
of innovation. Even in the context of the present model, which captures the
essence of the innovating process under risk of rivalry, there are many such
questions. For example, what kind of framework for R & D rivalry and market
mechanisms lead to the rival entry model (4.3)? Stochastic modeling of such
mechanisms would be of obvious interest. Note that the exponential: $\bar H(t) = e^{-ht}$,
$\lambda(t) = 1$; the Weibull: $\bar H(t) = e^{-ht^\alpha}$, $\lambda(t) = \alpha t^{\alpha-1}$; and the extreme-value distributions:
$\bar H(t) = \exp\{-h(e^{\alpha t} - 1)\}$, $\lambda(t) = \alpha e^{\alpha t}$, all satisfy (4.3) and (4.7), the latter for
$\alpha \ge 1$.
A related open question is the following. Suppose the rival introduction time
satisfies (4.3) but its distribution F under unit rivalry (h = 1) is unknown. Under
what conditions, interesting from a reliability point of view with an appropriate
interpretation in the context of rivalry, does there exist a finite maximin introduction
time $\tau^*(h)$, and what, if any, is a least favorable distribution $F^*$ of time
to rival entry? Such a pair $(\tau^*(h), F^*)$, for which

$$\max_{\tau} \min_{F \in \mathscr{C}} V(\tau, h; F) = \min_{F \in \mathscr{C}} \max_{\tau} V(\tau, h; F) = V(\tau^*(h), h; F^*),$$
would indicate the entrepreneur's best economic introduction time within any
specified regime of rivalry when he has only an incomplete knowledge of the
benchmark distribution F. Here $V(\tau, h; F)$ is the total expected reward (4.2) and
(4.1) under (4.3).
The proportional hazards model (4.3) aggregates all sources of rivalry, from
existing firms or potential new entrants. This is actually less of a criticism than
it appears, because in the entrepreneur's perception only the distribution of the composite
rival entry time matters. It is possible to introduce technical uncertainty in
the model by recognizing that the effort, usually parametrized through cost,
required to successfully complete development is also subject to uncertainties
(Kamien and Schwartz, 1971). Suppose there are n competitors including our
entrepreneur, the rivals are independent, and let G(z) be the probability that any
rival completes development with an effort no more than z. If z(t) is the cumulative
rival effort up to time t, then the probability that one or more of the rivals will
succeed by time t is

$$P(t) = 1 - \{1 - G(z(t))\}^{n-1}.$$

This leads to (4.3) with $F = G(z)$, $H = P$ and intensity $h = n - 1$, the number of
rivals. We note this provides one possible answer to the question of modeling
rivalry described by (4.3). What other alternative mechanisms can also lead to
(4.3)? If the effort distribution G has a 'failure rate' (intensity of effort) $r(z)$, then
the innovational hazard function and hazard rate are

$$\Lambda_H(t) = (n-1) \int_0^{z(t)} r(u) \, du, \qquad \lambda_H(t) = (n-1) \, z'(t) \, r(z(t)), \qquad (4.8)$$

which show how technical uncertainty can generate market uncertainty. If our
entrepreneur's effort distribution is also G(z) and independent of the rivals, then
note that the role of each player in the innovation game is symmetric and each faces
the hazard rate (4.8), since from the perspective of each competitor the other
(n - 1) rivals are i.i.d. and in series. It would clearly be desirable to remove the
i.i.d. assumption, to reflect more realism in so far as a rival's effort and
spending decisions are often dictated by those of others.
Some of the effects of an innovation may be irreversible. Computers and
information processing technology, which have now begun to affect every facet of
human life, are clearly a case in point. Are these impacts or their possible irreversibility
best for the whole society? None of the above formulations can address this
issue, a question not in the purview of economists and quantitative modeling
alone; nor do they dispute its relevance. What they can and do provide is an
understanding of the structure and evolution of the innovating process as a risky
enterprise, and it is here that reliability ideas may be able to play a more significant
role than hitherto in explaining rivalry and its impacts on the economics of
innovation. In turn, the measurable parameters of such models and their consequences
can then serve as signposts for an informed debate on the wider
questions of social relevance of an innovation.

References

Arrow, K. J. (1951). Social Choice and Individual Values. Wiley, New York.
Arrow, K. J. (1962). Economic welfare and the allocation of resources for invention. In: R. R. Nelson,
ed., The Rate and Direction of Inventive Activity. Princeton University Press, Princeton, NJ.
Barlow, R. E. and Campo, R. (1975). Total time on test processes and applications to failure data
analysis. In: R. E. Barlow, J. Fussell and N. D. Singpurwalla, eds., Reliability and Fault Tree
Analysis, SIAM, Philadelphia, PA, 451-481.
Barlow, R. E. and Saboia, J. L. M. (1973). Bounds and inequalities in the rate of population growth.
In: F. Proschan and R. J. Serfling, eds., Reliability and Biometry, Statistical Analysis of Lifelengths,
SIAM, Philadelphia, PA, 129-162.
Barlow, R. E. and Proschan, F. (1965). Mathematical Theory of Reliability. Wiley, New York.
Barlow, R. E. and Proschan, F. (1975). Statistical Theory of Reliability and Life Testing: Probability
Models. Holt, Rinehart and Winston, New York.
Barzel, Y. (1968). Optimal timing of innovation. Review of Economics and Statistics 50, 348-355.
Bergmann, R. and Stoyan, D. (1976). On exponential bound for the waiting time distribution in
GI/G/1. J. Appl. Prob. 13(2), 411-417.
Bhattacharjee, M. C. and Krishnaji, N. (1985). DFR and other heavy tail properties in modeling the
distribution of land and some alternative measures of inequality. In: J. K. Ghosh, ed., Statistics:
Applications and New Directions, Indian Statistical Institute, Eka Press, Calcutta; 100-115.
Bhattacharjee, M. C. (1986). Tail behaviour of age-smooth failure distribution and applications. In:
A. P. Basu, ed., Reliability and Statistical Quality Control, North-Holland, Amsterdam, 69-86.
Bhattacharjee, M. C. (1986a). On using Reliability Concepts to Model Aggregate Inequality of Distribu-
tions. Technical Report, Dept. of Mathematics, University of Arizona, Tucson.
Brams, S. J., Lucas, W. F. and Straffin, P. D., Jr. (eds.) (1978). Political and Related Models. Modules
in Applied Mathematics: Vol. 2, Springer, New York.
Chandra, M. and Singpurwalla, N. D. (1981). Relationships between some notions which are
common to reliability and economics. Mathematics of Operations Research 6, 113-121.
Daley, D. (ed.) (1983). Stochastic Comparison Methods for Queues and Other Processes. Wiley, New
York.
Deegan, J., Jr. and Packel, E. W. (1978). To the (Minimal Winning) Victors go the (Equally Divided)
Spoils: A New Power Index for Simple n-Person Games. In: S. J. Brams, W. F. Lucas and P.
D. Straffin, Jr., eds., Political and Related Models. Springer-Verlag, New York, 239-255.
DasGupta, P. and Stiglitz, J. (1980). Industrial structure and the nature of innovative activity.
Economic Journal 90, 266-293.
DasGupta, P. and Stiglitz, J. (1980a). Uncertainty, industrial structure and the speed of R & D. Bell
Journal of Economics 11, 1-28.
Feller, W. (1966). Introduction to Probability Theory and Applications. 2nd ed. Wiley, New York.
Gail, M. H. and Gastwirth, J. L. (1978). A scale-free goodness-of-fit test for the exponential distribu-
tion based on the Lorenz curve. J. Amer. Statist. Assoc. 73, 787-793.
Galbraith, J. K. (1952). American Capitalism. Houghton and Mifflin, Boston.
Goldie, C. M. (1977). Convergence theorems for empirical Lorenz curves and their inverses. Advances
in Appl. Prob. 9, 765-791.
Guess, F., Hollander, M. and Proschan, F. (1983). Testing whether Mean Residual Life Changes Trend.
FSU Technical Report #M665, Dept. of Statistics, Florida State University, Tallahassee.
Hicks, J. R. (1932). The Theory of Wages. Macmillan, London.
Hollander, M. and Proschan, F. (1975). Tests for the mean residual life. Biometrika 62, 585-593.
Kamien, M. and Schwartz, N. (1968). Optimal induced technical change. Econometrica 36, 1-17.
Kamien, M. and Schwartz, N. (1971). Expenditure patterns for risky R & D projects. J. Appl. Prob.
8, 60-73.
Kamien, M. and Schwartz, N. (1972). Timing of innovations under rivalry. Econometrica 40, 43-60.
Kamien, M. and Schwartz, N. (1974). Risky R & D with rivalry. Annals of Economic and Social
Measurement 3, 276-277.
Kamien, M. and Schwartz, N. (1975). Market structure and innovative activity: A survey. J.
Economic Literature 13, 1-37.
Kamien, M. and Schwartz, N. (1982). Market Structure and Innovation. Cambridge University Press,
London.
Klefsjö, B. (1982). The HNBUE and HNWUE class of life distributions. Naval Res. Logist. Qrtly. 29,
331-344.
Klefsjö, B. (1983). Testing exponentiality against HNBUE. Scandinavian J. Statist. 10, 65-75.
Klefsjö, B. (1984). Reliability interpretations of some concepts from economics. Naval Res. Logist.
Qrtly. 31, 301-308.
Kleinrock, L. (1975). Queueing Systems, Vol. 1. Theory. Wiley, New York.
Köllerström, J. (1976). Stochastic bounds for the single server queue. Math. Proc. Cambridge Phil.
Soc. 80, 521-525.
Lucas, W. F. (1978). Measuring power in weighted voting systems. In: S. J. Brams, W. F. Lucas
and P. D. Straffin, Jr., eds., Political Science and Related Models. Springer, New York, 183-238.
Lee, T. and Wilde, L. (1980). Market structure and innovation: A reformulation. Qrtly. J. of
Economics 94, 429-436.
Loury, G. C. (1979). Market structure and innovation. Qrtly. J. of Economics XCIII, 395-410.
Macdonald, J. B. and Ransom, M. R. (1979). Functional forms, estimation techniques and the
distribution of income. Econometrica 47, 1513-1525.
Mukherjee, V. (1967). Type III distribution and its stochastic evolution in the context of distribution
of income, landholdings and other economic variables. Sankhyā A 29, 405-416.
Owen, G. (1982). Game Theory. 2nd edition. Academic Press, New York.
Pechlivanides, P. M. (1975). Social Choice and Coherent Structures. Unpublished Tech. Report # ORC
75-14, Operations Research Center, University of California, Berkeley.
Rae, D. (1979). Decision rules and individual values in constitutional choice. American Political
Science Review 63.
Ramamurthy, K. G. and Parthasarathy, T. (1983). A note on factorization of simple games. Opsearch
20(3), 170-174.
Ramamurthy, K. G. and Parthasarathy, T. (1984). Probabilistic implications of the assumption of
homogeneity in voting games. Opsearch 21(2), 81-91.
Salem, A. B. Z. and Mount, T. D. (1974). A convenient descriptive model of income distribution.
Econometrica 42, 1115-1127.
Schumpeter, J. A. (1961). Theory of Economic Development. Oxford University Press, New York.
Schumpeter, J. A. (1964). Business Cycles. McGraw-Hill, New York.
Schumpeter, J. A. (1975). Capitalism, Socialism and Democracy. Harper and Row, New York.
Seneta, E. (1976). Regularly Varying Functions. Lecture Notes in Math. 508, Springer, New York.
Straffin, P. D., Jr. (1978). Power indices in politics. In: S. J. Brams, W. F. Lucas and P. D. Straffin,
Jr., eds., Political Science and Related Models. Springer, New York, 256-321.
Straffin, P. D., Jr. (1978a). Probability models for power indices. In: P. C. Ordeshook, ed., Game
Theory and Political Science, New York University Press, New York.
Taillie, C. (1981). Lorenz ordering within the generalized gamma family of income distributions. In:
C. Taillie, G. P. Patil and B. A. Baldessari, eds., Statistical Distributions in Scientific Work, Vol.
6. Reidel, Dordrecht/Boston, 181-192.
Taussig, F. W. (1915). Inventors and Money-Makers. Macmillan, New York.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 215-224

Mean Residual Life: Theory and Applications*

Frank Guess and Frank Proschan

1. Introduction and summary

The mean residual life (MRL) has been used as far back as the third century
A.D. (cf. Deevey (1947) and Chiang (1968)). In the last two decades, however,
reliabilists, statisticians, and others have shown intensified interest in the MRL
and derived many useful results concerning it. Given that a unit is of age t, the
remaining life after time t is random. The expected value of this random residual
life is called the mean residual life at time t. Since the MRL is defined for each
time t, we also speak of the MRL function. (See Section 2 for a more formal
definition.)
The MRL function is like the density function, the moment generating function,
or the characteristic function: for a distribution with a finite mean, the MRL
completely determines the distribution via an inversion formula (e.g., see Cox
(1962), Kotz and Shanbhag (1980), and Hall and Wellner (1981)). Hall and
Wellner (1981) and Bhattacharjee (1982) derive necessary and sufficient condi-
tions for an arbitrary function to be an MRL function. These authors recommend
the use of the MRL as a helpful tool in model building.
Not only is the MRL used for parametric modeling but also for nonparametric
modeling. Hall and Wellner (1981) discuss parametric uses of the MRL. Large
nonparametric classes of life distributions such as decreasing mean residual life
(DMRL) and new better than used in expectation (NBUE) have been defined
using MRL. Barlow, Marshall and Proschan (1963) note that the DMRL class
is a natural one in reliability. Brown (1983) studies the problem of approximating
increasing mean residual life (IMRL) distributions by exponential distributions.
He mentions that certain IMRL distributions, '... arise naturally in a class of first
passage time distributions for Markov processes, as first illuminated by Keilson'.
See Barlow and Proschan (1965) and Hollander and Proschan (1984) for further
comments on the nonparametric use of MRL.
A fascinating aspect of the MRL is its tremendous range of applications. For
example, Watson and Wells (1961) use MRL in studying burn-in. Kuo (1984)
* Research sponsored by the Air Force Office of Scientific Research, AFSC, USAF, under Grant
AFOSR 85-C-0007.

presents further references on MRL and burn-in in his Appendix 1, as well as a
brief history on research in burn-in.
Actuaries apply MRL to setting rates and benefits for life insurance. In the
biomedical setting researchers analyze survivorship studies by MRL. See Elandt-
Johnson and Johnson (1980) and Gross and Clark (1975).
Morrison (1978) mentions IMRL distributions have been found useful as
models in the social sciences for the lifelengths of wars and strikes. Bhattacharjee
(1982) observes MRL functions occur naturally in other areas such as optimal
disposal of an asset, renewal theory, dynamic programming, and branching pro-
cesses.
In Section 2 we define more formally the MRL function and survey some of
the key theory. In Section 3 we discuss further its wide range of applications.

2. Theory of mean residual life

Let F be a life distribution (i.e., F(t) = 0 for t < 0) with a finite first moment.
Let $\bar F(t) = 1 - F(t)$, and let X be the random life with distribution F. The mean residual
life function is defined, for $t \ge 0$, as

$$m(t) = E[X - t \mid X > t] \quad \text{for } \bar F(t) > 0, \qquad m(t) = 0 \quad \text{for } \bar F(t) = 0. \eqno(2.1)$$

Note that we can express

$$m(t) = \int_0^\infty \frac{\bar F(x + t)}{\bar F(t)}\, dx = \int_t^\infty \frac{\bar F(u)}{\bar F(t)}\, du$$

when $\bar F(t) > 0$. If F also has a density f we can write

$$m(t) = \int_t^\infty u f(u)\, du \Big/ \bar F(t) \; - \; t .$$

Like the failure rate function (recall that it is defined as $r(t) = f(t)/\bar F(t)$ when
$\bar F(t) > 0$), the MRL function is a conditional concept. Both functions are conditioned
on survival to time t.
While the failure rate function at t provides information about a small interval
after time t ('just after t', see p. 10 of Barlow and Proschan (1965)), the MRL
function at t considers information about the whole interval after t ('all after t').
This intuition explains the difference between the two.
Note that it is possible for the MRL function to exist but for the failure rate
function not to exist (e.g., consider the standard Cantor ternary function; see
Chung (1974), p. 12). On the other hand, it is possible for the failure rate function
to exist but the MRL function not to exist (e.g., consider modifying the Cauchy
density to yield $f(t) = 2/[\pi(1 + t^2)]$ for $t \ge 0$). Both the MRL and the failure rate
functions are needed in theory and in practice.
When m and r both exist the following relationship holds between the two:

$$m'(t) = m(t) r(t) - 1. \eqno(2.2)$$

See Watson and Wells (1961) for further comments on (2.2) and its uses.
If the failure rate is a constant (> 0) the distribution is an exponential. If the
MRL is a constant (> 0) the distribution is also an exponential.
Let $\mu = E(X)$. If F(0) = 0 then $m(0) = \mu$. If F(0) > 0 then $m(0) = \mu/\bar F(0) \ge \mu$.
For simplicity in discussions and definitions in this section, we assume F(0) = 0.
Let F be right continuous (not necessarily continuous). Knowledge of the MRL
function completely determines the reliability function as follows:

$$\bar F(t) = \frac{m(0)}{m(t)} \exp\left[-\int_0^t \frac{du}{m(u)}\right] \quad \text{for } 0 \le t < F^{-1}(1), \qquad \bar F(t) = 0 \quad \text{for } t \ge F^{-1}(1), \eqno(2.3)$$

where $F^{-1}(1) \stackrel{\mathrm{def}}{=} \sup\{t : F(t) < 1\}$.


Cox (1962) assigns as an exercise the demonstration that MRL determines the
reliability. Meilijson (1972) gives an elegant, simple proof of (2.3). Kotz and
Shanbhag (1980) derive a generalized inversion formula for distributions that are
not necessarily life distributions. Hall and Wellner (1981) have an excellent dis-
cussion of (2.3) along with further references.
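As a numerical illustration of (2.3), the following Python sketch (an added aside, not part of the original treatment; it assumes NumPy and SciPy are available and uses an arbitrary Weibull distribution) recovers the reliability function from the MRL and compares it with the true survival function.

import numpy as np
from scipy import integrate

shape = 2.0  # illustrative Weibull shape parameter

def sf(t):
    # Survival function F-bar(t) = exp(-t^shape) of a standard Weibull.
    return np.exp(-t**shape)

def mrl(t):
    # m(t) = (integral of F-bar(u) du from t to infinity) / F-bar(t).
    num, _ = integrate.quad(sf, t, np.inf)
    return num / sf(t)

def sf_from_mrl(t):
    # Right side of (2.3): [m(0)/m(t)] exp(-integral of du/m(u) from 0 to t).
    integral, _ = integrate.quad(lambda u: 1.0 / mrl(u), 0.0, t)
    return mrl(0.0) / mrl(t) * np.exp(-integral)

for t in (0.5, 1.0, 2.0):
    print(t, sf(t), sf_from_mrl(t))  # the last two columns should agree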
A natural question to ask is: what functions are MRL functions? A characterization
is possible which answers this. By a function f being increasing (decreasing)
we mean that $x \le y$ implies $f(x) \le (\ge)\, f(y)$.

THEOREM 2.1. Consider the following conditions:
(i) $m: [0, \infty) \to [0, \infty)$.
(ii) $m(0) > 0$.
(iii) m is right continuous (not necessarily continuous).
(iv) $d(t) \stackrel{\mathrm{def}}{=} m(t) + t$ is increasing on $[0, \infty)$.
(v) When there exists $t_0$ such that $m(t_0^-) \stackrel{\mathrm{def}}{=} \lim_{t \uparrow t_0} m(t) = 0$, then $m(t) = 0$
holds for $t \in [t_0, \infty)$. Otherwise, when there does not exist such a $t_0$, then
$\int_0^\infty 1/m(u)\, du = \infty$ holds.
A function m satisfies (i)-(v) if and only if m is the MRL function of a life
distribution that is nondegenerate at 0.

See Hall and Wellner (1981) for a proof. See Bhattacharjee (1982) for another
characterization. Note that condition (ii) rules out the distribution degenerate at 0.
218 F. Guess and F. Proschan

For (iv) note that d(t) is simply the expected time of death (failure) given that a
unit has survived to time t. Theorem 2.1 delineates which functions can serve as
MRL functions, and hence, provides models for lifelengths.
We restate several bounds involving MRL from Hall and Wellner (1981). Recall
$a^+ = a$ if $a \ge 0$, otherwise $a^+ = 0$.

THEOREM 2.2. Let F be nondegenerate. Let $\mu_r = EX^r < \infty$ for some $r > 1$.
(i) $m(t) \le (F^{-1}(1) - t)^+$ for all t. Equality holds if and only if $F(t) = F((F^{-1}(1))^-)$ or 1.
(ii) $m(t) \le \mu/\bar F(t) - t$ for all t. Equality holds if and only if $F(t) = 0$.
(iii) $m(t) < (\mu_r/\bar F(t))^{1/r} - t$ for all t.
(iv) $m(t) \ge (\mu - t)^+/\bar F(t)$ for $t < F^{-1}(1)$. Equality holds if and only if $F(t) = 0$.
(v) $m(t) > [\mu - F(t)(\mu_r/F(t))^{1/r}]/\bar F(t) - t$ for $t < F^{-1}(1)$.
(vi) $m(t) \ge (\mu - t)^+$ for all t. Equality holds if and only if $F(t) = 0$ or 1.
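These bounds are easy to spot-check numerically. The sketch below (an added illustration, not from the original text; it uses the unit exponential as an arbitrary test case, for which $\bar F(t) = e^{-t}$, $\mu = 1$ and $m(t) \equiv 1$) verifies bounds (ii), (iv) and (vi).

import math

mu = 1.0
def sf(t): return math.exp(-t)   # F-bar(t) for the unit exponential
def m(t): return 1.0             # the exponential MRL is constant

for t in (0.0, 0.5, 1.0, 3.0):
    upper_ii = mu / sf(t) - t                # bound (ii)
    lower_iv = max(mu - t, 0.0) / sf(t)      # bound (iv); F^{-1}(1) is infinite here
    lower_vi = max(mu - t, 0.0)              # bound (vi)
    assert lower_vi <= m(t) <= upper_ii and lower_iv <= m(t)
    print(t, lower_vi, lower_iv, m(t), upper_ii)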

Various nonparametric classes of life distributions have been defined using
MRL. (Recall, for simplicity we assume F(0) = 0 and that the mean is finite for these
definitions.)

DEFINITION 2.3. DMRL. A life distribution F has decreasing mean residual life
if its MRL m is a decreasing function.

DEFINITION 2.4. NBUE. A life distribution F is new better than used in expec-
tation if $m(0) \ge m(t)$ for all $t \ge 0$.

DEFINITION 2.5. IDMRL. A life distribution F has increasing then decreasing
mean residual life if there exists $\tau \ge 0$ such that m is increasing on $[0, \tau)$ and
decreasing on $[\tau, \infty)$.

Each of these classes above has an obvious dual class associated with it, i.e.,
increasing mean residual life, new worse than used in expectation (NWUE), and
decreasing then increasing mean residual life (DIMRL), respectively.
The DMRL class models aging that is adverse (e.g., wear occurs). Barlow,
Marshall and Proschan (1963) note that the DMRL class is a natural one in
reliability. See also Barlow and Proschan (1965). The older a DMRL unit is, the
shorter is the remaining life on the average. Chen, Hollander and Langberg (1983)
contains an excellent discussion of the uses of the DMRL class.
Burn-in procedures are needed for units with IMRL. E.g., integrated circuits
have been observed empirically to have decreasing failure rates; and thus they
satisfy the less restrictive condition of IMRL. Investigating job mobility, social
scientists refer to IMRL as inertia. See Morrison (1978) for example. Brown
(1983) studies approximating IMRL distributions by exponentials. He comments
that certain IMRL distributions, '... arise naturally in a class of first passage time
distributions for Markov processes, as first illuminated by Keilson'.
Note that DMRL implies NBUE. The NBUE class is a broader and less
restrictive class. Hall and Wellner (1981) show for NBUE distributions that the
coefficient of variation $\sigma/\mu \le 1$, where $\sigma^2 = \mathrm{Var}(X)$. They also comment on the
use of NBUE in renewal theory. Bhattacharjee (1984b) discusses a new notion,
age-smoothness, and its relation to NBUE for choosing life distribution models
for equipment subject to eventual wear. Note that burn-in is appropriate for
NWUE units.
For relationships of DMRL, IMRL, NBUE, and NWUE with other classes
used in reliability see the survey paper Hollander and Proschan (1984).
The IDMRL class models aging that is initially beneficial, then adverse. Si-
tuations where it is reasonable to postulate an IDMRL model include:
(i) Length of time employees stay with certain companies: An employee with a
company for four years has more time and career invested in the company than
an employee of only two months. The MRL of the four-year employee is likely
to be longer than the MRL of the two-month employee. After this initial IMRL
(this is called 'inertia' by social scientists), the processes of aging and retirement
yield a DMRL period.
(ii) Life lengths of humans: High infant mortality explains the initial IMRL.
Deterioration and aging explain the later DMRL stage.
See Guess (1984) and Guess, Hollander, and Proschan (1983) for further
examples and discussion. Bhattacharjee (1983) comments that Gertsbakh and
Kordonskiy (1969) graph the MRL function of a lognormal distribution that has
a 'bath-tub' shaped MRL (i.e., DIMRL).
Hall and Wellner (1981) characterize distributions with MRL's that have linear
segments. They use this characterization as a tool for choosing parametric
models. Morrison (1978) investigates linearly IMRL. He states and proves that
if F is a mixture of exponentials then F has linearly IMRL if and only if the mixing
distribution, say G, is a gamma. Howell (1984) studies and lists other references
on linearly DMRL.
In renewal theory the MRL arises naturally also. For a renewal process with
underlying distribution F, let $G(t) = (\int_0^t \bar F(u)\, du)/\mu$. G is the limiting distribution
of both the forward and the backward recurrence times. See Cox (1962) for more
details. Also, if the renewal process is in equilibrium then G is the exact distribution
of the recurrence times. Note that $\bar G(t) = m(t)\bar F(t)/\mu$. The failure rate of G, $r_G$, is
inversely related to the MRL of F, $m_F$; i.e., $r_G(t) = 1/m_F(t)$. Note, however, that
$r_F(t) \ne 1/m_F(t)$ is usually the case. See Hall and Wellner (1981), Rolski (1975),
Meilijson (1972), and Watson and Wells (1961) for related discussions.
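The identity $\bar G(t) = m(t)\bar F(t)/\mu$ is easy to check numerically. A small sketch (an added illustration; it assumes NumPy/SciPy and an arbitrary Weibull choice for F):

import numpy as np
from scipy import integrate
from scipy.special import gamma as gamma_fn

shape = 2.0
mu = gamma_fn(1.0 + 1.0 / shape)  # mean of the standard Weibull

def sf(t):
    return np.exp(-t**shape)      # F-bar(t)

def mrl(t):
    num, _ = integrate.quad(sf, t, np.inf)
    return num / sf(t)            # m_F(t)

for t in (0.5, 1.0, 2.0):
    Gbar_direct, _ = integrate.quad(lambda u: sf(u) / mu, t, np.inf)
    Gbar_via_mrl = mrl(t) * sf(t) / mu
    print(t, Gbar_direct, Gbar_via_mrl)  # the two columns should agree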
Kotz and Shanbhag (1980) establish a stability result concerning convergence
of an arbitrary sequence of MRL functions to a limiting MRL function. (See also
Bhattacharjee (1982).) They show an analogous stability result for hazard
measures. (When the failure rate for F exists and $\nu_F$ is F's hazard measure, then
$\nu_F(B) = \int_B r_F(t)\, dt$ for B a Borel set.) Their results imply that MRL functions can
provide more stable and reliable information than hazard measures when
assessing noncontinuous distributions from data.
In a multivariate setting, Lee (1985) shows the effect of dependence by total
positivity on MRL functions.
220 F. Guess and F. Proschan

3. Applications of mean residual life

A mean is easy to calculate and explain to a person not necessarily skilled in
statistics. To calculate the empirical MRL function, one does not need calculus.
Details of computing the empirical MRL follow.
Let $X_1, X_2, \ldots, X_n$ be a random sample from F. For simpler initial notation,
we assume first no ties. Later we allow for ties. Order the observations as

$$X_{1n} < X_{2n} < \cdots < X_{nn}. \eqno(3.1)$$

Let $X_{0n} = 0$. The empirical MRL function is defined as

$$m_n(t) = \frac{\sum_{i=k+1}^n (X_{in} - t)}{n - k} \quad \text{for } t \in [X_{kn}, X_{(k+1)n}), \eqno(3.2)$$

and $k = 0, 1, \ldots, n-1$; $m_n(t) = 0$ for $t \ge X_{nn}$.


Note that (3.2) is simply

$$m_n(t) = \frac{\text{Total time on test observed after } t}{\text{Number of units observed after } t}. \eqno(3.3)$$

The empirical MRL function at 0, $m_n(0) = \bar X_n = (\sum_{i=1}^n X_i)/n$, is just the usual
sample mean when no unit fails at time 0. If a unit fails at 0 then $m_n(0) > \bar X_n$.
If ties exist let

$$0 = \tilde X_{0l} < \tilde X_{1l} < \tilde X_{2l} < \cdots < \tilde X_{ll} \eqno(3.4)$$

be the distinct ordered times of failure,

$$n_i = \text{number of observed failures at time } \tilde X_{il}, \qquad s_i = n - \sum_{j=0}^{i} n_j, \eqno(3.5)$$

for $i = 0, 1, \ldots, l \le n$. Note that $n_i \ne 0$, $i = 1, \ldots, l$, while $n_0 = 0$ is allowed. Then

$$m_n(t) = \frac{\sum_{i=k+1}^{l} n_i(\tilde X_{il} - t)}{s_k} \quad \text{for } t \in [\tilde X_{kl}, \tilde X_{(k+1)l}), \qquad m_n(t) = 0 \quad \text{for } t \ge \tilde X_{ll}, \eqno(3.6)$$

for $k = 0, 1, \ldots, l-1$. Note that (3.6) is simply notation for (3.3).
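A minimal computational sketch of (3.2)-(3.6) (an added illustration with hypothetical data; ties are handled automatically because the estimator only needs the units still alive after t):

import numpy as np

def empirical_mrl(times, t):
    # m_n(t) = total time on test observed after t divided by the
    # number of units observed after t; see (3.3).
    x = np.asarray(times, dtype=float)
    alive = x > t
    if not alive.any():
        return 0.0               # t at or beyond the largest observation
    return float((x[alive] - t).sum() / alive.sum())

times = [43, 45, 53, 56, 56, 57]  # hypothetical failure ages with one tie
for t in (0, 44, 55):
    print(t, empirical_mrl(times, t))  # at t = 0 this is the sample mean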


We illustrate in the following example.

EXAMPLE 3.1. Bjerkedal (1960) studies the lifelengths of guinea pigs injected
with different amounts of tubercle bacilli. Guinea pigs are known to have a high
susceptibility to human tuberculosis, which is one reason for choosing this
species. We describe the only study (M) in which animals in a single cage are
under the same regimen. The regimen number is the common log of the number
of bacillary units in 0.5 ml of the challenge solution; e.g., regimen 4.3 corresponds

Table 3.1
Empirical mean residual life in days at the unique times of death for the 72 guinea pigs under
regimen 5.5. We include the empirical MRL at time 0 also.

Number of ties    Time of death    Empirical MRL        Number of ties    Time of death    Empirical MRL
$n_i$             $\tilde X_{il}$  $m_n(\tilde X_{il})$ $n_i$             $\tilde X_{il}$  $m_n(\tilde X_{il})$
0 0 141.85 1 123 114.92
1 43 100.24 1 126 116.40
1 45 99.64 1 128 119.17
1 53 92.97 1 137 114.96
2 56 92.66 1 138 119.14
1 57 93.05 1 139 123.76
1 58 93.46 1 144 124.70
1 66 86.80 1 145 130.21
1 67 87.16 1 147 135.33
1 73 82.47 1 156 133.76
1 74 82.80 1 162 135.75
1 79 79.10 1 174 132.00
2 80 80.79 1 178 137.14
3 81 84.15 1 179 146.62
1 82 84.69 1 184 153.42
2 83 86.90 1 191 159.73
1 84 87.59 1 198 168.00
1 88 85.26 1 211 172.22
1 89 85.98 1 214 190.38
2 91 87.55 1 243 184.43
2 92 90.40 1 249 208.17
1 97 87.34 1 329 153.80
2 99 89.40 1 380 128.50
2 100 92.83 1 403 140.67
1 101 94.18 1 511 49.00
3 102 100.94 1 522 76.00
1 103 102.80 1 598 0.00
1 104 104.79
1 107 104.88
1 108 107.13
1 109 109.55
1 113 109.07
1 114 111.79
1 118 111.64
1 121 112.67
to 2.2 × 10⁴ bacillary units per 0.5 ml (log₁₀(2.2 × 10⁴) = 4.342). Table 3.1
presents the data from regimen 5.5 and the empirical MRL.
Graphs of MRL provide useful information not only for data analysis but also
for presentations. Commenting on fatigue longevity and on preventive main-
tenance, Gertsbakh and Kordonskiy (1969) recommend the MRL function as
another helpful tool in such analyses. They graph the MRL for different distribu-
tions (e.g., Weibull, lognormal, and gamma). Hall and Wellner (1979) graph the
empirical MRL for Bjerkedal's (1960) regimen 4.3 and regimen 6.6 data. Bryson
and Siddiqui (1969) illustrate the graphical use of the empirical MRL on survival
data from chronic granulocytic leukemia patients. Using the standard Kaplan-
Meier estimator (e.g., see Lawless (1982), Nelson (1982), or Miller (1980)), Chen,
Hollander, and Langberg (1983) graph the empirical MRL analogue for censored
lifetime data.
Gertsbakh and Kordonskiy (1969) note that estimation of MRL is more stable
than estimation of the failure rate. Statistical properties of estimated means are
better than those of estimated derivatives (which enter into failure rates).
Yang (1978) shows that the empirical MRL is uniformly strongly consistent.
She establishes that $m_n$, suitably standardized, converges weakly to a Gaussian
process. Hall and Wellner (1979) require less restrictive conditions to apply these
results. They derive and illustrate the use of simultaneous confidence bands for
m. Yang (1978) comments that for $t > 0$, $m_n(t)$ is a slightly biased estimator.
Specifically, $E(m_n(t)) = m(t)(1 - F^n(t))$, where $F^n(t) = [F(t)]^n$. Note, however, that
$\lim_{n\to\infty} E(m_n(t)) = m(t)$. Thus, for larger samples $m_n(t)$ is practically unbiased.
See also Gertsbakh and Kordonskiy (1969).
Yang (1977) studies estimation of the MRL function when the data are ran-
domly censored. For parametric modeling Hall and Wellner (1981) use the empiri-
cal MRL plot. They observe that the empirical MRL function is a helpful addition
to other life data techniques, such as total time on test plots, empirical (cumula-
tive) failure rate functions, etc. The MRL plot detects certain aspects of the
distribution more readily than other techniques. See Hall and Wellner (1981), Hall
and Wellner (1979), and Gertsbakh and Kordonskiy (1969) for further comments.
When a parametric approach seems inadvisable, the MRL function can still be
used as a nonparametric tool. Broad classes defined in terms of MRL allow a
more flexible approach while still incorporating preliminary information. For ex-
ample, to describe a wear process, a DMRL is appropriate. When newly
developed components are initially produced, many may fall early (such early
failure is called infant mortality and this early stage is called the debugging stage).
Another subgroup tends to last longer. Depending on information about this latter
subgroup, we suggest IMRL (e.g., lifelengths of integrated circuits) or IDMRL
(e.g., more complicated systems where there are infant mortality, useful life, and
wear out stages).
Objective tests exist for these and other classes defined in terms of MRL. E.g.,
see Hollander and Proschan (1984) and Guess, Hollander and Proschan (1983).
To describe 'burn-in' the MRL is a natural function to use. Kuo's (1984)
Appendix 1 presents an excellent brief introduction to burn-in problems and
applications of MRL.
Actuaries apply MRL to setting rates and benefits for life insurance. In the
biomedical setting researchers analyze survivorship studies by MRL. For example,
see Elandt-Johnson and Johnson (1980) and Gross and Clark (1975).
Social scientists use IMRL for studies on job mobility, length of wars, duration
of strikes, etc. See Morrison (1978).
In economics MRL arises also. Bhattacharjee and Krishnaji (1981) present
applications of MRL for investigating landholding. Bhattacharjee (1984a) uses
NBUE for developing optimal inventory policies for perishable items with random
shelf life and variable supply.
Bhattacharjee (1982) observes MRL functions occur naturally in other areas
such as optimal disposal of an asset, renewal theory, dynamic programming, and
branching processes.

Acknowledgements

We thank Dr. J. Travis, Department of Biological Sciences, and Dr. D. Meeter,
Department of Statistics, Florida State University, for the Deevey (1947)
reference. We are also grateful to Dr. M. Bhattacharjee, Indian Institute of
Management, Calcutta, and to Dr. M. Hollander, Department of Statistics,
Florida State University, for discussions on MRL.

References
Barlow, R. E., Marshall, A. W. and Proschan, F. (1963). Properties of probability distributions with
monotone hazard rate. Ann. Math. Statist. 34, 375-389.
Barlow, R. E. and Proschan, F. (1965). Mathematical Theory of Reliability. Wiley, New York.
Barlow, R. E. and Proschan, F. (1981). Statistical Theory of Reliability and Life Testing. To Begin With,
Silver Spring, MD.
Bhattacharjee, M. C. (1984a). Ordering policies for perishable items with unknown shelf life/variable
supply distribution. Indian Institute of Management, Calcutta, Technical Report.
Bhattacharjee, M. C. (1984b). Tail behavior of age-smooth failure distributions and applications.
Indian Institute of Management, Calcutta, Technical Report.
Bhattacharjee, M. C. (1983). Personal communication.
Bhattacharjee, M. C. (1982). The class of mean residual lives and some consequences. SIAM J.
Algebraic Discrete Methods 3, 56-65.
Bhattacharjee, M. C. and Krishnaji, N. (1981). DFR and other heavy tail properties in modelling the
distribution of land and some alternative measures of inequality. Proceedings of the Indian Statisti-
cal Institute Golden Jubilee International Conference.
Bjerkedal, T. (1960). Acquisition of resistance in guinea pigs infected with different doses of virulent
tubercle bacilli. Amer. J. Hygiene 72, 130-148.
Brown, M. (1983). Approximating IMRL distributions by exponential distributions, with applications
to first passage times. Ann. Probab. 11, 419-427.
Bryson, M. C. and Siddiqui, M. M. (1969). Some criteria for aging. J. Amer. Statist. Assoc. 64,
1472-1483.
Chen, Y. Y., Hollander, M. and Langberg, N. A. (1983). Tests for monotone mean residual life,
using randomly censored data. Biometrics 39, 119-127.
Chiang, C. L. (1968). Introduction to Stochastic Processes in Biostatistics. Wiley, New York.
Chung, K. L. (1974). A Course in Probability Theory, 2nd ed. Academic Press, New York.
Cox, D. R. (1962). Renewal Theory. Methuen, London.
Deevey, E. S. (1947). Life tables for natural populations of animals. Quarterly Review of Biology 22,
283-314.
Elandt-Johnson, R. C. and Johnson, N. L. (1980). Survival Models and Data Analysis. Wiley, New
York.
Gertsbakh, I. B. and Kordonskiy, K. B. (1969). Models of Failure. Springer, New York.
Gross, A. J. and Clark, V. A. (1975). Survival Distributions: Reliability Applications in the Biomedical
Sciences. Wiley, New York.
Guess, F. (1984). Testing whether mean residual life changes trend. Ph.D. dissertation, Department
of Statistics, Florida State University.
Guess, F., Hollander, M. and Proschan, F. (1983). Testing whether mean residual life changes trend.
Florida State University Department of Statistics Report M665. (Air Force Office of Scientific
Research Report 83-160).
Hall, W. J. and Wellner, J. A. (1979). Estimation of mean residual life. University of Rochester
Department of Statistics Technical Report.
Hall, W. J. and Wellner, J. A. (1981). Mean residual life. In: M. Csörgő, D. A. Dawson, J. N. K.
Rao and A. K. Md. E. Saleh, eds., Statistics and Related Topics, North-Holland, Amsterdam,
169-184.
Hollander, M. and Proschan, F. (1984). Nonparametric concepts and methods in reliability. In: P.
R. Krishnaiah and P. K. Sen, eds., Handbook of Statistics, Vol. 4, Nonparametric Methods, North-
Holland, Amsterdam.
Howell, I. P. S. (1984). Small sample studies for linear decreasing mean residual life. In: M. S.
Abdel-Hameed, J. Quinn and E. Çınlar, eds., Reliability Theory and Models, Academic Press, New
York.
Keilson, J. (1979). Markov Chain Models--Rarity and Exponentiality. Springer, New York.
Kotz, S. and Shanbhag, D. N. (1980). Some new approaches to probability distributions. Adv. in
Appl. Probab. 12, 903-921.
Kuo, W. (1984). Reliability enhancement through optimal burn-in. IEEE Trans. Reliability 33,
145-156.
Lawless, J. F. (1982). Statistical Models and Methods for Lifetime Data. Wiley, New York.
Lee, M. T. (1985). Dependence by total positivity. Ann. Probab. 13, 572-582.
Meilijson, I. (1972). Limiting properties of the mean residual lifetime function. Ann. Math. Statist.
43, 354-357.
Miller, R. G. (1981). Survival Analysis. Wiley, New York.
Morrison, D. G. (1978). On linearly increasing mean residual lifetimes. J. Appl. Probab. 15, 617-620.
Nelson, W. (1982). Applied Life Data Analysis. Wiley, New York.
Rolski, T. (1975). Mean residual life. Bulletin of the International Statistical Institute, Book 4 (Pro-
ceedings of the 40th Session), 266-270.
Swartz, G. B. (1973). The mean residual lifetime function. IEEE Trans. Reliability 22, 108-109.
Watson, G. S. and Wells, W. T. (1961). On the possibility of improving the mean useful life of items
by eliminating those with short lives. Technometrics 3, 281-298.
Yang, G. L. (1978). Estimation of a biometric function. Ann. Statist. 6, 112-116.
Yang, G. (1977). Life expectancy under random censorship. Stochastic Process. Appl. 6, 33-39.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 225-249
Life Distribution Models and Incomplete Data*

Richard E. Barlow and Frank Proschan

O. Introduction

In this paper our objective is to introduce life distribution models and to
discuss methods useful for analyzing failure data, especially incomplete data. We
show how to express the likelihood functions for general distributions and in-
complete data. The likelihood function tends to be fairly flat for incomplete data.
For this reason the maximum likelihood estimator may be of limited value. It is
therefore especially important in this situation to assess a prior distribution for
parameters and plot the posterior distribution or its contours.
Inference based on the exponential model is discussed for general sampling
plans. Parameter estimators and credibility intervals are derived for special cases.
The Weibull distribution is a very useful model for life distribution studies and
also for the analysis of strength data. For these reasons, we describe failure
mechanisms leading to a Weibull life distribution model. Contour plotting methods
for analyzing life data based on a Weibull distribution are also given.

1. Likelihood

In this section we present a unified way of analyzing incomplete data for a large
number of failure distribution models. We often assume that the failure distribu-
tion F is absolutely continuous with density f and failure rate

$$r(x) = \frac{f(x)}{\bar F(x)}, \eqno(1.1)$$

where $\bar F(x) = 1 - F(x)$. We call

* This research was supported by the Air Force Office of Scientific Research (AFSC), USAF, under
Grant AFOSR-77-3179 with the University of California. Reproduction in whole or in part is
permitted for any purpose of the United States Government.

$$R(x) = \int_0^x r(u)\, du \eqno(1.2)$$

the hazard function associated with F. For general F, define

$$R(x) = -\ln \bar F(x) \eqno(1.3)$$

so that $\bar F(x) = \exp[-R(x)]$. Note that when F has a density f,

$$\frac{d}{dx}\left[-\ln \bar F(x)\right] = \frac{f(x)}{\bar F(x)} = r(x),$$

so that (1.2) and (1.3) agree in this case.
From (1.1) and (1.3) we see that

$$f(x) = r(x)\, e^{-R(x)}. \eqno(1.4)$$

For a discussion of these fundamental concepts, their inter-relationships and
illustrations in the case of well-known distributions, see Barlow and Proschan
(1975), Chapter 3.
Suppose now we observe n independent lifetimes $x_1, x_2, \ldots, x_n$ corresponding
to a given failure rate function, r. The joint density is

$$\prod_{i=1}^n f(x_i) = \left[\prod_{i=1}^n r(x_i)\right] \exp\left[-\sum_{i=1}^n R(x_i)\right]. \eqno(1.5)$$

The likelihood as a function of the failure rate function for data
$D = (x_1, x_2, \ldots, x_n)$ is then

$$L(r(u),\ u \ge 0 \mid D) = \left[\prod_{i=1}^n r(x_i)\right] \exp\left[-\sum_{i=1}^n R(x_i)\right]. \eqno(1.6)$$

EXAMPLE 1.1. The time-transformed exponential model. Suppose the survival
function is of the form

$$\bar F(x \mid \lambda) = e^{-\lambda R_0(x)}, \eqno(1.7)$$

where it is assumed that $R_0$ is known and differentiable but $\lambda$ is unknown. By
(1.2) we may write

$$\lambda R_0(x) = \int_0^x \lambda r_0(u)\, du.$$

It follows that the hazard function and the failure rate function are assumed
known up to the parameter $\lambda$. Another way to view the model is to consider time
x to be transformed by the function $R_0(\cdot)$. For this reason (1.7) is called the
time-transformed exponential model.
Let $x_1, x_2, \ldots, x_n$ be n independent observations given $\lambda$ from this model. The
likelihood is

$$L(\lambda \mid D) = \lambda^n \left[\prod_{i=1}^n r_0(x_i)\right] \exp\left[-\lambda \sum_{i=1}^n R_0(x_i)\right]. \eqno(1.8)$$

We conclude that $\sum_{i=1}^n R_0(x_i)$ and n are jointly sufficient for $\lambda$. If we use the
gamma prior for $\lambda$,

$$\pi(\lambda) = \frac{b^a \lambda^{a-1} e^{-b\lambda}}{\Gamma(a)},$$

we obtain as the posterior density for $\lambda$:

$$\pi(\lambda \mid D) = \left[b + \sum_{i=1}^n R_0(x_i)\right]^{a+n} \frac{\lambda^{a+n-1} \exp\{-\lambda[b + \sum_{i=1}^n R_0(x_i)]\}}{\Gamma(a+n)}. \eqno(1.9)$$

Inference proceeds exactly as for the exponential model, except that observation
$x_i$ of the exponential model is replaced by its time-transformed value $R_0(x_i)$. This
is valid assuming only that $R_0(\cdot)$ is continuous.
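A sketch of the conjugate updating implied by (1.9) (an added illustration; the transform R0, the prior parameters and the data below are arbitrary assumptions):

import numpy as np

def posterior_params(a, b, x, R0):
    # Gamma(a, b) prior on lambda; by (1.9) the posterior is
    # Gamma(a + n, b + sum of R0(x_i)).
    x = np.asarray(x, dtype=float)
    return a + x.size, b + R0(x).sum()

R0 = lambda x: x**2   # an assumed, Weibull-type time transform
a_post, b_post = posterior_params(a=2.0, b=1.0, x=[0.5, 1.2, 0.9], R0=R0)
print(a_post, b_post, a_post / b_post)  # the last value is the posterior mean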

1.1. The general sampling plan


In many practical life testing situations, the lifetime data collected are in-
complete. This may be due to the sampling plan itself or due to the unplanned
withdrawal of test units during the test. (For example, in a medical experiment,
one or more of the subjects may leave town, or suffer an accident, etc.)
We now describe one type of sampling plan. Suppose unit i having lifetime
distribution F is observed over an interval of time starting at age 0 and ending
at a random or nonrandom age. Termination of observation occurs in either one
of the following two ways:
(1) The ith unit is withdrawn or lost from observation at age $l_i \ge 0$; $l_i$ may be
random or nonrandom.
(2) The ith unit fails at age $X_i$, where $X_i$ is a random variable.
In addition, we require a technical assumption regarding the 'stopping rule'; i.e.,
a prescription for determining when to stop observation:
(3) Suppose unit lifetime, X, depends on an unknown parameter (or parame-
ters) θ. Observation on a unit may stop before unit lifetime is observed. Let STOP
be a rule or set of instructions which determines when observation of a unit stops.

STOP is noninformative relative to θ; that is, STOP provides no additional informa-
tion about θ other than that contained in the data.
It is important to remark that the 'stopping rule' is not necessarily the same as
the 'stopping time'.
To understand assumption (3), consider the sampling plan: put n items on life
test and stop testing at the kth observed failure. In this case, the stopping rule
depends only on k and is clearly independent of life distribution parameters since
k is fixed in advance of testing.
Suppose we stop testing at time $t_0$. Since $t_0$ is fixed in advance of testing, the
stopping rule is again independent of life distribution parameters.
For these sampling plans, the likelihood, up to a constant of proportionality,
depends only on the life distribution model and the observed data. This propor-
tionality constant depends on the stopping rule, but not on the unknown parameter.

1.2. Examples of informative stopping rules


Records are routinely kept on failures (partial or otherwise) and maintenance
actions on critical units such as airplane engines. Should a relatively new type of
unit start exhibiting problems earlier than anticipated, this may trigger early with-
drawal of units. If this happens, the stopping rule, which is contingent on per-
formance, may also be informative relative to life distribution parameters. This fact
needs to be considered when calculating the likelihood and analyzing the data.
The second example illustrates another case where assumption (3) is violated.
Suppose lifetime X is exponential with failure rate $\lambda$ and the random withdrawal
time, W, is also exponential with parameter $\varphi$. We observe the minimum of X and
W. Furthermore, suppose that X given $\lambda$ and W given $\varphi$ are judged independent.
Then the likelihood given an observed failure at x is

$$L(\lambda, \varphi \mid x) = \lambda\, e^{-\lambda x} e^{-\varphi x}.$$

If $\lambda$ and $\varphi$ are judged a priori independent then the posterior density of $\lambda$ is

$$\pi(\lambda \mid x) \propto \lambda\, e^{-\lambda x} \pi(\lambda),$$

where $\pi$ is the prior density for $\lambda$. However, if $\lambda$ and $\varphi$ are judged dependent
with joint prior $\pi(\lambda, \varphi)$, then the posterior density is

$$\pi(\lambda \mid x) \propto \lambda\, e^{-\lambda x} \int_0^\infty e^{-\varphi x} \pi(\lambda, \varphi)\, d\varphi.$$

The factor $\int_0^\infty e^{-\varphi x} \pi(\lambda, \varphi)\, d\varphi$, contributed by the stopping rule, depends on $\lambda$.
There is an important case not covered by the General Sampling Plan--namely
when it is known that a unit has failed within some time interval but the exact
time of failure is unknown.
The following simple example illustrates the way in which incomplete data can
arise.

EXAMPLE 1.2. Operating data are collected on an airplane part for a fleet of
airplanes. A typical age history for several engines is shown in Figure 1.1. The
crosses indicate the observed ages at failure. Ordered withdrawal times (nonfailure
times) are indicated by short vertical lines. In our example, units 2 and 4 fail at
respective times $x_{(1)}$ and $x_{(2)}$, while observation on units 1 and 3 is terminated
without failure at times $l_{(2)}$ and $l_{(1)}$, respectively.

Fig. 1.1. Age of airplane part at failure or withdrawal (unit number plotted against age u, with marks at $x_{(1)}, l_{(1)}, x_{(2)}, l_{(2)}$).

It is important to note that all data are plotted against the age axis. Figure 1.2
illustrates how events may have occurred in calendar time. For example, units 1
and 3 had not failed at the end of the calendar record.

1.3. Total time on test

The total time on test is an important statistic for the exponential model.

Fig. 1.2. Calendar records for airplane parts (unit number plotted against calendar time, from the start to the end of the calendar record).

DEFINITION 1.3. The total time on test T is the total of the periods of observa-
tion of all the units undergoing test. Excluded from this statistic are any periods
following death or withdrawal or preceding observation. Specifically, the periods
being totalled include only those in which a death or a withdrawal of a unit under
observation can be observed.

Fig. 1.3. Number of units in operation n(u) as a function of age u (a step function decreasing by one at each of $x_{(1)}, l_{(1)}, x_{(2)}, l_{(2)}$).

Let n(u) be the number of units observed to be operating at age u. The observed
function n(u), $u \ge 0$, for Example 1.2 is displayed in Figure 1.3. From Figure 1.3
we may readily calculate the total time on test T(t) corresponding to any
t, $0 \le t \le l_{(2)}$:

$$T(t) = \int_0^t n(u)\, du. \eqno(1.10)$$

For example, for t such that $x_{(2)} < t \le l_{(2)}$, we obtain from Figure 1.3:

$$T(t) = \int_0^t n(u)\, du = 4x_{(1)} + 3(l_{(1)} - x_{(1)}) + 2(x_{(2)} - l_{(1)}) + (t - x_{(2)}).$$

After simplifying algebraically, we obtain

$$T(t) = x_{(1)} + l_{(1)} + x_{(2)} + t. \eqno(1.11)$$

Note that the resulting expression, given in (1.11), can be obtained directly, since
$x_{(1)}$ and $x_{(2)}$ represent the observed lifetimes of the 2 units that are observed to
fail, $l_{(1)}$ represents the observed age of withdrawal of the unit first withdrawn from
observation, and finally t represents the age of the second unit at the instant t
specified.
Although in this small example the directly calculated expression (1.11) for
total time on test is simpler, Equation (1.10) is an important identity, since it
yields the total time on test accumulated by age t in terms of the (varying) number
of units on test at each instant during the interval [0, t] for any data set in which
the ages at death or withdrawal are observed. Thus it is a general formula applicable
in a great variety of problems in which data may be incomplete.
Although n(u) is a step function, the integral representation in (1.10) is advan-
tageous, since it is compact, mathematically tractable, and applicable in a great
variety of incomplete data situations. Of course, $\int_0^\infty n(u)\, du < \infty$ in practical
problems since observation ultimately ceases in order to analyze the data in hand.
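For concreteness, a small sketch (added here; the numeric ages are hypothetical stand-ins for $x_{(1)}, l_{(1)}, x_{(2)}, l_{(2)}$ of Example 1.2) computing T(t) of (1.10): since each unit is observed from age 0 until its death or withdrawal age or until t, whichever comes first, T(t) is a sum of minima.

def total_time_on_test(ages, t):
    # Each unit contributes min(age at death or withdrawal, t).
    return sum(min(age, t) for age in ages)

ages = [2.0, 3.0, 5.0, 7.0]           # hypothetical x(1), l(1), x(2), l(2)
print(total_time_on_test(ages, 6.0))  # 2 + 3 + 5 + 6 = 16, as in (1.11)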

1.4. The likelihood function for incomplete data


All recorded data are necessarily discrete. Likewise real world life distribution
models should also be discrete. Continuous life distribution models are convenient
approximations to real world life distributions. However, it is most convenient to
define initially the likelihood concept in the context of discrete models.
For our purposes, we find it preferable to define the likelihood concept for the
General Sampling Plan in the context of a discrete model. Computation of the
likelihood function is an intermediate step between specification of the prior
distribution on the space Θ and computation of the posterior distribution on Θ
given observed data D.
Suppose temporarily that the life distribution is discrete, i.e., failures can occur
only at times 1, 2, ...; similarly, withdrawals can occur only at these time points.
Suppose that the probability of failure of a given unit at x is $p(x \mid \theta)$. Suppose k
failures are observed at times $x_s$, $s = 1, \ldots, k$, and m withdrawals are observed
at times $l_t$, $t = 1, \ldots, m$. Failure and withdrawal times need not be distinct. All
observations are assumed statistically independent, given parameters. Withdrawal
times are produced by a stopping rule which is noninformative concerning θ.
For example, the stopping rule might specify that we observe a unit until failure
or until withdrawal, whichever comes first, where withdrawal time is specified in
advance. For this model, the probability of the observed outcome is

$$p(D \mid \theta) = \prod_{s=1}^k p(x_s \mid \theta) \prod_{t=1}^m P(l_t \mid \theta), \eqno(1.12)$$

where $P(u_j \mid \theta) \stackrel{\mathrm{def}}{=} \sum_{i=1}^\infty p(u_{j+i} \mid \theta)$ represents the probability that a specified unit
fails at age $u_{j+1}$ or later, given the parameter is θ. Note that the first product
corresponds to the k failures at respective ages $x_1, \ldots, x_k$, while the second
product corresponds to the m withdrawals at respective ages $l_1, \ldots, l_m$.
Another way to model withdrawal is to suppose there exists a random with-
drawal age W such that $P[W = t] = q(t)$, $t = 1, 2, \ldots$, with W independent of unit
lifetimes and of θ. Under this model, we suppose that we observe

$$\min(X, W) = \begin{cases} X & \text{if } X \le W, \\ W & \text{if } X > W. \end{cases}$$

Now for observed data $D = \{x_1, \ldots, x_k, l_1, \ldots, l_m\}$, the probability of the observed
outcome given parameter θ is

$$p(D \mid \theta) = \prod_{t=1}^m q(l_t) \prod_{s=1}^k Q(x_s) \prod_{s=1}^k p(x_s \mid \theta) \prod_{t=1}^m P(l_t \mid \theta), \eqno(1.13)$$

where $Q(u_j) \stackrel{\mathrm{def}}{=} \sum_{i=1}^\infty q(u_{j+i})$ represents the probability that $W > u_j$. Note that
(1.12) and (1.13) differ only by a factor that does not depend on θ. Thus, relative
to calculating the MLE of θ, the two models for withdrawals (withdrawal deterministic
or withdrawal random) do not differ essentially.
There are many practical testing situations in which withdrawals occur as a
result of chance mechanisms unrelated to the parameter θ of the lifetime distribu-
tion. For example, concluding the collection of data at a specified chronological
time has the effect of withdrawing from observation those units still alive at that
point in time. In Figure 1.2, this phenomenon is illustrated by units 1 and 3. Other
chance mechanisms causing withdrawal at a random age result from human errors
and accidents. The net effect of the various stopping rules that are unrelated to
the value of the parameter θ is summarized in the factor $g(x, l)$ in the expression
for the probability of the observed outcome:

$$p(D \mid \theta) = g(x, l) \prod_{s=1}^k p(x_s \mid \theta) \prod_{t=1}^m P(l_t \mid \theta). \eqno(1.14)$$

DEFINITION 1.4. The likelihood, $L(\theta \mid D)$, is the probability of the observed
outcome, $p(D \mid \theta)$, considered as a function of the parameter θ given the data, D.
In the case of a continuous model, the corresponding likelihood will have this
interpretation relative to a discrete probability approximation.

It follows from (1.14) that

$$L(\theta \mid D) \propto \prod_{s=1}^k p(x_s \mid \theta) \prod_{t=1}^m P(l_t \mid \theta). \eqno(1.15)$$

From Bayes' Theorem, it is clear that we need not know $g(x, l)$ in order to
compute the posterior density of θ.
In this subsection, we have thus far confined our discussion to the case of
discrete time life distributions since the basic concepts are easier to grasp in this
case. However, in the case of continuous time life distributions, the likelihood
concept is equally relevant, and in fact the expression for the likelihood $L(\theta \mid D)$
assumes a rather elegant form if we use n(u), the number-on-test function. In the
continuous case, $p(x \mid \theta)$ is replaced by the probability density element $f(x \mid \theta)$.

THEOREM 1.5. Given the failure rate, independent observations are made under the
General Sampling Plan. Let $x_1, x_2, \ldots, x_k$ denote the k observed failure ages. Let
n(u) denote the number of units under observation at age u, $u \ge 0$, and r(u) denote
the failure rate function of the unit at age u. Then the likelihood of the failure rate
function r(u), having observed data D described above, is given by

$$L(r(u),\ u \ge 0 \mid D) \propto \begin{cases} \left[\prod_{s=1}^k r(x_s)\right] \exp\left[-\int_0^\infty n(u) r(u)\, du\right], & k \ge 1, \\[4pt] \exp\left[-\int_0^\infty n(u) r(u)\, du\right], & k = 0. \end{cases} \eqno(1.16)$$

PROOF. To justify (1.16), we first note that the underlying random events are
the ages at failure or withdrawal. Thus the likelihood of the observed outcome is
specified by the likelihood of the failure ages and survivals until withdrawal. By
Assumption (3) of the General Sampling Model, we need not include any factor
contributed by the stopping rule, since the stopping rule does not depend on the
failure rate function r(·).
To calculate the likelihood, we use the fact that given r(·),

$$\bar F(x) = \exp\left[-\int_0^x r(u)\, du\right].$$

(See (1.4).) Specifically, if a unit is observed from age 0 until it is withdrawn at
age $l_t$ without having failed during the interval $[0, l_t]$, a factor $\exp[-\int_0^{l_t} r(u)\, du]$
is contributed to the likelihood. Thus, if no units fail during the test (i.e., k = 0),
the likelihood of the observed outcome is proportional to the expression given in
(1.16) for k = 0.
On the other hand, if a unit is observed from age 0 until it fails at age $x_s$, a
factor

$$r(x_s) \exp\left[-\int_0^{x_s} r(u)\, du\right]$$

is contributed to the likelihood. The exponential factor corresponds to the survival
of the unit during $[0, x_s]$, while $r(x_s)$ represents the rate of failure at age $x_s$. (Note
that if we had retained the differential element dx, the corresponding expression
$r(x_s)\, dx$ would approximate an actual probability: the conditional probability of
a failure during the interval $(x_s, x_s + dx)$ given survival to age $x_s$.)
The likelihood expression in (1.16) corresponding to the outcome $k \ge 1$ now is
clear. The exponential factor corresponds to the survival intervals of both units
that failed under observation and units that were withdrawn before failing:

$$\int_0^\infty n(u) r(u)\, du = \sum \int_0^{x_s} r(u)\, du + \sum \int_0^{l_t} r(u)\, du,$$

where the first sum is taken over units that failed while the second sum is taken
over units that were withdrawn. The upper limit '∞' is for simplicity and intro-
duces no technical difficulty, since $n(u) \equiv 0$ after observation ends. □
The likelihood (1.16) applies for any absolutely continuous life distribution. In
the important special case of an exponential life distribution model,
$f(x \mid \lambda) = \lambda e^{-\lambda x}$, the likelihood of the observed outcome takes the simpler form

$$L(\lambda \mid D) \propto \begin{cases} \lambda^k \exp\left[-\lambda \int_0^\infty n(u)\, du\right], & k \ge 1, \\[4pt] \exp\left[-\lambda \int_0^\infty n(u)\, du\right], & k = 0. \end{cases} \eqno(1.17)$$

The following theorem is obvious from (1.17).

THEOREM 1.6. Assume that the test plan satisfies Assumptions (1), (2) and (3) of
the General Sampling Plan. Assume that k failures and the number of units operating
at age u, n(u), $u \ge 0$, are observed and that the model is the exponential density
$f(x \mid \lambda) = \lambda e^{-\lambda x}$. Then
(a) k and $T = \int_0^\infty n(u)\, du$ together constitute a sufficient statistic for $\lambda$;
(b) k/T is the MLE for $\lambda$.

Note that the MLE, k/T, for $\lambda$ represents the number of observed failures divided
by the total time on test.
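A simulation sketch of Theorem 1.6 (an added illustration; the rate 0.5, the fixed withdrawal age 3.0 and the sample size are arbitrary assumptions):

import numpy as np

rng = np.random.default_rng(0)
true_rate, withdraw_age, n = 0.5, 3.0, 1000

lifetimes = rng.exponential(1.0 / true_rate, size=n)
observed = np.minimum(lifetimes, withdraw_age)   # age at failure or withdrawal
k = int((lifetimes <= withdraw_age).sum())       # number of observed failures
T = float(observed.sum())                        # total time on test

print(k / T)  # the MLE k/T; it should be close to the true rate 0.5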
The maximum likelihood estimator is the mode of the posterior density corre-
sponding to a uniform prior (over an interval containing the MLE). A uniform
prior is often a convenient reference prior. Under suitable circumstances, the
analyst's actual posterior distribution will be approximately what it would have
been had the analyst's prior been uniform. To ignore the departure from uni-
formity, it is sufficient that the analyst's actual prior density changes gently in the
region favored by the data and that it does not too strongly favor some other
region. This result is rigorously expressed in the Principle of Stable
Estimation [see Edwards, Lindman and Savage (1963)]. DeGroot (1970), pages
198-201, refers to this result under the name of precise measurement.

EXAMPLE 1.7. The exact likelihood can be calculated explicitly for specified
stopping rules. Suppose that withdrawal times are determined in advance. Then
the likelihood is

$$L(r(u),\ u \ge 0 \mid D) = \left[\prod_{s=1}^k n(x_s^-) r(x_s)\right] \exp\left[-\int_0^\infty n(u) r(u)\, du\right], \eqno(1.18)$$

where $n(x_s^-)$ is the number surviving just prior to the observed failure at age $x_s$.
To see this, consider the airplane engine data in Example 1.2. Using Figure 1.3 as
a guide, the likelihood will have the following factors:
1. For the interval $[0, x_{(1)}]$ we have the contribution

$$4 r(x_{(1)}) \exp\left[-\int_0^{x_{(1)}} 4 r(u)\, du\right]$$

corresponding to the probability that all 4 units survive to $x_{(1)}$ and the first failure
occurs at $x_{(1)}$.
2. For the interval $(x_{(1)}, l_{(1)}]$ we have the contribution

$$\exp\left[-\int_{x_{(1)}}^{l_{(1)}} 3 r(u)\, du\right]$$

corresponding to the probability that the remaining 3 units survive this interval.
3. For the interval $(l_{(1)}, x_{(2)}]$ we have the contribution

$$2 r(x_{(2)}) \exp\left[-\int_{l_{(1)}}^{x_{(2)}} 2 r(u)\, du\right]$$

corresponding to the probability that the remaining 2 units survive to $x_{(2)}$ and the
failure occurs at $x_{(2)}$.
4. For the interval $(x_{(2)}, l_{(2)}]$ we have the contribution

$$\exp\left[-\int_{x_{(2)}}^{l_{(2)}} r(u)\, du\right]$$

corresponding to the conditional probability that the remaining unit survives to
age $l_{(2)}$. Multiplying together these conditional probabilities, we obtain a likelihood
having the form shown in (1.18).

2. Parameter estimators and credible intervals

In the previous section we saw how to calculate the likelihood function for
general life distributions. This is required in order to calculate the posterior
distribution. Calculation, and possibly graphical display, of the posterior density
would conceivably complete our data analysis.
If we assume a life density $p(x \mid \theta)$ and $\pi(\theta)$ is the prior, then
$p(x, \theta) = p(x \mid \theta)\pi(\theta)$ is the joint density and $p(x) = \int_\Theta p(x \mid \theta)\pi(\theta)\, d\theta$ is the mar-
ginal or predictive density. Given data D and the posterior density $\pi(\theta \mid D)$, the
predictive density is

$$p(x \mid D) = \int_\Theta p(x \mid \theta)\pi(\theta \mid D)\, d\theta.$$

If asked to give the probability of survival until time t, we would calculate

$$P(X > t \mid D) = \int_t^\infty p(x \mid D)\, dx.$$

EXAMPLE 2.1. For the exponential density $\lambda e^{-\lambda x}$, k observed failures, T total
time on test, and the General Sampling Plan, the likelihood is proportional to
$\lambda^k e^{-\lambda T}$. For the natural conjugate prior,

$$\pi(\lambda) = \frac{b^a \lambda^{a-1} e^{-b\lambda}}{\Gamma(a)},$$

the posterior density is

$$\pi(\lambda \mid k, T) = (b + T)^{a+k} \lambda^{a+k-1} e^{-(b+T)\lambda} / \Gamma(a + k).$$

In this case the probability of survival until time t is

$$P(X > t \mid k, T) = \int_0^\infty e^{-\lambda t} \pi(\lambda \mid k, T)\, d\lambda = \left[\frac{b + T}{b + t + T}\right]^{a+k}. \eqno(2.1)$$
+t+ T/

2.1. Bayes estimators


We will need the following notation:

$$E[\theta] = \int_\Theta \theta\, \pi(\theta)\, d\theta \quad \text{and} \quad E[\theta \mid D] = \int_\Theta \theta\, \pi(\theta \mid D)\, d\theta.$$

Of course, $E[\theta]$ is the mean of the prior distribution while $E[\theta \mid D]$ is the
mean of the posterior distribution.
We wish to select a single value as representing our 'best' estimator of the
unknown parameter θ. To define the best estimator we must specify a criterion
of goodness (or equivalently, of poorness). Statisticians measure the poorness of
an estimator $\hat\theta$ by the expected 'loss' resulting from their estimator $\hat\theta$. One
very popular loss function is squared error loss: specifically, having observed data
D and determined the posterior density $\pi(\theta \mid D)$, the expected squared error loss
is given by

$$E[(\hat\theta - \theta)^2 \mid D]; \eqno(2.2)$$

the expectation is calculated with respect to the posterior density $\pi(\theta \mid D)$. We
choose a point estimator $\hat\theta$ so as to minimize the expected squared error loss
in (2.2); i.e., we choose $\hat\theta$ to satisfy

$$\min_a E[(\theta - a)^2 \mid D] = E[(\theta - \hat\theta)^2 \mid D]. \eqno(2.3)$$

To find the minimizing value $\hat\theta$, we add and subtract $E(\theta \mid D)$ in the loss
function to obtain

$$E[(\theta - a)^2 \mid D] = E[(\theta - E(\theta \mid D))^2 \mid D] + [E(\theta \mid D) - a]^2.$$

Since we wish to minimize the right hand side, we set $a = E(\theta \mid D)$, which
then represents the solution to (2.3). The resulting estimator, $E(\theta \mid D)$, the
mean of the posterior, is called the Bayes estimator with respect to squared error
loss.

THEOREM 2.2. The Bayes estimator of a parameter θ with respect to squared loss
is the mean $E(\theta \mid D)$ of the posterior density.

Another loss function in popular use is the absolute value loss function:

$$E[\,|\hat\theta - \theta|\, \mid D]. \eqno(2.4)$$

To find the minimizing estimator using this criterion, we choose $\hat\theta$ to satisfy:

$$\min_a E[\,|\theta - a|\, \mid D] = E[\,|\theta - \hat\theta|\, \mid D]. \eqno(2.5)$$

It is easy to show:

THEOREM 2.3. The Bayes estimator of a parameter θ with respect to the absolute
value loss function is the median of the posterior density. Specifically, the estimator
$\hat\theta$ satisfies

$$\int_0^{\hat\theta} \pi(\theta \mid D)\, d\theta = \int_{\hat\theta}^\infty \pi(\theta \mid D)\, d\theta = \tfrac{1}{2}. \eqno(2.6)$$

Of course, the prior density and the loss function enter crucially in determining
a 'best' estimator. However, no matter what criterion is used, all the information
concerning the unknown parameter θ is contained in the posterior density. Thus,
a graph of $\pi(\theta \mid D)$ is more informative than any single parameter of the posterior
density, whether it be the mean, the median, the mode, a quartile, etc.
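To make Theorems 2.2 and 2.3 concrete, the following sketch (an added illustration with an arbitrary gamma-shaped posterior on a grid) computes both Bayes estimators numerically: the posterior mean for squared error loss and the posterior median for absolute value loss.

import numpy as np

theta = np.linspace(0.01, 10.0, 4000)      # grid over the parameter space
post = theta**4 * np.exp(-2.0 * theta)     # an assumed, unnormalized posterior
dtheta = theta[1] - theta[0]
post /= post.sum() * dtheta                # normalize numerically

mean = (theta * post).sum() * dtheta       # Bayes estimator, squared error loss
cdf = np.cumsum(post) * dtheta
median = theta[np.searchsorted(cdf, 0.5)]  # Bayes estimator, absolute value loss
print(mean, median)                        # roughly 2.5 and 2.34 for this example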

EXAMPLE 2.4. Assume that lifetime is governed by the exponential model,
$\theta^{-1} e^{-x/\theta}$. Suppose we conjecture that $E[\theta \mid k, T]$, for a sampling plan with (k, T)
sufficient, is linear in T for fixed k. It turns out that such a linear relationship
holds if and only if we use as our prior the natural conjugate prior:

$$\pi(\theta) = \frac{b^a \theta^{-(a+1)} e^{-b/\theta}}{\Gamma(a)}.$$

(See Diaconis and Ylvisaker (1979) for a proof of this result and for more general
results of this kind.) The corresponding Bayes estimator with respect to squared
error loss is

$$E[\theta \mid k, T] = \frac{b + T}{a + k - 1}. \eqno(2.7)$$

However, the natural conjugate prior would not be appropriate if we believed,
for example, that θ could assume values only in two disjoint intervals. Under this
belief, a bimodal prior density would be more natural, and the corresponding
estimator $E[\theta \mid D]$ would very likely be difficult to obtain in closed form
such as in (2.7). However, $E[\theta \mid D]$ could be computed by numerical integration.
There are many other functions of unknown parameters for which we may want
the Bayes estimator with respect to squared error loss. For example, we may wish
to estimate the probability of survival until age t for the exponential model; i.e.,
estimate

$$g(\theta) = \exp\left[-\frac{t}{\theta}\right]. \eqno(2.8)$$

It is easy to show in this case that

$$\hat g = E[g(\theta) \mid D] \eqno(2.9)$$

is the Bayes estimator. If $\pi(\theta)$ is the natural conjugate prior, then it is easy to
verify that

$$\hat g = \left[\frac{b + T}{b + t + T}\right]^{a+k};$$

i.e., this is the Bayes estimator of the probability of survival to age t given total
time on test T and k observed failures. Note that this $\hat g$ is precisely the marginal
probability of survival until time t.

2.2. Credible intervals

As we have seen, Bayes estimators correspond to certain functions of the
posterior distribution such as the mean, the mode, etc. A credible set or interval
is another way of presenting a partial description of the posterior distribution.
Specifically, we choose a set C on the positive axis (since we are dealing with
lifetime) such that

$$\int_C \pi(\theta \mid D)\, d\theta = 1 - \alpha. \eqno(2.10)$$

Such a set C is called a Bayesian $(1 - \alpha)100$ percent credible set (or credible
interval if C is an interval) for θ.
Obviously, the set C is not uniquely determined. It would seem desirable to
choose the set C to be as small (e.g., least length, area, volume) as possible. To
achieve this, we seek a constant $c_{1-\alpha}$ and a corresponding set C such that

$$C = \{\theta : \pi(\theta \mid D) \ge c_{1-\alpha}\} \eqno(2.11)$$

and

$$\int_C \pi(\theta \mid D)\, d\theta = 1 - \alpha. \eqno(2.12)$$

A set C satisfying (2.11) and (2.12) is called a highest posterior density credible set
(Box and Tiao, 1973). In general, C would have to be determined numerically with
the aid of a computer.
For the exponential model $\lambda e^{-\lambda x}$, the natural conjugate prior is the gamma
density. Since the gamma density is a generalization of the chi-square density, we
recall the definition of the latter so that we can make use of it to determine
credible intervals for the failure rate of the exponential.

DEFINITION 2.5. A random variable, $\chi^2(n)$, having density

$$f_{\chi^2(n)}(x) = \frac{x^{n/2-1} \exp[-x/2]}{2^{n/2}\, \Gamma(n/2)} \quad \text{for } x \ge 0,\ n = 1, 2, \ldots, \eqno(2.13)$$

is called a chi-square random variable with n degrees of freedom (d.f.).

A table of percentage points of the chi-square distribution may be found in
Pearson and Hartley (1958). In addition, chi-square programs are available for
more extensive calculations using electronic computers and programmable calcu-
lators.
It is easy to verify that the $\chi^2$ random variable with 2n d.f. is distributed as
$2(Y_1 + Y_2 + \cdots + Y_n)$, where $Y_1, Y_2, \ldots, Y_n$ are independent, exponentially dis-
tributed random variables with mean one. Thus, we obtain the following result,
useful in computing credibility intervals for the failure rate of the exponential
model with corresponding natural conjugate prior.

THEOREM 2.6. Let k failures and total time on test T be observed under sampling
assumptions (1), (2) and (3) (Section 1) for the exponential model $\lambda e^{-\lambda x}$. Let $\lambda$
have the posterior density corresponding to the natural conjugate prior

$$\pi(\lambda) = \frac{b^a \lambda^{a-1} e^{-b\lambda}}{\Gamma(a)}$$

with a an integer. Then

$$P\left[\frac{\chi^2_{\alpha/2}[2(a+k)]}{2(b+T)} \le \lambda \le \frac{\chi^2_{1-\alpha/2}[2(a+k)]}{2(b+T)} \,\Big|\, D\right] = 1 - \alpha, \eqno(2.14)$$

where $\chi^2_\beta(n)$ is the $100\beta$ percentage point of a chi-square distribution with n d.f.; i.e.,

$$\int_0^{\chi^2_\beta(n)} f_{\chi^2(n)}(x)\, dx = \beta.$$

REMARK. Because of the lack of symmetry of the $\chi^2$ density, the interval in
(2.14) is not the highest posterior density credible interval.

PROOF. It is easy to verify that $(b+T)\lambda$, given the data, has the gamma density

$$\frac{u^{a+k-1} e^{-u}}{\Gamma(a+k)}, \qquad u \ge 0,$$

corresponding to the density of $Y_1 + \cdots + Y_{a+k}$, where the Y's are independent
unit exponential random variables. Hence

$$2\lambda(b+T) \stackrel{\mathrm{st}}{=} 2(Y_1 + \cdots + Y_{a+k}),$$

where $\stackrel{\mathrm{st}}{=}$ denotes stochastic equality; i.e., $2\lambda(b+T)$ has a chi-square density
with $2(a+k)$ d.f. □

COROLLARY 2.7. For $2(a+k)$ large (say $2(a+k) > 30$), the normal approxima-
tion provides the approximate credibility statement

$$P\left[\frac{(a+k) + (a+k)^{1/2} z_{\alpha/2}}{b+T} \le \lambda \le \frac{(a+k) + (a+k)^{1/2} z_{1-\alpha/2}}{b+T} \,\Big|\, D\right] \approx 1 - \alpha, \eqno(2.15)$$

where $z_\alpha$ satisfies $\int_{-\infty}^{z_\alpha} \varphi(u)\, du = \alpha$ and $\varphi(u) = (1/\sqrt{2\pi})\, e^{-u^2/2}$ is the normal density
with mean 0 and variance 1.



PROOF. Since the $\chi^2(2n)$ random variable can be written as

$$\chi^2(2n) = 2(Y_1 + Y_2 + \cdots + Y_n),$$

where $Y_1, Y_2, \ldots, Y_n$ are independent unit exponentials, the Central Limit
Theorem (e.g., Hoel, Port and Stone, 1971) applies. Note that $E\chi^2(2n) = 2n$ and
$\mathrm{Var}[\chi^2(2n)] = 4n$. Thus,

$$\frac{\chi^2(2n) - 2n}{\sqrt{4n}}$$

is approximately normal with mean 0 and variance 1 by the Central Limit
Theorem. □
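A sketch (an added illustration; a, b, k, T and α are arbitrary, and SciPy is assumed available) that computes the exact interval (2.14) and its normal approximation (2.15):

from scipy.stats import chi2, norm

a, b, k, T, alpha = 3, 2.0, 12, 100.0, 0.10
df = 2 * (a + k)                                    # 30 degrees of freedom

lo = chi2.ppf(alpha / 2, df) / (2 * (b + T))        # exact bounds, (2.14)
hi = chi2.ppf(1 - alpha / 2, df) / (2 * (b + T))

z_lo, z_hi = norm.ppf(alpha / 2), norm.ppf(1 - alpha / 2)
lo_n = ((a + k) + (a + k) ** 0.5 * z_lo) / (b + T)  # approximation, (2.15)
hi_n = ((a + k) + (a + k) ** 0.5 * z_hi) / (b + T)

print((lo, hi), (lo_n, hi_n))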

COROLLARY 2.8. Let k failures and T total time on test be observed under the
General Sampling Plan assumptions (1), (2) and (3) (Section 1), for the exponential
model $\theta^{-1} e^{-x/\theta}$. Let θ have the natural conjugate prior with integer a. Then

$$P\left[\frac{2(b+T)}{\chi^2_{1-\alpha/2}[2(a+k)]} \le \theta \le \frac{2(b+T)}{\chi^2_{\alpha/2}[2(a+k)]} \,\Big|\, D\right] = 1 - \alpha. \eqno(2.16)$$

PROOF. Since θ has the natural conjugate prior distribution for the model
$\theta^{-1} e^{-x/\theta}$, then $\lambda = 1/\theta$ has the natural conjugate prior for the model $\lambda e^{-\lambda x}$.
(2.16) follows from (2.14). □

3. The Weibull distribution

Whenever possible, the choice of a life distribution model should be based on


the underlying failure mechanisms. Simple structures composed of statistically
independent components have been used to derive life distribution models valid
when the number of structural components is very large.
Suppose a structure of n components fails as soon as k components fail. If also
component lifetimes are judged identically distributed and independent, then there
are only two possible limiting structure life distributions in the sense that there
exist sequences of normalizing constants $\{a_n\}_{n=1}^\infty$, $\{\lambda_n\}_{n=1}^\infty$ such that for all
real x,

$$\lim_{n\to\infty} P\{\lambda_n(X_{(k)} - a_n) \le x\}$$

exists, where $X_{(k)}$ denotes the structure lifetime (the kth smallest component
lifetime). The limit is either

$$\frac{1}{(k-1)!} \int_0^{[\lambda(x-a)]^\alpha} e^{-u}\, u^{k-1}\, du, \quad \alpha, \lambda > 0,\ x \ge a \ge 0, \tag{3.1}$$

or

$$\frac{1}{(k-1)!} \int_0^{\exp[\lambda(x-a)]} e^{-u}\, u^{k-1}\, du, \quad -\infty < x < \infty,\ -\infty < a < \infty,\ \lambda > 0 \tag{3.2}$$

(Smirnov, 1952). In both cases a is a location parameter and $\lambda$ is a scale
parameter, while $\alpha$ and k are shape parameters.
If k = 1, then (3.1) becomes

$$W(x\,|\,a, \lambda, \alpha) = 1 - \exp\{-[\lambda(x-a)]^\alpha\}, \quad x \ge a \ge 0, \tag{3.1'}$$

the Weibull distribution, and (3.2) becomes

$$\Lambda(x\,|\,a, \lambda) = 1 - \exp\{-e^{\lambda(x-a)}\}, \quad -\infty < x < \infty. \tag{3.2'}$$


Thus, if X is the structure lifetime, then either X or exp(X) has a Weibull distribu-
tion. The failure rate for the Weibull distribution of (3.1') is

$$r_W(x) = \alpha \lambda^\alpha (x-a)^{\alpha-1} \quad \text{for } x \ge a, \tag{3.3}$$

and 0 elsewhere. In the second case it is

$$r_\Lambda(x) = \lambda \exp[\lambda(x-a)]. \tag{3.4}$$

For all parameter values, (3.4) is increasing in x. Hence, if we wish to allow the
possibility that the failure rate may be decreasing, we must choose the Weibull
model, (3.1'), with $\alpha < 1$.
The Weibull model appears to furnish an adequate fit for some strand lifetime
data with estimated values of $\alpha$ less than 4. On the other hand, it has been
empirically observed that for strength data, estimates for $\alpha$ using the Weibull
model are often large (> 27 in some cases). This suggests that (3.2') may provide
a better model for strand strength data.

3.1. Inference for the Weibull distribution

The Weibull life distribution model has three parameters: a, $\lambda$, and $\alpha$. The
parameter a > 0 is a threshold value for lifetime; before time a we expect to see
no failures. If there is no physical reason to justify a positive threshold value, the
analyst should use the two parameter Weibull model. The simplest model
compatible with prior knowledge concerning physical processes will often provide
the most insight. The Weibull density is

$$f(x\,|\,a, \alpha, \lambda) = \alpha \lambda^\alpha (x-a)^{\alpha-1}\, e^{-[\lambda(x-a)]^\alpha} \tag{3.5}$$

for $x \ge a$ and 0 elsewhere.



Usually we wish to quantify our uncertainty about a particular aspect of the life
distribution, such as the probability of surviving x hours. For the three parameter
Weibull model, this is given by

$$\bar F(x\,|\,a, \lambda, \alpha) = \exp\{-[\lambda(x-a)]^\alpha\}. \tag{3.6}$$

It is clearly sufficient to assess our uncertainty concerning a, $\lambda$, and $\alpha$.


Suppose data are obtained under the General Sampling Plan (Section 1). Let
$x_1, x_2, \ldots, x_k$ denote the unordered observed failure ages and n(u) the number
surviving until age u. Then by Theorem 1.6 in Section 1, the likelihood is given
by

$$L(a, \alpha, \lambda\,|\,D) \propto \alpha^k \lambda^{k\alpha} \left[\prod_{i=1}^k (x_i - a)^{\alpha-1}\right] \exp\left[-\lambda^\alpha \int_a^\infty \alpha\, n(u)\,(u-a)^{\alpha-1}\, du\right]$$

for $a \le x_i$ and $\alpha, \lambda > 0$. Suppose there are m withdrawals and we pool observed
failure and loss times and relabel them as

$$0 \equiv t_{(0)} \le t_{(1)} \le t_{(2)} \le \cdots \le t_{(k+m)} \le t.$$

Then, for $a \le x_i$, $i = 1, 2, \ldots, k$, we have

$$\int_a^\infty n(u)\,(u-a)^{\alpha-1}\, du = \sum_{i=1}^{k+m} (n-i+1) \int_{t_{(i-1)}}^{t_{(i)}} (u-a)^{\alpha-1}\, du + (n-k-m) \int_{t_{(k+m)}}^{t} (u-a)^{\alpha-1}\, du. \tag{3.7}$$

Observation is confined to the age interval [0, t].


Two important deductions can be made from (3.7):
1. The only sufficient statistic for all three parameters (or for $\alpha$ and $\lambda$ alone
when a = 0) is the entire data set.
2. No natural conjugate family of priors is available for all three parameters (or
for $\alpha$ and $\lambda$ alone when a = 0). Consequently, the posterior distribution must be
computed using numerical integration [see Diaconis and Ylvisaker (1979)].
For most statistical investigations, a and perhaps also $\alpha$ would be considered
nuisance parameters. By matching our joint prior density in a, $\alpha$ and $\lambda$ with the
likelihood (3.7), we can calculate the posterior density, $\pi(a, \alpha, \lambda\,|\,D)$. For example,
if a is considered a nuisance parameter, then we would calculate the marginal
density on $\alpha$ and $\lambda$ as

$$\pi(\alpha, \lambda\,|\,D) = \int_0^\infty \pi(a, \alpha, \lambda\,|\,D)\, da.$$

3.2. Credibility regions for two parameter models

Let $\pi(\alpha, \lambda\,|\,D)$ be the posterior density for a two parameter model such as the
Weibull model above with scale parameter $\lambda$ and shape parameter $\alpha$. To find the
so-called 'highest posterior density' credibility region for $\alpha$ and $\lambda$ simultaneously
(Section 2), we find a constant $c(\beta)$ by sequential search such that

$$R = \{(\alpha, \lambda) \mid \pi(\alpha, \lambda\,|\,D) \ge c(\beta)\} \tag{3.8}$$

and

$$\iint_R \pi(\alpha, \lambda\,|\,D)\, d\alpha\, d\lambda = \beta.$$

The region R defined above is a $100\beta$ percent credibility region for $\alpha$ and $\lambda$. For
unimodal densities such regions are bounded by a single curve C which does not
intersect itself (i.e., R is a 'simply connected region').
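The sequential search for $c(\beta)$ is straightforward once the posterior has been evaluated on a grid. The following is a minimal sketch of our own (not from the text); the grid spacings and the 90 percent level are illustrative.

```python
# Sketch only: sequential search for the HPD threshold c(beta) of (3.8).
import numpy as np

def hpd_threshold(post, d_alpha, d_lambda, beta=0.90):
    """post: posterior density values on an evenly spaced (alpha, lambda) grid
    with spacings d_alpha and d_lambda. Returns c(beta) such that the region
    {post >= c(beta)} carries probability approximately beta."""
    levels = np.sort(post.ravel())[::-1]           # density levels, highest first
    mass = np.cumsum(levels) * d_alpha * d_lambda  # probability above each level
    idx = np.searchsorted(mass, beta)              # first level capturing mass beta
    return levels[min(idx, len(levels) - 1)]
```

Visiting candidate contour levels from the mode downward in this way is equivalent to the sequential search described above; the contour $\{\pi = c(\beta)\}$ is then the boundary curve C.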
To illustrate the use of Weibull credibility regions we have computed credibility
regions corresponding to the data in Tables 3.1 and 3.2. Twenty-one pressure
vessels were put on life test at 68% of their ultimate mean burst stress. A pressure
vessel is filled with a gas or liquid and provides a source of mechanical energy.
They are used on space satellites and other space vehicles. After 13488 hours of
testing, 5 failures were recorded. After an additional 7080 hours of testing, an
additional 4 failures were recorded.

Table 3.1
Ordered failure ages of pressure vessels life
tested at 68% of mean rupture strength (n = 21,
observation to 13488 hours)

Number of failure    Age at failure (hours)

1                    4000
2                    5376
3                    7320
4                    8616
5                    9120

Table 3.2
Ordered failure ages of pressure vessels life
tested at 68% of mean rupture strength (failures
between 13488 hours and 20568 hours)

Number of failure    Age at failure (hours)

1                    14400
2                    16104
3                    20231
4                    20233

Figure 3.1 displays credibility contours for $\alpha$ and $\lambda$ after 13488 hours of testing
and again after 20568 hours of testing. The posterior densities were computed
relative to uniform priors. The posterior density computed after 20568 hours
could also be interpreted as the result of using the posterior (calculated on the
basis of Table 3.1 and a flat prior) as the new prior for the data in Table 3.2. A
qualitative measure of the information gained by an additional year of testing can
be deduced by comparing the initial (dark) contours and the tighter (light)
contours in Figure 3.1.

[Figure 3.1: contour plot not reproduced; the two sets of contours are labelled 'after 13488 hours' and 'after 20568 hours'.]

Fig. 3.1. Highest probability density contours for $\alpha$ and $\lambda$ for Kevlar/epoxy pressure vessel life test
data. The pressure vessels were tested at the 68% stress level.

To predict pressure vessel life at the 68% stress level, we can numerically
compute

$$P[X > t\,|\,D] = \int_0^\infty \int_0^\infty e^{-(\lambda t)^\alpha}\, \pi(\alpha, \lambda\,|\,D)\, d\alpha\, d\lambda,$$

where $\pi(\alpha, \lambda\,|\,D)$ must be numerically computed using the given data, D.
If the mean life

$$\theta = \frac{\Gamma(1 + 1/\alpha)}{\lambda}$$

or the standard deviation of life are of interest, their posterior densities can be
computed by making a change of variable and integrating out the nuisance
parameter. For example, if a = 0 in the Weibull model and we are interested in
the mean life, $\theta$, we can use the Weibull density in terms of $\alpha$ and $\theta$,

$$f(x\,|\,\alpha, \theta) = \alpha \left[\frac{\Gamma(1+1/\alpha)}{\theta}\right]^\alpha x^{\alpha-1} \exp\left\{-\left[\frac{\Gamma(1+1/\alpha)}{\theta}\, x\right]^\alpha\right\},$$

to compute the joint posterior density $\pi(\alpha, \theta\,|\,D)$. The prior for $\alpha$ and $\lambda$ must be
replaced by the induced prior for $\alpha$ and $\theta$. This may be accomplished by a change
of variable and by computing the appropriate Jacobian. The marginal posterior
density of $\theta$ is then

$$\pi(\theta\,|\,D) = \int_0^\infty \pi(\alpha, \theta\,|\,D)\, d\alpha.$$

This can then be used to obtain credibility intervals on $\theta$.
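All of these computations reduce to routine numerical integration over a grid. Below is a minimal sketch of ours (not the authors' code) for the two parameter model (a = 0) with a flat prior; the grid ranges are guesses supplied for illustration, and the data are read from Table 3.1 (5 failures, with the remaining 16 of the 21 vessels censored at 13488 hours).

```python
# Sketch only: grid the two parameter Weibull posterior under a flat prior,
# then evaluate the predictive survival probability P[X > t | D].
import numpy as np

def weibull_grid_posterior(fail, cens, alphas, lams):
    """Posterior pi(alpha, lambda | D), normalized over the grid, from exact
    failure times `fail` and right-censored ages `cens`."""
    A, L = np.meshgrid(alphas, lams, indexing="ij")
    loglik = np.zeros_like(A)
    for x in fail:    # each failure contributes log f(x | alpha, lambda)
        loglik += np.log(A) + A * np.log(L) + (A - 1) * np.log(x) - (L * x) ** A
    for c in cens:    # each censored unit contributes log Fbar(c | alpha, lambda)
        loglik -= (L * c) ** A
    post = np.exp(loglik - loglik.max())   # subtract max for numerical stability
    return post / post.sum()

def predictive_survival(t, post, alphas, lams):
    """P[X > t | D]: average exp{-(lambda t)^alpha} over the gridded posterior."""
    A, L = np.meshgrid(alphas, lams, indexing="ij")
    return float((np.exp(-(L * t) ** A) * post).sum())

# Table 3.1 data: 5 failures, 16 vessels censored at 13488 hours.
fail = [4000.0, 5376.0, 7320.0, 8616.0, 9120.0]
cens = [13488.0] * 16
alphas = np.linspace(0.4, 4.8, 120)
lams = np.linspace(1e-6, 3e-4, 120)   # per-hour scale; range is our guess
post = weibull_grid_posterior(fail, cens, alphas, lams)
print(predictive_survival(20000.0, post, alphas, lams))
```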

4. Notes and references

4.1. Section 1
In the General Sampling Plan we needed to assume that any stopping rules
used were noninformative concerning the failure distribution. The need for this
assumption was pointed out by Raiffa and Schlaifer (1961). Examples of infor-
mative stopping rules were given by Roberts (1967) in the context of two stage
sampling of biological populations to estimate population size (so-called capture-
recapture sampling).

4.2. Section 2: Unbiasedness

The posterior mean is a Bayes estimator of a parameter, say $\theta$, with respect to
squared error loss. It is also a function of the data. An estimator, $\hat\theta(D)$, is
called unbiased in the sample theory sense if

$$E[\hat\theta(D)\,|\,\theta] = \theta$$

for each $\theta \in \Theta$. No Bayes estimator (based on a corresponding proper prior) can
be unbiased in the sample theory sense (Bickel and Blackwell, 1967).
Most unbiased estimators are in fact inadmissible in the sample theory sense
with respect to squared error loss. For example, $\hat\theta(D) = T/k$ is a
sample theory unbiased estimator for the mean of the density $\theta^{-1} e^{-x/\theta}$. However,
it is inadmissible in the sense that there exists another estimator $c\hat\theta(D)$ with $c \ne 1$
such that, for all $\theta$,

$$E[(c\hat\theta(D) - \theta)^2\,|\,\theta] < E[(\hat\theta(D) - \theta)^2\,|\,\theta].$$

To find this c, consider $Y = \hat\theta(D)/\theta$ and note $EY = 1$. Then we need only find
c such that

$$E[(cY - 1)^2\,|\,\theta]$$

is minimum. This occurs for $c_0 = EY/EY^2$, which is clearly not 1. Hence $\hat\theta(D)$
is sample theory inadmissible. Sample theory unbiasedness is not a viable
criterion.
For large k, $\hat\theta(D) = T/k$ will be approximately the same as our Bayes
estimator. However, T/k is not recommended for small k.
Since tables of the chi-square distribution have in the past been more accessible
than tables of the gamma distribution, we have given the chi-square special
treatment. However, with modern computing facilities, we really only need to use
the more general gamma distribution.

4.3. Confidence intervals

A $(1-\alpha)100\%$ confidence interval in the sample theory sense is one such that,
if the experiment is repeated infinitely often (and the interval recomputed each
time), then $(1-\alpha)100\%$ of the time the interval will cover the fixed unknown true
parameter $\theta$. Since confidence intervals do not produce a probability distribution
on the parameter space for $\theta$, they cannot provide the basis for action in the
decision theory sense; i.e., a decision maker cannot use a sample theory confi-
dence interval to compute an expected utility function which can then be
maximized over his set of possible decisions.
If for $\lambda e^{-\lambda x}$ we choose the improper prior $\pi(\lambda) = 1/\lambda$, then the chi-square
$(1-\alpha)100\%$ credible intervals and the sample theory $(1-\alpha)100\%$ confidence
intervals agree. Unfortunately, such improper credible intervals can be shown to

violate certain rules of logical behavior. Lindley (personal communication)
provides the following simple illustration of this fact for the exponential model
$\lambda e^{-\lambda x}$. Suppose n units are put on test and we stop at the first failure, so that
$T = nX_{(1)}$. Now T given $\lambda$ also has density $\lambda e^{-\lambda t}$, so that $(\ln 2)/T$ is a 50%
improper upper credible limit on $\lambda$; i.e.,

$$P\left[\lambda < \frac{\ln 2}{T}\ \Big|\ T,\ \pi(\lambda) = \frac{1}{\lambda}\right] = 0.50. \tag{4.1}$$

Suppose now that T is observed and we accept the probability statement (4.1).
Consider the following hypothetical bet.
(i) If $\lambda < (\ln 2)/T$ we lose the amount $e^{-T}$;
(ii) If $\lambda \ge (\ln 2)/T$ we win $e^{-T}$.
We can pretend that the true $\lambda$ is somehow revealed and bets are paid off. If
we believe statement (4.1), then given T such a bet is certainly fair.
Now let us compute our expected gain before T is observed (preposterior
analysis). This is easily seen to be (conditional on $\lambda$)

$$-\int_0^{(\ln 2)/\lambda} \lambda e^{-\lambda t}\, e^{-t}\, dt + \int_{(\ln 2)/\lambda}^{\infty} \lambda e^{-\lambda t}\, e^{-t}\, dt = \frac{\lambda\,[\,2^{-1/\lambda} - 1\,]}{1 + \lambda}\,,$$

which is negative for all $\lambda > 0$. Note that this is what we subjectively expect, since,
as (improper) Bayesians, every probability (and presumably even an improper
prior) is subjective.
The contradiction lies in the observation that:
1. conditional on $\lambda$ and prior to observing T, our expected winnings are nega-
tive for all $\lambda$;
2. conditional on T, our expected loss is zero (using the improper prior
$\pi(\lambda) = 1/\lambda$).
The source of the contradiction is that we have not measured our uncertainty
for all events by probability. For example, we have assigned the value $\infty$ to the
event $\lambda < \lambda_0$ for all $\lambda_0 > 0$; i.e., $\int_0^{\lambda_0} \pi(\lambda)\, d\lambda = \int_0^{\lambda_0} (1/\lambda)\, d\lambda = \infty$. We can
prove that for any set of uncertainty statements that are not probabilistically
based (relative to proper distributions), a system of bets can be constructed which
will result in the certain loss of money. A bet consists of paying $pz < z$ dollars
to participate with the understanding that if an event E occurs you win z dollars
and otherwise you win nothing.
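The preposterior expected gain is easy to verify by simulation. The following sketch is ours, not Lindley's; it draws T from its conditional density $\lambda e^{-\lambda t}$ and averages the winnings, which should agree with $\lambda[2^{-1/\lambda} - 1]/(1+\lambda)$ and be negative for every $\lambda$.

```python
# Sketch only: Monte Carlo check of the negative preposterior expected gain.
import numpy as np

rng = np.random.default_rng(0)
for lam in (0.5, 1.0, 2.0):
    T = rng.exponential(scale=1 / lam, size=1_000_000)  # T | lambda ~ Exp(lambda)
    win = np.where(lam >= np.log(2) / T, 1.0, -1.0)     # the bet (i)-(ii)
    mc = np.mean(win * np.exp(-T))                      # simulated expected gain
    exact = lam * (2 ** (-1 / lam) - 1) / (1 + lam)
    print(f"lambda={lam}: simulated {mc:.4f}, exact {exact:.4f}")
```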

4.4. Section 3
The Weibull distribution is one of several extreme value distributions. See
Barlow and Proschan (1975), Chapter 8, for a more advanced discussion of
extreme value distributions.

Acknowledgements

We would like to acknowledge Dennis Lindley for his perceptive comments and
criticisms of an earlier draft. Thanks are also due to Colleen Postmus and Mariko
Kubik for typing many versions previous to this one.

References
Barlow, R. E. and Proschan, F. (1975). Statistical Theory of Reliability and Life Testing. Holt, Rinehart
and Winston, New York.
Bickel, P. J. and Blackwell, D. (1967). A note on Bayes estimates. Ann. Math. Statist. 38, 1907-1911.
Box, G. E. P. and Tiao, G. C. (1973). Bayesian Inference in Statistical Analysis. Addison-Wesley,
Reading, MA.
De Groot, M. H. (1970). Optimal Statistical Decisions. McGraw-Hill, New York.
Diaconis, P. and Ylvisaker, D. (1979). Conjugate priors for exponential families. Ann. Statist. 7,
269-281.
Edwards, W., Lindman, H., and Savage, L. J. (1963). Bayesian statistical inference for psychological
research. Psychological Rev. 70, 193-242.
Hoel, P. G., Port, S. C., and Stone, C. J. (1971). Introduction to Probability Theory. Houghton Mifflin,
Boston, MA.
Lindley, D. V. (1978). The Bayesian approach. Scandinavian J. Statist. 5, 1-26.
Pearson, E. S. and Hartley, H. O. (1958). Biometrika Tables for Statisticians, Vol. 1. The University
Press, Cambridge, England.
Raiffa, H. and Schlaifer, R. (1961). Applied Statistical Decision Theory. Harvard Business School,
Boston, MA.
Roberts, H. V. (1967). Informative stopping rules and inferences about population size. J. Amer.
Statist. Assoc. 62, 763-775.
Smirnov, N. V. (1952). Limit distributions for the terms of a variational series. Amer. Math. Soc.
Transl. Ser. 1, 1-64.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 251-280

Piecewise Geometric Estimation of a Survival Function*

Gillian M. Mimmack and Frank Proschan

1. Introduction and summary

The problem of estimating survival probabilities from incomplete data is well


known in the fields of reliability, medicine, biometry and actuarial science. The
general situation is described as follows. The variable of interest is the lifespan
of some unit: the investigator wishes to estimate the probability of survival beyond
any given time. To this end, n identical units are placed 'on test'. Each item is
either observed until failure, resulting in an uncensored observation, or is removed
from the test before failure, resulting in a censored observation. Thus the data
available consist of a number of lifelengths and a number of truncated lifelengths:
the statistical problem is to estimate the probability distribution of the lifelengths.
The various statistical approaches to the problem can generally be classified
according to the restrictiveness of the model assumed and the type of information
utilized. At one extreme are purely parametric procedures, which involve assuming
that the underlying life distribution belongs to a specific parametric family.
These procedures utilize interval information. The Bayesian estimator described
by Susarla and Van Ryzin (1976) makes allowance for both parametric and
nonparametric models: the type of information utilized depends on the assump-
tions about the prior distribution. As our approach to the problem is neither
parametric nor Bayesian, we do not consider these procedures further but concen-
trate on nonparametric procedures.
Nonparametric procedures range in sophistication from the well-known actu-
arial estimator, which is a step function constructed from ordinal information
alone, to the piecewise polynomial estimators of Whittemore and Keller (1983)
that utilize interval information. The most widely used nonparametric estimators
are those of Kaplan and Meier (1958) and Nelson (1969). These estimators are
also step functions constructed from ordinal information. Their properties are
described by Efron (1967), Breslow and Crowley (1974), Petersen (1977), Aalen

* Research supported by the Air Force Office of Scientific Research, AFSC, USAF, under Grant
AFOSR 82-K-0007.


(1976, 1978), Kitchin, Langberg and Proschan (1983), Nelson (1972), Fleming
and Harrington (1979), and Chen, Hollander and Langberg (1982).
One of the by-products of the estimation process is an estimate of the failure
rate function: here, another issue is raised. It is evident that survival function
estimators that are step functions do not provide useful failure rate function
estimators: Miller (1981) mentions smoothing the Kaplan-Meier estimator for
this reason and summarizes the development of other survival function estimators
that may be obtained by considering a special case of the regression model of Cox
(1972). These estimators generally correspond to failure rate function estimators
that are step functions and utilize at most part (but not all) of the interval
information contained in the data. Whittemore and Keller (1983) give several
more refined failure rate function estimators that are step functions and utilize full
interval information. They also describe even more complex estimators that utilize
full interval information: however, these are not computationally convenient com-
pared with their simpler estimators. It seems, from their work, that a successful
rival of the Kaplan-Meier estimator should be only marginally more complex than
it (so as to be computationally convenient and yet yield a useful failure rate
function estimator) and also should utilize more than ordinal information.
In Section 2, we propose an estimator that not only provides a reasonable
failure rate function estimator but also utilizes interval information. Moreover, it
is computationally simple. Our estimator is a discrete counterpart of two versions
of a continuous estimator proposed independently by Kitchin, Langberg and
Proschan (1983) and Whittemore and Keller (1983). The motivation for the
construction of our estimator is the same as that of the former authors, and our
model is the discrete version of theirs: in contrast, the latter authors assume the
more restrictive model of random censorship and obtain their estimator by the
method of maximum likelihood. This provides an alternative method of deriving
our estimator.
The remaining sections are concerned with properties of our estimator. As this
presentation is expository, proofs are omitted: Mimmack (1985) provides proofs.
In Section 3, we explore the asymptotic properties of our estimator under
increasingly restrictive models. Our estimator is strongly consistent and asymptoti-
cally normal under conditions more general than those typically assumed.
Section 4 deals with the relationships among our estimator, the Kaplan-Meier
estimator, and the above-mentioned estimator of Kitchin et al. and Whittemore
and Keller. The section ends with an example using real data.
In Section 5, we continue the comparison of the new estimator and the
Kaplan-Meier estimator: since the properties of the new estimator are expected
to resemble those of its continuous counterparts, we discuss the implications of
simulation studies designed to investigate the small sample behaviour of these
estimators. We also present the results of a Monte Carlo pilot study designed to
investigate the small sample properties of our estimator.

2. Preliminaries

In this section we formulate the problem in statistical terms and define our
estimator.
Let X denote the lifelength of a randomly chosen unit, where X has distribution
function G. Suppose that n identical items are placed on test. The resultant
sample consists of the pairs $(Z_1, \delta_1), \ldots, (Z_n, \delta_n)$, where $Z_i$ represents the time
for which unit i is observed and $\delta_i$ indicates whether unit i fails while under
observation or is removed from the test before failure. Symbolically, for
$i = 1, \ldots, n$, we have

$X_i$ = lifelength of unit i, where $X_i$ has distribution G,
$Y_i$ = time to censorship of unit i,
$Z_i = \min(X_i, Y_i)$,
$\delta_i = I(X_i \le Y_i)$.

$(X_1, Y_1), \ldots, (X_n, Y_n)$ are assumed to be independent random pairs. Elements
of a pair, $X_i$ and $Y_i$, where $i = 1, \ldots, n$, are not assumed to be independent.
We assume that the lifelength and censoring random variables are discrete. Let
$\mathcal{X} = \{x_1, x_2, \ldots\}$ denote the set of possible values of X and $\mathcal{Y} = \{y_1, y_2, \ldots\}$
denote the union of the sets of possible values of $Y_1, Y_2, \ldots$, where $\mathcal{Y} \subseteq \mathcal{X}$. The
survival probabilities of interest are denoted $P(X > x_k)$, $k = 1, 2, \ldots$, where
$P(X > x_k) = \bar G(x_k) = 1 - G(x_k)$, $k = 1, 2, \ldots$.
It is evident that this formulation differs from that of the model of random
censorship which is generally assumed in the literature, and in particular, by
Whittemore and Keller (1983). These authors assume that the lifelength and
censoring random variables are continuous, that the corresponding pairs $X_i$ and
$Y_i$, where $i = 1, 2, \ldots$, are independent, and that the censoring random variables
are identically distributed. Although Kaplan and Meier (1958) assume only inde-
pendence between corresponding lifelength and censoring random variables,
Breslow and Crowley (1974), Petersen (1977), Aalen (1976, 1978), and others, all
of whom describe the properties of the Kaplan-Meier estimator, assume also
that the censoring random variables are identically distributed. Our formulation
is the discrete counterpart of that of Kitchin, Langberg and Proschan (1983):
likewise, our estimator is the discrete counterpart of theirs.
Before describing our estimator, we give the notation required.
Let $n_1$ be the random number of distinct uncensored observations in the sample
and let $t_1 < t_2 < \cdots < t_{n_1}$ denote these distinct observed failure times, with $t_0 = 0$.
Let $n_2$ be the random number of distinct censored observations in the sample and
let $s_1 < s_2 < \cdots < s_{n_2}$ denote these times, with $s_0 = 0$.
Let $D_i$ be the number of failures observed at time $t_i$:

$$D_i = \sum_{j=1}^n I(Z_j = t_i,\ \delta_j = 1) \quad \text{for } i = 1, \ldots, n_1.$$

Let $C_i$ be the number of censored observations equal to $s_i$:

$$C_i = \sum_{j=1}^n I(Z_j = s_i,\ \delta_j = 0) \quad \text{for } i = 1, \ldots, n_2.$$

Let $\bar F_n(t) = 1 - F_n(t)$ denote the proportion of observations that exceed t:

$$\bar F_n(t) = \frac{1}{n} \sum_{j=1}^n I(Z_j > t) \quad \text{for } t \in [0, \infty).$$

Let $F_n^1(t)$ denote the proportion of failures observed at or before t:

$$F_n^1(t) = \frac{1}{n} \sum_{j=1}^n I(Z_j \le t,\ \delta_j = 1) \quad \text{for } t \in [0, \infty).$$

Let $T_i$ be a measure of the total time on test in the interval $(t_{i-1}, t_i]$:

$$T_i = \#\{m: t_{i-1} < x_m \le t_i\}\,(n\bar F_n(t_i) + D_i) + \sum_{k:\ t_{i-1} < s_k \le t_i} \#\{m: t_{i-1} < x_m \le s_k\}\, C_k \quad \text{for } i = 1, \ldots, n_1,$$

where $\#A$ denotes the cardinality of the set A.
(If failure and censoring random variables are lattice random variables, then $T_i$
is the total time on test in $(t_{i-1}, t_i]$. In general, however, $T_i$ increases by one unit
whenever an item on test survives an interval of the form $(x_{j-1}, x_j]$, where
$t_{i-1} \le x_{j-1} < x_j \le t_i$, irrespective of the distance between $x_{j-1}$ and $x_j$.)
We now construct our estimator. Expressing the survival function G in terms
of the failure rates $P(X = x_k\,|\,X \ge x_k)$, $k = 1, 2, \ldots$, we have

$$P(X > x_k) = \prod_{j=1}^k [1 - P(X = x_j\,|\,X \ge x_j)] \quad \text{for } k = 1, 2, \ldots. \tag{2.1}$$

It is evident from (2.1) that we may estimate our survival function at $x_k$ from
estimates of the failure rates at $x_1, x_2, \ldots, x_k$. In the experimental situation,
failures are not observed at all the times $x_1, x_2, \ldots$, so specific information about
the failure rates at many of the possible failure times is not available. Having
observed failures at $t_1, t_2, \ldots, t_{n_1}$, we find it simple to estimate the failure rates
$P(X = t_i\,|\,X \ge t_i)$, $i = 1, \ldots, n_1$. However, the question of how to estimate the
failure rates at the intervening possible failure times requires special consideration.
One approach, that of Kaplan and Meier (1958), Nelson (1969) and others, is
to estimate the failure rates at these intervening times as zero since no failures are
observed then. However, not observing failures at some possible failure times may
be a result of being in an experimental situation rather than evidence of very small

failure rates at these times, so we discard this approach and consider nonzero
estimates.
It is reasonable to assume that the underlying process possesses an element of
continuity in that adjacent failure rates do not differ radically from one another.
Thus we consider using the estimate of the failure rate at t,. to estimate the failure
rate at each of the possible failure times between t;_ 1 and ti, where i = 1, ..., n~.
We are therefore assuming that our approximating distribution has a constant
failure rate between the times at which failures are o b s e r v e d - - t h a t is,

f i ( X = x k l X >~ Xk) = O, for ti_ l < x k <~ t i, i = 1 , . . . , n l , (2.2)


where
qe=P(X=tilX>~t~) for i = 1. . . . . n 1.

Substituting (2.2) into (2.1), we obtain


i--1
P(X> Xk) = (1 - qi) #{m:t . . . . . . ~Xk} I-I (l -- Oj)#{m:tj . . . . . ~tj}
j=l

for t;_ ~ < x k <~ t i, i = 1 , . . . , nI . (2.3)


We note that the property of having constant failure rate on $\mathcal{X}$ characterizes
a family of geometric distributions defined on $\mathcal{X}$. In particular, the failure rates
$q_1, \ldots, q_{n_1}$ identify $n_1$ geometric distributions $G_1, \ldots, G_{n_1}$ defined on $\mathcal{X}$. The
survival functions, $\bar G_1, \ldots, \bar G_{n_1}$, have the geometric form

$$\bar G_i(x_k) = (1 - q_i)^k \quad \text{for } k = 1, 2, \ldots \text{ and } i = 1, \ldots, n_1. \tag{2.4}$$

Inspection of (2.3) and (2.4) reveals that our estimating function is constructed
from the geometric survival functions $\bar G_1, \ldots, \bar G_{n_1}$, where $\bar G_i$ is used in the inter-
val $(t_{i-1}, t_i]$, $i = 1, \ldots, n_1$. Consequently, the estimator (2.3) is called the Piece-
wise Geometric Estimator (PEGE).
It remains to define estimators of the failure rates $q_1, \ldots, q_{n_1}$. This was origi-
nally done by separately obtaining the maximum likelihood estimators of the
parameters of $n_1$ truncated geometric distributions: the procedure is outlined at
a later stage because it utilizes the geometric structure of (2.3) and therefore
provides further motivation for the name 'PEGE'. A more straightforward but less
appealing approach is to obtain the maximum likelihood estimates of $q_1, \ldots, q_{n_1}$
directly: denoting by L the likelihood of the sample, we have

$$L = \prod_{j=1}^n [P(X = Z_j)]^{\delta_j}\,[P(X > Z_j)]^{1-\delta_j}.$$

Substituting (2.3) into this expression and differentiating yields the unique maxi-
mum likelihood estimates

$$\hat q_i = D_i/T_i, \quad i = 1, \ldots, n_1.$$

Substituting $\hat q_1, \ldots, \hat q_{n_1}$ into (2.3), we finally obtain our estimator, formally defined
as follows.

DEFINITION 2.1. The Piecewise Geometric Estimator (PEGE) of the survival
function of the lifelength random variable X is defined as follows:

$$\hat P(X > x_k) = \begin{cases}
1 & \text{for } x_k < 0 \text{ or } n_1 = 0, \\[4pt]
(1 - D_i/T_i)^{\#\{m:\ t_{i-1} < x_m \le x_k\}} \displaystyle\prod_{j=1}^{i-1} (1 - D_j/T_j)^{\#\{m:\ t_{j-1} < x_m \le t_j\}} & \text{for } t_{i-1} < x_k \le t_i,\ i = 1, \ldots, n_1,\ n_1 > 0, \\[4pt]
\displaystyle\prod_{j=1}^{n_1} (1 - D_j/T_j)^{\#\{m:\ t_{j-1} < x_m \le t_j\}} & \text{for } x_k > t_{n_1},\ n_1 > 0.
\end{cases}$$
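To make Definition 2.1 concrete, the following is a minimal sketch of our own (the function names are ours). It assumes the possible failure times are the integers 1, 2, 3, ..., so that each count $\#\{m: a < x_m \le b\}$ reduces to $b - a$.

```python
# Sketch only: the PEGE of Definition 2.1 on the integer lattice 1, 2, 3, ...
import numpy as np

def pege(z, d):
    """z: integer observation times; d: 1 = failure, 0 = censored.
    Returns the distinct failure times t_1 < ... < t_n1 and the estimated
    failure rates q_i = D_i / T_i."""
    z, d = np.asarray(z), np.asarray(d)
    ts = np.unique(z[d == 1])
    q, prev = [], 0
    for t in ts:
        D = np.sum((z == t) & (d == 1))        # failures at t_i
        at_risk_after = np.sum(z > t)          # n * Fbar_n(t_i)
        T = (t - prev) * (at_risk_after + D)   # first term of T_i
        for s in np.unique(z[(d == 0) & (z > prev) & (z <= t)]):
            C = np.sum((z == s) & (d == 0))
            T += (s - prev) * C                # censored units: #{...} * C_k
        q.append(D / T)
        prev = t
    return ts, np.array(q)

def pege_survival(x, ts, q):
    """Evaluate the PEGE at integer age x (constant beyond the last failure)."""
    surv, prev = 1.0, 0
    for t, qi in zip(ts, q):
        surv *= (1 - qi) ** (max(min(x, t) - prev, 0))
        if x <= t:
            return surv
        prev = t
    return surv
```

For example, with `z = [1, 2, 2, 3, 4]` and `d = [1, 1, 0, 1, 0]`, `pege` returns the distinct failure times 1, 2, 3 together with the rates $\hat q_i$, and `pege_survival` then traces a curve that decreases at every integer age, not only at the observed failure times.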

The alternative derivation of the PEGE emphasizes its geometric structure: it
turns out that $\hat q_1, \ldots, \hat q_{n_1}$ defined above are maximum likelihood estimators of the
parameters of the truncated geometric distributions $G_1^*, \ldots, G_{n_1}^*$ defined below.
For $i = 1, \ldots, n_1$ we formulate the following definitions:
Let $N_i = \#\{m: t_{i-1} < x_m \le t_i\}$ be the number of possible times of failure in the
interval $(t_{i-1}, t_i]$ and let $X_i^*$ be the number of possible times of failure that a unit
of age $t_{i-1}$ survives; that is,

$X_i^*$ = number of trials to failure of a unit of age $t_{i-1}$,

where the possible values of $X_i^*$ are assumed to be $1, 2, \ldots, N_i, N_i^+$. The distribu-
tion $G_i^*$ of $X_i^*$ is then given by

$$\bar G_i^*(k) = (1 - q_i)^k \quad \text{for } k = 1, 2, \ldots, N_i, \qquad \bar G_i^*(N_i^+) = 0.$$

The information available for estimating $q_i$ consists of $n\bar F_n(t_{i-1})$ observations
on $X_i^*$: of these, $D_i$ are equal to $N_i$, $n\bar F_n(t_i)$ are equal to $N_i^+$, and for each $s_j$ in the
interval $(t_{i-1}, t_i]$, $C_j$ are equal to the number $\#\{m: t_{i-1} < x_m \le s_j\}$. The resultant
maximum likelihood estimator of $q_i$ is precisely $\hat q_i$ defined above.
It is evident that the estimators $\hat q_1, \ldots, \hat q_{n_1}$ have the form of the usual maximum
likelihood estimator of a geometric parameter; that is,

$$\text{Estimated failure rate} = \frac{\text{number of failures observed}}{\text{total time on test}}\,.$$

Moreover, we note that this is the form of the failure rate estimators in the
intervals $(t_0, t_1], \ldots, (t_{n_1}, \infty)$ defined for the Piecewise Exponential Estimator

(PEXE) of Kitchin, Langberg and Proschan (1983). In terms of our notation
(modified for continuity), the PEXE is defined as follows:

$$P^*(X > t) = \begin{cases}
1 & \text{for } t < 0 \text{ or } n_1 = 0, \\[4pt]
\exp[-(t - t_{i-1})\hat\lambda_i] \displaystyle\prod_{j=1}^{i-1} \exp[-(t_j - t_{j-1})\hat\lambda_j] & \text{for } t_{i-1} < t \le t_i,\ i = 1, \ldots, n_1,\ n_1 > 0, \\[4pt]
\displaystyle\prod_{j=1}^{n_1} \exp[-(t_j - t_{j-1})\hat\lambda_j] & \text{for } t > t_{n_1},\ n_1 > 0,
\end{cases} \tag{2.5}$$

where

$$\hat\lambda_i = 1/\tau_i \quad \text{for } i = 1, \ldots, n_1, \qquad \tau_i = \int_{t_{i-1}}^{t_i} n\bar F_n(u)\, du \quad \text{for } i = 1, \ldots, n_1.$$

For $i = 1, \ldots, n_1$, $\hat\lambda_i$ is the failure rate in the interval $(t_{i-1}, t_i]$ and $\tau_i$ is the total
time on test in this interval.
The PEXE is a piecewise exponential function because its construction is based
on the assumption of constant failure rate between observed failures: just as a
constant discrete failure rate characterizes a geometric distribution, so a constant
continuous failure rate characterizes an exponential distribution. Thus the PEGE
is the discrete counterpart of the PEXE.
Returning to our introductory discussion about the desirable features of survival
function estimators, we now compare the PEGE with other estimators in terms
of these and other features.
First, the PEGE is intuitively pleasing because it reflects the continuity inherent
in any life process. The Kaplan-Meier and other estimators that are step
functions do not have this property.
Second, we note that the PEGE utilizes interval information from both cen-
sored and uncensored observations. It is therefore more sophisticated than the
Kaplan-Meier and Nelson estimators. Moreover, none of the estimators of
Whittemore and Keller utilizes more information than does the PEGE.
Third, the PEGE provides a simple, useful estimator of the failure rate function.
While this estimator is naive compared with the nonlinear estimators of Whitte-
more and Keller, the PEGE has the advantage of being simple enough to calculate
by hand; moreover, it requires only marginally more computational effort than
does the Kaplan-Meier estimator.
Regarding the applicability of the PEGE, we note that use of the PEGE is not
restricted to discrete distributions because it can be easily modified by linear
interpolation or by being defined as continuous wherever necessary. This is
theoretically justified by the fact that the integer part of an exponential random
variable has a geometric distribution: by defining the PEGE to be continuous, we

are merely defining a variant of the PEXE. The properties of this estimator follow
immediately from those of the PEXE.
Finally, apart from being intuitively pleasing, the form of the PEGE allows
reasonable estimates of both the survival function and its percentiles. The
Kaplan-Meier estimator is known to overestimate because of its step function
form. We show in a later section that the PEGE tends to be less than the
Kaplan-Meier estimator, and therefore the PEGE may be more accurate than the
Kaplan-Meier estimator. Whittemore and Keller give some favourable indications
in this respect. They define three survival function estimators that have constant
failure rate between observed failure times. One of these is the PEXE, modified
for ties in the data: the form of the failure rate estimator is the same as the form
of the PEGE failure rate estimator; specifically, for $i = 1, \ldots, n_1$,

$$\text{Estimated failure rate in } (t_{i-1}, t_i] = \frac{\text{number of failures observed at } t_i}{\text{total time on test during } (t_{i-1}, t_i]}\,. \tag{2.6}$$

The second of these estimators is defined instead on intervals of the form
$[t_{i-1}, t_i)$: for $i = 1, \ldots, n_1$, the failure rate estimator has the form

$$\text{Estimated failure rate in } [t_{i-1}, t_i) = \frac{\text{number of failures observed at } t_{i-1}}{\text{total time on test during } [t_{i-1}, t_i)}\,. \tag{2.7}$$

The third of these estimators is obtained from the average of the two failure rate
estimators described by (2.6) and (2.7).
In a simulation study to investigate the small sample properties of these three
estimators, Whittemore and Keller find that the first estimator tends to under-
estimate the survival function while the second tends to overestimate the survival
function. From these results, we expect the PEGE to underestimate the survival
function and its percentiles. Whittemore and Keller do not record further results
for the first two estimators: however, they do indicate that, in terms of bias at
extreme percentiles, variance and mean square error, the third estimator tends to
be better than the Kaplan-Meier estimator.
The implications for the discrete version of the third estimator are that, in terms
of bias, variance and mean square error, it will compare favourably with the
Kaplan-Meier estimator. An unanswered question is whether the performance of
this estimator is so superior to the performance of the PEGE as to warrant the
additional computational effort required for the former.

3. Asymptotic properties of the PEGE

This section treats the asymptotic properties of the PEGE and of the cor-
responding failure rate function estimator. The properties of primary interest are
those of consistency and asymptotic normality: secondary issues are asymptotic
bias and asymptotic correlation.
Initially considering a very general model, we obtain the limiting function of the
PEGE and show that the sequences $\{\hat P_n(X > x_k)\}_{k=1}^\infty$ and
$\{\hat P_n(X = x_k\,|\,X \ge x_k)\}_{k=1}^\infty$ converge in distribution to Gaussian sequences. We then explore the

effects of making various assumptions about the lifelength and censoring random
variables. Under the most general model, the PEGE is not consistent and the
failure rate estimators are not asymptotically uncorrelated: a sufficient condition
for consistency is independence between corresponding lifelength and censoring
random variables, and a sufficient condition for asymptotically independent failure
rate estimators is that the censoring random variables be identically distributed.
However, it is not necessary to impose both of these conditions in order to ensure
both consistency and asymptotic independence of the failure rate estimators:
relaxing the condition of independent lifelength and censoring random variables,
we give conditions under which both desirable properties are obtained.
Before investigating the asymptotic properties of the PEGE, we describe the
theoretical framework of the problem, give some notation, and present a prelimi-
nary result that facilitates the exploration of the asymptotic properties of the
PEGE.
The probability space $(\Omega, \mathcal{F}, P)$ on which all of the lifelength and censoring
random variables are defined is envisaged as the infinite product probability space
that may be constructed in the usual way from the sequence of probability spaces
corresponding to the sequence of independent random pairs $(X_1, Y_1),$
$(X_2, Y_2), \ldots$. Thus $\Omega$ consists of all possible sequences of pairs of outcomes
corresponding to pairs of realizations in $\mathcal{X} \times \mathcal{Y}$: the first member of each pair
corresponds to failure at a particular time and the second member of each pair
corresponds to censorship at a particular time; that is, for each $\omega$ in $\Omega$,
$k = 1, 2, \ldots$ and $j = 1, 2, \ldots$,

$(X_i, Y_i)(\omega) = (X_i(\omega), Y_i(\omega)) = (x_k, y_j)$ if the ith element of the infinite
sequence $\omega$ is the pair of outcomes corresponding to failure at $x_k$ and
censorship at $y_j$.

The argument $\omega$ is omitted wherever possible.


Two conditions are imposed on the random pairs $(X_1, Y_1), (X_2, Y_2), \ldots$:
(A1) There is a distribution function F such that

$$\lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^n P(Z_i \le x_k) = F(x_k) \quad \text{for } k = 1, 2, \ldots.$$

(A2) There is a subdistribution function $F^1$ such that

$$\lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^n P(Z_i \le x_k,\ \delta_i = 1) = F^1(x_k) \quad \text{for } k = 1, 2, \ldots.$$

It is evident that a sufficient condition for (A1) and (A2) is that the censoring
random variables be identically distributed.
Definitions of symbols used in this section are given below. Assumptions (A1)
and (A2) ensure the existence of the limits defined. Let

$$P_{ki} = P(Z_i = x_k,\ \delta_i = 1) \quad \text{for } k = 1, 2, \ldots \text{ and } i = 1, \ldots, n,$$

$$R_{ki} = P(Z_i = x_k,\ \delta_i = 0) \quad \text{for } k = 1, 2, \ldots \text{ and } i = 1, \ldots, n,$$

$$P_k = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^n P_{ki} = F^1(x_k) - F^1(x_{k-1}) \quad \text{for } k = 1, 2, \ldots,$$

$$R_k = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^n R_{ki} \quad \text{for } k = 1, 2, \ldots.$$

The proposition below is fundamental: it asserts that, with probability one, as
the sample size increases to infinity, at least one failure is observed at every
possible value of the lifelength random variable. First, we need a definition.

DEFINITION 3.1. Let $\Omega^* \subset \Omega$ be the set of infinite sequences which contain, for
each possible failure time, at least one element corresponding to the outcome of
observing failure at that time; that is,

$$\Omega^* = \{\omega:\ (\forall k)(\exists n)\ X_n(\omega) = x_k,\ Y_n(\omega) \ge x_k\}.$$

PROPOSITION 3.2. $P(\Omega^*) = 1$.

The proposition is proven by showing that the set of infinite sequences that do
not contain at least one element corresponding to the outcome of observing failure
at each possible failure time $x_k$ has probability zero; that is,

$$P\left(\lim_{n\to\infty} \bigcap_{i=1}^n \{X_i = x_k,\ Y_i \ge x_k\}^c\right) = 0 \quad \text{for } k = 1, 2, \ldots.$$

As the pairs $(X_1, Y_1), (X_2, Y_2), \ldots$ are independent, this is equivalent to proving
the following equality:

$$\lim_{n\to\infty} \prod_{i=1}^n \left(1 - P(X_i = x_k,\ Y_i \ge x_k)\right) = 0 \quad \text{for } k = 1, 2, \ldots. \tag{3.1}$$

Since $\prod_{i=1}^\infty (1 - p_i) = 0$ if and only if $\sum_{i=1}^\infty p_i = \infty$, where $\{p_i\}_{i=1}^\infty$ is any sequence
of probabilities, and since (A2) implies that

$$\sum_{i=1}^\infty P(X_i = x_k,\ Y_i \ge x_k) = \infty \quad \text{for } k = 1, 2, \ldots,$$

we have (3.1).
The importance of the preceding proposition lies in the simplifications it allows.
It turns out that, on $\Omega^*$ and for n large enough, the PEGE may be expressed in
simple terms of functions that have well-known convergence properties. Since
$P(\Omega^*) = 1$, we need consider the asymptotic properties of the PEGE on $\Omega^*$
alone: these properties are easily obtained from those of the well-known functions.
In order to express the PEGE in this convenient way, we view the estimation
procedure in an asymptotic context.
Suppose $\omega$ is chosen arbitrarily from $\Omega^*$. Then, for each k, there is an N
(depending on k and $\omega$) such that $X_i(\omega) = x_j$ and $Y_i(\omega) \ge x_j$ for $j = 1, \ldots, k$ and
some $i \le N$. Consequently, for $n \ge N$, the smallest k distinct observed failure times
$t_1, \ldots, t_k$ are merely $x_1, \ldots, x_k$, and, since the set of possible censoring times is
contained in $\mathcal{X}$, the smallest k distinct observed times are also $x_1, \ldots, x_k$. The
first k intervals between observed failure times are simply $(0, x_1],$
$(x_1, x_2], \ldots, (x_{k-1}, x_k]$, and the function $T_{i,n}$ defined on the ith interval is given
by the number of units on test just before the end of the ith interval; that is,

$$T_{i,n} = n\bar F_n(x_i^-) = n\bar F_n(x_{i-1}) \quad \text{for } i = 1, \ldots, k \text{ and } n \ge N. \tag{3.2}$$

Likewise, we express the function $D_{i,n}$ defined on the ith interval in terms of the
empirical subdistribution function $F_n^1$ as follows:

$$D_{i,n} = n[F_n^1(x_i) - F_n^1(x_{i-1})] \quad \text{for } i = 1, \ldots, k \text{ and } n \ge N. \tag{3.3}$$

As the PEGE is a function of $D_{i,n}$ and $T_{i,n}$, it can be expressed in terms of
the empirical functions $\bar F_n$ and $F_n^1$. Specifically, on $\Omega^*$, for any choice of k, there
is an N such that

$$\hat P_n(X > x_k) = \prod_{i=1}^k \left(1 - \frac{F_n^1(x_i) - F_n^1(x_{i-1})}{\bar F_n(x_{i-1})}\right) \quad \text{for } n \ge N.$$

Consequently, taking the limit of each side and using Proposition 3.2, we have

$$P\left[\lim_{n\to\infty} \hat P_n(X > x_k) = \lim_{n\to\infty} \prod_{i=1}^k \left(1 - \frac{F_n^1(x_i) - F_n^1(x_{i-1})}{\bar F_n(x_{i-1})}\right) \text{ for } k = 1, 2, \ldots\right] = 1.$$

In exploring the asymptotic behaviour of the PEGE, therefore, we consider the
behaviour of the limiting sequence of the sequence

$$\left\{\prod_{i=1}^k \left(1 - \frac{F_n^1(x_i) - F_n^1(x_{i-1})}{\bar F_n(x_{i-1})}\right)\right\}_{k=1}^\infty.$$

The proofs of the results that follow are omitted in the interest of brevity. The
most general model we consider is that in which only conditions (A1) and (A2)
are imposed. The following theorem identifies the limits of the sequences
$\{\hat P_n(X = x_k\,|\,X \ge x_k)\}_{k=1}^\infty$ and $\{\hat P_n(X > x_k)\}_{k=1}^\infty$ for $k = 1, 2, \ldots$ and establishes
that these sequences converge to Gaussian sequences.

THEOREM 3.3. (i) With probability 1,

$$\lim_{n\to\infty} \hat P_n(X = x_k\,|\,X \ge x_k) = \frac{F^1(x_k) - F^1(x_{k-1})}{\bar F(x_{k-1})} \quad \text{for } k = 1, 2, \ldots.$$

(ii) With probability 1,

$$\lim_{n\to\infty} \hat P_n(X > x_k) = \prod_{i=1}^k \left(1 - \frac{F^1(x_i) - F^1(x_{i-1})}{\bar F(x_{i-1})}\right) \quad \text{for } k = 1, 2, \ldots.$$

(iii) Let $k_1, \ldots, k_M$ be M arbitrarily chosen integers such that
$k_1 < k_2 < \cdots < k_M$. Then

$$(\hat P_n(X = x_{k_1}\,|\,X \ge x_{k_1}), \ldots, \hat P_n(X = x_{k_M}\,|\,X \ge x_{k_M})) \text{ is AN}\left(\mu^*,\ \tfrac{1}{n}\Sigma^*\right),$$

where

$$\mu^* = (P_{k_1}/\bar F(x_{k_1-1}), \ldots, P_{k_M}/\bar F(x_{k_M-1})),$$

and, for $q < r$,

$$\sigma^*_{qr} = \frac{P_{k_q} P_{k_r} \displaystyle\sum_{i=1}^{q-1} \sum_{j=1}^{r-1} (\sigma_{ij} + \sigma_{M+i,j} + \sigma_{i,M+j} + \sigma_{M+i,M+j})}{(\bar F(x_{k_q-1})\,\bar F(x_{k_r-1}))^2}
+ \frac{P_{k_r} \displaystyle\sum_{i=1}^{r-1} (\sigma_{M+q,i} + \sigma_{M+q,M+i})}{(\bar F(x_{k_r-1}))^2\,\bar F(x_{k_q-1})}
+ \frac{P_{k_q} \displaystyle\sum_{i=1}^{q-1} (\sigma_{M+r,i} + \sigma_{M+r,M+i})}{(\bar F(x_{k_q-1}))^2\,\bar F(x_{k_r-1})}
+ \frac{\sigma_{M+q,r}}{\bar F(x_{k_q-1})\,\bar F(x_{k_r-1})},$$

lim 1 ~ P ~ , , ( 1 - P~.;) for q = r, q = 1. . . . , M ,


n~:x~ n i=1

-lim 1 ~ Pl,.,iPk~,, forq<r,q= 1. . . . . M,


n~oo n i= 1

r= 1,...,M,

r= - l i m 1 ~ Pkq_M. iRkr. i forq =M+ 1, . . . , 2M,


n~oo n i= 1

r = 1, . . . , M ,

lim 1 ~ R~q_M,i(l_ Rk~_M,;) f o r q = r , q = M + l,


n~oo n i= 1

.... 2M,

- lim 1 ~ Rkq_M,iRk,_m,i forq<r,q-M+ 1. . . . .


n~oo n i= 1
2M, r = M + 1. . . . . 2M.

(iv) Let $k_1, \ldots, k_M$ be M arbitrarily chosen integers such that
$k_1 < k_2 < \cdots < k_M$. Then

$$(\hat P_n(X > x_{k_1}), \ldots, \hat P_n(X > x_{k_M})) \text{ is AN}\left(\mu^{**},\ \tfrac{1}{n}\Sigma^{**}\right),$$

where

$$\mu^{**} = \left(\prod_{i=1}^{k_1} (1 - P_i/\bar F(x_{i-1})), \ldots, \prod_{i=1}^{k_M} (1 - P_i/\bar F(x_{i-1}))\right),$$

$$\Sigma^{**} = \{\sigma^{**}_{qr}\}_{q=1,\ldots,M;\ r=1,\ldots,M},$$

$$\sigma^{**}_{qr} = \prod_{i=1}^{k_q} (1 - P_i/\bar F(x_{i-1})) \prod_{j=1}^{k_r} (1 - P_j/\bar F(x_{j-1})) \sum_{l=1}^{k_q} \sum_{m=1}^{k_r} \frac{\sigma^*_{lm}}{(1 - P_l/\bar F(x_{l-1}))(1 - P_m/\bar F(x_{m-1}))} \quad \text{for } q \le r.$$

It is evident from the theorem above that the PEGE is a strongly consistent
estimator of the underlying survival function if and only if

$$\frac{F^1(x_k) - F^1(x_{k-1})}{\bar F(x_{k-1})} = \frac{P(X = x_k)}{P(X \ge x_k)} \quad \text{for } k = 1, 2, \ldots. \tag{3.4}$$

The theorems below give conditions under which this equality holds. As for
correlation, it is evident from the structure of the PEGE that any two elements
of the sequence $\{\hat P_n(X > x_k)\}_{k=1}^\infty$ are correlated. Consequently the matrix $\Sigma^{**}$

cannot be reduced to a diagonal matrix under even the most stringent conditions.
However it turns out that, under certain conditions, the asymptotic correlation
between pairs of the sequence $\{\hat P_n(X = x_k\,|\,X \ge x_k)\}_{k=1}^\infty$ is zero; that is, $\Sigma^*$ is a
diagonal matrix.
The following theorem shows that independence between lifelength and cen-
soring random variables results in strongly consistent (and therefore asymptoti-
cally unbiased) estimators. However any pair in the sequence
$\{\hat P_n(X = x_k\,|\,X \ge x_k)\}_{k=1}^\infty$ is asymptotically correlated in this case. Since the
matrices $\Sigma^*$ and $\Sigma^{**}$ have the same form as in the theorem above, they are not
explicitly defined below.

THEOREM 3.4. Suppose (i) the random variables $X_i$ and $Y_i$ are independent for
$i = 1, 2, \ldots$, and
(ii) there is a distribution function H such that

$$\lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^n P(Y_i \le x_k) = H(x_k) \quad \text{for } k = 1, 2, \ldots.$$

Then
(iii) $F^1(x_k) = \sum_{i=1}^k P(X = x_i)\,\bar H(x_{i-1})$ and $\bar F(x_k) = P(X > x_k)\,\bar H(x_k)$ for $k = 1, 2, \ldots$,
(iv) with probability 1,

$$\lim_{n\to\infty} \hat P_n(X > x_k) = \bar G(x_k) \quad \text{for } k = 1, 2, \ldots,$$

(v) $(\hat P_n(X = x_{k_1}\,|\,X \ge x_{k_1}), \ldots, \hat P_n(X = x_{k_M}\,|\,X \ge x_{k_M}))$ is AN$\left(\mu^*,\ \tfrac{1}{n}\Sigma^*\right)$,
where $k_1 < k_2 < \cdots < k_M$ are arbitrarily chosen integers and

$$\mu^* = (P(X = x_{k_1}\,|\,X \ge x_{k_1}), \ldots, P(X = x_{k_M}\,|\,X \ge x_{k_M})),$$

(vi) $(\hat P_n(X > x_{k_1}), \ldots, \hat P_n(X > x_{k_M}))$ is AN$\left(\mu^{**},\ \tfrac{1}{n}\Sigma^{**}\right)$,
where $k_1 < k_2 < \cdots < k_M$ are arbitrarily chosen integers and

$$\mu^{**} = (P(X > x_{k_1}), \ldots, P(X > x_{k_M})).$$

A sufficient condition for (A1), (A2) and assumption (ii) of the preceding
theorem is that the censoring random variables be identically distributed. In this
case the failure rate estimators are asymptotically independent and the matrix $\Sigma^{**}$

is somewhat simplified. The conditions of the following corollary define the model
of random censorship widely assumed in the literature.

COROLLARY 3.5. Suppose (i) the random variables $X_i$ and $Y_i$ are independent for
$i = 1, 2, \ldots$, and
(ii) the random variables $Y_1, Y_2, \ldots$ are identically distributed.
Then
(iii) with probability 1,

$$\lim_{n\to\infty} \hat P_n(X > x_k) = \bar G(x_k) \quad \text{for } k = 1, 2, \ldots.$$

(iv) $(\hat P_n(X = x_{k_1}\,|\,X \ge x_{k_1}), \ldots, \hat P_n(X = x_{k_M}\,|\,X \ge x_{k_M}))$ is AN$\left(\mu^*,\ \tfrac{1}{n}\Sigma^*\right)$,
where

$$\mu^* = (P(X = x_{k_1}\,|\,X \ge x_{k_1}), \ldots, P(X = x_{k_M}\,|\,X \ge x_{k_M})), \qquad \Sigma^* = \{\sigma^*_{qr}\}_{q=1,\ldots,M;\ r=1,\ldots,M},$$

$$\sigma^*_{qr} = \begin{cases} P(X = x_{k_q}\,|\,X \ge x_{k_q})\,P(X > x_{k_q}\,|\,X \ge x_{k_q})/\bar F(x_{k_q-1}) & \text{for } q = r, \\ 0 & \text{for } q \ne r. \end{cases}$$

(v) $(\hat P_n(X > x_{k_1}), \ldots, \hat P_n(X > x_{k_M}))$ is AN$\left(\mu^{**},\ \tfrac{1}{n}\Sigma^{**}\right)$,
where

$$\mu^{**} = (P(X > x_{k_1}), \ldots, P(X > x_{k_M})), \qquad \Sigma^{**} = \{\sigma^{**}_{qr}\}_{q=1,\ldots,M;\ r=1,\ldots,M},$$

$$\sigma^{**}_{qr} = P(X > x_{k_q})\,P(X > x_{k_r}) \sum_{i=1}^{k_q} \frac{P(X = x_i\,|\,X \ge x_i)}{\bar F(x_{i-1})\,P(X > x_i\,|\,X \ge x_i)} \quad \text{for } q \le r.$$

Having dealt with the most restrictive case in which the lifelength and censoring
random variables are assumed to be independent, we now consider relaxing this
condition. It turns out that independence between corresponding lifelength and
censoring random variables is not necessary for asymptotic independence between
pairs of the sequence of failure rate estimators: if the censoring random variables
are assumed to be identically distributed but not necessarily independent of the
corresponding lifelength random variables, then the failure rate estimators are
asymptotically independent. However both the survival function and failure rate
estimators are asymptotically biased. The following corollary expresses these facts
formally.

COROLLARY 3.6. Suppose (i) the random variables $Y_1, Y_2, \ldots$ are identically dis-
tributed.
Then
(ii) $P_k = P(Z = x_k,\ \delta = 1)$ and $\bar F(x_k) = P(Z > x_k)$ for $k = 1, 2, \ldots$,
(iii) $(\hat P_n(X = x_{k_1}\,|\,X \ge x_{k_1}), \ldots, \hat P_n(X = x_{k_M}\,|\,X \ge x_{k_M}))$ is AN$\left(\mu^*,\ \tfrac{1}{n}\Sigma^*\right)$,
where

$$\mu^* = (P_{k_1}/\bar F(x_{k_1-1}), \ldots, P_{k_M}/\bar F(x_{k_M-1})), \qquad \Sigma^* = \{\sigma^*_{ij}\}_{i=1,\ldots,M;\ j=1,\ldots,M},$$

$$\sigma^*_{ij} = \begin{cases} P_{k_i}(1 - P_{k_i}/\bar F(x_{k_i-1}))/(\bar F(x_{k_i-1}))^2 & \text{for } i = j, \\ 0 & \text{for } i \ne j. \end{cases}$$

(iv) $(\hat P_n(X > x_{k_1}), \ldots, \hat P_n(X > x_{k_M}))$ is AN$\left(\mu^{**},\ \tfrac{1}{n}\Sigma^{**}\right)$,
where

$$\mu^{**} = \left(\prod_{i=1}^{k_1} (1 - P_i/\bar F(x_{i-1})), \ldots, \prod_{i=1}^{k_M} (1 - P_i/\bar F(x_{i-1}))\right), \qquad \Sigma^{**} = \{\sigma^{**}_{jl}\}_{j=1,\ldots,M;\ l=1,\ldots,M},$$

$$\sigma^{**}_{jl} = \prod_{i=1}^{k_j} (1 - P_i/\bar F(x_{i-1})) \prod_{m=1}^{k_l} (1 - P_m/\bar F(x_{m-1})) \sum_{r=1}^{k_j} \frac{P_r}{(\bar F(x_{r-1}))^2\,(1 - P_r/\bar F(x_{r-1}))} \quad \text{for } j \le l.$$

The corollaries above give sufficient (rather than necessary) conditions for the
two desirable properties of (i) consistency and (ii) asymptotic independence
between pairs of the sequence of failure rate estimators $\{\hat P_n(X = x_k\,|\,X \ge x_k)\}_{k=1}^\infty$.
The final corollaries show that both of the conditions of Corollary 3.5 are not
necessary for these two desirable properties: the conditions specified in these
corollaries are not so stringent as to require that corresponding censoring and
lifelength random variables be independent (as in Corollary 3.5), but rather that
they be related in a certain way.

COROLLARY 3.7. If the random variables $Y_1, Y_2, \ldots$ are identically distributed,
then with probability 1,

$$\lim_{n\to\infty} \hat P_n(X > x_k) = \bar G(x_k) \quad \text{for } k = 1, 2, \ldots$$

if and only if

$$P(Y_i \ge x_k\,|\,X = x_k) = P(Y_i \ge x_k\,|\,X \ge x_k) \quad \text{for } k = 1, 2, \ldots \text{ and } i = 1, 2, \ldots.$$

COROLLARY 3.8. Suppose (i) the random variables $Y_1, Y_2, \ldots$ are identically dis-
tributed, and
(ii) $P(Y_i \ge x_k\,|\,X = x_k) = P(Y_i \ge x_k\,|\,X \ge x_k)$ for $k = 1, 2, \ldots$ and $i = 1, 2, \ldots$.
Then
(iii) $(\hat P_n(X = x_{k_1}\,|\,X \ge x_{k_1}), \ldots, \hat P_n(X = x_{k_M}\,|\,X \ge x_{k_M}))$ is AN$\left(\mu^*,\ \tfrac{1}{n}\Sigma^*\right)$,
where

$$\mu^* = (P(X = x_{k_1}\,|\,X \ge x_{k_1}), \ldots, P(X = x_{k_M}\,|\,X \ge x_{k_M})), \qquad \Sigma^* = \{\sigma^*_{ij}\}_{i=1,\ldots,M;\ j=1,\ldots,M},$$

$$\sigma^*_{ij} = \begin{cases} P(X = x_{k_i}\,|\,X \ge x_{k_i})\,P(X > x_{k_i}\,|\,X \ge x_{k_i})/\bar F(x_{k_i-1}) & \text{for } i = j, \\ 0 & \text{for } i \ne j. \end{cases}$$

(iv) $(\hat P_n(X > x_{k_1}), \ldots, \hat P_n(X > x_{k_M}))$ is AN$\left(\mu^{**},\ \tfrac{1}{n}\Sigma^{**}\right)$,
where

$$\mu^{**} = (P(X > x_{k_1}), \ldots, P(X > x_{k_M})), \qquad \Sigma^{**} = \{\sigma^{**}_{jl}\}_{j=1,\ldots,M;\ l=1,\ldots,M},$$

$$\sigma^{**}_{jl} = P(X > x_{k_j})\,P(X > x_{k_l}) \sum_{i=1}^{k_j} \frac{P(X = x_i\,|\,X \ge x_i)}{\bar F(x_{i-1})\,P(X > x_i\,|\,X \ge x_i)} \quad \text{for } j \le l.$$

The last two corollaries are of special interest because they deal with con-
sistency and asymptotic independence in the case of dependent lifelength and
censoring random variables, a situation that is not generally considered despite
its obvious practical significance. Desu and Narula (1977), Langberg, Proschan
and Quinzi (1981) and Kitchin, Langberg and Proschan (1983) consider the
continuous version of the model specified in the last two corollaries.
The condition specifying the relationship between lifelength and censoring
random variables is in fact a mild one: re-expressing it, we have the following
condition:

$$P(X = x_k\,|\,X \ge x_k,\ Y_i \ge x_k) = \frac{P(X = x_k)}{P(X \ge x_k)} \quad \text{for } k = 1, 2, \ldots \text{ and } i = 1, 2, \ldots.$$

This condition specifies that the failure rate among those under observation at any
particular age is the same as the failure rate of the whole population of that age.

It is evident both intuitively and mathematically that this is a fundamental assump-


tion inherent in the process of estimating a life distribution from incomplete data:
if this assumption could not be made, the data available would be deemed
inadequate for estimating the life distribution. Formally, it is the fact that the
condition is both necessary and sufficient for consistency that indicates that it is
minimal for the estimation process. It is clear, therefore, that the last two corol-
laries play an important role in estimation in the context of a practical model more
general than the statistically convenient, but unnecessarily restrictive, model of
random censorship.

4. The PEGE compared with rivals

In Section 1 we motivate the construction of the PEGE by describing some


desirable properties of nonparametric survival function estimators and then
mentioning that the commonly used estimator of Kaplan and Meier (1958) does
not fare well in terms of these properties. We now compare the PEGE with the
Kaplan-Meier estimator.
We begin with the most obvious desirable features of survival function esti-
mators and then consider statistical and mathematical properties. In comparing
the two estimators, we find that the issue of continuity arises and that the PEXE
enters the comparison. The section ends with an example using real data. The
subsequent section continues the comparison: we discuss the results of simulation
studies.
The Kaplan-Meier estimator (KME) of the survival function of the lifelength
random variable X is defined as follows:

$$\tilde P(X > t) = \begin{cases}
1 & \text{for } n_1 = 0 \text{ or } t < t_1,\ n_1 \ge 1, \\[4pt]
\displaystyle\prod_{j=1}^{i-1} (1 - D_j/n\bar F_n(t_j^-)) & \text{for } t_{i-1} \le t < t_i,\ i = 2, \ldots, n_1,\ n_1 \ge 2, \\[4pt]
\displaystyle\prod_{j=1}^{n_1} (1 - D_j/n\bar F_n(t_j^-)) & \text{for } t \ge t_{n_1},\ n_1 \ge 1.
\end{cases}$$
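For a side-by-side comparison with the PEGE sketch of Section 2, here is a matching sketch of the KME (ours, not the authors'; ties are handled as in the definition above).

```python
# Sketch only: the Kaplan-Meier estimator in the notation of this section.
import numpy as np

def kme(z, d):
    """z: observation times; d: 1 = failure, 0 = censored. Returns the distinct
    failure times and the KME evaluated at (i.e., just after) each of them."""
    z, d = np.asarray(z), np.asarray(d)
    ts = np.unique(z[d == 1])                  # t_1 < ... < t_n1
    surv, vals = 1.0, []
    for t in ts:
        at_risk = np.sum(z >= t)               # n * Fbar_n(t-)
        failures = np.sum((z == t) & (d == 1)) # D_i
        surv *= 1 - failures / at_risk
        vals.append(surv)
    return ts, np.array(vals)
```

Evaluating `kme` and the PEGE sketch on the same sample makes the interlacing behaviour discussed later in this section easy to inspect.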

To the prospective user of a survival function estimator, two fundamental


questions are, firstly, does the estimating function have the appearance of a
survival function, and secondly, is it easy to compute?
Considering the second question first, we observe that calculating the PEGE
involves only marginally more effort than calculating the KME. Therefore, both
estimators are accessible to users equipped with only hand calculators.
The first question is a deeper one. If the sample is small or if there are many
ties among the uncensored observations in a large sample, the KME has only a
few steps and consequently appears unrealistic. The PEGE, in contrast, reflects
the continuity inherent in any life process by decreasing at every possible failure
time, not only at the observed failure times. As the number of distinct uncensored

observations increases, both the PEGE and the KME become smoother: the
many steps of the KME do allow it the appearance of a survival function, except
possibly at the right extreme, where there is no way of extrapolating very far beyond
the range of observation if the KME is used. (There are several ways of extra-
polating from the PEGE.) At face value, therefore, the PEGE is at least as
attractive as the KME.
A related consideration is whether the estimator provides a realistic estimate of
the failure rate function. The KME, being a step function, does not. The serious-
ness of this omission becomes more apparent when the KME failure rate function
is examined from a user's point of view: if an item of age t has a (perhaps large)
chance of failing at its age, then claiming that a slightly older (or slightly younger)
item cannot fail at its age seems unreasonable, particularly when it becomes
evident that the claim is made on the grounds that none of the items on test
happened to fail just after (or just before) time t. Intuitively, or from a fre-
quentist's point of view, the very fact that one of the items on test failed at time
t makes it less likely that another item in the sample will fail soon after t because
the observed failure times should be scattered along the appropriate range accord-
ing to the distribution function. Clearly, then, the gaps between observed failure
times are a result of the fact that the sample is finite and are not indicative of
zero (or very small) failure rates.
The PEGE, on the other hand, is constructed so that a failure at time t, say,
affects the failure rate in the gap before t. Thus the PEGE compensates for the
lack of observations at the possible (but unobserved) failure times. The resultant
failure rate function, being a step function, is still naive, but it does at least take
into account the continuity of life processes and it does provide reasonable
estimates of the failure rates at all possible failure times.
A more aesthetic, but none the less important, issue is that of information
loss. Here the PEGE is again at an advantage. Although interval information
about the uncensored observations is used in spacing out the successive values
of the KME, the failure rate estimators utilize only ordinal information. Moreover,
the only information utilized from the censored observations is their positioning
relative to the uncensored observations. Thus the information lost by the KME
is of both the ordinal and interval types. In contrast, the PEGE failure rate
estimators use interval information from all the observations: in particular, the
positions of censored observations are taken into account precisely. In terms of
information usage, then, the PEGE is far more desirable than the KME.
An apparently attractive feature of the KME is that its values are invariant
under monotone transformation of the scale of measurement. The PEGE is not
invariant under even linear transformation. However, in the light of the discussion
about information loss, it is evident that the KME's invariance, and the PEGE's
lack thereof, are results of their levels of sophistication rather than properties that
can be used for comparison.
Having noted that the step function form of the KME is not pleasing, we now point out that it is also responsible for a statistical defect, namely, that the KME tends to overestimate the underlying survival function and its percentiles. The fact that the KME consistently overestimates suggests that its form is inappropriate. Some indications about the bias of the PEGE are given by considering the relationship between the PEGE and the KME.
Under certain conditions (for example, if there are no ties among the uncensored observations), the PEGE and the KME interlace: within each failure interval, the PEGE crosses the KME once from above. This is not true in general, however. It turns out that the KME may have large steps in the presence of ties. In the case of the PEGE, however, the effect of the ties is damped and the PEGE decreases slowly relative to the KME. In general, therefore, it is possible to relate the PEGE and the KME only in a one-sided fashion: specifically, the PEGE at any observed failure time is larger than the KME at that time. Examples have been constructed to show that, in general, the PEGE cannot be bounded from above by the KME. The following theorem relates $\hat P$ (the PEGE) and $\tilde P$ (the KME).

THEOREM 4.1. (i) $\hat P(X > t_i) \geq \tilde P(X > t_i)$ for $i = 1, \ldots, n_1$.
(ii) If $n\bar F_n(t_{j-1})/(n\bar F_n(t_{j-1}) + W_{j-1}) \leq D_j/D_{j-1}$ for $j = 2, \ldots, i$, where $W_j$ denotes the number of censored observations at $t_j$ for $j = 1, \ldots, n_1$, then $\hat P(X > t_i) \leq \tilde P(X > t_{i-1})$ for $i = 1, \ldots, n_1$.

It is evident that the condition in (ii) is met if there are no ties among the
uncensored observations: this is likely if the sample is small. From the relation-
ships in the theorem, we infer that the bias of the PEGE is likely to be of the
same order of magnitude as that of the KME. Further indications about bias are
given later.
Having considered some of the practical and physical features of the PEGE and the KME, we turn briefly to asymptotic properties (briefly because the PEGE and the KME are asymptotically equivalent), that is,

$P[(\forall k)\ \lim_{n \to \infty} \hat P_n(X > x_k) = \lim_{n \to \infty} \tilde P_n(X > x_k)] = 1$.

The practical implication of this is that there is little reason for strong preference of either the PEGE or the KME if the sample is very large.
We now compare the models assumed in using the KME and the PEGE. In the many studies of the KME, the most general model includes the assumption of independence between corresponding life and censoring random variables. Our most general model does not include this assumption. However, this difference is not important because the assumption of independence is used only to facilitate the derivation of certain asymptotic properties of the KME: in fact, the definition of the KME does not depend on this assumption, and the KME and the PEGE are asymptotically equivalent under the conditions of the most general model of the PEGE. Therefore this assumption is not necessary for using the KME.
The other difference between the models assumed is that the PEGE is designed specifically for discrete life and censoring distributions while the Kaplan-Meier model makes no stipulations about the supports of these distributions. However,
distinguishing between continuous and discrete random variables in this context is merely a statistical convention: in fact, time to occurrence of some event is always measured along a continuous scale, and the set of observable values is always countable because it is defined by the precision of measurement. Since the process of estimating a life distribution requires measurements, it always entails the assumption of a discrete distribution: whether the support of the estimator is continuous or discrete depends on the way the user perceives the scale of measurement. In practice, therefore, there are no differences between the models underlying the PEGE and the KME: the PEGE is appropriate whenever the KME is, and vice versa.
Having pointed out that the PEGE may be used for estimating continuous survival functions, and having introduced the PEXE as the continuous counterpart of the PEGE, we compare the two. First we note that the PEXE is the continuous version of the PEGE because the construction of each is based on the assumption of constant failure rate between distinct observed failure times. The forms of the estimators differ because of the difference in the ways of expressing discrete and continuous survival functions in terms of failure rates. The PEGE and the PEXE are equally widely applicable since a minor modification of the PEXE can be made to allow for ties. (This estimator is defined in Whittemore and Keller (1983).)
The relationship between the PEGE and the modified PEXE, and their positioning relative to the KME, is summarized by the following theorem and the succeeding relationship.

THEOREM 4.2. Let $P^{**}(X > t)$ denote the modified PEXE of the survival probability $P(X > t)$ for $t > 0$.
(i) $\hat P(X > t) < P^{**}(X > t)$ for $t > 0$.
(ii) If $n\bar F_n(t_{j-1})/(n\bar F_n(t_{j-1}) + W_{j-1}) \leq D_j/D_{j-1}$ for $j = 2, \ldots, i$, where $W_j$ denotes the number of censored observations at $t_j$ for $j = 1, \ldots, n_1$, then $P^{**}(X > t_i) \leq \tilde P(X > t_{i-1})$ for $i = 1, \ldots, n_1$.

From Theorems 4.1(i) and 4.2(i), we have

$\tilde P(X > t_i) \leq \hat P(X > t_i) < P^{**}(X > t_i)$ for $i = 1, \ldots, n_1$.

Consequently, if the condition in (ii) above is met (as it is when there are no ties among the uncensored observations), both the PEGE and the PEXE interlace with the KME: in each interval of the form $(t_{i-1}, t_i]$, the PEGE and the PEXE cross the KME once from above. Practical experience suggests that the condition in (ii) above is not a stringent one: even though this condition is violated in many of the data sets considered to date, the PEGE and the PEXE still interlace with the KME in the manner described. Another indication from practical experience is that the difference between the PEXE and the PEGE is negligible, even in small samples.
Finally, we present an example using the data of Freireich et al. (1963).

[Figure 1: The PEGE and the KME for the remission data of Freireich et al. (1963).]

The
data are the remission times of 21 leukemia patients who have received 6-MP (a mercaptopurine used in the treatment of leukemia). The ordered remission times in weeks are: 6, 6, 6, 6+, 7, 9+, 10, 10+, 11+, 13, 16, 17+, 19+, 20+, 22, 23, 25+, 32+, 32+, 34+, 35+. The PEGE and the KME are presented in Figure 1. (Since the PEGE and the PEXE differ by at most 0.09, only the PEGE appears.) The graphs illustrate the smoothness of the PEGE in contrast with the jagged outline of the KME. The KME and the PEGE interlace even though the condition in Theorems 4.1(ii) and 4.2(ii) is violated. Since the PEGE is only slightly above the KME at the observed failure times and the PEGE crosses the KME early in each failure interval, the KME is considerably larger than the PEGE by the end of each interval. This behaviour is typical. We infer that the PEGE certainly does not overestimate: it may even tend to underestimate.
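To make the comparison concrete, the following minimal Python sketch (ours, not part of the original chapter) computes the KME for these data by the product-limit formula $\tilde P(X > t) = \prod_{t_i \leq t}(1 - d_i/n_i)$; the PEGE construction, defined earlier in the chapter, is not reproduced here.

    from itertools import groupby

    # Freireich et al. (1963) remission times in weeks; a "+" in the text
    # marks a censored observation (flag 0 below), 1 marks an observed failure.
    times  = [6, 6, 6, 6, 7, 9, 10, 10, 11, 13, 16, 17, 19, 20, 22, 23,
              25, 32, 32, 34, 35]
    failed = [1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1,
              0, 0, 0, 0, 0]

    def kaplan_meier(times, failed):
        """Product-limit estimate at each distinct observed failure time."""
        data = sorted(zip(times, failed))
        at_risk, s, steps = len(data), 1.0, []
        for t, grp in groupby(data, key=lambda tf: tf[0]):
            grp = list(grp)
            d = sum(f for _, f in grp)          # failures at time t
            if d:
                s *= 1.0 - d / at_risk
                steps.append((t, s))
            at_risk -= len(grp)                 # failures and censorings leave
        return steps

    for t, s in kaplan_meier(times, failed):
        print(f"KME: S({t}) = {s:.3f}")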
We conclude that the PEGE (and the modified PEXE) have significant advantages over the KME, particularly in the cases of large samples containing many ties and small samples. It is only in the case of a large sample spread over a large range that the slight increase in computational effort required for the PEGE might merit using the KME, because the PEGE and the KME are likely to be very similar.

5. Small sample properties of the PEGE

In this section we give some indications of the small sample properties of the PEGE by considering three simulation studies. In the first study, Kitchin (1980) compares the small sample properties of the PEXE with those of the KME. In the second study, Whittemore and Keller (1983) consider the small sample behaviour of a number of estimators: we extract the results for the KME and a particular version of the PEXE. In the third study, we make a preliminary comparison of the KME and the PEGE. We expect the behaviour of the piecewise exponential estimators to resemble that of the PEGE because piecewise exponential estimators are continuous versions of the PEGE and, moreover, piecewise exponential estimators and the PEGE are similar when the underlying life distribution is continuous.
The piecewise exponential estimator considered by Whittemore and Keller is denoted $\bar F_{Q_4}$. It is constructed by averaging the PEXE failure rate function estimator with a variant of the PEXE failure rate function estimator; that is, $\bar F_{Q_4}$ is the same as the PEXE except that the PEXE failure rate estimators $\hat\lambda_1^-, \ldots, \hat\lambda_{n_1}^-$ are replaced by the failure rate estimators $\lambda_1^*, \ldots, \lambda_{n_1}^*$, defined as follows:

$\lambda_i^* = \tfrac{1}{2}(\hat\lambda_i^- + \hat\lambda_{i-1}^+)$ for $i = 1, \ldots, n_1$,

where

$\hat\lambda_i^- = D_i/\text{total time on test in } (t_{i-1}, t_i]$ for $i = 1, \ldots, n_1$,

$\hat\lambda_i^+ = D_{i+1}/\text{total time on test in } [t_i, t_{i+1})$ for $i = 0, \ldots, n_1 - 1$,

$\hat\lambda_{n_1}^+ = D_{n_1}/\text{total time on test in } [t_{n_1}, \infty)$ if $\max_i Z_i > t_{n_1}$, and $\infty$ otherwise.

Although Whittemore and Keller include in their study the two estimators $\bar F_{Q_1}$ and $\bar F_{Q_2}$, constructed from $\hat\lambda_1^-, \ldots, \hat\lambda_{n_1}^-$ and $\hat\lambda_0^+, \ldots, \hat\lambda_{n_1}^+$ respectively, they present the results for the hybrid estimator $\bar F_{Q_4}$ alone because they find that $\bar F_{Q_1}$ tends to be negatively biased and $\bar F_{Q_2}$ tends to be positively biased.
The same model is assumed in all three studies. The model is that of random censorship: corresponding life and censoring random variables are independent and the censoring random variables are identically distributed. Whittemore and Keller generate 200 samples in each of the 6 × 3 × 4 = 72 situations that result from considering six life distributions (representing failure rate functions that are constant, linearly increasing, exponentially increasing, decreasing, U-shaped, and discontinuous), three levels of censoring (P(Y < X) ≈ 0, 0.55, 0.76), and four sample sizes (n = 10, 25, 50, 100). Kitchin obtains 1000 samples in each of a variety of situations: he considers four life distributions (Exponential, Weibull with parameter 2, Weibull with parameter ½, and Uniform), three levels of censoring (P(Y < X) = 0, 0.5, 0.67), and three sample sizes (n = 10, 20, 50). Kitchin's study is broader than that of Whittemore and Keller in that Kitchin considers Exponential, Weibull and Uniform censoring distributions while Whittemore and Keller consider only Exponential censoring distributions. Kitchin apparently produces the greater variety of sampling conditions because his results vary slightly according to the model, while Whittemore and Keller find so much similarity in the results from the various distributions that they record only the results from the Weibull distribution.
The conclusions we draw from the two studies are similar. Regarding mean
squared error (MSE), both Kitchin and Whittemore and Keller find that, in
general:
(i) The MSE of the exponential estimator is smaller than that of the KME.
(ii) As the level of censoring increases, the increase in the MSE is smaller for
the exponential estimator than for the KME.
Kitchin reports that (i) and (ii) are not always true of the PEXE and the KME:
the exceptional cases occur in the tails of the distributions.
The conclusions about bias are not so straightforward. Whittemore and Keller
find that the PEXE tends to be negatively biased while Kitchin reports that the
bias of the PEXE is a monotone increasing function of time: examining his
figures, we find that the bias tends to be near zero at some point between the 40th
and 60th percentiles except when the life and censoring distributions are Uniform.
(In this case, the bias is positive only after the 90th percentile.) We conclude that
Whittemore and Keller merely avoid detailed discussion of bias. Regarding the
hybrid estimator, we find in the figures recorded some suggestions of the tenden-
cies observed in the PEXE--specifically, monotone increasing bias and a tendency
for underestimation when the sample size is small and censoring is heavy.
Whether this behaviour is typical of the PEGE also remains to be seen.
In considering the magnitude of the bias of the estimators, we find the following.
(i) Both Kitchin and Whittemore and Keller report that the bias of the KME is negligible except in the right tail of the distribution and in the case of a very small sample (n = 10) and heavy censoring.
(ii) The PEXE is considerably more biased than the KME.
(iii) The bias of $\bar F_{Q_4}$ is negligible except in the case of a very small sample and heavy censoring.
(iv) The bias of each estimator increases as the censoring becomes heavier and it decreases as the sample size increases.
In view of these two studies, we conclude, firstly, that the PEGE is likely to compare favourably with the KME in terms of MSE, and secondly, that the PEGE is likely to be considerably more biased than the KME. We expect that the discrete counterpart of $\bar F_{Q_4}$ performs well in terms of both MSE and bias. Since the bias of this estimator is likely to be small, adjustment for its presumed tendency to increase monotonically is deemed an unnecessary complication.
In the pilot study we generate three collections of data, each consisting of 100 samples of size 10, from independent Geometric life and censoring distributions. In each case the life distribution has parameter p = exp(-0.1). The censoring distributions are chosen so as to produce three levels of censoring: setting p = exp(-λ), where λ = 0.00001, 0.1, 0.3, yields the censoring probabilities P(Y < X) = 0, 0.475, 0.711 respectively.
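These censoring levels can be checked directly: for independent Geometric life X and censoring Y on {1, 2, ...} with P(X > k) = p^k and P(Y > k) = q^k, summing P(Y = k)P(X > k) over k gives P(Y < X) = (1 - q)p/(1 - pq). A minimal Python sketch (ours) verifying this, with a Monte Carlo cross-check:

    import math, random

    def censoring_prob(lam_life, lam_cens):
        """Exact P(Y < X) for independent Geometric X, Y on {1, 2, ...}."""
        p, q = math.exp(-lam_life), math.exp(-lam_cens)
        return (1 - q) * p / (1 - p * q)

    def geometric(u, rate):
        """Inverse-CDF sample: smallest k >= 1 with u <= 1 - exp(-rate * k)."""
        return max(1, math.ceil(-math.log(1.0 - u) / rate))

    rng = random.Random(0)
    for lam in (0.00001, 0.1, 0.3):
        exact = censoring_prob(0.1, lam)
        hits = sum(geometric(rng.random(), lam) < geometric(rng.random(), 0.1)
                   for _ in range(100_000))
        print(f"lambda = {lam}: exact {exact:.3f}, simulated {hits / 100_000:.3f}")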
The conventions followed for extrapolation in the range beyond the largest observed failure time are as follows:

$\tilde P(X > k) = \tilde P(X > t_{n_1})$ for $t_{n_1} \leq k < s_{n_2}$, and $\tilde P(X > k) = 0$ for $k \geq s_{n_2} \geq t_{n_1}$;

$\hat P(X > k) = \hat P(X > t_{n_1})(1 - \hat\rho_{n_1})^{k - t_{n_1}}$ for $k \geq t_{n_1}$.

This definition of the KME rests on the assumption that the largest observation is uncensored, while the definition of the PEGE results from assuming that the failure rate after the largest observed failure time is the same as the failure rate in the interval $(t_{n_1 - 1}, t_{n_1}]$.
Our conventions for extrapolation differ from those of Kitchin and of Whittemore and Keller. Consequently our results involving right-hand tail probabilities differ from theirs: a preliminary indication is that our extrapolation procedures result in estimators that are more realistic than theirs.
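The two conventions translate directly into code; in the Python sketch below (ours), the survival value at the largest observed failure time t_n1, the largest observation s_n2, and the last-interval hazard estimate rho_n1 are assumed to have been computed already, and the numbers in the example call are purely illustrative:

    def kme_tail(kme_at_tn1, t_n1, s_n2, k):
        """KME convention for k >= t_n1: flat up to s_n2, zero beyond."""
        return kme_at_tn1 if k < s_n2 else 0.0

    def pege_tail(pege_at_tn1, rho_n1, t_n1, k):
        """PEGE convention: geometric decay at the last estimated failure rate."""
        return pege_at_tn1 * (1.0 - rho_n1) ** (k - t_n1)

    # Illustrative values only: survival 0.30 at t_n1 = 23, s_n2 = 35, hazard 0.05.
    for k in (23, 30, 35, 45):
        print(k, kme_tail(0.30, 23, 35, k), round(pege_tail(0.30, 0.05, 23, k), 3))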
Although the size of the study precludes reaching more than tentative conclusions, we observe several tendencies.
Tables 1(a), 2(a) and 3(a) contain the estimated bias and mean squared error (MSE) for the KME and the PEGE of $P(X > k)$ for $k = \xi_p$, where $\xi_p$ is the $p$th percentile of the underlying life distribution and p = 1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99. From these tables we make the following observations.
(i) The MSE of the PEGE is generally smaller than that of the KME. The
Table 1
Results of pilot study using 100 samples of size 10, Geometric (p = exp(-0.1)) life distribution,
Geometric (p = exp(-0.00001)) censoring distribution and P(Y < X) ≈ 0

                  Estimated bias            Estimated MSE
Percentile       PEGE        KME          PEGE        KME

(a) Survival function estimators

 1             -0.0184    -0.0018        0.0078      0.0101
 5             -0.0184    -0.0018        0.0078      0.0101
10             -0.0137     0.0123        0.0118      0.0145
20             -0.0172     0.0092        0.0161      0.0182
30             -0.0253    -0.0053        0.0194      0.0225
40             -0.0293    -0.0118        0.0255      0.0279
50             -0.0351    -0.0196        0.0271      0.0278
60             -0.0347    -0.0159        0.0223      0.0257
70             -0.0318    -0.0185        0.0176      0.0212
80             -0.0283    -0.0187        0.0108      0.0133
90             -0.0199    -0.0167        0.0047      0.0060
95             -0.0096    -0.0028        0.0028      0.0049
99              0.0029    -0.0011        0.0006      0.0009

(b) Percentile estimators

 1              0.00        0.63         0.00        1.63
 5              0.21        0.63         0.35        1.63
10             -0.21       -0.37         1.69        1.37
20             -0.08       -0.32         3.00        2.38
30              0.28       -0.10         4.48        3.88
40             -0.16       -0.79         6.20        5.71
50              0.53       -0.08         9.57        9.72
60             -0.62       -1.31        13.70       14.37
70             -1.35       -2.28        20.43       22.82
80             -1.87       -3.34        35.23       35.96
90             -2.29       -4.87        82.53       95.37
95             -2.20       -1.53       130.22      140.17
99             -5.01      -18.53       577.47      481.19

exceptions occur in the right-hand tail of the distribution under conditions of moderate and heavy censoring.
(ii) The MSE of each estimator increases as censoring increases.
(iii) The disparity in the MSE of the two estimators becomes more marked as the censoring increases; that is, the MSE of the PEGE increases by relatively little as the censoring increases, except in the right-hand tail.
(iv) The difference in the MSE of the two estimators is smallest near the median of the distribution.
(v) Both the KME and the PEGE generally exhibit negative bias: the magnitude of the bias of each estimator is greatest around the median of the distribution.
Table 2
Results of pilot study using 100 samples of size 10, Geometric (p = exp(-0.1)) life distribution,
Geometric (p = exp(-0.1)) censoring distribution and P(Y < X) ≈ 0.475

                  Estimated bias            Estimated MSE
Percentile       PEGE        KME          PEGE        KME

(a) Survival function estimators

 1             -0.0223    -0.0018        0.0077      0.0101
 5             -0.0223    -0.0018        0.0077      0.0101
10             -0.0207     0.0106        0.0124      0.0157
20             -0.0215     0.0094        0.0170      0.0208
30             -0.0282    -0.0042        0.0244      0.0300
40             -0.0432    -0.0037        0.0407      0.0502
50             -0.0509    -0.0230        0.0475      0.0601
60             -0.0564    -0.0442        0.0430      0.0634
70             -0.0553    -0.0800        0.0333      0.0603
80             -0.0368    -0.0707        0.0229      0.0413
90             -0.0060    -0.0590        0.0124      0.0151
95              0.0082    -0.0401        0.0082      0.0049
99              0.0149    -0.0091        0.0033      0.0001

(b) Percentile estimators

 1              0.00        0.80         0.00        3.36
 5              0.19        0.80         0.33        3.36
10             -0.34       -0.20         1.66        2.76
20             -0.09        0.08         3.69        5.36
30              0.38        0.80         7.40        9.84
40              0.10        0.64        12.62       17.24
50              0.77        1.43        20.97       25.21
60             -0.20        0.62        34.24       37.26
70             -0.67       -1.44        64.85       36.28
80             -0.88       -2.73       128.02       52.21
90             -1.23       -8.92       302.31      121.66
95             -0.60      -14.92       561.06      264.70
99             -2.30      -31.92      1497.30     1060.98

(vi) The magnitude of the bias of the KME is consistently smaller than that of the PEGE only when there is no censoring. Under conditions of moderate and heavy censoring, the KME is less biased than the PEGE only at percentiles to the left of the median: to the right of the median, the PEGE is considerably less biased than the KME.
(vii) As censoring increases, the magnitude of the bias of the KME increases faster than does that of the PEGE.
Tables 1(b), 2(b) and 3(b) contain the estimated bias and MSE for the Kaplan-Meier (KM) and piecewise geometric (PG) estimators of the percentiles $\xi_p$, p = 1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99. From these tables we make the following observations.
Table 3
Results of pilot study using 100 samples of size 10, Geometric (p = exp(-0.1)) life distribution,
Geometric (p = exp(-0.3)) censoring distribution and P(Y < X) ≈ 0.711

                  Estimated bias            Estimated MSE
Percentile       PEGE        KME          PEGE        KME

(a) Survival function estimators

 1             -0.0230    -0.0018        0.0077      0.0101
 5             -0.0230    -0.0018        0.0077      0.0101
10             -0.0370     0.0033        0.0171      0.0185
20             -0.0582    -0.0273        0.0301      0.0508
30             -0.0714    -0.0479        0.0437      0.0704
40             -0.1150    -0.1011        0.0705      0.1257
50             -0.1232    -0.1443        0.0709      0.1382
60             -0.1006    -0.2421        0.0594      0.1273
70             -0.0702    -0.2286        0.0456      0.0711
80             -0.0347    -0.1775        0.0321      0.0341
90              0.0032    -0.0907        0.0187      0.0082
95              0.0173    -0.0498        0.0125      0.0025
99              0.0206    -0.0091        0.0043      0.0001

(b) Percentile estimators

 1              0.10        0.87         0.52        3.27
 5              0.24        0.87         0.68        3.27
10             -0.41       -0.13         1.37        2.53
20             -0.08        0.52         3.22        7.86
30              0.29        0.76         7.19        8.82
40             -0.20       -0.10        15.16        9.56
50              0.48        0.16        28.06       10.86
60             -0.47       -2.38        50.99       16.66
70             -0.78       -4.91        90.72       36.07
80             -1.11       -8.54       167.67       84.44
90             -1.68      -15.53       357.58      252.63
95             -1.25      -21.53       619.71      474.99
99             -3.34      -38.53      1508.06     1496.01

(i) With a few exceptions, the PG percentile estimator is less biased than the
KM percentile estimator.
(ii) Both estimators tend to be negatively biased.
(iii) At each level of censoring, the bias of the PG percentile estimator is
negligible for percentiles smaller than the 70th, and it is acceptably small for larger
percentiles, except perhaps the 99th percentile. In contrast, the KM percentile
estimators are almost unbiased only for percentiles smaller than the 60th: to the
right of the 60th percentile the bias tends to be very much larger than that of the
PG estimators. This tendency is particularly noticeable in the case of heavy
censoring.
(iv) The MSE of the PG percentile estimator is smaller than that of the KM percentile estimator only in certain ranges, viz.: p ≤ 70 for heavy censoring, p ≤ 40 for moderate censoring, and 5 ≤ p ≤ 95 for no censoring. Since the PG percentile estimator is almost unbiased outside these ranges, the large MSE must be the result of having large variance.
On the basis of the observations involving the survival function estimators, we conclude that the small sample behaviour of the PEGE resembles that of the PEXE: specifically, when there is little or no censoring, the PEGE compares favourably with the KME in terms of MSE but not in terms of bias. We expect that this is true irrespective of the level of censoring when the sample size is larger. It remains to be seen whether inversion of this general behaviour is typical when the sample size is very small and censoring is heavy. It is evident that increased censoring affects the bias and the MSE of the PEGE less than it affects the bias and the MSE of the KME.
Our conclusions about the percentile estimators are even more tentative because of the lack of results involving the behaviour of percentile estimators. The fact that the PG percentile estimator is almost unbiased even in the presence of heavy censoring, and even as far to the right as the 95th percentile, is of considerable interest because the KM extrapolation procedures are clearly inadequate for estimating extreme right percentiles.
Regarding the MSE, we note that, under conditions of moderate or heavy censoring, any estimator of the larger percentiles is expected to vary considerably because there are likely to be very few observations in this range. The ad hoc extrapolation procedure for the KM is expected to cause the estimators of the extreme right percentiles to exhibit large negative bias and little variation. In view of these considerations and the accuracy of the PG percentile estimators, we conclude that the fact that the MSE of the PG percentile estimator of the larger percentiles is greater than that of the KM percentile estimator is not evidence of a breakdown in the reliability and efficiency of the PG percentile estimator.
The general indications of our pilot study are that the PEGE and the discrete version of $\bar F_{Q_4}$ are attractive alternatives to the KME. In view of the resemblance between the properties of the PEGE and those of the PEXE, the results for $\bar F_{Q_4}$ portend well for the new discrete estimator: we expect it to be almost unbiased and to be not only more efficient than the KME but also more stable under increased censoring. Moreover, we expect the corresponding percentile estimator to have these desirable properties also, because it is likely to behave at least as well as the PG percentile estimator.
The properties involving relative efficiency are of considerable importance because relative efficiency is a measure of the relative quantities of information utilized by the estimators being compared. This interpretation of relative efficiency, and the fact that heavy censoring is often encountered in engineering problems, makes $\bar F_{Q_4}$ and its discrete counterpart even more attractive.

References

Aalen, O. (1976). Nonparametric inference in connection with multiple decrement models. Scandinavian J. Statist. 3, 15-27.
Aalen, O. (1978). Nonparametric estimation of partial transition probabilities in multiple decrement models. Ann. Statist. 6, 534-545.
Breslow, N. and Crowley, J. (1974). A large sample study of the life table and product limit estimators
under random censorship. Ann. Statist. 2, 437-453.
Chen, Y. Y., Hollander, M. and Langberg, N. (1982). Small-sample results for the Kaplan-Meier
estimator. J. Amer. Statist. Assoc. 77, 141-144.
Cox, D. R. (1972). Regression models and life tables. J. Roy. Statist. Soc. Ser. B 34, 187-202.
Desu, M. M. and Narula, S. C. (1977). Reliability estimation under competing causes of failure. In:
I. Shimi and C. P. Tsokos, eds., The Theory and Applications of Reliability I. Academic Press, New
York.
Efron, B. (1967). The two sample problem with censored data. In: Proceedings of the Fifth Berkeley
Symposium on Mathematical Statistics and Probability Vol. IV. University of California Press,
Berkeley, CA, 831-853.
Fleming, T. R. and Harrington, D. P. (1979). Nonparametric estimation of the survival distribution
in censored data. Technical Report No. 8, Section of Medical Research Statistics, Mayo Clinic,
Rochester, MN.
Freireich, E. J. et al. (1963). The effect of 6-Mercaptopurine on the duration of steroid-induced
remission in acute leukemia. Blood 21, 699-716.
Kaplan, E. L. and Meier, P. (1958). Nonparametric estimation from incomplete observations. J.
Amer. Statist. Assoc. 53, 457-481.
Kitchin, J. (1980). A new method for estimating life distributions from incomplete data. Unpublished
doctoral dissertation, Florida State University.
Kitchin, J., Langberg, N. and Proschan, F. (1983). A new method for estimating life distributions
from incomplete data. Statist. and Decisions 1, 241-255.
Langberg, N., Proschan, F. and Quinzi, A. J. (1981). Estimating dependent life lengths, with applica-
tions to the theory of competing risks. Ann. Statist. 9, 157-167.
Miller, R. G. (1981). Survival Analysis. Wiley, New York.
Mimmack, G. M. (1985). Piecewise geometric estimation of a survival function. Unpublished doctoral
dissertation, Florida State University.
Nelson, W. (1969). Hazard plotting for incomplete failure data. J. Quality Technology 1, 27-52.
Nelson, W. (1972). Theory and applications of hazard plotting for censored failure data. Tech-
nometrics 14, 945-966.
Peterson, A. V. (1977). Expressing the Kaplan-Meier estimator as a function of empirical sub-
survival functions. J. Amer. Statist. Assoc. 72, 854-858.
Susarla, V. and Van Ryzin, J. (1976). Nonparametric Bayesian estimation of survival curves from
incomplete observations. J. Amer. Statist. Assoc. 71, 897-902.
Umholtz, R. L. (1984). Estimation of the exponential parameter for discrete data. Report, Aberdeen
Proving Ground.
Whittemore, A. S. and Keller, J. B. (1983). Survival estimation with censored data. Stanford
University Technical Report No. 69.
P. R. Krishnaiah and C. R. Rao, eds., Handbook of Statistics, Vol. 7
© Elsevier Science Publishers B.V. (1988) 281-311

Applications of Pattern Recognition in Failure Diagnosis and Quality Control

L. F. Pau

1. Introduction

Through its compliance with, and implications for, design, manufacturing, quality control, testing, operations and maintenance (Figures 1 and 2), the field of technical diagnostics has wide-ranging consequences in all technical fields; some of the measures thereof are:
- system availability;
- system survivability;
- safety;

[Figure 1: The failure-free system E_0 (safe and available state) enters, at the failure rate, a failed state E_i, i = 1, ..., N-1 (unsafe and non-available state: system failure not diagnosed), and passes, at the repair rate, through the diagnosed state (safe and non-available state: system failure diagnosed).]

Fig. 1. Relation between failure diagnosis, reliability or degradation processes, safety and maintenance. If the repair is instantaneous (μ = +∞), if there is no detection delay (t_m + t_d = 0), and if the diagnostic system itself never fails, the asymptotic system availability in the stationary case is $A = \mathrm{Prob}(\text{UUT not failed})_{t \to +\infty} = \prod_{i=1}^{N-1} \mu_i/(\lambda_i + \mu_i)$. More general formulae may be derived, especially for finite repair times and more general degradation processes.
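Under the assumptions stated in the caption, the availability product is trivial to evaluate; a minimal Python sketch (ours), with illustrative rates:

    from math import prod

    def asymptotic_availability(failure_rates, repair_rates):
        """A = prod_i mu_i / (lambda_i + mu_i), assuming instantaneous detection
        and a perfectly reliable diagnostic system, as stated in Figure 1."""
        return prod(mu / (lam + mu)
                    for lam, mu in zip(failure_rates, repair_rates))

    # e.g. two failure modes with lambda = 0.01, 0.02 and mu = 1.0 (per hour)
    print(asymptotic_availability([0.01, 0.02], [1.0, 1.0]))   # ~0.9707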


[Figure 2]

- production yield;
- quality;
- failure tolerance;
- system activation delays;
- costs (lifetime, operation);
- maintenance;
- warranties.
We define here technical diagnostics as the field dealing with all methods,
processes, devices and systems whereby one can detect, localize, analyse and
monitor failure modes of a system, i.e., defects and degradations (see Section 2).
It is at this stage essential to stress that, whereas system reliability and safety theories are concerned with a priori assessments of the probability that the system will perform a required task under specified conditions, without failure, for a specified period of time, the field of failure diagnosis focuses essentially on a posteriori and on-line processing and acquisition of all monitoring information for later decision making.
Failure diagnosis has itself evolved from the utilization of stand-alone tools (e.g. calipers), to heuristic procedures, later codified into maintenance manuals. At a later stage, automatic test systems and non-destructive testing instruments, based on specific test sequences and sensors, have assisted the diagnosis; examples are: rotating machine vibration monitoring, signature analysis, optical flaw detection, ultrasonics, ferrography, wear sensors, process parameters, thermography, etc. More recently, however, there have been implementations of, and research on, evolved diagnostic processes, with heavier emphasis on sensor integration, signal/image processing, software and communications. Finally, research is carried out on automated failure diagnosis, and on expert systems to accumulate and structure failure symptoms and diagnostic strategies (e.g. avionics, aircraft engines, software).
Although the number of application areas and the complexity of diagnostic systems have increased, there is still a heavy reliance on 'ad hoc' or heuristic approaches to basing decisions on diagnostic information. But a number of fundamental diagnostic strategies have emerged from these approaches, and they prove to be common to these very diversified applications.
After having introduced in Section 2 a number of basic concepts in technical
diagnostics, we will review in Section 3 some of the measurement problems. The
basic diagnostic strategies will be summarized in Section 4. Areas for future
research and progress will be proposed in Section 5.

2. Basic concepts in technical diagnostics

Although they may have parts in common, we will essentially distinguish between the system for which a diagnosis is sought (system/unit/process under test: UUT) and the diagnostic system. The basic events in technical diagnostics are well defined in terminology standards; they are: failure, defect, degradation, condition.
[Figure 3: Estimated failure mode vs. actual failure mode E_i, i = 0, 1, ..., N-1 (confusion matrix); catastrophic failure, degradation failure and failure localization; failure detection and failure diagnosis.]

Fig. 3.

A failure mode is then the particular manner in which an omission of expected occurrence or performance of a task or mission happens; it is thus a combination of failures, defects, and degradations. For a given task or mission, the N possible failure modes will be denoted E_0, E_1, ..., E_{N-1}, where E_0 is the no-failure operating mode fulfilling all technical specifications. Ê is the failure mode identified by the diagnostic system.

2.1. The basic troubleshooting process (Figure 3)


2.1.1. Failure detection: This is the act of identifying the presence or absence of
a non-specified failure mode in a specified system carrying out a given task or
mission, or manufactured to a given standard.

2.1.2. Failure localization: If the outcome of failure detection is positive, then


failure localization designates the material, structures, components, processes,
systems or programs which have had a failure.

2.1.3. Failure diagnosis: The act or process of identifying a failure mode E upon an evaluation of its signs and symptoms, including monitoring information. The diagnostic process therefore carries out a breakdown of failure detection into individual failure modes.
2.1.4. Failure analysis: The process of retrieving via adequate sensors all possible information, measurements, and non-destructive observations, altogether called diagnostic information, about the life of the system prior and up to the failure; it is also a method whereby these pieces of information are correlated.

2.1.5. Failure monitoring: This is the act of observing indicative change of equip-
ment condition or functional measurements, as warnings for possible needed
corrections.

2.2. Performance of a diagnostic process


As a decision operator, any diagnostic system can make errors; each of the following errors or performances can be specified either for a specific failure mode, or in the expected sense over the set of all possible failure modes E_0, ..., E_{N-1}. The probabilities 2.2.1-2.2.4 can be derived from the confusion matrix (Figure 3). The overall effect of these performances is to affect system availability, with or without a test system linked to the UUT (Pau, 1987b).

2.2.1. Probability of incorrect diagnosis: This is the probability of diagnosing a


failure mode different from the actual one, with everything else equal.

2.2.2. Probability of reject (or miss, or non-detection): This is the probability of


taking no decision (diagnosis or detection) when a failure mode is actually present.

2.2.3. Probability of false alarm: The probability of diagnosing that a failure mode
is present, when in fact none is present (except the normal condition Eo).

2.2.4. Probability of correct detection: The probability of detecting correctly a


failure mode to be present, when it actually is (E_0 excepted); when there is only one possible failure mode E_1, it is the complement to one of the probability of false alarm.
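As an illustration, the probabilities 2.2.1-2.2.4 can be read off an empirical confusion matrix; in the Python sketch below (ours), the counts and the extra 'no decision' column used to represent rejects are our own assumptions, not notation from the chapter:

    import numpy as np

    # Rows: actual mode E0..E2; columns: diagnosed E0..E2 plus a final
    # "no decision" column for rejects. Counts are purely illustrative.
    C = np.array([[90,  3,  2,  5],      # actual E0 (normal condition)
                  [ 4, 80,  6, 10],      # actual E1
                  [ 2,  7, 81, 10]])     # actual E2
    P = C / C.sum(axis=1, keepdims=True)           # P(diagnosis | actual mode)

    false_alarm  = 1 - P[0, 0] - P[0, -1]          # 2.2.3, under E0
    reject       = P[1:, -1]                       # 2.2.2, given a failure mode
    correct_det  = 1 - P[1:, 0] - P[1:, -1]        # 2.2.4, some failure found
    incorrect_dx = np.array([P[i, 1:-1].sum() - P[i, i] for i in (1, 2)])  # 2.2.1

    print(false_alarm, reject.mean(), correct_det.mean(), incorrect_dx.mean())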

2.2.5. Failure coverage: This is the conditional probability that, given there exists
a failure mode of the UUT, this system is able to recover automatically and
continue operations. The process of automatic reconfiguration, and redundancy
management has the purpose of improving the coverage and making the system
fault-tolerant.

2.2.6. Measurement time t_m: This is the total time required for acquiring all diagnostic information (except a priori information) required for the failure detection, localization and diagnosis. This time may be fractioned into subsequences, and estimated in expected value.

2.2.7. Detection (or diagnosis) delay t_d: This is the total time required to process and analyze the diagnostic information, and also to display or transmit the failure mode as determined. This time may be fractioned into subsequences, and estimated in expected value.

2.2.8. Forecasting capability t_f: This is the lead time with which the occurrence of a specific failure mode (E_0 excepted) can be forecast, given a confidence interval or margin.

2.2.9. Risks and costs: Costs/risks are attached to each diagnosed failure mode Ê obtained as a result of the decision process; they include: testing costs, maintenance/repair costs, safety risks, lost usage costs, warranties, yields.

3. Sensors and diagnostic information

3.1. Degradation processes


The thorough knowledge about the failure mode occurrence process, and not
only about the normal operating mode Eo, is an absolute must. It requires the
understanding of all physical effects, as well as of software errors, besides design,
operations, human factors and procedures (Figure 4). Failure modes may also
occur because of interactions with other UUT's (machines, or communication
nodes working together).
The result of this knowledge is the derivation or inference of:
• categorized lists of failure modes, and their duration or extent;
• lists of features or characteristic symptoms for detection and diagnosis, with
measurement ranges;
• priorities among categorized failure modes vs.:
- probabilities (availability),
- safety (critical events),
- timing (triggering, windowing, etc.),
- fault-effect models (e.g. error propagation, stress-fracture relations).
This information is also used for the selection of sensors for technical
diagnostics.

3.2. Sensors for technical diagnostics


It is important to distinguish between two classes of diagnostic sensors:
- passive sensors, with no interaction of probing energy with the UUT;
- active sensors, with interaction of probing energy with the UUT, perturbing its operations; this is carried out by personnel, automatic test systems, programmable systems, or other probing means.
In turn, the measurement process is either destructive or non-destructive for the UUT.
Needless to say, there is a very wide range of sensors, described and reported in the technical diagnostics, measurement, and non-destructive testing literature.
These sensors are generally used in sequence. We will give below one application

[Figure 4: Degradation processes arise from design, the physical world (physics, chemistry, materials), the software world, operations, human factors and procedures.]

Fig. 4. Main causes for a degradation.

area into which much sensor development research is going, and refer the reader
to the References for other fields.

EXAMPLE. Integrated circuits diagnosis. See Figure 5.

3.3. Data fusion and feature extraction


3.3.1. Data fusion: In evolved diagnostic systems, it is realized that efficient
diagnosis cannot, in many cases, be based on the acquisition of one single
measurement only, possibly with one single sensor only (Pau, 1987a).
Another fundamental approach is to strive towards the acquisition of the measurement(s) by monitoring throughout the entire system life, including manufacturing, testing, operations, maintenance, modifications.
In order to cover these two requirements, evolved diagnostic systems are based on sensor diversity, which besides increases the global sensor reliability and reduces the vulnerability (Figure 6).

3.3.2. Feature extraction: The features are then those combined symptoms derived jointly from different sensors, these measurements being combined together by an
Non-destructive, active sensors:
- Electrical signature analysis
- Logic testing
- Micromanipulator probes (after removing die coat)
- Nematic LCD to highlight operating circuit paths
- LCD displays for comparative circuit nodal analysis
- Soft failure testing (alpha)
- Electron beam microscopy
- X-ray analysis

Non-destructive, passive sensors:
- Visual inspection
- Electron microscopy
- Electrical pin-to-pin characterization
- Leak testing
- Auger analysis
- Infrared thermography
- Freon boiling of hot spot
- LCD to detect changes in electrical field

Destructive, active sensors:
- Capacitive discharges
- Dynamic and monitored accelerated testing/burn-in
- Humidity, vibration, EMC testing
- Mechanical abrasion with ultrasonic probe
- Radiation testing
- Laser melt
- Photoresist etching

Destructive, passive sensors:
- Passive accelerated testing/burn-in
- Storage reliability testing

Fig. 5. Sensors and measurement processes for the diagnosis of integrated circuits.

operation called feature extraction, to increase their usefulness for diagnosis. Data fusion from diverse sensors usually leads to much improved features, and to monitoring capabilities over the entire system life.
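A minimal Python sketch of the idea (ours; the two channels and the particular summary statistics are invented for illustration):

    import numpy as np

    def extract_features(vibration, temperature):
        """Fuse two sensor channels into a single diagnostic feature vector."""
        return np.array([
            np.sqrt(np.mean(vibration ** 2)),            # RMS vibration level
            np.max(np.abs(vibration)),                   # peak amplitude
            np.mean(temperature),                        # mean temperature
            np.polyfit(np.arange(len(temperature)),
                       temperature, 1)[0],               # temperature drift
        ])

    rng = np.random.default_rng(0)
    vib = rng.normal(0.0, 1.0, 1024)                     # accelerometer record
    temp = 60 + 0.01 * np.arange(512) + rng.normal(0, 0.2, 512)  # slow heating
    print(extract_features(vib, temp))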

3.3.3. Sensor diversity: The diversity is in terms of:
- measurement processes,
- design,
- location,
- acquisition rate, bandwidth, gain, wavelength, etc.,
- environmental exposure,
with possible sensor redundancies (active, passive) and distributed sensor control.

3.4. Measurement problems in technical diagnostics

In addition to the classical issues of calibration, measurement stability, process consistency/stability, environment, noise, the specific concerns are:

3.4.1. Observability: This is an eventual property of dynamic systems which expresses the ability to infer or estimate the system condition at a given past
instant in time, from quantified records of all measurements made on it at later points in time. This property does not hold for most UUT's, first because of missing measurements/data, and next because of time-dependent changes of the system condition which, in general, cannot be modelled.

3.4.2. Accessibility to measurement points: One of the main limitations to observability is poor accessibility of the main test or measurement points because of inadequate design, and the insufficient number of such measurements. Another source of limitation is inadequate selection of the measurement sampling frequency (spatial or temporal or optical), so that fine features revealing incipient failures go unnoticed. Measurement delays t_m are also a problem.

3.4.3. Effect of control elements and protective elements: The observability is further reduced for some parts of the UUT because of:
- physical protection: hybrid/composite structures, coatings, multilayers, casings;
- protection and failure recovery systems: protection networks, fault-tolerant parts, active spares, safing modes;
- control elements: feedback controllers, limiters, and measurement effects due to the detection delay t_d.

3.4.4. Sensor-UUT interactions: In the case of electrical and mechanical measurements, impedance and bandwidth mismatch are introduced at the interface level, resulting in signal distortion features which do not originate in system failure modes. In the case of human observations, sources of observation error are many, as expected. In the case of active sensors, it is essential to understand and model as well as possible:
- the propagation of the probing energy into the UUT and its interaction with the defects or failures;
- the inverse problem of how defect and failure features propagate to the sensor.

EXAMPLE. Effects of intrinsic fracture energy on brittle fracture vs. ductile fracture under plasticity, external and internal chemistry, and structural loadings. This leads to complex crack kinetics, and ductile vs. brittle process models.

3.4.5. Support structure: The support structure, casing or board may, by its
properties or behavior, interfere with both the sensor and UUT, e.g. because of
mechanical impedance, electromagnetic interference (EMI), etc.

3.4.6. Distortion: This is a classical problem in measurement, but added difficulties result from the fact that the sensors themselves cannot be properly modelled outside their normal operating bandwidth, whereas the true measurements on systems which fail are likely to be characterized by extremely large bandwidths. Such large bandwidths also conflict with low-noise and infrequent-calibration requirements.
Diversity by:
- sensor/measurement type
- location
- design
- environment
- data acquisition (bandwidth, gain, wavelength, data rate)
- software
with possible redundancies (active, passive, software), and distributed control.

Sensor measurement type 1: signals (analog; digital; radiation)
Sensor measurement type 2: images, electromagnetic waves
Sensor measurement type 3: human text; procedures; software behavior
All three feed the feature extraction of diagnostic information.

Fig. 6. Feature extraction and data fusion with sensor diversity.

3.4.7. Sensor reliability: Failure analysis and diagnosis are only possible if the sensors of all kinds survive system failures; this may require sensor redundancy (physical or analytical), separate power supplies, and different technologies and computing resources. Besides sensor and processor reliability, short reaction time t_m and good feature extraction are conflicting hardware requirements, all of which contribute to increased costs which in turn limit the extent of possible implementations. Any diagnostic subsystem, and any UUT subsystem which can be activated separately, should be equipped with a time meter, unit or cycle counter.

3.4.8. Data transmission errors: Whether the UUT is autonomous or not, analog or digital multiplexing will often be used, followed by data transmission, e.g. on a common bus or local network. These transmission links may themselves generate errors and fail. However, if the data acquisition rate is slow under good operating conditions, data transmission sometimes becomes irrelevant: on-site temporary data storage is then a convenient solution.

3.5. Research on sensors for diagnosis

The main trends are:
- development of cheap and reliable distributed sensor arrays (acoustic imaging, fiber optic sensors, distributed position sensors, accelerometers, ...);
- sensor integration and measurement fusion, to enhance the detection and diagnosis capabilities (vibration/pressure, temperature/pressure, optical/temperature, pressure/acceleration/flow);
- in-built analog-to-digital, or optoelectronic conversion;
- in-built digital data error-detecting-correcting circuits;


- software controlled calibration;
- better impedance matching of active sensors;
- noise suppression.
Moreover, there is increased attention given to the processing of unstructured
verbal/written reports and actions by human operators: even if expressed in plain
language, they will often reveal essential diagnostic features.

4. Diagnostic processes

As already mentioned in Section 1, there appear to exist essentially a few fundamental diagnostic processes. The discovery of these amidst the technicalities of specific implementations has actually led to substantial achievements across different application areas (e.g. from mechanical to control systems, from software to mechanical processes). We will therefore review the:
- diagnostic strategies;
- diagnostic system architectures controlled by these strategies (active and passive sensors);
- test generation.

[Figure 7: Diagnostic strategies S.]

4.1. Diagnostic strategies S (Figure 7)


4.1.1. Diagnostic strategies S are always sequential, in at least one of the
following aspects:

4.1.1.1. UUT configuration D: Diagnosis is sequentially applied to:


- units/components;
- systems obtained by stepwise integration of these units/components;
- automata, software modules, operating systems obtained by stepwise integration of the UUT with other interfacing systems (sensors, displays, controls, etc.), the selection being guided by the diagnostic strategy.

4.1.1.2. Diagnostic information Y: The diagnosis uses increasing numbers of diagnostic measurements coming from a diversity of sensors, the selection being guided by S; when active sensors are considered, the diagnostic measurements are the results of the probing, as applied to successive UUT decompositions D.

4.1.1.3. A priori/learning information I: The diagnosis uses an increasing amount of a priori/learning information, the retrieval being guided by S; this information set I includes data on the degradation process (see Section 3.1).

As a result, a diagnostic strategy S is a sequential search process in the product set (D × Y × I): it is clear that UUT parts registration and data labelling are both needed, besides timing information.
4.1.2. There are essentially three basic diagnostic strategies S:

4.1.2.1. Failure mode removal by analysis and inspection: The detection, diagnosis, localization and removal of the failure mode which has occurred are carried out in sequence; the removal affects, among others: requirements, design, control, usage, parts, repair, programs, etc.

4.1.2.2. Validation: Diagnosis cannot be considered complete until the UUT has been demonstrated to satisfy the requirements that were set out in the UUT specifications; validation consists in verifying that these are met.

4.1.2.3. Exploring the operational envelope: The external specifications define the operational envelope within which the UUT must perform correctly in mode E_0. These performance limits, while representative of the real-world process, are not necessarily accurate, and quite different system states may occur. These strategies S therefore explore the behavior under circumstances not given as performance requirements, including 'severe' operating environments.

4.1.3. Diagnostic strategy assessment: The assessment is done in terms of the expected risk attached to a random failure mode E, as estimated in terms of the various performance criteria listed in Section 2.2.

4.1.4. Example: classification of software testing strategies S: The known software testing techniques can be classified into the three classes of Section 4.1.2; see Figure 8.

1. Failure removal:
- Sensitized path testing
- Fault seeding
- Hardware/software test points and monitoring software
- Code analyzers
- Dynamic test probes, injection of test patterns of bits

2. Validation:
- Proof-of-correctness
- Program verification by predicate testing
- Proof-of-loops
- Validation using a representation in a specification language
- Validation by simulation

3. Exploring the operational envelope:
- Endurance tests
- Derivation of tests outside the specifications, by a specification language
- Automatic test case generation
- Behavior of specific routines in extreme cases
- Stress tests (inputs, time), saturation tests

Fig. 8. Classification of software testing strategies S.


4.2. Diagnostic system architectures

The diagnostic strategies S to be implemented control the utilization of and access to: the UUT configuration D, the diagnostic information Y, and the failure models and a priori information I, all of which are part of the diagnostic system. The failure mode Ê is determined by the final diagnostic decision unit. Especially important in the diagnostic system architecture are the sequential set-up vs. D, Y, I with backtrackings, and the:

4.2.1. Measurement/diagnostic information unit: This senses diagnostic information by active and passive sensors, and performs a parametric UUT identification by adjusting a parametric model of the UUT; the estimated parameters are fed into the diagnostic decision unit.
If these parameters are all measurable, the diagnosis is called external; if they are only observable (and estimated by e.g. modal analysis, Kalman filter, or error-detection-correction), the diagnosis is called internal.

4.2.2. Failure model unit: For a given UUT configuration D, operational environment, and set of other learning information I, this unit identifies and prioritizes the possible failure modes E_0, E_1, ..., E_{N-1} (e.g. critical parts, active routines, fracture locations). A failure mode effect model (FMEA analysis) is then adjusted to a usage model of the UUT (incorporating e.g. fatigue, ductility, heating, cumulative failures, cumulative contents of registers) to derive predicted parameter values for all possible failure modes E_0, E_1, ..., E_{N-1}, and the potential effects on the UUT performances.

Note that under a sequential diagnostic strategy S, a whole hierarchy of models, with corresponding adjustment factors (environment, specification of parts, usage), is needed; these models usually take the simple form of multi-entry tables stored in read-only memories (e.g. fault dictionaries).

EXAMPLE. Sneak circuit analysis (failure mode identification). This is, for electronic circuits, a systematic review of electric current and logic paths down to the components and logic statements, to detect latent paths, timing errors, software errors, hardware failures. It uses essentially the specifications and nodal/topological network analysis, in addition to state diagrams for the logic.

EXAMPLE. Failures of bearings (FMEA analysis). See Figure 10.

Failure modes E_1, ..., E_{N-1}            Examples of feature parameters

Fatigue of rolling elements/tracks         Vibration parameters; fiber optic
                                           inspection; shock pulses
Wear                                       Radial changes in shaft position/deflection
Cage failures                              Frictional losses; temperature changes
Lubrication starvation, contamination      Temperature changes

Fig. 10. Failure modes of bearings (FMEA analysis).

4.2.3. Diagnostic decision unit (Figure 11): This decision logic determines the likely failure mode Ê among E_0, E_1, ..., E_{N-1} from the estimated and predicted parameters, with account taken of the cost/risk/time factors. This process, which may also derive classification features from these data, is essentially a pattern recognition process (signals, images, coded data, text, symbols, logic invariants); the simplest case is straightforward comparison (template matching) between estimated and predicted parameters (including event counts).
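A minimal Python sketch of this simplest case (ours; the feature parameters and template values are invented):

    import numpy as np

    # Predicted feature parameters for modes E0..E3 (rows): here, a vibration
    # level and a bearing temperature. Values are purely illustrative.
    templates = np.array([[0.10, 60.0],      # E0: normal
                          [0.80, 62.0],      # E1: fatigue (high vibration)
                          [0.15, 75.0],      # E2: lubrication starvation (hot)
                          [0.90, 78.0]])     # E3: cage failure
    scale = templates.std(axis=0)            # crude per-feature normalization

    def diagnose(estimated):
        """Return the index i of the template nearest the estimated parameters."""
        d = np.linalg.norm((templates - estimated) / scale, axis=1)
        return int(np.argmin(d))

    print(diagnose(np.array([0.78, 61.5])))  # -> 1, i.e. failure mode E1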

When the diagnostic decision is used for the prediction of the remaining UUT life, and passive sensors only are used, one would use the term non-destructive evaluation (NDE) instead of technical diagnostics.
Extensions to the above are required within the context of knowledge-based systems or expert systems for diagnostics (Pau, 1986).

4.3. Test generation

This is the process whereby the active sensors, controlled by the diagnostic strategy S, select and apply specific types of probing energy to the UUT. These processes can be classified according to two criteria:
(i) functional testing (by cause-effect tables) vs. structural testing (by sensitizing probing energy);
(ii) deterministic vs. random (by noise, Monte Carlo simulation, random events).
The possible failure modes, and the corresponding probing signals generated by the active sensors, will usually be determined by the failure model unit (Section 4.2.2).
However, the difficult design/selection issue to be resolved is whether these test signals can also detect failure modes other than those which they should characterize. Test generation design will have both to minimize these overlaps, and to find minimum test sequences to energize all hypothesized failure modes.
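Finding such a minimum test sequence is an instance of set cover; a greedy Python sketch (ours, with an invented test-to-failure-mode coverage map):

    def greedy_test_selection(coverage):
        """coverage: dict test -> set of failure modes the test can energize.
        Returns a short (greedy) test sequence covering all coverable modes."""
        uncovered = set().union(*coverage.values())
        sequence = []
        while uncovered:
            best = max(coverage, key=lambda t: len(coverage[t] & uncovered))
            if not coverage[best] & uncovered:
                break                           # remaining modes are untestable
            sequence.append(best)
            uncovered -= coverage[best]
        return sequence

    tests = {"T1": {"E1", "E2"}, "T2": {"E2", "E3", "E4"},
             "T3": {"E1", "E5"}, "T4": {"E4", "E5"}}
    print(greedy_test_selection(tests))         # -> ['T2', 'T3']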

4.4. Design considerations for diagnostic system architectures

These architectures must meet conflicting criteria, which are essentially:
- maximum diagnostic system reliability, because it must in general be larger than the UUT reliability;
- relative diagnostic system cost vs. UUT cost;
- ease of use for human operators; the diagnostic system must be either faster or more intelligent;
- updating capabilities and traceability;
- simultaneous design of the UUT and the diagnostic system.
[Figure 11: Diagnostic decision unit.]
4.5. Statistical pattern recognition methods used

The diagnostic decision (Section 4.2.3 and Figure 2) is explicitly a pattern classification problem, as already stated (Pau, 1981). In the case where the measurements Y are restricted to numerical values (signals, data), the statistical pattern recognition methods (Fukunaga, 1972; Sebestyen, 1962) apply (Saeks and Liberty, 1977; Pau, 1981a, b; Rasmussen and Rouse, 1981). In view of the requirements of the previous sections (especially 4.4), the standard methods used at each stage of the diagnostic decision are (Section 2.1):

Features are selected and priority ranked among the following:


1. User traffic (demand)
2. Off-line teletraffic measurements and statistics on:
- each route or link (flows and intensities)
- around each traffic node (input-output measurements)
3. On-line teletraffic measurements for:
- flow control
- congestion control/windowing
- routing
- protocol use and interrupts
4. Hardware, software node condition monitoring
5. Error correction, propagation anomalies compensation, and disruption of links
6. Test and monitoring unit condition
7. Protection of transmission links carrying diagnostic information

Fig. 12. Features for data communications network tests and monitoring.

Failure detection
- Sequential hypothesis testing (Wald, 1947); a sketch follows this list.
- Non-parametric sequential testing (Pau, 1978; Fu, 1968; Wald, 1947).
- Hypothesis testing (shift of the mean, variance) (Clark et al., 1975; Sebestyen,
1962).
- Bayes classification (Fukunaga, 1972).
- Discriminant analysis (Fukunaga, 1972; Sebestyen, 1962).
- Nearest neighbor classification rule (Fukunaga, 1972; Devijver, 1979).
- Sensor/observation error compensation (Pau and Kittler, 1980).
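
A minimal sketch of the first of these methods, Wald's sequential probability ratio test for a shift of the mean of a Gaussian condition-monitoring signal, is given below; the parameter values are hypothetical.

    import numpy as np

    def sprt_mean_shift(xs, mu0, mu1, sigma, alpha=0.01, beta=0.01):
        """Wald SPRT: H0: X ~ N(mu0, sigma^2) (healthy) versus
        H1: X ~ N(mu1, sigma^2) (failed); alpha, beta are error rates."""
        a = np.log(beta / (1 - alpha))      # lower (accept H0) boundary
        b = np.log((1 - beta) / alpha)      # upper (accept H1) boundary
        llr = 0.0
        for n, x in enumerate(xs, start=1):
            # Gaussian log-likelihood ratio increment for one observation
            llr += (mu1 - mu0) * (x - (mu0 + mu1) / 2.0) / sigma**2
            if llr <= a:
                return "H0", n
            if llr >= b:
                return "H1", n
        return "undecided", len(xs)

    rng = np.random.default_rng(0)
    print(sprt_mean_shift(rng.normal(1.0, 1.0, 200), mu0=0.0, mu1=1.0, sigma=1.0))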

Failure localization
- Graph search algorithms (Saeks and Liberty, 1977; Rasmussen and Rouse,
1981; Slagle and Lee, 1971).
- Branch-and-bound algorithms (Narendra and Fukunaga, 1977).
- Dynamic programming (Pau, 1981a; Bellman, 1966).
- Logical inference (Pau, 1984).

Failure diagnosis
- Correspondence analysis (Pau, 1981a; Hill, 1974; Section 5).
- Discriminant analysis (Van de Geer, 1971; Benzecri, 1977).
Applications of pattern recognition in failure diagnosis and quality control 299

- Canonical analysis (Hartman, 1960; Benzecri, 1977).


- Nearest neighbor classification rule (Fukunaga, 1972; Devijver, 1979).
- Knowledge based or expert systems for diagnostics (Pau, 1986).

Failure analysis
- Variance analysis, correlation analysis (Van de Geer, 1971).
- Principal components analysis (Pau, 1981a; Van de Geer, 1971; Chien and Fu,
1967).
- Scatter analysis (Van de Geer, 1971; Everitt, 1974).
- Clustering procedures, e.g. dynamic clusters algorithm (Pau, 1981a; Everitt,
1974; Hartigan, 1975).
- Multivariate probability density estimation (Parzen, kernel functions, k-nearest neighbour estimators) (Fukunaga, 1972; Devijver, 1979; Parzen, 1962); a kernel-estimator sketch follows this list.
- Multivariate sampling plans (Pau et al., 1983).
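
As an illustration of the density-estimation entry above, a Parzen-window estimator with a Gaussian kernel takes only a few lines; the data and bandwidth below are hypothetical.

    import numpy as np

    def parzen_density(x, samples, h):
        """Parzen-window estimate of a d-dimensional density at point x,
        using a Gaussian kernel of bandwidth h (Parzen, 1962)."""
        samples = np.asarray(samples, dtype=float)
        n, d = samples.shape
        u = (samples - x) / h
        k = np.exp(-0.5 * np.sum(u * u, axis=1)) / (2 * np.pi) ** (d / 2)
        return k.sum() / (n * h ** d)

    rng = np.random.default_rng(1)
    data = rng.normal(0.0, 1.0, size=(500, 2))       # hypothetical healthy-mode data
    print(parzen_density(np.zeros(2), data, h=0.5))  # near 1/(2*pi) ~ 0.159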

Failure monitoring
- Statistics of level crossings, especially two-level crossings (Saeks and Liberty,
1977; Pau, 1981a).
- Spectral analysis and FFT (Chen, 1982).
- Kalman estimation (Pau, 1981a, 1977).
- Recursive least-squares estimators.
- Linear prediction ARMA, ARIMA estimators (Chen, 1982).
- Knowledge based or expert systems for failure monitoring (Pau, 1986).

5. Example: Correspondence analysis and its application

The problem is to diagnose defective machines among 33 machines, each described by 4 measurements, while deriving a sequential diagnostic strategy S and satisfying, in that order, three detection criteria:
(α) maximum vibration level,
(β) minimum flow,
(γ) minimum electricity consumption.

5.1. Method
5.1.1. Introduction and problem analysis
(a) The case is set up as a clustering problem, where each of the 33 machines
considered is described by measurement attributes (vibration level, operating time,
electricity consumption, flow). The raw data are given in Figure 13. Some essential
characteristics of this problem are the following:
(i) the answer requested is to reduce the number of alternatives for the
diagnosis and failure location;
(ii) it is obvious, for technical reasons, that the four attributes are correlated;
(iii) the number of attributes measured on each machine is fairly small, and all
observations are real valued and non-negative.

Machine    Vibration    Operating    Electricity    Flow
no.        level        time         consumption
           PRIC         TIME         CONS           WATR

 1   509   74   1.5   114
 2   425   80   1.5   110
3 446 72 1.6 135
4 564 65 1.6 118
5 547 53 1.8 140
6 450 68 1.6 135
7 473 65 1.6 130
8 484 56 1.7 115
9 456 68 1.6 130
10 488 72 1.6 114
11 530 55 1.7 135
12 477 76 1.5 110
13 589 53 1.6 130
14 534 61 1.4 122
15 536 57 1.7 110
16 494 72 1.5 135
17 425 65 1.8 120
18 555 53 1.7 125
19 543 57 1.6 120
20 515 68 1.5 130
21 452 76 1.5 112
22 547 68 1.5 120
23 421 76 1.4 130
24 498 68 1.6 120
25 467 65 1.7 130
26 595 50 1.8 135
27 414 68 1.7 125
28 431 66 1.7 110
29 452 72 1.5 115
30 408 77 1.6 119
31 478 59 1.8 110
32 395 76 1.5 120
33 543 57 1.5 135

Fig. 13. Raw data of machine diagnosis case (Section 5).

However, the parameters of these relations are unknown and they can only be
inferred from the sample of 33 machines.
(b) These characteristics justify the use of multivariate statistical analysis, and of correspondence analysis in particular, because of its joint use of information about the machines and about the diagnostic measurements. The main steps of correspondence analysis are the following (Pau, 1981a; Chen,
1982):
Step 1. First, infer from the data estimated correlations between machines and
between diagnostic measurements, a reduced set of independent feature measure-
ments, according to which the 33 alternative machines may be ranked. As far as
this step is concerned, and this step only, correspondence analysis is comparable

to factor analysis (Van de Geer, 1971; Hartman, 1960), although the two differ
in the remaining steps.
Step 2. Next, interpret the nature of these statistically independent feature
measurements, by indicating the contribution to each of these by the original
attribute measurements, and determine the diagnosis in terms of these features.
Step 3. Thereafter, rank the feature measurements by decreasing contributions
to the reconstruction of the original 33 x 4 evaluation measurements; the best
feature measurement (e.g. the first) is, in correspondence analysis, the one
maximizing the variance in that direction; in other words, this is the feature
measurement which produces the ranking with the highest possible discrimination
among the 33 machines, thus reducing the doubt of the repairman.
Step 4. Finally, recommend for failure location those machines which get the most favorable ranking (in terms of the interpretation) on the first feature axis, and possibly also on the second axis.
(c) One essential advantage of this approach is that the decision maker will be provided with a two-dimensional chart, which he may easily interpret, and on which he may spot at a glance the final reduced set of candidate machines. Also, apart from the number of feature measurements used in Step 4, no additional assumption is needed, because unsupervised multivariate statistical analysis is used. The effect of linear transformations and rescaling of the initial data is indicated in Section 5.1.2.6.

5.1.2. Theory and use of correspondence analysis (Chen, 1982; Hill, 1974; Pau,
1981a).
5.1.2.1. Notation. Let k(I, J) be the incidence table of non-negative numbers, representing the attribute measurements j ∈ J, j = 1, 2, 3, 4, on the machines i ∈ I, i = 1, ..., 33. The marginals are defined as follows:

$$k(i, \cdot) \triangleq \sum_{j} k(i, j), \qquad k(\cdot, j) \triangleq \sum_{i} k(i, j).$$

It is convenient to operate on the contingency table p(I, J), rather than on the incidence table k(I, J):

$$p(i, j) \triangleq k(i, j) \Big/ \sum_{m, n} k(m, n),$$

with corresponding marginals p(i, ·) and p(·, j). r will be the number of feature measurements extracted; here r ≤ 4.

5.1.2.2. Concepts and principles of interpretation. Generalizing the classical partition of a contingency table by a χ² test (Pearson), correspondence analysis yields natural clusters made of rows i ∈ I and columns j ∈ J which go together to form natural groups in the feature measurement space. Their construction is essentially based upon geometrical proximities between rows i ∈ I and/or columns j ∈ J; these proximities may be identified by visual inspection, if only two feature measurements are considered, by building coordinate axes for all machines i ∈ I and attribute measurements j ∈ J. Such representations, called maps, are precious tools for visual clustering, and thus for diagnosing causality relations between measurements and machines.
By construction, all the effects of statistically independent rows and columns, i.e. such that

$$p(i, j) = p(i, \cdot)\, p(\cdot, j),$$

will be removed. Equivalent machines will thus appear immediately as having very close representations on the maps. The machine space I is provided with a distance measure, called the χ² metric, defined by

$$d^2(i_1, i_2) = \sum_{j} p(\cdot, j) \left[ x(i_1, j) - x(i_2, j) \right]^2, \qquad x(i, j) \triangleq \frac{p(i, j)}{p(i, \cdot)\, p(\cdot, j)} - 1.$$

Moreover, each machine i ∈ I and each measurement j ∈ J are assigned the weights p(i, ·) and p(·, j), respectively, for all variance computations using the χ² metric.
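
A minimal numerical sketch of these definitions (contingency table, marginals, the coordinates x(i, j) and the χ² distance between two machines) might look as follows; the array K holds the first three rows of the Figure 13 incidence table.

    import numpy as np

    K = np.array([[509, 74, 1.5, 114],    # machines 1-3 of Figure 13
                  [425, 80, 1.5, 110],
                  [446, 72, 1.6, 135]], dtype=float)

    P = K / K.sum()                        # contingency table p(I, J)
    p_i = P.sum(axis=1)                    # row marginals p(i, .)
    p_j = P.sum(axis=0)                    # column marginals p(., j)

    X = P / np.outer(p_i, p_j) - 1.0       # x(i, j) = p(i,j)/(p(i,.)p(.,j)) - 1

    def chi2_dist(i1, i2):
        """Chi-squared distance between machines i1 and i2."""
        return np.sum(p_j * (X[i1] - X[i2]) ** 2)

    print(chi2_dist(0, 1))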

5.1.2.3. Theory of correspondence analysis: summary (Pau, 1981a; Chen, 1982; Hill, 1974).
(a) Correspondence analysis, or as it is also called, Fisher's canonical analysis of contingency tables, amounts to looking for vectors

$$F = {}^{t}(F(1), \ldots, F(\mathrm{Card}(J))) \quad \text{and} \quad G = {}^{t}(G(1), \ldots, G(\mathrm{Card}(I))),$$

where Card(·) is the number of elements in the set, such that when the functions f, g of the random variables (Y, X) = (j, i) are defined by the relations

$$f(Y) = F(j), \qquad g(X) = G(i),$$

then the correlation between the random variables f(Y), g(X) is maximum. Correspondence analysis can be applied to non-negative incidence tables k(I, J), as well as to contingency tables p(I, J); the former will be considered in the following.
(b) Let k(I, ·) and k(·, J) be the diagonal matrices of row and column totals, assuming none to be zero. The sequence of operations

$$F^{(1)} = (k(\cdot, J))^{-1}\, {}^{t}k(I, J)\, G^{(1)}, \qquad G^{(2)} = (k(I, \cdot))^{-1}\, k(I, J)\, F^{(1)}, \qquad F^{(2)} = (k(\cdot, J))^{-1}\, {}^{t}k(I, J)\, G^{(2)}, \quad \text{etc.}$$



in which new vectors F^{(m)}, G^{(m)} are successively derived from an initial vector G^{(1)}, is referred to here as the Co(k(I, J)) algorithm corresponding to the tableau k(I, J).
(c) Its eigenvectors, as defined below, are the solutions of the correspondence analysis problem, and the coordinates of the individuals and measurements in the feature space are simply:

$$F(j, n) = F_n^*(j), \qquad G(i, n) = G_n^*(i),$$

where n = 1, ..., Min(Card(I), Card(J)), and F_n^*, G_n^* are the eigenvectors of rank n of the algorithm Co(k(I, J)), when ranked by decreasing eigenvalues λ_n.
(d) Each triple (ρ, F*, G*) is an eigensolution if:

$$\rho\, G^{*} = (k(I, \cdot))^{-1}\, k(I, J)\, F^{*}, \qquad \rho\, F^{*} = (k(\cdot, J))^{-1}\, {}^{t}k(I, J)\, G^{*}.$$

5.1.2.4. Computational formulas.
(1) Define the dimension 1 ≤ r ≤ Min(Card(I), Card(J)) of the feature space after data compression.
(2) (a) G_n^* and λ_n = ρ_n² are respectively the (n+1)-st eigenvector and associated eigenvalue of the symmetric semi-definite matrix S = [s_{jl}]:

$$s_{jl} = \sum_{i \in I} \frac{p(i, j)\, p(i, l)}{p(i, \cdot)\, \sqrt{p(\cdot, j)\, p(\cdot, l)}}, \qquad j, l \in J,$$

which has λ₀ = 1 as largest eigenvalue; all the coordinates of G_0^* are equal.
(b) These eigenvectors G_n^* = [G_n^*(i), i = 1, ..., Card(I)] are ranked by decreasing eigenvalues 1 ≥ λ₁ ≥ ··· ≥ λ_r > 0. They are the factor axes of the cluster N(I).
(3) The factor axes F_n^* of the cluster N(J) are associated to the same eigenvalues λ_n, and

$$F_n^* = \frac{1}{\sqrt{\lambda_n}}\, (p(\cdot, J))^{-1}\, {}^{t}p(I, J)\, G_n^*, \qquad \left[(p(\cdot, J))^{-1}\, {}^{t}p(I, J)\right]_{ji} = p(i, j)/p(\cdot, j).$$

(4) (a) The coordinate G(i, n), n = 1, ..., r, of the individual i ∈ I on the factor axis G_n^* is G_n^*(i).
(b) The coordinate F(j, n), n = 1, ..., r, of the measurement j ∈ J on the factor axis F_n^* is F_n^*(j).
(c) Both individuals i ∈ I and measurements j ∈ J may then be displayed in the same r-dimensional feature space, with basis vectors G_n^*, n = 1, ..., r.
(d) The transition formula

$$G(i, n) = \frac{1}{\sqrt{\lambda_n}}\, \frac{1}{p(i, \cdot)} \sum_{j \in J} p(i, j)\, F(j, n), \qquad i \in I, \; n = 1, \ldots, r.$$

(5) Data reconstruction formula:

$$p(i, j) = p(i, \cdot)\, p(\cdot, j) \Big[ 1 + \sum_{n=1}^{r} \sqrt{\lambda_n}\, F(j, n)\, G(i, n) \Big].$$
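
Putting Sections 5.1.2.3-5.1.2.4 together, a compact sketch of the eigensolution can be obtained via the equivalent singular value decomposition route; normalization conventions vary across texts, and the sketch below returns principal coordinates, for which λ_n = Σ_j p(·, j) F(j, n)² as in Section 5.1.2.5. It is a transcription of the formulas above, not the original program used for the case study.

    import numpy as np

    def correspondence_analysis(K, r):
        """Eigenvalues and principal coordinates of the first r factor
        axes of a non-negative incidence table K (rows = machines,
        columns = measurements)."""
        P = K / K.sum()
        p_i, p_j = P.sum(axis=1), P.sum(axis=0)
        # The eigenvalues lambda_n of the matrix S of step (2a) are the
        # squared singular values of A(i,j) = p(i,j)/sqrt(p(i,.)p(.,j)).
        A = P / np.sqrt(np.outer(p_i, p_j))
        U, s, Vt = np.linalg.svd(A)
        lam = s[1:r + 1] ** 2               # s[0] = 1 is the trivial axis
        G = U[:, 1:r + 1] * s[1:r + 1] / np.sqrt(p_i)[:, None]  # machines
        F = Vt[1:r + 1].T * s[1:r + 1] / np.sqrt(p_j)[:, None]  # measurements
        return lam, G, F

Applied to the 33 x 4 table of Figure 13 with r = 3, this should reproduce, up to sign and normalization conventions, the coordinates reported in Figure 14; with principal coordinates, the reconstruction formula of step (5) reads p(i, j) = p(i, ·) p(·, j) [1 + Σ_n F(j, n) G(i, n)/√λ_n].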

5.1.2.5. Contributions, and interpretations of the factor axes representing the feature measurements. On a map, the squared Euclidean distance D between rows and/or columns has the same value as the χ² distance between the corresponding profiles, and

$$\lambda_n = \sum_{j} p(\cdot, j)\, (F(j, n))^2 = \sum_{i} p(i, \cdot)\, (G(i, n))^2, \qquad \tau_n = \lambda_n \Big/ \sum_{m=1}^{r} \lambda_m, \qquad n = 1, \ldots, r.$$

This justifies the following definitions:
(i) p(i, ·)(G(i, n))² Sign(G(i, n)) is the contribution of the row/machine i to the factor axis n of inertia λ_n;
(ii) p(·, j)(F(j, n))² Sign(F(j, n)) is the contribution of the column/measurement j to the factor axis n of inertia λ_n.
The rule is then to interpret the feature axis n with reference only to those machines and measurements which have the largest (or smallest) contributions to that axis.
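
Given the weights and coordinates, these signed contributions are one line each; p_i, p_j, G, F are as returned by the correspondence_analysis sketch above.

    import numpy as np

    def contributions(p_i, p_j, G, F, n):
        """Signed contributions of machines (rows) and measurements
        (columns) to factor axis n (zero-based column index here),
        as defined in Section 5.1.2.5."""
        c_rows = p_i * G[:, n] ** 2 * np.sign(G[:, n])
        c_cols = p_j * F[:, n] ** 2 * np.sign(F[:, n])
        return c_rows, c_cols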

5.1.2.6. Effect of rescaling the data k(I, J). If the attribute measurement k(i, j) is rescaled by a factor a_j > 0, and if the modified x coordinates are denoted x_a, then

$$x_a(i, j) = \frac{x(i, j) + 1}{1 + (a_j - 1)\, p(i, j)/p(i, \cdot)} - 1.$$

If we assume a_j − 1 small and Card(J) large, we may replace p(i, j) by its expected value and get the approximation

$$x_a(i, j) \approx \left[1 + \frac{a_j - 1}{\mathrm{Card}(J)}\right]^{-1} (x(i, j) + 1) - 1.$$

As a consequence, the modified χ² distance becomes

$$d_a^2(i_1, i_2) = a_j \left[1 + \frac{a_j - 1}{\mathrm{Card}(J)}\right]^{-2} d^2(i_1, i_2).$$

In other words, if one attribute measurement j ∈ J is rescaled, essentially only the point representing this measurement will be moved, whereas all distances in the machine space I will be multiplied by the same factor.
Rescaling consequently does not affect the relative positions of the machines, and the machine diagnosis procedure still applies.
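
This invariance is easy to check numerically: rescale one column of the incidence table and compare pairwise χ² distances before and after; for a_j close to 1 the distance ratios are nearly constant across machine pairs. The check below reuses the first four rows of Figure 13 and a hypothetical rescaling factor.

    import numpy as np

    def chi2_dists(K):
        """All pairwise chi-squared distances between the rows of K."""
        P = K / K.sum()
        p_i, p_j = P.sum(axis=1), P.sum(axis=0)
        X = P / np.outer(p_i, p_j) - 1.0
        m = len(K)
        return np.array([[np.sum(p_j * (X[a] - X[b]) ** 2)
                          for b in range(m)] for a in range(m)])

    K = np.array([[509, 74, 1.5, 114],
                  [425, 80, 1.5, 110],
                  [446, 72, 1.6, 135],
                  [564, 65, 1.6, 118]], dtype=float)
    Ka = K.copy()
    Ka[:, 2] *= 1.05                   # rescale measurement CONS by a_j = 1.05
    print(chi2_dists(Ka)[0, 1:] / chi2_dists(K)[0, 1:])   # nearly equal ratios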

1. Coordinates F                 Axis 1        Axis 2        Axis 3
   of the measurements
   F(PRIC)                      -0.03785      -0.00886       0.00010
   F(CONS)                       0.04187       0.02053      -0.08758
   F(WATR)                       0.05526       0.06180       0.00058
   F(TIME)                       0.17734       0.05025       0.00032

2. Coordinates G                 Axis 1        Axis 2        Axis 3
   of the machines
   G(L26)                       -0.11505       0.02421      -0.00150
   G(L13)                       -0.10726       0.00762       0.00304
   G(L18)                       -0.09264       0.00778      -0.00159
   G(L15)                       -0.08407      -0.03511      -0.00472
   G(L19)                       -0.07633      -0.00993       0.00033
   G(L5)                        -0.06924       0.05048      -0.00186
   G(L4)                        -0.06350      -0.03973       0.00142
   G(L33)                       -0.05833       0.03013       0.00587
   G(L11)                       -0.05656       0.04041      -0.00033
   G(L14)                       -0.05310      -0.01005       0.00656
   G(L8)                        -0.04395       0.00195      -0.00599
   G(L22)                       -0.03896      -0.03516       0.00438
   G(L31)                       -0.03345      -0.01782      -0.01013
   G(L20)                       -0.00388       0.00380       0.00537
   G(L24)                       -0.00200      -0.01753       0.00005
   G(L1)                         0.00459      -0.05232       0.00285
   G(L10)                        0.01442      -0.04054      -0.00104
   G(L7)                         0.01917       0.02893       0.00087
   G(L25)                        0.02446       0.03182      -0.00243
   G(L16)                        0.03331       0.01714       0.00617
   G(L12)                        0.03458      -0.05806       0.00130
   G(L28)                        0.03717      -0.01573      -0.00833
   G(L9)                         0.04593       0.02954       0.00064
   G(L29)                        0.04765      -0.02400       0.00103
   G(L17)                        0.05156       0.02187      -0.00978
   G(L6)                         0.05731       0.04714       0.00146
   G(L21)                        0.06003      -0.04308       0.00087
   G(L3)                         0.07663       0.03911       0.00179
   G(L27)                        0.08129       0.03521      -0.00539
   G(L2)                         0.10110      -0.04922      -0.00005
   G(L23)                        0.11189       0.02595       0.00689
   G(L30)                        0.11780      -0.00506      -0.00246
   G(L32)                        0.12948       0.00688       0.00054

3. Eigenvalues                   0.4629 E-02   0.9931 E-03   0.1817 E-04
   and inertia                   82.07%        17.61%        0.32%
   Cumulative                    82.07%        99.68%        100.00%

4. Eigenvectors                  0.84848       0.31096       0.04857       0.42548
                                -0.47204       0.81047       0.02989       0.34558
                                -0.23851      -0.49587       0.03164       0.83441
                                 0.01960       0.02369      -0.99787       0.05752

Fig. 14. Coordinates of all measurements and machines (Section 5).



5.2. Case results

Following the procedure presented in Section 5.1, the theory of which was summarized in Section 5.1.2, we now interpret the numerical results obtained, displayed in the companion Figures 14, 15 and 16.

5.2.1. Step 1: Computation of the feature measurements. First, r = 3 feature measurements are extracted; they are the eigenvectors G_1^*, G_2^*, G_3^*.

5.2.2. Step 2: Interpretation of the feature measurements.
(a) They are obvious from the reading of the computed contributions of the machines and measurements to G_1^*, G_2^*, and G_3^* (see Figure 14).
(i) G_1^*: The first feature measurement opposes the operating time (contribution = 0.304 E-02) to the vibration level (contribution = -0.103 E-02), while the flow has a weaker but similar contribution to the operating time; this first feature measurement is thus the vibration level per unit of operating time.
(ii) G_2^*: The second feature measurement opposes the flow (contribution = 0.691 E-03) to the operating time (contribution = -0.244 E-03); the second feature measurement is thus the flow required for running the machine.
(iii) G_3^*: The third feature measurement isolates the electricity consumption alone (contribution = 0.181 E-04); this means that it has only a minor impact on the machine diagnosis problem.
(b) The goals are to fulfill, in the given order, the following diagnostic criteria:
(α) maximize the vibration level per unit operating time, thus select machines with large positive contributions and coordinates on G_1^*;
(β) minimize the flow, thus select machines with large positive contributions and coordinates on G_2^*;
(γ) minimize the electricity consumption, thus select machines with large positive contributions and coordinates on G_3^*.

5.2.3. Step 3: Ranking the feature measurements. The numerical results from Figure 14 yield:
λ₁, eigenvalue of G_1^* = 0.4629 E-02, or τ₁ = 82.07%;
λ₂, eigenvalue of G_2^* = 0.993 E-03, or τ₂ = 17.61%;
λ₃, eigenvalue of G_3^* = 0.181 E-04, or τ₃ = 0.32%.

Here, it is obvious that the machine diagnosis should essentially rely on the first feature measurement (vibration level per unit of operating time) and possibly also on the second (flow). Our three-criteria problem has thus been reduced to a two-criteria problem, with G_1^* as the leading diagnostic criterion to be maximized.

5.2.4. Step 4: Machine diagnosis.
(a) Looking at the machines in the first quadrant of Figure 16, one sees that the non-dominated points according to the two criteria (α) and (β) are 32, 23, 27, 3, 30, 2.
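
The notion of non-dominated points used here is the usual Pareto one: a machine is kept unless some other machine scores at least as high on both axes and strictly higher on one. A minimal filter, assuming the array G of machine coordinates from Figure 14, might be:

    def non_dominated(points):
        """Indices of the points not dominated when both coordinates
        (here: the first two factor axes) are to be maximized."""
        keep = []
        for i, (a1, b1) in enumerate(points):
            dominated = any((a2 >= a1 and b2 >= b1) and (a2 > a1 or b2 > b1)
                            for j, (a2, b2) in enumerate(points) if j != i)
            if not dominated:
                keep.append(i)
        return keep

    # e.g. non_dominated(list(zip(G[:, 0], G[:, 1])))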

Fig. 15. Contributions of the machines to the factor axes (Section 5).
[Factor map on the first two factor axes, showing the measurement points WATR, CONS, PRIC, TIME and the machine points L1-L33.]
Fig. 16. Map of all 4 measurements and 33 machines.



(b) Because we want the criterion (α) to dominate, we will have to make an ordering within these non-dominated solutions. Figure 15, which contains the contributions of the machines to G_1^*, Figure 14, which contains their coordinates, and last but not least the map of Figure 16, give us, according to the rule (α), the solution:
Diagnose as defective machine #32; if not: #30; if not: #23; if not: #2; if not: #27; if not: #3; etc.
However, the first machine in this sequence also to have a large positive contribution to G_2^* (flow) according to criterion (β) is Machine 27, and the next Machine 3, or Machine 6. Machines 30 and 2 have negative contributions to G_2^*, and should be eliminated.
(c) By visual clustering, one could select right away the machines satisfying the original criteria of minimizing the vibration level, the operating time, the electricity consumption, or the flow per se, by looking at the factor map of Figure 16 for the machines closest to the points representing these criteria/measurements:
(i) Max vibration level: Machines 14, 19, 31, 24, 8, 20, 13, 18, close to PRIC.
(ii) Min operating time: Machines 2, 21, 1, 10, close to TIME.
(iii) Min electricity consumption: Machines 17, 16, close to CONS.
(iv) Min flow: Machines 6, 3, 27, 25, 11, close to WATR.
Notice the large differences between the previous selections (a), (b) according to criteria (α) and (β), and the latter ones (c).

5.2.5. Conclusion. Because of the significant contributions of G_1^* and G_2^*, and because of the removal of correlated effects, we recommend the following reduced diagnosis of defective machines:
Machines 32, 23, 27, 3 (in that order, the first being the most likely to have failed).

References

The bibliography on statistical and pattern recognition approaches to failure diagnosis is enormous, and scattered across many sections of the technical literature, often within the context of specific applications. Therefore, in addition to a few numbered recent references of a general nature, a number of major public conferences dealing to a substantial extent with technical diagnostics are listed. Neither list is by any means complete, but both are intended to serve as starting points.

Bellman, R. (1966). Dynamic programming, pattern recognition and location of faults in complex systems. J. Appl. Probab. 3, 268-280.
Benzecri, J. P. (1977). L'Analyse des Données, Vols. 1 & 2. Dunod, Paris.
Chen, C. H. (1982). Digital Waveform Processing and Recognition. CRC Press, Boca Raton, FL.
Chien, Y. T. and Fu, K. S. (1967). On the generalized Karhunen-Loève expansion. IEEE Trans. Inform. Theory 13, 518-520.
Clark, R. N. et al. (1975). Detecting instrument malfunctions in control systems. IEEE Trans.
Aerospace Electron. Systems 11 (4).

Collacott, R. A. (1976). Mechanical Fault Diagnosis and Condition Monitoring. Chapman & Hall,
London.
Devijver, P. A. (1979). New error bounds with the nearest neighbor rule. IEEE Trans. Inform. Theory
25, 749-753.
Everitt, B. (1974). Cluster Analysis. Wiley, New York.
Fu, K. S. (1968). Sequential Methods in Pattern Recognition and Machine Learning. Academic Press,
New York.
Fukunaga, K. (1972). Introduction to Statistical Pattern Recognition. Academic Press, New York.
Hartigan, J. A. (1975). Clustering Algorithms. Wiley, New York.
Hartman, H. (1960). Modern Factor Analysis. University of Chicago Press, Chicago, IL.
Hill, M. O. (1974). Correspondence analysis: a neglected multivariate method. Appl. Statist. Ser. C
23 (3), 340-354.
IEEE Spectrum (1981). Special issue on reliability, October 1981.
IMEKO (1980). TC-10: Glossary of terms and definitions recommended for use in technical
diagnostics and condition-based maintenance. IMEKO Secretariat, Budapest.
Narendra, P. M. and Fukunaga, K. (1977). A branch and bound algorithm for feature subset
selection. IEEE Trans. Comput. 26, 917-922.
Parzen, E. (1962). On estimation of a probability density function and mode. Ann. Math. Statist. 33,
1065-1076.
Pau, L. F. (1977). An adaptive signal classification procedure: application to aircraft engine
monitoring. Pattern Recognition 9 (3), 121-130.
Pau, L. F. (1978). Classification du signal par tests séquentiels non-paramétriques. In: Proc. Conf. Reconnaissance des formes et traitement des images. INRIA, Rocquencourt, pp. 159-168.
Pau, L. F. (1981a). Failure Diagnosis and Performance Monitoring. Marcel Dekker, New York.
Pau, L. F. (1981b). Applications of pattern recognition to failure analysis and diagnosis. In: K. S. Fu, ed., Applications of Pattern Recognition. CRC Press, Boca Raton, FL, Chapter 5.
Pau, L. F. (1984). Failure detection processes by an expert system and hybrid pattern recognition.
Pattern Recognition Lett. 2, 419-425.
Pau, L. F. (1986). A survey of expert systems for failure diagnosis, test generation and maintenance.
Expert Systems J. 3 (2), 100-111.
Pau, L. F. (1987a). Knowledge representation approaches in sensor fusion. In: Proc. IFAC World
Congress. Pergamon Press, Oxford.
Pau, L. F. (1987b). System availability in presence of an imperfect test and monitoring system. IEEE
Trans. Aerospace Electron. Systems, 23(5), 625-633.
Pau, L. F. and Kittler, J. (1980). Automatic inspection by lots in the presence of classification errors.
Pattern Recognition 12 (4), 237-241.
Pau, L. F., Toghrai, C. and Chen, C. H. (1983). Multivariate sampling plans in quality control: a
numerical example. IEEE Trans. Reliability 32 (4), 359-365.
Rasmussen, J. and Rouse, W. B. (Editors) (1981). Human Detection and Diagnosis of System Failures.
NATO Conference series, Vol. 15, Series 3. Plenum Press, New York.
Saeks, R. and Liberty, S. (1977). Rational Fault Analysis. Marcel Dekker, New York.
Sebestyen, G. (1962). Decision Making Processes in Pattern Recognition. Macmillan, New York.
Slagle, J. R. and Lee, R. C. T. (1971). Application of game tree searching