
Hazard Function

The hazard function (also known as the failure rate, hazard rate, or force of mortality) h(x) is the ratio of the probability density function P(x) to the survival function S(x), given by

$$h(x) = \frac{P(x)}{S(x)} \tag{1}$$
$$\phantom{h(x)} = \frac{P(x)}{1 - D(x)}, \tag{2}$$

where D(x) is the distribution function (Evans et al. 2000, p. 13).

Failure rate
From Wikipedia, the free encyclopedia


Failure rate is the frequency with which an engineered system or component fails, expressed, for
example, in failures per hour. It is often denoted by the Greek letter λ (lambda) and is important
in reliability engineering.
The failure rate of a system usually depends on time, with the rate varying over the life cycle of the
system. For example, an automobile's failure rate in its fifth year of service may be many times
greater than its failure rate during its first year of service. One does not expect to replace an exhaust
pipe, overhaul the brakes, or have major transmission problems in a new vehicle.
In practice, the mean time between failures (MTBF, 1/λ) is often reported instead of the failure rate. This is valid and useful if the failure rate may be assumed constant, an assumption that is often made for complex units and systems and for electronics, and that is a general agreement in some reliability standards (military and aerospace). In this case the MTBF relates only to the flat region of the bathtub curve, also called the "useful life period". Because of this, it is incorrect to extrapolate MTBF to give an estimate of the service lifetime of a component, which will typically be much less than suggested by the MTBF due to the much higher failure rates in the "end-of-life wearout" part of the bathtub curve.
MTBF numbers are preferred because large positive numbers (such as 2000 hours) are more intuitive and easier to remember than very small numbers (such as 0.0005 per hour).
The MTBF is an important system parameter in systems where failure rate needs to be managed, in particular for safety systems. The MTBF appears frequently in the engineering design requirements, and governs the frequency of required system maintenance and inspections. In special processes called renewal processes, where the time to recover from failure can be neglected and the likelihood of failure remains constant with respect to time, the failure rate is simply the multiplicative inverse of the MTBF (1/λ).
A similar ratio used in the transport industries, especially in railways and trucking, is "mean distance
between failures", a variation which attempts to correlate actual loaded distances to similar reliability
needs and practices.
Failure rates are important factors in the insurance, finance, commerce and regulatory industries and
fundamental to the design of safe systems in a wide variety of applications.

Failure rate in the discrete sense


The failure rate can be defined as the following:
The total number of failures within an item population, divided by the total time expended by
that population, during a particular measurement interval under stated conditions.
(MacDiarmid, et al.)

Although the failure rate, λ(t), is often thought of as the probability that a failure occurs in a specified interval given no failure before time t, it is not actually a probability because it can exceed 1. Expressing the failure rate as a percentage can therefore give a misleading impression of the measure, especially when it is measured from repairable systems or from multiple systems with non-constant failure rates or different operating times. It can be defined with the aid of the reliability function, also called the survival function, R(t) = 1 - F(t), the probability of no failure before time t:

$$\lambda(t) = \frac{f(t)}{R(t)},$$

where f(t) is the time to (first) failure distribution (i.e. the failure density function) and R(t) = 1 - F(t). Over a time interval Δt = t_2 - t_1 from t_1 (or t) to t_2, the failure rate is defined as

$$\lambda(t) = \frac{R(t_1) - R(t_2)}{(t_2 - t_1)\, R(t_1)} = \frac{R(t) - R(t + \Delta t)}{\Delta t \, R(t)}.$$

Note that this is a conditional probability, hence the R(t) in the denominator.

The λ(t) function is the failure density conditional on survival: the condition is that the failure has not occurred by time t.
Hazard rate and ROCOF (rate of occurrence of failures) are often incorrectly seen as
the same and equal to the failure rate.
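
The interval form of the failure rate, (R(t_1) - R(t_2)) / ((t_2 - t_1) R(t_1)), can be sketched in a few lines of Python. The exponential reliability function and the rate of 0.002 per hour are assumptions chosen only for illustration; the sketch shows the interval value approaching the constant rate as the interval shrinks.

```python
import math

def interval_failure_rate(R, t1, t2):
    """Failure rate over [t1, t2], conditional on survival to t1."""
    return (R(t1) - R(t2)) / ((t2 - t1) * R(t1))

# Hypothetical reliability function: exponential, 0.002 failures per hour.
lam = 0.002
R = lambda t: math.exp(-lam * t)

coarse = interval_failure_rate(R, 100.0, 200.0)    # 100-hour interval
fine = interval_failure_rate(R, 100.0, 100.001)    # very short interval

print(f"coarse interval: {coarse:.6f} per hour")
print(f"fine interval:   {fine:.6f} per hour")
```

Over the long interval the value is below 0.002 because the population at risk shrinks during the interval; over the short interval it is essentially the instantaneous rate.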

Failure rate in the continuous sense

[Figure: Exponential failure density functions]

Calculating the failure rate for ever smaller intervals of time results in the hazard function (also called hazard rate), h(t). This becomes the instantaneous failure rate as Δt tends to zero:

$$h(t) = \lim_{\Delta t \to 0} \frac{R(t) - R(t + \Delta t)}{\Delta t \, R(t)}.$$

A continuous failure rate depends on the existence of a failure distribution, F(t), which is a cumulative distribution function that describes the probability of failure (at least) up to and including time t,

$$\Pr(T \le t) = F(t) = 1 - R(t), \quad t \ge 0,$$

where T is the failure time. The failure distribution function is the integral of the failure density function, f(t),

$$F(t) = \int_0^t f(\tau)\, d\tau.$$

The hazard function can be defined now as

$$h(t) = \frac{f(t)}{1 - F(t)} = \frac{f(t)}{R(t)}.$$

Many probability distributions can be used to model the failure distribution (see List of important probability distributions). A common model is the exponential failure distribution,

$$F(t) = \int_0^t \lambda e^{-\lambda \tau}\, d\tau = 1 - e^{-\lambda t},$$

which is based on the exponential density function. The hazard rate function for this is:

$$h(t) = \frac{f(t)}{R(t)} = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}} = \lambda.$$

Thus, for an exponential failure distribution, the hazard rate is a constant with respect to time (that is, the distribution is "memory-less"). For other distributions, such as a Weibull distribution or a log-normal distribution, the hazard function may not be constant with respect to time. For some, such as the deterministic distribution, it is monotonic increasing (analogous to "wearing out"); for others, such as the Pareto distribution, it is monotonic decreasing (analogous to "burning in"); while for many it is not monotonic.
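
As a rough illustration of the hazard function h(t) = f(t)/R(t), the following Python sketch evaluates it for an exponential and a Weibull failure distribution. The parameter values (rate 0.002, shapes 0.5 and 2, scale 1000) are assumptions picked for the example, and the Weibull formulas are the standard ones rather than anything stated in the text above.

```python
import math

def exponential_hazard(t, lam):
    f = lam * math.exp(-lam * t)   # failure density f(t)
    R = math.exp(-lam * t)         # reliability R(t) = 1 - F(t)
    return f / R

def weibull_hazard(t, shape, scale):
    # Weibull: R(t) = exp(-(t/scale)**shape), f(t) = -dR/dt, valid for t > 0
    z = (t / scale) ** shape
    f = (shape / t) * z * math.exp(-z)
    R = math.exp(-z)
    return f / R

for t in (100.0, 500.0):
    print(t,
          exponential_hazard(t, 0.002),        # constant in t
          weibull_hazard(t, 2.0, 1000.0),      # increasing in t
          weibull_hazard(t, 0.5, 1000.0))      # decreasing in t
```

A shape parameter above 1 gives an increasing hazard, below 1 a decreasing one, and exactly 1 recovers the exponential case.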

Decreasing failure rate

A decreasing failure rate (DFR) describes a phenomenon
where the probability of an event in a fixed time interval in the
future decreases over time. A decreasing failure rate can
describe a period of "infant mortality" where earlier failures are
eliminated or corrected[1] and corresponds to the situation
where λ(t) is a decreasing function.
Mixtures of DFR variables are DFR,[2] and also mixtures
of exponentially distributed random variables are DFR.[3]
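
The statement that mixtures of exponentials are DFR can be checked numerically. The sketch below assumes a hypothetical population made of two equal-sized subpopulations with constant failure rates 0.01 and 0.001; the weights and rates are illustrative only.

```python
import math

def mixture_hazard(t, weights, rates):
    """Hazard f(t)/S(t) of a mixture of exponential lifetimes."""
    density = sum(w * r * math.exp(-r * t) for w, r in zip(weights, rates))
    survival = sum(w * math.exp(-r * t) for w, r in zip(weights, rates))
    return density / survival

weights = [0.5, 0.5]      # assumed mixing proportions
rates = [0.01, 0.001]     # assumed constant failure rates of each subpopulation

hazards = [mixture_hazard(t, weights, rates) for t in (0, 100, 500, 2000)]
print(hazards)
```

Each subpopulation has a constant hazard, but the weaker units fail first, so the survivors are increasingly drawn from the more reliable group and the pooled hazard falls towards the smaller rate.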

Renewal processes
For a renewal process with DFR renewal function, inter-renewal times are concave.[4][2] Brown conjectured the
converse, that DFR is also necessary for the inter-renewal
times to be concave,[5] however it has been shown that this
conjecture holds neither in the discrete case[4] nor in the continuous
case.[6]

Applications
Increasing failure rate is an intuitive concept caused by
components wearing out. Decreasing failure rate describes a
system which improves with age.[3] Decreasing failure rates
have been found in the lifetimes of spacecraft, Baker and
Baker commenting that "those spacecraft that last, last on and
on."[7][8] The lifetimes of aircraft air conditioning systems were
individually found to have an exponential distribution, and thus
the pooled population has a DFR.[3]

Coefficient of variation
When the failure rate is decreasing the coefficient of variation is
≥ 1, and when the failure rate is increasing the coefficient of
variation is ≤ 1.[9] Note that this result only holds when the
failure rate is defined for all t ≥ 0[10] and that the converse result
(coefficient of variation determining nature of failure rate) does
not hold.
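
A quick numerical check of this relationship, using the Weibull distribution as an assumed example family (decreasing failure rate for shape below 1, increasing for shape above 1). The closed-form moments via the gamma function are standard results, not taken from the text above.

```python
import math

def weibull_cv(shape):
    """Coefficient of variation (std / mean) of a Weibull lifetime.

    The scale parameter cancels, so only the shape matters.
    """
    m1 = math.gamma(1 + 1 / shape)   # E[T] / scale
    m2 = math.gamma(1 + 2 / shape)   # E[T^2] / scale^2
    return math.sqrt(m2 / m1**2 - 1)

for shape in (0.5, 1.0, 2.0):
    print(shape, weibull_cv(shape))
```

Shape 1 is the exponential case, which sits on the boundary with a coefficient of variation of exactly 1.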

Failure rate data

Failure rate data can be obtained in several ways. The most common means are:

Historical data about the device or system under consideration
    Many organizations maintain internal databases of failure information on the devices or systems that they produce, which can be used to calculate failure rates for those devices or systems. For new devices or systems, the historical data for similar devices or systems can serve as a useful estimate.
Government and commercial failure rate data
    Handbooks of failure rate data for various components are available from government and commercial sources. MIL-HDBK-217F, Reliability Prediction of Electronic Equipment, is a military standard that provides failure rate data for many military electronic components. Several failure rate data sources are available commercially that focus on commercial components, including some non-electronic components.
Testing
    The most accurate source of data is testing samples of the actual devices or systems in order to generate failure data. This is often prohibitively expensive or impractical, so the previous data sources are often used instead.

Units
Failure rates can be expressed using any measure of time, but hours is the most common unit in practice. Other units, such as miles, revolutions, etc., can also be used in place of "time" units.
Failure rates are often expressed in engineering notation as failures per million hours (that is, in units of 10^-6 per hour), especially for individual components, since their failure rates are often very low.
The Failures In Time (FIT) rate of a device is the number of failures that can be expected in one billion (10^9) device-hours of operation (e.g. 1000 devices for 1 million hours each, or 1 million devices for 1000 hours each, or some other combination). This term is used particularly by the semiconductor industry.
The relationship of FIT to MTBF may be expressed as: MTBF = 1,000,000,000 hours × 1/FIT.
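
The FIT and MTBF relationship is a simple unit conversion. A minimal sketch, with hypothetical function names and an assumed example value of 500 FIT:

```python
HOURS_PER_FIT_BASE = 1e9  # FIT counts failures per 10^9 device-hours

def fit_to_mtbf_hours(fit):
    """MTBF in hours for a constant failure rate given in FIT."""
    return HOURS_PER_FIT_BASE / fit

def failures_per_hour_to_fit(rate_per_hour):
    """Convert a failure rate in failures per hour to FIT."""
    return rate_per_hour * HOURS_PER_FIT_BASE

mtbf = fit_to_mtbf_hours(500)             # assumed 500 FIT device
fit = failures_per_hour_to_fit(5e-7)      # the same device, expressed per hour
print(mtbf, fit)
```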

Additivity
Under certain engineering assumptions (e.g. besides the above assumptions for a constant failure rate, the assumption that the considered system has no relevant redundancies), the failure rate for a complex system is simply the sum of the individual failure rates of its components, as long as the units are consistent, e.g. failures per million hours. This permits testing of individual components or subsystems, whose failure rates are then added to obtain the total system failure rate.
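
Under the stated assumptions (constant failure rates, no redundancy, so that any component failure fails the system), additivity can be sketched as follows. The component names and rates are hypothetical, and the reliability check assumes independent exponential lifetimes.

```python
import math

def series_failure_rate(component_rates):
    """Sum of component failure rates; all rates must share the same unit."""
    return sum(component_rates.values())

# Hypothetical components, in failures per million hours.
rates_per_million_hours = {"power supply": 12.0, "controller": 3.5, "fan": 20.0}

system_rate = series_failure_rate(rates_per_million_hours)
system_mtbf_hours = 1e6 / system_rate

# Cross-check: the product of component reliabilities at time t equals
# the reliability implied by the summed rate.
t = 1000.0
product_of_reliabilities = math.prod(
    math.exp(-r * 1e-6 * t) for r in rates_per_million_hours.values()
)
system_reliability = math.exp(-system_rate * 1e-6 * t)
print(system_rate, system_mtbf_hours, system_reliability)
```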

Example
Suppose it is desired to estimate the failure rate of a certain component. A test can be performed to estimate its failure rate. Ten identical components are each tested until they either fail or reach 1000 hours, at which time the test is terminated for that component. (The level of statistical confidence is not considered in this example.) The results are as follows:
[Table of test results not recovered.]
The estimated failure rate is the total number of failures divided by the total accumulated operating time, which here comes to about 0.0007998 failures per hour, or 799.8 failures for every million hours of operation.
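
The estimate is the number of failures divided by the total operating time accumulated by all units on test. In the Python sketch below the per-component hours and outcomes are illustrative assumptions (the original results table is not available here), chosen so that the totals reproduce the quoted figure of 799.8 failures per million hours.

```python
# (hours on test, failed?) for ten components; assumed illustrative values.
# Units that reach 1000 hours without failing are removed from test.
results = [
    (1000, False), (1000, False), (467, True), (1000, False), (630, True),
    (590, True), (1000, False), (285, True), (648, True), (882, True),
]

total_hours = sum(hours for hours, _ in results)
failures = sum(1 for _, failed in results if failed)

rate_per_hour = failures / total_hours
rate_per_million_hours = rate_per_hour * 1e6
print(failures, total_hours, round(rate_per_million_hours, 1))
```

Note that the hours logged by units that did not fail still count towards the denominator; leaving them out would overstate the failure rate.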

Survival analysis
From Wikipedia, the free encyclopedia


Survival analysis is a branch of statistics which deals with the analysis of the time until one or
more events happen, such as death in biological organisms and failure in mechanical systems. This
topic is called reliability theory or reliability analysis in engineering, and duration analysis or duration
modeling in economics or event history analysis in sociology. Survival analysis attempts to answer
questions such as: what is the proportion of a population which will survive past a certain time? Of
those that survive, at what rate will they die or fail? Can multiple causes of death or failure be taken
into account? How do particular circumstances or characteristics increase or decrease the
probability of survival?
To answer such questions, it is necessary to define "lifetime". In the case of biological
survival, death is unambiguous, but for mechanical reliability, failure may not be well-defined, for
there may well be mechanical systems in which failure is partial, a matter of degree, or not otherwise
localized in time. Even in biological problems, some events (for example, heart attack or other organ
failure) may have the same ambiguity. The theory outlined below assumes well-defined events at
specific times; other cases may be better treated by models which explicitly account for ambiguous
events.
More generally, survival analysis involves the modeling of time to event data; in this context, death or
failure is considered an "event" in the survival analysis literature. Traditionally only a single event
occurs for each subject, after which the organism or mechanism is dead or broken. Recurring
event or repeated event models relax that assumption. The study of recurring events is relevant
in systems reliability, and in many areas of social sciences and medical research.

General formulation
Survival function
The object of primary interest is the survival function, conventionally denoted S, which is defined as

$$S(t) = \Pr(T > t),$$

where t is some time, T is a random variable denoting the time of death, and "Pr" stands for probability. That is, the survival function is the probability that the time of death is later than some specified time t. The survival function is also called the survivor function or survivorship function in problems of biological survival, and the reliability function in mechanical survival problems. In the latter case, the reliability function is denoted R(t).
Usually one assumes S(0) = 1, although it could be less than 1 if there is the possibility of immediate death or failure.
The survival function must be non-increasing: S(u) ≤ S(t) if u ≥ t. This property follows directly because T > u implies T > t. This reflects the notion that survival to a later age is only possible if all younger ages are attained. Given this property, the lifetime distribution function and event density (F and f below) are well-defined.
The survival function is usually assumed to approach zero as age increases without bound, i.e., S(t) → 0 as t → ∞, although the limit could be greater than zero if eternal life is possible. For instance, we could apply survival analysis to a mixture of stable and unstable carbon isotopes; unstable isotopes would decay sooner or later, but the stable isotopes would last indefinitely.

Lifetime distribution function and event density

Related quantities are defined in terms of the survival function.
The lifetime distribution function, conventionally denoted F, is defined as the complement of the survival function,

$$F(t) = \Pr(T \le t) = 1 - S(t).$$

If F is differentiable then the derivative, which is the density function of the lifetime distribution, is conventionally denoted f,

$$f(t) = F'(t) = \frac{d}{dt} F(t).$$

The function f is sometimes called the event density; it is the rate of death or failure events per unit time.
The survival function can be expressed in terms of probability distribution and probability density functions

$$S(t) = \Pr(T > t) = \int_t^{\infty} f(u)\, du = 1 - F(t).$$

Similarly, a survival event density function can be defined as

$$s(t) = S'(t) = \frac{d}{dt} S(t) = \frac{d}{dt} \int_t^{\infty} f(u)\, du = \frac{d}{dt}\left[1 - F(t)\right] = -f(t).$$

Hazard function and cumulative hazard function

The hazard function, conventionally denoted λ, is defined as the event rate at time t conditional on survival until time t or later (that is, T ≥ t),

$$\lambda(t) = \lim_{dt \to 0} \frac{\Pr(t \le T < t + dt)}{dt \cdot S(t)} = \frac{f(t)}{S(t)} = -\frac{S'(t)}{S(t)}.$$

Force of mortality is a synonym of hazard function which is used particularly in demography and actuarial science, where it is denoted by μ. The term hazard rate is another synonym.
The hazard function must be non-negative, λ(t) ≥ 0, and its integral over [0, ∞) must be infinite, but is not otherwise constrained; it may be increasing or decreasing, non-monotonic, or discontinuous. An example is the bathtub curve hazard function, which is large for small values of t, decreasing to some minimum, and thereafter increasing again; this can model the property of some mechanical systems to either fail soon after operation, or much later, as the system ages.
The hazard function can alternatively be represented in terms of the cumulative hazard function, conventionally denoted Λ:

$$\Lambda(t) = -\log S(t),$$

so transposing signs and exponentiating

$$S(t) = \exp(-\Lambda(t)),$$

or differentiating (with the chain rule)

$$\frac{d}{dt} \Lambda(t) = -\frac{S'(t)}{S(t)} = \lambda(t).$$

The name "cumulative hazard function" is derived from the fact that

$$\Lambda(t) = \int_0^t \lambda(u)\, du,$$

which is the "accumulation" of the hazard over time.

From the definition of Λ(t), we see that it increases without bound as t tends to infinity (assuming that S(t) tends to zero). This implies that λ(t) must not decrease too quickly, since, by definition, the cumulative hazard has to diverge. For example, exp(-t) is not the hazard function of any survival distribution, because its integral converges to 1.
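
The link between the hazard, the cumulative hazard (the integral of the hazard from 0 to t) and the survival function (the exponential of minus the cumulative hazard) can be checked numerically. The sketch below assumes a hypothetical linear hazard 2t, for which the cumulative hazard is t squared, and uses a plain trapezoidal rule; it also integrates exp(-t) to show why that function cannot be a hazard.

```python
import math

def cumulative_hazard(hazard, t, steps=100_000):
    """Trapezoidal approximation of the integral of hazard(u) over [0, t]."""
    h = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    for i in range(1, steps):
        total += hazard(i * h)
    return total * h

linear_hazard = lambda u: 2.0 * u            # assumed example hazard
Lam = cumulative_hazard(linear_hazard, 1.5)  # analytically 1.5**2 = 2.25
survival = math.exp(-Lam)                    # S(t) = exp(-cumulative hazard)

# exp(-t) integrates to at most 1, so the implied survival function would
# level off near exp(-1) instead of tending to zero.
bounded = cumulative_hazard(lambda u: math.exp(-u), 50.0)
print(Lam, survival, bounded, math.exp(-bounded))
```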

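Quantities derived from the survival distribution
Future lifetime at a given time $t_0$ is the time remaining until death, given survival to age $t_0$. Thus, it is $T - t_0$ in the present notation. The expected future lifetime is the expected value of future lifetime. The probability of death at or before age $t_0 + t$, given survival until age $t_0$, is just

$$\Pr(T \le t_0 + t \mid T > t_0) = \frac{\Pr(t_0 < T \le t_0 + t)}{\Pr(T > t_0)} = \frac{F(t_0 + t) - F(t_0)}{S(t_0)}.$$

Therefore the probability density of future lifetime is

$$\frac{d}{dt} \frac{F(t_0 + t) - F(t_0)}{S(t_0)} = \frac{f(t_0 + t)}{S(t_0)}$$

and the expected future lifetime is

$$\frac{1}{S(t_0)} \int_0^{\infty} t\, f(t_0 + t)\, dt = \frac{1}{S(t_0)} \int_{t_0}^{\infty} S(t)\, dt,$$

where the second expression is obtained using integration by parts.
For $t_0 = 0$, that is, at birth, this reduces to the expected lifetime.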
In reliability problems, the expected lifetime is
called the mean time to failure, and the
expected future lifetime is called the mean
residual lifetime.
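
The mean residual lifetime at age t0 is the integral of S(t) from t0 to infinity, divided by S(t0). A minimal numerical sketch, assuming an exponential survival function with rate 0.5 and a second, hypothetical survival function exp(-t^2) with an increasing hazard; the infinite upper limit is truncated at a finite horizon, which is an approximation.

```python
import math

def mean_residual_lifetime(S, t0, horizon=200.0, steps=100_000):
    """(1 / S(t0)) * integral of S(t) over [t0, t0 + horizon], trapezoidal rule."""
    h = horizon / steps
    total = 0.5 * (S(t0) + S(t0 + horizon))
    for i in range(1, steps):
        total += S(t0 + i * h)
    return total * h / S(t0)

rate = 0.5
S_exp = lambda t: math.exp(-rate * t)   # constant hazard (memoryless)
S_wear = lambda t: math.exp(-t * t)     # increasing hazard 2t (wearing out)

mttf = mean_residual_lifetime(S_exp, 0.0)       # age 0: mean time to failure
mrl_exp_3 = mean_residual_lifetime(S_exp, 3.0)  # same as at age 0
mrl_wear_0 = mean_residual_lifetime(S_wear, 0.0)
mrl_wear_1 = mean_residual_lifetime(S_wear, 1.0)
print(mttf, mrl_exp_3, mrl_wear_0, mrl_wear_1)
```

For the exponential case the remaining life does not depend on age, while for the wearing-out case it shrinks as the unit ages.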
As the probability of an individual surviving
until age t or later is S(t), by definition, the
expected number of survivors at age t out of
an initial population of n newborns is n S(t),
assuming the same survival function for all
individuals. Thus the expected proportion of
survivors is S(t). If the survival of different
individuals is independent, the number of
survivors at age t has a binomial
distribution with parameters n and S(t), and
the variance of the proportion of survivors
is S(t) (1-S(t))/n.
The age at which a specified proportion of
survivors remain can be found by solving the
equation S(t) = q for t, where q is
the quantile in question. Typically one is
interested in the median lifetime, for
which q = 1/2, or other quantiles such as q =
0.90 or q = 0.99.
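
Solving S(t) = q rarely has a closed form, but because S is non-increasing a bisection search is enough. The sketch below assumes a continuous survival function, here an exponential one with an illustrative rate of 0.1, and also evaluates the variance of the proportion of survivors, S(t)(1 - S(t))/n.

```python
import math

def quantile_age(S, q, lo=0.0, hi=1e6, tol=1e-9):
    """Solve S(t) = q by bisection.

    Assumes S is continuous and non-increasing with S(lo) >= q >= S(hi).
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if S(mid) > q:
            lo = mid      # still too many survivors: move later
        else:
            hi = mid
    return (lo + hi) / 2

def survivor_proportion_variance(s, n):
    """Variance of the surviving proportion for n independent individuals."""
    return s * (1 - s) / n

rate = 0.1                              # assumed example rate
S = lambda t: math.exp(-rate * t)

median = quantile_age(S, 0.5)           # analytically log(2) / rate
var_at_median = survivor_proportion_variance(S(median), 100)
print(median, var_at_median)
```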
One can also make more complex inferences
from the survival distribution. In mechanical
reliability problems, one can bring cost (or,
more generally, utility) into consideration, and
thus solve problems concerning repair or
replacement. This leads to the study
of renewal theory and reliability theory of aging
and longevity.

Censoring
Censoring is a form of missing data problem which is common in survival analysis. Ideally, both the birth and death dates of a subject are known, in which case the lifetime is known.
If it is known only that the date of death is after some date, this is called right censoring. Right censoring will occur for those subjects whose birth date is known but who are still alive when they are lost to follow-up or when the study ends.
If a subject's lifetime is known to be less than a certain duration, the lifetime is said to be left-censored.
It may also happen that subjects with a lifetime
less than some threshold may not be observed
at all: this is called truncation. Note that
truncation is different from left censoring, since
for a left censored datum, we know the subject
exists, but for a truncated datum, we may be
completely unaware of the subject. Truncation
is also common. In a so-called delayed
entry study, subjects are not observed at all
until they have reached a certain age. For
example, people may not be observed until
they have reached the age to enter school.
Any deceased subjects in the pre-school age
group would be unknown. Left-truncated data
are common in actuarial work for life insurance
and pensions.[1]
We generally encounter right-censored data.
Left-censored data can occur when a person's
survival time becomes incomplete on the left
side of the follow-up period for the person. As
an example, we may follow up a patient for
any infectious disorder from the time of his or
her being tested positive for the infection. We
may never know the exact time of exposure to
the infectious agent.[2]

Fitting parameters to data
Survival models can be usefully viewed as ordinary regression models in which the response variable is time. However, computing the likelihood function (needed for fitting parameters or making other kinds of inferences) is complicated by the censoring. The likelihood function for a survival model, in the presence of censored data, is formulated as follows. By definition the likelihood function is the conditional probability of the data given the parameters of the model. It is customary to assume that the data are independent given the parameters. Then the likelihood function is the product of the likelihood of each datum. It is convenient to partition the data into four categories: uncensored, left censored, right censored, and interval censored. These are denoted "unc.", "l.c.", "r.c.", and "i.c." in the equation below.

$$L(\theta) = \prod_{T_i \in \text{unc.}} \Pr(T = T_i \mid \theta) \prod_{i \in \text{l.c.}} \Pr(T < T_i \mid \theta) \prod_{i \in \text{r.c.}} \Pr(T > T_i \mid \theta) \prod_{i \in \text{i.c.}} \Pr(T_{i,l} < T < T_{i,r} \mid \theta).$$

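For uncensored data, with $T_i$ equal to the age at death, we have

$$\Pr(T = T_i \mid \theta) = f(T_i \mid \theta).$$

For left-censored data, such that the age at death is known to be less than $T_i$, we have

$$\Pr(T < T_i \mid \theta) = F(T_i \mid \theta) = 1 - S(T_i \mid \theta).$$

For right-censored data, such that the age at death is known to be greater than $T_i$, we have

$$\Pr(T > T_i \mid \theta) = 1 - F(T_i \mid \theta) = S(T_i \mid \theta).$$

For an interval censored datum, such that the age at death is known to be less than $T_{i,r}$ and greater than $T_{i,l}$, we have

$$\Pr(T_{i,l} < T < T_{i,r} \mid \theta) = S(T_{i,l} \mid \theta) - S(T_{i,r} \mid \theta).$$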
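
To make the four contributions concrete, here is a sketch of the log-likelihood for one particular parametric choice, an exponential lifetime with rate θ, so that f(t) = θ exp(-θt) and S(t) = exp(-θt). The model choice, function name and data values are assumptions for illustration; each datum contributes the log of a density, a distribution function, a survival function, or a difference of survival functions according to its censoring type.

```python
import math

def exp_log_likelihood(rate, unc=(), lc=(), rc=(), ic=()):
    """Log-likelihood of an exponential model with censored data.

    unc: exact ages at death; lc: upper bounds (left-censored);
    rc: lower bounds (right-censored); ic: (lower, upper) pairs.
    """
    S = lambda t: math.exp(-rate * t)
    ll = sum(math.log(rate) - rate * t for t in unc)       # log f(T_i)
    ll += sum(math.log(1 - S(t)) for t in lc)              # log F(T_i)
    ll += sum(-rate * t for t in rc)                       # log S(T_i)
    ll += sum(math.log(S(lo) - S(hi)) for lo, hi in ic)    # log [S(l) - S(r)]
    return ll

# Hypothetical data: three observed deaths and two right-censored subjects.
deaths = [2.0, 3.0, 5.0]
still_alive = [4.0, 6.0]

# With only these two data types the maximum has a closed form:
# number of deaths divided by total time at risk.
closed_form_mle = len(deaths) / (sum(deaths) + sum(still_alive))
print(closed_form_mle, exp_log_likelihood(closed_form_mle, unc=deaths, rc=still_alive))
```

When left-censored or interval-censored data are present there is generally no closed form, and the log-likelihood would be maximized numerically.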

An important application where interval-censored data arise is current status data, where the actual occurrence of an event $T_i$ is only known to the extent that it is known not to have occurred before one observation time and to have occurred before the next.

Non-parametric estimation
The Nelson-Aalen estimator can be used to provide a non-parametric estimate of the cumulative hazard rate function.
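
In its usual form the estimator adds, at each observed event time, the number of events divided by the number of subjects still at risk. The following is a minimal pure-Python sketch of that idea for right-censored data; the function name and the small data set are hypothetical, and subjects censored at an event time are treated as still at risk at that time.

```python
def nelson_aalen(times, observed):
    """Return [(t, H(t))] at each distinct event time.

    times: durations; observed: True if the event occurred at that time,
    False if the subject was right-censored then.
    """
    data = sorted(zip(times, observed))
    n_at_risk = len(data)
    cumulative = 0.0
    estimate = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Group all subjects that share this time.
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if events:
            cumulative += events / n_at_risk
            estimate.append((t, cumulative))
        n_at_risk -= leaving
    return estimate

# Hypothetical sample: events at 1, 2 and 3; censoring at 2 and 4.
times = [1, 2, 2, 3, 4]
observed = [True, True, False, True, False]
curve = nelson_aalen(times, observed)
print(curve)
```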
