# Reliability

In statistics, reliability is the consistency of a set of measurements or of a measuring instrument, often used to describe a test. This can be either whether repeated measurements with the same instrument give, or are likely to give, the same result (test-retest), or, in the case of more subjective instruments such as personality or trait inventories, whether two independent assessors give similar scores (inter-rater reliability). Reliability is inversely related to random error.

Reliability does not imply validity. That is, a reliable measure is measuring something consistently, but not necessarily what it is supposed to be measuring. For example, while there are many reliable tests of specific abilities, not all of them would be valid for predicting, say, job performance. In terms of accuracy and precision, reliability is precision, while validity is accuracy.

In experimental sciences, reliability is the extent to which the measurements of a test remain consistent over repeated tests of the same subject under identical conditions. An experiment is reliable if it yields consistent results of the same measure. It is unreliable if repeated measurements give different results. Reliability can also be interpreted as the lack of random error in measurement.

In engineering, reliability is the ability of a system or component to perform its required functions under stated conditions for a specified period of time. It is often reported in terms of a probability. Evaluations of reliability involve the use of many statistical tools. See Reliability engineering for further discussion.

## Estimation

Reliability may be estimated through a variety of methods that fall into two types: single-administration and multiple-administration. Multiple-administration methods require that two assessments are administered. In the test-retest method, reliability is estimated as the Pearson product-moment correlation coefficient between two administrations of the same measure. In the alternate forms method, reliability is estimated by the Pearson product-moment correlation coefficient of two different forms of a measure, usually administered together.

Single-administration methods include split-half and internal consistency. The split-half method treats the two halves of a measure as alternate forms. This "halves reliability" estimate is then stepped up to the full test length using the Spearman-Brown prediction formula. The most common internal consistency measure is Cronbach's alpha, which is usually interpreted as the mean of all possible split-half coefficients.[2] Cronbach's alpha is a generalization of an earlier form of estimating internal consistency, Kuder-Richardson Formula 20.[2]

Each of these estimation methods is sensitive to different sources of error, so the estimates should not be expected to be equal. Also, reliability is a property of the scores of a measure rather than of the measure itself, and is thus said to be sample dependent.
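The single-administration estimates described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the item scores are invented, the odd/even split is only one of many possible ways to form halves, and the function names are hypothetical.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_brown(r_part, factor=2.0):
    """Step a part-test reliability up to a test `factor` times as long."""
    return factor * r_part / (1.0 + (factor - 1.0) * r_part)


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of scores per item."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(pvariance(scores) for scores in items)
    return k / (k - 1) * (1.0 - item_variance_sum / pvariance(totals))


# Hypothetical data: 4 items (rows) scored for 5 respondents (columns).
items = [
    [3, 4, 2, 5, 1],
    [2, 4, 3, 5, 1],
    [3, 5, 2, 4, 2],
    [4, 4, 1, 5, 1],
]

# Split-half: correlate the two half-test totals, then step up to full length.
odd_half = [a + b for a, b in zip(items[0], items[2])]
even_half = [a + b for a, b in zip(items[1], items[3])]
r_half = pearson(odd_half, even_half)
split_half_reliability = spearman_brown(r_half)

alpha = cronbach_alpha(items)
print(f"split-half (stepped up): {split_half_reliability:.3f}, alpha: {alpha:.3f}")
```

The same `pearson` function applied to the total scores from two administrations, or from two alternate forms, would give the test-retest or alternate-forms estimate.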

Reliability estimates from one sample might differ from those of a second sample (beyond what might be expected due to sampling variations) if the second sample is drawn from a different population, because the true reliability is different in this second population. (This is true of measures of all types: yardsticks might measure houses well yet have poor reliability when used to measure the lengths of insects.)

Reliability may be improved by clarity of expression (for written assessments), lengthening the measure,[2] and other informal means. However, formal psychometric analysis, called item analysis, is considered the most effective way to increase reliability. This analysis consists of computation of item difficulties and item discrimination indices, the latter index involving computation of correlations between the items and the sum of the item scores of the entire test. If items that are too difficult, too easy, and/or have near-zero or negative discrimination are replaced with better items, the reliability of the measure will increase.

In the engineering sense, reliability as a function of time t is written:

- R(t) = 1 − F(t) (where F(t) is the probability of failure by time t)
- R(t) = exp(−λt) (where λ is the failure rate, taken to be constant)

## Classical test theory

In classical test theory, reliability is defined mathematically as the ratio of the variation of the true score and the variation of the observed score or, equivalently, one minus the ratio of the variation of the error score and the variation of the observed score:

$$\rho_{xx'} = \frac{\sigma^2_T}{\sigma^2_X} = 1 - \frac{\sigma^2_E}{\sigma^2_X}$$

where $\rho_{xx'}$ is the symbol for the reliability of the observed score, X, and $\sigma^2_X$, $\sigma^2_T$, and $\sigma^2_E$ are the variances of the measured, true and error scores respectively. Unfortunately, there is no way to directly observe or calculate the true score, so a variety of methods are used to estimate the reliability of a test.

Some examples of the methods to estimate reliability include test-retest reliability, internal consistency reliability, and parallel-test reliability. Each method comes at the problem of figuring out the source of error in the test somewhat differently.

## Item response theory

It was well known to classical test theorists that measurement precision is not uniform across the scale of measurement. Tests tend to distinguish better for test-takers with moderate trait levels and worse among high- and low-scoring test-takers. Item response theory extends the concept of reliability from a single index to a function called the information function. The IRT information function is the inverse of the conditional observed score standard error at any given test score.
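The item analysis mentioned earlier (item difficulties and item discrimination indices) can be sketched as follows. This is only an illustration under stated assumptions: items are scored 0/1, difficulty is taken as the proportion of correct answers, and discrimination is taken as the correlation between an item and the total score. The response matrix and function names are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either list has no variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def item_analysis(responses):
    """Return (difficulty, discrimination) per item.

    `responses` holds one list of 0/1 item scores per test-taker.
    """
    totals = [sum(person) for person in responses]
    results = []
    for j in range(len(responses[0])):
        item = [person[j] for person in responses]
        difficulty = sum(item) / len(item)      # proportion answering correctly
        discrimination = pearson(item, totals)  # item-total correlation
        results.append((difficulty, discrimination))
    return results


# Hypothetical responses: 6 test-takers (rows) by 4 items (columns).
responses = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
]

analysis = item_analysis(responses)
for index, (difficulty, discrimination) in enumerate(analysis):
    flag = "review" if discrimination <= 0 else "keep"
    print(f"item {index}: difficulty={difficulty:.2f}, "
          f"discrimination={discrimination:.2f} -> {flag}")
```

In this made-up data the last item correlates negatively with the total score, which is the kind of item the text suggests replacing.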

Higher levels of IRT information indicate higher precision and thus greater reliability.

# Reliability engineering

Reliability engineering is an engineering field that deals with the study of reliability: the ability of a system or component to perform its required functions under stated conditions for a specified period of time. It is often reported in terms of a probability.

## Overview

[Figure: A Reliability Block Diagram]

Reliability may be defined in several ways:

- The idea that something is fit for purpose with respect to time.
- The capacity of a device or system to perform as designed.
- The resistance to failure of a device or system.
- The ability of a device or system to perform a required function under stated conditions for a specified period of time.
- The probability that a functional unit will perform its required function for a specified interval under stated conditions.
- The ability of something to "fail well" (fail without catastrophic consequences).

Reliability engineers rely heavily on statistics, probability theory, and reliability theory. Many engineering techniques are used in reliability engineering, such as reliability prediction, Weibull analysis, thermal management, reliability testing and accelerated life testing. Because of the large number of reliability techniques, their expense, and the varying degrees of reliability required for different situations, most projects develop a reliability program plan to specify the reliability tasks that will be performed for that specific system.

The function of reliability engineering is to develop the reliability requirements for the product, establish an adequate reliability program, and perform appropriate analyses and tasks to ensure the product will meet its requirements.

These tasks are managed by a reliability engineer, who usually holds an accredited engineering degree and has additional reliability-specific education and training. Reliability engineering is closely associated with maintainability engineering and logistics engineering. Many problems from other fields, such as security engineering, can also be approached using reliability engineering techniques. This article provides an overview of some of the most common reliability engineering tasks. Please see the references for a more comprehensive treatment.

Many types of engineering employ reliability engineers and use the tools and methodology of reliability engineering. For example:

- System engineers design complex systems having a specified reliability.
- Mechanical engineers may have to design a machine or system with a specified reliability.
- Automotive engineers have reliability requirements for the automobiles (and components) which they design.
- Electronics engineers must design and test their products for reliability requirements.
- In software engineering and systems engineering, reliability engineering is the sub-discipline of ensuring that a system (or a device in general) will perform its intended function(s) when operated in a specified manner for a specified length of time.

Reliability engineering is performed throughout the entire life cycle of a system, including development, test, production and operation.

## Reliability theory

Reliability theory is the foundation of reliability engineering. For engineering purposes, reliability is defined as the probability that a device will perform its intended function during a specified period of time under stated conditions. Mathematically, this may be expressed as

$$R(t) = \int_t^{\infty} f(x)\,dx$$

where $f(x)$ is the failure probability density function and $t$ is the length of the period (which is assumed to start from time zero).
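As a numerical illustration of reliability as the integral of the failure density from t to infinity, the sketch below integrates an exponential density and compares the result with the closed form exp(−λt). The exponential density, the failure rate value, the finite upper integration limit, and the function names are all assumptions made for the example.

```python
import math


def exponential_pdf(x, failure_rate):
    """Failure density f(x) for a constant failure rate."""
    return failure_rate * math.exp(-failure_rate * x)


def reliability_numeric(t, pdf, upper, steps=100_000):
    """Approximate R(t) as the integral of pdf from t to `upper` (trapezoidal rule)."""
    h = (upper - t) / steps
    total = 0.5 * (pdf(t) + pdf(upper))
    for i in range(1, steps):
        total += pdf(t + i * h)
    return total * h


failure_rate = 0.001  # hypothetical: failures per hour
t = 500.0             # hypothetical period length in hours

# The upper limit stands in for infinity; the density is negligible beyond it.
r_numeric = reliability_numeric(
    t, lambda x: exponential_pdf(x, failure_rate), upper=t + 50.0 / failure_rate
)
r_closed_form = math.exp(-failure_rate * t)

print(f"numeric R(t) = {r_numeric:.6f}, closed form = {r_closed_form:.6f}")
```

Any other failure density could be passed as `pdf`; only the exponential case has the simple closed form used for the comparison.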

Reliability engineering is concerned with four key elements of this definition:

- First, reliability is a probability. This means that failure is regarded as a random phenomenon: it is a recurring event, and we do not express any information on individual failures, the causes of failures, or relationships between failures, except that the likelihood for failures to occur varies over time according to the given probability function. Reliability engineering is concerned with meeting the specified probability of success, at a specified statistical confidence level.
- Second, reliability is predicated on "intended function". Generally, this is taken to mean operation without failure. However, if no individual part of the system fails but the system as a whole does not do what was intended, then it is still charged against the system reliability. The system requirements specification is the criterion against which reliability is measured.
- Third, reliability applies to a specified period of time. In practical terms, this means that a system has a specified chance that it will operate without failure before time t. Reliability engineering ensures that components and materials will meet the requirements during the specified time. Units other than time may sometimes be used. The automotive industry might specify reliability in terms of miles; the military might specify reliability of a gun for a certain number of rounds fired. A piece of mechanical equipment may have a reliability rating value in terms of cycles of use.
- Fourth, reliability is restricted to operation under stated conditions. This constraint is necessary because it is impossible to design a system for unlimited conditions. A Mars Rover will have different specified conditions than the family car. The operating environment must be addressed during design and testing.

## Reliability program plan

Many tasks, methods, and tools can be used to achieve reliability. Every system requires a different level of reliability. A commercial airliner must operate under a wide range of conditions. The consequences of failure are grave, but there is a correspondingly higher budget. A pencil sharpener may be more reliable than an airliner, but has a much different set of operational conditions, insignificant consequences of failure, and a much lower budget.

A reliability program plan is used to document exactly what tasks, methods, tools, analyses, and tests are required for a particular system. For complex systems, the reliability program plan is a separate document. For simple systems, it may be combined with the systems engineering management plan. The reliability program plan is essential for a successful reliability program and is developed early during system development. It specifies not only what the reliability engineer does, but also the tasks performed by others. The reliability program plan is approved by top program management.

## Reliability requirements

For any system, one of the first tasks of reliability engineering is to adequately specify the reliability requirements. Reliability requirements address the system itself, test and assessment requirements, and associated tasks and documentation. Reliability requirements are included in the appropriate system/subsystem requirements specifications, test plans, and contract statements.

## System reliability parameters

Requirements are specified using reliability parameters. The most common reliability parameter is the mean-time-between-failure (MTBF), which can also be specified as the failure rate or the number of failures during a given period. These parameters are very useful for systems that are operated on a regular basis, such as most vehicles, machinery, and electronic equipment. Reliability increases as the MTBF increases. The MTBF is usually specified in hours, but can also be used with other units of measurement such as miles or cycles.

In other cases, reliability is specified as the probability of mission success. For example, reliability of a scheduled aircraft flight can be specified as a dimensionless probability or a percentage.

A special case of mission success is the single-shot device or system. These are devices or systems that remain relatively dormant and only operate once. Examples include automobile airbags, thermal batteries and missiles. Single-shot reliability is specified as a probability of success, or is subsumed into a related parameter. Single-shot missile reliability may be incorporated into a requirement for the probability of hit. For such systems, the probability of failure on demand (PFD) is the reliability measure. For non-repairable systems, the PFD is derived from the failure rate and mission time. For repairable systems, it is obtained from the failure rate, the MTTR, and the test interval. This measure may not be unique for a given system, as it depends on the kind of demand.

In addition to system level requirements, reliability requirements may be specified for critical subsystems. In all cases, reliability parameters are specified with appropriate statistical confidence intervals.

## Reliability modelling
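The relationship between MTBF, failure rate, mission reliability, and the probability of failure on demand for a non-repairable system can be sketched as below. The sketch assumes a constant failure rate (exponential model), which the text does not require; the numbers and function names are hypothetical. The repairable case, which also involves MTTR and test interval, is not modelled here.

```python
import math


def failure_rate_from_mtbf(mtbf_hours):
    """Constant failure rate (per hour) implied by an MTBF."""
    return 1.0 / mtbf_hours


def mission_reliability(mtbf_hours, mission_hours):
    """Probability of completing a mission without failure: exp(-t / MTBF)."""
    return math.exp(-mission_hours / mtbf_hours)


def pfd_non_repairable(failure_rate, mission_hours):
    """Probability that a non-repairable item has failed by the time of demand."""
    return 1.0 - math.exp(-failure_rate * mission_hours)


mtbf = 1000.0    # hypothetical MTBF in hours
mission = 100.0  # hypothetical mission time in hours

rate = failure_rate_from_mtbf(mtbf)
r_mission = mission_reliability(mtbf, mission)
pfd = pfd_non_repairable(rate, mission)

print(f"failure rate = {rate:.4f}/h, mission reliability = {r_mission:.4f}, PFD = {pfd:.4f}")
```

Under this model a mission as long as the MTBF succeeds with probability exp(−1), roughly 37%, which is why an MTBF should not be read as a guaranteed failure-free period.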

Reliability modelling is the process of predicting or understanding the reliability of a component or system. Two separate fields of investigation are common:

- The physics of failure approach uses an understanding of the failure mechanisms involved, such as crack propagation or chemical corrosion.
- The parts stress modelling approach is an empirical method for prediction based on counting the number and type of components of the system, and the stress they undergo during operation.

For systems with a clearly defined failure time (which is sometimes not given for systems with a drifting parameter), the empirical distribution function of these failure times can be determined. This is generally done in an accelerated experiment with increased stress. These experiments can be divided into two main categories.

Early failure rate studies determine the distribution with a decreasing failure rate over the first part of the bathtub curve. Here, in general, only moderate stress is necessary. The stress is applied for a limited period of time in what is called a censored test. Therefore, only the part of the distribution with early failures can be determined.

In so-called zero-defect experiments, only limited information about the failure distribution is acquired. Here the stress, the stress time, or the sample size is so low that not a single failure occurs. Due to the insufficient sample size, only an upper limit of the early failure rate can be determined. At any rate, it looks good for the customer if there are no failures.

In a study of the intrinsic failure distribution, which is often a material property, higher stresses are necessary to get failure in a reasonable period of time. Several degrees of stress have to be applied to determine an acceleration model. The empirical failure distribution is often parametrised with a Weibull or a log-normal model.

It is common practice to model the early failure rate with an exponential distribution. This less complex model for the failure distribution has only one parameter: the constant failure rate. In such cases, the chi-square distribution can be used to find the goodness of fit for the estimated failure rate. Compared to a model with a decreasing failure rate, this is quite pessimistic. Combined with a zero-defect experiment, this becomes even more pessimistic. The effort is greatly reduced in this case: one does not have to determine a second model parameter (e.g. the shape parameter of a Weibull distribution) or its confidence interval (e.g. by a maximum likelihood (MLE) approach), and the sample size is much smaller.

## Reliability test requirements

Because reliability is a probability, even highly reliable systems have some chance of failure. However, testing reliability requirements is problematic for several reasons. A single test is insufficient to generate enough statistical data. Multiple tests or long-duration tests are usually very expensive. Some tests are simply impractical. Reliability engineering is used to design a realistic and affordable test program that provides enough evidence that the system meets its requirement. Statistical confidence levels are used to address some of these concerns.
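For a zero-failure test under the constant failure rate (exponential) model, the upper limit on the failure rate has a simple closed form, which can be sketched as follows. The derivation assumes that the probability of seeing no failures in T accumulated test hours is exp(−λT), so the one-sided upper bound at confidence C solves exp(−λT) = 1 − C; this coincides with the chi-square bound with two degrees of freedom. The test sizes and function names are hypothetical.

```python
import math


def zero_failure_rate_upper_bound(total_test_hours, confidence):
    """One-sided upper confidence bound on a constant failure rate, zero failures seen."""
    return -math.log(1.0 - confidence) / total_test_hours


def zero_failure_test_hours(target_failure_rate, confidence):
    """Accumulated failure-free test hours needed to demonstrate a target failure rate."""
    return -math.log(1.0 - confidence) / target_failure_rate


# Hypothetical test: 50 units run for 1000 hours each with no failures.
device_hours = 50 * 1000.0
bound_90 = zero_failure_rate_upper_bound(device_hours, 0.90)
bound_60 = zero_failure_rate_upper_bound(device_hours, 0.60)

print(f"90% upper bound: {bound_90:.3e} failures/hour")
print(f"60% upper bound: {bound_60:.3e} failures/hour")
print(f"hours to demonstrate 1e-5/h at 90%: {zero_failure_test_hours(1e-5, 0.90):.0f}")
```

The bound shrinks only in proportion to accumulated test time, which is one way to see why demonstrating very low failure rates by test alone becomes expensive.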

A certain parameter is expressed along with a corresponding confidence level: for example, an MTBF of 1000 hours at 90% confidence level. From this specification, the reliability engineer can design a test with explicit criteria for the number of hours and number of failures until the requirement is met or failed.

The combination of reliability parameter value and confidence level greatly affects the development cost and the risk to both the customer and producer. Care is needed to select the best combination of requirements. Reliability testing may be performed at various levels, such as component, subsystem, and system. Also, many factors must be addressed during testing, such as extreme temperature and humidity, shock, vibration, and heat. Reliability engineering determines an effective test strategy so that all parts are exercised in relevant environments. For systems that must last many years, reliability engineering may be used to design an accelerated life test.

## Requirements for reliability tasks

Reliability engineering must also address requirements for various reliability tasks and documentation during system development, test, production, and operation. These requirements are generally specified in the contract statement of work and depend on how much leeway the customer wishes to provide to the contractor. Reliability tasks include various analyses, planning, and failure reporting. Task selection depends on the criticality of the system as well as cost. A critical system may require a formal failure reporting and review process throughout development, whereas a non-critical system may rely on final test reports. The most common reliability program tasks are documented in reliability program standards, such as MIL-STD-785 and IEEE 1332.

## Design for reliability

Design for Reliability (DFR) is an emerging discipline that refers to the process of designing reliability into products. This process encompasses several tools and practices and describes the order of their deployment that an organization needs to have in place in order to drive reliability into its products. Typically, the first step in the DFR process is to set the system's reliability requirements. Reliability must be "designed in" to the system. During system design, the top-level reliability requirements are then allocated to subsystems by design engineers and reliability engineers working together.

Reliability design begins with the development of a model. Reliability models use block diagrams and fault trees to provide a graphical means of evaluating the relationships between different parts of the system. These models incorporate predictions based on parts-count failure rates taken from historical data. While the predictions are often not accurate in an absolute sense, they are valuable to assess relative differences in design alternatives.
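A parts-count prediction of the kind mentioned above can be sketched as a simple sum, used here to compare two design alternatives rather than to predict an absolute value. The sketch assumes a series system (any part failure fails the system) with constant, independent part failure rates, so the rates add. The part types, quantities, failure rates, and function names are invented for illustration and are not taken from any handbook.

```python
def system_failure_rate(parts):
    """Sum quantity * failure rate over all part types (series, constant rates)."""
    return sum(quantity * rate for quantity, rate in parts.values())


# Hypothetical designs: part type -> (quantity, failures per hour per part).
design_a = {
    "resistor": (40, 2e-9),
    "capacitor": (25, 5e-9),
    "connector": (4, 5e-8),
}
design_b = {
    "resistor": (40, 2e-9),
    "capacitor": (25, 5e-9),
    "connector": (2, 5e-8),  # alternative layout with fewer connectors
}

rate_a = system_failure_rate(design_a)
rate_b = system_failure_rate(design_b)

print(f"design A: {rate_a:.3e}/h (MTBF {1.0 / rate_a:.3e} h)")
print(f"design B: {rate_b:.3e}/h (MTBF {1.0 / rate_b:.3e} h)")
print(f"relative improvement of B over A: {(rate_a - rate_b) / rate_a:.1%}")
```

The relative difference between the two totals is the useful output; the absolute MTBF figures inherit whatever error is in the assumed part rates.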

[Figure: A Fault Tree Diagram]

One of the most important design techniques is redundancy. This means that if one part of the system fails, there is an alternate success path, such as a backup system. An automobile brake light might use two light bulbs. If one bulb fails, the brake light still operates using the other bulb. Redundancy significantly increases system reliability, and is often the only viable means of doing so. However, redundancy is difficult and expensive, and is therefore limited to critical parts of the system.

Another design technique, physics of failure, relies on understanding the physical processes of stress, strength and failure at a very detailed level. Then the material or component can be redesigned to reduce the probability of failure. Another common design technique is component derating: selecting components whose tolerance significantly exceeds the expected stress, such as using a heavier gauge wire that exceeds the normal specification for the expected electrical current.

Many tasks, techniques and analyses are specific to particular industries and applications. Commonly these include:

- Built-in test (BIT)
- Failure mode and effects analysis (FMEA)
- Reliability simulation modeling
- Thermal analysis
- Reliability block diagram analysis
- Fault tree analysis
- Sneak circuit analysis
- Accelerated testing
- Reliability growth analysis
- Weibull analysis
- Electromagnetic analysis
- Statistical interference
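The effect of redundancy, as in the two-bulb brake light example, can be sketched with the standard series and parallel combinations used in reliability block diagrams. The sketch assumes that component failures are independent; the reliability values and function names are hypothetical.

```python
from math import prod


def series(*reliabilities):
    """All blocks must work: the reliabilities multiply."""
    return prod(reliabilities)


def parallel(*reliabilities):
    """At least one block must work: one minus the product of the unreliabilities."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


bulb = 0.95    # hypothetical reliability of one bulb over the period of interest
wiring = 0.99  # hypothetical reliability of the shared wiring and switch

single_bulb_light = series(wiring, bulb)
redundant_light = series(wiring, parallel(bulb, bulb))

print(f"one bulb:  {single_bulb_light:.4f}")
print(f"two bulbs: {redundant_light:.4f}")
```

The shared wiring stays in series in both cases, so it caps the achievable reliability; redundancy only helps for the part of the system that is actually duplicated.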

Results are presented during the system design reviews and logistics reviews. Reliability is just one requirement among many system requirements. Engineering trade studies are used to determine the optimum balance between reliability and other requirements and constraints.

## Reliability testing

[Figure: A Reliability Sequential Test Plan]

The purpose of reliability testing is to discover potential problems with the design as early as possible and, ultimately, provide confidence that the system meets its reliability requirements.

Reliability testing may be performed at several levels. Complex systems may be tested at component, circuit board, unit, assembly, subsystem and system levels. (The test level nomenclature varies among applications.) For example, performing environmental stress screening tests at lower levels, such as piece parts or small assemblies, catches problems before they cause failures at higher levels. Testing proceeds during each level of integration through full-up system testing, developmental testing, and operational testing, thereby reducing program risk. System reliability is calculated at each test level. Reliability growth techniques and failure reporting, analysis and corrective action systems (FRACAS) are often employed to improve reliability as testing progresses. The drawbacks to such extensive testing are time and expense. Customers may choose to accept more risk by eliminating some or all lower levels of testing.

It is not always feasible to test all system requirements. Some systems are prohibitively expensive to test, some failure modes may take years to observe, some complex interactions result in a huge number of possible test cases, and some tests require the use of limited test ranges or other resources.

In such cases, different approaches to testing can be used, such as accelerated life testing, design of experiments, and simulations.

The desired level of statistical confidence also plays an important role in reliability testing. Statistical confidence is increased by increasing either the test time or the number of items tested. Reliability test plans are designed to achieve the specified reliability at the specified confidence level with the minimum number of test units and test time. Different test plans result in different levels of risk to the producer and consumer. The desired reliability, statistical confidence, and risk levels for each side influence the ultimate test plan. Good test requirements ensure that the customer and developer agree in advance on how reliability requirements will be tested.

A key aspect of reliability testing is to define "failure". Although this may seem obvious, there are many situations where it is not clear whether a failure is really the fault of the system. Variations in test conditions, operator differences, weather, and unexpected situations create differences between the customer and the system developer. One strategy to address this issue is to use a scoring conference process. A scoring conference includes representatives from the customer, the developer, the test organization, the reliability organization, and sometimes independent observers. The scoring conference process is defined in the statement of work. Each test case is considered by the group and "scored" as a success or failure. This scoring is the official result used by the reliability engineer.

As part of the requirements phase, the reliability engineer develops a test strategy with the customer. The test strategy makes trade-offs between the needs of the reliability organization, which wants as much data as possible, and constraints such as cost, schedule, and available resources. Test plans and procedures are developed for each reliability test, and results are documented in official reports.

## Accelerated testing

The purpose of accelerated life testing is to induce field failure in the laboratory at a much faster rate by providing a harsher, but nonetheless representative, environment. In such a test the product is expected to fail in the lab just as it would have failed in the field, but in much less time. The main objective of an accelerated test is either of the following:

- To discover failure modes
- To predict the normal field life from the high-stress lab life

Accelerated testing needs planning, as follows:

- Define the objective and scope of the test
- Collect required information about the product
- Identify the stress(es)

- Determine the level of stress(es)
- Conduct the accelerated test and analyse the accelerated data

Common ways to determine a life-stress relationship are:

- Arrhenius Model
- Eyring Model
- Inverse Power Law Model
- Temperature-Humidity Model
- Temperature Non-thermal Model

## Software reliability

Software reliability is a special aspect of reliability engineering. System reliability, by definition, includes all parts of the system, including hardware, software, operators and procedures. Traditionally, reliability engineering focuses on critical hardware parts of the system. Since the widespread use of digital integrated circuit technology, software has become an increasingly critical part of most electronics and, hence, nearly all present-day systems.

There are significant differences, however, in how software and hardware behave. Most hardware unreliability is the result of a component or material failure that results in the system not performing its intended function. Repairing or replacing the hardware component restores the system to its original unfailed state. However, software does not fail in the same sense that hardware fails. Instead, software unreliability is the result of unanticipated results of software operations. Even relatively small software programs can have astronomically large combinations of inputs and states that are infeasible to exhaustively test. Restoring software to its original state only works until the same combination of inputs and states results in the same unintended result. Software reliability engineering must take this into account.

Despite this difference in the source of failure between software and hardware (software doesn't wear out), some in the software reliability engineering community believe statistical models used in hardware reliability are nevertheless useful as a measure of software reliability, describing what we experience with software: the longer you run software, the higher the probability you'll eventually use it in an untested manner and find a latent defect that results in a failure.

As with hardware, software reliability depends on good requirements, design and implementation. Software reliability engineering relies heavily on a disciplined software engineering process to anticipate and design against unintended consequences. There is more overlap between software quality engineering and software reliability engineering than between hardware quality and reliability. A good software development plan is a key aspect of the software reliability program. The software development plan describes the design and coding standards, peer reviews, unit tests, configuration management, software metrics and software models to be used during software development.

A common reliability metric is the number of software faults, usually expressed as faults per thousand lines of code. This metric, along with software execution time, is key to most software reliability models and estimates. The theory is that the software reliability increases as the number of faults (or fault density) goes down. Establishing a direct connection between fault density and mean-time-between-failure is difficult, however, because of the way software faults are distributed in the code, their severity, and the probability of the combination of inputs necessary to encounter the fault. Nevertheless, fault density serves as a useful indicator for the reliability engineer. Other software metrics, such as complexity, are also used.

Testing is even more important for software than hardware. Even the best software development process results in some software faults that are nearly undetectable until tested. As with hardware, software is tested at several levels, starting with individual units, through integration and full-up system testing. Unlike hardware, it is inadvisable to skip levels of software testing. During all phases of testing, software faults are discovered, corrected, and re-tested. Reliability estimates are updated based on the fault density and other metrics. At system level, mean-time-between-failure data is collected and used to estimate reliability. Unlike hardware, performing the exact same test on the exact same software configuration does not provide increased statistical confidence. Instead, software reliability uses different metrics such as test coverage.

Eventually, the software is integrated with the hardware in the top-level system, and software reliability is subsumed by system reliability. The Software Engineering Institute's Capability Maturity Model is a common means of assessing the overall software development process for reliability and quality purposes.

## Reliability operational assessment

After a system is produced, reliability engineering during the system operation phase monitors, assesses, and corrects deficiencies. Data collection and analysis are the primary tools used. When possible, system failures and corrective actions are reported to the reliability engineering organization. The data is constantly analyzed using statistical techniques, such as Weibull analysis and linear regression, to ensure the system reliability meets the specification. Reliability data and estimates are also key inputs for system logistics.

Data collection is highly dependent on the nature of the system. Most large organizations have quality control groups that collect failure data on vehicles, equipment, and machinery. Consumer product failures are often tracked by the number of returns. For systems in dormant storage or on standby, it is necessary to establish a formal surveillance program to inspect and test random samples. Any changes to the system, such as field upgrades or recall repairs, require additional reliability testing to ensure the reliability of the modification.

Reliability organizations
Systems of any significant complexity are developed by organizations of people, such as a commercial company or a government agency. The reliability engineering organization must be consistent with the company's organizational structure. For small, non-critical systems, reliability engineering may be informal. As complexity grows, the need arises for a formal reliability function. Because reliability is important to the customer, the customer may even specify certain aspects of the reliability organization.

There are several common types of reliability organizations. The project manager or chief engineer may employ one or more reliability engineers directly. In larger organizations, there is usually a product assurance or specialty engineering organization, which may include reliability, maintainability, quality, safety, human factors, logistics, etc. In such cases, the reliability engineer reports to the product assurance manager or specialty engineering manager.

In some cases, a company may wish to establish an independent reliability organization. This is desirable to ensure that work on system reliability, which is often expensive and time consuming, is not unduly slighted due to budget and schedule pressures. In such cases, the reliability engineer works for the project on a day-to-day basis, but is actually employed and paid by a separate organization within the company.

Because reliability engineering is critical to early system design, it has become common for reliability engineers, however the organization is structured, to work as part of an integrated product team.

Certification
The American Society for Quality has a program to become a Certified Reliability Engineer (CRE). Certification is based on education, experience, and a certification test; periodic recertification is required. The body of knowledge for the test includes: reliability management, design evaluation, product safety, statistical tools, design and development, modeling, reliability testing, collecting and using data, etc.

Reliability engineering education
Some universities offer graduate degrees in Reliability Engineering (e.g., see University of Maryland). Other reliability engineers typically have an engineering degree, which can be in any field of engineering, from an accredited university or college program. Many engineering programs offer reliability courses, and some universities have entire reliability engineering programs. A reliability engineer may be registered as a Professional Engineer by the state, but this is not required by most employers. There are many professional conferences and industry training programs available for reliability engineers. Several professional organizations exist for reliability engineers, including the IEEE Reliability Society, the American Society for Quality (ASQ), and the Society of Reliability Engineers (SRE).

Failure rate
Failure rate is the frequency with which an engineered system or component fails, expressed for example in failures per hour. It is often denoted by the Greek letter λ (lambda) and is important in reliability theory. In practice, the reciprocal quantity, mean time between failures (MTBF), is more commonly reported and used for high quality components or systems.

Failure rate is usually time dependent, and an intuitive corollary is that both rates change over the expected life cycle of a system. For example, as an automobile grows older, the failure rate in its fifth year of service may be many times greater than its failure rate during its first year of service: one simply does not expect to replace an exhaust pipe, overhaul the brakes, or have major power plant-transmission problems in a new vehicle.

So in the special case when the likelihood of failure remains constant with respect to time (for example, in some product like a brick or protected steel beam), failure rate is simply the inverse of the MTBF, which is expressed for example in hours per failure.

MTBF is an important specification parameter in all aspects of high importance engineering design, such as naval architecture, aerospace engineering, automotive design, etc.; in short, any task where failure in a key part or of the whole of a system must be minimized and severely curtailed, particularly where lives might be lost if such factors are not taken into account. These factors account for many safety and maintenance practices in engineering, industry, and government regulation, such as how often certain inspections and overhauls are required on an aircraft. A similar ratio used in the transport industries, especially in railways and trucking, is 'Mean Distance Between Failure', a variation which attempts to correlate actual loaded distances to similar reliability needs and practices.

Failure rates and their projective manifestations are important factors in insurance, business, and regulation practices, as well as fundamental to the design of safe systems throughout a national or international economy.

Failure rate in the discrete sense
Stated in words, for an experiment or test, the failure rate can be defined as:

The total number of failures within an item population, divided by the total time expended by that population, during a particular measurement interval under stated conditions. (MacDiarmid, et al.)

Here failure rate λ(t) can be thought of as the probability, per unit time, that a failure occurs in a specified interval, given no failure before time t. It can be defined with the aid of the reliability function or survival function R(t), the probability of no failure before time t, as:

$$\lambda = \frac{R(t_1) - R(t_2)}{(t_2 - t_1)\,R(t_1)} = \frac{R(t) - R(t + \Delta t)}{\Delta t\,R(t)}$$

where t1 (or t) and t2 are respectively the beginning and ending of a specified interval of time spanning Δt. Note that this is a conditional probability, hence the R(t) in the denominator.

Failure rate in the continuous sense
[Figure: Exponential failure density functions]

By calculating the failure rate for smaller and smaller intervals of time Δt, the interval becomes infinitely small. This results in the hazard function, which is the instantaneous failure rate at any point in time:

$$h(t) = \lim_{\Delta t \to 0} \frac{R(t) - R(t + \Delta t)}{\Delta t\,R(t)}$$

Continuous failure rate depends on a failure distribution, F(t), which is a cumulative distribution function that describes the probability of failure prior to time t,

$$P(T \le t) = F(t) = 1 - R(t), \quad t \ge 0$$

where T is the failure time. The failure distribution function is the integral of the failure density function, f(x),

$$F(t) = \int_0^t f(x)\,dx$$
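The interval definition and its limit can be checked numerically. The following is a minimal sketch, assuming two illustrative survival functions R(t) that are not taken from the text: one with a constant failure rate and one with a wear-out shape. All parameter values and function names are hypothetical.

```python
import math

def interval_failure_rate(survival, t, dt):
    """Failure rate over [t, t + dt], conditional on survival to time t:
    (R(t) - R(t + dt)) / (dt * R(t))."""
    return (survival(t) - survival(t + dt)) / (dt * survival(t))

LAMBDA = 0.001  # hypothetical constant failure rate, in failures per hour

def constant_rate_survival(t):
    """R(t) = exp(-lambda * t): survival under a constant failure rate."""
    return math.exp(-LAMBDA * t)

def wear_out_survival(t, beta=2.0, eta=1000.0):
    """R(t) = exp(-(t / eta) ** beta): an assumed wear-out survival curve."""
    return math.exp(-((t / eta) ** beta))

# Shrinking the interval drives the interval rate toward the hazard h(t).
rates = [
    interval_failure_rate(constant_rate_survival, 500.0, dt)
    for dt in (100.0, 10.0, 1.0, 0.01)
]
for dt, rate in zip((100.0, 10.0, 1.0, 0.01), rates):
    print(f"dt = {dt:>6} h  ->  rate = {rate:.9f} failures/hour")

# For the wear-out curve the near-instantaneous rate grows with age.
early = interval_failure_rate(wear_out_survival, 500.0, 0.01)
late = interval_failure_rate(wear_out_survival, 1000.0, 0.01)
print(f"wear-out rate at 500 h: {early:.6f}, at 1000 h: {late:.6f}")
```

For the constant-rate curve the interval rate approaches LAMBDA from below as the interval shrinks, which is the limiting behavior the hazard function captures.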

The hazard function can be defined now as

$$h(t) = \frac{f(t)}{R(t)}$$

Many probability distributions can be used to model the failure distribution (see List of important probability distributions). A common model is the exponential failure distribution,

$$F(t) = \int_0^t \lambda e^{-\lambda x}\,dx = 1 - e^{-\lambda t}$$

which is based on the exponential density function. For an exponential failure distribution the hazard rate is a constant with respect to time (that is, the distribution is "memoryless"):

$$h(t) = \frac{f(t)}{R(t)} = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}} = \lambda$$

For other distributions, such as a Weibull distribution or a log-normal distribution, the hazard function is not constant with respect to time. For some, such as the deterministic distribution, it is monotonic increasing (analogous to "wearing out"); for others, such as the Pareto distribution, it is monotonic decreasing (analogous to "burning in"); while for many it is not monotonic.

Failure rate data
Failure rate data can be obtained in several ways. The most common means are:
• Historical data about the device or system under consideration. Many organizations maintain internal databases of failure information on the devices or systems that they produce, which can be used to calculate failure rates for those devices or systems. For new devices or systems, the historical data for similar devices or systems can serve as a useful estimate.
• Government and commercial failure rate data. Handbooks of failure rate data for various components are available from government and commercial sources. MIL-HDBK-217, Reliability Prediction of Electronic Equipment, is a military standard that provides failure rate data for many military electronic components. Several failure rate data sources are available commercially that focus on commercial components, including some non-electronic components.
• Testing. The most accurate source of data is to test samples of the actual devices or systems in order to generate failure data. This is often prohibitively expensive or impractical, so that the previous data sources are often used instead.

Units
Failure rates can be expressed using any measure of time, but hours is the most common unit in practice. Other units, such as miles, revolutions, etc., can also be used in place of "time" units.

Failure rates are often expressed in engineering notation as failures per million, or $10^{-6}$, especially for individual components, since their failure rates are often very low.

The Failures In Time (FIT) rate of a device is the number of failures that can be expected in one billion ($10^9$) hours of operation. This term is used particularly by the semiconductor industry.

Additivity
Under certain engineering assumptions, the failure rate for a complex system is simply the sum of the individual failure rates of its components, as long as the units are consistent, e.g. failures per million hours. This permits testing of individual components or subsystems, whose failure rates are then added to obtain the total system failure rate.

Example
Suppose it is desired to estimate the failure rate of a certain component. A test can be performed to estimate its failure rate. Ten identical components are each tested until they either fail or reach 1000 hours, at which time the test is terminated for that component. (The level of statistical confidence is not considered in this example.) The results are as follows:

Failure Rate Calculation Example

| Component | Hours | Failure |
|---|---|---|
| Component 1 | 1000 | No failure |
| Component 2 | 1000 | No failure |
| Component 3 | 467 | Failed |
| Component 4 | 1000 | No failure |
| Component 5 | 630 | Failed |
| Component 6 | 590 | Failed |
| Component 7 | 1000 | No failure |
| Component 8 | 285 | Failed |
| Component 9 | 648 | Failed |
| Component 10 | 882 | Failed |
| Totals | 7502 | 6 |

Estimated failure rate is

$$\frac{6\ \text{failures}}{7502\ \text{hours}} = 0.0007998\ \frac{\text{failures}}{\text{hour}} = 799.8 \times 10^{-6}\ \frac{\text{failures}}{\text{hour}}$$

or 799.8 failures for every million hours of operation.
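The arithmetic of the ten-component test above can be reproduced in a few lines. This is a minimal sketch: the hours and outcomes are the ones reported in the example, while the function and variable names are illustrative, and the MTBF line additionally assumes a constant failure rate.

```python
# (hours on test, failed?) for Components 1 through 10 in the example.
results = [
    (1000, False), (1000, False), (467, True), (1000, False), (630, True),
    (590, True), (1000, False), (285, True), (648, True), (882, True),
]

def estimate_failure_rate(records):
    """Total failures divided by total time on test, in failures per hour."""
    total_hours = sum(hours for hours, _ in records)
    failures = sum(1 for _, failed in records if failed)
    return failures / total_hours

rate = estimate_failure_rate(results)   # failures per hour
per_million_hours = rate * 1e6          # failures per million hours
fit = rate * 1e9                        # FIT: failures per billion hours
mtbf_hours = 1.0 / rate                 # valid only if the rate is constant

print(f"failures per hour:          {rate:.7f}")
print(f"failures per million hours: {per_million_hours:.1f}")
print(f"FIT:                        {fit:.0f}")
print(f"MTBF (constant rate):       {mtbf_hours:.1f} hours")
```

Components that reached 1000 hours without failing still contribute their hours to the denominator, which is why the total time on test is 7502 hours rather than only the hours accumulated by the failed units.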