
ASQ Certified Quality Engineer.

Ch.7 Reliability.

Edited by Seung Hyun Lee (Ph.D., CRE/CQE)


E-mail : lkangsan@iems.co.kr, Homepage : www.IEMS.co.kr

- 1 -
■ Definition of Reliability.
[CQE Primer pp1 - pp7]

■ The probability that a product will perform its intended function
satisfactorily for a pre-determined period of time and in a given
environment.

■ From the definition above, there are four key elements of reliability.
1. Probability.
2. Intended Function.
3. Time.
4. Environment.

- 1 -
■ Designing for Reliability.
[CQE Primer pp1 - pp7]

■ Designing for reliability is often divided into phases such as concept,
design and development, full scale development, operational and
disposal.

■ Concept Phase.
․ The first and earliest phase of designing for reliability.
․ The designers work with customers to develop a product that meets the needs of
the perceived customer in terms of ease of use, special training needs, special
power requirements, complexity of design, support, etc.

■ Design & Development Phase.


․ The stage when the issues such as ergonomics, maintainability, safety and other
major design considerations become a "product on paper."

- 2 -
■ Designing for Reliability (continued).
[CQE Primer pp1 - pp7]

■ Full Scale Development.


․ The design is basically complete and prototype (pre-production) runs of the final
build are occurring.
․ Changes during this phase are very costly, and in many cases they are not made so
that some other goal or target can be met.

■ Operational Phase.
․ The use of the item in the field.
․ The ease of use and ease of maintenance have a significant impact on the
reliability of a product.

■ Disposal.
․ Designers are now required to take the disposal of the product into consideration
in the design.
․ In the design for disassembly it is important that performance is not degraded.
- 3 -
■ Some Considerations for Reliability.
[CQE Primer pp1 - pp7]

■ Cost Factors.
․ Reliability engineers and design engineers must achieve optimum reliability, or else
costs incurred throughout subsequent operations will be greater than necessary.

■ Environmental Factors.
․ In system design, there are two categories of environment with which the
reliability engineer must contend.
1. The first is the family environment in which the components must function
as a system.
2. The second is the environment in which the system must function.

- 4 -
■ Some Considerations for Reliability.
[CQE Primer pp1 - pp7]

■ Human Factors.
․ Human factors such as hearing and visual acuities, sex, mental and physical
capabilities, and compatibility of work methods with human anatomy all can be vital
factors in achieving reliability objectives.
․ A summary of some important human factors.
1. Dangerous or critical controls should be made obvious, be protected with
a cover, and labeled.
2. Knobs, levers and controls should be designed for easy access; they should
not protrude in a way that permits accidental bumping.
3. Foot controls are best applied from the sitting position.
4. Conditions of the work area should be appropriate.
(Light, temperature, noise, and odors should be considered.)
5. If an operation can be inherently hazardous, it should be labeled accordingly.
6. Age, dexterity, agility, physical condition, aptitude, skill level, hearing,
eyesight, mental ability, and temperament, are important factors when
designing equipment, and assigning work.

- 5 -
■ Some Considerations for Reliability.
[CQE Primer pp1 - pp7]

■ Simplification.
․ The smallest number of components should be used without compromising
performance, particularly if the design of a system is a series design.

■ Redundancy.
․ Redundancy is defined as the existence of more than one means for achieving
a stated level of performance.
․ As parallel elements are added to such a redundant system, the reliability
increases because each new element provides a different "route" or "bypass".

■ Derating.
․ Derating may be applied in structural design by designing a 10,000 lb. load
capability into a device when the maximum specification is 1,000 lbs.
․ System reliability can be increased by designing components with operating safety
margins.
- 6 -
■ Some Considerations for Reliability.
[CQE Primer pp1 - pp7]
■ Fail-safe.
․ If failure of the water circulation system can overheat an X-ray machine, then the
machine can be designed to stop as soon as water circulation is interrupted.
․ When failure to operate a product can lead to fatality or substantial financial loss,
a fail-safe type design should be adopted.

■ Producibility.
․ Products must be designed not only for performance, but also so that they can be
produced with quality.

■ Maintainability.
․ Products should be designed such that weak or marginal parts can be replaced
conveniently, normal maintenance can be conducted effectively, and a field
representative or user can effect planned maintenance economically without
sacrificing reliability.

- 7 -
■ System Effectiveness.
[CQE Primer p27]
■ Definition.
․ A measure of the degree to which an item or system can be expected to achieve
a set of specific mission requirements, and which may be expressed as a function
of availability, dependability and capability.

- 8 -
■ System Effectiveness.
[CQE Primer p27]

■ Three Components of System Effectiveness.


․ Availability. A measure of the degree to which an item or system is in the
operable and committable state at the start of the mission, when the mission is
called for at an unknown point in time.

․ Dependability. The probability that an item will (a) enter any one of its required
operational modes during a specified mission, and (b) perform the functions
associated with those operational modes.

․ Capability. A measure of the ability of an item or system to achieve mission
objectives given the conditions during the mission.

- 9 -
■ System Effectiveness.
[CQE Primer p27]

■ System Effectiveness Formula.


․ SE = Mission Reliability × Operational Readiness × Design Adequacy.
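
As a minimal sketch of the formula above, assuming each of the three factors is expressed as a probability between 0 and 1. The function name and the sample values are illustrative only and are not taken from the primer.

```python
def system_effectiveness(mission_reliability, operational_readiness, design_adequacy):
    """SE as the product of three factors, each assumed to lie in [0, 1]."""
    factors = (mission_reliability, operational_readiness, design_adequacy)
    for value in factors:
        if not 0.0 <= value <= 1.0:
            raise ValueError("each factor must lie between 0 and 1")
    return mission_reliability * operational_readiness * design_adequacy


# Hypothetical values, chosen only to illustrate the calculation.
se = system_effectiveness(0.95, 0.90, 0.98)
print(round(se, 4))
```

Because SE is a product, it can never exceed its weakest factor, so the lowest of the three values is usually the first place to look for improvement.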

- 10 -
■ Series & Parallel System Reliability.
[CQE Primer pp21 - pp22]

■ Series System Reliability.


․ In a series system, the total reliability of the system is dependent on each
individual component working.

․ For a series system to operate successfully, all components must operate
successfully.
․ The reliability of a series system : $R_s = \prod_{i=1}^{n} R_i$
․ Example.

R1 = 0.9 R2 = 0.95 R3 = 0.94

Solution : Rsys = R1 × R2 × R3 = 0.90 × 0.95 × 0.94 = 0.80
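
The series calculation can be checked with a short sketch, assuming the components fail independently. The function name is illustrative.

```python
import math


def series_reliability(component_reliabilities):
    """Reliability of a series system: the product of the component reliabilities."""
    return math.prod(component_reliabilities)


# Values from the example: R1 = 0.90, R2 = 0.95, R3 = 0.94.
r_sys = series_reliability([0.90, 0.95, 0.94])
print(round(r_sys, 4))  # 0.8037, which the example rounds to 0.80
```

The system reliability is lower than that of any single component, which is why simplification matters most in a series design.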


- 11 -
■ Series & Parallel System Reliability.
[CQE Primer pp21 - pp22]

■ Parallel System Reliability.


․ In a parallel system, the reliability of the system is calculated by subtracting the
product of the unreliabilities from 1.
․ Reliability of a parallel system : $R_s = 1 - \prod_{i=1}^{n} (1 - R_i)$
․ Example.

R1 = 0.90

R2 = 0.95

R3 = 0.94

Solution : Rsys = 1 - (1 - R1) × (1 - R2) × (1 - R3)


= 1 - (1- 0.90) × (1 - 0.95) × (1 - 0.94) = 0.9997
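
A comparable sketch for the parallel case, assuming independent components and that any one working component is enough for the system to work. The function name is illustrative.

```python
import math


def parallel_reliability(component_reliabilities):
    """Reliability of a parallel system: 1 minus the product of the unreliabilities."""
    system_unreliability = math.prod(1.0 - r for r in component_reliabilities)
    return 1.0 - system_unreliability


# Values from the example: R1 = 0.90, R2 = 0.95, R3 = 0.94.
r_sys = parallel_reliability([0.90, 0.95, 0.94])
print(round(r_sys, 4))  # 0.9997
```

Here the system reliability exceeds that of the best single component, which is the effect redundancy is meant to achieve.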
- 12 -
■ Failure Modes & Mechanisms.
[CQE Primer p19]

- 13 -
■ Failure Modes & Mechanisms.
[CQE Primer p19]

■ Infant Mortality.
․ These failures are generally the result of components that do not meet
specifications or workmanship that is not up to standard.
․ These are not design-related issues, but quality-related issues.
․ The infant mortality period is noted by a decreasing failure rate. The Weibull
distribution is commonly used to determine when the infant mortality period is over.

■ Constant Failure Rate.


․ This is called the random failure rate period.
․ We can only predict the probability of a failure in a certain interval, but not a
specific failure at a specific time.
․ The constant failure rate period is the most common time frame for making
reliability predictions, where the exponential distribution is utilized.

- 14 -
■ Failure Modes & Mechanisms.
[CQE Primer pp22 - pp26]

■ Constant Failure Rate (continued).


․ For exponential data, the failure rate of a product can be calculated from test data.

․ Failure Rate & MTBF.


$\lambda = \text{Failure rate} = \dfrac{\text{No. of items failed}}{\text{Total test time}}$

$\text{MTBF} = \dfrac{\text{Total test time}}{\text{No. of items failed}} = \dfrac{1}{\lambda}$

․ Example. If seven items are tested for 50 hours each and three items fail at 20, 38,
and 42 hours respectively, what is the failure rate of the item?

$\lambda = \dfrac{3}{20 + 38 + 42 + (4)(50)} = \dfrac{3}{300} = 0.01/\text{hr}$
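
The same calculation as a small sketch, assuming a time-terminated test in which failed items are not replaced, so each survivor contributes the full test duration. The function and parameter names are illustrative.

```python
def failure_rate(failure_times, n_tested, test_duration):
    """Failures per unit time: number failed divided by total accumulated test time."""
    n_failed = len(failure_times)
    n_survived = n_tested - n_failed
    # Failed items contribute their time to failure; survivors contribute the full duration.
    total_test_time = sum(failure_times) + n_survived * test_duration
    return n_failed / total_test_time


# Seven items tested for 50 hours each; three fail at 20, 38 and 42 hours.
lam = failure_rate([20, 38, 42], n_tested=7, test_duration=50)
mtbf = 1.0 / lam
print(lam, mtbf)  # about 0.01 failures per hour and an MTBF of about 100 hours
```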

- 15 -
■ Failure Modes & Mechanisms.
[CQE Primer pp22 - pp26]

■ Wearout Period.
․ As time goes on, we see failures occurring more and more frequently, to a point
where it may no longer be practical to continue operating the system.
․ Several distributions may be appropriate to model the wearout period.

- 16 -
■ Failure Density Functions.
[CQE Primer pp22 - pp26]

■ Exponential Distribution.
․ The exponential distribution is commonly used for predicting the reliability of items
in the constant rate failure period.
․ Failure Density Function : $f(t) = \lambda e^{-\lambda t}$ where λ = failure rate.
․ The Reliability Function : $R(t) = e^{-\lambda t} = e^{-t/\theta}$ where θ = 1/λ is the mean time between failures.
․ The Hazard Function : $\lambda(t) = \lambda$
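
A brief sketch of the three exponential functions. The failure rate of 0.01 per hour and the 50-hour mission time are assumed values for illustration, and the function names are not from the primer.

```python
import math


def exponential_pdf(t, lam):
    """Failure density f(t) = lam * exp(-lam * t)."""
    return lam * math.exp(-lam * t)


def exponential_reliability(t, lam):
    """Reliability R(t) = exp(-lam * t), the probability of surviving to time t."""
    return math.exp(-lam * t)


def exponential_hazard(t, lam):
    """Hazard f(t) / R(t); for the exponential distribution this equals lam at every t."""
    return exponential_pdf(t, lam) / exponential_reliability(t, lam)


lam = 0.01  # assumed failure rate, per hour
print(round(exponential_reliability(50, lam), 4))  # 0.6065
print(exponential_hazard(10, lam), exponential_hazard(500, lam))
```

The hazard comes out the same at 10 hours and at 500 hours, which is what "constant failure rate" means in practice.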
- 17 -
■ Failure Density Functions.
[CQE Primer pp22 - pp26]

■ Weibull Distribution.
․ The Weibull distribution takes on many distributional shapes rather than a single
unique shape, unlike many other distributions.

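․ Failure Density Function : $f(x) = \dfrac{\beta}{\theta}\left(\dfrac{x-\delta}{\theta}\right)^{\beta-1} e^{-\left(\frac{x-\delta}{\theta}\right)^{\beta}}, \quad x \ge \delta$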
where β : the shape parameter.

θ : the scale parameter.


δ : the location parameter.

β < 1 Infant mortality.


β = 1 Useful life.
β > 1 Wearout.
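
The link between β and the three life periods can be illustrated with a sketch. The reliability and hazard expressions used here are the standard Weibull results that follow from the density above; they are not stated on the slide. The scale of 100 hours, the zero location parameter, and the function names are assumptions.

```python
import math


def weibull_reliability(x, beta, theta, delta=0.0):
    """R(x) = exp(-((x - delta) / theta) ** beta) for x >= delta."""
    return math.exp(-(((x - delta) / theta) ** beta))


def weibull_pdf(x, beta, theta, delta=0.0):
    """Failure density f(x); evaluated here only for x > delta."""
    z = (x - delta) / theta
    return (beta / theta) * z ** (beta - 1) * math.exp(-(z ** beta))


def weibull_hazard(x, beta, theta, delta=0.0):
    """Hazard h(x) = f(x) / R(x) = (beta / theta) * z ** (beta - 1), for x > delta."""
    z = (x - delta) / theta
    return (beta / theta) * z ** (beta - 1)


# Hypothetical scale (theta = 100 hours) with the location parameter left at 0.
for beta in (0.5, 1.0, 3.0):
    early = weibull_hazard(50.0, beta, theta=100.0)
    late = weibull_hazard(150.0, beta, theta=100.0)
    if late < early:
        trend = "decreasing"
    elif late > early:
        trend = "increasing"
    else:
        trend = "constant"
    print(f"beta={beta}: hazard is {trend}")
```

With β = 1 the density reduces to the exponential form, so the useful-life period is a special case of the Weibull model.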

- 18 -
■ Maintainability and Availability.
[CQE Primer pp28 - pp35]

■ Maintainability.
․ The measure of the ability of an item to be retained or restored to a specified
condition when maintenance is performed by personnel having specified skill levels,
using prescribed procedures and resources, at each prescribed level of
maintenance and repair.

■ Availability.
․ The measure of the degree to which an item is in the operable and committable
state at the start of a mission, when the mission is called for at an unknown(random)
time.

- 19 -
■ Maintainability and Availability.
[CQE Primer pp28 - pp35]

■ Availability.
․ The three common measures of availability.
Inherent Availability ( A i ) : This is the ideal state for analyzing availability.
The only considerations are the MTBF and the MTTR. This measure does not take
into account the time for preventive maintenance and assumes repair begins
immediately upon failure of the system.
$A_i = \dfrac{\text{MTBF}}{\text{MTBF} + \text{MTTR}}$
MTTR : Mean time to repair.

Achieved Availability ( A a ) : This is more realistic in that it takes preventive
maintenance into account, as well as corrective maintenance.
$A_a = \dfrac{\text{MTBMA}}{\text{MTBMA} + \text{MMT}}$
MTBMA : Mean Time Between Maintenance Actions.
MMT : Mean Maintenance Action Time.

- 20 -
■ Maintainability and Availability.
[CQE Primer pp28 - pp35]

■ Availability.
․ The three common measures of availability.
Operational Availability ( A o ) : This is what generally occurs in practice.
Operational availability takes into account that the maintenance response is not
instantaneous and that parts may not be in stock for repair, as well as other
logistics issues.
$A_o = \dfrac{\text{MTBMA}}{\text{MTBMA} + \text{MDT}}$
MDT : Mean Down Time.
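
All three availability measures share the same uptime / (uptime + downtime) form, which the following sketch makes explicit. Every number below is a hypothetical figure in hours, and the helper name is illustrative.

```python
def availability(mean_uptime, mean_downtime):
    """Generic ratio uptime / (uptime + downtime) shared by all three measures."""
    return mean_uptime / (mean_uptime + mean_downtime)


# Hypothetical figures in hours, chosen only for illustration.
mtbf, mttr = 100.0, 2.0   # mean time between failures, mean time to repair
mtbma = 80.0              # mean time between maintenance actions
mmt, mdt = 3.0, 8.0       # mean maintenance action time, mean down time

a_inherent = availability(mtbf, mttr)      # Ai = MTBF / (MTBF + MTTR)
a_achieved = availability(mtbma, mmt)      # Aa = MTBMA / (MTBMA + MMT)
a_operational = availability(mtbma, mdt)   # Ao = MTBMA / (MTBMA + MDT)
print(round(a_inherent, 3), round(a_achieved, 3), round(a_operational, 3))
```

With these assumed numbers the ordering is Ai > Aa > Ao, since each successive measure counts more sources of downtime.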

- 21 -
■ Preventive Maintenance.
[CQE Primer pp28 - pp35]

■ Types of Maintenance : Corrective Maintenance.


․ Corrective maintenance cannot be planned, but the need for it can be estimated from
the item's reliability. The mean time to repair (MTTR) is applicable for such items.
The time to repair has three elements :
1. Preparation Time : locating people, traveling to the site, obtaining tools, parts and
instruments.
2. Active Maintenance Time : studying the charts, performing the repair, and verifying
the repair.
3. Delay Time : the wait time involved in such activities as locating charts, waiting at
the stores counter, waiting on production to clear the area, and
awaiting personnel to verify repairs.

- 22 -
■ Preventive Maintenance.
[CQE Primer pp28 - pp35]

■ Types of Maintenance : Preventive Maintenance.


․ PM has the function of preventing failure via planned or scheduled efforts. PM
can be based on : scheduled service for cleaning, service for lubricating, detection of
early signals of problems, length of use, or other failures in service.

- 23 -
■ Maintenance Levels.
[CQE Primer pp28 - pp35]

■ Organizational Maintenance.
․ Where parts are repaired by personnel with the least specialized maintenance skills.

■ Intermediate Maintenance.
․ Where parts are beyond a basic level of repair, so the need is at a slightly higher
level of skill.

■ Depot Level.
․ When highly specialized skills are required and possibly at a more central location
with specialized equipment.

- 24 -
■ Preventive Maintenance Strategy.
[CQE Primer pp28 - pp35]

■ Decreasing Hazard Rate.

[Figure: decreasing hazard rate curve. Scheduled maintenance will return the part
to the top of the curve.]

․ Given a decreasing hazard rate, it is best to not replace the part.


․ If these failures cannot be prevented, a burn-in of the part should be implemented
before it is installed and put into service.
․ Various causes of infant mortality.
Improper use, Improper installation, Inadequate materials, Poor quality conformance,
Over-stressing, Power surges, Improper set up, Handling damage.

- 25 -
■ Preventive Maintenance Strategy.
[CQE Primer pp28 - pp35]
■ Constant Hazard Rate.

[Figure: constant hazard rate curve. Replacement of a part will result in the same
probability of failure as before.]

․ Given a constant hazard rate, part replacement does not reduce failure rates.

- 26 -
■ Preventive Maintenance Strategy.
[CQE Primer pp28 - pp35]

■ Increasing Hazard Rate.

․ Given an increasing hazard rate, scheduled replacement reduces failure rates.


․ Given a nearly failure free, but increasing hazard rate, scheduled replacement will
provide a near zero failure rate.

- 27 -
■ Designing for Maintainability & Availability.
[CQE Primer pp28 - pp35]

■ Some Guidelines for Improving Maintainability & Availability.


․ Standardization : Look for compatibility of mating parts and minimize the number of
different parts in the system.

․ Modularization : Have standards on sizes, shapes, modular units. This will allow for
standardized assembly and disassembly procedures.

․ Functional Packaging : Place all needed components of an item into a kit or package.

․ Interchangeability : This refers to plug-in devices where spares are instantly
interchangeable with failed parts. One part can be used in other units.

․ Accessibility : A part should be easy to get to and to replace. Good parts should not
be removed to gain access to failed parts.

- 28 -
■ Designing for Maintainability & Availability.
[CQE Primer pp28 - pp35]

■ Some Guidelines for Improving Maintainability & Availability (continued).


․ Malfunction Annunciation : Provide a means to notify the operator when the unit fails.

․ Fault Isolation : Make it possible to isolate a malfunction. Fault isolation is the most
time-consuming task of all maintenance work. It can be minimized by preventive
maintenance procedures, built-in test equipment (BITE), simplicity in design of parts,
and trained personnel.

․ Identification : Have a unique identification of all components and a method of
recording corrective and preventive maintenance.

- 29 -
■ Failure Recurrence Control System.
[CQE Primer pp36 - pp50]

■ Failure Reporting, Analysis and Corrective Action System (FRACAS)


․ A closed loop failure reporting, analysis and corrective action system to :
1. Eliminate critical failure modes.
2. Perform a detailed analysis of each failure.
3. Implement corrective action to prevent failure.
4. Categorize failure to detect trends.

- 30 -
■ Failure Recurrence Control System.
[CQE Primer pp36 - pp50]

■ Failure Mode and Effect Analysis (FMEA)


․ This system examines ways in which a product or system failure may occur.
․ FMEA starts with potential problems and looks for resulting bad effects.

- 31 -
■ Failure Recurrence Control System.
[CQE Primer pp36 - pp50]

■ Fault Tree Analysis (FTA).


․ FTA starts with undesired events for which the designer must provide some solution.
․ Safety hazards and injuries are often diagnosed with this method.

- 32 -
■ Risk Management.
[CQE Primer pp8 - pp10]

■ Safety Factor.
․ A design engineer concerned with mechanical loading devices must consider the
safety factor and margin of safety.
․ Safety factor and margin of safety.

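Safety Factor : $\dfrac{\mu_x}{\mu_y}$      Margin of Safety : $\dfrac{\mu_x - \mu_y}{\mu_y}$
where μx is the average strength and μy is the average stress.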

․ Example.
An aircraft component is being designed with an average material strength of 60,000 psi.
The expected stress is 32,000 psi. What is the safety factor ? What is the margin of
safety ?

Safety Factor $= \dfrac{\mu_x}{\mu_y} = \dfrac{60{,}000}{32{,}000} = 1.875$ (187.5%)

Margin of Safety $= \dfrac{\mu_x - \mu_y}{\mu_y} = \dfrac{60{,}000 - 32{,}000}{32{,}000} = 0.875$
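
The two ratios can be sketched as follows, taking the first argument as the average strength and the second as the average stress, as in the example. The function names are illustrative.

```python
def safety_factor(mean_strength, mean_stress):
    """Ratio of average strength to average stress."""
    return mean_strength / mean_stress


def margin_of_safety(mean_strength, mean_stress):
    """Excess of average strength over average stress, relative to the stress."""
    return (mean_strength - mean_stress) / mean_stress


# Aircraft component example: 60,000 psi strength against 32,000 psi stress.
sf = safety_factor(60_000, 32_000)
ms = margin_of_safety(60_000, 32_000)
print(sf, ms)  # 1.875 0.875
```

The margin of safety is always the safety factor minus 1, so either value determines the other.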

- 33 -
■ Risk Management.
[CQE Primer pp8 - pp10]

■ Stress-Strength Interference.
․ An item fails when the applied stress exceeds the strength of the item.
․ In general, designers design for a nominal strength and a nominal stress that will be
applied to an item. One must also be aware of the variability about the stress and
strength nominals.

․ Stress-Strength Separation.
The distribution curves for stress and strength are far enough apart that there is little
probability that a high stress level will interfere with an item that is on the low end of
the strength distribution.

- 34 -
■ Risk Management.
[CQE Primer pp8 - pp10]
■ Stress-Strength Interference (continued).
․ Stress-Strength Overlap.
There is too much variability for the proximity of the means for stress and strength
and there is an increased likelihood of failure which is represented by the overlapping
shaded area.

․ Modeling for Stress-Strength Overlap.


When the stress distribution and strength distribution are independent of each other,

$\mu_{X-Y} = \mu_X - \mu_Y, \qquad \sigma_{X-Y} = \sqrt{\sigma_X^2 + \sigma_Y^2}$

- 35 -
■ Risk Management.
[CQE Primer pp8 - pp10]

■ Stress-Strength Interference (continued).


․ Example.
If the stress distribution has a mean of 1500 lbs. with a standard deviation of 20 lbs.,
and the unit is designed to handle 1600 lbs. with a standard deviation of 30 lbs., the
calculation is as follows.

Sol)

μX = 1600 μY = 1500
σX = 30 σY = 20.

Calculate Z to get the probability of failure :

$Z = \dfrac{1600 - 1500}{\sqrt{30^2 + 20^2}} = 2.77$
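
A sketch of the Z calculation, extended to the failure probability it is meant to provide. It assumes that stress and strength are independent and normally distributed, which the slide implies by using Z but does not state outright. The function name is illustrative.

```python
import math
from statistics import NormalDist


def interference_z(mu_strength, sigma_strength, mu_stress, sigma_stress):
    """Z for the difference (strength - stress) of two independent variables."""
    mu_diff = mu_strength - mu_stress
    sigma_diff = math.sqrt(sigma_strength ** 2 + sigma_stress ** 2)
    return mu_diff / sigma_diff


# Example values: strength 1600 lbs (sd 30), stress 1500 lbs (sd 20).
z = interference_z(1600, 30, 1500, 20)
# Failure occurs when strength - stress < 0, i.e. in the lower tail beyond -z.
p_failure = NormalDist().cdf(-z)
print(round(z, 2))  # 2.77
print(p_failure)    # roughly 0.0028 under the normality assumption
```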

- 36 -
■ Risk Management.
[CQE Primer pp8 - pp10]

■ Monte Carlo Simulation.


․ Monte Carlo simulation is a technique that permits the setting up of a process to
emulate real world conditions as closely as possible.
․ In an area analysis, there may be several variables and distributions that are combined
to determine the probability of a failure.
․ The use of Monte Carlo simulations is greatly enhanced through the use of computers
for generating random numbers and for analysis to determine if a significant event will
occur.
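
As an illustration of the idea, the stress-strength example from the previous slide can be simulated by drawing random stress and strength values and counting how often stress exceeds strength. This is a sketch only: the normal distributions, the number of trials, the seed, and the function name are all assumptions.

```python
import random


def simulate_failure_probability(n_trials, seed=0):
    """Estimate P(stress > strength) by repeated random sampling."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    failures = 0
    for _ in range(n_trials):
        strength = rng.gauss(1600.0, 30.0)  # strength: mean 1600 lbs, sd 30 lbs
        stress = rng.gauss(1500.0, 20.0)    # stress: mean 1500 lbs, sd 20 lbs
        if stress > strength:
            failures += 1
    return failures / n_trials


estimate = simulate_failure_probability(200_000)
print(estimate)  # should land near the analytical value of roughly 0.003
```

The estimate varies from seed to seed, and rare failures need many trials before the count is stable. The benefit of simulation is that it still works when the distributions are not normal or when more than two variables are combined.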

- 37 -
■ Product Safety and Liability.
[CQE Primer pp51 - pp53]

■ Introduction.
․ A new force exists in the economy known as consumerism. Consumers exist in large
numbers, have voting power and advocates.
․ In most modern industrialized countries, customers are increasingly demanding quality,
product safety, non-confusing pricing and manufacturer liability.

■ Company Programs to Improve Product Safety.


․ Top management.
1. Commits to make and sell only safe products.
2. Mandates formal design reviews.
3. Establishes guidelines for product traceability.
4. Establishes claim defense guidelines.
5. Establishes safety performance guidelines.
6. Assures compliance via audits.

- 38 -
■ Product Safety and Liability.
[CQE Primer pp51 - pp53]

■ Company Programs to Improve Product Safety (continued).


․ Supplemental organization product safety structure.
1. A product safety committee.
2. Safety engineers.
3. Outside experts for advice and audit.

․ Other key product safety organizational responsibility centers.


1. Product Design.
2. Manufacturing.
3. Quality control.
4. Marketing.
5. Field service.

- 39 -
■ Product Safety and Liability.
[CQE Primer pp51 - pp53]

■ Criminal and Personal Liability.


․ Historical : For centuries, common law established two grounds for imposing criminal
liability in injury cases :
1. The defendant engaged in some kind of fraud. They did something wrong and they
knew it.
2. The defendant was grossly negligent.

․ Current liability.
1. Recent regulatory laws have dropped key words like knowingly from a statement :
"any person who violates . . . . . shall be guilty."
2. Generally, the chief targets of consumer advocates are managers and chief
executives.

- 40 -
