B.S. Dhillon
Reasonable efforts have been made to publish reliable data and information, but the author and publisher
cannot assume responsibility for the validity of all materials or the consequences of their use. The authors
and publishers have attempted to trace the copyright holders of all material reproduced in this publication and
apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright
material has not been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, trans-
mitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter
invented, including photocopying, microfilming, and recording, or in any information storage or retrieval
system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access www.copyright.com
or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-
750-8400. For works that are not available on CCC please contact mpkbookspermissions@tandf.co.uk
Trademark notice: Product or corporate names may be trademarks or registered trademarks and are
used only for identification and explanation without intent to infringe.
DOI: 10.1201/9781003298571
Typeset in Times
by KnowledgeWorks Global Ltd.
This book is affectionately dedicated to my son,
Mark, for challenging me to write 50 books.
Contents
Preface....................................................................................................................xvii
Author Biography.....................................................................................................xxi
Chapter 1 Introduction........................................................................................... 1
1.1 Reliability, Usability, and Quality History................................. 1
1.2 Need of Reliability, Usability, and Quality
in Product Design.......................................................................1
1.3 Terms and Definitions................................................................ 2
1.4 Useful Sources for Obtaining Information on Reliability,
Usability, and Quality................................................................. 4
1.4.1 Journals and Magazines................................................ 4
1.4.2 Conference Proceedings................................................ 4
1.4.3 Books.............................................................................4
1.4.4 Standards....................................................................... 5
1.4.5 Data Sources..................................................................6
1.5 Scope of the Book...................................................................... 6
1.6 Problems..................................................................................... 7
References............................................................................................. 7
Chapter 6 is devoted to robot reliability. Some of the topics covered in the chapter are
robot failure categories, causes, and corrective measures; robot reliability measures,
reliability analysis of hydraulic and electric robots, and models for conducting robot
reliability and maintenance studies.
Chapter 7 presents various important aspects of computer and internet reliabil-
ity. Some of the topics covered in the chapter are computer failure-related causes
and issues in computer system reliability, comparisons between computer hardware
and software reliability, fault masking, software reliability assessment methods,
internet outage classifications and an approach for automating fault detection in
internet-related services, and mathematical models for conducting internet reli-
ability and availability analysis. Chapter 8 is devoted to power system reliability.
Some of the topics covered in the chapter are loss of load probability, power system
service performance indices, availability analysis of transmission and associated
systems, and availability analysis of a single generator unit. Chapter 9 presents
various important aspects of medical device usability. Some of the topics covered
in the chapter are medical devices with high incidence of user/human error and
general approach for developing medical devices’ effective user interfaces, useful
guidelines for making interfaces of medical device more user-friendly, designing
medical devices for old users, and cumulative trauma disorder (CTD) implications
in medical device design.
Chapter 10 is devoted to software usability. Some of the topics covered in the
chapter are need for considering usability during the software development process,
software usability engineering process, steps to improve software usability, software
usability inspection methods, software usability testing methods, and useful guide-
lines to perform software usability testing.
Chapter 11 presents various important aspects of web usability. Some of the top-
ics covered in the chapter are common web design-related errors, web page design,
website design, navigation aids, and web usability evaluation tools. Chapter 12 is
devoted to quality in health care. Some of the topics covered in the chapter are
comparisons of traditional quality assurance and total quality management (TQM)
in regard to health care and quality assurance versus quality improvement in health
care institutions, steps for quality improvement in health care and physician reac-
tions to total quality, and quality tools for use in health care. Chapter 13 presents
various important aspects of medical device quality assurance. Some of the topics
covered in the chapter are regulatory compliance of medical device quality assur-
ance, medical device design quality assurance programme, tools for assuring medi-
cal device quality and quality indices.
Finally, Chapter 14 is devoted to software quality. Some of the topics covered in
the chapter are software quality factors and their categories, useful quality methods
for use during the software development process, quality-related measures during
the software development life cycle, software quality-associated metrics, and soft-
ware quality-related cost.
This book will be useful to many individuals including reliability engineers,
design engineers, usability and quality control professionals, system engineers, engi-
neering administrators, graduate and senior undergraduate students of engineering,
researchers and instructors of reliability, usability, and quality, and engineers-at-large.
B.S. Dhillon
University of Ottawa
Author Biography
Dr. B.S. Dhillon is a professor of Engineering Management in the Department of
Mechanical Engineering at the University of Ottawa. He has served as a Chairman/
Director of Mechanical Engineering Department/Engineering Management Programme
for over 10 years at the same institution. He is the founder of the probability
distribution named Dhillon Distribution/Law/Model by statistical researchers in
their publications around the world. He has published over 377 (i.e., 224 [70 single
authored + 154 co-authored] journal and 153 conference proceedings) articles on
reliability engineering, maintainability, safety, engineering management, etc. He is
or has been on the editorial boards of 14 international scientific journals. In addi-
tion, Dr. Dhillon has written 50 books on various aspects of health care, engineering
management, design, reliability, safety, and quality published by Wiley (1981), Van
Nostrand (1982), Butterworth (1983), Marcel Dekker (1984), Pergamon (1986), etc.
His books are being used in over 100 countries and many of them are translated into
languages such as German, Russian, Chinese, and Persian (Iranian).
He has served as General Chairman of two international conferences on reliabil-
ity and quality control held in Los Angeles and Paris in 1987. Prof. Dhillon has also
served as a consultant to various organisations and bodies and has many years of
experience in the industrial sector. At the University of Ottawa, he has been teach-
ing reliability, quality, engineering management, design, and related areas and he
has also lectured in over 50 countries, including keynote addresses at various inter-
national scientific conferences held in North America, Europe, Asia, and Africa.
In March 2004, Dr. Dhillon was a distinguished speaker at the Conf./Workshop
on Surgical Errors (sponsored by White House Health and Safety Committee and
Pentagon), held at the Capitol Hill (One Constitution Avenue, Washington, DC).
Professor Dhillon attended the University of Wales where he received a BS in
electrical and electronic engineering and an MS in mechanical engineering. He
received a PhD in industrial engineering from the University of Windsor.
1 Introduction
design specifications, competition, and public demand. The first two of these factors
are described below in detail.
Even if we consider the increase in the product complexity in regard to parts
alone, there has been a phenomenal growth of some products. For example, a typical
Boeing 747 jumbo jet airplane was made up of around 4.5 million parts, including
fasteners. Even for relatively simpler products, there has been a quite significant
increase in complexity in regard to parts. For example, in 1935 a farm tractor was
made up of 1200 critical parts and in 1990 the number increased to around 2900.
In regard to the past system failures, various studies have revealed that design-
associated problems are generally the greatest causes for product failures. For exam-
ple, a study conducted by the U.S. Navy concerning electronic equipment failure
causes attributed 43% of failures to design, 30% to operation and maintenance, 20% to
manufacturing, and 7% to miscellaneous factors [13].
Well-publicised system failures such as Space Shuttle Challenger Disaster, Chernobyl
Nuclear Reactor Explosion, and Point Pleasant Bridge Disaster may have also contrib-
uted to more serious consideration of reliability in product design [14–16].
Usability engineering is an effective approach to product design and development
and is specifically based on customer feedback and data. For example, over 30% of all
software development projects are cancelled prior to completion primarily because of
inadequate user design-related inputs, resulting in a loss of over $100 billion annually
to the United States economy. Moreover, some studies clearly indicate that around
80% of product maintenance is due to unmet or unforeseen user requirements.
All in all, it may be added that the key challenge in designing new products using
modern technologies is how best to take advantage of all potential users’ skills in
creating the most effective work environment; this may simply be referred to as the
usability engineering challenge.
Nowadays, a vast sum of money is spent annually worldwide to design and develop
good quality products. Global competition and other factors are forcing manufactur-
ers to design and produce good quality products. Needless to say, quality principles
are being applied across many diverse sectors of the economy; each of these sectors
has tailored quality principles, methods, and procedures to satisfy its product design-
related needs. Some examples of these sectors are robotics, electric power genera-
tion, software, and the Internet.
As a result, there is a definite need for quality professionals working in diverse
areas such as these to know about each other’s work activities because this may help
them to perform their tasks more effectively. In turn, this will result in better quality
of end products.
• Reliability. This is the probability that an item will perform its stated mis-
sion satisfactorily for the specified time period when used under the stated
conditions.
1.4.3 Books
• Shooman, M.L., Probabilistic Reliability: An Engineering Approach,
McGraw-Hill Book Company, New York, 1968.
• Dhillon, B.S., Design Reliability: Fundamentals and Applications, CRC
Press, Boca Raton, Florida, 1999.
• Evans, J.W., Evans, J.Y., Product Integrity and Reliability in Design,
Springer-Verlag, New York, 2001.
• Dhillon, B.S., Computer System Reliability: Safety and Usability, CRC
Press, Boca Raton, Florida, 2013.
1.4.4 Standards
• MIL-STD-721, Definitions of Terms for Reliability and Maintainability,
U.S. Department of Defense, Washington, DC.
• MIL-HDBK-217, Reliability Prediction of Electronic Equipment, U.S.
Department of Defense, Washington, DC.
• MIL-STD-1629, Procedures for Performing Failure Mode, Effects and
Criticality Analysis, U.S. Department of Defense, Washington, DC.
• MIL-STD-785, Reliability Program for Systems and Equipment, Development
and Production, U.S. Department of Defense, Washington, DC.
• MIL-HDBK-338, Electronics Reliability Design Handbook, U.S. Department
of Defense, Washington, DC.
• ISO 9241-11 (1998), Ergonomic Requirements for Office Work with
Visual Display Terminals (VDTs): Guidance on Usability, International
Organization for Standardization (ISO), Geneva, Switzerland.
• ISO 9241-13 (1998), Ergonomics Requirements for Office Work with Visual
Display Terminals (VDTs): User Guidance, International Organization for
Standardization (ISO), Geneva, Switzerland.
• ETSI ETR 095, Human Factors: Guide for Usability Evaluations of
Telecommunications Systems and Services, European Telecommunications
Standards Institute (ETSI), Sophia Antipolis, France.
• ETSI ETR 198, User Trials User Control Procedures in ISDN Video Telephony,
European Telecommunications Standards Institute (ETSI), Sophia Antipolis,
France.
• MIL-STD-1472D, Human Engineering Design Criteria for Military Systems,
Equipment and Facilities, Department of Defense, Washington, DC.
1.4.5 Data Sources
• Reliability Analysis Center, Rome Air Development Center (RADC),
Griffiss Air Force Base, Rome, NY.
• Government Industry Data Exchange Program (GIDEP), GIDEP Operations
Center, U.S. Department of Navy, Corona, CA.
• American National Standards Institute (ANSI), New York.
• National Technical Information Service (NTIS), United States Department
of Commerce, Springfield, VA.
• Defense Technical Information Center, DTIC-FDAC, Fort Belvoir, VA.
1.6 PROBLEMS
1. Discuss the need for reliability, usability, and quality in product design.
2. Write an essay on the history of reliability, usability, and quality.
3. Define the following three terms:
i. Reliability
ii. Quality
iii. Usability
4. List eight of the most important journals or magazines for obtaining infor-
mation on reliability, usability, or quality.
5. List at least four books considered quite useful to obtain information on
usability.
6. Define the following four terms:
i. Downtime
ii. User-centred design
iii. Quality assurance
iv. Hazard rate
7. List at least four standards that are directly or indirectly concerned with
usability.
8. List the four most useful standards concerned with reliability.
9. List at least four standards concerned with quality.
10. List at least four data information sources.
REFERENCES
1. Lyman, W.J., Fundamental Consideration in Preparing a Master System Plan, Electrical
World, Vol. 101, 1933, pp. 778–792.
2. Smith, S.A., Service Reliability Measured by Probabilities of Outage, Electrical World,
Vol. 103, 1934, pp. 371–374.
3. Coppola, A., Reliability Engineering of Electronic Equipment: A Historical Perspective,
IEEE Transactions on Reliability, Vol. 33, 1984, pp. 29–35.
4. Dhillon, B.S., Design Reliability: Fundamentals and Applications, CRC Press, Boca
Raton, Florida, 1999.
5. AMCP 706-133, Engineering Design Handbook: Maintainability Engineering Theory
and Practice, Department of Defense, Washington, DC, 1976.
6. Shackel, B., Richardson, S., Human Factors for Informatics Usability: Background
and Overview, in Human Factors for Informatics Usability, edited by Shackel, B.,
Richardson, S., Cambridge University Press, Cambridge, UK, 1991, pp. 1–19.
7. Dhillon, B.S., Engineering Usability: Fundamentals, Applications, Human Factors, and
Human Error, American Scientific Publishers, Stevenson Ranch, California, 2004.
8. Butler, K.A., Usability Engineering Turns Ten, Interactions, 1996, pp. 59–75.
9. Rosson, M.B., Carroll, J.M., Usability Engineering: Scenario-Based Development of
Human-Computer Interaction, Academic Press, San Francisco, California, 2002.
10. Radford, G.S., Quality Control (Control of Quality), Industrial Management, Vol. 54,
1917, p. 100.
11. Golomski, W.A., Quality Control: History in the Making, Quality Progress, Vol. 9, No. 7,
July 1976, pp. 16–18.
12. Krismann, C., Quality Control: An Annotated Bibliography, The Kraus Organization
Limited, White Plains, New York, 1990.
8 Applied Reliability, Usability, and Quality for Engineers
13. Niebel, B.W., Engineering Maintenance Management, Marcel Dekker, New York,
1994.
14. Dhillon, B.S., Engineering Design: A Modern Approach, Richard D. Irwin, Chicago,
Illinois, 1996.
15. Elsayed, E.A., Reliability Engineering, Addison Wesley Longman, Reading, MA,
1996.
16. Dhillon, B.S., Advanced Design Concepts for Engineers, Technomic Publishing
Company, Lancaster, PA, 1998.
17. Omdahl, T.P., ed., Reliability, Availability, Maintainability (RAM) Dictionary, ASQC
Quality Press, Milwaukee, Wisconsin, 1988.
18. ANSI/ASQC A3-1978, Quality Systems Terminology, American Society for Quality
Control, Milwaukee, Wisconsin, 1978.
19. Naresky, J.J., Reliability Definitions, IEEE Transactions on Reliability, Vol. 19, 1970,
pp. 198–200.
20. Glossary of Terms Used in Usability Engineering, Available online at http://www.ucc.
ie/hfrg/baseline/glossary.html.
21. User-Centered Design Process for Interactive Systems, ISO 13407 1999, International
Organization for Standardization (ISO), Geneva, Switzerland, 1999.
2 Basic Mathematical Concepts
2.1 INTRODUCTION
Just like in the development of other areas of science and engineering, mathematics has
also played an important role in the development of reliability, usability, and quality
fields. Although the origin of the word “mathematics” may be traced back to the ancient
Greek word “mathema”, which means “science, knowledge, or learning”, the history of
current number symbols, sometimes referred to as the "Hindu-Arabic numeral system",
goes back to around 250 BCE, to the stone columns erected in India by Emperor
Asoka [1]. The evidence of the use of these number symbols consists of notches
found on the stone columns.
The history of probability goes back to the gambler’s manual written by Girolamo
Cardano (1501–1576), in which he considered a number of interesting issues on prob-
ability [1, 2]. However, Blaise Pascal (1623–1662) and Pierre Fermat (1601–1665)
were the first two individuals who independently and correctly solved the problem of
dividing the winnings in a game of chance. Pierre Fermat also introduced the idea
of “differentiation”.
Laplace transforms, frequently used for finding solutions to a set of differential
equations, were developed by Pierre-Simon Laplace (1749–1827). Additional informa-
tion on the history of mathematics, including probability, is available in Refs. [1, 2].
This chapter presents various mathematical concepts considered useful to understand
subsequent chapters of this book.
2.2.1 Arithmetic Mean
Often, the arithmetic mean is simply referred to as mean and is defined by
$$m = \frac{\sum_{i=1}^{k} x_i}{k} \qquad (2.1)$$
where
m is the mean value (i.e., arithmetic mean).
xi is the data value i, for i = 1, 2, …, k.
k is the number of data values.
Example 2.1

Assume that the quality control department of an engineering systems manufacturing company inspected six identical systems and found 4, 6, 8, 10, 12, and 14 defects, respectively. Calculate the average number of defects per system (i.e., the arithmetic mean of the data set).

By substituting the given data values into Equation (2.1), we obtain

$$m = \frac{4 + 6 + 8 + 10 + 12 + 14}{6} = 9$$

Thus, the average number of defects per system is 9. In other words, the arithmetic
mean of the data set is 9.
2.2.2 Mean Deviation

This is a measure of dispersion of data about the mean and is defined by

$$MD = \frac{\sum_{i=1}^{k} |DV_i - m|}{k} \qquad (2.2)$$
where
MD is the mean deviation.
DVi is the data value i, for i = 1, 2, 3, …, k.
k is the number of data values.
m is the mean value of the given data set.
|DVi − m| is the absolute value of the deviation of DVi from m.
Example 2.2
Calculate the mean deviation of the data set provided in Example 2.1
By using the data set from Example 2.1 and the calculated mean value (i.e., m = 9
defects per system) in Equation (2.2), we obtain
$$MD = \frac{|4 - 9| + |6 - 9| + |8 - 9| + |10 - 9| + |12 - 9| + |14 - 9|}{6} = \frac{5 + 3 + 1 + 1 + 3 + 5}{6} = 3$$
Thus, the mean deviation of the Example 2.1 data set is 3.
2.2.3 Standard Deviation
Standard deviation is a quite widely used measure of dispersion of data in a given
data set about the mean and is defined by
$$\sigma = \left[\frac{\sum_{i=1}^{k} (DV_i - m)^2}{k}\right]^{1/2} \qquad (2.3)$$
where
σ is the standard deviation.
DVi is the data value i, for i = 1, 2, 3, …, k.
m is the mean value.
k is the number of data values.
The following three properties of the standard deviation are associated with the widely
used normal distribution:

• About 68.27% of the data values lie within one standard deviation of the mean (i.e., within m ± σ).
• About 95.45% of the data values lie within two standard deviations of the mean (i.e., within m ± 2σ).
• About 99.73% of the data values lie within three standard deviations of the mean (i.e., within m ± 3σ).
Example 2.3
Calculate the standard deviation of the data set given in Example 2.1.
Using the Example 2.1 data set and the calculated mean value (m = 9) in
Equation (2.3), we obtain
$$\sigma = \left[\frac{(4-9)^2 + (6-9)^2 + (8-9)^2 + (10-9)^2 + (12-9)^2 + (14-9)^2}{6}\right]^{1/2} = \left[\frac{25 + 9 + 1 + 1 + 9 + 25}{6}\right]^{1/2} = 3.41$$
Thus, the standard deviation of the Example 2.1 data set is 3.41.
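The three measures above are straightforward to compute programmatically. The short Python sketch below (an illustration, not part of the original text) reproduces the Example 2.1–2.3 results using Equations (2.1)–(2.3); the variable names are arbitrary.

```python
# Illustrative sketch: mean, mean deviation, and standard deviation of the
# Example 2.1 data set using Equations (2.1)-(2.3).
defects = [4, 6, 8, 10, 12, 14]   # defect counts from Example 2.1

k = len(defects)
m = sum(defects) / k                                       # Equation (2.1)
md = sum(abs(x - m) for x in defects) / k                  # Equation (2.2)
sigma = (sum((x - m) ** 2 for x in defects) / k) ** 0.5    # Equation (2.3)

print(m, md, sigma)   # 9.0, 3.0, about 3.416 (quoted as 3.41 in Example 2.3)
```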
• Idempotent law:

A + A = A (2.4)

A.A = A (2.5)

where
A is an arbitrary set or event.
Dot (.) denotes the intersection of sets. It is to be noted that Equation (2.5) is sometimes written without the dot (e.g., AA), but it still conveys the same meaning.
+ denotes the union of sets.
• Commutative law:
A + B = B + A (2.6)

A.B = B.A (2.7)
where
B is an arbitrary set or event.
• Distributive law:
(A + B)(A + C) = A + BC (2.8)

A(B + C) = AB + AC (2.9)
where
C is an arbitrary set or event.
• Associative law:
(AB)C = A(BC) (2.10)

(A + B) + C = A + (B + C) (2.11)
• Absorption law:
A(A + B) = A (2.12)

A + (AB) = A (2.13)
The probability of occurrence of an event, say C, is defined by

$$P(C) = \lim_{n \to \infty} \left(\frac{N}{n}\right) \qquad (2.14)$$

where
P(C) is the probability of occurrence of event C.
N is the number of times event C occurs in the n repeated experiments.
Other basic properties of probability are as follows:

$$0 \le P(A) \le 1 \qquad (2.15)$$

$$P(S) = 1 \qquad (2.16)$$

$$P(\bar{S}) = 0 \qquad (2.17)$$

where
S is the sample space and $\bar{S}$ is the negation of the sample space S.

$$P(A) + P(\bar{A}) = 1 \qquad (2.18)$$

where
P(A) is the probability of occurrence of event A.
$P(\bar{A})$ is the probability of nonoccurrence of event A.

For n independent events $A_1, A_2, \ldots, A_n$, the probability of their union is

$$P(A_1 + A_2 + \cdots + A_n) = 1 - \prod_{i=1}^{n} \left[1 - P(A_i)\right] \qquad (2.19)$$
where
P( Ai ) is the probability of occurrence of event Ai, for i = 1, 2, 3,…, n.
For n mutually exclusive events $A_1, A_2, \ldots, A_n$, Equation (2.19) reduces to

$$P(A_1 + A_2 + \cdots + A_n) = \sum_{i=1}^{n} P(A_i) \qquad (2.20)$$
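As a quick numerical illustration of Equations (2.19) and (2.20), the Python sketch below evaluates both union formulas for three made-up event probabilities (the values 0.05, 0.10, and 0.20 are assumptions, not from the text).

```python
# Illustrative check of Equations (2.19) and (2.20) with assumed probabilities.
probs = [0.05, 0.10, 0.20]

# Equation (2.19): union of independent events
p_independent = 1.0
for p in probs:
    p_independent *= (1.0 - p)
p_independent = 1.0 - p_independent

# Equation (2.20): union of mutually exclusive events
p_exclusive = sum(probs)

print(round(p_independent, 4), round(p_exclusive, 2))   # 0.316, 0.35
```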
For a continuous random variable, the cumulative distribution function is defined by

$$F(t) = \int_{-\infty}^{t} f(x)\,dx \qquad (2.22)$$
where
x is a continuous random variable.
t is time.
f(x) is the probability density function.
F(t) is the cumulative distribution function.
For t = ∞, Equation (2.22) yields

$$F(\infty) = \int_{-\infty}^{\infty} f(x)\,dx = 1 \qquad (2.23)$$
It means that the total area under the probability density curve is equal to unity.
2.5.3 Expected Value
The expected value of a continuous random variable is defined by
$$E(t) = \int_{-\infty}^{\infty} t\, f(t)\,dt \qquad (2.25)$$
where
E(t) is the expected value (i.e., mean value) of the continuous random variable t.
2.5.4 Laplace Transform
The Laplace transform of the function, f(t), is defined by
$$f(s) = \int_{0}^{\infty} f(t)\, e^{-st}\,dt \qquad (2.26)$$
where
s is the Laplace transform variable.
t is time variable.
f(s) is the Laplace transform of function f(t).
Example 2.4

Obtain the Laplace transform of the following function:

$$f(t) = e^{-\lambda t} \qquad (2.27)$$

where
λ is a constant.

By substituting Equation (2.27) into Equation (2.26), we get

$$f(s) = \int_{0}^{\infty} e^{-\lambda t} e^{-st}\,dt = \int_{0}^{\infty} e^{-(s+\lambda)t}\,dt = \frac{1}{s + \lambda} \qquad (2.28)$$
Laplace transforms of some frequently occurring functions used in the area of applied
reliability, usability, and quality are presented in Table 2.1 [8, 9].
TABLE 2.1
Laplace Transforms of Some Frequently Occurring Functions
in Applied Reliability, Usability, and Quality Work

f(t)                                       f(s)
$e^{-\lambda t}$                           $1/(s + \lambda)$
k, a constant                              $k/s$
$t^n$, n = 0, 1, 2, 3, ...                 $n!/s^{n+1}$
$t f(t)$                                   $-df(s)/ds$
$df(t)/dt$                                 $s f(s) - f(0)$
$\theta_1 f_1(t) + \theta_2 f_2(t)$        $\theta_1 f_1(s) + \theta_2 f_2(s)$
$t e^{-\lambda t}$                         $1/(s + \lambda)^2$
$t$                                        $1/s^2$
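The entries of Table 2.1 can be checked symbolically. The following Python sketch (an illustration using the sympy library; the symbol names are assumptions) verifies three of the table entries.

```python
# Symbolic check of a few Table 2.1 entries with sympy.
import sympy as sp

t, s = sp.symbols('t s', positive=True)
lam = sp.symbols('lambda', positive=True)

print(sp.laplace_transform(sp.exp(-lam * t), t, s, noconds=True))      # 1/(lambda + s)
print(sp.laplace_transform(t * sp.exp(-lam * t), t, s, noconds=True))  # (lambda + s)**(-2)
print(sp.laplace_transform(t, t, s, noconds=True))                     # s**(-2)
```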
Example 2.5
Prove, by using the following function, that the left-hand side of Equation (2.29), i.e., the final-value theorem

$$\lim_{t \to \infty} f(t) = \lim_{s \to 0} s f(s) \qquad (2.29)$$

is equal to its right-hand side:

$$f(t) = \frac{\mu}{(\lambda + \mu)} + \frac{\lambda}{(\lambda + \mu)}\, e^{-(\lambda + \mu)t} \qquad (2.30)$$
where
λ and µ are constants.
By substituting Equation (2.30) into the left-hand side of Equation (2.29), we obtain

$$\lim_{t \to \infty} \left[\frac{\mu}{(\lambda + \mu)} + \frac{\lambda}{(\lambda + \mu)}\, e^{-(\lambda + \mu)t}\right] = \frac{\mu}{(\lambda + \mu)} \qquad (2.31)$$

Using Table 2.1, the Laplace transform of Equation (2.30) is

$$f(s) = \frac{\mu}{s(\lambda + \mu)} + \frac{\lambda}{(\lambda + \mu)} \cdot \frac{1}{(s + \lambda + \mu)} \qquad (2.32)$$

By substituting Equation (2.32) into the right-hand side of Equation (2.29), we obtain

$$\lim_{s \to 0} \left[\frac{s\mu}{s(\lambda + \mu)} + \frac{s\lambda}{(\lambda + \mu)} \cdot \frac{1}{(s + \lambda + \mu)}\right] = \frac{\mu}{(\lambda + \mu)} \qquad (2.33)$$
The right-hand sides of Equations (2.31) and (2.33) are the same. Thus, it proves that
the left-hand side of Equation (2.29) is equal to its right-hand side.
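Example 2.5 can also be verified with a short symbolic computation. The Python sketch below (using sympy; the symbol names are assumptions) confirms that the two limits in Equations (2.31) and (2.33) agree.

```python
# Symbolic verification of the final-value argument of Example 2.5.
import sympy as sp

t, s = sp.symbols('t s', positive=True)
lam, mu = sp.symbols('lambda mu', positive=True)

f_t = mu / (lam + mu) + lam / (lam + mu) * sp.exp(-(lam + mu) * t)   # Eq. (2.30)
f_s = sp.laplace_transform(f_t, t, s, noconds=True)                  # Eq. (2.32)

lhs = sp.limit(f_t, t, sp.oo)      # Eq. (2.31): mu/(lambda + mu)
rhs = sp.limit(s * f_s, s, 0)      # Eq. (2.33): mu/(lambda + mu)
print(sp.simplify(lhs - rhs))      # 0, i.e., both sides are equal
```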
2.6.1 Binomial Distribution

The probability density function of this discrete random variable distribution is defined by

$$f(x) = \binom{n}{x} p^{x} q^{n-x}, \quad \text{for } x = 0, 1, 2, \ldots, n \qquad (2.34)$$

where

$$\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$$

p is the single-trial probability of the outcome being counted (e.g., a failure), and q = 1 − p is the probability of the complementary outcome.
n is the number of trials.

The cumulative distribution function is given by

$$F(x) = \sum_{i=0}^{x} \binom{n}{i} p^{i} q^{n-i} \qquad (2.35)$$

where
F(x) is the cumulative distribution function or the probability of x or fewer nonoc-
currences (e.g., failures) in n trials.
2.6.2 Exponential Distribution
This is a continuous random variable distribution that is widely used in the industrial
sector, particularly in conducting reliability studies [11]. The probability density function of the distribution is defined by

$$f(t) = \alpha\, e^{-\alpha t}, \quad t \ge 0, \; \alpha > 0 \qquad (2.36)$$

where
f(t) is the probability density function.
t is time.
α is the distribution parameter.
By inserting Equation (2.36) into Equation (2.22), we obtain the following equation
for the cumulative distribution function:
$$F(t) = 1 - e^{-\alpha t} \qquad (2.37)$$
Using Equations (2.36) and (2.25), we get the following equation for the distribution
mean value:
$$E(t) = m = \frac{1}{\alpha} \qquad (2.38)$$
where
m is the mean value.
2.6.3 Rayleigh Distribution
This continuous random variable distribution is named after John Rayleigh (1842–1919),
its founder [1]. The probability density function of the distribution is defined by
$$f(t) = \frac{2}{\alpha^{2}}\, t\, e^{-\left(t/\alpha\right)^{2}}, \quad t \ge 0, \; \alpha > 0 \qquad (2.39)$$
where
α is the distribution parameter.
Substituting Equation (2.39) into Equation (2.22), we obtain the following cumula-
tive distribution function:
$$F(t) = 1 - e^{-\left(t/\alpha\right)^{2}} \qquad (2.40)$$
Using Equations (2.39) and (2.25), we obtain the following expression for the distri-
bution mean value:
$$E(t) = m = \alpha\, \Gamma\!\left(\frac{3}{2}\right) \qquad (2.41)$$
where
Γ(.) is the gamma function, which is defined by

$$\Gamma(n) = \int_{0}^{\infty} t^{n-1} e^{-t}\,dt, \quad \text{for } n > 0 \qquad (2.42)$$
2.6.4 Weibull Distribution

The probability density function of this continuous random variable distribution is defined by

$$f(t) = \frac{c}{\theta^{c}}\, t^{c-1}\, e^{-\left(t/\theta\right)^{c}}, \quad t \ge 0, \; \theta > 0, \; c > 0 \qquad (2.43)$$

where
c and θ are the distribution shape and scale parameters, respectively.

By inserting Equation (2.43) into Equation (2.22), we obtain the following equation for the cumulative distribution function:

$$F(t) = 1 - e^{-\left(t/\theta\right)^{c}} \qquad (2.44)$$
It is to be noted that exponential and Rayleigh distributions are the special cases
of this distribution for c = 1 and c = 2, respectively.
Using Equations (2.43) and (2.25), we get the following equation for the distribu-
tion mean value:
$$E(t) = m = \theta\, \Gamma\!\left(1 + \frac{1}{c}\right) \qquad (2.45)$$
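The mean-value formulas (2.38), (2.41), and (2.45) can be checked numerically by evaluating Equation (2.25) for each density. The Python sketch below (an illustration using scipy; the parameter values are assumptions) performs this check.

```python
# Numerical check of Equations (2.38), (2.41), and (2.45) for assumed parameters.
from math import exp, gamma
from scipy.integrate import quad

alpha, theta, c = 0.5, 2.0, 1.5   # assumed distribution parameters

# E(t) = integral of t*f(t), Equation (2.25), for each probability density
exp_mean = quad(lambda t: t * alpha * exp(-alpha * t), 0, 200)[0]
ray_mean = quad(lambda t: t * (2 / alpha**2) * t * exp(-(t / alpha)**2), 0, 50)[0]
wbl_mean = quad(lambda t: t * (c / theta**c) * t**(c - 1) * exp(-(t / theta)**c), 0, 200)[0]

print(round(exp_mean, 4), 1 / alpha)                            # Eq. (2.38): both ~ 2.0
print(round(ray_mean, 4), round(alpha * gamma(1.5), 4))         # Eq. (2.41): both ~ 0.4431
print(round(wbl_mean, 4), round(theta * gamma(1 + 1 / c), 4))   # Eq. (2.45): both ~ 1.8055
```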
2.6.5 Normal Distribution
This continuous random variable distribution is widely used, and sometimes it is called
the Gaussian distribution after Carl Friedrich Gauss (1777–1855), a German mathematician. The probability density function of the distribution is defined by

$$f(t) = \frac{1}{\sigma \sqrt{2\pi}} \exp\!\left[-\frac{(t - \mu)^{2}}{2\sigma^{2}}\right], \quad -\infty < t < +\infty \qquad (2.46)$$
where
µ and σ are the distribution parameters (i.e., mean and standard deviation,
respectively).
Using Equations (2.22) and (2.46), we get the following cumulative distribution
function:
$$F(t) = \frac{1}{\sigma \sqrt{2\pi}} \int_{-\infty}^{t} \exp\!\left[-\frac{(x - \mu)^{2}}{2\sigma^{2}}\right] dx \qquad (2.47)$$
Inserting Equation (2.46) into Equation (2.25) yields the following equation for the
distribution mean value:
$$E(t) = m = \frac{1}{\sigma \sqrt{2\pi}} \int_{-\infty}^{\infty} t \exp\!\left[-\frac{(t - \mu)^{2}}{2\sigma^{2}}\right] dt \qquad (2.48)$$
2.6.6 Bathtub Hazard Rate Curve Distribution

The probability density function of this continuous random variable distribution is defined by

$$f(t) = c\theta(\theta t)^{c-1}\, e^{-\left[e^{(\theta t)^{c}} - (\theta t)^{c} - 1\right]}, \quad \text{for } t \ge 0, \; \theta > 0, \; c > 0 \qquad (2.49)$$

where
c and θ are the distribution shape and scale parameters, respectively.
By inserting Equation (2.49) into Equation (2.22), we get the following equation
for the cumulative distribution function:

$$F(t) = 1 - e^{-\left[e^{(\theta t)^{c}} - 1\right]} \qquad (2.50)$$
It is to be noted that for c = 0.5, this probability distribution gives the bathtub-shaped
hazard rate curve, and for c = 1, it gives the extreme value probability distribution.
In other words, the extreme value probability distribution is the special case of this
probability distribution at c = 1.
Example 2.6
Assume that an engineering system can be in any of the three states: operating
normally, failed due to a hardware failure, or failed due to a usability error. The fol-
lowing three first-order linear differential equations describe the engineering system
under consideration:
$$\frac{dP_0(t)}{dt} + (\lambda + \lambda_u) P_0(t) = 0 \qquad (2.51)$$

$$\frac{dP_1(t)}{dt} - \lambda P_0(t) = 0 \qquad (2.52)$$

$$\frac{dP_2(t)}{dt} - \lambda_u P_0(t) = 0 \qquad (2.53)$$
where
λ is the engineering system constant hardware failure rate.
λu is the engineering system constant usability error rate.
Pi (t ) is the probability that the engineering system is in state i at time t, for i = 0
(operating normally), i = 1 (failed due to a hardware failure), and i = 2
(failed due to a usability error).
At time t = 0, P0 (0) = 1, P1(0) = 0, and P2 (0) = 0.
Solve differential Equations (2.51), (2.52), and (2.53) by using Laplace transforms.
Using Table 2.1, the stated initial conditions, and Equations (2.51)–(2.53), we
obtain
$$P_0(s) = \frac{1}{(s + \lambda + \lambda_u)} \qquad (2.57)$$

$$P_1(s) = \frac{\lambda}{s(s + \lambda + \lambda_u)} \qquad (2.58)$$

$$P_2(s) = \frac{\lambda_u}{s(s + \lambda + \lambda_u)} \qquad (2.59)$$
By taking the inverse Laplace transforms of Equations (2.57)–(2.59), we get

$$P_0(t) = e^{-(\lambda + \lambda_u)t} \qquad (2.60)$$

$$P_1(t) = \frac{\lambda}{(\lambda + \lambda_u)}\left[1 - e^{-(\lambda + \lambda_u)t}\right] \qquad (2.61)$$

$$P_2(t) = \frac{\lambda_u}{(\lambda + \lambda_u)}\left[1 - e^{-(\lambda + \lambda_u)t}\right] \qquad (2.62)$$
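The inverse transforms above can be confirmed symbolically. The Python sketch below (using sympy; the symbol names are assumptions) inverts Equations (2.57)–(2.59); the Heaviside(t) factor appearing in sympy's output equals 1 for t > 0.

```python
# Symbolic inversion of Equations (2.57)-(2.59) to confirm (2.60)-(2.62).
import sympy as sp

t, s = sp.symbols('t s', positive=True)
lam, lam_u = sp.symbols('lambda lambda_u', positive=True)

P0_s = 1 / (s + lam + lam_u)              # Eq. (2.57)
P1_s = lam / (s * (s + lam + lam_u))      # Eq. (2.58)
P2_s = lam_u / (s * (s + lam + lam_u))    # Eq. (2.59)

for name, F in [("P0(t)", P0_s), ("P1(t)", P1_s), ("P2(t)", P2_s)]:
    print(name, sp.simplify(sp.inverse_laplace_transform(F, s, t)))
```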
2.8 PROBLEMS
1. Assume that the quality control department of an engineering systems man-
ufacturing company inspected eight identical systems and discovered 5, 4,
8, 11, 2, 9, 10, and 3 defects in each system. Calculate the average number
of defects per system.
2. Calculate the mean deviation of the data set given in question 1.
3. Calculate the standard deviation of the data set given in question 1.
4. What is idempotent law?
5. Define probability and expected value of a continuous random variable.
6. Define the following two items:
• Cumulative distribution function
• Laplace transform
7. Write down the probability density functions of the following two
distributions:
• Rayleigh distribution
• Exponential distribution
8. Write down probability density and cumulative distribution functions for
normal distribution.
9. What are the special case distributions of the bathtub hazard rate curve and
Weibull distributions?
10. Prove Equations (2.60)–(2.62) by using Equations (2.57)–(2.59).
REFERENCES
1. Eves, H., An Introduction to the History of Mathematics, Rinehart and Winston,
New York, 1976.
2. Owen, D.B., ed., On the History of Statistics and Probability, Marcel Dekker, New
York, 1976.
3. Fault Tree Handbook, Report No. NUREG-0492, U.S. Nuclear Regulatory Commission,
Washington, D.C., 1981.
4. Lipschutz, S., Set Theory, McGraw-Hill, New York, 1964.
5. Mann, N.R., Schafer, R.E., Singpurwalla, N.D., Methods for Statistical Analysis of
Reliability and Life Data, John Wiley and Sons, New York, 1974.
6. Lipschutz, S., Probability, McGraw-Hill, New York, 1965.
7. Shooman, M.L., Probabilistic Reliability: An Engineering Approach, McGraw-Hill,
New York, 1968.
8. Spiegel, M.R., Laplace Transforms, McGraw-Hill, New York, 1965.
9. Oberhettinger, F., Badic, L., Tables of Laplace Transforms, Springer-Verlag, New York,
1973.
10. Patel, J.K., Kapadia, C.H., Owen, D.B., Handbook of Statistical Distributions, Marcel Dekker,
New York, 1976.
11. Davis, D.J., An Analysis of Some Failure Data, The Journal of the American Statistical
Association, 1952, pp. 113–150.
12. Weibull, W., A Statistical Distribution of Wide Applicability, The Journal of Applied
Mechanics, Vol. 18, 1951, pp. 293–297.
13. Dhillon, B.S., Life Distributions, IEEE Transactions on Reliability, Vol. 30, 1981,
pp. 457–460.
14. Baker, R.D., Non-parametric Estimation of the Renewal Function, Computers & Operations
Research, Vol. 20, No. 2, 1993, pp. 167–178.
15. Cabana, A., Cabana, E.M., Goodness-of-fit to the Exponential Distribution, Focused
on Weibull Alternatives, Communications in Statistics-Simulation and Computation,
Vol. 34, 2005, pp. 711–723.
16. Grane, A., Fortiana, J., A Directional Test of Exponentiality Based on Maximum
Correlations, Metrika, Vol. 73, 2011, pp. 711–723.
17. Henze, N., Meintnis, S.G., Recent and Classical Tests for Exponentiality: A Partial
Review with Comparisons, Metrika, Vol. 61, 2005, pp. 29–45.
18. Jammalamadaka, S.R., Taufer, E., Testing Exponentiality by Comparing the Empirical
Distribution Function of the Normalized Spacings with that of the Original Data,
Journal of Nonparametric Statistics, Vol. 15, No. 6, 2003, pp. 719–729.
19. Hollander, M., Laird, G., Song, K.S., Non-parametric Inference for the Proportionality
Function in the Random Censorship Model, Journal of Nonparametric Statistics,
Vol. 15, No. 2, 2003, pp. 151–169.
20. Jammalamadaka, S.R., Taufer, E., Use of Mean Residual Life in Testing Departures
from Exponentiality, Journal of Nonparametric Statistics, Vol. 18, No. 3, 2006,
pp. 277–292.
21. Kunitz, H., Pamme, H., The Mixed Gamma Ageing Model in Life Data Analysis,
Statistical Papers, Vol. 34, 1993, pp. 303–318.
22. Kunitz, H., A New Class of Bathtub-shaped Hazard Rates and its Application in
Comparison of Two Test-statistics, IEEE Transactions on Reliability, Vol. 38, No. 3,
1989, pp. 351–354.
Basic Mathematical Concepts 23
23. Meintanis, S.G., A Class of Tests for Exponentiality Based on a Continuum of Moment
Conditions, Kybernetika, Vol. 45, No. 6, 2009, pp. 946–959.
24. Morris, K., Szynal, D., Goodness-of-fit Tests Based on Characterizations Involving
Moments of Order Statistics, International Journal of Pure and Applied Mathematics,
Vol. 38, No. 1, 2007, pp. 83–121.
25. Na, M.H., Spline Hazard Rate Estimation Using Censored Data, Journal of KSIAM,
Vol. 3, No. 2, 1999, pp. 99–106.
26. Morris, K., Szynal, D., Some U-statistics in Goodness-of-fit Tests Derived from Characterizations
via Record Values, International Journal of Pure and Applied Mathematics,
Vol. 4, No. 4, 2008, pp. 339–414.
27. Nam, K.H., Park, D.H., Failure Rate for Dhillon Model, Proceedings of the Spring
Conference of the Korean Statistical Society, 1997, pp. 114–118.
28. Nimoto, N., Zitikis, R., The Atkinson Index, the Moran Statistic, and Testing
Exponentiality, Journal of the Japan Statistical Society, Vol. 38, No. 2, 2008, pp. 187–205.
29. Nam, K.H., Chang, S.J., Approximation of the Renewal Function for Hjorth Model and
Dhillon Model, Journal of the Korean Society for Quality Management, Vol. 34, No. 1,
2006, pp. 34–39.
30. Noughabi, H.A., Arghami, N.R., Testing Exponentiality Based on Characterizations
of the Exponential Distribution, Journal of Statistical Computation and Simulation,
Vol. 1, 2011, pp. 1–11.
31. Szynal, D., Goodness-of-fit Tests Derived from Characterizations of Continuous
Distributions, Stability in Probability, Banach Center Publications, Vol. 90, Institute of
Mathematics, Polish Academy of Sciences, Warszawa, Poland, 2010, pp. 203–223.
32. Szynal, D., Wolynski, W., Goodness-of-fit Tests for Exponentiality and Rayleigh
Distribution, International Journal of Pure and Applied Mathematics, Vol. 78, No. 5,
2013, pp. 751–772.
33. Nam, K.H., Park, D.H., A Study on Trend Changes for Certain Parametric Families,
Journal of the Korean Society for Quality Management, Vol. 23, No. 3, 1995, pp. 93–101.
3 Reliability Basics, Human Factors Basics for Usability, and Quality Basics
3.1 INTRODUCTION
Nowadays, the reliability of engineering systems has become a challenging issue during
the design process due to the increasing dependence of our daily lives and schedules on
these systems’ proper functioning. Some examples of these systems are automobiles,
computers, aircraft, nuclear power generating reactors, and space satellites.
The emergence of usability engineering is deeply embedded in the discipline of
human factors. The main reason for the existence of the discipline of human factors is
that humans keep making errors while using machines/systems. Otherwise, it would
be difficult to justify the discipline’s existence.
The importance of quality in business and industry is increasing rapidly. Today,
our day-to-day lives and schedules are more dependent than ever before on the sat-
isfactory functioning of products and services (e.g., automobiles, computers, and
a continuous supply of electricity). Needless to say, factors such as competition,
product sophistication, and growing demand from customers for better quality have
played a very important role in increasing the importance of quality.
This chapter presents the reliability basics, human factors basics for usability, and
quality basics considered useful to understand the subsequent chapters of this book.
Finally, during the wear-out period, the item/system hazard rate increases with time
t. Some of the reasons for the occurrence of failures during this period are wear due to
friction, corrosion, and creep; poor maintenance; wear due to aging; short designed-in
life of the item/system under consideration; and incorrect overhaul practices.
Mathematically, the following equation can be used to represent the bathtub
hazard rate curve shown in Fig. 3.1 [3]:
$$\lambda(t) = \gamma\theta(\gamma t)^{\theta - 1}\, e^{(\gamma t)^{\theta}} \qquad (3.1)$$
where
λ(t ) is hazard rate (time-dependent failure rate).
t is time.
γ is the scale parameter.
θ is the shape parameter.
At θ = 0.5, Equation (3.1) gives the bathtub-shaped hazard rate curve
shown in Fig. 3.1.
$$f(t) = -\frac{dR(t)}{dt} \qquad (3.2)$$
where
f(t) is the item/system failure (or probability) density function.
R(t) is the item/system reliability at time t.
Example 3.1

Assume that the reliability of a system is expressed by

$$R_s(t) = e^{-\lambda_s t} \qquad (3.3)$$

where
$R_s(t)$ is the system reliability at time t.
$\lambda_s$ is the system constant failure rate.
Obtain an expression for the failure (probability) density function of the system by
using Equation (3.2).
By substituting Equation (3.3) into Equation (3.2), we obtain
$$f(t) = -\frac{d\,e^{-\lambda_s t}}{dt} = \lambda_s\, e^{-\lambda_s t} \qquad (3.4)$$
Thus, Equation (3.4) is the expression for the failure (probability) density function
of the system.
The item/system hazard rate is defined by

$$\lambda(t) = \frac{f(t)}{R(t)} \qquad (3.5)$$

where
$\lambda(t)$ is the item/system hazard rate (i.e., time-dependent failure rate).

By inserting Equation (3.2) into Equation (3.5), we get

$$\lambda(t) = -\frac{1}{R(t)} \cdot \frac{dR(t)}{dt} \qquad (3.6)$$
Example 3.2
Obtain an expression for the system hazard rate by using Equations (3.3) and (3.6).
By inserting Equation (3.3) into Equation (3.6), we obtain
$$\lambda(t) = -\frac{1}{e^{-\lambda_s t}} \cdot \frac{d\,e^{-\lambda_s t}}{dt} = \lambda_s \qquad (3.7)$$
Thus, the system hazard rate is given by Equation (3.7). It is to be noted that the
right-hand side of this equation is not a function of time t. In other words, it is
constant. Generally, it is referred to as the constant failure rate of an item/system
because it does not depend on time t.
At time t = 0, R(t) = 1.
By evaluating the right side of Equation (3.9) and rearranging, we obtain
$$R(t) = e^{-\int_{0}^{t} \lambda(t)\,dt} \qquad (3.11)$$
Thus, Equation (3.11) is the general expression for the reliability function. It can be
used to obtain reliability function of an item/system when its times to failure follow any
time-continuous probability distribution (e.g., Weibull, Rayleigh, and Exponential).
Example 3.3
Assume that the hazard rate of an engineering system is expressed by Equation (3.1).
Obtain an expression for the reliability function of the engineering system by using
Equation (3.11).
By inserting Equation (3.1) into Equation (3.11), we get
$$R(t) = e^{-\int_{0}^{t} \gamma\theta(\gamma t)^{\theta-1} e^{(\gamma t)^{\theta}}\,dt} = e^{-\left[e^{(\gamma t)^{\theta}} - 1\right]} \qquad (3.12)$$
Thus, Equation (3.12) is the expression for the reliability function of the engineer-
ing system.
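The closed form of Equation (3.12) can be spot-checked by integrating the hazard rate of Equation (3.1) numerically and applying Equation (3.11). The Python sketch below does this for assumed parameter values (γ = 0.01, θ = 1.2, and a 50-hour mission, chosen only for illustration).

```python
# Numerical check of Equation (3.12) against Equations (3.1) and (3.11).
from math import exp
from scipy.integrate import quad

gamma_p, theta_p, t = 0.01, 1.2, 50.0   # assumed scale, shape, and mission time

def hazard(x):
    # Equation (3.1): bathtub hazard rate curve
    return gamma_p * theta_p * (gamma_p * x) ** (theta_p - 1) * exp((gamma_p * x) ** theta_p)

R_numeric = exp(-quad(hazard, 0, t)[0])               # Equation (3.11)
R_closed = exp(-(exp((gamma_p * t) ** theta_p) - 1))  # Equation (3.12)
print(round(R_numeric, 6), round(R_closed, 6))        # the two values agree
```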
The mean time to failure of an item/system can be obtained by using any one of the following three formulas:

$$MTTF = \int_{0}^{\infty} R(t)\,dt \qquad (3.13)$$

or

$$MTTF = \lim_{s \to 0} R(s) \qquad (3.14)$$

or

$$MTTF = E(t) = \int_{0}^{\infty} t\, f(t)\,dt \qquad (3.15)$$
where
MTTF is the mean time to failure.
s is the Laplace transform variable.
R(s) is the Laplace transform of the reliability function R(t).
E(t) is the expected value.
Example 3.4
Prove by using Equation (3.3) that Equations (3.13) and (3.14) yield the same result
for the system mean time to failure.
By inserting Equation (3.3) into Equation (3.13), we obtain
$$MTTF_s = \int_{0}^{\infty} e^{-\lambda_s t}\,dt = \frac{1}{\lambda_s} \qquad (3.16)$$
where
MTTFs is the system mean time to failure.
Similarly, the Laplace transform of Equation (3.3) is

$$R_s(s) = \int_{0}^{\infty} e^{-st}\, e^{-\lambda_s t}\,dt = \frac{1}{s + \lambda_s} \qquad (3.17)$$
Rs ( s ) is the Laplace transform of the system reliability function Rs (t ).
1
MTTFs = lim
s→0 (s + λ s )
1
=
λs (3.18)
Equations (3.16) and (3.18) are identical, which proves that Equations (3.13) and
(3.14) yield the same result for the system mean time to failure.
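A small symbolic computation makes the equivalence of Example 3.4 concrete. The Python sketch below (using sympy; the numeric failure rate of 1/2000 failures per hour is an assumed value) evaluates the mean time to failure both ways.

```python
# Example 3.4 checked both ways for an assumed failure rate.
import sympy as sp

t, s = sp.symbols('t s', positive=True)
lam_s = sp.Rational(1, 2000)     # assumed constant failure rate (failures per hour)

R_t = sp.exp(-lam_s * t)                                   # Eq. (3.3)
mttf_from_integral = sp.integrate(R_t, (t, 0, sp.oo))      # Eq. (3.13)
R_s = sp.laplace_transform(R_t, t, s, noconds=True)
mttf_from_limit = sp.limit(R_s, s, 0)                      # Eq. (3.14)

print(mttf_from_integral, mttf_from_limit)   # both print 2000
```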
3.4.1 Series Network
This is the simplest reliability network, and its block diagram is shown in Fig. 3.2.
The diagram represents an m-unit system, and each block in the diagram denotes a
unit. If any one of the m units malfunctions or fails, the series network/system fails.
In other words, for the successful operation of the series network/system, all the m
network/system units must operate normally.
The reliability of the series network/system shown in Fig. 3.2 is expressed by

$$R_{ss} = P(E_1 E_2 E_3 \ldots E_m) \qquad (3.19)$$

where
Rss is the series system reliability.
Ei is the successful operation (i.e., success event) of unit i; for i = 1, 2, 3, …, m.
P( E1E2 E3 … Em ) is the occurrence probability of events E1E2 E3 … Em .
For independently failing units, Equation (3.19) becomes

$$R_{ss} = P(E_1)\,P(E_2)\,P(E_3) \ldots P(E_m) \qquad (3.20)$$
where
P( Ei ) is the probability of occurrence of event Ei, for i = 1, 2, 3, …, m.
If we let $R_i = P(E_i)$, for i = 1, 2, 3, …, m, Equation (3.20) becomes

$$R_{ss} = R_1 R_2 R_3 \ldots R_m = \prod_{i=1}^{m} R_i \qquad (3.21)$$
where
Ri is the unit i reliability, for i = 1, 2, 3, …, m.
For constant failure rate $\lambda_i$ of unit i, using Equation (3.11), the unit reliability is

$$R_i(t) = e^{-\int_{0}^{t} \lambda_i\,dt} = e^{-\lambda_i t} \qquad (3.22)$$
where
Ri (t ) is the reliability of unit i at time t.
By substituting Equation (3.22) into Equation (3.21), we get

$$R_{ss}(t) = e^{-\sum_{i=1}^{m} \lambda_i t} \qquad (3.23)$$

where
$R_{ss}(t)$ is the series system reliability at time t.
By substituting Equation (3.23) into Equation (3.13), we get the following expression
for the mean time to failure of the series system/network:
$$MTTF_{ss} = \int_{0}^{\infty} e^{-\sum_{i=1}^{m} \lambda_i t}\,dt = \frac{1}{\sum_{i=1}^{m} \lambda_i} \qquad (3.24)$$
where
MTTFss is the series system/network mean time to failure.
By inserting Equation (3.23) into Equation (3.6), we get the following expression for
the series system/network hazard rate:
$$\lambda_{ss}(t) = -\frac{1}{e^{-\sum_{i=1}^{m} \lambda_i t}}\left(-\sum_{i=1}^{m} \lambda_i\, e^{-\sum_{i=1}^{m} \lambda_i t}\right) = \sum_{i=1}^{m} \lambda_i \qquad (3.25)$$
where
λ ss (t ) is the series system/network failure rate (hazard rate).
It is to be noted that the right side of Equation (3.25) is independent of time t. Thus,
the left side of Equation (3.25) is simply λ ss , the failure rate of the series system/
network. It means that whenever we add up failure rates of independent units/items, we automatically assume that these units/items are connected in series.
Example 3.5

Assume that an engineering system is composed of four independent and identical units in series, and the constant failure rate of each unit is 0.0004 failures per hour. Calculate the engineering system reliability for a 120-hour mission, mean time to failure, and failure rate.

By substituting the given data values into Equation (3.23), we get

$$R_{ss}(120) = e^{-(0.0004 + 0.0004 + 0.0004 + 0.0004)(120)} = 0.8253$$

Similarly, by substituting the given data values into Equations (3.24) and (3.25), we get

$$MTTF_{ss} = \frac{1}{(0.0004 + 0.0004 + 0.0004 + 0.0004)} = 625 \text{ hours}$$

and

$$\lambda_{ss} = (0.0004 + 0.0004 + 0.0004 + 0.0004) = 0.0016 \text{ failures per hour}$$

Thus, the engineering system reliability, mean time to failure, and failure rate are
0.8253, 625 hours, and 0.0016 failures per hour, respectively.
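The same result follows from a few lines of Python; the sketch below reproduces Example 3.5, with the 120-hour mission time inferred from the quoted reliability of 0.8253.

```python
# Example 3.5 reproduced numerically (series network, Eqs. 3.23-3.25).
from math import exp

unit_rate = 0.0004     # failures per hour, each of the four identical units
n_units = 4
t = 120.0              # mission time in hours (inferred from the quoted result)

system_rate = n_units * unit_rate            # Eq. (3.25): series failure rates add
reliability = exp(-system_rate * t)          # Eq. (3.23)
mttf = 1.0 / system_rate                     # Eq. (3.24)

print(round(reliability, 4), round(mttf, 2), round(system_rate, 4))   # 0.8253, 625.0, 0.0016
```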
3.4.2 Parallel Network

In this case, the network/system is composed of m simultaneously operating units, and at least one of these units must operate normally for the successful operation of the network/system. The failure probability of the parallel network/system is expressed by

$$F_{ps} = P(y_1 y_2 y_3 \ldots y_m) \qquad (3.26)$$
where
Fps is the failure probability of the parallel network/system.
yi is the failure (i.e., failure event) of unit i, for i = 1, 2, 3, …, m.
P( y1 y2 y3 … ym ) is the occurrence probability of events y1 , y2 , y3 ,…, and ym .
For independently failing units, Equation (3.26) becomes

$$F_{ps} = P(y_1)\,P(y_2)\,P(y_3) \ldots P(y_m) \qquad (3.27)$$
where
P( yi ) is the occurrence probability of failure event yi , for i = 1, 2, 3, …, m.
If we let $F_i = P(y_i)$, for i = 1, 2, 3, …, m, Equation (3.27) becomes

$$F_{ps} = \prod_{i=1}^{m} F_i \qquad (3.28)$$
where
Fi is the failure probability of unit i, for i = 1, 2, 3, …, m.
By subtracting Equation (3.28) from unity, we get the following expression for the parallel network/system reliability:

$$R_{ps} = 1 - \prod_{i=1}^{m} F_i \qquad (3.29)$$
where
Rps is the parallel network/system reliability.
For constant failure rate λ i of unit i, subtracting Equation (3.22) from unity and then
inserting it into Equation (3.29) yields
$$R_{ps}(t) = 1 - \prod_{i=1}^{m} \left(1 - e^{-\lambda_i t}\right) \qquad (3.30)$$
where
Rps (t ) is the parallel network/system reliability at time t.
For identical units, Equation (3.30) becomes

$$R_{ps}(t) = 1 - \left(1 - e^{-\lambda t}\right)^{m} \qquad (3.31)$$

where
λ is the unit constant failure rate.
By substituting Equation (3.31) into Equation (3.13), we get the following equation
for the parallel network/system mean time to failure:
$$MTTF_{ps} = \int_{0}^{\infty} \left[1 - \left(1 - e^{-\lambda t}\right)^{m}\right] dt = \frac{1}{\lambda} \sum_{i=1}^{m} \frac{1}{i} \qquad (3.32)$$
where
MTTFps is the identical units parallel network/system mean time to failure.
Example 3.6

Assume that an engineering system is composed of three independent, identical, and active units in parallel, and the constant failure rate of each unit is 0.0005 failures per hour. Calculate the engineering system reliability for a 100-hour mission and mean time to failure.

By substituting the given data values into Equation (3.31), we get

$$R_{ps}(100) = 1 - \left[1 - e^{-(0.0005)(100)}\right]^{3} = 0.9998$$

Similarly, by substituting the given data values into Equation (3.32), we get

$$MTTF_{ps} = \frac{1}{(0.0005)}\left[1 + \frac{1}{2} + \frac{1}{3}\right] = 3666.66 \text{ hours}$$

Thus, engineering system reliability and mean time to failure are 0.9998 and
3666.66 hours, respectively.
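As a sketch, Example 3.6 can be reproduced directly from Equations (3.31) and (3.32); the three-unit configuration is taken from the computation shown above.

```python
# Example 3.6 reproduced numerically (identical-unit parallel network).
from math import exp

unit_rate = 0.0005   # failures per hour
m = 3                # identical active units in parallel
t = 100.0            # mission time in hours

reliability = 1.0 - (1.0 - exp(-unit_rate * t)) ** m        # Eq. (3.31)
mttf = sum(1.0 / i for i in range(1, m + 1)) / unit_rate    # Eq. (3.32)

# reliability ~= 0.99988 (quoted as 0.9998), mttf ~= 3666.67 hours
print(round(reliability, 5), round(mttf, 2))
```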
3.4.3 k-out-of-n Network

By using the binomial distribution, for independent and identical units, we write
the following equation for the reliability of the k-out-of-n unit network shown in Fig. 3.4:

$$R_{k/n} = \sum_{j=k}^{n} \binom{n}{j} R^{j} (1 - R)^{n-j} \qquad (3.33)$$
where

$$\binom{n}{j} = \frac{n!}{(n-j)!\; j!} \qquad (3.34)$$
For constant failure rates of the identical units, by using Equations (3.11) and (3.33),
we get
$$R_{k/n}(t) = \sum_{j=k}^{n} \binom{n}{j} e^{-j\lambda t} \left(1 - e^{-\lambda t}\right)^{n-j} \qquad (3.35)$$
where
Rk / n (t ) is the k-out-of-n network/system reliability at time t.
λ is the unit constant failure rate.
By substituting Equation (3.35) into Equation (3.13), we get the following expression for the k-out-of-n network/system mean time to failure:

$$MTTF_{k/n} = \int_{0}^{\infty} \left[\sum_{j=k}^{n} \binom{n}{j} e^{-j\lambda t} \left(1 - e^{-\lambda t}\right)^{n-j}\right] dt = \frac{1}{\lambda} \sum_{j=k}^{n} \frac{1}{j} \qquad (3.36)$$
where
MTTFk /n is the k-out-of-n network/system mean time to failure.
Example 3.7
Assume that an engineering system has four active, identical, and independent
units in parallel. At least three units must function normally for the successful
operation of the engineering system. Calculate the engineering system mean time
to failure if the unit constant failure rate is 0.0006 failures per hour.
By inserting the given data values into Equation (3.36), we get
$$MTTF_{3/4} = \frac{1}{(0.0006)} \sum_{j=3}^{4} \frac{1}{j} = \frac{1}{(0.0006)}\left[\frac{1}{3} + \frac{1}{4}\right] = 972.22 \text{ hours}$$
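Example 3.7 and the general Equation (3.35) can be evaluated as in the sketch below; the 100-hour mission time used for the reliability calculation is an assumption added for illustration.

```python
# Example 3.7 (MTTF) plus a sample evaluation of Eq. (3.35) for a 3-out-of-4 network.
from math import comb, exp

lam, k, n = 0.0006, 3, 4     # unit failure rate (failures/hour), k, n

mttf = sum(1.0 / j for j in range(k, n + 1)) / lam      # Eq. (3.36): 972.22 hours

t = 100.0   # assumed mission time (not part of the example)
reliability = sum(comb(n, j) * exp(-j * lam * t) * (1 - exp(-lam * t)) ** (n - j)
                  for j in range(k, n + 1))             # Eq. (3.35): about 0.9812

print(round(mttf, 2), round(reliability, 4))
```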
3.4.4 Standby System
This is another reliability network/system in which only one unit functions and n
units are kept in their standby mode. The system is composed of (n + 1) units; as soon
as the functioning unit fails, the switching mechanism detects the failure and turns
on one of the standby units. The system fails when all of its units have failed.
The block diagram of a standby system with one functioning and n standby units
is shown in Fig. 3.5. Each block in the diagram represents a unit. By utilising Fig. 3.5,
for independent and identical units, perfect switching mechanism and standby units,
and time-dependent unit failure rate, we obtain the following equation for the standby
system reliability [1, 5]:
$$R_{ss}(t) = \sum_{j=0}^{n} \frac{\left[\int_{0}^{t} \lambda(t)\,dt\right]^{j} e^{-\int_{0}^{t} \lambda(t)\,dt}}{j!} \qquad (3.37)$$
where
Rss (t ) is the standby system reliability at time t.
λ(t ) is the unit time-dependent failure rate or hazard rate.
n is the number of standby units.
For constant unit failure rate (i.e., λ(t ) = λ), Equation (3.37) becomes
$$R_{ss}(t) = \sum_{j=0}^{n} \frac{(\lambda t)^{j} e^{-\lambda t}}{j!} \qquad (3.38)$$
where
λ is the unit constant failure rate.
FIGURE 3.5 Block diagram of a standby system with one functioning and n standby units.
By substituting Equation (3.38) into Equation (3.13), we get the following expression for the standby system mean time to failure:

$$MTTF_{ss} = \int_{0}^{\infty} \left[\sum_{j=0}^{n} \frac{(\lambda t)^{j} e^{-\lambda t}}{j!}\right] dt = \frac{(n + 1)}{\lambda} \qquad (3.39)$$
where
MTTFss is the standby system mean time to failure.
Example 3.8

Assume that a standby system is composed of one operating unit and three identical standby units, and the unit constant failure rate is 0.0004 failures per hour. Calculate the standby system mean time to failure.

By substituting the given data values into Equation (3.39), we get

$$MTTF_{ss} = \frac{(3 + 1)}{(0.0004)} = 10{,}000 \text{ hours}$$
Thus, the standby system mean time to failure is 10,000 hours.
3.4.5 Bridge Network

For five independent units/parts, the reliability of the bridge network shown in Fig. 3.6
is expressed by [6]
Rbn = 2 R1 R2 R3 R4 R5 + R1 R3 R5 + R2 R3 R4 + R2 R5 + R1 R4 − R1 R2 R3 R4
− R1 R2 R3 R5 − R2 R3 R4 R5 − R1 R2 R4 R5 − R3 R4 R5 R1 (3.40)
where
Rbn is the bridge network reliability.
R j is the unit j reliability, for j = 1, 2, 3, 4, 5.
For identical units, Equation (3.40) becomes

$$R_{bn} = 2R^{5} - 5R^{4} + 2R^{3} + 2R^{2} \qquad (3.41)$$
where
R is the unit reliability.
For constant failure rates of all five units, and using Equations (3.11) and (3.41), we obtain
$$R_{bn}(t) = 2e^{-5\lambda t} - 5e^{-4\lambda t} + 2e^{-3\lambda t} + 2e^{-2\lambda t} \qquad (3.42)$$
where
Rbn (t ) is the bridge network reliability at time t.
λ is the unit constant failure rate.
By inserting Equation (3.42) into Equation (3.13), we get the following expression for the bridge network mean time to failure:

$$MTTF_{bn} = \int_{0}^{\infty} \left(2e^{-5\lambda t} - 5e^{-4\lambda t} + 2e^{-3\lambda t} + 2e^{-2\lambda t}\right) dt = \frac{49}{60\lambda} \qquad (3.43)$$
where
MTTFbn is the bridge network mean time to failure.
Example 3.9
Assume that an engineering system with five identical and independent units forms
a bridge network. Calculate the bridge network's reliability for a 100-hour mission
and mean time to failure, if the constant failure rate of each unit is 0.0002 failures
per hour.
By inserting the given data values into Equation (3.42), we obtain

$$R_{bn}(100) = 2e^{-5(0.0002)(100)} - 5e^{-4(0.0002)(100)} + 2e^{-3(0.0002)(100)} + 2e^{-2(0.0002)(100)} = 0.9992$$

Similarly, by inserting the specified data values into Equation (3.43), we get

$$MTTF_{bn} = \frac{49}{60(0.0002)} = 4083.33 \text{ hours}$$
Thus, the bridge network’s reliability and mean time to failure are 0.9992 and
4083.33 hours, respectively.
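The bridge-network figures above follow directly from Equations (3.41) and (3.43), as in the sketch below.

```python
# Example 3.9 reproduced numerically (identical-unit bridge network).
from math import exp

lam, t = 0.0002, 100.0
R = exp(-lam * t)    # single-unit reliability

Rbn = 2 * R**5 - 5 * R**4 + 2 * R**3 + 2 * R**2   # Eq. (3.41)
mttf = 49.0 / (60.0 * lam)                        # Eq. (3.43)

print(round(Rbn, 4), round(mttf, 2))              # 0.9992, 4083.33
```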
TABLE 3.1
A Comparison of Humans’ and Machines’ Capabilities and Limitations
• Typical behaviour V: Generally, humans know very little about their physi-
cal shortcomings (First, learn effectively about all human limitations and
then develop the design accordingly.).
• Typical behaviour VI: Generally, humans use their hands first for testing
or exploring (First, pay special attention to the handling aspect during the
item/product design process. Otherwise, recommend strongly and clearly
that the item/product use requires a device supplied for eliminating the
need to use the hands.).
• Typical behaviour VII: Humans have a tendency to hurry (Design items/
products in such a way that effectively takes into consideration the element
of hurry by humans.).
• Typical behaviour VIII: Humans get easily confused with unfamiliar items/
products (Avoid designing items/products that are totally unfamiliar to all
potential users.).
• Typical behaviour IX: During loss of balance, humans instinctively reach
for and grab the closest item/object (Develop the design of the item/product
in such a manner that it appropriately incorporates satisfactory emergency
supports.).
• Typical behaviour X: Humans have become very accustomed to specific
meanings of colour (Strictly observe current colour-coding standards dur-
ing the design process.).
3.5.3.2 Sight
The sense of sight is stimulated by electromagnetic radiation of certain wavelengths,
often referred to as the electromagnetic spectrum. The parts of the spectrum, as
seen by the human eye, appear to vary in brightness. According to a number of stud-
ies conducted over the years, in daylight, the eyes of humans are most sensitive to
42 Applied Reliability, Usability, and Quality for Engineers
greenish-yellow light with a wavelength of around 5,500 Å [9]. Moreover, the eyes
see differently from different angles.
Some of the important sight-related guidelines are as follows:
3.5.3.3 Touch
This is quite closely related to humans’ ability for interpreting visual and auditory
stimuli. The sensory cues received by the skin and muscles can be utilised for sending
messages to the brain. In turn, this helps to relieve a part of the load from eyes and ears.
This human quality can be utilised quite successfully in various areas of engineering
usability. For example, in situations when the user of an item/product is expected to
rely totally on his/her sense of touch, different shapes of knobs could be considered for use.
Finally, it is to be noted that the use of touch in various technical areas is not new;
it has been utilised for many centuries by artisans for detecting surface irregularities
and roughness in their work. In fact, past experiences over the years clearly highlight
that the detection accuracy of surface irregularities improves dramatically when the
involved individual moves an intermediate piece of paper or thin cloth over the sur-
face of the object under consideration instead of just bare fingers [11].
The main objective of a quality assurance system is to maintain the specified level of
quality. Its important elements/tasks are as follows [14]:
• Factor I: Management
• Factor II: Machine used in manufacturing
• Factor III: Money, manpower, and materials
• Factor IV: Motivation of employees
• Factor V: Modern information methods
• Factor VI: Market for product and services
• Factor VII: Mounting product requirements
The term total quality management (TQM) was coined by Nancy Warren, a behav-
ioural scientist, in 1985 [17]. It is composed of three words, each of which is described
below separately in detail.
• Total: This calls for an effective team effort of all involved parties for sat-
isfying customers. There are many factors that play a very important role
in developing a successful supplier-customer relationship. Some of these
factors are as follows:
• Customer-supplier relationships’ development on the basis of mutual
trust and respect.
• Customers developing all their internal needs.
TABLE 3.2
Comparisons between Traditional Quality Assurance Management and the
Total Quality Management
• Statistical approaches
• Customer service
• Team work
• Quality cost
• Training
For TQM process success, there are many goals that must be fulfilled properly. Some
of these goals are as follows [22]:
There are many organisations and books that promote the TQM concept. This section
lists some of these organisations and books separately.
3.7.4.1 Organisations
• American Society for Quality Control, 611 East Wisconsin Avenue, P.O.
Box 3005, Milwaukee, WI.
• American Productivity and Quality Center, 123 North Post Oak Lane,
Houston, Texas.
• Quality and Productivity Management Association, 300 Martingale Road,
Suite 230, Schaumburg, IL.
3.7.4.2 Books
• Oakland, J.S., Total Quality Management: Text with Cases, Butterworth-
Heinemann, Burlington, MA, 2003.
• Besterfield, D.H., et al., Total Quality Management, Prentice Hall, Upper
Saddle River, NJ, 2003.
TABLE 3.3
Some TQM Obstacles in the Form of Questions
3.8 PROBLEMS
1. Describe the bathtub hazard rate concept.
2. Define the following functions:
• Failure density function
• Hazard rate function
• General reliability function
3. Write down three general formulas that can be used to obtain system mean
time to failure.
4. Assume that an engineering system has five active, identical, and independent
units in parallel. At least two units must operate normally for the successful
operation of the engineering system. Calculate the engineering system mean
time to failure if the unit constant failure rate is 0.0008 failures per hour.
5. Assume that an engineering system with five independent and identical
units form a bridge network. Calculate the bridge network’s reliability for
a 200-hour mission and mean time to failure, if the constant failure rate of
each unit is 0.0004 failures per hour.
6. Compare humans’ and machines’ capabilities and limitations.
7. Describe the following two humans senses:
• Hearing
• Touch
8. What are the products’ and services’ quality effecting factors?
9. List at least seven TQM elements.
10. Describe the Deming approach to TQM.
REFERENCES
1. Dhillon, B.S., Design Reliability: Fundamentals and Applications, CRC Press, Boca
Raton, Florida, 1999.
2. Kapur, K.C., Reliability and Maintainability, in Handbook of Industrial Engineering,
edited by Salvendy, G., John Wiley and Sons, New York, 1982, pp. 8.5.1–8.5.34.
3. Dhillon, B.S., Life Distributions, IEEE Transactions on Reliability, Vol. 30, No. 5,
1981, pp. 457–460.
4. Shooman, M.L., Probabilistic Reliability: An Engineering Approach, McGraw-Hill,
New York, 1968.
48 Applied Reliability, Usability, and Quality for Engineers
There are many factors that must be explored carefully prior to the implementa-
tion of FMEA. Four of these factors are as follows [9, 10]:
Over the years, professionals directly or indirectly involved with reliability analysis
have established certain facts and guidelines concerning FMEA. Four of these facts
and guidelines are shown in Fig. 4.1 [8, 10].
There are many advantages of conducting FMEA. Some of the main ones are
presented below [1, 8–10]:
• A useful approach that starts from the detailed level and works upward.
• A visibility tool for management that reduces product development time
and cost.
• A useful approach for comparing designs and highlighting safety-related
concerns.
• A quite helpful tool for safeguarding against repeating the same mistakes
in the future.
• A useful approach for reducing engineering-related changes and improving
the efficiency of test planning.
• A useful tool for improving communications among design interface
personnel.
• A useful tool for understanding and improving customer satisfaction.
• A systematic tool for categorizing and classifying hardware failures.
It is to be noted that there are many prerequisites associated with FTA. Some of the
main ones are presented below [1, 8]:
FTA starts by highlighting the top event, which is associated with a system/item under
consideration. Fault events that can cause the top event’s occurrence are generated and
connected by logic operators such as AND and OR. The AND gate provides a true out-
put (i.e., fault) when all the inputs are true. Similarly, the OR gate provides a true output
(i.e., fault) when one or more inputs are true.
The construction of a fault tree proceeds by generating fault events in a successive
manner until the fault events need not be developed any further. These fault events
are known as primary/basic events. A fault tree relates the top event to the primary/
basic fault events. During a fault tree’s construction process, the following question
is successively asked: "How could this fault event occur?"
The four basic symbols used for constructing fault trees are shown in Fig. 4.2. The AND and OR gate symbols have been discussed earlier; the remaining two symbols (i.e., circle and rectangle)
are described below:
• Circle: It represents a primary/basic fault event (e.g., the failure of an
elementary component/part), and the primary/basic fault-event param-
eters are failure rate, failure probability, unavailability, and repair rate.
• Rectangle: It represents a resultant event that occurs from the combi-
nation of fault events through the input of a logic gate such as OR and
AND.
Example 4.1
Assume that a windowless room contains three light bulbs and one switch.
Develop a fault tree for the undesired fault event (i.e., top fault event) “Dark room”,
if the switch can only fail to close.
In this case, there can be no light in the room (i.e., dark room) only if all three
light bulbs burn out, there is no incoming electricity, or the switch fails
to close. Using all four symbols shown in Fig. 4.2, a fault tree for the example
is shown in Fig. 4.3. The single capital letters in the fault tree diagram represent
corresponding fault events (e.g., A: dark room, B: three bulbs burned out, and C:
power failure).
FIGURE 4.2 Basic fault tree symbols: (i) circle, (ii) rectangle, (iii) AND gate, and (iv) OR gate.
FIGURE 4.3 A fault tree for the top fault event: dark room.
P(X) = 1 − Π_{j=1}^{m} [1 − P(x_j)]   (4.1)

where
P(X) is the occurrence probability of the OR gate output fault event X.
m is the number of independent OR gate input fault events.
P(x_j) is the probability of occurrence of the OR gate input fault event x_j, for j = 1, 2, 3, …, m.
Similarly, the occurrence probability of the AND gate output fault event (say Y) is
expressed by [1, 8]
P(Y) = Π_{j=1}^{n} P(y_j)   (4.2)

where
P(Y) is the occurrence probability of the AND gate output fault event Y.
n is the number of independent AND gate input fault events.
P(y_j) is the probability of occurrence of the AND gate input fault event y_j, for j = 1, 2, 3, …, n.
Example 4.2
Assume that in Fig. 4.3, the occurrence probabilities of independent fault events
C, D, F, G, H, and I are 0.08, 0.07, 0.06, 0.05, 0.04, and 0.03, respectively.
Calculate the probability of occurrence of the top fault event A (Dark room) by
using Equations (4.1) and (4.2).
By substituting the given occurrence probability values of the fault events C and I into Equation (4.1), we get

P(X) = 1 − (1 − 0.08)(1 − 0.03) = 0.1076

where
P(X) is the occurrence probability of fault event X (i.e., no incoming electricity).
Similarly, by substituting the given occurrence probability values of the fault events
F, G, and H into Equation (4.2), we get
P(Y) = P(F)P(G)P(H) = (0.06)(0.05)(0.04) = 0.00012
where
P(Y) is the occurrence probability of fault event Y (i.e., three bulbs burned out).
By substituting these calculated values and the given data value into Equation (4.1),
we get
P(A) = 1 − [1 − P(X)][1 − P(Y)][1 − P(D)]
     = 1 − (1 − 0.1076)(1 − 0.00012)(1 − 0.07)
     = 0.1701
Thus, the probability of occurrence of the top fault event A (Dark room) is 0.1701.
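For readers who wish to verify such fault tree calculations numerically, the short Python sketch below (ours, not part of the original text) evaluates Equations (4.1) and (4.2) for the dark-room fault tree of Example 4.2; the grouping of events C and I under the OR gate X is an assumption inferred from the given probabilities.

```python
def or_gate(probabilities):
    # Equation (4.1): P(X) = 1 - product of (1 - P(x_j)) over independent input events
    product = 1.0
    for p in probabilities:
        product *= (1.0 - p)
    return 1.0 - product

def and_gate(probabilities):
    # Equation (4.2): P(Y) = product of P(y_j) over independent input events
    product = 1.0
    for p in probabilities:
        product *= p
    return product

# Independent fault event occurrence probabilities (Example 4.2)
p_C, p_D, p_F, p_G, p_H, p_I = 0.08, 0.07, 0.06, 0.05, 0.04, 0.03

p_X = or_gate([p_C, p_I])        # no incoming electricity (assumed inputs)
p_Y = and_gate([p_F, p_G, p_H])  # three bulbs burned out
p_A = or_gate([p_X, p_Y, p_D])   # top event: dark room

print(round(p_X, 4), round(p_Y, 5), round(p_A, 4))  # 0.1076 0.00012 0.1702
```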
• The transitional probability from one system state to another in the finite
time interval ∆t is given by θ∆t, where θ is the transition rate (e.g., failure or
repair rate) from one system state to another.
• The probability of more than one transition occurrence taking place in the
finite time interval ∆t is negligible (e.g., (θ∆t )(θ∆t ) → 0).
• All occurrences are independent of each other.
Example 4.3
Assume that an engineering system can be in either operating or failed state, and
its constant failure and repair rates are λ and µ , respectively. The system state
space diagram is shown in Fig. 4.4; the numerals in the circle and rectangle denote
the engineering system's states. Obtain expressions for the engineering system's
time-dependent and steady-state availabilities and unavailabilities, reliability, and
mean time to failure by using the Markov method.
Using the Markov method, we write down the following equations for states 0
and 1, shown in Fig. 4.4, respectively:
where
t is time.
P0 (t + ∆t ) is the probability of the engineering system being in operating state 0
at time (t + ∆t )
P1(t + ∆t ) is the probability of the engineering system being in failed state 1 at
time (t + ∆t ).
Pi (t ) is the probability that the engineering system is in state i at time t, for i = 0,1.
λ∆t is the probability of the engineering system failure in finite time interval ∆t.
µ∆t is the probability of the engineering system repair in finite time interval ∆t .
(1− λ∆t ) is the probability of no failure in finite time interval ∆t .
(1− µ∆t ) is the probability of there being no repair in finite time interval ∆t .
lim_{Δt→0} [P_0(t + Δt) − P_0(t)]/Δt = −P_0(t)λ + P_1(t)μ   (4.6)

dP_0(t)/dt + P_0(t)λ = P_1(t)μ   (4.7)

dP_1(t)/dt + P_1(t)μ = P_0(t)λ   (4.8)

P_0(t) = μ/(λ + μ) + [λ/(λ + μ)] e^{−(λ + μ)t}   (4.9)

P_1(t) = λ/(λ + μ) − [λ/(λ + μ)] e^{−(λ + μ)t}   (4.10)

AV(t) = P_0(t) = μ/(λ + μ) + [λ/(λ + μ)] e^{−(λ + μ)t}   (4.11)

UA(t) = P_1(t) = λ/(λ + μ) − [λ/(λ + μ)] e^{−(λ + μ)t}   (4.12)
where
AV(t) is the time-dependent availability of the engineering system.
UA(t) is the time-dependent unavailability of the engineering system.
AV = lim_{t→∞} AV(t) = μ/(λ + μ)   (4.13)

and

UA = lim_{t→∞} UA(t) = λ/(λ + μ)   (4.14)
where
AV is the steady-state availability of the engineering system.
UA is the steady-state unavailability of the engineering system.
By setting μ = 0 in Equation (4.11), we get the following expression for the engineering system reliability:

R(t) = e^{−λt}   (4.15)

where
R(t) is the engineering system reliability at time t.
By integrating Equation (4.15) over the time interval [0, ∞], we get the following
equation for the mean time to failure of the engineering system [1]:
MTTF = ∫_0^∞ e^{−λt} dt = 1/λ   (4.16)
where
MTTF is the mean time to failure of the engineering system.
Example 4.4
Assume that an engineering system’s constant failure and repair rates are 0.0005
failures/hour and 0.0008 repairs/hour, respectively. Calculate the engineering
system’s steady-state availability and availability during a 60 hours mission.
By substituting the given data values into Equations (4.13) and (4.11), we get
AV = 0.0008/(0.0005 + 0.0008) = 0.6153

and

AV(60) = 0.0008/(0.0005 + 0.0008) + [0.0005/(0.0005 + 0.0008)] e^{−(0.0005 + 0.0008)(60)} = 0.9711
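A minimal Python sketch (ours) of the two-state Markov availability results, Equations (4.11) and (4.13), using the failure and repair rates of Example 4.4:

```python
import math

def availability(t, failure_rate, repair_rate):
    # Equation (4.11): time-dependent availability of the two-state Markov system
    total = failure_rate + repair_rate
    return repair_rate / total + (failure_rate / total) * math.exp(-total * t)

def steady_state_availability(failure_rate, repair_rate):
    # Equation (4.13): steady-state availability
    return repair_rate / (failure_rate + repair_rate)

lam, mu = 0.0005, 0.0008  # failures/hour and repairs/hour (Example 4.4)
print(round(steady_state_availability(lam, mu), 4))  # ≈ 0.6154
print(round(availability(60, lam, mu), 4))           # ≈ 0.9711
```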
• Benefits
• Are relatively fast to administer and lead directly to diagnostic and
prescriptive information.
• Are quite useful for facilitating communication among design person-
nel, particularly when they are divided into developers and requirement
analysts.
• Are quite useful to comprehend the user’s environment.
• Drawbacks
• A very high degree of reliance on the investigator’s judgement.
• Lack of guidance in selecting the appropriate tasks for evaluation.
out effectively for completing a specific task. However, it is to be noted that complex
task analyses also, directly or indirectly, consider the cognitive steps associated
with a task.
The number of steps required for accomplishing a task may be considered as an
elementary measure of task complexity. The principle may simply be stated as “the
simpler the task, the fewer the steps”. Individuals such as potential system/product
users, experienced system/product designers, and domain experts can be quite valu-
able informants in conducting task analysis.
Some of the benefits and drawbacks of this method (i.e., task analysis) are as
follows [3, 4]:
• Benefits
• Is quite useful with regard to prescribing potential solutions to usability-
related problems.
• Is quite useful for highlighting the elements of the design of the system/
product that cause inconsistencies.
• Requires the involved investigator to follow a specific procedure
because of the standardisation of task analysis notations.
• Drawbacks
• The assumption of “expert” performance with the system under
consideration.
• Problems with the measure of task complexity (i.e., simply counting the
number of steps involved in performing a task).
Example 4.5
Assume that a person has to perform two independent and distinct tasks (x and y)
to operate or use an engineering system. Task x is performed before task y and each
of these tasks can be performed correctly or incorrectly. Furthermore, assume that
the probabilities of the person not performing tasks x and y correctly are 0.05 and
0.1, respectively.
Develop a probability tree and obtain an expression for the probability of not
successfully accomplishing the mission (i.e., not operating the engineering system
correctly). Also, calculate the probability of correctly operating/using the engi-
neering system by the person.
In this example, the person first performs task x correctly or incorrectly, and
then proceeds to perform task y. This whole scenario is depicted by a probability
tree shown in Fig. 4.5.
The symbols used in Fig. 4.5 are defined below:
In Fig. 4.5, the term xy denotes operating the engineering system successfully
(i.e., overall mission success). Thus, the occurrence probability of event xy is [4, 18]
P( xy ) = Px Py (4.17)
where
Px is the probability of performing task x correctly.
Py is the probability of performing task y correctly.
Similarly, in Fig. 4.5, the terms x̄y, xȳ, and x̄ȳ denote the three distinct possibilities of not
operating the engineering system correctly. Thus, the probability of not successfully accomplishing the overall mission is

P_ns = P(x̄y + xȳ + x̄ȳ) = P_x̄ P_y + P_x P_ȳ + P_x̄ P_ȳ   (4.18)

where
P_ns is the probability of not successfully accomplishing the overall mission (i.e., the probability of not operating the engineering system correctly).
P_x̄ is the probability of performing task x incorrectly.
P_ȳ is the probability of performing task y incorrectly.
Thus, by substituting the given data values into Equation (4.17), the probability of correctly operating/using the engineering system by the person is

P(xy) = P_x P_y = (1 − P_x̄)(1 − P_ȳ) = (1 − 0.05)(1 − 0.1) = 0.855
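The probability-tree arithmetic of Equations (4.17) and (4.18) can be checked with a few lines of Python (ours; the variable names are not from the text):

```python
# Probabilities of performing tasks x and y incorrectly (Example 4.5)
p_x_bar, p_y_bar = 0.05, 0.1
p_x, p_y = 1 - p_x_bar, 1 - p_y_bar

# Equation (4.17): mission success requires both tasks performed correctly
p_success = p_x * p_y

# Equation (4.18): the three remaining branches of the tree are mission failures
p_failure = p_x_bar * p_y + p_x * p_y_bar + p_x_bar * p_y_bar

print(round(p_success, 3), round(p_failure, 3))  # 0.855 and 0.145 (they sum to 1)
```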
Example 4.6
Construct a CAED.
The CAED for this example is shown in Fig. 4.6.
4.9.1 The P-Charts
These charts are also called control charts for attributes, in which the data population is grouped under two classifications (e.g., good or bad, pass or fail); more
clearly, parts/components without defects and parts/components with defects. Thus,
attribute control charts make use of pass-fail information for charting, and a p-chart
basically is a single chart that tracks the proportion of nonconforming items/parts in
each sample taken from a representative population.
The UCL and LCL of p-charts are established by utilising the binomial distribution; thus, they are expressed by

UCL_p = μ_b + 3σ_b   (4.19)

LCL_p = μ_b − 3σ_b   (4.20)
where
UCL p is the upper control limit of the p-chart.
LCL p is the lower control limit of the p-chart.
µ b is the mean of the binomial distribution.
σ b is the standard deviation of the binomial distribution.
μ_b = M/(mγ)   (4.21)
where
M is the total number of failures/defectives in classification.
m is the sample size.
γ is the number of samples.
σ_b = [μ_b(1 − μ_b)/m]^{1/2}   (4.22)
Example 4.7
Assume that ten samples were taken from the production line of a company
manufacturing certain mechanical parts for use in a nuclear power plant. Each
sample contained 80 parts. The inspection process revealed that samples 1, 2,
3, 4, 5, 6, 7, 8, 9, and 10 contain 2, 4, 6, 8, 5, 9, 10, 3, 12, and 7 defective parts,
respectively.
Calculate the UCL and LCL of the p-chart and determine if the fractions of
defective parts of all these samples fall within the UCL and LCL of the p-chart.
By substituting the given data values into Equation (4.21), we obtain
μ_b = (2 + 4 + 6 + 8 + 5 + 9 + 10 + 3 + 12 + 7)/[(80)(10)] = 0.0825
By inserting the above calculated value and the other given data value into
Equation (4.22), we obtain
σ_b = [(0.0825)(1 − 0.0825)/80]^{1/2} = 0.0307
The fraction of defective parts in sample 1 is

p = 2/80 = 0.025
Similarly, the fractions of defective parts in samples 2, 3, 4, 5, 6, 7, 8, 9, and 10
are 0.05, 0.075, 0.1, 0.0625, 0.1125, 0.125, 0.0375, 0.15, and 0.0875, respectively.
By substituting the above calculated values for μ_b and σ_b into Equations (4.19) and (4.20), we obtain

UCL_p = 0.0825 + 3(0.0307) = 0.1746

and

LCL_p = 0.0825 − 3(0.0307) = −0.0096 ≅ 0
As all the above sample fractions are within the UCL and LCL, it means that there
is no abnormality in the ongoing production process.
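A short Python sketch (ours) of the p-chart calculation, Equations (4.19) through (4.22), applied to the sample data of Example 4.7:

```python
import math

defectives = [2, 4, 6, 8, 5, 9, 10, 3, 12, 7]  # defective parts per sample (Example 4.7)
sample_size = 80
num_samples = len(defectives)

mu_b = sum(defectives) / (sample_size * num_samples)   # Equation (4.21)
sigma_b = math.sqrt(mu_b * (1 - mu_b) / sample_size)   # Equation (4.22)

ucl = mu_b + 3 * sigma_b                               # Equation (4.19)
lcl = max(mu_b - 3 * sigma_b, 0.0)                     # Equation (4.20); negative limit set to zero

fractions = [d / sample_size for d in defectives]
print(round(ucl, 4), round(lcl, 4))                    # ≈ 0.1748 0.0
print(all(lcl <= f <= ucl for f in fractions))         # True: process in control
```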
4.10 PROBLEMS
1. Describe FMEA.
2. What are the main benefits of conducting FMEA?
3. What are the main objectives, benefits, and drawbacks of performing FTA?
4. What are the four basic symbols used for constructing fault trees? Describe
each of these symbols.
5. Assume that a windowless room contains two light bulbs and one switch.
Develop a fault tree for the undesired fault event (i.e., top fault event) “Dark
room”, if the switch can only fail to close.
6. Prove Equations (4.9) and (4.10) by using Equations (4.7) and (4.8).
7. What are the benefits and drawbacks of cognitive walkthroughs and task
analysis?
8. Compare probability tree analysis with FTA.
9. Describe CAED and its advantages.
10. Assume that six samples were taken from the production line of a com-
pany manufacturing certain mechanical parts for use in a nuclear power
plant. Each sample contained 50 parts. The inspection process revealed
that samples 1, 2, 3, 4, 5, and 6 contain 4, 6, 2, 5, 1, and 3 defective parts,
respectively.
Calculate the UCL and LCL of the p-chart and determine if the fractions of defective
parts of all these samples fall within the UCL and LCL of the p-chart.
REFERENCES
1. Dhillon, B.S., Design Reliability: Fundamentals and Applications, CRC Press, Boca
Raton, Florida, 1999.
2. Dhillon, B.S., Singh, C., Engineering Reliability: New Techniques, John Wiley and
Sons, New York, 1981.
3. Jordon, P.W., An Introduction to Usability, Taylor and Francis Ltd, London, 1998.
• Electrocardiographic monitors
• Cardiac defibrillators
Finally, it is to be noted that there could be some overlap between the above three
classifications of devices/equipment, particularly between classifications I and III. An
electrocardiograph monitor or recorder is a typical example of such equipment/devices.
The method calculates the equipment or system failure rate under the single-use
environment by using the following equation [21]:
λ_e = Σ_{i=1}^{m} α_i (λ_gc Q_gc)_i   (5.1)

where
λ_e is the equipment/system failure rate expressed in failures/10^6 hours.
m is the number of different generic component/part classifications.
λ_gc is the generic component/part failure rate expressed in failures/10^6 hours.
α_i is the generic component/part quantity for classification i.
Q_gc is the generic component/part quality factor.
The values Qgc and λ gc are tabulated in Ref. [21], and additional information on the
method is available in Refs. [21, 22].
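As an illustration only, Equation (5.1) can be evaluated as in the Python sketch below; the generic part classes, failure rates, and quality factors shown are hypothetical values, not figures taken from Ref. [21].

```python
# Hypothetical classifications: (name, quantity, generic failure rate per 10^6 h, quality factor)
classifications = [
    ("resistor",   120, 0.002, 1.0),
    ("capacitor",   60, 0.008, 1.5),
    ("transistor",  25, 0.050, 2.0),
]

# Equation (5.1): parts count failure rate under a single-use environment
lambda_e = sum(qty * rate * q for _, qty, rate, q in classifications)
print(f"Equipment/system failure rate: {lambda_e:.3f} failures/10^6 hours")
```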
5.4.5 General Approach
This is a 13-step approach and it was developed by Bio-Optronics to produce reliable
and safe medical devices [27]. The approach steps are as follows [27]:
• Step VII: Test the prototype under the field use environment.
• Step VIII: Make changes to the device/product design for satisfying field
requirements.
• Step IX: Conduct laboratory and field test on the modified version of the
device/product.
• Step X: Build pilot units to conduct necessary tests.
• Step XI: Ask impartial experts to test pilot units under the field use
environments.
• Step XII: Release the device/product design for production.
• Step XIII: Study the device/product field performance and support with
appropriate device/product maintenance.
1. Glucose meter
2. Balloon catheter
3. Orthodontic bracket aligner
4. Administration kit for peritoneal dialysis
5. Permanent pacemaker electrode
6. Implantable spinal cord stimulator
7. Intra-vascular catheter
8. Infusion pump
9. Urological catheter
10. Electrosurgical cutting and coagulation device
11. Non-powered suction apparatus
12. Mechanical/hydraulic impotence device
13. Implantable pacemaker
14. Peritoneal dialysate delivery system
15. Catheter Introducer
16. Catheter guide wire
17. Trans-luminal coronary angioplasty catheter
18. External low-energy defibrillator
19. Continuous ventilator (respirator)
20. Contact lens cleaning and disinfecting solutions
• Reliability professionals
• Keep in mind that manufacturers are fully responsible for reliability
during the device/equipment design and manufacturing phase, and dur-
ing its operational phase it is basically users’ responsibility.
• Use methods such as qualitative FTA, FMEA, parts review, and design
review for obtaining immediate results.
• Focus on critical failures as not all equipment/device failures are
equally important.
• Focus on cost effectiveness and always keep in mind that some reliabil-
ity-associated improvement decisions need very small or no additional
expenditure.
• Always aim to use simple and straightforward reliability methods as
much as possible instead of some highly sophisticated approaches used
in the aerospace industry.
• Other professionals
• Compare human body and medical device/equipment failures. Both
of them require appropriate measures from reliability professionals
and doctors for enhancing device/equipment reliability and extending
human life, respectively.
• Recognise that failures are the cause of poor medical device/equipment
reliability, and positive thinking and measures can be very useful for
improving medical device/equipment reliability.
• Keep in mind that the application of reliability principles has success-
fully improved the reliability of systems/equipment used in the aero-
space area, and their applications to medical devices/equipment can
generate similar dividends.
• Remember that the cost of failures is probably the largest single expense
in a business organisation. These failures could be associated with busi-
ness systems, equipment, humans, etc., and a reduction in such failures
can decrease the business cost quite significantly.
• For the total success with respect to equipment/device reliability,
both users and manufacturers must accept their share of related
responsibilities.
• Accessibility
• Labelling and coding
• Handles
• Connectors
• Manuals, checklists, charts, and aids
• Test points
• Mounting and fasteners
• Cases, covers, and doors
• Test equipment
• Controls
• Displays
MTTR = [Σ_{i=1}^{k} T_ri λ_i] / [Σ_{i=1}^{k} λ_i]   (5.2)

where
MTTR is the mean time to repair.
k is the number of units.
T_ri is the repair time required to repair unit i, for i = 1, 2, 3, …, k.
λ_i is the constant failure rate of unit i, for i = 1, 2, 3, …, k.
The maintainability function is expressed by

M(t) = ∫_0^t f(t) dt   (5.3)
where
t is time.
f(t) is the probability density function of the repair time.
Example 5.1

Assume that the repair times of a medical equipment/device are exponentially distributed with a mean value (i.e., MTTR) of 3 hours. Calculate the probability that a repair will be completed in 9 hours.
In this case, the repair time probability density function is

f(t) = (1/MTTR) exp(−t/MTTR) = (1/3) exp(−t/3)   (5.4)
By substituting Equation (5.4) and the specified data value into Equation (5.3) we
obtain
M(9) = ∫_0^9 (1/3) exp(−t/3) dt = 1 − exp(−9/3) = 0.9502
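A minimal Python sketch (ours) of the exponential maintainability function used in Example 5.1:

```python
import math

def maintainability(t, mttr):
    # M(t) = 1 - exp(-t / MTTR) for exponentially distributed repair times (Equations (5.3) and (5.4))
    return 1.0 - math.exp(-t / mttr)

print(round(maintainability(9, 3), 4))  # Example 5.1: ≈ 0.9502
print(round(maintainability(6, 3), 4))  # a shorter allowed repair time: ≈ 0.8647
```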
5.7.2.1 Indices
Just like in the case of the general maintenance activity, there are many indices that
can be used for measuring the effectiveness of the medical equipment maintenance-
associated activity.
Three of these indices are presented below [42].
• Index I
This index measures how much time elapses from a customer request until
the failed medical equipment is repaired and put back in service. The index
is defined by

β_at = T_t/n   (5.5)

where
β_at is the average turnaround time per repair.
T_t is the total turnaround time.
n is the total number of work orders or repairs.
As per one study, the turnaround time per medical equipment repair ranged
from 35.4 hours to 135 hours [9].
• Index II
This index is a cost ratio and is defined by
β_cr = C_ms/C_ma   (5.6)
where
βcr is the cost ratio.
Cma is the medical equipment acquisition cost.
Cms is the medical equipment service cost. It includes all parts, materi-
als, and labour costs for scheduled and unscheduled service, includ-
ing in-house, vendor, prepaid contracts, and maintenance insurance.
β_c = R_r/k   (5.7)
5.7.2.2.1 Model I
This model can be used for determining the optimum time interval between item
replacements. The model is based on the assumption that the equipment/item aver-
age annual cost is made up of average investment, operating, and maintenance costs.
Thus, the average annual total cost of a piece of equipment is expressed by
C_at = C_0f + C_mf + C_inv/t_ei + [(t_ei − 1)/2](j_0 + i_m)   (5.8)
where
Cat is the average annual total cost of a piece of equipment.
tei is the equipment/item life expressed in years.
C0 f is the equipment/item operational cost for the first year.
Cmf is the equipment/item maintenance cost for the first year.
Cinv is the investment cost.
j0 is the amount by which operational cost increases annually.
im is the amount by which maintenance cost increases annually.
By differentiating Equation (5.8) with respect to tei and then equating it to zero, we
obtain
t_ei* = [2C_inv/(j_0 + i_m)]^{1/2}   (5.9)
where
tei* is the optimum time between equipment/item replacements.
Example 5.2
Assume that for a medical equipment, we have the following data values:
j0 = $3,000
im = $200
Cinv = $600,000
Determine the optimum replacement period for the medical equipment under
consideration.
By inserting the above specified data values into Equation (5.9), we get
t_ei* = [2(600,000)/(3,000 + 200)]^{1/2} = 19.36 years
Thus, the optimum replacement period for the medical equipment under consid-
eration is 19.36 years.
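The Model I replacement-interval calculation, Equation (5.9), can be sketched in Python as follows (ours; the annual operating-cost increase is taken as $3,000, consistent with the result above).

```python
import math

def optimum_replacement_interval(investment_cost, annual_op_increase, annual_maint_increase):
    # Equation (5.9): optimum time (years) between equipment/item replacements
    return math.sqrt(2 * investment_cost / (annual_op_increase + annual_maint_increase))

print(round(optimum_replacement_interval(600_000, 3_000, 200), 2))  # Example 5.2: ≈ 19.36 years
```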
• Center for Devices and Radiological Health (CDRH), FDA, 1390 Piccard
Drive, Rockville, MD 20850, USA.
• Emergency Care Research Institute (ECRI), 5200 Butler Parkway, Plymouth
Meeting, PA 19462, USA.
• Parts Reliability Information Center (PRINCE) Reliability Office,
George C. Marshall Space Flight Center, National Aeronautics and Space
Administration (NASA), Huntsville, AL 35812, USA.
• Reliability Analysis Center (RAC), Rome Air Development Center (RADC),
Griffiss Air Force Base, Department of Defense, Rome, NY 13441, USA.
• National Technical Information Center, 5285 Port Royal Road, Springfield,
VA 22161, USA.
Some of the data banks and documents considered quite useful to obtain failure data
concerning medical equipment are as follows:
5.9 PROBLEMS
1. List at least four facts and figures concerned, directly or indirectly, with
medical devices/equipment reliability.
2. What are the main classifications of electronic equipment/devices used in
the health care system? Discuss at least two of these classifications in detail.
3. Discuss the steps of the approach developed by Bio-Optronics for producing
safe and reliable medical devices.
4. Describe the parts count method.
5. List at least four facts and figures concerned, directly or indirectly, with
human error in medical devices/equipment.
6. List at least ten medical devices with a high incidence of human error.
7. Define the following two terms:
• Medical equipment maintenance
• Medical equipment maintainability
8. Assume that the repair times of a medical equipment/device are exponen-
tially distributed with a mean value (i.e., MTTR) of 2 hours. Calculate the
probability that a repair will be completed in 5 hours.
9. Define at least two indices that can be used for measuring the effectiveness
of the medical equipment maintenance-associated activity.
10. List at least six good sources to obtain medical equipment/device reliability-
related data.
REFERENCES
1. Hutt, P.B., A History of Government Regulation of Adulteration and Misbranding
of Medical Devices, in The Medical Device Industry, edited by Estrin, N.F., Marcel
Dekker, Inc, New York, 1990, pp. 17–33.
2. Murray, K., Canada’s Medical Device Industry Faces Cost Pressures, Regulatory
Reform, Medical Device and Diagnostic Industry Magazine, Vol. 19, No. 8, 1997,
pp. 30–39.
3. Meyer, J.L., Some Instrument Induced Errors in the Electrocardiogram, The Journal of
the American Medical Association, Vol. 201, 1967, pp. 351–358.
4. Johnson, J.P., Reliability of ECG Instrumentation in a Hospital, Proceedings of the
Annual Symposium on Reliability, 1967, pp. 314–318.
5. Gechman, R., Tiny Flaws in Medical Design Can Kill, Hospital Topics, Vol. 46, 1968,
pp. 23–24.
6. Crump, J.E., Safety and Reliability in Medical Electronics, Proceedings of the Annual
Symposium on Reliability, 1969, pp. 320–327.
7. Taylor, E.F., The Effect of Medical Test Instrument Reliability on Patient Risks,
Proceedings of the Annual Symposium on Reliability, 1969, pp. 328–330.
8. Dhillon, B.S., Bibliography of Literature on Medical Reliability, Microelectronics and
Reliability, Vol. 20, 1980, pp. 737–742.
9. Dhillon, B.S., Medical Device Reliability and Associated Areas, CRC Press, Boca
Raton, Florida, 2000.
10. Schwartz, A.P., A Call for Real Added Value, Medical Industry Executive, February/
March 1994, pp. 5–9.
11. Medical Devices, Hearing Before the Subcommittee on Public Health and Environment,
U.S. Congress House Interstate and Foreign Commerce, Serial No. 93-61, U.S. Government Printing Office, Washington, D.C., 1973.
12. Banta, H.D., The Regulation of Medical Devices, Preventive Medicine, Vol. 19, 1990,
pp. 693–699.
13. Kohn, L.T., Corrigan, J.M., Donaldson, M.S., Editors, To Err Is Human: Building a Safer
Health System, Institute of Medicine Report, National Academy Press, Washington,
D.C., 1999.
14. Micco, L.A., Motivation for the Biomedical Instrument Manufacturers, Proceedings of
the Annual Reliability and Maintainability Symposium, 1972, pp. 242–244.
15. Walter, C.W., Instrumentation Failure Fatalities, Electronic News, January 27, 1969.
16. Dhillon, B.S., Reliability Technology in Health Care Systems, Proceedings of the
IASTED International Symposium on Computers Advanced Technology in Medicine,
Health Care, and Bioengineering, 1990, pp. 84–87.
17. Allen, D., California Home to Almost One-Fifth of U.S. Medical Device Industry,
Medical Device and Diagnostic Industry Magazine, Vol. 19, No. 10, 1997, pp. 64–67.
18. Palady, P., Failure Modes and Effects Analysis, PT Publications, West Palm Beach,
Florida, 1995.
19. MIL-STD-1629, Procedures for Performing a Failure Mode, Effects and Criticality
Analysis, Department of Defense, Washington, D.C.
20. Dhillon, B.S., Design Reliability: Fundamentals and Applications, CRC Press, Boca
Raton, Florida, 1999.
21. MIL-HDBK-217, Reliability Prediction of Electronic Equipment, Department of
Defense, Washington, D.C.
22. RDH-376, Reliability Design Handbook, Reliability Analysis Center, Rome Air
Development Center, Griffiss Air Force Base, Rome, New York, 1976.
23. Dhillon, B.S., Singh, C., Engineering Reliability: New Techniques, John Wiley and
Sons, New York, 1981.
24. Fault Tree Handbook, Report No. NUREG-0492, U.S. Nuclear Regulatory Commission,
Washington, D.C.
25. Singh, C., Reliability Calculations on Large Systems, Proceedings of the Annual
Reliability and Maintainability Symposium, 1975, pp. 188–193.
26. Shooman, M.L., Probabilistic Reliability: An Engineering Approach, McGraw Hill
Book Company, New York, 1968.
27. Rose, H.B., A Small Instrument Manufacturer’s Experience with Medical Equipment
Reliability, Proceedings of the Annual Reliability and Maintainability Symposium,
1972, pp. 251–254.
28. Bogner, M.S., Medical Devices and Human Error, in Human Performance in Automated
Systems: Current Research and Trends, edited by Mouloua, M., Parasuraman, R.,
Lawrence Erlbaum Associates, Hillsdale, New Jersey, 1994, pp. 64–67.
29. Novel, J.L., Medical Device Failures and Adverse Effects, Pediatric Emergency Care,
Vol. 17, 1991, pp. 120–123.
30. Bogner, M.S., Medical Devices: A New Frontier for Human Factors, CSERIAC
Gateway, Vol. 4, No. 1, 1993, pp. 12–14.
31. Sawyer, D., Do It by Design: Introduction to Human Factors in Medical Devices,
Center for Devices and Radiological Health (CDRH), Food and Drug Administration,
Washington, D.C, 1996.
32. Casey, S., Set Phasers on Stun: And other True Tales of Design Technology and Human
Error, Aegean, Inc, Santa Barbara, California, 1993.
33. Hyman, W.A., Human Factors in Medical Devices, in Encyclopaedia of Medical
Devices and Instrumentation edited by J.G. Webster, Vol. 3, John Wiley and Sons,
New York, 1988, pp. 1542–1553.
34. Wikland, M.E., Medical Device and Equipment Design, Interpharm Press Inc, Buffalo
Grove, Illinois, 1995.
35. Taylor, E.F., The Reliability Engineer in the Health Care System, Proceedings of the
Annual Reliability and Maintainability Symposium, 1972, pp. 245–248.
36. Norman, J.C., Goodman, L., Acquaintance with and Maintenance of Biomedical
Instrumentation, J. Assoc. Advan. Med. Inst, Vol. 1, September 1966, pp. 8–10.
37. Waits, W., Planned Maintenance, Medical Research Engineering, Vol. 7, No. 12, 1968,
pp. 15–18.
38. Grant-Ireson, W., Coombs, C.F., Moss, R.V., Eds., Handbook of Reliability Engineering
and Management, McGraw Hill Book Company, New York, 1988.
39. AMCP-113, Engineering Design Handbook: Maintainability Engineering Theory and
Practice, Department of Army, Washington, D.C, 1976.
40. Blanchard, B.S., Verma, D., Peterson, E.L., Maintainability, John Wiley and Sons,
New York, 1995.
41. Dhillon, B.S., Engineering Maintainability, Gulf Publishing Company, Houston, Texas,
1999.
42. Cohen, T., Validating Medical Equipment Repair and Maintenance Metrics: A Progress
Report, Biomedical Instrumentation and Technology, Jan./Feb., 1997, pp. 23–32.
6 Robot Reliability
6.1 INTRODUCTION
Nowadays, robots are increasingly being used to perform various types of tasks includ-
ing materials handling, arc welding, spot welding, and routing. A robot may simply
be described as a mechanism guided by automatic controls and the word “robot” is
derived from the Czechoslovakian language, in which it means “worker” [1].
In 1954, George Devol designed and applied for a patent for a programmable
device that could be considered the first industrial robot. Nonetheless, the Planet
Corporation, in 1959, manufactured the first commercial robot [2]. Nowadays, mil-
lions of industrial robots are being used throughout the world [3]. As robots use
electronic, mechanical, hydraulic, pneumatic, and electrical components, their
reliability-related problems are quite challenging because of many different sources
of failures. Although there is no clear-cut definitive starting point for the robot
reliability field, a publication by J.F. Engelberger, in 1974, could be regarded as its
starting point [4]. A comprehensive list of publications on robot reliability up to
2002 is available in Refs. [5, 6].
This chapter presents various important aspects of robot reliability.
• Robot reliability: This is the probability that a robot will perform its speci-
fied mission according to stated conditions for a given time period.
• Robot mean time to failure: This is the average time that a robot will oper-
ate before failure.
• Robot repair: This is to restore robots and their associated parts or systems
to an operational condition after experiencing failure, damage, or wear.
• Robot mean time to repair: This is the average time that a robot is expected
to be out of operation after failure.
• Robot availability: This is the probability that a robot is available for ser-
vice at the moment of need.
• Fail-safe: This is the failure of a robot/robot part without endangering peo-
ple or damage to equipment or plant facilities.
• Graceful failure: The performance of the manipulator degrades at a slow
pace in response to overloads, instead of failing catastrophically.
• Error recovery: This is the capability of intelligent robotic systems to reveal
errors and, through programming, to initiate appropriate correction actions
to overcome the impending problem and complete the specified process.
• Fault in teach pendant: This is a part failure in the teach pendant of a
robot.
• Erratic robot: A robot that has moved appreciably off its specified path.
• Robot out of synchronisation: This is when the position of the robot’s arm is
not in line with the robot’s memory of where it is supposed to be.
• Improper tools
• Poor system/equipment design
• Poorly written operating and maintenance-related procedures
• High temperature in the work area
• Inadequate lighting in the work area
• Task complexities
• Poor training of operating and maintenance personnel
Thus, human errors may be divided into classifications such as design errors,
assembly errors, inspection errors, operating errors, maintenance errors, and installa-
tion errors. Some of the methods that can be used for reducing human errors’ occur-
rence are error cause removal programme, man-machine systems analysis, quality
circles, and fault tree analysis. The last method (i.e., fault tree analysis) is described
in Chapter 4 and the other three methods are described in Ref. [14].
Category III: Random component failures are those failures that occur unpredict-
ably during the component's useful life. Some of the reasons for their occurrence
are low safety factors, undetectable defects, unavoidable failures, and unexplained
causes. In order to reduce the occurrence of such failures, the methods presented in
Chapter 4 can be used.
Finally, Category IV: Systematic hardware faults are those failures that can occur
due to unrevealed mechanisms present in the robot system design. Some of the rea-
sons for the occurrence of such failures are failure to make the appropriate environ-
ment-related provisions in the initial design, peculiar wrist orientations, and unusual
joint-to-straight-line mode transition.
quite wide variation of MTTRP and MTTRF. There are many factors that dictate
robots’ effectiveness. Some of these factors are as follows [11, 16]:
6.5.1 Robot Reliability
Robot reliability may simply be expressed as the probability that a robot will per-
form its specified function satisfactorily for the stated time period when used as per
designed conditions. The general formula to obtain time-dependent robot reliability
is defined by [11, 17]:
R_r(t) = exp[−∫_0^t λ_r(t) dt]   (6.1)
where
Rr ( t ) is the robot reliability at time t.
λ r ( t ) is the robot hazard rate or time-dependent failure rate.
Equation (6.1) can be used for obtaining the reliability function of a robot for any
failure-time probability distribution (e.g., exponential, Rayleigh, or Weibull).
Example 6.1
Assume that the hazard rate of a robot is expressed by the following function:
λ_r(t) = θt^{θ−1}/β^θ   (6.2)
where
λ r (t ) is the hazard rate of the robot when its times to failure follow Weibull
distribution.
t is time.
β is the scale parameter.
θ is the shape parameter.
Obtain an expression for the robot reliability and then use it to calculate robot reli-
ability when t = 100 hours, θ = 1 (i.e., exponential distribution), and β = 1,500 hours.
By substituting Equation (6.2) into Equation (6.1), we obtain
R_r(t) = exp[−∫_0^t (θt^{θ−1}/β^θ) dt] = e^{−(t/β)^θ}   (6.3)

R_r(100) = e^{−(100/1500)^1} = 0.9355
Thus, the robot reliability for the stated mission period of 100 hours is 0.9355.
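A brief Python sketch (ours) of Equation (6.3) for Weibull-distributed robot failure times, reproducing Example 6.1:

```python
import math

def robot_reliability(t, beta, theta):
    # Equation (6.3): R_r(t) = exp[-(t / beta)^theta]
    return math.exp(-((t / beta) ** theta))

print(round(robot_reliability(100, 1500, 1), 4))  # Example 6.1: ≈ 0.9355
print(round(robot_reliability(100, 1500, 2), 4))  # a wear-out case (theta = 2): ≈ 0.9956
```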
The robot hazard rate is expressed by

λ_r(t) = −[1/R_r(t)] · dR_r(t)/dt   (6.4)
where
λ r ( t ) is the robot hazard rate.
Rr ( t ) is the robot reliability at time t.
It is to be noted that Equation (6.4) can be used for obtaining the hazard rate when
robot times to failure follow any time-continuous probability distribution (e.g.,
Weibull, exponential, Rayleigh, etc.).
Example 6.2
With the aid of Equations (6.3) and (6.4) prove that the robot hazard rate is given
by Equation (6.2).
By substituting Equation (6.3) into Equation (6.4), we obtain
λ_r(t) = −[1/e^{−(t/β)^θ}] · [−θt^{θ−1}/β^θ] e^{−(t/β)^θ} = θt^{θ−1}/β^θ   (6.5)
Both Equations (6.2) and (6.5) are identical. Thus, it proves that Equation (6.2) is an
expression for robot hazard rate.
Example 6.3
Assume that at an industrial installation, the annual robot production hours and
downtime due to robot-related problems are 60,000 hours and 800 hours, respec-
tively. During that period, there were ten robot-related problems. Calculate the
mean time to robot-related problems.
By substituting the given data values into Equation (6.6), we obtain
MTRP = (60,000 − 800)/10 = 5,920 hours
The mean time to robot failure (MTRF) can be obtained by using any one of the following three formulas:

MTRF = ∫_0^∞ R_r(t) dt   (6.7)

MTRF = (RPH − DTDRF)/TNRF   (6.8)

MTRF = lim_{s→0} R_r(s)   (6.9)
where
MTRF is the mean time to robot failure.
Rr (t ) is the robot reliability at time t.
RPH is the robot production hours.
DTDRF is the downtime due to robot failures expressed in hours.
TNRF is the total number of robot failures.
s is the Laplace transform variable.
Rr (s) is the Laplace transform of the robot reliability function.
Example 6.4
Assume that annual production hours of a robot and its annual downtime due to
failures are 4,000 hours and 200 hours, respectively. During that period, the robot
failed five times. Calculate the MTRF.
By inserting the specified data values into Equation (6.8), we obtain
MTRF = (4,000 − 200)/5 = 760 hours
Example 6.5
Assume that the constant failure rate, λ r , of a robot is 0.0004 failures per hour and
its reliability is expressed by
R_r(t) = e^{−λ_r t} = e^{−(0.0004)t}   (6.10)
where
Rr (t ) is the robot reliability at time t.
Calculate the MTRF by using Equations (6.7) and (6.9). Comment on the end result.
By inserting Equation (6.10) into Equation (6.7), we get
MTRF = ∫_0^∞ e^{−(0.0004)t} dt = 1/0.0004 = 2,500 hours
By taking the Laplace transform of Equation (6.10), we get

R_r(s) = 1/(s + 0.0004)   (6.11)

where
R_r(s) is the Laplace transform of the reliability function.
By inserting Equation (6.11) into Equation (6.9), we get

MTRF = lim_{s→0} 1/(s + 0.0004) = 1/0.0004 = 2,500 hours
In both cases, the end result (i.e., MTRF = 2500 hours) is exactly the same. It proves
that both equations [i.e., Equations (6.7) and (6.9)] yield the same end result.
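A short Python check (ours) of Example 6.5: numerically integrating Equation (6.7) over a long horizon agrees with the closed-form result 1/λ.

```python
import math

lam = 0.0004  # constant failure rate, failures/hour (Example 6.5)

def reliability(t):
    return math.exp(-lam * t)

# Simple trapezoidal integration of R_r(t) as an approximation of Equation (6.7)
dt, horizon = 10.0, 60_000.0
steps = int(horizon / dt)
mtrf = sum(0.5 * (reliability(i * dt) + reliability((i + 1) * dt)) * dt for i in range(steps))

print(round(mtrf, 1), round(1 / lam, 1))  # ≈ 2500.0 (numerical) and 2500.0 (exact)
```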
• Servo valve controls the motion of each hydraulic actuator. This motion
is transmitted directly or indirectly (i.e., through rods, gears, chains, etc.)
to the robot’s specific limb and, in turn, each limb is coupled to a position
transducer.
• Under high flow demand, an accumulator assists the pump to supply an
additional hydraulic fluid.
• Position transducer provides the joint angle codes and, in turn, each code’s
scanning is conducted by a multiplexer.
• Operator makes use of a teach pendant to control the arm motion in teach
mode.
• Hydraulic fluid is pumped from the reservoir.
• Conventional motor and pump assembly generates pressure.
• Unloading valve is employed to keep pressure under the maximum limit.
FIGURE 6.4 Block diagram representing two subsystems shown in Fig. 6.3: (a) gripper
subsystem and (b) hydraulic pressure supply subsystem.
Furthermore, as shown in Fig. 6.5, the drive subsystem (shown in Fig. 6.3) is com-
posed of five parts (i.e., joints 1, 2, 3, 4, and 5) in series.
With the aid of Fig. 6.3, we obtain the following expression for the probability
of the nonoccurrence of the hydraulic robot event (i.e., undesirable hydraulic robot
movement causing damage to the robot-associated other equipment and possible
harm to humans):
R_hr = R_gs R_es R_ds R_hs   (6.12)

where
Rhr is the hydraulic robot reliability or the probability of the nonoccurrence of the
hydraulic robot event (i.e., undesirable robotic arm movement causing dam-
age to the robot-associated other equipment and possible harm to humans).
Rgs is the reliability of the independent gripper subsystem.
Res is the reliability of the independent electronic and control subsystem.
Rds is the reliability of the independent drive subsystem.
Rhs is the reliability of the independent hydraulic pressure supply subsystem.
For independent parts, the reliabilities Rgs , Rhs , and Rds of gripper subsystem, hydraulic
pressure supply subsystem, and drive subsystem, using Figs. 6.4(a), 6.4 (b), and 6.5,
respectively, are
FIGURE 6.5 Block diagram representing subsystem 3 (i.e., drive subsystem) shown in Fig. 6.3.
R_gs = R_ps R_cs   (6.13)

R_hs = R_hc R_p   (6.14)

and

R_ds = Π_{i=1}^{5} R_i   (6.15)
where
Rps is the reliability of the pneumatic system.
Rcs is the reliability of the control signal.
Rhc is the reliability of the hydraulic component.
Rp is the reliability of the piping.
Ri is the reliability of joint i; for i = 1, 2, 3, 4, 5.
For constant failure rates of independent subsystems shown in Fig. 6.3, in turn, of
their independent parts shown in Figs. 6.4 and 6.5; from Equation (6.12) through
Equation (6.15), we obtain:
R_hr(t) = e^{−λ_gs t} e^{−λ_es t} e^{−λ_ds t} e^{−λ_hs t}
        = e^{−λ_ps t} e^{−λ_cs t} e^{−λ_es t} e^{−(Σ_{i=1}^{5} λ_i) t} e^{−λ_hc t} e^{−λ_p t}
        = e^{−(λ_ps + λ_cs + λ_es + Σ_{i=1}^{5} λ_i + λ_hc + λ_p) t}   (6.16)
where
λ gs is the constant failure rate of the gripper subsystem.
λ ps is the constant failure rate of the pneumatic system.
λ cs is the constant failure rate of the control signal.
λ es is the constant failure rate of the electronic and control subsystem.
λ i is the constant failure rate of the joint i; for i = 1, 2, 3, 4, 5.
λ ds is the constant failure rate of the drive subsystem.
λ hs is the constant failure rate of the hydraulic pressure supply subsystem.
λ hc is the constant failure rate of the hydraulic component.
λ p is the constant failure rate of the piping.
By integrating Equation (6.16) over the time interval [0, ∞], we get

MTTHRF = ∫_0^∞ R_hr(t) dt = 1/(λ_ps + λ_cs + λ_es + Σ_{i=1}^{5} λ_i + λ_hc + λ_p)   (6.17)
where
MTTHRF is the mean time to hydraulic robot failure (i.e., the mean time to the
occurrence of the hydraulic robot undesirable event).
Example 6.6
Assume that the constant failure rates of the above type of hydraulic robot
are λ ps = 0.0009 failures/hour, λ cs = 0.0008 failures/hour, λ es = 0.0007 failures/
hour, λ1 = λ 2 = λ 3 = λ 4 = λ 5 = 0.0006 failures/hour, λ hc = 0.0005 failures/hour,
and λ p = 0.0004 failures/hour. Calculate the mean time to hydraulic robot failure.
By substituting the specified data values into Equation (6.17), we get
MTTHRF = 1/[0.0009 + 0.0008 + 0.0007 + 5(0.0006) + 0.0005 + 0.0004] = 158.73 hours
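A minimal Python sketch (ours) of the series-system calculation in Equations (6.16) and (6.17), using the constant failure rates of Example 6.6:

```python
import math

# Constant failure rates in failures/hour (Example 6.6)
rates = {
    "pneumatic system": 0.0009,
    "control signal": 0.0008,
    "electronic and control subsystem": 0.0007,
    "five joints": 5 * 0.0006,
    "hydraulic component": 0.0005,
    "piping": 0.0004,
}

total_rate = sum(rates.values())
mtthrf = 1.0 / total_rate                          # Equation (6.17)
reliability_100h = math.exp(-total_rate * 100.0)   # Equation (6.16) evaluated at t = 100 hours

print(round(mtthrf, 2), round(reliability_100h, 4))  # ≈ 158.73 hours and ≈ 0.5326
```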
• Interface bus allows interaction between the supervisory controller and the
joint control processors.
• Each joint is coupled with a feedback encoder (i.e., transducer).
• Transducer sends all appropriate signals to the joint controller.
• Motor shaft rotation is transmitted to the robot’s appropriate limb through
a transmission unit.
• Microprocessor control card controls each joint.
• Direct current (DC) motor actuates each joint.
• Supervising computer/controller directs all joints.
With respect to reliability, the block diagram shown in Fig. 6.6 represents the electric
robot under consideration.
Fig. 6.6 shows that the electric robot under consideration has two hypothetical
subsystems 1 and 2 in series. Subsystem 1 represents no movement due to external
factors, and subsystem 2 represents no failure within the robot causing its movement.
In turn, as shown in Fig. 6.7 (a), Fig. 6.6 subsystem 1 has two hypothetical ele-
ments X and Y in series and subsystem 2 has five parts (i.e., supervisory computer/
controller, drive transmission, joint control, end-effector, and interface) [Fig. 6.7 (b)]
FIGURE 6.6 Block diagram for estimating the nonoccurrence probability (i.e., reliability)
of the undesirable movement of the electric robot.
FIGURE 6.7 Block diagram representing two subsystems shown in Fig. 6.6: (a) subsystem 1,
(b) subsystem 2.
in series. Furthermore, element X in Fig. 6.7 (a) has two hypothetical subelements M
and N in series, as shown in Fig. 6.8.
With the aid of Fig. 6.6, we get the following equation for the probability of non-
occurrence of the undesirable electric robot movement (i.e., reliability):
R_erm = R_ss1 R_ss2   (6.18)

where
Rerm is the probability of nonoccurrence (reliability) of the undesirable electric
robot movement.
Rss1 is the reliability of the independent subsystem 1.
Rss2 is the reliability of the independent subsystem 2.
For independent elements X and Y, the reliability of subsystem 1 in Fig. 6.7 (a) is
expressed by
Rss1 = RX RY (6.19)
where
RX is the reliability of the element X.
RY is the reliability of the element Y.
For hypothetical and independent subelements, the element X’s reliability in Fig. 6.8 is
RX = RM RN (6.20)
where
RM is the reliability of subelement M (i.e., the maintenance person’s reliability in
regard to causing the robot’s movement).
RN is the reliability of subelement N (i.e., the operator’s reliability in regard to
causing the robot’s movement).
Similarly, the reliability of subsystem 2 in Fig. 6.7 (b), for independent parts, is
expressed by
R_ss2 = R_sc R_dt R_jc R_ee R_i   (6.21)

where
Rsc is the reliability of the supervisory computer/controller.
Rdt is the reliability of the drive transmission.
R jc is the reliability of the joint control.
Ree is the reliability of the end-effector.
Ri is the reliability of the interface.
Example 6.7
Assume that the following reliability data values are given for the above type of
electric robot: R_M = 0.95, R_N = 0.94, R_Y = 0.96, R_sc = 0.94, R_dt = 0.93, R_jc = 0.92, R_ee = 0.91, and R_i = 0.90. Calculate the probability of nonoccurrence of the undesirable electric robot movement.
By substituting the appropriate data values into Equations (6.20) and (6.21), we get
R_X = (0.95)(0.94) = 0.893

and

R_ss2 = (0.94)(0.93)(0.92)(0.91)(0.9) = 0.6586
By inserting the above calculated value for R_X and the given value for R_Y into Equation (6.19), we get

R_ss1 = (0.893)(0.96) = 0.8572

Finally, by substituting the calculated values for R_ss1 and R_ss2 into Equation (6.18), we get

R_erm = (0.8572)(0.6586) = 0.5645

Thus, the probability of nonoccurrence of the undesirable electric robot movement is 0.5645.
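A short Python sketch (ours) of the series structure in Equations (6.18) through (6.21), using the data of Example 6.7; small differences from 0.5645 arise from intermediate rounding in the worked example.

```python
def series_reliability(*reliabilities):
    # Product of independent element reliabilities (series configuration)
    result = 1.0
    for r in reliabilities:
        result *= r
    return result

r_M, r_N, r_Y = 0.95, 0.94, 0.96
r_X = series_reliability(r_M, r_N)                        # Equation (6.20)
r_ss1 = series_reliability(r_X, r_Y)                      # Equation (6.19)
r_ss2 = series_reliability(0.94, 0.93, 0.92, 0.91, 0.90)  # Equation (6.21)
r_erm = series_reliability(r_ss1, r_ss2)                  # Equation (6.18)

print(round(r_erm, 4))  # ≈ 0.5647
```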
6.7.1 Model I
This mathematical model can be utilised to calculate the optimum number of inspec-
tions per robot facility per unit time [19, 20]. This information is considered quite
useful to decision makers but inspections are often disruptive; however, such inspec-
tions usually lower the robot downtime because they reduce breakdowns. In this
model, the total downtime of the robot is minimised to obtain the optimum number
of inspections.
The robot’s total downtime per unit time is defined by [21]
RTDT = kT_di + θT_db/k   (6.22)
where
RTDT is the robot’s total downtime per unit time.
k is the number of inspections per robot facility per unit time.
Tdi is the downtime per inspection for a robot facility.
θ is the constant for a specific robot facility.
Tdb is the downtime per breakdown for a robot facility.
By differentiating Equation (6.22) with respect to k and then equating it to zero, we get
k* = [θT_db/T_di]^{1/2}   (6.23)
where
k * is the optimum number of inspections per robot facility per unit time.
By substituting Equation (6.23) into Equation (6.22), we get the minimum total downtime of the robot:

RTDT* = 2[θT_di T_db]^{1/2}   (6.24)

where
RTDT* is the minimum total downtime of the robot.
Example 6.8
Assume that for a robot facility, the following data values are specified:
• θ = 3
• T_di = 0.04 months
• T_db = 0.6 months
Calculate the optimum number of robot inspections per month and the minimum
total robot downtime.
By substituting the above given data values into Equations (6.23) and (6.24),
we get
k* = [3(0.6)/0.04]^{1/2} = 6.7 inspections per month

and

RTDT* = 2[(3)(0.04)(0.6)]^{1/2} = 0.53 months
Thus, the optimum number of robot inspections per month and the minimum total
downtime are 6.7 and 0.53 months, respectively.
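The Model I inspection calculation, Equations (6.23) and (6.24), is easy to script; the Python sketch below (ours) uses the data of Example 6.8.

```python
import math

def optimum_inspections(theta, t_inspection, t_breakdown):
    # Equation (6.23): optimum number of inspections per robot facility per unit time
    return math.sqrt(theta * t_breakdown / t_inspection)

def minimum_downtime(theta, t_inspection, t_breakdown):
    # Equation (6.24): minimum total robot downtime per unit time
    return 2 * math.sqrt(theta * t_inspection * t_breakdown)

theta, t_di, t_db = 3, 0.04, 0.6  # Example 6.8 (downtimes in months)
print(round(optimum_inspections(theta, t_di, t_db), 1))  # ≈ 6.7 inspections per month
print(round(minimum_downtime(theta, t_di, t_db), 2))     # ≈ 0.54 months
```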
6.7.2 Model II
This model is concerned with determining the robot's economic life; more specifically, the time limit beyond which it is not economical to conduct robot repairs.
Thus, the robot economic life is expressed by [18–20, 22]:
REL = [2(C_ri − V_rs)/RAIRC]^{1/2}   (6.25)
where
REL is the robot economic life.
Cri is the robot initial cost (installed).
Vrs is the robot scrap value.
RAIRC is the robot’s annual increase in repair cost.
Example 6.9
Assume that the initial cost (installed) of a robot is $200,000 and its estimated scrap
value is $5,000. The estimated annual increase in its repair-associated cost is $400.
Estimate the time limit beyond which the robot-associated repairs will not be beneficial.
By substituting the given data values into Equation (6.25), we get

REL = [2(200,000 − 5,000)/400]^{1/2} = 31.22 years
Thus, the time limit beyond which the robot-associated repairs will not be benefi-
cial is 31.22 years.
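A minimal Python sketch (ours) of the Model II economic-life calculation, Equation (6.25), using the data of Example 6.9:

```python
import math

def robot_economic_life(initial_cost, scrap_value, annual_repair_cost_increase):
    # Equation (6.25): time limit (years) beyond which robot repairs are not economical
    return math.sqrt(2 * (initial_cost - scrap_value) / annual_repair_cost_increase)

print(round(robot_economic_life(200_000, 5_000, 400), 2))  # Example 6.9: ≈ 31.22 years
```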
• Human error and other failures are statistically independent and the repaired
robot system is as good as new.
• Human error and other failure rates are constant.
• Failed robot system repair rates are constant.
The following symbols are associated with the diagram shown in Fig. 6.9 and its
associated equations:
With the aid of Markov method presented in Chapter 4, we write down the follow-
ing equations for Fig. 6.9 [11, 19]:
dP_0(t)/dt + (λ + λ_h) P_0(t) = α_h P_1(t) + α P_2(t)   (6.26)

dP_1(t)/dt + α_h P_1(t) = λ_h P_0(t)   (6.27)

dP_2(t)/dt + α P_2(t) = λ P_0(t)   (6.28)
where
k_1, k_2 = {−b ± [b² − 4(αα_h + λ_h α + λα_h)]^{1/2}}/2   (6.30)
b = λ + λ_h + α + α_h   (6.31)

k_1 k_2 = αα_h + λ_h α + λα_h   (6.32)

k_1 + k_2 = −(λ + λ_h + α + α_h)   (6.33)
P_1(t) = αλ_h/(k_1 k_2) + [(λ_h k_1 + λ_h α)/(k_1(k_1 − k_2))] e^{k_1 t} − [(α + k_2)λ_h/(k_2(k_1 − k_2))] e^{k_2 t}   (6.34)
RSAV (t ) = P0 (t ) (6.36)
As time t becomes very large in Equations (6.34)–(6.36), we get the following steady-
state probability equations:
RSAV = αα_h/(k_1 k_2)   (6.37)
P_1 = αλ_h/(k_1 k_2)   (6.38)

P_2 = λα_h/(k_1 k_2)   (6.39)
where
RSAV is the robot system steady-state availability.
P1 is the steady-state probability of the robot system being in state 1.
P2 is the steady-state probability of the robot system being in state 2.
For no repair (i.e., α = α_h = 0), Equations (6.29), (6.34), and (6.35) reduce to

P_0(t) = e^{−(λ + λ_h)t}   (6.40)

P_1(t) = [λ_h/(λ + λ_h)][1 − e^{−(λ + λ_h)t}]   (6.41)

P_2(t) = [λ/(λ + λ_h)][1 − e^{−(λ + λ_h)t}]   (6.42)
Thus, the robot system reliability is given by

RSR(t) = e^{−(λ + λ_h)t}   (6.43)
where
RSR(t) is the robot system reliability at time t.
By substituting Equation (6.43) into Equation (6.7), we obtain the following equation
for MTRF:
MTRF = ∫_0^∞ e^{−(λ + λ_h)t} dt = 1/(λ + λ_h)   (6.44)
where
MTRF is the mean time to robot (i.e., robot system) failure.
Using Equation (6.43) in Equation (6.4), we get the following equation for the robot
(i.e., robot system) hazard rate:
λ_r = −[1/e^{−(λ + λ_h)t}] · d[e^{−(λ + λ_h)t}]/dt = λ + λ_h   (6.45)
Example 6.10
Assume that a robot (i.e., robot system) can fail either due to a human error or
other failures, and its human errors and other failure rates are 0.0004 errors per
hour and 0.0008 failures per hour, respectively. The robot repair rate from both
the failure modes is 0.005 repairs per hour. Calculate the robot steady-state
availability.
By substituting the specified data values into Equation (6.37) we obtain
RSAV = (0.005)(0.005)/[(0.005)(0.005) + (0.0004)(0.005) + (0.0008)(0.005)] = 0.8064
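A small Python sketch (ours) of the steady-state availability result, Equation (6.37), with k1·k2 obtained from Equation (6.32) and the rates of Example 6.10:

```python
lam, lam_h = 0.0008, 0.0004   # other-failure rate and human-error rate (per hour)
alpha = alpha_h = 0.005       # repair rates from both failure modes (per hour)

k1_k2 = alpha * alpha_h + lam_h * alpha + lam * alpha_h   # Equation (6.32)
rsav = (alpha * alpha_h) / k1_k2                          # Equation (6.37)

print(round(rsav, 4))  # ≈ 0.8065
```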
6.8 PROBLEMS
1. Define the following four terms:
• Robot reliability
• Robot mean time to failure
• Graceful failure
• Error recovery
2. Discuss robot failure categories and their causes.
3. What are the robot effectiveness dictating factors?
4. Write down the formula to obtain the robot hazard rate.
5. Assume that at an industrial installation, robot production hours and
downtime due to robot-related problems are 40,000 hours and 600 hours,
respectively. During that period, there were five robot-related problems.
Calculate the mean time to robot-related problems.
6. Compare a hydraulic robot with an electric robot with respect to reliability.
7. Assume that for a robot facility, the following data values are given:
• θ=4
• Tdi = 0.02 months
• Tdb = 0.8 months
Calculate the optimum number of robot inspections per month and the
minimum total robot downtime.
8. Assume that the initial cost (installed) of a robot is $500,000 and its esti-
mated scrap value is $4,000. The estimated annual increase in its repair
cost is $300. Estimate the time limit beyond which the robot-associated
repairs will not be beneficial.
9. Prove that the sum of Equations (6.29), (6.34), and (6.35) is equal to unity.
10. Write down three formulas that can be used to calculate the MTRF.
REFERENCES
1. Jablonowski, J., Posey, J.W., Robotics Terminology, in Handbook of Industrial Robotics,
edited by Nof, S.Y., John Wiley and Sons, New York, 1985, pp. 1271–1303.
2. Zeldman, M.I., What Every Engineer Should Know About Robots, Marcel Dekker,
New York, 1984.
3. Rudall, B.H., Automation and Robotics Worldwide: Reports and Surveys, Robotica,
Vol. 14, 1996, pp. 164–168.
4. Engelberger, J.F., Three Million Hours of Robot Field Experience, The Industrial
Robot, 1974, pp. 164–168.
5. Dhillon, B.S., Fashandi, A.R.M., Liu, K.L., Robot Systems Reliability and Safety: A Review,
Journal of Quality in Maintenance Engineering, Vol. 8, No. 3, 2002, pp. 170–212.
6. Dhillon, B.S., On Robot Reliability and Safety: Bibliography, Microelectronics and
Reliability, Vol. 27, 1987, pp. 105–118.
7. Tver, D.F., Bolz, R.W., Robotics Sourcebook and Dictionary, Industrial Press, New York,
1983.
8. Glossary of Robotics Terminology, in Robotics, edited by Fisher, E.L., Industrial
Engineering and Management Press, Institute of Industrial Engineers, Atlanta, Georgia,
1983, pp. 231–253.
9. American National Standard for Industrial Robots and Robot Systems: Safety
Requirements, ANSI/RIA R15.06-1986, American National Standards Institute (ANSI),
New York, 1986.
10. Susnjara, K.A., Manager’s Guide to Industrial Robots, Corinthian Press, Shaker
Heights, Ohio, 1982.
11. Dhillon, B.S., Robot Reliability and Safety, Springer-Verlag, New York, 1991.
12. Khodanbandehloo, K., Duggan, F., Husband, T.F., Reliability assessment of Industrial
Robots, Proceedings of the 14th International Symposium on Industrial Robots, 1984,
pp. 209–220.
13. Khodanbandehloo, K., Duggan, F., Husband, T.F., Reliability of Industrial Robots: A
Safety Viewpoint, Proceedings of the 7th British Robot Association Annual Conference,
1984, pp. 133–242.
14. Dhillon, B.S., Human Reliability: With Human Factors, Pergamon Press, New York,
1986.
15. Jones, R., Dawson, S., People and Robots: Their Safety and Reliability, Proceedings of
the 7th British Robot Association Annual Conference, 1984, pp. 243–258.
16. Young, J.F., Robotics, Butterworth, London, 1973.
17. Dhillon, B.S., Design Reliability: Fundamentals and Applications, CRC Press, Boca
Raton, Florida, 1999.
18. Varnum, E.C., Bassett, B.B., Machine and Tool Replacement Practices, in Manufacturing
Planning and Estimating Handbook, edited by Wilson, F.W., Harvey, P.D., McGraw
Hill, New York, 1963, pp. 18.1–18.22.
19. Dhillon, B.S., Applied Reliability and Quality, Springer-Verlag, London, 2007.
20. Dhillon, B.S., Mechanical Reliability: Theory, Models, and Applications, American
Institute of Aeronautics and Astronautics, Washington, D.C, 1988.
21. Wild, R., Essentials of Production and Operations Management, Holt, Rinehart, and
Winston, London, 1985, pp. 356–368.
22. Eidmann, F.L., Economic Control of Engineering and Manufacturing, McGraw Hill,
New York, 1931.
7 Computer and
Internet Reliability
7.1 INTRODUCTION
Nowadays, a vast amount of money is being spent annually around the globe to
produce computers for various types of applications ranging from personal use to
control space and other systems. As the computers are composed of both the hard-
ware and software components, for their successful operation, the reliability of both
these components is equally important. The history of computer hardware reliability
may be traced back to the late 1940s and 1950s [1–4]. For example, in 1956 Von
Neumann proposed the triple modular redundancy (TMR) scheme for improving com-
puter hardware reliability [3]. It appears that the first serious effort on software reli-
ability started in 1964 at Bell Laboratories [5]. Nonetheless, some of the important
works that appeared in the 1960s on software reliability are available in Refs. [5–7].
The history of the internet goes back to 1969 with the development of the Advanced
Research Projects Agency Network (ARPANET), and it grew from four hosts
in 1969 to over 147 million hosts and 38 million sites in 2002 [8]. Nowadays, billions of people around the globe use internet services [8]. In 2001, there were over
52,000 internet-related failures and incidents. Needless to say, today the reliability
and stability of the internet have become extremely important to the world economy
and other areas, because internet-associated failures can easily generate millions of
dollars in losses and interrupt the daily routines of millions of end users [9].
This chapter presents various important aspects of computer hardware, software,
and internet reliability.
The first six of the above computer failure-related causes are described below.
Communication network failures are mostly of a transient nature and are associ-
ated with inter-module communication. The application of “vertical parity” logic can
help to cut down approximately 70% of errors in communication lines. Peripheral
device failures are important to consider because they can cause serious problems
but they seldom result in a system shutdown. The frequently occurring errors in
peripheral devices are transient or intermittent, and the devices’ electromechanical
nature is the usual reason for their occurrence.
Human errors, in general, take place due to operator oversights and mistakes, and frequently occur during starting up, running, and shutting down the system. Processor and memory failures are associated with processor errors and memory parity errors. Although the occurrence of processor errors is quite rare, they are generally catastrophic. However, there are occasions when the central processor fails to execute instructions properly due to a “dropped bit”. Nowadays, memory parity errors occur very rarely because of improvements in hardware reliability, and they are not necessarily fatal.
Environmental failures take place due to factors such as failure of air-conditioning equipment, electromagnetic interference, earthquakes, and fires, whereas power failures occur due to factors such as total power loss from the local utility company and transient fluctuations in frequency or voltage. In real-life systems, the failures that cannot be classified properly are called mysterious failures. An example of such failures is a normally functioning system that suddenly stops functioning without any indication of a problem (i.e., hardware, software, etc.).
There are many issues, directly or indirectly, concerned with computer system reli-
ability. In this case, some of the important factors to consider are as follows [8, 12, 13]:
• Inherited errors
• Handwriting errors
• Data preparation errors
• Keying errors
• Optical character reader
generate quite a significant proportion of errors. As per Ref. [15], at least 40% of all
errors come from manipulating the data (i.e., data preparation) prior to writing it
down or entering it into the involved computer system.
In the area of the computer system reliability, many measures are being used.
They may be grouped under the following two classifications [12, 16]:
TABLE 7.1
Hardware and Software Reliability Comparisons
The best-known fault-masking method is probably modular redundancy, which is presented in the following sections [17].
R_{tv} = (3R^2 − 2R^3) R_v  (7.1)
where
R is the unit/module reliability.
R_v is the voter reliability.
R_{tv} is the reliability of the TMR system with voter.
FIGURE 7.2 Block diagram representing the TMR scheme with voter.
R_{tv} = 3R^2 − 2R^3  (7.2)
where
R_{tv} is the reliability of the TMR system with perfect voter.
It is to be noted that the voter reliability and the single unit's reliability determine the improvement in reliability of the TMR system over a single-unit system. For the perfect voter (i.e., R_v = 1), the TMR system reliability given by Equation (7.2) is better than that of the single-unit system only when the single unit's reliability is greater than 0.5.
At R_v = 0.8, the reliability of the TMR system is always less than a single unit's reliability. Furthermore, when R_v = 0.9, the TMR system reliability is only marginally better than the single unit/module reliability, and only when the single unit/module reliability is approximately between 0.667 and 0.833 [21].
The ratio of the TMR system reliability (with perfect voter) to the single unit reliability is
α = \frac{R_{tv}}{R} = \frac{3R^2 − 2R^3}{R} = 3R − 2R^2  (7.3)
By differentiating Equation (7.3) with respect to R and then equating the result to zero, we obtain
\frac{dα}{dR} = 3 − 4R = 0  (7.4)
which gives
R = 0.75
This simply means that the maximum improvement in reliability of the TMR system over a single unit occurs at R = 0.75. Thus, by substituting this value for R into Equation (7.2), we get
R_{tv} = 3(0.75)^2 − 2(0.75)^3 = 0.8438
Thus, the TMR system reliability with the perfect voter at this point of maximum improvement is 0.8438.
Example 7.1
Assume that the reliability of a TMR system with a perfect voter is expressed by
Equation (7.2). Determine the points where the single-unit and the TMR-system
reliabilities are equal.
In order to determine the points, we equate the reliability of the single unit (i.e., R)
with Equation (7.2) to obtain
R = R_{tv} = 3R^2 − 2R^3  (7.5)
Rearranging Equation (7.5) and dividing both sides by R, we get
2R^2 − 3R + 1 = 0  (7.6)
Equation (7.6) is a quadratic equation, and its roots are
R = \frac{3 + [9 − (4)(2)(1)]^{1/2}}{(2)(2)} = 1  (7.7)
and
R = \frac{3 − [9 − (4)(2)(1)]^{1/2}}{(2)(2)} = \frac{1}{2}  (7.8)
It means that the reliabilities of the TMR system with perfect voter and the single
unit are equal at R = 1 or R = 1/2. Furthermore, the reliability of the TMR system
with perfect voter will only be higher than the single unit reliability when the value
of R is greater than 0.5.
By integrating Equation (7.9) over the time interval from 0 to ∞, we get the following
expression for the TMR system with voter mean time to failure [12, 17]:
MTTF_{tv} = \int_0^{∞} \left[3e^{−(2λ + λ_v)t} − 2e^{−(3λ + λ_v)t}\right] dt = \frac{3}{2λ + λ_v} − \frac{2}{3λ + λ_v}  (7.10)
where
MTTFtv is the mean time to failure of the TMR system with voter.
For a perfect voter (i.e., λ_v = 0), Equation (7.10) reduces to
MTTF_{tp} = \frac{3}{2λ} − \frac{2}{3λ} = \frac{5}{6λ}  (7.11)
where
MTTFtp is the mean time to failure of the TMR system with perfect voter.
Example 7.2
Assume that the constant failure rate of a unit/module belonging to a TMR system with voter is λ = 0.0002 failures per hour. Calculate the system reliability for a 400-hour mission if the voter constant failure rate is λ_v = 0.0001 failures per hour. In addition, calculate the TMR system mean time to failure.
By substituting the given data values into Equation (7.9), we obtain
R_{tv}(400) = 3e^{−[2(0.0002) + 0.0001](400)} − 2e^{−[3(0.0002) + 0.0001](400)} = 0.9446
Similarly, by substituting the given data values into Equation (7.10), we get
MTTF_{tv} = \frac{3}{2(0.0002) + 0.0001} − \frac{2}{3(0.0002) + 0.0001} = 3142.85 hours
Thus, the TMR system with voter reliability and mean time to failure are 0.9446
and 3142.85 hours, respectively.
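The calculation in Example 7.2 is easy to script. The following short Python sketch (the function and variable names are illustrative, not from the text) evaluates the TMR-with-voter reliability used in Equation (7.9) and the mean time to failure of Equation (7.10) for the data of Example 7.2.

import math

def tmr_reliability(lam, lam_v, t):
    # Reliability of a TMR system with voter at time t (form used in Eq. 7.9)
    return 3 * math.exp(-(2 * lam + lam_v) * t) - 2 * math.exp(-(3 * lam + lam_v) * t)

def tmr_mttf(lam, lam_v):
    # Mean time to failure of a TMR system with voter (Eq. 7.10)
    return 3 / (2 * lam + lam_v) - 2 / (3 * lam + lam_v)

# Data of Example 7.2
lam, lam_v, mission = 0.0002, 0.0001, 400
print(round(tmr_reliability(lam, lam_v, mission), 4))  # 0.9446
print(round(tmr_mttf(lam, lam_v), 2))                  # 3142.86 (given as 3142.85 in the text)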
R_{nv} = R_v \sum_{i=0}^{k} \binom{N}{i} R^{N−i} (1 − R)^i  (7.12)
where
\binom{N}{i} = \frac{N!}{(N − i)!\, i!}
N is the total number of units in the NMR system (N = 2k + 1), and k is the maximum number of failed units the system can tolerate.
R_v is the voter reliability.
R is the unit/module reliability.
R_{nv} is the NMR system with voter reliability.
Finally, it is to be noted that the time dependent reliability analysis of an NMR sys-
tem can be conducted in a manner similar to the TMR system reliability analysis.
Additional information on redundancy schemes is available in Ref. [23].
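The NMR expression of Equation (7.12) lends itself to the same treatment. The following Python sketch (function names are illustrative) computes the NMR-system-with-voter reliability for N = 2k + 1 units; with k = 1 (i.e., TMR) and a perfect voter it reproduces the 3R^2 − 2R^3 result of Equation (7.2).

from math import comb

def nmr_reliability(R, Rv, k):
    # Reliability of an NMR system (N = 2k + 1 units) with voter, Eq. (7.12)
    N = 2 * k + 1
    return Rv * sum(comb(N, i) * R ** (N - i) * (1 - R) ** i for i in range(k + 1))

# Sanity check against the TMR formula with R = 0.9 and a perfect voter
print(round(nmr_reliability(0.9, 1.0, 1), 4))  # 0.972
print(round(3 * 0.9 ** 2 - 2 * 0.9 ** 3, 4))   # 0.972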
where
t is time.
MT is the mean time to failure at the beginning of the test.
m is the net number of corrected faults.
θ is the testing compression factor, defined as the average ratio of the failure detection rate during test to the failure detection rate during normal use of the software programme under consideration.
N is the initial number of faults.
Mean time to failure, MTTF, increases exponentially with execution time and is
defined by
From the above relationships, we get the number of failures that must occur to increase the mean time to failure from, say, MTTF_1 to MTTF_2 [23]:
Δm = N M_T \left[\frac{1}{MTTF_1} − \frac{1}{MTTF_2}\right]  (7.16)
Similarly, the additional testing time required to achieve this increase is expressed by
Δt = \frac{N M_T}{θ} \ln\left[\frac{MTTF_2}{MTTF_1}\right]  (7.17)
Example 7.3
Assume that a software programme under test contains 600 faults, its mean time to failure at the beginning of testing is 4 hours, and the testing compression factor is 2. Determine the additional testing time needed to reduce the number of faults remaining in the programme to 10, and the reliability of the software for an 80-hour operational period.
By substituting the given data values into Equation (7.16), we obtain
(600 − 10) = (600)(4)\left[\frac{1}{4} − \frac{1}{MTTF_2}\right]  (7.18)
which yields MTTF_2 = 240 hours.
By substituting the above result and the other given data values into Equation (7.17), we obtain
Δt = \frac{(600)(4)}{2} \ln\left[\frac{240}{4}\right] = 4913.21 hours
Similarly, using the calculated and given data values in Equation (7.15) yields
R(80) = \exp\left[−\frac{80}{240}\right] = 0.7165
Thus, the needed testing time is 4913.21 hours, and the reliability of the software
for the stated operational period is 0.7165.
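The Musa-model manipulations of Example 7.3 can be automated. The following Python sketch (names and helper functions are illustrative assumptions, not from the text) solves Equation (7.16) for MTTF_2 and then evaluates Equations (7.17) and (7.15) with the example data.

import math

def mttf2_after_corrections(N, MT, mttf1, delta_m):
    # Solve Eq. (7.16) for MTTF_2, given the net number of corrected faults delta_m
    return 1.0 / (1.0 / mttf1 - delta_m / (N * MT))

def extra_test_time(N, MT, theta, mttf1, mttf2):
    # Additional testing time needed, Eq. (7.17)
    return (N * MT / theta) * math.log(mttf2 / mttf1)

# Data of Example 7.3: 600 initial faults, MT = 4 h, theta = 2, 10 faults left
mttf2 = mttf2_after_corrections(N=600, MT=4, mttf1=4, delta_m=600 - 10)
print(round(mttf2, 1))                                 # 240.0 hours
print(round(extra_test_time(600, 4, 2, 4, mttf2), 2))  # 4913.21 hours
print(round(math.exp(-80 / mttf2), 4))                 # 0.7165 (Eq. 7.15)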
be noted that the value of this measure can only be calculated if the seeded faults
are found.
The maximum likelihood estimate of the number of unseeded faults is expressed by [21, 28]
UF_{ml} = \frac{(SF)(UF_u)}{SF_f}  (7.19)
where
UF_{ml} is the maximum likelihood estimate of the number of unseeded faults.
SF_f is the number of seeded faults found.
UF_u is the number of unseeded faults uncovered.
SF is the number of seeded faults.
Thus, the number of unseeded faults still remaining in a software programme under consideration is given by
β = UF_{ml} − UF_u  (7.20)
where
β is the number of unseeded faults still remaining in the software programme under consideration.
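A small numerical sketch may help here. The Python function below is an illustrative implementation of the fault-seeding estimate described above (the input values are hypothetical).

def mills_estimate(seeded, seeded_found, unseeded_found):
    # Maximum likelihood estimate of unseeded faults and the number still remaining
    uf_ml = seeded * unseeded_found / seeded_found
    remaining = uf_ml - unseeded_found
    return uf_ml, remaining

# Hypothetical example: 100 faults seeded, 80 of them found, 60 unseeded faults found
uf_ml, remaining = mills_estimate(seeded=100, seeded_found=80, unseeded_found=60)
print(uf_ml, remaining)  # 75.0 15.0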
CDR_c = \frac{\sum_{j=1}^{m} γ_j}{θ}  (7.21)
where
m is the number of reviews.
γ j is the number of unique defects at or above a specified severity level, found in
the jth code review.
θ is the total number of source lines of code reviewed, expressed in thousands.
CDRc is the cumulative defect ratio for code.
CDR_d = \frac{\sum_{j=1}^{m} β_j}{α}  (7.22)
where
m is the number of reviews.
β j is the number of unique defects at or above a specified severity level, found in
the jth design review.
α is the total number of lines of design statements in the design phase, expressed in thousands.
CDRd is the cumulative defect ratio for design.
• In 2011, over 2.1 billion people around the globe were using the internet,
and around 45% of them were below the age of 25 years [29].
• From 2006 to 2011, developing countries in the world increased their share
of the world’s total number of internet users from 44% to 62% [29].
• In 2001, there were 52,658 internet-associated failures and incidents [9].
• In 2000, the internet-associated economy in the United States generated approximately $830 billion in revenues [9].
• In 2000, the internet carried around 51% of the information flowing through
two-way telecommunication, and by 2007 over 97% of all telecommuni-
cated information was transmitted through the internet [30].
• On November 8, 1998, a malformed routing control message caused by a software fault triggered an inter-operability problem between a number of core internet backbone routers produced by different vendors. In turn, this caused a widespread loss of network connectivity, in addition to an increase in packet loss and latency [8]. It took most of the backbone providers many hours to overcome this outage.
• On August 14, 1998, a misconfigured main internet database server incor-
rectly referred all queries for internet machines/systems with names ending
in “net” to the wrong secondary database server. In turn, due to this prob-
lem, most of the connections to “net” internet web servers and other end
stations malfunctioned for a number of hours [8].
In 1999, a study reported the following four internet reliability-related observations [32]
As many internet services (e.g., e-commerce and search engines) suffer faults, quick detection of these faults can be a very important factor in improving system availability. For this purpose, an approach known as the pinpoint method is considered very useful. This method combines the easy deployability of low-level monitors with the ability of higher-level monitors to detect application-level faults [32].
This method is based upon the following three assumptions in regard to the system
under observation and its workload [32]:
The pinpoint method for detecting and localising anomalies is basically a three stage
process [32]:
• Stage 1: Observing the system. This stage is concerned with capturing the runtime path of each request served by the system and then extracting from these paths two specific low-level behaviours that are likely to reflect high-level functionality: path shapes and component interactions.
• Stage 2: Learning the patterns in system behaviour. This stage is concerned with constructing a reference model representing the usual behaviour of an application in regard to path shapes and component/part interactions, assuming that the system functions correctly most of the time.
• Stage 3: Detecting anomalies in system behaviours. This stage is concerned with analysing the system's current behaviour and detecting anomalies with respect to the reference model.
7.9.1 Model I
This mathematical model is concerned with evaluating the reliability and availability of an internet server system. The model assumes that the server system can be either in an operating state or a failed state, that its failure/outage and restoration/repair rates are constant, that all its failures/outages occur independently, and that the repaired/restored server system is as good as new.
The server system state space diagram is shown in Fig. 7.3, and the numerals in the box and circle denote system states.
The following symbols were used to develop equations for the model:
j is the jth internet server system state shown in Fig. 7.3, for j = 0 (internet
server system operating normally), j = 1 (internet server system failed).
λ ss is the internet server system constant failure/outage rate.
µ ss is the internet server system constant repair/restoration rate.
Pj (t ) is the probability that the internet server system is in state j at time t, for
j = 0,1.
Using the Markov method presented in Chapter 4, we write down the following
equations for the diagram shown in Fig. 7.3 [17]:
\frac{dP_0(t)}{dt} + λ_{ss} P_0(t) = μ_{ss} P_1(t)  (7.23)
\frac{dP_1(t)}{dt} + μ_{ss} P_1(t) = λ_{ss} P_0(t)  (7.24)
Solving Equations (7.23) and (7.24) with the initial conditions P_0(0) = 1 and P_1(0) = 0, we get
P_0(t) = AV_{ss}(t) = \frac{μ_{ss}}{λ_{ss} + μ_{ss}} + \frac{λ_{ss}}{λ_{ss} + μ_{ss}} e^{−(λ_{ss} + μ_{ss})t}  (7.25)
P_1(t) = UA_{ss}(t) = \frac{λ_{ss}}{λ_{ss} + μ_{ss}} − \frac{λ_{ss}}{λ_{ss} + μ_{ss}} e^{−(λ_{ss} + μ_{ss})t}  (7.26)
where
AVss (t ) is the internet server system availability at time t.
UAss (t ) is the internet server system unavailability at time t.
The internet server system steady-state availability and unavailability are
AV_{ss} = \lim_{t→∞} AV_{ss}(t) = \frac{μ_{ss}}{λ_{ss} + μ_{ss}}  (7.27)
UA_{ss} = \lim_{t→∞} UA_{ss}(t) = \frac{λ_{ss}}{λ_{ss} + μ_{ss}}  (7.28)
For μ_{ss} = 0, Equation (7.25) reduces to
R_{ss}(t) = e^{−λ_{ss} t}  (7.29)
where
Rss (t ) is the internet server system reliability at time t.
Thus, the internet server system mean time to failure is given by [17]
MTTF_{ss} = \int_0^{∞} R_{ss}(t)\, dt = \int_0^{∞} e^{−λ_{ss} t}\, dt = \frac{1}{λ_{ss}}  (7.30)
where
MTTFss is the internet server system mean time to failure.
Example 7.4
Assume that the constant failure and repair rates of an internet server system are λ_{ss} = 0.0005 failures per hour and μ_{ss} = 0.08 repairs per hour, respectively. Calculate the internet server system steady-state availability.
By substituting the given data values into Equation (7.27), we obtain
AV_{ss} = \frac{0.08}{0.0005 + 0.08} = 0.9937
Thus, the steady state availability of the internet server system is 0.9937.
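The two-state Markov results above reduce to a few lines of code. The following Python sketch (illustrative names, not from the text) evaluates the time-dependent availability of Equation (7.25), the steady-state availability of Equation (7.27), and the mean time to failure of Equation (7.30) for the data of Example 7.4.

import math

def availability(lam, mu, t):
    # Internet server system availability at time t, Eq. (7.25)
    return mu / (lam + mu) + (lam / (lam + mu)) * math.exp(-(lam + mu) * t)

def steady_state_availability(lam, mu):
    # Steady-state availability, Eq. (7.27)
    return mu / (lam + mu)

lam_ss, mu_ss = 0.0005, 0.08   # failure and repair rates (per hour), Example 7.4
print(round(steady_state_availability(lam_ss, mu_ss), 4))  # ~0.9938 (given as 0.9937 in the text)
print(round(availability(lam_ss, mu_ss, 20), 4))            # availability for a 20-hour mission
print(1 / lam_ss)                                            # MTTF, Eq. (7.30): 2000.0 hours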
7.9.2 Model II
This mathematical model is concerned with evaluating the availability of an
internetworking (router) system with two independent and identical switches. The model assumes that the system malfunctions when both switches malfunction and that the switches form a standby-type configuration. In addition, the switch failure/malfunction and restoration (repair) rates are constant. The system state space diagram is shown in Fig. 7.4. The numerals in the rectangles and circle denote the system states.
The following symbols were used to develop equations for the model:
j is the jth state shown in Fig. 7.4, for j = 0 (system operating normally [i.e.,
two switches functional: one operating, other on standby]), j = 1 (one switch
operating, the other failed), j = 2 (system failed [both switches failed]).
λ s is the switch constant failure rate.
µ s is the switch constant repair/restoration rate.
µ 2 is the constant repair/restoration rate from state 2 to state 0.
p is the probability of failure detection and successful switchover upon switch failure.
Pj (t ) is the probability that the internetworking (router) system is in state j at
time t; for j = 0,1,2.
Using the Markov method presented in Chapter 4, we write down the following
equations for the diagram shown in Fig. 7.4 [17, 37].
\frac{dP_0(t)}{dt} + [pλ_s + (1 − p)λ_s] P_0(t) = μ_s P_1(t) + μ_2 P_2(t)  (7.31)
\frac{dP_1(t)}{dt} + (λ_s + μ_s) P_1(t) = pλ_s P_0(t)  (7.32)
\frac{dP_2(t)}{dt} + μ_2 P_2(t) = λ_s P_1(t) + (1 − p)λ_s P_0(t)  (7.33)
Setting the derivatives in Equations (7.31)–(7.33) equal to zero and using the relationship P_0 + P_1 + P_2 = 1, we obtain the following steady-state solutions:
P_0 = μ_2(μ_s + λ_s)/X  (7.34)
where
X = μ_2(μ_s + pλ_s + λ_s) + (1 − p)λ_s(μ_s + λ_s) + pλ_s^2  (7.35)
P_1 = pλ_s μ_2 / X  (7.36)
P_2 = \left[pλ_s^2 + (1 − p)λ_s(μ_s + λ_s)\right] / X  (7.37)
where
P_j is the steady-state probability that the internetworking (router) system is in state j, for j = 0, 1, 2.
The internetworking (router) system steady-state availability is given by
AV_{is} = P_0 + P_1 = \left[μ_2(μ_s + λ_s) + pλ_s μ_2\right] / X  (7.38)
where
AVis is the internetworking (router) system steady-state availability.
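As a quick numerical check of Model II, the steady-state expressions (7.34)–(7.38) can be coded directly. The following Python sketch (illustrative names and hypothetical data values) evaluates the probabilities and the steady-state availability; the three probabilities should sum to one.

def router_steady_state(lam_s, mu_s, mu_2, p):
    # Steady-state probabilities and availability of the internetworking (router)
    # system with two identical switches, Eqs. (7.34)-(7.38)
    X = (mu_2 * (mu_s + p * lam_s + lam_s)
         + (1 - p) * lam_s * (mu_s + lam_s)
         + p * lam_s ** 2)
    P0 = mu_2 * (mu_s + lam_s) / X
    P1 = p * lam_s * mu_2 / X
    P2 = (p * lam_s ** 2 + (1 - p) * lam_s * (mu_s + lam_s)) / X
    return P0, P1, P2, P0 + P1

# Hypothetical data: switch failure/repair rates and switchover success probability
P0, P1, P2, AV = router_steady_state(lam_s=0.001, mu_s=0.05, mu_2=0.04, p=0.95)
print(round(P0 + P1 + P2, 10))  # 1.0 (sanity check)
print(round(AV, 6))             # steady-state availability, Eq. (7.38)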
7.10 PROBLEMS
1. Make a comparison between software reliability and hardware reliability.
2. What are the main causes of computer failures?
3. Discuss at least five categories of computer failures.
4. What are the sources of computer software and hardware errors?
5. What is fault masking?
6. Assume that the constant failure rate of a unit/module belonging to a TMR
system with voter is λ = 0.0004 failures per hour. Calculate the system reli-
ability for a 200-hour mission if the voter constant failure rate is 0.0002 fail-
ures per hour. In addition, calculate the TMR system mean time to failure.
7. Compare the Mills model with the Musa model.
8. Describe the pinpoint method.
9. Assume that the constant failure and repair rates of an internet server sys-
tem are 0.005 failures per hour and 0.04 repairs per hour, respectively.
Calculate the internet server system availability for a 20-hour mission.
10. Prove Equations (7.34), (7.36), and (7.37) by using Equations (7.31)–(7.33).
REFERENCES
1. Shannon, C.E., A Mathematical Theory of Communication, Bell System Technical Journal, Vol. 27, 1948, pp. 379–423 and 623–656.
2. Hamming, R.W., Error Detecting and Error Correcting Codes, Bell System Technical Journal, Vol. 29, 1950, pp. 147–160.
3. Von Neumann, J., Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components, in Automata Studies, edited by Shannon, C.E., McCarthy, J., Princeton University Press, Princeton, New Jersey, 1956, pp. 43–48.
4. Moore, E.F., Shannon, C.E., Reliable Circuits Using Less Reliable Relays, Journal of
the Franklin Institute, Vol. 262, 1956, pp. 191–208.
5. Haugk, G., Tsiang, S.H., Zimmermann, L., System Testing of the No. 1 Electronic
Switching System, Bell System Technical Journal, Vol. 43, 1964, pp. 2575–2592.
6. Sauter, J.L., Reliability in Computer Programs, Mechanical Engineering, Vol. 91, 1969,
pp. 24–27.
7. Barlow, R., Scheuer, E.M., Reliability Growth During a Development Testing Program,
Technometrics, Vol. 8, 1966, pp. 53–60.
8. Dhillon, B.S., Computer System Reliability: Safety and Usability, CRC Press, Boca
Raton, Florida, 2013.
9. Goseva-Popstojanova, K., Mazimdar, S., Singh, A.D., Empirical Study of Session-Based
Workload and Reliability for Web Servers, Proceedings of the 15th Int. Symposium on
Software Reliability Engineering, 2004, pp. 403–414.
10. Yourdon, E., The Causes of System Failures-Part 2, Modern Data, Vol. 5, February
1972, pp. 50–56.
11. Yourdon, E., The Causes of System Failures-Part 3, Modern Data, Vol. 5, March 1972,
pp. 36–40.
12. Dhillon, B.S., Reliability in Computer System Design, Ablex Publishing, Norwood,
New Jersey, 1987.
13. Goldberg, J., A Survey of the Design and Analysis of Fault-Tolerant Computers, in
Reliability and Fault Tree Analysis, edited by Barlow, R.E., Fussell, J.B., Singpurwalla,
N.D., Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania,
1975, pp. 667–685.
14. Kletz, T., Chung, P., Broomfield, E., Shen-Orr, C., Computer Control and Human Error,
Gulf Publishing, Houston, Texas, 1995.
15. Bailey, R.W., Human Error in Computer Systems, Prentice Hall, Englewood Cliffs,
New Jersey, 1983.
16. Beaudry, M.D., Performance Related Reliability Measures for Computer Systems,
IEEE Transactions on Computers, Vol. 27, June 1978, pp. 540–547.
17. Dhillon, B.S., Design Reliability: Fundamentals and Applications, CRC Press, Boca
Raton, Florida, 1999.
18. Kline, M.B., Software and Hardware Reliability and Maintainability: What are the
Differences?, Proceedings of the Annual Reliability and Maintainability Symposium,
1980, pp. 179–185.
19. Ireson, W.G., Coombs, C.F., Moss, R.Y., Handbook of Reliability Engineering and
Management, McGraw Hill, New York, 1996.
20. Mathur, F.R., Avizienis, A., Reliability Analysis and Architecture of a Hybrid Redundant Digital System: Generalized Triple Modular Redundancy with Self-Repair, Proceedings of the AFIPS Spring Joint Computer Conference, 1970, pp. 375–387.
21. Pecht, M., Ed., Product Reliability, Maintainability, and Supportability Handbook,
CRC Press, Boca Raton, Florida, 1995.
22. Shooman, M.L., Reliability of Computer Systems and Networks: Fault Tolerance,
Analysis, and Design, John Wiley and Sons, New York, 2002.
23. Nerber, P.O., Power-off Time Impact on Reliability Estimates, IEEE International
Convention Record, Part 10, March 1965, pp. 1–8.
24. Sukert, A.N., An Investigation of Software Reliability Models, Proceedings of the
Annual Reliability and Maintainability Symposium, 1977, pp. 478–484.
25. Musa, J.D., Iannino, A., Okumoto, K., Software Reliability, McGraw Hill, New York,
1987.
26. Schick, G.J., Wolverton, R.W., An Analysis of Competing Software Reliability Models,
IEEE Transactions on Software Engineering, Vol. 4, 1978, pp. 104–120.
27. Musa, J.D., A Theory of Software Reliability and Its Applications, IEEE Transactions
on Software Engineering, Vol. 1, 1975, pp. 312–327.
28. Mills, H.D., On the Statistical Validation of Computer Programs, Report No. 72-6015,
IBM: Federal Systems Division, Gaithersburg, MD, 1972.
29. ICT Facts and Figures, International Telecommunication Union, ICT Data and Statistics
Division, Telecommunication Development Bureau, Geneva, Switzerland, 2011.
30. Hilbert, M., Lopez, P., The World’s Technological Capacity to Store, Communicate,
and Compute Information, Science, Vol. 332, No. 6025, April 2011, pp. 60–65.
31. Barrett, R., Haar, S., Whitestone, R., Routing Snafu Causes Internet Outage, Interactive
Week, April 25, 1997, p. 9.
32. Kiciman, E., Fox, A., Detecting Application-Level Failures in Component-Based Internet
Services, IEEE Transactions on Neural Networks, Vol. 16, No. 5, 2005, pp. 1027–1041.
33. Chan, C.K., Tortorella, M., Spares-Inventory Sizing for End-to-End Service Availability, Proceedings of the Annual Reliability and Maintainability Symposium, 2001, pp. 98–102.
34. Imaizumi, M., Kimura, M., Yasui, K., Optimal Monitoring Policy for Server System
with Illegal Access, Proceedings of the 11th ISSAT International Conference on
Reliability and Quality in Design, 2005, pp. 155–159.
35. Hecht, M., Reliability/Availability Modeling and Prediction of E-Commerce and Other
Internet Information Systems, Proceedings of the Annual Reliability and Maintainability
Symposium, 2001, pp. 176–182.
36. Aida, M., Abe, T., Stochastic Model of Internet Access Patterns, IEICE Transactions on
Communications, Vol. E84-B, No. 8, 2001, pp. 2142–2150.
37. Dhillon, B.S., Kirmizi, F., Probabilistic Safety Analysis of Maintainable Systems,
Journal of Quality in Maintenance Engineering, Vol. 9, No. 3, 2003, pp. 303–320.
8 Power System Reliability
8.1 INTRODUCTION
An electric power system's main areas are generation, transmission, and distribution, and a modern electric power system's basic function is to supply its customers with cost-effective electrical energy at a high degree of reliability. In the context of an electric power system, reliability may simply be defined as concern regarding the system's ability to provide a satisfactory amount of electrical power [1].
The power system reliability’s history goes back to the early 1930s when probabil-
ity concepts were applied to electric power system-related problems [2–4]. The first
book on the subject in English appeared in 1970 [5]. Over the years, a large number of
publications, directly or indirectly, related to power system reliability have appeared.
Most of these publications are listed in Refs. [5–7].
This chapter presents various important aspects of power system reliability.
• LOLP does not take into consideration the factor of additional emergency
support that one region or control area may, directly or indirectly, receive
from another, or other emergency actions/measures that control area opera-
tors can exercise to maintain system reliability.
• LOLP itself does not specify the magnitude or duration of the electricity shortage.
• Major loss-of-load incidents generally occur because of contingencies not
modelled appropriately by the traditional LOLP calculation.
• Different LOLP estimation methods can result in different indices for
exactly the same electric power system.
8.4.1 Index I
This index is known as system average interruption frequency index and is
defined by
α_1 = \frac{CI_{tn}}{C_{tn}}  (8.1)
where
α1 is the system average interruption frequency.
CItn is the total number of customer interruptions per year.
Ctn is the total number of customers.
8.4.2 Index II
This index is concerned with measuring service quality (i.e., measuring the continuity
of electricity supply to the customer) and is expressed by [13, 14]
α_2 = \frac{(8,760)\, MTEI}{(8,760)\, MTTF + MTEI}  (8.2)
where
α_2 is the mean number of annual down hours (i.e., service outage hours) per customer.
MTTF is the mean time to failure, expressed in years (i.e., the average time between electricity interruptions).
MTEI is the mean time to electricity interruption, expressed in hours.
(8,760) is the total number of hours in one year.
Example 8.1
Assume that the annual failure rate of the electricity supply is 0.4 and the mean time to electricity interruption is 4 hours. Calculate the mean number of annual down hours (i.e., service outage hours) per customer.
In this case, MTTF (i.e., the average time between electricity interruptions) is
MTTF = \frac{1}{0.4} = 2.5 years
By inserting the above calculated value and the given values into Equation (8.2), we obtain
α_2 = \frac{(8,760)(4)}{(2.5)(8,760) + 4} = 1.599 hours per year per customer
Thus, the mean number of annual down hours (i.e., service outage hours) per customer is 1.599 hours.
8.4.3 Index III
This index is known as customer average interruption frequency index and is
expressed by
α_3 = \frac{CI_{tn}}{CA_{tn}}  (8.3)
where
α 3 is the customer average interruption frequency.
CI tn is the total number of customer interruptions per year.
CAtn is the total number of customers affected. It is to be noted that the customers
affected should only be counted once, irrespective of the number of inter-
ruptions throughout the year they may have experienced.
8.4.4 Index IV
This index is known as system average interruption duration index and is defined by
α_4 = \frac{CID_s}{C_{tn}}  (8.4)
where
α 4 is the system average interruption duration.
CIDs is the sum of customer interruption durations per year.
Ctn is the total number of customers.
8.4.5 Index V
This index is known as customer average interruption duration index and is
defined by
α_5 = \frac{CID_s}{CI_{tn}}  (8.5)
where
α 5 is the customer average interruption duration.
CIDs is the sum of customer interruption durations per year.
CI tn is the total number of customer interruptions per year.
8.4.6 Index VI
This index is known as average service availability index and is expressed by
α_6 = \frac{CH_{as}}{CH_d}  (8.6)
where
α 6 is the average service availability.
CH as is the customer hours of available service.
CH d is the customer hours demanded. These hours are given by the 12-month
average number of customers serviced times 8,760 hours.
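The six indices above are simple ratios, so they are convenient to compute together. The following Python sketch (illustrative function and field names, with hypothetical utility data) evaluates Equations (8.1)–(8.6) for one year of interruption records, using the α_2 form given in Equation (8.2).

def service_indices(total_customers, interruptions, affected_customers,
                    interruption_duration_sum, hours_available, hours_demanded,
                    mttf_years, mtei_hours):
    # Indices I-VI of Section 8.4 (Eqs. 8.1-8.6)
    return {
        "alpha1_system_avg_interruption_freq": interruptions / total_customers,
        "alpha2_annual_down_hours_per_customer":
            8760 * mtei_hours / (8760 * mttf_years + mtei_hours),
        "alpha3_customer_avg_interruption_freq": interruptions / affected_customers,
        "alpha4_system_avg_interruption_duration":
            interruption_duration_sum / total_customers,
        "alpha5_customer_avg_interruption_duration":
            interruption_duration_sum / interruptions,
        "alpha6_avg_service_availability": hours_available / hours_demanded,
    }

# Hypothetical data; mttf_years = 2.5 and mtei_hours = 4 reproduce Example 8.1
indices = service_indices(total_customers=10000, interruptions=4000,
                          affected_customers=2500, interruption_duration_sum=16000,
                          hours_available=87_560_000, hours_demanded=87_600_000,
                          mttf_years=2.5, mtei_hours=4)
print(round(indices["alpha2_annual_down_hours_per_customer"], 3))  # ~1.6 (1.599 in Example 8.1)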
8.5.1 Model I
This mathematical model represents a system composed of three active and iden-
tical single-phase transformers with one standby transformer (i.e., unit) [9]. The
system state space diagram is shown in Fig. 8.1. The numerals in circles denote
system states.
The model is subject to the following five assumptions [9, 11]:
FIGURE 8.1 State space diagram of three identical single-phase transformers with one on
standby.
The following symbols are associated with the state space diagram shown in Fig. 8.1
and its associated equations:
Pj is the probability that the system is in state j at time t; for j = 0 (three trans-
formers operating, one on standby), j = 1, (two transformers operating, one
on standby), j = 2 (three transformers operating, none on standby), j = 3 (two
transformers operating, none on standby).
λ is the transformer failure rate.
µ is the transformer repair rate.
α is the standby transformer/unit installation rate.
Using the Markov method presented in Chapter 4, we write down the following
equations for Fig. 8.1 state space diagram [9, 11]:
\frac{dP_0(t)}{dt} + 3λ P_0(t) − μ P_2(t) = 0  (8.7)
\frac{dP_1(t)}{dt} + α P_1(t) − 3λ P_0(t) − μ P_3(t) = 0  (8.8)
\frac{dP_2(t)}{dt} + (3λ + μ) P_2(t) − α P_1(t) = 0  (8.9)
\frac{dP_3(t)}{dt} + μ P_3(t) − 3λ P_2(t) = 0  (8.10)
Setting the derivatives in Equations (8.7)–(8.10) equal to zero and using the relationship P_0 + P_1 + P_2 + P_3 = 1, we obtain the following steady-state solutions:
P_0 = \left[1 + M_1(1 + M_2 + M_1)\right]^{−1}  (8.11)
where
M_1 = 3λ/μ  (8.12)
M_2 = (3λ + μ)/α  (8.13)
P_1 = M_1 M_2 P_0  (8.14)
P_2 = M_1 P_0  (8.15)
P_3 = M_1^2 P_0  (8.16)
where
P0 , P1 , P2 , and P3 are the steady state probabilities of the system being in states 0,
1, 2, and 3, respectively.
AVss = P0 + P1 + P2 (8.17)
where
AVss is the system steady state availability.
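The steady-state results of Equations (8.11)–(8.17) are straightforward to evaluate numerically. The following Python sketch (illustrative names, hypothetical rate values, and with M_2 = (3λ + μ)/α taken from the steady-state solution above) computes the system steady-state availability for the three-transformer system with one standby unit.

def transformer_system_availability(lam, mu, alpha):
    # Steady-state availability of three active single-phase transformers
    # with one standby unit, Eqs. (8.11)-(8.17)
    M1 = 3 * lam / mu               # Eq. (8.12)
    M2 = (3 * lam + mu) / alpha     # Eq. (8.13)
    P0 = 1.0 / (1.0 + M1 * (1.0 + M2 + M1))   # Eq. (8.11)
    P1, P2 = M1 * M2 * P0, M1 * P0             # Eqs. (8.14)-(8.15)
    return P0 + P1 + P2                        # Eq. (8.17)

# Hypothetical data: failure, repair, and standby-installation rates (per hour)
print(round(transformer_system_availability(lam=0.0001, mu=0.02, alpha=0.5), 6))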
8.5.2 Model II
This mathematical model represents a system composed of two non-identical and
redundant transmission lines subject to common-cause failures. A common-cause
failure may simply be described as any instance where multiple units fail due to
a single cause [15, 16]. In transmission lines a common-cause failure may take
place due to factors such as tornado, aircraft crash, and poor weather. The system
state space diagram is shown in Fig. 8.2. The numerals in circles and box denote
system states.
The following three assumptions are associated with the model:
The following symbols are associated with the state space diagram shown in Fig. 8.2
and its associated equations:
Pi (t ) is the probability that the system is in state i at time t, for i = 0 (both trans-
mission lines operating normally), i = 1 (transmission line a failed, other
operating), i = 2 (transmission line b failed, other operating), i = 3 (both
transmission lines failed).
λ c is the system common-cause failure rate.
λ ta is the transmission line a failure rate.
λ tb is the transmission line b failure rate.
µ ta is the transmission line a repair rate.
µ tb is the transmission line b repair rate.
FIGURE 8.2 State space diagram for two non-identical transmission lines.
Using the Markov method presented in Chapter 4, we write down the following
equations for Fig. 8.2 state space diagram [11, 15, 16]:
\frac{dP_0(t)}{dt} + (λ_{ta} + λ_{tb} + λ_c) P_0(t) − μ_{ta} P_1(t) − μ_{tb} P_2(t) = 0  (8.18)
\frac{dP_1(t)}{dt} + (λ_{tb} + μ_{ta}) P_1(t) − μ_{tb} P_3(t) − λ_{ta} P_0(t) = 0  (8.19)
\frac{dP_2(t)}{dt} + (λ_{ta} + μ_{tb}) P_2(t) − μ_{ta} P_3(t) − λ_{tb} P_0(t) = 0  (8.20)
\frac{dP_3(t)}{dt} + (μ_{ta} + μ_{tb}) P_3(t) − λ_{ta} P_2(t) − λ_{tb} P_1(t) − λ_c P_0(t) = 0  (8.21)
Setting the derivatives in Equations (8.18)–(8.21) equal to zero and using the relationship P_0 + P_1 + P_2 + P_3 = 1, we obtain the following steady-state solutions:
P_0 = μ_{ta} μ_{tb} N / N_3  (8.22)
where
N = N_1 + N_2  (8.23)
N_1 = (λ_{ta} + μ_{ta})  (8.24)
N_2 = (λ_{tb} + μ_{tb})  (8.25)
N_3 = N N_1 N_2 + λ_c \left[N_1(N_2 + μ_{ta}) + μ_{tb} N_2\right]  (8.26)
P_1 = \left[N λ_{ta} + N_4 λ_c\right] μ_{tb} / N_3  (8.27)
where
N_4 = (λ_{ta} + μ_{tb})  (8.28)
P_2 = \left[N λ_{tb} + N_5 λ_c\right] μ_{ta} / N_3  (8.29)
where
N_5 = (λ_{tb} + μ_{ta})  (8.30)
P_3 = \left[N λ_{ta} λ_{tb} + N_4 N_5 λ_c\right] / N_3  (8.31)
P0 , P1 , P2 and P3 are the steady state probabilities of the system being in state 0, 1, 2,
and 3, respectively.
The system steady state availability is given by
AVss = P0 + P1 + P2 (8.32)
where
AVss is the system steady state availability.
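The steady-state probabilities (8.22)–(8.31) can be checked numerically; with λ_c = 0 the two transmission lines behave as independent repairable units. The following Python sketch (illustrative names, hypothetical rates) evaluates the probabilities and the availability of Equation (8.32).

def transmission_availability(lam_a, lam_b, lam_c, mu_a, mu_b):
    # Steady-state probabilities for two non-identical transmission lines
    # with common-cause failures, Eqs. (8.22)-(8.32)
    N1, N2 = lam_a + mu_a, lam_b + mu_b
    N4, N5 = lam_a + mu_b, lam_b + mu_a
    N = N1 + N2
    N3 = N * N1 * N2 + lam_c * (N1 * (N2 + mu_a) + mu_b * N2)
    P0 = mu_a * mu_b * N / N3
    P1 = (N * lam_a + N4 * lam_c) * mu_b / N3
    P2 = (N * lam_b + N5 * lam_c) * mu_a / N3
    P3 = (N * lam_a * lam_b + N4 * N5 * lam_c) / N3
    assert abs(P0 + P1 + P2 + P3 - 1.0) < 1e-12   # probabilities must sum to one
    return P0 + P1 + P2                            # Eq. (8.32)

# Hypothetical data (per hour): line failure/repair rates and common-cause failure rate
print(round(transmission_availability(0.002, 0.003, 0.0005, 0.1, 0.12), 6))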
FIGURE 8.3 State space diagram of a system operating under fluctuating environments.
The following symbols are associated with the state space diagram shown in
Fig. 8.3 and its associated equations:
Using the Markov method presented in Chapter 4, we write down the following
equations for Fig. 8.3 state space diagram [11]:
\frac{dP_0(t)}{dt} + (λ_n + θ) P_0(t) − γ P_2(t) − μ_n P_1(t) = 0  (8.33)
\frac{dP_1(t)}{dt} + (μ_n + θ) P_1(t) − γ P_3(t) − λ_n P_0(t) = 0  (8.34)
\frac{dP_2(t)}{dt} + (λ_s + γ) P_2(t) − μ_s P_3(t) − θ P_0(t) = 0  (8.35)
\frac{dP_3(t)}{dt} + (γ + μ_s) P_3(t) − λ_s P_2(t) − θ P_1(t) = 0  (8.36)
P_0 = \frac{γ A_1}{θ(A_2 + A_3) + γ(A_4 + A_1)}  (8.37)
where
A_1 = μ_s θ + μ_n A_5  (8.38)
A_2 = μ_n γ + μ_s A_6  (8.39)
A_3 = λ_n γ + λ_s A_6  (8.40)
A_4 = λ_s θ + λ_n A_5  (8.41)
A_5 = λ_s + γ + μ_s  (8.42)
A_6 = λ_n + θ + μ_n  (8.43)
P_1 = A_4 P_0 / A_1  (8.44)
P0 , P1 , P2 , and P3 are the steady state probabilities of the system being in states 0, 1,
2, and 3, respectively.
The system steady state availability is given by
AVss = P0 + P2 (8.47)
where
AVss is the system steady state availability.
8.6.1 Model I
This mathematical model represents a power generator unit that can either be in
operating state or a failed state. The failed power unit is repaired. The power genera-
tor unit state space diagram is shown in Fig. 8.4. The numerals in the circles denote
the power generator unit state.
The following three assumptions are associated with the model:
The following symbols are associated with the diagram shown in Fig. 8.4 and its
associated equations:
Pi (t ) is the probability that the power generator unit is in state i at time t; for
i = 0 (operating normally), i = 1 (failed).
λ pg is the power generator unit constant failure rate.
θ pg is the power generator unit constant repair rate.
Using the Markov method presented in Chapter 4, we write down the following
equations for the state space diagram shown in Fig. 8.4 [11]:
\frac{dP_0(t)}{dt} + λ_{pg} P_0(t) − θ_{pg} P_1(t) = 0  (8.48)
\frac{dP_1(t)}{dt} + θ_{pg} P_1(t) − λ_{pg} P_0(t) = 0  (8.49)
Solving Equations (8.48) and (8.49) with the initial conditions P_0(0) = 1 and P_1(0) = 0, we get
P_0(t) = \frac{θ_{pg}}{λ_{pg} + θ_{pg}} + \frac{λ_{pg}}{λ_{pg} + θ_{pg}} e^{−(λ_{pg} + θ_{pg})t}  (8.50)
P_1(t) = \frac{λ_{pg}}{λ_{pg} + θ_{pg}} − \frac{λ_{pg}}{λ_{pg} + θ_{pg}} e^{−(λ_{pg} + θ_{pg})t}  (8.51)
Thus, the power generator unit availability and unavailability at time t are
AV_{pg}(t) = P_0(t) = \frac{θ_{pg}}{λ_{pg} + θ_{pg}} + \frac{λ_{pg}}{λ_{pg} + θ_{pg}} e^{−(λ_{pg} + θ_{pg})t}  (8.52)
and
UA_{pg}(t) = P_1(t) = \frac{λ_{pg}}{λ_{pg} + θ_{pg}} − \frac{λ_{pg}}{λ_{pg} + θ_{pg}} e^{−(λ_{pg} + θ_{pg})t}  (8.53)
where
AVpg (t ) is the power generator unit availability at time t.
UApg (t ) is the power generator unit unavailability at time t.
As time t becomes very large, Equations (8.52) and (8.53) reduce to
AV_{pg} = \frac{θ_{pg}}{λ_{pg} + θ_{pg}}  (8.54)
and
UA_{pg} = \frac{λ_{pg}}{λ_{pg} + θ_{pg}}  (8.55)
where
AVpg is the power generator unit steady state availability.
UApg is the power generator unit steady state unavailability.
Since λ_{pg} = 1/MTTF_{pg} and θ_{pg} = 1/MTTR_{pg}, Equations (8.54)–(8.55) become
AV_{pg} = \frac{MTTF_{pg}}{MTTR_{pg} + MTTF_{pg}}  (8.56)
and
UA_{pg} = \frac{MTTR_{pg}}{MTTR_{pg} + MTTF_{pg}}  (8.57)
where
MTTFpg is the power generator unit mean time to failure.
MTTRpg is the power generator unit mean time to repair.
Example 8.2
Assume that constant failure and repair rates of a power generator unit are as
follows:
λ pg = 0.0002 failures/hour
and
θ_{pg} = 0.0006 repairs/hour
Calculate the power generator unit steady-state availability.
By substituting the given data values into Equation (8.54), we obtain
AV_{pg} = \frac{0.0006}{0.0002 + 0.0006} = 0.75
Thus, the steady state availability of the power generator unit is 0.75.
8.6.2 Model II
This mathematical model represents a power generator unit that can either be in
operating state or failed state or down for preventive maintenance. This scenario is
depicted by the state space diagram shown in Fig. 8.5. The numerals in circles and
box denote the system state.
• The power generator unit failure, repair, preventive maintenance down, and
preventive maintenance performance rates are constant.
• The power generator unit failures are statistically independent.
• After repair and preventive maintenance, the power generator unit is as
good as new.
The following symbols are associated with the state space diagram shown in Fig. 8.5
and its associated equations:
Pj (t ) is the probability that the power generator unit is in state j at time t; for
j = 0 (operating normally), j = 1 (down for preventive maintenance), j = 2
(failed).
λ is the power generator unit failure rate.
λ p is the power generator unit (down for) preventive maintenance rate.
θ is the power generator unit repair rate.
θ p is the power generator unit preventive maintenance performance (repair)
rate.
Using the Markov method presented in Chapter 4, we write down the following
equations for the state space diagram shown in Fig. 8.5 [11]:
\frac{dP_0(t)}{dt} + (λ_p + λ) P_0(t) − θ_p P_1(t) − θ P_2(t) = 0  (8.58)
\frac{dP_1(t)}{dt} + θ_p P_1(t) − λ_p P_0(t) = 0  (8.59)
\frac{dP_2(t)}{dt} + θ P_2(t) − λ P_0(t) = 0  (8.60)
where
b_1 b_2 = θ_p θ + λ_p θ + λ θ_p  (8.64)
b_1 + b_2 = −(θ_p + θ + λ_p + λ)  (8.65)
It is to be noted that the above availability expression is valid if and only if b1 and b2
are negative. Thus, for large t, Equation (8.66) reduces to
AV_{pg} = \lim_{t→∞} AV_{pg}(t) = \frac{θ_p θ}{b_1 b_2}  (8.67)
where
AVpg is the power generator unit steady state availability.
Example 8.3
Assume that for a power generator unit we have the following data values:
λ = 0.0004 failures/hour
λ p = 0.0007/hour
θ p = 0.0008/hour
θ = 0.0005 repairs/hour
Calculate the power generator unit steady state availability. By substituting the
given data values into Equation (8.67), we get
AV_{pg} = \frac{(0.0008)(0.0005)}{(0.0008)(0.0005) + (0.0007)(0.0005) + (0.0004)(0.0008)} = 0.3738
Thus, the steady state availability of the power generator unit is 0.3738.
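Example 8.3 can be reproduced with a few lines of code. The following Python sketch (illustrative names) evaluates the steady-state availability of Equation (8.67), using b_1 b_2 from Equation (8.64).

def generator_availability_with_pm(lam, lam_p, theta, theta_p):
    # Steady-state availability of a power generator unit subject to failure
    # and preventive maintenance, Eqs. (8.64) and (8.67)
    b1b2 = theta_p * theta + lam_p * theta + lam * theta_p   # Eq. (8.64)
    return theta_p * theta / b1b2                            # Eq. (8.67)

# Data of Example 8.3
print(round(generator_availability_with_pm(lam=0.0004, lam_p=0.0007,
                                           theta=0.0005, theta_p=0.0008), 4))  # 0.3738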
The following symbols are associated with Fig. 8.6 diagram and its associated
equations:
Pi (t ) is the probability that the power generator unit is in state i at time t; for
i = 0 (operating normally), i = 1 (derated), i = 2 (failed).
λ is the power generator unit failure rate from state 0 to state 2.
λ d is the power generator unit failure rate from state 0 to state 1.
λ1 is the power generator unit failure rate from state 1 to state 2.
θ is the power generator unit repair rate from state 2 to state 0.
θd is the power generator unit repair rate from state 1 to state 0.
θ1 is the power generator unit repair rate from state 2 to state 1.
Using the Markov method presented in Chapter 4, we write down the following
equations for Fig. 8.6 state space diagram [11]:
\frac{dP_0(t)}{dt} + (λ + λ_d) P_0(t) − θ_d P_1(t) − θ P_2(t) = 0  (8.68)
\frac{dP_1(t)}{dt} + (θ_d + λ_1) P_1(t) − θ_1 P_2(t) − λ_d P_0(t) = 0  (8.69)
\frac{dP_2(t)}{dt} + (θ + θ_1) P_2(t) − λ_1 P_1(t) − λ P_0(t) = 0  (8.70)
P_0(t) = \frac{M_1}{m_1 m_2} + \frac{M_2}{m_1(m_1 − m_2)} e^{m_1 t} + \left[1 − \frac{M_1}{m_1 m_2} − \frac{M_2}{m_1(m_1 − m_2)}\right] e^{m_2 t}  (8.71)
where
m_1, m_2 = \frac{−M_3 ± \left[M_3^2 − 4M_4\right]^{1/2}}{2}  (8.74)
M_3 = θ + θ_1 + θ_d + λ + λ_1 + λ_d  (8.75)
P_1(t) = \frac{M_5}{m_1 m_2} + \frac{M_6}{m_1(m_1 − m_2)} e^{m_1 t} − \left[\frac{M_5}{m_1 m_2} + \frac{M_6}{m_1(m_1 − m_2)}\right] e^{m_2 t}  (8.77)
where
M_5 = λ_d θ + λ_d θ_1 + λ θ_1  (8.78)
M_6 = m_1 λ_d + M_5  (8.79)
P_2(t) = \frac{M_7}{m_1 m_2} + \frac{M_8}{m_1(m_1 − m_2)} e^{m_1 t} − \left[\frac{M_7}{m_1 m_2} + \frac{M_8}{m_1(m_1 − m_2)}\right] e^{m_2 t}  (8.80)
where
M_7 = λ_d λ_1 + θ_d λ + λ λ_1  (8.81)
M_8 = m_1 λ + M_7  (8.82)
AVpgo (t ) = P0 (t ) + P1 (t ) (8.83)
where
AVpgo (t ) is the power generator unit operational availability at time t.
8.7 PROBLEMS
1. Write an essay on power system reliability.
2. Define the following four terms:
• Forced derating
• Power system reliability
• Forced outage rate
• Forced outage
3. Describe loss of load probability.
4. What are the difficulties associated with the use of loss of load probability?
5. Define the following indices:
• System average interruption frequency index
• Customer average interruption duration index
• Average service availability index
6. Assume that the annual failure rate of the electricity supply is 0.5 and the
mean time to electricity interruption is 6 hours. Calculate the mean number
of annual down hours (i.e., service outage hours) per customer.
7. Assume that constant failure and repair rates of a power generator unit are
as follows:
• λ_{pg} = 0.0001 failures/hour
• θ_{pg} = 0.0005 repairs per hour
Calculate the steady-state unavailability of the power generator unit.
8. Prove that the sum of Equations (8.37), (8.44)–(8.46) is equal to unity.
9. Prove Equation (8.17) by using Equations (8.7)–(8.10).
10. Prove Equation (8.84).
REFERENCES
1. Billinton, R., Allan, R.N., Reliability of Electric Power Systems: An Overview, in
Handbook of Reliability Engineering, edited by Pham, H., Springer-Verlag, London,
2003, pp. 511–528.
2. Smith, S.A., Service Reliability Measured by Probabilities of Outage, Electrical World,
Vol. 101, 1934, pp. 371–374.
9 Medical Device Usability
9.1 INTRODUCTION
Each year, a vast sum of money is spent to produce various types of medical devices
around the globe. Their usability has become a very important issue, because vari-
ous studies performed over the years clearly indicate that poorly designed medical
devices’ human-machine interfaces significantly increase the risk for the occurrence
of human errors [1–4]. These errors can result in patient injury or even death.
Medical device usability may simply be defined as the quality of the medical device's interactive systems in regard to factors such as ease of use, ease of learning, and user satisfaction [3, 5]. This means that for attaining high user adoption and
to successfully navigate all involved regulatory processes, a medical device must
be properly designed around all types of users and must incorporate appropriate
defences against potential risks under varied conditions. More clearly, the medical
device usability-associated concerns in regard to users such as nurses, physicians,
patients, family members, and professional caregivers must be raised to the same
level as conventional economic, technological, and manufacturing concerns, during
the design phase.
This chapter presents various important aspects of medical device usability.
The user interfaces of medical devices are very important because they help to facilitate correct actions and to prevent or discourage the occurrence of hazardous actions. They comprise all elements of medical devices with which users interact while using devices, preparing them for use (e.g., setup and calibration), or conducting maintenance-related activities. More specifically, the user interface incorporates all hardware features that control the device operation, such as knobs, buttons, and switches, as well as features that provide information to the user, such as indicator lights, displays, and auditory and visual alarms.
Nowadays, in medical devices, the user interfaces are usually computer based.
Thus, in this case interface characteristics include items such as follows [4, 6]:
• Navigation logic
• Data entry requirements
• Control and monitoring screens
• Alerting mechanisms
• The manner in which data are organised and presented
• Screen elements
• Help functions
• Prompts
• Keyboards
• Mouse
Finally, it is to be noted that items such as device labelling, training materials, pack-
aging, and operation instructions are also considered part of the user interface, and
thus require a careful consideration in regard to their effective usability.
In order to understand a medical device's use completely and accurately, a clearly written description of its use is essential. The description includes information on items such as the following [4, 6]:
• User needs for effective and safe use of the device and how the device satisfies them
• Device operation
• User population characteristics
• User interface design or preliminary design
• Use environments
• General use scenarios
The medical devices' use environments can vary quite significantly from one situation to another and can have major impacts on their usability. The four main factors with respect to device users that must be considered carefully are as follows [4]:
All in all, this FDA study indicated that errors in using medical devices cause, each
day in the United States, an average of at least three deaths or serious injuries [7].
Six steps of a general approach for developing medical devices' effective user interfaces are as follows [3, 9]:
It is to be noted that Step 2 is also concerned with allocating tasks between the
humans and the system and in Step 3, design prototypes and usability goals are
also developed. In Step 4, the results of usability testing are also evaluated against
performance objectives and goals, and a loop back to Step 3 is made as appropriate.
In Step 6, a loop back to Step 3 is made as necessary.
• Guideline IV: Ascribe to a Grid. Past experiences, over the years, indi-
cate that most screens generally operate and look better when their screen
components are aligned and they serve a utilitarian objective effectively.
Grid-based screens are generally easier to implement in computer code
because of visual elements’ predictability. The following two guidelines
are considered quite useful with respect to ascribing to a grid:
• Keep on-screen elements at a fixed distance from the grid lines.
• Begin by defining the screen’s dominant elements and approximate
space-related requirements when developing a grid structure.
• Guideline V: Harmonise and refine icons. This is a very important
guideline and some of the actions that can be taken to give the icons a
FIGURE 9.1 Useful actions to provide appropriate navigation cues and options.
FIGURE 9.2 Factors to be considered in designing medical devices for use by older personnel.
five or six Americans will be over 65 years of age [7, 11]. It means there is a definite
need to design medical devices for use by older people by considering factors such as
those shown in Fig. 9.2 during the device design phase [7, 12–14]. The factors shown
in Fig. 9.2 are sensory limitations, cognitive limitations, and physical limitations.
In regard to sensory limitations, the two common limitations among older person-
nel are impaired vision and hearing. To overcome impaired vision associated limi-
tations, the designers must give careful consideration to using somewhat oversized
fonts for displays, readouts, and labels on medical devices. Decline in the hearing
ability of a person is generally a function of age. Males and females, as they age, suf-
fer greater hearing loss at frequencies in the 3000–6000 Hz range and 550–1000 Hz
range, respectively. Therefore, designers must carefully consider these factors in medi-
cal devices to be used by older people.
In regard to cognitive limitations, past experiences clearly indicate that the cog-
nitive abilities of older people can vary quite significantly. Some may experience
attention deficits often referred to as “cognitive rigidity”, a condition that makes it
quite difficult to learn new procedures and approaches, so designers must lower the
number of steps in a given procedure concerning a medical device for improving its
usability effectiveness among the older population [4, 7].
Finally, in regard to physical limitations, a significant number of people generally lose
10%–20% of their strength by the time they reach 60–70 years of age. Their mobility
may also be limited by various types of joint-associated problems. In order to overcome
physical limitations such as these, the medical devices’ designers should incorporate
controls with large-diameter knobs so that rotation needs lesser fine control, textured
knob surfaces that need less pinching strength for overcoming finger slippage, and so on.
hands [15]. CTD presents a significant risk to healthcare workers, but it can be pre-
vented through actions such as better medical device design and better work habits
reinforced through effective warning labels and instructions [4, 16].
Some guidelines for designing hand-operated medical devices with respect to
CTD are as follows [4, 7, 17]:
• Select appropriate materials for handles that provide a non-slip grip and
protection to the hands of users from electrical conduction, vibrations, and
cold temperatures.
• Design objects in such a way that they can easily be grasped by the entire hand, rather than pinched between the fingers and the thumb, in circumstances where high precision is not required.
• Position work surfaces in such a way that permits forearms to extend at an
angle of around 90 degrees with respect to the user’s body, with the elbows
held at one’s side.
• Provide padding and ergonomically contoured surfaces for reducing con-
centration of mechanical stresses on the user’s skin and underlying tissues.
• Ensure that an adequate amount of space is provided for forearm and hand
movements for preventing potential device users from assuming poor hand
postures while performing tasks.
• Ensure that gripping controls and surfaces are designed in such a way that
they enable potential users to keep their hands in a neutral, resting position.
• Provide force-assist mechanisms as necessary for decreasing the muscle
exertion needed to operate a device.
• Perform analysis of the range of user hand motion as a basis to determine
the dynamic characteristics of controls and handles.
• Design heavy/awkwardly shaped objects in such a way that they can easily
be lifted/grasped by device users with both hands.
• Design hand-operated devices in such a way that they will be comfortable
for individuals with different hand sizes.
• Develop operational sequences that prevent the frequent occurrence of a
repetitive movement.
• Provide effective advisory instructions (visual or otherwise) with respect to holding a device.
• Reduce the weight of objects so that they can easily be picked up or moved.
• Avoid those designs that will require device users to exert force continuously.
• Provide appropriate instructions to device users on how to prevent the
occurrence of CTDs.
• Shield devices to reduce vibrations they will transmit to their potential users.
9.8 PROBLEMS
1. List at least seven important characteristics of potential medical device users that should be taken into consideration during the design process.
2. Discuss the main factors with respect to device users that must be consid-
ered carefully.
3. List at least twelve medical devices with high incidence of user/human
error.
4. Describe a general approach for developing medical devices’ effective user
interfaces.
5. List at least nine guidelines for making medical device interfaces more
user-friendly.
6. Discuss the following three factors with respect to medical device use
environments:
• Noise and light
• Mental workload
• Physical workload
7. What are the important factors that must be considered during the medical
device design phase when the device is going to be used by persons over
65 years of age?
8. List at least twelve useful guidelines for designing hand-operated medical
devices with respect to CTD.
9. Write an essay on medical device usability.
10. List at least six most useful documents to improve usability of medical
devices.
REFERENCES
1. Obradovich, J.H., Woods, D.D., Users as Designers: How People Cope with Poor HCI
Design in Computer-Based Medical Devices, Human Factors, Vol. 38, 1996, pp. 40–46.
2. Hyman, W.A., Errors in the Use of Medical Equipment, in Human Error in Medicine,
edited by Bogner, M.S., Lawrence Erlbaum Associates, New York, 1994, pp. 327–347.
3. Garmer, K., et al., Arguing for the Need of Triangulation and Iteration When Designing
Medical Equipment, Journal of Clinical Monitoring and Computing, Vol. 17, 2002,
pp. 105–114.
4. Dhillon, B.S., Engineering Usability: Fundamentals, Applications, Human Factors, and
Human Error, American Scientific Publishers, Stevenson Ranch, California, 2004.
5. ISO 13407, User-Centered Design Process for Interactive Systems, International
Organization for Standardization (ISO), Geneva, Switzerland, 1999.
10 Software Usability
10.1 INTRODUCTION
Over the years, due to the increasing development of software for interactive applications, attention to the requirements and preferences of potential end users has intensified quite significantly. Nowadays, the user interface quite often plays a very important role in the success or failure of a software project. Furthermore, as per Ref. [1], around 50%–80% of all source code development is accounted for, directly or indirectly, by the user interface.
Generally, user-friendly software enables all its potential users to conduct their tasks easily and intuitively, and it clearly supports rapid learning and high skill retention for all the involved individuals.
environment, usability is not a luxury, but a basic ingredient in software systems, as
the users’ productivity as well as comfort relate directly to it.
Software usability may simply be defined as quality in software application or
use: specifically, how productively its users will be able to conduct their tasks, how
much support the users will require, and how easy and straightforward the software
is to learn and use [2, 3].
This chapter presents various important aspects of software usability.
method are that it is quite straightforward to learn and use, allows iterative evaluation/testing, and is quite useful in satisfying the criteria of all involved parties.
• Cognitive walkthrough: This method makes use of a detailed procedure for simulating task execution at each step of the dialogue to determine whether the simulated user's memory content and goals can safely be assumed to lead to the next correct anticipated action. The principal benefits of this method are that it is a quite effective approach for predicting problems and for capturing the cognitive process. In contrast, its two main drawbacks are that it focuses on only one attribute of usability and that a skilled evaluator needs to be trained [12].
• Guidelines checklists: These are quite useful for ensuring that the appropriate
usability principles will be considered in software design work. A checklist
provides involved inspectors with a basis for comparing the software product.
Generally, checklists are employed in conjunction with a usability inspection/
evaluation method.
• Heuristic evaluation: This method involves usability specialists who determine whether each and every dialogue element satisfies the set usability principles effectively. Some of the main benefits of this method are that it is straightforward to learn and use, useful for highlighting problems early in the design process, and inexpensive to implement.
• Standards inspection: This method involves usability experts who inspect
the interface for compliance with given standards. These standards could
be user-interface standards, domain-specific software standards, or depart-
mental standards (if any).
selection process must take into consideration with care, the different foci of the
evaluation. Most of these considerations/foci are as follows [13]:
data. Normally, these data are in the form of performance metrics (e.g.,
required time to conduct specific tasks). Finally, it is to be noted that the
International Organization for Standardization (ISO) promotes the usabil-
ity evaluation method based on measured performance of predetermined
usability metrics [15].
Finally, factors such as those shown in Fig. 10.2 are considered very important in
regard to software usability testing methods [16].
Additional information on the above four methods is available in Refs. [5, 13, 16].
10.8 PROBLEMS
1. What are the important factors that dictate the need for considering usabil-
ity during the software development process?
2. What are the fundamental principles of the human-computer interface in
regard to software?
3. Describe the software usability engineering process.
4. Discuss the steps for improving software product usability.
5. What are the widely used methods for inspecting/evaluating software
usability? Describe at least two of these methods.
6. Describe the following two software usability testing methods:
• Performance measurement
• Thinking-aloud protocol
7. What are the important factors in regard to software usability testing methods?
8. Discuss at least four useful guidelines to perform software usability testing.
9. List at least six factors that must be considered during the selection of soft-
ware usability inspection methods.
10. Write an essay on software usability.
REFERENCES
1. Myers, B.A., Rosson, M.B., Survey on User Interface Programming, Proceedings of the
ACM CHI’92 Human Factors in Computing Systems Conference, 1992, pp. 195–202.
2. International Organization for Standardization (ISO), Software Product Evaluation:
General Overview, ISO/IEC 14598-1, Geneva, Switzerland.
3. Juristo, N., Windl, H., Constantine, L., Introducing Usability, IEEE Software, Vol. 18
(January/February), 2001, pp. 20–21.
4. Dhillon, B.S., Engineering Usability: Fundamentals, Applications, Human Factors, and
Human Error, American Scientific Publishers, Stevenson Ranch, California, 2004.
11 Web Usability
11.1 INTRODUCTION
A few decades ago, the World Wide Web (WWW) was released by the European Laboratory for Particle Physics (CERN), and it has since mushroomed from just 623 sites in 1993 to hundreds of millions of sites around the globe [1]. Nowadays, the usage of the web has become an instrumental factor in the global economy. For example, in 2001 the global e-commerce market was estimated to be around $1.2 trillion, and today it has grown to many trillions of dollars [1, 2]. Moreover, there are billions of web users throughout the world.
Web usability may simply be expressed as allowing the users to manipulate fea-
tures of a website for accomplishing certain goals [1, 2]. Some of the main goals of
web usability are as follows [2, 3]:
• Provide the correct choices to all the potential users, and do so in a very
obvious way.
• Present the information to all potential users in a clear and concise fashion.
• Put the most important thing in the appropriate place on a web page or a
web application.
• Remove any ambiguity whatsoever concerning an action’s consequences
(e.g., clicking on remove/delete/purchase).
Nowadays, usability rules the web: if a website is not easy and straightforward to use, people simply leave and move on to something else. This chapter presents various important aspects of web usability.
• User interface accounts for about 47%–60% of the lines of system or appli-
cation code [10].
• A number of studies have shown that approximately 10% of all users scroll
beyond the information that is visible on the screen when a web page
appears [2, 9, 11].
• In 2000, over 50% of the companies in the United States sold their products
online, and there were over 800 million pages on the web throughout the
United States [12, 13].
• Content authoring error. This type of error occurs when the designer
writes in the normal linear style instead of writing for potential online readers,
who often scan text and need short pages, with secondary information
relegated to supporting pages.
• Page design error. This type of error frequently occurs when the emphasis
is on creating attractive pages for evoking good feelings about the organisa-
tion/company. Instead, the emphasis should be on designing for an optimal
user experience under a day-to-day environment. Furthermore, utility is
more important than attractive pages.
• Linking strategy error. This type of error occurs when the designer treats
the website as an indispensable entity and does not provide appropriate
links to other sites. Beyond not having proper entry points where others can
link, many organisations/companies even overlook the use of essential links
to their own site in their very own advertisements.
• Project management-associated error. This type of error occurs when a
web project is managed simply as a conventional corporate project rather
than as a single-customer-interface project. The main drawback of the
traditional method is that it generally leads to a rather internally focused
design with an inconsistent user interface.
• Information architecture-associated error. This type of error occurs when
the website is constructed for mirroring the organisational structure rather
than structuring it to mirror the tasks of users and their specific views of
the information space.
• Business model-associated error. This type of error occurs when the
designer treats the web as a marketing communication (Marcom) brochure,
rather than recognizing it as a paradigm shift that will ultimately change
the way that business-associated transactions are conducted in this age of a
networked economy.
The important usability-related dos and don'ts in regard to page design are presented
in Table 11.1 [14].

TABLE 11.1
Web Page Design: Important Usability Dos and Don'ts
In web page design, some of the important factors that should be considered
with utmost care are shown in Fig. 11.1 [14]. Each of the important factors shown
in Fig. 11.1 is described in detail in Sections 11.4.1–11.4.5.
In regard to “The downloading and displaying speed of pages”, the length of time
taken to download a page from the server and display it in the browser window is
a very important factor in sizing web pages effectively. Response time may simply
be expressed as the time from when a user requests a page to when it has been
displayed completely. Some useful guidelines concerned with response time are as
follows (a small illustrative sketch follows this list):
• Ensure that the response time is well within ten seconds for keeping the
attention of potential users.
• Ensure that the response time is within one second to fit into the chain of
potential users’ thoughts.
• Ensure that the response time is within 0.1 second in order to make the
system feel interactive.
• Provide an adequate warning to potential users when a web page will
require more than 10 seconds to download.
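As a rough illustration of the thresholds listed above, the following minimal sketch classifies measured page response times against them; the function name and the sample timings are hypothetical, and measuring the response times themselves is assumed to happen elsewhere.

```python
def classify_response_time(seconds: float) -> str:
    """Classify a measured page response time against the usual
    0.1 s / 1 s / 10 s web usability thresholds."""
    if seconds <= 0.1:
        return "feels interactive"
    if seconds <= 1.0:
        return "fits the user's chain of thought"
    if seconds <= 10.0:
        return "holds attention, but feels slow"
    return "exceeds 10 s: warn the user or reduce page size"

# Hypothetical measured response times for three pages
for page, t in [("home", 0.4), ("search", 1.8), ("report", 12.3)]:
    print(f"{page}: {t:.1f} s -> {classify_response_time(t)}")
```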
In regard to “The flexibility of the pages to fit the available display area”, some of
the guidelines considered quite useful for its successful achievement are as follows:
• Design web pages in such a way that they can easily be resized (i.e., to fit
within a wide range of window sizes).
• Generally, no horizontal scrolling is needed when the window is 800 pixels
wide.
• Ensure that all the key page elements are clearly visible without scrolling when
the window is 400 pixels in height.
• Use relative, instead of absolute, sizes for all elements that fall under a
browser’s resizing capability.
• During the design process, pay close attention to the ability for resizing
footers, headers, etc.
11.4.2 Font Usage
Fonts are used for creating a variety of web page elements, including buttons, naviga-
tion bars, links, menus, footers and headers, and tables, in addition to the text that
conveys most of the content of a website. Font faces fall under two basic categories:
serif and sans-serif. Serif fonts have small appendages at the bottoms and tops of
letters. Three examples of such fonts are Times Roman, Courier, and Century. These
fonts are useful as they make it easier to read long lines.
Sans-serif fonts are simpler in shape because they consist of only basic line
strokes. Two examples of sans-serif fonts are Helvetica and Arial. Some of the use-
ful pointers directly or indirectly concerned with font usage are as follows [2, 14]:
• As much as possible, avoid specifying absolute font sizes, and avoid getting carried
away with the use of font sizes, faces, and styles.
• Note that different browsers support different font faces.
• Use italics for defining terms or emphasising an occasional word.
information, but also clearly reflects how websites are used. Some of the guidelines
considered quite useful in writing effective web page text are as follows [14]:
11.4.4 Image Usage
Users quite often blame web page images for impeding successful web access. In
this regard, the following guidelines on the use of images may be quite helpful [2, 9, 14]:
11.4.5 Help Users
Past experience over the years clearly indicates that web users generally do not read
web pages in a serial manner; rather, they hop from one visual element to another.
Thus, the biggest challenge faced by web designers is to use the visual elements most
effectively for drawing the attention of potential users to key elements. Furthermore, it
should be emphasised that users will not read text unless they are specifically enticed
to do so. The following list presents some of the important guidelines/pointers [14]:
• Use the same colour for developing a common thread among elements that
cannot be placed next to each other.
• Ensure that the visual-highlighting approaches’ application is consistent
throughout the website.
• Test each web page’s final design by eliminating all visual elements in ques-
tion, one at a time.
• Past experience clearly indicates that users generally assume that a row of
similar elements should be “read” from left to right or from top to bottom.
• Reinforce the hierarchy of web page contents with a visual dominance
hierarchy.
TABLE 11.2
Website Design: Important Usability Dos and Don’ts
• Ensure that the key text has the highest possible contrast.
• Use size for making users understand which elements fall where in regard
to the content hierarchy.
• Past experience clearly indicates that items above and to the left of the page
centre appear to be noticed first.
11.5.1 Site Organisation
Past experience over the years clearly indicates that site organisation needs careful
consideration during design because users generally do not read web pages the way they read
books. Some useful guidelines concerning website organisation are as follows [9, 14]:
• Ensure that the pertinent information is positioned in such a way that it is still
clearly visible even when the browser window is shrunk to around 50% of the
screen width.
• Organise the site into many bite-size pieces capable of being traversed in
varying ways to take advantage of the web’s navigational flexibility.
• Ensure that pointers to related topics are clearly visible somewhere in the
upper half of the page in question.
• Provide users some content on each and every page.
• Do not display blocks of text in a large font.
TABLE 11.3
Navigation Aids: Important Usability Dos and Don’ts
11.6.1 Link Usage
Links are probably the most common mechanism that directly or indirectly support web-
site navigation, and web pages make use of links in the following three ways [2, 14, 15]:
• To direct users to an alternative source when the current page does not
contain the required information.
• To provide efficient access to the website’s other pages.
• To direct users to pages that contain additional information on the text/
graphic stated in the link.
The following list presents some guidelines considered quite useful in regard to
the effective use of links [14, 15]:
• Underline the words that really matter for improving the link’s readability.
• Make the image itself the link when there is a definite need to link to a
larger copy of an image.
• Group the links into categories when multiple links show up in a list.
• Make use of standard colours and underline all links.
• Format all links by utilising lowercase and uppercase letters.
• Locate all the alternative links at the top of the page.
• Select link text with utmost care.
• Ensure that all menus are anchored properly to a menu bar across the top
of the web page.
• Ensure that all menu titles form a consistent group and are short.
• Format all menu titles and menu items by utilising uppercase and lower-
case letters.
• Ensure that all menu items are grouped together logically.
• Avoid using cascading (i.e., multilevel) menus.
• Factor I: The selection of navigation labels and structure with utmost care.
• Factor II: The selection of the top ten things (during the navigation bar
development process) that all users are most likely to do on the site in
question.
The following seven steps are considered quite useful in selecting navigation
structure and labels [14, 15]:
• Step III: Spread out the cards with care and group them into logical
categories.
• Step IV: Highlight at least five users and have them repeat the preceding
three steps.
• Step V: Compare the findings of all the sortings. If a pattern of
classifications/categories is not emerging, then repeat the entire process
(i.e., all the steps).
• Step VI: When the classifications/categories arrived at by most users look
similar, take advantage of them for developing an outline of the website
structure.
• Step VII: Present the structure's outline to at least five other users and ask
for their input. Repeat the process as the need arises.
• Category I: Form use. Under this category, the problems are concerned
with the form Submit and Reset buttons.
• Category II: Readability. Under this category, the problems are concerned
with content readability.
• Category III: Performance. Under this category, the problems are concerned
with the size and coding of graphics in regard to page download speeds.
• Category IV: Accessibility. Under this category, the problems are concerned
with the page making appropriate use of tags for visually impaired users.
• Category V: Maintainability. Under this category, the problems are con-
cerned with tags and coding information that would make the page easier
to port to another server.
• Category VI: Navigation. Under this category, the problems are concerned
with the coding of links.
All in all, the main limitation of the Web SAT is that it can examine or check only
individual web pages.
11.7.2 Max
This is another useful usability tool; it scans through a website, collecting vital
statistics and ratings concerning the site's usability. Max uses a statistical model that
simulates the experience of a user to calculate ratings in the following three
areas [2, 16]:
• Area I: Accessibility. In this case, Max estimates the mean time a user takes
to find something on the site under consideration.
• Area II: Content. In this case, Max summarises the percentage of different
media elements (i.e., graphics, multimedia, and text) in addition to client-side
technologies utilised (e.g., Flash and Portable Document Format [PDF]) that
comprise the website.
• Area III: Load time. In this case, Max estimates the mean time to load
website pages.
The principal weakness of Max is that it does not provide many suggestions for mak-
ing changes to design. In contrast, its main strength is that it provides a performance
benchmark.
11.7.3 NetRaker
NetRaker consists of a number of online tools that help to highlight usability-related
problems and conduct market research. NetRaker provides a set of comprehen-
sive guidelines for composing objective survey questions and a customisable set of
usability survey templates. The questions are randomly made available to the website's
users, who are given the option to participate.
The survey requires users to conduct tasks on the website and then provide feedback
on how easy the tasks were to carry out. Some of the main benefits of NetRaker
are as follows [2, 16]:
• This is a very useful tool for obtaining feedback in the context of a website's
intended purpose, as opposed to relying totally on generic hypertext markup
language (HTML) checks or statistical analysis.
• This is a quite useful tool to survey users and gather usability-associated
feedback quickly.
• The NetRaker automation ensures that all users are surveyed consistently.
Finally, it is added that NetRaker is one of the best tools/methods for identifying
usability-related issues because it is based on users’ direct feedback.
11.7.4 Lift
This is another usability tool used for performing analysis of a web page for uncover-
ing usability-related problems. There are the following two types of Lift [2]:
• Type I: Lift Online. This carries out HTML checks derived from usability
principles in a similar way to Web SAT. More specifically, it checks one
page at a time and then provides a report on the usability-related issues of a
page. Furthermore, Lift Online goes a step further than Web SAT because
it provides appropriate code change-related recommendations.
• Type II: Lift Onsite. This can be easily run from a personal computer (PC),
and it provides the very compelling feature of directly fixing the HTML-
associated problems as they are being reviewed in the usability evaluation
report.
All in all, Lift provides usability-based HTML validations for ensuring good coding
practices.
11.8.1 Concept
Some of the questions pertaining to this area are as follows [13, 17–19]:
• What existing websites can be compared with the one in question or under
consideration?
• What expectations will the website raise for its visitors?
• What basic image of the company/organisation does the site project?
• What does the first page clearly promise concerning the rest of the website?
• Can the rest of the site satisfy this promise appropriately?
11.8.2 Content
Some of the questions pertaining to this area are as follows [13, 17–19]:
11.8.3 Text
Some of the questions pertaining to this are as follows [13, 17–19]:
• Does the first page effectively convey a clear message to potential visitors,
including what they can expect to find in the website?
• Are the titles and subheadings informative enough for their effective
application?
• Is the text on the first and feature pages short enough for its effective
application?
• Are all of the titles appropriate for their effective use by search engines?
• Are the hyperlinks and button titles straightforward and clear?
• Can the text be read appropriately in a cursory manner?
• Is the text under consideration grammatically checked?
• Is the text sufficiently attractive for reading?
11.8.4 Mechanics
Some of the questions pertaining to this area are as follows [13, 17–19]:
• Do tools such as roll-down menus and mouse over events (if utilised) clearly
support the site’s use?
• Are all hyperlinks and buttons operating as per requirements?
• How quickly does the site react; how quickly do the pages load?
• How functional is the website under consideration?
• Are there any error-associated messages?
11.8.5 Design
Some of the questions pertaining to this area are as follows [13, 17–19]:
11.8.6 Navigation
Some of the questions pertaining to this area are as follows [13, 17–19]:
11.9 PROBLEMS
1. Define the term “Web usability” and write an essay on web usability.
2. List at least five web usability-associated facts and figures.
3. Discuss commonly occurring web design-related errors.
4. What are the important factors that must be considered in the web page
design?
5. List at least four web page design usability-related dos and don’ts.
6. List at least three website design usability-associated dos and don’ts.
7. What are the important factors to be considered in website design? Discuss
at least two of these factors.
8. List at least three navigation-aids-related usability dos and don’ts.
9. What are the important factors to be considered with respect to navigation
aids? Discuss at least one such factor.
10. Describe the following two tools used for evaluating web usability:
• Max
• Web SAT
REFERENCES
1. Powell, T., Web Design: The Complete Reference, Osborne McGraw-Hill, Berkeley,
California, 2000.
2. Dhillon, B.S., Engineering Usability: Fundamentals, Applications, Human Factors, and
Human Error, American Scientific Publishers, Stevenson Ranch, California, 2004.
3. Cloyd, M.H., Designing User-Centered Web Applications in Web Time, IEEE Software,
Vol. 18, No. 1, 2001, pp. 62–69.
4. Chi, E.H., Improving Web Usability Through Visualization, IEEE Internet Computing,
Vol. 6, No. 2, 2002, pp. 64–71.
5. Manning, H., McCarthy, J.C., Souza, R.K., Why Most Web Sites Fail (White Paper),
Forrester Research, Cambridge, Massachusetts, September 1998.
6. Souza, R.K., Manning, H., Goldman, H., Tong, J., The Best of Retail Site Design
(White Paper), Forrester Research, Cambridge, Massachusetts, October 2000.
7. Becker, S.A., Mottay, F.E., A Global Perspective on Web Site Usability, IEEE Software,
Vol. 18, No. 1, 2001, pp. 54–61.
12.1 INTRODUCTION
Each year, a vast sum of money is spent on health care around the globe. For
example, in 1992 the United States alone spent around $840 billion on health care,
or about 14% of its gross domestic product (GDP) [1]. Furthermore, since 1960 the
health care-related spending in the United States has increased from 5.3% of the gross
national product (GNP) to about 13% in 1991 [2].
The history of quality in health care goes back to the 1860s, when Florence
Nightingale (1820–1910), a British nurse, helped to lay the foundation for the health
care quality assurance programmes, by advocating the need for a uniform system for
the collection and evaluation of hospital-associated statistics [1]. Her analysis of the
data collected clearly showed that mortality rates varied quite significantly from one
hospital to another.
In 1914, E.A. Codman (1869–1940) in the United States studied the results of
health care in regard to quality and, when examining the quality of care, clearly
emphasised issues such as the accreditation of institutions, the importance of licen-
sure or certification of providers, the need to properly take into consideration the
severity or stage of the disease, the economic-related barriers to receiving care, and
the patients' health and illness behaviours [1, 3].
This chapter presents various important aspects of quality in health care.
• Clinical audit. This is the process of reviewing the delivery of care against
established standards to highlight and remedy all deficiencies through a
process of continuous quality improvement.
• Quality assurance. This is the measurement of the degree of care given
(assessment) and, when appropriate, mechanisms for improving it.
• Dimensions of quality. These are the measures of health system per-
formance, including measures of effectiveness, appropriateness, safety,
capability, sustainability, accessibility, responsiveness, continuity, and
efficiency.
• Total quality management (TQM). This is a philosophy of pursuing contin-
uous improvement in each and every process through the integrated efforts
of all concerned persons associated with the organisation.
• Quality improvement. This is the total of all the appropriate activities that
create a desired change in quality.
• Cost of quality. This is the expense of not doing effectively all the right
things right the first time.
There are many reasons for the rising health care-related cost. Six of these reasons
are as follows [6]:
All of the above six reasons are discussed in detail in Refs. [2, 6].
TABLE 12.1
Comparisons of Traditional Quality Assurance and Total Quality Management in Regard to Health Care

No. | Area (Characteristic) | Traditional Quality Assurance | Total Quality Management
1 | Scope | Clinical processes and outcomes | All processes and systems (i.e., clinical and non-clinical)
2 | Purpose | Enhance quality of patient care for patients | Enhance all products and services quality for patients and other customers
3 | Focus | Peer review vertically focused by clinical process or department (i.e., each department looks after its own quality assurance) | Horizontally focused peer review for improving all processes and individuals that affect outcomes
4 | Leadership | Physician and clinical leaders (i.e., clinical staff chief and quality assurance committee) | All leaders (i.e., clinical and non-clinical)
5 | Aim | Problem solving | Continuous improvement, even when no deficiency/problem is identified
6 | Customer | Customers are review organisations and professionals, with focus on patients | Customers are review organisations, professionals, patients, and others
7 | Outcomes | Includes measurement and monitoring | Includes also measurement and monitoring
8 | People involved | Appointed committees and quality assurance programme | Each and every individual involved with the process
9 | Methods | Includes hypothesis testing, nominal group techniques, chart audits, and indicator monitoring | Includes Pareto chart, force field analysis, checklist, fishbone diagram, flow charts, control chart, Hoshin planning, etc.
TABLE 12.2
Comparisons of Quality Assurance and Quality Improvement in Health Care
Institutions
Over the years, there have been varying reactions of physicians to TQM. Seven of
the typical ones are as follows [2]:
• Cost-benefit analysis
• Brainstorming
• Check sheets
• Multivoting
• Force field analysis
• Affinity diagram
• Cause and effect diagram
• Control charts
• Proposed options matrix
• Pareto chart
• Prioritisation matrix
• Scatter diagram
• Histogram
• Process flowchart
12.6.2 Brainstorming
The objective of brainstorming in health care quality is to generate ideas and options
or to highlight problems and concerns. It is quite often referred to as a form of divergent
thinking because the basic objective is to enlarge the number of ideas being con-
sidered. Thus, brainstorming may simply be described as a group decision-making
approach designed for generating many creative ideas by following an interactive
process. The team concerned with health care quality can make use of brainstorming
for getting its ideas organised into a quality method such as a process flow diagram
or a cause and effect diagram.
Past experience over the years clearly indicates that questions such as those presented
below can be very useful for starting a brainstorming session concerned with health care
quality [12].
Six guidelines that are considered very useful for conducting effective brain-
storming sessions are as follows [16, 17]:
12.6.4 Multivoting
This is a quite useful method for reducing a large number of ideas to a manageable few
judged important by the participating personnel. Generally, by following this approach,
the number of ideas is reduced to three to five [2]. Multivoting is a form of convergent
thinking because the objective is to lower the number of ideas being considered. Needless
to say, multivoting is considered to be a very useful tool for application in the area of
health care quality; additional information on the method is available in Ref. [21].
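To make the convergent character of multivoting concrete, the short sketch below tallies participants' votes and keeps only the most-supported ideas; the ballot data and the cut-off value are illustrative assumptions, not part of the method description above.

```python
from collections import Counter

def multivote(votes, keep=5):
    """Reduce a large idea list to the few ideas with the most votes.

    votes: iterable of idea labels, one entry per vote cast.
    keep:  how many top ideas to retain (commonly three to five).
    """
    tally = Counter(votes)
    return [idea for idea, _ in tally.most_common(keep)]

# Hypothetical example: each participant votes for several ideas
ballots = ["reduce wait times", "better signage", "reduce wait times",
           "hand hygiene audits", "better signage", "reduce wait times",
           "discharge checklists", "hand hygiene audits"]
print(multivote(ballots, keep=3))
```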
Past experience over the years indicates that there are many potential barriers to
the implementation of Six Sigma programmes in hospitals. Some of these barriers
are as follows [23]:
• Governmental regulations.
• Nursing shortage.
• Costs (start-up and maintenance).
12.8 PROBLEMS
1. Define the following four terms:
• Clinical audit
• Quality of care
• Adverse event
• Health care
2. Compare quality assurance and quality improvement in health care
institutions.
3. What are the main reasons for the rising health care-related cost?
4. Discuss important health care-associated goals?
5. What are the ten steps that can be used in improving quality in the health
care system?
6. Discuss physician reactions to TQM.
7. List at least 12 quality tools for use in health care.
8. Discuss the following two methods considered useful to improve quality in
health care:
• Brainstorming
• Force field analysis
9. Discuss the implementation of Six Sigma methodology in hospitals and its
advantages.
10. Write a short essay on the historical developments in health care quality.
REFERENCES
1. Graham, N.O., Quality Trends in Health Care, in Quality in Health Care, edited by
N.O. Graham, Aspen Publishers, Gaithersburg, Maryland, 1995, pp. 3–14.
2. Gaucher, E.J., Coffey, R.J., Total Quality in Health Care: from Theory to Practice,
Jossey-Bass Publishers, San Francisco, California, 1993.
3. Codman, E.A., The Product of the Hospital, Surgical Gynaecology and Obstetrics,
Vol. 28, 1914, pp. 491–496.
4. Graham, N.O., Ed., Quality in Health Care: Theory, Application, and Evolution, Aspen
Publishers, Gaithersburg, Maryland, 1995.
5. Glossary of Terms Commonly Used in Health Care, Prepared by the Academy Health,
Suite 701-L, 1801 k St. NW, Washington, D.C., 2004.
6. Marszalek-Gaucher, E., Coffey, R.J., Transforming Health Care Organizations: How to
Achieve and Sustain Organizational Excellence, John Wiley and Sons, New York, 1990.
7. Coltin, K.L., Aronow, D.B., Quality Assurance and Quality Improvement in the
Information Age, In Quality in Health Care: Theory, Application, and Evolution, edited
by N.O. Graham, Aspen Publishers, Gaithersburg, Maryland, 1995.
8. Berwick, D.M., Peer Review and Quality Management: Are They Compatible? Quality
Review Bulletin, Vol. 16, 1990, pp. 246–251.
9. Laffel, G., Blumenthal, D., The Case for Using Industrial Quality Management Science
in Health Care Organization, Journal of the American Medical Association, Vol. 262,
1989, pp. 2869–2873.
10. Fainter, J., Quality Assurance Not Quality Improvement, Journal of Quality Assurance,
January/February 1991, pp. 8, 9, and 36.
11. Andrews, S.L., QA versus QI: The Changing Role of Quality in Health Care, January/
February 1991, pp. 14, 15, 38.
12. Stamatis, D.H., Total Quality Management in Health Care, Irwin Professional
Publishing, Chicago, Illinois, 1996.
13. Dhillon, B.S., Creativity for Engineers, World Scientific Publishing, River Edge,
New Jersey, 2006.
14. Levin, H.M., McEwan, P.J., Cost-Effectiveness Analysis: Methods and Applications,
Sage Publications, Thousand Oaks, California, 2001.
15. Boardman, A.E., Cost-Benefit Analysis: Concepts and Practice, Prentice Hall, Upper
Saddle River, New Jersey, 2006.
16. Osborn, A.F., Applied Imagination, Charles Scribner’s Sons, New York, 1963.
17. Dhillon, B.S., Engineering and Technology Management Tools and Applications,
Artech House, Inc, Boston, Massachusetts, 2002.
18. Montgomery, D.C., Introduction to Statistical Control, John Wiley and Sons, New York,
1996.
19. Ishikawa, K., Guide to Quality Control, Asian Productivity Organization, Tokyo, 1976.
20. Leitnaker, M.G., Sanders, R.D., Hild, C., The Power of Statistical Thinking: Improving
Industrial Processes, Addison-Wesley, Reading, Massachusetts, 1996.
21. Tague, N.R., The Quality Toolbox, ASQ Quality Press, Milwaukee, Wisconsin, 2005.
22. Jay, R., The Ultimate Book of Business Creativity: 50 Great Thinking Tools, for
Transforming Your Business, Capstone Publishing Limited, Oxford, U.K, 2000.
23. Frings, G.W., Graut, L., Who Moved My Sigma-Effective Implementation of the Six
Sigma Methodology to Hospitals, Quality and Reliability Engineering International,
Vol. 21, 2005, pp. 311–328.
13 Medical Device
Quality Assurance
13.1 INTRODUCTION
Nowadays, because quality is very important in the manufacture of medical devices,
manufacturers are under increasing pressure to follow more closely the quality sys-
tems for ensuring that the manufactured items are effective, reliable, and safe, and
that they clearly meet applicable specifications and standards.
Although the history of quality assurance may be traced back to ancient
times, in regard to medical devices the 1976 amendments to the Federal Food, Drug,
and Cosmetic Act concerning medical devices established a com-
plex statutory framework allowing the Food and Drug Administration (FDA)
to regulate almost all aspects of medical devices, from testing to marketing, thus
putting more pressure on the quality assurance programmes concerning medical
devices.
This chapter presents various important aspects of medical device quality
assurance.
quality assurance programme is a very important factor in this regard. The FDA has
played a pivotal role in getting manufacturers to develop design quality assurance
programmes by publishing a document entitled “Preproduction Quality Assurance
Planning: Recommendations for Medical Device Manufacturers” [3]. This document
clearly outlines useful design-related practices applicable to medical devices, thus
assisting manufacturers in planning and implementing their preproduction quality
assurance programmes.
There are twelve elements, shown in Fig. 13.1, in the preproduction or design
quality assurance programme recommended by the FDA.
All 12 elements shown in Fig. 13.1 are described in Sections 13.3.1–13.3.12.
13.3.1 Organization
This is concerned with the organisational aspects of the preproduction or design
quality assurance programme: for example, the organisational elements and authori-
ties appropriate for developing the programme and executing programme-related
requirements, the formal establishment of an audit programme, the formal documentation
of the specified programme-related goals, etc.
13.3.2 Specifications
After establishing the physical, performance, and chemical-related characteristics for the
proposed device, these characteristics should be translated into formally documented
design specifications through which the design can be developed, controlled, and
evaluated. These specifications should clearly address factors such as reliability,
13.3.3 Design Review
The purpose of design review is to highlight and rectify design-related deficien-
cies as early as possible, because corrections made at an early stage are less costly
to implement. The design
review programme should be well-documented and include items such as organisa-
tional units, procedures, variables’ checklist, schedule, and process flow diagrams.
Although the extent and frequency of design reviews will very much depend on
the complexity and significance of the device under study, the assessment should
include items such as subsystems, packaging, software (if applicable), labelling,
components, and support documentation (i.e., instructions, test specifications,
drawings, etc.).
The design review team members should be from areas such as quality assurance,
research and development, engineering, manufacturing, purchasing, servicing, and
marketing. Also, when considered appropriate, design reviews should include the
performance of failure modes and effect analysis (FMEA) and fault tree analysis
(FTA). Both these methods are described in Chapter 4.
13.3.4 Reliability Assessment
This may simply be described as the process of prediction and demonstration used
for estimating the basic reliability of an item or device. Reliability assessment should
be conducted for new and modified designs, and its appropriateness and extent
should be determined by the degree of risk the device presents to its user. Reliability
assessment starts with statistical and theoretical approaches, first determining
the reliability of each and every part/component/element and ultimately that of the entire
device/system. It is to be noted that this approach provides only an estimate of reli-
ability. For a proper or better assessment, the device/system should be tested under
simulated or actual use conditions.
13.3.7 Labelling
This includes display labels, manuals, charts, inserts, panels, and recommended
test and calibration protocols. The design review process should also review label-
ling to assure it appropriately complies with all applicable laws and regulations and
contains easy-to-understand directions. The verification of instructions’ accuracy
contained in the labelling should be a part of the qualification testing of the device
under consideration.
Maintenance manuals (if applicable) must be written clearly so that the device
under consideration could be maintained in an effective and safe condition.
13.3.8 Design Transfer
After translating the design into a physical entity, the design’s technical adequacy,
safety, and reliability should be appropriately verified through intensive testing under
simulated or real-life use environments. After verifying technical adequacy through
appropriate testing, the design is generally approved. It is to be noted that when
moving from the laboratory to scaled-up production, standards, methods, and procedures
may not be properly transferred, and it is quite possible that additional manufacturing
processes will be needed. Thus, this scenario requires careful consideration.
13.3.9 Certification
After the initial production units have successfully passed preproduction
qualification testing, it is essential to carry out a formal technical review so that the
adequacy of the design, production, and quality assurance-related procedures is
assured. In addition, the review should determine the following six factors:
13.3.10 Test Instrumentation
This involves effectively calibrating and maintaining all equipment employed in the
qualification of the design. More clearly, such equipment should be kept under a
formal calibration and maintenance programme.
13.3.11 Personnel
This calls for the performance of design-related activities, including design review,
analysis, and testing by properly trained professionals.
Medical device manufacturers should also make a special effort for assuring that
failure data collected from service and complaint records relating to design-associ-
ated problems are reviewed by the design professionals.
It is to be noted that this diagram could be extremely useful for improving the quality
of medical device designs.
13.4.4 Flowcharts
Flowcharts are used for describing processes in as much detail as feasible by graph-
ically showing the steps in proper order. A good flowchart generally displays all
process steps under consideration or analysis by the quality improvement team,
highlights crucial process points for control, suggests areas for improvement, and
serves as a useful tool for explaining and solving a problem.
A flowchart could be simple or quite complex, composed of many symbols, boxes,
etc. More clearly, the complex version indicates the process steps in the appropriate
sequence, the associated step conditions, and the related constraints by making
use of elements such as arrows, yes/no choices, or if/then statements.
13.4.5 Scatter Diagram
This is the simplest way for determining how two variables are related or if a cause-
and-effect relationship exists between the two variables. However, it is to be noted
that the scatter diagram cannot prove that one variable causes the change in the
other, but only the existence of their relationship and its strength. In this diagram,
the horizontal axis denotes the measurement values of one variable and the vertical
axis denotes the measurements of the other variable.
If it is desirable to fit a straight line to the plotted data points for
obtaining a prediction equation, the line can be drawn on the scatter diagram either
visually or mathematically, utilising the least squares approach. Whenever the line
is extended beyond the plotted data points, a dashed line is used for indicating that
there are no data for the concerned area.
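As a brief illustration of the mathematical option mentioned above, the following sketch fits a least squares line to a handful of plotted points; the paired data values are hypothetical, and the use of NumPy's polyfit is simply one convenient way to perform the calculation.

```python
import numpy as np

# Hypothetical paired measurements (e.g., sterilisation time vs. defect count)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 3.7, 5.1, 5.8, 7.2])

# Least squares fit of a straight line y = b0 + b1 * x
b1, b0 = np.polyfit(x, y, deg=1)  # slope, intercept
print(f"prediction equation: y = {b0:.2f} + {b1:.2f} x")

# Predict within the range of the plotted data (extending the line beyond
# the data would correspond to the dashed portion mentioned in the text)
print("predicted y at x = 4.5:", b0 + b1 * 4.5)
```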
13.4.7 Histogram
A histogram is employed when good clarity is desired. It plots data in a frequency
distribution table. Its main distinction from a check sheet is that its data are catego-
rised into rows for the purpose of losing the identity of individual values. It may be
said that the histogram is the first “statistical” process control method because it can
appropriately describe the variation in the process.
A histogram can provide a satisfactory amount of information concerning a qual-
ity-related problem, thus providing a basis for making decisions without additional
analysis. The histogram’s shape shows the nature of the distribution of the data, in
addition to central tendency and variability. Furthermore, specification limits may
be used for showing process capability.
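To make the binning idea concrete, here is a minimal sketch that groups hypothetical measurement data into frequency classes; printing the counts as rows of asterisks gives a crude text histogram of the kind described above. The data values and bin width are illustrative assumptions.

```python
from collections import Counter

# Hypothetical measurements (e.g., device weights in grams)
data = [102, 98, 105, 101, 99, 103, 107, 100, 96, 104, 102, 101]
bin_width = 5

# Assign each value to a bin; individual values lose their identity here
bins = Counter((v // bin_width) * bin_width for v in data)

for lower in sorted(bins):
    print(f"{lower}-{lower + bin_width - 1}: {'*' * bins[lower]}")
```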
γ = (α − β)(100)/(α − β + m)          (13.1)
where
γ is the percentage of defects accurately identified by the regular inspector.
α is the total number of defects found by the regular inspector.
β is the total number of items or units without defects rejected by the regular
inspector as found by the check inspector.
m is the total number of defects missed by the regular inspector as discovered by
the check inspector.
Example 13.1
Assume that a lot of medical devices were inspected by a regular inspector who
found 50 defects. Subsequently, the same lot was re-examined by the check
inspector, and the values of m and β were 10 and 4, respectively. Determine the
percentage of defects accurately discovered by the regular inspector.
By substituting the specified data values into Equation (13.1), we obtain
γ = (50 − 4)(100)/(50 − 4 + 10) = 82.14%
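The same calculation is straightforward to script. The following is a minimal sketch of Equation (13.1); the function name is an illustrative choice.

```python
def inspector_accuracy(alpha, beta, m):
    """Percentage of defects accurately identified by the regular inspector.

    alpha: defects found by the regular inspector
    beta:  defect-free units wrongly rejected by the regular inspector
    m:     defects missed by the regular inspector (found by the check inspector)
    """
    return (alpha - beta) * 100 / (alpha - beta + m)

# Example 13.1 from the text: alpha = 50, beta = 4, m = 10
print(f"{inspector_accuracy(50, 4, 10):.2f}%")   # about 82.14%
```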
where
Iqcp is the value of the quality cost performance index.
QCv is the vendor quality cost.
Cp is the purchased cost.
It is to be noted that the value of this index equals unity only for the perfect vendor,
i.e., when there is no vendor quality cost: for example, there is no complaint to investigate,
no receiving inspection, no defective rejection, etc. When the value of Iqcp is 1.1 or
higher, it clearly indicates that there is an immediate need for corrective action.
Interpretations for other values of the index are as follows [11]:
Example 13.2
• Cp = $100,000
• QCv = $4,000
Calculate the value of the quality cost performance index, and comment on the
end result.
By inserting the specified data values into Equation (13.2), we obtain
It means that the vendor’s quality cost performance can be rated as fair.
where
θ is the total number of items or devices inspected.
µ = TQC(100)/TVO + 100          (13.4)
where
µ is the value of the quality cost index.
TVO is the total value of output.
TQC is the total quality cost.
The value of this index may be estimated in six steps presented below:
• Step V: Add the end results of the previous two steps (i.e., Steps III and IV).
• Step VI: Compute the value of the quality cost index, µ, by using Equation
(13.4) as well as the resulting values of Steps II and V.
• 100 < µ < 130: This is the common range when quality-related costs are
ignored by manufacturers.
• µ = 105: This value is achievable in real life.
• µ = 100: There is no defective output, thus no money is spent to conduct
quality checks.
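A rough numerical sketch of Equation (13.4) and the interpretation ranges above follows; the cost figures used are hypothetical.

```python
def quality_cost_index(total_quality_cost, total_value_of_output):
    """Quality cost index per Equation (13.4): mu = TQC * 100 / TVO + 100."""
    return 100.0 * total_quality_cost / total_value_of_output + 100.0

# Hypothetical figures: $90,000 spent on quality against $1,500,000 of output
mu = quality_cost_index(90_000, 1_500_000)
print(f"mu = {mu:.1f}")   # 106.0

if mu == 100:
    print("No defective output and no money spent on quality checks.")
elif mu < 130:
    print("Within the range commonly seen when quality-related costs are ignored.")
```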
13.6 PROBLEMS
1. Write an essay on ISO 9000.
2. Discuss historical developments in quality control in general and medical
device quality assurance in particular.
3. List the elements of the FDA's “Preproduction Quality Assurance Planning:
Recommendations for Medical Device Manufacturers” programme.
4. Discuss in detail at least four elements listed in question 3.
5. List at least six tools/methods that can be used to assure medical device
quality.
6. Describe the following two tools/methods that can be used to assure medi-
cal devices’ quality:
• Pareto diagram
• Quality function deployment (QFD)
7. What are the main steps involved in developing a cause-and-effect dia-
gram? Also, what are its benefits and drawback?
8. Mathematically define the quality inspector inaccuracy index.
9. Assume that a lot of medical devices was inspected by a regular inspector
who discovered 70 defects. Subsequently, the same lot was re-examined by
the check inspector and the values of m and β were 12 and 6, respectively.
Determine the percentage of defects accurately discovered by the regular
inspector using Equation (13.1).
10. Assume that we have the following data values:
• QCv = $5,000
• Cp = $110,000
Calculate the value of the quality cost performance index using Equation (13.2), and
comment on the final result.
REFERENCES
1. Fries, R.C., Medical Device Quality Assurance and Regulatory Compliance, Marcel
Dekker Inc, New York, 1998.
2. Montanez, J., Medical Device Quality Assurance Manual, Interpharm Press Inc,
Buffalo Grove, Illinois, 1996.
3. Hooten, W.F., A Brief History of FDA Good Manufacturing Practices, Medical Device
and Diagnostic Industry Magazine, Vol. 18, No. 5, 1996, p. 96.
4. Mears, P., Quality Improvement Tools and Techniques, McGraw-Hill Inc, New York,
1995.
5. Sahni, A., Seven Basic Tools that Can Improve Quality, Medical Device and Diagnostic
Industry Magazine, April 1998, pp. 89–98.
6. Besterfield, D.H., Besterfield-Michna, C., Besterfield, G.H., Besterfield-Sacre, M., Total
Quality Management, Prentice-Hall Inc, Englewood Cliffs, New Jersey, 1995.
7. Dhillon, B.S., Advanced Design Concepts for Engineers, Technomic Publishing Company, Lancaster, PA, 1998.
8. Bracco, D., How to Implement a Statistical Process Control Program, Medical Device
and Diagnostic Industry Magazine, March 1998, pp. 129–139.
9. Yoji, K., Ed., Quality Function Deployment, Productivity Press, Cambridge, MA, 1990.
10. Juran, J.M., Gryna, F.M., Bingham, R.S., Quality Control Handbook, McGraw-Hill,
New York, 1974.
11. American Society for Quality Control. Guide for Managing Vendor Quality Costs,
American Society for Quality Control, Milwaukee, WI, 1980.
12. Lester, R.H., Enrick, N.L., Mottley, H.E., Quality Control for Profit, Industrial Press
Inc, New York, 1977.
14 Software Quality
14.1 INTRODUCTION
Nowadays, computers are widely used for applications ranging from day-to-day per-
sonal use to the control of space systems. As computers are made up of both hard-
ware and software elements, the percentage of the total computer cost spent on
software has changed dramatically over the decades. For example, in 1955 the software
element (i.e., including software maintenance) accounted for approximately 20% of
the total computer cost; three decades later, in 1985, this percentage had increased to
about 90% [1]. Needless to say, the introduction of computers into systems/products
in the late 1970s has, directly or indirectly, led to the need for software quality assurance for
all types of software [2].
Thus, the main objective of a quality assurance program with respect to software
quality is to ensure that the final software products are of good quality, through prop-
erly planned and systematic actions for achieving, maintaining, and determining that
quality [3, 4]. This chapter presents various important aspects of software quality.
• Run charts
• Pareto diagram
• Scatter diagram
• Histogram
• Control chart
• Cause and effect diagram
• Checklist
The first two of the above seven methods are described in Sections 14.4.1 and 14.4.2,
and detailed information on the remaining five methods is available in Refs. [12–14].
14.4.1 Run Charts
These charts are normally used for software project management, serving as real-
time statements of quality and workload. An example of run charts’ application is
tracking the percentage of software fixes that exceed the stated response-time crite-
ria, in order to ensure deliveries of fixes to all involved customers in a timely manner.
Run charts are also used for monitoring the weekly arrival of software defects as
well as the defect backlog during the formal testing phases of a machine under con-
sideration. During the software development process, run charts are often compared
to the relevant projection models and historical data so that the related interpreta-
tions can be placed into appropriate perspective.
Additional information on run charts with respect to their application during the
software development process is available in Ref. [11].
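As a small illustration of the first application mentioned above, the sketch below computes the weekly percentage of fixes that exceeded a response-time criterion, which is exactly the series one would plot as a run chart; the weekly figures and the 48-hour criterion are hypothetical assumptions.

```python
# Hypothetical weekly records: (fixes delivered, fixes exceeding the
# assumed 48-hour response-time criterion)
weekly_fixes = [(40, 6), (35, 4), (50, 9), (42, 3), (38, 5)]

print("Week  % of fixes exceeding criterion")
for week, (delivered, late) in enumerate(weekly_fixes, start=1):
    pct = 100.0 * late / delivered
    # Each weekly value becomes one point on the run chart
    print(f"{week:>4}  {pct:5.1f}  {'#' * round(pct)}")
```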
adapt to the SPC process for ensuring the quality maintenance. Additional informa-
tion on quality software maintenance is available in Refs. [17, 26].
For the successful application of the above two principles, it is absolutely essential
that the metrics satisfy the following eight requirements [10, 25]:
Ten software quality metrics considered quite useful are presented in Sections
14.6.1–14.6.10 [10, 27].
14.6.1 Metric I
This is one of the error-severity metrics and is expressed by
CEas = α1/α2          (14.1)
where
CEas is the average severity of code errors.
α1 is the number of weighted code errors detected.
α 2 is the number of code errors detected in the software code through testing and
inspections.
14.6.2 Metric II
This is one of the error-density metrics and is expressed by
CEd = α3/α2          (14.2)
where
CEd is the code error density.
α 3 is the number of code errors detected in the software code through testing and
inspections.
α 2 is the thousands of lines of code.
CMe = α4/α5          (14.3)
where
CMe is the corrective maintenance effectiveness.
α 4 is the total number of annual working hours invested in corrective mainte-
nance of the software system.
α 5 is the total number of software failures detected during a 1-year period of
maintenance service.
14.6.4 Metric IV
This metric is concerned with measuring the success of help-desk service (HDS)
and is defined by
HDSsf = α6/α7          (14.4)
where
HDSsf is the HDS success factor.
α 6 is the number of HDS calls completed on time during a 1-year period.
α 7 is the total number of HDS calls during a 1-year period.
14.6.5 Metric V
This metric is concerned with measuring the mean severity of the HDS calls and is
expressed by
HDSmsc = α8/α7          (14.5)
where
HDSmsc is the mean severity of HDS calls.
α 8 is the number of weighted HDS calls received during a 1-year period.
α 7 is the total number of HDS calls during a 1-year period.
14.6.6 Metric VI
This metric is one of the software process timetable metrics and is expressed by
TOf = α9/α10          (14.6)
where
TOf is the timetable observance factor.
α9 is the number of milestones completed on time.
α10 is the total number of milestones.
14.6.9 Metric IX
This metric is one of the HDS calls-density metrics and is expressed by
HDScd = α15/α16          (14.9)
where
HDScd is the HDS calls density.
α15 is the total number of HDS calls during a 1-year period.
α16 is the thousands of lines of maintained software code.
14.6.10 Metric X
This metric is one of the HDS productivity metrics and is defined by
HDSpf = α17/α18          (14.10)
where
HDSpf is the HDS productivity factor.
α17 is the total number of yearly working hours invested in help-desk servicing of
the software system.
α18 is the thousands of lines of maintained software code.
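The following short sketch shows how a few of the above metrics might be computed from raw counts; the input numbers are hypothetical, and the dictionary layout is just one convenient way to organise the calculation.

```python
# Hypothetical raw counts gathered for one year of a software product
counts = {
    "weighted_code_errors": 420,   # alpha_1
    "code_errors_detected": 175,   # alpha_2 in Metric I
    "kloc": 90,                    # alpha_2 in Metric II: thousands of lines of code
    "hds_calls_on_time": 940,      # alpha_6
    "hds_calls_total": 1000,       # alpha_7
}

# Metric I: average severity of code errors, CEas = weighted errors / errors detected
ce_as = counts["weighted_code_errors"] / counts["code_errors_detected"]

# Metric II: code error density, CEd = errors detected / KLOC
ce_d = counts["code_errors_detected"] / counts["kloc"]

# Metric IV: help-desk service success factor, HDSsf = on-time calls / total calls
hds_sf = counts["hds_calls_on_time"] / counts["hds_calls_total"]

print(f"CEas  = {ce_as:.2f} (average error severity)")
print(f"CEd   = {ce_d:.2f} errors per KLOC")
print(f"HDSsf = {hds_sf:.2%} of help-desk calls completed on time")
```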
• Ensure that the quality assurance activity starts at an early stage of the
software development cycle.
• Develop a quality assurance activity and appropriately ensure its
independence.
• Ensure that, prior to the initiation of the testing process, the development
testing is well planned and organised.
• Carry out appropriate analysis of the code in addition to testing it.
• In an environment where you are unable to carry out the ideal maximum of
quality assurance activities, try to carry out at least some type of quality
assurance activity/activities.
• Carefully evaluate the interfaces between any two elements/parts in the
system and appropriately resolve any misunderstandings, ambiguities, and
incompatibilities.
• Carry out an appropriately detailed verification analysis of the design and
requirements.
• Make sure that all documentation is well controlled and cannot be
changed without proper controls.
• Always remain sceptical of errors in software received from any developer.
• Keep a careful track of the computer resources needed by the end program.
• Subcategory I: Internal failure costs. These costs are associated with cor-
recting errors found through design reviews, software tests, and accep-
tance tests, prior to the installation of the software at customer sites.
• Subcategory II: External failure costs. These costs are associated with
correcting failures detected by customers/maintenance teams after the
installation of the software system at customer sites.
14.10 PROBLEMS
1. Define the following four terms:
• Software
• Software quality
• Software quality assurance
• Software quality control
2. Write an essay on software quality.
3. What are the software quality factors? List at least ten of them.
4. Discuss at least two quality methods that can be used during the software
development process.
5. Discuss quality-related measures during the SDLC.
6. List at least six requirements that must be satisfied by software metrics for
their successful applicability.
7. Define at least four software quality metrics.
8. List at least nine main responsibilities of a software quality assurance manager.
9. What are the four subcategories of the software quality cost? Describe each
of these subcategories.
10. What are the main benefits of software quality assurance?
REFERENCES
1. Keene, S.J., Software Reliability Concepts, Annual Reliability and Maintainability
Symposium Tutorial Notes, 1992, pp. 1–21.
2. Dunn, R., Ullman, R., Quality Assurance for Computer Software, McGraw-Hill Book
Company, New York, 1982.
28. Tice, G.D., Management Policy and Practices for Quality Software, Proceedings of the
Annual American Society for Quality Control Conference, 1983, pp. 369–372.
29. Rubey, R.J., Planning for Software Reliability, Proceedings of the Annual Reliability
and Maintainability Symposium, 1977, pp. 495–499.
30. Fisher, M.J., Software Quality Assurance Standards: The Coming Revolution, Journal
of Systems and Software, Vol. 2, 1981, pp. 357–362.
31. Dunn, R., Ullman, R., Quality Assurance for Computer Software, McGraw-Hill,
New York, 1982.
32. Fischer, K.F., A Program for Software Quality Assurance, Proceedings of the Annual
Conference of the American Society for Quality Control, 1978, pp. 333–340.
Index
A Laplace transform, 14
probability, 13
Absorption law, 12 probability density function, 14
Administration kit for peritoneal dialysis, 147 Deming approach, 45–46
Advanced Research Projects Agency Network, 103 Design phase measure, 115
American Society for Quality Control, 1 Devol, G., 83
Arial, 170 Distributive law, 12
Arithmetic mean, 9 Documents for improving medical device
Associative law, 12 usability, 152–154
Average service availability index, 128–129
E
B
Electric robot, 93–96
Balloon catheter, 147 Electrosurgical cutting and coagulation device, 147
Bathtub hazard rate concept, 25–26 Emergency Care Research Institute, 68
Bathtub hazard rate curve distribution, 19–20 European laboratory for particle physics, 167
Bell Laboratories, 103 Exponential distribution, 17
Bell Telephone Laboratories, 51
Bernoulli, J., 17
Binomial distribution, 16 F
Boolean algebra laws, 11–12 Failure density function, 26
absorption law, 12 Failure modes and effect analysis, 49–50, 69
associative law, 12 Fault tree analysis, 51–54, 70
commutative law, 12 Federal, Food, Drug, and Cosmetic Act, 195
distributive law, 12 Fermat, P., 9
idempotent law, 12 Flowcharts, 203
Brainstorming, 188, 189–190 Food and Drug Administration, 67, 147, 195
Bridge network, 37–39 Force field analysis, 188, 190
Frazee, C. N., 1
C
Cardano, G., 9 G
Cause and effect diagram, 61–62, 201, 202 Gauss, C.F., 190
Center for Devices and Radiological Health, 71 General approach, 70–71
Check sheets, 188, 190 General reliability function, 28
Code and unit test phase measure, 114 Generic hypertext markup language, 177
Codman, E.A., 183 Glucose meter, 147
Cognitive walkthroughs, 58, 161 Guidelines checklists, 161
Commutative law, 12
Control charts, 204
Cost-benefit analysis, 188, 189 H
Cumulative trauma disorder, 151–152 Hazard rate function, 27
Customer average interruption duration index, 128 Helvetica, 170
Customer average interruption frequency index, 128 Heuristic evaluation, 161
Hindu-Arabic numeral system, 9
D Histogram, 204
Human Factors Society of America, 1
Definitions, 13–16 Human sensory capacities, 41–42
cumulative distribution function, 14 Human-computer interface fundamental
expected value, 14 principles, 158, 159
final value theorem Laplace transform, 16 Hydraulic robot, 90–93
S T
Sans-serif fonts, 170 Task analysis, 58–59
Arial, 170 The P-charts, 62–64
Helvetica, 170 Tools for assuring medical device
Scatter diagram, 203–204 quality, 201–204
Scythian, 9 cause and effect diagram, 201, 202
Series network, 30–32 control charts, 204
Shewhart, W., 1, 62 flowcharts, 203
Six sigma methodology, 190–192 histogram, 204
Software development life cycle, 213–215 Pareto diagram, 203
Software metrics, 114–115 quality function deployment, 201–202
code and unit test phase measure, 114 scatter diagram, 203–204
design phase measure, 115 Total quality management, 184–185
Software quality assurance benefits, 220, 221 Total quality management elements, 43–44
Software quality assurance manager, 218 Traditional quality assurance, 184–185
Software quality assurance program, 219 Triple modular redundancy, 107–110
Software quality assurance standards, 220–221 Typical human behaviours, 39–41
Software quality factors, 210–212
product operation factors, 210–211 U
product revision factors, 210, 211
product transition factors, 210, 211–212 Urological catheter, 147
Software quality function deployment, 213
Software quality-associated metrics, 215–218 V
metric I, 215
metric II, 216 Vendor rating program index, 205–206
metric III, 216
metric IV, 216 W
metric IX, 217–218
metric V, 216–217 Web design-related errors, 168
metric VI, 217 Web SAT, 176–177
metric VII, 217 Web usability evaluation tools, 176–178
metric VIII, 217 Lift, 178
metric X, 218 Max, 177
Software reliability models, 112–114 NetRaker, 177
Mills model, 113–114 Web SAT, 176–177
Musa model, 112–113 Weibull distribution, 18–19
Software usability inspection methods, 160–161 Western Electric Company, 1
cognitive walkthrough, 161 World Wide Web, 167
guidelines checklists, 161 Wright Brothers’ airplane, 1