
IE 4102 Statistics for Engineers

Lecture 1: Probability Basics

Assist. Prof. Duygun Fatih Demirel

Spring 2024

What Do Engineers Do?

An engineer is someone who solves problems of interest to society with the
efficient application of scientific principles by:
• Refining existing products
• Designing new products or processes

The Use of Statistics in Engineering

The Creative Process

Figure 1-1 The engineering method

Statistics Supports The Creative Process

The field of statistics deals with the collection, presentation, analysis,
and use of data to:
• Make decisions
• Solve problems
• Design products and processes
It is the science of data.

Variability
• Statistical techniques are useful to describe
and understand variability.
• By variability, we mean successive observations
of a system or phenomenon do not produce
exactly the same result.
• Statistics gives us a framework for describing this
variability and for learning about potential
sources of variability.
An Engineering Example of Variability
Eight prototype units are produced and their pull-off forces are measured (in
pounds): 12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1.

Not all of the prototypes have the same pull-off force; the measurements
exhibit variability.

The dot diagram is a very useful plot for displaying a small body of data,
say up to about 20 observations.

This plot allows us to easily see two features of the data: the location,
or the middle, and the scatter, or variability.

Figure 1-2 Dot diagram of the pull-off force data.
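The location and scatter that the dot diagram displays can also be checked
numerically. A minimal Python sketch using the eight measurements above
(the variable names are illustrative):

# Pull-off forces (pounds) for the eight prototype units
forces = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]

location = sum(forces) / len(forces)  # the middle of the data (sample mean)
scatter = max(forces) - min(forces)   # a crude measure of variability (range)

print(f"location = {location:.1f} lb, range = {scatter:.1f} lb")
# location = 13.0 lb, range = 1.3 lb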

Basic Methods of Collecting Data

Three basic methods of collecting data:


– A retrospective study using historical data
• Data collected in the past for other purposes.
– An observational study
• Data, presently collected, by a passive
observer.
– A designed experiment
• Data collected in response to process input
changes.

Hypothesis Tests
Hypothesis Test
• A statement about a process behavior value.
• Compared to a claim about another process value.
• Data is gathered to support or refute the claim.

One-sample hypothesis test:


• Example: Ford avg mpg = 30
vs
Ford avg mpg < 30

Two-sample hypothesis test:


• Example: Ford avg mpg – Chevy avg mpg = 0
vs
Ford avg mpg – Chevy avg mpg > 0

Factorial Experiment & Example
An experimental design that uses every possible combination of the factor
levels to form a basic experiment with "k" different settings for the
process is called a factorial experiment.
Example:

Consider a petroleum distillation column:

• Output is acetone concentration


• Inputs (factors) are:
1. Reboil temperature
2. Condensate temperature
3. Reflux rate

• Output changes as the inputs are changed by experimenter.


Factorial Experiment Example
• Each factor is set at 2 reasonable levels (-1 and +1)
• Eight (2³) runs are made, one at each combination of the factor levels,
and the acetone output is observed.
• The resulting data are used to create a mathematical model of the
process representing cause and effect.

Table 1-1 The Designed Experiment (Factorial Design) for the Distillation Column
Mechanistic and Empirical Models
A mechanistic model is built from our underlying knowledge of
the basic physical mechanism that relates several variables.
Example: Ohm’s Law

Current = Voltage/Resistance

I = E/R

I = E/R + ε

where ε is a term added to the model to account for the fact that the
observed values of current flow do not perfectly conform to the
mechanistic model.
• The form of the function is known.

Mechanistic and Empirical Models

An empirical model is built from our engineering and scientific knowledge
of the phenomenon, but is not directly developed from our theoretical or
first-principles understanding of the underlying mechanism.

The form of the function is not known a priori.

An Example of an Empirical Model

• In a semiconductor manufacturing plant, the finished semiconductor is
wire-bonded to a frame. In an observational study, the variables recorded
were:
• Pull strength to break the bond (y)
• Wire length (x1)
• Die height (x2)

• The data recorded are shown on the next slide.

An Example of an Empirical Model
Table 1-2 Wire Bond Pull Strength Data

An Example of an Empirical Model
Pull strength = β0 + β1(wire length) + β2(die height) + ε

where the "hat," or circumflex (^), over pull strength indicates that this
is an estimated or predicted quantity.

In general, this type of empirical model is called a regression model.

The estimated regression relationship is given by:

Pull strength^ = 2.26 + 2.74(wire length) + 0.0125(die height)

Visualizing the Data

Figure 1-6 Three-dimensional plot of the pull strength (y), wire length
(x1), and die height (x2) data.

Visualizing the Resultant Model Using Regression Analysis

Figure 1-7 Plot of the predicted values (a plane) of pull strength from
the empirical regression model.

Models Can Also Reflect Uncertainty

• Probability models help quantify the risks involved in statistical
inference, that is, risks involved in decisions made every day.
• Probability provides the framework for the study and application of
statistics.
• Probability concepts will be introduced in the next lecture.

Random Experiment
• An experiment is a procedure that is
– carried out under controlled conditions, and
– executed to discover an unknown result.
• An experiment that results in different
outcomes even when repeated in the
same manner every time is a random
experiment.

Sample Spaces

• The set of all possible outcomes of a random experiment is called the
sample space, S.
• S is discrete if it consists of a finite or countably infinite set of
outcomes.
• S is continuous if it contains an interval of real numbers.
Example 2-1: Defining Sample Spaces
• Randomly select a camera and record the recycle time of
a flash. S = R+ = {x | x > 0}, the positive real numbers.
• Suppose it is known that all recycle times are between 1.5
and 5 seconds. Then
S = {x | 1.5 < x < 5} is continuous.
• It is known that the recycle time has only three values (low, medium,
or high). Then S = {low, medium, high} is discrete.
• Does the camera conform to minimum recycle time
specifications?
S = {yes, no} is discrete.

Events are Sets of Outcomes
• An event (E) is a subset of the sample space of
a random experiment.
• Event combinations
– The Union of two events consists of all outcomes
that are contained in one event or the other,
denoted as E1 U E2.
– The Intersection of two events consists of all
outcomes that are contained in one event and the
other, denoted as E1 ∩ E2.
– The Complement of an event is the set of outcomes
in the sample space that are not contained in the
event, denoted as E′.
Mutually Exclusive Events
• Events A and B are mutually exclusive because they
share no common outcomes.
• The occurrence of one event precludes the occurrence
of the other.
• Symbolically, A ∩ B = Ø

Probability
• Probability is the likelihood or chance that
a particular outcome or event from a
random experiment will occur.
• In this chapter, we consider only discrete
(finite or countably infinite) sample spaces.
• Probability is a number in the [0,1] interval.
• A probability of:
– 1 means certainty
– 0 means impossibility
Probability Based on Equally-Likely
Outcomes

• Whenever a sample space consists of N possible outcomes that are equally
likely, the probability of each outcome is 1/N.
• Example: In a batch of 100 diodes, 1 is a laser diode. A diode is
randomly selected from the batch. Random means each diode has an equal
chance of being selected. The probability of choosing the laser diode is
1/100 or 0.01, because each outcome in the sample space is equally likely.

Probability of an Event
• For a discrete sample space, the probability
of an event E, denoted by P(E), equals the
sum of the probabilities of the outcomes in E.

• The discrete sample space may be:


– A finite set of outcomes
– A countably infinite set of outcomes.

Axioms of Probability
• Probability is a number that is assigned to each member of a collection
of events from a random experiment that satisfies the following
properties:

If S is the sample space and E is any event in the random experiment,

1. P(S) = 1
2. 0 ≤ P(E) ≤ 1
3. For any two events E1 and E2 with E1 ∩ E2 = Ø,
P(E1 ∪ E2) = P(E1) + P(E2)

• The axioms imply that:


– P(Ø) = 0 and P(E′ ) = 1 – P(E)
– If E1 is contained in E2, then P(E1) ≤ P(E2).

Probability of a Union
• For any two events A and B, the
probability of union is given by:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

• If events A and B are mutually exclusive, then P(A ∩ B) = 0,
and therefore:

P(A ∪ B) = P(A) + P(B)
Conditional Probability
• P(B | A) is the probability of event B
occurring, given that event A has already
occurred.
• A communications channel has an error
rate of 1 per 1000 bits transmitted. Errors
are rare, but do tend to occur in bursts. If a
bit is in error, the probability that the next
bit is also in error is greater than 1/1000.

Conditional Probability Rule
• The conditional probability of an event B
given an event A, denoted as P(B | A), is:
P(B | A) = P(A∩B) / P(A) for P(A) > 0.
• From a relative frequency perspective of n
equally likely outcomes:
– P(A) = (number of outcomes in A) / n
– P(A∩B) = (number of outcomes in A∩B) / n
– P(B | A) = number of outcomes in A ∩ B / number
of outcomes in A
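As an illustration of this relative-frequency view, here is a minimal
Python sketch; the counts are taken from the parts-classification table in
the next example, and the variable names are illustrative:

# Relative-frequency view: P(B | A) = P(A ∩ B) / P(A)
n = 400         # total number of equally likely outcomes (parts)
n_A = 40        # outcomes in A (parts with surface flaws)
n_A_and_B = 10  # outcomes in A ∩ B (parts that are flawed and defective)

p_B_given_A = (n_A_and_B / n) / (n_A / n)  # same as n_A_and_B / n_A
print(f"{p_B_given_A:.2f}")  # 0.25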

Example 2-11
There are 4 probabilities conditioned on flaws in the table below.

Parts Classified
                    Surface Flaws
Defective       Yes (F)   No (F')   Total
Yes (D)            10        18        28
No (D')            30       342       372
Total              40       360       400

P(F) = 40/400 and P(D) = 28/400

P(D | F) = P(D ∩ F) / P(F) = (10/400) / (40/400) = 10/40

P(D' | F) = P(D' ∩ F) / P(F) = (30/400) / (40/400) = 30/40

P(D | F') = P(D ∩ F') / P(F') = (18/400) / (360/400) = 18/360

P(D' | F') = P(D' ∩ F') / P(F') = (342/400) / (360/400) = 342/360
Multiplication Rule
• The conditional probability can be
rewritten to generalize a multiplication
rule.

P(A∩B) = P(B|A)·P(A) = P(A|B)·P(B)

• The last expression is obtained by


exchanging the roles of A and B.
Example 2-13: Machining Stages
The probability that a part made in the 1st stage of a
machining operation meets specifications is 0.90.
The probability that it meets specifications in the
2nd stage, given that it met specifications in the first
stage, is 0.95.
What is the probability that both stages meet
specifications?

• Let A and B denote the events that the part has met the 1st and 2nd
stage specifications, respectively.
• P(A∩B) = P(B | A)·P(A) = 0.95·0.90 = 0.855

Two Mutually Exclusive Subsets
• A and A are mutually
exclusive.
• A∩B and A ∩B are
mutually exclusive
• B = (A∩B) U(A ∩B)

Total Probability Rule


For any two events A and B
P ( B) = P ( B  A) + P ( B  A)
= P ( B | A)  P ( A) + P ( B | A)  P ( A)
Example 2-14: Semiconductor Contamination
Information about product failure based on chip manufacturing
process contamination is given below. Find the probability of
failure.
Probability of Failure   Level of Contamination   Probability of Level
0.1                      High                     0.2
0.005                    Not High                 0.8

Let F denote the event that the product fails.
Let H denote the event that the chip is exposed to high contamination
during manufacture. Then

− P(F | H) = 0.100 and P(H) = 0.2, so P(F ∩ H) = 0.020
− P(F | H′) = 0.005 and P(H′) = 0.8, so P(F ∩ H′) = 0.004
− P(F) = P(F ∩ H) + P(F ∩ H′)   (using the total probability rule)
       = 0.020 + 0.004 = 0.024
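The same computation is easy to script. A minimal Python sketch of the
total probability rule with the numbers above:

# Total probability rule: P(F) = P(F|H)·P(H) + P(F|H')·P(H')
p_F_given_H = 0.100   # failure probability under high contamination
p_H = 0.2             # probability of high contamination
p_F_given_Hc = 0.005  # failure probability otherwise
p_Hc = 1 - p_H        # = 0.8

p_F = p_F_given_H * p_H + p_F_given_Hc * p_Hc
print(f"{p_F:.3f}")  # 0.024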

Event Independence
• Two events are independent if any one
of the following equivalent statements is
true:
1. P(A | B) = P(A)
2. P(B | A) = P(B)
3. P(A∩B) = P(A)·P(B)
• This means that occurrence of one
event has no impact on the probability
of occurrence of the other event.
Example 2-16:
Table 1 provides an example of 400 parts classified by surface flaws
and as (functionally) defective. Suppose that the situation is different
and follows Table 2. Let F denote the event that the part has surface
flaws. Let D denote the event that the part is defective.

The data shows whether the events are independent.

TABLE 1  Parts Classified
                    Surface Flaws
Defective       Yes (F)   No (F')   Total
Yes (D)            10        18        28
No (D')            30       342       372
Total              40       360       400

TABLE 2  Parts Classified (data changed)
                    Surface Flaws
Defective       Yes (F)   No (F')   Total
Yes (D)             2        18        20
No (D')            38       342       380
Total              40       360       400

Table 1: P(D | F) = 10/40 = 0.25, but P(D) = 28/400 = 0.07. These are not
the same, so events D and F are dependent.
Table 2: P(D | F) = 2/40 = 0.05 and P(D) = 20/400 = 0.05. These are the
same, so events D and F are independent.

Bayes Theorem with Total
Probability
If E1, E2, … Ek are k mutually exclusive and
exhaustive events and B is any event,
P(E1 | B) = P(B | E1) P(E1) /
            [P(B | E1) P(E1) + P(B | E2) P(E2) + ... + P(B | Ek) P(Ek)]

where P(B) > 0

Note: the numerator expression is always one of the terms in the sum in
the denominator.
Example 2-19
A printer manufacturer obtained the following three types of printer
failure probabilities: hardware P(H) = 0.1, software P(S) = 0.6, and other
P(O) = 0.3. Also, P(F | H) = 0.9, P(F | S) = 0.2, and P(F | O) = 0.5.
If a failure occurs, determine if it’s most likely due to hardware,
software, or other.
P(F) = P(F | H)P(H) + P(F | S)P(S) + P(F | O)P(O)
     = 0.9(0.1) + 0.2(0.6) + 0.5(0.3) = 0.36

P(H | F) = P(F | H)·P(H) / P(F) = (0.9 × 0.1) / 0.36 = 0.250

P(S | F) = P(F | S)·P(S) / P(F) = (0.2 × 0.6) / 0.36 = 0.333

P(O | F) = P(F | O)·P(O) / P(F) = (0.5 × 0.3) / 0.36 = 0.417
Note that the conditionals given failure add to 1. Because P(O | F) is
largest, the most likely cause of the problem is in the other category.
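A short Python sketch of the same Bayes computation; the dictionary keys
are illustrative labels for the three failure causes:

# Posterior P(cause | F) = P(F | cause)·P(cause) / P(F)
prior = {"hardware": 0.1, "software": 0.6, "other": 0.3}       # P(cause)
likelihood = {"hardware": 0.9, "software": 0.2, "other": 0.5}  # P(F | cause)

p_F = sum(likelihood[c] * prior[c] for c in prior)  # total probability: 0.36
posterior = {c: likelihood[c] * prior[c] / p_F for c in prior}

for cause, p in posterior.items():
    print(f"P({cause} | F) = {p:.3f}")
# "other" is largest at 0.417, matching the conclusion above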
Random Variable and its
Notation
• A variable that associates a number with the
outcome of a random experiment is called a
random variable.
• A random variable is a function that assigns a real
number to each outcome in the sample space of a
random experiment.
• A random variable is denoted by an uppercase
letter such as X. After the experiment is conducted,
the measured value of the random variable is
denoted by a lowercase letter such as
x = 70 milliamperes. X and x are shown in italics,
e.g., P(X = x).
Discrete & Continuous Random
Variables
• A discrete random variable is a random
variable with a finite or countably infinite
range. Its values are obtained by counting.
• A continuous random variable is a random
variable with an interval (either finite or
infinite) of real numbers for its range. Its
values are obtained by measuring.

Examples of Discrete & Continuous Random
Variables

• Discrete random variables:


– Number of scratches on a surface.
– Proportion of defective parts among 100 tested.
– Number of transmitted bits received in error.
– Number of common stock shares traded per day.
• Continuous random variables:
– Electrical current and voltage.
– Physical measurements, e.g., length, weight, time,
temperature, pressure.

Discrete Random Variables
Physical systems can be modeled by the same or
similar random experiments and random variables.
The distribution of the random variable involved in
each of these common systems can be analyzed. The
results can be used in different applications and
examples.

We often omit a discussion of the underlying sample


space of the random experiment and directly describe
the distribution of a particular random variable.

Probability Distributions
A random variable is a function that assigns a
real number to each outcome in the sample
space of a random experiment.

The probability distribution of a random variable


X gives the probability for each value of X.

Example 3-4: Digital Channel
• There is a chance that a bit transmitted through a digital transmission
channel is received in error.
• Let X equal the number of bits received in error in the next 4 bits
transmitted.
• The associated probability distribution of X is shown in the table:

P(X = 0) = 0.6561
P(X = 1) = 0.2916
P(X = 2) = 0.0486
P(X = 3) = 0.0036
P(X = 4) = 0.0001
Total    = 1.0000

• The probability distribution of X is given by the possible values along
with their probabilities.

Figure 3-1 Probability distribution for bits in error.
Probability Mass Function
For a discrete random variable X with possible values
x1, x2, …, xn, a probability mass function is a function
such that:

(1) f(x_i) ≥ 0

(2) Σ_{i=1}^{n} f(x_i) = 1

(3) f(x_i) = P(X = x_i)

Cumulative Distribution Functions
Example 3-6: Consider the probability distribution for the digital channel
example.
x       P(X = x)
0       0.6561
1       0.2916
2       0.0486
3       0.0036
4       0.0001
Total   1.0000

Find the probability of three or fewer bits in error.


• The event (X ≤ 3) is the total of the events: (X = 0), (X = 1), (X = 2),
and (X = 3).
• From the table:
P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = 0.9999
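A cumulative probability like this is just a sum over the pmf. A minimal
Python sketch for the digital-channel distribution:

# pmf of X, the number of bits in error among the next 4 transmitted
pmf = {0: 0.6561, 1: 0.2916, 2: 0.0486, 3: 0.0036, 4: 0.0001}

def cdf(x):
    # F(x) = P(X <= x): sum the pmf over all values not exceeding x
    return sum(p for xi, p in pmf.items() if xi <= x)

print(cdf(3))  # 0.9999, the probability of three or fewer bits in error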

Cumulative Distribution Function and Properties

The cumulative distribution function is the probability that a random
variable X with a given probability distribution will be found at a value
less than or equal to x.

Symbolically,

F(x) = P(X ≤ x) = Σ_{x_i ≤ x} f(x_i)

For a discrete random variable X, F(x) satisfies the following properties:

(1) F(x) = P(X ≤ x) = Σ_{x_i ≤ x} f(x_i)

(2) 0 ≤ F(x) ≤ 1

(3) If x ≤ y, then F(x) ≤ F(y)


Variance Formula Derivations
V(X) = Σ_x (x − μ)² f(x)   is the definitional formula

     = Σ_x (x² − 2μx + μ²) f(x)

     = Σ_x x² f(x) − 2μ Σ_x x f(x) + μ² Σ_x f(x)

     = Σ_x x² f(x) − 2μ² + μ²

     = Σ_x x² f(x) − μ²   is the computational formula

The computational formula is easier to calculate manually.
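Both formulas are easy to verify numerically. A minimal Python sketch
using the digital-channel pmf from the next example:

pmf = {0: 0.6561, 1: 0.2916, 2: 0.0486, 3: 0.0036, 4: 0.0001}

mu = sum(x * p for x, p in pmf.items())                      # E(X) = 0.4
var_def = sum((x - mu) ** 2 * p for x, p in pmf.items())     # definitional
var_comp = sum(x * x * p for x, p in pmf.items()) - mu ** 2  # computational

print(f"{mu:.4f} {var_def:.4f} {var_comp:.4f}")  # both variances are 0.3600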

Example 3-9: Digital Channel
In Example 3-4, there is a chance that a bit transmitted through a digital
transmission channel is received in error. X is the number of bits received in
error of the next 4 transmitted. The probabilities are
P(X = 0) = 0.6561, P(X = 2) = 0.0486, P(X = 4) = 0.0001,
P(X = 1) = 0.2916, P(X = 3) = 0.0036
Use the table below to calculate the mean and variance.

x    f(x)     x·f(x)   (x − 0.4)²   (x − 0.4)²·f(x)   x²·f(x)
0    0.6561   0.0000    0.160        0.1050            0.0000
1    0.2916   0.2916    0.360        0.1050            0.2916
2    0.0486   0.0972    2.560        0.1244            0.1944
3    0.0036   0.0108    6.760        0.0243            0.0324
4    0.0001   0.0004   12.960        0.0013            0.0016
Totals:       0.4000                 0.3600            0.5200
              = Mean (μ)             = Variance (σ²)   = E(X²)

Computational formula: σ² = E(X²) − μ² = 0.5200 − 0.4² = 0.3600

Expected Value of a Function of a Discrete Random Variable

If X is a discrete random variable with probability mass function f (x),

E h( X ) = h(x) f (x)


x

If h ( x) = ( X −  ) , then its expectation is the variance of X.


2

Example 3-12: Digital Channel

In Example 3-9, X is the number of bits in error in the


next four bits transmitted. What is the expected value of
the square of the number of bits in error?

X      0        1        2        3        4
f(X)   0.6561   0.2916   0.0486   0.0036   0.0001

Here h(X) = X²

Answer:

E(X²) = Σ x² f(x)
      = 0²(0.6561) + 1²(0.2916) + 2²(0.0486) + 3²(0.0036) + 4²(0.0001)
      = 0.5200

Binomial Distribution
The random variable X that equals the number
of trials that result in a success is a binomial
random variable with parameters 0 < p < 1 and
n = 1, 2, ....
The probability mass function is:

f(x) = C(n, x) p^x (1 − p)^(n−x)   for x = 0, 1, ..., n     (3-7)

where C(n, x) = n! / (x!(n − x)!) is the binomial coefficient.

For constants a and b, the binomial expansion is

(a + b)^n = Σ_{k=0}^{n} C(n, k) a^k b^(n−k)

Example 3-17: Binomial Coefficient
Exercises in binomial coefficient calculation:

C(10, 3) = 10! / (3! 7!) = (10 × 9 × 8 × 7!) / (3 × 2 × 1 × 7!) = 120

C(15, 10) = 15! / (10! 5!)
          = (15 × 14 × 13 × 12 × 11 × 10!) / (10! × 5 × 4 × 3 × 2 × 1) = 3,003

C(100, 4) = 100! / (4! 96!)
          = (100 × 99 × 98 × 97 × 96!) / (4 × 3 × 2 × 1 × 96!) = 3,921,225

Exercise 3-18:
Each sample of water has a 10% chance of containing a particular
organic pollutant. Assume that the samples are independent with
regard to the presence of the pollutant. Find the probability that, in the
next 18 samples, exactly 2 contain the pollutant.

Answer:
Let X denote the number of samples that contain the pollutant in the
next 18 samples analyzed. Then X is a binomial random variable with
p = 0.1 and n = 18
18
P ( X = 2) =   ( 0.1) ( 0.9) = 153( 0.1) ( 0.9) = 0.2835
2 16 2 16

 2

0.2835 = BINOMDIST(2,18,0.1,FALSE)
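The Excel call above can be reproduced in a few lines of Python. A minimal
sketch of the binomial pmf; it also handles the cumulative questions on
the next two slides:

from math import comb

def binomial_pmf(x, n, p):
    # P(X = x) for a binomial random variable with parameters n and p
    return comb(n, x) * p**x * (1 - p)**(n - x)

print(binomial_pmf(2, 18, 0.1))  # 0.2835: exactly 2 of 18 samples polluted

# Cumulative versions, e.g. P(X >= 4) and P(3 <= X < 7):
print(1 - sum(binomial_pmf(x, 18, 0.1) for x in range(4)))  # 0.098
print(sum(binomial_pmf(x, 18, 0.1) for x in range(3, 7)))   # 0.265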

Exercise 3-18:
Determine the probability that at least 4
samples contain the pollutant.
Answer:
P(X ≥ 4) = 1 − P(X < 4)

         = 1 − Σ_{x=0}^{3} C(18, x) (0.1)^x (0.9)^(18−x)

         = 1 − [0.150 + 0.300 + 0.284 + 0.168]

         = 0.098

0.0982 =1- BINOMDIST(3,18,0.1,TRUE)


Exercise 3-18:

Now determine the probability that 3 ≤ X < 7.

Answer:

P(3 ≤ X < 7) = Σ_{x=3}^{6} C(18, x) (0.1)^x (0.9)^(18−x)
             = 0.168 + 0.070 + 0.022 + 0.005
             = 0.265

0.2660 = BINOMDIST(7,18,0.1,TRUE) - BINOMDIST(2,18,0.1,TRUE)

Appendix A, Table II (pg. 705) is a cumulative binomial table for selected
values of p and n.

Binomial Mean and Variance
If X is a binomial random variable with parameters p and n,

μ = E(X) = np

and

σ² = V(X) = np(1 − p)

Example 3-19:
For the number of transmitted bits received in error
in Example 3-16, n = 4 and p = 0.1. Find the mean
and variance of the binomial random variable.

Answer:
μ = E(X) = np = 4*0.1 = 0.4

σ2 = V(X) = np(1-p) = 4*0.1*0.9 = 0.36

σ = SD(X) = 0.6
Geometric Distribution
• Binomial distribution has:
– Fixed number of trials.
– Random number of successes.

• Geometric distribution has reversed roles:


– Random number of trials.
– Fixed number of successes, in this case 1.

• The probability mass function of the geometric distribution is

f(x) = p(1 − p)^(x−1)

x = 1, 2, …, the number of trials until the 1st success
0 < p < 1, the probability of success
Example 3.21:
The probability that a wafer contains a large particle of
contamination is 0.01. Assume that the wafers are
independent. What is the probability that exactly 125
wafers need to be analyzed before a particle is
detected?
Answer:
Let X denote the number of samples analyzed until a
large particle is detected. Then X is a geometric random
variable with parameter p = 0.01.

P(X = 125) = (0.99)^124 (0.01) = 0.00288.

Geometric Mean & Variance
If X is a geometric random variable with
parameter p,

μ = E(X) = 1/p

and

σ² = V(X) = (1 − p)/p²
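A small Python sketch tying the geometric pmf and these moment formulas to
the wafer example above:

def geometric_pmf(x, p):
    # P(X = x): the first success occurs on trial x, success probability p
    return p * (1 - p) ** (x - 1)

p = 0.01  # probability that a wafer contains a large particle
print(geometric_pmf(125, p))    # 0.00288, as in Example 3.21
print(1 / p, (1 - p) / p ** 2)  # mean = 100 wafers, variance = 9900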

Exercise 3-22: Mean and Standard Deviation

The probability that a bit transmitted through a digital transmission
channel is received in error is 0.1. Assume that the transmissions are
independent events, and let the random variable X denote the number of
bits transmitted until the first error. Find the mean and standard
deviation.

Answer:

Mean = μ = E(X) = 1 / p = 1 / 0.1 = 10

Variance = σ2 = V(X) = (1-p) / p2 = 0.9 / 0.01 = 90

Standard deviation = σ = √90 = 9.49

Poisson Distribution
The random variable X that equals the
number of events in a Poisson process is a
Poisson random variable with parameter λ > 0,
and the probability mass function is:

f(x) = e^(−λ) λ^x / x!   for x = 0, 1, 2, 3, ...     (3-16)

Example 3-31: Calculations for Wire Flaws-1
For the case of the thin copper wire, suppose that the number of flaws
follows a Poisson distribution with a mean of 2.3 flaws per mm. Find the
probability of exactly 2 flaws in 1 mm of wire.
Answer:
Let X denote the number of flaws in 1 mm of wire
P(X = 2) = e^(−2.3) (2.3)² / 2! = 0.265
In Excel
0.26518 = POISSON(2,2.3,FALSE)
Example 3-31: Calculations for Wire Flaws-2
Determine the probability of 10 flaws in 5 mm of
wire.
Answer:
Let X denote the number of flaws in 5 mm of wire.

E(X) = λ = 5 mm × 2.3 flaws/mm = 11.5 flaws

P(X = 10) = e^(−11.5) (11.5)^10 / 10! = 0.113

In Excel
0.1129 = POISSON(10,11.5,FALSE)
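Both wire-flaw calculations can be reproduced with a short Python sketch
of the Poisson pmf:

from math import exp, factorial

def poisson_pmf(x, lam):
    # P(X = x) for a Poisson random variable with mean lam
    return exp(-lam) * lam ** x / factorial(x)

print(poisson_pmf(2, 2.3))    # 0.265: exactly 2 flaws in 1 mm
print(poisson_pmf(10, 11.5))  # 0.113: exactly 10 flaws in 5 mm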

Poisson Mean & Variance

If X is a Poisson random variable with parameter λ, then

μ = E(X) = λ and σ² = V(X) = λ

The mean and variance of the Poisson model are the same.

For example, if particle counts follow a Poisson distribution with a mean
of 25 particles per square centimeter, the variance is also 25 and the
standard deviation of the counts is 5 per square centimeter.

If the variance of the data is much greater than the mean, then the
Poisson distribution would not be a good model for the distribution of the
random variable.

Continuous Random Variables
• A continuous random variable is one which
takes values in an uncountable set.
• They are used to measure physical characteristics
such as height, weight, time, volume, position,
etc...
Examples
1. Let Y be the height of a person (a real number).
2. Let X be the volume of juice in a can.
3. Let Y be the waiting time until the next person
arrives at the server.
Probability Density Function
For a continuous random variable X, a probability density function is
a function such that

(1) f(x) ≥ 0, meaning that the function is always non-negative.

(2) ∫_{−∞}^{∞} f(x) dx = 1

(3) P(a ≤ X ≤ b) = ∫_{a}^{b} f(x) dx = the area under f(x) from a to b

(4) P(X = x) = 0, because there is no area exactly at any single point x.

Example 4-1: Electric Current
Let the continuous random variable X denote the current
measured in a thin copper wire in milliamperes (mA).
Assume that the range of X is 4.9 ≤ x ≤ 5.1 and f(x) = 5.
What is the probability that a current is less than 5 mA?

Answer:

P(X < 5) = ∫_{4.9}^{5} f(x) dx = ∫_{4.9}^{5} 5 dx = 0.5

P(4.95 < X < 5.1) = ∫_{4.95}^{5.1} f(x) dx = 0.75

Figure 4-4 P(X < 5) illustrated.

Cumulative Distribution
Functions
The cumulative distribution function of a continuous
random variable X is,
F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(u) du   for −∞ < x < ∞

The cumulative distribution function is defined for all real numbers.

Example 4-3: Electric Current
For the copper wire current measurement in
Exercise 4-1, the cumulative distribution
function consists of three expressions.

F(x) = 0             for x < 4.9
F(x) = 5x − 24.5     for 4.9 ≤ x ≤ 5.1
F(x) = 1             for 5.1 ≤ x

Figure 4-6 Cumulative distribution function

The plot of F(x) is shown in Figure 4-6.

Probability Density Function from the Cumulative
Distribution Function

• The probability density function (PDF) is the derivative of the
cumulative distribution function (CDF).
• The cumulative distribution function (CDF) is the integral of the
probability density function (PDF).

Given F(x), f(x) = dF(x)/dx as long as the derivative exists.

Exercise 4-5: Reaction Time
• The time until a chemical reaction is complete (in
milliseconds, ms) is approximated by this
cumulative distribution function:

F(x) = 0                for x < 0
F(x) = 1 − e^(−0.01x)   for 0 ≤ x

• What is the probability density function?

f(x) = dF(x)/dx = 0                 for x < 0
                = 0.01 e^(−0.01x)   for 0 ≤ x

• What proportion of reactions is complete within 200 ms?

P(X ≤ 200) = F(200) = 1 − e^(−2) = 0.8647
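A minimal Python sketch of this CDF/PDF pair, checking the 200 ms result:

from math import exp

def F(x):
    # CDF of the reaction time in ms: 1 - exp(-0.01 x) for x >= 0
    return 0.0 if x < 0 else 1 - exp(-0.01 * x)

def f(x):
    # pdf obtained by differentiating F: 0.01 exp(-0.01 x) for x >= 0
    return 0.0 if x < 0 else 0.01 * exp(-0.01 * x)

print(F(200))  # 0.8647: proportion of reactions complete within 200 ms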
Mean & Variance
Suppose X is a continuous random variable with probability
density function f(x). The mean or expected value of X,
denoted as µ or E(X), is

μ = E(X) = ∫_{−∞}^{∞} x f(x) dx

The variance of X, denoted as V(X) or σ², is

σ² = V(X) = ∫_{−∞}^{∞} (x − μ)² f(x) dx = ∫_{−∞}^{∞} x² f(x) dx − μ²

The standard deviation of X is σ = √σ².

Normal Distribution
A random variable X with probability density function

f(x) = (1 / (σ √(2π))) e^(−(x − μ)² / (2σ²))   for −∞ < x < ∞

is a normal random variable with parameters μ and σ,
where −∞ < μ < ∞ and σ > 0. Also,

E(X) = μ and V(X) = σ²

and the notation N(μ, σ²) is used to denote the distribution.

Empirical Rule
For any normal random variable,
P(μ – σ < X < μ + σ) = 0.6827
P(μ – 2σ < X < μ + 2σ) = 0.9545
P(μ – 3σ < X < μ + 3σ) = 0.9973

Figure 4-12 Probabilities associated with a normal distribution


Standard Normal Random
Variable
A normal random variable with
μ = 0 and σ2 = 1
is called a standard normal random variable
and is denoted as Z. The cumulative
distribution function of a standard normal
random variable is denoted as:
Φ(z) = P(Z ≤ z)
Values are found in Appendix Table III and
by using Excel and Minitab.

Example 4-11
Assume Z is a standard normal random variable.
Find P(Z ≤ 1.50). Answer: 0.93319

Figure 4-13 Standard normal Probability density function

Find P(Z ≤ 1.53). Answer: 0.93699


Find P(Z ≤ 0.02). Answer: 0.50398

NOTE : The column headings refer to the hundredths digit of the value of z in P(Z ≤ z).
For example, P(Z ≤ 1.53) is found by reading down the z column to the row 1.5 and then selecting
the probability from the column labeled 0.03 to be 0.93699.

Standardizing a Normal Random
Variable
Suppose X is a normal random variable with mean μ and variance σ².
Then the random variable

Z = (X − μ) / σ

is a normal random variable with E(Z) = 0 and V(Z) = 1.

The probability is obtained by using Appendix Table III with
z = (x − μ) / σ.

Example 4-14:
Suppose that the current measurements in a strip of wire are assumed
to follow a normal distribution with μ = 10 and σ = 2 mA, what is the
probability that the current measurement is between 9 and 11 mA?

Answer:

P(9 < X < 11) = P((9 − 10)/2 < Z < (11 − 10)/2)
              = P(−0.5 < Z < 0.5)
              = P(Z < 0.5) − P(Z < −0.5)
              = 0.69146 − 0.30854 = 0.38292

Using Excel
0.38292 = NORMDIST(11,10,2,TRUE) - NORMDIST(9,10,2,TRUE)
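The standard normal CDF Φ is also available in plain Python through the
error function, so the Excel call above can be reproduced. A minimal
sketch:

from math import erf, sqrt

def phi(z):
    # Standard normal CDF: P(Z <= z), written in terms of erf
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 10, 2
p = phi((11 - mu) / sigma) - phi((9 - mu) / sigma)
print(p)  # 0.38292, matching the table lookup above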

Example 4-14
Determine the value x for which the probability that a current
measurement is below x is 0.98.

Answer:

P(X < x) = P((X − 10)/2 < (x − 10)/2)
         = P(Z < (x − 10)/2) = 0.98

z = 2.05 is the closest value in Appendix Table III.

x = 2(2.05) + 10 = 14.1 mA

Using Excel
14.107 = NORMINV(0.98,10,2)
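Without Excel, the inverse CDF can be approximated by bisection, since Φ
is increasing. A minimal sketch; the helper name norm_inv is illustrative
and stands in for NORMINV:

from math import erf, sqrt

def phi(z):
    # Standard normal CDF: P(Z <= z)
    return 0.5 * (1 + erf(z / sqrt(2)))

def norm_inv(p, mu, sigma):
    # Invert the standard normal CDF by bisection, then unstandardize
    lo, hi = -10.0, 10.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return mu + sigma * (lo + hi) / 2

print(norm_inv(0.98, 10, 2))  # ~14.107 mA, matching NORMINV(0.98,10,2)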
