
Course: Statistical Methods

Effective Period: September 2015

Probabilities

Session 3-4
Learning Outcome

Solve relevant food technology problems using
frequency distribution, probabilities, and hypothesis
Introduction to Probability

Basic Concepts

A probability is a numerical quantity that expresses the likelihood of an event.


The probability of an event E is written as

Pr{E}

The probability Pr{E} is always a number between 0 and 1, inclusive.


Example

Coin Tossing Consider the familiar chance operation of tossing a coin, and
define the event E: Heads.
Each time the coin is tossed, either it falls heads or it does not. If the coin
is equally likely to fall heads or tails, then

Pr{E} = 0.5

Such an ideal coin is called a “fair” coin. If the coin is not fair (perhaps
because it is slightly bent), then Pr{E} will be some value other than 0.5,
for instance,
Probability Trees
If a fair coin is tossed twice, then the probability of
heads is 0.5 on each toss. The first part of a probability
tree for this scenario shows that there are two possible
outcomes for the first toss and that they have
probability 0.5 each.
Then the tree shows that, for either outcome of the first
toss, the second toss can be either heads or tails, again
with probabilities 0.5 each.
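
The two-level tree described here can be written out as a short Python sketch. It assumes, as the text does, that the second toss has probability 0.5 for each outcome regardless of the first toss, so the probability of a path is the product of the branch probabilities along it. The names `toss` and `paths` are illustrative only.

```python
from itertools import product

# Branch probabilities for a single toss of a fair coin.
toss = {"H": 0.5, "T": 0.5}

# Each path through the two-level tree is a pair (first toss, second toss).
# The path probability is the product of the branch probabilities along it.
paths = {
    first + second: toss[first] * toss[second]
    for first, second in product(toss, repeat=2)
}

for path, prob in paths.items():
    print(path, prob)

# The four leaves of the tree together cover every possibility.
total = sum(paths.values())
print("total:", total)
```

Each of the four paths (HH, HT, TH, TT) receives probability 0.25, and the leaves add to 1.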

Example

In the Drosophila population, 30% of the flies are black
and 70% are gray. Suppose that two flies are randomly
chosen from the population. Suppose we wish to find
the probability that both flies are the same color.
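
One way to work this out is to follow the probability tree: multiply along each path and add the paths in which the two colors match. The sketch below assumes the two flies are chosen independently, which is reasonable when the population is large. The function name `path_probability` is hypothetical.

```python
from itertools import product

# Color proportions in the population, taken from the example.
color = {"black": 0.30, "gray": 0.70}

def path_probability(first, second):
    """Probability of one path: multiply the two branch probabilities."""
    return color[first] * color[second]

# Add the paths in which both flies share a color:
# (black, black) and (gray, gray).
same_color = sum(
    path_probability(a, b)
    for a, b in product(color, repeat=2)
    if a == b
)
print(round(same_color, 2))
```

Under that assumption the result is 0.30 × 0.30 + 0.70 × 0.70 = 0.09 + 0.49 = 0.58.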
Density Curves

The examples presented in Section 3.2 dealt with probabilities for discrete
variables. In this section we will consider probability when the variable is
continuous.

Relative Frequency Histograms and Density Curves

Example 3.4.1

Blood Glucose A glucose tolerance test can be useful in diagnosing diabetes. The
blood level of glucose is measured one hour after the subject has drunk 50 mg of
glucose dissolved in water. Figure 3.4.1 shows the distribution of responses to this
test for a certain population of women. The distribution is represented by
histograms with class widths equal to (a) 10 and (b) 5, and by (c) a smooth curve.
Example 3.4.2

Blood Glucose Figure 3.4.4 shows the density curve for the blood glucose
distribution of Example 3.4.1, with the vertical scale explicitly shown. The
shaded area is equal to 0.42, which indicates that about 42% of the glucose
levels are between 100 mg/dl and 150 mg/dl. The area under the density curve
to the left of 100 mg/dl is equal to 0.50; this indicates that the population
median glucose level is 100 mg/dl. The area under the entire curve is 1.
Probabilities and Density Curves
If a variable has a continuous distribution, then we find probabilities by using
the density curve for the variable. A probability for a continuous variable
equals the area under the density curve for the variable between two points.
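
The idea that a probability is an area can be illustrated numerically. The sketch below does not use the blood glucose curve, whose formula is not given here. It assumes a hypothetical density f(x) = exp(-x) for x ≥ 0 and approximates the area between two points with the trapezoid rule, then compares it with the exact area for that curve.

```python
import math

def density(x):
    """Hypothetical density curve: f(x) = exp(-x) for x >= 0, else 0."""
    return math.exp(-x) if x >= 0 else 0.0

def area_under_curve(f, a, b, steps=10_000):
    """Approximate the area under f between a and b (trapezoid rule)."""
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * width)
    return total * width

# Pr{1 <= X <= 2} is the area under the density between 1 and 2.
prob = area_under_curve(density, 1.0, 2.0)

# For this particular curve the exact area is exp(-1) - exp(-2).
exact = math.exp(-1.0) - math.exp(-2.0)
print(round(prob, 4), round(exact, 4))

# The area under (essentially) the entire curve should be close to 1.
whole = area_under_curve(density, 0.0, 50.0)
print(round(whole, 4))
```

The same function could be applied to any density curve for which a formula or a table of heights is available.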

Example 3.4.3

Blood Glucose Consider the blood glucose level, in mg/dl, of a randomly
chosen subject from the population described in Example 3.4.2. We saw in
Example 3.4.2 that 42% of the population glucose levels are between 100
mg/dl and 150 mg/dl. Thus, Pr{100 ≤ glucose level ≤ 150} = 0.42.

We are modeling blood glucose level as being a continuous variable, which
means that Pr{glucose level = 100} = 0, as we noted above. Thus,

Pr{100 ≤ glucose level ≤ 150} = Pr{100 < glucose level < 150} = 0.42.
Example 3.4.4
Tree Diameters The diameter of a tree trunk is an important variable in
forestry. The density curve shown in Figure 3.4.5 represents the distribution of
diameters (measured 4.5 feet above the ground) in a population of 30-year-old
Douglas fir trees; areas under the curve are shown in the figure. Consider the
diameter, in inches, of a randomly chosen tree. Then, for example,
Pr{4 < diameter < 6} = 0.33.

If we want to find the probability that a randomly chosen tree has a diameter
greater than 8 inches, we must add the last two areas under the curve in Figure
3.4.5:
Pr{diameter > 8} = 0.12 + 0.07 = 0.19.
Random Variables
A random variable is simply a variable that takes on numerical values that
depend on the outcome of a chance operation. The following examples
illustrate this idea.
Example 3.5.1

Dice Consider the chance operation of tossing a die. Let the random variable Y
represent the number of spots showing. The possible values of Y are Y = 1, 2, 3,
4, 5, or 6. We do not know the value of Y until we have tossed the die. If we
know how the die is weighted, then we can specify the probability that Y has a
particular value, say Pr{Y = 4}, or a particular set of values, say Pr{2 ≤ Y ≤ 4}. For
instance, if the die is perfectly balanced so that each of the six faces is equally
likely, then
Pr{Y = 4} = 1/6
and
Pr{2 ≤ Y ≤ 4} = 3/6 = 0.5
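
These two values can be checked with a small sketch that stores the distribution of Y and adds the probabilities of the values in an event. It assumes the perfectly balanced die of the example. The helper name `pr` is hypothetical.

```python
from fractions import Fraction

# A perfectly balanced die: each face 1..6 has probability 1/6.
distribution = {face: Fraction(1, 6) for face in range(1, 7)}

def pr(event):
    """Add the probabilities of the values y for which event(y) is true."""
    return sum(p for y, p in distribution.items() if event(y))

pr_four = pr(lambda y: y == 4)
pr_two_to_four = pr(lambda y: 2 <= y <= 4)
print(pr_four, pr_two_to_four)  # 1/6 1/2
```

Using exact fractions avoids rounding, so the results match 1/6 and 3/6 = 1/2 exactly. A weighted die would only require different values in `distribution`.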
Example 3.5.3
Medications After someone has heart surgery, the person is usually given
several medications. Let the random variable Y denote the number of
medications that a patient is given following cardiac surgery.

If we know the distribution of the number of medications per patient for the
entire population, then we can specify the probability that Y has a certain value
or falls within a certain interval of values. For instance, if 52% of all patients are
given 2, 3, 4, or 5 medications, then

Pr{2 ≤ Y ≤ 5} = 0.52
Mean and Variance of a Random Variable
Example 3.5.5
Fish Vertebrae In a certain population of the freshwater sculpin, Cottus
rotheus, the distribution of the number of tail vertebrae, Y, is as shown in Table
3.5.1.
Example 3.5.7
Fish Vertebrae Consider the distribution of vertebrae given in Table 3.5.1. In
Example 3.5.5 we found that the mean of Y is μ_Y = 21.49. The variance of Y is

The standard deviation of Y is
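
The values of Table 3.5.1 are not reproduced in these notes, so the sketch below illustrates the calculation on a different, hypothetical distribution: a fair six-sided die. It uses the standard definitions for a discrete random variable: the mean is the sum of y × Pr{Y = y}, the variance is the sum of (y − mean)² × Pr{Y = y}, and the standard deviation is the square root of the variance.

```python
import math

# Hypothetical distribution for illustration only: a fair six-sided die.
# This is NOT the sculpin data from Table 3.5.1.
distribution = {y: 1 / 6 for y in range(1, 7)}

def mean(dist):
    """Mean: sum of y * Pr{Y = y} over all possible values y."""
    return sum(y * p for y, p in dist.items())

def variance(dist):
    """Variance: sum of (y - mean)^2 * Pr{Y = y} over all values y."""
    mu = mean(dist)
    return sum((y - mu) ** 2 * p for y, p in dist.items())

mu_y = mean(distribution)
var_y = variance(distribution)
sd_y = math.sqrt(var_y)
print(round(mu_y, 4), round(var_y, 4), round(sd_y, 4))
```

Replacing `distribution` with the values and probabilities from a table such as Table 3.5.1 would give the mean, variance, and standard deviation for that table.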


The Binomial Distribution
To add some depth to the notion of probability and random variables, we now
consider a special type of random variable, the binomial. The distribution of a
binomial random variable is a probability distribution associated with a special
kind of chance operation. The chance operation is defined in terms of a set of
conditions called the independent-trials model.
Example 3.6.1

Albinism If two carriers of the gene for albinism marry, each of their children
has probability 1/4 of being albino.

The chance that the second child is albino is the same (1/4) whether or not
the first child is albino; similarly, the outcome for the third child is
independent of the first two, and so on.

Using the labels “success” for albino and “failure” for non-albino, the
independent-trials model applies with p = 1/4 and n = the number of children in
the family.
The Binomial Distribution Formula
Example 3.6.4

Mutant Cats Suppose we draw a random sample of five individuals from a
large population in which 37% of the individuals are mutants (as in Example
3.6.2).

The probabilities of the various possible samples are then given by the
binomial distribution formula with n = 5 and p = 0.37; the results are displayed
in Table 3.6.3. For instance, the probability of a sample containing 2 mutants
and 3 non-mutants is

Pr{Y = 2} = 10 × (0.37)^2 × (0.63)^3 ≈ 0.3423

Thus, Pr{Y = 2} ≈ 0.34. This means that about 34% of random samples of size 5 will
contain two mutants and three non-mutants.

Notice that the probabilities in Table 3.6.3 add to 1. The probabilities in a
probability distribution must always add to 1, because they account for 100% of
the possibilities.
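
The calculation behind such a table can be sketched in Python. The sketch uses the standard binomial formula, Pr{Y = j} = C(n, j) × p^j × (1 − p)^(n − j), where C(n, j) counts the ways to choose j successes among n trials, with n = 5 and p = 0.37 from Example 3.6.4. The function name `binomial_pmf` is hypothetical.

```python
from math import comb

def binomial_pmf(j, n, p):
    """Pr{Y = j} for n independent trials with success probability p."""
    return comb(n, j) * p**j * (1 - p) ** (n - j)

n, p = 5, 0.37

# One probability for each possible number of mutants, 0 through 5.
table = {j: binomial_pmf(j, n, p) for j in range(n + 1)}
for j, prob in table.items():
    print(j, round(prob, 4))

# The probabilities must account for 100% of the possibilities.
total = sum(table.values())
print("total:", round(total, 4))
```

The entry for j = 2 is about 0.3423, and the six probabilities add to 1 up to floating-point rounding.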
Applicability of the Binomial Distribution
Example 3.6.9
Chickenpox Consider the occurrence of chickenpox in children. Each child in a
family can be categorized according to whether he had chickenpox during a
certain year.

One can say that each child constitutes a “trial” and that “success” is having
chickenpox during the year, but the trials are not independent because the
chance of a particular child catching chickenpox depends on whether his sibling
caught chickenpox.

As a specific example, consider a family with five children, and suppose that the
chance of an individual child catching chickenpox during the year is equal to
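0.10. The binomial distribution gives the chance of all five children getting
chickenpox as

(0.10)^5 = 0.00001

Because the trials are not independent, however, this binomial answer cannot be trusted.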
Exercises
3.2.1 In a certain population of the freshwater sculpin, Cottus rotheus, the
distribution of the number of tail vertebrae is as shown in the table.

Find the probability that the number of tail vertebrae in a fish randomly chosen from
the population
(a) equals 21 (b) is less than or equal to 22.
(c) is greater than 21 (d) is no more than 21.
3.2.2 In a certain college, 55% of the students are
women. Suppose we take a sample of two students. Use
a probability tree to find the probability
(a) that both chosen students are women.
(b) that at least one of the two students is a woman.
3.5.7 A group of college students were surveyed to learn how
many times they had visited a dentist in the previous year. The
probability distribution for Y, the number of visits, is given by the
following table.

Calculate the mean of the number of visits.


3.6.6 The sex ratio of newborn human infants is about 105 males
to 100 females. If four infants are chosen at random, what is the
probability that
(a) two are male and two are female?
(b) all four are male?
(c) all four are the same sex?
References
Myra L. Samuels, Jeffrey A. Witmer, and Andrew
Schaffner. Statistics for the Life Sciences, Chapter 3.
