CHAPTER ONE
1.1 Introduction
Descriptive statistics deals with processing data without attempting to draw any
inferences from them. It refers to the presentation of data in the form of tables,
charts/graphs and gives some characteristics of data such as averages and dispersion.
Inductive statistics is a scientific discipline concerned with developing and using
mathematical tools to make forecasts and inferences. The term inference means the act or
process of deriving a conclusion based solely on what an individual knows.
This chapter introduces the presentation of data using techniques that are commonly applied in
statistics. Inductive statistics will be discussed later, after the review of probability and
probability distributions.
Example 1.1 The following data were obtained when a die was rolled 30 times.
1 2 4 2 2 6 3 5 6 3 3 1 3 1 3
4 5 3 5 3 5 1 6 3 1 2 4 2 4 4
Use the above data to construct a frequency table
Solution
The frequency table is constructed by tallying repeated observations, in order to
know the number of times each observation appears in the data set. The result is
shown in the following table.
Number 1 2 3 4 5 6 TOTAL
Frequency 5 5 8 5 4 3 30
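The tallying step can be sketched in Python as follows. This is only an illustration; the variable names are chosen here and are not part of the original notes.

```python
from collections import Counter

# Die outcomes from Example 1.1 (30 rolls)
rolls = [1, 2, 4, 2, 2, 6, 3, 5, 6, 3, 3, 1, 3, 1, 3,
         4, 5, 3, 5, 3, 5, 1, 6, 3, 1, 2, 4, 2, 4, 4]

# Counter tallies how many times each face appears
frequency = Counter(rolls)

for face in sorted(frequency):
    print(face, frequency[face])
print("TOTAL", sum(frequency.values()))
```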
When there is a huge mass of data with many of the values being distinct, it is convenient
to form a grouped frequency distribution rather than ungrouped. In this case various
values are grouped in a class and they are tallied to obtain a class frequency. The grouped
frequency distributions of equal class size are reasonable only when the data do not
contain extreme values (values that are very far from others in the same data set).
There are no specific rules for formulating this kind of frequency distribution; it depends
on the number of classes you want. It is recommended to have grouped frequency
tables with between 6 and 12 classes inclusive, depending on the size of the
data set.
1. Identify the smallest and largest values of the data set and hence compute the range.
2. Decide on the number of classes you want in the distribution and hence compute
the class size h using the relation
\[ h = \frac{\text{range}}{\text{number of classes}} \]
3. Write your first class of size h with the lower limit (first value) 2 or 3 units below the
minimum value. List the other classes, each of size h, making sure that all data are included
in the distribution. To avoid confusion during tallying, disconnected (non-overlapping) classes are
preferable.
4. Tally the frequency of each class and hence obtain a grouped frequency distribution.
Example 1.2 The following data give the amount (in dollars) spent on groceries by a
family during the past forty weeks
32 22 19 18 43 42 40 43 18 21
31 26 22 25 47 40 26 32 22 34
28 35 47 26 35 38 35 28 19 38
35 38 36 25 22 45 48 26 34 41
Solution
The minimum value is 18, the maximum value is 48.
Then, the range = 48 – 18 = 30.
Number of classes is 7, so the class size h = 30/7 = 4.29. So we try h = 5.
The classes are 15 – 19, 20 – 24, 25 – 29, 30 – 34, 35 – 39, 40 – 44, 45 – 49 which surely
include all values from 18 to 48.
The frequency distribution for the grocery expenditure is formulated below:
15 – 19 4
20 – 24 5
25 – 29 8
30 – 34 5
35 – 39 8
40 – 44 6
45 – 49 4
TOTAL 40
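The four steps above can be sketched for the grocery data as follows. The function name and its parameters are hypothetical, and the sketch assumes whole-number data with classes of the form lower to lower + h - 1.

```python
# Weekly grocery spending (dollars) from Example 1.2
amounts = [32, 22, 19, 18, 43, 42, 40, 43, 18, 21,
           31, 26, 22, 25, 47, 40, 26, 32, 22, 34,
           28, 35, 47, 26, 35, 38, 35, 28, 19, 38,
           35, 38, 36, 25, 22, 45, 48, 26, 34, 41]

def grouped_frequency(data, first_lower, class_size, n_classes):
    """Tally integer data into classes [lower, lower + class_size - 1]."""
    table = []
    for k in range(n_classes):
        lower = first_lower + k * class_size
        upper = lower + class_size - 1
        count = sum(1 for value in data if lower <= value <= upper)
        table.append((lower, upper, count))
    return table

# Range = 48 - 18 = 30, seven classes, so h is rounded up from 4.29 to 5
grocery_table = grouped_frequency(amounts, first_lower=15, class_size=5, n_classes=7)
for lower, upper, count in grocery_table:
    print(f"{lower} - {upper}: {count}")
```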
If the data contain some extreme values, the previous technique is not generally
applicable. In this case only values that are close to each other are considered first, and
the extreme values may be grouped together into one class.
Example 1.3 Prices of thirty stocks (in thousands of shillings) on a given day were recorded
as follows:
11.2 8.9 20.0 9.5 35.0 41.0 14.6 100.0 9.0 10.5
79.0 32.5 46.7 22.9 13.5 17.3 41.8 30.4 93.0 33.7
14.4 20.9 34.5 10.8 45.7 104.0 42.6 10.1 41.0 53.8
Solution
Although the data range from 8.9 to 104, most of the values are concentrated between
8 and 55, and only a few observations fall between 55 and 104.
Since we need only five classes, we obtain four classes from the closely spaced values and one
class for the extreme values. For the first four classes we proceed as follows:
Minimum value = 8.9, maximum value = 53.8. So range = 44.9.
Number of classes is 4. Hence h = 44.9/4 = 11.225. So try h = 12; the first four
classes are then 8 – 19, 20 – 31, 32 – 43 and 44 – 55, and the fifth class is 56 – 104.
The frequency distribution table is given below:
8 – 19 11
20 – 31 4
32 – 43 8
44 – 55 3
56 – 104 4
TOTAL 30
1.3 Class Limits, Class Boundaries, Class Marks and Class Intervals
A class mark is the middle value between the lower and upper class boundaries (or limits).
A class interval (also called class size, width or length) is the difference between the upper
boundary and the lower boundary of a class.
Example 1.4 Find the class limits, boundaries, class marks and class width of the following
classes 15 – 19 and 20 – 29.
Solution
The required statistics are summarized in the following table.
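A minimal sketch of these quantities for the two classes of Example 1.4 is given here. It assumes that the boundaries lie half a unit outside the limits, which matches the boundaries 14.5 – 19.5 used for the class 15 – 19 in the histogram example; the function name is hypothetical.

```python
def class_summary(lower_limit, upper_limit, gap=1):
    """Summarize one class; gap is the distance between adjacent class
    limits (assumed to be 1 here, since the limits are whole numbers)."""
    lower_boundary = lower_limit - gap / 2
    upper_boundary = upper_limit + gap / 2
    return {
        "limits": (lower_limit, upper_limit),
        "boundaries": (lower_boundary, upper_boundary),
        "class_mark": (lower_boundary + upper_boundary) / 2,
        "class_width": upper_boundary - lower_boundary,
    }

summaries = [class_summary(15, 19), class_summary(20, 29)]
for row in summaries:
    print(row)
```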
1.4 Histograms
Example 1.5 Draw the histogram for grocery problem given in Example 1.2.
Solution
In this case we first create frequency table with class boundaries as shown below
[Figure: histogram of the grocery data. Horizontal axis: class boundaries 14.5 – 19.5, 19.5 – 24.5, 24.5 – 29.5, 29.5 – 34.5, 34.5 – 39.5, 39.5 – 44.5, 44.5 – 49.5. Vertical axis: frequency, 0 to 9.]
A frequency polygon is a polygon whose vertices are the frequencies at the class marks
of the classes. To create a frequency polygon, one has to extend the distribution by
introducing one class before the lowest class and one class after the highest class, both
with zero frequency.
Example 1.6 Draw a frequency polygon for the data in Example 1.2.
Solution
The frequency distribution with class marks is shown below
Class Class mark Frequency
10 – 14 12 0
15 – 19 17 4
20 – 24 22 5
25 – 29 27 8
30 – 34 32 5
35 – 39 37 8
40 – 44 42 6
45 – 49 47 4
50 – 54 52 0
The frequency polygon is given below where the class marks are amounts in dollars
[Figure: frequency polygon of the grocery data. Horizontal axis: class marks $12, $17, $22, $27, $32, $37, $42, $47, $52. Vertical axis: frequency, 0 to 9.]
A cumulative frequency polygon is a line graph obtained by plotting the upper class boundaries along the
horizontal axis and the corresponding cumulative frequencies along the vertical axis.
Example 1.7 Draw a cumulative frequency polygon for the data given in Table 1.5.
Solution
The table will be extended by adding a column of upper class boundaries and a column of
cumulative frequency as shown below
[Figure: cumulative frequency polygon. Horizontal axis: boundaries (less than 14.5, 19.5, 24.5, 29.5, 34.5, 39.5, 44.5, 49.5). Vertical axis: cumulative frequency, 0 to 45.]
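The points of a cumulative frequency polygon can be computed as in the sketch below. It assumes that the table in question is the grocery distribution of Example 1.2, since the boundaries on the axis of the figure match that distribution.

```python
from itertools import accumulate

# Upper class boundaries and class frequencies of the grocery distribution
upper_boundaries = [19.5, 24.5, 29.5, 34.5, 39.5, 44.5, 49.5]
frequencies = [4, 5, 8, 5, 8, 6, 4]

# Running totals give the "less than" cumulative frequencies
cumulative = list(accumulate(frequencies))

# The polygon starts at zero at the lower boundary of the first class
points = [(14.5, 0)] + list(zip(upper_boundaries, cumulative))
for boundary, total in points:
    print(f"Less than {boundary}: {total}")
```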
EXERCISES 1
1. The scores of 50 students in one of their mathematics tests were recorded as follows:
55 34 49 73 81 38 66 71 63 56
47 52 39 75 61 56 50 48 65 73
54 62 58 46 48 35 76 85 68 55
66 72 69 58 54 46 49 58 76 67
77 44 59 68 36 36 48 47 58 73
Construct the frequency distribution using six classes with the lower limit of the first
class interval being 30. The distribution should include the relative frequencies.
2. Forty statistical values were collected from a certain engineering firm, and the results,
in thousands of dollars, are given below:
1.8 2.3 10.0 30.0 1.2 4.9 2.0 5.1 8.1 40.1
1.3 2.5 3.6 1.8 2.4 7.6 3.1 8.5 6.5 6.2
1.2 2.8 4.5 6.1 7.8 3.2 2.8 6.1 3.8 4.9
19.6 1.7 8.6 6.4 5.2 4.1 3.1 8.1 6.8 5.8
Construct a frequency distribution with five classes only, starting with the lower limit
of the first class as 1.0.
4. Repeat question (3) using the grouped frequency distribution obtained in (2) above.
CHAPTER TWO
2.1 Introduction
In this chapter, we shall discuss measures of average (central tendency) and
dispersion (spread). These measures are important for statistical reporting and analysis.
They do not involve any inference, but the information they give is important for
decision making.
Mean, median and mode are the three major groups of measures of central tendency,
sometimes called measures of average. These measures can be determined for
ungrouped and grouped data.
2.2.1.1 Mean
There are various mean measures in statistics. These include arithmetic mean,
geometric mean and harmonic mean. We define each of these measures below;
\[ H = \frac{1}{\dfrac{1}{n}\displaystyle\sum_{i=1}^{n}\dfrac{1}{x_i}} \]
Although the harmonic mean is not a commonly used measure of average, it is sometimes desirable.
Definition 2.3 (Arithmetic mean) The most commonly used measure of average is the
arithmetic mean. It is denoted by $\bar{x}$. The arithmetic mean is regarded as a suitable
measure of central tendency when the values in the data set are symmetric, in other
words, when the data set contains no extreme values. Given a set of n observations as
shown above, then
\[ \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \]
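The three means can be compared with a short sketch. The data are hypothetical, and the geometric mean is taken here in its standard form (the n-th root of the product of the observations), which is an assumption since its definition is not reproduced above.

```python
import math

def arithmetic_mean(values):
    # sum of the observations divided by their number
    return sum(values) / len(values)

def geometric_mean(values):
    # n-th root of the product; requires positive values
    return math.prod(values) ** (1 / len(values))

def harmonic_mean(values):
    # reciprocal of the mean of the reciprocals; requires non-zero values
    return 1 / (sum(1 / v for v in values) / len(values))

sample = [2, 4, 8]  # hypothetical data
print(arithmetic_mean(sample), geometric_mean(sample), harmonic_mean(sample))
```

For positive data that are not all equal, the harmonic mean is smaller than the geometric mean, which in turn is smaller than the arithmetic mean.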
2.2.1.2 Median
Another measure of central tendency is the median. This is defined as the middle value
of the data set when the data are arranged in order (ascending or descending). The
median is a suitable measure of average for data with extreme values. It is also used to
give the general overview for a huge mass of data whereby the computation of the
arithmetic mean might be tedious. It can be denoted by ~
x.
n 1
The median takes the position th for odd number of observations. However, if
2
th th
n n
there is an even number of observations, it is the average of the and 1
2 2
observations.
2.2.1.3 Mode
This is the value(s) with the highest frequency in the data set. It can be used to
determine the most favourable output of a certain experiment and help decide on what
measures may be taken from that output. It is commonly denoted by x̂ . For instance, a
shopkeeper may observe that the preferred neck size of shirts in his/her shop is 40; this will
enable him/her to increase the stock of the most preferred neck size in future orders.
Example 2.1 Compute arithmetic mean, median and mode for the following data.
3, 4, 6, 8, 3, 5, 9, 11, 7, 10
Solution
We can first arrange the data in ascending order as follows;
3, 3, 4, 5, 6, 7, 8, 9, 10, 11
Arithmetic mean The arithmetic mean is given by
\[ \bar{x} = \frac{\sum x}{n} = \frac{3 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11}{10} = \frac{66}{10} = 6.6 \]
Mode The mode is the value with the highest frequency. In this case the highest frequency is 2,
and it belongs to the value 3. Therefore, the mode is 3.
Median There is an even number of observations, n = 10. In this case, the median is the
average of the fifth and sixth observations, which are 6 and 7. Hence,
\[ \text{Median} = \frac{6 + 7}{2} = 6.5 \]
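The three results of Example 2.1 can be checked with Python's standard `statistics` module, as in this short sketch (variable names are illustrative).

```python
import statistics

data = [3, 4, 6, 8, 3, 5, 9, 11, 7, 10]  # data from Example 2.1

mean_value = statistics.mean(data)      # sum of the values divided by n
median_value = statistics.median(data)  # average of the two middle values when n is even
mode_value = statistics.mode(data)      # most frequent value

print(mean_value, median_value, mode_value)
```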
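\[ \bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}, \]
where $x_i$ is the class mark and $f_i$ the frequency of the i-th of the k classes.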
To compute the median we first need to identify the class which contains the median;
we call this the median class. Then the median can be computed using the formula
\[ \tilde{x} = L + \left(\frac{\frac{N}{2} - C_b}{f_m}\right) h \]
where $L$ = lower boundary of the median class
$N$ = total number of observations
$C_b$ = cumulative frequency before the median class
$f_m$ = frequency of the median class
$h$ = class width/size of the median class
2.2.2.3 Mode
Example 2.2 The height (in inches) of 100 male students at ABC College were
recorded as follows
Height 60 – 62 63 – 65 66 – 68 69 – 71 72 - 74
Frequency 5 18 42 27 8
Solution
Given such a grouped frequency, we first compute various sums and summarize in a
table as shown below
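(a) \[ \text{Mean} = \frac{\sum f_i x_i}{\sum f_i} = \frac{6745}{100} = 67.45 \]
(b) Median is the value contained in class 66 – 68. From this class we have
$L = 65.5$, $f_m = 42$, $C_b = 23$, $h = 3$.
Then,
\[ \text{Median} = L + \left(\frac{\frac{N}{2} - C_b}{f_m}\right) h = 65.5 + \frac{(50 - 23)(3)}{42} = 67.43 \]
(c) The class with the highest frequency is 66 – 68 and thus it is the modal class.
From this class, we find that
$L = 65.5$, $h = 3$, $f_m = 42$, $f_b = 18$, $f_a = 27$,
implying that $\Delta_1 = f_m - f_b = 42 - 18 = 24$ and $\Delta_2 = f_m - f_a = 42 - 27 = 15$.
Therefore,
\[ \text{mode} = L + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) h = 65.5 + \frac{24}{39}(3) = 65.5 + 1.846 = 67.35 \]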
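The grouped-data computations of Example 2.2 can be reproduced with the sketch below. It follows the formulas used in the solution; the variable names are hypothetical, and the boundaries are assumed to lie 0.5 below the lower class limits.

```python
# Height classes from Example 2.2: (lower limit, upper limit, frequency)
classes = [(60, 62, 5), (63, 65, 18), (66, 68, 42), (69, 71, 27), (72, 74, 8)]

N = sum(f for _, _, f in classes)
h = 3  # class width, measured from boundary to boundary

# Mean: sum of f_i * x_i divided by N, with x_i the class mark
grouped_mean = sum(f * (lo + hi) / 2 for lo, hi, f in classes) / N

# Median: the median class is the first whose cumulative frequency reaches N/2
cumulative_before = 0
for lo, hi, f in classes:
    if cumulative_before + f >= N / 2:
        grouped_median = (lo - 0.5) + (N / 2 - cumulative_before) / f * h
        break
    cumulative_before += f

# Mode: the modal class is the one with the highest frequency
m = max(range(len(classes)), key=lambda i: classes[i][2])
f_m = classes[m][2]
f_b = classes[m - 1][2] if m > 0 else 0                 # frequency before the modal class
f_a = classes[m + 1][2] if m < len(classes) - 1 else 0  # frequency after the modal class
delta1, delta2 = f_m - f_b, f_m - f_a
grouped_mode = (classes[m][0] - 0.5) + delta1 / (delta1 + delta2) * h

print(round(grouped_mean, 2), round(grouped_median, 2), round(grouped_mode, 2))
```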
Measures of dispersion show how data deviate from the given measure of average
(arithmetic mean or median). These measures include range, mean absolute deviation,
standard deviation and quartile deviation.
The most commonly used measure of variation is the sample standard deviation, since
the population standard deviation is not easily obtained in practice. However, this
measure is not suitable for data with extreme values. If the data contain some
extreme values, the appropriate measure is the quartile deviation. The mean
absolute deviation is rarely used to compare the variation between two data sets.
The range is a very crude measure of dispersion. It is used just to give the general
overview on the spread of data. In similar fashion we shall separately discuss
ungrouped and grouped data.
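Consider a set of values $X_1, \ldots, X_N$ from a certain population, and $x_1, x_2, \ldots, x_n$ from a
sample of the given population. Then, we compute the following measures.
The mean absolute deviation for the population is
\[ \text{MAD} = \frac{\sum |X_i - \bar{X}|}{N} \]
where $\bar{X}$ is the population mean and N = population size.
Similarly, for a sample we have
\[ \text{MAD} = \frac{\sum |x_i - \bar{x}|}{n}, \] where n = sample size.
The population variance and the sample variance are
\[ \sigma^2 = \frac{\sum (X_i - \bar{X})^2}{N}, \qquad s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}. \]
The standard deviation is defined as the square root of the variance. It is denoted by $\sigma$
for a population, and s for a sample. It is also denoted in general by SD(X).
From the above formulas we have
\[ \sigma = \sqrt{\frac{\sum (X_i - \bar{X})^2}{N}} \quad \text{and} \quad s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} \]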
Quartile Deviation
Example 2.3 Compute the mean absolute deviation, sample standard deviation and quartile
deviation of the following data: 10, 12, 8, 16, 8, 20, 21, 17
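\[ \text{MAD} = \frac{\sum |x_i - \bar{x}|}{n} = \frac{36}{8} = 4.5 \]
\[ s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{190}{7}} = 5.21 \]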
We can also obtain the sample variance without prior computation of the arithmetic
mean, $\bar{x}$. This alternative formula is given by
\[ s^2 = \frac{1}{n - 1}\left[\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}\right] \]
\[ Q_3 = \frac{3}{4}(n + 1)\text{th observation} = \frac{27}{4}\text{th observation} \]
= 6.75th observation
= 6th + 0.75 (7th – 6th)
= 17 + 0.75 (20 – 17) = 17 + 2.25 = 19.25
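The figures in this solution can be checked with the sketch below. It assumes the data values 8, 8, 10, 12, 16, 17, 20, 21, which are the values consistent with the sums 36 and 190 and with the sixth and seventh ordered observations (17 and 20) used above. The helper `value_at_position` is a hypothetical name for the interpolation step.

```python
import math

data = [10, 12, 8, 16, 8, 20, 21, 17]
n = len(data)
mean = sum(data) / n

# Mean absolute deviation and sample standard deviation from the definitions
mad = sum(abs(x - mean) for x in data) / n
s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))

# Alternative formula that does not need the mean first
s_alt = math.sqrt((sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1))

def value_at_position(sorted_data, position):
    """Value at a 1-based fractional position, by linear interpolation."""
    whole = int(position)
    fraction = position - whole
    lower = sorted_data[whole - 1]
    if fraction == 0:
        return lower
    return lower + fraction * (sorted_data[whole] - lower)

# Upper quartile at the 3(n + 1)/4-th position
q3 = value_at_position(sorted(data), 3 * (n + 1) / 4)
print(mad, round(s, 2), q3)
```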
In this subsection we will consider only two measures commonly used in statistical
analysis and decision making. These include sample standard deviation and quartile
deviation.
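For grouped data the sample mean is
\[ \bar{x} = \frac{\sum f_i x_i}{n}, \quad \text{where } n = \sum f_i \]
Then, the mean absolute deviation for the sample is given by
\[ \text{MAD} = \frac{\sum f_i |x_i - \bar{x}|}{n} \]
and the sample variance by
\[ s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n - 1} \]
Alternatively we use
\[ s^2 = \frac{1}{n - 1}\left[\sum f_i x_i^2 - \frac{\left(\sum f_i x_i\right)^2}{n}\right] \]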
The lower quartile is given by
\[ Q_1 = L + \left(\frac{\frac{N}{4} - C_b}{f}\right) h \]
where, $L$ = lower boundary of the class which contains the lower quartile
$C_b$ = cumulative frequency before the class which contains the lower quartile
$h$ = the class size
$f$ = the class frequency
Similarly, we define the upper quartile by
\[ Q_3 = L + \left(\frac{\frac{3}{4}N - C_b}{f}\right) h \]
where, $L$ = lower boundary of the class which contains the upper quartile
$C_b$ = cumulative frequency before the class which contains the upper quartile
$h$ = the class size
$f$ = the class frequency
Example 2.4 Compute the sample standard deviation and the quartile deviations for
the data in the following table
Class 15 – 19 20 – 24 25 – 29 30 – 34 35 – 39 40 – 44 45 – 49
Freq. 4 5 8 5 8 6 4
Solution
Calculations are summarized in the table below
Class Freq. ($f_i$) Class mark ($x_i$) $f_i x_i$ $f_i x_i^2$
15 – 19 4 17 68 1156
20 – 24 5 22 110 2420
25 – 29 8 27 216 5832
30 – 34 5 32 160 5120
35 – 39 8 37 296 10952
40 – 44 6 42 252 10584
45 – 49 4 47 188 8836
TOTAL 40 1290 44900
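The column totals of this table, and the measures asked for in Example 2.4, can be computed with the sketch below. It uses the alternative variance formula and the quartile formulas for grouped data. The quartile deviation is taken as half the distance between the quartiles, which is the usual definition and is stated here as an assumption; the function name is hypothetical.

```python
import math

# Classes from Example 2.4: (lower limit, upper limit, frequency)
classes = [(15, 19, 4), (20, 24, 5), (25, 29, 8), (30, 34, 5),
           (35, 39, 8), (40, 44, 6), (45, 49, 4)]
h = 5  # class size
n = sum(f for _, _, f in classes)

sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)

# Sample variance from the alternative (computational) formula
variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
std_dev = math.sqrt(variance)

def grouped_quantile(target):
    """Interpolate within the first class whose cumulative frequency reaches target."""
    cumulative_before = 0
    for lo, hi, f in classes:
        if cumulative_before + f >= target:
            return (lo - 0.5) + (target - cumulative_before) / f * h
        cumulative_before += f

q1 = grouped_quantile(n / 4)
q3 = grouped_quantile(3 * n / 4)
quartile_deviation = (q3 - q1) / 2  # assumed definition: semi-interquartile range

print(round(std_dev, 2), q1, q3, quartile_deviation)
```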
EXERCISES 2
Class 40 – 44 45 – 49 50 – 54 55 – 59 60 – 64
frequency 15 30 35 15 5
3. The absolute errors done by two different measuring balances (grams) for the past
eight days were recorded as follows;
Based on sample variances only, suggest which of the two measuring balances may
be considered fairly consistent.
CHAPTER THREE
INTRODUCTION TO PROBABILITY
3.1 Introduction
The term probability can be defined as the chance that a certain event will occur. It
ranges between zero and one. It can be represented as a decimal, percentage or fraction.
For instance, according to the Tanzania Meteorological Authority (TMA), the probability
that it will rain tomorrow is 0.65.
Before giving the theoretical concept to probability let us define some of the important
terms.
A sample space: This is a set consisting of all sample points. It is denoted by S. Consider
the case of tossing a fair coin twice, with H denoting a head showing up and T denoting
a tail. In this case, S is a set of four points given by $S = \{HH, HT, TH, TT\}$.
An Event: An event is a subset of a sample space. It is denoted by E. From the above
experiment, if E is the event that only one head shows up, then E will be a set of two
points given by $E = \{HT, TH\}$.
A sample Point: It is an individual element of a sample space. It is denoted by $e$. From
the above experiment, $e_1 = HH$.
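The sample space and the event for the two-toss experiment can be listed with a short sketch. Treating the four points as equally likely is an assumption that follows from the coin being fair.

```python
from itertools import product

# All ordered outcomes of tossing a coin twice
sample_space = ["".join(p) for p in product("HT", repeat=2)]

# Event E: exactly one head shows up
event_one_head = [point for point in sample_space if point.count("H") == 1]

# With equally likely sample points, P(E) = |E| / |S|
probability = len(event_one_head) / len(sample_space)
print(sample_space, event_one_head, probability)
```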
The theoretical definition of probability also depends on one of two cases.
Example 3.1
A football team has to play two matches to qualify for the second round. There is a 0.7
chance that it will win the first match and a 0.8 chance of winning the second match.
Assuming that winning or losing the first match does not affect the second match, find the
probability that the team will win
(a) Only one match.
(b) At least one match.
Solution
First step is to define the outcomes
Let W1 = a team wins the first match
L1 = a team loses the first match
W2 = a team wins the second match
L2 = a team loses the second match
From the given information we find that
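\[ P(W_1) = 0.7, \quad P(L_1) = 0.3, \quad P(W_2) = 0.8, \quad P(L_2) = 0.2 \]
The sample space can be obtained from the tree diagram as shown below:
First match Second match Outcome
W1 W2 W1W2
W1 L2 W1L2
L1 W2 L1W2
L1 L2 L1L2
(a) If A is the event that the team wins only one match, then
$A = \{W_1L_2, L_1W_2\}$
And,
\[ P(A) = P(W_1)P(L_2) + P(L_1)P(W_2) = 0.7 \times 0.2 + 0.3 \times 0.8 = 0.14 + 0.24 = 0.38 \]
(b) If B is the event that the team wins at least one match, then
$B = \{W_1W_2, W_1L_2, L_1W_2\}$
And,
\[ P(B) = P(W_1)P(W_2) + P(W_1)P(L_2) + P(L_1)P(W_2) = 0.7 \times 0.8 + 0.38 = 0.56 + 0.38 = 0.94 \]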
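The two probabilities of Example 3.1 can be checked numerically with the sketch below, which uses the independence assumption stated in the example. Part (b) is computed through the complement (losing both matches) as a cross-check on the direct sum.

```python
p_w1, p_w2 = 0.7, 0.8              # winning the first and second match
p_l1, p_l2 = 1 - p_w1, 1 - p_w2    # losing the first and second match

# (a) exactly one win: W1L2 or L1W2
p_only_one = p_w1 * p_l2 + p_l1 * p_w2

# (b) at least one win: complement of losing both matches
p_at_least_one = 1 - p_l1 * p_l2

print(round(p_only_one, 2), round(p_at_least_one, 2))
```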
(ii) $0 \le P(E) \le 1$
(iii) $\sum_{\text{all } i} P(e_i) = 1$
Example 3.2
Given the following information: $P(A) = 0.6$, $P(B) = 0.5$, $P(A \cap B) = 0.4$
(a) Find the following: (i) $P(A \cup B)$ (ii) $P(A')$ (iii) $P(B')$ (iv) $P((A \cap B)')$
(b) Show that $P((A \cup B)') = P(A' \cap B')$
Solution
(a) (i) $P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.6 + 0.5 - 0.4 = 0.7$
(ii) $P(A') = 1 - P(A) = 1 - 0.6 = 0.4$
(iii) $P(B') = 1 - P(B) = 1 - 0.5 = 0.5$
(iv) $P((A \cap B)') = 1 - P(A \cap B) = 1 - 0.4 = 0.6$
Definition 3.1 The conditional probability of an event A occurring given that event B has
already occurred, denoted by $P(A/B)$, is given by
\[ P(A/B) = \frac{P(A \cap B)}{P(B)}, \quad P(B) \ne 0 \]
Similarly,
\[ P(B/A) = \frac{P(B \cap A)}{P(A)}, \quad P(A) \ne 0 \]
Example 3.3
An individual is picked at random from a group of 52 athletes. Suppose that 26 of the
athletes are female of which 6 are swimmers. Also, there are 10 swimmers among male
athletes.
(a) Given that the individual picked is a female, find the probability that she is a
swimmer.
(b) Given that the individual picked is a swimmer, find the probability that he is a male.
Solution
Let M = male, F = female, S = swimmer
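As a sketch of how the definition of conditional probability applies to Example 3.3, the two requested probabilities can be computed with exact fractions. The variable names are illustrative.

```python
from fractions import Fraction

total = 52
females = 26
female_swimmers = 6
male_swimmers = 10
swimmers = female_swimmers + male_swimmers

# (a) P(S/F) = P(S and F) / P(F)
p_swimmer_given_female = Fraction(female_swimmers, total) / Fraction(females, total)

# (b) P(M/S) = P(M and S) / P(S)
p_male_given_swimmer = Fraction(male_swimmers, total) / Fraction(swimmers, total)

print(p_swimmer_given_female, p_male_given_swimmer)
```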
The formula for the conditional probability gives the general multiplication rule of
probability, $P(A \cap B) = P(B)\,P(A/B)$.
Example 3.4
The probability that the stock market goes up on Monday is 0.6. Given that it goes up on
Monday, the probability that it goes up on Tuesday is 0.3. Find the probability that the
market will go up on both days.
Solution
Let M = market goes up on Monday, T = market goes up on Tuesday
Given $P(M) = 0.6$, $P(T/M) = 0.3$. Required to find $P(M \cap T)$.
From the rule, $P(M \cap T) = P(M)\,P(T/M) = 0.6 \times 0.3 = 0.18$
Example 3.5
A bag contains 2 white and 3 red balls. If two balls are picked one at a time with
replacement, find the probability that,
Solution
Let R = red ball, W = white ball
The sample space is $S = \{RR, RW, WR, WW\}$. Given that $P(R) = 3/5$, $P(W) = 2/5$
(a) Required: $P(RR)$
Example 3.6
A certain machine is operated using three components C1 , C 2 and C 3 . The probabilities
of these three components to perform well are respectively 0.80, 0.96 and 0.91. Suppose
that these components work independently, find the probability that
(a) All three components work properly.
(b) Only two components work properly.
Solution
(a) \[ P(\text{all three components work}) = P(C_1 \cap C_2 \cap C_3) = P(C_1)\,P(C_2)\,P(C_3) = 0.80 \times 0.96 \times 0.91 = 0.69888 \]
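Both parts of Example 3.6 can be checked by enumerating the working/failed states of the three components, as in the sketch below. It relies on the independence assumption stated in the example; the function name is hypothetical.

```python
from itertools import product

reliabilities = [0.80, 0.96, 0.91]  # P(C1), P(C2), P(C3)

def probability_exactly(k):
    """Probability that exactly k of the independent components work."""
    total = 0.0
    for states in product([True, False], repeat=len(reliabilities)):
        if sum(states) == k:
            p = 1.0
            for works, r in zip(states, reliabilities):
                p *= r if works else 1 - r
            total += p
    return total

p_all_three = probability_exactly(3)
p_only_two = probability_exactly(2)
print(round(p_all_three, 5), round(p_only_two, 5))
```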
[Tree diagram: a die roll (1 to 6) followed by a coin toss (H or T), giving the twelve outcomes 1H, 1T, 2H, 2T, 3H, 3T, 4H, 4T, 5H, 5T, 6H, 6T.]
3.6.2 Permutation
Permutation is an ordered arrangement of objects (letters or numbers). These n objects
could be distinct or not distinct.
The number of permutations of n distinct objects taken r at a time, denoted by ${}_nP_r$, is
given by
\[ {}_nP_r = \frac{n!}{(n - r)!} \]
where $n! = n(n-1)(n-2)\cdots 2 \cdot 1$. Note that $0! = 1$ and ${}_nP_n = n!$.
Example 3.7
How many numbers with three distinct digits are possible using the digits 3, 4, 5, 6, 7, 8?
Solution
We need to find ${}_nP_r$ where n = 6 and r = 3. Then,
\[ {}_6P_3 = \frac{6!}{(6 - 3)!} = \frac{6!}{3!} = 120 \]
Five of these numbers are 345, 356, 347, 378, 567.
Example 3.8
In how many ways can 5 people be arranged in a line?
Solution:
There are 5! = 120 ways of arranging five people in a line.
Solution:
The word ESSENTIAL consists of nine letters, of which S repeats 2 times and E repeats 2
times. Thus,
\[ \text{the possible number of ways} = \frac{9!}{2!\,2!} = 90720. \]
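The counts in the permutation examples can be reproduced with Python's `math` module (`math.perm` is available from Python 3.8). This is only a check of the arithmetic; the variable names are illustrative.

```python
import math

# Three-digit numbers with distinct digits from six digits: 6P3
three_digit_numbers = math.perm(6, 3)

# Five people in a line: 5!
line_arrangements = math.factorial(5)

# Arrangements of ESSENTIAL: 9 letters with S and E each repeated twice
essential_arrangements = math.factorial(9) // (math.factorial(2) * math.factorial(2))

print(three_digit_numbers, line_arrangements, essential_arrangements)
```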
3.6.4 Combination
Unlike permutation, in the case of combination, the order is not important.
We can define a combination as a selection of r objects from a group of n objects. It is
denoted by $\binom{n}{r}$ or ${}_nC_r$, where
\[ \binom{n}{r} = \frac{n!}{r!\,(n - r)!} \]
Example 3.10
How many ways are there of choosing a set of three books from a set of eight books?
Solution
Since the order is not important, the answer is $\binom{n}{r} = \binom{8}{3} = 56$ ways.
Example 3.11
The letters of the word VOLUME are arranged in all possible ways. Find the probability
that
(a) The word ends with a vowel.
(b) The word starts with a consonant and ends with a vowel.
Solution
The word volume has six distinct letters and hence there are 6! ways of arranging these
letters, where 6! = 720.
(a) There are three choices for the last letter so that the word ends with a vowel. The first
through the fifth letters are arranged in 5! ways. Thus the number of ways that a word
ends with a vowel is given by $3 \times 5 \times 4 \times 3 \times 2 \times 1 = 360$.
\[ \text{The required probability} = \frac{360}{720} = 0.5 \]
(b) For a word to start with a consonant and end with a vowel, there are three choices for
the first letter and also three choices for the last letter. The middle four letters can be
arranged in 4! ways. The required number of ways in this case is $3 \times 3 \times 4 \times 3 \times 2 \times 1 = 216$.
\[ \text{The required probability} = \frac{216}{720} = 0.3 \]
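The counting argument of Example 3.11 can be verified by brute force, listing all 720 arrangements of the letters and counting those that satisfy each condition. This is a check only, not the method used in the solution.

```python
from itertools import permutations

vowels = set("OUE")
arrangements = list(permutations("VOLUME"))  # all orderings of the six letters

# (a) arrangements ending with a vowel
ends_with_vowel = sum(1 for a in arrangements if a[-1] in vowels)

# (b) arrangements starting with a consonant and ending with a vowel
consonant_then_vowel = sum(
    1 for a in arrangements if a[0] not in vowels and a[-1] in vowels
)

print(ends_with_vowel / len(arrangements), consonant_then_vowel / len(arrangements))
```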
Example 3.12
An Engineering consultant is faced with a problem of surveying five sites at Kinondoni,
seven sites at Ilala and eight sites at Temeke. Due to time constraint he/she decided to
choose only six sites to survey. Of these six sites, find the probability that
(a) 2 sites are from Kinondoni, One site from Ilala and three sites from Temeke.
(b) 2 sites are from Kinondoni.
Solution
(a) We first need to find the total number of ways of selecting six sites out of twenty.
This selection can be done in $\binom{20}{6} = 38{,}760$ ways.
There are $\binom{5}{2}$ ways of selecting two sites out of five from Kinondoni.
Similarly, there are $\binom{7}{1}$ ways for one site at Ilala, and $\binom{8}{3}$ ways of three sites from
Temeke.
Combined number of ways is given by
\[ \binom{5}{2}\binom{7}{1}\binom{8}{3} = 3920 \]
\[ \text{The required probability} = \frac{3920}{38{,}760} = 0.101 \]
(b) Since the only restriction is that two sites are from Kinondoni, the remaining four sites can
be chosen from either Ilala or Temeke.
Hence there are $\binom{5}{2}$ ways for two sites from Kinondoni and $\binom{15}{4}$ ways of selecting
the remaining four sites.
The combined number of ways is therefore
\[ \binom{5}{2}\binom{15}{4} = 13{,}650. \]
\[ \text{The required probability} = \frac{13{,}650}{38{,}760} = 0.352 \]
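The combinations in Example 3.12 can be evaluated with `math.comb` (Python 3.8 or later), as in this brief sketch with illustrative variable names.

```python
from math import comb

total_ways = comb(20, 6)  # six sites out of 5 + 7 + 8 = 20

# (a) two from Kinondoni, one from Ilala, three from Temeke
ways_a = comb(5, 2) * comb(7, 1) * comb(8, 3)

# (b) two from Kinondoni, the other four from the 15 Ilala and Temeke sites
ways_b = comb(5, 2) * comb(15, 4)

print(round(ways_a / total_ways, 3), round(ways_b / total_ways, 3))
```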
Exercises 3
1. A certain consultant has written two proposals ready to request for funding. The
probability that either of the proposals will be successful is 0.7, and that both
proposals will be successful is 0.2. There is 0.6 chance that the first proposal will not
be successful. Find the probability that the second proposal will be successful.
2. The probability that a female student studies is 0.7. Given that she studies, the
probability is 0.8 that she will pass a course. Given that she does not study, the
probability is 0.3 that she will pass the course. Find the probability that
(a) she will study and pass
(b) she will not study and pass the course
(c) she will pass the course
3. The probability that an executive is promoted to a higher position is 0.625. If he is
promoted, he will go on vacation with a probability of 0.83; however, if he is not
promoted, there is a probability of 0.33 that he will take a vacation.
(a) find the probability that he will go on a vacation
(b) Given that he has gone on a vacation, find the probability that he had been
promoted.
4. A box contains five identical items of which three are defective. Suppose that three
items are selected at random.
(a) Obtain the sample space for the experiment.
(b) Find the probability that exactly two items are not defective.
7. A four-digit number is written using the digits 2, 3, 4, 5, 7 and 8. Find the probability
that a number formed is an odd number.
CHAPTER FOUR
PROBABILITY DISTRIBUTIONS
4.1 Introduction
In this chapter we will discuss some probability distributions of random variables. There
are two types of probability distributions, depending on the type of random variable: these
are discrete and continuous probability distributions.
Example 4.1
Find the formula for the probability distribution of a total number of heads obtained by
tossing a fair coin three times.
Solution
The following tree diagram is used to obtain the sample space S
[Tree diagram: three successive tosses, each branching into H and T, giving the sample space S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.]
X 0 1 2 3
P(X) 1/8 3/8 3/8 1/8
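The table can be reproduced by enumerating the eight outcomes, as sketched below. The closed form $f(x) = \binom{3}{x}/8$ used for comparison is the standard formula for this experiment and is stated here as an assumption, since the formula itself does not appear in the text above.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb

# All outcomes of three tosses, each a tuple such as ('H', 'T', 'H')
outcomes = list(product("HT", repeat=3))

# Number of outcomes for each possible count of heads
counts = Counter(outcome.count("H") for outcome in outcomes)
distribution = {x: Fraction(counts[x], len(outcomes)) for x in range(4)}

# Closed form f(x) = C(3, x) / 8 for x = 0, 1, 2, 3
formula = {x: Fraction(comb(3, x), 8) for x in range(4)}

print(distribution)
```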
Theorem 4.2 The values $F(x)$ of the distribution function of a discrete random variable X
satisfy the following conditions
1. $F(-\infty) = 0$ and $F(\infty) = 1$
2. If $a < b$, then $F(a) \le F(b)$ for any real numbers a and b.
Example 4.2
Find the distribution function from the probability distribution obtained in Example 4.1.
Solution
From the above example we have $f(0) = 1/8$, $f(1) = 3/8$, $f(2) = 3/8$ and $f(3) = 1/8$;
then by definition of the distribution function, we get
$F(0) = f(0) = 1/8$
$F(1) = f(0) + f(1) = 1/8 + 3/8 = 4/8$
$F(2) = f(0) + f(1) + f(2) = 1/8 + 3/8 + 3/8 = 7/8$
$F(3) = f(0) + f(1) + f(2) + f(3) = 1/8 + 3/8 + 3/8 + 1/8 = 1$
Definition 4.4 A function with values $f(x)$ defined over the set of all real numbers is
called a p.d.f. of the continuous random variable X if and only if
\[ P(a \le X \le b) = \int_a^b f(x)\,dx \]
Example 4.3
If X has the probability density given by
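\[ f(x) = \begin{cases} k e^{-3x} & \text{for } x > 0 \\ 0 & \text{elsewhere} \end{cases} \]
Find (i) the value of k (ii) $P(0.5 \le X \le 1)$
Solution:
(i) Since $f(x)$ is a probability density, then
\[ \int_0^{\infty} k e^{-3x}\,dx = 1 \;\Rightarrow\; \left[-\frac{k}{3} e^{-3x}\right]_0^{\infty} = 1 \;\Rightarrow\; -\frac{k}{3}(0 - 1) = 1 \;\Rightarrow\; k = 3 \]
(ii) \[ P(0.5 \le X \le 1) = \int_{0.5}^{1} 3 e^{-3x}\,dx = \left[-e^{-3x}\right]_{0.5}^{1} = e^{-1.5} - e^{-3} \approx 0.173 \]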
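The results of Example 4.3 can be checked numerically with a simple midpoint rule, as sketched below. The helper name, the step count and the truncation of the infinite range at x = 20 are choices made for this illustration only.

```python
import math

def density(x, k=3.0):
    """p.d.f. of Example 4.3 with k = 3."""
    return k * math.exp(-3 * x) if x > 0 else 0.0

def midpoint_integral(func, a, b, steps=20_000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(func(a + (i + 0.5) * width) for i in range(steps)) * width

# The density should integrate to 1; the tail beyond x = 20 is negligible
total_area = midpoint_integral(density, 0, 20)

# P(0.5 <= X <= 1), compared with the closed form e^{-1.5} - e^{-3}
prob = midpoint_integral(density, 0.5, 1)
exact = math.exp(-1.5) - math.exp(-3)
print(round(total_area, 4), round(prob, 4), round(exact, 4))
```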
Definition 4.4 If X is a continuous random variable and the value of its probability
density is $f(t)$, then the distribution function of X is given by
\[ F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt, \quad -\infty < x < \infty \]
Theorem 4.4 If $f(x)$ and $F(x)$ are the respective values of the probability density and
the distribution function of X at x, then
\[ P(a \le X \le b) = F(b) - F(a) \]
Example 4.4
Find the distribution function of X whose p.d.f. is given by
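\[ f(x) = \begin{cases} 3 e^{-3x} & \text{for } x > 0 \\ 0 & \text{elsewhere} \end{cases} \]
Solution
Using the definition, for $x > 0$ we have
\[ F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt = \int_0^{x} 3 e^{-3t}\,dt = \left[-e^{-3t}\right]_0^{x} = 1 - e^{-3x} \]
and $F(x) = 0$ for $x \le 0$.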
where
\[ E(X^2) = \sum_{\text{all } x} x^2\,p(x) \]
Solution
Let X be a daily income of the firm. We summarize the required sums in the following
table
X p(x) x p(x) x2p(x)
1.2 0.30 0.36 0.432
3.3 0.15 0.495 1.6335
1.8 0.20 0.36 0.648
0.9 0.15 0.135 0.1215
2.8 0.20 0.56 1.568
SUM 1.91 4.403
Then,
\[ E(X) = \sum x\,p(x) = 1.91 \]
\[ \text{Var}(X) = E(X^2) - [E(X)]^2 = 4.403 - 1.91^2 = 0.7549 \]
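The sums in the table and the resulting variance can be reproduced with the short sketch below, using the values and probabilities listed in the table (variable names are illustrative).

```python
# Values of X and their probabilities, as listed in the table
values = [1.2, 3.3, 1.8, 0.9, 2.8]
probabilities = [0.30, 0.15, 0.20, 0.15, 0.20]

# E(X) and E(X^2) as probability-weighted sums
expected_x = sum(x * p for x, p in zip(values, probabilities))
expected_x2 = sum(x ** 2 * p for x, p in zip(values, probabilities))

# Var(X) = E(X^2) - [E(X)]^2
variance = expected_x2 - expected_x ** 2
print(round(expected_x, 2), round(expected_x2, 3), round(variance, 4))
```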
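Definition 4.7 If X is a continuous random variable, then its expected value is given by
\[ E(X) = \int_{-\infty}^{\infty} x f(x)\,dx \]
and its variance by $\text{Var}(X) = E(X^2) - [E(X)]^2$, where
\[ E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx \]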
Example 4.6
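A continuous random variable X has the probability density given by
\[ f(x) = \begin{cases} 3 e^{-3x} & \text{for } x > 0 \\ 0 & \text{elsewhere} \end{cases} \]
Find the expected value and the standard deviation of X.
Solution
Given $f(x) = 3e^{-3x}$ for $0 < x < \infty$
Then by definition,
\[ E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^{\infty} 3x e^{-3x}\,dx = \left[-x e^{-3x}\right]_0^{\infty} + \int_0^{\infty} e^{-3x}\,dx = \frac{1}{3} \]
Also,
\[ E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \int_0^{\infty} 3x^2 e^{-3x}\,dx = \frac{2}{9} \]
so that
\[ \text{Var}(X) = \frac{2}{9} - \frac{1}{9} = \frac{1}{9} \]
and the standard deviation is $\sqrt{1/9} = 1/3$.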
Theorem 4.5 Two discrete random variables (say X and Y) are said to be jointly
distributed if the following conditions are satisfied
1. $f(x, y) \ge 0$
2. $\sum_x \sum_y f(x, y) = 1$
Example 4.7
Find the value of k if the following is a joint probability distribution
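\[ f(x, y) = \begin{cases} kxy & \text{for } x = 1, 2, 3;\; y = 2, 3 \\ 0 & \text{otherwise} \end{cases} \]
Solution
Given that the function is a joint probability distribution, we have $\sum_x \sum_y f(x, y) = 1$.
It implies that,
\[ f(1,2) + f(1,3) + f(2,2) + f(2,3) + f(3,2) + f(3,3) = 1 \]
\[ 2k + 3k + 4k + 6k + 6k + 9k = 1 \]
So that $30k = 1$ or $k = \dfrac{1}{30}$.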
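The normalising constant of Example 4.7 can be found by summing xy over the support and inverting, as in this sketch with exact fractions (variable names are illustrative).

```python
from fractions import Fraction

x_values = [1, 2, 3]
y_values = [2, 3]

# The probabilities k*x*y must add up to 1, so k = 1 / (sum of x*y)
total_xy = sum(x * y for x in x_values for y in y_values)
k = Fraction(1, total_xy)

# Resulting joint distribution f(x, y) = k*x*y
joint = {(x, y): k * x * y for x in x_values for y in y_values}
print(k, sum(joint.values()))
```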
Example 4.8
Forty defective items produced by a certain engineering firm in 2010 were recorded
according to the department they belong to. They were further categorized as high,
average and low. Their joint probabilities are given in the table below
Departments (number of items in parentheses)
Defective items      Civil (C) (12)   Electrical (E) (14)   Plumbing (PL) (14)
High (H) (14)        0.10             0.10                  0.15
Average (A) (16)     0.10             0.20                  0.10
Low (F) (10)         0.10             0.05                  0.10
Solution
Using probabilities from the table, we get the following
(a) P(Average from Electrical) = P(A, E) = 0.20
(b) P(Low defective only) = P(F) = 0.10 + 0.05 + 0.10 = 0.25
(c) P(Defective from Plumbing) = P(PL) = 0.15 + 0.10 + 0.10 = 0.35
Definition 4.10 If X and Y are discrete random variables and $f(x, y)$ is the value of
their joint probability distribution at $(x, y)$, then the function given by
\[ g(x) = \sum_y f(x, y) \]
for each x within the range of X is called the marginal distribution of X. Similarly, the
function
\[ h(y) = \sum_x f(x, y) \]
for each y within the range of Y is called the marginal distribution of Y.
Example 4.9
Use the joint distribution of Example 4.8 to obtain a marginal distribution of
department.
Solution
By summing all probabilities over the defective items categories we get
P(Civil) = 0.10 + 0.10 + 0.10 = 0.30
P(Electrical) = 0.10 + 0.20 + 0.05 = 0.35
P(Plumbing) = 0.15 + 0.10 + 0.10 = 0.35
And the resulting marginal distribution is given by
Department Civil Electrical Plumbing
Probability 0.30 0.35 0.35
Definition 4.11 If $f(x, y)$ is the value of the joint probability distribution of the
discrete random variables X and Y at $(x, y)$, and $h(y)$ is the marginal distribution of Y
at $y$, the function
\[ f(x \mid y) = \frac{f(x, y)}{h(y)}, \quad h(y) \ne 0 \]
for each x within the range of X, is called the conditional distribution of X given $Y = y$.
Correspondingly, the function
\[ f(y \mid x) = \frac{f(x, y)}{g(x)}, \quad g(x) \ne 0 \]
for each y within the range of Y, is called the conditional distribution of Y given $X = x$.
Example 4.10
Use the table in Example 4.8 to obtain the conditional distribution of defective items
given that they come from the electrical department.
Solution
We know from Example 4.9 that $P(E) = 0.35$. Now we need to find $P(H/E)$, $P(A/E)$
and $P(L/E)$ in order to obtain a complete distribution.
But,
\[ P(H/E) = \frac{P(H, E)}{P(E)} = \frac{0.10}{0.35} = 0.286 \]
\[ P(A/E) = \frac{P(A, E)}{P(E)} = \frac{0.20}{0.35} = 0.571 \]
\[ P(L/E) = \frac{P(L, E)}{P(E)} = \frac{0.05}{0.35} = 0.143 \]
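The marginal and conditional distributions of the last two examples can be computed directly from the joint table, as in the sketch below. The dictionary layout and the key labels ("H", "A", "L" for the defect levels and "C", "E", "PL" for the departments) are choices made for this illustration.

```python
# Joint probabilities keyed by (defect level, department)
joint = {
    ("H", "C"): 0.10, ("H", "E"): 0.10, ("H", "PL"): 0.15,
    ("A", "C"): 0.10, ("A", "E"): 0.20, ("A", "PL"): 0.10,
    ("L", "C"): 0.10, ("L", "E"): 0.05, ("L", "PL"): 0.10,
}
levels = ["H", "A", "L"]
departments = ["C", "E", "PL"]

# Marginal distribution of department: sum over the defect levels
marginal_department = {
    d: sum(joint[(level, d)] for level in levels) for d in departments
}

# Conditional distribution of defect level given the electrical department
conditional_given_E = {
    level: joint[(level, "E")] / marginal_department["E"] for level in levels
}

print({d: round(p, 2) for d, p in marginal_department.items()})
print({lv: round(p, 3) for lv, p in conditional_given_E.items()})
```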
Once the distributions are obtained, mathematical expectations and standard deviations
are computed in the same way as in univariate cases.
Theorem 4.6 A bivariate function can serve as a joint probability density function of a
pair of continuous random variables X and Y if its values, f x, y , satisfy the conditions
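1. $f(x, y) \ge 0$ for $-\infty < x < \infty$, $-\infty < y < \infty$;
2. $\displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$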
Since the probabilities are values of functions of several variables, answers to various
questions need techniques of multiple integrals.
Example 4.11
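Given a function of two random variables X and Y by
\[ f(x, y) = \begin{cases} \dfrac{3}{5}\,x(y + x) & \text{for } 0 < x < 1,\; 0 < y < 2 \\ 0 & \text{elsewhere} \end{cases} \]
(a) Show that the given function is a joint probability density
(b) Find $P\left(0 < x < \frac{1}{2},\; 1 < y < 2\right)$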
Solution
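(a) We see that $f(x, y) \ge 0$ for each point $(x, y)$. We now need to show that
\[ \int_0^1\int_0^2 \frac{3}{5}\,x(y + x)\,dy\,dx = 1 \]
But,
\[ \int_0^1\int_0^2 \frac{3}{5}\,x(y + x)\,dy\,dx = \int_0^1 \left[\frac{3xy^2}{10} + \frac{3x^2 y}{5}\right]_0^2 dx = \int_0^1 \left(\frac{6x}{5} + \frac{6x^2}{5}\right) dx = \left[\frac{3x^2}{5} + \frac{2x^3}{5}\right]_0^1 = \frac{3}{5} + \frac{2}{5} = 1 \]
(b)
\[ P\left(0 < x < \tfrac{1}{2},\; 1 < y < 2\right) = \int_0^{0.5}\int_1^2 \frac{3}{5}\,x(y + x)\,dy\,dx = \int_0^{0.5} \left[\frac{3xy^2}{10} + \frac{3x^2 y}{5}\right]_1^2 dx \]
\[ = \int_0^{0.5} \left(\frac{12}{10}x + \frac{6}{5}x^2 - \frac{3}{10}x - \frac{3}{5}x^2\right) dx = \int_0^{0.5} \left(\frac{9}{10}x + \frac{3}{5}x^2\right) dx \]
\[ = \left[\frac{9}{20}x^2 + \frac{1}{5}x^3\right]_0^{0.5} = \frac{9}{20}\cdot\frac{1}{4} + \frac{1}{5}\cdot\frac{1}{8} = \frac{11}{80} \]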
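Both double integrals of Example 4.11 can be checked with a simple two-dimensional midpoint rule, as sketched below. The helper name and the grid size are choices made for this illustration; the result is an approximation, not an exact evaluation.

```python
def joint_density(x, y):
    """Joint density of Example 4.11."""
    if 0 < x < 1 and 0 < y < 2:
        return 3 / 5 * x * (y + x)
    return 0.0

def double_midpoint(func, x_range, y_range, steps=400):
    """Approximate a double integral over a rectangle with the midpoint rule."""
    (x0, x1), (y0, y1) = x_range, y_range
    dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
    total = 0.0
    for i in range(steps):
        x = x0 + (i + 0.5) * dx
        for j in range(steps):
            total += func(x, y0 + (j + 0.5) * dy)
    return total * dx * dy

total_probability = double_midpoint(joint_density, (0, 1), (0, 2))
part_b = double_midpoint(joint_density, (0, 0.5), (1, 2))
print(round(total_probability, 4), round(part_b, 4))
```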
where f s, t is the value of the joint probability density of X and Y at s, t is called
the joint distribution function of X and Y.
In order to get the joint density from the joint distribution function, we apply the mixed
partial derivative, such that
\[ f(x, y) = \frac{\partial^2 F(x, y)}{\partial x\,\partial y} \]
Example 4.12
The joint density of X and Y is given by
\[ f(x, y) = \begin{cases} x + y & \text{for } 0 < x < 1,\; 0 < y < 1 \\ 0 & \text{elsewhere} \end{cases} \]
Find the joint distribution function of these random variables.
Solution
In order to obtain the joint distribution function, we need to consider different cases as
shown in the following diagram
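[Diagram: the x–y plane divided into Region I (the unit square 0 < x < 1, 0 < y < 1), Region II to its right, Region III above it and Region IV diagonally opposite, with the value 1 marked on the x-axis.]
1. When $x < 0$ or $y < 0$,
In this case, it follows immediately that $F(x, y) = 0$
2. When $0 < x < 1$, $0 < y < 1$ (Region I). In this case we get
\[ F(x, y) = \int_0^{y}\int_0^{x} (s + t)\,ds\,dt = \frac{1}{2}\,xy(x + y) \]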
When $x > 1$, $0 < y < 1$ (Region II), we get
\[ F(x, y) = \int_0^{y}\int_0^{1} (s + t)\,ds\,dt = \frac{1}{2}\,y(y + 1) \]
Example 4.13
Given the joint distribution function by
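\[ F(x, y) = \begin{cases} \left(1 - e^{-x}\right)\left(1 - e^{-y}\right) & \text{for } x > 0,\; y > 0 \\ 0 & \text{elsewhere} \end{cases} \]
Find the joint probability density of the two random variables X and Y and hence find
$P(1 < X < 3,\; 1 < Y < 2)$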
Solution
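Using partial differentiation we get
\[ f(x, y) = \frac{\partial^2 F(x, y)}{\partial x\,\partial y} = \frac{\partial}{\partial y}\left[\frac{\partial}{\partial x}\left(1 - e^{-y} - e^{-x} + e^{-x}e^{-y}\right)\right] = \frac{\partial}{\partial y}\left(e^{-x} - e^{-x}e^{-y}\right) = e^{-x}e^{-y} = e^{-(x + y)} \]
Therefore, the joint density is given by
\[ f(x, y) = \begin{cases} e^{-(x + y)} & \text{for } x > 0,\; y > 0 \\ 0 & \text{elsewhere} \end{cases} \]
Hence,
\[ P(1 < X < 3,\; 1 < Y < 2) = \int_1^2\int_1^3 e^{-(x + y)}\,dx\,dy = \int_1^2 \left[-e^{-(x + y)}\right]_1^3 dy = \int_1^2 \left(-e^{-(y + 3)} + e^{-(y + 1)}\right) dy \]
\[ = \left[e^{-(y + 3)} - e^{-(y + 1)}\right]_1^2 = e^{-5} - e^{-3} - e^{-4} + e^{-2} = 0.074 \]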
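The same probability can be obtained directly from the joint distribution function, since the probability of a rectangle equals F(3, 2) - F(1, 2) - F(3, 1) + F(1, 1). The sketch below uses this identity as a cross-check on the integration; the function name is illustrative.

```python
import math

def joint_cdf(x, y):
    """Joint distribution function of Example 4.13."""
    if x > 0 and y > 0:
        return (1 - math.exp(-x)) * (1 - math.exp(-y))
    return 0.0

# Probability of the rectangle 1 < X < 3, 1 < Y < 2 from the distribution function
rectangle_probability = (
    joint_cdf(3, 2) - joint_cdf(1, 2) - joint_cdf(3, 1) + joint_cdf(1, 1)
)

# Closed form obtained by integrating the joint density
closed_form = math.exp(-5) - math.exp(-3) - math.exp(-4) + math.exp(-2)
print(round(rectangle_probability, 3), round(closed_form, 3))
```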
Example 4.14
The joint density of X and Y is given by
\[ f(x, y) = \begin{cases} x + y & \text{for } 0 < x < 1,\; 0 < y < 1 \\ 0 & \text{elsewhere} \end{cases} \]
Solution
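The marginal density of X is given by
\[ g(x) = \int_0^1 (x + y)\,dy = \left[xy + \frac{1}{2}y^2\right]_0^1 = \frac{1}{2}(2x + 1) \]
Written as
\[ g(x) = \begin{cases} \frac{1}{2}(2x + 1) & \text{for } 0 < x < 1 \\ 0 & \text{elsewhere} \end{cases} \]
Similarly, the marginal density of Y is given by
\[ h(y) = \int_0^1 (x + y)\,dx = \left[\frac{1}{2}x^2 + xy\right]_0^1 = \frac{1}{2}(2y + 1) \]
And is written as
\[ h(y) = \begin{cases} \frac{1}{2}(2y + 1) & \text{for } 0 < y < 1 \\ 0 & \text{elsewhere} \end{cases} \]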
Example 4.15
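With reference to Example 4.14, find the conditional density of X given $Y = y$ and use it to evaluate
\[ P\left(X \le \tfrac{1}{2} \,/\, Y = \tfrac{1}{4}\right) \]
Solution
From the definition, the conditional density of X is given by
\[ f(x/y) = \frac{f(x, y)}{h(y)} = \frac{x + y}{\frac{1}{2}(2y + 1)} \]
Written as
\[ f(x/y) = \begin{cases} \dfrac{x + y}{\frac{1}{2}(2y + 1)} & \text{for } 0 < x < 1 \\ 0 & \text{elsewhere} \end{cases} \]
Before we evaluate $P\left(X \le \frac{1}{2} \,/\, Y = \frac{1}{4}\right)$, we first find $f\left(x \,/\, Y = \frac{1}{4}\right)$.
But,
\[ f\left(x / \tfrac{1}{4}\right) = \frac{x + \frac{1}{4}}{\frac{1}{2}\left(2 \cdot \frac{1}{4} + 1\right)} = \frac{1}{3}(4x + 1) \]
It follows that
\[ P\left(X \le \tfrac{1}{2} \,/\, Y = \tfrac{1}{4}\right) = \int_0^{1/2} \frac{1}{3}(4x + 1)\,dx = \frac{1}{3}\left[2x^2 + x\right]_0^{1/2} = \frac{1}{3}\left(\frac{1}{2} + \frac{1}{2}\right) = \frac{1}{3} \]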
Expected values and standard deviations are computed in the same way as in the single-variable case.
inversely, they give a negative covariance. We can precisely define the covariance as
follows;
Definition 4.16 If X and Y are two random variables, their covariance, denoted by $\mathrm{COV}(X, Y)$ or $\sigma_{X,Y}$, is given by
\[ \mathrm{COV}(X, Y) = E\big[(X - E(X))(Y - E(Y))\big] = E(XY) - E(X)E(Y) \]
where $E(XY)$ is the joint expectation, and $E(X)$ and $E(Y)$ are the marginal expectations of X and Y respectively.
For discrete random variables we define $E(XY)$ as
\[ E(XY) = \sum_x \sum_y xy\, f(x, y) \]
Example 4.16
Two machines A and B are expected to produce identical items, which are categorized as
high or standard quality. Their joint probability distribution is given in the following
table
Find the covariance between the number of items produced by the machines and their
qualities.
Solution
The joint distribution enables us to directly compute the joint expectation such that
\[ E(XY) = \sum_x \sum_y xy\, f(x, y) = (8)(10)\frac{6}{15} + (7)(10)\frac{4}{15} + (8)(5)\frac{2}{15} + (7)(5)\frac{3}{15} = \frac{189}{3} = 63 \]
In order to find the marginal expectations we first have to obtain the marginal
distribution for each variable.
The marginal distribution of number of items produced by the machines (X) is given by
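\[ E(X) = \sum_x x f(x) = 8\left(\frac{8}{15}\right) + 7\left(\frac{7}{15}\right) = \frac{113}{15} \]
\[ E(Y) = \sum_y y f(y) = 10\left(\frac{10}{15}\right) + 5\left(\frac{5}{15}\right) = \frac{25}{3} \]
Therefore, the covariance is given by
\[ \sigma_{X,Y} = \frac{189}{3} - \left(\frac{113}{15}\right)\left(\frac{25}{3}\right) = \frac{567}{9} - \frac{565}{9} = \frac{2}{9} \]
Comment: The positive covariance indicates a direct relationship between the machines
and the quality of products.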
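The covariance calculation above can be checked with exact fractions. This is a minimal sketch: the joint probabilities are the ones read off the E(XY) computation in the solution, and the helper name `covariance` is an illustrative assumption.

```python
from fractions import Fraction

# Joint probabilities f(x, y), read off the E(XY) computation in the solution:
# x is the machine value (8 or 7), y is the quality value (10 or 5).
joint = {
    (8, 10): Fraction(6, 15),
    (7, 10): Fraction(4, 15),
    (8, 5): Fraction(2, 15),
    (7, 5): Fraction(3, 15),
}

def covariance(pmf):
    """COV(X, Y) = E(XY) - E(X)E(Y) for a discrete joint distribution."""
    e_xy = sum(x * y * p for (x, y), p in pmf.items())
    e_x = sum(x * p for (x, _), p in pmf.items())
    e_y = sum(y * p for (_, y), p in pmf.items())
    return e_xy - e_x * e_y

cov = covariance(joint)
print(cov)
```

Using `Fraction` keeps every intermediate value exact, so the result can be compared directly with the hand calculation.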
4.5.2 Independence
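Definition 4.17 Two jointly distributed random variables X and Y are said to be independent if their joint density is the product of the marginal densities, $f(x, y) = g(x)\,h(y)$ for all $(x, y)$. In that case
\[ E(XY) = E(X)E(Y) \]
It follows immediately that if two jointly distributed random variables
are independent, their covariance is zero.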
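Definition 4.18 If $X_1, X_2, \ldots, X_n$ are random variables and $a_1, a_2, \ldots, a_n$ are constants,
the sum
\[ Y = \sum_{i=1}^{n} a_i X_i \]
is a linear combination of the $X_i$, with
\[ E(Y) = \sum_{i=1}^{n} a_i E(X_i) \quad \text{and} \quad \mathrm{Var}(Y) = \sum_{i=1}^{n} a_i^2\, \mathrm{Var}(X_i) + 2 \sum_{i<j} a_i a_j\, \mathrm{COV}(X_i, X_j) \]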
Example 4.17
Given the random variables X, Y, and Z with means x 2 , y 5 , z 2 , variances
Exercises 4
(a) Beyond eight years (b) between six and eight years.
CHAPTER FIVE
SPECIAL PROBABILITY DISTRIBUTIONS
5.1 Introduction
We have already discussed probability distributions in the general case. In this chapter we
shall concentrate on special cases of those distributions that are commonly used in everyday
applications. We shall start our discussion with discrete probability distributions
before moving on to continuous cases.
Definition 5.1 A discrete random variable X is said to have a binomial distribution with
parameters n and p if its probability distribution is given by
\[ P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}, \qquad x = 0, 1, 2, \ldots, n \]
The expected value of this random variable is given by $E(X) = np$, and the standard
deviation is given by $SD(X) = \sqrt{np(1 - p)}$
Example 5.1
Produced items are to be inspected and checked whether they are good or defective. The
probability that an item chosen is defective is 0.003. Find the probability that out of 6
items inspected
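(a) Exactly one is defective
(b) At least one is defective
(c) No defective item is found
Solution
This is a binomial situation with $p = 0.003$ and $n = 6$. Note that a success here is
an inspected item being defective.
(a) Required $P(X = 1)$, then
\[ P(X = 1) = \binom{6}{1} (0.003)^1 (0.997)^5 = 0.01773 \]
(b) Required $P(X \ge 1)$, then
\[ P(X \ge 1) = 1 - P(X < 1) = 1 - P(X = 0) = 1 - \binom{6}{0} (0.003)^0 (0.997)^6 = 1 - (0.997)^6 = 0.01787 \]
(c) Required $P(X = 0)$, then
\[ P(X = 0) = \binom{6}{0} (0.003)^0 (0.997)^6 = 0.98213 \]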
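The binomial probabilities in Example 5.1 can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function name `binomial_pmf` is an illustrative choice.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 6, 0.003
p_exactly_one = binomial_pmf(1, n, p)    # part (a)
p_none = binomial_pmf(0, n, p)           # part (c)
p_at_least_one = 1 - p_none              # part (b), via the complement

print(round(p_exactly_one, 5), round(p_at_least_one, 5), round(p_none, 5))
```

Computing "at least one" through the complement avoids summing six separate terms.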
Example 5.2
The number of customers attended at NBC MLIMANI follows a Poisson distribution
with mean 8 per hour. Find the probability that in any given hour
(a) Exactly 6 customers will be attended
(b) No customer will be attended
(c) At least 2 customers will be attended
Solution
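The worked solution to Example 5.2 is not shown here. As a hedged sketch, assuming the usual Poisson probability function $P(X = x) = e^{-\lambda}\lambda^x / x!$ with $\lambda = 8$, the three probabilities could be computed as follows (the name `poisson_pmf` is illustrative).

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

lam = 8  # mean number of customers per hour, from the example

p_six = poisson_pmf(6, lam)                                      # (a) exactly 6
p_zero = poisson_pmf(0, lam)                                     # (b) none
p_at_least_two = 1 - poisson_pmf(0, lam) - poisson_pmf(1, lam)   # (c) at least 2

print(f"{p_six:.4f} {p_zero:.6f} {p_at_least_two:.4f}")
```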
Example 5.3
The probability of getting a defective tire is estimated at 5%. Find the probability that out
of 50 tires produced,
(a) Exactly one will be defective
(b) At least one will be defective
Solution
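Let X be the number of defective tires produced.
This is a binomial situation with $n = 50$ and $p = 5\% = 0.05$
Using the Poisson approximation to the binomial, we have $\lambda = np = 50 \times 0.05 = 2.5$
(a) Required $P(X = 1)$, then
\[ P(X = 1) = \frac{e^{-2.5} (2.5)^1}{1!} = 0.205212 \]
(b) Required $P(X \ge 1)$, but
\[ P(X \ge 1) = 1 - P(X < 1) = 1 - P(X = 0) = 1 - \frac{e^{-2.5} (2.5)^0}{0!} = 0.917915 \]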
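To see how close the Poisson approximation is in Example 5.3, the exact binomial values can be computed alongside it. The sketch below is only a comparison aid and uses illustrative helper names.

```python
from math import comb, exp, factorial

n, p = 50, 0.05
lam = n * p  # Poisson parameter used in the approximation

def binomial_pmf(x: int) -> float:
    """Exact binomial probability P(X = x) with the n and p above."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x: int) -> float:
    """Poisson approximation to P(X = x) with mean lam = n * p."""
    return exp(-lam) * lam**x / factorial(x)

exact_one, approx_one = binomial_pmf(1), poisson_pmf(1)
exact_at_least_one = 1 - binomial_pmf(0)
approx_at_least_one = 1 - poisson_pmf(0)

print(f"P(X=1):  exact {exact_one:.4f}, Poisson {approx_one:.4f}")
print(f"P(X>=1): exact {exact_at_least_one:.4f}, Poisson {approx_at_least_one:.4f}")
```

The two sets of values differ only in the third decimal place, which illustrates why the approximation is acceptable for large n and small p.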
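Theorem 5.1 The mean and variance of a random variable uniformly distributed on the interval $(\alpha, \beta)$ are
given by
\[ E(X) = \frac{\alpha + \beta}{2} \quad \text{and} \quad \mathrm{Var}(X) = \frac{1}{12}(\beta - \alpha)^2 \]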
Example 5.4
Solution
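Given $\alpha = 2$, $\beta = 5$, then,
(a) $P(X > 3) = P(3 < X < 5) = \dfrac{5 - 3}{5 - 2} = \dfrac{2}{3}$.
(b) $P(X = 4) = 0$, because it is the probability of a single point.
(c) $P(1 < X < 10) = P(1 < X < 2) + P(2 < X < 5) + P(5 < X < 10) = 0 + \dfrac{5 - 2}{5 - 2} + 0 = 1$.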
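Definition 5.4 A continuous random variable X has a normal distribution with parameters
$\mu$ and $\sigma$ if its probability density is given by
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}, \qquad -\infty < x < \infty \]
If X has a normal distribution, then $E(X) = \mu$ and $\mathrm{Var}(X) = \sigma^2$
The common notation for a normal random variable is $X \sim N(\mu, \sigma^2)$, meaning that the
random variable X is normally distributed with mean $\mu$ and variance $\sigma^2$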
[Figure: the normal density curve, with f(x) on the vertical axis and X on the horizontal axis.]
The normal density is complicated to handle directly, but the shape of the normal curve is
simpler to work with and is also of interest in its own right.
In order to answer probability questions concerning a normal random variable, X,
we use a standardized distribution with a standard normal variable Z. The values of the
standard normal variable are obtained from standard normal tables, which are designed in
different forms.
For instance, the question $P(X < x)$ becomes $P\left(\dfrac{X - \mu}{\sigma} < \dfrac{x - \mu}{\sigma}\right)$ or $P\left(Z < \dfrac{x - \mu}{\sigma}\right)$
The following are examples of probability questions based on the aforementioned form
Example 5.5
Use the standard normal table to evaluate the following probabilities
a P0 Z 1.25 b PZ 2.21 c P 3.01 Z 0.5 d P1.0 Z 1.4
Solution
(a) $P(0 < Z < 1.25) = 0.3944$
(c) $P(-3.01 < Z < 0.5) = P(0 < Z < 3.01) + P(0 < Z < 0.5) = 0.4987 + 0.1915 = 0.6902$
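Table look-ups such as these can be cross-checked numerically. The sketch below assumes the standard identity $\Phi(z) = \tfrac{1}{2}\left(1 + \mathrm{erf}(z/\sqrt{2})\right)$ for the standard normal distribution function; the name `phi` is an illustrative choice.

```python
from math import erf, sqrt

def phi(z: float) -> float:
    """Standard normal distribution function P(Z <= z), via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

p_a = phi(1.25) - phi(0)       # P(0 < Z < 1.25)
p_c = phi(0.5) - phi(-3.01)    # P(-3.01 < Z < 0.5)

print(round(p_a, 4), round(p_c, 4))
```

Any interval probability for Z is a difference of two values of `phi`, which replaces the separate "area from 0 to z" look-ups used with the printed table.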
Example 5.6
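A random variable X has a normal distribution with mean 12 and variance 16. Find the following
probabilities
(a) $P(X > 10)$ (b) $P(X < 10)$ (c) $P(8 < X < 14)$ (d) $P(X > 15)$
Solution
Given $\mu = 12$, $\sigma^2 = 16 \Rightarrow \sigma = 4$
(a) \[ P(X > 10) = P\left(\frac{X - \mu}{\sigma} > \frac{10 - 12}{4}\right) = P(Z > -0.5) = 0.5 + P(0 < Z < 0.5) = 0.5 + 0.1915 = 0.6915 \]
(b) \[ P(X < 10) = P\left(\frac{X - \mu}{\sigma} < \frac{10 - 12}{4}\right) = P(Z < -0.5) = P(Z > 0.5) = 0.5 - P(0 < Z < 0.5) = 0.5 - 0.1915 = 0.3085 \]
(c) \[ P(8 < X < 14) = P\left(\frac{8 - 12}{4} < \frac{X - 12}{4} < \frac{14 - 12}{4}\right) = P(-1 < Z < 0.5) = P(0 < Z < 1.0) + P(0 < Z < 0.5) = 0.3413 + 0.1915 = 0.5328 \]
(d) \[ P(X > 15) = P\left(\frac{X - 12}{4} > \frac{15 - 12}{4}\right) = P(Z > 0.75) = 0.5 - P(0 < Z < 0.75) = 0.5 - 0.2734 = 0.2266 \]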
Example 5.7
The incomes, in thousands of dollars, at a given company are normally distributed with
mean 20 and standard deviation 5. Find the probability that a selected income
will be
(a) More than twenty four thousand dollars.
(b) Anywhere between eighteen and twenty five thousand dollars.
Solution
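Let X be a random variable representing an income at the given company
In this case we have $\mu = 20$ and $\sigma = 5$
(a) Required $P(X > 24)$, then
\[ P(X > 24) = P\left(Z > \frac{24 - 20}{5}\right) = P(Z > 0.8) \]
\[ Z = \frac{X - \mu_X}{\sigma_X} = \frac{X - np}{\sqrt{npq}} \]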
Since the binomial distribution is discrete while the normal distribution is continuous, a
continuity correction is needed. This is done by adding or subtracting 0.5 to the number
of successes, depending on the nature of the inequality. See the following table for
illustrations:
Example 5.8
The probability of defective tire is 8%. Find the probability that out of 400 tires
inspected, at most 20 will be defective.
Solution
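Given $p = 8\% = 0.08$ and $n = 400$, implying that
\[ \mu = np = 0.08 \times 400 = 32, \qquad \sigma = \sqrt{npq} = \sqrt{400 \times 0.08 \times 0.92} = 5.43 \]
Then,
\[ P(X \le 20) = P(X < 20.5) = P\left(Z < \frac{20.5 - \mu}{\sigma}\right) = P\left(Z < \frac{20.5 - 32}{5.43}\right) = P(Z < -2.12) \]
Therefore, $P(Z < -2.12) = 0.5 - P(0 < Z < 2.12) = 0.5 - 0.4830 = 0.0170$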
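The effect of the continuity correction in Example 5.8 can be examined by comparing the normal approximation with the exact binomial sum. This is a sketch for comparison only, using `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, k = 400, 0.08, 20
mu = n * p                        # 32
sigma = sqrt(n * p * (1 - p))     # about 5.43

# Normal approximation with continuity correction: P(X <= 20) ~ P(Y < 20.5)
approx = NormalDist(mu, sigma).cdf(k + 0.5)

# Exact binomial probability, summed term by term, for comparison
exact = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))

print(f"normal approximation: {approx:.4f}, exact binomial: {exact:.4f}")
```

The approximation is computed at 20.5 rather than 20 because the discrete event X ≤ 20 covers the continuous interval up to the midpoint between 20 and 21.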
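formula very tedious. In this case, $\mu = \lambda$ and $\sigma^2 = \lambda$, implying that $\sigma = \sqrt{\lambda}$. Hence, the
standard normal variable, Z, becomes
\[ Z = \frac{X - \mu_X}{\sigma_X} = \frac{X - \lambda}{\sqrt{\lambda}} \]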
Example 5.9
In a certain automobile plant, the average number of work stoppages per day due to equipment
problems in a production process is 12. What is the probability of having fewer than 15
stoppages on any working day?
Solution
Let X be the number of stoppages per day during the production process.
This is a Poisson situation with $\lambda = 12$, and we require $P(X < 15)$.
Using the normal approximation to the Poisson, we have $\mu = 12$, $\sigma = \sqrt{12} = 3.46$
Then,
\[ P(X < 15) = P(X < 14.5) = P\left(Z < \frac{14.5 - 12}{3.46}\right) = P(Z < 0.72) \]
Definition 5.5 Sampling is the technique used to select the individual members of a
population to make a sample.
Definition 5.8 Random sampling is a sampling procedure in which each member of a
population has a known chance of being selected.
The most commonly used sampling procedure is simple random sampling, which is
defined as follows.
Definition 5.9 Simple random sampling is a sampling technique whereby each
individual member of a population has an equal chance of being selected.
We are going to discuss the most applicable sampling distributions. These include the
distribution of the arithmetic mean, the Chi-square distribution, Student's t-distribution and
the F-distribution.
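\[ \mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(X_i), \]
by the properties of the Var and Σ operators.
Let $\mathrm{Var}(X_i) = \sigma^2$ for each i and assume that the $X_i$'s are independent, then
\[ \mathrm{Var}(\bar{X}) = \frac{1}{n^2} \sum_{i=1}^{n} \sigma^2 = \frac{1}{n^2}\left(n\sigma^2\right) = \frac{\sigma^2}{n} \]
It is concluded that if $X \sim N(\mu, \sigma^2)$ then $\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$
The square root of the variance of $\bar{X}$ is called the standard error of $\bar{X}$, given by
\[ SE(\bar{X}) = \sqrt{\mathrm{Var}(\bar{X})} = \frac{\sigma}{\sqrt{n}} \]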
Example 5.10
A random variable X is normally distributed with mean 8 and variance 25. A sample of
36 observations yields x 7.5 . Find
(a) The standard error of this sampling distribution.
(b) The probability that the sample mean is greater than 9.
Solution
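The following information is given
$\mu = 8$, $\sigma^2 = 25$, $n = 36$, $\bar{x} = 7.5$
(a) We need to find $SE(\bar{X})$, but
\[ SE(\bar{X}) = \frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{36}} = \frac{5}{6} = 0.833 \]
(b) We need to find $P(\bar{X} > 9)$, then
\[ P(\bar{X} > 9) = P\left(Z > \frac{\bar{x} - \mu}{SE(\bar{X})}\right) = P\left(Z > \frac{9 - 8}{0.833}\right) = P(Z > 1.20) \]
But,
\[ P(Z > 1.20) = 0.5 - P(0 < Z < 1.20) = 0.5 - 0.3849 = 0.1151 \]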
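The sampling distribution of the mean in Example 5.10 can be expressed directly as a normal distribution whose standard deviation is the standard error. The following is a minimal sketch with illustrative variable names.

```python
from math import sqrt
from statistics import NormalDist

mu, variance, n = 8, 25, 36

# Standard error of the sample mean: sigma / sqrt(n)
standard_error = sqrt(variance) / sqrt(n)

# Sampling distribution of the sample mean: N(mu, sigma^2 / n)
sample_mean_dist = NormalDist(mu, standard_error)
p_greater_than_9 = 1 - sample_mean_dist.cdf(9)

print(round(standard_error, 3), round(p_greater_than_9, 4))
```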
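Definition 5.9 If $s^2$ is the variance of the sample of size n taken from a normal
Given that $Z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ and $C = \dfrac{(n - 1)s^2}{\sigma^2}$, then the formula for T is given by
\[
T = \frac{Z}{\sqrt{C / (n - 1)}}
= \frac{\dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}}{\sqrt{\dfrac{(n - 1)s^2}{(n - 1)\sigma^2}}}
= \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \cdot \frac{\sigma}{s}
= \frac{\bar{X} - \mu}{s / \sqrt{n}}
\]
Therefore,
\[ T = \frac{\bar{X} - \mu}{s / \sqrt{n}} \sim t(n - 1) \]
This distribution is widely used in estimation and hypothesis testing of population means
in cases where the population variances are not known and the sample sizes are small.
In most statistical applications, a sample is considered to be small if its size $n < 30$.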
Exercises 5
1. Show that if X has a binomial distribution with parameters n and p then
(a) $E(X) = np$ (b) $\mathrm{Var}(X) = np(1 - p)$
2. The probability is 0.23 that a car stolen in Dar es Salaam will be recovered. Find the
probability that out of 8 cars stolen
(a) More than three will be recovered.
(b) At least two will be recovered.
3. If X is a discrete random variable having a Poisson distribution, show that its mean and
variance are both equal to $\lambda$, where $\lambda$ is a constant.
4. The probability is 0.002 that a manufactured item from a certain engineering firm is
defective. Find the probability that out of 400 items manufactured by the firm
(a) Exactly 10 will be defective;
(b) At most 2 will be defective.
5. Given that $Z \sim N(0, 1)$, evaluate the following probabilities
a P 2.14 Z 2 b P 1.7 Z 0.8556 c P 3.4 Z 0.65
d PZ 2 e PZ 1.67
6. Given that $X \sim N(8, 9)$, evaluate the following probabilities
a P X 7 b P X 10 c P6 X 12
7. A random variable has a normal distribution with $\sigma = 10$. If the probability that the
random variable will take on a value less than 82.5 is 0.8212, what is the probability
that it will take on a value greater than 58.3?
8. Suppose that the actual amount of instant coffee that a filling machine puts into a 6g
can is a random variable having a normal distribution with a standard deviation of
0.05g. If only three percent of these cans are to contain less than 6g of coffee, what is
the mean fill of these cans?
9. Suppose that 23 percent of all patients with high blood pressure have bad side effects
from a certain kind of medicine. Find the probability that among 120 patients treated
with this medicine,
(a) More than 32 will have bad side effects.
(b) At most 50 will have bad side effects.
10. The fuel consumption of a certain type of machine is approximately normal with
mean 2.4 litres per hour and a standard deviation of 0.4. Using a random sample of 32
machines, find the probability that the mean fuel consumption of all machines is at
least 2.2
CHAPTER SIX
ESTIMATION
6.1 Introduction.
In practice it is not always possible to work with the whole population and determine the
desired statistical measures, such as the mean and standard deviation. This is because the
population might be infinite or very expensive to work with.
Estimation involves sampling techniques whereby findings from samples are used
to represent the whole population. Estimators are the formulas used to estimate the
population parameters.
A good estimator must have the following properties
(a) Unbiasedness
(b) Efficiency
(c) Consistency
Unbiasedness
An estimator $\hat{\theta}$ for a population parameter $\theta$ is said to be unbiased if $E(\hat{\theta}) = \theta$. The
quantity $E(\hat{\theta}) - \theta$ is called the bias of $\hat{\theta}$.
Efficiency
This property is used to compare the efficiency of one estimator with that of others in
estimating the same population parameter $\theta$. The estimator with this property is also
known as the MVUE (Minimum Variance Unbiased Estimator). This property is described
as follows
Let $\hat{\theta}_1$ and $\hat{\theta}_2$ be two unbiased estimators for $\theta$; then $\hat{\theta}_1$ is said to be more efficient
than $\hat{\theta}_2$ if
\[ \mathrm{Var}(\hat{\theta}_1) < \mathrm{Var}(\hat{\theta}_2) \]
Consistency
An estimator $\hat{\theta}$ for $\theta$ is said to be consistent if both its bias and variance tend to zero
when the sample size approaches infinity.
6.4 Confidence Interval Estimate for a Population Mean when the Variance is known.
In this situation, the distribution used is Z, and hence the formula for the $(1 - \alpha)100\%$
confidence interval estimate for $\mu$ is given by
\[ \bar{x} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]
Example 6.1
A population is known to have a variance of 81. A random sample of size 16 showed
that x 10.5 . Estimate the population mean by means of 95% confidence interval.
Solution
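Given $\sigma^2 = 81 \Rightarrow \sigma = 9$, $n = 16$, $\bar{x} = 10.5$, $\alpha = 5\% = 0.05$
Then, the 95% confidence interval is given by
\[ \bar{x} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 10.5 \pm Z_{0.025} \frac{9}{\sqrt{16}} = 10.5 \pm (1.96)(2.25) = 10.5 \pm 4.41 \]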
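The interval in Example 6.1 can be reproduced without a printed table by taking the critical value from the inverse normal distribution function. This is a sketch under the same assumptions as the example (known σ); the function name `z_confidence_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(mean: float, sigma: float, n: int, confidence: float = 0.95):
    """Confidence interval for a population mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value Z_{alpha/2}
    margin = z * sigma / sqrt(n)
    return mean - margin, mean + margin

low, high = z_confidence_interval(10.5, 9, 16)
print(round(low, 2), round(high, 2))
```

Changing `confidence` to 0.99 or 0.90 gives the corresponding intervals without looking up a new critical value.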
6.5 Confidence Interval Estimate for a Population Mean when Variance is unknown
There are two situations describing unknown variance.
1. Large sample size, $n \ge 30$
2. Small sample size, $n < 30$
If the sample size is large, the unknown population variance is replaced by the sample
variance. The distribution used is still Z. The formula for the $(1 - \alpha)100\%$ confidence
interval estimate for $\mu$ is given by
\[ \bar{x} \pm Z_{\alpha/2} \frac{s}{\sqrt{n}} \]
Example 6.2
Let X be a normal random variable representing the value of individual invoices (in
dollars) issued by a certain firm. Suppose that $\mu$ and $\sigma$ are unknown. A random sample
of 49 invoices showed that $\bar{x} = 520$ and $s = 91$. Compute a 95% confidence
interval estimate for $\mu$
Solution
Given $n = 49$, $\bar{x} = 520$, $s = 91$, $\alpha = 5\%$
Since the sample size is large, the distribution used is Z, and hence the formula for the 95%
confidence interval for $\mu$ is
\[ \bar{x} \pm Z_{\alpha/2} \frac{s}{\sqrt{n}} \]
But $Z_{\alpha/2} = Z_{0.025} = 1.96$, so we have
\[ 520 \pm 1.96 \frac{91}{\sqrt{49}} = 520 \pm 25.48 \]
Example 6.3
Repeat Example 6.2 with sample size 25.
Solution
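The worked solution is not shown here. A possible sketch, assuming the same $\bar{x} = 520$ and $s = 91$ as in Example 6.2, the small-sample formula $\bar{x} \pm t_{\alpha/2,\, n-1}\, s/\sqrt{n}$, and the tabulated value $t_{0.025,\,24} = 2.064$:

```latex
\[
\bar{x} \pm t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}
= 520 \pm t_{0.025,\, 24} \frac{91}{\sqrt{25}}
= 520 \pm (2.064)(18.2)
= 520 \pm 37.56
\]
```

The interval is wider than the large-sample one because the standard error is larger and the t critical value exceeds 1.96.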
6.6 Estimation of Difference between Means when Population Variances are known
Let X and Y be two normally distributed random variables representing two populations,
and let $n_x$ and $n_y$ be the respective sample sizes. Then, the formula for the
$(1 - \alpha)100\%$ confidence interval estimate for the difference between means $\mu_x - \mu_y$ is
given by
\[ (\bar{x} - \bar{y}) \pm Z_{\alpha/2} \sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}} \]
Example 6.4
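Random variables X and Y are normally distributed with standard deviations
$\sigma_x = 1.2$ and $\sigma_y = 0.9$; random samples of observations on both variables, each of size
32, provide the following information: $\bar{x} = 4.1$ and $\bar{y} = 3.5$. Estimate the difference
between population means by means of a 95% confidence interval.
Solution
Since the population variances are known, the distribution used is Z, and hence the
formula for the 95% confidence interval estimate for $\mu_x - \mu_y$ is given by
\[ (\bar{x} - \bar{y}) \pm Z_{\alpha/2} \sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}} \]
where $Z_{\alpha/2} = Z_{0.025} = 1.96$, then the confidence interval is
\[
\begin{aligned}
(\bar{x} - \bar{y}) \pm Z_{\alpha/2} \sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}
&= (4.1 - 3.5) \pm 1.96 \sqrt{\frac{1.2^2}{32} + \frac{0.9^2}{32}} \\
&= 0.60 \pm 1.96(0.2652) \\
&= 0.60 \pm 0.52
\end{aligned}
\]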
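The two-sample interval in Example 6.4 follows the same pattern as the one-sample case, with the standard error of the difference in place of σ/√n. The sketch below is illustrative; `diff_means_ci` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def diff_means_ci(xbar, ybar, var_x, var_y, nx, ny, confidence=0.95):
    """Confidence interval for mu_x - mu_y when both population variances are known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(var_x / nx + var_y / ny)
    diff = xbar - ybar
    return diff - margin, diff + margin

low, high = diff_means_ci(4.1, 3.5, 1.2**2, 0.9**2, 32, 32)
print(round(low, 2), round(high, 2))
```

Because the interval does not contain zero, the data suggest that the two population means differ.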
6.7 Estimation of the Difference between Means when Population Variances are
unknown
As with a single population mean, two cases are considered in this situation.
1. Sample sizes are both large, $n_1 \ge 30$ and $n_2 \ge 30$.
2. At least one of the samples is small.
6.7.1 Large Sample Sizes
Similarly, in this case population variances are replaced by sample variances and the
distribution used is Z. Hence the formula for the $(1 - \alpha)100\%$ confidence interval estimate for
the difference between means $\mu_x - \mu_y$ is given by
\[ (\bar{x} - \bar{y}) \pm Z_{\alpha/2} \sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}} \]
Example 6.5
A utility company used to send out monthly statements to its customers without
addressed return envelopes. From a random sample of 120 customers it was determined
that, on average, it took 9 days for a payment to be made, with a sample standard
deviation of 2 days.
Wishing to speed up receipt of payment, pre-addressed return envelopes were
subsequently included with the invoices. An independent sample of 130 customers
indicated that average payment time fell to 8 days, with a sample standard deviation of
2.2 days.
Compute a 95% confidence interval estimate for the difference between population
means.
Solution
Let X represent the invoices sent without addressed return envelopes.
Let Y represent the invoices sent with pre-addressed return envelopes.
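The following information is given
$n_x = 120$, $\bar{x} = 9$, $s_x = 2$, $n_y = 130$, $\bar{y} = 8$, $s_y = 2.2$
Since the sample sizes are large, the distribution used is Z; the population variances are
unknown but are replaced by the corresponding sample variances, and hence the formula
for the 95% confidence interval is given by
\[ (\bar{x} - \bar{y}) \pm Z_{\alpha/2} \sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}} \]
where $Z_{\alpha/2} = Z_{0.025} = 1.96$, so we have
\[
\begin{aligned}
(\bar{x} - \bar{y}) \pm Z_{\alpha/2} \sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}}
&= (9 - 8) \pm 1.96 \sqrt{\frac{2^2}{120} + \frac{2.2^2}{130}} \\
&= 1.0 \pm 1.96(0.2656) \\
&= 1.0 \pm 0.52
\end{aligned}
\]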
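\[ s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2} \]
\[ (\bar{x} - \bar{y}) \pm t_{\alpha/2,\; n_x + n_y - 2} \sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}} \]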
Example 6.6
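Repeat Example 6.5 with sample sizes $n_x = 19$ and $n_y = 25$
Solution
The following information is given
$n_x = 19$, $\bar{x} = 9$, $s_x = 2$, $n_y = 25$, $\bar{y} = 8$, $s_y = 2.2$
\[ s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2} = \frac{18(2^2) + 24(2.2^2)}{19 + 25 - 2} = 4.48 \]
\[
\begin{aligned}
(\bar{x} - \bar{y}) \pm t_{\alpha/2,\; n_x + n_y - 2} \sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}
&= (9 - 8) \pm 2.021 \sqrt{\frac{4.48}{19} + \frac{4.48}{25}} \\
&= 1.0 \pm 2.021(0.6442) \\
&= 1.0 \pm 1.30
\end{aligned}
\]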
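The pooled-variance calculation in Example 6.6 can be sketched in code as follows. The t critical value is passed in as an argument because the Python standard library has no t-distribution quantile function; 2.021 is the table value used in the text, and the helper names are illustrative.

```python
from math import sqrt

def pooled_variance(sx: float, sy: float, nx: int, ny: int) -> float:
    """Pooled sample variance s_p^2 from two sample standard deviations."""
    return ((nx - 1) * sx**2 + (ny - 1) * sy**2) / (nx + ny - 2)

def pooled_t_interval(xbar, ybar, sx, sy, nx, ny, t_crit):
    """Small-sample confidence interval for mu_x - mu_y using the pooled variance."""
    sp2 = pooled_variance(sx, sy, nx, ny)
    margin = t_crit * sqrt(sp2 / nx + sp2 / ny)
    diff = xbar - ybar
    return diff - margin, diff + margin

sp2 = pooled_variance(2, 2.2, 19, 25)
low, high = pooled_t_interval(9, 8, 2, 2.2, 19, 25, t_crit=2.021)
print(round(sp2, 2), round(low, 2), round(high, 2))
```

Unlike the large-sample interval of Example 6.5, this interval contains zero, so with the smaller samples the difference is no longer clearly established.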
Other formulas for confidence interval estimations are summarized in the following table
Example 6.7
In a random sample of 500 families owning television sets in a certain city, it is found
that 340 have not yet subscribed to a newly introduced digital transmission system. Find
a 95% confidence interval for the actual proportion of all families in the city who have
not yet subscribed to the system.
Solution
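The point estimate of p is $\hat{p} = 340/500 = 0.68$, so $\hat{q} = 1 - \hat{p} = 0.32$. For 95% confidence, we have $\alpha = 0.05$,
then, $Z_{\alpha/2} = Z_{0.025} = 1.96$. Therefore, the 95% confidence interval for p is
\[ 0.68 - 1.96 \sqrt{\frac{(0.68)(0.32)}{500}} < p < 0.68 + 1.96 \sqrt{\frac{(0.68)(0.32)}{500}} \]
\[ 0.64 < p < 0.72 \]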
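The proportion interval in Example 6.7 can be checked with a short function. This is a sketch of the large-sample formula $\hat{p} \pm Z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$; the name `proportion_ci` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes: int, n: int, confidence: float = 0.95):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(340, 500)
print(round(low, 2), round(high, 2))
```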
Example 6.8
The following are weights, in decagrams, of 10 packages of grass seed distributed by a
certain company: 46.4, 46.1, 45.8, 47.0, 46.1, 45.9, 45.8, 46.9, 45.2 and 46.0. Find a 95%
confidence interval estimate for the variance of all such packages of grass seed
distributed by this company, assuming the normal population.
Solution
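We first compute the sample variance of these data as follows:
\[ s^2 = \frac{1}{n - 1}\left[\sum x_i^2 - \frac{1}{n}\left(\sum x_i\right)^2\right] = \frac{1}{9}\left[21{,}273.12 - \frac{1}{10}(461.2)^2\right] = 0.286 \]
For a 95% confidence interval, we have $\alpha = 0.05$, then
\[ \chi^2_{\alpha/2}(n - 1) = \chi^2_{0.025}(9) = 19.023 \quad \text{and} \quad \chi^2_{1 - \alpha/2}(n - 1) = \chi^2_{0.975}(9) = 2.700 \]
Therefore, the 95% confidence interval for $\sigma^2$ is
\[ \frac{(9)(0.286)}{19.023} < \sigma^2 < \frac{(9)(0.286)}{2.700}, \quad \text{that is,} \quad 0.135 < \sigma^2 < 0.953 \]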
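The sample variance and the interval limits in Example 6.8 can be verified from the raw data. In this sketch the chi-square critical values are taken from the text as constants, since the standard library does not provide chi-square quantiles; the interval formula assumed is $(n-1)s^2/\chi^2_{\alpha/2} < \sigma^2 < (n-1)s^2/\chi^2_{1-\alpha/2}$.

```python
from statistics import variance

weights = [46.4, 46.1, 45.8, 47.0, 46.1, 45.9, 45.8, 46.9, 45.2, 46.0]
n = len(weights)
s2 = variance(weights)   # sample variance with divisor n - 1

# Table values quoted in the text for 9 degrees of freedom
chi2_upper = 19.023      # chi-square value with 0.025 in the upper tail
chi2_lower = 2.700       # chi-square value with 0.975 in the upper tail

low = (n - 1) * s2 / chi2_upper
high = (n - 1) * s2 / chi2_lower
print(round(s2, 3), round(low, 3), round(high, 3))
```

Note that the larger critical value gives the lower limit, because it appears in the denominator.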
Exercises 6
1. A study of the annual growth of a certain kind of fish showed that 64 of them, selected
at random in a lake, grew on average 52.80 mm with a standard deviation of 4.5
mm. Estimate the true average annual growth of all fish in the lake by means of a 99%
confidence interval.
2. Independent random samples of size $n_1 = 16$ and $n_2 = 25$ from normal populations
with $\sigma_1 = 4.8$ and $\sigma_2 = 3.5$ have means $\bar{x}_1 = 18.2$ and $\bar{x}_2 = 23.4$. Construct a 90%
confidence interval for $\mu_1 - \mu_2$
3. Repeat Q2 when the population variances are unknown but the samples result in
$s_1 = 5.0$ and $s_2 = 4.0$
4. 32 out of 500 items produced by a certain engineering firm are found defective.
Estimate the true proportion of all defective items by means of 95% confidence
interval.
5. 14 out of 120 items from machine A are defective and only 8 out of 70 produced by
machine B are defective. Construct the 99% confidence interval for the true
difference between proportions.
6. The following are the heat-producing capacities of coal from two mines (in millions of
calories per ton)
Mine A : 8500 8330 8480 7960 8030
Mine B : 7710 7890 7920 8270 7860
Assuming that the data constitute independent random samples from normal
populations, construct a 90% confidence interval for the ratio of their variances.
CHAPTER SEVEN
HYPOTHESES TESTING
7.1 Introduction
A statistical hypothesis is an assertion made about the distribution or value of a given
random variable. Hypothesis testing is a technique involving a set of rules to be
followed in order to choose between two conflicting
hypotheses/claims. These conflicting hypotheses are referred to as the null and alternative
hypotheses. A null hypothesis is a statement that is considered to be true unless the test
gives evidence against it. It is denoted by $H_0$. An alternative hypothesis is the statement which
may be considered to be true only if the null hypothesis is not true. It is denoted by $H_1$.
Example 7.1
A certain machine produces ball bearings whose diameter has a variance of 0.0025. A random
sample of 49 bearings gives a sample mean of 5.01 mm. One wishes to test the null
hypothesis that the mean diameter is 5.00 mm against the alternative that the mean
diameter is 5.035 mm. The decision rule says: accept the null hypothesis if the sample mean
diameter is less than 5.02 mm and reject it otherwise. Find
(a) The size of type I error
(b) The size of type II error
Solution
From the given information we have $\sigma^2 = 0.0025$, $n = 49$, $\bar{x} = 5.01$
We need to test the following hypotheses: $H_0: \mu = 5.00$ against $H_1: \mu = 5.035$
The decision rule is: Accept $H_0$ if $\bar{x} < 5.02$. Reject $H_0$ if $\bar{x} \ge 5.02$
Since the population variance is known, the statistic used is Z such that
\[ Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]
(a) From the definition,
\[
\begin{aligned}
\text{Size of type I error} &= P(\text{Reject } H_0 \mid H_0 \text{ is true}) \\
&= P(\bar{X} \ge 5.02 \mid \mu = 5.00) \\
&= P\left(Z \ge \frac{5.02 - 5.00}{0.05 / \sqrt{49}}\right) \\
&= P(Z \ge 2.8) \\
&= 0.0026
\end{aligned}
\]
(b) Similarly,
\[
\begin{aligned}
\text{Size of type II error} &= P(\text{Accept } H_0 \mid H_0 \text{ is false}) \\
&= P(\bar{X} < 5.02 \mid \mu = 5.035) \\
&= P\left(Z < \frac{5.02 - 5.035}{0.05 / \sqrt{49}}\right) \\
&= P(Z < -2.1) \\
&= 0.0179
\end{aligned}
\]
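Both error probabilities in Example 7.1 are tail areas of the sampling distribution of the mean, evaluated under different assumed values of μ. The following sketch computes them with `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma = sqrt(0.0025)      # population standard deviation, 0.05
n = 49
cutoff = 5.02             # decision rule: reject H0 when the sample mean >= 5.02
se = sigma / sqrt(n)      # standard error of the sample mean

# Type I error: reject H0 although mu = 5.00
alpha = 1 - NormalDist(5.00, se).cdf(cutoff)

# Type II error: accept H0 although mu = 5.035
beta = NormalDist(5.035, se).cdf(cutoff)

print(round(alpha, 4), round(beta, 4))
```

Moving the cutoff closer to 5.00 would increase the type I error and decrease the type II error, which shows the trade-off between the two.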
The testing procedure includes choosing a suitable test statistic (a random variable whose
distribution is known whenever $H_0$ is true) and dividing its values into two regions known
as the rejection and non-rejection regions. This partitioning is done at the critical value(s).
The size of the rejection region is just the probability of committing a Type I error. This
probability is also known as the level of significance, $\alpha$. A critical value is the value of the
test statistic that cuts off a tail area equal to the level of significance.
Define $1 - \beta$ as the power of the test, where $\beta$ is the probability of committing a Type II
error.
There are two types of testing procedure. These are one-sided/tailed and two-sided/tailed
tests.
It is customary to represent the null hypothesis by an equality sign and the alternative ones
by inequalities. Suppose that $H_0: \theta = \theta_0$, then the test will be one-sided if the
Solution
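Given: $n = 16$, $\bar{x} = 9.8$, $\sigma = 2$
1. Hypotheses:
$H_0: \mu = 9.0$; $H_1: \mu > 9.0$, $\alpha = 0.05$
2. Test statistic:
Since $\sigma$ is known, the test statistic is Z such that
\[ Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{9.8 - 9.0}{2 / \sqrt{16}} = \frac{0.8}{0.5} = 1.60 \]
3. Critical values and regions:
The critical value for the one-sided test with $\alpha = 0.05$
is $Z_{0.05} = 1.645$. The distribution is then divided as follows
[Figure: standard normal curve with the non-rejection region to the left of 1.645 and the rejection region to its right; the test statistic 1.60 lies in the non-rejection region.]
4. Decision
Since the value of the test statistic falls within the non-rejection region, we do not reject $H_0$ at
the 5% level of significance.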
Solution
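Given: $n = 20$, $\bar{x} = 9.5$ and $s = 3$
1. Hypotheses: $H_0: \mu = 11$ against $H_1: \mu \ne 11$, $\alpha = 0.05$
2. Test statistic: Since $\sigma$ is unknown and the sample is small, the test statistic will
be T.
\[ \text{Thus } T = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{9.5 - 11}{3 / \sqrt{20}} = -2.24 \]
3. The critical values for the two-sided test are $\pm t_{0.025,\,19} = \pm 2.093$
[Figure: t-distribution with rejection regions below -2.093 and above 2.093; the test statistic -2.24 lies in the lower rejection region.]
4. Decision: Since the value of the test statistic falls within the rejection region, we
reject $H_0$ in favour of $H_1$ at the 5% level of significance.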
7.5 Testing for the Difference Between two Means when Variances are known
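We test $H_0: \mu_1 - \mu_2 = c$ against one of the alternative hypotheses $H_1: \mu_1 - \mu_2 \ne c$
or $H_1: \mu_1 - \mu_2 > c$ or $H_1: \mu_1 - \mu_2 < c$
The test statistic is Z such that
\[ Z = \frac{(\bar{x}_1 - \bar{x}_2) - c}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]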
7.6 Testing for the Difference Between two Means when the Variances are unknown.
In this case the test procedure is similar to the previous situation except that the population
variances $\sigma_i^2$ are replaced by sample variances $s_i^2$ for $i = 1, 2$
The test statistic will no longer be Z; instead, T is again used, which has a t-distribution
with $n_1 + n_2 - 2$ degrees of freedom, where
\[ T = \frac{(\bar{x}_1 - \bar{x}_2) - c}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}} \]
Example 7.4
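A random sample of size 16 showed an average of 480g with a standard deviation of 21g.
On the other hand, a sample of size 25 resulted in an average of 490g with a standard
deviation of 24g. Test the null hypothesis $H_0: \mu_1 - \mu_2 = 0$ against the alternative
$H_1: \mu_1 - \mu_2 < 0$ at the 5% level of significance.
Solution
Given: $n_1 = 16$, $\bar{x}_1 = 480$, $s_1 = 21$, $n_2 = 25$, $\bar{x}_2 = 490$, $s_2 = 24$
Since the samples are small and the population variances are unknown, we use T as the
test statistic, such that
\[ T = \frac{(\bar{x}_1 - \bar{x}_2) - c}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}} \quad \text{where} \quad s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \]
Now,
\[ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(16 - 1)21^2 + (25 - 1)24^2}{16 + 25 - 2} = 524.08 \]
Then,
\[ T = \frac{(\bar{x}_1 - \bar{x}_2) - c}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}} = \frac{(480 - 490) - 0}{\sqrt{\dfrac{524.08}{16} + \dfrac{524.08}{25}}} = -1.36 \]
[Figure: rejection region to the left of the critical value -1.645; the test statistic -1.36 lies in the non-rejection region.]
Decision: Since the value of the test statistic falls within the non-rejection region, we do not reject $H_0$ at the 5% level of significance.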
\[ \chi^2 = \frac{(n - 1)s^2}{\sigma_0^2} \]
where $\alpha$ is the level of significance, and $n - 1$ is called the degrees of freedom. The
degrees of freedom simply indicate the number of free independent scores minus the
number of parameters that a certain distribution contains.
The critical value(s) depend on the nature of the alternative hypothesis; the following
table summarizes
Example 7.5
Suppose that the thickness of a part used in a semiconductor is its critical dimension and
that measurements of the thickness of a random sample of 18 such parts have a
variance of 0.68, where the measurements are in thousandths of an inch. The process is considered to be under
control if the variance of the thickness is not greater than 0.36. Assuming that the
measurements constitute a random sample from a normal population, test at the 5% level that
the process is under control.
Solution
Given $n = 18$, $s^2 = 0.68$, $\sigma_0^2 = 0.36$, $\alpha = 0.05$
Hypotheses: $H_0: \sigma^2 = 0.36$ against $H_1: \sigma^2 > 0.36$
Test statistic: The test statistic is computed as follows
\[ \chi^2 = \frac{(n - 1)s^2}{\sigma_0^2} = \frac{17(0.68)}{0.36} = 32.11 \]
Critical value: The critical value for this one-sided test is
\[ \chi^2_{\alpha}(n - 1) = \chi^2_{0.05}(17) = 27.587 \]
Decision: Since the value of the test statistic is greater than the critical value, it falls
within the rejection region. Therefore the null hypothesis is rejected at the 5% level, and
we conclude that the process is out of control and should be adjusted immediately.
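The decision step in Example 7.5 amounts to comparing the computed statistic with a tabulated critical value. A minimal sketch follows; the critical value 27.587 is the table value quoted in the text, and the function name is illustrative.

```python
def chi_square_variance_statistic(n: int, s2: float, sigma0_sq: float) -> float:
    """Test statistic (n - 1) s^2 / sigma_0^2 for a test about a population variance."""
    return (n - 1) * s2 / sigma0_sq

stat = chi_square_variance_statistic(18, 0.68, 0.36)
critical = 27.587            # chi-square table value, alpha = 0.05, 17 degrees of freedom
reject_h0 = stat > critical  # one-sided test: reject for large values of the statistic

print(round(stat, 2), reject_h0)
```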
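Suppose we have two independent populations 1 and 2. We need to test the null
hypothesis $H_0: \sigma_1^2 = \sigma_2^2$ against one of the following alternatives
$H_1: \sigma_1^2 < \sigma_2^2$ or $H_1: \sigma_1^2 > \sigma_2^2$ or $H_1: \sigma_1^2 \ne \sigma_2^2$
The appropriate test statistic in this case is F such that
\[ F = \frac{s_1^2}{s_2^2} \ \text{ if } s_1^2 \ge s_2^2 \quad \text{or} \quad F = \frac{s_2^2}{s_1^2} \ \text{ if } s_2^2 > s_1^2 \]
The test statistic and the critical value both depend on the nature of the alternative hypothesis;
the following table summarizes

Alternative hypothesis                                   Test statistic          Reject the null hypothesis if
$\sigma_1^2 < \sigma_2^2$                                $F = s_2^2 / s_1^2$     $F \ge F_{\alpha}(n_2 - 1, n_1 - 1)$
$\sigma_1^2 > \sigma_2^2$                                $F = s_1^2 / s_2^2$     $F \ge F_{\alpha}(n_1 - 1, n_2 - 1)$
$\sigma_1^2 \ne \sigma_2^2$ and $s_1^2 \ge s_2^2$        $F = s_1^2 / s_2^2$     $F \ge F_{\alpha/2}(n_1 - 1, n_2 - 1)$
$\sigma_1^2 \ne \sigma_2^2$ and $s_2^2 > s_1^2$          $F = s_2^2 / s_1^2$     $F \ge F_{\alpha/2}(n_2 - 1, n_1 - 1)$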
Example 7.6
In comparing the variability of the tensile strength of two kinds of structural steel, an
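experiment yielded the following results: $n_1 = 13$, $s_1^2 = 19.2$, $n_2 = 16$, $s_2^2 = 3.5$. Assuming
that the measurements constitute independent random samples from two normal
populations, test at the 2% level the null hypothesis that $\sigma_1^2 = \sigma_2^2$ against the
alternative $\sigma_1^2 \ne \sigma_2^2$.
Solution
Given $n_1 = 13$, $s_1^2 = 19.2$, $n_2 = 16$, $s_2^2 = 3.5$ and $\alpha = 0.02$
Hypotheses: $H_0: \sigma_1^2 = \sigma_2^2$ against $H_1: \sigma_1^2 \ne \sigma_2^2$
Test statistic: Since $s_1^2 \ge s_2^2$, the test statistic is
\[ F = \frac{s_1^2}{s_2^2} = \frac{19.2}{3.5} = 5.49 \]
Critical value: The critical value for this two-sided test is $F_{\alpha/2}(n_1 - 1, n_2 - 1) = F_{0.01}(12, 15) = 3.67$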
Decision: Since the value of the test statistic exceeds the critical value, we reject the null
hypothesis at 2% level of significance and conclude that the variability of the tensile
strength of the two kinds of steel is not the same.
Example 7.7
A commonly prescribed drug for relieving nervous tension is believed to be only 60%
effective. Experimental results with a new drug administered to a random sample of 100
adults who were suffering from nervous tension show that 70 received relief. Is this
sufficient evidence to conclude that the new drug is superior to the one commonly
prescribed? Test at the 5% level of significance.
Solution
Given $p_0 = 0.60$, $n = 100$, $x = 70$, $\alpha = 0.05$
Hypotheses: $H_0: p = 0.60$ against $H_1: p > 0.60$
Test statistic: The test statistic used is Z such that
\[ Z = \frac{X - np_0}{\sqrt{np_0 q_0}} = \frac{70 - 100(0.60)}{\sqrt{100(0.60)(0.40)}} = 2.04 \]
Critical value: The critical value for the one-sided test is $Z_{0.05} = 1.645$
Decision: Since the value of the statistic exceeds the critical value, the null hypothesis is
rejected and we conclude that the new drug is superior.
Example 7.8
In a study to estimate the proportion of residents in a certain city and its suburbs who
favour the construction of nuclear plant, it is found that 63 of 100 urban residents favour
the deal while only 59 of 125 suburban residents are in favour. Is there a significant
difference between the proportions of urban and suburban residents who favour the
construction of the nuclear plant? Test this at the 5% level of significance.
Solution
Given the following information: $x_1 = 63$, $n_1 = 100$, $x_2 = 59$, $n_2 = 125$, $\alpha = 0.05$
Hypotheses: We test $H_0: p_1 = p_2$ against $H_1: p_1 \ne p_2$
Test statistic: The test statistic is Z such that
\[ Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \]
We first compute the sample proportions and the pooled sample proportion by
\[ \hat{p}_1 = \frac{x_1}{n_1} = 0.63, \quad \hat{p}_2 = \frac{x_2}{n_2} = 0.472, \quad \hat{p} = \frac{63 + 59}{100 + 125} = 0.542, \quad \hat{q} = 1 - \hat{p} = 0.458 \]
Then,
\[ Z = \frac{0.63 - 0.472}{\sqrt{(0.542)(0.458)\left(\dfrac{1}{100} + \dfrac{1}{125}\right)}} = 2.36 \]
Critical value: The critical values for the two-sided test are $\pm Z_{\alpha/2} = \pm Z_{0.025} = \pm 1.96$
Decision: Since the calculated value of the test statistic is greater than the critical value,
we reject the null hypothesis and conclude that there is a significant difference between
the proportion of urban residents and suburban residents who favour the construction of
the nuclear plant in their city.
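The two-proportion test in Example 7.8 can be written as a small function that pools the two samples before computing the standard error. This is a sketch of the statistic used above; `two_proportion_z` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """Z statistic for H0: p1 = p2, using the pooled sample proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

z = two_proportion_z(63, 100, 59, 125)
critical = NormalDist().inv_cdf(0.975)   # two-sided test at the 5% level
reject_h0 = abs(z) > critical

print(round(z, 2), round(critical, 2), reject_h0)
```

The pooled proportion is used in the standard error because, under the null hypothesis, both samples estimate the same proportion.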
Such information is commonly given in the form of tables known as contingency
tables, and the analysis of the data is referred to as the analysis of $r \times c$ tables.
Here we test the null hypothesis that two factors are independent against the alternative
hypothesis that they are not independent.
The test statistic is a $\chi^2$ given by
\[ \chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O - E)^2}{E} \]
where O = observed value in a cell and E = expected value in a cell.
The expected value for each cell is computed as follows:
\[ E = \frac{\text{Row total} \times \text{Column total}}{\text{Grand total}} \]
The critical value is given by $\chi^2_{\alpha}\big((r - 1)(c - 1)\big)$, where $\alpha$ is the level of significance and
$(r - 1)(c - 1)$ is the degrees of freedom for this test.
Example 7.9
Use the data in the following table to test at 0.01 level of significance whether a person’s
ability in mathematics is independent of his or her interest in statistics.
                              Ability in mathematics
                           Low     Average    High
Interest in    Low          63       42        15
statistics     Average      58       61        31
               High         14       47        29
Solution
We need to test the following hypotheses
$H_0$: Ability in mathematics and interest in statistics are independent
$H_1$: Ability in mathematics and interest in statistics are not independent
Before we compute the value of the test statistic, we need to compute the row and column
totals as well as the expected values E, using the above relation. The following table
summarizes this information, whereby the expected value for each cell is given in
brackets.
                              Ability in mathematics
                           Low           Average       High          Total
Interest in    Low         63 (45)       42 (50)       15 (25)       120
statistics     Average     58 (56.25)    61 (62.5)     31 (31.25)    150
               High        14 (33.75)    47 (37.5)     29 (18.75)    90
Total                      135           150           75            360
\[ \chi^2 = \sum_{i=1}^{3} \sum_{j=1}^{3} \frac{(O - E)^2}{E} = \frac{(63 - 45)^2}{45} + \frac{(42 - 50)^2}{50} + \cdots + \frac{(29 - 18.75)^2}{18.75} = 32.14 \]
The critical value is given by $\chi^2_{0.01}(4) = 13.277$.
Decision: Since the value of the test statistic exceeds the critical value, we reject the null
hypothesis at the 0.01 level and conclude that a person's ability in mathematics and interest in
statistics are not independent. This implies that there is a relationship between one's ability
in mathematics and his or her interest in statistics.
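The expected counts and the test statistic in Example 7.9 follow mechanically from the observed table, so they are easy to check in code. The sketch below uses only built-in Python; `chi_square_independence` is an illustrative name.

```python
observed = [
    [63, 42, 15],   # interest in statistics: low
    [58, 61, 31],   # interest in statistics: average
    [14, 47, 29],   # interest in statistics: high
]

def chi_square_independence(table):
    """Return the chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # expected count
            stat += (o - e) ** 2 / e
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

stat, dof = chi_square_independence(observed)
print(round(stat, 2), dof)
```

The largest contributions come from the cells in the "high interest" row, which is where observed and expected counts differ most.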
Exercises 7
1. A certain firm produces items that are normally distributed with variance 9. There
was a policy of paying double tax for those firms whose production exceeds 400 items
per week. To avoid paying the tax, the manager claimed to produce only 400 items
per week. A random sample of 20 weeks showed that $\bar{x} = 430$ and $s = 3.6$. Is the
manager subject to this penalty? Test this claim at the 5% level of significance.
2. A random sample of 25 students showed a mean score of 45 marks and a
standard deviation of 6 marks. Test the null hypothesis that the average score of the
class was 50 against the alternative that it was different from 50 at the 1% level of
significance. What will your conclusion be at the 5% level?
3. Suppose you want to test the equality of revenues from two different sectors in your
region. Two samples, each of size 14, taken from normal populations showed that
$\bar{x} = 4.5$, $s_x = 1.2$, $\bar{y} = 5.1$, $s_y = 1.5$. Perform a suitable hypothesis test
for this situation.
4. The owner of a certain private school claimed that a student at her school is equally
likely to be a girl or a boy. To test this claim, a sample of 80 students was randomly
selected and 36 of them were found to be girls. Test the owner's claim at the 5%
level of significance.
5. The fidelity and the selectivity of 190 radios produced the results shown in the
following table. Test for independence at the 5% level of significance.
                         Fidelity
Selectivity      Low   Average   High
Low                7        12     31
Average           35        59     18
High              15        13      0
CHAPTER EIGHT
REGRESSION AND CORRELATION ANALYSIS
8.1 Introduction
Regression analysis is the study of the relationship between two or more
variables, one being the dependent variable and the rest being independent variables.
The relationship could be linear or non-linear, simple or multiple. We are going to discuss
simple linear regression analysis, which involves a dependent variable labeled Y
and only one independent variable labeled X, related linearly.
Generally, this relation can be expressed in the following equation
$$y_i = \beta_1 + \beta_2 x_i + u_i, \quad i = 1, \ldots, n \qquad (1)$$
Where $\beta_1$ and $\beta_2$ are constant parameters and $u$ is called the disturbance term. $\beta_1$
and $\beta_2$ as well as $x$ are treated as non-random, while $u$ is treated as a random term, and
hence so is $y$.
There are a number of assumptions concerning the probability distribution of $u$. The
first one is that $E(u_i) = 0$ for each $i = 1, \ldots, n$. This enables us to derive the population
regression line given by
$$E(y_i \mid x_i) = \beta_1 + \beta_2 x_i, \quad i = 1, \ldots, n \qquad (2)$$
We still cannot compute the parameters $\beta_1$ and $\beta_2$ from equation (2). However, we can
estimate their values if we take a paired sample of size n from the same population.
The estimated parameters give what we call the line of best fit, sometimes
referred to as the sample regression line, given in equation (3) below
$$\hat{y}_i = \hat\beta_1 + \hat\beta_2 x_i, \quad i = 1, \ldots, n \qquad (3)$$
Where $\hat\beta_1$ and $\hat\beta_2$ are estimates of $\beta_1$ and $\beta_2$ respectively. Suitable estimators for these
parameters are obtained from the method of least squares.
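The estimates minimize the sum of squared residuals $q = \sum (y_i - \hat\beta_1 - \hat\beta_2 x_i)^2$, that is, they solve
$$\frac{\partial q}{\partial \hat\beta_1} = 0, \qquad \frac{\partial q}{\partial \hat\beta_2} = 0$$
which gives
$$\hat\beta_2 = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}, \qquad \hat\beta_1 = \frac{1}{n}\left(\sum y - \hat\beta_2 \sum x\right)$$
If we define
$$S_{xx} = \sum x^2 - \frac{1}{n}\left(\sum x\right)^2, \quad S_{yy} = \sum y^2 - \frac{1}{n}\left(\sum y\right)^2, \quad S_{xy} = \sum xy - \frac{1}{n}\sum x \sum y,$$
we can simply define
$$\hat\beta_2 = \frac{S_{xy}}{S_{xx}}$$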
The line of best fit can roughly be estimated from a scatter diagram plotted using the paired
sample data $(x, y)$. The sample regression line can be used to predict the value of y given
the value of x.
Example 8.1
The following pairs of values (x, y) are given by
X 5 10 15 20 25
Y 22 32 38 59 67
(a) Plot a scatter diagram and estimate the line of best fit.
(b) Obtain the regression line by using least squares method.
(c) Predict the value of y when x = 110.
Solution
(a) A scatter diagram and the estimated line of best fit are shown in the figure below.
[Figure: scatter diagram of y (vertical axis, 0 to 80) against x (horizontal axis, 0 to 25) with the estimated line of best fit]
(b) The summary data can be obtained from the following table
          x      y      xy     x²      y²
          5     22     110     25     484
         10     32     320    100    1024
         15     38     570    225    1444
         20     59    1180    400    3481
         25     67    1675    625    4489
Total    75    218    3855   1375   10922
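$$S_{xx} = \sum x^2 - \frac{1}{n}\left(\sum x\right)^2 = 1375 - \frac{1}{5}(75)^2 = 250$$
$$S_{yy} = \sum y^2 - \frac{1}{n}\left(\sum y\right)^2 = 10922 - \frac{1}{5}(218)^2 = 1417.2$$
$$S_{xy} = \sum xy - \frac{1}{n}\sum x \sum y = 3855 - \frac{1}{5}(75)(218) = 585$$
Then,
$$\hat\beta_2 = \frac{S_{xy}}{S_{xx}} = \frac{585}{250} = 2.34, \qquad \hat\beta_1 = \frac{1}{n}\left(\sum y - \hat\beta_2 \sum x\right) = \frac{1}{5}\left(218 - 2.34 \times 75\right) = 8.5$$
so the least squares regression line is $\hat{y} = 8.5 + 2.34x$.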
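The hand computation above can be checked with a short script. This is a minimal Python sketch, not part of the original text; the names `least_squares` and `predict` are illustrative choices.

```python
# Data from Example 8.1
x = [5, 10, 15, 20, 25]
y = [22, 32, 38, 59, 67]


def least_squares(x, y):
    """Return (S_xx, S_yy, S_xy, b1, b2) for the fitted line y = b1 + b2 * x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    s_xx = sum(v * v for v in x) - sum_x ** 2 / n
    s_yy = sum(v * v for v in y) - sum_y ** 2 / n
    s_xy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n
    b2 = s_xy / s_xx                # slope estimate
    b1 = (sum_y - b2 * sum_x) / n   # intercept estimate
    return s_xx, s_yy, s_xy, b1, b2


s_xx, s_yy, s_xy, b1, b2 = least_squares(x, y)


def predict(x_new):
    """Predicted y on the fitted line for a given x."""
    return b1 + b2 * x_new


print(s_xx, s_yy, s_xy, b1, b2)
```

Part (c) of the example can then be evaluated as `predict(110)`. Note that x = 110 lies far outside the observed range of x, so such a prediction is an extrapolation.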
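Similarly, the distribution of $\hat\beta_2$ is normal with mean $\beta_2$ and variance $\sigma^2 / S_{xx}$, where $\sigma^2$
is the variance of u.
Thus,
$$\hat\beta_1 \sim N\left(\beta_1, \frac{\sigma^2 \sum x^2}{n S_{xx}}\right) \quad \text{and} \quad \hat\beta_2 \sim N\left(\beta_2, \frac{\sigma^2}{S_{xx}}\right).$$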
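We cannot directly compute the value of $\sigma^2$, and thus in order to proceed with our
analysis, we have to estimate its value. An unbiased estimator of $\sigma^2$ is $\hat\sigma^2$ such that
$$\hat\sigma^2 = \frac{\sum e_i^2}{n-2} = \frac{RSS}{n-2}, \quad \text{where} \quad RSS = \sum e_i^2 = S_{yy} - \hat\beta_2 S_{xy}.$$
Implying that
$$\hat\sigma^2 = \frac{1}{n-2}\left(S_{yy} - \hat\beta_2 S_{xy}\right).$$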
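These estimators follow the t-distribution such that
$$\frac{\hat\beta_1 - \beta_1}{SE(\hat\beta_1)} \sim t(n-2) \quad \text{and} \quad \frac{\hat\beta_2 - \beta_2}{SE(\hat\beta_2)} \sim t(n-2)$$
Where
$$SE(\hat\beta_1) = \sqrt{\frac{\hat\sigma^2 \sum x^2}{n S_{xx}}} \quad \text{and} \quad SE(\hat\beta_2) = \sqrt{\frac{\hat\sigma^2}{S_{xx}}}$$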
Example 8.2
Construct 95% confidence interval estimates of $\beta_1$ and $\beta_2$ using the data given in
Example 8.1.
Solution
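Recall that $S_{xx} = 250$, $S_{yy} = 1417.2$, $S_{xy} = 585$, $\hat\beta_1 = 8.5$ and $\hat\beta_2 = 2.34$
$$\hat\sigma^2 = \frac{1}{n-2}\left(S_{yy} - \hat\beta_2 S_{xy}\right) = \frac{1}{3}\left(1417.2 - 2.34 \times 585\right) = 16.1$$
It implies that
$$SE(\hat\beta_1) = \sqrt{\frac{16.1 \times 1375}{5 \times 250}} = 4.21 \quad \text{and} \quad SE(\hat\beta_2) = \sqrt{\frac{16.1}{250}} = 0.254$$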
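The standard errors can be turned into interval estimates of the form estimate ± t × SE. The sketch below is an illustration rather than the author's worked solution; it assumes the tabulated value $t_{0.025}(3) = 3.182$ (the same critical value quoted in Example 8.4) and uses variable names chosen here for convenience.

```python
import math

# Summary values from Example 8.1
n = 5
sum_x2 = 1375
s_xx, s_yy, s_xy = 250.0, 1417.2, 585.0
b1, b2 = 8.5, 2.34
t_crit = 3.182  # assumed table value t_{0.025} with n - 2 = 3 degrees of freedom

# Unbiased estimate of the disturbance variance and the two standard errors
sigma2_hat = (s_yy - b2 * s_xy) / (n - 2)
se_b1 = math.sqrt(sigma2_hat * sum_x2 / (n * s_xx))
se_b2 = math.sqrt(sigma2_hat / s_xx)

# 95% confidence intervals: estimate +/- t * SE
ci_b1 = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
ci_b2 = (b2 - t_crit * se_b2, b2 + t_crit * se_b2)

print(ci_b1)
print(ci_b2)
```

Under these assumptions the interval for the intercept is roughly (-4.89, 21.89) and the interval for the slope is roughly (1.53, 3.15). The interval for the intercept contains zero, while the interval for the slope does not.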
Example 8.3
Use the information given in Example 8.1 to compute both $r$ and $r^2$ and interpret the results.
Solution
(a) Using the different sums of squares, we have
$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{585}{\sqrt{250 \times 1417.2}} = 0.9828.$$
This value suggests that there is a very strong positive linear relationship between the
variables x and y.
(b) The coefficient of determination is $r^2 = (0.9828)^2 = 0.966 = 96.6\%$.
This measure indicates that the explanatory variable (x) explains 96.6% of the variation
in the dependent variable (y), and the remaining 3.4% can be explained by other
variables not considered in this case.
n -2 degrees of freedom.
Example 8.4
Use the data given in Example 8.1 to test $H_0: \rho = 0$ against $H_1: \rho \neq 0$ at the 5% level of
significance.
Solution
The test statistic is T such that
$$T = \sqrt{\frac{(n-2)\,r^2}{1 - r^2}} = \sqrt{\frac{3 \times 0.966}{1 - 0.966}} = 9.23$$
The critical values at the 5% level are $\pm t_{0.025}(3) = \pm 3.182$. Clearly, $H_0$ is rejected at the 5%
level.
This implies that there is a significant linear relationship between the variables.
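The correlation coefficient and the test statistic can be reproduced from the sums of squares. This is a minimal Python sketch with illustrative variable names; the critical value 3.182 is the table value quoted in the example.

```python
import math

n = 5
s_xx, s_yy, s_xy = 250.0, 1417.2, 585.0   # from Example 8.1

r = s_xy / math.sqrt(s_xx * s_yy)          # sample correlation coefficient
r2 = r ** 2                                # coefficient of determination
t_stat = r * math.sqrt((n - 2) / (1 - r2)) # same as sqrt((n - 2) r^2 / (1 - r^2)) for r > 0

t_crit = 3.182                             # t_{0.025}(3) from the t table
reject_h0 = abs(t_stat) > t_crit

print(round(r, 4), round(r2, 3), round(t_stat, 2), reject_h0)
```

Carrying full precision gives a statistic of about 9.22; the value 9.23 in the text comes from rounding $r^2$ to 0.966 before substituting. The decision is the same either way.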
Exercises 8
1. Various doses of a poisonous substance were given to groups of 25 mice and the
following results were recorded:
(a) Find the equation of the least squares line to fit these data.
(b) Estimate the number of deaths in groups of 25 mice who received a 7-milligram
dose of this poison.
(c) Compute the correlation coefficient as well as the coefficient of determination
and comment on your results.
(d) Compute 90% confidence interval estimates for $\beta_1$ and $\beta_2$.
(e) Use the data to test $H_0: \rho = 0$ against $H_1: \rho \neq 0$ at the 5% level of significance.
CHAPTER NINE
ANALYSIS OF VARIANCE
9.1 Introduction
In hypothesis testing, we tested values assigned to a parameter of a single population, as
well as the difference between the means of two independent populations. In this
chapter we give a general procedure for testing the equality of two or more
population means. The general test discussed here is called the analysis of variance, or
simply ANOVA. Analysis of variance is classified into two procedures: one-way
analysis of variance and two-way analysis of variance. Each of these procedures
is discussed below.
or
$$SST = SSTr + SSE$$
Where,
SST = Total sum of squares
SSTr = Treatment sum of squares
$$\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{\sigma^2} = \frac{(n-1)s^2}{\sigma^2} \sim \chi^2(n-1)$$
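Then, for k populations with n observations from each population we have
$$\frac{1}{\sigma^2}\sum_{i=1}^{k}\sum_{j=1}^{n}\left(x_{ij} - \bar{x}_{i.}\right)^2 \sim \chi^2\left(k(n-1)\right)$$
$$\frac{n}{\sigma^2}\sum_{i=1}^{k}\left(\bar{x}_{i.} - \bar{x}_{..}\right)^2 \sim \chi^2(k-1)$$
$$F = \frac{MSTr}{MSE} \sim F\left(k-1,\; k(n-1)\right)$$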
The above calculations are usually presented in the form of a table known as the Analysis of
Variance (ANOVA) table, whose general layout is shown below
Example 9.1
Scores of five best candidates in mathematics from three different schools at the same
level are given below
School A: 77, 81, 71, 76, 80
School B: 72, 58, 74, 66, 70
School C: 76, 85, 82, 80, 77
Perform a suitable test to check if the mean performances of these three schools are equal
or significantly different from one another.
Solution
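For the above populations, we compute the sum of squared observations as
$$\sum_{i=1}^{3}\sum_{j=1}^{5} x_{ij}^2 = 77^2 + 81^2 + \cdots + 80^2 + 77^2 = 85041$$
$$SST = \sum_{i=1}^{3}\sum_{j=1}^{5} x_{ij}^2 - \frac{1}{kn}T_{..}^2 = 85041 - \frac{1}{15}(1125)^2 = 666$$
$$SSTr = \frac{1}{n}\sum_{i=1}^{k} T_{i.}^2 - \frac{1}{kn}T_{..}^2 = \frac{1}{5}\left(385^2 + 340^2 + 400^2\right) - \frac{1}{15}(1125)^2 = 390$$
It follows that
$$SSE = SST - SSTr = 666 - 390 = 276$$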
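The sums of squares above, and the F ratio built from them, can be computed directly from the raw scores. The following is a minimal Python sketch; the function name `one_way_anova` is an illustrative choice, and the degrees of freedom used are k - 1 for treatments and N - k for error, where N is the total number of observations.

```python
def one_way_anova(groups):
    """Return (SST, SSTr, SSE, F) for a list of samples, one list per treatment."""
    all_obs = [v for g in groups for v in g]
    n_total = len(all_obs)
    k = len(groups)
    correction = sum(all_obs) ** 2 / n_total                       # T..^2 / N
    sst = sum(v * v for v in all_obs) - correction                 # total sum of squares
    sstr = sum(sum(g) ** 2 / len(g) for g in groups) - correction  # treatment sum of squares
    sse = sst - sstr                                               # error sum of squares
    mstr = sstr / (k - 1)
    mse = sse / (n_total - k)
    return sst, sstr, sse, mstr / mse


schools = [
    [77, 81, 71, 76, 80],   # School A
    [72, 58, 74, 66, 70],   # School B
    [76, 85, 82, 80, 77],   # School C
]
sst, sstr, sse, f_value = one_way_anova(schools)
print(sst, sstr, sse, round(f_value, 3))
```

For the school data this gives MSTr = 390/2 = 195, MSE = 276/12 = 23 and F of about 8.48, which is then compared with the tabulated F(2, 12) critical value at the chosen level. Because each treatment total is divided by its own sample size, the same function also accepts groups of unequal size.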
Example 9.2
A consumer wishes to test the accuracy of the thermostats of three different kinds of
electric irons set at 240°C.
Iron X: 251, 246, 238, 245
Iron Y: 242, 250, 248
Iron Z: 240, 248, 247, 246, 247
Test at 5% level that the three means of thermostatic accuracies are equal against the
alternative hypothesis that the means are different.
Solution
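We first compute the total sum of squared observations as follows
$$\sum_{i=1}^{3}\sum_{j=1}^{n_i} x_{ij}^2 = 251^2 + 246^2 + \cdots + 246^2 + 247^2 = 724392$$
$$SST = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2 - \frac{1}{N}T_{..}^2 = 724392 - \frac{1}{12}(2948)^2 = 166.67$$
$$SSTr = \sum_{i=1}^{k}\frac{T_{i.}^2}{n_i} - \frac{1}{N}T_{..}^2 = \frac{980^2}{4} + \frac{740^2}{3} + \frac{1228^2}{5} - \frac{1}{12}(2948)^2 = 4.8$$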
It follows that
$$F_{Tr} = \frac{MSTr}{MSE} \sim F\left(k-1,\; (n-1)(k-1)\right)$$
And
$$F_B = \frac{MSB}{MSE} \sim F\left(n-1,\; (n-1)(k-1)\right)$$
Example 9.3
The driving times to work (in minutes) taken by a person from Monday to Friday using
four different routes were recorded as follows:
           Mon   Tue   Wed   Thu   Fri
Route 1     22    26    25    25    31
Route 2     25    27    28    26    29
Route 3     26    29    33    30    33
Route 4     26    28    27    30    30
Test at the 5% level whether or not the mean driving times among routes are significantly
different.
Also test the mean driving times among days.
Solution
In this case we have four routes (treatments) and five days (blocks). That means we have
k=4 and n=5.
We compute the total sum of squared observations as follows
$$\sum_{i=1}^{4}\sum_{j=1}^{5} x_{ij}^2 = 22^2 + 26^2 + \cdots + 30^2 + 30^2 = 15610$$
Treatments, blocks and grand total are summarized in the following table
                        Blocks (days)
Treatments    Mon   Tue   Wed   Thu   Fri   Total
Route 1        22    26    25    25    31     129
Route 2        25    27    28    26    29     135
Route 3        26    29    33    30    33     151
Route 4        26    28    27    30    30     141
Total          99   110   113   111   123     556
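$$SST = \sum_{i=1}^{4}\sum_{j=1}^{5} x_{ij}^2 - \frac{1}{kn}T_{..}^2 = 15610 - \frac{1}{20}(556)^2 = 153.2$$
$$SSTr = \frac{1}{5}\sum_{i=1}^{4} T_{i.}^2 - \frac{1}{kn}T_{..}^2 = \frac{1}{5}\left(129^2 + 135^2 + 151^2 + 141^2\right) - \frac{1}{20}(556)^2 = 52.8$$
$$SSB = \frac{1}{4}\sum_{j=1}^{5} T_{.j}^2 - \frac{1}{kn}T_{..}^2 = \frac{1}{4}\left(99^2 + 110^2 + 113^2 + 111^2 + 123^2\right) - \frac{1}{20}(556)^2 = 73.2$$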
It implies that
The critical value for testing the treatment means is $F_{0.05}(3, 12) = 3.49$, and the critical
value for testing the block means is $F_{0.05}(4, 12) = 3.26$.
Decisions
Since the calculated F-value for routes (treatments) is greater than the critical value, we
reject the null hypothesis that the mean driving times among the routes are equal, and
conclude that the mean driving times among different routes are significantly different.
Similarly, the calculated F-value for days (blocks) is greater than the critical value; we
reject the null hypothesis that the mean driving times among the days are equal, and
conclude that the mean driving times among different days are also significantly
different.
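The two-way computations for this example can be reproduced with a short script. This is a minimal Python sketch, assuming rows are treatments (routes) and columns are blocks (days); the function name `two_way_anova` is illustrative, and the critical values are the table values quoted above.

```python
def two_way_anova(table):
    """table[i][j] is the observation for treatment i (row) and block j (column).

    Returns (SST, SSTr, SSB, SSE, F_treatments, F_blocks).
    """
    k = len(table)        # number of treatments
    n = len(table[0])     # number of blocks
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / (k * n)            # T..^2 / (kn)
    sst = sum(v * v for row in table for v in row) - correction
    sstr = sum(sum(row) ** 2 for row in table) / n - correction
    block_totals = [sum(row[j] for row in table) for j in range(n)]
    ssb = sum(t * t for t in block_totals) / k - correction
    sse = sst - sstr - ssb
    mse = sse / ((k - 1) * (n - 1))
    f_tr = (sstr / (k - 1)) / mse
    f_b = (ssb / (n - 1)) / mse
    return sst, sstr, ssb, sse, f_tr, f_b


times = [
    [22, 26, 25, 25, 31],   # Route 1
    [25, 27, 28, 26, 29],   # Route 2
    [26, 29, 33, 30, 33],   # Route 3
    [26, 28, 27, 30, 30],   # Route 4
]
sst, sstr, ssb, sse, f_tr, f_b = two_way_anova(times)
print(round(sse, 1), round(f_tr, 2), round(f_b, 2))
print(f_tr > 3.49, f_b > 3.26)   # compare with F_0.05(3, 12) and F_0.05(4, 12)
```

With SSE = 153.2 - 52.8 - 73.2 = 27.2, the F ratios come out at about 7.76 for routes and 8.07 for days, both above their critical values, in agreement with the decisions stated above.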
Exercises 9
1. The following are the numbers of mistakes made in five successive weeks by four
technicians working for a medical laboratory:
Technician I: 13, 16, 12, 14, 15
Technician II: 15, 16, 11, 19, 15
Technician III: 13, 18, 16, 14, 18
Technician IV: 18, 10, 14, 15, 12
Test at 0.05 level of significance whether the differences among the four sample
means can be attributed to chance.
2. An experiment was performed to judge the effect of four different fuels and three
different types of launchers on the range of a certain rocket. Test, on the basis of the
following ranges in miles, whether there is a significant effect due to differences in fuels.
Test also whether there is a significant effect due to differences in launchers.
CHAPTER TEN
QUALITY CONTROL
10.1 Introduction
Quality control is designed to maintain quality in production processes. It has been an important
ingredient in the development of Japan's industry and economy. It is now receiving
increasing attention as a management tool in which important characteristics of products
are observed, assessed and compared with some type of standard. The various
procedures in quality control involve considerable use of sampling procedures and
statistical principles. Effective quality control programs enhance the
quality of the product being produced and increase profits.
The line shown is called the centerline, which represents the expected value of the
characteristic when the process is in control. The plotted points represent the sample
averages of a characteristic, with samples taken over time. The upper and lower control
limits are chosen in such a way that all sample points should fall within these
boundaries for the process to be in control. If any point falls outside these boundaries, it
suggests that the process is out of control. Also, a nonrandom pattern of points within the
boundaries may be considered suspicious and an indication that the process
requires appropriate corrective action.
One obvious purpose of the control chart is mere surveillance of the process, that is, to
determine if changes need to be made. Constant systematic gathering of data often allows
management to assess process capability. Quality characteristics of control charts fall
generally into two categories: variables and attributes.
Example 10.1
Quality control charts are to be used on a process for manufacturing a certain engine
part. Suppose the process mean is $\mu = 50$ mm and the standard deviation is
$\sigma = 0.01$ mm. Suppose also that groups of five parts are sampled every hour and the
values of the sample mean $\bar{X}$ are recorded and plotted. Based on the standard deviation
of the random variable $\bar{X}$, set the control limits of the $\bar{X}$-chart.
Solution
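Assume that the measurements of the engine parts are normally distributed. Then the sample
mean $\bar{X}$ is normally distributed with mean 50 and standard deviation
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{0.01}{\sqrt{5}} = 0.00447$$
Then, $(1-\alpha)100\%$ of the $\bar{X}$-values fall inside the limits when the process is in control.
The required limits are given by
$$LCL = \mu - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 50 - Z_{\alpha/2}\frac{0.01}{\sqrt{5}} = 50 - Z_{\alpha/2}(0.00447)$$
And,
$$UCL = \mu + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 50 + Z_{\alpha/2}\frac{0.01}{\sqrt{5}} = 50 + Z_{\alpha/2}(0.00447)$$
In most practical situations, control analysts use "three-sigma" limits, meaning that they
use $Z_{\alpha/2} = 3$. Therefore, we have
$$LCL = 50 - 3(0.00447) = 49.9866, \qquad UCL = 50 + 3(0.00447) = 50.0134$$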
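The limits for a known process mean and standard deviation follow from a one-line formula, sketched below in Python. The function name `xbar_limits` and its default multiplier z = 3 (the three-sigma convention described above) are illustrative choices.

```python
import math


def xbar_limits(mu, sigma, n, z=3.0):
    """Lower and upper control limits for the sample mean of n observations."""
    half_width = z * sigma / math.sqrt(n)
    return mu - half_width, mu + half_width


# Example 10.1: mu = 50 mm, sigma = 0.01 mm, subgroups of size 5
lcl, ucl = xbar_limits(50.0, 0.01, 5)
print(round(lcl, 4), round(ucl, 4))
```

For the engine part data this gives limits of about 49.9866 mm and 50.0134 mm.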
taken when the process is in control. Typically, the estimates are determined during a
period in which background or start-up information is gathered.
A basis for rational subgroups is chosen and the data are gathered with samples of size n
in each subgroup. The sample sizes are usually small (4, 5 or 6) and k samples are taken,
with $k \geq 20$. During the period in which the process is assumed to be in control, the user
establishes estimates for $\mu$ and $\sigma$ on which the control chart is based. The important
information gathered during this period includes the sample means in the subgroups, the
overall mean, and the sample range in each subgroup.
For each sample we compute $\bar{X}_i$, $i = 1, \ldots, k$, to form the sample points $\bar{X}_1, \ldots, \bar{X}_k$, and the
general sample mean is given by
$$\bar{\bar{X}} = \frac{1}{k}\sum_{i=1}^{k}\bar{X}_i$$
Where $\bar{\bar{X}}$ is the appropriate estimator for $\mu$.
In quality control applications it is often convenient to estimate $\sigma$ from the information
related to the ranges in the samples rather than sample standard deviations.
For the i-th sample, we define the range for the data by
$$R_i = X_{\max,i} - X_{\min,i}$$
The appropriate estimate of $\sigma$ is a function of the average range given by
$$\bar{R} = \frac{1}{k}\sum_{i=1}^{k} R_i$$
An estimate of $\sigma$, say $\hat\sigma$, is obtained from the formula
$$\hat\sigma = \frac{\bar{R}}{d_2}$$
Where $d_2$ is a constant depending on the sample size, n.
The use of the range to estimate $\sigma$ has roots in quality control applications since it can
be easily computed. Under the assumption of normality, we make use of a random
variable called the relative range, given by
$$W = \frac{R}{\sigma}$$
whose expected value is a simple function of the sample size n and gives $d_2$. That is
$$E(W) = \frac{E(R)}{\sigma} = d_2$$
This motivates the estimate $\hat\sigma = \bar{R}/d_2$.
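It is known that the range produces an efficient estimator of $\sigma$ in relatively small
samples. Using the range method, the estimated parameters are given by
$$\text{Centerline} = \bar{\bar{X}}, \qquad LCL = \bar{\bar{X}} - \frac{3\bar{R}}{d_2\sqrt{n}}, \qquad UCL = \bar{\bar{X}} + \frac{3\bar{R}}{d_2\sqrt{n}}$$
By defining the quantity $A_2 = \dfrac{3}{d_2\sqrt{n}}$, we have
$$LCL = \bar{\bar{X}} - A_2\bar{R}, \qquad UCL = \bar{\bar{X}} + A_2\bar{R}$$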
Sample number   Observations (five per sample)   $\bar{X}_i$   $R_i$
1 1515 1518 1512 1498 1511 1510.8 20
2 1504 1511 1507 1499 1502 1504.6 12
3 1517 1513 1504 1521 1520 1515.0 17
4 1497 1503 1510 1508 1502 1504.0 13
5 1507 1502 1497 1509 1512 1505.4 15
6 1519 1522 1523 1517 1511 1518.4 12
7 1498 1497 1507 1511 1508 1504.2 14
8 1511 1518 1507 1503 1509 1509.6 15
9 1506 1503 1498 1508 1506 1504.2 10
10 1503 1506 1511 1501 1500 1504.2 11
11 1499 1503 1507 1503 1501 1502.6 8
12 1507 1503 1502 1500 1501 1502.6 7
13 1500 1506 1501 1498 1507 1502.4 9
14 1501 1509 1503 1508 1503 1504.8 8
15 1507 1508 1502 1509 1501 1505.4 8
16 1511 1509 1503 1510 1507 1508.0 8
17 1508 1511 1513 1509 1506 1509.4 7
18 1508 1509 1512 1515 1519 1512.6 11
19 1520 1517 1519 1522 1516 1518.8 6
20 1506 1511 1517 1516 1508 1511.6 11
21 1500 1498 1503 1504 1508 1502.6 10
22 1511 1514 1509 1508 1506 1509.6 8
23 1505 1508 1500 1509 1503 1505.0 9
24 1501 1498 1505 1502 1505 1502.2 7
25 1509 1511 1507 1500 1499 1505.2 12
Total 37683.2 268
Example 10.2
A process manufacturing missile component parts is being controlled, with the
performance characteristic being the tensile strength in pounds per square inch. Samples
of size 5 each are taken every hour and 25 samples are reported.
The data are shown in the above table (sample means and sample ranges have already
been computed from the data and are shown in the last two columns). Construct (a) the R-chart
and (b) the $\bar{X}$-chart of the tensile strength.
Solution
(a) For the R-chart we proceed as follows.
The centreline is computed as
$$\bar{R} = \frac{1}{25}\sum_{i=1}^{25} R_i = \frac{268}{25} = 10.72$$
Using table A.23 with $n = 5$ we find that $D_3 = 0$ and $D_4 = 2.114$. Using these
results, we construct the control limits as follows
$$LCL = \bar{R}D_3 = (10.72)(0) = 0, \qquad UCL = \bar{R}D_4 = (10.72)(2.114) = 22.6621$$
The R-chart with its limits is shown below;
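(b) For the $\bar{X}$-chart, the centreline is $\bar{\bar{X}} = 37683.2/25 = 1507.328$. For samples of size 5, table A.23 gives $A_2 = 0.577$. The control limits are
$$LCL = \bar{\bar{X}} - A_2\bar{R} = 1507.328 - (0.577)(10.72) = 1501.14, \qquad UCL = \bar{\bar{X}} + A_2\bar{R} = 1507.328 + (0.577)(10.72) = 1513.51$$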
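Both charts for this example depend only on the column totals of the data table and the tabulated constants. The sketch below is a minimal Python illustration with variable names chosen here; it assumes the range-based limits $\bar{\bar{X}} \pm A_2\bar{R}$ for the mean chart and uses the constants $D_3$, $D_4$ and $A_2$ quoted in the text for n = 5.

```python
k = 25                 # number of samples
sum_ranges = 268       # total of the R_i column
sum_means = 37683.2    # total of the sample-mean column

# Tabulated constants for subgroups of size n = 5, as quoted in the text
D3, D4, A2 = 0.0, 2.114, 0.577

r_bar = sum_ranges / k        # centreline of the R-chart
x_bar_bar = sum_means / k     # centreline of the X-bar chart

# Each tuple is (LCL, centreline, UCL)
r_chart = (D3 * r_bar, r_bar, D4 * r_bar)
xbar_chart = (x_bar_bar - A2 * r_bar, x_bar_bar, x_bar_bar + A2 * r_bar)

print([round(v, 4) for v in r_chart])
print([round(v, 4) for v in xbar_chart])
```

Under these assumptions the mean chart has a centreline of 1507.328 with limits of about 1501.14 and 1513.51, so any sample mean outside that band would signal a possible out-of-control condition.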
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}$$
should be used in the control of both the mean and variability.
We know that $s^2$ is an unbiased estimator of $\sigma^2$, but unfortunately $s$ is not unbiased for
$\sigma$. That means $E(s) \neq \sigma$.
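However, if X is an independently and normally distributed random variable, then
$$E(s) = c_4\sigma \quad \text{where} \quad c_4 = \sqrt{\frac{2}{n-1}}\;\frac{\Gamma(n/2)}{\Gamma\left((n-1)/2\right)}$$
We also have
$$\operatorname{var}(s) = \sigma^2\left(1 - c_4^2\right), \qquad \sigma_s = \sigma\sqrt{1 - c_4^2}$$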
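Therefore, if $\sigma$ is known, the control limits are computed by
$$LCL = c_4\sigma - 3\sigma\sqrt{1 - c_4^2} = \sigma\left(c_4 - 3\sqrt{1 - c_4^2}\right) = B_5\sigma$$
$$UCL = c_4\sigma + 3\sigma\sqrt{1 - c_4^2} = \sigma\left(c_4 + 3\sqrt{1 - c_4^2}\right) = B_6\sigma$$
Where $B_5$ and $B_6$ are tabulated values depending on the sample size. However, in most
practical situations, $\sigma$ is unknown and needs to be estimated. In this case, an unbiased
estimator is $\hat\sigma$, which is given by
$$\hat\sigma = \frac{\bar{s}}{c_4}$$
where $\bar{s} = \frac{1}{m}\sum_{i=1}^{m} s_i$, with $s_i$ being the standard deviation of the i-th sample.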
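The control limits for the S-chart are given by
$$\text{centreline} = \bar{s}, \qquad LCL = B_3\bar{s}, \qquad UCL = B_4\bar{s}$$
Where $B_3 = 1 - \frac{3}{c_4}\sqrt{1 - c_4^2}$ and $B_4 = 1 + \frac{3}{c_4}\sqrt{1 - c_4^2}$ are tabulated values.
Similarly, for the $\bar{X}$-chart we have,
$$\text{centreline} = \bar{\bar{X}}, \qquad LCL = \bar{\bar{X}} - A_3\bar{s}, \qquad UCL = \bar{\bar{X}} + A_3\bar{s}$$
Where $A_3$ is a tabulated value depending on the sample size.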
Example 10.3
Containers are produced by a process where the volume of the containers is subject to
quality control. Twenty-five samples of size 5 each were used to establish the quality
control parameters. The results showed that $\sum_{i=1}^{25}\bar{X}_i = 1558.14$ and $\sum_{i=1}^{25} s_i = 0.9025$.
Construct the $\bar{X}$-chart and the s-chart for the volume of the containers.
Solution
(a) The $\bar{X}$-chart is constructed as follows;
The centreline is given by
$$\bar{\bar{X}} = \frac{1}{25}\sum_{i=1}^{25}\bar{X}_i = \frac{1558.14}{25} = 62.3256$$
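The remaining limits for this example can be sketched in Python by computing the chart constants from $c_4$ instead of reading them from a table. Two assumptions are made here that are not stated in the text: $A_3 = 3/(c_4\sqrt{n})$, which is the usual definition of that tabulated constant, and a negative $B_3$ is truncated to zero. All names are illustrative.

```python
import math


def c4(n):
    """Bias-correction constant with E(s) = c4 * sigma for normal samples of size n."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)


n, m = 5, 25               # subgroup size and number of subgroups
x_bar_bar = 1558.14 / m    # centreline of the X-bar chart
s_bar = 0.9025 / m         # average sample standard deviation

c = c4(n)
spread = 3.0 * math.sqrt(1.0 - c * c) / c
B3 = max(0.0, 1.0 - spread)       # assumption: a negative factor is set to zero
B4 = 1.0 + spread
A3 = 3.0 / (c * math.sqrt(n))     # assumption: standard definition of A3

# Each tuple is (LCL, centreline, UCL)
s_chart = (B3 * s_bar, s_bar, B4 * s_bar)
xbar_chart = (x_bar_bar - A3 * s_bar, x_bar_bar, x_bar_bar + A3 * s_bar)

print(round(c, 4), round(B4, 3), round(A3, 3))
print([round(v, 4) for v in xbar_chart])
print([round(v, 4) for v in s_chart])
```

For n = 5 the computed constants are approximately $c_4 = 0.9400$, $B_4 = 2.089$ and $A_3 = 1.427$. This gives mean-chart limits of about 62.2741 and 62.3771, and an s-chart with centreline 0.0361 and upper limit of about 0.0754.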
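This is a binomial situation with $E(X) = np$ and $\operatorname{var}(X) = np(1-p)$. An unbiased
estimator of p is the sample fraction defective $\hat{p} = x/n$.
The distribution properties of $\hat{p}$ are important in the development of the p-chart. These
are
$$E(\hat{p}) = p \quad \text{and} \quad \operatorname{var}(\hat{p}) = \frac{p(1-p)}{n}$$
Thus, if p is known, the control limits are given by
$$\text{centreline} = p, \qquad LCL = p - 3\sqrt{\frac{p(1-p)}{n}}, \qquad UCL = p + 3\sqrt{\frac{p(1-p)}{n}}$$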
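If p is not known, then an unbiased estimator for p is $\bar{p}$ such that
$$\bar{p} = \sum_{i=1}^{m}\frac{\hat{p}_i}{m}$$
Where $\hat{p}_i$ is the fraction defective in the i-th sample.
The resulting control limits are
$$\text{centreline} = \bar{p}, \qquad LCL = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}, \qquad UCL = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$$
Note that m is the number of samples and n is the common sample size.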
Example 10.4
The number of defective electronic components is subject to quality control. To establish
preliminary control chart values, twenty samples each of size 50 are taken and the
number of defective items was recorded. The results show that $\sum_{i=1}^{20}\hat{p}_i = 1.76$. Determine
the control limits for the p-chart.
Solution
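We first compute the centreline as
$$\bar{p} = \sum_{i=1}^{m}\frac{\hat{p}_i}{m} = \frac{1.76}{20} = 0.088$$
Then, the control limits are computed by
$$LCL = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.088 - 3\sqrt{\frac{(0.088)(0.912)}{50}} = -0.0322$$
$$UCL = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.088 + 3\sqrt{\frac{(0.088)(0.912)}{50}} = 0.2082$$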
Comments:
1. A negative LCL is set to zero, since a fraction defective cannot be negative.
2. The control limits show that the process is in control during this preliminary period.
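The p-chart limits, including the truncation of a negative lower limit described in the first comment, can be sketched as follows. The function name `p_chart_limits` and its parameter `k` (the sigma multiplier, 3 by default) are illustrative.

```python
import math


def p_chart_limits(p_bar, n, k=3.0):
    """Return (LCL, centreline, UCL) for a p-chart with common sample size n."""
    half_width = k * math.sqrt(p_bar * (1.0 - p_bar) / n)
    lcl = max(0.0, p_bar - half_width)   # a fraction defective cannot be negative
    return lcl, p_bar, p_bar + half_width


# Example 10.4: twenty samples of size 50, sum of fractions defective = 1.76
p_bar = 1.76 / 20
lcl, centre, ucl = p_chart_limits(p_bar, 50)
print(lcl, round(centre, 3), round(ucl, 4))
```

The untruncated lower limit here is about -0.0322, so the reported lower limit is 0 and the upper limit is about 0.2082.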
Substituting $UCL = p + 3\sqrt{\frac{p(1-p)}{n}}$ in the above equation we get
$$p - p_1 + 3\sqrt{\frac{p(1-p)}{n}} = 0$$
We can now solve for n, the size of each sample, using $3\sigma$ limits, to get
$$n = \frac{9}{\Delta^2}\,p(1-p),$$
Where $\Delta$ is the "shift" in the value of p.
However, if the control chart is based on $k\sigma$ limits in general, we have
$$n = \frac{k^2}{\Delta^2}\,p(1-p)$$
Example 10.5
Suppose that an attribute quality control chart is being designed with a value of $p = 0.01$
for the in-control probability of a defective. What is the sample size per subgroup
producing a probability of 0.5 that a process shift to $p = p_1 = 0.05$ will be detected?
The resulting p-chart will involve $3\sigma$ limits.
Solution
Given $p = 0.01$, $p_1 = 0.05$, it follows that $\Delta = |p - p_1| = |0.01 - 0.05| = 0.04$
Then, using $3\sigma$ limits we have
$$n = \frac{9}{\Delta^2}\,p(1-p) = \frac{9}{(0.04)^2}(0.01)(0.99) = 55.69 \approx 56.$$
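The sample size formula is easy to evaluate for other shifts and multipliers. The following is a minimal Python sketch; the function name `p_chart_sample_size` is illustrative, and rounding up to the next whole item is the convention assumed here.

```python
import math


def p_chart_sample_size(p, p1, k=3.0):
    """Subgroup size giving probability 0.5 of detecting a shift from p to p1
    on a p-chart with k-sigma limits, rounded up to a whole number of items."""
    delta = abs(p1 - p)
    return math.ceil(k ** 2 * p * (1.0 - p) / delta ** 2)


n = p_chart_sample_size(0.01, 0.05)
print(n)
```

For Example 10.5 the unrounded value is 55.6875, so the function returns 56.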
Exercises 10
1. Consider the following data taken on subgroups of size 5. The data contain 20
averages and ranges on the diameter (in millimeters) of an important component part of
an engine.
Sample $\bar{X}$ R
1 2.3972 0.0052
2 2.4191 0.0117
3 2.4215 0.0062
4 2.3917 0.0089
5 2.4151 0.0095
6 2.4027 0.0101
7 2.3921 0.0091
8 2.4171 0.0059
9 2.3951 0.0068
10 2.4215 0.0048
11 2.3887 0.0082
12 2.4107 0.0032
13 2.4009 0.0077
14 2.3992 0.0107
15 2.3889 0.0025
16 2.4107 0.0138
17 2.4109 0.0037
18 2.3944 0.0052
19 2.3951 0.0038
20 2.4015 0.0017
2. Samples of size 50 are taken every hour from a process producing a certain type
of item that is either considered defective or not defective. Twenty samples are
taken and the results are shown in the following table:
CHAPTER ELEVEN
ELEMENTARY CONCEPTS ON SYSTEM RELIABILITY
11.1 Introduction
The analysis of the reliability of a system must be based on precisely defined concepts.
Since it is readily accepted that a population of supposedly identical systems, operating
under similar conditions, fail at different points in time, a failure phenomenon can
only be described in terms of probabilities. Thus, the fundamental definitions of
reliability must depend on concepts of probability theory. In general, a system may be
required to perform various functions, each of which may have a different reliability. In
addition, at different times, the system may have a different probability of successfully
performing the required function under the stated conditions. The term failure means the
system is not capable of performing a function when required.
Or, equivalently, as
$$f(t) = -\frac{d}{dt}R(t) \qquad (11.4)$$
The density function can be mathematically described in terms of the failure time T;
$$f(t) = \lim_{\Delta t \to 0}\frac{P(t < T \le t + \Delta t)}{\Delta t} \qquad (11.5)$$
Equation (11.5) means that, for a small interval $\Delta t$, the quantity $f(t)\,\Delta t$ can be interpreted as
the probability that the failure will occur between the operating time t and the next interval of operation, $t + \Delta t$.
It is assumed that a system operates with probability one at time $t = 0$ and that this probability
decreases to zero as time increases to infinity without any repair. Clearly, reliability is a
function of mission time.
Example 11.1
A computer system has an exponential failure time density function
$$f(t) = \frac{1}{9000}e^{-t/9000}, \quad t \ge 0$$
What is the probability that the system will fail after the warranty (six months or 4380
hours) and before the end of year one (or 8760 hours)?
Solution
The required probability is given by
$$P(4380 < T \le 8760) = \int_{4380}^{8760}\frac{1}{9000}e^{-t/9000}\,dt = 0.237$$
This indicates that the probability of failure during the interval from six months to one
year is 23.7%.
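Integrating the exponential density in closed form gives the difference of two survival probabilities, $e^{-t_1/\theta} - e^{-t_2/\theta}$, which can be evaluated with a few lines of Python. The function names below are illustrative.

```python
import math


def exp_reliability(t, theta):
    """Probability that an exponential lifetime with mean theta exceeds t."""
    return math.exp(-t / theta)


def prob_failure_between(t1, t2, theta):
    """P(t1 < T <= t2) for an exponential lifetime with mean theta."""
    return exp_reliability(t1, theta) - exp_reliability(t2, theta)


p = prob_failure_between(4380, 8760, 9000)
print(round(p, 3))
```

For the computer system of Example 11.1 this evaluates to about 0.237.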
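If the time to failure is described by an exponential failure time density function, then
$$f(t) = \frac{1}{\theta}e^{-t/\theta}, \quad t \ge 0,\ \theta > 0 \qquad (11.6)$$
And this will lead to the reliability function
$$R(t) = \int_{t}^{\infty}\frac{1}{\theta}e^{-s/\theta}\,ds = e^{-t/\theta}, \quad t \ge 0 \qquad (11.7)$$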
Substituting $f(t) = -\frac{d}{dt}R(t)$ into equation (11.8) and integrating by parts gives
$$MTTF = -\int_{0}^{\infty} t\,d[R(t)] = \Big[-tR(t)\Big]_{0}^{\infty} + \int_{0}^{\infty}R(t)\,dt \qquad (11.9)$$
The first term on the right hand side of (11.9) is zero at both limits, since the system must
fail after a finite amount of operating time. That is, we must have $tR(t) \to 0$ as $t \to \infty$.
Therefore, we have
$$MTTF = \int_{0}^{\infty}R(t)\,dt \qquad (11.10)$$
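Equation (11.10) can be checked numerically for the exponential case, where $R(t) = e^{-t/\theta}$ and the integral should equal $\theta$. The sketch below uses a simple trapezoidal rule; the function name `mttf_numeric`, the truncation of the infinite upper limit at 50θ, and the value θ = 9000 hours (borrowed from Example 11.1) are all illustrative assumptions.

```python
import math


def mttf_numeric(reliability, upper, steps=200_000):
    """Trapezoidal approximation of the integral of R(t) from 0 to `upper`."""
    h = upper / steps
    total = 0.5 * (reliability(0.0) + reliability(upper))
    total += sum(reliability(i * h) for i in range(1, steps))
    return total * h


theta = 9000.0
# The infinite upper limit is truncated where R(t) is negligibly small.
mttf = mttf_numeric(lambda t: math.exp(-t / theta), upper=50 * theta)
print(round(mttf, 2))
```

The numerical result agrees with θ to well within a hundredth of an hour, which is consistent with the MTTF of an exponential lifetime being its parameter θ.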
$$\int_{t_1}^{t_2} f(t)\,dt = \int_{t_1}^{\infty} f(t)\,dt - \int_{t_2}^{\infty} f(t)\,dt = F(t_2) - F(t_1)$$
The rate at which failures occur in a certain time interval $[t_1, t_2]$ is called the failure rate.
It is defined as the probability that a failure per unit time occurs in the interval, given that
a failure has not occurred prior to time $t_1$, the beginning of the interval. Thus, the failure
rate is
$$FR = \frac{R(t_1) - R(t_2)}{(t_2 - t_1)\,R(t_1)} \qquad (11.11)$$
Note that the failure rate is a function of time. If we redefine the interval as $[t, t + \Delta t]$, the
above expression becomes
$$FR = \frac{R(t) - R(t + \Delta t)}{\Delta t\,R(t)} \qquad (11.12)$$
Using equation (11.12) we define the hazard function as the limit of the failure rate as
the interval approaches zero. Thus, the hazard function $h(t)$ is the instantaneous failure
rate, and is defined by
$$h(t) = \lim_{\Delta t \to 0}\frac{R(t) - R(t + \Delta t)}{\Delta t\,R(t)} = \frac{1}{R(t)}\left[-\frac{d}{dt}R(t)\right] = \frac{f(t)}{R(t)} \qquad (11.13)$$
The importance of the hazard function is that it indicates the change in the failure rate over
the life of a population of components, which can be seen by plotting their hazard functions on a single axis.
11.5 Maintainability
When a system fails to perform satisfactorily, repair is normally carried out to locate and
correct the fault. The system is restored to operational effectiveness by making an
adjustment or by replacing a component.
Maintainability is defined as the probability that a failed system will be restored to
specified conditions within a given period of time when maintenance is performed
according to prescribed procedures and resources. In other words maintainability is the
probability of isolating and repairing a fault in a system within a given time.
Maintainability engineers must work with system designers to ensure that the system
product can be maintained by the customer efficiently and cost-effectively.
This function requires the analysis of part removal, replacement, tear-down, and build-up
of the product in order to determine the required time to carry out the operation, the
necessary skill, the type of support equipment and the documentation.
Let T denote the random variable of the time to repair or the total downtime. If the repair
time T has a repair time density function $g(t)$, then the maintainability, denoted by $V(t)$,
is defined as the probability that the failed system will be back in service by time t;
$$V(t) = P(T \le t) = \int_{0}^{t} g(s)\,ds \qquad (11.14)$$
An important measure often used in maintenance studies is the mean time to repair
(MTTR) or the mean downtime. MTTR is the expected value of the random variable
repair time, not failure time, and is given by
$$MTTR = \int_{0}^{\infty} t\,g(t)\,dt \qquad (11.15)$$
When the repair time has the exponential density given by
$g(t) = \mu e^{-\mu t}$, then $MTTR = 1/\mu$.
11.6 Availability
Reliability is a measure that requires system success for an entire mission time. No
failures or repairs are allowed. Space missions and aircraft flights are good examples of
systems that do not allow failure or repair. Availability is a measure that allows for a
system to be repaired when failure occurs.
The availability of a system is defined as the probability that the system is successful at
time t. Mathematically we have
$$\text{Availability} = \frac{\text{System uptime}}{\text{System uptime} + \text{System downtime}} = \frac{MTTF}{MTTF + MTTR} \qquad (11.16)$$
Availability is a measure of success used primarily for repairable systems. For non-repairable
systems, availability $A(t)$ is equal to reliability $R(t)$. In repairable systems,
$A(t) \ge R(t)$.
most important measure in repairable system is called Mean time between repairs
(MTBR). It implies that the system has failed and has been repaired. This is
mathematically given by
$$f(t) = \frac{1}{\theta}e^{-t/\theta} = \lambda e^{-\lambda t} \quad \text{and} \quad R(t) = e^{-t/\theta} = e^{-\lambda t}, \quad t \ge 0$$
Where $\theta = 1/\lambda > 0$ is the MTTF parameter and $\lambda > 0$ is a constant failure rate.
The hazard function or failure rate for the exponential density function is constant, that is,
$$h(t) = \frac{f(t)}{R(t)} = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}} = \lambda \qquad (11.18)$$
The failure rate for this distribution is a constant $\lambda$, which is the main reason this
distribution is so widely used. Because of its constant failure rate property, the exponential is
an excellent model for the long flat "intrinsic failure" portion of the bathtub curve.
Example 11.2
A manufacturer performs an operational life test on ceramic capacitors and finds that they
exhibit a constant failure rate with a value of $3 \times 10^{-8}$ failures per hour. What is the
reliability of a capacitor at $10^4$ hours?
Solution
The reliability of a capacitor at $10^4$ hours is
$$R(t) = e^{-\lambda t} = e^{-3 \times 10^{-8} \times 10^{4}} = e^{-3 \times 10^{-4}} = 0.9997$$
Given that
$$f(t) = -\frac{d}{dt}R(t)$$
The hazard function becomes
$$h(t) = \frac{f(t)}{R(t)} = -\frac{R'(t)}{R(t)} \qquad (11.20)$$
Example 11.3
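A component has a normal distribution of failure times with mean 2000 hours and
standard deviation 100 hours. Find the reliability of the component and the value of the
hazard function at 1900 hours.
Solution
Given $\mu = 2000$, $\sigma = 100$, the reliability at time $t = 1900$ hours is given by
$$R(1900) = P\left(Z > \frac{1900 - 2000}{100}\right) = P(Z > -1.0) = 0.8413$$
The value of the hazard function is given by
$$h(1900) = \frac{f(1900)}{R(1900)} = \frac{\phi(-1.0)}{\sigma\,R(1900)} = \frac{0.2420}{100 \times 0.8413} = 0.0029 \text{ failures per hour}$$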
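Both quantities in this example can be evaluated without normal tables, using the hazard definition $h(t) = f(t)/R(t)$ and the complementary error function from the standard library. This is a minimal Python sketch with illustrative function names.

```python
import math


def normal_pdf(t, mu, sigma):
    """Density of a normal failure time at t."""
    z = (t - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_reliability(t, mu, sigma):
    """R(t) = P(T > t) = 1 - Phi((t - mu) / sigma), written with erfc."""
    z = (t - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def normal_hazard(t, mu, sigma):
    """Hazard function h(t) = f(t) / R(t)."""
    return normal_pdf(t, mu, sigma) / normal_reliability(t, mu, sigma)


r_1900 = normal_reliability(1900, 2000, 100)
h_1900 = normal_hazard(1900, 2000, 100)
print(round(r_1900, 4), round(h_1900, 4))
```

For a mean of 2000 hours and a standard deviation of 100 hours this gives a reliability of about 0.8413 and a hazard of about 0.0029 failures per hour at 1900 hours.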
The normal distribution is flexible enough to make it a very useful empirical model. It
can be theoretically derived under assumptions that match many failure mechanisms
resulting from chemical reactions and processes.
Exercises 11
2. A fax machine with constant failure rate will survive for a period of 720 hours
without failure, with probability 0.80.
(a) Determine the failure rate $\lambda$.
(b) Determine the probability that the machine, which is functioning after 600
hours, will still function after 800 hours.
(c) Find the probability that the machine will fail within 900 hours, given that the
machine was functioning at 720 hours.
3. A diesel is known to have an operating life (in hours) that fits the following pdf
$$f(t) = \frac{2a}{(t + b)^2}, \quad t \ge 0$$
The average operating life of the diesel has been estimated to be 8760 hours.
(a) Determine a and b.
(b) Determine the probability that the diesel will not fail during the first 6000
operating hours.
(c) If the manufacturer wants no more than 10% of the diesels returned for
warranty service, how long should the warranty be?
$$h(t) = \frac{t}{t + 1}, \quad t \ge 0$$
Where t is time in years.
(a) Determine the reliability function Rt .
(b) Determine the MTTF of the component.
Bibliography
1. Walpole, R. E., Myers, R. H., Myers, S. L., Ye, K. (2002), "Probability and
Statistics for Engineers and Scientists", 7th edition, Prentice Hall.
2. Miller, I., Miller, M. (1999), "John E. Freund's Mathematical Statistics", 6th
edition, Prentice Hall.
3. http://www.springer.com/978-1-85233-950-0