35
If two sets of variables vary in such a way that the
changes in one set are related to changes in the other,
then these sets are said to be correlated. For example,
there is a relation between income and expenditure,
height and weight, rainfall and production, demand and
price, etc.
(OR)
Correlation measures the degree of relationship between
two variables.
36
The following are the types of correlation.
37
A positive correlation is a relationship between two
variables in which both variables move in the same
direction: one variable increases as the other
increases, or one variable decreases as the other
decreases. Examples of positive correlation are
height and weight, and household income and
expenditure.
38
A negative correlation is a relationship between two
variables in which an increase in one variable is
associated with a decrease in the other. An example of
negative correlation would be price and demand of
goods, unemployment and purchasing power.
39
Simple Correlation
Simple correlation is the relationship between any two variables,
e.g. income and expenditure.
Multiple and Partial Correlation
When three or more variables are studied together, the correlation is called
multiple correlation. Correlation between two variables, where three or more
variables are included and the effect of the remaining variables is removed,
is called partial correlation. E.g. correlation between production of rice and
amount of rainfall after removing the effect of a third variable such as
average daily temperature.
Linear and Non-linear Correlation
When the amount of change in one variable is in a constant ratio to the
change in the other variable, the correlation is linear. When it is not in a
constant ratio, we say that the correlation is non-linear.
40
The correlation coefficient is denoted by $r$ and lies between −1 and +1,
where
𝑟 < 0, Negative correlation
𝑟 > 0, Positive correlation
𝑟 = 0, No relationship between variables
𝑟 = 1, Perfect positive correlation
𝑟 = −1, Perfect Negative correlation
41
42
43
A scatter diagram is the simplest diagrammatic representation of bivariate
data. One variable is represented along the X-axis and the other variable is represented
along the Y-axis. Each pair of values is plotted as a point on the two-dimensional graph. The
diagram of points so obtained is known as a scatter diagram. The direction of flow of
the points shows the type of correlation that exists between the two given variables.
44
When there exists some relationship between two
measurable variables, we compute the degree of
relationship using the correlation coefficient.
Or
Where
45
The correlation coefficient between X and Y is the same as the
correlation coefficient between Y and X (i.e., $r_{xy} = r_{yx}$).
The correlation coefficient is free from the units of
measurement of X and Y.
The correlation coefficient is unaffected by change of scale
and origin.
Thus, if $u_i = (x_i - A)/c$ and $v_i = (y_i - B)/d$ with $c \neq 0$ and $d \neq 0$,
$i = 1, 2, \ldots, n$, then $r_{uv} = r_{xy}$ when $c$ and $d$ have the same sign.
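The properties above can be checked numerically. The sketch below assumes the standard computational form of Pearson's coefficient, $r = \frac{n\sum xy - \sum x \sum y}{\sqrt{(n\sum x^2 - (\sum x)^2)(n\sum y^2 - (\sum y)^2)}}$, since the slide formula itself did not survive extraction. The data and the constants A, B, c, d are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient from raw sums (assumed standard form)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    num = n * sxy - sx * sy
    den = math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return num / den

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_xy = pearson_r(x, y)
r_yx = pearson_r(y, x)

# Change of origin and scale: u = (x - A)/c, v = (y - B)/d with c, d > 0
u = [(xi - 3) / 2 for xi in x]
v = [(yi - 4) / 10 for yi in y]
r_uv = pearson_r(u, v)

print(round(r_xy, 4), round(r_yx, 4), round(r_uv, 4))
```

All three printed values coincide, which illustrates symmetry and invariance under a change of origin and (positive) scale.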
46
Example
47
48
Example
49
50
The coefficient of determination is defined as the square
of the coefficient of correlation. When multiplied by
100, it gives the percentage of the variance in the
dependent variable that is predictable from the
independent variable.
E.g. $r = 0.80$
$r^2 = 0.64$, and $0.64 \times 100 = 64\%$
In other words, 64% of the variation in the dependent
variable is explained by the regression equation.
51
To check the reliability (or significance) of coefficient of
correlation r, probable error is used. The formula for
calculating the probable error is
\[ P.E. = 0.6745 \, \frac{1 - r^2}{\sqrt{n}} \]
Where ‘r’ is the coefficient of correlation and ‘n’ is the
number of pairs of observations.
52
(a) If r is less than P.E, then there is no evidence of
correlation (i.e. the correlation is not significant).
(b) If r is greater than 6 × 𝑃. 𝐸, then there is certain
correlation (i.e. coefficient of correlation is
significant).
(c) If the P.E is comparatively smaller than the
coefficient of correlation then the following rules
hold good:
(i) If r is less than 0.3, correlation is
insignificant i.e. there is not much evidence
of correlation.
(ii) If r is more than 0.3, then there is good
evidence of correlation.
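The rules above can be applied mechanically. The following is a minimal sketch, assuming the usual form $P.E. = 0.6745\,(1 - r^2)/\sqrt{n}$ and comparing the absolute value of $r$ with P.E. (the use of $|r|$ and the label "inconclusive" for the in-between case are assumptions, not taken from the slides). The values of r and n are hypothetical.

```python
import math

def probable_error(r, n):
    """Probable error of r: 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)

def interpret(r, n):
    """Apply rules (a) and (b); anything in between is left as inconclusive."""
    pe = probable_error(r, n)
    if abs(r) < pe:
        return "not significant"
    if abs(r) > 6 * pe:
        return "significant"
    return "inconclusive"

# Hypothetical values, for illustration only
pe = probable_error(0.8, 25)
verdict = interpret(0.8, 25)
print(round(pe, 4), verdict)
```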
53
Example
54
In 1904, Charles Edward Spearman, a British
psychologist, devised a method of finding the
coefficient of correlation by ranks. This measure is
useful in dealing with qualitative characteristics, such as
intelligence, beauty, morality, character, etc., which
cannot be measured quantitatively as Pearson's
coefficient of correlation requires.
Rank correlation is applicable only to individual
observations. The result obtained from this method is only
approximate, because under the ranking method the
original values are not taken into account.
55
The formula for Spearman's rank correlation, which is denoted
by $\rho$ (pronounced "rho"), is
\[ \rho = 1 - \frac{6 \sum d^2}{N(N^2 - 1)} \]
where
$d$ = the difference of two ranks $= R_X - R_Y$ and
$N$ = number of paired observations.
The rank coefficient of correlation lies between −1 and +1.
Symbolically, $-1 \le \rho \le +1$
When working with Spearman's rank correlation, we may find
three types of problem:
(i) When ranks are given
(ii) When ranks are not given
(iii) When some values of a series are the same (tied ranks).
56
CASE#01
Example
57
58
Example
59
Solution:
60
61
Presenter: Ms. Sidra Raees
LECTURE 04
Department of Mathematics, NED University of
Engineering & Technology, Karachi
1
CASE#02
Example
2
Solution:
3
4
CASE#03
Example
Calculate Spearman's rank correlation for the following
data.
X 50 55 65 50 55 60 50 65 70 75
Y 110 110 115 125 140 115 130 120 115 160
5
6
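Formula
\[ \rho = 1 - \frac{6\left[\sum d^2 + \frac{m_1^3 - m_1}{12} + \frac{m_2^3 - m_2}{12} + \cdots + \frac{m_n^3 - m_n}{12}\right]}{N(N^2 - 1)} \]
where $m_i$ is the number of times the $i$-th repeated value occurs. Here
$N = 10$
$m_i = 2, 2, 2, 3, 3$
$\sum d^2 = 134$
so $\rho = 0.155$ (negligible; practically no relation).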
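The figures of this example can be reproduced with a short script. This is a sketch that assumes tied values receive the average of the ranks they occupy and that each group of $m$ tied values contributes a correction of $(m^3 - m)/12$; with those assumptions it returns $\sum d^2 = 134$ and $\rho \approx 0.155$ for the data above. The function names are illustrative.

```python
from collections import Counter

def average_ranks(values):
    """Rank values in ascending order; tied values get the mean of their positions."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def spearman_with_ties(xs, ys):
    """Return (sum of d^2, tie correction, rho) using the tie-corrected formula."""
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    # (m^3 - m)/12 for every group of m tied values in either series
    correction = sum((m ** 3 - m) / 12
                     for series in (xs, ys)
                     for m in Counter(series).values() if m > 1)
    rho = 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))
    return sum_d2, correction, rho

X = [50, 55, 65, 50, 55, 60, 50, 65, 70, 75]
Y = [110, 110, 115, 125, 140, 115, 130, 120, 115, 160]

sum_d2, correction, rho = spearman_with_ties(X, Y)
print(sum_d2, correction, round(rho, 3))
```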
7
CASE#03
Example
Calculate Spearman's rank correlation for the following
data.
8
9
Formula
\[ \rho = 1 - \frac{6\left[\sum d^2 + \frac{m_1^3 - m_1}{12} + \frac{m_2^3 - m_2}{12} + \cdots + \frac{m_n^3 - m_n}{12}\right]}{N(N^2 - 1)} \]
Where
$N = 10$
$m_1 = 2$
10
The rank correlation coefficient measures the degree of
agreement between two rankings, but sometimes the
individuals or objects are ranked by more than two
persons or judges. In that case we have to find a
measure of agreement among the judges (the coefficient
of concordance). This can be calculated by the following
formula:
\[ C = \frac{12S}{m^2(n^3 - n)} \]
where $S = \sum (x - \bar{x})^2$, $\bar{x} = \frac{m(n + 1)}{2}$,
$x$ = sum of the ranks given to an individual,
$m$ = no. of judges,
$n$ = no. of individuals.
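A minimal sketch of the concordance calculation follows. It assumes $x$ is the total of the ranks each individual receives from all judges; the three rankings used here are hypothetical and are not the data of the slide example.

```python
def concordance(rankings):
    """Coefficient of concordance C = 12 S / (m^2 (n^3 - n)).

    rankings: one list of ranks per judge, all over the same n individuals.
    """
    m = len(rankings)            # number of judges
    n = len(rankings[0])         # number of individuals
    totals = [sum(judge[i] for judge in rankings) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((x - mean_total) ** 2 for x in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Hypothetical rankings of 4 individuals by 3 judges
rankings = [
    [1, 2, 3, 4],
    [2, 1, 3, 4],
    [1, 3, 2, 4],
]
c = concordance(rankings)
print(round(c, 4))
```

When every judge gives the same ranking the function returns 1, the case of complete agreement.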
11
Example
The following data give ranking of six persons for their
ability by three judges P, Q and R. Calculate coefficient
of concordance.
12
Here
$m = 3$, $n = 6$, $\bar{x} = 10.5$
\[ C = \frac{12S}{m^2(n^3 - n)} \]
$C = 0.34$
13
Regression analysis, in general sense, means the
estimation or prediction of the unknown value of one
variable from the known value of the other variable. It is
one of the most important statistical tools which is
extensively used in almost all sciences, Natural, Social
and Physical. It is specially used in business and
economics to study the relationship between two or more
variables that are related causally and for the estimate of
demand and supply graphs, cost functions, production
and consumption functions and so on.
14
A line of regression is the line which gives the best
estimate of one variable for any given value of the other.
There are two lines of regression:
(i) Line of Y on X
(ii) Line of X on Y
15
Y on X
The regression line of Y on X is used for estimating Y. If
there is a linear relationship between the two variables X
and Y, the equation $y = a + b_{yx}\,x$ is called the
regression equation of Y on X, where $a$ and $b_{yx}$ are
constants which determine the line:
\[ b_{yx} = \frac{n \sum xy - (\sum x)(\sum y)}{n \sum x^2 - (\sum x)^2} \]
\[ a = \bar{y} - b_{yx}\,\bar{x} \]
$b_{yx}$ is called the regression coefficient of Y on X.
16
X on Y
The regression line of X on Y is used for estimating X. If
there is a linear relationship between the two variables X
and Y, the equation $x = a + b_{xy}\,y$ is called the
regression equation of X on Y, where $a$ and $b_{xy}$ are
constants which determine the line:
\[ b_{xy} = \frac{n \sum xy - (\sum x)(\sum y)}{n \sum y^2 - (\sum y)^2} \]
\[ a = \bar{x} - b_{xy}\,\bar{y} \]
$b_{xy}$ is called the regression coefficient of X on Y.
17
The regression lines of Y on X and X
on Y are also called least squares lines
of regression.
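Both least squares lines can be computed from the same five sums. The sketch below implements the two regression coefficients and intercepts as defined above; the data set and the function name are hypothetical.

```python
def regression_lines(xs, ys):
    """Return ((a, b_yx), (a', b_xy)) for the lines y = a + b_yx x and x = a' + b_xy y."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    b_yx = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    b_xy = (n * sxy - sx * sy) / (n * syy - sy ** 2)
    a_yx = sy / n - b_yx * sx / n      # a = y_bar - b_yx * x_bar
    a_xy = sx / n - b_xy * sy / n      # a' = x_bar - b_xy * y_bar
    return (a_yx, b_yx), (a_xy, b_xy)

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

(a_yx, b_yx), (a_xy, b_xy) = regression_lines(x, y)
print(f"Y on X: y = {a_yx:.2f} + {b_yx:.2f} x")
print(f"X on Y: x = {a_xy:.2f} + {b_xy:.2f} y")
```

The two lines differ unless the correlation is perfect, which is why the line of Y on X is used to estimate Y and the line of X on Y to estimate X.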
18
Example
19
20
21
The following table gives the age of cars of a certain make and
the actual maintenance cost.
22
12
Two further concepts build on probability, namely random
variables and probability distributions. Before defining a probability
distribution, the random variable needs to be explained. It is a
general notion that if an experiment is repeated under identical
conditions, the values of the variable so obtained will be similar.
However, there are situations where these observations vary even
though the experiment is repeated under identical conditions. As a
result, the outcomes of the variable are unpredictable and the
experiment is random.
14
Sample Space HH TH HT TT
X 2 1 1 0
That is;
X(HH) = 2, X(HT) = 1, X(TH) = 1, X(TT) = 0
Therefore the numbers 2, 1, 0 in the above example are
random quantities determined by the outcomes of the
random experiment. Such a numerical quantity, whose
value is determined by the outcome of a random
experiment, is called a random variable. Thus, the
numbers of heads obtained in the experiment of tossing a
coin two times in the above example are the values of
a random variable.
15
A random variable is also called a chance
variable, a Stochastic variable or simply a
variate. We shall denote a random variable by
capital letters X, Y, Z, etc. and the values of the
random variable are denoted by the
corresponding small letters x, y, z, etc.
16
Random variable may be Discrete or Continuous. If the
random variable takes on the integer values (i.e. the values in
whole numbers) such as 0, 1, 2, 3,……., then it is called a
discrete random variable. For example, the number of
defective items in a sample, the number of printing mistakes
in each page of a book, the number of telephone calls
received by an office of a firm, etc. A discrete random
variable may be defined as a random variable whose values
form a finite (or countably infinite) set of numbers.
If the random variable can take any value (i.e. numerical or
fractional) within a given interval, then it is called a
Continuous Random Variable. For example, height of a
person, weight of a baby, temperature at a place, etc.
17
For a discrete random variable X, a table, a graph or a
formula showing all possible values of the random variable X,
i.e. $x_1, x_2, x_3, \ldots, x_n$, with their corresponding probabilities
$P(X = x_1), P(X = x_2), P(X = x_3), \ldots, P(X = x_n)$
is called a discrete probability distribution of the random
variable X. In any probability distribution, the sum of all
probabilities should be equal to unity.
NOTE:
The probability distribution of a discrete random variable X is
also called the probability mass function, or simply the probability
function, of the random variable X.
18
A discrete probability distribution must possess the
following properties:
$f(x) \ge 0$,
$\sum_x f(x) = 1$,
$P(X = x) = f(x)$.
19
Example
Suppose an unbiased coin is tossed 3 times. Find the
probability distribution of the random variable "No. of
Heads" in the following forms:
(a) Tabular form (b) Graphic form
Solution:
20
\[ f(0) = P(X = 0) = P(\text{all tails}) = \frac{1}{8} \]
\[ f(1) = P(X = 1) = P(\text{1 head and 2 tails}) = \frac{3}{8} \]
\[ f(2) = P(X = 2) = P(\text{2 heads and 1 tail}) = \frac{3}{8} \]
\[ f(3) = P(X = 3) = P(\text{3 heads}) = \frac{1}{8} \]
Therefore, the probability distribution (i.e. probability
mass function) in tabular form is
x                | 0   | 1   | 2   | 3
f(x) = P(X = x)  | 1/8 | 3/8 | 3/8 | 1/8
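The same table can be obtained by listing the sample space and counting. This is a small sketch that enumerates all equally likely outcomes of the tosses; the function name is illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def heads_pmf(n_tosses):
    """PMF of the number of heads in n_tosses tosses of an unbiased coin."""
    outcomes = list(product("HT", repeat=n_tosses))   # 2^n equally likely outcomes
    counts = Counter(outcome.count("H") for outcome in outcomes)
    return {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

pmf = heads_pmf(3)
for x, p in pmf.items():
    print(x, p)
```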
21
(b) Graphic Form
22
Example
A shipment of 20 similar laptop computers to a retail
outlet contains 3 that are defective. If a school makes a
random purchase of 2 of these computers, find the
probability distribution for the number of defectives.
Solution:
Let X be a random variable whose values x are the
possible numbers of defective computers purchased by
the school. Then x can only take the numbers 0, 1, and 2.
Now
23
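\[ f(0) = P(X = 0) = \frac{{}^{3}C_{0} \times {}^{17}C_{2}}{{}^{20}C_{2}} = \frac{68}{95} \]
\[ f(1) = P(X = 1) = \frac{{}^{3}C_{1} \times {}^{17}C_{1}}{{}^{20}C_{2}} = \frac{51}{190} \]
\[ f(2) = P(X = 2) = \frac{{}^{3}C_{2} \times {}^{17}C_{0}}{{}^{20}C_{2}} = \frac{3}{190} \]
x     | 0     | 1      | 2
f(x)  | 68/95 | 51/190 | 3/190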
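The counting argument used here can be written once and reused for any "draw without replacement" problem of this kind. A minimal sketch, with illustrative function and parameter names, using exact fractions:

```python
from fractions import Fraction
from math import comb

def count_pmf(total, marked, drawn):
    """PMF of the number of marked items when `drawn` items are taken
    without replacement from `total` items, `marked` of which are marked."""
    denom = comb(total, drawn)
    return {
        x: Fraction(comb(marked, x) * comb(total - marked, drawn - x), denom)
        for x in range(min(marked, drawn) + 1)
    }

laptops = count_pmf(total=20, marked=3, drawn=2)
for x, p in laptops.items():
    print(x, p)
```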
24
Example
A bag contains two white and three black balls. Two balls
are selected at random. Find the probability distribution
for the number of white balls.
Solution:
Let X = No. of white balls, then the possible values of
x = 0, 1, and 2. Now find the probabilities for x = 0, 1, 2
25
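\[ f(0) = P(X = 0) = \frac{{}^{2}C_{0} \times {}^{3}C_{2}}{{}^{5}C_{2}} = \frac{3}{10} \]
\[ f(1) = P(X = 1) = \frac{{}^{2}C_{1} \times {}^{3}C_{1}}{{}^{5}C_{2}} = \frac{6}{10} \]
\[ f(2) = P(X = 2) = \frac{{}^{2}C_{2} \times {}^{3}C_{0}}{{}^{5}C_{2}} = \frac{1}{10} \]
Therefore, the probability distribution for the number of
white balls is
x                | 0    | 1    | 2
f(x) = P(X = x)  | 3/10 | 6/10 | 1/10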
26
The cumulative distribution function $F(x)$ of a discrete
random variable X with probability distribution $f(x)$ is
\[ F(x) = P(X \le x) = \sum_{t \le x} f(t), \quad -\infty < x < \infty \]
27
Example
Find the cumulative distribution of the random variable
X for the following probability distribution:
Solution:
The cumulative distribution function for the coin-tossing distribution is
\[ F(0) = f(0) = \frac{1}{8} \]
\[ F(1) = f(0) + f(1) = \frac{1}{8} + \frac{3}{8} = \frac{4}{8} \]
28
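\[ F(2) = f(0) + f(1) + f(2) = \frac{1}{8} + \frac{3}{8} + \frac{3}{8} = \frac{7}{8} \]
\[ F(3) = f(0) + f(1) + f(2) + f(3) = \frac{1}{8} + \frac{3}{8} + \frac{3}{8} + \frac{1}{8} = 1 \]
Hence,
\[ F(x) = \begin{cases} 0, & \text{for } x < 0 \\ \frac{1}{8}, & \text{for } 0 \le x < 1 \\ \frac{4}{8}, & \text{for } 1 \le x < 2 \\ \frac{7}{8}, & \text{for } 2 \le x < 3 \\ 1, & \text{for } x \ge 3 \end{cases} \]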
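The step-function form of $F(x)$ is a running total of the probabilities. A short sketch using the coin-tossing probabilities 1/8, 3/8, 3/8, 1/8 (the helper name `cdf` is illustrative):

```python
from fractions import Fraction
from itertools import accumulate

xs = [0, 1, 2, 3]
f = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]

# Running totals give F at each tabulated value of x
F = dict(zip(xs, accumulate(f)))

def cdf(x):
    """F(x) for any real x: the running total at the largest tabulated value <= x."""
    eligible = [v for v in xs if v <= x]
    return F[eligible[-1]] if eligible else Fraction(0)

print([str(F[v]) for v in xs])
print(cdf(-1), cdf(1.5), cdf(3))
```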
29
30
Example
A random variable X has the following probability
distribution.
x            | 0 | 1 | 2  | 3  | 4  | 5     | 6      | 7
P(x) = f(x)  | 0 | k | 2k | 2k | 3k | $k^2$ | $2k^2$ | $7k^2 + k$
Find
(a) $k$
(b) $P(X < 6)$
(c) $P(X \ge 6)$
(d) $P(0 < X < 5)$
(e) Distribution function (CDF)
31
Solution:
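(a) $\sum f(x) = 1$
$10k^2 + 9k = 1$
$10k^2 + 9k - 1 = 0$
$10k^2 + 10k - k - 1 = 0$
$(k + 1)(10k - 1) = 0$
$k = -1$ or $k = \frac{1}{10}$
Since a probability cannot be negative, $k = -1$ is rejected and $k = \frac{1}{10}$.
x     | 0 | 1   | 2   | 3   | 4   | 5    | 6    | 7
P(x)  | 0 | 0.1 | 0.2 | 0.2 | 0.3 | 0.01 | 0.02 | 0.17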
32
(b) $P(X < 6) = P(0) + P(1) + P(2) + P(3) + P(4) + P(5)$
$= 0 + 0.1 + 0.2 + 0.2 + 0.3 + 0.01$
$= 0.81$
(OR)
$P(X < 6) = 1 - P(X \ge 6)$
$= 1 - [P(6) + P(7)]$
$= 1 - 0.19$
$= 0.81$
(c) $P(X \ge 6) = P(6) + P(7)$
$= 0.02 + 0.17$
$= 0.19$
33
(d) $P(0 < X < 5) = P(1) + P(2) + P(3) + P(4)$
$= 0.1 + 0.2 + 0.2 + 0.3$
$= 0.8$
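The whole example, including the distribution function asked for in part (e), can be checked with a few lines. This sketch solves $10k^2 + 9k - 1 = 0$ with the quadratic formula, keeps the positive root, and accumulates the probabilities; the variable names are illustrative.

```python
import math
from itertools import accumulate

# Solve 10k^2 + 9k - 1 = 0 and keep the root that gives valid probabilities
a, b, c = 10, 9, -1
disc = math.sqrt(b * b - 4 * a * c)
roots = [(-b - disc) / (2 * a), (-b + disc) / (2 * a)]
k = max(roots)                      # the negative root -1 is rejected

pmf = [0, k, 2 * k, 2 * k, 3 * k, k ** 2, 2 * k ** 2, 7 * k ** 2 + k]
cdf = list(accumulate(pmf))         # F(0), F(1), ..., F(7)

p_less_6 = sum(pmf[:6])             # P(X < 6)
p_ge_6 = sum(pmf[6:])               # P(X >= 6)
p_between = sum(pmf[1:5])           # P(0 < X < 5)

print(k, round(p_less_6, 2), round(p_ge_6, 2), round(p_between, 2))
print([round(v, 2) for v in cdf])
```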
34
Presenter: Ms. Sidra Raees
LECTURE 08
Department of Mathematics, NED University of
Engineering & Technology, Karachi
1
The function $f(x)$ is a probability density function (pdf) for the
continuous random variable X, defined over the set of real numbers, if
\[ f(x) \ge 0, \quad -\infty < x < \infty \]
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1 \]
\[ P(a < X < b) = \int_{a}^{b} f(x)\,dx \]
\[ P(a < X < b) = P(a \le X < b) = P(a < X \le b) = P(a \le X \le b) \]
NOTE:
The probability distribution of a continuous random variable X is also
called the probability density function (pdf), or simply the density function, of the
random variable X.
2
Example
Suppose that the error in the reaction temperature, in ℃,
for a controlled laboratory experiment is a continuous
random variable X having the probability density
function.
\[ f(x) = \begin{cases} \frac{x^2}{3}, & -1 < x < 2 \\ 0, & \text{elsewhere} \end{cases} \]
3
Solution:
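(a) We have
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1 \]
\[ \int_{-1}^{2} \frac{x^2}{3}\,dx = 1 \]
\[ \left[\frac{x^3}{9}\right]_{-1}^{2} = 1 \]
\[ \frac{8}{9} + \frac{1}{9} = 1 \]
\[ 1 = 1 \]
(b) \[ P(0 < X \le 1) = \int_{0}^{1} \frac{x^2}{3}\,dx = \left[\frac{x^3}{9}\right]_{0}^{1} = \frac{1}{9} \]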
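Both integrals can also be checked numerically. The sketch below uses a simple midpoint rule, a stand-in for the exact integration shown above and not a method taken from the slides; the helper name `integrate` and the step count are arbitrary choices.

```python
def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def f(x):
    """Density of the temperature error: x^2 / 3 on (-1, 2), zero elsewhere."""
    return x * x / 3 if -1 < x < 2 else 0.0

total = integrate(f, -1, 2)     # should be close to 1
p = integrate(f, 0, 1)          # should be close to 1/9

print(round(total, 6), round(p, 6))
```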
4
Example
5
Solution:
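(a) \[ \int_{-\infty}^{\infty} f(x)\,dx = 1 \]
\[ \int_{0}^{4} \frac{C}{\sqrt{x}}\,dx = 1 \]
\[ \left[2C x^{1/2}\right]_{0}^{4} = 1 \]
\[ 4C = 1 \]
\[ C = \frac{1}{4} \]
(b) \[ P\left(X < \frac{1}{4}\right) = \int_{0}^{1/4} \frac{1}{4\sqrt{x}}\,dx = \left[\frac{1}{2} x^{1/2}\right]_{0}^{1/4} = \frac{1}{4} \]
(c) \[ P(X > 1) = \int_{1}^{4} \frac{1}{4\sqrt{x}}\,dx = \left[\frac{1}{2} x^{1/2}\right]_{1}^{4} = \frac{1}{2} \]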
6
The cumulative distribution function 𝐹(𝑥) of a
continuous random variable X with density function 𝑓 𝑥
is
\[ F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt, \quad \text{for } -\infty < x < \infty \]
\[ P(a < X < b) = F(b) - F(a) \]
\[ f(x) = \frac{d}{dx} F(x) \]
Note:
Integrate pdf to find cdf and differentiate cdf to find pdf.
7
Example
For the density function of example on slide no. 04 , find
𝐹 𝑥 , and use it to evaluate 𝑃 0 < 𝑋 ≤ 1 .
Solution:
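\[ F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-1}^{x} \frac{t^2}{3}\,dt = \left[\frac{t^3}{9}\right]_{-1}^{x} = \frac{x^3 + 1}{9}, \quad -1 \le x < 2 \]
Now
\[ P(0 < X \le 1) = F(1) - F(0) \]
\[ = \frac{2}{9} - \frac{1}{9} \]
\[ = \frac{1}{9} \]
which agrees with the result obtained by using the density
function.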
8
Example
\[ f(y) = \begin{cases} \frac{5}{8b}, & \frac{2}{5}b \le y \le 2b \\ 0, & \text{elsewhere} \end{cases} \]
9
Solution:
For $\frac{2}{5}b \le y \le 2b$,
\[ F(y) = \int_{2b/5}^{y} \frac{5}{8b}\,dt = \left[\frac{5t}{8b}\right]_{2b/5}^{y} = \frac{5y}{8b} - \frac{1}{4} \]
10
11
An extremely useful concept in problems involving random
variables or distributions is that of expectation. Random
variables can be characterized and dealt with effectively for
practical purposes by consideration of quantities called their
expectation. The concept of mathematical expectation arose
in connection with games of chance. For example, a gambler
might be interested in his average winnings at a game, a
businessman in his average profits on a product, and so on.
The average value of a random phenomenon is also termed as
its Mathematical expectation or expected value. In the
following sections, we will define and study the concept of
mathematical expectation for both discrete and continuous
random variables, which will be used in the following
subsection.
12
A probability distribution gives us an idea about the likely values
of a random variable and the probabilities of the various events
related to the random variable. Even so, it is necessary to
describe a probability distribution using measures of central
tendency, dispersion, symmetry and kurtosis. These are called
descriptive measures or summary measures. As with a frequency
distribution, we have to study the properties of a probability
distribution. This section focuses on how to calculate these
summary measures. These measures can be calculated using
i. Mathematical expectation and variance.
ii. Moments.
13
14
$E(c) = c$, where $c$ is a constant
$E(cX) = cE(X)$, where $c$ is a constant
$E(aX + b) = aE(X) + b$
Variance
The variance of a random variable X is a measure of
the spread or dispersion of the density of X, or simply the
variability in the values of the random variable.
\[ Var(X) = E(X^2) - [E(X)]^2 \]
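The properties above are easy to verify on a small distribution. The sketch below assumes the standard definition $E[g(X)] = \sum_x g(x) f(x)$ for a discrete random variable (the defining slide did not survive extraction) and uses the probabilities 1/8, 3/8, 3/8, 1/8 of the number of heads in three tosses of a fair coin; the constants 2 and 5 are arbitrary.

```python
from fractions import Fraction

# Number of heads in three tosses of a fair coin
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def expect(g, pmf):
    """E[g(X)] = sum of g(x) * f(x) over all x (discrete case)."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expect(lambda x: x, pmf)
variance = expect(lambda x: x * x, pmf) - mean ** 2     # E(X^2) - [E(X)]^2

# Linearity: E(aX + b) = a E(X) + b, checked with a = 2, b = 5
lhs = expect(lambda x: 2 * x + 5, pmf)
rhs = 2 * mean + 5

print(mean, variance, lhs, rhs)
```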
15
Example
Solution:
16
Example
17
Solution:
18
Example
Solution:
19
Example
Solution:
20
21
Example
22
Solution:
First, we know that the salesperson, for the two appointments,
can have 4 possible commission totals: $0, $1000, $1500 and
$2500. We then need to calculate their associated
probabilities. By independence, we obtain
\[ f(\$0) = (1 - 0.7)(1 - 0.4) = 0.18 \]
\[ f(\$2500) = (0.7)(0.4) = 0.28 \]
\[ f(\$1000) = (0.7)(1 - 0.4) = 0.42 \]
\[ f(\$1500) = (1 - 0.7)(0.4) = 0.12 \]
Therefore, the expected commission for the salesperson is
\[ E(X) = (\$0)(0.18) + (\$1000)(0.42) + (\$1500)(0.12) + (\$2500)(0.28) \]
\[ = \$1300 \]
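The calculation can be reproduced by enumerating the four combinations of outcomes. The pairing of probabilities with commissions (0.7 with $1000 and 0.4 with $1500) is read off the solution above, since the problem statement is not reproduced in these notes; the variable names are illustrative.

```python
from itertools import product

# (probability of a sale, commission) for each appointment, as read from the solution
appointments = [(0.7, 1000), (0.4, 1500)]

dist = {}
for outcome in product([True, False], repeat=len(appointments)):
    prob, total = 1.0, 0
    for sold, (p, commission) in zip(outcome, appointments):
        prob *= p if sold else 1 - p        # independence of the appointments
        total += commission if sold else 0
    dist[total] = dist.get(total, 0.0) + prob

expected = sum(total * prob for total, prob in dist.items())
print(sorted(dist.items()), round(expected, 2))
```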
23
24
Example
Solution:
25
Example
Solution:
26
27
Example
Solution:
28
29
Example
Solution:
30
Another approach helpful to find the summary measures
for probability distribution is based on the ‘moments’.
We will discuss two types of moments.
31
32
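\[ \mu_r = E\left[(X - \bar{X})^r\right] = \begin{cases} \sum_x (x - \bar{x})^r f(x), & \text{for a discrete random variable} \\ \int_{-\infty}^{\infty} (x - \bar{x})^r f(x)\,dx, & \text{for a continuous random variable} \end{cases} \]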
Note:
In the calculation of moments about mean we generally use
the relationship between moments about mean and about
origin.
33
1. $\mu_1 = 0$
2. $\mu_2 = \mu_2' - (\mu_1')^2$
3. $\mu_3 = \mu_3' - 3\mu_1'\mu_2' + 2(\mu_1')^3$
4. $\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4$
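These relations can be checked against a direct calculation. The sketch below computes the moments about the origin and the moments about the mean for the distribution of the number of heads in three tosses of a fair coin (probabilities 1/8, 3/8, 3/8, 1/8), then compares the two routes; the function names are illustrative.

```python
from fractions import Fraction

pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def raw_moment(r):
    """r-th moment about the origin."""
    return sum(x ** r * p for x, p in pmf.items())

def central_moment(r):
    """r-th moment about the mean, computed directly from the definition."""
    mean = raw_moment(1)
    return sum((x - mean) ** r * p for x, p in pmf.items())

m1, m2, m3, m4 = (raw_moment(r) for r in range(1, 5))

# Moments about the mean via the relations listed above
mu2 = m2 - m1 ** 2
mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4

print(mu2, mu3, mu4)
```

The third central moment is zero here because the distribution is symmetric about its mean.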
34
Skewness
Kurtosis
35
Example
Solution:
36
37
38
Presenter: Ms. Sidra Raees
LECTURE 09
Department of Mathematics, NED University of
Engineering & Technology, Karachi
1
The probability of the various values of the random variables are obtained in
accordance with the events and the nature of the experiment
2
Note:
3
The following are the two types of Theoretical
distributions:
1. Discrete distribution
2. Continuous distribution
4
Discrete Distribution
In discrete probability distribution we will discuss:
Binomial distribution
Poisson distribution
Hypergeometric distribution
Continuous Distribution
In continuous probability distributions we will discuss the
normal distribution. The normal distribution is the most
important and powerful of all the distributions in statistics.
5
6
A Bernoulli trial can result in a success with probability
𝑝 and a failure with probability 𝑞 = 1 − 𝑝. Then the
probability distribution of the binomial random variable
X, the number of successes in 𝑛 independent trials, is
\[ b(x; n, p) = {}^{n}C_{x}\, p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n \]
7
The result of each trial can be classified into only two
categories, called success and failure.
The probability of success remains constant from one
trial to the next.
The successive trials are independent.
The experiment is repeated a fixed number of times.
Remarks:
The binomial distribution has two parameters, i.e. $n$ and $p$.
The mean and variance of the binomial distribution are
$\mu = np$ and $\sigma^2 = npq$
8
Example
The probability that a certain kind of component will
survive a shock test is 3/4. Find the probability that
exactly 2 of the next 4 components tested survive.
Solution:
Assuming that the tests are independent and 𝑝 = 3/4 for
each of the 4 tests, we obtain
\[ b(x; n, p) = b\left(2; 4, \frac{3}{4}\right) = {}^{4}C_{2} \left(\frac{3}{4}\right)^2 \left(\frac{1}{4}\right)^{4-2} = \frac{27}{128} \]
9
Example
The probability that a patient recovers from a rare blood
disease is 0.4. If 15 people are known to have contracted
this disease, what is the probability that
(a) at least 10 survive, (b) from 3 to 8 survive, and
(c) exactly 5 survive?
Solution:
Let X be the number of people who survive.
(a) $P(X \ge 10) = \sum_{x=10}^{15} b(x; 15, 0.4) = 0.0338$
(b) $P(3 \le X \le 8) = \sum_{x=3}^{8} b(x; 15, 0.4) = 0.8779$
(c) $P(X = 5) = b(5; 15, 0.4) = 0.1859$
10
Example
It is conjectured that an impurity exists in 30% of all
drinking wells in a certain rural community. In order to
gain some insight into the true extent of the problem, it is
determined that some testing is necessary. It is too
expensive to test all of the wells in the area, so 10 are
randomly selected for testing.
(a) Using the binomial distribution, what is the
probability that exactly 3 wells have the impurity,
assuming that the conjecture is correct.
(b) What is the probability that more than 3 wells are
impure?
11
Solution:
(a) We require
\[ b(3; 10, 0.3) = {}^{10}C_{3} (0.3)^3 (0.7)^{10-3} = 0.2668 \]
(b) \[ P(X > 3) = 1 - P(X \le 3) = 1 - \sum_{x=0}^{3} b(x; 10, 0.3) = 1 - 0.6496 = 0.3504 \]
12
Example
A TV channel conducted a poll regarding the construction of
dams in Pakistan: 75% of people supported construction,
15% were against and 10% were undecided. A sample of 10
persons is taken. What is the probability that at least 3 will
support the construction?
Solution:
We have
$p = 0.75$, $q = 0.25$, $n = 10$
\[ P(X \ge 3) = 1 - P(X < 3) \]
\[ = 1 - \sum_{x=0}^{2} b(x; 10, 0.75) \]
\[ = 1 - \left[ {}^{10}C_{0} (0.75)^0 (0.25)^{10} + {}^{10}C_{1} (0.75)^1 (0.25)^9 + {}^{10}C_{2} (0.75)^2 (0.25)^8 \right] \]
\[ \approx 0.9996 \]
13
Example
A and B play a game in which A’s probability of winning
is 2/3. In a series of 8 games, what is the probability that
A will win
(a) Exactly 4 games (b) at least 4 games
(c) 6 or more games (d) from 3 to 6 games.
Solution:
We have
\[ p = \frac{2}{3}, \quad q = \frac{1}{3}, \quad n = 8 \]
14
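(a) \[ P(X = 4) = b\left(4; 8, \frac{2}{3}\right) = {}^{8}C_{4} \left(\frac{2}{3}\right)^4 \left(\frac{1}{3}\right)^{8-4} = 0.1707 \]
(b) \[ P(X \ge 4) = \sum_{x=4}^{8} b\left(x; 8, \frac{2}{3}\right) = 0.9121 \]
(c) \[ P(X \ge 6) = \sum_{x=6}^{8} b\left(x; 8, \frac{2}{3}\right) = 0.4682 \]
(d) \[ P(3 \le X \le 6) = \sum_{x=3}^{6} b\left(x; 8, \frac{2}{3}\right) = 0.7852 \]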
15
Example
If on the average rain falls on 9 days in every thirty days, find
the probability that rain will fall on at least two days of a
given week.
Solution:
Probability of raining on a particular day is given by
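\[ p = \frac{9}{30} = \frac{3}{10} \quad \text{and} \quad q = 1 - p = \frac{7}{10}. \]
There are 7 days in a week, so the probability of rain on at
least 2 days is given by
\[ P(X \ge 2) = 1 - P(X < 2) \]
\[ = 1 - \sum_{x=0}^{1} b\left(x; 7, \frac{3}{10}\right) \]
\[ = 1 - 0.3294 \]
\[ = 0.6706 \]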
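Cumulative binomial probabilities such as the one above are sums of individual terms, so they can be checked without tables. A minimal sketch (the helper names are illustrative) that recomputes the rain example together with the blood-disease and drinking-well examples given earlier:

```python
from math import comb

def binomial_pmf(x, n, p):
    """b(x; n, p) = nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomial_range(lo, hi, n, p):
    """P(lo <= X <= hi) for a binomial random variable."""
    return sum(binomial_pmf(x, n, p) for x in range(lo, hi + 1))

rain = 1 - binomial_range(0, 1, 7, 0.3)          # at least 2 rainy days in a week
recover_10 = binomial_range(10, 15, 15, 0.4)     # at least 10 of 15 survive
recover_3_8 = binomial_range(3, 8, 15, 0.4)      # from 3 to 8 survive
wells = 1 - binomial_range(0, 3, 10, 0.3)        # more than 3 impure wells

print(round(rain, 4), round(recover_10, 4), round(recover_3_8, 4), round(wells, 4))
```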
16
The binomial distribution finds applications in many
scientific fields. An industrial engineer is keenly interested in
the “proportion defective” in an industrial process. Often,
quality control measures and sampling schemes for processes
are based on the binomial distribution. This distribution
applies to any industrial situation where an outcome of a
process is dichotomous and the results of the process are
independent, with the probability of success being constant
from trial to trial. The binomial distribution is also used
extensively for medical and military applications. In both
fields, a success or failure result is important. For example,
“cure” or “no cure” is important in pharmaceutical work, and
“hit” or “miss” is often the interpretation of the result of firing
a guided missile.
17
18
Experiments yielding numerical values of a random variable
X, the number of outcomes occurring during a given time
interval or in a specified region, are called Poisson
experiments. The given time interval may be of any length,
such as a minute, a day, a week, a month, or even a year. For
example, a Poisson experiment can generate observations for
the random variable X representing the number of telephone
calls received per hour by an office, the number of days
school is closed due to snow during the winter, or the number
of games postponed due to rain during a baseball season. The
specified region could be a line segment, an area, a volume,
or perhaps a piece of material. In such instances, X might
represent the number of field mice per acre, the number of
bacteria in a given culture, or the number of typing errors per
page. A Poisson experiment is derived from the Poisson
process.
19
The probability distribution of the Poisson random
variable X, representing the number of outcomes
occurring in a given time interval or specified region
denoted by t, is
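\[ p(x; \lambda t) = \frac{e^{-\lambda t} (\lambda t)^x}{x!}, \quad x = 0, 1, 2, \ldots \]
where $\lambda$ is the average number of outcomes per unit time or region.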
20
Example
During a laboratory experiment, the average number of
radioactive particles passing through a counter in 1
millisecond is 4. What is the probability that 6 particles
enter the counter in a given millisecond?
Solution:
Using the Poisson distribution with 𝑥 = 6 and 𝜆𝑡 = 4, we
have
\[ p(x; \lambda t) = p(6; 4) = \frac{e^{-4}\, 4^6}{6!} = 0.1042 \]
21
Example
Ten is the average number of oil tankers arriving each
day at a certain port. The facilities at the port can handle
at most 15 tankers per day. What is the probability that on
a given day tankers have to be turned away?
Solution:
Let X be the number of tankers arriving each day. Then,
we have
\[ P(X > 15) = 1 - P(X \le 15) = 1 - \sum_{x=0}^{15} p(x; 10) \]
\[ = 1 - 0.9513 \]
\[ = 0.0487 \]
22
Example
Flaws in a certain type of drapery material appear on the
average of one in 150 square feet. If we assume the
Poisson distribution, find the probability of at most one
flaw in 225 square feet.
Solution:
Taking 150 square feet as the unit area, we have $\lambda = 1$
flaw per 150 square feet.
As 225 square feet are $\frac{225}{150} = 1.5$ units of area, $t = 1.5$
and therefore the average number of flaws per 225 square
feet is $\lambda t = 1 \times 1.5 = 1.5$.
23
Assuming the flaws follow a Poisson process, we have
\[ P(X \le 1) = \sum_{x=0}^{1} p(x; \lambda t) = \sum_{x=0}^{1} p(x; 1.5) \]
\[ = 0.2231 + 0.3347 \]
\[ = 0.5578 \]
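The three Poisson examples above can be reproduced with a direct implementation of the formula. This is a minimal sketch in which `mean` stands for $\lambda t$; the function names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, mean):
    """p(x; mean) = e^(-mean) * mean^x / x!, where mean = lambda * t."""
    return exp(-mean) * mean ** x / factorial(x)

def poisson_cdf(x, mean):
    """P(X <= x) for a Poisson random variable."""
    return sum(poisson_pmf(i, mean) for i in range(x + 1))

particles = poisson_pmf(6, 4)          # 6 particles when 4 are expected
tankers = 1 - poisson_cdf(15, 10)      # more than 15 tankers when 10 are expected
flaws = poisson_cdf(1, 1.5)            # at most one flaw in 225 square feet

print(round(particles, 4), round(tankers, 4), round(flaws, 4))
```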
24
Poisson and binomial distributions give approximately
the same results under the following conditions:
Let X be a binomial random variable with probability
distribution $b(x; n, p)$. When $n \to \infty$, $p \to 0$, and
$np \to \mu$ remains constant,
\[ b(x; n, p) \to p(x; \mu) \]
25
Example
In a certain industrial facility, accidents occur infrequently. It
is known that the probability of an accident on any given day
is 0.005 and accidents are independent of each other.
(a) What is the probability that in any given period of 400
days there will be an accident on one day?
(b) What is the probability that there are at most three days
with an accident?
Solution:
Let X be a binomial random variable with 𝑛 = 400 and
𝑝 = 0.005. Thus, 𝑛𝑝 = 2. Using the Poisson approximation,
(a) \[ P(X = 1) = p(x; \mu) = p(1; 2) = \frac{e^{-2}\, 2^1}{1!} = 0.271 \]
(b) \[ P(X \le 3) = \sum_{x=0}^{3} p(x; 2) = \sum_{x=0}^{3} \frac{e^{-2}\, 2^x}{x!} = 0.857 \]
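To see how close the approximation is in this example, the exact binomial probabilities can be computed alongside the Poisson ones. The comparison below is an added illustration and is not part of the original solution; the variable names are arbitrary.

```python
from math import comb, exp, factorial

n, p = 400, 0.005
mu = n * p                                    # Poisson mean, equal to 2

def binom(x):
    """Exact binomial probability b(x; n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson(x):
    """Poisson probability p(x; mu)."""
    return exp(-mu) * mu ** x / factorial(x)

exact_one, approx_one = binom(1), poisson(1)
exact_le3 = sum(binom(x) for x in range(4))
approx_le3 = sum(poisson(x) for x in range(4))

print(f"P(X = 1):  exact {exact_one:.4f}, Poisson {approx_one:.4f}")
print(f"P(X <= 3): exact {exact_le3:.4f}, Poisson {approx_le3:.4f}")
```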
26
Example
In a manufacturing process where glass products are made, defects
or bubbles occur, occasionally rendering the piece undesirable for
marketing. It is known that, on average, 1 in every 1000 of these
items produced has one or more bubbles. What is the probability
that a random sample of 8000 will yield fewer than 7 items
possessing bubbles?
Solution:
This is essentially a binomial experiment with 𝑛 = 8000 and
𝑝 = 0.001. Since 𝑝 is very close to 0 and 𝑛 is quite large, we shall
approximate with the Poisson distribution using
𝜇 = 8000 0.001 = 8
Hence, if X represents the number of bubbles, we have
\[ P(X < 7) = \sum_{x=0}^{6} b(x; 8000, 0.001) \approx \sum_{x=0}^{6} p(x; 8) = 0.3134 \]
27
28
The simplest way to view the distinction between the binomial
distribution and the hypergeometric distribution is to note the way the
sampling is done. The types of applications for the hypergeometric are
very similar to those for the binomial distribution. We are interested in
computing probabilities for the number of observations that fall into a
particular category. But in the case of the binomial distribution,
independence among trials is required. As a result, if that distribution is
applied to, say, sampling from a lot of items (deck of cards, batch of
production items), the sampling must be done with replacement of each
item after it is observed. On the other hand, the hypergeometric
distribution does not require independence and is based on sampling done
without replacement.
Applications for the hypergeometric distribution are found in many areas,
with heavy use in acceptance sampling, electronic testing, and quality
assurance. Obviously, in many of these fields, testing is done at the
expense of the item being tested. That is, the item is destroyed and hence
cannot be replaced in the sample. Thus, sampling without replacement is
necessary.
29
The probability distribution of the hypergeometric
random variable X, the number of successes in a random
sample of size 𝑛 selected from 𝑁 items of which 𝑘 are
labeled success and 𝑁 − 𝑘 labeled failure, is
\[ h(x; N, n, k) = \frac{{}^{k}C_{x} \cdot {}^{N-k}C_{n-x}}{{}^{N}C_{n}}, \quad x = 0, 1, 2, \ldots, n \]
30
The result of each trial can be classified into one of two
categories, say success and failure.
The probability of success changes on each trial.
Successive trials are dependent.
The experiment is repeated a fixed number of times.
Remarks:
The mean and variance of the hypergeometric distribution are
\[ \mu = \frac{nk}{N} \quad \text{and} \quad \sigma^2 = \frac{N - n}{N - 1} \cdot \frac{nk(N - k)}{N^2} \]
31
Like the binomial distribution, the hypergeometric
distribution finds applications in acceptance sampling,
where lots of materials or parts are sampled in order to
determine whether or not the entire lot is accepted.
32
Example
A box of 8 screws contains 5 defective screws. If a random
sample of 3 screws is selected without replacement, what is
the probability that the number of defective screws in the
sample is 2?
Solution:
Let $x = 2$ = no. of defective screws in the sample,
$n = 3$ = size of sample, $k = 5$ = no. of successes (defective screws),
$N - k = 8 - 5 = 3$ = no. of failures.
\[ P(X = 2) = h(2; 8, 3, 5) = \frac{{}^{5}C_{2} \cdot {}^{8-5}C_{3-2}}{{}^{8}C_{3}} = \frac{15}{28} \]
33
Example
Lots of 40 components each are deemed unacceptable if they
contain 3 or more defectives. The procedure for sampling a
lot is to select 5 components at random and to reject the lot if
a defect is found. What is the probability that exactly 1
defective is found in the sample if there are 3 defectives in the
entire lot?
Solution:
Using the hypergeometric distribution with 𝑁 = 40, 𝑛 = 5,
𝑘 = 3, 𝑁 − 𝑘 = 37 𝑎𝑛𝑑 𝑥 = 1, we find the probability of
obtaining 1 defective to be
\[ h(1; 40, 5, 3) = \frac{{}^{3}C_{1} \cdot {}^{40-3}C_{5-1}}{{}^{40}C_{5}} = 0.3011 \]
Once again, this plan is not desirable since it detects a bad lot
(3 defectives) only about 30% of the time.
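The hypergeometric probabilities in the two examples above, and the mean and variance formulas, can be verified numerically. A minimal sketch with an illustrative function name, using the same argument order as $h(x; N, n, k)$:

```python
from math import comb

def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k): x successes in a sample of n from N items, k of them successes."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

screws = hypergeom_pmf(2, 8, 3, 5)      # 15/28
lot = hypergeom_pmf(1, 40, 5, 3)        # about 0.3011

# Mean and variance for the lot example, computed from the distribution
N, n, k = 40, 5, 3
xs = range(min(n, k) + 1)
mean = sum(x * hypergeom_pmf(x, N, n, k) for x in xs)
var = sum(x * x * hypergeom_pmf(x, N, n, k) for x in xs) - mean ** 2

print(round(screws, 4), round(lot, 4), round(mean, 4), round(var, 4))
```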
34
Presenter: Ms. Sidra Raees
LECTURE 10
Department of Mathematics, NED University of
Engineering & Technology, Karachi
1
The most important continuous probability distribution in the entire
field of statistics is the normal distribution. Its graph, called the
normal curve, is the bell-shaped curve, which approximately
describes many phenomena that occur in nature, industry, and
research. For example, physical measurements in areas such as
meteorological experiments, rainfall studies, and measurements of
manufactured parts are often more than adequately explained with
a normal distribution. In addition, errors in scientific measurements
are extremely well approximated by a normal distribution. In 1733,
Abraham DeMoivre developed the mathematical equation of the
normal curve. It provided a basis on which much of the theory of
inductive statistics is founded. The normal distribution is often
referred to as Gaussian distribution, in honor of Karl Friedrich
Gauss (1777-1855), who also derived its equation from a study of
errors in repeated measurements of the same quantity.
2
A continuous random variable X having the bell shaped
distribution is called a normal random variable. The
mathematical equation for the probability distribution of
the normal variable depends on the two parameters 𝜇 and
𝜎, its mean and standard deviation, respectively. Hence,
we denote the values of the density of X by 𝑛 𝑥; 𝜇, 𝜎 .
3
The density of the normal random variable X, with mean
𝜇 and variance 𝜎 2 , is
\[ n(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2\sigma^2}(x - \mu)^2}, \quad -\infty < x < \infty \]
4
The curve of any continuous probability distribution or
density function is constructed so that the area under the
curve bounded by the two ordinates 𝑥 = 𝑥1 and 𝑥 = 𝑥2
equals the probability that the random variable X
assumes a value between 𝑥 = 𝑥1 and 𝑥 = 𝑥2 . Thus, for
the normal curve
\[ P(x_1 < X < x_2) = \int_{x_1}^{x_2} n(x; \mu, \sigma)\,dx = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{x_1}^{x_2} e^{-\frac{1}{2\sigma^2}(x - \mu)^2}\,dx \]
is represented by the area
of the shaded region.
5
To determine the area or probability of an interval of a normal
distribution with mean 𝜇 and standard deviation 𝜎, we first
convert X values into Z values using
$Z = \frac{X - \mu}{\sigma}$,
where Z is called the standard normal variable, with mean zero
and standard deviation one. Then
$P(x_1 < X < x_2) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{x_1}^{x_2} e^{-\frac{1}{2\sigma^2}(x-\mu)^2}\,dx$
$= \frac{1}{\sqrt{2\pi}} \int_{z_1}^{z_2} e^{-\frac{1}{2}z^2}\,dz = \int_{z_1}^{z_2} n(z; 0, 1)\,dz$
$= P(z_1 < Z < z_2)$
$= P(Z < z_2) - P(Z < z_1)$,
where $z_1 = (x_1 - \mu)/\sigma$, $z_2 = (x_2 - \mu)/\sigma$, and Z is seen to be a normal random variable with mean 0
and variance 1.
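The equivalence between integrating the density directly and standardizing first can be checked numerically. The following is a minimal sketch, assuming Python's standard library only; the function names and the values 𝜇 = 40, 𝜎 = 6, 𝑥1 = 42, 𝑥2 = 51 are illustrative choices, and the midpoint rule is just one simple way to approximate the integral.

```python
import math
from statistics import NormalDist

def normal_density(x, mu, sigma):
    """Density n(x; mu, sigma) of a normal random variable."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def area_by_integration(x1, x2, mu, sigma, steps=10_000):
    """Approximate P(x1 < X < x2) with the midpoint rule (illustrative only)."""
    h = (x2 - x1) / steps
    return h * sum(normal_density(x1 + (i + 0.5) * h, mu, sigma) for i in range(steps))

def area_by_standardizing(x1, x2, mu, sigma):
    """P(x1 < X < x2) = P(Z < z2) - P(Z < z1) with Z = (X - mu) / sigma."""
    z1 = (x1 - mu) / sigma
    z2 = (x2 - mu) / sigma
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return standard_normal.cdf(z2) - standard_normal.cdf(z1)

# Illustrative values (assumed for this sketch)
direct = area_by_integration(42, 51, mu=40, sigma=6)
via_z = area_by_standardizing(42, 51, mu=40, sigma=6)
print(direct, via_z)
```

Both routes give the same area up to numerical error. A printed z-table gives a slightly different figure because z is rounded to two decimals before the lookup.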
6
Example
Given a standard normal distribution, find the area under
the curve that lies
(a) to the right of 𝑧 = 1.84, and
(b) between 𝑧 = −1.97 and 𝑧 = 0.86.
Solution:
(a) $P(Z > 1.84) = 1 - P(Z < 1.84)$
$= 1 - 0.9671$
$= 0.0329$
10
(b) $P(-1.97 < Z < 0.86) = P(Z < 0.86) - P(Z < -1.97) = 0.8051 - 0.0244 = 0.7807$
11
Example
Given a standard normal distribution, find the value of k
such that $P(Z > k) = 0.3015$.
Solution:
$P(Z > k) = 0.3015$
$1 - P(Z \le k) = 0.3015$
$P(Z < k) = 1 - 0.3015$
$P(Z < k) = 0.6985$
From the table, $P(Z < 0.52) = 0.6985$.
So, $k = 0.52$
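The reverse table lookup can also be done with an inverse cumulative distribution function. This is a small sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()      # mean 0, standard deviation 1

upper_tail = 0.3015                 # P(Z > k)
lower_area = 1 - upper_tail         # P(Z < k) = 0.6985
k = standard_normal.inv_cdf(lower_area)

print(round(k, 2))
```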
12
Example
Given a normal distribution with 𝜇 = 40 and 𝜎 = 6, find the
probability (or area) that X assumes a value
(a) Below 42 (b) Above 27
(c) Between 42 and 51
Solution:
Here 𝜇 = 40 and 𝜎 = 6, then
(a) $P(\text{below } 42) = P(X < 42)$
$= P\left(Z < \frac{42 - 40}{6}\right)$
$= P(Z < 0.33)$
$= 0.6293$
13
(b) $P(\text{above } 27) = P(X > 27)$
$= 1 - P(X < 27)$
$= 1 - P\left(Z < \frac{27 - 40}{6}\right)$
$= 1 - P(Z < -2.17)$
$= 1 - 0.0150$
$= 0.9850$
14
Example
The burning time of an experimental rocket is a random
variable having the normal distribution with 𝜇 = 4.76
seconds and 𝜎 = 0.04 seconds. What is the probability
that this kind of rocket will burn
(a) Less than 4.66 seconds
(b) More than 4.80 seconds
(c) Anything from 4.70 to 4.82 seconds
Solution:
Here, 𝜇 = 4.76 and 𝜎 = 0.04
16
(a) $P(X < 4.66) = P\left(Z < \frac{4.66 - 4.76}{0.04}\right) = P(Z < -2.50)$
$= 0.0062$
17
Example
A certain type of storage battery lasts, on average, 3.0
years with a standard deviation of 0.5 years. Assuming
that battery life is normally distributed, find the
probability that a given battery will last less than 2.3
years.
Solution:
First construct a diagram, showing the given distribution
of battery lives and the desired area. To find 𝑃(𝑋 < 2.3),
we need to evaluate the area under the normal curve to
the left of 2.3. This is accomplished by finding the area to
the left of the corresponding 𝑧 value.
18
Here 𝜇 = 3.0 and 𝜎 = 0.5,
and we find that $z = \frac{2.3 - 3.0}{0.5} = -1.4$, so $P(X < 2.3) = P(Z < -1.4) = 0.0808$.
19
Example
A certain machine makes electrical resistors having a mean
resistance of 40 ohms and a standard deviation of 2 ohms.
Assuming that the resistance follows a normal distribution
and can be measured to any degree of accuracy, what
percentage of resistors will have a resistance exceeding 43
ohms?
Solution:
A percentage is found by multiplying the relative frequency
by 100%. Since the relative frequency for an interval is equal
to the probability of a value falling in the interval, we must
find the area to the right of 𝑥 = 43 in figure. This can be done
by transforming 𝑥 = 43 to the corresponding 𝑧 value,
obtaining the area to the left of 𝑧 from Table, and then
subtracting this area from 1.
20
We find
$P(X > 43) = P\left(Z > \frac{43 - 40}{2}\right) = P(Z > 1.5)$
$= 1 - P(Z < 1.5)$
$= 1 - 0.9332$
$= 0.0668$
Hence 6.68% of the resistors will have a resistance
exceeding 43 ohms.
21
In any statistical investigation, the interest lies in the assessment of one or
more characteristics relating to the individuals belonging to a group.
When all the individuals present in the study are investigated, it is called
complete enumeration, but in practice it is very difficult to investigate all
the individuals present in the study. So the technique of sampling is used,
in which a part of the individuals is selected for the study and
the assessment is made from the selected group of individuals. For
example:
A housewife tastes a spoonful of whatever she cooks to check whether it
tastes good or not.
A few drops of our blood are tested to check about the presence or
absence of a disease.
A grain merchant takes out a handful of grains to get an idea about the
quality of the whole consignment.
These are typical examples where decision making is done on the basis of
sample information. So sampling is the process of choosing a
representative sample from a given population.
23
Sampling is the procedure or process of selecting a
sample from a population. Sampling is quite often used
in our day-to-day practical life.
24
Population
The group of individuals considered under study is called the
population. The word population here refers not only to people but
to all items that have been chosen for the study. Thus in statistics,
population can be number of bikes manufactured in a day or week
or month, number of cars manufactured in a day or week or month,
number of fans, TVs, chalk pieces, people, students, girls, boys,
any manufacturing products, etc…
Parameter
The statistical constants of the population, such as the mean and variance,
are referred to as population parameters.
Statistic
Any statistical measure computed from a sample is known as a
statistic.
Note:
In practice, the parameter values are not known and their estimates
based on the sample values are generally used.
26
Sampling is said to be with replacement when we draw a unit
from a finite population and return it to the population before
the next unit is drawn. In this case each unit can be drawn
more than once and the probability of drawing of each unit
remains constant throughout the sampling procedure.
Sampling is said to be without replacement, if we do not
return the selected unit to the population and draw the next
unit. In this case each unit can’t be drawn more than once and
the probability of drawing of each unit changes throughout
the sampling procedure.
27
Census or complete enumeration means to get the
information about each and every unit in the population.
28
The important advantages of sampling are listed below:
Sampling method is cheaper to collect information as
compared to census (i.e. complete enumeration).
The data may be collected, classified and analyzed
much more quickly with a sample than with a census
enquiry.
A sample is often used as a check to verify the accuracy
of complete count.
It provides greater accuracy because the volume of
work is reduced in the sample survey.
29
This is the simplest and the easiest method of drawing a
sample from a population. According to this method each and
every unit in the population has an equal chance of being
included in the sample and also each possible sample of the
same size has an equal probability of being chosen.
Suppose there is a population of N units and we want to draw
a sample of n units. Then the possible number of samples in
the case of sampling without replacement will be
${}^{N}C_{n} = \frac{N!}{n!\,(N - n)!}$,
and in the case of sampling with replacement the possible
number of samples will be $N^n$.
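The two counting rules can be checked by brute-force enumeration. This is a minimal sketch, assuming a small illustrative population of five values and samples of size 2; `itertools.combinations` lists unordered samples without replacement, and `itertools.product` lists ordered samples with replacement.

```python
import math
from itertools import combinations, product

population = [0, 2, 4, 6, 8]   # illustrative population, N = 5
N = len(population)
n = 2                          # sample size

# Without replacement: unordered samples, each unit used at most once
samples_without = list(combinations(population, n))

# With replacement: ordered samples, units may repeat
samples_with = list(product(population, repeat=n))

print(len(samples_without), math.comb(N, n))  # both equal N! / (n! (N - n)!)
print(len(samples_with), N ** n)              # both equal N^n
```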
30
Consider all possible samples of size n which can be
drawn from a given population (either with or without
replacement). For each sample we can compute a
statistic, such as the mean or variance, which will vary
from sample to sample. In this way we obtain a
distribution of the statistic, which is called its sampling
distribution. Therefore, there are sampling distributions
of the mean, the variance, etc.
31
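For sampling without replacement from a finite population of size N:
The mean of all sample means is equal to the population
mean: $E(\bar{X}) = \mu$.
$V(\bar{X}) = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$, where $\sigma^2$ is the population variance.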
32
For sampling with replacement:
The mean of all sample means is equal to the population
mean: $E(\bar{X}) = \mu$.
$V(\bar{X}) = \frac{\sigma^2}{n}$, where $\sigma^2$ is the population variance.
33
Example
A population consists of the five numbers 0, 2, 4, 6, 8.
(a) List all possible samples of size 2 that can be drawn from
this population without replacement.
(b) Find the mean of each sample.
(c) Construct the sampling distribution of $\bar{X}$.
(d) Verify that the mean of all sample means is equal to the
population mean.
Solution:
Since N = 5 and n = 2 and the sampling is done without
replacement, the number of all possible samples is
${}^{5}C_{2} = \frac{5!}{2!\,(5 - 2)!} = 10$
34
Sampling Distribution of 𝑋 is
36
The mean of $\bar{X}$ is $E(\bar{X}) = \frac{\sum f\,\bar{x}}{\sum f} = \frac{40}{10} = 4$
and since the
population mean is $\mu = \frac{\sum X}{N}$
$= \frac{0 + 2 + 4 + 6 + 8}{5}$
$= 4$,
it is verified that the mean of all sample means is
equal to the population mean.
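The sampling distribution of the mean for this example can be rebuilt by enumerating every sample. A minimal sketch, assuming the standard library only; the variable names are illustrative.

```python
from collections import Counter
from itertools import combinations
from statistics import mean

population = [0, 2, 4, 6, 8]

# Mean of every possible sample of size 2 drawn without replacement
sample_means = [mean(sample) for sample in combinations(population, 2)]

# Sampling distribution: each distinct sample mean with its frequency f
distribution = Counter(sample_means)
for value in sorted(distribution):
    print(value, distribution[value])

# Mean of the sampling distribution: sum(f * x_bar) / sum(f)
mean_of_means = sum(v * f for v, f in distribution.items()) / sum(distribution.values())
print(mean_of_means, mean(population))
```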
37
Example
A population consists of five numbers 0, 3, 6, 9, 12.
(a) List all possible samples of size 3 that can be drawn
from this population without replacement.
(b) Verify that the mean of $\bar{X}$ is $E(\bar{X}) = \mu$ and
$V(\bar{X}) = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$.
Solution:
Since N = 5 and n = 3 and sampling is done without
replacement, the number of all possible samples is
${}^{5}C_{3} = \frac{5!}{3!\,(5 - 3)!} = 10$
38
0, 3, 6, 9, 12
39
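Mean
$E(\bar{X}) = \mu$
$\frac{\sum \bar{x}}{\text{number of samples}} = \frac{\sum X}{N}$
$\frac{60}{10} = \frac{30}{5}$
$6 = 6$
Variance
$V(\bar{X}) = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$
$\frac{\sum \bar{x}^2}{10} - \left(\frac{\sum \bar{x}}{10}\right)^2 = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$, where the sums run over the 10 possible samples,
$\frac{390}{10} - \left(\frac{60}{10}\right)^2 = \frac{18}{3} \cdot \frac{5 - 3}{5 - 1}$
$3 = 3$
Therefore, it is verified.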
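The same verification can be repeated by enumeration. This is a sketch under the assumption that 𝜎² denotes the population variance (divisor N), which is what `statistics.pvariance` computes; the variable names are illustrative.

```python
from itertools import combinations
from statistics import mean, pvariance

population = [0, 3, 6, 9, 12]
N, n = len(population), 3

# All 10 samples of size 3 without replacement, and their means
sample_means = [mean(sample) for sample in combinations(population, n)]

sigma_squared = pvariance(population)          # population variance
variance_of_means = pvariance(sample_means)    # variance of the 10 sample means

# Formula with the finite population correction (N - n) / (N - 1)
predicted = sigma_squared / n * (N - n) / (N - 1)

print(mean(sample_means), mean(population))
print(variance_of_means, predicted)
```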
40
Example
Draw all possible samples of size 2 from the
population 2, 4, 6, 8 using sampling with
replacement. Find the mean of each sample and verify that
(a) the mean of $\bar{X}$ is $E(\bar{X}) = \mu$
(b) $V(\bar{X}) = \frac{\sigma^2}{2}$
Solution:
Since N = 4 and n = 2, the number of possible samples is
$N^n = 4^2 = 16$
41
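Mean
$E(\bar{X}) = \mu$
$\frac{\sum \bar{x}}{\text{number of samples}} = \frac{\sum X}{N}$
$\frac{80}{16} = \frac{20}{4}$
$5 = 5$
Variance
$V(\bar{X}) = \frac{\sigma^2}{n}$
$\frac{\sum \bar{x}^2}{16} - \left(\frac{\sum \bar{x}}{16}\right)^2 = \frac{\sigma^2}{n}$, where the sums run over the 16 possible samples,
$\frac{440}{16} - \left(\frac{80}{16}\right)^2 = \frac{5}{2}$
$2.5 = 2.5$
Therefore, it is verified.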
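For sampling with replacement the enumeration uses ordered pairs, so each of the 16 samples is equally likely. A short sketch, again assuming 𝜎² is the population variance with divisor N; names are illustrative.

```python
from itertools import product
from statistics import mean, pvariance

population = [2, 4, 6, 8]
n = 2

# All N^n = 16 ordered samples drawn with replacement, and their means
sample_means = [mean(sample) for sample in product(population, repeat=n)]

variance_of_means = pvariance(sample_means)
predicted = pvariance(population) / n      # sigma^2 / n, no correction factor

print(mean(sample_means), mean(population))
print(variance_of_means, predicted)
```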
43
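If $\bar{X}$ is the mean of a random sample of size n taken from
a population with mean 𝜇 and finite variance 𝜎², then the
limiting form of the distribution of
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$,
as $n \to \infty$, is the standard normal distribution $n(z; 0, 1)$.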
44
Example
An electrical firm manufactures light bulbs that have a
length of life that is approximately normally distributed,
with mean equal to 800 hours and a standard deviation of
40 hours. Find the probability that a random sample of 16
bulbs will have an average life less than 775 hours.
Solution:
The sampling distribution of $\bar{X}$ will be approximately
normal, with $\mu_{\bar{X}} = 800$ and $\sigma_{\bar{X}} = 40/\sqrt{16} = 10$. The
desired probability is given by the area of the shaded
region in figure.
45
Corresponding to $\bar{x} = 775$, we find that $z = \frac{775 - 800}{10} = -2.5$, and therefore
$P(\bar{X} < 775) = P(Z < -2.5)$
$= 0.0062$
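The calculation for the light bulb example can be reproduced in a few lines. A minimal sketch using the standard library; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 800, 40, 16

# Standard deviation of the sample mean (standard error)
standard_error = sigma / math.sqrt(n)

# Standardize the sample-mean value of interest
z = (775 - mu) / standard_error

# P(X_bar < 775) = P(Z < z)
probability = NormalDist().cdf(z)
print(standard_error, z, round(probability, 4))
```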
46
Example
Hourly wages of workers in an industry have a mean
wage rate of PRs. 50 per hour and a standard deviation of
PRs. 6. What is the probability that the mean wage of a
random sample of 50 workers will be between PRs. 51
and PRs. 52?
Solution:
The sampling distribution of $\bar{X}$ will be approximately
normal, with $\mu_{\bar{X}} = 50$ and $\sigma_{\bar{X}} = 6/\sqrt{50} = 0.85$. The
desired probability is given by the area of the shaded
region in figure.
47
We find that
$P(51 < \bar{X} < 52) = P\left(\frac{51 - 50}{0.85} < Z < \frac{52 - 50}{0.85}\right) = P(1.18 < Z < 2.35)$
$= P(Z < 2.35) - P(Z < 1.18)$
$= 0.9906 - 0.8810$
$= 0.1096$
48
Presenter: Ms. Sidra Raees
LECTURE 11
Department of Mathematics, NED University of
Engineering & Technology, Karachi
1
One of the main objectives of any statistical investigation
is to draw inferences about a population from the analysis
of samples drawn from that population. Statistical
inference shows us how to estimate a value from the
sample and how to test a claim about that value for the population. This is done
through the two main branches of statistical
inference:
(i) Estimation
(ii) Testing of Hypotheses
2
It is possible to draw valid conclusion about the population
parameters from sampling distribution. Estimation helps in
estimating an unknown population parameter such as
population mean, standard deviation, etc., on the basis of
suitable statistic computed from the samples drawn from
population.
For example, a candidate for public office may wish to
estimate the true proportion of voters favoring him by
obtaining the opinions from a random sample of 100 eligible
voters. The fraction of voters in the sample favoring the
candidate could be used as an estimate of the true proportion
of the population of voters.
3
An estimator stands for the rule or a formula that is used
to estimate a parameter whereas an estimate stands for
the numerical value obtained by substituting the sample
observations in the rule or the formula.
For example, if 2, 4, 6, 8, 10 are the sample observations, then
$\bar{x} = \frac{2 + 4 + 6 + 8 + 10}{5} = 6$.
Here 6 is an estimate, whereas the statistic $\bar{X}$ used as the formula is
called an estimator.
4
To estimate an unknown parameter of the population,
concept of theory of estimation is used. There are two
types of estimation namely,
1. Point estimation
2. Interval estimation
5
When a single value is used as an estimate, the estimate is
called a point estimate of the population parameter. In other
words, estimating a population parameter by a
single number is called point estimation.
For example:
If 55 is the mean mark obtained by a sample of 5 students
randomly drawn from a class of 100 students, and it is taken
to be the mean mark of the entire class, then the single value
55 is a point estimate.
If 50 kg is the average weight of a sample of 10 students
randomly drawn from a class of 100 students, and it is taken
to be the average weight of the entire class, then the single
value 50 is a point estimate.
6
Generally, there are situations where point estimation is not
desirable and we are interested in finding limits within which
the parameter would be expected to lie. This is called interval
estimation.
For example:
If the average height of all college students is estimated to lie between
61” and 65”, then the range of values from 61” to 65” is an
interval estimate.
7
The limits which contain a population parameter with a
given degree of confidence are called the confidence
limits. The interval between these limits is called
confidence interval.
8
Confidence level (1 − α)100%    1 − α    α       α/2      z_{α/2}
90                              0.90     0.10    0.050    1.645
95                              0.95     0.05    0.025    1.960
98                              0.98     0.02    0.010    2.326
99                              0.99     0.01    0.005    2.575
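The tabulated z-values can be recomputed from the inverse of the standard normal distribution function. A small sketch, assuming the standard library; the helper name `z_half_alpha` is illustrative. Note that software returns about 2.576 for the 99% level, while printed tables often quote 2.575, obtained by interpolating between table entries.

```python
from statistics import NormalDist

def z_half_alpha(confidence):
    """z-value leaving an area of alpha/2 to the right, for a given 1 - alpha."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.2f}  {z_half_alpha(level):.3f}")
```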
9
Confidence Interval on 𝝁 (When 𝝈 is known)
If $\bar{x}$ is the mean of a random sample of size n from a
population with known variance 𝜎², a 100(1 − α)%
confidence interval for 𝜇 is given by
$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$,
where $z_{\alpha/2}$ is the z-value leaving an area of $\alpha/2$ to the right.
Note:
For small samples selected from nonnormal populations, we
cannot expect our degree of confidence to be accurate.
However, for samples of size 𝑛 ≥ 30, with the shape of the
distribution not too skewed, sampling theory guarantees good
results.
12
Example
The average zinc concentration recovered from a sample of
measurements taken in 36 different locations in a river is found to be 2.6
grams per milliliter. Find the 99% confidence interval for the mean zinc
concentration in the river. Assume that the population standard deviation
is 0.3 gram per milliliter.
Solution:
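We have $n = 36$, $\bar{x} = 2.6$, $\sigma = 0.3$
$1 - \alpha = 0.99$
$\alpha = 0.01$
$z_{\alpha/2} = z_{0.01/2} = z_{0.005} = 2.575$
Hence, the 99% confidence interval is
$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
$2.6 - (2.575)\frac{0.3}{\sqrt{36}} < \mu < 2.6 + (2.575)\frac{0.3}{\sqrt{36}}$
$2.47 < \mu < 2.73$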
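The interval arithmetic can be wrapped in a small helper. This is a minimal sketch; the function name `z_interval` is an illustrative choice, and the critical value is passed in as read from the table rather than computed.

```python
import math

def z_interval(x_bar, sigma, n, z_value):
    """Confidence interval x_bar -/+ z * sigma / sqrt(n) for a known sigma."""
    margin = z_value * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Zinc concentration example: n = 36, x_bar = 2.6, sigma = 0.3, z_0.005 = 2.575
low, high = z_interval(2.6, 0.3, 36, 2.575)
print(round(low, 2), round(high, 2))
```

The same helper covers the large-sample case with 𝜎 unknown if the sample standard deviation s is passed in place of 𝜎.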
13
Example
The quality control manager of a tyre company has a sample of
one hundred tyres and has found the mean lifetime to be 30214 km.
The population standard deviation is 860 km. Construct a 95%
confidence interval for the mean life of the tyres.
Solution:
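We have $n = 100$, $\bar{x} = 30214$, $\sigma = 860$
$1 - \alpha = 0.95$
$\alpha = 0.05$
$z_{\alpha/2} = z_{0.05/2} = z_{0.025} = 1.96$
Hence, the 95% confidence interval is
$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
$30214 - (1.96)\frac{860}{\sqrt{100}} < \mu < 30214 + (1.96)\frac{860}{\sqrt{100}}$
$30045.44 < \mu < 30382.56$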
14
Confidence Interval on 𝝁 (When 𝝈 is unknown, 𝒏 ≥ 𝟑𝟎)
15
Example
The systolic blood pressure of 90 men has a mean of 128.9 mmHg and
a standard deviation of 17 mmHg. Assuming that these are a random
sample of blood pressures, calculate a 99% confidence interval for the mean blood pressure
in the population.
Solution:
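We have $n = 90$, $\bar{x} = 128.9$, $s = 17$
$1 - \alpha = 0.99$
$\alpha = 0.01$
$z_{\alpha/2} = z_{0.01/2} = z_{0.005} = 2.575$
Hence, the 99% confidence interval is
$\bar{x} - z_{\alpha/2} \frac{s}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \frac{s}{\sqrt{n}}$
$128.9 - (2.575)\frac{17}{\sqrt{90}} < \mu < 128.9 + (2.575)\frac{17}{\sqrt{90}}$
$124.29 < \mu < 133.51$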
16
Example
Scholastic Aptitude Test (SAT) mathematics scores of a random
sample of 500 high school seniors in the state of Texas are
collected, and the sample mean and standard deviation are found to
be 501 and 112, respectively. Find a 99% confidence interval on the
mean SAT mathematics score for seniors in the state of Texas.
Solution:
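We have $n = 500$, $\bar{x} = 501$, $s = 112$
$1 - \alpha = 0.99$
$\alpha = 0.01$
$z_{\alpha/2} = z_{0.01/2} = z_{0.005} = 2.575$
Hence, the 99% confidence interval is
$\bar{x} - z_{\alpha/2} \frac{s}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \frac{s}{\sqrt{n}}$
$501 - (2.575)\frac{112}{\sqrt{500}} < \mu < 501 + (2.575)\frac{112}{\sqrt{500}}$
$488.1 < \mu < 513.9$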
17
Confidence Interval on 𝝁 (When 𝝈 is unknown, 𝒏 < 𝟑𝟎)
18
Example
An electrical firm manufactures light bulbs that have a length of life
with mean 𝜇 and a standard deviation of 40 hours. A sample of
29 bulbs has an average life of 780 hours. Find a 95% confidence
interval for the population mean of all bulbs produced by this firm.
Solution:
We have $n = 29$, $\bar{x} = 780$, $s = 40$
$1 - \alpha = 0.95$
$\alpha = 0.05$
$t_{\alpha/2,\,n-1} = t_{0.05/2,\,29-1} = t_{0.025,\,28} = 2.048$
Hence, the 95% confidence interval is
$\bar{x} - t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$
$780 - (2.048)\frac{40}{\sqrt{29}} < \mu < 780 + (2.048)\frac{40}{\sqrt{29}}$
$764.79 < \mu < 795.21$
20
Example
The contents of seven similar containers of sulfuric acid are 9.8, 10.2,
10.4, 9.9, 10.0, 10.2, and 9.6 liters. Find a 95% confidence interval for the
mean contents of all such containers, assuming an approximately normal
distribution.
Solution:
The sample mean and standard deviation for the given data are
$\bar{x} = 10.0$ and $s = 0.283$
$1 - \alpha = 0.95$
$\alpha = 0.05$
$t_{\alpha/2,\,n-1} = t_{0.05/2,\,7-1} = t_{0.025,\,6} = 2.447$
Hence, the 95% confidence interval is
$\bar{x} - t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$
$10.0 - (2.447)\frac{0.283}{\sqrt{7}} < \mu < 10.0 + (2.447)\frac{0.283}{\sqrt{7}}$
$9.74 < \mu < 10.26$
21
If we have two populations with means $\mu_1$ and $\mu_2$ and
variances $\sigma_1^2$ and $\sigma_2^2$, respectively, a point estimator of
the difference between $\mu_1$ and $\mu_2$ is given by the statistic
$\bar{X}_1 - \bar{X}_2$. Therefore, to obtain a point estimate of $\mu_1 - \mu_2$,
we will select two independent random samples, one
from each population, of sizes $n_1$ and $n_2$, and compute
$\bar{x}_1 - \bar{x}_2$, the difference of the sample means. Clearly, we
must consider the sampling distribution of $\bar{X}_1 - \bar{X}_2$.
23
Confidence Interval for 𝝁𝟏 − 𝝁𝟐 (When 𝝈𝟏 and 𝝈𝟐 are known)
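$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$,
where $z_{\alpha/2}$ is the z-value leaving an area of $\alpha/2$ to the right.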
24
Example
25
Solution:
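The point estimate of $\mu_B - \mu_A$ is $\bar{x}_B - \bar{x}_A = 42 - 36 = 6$.
Using
$1 - \alpha = 0.96$
$\alpha = 0.04$
$z_{\alpha/2} = z_{0.04/2} = z_{0.02} = 2.05$
Hence, with substitution in the formula above, the 96%
confidence interval is
$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
$6 - 2.05\sqrt{\frac{36}{50} + \frac{64}{75}} < \mu_B - \mu_A < 6 + 2.05\sqrt{\frac{36}{50} + \frac{64}{75}}$
$3.43 < \mu_B - \mu_A < 8.57$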
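The substitution step can be checked with a short helper. A minimal sketch; the function name and argument names are illustrative, and the critical value 2.05 is taken from the table as in the solution.

```python
import math

def two_sample_z_interval(difference, var_a, n_a, var_b, n_b, z_value):
    """Interval for a difference of means when both population variances are known."""
    margin = z_value * math.sqrt(var_a / n_a + var_b / n_b)
    return difference - margin, difference + margin

# Values substituted in the solution: variances 36 and 64, sample sizes 50 and 75
low, high = two_sample_z_interval(6, 36, 50, 64, 75, 2.05)
print(round(low, 2), round(high, 2))
```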
26
Confidence Interval for 𝝁𝟏 − 𝝁𝟐 (When 𝝈𝟏 = 𝝈𝟐 but unknown)
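$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$,
where the pooled variance is
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$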
27
Example
28
Solution:
29
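$1 - \alpha = 0.90$
$\alpha = 0.10$
$t_{\alpha/2,\,n_1+n_2-2} = t_{0.10/2,\,20} = t_{0.05,\,20} = 1.725$
$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
$4 - (1.725)(4.48)\sqrt{\frac{1}{12} + \frac{1}{10}} < \mu_1 - \mu_2 < 4 + (1.725)(4.48)\sqrt{\frac{1}{12} + \frac{1}{10}}$
$0.69 < \mu_1 - \mu_2 < 7.31$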
30
Confidence Interval for 𝝁𝟏 − 𝝁𝟐 (When 𝝈𝟏 ≠ 𝝈𝟐 but unknown)
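$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$,
where $t_{\alpha/2}$ is the t-value with
$v = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\left(s_1^2/n_1\right)^2/(n_1 - 1) + \left(s_2^2/n_2\right)^2/(n_2 - 1)}$
degrees of freedom.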
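The degrees-of-freedom formula is easy to get wrong by hand, so a short numerical sketch may help. The function name and the sample values below are hypothetical, chosen only for illustration; rounding the result down to an integer for the table lookup is a common conservative convention assumed here.

```python
import math

def approximate_degrees_of_freedom(s1, n1, s2, n2):
    """Degrees of freedom v for the unequal-variance two-sample t interval."""
    a = s1 ** 2 / n1
    b = s2 ** 2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Hypothetical sample summaries (not taken from the slides)
v = approximate_degrees_of_freedom(s1=3.07, n1=15, s2=0.80, n2=12)
print(v, math.floor(v))
```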
31
Example
32
Solution:
33
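Using
$1 - \alpha = 0.95$
$\alpha = 0.05$
$t_{\alpha/2,\,v} = t_{0.05/2,\,16} = t_{0.025,\,16} = 2.120$
Therefore, the 95% confidence interval is
$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$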
34
Presenter: Ms. Sidra Raees
LECTURE 12
Department of Mathematics, NED University of
Engineering & Technology, Karachi
1
One of the important areas of statistical analysis is testing of
hypotheses. Often, in real-life situations, we are required to take
decisions about the population on the basis of sample information.
Hypothesis testing is also referred to as “Statistical Decision
Making”. It employs statistical techniques to arrive at decisions in
certain situations where there is an element of uncertainty, on the
basis of a sample whose size is fixed in advance. The procedure by which statistics helps
us arrive at the criterion for such a decision is known as testing
of hypotheses.
For Example: We may like to decide on the basis of sample data
whether a new vaccine is effective in curing cold, whether a new
training methodology is better than the existing one, whether the
new fertilizer is more productive than the earlier one and so on.
2
The structure of hypothesis testing will be formulated
with the use of the term null hypothesis, which refers to
any hypothesis we wish to test and is denoted by
𝐻0 . While the hypothesis opposite the null hypothesis is
called the alternative hypothesis, denoted by 𝐻1 .
For Example:
A car battery manufacturing company claims that the
batteries they produce possess an average length of life of
2 years. We can accept or reject their claims on the basis
of a sample by testing the relevant hypothesis.
3
In this example the null hypothesis is that the average
length of life is 2 years.
i.e. 𝐻0 ∶ 𝜇 = 2
The alternative hypothesis may be stated as
𝐻1 ∶ 𝜇 < 2, 𝐻1 ∶ 𝜇 > 2 or 𝐻1 ∶ 𝜇 ≠ 2
4
The null and alternative hypotheses must be established
in such a way that when one is true, the other is false i.e.
𝐻0 and 𝐻1 are opposites or disjoint.
The alternative hypothesis always takes the form of an
inequality. The inequality may be expressed in one of only
three ways:
greater than ( > ), less than ( < ), or not equal to ( ≠ )
Whereas, the null hypothesis is always expressed in some
form of equality such as, less than or equal to (≤),
greater than or equal to (≥), or exactly equal to (=).
5
Hence if 50 is the specified value of the population mean
𝜇 (i.e. parameter), then the possible null and alternative
hypotheses are;
𝐻0 ∶ 𝜇 = 50 𝐻1 ∶ 𝜇 ≠ 50
𝐻0 ∶ 𝜇 ≥ 50 𝐻1 ∶ 𝜇 < 50
𝐻0 ∶ 𝜇 ≤ 50 𝐻1 ∶ 𝜇 > 50
𝐻0 ∶ 𝜇 = 50 𝐻1 ∶ 𝜇 ≠ 50
𝐻0 ∶ 𝜇 = 50 𝐻1 ∶ 𝜇 < 50
𝐻0 ∶ 𝜇 = 50 𝐻1 ∶ 𝜇 > 50
6
The decision to accept or reject the null hypothesis $H_0$ is
made on the basis of the information supplied by the
sample data. Therefore, there is always a chance of making
a wrong decision. There are two types of wrong decision
that can be made. One is the rejection of a true null
hypothesis and the other is the acceptance of a false null
hypothesis. The wrong decision of rejecting a given null
hypothesis when it is really true is called a type-I error,
whereas the wrong decision of accepting a given null
hypothesis when it is really false is called a type-II error.
7
These two types of error may be displayed by the
following table.
               Accept H0                        Reject H0
H0 is True     Correct decision (no error)      Wrong decision (Type-I error)
H0 is False    Wrong decision (Type-II error)   Correct decision (no error)
8
The probability of making a type-I error is also called the
level of significance of the test and it is denoted by 𝛼.
Whenever we test a given null hypothesis, we fix the
value of 𝛼 at the very beginning of the problem.
Generally we take 𝛼 = 0.01, 0.05, or 0.10 (i.e. 1%, 5%
or 10%). The level of significance guards against
rejecting the null hypothesis when it is true.
9
The decision to reject or not to reject the null hypothesis
is based on a statistic, called a test statistic, computed
from sample data. A test statistic is a random variable and
possesses an appropriate probability distribution. Some
common probability distributions used in
testing are the z, t and 𝜒² distributions.
10
The main job of a decision maker is to establish a cut-off
point that can be used to separate the entire sample space
( i.e. all possible values of test statistic ) into two groups
or regions. One group makes up the acceptance region
and the other group the rejection region or critical region
and the cut-off point is called a critical value. In other
words we can say, the critical region is the region where
we reject 𝐻0 .
11
One Tailed and Two Tailed Tests
12
Testing procedure is as under:
1. Null Hypothesis:
$H_0 : \mu = \mu_0$
2. Alternative Hypothesis:
$H_1 : \mu < \mu_0$, $\mu > \mu_0$, or $\mu \ne \mu_0$
3. Level of Significance:
Choose a level of significance equal to α (generally we
take α = 0.01, 0.05, or 0.10).
14
4. Test Statistic:
The test statistic (z or t) is chosen according to the following
rules:
CASE-I
$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$,
when 𝜎 is known.
CASE-II
$z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$,
when 𝜎 is unknown and 𝑛 ≥ 30; then 𝜎 is replaced by s (i.e. the S.D. of the
sample).
CASE-III
$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$,
when 𝜎 is unknown and 𝑛 < 30, with $n - 1$ degrees of freedom.
15
5. Critical Region:
16
6. Rejection Rule & Conclusion:
17
Level of Significance (𝜶) 0.10 0.05 0.01
18
Example
A random sample of 100 recorded deaths in the United
States during the past year showed an average life span
of 71.8 years. Assuming a population standard deviation
of 8.9 years, does this seem to indicate that the mean life
span today is greater than 70 years? Use a 0.05 level of
significance.
Solution:
1. $H_0 : \mu = 70$ years.
2. $H_1 : \mu > 70$ years.
3. $\alpha = 0.05$.
4. Critical Region: $z > z_\alpha = z_{0.05} = 1.645$
19
5. Computations:
$\bar{x} = 71.8$ years, $\sigma = 8.9$ years, $n = 100$, and hence
$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{71.8 - 70}{8.9 / \sqrt{100}} = 2.02$
6. Decision:
Since the calculated value falls in the critical region
($z_{cal} > z_{tab}$), we reject $H_0$ and conclude that
the mean life span today is greater than 70 years.
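The six steps translate directly into a few lines of code. This is a minimal sketch of the right-tailed z-test for this example, assuming the standard library; the variable names are illustrative, and the critical value is computed rather than read from a table.

```python
import math
from statistics import NormalDist

# Steps 1-3: H0: mu = 70, H1: mu > 70, alpha = 0.05
mu_0, alpha = 70, 0.05

# Sample information
x_bar, sigma, n = 71.8, 8.9, 100

# Step 4: right-tailed critical value z_alpha
z_critical = NormalDist().inv_cdf(1 - alpha)

# Step 5: test statistic (Case I, sigma known)
z = (x_bar - mu_0) / (sigma / math.sqrt(n))

# Step 6: reject H0 if z falls in the critical region z > z_alpha
reject_h0 = z > z_critical
print(round(z, 2), round(z_critical, 3), reject_h0)
```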
20
Example
A manufacturer of sports equipment has developed a new
synthetic fishing line that the company claims has a mean
breaking strength of 8 kilograms with a standard deviation of
0.5 kilogram. Test the hypothesis that 𝜇 = 8 kilograms
against the alternative that 𝜇 ≠ 8 kilograms if a random
sample of 50 lines is tested and found to have a mean
breaking strength of 7.8 kilograms. Use a 0.01 level of
significance.
Solution:
1. $H_0 : \mu = 8$ kilograms.
2. $H_1 : \mu \ne 8$ kilograms.
3. $\alpha = 0.01$.
4. Critical Region: $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$, where $z_{\alpha/2} = z_{0.01/2} = z_{0.005} = 2.575$
21
5. Computations:
$\bar{x} = 7.8$ kilograms, $s = 0.5$ kilograms, $n = 50$, and hence
$z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{7.8 - 8}{0.5 / \sqrt{50}} = -2.83$
6. Decision:
Since the calculated value falls in the critical region,
we reject $H_0$ and conclude that the average
breaking strength is not equal to 8 but is, in fact, less
than 8 kilograms.
22
Example
The Edison Electric Institute has published figures on the number
of kilowatt hours used annually by various home appliances. It is
claimed that a vacuum cleaner uses an average of 46 kilowatt hours
per year. If a random sample of 12 homes included in a planned
study indicates that vacuum cleaners use an average of 42 kilowatt
hours per year with a standard deviation of 11.9 kilowatt hours,
does this suggest at the 0.05 level of significance that vacuum
cleaners use, on average, less than 46 kilowatt hours annually?
Assume the population of kilowatt hours to be normal.
Solution:
1. $H_0 : \mu = 46$ kilowatt hours.
2. $H_1 : \mu < 46$ kilowatt hours.
3. $\alpha = 0.05$.
4. Critical Region: $t < -t_{\alpha,\,n-1} = -t_{0.05,\,11} = -1.796$
23
5. Computations:
$\bar{x} = 42$ kilowatt hours, $s = 11.9$ kilowatt hours, and
$n = 12$. Hence
$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{42 - 46}{11.9 / \sqrt{12}} = -1.16$
6. Decision:
Since the calculated value falls in the acceptance region,
we do not reject $H_0$ and conclude that the
average number of kilowatt hours used annually by
home vacuum cleaners is not significantly less than
46.
24
The procedure for testing the difference between two
population means may be written as:
1. Null Hypothesis:
$H_0 : \mu_1 - \mu_2 = d_0$
2. Alternative Hypothesis:
$H_1 : \mu_1 - \mu_2 < d_0$, $\mu_1 - \mu_2 > d_0$, or $\mu_1 - \mu_2 \ne d_0$
3. Level of Significance:
Decide on the significance level α = 0.01, 0.05, or 0.10, etc.
26
4. Test Statistic:
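The test statistic (z or t) is decided according to the following
summarized rules:
CASE-I
$z = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$,
when $\sigma_1$ and $\sigma_2$ are known.
CASE-II
$t = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$,
with degrees of freedom $v = n_1 + n_2 - 2$ and
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$,
when $\sigma_1 = \sigma_2$ but unknown.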
27
CASE-III
$t = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$,
when $\sigma_1 \ne \sigma_2$ and unknown.
28
5. Critical Region:
Critical Regions for Test Statistic z and t
Alternative hypothesis H1    C.R. for test statistic z        C.R. for test statistic t
μ1 − μ2 > d0                 z > z_α                          t > t_α
μ1 − μ2 < d0                 z < −z_α                         t < −t_α
μ1 − μ2 ≠ d0                 z > z_{α/2} or z < −z_{α/2}      t > t_{α/2} or t < −t_{α/2}
6. Conclusion:
If the calculated value of the test statistic (z or t) falls in
critical region, we reject 𝐻0 ; otherwise accept 𝐻0 .
29
Example
A farmer claims that the average yield of wheat of variety A
exceeds the average yield of variety B by at least 12 bushels per
acre. To test this claim, 50 acres of each variety are planted and
grown under similar conditions. Variety A yielded on the average,
86.7 bushels per acre with a population standard deviation of 6.28
bushels per acre, while variety B yielded, on the average 77.8
bushels per acre with a population standard deviation of 5.61
bushels per acre. Test the farmer’s claim at 𝛼 = 0.01.
Solution:
Let 𝜇1 and 𝜇2 represent the population means for the variety A and
variety B, respectively.
1. $H_0 : \mu_1 - \mu_2 \ge 12$.
2. $H_1 : \mu_1 - \mu_2 < 12$.
3. $\alpha = 0.01$.
4. Critical Region: $z < -z_\alpha = -z_{0.01} = -2.33$
30
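5. Computations:
$\bar{x}_1 = 86.7$, $\sigma_1 = 6.28$, $n_1 = 50$,
$\bar{x}_2 = 77.8$, $\sigma_2 = 5.61$, $n_2 = 50$,
$z = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{(86.7 - 77.8) - 12}{\sqrt{\frac{6.28^2}{50} + \frac{5.61^2}{50}}} = -2.60$
6. Decision:
Since the calculated value falls in the critical region,
we reject $H_0$. In other words, the farmer’s
claim cannot be accepted.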
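The test statistic for the farmer's claim can be recomputed from the summary figures. A minimal sketch of the left-tailed two-sample z-test (Case I), assuming the standard library; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# H0: mu1 - mu2 >= 12, H1: mu1 - mu2 < 12, alpha = 0.01
d_0, alpha = 12, 0.01

x_bar_1, sigma_1, n_1 = 86.7, 6.28, 50   # variety A
x_bar_2, sigma_2, n_2 = 77.8, 5.61, 50   # variety B

# Test statistic with known population standard deviations
standard_error = math.sqrt(sigma_1 ** 2 / n_1 + sigma_2 ** 2 / n_2)
z = ((x_bar_1 - x_bar_2) - d_0) / standard_error

# Left-tailed critical value -z_alpha
z_critical = NormalDist().inv_cdf(alpha)

reject_h0 = z < z_critical
print(round(z, 2), round(z_critical, 2), reject_h0)
```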
31
Example
An experiment was performed to compare the abrasive
wear of two different laminated materials. Twelve pieces
of material 1 were tested by exposing each piece to a
machine measuring wear. Ten pieces of material 2 were
similarly tested. In each case, the depth of wear was
observed. The samples of material 1 gave an average
(coded) wear of 85 units with a sample standard
deviation of 4, while the samples of material 2 gave an
average of 81 with a sample standard deviation of 5. Can
we conclude at the 0.05 level of significance that the
abrasive wear of material 1 exceeds that of material 2 by
more than 2 units? Assume the populations to be
approximately normal with equal variances.
32
Solution:
Let 𝜇1 and 𝜇2 represent the population means of the
abrasive wear for material 1 and material 2, respectively.
1. $H_0 : \mu_1 - \mu_2 = 2$.
2. $H_1 : \mu_1 - \mu_2 > 2$.
3. $\alpha = 0.05$.
4. Critical Region:
$t > t_{\alpha,\,n_1+n_2-2} = t_{0.05,\,20} = 1.725$
33
5. Computations:
$\bar{x}_1 = 85$, $s_1 = 4$, $n_1 = 12$,
$\bar{x}_2 = 81$, $s_2 = 5$, $n_2 = 10$,
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
$s_p = \sqrt{\frac{(12 - 1)(16) + (10 - 1)(25)}{12 + 10 - 2}} = 4.478$,
$t = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{(85 - 81) - 2}{4.478 \sqrt{\frac{1}{12} + \frac{1}{10}}} = 1.04$,
6. Decision:
Since the calculated value falls in the acceptance region,
we do not reject $H_0$. We are unable to conclude
that the abrasive wear of material 1 exceeds that of
material 2 by more than 2 units.
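The pooled standard deviation and the t statistic for this example can be recomputed as follows. This is a sketch assuming the standard library only; since it has no t-distribution, the critical value 1.725 is entered as read from the t-table, and the variable names are illustrative.

```python
import math

# H0: mu1 - mu2 = 2, H1: mu1 - mu2 > 2, alpha = 0.05
d_0 = 2
t_critical = 1.725            # t_{0.05, 20}, taken from the t-table

x_bar_1, s_1, n_1 = 85, 4, 12
x_bar_2, s_2, n_2 = 81, 5, 10

# Pooled standard deviation (equal but unknown variances)
pooled_variance = ((n_1 - 1) * s_1 ** 2 + (n_2 - 1) * s_2 ** 2) / (n_1 + n_2 - 2)
s_p = math.sqrt(pooled_variance)

# Test statistic with n1 + n2 - 2 degrees of freedom
t = ((x_bar_1 - x_bar_2) - d_0) / (s_p * math.sqrt(1 / n_1 + 1 / n_2))

reject_h0 = t > t_critical
print(round(s_p, 3), round(t, 2), reject_h0)
```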
34
Example
A manufacturing company is interested in determining
whether there is a significant difference between the average
number of units produced per day by two different machine
operators. A random sample of ten daily outputs was selected
for each operator from the outputs over the past years. The
data on number of items produced per day are summarized in
the table.
Operator - 1       Operator - 2
n1 = 10            n2 = 10
x̄1 = 35            x̄2 = 31
s1² = 17.2         s2² = 19.1
35
Solution:
Let 𝜇1 and 𝜇2 represent the population means of the
operator 1 and operator 2, respectively.
1. $H_0 : \mu_1 = \mu_2$ or $\mu_1 - \mu_2 = 0$.
2. $H_1 : \mu_1 \ne \mu_2$ or $\mu_1 - \mu_2 \ne 0$.
3. $\alpha = 0.01$.
4. Critical Region:
$t < -t_{\alpha/2,\,n_1+n_2-2}$ or $t > t_{\alpha/2,\,n_1+n_2-2}$, where $t_{0.005,\,18} = 2.878$
36
5. Computations:
$\bar{x}_1 = 35$, $s_1^2 = 17.2$, $n_1 = 10$,
$\bar{x}_2 = 31$, $s_2^2 = 19.1$, $n_2 = 10$,
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
$s_p = \sqrt{\frac{(10 - 1)(17.2) + (10 - 1)(19.1)}{10 + 10 - 2}} = 4.26$,
$t = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{(35 - 31) - 0}{4.26 \sqrt{\frac{1}{10} + \frac{1}{10}}} = 2.10$,
6. Decision:
Since the calculated value falls in the acceptance region,
we do not reject $H_0$ and conclude that the samples do
not provide sufficient evidence at $\alpha = 0.01$ that a difference
exists between the mean daily outputs of the machine
operators.
37
Presenter: Ms. Sidra Raees
LECTURE 13
Department of Mathematics, NED University of
Engineering & Technology, Karachi
1
Throughout this chapter, we have been concerned with
the testing of statistical hypotheses about single
population parameters such as μ and 𝜎 2 . Now we shall
consider a test to determine if a population has a
specified theoretical distribution. The test is based on
how good a fit we have between the frequency of
occurrence of observations in an observed sample and the
expected frequencies obtained from the hypothesized
distribution.
2
A goodness of fit test is used to know whether or not a
given set of data follows a specified probability
distribution. For this purpose we use following test
statistic
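$\chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i}$,
where $o_i$ and $e_i$ are the observed and expected frequencies of the i-th category.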
3
If the observed frequencies are close to the
corresponding expected frequencies, the 𝜒 2 value will be
small, indicating a good fit. If the observed frequencies
differ considerably from the expected frequencies, the 𝜒 2
value will be large and the fit is poor. A good fit leads to
the acceptance of 𝐻0 , whereas a poor fit leads to its
rejection. The critical region will, therefore, fall in the
right tail of the chi-square distribution.
For a level of significance 𝛼 with 𝑣 degrees of freedom
we find the critical region value 𝜒𝛼 2 from the table.
The number of degrees of freedom in chi-square
goodness of fit test is equal to the number of categories
minus the number of quantities obtained from the
observed data, which are used in the calculations of the
expected frequencies.
The shape of the $\chi^2$ distribution depends on the degrees of
freedom. As the degrees of freedom increase, the shape becomes
more symmetrical.
[Figure: shapes of the $\chi^2$ distribution for various degrees of freedom]
The testing procedure involves the following steps.
1. $H_0$: The fit is good (the sample data come from the
specified distribution).
2. $H_1$: The fit is not good (the sample data do not come from
the specified distribution).
3. Choose a level of significance equal to $\alpha$.
4. Test Statistic:
$$\chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i},$$
with d.f. $= v = k - 1$, where $k$ is the number of cells
(categories or classes).
5. Critical Region:
The test is always right-tailed. The critical region at level
of significance $\alpha$ with $v$ degrees of freedom is
$$\chi^2 > \chi^2_{\alpha(v)}$$
6. Conclusion:
Reject $H_0$ if the calculated value of $\chi^2$ falls in the
critical region; otherwise accept $H_0$.
(OR)
Reject $H_0$ if the calculated value of $\chi^2$ is greater than
the tabulated value of $\chi^2$ (i.e. if $\chi^2_{cal} > \chi^2_{tab}$); otherwise
accept $H_0$.
Example
A die is tossed 180 times with the following results. Test at
$\alpha = 0.01$ whether the die is fair.
Dots on die (x)    1    2    3    4    5    6
Frequency (o)      28   36   36   30   27   23
Solution:
Note that hypothesizing that the die is honest (balanced, or
fair) is equivalent to testing the hypothesis that the
distribution of outcomes is uniform.
The testing procedure is given below:
Computations are as under:
x    Prob.   o     e = 180 × prob.      $(o - e)^2 / e$
1    1/6     28    180 × 1/6 = 30       0.13
2    1/6     36    180 × 1/6 = 30       1.20
3    1/6     36    180 × 1/6 = 30       1.20
4    1/6     30    180 × 1/6 = 30       0.00
5    1/6     27    180 × 1/6 = 30       0.30
6    1/6     23    180 × 1/6 = 30       1.63
Then $\chi^2 = 4.46$
5. Critical Region:
The critical region at $\alpha = 0.01$ with
d.f. $= v = 6 - 1 = 5$ is $\chi^2 > \chi^2_{0.01(5)} = 15.09$
6. Conclusion:
Since the calculated value $\chi^2 = 4.46$ is less than 15.09, it
falls in the acceptance region. We therefore accept $H_0$ and
conclude that the die is fair.
(OR)
Since $\chi^2_{cal} < \chi^2_{tab}$, we accept $H_0$.
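The die example can be reproduced with a few lines of Python. This is a minimal sketch: the helper name `chi_square_gof` is an illustrative assumption, and the critical value 15.09 is the tabulated figure quoted in the text, not computed here.

```python
def chi_square_gof(observed, probabilities):
    """Return (expected frequencies, chi-square statistic) for a goodness of fit test."""
    n = sum(observed)
    # Expected frequency of each category under H0
    expected = [n * p for p in probabilities]
    # Sum of (o - e)^2 / e over all categories
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, chi2

observed = [28, 36, 36, 30, 27, 23]      # frequencies of faces 1..6 in 180 tosses
expected, chi2 = chi_square_gof(observed, [1 / 6] * 6)

critical_value = 15.09                   # tabulated chi-square, alpha = 0.01, 5 d.f. (from the text)
reject_h0 = chi2 > critical_value

print(f"chi-square = {chi2:.4f}, reject H0: {reject_h0}")
```

The script gives about 4.47 because it sums unrounded terms; the 4.46 in the worked solution comes from adding the terms after rounding each to two decimals. The decision is the same either way.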
Example
An admission committee has submitted a report to the
principal of the college, claiming that among the
freshmen, 20% of the students have shown a preference for
Pre-Medical, 40% for Pre-Engineering, 25% for
Commerce and the rest for Arts. Of the 620
freshmen this year, 98 students declared Pre-Medical,
300 students went for Pre-Engineering, 182 students
preferred Commerce and the rest declared
for Arts. Test at $\alpha = 0.05$ whether these data confirm the
claims of the admission committee.
The testing procedure is given below:
Computations are as under:
Then $\chi^2 = 51.25$
5. Critical Region:
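The critical region at $\alpha = 0.05$ for the right-tailed test
with d.f. $= v = 4 - 1 = 3$ is $\chi^2 > \chi^2_{0.05(3)} = 7.815$
6. Conclusion:
Since the calculated value of $\chi^2$ falls in the critical region,
we reject $H_0$ and conclude that the data do not support the
proportions claimed by the admission committee.
(OR)
Since $\chi^2_{cal} > \chi^2_{tab}$, we reject $H_0$.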
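The per-category computations for the admission example can be rebuilt from the figures in the problem statement. The sketch below assumes, as the wording implies, that "the rest" means 15% of freshmen claimed for Arts and 620 - 98 - 300 - 182 = 40 observed. The dictionary and variable names are illustrative, and 7.815 is the tabulated value quoted in the text.

```python
total = 620  # freshmen this year

# Claimed proportions; Arts takes "the rest"
claimed = {"Pre-Medical": 0.20, "Pre-Engineering": 0.40, "Commerce": 0.25}
claimed["Arts"] = 1.0 - sum(claimed.values())

# Observed counts; Arts again takes "the rest"
observed = {"Pre-Medical": 98, "Pre-Engineering": 300, "Commerce": 182}
observed["Arts"] = total - sum(observed.values())

chi2 = 0.0
for name, p in claimed.items():
    o = observed[name]
    e = total * p                 # expected frequency under H0
    term = (o - e) ** 2 / e       # contribution of this category
    chi2 += term
    print(f"{name:16s} o = {o:3d}  e = {e:6.1f}  (o-e)^2/e = {term:6.2f}")

critical_value = 7.815            # tabulated chi-square, alpha = 0.05, 3 d.f. (from the text)
reject_h0 = chi2 > critical_value
print(f"chi-square = {chi2:.2f}, reject H0: {reject_h0}")
```

Summing unrounded terms gives about 51.26, which matches the 51.25 in the text up to rounding of the individual terms. Most of the statistic comes from the Arts category, where far fewer students were observed than claimed.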
Example
In a survey of 400 infants chosen at random, it is found that
185 are girls. Are boy and girl births equally likely according
to this survey? Use $\alpha = 0.05$.
Solution:
The testing procedure is given below:
1. 𝐻0 : Proportion of girls and boys is same.
[i.e. P(B)=P(G)=1/2]
2. 𝐻1 : Proportion of girls and boys is not same.
3. 𝛼 = 0.05
4. Test Statistic:
$$\chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i},$$
Computations are as under:
Then $\chi^2 = 2.25$
5. Critical Region:
The critical region at $\alpha = 0.05$ for the right-tailed test
with d.f. $= v = 2 - 1 = 1$ is $\chi^2 > \chi^2_{0.05(1)} = 3.84$
6. Conclusion:
Since the calculated value $\chi^2 = 2.25$ is less than 3.84, it
falls in the acceptance region, and we accept $H_0$.
(OR)
Since $\chi^2_{cal} < \chi^2_{tab}$, we accept $H_0$.
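The value of the statistic in this example can be verified by hand. Under $H_0$ each sex has expected frequency 400 × 1/2 = 200, and the observed number of boys is 400 − 185 = 215, so (as a worked check under these figures):

```latex
\chi^2 = \frac{(185 - 200)^2}{200} + \frac{(215 - 200)^2}{200}
       = \frac{225}{200} + \frac{225}{200}
       = 1.125 + 1.125
       = 2.25
```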
A contingency table is defined as a two way table in
which frequencies of various categories of two attributes
(or factors) are classified in rows and columns.
For example, a sample of employed persons may be
classified according to educational attainment and type of
occupation; college students may be classified according
to class status and smoking habits, etc.
A table with r number of rows and c number of columns is called an 𝑟 × 𝑐
contingency table. The general 𝑟 × 𝑐 contingency table for the two attributes
A and B is shown below:
A \ B    B_1     B_2     ...   B_j     ...   B_c     Total
A_1      O_11    O_12    ...   O_1j    ...   O_1c    R_1
A_2      O_21    O_22    ...   O_2j    ...   O_2c    R_2
...      ...     ...           ...           ...     ...
A_i      O_i1    O_i2    ...   O_ij    ...   O_ic    R_i
...      ...     ...           ...           ...     ...
A_r      O_r1    O_r2    ...   O_rj    ...   O_rc    R_r
Total    C_1     C_2     ...   C_j     ...   C_c     G
where $O_{ij}$ = observed frequency of the $i$th row and $j$th column,
$R_i$ = total of the $i$th row, $C_j$ = total of the $j$th column,
G = grand total of all observed frequencies.
A contingency table is usually constructed for the
purpose of studying the relationship between two
attributes. It indicates whether two characteristics are
independent or dependent on one another.
Note:
It is important to note that the “natural” application of
contingency table analysis is to cases in which each
observation is measured by two qualitative variables.
However, quantitative variables may also be used to
classify the observations into rows, columns, or both.
The procedure for testing the independence in a
contingency table is as follows:
4. Test Statistic:
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(o_{ij} - e_{ij})^2}{e_{ij}},$$
Where 𝑜𝑖𝑗 represents observed frequency of ith row and jth
column and 𝑒𝑖𝑗 represents the expected frequency of ith row
and jth column.
The number of degrees of freedom is
𝑑. 𝑓. = (𝑟 − 1)(𝑐 − 1)
Where r and c are the number of rows and the columns in the
contingency table, respectively.
The formula for computing the expected frequency for each
cell (i.e. for the $i$th row and $j$th column) is given as
$$e_{ij} = \frac{R_i C_j}{G}$$
5. Critical Region:
The test is always right-tailed. The critical region at level
of significance $\alpha$ with degrees of freedom $v = (r - 1)(c - 1)$ is
$$\chi^2 > \chi^2_{\alpha(v)}$$
6. Conclusion:
Reject $H_0$ if the calculated value of $\chi^2$ falls in the
critical region; otherwise accept $H_0$.
(OR)
Reject $H_0$ if the calculated value of $\chi^2$ is greater than
the tabulated value of $\chi^2$ (i.e. if $\chi^2_{cal} > \chi^2_{tab}$); otherwise
accept $H_0$.
Example
1600 families were selected at random in a city to test the
belief that high income families usually send their
children to private schools and low income families often
send their children to Government schools. The
following results were obtained:
Income \ School    Private    Govt.    Total
Low                494        506      1000
High               162        438      600
Total              656        944      1600
Solution:
1. 𝐻0 : Income and type of schools are independent
2. 𝐻1 : Income and type of schools are not independent
3. 𝛼 = 0.05.
4. Test Statistic:
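$$\chi^2 = \sum_{i} \sum_{j} \frac{(o_{ij} - e_{ij})^2}{e_{ij}}, \quad \text{where } e_{ij} = \frac{R_i C_j}{G}$$
Since $O_{11} = 494$, $O_{12} = 506$, $O_{21} = 162$, $O_{22} = 438$,
then
$$e_{11} = \frac{R_1 C_1}{G} = \frac{(1000)(656)}{1600} = 410, \quad e_{12} = \frac{R_1 C_2}{G} = \frac{(1000)(944)}{1600} = 590$$
$$e_{21} = \frac{R_2 C_1}{G} = \frac{(600)(656)}{1600} = 246, \quad e_{22} = \frac{R_2 C_2}{G} = \frac{(600)(944)}{1600} = 354$$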
Computations are as under:
o                 e                 $(o - e)^2 / e$
$O_{11} = 494$    $e_{11} = 410$    17.21
$O_{12} = 506$    $e_{12} = 590$    11.96
$O_{21} = 162$    $e_{21} = 246$    28.68
$O_{22} = 438$    $e_{22} = 354$    19.93
Total 1600        1600              77.78
Then $\chi^2 = 77.78$
5. Critical Region:
The critical region at $\alpha = 0.05$ for the right-tailed test
with d.f. $= v = (2 - 1)(2 - 1) = 1$ is
$\chi^2 > \chi^2_{0.05(1)} = 3.84$
6. Conclusion:
Since the calculated value of $\chi^2$ falls in the critical region,
we reject $H_0$ and conclude that income and
type of school are dependent.
(OR)
Since $\chi^2_{cal} > \chi^2_{tab}$, we reject $H_0$.
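The expected frequencies and the statistic for this 2 × 2 table can be reproduced with a small helper. This is a sketch only: the function name `chi_square_independence` is an assumption made for illustration, and the critical value 3.84 is the tabulated figure from the text.

```python
def chi_square_independence(table):
    """Return (expected frequencies, chi-square, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # e_ij = R_i * C_j / G
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return expected, chi2, df

# Rows: low income, high income. Columns: private, government.
table = [[494, 506],
         [162, 438]]
expected, chi2, df = chi_square_independence(table)

critical_value = 3.84   # tabulated chi-square, alpha = 0.05, 1 d.f. (from the text)
print(expected)
print(f"chi-square = {chi2:.2f}, d.f. = {df}, reject H0: {chi2 > critical_value}")
```

The same helper works for any table with more rows or columns, since the row totals, column totals and degrees of freedom are derived from the input.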
Example
The following table shows the relation between the
number of accidents in 1 year and the age of the driver in
a random sample of 500 drivers between 18 and 50. Test
at 𝛼 = 0.01, the hypothesis that the number of accidents
is independent of driver’s age.
No. of        Age of Driver                     Total
Accidents     18 - 25    26 - 40    Over 40
0             75         115        110         300
1             50         65         35          150
2             25         20         5           50
Total         150        200        150         500
Solution:
1. 𝐻0 : There is no association between age of driver and
number of accidents
2. 𝐻1 : There is an association
3. 𝛼 = 0.01.
4. Test Statistic:
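$$\chi^2 = \sum_{i} \sum_{j} \frac{(o_{ij} - e_{ij})^2}{e_{ij}}, \quad \text{where } e_{ij} = \frac{R_i C_j}{G}$$
then
$$e_{11} = \frac{R_1 C_1}{G} = \frac{(300)(150)}{500} = 90, \quad e_{12} = \frac{R_1 C_2}{G} = \frac{(300)(200)}{500} = 120$$
$$e_{13} = \frac{R_1 C_3}{G} = \frac{(300)(150)}{500} = 90, \quad e_{21} = \frac{R_2 C_1}{G} = \frac{(150)(150)}{500} = 45$$
$$e_{22} = \frac{R_2 C_2}{G} = \frac{(150)(200)}{500} = 60, \quad e_{23} = \frac{R_2 C_3}{G} = \frac{(150)(150)}{500} = 45$$
$$e_{31} = \frac{R_3 C_1}{G} = \frac{(50)(150)}{500} = 15, \quad e_{32} = \frac{R_3 C_2}{G} = \frac{(50)(200)}{500} = 20$$
$$e_{33} = \frac{R_3 C_3}{G} = \frac{(50)(150)}{500} = 15$$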
Computations are as under:
o                 e                 $(o - e)^2 / e$
$O_{11} = 75$     $e_{11} = 90$     2.5
$O_{12} = 115$    $e_{12} = 120$    0.2
$O_{13} = 110$    $e_{13} = 90$     4.4
$O_{21} = 50$     $e_{21} = 45$     0.6
$O_{22} = 65$     $e_{22} = 60$     0.4
$O_{23} = 35$     $e_{23} = 45$     2.2
$O_{31} = 25$     $e_{31} = 15$     6.7
$O_{32} = 20$     $e_{32} = 20$     0
$O_{33} = 5$      $e_{33} = 15$     6.7
Total 500         500               23.7
Then $\chi^2 = 23.7$
5. Critical Region:
The critical region at $\alpha = 0.01$ for the right-tailed test
with d.f. $= v = (3 - 1)(3 - 1) = 4$ is
$\chi^2 > \chi^2_{0.01(4)} = 13.28$
6. Conclusion:
Since the calculated value of $\chi^2$ falls in the critical region,
we reject $H_0$ and conclude that there is a
relationship between the number of accidents and the age of the
drivers.
(OR)
Since $\chi^2_{cal} > \chi^2_{tab}$, we reject $H_0$.
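For a larger table it helps to see which cells drive the statistic. The sketch below recomputes the accident example cell by cell; the labels and variable names are illustrative assumptions, and 13.28 is the tabulated value quoted in the text.

```python
row_labels = ["0 accidents", "1 accident", "2 accidents"]
col_labels = ["18-25", "26-40", "Over 40"]
observed = [[75, 115, 110],
            [50, 65, 35],
            [25, 20, 5]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi2 = 0.0
contributions = {}
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total   # e_ij = R_i C_j / G
        term = (o - e) ** 2 / e
        contributions[(row_labels[i], col_labels[j])] = term
        chi2 += term

for cell, term in contributions.items():
    print(f"{cell[0]:12s} {cell[1]:8s} {term:5.2f}")

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 13.28   # tabulated chi-square, alpha = 0.01, 4 d.f. (from the text)
print(f"chi-square = {chi2:.2f}, d.f. = {df}, reject H0: {chi2 > critical_value}")
```

The two largest terms belong to the "2 accidents" row (youngest and oldest age groups), which is where the observed counts depart most from independence.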
Curve fitting is the process of constructing a curve, or
mathematical function, that comes closest to a
series of data points. Through curve fitting we can express
mathematically the functional relationship between the observed
data and the parameter values. It is highly effective in
the mathematical modeling of natural processes.
It is a statistical technique used to derive coefficient values for
equations that express the value of one (dependent) variable as
a function of another (independent) variable.
The main purpose of curve fitting is to theoretically
describe experimental data with a model (function or
equation) and to find the parameters associated with this
model.
1. Equation of a straight line
$y = ax + b$
2. Second degree parabola
$y = a + bx + cx^2$
3. Exponential curve
$y = ab^x$ or $y = ae^{bx}$
The line to be fitted to the data is
$y = ax + b$, where $a$ and $b$ need
to be calculated. The normal equations are
$$\sum y = a\sum x + nb$$
$$\sum xy = a\sum x^2 + b\sum x$$
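Then the error between the actual values $y_i$ and
the fitted values $\hat{y}_i$ is given by
$$e_i = y_i - \hat{y}_i$$
The method of least squares chooses the coefficients that
minimize the sum of squared errors $\sum e_i^2$.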
Example
Fit a straight line by the method of least squares to the
following data:
x    1    2    3    4    5
y    3    4    6    9    10
Solution:
Let the equation of the straight line to be fitted to the
data be $y = ax + b$, where $a$ and $b$ are to be evaluated.
The normal equations for determining $a$ and $b$ are
$$\sum y = a\sum x + nb$$
$$\sum xy = a\sum x^2 + b\sum x$$
x     y     xy     $x^2$
1     3     3      1
2     4     8      4
3     6     18     9
4     9     36     16
5     10    50     25
$\sum x = 15$    $\sum y = 32$    $\sum xy = 115$    $\sum x^2 = 55$
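With these sums the two normal equations can be solved directly. The sketch below does so in closed form; to avoid any ambiguity about which letter denotes which coefficient, it uses the illustrative names `slope` and `intercept`, which are not taken from the text.

```python
x = [1, 2, 3, 4, 5]
y = [3, 4, 6, 9, 10]
n = len(x)

# Column sums, matching the table above
sum_x = sum(x)                                   # 15
sum_y = sum(y)                                   # 32
sum_xy = sum(xi * yi for xi, yi in zip(x, y))    # 115
sum_x2 = sum(xi * xi for xi in x)                # 55

# Closed-form solution of the two normal equations
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
intercept = (sum_y - slope * sum_x) / n

# Errors e_i = y_i - fitted value
residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]

print(f"fitted line: y = {slope:.1f}x + {intercept:.1f}")
print("residuals:", [round(r, 2) for r in residuals])
```

For this data the fitted line is y = 1.9x + 0.7, and the residuals sum to zero, as they must for a least squares line with an intercept term.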
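A second degree parabola $y = a + bx + cx^2$ is fitted using
the normal equations
$$\sum y = na + b\sum x + c\sum x^2$$
$$\sum xy = a\sum x + b\sum x^2 + c\sum x^3$$
$$\sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4$$
Error:
$$e_i = y_i - \hat{y}_i$$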
Example
Fit a second degree parabola to the following data:
x    0    1      2      3      4
y    1    1.8    1.3    2.5    6.3
Solution:
Let the equation of the second degree parabola be
$$y = a + bx + cx^2$$
The normal equations are
$$\sum y = na + b\sum x + c\sum x^2$$
$$\sum xy = a\sum x + b\sum x^2 + c\sum x^3$$
$$\sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4$$
x      y       xy      $x^2$    $x^2 y$    $x^3$    $x^4$
0      1       0       0        0          0        0
1      1.8     1.8     1        1.8        1        1
2      1.3     2.6     4        5.2        8        16
3      2.5     7.5     9        22.5       27       81
4      6.3     25.2    16       100.8      64       256
Sum:
10     12.9    37.1    30       130.3      100      354
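Substituting the column sums into the three normal equations gives a 3 × 3 linear system in $a$, $b$ and $c$. The sketch below solves it with a small Gaussian elimination routine written for this illustration (`solve_linear_system` is an assumed helper name, not something from the text).

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest pivot into place
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    # Back substitution
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (m[i][n] - known) / m[i][i]
    return solution

x = [0, 1, 2, 3, 4]
y = [1, 1.8, 1.3, 2.5, 6.3]
n = len(x)

def sum_pow(p):
    return sum(xi ** p for xi in x)

def sum_pow_y(p):
    return sum((xi ** p) * yi for xi, yi in zip(x, y))

# Normal equations for y = a + b*x + c*x^2
coefficients = [[n,          sum_pow(1), sum_pow(2)],
                [sum_pow(1), sum_pow(2), sum_pow(3)],
                [sum_pow(2), sum_pow(3), sum_pow(4)]]
right_side = [sum(y), sum_pow_y(1), sum_pow_y(2)]

a, b, c = solve_linear_system(coefficients, right_side)
print(f"y = {a:.2f} + ({b:.2f})x + {c:.2f}x^2")
```

With the sums from the table the system is 5a + 10b + 30c = 12.9, 10a + 30b + 100c = 37.1 and 30a + 100b + 354c = 130.3, whose solution is a = 1.42, b = −1.07, c = 0.55.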
Solution:
The given relation is
$$y = ab^x$$
Taking ln on both sides
$$\ln y = \ln a + x \ln b$$
Taking
$Y = \ln y, \quad A = \ln a, \quad B = \ln b$
So,
$\ln y = \ln a + x \ln b$
$Y = A + xB$
The normal equations for $A$ and $B$ are
$$\sum Y = nA + B\sum x$$
$$\sum xY = A\sum x + B\sum x^2$$
x    y     Y = ln y    xY         $x^2$
3    11    2.3978      7.1934     9
4    12    2.4849      9.9396     16
5    14    2.6390      13.195     25
6    18    2.8903      17.3418    36
7    19    2.9444      20.6108    49
8    21    3.0445      24.356     64
9    23    3.1354      28.2186    81
$\sum x = 42$    $\sum y = 118$    $\sum Y = 19.5359$    $\sum xY = 120.8552$    $\sum x^2 = 280$
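Solving the normal equations $7A + 42B = 19.5359$ and
$42A + 280B = 120.8552$ gives $A = 2.01088$ and $B = 0.12999$.
Now,
$A = \ln a, \quad B = \ln b$
$a = e^A = e^{2.01088} = 7.4698$
$b = e^B = e^{0.12999} = 1.1388$
$y = 7.4698(1.1388)^x$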
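The log-linear fit of $y = ab^x$ can be reproduced numerically from the tabulated x and y values. This is a sketch with illustrative variable names; it uses full-precision logarithms rather than the four-decimal values in the table, so the last digits differ slightly from the hand calculation.

```python
import math

x = [3, 4, 5, 6, 7, 8, 9]
y = [11, 12, 14, 18, 19, 21, 23]
n = len(x)

# Linearize: Y = ln y = A + B x, with A = ln a and B = ln b
Y = [math.log(yi) for yi in y]

sum_x = sum(x)
sum_Y = sum(Y)
sum_xY = sum(xi * Yi for xi, Yi in zip(x, Y))
sum_x2 = sum(xi * xi for xi in x)

# Solve the two normal equations for A and B
B = (n * sum_xY - sum_x * sum_Y) / (n * sum_x2 - sum_x ** 2)
A = (sum_Y - B * sum_x) / n

# Transform back to the original parameters
a = math.exp(A)
b = math.exp(B)
print(f"y = {a:.4f} * ({b:.4f})^x")
```

The script gives roughly a = 7.47 and b = 1.139, in agreement with the hand-computed 7.4698 and 1.1388 to within the rounding of the logarithms.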
Example
Determine the constants $a$ and $b$ by the method of least
squares such that $y = ae^{bx}$ fits the following data:
x    2        4         6         8         10
y    4.077    11.084    30.128    81.897    222.62
Solution:
The given relation is
$$y = ae^{bx}$$
Taking ln on both sides
$$\ln y = \ln a + bx$$
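Taking
$Y = \ln y, \quad A = \ln a$
So,
$\ln y = \ln a + bx$
$Y = A + bx$
The normal equations for $A$ and $b$ are
$$\sum Y = nA + b\sum x$$
$$\sum xY = A\sum x + b\sum x^2$$
x     y         Y = ln y    xY         $x^2$
2     4.077     1.4054      2.8108     4
4     11.084    2.4055      9.6220     16
6     30.128    3.4054      20.4324    36
8     81.897    4.4054      35.2432    64
10    222.62    5.4054      54.0540    100
$\sum x = 30$    $\sum Y = 17.0271$    $\sum xY = 122.1624$    $\sum x^2 = 220$
Solving the normal equations $5A + 30b = 17.0271$ and
$30A + 220b = 122.1624$ gives $A = 0.4054$ and $b = 0.5000$.
Now,
$A = \ln a$
$a = e^A = e^{0.4054} = 1.4999 \approx 1.5$
$y = 1.5e^{0.5x}$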
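As a numerical check, the fit of $y = ae^{bx}$ can be run on the data exactly as stated in the example (in particular y = 81.897 at x = 8). The function name `fit_exponential` is an assumption made for this sketch, not a routine from the text.

```python
import math

def fit_exponential(x, y):
    """Least squares fit of y = a * exp(b * x) via the linearization ln y = ln a + b x."""
    n = len(x)
    Y = [math.log(yi) for yi in y]
    sum_x = sum(x)
    sum_Y = sum(Y)
    sum_xY = sum(xi * Yi for xi, Yi in zip(x, Y))
    sum_x2 = sum(xi * xi for xi in x)
    # Solve the two normal equations for A = ln a and b
    b = (n * sum_xY - sum_x * sum_Y) / (n * sum_x2 - sum_x ** 2)
    A = (sum_Y - b * sum_x) / n
    return math.exp(A), b

x = [2, 4, 6, 8, 10]
y = [4.077, 11.084, 30.128, 81.897, 222.62]   # data as given in the example statement

a, b = fit_exponential(x, y)
print(f"y = {a:.4f} * exp({b:.4f} x)")

# Compare fitted values with the data
for xi, yi in zip(x, y):
    print(f"x = {xi:2d}  y = {yi:8.3f}  fitted = {a * math.exp(b * xi):8.3f}")
```

With these inputs the fit comes out very close to a = 1.5 and b = 0.5, and the fitted values reproduce the data almost exactly. Note that the linearized fit minimizes squared errors in ln y rather than in y itself, which is the approach the worked solution takes.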