ECON 316: APPLIED

STATISTICS FOR ECONOMISTS

Lecture 1: Confidence Interval


Instructor: Prof. D. K. Twerefou
Outline
• Confidence interval for difference in means
• Confidence interval for the population variance
and standard deviation
• Confidence interval for the ratio of two variances

3/5/2020 2
WHAT IS A CONFIDENCE INTERVAL (CI)?
• An interval of numbers used to approximate the
true value of a population parameter.

• Associated with any CI is a number that indicates
the faith or confidence we have that the
population parameter lies between the lower and
upper bounds.

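CONFIDENCE INTERVAL FOR THE MEAN
• For the population mean: $P(L \le \mu \le U) = 1 - \alpha$
(1-α)100% CI for the population mean $\mu$:

Large samples
$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$

Small samples
$\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}$
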
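The interval for the mean can be sketched in a few lines of Python. This is only an illustration: the function name `mean_ci` and the numbers in the usage lines are hypothetical, not taken from the lecture. The critical value is passed in, so a $t_{\alpha/2,\,n-1}$ value read from a t table can replace the normal quantile when the sample is small.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(xbar, sd, n, crit):
    """Return (lower, upper) for the mean: xbar -/+ crit * sd / sqrt(n)."""
    half_width = crit * sd / sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical large sample: n = 100, mean 50, standard deviation 10.
alpha = 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2} for a 95% interval
lower, upper = mean_ci(50.0, 10.0, 100, z)
print(f"z = {z:.3f}, 95% CI: ({lower:.2f}, {upper:.2f})")
```
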
CI FOR THE DIFFERENCE BETWEEN TWO
MEANS (Paired Data)
• Suppose we have two independent random samples
with means $\bar{x}_1$ and $\bar{x}_2$ and respective sample sizes $n_1$ and $n_2$
from normal populations with means $\mu_1$ and $\mu_2$ and
variances $\sigma_1^2$ and $\sigma_2^2$.

• Then we can determine the CI on $\mu_1 - \mu_2$, the
difference between the two population means, under
various assumptions about the population variances.

CI FOR THE DIFFERENCE BETWEEN TWO
MEANS contd
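• Case 1: where $\sigma_1^2$ and $\sigma_2^2$ are KNOWN and $n_1$ and $n_2$
are LARGE.
• Then a $100(1-\alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is
given by:

$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
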
CI FOR THE DIFFERENCE BETWEEN TWO
MEANS contd
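• CASE 2: where $\sigma_1^2$ and $\sigma_2^2$ are UNKNOWN but $n_1$ and $n_2$ are
LARGE.
• Then a $100(1-\alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is given
by:

$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
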
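Cases 1 and 2 share the same arithmetic and differ only in whether population or sample variances are plugged in. A minimal Python sketch follows; the function name `diff_means_ci_large` and the data are hypothetical and serve only to illustrate the formula.

```python
from math import sqrt
from statistics import NormalDist

def diff_means_ci_large(xbar1, xbar2, var1, var2, n1, n2, conf=0.95):
    """Large-sample CI for mu1 - mu2.

    var1 and var2 are the population variances (Case 1) or, when those
    are unknown, the sample variances s1^2 and s2^2 (Case 2).
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    std_err = sqrt(var1 / n1 + var2 / n2)
    diff = xbar1 - xbar2
    return diff - z * std_err, diff + z * std_err

# Hypothetical summary statistics, not from the lecture.
lo, hi = diff_means_ci_large(75.0, 70.0, 8.0**2, 6.0**2, 100, 80)
print(f"95% CI for mu1 - mu2: ({lo:.3f}, {hi:.3f})")
```
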
CI FOR THE DIFFERENCE BETWEEN TWO
MEANS contd
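• CASE 3: $\sigma_1^2$ and $\sigma_2^2$ are UNKNOWN and $n_1$ and $n_2$ are SMALL.
A. If the variances are EQUAL, that is $\sigma_1^2 = \sigma_2^2$, then a
$100(1-\alpha)\%$ confidence interval on $\mu_1 - \mu_2$ is given by:

$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
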
CI FOR THE DIFFERENCE BETWEEN TWO
MEANS contd
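• Where $s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$ is the pooled sample variance.

CI FOR THE DIFFERENCE BETWEEN TWO
MEANS contd
B. If no assumption of equality of the variances is
made, then a $100(1-\alpha)\%$ confidence interval on
$\mu_1 - \mu_2$ is given by:

$(\bar{x}_1 - \bar{x}_2) - t^*_{\alpha/2,\,f}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + t^*_{\alpha/2,\,f}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
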
CI FOR THE DIFFERENCE BETWEEN TWO
MEANS
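• where $t^*$ approximately follows a t-distribution with degrees of
freedom f given by:

$f = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1}}$
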
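The degrees-of-freedom formula is easy to mistype by hand, so a short Python sketch may help. The function name `welch_df` and the summary statistics are hypothetical. Since f is rarely an integer, the usual practice with printed tables is to use the nearest tabulated row at or below f.

```python
def welch_df(s1, n1, s2, n2):
    """Approximate degrees of freedom f for t* (variances not assumed equal)."""
    a = s1**2 / n1
    b = s2**2 / n2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))

# Hypothetical summary statistics, not from the lecture.
n1, n2 = 10, 12
f = welch_df(s1=2.0, n1=n1, s2=3.0, n2=n2)
print(f"f = {f:.2f}")

# f always lies between min(n1, n2) - 1 and n1 + n2 - 2.
print(min(n1, n2) - 1 <= f <= n1 + n2 - 2)
```
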
CI FOR THE DIFFERENCE BETWEEN TWO
MEANS- EXAMPLE 1
• Research at the University JSS shows that the first-year JSS class
has 22 students whose mean height is 47.75 inches, while
the second-year class has 25 students whose mean height is
50.40 inches. If the standard deviations of the heights of first-
and second-year students are known to be 1.80 and 2.05
inches respectively, find the 95% confidence interval for the
difference in mean heights $\mu_1 - \mu_2$.
• Interpret your results.

CI FOR THE DIFFERENCE BETWEEN TWO
MEANS -Solution
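• $n_1 = 22$, $n_2 = 25$, $\bar{x}_1 = 47.75$, $\bar{x}_2 = 50.40$, $s_1 = 1.80$, $s_2 = 2.05$
• Note: The sample sizes are small, the sample means and standard
deviations are given, and no assumption is made about the equality of
the variances. Therefore we use the formula:

• $(\bar{x}_1 - \bar{x}_2) - t^*_{\alpha/2,\,f}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + t^*_{\alpha/2,\,f}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
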
CI FOR THE DIFFERENCE BETWEEN TWO
MEANS -solution
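• $f = \frac{\left(\frac{1.80^2}{22} + \frac{2.05^2}{25}\right)^2}{\frac{(1.80^2/22)^2}{21} + \frac{(2.05^2/25)^2}{24}} \approx 45.0$;

$t^*_{0.025,\,45} \approx t_{0.025,\,40} = 2.021$
• By inserting the relevant values into the specified equation, with
$\bar{x}_1 - \bar{x}_2 = 47.75 - 50.40 = -2.65$ and $\sqrt{\frac{1.80^2}{22} + \frac{2.05^2}{25}} = 0.561$:
$-2.65 - t^*_{0.025,\,f}\,(0.561) \le \mu_1 - \mu_2 \le -2.65 + t^*_{0.025,\,f}\,(0.561)$
$-2.65 - 2.021(0.561) \le \mu_1 - \mu_2 \le -2.65 + 2.021(0.561)$
$-3.784 \le \mu_1 - \mu_2 \le -1.516$
Equivalently, $1.516 \le \mu_2 - \mu_1 \le 3.784$.
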
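The arithmetic of Example 1 can be checked with a short Python sketch. The summary statistics and the table value 2.021 come from the example; the variable names are illustrative. Evaluating the degrees-of-freedom formula with $n_1 - 1$ and $n_2 - 1$ gives f of about 45, for which the 40-df table row is the nearest one below. The bounds differ from the hand calculation in the third decimal only because the standard error is not rounded to 0.561 here.

```python
from math import sqrt

# Summary statistics from Example 1.
n1, n2 = 22, 25
xbar1, xbar2 = 47.75, 50.40
s1, s2 = 1.80, 2.05

a, b = s1**2 / n1, s2**2 / n2
f = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))   # degrees of freedom
std_err = sqrt(a + b)

t_crit = 2.021            # t_{0.025, 40}, read from a t table
diff = xbar1 - xbar2      # sample 1 minus sample 2, so the sign is negative
lower = diff - t_crit * std_err
upper = diff + t_crit * std_err
print(f"f = {f:.2f}, standard error = {std_err:.3f}")
print(f"95% CI for mu1 - mu2: ({lower:.3f}, {upper:.3f})")
```
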
CI FOR THE DIFFERENCE BETWEEN TWO
MEANS-INTERPRETATION
• If we were to draw many different pairs of samples and construct the
interval from each pair, about 95% of the intervals would contain the
true difference in the population means. The remaining 5% of the
intervals would not contain it.

CI FOR THE DIFFERENCE BETWEEN TWO
MEANS-EXAMPLE 2
• QUESTION 2: A vending machine designed to dispense coffee into 8
milliliter cups was checked by a technician who sampled 4 cups
before making an adjustment, and 5 cups after making the
adjustment. Assuming that the variances are known to be equal, find
a 90% confidence interval for the mean difference in the amount
dispensed due to the adjustment, if the samples showed the
following amounts of coffee:
Before adjustment: 6.92 7.34 7.26 6.88
After adjustment: 7.33 7.93 7.65 7.49 7.10

CI FOR THE DIFFERENCE BETWEEN TWO
MEANS -SOLUTION
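$n_1 = 4$, $n_2 = 5$, $\bar{x}_1 = 7.10$, $\bar{x}_2 = 7.50$,
$s_1 = 0.2338$, $s_2 = 0.3148$,
$t_{0.05,\,7} = 1.895$;
$s_p = \sqrt{\frac{3(0.2338)^2 + 4(0.3148)^2}{7}} = 0.2830$
• By inserting the values above into the t-interval formula specified:
$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
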
CI FOR THE DIFFERENCE BETWEEN TWO
MEANS -SOLUTION
• $(7.10 - 7.50) \pm 1.895\,(0.2830)\sqrt{\frac{1}{4} + \frac{1}{5}}$
• $-0.40 \pm 0.36$
• Therefore $-0.76 \le \mu_1 - \mu_2 \le -0.04$

• INTERPRETATION:???.

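The pooled-variance calculation for the vending machine data can be reproduced from the raw observations. The sketch below is illustrative; the data and the table value $t_{0.05,\,7} = 1.895$ are those of the example, while the variable names are not from the lecture.

```python
from math import sqrt
from statistics import mean, variance

before = [6.92, 7.34, 7.26, 6.88]
after = [7.33, 7.93, 7.65, 7.49, 7.10]
n1, n2 = len(before), len(after)

# Pooled variance: the two sample variances weighted by their degrees of freedom.
sp2 = ((n1 - 1) * variance(before) + (n2 - 1) * variance(after)) / (n1 + n2 - 2)
sp = sqrt(sp2)

t_crit = 1.895   # t_{0.05, 7} from a t table (90% interval, 7 degrees of freedom)
diff = mean(before) - mean(after)
half_width = t_crit * sp * sqrt(1 / n1 + 1 / n2)
lower, upper = diff - half_width, diff + half_width
print(f"sp = {sp:.4f}")
print(f"90% CI for mu1 - mu2: ({lower:.2f}, {upper:.2f})")
```
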
DEGREE OF FREEDOM
• The degrees of freedom represent the number of observations
in the sample that are free to vary around the mean of the
sample.
• Example 1: Let n = 2 and let a and b be the values. For any given mean
$\bar{x}$, the value of b depends on a and is not free to vary. That is, if
$\bar{x} = 7$ and a = 10, then b must be 4.

• Example 2: If n = 1 then there are no degrees of freedom because
this single number is the mean $\bar{x}$.

DEGREE OF FREEDOM
• EXAMPLE 3: If n = 2, $\bar{x} = 10$ and a = 15, then b = 5.
• If n = 3, then any two values are free to vary, but once the two
are selected, the third is fixed.

• In general, given any mean value $\bar{x}$ and n sample
observations, once n − 1 values are determined the final value
is no longer free to vary.

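The idea that the last observation is forced by the mean and the other n − 1 values can be written as a one-line calculation. The helper name `last_value` and the n = 3 values are hypothetical; the first two calls repeat the examples above.

```python
def last_value(xbar, free_values):
    """Value forced on the final observation once the mean and the
    other n - 1 observations are fixed."""
    n = len(free_values) + 1
    return n * xbar - sum(free_values)

b1 = last_value(7, [10])        # n = 2: a = 10 forces b = 4
b2 = last_value(10, [15])       # n = 2: a = 15 forces b = 5
c = last_value(10, [8, 15])     # n = 3 (hypothetical values): third value is 7
print(b1, b2, c)
```
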
DEGREE OF FREEDOM
• In other words, the degrees of freedom represent the number
of observations in the sample that are free to vary around the
mean of the sample. For example, let n = 2 and let a and b be the
values. For any given mean $\bar{x}$, the value of b depends on a and is not
free to vary. That is, if $\bar{x} = 7$ and a = 10, then b must be 4.

CHI-SQUARE DISTRIBUTION
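• Suppose we have a set of normal and independent variables
$x_1, x_2, \dots, x_n$ and we normalize them, $Z_1 = \frac{x_1 - \bar{x}}{\sigma}$, $Z_2 = \frac{x_2 - \bar{x}}{\sigma}$,
……., $Z_n = \frac{x_n - \bar{x}}{\sigma}$; then the sum of the squares of the
normalized variables has a chi-squared ($\chi^2$) distribution.

• That is, $\chi^2 = \sum Z_i^2 = \sum \left(\frac{x_i - \bar{x}}{\sigma}\right)^2$ has a chi-square distribution with
v = n − 1 degrees of freedom.
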
CHARACTERISTICS OF CHI-SQUARE
DISTRIBUTION
• It is skewed to the right: the right tail is asymptotic to the
horizontal axis, and the domain consists of the non-negative real
numbers.
• The sampling distribution used for inferences about a population
variance $\sigma^2$ from its estimator $s^2$ is described by the chi-square
distribution.
• As the sample size increases the $\chi^2$ distribution becomes more
symmetrical. Thus for $v = n - 1 > 30$ it is approximated by the
normal distribution.

CHARACTERISTICS OF CHI-SQUARE
DISTRIBUTION
• Graphically:
[Figure: chi-square density curves $f(\chi^2)$ for df = 5 and df = 10]

CHARACTERISTICS OF CHI-SQUARE
DISTRIBUTION
• The parameter of the chi-square distribution is called the
degrees of freedom, which is $v = n - 1$.
• Like $z_\alpha$ for the standard normal distribution, the chi-squared
value $\chi^2_{\alpha,\,v}$ is defined as the value for which the area
under the curve to the right is equal to $\alpha$.
• This value depends on the number of degrees of freedom
and must be obtained from a chi-square table.


CHARACTERISTICS OF CHI-SQUARE
DISTRIBUTION
• Graphically:
[Figure: chi-square density curve with the critical values $\chi^2_{1-\alpha/2,\,v}$ and $\chi^2_{\alpha/2,\,v}$ marked on the horizontal axis]

CHARACTERISTICS OF CHI-SQUARE
DISTRIBUTION
• Also, $\chi^2_{\alpha/2,\,v}$ is such that the area under the curve to the right
with v degrees of freedom is $\alpha/2$, and $\chi^2_{1-\alpha/2,\,v}$ means that the
area under the curve to the left with v degrees of freedom is
$\alpha/2$. Because the chi-square distribution is not symmetrical,
the two values must be read from the table separately.

CI FOR A CHI-SQUARE DISTRIBUTION
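• From the definition of chi-squared distribution, we know that:

• $\chi^2 = \sum Z_i^2 = \sum \left(\frac{x_i - \bar{x}}{\sigma}\right)^2 = \frac{\sum (x_i - \bar{x})^2}{\sigma^2} = \frac{(n-1)s^2}{\sigma^2}$ …1
• where $s^2$ is the sample variance and $v = n - 1 > 0$
• Let $P\left(\chi^2_{1-\alpha/2,\,v} \le \chi^2 \le \chi^2_{\alpha/2,\,v}\right) = 1 - \alpha$ … … .2

• Consider the inequality $\chi^2_{1-\alpha/2,\,v} \le \chi^2 \le \chi^2_{\alpha/2,\,v}$ …..3
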
CI FOR A CHI-SQUARE DISTRIBUTION
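• Recall: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1}$
• By substituting equation 1 into equation 3, we obtain:
$\chi^2_{1-\alpha/2,\,v} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2,\,v}$ ……4
• By inverting equation 4:

• $\frac{1}{\chi^2_{1-\alpha/2,\,v}} \ge \frac{\sigma^2}{(n-1)s^2} \ge \frac{1}{\chi^2_{\alpha/2,\,v}}$
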
CI FOR A CHI-SQUARE DISTRIBUTION
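• Multiplying through the inequality by $(n-1)s^2$ we obtain:
$\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,v}} \ge \sigma^2 \ge \frac{(n-1)s^2}{\chi^2_{\alpha/2,\,v}}$

• Reversing the order of the inequality we obtain:

CONFIDENCE INTERVAL FOR A POPULATION VARIANCE
$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,v}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,v}}$

CONFIDENCE INTERVAL FOR A POPULATION STANDARD DEVIATION
$\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,v}}} \le \sigma \le \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,v}}}$
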
CI FOR A CHI-SQUARE DISTRIBUTION
• INTERPRETATION: Using the sample variance (standard deviation),
we are $100(1-\alpha)\%$ confident that the population variance
(standard deviation) lies between the end points of the
confidence interval.

MEAN AND VARIANCE OF CHI-SQUARE
DISTRIBUTION
• If a variable x has a chi-squared distribution with $v$ degrees of freedom, the
expected value of the variable equals the degrees of freedom, $E(x) = v$, and the
variance is $Var(x) = 2v$.

• As the sample size becomes larger the chi-squared distribution is approximated by
the normal distribution, and the standardized variable $Z = \frac{\chi^2 - (n-1)}{\sqrt{2(n-1)}} \sim N(0,1)$

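The mean and variance of the chi-square distribution can be checked by simulation. The sketch below is only an illustration: the sample size n = 6, the seed, the number of repetitions and the function name are arbitrary choices, not part of the lecture. It draws repeated normal samples, forms $\sum (x_i - \bar{x})^2 / \sigma^2$ for each, and compares the simulated mean and variance with $v$ and $2v$.

```python
import random
from statistics import mean, variance

def scaled_sum_of_squares(n, rng, mu=0.0, sigma=1.0):
    """Draw n normal values and return sum((x - xbar)^2) / sigma^2."""
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / sigma**2

rng = random.Random(316)     # fixed seed so the run is repeatable
n = 6                        # hypothetical sample size, so v = n - 1 = 5
draws = [scaled_sum_of_squares(n, rng) for _ in range(20000)]

sim_mean = mean(draws)       # should be close to v = 5
sim_var = variance(draws)    # should be close to 2v = 10
print(f"simulated mean = {sim_mean:.2f}, simulated variance = {sim_var:.2f}")
```
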
CI FOR A CHI-SQUARE DISTRIBUTION
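• QUESTION 1: A machine was tested 9 times, giving a sample standard deviation of
0.15. Construct a 90% confidence interval for the variance.

• SOLUTION: $n = 9$, $s = 0.15$, $1 - \alpha = 0.90$, $v = n - 1 = 9 - 1 = 8$, $\alpha = 0.1$,
$\alpha/2 = 0.05$, $1 - \alpha/2 = 1 - 0.05 = 0.95$, $\chi^2_{0.05,\,8} = 15.51$, $\chi^2_{0.95,\,8} = 2.733$

• By using the CI for pop Var. equation: $\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,v}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,v}}$
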
CI FOR A CHI-SQUARE DISTRIBUTION
• By inserting the relevant values into the equation:
$\frac{(8)(0.15)^2}{15.51} \le \sigma^2 \le \frac{(8)(0.15)^2}{2.733}$, that is, $0.012 \le \sigma^2 \le 0.066$

• INTERPRETATION: If we were to repeat the test of the machine many times,
each set of 9 runs would give a different sample variance and a different
interval. About $(1-\alpha)100\% = 90\%$ of such intervals would contain the
population variance $\sigma^2$, and about $\alpha \cdot 100\% = 10\%$ would not.
For this sample the interval is $0.012 \le \sigma^2 \le 0.066$.

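The variance interval for Question 1 takes one line of arithmetic once the two table values are known. In the sketch below the chi-square values are entered by hand from the table, as in the solution; the function name `variance_ci` and its argument names are hypothetical.

```python
def variance_ci(s, n, chi2_right, chi2_left):
    """CI for sigma^2.

    chi2_right is chi^2_{alpha/2, v} (area alpha/2 to its right) and
    chi2_left is chi^2_{1-alpha/2, v} (area alpha/2 to its left).
    """
    ss = (n - 1) * s**2
    return ss / chi2_right, ss / chi2_left

# Question 1: n = 9, s = 0.15; table values for v = 8 at the 90% level.
lower, upper = variance_ci(s=0.15, n=9, chi2_right=15.51, chi2_left=2.733)
print(f"90% CI for the variance: ({lower:.3f}, {upper:.3f})")
```
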
CI FOR A CHI-SQUARE DISTRIBUTION
• QUESTION 2: The weights of 15 books randomly selected from a
library have a sample standard deviation of 0.011. Construct a
95% confidence interval for the standard deviation of the
population sampled.

• SOLUTION: $n = 15$, $s = 0.011$, C.I. $= 95\%$, $v = n - 1 = 15 - 1 = 14$,
$\alpha = 0.05$, $\alpha/2 = 0.025$, $1 - \alpha/2 = 0.975$,
$\chi^2_{0.025,\,14} = 26.12$, $\chi^2_{0.975,\,14} = 5.63$

CI FOR A CHI-SQUARE DISTRIBUTION
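• By using the CI FOR THE POP STD DEV.: $\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,v}}} \le \sigma \le \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,v}}}$

• By inserting the relevant values into the equation:

$\sqrt{\frac{(14)(0.011)^2}{26.12}} \le \sigma \le \sqrt{\frac{(14)(0.011)^2}{5.63}}$, that is, $\sqrt{0.000065} \le \sigma \le \sqrt{0.00030}$,
so $0.0081 \le \sigma \le 0.0173$
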
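For Question 2 the variance bounds are computed first and the square roots taken at the end, which gives the interval for the standard deviation. A short Python sketch of that step, using the table values from the solution (the variable names are illustrative):

```python
from math import sqrt

# Question 2: n = 15, s = 0.011; chi-square table values for v = 14 at the 95% level.
n, s = 15, 0.011
chi2_right, chi2_left = 26.12, 5.63     # chi^2_{0.025,14} and chi^2_{0.975,14}

ss = (n - 1) * s**2
var_lower, var_upper = ss / chi2_right, ss / chi2_left   # bounds for sigma^2
sd_lower, sd_upper = sqrt(var_lower), sqrt(var_upper)    # bounds for sigma
print(f"variance bounds: ({var_lower:.6f}, {var_upper:.6f})")
print(f"95% CI for sigma: ({sd_lower:.4f}, {sd_upper:.4f})")
```
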
F-DISTRIBUTION
• The F distribution is sometimes called the variance ratio distribution.

• The F-statistic usually involves the ratio of two independent
estimates of variance and is used to test for the equality of two
independent estimates of the variance or standard deviation.

F-DISTRIBUTION
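• If two variables have independent chi-squared distributions
$\chi_1^2$ and $\chi_2^2$ with $v_1$ and $v_2$ degrees of freedom respectively,
the statistic formed by the ratio

$F = \frac{\chi_1^2 / v_1}{\chi_2^2 / v_2} = \frac{\frac{(n_1-1)s_1^2}{\sigma_1^2} \big/ (n_1-1)}{\frac{(n_2-1)s_2^2}{\sigma_2^2} \big/ (n_2-1)} = \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} = \frac{s_1^2\,\sigma_2^2}{s_2^2\,\sigma_1^2}$

has an F-distribution with $(v_1, v_2)$ degrees of freedom.

• If $\sigma_1^2 = \sigma_2^2$, then $F = \frac{s_1^2}{s_2^2}$ has an F-distribution with $(v_1, v_2)$ degrees of freedom.
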
CHARACTERISTICS OF F-DISTRIBUTION
• The F distribution is skewed to the right and the range of values of
F is $0 \le F < \infty$

CHARACTERISTICS OF F-DISTRIBUTION
• The F-distribution has two sets of degrees of freedom, one for
the numerator and the other for the denominator.
• The degrees of freedom $v_1$ and $v_2$ depend on the way in which we
obtain the estimates of the two variances appearing in the
numerator and denominator of the F-ratio.
• The F table gives the probability of the right-hand tail. Since
the F distribution is not symmetrical, the left-hand tail cannot
be read directly from the regular F table.

CHARACTERISTICS OF F-DISTRIBUTION
• By convention, for a two-tail test the F ratio is always evaluated with
the larger estimate of the variance as the numerator and the smaller
estimate as the denominator. The lower-tail value is then obtained from
$F_{1-\alpha/2,\,v_1,v_2} = \frac{1}{F_{\alpha/2,\,v_2,v_1}}$

• Rule of thumb method: When conducting a two-tail test we halve the
value of our significance level and read from the F-table. E.g. if we
choose the 5% level of significance for a two-tail test, we take the value
$F_{0.025}$ with the relevant degrees of freedom as our critical value.

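CONFIDENCE INTERVAL ON THE RATIO OF
TWO VARIANCES
• To find the confidence interval for $\frac{\sigma_2^2}{\sigma_1^2}$ means that we have to
find critical values such that:

• $P\left(F_{1-\alpha/2,\,v_1,v_2} \le F \le F_{\alpha/2,\,v_1,v_2}\right) = 1 - \alpha$
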
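CONFIDENCE INTERVAL ON THE RATIO OF
TWO VARIANCES

• Consider the inequality: $F_{1-\alpha/2,\,v_1,v_2} \le F \le F_{\alpha/2,\,v_1,v_2}$

• By rearranging (substituting $F = \frac{s_1^2\,\sigma_2^2}{s_2^2\,\sigma_1^2}$) we obtain:
$F_{1-\alpha/2,\,v_1,v_2} \le \frac{s_1^2}{s_2^2}\cdot\frac{\sigma_2^2}{\sigma_1^2} \le F_{\alpha/2,\,v_1,v_2}$

• By simplifying further (multiplying through by $\frac{s_2^2}{s_1^2}$):
$\frac{s_2^2}{s_1^2}\,F_{1-\alpha/2,\,v_1,v_2} \le \frac{\sigma_2^2}{\sigma_1^2} \le \frac{s_2^2}{s_1^2}\,F_{\alpha/2,\,v_1,v_2}$
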
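CONFIDENCE INTERVAL ON THE RATIO OF
TWO VARIANCES
• However, $F_{1-\alpha/2,\,v_1,v_2} = \frac{1}{F_{\alpha/2,\,v_2,v_1}}$

• By inserting the above equation we obtain the
$(1-\alpha)100\%$ CONFIDENCE INTERVAL OF THE RATIO OF THE TWO VARIANCES:

$\frac{s_2^2}{s_1^2}\cdot\frac{1}{F_{\alpha/2,\,v_2,v_1}} \le \frac{\sigma_2^2}{\sigma_1^2} \le \frac{s_2^2}{s_1^2}\,F_{\alpha/2,\,v_1,v_2}$
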
CONFIDENCE INTERVAL ON THE RATIO OF
TWO VARIANCES
• QUESTION: A study was conducted to compare the nicotine
content of two brands of cigarettes. Ten cigarettes of brand A
had an average nicotine content of 3.1 mg and a standard
deviation of 0.5 mg, while 8 of brand B had an average of 2.7 mg
and a standard deviation of 0.7 mg.
a. Construct a 95% confidence interval for the ratio of the
variances.
b. Construct a 95% confidence interval for the difference in
means.

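CONFIDENCE INTERVAL ON THE RATIO OF
TWO VARIANCES
• SOLUTION: Let sample 1 be brand A and sample 2 be brand B: $n_1 = 10$, $n_2 = 8$,
$\bar{x}_1 = 3.10$, $\bar{x}_2 = 2.70$, $s_1 = 0.5$, $s_2 = 0.7$,
$v_1 = 10 - 1 = 9$, $v_2 = 8 - 1 = 7$; $\alpha = 0.05$, $\alpha/2 = 0.025$

• By using the formula: $\frac{s_2^2}{s_1^2}\cdot\frac{1}{F_{\alpha/2,\,v_2,v_1}} \le \frac{\sigma_2^2}{\sigma_1^2} \le \frac{s_2^2}{s_1^2}\,F_{\alpha/2,\,v_1,v_2}$

• By inserting relevant values into the formula, with $F_{0.025,\,7,\,9} = 4.197$ and
$F_{0.025,\,9,\,7} = 4.823$:

• $\frac{(0.7)^2}{(0.5)^2}\cdot\frac{1}{4.197} \le \frac{\sigma_2^2}{\sigma_1^2} \le \frac{(0.7)^2}{(0.5)^2}\cdot 4.823$, that is, $0.467 \le \frac{\sigma_2^2}{\sigma_1^2} \le 9.453$
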
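The interval for the ratio of the nicotine variances can be checked with a short Python sketch. It assumes the interval is built for brand B's variance over brand A's, with the lower bound divided by $F_{0.025,\,7,\,9}$ and the upper bound multiplied by $F_{0.025,\,9,\,7}$. The value 4.823 is a standard F-table entry entered by hand, and the function name `variance_ratio_ci` is hypothetical.

```python
def variance_ratio_ci(s_top, s_bottom, f_divide, f_multiply):
    """CI for sigma_top^2 / sigma_bottom^2.

    f_divide   : F_{alpha/2} with (top df, bottom df), used in the lower bound
    f_multiply : F_{alpha/2} with (bottom df, top df), used in the upper bound
    """
    ratio = s_top**2 / s_bottom**2
    return ratio / f_divide, ratio * f_multiply

# Brand A: n = 10, s = 0.5 (9 df).  Brand B: n = 8, s = 0.7 (7 df).
f_7_9 = 4.197    # F_{0.025, 7, 9}, from an F table
f_9_7 = 4.823    # F_{0.025, 9, 7}, from an F table
lower, upper = variance_ratio_ci(0.7, 0.5, f_divide=f_7_9, f_multiply=f_9_7)
print(f"95% CI for var(B) / var(A): ({lower:.3f}, {upper:.3f})")
```

Because the interval contains 1, the data do not rule out equal variances at the 5% level.
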
CONFIDENCE INTERVAL ON THE RATIO OF
TWO VARIANCES
• QUESTION: In measuring the content of 6 boxes filled by one
machine, a student determined the sample variance to be
0.1754. In measuring the content of 11 boxes filled by a second
machine, he found a sample variance of 0.2704. Assuming that
the amount dispensed follows a normal distribution for each
machine, find a 95% confidence interval for the ratio of the
variances.

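CONFIDENCE INTERVAL ON THE RATIO OF
TWO VARIANCES
• SOLUTION: $n_1 = 6$, $n_2 = 11$, $s_1^2 = 0.1754$, $s_2^2 = 0.2704$,
$\alpha = 0.05$, $\alpha/2 = 0.025$, $F_{\alpha/2,\,v_1,v_2} = F_{0.025,\,5,\,10} = ?????$

• By using the formula: $\frac{s_2^2}{s_1^2}\cdot\frac{1}{F_{\alpha/2,\,v_2,v_1}} \le \frac{\sigma_2^2}{\sigma_1^2} \le \frac{s_2^2}{s_1^2}\,F_{\alpha/2,\,v_1,v_2}$

• Inserting the values: $\frac{0.2704}{0.1754}\cdot\frac{1}{(???)} \le \frac{\sigma_2^2}{\sigma_1^2} \le \frac{0.2704}{0.1754}\,(???)$

$= ????? \le \frac{\sigma_2^2}{\sigma_1^2} \le ?????$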