Quantitative Analysis for Management - II: QAM - II by Gaurav Garg (IIM Lucknow)
http://ganga.iiml.ac.in/~ggarg/QAM2.htm
COURSE OUTLINE
Sampling Distributions: Chi-Square, t and F distributions
Interval Estimation
Sample Size Decision
Testing of Hypothesis: single population and two populations
Measures of Association for Qualitative Data and Contingency Table
Chi-square test for Goodness of fit
Analysis of Variance: one way and two way
Nonparametric tests
Multiple Regression Analysis
QAM II by Gaurav Garg (IIM Lucknow)
QUIZ-3: NOV 16, 2011 / 0830 - 0850 Hrs.
ASSIGNMENT: NOV 24, 2011, due date: DEC 10, 2011
END TERM EXAM: 40%, DEC 14 - 19, 2011
IMPRESSION: 05%
Three quizzes will be conducted and the best two out of three will be considered.
00% 10%
Sampling Distributions
Concept of Sampling Distribution
Distributions of Sample Mean and Sample Proportion
Central Limit Theorem
t, Chi-Square and F distributions
Statistic:
A statistical measure computed from sample observations. Let $x_1, x_2, \ldots, x_n$ be the sample units.
Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Sample variance: $s^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$ or $s_1^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$
In practice, parameter values are not known; they are estimated from sample observations. Parameter values are fixed, whereas the value of a statistic varies from sample to sample.
Unbiased Estimate
If E(statistic) = parameter, then the statistic is said to be an unbiased estimate of the parameter. The sample mean is an unbiased estimate of the population mean.
Let us consider the following population of size 4: 18, 20, 22, 24
Population mean = (18 + 20 + 22 + 24) / 4 = 21
Population variance = [(18-21)^2 + (20-21)^2 + (22-21)^2 + (24-21)^2] / 4 = 5
Consider all possible samples of size 2 (drawn with replacement, 16 samples in all) and obtain the sample mean and sample variance of each.
The sample mean is an unbiased estimate of the population mean. This means that the average of all sample means equals the population mean.
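For each sample (n = 2):
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, $s^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$, $s_1^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$

Sample     $\bar{x}$   $s^2$   $s_1^2$
18, 18     18          0       0
20, 18     19          1       2
22, 18     20          4       8
24, 18     21          9       18
18, 20     19          1       2
20, 20     20          0       0
22, 20     21          1       2
24, 20     22          4       8
18, 22     20          4       8
20, 22     21          1       2
22, 22     22          0       0
24, 22     23          1       2
18, 24     21          9       18
20, 24     22          4       8
22, 24     23          1       2
24, 24     24          0       0
Average    21          2.5     5

With $\mu = 21$, $\sigma^2 = 5$, $n = 2$: $E(\bar{x}) = \mu$, $E(s^2) \neq \sigma^2$, $E(s_1^2) = \sigma^2$.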
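The table above can be checked by brute-force enumeration. The following is a minimal Python sketch, assuming samples are ordered pairs drawn with replacement (which is what yields 16 samples); the helper names `var_n` and `var_n_minus_1` are illustrative and not part of the original slides.

```python
from itertools import product

population = [18, 20, 22, 24]
n = 2  # sample size

# Population parameters (fixed values).
mu = sum(population) / len(population)
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

# All ordered samples of size n drawn with replacement (4**2 = 16 samples).
samples = list(product(population, repeat=n))

def mean(sample):
    return sum(sample) / len(sample)

def var_n(sample):
    """Sample variance s^2 with divisor n."""
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)

def var_n_minus_1(sample):
    """Sample variance s_1^2 with divisor n - 1."""
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample) / (len(sample) - 1)

# Average each statistic over all equally likely samples.
avg_mean = sum(mean(s) for s in samples) / len(samples)
avg_s2 = sum(var_n(s) for s in samples) / len(samples)
avg_s1_2 = sum(var_n_minus_1(s) for s in samples) / len(samples)

print("mu =", mu, "sigma^2 =", sigma2)
print("average of sample means:", avg_mean)
print("average of s^2:", avg_s2)
print("average of s_1^2:", avg_s1_2)
```

The averages of the sample means and of $s_1^2$ match $\mu$ and $\sigma^2$, while the average of $s^2$ falls short of $\sigma^2$, which is why the divisor $n - 1$ is preferred when an unbiased variance estimate is wanted.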
Sampling Distributions
Unknown parameters are estimated using sample observations. Parameter values are fixed, whereas the value of a statistic varies from sample to sample. Each sample has some probability of being chosen.
[Figure: population distribution. Each of the values 18, 20, 22, 24 has probability 0.25.]
Each value occurs only once in the population, so the population distribution is a discrete uniform distribution.
Samples                                   Sample mean   Frequency   Probability
(18, 18)                                  18            1           1/16
(20, 18), (18, 20)                        19            2           2/16
(22, 18), (18, 22), (20, 20)              20            3           3/16
(24, 18), (18, 24), (20, 22), (22, 20)    21            4           4/16
(20, 24), (24, 20), (22, 22)              22            3           3/16
(22, 24), (24, 22)                        23            2           2/16
(24, 24)                                  24            1           1/16
Total                                                   16          1
The value of the sample mean depends on the chosen sample. Each sample is chosen with certain probability.
So, each possible value of sample mean is associated with some probability.
Distribution of sample mean is the list of all possible values along with corresponding probabilities.
Sample Mean   Probability
18            1/16
19            2/16
20            3/16
21            4/16
22            3/16
23            2/16
24            1/16
In other words, the statistic $T = \bar{x}$ (sample mean) can be considered as a random variable. The distribution of T is given by the following table:

t       P(T=t)   t x P(T=t)   t^2 x P(T=t)
18      1/16     1.125        20.250
19      2/16     2.375        45.125
20      3/16     3.750        75.000
21      4/16     5.250        110.250
22      3/16     4.125        90.750
23      2/16     2.875        66.125
24      1/16     1.500        36.000
Total   1        21.000       443.500

Hence $E(T) = 21$ and $\mathrm{Var}(T) = E(T^2) - [E(T)]^2 = 443.5 - 21^2 = 2.5$.
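The distribution of T and its moments can be reproduced with exact arithmetic. This is a small Python sketch under the same assumption as the table (ordered samples of size 2 drawn with replacement); the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [18, 20, 22, 24]

# All 16 ordered samples of size 2, each equally likely.
samples = list(product(population, repeat=2))

# Count how many samples give each value of the sample mean.
counts = Counter(Fraction(a + b, 2) for a, b in samples)

# Sampling distribution: value of T -> P(T = t), as exact fractions.
dist = {t: Fraction(c, len(samples)) for t, c in sorted(counts.items())}

e_t = sum(t * p for t, p in dist.items())       # E(T)
e_t2 = sum(t * t * p for t, p in dist.items())  # E(T^2)
var_t = e_t2 - e_t ** 2                         # Var(T)

for t, p in dist.items():
    print(t, p)
print("E(T) =", e_t, " E(T^2) =", e_t2, " Var(T) =", var_t)
```

Note that `Fraction` prints probabilities in lowest terms (for example 1/8 rather than 2/16). The variance of T equals the population variance 5 divided by the sample size 2.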
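$E(\bar{x}) = E\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(x_i) = \frac{1}{n}\sum_{i=1}^{n} \mu = \frac{1}{n}\, n\mu = \mu$
$\mathrm{Var}(\bar{x}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{Var}(x_i) = \frac{1}{n^2}\sum_{i=1}^{n} \sigma^2 = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}$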
Common Notation:
$\mu_{\bar{x}} = E(\bar{x}) = \mu$, $\sigma_{\bar{x}}^2 = \mathrm{Var}(\bar{x}) = \frac{\sigma^2}{n}$
Standard Error
Different samples of the same size from the same population will yield different sample means. A measure of the variability in different values of sample mean is given by the Standard Error of the sample mean.
standard error($\bar{x}$) $= \sigma_{\bar{x}} = \sqrt{\mathrm{Var}(\bar{x})}$
In our example,
$\sigma_{\bar{x}} = \sqrt{2.5} = 1.5811$
Example: Suppose a population has mean $\mu = 8$ and standard deviation $\sigma = 3$, and a random sample of size n = 36 is selected. What is the probability that the sample mean is between 7.75 and 8.25?
Even if the population is not normally distributed, the central limit theorem can be used (n > 30). So the distribution of the sample mean is approximately normal with mean 8 and standard error $3/\sqrt{36} = 3/6$, i.e., $\bar{x} \sim N(8, 3/6)$.
$P[7.75 \le \bar{x} \le 8.25] = ?$
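One way to evaluate this probability numerically is with `statistics.NormalDist` from the Python standard library. The sketch below assumes the normal approximation from the central limit theorem and treats the second argument of N(8, 3/6) as the standard error (a standard deviation, not a variance).

```python
from math import sqrt
from statistics import NormalDist

mu = 8      # population mean
sigma = 3   # population standard deviation
n = 36      # sample size

# Standard error of the sample mean: sigma / sqrt(n) = 3/6 = 0.5.
se = sigma / sqrt(n)

# Approximate sampling distribution of the sample mean (CLT).
xbar_dist = NormalDist(mu=mu, sigma=se)

# P[7.75 <= xbar <= 8.25] = F(8.25) - F(7.75).
prob = xbar_dist.cdf(8.25) - xbar_dist.cdf(7.75)
print(round(prob, 4))  # 0.3829
```

Equivalently, the bounds standardize to Z = -0.5 and Z = 0.5, so the answer is the standard normal area between -0.5 and 0.5.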
Let
N = population size
X = no. of people out of N possessing a particular attribute
P = actual proportion of the people possessing the attribute = X/N
n = sample size
x = no. of people in the sample possessing the attribute
p = sample proportion = x/n
X, P are population parameters; x, p are sample statistics. p provides an estimate of P.
Note that x ~ B(n, P), so E(x) = nP and Var(x) = nPQ (where Q = 1 - P).
This implies that E(p) = E(x/n) = P and Var(p) = Var(x/n) = nPQ/n^2 = PQ/n.
Standard error(p) $= \sqrt{\mathrm{Var}(p)} = \sqrt{PQ/n}$
For large n, $\frac{p - P}{\sqrt{PQ/n}}$ is approximately N(0, 1). This is a particular case of the central limit theorem. Practically, this result is true for $n \ge 30$, or when $nP \ge 5$ as well as $nQ \ge 5$.
Example: Suppose the true proportion of voters who support ABC party is 0.4. What is the probability that a sample of size 200 yields a sample proportion between 0.40 and 0.45?
P = 0.4, Q = 1 - 0.4 = 0.6, n = 200.
Pr[0.40 < p < 0.45] = ?
$Z = \frac{p - P}{\sqrt{PQ/n}} \sim N(0, 1)$
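The voter example can be worked through numerically as follows. This is a minimal Python sketch assuming the normal approximation to the sample proportion; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

P = 0.4    # true population proportion
Q = 1 - P  # 0.6
n = 200    # sample size

# Standard error of the sample proportion: sqrt(PQ / n).
se_p = sqrt(P * Q / n)

# Standardize the two bounds: Z = (p - P) / sqrt(PQ / n).
z_low = (0.40 - P) / se_p
z_high = (0.45 - P) / se_p

std_normal = NormalDist()  # N(0, 1)
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print("standard error:", round(se_p, 4))
print("z range:", round(z_low, 4), "to", round(z_high, 4))
print("Pr[0.40 < p < 0.45] is approximately", round(prob, 4))
```

Because the lower bound equals P, the lower Z value is 0 and the probability is the standard normal area between 0 and roughly 1.44, a little under 0.43.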
If the population size N is small, the sample size n cannot be sufficiently large, and we cannot apply the central limit theorem.
In this situation, we multiply the standard error by the Finite Population Correction (fpc), which is given by
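$\mathrm{fpc} = \sqrt{\frac{N - n}{N - 1}}$
Thus
$\bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}} \cdot \mathrm{fpc}\right)$ or $\frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}} \sim N(0, 1)$.
And
$\frac{p - P}{\sqrt{\frac{PQ}{n}} \cdot \mathrm{fpc}} = \frac{p - P}{\sqrt{\frac{PQ}{n}}\sqrt{\frac{N - n}{N - 1}}} \sim N(0, 1)$.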
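To see the size of the correction, here is a short Python sketch. The population size N = 100 is a hypothetical value chosen only for illustration; sigma = 3 and n = 36 are reused from the earlier sample-mean example. The function names `fpc` and `standard_error_mean` are likewise illustrative.

```python
from math import sqrt

def fpc(N, n):
    """Finite population correction: sqrt((N - n) / (N - 1))."""
    if N < 2 or not 1 <= n <= N:
        raise ValueError("need N >= 2 and 1 <= n <= N")
    return sqrt((N - n) / (N - 1))

def standard_error_mean(sigma, N, n):
    """Standard error of the sample mean with the fpc applied."""
    return sigma / sqrt(n) * fpc(N, n)

# Hypothetical finite population of size 100 (assumed, not from the slides).
N, n, sigma = 100, 36, 3

print("fpc:", round(fpc(N, n), 4))
print("uncorrected standard error:", sigma / sqrt(n))
print("corrected standard error:", round(standard_error_mean(sigma, N, n), 4))
```

The factor is 1 when n = 1 and shrinks to 0 when n = N, reflecting that a sample covering the whole population leaves no sampling variability in the mean.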
Degrees of Freedom
The number of independent observations that make up a statistic is known as the degrees of freedom (d.f.) associated with that statistic. Equivalently, d.f. is the number of values in the final calculation of a statistic that are free to vary.
In general, d.f. of a statistic = (no. of independent observations) - (no. of parameters estimated)
Assume four numbers a, b, c and d such that a + b + c + d = m. You are free to choose any three of the numbers at random, but the 4th must be chosen so that it makes the total equal to m. Thus your degrees of freedom are three.
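The four-number illustration can be made concrete with a few lines of Python. The specific values of m, a, b and c below are hypothetical; the point is only that once the total and three values are fixed, the fourth is forced.

```python
def forced_last_value(free_values, total):
    """Return the one remaining value that makes the sum equal to total."""
    return total - sum(free_values)

m = 50                # required total (hypothetical)
a, b, c = 7, 20, 11   # three values chosen freely (hypothetical)

d = forced_last_value([a, b, c], m)
print("d must be", d)        # d must be 12
print("check:", a + b + c + d == m)
```

The same logic explains the divisor n - 1 in $s_1^2$: the n deviations from the sample mean always sum to zero, so only n - 1 of them are free to vary.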
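Chi-Square Distribution
Let us take a sample $x_1, x_2, \ldots, x_n$ from $N(\mu, \sigma)$. Define the statistic
$\chi^2 = \sum_{i=1}^{n} \left(\frac{x_i - \mu}{\sigma}\right)^2$
The symbol $\chi^2$ is read as Chi-Square; the statistic has a Chi-Square Distribution with n degrees of freedom and range $(0, \infty)$. This distribution is denoted as $\chi^2_{(n)}$.
If we define the statistic as
$\chi^2 = \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{\sigma}\right)^2$
then it follows $\chi^2_{(n-1)}$.
If $X \sim \chi^2_{(k)}$, then $E(X) = k$, $\mathrm{Var}(X) = 2k$.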
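The stated mean and variance can be checked approximately by simulation. The sketch below is an illustration only: the values mu = 8, sigma = 3, k = 5, the seed, and the number of replications are arbitrary choices, and sigma is taken to be the standard deviation of the normal population.

```python
import random
from statistics import fmean, pvariance

def chi_square_draw(rng, n, mu, sigma):
    """One value of sum(((x_i - mu) / sigma)**2) for a N(mu, sigma) sample of size n."""
    return sum(((rng.gauss(mu, sigma) - mu) / sigma) ** 2 for _ in range(n))

rng = random.Random(2011)  # fixed seed so the run is repeatable
k = 5                      # degrees of freedom (arbitrary choice)
replications = 20000

draws = [chi_square_draw(rng, k, mu=8.0, sigma=3.0) for _ in range(replications)]

# Simulated moments should be close to E(X) = k and Var(X) = 2k.
print("simulated mean:", round(fmean(draws), 3), "theory:", k)
print("simulated variance:", round(pvariance(draws), 3), "theory:", 2 * k)
```

Simulated values differ slightly from run to run with a different seed; they only approximate the theoretical k and 2k.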
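Student's t Distribution
Let us take a sample $x_1, x_2, \ldots, x_n$ from $N(\mu, \sigma)$. Define the statistic
$T = \frac{\bar{x} - \mu}{s_1 / \sqrt{n}}$, where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, $s_1^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$.
Then T follows Student's t Distribution with (n-1) d.f. and range $(-\infty, \infty)$. It is denoted as $T \sim t_{(n-1)}$.
If $T \sim t_{(k)}$, then $E(T) = 0$, $\mathrm{Var}(T) = \frac{k}{k - 2}$, $(k > 2)$.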
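As a rough check of these moments, the T statistic can be simulated from normal samples. This Python sketch uses arbitrary illustrative values (mu = 8, sigma = 3, n = 10, a fixed seed), with sigma taken to be the population standard deviation; the function name `t_statistic` is not from the slides.

```python
import random
from math import sqrt
from statistics import fmean, pvariance

def t_statistic(sample, mu):
    """T = (xbar - mu) / (s_1 / sqrt(n)), with s_1 using divisor n - 1."""
    n = len(sample)
    xbar = sum(sample) / n
    s1 = sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
    return (xbar - mu) / (s1 / sqrt(n))

rng = random.Random(2011)  # fixed seed so the run is repeatable
mu, sigma, n = 8.0, 3.0, 10
k = n - 1                  # degrees of freedom of T
replications = 20000

t_values = [
    t_statistic([rng.gauss(mu, sigma) for _ in range(n)], mu)
    for _ in range(replications)
]

# Simulated moments should be close to E(T) = 0 and Var(T) = k / (k - 2).
print("simulated mean:", round(fmean(t_values), 3), "theory: 0")
print("simulated variance:", round(pvariance(t_values), 3),
      "theory:", round(k / (k - 2), 3))
```

The variance exceeds 1 because the t distribution has heavier tails than N(0, 1); as k grows, k / (k - 2) approaches 1.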
Snedecor's F Distribution
Let X and Y be two independent random variables such that $X \sim \chi^2_{(d_1)}$ and $Y \sim \chi^2_{(d_2)}$.
Define the statistic $F = \frac{X / d_1}{Y / d_2}$
F follows Snedecor's F Distribution with $d_1$ and $d_2$ d.f. and range $(0, \infty)$. It is denoted as $F \sim F_{(d_1, d_2)}$.
$E(F) = \frac{d_2}{d_2 - 2}$, $d_2 > 2$, and $\mathrm{Var}(F) = \frac{2 d_2^2 (d_2 + d_1 - 2)}{d_1 (d_2 - 2)^2 (d_2 - 4)}$, $d_2 > 4$.
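The moment formulas for the F distribution are easy to evaluate exactly. The following Python sketch simply codes the two formulas with `fractions.Fraction`; the function names and the example degrees of freedom (5 and 10) are illustrative assumptions.

```python
from fractions import Fraction

def f_mean(d1, d2):
    """E(F) = d2 / (d2 - 2), defined only for d2 > 2 (d1 does not enter)."""
    if d2 <= 2:
        raise ValueError("E(F) requires d2 > 2")
    return Fraction(d2, d2 - 2)

def f_variance(d1, d2):
    """Var(F) = 2 d2^2 (d1 + d2 - 2) / (d1 (d2 - 2)^2 (d2 - 4)), for d2 > 4."""
    if d1 <= 0 or d2 <= 4:
        raise ValueError("Var(F) requires d1 > 0 and d2 > 4")
    return Fraction(2 * d2 ** 2 * (d1 + d2 - 2),
                    d1 * (d2 - 2) ** 2 * (d2 - 4))

d1, d2 = 5, 10  # example degrees of freedom (arbitrary choice)
print("E(F) =", f_mean(d1, d2))        # 5/4
print("Var(F) =", f_variance(d1, d2))  # 65/48
```

The guards mirror the conditions in the formulas: the mean does not exist for $d_2 \le 2$, and the variance does not exist for $d_2 \le 4$.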
Summary
Parameter and Statistic
Unbiasedness
Distribution of sample mean
Distribution of sample proportion
Central limit theorem
Finite population correction
Degrees of Freedom
Student's t, Chi-Square and Snedecor's F distributions