

Sampling
Distributions
Chapter 7
Stat 101 2nd Semester AY 2020-2021
Concept of a
Random Sample
A random sample is denoted by (X1, X2, …, Xn).
Random Sample
• Using a particular probability sampling method and collecting n observations, we have the random sample (X1, X2, …, Xn), where

X1 is the measure taken from the 1st element in the sample
X2 is the measure taken from the 2nd element in the sample
⋮
Xn is the measure taken from the nth element in the sample
Random Sample from a Finite Population
versus an Infinite Population
Random Sample of size 𝑛
from a Finite Population
• Suppose we select 𝑛 distinct elements from a population
consisting of 𝑁 elements, using a particular probability
sampling method. Let
X1 = measure taken from the 1st element in the sample
X2 = measure taken from the 2nd element in the sample
⋮
Xn = measure taken from the nth element in the sample

Then, (X1, X2, …, Xn) is called a random sample of size n from a finite population.
Remarks
• A common misconception about a random sample from a
finite population is that we have to assign all the elements of
the population the same chances of inclusion in the sample.

• This is not necessarily the case. It is possible to assign a


larger chance of inclusion in the sample to some of the
elements so long as the probabilities of inclusion are known
and nonzero.

• We just need to use any probability sampling method!


Remember that it does not require all the elements of the
population to have equal chances of inclusion in the sample.
EXAMPLE
• A sample selected by simple random sampling
WITHOUT replacement (SRSWOR) is a random
sample from a finite population.

• This is because if we use SRSWOR, then we are


assured that the 𝑛 elements we obtain in the
sample are distinct, since there is no replacement
done. This, then, satisfies the definition of a
random sample from a finite population.
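As a concrete sketch (Python standard library only; the population labels are hypothetical), SRSWOR can be carried out with random.sample, which never returns the same element twice:

```python
import random

# Hypothetical finite population of N = 6 labeled elements
population = ["A", "B", "C", "D", "E", "F"]

# SRSWOR: select n = 3 elements WITHOUT replacement
sample = random.sample(population, k=3)

print(sample)                 # e.g. ['E', 'A', 'C']
print(len(set(sample)) == 3)  # the 3 elements are always distinct
```

Because no replacement is done, the selected elements are guaranteed distinct, matching the definition of a random sample from a finite population.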
EXAMPLE (cont.)
• However, a sample selected using simple random sampling WITH replacement (SRSWR) does NOT qualify as a random sample from a finite population, because the same element may be selected more than once. This violates the requirement in the definition that the elements be distinct.

• Although SRSWR is not a random sample from a


finite population, we still consider it as a random
sample, but from an infinite population, which we
now define in the next slide.
Random Sample of size 𝑛
from an Infinite Population
• Let
X1 = measure taken from the 1st element in the sample
X2 = measure taken from the 2nd element in the sample

Xn = measure taken from the nth element in the sample
• Then, (X1, X2, …, Xn) is called a random sample of size n from an infinite population if the values of X1, X2, …, Xn are n independent observations generated from identical cumulative distribution functions (CDFs). This common CDF, or its corresponding probability mass/density function, is called the parent population or the distribution of the population.
Remarks
• For brevity, we can restate the definition as “a random sample of
size 𝒏 from an infinite population is a sample generated by a
series of 𝒏 INDEPENDENT trials that are performed under
IDENTICAL conditions”

• random sample of size n from an infinite population = i.i.d. (independent & identically distributed)
EXAMPLE
• We have mentioned that a sample selected using simple random
sampling WITH replacement (SRSWR) qualifies as a random
sample from an infinite population.

• This is because if replacement is done, then every time we select


an element from the population, we can say that the draws/trials
are independent of one another. What we select in the 1st draw
would not affect what we select in the 2nd draw, and so on.
(INDEPENDENT, check!)

• Also, if replacement is done, then the population does not change, and it will contain the same elements in every draw, as if every draw were a fresh draw. (IDENTICALLY DISTRIBUTED, check!)
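A sketch of the contrast (Python standard library; the labels are hypothetical): random.choices draws WITH replacement, so every draw is independent and faces the same unchanged population:

```python
import random

population = ["A", "B", "C", "D", "E", "F"]

# SRSWR: n = 10 draws WITH replacement; repeats are possible,
# and every draw faces the identical population of 6 elements
sample = random.choices(population, k=10)

print(sample)  # repeats may appear, e.g. ['B', 'F', 'B', ...]
```

The sample size can even exceed the physical population size, which is exactly the "infinite population" viewpoint the slides describe.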
EXAMPLE
• Suppose in a certain university, weight (in kilos) of the students is
normally distributed with mean 72 kilos and standard deviation 20
kilos.

• We then independently and randomly obtain a sample of 10


students from the university. Then, we have (X1, X2, …, X10) as a random sample from an infinite population.

• This is because each Xi in (X1, X2, …, X10) comes from the same/identical Normal(μ = 72, σ² = 20²) distribution, AND the observations are taken independently of each other. (i.i.d., check!)
Remarks
• It is important to take note that a random sample from an infinite
population NEED NOT require that the physical population from
where we select our sample is infinite.

• In the first example, we can see that even though our physical
population is finite, we can still have a viewpoint of sampling from
an infinite population. SRSWR seems to create that illusion!

• In the second example, we can still see that even though our
physical population is finite, we can have the viewpoint of
sampling from an infinite population as long as they satisfy the IID
requirement.
Remarks
• In summary, whether random sampling is from a finite or an infinite population DOES NOT depend on the physical population from which we take our sample, nor on the possible values of the Xi's (unlike how we classify discrete vs continuous random variables).

• It depends only on whether the Xi's in the random sample, say (X1, X2, …, Xn), are independent of each other and all identically distributed (the IID requirement) for it to be classified as a random sample from an infinite population.
Remarks
• To better understand these remarks, let's look at these examples:

Example 1: X1 ~ Binomial(n = 10, p = 0.7), X2 ~ Binomial(n = 10, p = 0.7), …, Xn ~ Binomial(n = 10, p = 0.7), where X1, X2, …, Xn are independent. Then, (X1, X2, …, Xn) is a random sample from an infinite population. This is still a random sample from an infinite population EVEN IF each Xi is discrete, because each Xi satisfies the IID requirement.

Example 2: X1 ~ Normal(μ = 2, σ² = 9), X2 ~ Normal(μ = 2, σ² = 9), …, Xn ~ Normal(μ = 2, σ² = 9), where X1, X2, …, Xn are independent. Then, (X1, X2, …, Xn) is a random sample from an infinite population. This is NOT because each Xi is normally distributed and can take any value from −∞ to ∞, BUT BECAUSE each Xi satisfies the IID requirement.
Basic Concepts
Statistic, Sampling Distribution, Standard Error
Statistic
• Suppose (X1, X2, …, Xn) is a random sample. A statistic is a random variable that is any function of X1, X2, …, Xn.
EXAMPLES
• Suppose (X1, X2, …, Xn) is a random sample. Then,

X̄ = (1/n) Σ(i=1 to n) Xi   and   S² = (1/(n−1)) Σ(i=1 to n) (Xi − X̄)²

are statistics, because they are functions of X1, X2, …, Xn.
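Both statistics can be computed directly from a realized sample (a minimal sketch; the data values below are made up for illustration):

```python
# Sample mean: a function of X1, X2, ..., Xn
def xbar(xs):
    return sum(xs) / len(xs)

# Sample variance: another function of X1, X2, ..., Xn
def s_squared(xs):
    m = xbar(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

data = [1, 2, 3, 4, 5]   # one realized sample (made up)
print(xbar(data))        # 3.0
print(s_squared(data))   # 2.5
```

A different realized sample would give different values, which is exactly why a statistic is itself a random variable.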


Remarks
• Being a function of the random variables X1, X2, …, Xn, a statistic is itself a random variable.

• As a random variable, the value of a statistic depends on the


outcome of the random experiment of selecting 𝑛 elements from a
population using a probability sampling method.

• Also, its value will vary from sample to sample. It is impossible to predict with certainty what the realized value of the statistic will be. It is only when we have actually selected our sample that we can compute its value.

• Lastly, as a random variable, it must have a probability


distribution, which we discuss in the next slide.
Sampling Distribution
• The sampling distribution of a statistic is its
probability distribution.

Again, since the statistic is a random variable, then it must have a


probability distribution in the form of a CDF or a PMF/PDF.

Even if the realized value of a statistic cannot be predicted with


utmost certainty, we can use its sampling distribution to
understand how its value changes from one sample to another (or
its behavior in terms of probability).
EXAMPLE
Filipinos are so fascinated with elections and the polls conducted to
predict the outcomes of these elections. Let us imagine a very small
barangay consisting of 6 qualified voters. Let’s label them as Voters A,
B, C, D, E, and F.

There are two candidates vying for the mayor position, say Jojo and
Junjun. What we do not know is that Voters A, B, C, and D have
already decided to elect Jojo, while voters E and F will elect Junjun.

Sadly, we only have enough resources to get a sample of size 3.


Suppose we use SRSWOR to select our sample of size 3. We will then
use the information from this sample to predict the outcome of the
election.
EXAMPLE (cont.)
Construct the sampling distribution of X̄, where we let

Xi = 1, if the ith voter in the sample elects Jojo
Xi = 0, if the ith voter in the sample elects Junjun

Note: We are interested in the distribution of X̄ because it is actually the proportion of voters in the sample who will elect Jojo. Since Xi = 1 if the ith voter will vote for Jojo and Xi = 0 otherwise, then

X̄ = (X1 + X2 + X3)/3 = (number of voters who will vote for Jojo in the sample) / (total number of voters in the sample)

This sample proportion serves as our estimate of Jojo's share of the vote. That is why we are interested in the sampling distribution of X̄, and that is how we come up with inferences about the results of the election.
EXAMPLE (cont.)
Construct the sampling distribution of X̄, where we let

Xi = 1, if the ith voter in the sample elects Jojo
Xi = 0, if the ith voter in the sample elects Junjun

Our sample space contains all the 20 possible combinations of size 3. The sample points in our sample space are:
{A,B,C} {A,B,D} {A,B,E} {A,B,F}
{A,C,D} {A,C,E} {A,C,F} {A,D,E}
{A,D,F} {A,E,F} {B,C,D} {B,C,E}
{B,C,F} {B,D,E} {B,D,F} {B,E,F}
{C,D,E} {C,D,F} {C,E,F} {D,E,F}
EXAMPLE (cont.)
For every possible combination of size 3, we get the value of our random sample (X1, X2, X3) and the value of the statistic of interest X̄.

Sample (X1,X2,X3) x̄ Sample (X1,X2,X3) x̄ Sample (X1,X2,X3) x̄ Sample (X1,X2,X3) x̄


{A,B,C} (1,1,1) 3/3 {A,B,D} (1,1,1) 3/3 {A,B,E} (1,1,0) 2/3 {A,B,F} (1,1,0) 2/3
{A,C,D} (1,1,1) 3/3 {A,C,E} (1,1,0) 2/3 {A,C,F} (1,1,0) 2/3 {A,D,E} (1,1,0) 2/3
{A,D,F} (1,1,0) 2/3 {A,E,F} (1,0,0) 1/3 {B,C,D} (1,1,1) 3/3 {B,C,E} (1,1,0) 2/3
{B,C,F} (1,1,0) 2/3 {B,D,E} (1,1,0) 2/3 {B,D,F} (1,1,0) 2/3 {B,E,F} (1,0,0) 1/3
{C,D,E} (1,1,0) 2/3 {C,D,F} (1,1,0) 2/3 {C,E,F} (1,0,0) 1/3 {D,E,F} (1,0,0) 1/3

Note that X̄ is a discrete random variable because it has only three possible values (finite). The mass points of X̄ are 1/3, 2/3, and 3/3.
EXAMPLE (cont.)
To get the PMF of X̄, we just need to compute P(X̄ = x̄) for every mass point x̄ = 1/3, 2/3, 3/3.

Sample (X1,X2,X3) x̄ Sample (X1,X2,X3) x̄ Sample (X1,X2,X3) x̄ Sample (X1,X2,X3) x̄


{A,B,C} (1,1,1) 3/3 {A,B,D} (1,1,1) 3/3 {A,B,E} (1,1,0) 2/3 {A,B,F} (1,1,0) 2/3
{A,C,D} (1,1,1) 3/3 {A,C,E} (1,1,0) 2/3 {A,C,F} (1,1,0) 2/3 {A,D,E} (1,1,0) 2/3
{A,D,F} (1,1,0) 2/3 {A,E,F} (1,0,0) 1/3 {B,C,D} (1,1,1) 3/3 {B,C,E} (1,1,0) 2/3
{B,C,F} (1,1,0) 2/3 {B,D,E} (1,1,0) 2/3 {B,D,F} (1,1,0) 2/3 {B,E,F} (1,0,0) 1/3
{C,D,E} (1,1,0) 2/3 {C,D,F} (1,1,0) 2/3 {C,E,F} (1,0,0) 1/3 {D,E,F} (1,0,0) 1/3

We can use the classical approach since we performed SRSWOR (thus, we have equiprobable outcomes) and we have a finite sample space (n(Ω) = 20).
EXAMPLE (cont.)
Using the classical approach P(A) = n(A)/n(Ω), we can see that

P(X̄ = 1/3) = 4/20,  P(X̄ = 2/3) = 12/20,  P(X̄ = 3/3) = 4/20

Sample (X1,X2,X3) x̄ Sample (X1,X2,X3) x̄ Sample (X1,X2,X3) x̄ Sample (X1,X2,X3) x̄


{A,B,C} (1,1,1) 3/3 {A,B,D} (1,1,1) 3/3 {A,B,E} (1,1,0) 2/3 {A,B,F} (1,1,0) 2/3
{A,C,D} (1,1,1) 3/3 {A,C,E} (1,1,0) 2/3 {A,C,F} (1,1,0) 2/3 {A,D,E} (1,1,0) 2/3
{A,D,F} (1,1,0) 2/3 {A,E,F} (1,0,0) 1/3 {B,C,D} (1,1,1) 3/3 {B,C,E} (1,1,0) 2/3
{B,C,F} (1,1,0) 2/3 {B,D,E} (1,1,0) 2/3 {B,D,F} (1,1,0) 2/3 {B,E,F} (1,0,0) 1/3
{C,D,E} (1,1,0) 2/3 {C,D,F} (1,1,0) 2/3 {C,E,F} (1,0,0) 1/3 {D,E,F} (1,0,0) 1/3
EXAMPLE (cont.)
Thus, the sampling distribution of X̄ is given by the PMF

x̄             1/3   2/3    3/3
p(x̄)=P(X̄=x̄)  4/20  12/20  4/20

Simplifying, we have

x̄             1/3  2/3  1
p(x̄)=P(X̄=x̄)  1/5  3/5  1/5
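The whole construction can be verified by enumeration (a sketch using Python's standard library; voter preferences as given in the example):

```python
from itertools import combinations
from fractions import Fraction
from collections import Counter

# 1 = votes for Jojo, 0 = votes for Junjun (as given in the example)
votes = {"A": 1, "B": 1, "C": 1, "D": 1, "E": 0, "F": 0}

# All 20 equiprobable SRSWOR samples of size 3
samples = list(combinations(votes, 3))
xbars = [Fraction(sum(votes[v] for v in s), 3) for s in samples]

# PMF of X-bar via the classical approach: count / 20
pmf = {x: Fraction(c, len(samples)) for x, c in Counter(xbars).items()}
for x in sorted(pmf):
    print(x, pmf[x])   # 1/3 -> 1/5, 2/3 -> 3/5, 1 -> 1/5
```

Using Fraction keeps the probabilities exact, so the output matches the simplified PMF above.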
Remarks
• If we have a random sample from a finite population, then the statistic of interest will have a CDF or a PMF.

• If we have a random sample from an infinite population, then the statistic of interest will have a CDF or a PMF (if Xi is discrete) OR a CDF or a PDF (if Xi is continuous).

• Note that in the previous example, we performed SRSWOR. Thus, our (X1, X2, X3) there is actually a random sample from a finite population. Hence, we obtained a PMF for the statistic of interest X̄.

• Again, this CDF or PMF/PDF of the statistic is called the sampling distribution of the statistic.
Remarks
• Our most common statistic of interest is the sample mean X̄.

• The value of X̄ will change from one sample to another. Even if we cannot predict with certainty what its value will be, we can use its sampling distribution to understand its behavior.

• That is why, if we have a random sample from a finite population, we need to get the CDF or the PMF of X̄, while if we have a random sample from an infinite population, we need to get the CDF or the PMF/PDF of X̄.

• This is also the reason why we always ask the question "What is the distribution of X̄?"
Standard Error
• The standard deviation of a statistic is called its
standard error.

We have a special term for the standard deviation of a statistic


because we will use it to measure the reliability of our statistic.

A small standard error indicates that the computed values of our statistic across the different samples generated are close to one another. So even though the value of a statistic varies from one sample to another, a small standard error gives us assurance that the variation among these values is not too large.
Illustration
Suppose in a population of N = 10 values, {0, 1, 1, 2, 2, 3, 3, 3, 4, 5}, the true and unknown mean is μ = 2.4. Since it is unknown, we may want to guess its value by taking a sample, say of size n = 5, and then use the sample mean X̄ to come up with a guess. Take note that we can have different possible samples of size 5. In our case, we look at the 4 possible samples given below, and for each one, we compute the value of X̄.

possible samples of size 5:
(X1, X2, X3, X4, X5) = (1, 4, 3, 2, 1) → x̄ = 2.2
(X1, X2, X3, X4, X5) = (3, 3, 2, 1, 3) → x̄ = 2.4
(X1, X2, X3, X4, X5) = (1, 0, 1, 2, 2) → x̄ = 1.2
(X1, X2, X3, X4, X5) = (2, 3, 2, 3, 1) → x̄ = 2.2
Illustration
From the illustration, we can see that for the different possible samples of size 5, we also have different values of X̄: sometimes it's 2.2, sometimes 2.4, or 1.2, or 2.2 again!

The variation that we see here, which can be measured by getting the standard deviation of all possible values of X̄ across all possible samples of size n, is called the standard error.
Standard Deviation vs
Standard Error
The standard deviation measures the variation within a set of measurements, while the standard error (being itself a standard deviation) also measures the variation within a set of measurements, but these measurements are now the possible values of a certain statistic, say X̄.

To illustrate,

STANDARD DEVIATION = variation in (0, 1, 1, 2, 2, 3, 3, 3, 4, 5)

STANDARD ERROR = variation in (2.2, 2.4, 1.2, 2.2, …)
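To make the contrast concrete (a sketch using Python's statistics module; the four sample means are from the illustration, which covers only 4 of the many possible samples, so the second number only illustrates the idea of a standard error):

```python
import statistics

population = [0, 1, 1, 2, 2, 3, 3, 3, 4, 5]
sample_means = [2.2, 2.4, 1.2, 2.2]  # the X-bar values from the illustration

# Standard deviation: variation within the population measurements
print(round(statistics.pstdev(population), 4))

# Standard-error flavor: variation among the sample means
print(round(statistics.pstdev(sample_means), 4))
```

The same function computes both numbers; what changes is the set of measurements it is applied to.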


EXAMPLE
Use the sampling distribution we constructed in the Jojo-Junjun example to compute for
a. the mean of X̄
b. the variance of X̄
c. the standard error of X̄

x̄             1/3  2/3  1
p(x̄)=P(X̄=x̄)  1/5  3/5  1/5
EXAMPLE (cont.)
Sampling Distribution of X̄:

x̄             1/3  2/3  1
p(x̄)=P(X̄=x̄)  1/5  3/5  1/5

a. The mean of X̄ is E(X̄) = (1/3)(1/5) + (2/3)(3/5) + (1)(1/5) = 2/3.

b. The variance of X̄ is Var(X̄) = E(X̄²) − [E(X̄)]² = 22/45 − (2/3)² = 2/45.

Aside, E(X̄²) = (1/3)²(1/5) + (2/3)²(3/5) + (1)²(1/5) = 22/45.

c. The standard error of X̄ is s.e.(X̄) = √Var(X̄) = √(2/45) ≈ 0.2108.
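The same computations can be sketched in Python with exact fractions (PMF as constructed in the example):

```python
from fractions import Fraction as F
from math import sqrt

# Sampling distribution of X-bar from the Jojo-Junjun example
pmf = {F(1, 3): F(1, 5), F(2, 3): F(3, 5), F(1): F(1, 5)}

mean = sum(x * p for x, p in pmf.items())    # E(X-bar)
ex2 = sum(x**2 * p for x, p in pmf.items())  # E(X-bar^2)
var = ex2 - mean**2                          # Var(X-bar)
se = sqrt(var)                               # standard error

print(mean, var, round(se, 4))  # 2/3 2/45 0.2108
```

Working in exact fractions until the final square root avoids any rounding in the intermediate steps.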
EXERCISE # 1
Suppose the sampling distribution of X̄ is as follows:

x̄             5    5.5  6     6.5  7    7.5
p(x̄)=P(X̄=x̄)  0.1  0.2  0.25  0.3  0.1  0.05

a. What is the mean of X̄?

b. What is the standard error of X̄?

c. Compute P(5 < X̄ < 7).

EXERCISE # 1
a. The mean of X̄ is E(X̄) = 5(0.1) + 5.5(0.2) + 6(0.25) + 6.5(0.3) + 7(0.1) + 7.5(0.05) = 6.125.

b. The standard error of X̄ is s.e.(X̄) = √Var(X̄) = √(E(X̄²) − [E(X̄)]²) = √(37.9375 − 6.125²) ≈ 0.6495.

Aside, E(X̄²) = 5²(0.1) + 5.5²(0.2) + 6²(0.25) + 6.5²(0.3) + 7²(0.1) + 7.5²(0.05) = 37.9375.

c. P(5 < X̄ < 7) = p(5.5) + p(6) + p(6.5) = 0.2 + 0.25 + 0.3 = 0.75.
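These answers can be checked with a short script (a sketch; the PMF is the one given in the exercise):

```python
from math import sqrt

xs = [5, 5.5, 6, 6.5, 7, 7.5]
ps = [0.1, 0.2, 0.25, 0.3, 0.1, 0.05]

mean = sum(x * p for x, p in zip(xs, ps))        # E(X-bar)
ex2 = sum(x**2 * p for x, p in zip(xs, ps))      # E(X-bar^2)
se = sqrt(ex2 - mean**2)                         # standard error
prob = sum(p for x, p in zip(xs, ps) if 5 < x < 7)

print(round(mean, 3))  # 6.125
print(round(se, 4))    # 0.6495
print(round(prob, 2))  # 0.75
```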


Central Limit
Theorem
This is one of the most famous and important theorems in
Statistics.
Central Limit Theorem
• If X̄ is the mean of a random sample of size n taken from a large or infinite population with mean μ and variance σ², then the sampling distribution of X̄ is approximately normally distributed with

mean E(X̄) = μ and variance Var(X̄) = σ²/n,

as long as n is sufficiently large.


VISUALIZATION
[Figure: a random sample is drawn from some population; given the rule-of-thumb sample size, the approximate distribution of X̄ is normal.]
ILLUSTRATION
Even if the population from where the sample came looks like any of several different shapes, each with mean μ and variance σ², then as long as it has finite mean μ and finite variance σ², X̄ will be approximately normally distributed with

mean: μ
variance: σ²/n

given a sufficiently large sample size (n > 30, as our rule of thumb only).
Remarks
• From the result of the Central Limit Theorem (CLT) that X̄ ≈ Normal(μ, σ²/n) as long as n is sufficiently large, we can standardize the random variable X̄ by subtracting its mean μ and dividing by its standard deviation √(σ²/n) = σ/√n. Thus, we know that

Z = (X̄ − μ)/(σ/√n) ≈ N(0, 1)
Remarks
• The normal approximation will hold even if the
distribution of the population from where the sample
came from is either discrete or continuous.

• The normal approximation will hold even if the


distribution of the population from where the sample
came from is either symmetric or skewed.

• We can use the approximation even for random samples


from finite populations so long as N is very large.
Remarks
• The normal approximation in the theorem will be
good if n > 30. However, this is only a rule of thumb.

• The theorem does NOT require that the random


sample comes from a normal distribution.

• However, if the distribution of the population is


Normal to begin with, then the sampling distribution
of X̄ will also be exactly (not just approximately)
Normal, no matter how small the sample size is.
EXAMPLE
A random sample of size 100 is taken from a large population with mean μ = 1,000 and variance σ² = 625. Approximate P(X̄ > 998).

By the Central Limit Theorem, X̄ will be approximately normally distributed with E(X̄) = μ = 1,000 and Var(X̄) = σ²/n = 625/100 = 6.25.

P(X̄ > 998) = 1 − P(X̄ ≤ 998) ≈ 1 − P((X̄ − μ)/(σ/√n) ≤ (998 − 1000)/√6.25)
= 1 − P(Z ≤ −0.8)
= 1 − 0.2119 = 0.7881
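The z-table lookup can be reproduced numerically using the standard normal CDF, Φ(z) = ½(1 + erf(z/√2)) (a sketch using only Python's standard library):

```python
from math import erf, sqrt

def phi(z):
    """CDF of the standard normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, var, n = 1000, 625, 100
se = sqrt(var / n)   # sigma / sqrt(n) = 2.5

z = (998 - mu) / se  # -0.8
p = 1 - phi(z)       # P(X-bar > 998) under the CLT approximation
print(round(p, 4))   # 0.7881
```

This agrees with the z-table value to four decimal places.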
EXAMPLE (explained)
In the solution above, we standardize X̄ because we know it is approximately normal. To standardize is to subtract the mean and then divide by the standard deviation. The mean of X̄ is μ and its standard deviation is σ/√n, hence Z = (X̄ − μ)/(σ/√n).

The quantity P(Z ≤ −0.8) is the CDF of Z evaluated at −0.8. Its value, 0.2119, is obtained from the z-table: look at Row −0.8 and Column 0.00.
EXAMPLE
Suppose the average time it takes a large group of students to
complete a certain exam is 46.2 minutes with a standard deviation of
15 minutes. Find the probability that a class of 50 students taking that exam has a mean completion time of less than 40 minutes.

Let Xi denote the length of time for the ith student to complete the exam. Each Xi is taken from a large group of students with mean completion time μ = 46.2 and standard deviation σ = 15.
Also, X̄ is the sample mean completion time of 50 randomly selected students. By the Central Limit Theorem, X̄ will be approximately normally distributed with E(X̄) = μ = 46.2 and Var(X̄) = σ²/n = 15²/50.
EXAMPLE (cont.)
And so, we have

P(X̄ < 40) ≈ P((X̄ − μ)/(σ/√n) < (40 − 46.2)/√(15²/50))
= P(Z < −2.92)
= 0.0018
EXERCISE # 2
An electrical firm manufactures electric light bulbs
that have a length of life with mean and standard
deviation equal to 500 and 50 hours, respectively.

Find the probability that a random sample of 35 bulbs will have an average life greater than 475 hours.
EXERCISE # 2
An electrical firm manufactures electric light bulbs that have a length
of life with mean and standard deviation equal to 500 and 50 hours,
respectively. Find the probability that a random sample of 35 bulbs will
have an average life greater than 475 hours.

Let Xi denote the length of life of the ith light bulb. Each Xi is taken from a large production of bulbs with mean lifetime μ = 500 and standard deviation σ = 50.
Also, X̄ is the sample mean length of life of 35 randomly selected light bulbs. By the Central Limit Theorem, X̄ will be approximately normally distributed with E(X̄) = μ = 500 and Var(X̄) = σ²/n = 50²/35.
EXERCISE # 2
An electrical firm manufactures electric light bulbs that have a length
of life with mean and standard deviation equal to 500 and 50 hours,
respectively. Find the probability that a random sample of 35 bulbs will
have an average life greater than 475 hours.

And so, we have

P(X̄ > 475) ≈ P((X̄ − μ)/(σ/√n) > (475 − 500)/√(50²/35))
= P(Z > −2.96)
= 1 − 0.0015 = 0.9985
Remarks
• Again, the theorem does NOT require that the
random sample comes from a normal distribution.
• In the previous examples, we did not have any distribution at all to start with (we just knew the mean and variance of a large or infinite population), yet we could still obtain an approximate distribution for X̄: just invoke the Central Limit Theorem! We can do this since the sample sizes in those examples are large enough based on our rule of thumb.
t-distribution & χ²-distribution
In addition to the normal distribution, these are two special
distributions that are also commonly used in Inferential
Statistics.
t-distribution
X ~ t(v)
• parameter: degrees of freedom v
• like the standard normal, it is bell-shaped, symmetric about 0, and its tails approach the x-axis
• unlike the standard normal, its variance is larger than 1
• as the degrees of freedom increases, it becomes almost identical to the standard normal
VISUALIZATION
Reading the 𝒕-table
• The t-table gives the values of t(α, v), or the 100(1 − α)th percentiles of X ~ t(v), for values of α = 0.10, 0.05, 0.025, 0.01, and 0.005, and v degrees of freedom from 1 to 30. That is why it is labeled "100(1 − α)th Percentiles of the t-distribution".
Reading the 𝒕-table
• t(α, v) is just a number on the x-axis such that the AREA TO ITS RIGHT is α.
EXAMPLE
We see from the table that t(0.05, 4) = 2.132.
To see this, just get the intersection of Row 4 and Column 0.05.

This means that if we draw the PDF of X ~ t(v = 4), then the area to the right of 2.132 is 0.05. Equivalently, the area to the left of 2.132 is 0.95. That is why it is called the 95th percentile, because P(X ≤ 2.132) = 0.95.
EXAMPLE
Find the value of t(0.1, 5).

We just look at Row 5, Column 0.1.

Thus, t(0.1, 5) = 1.476.
EXAMPLE
Find the value of t(0.025, 3).

We just look at Row 3, Column 0.025.

Thus, t(0.025, 3) = 3.182.
EXERCISE # 3
Suppose X ~ t(v). Find the value of the following:

a. t(0.025, 10)

b. t(0.10, 12)

c. t(0.05, 15)
EXERCISE # 3
Suppose X ~ t(v). Find the value of the following:

a. t(0.025, 10) = 2.228

b. t(0.10, 12) = 1.356

c. t(0.05, 15) = 1.753
χ²-distribution
X ~ χ²(v)
• parameter: degrees of freedom v
• takes positive values only (support on the positive reals)
• skewed to the right
• becomes almost symmetric as the degrees of freedom increases
VISUALIZATION
Reading the χ²-table
• The χ²-table gives the values of χ²(α, v), or the 100(1 − α)th percentiles of X ~ χ²(v), for values of α = 0.10, 0.05, 0.025, 0.01, 0.005, 0.90, 0.95, 0.975, 0.99, and 0.995, and v degrees of freedom from 1 to 30. That is why it is labeled "100(1 − α)th Percentiles of the Chi-Square distribution".
Reading the χ²-table
• χ²(α, v) is just a number on the x-axis such that the AREA TO ITS RIGHT is α.
EXAMPLE
We see from the table that χ²(0.05, 4) = 9.488.
To see this, just get the intersection of Row 4 and Column 0.05.

This means that if we draw the PDF of X ~ χ²(v = 4), then the area to the right of 9.488 is 0.05. Equivalently, the area to the left of 9.488 is 0.95. That is why it is called the 95th percentile, because P(X ≤ 9.488) = 0.95.
EXAMPLE
Find the value of χ²(0.1, 5).

We just look at Row 5, Column 0.1.

Thus, χ²(0.1, 5) = 9.236.
EXAMPLE
Find the value of χ²(0.99, 3).

We just look at Row 3, Column 0.99.

Thus, χ²(0.99, 3) = 0.115.
EXERCISE # 4
Suppose X ~ χ²(v). Find the value of the following:

a. χ²(0.025, 10)

b. χ²(0.10, 12)

c. χ²(0.05, 15)
EXERCISE # 4
Suppose X ~ χ²(v). Find the value of the following:

a. χ²(0.025, 10) = 20.483

b. χ²(0.10, 12) = 18.549

c. χ²(0.05, 15) = 24.996
Sampling from
Normal
Distribution
Now, we assume that the distribution of the population from where we obtain the random sample is normal with mean μ and variance σ².
Sampling from the Normal
Distribution
• Suppose (X1, X2, …, Xn) is a random sample such that Xi ~ Normal(μ, σ²) for i = 1, 2, …, n.

That is, X1 ~ N(μ, σ²), X2 ~ N(μ, σ²), …, Xn ~ N(μ, σ²).

Then, X̄ ~ Normal(μ, σ²/n).
VISUALIZATION
X1 ~ Normal(μ, σ²)
+ X2 ~ Normal(μ, σ²)
+ X3 ~ Normal(μ, σ²)
⋮
+ Xn ~ Normal(μ, σ²)
_________________________________________________
(X1 + X2 + … + Xn)/n = X̄ ~ Normal(μ, σ²/n)
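A quick simulation sketch of this result (standard library only; the parameters μ = 72, σ² = 30 are just illustrative values): drawing many samples of size n = 10 from Normal(72, 30), the mean of the sample means should be close to μ and their variance close to σ²/n = 3.

```python
import random
import statistics

random.seed(42)  # for reproducibility

mu, sigma, n = 72, 30 ** 0.5, 10   # population Normal(72, 30), sample size 10

# Compute X-bar for 20,000 independent samples of size n
means = [statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
         for _ in range(20_000)]

print(round(statistics.fmean(means), 2))      # close to mu = 72
print(round(statistics.pvariance(means), 2))  # close to sigma^2 / n = 3
```

A histogram of these 20,000 means would also look normal, which is the content of the theorem (here exactly normal, since the population itself is normal).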
Remarks
• It’s tempting to say that this is just the result of the Central Limit Theorem. However, there are differences!

o Now we assume that all the Xi's are normally distributed with mean μ and variance σ², unlike in the CLT where the sample can come from any large or infinite population.

o X̄ is now exactly normally distributed, unlike in the CLT where it is only approximately normally distributed.

o We do not require n > 30, because even for small sample sizes, X̄ will still be (exactly) normally distributed, given that we have a random sample from Normal(μ, σ²).
EXAMPLE
Suppose (X1, X2, …, X10) is a random sample of weights (in kilos) from a population that is normally distributed with mean 72 kilos and variance 30. What is the distribution of the sample mean weight?

We know that X1 ~ N(μ = 72, σ² = 30), X2 ~ N(μ = 72, σ² = 30), …, X10 ~ N(μ = 72, σ² = 30).

Thus, X̄ ~ Normal(μ = 72, σ²/n = 30/10 = 3), or X̄ ~ N(72, 3).
EXAMPLE
IQ is normally distributed with mean 100 and standard deviation 20. What is the probability of selecting a random sample of 100 people with mean IQ larger than 105?

Let Xi denote the IQ of the ith person.

We know that X1 ~ N(μ = 100, σ² = 20²), X2 ~ N(μ = 100, σ² = 20²), …, X100 ~ N(μ = 100, σ² = 20²).

Thus, X̄ ~ Normal(μ = 100, σ²/n = 20²/100 = 400/100 = 4), or X̄ ~ N(100, 4).
EXAMPLE
IQ is normally distributed with mean 100 and standard
deviation 20. What is the probability of selecting a random
sample of size 100 with mean IQ larger than 105?

P(X̄ > 105) = P((X̄ − μ)/(σ/√n) > (105 − 100)/√4)
= P(Z > 2.50)
= 1 − P(Z ≤ 2.50)
= 1 − 0.9938
= 0.0062
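This probability can also be checked numerically with the standard normal CDF (a sketch, standard library only; here the distribution of X̄ is exactly normal, so no approximation sign is needed):

```python
from math import erf, sqrt

def phi(z):
    """CDF of the standard normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma, n = 100, 20, 100
se = sigma / sqrt(n)          # exactly 2, since X-bar ~ N(100, 4)

p = 1 - phi((105 - mu) / se)  # P(X-bar > 105) = P(Z > 2.5)
print(round(p, 4))            # 0.0062
```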
EXERCISE # 5
An electrical firm manufactures electric light bulbs that have a length of
life that is normally distributed with mean and standard deviation equal
to 500 and 50 hours, respectively. Find the probability that a random
sample of 16 bulbs will have an average life greater than 475 hours.

Let Xi denote the length of life of the ith light bulb. We know that X1 ~ N(μ = 500, σ² = 50²), X2 ~ N(μ = 500, σ² = 50²), …, X16 ~ N(μ = 500, σ² = 50²).

Thus, X̄ ~ Normal(μ = 500, σ²/n = 50²/16), or X̄ ~ N(500, 50²/16).

NOTE: This exercise is quite similar to Exercise # 2, but now we know that each Xi is normally distributed and the sample size is smaller (n = 16). Although we cannot invoke the Central Limit Theorem because of the small sample size, we still have a distribution for X̄ because of the result in this section: X̄ ~ Normal(μ, σ²/n) given that (X1, X2, …, Xn) is a random sample from a Normal(μ, σ²) distribution.
EXERCISE # 5
An electrical firm manufactures electric light bulbs that have a length of
life that is normally distributed with mean and standard deviation equal
to 500 and 50 hours, respectively. Find the probability that a random
sample of 16 bulbs will have an average life greater than 475 hours.

P(X̄ > 475) = P((X̄ − μ)/(σ/√n) > (475 − 500)/√(50²/16))
= P(Z > −2)
= 1 − P(Z ≤ −2)
= 1 − 0.0228
= 0.9772
Sampling Distribution of
Some Common Statistics
Sampling Distributions of Statistics
based on a random sample from a Normal Distribution

Statistic: Z = (X̄ − μ)/(σ/√n)
Sampling Distribution: standard normal distribution
Parameters: mean = 0, variance = 1

Statistic: T = (X̄ − μ)/(S/√n)
Sampling Distribution: t-distribution
Parameter: degrees of freedom v = n − 1

Just familiarize yourself with the form of these statistics!


You will meet these in the next modules!
Sampling
Distributions
END OF CHAPTER 7
