
SAMPLING DISTRIBUTIONS

Lecture: 11

Shovon Roy
Lecturer
Department of Economics
Sheikh Hasina University
Lecture Topic

• Definition & Some Terminology.


• Sampling Problem.
• Simple Random Sampling
Statistical Inference: Definition & Some Terminology
Statistics
• Descriptive Statistics
• Statistical Inference

• Statistical Inference: Statistical inference aims to develop estimates and test
hypotheses about a population's characteristics using the information in the
sample. In other words, statistical inference deals with the generalization from
part to whole.
Statistical Inference: Definition & Some Terminology Cont.

• Population: A population is the set of all the elements of interest in a study.


• Sample: A sample is a subset of the population. In other words, a sample is a set of
outcomes selected from the population.
• Parameter: A numerical characteristic of a population is called a parameter, for example
the population mean or the population standard deviation.
• Statistic: The characteristics of a sample given in the form of some summary measure are
called statistics (plural of statistic).

The basic difference between a parameter and a statistic is that a parameter always refers to
the population, whereas a statistic refers to the sample.
Statistical Inference: Definition & Some Terminology Cont.
• Estimator and Estimate: An estimator is the formula (rule) used to estimate a
population parameter. The specific numerical value that an estimator takes for a
particular sample is called an estimate.

• Example (Conducting Information for Manager): A tire manufacturer develops a
new tire designed to provide an increase in mileage over the firm's current line of
tires. To estimate the mean useful life of the new tires, i.e., the number of miles
provided by the new tires, the manufacturer selected a sample of 120 new tires for
testing. The test results provided a sample mean of 36,500 miles. Hence an estimate
of the mean tire mileage for the population of new tires was 36,500 miles.
Statistical Inference: Definition & Some Terminology Cont.

• Note that in the tire mileage example, collecting the data on tire life involves
wearing out each tire tested. Clearly, it is impossible to test every tire in the
population, so a sample is the only realistic way to obtain the desired tire mileage
data. However, it is important to realize that sample results provide only
estimates of the values of the population characteristics. That is, we do not expect
the sample mean of 36,500 miles to exactly equal the mean mileage for all tires in
the population.
Sampling Problem: The Electronic Associates Case Contd.

The director of personnel for Electronic Associates Inc. (EAI) has been assigned the task of
developing a profile of the company's 2500 managers. The characteristics to be identified
include the mean annual salary for the managers and the proportion of managers having
completed the company’s management training program.

Suppose the population mean and standard deviation of the annual salary data are:

Population Mean:
\[ \mu = \frac{\sum x_i}{N} = \$51800 \]

Population Standard Deviation:
\[ \sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}} = \$4000 \]
Statistical Inference: Definition & Some Terminology Cont.
• Furthermore, the data on the training program status show that 1500 of the 2500
managers have completed the training program. So, the proportion of the population
having completed the training program is P = 1500/2500 = 0.60.
• A parameter is a numerical characteristic of a population. Therefore, here μ = $51800,
σ = $4000, and P = 0.60 are parameters of the population of EAI managers.
• If the necessary information on all the EAI managers were not readily available, then
information from a sample could be used to develop estimates of the population
parameters of interest. Suppose that a sample of 30 managers will be used. Let us see how
we can identify a sample of 30 managers:
Simple Random Sampling
Several methods can be used to select a sample from the population; one of the most
common is simple random sampling. The definition of a simple random sample and the
process of selecting a simple random sample depend on whether the population is finite or
infinite.

• Simple Random Sampling with a Finite Population
• Simple Random Sampling with an Infinite Population
Simple Random Sampling Contd.

Sampling from a Finite Population: A simple random sample of size n from a
finite population of size N is a sample selected such that each possible sample
of size n has the same probability of being selected.
Sampling from an Infinite Population: A simple random sample from an infinite
population is a sample selected such that the following conditions are satisfied:
1. Each element selected comes from the same population.
2. Each element is selected independently.
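
The finite-population definition above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the lecture: the manager IDs 1 to 2500, the helper name simple_random_sample, and the seed value are assumptions introduced here. Python's random.sample draws without replacement, so every subset of size n is equally likely.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements so that every size-n subset is equally likely."""
    rng = random.Random(seed)  # seeded generator, so the draw is reproducible
    return rng.sample(population, n)  # sampling without replacement


# Hypothetical IDs 1..2500 standing in for the 2500 EAI managers (assumption).
manager_ids = list(range(1, 2501))
sample_ids = simple_random_sample(manager_ids, 30, seed=11)
print(sorted(sample_ids))
```

Each run with the same seed returns the same 30 IDs; changing the seed gives a different, equally valid simple random sample.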
Simple Random Sampling Contd.

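Inference (estimation and hypothesis testing) uses sample statistics to draw
conclusions about the corresponding population parameters:

| Population Parameter | Sample Statistic |
|---|---|
| ◈ Mean μ | ◇ Mean x̄ |
| ◈ Standard deviation σ | ◇ Standard deviation s |
| ◈ Proportion P | ◇ Proportion p̄ |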
Theory of Estimation
To estimate the unknowns, the usual procedure is to assume that we have a
random sample of size n from the known probability distribution and use the
sample data to estimate the unknown parameters. This process is called the
problem of estimation. The theory of estimation can be divided into two parts
which are shown below:

• Point Estimation
• Interval Estimation
Point Estimation
The aim of a point estimator is to use all the data and prior information to calculate
a value that would be our best guess as to the actual or the true value of the
parameter.
Let us return to the EAI problem. Assume that a simple random sample of 30
managers has been selected and that the corresponding data on annual salary and
management training program participation are shown in the following table:
Point Estimation Contd.
To estimate the value of a population parameter, we compute the corresponding
characteristic of the sample, referred to as a sample statistic. The following sample
statistics are used to estimate the population parameters:

Sample Mean:
\[ \bar{x} = \frac{\sum x_i}{n} = \$51814 \]

Sample Standard Deviation:
\[ s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \$3347.72 \]

Sample Proportion:
\[ \bar{p} = \frac{19}{30} \approx 0.63 \]

By making the preceding computations, we have performed the statistical procedure
called "point estimation".
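
As a rough illustration of these three computations, the sketch below applies the same formulas to a made-up sample. The five salaries and training flags are hypothetical values chosen for easy arithmetic, not the EAI data (the 30-manager table is not reproduced here), and the function name point_estimates is an assumption.

```python
import statistics


def point_estimates(salaries, completed_training):
    """Return (sample mean, sample standard deviation, sample proportion)."""
    n = len(salaries)
    x_bar = statistics.mean(salaries)    # sum(x_i) / n
    s = statistics.stdev(salaries)       # uses the n - 1 divisor
    p_bar = sum(completed_training) / n  # share of True values
    return x_bar, s, p_bar


# Hypothetical five-manager sample (NOT the EAI data).
salaries = [49000, 52000, 51000, 55000, 53000]
completed = [True, False, True, True, False]

x_bar, s, p_bar = point_estimates(salaries, completed)
print(x_bar, round(s, 2), p_bar)
```

Note that statistics.stdev divides by n - 1, matching the formula for s, whereas statistics.pstdev divides by N and corresponds to the population formula for σ.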
Point Estimation Contd.

• In point estimation, we use the sample to compute a value of a sample statistic that
serves as an estimate of a population parameter. Using the terminology of point
estimation, x̄ is referred to as the point estimator of the population mean μ, s as the
point estimator of the population standard deviation σ, and p̄ as the point estimator of
the population proportion P. The actual numerical value obtained for x̄, s, or p̄ in a
particular sample is called the point estimate of the parameter.
• Thus, for the sample of 30 EAI managers, $51814 is the point estimate of μ,
$3347.72 is the point estimate of σ, and 0.63 is the point estimate of P.
Point Estimation Contd.

| Population Parameter | Parameter Value | Point Estimator | Point Estimate |
|---|---|---|---|
| μ = Population mean of annual salary | $51800 | x̄ = Sample mean | $51814 |
| σ = Population standard deviation of annual salary | $4000 | s = Sample standard deviation | $3347.72 |
| P = Population proportion having completed the training program | 0.60 | p̄ = Sample proportion | 0.63 |
Point Estimation Contd.

• The above table shows that none of the point estimates is exactly equal to the
corresponding population parameter. This variation is expected because only a sample,
and not a census of the entire population, is used to develop the estimate.
• The absolute value of the difference between an unbiased point estimate and the
corresponding population parameter is called the sampling error. The sampling errors for
the sample mean, standard deviation, and proportion are:

\[ |\bar{x} - \mu| = |\$51814 - \$51800| = \$14 \]
\[ |s - \sigma| = |\$3347.72 - \$4000| = \$652.28 \]
\[ |\bar{p} - P| = |0.63 - 0.60| = 0.03 \]
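
The sampling errors can be checked with a short sketch like the one below. It uses only the figures quoted in this lecture; the helper name sampling_error is an assumption introduced for illustration.

```python
def sampling_error(point_estimate, parameter):
    """Absolute difference between a point estimate and the population parameter."""
    return abs(point_estimate - parameter)


# Point estimates from the first EAI sample and the known population parameters.
errors = {
    "mean": sampling_error(51814, 51800),
    "std_dev": sampling_error(3347.72, 4000),
    "proportion": sampling_error(0.63, 0.60),
}

for name, err in errors.items():
    print(f"{name}: {err:.2f}")
```

In practice the parameter is unknown, so the sampling error cannot be computed directly; it can be computed here only because the EAI population values are given.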
Introduction to Sampling Distribution Contd.

Let us select another simple random sample of 30 EAI managers. An analysis of the data from the
second sample provides the following information: sample mean x̄ = $52669.70, sample standard
deviation s = $4239.07, and sample proportion p̄ = 0.70.
These results show that different values of x̄, s, and p̄ have been obtained with the second sample.
Let us imagine carrying out the process of selecting a new simple random sample of 30 managers
repeatedly, each time computing the values of x̄, s, and p̄. To illustrate, we repeated the simple
random sampling process for the EAI problem until we obtained 500 samples of 30 managers each and
the corresponding values of x̄, s, and p̄.
Introduction to Sampling Distribution Contd.

| Sample Number | Sample Mean | Sample Standard Deviation | Sample Proportion |
|---|---|---|---|
| 1 | $51814 | $3347.72 | 0.63 |
| 2 | $52670 | $4239.07 | 0.70 |
| ... | ... | ... | ... |
| 500 | $51752.00 | $3857.82 | 0.50 |

Introduction to Sampling Distribution Contd.
We know that a random variable is a numerical description of the outcome of an
experiment. If we consider the process of selecting a simple random sample as an
experiment, the sample mean x̄ is the numerical description of the experiment's
outcome. Thus, x̄ is a random variable. As a result, like other random variables,
x̄ has a mean or expected value, a standard deviation, and a probability
distribution. Because the various possible values of x̄ are the result of
different simple random samples, the probability distribution of x̄ is called the
sampling distribution of x̄.
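
The idea of repeating the sampling experiment can be sketched as a small simulation. Since the actual EAI salary records are not available here, the population below is a synthetic stand-in (an assumption): 2500 values drawn from a normal distribution with mean 51800 and standard deviation 4000. The function name simulate_sample_means and the seeds are also hypothetical.

```python
import random
import statistics


def simulate_sample_means(population, n, repetitions, seed=0):
    """Draw `repetitions` simple random samples of size n; return their means."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, n))  # one experiment -> one x-bar
        for _ in range(repetitions)
    ]


# Synthetic stand-in for the 2500 EAI salaries (ASSUMPTION, not real data).
pop_rng = random.Random(2024)
population = [pop_rng.gauss(51800, 4000) for _ in range(2500)]

sample_means = simulate_sample_means(population, n=30, repetitions=500)
print("mean of the 500 sample means:", round(statistics.mean(sample_means), 2))
print("std. dev. of the 500 sample means:", round(statistics.stdev(sample_means), 2))
```

The 500 values in sample_means play the role of the 500 sample means in the lecture: their average sits close to the population mean, and their spread is much smaller than the spread of individual salaries.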
Introduction to Sampling Distribution Contd.
• Knowledge of this sampling distribution and its properties will enable us to
make probability statements about how close the sample mean 𝒙 ഥ is to the
population mean 𝜇 .
[Figure: relative frequency histogram of the 500 sample means]

| Mean annual salary ($) | Frequency | Relative frequency |
|---|---|---|
| 49,500.00–49,999.99 | 2 | .004 |
| 50,000.00–50,499.99 | 16 | .032 |
| 50,500.00–50,999.99 | 52 | .104 |
| 51,000.00–51,499.99 | 101 | .202 |
| 51,500.00–51,999.99 | 133 | .266 |
| 52,000.00–52,499.99 | 110 | .220 |
| 52,500.00–52,999.99 | 54 | .108 |
| 53,000.00–53,499.99 | 26 | .052 |
| 53,500.00–53,999.99 | 6 | .012 |
| Total | 500 | 1.000 |
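
As a small illustration of how such a frequency table supports statements about how close x̄ tends to be to μ, the sketch below recomputes the relative frequencies and the share of the 500 sample means that fall between $51,000 and $52,499.99. The class limits and frequencies come from the table; the variable and function names are assumptions.

```python
# (lower class limit in dollars, frequency) taken from the frequency table.
frequency_table = [
    (49500, 2), (50000, 16), (50500, 52), (51000, 101), (51500, 133),
    (52000, 110), (52500, 54), (53000, 26), (53500, 6),
]

total = sum(freq for _, freq in frequency_table)
relative = {lower: freq / total for lower, freq in frequency_table}


def share_between(table, low, high):
    """Share of sample means in classes whose lower limit lies in [low, high)."""
    count = sum(freq for lower, freq in table if low <= lower < high)
    return count / sum(freq for _, freq in table)


# Classes 51,000.00-51,499.99, 51,500.00-51,999.99 and 52,000.00-52,499.99.
print(share_between(frequency_table, 51000, 52500))
```

The three central classes hold 101 + 133 + 110 = 344 of the 500 sample means, so most of the simulated values of x̄ lie within a few hundred dollars of μ = $51800.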
