
Biostatistics & Research Methodology

2021/22

Block 3 – Lecture 3

Sampling Distribution
Aktham Osama Abdulazeez, MBChB
Estimators & Parameters

Statistic     Population Parameter    Sample Estimator
Mean          µ                       X̄
SD            σ                       s
Proportion    P                       P̃
Random Sampling

In statistics, a population represents the entire group of
individuals in whom we are interested.

It is usually difficult to study the whole population because:
• It is costly.
• It needs time and human resources.
• In some cases it may be impossible because the population is
hypothetical, e.g. patients who may receive treatment in the future.
Sampling Distribution
• Data is collected from a sample of individuals who are
representative of the population (have similar characteristics
to the individuals in the population).
• The sample is used to draw conclusions (make inferences)
about the population.
• The information in the sample may not fully reflect what is
true in the population.
• Sampling error is produced by studying only some of the
population.
• We need to quantify this error in statistical work.

• On taking repeated samples of the same size from a
population, it is unlikely that the estimates of the population
parameter would be exactly the same in each sample.
• However, the estimates should all be close to the true value
of the parameter in the population and the estimates
themselves should be similar to each other.
• By quantifying the variability of these estimates, we can
assess the sampling error.
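The idea of repeated sampling can be illustrated with a small simulation. This is only a sketch: the population (values with mean about 100 and SD about 15), the sample size of 50 and the helper name sample_means are all assumptions made for illustration, not part of the lecture.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100,000 values with mean about 100 and SD about 15.
population = [random.gauss(100, 15) for _ in range(100_000)]
true_mean = statistics.fmean(population)


def sample_means(pop, n, repeats):
    """Draw `repeats` simple random samples of size n and return their means."""
    return [statistics.fmean(random.sample(pop, n)) for _ in range(repeats)]


estimates = sample_means(population, n=50, repeats=1000)

# The estimates differ from sample to sample but cluster around the true mean.
spread = statistics.stdev(estimates)
print(f"true mean            : {true_mean:.2f}")
print(f"mean of the estimates: {statistics.fmean(estimates):.2f}")
print(f"SD of the estimates  : {spread:.2f}")
```

The SD of the estimates printed at the end is the quantity used to describe the sampling error.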
Sampling Distribution of the Mean

• Suppose repeated samples of size n are taken from the
population.
• The mean is estimated in each sample.
• A histogram of these estimated means would show their
distribution (the sampling distribution of the mean).
• If the sample size is reasonably large, the estimates of the
mean follow an approximately normal distribution, whatever the
distribution of the original data in the population (central
limit theorem).
• If the sample size is small, the estimates of the mean follow a
normal distribution provided that the data in the population
follow a normal distribution.
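The effect of sample size on the shape of the sampling distribution can be checked with a rough simulation. The sketch below assumes a strongly skewed (exponential) population and uses skewness as a simple indicator of departure from normality; the population, the sample sizes 2 and 50, and the helper names are illustrative choices only.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible


def skewness(values):
    """Skewness coefficient: about 0 for a symmetric distribution such as the normal."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


def means_of_samples(n, repeats=5000):
    """Means of `repeats` samples of size n from an exponential population (mean 1)."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(repeats)
    ]


skew_small = skewness(means_of_samples(2))    # small samples: still clearly skewed
skew_large = skewness(means_of_samples(50))   # larger samples: much closer to symmetric

print(f"skewness of sample means, n = 2 : {skew_small:.2f}")
print(f"skewness of sample means, n = 50: {skew_large:.2f}")
```

With the non-normal population assumed here, the means of small samples stay skewed, while the means of larger samples are much closer to the symmetric shape of a normal distribution.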
• The mean of the estimates equals the true population mean.
• The variability of the distribution is measured by the standard
deviation of the estimates, known as the standard error of the
mean (SE).
• A large SE indicates that the estimate is imprecise.
• A small SE indicates that the estimate is precise.
• Reducing the SE (obtaining more precise mean estimates) is
achieved by:
• Increasing the size of the sample.
• Studying less variable data.

• If you need to draw conclusions about the spread and
variability of the data, standard deviation is what you’ll need
to use.
• If you’re interested in finding how precise the sample mean is
or you’re testing the differences between two means,
standard error is your metric.
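The difference between the two measures can be seen in a short calculation. The sketch below uses the usual estimate SE = SD/√n; the eight measurements and the function name standard_error are made up for illustration.

```python
import math
import statistics

# Hypothetical measurements (illustrative values only, not from the lecture).
sample = [182.1, 190.4, 176.8, 188.0, 185.3, 179.9, 193.2, 184.6]


def standard_error(sd, n):
    """Standard error of the mean for a given SD and sample size."""
    return sd / math.sqrt(n)


n = len(sample)
sd = statistics.stdev(sample)   # spread of the individual observations
se = standard_error(sd, n)      # precision of the sample mean

print(f"mean = {statistics.fmean(sample):.2f}")
print(f"SD   = {sd:.2f}  (describes variability of the data)")
print(f"SE   = {se:.2f}  (describes precision of the mean)")

# With the same SD, four times as many observations halves the SE.
print(f"SE with 4n observations = {standard_error(sd, 4 * n):.2f}")
```

The last line shows why a larger sample gives a more precise mean: the SD of the data is unchanged, but the SE shrinks with √n.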

When sampling is from a normally distributed population, the
distribution of the sample means will possess the following
properties:
• The distribution of the sample means (X̄) will be normal.
• The mean of the sample means, μ_X̄, will be equal to the mean
of the underlying population from which these samples were
drawn.
• The standard deviation of these means will be:
σ/√n = SE
Calculation of SE Depending on Estimator
Distribution of Sample Mean
(Z score of the SE)
Properties of the sampling distribution of the mean:
• Random
• Has a mean of 𝜇
• Has a standard error of σ/√n.
• Is approximately normally distributed for large samples.
• Is normally distributed for all sample sizes if the variable X
is normal.

Z = (x̄ − μ) / (σ / √n)
Example 1
The cranial length in a certain large human population is
normally distributed with a mean of 185.6 mm and a standard
deviation of 12.7 mm.
What is the probability that a random sample of size 10 from
this population will have a mean greater than 190 mm?
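One way to work through Example 1 is to convert the sample mean to a Z score using SE = σ/√n and read off the upper-tail area of the standard normal distribution. The sketch below does this with Python's standard library; the helper name upper_tail is an assumption, and math.erfc is used in place of a printed Z table.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


mu = 185.6      # population mean (mm)
sigma = 12.7    # population standard deviation (mm)
n = 10          # sample size
x_bar = 190.0   # sample mean of interest (mm)

se = sigma / math.sqrt(n)   # standard error of the mean
z = (x_bar - mu) / se       # Z score of the sample mean
p = upper_tail(z)           # P(sample mean > 190)

print(f"SE = {se:.3f} mm, Z = {z:.2f}, P = {p:.4f}")
```

Z comes out at about 1.10, so the probability is roughly 0.14.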
Distribution of Difference of 2 Means
Example 2
The level of vitamin A in the liver of two human populations is
normally distributed. The variance of population 1 is 19600 units²
and that of population 2 is 8100 units².
If there is no difference in the population means, what is the
probability that the difference in means between two samples
(n1 = 15, n2 = 10) drawn at random is equal to or greater than
50 units?
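Example 2 can be sketched in the same way. The calculation below assumes the standard formula for the SE of a difference between two independent means, √(σ1²/n1 + σ2²/n2), and reads the question as one-sided (mean of sample 1 minus mean of sample 2 at least 50). Both of these are assumptions about how the example is meant to be solved, and upper_tail is a hypothetical helper name.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


var1, n1 = 19600.0, 15   # population 1 variance and sample size
var2, n2 = 8100.0, 10    # population 2 variance and sample size
true_diff = 0.0          # no difference between the population means
observed_diff = 50.0     # difference in sample means of interest

se_diff = math.sqrt(var1 / n1 + var2 / n2)   # SE of the difference in means
z = (observed_diff - true_diff) / se_diff
p = upper_tail(z)                            # one-sided probability (assumed)

print(f"SE = {se_diff:.2f}, Z = {z:.2f}, P = {p:.4f}")
```

Under these assumptions Z is about 1.09 and the probability is roughly 0.14. A two-sided reading (a difference of 50 in either direction) would double it.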
Distribution of Sample Proportion
• If the population has a proportion of p, then random samples
of the same size drawn from the population will have sample
proportions close to p.
• The sample proportions are approximately normally distributed
when the sample size is large.
Example 3
Suppose that in a certain human population the prevalence of
color blindness is 8%.
If we randomly select 150 individuals from this population, what
is the probability that the prevalence in the sample is as great
as 15%?
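A possible working of Example 3 uses the normal approximation with the standard formula for the SE of a sample proportion, √(p(1 − p)/n). That formula and the helper name upper_tail are assumptions of this sketch, and no continuity correction is applied.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


p_pop = 0.08      # population prevalence of color blindness
n = 150           # sample size
p_sample = 0.15   # sample prevalence of interest

se = math.sqrt(p_pop * (1 - p_pop) / n)   # SE of the sample proportion
z = (p_sample - p_pop) / se
prob = upper_tail(z)                      # P(sample proportion >= 0.15)

print(f"SE = {se:.4f}, Z = {z:.2f}, P = {prob:.4f}")
```

Z is about 3.16, so a sample prevalence this high would be very unlikely (probability under 0.001).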
Distribution of Difference of 2 Proportions
Example 4
A psychiatric social worker believes that in both community A
and community B, the proportion of adolescents suffering from
some mental or emotional problem is 20%.
In a sample of 150 adolescents from community A, 15 had a
mental or emotional problem. In a sample of 100 from
community B, the number was 16.
What is the probability of observing a difference this large or
larger?
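Example 4 can be sketched with the standard SE of a difference between two independent proportions, √(p1(1 − p1)/n1 + p2(1 − p2)/n2), taking both population proportions as 0.20. The sketch assumes a one-sided question with the difference taken as community B minus community A; the formula, the direction and the helper name upper_tail are assumptions rather than details given in the lecture.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


p_a = p_b = 0.20        # believed population proportions
n_a, n_b = 150, 100     # sample sizes
x_a, x_b = 15, 16       # adolescents with a problem in each sample

phat_a = x_a / n_a              # 0.10
phat_b = x_b / n_b              # 0.16
observed_diff = phat_b - phat_a
true_diff = p_b - p_a           # 0 under the stated belief

se_diff = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
z = (observed_diff - true_diff) / se_diff
prob = upper_tail(z)            # one-sided probability (assumed)

print(f"difference = {observed_diff:.2f}, SE = {se_diff:.4f}, "
      f"Z = {z:.2f}, P = {prob:.4f}")
```

Under these assumptions Z is about 1.16 and the probability is roughly 0.12.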
Thank You
Questions? Ask in the group or note it down for our next
live session!
Hope to see you next time ☺
