Slides Chp04 Stats 20221

biometry – bio220
chapter 4
estimation
estimation
• estimating population characteristics from
samples
• random sample
• uncertainty = imprecision
• how much can we infer about a population
from a limited sample?
estimation
                population     sample
                (parameter)    (estimate)
mean            μ              x̄ or Ȳ
sd              σ              s
proportion      p              p̂
estimation
• population frequency distribution (or
probability distribution)
• sample frequency distribution
population
• variable: length of a gene
• population: 20,290 protein coding genes in
the human genome
population
• parameters
sampling
• if we only had samples with n=100 from this
population, how should we estimate the
population characteristics?
• sample 100 genes randomly, without
replacement
• sampling without replacement = a single
observation can't be chosen twice
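A minimal Python sketch of this sampling step. The population here is simulated (a lognormal stand-in, not the real gene lengths), and the names `population`, `chosen` and `chosen_with_repl` are illustrative assumptions.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical stand-in for the population: 20,290 simulated "gene lengths".
# These are NOT the real human gene lengths; the lognormal shape is an assumption.
population = [rng.lognormvariate(7.5, 0.8) for _ in range(20290)]

n = 100
# Draw positions rather than values, so we can check that no gene is chosen twice.
chosen = rng.sample(range(len(population)), k=n)  # without replacement
sample = [population[i] for i in chosen]

# For contrast: sampling WITH replacement may pick the same gene more than once.
chosen_with_repl = rng.choices(range(len(population)), k=n)

print(len(set(chosen)))            # always n: no repeats
print(len(set(chosen_with_repl)))  # at most n; can be smaller
```

`random.sample` never returns the same position twice, which is what "without replacement" means here; `random.choices` draws each position independently.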
sampling
[figure: sampling without replacement vs. with replacement]
https://www.spss-tutorials.com/simple-random-sampling-what-is-it/
sample
• sample estimates & the random sampling
effect
– note: Ȳ is also frequently denoted as x̄
– how to report Ȳ, given uncertainty?
sampling distribution
• probability distribution of all possible sample
estimates
• e.g. all possible sample mean estimates Ȳ
• i.e., the “population” of sample means
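A sampling distribution can be approximated by simulation. The sketch below assumes a simulated population (lognormal, not the real gene data) and uses 2,000 repeated samples in place of "all possible" samples; `sample_mean` is a hypothetical helper.

```python
import random
import statistics

rng = random.Random(2)  # fixed seed for reproducibility

# Hypothetical population of 20,290 simulated lengths (NOT the real gene data).
population = [rng.lognormvariate(7.5, 0.8) for _ in range(20290)]

def sample_mean(n):
    """Mean of one random sample of size n, drawn without replacement."""
    return statistics.fmean(rng.sample(population, k=n))

# We cannot enumerate all possible samples, so 2,000 repeated samples
# serve as an approximation of the sampling distribution of the mean.
sample_means = [sample_mean(100) for _ in range(2000)]

print("population mean:", round(statistics.fmean(population), 1))
print("smallest / largest sample mean:",
      round(min(sample_means), 1), round(max(sample_means), 1))
```

Each entry of `sample_means` is one possible estimate Ȳ; the list as a whole plays the role of the "population" of sample means.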
• http://www.zoology.ubc.ca/~whitlock/kingfisher/SamplingNormal.htm
• all estimates have a sampling distribution
• usually we cannot determine the sampling
distribution directly, but statistical theory
allows us to predict its properties
sampling distribution – its mean
• property 1:
• Ȳ is an unbiased estimate of μ
• the average of sampling distribution of Ȳ will
be μ itself
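Property 1 can be checked by simulation. This is a sketch under assumptions: the population is simulated (lognormal, not the real gene data), so μ is known only because we generated the population ourselves.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed for reproducibility

# Hypothetical population (NOT the real gene data).
population = [rng.lognormvariate(7.5, 0.8) for _ in range(20290)]
mu = statistics.fmean(population)  # the parameter

# 2,000 sample means, each from a random sample of n = 100.
sample_means = [statistics.fmean(rng.sample(population, k=100))
                for _ in range(2000)]
mean_of_means = statistics.fmean(sample_means)

# Relative distance between the average estimate and the parameter.
relative_gap = abs(mean_of_means - mu) / mu

print("mu:", round(mu, 1))
print("average of sample means:", round(mean_of_means, 1))
print("relative gap:", round(relative_gap, 4))
```

Individual sample means miss μ in both directions, but their average should land very close to it; that is what "unbiased" means.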
sampling distribution – its shape
• property 2:
• the sampling distribution's shape is approximately
normal, irrespective of the population's distribution
• as long as the sample size is large enough
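One way to see property 2 is to start from a strongly right-skewed population and watch the skewness of the sample means shrink toward 0 (the value for a normal distribution) as n grows. The population and the `skewness` helper below are illustrative assumptions.

```python
import random
import statistics

def skewness(values):
    """Moment-based skewness: 0 for a symmetric distribution such as the normal."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - m) / sd) ** 3 for v in values])

rng = random.Random(4)  # fixed seed for reproducibility

# Hypothetical right-skewed population (NOT the real gene data).
population = [rng.lognormvariate(7.5, 0.8) for _ in range(20290)]

skew_by_n = {}
for n in (2, 10, 100):
    means = [statistics.fmean(rng.sample(population, k=n)) for _ in range(2000)]
    skew_by_n[n] = skewness(means)

print("population skewness:", round(skewness(population), 2))
for n, sk in skew_by_n.items():
    print(f"n = {n:>3}: skewness of sample means = {sk:.2f}")
```

How large n must be depends on how far the population is from normal; a heavily skewed population needs a larger n than a symmetric one.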
sampling distribution - its variance
• property 3: the variance is inversely proportional
to sample size (σ²/n)
sampling distribution - its variance
higher sample size (n)
→ more info (random noise cancels out)
→ higher precision
→ lower variance
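The link between sample size and the variance of the sampling distribution can be sketched by simulation. The population is again simulated (an assumption, not the real gene data), and the comparison value σ²/n ignores the small correction for sampling without replacement from a finite population.

```python
import random
import statistics

rng = random.Random(5)  # fixed seed for reproducibility

# Hypothetical population (NOT the real gene data).
population = [rng.lognormvariate(7.5, 0.8) for _ in range(20290)]
sigma2 = statistics.pvariance(population)  # population variance

variance_by_n = {}
for n in (10, 40, 160):
    means = [statistics.fmean(rng.sample(population, k=n)) for _ in range(2000)]
    variance_by_n[n] = statistics.pvariance(means)
    print(f"n = {n:>3}: variance of sample means = {variance_by_n[n]:.0f}, "
          f"sigma^2 / n = {sigma2 / n:.0f}")
```

Each fourfold increase in n should cut the variance of the sample means to roughly a quarter.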
standard error
• standard error = standard deviation of a
sampling distribution
• it contains information about the precision of
the sample estimate
– remember: precision ≠ accuracy
• we can use this to calculate confidence
intervals for the population parameter
standard error of the mean
• the parameter: σȲ = σ / √n
• but we usually do not know σ, only s
• the estimate: SEȲ = s / √n
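A short worked example of the estimated standard error of the mean, s / √n. The eight values are made up for illustration and are not taken from the gene data.

```python
import math
import statistics

# Hypothetical sample of 8 gene lengths in kb (illustrative values only).
sample = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(sample)
y_bar = statistics.fmean(sample)  # sample mean
s = statistics.stdev(sample)      # sample s.d. (n - 1 in the denominator)
se_mean = s / math.sqrt(n)        # estimated standard error of the mean

print(f"mean = {y_bar:.2f}, s = {s:.3f}, SE = {se_mean:.3f}")
```

Note that `statistics.stdev` is the sample standard deviation s; `statistics.pstdev` would treat the data as a whole population.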
confidence interval
• range around a sample estimate
• indicates uncertainty for the estimate
• the population parameter should lie inside, with a
certain probability
• 95% CI for the mean: a likely range for the
true population mean
confidence interval
• rough approximation to the 95% CI:
Ȳ ± 2 SEȲ
• this is based on the fact that sampling
distributions are approximately normal
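The 2 SE rule of thumb is easy to apply to a single sample. In this sketch, `rough_ci95` is a hypothetical helper and the ten lengths are made-up values.

```python
import math
import statistics

def rough_ci95(sample):
    """Approximate 95% CI for the mean using the 2 SE rule of thumb."""
    y_bar = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return y_bar - 2 * se, y_bar + 2 * se

# Hypothetical sample of gene lengths in nucleotides (illustrative values only).
sample = [1200, 3400, 800, 2600, 1900, 4100, 1500, 2300, 2900, 1700]

lower, upper = rough_ci95(sample)
print(f"{lower:.1f} < mu < {upper:.1f}")
```

With only ten observations the normal approximation is crude; the rule is better suited to larger samples such as n = 100.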
confidence interval
• remember that in
normal distributions:
• Ȳ ± 2 s.d. ~ 95% of the data
• the sampling
distribution is also
normal
(figure source: www.mathsisfun.com)
confidence interval
• 95% CI: 2121.4 < μ < 2702.2
• “we are 95% confident that the population
mean lies within this range”
• http://www.zoology.ubc.ca/~whitlock/kingfisher/CIMean.htm
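What "95% confident" means can be sketched by simulation: build the interval from many independent samples and count how often it contains μ. The population is simulated (an assumption, not the real gene data) and `rough_ci95` is a hypothetical helper based on the 2 SE rule.

```python
import math
import random
import statistics

rng = random.Random(6)  # fixed seed for reproducibility

# Hypothetical population (NOT the real gene data).
population = [rng.lognormvariate(7.5, 0.8) for _ in range(20290)]
mu = statistics.fmean(population)

def rough_ci95(sample):
    """Approximate 95% CI for the mean: estimate plus or minus 2 SE."""
    y_bar = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return y_bar - 2 * se, y_bar + 2 * se

reps = 2000
hits = 0
for _ in range(reps):
    lower, upper = rough_ci95(rng.sample(population, k=100))
    if lower < mu < upper:
        hits += 1

coverage = hits / reps
print(f"{coverage:.1%} of the intervals contain mu")
```

The confidence level describes the procedure: roughly 95% of intervals built this way capture μ. Coverage can fall slightly short of 95% when the population is strongly skewed, as this simulated one is.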
confidence interval
• why don't we use a 99.99999% confidence
interval?
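One way to explore this question is to compute how wide the interval becomes as the confidence level rises. The sketch assumes a normal sampling distribution, and the values of `y_bar` and `se` are backed out from the interval 2121.4 < μ < 2702.2 on the assumption that it was built with the 2 SE rule; `normal_multiplier` is a hypothetical helper.

```python
from statistics import NormalDist

def normal_multiplier(confidence):
    """Number of standard errors on each side of the estimate (normal approximation)."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)

# Assumed values: if 2121.4 < mu < 2702.2 came from the 2 SE rule,
# the estimate is the midpoint (2411.8) and the SE is a quarter of the width (145.2).
y_bar, se = 2411.8, 145.2

for confidence in (0.95, 0.99, 0.9999999):
    z = normal_multiplier(confidence)
    print(f"{confidence:.5%} CI: {y_bar - z * se:.1f} < mu < {y_bar + z * se:.1f}")
```

Higher confidence is bought with a wider, less informative interval: the multiplier is just under 2 at 95% and above 5 at 99.99999%.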
the logic of estimation
• in real life, we don't have access to
populations
• = in real life, we cannot calculate sampling
distributions
• usually, we only have data from a single
sample, with specific n, mean and sd
• so what can we estimate about a population
based on this single sample?
• here we studied the behavior of sampling
distributions with simulations, in order to
understand the logic/theory behind the
statistical tools we use for estimation
• i.e. to understand the logic of why a 95% CI
can be constructed as Ȳ ± 2 SEȲ:
• because the sampling distribution mean is the
pop mean
• because the sampling distribution s.d. is the
pop s.d./sqrt(n)
• because the sampling distribution has a
normal shape when n is large enough
• we learned how samples from a population
behave in general
• based on this, we can make reliable inferences
about populations from a single sample
• sample → statistics → population
questions / to read
• solve all problems at the end of the chapter
• read the section about pseudoreplication
(interleaf-2)