In the last module, we looked at the distributions and statistical tests for
discrete/categorical variables. Now we are going to look at a common
distribution for continuous variables, the normal distribution. Then we will
continue this module with the most basic type of statistical test, the t-test.
I. The Normal Distribution
The normal distribution is the iconic distribution for biological variables. Its bell
shape, with most observations falling close to the mean and fewer observations
falling farther from it, closely approximates the frequency
distribution of many variables we see in nature.
[Figure: bell-shaped normal curve. x-axis: Measurement; y-axis: Relative frequency.]
Note: a normal distribution often arises in nature when many factors each contribute additive effects of roughly equal magnitude (e.g., exam scores, or a trait controlled by many genes).
The normal distribution can be completely described by two parameters:
– mean (µ) - location
– standard deviation (σ) - spread
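According to:
$$f(Y) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(Y-\mu)^2}{2\sigma^2}}$$
[Figure: two normal curves with different means, µ = 10 and µ = 20.]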
Note: µ and σ define a normal distribution. There are infinitely many possible values of σ and µ, and so infinitely many normal distributions. Because they all share the same properties, any of them can be converted into the
standard normal distribution.
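The density formula above can be evaluated directly. The following is a minimal sketch using only the Python standard library. The function name `normal_pdf` and the choice of σ = 1 are illustrative assumptions, not part of the handout.

```python
import math

def normal_pdf(y, mu, sigma):
    """Density at y of a normal distribution with mean mu and standard deviation sigma."""
    coeff = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    exponent = -((y - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

# Changing mu shifts the curve along the axis without changing its shape.
# sigma = 1 is an assumed value chosen only for illustration.
peak_at_10 = normal_pdf(10, mu=10, sigma=1)
peak_at_20 = normal_pdf(20, mu=20, sigma=1)
print(peak_at_10, peak_at_20)
```

With σ = 1 the peak height is about 0.4, and it is the same for both means because only the location changes.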
II. The Standard Normal Distribution
All normal distributions are shaped alike, just with different means and
variances. Any value from a normal distribution can be converted to the standard
normal distribution (mean 0, standard deviation 1) by:
$$Z = \frac{Y - \mu}{\sigma}$$
where Y is a particular value, µ is the population mean of our normal distribution, and σ is the population standard deviation.
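As a small illustration of this conversion, here is a sketch of the Z transformation in Python. The function name `z_score` and the numbers passed to it are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def z_score(y, mu, sigma):
    """Convert a value y from a normal distribution (mu, sigma) to a standard normal deviate."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (y - mu) / sigma

# Hypothetical values: a measurement of 12 from a population with mean 10 and SD 2.
z = z_score(12.0, mu=10.0, sigma=2.0)
print(z)
```

A Z of 1 means the value lies one standard deviation above the population mean.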
We can also apply the standard normal distribution to the distribution of means
sampled from a population. With known population parameters, the distribution of sample means will be normally
distributed with a standard deviation (also known as the standard error of the mean, SEM) of:
$$\sigma_{\bar{Y}} = \frac{\sigma}{\sqrt{n}}$$
Calculating the probability of observing a certain sample mean then becomes a
Z-score calculation that uses the SEM as the standard deviation of the sample statistic:
$$Z = \frac{\bar{Y} - \mu}{\sigma_{\bar{Y}}}$$
where 𝑌̅ is the sample mean, µ is the population mean, and 𝜎𝑌̅ is the SEM.
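The two steps above (compute the SEM, then a Z-score for the sample mean) can be sketched in Python. This assumes the population parameters are known and uses `statistics.NormalDist` from the standard library in place of a printed Z table. The function names and the example numbers are illustrative assumptions.

```python
import math
from statistics import NormalDist

def standard_error(sigma, n):
    """Standard error of the mean for samples of size n."""
    return sigma / math.sqrt(n)

def prob_mean_greater(y, mu, sigma, n):
    """Pr(sample mean > y) when sampling n individuals from Normal(mu, sigma)."""
    z = (y - mu) / standard_error(sigma, n)
    return 1.0 - NormalDist().cdf(z)

# Hypothetical population: mean 100, SD 15, samples of 25 individuals.
sem = standard_error(15, 25)
p = prob_mean_greater(103, mu=100, sigma=15, n=25)
print(sem, p)
```

Here the SEM is 3, so a sample mean of 103 sits one SEM above µ.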
=+0.22597
Z = (Y − µ)/σ, so Y = Z(σ) + µ
For the upper 5% (p = 0.05), Z = 1.65:
Y = 1.65(0.385) + 0.037 = 0.67225
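The note above runs the Z formula in reverse: given a tail probability, find the value Y that cuts it off. A hedged sketch of that step follows, reading 0.385 as the spread and 0.037 as the mean, as the note does. The function name `value_at_upper_tail` is hypothetical, and `NormalDist().inv_cdf` stands in for the table lookup.

```python
from statistics import NormalDist

def value_at_upper_tail(p, mu, sigma):
    """Return Y such that Pr(value > Y) = p under Normal(mu, sigma)."""
    z = NormalDist().inv_cdf(1.0 - p)  # critical Z for an upper tail of size p
    return z * sigma + mu

# Numbers taken from the note above: upper 5%, spread 0.385, mean 0.037.
y_cut = value_at_upper_tail(0.05, mu=0.037, sigma=0.385)
print(y_cut)
```

The result differs slightly from the hand calculation because the exact critical value is a little below the rounded 1.65 used with a printed table.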
2. The following table lists the means and standard deviations of several different
normal distributions. For each, a sample of 10 individuals was taken, as well as
a sample of 30 individuals. For each sample, calculate the probability that the
sample mean is greater than the given value of Y.
Mean   Standard Deviation   Y      n=10, Pr(𝑌̅ > 𝑌)   n=30, Pr(𝑌̅ > 𝑌)
14     5                    15     0.2643             0.13786
15     3                    15.5
-23    4                    -22
For n = 10: Z = (𝑌̅ − µ)/𝜎𝑌̅ = (𝑌̅ − µ)/(σ/√n) = (15 − 14)/(5/√10) = 0.632, so Pr[Z > 0.632] = 0.2643
For n = 30: Z = (15 − 14)/(5/√30) = 1.09, so Pr[Z > 1.09] = 0.13786
What do you notice about the Pr(𝑌̅ > 𝑌) as the sample size increases? Why is
this?
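Note: with a larger sample size, the standard error σ/√n is smaller, so sample means cluster more tightly around µ and Pr(𝑌̅ > 𝑌) decreases for a Y above the mean.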
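The effect of sample size can be checked numerically for the first row of the table (mean 14, standard deviation 5, Y = 15). This is a sketch rather than the handout's own method: it uses `statistics.NormalDist` instead of a Z table, and the sample sizes 100 and 300 are added purely for illustration.

```python
import math
from statistics import NormalDist

def upper_tail_prob_of_mean(y, mu, sigma, n):
    """Pr(sample mean > y) for samples of size n from Normal(mu, sigma)."""
    sem = sigma / math.sqrt(n)
    z = (y - mu) / sem
    return 1.0 - NormalDist().cdf(z)

mu, sigma, y = 14, 5, 15  # first row of the table
probs = {n: upper_tail_prob_of_mean(y, mu, sigma, n) for n in (10, 30, 100, 300)}
for n, p in probs.items():
    print(f"n={n:>3}  SEM={sigma / math.sqrt(n):.3f}  Pr(mean > {y}) = {p:.4f}")
```

The values for n = 10 and n = 30 land close to the table-based answers (small differences come from rounding Z before the lookup), and the probability keeps shrinking as n grows.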
IV. Central Limit Theorem -- The most amazing Central Limit Theorem!
Central Limit Theorem = ‘the mean of a large number of measurements
randomly sampled from a non-normal (or normal) population is approximately
normally distributed’ p. 286
Let’s look at an example:
[Figure: distribution of button pushing times. x-axis: Time (ms). Annotations: "most responses"; "3rd try".]
Now randomly sample from this distribution and calculate the mean:
[Figure: distributions of 1000 sample means for samples of different sizes. Annotations: "starts to look normal"; "normally distributed".]
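The sampling experiment described above can be imitated with a short simulation. The button pushing data are not available here, so this sketch assumes a right-skewed exponential distribution with a mean of 300 ms as a stand-in. The sample sizes, the seed, and the helper names are likewise assumptions. Skewness is used as a rough measure of departure from a symmetric, normal-looking shape.

```python
import random
import statistics

def skewness(values):
    """Population skewness: close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def sample_means(n, rng, num_samples=1000, mean_ms=300.0):
    """Draw num_samples samples of size n and return the mean of each."""
    return [
        statistics.fmean(rng.expovariate(1.0 / mean_ms) for _ in range(n))
        for _ in range(num_samples)
    ]

rng = random.Random(42)  # fixed seed so the run is repeatable
results = {n: skewness(sample_means(n, rng)) for n in (1, 4, 16, 64)}
for n, skew in results.items():
    print(f"sample size {n:>2}: skewness of 1000 sample means = {skew:.2f}")
```

With a sample size of 1 the means simply reproduce the skewed population. As the sample size grows, the skewness of the sample means moves toward 0, which is the pattern the Central Limit Theorem predicts.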