W3 Normal Distribution
SUMMARY
Descriptive statistics can be used to summarize and describe a single
variable (i.e., univariate analysis)
• Frequencies (counts) & Percentages
– Use with discrete (nominal/ordinal) data
• Levels, types, groupings, yes/no, Drug A vs. Drug B
• Means & Standard Deviations
– Use with continuous (interval/ratio) data
• Height, weight, cholesterol, scores on a test
SUMMARY
Frequencies and percentages can be computed for
discrete data
– Examples: Likert Scales (Strongly Disagree to Strongly
Agree); High School/Some College/College Graduate/Graduate
School
[Figure: bar chart of response frequencies (vertical axis 0 to 60) for the categories Strongly Agree, Agree, Disagree, Strongly Disagree.]
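As a minimal sketch of how such a frequency table could be produced, the snippet below counts responses and converts the counts to percentages. The `responses` list is invented for illustration and does not come from the slides.

```python
from collections import Counter

# Hypothetical Likert responses, invented purely for illustration.
responses = [
    "Agree", "Strongly Agree", "Agree", "Disagree", "Agree",
    "Strongly Disagree", "Agree", "Strongly Agree", "Disagree", "Agree",
]

# Fixed category order so the table reads from one end of the scale to the other.
categories = ["Strongly Agree", "Agree", "Disagree", "Strongly Disagree"]

counts = Counter(responses)
total = len(responses)

# Percentage of all responses falling in each category.
percentages = {c: 100 * counts[c] / total for c in categories}

for c in categories:
    print(f"{c:<18} {counts[c]:>3} {percentages[c]:>6.1f}%")
```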
SUMMARY
We can compute frequencies and percentages for continuous
data
– Examples: Temperature, Height, Weight
SUMMARY
Measures of central tendency and measures of dispersion are often computed with continuous data.
In research, means are usually presented along with standard deviations or standard errors.
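A short sketch of these three quantities, assuming the usual definitions (sample standard deviation with n - 1 in the denominator, and standard error of the mean equal to SD divided by the square root of n). The `scores` values are made up for illustration.

```python
import math
import statistics

# Hypothetical test scores, invented for illustration only.
scores = [72, 85, 90, 66, 78, 88, 95, 70]

mean = statistics.mean(scores)
sd = statistics.stdev(scores)        # sample SD (divides by n - 1)
se = sd / math.sqrt(len(scores))     # standard error of the mean

print(f"mean = {mean:.2f}, SD = {sd:.2f}, SE = {se:.2f}")
```

The standard error is always smaller than the standard deviation for n > 1, because it describes the uncertainty of the mean rather than the spread of individual values.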
The Normal Distribution
The distribution of continuous (interval/ratio) data often forms a “bell-shaped” curve.
– Many phenomena in life are approximately normally distributed (age, height,
weight, IQ).
Empirical Rule
THE NORMAL DISTRIBUTION
Important things to note:
ROLE OF NORMALITY
• Many statistical methods require that the numeric variables we are working
with have an approximate normal distribution.
• For example, t-tests, F-tests, and regression analyses all require in some sense that
the numeric variables are approximately normally distributed.
[Figure: standardized normal distribution with empirical rule percentages.]
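The empirical rule is usually stated as roughly 68%, 95%, and 99.7% of values falling within one, two, and three standard deviations of the mean. The sketch below checks those figures with the standard library's `statistics.NormalDist`; the helper name `within` is an illustrative choice.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist(mu=0, sigma=1)

def within(k):
    """Probability that a standard normal value lies within k SDs of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {100 * within(k):.1f}%")
```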
USING THE NORMAL TABLE
• What is P(Z > 1.6)?
[Figure: standard normal curve with the table area P(0 < Z < 1.6) marked between 0 and 1.6.]
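One way to work this out, assuming the table lists areas between 0 and z (as the labeled area suggests). The value 0.4452 is read from a standard normal table, not from the slides.

```latex
\begin{aligned}
P(Z > 1.6) &= 0.5 - P(0 < Z < 1.6) \\
           &= 0.5 - 0.4452 \\
           &= 0.0548
\end{aligned}
```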
USING THE NORMAL TABLE
• What is P(Z < -2.23)?
[Figure: standard normal curve with -2.23, 0, and 2.23 marked; the table area is P(0 < Z < 2.23).]
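A possible working, using the symmetry of the curve and assuming a table of areas between 0 and z. The value 0.4871 is read from a standard normal table, not from the slides.

```latex
\begin{aligned}
P(Z < -2.23) &= P(Z > 2.23) \\
             &= 0.5 - P(0 < Z < 2.23) \\
             &= 0.5 - 0.4871 \\
             &= 0.0129
\end{aligned}
```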
USING THE NORMAL TABLE
• What is P(Z < 1.52)?
[Figure: standard normal curve with 0 and 1.52 marked.]
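Here the whole left half of the curve is included, so 0.5 is added to the table area. This assumes a table of areas between 0 and z; the value 0.4357 is read from a standard normal table, not from the slides.

```latex
\begin{aligned}
P(Z < 1.52) &= 0.5 + P(0 < Z < 1.52) \\
            &= 0.5 + 0.4357 \\
            &= 0.9357
\end{aligned}
```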
USING THE NORMAL TABLE
• What is P(0.9 < Z < 1.9)?
[Figure: standard normal curve with 0, 0.9, and 1.9 marked; the table area P(0 < Z < 0.9) is labeled.]
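An area between two positive z-values can be found as the difference of two table areas. This assumes a table of areas between 0 and z; the values 0.4713 and 0.3159 are read from a standard normal table, not from the slides.

```latex
\begin{aligned}
P(0.9 < Z < 1.9) &= P(0 < Z < 1.9) - P(0 < Z < 0.9) \\
                 &= 0.4713 - 0.3159 \\
                 &= 0.1554
\end{aligned}
```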
APPLICATION OF NORMAL DISTRIBUTION
• We can use the following formula to convert any normal random variable X, with
mean μ and standard deviation σ, to a standard normal random variable (z-score):
Z = (X − μ) / σ
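The standardization is conventionally written z = (x − μ) / σ. A minimal sketch follows; the function name `z_score` and the example numbers are illustrative, not from the slides.

```python
def z_score(x, mu, sigma):
    """Standardize x from a Normal(mu, sigma) variable: z = (x - mu) / sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical example: a value of 130 on a scale with mean 100 and SD 15.
print(z_score(130, 100, 15))  # 2.0
```

A z-score of 2.0 means the value lies two standard deviations above the mean, so it can be looked up directly in a standard normal table.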
APPLICATION
The estimated time needed to complete a statistics assignment is
normally distributed with a mean of 45 minutes and a standard deviation
of 7 minutes.
a) What percentage of students take more than 55 minutes to
complete the statistics assignment?
b) What percentage of students complete the statistics assignment in
between 40 and 50 minutes?
c) What is the probability that a student completes the statistics
assignment in less than 20 minutes?
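One way to check answers to the assignment-time problem (mean 45 minutes, standard deviation 7 minutes) is to compute the areas directly. This is a sketch using `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

# Completion time in minutes, as given in the problem: mean 45, SD 7.
time = NormalDist(mu=45, sigma=7)

# a) Share of students taking more than 55 minutes.
p_over_55 = 1 - time.cdf(55)

# b) Share finishing between 40 and 50 minutes.
p_40_to_50 = time.cdf(50) - time.cdf(40)

# c) Probability of finishing in under 20 minutes.
p_under_20 = time.cdf(20)

print(f"a) {100 * p_over_55:.2f}%")
print(f"b) {100 * p_40_to_50:.2f}%")
print(f"c) {p_under_20:.5f}")
```

The results come out at roughly 7.7%, 52.5%, and 0.0002. Answers read from a printed table may differ slightly in the last digit because z is rounded to two decimals there.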
EXERCISE
Boxplot
Visual inspection
• Look for a “bell-shape”. Severe skewness and/or outliers are indications of non-normality.
• Histograms are not useful for small sample sizes, as it is difficult to get a clear
picture of the distribution.
BOXPLOT
It is hard to assess normality using a boxplot. At the very least, look for
symmetry. Severe skewness and/or outliers are indications of non-normality.
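The quantities a boxplot draws can also be inspected numerically. The sketch below assumes the common convention that points more than 1.5 × IQR beyond the quartiles are flagged as outliers; that rule and the sample data are assumptions, not taken from the slides.

```python
import statistics

# Hypothetical sample, invented for illustration; 41 is a deliberate outlier.
data = [12, 14, 15, 15, 16, 17, 18, 19, 20, 41]

q1, median, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Assumed convention: points beyond 1.5 * IQR from the quartiles are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

# Rough symmetry check: in a symmetric sample the median sits about midway
# between the quartiles, so these two distances are similar.
lower_half = median - q1
upper_half = q3 - median

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}")
print(f"outliers: {outliers}")
print(f"median - Q1 = {lower_half}, Q3 - median = {upper_half}")
```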
NORMAL QUANTILE PLOT
(NORMAL PROBABILITY PLOT)
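A normal quantile plot pairs each sorted observation with the value expected at the same position under a normal distribution; points close to a straight line suggest approximate normality. The sketch below computes those pairs without plotting them. The sample data and the plotting positions (i − 0.5) / n are assumptions chosen for illustration.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical sample, invented for illustration.
data = [4.1, 5.3, 4.8, 5.9, 5.1, 4.6, 5.5, 5.0, 4.4, 5.7]

n = len(data)
ordered = sorted(data)

# Normal distribution with the same mean and SD as the sample.
fitted = NormalDist(mean(data), stdev(data))

# Plotting positions (i - 0.5) / n are one common convention, assumed here.
theoretical = [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# Each pair is one point on the plot: (theoretical quantile, observed value).
for t, x in zip(theoretical, ordered):
    print(f"{t:6.2f}  {x:6.2f}")
```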
SKEWNESS AND KURTOSIS
• For both measures, a perfectly normal
distribution should return a score of 0 (Rose et
al., 2015)
– A positive skewness value indicates positive (right)
skew; a negative value indicates negative (left) skew.
The higher the absolute value, the greater the skew.
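– A positive kurtosis value indicates heavier tails and a sharper peak than
the normal curve; a negative value indicates lighter tails and a flatter peak.
The higher the absolute value, the greater the departure from normal kurtosis.
– Rule of thumb
• Divide either score by its standard error; if the absolute value of the result is
greater than 1.96, the data depart significantly from normality (at the 5% level).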
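The rule of thumb can be sketched as follows. The slides do not say which estimators or standard-error formulas are intended, so this assumes the small-sample adjusted skewness and excess kurtosis (normal = 0) with their usual standard errors, as reported by common statistics packages. The function name and sample data are illustrative.

```python
import math

def skew_kurtosis_z(data):
    """Return (skewness, skewness / SE, excess kurtosis, kurtosis / SE)."""
    n = len(data)
    mean = sum(data) / n
    # Central moments (dividing by n).
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    g1 = m3 / m2 ** 1.5          # moment skewness
    g2 = m4 / m2 ** 2 - 3        # moment excess kurtosis (normal = 0)
    # Small-sample adjusted estimators (assumed convention).
    skew = math.sqrt(n * (n - 1)) / (n - 2) * g1
    kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
    # Standard errors under normality.
    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2 * se_skew * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return skew, skew / se_skew, kurt, kurt / se_kurt

# Hypothetical right-skewed sample, invented for illustration.
sample = [1, 1, 2, 2, 2, 3, 3, 4, 5, 9, 20]
skew, z_skew, kurt, z_kurt = skew_kurtosis_z(sample)

print(f"skewness = {skew:.2f}, z = {z_skew:.2f}")
print(f"kurtosis = {kurt:.2f}, z = {z_kurt:.2f}")
print("departs from normality" if max(abs(z_skew), abs(z_kurt)) > 1.96
      else "no evidence against normality")
```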
STATISTICAL TESTS
• Statistical tests for normality are more precise than visual inspection, since actual probabilities are calculated.
• Tests for normality calculate the probability of obtaining a sample at least this far
from normal if it had been drawn from a normal population (the p-value).
• The hypotheses used are:
H0: The sample data are not significantly different from a normal population.
Ha: The sample data are significantly different from a normal population.
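The slides do not name a specific test. As one illustration, the sketch below implements the Jarque-Bera test, chosen here only because it can be written with the standard library; it relies on a large-sample approximation, and other tests (for example Shapiro-Wilk) are often preferred in practice. The function name and the simulated samples are assumptions.

```python
import math
import random

def jarque_bera(data):
    """Return the Jarque-Bera statistic and its large-sample p-value."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3
    jb = n / 6 * (skew ** 2 + excess_kurt ** 2 / 4)
    # Under H0 the statistic is approximately chi-square with 2 degrees of
    # freedom, whose upper-tail probability is exp(-x / 2).
    p_value = math.exp(-jb / 2)
    return jb, p_value

random.seed(0)
# Simulated samples, for illustration only.
normal_sample = [random.gauss(50, 10) for _ in range(500)]
skewed_sample = [random.expovariate(1.0) for _ in range(500)]

jb_normal, p_normal = jarque_bera(normal_sample)
jb_skewed, p_skewed = jarque_bera(skewed_sample)

for name, p in (("normal sample", p_normal), ("skewed sample", p_skewed)):
    decision = "reject H0" if p < 0.05 else "do not reject H0"
    print(f"{name}: p = {p:.4g} -> {decision}")
```

A p-value below the chosen significance level (0.05 here) leads to rejecting H0, meaning the sample differs significantly from a normal population.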
Option 2 is to leave your data non-normal and conduct non-parametric
tests.