
NORMAL DISTRIBUTION
SUMMARY
Descriptive statistics can be used to summarize and describe a single
variable (aka, UNIvariate)
• Frequencies (counts) & Percentages
– Use with discrete (nominal/ordinal) data
• Levels, types, groupings, yes/no, Drug A vs. Drug B
• Means & Standard Deviations
– Use with continuous (interval/ratio) data
• Height, weight, cholesterol, scores on a test
SUMMARY
Frequencies and percentages can be computed for
discrete data
– Examples: Likert Scales (Strongly Disagree to Strongly
Agree); High School/Some College/College Graduate/Graduate
School
[Figure: bar chart of response frequencies (0 to 60) for Strongly Agree, Agree, Disagree and Strongly Disagree]
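As a rough illustration of how frequencies and percentages are tallied for discrete data, here is a minimal Python sketch. The responses and the helper name `frequency_table` are made up for this example and do not come from the slides.

```python
from collections import Counter

# Hypothetical Likert responses (invented for illustration only).
responses = [
    "Agree", "Strongly Agree", "Agree", "Disagree", "Agree",
    "Strongly Disagree", "Agree", "Strongly Agree", "Disagree", "Agree",
]


def frequency_table(values):
    """Return {category: (count, percentage)} for a list of discrete values."""
    counts = Counter(values)
    total = len(values)
    return {category: (n, 100.0 * n / total) for category, n in counts.items()}


table = frequency_table(responses)
for category, (count, percent) in table.items():
    print(f"{category:<18} {count:>3} {percent:6.1f}%")
```

The same counts are what a bar chart of the responses would display.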
SUMMARY
We can also compute frequencies and percentages for continuous data, typically after grouping the values into intervals
– Examples: Temperature, Height, Weight
SUMMARY
Measures of central tendency and measures of dispersion are often computed with continuous data.

• Measures of Central Tendency (aka, the “Middle Point”)
– Mean, Median, Mode
– If your frequency distribution shows outliers, you might want to use the median instead of the mean

• Measures of Dispersion (aka, how “spread out” the data are)
– Variance, standard deviation, standard error of the mean
– Describe how “spread out” a distribution of scores is
– High values of variance and standard deviation mean that scores are “all over the place” and do not necessarily fall close to the mean

In research, means are usually presented along with standard deviations or standard errors.
The Normal Distribution
The distribution of continuous data (interval-ratio) often forms a “bell shaped” curve.
– Many phenomena in life are approximately normally distributed (age, height, weight, IQ).
[Figure: Empirical Rule]
THE NORMAL DISTRIBUTION
Important things to note:
• The normal distribution is fully defined by two parameters: its mean and its standard deviation
• The normal distribution is bell shaped and symmetrical about the mean
• Normal distributions range from minus infinity to plus infinity
STANDARD NORMAL DISTRIBUTION
• A normal distribution whose mean is zero and standard deviation is one is called
the standard normal distribution.
[Figure: standard normal curve, mean 0 and standard deviation 1]

ROLE OF NORMALITY
• Many statistical methods require that the numeric variables we are working
with have an approximate normal distribution.
• For example, t-tests, F-tests, and regression analyses all require in some sense that
the numeric variables are approximately normally distributed.

[Figure: standardized normal distribution with empirical rule percentages.]
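The empirical rule percentages (about 68%, 95% and 99.7% of values within 1, 2 and 3 standard deviations of the mean) can be checked numerically. This is a small sketch using `statistics.NormalDist` from the Python standard library; the helper name `within` is only illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def within(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {100 * within(k):.2f}%")
```

Because every normal distribution can be standardized, the same percentages hold for any mean and standard deviation.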
USING THE NORMAL TABLE
• What is P(Z > 1.6)?
[Figure: standard normal curve with the area P(0 < Z < 1.6) shaded between 0 and 1.6]

P(Z > 1.6) = .5 - P(0 < Z < 1.6)
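For readers without a printed table, the lookup can be mimicked in code. The sketch below assumes the table lists areas between 0 and z, as the shaded figure suggests. The function name `table_area` is hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def table_area(z):
    """Area under the standard normal curve between 0 and z (for z >= 0)."""
    return STANDARD_NORMAL.cdf(z) - 0.5


# Upper tail: half of the curve lies above 0, so subtract the table area.
upper_tail = 0.5 - table_area(1.6)

print(f"P(0 < Z < 1.6) = {table_area(1.6):.4f}")
print(f"P(Z > 1.6)     = {upper_tail:.4f}")
```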

USING THE NORMAL TABLE
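• What is P(Z < -2.23)?
[Figure: standard normal curve with the tails P(Z < -2.23) and P(Z > 2.23) shaded, and the area P(0 < Z < 2.23) between 0 and 2.23]

P(Z < -2.23) = P(Z > 2.23) = .5 - P(0 < Z < 2.23)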

USING THE NORMAL TABLE
• What is P(Z < 1.52)?
[Figure: standard normal curve with P(Z < 0) = .5 to the left of 0 and P(0 < Z < 1.52) shaded between 0 and 1.52]

P(Z < 1.52) = .5 + P(0 < Z < 1.52)

USING THE NORMAL TABLE
• What is P(0.9 < Z < 1.9)?
[Figure: standard normal curve with the areas P(0 < Z < 0.9) and P(0.9 < Z < 1.9) marked at 0, 0.9 and 1.9]

P(0.9 < Z < 1.9) = P(0 < Z < 1.9) - P(0 < Z < 0.9)
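The lower-tail, cumulative and interval questions above can be sketched the same way. Each one is built from areas between 0 and z, plus symmetry about 0. As before, `table_area` is an illustrative helper, not something defined in the slides.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def table_area(z):
    """Area under the standard normal curve between 0 and z (for z >= 0)."""
    return STANDARD_NORMAL.cdf(z) - 0.5


# Lower tail, using symmetry: P(Z < -2.23) = P(Z > 2.23)
p_lower_tail = 0.5 - table_area(2.23)

# Cumulative probability: the left half of the curve plus the table area
p_below = 0.5 + table_area(1.52)

# Interval between two positive z values: difference of two table areas
p_between = table_area(1.9) - table_area(0.9)

print(f"P(Z < -2.23)      = {p_lower_tail:.4f}")
print(f"P(Z < 1.52)       = {p_below:.4f}")
print(f"P(0.9 < Z < 1.9)  = {p_between:.4f}")
```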

APPLICATION OF NORMAL DISTRIBUTION
• We can use the following function to convert any normal random variable X, with mean μ and standard deviation σ, to a standard normal random variable (z-score):

Z = (X - μ) / σ

Some advice: always draw a picture!
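A brief sketch of the conversion, assuming the usual standardization z = (x - mean) / standard deviation. The numbers (mean 100, standard deviation 15, x = 130) are hypothetical and chosen only so the arithmetic is easy to follow.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Standardize x: how many standard deviations it lies from the mean."""
    return (x - mean) / sd


# Hypothetical variable: X ~ Normal(mean=100, sd=15), evaluated at x = 130.
z = z_score(130, mean=100, sd=15)

# The probability is the same whether computed on the original scale
# or on the standard normal scale after conversion.
p_original_scale = NormalDist(mu=100, sigma=15).cdf(130)
p_standard_scale = NormalDist().cdf(z)

print(f"z = {z}")
print(f"P(X < 130) = {p_original_scale:.4f}, P(Z < z) = {p_standard_scale:.4f}")
```

This equality is what allows a single standard normal table to serve every normal distribution.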

APPLICATION
The estimated time needed to complete a statistics assignment is normally distributed with a mean of 45 minutes and a standard deviation of 7 minutes.
a) What percentage of students take more than 55 minutes to complete the statistics assignment?
b) What percentage of students complete the statistics assignment in 40 to 50 minutes?
c) What is the probability that a student completes the statistics assignment in less than 20 minutes?
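One possible way to check answers to this problem is sketched below. It uses the stated mean of 45 minutes and standard deviation of 7 minutes, with `statistics.NormalDist` standing in for the normal table. The variable names are illustrative.

```python
from statistics import NormalDist

# Completion time in minutes: mean 45, standard deviation 7 (from the problem).
completion_time = NormalDist(mu=45, sigma=7)

# a) More than 55 minutes: upper tail beyond z = (55 - 45) / 7
p_over_55 = 1 - completion_time.cdf(55)

# b) Between 40 and 50 minutes: difference of two cumulative probabilities
p_40_to_50 = completion_time.cdf(50) - completion_time.cdf(40)

# c) Less than 20 minutes: lower tail below z = (20 - 45) / 7
p_under_20 = completion_time.cdf(20)

print(f"a) {100 * p_over_55:.1f}% take more than 55 minutes")
print(f"b) {100 * p_40_to_50:.1f}% finish in 40 to 50 minutes")
print(f"c) P(less than 20 minutes) = {p_under_20:.5f}")
```

The results should come out at roughly 8% for (a), about 52% for (b), and well under 0.1% for (c). Table-based answers may differ slightly because z is rounded to two decimals before lookup.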

EXERCISE

The length of life of an instrument produced by a machine has a normal distribution with a mean of 12 months and a standard deviation of 2 months. Find the probability that an instrument produced by this machine will last
a) less than 7 months.
b) between 7 and 12 months.
TOOLS FOR ASSESSING NORMALITY
• Visual inspection
– Histogram
– Boxplot
– Normal Quantile Plot (also called Normal Probability Plot)
• Skewness and Kurtosis
• Statistical Tests - Goodness of Fit Tests
– Shapiro-Wilk Test (n < 50)
– Kolmogorov-Smirnov Test (K-S)
– Lilliefors corrected K-S test
– Anderson-Darling Test
HISTOGRAM

• Look for a “bell shape”. Severe skewness and/or outliers are indications of non-normality.
• Histograms are not useful for small sample sizes, as it is difficult to get a clear picture of the distribution.
BOXPLOT

It is hard to detect normality using a box-plot. But, at the very least, look for
symmetry. Severe skewness and/or outliers are indications of non-normality.
NORMAL QUANTILE PLOT
(NORMAL PROBABILITY PLOT)
SKEWNESS AND KURTOSIS
• For both measures, a perfectly normal distribution should return a score of 0 (Rose et al., 2015). For kurtosis, this refers to excess kurtosis (kurtosis minus 3).
– A positive skewness value indicates positive (right) skew; a negative value indicates negative (left) skew. The higher the absolute value, the greater the skew.
– A positive kurtosis value indicates heavier tails than a normal distribution; a negative value indicates lighter tails. The higher the absolute value, the greater the departure from normal kurtosis.

– Rule of thumb
• Divide either score by its standard error; if the absolute value of the result is greater than 1.96, the data depart significantly from normality at the 5% level.
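The rule of thumb can be sketched in code. The version below uses the bias-adjusted sample skewness and excess kurtosis with their usual standard errors, as commonly reported by statistical packages. It is an assumption that these match the software used in the slides. The function names and the sample data are hypothetical.

```python
import math


def skewness_and_kurtosis(data):
    """Return (skewness, excess kurtosis, SE of skewness, SE of kurtosis).

    Uses bias-adjusted sample formulas; requires at least 4 observations.
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    g1 = m3 / m2 ** 1.5                          # moment-based skewness
    g2 = m4 / m2 ** 2 - 3.0                      # moment-based excess kurtosis
    skew = math.sqrt(n * (n - 1)) / (n - 2) * g1
    kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2 * se_skew * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return skew, kurt, se_skew, se_kurt


def departs_from_normal(statistic, standard_error, cutoff=1.96):
    """Rule of thumb: |statistic / standard error| greater than the cutoff."""
    return abs(statistic / standard_error) > cutoff


# Hypothetical sample with a long right tail.
sample = [2, 3, 3, 4, 4, 4, 5, 5, 6, 30]
skew, kurt, se_skew, se_kurt = skewness_and_kurtosis(sample)
print(f"skewness ratio = {skew / se_skew:.2f}, "
      f"flagged: {departs_from_normal(skew, se_skew)}")
print(f"kurtosis ratio = {kurt / se_kurt:.2f}, "
      f"flagged: {departs_from_normal(kurt, se_kurt)}")
```

Taking the absolute value matters: a strongly negative ratio signals a departure from normality just as a strongly positive one does.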
STATISTICAL TESTS
• Statistical tests for normality are more precise than visual inspection, since actual probabilities (p-values) are calculated.
• Tests for normality calculate the probability of obtaining a sample that departs from normality at least as much as the observed one, assuming it was drawn from a normal population.
• The hypotheses used are:
Ho: The sample data are not significantly different from a normal population.
Ha: The sample data are significantly different from a normal population.

• So when testing for normality:
Probabilities > 0.05 mean normality is not rejected (the data are treated as normal).
Probabilities < 0.05 mean normality is rejected (the data are treated as NOT normal).
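To make the idea of a goodness-of-fit test concrete, here is a simplified sketch of the Kolmogorov-Smirnov statistic. It measures the largest gap between the sample's empirical distribution and a hypothesized normal distribution. It is not the Shapiro-Wilk test and does not compute a p-value. It only compares the statistic with the large-sample 5% critical value of about 1.36 / sqrt(n), which assumes the mean and standard deviation are specified in advance rather than estimated from the data. Both samples are hypothetical and generated deterministically.

```python
import math
from statistics import NormalDist


def ks_statistic(sample, dist):
    """Largest gap between the sample's empirical CDF and dist's CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = dist.cdf(x)
        # Check the gap just after and just before each jump of the empirical CDF.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def rejects_normality_at_5_percent(sample, dist):
    """Compare D with the large-sample 5% critical value 1.36 / sqrt(n)."""
    critical_value = 1.36 / math.sqrt(len(sample))
    return ks_statistic(sample, dist) > critical_value


n = 50
standard_normal = NormalDist()

# Hypothetical sample placed exactly at evenly spaced normal quantiles.
bell_shaped = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# Hypothetical right-skewed sample (evenly spaced exponential quantiles).
right_skewed = [-math.log(1 - (i - 0.5) / n) for i in range(1, n + 1)]

for name, sample in (("bell-shaped", bell_shaped), ("right-skewed", right_skewed)):
    d = ks_statistic(sample, standard_normal)
    verdict = rejects_normality_at_5_percent(sample, standard_normal)
    print(f"{name}: D = {d:.3f}, reject Ho: {verdict}")
```

When the mean and standard deviation are estimated from the same sample, these critical values are too lenient, which is the reason the Lilliefors correction exists. In practice the test would be run in statistical software rather than by hand.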
SPSS OUTPUT

Shapiro-Wilk Test (n < 50)
Kolmogorov-Smirnov Test (n > 50)
The p-value (Sig.) must be > 0.05 for the data to be treated as normally distributed
DEAL WITH NON-NORMALITY
• Option 1 is to leave your data non-normal and conduct the parametric tests (best for slightly non-normal data).

• Option 2 is to leave your data non-normal and conduct non-parametric tests.

• Option 3 is to conduct “robust” tests.

• Option 4 is to transform the data. Transforming your data involves using mathematical formulas to modify the data toward normality.
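As one illustration of Option 4, a logarithmic transformation is a common choice for right-skewed, strictly positive data. The sketch below uses hypothetical values chosen so the effect is easy to see. It uses a simple moment-based skewness, which is 0 for perfectly symmetric data.

```python
import math


def skewness(data):
    """Moment-based sample skewness (third moment / second moment ** 1.5)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5


# Hypothetical, strongly right-skewed values (all positive, as log requires).
raw = [1, 10, 100, 1000, 10000]
logged = [math.log10(x) for x in raw]

print(f"skewness before transform: {skewness(raw):.2f}")
print(f"skewness after log10:      {skewness(logged):.2f}")
```

Real data rarely become perfectly symmetric after a transformation, so normality should be re-checked on the transformed values.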
