Assumption of normality
Slide 1
Assumption of normality
Transformations
Assumption of normality script
Practice problems
Computers II
Assumption of Normality
Slide 2
Evaluating normality
Slide 3
Transformations
Slide 4
Slide 5
Problem 1
Slide 6
True
True with caution
False
Incorrect application of a statistic
Slide 7
Slide 8
Slide 9
Slide 10
Slide 11
Slide 12
Slide 13
Slide 14
The histogram
[Figure: Histogram. Vertical axis: Frequency (0 to 50). Horizontal axis: 0.0 to 100.0 in steps of 10.0.]
Slide 15
[Figure: normality plot. Vertical axis: Expected Normal. Horizontal axis: Observed Value.]
Slide 16
                                   Kolmogorov-Smirnov         Shapiro-Wilk
                                   Statistic  df   Sig.       Statistic  df   Sig.
TOTAL TIME SPENT ON THE INTERNET   .246       93   .000       .606       93   .000

Problem 1 asks about the results of the test of normality. Since the sample
size is larger than 50, we use the Kolmogorov-Smirnov test. If the sample
size were 50 or less, we would use the Shapiro-Wilk statistic instead.
The null hypothesis for the test of normality states that the actual
distribution of the variable is equal to the expected distribution, i.e., the
variable is normally distributed. Since the probability associated with the
test of normality (p < 0.001) is less than or equal to the level of
significance (0.01), we reject the null hypothesis and conclude that total
hours spent on the Internet is not normally distributed. (Note: we report the
probability as <0.001 instead of .000 to be clear that the probability is not
really zero.)
The answer to problem 1 is false.
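The decision rule described for problem 1 can be sketched in a few lines of Python. This is only an illustration of the reasoning, not part of the SPSS script: the function name normality_decision and its parameters are hypothetical, and because the output prints the probability as .000, the value 0.001 is used as an assumed upper bound for it.

```python
def normality_decision(n, ks_sig, sw_sig, alpha=0.01):
    """Choose the test by sample size, then compare its probability to alpha.

    Hypothetical helper: encodes the rule stated on the slide, nothing more.
    """
    # Rule from the slide: Kolmogorov-Smirnov when n > 50, Shapiro-Wilk otherwise.
    if n > 50:
        test, sig = "Kolmogorov-Smirnov", ks_sig
    else:
        test, sig = "Shapiro-Wilk", sw_sig
    # The null hypothesis (the variable is normal) is rejected when sig <= alpha.
    reject_normality = sig <= alpha
    return test, reject_normality


# Values from the table: n = 93, both Sig. columns printed as .000.
# 0.001 is an assumed upper bound for a probability printed as .000.
test_used, rejected = normality_decision(n=93, ks_sig=0.001, sw_sig=0.001)
print(test_used, "rejects normality:", rejected)
```

With the values from the table, the sketch selects the Kolmogorov-Smirnov test and rejects normality, which matches the answer of false.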
Slide 17
Slide 18
Second, click on the Run button to activate the script.
Slide 19
Slide 20
Tests of Normality

                                   Kolmogorov-Smirnov (a)     Shapiro-Wilk
                                   Statistic  df   Sig.       Statistic  df   Sig.
TOTAL TIME SPENT ON THE INTERNET   .246       93   .000       .606       93   .000
Slide 21
Problem 2
In the dataset GSS2000.sav, is the following
statement true, false, or an incorrect application of a
statistic?
Based on the rule of thumb for the allowable
magnitude of skewness and kurtosis, total hours
spent on the Internet is normally distributed.
1. True
2. True with caution
3. False
4. Incorrect application of a statistic
Slide 22
Descriptives
To answer problem 2, we look at the values for skewness and kurtosis in the
Descriptives table.

TOTAL TIME SPENT ON THE INTERNET     Statistic   Std. Error
Mean                                 10.731      1.5918
95% Confidence Interval for Mean
  Lower Bound                        7.570
  Upper Bound                        13.893
5% Trimmed Mean                      8.295
Median                               5.500
Variance                             235.655
Std. Deviation                       15.3511
Minimum                              .2
Maximum                              102.0
Range                                101.8
Interquartile Range                  10.200
Skewness                             3.532       .250
Kurtosis                             15.614      .495

The skewness (3.532) and kurtosis (15.614) for the variable both exceed the
rule of thumb criterion of 1.0. The variable is not normally distributed.
The answer to problem 2 is false.
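The rule-of-thumb check for problem 2 can be sketched as follows. The function names are hypothetical, and the rule is assumed to mean that both statistics must lie between -1.0 and +1.0. The standard-error formulas are the usual large-sample expressions for sample skewness and kurtosis; they are assumed, not confirmed by the slides, to be what the Std. Error column reports, though they do reproduce .250 and .495 for n = 93.

```python
import math


def within_rule_of_thumb(skewness, kurtosis, limit=1.0):
    """Assumed reading of the rule: both statistics lie within -limit..+limit."""
    return abs(skewness) <= limit and abs(kurtosis) <= limit


def se_skewness(n):
    """Standard error of sample skewness for a sample of size n."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def se_kurtosis(n):
    """Standard error of sample kurtosis for a sample of size n."""
    return 2.0 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))


# Values from the Descriptives table (n = 93).
print(within_rule_of_thumb(3.532, 15.614))   # False: both exceed 1.0
print(round(se_skewness(93), 3), round(se_kurtosis(93), 3))
```

Under these assumptions the check returns False, in line with the conclusion that the variable is not normally distributed.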
Slide 23
Problem 3
In the dataset GSS2000.sav, is the following statement
true, false, or an incorrect application of a statistic?
Use 0.01 as the level of significance.
Based on a diagnostic hypothesis test of normality,
"total hours spent on the Internet" is not normally
distributed. A logarithmic transformation of "total
hours spent on the Internet" results in a variable that
is normally distributed.
1. True
2. True with caution
3. False
4. Incorrect application of a statistic
Slide 24
                                        Kolmogorov-Smirnov         Shapiro-Wilk
                                        Statistic  df   Sig.       Statistic  df   Sig.
Logarithm of NETIME [LG10(NETIME)]      .047       93   .200*      .994       93   .951
Square Root of NETIME [SQRT(NETIME)]    .118       93   .003       .868       93   .000
Inverse of NETIME [1/(NETIME)]          .288       93   .000       .495       93   .000
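As a rough illustration of the three transformations named in the table, and of how their Kolmogorov-Smirnov probabilities compare with the 0.01 level of significance from problem 3, here is a minimal Python sketch. The function name transform and the dictionary keys are hypothetical; the Sig. values are copied from the table as printed, so .000 stands for a probability below 0.001 and .200 carries the table's asterisk.

```python
import math


def transform(x):
    """Return the three transformations from the table for one positive value."""
    # All three require x > 0; the minimum NETIME reported in the deck is .2.
    return {
        "LG10": math.log10(x),   # LG10(NETIME)
        "SQRT": math.sqrt(x),    # SQRT(NETIME)
        "INVERSE": 1.0 / x,      # 1/(NETIME)
    }


# Kolmogorov-Smirnov Sig. values as printed in the table (n = 93 > 50).
ks_sig = {"LG10": 0.200, "SQRT": 0.003, "INVERSE": 0.000}
alpha = 0.01

# Normality is supported when the probability is greater than alpha.
supports_normality = {name: sig > alpha for name, sig in ks_sig.items()}
print(supports_normality)
```

Only the logarithmic transformation has a probability above 0.01 in this comparison.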
Slide 25
Slide 26
[Flowchart: the text of the decision nodes was not recovered. Outcomes shown: Incorrect application of a statistic; False; True with caution; True.]
Slide 27
[Flowchart: the text of the first decision node was not recovered. Remaining decision nodes, in order: "Statistical evidence supports normality?"; "Statistical evidence for transformation supports normality?"; "Either variable ordinal level?". Outcomes shown: Incorrect application of a statistic; False; True with caution; True.]