Chapter Fifteen
Frequency Distribution, Cross-Tabulation, and Hypothesis Testing
15-1
Chapter Outline
1) Overview
2) Frequency Distribution
3) Statistics Associated with Frequency Distribution
   i. Measures of Location
   ii. Measures of Variability
   iii. Measures of Shape
15-2
Chapter Outline
6) Cross-Tabulations
i.
ii.
iii.
Chi-Square
ii.
iii. Contingency Coefficient
iv. Cramer's V
v. Lambda Coefficient
vi. Other Statistics
2007 Prentice Hall
15-3
Chapter Outline
8) Cross-Tabulation in Practice
9) Hypothesis Testing Related to Differences
10) Parametric Tests
   i. One Sample
   ii. Two Independent Samples
   iii. Paired Samples
11) Nonparametric Tests
   i. One Sample
   ii. Two Independent Samples
   iii. Paired Samples
12) Summary
15-4
Row  Sex   Familiarity  Internet  Attitude Toward  Attitude Toward  Usage of Internet:  Usage of Internet:
                        Usage     Internet         Technology       Shopping            Banking
 1   1.00  7.00         14.00     7.00             6.00             1.00                1.00
 2   2.00  2.00          2.00     3.00             3.00             2.00                2.00
 3   2.00  3.00          3.00     4.00             3.00             1.00                2.00
 4   2.00  3.00          3.00     7.00             5.00             1.00                2.00
 5   1.00  7.00         13.00     7.00             7.00             1.00                1.00
 6   2.00  4.00          6.00     5.00             4.00             1.00                2.00
 7   2.00  2.00          2.00     4.00             5.00             2.00                2.00
 8   2.00  3.00          6.00     5.00             4.00             2.00                2.00
 9   2.00  3.00          6.00     6.00             4.00             1.00                2.00
10   1.00  9.00         15.00     7.00             6.00             1.00                2.00
11   2.00  4.00          3.00     4.00             3.00             2.00                2.00
12   2.00  5.00          4.00     6.00             4.00             2.00                2.00
13   1.00  6.00          9.00     6.00             5.00             2.00                1.00
14   1.00  6.00          8.00     3.00             2.00             2.00                2.00
15   1.00  6.00          5.00     5.00             4.00             1.00                2.00
16   2.00  4.00          3.00     4.00             3.00             2.00                2.00
17   1.00  6.00          9.00     5.00             3.00             1.00                1.00
18   1.00  4.00          4.00     5.00             4.00             1.00                2.00
19   1.00  7.00         14.00     6.00             6.00             1.00                1.00
20   2.00  6.00          6.00     6.00             4.00             2.00                2.00
21   1.00  6.00          9.00     4.00             2.00             2.00                2.00
22   1.00  5.00          5.00     5.00             4.00             2.00                1.00
23   2.00  3.00          2.00     4.00             2.00             2.00                2.00
24   1.00  7.00         15.00     6.00             6.00             1.00                1.00
25   2.00  6.00          6.00     5.00             3.00             1.00                2.00
26   1.00  6.00         13.00     6.00             6.00             1.00                1.00
27   2.00  5.00          4.00     5.00             5.00             1.00                1.00
28   2.00  4.00          2.00     3.00             2.00             2.00                2.00
29   1.00  4.00          4.00     5.00             3.00             1.00                2.00
30   1.00  3.00          3.00     7.00             5.00             1.00                2.00
15-5
Frequency Distribution
15-6
Frequency Distribution of Familiarity with the Internet
Table 15.2

Value label        Value   Frequency (N)   Percentage   Valid percentage   Cumulative percentage
Not so familiar    1        0               0.0          0.0                 0.0
                   2        2               6.7          6.9                 6.9
                   3        6              20.0         20.7                27.6
                   4        6              20.0         20.7                48.3
                   5        3              10.0         10.3                58.6
                   6        8              26.7         27.6                86.2
Very familiar      7        4              13.3         13.8               100.0
Missing            9        1               3.3
TOTAL                      30             100.0        100.0
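The columns of Table 15.2 can be reproduced from the raw familiarity ratings. The following is a minimal sketch, assuming the 30 ratings are read in listing order from the data above and that the value 9 is the missing-value code, as the table indicates; the variable names are illustrative only.

```python
from collections import Counter

# Familiarity ratings for the 30 respondents, in listing order
familiarity = [7, 2, 3, 3, 7, 4, 2, 3, 3, 9, 4, 5, 6, 6, 6,
               4, 6, 4, 7, 6, 6, 5, 3, 7, 6, 6, 5, 4, 4, 3]
MISSING = 9  # Table 15.2 reports the value 9 as "Missing"

counts = Counter(familiarity)
n_total = len(familiarity)               # all cases, including the missing one
n_valid = n_total - counts[MISSING]      # cases with a valid rating

rows = []
cumulative = 0.0
for value in range(1, 8):
    freq = counts[value]
    pct = 100 * freq / n_total           # percentage of all cases
    valid_pct = 100 * freq / n_valid     # percentage of valid cases only
    cumulative += valid_pct              # running total of valid percentages
    rows.append((value, freq, round(pct, 1), round(valid_pct, 1), round(cumulative, 1)))

for row in rows:
    print(row)
```

The percentage column divides by all 30 cases, while the valid and cumulative percentages divide by the 29 cases that have a usable rating.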
15-7
Frequency Histogram
Fig. 15.1
[Histogram: x-axis Familiarity, y-axis Frequency (0 to 8).]
15-8
The mean, $\bar{X}$, is
$\bar{X} = \sum_{i=1}^{n} X_i / n$
where
$X_i$ = observed values of the variable X
n = number of observations (sample size)
15-9
15-10
15-11
$s_x = \sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 / (n - 1)}$
$CV = s_x / \bar{X}$
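As a worked illustration of the mean, standard deviation, and coefficient of variation, the sketch below applies the formulas to the familiarity ratings listed earlier. It assumes the single missing value (coded 9) is dropped, which leaves 29 observations; the variable names are illustrative.

```python
import math

# Familiarity ratings in listing order; 9 is treated as the missing-value code
familiarity = [7, 2, 3, 3, 7, 4, 2, 3, 3, 9, 4, 5, 6, 6, 6,
               4, 6, 4, 7, 6, 6, 5, 3, 7, 6, 6, 5, 4, 4, 3]
valid = [x for x in familiarity if x != 9]

n = len(valid)
mean = sum(valid) / n                                            # sample mean
s_x = math.sqrt(sum((x - mean) ** 2 for x in valid) / (n - 1))   # sample standard deviation
cv = s_x / mean                                                  # coefficient of variation

print(f"n = {n}, mean = {mean:.3f}, s_x = {s_x:.3f}, CV = {cv:.3f}")
```

The mean (about 4.724) and standard deviation (about 1.579) are the figures used in the one-sample t test later in the chapter.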
15-12
15-13
Skewness of a Distribution
Fig. 15.2
[Figure: (a) Symmetric Distribution; (b) Skewed Distribution. Each panel marks the Mean, Median, and Mode.]
15-14
15-15
15-16
$H_0: \pi \le 0.40$
$H_1: \pi > 0.40$
15-17
The test of the null hypothesis is a one-tailed test, because the alternative hypothesis is expressed directionally. If that is not the case, then a two-tailed test would be required, and the hypotheses would be expressed as:
$H_0: \pi = 0.40$
$H_1: \pi \neq 0.40$
15-18
$z = (p - \pi) / \sigma_p$
where
$\sigma_p = \sqrt{\pi (1 - \pi) / n}$
15-19
Type II Error
Type II error occurs when, based on the sample results, the null hypothesis is not rejected when it is in fact false.
The probability of type II error is denoted by $\beta$.
Unlike $\alpha$, which is specified by the researcher, the
15-20
The power of a test is the probability ($1 - \beta$) of rejecting the null hypothesis when it is false and should be rejected.
Although $\beta$ is unknown, it is related to $\alpha$. An extremely low value of $\alpha$ (e.g., $\alpha = 0.001$) will result in intolerably high $\beta$ errors.
15-21
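[Figure: sampling distributions under $\pi = 0.40$ and $\pi = 0.45$. Labels: critical value of Z; $\alpha = 0.05$ at $Z_\alpha = 1.645$; $\beta = 0.01$ at $Z_\beta = -2.33$; 99% of total area.]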
15-22
[Figure: standard normal curve with z = 1.88 marked; unshaded area = 0.0301.]
15-23
$\sigma_p = \sqrt{(0.40)(0.6) / 30} = 0.089$
15-24
$z = (0.567 - 0.40) / 0.089 = 1.88$
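The one-sample test of a proportion can be sketched in a few lines. This is a hedged illustration using only the figures quoted on the slides (hypothesized proportion 0.40, sample proportion 0.567, n = 30); the helper `normal_cdf` is a hypothetical name introduced here, built on the standard library's `math.erf`.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

pi_0 = 0.40    # hypothesized population proportion under H0
n = 30         # sample size
p = 0.567      # observed sample proportion
alpha = 0.05   # level of significance (one-tailed test)

sigma_p = math.sqrt(pi_0 * (1 - pi_0) / n)   # standard error of the proportion
z = (p - pi_0) / sigma_p                     # test statistic
p_value = 1.0 - normal_cdf(z)                # area to the right of z
reject_null = p_value < alpha

print(f"sigma_p = {sigma_p:.4f}, z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject_null}")
```

Because the code does not round the standard error to 0.089 before dividing, z comes out slightly below the 1.88 shown on the slide; the decision to reject the null hypothesis at the 0.05 level is the same.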
15-25
/2
15-26
(Critical Value) and Making the Decision
If the probability associated with the calculated or observed value of the test statistic ($TS_{CAL}$) is less than the level of significance ($\alpha$), the null hypothesis is rejected.
The probability associated with the calculated or observed value of the test statistic is 0.0301. This is the probability of getting a sample proportion (p) of 0.567 when $\pi = 0.40$.
15-27
15-28
15-29
Tests of Association
Tests of Differences
   Distributions
   Means
   Proportions
   Median/Rankings
15-30
Cross-Tabulation
15-31
                      Gender
Internet Usage    Male    Female    Row Total
Light (1)            5        10           15
Heavy (2)           10         5           15
Column Total        15        15
15-32
15-33
                      Gender
Internet Usage    Male     Female
Light             33.3%    66.7%
Heavy             66.7%    33.3%
Column total      100%     100%
15-34
              Internet Usage
Gender    Light    Heavy    Total
Male      33.3%    66.7%    100.0%
Female    66.7%    33.3%    100.0%
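The percentage tables above follow directly from the cell counts of the gender by Internet usage cross-tabulation. The sketch below assumes the cell counts 5, 10, 10, and 5 that appear in the chapter's chi-square calculation; the dictionary layout and names are illustrative.

```python
# Cell counts: usage category -> gender -> number of respondents
counts = {
    "Light": {"Male": 5, "Female": 10},
    "Heavy": {"Male": 10, "Female": 5},
}
genders = ["Male", "Female"]

gender_totals = {g: sum(counts[u][g] for u in counts) for g in genders}
usage_totals = {u: sum(counts[u].values()) for u in counts}

# Percentages computed within each gender (each gender sums to 100%)
within_gender = {
    g: {u: round(100 * counts[u][g] / gender_totals[g], 1) for u in counts}
    for g in genders
}

# Percentages computed within each usage category (each category sums to 100%)
within_usage = {
    u: {g: round(100 * counts[u][g] / usage_totals[u], 1) for g in genders}
    for u in counts
}

print(within_gender)
print(within_usage)
```

With these particular counts the two directions give the same numbers because the table is symmetric; in general they differ, so the direction of percentaging has to be chosen deliberately.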
15-35
Introduction of a Third Variable in Cross-Tabulation
Fig. 15.7
[Flowchart: starting from the original two variables, there is either some association or no association between the two variables. In each case a third variable is introduced. Possible outcomes: refined association between the two variables; no association between the two variables; no change in the initial pattern; some association between the two variables.]
15-36
15-37
Purchase of
Fashion Clothing          Married    Unmarried
High                      31%        52%
Low                       69%        48%
Column totals             100%       100%
Number of respondents     700        300
15-38
Purchase of                 Male                        Female
Fashion Clothing     Married    Not Married      Married    Not Married
High                 35%        40%              25%        60%
Low                  65%        60%              75%        40%
Column totals        100%       100%             100%       100%
Number of cases      400        120              300        180
15-39
15-40
Ownership of Expensive Automobiles by Education Level
Table 15.8

Own Expensive                  Education
Automobile          College Degree    No College Degree
Yes                 32%               21%
No                  68%               79%
Column totals       100%              100%
Number of cases     250               750
15-41
Ownership of Expensive Automobiles by Education Level and Income Levels
Table 15.9

                                        Income
                          Low Income                 High Income
Own Expensive         College    No College      College    No College
Automobile            Degree     Degree          Degree     Degree
Yes                   20%        20%             40%        40%
No                    80%        80%             60%        60%
Column totals         100%       100%            100%       100%
Number of             100        700             150        50
respondents
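One way to see how the income-level table relates to the two-variable table is to collapse it over income. The sketch below is an illustration under the assumption that the percentages and cell sizes are exactly those shown in Table 15.9; the function `ownership_pct` is a hypothetical helper.

```python
# (income, education) -> (percent owning an expensive automobile, number of respondents)
cells = {
    ("Low", "College Degree"): (20, 100),
    ("Low", "No College Degree"): (20, 700),
    ("High", "College Degree"): (40, 150),
    ("High", "No College Degree"): (40, 50),
}

def ownership_pct(education):
    """Overall ownership percentage for one education group, pooled over income."""
    owners = sum(pct / 100 * n for (inc, edu), (pct, n) in cells.items() if edu == education)
    total = sum(n for (inc, edu), (pct, n) in cells.items() if edu == education)
    return 100 * owners / total, total

college_pct, college_n = ownership_pct("College Degree")
no_college_pct, no_college_n = ownership_pct("No College Degree")

print(f"College degree: {college_pct:.0f}% of {college_n}")
print(f"No college degree: {no_college_pct:.0f}% of {no_college_n}")
```

Within each income level the two education groups own expensive automobiles at the same rate, yet pooling over income reproduces the 32% versus 21% gap of Table 15.8, because college graduates are far more likely to fall in the high-income group.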
15-42
15-43
                                  Age
                          Less than 45    45 or More
Yes                       50%             50%
No                        50%             50%
Column totals             100%            100%
Number of respondents     500             500
15-44
15-45
15-46
Eating Frequently in Fast-Food Restaurants by Family Size
Table 15.12
15-47
15-48
Chi-square Distribution
Fig. 15.8
[Figure: chi-square distribution with the critical value separating the "Do Not Reject H0" region from the "Reject H0" region.]
15-50
$f_e = n_r n_c / n$
where
$n_r$ = total number in the row
$n_c$ = total number in the column
n = total sample size
15-51
$f_e = (15 \times 15) / 30 = 7.50$ for each of the cells
15-52
$\chi^2$ is calculated as:
$\chi^2 = \frac{(5 - 7.5)^2}{7.5} + \frac{(10 - 7.5)^2}{7.5} + \frac{(10 - 7.5)^2}{7.5} + \frac{(5 - 7.5)^2}{7.5}$
$= 0.833 + 0.833 + 0.833 + 0.833$
$= 3.333$
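The expected frequencies and the chi-square statistic can be checked with a short calculation. This is a minimal sketch using the observed cell counts from the calculation above (5, 10, 10, 5) and the 0.05 critical value of 3.841 for one degree of freedom that the chapter quotes; the variable names are illustrative.

```python
# Observed counts: rows are usage levels (Light, Heavy), columns are gender (Male, Female)
observed = [[5, 10],
            [10, 5]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency for each cell: f_e = n_r * n_c / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square: sum over cells of (f_o - f_e)^2 / f_e
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)   # (r - 1) x (c - 1)
critical_value = 3.841                              # 1 df, 0.05 level, as quoted in the chapter
reject_null = chi_square > critical_value

print(f"expected = {expected}, chi-square = {chi_square:.3f}, df = {df}, reject H0: {reject_null}")
```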
15-53
For the cross-tabulation given in Table 15.3, there are (2 - 1) x (2 - 1) = 1 degree of freedom. The calculated chi-square statistic had a value of 3.333. Since this is less than the critical value of 3.841, the null hypothesis of no association cannot be rejected, indicating that the association is not statistically significant at the 0.05 level.
15-54
$\phi = \sqrt{\chi^2 / n}$
15-55
$C = \sqrt{\chi^2 / (\chi^2 + n)}$
15-56
$V = \sqrt{\phi^2 / \min(r - 1, c - 1)}$
or
$V = \sqrt{(\chi^2 / n) / \min(r - 1, c - 1)}$
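To make the strength-of-association measures concrete, the sketch below evaluates the phi coefficient, the contingency coefficient, and Cramer's V for the 2 x 2 gender by Internet usage table. It assumes the chi-square value of 3.333 (exactly 10/3) and n = 30 from the chapter's example; the variable names are illustrative.

```python
import math

chi_square = 10 / 3   # 3.333, the chi-square value computed for the 2 x 2 table
n = 30                # total sample size
r, c = 2, 2           # number of rows and columns

phi = math.sqrt(chi_square / n)                            # phi coefficient
contingency = math.sqrt(chi_square / (chi_square + n))     # contingency coefficient C
cramers_v = math.sqrt((chi_square / n) / min(r - 1, c - 1))  # Cramer's V

print(f"phi = {phi:.3f}, C = {contingency:.3f}, V = {cramers_v:.3f}")
```

For a 2 x 2 table, min(r - 1, c - 1) = 1, so Cramer's V equals phi.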
15-57
15-58
15-59
Cross-Tabulation in Practice
While conducting cross-tabulation analysis in practice, it is useful to
proceed along the following steps.
1.
2.
3.
4.
15-60
The samples are paired when the data for the two
samples relate to the same group of respondents.
15-61
Hypothesis Tests
  Parametric Tests (Metric Tests)
    One Sample
      * t test
      * Z test
    Two or More Samples
      Independent Samples
        * Two-Group t test
        * Z test
      Paired Samples
        * Paired t test
  Non-parametric Tests (Nonmetric Tests)
    One Sample
      * Chi-Square
      * K-S
      * Runs
      * Binomial
    Two or More Samples
      Independent Samples
        * Chi-Square
        * Mann-Whitney
        * Median
        * K-S
      Paired Samples
        * Sign
        * Wilcoxon
        * McNemar
        * Chi-Square
15-62
Parametric
Tests
Then, $t = (\bar{X} - \mu) / s_{\bar{X}}$ is t distributed with n - 1 degrees of freedom.
The t distribution is similar to the normal
distribution in appearance. Both distributions are
bell-shaped and symmetric. As the number of
degrees of freedom increases, the t distribution
approaches the normal distribution.
15-63
Hypothesis Testing Using the t Statistic
1. Formulate the null ($H_0$) and the alternative ($H_1$) hypotheses.
2.
3.
4.
5.
15-64
7.
8.
15-65
exceeds 4.0, the neutral value on a 7-point scale. A significance level of $\alpha = 0.05$ is selected. The hypotheses may be formulated as:
$H_0: \mu \le 4.0$
$H_1: \mu > 4.0$
$t = (\bar{X} - \mu) / s_{\bar{X}}$
$s_{\bar{X}} = s / \sqrt{n}$
$s_{\bar{X}} = 1.579 / \sqrt{29} = 1.579 / 5.385 = 0.293$
$t = (4.724 - 4.0) / 0.293 = 0.724 / 0.293 = 2.471$
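The one-sample t calculation can be reproduced from the raw familiarity ratings. The sketch below assumes the rating coded 9 is missing and is dropped (leaving n = 29) and that the test value is 4.0, as above. The critical value 1.701 is the standard one-tailed t-table entry for 28 degrees of freedom at the 0.05 level; it is supplied here as an assumption because it does not appear on the slide.

```python
import math

# Familiarity ratings in listing order; 9 is treated as the missing-value code
familiarity = [7, 2, 3, 3, 7, 4, 2, 3, 3, 9, 4, 5, 6, 6, 6,
               4, 6, 4, 7, 6, 6, 5, 3, 7, 6, 6, 5, 4, 4, 3]
valid = [x for x in familiarity if x != 9]

mu_0 = 4.0                     # hypothesized mean (neutral point of the scale)
n = len(valid)
mean = sum(valid) / n
s = math.sqrt(sum((x - mean) ** 2 for x in valid) / (n - 1))   # sample standard deviation
std_error = s / math.sqrt(n)                                    # standard error of the mean
t = (mean - mu_0) / std_error                                   # t statistic
df = n - 1

critical_t = 1.701             # assumption: one-tailed 0.05 critical value for 28 df
reject_null = t > critical_t

print(f"mean = {mean:.3f}, std_error = {std_error:.3f}, t = {t:.3f}, df = {df}, reject H0: {reject_null}")
```

The unrounded t is about 2.470; the slide's 2.471 results from rounding the standard error to 0.293 before dividing.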
15-66
15-67
where
$\sigma_{\bar{X}} = 1.5 / \sqrt{29} = 1.5 / 5.385 = 0.279$
and
$z = (4.724 - 4.0) / 0.279 = 0.724 / 0.279 = 2.595$
15-68
15-69
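$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$
$s^2 = \frac{\sum_{i=1}^{n_1} (X_{i1} - \bar{X}_1)^2 + \sum_{i=1}^{n_2} (X_{i2} - \bar{X}_2)^2}{n_1 + n_2 - 2}$
or
$s^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$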
15-70
$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{s^2 (1/n_1 + 1/n_2)}$
$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}$
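A pooled-variance t test can be sketched with the Sex and Internet Usage columns from the data listing. The example assumes Sex = 1 denotes males and Sex = 2 females (a coding consistent with the group statistics reported in the chapter, but not stated on the slides) and tests the null hypothesis that the two population means are equal.

```python
import math

# Sex and Internet usage (hours) for the 30 respondents, in listing order
sex = [1, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 2, 1, 1, 1,
       2, 1, 1, 1, 2, 1, 1, 2, 1, 2, 1, 2, 2, 1, 1]
usage = [14, 2, 3, 3, 13, 6, 2, 6, 6, 15, 3, 4, 9, 8, 5,
         3, 9, 4, 14, 6, 9, 5, 2, 15, 6, 13, 4, 2, 4, 3]

group1 = [u for s, u in zip(sex, usage) if s == 1]   # assumed: males
group2 = [u for s, u in zip(sex, usage) if s == 2]   # assumed: females

def mean_and_ss(values):
    """Return the mean and the sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return m, sum((v - m) ** 2 for v in values)

n1, n2 = len(group1), len(group2)
mean1, ss1 = mean_and_ss(group1)
mean2, ss2 = mean_and_ss(group2)

pooled_var = (ss1 + ss2) / (n1 + n2 - 2)                   # pooled variance s^2
std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))      # standard error of the difference
t = (mean1 - mean2) / std_error                            # H0: mu_1 - mu_2 = 0
df = n1 + n2 - 2

print(f"means = {mean1:.3f}, {mean2:.3f}; t = {t:.3f}; df = {df}")
```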
15-71
$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \neq \sigma_2^2$
15-72
$F_{(n_1 - 1),(n_2 - 1)} = s_1^2 / s_2^2$
where
$n_1$ = size of sample 1
$n_2$ = size of sample 2
$n_1 - 1$ = degrees of freedom for sample 1
$n_2 - 1$ = degrees of freedom for sample 2
$s_1^2$ = sample variance for sample 1
$s_2^2$ = sample variance for sample 2
15-73
Two Independent-Samples t Tests
Table 15.14
15-74
$H_0: \pi_1 = \pi_2$
$H_1: \pi_1 \neq \pi_2$
$Z = (P_1 - P_2) / S_{P_1 - P_2}$
15-75
$S_{P_1 - P_2} = \sqrt{P (1 - P) [1/n_1 + 1/n_2]}$
where
$P = (n_1 P_1 + n_2 P_2) / (n_1 + n_2)$
15-76
$P_1 - P_2 = (11/15) - (6/15) = 0.733 - 0.400 = 0.333$
$P = (15 \times 0.733 + 15 \times 0.4) / (15 + 15) = 0.567$
$S_{P_1 - P_2} = \sqrt{0.567 \times 0.433 [1/15 + 1/15]} = 0.181$
$Z = 0.333 / 0.181 = 1.84$
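The two-proportion calculation above can be written out as a short script. This sketch uses the counts quoted on the slide (11 of 15 in the first group and 6 of 15 in the second) and the two-tailed 0.05 critical value of 1.96; the variable names are illustrative.

```python
import math

x1, n1 = 11, 15   # first sample: number with the attribute, sample size
x2, n2 = 6, 15    # second sample: number with the attribute, sample size

p1 = x1 / n1
p2 = x2 / n2
p_pooled = (n1 * p1 + n2 * p2) / (n1 + n2)                        # pooled proportion P
std_error = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / std_error

critical_z = 1.96                     # two-tailed test at the 0.05 level
reject_null = abs(z) > critical_z

print(f"P = {p_pooled:.3f}, std_error = {std_error:.3f}, z = {z:.2f}, reject H0: {reject_null}")
```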
15-77
Two Independent
Samples
Proportions
Given a two-tailed test, the area to the right of the critical value is 0.025. Hence, the critical value of the test statistic is 1.96. Since the calculated value (1.84) is less than the critical value, the null hypothesis cannot be rejected. Thus, the proportion of users (0.733 for males and 0.400 for females) is not significantly different for the two samples. Note that while the difference is substantial, it is not statistically significant due to the small sample sizes (15 in each group).
15-78
Paired Samples
The difference in these cases is examined by a paired
samples t test. To compute t for paired samples, the
paired difference variable, denoted by D, is formed and
its mean and variance calculated. Then the t statistic
is computed. The degrees of freedom are n - 1, where
n is the number of pairs. The relevant
formulas are:
$H_0: \mu_D = 0$
$H_1: \mu_D \neq 0$
$t_{n-1} = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n}}$
15-79
Paired Samples
Where:
$\bar{D} = \sum_{i=1}^{n} D_i / n$
$s_D = \sqrt{\sum_{i=1}^{n} (D_i - \bar{D})^2 / (n - 1)}$
$S_{\bar{D}} = s_D / \sqrt{n}$
15-80
Paired-Samples t Test
Table 15.15

Variable              Number of Cases   Mean    Standard Deviation   Standard Error
Internet Attitude     30                5.167   1.234                0.225
Technology Attitude   30                4.100   1.398                0.255

Difference = Internet - Technology

Difference   Standard    Standard                 2-tail   t       Degrees of   2-tail
Mean         deviation   error      Correlation   prob.    value   freedom      probability
1.067        0.828       0.1511     0.809         0.000    7.059   29           0.000
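The paired-samples figures in Table 15.15 can be recomputed from the two attitude columns of the data listing. This is a minimal sketch, assuming the attitude ratings are read in listing order (Internet first, Technology second for each respondent); the variable names are illustrative.

```python
import math

# Attitude toward Internet and toward Technology for the 30 respondents
internet = [7, 3, 4, 7, 7, 5, 4, 5, 6, 7, 4, 6, 6, 3, 5,
            4, 5, 5, 6, 6, 4, 5, 4, 6, 5, 6, 5, 3, 5, 7]
technology = [6, 3, 3, 5, 7, 4, 5, 4, 4, 6, 3, 4, 5, 2, 4,
              3, 3, 4, 6, 4, 2, 4, 2, 6, 3, 6, 5, 2, 3, 5]

# Paired difference variable D = Internet - Technology
d = [a - b for a, b in zip(internet, technology)]
n = len(d)

d_bar = sum(d) / n                                             # mean difference
s_d = math.sqrt(sum((x - d_bar) ** 2 for x in d) / (n - 1))    # std. deviation of differences
std_error = s_d / math.sqrt(n)                                 # standard error of the mean difference
t = (d_bar - 0) / std_error                                    # H0: mu_D = 0
df = n - 1

print(f"D-bar = {d_bar:.3f}, s_D = {s_d:.3f}, std_error = {std_error:.4f}, t = {t:.3f}, df = {df}")
```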
15-81
Nonparametric
Tests
Nonparametric tests are used when the
independent variables are nonmetric. Like
parametric tests, nonparametric tests are
available for testing variables from one sample,
two independent samples, or two related samples.
15-82
Nonparametric Tests
One Sample
$K = \max |A_i - O_i|$
15-83
Nonparametric Tests
One Sample
15-84
15-85
Nonparametric Tests
One Sample
15-86
Nonparametric Tests
Two Independent Samples
For samples of less than 30, the exact significance level for U
is computed. For larger samples, U is transformed into a
normally distributed z statistic. This z can be corrected for
ties within ranks.
15-87
Nonparametric Tests
Two Independent Samples
15-88
Sex       Mean Rank   Cases
Male      20.93       15
Female    10.07       15
Total                 30

U         W          z
31.000    151.000    -3.406

Note
U = Mann-Whitney test statistic
W = Wilcoxon W statistic
z = U transformed into normally distributed z statistic.
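The Mann-Whitney figures above can be reproduced from the Sex and Internet Usage columns of the data listing. The sketch below assumes Sex = 1 denotes males and Sex = 2 females, assigns average ranks to tied values, and applies the usual tie-corrected normal approximation; the variable names are illustrative and this is not presented as the exact procedure SPSS follows.

```python
import math
from collections import Counter

sex = [1, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 2, 1, 1, 1,
       2, 1, 1, 1, 2, 1, 1, 2, 1, 2, 1, 2, 2, 1, 1]
usage = [14, 2, 3, 3, 13, 6, 2, 6, 6, 15, 3, 4, 9, 8, 5,
         3, 9, 4, 14, 6, 9, 5, 2, 15, 6, 13, 4, 2, 4, 3]

# Rank the pooled sample; tied values share the mean of the ranks they occupy
tie_counts = Counter(usage)
rank_of = {}
position = 0
for value in sorted(tie_counts):
    t = tie_counts[value]
    rank_of[value] = position + (t + 1) / 2   # mean of ranks position+1 .. position+t
    position += t

male_ranks = [rank_of[u] for s, u in zip(sex, usage) if s == 1]
female_ranks = [rank_of[u] for s, u in zip(sex, usage) if s == 2]
n1, n2 = len(male_ranks), len(female_ranks)
total_n = n1 + n2

# U for each group is its rank sum minus the smallest possible rank sum
u_male = sum(male_ranks) - n1 * (n1 + 1) / 2
u_female = sum(female_ranks) - n2 * (n2 + 1) / 2
u = min(u_male, u_female)

# Normal approximation with a correction for ties
tie_term = sum(t ** 3 - t for t in tie_counts.values())
variance = n1 * n2 / 12 * ((total_n + 1) - tie_term / (total_n * (total_n - 1)))
z = (u - n1 * n2 / 2) / math.sqrt(variance)

print(f"mean ranks = {sum(male_ranks) / n1:.2f}, {sum(female_ranks) / n2:.2f}; U = {u}; z = {z:.3f}")
```

The smaller rank sum (151, for the female group) corresponds to the W statistic reported in the table.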
15-89
Nonparametric Tests
Paired Samples
15-90
15-91
15-92
A Summary of Hypothesis Tests Related to Differences
Table 15.19

Sample       Application     Level of Scaling   Test/Comments
One Sample   Proportion      Metric             Z test
One Sample   Distributions   Nonmetric          K-S and chi-square for goodness of fit
                                                Runs test for randomness
                                                Binomial test for goodness of fit for dichotomous variables
One Sample   Means           Metric
SPSS Windows
15-96
SPSS Windows
To select these procedures click:
Analyze>Descriptive Statistics>Frequencies
Analyze>Descriptive Statistics>Descriptives
Analyze>Descriptive Statistics>Explore
The major cross-tabulation program is CROSSTABS.
This program will display the cross-classification tables
and provide cell counts, row and column percentages,
the
chi-square test for significance, and all the measures
of the strength of the association that have been
discussed.
To select these procedures click:
Analyze>Descriptive Statistics>Crosstabs
15-97
SPSS Windows
The major program for conducting parametric
tests in SPSS is COMPARE MEANS. This program can
be used to conduct t tests on one sample or
independent or paired samples. To select these
procedures using SPSS for Windows click:
Analyze>Compare Means>Means
Analyze>Compare Means>One-Sample T Test
Analyze>Compare Means>Independent-Samples T Test
15-98
SPSS Windows
The nonparametric tests discussed in this chapter can
be conducted using NONPARAMETRIC TESTS.
To select these procedures using SPSS for Windows
click:
Analyze>Nonparametric Tests>Chi-Square
Analyze>Nonparametric Tests>Binomial
Analyze>Nonparametric Tests>Runs
Analyze>Nonparametric Tests>1-Sample K-S
Analyze>Nonparametric Tests>2 Independent Samples
15-99
SPSS Windows:
Frequencies
1.
2.
3.
4.
Click STATISTICS
5.
15-100
SPSS Windows:
Frequencies
6. Click CONTINUE
7. Click CHARTS
8. Click HISTOGRAMS, then click CONTINUE
9. Click OK
15-101
2.
3.
4.
5.
Click on CELLS.
6.
15-102
Click CONTINUE
8.
Click STATISTICS
9.
10.
Click CONTINUE.
11.
Click OK.
15-103
SPSS Windows:
One Sample t
Test
1.
Select ANALYZE from the SPSS menu
bar.
2.
3.
4.
5.
Click OK.
15-104
SPSS Windows:
Two Independent Samples t
Test
1.
Select ANALYZE from the SPSS menu bar.
2.
3.
4.
5.
6.
7.
Click CONTINUE.
8.
Click OK.
15-105
3.
4.
Click OK.
15-106