Unit 8: Inferential Statistics-Test of Relationship: Activating Prior Knowledge
Introduction
In conducting a research investigation, one common area of interest among
researchers is confirming whether assumptions about the variables involved
are baseless or grounded. One way to determine this objectively is to look
for a relationship or association between the variables. In this unit, you
will learn the measures of association between and among variables. You
will also learn how to explore relationships using inferential statistics in SPSS.
Learning Outcomes
At the end of this unit, you are expected to:
1. Solve research problems using the chi-square test
2. Solve research problems using Pearson’s product-moment correlation
3. Solve problems involving regression analysis
Unit 1: xxxxxxxxxxxxxxxxxxxxx
Learning Objectives
At the end of the lesson, you are expected to:
1. Discuss the characteristics of the chi-square test
2. Conduct and interpret chi-square tests of independence
using SPSS Statistics.
Presentation of Content
Inferential statistics: Examining Relationships between Variables
Inferential questions that deal with relationships or associations between
two or more variables address problems such as:
Is there a significant relationship between the number of “Balik
Probinsya” arrivals and COVID-19 positive rates in Cagayan?
Is there a significant relationship between Math performance and
learners’ problem-solving skills?
Is there a significant relationship between learners’ Math
performance and teachers’ teaching styles?
NOTE: There are three popular inferential statistics examining
relationships between variables namely:
Chi-square test
Correlations
Simple linear regression and Multiple Regression
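The test statistic for a test of independence has the same form as the one used in the
goodness-of-fit test:
$$\chi^2 = \sum_{(i \cdot j)} \frac{(O - E)^2}{E}$$
where:
O = observed values
E = expected values
i = the number of rows in the table
j = the number of columns in the table
There are i⋅j terms of the form $\frac{(O - E)^2}{E}$, one for each cell of the table.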
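The formula above can be sketched in a few lines of Python. This is only an illustration, not part of the SPSS procedure: the function name `chi_square_independence` and the counts in `table` are hypothetical, and the expected count for each cell is assumed to be (row total × column total) / grand total, the usual value under independence.

```python
def chi_square_independence(observed):
    """Return (chi2, df, expected) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under independence: (row total * column total) / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Sum (O - E)^2 / E over all i * j cells of the table
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


# Hypothetical counts (NOT the data used in this unit):
# rows are two groups, columns are two preferences.
table = [[10, 20],
         [20, 10]]
chi2, df, expected = chi_square_independence(table)
print(f"chi-square = {chi2:.3f}, df = {df}")
# The test assumes every expected count is at least five.
print("smallest expected count:", min(min(row) for row in expected))
```

The same function accepts tables with more rows or columns, since the degrees of freedom are computed as (rows - 1) × (columns - 1).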
A test of independence determines whether two factors are independent or not.
NOTE: The expected value for each cell needs to be at least five in order for you to use this test.
Calculating and Interpreting Pearson Chi-Square using SPSS
Example 1: Instructors are looking for ways to teach Statistics and Probability
to undergraduates as part of a BSEd degree course, especially now during the
COVID-19 pandemic. With current technology,
it is possible to present how-to guides for statistical programs online instead of in
a book. However, different students learn in different ways. An instructor of the
College of Teacher Education would like to know whether gender (male/female)
is associated with the preferred type of learning medium (online vs. books).
Solution: (Steps and photos adapted from SPSS Statistics, IBM Corporation)
In SPSS Statistics, you need to create two variables so that you can enter your
data: Gender and Preferred Learning Medium.
To properly enter data in SPSS Statistics, visit
https://statistics.laerd.com/spss-tutorials/entering-data-in-spss-statistics.php
for a “quick start” guide or watch the video tutorial:
https://www.youtube.com/watch?v=MoKDcPpRa_0
Then follow the steps below to analyze your data using a chi-square test for
independence in SPSS Statistics.
Step 1: Click Analyze > Descriptive Statistics > Crosstabs... on the top menu, as
shown below:
Step 2: You will be presented with the following Crosstabs dialogue box:
Step 3: Transfer one of the variables into the Row(s): box and the other variable
into the Column(s): box. In the example, you need to transfer the Gender
variable into the Row(s): box and the Preferred Learning Medium variable into the
Column(s): box. There are two ways to do this. You can either: (1) highlight the
variable with your mouse and then use the relevant right arrow button to transfer
it; or (2) drag and drop the variables. How do you know which
variable goes in the row or column box? There is no right or wrong way. It
depends on how you want to present your data.
Step 4: Click on the Statistics button. You will be presented with the following
Crosstabs: Statistics dialogue box:
Step 5: Select the Chi-square and Phi and Cramer's V options, as shown
below:
Step 7: Click on the Cells button. You will be presented with the following
Crosstabs: Cell Display dialogue box:
Step 11: You will be presented with the photo below. This option allows you to
change the order of the values to either ascending or descending.
Step 12: Once you have made your choice, click on the Continue button.
Step 13: Click on the OK button to generate your output.
You will be presented with some tables in the Output Viewer under the title
"Crosstabs". The tables of note are presented below:
This table allows you to understand that both males and females prefer to
learn using online materials versus books.
When reading this table, focus on the results of the "Pearson Chi-Square"
row. You can see here that χ²(1) = 0.487, p = .485. Because p is greater than .05,
there is no statistically significant association between Gender and Preferred
Learning Medium; that is, males and females do not differ significantly in their
preference for online learning versus books.
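As a rough check on the reported result, the p-value for a chi-square statistic with one degree of freedom can be computed directly. The sketch below is an illustration only and does not reproduce SPSS internals; the function name is hypothetical, and it relies on the fact that a chi-square variable with df = 1 is the square of a standard normal variable.

```python
import math


def chi_square_p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # With df = 1, chi2 is the square of a standard normal variable,
    # so P(X > chi2) = P(|Z| > sqrt(chi2)) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


p = chi_square_p_value_df1(0.487)  # statistic reported in the SPSS output
alpha = 0.05                       # conventional significance level
print(f"p = {p:.3f}")
print("significant" if p < alpha else "not significant")
```

The computed value should agree with the reported p = .485 to about three decimal places. This shortcut only applies when df = 1, as in a 2 × 2 table.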
Application
Calculate and interpret the result of the Chi-Square Test using SPSS Statistics.
Statement of the Problem: Is there an association between personality and color
preference?
A group of students was classified in terms of personality (introvert or extrovert)
and in terms of color preference (red, yellow, green or blue). Personality and color
preference are categorical data. Access the data at
https://docs.google.com/spreadsheets/d/1ONg5_WEh_7NjRIUD5b_fdwpw62sqi7o4vHDw1hKzOx4/edit?usp=sharing
Indicate the table and the interpretation below:
                               Value   df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                            (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square
Continuity Correction(a)
Likelihood Ratio
Fisher's Exact Test
Linear-by-Linear Association
N of Valid Cases
Interpretation: ____________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
_________________________________________________________________.
Feedback
Goal: Your task is to determine whether there is an association between the gender of
the students and their Multiple Intelligences.
Role: You are a college instructor, and one of the requirements given by your
Dean is to conduct a study involving associations between two variables.
Audience: Your target respondents are your students in the College of Teacher
Education (20 males and 20 females).
Topic 2. CORRELATION
Learning Objectives
At the end of the lesson, you are expected to:
1. Describe the characteristics of correlations;
2. Calculate the Pearson product-moment correlation coefficient using
SPSS Statistics; and
3. Solve problems involving correlation analysis.
Presentation of Content
Correlation tests (Pearson correlation) are commonly used to examine
relationships between two or more quantitative/numerical variables. These
commonly measure the strength and direction of association that exists
between two variables measured on at least an interval scale.
NOTE: The Pearson correlation coefficient, r, can take a range of values
from +1 to -1. A value of 0 indicates that there is no linear association between
the two variables. A value greater than 0 indicates a positive association;
that is, as the value of one variable increases, so does the value of the
other variable. In other words, high values on one variable are
associated with high values on the other. A value less than 0 indicates a
negative association; that is, as the value of one variable increases, the
value of the other variable decreases. In other words, high values on one
variable are associated with low values on the other. This is shown in the
diagram below:
[Figure: scatterplots showing a negative correlation and a positive correlation]
The sign of the relationship does not indicate the strength; (-).50 is the
same strength as (+).50 but different direction.
‘r’ is the symbol of the correlation coefficient.
You cannot use just any type of variable for Pearson's correlation coefficient;
your two variables should be measured at the interval or ratio level (i.e.,
they are continuous). There must be a linear relationship between your two
variables, and there should be no significant outliers.
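To make the coefficient concrete, the following minimal sketch computes r from two lists of scores using the standard product-moment formula (the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations). The function name `pearson_r` and the data are hypothetical and are not taken from this unit.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical interval-level data: hours studied and quiz scores
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]
r = pearson_r(hours, score)
print(f"r = {r:.3f}")
print("direction:", "positive" if r > 0 else "negative" if r < 0 else "none")
```

Swapping `hours` and `score` gives the same value of r, since the coefficient does not distinguish between the two variables.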
The Pearson product-moment correlation does not take into consideration
whether a variable has been classified as a dependent or independent
variable. It treats all variables equally.
Ask these questions to make sure that correlation is appropriate for
the inferential question (Devonish, Dwayne):
How many variables?
Which one is the independent variable, and which one is the
dependent variable?
What types of variables are they?
So, is correlation appropriate?
Then follow the steps below to analyze your data using Pearson’s
correlation in SPSS Statistics.
Step 1: Click Analyze > Correlate > Bivariate... on the main menu, as shown
below:
You will be presented with the Bivariate Correlations dialogue box:
Step 2: Transfer the variables Height and Jump_Dist into the Variables: box by
dragging and dropping them or by clicking on them and then clicking on the right
arrow button. You will end up with a screen similar to the one below:
NOTE: If your study involves calculating more than one correlation and you want
to carry out these correlations at the same time, the enhanced Pearson’s
correlation guide from Laerd Statistics shows how to do this and how to write
up the results from multiple correlations.
Step 4: Click on the Options button and you will be presented with the Bivariate
Correlations: Options dialogue box. If you wish to generate some descriptives,
you can do it here by clicking on the relevant checkbox in the –Statistics– area.
Step 6: Click on the OK button. This will generate the results of Pearson's
correlation.
When you run the Pearson’s correlation procedure, you will be
presented with the Correlations table in the IBM SPSS Statistics Output
Viewer. The Pearson's correlation result is highlighted below:
The results are presented in a matrix such that, as can be seen above, the
correlations are replicated. Nevertheless, the table presents the Pearson correlation
coefficient, its significance value and the sample size that the calculation is based
on.
In this example, we can see that the Pearson correlation coefficient, r, is 0.706,
and that it is statistically significant (p = 0.005). For interpreting multiple
correlations, see the enhanced Pearson’s correlation guide from Laerd Statistics.
Interpretation: A Pearson product-moment correlation was run to determine the
relationship between height and distance jumped in a long jump. There was a
strong, positive correlation between height and distance jumped, which was
statistically significant (r = .706, n = 14, p = .005).
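The reported significance can be checked approximately from r and n alone. The sketch below is an illustration, not the SPSS algorithm: it converts r to a t statistic with n - 2 degrees of freedom and then estimates the two-tailed p-value by numerically integrating the t density with Simpson's rule. All function names are hypothetical.

```python
import math


def t_from_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    # The density is symmetric, so both tails together are 1 - 2 * area.
    return 1 - 2 * area_zero_to_t


r, n = 0.706, 14              # values reported in the interpretation
t = t_from_r(r, n)
p = two_tailed_p(t, n - 2)
print(f"t({n - 2}) = {t:.3f}, p = {p:.4f}")
```

Because r is rounded to three decimals in the output, the computed p-value may differ slightly from the SPSS figure, but it should round to about .005.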
Application
Calculate and interpret the result of the Correlation Test using SPSS Statistics.
Statement of the Problem: Is there a significant relationship between Math
performance and gender of learners?
A group of students was classified in terms of their gender (male/female) and in
terms of Math performance. Access the data at
https://docs.google.com/spreadsheets/d/1ONg5_WEh_7NjRIUD5b_fdwpw62sqi7o4vHDw1hKzOx4/edit?usp=sharing
Indicate the values on the table and the interpretation below:
                                         Gender    Math performance
Gender             Pearson Correlation
                   Sig. (2-tailed)
                   N
Math performance   Pearson Correlation
                   Sig. (2-tailed)
                   N
Interpretation: ____________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
_________________________________________________________________.
Feedback
Goal: You are to seek answers to the following research questions:
1. What is the profile of the students in terms of:
a. Gender
b. Age
c. Family monthly income
d. Parents’ highest educational attainment
e. Parents’ occupation
2. What are the Multiple Intelligences of the students?
3. What is the Math performance of the learners?
4. Is there a significant relationship between learners’ Mathematics
performance and their profile variables and Multiple Intelligences?
Audience: Your target respondents are your respondents from the previous activity
(20 males and 20 females).
Situation: There is a need to conduct a study involving the associations between
the variables to fulfill your requirement. You decided to conduct a study that will
determine whether there is an association between students’ Math performance and
their profile, gender, and preferences for assessing student learning.
Product/Performance and Purpose: You need to submit the following:
a. E-copy of the questionnaire
b. Evidence of floating the questionnaire and a tally showing the data
gathered from the respondents
c. The crosstabulation and correlation tables from SPSS Statistics
d. The interpretation of the result
e. Conclusion and Recommendation
Outputs will be graded using the following rubrics:
Learning Objectives
At the end of the lesson, you are expected to:
1. Describe the characteristics of linear regression;
2. Calculate a linear regression using SPSS Statistics; and
3. Solve problems involving linear regression.
Presentation of Content
Linear regression is the next step up after correlation. Regression analyses are
used to examine the effect of different (predictor/independent) variables on a
single outcome (dependent) variable. More so, the use of the term “prediction” is
Solution: (Steps and photos adapted from SPSS Statistics, IBM Corporation)
In SPSS Statistics, you need to create two variables so that you can enter your
data: income (independent variable) and price (dependent variable).
Then follow the steps below to analyze your data using Linear
Regression in SPSS Statistics.
Step 1: Click Analyze > Regression > Linear... on the top menu, as shown below:
This table provides the R and R² values. The R value represents the simple
correlation and is 0.873 (the "R" column), which indicates a high degree of
correlation. The R² value (the "R Square" column) indicates how much of the
total variation in the dependent variable, Price, can be explained by the
independent variable, Income. In this case, 76.2% can be explained, which is very
large.
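A small sketch may help show where these numbers come from. It fits a least-squares line to a handful of hypothetical income and price values (not the data behind the SPSS output) and reports the intercept, slope and R². The function name is illustrative. In simple regression with one predictor, R² is the square of the correlation R, which is why R = 0.873 corresponds to about 76.2%.

```python
def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Proportion of variation in y explained by x
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


# Hypothetical values for illustration only
income = [10, 20, 30, 40, 50]
price = [12, 19, 33, 38, 52]
intercept, slope, r_squared = simple_linear_regression(income, price)
print(f"price = {intercept:.2f} + {slope:.2f} * income, R squared = {r_squared:.3f}")

# Relation between the reported R and R squared values
print(round(0.873 ** 2, 3))  # 0.762
```

The intercept and slope play the same role as the values in the "B" column of the SPSS Coefficients table.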
The next table is the ANOVA table, which reports how well the regression
equation fits the data (i.e., predicts the dependent variable) and is shown below:
This table indicates that the regression model predicts the dependent variable
significantly well. How do we know this? Look at the "Regression" row and go
to the "Sig." column. This indicates the statistical significance of the regression
model that was run. Here, p < 0.0005, which is less than 0.05, and indicates that,
overall, the regression model statistically significantly predicts the outcome
variable (i.e., it is a good fit for the data).
The Coefficients table provides us with the necessary information to predict price
from income, as well as determine whether income contributes statistically
significantly to the model (by looking at the "Sig." column). Furthermore, we can
use the values in the "B" column under the "Unstandardized Coefficients"
column, as shown below:
Interpretation: The information that needs to be taken from this table is the R-square
(.133). The R-square is the proportion of variation in the dependent variable
(overall satisfaction) that is explained by the three independent variables. It can be
expressed as a percentage, so 13.3 percent of the variation in overall satisfaction
can be explained by the three independent variables in the model.
Look at the Sig. column (p-values) first. You can see that nationality (.000) and
satisfaction with restaurants (.000) are significant predictors of (that is, significantly
related to) overall satisfaction. The standardized beta tells you the strength and
direction of the relationships (interpreted like correlation coefficients).
Satisfaction with restaurants is positively related to overall satisfaction (.26). High
levels of this satisfaction correspond to higher overall satisfaction.
Nationality is a dichotomous variable where 1 = European, and 2 = U.S. The
positive coefficient (correlation) for nationality suggests that high value on this
variable (which is 2 = U.S) corresponds to higher scores on the dependent
variable (i.e. high levels of overall satisfaction). It is interpreted as U.S tourists
(=2) reported higher levels of overall satisfaction compared with European
tourists (=1).
As stated before, this table shows that the overall model explains a significant
proportion of variance (see Table 1 on prior slide), or that the overall model is
statistically significant – all three independent variables have a significant
combined effect on overall satisfaction, F (3, 181) = 24.55, p < .001.
The table also shows that satisfaction with local transport (Beta = .289, p < .001)
and satisfaction with accommodation (Beta = .32, p < .001) were positively
correlated with overall satisfaction, suggesting that higher levels of satisfaction
with the two categories are associated with higher levels of overall satisfaction.
Number of children is not a significant predictor of overall satisfaction (p = .12).
Solution. “Multiple regression was conducted to examine whether satisfaction
with local transport, satisfaction with accommodation, and number of children
have an impact on overall satisfaction. The overall model explained 28.9 percent of variance in
overall satisfaction, which was revealed to be statistically significant, F (3, 181) =
24.55, p < .001. An inspection of individual predictors revealed that satisfaction
with local transport (Beta = .29, p < .001) and satisfaction with accommodation
(Beta = .32, p < .001) are significant predictors of overall satisfaction. Higher
satisfaction with accommodation and with local transport were associated with
higher levels of overall satisfaction.”
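The reported F statistic can be checked roughly against the reported R². The sketch below uses the standard relation F = (R²/k) / ((1 - R²)/(n - k - 1)) for a model with k predictors. The function name is hypothetical, and the sample size n = 185 is an assumption inferred from the degrees of freedom in F (3, 181), since 181 = n - 3 - 1.

```python
def f_from_r_squared(r_squared, k, n):
    """Overall F statistic of a regression with k predictors and n cases."""
    df1 = k           # regression degrees of freedom
    df2 = n - k - 1   # residual degrees of freedom
    f = (r_squared / df1) / ((1 - r_squared) / df2)
    return f, df1, df2


# R squared = .289 with three predictors; n = 185 is inferred, not reported
f, df1, df2 = f_from_r_squared(0.289, k=3, n=185)
print(f"F({df1}, {df2}) = {f:.2f}")
```

The result should be close to the reported 24.55; any small gap is expected because R² is rounded to three decimals in the write-up.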
Application
A health researcher wants to be able to predict "VO2 max", an indicator of
fitness and health. Normally, performing this procedure requires expensive
laboratory equipment and necessitates that an individual exercise to their
maximum (i.e., until they can no longer continue exercising due to physical
exhaustion). This can put off those individuals who are not very active/fit
and those individuals who might be at higher risk of ill health (e.g., older
unfit subjects). For these reasons, it has been desirable to find a way of
predicting an individual's VO2 max based on attributes that can be
measured more easily and cheaply. To this end, a researcher recruited 100
participants to perform a maximum VO2 max test, but also recorded their
"age", "weight", "heart rate" and "gender". Heart rate is the average of the
last 5 minutes of a 20-minute, much easier, lower-workload cycling test.
The researcher's goal is to be able to predict VO2 max based on these four
attributes: age, weight, heart rate and gender.
Interpretation:_____________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
_________________________________________________________________.
Criteria | Exceeds Expectations (3 points) | Meets Expectation (2 points) | Does Not Meet Expectation (1 or 0 point) | Score
Analyzes data; organizes and explains information | Explains data; accurate, logical explanations | Explains data; accurate, logical explanations | Poor or missing explanation of data |
Develops conclusions, synthesizes information and makes decisions | Reaches a creative, sensible decision based on all factors | Reaches a reasonable decision based on most factors | Reaches a poor decision based on few or no factors |
Feedback
Goal: Based on the results of the correlation activity, answer the
additional research question:
Which factors, singly or in combination, can predict the students’ Math
performance?
Situation: After conducting the correlations among the variables, determine
the variables that predict Math performance.
Product/Performance and Purpose: You need to submit the following:
a. The tables generated from the Linear Regression in SPSS Statistics
b. The interpretation of the result
c. Conclusion and Recommendation
Outputs will be graded using the following rubrics:
Criteria | Exceeds Expectations (3 points) | Meets Expectation (2 points) | Does Not Meet Expectation (1 or 0 point) | Score
Collects and records data | Complete and accurate records including supplementary data from the SPSS | Complete and accurate records | Missing or incomplete data; inaccurate records |
Recommendation contributes data for a larger research study | Contributes well | Contributes adequately or relates to larger study | Does not contribute |
Analyzes data; organizes and explains information | Explains data; accurate, logical explanations | Explains data; accurate, logical explanations | Poor or missing explanation of data |
Develops conclusions, synthesizes information and makes decisions | Reaches a creative, sensible decision based on all factors | Reaches a reasonable decision based on most factors | Reaches a poor decision based on few or no factors |
Research questionnaire addresses research question(s). | Well-constructed research questionnaire | Adequate questionnaire | Poor or missing research questionnaire |
Summary
There are three popular inferential statistics examining relationships
between variables namely:
Chi-square test
Correlations
Simple linear regression and Multiple Regression
The Pearson Chi-Square Test is used to test for an
association between two categorical variables.
Correlation tests (Pearson correlation) are commonly used to examine
relationships between two or more quantitative/numerical variables.
Linear regression is the next step up after correlation. It examines whether
one variable predicts (explains/impacts) another variable.
If you have two or more independent variables, rather than just one, you
need to use multiple regression.
Reflection
Congratulations! You are done with the first unit of this module. Now, go back to
the activities and lessons you have taken in this unit and answer the following
questions. Limit your answers for each question to 5 to 10 sentences only.
1. What are the differences among the three popular inferential statistics
examining relationships between variables?
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
___________________________________________________________.
2. Which of the topics in this unit did you like/dislike most? Why?
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
___________________________________________________________.
3. Which of the activities in this unit did you enjoy the most/ the least? Why?
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
____________________________________________________________
References
https://app.nova.edu/toolbox/instructionalproducts/Statistics%20&%20SPSS/Module%209%20Nonparametric%20tests.pdf
https://courses.lumenlearning.com/introstats1
https://statistics.laerd.com/spss-tutorials
Devonish, Dwayne. Exploring Relationships using SPSS inferential statistics (Part II).