Group4Module Correlations

CORRELATION
INTRODUCTION
Correlation refers to the statistical relationship between two entities. In other words, it
describes how two variables move in relation to one another: they may move in the same
direction, in opposite directions, or show no consistent pattern.
It is a statistical measure that expresses the extent to which two variables are linearly
related (meaning they change together at a constant rate). It is a common tool for
describing simple relationships without making a statement about cause and effect.
Correlation is used to test relationships between quantitative variables or categorical
variables. The study of how variables are correlated is called correlation analysis.
OBJECTIVES
At the end of this discussion, we should be able to:
1. Understand the meaning and purpose of correlational analysis.
2. Know and identify the possible results of a correlational study.
3. Understand and apply the methods for solving correlational relationships.
4. Identify the types of correlational research.
5. Know the basics of interpreting correlational data using SPSS.
DISCUSSION
As stated above, correlation focuses on the relationship between two variables, that is,
on bivariate data. Correlation was introduced by Francis Galton late in the year 1888, and
it arose when he recognized a common thread in three different scientific problems he
was studying.
Some examples of data that have a high correlation:
Excellence • Accountability • Service

• Your caloric intake and your weight.
• Your eye color and your relatives’ eye colors.
• The amount of time you study and your GPA.
Some examples of data that have a low correlation (or none at all):
• A dog’s name and the type of dog biscuit they prefer.
• The cost of a car wash and how long it takes to buy a soda inside the station.
Correlations are useful because, once you know how variables are related, you can make
predictions about future behavior. Such predictions are very important in social-science
fields like government and healthcare. Businesses also use these statistics for budgets
and business plans.
There are three possible results of a correlational study: a positive correlation, a
negative correlation, and no correlation.
POSITIVE CORRELATION
■ A positive correlation occurs whenever the variables change in the same direction. If
variable B tends to increase as variable A increases, there is a positive correlation. For
example, an increase in the number of hours that students study could be accompanied by
an increase in test scores, or lower ACT scores may indicate poorer performance in
college. These are both examples of positive correlations because the variables are
moving in the same direction.
NEGATIVE CORRELATION
A negative (inverse) correlation occurs when one variable
increases and the other variable decreases. An example would
be the relationship between increasing exercise and reducing
the number of doctor visits for colds and common illnesses.

ZERO CORRELATION
Finally, there may be zero correlation when there is no identifiable pattern for
determining a relationship. For example, there may be no relationship found between the
number of cups of coffee drunk per day and intelligence.
The data from a correlational study is often represented graphically using a scatterplot
or scatter diagram. Scatterplots are used to summarize the relationship between two
variables (X and Y) by plotting the discrete data points and then looking for overall
trends. The following graph represents the three main types of correlational
relationships:
The strength of the relationship is a measure of how consistently the values of each
variable change in relation to each other. Graphically, the stronger the relationship, the
closer the data points will fall along a line, as seen in the example below:
Some uses of correlations:

PREDICTION
If there is a relationship between two variables, we can make predictions about one
from another.
VALIDITY
Concurrent validity (correlation between a new measure and an established measure).

RELIABILITY
Test-retest reliability (are the measures consistent over time?).
Inter-rater reliability (are the observers consistent with one another?).
THEORY VERIFICATION
Predictive validity.
THE CORRELATION COEFFICIENT

A correlation coefficient puts a numerical value on the relationship. Correlation
coefficients take values between -1 and 1. A value of 0 means there is no linear
relationship between the variables, while -1 or 1 means that there is a perfect negative
or positive correlation (negative or positive here refers to the direction of the line
that the points form on a graph).
Graphs showing a correlation of -1, 0 and +1
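The three extreme values can be checked with a few lines of Python. This is only a minimal sketch: the function name `correlation` and the small data sets are made up for illustration, and the coefficient is computed from deviations about the mean.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation coefficient computed from deviations about the mean."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [-2, -1, 0, 1, 2]                                         # illustrative data only
perfect_positive = correlation(xs, [2 * x + 1 for x in xs])    # points on a rising line
perfect_negative = correlation(xs, [-3 * x for x in xs])       # points on a falling line
no_linear = correlation(xs, [x * x for x in xs])               # U-shape, no linear trend

print(round(perfect_positive, 3), round(perfect_negative, 3), round(no_linear, 3))
```

Note that the U-shaped data give a coefficient of 0 even though y depends entirely on x. A coefficient of 0 rules out a linear relationship, not every kind of relationship.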
METHODS IN SOLVING FOR CORRELATIONS
PEARSON (r) CORRELATION

- Most common method to use for numerical variables
- It falls under quantitative, correlational analysis
- It measures the strength of the linear relationship between two variables
- The correlation coefficient ranges from -1.00 to +1.00
- The sign (positive or negative) indicates the direction of the relationship
Three Basic types of Correlation

1. Positive Correlation - values of two variables tend to move in the same direction
(X↑, Y↑).
Example:
The more hours students spend studying for a test, the higher their test
grades tend to be. This correlation is positive because as the hours spent
studying go up, test grades tend to go up (they move in the same direction).
2. Negative Correlation - values of two variables tend to move in opposite
directions (X↑, Y↓, or vice versa).
Example:
The more hours students spend partying the night before an exam, the
lower their test grades tend to be. This correlation is negative because as
the hours spent partying go up, test grades tend to go down (they move in
opposite directions).
3. Zero Correlation - no correlation at all.
Strength and Direction of a Correlation Coefficient
Reference: Research Methods, Design, and Analysis, Twelfth Edition, p. 414.
Copyright © 2014, 2011, 2007 Pearson Education, Inc.
Different Strength and Direction of a Correlation Coefficient
Reference: Research Methods, Design, and Analysis, Twelfth Edition, p. 394.
Copyright © 2014, 2011, 2007 Pearson Education, Inc.

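How to Solve for the PEARSON r Correlation
Formula:
$$ r = \frac{N\sum xy - \sum x \sum y}{\sqrt{\left[N\sum x^2 - \left(\sum x\right)^2\right]\left[N\sum y^2 - \left(\sum y\right)^2\right]}} $$
Where:
N = number of paired scores
∑x = sum of the values of x
∑y = sum of the values of y
∑xy = sum of the products of x and y
∑x² = sum of the squared values of x
∑y² = sum of the squared values of y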
Example 1
The more hours students spend studying for a test, the higher their test scores
tend to be. This correlation is positive because as the hours spent studying go up,
test scores tend to go up (they move in the same direction). The test has 20 items.
X = hours spent studying, Y = test score
x        y        xy        x²        y²
1        10       10        1         100
2        12       24        4         144
3        15       45        9         225
4        18       72        16        324
5        20       100       25        400
∑x=15    ∑y=75    ∑xy=251   ∑x²=55    ∑y²=1193
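Substituting N = 5 and the sums from the table:
$$ r = \frac{N\sum xy - \sum x \sum y}{\sqrt{\left[N\sum x^2 - \left(\sum x\right)^2\right]\left[N\sum y^2 - \left(\sum y\right)^2\right]}} = \frac{5(251) - (15)(75)}{\sqrt{\left[5(55) - 15^2\right]\left[5(1193) - 75^2\right]}} $$
$$ r = \frac{130}{\sqrt{(50)(340)}} = \frac{130}{\sqrt{17000}} = 0.997 $$
Interpretation: r = 0.997 indicates a very strong positive association/correlation.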
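The hand computation in Example 1 can be double-checked in Python. The sketch below applies the raw-score formula to the five pairs in the table. The function name `pearson_r` and the variable names are chosen here for illustration and are not part of the module.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r via the raw-score formula:
    r = (N*Sxy - Sx*Sy) / sqrt((N*Sx2 - Sx^2) * (N*Sy2 - Sy^2))
    """
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

hours = [1, 2, 3, 4, 5]          # x: hours spent studying
scores = [10, 12, 15, 18, 20]    # y: test score out of 20
r = pearson_r(hours, scores)
print(round(r, 3))  # 0.997
```

The intermediate values match the table: the numerator is 130 and the denominator is the square root of 17000.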

KENDALL RANK CORRELATION
“What about the Kendall Rank Correlation (also known as Kendall’s tau-b)? What is it?
How do I get started? When do I use Kendall’s tau-b? Hey, just teach me everything
you know about Kendall Rank Correlation.” - A curious mind.
What is Kendall Rank Correlation?

Kendall rank correlation is also commonly known as “Kendall’s tau coefficient”. Kendall’s
tau coefficient and Spearman’s rank correlation coefficient assess statistical
associations based on the ranks of the data. Kendall rank correlation (non-parametric) is
an alternative to Pearson’s correlation (parametric) when the data you’re working with
has failed one or more assumptions of that test. It is also often the preferred
alternative to Spearman correlation (non-parametric) when your sample size is small and
has many tied ranks.
Kendall rank correlation is used to test the similarities in the ordering of data when it
is ranked by quantities. Other types of correlation coefficients use the observations
themselves as the basis of the correlation; Kendall’s correlation coefficient uses pairs
of observations and determines the strength of association based on the pattern of
concordance and discordance between the pairs.
• Concordant: Ordered in the same way (consistency). A pair of observations is
considered concordant if (x2 − x1) and (y2 − y1) have the same sign.
• Discordant: Ordered differently (inconsistency). A pair of observations is
considered discordant if (x2 − x1) and (y2 − y1) have opposite signs.
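The sign rule can be written as a short Python function. This is a sketch only: the name `classify_pair` is hypothetical, and the "tied" label for a zero difference is an assumption added here, since the definitions above cover only the same-sign and opposite-sign cases.

```python
def classify_pair(p1, p2):
    """Classify two observations (x, y) as concordant, discordant, or tied."""
    (x1, y1), (x2, y2) = p1, p2
    product = (x2 - x1) * (y2 - y1)
    if product > 0:
        return "concordant"   # both differences have the same sign
    if product < 0:
        return "discordant"   # the differences have opposite signs
    return "tied"             # assumption: a zero difference counts as neither

print(classify_pair((1, 10), (2, 12)))  # concordant
print(classify_pair((3, 4), (4, 3)))    # discordant
```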
Kendall’s tau usually takes smaller values than Spearman’s rho. Its calculation is based
on concordant and discordant pairs, it is relatively insensitive to error, and its
p-values are more accurate with smaller sample sizes.
Questions that Kendall rank correlation answers.

1. Correlation between a student’s exam grade (A, B, C…) and the time spent
studying, put in categories (<2 hours, 2–4 hours, 5–7 hours…)
2. Customer satisfaction (e.g. Very Satisfied, Somewhat Satisfied, Neutral…) and
delivery time (<30 minutes, 30 minutes to 1 hour, 1–2 hours, etc.)
Assumptions

You need to check that your data satisfies the assumptions before you dive into using
Kendall’s rank correlation. This will ensure that you have valid results that you can
actually use and not just numbers on your monitor.
1. The variables are measured on an ordinal or continuous scale. Ordinal scales are
typically measures of non-numeric concepts like satisfaction, happiness, or
discomfort (e.g. Very Satisfied, Somewhat Satisfied, Neutral, Somewhat
Unsatisfied, Very Unsatisfied). Continuous scales are essentially interval variables
(e.g. temperature, such as 30 degrees) or ratio variables (e.g. weight, height).
2. It is desirable, but not required, that your data appear to follow a monotonic
relationship. In simple terms, as the value of one variable increases, the other
variable either consistently increases or consistently decreases. Here’s why:
Kendall’s rank correlation measures the strength and direction of the association
(it determines whether there’s a monotonic relationship) between two variables.
Knowing this, testing for the presence of a monotonic relationship makes sense.
Monotonic vs Non-Monotonic Relationship
REFERENCE: https://towardsdatascience.com/kendall-rank-correlation-explained-dee01d99c535
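To see why a monotonic relationship matters, it can help to count concordant and discordant pairs for a curve that always rises and for one that falls and then rises. The sketch below uses made-up data, and the function name `concordance_counts` is hypothetical. Pairs with a zero difference are simply skipped.

```python
from itertools import combinations

def concordance_counts(xs, ys):
    """Return (concordant, discordant) counts over all pairs of observations."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x2 - x1) * (y2 - y1)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return concordant, discordant

xs = [-3, -2, -1, 0, 1, 2, 3]                                   # illustrative data only
monotonic = concordance_counts(xs, [x ** 3 for x in xs])        # always rising, not a straight line
non_monotonic = concordance_counts(xs, [x ** 2 for x in xs])    # falls, then rises

print(monotonic)      # (21, 0)
print(non_monotonic)  # (9, 9)
```

In the monotonic case every pair is concordant, so the association is as strong as it can be even though the curve is not a line. In the non-monotonic case the concordant and discordant pairs cancel out, so a rank correlation would report no association despite the clear pattern.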

Sample Question: Two interviewers ranked 12 candidates (A through L) for a position.
The results from most preferred to least preferred are:

• Interviewer 1: ABCDEFGHIJKL.
• Interviewer 2: ABDCFEHGJILK.
Calculate the Kendall Tau correlation.
Step 1: Make a table of rankings. The first column, “Candidate” is optional and for
reference only. The rankings for Interviewer 1 should be in ascending order (from least
to greatest).
Step 2: Count the number of concordant pairs, using the second interviewer’s column. For
each row, the number of concordant pairs is how many larger ranks appear below that rank.
For example, the first rank in the second interviewer’s column is a “1”, so all 11 ranks
below it are larger.
However, going down the list to the third row (a rank of 4), the rank immediately below
(3) is smaller, so it doesn’t count as a concordant pair.

When all concordant pairs have been counted, it looks like this:
Step 3: Count the number of discordant pairs and insert them into the next column. The
number of discordant pairs is similar to Step 2, only you’re looking for smaller ranks,
not larger ones.

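Step 4: Sum the values in the two columns:
Candidate   Interviewer 1   Interviewer 2   Concordant   Discordant
A           1               1               11           0
B           2               2               10           0
C           3               4               8            1
D           4               3               8            0
E           5               6               6            1
F           6               5               6            0
G           7               8               4            1
H           8               7               4            0
I           9               10              2            1
J           10              9               2            0
K           11              12              0            1
L           12              11              0            0
Total                                       61           5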
Step 5: Insert the totals into the formula:

Kendall’s Tau = (C − D) / (C + D)
= (61 − 5) / (61 + 5) = 56 / 66 = .85.
The Tau coefficient is .85, suggesting a strong relationship between the rankings.
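Steps 1 to 5 can be reproduced in Python as a check on the counting. This is a minimal sketch with variable names chosen here for illustration. It follows the same "count the ranks below" procedure and, like the formula above, assumes there are no tied ranks.

```python
interviewer_1 = "ABCDEFGHIJKL"
interviewer_2 = "ABDCFEHGJILK"

# Step 1: rank each candidate received from Interviewer 2, listed in Interviewer 1's order
rank_2 = {candidate: position + 1 for position, candidate in enumerate(interviewer_2)}
second_column = [rank_2[candidate] for candidate in interviewer_1]

# Steps 2 and 3: for each row, count the ranks below it that are larger or smaller
concordant = [sum(below > rank for below in second_column[i + 1:])
              for i, rank in enumerate(second_column)]
discordant = [sum(below < rank for below in second_column[i + 1:])
              for i, rank in enumerate(second_column)]

# Steps 4 and 5: total the columns and apply tau = (C - D) / (C + D)
C, D = sum(concordant), sum(discordant)
tau = (C - D) / (C + D)

print(second_column)         # [1, 2, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11]
print(C, D, round(tau, 2))   # 61 5 0.85
```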
Perfect Correlation
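Counting how many values lie below each entry in the second column seems very odd when
you first do it. But it does work. Just as a thought experiment, consider what the
spreadsheet looks like if both interviewers were in perfect agreement: every rank below a
given row is larger, so the concordant counts are 11, 10, 9, ..., 1, 0 (a total of 66)
and every discordant count is 0.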
And, inserting the totals into the formula we get:

Tau = (66 – 0) / (66 + 0) = 1, which is (as we expect) perfect agreement.
Calculating Statistical Significance
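If you want to calculate statistical significance for your result, use this formula to get
a z-value, where n is the number of items ranked (here, the 12 candidates):
$$ z = \frac{3\tau\sqrt{n(n-1)}}{\sqrt{2(2n+5)}} $$
Inserting the values from our results (τ = .85, n = 12):
$$ z = \frac{3(.85)\sqrt{12(11)}}{\sqrt{2(29)}} = \frac{3 \times .85 \times 11.489}{7.616} = 3.85 $$
Finding the area for a z-score of 3.85 on a z-table gives an area of about .0001, a tiny
probability value which tells you this result is statistically significant.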
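The z-value can also be computed in Python instead of by hand. The sketch below assumes the usual normal-approximation formula z = 3·tau·√(n(n−1)) / √(2(2n+5)), which matches the numbers substituted above (√132 = 11.489 and √58 = 7.616). The function names are hypothetical, and the tail area comes from the standard library's `erfc` rather than a printed z-table.

```python
from math import sqrt, erfc

def kendall_z(tau, n):
    """z-value for Kendall's tau with n ranked items (normal approximation)."""
    return 3 * tau * sqrt(n * (n - 1)) / sqrt(2 * (2 * n + 5))

def upper_tail_area(z):
    """Area under the standard normal curve to the right of z."""
    return 0.5 * erfc(z / sqrt(2))

z = kendall_z(0.85, 12)
p = upper_tail_area(z)

print(round(z, 2))   # 3.85
print(p < 0.001)     # True
```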
SPEARMAN’S RANK CORRELATION

o Spearman’s correlation coefficient (signified here by Sr) measures the strength
and direction of association between two ranked variables.
o It is the non-parametric version of the Pearson product-moment correlation.
o The correlation coefficient takes on values ranging between -1 and +1.
Formula:
$$ S_r = 1 - \frac{6\sum D^2}{n^3 - n} $$
Where:
D = difference between the two ranks of each pair
n = number of pairs
Value of Sr            Strength of relationship
Sr < 0.30              None or very weak relationship
0.30 < Sr < 0.50       Weak relationship
0.50 < Sr < 0.70       Moderate relationship
Sr > 0.70              Strong relationship
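The strength table can be turned into a small lookup function. This is only a sketch under two assumptions that the table does not state: the labels are applied to the magnitude of Sr (the sign gives direction only), and a value exactly on a boundary (0.30, 0.50, 0.70) is placed in the stronger category. The function name is hypothetical.

```python
def strength_of_relationship(sr):
    """Map a Spearman coefficient to a strength label from the table."""
    magnitude = abs(sr)  # assumption: the sign indicates direction, not strength
    if magnitude < 0.30:
        return "none or very weak"
    if magnitude < 0.50:
        return "weak"
    if magnitude < 0.70:
        return "moderate"
    return "strong"      # assumption: boundary values fall into the stronger category

print(strength_of_relationship(0.25))   # none or very weak
print(strength_of_relationship(-0.82))  # strong
```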
Example problem: Find the correlation of the scores for Mathematics and English.
Math score   English score   Math rank   English rank   D     D²
56           66              6           2              4     16
75           70              2           1              1     1
45           40              7           7              0     0
71           60              3           4              -1    1
61           65              5           3              2     4
64           56              4           6              -2    4
80           59              1           5              -4    16
ΣD² = 42
Substitute the values into the formula, with ΣD² = 42 and n = 7:
$$ S_r = 1 - \frac{6\sum D^2}{n^3 - n} = 1 - \frac{6(42)}{7^3 - 7} = 1 - \frac{252}{343 - 7} = 1 - \frac{252}{336} $$
$$ S_r = 1 - 0.75 = 0.25 $$
Sr = 0.25: very weak correlation.
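The ranking and the substitution can be verified with a short Python sketch. The function names are hypothetical. The ranking helper gives rank 1 to the highest score, as in the table, and assumes there are no tied scores (true for this data set).

```python
def rank_descending(values):
    """Rank values so that the highest gets rank 1 (assumes no tied values)."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

def spearman(xs, ys):
    """Spearman coefficient: 1 - 6 * sum(D^2) / (n^3 - n)."""
    n = len(xs)
    d_squared = [(rx - ry) ** 2
                 for rx, ry in zip(rank_descending(xs), rank_descending(ys))]
    return 1 - 6 * sum(d_squared) / (n ** 3 - n)

math_scores = [56, 75, 45, 71, 61, 64, 80]
english_scores = [66, 70, 40, 60, 65, 56, 59]

print(rank_descending(math_scores))      # [6, 2, 7, 3, 5, 4, 1]
print(rank_descending(english_scores))   # [2, 1, 7, 4, 3, 6, 5]
print(round(spearman(math_scores, english_scores), 2))  # 0.25
```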
TYPES OF CORRELATIONAL STUDIES
HOW CORRELATIONAL STUDIES WORK

■ Correlational research is a preliminary way to gather information about a topic.
The method is also useful if researchers are unable to perform an experiment.
■ Researchers use correlations to see if a relationship between two or more
variables exists, but the variables themselves are not under the control of the
researchers.
■ While correlational research can demonstrate a relationship between variables, it
cannot prove that changing one variable will change another. In other words,
correlational studies cannot prove cause-and-effect relationships.
1. Naturalistic Observation - is when a researcher collects data by observing subjects in
their natural environment without interfering or interacting with them in any way. This
type of observation is commonly used when lab experimentation is not possible,
feasible or ethical. For example, a researcher may want to see if there is a
correlation between class participation and grades by observing the amount of
participation by subjects in a classroom. This method can be time-consuming but offers
the advantage of greater assurance that the subjects are behaving normally.
2. Survey Research - is done by gathering information from a random selection of
subjects through the use of mail surveys, email or internet surveys, or interviews.
Survey research is relatively simple to perform once the survey questions have been
developed, and the researcher can reach a large number of potential subjects quickly.
The drawbacks are that the response rate can be low and there is no guarantee that the
subjects are being honest. An example of survey research that tests for a correlation
could be a researcher who looks for a correlation between home ownership and
education level by surveying homeowners and asking about their education level.
3. Archival Research - involves analyzing data that has previously been collected by
others and looking for correlations. The researcher does not have control over the data
or how it was gathered; however, the researcher may have access to large amounts of
data with relatively little effort, and often the data is free. For example, a researcher
may examine the crime statistics of several neighborhoods to see if there is any
correlation between crime and a sluggish housing market in particular areas.
CORRELATION VS. CAUSATION

■ Causation means that one variable (often called the predictor variable or
independent variable) causes the other (often called the outcome variable or
dependent variable).
■ Experiments can be conducted to establish causation. An experiment isolates and
manipulates the independent variable to observe its effect on the dependent
variable, and controls the environment in order that extraneous variables may be
eliminated.
■ A correlation between variables, however, does not automatically mean that the
change in one variable is the cause of the change in the values of the other
variable. A correlation only shows if there is a relationship between variables.
■ Correlation does not prove causation, as a third variable may be involved.
For example, being a patient in hospital is correlated with dying, but this does
not mean that one event causes the other, as a third variable might be
involved (such as diet or level of exercise).
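The third-variable problem can be illustrated with a small simulation. Everything in this sketch is made up for illustration: a hidden variable `third` drives both `a` and `b`, while `a` and `b` never influence each other. The `correlation` helper and the fixed random seed are choices made here, not part of the module.

```python
import random
from math import sqrt

def correlation(xs, ys):
    """Correlation coefficient computed from deviations about the mean."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

random.seed(4)  # fixed seed so the run is repeatable
third = [random.gauss(0, 1) for _ in range(2000)]   # hidden third variable
a = [t + random.gauss(0, 0.5) for t in third]       # depends only on the third variable
b = [t + random.gauss(0, 0.5) for t in third]       # depends only on the third variable

r_ab = correlation(a, b)
# a and b come out strongly correlated (about 0.8 in theory),
# yet neither one causes the other.
print(round(r_ab, 2))
```

Seeing a strong coefficient between `a` and `b` here says nothing about one causing the other. The only cause in the simulation is the hidden variable.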
STRENGTHS OF CORRELATION
■ Correlation allows the researcher to investigate naturally occurring variables that
may be unethical or impractical to test experimentally. For example, it would be
unethical to conduct an experiment on whether smoking causes lung cancer.
■ Correlation allows the researcher to clearly and easily see if there is a
relationship between variables. This can then be displayed in a graphical form.
LIMITATIONS OF CORRELATION
■ Correlation is not and cannot be taken to imply causation. Even if there is a very
strong association between two variables we cannot assume that one causes the
other.
■ For example, suppose we found a positive correlation between watching violence
on T.V. and violent behavior in adolescence. It could be that the cause of both
is a third (extraneous) variable, such as growing up in a violent home, and that
both the watching of T.V. and the violent behavior are the outcome of this.
■ Correlation does not allow us to go beyond the data that is given. For example,
suppose an association was found between time spent answering modules (from 1/2
hour to 3 hours) and the number of modules finished (up to 3 modules). It would
not be legitimate to infer from this that spending 6 hours answering modules
would make it likely to finish 6 modules.
SUMMARY
"Correlation is not causation" means that just because two variables are
related it does not necessarily mean that one causes the other.

A correlation identifies variables and looks for a relationship between them.
An experiment tests the effect that an independent variable has upon a
dependent variable, but a correlation only looks for a relationship between two
variables.
This means that an experiment can establish cause and effect (causation), but
a correlation can only indicate a relationship, as another extraneous variable
that is not known about may be involved.
ASSESSMENT
1.
2.
3.
4.
5.
REFERENCES
https://www.simplypsychology.org/correlation.html?fbclid=IwAR3Yv9ncJDFWwkIoym_y7HhDR7gqDPxK27YcVue519gllyGLMkro5j2BYc
https://www.statisticssolutions.com/free-resources/directory-of-statistical-analyses/correlation-pearson-kendall-spearman/?fbclid=IwAR0pu-OFxnPQfNmD3llZJfJC-o3fMvIU41B9jxpLT0enTkr0KrTjgApEww8
https://www.statisticshowto.com/probability-and-statistics/correlation-analysis/?fbclid=IwAR0INSrC1iMMkOfyLtQ7s5aTU-f3MOKJOD2nUCURFo1G8blSDDUsWmAZaRE
https://libguides.library.kent.edu/SPSS/PearsonCorr
https://statistics.laerd.com/spss-tutorials/kendalls-tau-b-using-spss-statistics.php
https://statistics.laerd.com/spss-tutorials/spearmans-rank-order-correlation-using-spss-statistics.php
https://ezspss.com/pearson-correlation-coefficient-and-interpretation-in-spss/
https://www.slideshare.net/zarikahn/kendall-rank-correlation
THANK YOU!
