
CORRELATION ANALYSIS

1
Topics

1. Correlation Analysis
▪ Sample case on Correlation Analysis
▪ Example of Correlation using Jamovi
▪ Example of Correlation using MS Excel

2
Topics

1. Introduction
2. Correlation
• What It Means
• What It Does Not Mean

3
Case: Housing Prices

Your uncle is planning to sell his house in the
USA. To get an initial feel for the types of houses
in his area, he has collected the data in the file
named HOUSES. This file contains information
such as the selling price of the house, square
feet, numbers of bedrooms & bathrooms, and
the presence of an attic for 108 sample homes
sold in his neighborhood. He needs help with the
data analysis.

4
Variables for the Housing Prices Case

SQ_FT: the total square feet of a house
BEDS & BATHS: the # of bedrooms & bathrooms
HEAT & STYLE: categorical variables

HEAT: heating type: 0 indicates gas forced air heating &
1 indicates electric heat
STYLE: architectural style of a house: 0 indicates a
trilevel, 1 indicates a two-story house & 2 indicates a
bungalow

5

Variables for the Housing Prices Case

GARAGE: the # of cars that can fit into the garage
BASEMENT: the presence (1) or absence (0) of a basement
AGE: the age of a house in years
FIRE: the presence (1) or absence (0) of a fireplace
PRICE: the selling price of a house in thousands of dollars
SCHOOL: the presence (1) or absence (0) of a school in the
area

6

Case: Housing Prices

❖ RESEARCH OBJECTIVE:
To determine the DETERMINANTS or PREDICTORS of housing prices

7

Case: Housing Prices

❖ Determinants or predictors are known as INDEPENDENT VARIABLES

❖ The outcomes they predict are known as DEPENDENT VARIABLES

8
Introduction

➢ Motivation for Conducting Correlation & Linear Regression Analysis:

▪ Aim is to simultaneously analyze multiple variables
o Consider a database of various variables across
clients (e.g. educational attainment, sex, income
& household assets)
o We may be interested in determining the
RELATIONSHIP among these variables

9
Introduction

❖ GUIDE to examine relationships:

10
Sir Francis Galton: Founder of CORRELATION & linear regression

11
Introduction
• Consider Galton’s data on heights of fathers & first-born sons
• Tall fathers tend to have tall sons; short fathers tend to have short sons.
(Scatter plot with axes labeled X and Y)

12
Introduction

❖ Scatter Plot (scatter diagram) =
Can be used to show the relationship
between 2 numerical variables

▪ Aside from graphical devices, there are other ways
of assessing relationships:
▪ Correlation Analysis
▪ Simple Linear Regression

13
Introduction

Purpose of Correlation & Regression

❖ Correlation Analysis:
– Uses the correlation coefficient to detect whether 2
variables are “linearly” related (or associated)
– i.e. Does one variable increase when the other
variable increases?
– Does one variable decrease when the other
variable increases?
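
As a rough illustration of what the correlation coefficient measures, here is a minimal Python sketch that computes it from first principles. The function name `pearson_r` and the five data pairs are assumptions made up for illustration; they are not taken from the HOUSES file, and tools such as Jamovi or MS Excel would normally do this calculation for you.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviations of each observation from its variable's mean
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sxy = sum(a * b for a, b in zip(dx, dy))  # sum of cross-products
    sxx = sum(a * a for a in dx)              # sum of squares for x
    syy = sum(b * b for b in dy)              # sum of squares for y
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy values (NOT from the HOUSES file):
# square feet and selling price in thousands of dollars
sq_ft = [1000, 1500, 2000, 2500, 3000]
price = [110, 150, 205, 240, 310]

r = pearson_r(sq_ft, price)
print(round(r, 3))  # a value close to +1 means price rises with size
```

A result near +1 answers "yes" to the first question above, and a result near -1 answers "yes" to the second.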

14
Introduction

Purpose of Correlation & Regression

❖ Simple Linear Regression (SLR):
– Used to predict the value of 1 dependent
(response) variable based on the value of 1
independent (explanatory) variable
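
To make the idea of prediction concrete, the sketch below fits a straight line by ordinary least squares, which is the usual fitting method for SLR (the slides do not name one, so this is an assumption). The function name `fit_line` and the data are hypothetical and are not from the HOUSES file.

```python
def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for the line y = b0 + b1 * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("slope is undefined when x is constant")
    b1 = sxy / sxx               # slope: change in y per unit change in x
    b0 = mean_y - b1 * mean_x    # intercept: the line passes through the means
    return b0, b1

# Hypothetical toy values (NOT from the HOUSES file)
sq_ft = [1000, 1500, 2000, 2500, 3000]   # independent (explanatory) variable
price = [110, 150, 205, 240, 310]        # dependent (response) variable, $000s

b0, b1 = fit_line(sq_ft, price)
predicted = b0 + b1 * 1800               # predicted price of an 1,800 sq ft house
print(round(predicted, 1))
```

The slope `b1` is read as the estimated change in the dependent variable for each one-unit increase in the independent variable.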

15
Correlation

16
Correlation & Scatter Plot

17
Correlation & Scatter Plot Diagram

❖ We examine if one independent
(explanatory) variable is related or
associated w/ one dependent
(response or outcome) variable

18
Correlation & Scatter Plot

19
Correlation

❖ Direction of the Relationship between 2 quantitative variables
1) Positive relationship =
- As the independent variable increases, the dependent
variable increases as well
2) Negative relationship =
- As the independent variable increases, the
dependent variable decreases OR
- As the independent variable decreases, the
dependent variable increases
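
One way to check the direction numerically, sketched here under assumptions: the sign of the summed cross-products of deviations from the means is the same as the sign of the correlation coefficient. The helper `direction` and the values below are hypothetical, not from the HOUSES file.

```python
def direction(xs, ys):
    """Direction of the linear relationship between two numerical variables."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # This sum carries the same sign as the correlation coefficient
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if cross > 0:
        return "positive"
    if cross < 0:
        return "negative"
    return "no linear relationship"

# Hypothetical toy values (NOT from the HOUSES file)
sq_ft = [1000, 1500, 2000, 2500]
age = [40, 25, 12, 3]
price = [110, 150, 205, 240]

print(direction(sq_ft, price))  # larger houses sell for more in this toy data
print(direction(age, price))    # older houses sell for less in this toy data
```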

20
Scatter Plot

21
Example of a Positive Relationship

22
Example of a Negative Relationship

23
Scatter Plot

24
Correlation & Scatter Plot

25
26
Correlation

❖ Population Correlation Coefficient ρ (Rho) =
- Used to measure the strength of association (linear
relationship) between 2 numerical variables
– Concerned with strength of relationship
– No causal (cause-&-effect) relationship is implied

• Sample Correlation Coefficient r is a point estimate of ρ
- What we normally use, since we usually have samples instead of
populations
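
The slides do not write out the formula for r. Assuming r is the usual Pearson product-moment coefficient, it can be stated as follows, where x_i and y_i are the paired observations, x̄ and ȳ are the sample means, and n is the sample size (these symbols are notation introduced here, not in the slides):

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad -1 \le r \le +1
```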

27
Correlation

-1.0 = Perfect negative correlation
0 = Zero correlation
+1.0 = Perfect positive correlation

Scale: -1.0   -0.5   0   +0.5   +1.0

From 0 toward -1.0: increasing degree of negative correlation
From 0 toward +1.0: increasing degree of positive correlation

28
Degree of Strength of Correlation

• Perfect: If the value is at or very near ± 1, then it is said to be
a perfect correlation: as one variable increases,
the other variable also increases (if
positive) or decreases (if negative).
• High degree: If the coefficient value lies between
± 0.50 and ± 1, then it is said to be a strong
correlation.
• Moderate degree: If the value lies between ±
0.30 and ± 0.49, then it is said to be a medium
correlation.
• Low degree: When the value lies at or below ± 0.29
(but is not zero), then it is said to be a small correlation.
• No correlation: When the value is zero.
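
The cutoffs above can be turned into a small lookup, sketched here in Python. The function name `describe_strength` is hypothetical. Two assumptions are made: only exactly ± 1 is labeled "perfect", and values that fall between the listed bands (for example 0.295 or 0.495) are assigned to the lower band.

```python
def describe_strength(r):
    """Label the strength of a correlation coefficient using the cutoffs above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    size = abs(r)  # strength ignores the sign; the sign gives the direction
    if size == 0:
        return "no correlation"
    if size == 1:
        return "perfect"      # assumption: only exactly +/-1 counts as perfect
    if size >= 0.50:
        return "high"
    if size >= 0.30:
        return "moderate"
    return "low"

print(describe_strength(0.828))   # high
print(describe_strength(-0.35))   # moderate
```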

29
Sample Narrative when Describing the Results
of Correlation Analysis

❖ Square feet and house price show a
high degree of positive correlation,
r = .828, and the correlation is
statistically significant (p < .001)

30
Null & Alternative Hypothesis Statements for
the Test of Correlation

❖ Null Hypothesis (Ho) =
The X & Y variables are NOT related.

❖ Alternative Hypothesis (Ha) =
The X & Y variables are related.

31
Ho & Ha Statements for Housing Price Case

❖ Null Hypothesis (Ho) =
The size of a house & its price are
NOT related.

❖ Alternative Hypothesis (Ha) =
The size of a house & its price are
related.
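
The slides do not show how the test is carried out. A common approach, assumed here, is the t test for a correlation coefficient, t = r√(n − 2) / √(1 − r²) with n − 2 degrees of freedom. The sketch below plugs in r = .828 from the sample narrative and n = 108 from the case description, assuming all 108 homes were used in the correlation. The function name is hypothetical.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic and degrees of freedom for testing Ho: no linear relationship."""
    if n < 3:
        raise ValueError("need at least 3 paired observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and +1")
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
    return t, df

# r from the sample narrative; n assumes all 108 sample homes were used
t, df = correlation_t_statistic(0.828, 108)
print(f"t = {t:.2f} with {df} degrees of freedom")
```

A t value this far from zero (roughly 15) is consistent with the reported p < .001, so Ho would be rejected. Statistical software such as Jamovi reports the p-value directly.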

32
