
BES220: Simple linear regression

Line fitting and correlation

Dr. Elias J. Willemse


Poverty vs. HS graduate rate

The scatterplot below shows the relationship between High School
completion rate in different districts in South Africa and the % of
residents who live below the poverty line (income below R100 000 for
a family of 4 in 2012).

[Figure: scatterplot of % in poverty (vertical axis, 6 to 18) against % HS grad (horizontal axis, 80 to 90)]
• Response variable? % in poverty
• Explanatory variable? % HS grad
• Relationship? Linear, negative, moderately strong
Quantifying the relationship

• Correlation describes the strength of the linear association
between two variables.
• It takes values between -1 (perfect negative) and +1 (perfect
positive).
• A value of 0 indicates no linear association.
Guessing the correlation

Which of the following is the best guess for the correlation between
% in poverty and % HS grad?

(a) 0.6
(b) -0.75
(c) -0.1
(d) 0.02
(e) -1.5

[Figure: scatterplot of % in poverty (vertical axis) against % HS grad (horizontal axis)]
Answer: (b) -0.75. The relationship is linear, negative and moderately
strong, and (e) is impossible because a correlation cannot be below -1.
Guessing the correlation

Which of the following is the best guess for the correlation between
% in poverty and % female householder without a husband present?

(a) 0.1
(b) -0.6
(c) -0.4
(d) 0.9
(e) 0.5

[Figure: scatterplot of % in poverty (vertical axis, 6 to 18) against % female householder, no husband present (horizontal axis, 8 to 18)]
Assessing the correlation

Which of the following has the strongest correlation, i.e. correlation
coefficient closest to +1 or -1?

[Figure: four scatterplots, panels (a), (b), (c) and (d)]
Answer: (b). Correlation means linear association.
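
To see why "correlation means linear association" matters, the sketch below compares a perfectly curved pattern with a perfectly straight one. It is an illustration only: the data are hypothetical (not the four panels of the slide), and `linear_r` is a helper name invented for this example. It uses the sums-of-squares form of the correlation, which is algebraically equivalent to the usual sample formula.

```python
from math import sqrt

def linear_r(xs, ys):
    """Correlation computed from sums of squared deviations (hypothetical helper)."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical data: in both cases y is completely determined by x.
xs = [-2, -1, 0, 1, 2]
curved = [x ** 2 for x in xs]        # perfect parabola, no linear trend
straight = [3 * x - 1 for x in xs]   # perfect straight line

r_curved = linear_r(xs, curved)
r_straight = linear_r(xs, straight)
print(r_curved, r_straight)
```

The parabola is a very strong relationship, yet its correlation is essentially zero, while the straight line reaches the maximum. A scatterplot should therefore always be inspected alongside the correlation.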
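Calculating the correlation

$$r = \frac{1}{n-1} \times \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{s_x s_y}$$

where $\bar{x}$ and $\bar{y}$ are the sample means, $s_x$ and $s_y$ are the
sample standard deviations, and $n$ is the number of observations.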
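
The formula can be translated into code almost term by term. The following is a minimal sketch in Python using only the standard library. The function name `correlation_from_formula` and the example values are assumptions made for illustration; the numbers are invented and are not the poverty data shown in the scatterplots.

```python
from statistics import mean, stdev

def correlation_from_formula(xs, ys):
    """r = 1/(n-1) * sum((x - x_bar) * (y - y_bar)) / (s_x * s_y)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1 divisor)
    # Sum of products of paired deviations from the means.
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / ((n - 1) * s_x * s_y)

# Made-up (x, y) pairs for illustration only.
hs_grad_example = [80.0, 82.0, 85.0, 88.0, 90.0]
poverty_example = [16.0, 15.0, 12.0, 11.0, 8.0]

r = correlation_from_formula(hs_grad_example, poverty_example)
print(f"r = {r:.3f}")
```

Note that the sum runs over paired observations: each x value is multiplied only with the y value from the same case. The result is negative here because larger x values go with smaller y values in the invented data.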
