
Correlation and Regression
Objectives

– Draw a scatter plot for a set of ordered pairs.
– Find the correlation coefficient.
– Test the hypothesis H0: ρ = 0.
– Find the equation of the regression line.
– Find the coefficient of determination.
– Find the standard error of estimate.
Scatter Plots

A scatter plot is a graph of the ordered pairs (x, y) of numbers consisting of the independent variable, x, and the dependent variable, y.
Scatter Plots - Example
Construct a scatter plot for the data obtained in a study on the number of absences and the final grades of seven randomly selected students from a statistics class. The data are shown here.
Scatter Plots – Example

Positive Relationship
[Figure: scatter plot of Pressure (y-axis, 120 to 150) against Age (x-axis, 40 to 70).]
Scatter Plots – Example

Negative Relationship
[Figure: scatter plot of Final grade (y-axis, 40 to 90) against Number of absences (x-axis, 5 to 15).]
Scatter Plots – Example

No Relationship
[Figure: scatter plot of y (y-axis, 0 to 10) against x (x-axis, 0 to 70).]
Correlation Coefficient

– The correlation coefficient computed from the sample data measures the strength and direction of a linear relationship between two variables.
– Sample correlation coefficient: r.
– Population correlation coefficient: ρ.
Range of Values for the Correlation Coefficient

The correlation coefficient ranges from –1 to +1.
– r near –1: strong negative relationship
– r near 0: no linear relationship
– r near +1: strong positive relationship
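Formula for the Correlation Coefficient (r)

$$
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}
$$

where n is the number of data pairs.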
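The formula translates almost term by term into code. The following is a minimal Python sketch, not a reference implementation; the function name `correlation_coefficient` and the demo data are hypothetical and were made up for illustration (they are not the absences and grades data from the example).

```python
from math import sqrt


def correlation_coefficient(xs, ys):
    """Sample correlation coefficient r from the computational formula."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)  # number of data pairs
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical data, invented for this sketch only.
x_demo = [1, 2, 3, 4, 5]
y_demo = [10, 8, 7, 4, 1]
r_demo = correlation_coefficient(x_demo, y_demo)
print(round(r_demo, 3))  # close to -1, i.e. a strong negative relationship
```

Note that the denominator is zero when all x values (or all y values) are identical; the sketch does not guard against that case.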
Correlation Coefficient - Example
Compute the correlation coefficient of the data obtained in a study on the number of absences and the final grades of seven randomly selected students from a statistics class. The data are shown here.
The Significance of the Correlation Coefficient

– The population correlation coefficient, ρ, is the correlation between all possible pairs of data values (x, y) taken from a population.
– H0: ρ = 0    H1: ρ ≠ 0
– This tests for a significant correlation between the variables in the population.
Formula for the t test for the Correlation Coefficient

$$
t = r\sqrt{\frac{n - 2}{1 - r^2}}
$$

with d.f. = n – 2.
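As a rough illustration of how the test value is obtained, here is a small Python sketch. The function name `t_statistic` and the inputs r = 0.6, n = 27 are assumptions chosen for the demo, not values from the slides. The critical value still has to be looked up in a t table for the returned degrees of freedom.

```python
from math import sqrt


def t_statistic(r, n):
    """Return (t, df) for testing H0: rho = 0, where df = n - 2."""
    if n <= 2:
        raise ValueError("need more than two data pairs")
    if abs(r) >= 1:
        raise ValueError("t is undefined when |r| = 1")
    df = n - 2
    t = r * sqrt(df / (1 - r ** 2))
    return t, df


# Hypothetical inputs, chosen only to illustrate the arithmetic.
t_demo, df_demo = t_statistic(0.6, 27)
print(t_demo, df_demo)
```

For a two-tailed test, H0 is rejected when |t| exceeds the table value for the chosen α and the computed degrees of freedom.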
The Significance of the Correlation Coefficient - Example
An environmentalist wants to determine the relationship between the number (in thousands) of forest fires over the year and the number (in hundred thousands) of acres burned. The data for 8 recent years are shown. Is there a significant relationship between the two variables? Use α = 0.05.
Regression

– Linear regression is an approach for modeling the relationship between a dependent variable y and an independent variable x.
– The regression line is called the line of best fit.
– The equation of the line is y’ = a + bx.
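Formulas for the Regression Line y’ = a + bx

$$
a = \frac{\left(\sum y\right)\left(\sum x^2\right) - \left(\sum x\right)\left(\sum xy\right)}{n\sum x^2 - \left(\sum x\right)^2}
$$

$$
b = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2}
$$

where a is the y’ intercept and b is the slope of the line.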
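The two formulas share a denominator, which a direct implementation can compute once. The sketch below is illustrative only; `regression_line` and the demo data are hypothetical names and values, not taken from the slides.

```python
def regression_line(xs, ys):
    """Return (a, b) for the line y' = a + b*x using the summation formulas."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denominator = n * sum_x2 - sum_x ** 2  # shared by a and b
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b


# Hypothetical data, invented for this sketch only.
x_demo = [1, 2, 3, 4, 5]
y_demo = [10, 8, 7, 4, 1]
a_demo, b_demo = regression_line(x_demo, y_demo)
predicted_at_6 = a_demo + b_demo * 6  # prediction from y' = a + b*x
print(a_demo, b_demo, predicted_at_6)
```

Predictions far outside the range of the observed x values should be treated with caution, since the line was fitted only to the observed range.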
Regression - Example
In a study on speed and braking distance, researchers looked for a method to estimate how fast a person was traveling before an accident by measuring the length of the skid marks. An area that was focused on in the study was the distance required to completely stop a vehicle at various speeds. Use the table to answer the questions.
1. Find the linear regression equation.
2. Find the braking distance when the speed is 45 mph.
3. If the speeds differ by 30 mph, then what will be the predicted difference in their braking distance?
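For question 3, the intercept cancels when two predictions are subtracted, so only the slope matters. A short derivation, using x_1 and x_2 as assumed labels for the two speeds (they are not named in the original):

```latex
\begin{aligned}
y'_2 - y'_1 &= (a + b x_2) - (a + b x_1) \\
            &= b\,(x_2 - x_1) \\
            &= 30\,b \quad \text{when } x_2 - x_1 = 30 \text{ mph}
\end{aligned}
```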
Coefficient of Determination

– The coefficient of determination, denoted by r², is a measure of the variation of the dependent variable that is explained by the regression line and the independent variable.
– r² is the square of the correlation coefficient.
– Example: If r = 0.90, then r² = 0.81 (coefficient of determination).
– The coefficient of non-determination is 1 – r².
– Example: If r = 0.90, then 1 – r² = 0.19 (coefficient of non-determination).
Standard Error of Estimate

– The standard error of estimate, denoted by s_est, is the standard deviation of the observed y values about the predicted y’ values.
Formula for the Standard Error of Estimate

$$
s_{est} = \sqrt{\frac{\sum \left(y - y'\right)^2}{n - 2}}
$$
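The computation can be sketched in Python, assuming the usual textbook form of the formula: the square root of the sum of squared differences between each observed y and its predicted y’, divided by n – 2. The function name `standard_error_of_estimate` and the demo data and coefficients are hypothetical, chosen only for illustration.

```python
from math import sqrt


def standard_error_of_estimate(xs, ys, a, b):
    """s_est for the line y' = a + b*x, assuming the n - 2 textbook form."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n <= 2:
        raise ValueError("need more than two data pairs")
    # Sum of squared differences between observed y and predicted y'.
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(sse / (n - 2))


# Hypothetical data and line, invented for this sketch only.
x_demo = [1, 2, 3, 4, 5]
y_demo = [10, 8, 7, 4, 1]
s_demo = standard_error_of_estimate(x_demo, y_demo, a=12.6, b=-2.2)
print(round(s_demo, 3))
```

A smaller s_est means the observed values lie closer to the regression line.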
Standard Error of Estimate - Example
Given the data below and the equation of the regression line y’ = -31.46 + 1.04x, calculate the standard error of estimate.
Exercise
Recent agricultural data showed the number of eggs produced and the price received per dozen for a given year. Perform the following:
a. Draw the scatter plot for the variables, then briefly describe the type of relationship.
b. Compute the value of the correlation coefficient.
c. Test the significance of the correlation coefficient at α = 0.05.
Exercise
These data were obtained from a sample of counties in southwestern Pennsylvania and indicate the number (in thousands) of tons of coal produced in each county and the number of employees working in coal production in each county. Perform the following:
a. Compute the correlation coefficient.
b. Determine the regression line equation.
c. Predict the amount of coal produced for a county that has 500 employees.
d. Calculate the standard error of estimate.
