
Chapter 10

Regression and Correlation Analysis
Measures of Relationship
o Statistics are widely used in the social sciences in
making predictions which are based upon the fact
that two variables are related.
o The process of obtaining the measure of the degree
of relationship or association between variables is
called correlation analysis.
o When a known measure of one variable is used to
make estimates of a second variable, the process is
known as regression analysis.
Regression Analysis
o Regression analysis is the process by which one
variable Y is predicted from another variable X.
o The variable Y is called the dependent variable and
X is called the independent variable or the predictor.

Illustration. Predicting the academic performance of a student
based on knowledge of his IQ.

Independent variable = student’s IQ
Dependent variable = academic performance
Linear vs. Non-linear Regression
o Linear regression is an approach for modeling the relationship
between a scalar dependent variable Y and one or
more explanatory variables denoted X.
o The case of one explanatory variable is called simple
linear regression.
o For more than one explanatory variable, the process
is called multiple linear regression.
o By linear regression, we mean that there is a straight-line
relationship between the variables X and Y.
o Nonlinear regression is a form of regression analysis in
which observational data are modeled by a function which is
a nonlinear combination of the model parameters and
depends on one or more independent variables.
Linear Regression
o This linear relationship can be expressed in an
equation of the form
Y = a + bX

where Y = predicted score
      X = value of the independent variable
      a = the Y-intercept
      b = the slope of the line
Scatter Diagram
o Visual representation of the relationship between
two variables. In this method, values of the two
variables are plotted on a two-dimensional
coordinate system.

A line could be fitted and we could conclude that there is a
linear relationship between the two variables.

A curve instead of a line will fit the plotted points, so that
we can say that there is a non-linear relationship.
Method of Least Squares
o The line obtained by the method of least squares is
known as the regression line and is also referred to as
the line of best fit.
\[ Y = a + bX \]
where
\[ b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2} \]
\[ a = \frac{\sum Y}{n} - b\,\frac{\sum X}{n} = \bar{Y} - b\bar{X} \]
Method of Least Squares
Example. Given the data below, find the equation of the
regression line. Estimate the value of Y if X is 7.
X 2 4 6 8 10 12
Y 11 9 8 5 4 3
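One way to work this example is sketched below in Python. The function name least_squares_line is an illustrative choice, not something defined in the slides; it simply applies the formulas for b and a to the six data points.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line Y = a + bX fitted by least squares.

    Hypothetical helper for illustration; uses the sum-based formulas
    b = (n*SXY - SX*SY) / (n*SXX - SX^2) and a = SY/n - b*SX/n.
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)

    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b


# Data from the example.
X = [2, 4, 6, 8, 10, 12]
Y = [11, 9, 8, 5, 4, 3]

a, b = least_squares_line(X, Y)
estimate_at_7 = a + b * 7  # estimated Y when X is 7

print(f"Y = {a:.4f} + ({b:.4f})X")
print(f"Estimated Y at X = 7: {estimate_at_7:.4f}")
```

With these data the sums are ΣX = 42, ΣY = 40, ΣXY = 222 and ΣX² = 364, which gives a slope of about -0.83, an intercept of about 12.47, and an estimate of about 6.67 at X = 7. Since 7 is the mean of X, the estimate equals the mean of Y, as expected for a least squares line.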
Correlation Analysis
o Correlation analysis is used to measure the linear
relationship or association between two variables.
o The measure of the degree of association between
two variables is known as the coefficient of
correlation (r).
o The value of r varies from –1 to +1. This can be
expressed as the interval –1 ≤ r ≤ 1.
o For perfectly positive correlation, r = 1, while in a
perfectly negative correlation, r = –1.
o If r = 0, then there is no linear relation existing
between the two variables.
Correlation Analysis
o A positive correlation is present when high values of one variable are
associated with high values of the other variable, and low values with low values.
o On the other hand, when high values on one variable are associated with
low values of the other variable or vice versa, a negative correlation is
present.
Correlation Analysis
o The degree of linear relationship can be interpreted
by using the following range of values:
Range of Value of r                 Description
0.90 to 1.00 or (-0.90 to -1.00)    Very high positive (negative) correlation
0.70 to 0.89 or (-0.70 to -0.89)    High positive (negative) correlation
0.50 to 0.69 or (-0.50 to -0.69)    Moderate positive (negative) correlation
0.30 to 0.49 or (-0.30 to -0.49)    Low positive (negative) correlation
0.00 to 0.29 or (0.00 to -0.29)     Little, if any correlation
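The table can be turned into a small lookup, sketched below in Python. The function name describe_correlation is hypothetical. The table lists ranges to two decimal places, so values that fall between rows (for example 0.895) are assigned here to the lower band; that cutoff rule is an assumption, not something the table states.

```python
def describe_correlation(r):
    """Map a correlation coefficient r to the description in the table.

    Assumption: each band starts at its lower bound (0.30, 0.50, 0.70,
    0.90) and runs up to the next band's lower bound.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")

    size = abs(r)
    if size < 0.30:
        return "Little, if any correlation"

    direction = "positive" if r > 0 else "negative"
    if size >= 0.90:
        strength = "Very high"
    elif size >= 0.70:
        strength = "High"
    elif size >= 0.50:
        strength = "Moderate"
    else:
        strength = "Low"
    return f"{strength} {direction} correlation"


print(describe_correlation(0.95))
print(describe_correlation(-0.75))
print(describe_correlation(0.10))
```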
Correlation Analysis
Pearson Product Moment Correlation Coefficient
o It is a measure of the linear correlation (dependence) between
two variables X and Y, giving a value between +1 and −1
inclusive, where 1 is total positive correlation, 0 is no
correlation, and −1 is total negative correlation.
o It is widely used in the sciences as a measure of the degree of
linear dependence between two variables. It was developed
by Karl Pearson from a related idea introduced by
Francis Galton in the 1880s.

n XY   X  Y
Pearson r 
 n X 2   x 2   n Y 2   Y 2 
       
Correlation Analysis
Example. Find the degree of relationship between the English and
Mathematics scores on a college entrance examination using Pearson r.

Student No.    English Score    Math Score
1              86               65
2              55               92
3              75               85
4              93               60
5              89               58
6              67               84
7              60               86
8              52               90
9              83               72
10             86               69
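A minimal Python sketch of this computation follows. The function name pearson_r and the variable names are illustrative assumptions; the function applies the sum-based Pearson r formula to the ten pairs of scores.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r from raw sums (hypothetical helper for illustration)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator


# Scores of the ten students in the example.
english = [86, 55, 75, 93, 89, 67, 60, 52, 83, 86]
math_scores = [65, 92, 85, 60, 58, 84, 86, 90, 72, 69]

r = pearson_r(english, math_scores)
print(f"Pearson r = {r:.2f}")
```

For these scores ΣX = 746, ΣY = 761, ΣXY = 55145, ΣX² = 57694 and ΣY² = 59375, so r comes out at about -0.94. By the interpretation table this is a very high negative correlation: students with high English scores tended to have low Mathematics scores.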
