Regression and Correlation Analysis
Measures of Relationship
o Statistics is widely used in the social sciences to
make predictions based on the fact that two
variables are related.
o The process of obtaining the measure of the degree
of relationship or association between variables is
called correlation analysis.
o When a known measure of one variable is used to
make estimates of a second variable, the process is
known as regression analysis.
Regression Analysis
o Regression analysis is the process by which one
variable Y is predicted from another variable X.
o The variable Y is called the dependent variable and
X is called the independent variable or the predictor.
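o The regression line has the form
\[ Y = a + bX \]
where the slope b and the intercept a are
\[ b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2}, \qquad a = \frac{\sum Y}{n} - b\,\frac{\sum X}{n} \]
and n is the number of (X, Y) pairs.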
Method of Least Squares
Example. Given the data below, find the equation of the
regression line. Estimate the value of Y if X is 7.
X 2 4 6 8 10 12
Y 11 9 8 5 4 3
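The example can be worked through with a short script. This is a minimal sketch, assuming the usual least-squares formulas b = (nΣXY − ΣXΣY) / (nΣX² − (ΣX)²) and a = ΣY/n − bΣX/n; the function name fit_line and the variable names are illustrative, not part of the original slides.

```python
# Data from the example: six (X, Y) pairs.
xs = [2, 4, 6, 8, 10, 12]
ys = [11, 9, 8, 5, 4, 3]


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX.

    Assumes the standard least-squares formulas for slope and intercept.
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Slope first, then the intercept, which depends on the slope.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b


a, b = fit_line(xs, ys)
y_at_7 = a + b * 7  # estimate of Y when X = 7

print(f"Y = {a:.3f} + ({b:.3f})X")
print(f"Estimated Y at X = 7: {y_at_7:.2f}")
```

With these data the sums are ΣX = 42, ΣY = 40, ΣXY = 222 and ΣX² = 364, so the line is approximately Y = 12.467 − 0.829X and the estimate at X = 7 is about 6.67. Because 7 is the mean of the X values, the estimate equals the mean of the Y values.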
Correlation Analysis
o Correlation analysis is used to measure the linear
relationship or association between two variables.
o The measure of the degree of association between
two variables is known as the coefficient of
correlation (r).
o The value of r varies from −1 to +1. This can be
expressed as the interval −1 ≤ r ≤ 1.
o For a perfect positive correlation, r = 1, while for a
perfect negative correlation, r = −1.
o If r = 0, then there is no linear relation existing
between the two variables.
o A positive correlation is present when high values of one variable are
associated with high values of the other variable, and low values with low values.
o On the other hand, when high values of one variable are associated with
low values of the other variable, and low values with high values, a negative
correlation is present.
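The three reference cases (r = 1, r = −1 and r = 0) can be checked numerically. The sketch below is illustrative only: the data sets are made up for the demonstration, and the helper pearson_r is a hypothetical name that uses the standard computational form of the Pearson coefficient.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient from raw sums (computational formula)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


xs = [1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]      # Y rises with X along a straight line
falling = [10 - 3 * x for x in xs]    # Y falls as X rises along a straight line
u_shaped = [(x - 3) ** 2 for x in xs] # Y depends on X, but not linearly

print("rising:  ", pearson_r(xs, rising))
print("falling: ", pearson_r(xs, falling))
print("u-shaped:", pearson_r(xs, u_shaped))
```

The rising and falling lines give r = 1 and r = −1. The U-shaped data give r = 0 even though Y is completely determined by X, which is why r = 0 is read as "no linear relation" rather than "no relation".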
o The degree of linear relationship can be interpreted
by using the following range of values:
Range of Value of r                      Description
0.90 to 1.00 (or −0.90 to −1.00)         Very high positive (negative) correlation
0.70 to 0.89 (or −0.70 to −0.89)         High positive (negative) correlation
0.50 to 0.69 (or −0.50 to −0.69)         Moderate positive (negative) correlation
0.30 to 0.49 (or −0.30 to −0.49)         Low positive (negative) correlation
0.00 to 0.29 (or 0.00 to −0.29)          Little, if any correlation
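The interpretation table can be turned into a small lookup. This is a sketch, not a standard routine: the function name describe_correlation is hypothetical, and values that fall between the tabulated bands (for example 0.895) are assigned by the lower bound of each band, which is an assumption the table itself does not settle.

```python
def describe_correlation(r):
    """Map a correlation coefficient r to the verbal description in the table.

    Assumption: each band starts at its tabulated lower bound, so a value
    such as 0.895 is treated as "High" rather than "Very high".
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")

    size = abs(r)
    if size < 0.30:
        return "Little, if any correlation"
    if size >= 0.90:
        strength = "Very high"
    elif size >= 0.70:
        strength = "High"
    elif size >= 0.50:
        strength = "Moderate"
    else:
        strength = "Low"

    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"


for r in (0.95, -0.75, 0.55, -0.35, 0.10):
    print(r, "->", describe_correlation(r))
```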
Pearson Product Moment Correlation Coefficient
o It is a measure of the linear correlation (dependence) between
two variables X and Y, giving a value between +1 and −1
inclusive, where 1 is total positive correlation, 0 is no
linear correlation, and −1 is total negative correlation.
o It is widely used in the sciences as a measure of the degree of
linear dependence between two variables. It was developed
by Karl Pearson from a related idea introduced by
Francis Galton in the 1880s.
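Pearson r:
\[ r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}} \]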
Example. Using Pearson r, find the degree of relationship between the
English and Mathematics scores on a college entrance examination.

Student No.   English Score   Math Score
     1             86             65
     2             55             92
     3             75             85
     4             93             60
     5             89             58
     6             67             84
     7             60             86
     8             52             90
     9             83             72
    10             86             69
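The computation for this example can be laid out step by step. The sketch below assumes the computational form of Pearson r, with X as the English score and Y as the Math score; the variable names are illustrative.

```python
import math

# Scores for the ten students, in student-number order.
english = [86, 55, 75, 93, 89, 67, 60, 52, 83, 86]      # X
math_scores = [65, 92, 85, 60, 58, 84, 86, 90, 72, 69]  # Y

n = len(english)

# The five sums that the formula needs.
sum_x = sum(english)
sum_y = sum(math_scores)
sum_xy = sum(x * y for x, y in zip(english, math_scores))
sum_x2 = sum(x * x for x in english)
sum_y2 = sum(y * y for y in math_scores)

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
r = numerator / denominator

print(f"n = {n}")
print(f"sum X = {sum_x}, sum Y = {sum_y}")
print(f"sum XY = {sum_xy}, sum X^2 = {sum_x2}, sum Y^2 = {sum_y2}")
print(f"r = {r:.2f}")
```

The sums are ΣX = 746, ΣY = 761, ΣXY = 55145, ΣX² = 57694 and ΣY² = 59375, which give r ≈ −0.94. On the interpretation scale this is a very high negative correlation: students with higher English scores tended to have lower Math scores in this sample.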