Correlation and Regression
Unit-4
SCATTER PLOT
A scatter plot is used to detect whether two variables are related.
One variable is plotted on the X-axis and the other on the Y-axis.
Each pair of values (x, y) is then marked as a point on the graph.
COVARIANCE
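Covariance is a measure of the relationship between two variables: it is positive when they tend to increase together and negative when one tends to decrease as the other increases.
Let X and Y be two variables. The covariance between them is given by:
1. For raw data: Cov(X, Y) = (1/n) Σ (x_i − x̄)(y_i − ȳ)
or, Cov(X, Y) = (1/n) Σ x_i y_i − x̄ ȳ
2. For frequency data: Cov(X, Y) = (1/N) Σ f_i (x_i − x̄)(y_i − ȳ)
or, Cov(X, Y) = (1/N) Σ f_i x_i y_i − x̄ ȳ, where N = Σ f_i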
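As a rough check on the covariance definitions, the sketch below computes the covariance in both the deviation form and the shortcut form, for raw data and for frequency data. It assumes the 1/N divisor (some texts divide by n − 1 instead); the function name `covariance` and the sample values are made up for illustration.

```python
def covariance(xs, ys, freqs=None):
    """Return (deviation form, shortcut form) of the covariance.

    Assumption: the divisor is N, the total frequency.
    With freqs=None every pair counts once, i.e. raw data.
    """
    if freqs is None:
        freqs = [1] * len(xs)
    n = sum(freqs)
    mean_x = sum(f * x for f, x in zip(freqs, xs)) / n
    mean_y = sum(f * y for f, y in zip(freqs, ys)) / n
    # Deviation form: average product of deviations from the means.
    by_deviation = sum(
        f * (x - mean_x) * (y - mean_y) for f, x, y in zip(freqs, xs, ys)
    ) / n
    # Shortcut form: mean of the products minus the product of the means.
    by_shortcut = (
        sum(f * x * y for f, x, y in zip(freqs, xs, ys)) / n - mean_x * mean_y
    )
    return by_deviation, by_shortcut


# Raw data (hypothetical values).
raw_dev, raw_short = covariance([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])

# Frequency data (hypothetical values): the pair (2, 4) occurs twice.
freq_dev, freq_short = covariance([1, 2, 3], [2, 4, 6], freqs=[1, 2, 1])
```

Both forms agree up to floating-point rounding, which is why either can be used in hand calculations.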
CORRELATION
A scatter plot suggests whether two variables are related, but it cannot tell how strong the relationship is.
Correlation coefficients are used to overcome this difficulty.
A correlation coefficient measures the strength of the linear relationship between two variables.
Two types of correlation coefficient are generally used:
1. Karl Pearson’s Correlation Coefficient
2. Spearman’s Rank Correlation Coefficient
Excellence and Service
CHRIST
Deemed to be University
Karl Pearson’s correlation coefficient is given by:
r = Cov(X, Y) / √(V(x) · V(y))
where,
V(x) = variance of X
V(y) = variance of Y
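The two coefficients named earlier can be sketched in a few lines. Pearson's coefficient divides the covariance by the square root of the product of the two variances; the rank coefficient here uses the usual shortcut 1 − 6Σd² / (n(n² − 1)), which assumes there are no tied values. The function names and the data are illustrative only.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r = Cov(X, Y) / sqrt(V(x) * V(y))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    # The 1/n divisors cancel, so n - 1 would give the same r.
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Spearman's rank correlation, assuming no tied values."""

    def ranks(values):
        # Rank 1 goes to the smallest value, rank n to the largest.
        order = sorted(range(len(values)), key=lambda i: values[i])
        result = [0] * len(values)
        for rank, i in enumerate(order, start=1):
            result[i] = rank
        return result

    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical data: y = x squared, increasing but not linear.
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]
r = pearson_r(xs, ys)
rho = spearman_rho(xs, ys)
```

On this data the rank coefficient is exactly 1 because the ordering is preserved, while r stays slightly below 1 because the relationship is not a straight line.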
LINEAR REGRESSION
A scatter plot shows in what manner two variables are related, whereas correlation shows how strong the linear relationship between them is. Neither tells how much one variable changes when the other changes by one unit.
Regression analysis is used to quantify the change in one variable when the other variable changes.
Two types of regression analysis are generally used:
1. Linear regression
2. Curvilinear regression
Linear regression is used when the data are quantitative (continuous).
It shows how much one variable changes when the other variable changes by one unit.
LINEAR REGRESSION
The model for linear regression is given by:
Y_i = β_0 + β_1 X_i + ε_i
If more than two variables are involved in the study, the model can be extended as:
Y_i = β_0 + β_1 X_1i + β_2 X_2i + ... + β_k X_ki + ε_i
This model is called the Multiple Linear Regression model.
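The slides do not say how β_0 and β_1 are estimated. A common choice, assumed here, is ordinary least squares, where the slope is the ratio of the sum of cross-deviations to the sum of squared deviations of X, and the intercept makes the line pass through the point of means. The function name and data below are hypothetical.

```python
def fit_simple_regression(xs, ys):
    """Least-squares estimates (intercept, slope) for Y = b0 + b1 * X + error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    # The fitted line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = fit_simple_regression(xs, ys)

# Residuals are the estimated error terms; with an intercept they sum to zero.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
```

Here the fitted slope is about 0.6, meaning the fitted Y rises by roughly 0.6 for each unit increase in X.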
[Figure: plot of Salary against Experience (X). Source: Irwin/McGraw-Hill © Andrew F. Siegel, 2003]
LINEAR REGRESSION
Here Y is called the response variable or dependent variable.
The X_j are called predictors, regressors, or independent variables.
ε is called the random error term, which arises from unavoidable causes.
β_0 is called the intercept parameter.
The β_j are called slope parameters. β_j tells how much the response variable changes when the jth predictor changes by one unit, with the other predictors held fixed.
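The meaning of a slope parameter can be illustrated by evaluating the systematic part of the model twice, once before and once after raising a single predictor by one unit. The coefficients and predictor values below are invented for the example, and the error term is left out.

```python
def predict(beta0, betas, xs):
    """Systematic part of the model: beta0 + sum of beta_j * x_j (no error term)."""
    return beta0 + sum(b * x for b, x in zip(betas, xs))


# Hypothetical parameters for a model with two predictors.
beta0 = 1.0
betas = [2.0, -0.5]

before = predict(beta0, betas, [3.0, 4.0])
# Raise the first predictor by one unit and hold the second fixed.
after = predict(beta0, betas, [4.0, 4.0])
change = after - before
```

The change in the predicted response equals the first slope parameter, 2.0, regardless of the starting values chosen.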
ASSUMPTIONS FOR LINEAR REGRESSION
Linear regression analysis rests on the following assumptions:
1. The dependent variable and the predictors must be continuous.
2. The predictors must be linearly related to the dependent variable.
3. The dependent variable must be normally distributed (strictly, it is the error term ε that must be normal).
4. The predictors must be uncorrelated with each other (no multicollinearity).
5. The dependent variable must be homogeneous in variance (homoscedasticity).
6. The dependent variable must not depend on its past values (no autocorrelation, for time-dependent data).
LINEAR REGRESSION
Let us consider the simplest case of linear regression, i.e., the case where the relationship between only two variables is studied.
Let X and Y be the two variables under consideration, and let {(x_i, y_i); i = 1, 2, …, n} denote the n pairs of observations on these two variables.
Then the equation of the regression line of X on Y is given by:
x_i = α_0 + β_xy y_i + ε_i
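For the line of X on Y, the roles of the two variables are swapped: X is treated as the response and Y as the predictor. Assuming least-squares estimation (the slides do not name the method), the slope estimate is the sum of cross-deviations divided by the sum of squared deviations of Y. The names `b_xy` and `alpha0` and the data are illustrative.

```python
def fit_x_on_y(xs, ys):
    """Least-squares estimates (alpha0, b_xy) for x = alpha0 + b_xy * y + error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    # Y is the predictor here, so its squared deviations go in the denominator.
    b_xy = s_xy / s_yy
    alpha0 = mean_x - b_xy * mean_y
    return alpha0, b_xy


# Hypothetical observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
alpha0, b_xy = fit_x_on_y(xs, ys)
```

On the same data the least-squares slope of Y on X works out to 0.6, so the two regression lines are generally different lines, not one line read in two directions.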
PROPERTIES OF REGRESSION
COEFFICIENTS
1. The estimate of a regression coefficient is independent of a change of origin but not of a change of scale.
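This property can be checked numerically. The sketch below, which assumes the least-squares slope of Y on X, shifts both variables by constants (change of origin) and then divides them by constants h and k (change of scale). The constants and the data are arbitrary choices for illustration.

```python
def slope_y_on_x(xs, ys):
    """Least-squares slope of the regression line of Y on X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    return s_xy / s_xx


# Hypothetical observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
original = slope_y_on_x(xs, ys)

# Change of origin: subtract 10 from every x and add 100 to every y.
shifted = slope_y_on_x([x - 10 for x in xs], [y + 100 for y in ys])

# Change of scale: u = x / h and v = y / k.
h, k = 2, 10
scaled = slope_y_on_x([x / h for x in xs], [y / k for y in ys])
```

The shifted slope equals the original, while the scaled slope is the original multiplied by h / k, so the original can be recovered as `scaled * k / h`.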