
Correlation and Regression
Correlation Analysis

 According to Ya Lun Chou, “Correlation analysis attempts
to determine the degree of relationship between variables.”

 According to W. I. King, “Correlation means that between
two series or groups of data there exists some causal
connection.”

 According to A. M. Tuttle, “Correlation is an analysis of
the covariation between two or more variables.”
Correlation
 Bowley defines correlation as follows: “When two quantities
are so related that the fluctuations in one are in sympathy
with the fluctuations of the other, so that an increase or
decrease of the one is found in connection with an
increase or decrease of the other, and the greater the
magnitude of change in one, the greater is the magnitude of
change in the other, the quantities are held to be
correlated.”
Uses of correlation in business and economics
 Correlation is very useful to economists to study the
relationship between variables, like price and quantity
demanded.
 Correlation analysis helps in measuring the degree of
relationship between the variables.
 The relation between variables can be verified and
tested for significance.
 It can be used for comparing the relationship between
variables which are expressed in different units.
 Sampling error can also be calculated.
 Correlation is the basis for the study of regression.
Correlation and Causation
The high degree of correlation between two variables
may exist due to any one or a combination of the
following reasons:
1. Variation in one variable may be caused by a variation
in the other.
E.g., price and demand.
2. Co-variation of two variables may be due to some other
variable(s).
E.g., Demand and supply. Price affects both demand
and supply.
3. Correlation may be due to chance – nonsensical or
spurious correlation.
E.g. Rainfall in India and number of robberies in
America.
Types of Correlation

 Positive and Negative Correlation
 Linear and Nonlinear Correlation
 Simple, Multiple and Partial Correlation
Linear and Non-linear Correlation

 On the basis of the degree of covariation, correlation
may be termed linear or nonlinear.

 When the covariation between two sets of variates is
perfect, i.e. of degree one (unity), which means that the
two variables have an exact linear functional relationship,
the correlation is said to be linear.
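
The distinction can be illustrated with a small sketch. It assumes Pearson's product-moment coefficient as the measure of covariation; the helper `pearson_r` and the data are made up for illustration and are not part of the original slides. An exact straight-line relationship gives a coefficient of one, while an exact but curved relationship can give a coefficient of zero.

```python
import math

def pearson_r(xs, ys):
    """Pearson's product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: x values symmetric about zero.
x = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * v + 5 for v in x]   # exact straight-line relationship
curved = [v ** 2 for v in x]      # exact but nonlinear relationship

r_linear = pearson_r(x, linear)   # covariation of degree one
r_curved = pearson_r(x, curved)   # no linear covariation at all
print(r_linear, r_curved)
```

The curved case shows why a low coefficient does not by itself rule out a nonlinear relationship.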
Simple, Multiple and Partial Correlation
E.g., for simple correlation – correlation between income
and consumption, price and demand, or rate of interest and
the volume of investment.

E.g., for multiple correlation – correlation between
agricultural production and factors like quality of
seed, nature of soil, manure applied, rainfall, etc.

E.g., for partial correlation – correlation between output
and input of labour units, holding constant the other
factors in the production function of a commodity, such as
capital, land, etc.
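
The idea of holding a third factor constant can be sketched with the standard first-order partial correlation formula, r12.3 = (r12 − r13·r23) / √((1 − r13²)(1 − r23²)). This formula is not given in the slides; the function name and the three simple coefficients below are hypothetical values chosen only to show the arithmetic.

```python
import math

def partial_r(r12, r13, r23):
    """First-order partial correlation between variables 1 and 2,
    holding variable 3 constant, from the three simple coefficients."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

# Hypothetical simple correlations: 1 = output, 2 = labour, 3 = capital.
r_output_labour = 0.8
r_output_capital = 0.6
r_labour_capital = 0.5

r_partial = partial_r(r_output_labour, r_output_capital, r_labour_capital)
print(round(r_partial, 4))
```

When the third variable is uncorrelated with both of the others, the partial coefficient reduces to the simple coefficient.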
Degrees of Correlation
 Perfect Positive Correlation:
There is perfect positive correlation when an increase
(or decrease) in one variable is always followed by a
corresponding and proportional increase (or decrease) in
the other related variable.

 Perfect Negative Correlation:
There is perfect negative correlation when an increase
(or decrease) in one variable is always followed by a
corresponding and proportional decrease (or increase) in
the other related variable.
 Limited Degree of Positive Correlation:
When an increase (or decrease) in one variable is
always followed by a corresponding but non-proportional
increase (or decrease) in the other related variable,
correlation is said to be positive to a limited degree.

E.g., in the case of increasing and diminishing returns of
a firm, input and output are positively correlated but the
correlation is not perfect.
Similarly, the propensity to consume also has imperfect
positive correlation with income.
 Limited Degree of Negative Correlation:
When an increase (or decrease) in one variable is
always followed by a corresponding but non-proportional
decrease (or increase) in the other related variable,
correlation is said to be negative to a limited degree.

E.g., in the case of more elastic or less elastic demand
for commodities, price and demand are inversely related,
but the change in one due to the other is not proportional.
 Zero Correlation:
There is no correlation at all if the values of one variable
cannot be associated with the values of the other
variable.

If, for instance, the number of marriages in a community
is compared with the number of suicides committed
during a given period, it will be found that the two
variables are in no way related. In quantitative terms,
correlation in such instances is zero.
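
The five degrees described above can be sketched numerically. The example assumes Pearson's coefficient as the measure; the helper names, the tolerance and the data are illustrative assumptions, not part of the original slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson's product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def describe_degree(r, tol=1e-9):
    """Map a coefficient to one of the five degrees (tol is an assumption)."""
    if abs(r - 1) <= tol:
        return "perfect positive"
    if abs(r + 1) <= tol:
        return "perfect negative"
    if abs(r) <= tol:
        return "zero"
    return "limited positive" if r > 0 else "limited negative"

# Hypothetical series.
inputs = [1, 2, 3, 4, 5]
proportional = [2, 4, 6, 8, 10]        # rises in proportion to inputs
inverse = [10, 8, 6, 4, 2]             # falls in proportion to inputs
non_proportional = [1, 4, 9, 16, 25]   # rises, but not in proportion

for name, series in [("proportional", proportional),
                     ("inverse", inverse),
                     ("non_proportional", non_proportional)]:
    print(name, describe_degree(pearson_r(inputs, series)))
```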
Methods of correlation

 Scatter Diagram
 Correlation graph
 Karl Pearson’s coefficient of correlation
 Spearman’s Rank correlation coefficient
 Concurrent Deviation Method
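
Of the methods listed, the concurrent deviation method is the simplest to compute by hand, and a minimal sketch is given here. It assumes the usual textbook formula r_c = ±√(±(2c − n)/n), where n is the number of pairs of deviations and c the number of concurrent (same-direction) deviations; this formula does not appear in the slides. Treating a pair with no change in either series as non-concurrent is also an assumption of this sketch.

```python
import math

def concurrent_deviation_r(xs, ys):
    """Coefficient of concurrent deviations for two equal-length series."""
    def directions(values):
        # +1 for a rise, -1 for a fall, 0 for no change between neighbours.
        return [(b > a) - (b < a) for a, b in zip(values, values[1:])]

    dx = directions(xs)
    dy = directions(ys)
    n = len(dx)                                      # pairs of deviations
    c = sum(1 for a, b in zip(dx, dy) if a * b > 0)  # concurrent deviations
    value = (2 * c - n) / n
    # The sign inside and outside the root follows the sign of (2c - n).
    return math.copysign(math.sqrt(abs(value)), value)

# Hypothetical series.
x = [1, 2, 3, 4, 5]
same_direction = [2, 4, 6, 8, 10]
opposite_direction = [10, 8, 6, 4, 2]
mixed = [1, 3, 2, 4, 5]

print(concurrent_deviation_r(x, same_direction))
print(concurrent_deviation_r(x, opposite_direction))
print(concurrent_deviation_r(x, mixed))
```

The method uses only the direction of change, not its magnitude, so it gives a rough indication rather than a precise measure.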
Assumptions of the Pearsonian Coefficient
 There is linear relationship between the variables.

 The two variables under study are affected by a large number of
independent causes so as to form a normal distribution. Variables
like height, weight, price, demand, supply, etc. are affected by such
forces that a normal distribution is formed.

 There is a cause and effect relationship between the forces affecting
the distribution of the items in the two series. If such a relationship is
not formed between the variables, i.e., if the variables are
independent, there cannot be any correlation.
E.g., there is no relationship between income and height because the
forces that affect these variables are not common.
Probable Error

 Probable error of the coefficient of correlation is a
statistical measure which measures the reliability and
dependability of the value of the coefficient of correlation.

 If P.E. is added to or subtracted from the coefficient of
correlation, it gives two limits within which we
can reasonably expect the value of the coefficient of
correlation to vary.
P.E. = 0.6745 (1 − r²) / √n

 If the value of r is less than the probable error, there is no
evidence of correlation, i.e., the value of r is not at all
significant.
 If the value of r is more than six times the probable error,
the coefficient of correlation is practically certain, i.e., the
value of r is significant.
 If the probable error is small and the coefficient of
correlation is 0.5 or more, it is generally considered to be
significant.
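
The probable error formula and the first two rules above can be sketched as follows. The function names, the sample values of r and n, and the use of the absolute value of r (so that negative coefficients are handled) are assumptions made for illustration.

```python
import math

def probable_error(r, n):
    """P.E. = 0.6745 (1 - r^2) / sqrt(n) for a sample of n pairs."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)

def interpret(r, n):
    """Apply the 'less than P.E.' and 'more than six times P.E.' rules."""
    pe = probable_error(r, n)
    if abs(r) < pe:
        return "no evidence of correlation"
    if abs(r) > 6 * pe:
        return "significant"
    return "inconclusive under these rules"

# Hypothetical sample: r = 0.8 from n = 25 pairs of observations.
r, n = 0.8, 25
pe = probable_error(r, n)
lower, upper = r - pe, r + pe   # limits within which r is expected to vary
print(round(pe, 6), round(lower, 4), round(upper, 4), interpret(r, n))
```

A weak coefficient from a small sample, for example r = 0.1 with n = 16, falls below its probable error and is reported as showing no evidence of correlation.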
Conditions for the use of Probable Error

 The data must approximate a normal frequency curve
(bell-shaped curve).
 The statistical measure for which the P.E. is computed
must have been calculated from a sample.
 The sample must have been selected in an unbiased
manner and the individual items must be independent.
Properties of the Coefficient of Correlation

1. The coefficient of correlation lies between -1 and +1.
2. The coefficient of correlation is independent of change
of scale and origin of the variables X and Y.
3. The coefficient of correlation is the geometric mean of
the two regression coefficients:
r = ±√(b_xy × b_yx), taking the sign common to b_xy and b_yx.
4. The degree of relationship between the two variables
is symmetric:
r_xy = r_yx
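
These properties can be checked numerically on a small data set. The sketch below is only an illustration: the data are hypothetical, the helper names are assumptions, and the change of scale uses positive factors (a negative factor would flip the sign of r).

```python
import math

def sums(xs, ys):
    """Return the sums of cross-products and squares about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy, sxx, syy

def pearson_r(xs, ys):
    sxy, sxx, syy = sums(xs, ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_xy = pearson_r(x, y)
r_yx = pearson_r(y, x)                       # property 4: symmetry

# Property 2: shift the origin and rescale both variables.
u = [(v - 3) / 2 for v in x]
w = [10 * v + 7 for v in y]
r_uw = pearson_r(u, w)

# Property 3: geometric mean of the two regression coefficients.
sxy, sxx, syy = sums(x, y)
b_yx = sxy / sxx                             # regression of y on x
b_xy = sxy / syy                             # regression of x on y
r_from_b = math.copysign(math.sqrt(b_xy * b_yx), b_yx)

print(round(r_xy, 4), round(r_yx, 4), round(r_uw, 4), round(r_from_b, 4))
```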
Features of Spearman’s Correlation Coefficient

 The sum of the differences of ranks between two
variables is always zero. Symbolically, ∑d = 0.
 Spearman’s correlation coefficient is distribution free or
non-parametric because no strict assumptions are made
about the form of population from which sample
observations are drawn.
 Spearman’s correlation coefficient is nothing but
Karl Pearson’s correlation coefficient computed between the
ranks. Hence it can be interpreted in the same manner as
the Pearsonian correlation coefficient.
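
The first and third features can be verified on a small example. The sketch assumes the usual rank-difference formula ρ = 1 − 6∑d² / (n(n² − 1)), which the slides do not state, and assumes data without tied values; the data and helper names are hypothetical.

```python
import math

def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def pearson_r(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data without ties.
x = [10, 20, 30, 40, 50]
y = [12, 25, 22, 48, 40]

rank_x = ranks(x)
rank_y = ranks(y)
d = [a - b for a, b in zip(rank_x, rank_y)]   # rank differences

n = len(x)
sum_d = sum(d)                                # always zero
rho = 1 - 6 * sum(v ** 2 for v in d) / (n * (n ** 2 - 1))
r_on_ranks = pearson_r(rank_x, rank_y)        # Pearson applied to the ranks

print(sum_d, rho, round(r_on_ranks, 4))
```

With tied ranks the two calculations no longer agree exactly unless a correction for ties is applied, which this sketch does not attempt.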
