CORRELATION ANALYSIS

121620003

F.Y.MTECH(CONSTRUCTION MANAGEMENT)

CONTENTS

Necessity

Introduction

Significance

Types of correlation

Multiple correlation

Limitations

CIVIL ENGINEERING NECESSITY

Many hydrologic variables are related to each

other through cause and effect

Changes in the values of one or more variables

cause changes in some other variable

When simultaneous observations on such hydrological variables are available, one may be interested in finding out how strong the association is.

Linear association between hydrologic variables

is expressed by the correlation

If one variable drives the other, they may be

correlated, as rainfall and runoff

The variables may also be correlated if they

share the same cause, such as river discharge,

concentration or transport rates of sediment

"Correlation analysis attempts to determine the degree of relationship between variables." (Ya-Lun Chou)

"Correlation is an analysis of the covariation between two or more variables." (A. M. Tuttle)

INTRODUCTION

Correlation: A LINEAR association between two

random variables

Correlation analysis shows us how to determine both the nature and the strength of the relationship between two variables

When variables are dependent on time

correlation is applied

The correlation coefficient lies between -1 and +1

A zero correlation indicates that there is no linear relationship between the variables

A correlation of -1 indicates a perfect negative correlation

A correlation of +1 indicates a perfect positive correlation

SIGNIFICANCE

If the correlation coefficient is significantly different from 0, we say that the correlation coefficient is "significant"

If the correlation coefficient is not significantly different from 0 (it is close to 0), we say that the correlation coefficient is "not significant"
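One common way to decide whether r is "significantly different from 0" is a t-test with n - 2 degrees of freedom. The slides do not name a test, so the sketch below is an assumption: it uses the standard statistic t = r * sqrt((n - 2) / (1 - r^2)) and compares it with a critical value read from a t-table. The function names are illustrative only.

```python
import math


def t_statistic(r: float, n: int) -> float:
    """t statistic for H0: true correlation is 0 (n - 2 degrees of freedom)."""
    if n < 3:
        raise ValueError("need at least 3 pairs of observations")
    if abs(r) >= 1.0:
        # A perfect correlation gives an unbounded statistic.
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def is_significant(r: float, n: int, t_critical: float) -> bool:
    """True when |t| exceeds the tabulated critical value."""
    return abs(t_statistic(r, n)) > t_critical


# Two-sided 5% critical value of Student's t for 8 degrees of freedom,
# taken from standard tables (n = 10 pairs of observations).
T_CRITICAL_DF8 = 2.306

print(is_significant(0.80, 10, T_CRITICAL_DF8))  # True  (t is about 3.77)
print(is_significant(0.30, 10, T_CRITICAL_DF8))  # False (t is about 0.89)
```

With only 10 pairs, r = 0.30 is not distinguishable from 0, which is why sample size matters as much as the value of r.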

CORRELATION DOES NOT NECESSARILY IMPLY

CAUSATION

Correlation means that there is a relationship between

two variables.

Causation means that if you see a change in your

explanatory variable, it should cause a change in the

response variable.

Even if a correlation is very strong, this is not by itself

good evidence that a change in x will cause a change in

y

EXAMPLE

One study in Victorian England showed a strong

correlation between people wearing top hats, and their

life expectancy. This relationship was shown to be

very strong (high r).

Does this mean that had Queen Victoria provided free

top-hats for all, the life expectancy in England would

have shot up?

There is a confirmed correlation. However, there is

NO causation. That is, wearing top hats does not

cause people to live longer.

Which situation describes a correlation that is

not a causal relationship?

(2) The more miles driven, the more gasoline

needed.

(3) The more powerful the microwave, the

faster the food cooks.

(4) The faster the pace of a runner, the quicker

the runner finishes.

POSITIVE CORRELATION

If all the plotted points lie on a straight line rising from the lower left-hand corner to the upper right-hand corner, the correlation is perfectly positive

Perfectly positive correlation is denoted by r = +1

NEGATIVE CORRELATION

If all the plotted points lie on a straight line falling from the upper left-hand corner to the lower right-hand corner, the correlation is perfectly negative

Perfectly negative correlation is denoted by r = -1

Depending upon the direction of change of the variables, correlation is positive or negative

Positive: If the two variables tend to move together in the same direction, then it is called positive or direct correlation. Examples: price and supply, height and weight, yield and rainfall

Negative: If the two variables tend to move in opposite directions, then it is called negative or inverse correlation. Examples: price and demand, yield of crop and price

Simple: One independent and one dependent variable. Example: quantity of money and price level

Multiple: One dependent and more than one independent variables. Example: price, demand and supply

Partial: One dependent variable and more than one independent variable, but only one independent variable is considered and the other independent variables are held constant. Example: price and demand, eliminating the supply side
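The idea of holding a third variable constant can be sketched with the standard first-order partial correlation formula, r12.3 = (r12 - r13 * r23) / sqrt((1 - r13^2)(1 - r23^2)). The slides do not give this formula, and the three input correlations below are made-up values for price (1), demand (2) and supply (3), used only for illustration.

```python
import math


def partial_correlation(r_12: float, r_13: float, r_23: float) -> float:
    """Correlation between variables 1 and 2 with variable 3 held constant."""
    denominator = math.sqrt((1.0 - r_13 ** 2) * (1.0 - r_23 ** 2))
    if denominator == 0.0:
        raise ValueError("undefined when variable 3 is perfectly correlated "
                         "with variable 1 or variable 2")
    return (r_12 - r_13 * r_23) / denominator


# Hypothetical pairwise correlations (not taken from any real data set).
r_price_demand = -0.80
r_price_supply = 0.50
r_demand_supply = -0.40

result = partial_correlation(r_price_demand, r_price_supply, r_demand_supply)
print(round(result, 3))  # about -0.756
```

When the third variable is uncorrelated with both of the others, the partial correlation reduces to the simple correlation r12.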

Linear: When plotted on a graph it tends to be a perfect line

Non-linear: When plotted on a graph it is not a straight line

METHODS OF STUDYING CORRELATION

Scatter Diagram Method

Karl Pearson's Coefficient of Correlation Method

SCATTER DIAGRAM METHOD

Shows the relationship between two variables diagrammatically

One variable is represented along the horizontal

axis and the second variable along the vertical

axis

For each pair of observations of two variables, we

put a dot in the plane

REGRESSION LINE

The regression line is the straight line of best fit drawn through the points on a scatterplot
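A minimal sketch of how such a line of best fit can be computed, assuming "best fit" means ordinary least squares (the slides do not say which criterion is used). The rainfall and runoff figures are made-up illustrative values, and the function name is hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept of the line y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are equal, so the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical rainfall (mm) and runoff (mm) observations.
rainfall = [10, 20, 30, 40, 50]
runoff = [4, 9, 15, 19, 26]

slope, intercept = fit_line(rainfall, runoff)
print(round(slope, 2), round(intercept, 2))  # 0.54 -1.6
```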

CORRELATION: LINEAR RELATIONSHIPS

Strong relationship = good linear fit


COEFFICIENT OF CORRELATION

A measure of the strength of the linear

relationship between two variables that is

defined in terms of the (sample) covariance of the

variables divided by their (sample) standard

deviations

Represented by r

r lies between -1 and +1

-1 ≤ r ≤ +1

The + and - signs are used for positive linear correlations and negative linear correlations, respectively
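The definition above (sample covariance divided by the product of the sample standard deviations) can be sketched directly. The common 1/(n - 1) factors cancel, so the code works with sums of deviations. The data are made-up rainfall and runoff values, and the function name is illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    # covariance / (std_x * std_y); the 1/(n - 1) factors cancel out.
    return sxy / math.sqrt(sxx * syy)


# Hypothetical rainfall (mm) and runoff (mm) observations.
rainfall = [10, 20, 30, 40, 50]
runoff = [4, 9, 15, 19, 26]
print(round(pearson_r(rainfall, runoff), 3))  # 0.997

# Points lying exactly on a rising or falling line give r = +1 or r = -1.
print(pearson_r([1, 2, 3], [2, 4, 6]))  # 1.0
print(pearson_r([1, 2, 3], [6, 4, 2]))  # -1.0
```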

INTERPRETING CORRELATION

COEFFICIENT R

Strong correlation: r > 0.70 or r < -0.70

Moderate correlation: r is between 0.30 and 0.70 or r is between -0.30 and -0.70

Weak correlation: r is between 0 and 0.30 or r is between 0 and -0.30
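These bands can be turned into a small helper. The slide does not say which band the boundary values 0.30 and 0.70 belong to, so the sketch below assumes they count as moderate. The function name is illustrative.

```python
def describe_correlation(r: float) -> str:
    """Label r as weak, moderate or strong, with its direction."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r == 0:
        return "no linear correlation"
    size = abs(r)
    if size > 0.70:
        strength = "strong"
    elif size >= 0.30:  # assumption: boundary values count as moderate
        strength = "moderate"
    else:
        strength = "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"


print(describe_correlation(0.95))   # strong positive
print(describe_correlation(-0.50))  # moderate negative
print(describe_correlation(0.10))   # weak positive
```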

COEFFICIENT OF DETERMINATION

Coefficient of determination lies between 0 and 1

Represented by r²

It is a measure of how well the regression line represents the data

If the regression line passes exactly through

every point on the scatter plot, it would be able to

explain all of the variation

The farther the line is from the points, the less of the variation it is able to explain

r² is useful because it gives the proportion of the variance (fluctuation) of one variable that is predictable from the other variable

It is a measure that allows us to determine how

certain one can be in making predictions from a

certain model/graph

The coefficient of determination is the ratio of the

explained variation to the total variation

The coefficient of determination is such that 0 ≤ r² ≤ 1, and denotes the strength of the linear association between x and y

The Coefficient of determination represents the

percent of the data that is the closest to the line

of best fit

For example, if r = 0.922, then r² = 0.850, which means that 85% of the total variation in y can be explained by the linear relationship between x and y (as described by the regression equation)

The other 15% of the total variation in y remains

unexplained
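The ratio of explained variation to total variation can be checked numerically. The sketch below assumes a least-squares regression line and uses made-up rainfall and runoff values. The function name is illustrative.

```python
def coefficient_of_determination(xs, ys):
    """Explained variation divided by total variation for a least-squares line."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    total = sum((y - mean_y) ** 2 for y in ys)  # total variation in y
    if sxx == 0 or total == 0:
        raise ValueError("undefined when one variable is constant")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    # Variation of the fitted values around the mean of y.
    explained = sum((intercept + slope * x - mean_y) ** 2 for x in xs)
    return explained / total


# Hypothetical rainfall (mm) and runoff (mm) observations.
rainfall = [10, 20, 30, 40, 50]
runoff = [4, 9, 15, 19, 26]
print(round(coefficient_of_determination(rainfall, runoff), 3))  # about 0.995

# The arithmetic in the example above: r = 0.922 gives r squared of about 0.85.
print(round(0.922 ** 2, 3))  # 0.85
```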

EXAMPLE:

STRONG CORRELATION AS r=0.95

SPEARMANS RANK COEFFICIENT

A method to determine correlation when the data is not available in numerical form

In such cases, the method of rank correlation is used as an alternative

When the values of the two variables are converted to their ranks and the correlation is obtained from the ranks, the correlation is known as rank correlation.

r = 1 - 6ΣD² / (n(n² - 1))

r = rank correlation coefficient

ΣD² = sum of squares of differences between the pairs of ranks

n = number of pairs of observations
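A short sketch of the rank method, assuming the standard formula r = 1 - 6 * (sum of D^2) / (n * (n^2 - 1)) and no tied values. The marks are made-up values for five students in two tests, and rank 1 is given to the highest mark.

```python
def ranks(values):
    """Rank 1 for the largest value; assumes all values are distinct."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def spearman_rank_correlation(xs, ys):
    """Rank correlation coefficient for two series without ties."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    rank_x = ranks(xs)
    rank_y = ranks(ys)
    # Sum of squared differences between the paired ranks.
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Hypothetical marks of five students in two tests.
test_1 = [86, 72, 65, 90, 58]
test_2 = [80, 75, 60, 88, 70]

print(ranks(test_1))  # [2, 3, 4, 1, 5]
print(ranks(test_2))  # [2, 3, 5, 1, 4]
print(round(spearman_rank_correlation(test_1, test_2), 3))  # 0.9
```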

COMPUTATION FOR TIED OBSERVATIONS

For example

If a value is repeated twice at the 5th rank, the common rank to be assigned to each item is the average of 5 and 6, i.e. 5.5

If the ranks are tied, it is required to apply a correction factor, which is (m³ - m)/12 for each value repeated m times, added to the sum of squared rank differences
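The tie handling can be sketched as follows. The correction used here, adding (m^3 - m)/12 to the sum of squared rank differences for every value repeated m times, is the commonly taught textbook adjustment and is an assumption, since the formula itself is not shown above. It is an approximation rather than an exact result. The data and function names are illustrative only.

```python
from collections import Counter


def average_ranks(values):
    """Rank 1 for the largest value; tied values share the mean of their ranks."""
    ordered = sorted(values, reverse=True)
    first_position = {}
    for position, value in enumerate(ordered, start=1):
        first_position.setdefault(value, position)
    counts = Counter(values)
    # A value filling positions p .. p + m - 1 gets the mean rank p + (m - 1) / 2.
    return [first_position[v] + (counts[v] - 1) / 2 for v in values]


def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every value that occurs m times."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values())


def rank_correlation_with_ties(xs, ys):
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    rank_x = average_ranks(xs)
    rank_y = average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    sum_d2 += tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Hypothetical scores; 50 occurs twice and occupies the 5th and 6th places.
scores_a = [90, 85, 80, 75, 50, 50, 40]
scores_b = [88, 70, 82, 60, 55, 45, 50]

print(average_ranks(scores_a))  # [1.0, 2.0, 3.0, 4.0, 5.5, 5.5, 7.0]
print(round(rank_correlation_with_ties(scores_a, scores_b), 3))  # 0.893
```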

EXAMPLE

Interpretation: There is uniformity in the

performance of students in the two tests

MULTIPLE CORRELATION

Statistical technique that predicts values of one

variable on the basis of two or more other

variables
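For one dependent variable y and two independent variables, the coefficient of multiple correlation can be computed from the three pairwise correlations using the standard formula R^2 = (r_y1^2 + r_y2^2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12^2). The slides do not state this formula, so treat the sketch as an illustration. The input values are made up and the function name is hypothetical.

```python
import math


def multiple_correlation(r_y1: float, r_y2: float, r_12: float) -> float:
    """Multiple correlation of y with two predictors, from pairwise correlations."""
    if abs(r_12) >= 1.0:
        raise ValueError("the two predictors must not be perfectly correlated")
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)


# Hypothetical pairwise correlations (not taken from any real data set).
print(round(multiple_correlation(0.6, 0.5, 0.3), 3))  # 0.687
```

The result is meaningful only when the three inputs are mutually consistent correlations, for example when they are all computed from the same data set.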


DID YOU KNOW???

correlated (r=1.0) with the sun

rising?

LIMITATIONS

Correlation is among the most powerful techniques available

to researchers.

These techniques require:

Every variable is measured at the interval-ratio level

Each independent variable has a linear relationship

with the dependent variable

Independent variables do not interact with each other

Independent variables are uncorrelated with each other

When these requirements are violated (as they often are),

these techniques will produce biased and/or inefficient

estimates.

REFERENCES

NPTEL/Module11/Lect39,40

https://en.wikipedia.org/wiki/Multiple_correlation

https://www.stat.auckland.ac.nz/~teachers/2007/.../correlation-causation

http://www.powershow.com/view1/27a1c9-ZDc1Z/Coefficient_of_Multiple_Correlation_powerpoint_ppt_presentation

Statistics: Higher Secondary First Year

TAMILNADU TEXTBOOK CORPORATION

THANK YOU!

