
# Correlation

Correlation, as the word indicates, means interrelationship. In statistics, it signifies the extent of the relationship between two variables; in other words, correlation measures the strength of the relationship between two variables.

Two variables can be positively correlated, negatively correlated, or not correlated at all.

Positive Correlation: Two variables are positively correlated when they move together, so that an increase in one is accompanied by an increase in the other, showing a positive linear relationship.

Let's discuss this with an example to understand it better. We learnt in school that as Temperature increases, Pressure also increases (for a gas held at constant volume).

If we plot the data points on a two-dimensional graph and draw a straight line through them, it will look somewhat like Figure (a) below. Here, both variables move in the same direction. Therefore, it is called a Positive Correlation.

Negative Correlation: When there is a relationship between two variables such that an increase in one variable is accompanied by a decrease in the other, it indicates a negative correlation.

For example, as the number of hours a student spends watching television increases, the student's marks in the examination decrease. The two variables move in opposite directions. Therefore, it is called a Negative Correlation. This is depicted in Figure (b) below.

The above graph is called a Scatter Plot. For each value on the X axis, there is a paired value on the Y axis. A Scatter Plot cannot be drawn without paired data values.

In Figure (a), the graph represents the correlation between Temperature and Pressure. The line moving upwards indicates Positive Correlation. The second graph, Figure (b), represents the correlation between Hours spent watching TV and Marks in Exam. The line moving downwards indicates Negative Correlation.
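The direction of the two example relationships can also be checked numerically. The sketch below is only an illustration: the data values are made up for this example (they are not measurements from the article), and the helper name `direction` is hypothetical. It sums the products of each pair's deviations from the means; a positive sum means the variables tend to move together, a negative sum means they move in opposite directions.

```python
def direction(xs, ys):
    """Return 'positive', 'negative' or 'none' for paired observations."""
    if len(xs) != len(ys):
        # A scatter plot (and any correlation measure) needs paired values.
        raise ValueError("xs and ys must contain the same number of values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations: its sign gives the direction.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if co_movement > 0:
        return "positive"
    if co_movement < 0:
        return "negative"
    return "none"


# Hypothetical sample data, invented purely for illustration.
temperature = [20, 30, 40, 50, 60]
pressure = [101, 104, 108, 111, 115]
tv_hours = [1, 2, 3, 4, 5]
exam_marks = [90, 82, 75, 70, 58]

print(direction(temperature, pressure))  # both rise together
print(direction(tv_hours, exam_marks))   # one rises while the other falls
```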

There is another aspect we need to investigate in order to understand the concept of correlation completely: the Strength of Correlation.

When two variables are correlated, how do we determine the strength of the correlation? Are they highly correlated, weakly correlated, or not correlated at all?

The Strength of Correlation is calculated using the formula below.

## Figure 2: Strength of Correlation Formula

r: correlation coefficient
n: number of paired data points
xi: i-th value of variable x
yi: i-th value of variable y
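The formula itself appears only as a figure in the original and is not reproduced in the text. One standard way of writing Pearson's coefficient that uses exactly the symbols listed above is the following; it is assumed, not confirmed, that the figure shows this form. All sums run over i = 1 to n.

```latex
r = \frac{n \sum x_i y_i - \sum x_i \sum y_i}
         {\sqrt{\left[\, n \sum x_i^2 - \left(\sum x_i\right)^2 \right]
                \left[\, n \sum y_i^2 - \left(\sum y_i\right)^2 \right]}}
```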

The r value is also called Pearson's Correlation coefficient.

Note: One prerequisite for using Pearson's Correlation coefficient is that both variables should be measured on a continuous scale. Another prerequisite is that the relationship is linear, because Pearson's Correlation coefficient captures only linear relationships.

Mathematically, the value of the Coefficient of Correlation can range from -1 to +1. While -1 signifies perfect negative correlation, +1 signifies perfect positive correlation. In real life, however, such perfect correlation is very rare.
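As a rough illustration of the range of r and of the linearity prerequisite, the sketch below computes Pearson's coefficient with a common computational form of the formula (assumed to match the article's figure, which is not reproduced in the text). The function name `pearson_r` and the sample data are illustrative and do not come from the article.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired samples xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when one variable is constant")
    return numerator / denominator


# Made-up data: two perfect straight lines and one curved relationship.
xs = [-2, -1, 0, 1, 2]
rising = [2 * x + 1 for x in xs]    # straight line sloping upwards
falling = [-3 * x + 4 for x in xs]  # straight line sloping downwards
curved = [x * x for x in xs]        # y depends on x, but not linearly

print(pearson_r(xs, rising))
print(pearson_r(xs, falling))
print(pearson_r(xs, curved))
```

On this data the rising line gives r = +1, the falling line gives r = -1, and the curved relationship gives r = 0 even though y is completely determined by x. That last case is why the coefficient should be used only for linear relationships.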
In practice, the correlation coefficient takes values like 0.9, 0.8, 0.75, -0.75, -0.8, -0.9, etc. Positive values in this series indicate Strongly Positive Correlation, and negative values indicate Strongly Negative Correlation.

The scatter plot for strongly correlated data will look like this.

Similarly, the correlation coefficient can take values like 0.4, 0.3, 0.25, -0.25, -0.3, -0.4, etc. Positive values in this series indicate Weakly Positive Correlation, and negative values indicate Weakly Negative Correlation.

Below is the Scatter Plot that illustrates this. For strongly correlated data (positive or negative), the data points lie much closer to the straight line (a trend line can be added in Excel) than they do for weakly correlated data.

## Figure 4: Weak Correlation Graph

What happens when two variables are not correlated at all? The points will appear scattered at random, and the plot will look something like the graph below. The r value will be 0, or very close to 0, when the variables are not correlated.
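The verbal labels used above can be turned into a small helper. This is a sketch under stated assumptions: the article gives only example values (0.75 and above for strong, 0.25 to 0.4 for weak), so the cutoffs below, the "moderate" band between them, and the function name `describe_correlation` are illustrative choices rather than fixed rules.

```python
def describe_correlation(r, strong=0.75, moderate=0.4, weak=0.25):
    """Label an r value; the cutoffs are illustrative assumptions."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size >= strong:
        strength = "strong"
    elif size > moderate:
        strength = "moderate"  # band not named in the article
    elif size >= weak:
        strength = "weak"
    else:
        return "little or no correlation"
    sign = "positive" if r > 0 else "negative"
    return f"{strength} {sign}"


for value in (0.9, -0.8, 0.3, -0.25, 0.0):
    print(value, "->", describe_correlation(value))
```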

Correlation analysis has many practical uses. Let us look at a few of them here.
1. It is used to forecast a Y variable, given that X and Y are correlated. Based on the historical values of X and Y, future Y values can be predicted.
2. In the field of medical research, researchers might want to know whether a particular medical condition is related to the intake or use of a particular medicine.
3. In the stock market, it helps to understand how the rise and fall of share prices are related to changes in a particular economic parameter (say, the \$ conversion rate, the FOREX rate, etc.).
4. In process improvement methodologies like Six Sigma, it is used to assess the behavior of a particular metric and the influencing parameters that cause the variation.

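The forecasting use in the first item can be sketched with a simple least-squares straight line. Note that the article does not name a forecasting method; ordinary linear regression is swapped in here as one common choice. The historical values and the names `fit_line` and `predict` are invented for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for paired historical data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Forecast y for a new x using the fitted line."""
    return intercept + slope * x


# Hypothetical historical values of X and Y.
history_x = [1, 2, 3, 4, 5]
history_y = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = fit_line(history_x, history_y)
print(predict(6, slope, intercept))  # forecast of Y for the next X value
```

A forecast like this is only as trustworthy as the correlation behind it: the weaker the correlation, the further actual values can fall from the fitted line.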

There are many other such uses for Correlation Analysis. Users should remember one important point while performing Correlation Analysis.

## Correlation does not always imply Causation

Correlation between the X and Y variables does not mean that the variation in Y is caused by, or due to, a variation in X. It may simply be that the two vary together. The user should rely on business knowledge to decide whether the relationship is causal or not.
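One way two variables can vary together without either causing the other is when both follow a third factor. The sketch below is a constructed example, not data from the article: a hidden variable `z` drives both `x` and `y`, and the small offsets stand in for unrelated noise. All names and numbers are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)


# Hidden common driver and small, unrelated disturbances (all invented).
z = [1, 2, 3, 4, 5, 6]
noise_x = [1, -1, 0, 1, -1, 0]
noise_y = [0, 1, -1, 0, 1, -1]

# x and y each depend on z, but neither depends on the other.
x = [3 * zi + e for zi, e in zip(z, noise_x)]
y = [2 * zi + e for zi, e in zip(z, noise_y)]

print(round(pearson_r(x, y), 3))
```

Here x and y come out strongly correlated (r above 0.9) although, by construction, changing x would do nothing to y. The numbers alone cannot reveal this; it takes knowledge of how the data arise.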