
Good morning, everyone. Ms. Bertillo and I will tackle Correlation Analysis (Pearson and Spearman)
under Module 6, Regression and Correlation Analysis.

Let’s start our discussion by answering this question: what is correlation analysis? Correlation
analysis, also known as bivariate analysis, is primarily concerned with finding out whether a
relationship exists between two variables and then determining the magnitude and direction of that
relationship. Some examples of data that have a high correlation:

Your caloric intake and your weight.

Your eye color and your relatives’ eye colors.

The amount of time you study and your GPA.

Some examples of data that have a low correlation (or none at all):

Your sexual preference and the type of cereal you eat.

A dog’s name and the type of dog biscuit they prefer.

The cost of a car wash and how long it takes to buy a soda inside the station.

Correlational studies are our attempts to find the extent to which two variables are related. No variables
are manipulated as part of an experiment — the analyst is measuring naturally occurring events,
behaviors, or characteristics.

It’s important to remember that correlation doesn't equal causation. What that means is that you can’t
draw any conclusions regarding the causal effect of one type of data on the other, but you can determine
the size, degree, and direction of the relationship.

Correlation analysis identifies and evaluates a relationship between two variables, but a positive
correlation does not automatically mean one variable affects the other.

Why Correlation Analysis is Important

Correlation analysis can reveal meaningful relationships between different metrics or groups of metrics.
Information about those connections can provide new insights and reveal interdependencies, even if the
metrics come from different parts of the business.

If a strong correlation is shown between two variables or metrics, and one of them is observed
behaving in a particular way, then you can reasonably expect the other one to be moving in a similar
manner. This helps to group related metrics together and reduces the need to process each one
individually.

Based on what I found, correlation is very important in the fields of psychology and education as a
measure of the relationship between test scores and other measures of performance. With the help of
correlation, it is possible to form a better idea of a person's working capacity.

So now, let's determine the main types of correlation analysis.

The two main types of correlation are Pearson and Spearman.

Pearson correlation evaluates the linear relationship between two continuous variables, while
Spearman correlation evaluates the monotonic relationship.

Pearson Correlation Coefficient

The Pearson correlation coefficient is used for linearly related continuous variables, such as age,
height, or temperature. For example: is there a relationship between a person's salary and age? In this
scatter plot, every single point is a person. If the relationship is confirmed in this example, salary
can be predicted from age using regression. Remember that there must be a clear causal direction for
this: just because there is a correlation, we can't tell which way the relationship is going. So with
the help of the Pearson correlation, we can measure the linear relationship between two variables.

Here we can determine how strong the correlation is and which direction it goes.

We can read both from the Pearson correlation coefficient r, which lies between -1 and 1. The strength
of the correlation can be read from the table. If the magnitude of r is between 0 and 0.1, there is no
correlation, and if it is between 0.7 and 1, there is a very strong correlation.

A positive correlation exists when large values of one variable go along with large values of the
other variable, or when small values of one variable go along with small values of the other. A
positive correlation is found, for example, between body size and shoe size. The result is a positive
correlation coefficient: r is greater than 0.

Let's now move on to the opposite of positive, which is negative correlation. A negative correlation
exists when large values of one variable go along with small values of the other variable, and vice
versa.

A negative correlation usually exists between product price and sales volume. The result is a negative
correlation coefficient, or r < 0.

How is the Pearson correlation calculated? This correlation is obtained via this equation.
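The equation shown in the presentation is not reproduced in this transcript. Assuming the standard definition of the Pearson coefficient was the one on screen, it can be written as follows (the summation index i and the sample size n are notation added here):

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}
```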

Here r is the Pearson correlation coefficient, xi are the individual values of one variable (in our
example, age), yi are the individual values of the other variable (in our example, salary), and lastly
x̄ and ȳ are the respective mean values of the two variables.

In the equation, we can see that the respective mean value is first subtracted from each variable.
Let's watch the example on how to calculate the Pearson correlation.
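The example from the presentation is not part of this transcript, so here is a minimal sketch of the same calculation in Python, assuming the standard Pearson formula. The function name `pearson_r` and the age and salary figures are made up for illustration only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Subtract the respective mean from every value first.
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    # Numerator: sum of the products of the paired deviations.
    numerator = sum(a * b for a, b in zip(dx, dy))
    # Denominator: square root of the product of the summed squared deviations.
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("r is undefined when one variable is constant")
    return numerator / denominator


# Hypothetical data: age in years, salary in thousands.
ages = [25, 30, 35, 40, 45]
salaries = [30, 36, 41, 50, 53]
print(round(pearson_r(ages, salaries), 3))
```

With these invented numbers, salary rises almost in step with age, so r comes out close to 1, which counts as a very strong positive correlation.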

The second main type of correlation analysis is Spearman analysis.

What is Spearman analysis?

Spearman's rank correlation examines the relationship between two variables.

Isn't that exactly what the Pearson correlation does?

Exactly!

The Spearman rank correlation is the non-parametric counterpart of the Pearson correlation. But there
is an important difference between the two correlation coefficients: Spearman correlation does not use
the raw data, but the ranks of the data. If there are no rank ties, we can also use this equation to
calculate the Spearman correlation, where n is the number of cases and d is the difference in ranks
between the two variables for each case.
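The equation referred to here is not reproduced in this transcript. Assuming it is the usual shortcut formula for data without tied ranks, it reads as follows (the symbol ρ for the coefficient and the index i are notation added here):

```latex
\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^{2}}{n\,(n^{2} - 1)}
```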

Let's take a look at the example.
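The example from the presentation is not included in this transcript. As a stand-in, here is a small Python sketch that ranks the data and applies the no-ties shortcut formula. The helper names `ranks` and `spearman_no_ties` and the study-hours data are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n. Assumes there are no ties."""
    if len(set(values)) != len(values):
        raise ValueError("the shortcut formula assumes no tied values")
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_no_ties(x, y):
    """Spearman rank correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    rank_x, rank_y = ranks(x), ranks(y)
    # d is the difference between the two ranks of each case.
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical data: hours studied and exam score for five students.
hours = [2, 4, 6, 8, 10]
scores = [50, 65, 60, 80, 90]
print(spearman_no_ties(hours, scores))

# A monotonic but non-linear relationship: the ranks still agree perfectly.
cubes = [h ** 3 for h in hours]
print(spearman_no_ties(hours, cubes))
```

Because only the ranks enter the calculation, the cubed values give a coefficient of exactly 1 even though the relationship is not linear. That is the sense in which Spearman measures a monotonic relationship rather than a linear one.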

Why do we need to use correlation analysis in data analytics?
