
Correlation

This section looks at the relationship between two variables. For our dataset we used age and
income per year to see whether there is a linear relationship between the two variables. We
shall do this by developing the concept of covariance to measure that relationship numerically,
and then turning that covariance into a correlation coefficient and seeing why that is a better
measure. We shall also draw a plot, for which we first determine the independent variable (the
variable you manipulate or are studying) and the dependent variable (the variable that you are
measuring, that is, the data). These shall be drawn on the X axis (the horizontal axis, also
called the abscissa) and the Y axis (the vertical axis, also called the ordinate).
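
The step from covariance to a correlation coefficient can be sketched in Python as follows. This is only an illustration: the function names and the ages and incomes are hypothetical and are not the dataset used in this report.

```python
import math

def covariance(xs, ys):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def pearson_r(xs, ys):
    """Pearson r: the covariance rescaled by both standard deviations."""
    sd_x = math.sqrt(covariance(xs, xs))  # covariance of a variable with itself is its variance
    sd_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)

# Hypothetical values, for illustration only
ages = [25, 32, 41, 50, 58]
incomes = [8.0, 12.5, 9.0, 15.0, 11.0]  # in $'000 per year

print("covariance:", covariance(ages, incomes))
print("r:", pearson_r(ages, incomes))
```

Expressing income in dollars instead of thousands of dollars would multiply the covariance by 1000 but leave r unchanged, which is one reason the coefficient is easier to interpret.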
We shall base the correlation coefficient and the plot on the following question:
Does the age factor (Y) influence the amount of income paid per year (X)?
In this case we are asking whether one variable (Y) is related to another variable (X). When we
are dealing with the relationship between two variables, we are concerned with correlation, and
our measure of the degree or strength of this relationship is a correlation coefficient, here the
Pearson product-moment correlation coefficient (r).
Correlation can be classified in three different ways:
Positive: as one variable increases so does the other
Negative: as one variable increases the other decreases
No relationship: the movement in one variable cannot be predicted from the other
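
As a rough sketch, the three classes above could be read off the sign of r in Python. The function name and the tolerance of 0.05 are assumptions made for illustration; the report does not define a cut-off for "no relationship".

```python
def classify_correlation(r, tolerance=0.05):
    """Label the direction of a correlation coefficient r.

    The tolerance is an arbitrary, hypothetical cut-off: values of r
    closer to zero than this are treated as "no relationship".
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if r > tolerance:
        return "positive"
    if r < -tolerance:
        return "negative"
    return "no relationship"

print(classify_correlation(0.195307))
```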
We can also define our variables as the predictor variable (the variable from which a prediction
is made, in this case “age”) and the criterion variable (the variable to be predicted, in this
case “income”).
For the correlation coefficient we shall use the “CORREL” function in Excel:
                        Age         income/yr US, $'000
Age                     1
income/yr US, $'000     0.195307    1
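
For readers without Excel, a rough Python counterpart of the calculation is sketched below. It computes Pearson's r from running sums and prints only the lower triangle of the correlation matrix, in the same layout as the table above. The helper names and the data are hypothetical, so the printed value will not match 0.195307.

```python
import math

def correl(xs, ys):
    """Pearson r from the computational (sums) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

def lower_triangle(columns):
    """Return (row name, [r values]) pairs covering the lower triangle only."""
    names = list(columns)
    rows = []
    for i, row_name in enumerate(names):
        values = [correl(columns[row_name], columns[col]) for col in names[: i + 1]]
        rows.append((row_name, values))
    return rows

# Hypothetical values, for illustration only
data = {
    "Age": [25, 32, 41, 50, 58],
    "Income": [8.0, 12.5, 9.0, 15.0, 11.0],
}

for name, values in lower_triangle(data):
    print(f"{name:<8}", "  ".join(f"{v:.6f}" for v in values))
```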

Since the value is 0.195307, which is far less than 1, there is a weak positive linear
relationship between age and income earned per year by people in the organization. The weakness
may be due to other factors, such as educational level, and correlation is not causation between
the two variables. To show the positive relationship clearly, a scatter plot is drawn as follows:
Scatter plot
In preparing a scatter diagram the predictor variable, or independent variable, is traditionally
presented on the X (horizontal) axis, and the criterion variable, or dependent variable, on the Y
(vertical) axis.
[Figure: Scatter plot showing the relationship between Age and Income. Vertical axis: Age;
horizontal axis: Income. Trendline: f(x) = 0.05 x + 7.34, R² = 0.04]

From the scatter plot it is clear that the variables have a weak positive relationship, given the
positive gradient of the trendline equation. The R² value of 0.0381, which is the square of the
correlation coefficient 0.195307, also shows the weak correlation.
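
The trendline and R² that a spreadsheet adds to a scatter plot can be sketched in Python with an ordinary least-squares fit. The function names and data points are hypothetical; only the final line uses a number from this report.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys):
    """Share of the variation in ys explained by the fitted line."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical values, for illustration only
xs = [25, 32, 41, 50, 58]
ys = [8.0, 12.5, 9.0, 15.0, 11.0]

slope, intercept = fit_line(xs, ys)
print(f"f(x) = {slope:.2f} x + {intercept:.2f}")
print(f"R² = {r_squared(xs, ys):.4f}")

# Squaring the coefficient reported above gives the R² value quoted in the text
print(round(0.195307 ** 2, 4))  # 0.0381
```

With a single predictor, R² is the same whichever variable is placed on the horizontal axis; only the slope and intercept change.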
