
Linear Correlation

Andrivaroqi Latunil Ashraf

Muhammad Riski
Rail Mifta Zelira
Regan Nazir
The Research Context

• The correlation coefficient is a measure of the strength of
association between two variables.
• It can range from -1 to +1.
• The larger the absolute value of the correlation, the
stronger the association between two variables.
• Correlations can be computed to answer questions
from many different fields.
• Among correlation coefficients, the Pearson
product-moment correlation coefficient (Pearson r) is
the most powerful and most frequently used.
• The Pearson r relies on the same statistical and
methodological assumptions as the independent-
samples t test.
• All sample correlation coefficients are symbolized
using r; all population correlation coefficients are
symbolized using ρ (rho).
The Correlation Coefficient
and Scatter Diagrams
• The correlation coefficient is represented by a number
that ranges from −1 to +1; the higher the coefficient’s
absolute value, the stronger the association between
the two variables.
• If higher values of one variable are associated with
higher values of the other variable the correlation is
said to be positive.
• Example:
• The more you practice, the more skillful you will be.
• If higher values of one variable are associated with
lower values of the second variable, the correlation is
said to be negative.
• Example:
• The more money is printed, the lower its value will be.
• When a bivariate distribution is plotted on a graph, it is
called a scatter plot (or scatter diagram).
• The bivariate distribution can be represented visually by
plotting each participant’s X and Y score on a graph. For
this data set, there are 10 points; each point corresponds
to a participant’s X and Y score.
• To draw a scatter plot, place the X variable on the
horizontal axis and the Y variable on the vertical axis. To
plot the data point for participant 1’s score, follow the X
axis to the number 8. Imagine a line drawn vertically,
parallel to the Y axis. Now locate the participant’s Y score
along the Y axis. The Y score for participant 1 is 14.
Imagine drawing a horizontal line, parallel to the X axis.
The data point goes where the two imaginary lines intersect.
Interpreting the Scatter Plot
• The scatter plot provides a wealth of information
about the relationship between two variables.
• The magnitude of the correlation can be estimated
by looking at the general shape formed by the data
points. The narrower the oval
enveloping the data, the stronger the correlation.
The more the data take the shape of a circle, the
weaker the correlation.
• If the oval containing the majority of the points
slopes from the lower left to the upper right, the
correlation is positive. As the X scores tend to have
higher values, the Y scores are apt to be larger.
• If the plot slopes from the upper left to the lower
right, then the correlation is negative. As the value of
the X score increases, the value of the Y score tends
to decrease.
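To make the direction of a correlation concrete, here is a minimal Python sketch. The data are invented for illustration (they are not from these slides), and `pearson_r` is a hypothetical helper that applies the standard deviation-score formula for Pearson r.

```python
import math


def pearson_r(xs, ys):
    """Pearson r from deviation scores (hypothetical helper for illustration)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    numerator = sum(a * b for a, b in zip(dev_x, dev_y))
    denominator = math.sqrt(sum(a * a for a in dev_x) * sum(b * b for b in dev_y))
    return numerator / denominator


# Invented data: hours of practice and a skill rating that tends to rise with it.
practice_hours = [1, 2, 3, 4, 5]
skill_rating = [2, 4, 5, 4, 6]

# Invented data: amount of money printed and a unit value that tends to fall.
money_printed = [1, 2, 3, 4, 5]
unit_value = [10, 8, 7, 5, 2]

r_positive = pearson_r(practice_hours, skill_rating)
r_negative = pearson_r(money_printed, unit_value)

# The sign gives the direction; the absolute value gives the strength.
print("practice vs. skill:", round(r_positive, 3))
print("money printed vs. value:", round(r_negative, 3))
```

In a scatter plot, the first data set would slope from lower left to upper right and the second from upper left to lower right.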
Linear and non-linear
• In a linear relationship, each time the value of one
variable increases, the value of the other variable
shows a constant change. In other words, the
relationship between X and Y can be represented by a
straight line, thus the term “linear.”
• A curvilinear relationship is a type of relationship
between two variables in which, as one variable
increases, so does the other variable, but only up to a
certain point, after which, as one variable continues to
increase, the other decreases.
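The difference matters because Pearson r only measures the linear part of a relationship. The sketch below uses invented data and a hypothetical `pearson_r` helper. It compares a straight-line pattern with a pattern that rises and then falls, which is one simple way to imitate a curvilinear relationship.

```python
import math


def pearson_r(xs, ys):
    """Pearson r from deviation scores (hypothetical helper for illustration)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    numerator = sum(a * b for a, b in zip(dev_x, dev_y))
    denominator = math.sqrt(sum(a * a for a in dev_x) * sum(b * b for b in dev_y))
    return numerator / denominator


x = [0, 1, 2, 3, 4, 5, 6]

# Linear: Y changes by a constant amount (2) for every step in X.
linear_y = [2 * v + 1 for v in x]

# Curvilinear: Y rises until X = 3 and then falls again.
curved_y = [9 - (v - 3) ** 2 for v in x]

r_linear = pearson_r(x, linear_y)
r_curved = pearson_r(x, curved_y)

print("linear pattern:", round(r_linear, 3))
print("rise-then-fall pattern:", round(r_curved, 3))
```

For these particular numbers the rise and the fall cancel out, so r is close to zero even though X and Y are clearly related. A low r therefore does not rule out a strong non-linear relationship.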
How to find Correlation?

• The correlation coefficient, which indicates the strength
of the relationship between two variables, can be found
using the following formula:
The z score formula for the
population correlation

$$\rho = \frac{\sum (z_X z_Y)}{N_p}$$
• $\rho$ = rho, the symbol for the population correlation
• $\sum (z_X z_Y)$ = sum of the cross products of z scores
• $N_p$ = number of pairs of scores
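As a rough sketch of the z score formula, the Python below converts each score to a z score using the population standard deviation, sums the cross products, and divides by the number of pairs. The data and the function name `population_correlation` are assumptions made for illustration only.

```python
import statistics


def population_correlation(xs, ys):
    """Sum of z-score cross products divided by the number of pairs."""
    n_pairs = len(xs)
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    # Population standard deviations (divide by N, not N - 1).
    sd_x = statistics.pstdev(xs)
    sd_y = statistics.pstdev(ys)
    z_x = [(x - mean_x) / sd_x for x in xs]
    z_y = [(y - mean_y) / sd_y for y in ys]
    cross_products = [zx * zy for zx, zy in zip(z_x, z_y)]
    return sum(cross_products) / n_pairs


# Invented scores for five participants.
x_scores = [2, 4, 6, 8, 10]
y_scores = [1, 3, 2, 5, 4]

rho = population_correlation(x_scores, y_scores)
print(round(rho, 3))
```

Using the population standard deviation here is what makes dividing by the number of pairs give a value between -1 and +1.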
The Coefficient of
Determination, r²

• The coefficient of determination, represented as r²,
accomplishes for the correlation coefficient what ω²
accomplishes after an F test has been performed. An
r² is a measure of the amount of variation of the Y
variable accounted for by variation of the X variable;
a measure of shared variance (or common variance).
• Shared variance is the key concept in understanding
the coefficient of determination. It is usually stated
as a percentage. If the correlation between two
measures is .80, then the amount of shared variance
is .80² × 100 = 64%.
• If the correlation is +.70, r² is approximately 50%.
The overlapping segment of the ovals, the shaded
area, represents this shared variance. The non
overlapping segments of the ovals make up the
proportion of variance unique to each test.
• The worth of a correlation is better captured by r²
than by r. It is more appropriate to compare relationships
in terms of their shared variance, r².
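The arithmetic can be checked with a short Python sketch. The helper name `shared_variance_percent` and the extra value r = .40 are assumptions added for illustration.

```python
def shared_variance_percent(r):
    """Coefficient of determination, r squared, expressed as a percentage."""
    return r ** 2 * 100


for r in (0.80, 0.70, 0.40):
    percent = shared_variance_percent(r)
    print(f"r = {r:.2f} -> shared variance = {percent:.0f}%")
```

A correlation of .80 is twice as large as a correlation of .40, but it corresponds to four times the shared variance (64% against 16%), which is why comparing r² values is more informative than comparing r values.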
Steps to calculate the
correlation coefficient
• Obtain a data sample with values for the x variable
and the y variable
• Calculate the mean of each variable
• For the x variable, subtract the mean from each value of
the x variable (a). Do the same with the y variable (b)
• Multiply each a-value by the corresponding b-value and
find the sum of these products (this sum is the
numerator in the formula)
• Square each a-value and calculate the sum of the results
• Square each b-value and calculate the sum of the results
• Multiply the two sums of squares and find the square root
of the product (this is the denominator in the formula)
• Divide the numerator by the denominator
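The procedure can be sketched in Python as follows. This is an illustrative version of the standard deviation-score formula for Pearson r, in which the denominator is the square root of the sum of squared x deviations times the sum of squared y deviations. The data and the function name `correlation_by_steps` are invented for the example.

```python
import math


def correlation_by_steps(xs, ys):
    """Pearson r computed step by step from deviation scores."""
    # Calculate the mean of each variable.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Subtract the mean from each value: a for x, b for y.
    a = [x - mean_x for x in xs]
    b = [y - mean_y for y in ys]

    # Numerator: sum of the products of paired deviations.
    numerator = sum(ai * bi for ai, bi in zip(a, b))

    # Sums of squared deviations for each variable.
    sum_a_squared = sum(ai ** 2 for ai in a)
    sum_b_squared = sum(bi ** 2 for bi in b)

    # Denominator: square root of the product of the two sums.
    denominator = math.sqrt(sum_a_squared * sum_b_squared)

    return numerator / denominator


# Invented sample of five (x, y) pairs.
x_values = [2, 4, 6, 8, 10]
y_values = [1, 3, 2, 5, 4]

r = correlation_by_steps(x_values, y_values)
print(round(r, 3))  # 0.8
```

For this sample the numerator is 16 and the denominator is the square root of 40 × 10 = 400, which is 20, so r = 16 / 20 = .80.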
