
CORRELATION AND REGRESSION ANALYSIS
Bivariate Data – involve two variables.
The relationship between the related
variables is described in terms of
strength and direction, which calls for
new statistical methods.

Univariate Data – involve only a single
variable. The data are described using
descriptive statistics such as averages,
standard deviation, and frequency counts.
Correlation Analysis – statistical
method used to determine
whether a relationship between
two variables exists.
Scatter plot – also called scatter graph
or scatter diagram. It is a graphical
representation of the relationship
between two variables: it shows how the
points collected from a set of bivariate
data are scattered on the Cartesian
plane. This visual picture helps in
finding the relationship that exists
between the two variables.
The relationship or correlation
between two variables may be
described in terms of direction and
strength.
Directions of Correlation:
Positive Correlation – exists when
high values of one variable
correspond to high values in the
other variable or low values in one
variable correspond to low values
in the other variable.
Negative Correlation – exists when
high values on one variable
correspond to low values in the
other or low values in one variable
correspond to high values in the
other variable.
Zero Correlation – exists when high
values in one variable correspond
to either high or low values in the
other variable, so that no consistent
pattern appears.
The strength of correlation may be
perfect, very high, moderately high,
moderately low, very low, and zero.

* The trend line is the line closest to
the points. The direction of the line
tells the direction of correlation
that exists between the variables.
ANSWER: Pages 290-291, Exercises, Items 1-7.
Yellow paper, answers only.
EXPLORING THE PEARSON PRODUCT-MOMENT CORRELATION
The following data shows the scores of five
students in Statistics (X) and Physics (Y).
Determine if there is a relationship
between the two scores. Interpret the
results.
Pearson Correlation Coefficient – the
value r, named after Karl Pearson, that
indicates the degree of linear
relationship between two variables.
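The slides do not print the formula for r at this point. For reference, one common raw-score form is shown here; the notation (n pairs of scores X and Y) is an assumption chosen to match the X and Y used in the examples.

```latex
r = \frac{n\sum XY - \sum X \sum Y}
         {\sqrt{\left[\,n\sum X^{2} - \left(\sum X\right)^{2}\right]
                \left[\,n\sum Y^{2} - \left(\sum Y\right)^{2}\right]}}
```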
Correlation Coefficient
1. If the trend line contains all the
points in the scatter plot and the line
rises to the right, we conclude that
there is perfect positive correlation
between the two variables. The
computed ‘r’ is 1.
2. If all the points fall on a trend line
that falls to the right, there exists a
perfect negative correlation between
the pair of variables. The computed
value of ‘r’ is -1.
3. If a trend line does not exist, there is
no linear correlation between the pair of
variables. This is confirmed by a
computed value of ‘r’ equal to 0.
4. The absolute value of ‘r’ indicates
the strength of correlation between
the two variables. The direction of
correlation is indicated by the sign
(positive or negative) of ‘r’.
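The properties listed above can be checked numerically. The following is a minimal sketch, assuming the standard raw-score formula for r (the slides do not print it); the function name `pearson_r` and the sample values are illustrative and not taken from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson r computed from sums (raw-score form)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    # The denominator is zero if either variable is constant; r is undefined then.
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Illustrative values only (hypothetical, not from the slides).
x = [1, 2, 3, 4]
rising = [2, 4, 6, 8]    # every point lies on a line that rises to the right
falling = [8, 6, 4, 2]   # every point lies on a line that falls to the right

r_up = pearson_r(x, rising)
r_down = pearson_r(x, falling)
print(r_up, r_down)
```

The sign of each result gives the direction of the correlation, and its absolute value gives the strength.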
EXPLORING REGRESSION ANALYSIS
STEPS IN TESTING THE SIGNIFICANCE OF ‘r’
Step 1. State the null and alternative
hypotheses.
Step 2. Compute the value of t.
Step 3. Compare the computed value of
t with the critical value of t, as found in
the table. Based on the null hypothesis,
the test is two-tailed. The
degrees of freedom are n-2.
Step 4. Make the decision.
If the absolute value of the computed t is
equal to or greater than the critical value
of t, reject the null hypothesis and accept
the alternative hypothesis.
If it is less than the critical value,
do not reject H0.
Step 5. Summarize the results.
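Step 2 does not state how t is computed. A standard test statistic for the significance of r, assumed here since the slides do not print it, is:

```latex
t = r\sqrt{\frac{n-2}{1-r^{2}}}, \qquad df = n - 2
```

Here n is the number of pairs of observations and r is the computed correlation coefficient.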
EXAMPLE. Erin investigated the
relationship between family
income and savings. Using the data
from 20 families, the computed r
between income and savings was
found to be 0.78. Is the computed r
significant at the 0.05 level of
significance? Can we conclude that
the relationship really exists?
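One way to work through this example is sketched below. It assumes the usual test statistic t = r·sqrt((n-2)/(1-r²)) with n-2 degrees of freedom; the helper name `t_from_r` is hypothetical, and the critical value is an assumption read from a standard t-table rather than from the slides.

```python
import math

def t_from_r(r, n):
    """t statistic for testing whether r differs from zero (df = n - 2)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

r, n = 0.78, 20                 # values given in the example
t_computed = t_from_r(r, n)

# Assumption: two-tailed critical t for df = 18 at the 0.05 level (t-table value).
T_CRITICAL = 2.101

significant = abs(t_computed) >= T_CRITICAL
print(round(t_computed, 2), significant)
```

Under these assumptions the computed t exceeds the critical value, so the null hypothesis is rejected and the relationship between income and savings is taken to be significant.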
EXAMPLE. Ysabel would like to
know if IQ scores are related to
age. Using 10 high school students,
she found out that the computed r is
0.60. At the 0.05 level of significance,
can she conclude that the
relationship really exists in the
population?
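A worked sketch of this example, assuming the usual statistic t = r·sqrt((n-2)/(1-r²)) and a two-tailed critical value of 2.306 for df = 8 taken from a standard t-table (neither is printed in the slides):

```latex
\begin{aligned}
t &= 0.60\sqrt{\frac{10-2}{1-(0.60)^{2}}}
   = 0.60\sqrt{\frac{8}{0.64}}
   = 0.60\sqrt{12.5} \approx 2.12 \\
df &= 10 - 2 = 8, \qquad t_{\text{critical}} = 2.306 \\
|t| &= 2.12 < 2.306 \;\Rightarrow\; \text{do not reject } H_0
\end{aligned}
```

Under these assumptions the computed r is not significant, so the sample does not support concluding that the relationship exists in the population.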
Steps in Regression Analysis:
1. Find the value of the correlation
coefficient (r).
2. Test the significance of ‘r’. If ‘r’ is
significant, proceed to Step 3. If ‘r’ is
not significant, regression analysis
cannot be done (stop).
3. Find the values of a and b.
4. Substitute the values of a and b in the
regression line Y’=bX+a.
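The slides do not give the formulas for a and b in Step 3. The usual least-squares expressions, assumed here, are written with b as the slope and a as the intercept, matching Y’=bX+a:

```latex
b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^{2} - \left(\sum X\right)^{2}},
\qquad
a = \frac{\sum Y - b\sum X}{n}
```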
EXAMPLE. The following data show the
number of absences and the number of
quizzes missed by five students. If there is
a significant relationship between the two
variables, predict the number of quizzes
missed by a student who was absent for
6 days.

STUDENT   NUMBER OF ABSENCES   NUMBER OF MISSED QUIZZES
1         1                    1
2         1                    2
3         2                    4
4         3                    2
5         4                    4
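A hand-calculation sketch for this example follows, with X as the number of absences and Y as the number of missed quizzes. The formulas for r and t are the standard ones and the critical value 3.182 (df = 3, two-tailed, 0.05 level) is read from a standard t-table; all three are assumptions, since the slides do not print them.

```latex
\begin{aligned}
n &= 5,\quad \sum X = 11,\quad \sum Y = 13,\quad \sum XY = 33,\quad
\sum X^{2} = 31,\quad \sum Y^{2} = 41 \\
r &= \frac{5(33) - (11)(13)}{\sqrt{\left[5(31) - 11^{2}\right]\left[5(41) - 13^{2}\right]}}
   = \frac{22}{\sqrt{(34)(36)}} \approx 0.63 \\
t &= r\sqrt{\frac{n-2}{1-r^{2}}} \approx 1.40,
\qquad df = 3,\quad t_{\text{critical}} = 3.182 \\
|t| &= 1.40 < 3.182 \;\Rightarrow\; \text{do not reject } H_0
\end{aligned}
```

Under these assumptions r is not significant, so by Step 2 of the regression procedure the analysis stops and no prediction is made for 6 days of absence.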
EXAMPLE. The following data pertain to the
heights of fathers and their eldest sons, in
inches. If there is a significant relationship
between the two variables, predict the height
of the son if the height of his father is
78 inches.

Height of the Father   Height of the Son
71                     71
69                     69
69                     71
65                     68
66                     68
63                     66
68                     70
70                     72
60                     65
58                     60
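The four regression steps can be sketched for this example as follows. The formulas for r, t, a, and b are the standard ones and the critical value is read from a standard t-table (df = 8, two-tailed, 0.05 level); these are assumptions, since the slides do not print them, and the variable names are illustrative.

```python
import math

fathers = [71, 69, 69, 65, 66, 63, 68, 70, 60, 58]   # X, inches
sons    = [71, 69, 71, 68, 68, 66, 70, 72, 65, 60]   # Y, inches

n = len(fathers)
sum_x, sum_y = sum(fathers), sum(sons)
sum_xy = sum(x * y for x, y in zip(fathers, sons))
sum_x2 = sum(x * x for x in fathers)
sum_y2 = sum(y * y for y in sons)

# Shared building blocks of the raw-score formulas.
sxy = n * sum_xy - sum_x * sum_y
sxx = n * sum_x2 - sum_x ** 2
syy = n * sum_y2 - sum_y ** 2

# Step 1: correlation coefficient.
r = sxy / math.sqrt(sxx * syy)

# Step 2: significance test with df = n - 2.
t = r * math.sqrt((n - 2) / (1 - r ** 2))
T_CRITICAL = 2.306   # assumption: t-table value for df = 8, two-tailed, 0.05

if abs(t) >= T_CRITICAL:
    # Steps 3 and 4: slope b, intercept a, then predict from Y' = bX + a.
    b = sxy / sxx
    a = (sum_y - b * sum_x) / n
    predicted_son = b * 78 + a
else:
    # r not significant: regression analysis stops here.
    b = a = predicted_son = None

print(round(r, 3), round(t, 2), predicted_son)
```

Note that a father's height of 78 inches lies outside the range of the observed data (58 to 71 inches), so the prediction is an extrapolation and should be read with caution.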
