Correlation

Correlation is a measure of association between variables. It can be used to predict one variable from another.

The correlation coefficient is the statistic used to report whether two variables are correlated and, if so, to what extent and in which direction.

The correlation coefficient ranges from -1 to +1, where 0 means no correlation and +1 or -1 means perfect correlation.

Illustration of the Degree of Correlation between Two Variables

[Figure: panels showing perfect + correlation, perfect - correlation, some degree of + correlation, some degree of - correlation, and no correlation.]

The correlation can be positive, indicating a positive relationship between the variables: an increase in one is accompanied by an increase in the other.

A negative correlation indicates a negative (inverse) relationship between the variables: an increase in one is accompanied by a decrease in the other.

A scatter plot is a graphical way of showing the correlation between two variables. The horizontal axis represents one variable and the vertical axis represents the other.

Measures of Correlation

1. Pearson Product-Moment Coefficient (r)

- The most common correlation measure.
- Used for continuous data when the scores are normally distributed.

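$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} $$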
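The computational formula for r can be sketched in Python as follows. This is a minimal illustration using only the standard library; the function name `pearson_r` and the toy data are assumptions made for this example and do not come from the slides.

```python
import math

def pearson_r(x, y):
    """Pearson r computed from raw sums, mirroring the computational formula."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator

# Made-up toy data: y rises exactly linearly with x, so r should be +1.
toy_x = [1, 2, 3, 4, 5]
toy_y = [3, 5, 7, 9, 11]
r_toy = pearson_r(toy_x, toy_y)
print(round(r_toy, 4))  # 1.0
```

Reversing the order of `toy_y` would give r = -1, the perfect negative case.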

2. Spearman Rank-Order Coefficient (ρ)

- Another correlation measure, used when the scores are not normally distributed or when the data are ordinal or ranked.

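$$ \rho = \frac{n\sum R_x R_y - \sum R_x \sum R_y}{\sqrt{\left[n\sum R_x^2 - \left(\sum R_x\right)^2\right]\left[n\sum R_y^2 - \left(\sum R_y\right)^2\right]}} $$

where R_x and R_y are the ranks of the x and y scores.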
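One way to read the rank-order formula is: replace each score by its rank, then apply the same sums as for Pearson r. The sketch below assumes that tied scores share the average of their ranks, a common convention that the slides do not state; the function names and toy data are illustrative only.

```python
import math

def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman coefficient: the Pearson-style formula applied to the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    sum_rx, sum_ry = sum(rx), sum(ry)
    sum_rxry = sum(a * b for a, b in zip(rx, ry))
    sum_rx2 = sum(a * a for a in rx)
    sum_ry2 = sum(b * b for b in ry)
    numerator = n * sum_rxry - sum_rx * sum_ry
    denominator = math.sqrt(
        (n * sum_rx2 - sum_rx ** 2) * (n * sum_ry2 - sum_ry ** 2)
    )
    if denominator == 0:
        raise ValueError("rho is undefined when either variable is constant")
    return numerator / denominator

# Made-up toy data: y always increases with x but not linearly, so rho is +1.
toy_x = [10, 20, 30, 40, 50]
toy_y = [1, 4, 9, 16, 100]
rho_toy = spearman_rho(toy_x, toy_y)
print(round(rho_toy, 4))  # 1.0
```

Because only the ranks enter the formula, the outlying value 100 does not pull the coefficient away from 1, which is why this measure suits ordinal or non-normal data.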

Interpretation of Pearson r-values

The following degrees of relationship shall be used:

Pearson r-value    Interpretation

Ex: Suppose you are given two data sets, X and Y.

  x       y     x^2        y^2       xy
  9    28.4      81     806.56    255.6
 15    29.3     225     858.49    439.5
 24    37.6     576    1413.76    902.4
 30    36.2     900    1310.44   1086.0
 38    36.5    1444    1332.25   1387.0
 46    35.3    2116    1246.09   1623.8
 53    36.2    2809    1310.44   1918.6
 60    44.1    3600    1944.81   2646.0
 64    44.8    4096    2007.04   2867.2
 76    47.2    5776    2227.84   3587.2
415   375.6   21623   14457.72  16713.3   (totals)

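Solution:

$$ \sum xy = 16713.3, \quad \sum x = 415, \quad \sum y = 375.6, \quad \sum x^2 = 21623, \quad \sum y^2 = 14457.72, \quad n = 10 $$

$$ r = \frac{10(16713.3) - (415)(375.6)}{\sqrt{\left[10(21623) - (415)^2\right]\left[10(14457.72) - (375.6)^2\right]}} $$

$$ r = \frac{11259}{\sqrt{(44005)(3501.84)}} = \frac{11259}{\sqrt{154098469.2}} $$

r = 0.906986153 ≈ 0.91 (high + correlation)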
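The arithmetic in this example can be checked with a short script. The sketch below rebuilds the column totals and r from the raw x and y values; it takes the first y value as 28.4, which is what the listed y^2 = 806.56 and xy = 255.6 imply. Variable names are illustrative.

```python
import math

x = [9, 15, 24, 30, 38, 46, 53, 60, 64, 76]
# First y value taken as 28.4: that is what y^2 = 806.56 and xy = 255.6 imply.
y = [28.4, 29.3, 37.6, 36.2, 36.5, 35.3, 36.2, 44.1, 44.8, 47.2]

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_x2 = sum(v * v for v in x)
sum_y2 = sum(v * v for v in y)
sum_xy = sum(a * b for a, b in zip(x, y))

# Same computational formula as in the worked solution.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
r = numerator / denominator

print(f"sum x = {sum_x}, sum y = {sum_y:.1f}")
print(f"sum x^2 = {sum_x2}, sum y^2 = {sum_y2:.2f}, sum xy = {sum_xy:.1f}")
print(f"r = {r:.4f}")  # r = 0.9070
```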

Problem 1

1. Age and Exercise. A researcher wishes to determine if a person's age is related to the number of hours he or she exercises per week. The data for the sample are shown here.

Age (X)     18   26   32   38   52   59
Hours (Y)    1    5    2    3  1.5    1

Testing the Significance of r

A test of the significance of r is needed to determine whether the computed r is significant or not.

Solution:

1. Solve for r

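2. Hypotheses:

Ho: ρ = 0 (there is no significant correlation between the variables)

Ha: ρ ≠ 0 (there is a significant correlation between the variables)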

3. Level of Significance: α = 5% (.05)

df = n - 2 = 6 - 2 = 4

Tabular Value (two-tailed) = 2.776

4. Test statistic to be used: t for r

5. Compute t:

$$ t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} $$

6. Decision:

If |t computed| > TV, reject Ho and accept Ha. Otherwise, do not reject Ho.

7. Interpretation:
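The steps above can be sketched in Python for the Age and Exercise data. This is an illustration only: the helper names `pearson_r` and `t_for_r` are assumptions made for this example, and the tabular value 2.776 is copied from step 3 rather than computed.

```python
import math

def pearson_r(x, y):
    """Pearson r from raw sums (computational formula)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

def t_for_r(r, n):
    """t = r / sqrt((1 - r^2) / (n - 2)), with n - 2 degrees of freedom."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))

age = [18, 26, 32, 38, 52, 59]
hours = [1, 5, 2, 3, 1.5, 1]
TABULAR_VALUE = 2.776  # two-tailed, alpha = .05, df = 4 (taken from step 3)

r = pearson_r(age, hours)          # step 1
t = t_for_r(r, len(age))           # step 5
reject_h0 = abs(t) > TABULAR_VALUE  # step 6

print(f"r = {r:.3f}, t = {t:.3f}, reject Ho: {reject_h0}")
```

With these six pairs the script gives r of about -0.36 and |t| of about 0.77, which is below the tabular value, so the decision rule in step 6 does not reject Ho.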

Problem 2

2. Emergency Calls and Temperature. An emergency service wishes to see whether a relationship exists between the outside temperature and the number of emergency calls it receives for a 7-hour period. The data are shown:

Temperature (x):    68   74   82   88   93   99   101
No. of calls (y):    7    4    8   10   11    9    13

Regression Analysis

- Designed to help us determine the probability that our inferences are sound.
- Helps us test the degree to which the dependent variable is affected by the independent variable.
- If there is no significant linear correlation, do not use the regression equation to make predictions.

Regression Equation

It is an error-free equation used to predict the value of y. It is an equation for perfect correlations.

Formula: y = a + bx (exact relationship)

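$$ a = \frac{\left(\sum y\right)\left(\sum x^2\right) - \left(\sum x\right)\left(\sum xy\right)}{N\left(\sum x^2\right) - \left(\sum x\right)^2} $$

$$ b = \frac{N\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{N\left(\sum x^2\right) - \left(\sum x\right)^2} $$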
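The formulas for a and b translate directly into code. The following is a minimal sketch; the function name `regression_coefficients` and the toy data (points lying exactly on y = 2 + 3x) are made up for illustration.

```python
def regression_coefficients(x, y):
    """Return (a, b) for y = a + bx using the raw-sum formulas."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(p * q for p, q in zip(x, y))
    sum_x2 = sum(p * p for p in x)
    denominator = n * sum_x2 - sum_x ** 2  # shared by a and b
    if denominator == 0:
        raise ValueError("a and b are undefined when x is constant")
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b

# Made-up toy data lying exactly on the line y = 2 + 3x.
toy_x = [0, 1, 2, 3]
toy_y = [2, 5, 8, 11]
a, b = regression_coefficients(toy_x, toy_y)
predicted = a + b * 4  # predicted y at x = 4

print(a, b, predicted)  # 2.0 3.0 14.0
```

Because the toy points fall exactly on a line, the recovered a and b reproduce it without error, which matches the "exact relationship" case described for the formula.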

Problem 1

1. Age and Exercise. A researcher wishes to determine if a person's age is related to the number of hours he or she exercises per week. The data for the sample are shown here.

Age (X)     18   26   32   38   52   59
Hours (Y)    1    5    2    3  1.5    1

Problem 2

2. Emergency Calls and Temperature. An emergency service wishes to see whether a relationship exists between the outside temperature and the number of emergency calls it receives for a 7-hour period. The data are shown:

Temperature (x):    68   74   82   88   93   99   101
No. of calls (y):    7    4    8   10   11    9    13
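As one possible way to work this problem, the sketch below first checks the significance of r and only then computes the regression equation, following the rule that the equation should not be used without a significant linear correlation. The critical value 2.571 (two-tailed, α = .05, df = 7 - 2 = 5) is an assumption read from a standard t table; it does not appear in the slides. Variable names are illustrative.

```python
import math

temperature = [68, 74, 82, 88, 93, 99, 101]
calls = [7, 4, 8, 10, 11, 9, 13]

# Assumption: two-tailed t critical value for alpha = .05 and df = 5,
# taken from a standard t table (not given in the slides).
CRITICAL_T = 2.571

n = len(temperature)
sum_x = sum(temperature)
sum_y = sum(calls)
sum_xy = sum(p * q for p, q in zip(temperature, calls))
sum_x2 = sum(p * p for p in temperature)
sum_y2 = sum(q * q for q in calls)

numerator = n * sum_xy - sum_x * sum_y
denom_x = n * sum_x2 - sum_x ** 2
denom_y = n * sum_y2 - sum_y ** 2

r = numerator / math.sqrt(denom_x * denom_y)
t = r / math.sqrt((1 - r ** 2) / (n - 2))

if abs(t) > CRITICAL_T:
    # Significant linear correlation: the regression equation may be used.
    b = numerator / denom_x
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denom_x
    print(f"r = {r:.3f}, t = {t:.3f}: y = {a:.3f} + {b:.3f}x")
else:
    a = b = None
    print(f"r = {r:.3f}, t = {t:.3f}: not significant, no regression equation")
```

For these seven pairs the script reports r of about 0.811 and t of about 3.10, so the regression branch runs and gives roughly y = -7.544 + 0.190x.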

