Correlation Analysis

PLAN

1. Scatter diagram. Dependent and independent variables.

2. Calculating and interpreting the coefficient of
correlation, the coefficient of determination, and the
standard error of estimate.

3. Test of hypothesis to determine if the population
coefficient of correlation is different from zero.
McGraw-Hill/Irwin Copyright © 2002 by The McGraw-Hill Companies, Inc. All rights reserved.
1. Scatter diagram. Dependent and independent variables
Correlation Analysis is a group of statistical
techniques used to measure the strength of the
association between two variables.
A Scatter Diagram is a chart that portrays the
relationship between the two variables.
The Dependent Variable (Y) is the variable being
predicted or estimated.
The Independent Variable (X) provides the basis
for estimation. It is the predictor variable.


2. The Coefficient of Correlation, r

The Coefficient of Correlation (r) is a measure of
the strength of the linear relationship between two
variables.
It requires interval or ratio-scaled data.
It can range from -1.00 to 1.00.
Values of -1.00 or 1.00 indicate perfect correlation,
the strongest possible.
Values close to 0.0 indicate weak correlation.
Negative values indicate an inverse relationship and
positive values indicate a direct relationship.
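
The sign and range of r can be checked numerically. The sketch below is only an illustration: `pearson_r` is a hypothetical helper written for this example, and the three data sets are made up to show a direct, an inverse, and a scattered relationship.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample coefficient of correlation for paired data (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]                             # illustrative values, not from the slides
r_direct = pearson_r(x, [2, 4, 6, 8, 10])       # Y rises exactly in step with X
r_inverse = pearson_r(x, [10, 8, 6, 4, 2])      # Y falls exactly in step with X
r_scattered = pearson_r(x, [3, 1, 4, 1, 5])     # no exact pattern

print(r_direct, r_inverse, round(r_scattered, 3))
```

The first two values sit at the ends of the range, while the scattered data give a value strictly between -1 and 1.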


[Figure: Perfect Negative Correlation. Scatter diagram of Y against X, both axes from 0 to 10.]


[Figure: Perfect Positive Correlation. Scatter diagram of Y against X, both axes from 0 to 10.]


[Figure: Zero Correlation. Scatter diagram of Y against X, both axes from 0 to 10.]


[Figure: Strong Positive Correlation. Scatter diagram of Y against X, both axes from 0 to 10.]


Formula for r

We calculate the coefficient of correlation from the
following formulas.

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{(n - 1)\, s_x s_y} = \frac{n(\sum XY) - (\sum X)(\sum Y)}{\sqrt{\left[n(\sum X^2) - (\sum X)^2\right]\left[n(\sum Y^2) - (\sum Y)^2\right]}}$$
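
The two forms of the formula give the same value, which can be checked with a short sketch. The function names and the five data pairs below are assumptions made for illustration; they do not come from the slides.

```python
from math import sqrt
from statistics import mean, stdev

def r_from_deviations(xs, ys):
    """r = sum((X - Xbar)(Y - Ybar)) / ((n - 1) * s_x * s_y)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # statistics.stdev is the sample standard deviation (divides by n - 1).
    return cross / ((n - 1) * stdev(xs) * stdev(ys))

def r_from_sums(xs, ys):
    """Computational form using only the sums of X, Y, XY, X^2 and Y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

xs = [2, 4, 5, 7, 9]   # made-up data
ys = [3, 6, 4, 9, 8]

r_dev = r_from_deviations(xs, ys)
r_sum = r_from_sums(xs, ys)
print(round(r_dev, 4), round(r_sum, 4))
```

The second form is usually more convenient by hand because it needs only column totals, not the means and standard deviations.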


Coefficient of Determination
The coefficient of determination (r²) is the
proportion of the total variation in the dependent
variable (Y) that is explained or accounted for by
the variation in the independent variable (X).
It is the square of the coefficient of correlation.
It ranges from 0 to 1.
It does not give any information on the direction
of the relationship between the variables.
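
The "proportion of variation explained" reading can be illustrated with a small sketch. It assumes the usual least-squares line Y' = a + bX as the way X "explains" Y, which is not spelled out on this slide, and it uses made-up data.

```python
from math import sqrt

xs = [1, 2, 3, 4, 5]   # made-up data
ys = [2, 3, 5, 4, 6]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)   # total variation in Y

r = sxy / sqrt(sxx * syy)

# Least-squares line Y' = a + bX (assumption: standard simple-regression fit).
b = sxy / sxx
a = y_bar - b * x_bar
unexplained = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
explained_share = 1 - unexplained / syy

print(round(r, 2), round(r ** 2, 2), round(explained_share, 2))
```

Here r² and the explained share agree, and reversing the sign of every Y value would flip the sign of r but leave r² unchanged.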


EXAMPLE 1
Dan Ireland, the student body president at
Toledo State University, is concerned about the
cost of textbooks to students. He believes there
is a relationship between the number of pages
in the text and the selling price of the book. To
provide insight into the problem he selects a
sample of eight textbooks currently on sale in
the bookstore. Draw a scatter diagram.
Compute the correlation coefficient.


EXAMPLE 1 continued

Book                  Pages   Price ($)
Intro to History        500        84
Basic Algebra           700        75
Intro to Psych          800        99
Intro to Sociology      600        72
Bus. Mgmt               400        69
Intro to Biology        500        81
Fund. of Jazz           600        63
Princ. of Nursing       800        93

Example 1 continued
[Figure: Scatter Diagram of Number of Pages and Selling Price of Text. Horizontal axis: Pages, 400 to 800. Vertical axis: Price ($), 60 to 100.]


Example 1 continued

Book                  Pages (X)   Price $ (Y)        XY          X²       Y²
Intro to History            500            84    42,000     250,000    7,056
Basic Algebra               700            75    52,500     490,000    5,625
Intro to Psych              800            99    79,200     640,000    9,801
Intro to Sociology          600            72    43,200     360,000    5,184
Bus. Mgmt                   400            69    27,600     160,000    4,761
Intro to Biology            500            81    40,500     250,000    6,561
Fund. of Jazz               600            63    37,800     360,000    3,969
Princ. of Nursing           800            93    74,400     640,000    8,649
Total                     4,900           636   397,200   3,150,000   51,606


Example 1 continued

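$$r = \frac{n(\sum XY) - (\sum X)(\sum Y)}{\sqrt{\left[n(\sum X^2) - (\sum X)^2\right]\left[n(\sum Y^2) - (\sum Y)^2\right]}}$$

$$= \frac{8(397{,}200) - (4{,}900)(636)}{\sqrt{\left[8(3{,}150{,}000) - (4{,}900)^2\right]\left[8(51{,}606) - (636)^2\right]}} = 0.614$$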

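
The arithmetic of Example 1 can be reproduced from the raw data as a check. This is a minimal sketch using the eight (pages, price) pairs from the example; the variable names are chosen for illustration.

```python
from math import sqrt

# (pages, price in dollars) for the eight textbooks in Example 1
books = [(500, 84), (700, 75), (800, 99), (600, 72),
         (400, 69), (500, 81), (600, 63), (800, 93)]

n = len(books)
sum_x = sum(x for x, _ in books)
sum_y = sum(y for _, y in books)
sum_xy = sum(x * y for x, y in books)
sum_x2 = sum(x * x for x, _ in books)
sum_y2 = sum(y * y for _, y in books)

# Computational formula for the coefficient of correlation.
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)   # 4900 636 397200 3150000 51606
print(round(r, 3))                            # 0.614
```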
3. Testing the Significance of
the Correlation Coefficient
Tests for Significance
• r is an estimate of the population correlation coefficient ρ
(rho).
• To test the hypothesis H0: ρ = 0, the test statistic is:

$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$$

• The critical value tα is obtained from the t-distribution
table using f = n - 2 degrees of freedom for any α.

Testing the Significance of
the Correlation Coefficient
Steps in Testing if ρ = 0
• Step 1: State the Hypotheses
Determine whether you are using a one- or two-tailed
test and the level of significance (α).
H0: ρ = 0
H1: ρ ≠ 0
• Step 2: Calculate the Critical Value
For degrees of freedom f = n - 2, look up the critical
value tα in the t-distribution table, then calculate
the test statistic:

$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$$

Testing the Significance of
the Correlation Coefficient
Steps in Testing if ρ = 0
• Step 3: Make the Decision
If using the t statistic method, reject H0 if
t > tα or t < -tα, or
if the p-value < α.
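
The three steps can be sketched in a few lines. The sample result (r = 0.75, n = 8) is hypothetical, the function names are assumptions, and the critical values are the two-tailed entries for 6 degrees of freedom copied from a standard t table.

```python
from math import sqrt

# Two-tailed critical values of t for 6 degrees of freedom, copied from a
# standard t table (assumption: four common significance levels).
T_CRITICAL_DF6 = {0.10: 1.943, 0.05: 2.447, 0.02: 3.143, 0.01: 3.707}

def t_statistic(r, n):
    """Test statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

def reject_h0(t, t_critical):
    """Two-tailed rule: reject when t > t_critical or t < -t_critical."""
    return t > t_critical or t < -t_critical

r, n = 0.75, 8   # hypothetical sample result
t = t_statistic(r, n)
decisions = {alpha: reject_h0(t, tc) for alpha, tc in T_CRITICAL_DF6.items()}

print(round(t, 3), decisions)
```

With these numbers the same sample correlation is significant at the .10 and .05 levels but not at .02 or .01, which shows why the level of significance has to be fixed in Step 1.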

Testing the Significance of
the Correlation Coefficient
Role of Sample Size
• As sample size increases, the critical value of r
becomes smaller.
• This makes it easier for smaller values of the
sample correlation coefficient to be considered
significant.
• A larger sample does not mean that the correlation
is stronger nor does its significance imply
importance.
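
The effect of sample size can be made concrete by solving the test statistic for r, which gives the smallest |r| that is significant: r = t / sqrt(t² + n - 2). The sketch below assumes a two-tailed test at the .05 level and uses critical t values copied from a standard t table; `critical_r` is a hypothetical helper.

```python
from math import sqrt

# Two-tailed critical t values at the .05 level from a standard t table,
# keyed by sample size n (degrees of freedom are n - 2).
T_CRITICAL_05 = {5: 3.182, 10: 2.306, 20: 2.101, 30: 2.048}

def critical_r(n, t_critical):
    """Smallest |r| that is significant: t = r*sqrt(n-2)/sqrt(1-r^2) solved for r."""
    return t_critical / sqrt(t_critical ** 2 + (n - 2))

critical_values = {n: round(critical_r(n, t), 3) for n, t in T_CRITICAL_05.items()}
print(critical_values)   # {5: 0.878, 10: 0.632, 20: 0.444, 30: 0.361}
```

A correlation of 0.40 would not be significant with 10 observations but would be with 30, even though the strength of the association is the same.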

Testing the Significance,
EXAMPLE 1, continued
The correlation between the number of pages and the
selling price of the book is 0.614. This indicates a
moderate association between the variables.
Test the hypothesis that there is no correlation in the
population. Use a .02 significance level.

Step 1: H0: The correlation in the population is zero.
        H1: The correlation in the population is not zero.

Testing the Significance,
EXAMPLE 1, continued

Step 2: H0 is rejected if t > 3.143 or if t < -3.143.
There are 6 degrees of freedom, found by n - 2 = 8 - 2 = 6.

EXAMPLE 1 continued
To find the value of the test statistic we use:

$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} = \frac{.614\sqrt{8 - 2}}{\sqrt{1 - (.614)^2}} = 1.905$$

Step 3: H0 is not rejected. We cannot reject the
hypothesis that there is no correlation in the population.
The amount of association could be due to chance.
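
The test for Example 1 can be verified with a short sketch that plugs the example's r and n into the test statistic and applies the Step 2 rejection rule; the variable names are chosen for illustration.

```python
from math import sqrt

r, n = 0.614, 8        # sample correlation and sample size from Example 1
t_critical = 3.143     # two-tailed critical value from Step 2 (.02 level, 6 df)

t = r * sqrt(n - 2) / sqrt(1 - r ** 2)
reject_h0 = t > t_critical or t < -t_critical

print(round(t, 3), reject_h0)   # 1.905 False
```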
