SE 037 439

AUTHOR: O'Brien, Francis J., Jr.
TITLE: [...] Correlation Coefficient Has Limits (Plus or Minus) 1.
NOTE: 32p.

Reproductions supplied by EDRS are the best that can be made from the original document.
However, [...] proves this. One of the [...]

This paper will set forth in detail a proof of the limits of the sample bivariate correlation coefficient. Since the proof requires only knowledge of [...] understanding the proof. [...] applied statistics; I know that the typical student can understand the proof as it is presented here.
The key for understanding statistical proofs is a presentation [...] A review [...] and [...] usually required). [...] in detail important statistical proofs, they feel that some of the mystery and magic of mathematics has been unveiled. [...] motivational [...]
Some Preliminary Concepts
¹This paper is one of a series contemplated for publication [see O'Brien, 1982]. Eventually, I hope to present a textbook of applied statistics proofs and derivations to supplement standard applied statistics textbooks.
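Table 1

                  Measure X                                           Measure Y
                  Unstandardized   Standardized                       Unstandardized   Standardized
                  $X_1$            $Z_{x_1} = (X_1 - \bar{X})/S_x$    $Y_1$            $Z_{y_1} = (Y_1 - \bar{Y})/S_y$
                  $X_2$            $Z_{x_2} = (X_2 - \bar{X})/S_x$    $Y_2$            $Z_{y_2} = (Y_2 - \bar{Y})/S_y$
                  $X_3$            $Z_{x_3} = (X_3 - \bar{X})/S_x$    $Y_3$            $Z_{y_3} = (Y_3 - \bar{Y})/S_y$
                  ...              ...                                ...              ...
                  $X_i$            $Z_{x_i} = (X_i - \bar{X})/S_x$    $Y_i$            $Z_{y_i} = (Y_i - \bar{Y})/S_y$
                  ...              ...                                ...              ...
                  $X_n$            $Z_{x_n} = (X_n - \bar{X})/S_x$    $Y_n$            $Z_{y_n} = (Y_n - \bar{Y})/S_y$
Sample Size       $n_x$            $n_{z_x}$                          $n_y$            $n_{z_y}$
Sample Mean       $\bar{X}$        $\bar{Z}_x$                        $\bar{Y}$        $\bar{Z}_y$
Sample Variance   $S_x^2$          $S_{z_x}^2$                        $S_y^2$          $S_{z_y}^2$

NOTE: $n_x = n_y = n_{z_x} = n_{z_y}$. Any of these sample size terms could be identified by just one symbol, such as $n$. We will use $n$ when it is not important to distinguish among the other sample size terms, but will use the table values above when it is necessary or important to do so.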
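Table 2

                  Measure X                                                        Measure Y
Sample Mean       $\bar{X} = \frac{\sum_{i=1}^{n_x} X_i}{n_x}$                     $\bar{Y} = \frac{\sum_{i=1}^{n_y} Y_i}{n_y}$
Sum               $n_x \bar{X} = \sum_{i=1}^{n_x} X_i$                             $n_y \bar{Y} = \sum_{i=1}^{n_y} Y_i$
Sample Variance   $S_x^2 = \frac{\sum_{i=1}^{n_x} (X_i - \bar{X})^2}{n_x - 1}$     $S_y^2 = \frac{\sum_{i=1}^{n_y} (Y_i - \bar{Y})^2}{n_y - 1}$
Sum of Squares    $(n_x - 1) S_x^2 = \sum_{i=1}^{n_x} (X_i - \bar{X})^2$           $(n_y - 1) S_y^2 = \sum_{i=1}^{n_y} (Y_i - \bar{Y})^2$

NOTES: [...]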
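Standard Scores

It will be recalled that the standard score for an unstandardized measure (raw score) is "the score minus the mean divided by the standard deviation".

$$Z_{x_i} = \frac{X_i - \bar{X}}{S_x}$$

Similarly,

$$Z_{y_i} = \frac{Y_i - \bar{Y}}{S_y}$$

For the standard scores, the sample mean and sample variance of the X measure are:

$$\bar{Z}_x = \frac{\sum_{i=1}^{n_{z_x}} Z_{x_i}}{n_{z_x}} = 0$$

$$S_{z_x}^2 = \frac{\sum_{i=1}^{n_{z_x}} (Z_{x_i} - \bar{Z}_x)^2}{n_{z_x} - 1} = 1$$

and of the Y measure:

$$\bar{Z}_y = \frac{\sum_{i=1}^{n_{z_y}} Z_{y_i}}{n_{z_y}} = 0$$

$$S_{z_y}^2 = \frac{\sum_{i=1}^{n_{z_y}} (Z_{y_i} - \bar{Z}_y)^2}{n_{z_y} - 1} = 1$$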
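The definition of a standard score can be checked numerically. The following is a minimal Python sketch, assuming a hypothetical set of raw scores and a helper named `z_scores` (neither appears in the paper). It standardizes a small sample using the $n - 1$ divisor and reports the mean and variance of the resulting standard scores.

```python
from statistics import mean, stdev, variance

def z_scores(values):
    """Return (score - mean) / sample standard deviation for each raw score."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation, divisor n - 1
    return [(v - m) / s for v in values]

x = [2.0, 4.0, 4.0, 5.0, 7.0]  # hypothetical raw scores, for illustration only
zx = z_scores(x)

z_mean = mean(zx)      # should be 0 apart from rounding error
z_var = variance(zx)   # should be 1 apart from rounding error
print(f"mean = {z_mean:.6f}, variance = {z_var:.6f}")
```

Any sample with at least two distinct values should behave the same way, since the result does not depend on the particular scores chosen.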
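If we square each standard score for the X measure in Table 1 and sum them, we obtain:

$$\sum_{i=1}^{n_{z_x}} Z_{x_i}^2 = Z_{x_1}^2 + Z_{x_2}^2 + \cdots + Z_{x_n}^2$$

If we substitute the appropriate means and variances in the right hand side:

$$\sum_{i=1}^{n_{z_x}} Z_{x_i}^2 = \frac{(X_1 - \bar{X})^2}{S_x^2} + \frac{(X_2 - \bar{X})^2}{S_x^2} + \cdots + \frac{(X_n - \bar{X})^2}{S_x^2}$$

Since the $S_x^2$ term is the same in every fraction, the right hand side can be written over a single denominator:

$$\sum_{i=1}^{n_{z_x}} Z_{x_i}^2 = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{S_x^2} = \frac{\sum_{i=1}^{n_x} (X_i - \bar{X})^2}{S_x^2}$$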
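From Table 2, we know that we can substitute the sum of squares term into the numerator on the right hand side. This results in:

$$\sum_{i=1}^{n_{z_x}} Z_{x_i}^2 = \frac{(n_x - 1) S_x^2}{S_x^2} = n_x - 1 = n - 1$$

Similarly, for the Y measure:

$$\sum_{i=1}^{n_{z_y}} Z_{y_i}^2 = \frac{(n_y - 1) S_y^2}{S_y^2} = n_y - 1 = n - 1$$

(Recall that $n_y - 1 = n - 1$.)

These relationships between squared z scores and sample size are very important [...]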
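The result that the squared standard scores sum to $n - 1$ can be illustrated with a short Python sketch. The raw scores below are hypothetical, and the variable names are chosen only for this illustration.

```python
from statistics import mean, stdev

x = [3.0, 8.0, 1.0, 6.0, 9.0, 4.0]  # hypothetical raw scores
n = len(x)
x_bar = mean(x)
s_x = stdev(x)  # sample standard deviation, divisor n - 1

# Sum of squared standard scores: each term is ((X_i - mean) / S_x) squared.
sum_sq_z = sum(((xi - x_bar) / s_x) ** 2 for xi in x)

print(f"sum of squared z scores = {sum_sq_z:.6f}, n - 1 = {n - 1}")
```

The agreement holds only when the standard deviation is computed with the $n - 1$ divisor, as in Table 2.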
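Correlation Formulas

Unstandardized Form

The correlation of measure X and measure Y in unstandardized (raw score) form is defined as follows:

$$r_{xy} = \frac{\dfrac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}}{\sqrt{\dfrac{\sum_{i=1}^{n_x} (X_i - \bar{X})^2}{n_x - 1} \cdot \dfrac{\sum_{i=1}^{n_y} (Y_i - \bar{Y})^2}{n_y - 1}}}$$

Note that the numerator contains the term $n - 1$ because it is not important to distinguish among the sample size terms there ($n_x - 1 = n_y - 1 = n - 1$).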
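Standardized Form

The correlation of measure X and measure Y in standard score form is defined as follows:

$$r_{z_x z_y} = \frac{\dfrac{\sum_{i=1}^{n} (Z_{x_i} - \bar{Z}_x)(Z_{y_i} - \bar{Z}_y)}{n - 1}}{\sqrt{\dfrac{\sum_{i=1}^{n_{z_x}} (Z_{x_i} - \bar{Z}_x)^2}{n_{z_x} - 1} \cdot \dfrac{\sum_{i=1}^{n_{z_y}} (Z_{y_i} - \bar{Z}_y)^2}{n_{z_y} - 1}}} = \frac{\sum_{i=1}^{n} Z_{x_i} Z_{y_i}}{n - 1}$$

Multiplying each side by $n - 1$, we obtain:

$$(n - 1)\, r_{z_x z_y} = \sum_{i=1}^{n} Z_{x_i} Z_{y_i}$$

Because $r_{z_x z_y} = r_{xy}$ (see the Appendix), the same sum is also equal to $(n - 1)\, r_{xy}$.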
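The two correlation formulas can be compared numerically. The sketch below is a minimal Python illustration with hypothetical data; the function names `r_raw` and `r_standardized` are assumptions made for this example and do not come from the paper.

```python
from math import sqrt
from statistics import mean, stdev

def r_raw(x, y):
    """Correlation from raw scores: covariance over the root of the variances."""
    n = len(x)
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    var_x = sum((a - mx) ** 2 for a in x) / (n - 1)
    var_y = sum((b - my) ** 2 for b in y) / (n - 1)
    return cov / sqrt(var_x * var_y)

def r_standardized(x, y):
    """Correlation from standard scores: sum of z-score products over n - 1."""
    n = len(x)
    mx, my, sx, sy = mean(x), mean(y), stdev(x), stdev(y)
    zx = [(a - mx) / sx for a in x]
    zy = [(b - my) / sy for b in y]
    return sum(p * q for p, q in zip(zx, zy)) / (n - 1)

x = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical X measures
y = [2.0, 1.0, 4.0, 3.0, 6.0]  # hypothetical Y measures

r1 = r_raw(x, y)
r2 = r_standardized(x, y)
print(f"raw form = {r1:.6f}, standardized form = {r2:.6f}")
```

Both functions should agree up to rounding error, which is the equality $r_{xy} = r_{z_x z_y}$ that the proof relies on.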
Inequalities

[...] it is necessary to review one further form of inequality: $A > B$ and $A < B$ [...]. For example, $3 > 1$ or, equivalently, $1 < 3$. All of this may be obvious. [...]

For example, if each side of $3 > 1$ is multiplied by $-1$:

$$-1[(3) > (1)]$$

$$-3 < -1$$

That is, the inequality sign is reversed when multiplying by a negative number.

For example:

$$1 - A > B$$

$$-1[(1 - A) > (B)]$$

$$-(1 - A) < -B$$

$$A - 1 < -B$$

Example:

$$1 - \tfrac{1}{4} > 0$$

Multiplying through by $-1$ we obtain:

$$-1[(1 - \tfrac{1}{4}) > 0]$$

$$-(1 - \tfrac{1}{4}) < 0$$

$$-\tfrac{3}{4} < 0$$
We have reviewed standard scores (z), correlation formulas and algebraic inequalities. [...]

Table 3

$\sum_{i=1}^{n} Z_{x_i}^2 = n - 1$

$\sum_{i=1}^{n} Z_{y_i}^2 = n - 1$

$r_{xy} = r_{z_x z_y}$

$\sum_{i=1}^{n} Z_{x_i} Z_{y_i} = (n - 1)\, r_{z_x z_y} = (n - 1)\, r_{xy}$

$-1[(1 - A) > (B)]$ gives $A - 1 < -B$
Proof

$$-1 \le r_{xy} \le +1$$

The proof consists of two parts: one part shows the lower limit of $r_{xy}$ (i.e., $r_{xy} \ge -1$), and the other shows the upper limit of $r_{xy}$ (i.e., $r_{xy} \le +1$).

Proof that $r_{xy} \le +1$

[...] algebraic manipulations [...]

$$\sum_{i=1}^{n} (Z_{x_i} - Z_{y_i})^2 \ge 0$$

[...] values starting at $Z_{x_1}$, $Z_{y_1}$ and continuing down to the last pair of Z's ($Z_{x_n}$, $Z_{y_n}$).
Most students readily agree that the squared sum will be greater than 0. But can it ever be exactly equal to 0? Yes, theoretically it can. [...] If each pair of z scores has the same numerical value, then it is apparent that each difference will be 0, and [...] of 0 is also 0. [...] it is only required that [...] be true in a mathematical sense.

Example:

$Z_{x_i}$    $Z_{y_i}$
1.41     1.41
-.68     -.68
.05      .05
etc.

Notes: For each step of the proof, examine the algebraic statement first. Then read the note that goes with it.
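That $r_{xy} \le +1$:

$$\sum_{i=1}^{n} (Z_{x_i} - Z_{y_i})^2 \ge 0$$

Notes: This is the statement from before.

$$\sum_{i=1}^{n} (Z_{x_i}^2 + Z_{y_i}^2 - 2 Z_{x_i} Z_{y_i}) \ge 0$$

Notes: Squaring each term, we obtain an expansion of the form $(A - B)^2 = A^2 + B^2 - 2AB$.

$$\sum_{i=1}^{n} Z_{x_i}^2 + \sum_{i=1}^{n} Z_{y_i}^2 - 2 \sum_{i=1}^{n} Z_{x_i} Z_{y_i} \ge 0$$

Notes: The summation sign is applied to each term separately.

$$(n - 1) + (n - 1) - 2(n - 1)\, r_{xy} \ge 0$$

Notes: Making the three substitutions from Table 3: $\sum Z_{x_i}^2 = n - 1$, $\sum Z_{y_i}^2 = n - 1$ and $\sum Z_{x_i} Z_{y_i} = (n - 1)\, r_{xy}$.

$$2(n - 1) - 2(n - 1)\, r_{xy} \ge 0$$

$$2(n - 1)(1 - r_{xy}) \ge 0$$

Notes: Factoring the $2(n - 1)$ term.

$$\frac{2(n - 1)(1 - r_{xy})}{2(n - 1)} \ge \frac{0}{2(n - 1)}$$

$$(1 - r_{xy}) \ge 0$$

Notes: Dividing each side by $2(n - 1)$, which does not change the inequality sign because $2(n - 1)$ is always positive.

$$-1[(1 - r_{xy}) \ge 0]$$

$$r_{xy} - 1 \le 0$$

Notes: Here we make use of multiplying an inequality by a negative number. Let us multiply each side of the inequality by $-1$ (see Table 3), which reverses the sign of each term in $1 - r_{xy}$ and reverses the inequality.

$$r_{xy} \le +1$$

Notes: Now, add $+1$ to each side. This gives us $r_{xy} \le +1$.

END OF PROOF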
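The chain of steps for the upper limit can be traced numerically. The following Python sketch uses hypothetical data and a helper named `standardize` (an assumption for this illustration). It checks that the sum of squared differences between paired standard scores equals $2(n - 1)(1 - r_{xy})$, which cannot be negative.

```python
from statistics import mean, stdev

def standardize(values):
    """Convert raw scores to standard scores using the n - 1 divisor."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

x = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical X measures
y = [2.0, 1.0, 4.0, 3.0, 6.0]  # hypothetical Y measures
n = len(x)
zx, zy = standardize(x), standardize(y)

# Correlation in standard score form: sum of products over n - 1.
r = sum(a * b for a, b in zip(zx, zy)) / (n - 1)

# Left side: a sum of squares, so it can never be negative.
lhs = sum((a - b) ** 2 for a, b in zip(zx, zy))
# Right side: the factored form reached in the proof.
rhs = 2 * (n - 1) * (1 - r)

print(f"lhs = {lhs:.6f}, rhs = {rhs:.6f}, r = {r:.6f}")
```

Because the left side is a sum of squares and $2(n - 1)$ is positive, $1 - r_{xy}$ cannot be negative, which is the upper limit.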
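Proof that $r_{xy} \ge -1$

We will [...] true, namely:

$$\sum_{i=1}^{n} (Z_{x_i} + Z_{y_i})^2 \ge 0$$

$$\sum_{i=1}^{n} (Z_{x_i}^2 + Z_{y_i}^2 + 2 Z_{x_i} Z_{y_i}) \ge 0$$

Notes: Squaring each term, we obtain an expansion of the form $(A + B)^2 = A^2 + B^2 + 2AB$.

$$\sum_{i=1}^{n} Z_{x_i}^2 + \sum_{i=1}^{n} Z_{y_i}^2 + 2 \sum_{i=1}^{n} Z_{x_i} Z_{y_i} \ge 0$$

$$(n - 1) + (n - 1) + 2(n - 1)\, r_{xy} \ge 0$$

Notes: Making the three substitutions from Table 3.

$$2(n - 1)[1 + r_{xy}] \ge 0$$

Notes: Factoring the $2(n - 1)$ term.

$$1 + r_{xy} \ge 0$$

Notes: Dividing each side by $2(n - 1)$, which is always positive.

$$1 + r_{xy} - 1 \ge -1$$

Notes: Adding $-1$ to each side.

$$r_{xy} \ge -1$$

Notes: Simplifying.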
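The lower limit can be illustrated in the same way. This Python sketch uses hypothetical data and hypothetical helper names (`standardize`, `pearson_r`). It checks that the sum of squared sums of paired standard scores equals $2(n - 1)(1 + r_{xy})$, and then evaluates a boundary case in which Y is an exact decreasing linear function of X.

```python
from statistics import mean, stdev

def standardize(values):
    """Convert raw scores to standard scores using the n - 1 divisor."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def pearson_r(x, y):
    """Correlation as the sum of z-score products divided by n - 1."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)

x = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical X measures
y = [2.0, 1.0, 4.0, 3.0, 6.0]  # hypothetical Y measures
n = len(x)
zx, zy = standardize(x), standardize(y)
r = pearson_r(x, y)

lhs = sum((a + b) ** 2 for a, b in zip(zx, zy))  # sum of squares, never negative
rhs = 2 * (n - 1) * (1 + r)                      # factored form from the proof

# Boundary case: Y falls exactly on a line with negative slope.
y_neg = [-2.0 * xi + 3.0 for xi in x]
r_neg = pearson_r(x, y_neg)

print(f"lhs = {lhs:.6f}, rhs = {rhs:.6f}, boundary r = {r_neg:.6f}")
```

In the boundary case each pair of standard scores sums to 0, so the squared sum is 0 and the correlation reaches its lower limit apart from rounding error.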
[...] $+1$. See the Appendix [...]
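APPENDIX

Selected Proofs²

1. [...] the X measure:

$$\bar{Z}_x = \frac{\sum_{i=1}^{n_{z_x}} Z_{x_i}}{n_{z_x}} = \frac{1}{n_{z_x}} \left[ \frac{X_1 - \bar{X}}{S_x} + \frac{X_2 - \bar{X}}{S_x} + \cdots + \frac{X_n - \bar{X}}{S_x} \right]$$

in summation notation:

$$\bar{Z}_x = \frac{\sum_{i=1}^{n_x} (X_i - \bar{X})}{n_{z_x} S_x} = \frac{\sum_{i=1}^{n_x} X_i - n_x \bar{X}}{n_{z_x} S_x}$$

Since $\sum_{i=1}^{n_x} X_i$ is equal to $n_x \bar{X}$:

$$\bar{Z}_x = \frac{n_x \bar{X} - n_x \bar{X}}{n_{z_x} S_x} = 0$$

Similarly, for the Y measure:

$$\bar{Z}_y = \frac{\sum_{i=1}^{n_{z_y}} Z_{y_i}}{n_{z_y}} = \frac{n_y \bar{Y} - n_y \bar{Y}}{n_{z_y} S_y} = 0$$

[...] mean equal to 0.

² [...] $\sum_{i=1}^{n} c = c + c + c + \cdots + c = nc$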
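2. By definition, the variance for X measures in standard score form is:

$$S_{z_x}^2 = \frac{\sum_{i=1}^{n_{z_x}} (Z_{x_i} - \bar{Z}_x)^2}{n_{z_x} - 1}$$

Since $\bar{Z}_x = 0$:

$$S_{z_x}^2 = \frac{\sum_{i=1}^{n_{z_x}} Z_{x_i}^2}{n_{z_x} - 1}$$

If we rewrite $Z_{x_i}$ as the score minus the mean divided by the standard deviation:

$$S_{z_x}^2 = \frac{1}{n_{z_x} - 1} \sum_{i=1}^{n_x} \frac{(X_i - \bar{X})^2}{S_x^2}$$

Rearranging terms:

$$S_{z_x}^2 = \frac{1}{n_{z_x} - 1} \cdot \frac{1}{S_x^2} \sum_{i=1}^{n_x} (X_i - \bar{X})^2$$

The sum on the right is the "sum of squares" for the X measure in Table 2. This results in:

$$S_{z_x}^2 = \frac{(n_x - 1)(S_x^2)}{(n_{z_x} - 1)(S_x^2)}$$

Since $n_x = n_{z_x}$:

$$S_{z_x}^2 = 1$$

Similarly, for the Y measure:

$$S_{z_y}^2 = \frac{(n_y - 1)(S_y^2)}{(n_{z_y} - 1)(S_y^2)} = 1$$

since $n_y = n_{z_y}$.

In each case, the standard deviation for the appropriate variance term,

$$S_{z_x} = \sqrt{S_{z_x}^2} = \sqrt{1} = 1 \quad \text{and} \quad S_{z_y} = \sqrt{S_{z_y}^2} = \sqrt{1} = 1,$$

is equal to 1.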
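3. That $r_{xy} = r_{z_x z_y}$:

$$r_{z_x z_y} = \frac{\dfrac{1}{n - 1} \sum_{i=1}^{n} (Z_{x_i} - \bar{Z}_x)(Z_{y_i} - \bar{Z}_y)}{\sqrt{\dfrac{\sum_{i=1}^{n_{z_x}} (Z_{x_i} - \bar{Z}_x)^2}{n_{z_x} - 1} \cdot \dfrac{\sum_{i=1}^{n_{z_y}} (Z_{y_i} - \bar{Z}_y)^2}{n_{z_y} - 1}}}$$

Since $\bar{Z}_x = \bar{Z}_y = 0$:

$$r_{z_x z_y} = \frac{\dfrac{1}{n - 1} \sum_{i=1}^{n} (Z_{x_i})(Z_{y_i})}{\sqrt{\dfrac{\sum_{i=1}^{n_{z_x}} (Z_{x_i})^2}{n_{z_x} - 1} \cdot \dfrac{\sum_{i=1}^{n_{z_y}} (Z_{y_i})^2}{n_{z_y} - 1}}}$$

Since $\sum (Z_{x_i})^2 = n_{z_x} - 1$ and $\sum (Z_{y_i})^2 = n_{z_y} - 1$:

$$r_{z_x z_y} = \frac{\dfrac{1}{n - 1} \sum_{i=1}^{n} (Z_{x_i})(Z_{y_i})}{\sqrt{\dfrac{n_{z_x} - 1}{n_{z_x} - 1} \cdot \dfrac{n_{z_y} - 1}{n_{z_y} - 1}}} = \frac{1}{n - 1} \sum_{i=1}^{n} Z_{x_i} Z_{y_i}$$

(Recall that this relationship was used in the proof for the limits of $r_{xy}$.)

Now, expanding the z score terms:

$$r_{z_x z_y} = \frac{1}{n - 1} \sum_{i=1}^{n} \frac{(X_i - \bar{X})(Y_i - \bar{Y})}{(S_x)(S_y)}$$

This is identical to:

$$r_{z_x z_y} = \frac{\dfrac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{(S_x)(S_y)}$$

Recognizing that $S_x = \sqrt{S_x^2}$ and $S_y = \sqrt{S_y^2}$, we can write:

$$r_{z_x z_y} = \frac{\dfrac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{(S_x^2)(S_y^2)}}$$

[...] raw score [...]

$$r_{z_x z_y} = \frac{\dfrac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\dfrac{\sum_{i=1}^{n_x} (X_i - \bar{X})^2}{n_x - 1} \cdot \dfrac{\sum_{i=1}^{n_y} (Y_i - \bar{Y})^2}{n_y - 1}}}$$

Therefore,

$$r_{z_x z_y} = r_{xy}$$

[...] forms is identical.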
References

O'Brien, Francis J., Jr. A proof that t² and F are identical: the general case. ERIC Clearinghouse for Science, Mathematics and Environmental Education, Ohio State University, April, 1982. (Note: the ED number for this document was not available at the time of this publication).