
EXERCISE SHEET 6

1. Consider the following data of two variables in a sample:


x y
5 19
7 13
8 14
16 8
15 8
18 8
22 7

a) Sketch a scatter plot of the data. Based on this, what would you expect the Pearson
correlation coefficient between x and y to be (i.e. positive or negative; large or small)?
[Scatter plot of the data: y on the vertical axis (0 to 20) against x on the horizontal axis (4 to 24)]

The plot suggests a fairly strong negative linear correlation between the two variables.

b) Calculate the covariance, $s_{xy}$, between the two variables.


Note: there is an “easier” formula, which will be covered by next week’s lectures

x y $x-\bar{x}$ $y-\bar{y}$ $(x-\bar{x})(y-\bar{y})$
5 19 -8 8 -64
7 13 -6 2 -12
8 14 -5 3 -15
16 8 3 -3 -9
15 8 2 -3 -6
18 8 5 -3 -15
22 7 9 -4 -36
$\bar{x}=13$ $\bar{y}=11$ Total = -157

$$s_{xy} = \frac{\sum_{i=1}^{n} (x_i-\bar{x})(y_i-\bar{y})}{n-1} = \frac{-157}{7-1} = -26.17$$
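
The covariance calculation above can be checked numerically. The following is a minimal sketch in plain Python that follows the same deviation-based steps as the table; the variable names (`cross_sum`, `s_xy`) are chosen here for illustration and are not part of the exercise sheet.

```python
# Sample covariance from deviations about the means (exercise 1 data).
x = [5, 7, 8, 16, 15, 18, 22]
y = [19, 13, 14, 8, 8, 8, 7]

n = len(x)
x_bar = sum(x) / n  # mean of x
y_bar = sum(y) / n  # mean of y

# Sum the cross-products of the deviations, then divide by n - 1.
cross_sum = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xy = cross_sum / (n - 1)

print(cross_sum, round(s_xy, 2))
```

The printed total and covariance should agree with the Total and $s_{xy}$ values worked out by hand.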

c) Determine the correlation coefficient, r, between x and y, and interpret the correlation
coefficient.
First calculate the standard deviation of each variable. Recall the formulae for the
sample standard deviation:

$$s^2 = \frac{\sum (x-\bar{x})^2}{n-1}, \quad \text{or} \quad s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1}$$

$s_x = 6.38$
$s_y = 4.47$

$r_{xy} = \frac{s_{xy}}{s_x s_y} = -0.92$, a strong negative correlation, as expected.
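
As a cross-check of the values above, here is a short Python sketch that computes the two sample standard deviations and then r as covariance divided by their product. The helper name `sample_sd` is an illustrative choice, not something defined on the sheet.

```python
import math

# Exercise 1 data.
x = [5, 7, 8, 16, 15, 18, 22]
y = [19, 13, 14, 8, 8, 8, 7]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n


def sample_sd(values, mean):
    """Sample standard deviation with the n - 1 divisor."""
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


s_x = sample_sd(x, x_bar)
s_y = sample_sd(y, y_bar)
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)

# Pearson correlation: covariance scaled by both standard deviations.
r = s_xy / (s_x * s_y)
print(round(s_x, 2), round(s_y, 2), round(r, 2))
```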

2. Consider the following data of two variables in a sample from last week’s
exercise sheet:
x y
5 19
7 13
8 14
16 8
15 8
18 8
22 7

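d) Calculate r, between x and y, using the formula $r = \frac{C}{\sqrt{AB}}$

$$C = n\sum x_i y_i - \sum x_i \sum y_i = 7 \times 844 - 91 \times 77 = -1099$$
$$A = n\sum x_i^2 - \left(\sum x_i\right)^2 = 7 \times 1427 - 91^2 = 1708$$
$$B = n\sum y_i^2 - \left(\sum y_i\right)^2 = 7 \times 967 - 77^2 = 840$$

$$r = \frac{C}{\sqrt{AB}} = \frac{-1099}{\sqrt{1708 \times 840}} = -0.92$$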
x y $x^2$ $y^2$ xy
5 19 25 361 95
7 13 49 169 91
8 14 64 196 112
16 8 256 64 128
15 8 225 64 120
18 8 324 64 144
22 7 484 49 154
Totals 91 77 1427 967 844
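
The value of r obtained here matches the one from exercise 1 (covariance divided by the product of the standard deviations), and this is not a coincidence. A brief derivation sketch, using only the definitions already given on this sheet, shows that each of C, A and B is the corresponding sample quantity scaled by $n(n-1)$:

```latex
\begin{align*}
C &= n\sum x_i y_i - \sum x_i \sum y_i
   = n\Bigl(\sum x_i y_i - n\bar{x}\bar{y}\Bigr) = n(n-1)\,s_{xy},\\
A &= n\sum x_i^2 - \Bigl(\sum x_i\Bigr)^2 = n(n-1)\,s_x^2,\\
B &= n\sum y_i^2 - \Bigl(\sum y_i\Bigr)^2 = n(n-1)\,s_y^2,\\
r &= \frac{C}{\sqrt{AB}}
   = \frac{n(n-1)\,s_{xy}}{n(n-1)\,s_x s_y}
   = \frac{s_{xy}}{s_x s_y}.
\end{align*}
```

For this data set, for example, $n(n-1)\,s_{xy} = 7 \times 6 \times (-157/6) = -1099$, which is exactly C.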

e) Test the significance of r calculated in d) using a 5% significance level.

Hypotheses:
$H_0: \rho = 0$; $H_1: \rho \neq 0$
Two-tailed test.

Test value:
$$t = \frac{r}{\sqrt{\dfrac{1-r^2}{n-2}}} \sim t(n-2), \quad \text{which gives } t \approx -5.24$$

Use the t distribution with $\alpha/2 = 0.025$ in each tail and $\nu = n-2 = 5$ degrees of freedom. Critical value: 2.571.

Reject $H_0$, as $|t| = 5.24 \geq t_{\nu,\alpha/2} = 2.571$.

3. Assume that X and Y are random variables with a bivariate Normal
distribution. (In a bivariate normal distribution the two variables need not be
independent. Each variable is normally distributed on its own, and any linear
combination of the two, such as their sum, is also normally distributed.)

x y
5 8
7 9
3 11
16 27
12 15
9 13

a) Calculate the correlation coefficient between X and Y, using the formula
$r = \frac{C}{\sqrt{AB}}$, where

$$C = n\sum x_i y_i - \sum x_i \sum y_i$$
$$A = n\sum x_i^2 - \left(\sum x_i\right)^2$$
$$B = n\sum y_i^2 - \left(\sum y_i\right)^2$$

x y xy $x^2$ $y^2$
5 8 40 25 64
7 9 63 49 81
3 11 33 9 121
16 27 432 256 729
12 15 180 144 225
9 13 117 81 169
Totals 52 83 865 564 1389

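$$C = n\sum x_i y_i - \sum x_i \sum y_i = 6 \times 865 - 52 \times 83 = 874$$
$$A = n\sum x_i^2 - \left(\sum x_i\right)^2 = 6 \times 564 - 52^2 = 680$$
$$B = n\sum y_i^2 - \left(\sum y_i\right)^2 = 6 \times 1389 - 83^2 = 1445$$

$$r = \frac{C}{\sqrt{AB}} = \frac{874}{\sqrt{680 \times 1445}} = 0.88$$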
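
The sums behind C, A and B are easy to get wrong by hand, so a numerical check can help. This is a minimal Python sketch of the same formula applied to the exercise 3 data; the variable names (`sum_xy`, `sum_x2`, `sum_y2`) are illustrative choices.

```python
import math

# Exercise 3 data.
x = [5, 7, 3, 16, 12, 9]
y = [8, 9, 11, 27, 15, 13]
n = len(x)

# Column totals needed by the formula.
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

C = n * sum_xy - sum_x * sum_y
A = n * sum_x2 - sum_x ** 2
B = n * sum_y2 - sum_y ** 2

r = C / math.sqrt(A * B)
print(C, A, B, round(r, 2))
```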
b) Test the significance of the sample correlation coefficient using a 95%
confidence level.
Step 1: Hypotheses
$H_0: \rho = 0$; $H_1: \rho \neq 0$
Step 2: Use the formula to calculate the test value for t
Test value:
$$t = \frac{r}{\sqrt{\dfrac{1-r^2}{n-2}}}$$
Test value:
$$t = \frac{0.88}{\sqrt{\dfrac{1-0.88^2}{6-2}}} = 3.7$$

Step 3: t-table critical value

This is a two-tailed test because r can be either less than 0 or greater than 0.
So, α/2 = 5%/2 = 2.5% = 0.025
Degrees of freedom: ν = n - 2 = 6 - 2 = 4
With 4 degrees of freedom and 0.025 in each tail, the critical value is 2.776.

Step 4: Make the decision

The test value (3.7) is greater than the critical value from the table (2.776), so
reject the null hypothesis ($\rho = 0$).
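
The same test procedure applies to both exercise 2 and exercise 3, so it can be sketched as one small Python function. This is only an illustration: the function name `correlation_t` is hypothetical, the r values are the rounded ones from the sheet, and the critical values are copied from the t-table rather than computed.

```python
import math


def correlation_t(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))


# (label, rounded r, sample size, two-tailed 5% critical value from the t-table)
cases = [
    ("Exercise 2", -0.92, 7, 2.571),
    ("Exercise 3", 0.88, 6, 2.776),
]

for label, r, n, t_crit in cases:
    t = correlation_t(r, n)
    # Two-tailed test: compare the absolute value with the critical value.
    decision = "reject H0" if abs(t) >= t_crit else "do not reject H0"
    print(f"{label}: t = {t:.2f}, critical value = {t_crit}, {decision}")
```

Because the function starts from r rounded to two decimal places, the printed t values may differ from the hand-worked ones in the last digit; the decisions are unaffected.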
