1. Assume that X and Y are random variables with a bivariate normal
distribution. (A bivariate normal distribution is the joint distribution of
two random variables, which need not be independent. Each variable is
normally distributed on its own, and any linear combination of the two, such
as their sum, is also normally distributed.)
x y
5 8
7 9
3 11
16 27
12 15
9 13
a) Calculate the correlation coefficient between X and Y, using the formula
$r = \frac{C}{\sqrt{AB}}$, where
$C = n \sum x_i y_i - \sum x_i \sum y_i$
$A = n \sum x_i^2 - \left(\sum x_i\right)^2$
$B = n \sum y_i^2 - \left(\sum y_i\right)^2$
x y xy x^2 y^2
5 8 40 25 64
7 9 63 49 81
3 11 33 9 121
16 27 432 256 729
12 15 180 144 225
9 13 117 81 169
$r = \frac{C}{\sqrt{AB}} = \frac{874}{\sqrt{680 \times 1445}} = 0.88$
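The arithmetic behind C, A, B and r can be checked with a short script. This is only a sketch: the function name `correlation_terms` and the variable names are illustrative choices, not part of the original solution, and the data are the six pairs from the table in question 1.

```python
import math

# Data from the table in question 1.
x = [5, 7, 3, 16, 12, 9]
y = [8, 9, 11, 27, 15, 13]

def correlation_terms(x, y):
    """Return (C, A, B) as defined in the formula for r (hypothetical helper)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    sum_y2 = sum(yi ** 2 for yi in y)
    C = n * sum_xy - sum_x * sum_y
    A = n * sum_x2 - sum_x ** 2
    B = n * sum_y2 - sum_y ** 2
    return C, A, B

C, A, B = correlation_terms(x, y)
r = C / math.sqrt(A * B)
print(C, A, B)      # 874 680 1445
print(round(r, 2))  # 0.88
```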
b) Test the significance of the sample correlation coefficient using a 95%
confidence level.
Step 1: Hypothesis
$H_0: \rho = 0$; $H_1: \rho \neq 0$
Step 2: Use the formula to calculate the test value for t
Test value:
$t = \frac{r}{\sqrt{\frac{1 - r^2}{n - 2}}}$
Test value:
$t = \frac{0.88}{\sqrt{\frac{1 - 0.88^2}{6 - 2}}} = 3.7$
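The test value can be reproduced in a few lines, and compared against a critical value to finish the test. This is a sketch under stated assumptions: the critical value 2.776 (two-sided, 5% significance, n − 2 = 4 degrees of freedom) comes from a standard t table and is not given in the text, and the function name `t_statistic` is illustrative.

```python
import math

def t_statistic(r, n):
    """Test statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))

r, n = 0.88, 6  # rounded r and sample size from part a)
t = t_statistic(r, n)

# Assumption: two-sided critical value for alpha = 0.05 and 4 degrees of
# freedom, read from a standard t table (not stated in the text).
t_critical = 2.776

print(round(t, 1))          # 3.7
print(abs(t) > t_critical)  # True
```

Under that assumed critical value, the test value exceeds it, so H0: ρ = 0 would be rejected at the 5% level.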
$y = \beta_0 + \beta_1 x$
x y
5 8
7 9
3 11
16 27
12 15
9 13
Calculate the least squares estimates of the intercept and the slope of the
regression model.
Answer:
x y xy x^2 y^2
5 8 40 25 64
7 9 63 49 81
3 11 33 9 121
16 27 432 256 729
12 15 180 144 225
9 13 117 81 169
Slope: The estimator for β is $b = \frac{C}{A}$
Intercept:
Answer:
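As a rough check, the estimates can be computed directly from the six data pairs. The sketch below uses b = C/A for the slope and, as an assumption not spelled out in the text, the standard least squares result a = ȳ − b·x̄ for the intercept. The function name `least_squares` is illustrative.

```python
# Data from the table in this question.
x = [5, 7, 3, 16, 12, 9]
y = [8, 9, 11, 27, 15, 13]

def least_squares(x, y):
    """Return (intercept, slope) of the simple linear regression of y on x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    C = n * sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y
    A = n * sum(xi ** 2 for xi in x) - sum_x ** 2
    slope = C / A
    # Assumed standard formula: intercept = mean(y) - slope * mean(x).
    intercept = sum_y / n - slope * sum_x / n
    return intercept, slope

intercept, slope = least_squares(x, y)
print(round(slope, 3))      # 1.285
print(round(intercept, 3))  # 2.694
```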
3. A consultancy firm has offices in over 100 different cities. The managers have the
impression that some offices work more efficiently than others in giving advice to
clients. They ask you to conduct a statistical analysis to provide a benchmark on
which different offices can be evaluated.
The management provides you with data on 12 offices. The data are given in the table
below and consist of the total cost Y and the number of clients X of an office.
The management assumes that the relationship between total cost and the number of
clients is linear.
The error term captures the unsystematic influences. The error term is assumed
to be normally distributed with mean 0 and a constant variance $\sigma^2$. In other words,
$\varepsilon \sim N(0, \sigma^2)$.
(d) Calculate the estimates for the unknown model parameters and show
the estimated linear regression function.
The parameters to be estimated are, as always, α and β.
The estimator for β is $b = \frac{C}{A}$
where
$C = n \sum xy - \sum x \sum y = (12 \times 106{,}941) - (512 \times 2{,}275) = 118{,}492$
Therefore, ;
The intercept is the fixed cost while the slope is the cost per client.
(10%)
An R-squared of 0.99 means that 99% of the variability in the total cost of the offices is
explained by the number of clients.
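To make the "share of variability explained" reading concrete, R-squared can be computed as 1 − SSE/SST. The office data are not reproduced here, so this sketch uses the six (x, y) pairs from question 1 purely as an illustration. The function name `r_squared` is hypothetical.

```python
# Illustration only: the six pairs from question 1, not the office data.
x = [5, 7, 3, 16, 12, 9]
y = [8, 9, 11, 27, 15, 13]

def r_squared(x, y):
    """R-squared of the simple linear regression of y on x, as 1 - SSE/SST."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # SSE: variation left unexplained by the fitted line.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # SST: total variation of y around its mean.
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - sse / sst

print(round(r_squared(x, y), 2))  # 0.78
```

For these six points, about 78% of the variation in y is explained by x, which matches the square of r = 0.88 up to rounding.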