
ST305 - Multivariate Methods I : Problem set 1

Department of Statistics and Computer Science - Academic Year 2021/2022

Submit answers to all questions by 4.00 pm on 6th June 2023.

Problem 1
 
Let $X = [X_1, X_2, X_3]'$ have covariance matrix

\[
\Sigma = \begin{bmatrix} 36 & -2 & 9 \\ -2 & 9 & 1 \\ 9 & 1 & 4 \end{bmatrix}.
\]

(i) Calculate the population correlation matrix ρ.


(ii) Which pair of variables has the highest correlation?
(iii) Find the correlation between $X_1$ and $\frac{1}{2}(X_2 + X_3)$.
(iv) Calculate the total sample variance.
(v) Calculate the generalized sample variance.
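
The quantities in Problem 1 can be checked numerically. The following is a minimal R sketch, assuming the covariance matrix reads row by row as (36, −2, 9), (−2, 9, 1), (9, 1, 4) and that part (iii) refers to the average ½(X2 + X3). The object names (`Sigma`, `rho`, `a`, `b`) are illustrative only, and the total and generalized variances are taken here as the trace and determinant of the given matrix.

```r
# Covariance matrix as read from Problem 1 (assumed layout).
vars  <- c("X1", "X2", "X3")
Sigma <- matrix(c(36, -2, 9,
                  -2,  9, 1,
                   9,  1, 4),
                nrow = 3, byrow = TRUE, dimnames = list(vars, vars))

# (i) Correlation matrix: rho_ij = sigma_ij / sqrt(sigma_ii * sigma_jj).
rho <- cov2cor(Sigma)

# (iii) Correlation between a'X and b'X, with a'X = X1 and b'X = (X2 + X3) / 2.
a <- c(1, 0, 0)
b <- c(0, 0.5, 0.5)
cov_ab   <- drop(t(a) %*% Sigma %*% b)
var_a    <- drop(t(a) %*% Sigma %*% a)
var_b    <- drop(t(b) %*% Sigma %*% b)
corr_iii <- cov_ab / sqrt(var_a * var_b)

# (iv) Total variance: trace of Sigma. (v) Generalized variance: determinant.
total_var <- sum(diag(Sigma))
gen_var   <- det(Sigma)
```

Printing `rho` and comparing the off-diagonal entries answers part (ii) directly.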

Problem 2
Let X be bivariate normal, with σ11 = σ22 . Show that X1 + X2 and X1 − X2 are independent.
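
One possible line of argument, sketched as a hint and not as a full proof, starts from the covariance of the two linear combinations. It uses only the symmetry σ12 = σ21 and the stated condition σ11 = σ22:

```latex
\[
\operatorname{Cov}(X_1 + X_2,\; X_1 - X_2)
  = \sigma_{11} - \sigma_{12} + \sigma_{21} - \sigma_{22}
  = \sigma_{11} - \sigma_{22}
  = 0 .
\]
```

Because $(X_1 + X_2,\, X_1 - X_2)'$ is a linear transformation of the bivariate normal vector $X$, it is itself bivariate normal, and for jointly normal variables zero covariance implies independence.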

Problem 3
Use R to answer the following questions.

Air pollution data for 41 US cities are available in the dataset USairpollution, loaded with data("USairpollution"). You need to install
the R package HSAUR2 to use these data.

(i) Identify the outliers in the dataset, if any.


(ii) Obtain a scatterplot matrix for the data set. Based on the plot you obtained, which pair of variables
shows the highest correlation and which pair shows the lowest correlation?
(iii) Calculate the covariance matrix for the data set.
(iv) Calculate the correlation matrix for the data set.
(v) Obtain the total sample variance and the generalized sample variance.
(vi) Use the function eigen() to obtain the eigenvalues of the covariance matrix you obtained in
part (iii).
(vii) Confirm that the sum of the eigenvalues you found in part (vi) is equal to the total sample variance.
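
The steps of Problem 3 could be organised along the following lines. This is a sketch under stated assumptions: the helper `summarise_cov()` is hypothetical (not part of HSAUR2), and the Mahalanobis screen with a chi-square cutoff is only one common heuristic for flagging outliers, not necessarily the method intended by the problem. The data-dependent part runs only if HSAUR2 is installed.

```r
# Hypothetical helper: covariance, correlation, total and generalized
# sample variance, and eigenvalues of the covariance matrix.
summarise_cov <- function(df) {
  S <- cov(df)
  list(
    S = S,
    R = cor(df),
    total_variance = sum(diag(S)),          # trace of S
    generalized_variance = det(S),          # determinant of S
    eigenvalues = eigen(S, symmetric = TRUE)$values
  )
}

if (requireNamespace("HSAUR2", quietly = TRUE)) {
  data("USairpollution", package = "HSAUR2")

  # (ii) Scatterplot matrix of all variables.
  pairs(USairpollution)

  # (iii)-(vi) Covariance, correlation, variances, eigenvalues.
  out <- summarise_cov(USairpollution)

  # (vii) The eigenvalues of S should sum to the trace of S.
  print(all.equal(sum(out$eigenvalues), out$total_variance))

  # (i) One possible outlier screen (an assumption, not the only option):
  # squared Mahalanobis distances compared with a chi-square quantile.
  d2 <- mahalanobis(USairpollution, colMeans(USairpollution), out$S)
  cutoff <- qchisq(0.975, df = ncol(USairpollution))
  print(names(d2)[d2 > cutoff])
}
```

Univariate boxplots per variable are a reasonable alternative for part (i) when a simpler screen is preferred.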

Problem 4
You are given the random vector $X = [X_1, X_2, X_3, X_4, X_5]'$ with mean vector $\mu = [2, 4, -1, 3, 0]'$ and
variance–covariance matrix

\[
\Sigma_X = \begin{bmatrix}
4 & -1 & 0.5 & -0.5 & 0 \\
-1 & 3 & 1 & -1 & 0 \\
0.5 & 1 & 6 & 1 & -1 \\
-0.5 & -1 & 1 & 4 & 0 \\
0 & 0 & -1 & 0 & 2
\end{bmatrix}.
\]

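Partition X as

\[
X = \left[\begin{array}{c} X_1 \\ X_2 \\ \hline X_3 \\ X_4 \\ X_5 \end{array}\right]
  = \begin{bmatrix} U \\ V \end{bmatrix}.
\]

Let

\[
A = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}
\quad \text{and} \quad
B = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & -2 \end{bmatrix}.
\]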
Calculate the following:

(i) E(U )
(ii) E(AU )
(iii) Cov(U )
(iv) Cov(AU )
(v) Cov(BV )
(vi) Cov(AU, BV )
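
The calculations in Problem 4 follow from the rules E(AU) = A E(U), Cov(AU) = A Cov(U) A', and Cov(AU, BV) = A Cov(U, V) B'. The R sketch below applies them under explicit assumptions: U = (X1, X2)' and V = (X3, X4, X5)', A is the 2×2 matrix with rows (1, −1) and (−1, 1), and B is the 2×3 matrix with rows (1, 1, 1) and (1, 1, −2). If the intended partition or matrices differ, only the definitions at the top need to change.

```r
mu <- c(2, 4, -1, 3, 0)
Sigma <- matrix(c( 4,   -1,  0.5, -0.5,  0,
                  -1,    3,  1,   -1,    0,
                   0.5,  1,  6,    1,   -1,
                  -0.5, -1,  1,    4,    0,
                   0,    0, -1,    0,    2),
                nrow = 5, byrow = TRUE)

# Assumed partition: U = (X1, X2)', V = (X3, X4, X5)'.
u <- 1:2
v <- 3:5

# Assumed reading of A and B.
A <- matrix(c( 1, -1,
              -1,  1), nrow = 2, byrow = TRUE)
B <- matrix(c(1, 1,  1,
              1, 1, -2), nrow = 2, byrow = TRUE)

E_U       <- mu[u]                          # (i)   E(U)
E_AU      <- A %*% mu[u]                    # (ii)  E(AU) = A E(U)
Cov_U     <- Sigma[u, u]                    # (iii) Cov(U)
Cov_AU    <- A %*% Sigma[u, u] %*% t(A)     # (iv)  A Sigma_UU A'
Cov_BV    <- B %*% Sigma[v, v] %*% t(B)     # (v)   B Sigma_VV B'
Cov_AU_BV <- A %*% Sigma[u, v] %*% t(B)     # (vi)  A Sigma_UV B'
```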

Problem 5
Suppose y and x are subvectors, such that y is 2×1 and x is 3×1, with µ and Σ partitioned accordingly:

[Figure mat2.png: the partitioned µ and Σ]

 
Assume that $\begin{pmatrix} y \\ x \end{pmatrix}$ is distributed as $N_5(\mu, \Sigma)$.

1. Find E(y|x).
2. Find Cov(y|x).
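
Both parts rest on the standard conditional-distribution result for a partitioned multivariate normal vector. The block names $\mu_y$, $\mu_x$, $\Sigma_{yy}$, $\Sigma_{yx}$, $\Sigma_{xy}$, $\Sigma_{xx}$ below are notation assumed here for the partition; the numerical values come from the µ and Σ given with the problem.

```latex
\[
\mu = \begin{pmatrix} \mu_y \\ \mu_x \end{pmatrix},
\qquad
\Sigma = \begin{pmatrix} \Sigma_{yy} & \Sigma_{yx} \\ \Sigma_{xy} & \Sigma_{xx} \end{pmatrix},
\]
\[
E(y \mid x) = \mu_y + \Sigma_{yx} \Sigma_{xx}^{-1} (x - \mu_x),
\qquad
\operatorname{Cov}(y \mid x) = \Sigma_{yy} - \Sigma_{yx} \Sigma_{xx}^{-1} \Sigma_{xy} .
\]
```

Here $\Sigma_{yy}$ is 2×2, $\Sigma_{yx}$ is 2×3 and $\Sigma_{xx}$ is 3×3, matching the stated dimensions of y and x.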

Problem 6
Use R to answer the following questions.

Consider the body measurements of 20 individuals given in the following table (Table 1).

(i) Calculate the mean vector for the body measurements.


(ii) Considering the Euclidean distance, find the individual who falls farthest from the mean vector of
body measurements.
(iii) Considering the Mahalanobis distance, find the individual who falls farthest from the mean vector
of body measurements. (You may use the mahalanobis() function to find the answer.)

Subject ID Chest Waist Hips
1 34 30 32
2 37 32 37
3 38 30 36
4 36 33 39
5 38 29 33
6 43 32 38
7 40 33 42
8 38 30 40
9 40 30 37
10 41 32 39
11 36 24 35
12 36 25 37
13 34 24 37
14 33 22 34
15 36 26 38
16 37 26 37
17 34 25 38
18 36 26 37
19 38 28 40
20 35 23 35

Table 1: Body Measurement Data
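
The data in Table 1 can be entered and analysed in R roughly as follows. This is a sketch, with the object names (`meas`, `xbar`, `d_euclid`, `d_mahal_sq`) chosen for illustration and the sample covariance matrix assumed as the scale for the Mahalanobis distance. Note that `mahalanobis()` returns squared distances, which does not affect which individual is farthest.

```r
# Table 1, one row per subject (Subject ID = row number).
meas <- data.frame(
  Chest = c(34, 37, 38, 36, 38, 43, 40, 38, 40, 41,
            36, 36, 34, 33, 36, 37, 34, 36, 38, 35),
  Waist = c(30, 32, 30, 33, 29, 32, 33, 30, 30, 32,
            24, 25, 24, 22, 26, 26, 25, 26, 28, 23),
  Hips  = c(32, 37, 36, 39, 33, 38, 42, 40, 37, 39,
            35, 37, 37, 34, 38, 37, 38, 37, 40, 35)
)

# (i) Mean vector.
xbar <- colMeans(meas)

# (ii) Euclidean distance of each subject from the mean vector.
centered <- sweep(as.matrix(meas), 2, xbar)
d_euclid <- sqrt(rowSums(centered^2))
farthest_euclid <- which.max(d_euclid)

# (iii) Squared Mahalanobis distance, using the sample covariance matrix.
S <- cov(meas)
d_mahal_sq <- mahalanobis(meas, center = xbar, cov = S)
farthest_mahal <- which.max(d_mahal_sq)
```

Comparing `farthest_euclid` with `farthest_mahal` shows whether accounting for the correlation between measurements changes which individual is most extreme.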
