Introduction to Hotelling's T-Square

GROUP 3 MEMBERS
S/N NAME REG.NO
1 GHALI A NUHU UG14/STAT/1080
2 FAIZ AHMAD A UG14/STAT/1004
3 ALIYU YAHAYA UG14/STAT/1094
4 KAMAL ALHAJI YARO UG12/STAT/1032
5 BASHIR SUNUSI UG15/STAT/2006
6 SAGIR ABDURRASHID UG15/STAT/2013
7 JA’AFAR ABUBAKAR UG14/STAT/1024
8 IBRAHIM USMAN UG14/STAT/1032
9 KAMALUDDEEN BELLO UG15/STAT/2010
10 IBRAHIM ABDULLAHI UG14/STAT/1041
11 USMAN ADAM YERO UG14/STAT/1067
12 SHAMSUDDEEN IBRAHIM MUSTAPHA UG15/STAT/2018
13 LUCKY ANNICHE UG11/STAT/1275
14 NAJEEB SABO WADA UG15/STAT/2012
15 ABBAS RABIU UG15/STAT/2008
16 BASHIR MAGAJI UG14/STAT/1021
17 ANAS SHUAIBU ADAMU UG13/STAT/1074
18 UMAR BELLO MUSTAPHA UG13/STAT/1002
19 LUKMAN AHMED UG14/STAT/1109
20 SULAIMAN LAWAN UG14/STAT/1105
21 JUNAIDU SALE UG15/STAT/2015
22 CHIKA VIVEIN EBERE UG13/STAT/1004
INTRODUCTION
MULTIVARIATE STATISTICS is a branch of statistics concerned with the simultaneous
observation and analysis of more than one outcome variable. Its application to data is
called multivariate analysis.
HYPOTHESIS TESTING
Hypothesis testing is a statistical method for making decisions from experimental
data. It starts from an assumption about a population parameter (the hypothesis) and
checks that assumption against a sample. It is sometimes called confirmatory data
analysis. A statistical hypothesis is one that is testable on the basis of observing a
process that is modeled via a set of random variables.
R-PACKAGE
R is an extensible, powerful and widely used scripting language and environment for
graphics and statistical data analysis. R is an implementation of the S language, which was
developed at Bell Laboratories in the 1970s by John Chambers and colleagues; R itself was
created by Ross Ihaka and Robert Gentleman at the University of Auckland in the early 1990s.
The source code of R is written mainly in C, Fortran and R. With its growing popularity in
recent years, R has become one of the leading languages in statistics.
METHODOLOGY
The given data were analyzed in R using the appropriate test statistic, Hotelling's T²
statistic, together with its distribution.
HOTELLING'S T² DISTRIBUTION
Hotelling's T² distribution is a multivariate distribution proportional to the
F-distribution. It arises as the distribution of a set of statistics that are natural
generalizations of the statistics underlying Student's t-distribution. In particular,
Hotelling's T-square statistic (T²) is a generalization of Student's t-statistic that is
used in multivariate hypothesis testing.
THE DISTRIBUTION OF T²
The distribution arises in multivariate statistics when testing differences between the
multivariate means of different populations, where tests for univariate problems would make
use of a t-test. The distribution is named for Harold Hotelling, who developed it as a
generalization of Student's t-distribution.
DEFINITION
If the vector $d$ of dimension $p \times 1$ follows a standard multivariate normal
distribution, $d \sim N_p(0, I)$, and $M$ is a $p \times p$ symmetric random matrix that
follows a Wishart distribution with identity scale matrix and $m$ degrees of freedom,
$M \sim W_p(I_{p \times p}, m)$, independent of $d$, then the quadratic form

$$X = m\, d^{\top} M^{-1} d$$

has a Hotelling T² distribution with dimensionality parameter $p$ and $m$ degrees of
freedom, written $X \sim T^2_{p,m}$. If a random variable $X$ has Hotelling's T²
distribution, $X \sim T^2_{p,m}$, then

$$\frac{m-p+1}{pm}\, X \sim F_{p,\,m-p+1},$$

where $F_{p,\,m-p+1}$ is the F-distribution with parameters $p$ and $m-p+1$.
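As a rough numerical check of this relation, the following R sketch simulates the quadratic form and rescales it to the F scale. The seed, the choices p = 3 and m = 20, the number of draws and all variable names are illustrative assumptions and are not part of the original analysis.

```r
# Simulation sketch: if d ~ N_p(0, I) and M ~ W_p(I, m) are independent, then
# X = m * d' M^{-1} d ~ T^2(p, m) and (m - p + 1) / (p * m) * X ~ F(p, m - p + 1).
set.seed(1)          # arbitrary seed, only for reproducibility
p <- 3               # dimension (assumed for illustration)
m <- 20              # Wishart degrees of freedom (assumed for illustration)
n_sim <- 5000        # number of simulated draws (assumed for illustration)

# rWishart() returns a p x p x n_sim array of Wishart matrices
M_draws <- rWishart(n_sim, df = m, Sigma = diag(p))

f_scaled <- numeric(n_sim)
for (i in seq_len(n_sim)) {
  d <- rnorm(p)                                          # d ~ N_p(0, I)
  x <- m * drop(crossprod(d, solve(M_draws[, , i], d)))  # one T^2(p, m) draw
  f_scaled[i] <- (m - p + 1) / (p * m) * x               # rescale to the F scale
}

# Compare the simulated mean with the theoretical mean of F(p, m - p + 1)
df2 <- m - p + 1
c(simulated = mean(f_scaled), theoretical = df2 / (df2 - 2))
```

The two printed numbers should be close if the scaling factor is right; a quantile-quantile plot against `qf()` would be a stricter check.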
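Let $N_p(\mu, \Sigma)$ be a p-variate normal distribution with mean vector $\mu$ and known
covariance matrix $\Sigma$, and let $x_1, x_2, \ldots, x_n \sim N_p(\mu, \Sigma)$ be
independent and identically distributed random vectors, represented as $p \times 1$ column
vectors. The sample mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ then has mean $\mu$ and
covariance matrix $\Sigma / n$, and it can be shown that

$$n\,(\bar{x} - \mu)^{\top} \Sigma^{-1} (\bar{x} - \mu) \sim \chi^2_p,$$

where $\chi^2_p$ is the chi-square distribution with $p$ degrees of freedom.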
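The known-covariance case can be sketched in R as follows. The function name `chisq_mean_stat` and the toy data are hypothetical, chosen only to illustrate the chi-square statistic; they do not come from the report.

```r
# Chi-square statistic for a mean vector when the covariance matrix is known.
# X: n x p data matrix; mu: hypothesised mean vector; Sigma: known p x p covariance.
chisq_mean_stat <- function(X, mu, Sigma) {
  X <- as.matrix(X)
  n <- nrow(X)
  p <- ncol(X)
  diff <- colMeans(X) - mu
  stat <- n * drop(crossprod(diff, solve(Sigma, diff)))  # n (xbar-mu)' Sigma^{-1} (xbar-mu)
  list(statistic = stat, df = p,
       p.value = pchisq(stat, df = p, lower.tail = FALSE))
}

# Hypothetical toy data: 4 observations on 2 variables, column means (4, 5)
X_toy <- rbind(c(1, 2), c(3, 4), c(5, 6), c(7, 8))
res <- chisq_mean_stat(X_toy, mu = c(3, 5), Sigma = diag(2))
res$statistic   # 4 * (1^2 + 0^2) = 4
res$p.value     # upper tail of chi-square with 2 degrees of freedom
```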
DEFINITION
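The covariance matrix $\Sigma$ used above is often unknown; in that case we use the sample
covariance matrix instead,

$$S_n = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{\top}.$$

It can be shown that $(n-1)S_n$ follows a Wishart distribution with scale matrix $\Sigma$
and $n-1$ degrees of freedom, $(n-1)S_n \sim W_p(\Sigma, n-1)$, and that it is independent
of $\bar{x}$.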
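A minimal R sketch of the sample covariance formula, checked against the built-in `cov()`, is given below. The helper name `sample_cov` and the toy data are hypothetical.

```r
# Sample covariance from its definition, compared with the built-in cov().
sample_cov <- function(X) {
  X <- as.matrix(X)
  n <- nrow(X)
  centered <- sweep(X, 2, colMeans(X))  # subtract each column mean
  crossprod(centered) / (n - 1)         # sum of outer products divided by (n - 1)
}

# Hypothetical toy data: 4 observations on 2 variables
X_toy <- rbind(c(1, 2), c(3, 1), c(5, 6), c(7, 3))
S_n <- sample_cov(X_toy)
all.equal(S_n, cov(X_toy), check.attributes = FALSE)
```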
Hotelling's T-square statistic is defined as

$$T^2 = n\,(\bar{x} - \mu)^{\top} S_n^{-1} (\bar{x} - \mu).$$

Its distribution is

$$T^2 \sim T^2_{p,\,n-1} = \frac{p(n-1)}{n-p}\, F_{p,\,n-p},$$

where $F_{p,\,n-p}$ is the F-distribution with parameters $p$ and $n-p$. To calculate a
p-value, note that this is equivalent to

$$\frac{n-p}{p(n-1)}\, T^2 \sim F_{p,\,n-p}.$$
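The conversion from T² to the F scale can be sketched as a small R helper. The function name `t2_to_f` is hypothetical; the usage line plugs in the values T² = 12.02739, n = 20 and p = 3 that this report obtains in its RESULT section.

```r
# Convert a one-sample Hotelling T^2 value to its F statistic and p-value.
# t2: T^2 statistic; n: sample size; p: number of variables (requires n > p).
t2_to_f <- function(t2, n, p) {
  stopifnot(n > p)
  f_stat <- (n - p) / (p * (n - 1)) * t2
  list(f.stat = f_stat, df1 = p, df2 = n - p,
       p.value = pf(f_stat, p, n - p, lower.tail = FALSE))
}

# Usage with the values from the RESULT section
out <- t2_to_f(12.02739, n = 20, p = 3)
out$f.stat    # about 3.587, the test.stat reported in RESULT
out$p.value   # about 0.036, the p.value reported in RESULT
```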
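We then use the quantity on the left-hand side to evaluate the p-value of the sample,
since under the null hypothesis it follows the F-distribution. Equivalently, at
significance level $\alpha$ we compare

$$T^2 \quad \text{and} \quad \frac{p(n-1)}{n-p}\, F_{p,\,n-p}(\alpha),$$

where $F_{p,\,n-p}(\alpha)$ is the upper $100\alpha\%$ point of the F-distribution. We fail
to reject the null hypothesis if the calculated p-value is greater than $\alpha$, and we
reject it in favor of the alternative otherwise.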

6
PROCEDURES
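# Load ICSNP, which provides HotellingsT2()
library(ICSNP)
# Read the data: 20 observations on sweat, sodium and potassium
Group3=read.csv("c:\\Users\\FAIZ AHMAD A\\Desktop\\GROUP 3.csv")
aa=colMeans(Group3)   # sample mean vector
bb=cov(Group3)        # sample covariance matrix
N=20                  # number of observations
P=3                   # number of variables
# Built-in one-sample test; called without mu it tests Ho:mu=[0,0,0]
cc=HotellingsT2(Group3)
#Hypothesis test:Ho:mu=[4,50,10],Ha:mu<>[4,50,10]
mu.Ho<-c(4,50,10)
# Hotelling T-square statistic
T.sq<-N*t(aa-mu.Ho)%*%solve(bb)%*%(aa-mu.Ho)
# F-scaled statistic, 5% critical value and p-value from F(P,N-P)
test.stat=(N-P)/(P*(N-1))*T.sq
Crit.val=qf(0.95,P,N-P)
p.value=1-pf((N-P)/(P*(N-1))*T.sq,P,N-P)
round(data.frame(T.sq,test.stat,Crit.val,p.value),3)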
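The script above depends on a CSV file at a local path. As a self-contained sketch, the same computation can be run in base R with the 20 observations typed in from the Group3 listing printed in the RESULT section. The short column names, the lower-case variable names and the explicit `alpha` are assumptions made for this illustration.

```r
# Self-contained sketch: data typed in from the Group3 listing in RESULT.
Group3 <- data.frame(
  sweat     = c(3.7, 5.7, 3.8, 3.2, 3.1, 4.6, 3.4, 7.2, 6.7, 5.4,
                3.9, 4.5, 3.5, 4.5, 1.4, 8.5, 4.5, 6.5, 4.0, 5.5),
  sodium    = c(48.5, 65.1, 47.2, 53.2, 55.5, 36.1, 24.8, 33.1, 43.4, 54.1,
                36.4, 58.8, 27.8, 40.2, 13.5, 56.4, 71.6, 52.8, 44.1, 40.9),
  potassium = c(9.3, 8.0, 10.9, 12.0, 9.7, 7.9, 14.0, 7.6, 8.5, 11.3,
                12.7, 27.8, 9.8, 8.9, 10.1, 7.1, 8.2, 10.0, 11.1, 9.1)
)

n <- nrow(Group3)            # sample size, taken from the data instead of typed in
p <- ncol(Group3)            # number of variables
mu0 <- c(4, 50, 10)          # hypothesised mean vector
alpha <- 0.05                # significance level (assumed)

xbar <- colMeans(Group3)     # sample mean vector
S <- cov(Group3)             # sample covariance matrix
diff <- xbar - mu0

T2 <- n * drop(crossprod(diff, solve(S, diff)))       # Hotelling T^2
F_stat <- (n - p) / (p * (n - 1)) * T2                # F-scaled statistic
p_value <- pf(F_stat, p, n - p, lower.tail = FALSE)   # upper-tail p-value
reject <- p_value < alpha                             # decision at level alpha

round(c(T2 = T2, F = F_stat, p.value = p_value), 3)
```

Using `solve(S, diff)` avoids forming the inverse matrix explicitly, and `drop()` turns the 1 x 1 result into a plain number.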
RESULT
library(ICSNP)
Loading required package: mvtnorm
Loading required package: ICS
> Group3=read.csv("c:\\Users\\FAIZ AHMAD A\\Desktop\\GROUP 3.csv")
> aa=colMeans(Group3)
> bb=cov(Group3)
> N=20
> P=3
> cc=HotellingsT2(Group3)
> #Hypothesis test:Ho:mu=[4,50,10],Ha:mu<>[4,50,10]
> mu.Ho<-c(4,50,10)
> T.sq<-N*t(aa-mu.Ho)%*%solve(bb)%*%(aa-mu.Ho)
> test.stat=(N-P)/(P*(N-1))*T.sq
> Crit.val=qf(0.95,P,N-P)
> p.value=1-pf((N-P)/(P*(N-1))*T.sq,P,N-P)
> round(data.frame(T.sq,test.stat,Crit.val,p.value),3)
T.sq test.stat Crit.val p.value
1 12.027 3.587 3.197 0.036
> Group3
X1.sweat. X2.Sodium. X3.potassium.
1 3.7 48.5 9.3
2 5.7 65.1 8.0
3 3.8 47.2 10.9
4 3.2 53.2 12.0
5 3.1 55.5 9.7
6 4.6 36.1 7.9
7 3.4 24.8 14.0
8 7.2 33.1 7.6
9 6.7 43.4 8.5
10 5.4 54.1 11.3
11 3.9 36.4 12.7
12 4.5 58.8 27.8

13 3.5 27.8 9.8
14 4.5 40.2 8.9
15 1.4 13.5 10.1
16 8.5 56.4 7.1
17 4.5 71.6 8.2
18 6.5 52.8 10.0
19 4.0 44.1 11.1
20 5.5 40.9 9.1
> aa
    X1.sweat.    X2.Sodium. X3.potassium.
        4.680        45.175        10.700
> bb
              X1.sweat. X2.Sodium. X3.potassium.
X1.sweat.      2.734316   8.695789     -1.851053
X2.Sodium.     8.695789 200.195658      5.292632
X3.potassium. -1.851053   5.292632     19.408421
> N
[1] 20
> P
[1] 3
> cc
Hotelling's one sample T2-test
data: Group3
T.2 = 128.01, df1 = 3, df2 = 17, p-value = 7.23e-12
alternative hypothesis: true location is not equal to c(0,0,0)
> #Hypothesis test
> mu.Ho
[1] 4 50 10
> T.sq
[,1]
[1,] 12.02739
> test.stat
[,1]
[1,] 3.587117
> crit.val
Error: object 'crit.val' not found
> Crit.val
[1] 3.196777
> p.value
[,1]
[1,] 0.03564193
> round(data.frame(T.sq,test.stat,Crit.val,p.value),3)
T.sq test.stat Crit.val p.value
1 12.027 3.587 3.197 0.036
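The `cc` output in the transcript tests the default null hypothesis that the mean vector equals c(0,0,0), not [4,50,10], and its T.2 value appears to be reported on the F scale. The sketch below, which uses only the mean vector and covariance matrix printed for `aa` and `bb`, is one way to check both points; the helper name `f_from` is hypothetical. `HotellingsT2()` also accepts a `mu` argument, so passing `mu = mu.Ho` should test the stated hypothesis directly.

```r
# Reproduce the F-scaled statistic from the printed mean vector (aa)
# and covariance matrix (bb), for two different null mean vectors.
n <- 20
p <- 3
xbar <- c(4.680, 45.175, 10.700)
S <- matrix(c( 2.734316,   8.695789, -1.851053,
               8.695789, 200.195658,  5.292632,
              -1.851053,   5.292632, 19.408421), nrow = 3, byrow = TRUE)

# F-scaled one-sample Hotelling statistic for a given null mean vector mu0
f_from <- function(mu0) {
  diff <- xbar - mu0
  (n - p) / (p * (n - 1)) * n * drop(crossprod(diff, solve(S, diff)))
}

f_from(c(0, 0, 0))     # close to T.2 = 128.01 printed for cc
f_from(c(4, 50, 10))   # close to test.stat = 3.587 computed by hand
```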
DECISION AND CONCLUSION
Since the p-value = 0.036 is less than the significance level α = 0.05 (equivalently,
test.stat = 3.587 exceeds Crit.val = 3.197), we reject the null hypothesis in favor of the
alternative and conclude that the mean vector µ is not equal to [4,50,10].
REFERENCE
Google and Wikipedia.
