Abstract
1 Introduction
Let $X_1, X_2, \ldots, X_n$ denote a sample from $X \sim A(\mu, \sigma^2)$, where $A$ is an arbitrary distribution with $\mu = E(X)$ and $\sigma^2 = Var(X)$. The sample mean is given by
$$\overline{X} = \frac{1}{n}\sum_{i=1}^{n} X_i.$$
The ordinary central limit theorem states that $\overline{X}$ is asymptotically normal, i.e. we have
$$P\Big(\sqrt{n}\,\frac{\overline{X} - \mu}{\sigma} \leq x\Big) \to P(Z \leq x),$$
where $Z \sim N(0, 1)$. In what follows we also use
$$S^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2 \quad \text{and} \quad s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \overline{X})^2.$$
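The following minimal simulation sketch (added for illustration; the exponential distribution is an arbitrary stand-in for $A$, with $\mu = \sigma = 1$) shows the standardized sample mean approaching the standard normal law.

```python
# Illustrative sketch: the standardized sample mean approaches N(0, 1).
# The exponential distribution is an arbitrary choice for A (mu = sigma = 1).
import numpy as np

rng = np.random.default_rng(0)
n, reps = 200, 50_000
samples = rng.exponential(scale=1.0, size=(reps, n))
z = np.sqrt(n) * (samples.mean(axis=1) - 1.0) / 1.0

print("mean:", z.mean(), "variance:", z.var())   # close to 0 and 1
for x in (-1.0, 0.0, 1.0):
    # empirical P(sqrt(n)(Xbar - mu)/sigma <= x), to compare with the N(0,1) cdf
    print(x, (z <= x).mean())
```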
2 Central limit theorem for $S^2$ and $s^2$
2.1 Central limit theorem for $S^2$
In view of the definition of $S^2$, using the ordinary central limit theorem, we immediately obtain the following result.

Theorem 1 Suppose that $E(X^4) < \infty$. Then we have
$$P(\sqrt{n}(S^2 - \sigma^2) \leq x) \to P(U \leq x),$$
where $U \sim N(0, \sigma_U^2)$ with $\sigma_U^2 = Var((X - \mu)^2)$.
Remark. Note that $\sigma_U^2$ is related to the kurtosis $\kappa(X)$ of $X$. Recall that the kurtosis is defined as
$$\kappa(X) = \frac{E((X - \mu)^4)}{\sigma^4} - 3 = \frac{Var((X - \mu)^2)}{\sigma^4} - 2.$$
We find that $\sigma_U^2 = (\kappa(X) + 2)\,\sigma^4$.
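As a numerical sanity check of this remark (our illustration; the Gamma distribution is an arbitrary test case), both sides of $\sigma_U^2 = Var((X - \mu)^2) = (\kappa(X) + 2)\sigma^4$ can be estimated from one large sample:

```python
# Check Var((X - mu)^2) = (kappa(X) + 2) sigma^4 on simulated Gamma data.
import numpy as np

rng = np.random.default_rng(1)
x = rng.gamma(shape=3.0, scale=2.0, size=2_000_000)

mu, var = x.mean(), x.var()
m4 = ((x - mu) ** 4).mean()
kappa = m4 / var**2 - 3.0                 # excess kurtosis
lhs = ((x - mu) ** 2).var()               # Var((X - mu)^2)
rhs = (kappa + 2.0) * var**2              # (kappa + 2) sigma^4
print(lhs, rhs)                           # the two estimates agree closely
```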
It follows that
$$\sqrt{n}(s^2 - \sigma^2) = \frac{n}{n-1}\sqrt{n}(S^2 - \sigma^2) + \frac{\sqrt{n}}{n-1}\,\sigma^2 - \frac{n}{n-1}\sqrt{n}\,(\overline{X} - \mu)^2. \tag{1}$$
We prove the following result.

Theorem 2 Suppose that $E(X^4) < \infty$. Then we also have
$$P(\sqrt{n}(s^2 - \sigma^2) \leq x) \to P(U \leq x).$$
Proof. Consider (1) and write $\sqrt{n}(s^2 - \sigma^2) = A_n + B_n$, where
$$A_n = \frac{n}{n-1}\sqrt{n}(S^2 - \sigma^2), \qquad B_n = \frac{\sqrt{n}}{n-1}\,\sigma^2 - \frac{n}{n-1}\sqrt{n}\,(\overline{X} - \mu)^2.$$
Using Theorem 1, we have $P(A_n \leq x) \to P(U \leq x)$. Since $B_n \to 0$ in probability, the result follows.

This result can be used to construct asymptotic confidence bounds for $\sigma^2$: solving $\sqrt{n}\,|s^2 - \sigma^2| \leq z_{\alpha/2}\,\sigma^2\sqrt{\kappa(X) + 2}$ for $\sigma^2$ leads to bounds of the form
$$\sigma^2 = \frac{s^2}{1 \mp z_{\alpha/2}\sqrt{(\kappa(X) + 2)/n}}.$$
2) If $X \sim \mathrm{Bernoulli}(p)$ with $q = 1 - p$, we have $\mu = p$ and $\sigma^2 = pq$. Now we find that $\sigma_U^2 = pq(1 - 4pq)$. Note that for $p = 1/2$ we have $\sigma_U^2 = 0$.
3) If $X \sim UNIF(-a, a)$, we have $\mu = 0$, $\sigma^2 = a^2/3$ and $E(X^4) = a^4/5$. We find that $\sigma_U^2 = E(X^4) - \sigma^4 = a^4/5 - a^4/9 = 4a^4/45$ and
$$\sqrt{n}(s^2 - a^2/3) \overset{d}{\Longrightarrow} U \sim N(0, 4a^4/45).$$
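A simulation sketch of example 3) with $a = 1$ (our own illustration): the empirical variance of $\sqrt{n}(s^2 - a^2/3)$ should approach $4a^4/45 \approx 0.089$.

```python
# Monte Carlo check of the UNIF(-a, a) example, a = 1.
import numpy as np

rng = np.random.default_rng(2)
a, n, reps = 1.0, 500, 20_000
x = rng.uniform(-a, a, size=(reps, n))
s2 = x.var(axis=1, ddof=1)                    # sample variance s^2
u = np.sqrt(n) * (s2 - a**2 / 3.0)
print("empirical variance:", u.var(), "theory:", 4 * a**4 / 45)
```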
3 Multivariate central limit theorem
3.1 The central limit theorem
We prove the following theorem.

Theorem 3 Let $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ denote a sample from a bivariate distribution $(X, Y) \sim A(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$. Then we have
$$P(\sqrt{n}(\overline{X} - \mu_1) \leq x,\ \sqrt{n}(\overline{Y} - \mu_2) \leq y) \to P(U \leq x,\ V \leq y),$$
where $(U, V)$ has a bivariate normal distribution with zero means, variances $\sigma_1^2$ and $\sigma_2^2$, and covariance $\rho\,\sigma_1\sigma_2$.
Proof. For arbitrary $a$ and $b$ with $(a, b) \neq (0, 0)$, we consider $aX + bY$. Clearly we have
$$E(aX + bY) = a\mu_1 + b\mu_2,$$
$$Var(aX + bY) = a^2\sigma_1^2 + b^2\sigma_2^2 + 2ab\,\rho\,\sigma_1\sigma_2.$$
The ordinary central limit theorem shows that
$$P(\sqrt{n}(a\overline{X} + b\overline{Y} - a\mu_1 - b\mu_2) \leq x) \to P(W \leq x),$$
where
$$W \sim N(0, a^2\sigma_1^2 + b^2\sigma_2^2 + 2ab\,\rho\,\sigma_1\sigma_2).$$
The result now follows from the Cramér-Wold device discussed below.
Remark. The Cramér-Wold device states that for random vectors $(X_n, Y_n)$ we have
$$(X_n, Y_n) \overset{d}{\Longrightarrow} (U, V)$$
if and only if
$$\forall (a, b) \neq (0, 0):\ aX_n + bY_n \overset{d}{\Longrightarrow} aU + bV.$$
This device is easy to prove by using generating functions or characteristic functions.
Theorem 4 Let $(X_{1,j}, \ldots, X_{k,j})$, $j = 1, 2, \ldots, n$, denote a sample from a multivariate distribution $(X_1, X_2, \ldots, X_k) \sim A$ with means $E(X_i) = \mu_i$ and variance-covariance matrix $\Sigma = (Cov(X_i, X_j))_{i,j=1}^{k}$. For each $i = 1, 2, \ldots, k$, let $\overline{X}_i = n^{-1}\sum_{j=1}^{n} X_{i,j}$. Then we have
$$P(\sqrt{n}(\overline{X}_1 - \mu_1) \leq x_1,\ \sqrt{n}(\overline{X}_2 - \mu_2) \leq x_2,\ \ldots,\ \sqrt{n}(\overline{X}_k - \mu_k) \leq x_k)$$
$$\to P(U_1 \leq x_1,\ U_2 \leq x_2,\ \ldots,\ U_k \leq x_k),$$
where $(U_1, U_2, \ldots, U_k)$ has a multivariate normal distribution with $E(U_i) = 0$ and $Cov(U_i, U_j) = \sigma_{i,j}$.
The following corollary will be useful.
Corollary 5 ([5]) Let $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ denote a sample from a bivariate distribution $(X, Y) \sim A(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$ and suppose that $E(X^4 + Y^4) < \infty$. Consider the vectors
$$\vec{A} = (\overline{X}, \overline{Y}, \overline{X^2}, \overline{Y^2}, \overline{XY}),$$
$$\vec{\mu} = (\mu_1, \mu_2, E(X^2), E(Y^2), E(XY)).$$
Then
$$P(\sqrt{n}(\vec{A} - \vec{\mu}) \leq \vec{x}) \to P(\vec{V} \leq \vec{x}),$$
where $\vec{V}$ has a multivariate normal distribution with means $0$ and with variance-covariance matrix given by
$$\Sigma = \begin{pmatrix}
\sigma_1^2 & Cov(X, Y) & Cov(X, X^2) & Cov(X, Y^2) & Cov(X, XY) \\
 & \sigma_2^2 & Cov(Y, X^2) & Cov(Y, Y^2) & Cov(Y, XY) \\
 & & Var(X^2) & Cov(X^2, Y^2) & Cov(X^2, XY) \\
 & & & Var(Y^2) & Cov(Y^2, XY) \\
 & & & & Var(XY)
\end{pmatrix} \tag{2}$$
(the matrix is symmetric; entries below the diagonal are omitted).
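To illustrate Corollary 5 numerically, the following sketch (with our own choice of a dependent pair $(X, Y)$) checks a single entry of the matrix in (2), namely $Cov(X, XY)$:

```python
# Empirical check of one entry of the moment-vector covariance matrix (2).
import numpy as np

rng = np.random.default_rng(3)
n, reps = 400, 20_000
x = rng.normal(size=(reps, n))
y = 0.5 * x + rng.normal(size=(reps, n))      # a dependent pair, E(XY) = 0.5

t1 = np.sqrt(n) * (x.mean(axis=1) - 0.0)              # sqrt(n)(Xbar - mu1)
t5 = np.sqrt(n) * ((x * y).mean(axis=1) - 0.5)        # sqrt(n)(mean(XY) - E(XY))
empirical = np.cov(t1, t5)[0, 1]

xx, yy = x.ravel(), y.ravel()
theory = np.cov(xx, xx * yy)[0, 1]                    # Cov(X, XY)
print(empirical, theory)
```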
3.2 Functions
Using the notations of Theorem 3, let us consider a new random variable $f(\overline{X}, \overline{Y})$, where the function $f(x, y)$ is sufficiently smooth. Writing the first terms of a Taylor expansion, we have
$$f(x, y) = f(a, b) + \frac{\partial f}{\partial x}(a, b)(x - a) + \frac{\partial f}{\partial y}(a, b)(y - b) + \frac{1}{2}R,$$
where the remainder term $R$ is of the form
$$R = (x - a,\ y - b)\begin{pmatrix} f_{x,x}(\xi, \eta) & f_{x,y}(\xi, \eta) \\ f_{x,y}(\xi, \eta) & f_{y,y}(\xi, \eta) \end{pmatrix}\begin{pmatrix} x - a \\ y - b \end{pmatrix}.$$
Here $f_{x,x}$, $f_{x,y}$, $f_{y,y}$ denote the second partial derivatives of $f$, and $\xi$ (resp. $\eta$) is between $x$ and $a$ (resp. $y$ and $b$). If these partial derivatives are bounded around $(a, b)$, for some constant $c > 0$ we have
$$|R| \leq c\big((x - a)^2 + (y - b)^2 + |(x - a)(y - b)|\big).$$
Furthermore, if $|x - a| \leq \delta$ and $|y - b| \leq \delta$, we find that
$$\Big| f(x, y) - f(a, b) - \frac{\partial f}{\partial x}(a, b)(x - a) - \frac{\partial f}{\partial y}(a, b)(y - b) \Big| \leq 3c\delta^2,$$
and hence also that
$$-3c\delta^2 + \frac{\partial f}{\partial x}(a, b)(x - a) + \frac{\partial f}{\partial y}(a, b)(y - b) \leq f(x, y) - f(a, b) \leq 3c\delta^2 + \frac{\partial f}{\partial x}(a, b)(x - a) + \frac{\partial f}{\partial y}(a, b)(y - b).$$
Now replace $(x, y)$ and $(a, b)$ by $(\overline{X}, \overline{Y})$ and $(\mu_1, \mu_2)$ and define the following quantities:
$$\vec{\theta} = (\theta_1, \theta_2) = \Big(\frac{\partial f}{\partial x}(\mu_1, \mu_2),\ \frac{\partial f}{\partial y}(\mu_1, \mu_2)\Big),$$
$$A_n = \theta_1\sqrt{n}(\overline{X} - \mu_1) + \theta_2\sqrt{n}(\overline{Y} - \mu_2),$$
$$K_n = \sqrt{n}\big(f(\overline{X}, \overline{Y}) - f(\mu_1, \mu_2)\big).$$
Note that Theorem 3 implies that $P(A_n \leq x) \to P(W \leq x) = P(\theta_1 U + \theta_2 V \leq x)$.
If $|\overline{X} - \mu_1| \leq \delta$ and $|\overline{Y} - \mu_2| \leq \delta$, the previous analysis shows that
$$-3c\sqrt{n}\,\delta^2 + A_n \leq K_n \leq 3c\sqrt{n}\,\delta^2 + A_n.$$
Now consider $P(K_n \leq x)$ and write $P(K_n \leq x) = I + II$, where
$$I = P(K_n \leq x,\ E), \qquad II = P(K_n \leq x,\ E^c),$$
where $E$ is the event $E = \{|\overline{X} - \mu_1| \leq \delta \text{ and } |\overline{Y} - \mu_2| \leq \delta\}$, and $E^c$ its complement.
We have $II \leq P(E^c) \leq P(|\overline{X} - \mu_1| > \delta) + P(|\overline{Y} - \mu_2| > \delta)$. Using the inequality of Chebyshev, we obtain that
$$II \leq \frac{\sigma_1^2 + \sigma_2^2}{n\delta^2}.$$
If we choose $\delta$ such that $n\delta^2 \to \infty$, we obtain that $II \to 0$.
For $I$, we have
$$I \leq P(-3\sqrt{n}\,c\delta^2 + A_n \leq x,\ E) \leq P(A_n \leq x + 3\sqrt{n}\,c\delta^2).$$
If we choose $\delta$ such that $\sqrt{n}\,\delta^2 \to 0$, we find, after taking limits for $n \to \infty$, that $I$ is asymptotically bounded from above by $P(W \leq x)$. A good choice of $\delta$ is for example $\delta = n^{-1/3}$. On the other hand, we have
$$I \geq P(3\sqrt{n}\,c\delta^2 + A_n \leq x,\ E) = P(3\sqrt{n}\,c\delta^2 + A_n \leq x) - P(3\sqrt{n}\,c\delta^2 + A_n \leq x,\ E^c).$$
As before, we have $P(3\sqrt{n}\,c\delta^2 + A_n \leq x) \to P(W \leq x)$. For the other term, we have
$$P(3\sqrt{n}\,c\delta^2 + A_n \leq x,\ E^c) \leq P(E^c) \to 0.$$
We obtain that as $n \to \infty$, $I$ is asymptotically bounded from below by $P(W \leq x)$. We conclude that
$$P(K_n \leq x) \to P(W \leq x).$$
Clearly we have $E(W) = 0$ and for the variance we find that
$$\sigma_W^2 = Var(W) = (\theta_1, \theta_2)\,\Sigma\begin{pmatrix} \theta_1 \\ \theta_2 \end{pmatrix} = \vec{\theta}\,\Sigma\,\vec{\theta}^{\,t},$$
where
$$\Sigma = \begin{pmatrix} Var(X) & Cov(X, Y) \\ Cov(X, Y) & Var(Y) \end{pmatrix}.$$
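The transfer result suggests a generic recipe: estimate $\Sigma$, evaluate the gradient $\vec{\theta}$ at $(\mu_1, \mu_2)$, and form $\vec{\theta}\,\Sigma\,\vec{\theta}^{\,t}$. A minimal sketch, with our own choice $f(x, y) = x/y$ and exponential ingredients:

```python
# Delta-method sketch: Var(sqrt(n)(f(Xbar, Ybar) - f(mu1, mu2))) ~ theta' Sigma theta.
import numpy as np

rng = np.random.default_rng(4)
n, reps = 500, 20_000
x = rng.exponential(2.0, size=(reps, n))      # mu1 = 2
y = x + rng.exponential(1.0, size=(reps, n))  # mu2 = 3, correlated with X

mu1, mu2 = 2.0, 3.0
f = lambda a, b: a / b
kn = np.sqrt(n) * (f(x.mean(axis=1), y.mean(axis=1)) - f(mu1, mu2))

theta = np.array([1.0 / mu2, -mu1 / mu2**2])  # gradient of x/y at (mu1, mu2)
sigma = np.cov(x.ravel(), y.ravel())          # 2x2 covariance matrix of (X, Y)
print("empirical:", kn.var(), "theory:", theta @ sigma @ theta)
```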
This approach can also be used for random vectors with 3 or more components.
The general result is the following.
Theorem 6 Let $\vec{A}$ denote the vector of sample means of a sample from a random vector with mean vector $\vec{\mu}$ and variance-covariance matrix $\Sigma$, and let $f$ be sufficiently smooth with gradient $\vec{\theta} = \nabla f(\vec{\mu})$. Then we have
$$P(\sqrt{n}(f(\vec{A}) - f(\vec{\mu})) \leq x) \to P(W \leq x),$$
where $W \sim N(0, \sigma_W^2)$ with $\sigma_W^2 = \vec{\theta}\,\Sigma\,\vec{\theta}^{\,t}$.

Remark. For vector-valued functions $\vec{f} = (f_1, f_2, \ldots, f_m)$, consider
$$h(\vec{x}) = u_1 f_1(\vec{x}) + u_2 f_2(\vec{x}) + \ldots + u_m f_m(\vec{x}),$$
where $(u_1, u_2, \ldots, u_m) \neq (0, 0, \ldots, 0)$. Now Theorem 6 and the Cramér-Wold device can be used.
As a first application, take $\vec{A} = (\overline{X}, \overline{X^2})$, $\vec{\mu} = (\mu, E(X^2))$ and $f(x, y) = y - x^2$, so that $f(\vec{A}) = \frac{n-1}{n}\,s^2$ and $f(\vec{\mu}) = \sigma^2$. Here $\vec{\theta} = (-2\mu, 1)$ and we recover the result of section 2:
$$P(\sqrt{n}(f(\vec{A}) - \sigma^2) \leq x) \to P(W \leq x),$$
where $W \sim N(0, \sigma_W^2)$ with
$$\sigma_W^2 = (-2\mu, 1)\begin{pmatrix} Var(X) & Cov(X, X^2) \\ Cov(X, X^2) & Var(X^2) \end{pmatrix}\begin{pmatrix} -2\mu \\ 1 \end{pmatrix}$$
$$= 4\mu^2\,Var(X) - 4\mu\,Cov(X, X^2) + Var(X^2) = Var(X^2 - 2\mu X) = Var((X - \mu)^2).$$

Next, consider the coefficient of variation $CV = \sigma/\mu$ and its sample version $SCV = s/\overline{X}$. Taking $f(x, y) = \sqrt{y - x^2}/x$, so that $f(\vec{\mu}) = CV$, we find
$$\vec{\theta} = (\theta_1, \theta_2) = \Big(-\frac{E(X^2)}{\sigma\mu^2},\ \frac{1}{2\sigma\mu}\Big).$$
Using Theorem 6, we find that
$$P\Big(\sqrt{n}\Big(\sqrt{\tfrac{n-1}{n}}\,SCV - CV\Big) \leq x\Big) \to P(W \leq x),$$
where $W \sim N(0, \sigma_W^2)$ with
$$\sigma_W^2 = \vec{\theta}\begin{pmatrix} Var(X) & Cov(X, X^2) \\ Cov(X, X^2) & Var(X^2) \end{pmatrix}\vec{\theta}^{\,t} = \frac{E^2(X^2)}{\sigma^2\mu^4}\,Var(X) - \frac{E(X^2)}{\sigma^2\mu^3}\,Cov(X, X^2) + \frac{1}{4\sigma^2\mu^2}\,Var(X^2).$$
To simplify, note that
$$E((X - \mu)^3) = Cov(X, X^2) - 2\mu\sigma^2,$$
$$Var((X - \mu)^2) = Var(X^2) + 4\mu^2\sigma^2 - 4\mu\,Cov(X, X^2).$$
Now we find
$$\sigma_W^2 = \frac{E^2(X^2)}{\mu^4} - \frac{E(X^2)}{\sigma^2\mu^3}\big(E((X - \mu)^3) + 2\mu\sigma^2\big) + \frac{1}{4\sigma^2\mu^2}\big(Var((X - \mu)^2) + 4\mu\,E((X - \mu)^3) + 4\mu^2\sigma^2\big)$$
$$= \Big(\frac{E^2(X^2)}{\mu^4} - \frac{2E(X^2)}{\mu^2} + 1\Big) - \frac{E(X^2) - \mu^2}{\sigma^2\mu^3}\,E((X - \mu)^3) + \frac{1}{4\sigma^2\mu^2}\,Var((X - \mu)^2)$$
$$= \frac{\sigma^4}{\mu^4} - \frac{1}{\mu^3}\,E((X - \mu)^3) + \frac{1}{4\sigma^2\mu^2}\,Var((X - \mu)^2).$$
In terms of the kurtosis $\kappa(X)$ and the skewness $\gamma_1(X) = E((X - \mu)^3)/\sigma^3$, we find that
$$\sigma_W^2 = \frac{\sigma^4}{\mu^4} - \frac{\sigma^3}{\mu^3}\,\gamma_1(X) + \frac{\sigma^2}{4\mu^2}\,(\kappa(X) + 2).$$
Remarks.
1) In the case of a normal distribution, we find that
$$\sigma_W^2 = \frac{\sigma^4}{\mu^4} + \frac{\sigma^2}{2\mu^2} = CV^4 + \frac{1}{2}\,CV^2.$$
2) For the Exponential($\lambda$) distribution, we have $\mu = \sigma = 1/\lambda$, $\gamma_1 = 2$, $\kappa = 6$, and then $CV = 1$ and $\sigma_W^2 = 1$.
3) For the Poisson($\lambda$) distribution, we have
$$\mu = \sigma^2 = \lambda, \qquad \gamma_1 = \lambda^{-1/2}, \qquad \kappa = \lambda^{-1},$$
and then $CV = \lambda^{-1/2}$ and
$$\sigma_W^2 = \frac{1}{2\lambda} + \frac{1}{4\lambda^2}.$$
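A numerical check of the Poisson remark (our illustration): the variance of $\sqrt{n}\big(\sqrt{(n-1)/n}\,SCV - CV\big)$ should approach $1/(2\lambda) + 1/(4\lambda^2)$.

```python
# Monte Carlo check of the asymptotic variance of the sample CV, Poisson case.
import numpy as np

rng = np.random.default_rng(5)
lam, n, reps = 4.0, 1_000, 20_000
x = rng.poisson(lam, size=(reps, n))

scv = x.std(axis=1, ddof=1) / x.mean(axis=1)
cv = 1.0 / np.sqrt(lam)
w = np.sqrt(n) * (np.sqrt((n - 1) / n) * scv - cv)
print("empirical:", w.var(), "theory:", 1 / (2 * lam) + 1 / (4 * lam**2))
```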
Now suppose that $\mu = 0$. Note that
$$\frac{1}{SCV} = \frac{\overline{X}}{s}.$$
If $\sigma^2 < \infty$, the central limit theorem together with $s^2 \overset{P}{\longrightarrow} \sigma^2$ shows that we have
$$\frac{\sqrt{n}}{SCV} = \sqrt{n}\,\frac{\overline{X}}{s} \overset{d}{\Longrightarrow} Z,$$
where $Z \sim N(0, 1)$. Now note that for $x > 0$, we have
$$P\Big(\frac{\sqrt{n}}{SCV} > x\Big) = P\Big(\frac{SCV}{\sqrt{n}} < \frac{1}{x}\Big), \qquad P\Big(\frac{\sqrt{n}}{SCV} < -x\Big) = P\Big(\frac{SCV}{\sqrt{n}} > -\frac{1}{x}\Big).$$
As a consequence, we have
$$\frac{SCV}{\sqrt{n}} \overset{d}{\Longrightarrow} U = \frac{1}{Z}.$$
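A sketch of this $\mu = 0$ effect (our illustration): since the limit $1/Z$ has no finite moments, we compare medians of absolute values instead of means.

```python
# SCV / sqrt(n) behaves like 1/Z when mu = 0; compare medians of |.|.
import numpy as np

rng = np.random.default_rng(10)
n, reps = 2_000, 10_000
x = rng.normal(0.0, 1.0, size=(reps, n))      # mu = 0
scv = x.std(axis=1, ddof=1) / x.mean(axis=1)
u = scv / np.sqrt(n)

z = rng.normal(size=reps)
print(np.median(np.abs(u)), np.median(np.abs(1 / z)))   # both near 1.48
```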
4.4 A t-statistic
In place of $SCV$ we can study $T = 1/SCV = \overline{X}/s$. This is a quantity related to the t-statistic $t = (\overline{X} - \mu)/s$. As in section 4.2, we obtain that
$$\sqrt{n}\,\Big(T - \frac{\mu}{\sigma}\Big) \overset{d}{\Longrightarrow} W,$$
where $W \sim N(0, \sigma_U^2)$ with
$$\sigma_U^2 = \frac{\mu^4}{\sigma^4}\,\sigma_W^2 = 1 - \frac{\mu}{\sigma}\,\gamma_1(X) + \frac{\mu^2}{4\sigma^2}\,(\kappa(X) + 2).$$
Note that for the t-statistic, we have the simpler result that
$$\sqrt{n}\,\frac{\overline{X} - \mu}{s} \overset{d}{\Longrightarrow} Z \sim N(0, 1).$$
Next we consider the sample dispersion index
$$SD = \frac{s^2}{\overline{X}}.$$
To study $SD$, we consider $\vec{A} = (\overline{X}, \overline{X^2})$, $\vec{\mu} = (\mu, E(X^2))$ and the function $f(x, y) = (y - x^2)/x$. Clearly we have
$$f(\vec{A}) = \frac{n-1}{n}\,SD, \qquad f(\vec{\mu}) = D,$$
where $D = \sigma^2/\mu$ denotes the dispersion index, and
$$\vec{\theta} = \Big(-\frac{\sigma^2}{\mu^2} - 2,\ \frac{1}{\mu}\Big).$$
We now turn to the sample covariance. For a bivariate sample, consider $\vec{A} = (\overline{X}, \overline{Y}, \overline{XY})$, $\vec{\mu} = (\mu_1, \mu_2, E(XY))$ and $f(x, y, z) = z - xy$, so that $f(\vec{\mu}) = Cov(X, Y)$ and
$$\vec{\theta} = (-\mu_2, -\mu_1, 1).$$
It follows that
$$P(\sqrt{n}(f(\vec{A}) - Cov(X, Y)) \leq x) \to P(W \leq x),$$
where $W \sim N(0, \sigma_W^2)$ and
$$\sigma_W^2 = \vec{\theta}\begin{pmatrix} Var(X) & Cov(X, Y) & Cov(X, XY) \\ Cov(X, Y) & Var(Y) & Cov(Y, XY) \\ Cov(X, XY) & Cov(Y, XY) & Var(XY) \end{pmatrix}\vec{\theta}^{\,t}.$$
Assuming first for simplicity that $\mu_1 = \mu_2 = 0$, we find $\vec{\theta} = (0, 0, 1)$ and $\sigma_W^2 = Var(XY)$. In the general case we find that
$$\sigma_W^2 = Var((X - \mu_1)(Y - \mu_2)).$$
For the sample correlation coefficient $r$ we can proceed in the same way, now using the five-dimensional vector of Corollary 5. It follows that
$$P(\sqrt{n}(r - \rho) \leq x) \to P(W \leq x), \tag{4}$$
where $W \sim N(0, \sigma_W^2)$ and $\sigma_W^2 = \vec{\theta}\,\Sigma\,\vec{\theta}^{\,t}$ with $\Sigma$ given in (2).
In this case, for standardized variables, we have $\vec{\theta} = (0, 0, -\rho/2, -\rho/2, 1)$ and
$$\sigma_W^2 = \vec{\theta}\,\Sigma\,\vec{\theta}^{\,t} = \frac{\rho^2}{4}\,Var(X^2) + \frac{\rho^2}{4}\,Cov(X^2, Y^2) - \frac{\rho}{2}\,Cov(X^2, XY)$$
$$\qquad + \frac{\rho^2}{4}\,Cov(X^2, Y^2) + \frac{\rho^2}{4}\,Var(Y^2) - \frac{\rho}{2}\,Cov(Y^2, XY)$$
$$\qquad - \frac{\rho}{2}\,Cov(X^2, XY) - \frac{\rho}{2}\,Cov(Y^2, XY) + Var(XY)$$
$$= \frac{\rho^2}{4}\big(Var(X^2) + 2\,Cov(X^2, Y^2) + Var(Y^2)\big) - \rho\big(Cov(X^2, XY) + Cov(Y^2, XY)\big) + Var(XY)$$
$$= \frac{\rho^2}{4}\big(E(X^4) - 1 + 2E(X^2Y^2) - 2 + E(Y^4) - 1\big) - \rho\big(E(X^3Y) - \rho + E(XY^3) - \rho\big) + E(X^2Y^2) - \rho^2$$
$$= \frac{\rho^2}{4}\big(E(X^4) + 2E(X^2Y^2) + E(Y^4)\big) - \rho\big(E(X^3Y) + E(XY^3)\big) + E(X^2Y^2).$$
In the general case, we find that
$$\sigma_W^2 = \frac{\rho^2}{4}\big(E(X_*^4) + 2E(X_*^2Y_*^2) + E(Y_*^4)\big) - \rho\big(E(X_*^3Y_*) + E(X_*Y_*^3)\big) + E(X_*^2Y_*^2), \tag{5}$$
where
$$X_* = \frac{X - E(X)}{\sigma_1} \quad \text{and} \quad Y_* = \frac{Y - E(Y)}{\sigma_2}.$$
The final result is that (4) holds with $\sigma_W^2$ given in (5).
Remarks.
1) We can rewrite $\sigma_W^2$ more compactly as follows. Assuming standardized variables, we have
$$\sigma_W^2 = \frac{\rho^2}{4}\big(Var(X^2) + 2\,Cov(X^2, Y^2) + Var(Y^2)\big) - \rho\big(Cov(X^2, XY) + Cov(Y^2, XY)\big) + Var(XY)$$
$$= \frac{\rho^2}{4}\,Var(X^2 + Y^2) - \rho\,Cov(X^2 + Y^2, XY) + Var(XY)$$
$$= Var\Big(\frac{\rho}{2}(X^2 + Y^2) - XY\Big).$$
2) Note that the asymptotic variance $\sigma_W^2$ only depends on $\rho$ and on fourth-order central moments of the underlying distribution.
3) If $\rho = 0$, we find that $\sigma_W^2 = E(X_*^2Y_*^2)$.
4) If $X$ and $Y$ are independent, we have $\rho = 0$ and $\sigma_W^2 = E(X_*^2Y_*^2) = E(X_*^2)E(Y_*^2) = 1$.
5) If $Y = a + bX$, $b > 0$, we find $\rho = 1$, $Y_* = X_*$ and $\sigma_W^2 = 0$.
5.3 Application
To model dependence, one often uses a model of the following form. Starting from arbitrary independent random variables $A$ and $B$, we construct the vector $(X, Y) = (A, B + \lambda A)$. Given a sample $(X_i, Y_i)$, we want to test e.g. the hypothesis $H_0: \lambda = 0$ versus $H_a: \lambda \neq 0$.
It is clear that
$$Var(X) = \sigma_X^2 = \sigma_A^2,$$
$$Var(Y) = \sigma_Y^2 = \sigma_B^2 + \lambda^2\sigma_A^2,$$
$$Cov(X, Y) = \lambda\,\sigma_A^2,$$
$$\rho = \rho(X, Y) = \frac{\lambda\,\sigma_A^2}{\sqrt{\sigma_A^2(\sigma_B^2 + \lambda^2\sigma_A^2)}}.$$
Under $H_0$, the variables $X$ and $Y$ are independent, so that $\rho = 0$ and $\sqrt{n}\,r$ is asymptotically standard normal.
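Under $H_0$ the result of remark 4) in section 5.2 gives an asymptotic test based on $\sqrt{n}\,r$. A sketch, with our own arbitrary choices for the laws of $A$ and $B$:

```python
# Asymptotic test of H0: lambda = 0 in (X, Y) = (A, B + lambda A).
import numpy as np

rng = np.random.default_rng(6)
n = 500
a = rng.standard_t(df=5, size=n)              # arbitrary A (finite 4th moment)
b = rng.exponential(size=n)                   # arbitrary B
lam = 0.0                                     # data generated under H0
x, y = a, b + lam * a

r = np.corrcoef(x, y)[0, 1]
stat = np.sqrt(n) * r                         # approximately N(0, 1) under H0
print("test statistic:", stat, "reject at 5%:", abs(stat) > 1.96)
```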
5.4 The bivariate normal case
For a standard bivariate normal distribution $(X, Y) \sim BN(0, 0, 1, 1, \rho)$, we show how to calculate $\sigma_W^2$, cf. (5).
First note that $(U, V) = (X - \rho Y, Y)$ also has a bivariate normal distribution with
$$Cov(U, V) = Cov(X, Y) - \rho\,Cov(Y, Y) = 0.$$
It follows that $U$ and $V$ are independent with $V \sim N(0, 1)$ and $U \sim N(0, 1 - \rho^2)$.
For general $W \sim N(0, \sigma^2)$, we have $\phi_W(t) = \exp(-\frac{1}{2}\sigma^2 t^2)$ and then $E(W) = E(W^3) = 0$ and $E(W^2) = \sigma^2$, $E(W^4) = 3\sigma^4$.
Now observe that $Y = V$ and $X = U + \rho V$. We find
$$E(Y^4) = E(X^4) = 3,$$
$$E(YX^3) = E(Y^3X) = E(V^3U + \rho V^4) = 3\rho,$$
$$E(Y^2X^2) = E\big(V^2(U^2 + 2\rho UV + \rho^2V^2)\big) = 1 + 2\rho^2.$$
It follows that
$$\sigma_W^2 = \frac{\rho^2}{4}\,(3 + 2 + 4\rho^2 + 3) - \rho\,(3\rho + 3\rho) + 1 + 2\rho^2 = \rho^4 - 2\rho^2 + 1 = (1 - \rho^2)^2.$$
In general, for $(X, Y) \sim BN(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$, we also find that $\sigma_W^2 = (1 - \rho^2)^2$, and then
$$r \approx N\Big(\rho,\ \frac{(1 - \rho^2)^2}{n}\Big).$$
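A simulation sketch (our illustration) of this normal-case result:

```python
# Check that n * Var(r) approaches (1 - rho^2)^2 for bivariate normal samples.
import numpy as np

rng = np.random.default_rng(7)
rho, n, reps = 0.6, 400, 5_000
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=(reps, n))

xc = z[:, :, 0] - z[:, :, 0].mean(axis=1, keepdims=True)
yc = z[:, :, 1] - z[:, :, 1].mean(axis=1, keepdims=True)
r = (xc * yc).sum(axis=1) / np.sqrt((xc**2).sum(axis=1) * (yc**2).sum(axis=1))

print("empirical:", n * r.var(), "theory:", (1 - rho**2) ** 2)
```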
Remark. Note that for the transformation $t(r) = r/\sqrt{1 - r^2}$ we have
$$t(r) - r = \frac{r^3}{\sqrt{1 - r^2}\,\big(1 + \sqrt{1 - r^2}\big)}.$$
Under $H_0$ it follows that
$$n^{3/2}\big(t(r) - r\big) \overset{d}{\Longrightarrow} \frac{1}{2}\,Z^3.$$
For large samples it is not very useful to use the t-transformation.
We now consider the rank correlation coefficient $r_S$ of Spearman. Let $R_i$ denote the rank of $X_i$ and $R_i'$ the rank of $Y_i$. Now note that (with or without ties):
$$\sum R_i = \sum R_i' = 1 + 2 + \ldots + n = \frac{n(n + 1)}{2}.$$
If there are no ties, we also have:
$$\sum R_i^2 = \sum R_i'^2 = 1^2 + 2^2 + \ldots + n^2 = \frac{n(n + 1)(2n + 1)}{6},$$
$$\sum R_i^2 - n\overline{R}^2 = \frac{n(n + 1)(2n + 1)}{6} - n\,\frac{(n + 1)^2}{4} = \frac{n(n^2 - 1)}{12},$$
$$\frac{1}{2}\sum (R_i - R_i')^2 = \frac{n(n + 1)(2n + 1)}{6} - \sum R_iR_i'.$$
In the case of no ties, after simplifying, we find that:
$$r_S = 1 - \frac{6\sum_{i=1}^{n}(R_i - R_i')^2}{n(n^2 - 1)}. \tag{6}$$
For independent variables, we can use the result of section 5.2 to conclude that
$$\sqrt{n}\,r_S \overset{d}{\Longrightarrow} Z \sim N(0, 1).$$
Example. Consider the following data; the two tied $Y$-values both receive the average rank $4.5$:

X    Y    R    R'    R - R'    (R - R')^2
3    10   1    1.0    0.0       0
6    15   2    2.0    0.0       0
9    30   3    4.5   -1.5       2.25
12   35   4    6.0   -2.0       4
15   25   5    3.0    2.0       4
18   30   6    4.5    1.5       2.25
21   50   7    8.0   -1.0       1
24   45   8    7.0    1.0       1
In the case of no ties we had $\sum R_i^2 = \sum R_i'^2 = 204$. In our example, we have $\sum R_i^2 = 204$ and $\sum R_i'^2 = 203.5$. If there is one tie involving two observations, we see that there is a difference of $0.5$.
Now we calculate the correction factor
$$T = \frac{2^3 - 2}{12}\,t_2 + \frac{3^3 - 3}{12}\,t_3 + \ldots + \frac{k^3 - k}{12}\,t_k,$$
where $t_j$ denotes the number of groups of $j$ tied observations. In the case of ties, we replace (6) by:
$$r_S = 1 - \frac{6\big(T + \sum_{i=1}^{n}(R_i - R_i')^2\big)}{n(n^2 - 1)}.$$
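A sketch computing $r_S$ with average ranks and the correction factor $T$ on the worked example; scipy.stats.rankdata (method "average") would produce the same ranks:

```python
# Spearman's r_S with average ranks and the tie correction factor T.
import numpy as np

def avg_ranks(v):
    """Ranks 1..n; ties get the average rank of their group."""
    v = np.asarray(v, dtype=float)
    order = np.argsort(v)
    ranks = np.empty(len(v))
    i = 0
    while i < len(v):
        j = i
        while j + 1 < len(v) and v[order[j + 1]] == v[order[i]]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2 + 1   # average 1-based position
        i = j + 1
    return ranks

x = [3, 6, 9, 12, 15, 18, 21, 24]
y = [10, 15, 30, 35, 25, 30, 50, 45]
rx, ry = avg_ranks(x), avg_ranks(y)

n = len(x)
T = (2**3 - 2) / 12                               # one group of 2 tied values
rs = 1 - 6 * (T + ((rx - ry) ** 2).sum()) / (n * (n**2 - 1))
print(rs)
```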
6 Comparing variances
Testing hypotheses concerning differences between means is well known and can be found in any textbook on statistics. Less is known about comparing variances. In the case of unpaired samples from normal distributions, the distribution of the quotient of the sample variances $s_1^2/s_2^2$ can be determined and is related to an $F$-distribution. In general, the analysis of $s_1^2/s_2^2$ is more complicated. In this section we study $s_1^2/s_2^2$ for large samples. We consider unpaired samples as well as paired samples.
Theorem 8 Suppose that $E(X^4 + Y^4) < \infty$ and consider the statistic $K = \ln(s_1^2/s_2^2) - \ln(\sigma_1^2/\sigma_2^2)$. If $n \to \infty$ and $m \to \infty$ in such a way that $n/m \to \lambda^2$ $(0 < \lambda < \infty)$, then
$$\sqrt{n}\,K \overset{d}{\Longrightarrow} V \overset{d}{=} \frac{1}{\sigma_1^2}\,U_1 - \frac{\lambda}{\sigma_2^2}\,U_2,$$
and $V \sim N(0, \sigma_V^2)$ with
$$\sigma_V^2 = \frac{1}{\sigma_1^4}\,Var((X - \mu_1)^2) + \frac{\lambda^2}{\sigma_2^4}\,Var((Y - \mu_2)^2). \tag{7}$$
Using $Var((X - \mu)^2) = (\kappa(X) + 2)\,\sigma^4$ and $\lambda^2 \approx n/m$, we then have
$$\frac{1}{n}\,\sigma_V^2 = \frac{1}{n}\,(\kappa(X) + 2) + \frac{1}{m}\,(\kappa(Y) + 2).$$
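A large-sample test sketch based on Theorem 8, with $K$ read as the log-ratio statistic defined above and the kurtoses replaced by empirical plug-ins; the particular distributions (both with variance 4, so $H_0$ holds) are our arbitrary choices:

```python
# Two-sample large-sample test of sigma1^2 = sigma2^2 via the log-ratio K.
import numpy as np

def excess_kurtosis(v):
    m = v.mean()
    return ((v - m) ** 4).mean() / v.var() ** 2 - 3.0

rng = np.random.default_rng(8)
x = rng.exponential(2.0, size=800)            # sigma1^2 = 4
y = rng.gamma(4.0, 1.0, size=600)             # sigma2^2 = 4 as well
n, m = len(x), len(y)

k = np.log(x.var(ddof=1) / y.var(ddof=1))     # K, which -> 0 under H0
se = np.sqrt((excess_kurtosis(x) + 2) / n + (excess_kurtosis(y) + 2) / m)
print("statistic:", k / se, "reject at 5%:", abs(k / se) > 1.96)
```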
Next, assuming $\sigma_1^2 = \sigma_2^2 = \sigma^2$, consider the pooled sample variance
$$s_p^2 = \frac{(n - 1)s_1^2 + (m - 1)s_2^2}{n + m - 2}.$$
Now we find that
$$\sqrt{n}(s_p^2 - \sigma^2) = \frac{n - 1}{n + m - 2}\,\sqrt{n}(s_1^2 - \sigma^2) + \frac{m - 1}{n + m - 2}\,\frac{\sqrt{n}}{\sqrt{m}}\,\sqrt{m}(s_2^2 - \sigma^2).$$
It follows that
$$\sqrt{n}(s_p^2 - \sigma^2) \overset{d}{\Longrightarrow} W \overset{d}{=} \frac{\lambda^2}{\lambda^2 + 1}\,U_1 + \frac{\lambda}{\lambda^2 + 1}\,U_2.$$
In this case $W \sim N(0, \sigma_W^2)$, with
$$\sigma_W^2 = \Big(\frac{\lambda^2}{\lambda^2 + 1}\Big)^2 Var((X - \mu_1)^2) + \Big(\frac{\lambda}{\lambda^2 + 1}\Big)^2 Var((Y - \mu_2)^2).$$
In the case of samples from normal distributions with $\sigma_1^2 = \sigma_2^2 = \sigma^2$, we find that
$$\sigma_W^2 = 2\sigma^4\Big(\frac{\lambda^2}{\lambda^2 + 1}\Big)^2 + 2\sigma^4\Big(\frac{\lambda}{\lambda^2 + 1}\Big)^2 = 2\sigma^4\,\frac{\lambda^2}{1 + \lambda^2} \approx 2\sigma^4\,\frac{n}{n + m}.$$
For paired samples we have the following result.

Theorem 9 Let $(X_1, Y_1), \ldots, (X_n, Y_n)$ denote a sample from a bivariate distribution $(X, Y)$ with $E(X^4 + Y^4) < \infty$. Then we have
$$P(\sqrt{n}(s_1^2 - \sigma_1^2) \leq x,\ \sqrt{n}(s_2^2 - \sigma_2^2) \leq y) \to P(U_1 \leq x,\ U_2 \leq y),$$
where $(U_1, U_2)$ has a bivariate normal distribution with zero means and with variance-covariance matrix
$$\begin{pmatrix} Var((X - \mu_1)^2) & Cov((X - \mu_1)^2, (Y - \mu_2)^2) \\ Cov((X - \mu_1)^2, (Y - \mu_2)^2) & Var((Y - \mu_2)^2) \end{pmatrix}.$$
Proof. Take arbitrary real numbers $(u, v) \neq (0, 0)$ and consider the vectors
$$\vec{A} = (\overline{X}, \overline{Y}, \overline{X^2}, \overline{Y^2}),$$
$$\vec{\mu} = (\mu_1, \mu_2, E(X^2), E(Y^2)),$$
and the function $f(x_1, x_2, x_3, x_4) = u(x_3 - x_1^2) + v(x_4 - x_2^2)$, so that $f(\vec{\mu}) = u\sigma_1^2 + v\sigma_2^2$. It is easy to see that $\vec{\theta} = (-2u\mu_1, -2v\mu_2, u, v)$. The transfer results of section 3.2 show that
$$P(\sqrt{n}(f(\vec{A}) - f(\vec{\mu})) \leq x) \to P(W \leq x),$$
where $W \sim N(0, \sigma_W^2)$ with $\sigma_W^2 = \vec{\theta}\,\Sigma\,\vec{\theta}^{\,t}$ and
$$\Sigma = \begin{pmatrix}
\sigma_1^2 & Cov(X, Y) & Cov(X, X^2) & Cov(X, Y^2) \\
 & \sigma_2^2 & Cov(Y, X^2) & Cov(Y, Y^2) \\
 & & Var(X^2) & Cov(X^2, Y^2) \\
 & & & Var(Y^2)
\end{pmatrix}$$
(symmetric; entries below the diagonal are omitted).
It follows that
$$P\Big(\sqrt{n}\Big(\frac{n - 1}{n}\,(us_1^2 + vs_2^2) - (u\sigma_1^2 + v\sigma_2^2)\Big) \leq x\Big) \to P(W \leq x),$$
where $W \overset{d}{=} uU_1 + vU_2$, and $(U_1, U_2)$ has the desired bivariate normal distribution. It is clear that the correction factor $(n - 1)/n$ is not important. The result follows by using the Cramér-Wold device.
As in Theorem 8, we consider $K$ and now we conclude that $P(\sqrt{n}\,K \leq x) \to P(V \leq x)$, where
$$V \overset{d}{=} \frac{1}{\sigma_1^2}\,U_1 - \frac{1}{\sigma_2^2}\,U_2.$$
We find that $V \sim N(0, \sigma_V^2)$ with
$$\sigma_V^2 = \frac{Var((X - \mu_1)^2)}{\sigma_1^4} + \frac{Var((Y - \mu_2)^2)}{\sigma_2^4} - 2\,\frac{Cov((X - \mu_1)^2, (Y - \mu_2)^2)}{\sigma_1^2\sigma_2^2}.$$
Remarks.
1) We can rewrite $\sigma_V^2$ more compactly as follows. Using the notation $X_* = (X - \mu_1)/\sigma_1$ and $Y_* = (Y - \mu_2)/\sigma_2$ we have
$$\sigma_V^2 = Var(X_*^2) + Var(Y_*^2) - 2\,Cov(X_*^2, Y_*^2) = Var(X_*^2 - Y_*^2) = E((X_*^2 - Y_*^2)^2).$$
2) In the case of $\rho = 0$ we find back the result of the unpaired case with $\lambda = 1$.
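A paired-sample sketch along the lines of remark 1) (our own construction of a dependent pair; the plug-in estimate of $\sigma_V^2 = E((X_*^2 - Y_*^2)^2)$ is computed from standardized observations):

```python
# Paired-sample test of sigma1^2 = sigma2^2 using the log-ratio K.
import numpy as np

rng = np.random.default_rng(9)
n = 1_000
a = rng.normal(size=n)
x = a + rng.normal(size=n)                    # dependent pair; Var(X) = 2
y = a - rng.normal(size=n)                    # Var(Y) = 2, so H0 holds

k = np.log(x.var(ddof=1) / y.var(ddof=1))
xs = (x - x.mean()) / x.std()
ys = (y - y.mean()) / y.std()
v2 = ((xs**2 - ys**2) ** 2).mean()            # estimate of sigma_V^2
stat = np.sqrt(n) * k / np.sqrt(v2)
print("statistic:", stat, "reject at 5%:", abs(stat) > 1.96)
```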
7 References
1. Bentkus, V., Jing, B.Y., Shao, Q.M. and Zhou, W. (2006). Limiting distributions of the non-central t-statistic and their applications to the power of t-tests under non-normality. Bernoulli 13(2), 346-364.
2. Billingsley, P. (1968). Convergence of Probability Measures. Wiley, New York.
3. Feller, W. (1971). An Introduction to Probability Theory and Its Applications, Vol. 2 (2nd edition). Wiley, New York.
4. Grimmett, G. and Stirzaker, D. (2002). Probability and Random Processes (3rd edition). Oxford University Press, London.
5. Ladoucette, S.A. (2007). Analysis of Heavy-Tailed Risks. Ph.D. Thesis, Catholic University of Leuven.