Slides

Multivariate Distributions
Characteristic Function
The characteristic function (cf) of a random vector $X \in \mathbb{R}^p$ is defined as
$$\varphi_X(t) = E(e^{i t^\top X}) = \int e^{i t^\top x} f(x)\,dx, \quad t \in \mathbb{R}^p,$$
where $i$ is the complex unit: $i^2 = -1$.
MVA: Humboldt-Universität zu Berlin
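Properties of cf:
$\varphi_X(0) = 1$, $|\varphi_X(t)| \le 1$
if $\varphi_X$ is absolutely integrable ($\int |\varphi_X(x)|\,dx$ exists and is finite) then
$$f(x) = \frac{1}{(2\pi)^p} \int e^{-i t^\top x}\, \varphi_X(t)\,dt$$
if $X = (X_1, X_2, \dots, X_p)^\top$ then for $t = (t_1, t_2, \dots, t_p)^\top$:
$$\varphi_{X_1}(t_1) = \varphi_X(t_1, 0, \dots, 0), \quad \dots, \quad \varphi_{X_p}(t_p) = \varphi_X(0, \dots, 0, t_p).$$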
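For $X_1, \dots, X_p$ independent RV and $t = (t_1, t_2, \dots, t_p)^\top$:
$$\varphi_X(t) = \varphi_{X_1}(t_1) \cdots \varphi_{X_p}(t_p).$$
For $X_1, \dots, X_p$ independent RV, $t \in \mathbb{R}$:
$$\varphi_{X_1 + \dots + X_p}(t) = \varphi_{X_1}(t) \cdots \varphi_{X_p}(t).$$
The characteristic function allows us to recover all the cross-product moments of any order: for $j_k \ge 0$, $k = 1, \dots, p$, and $t = (t_1, \dots, t_p)^\top$ we have
$$E\left(X_1^{j_1} \cdots X_p^{j_p}\right) = \frac{1}{i^{\,j_1 + \dots + j_p}} \left[ \frac{\partial^{\,j_1 + \dots + j_p} \varphi_X(t)}{\partial t_1^{j_1} \cdots \partial t_p^{j_p}} \right]_{t=0}.$$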
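$X \in \mathbb{R}^1$ follows the standard normal distribution
$$f_X(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right)$$
$$\begin{aligned}
\varphi_X(t) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{itx} \exp\left(-\frac{x^2}{2}\right) dx \\
&= \exp\left(-\frac{t^2}{2}\right) \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left\{-\frac{(x - it)^2}{2}\right\} dx \\
&= \exp\left(-\frac{t^2}{2}\right),
\end{aligned}$$
since $i^2 = -1$ and $\int \frac{1}{\sqrt{2\pi}} \exp\left\{-\frac{(x - it)^2}{2}\right\} dx = 1$.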
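The result $\varphi_X(t) = \exp(-t^2/2)$ can be checked numerically. The sketch below is only an illustration: the function names, the integration range and the step count are assumptions made here, not part of the slides. It approximates $E(e^{itX})$ with the trapezoidal rule and recovers $E(X^2) = i^{-2}\varphi_X''(0)$ with a central finite difference.

```python
import cmath
import math


def normal_pdf(x):
    """Standard normal density f_X(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cf_numeric(t, lo=-12.0, hi=12.0, steps=24000):
    """Approximate E(exp(itX)) for X ~ N(0,1) by the trapezoidal rule.

    The range [lo, hi] and the number of steps are illustrative choices.
    """
    h = (hi - lo) / steps
    total = 0.5 * (cmath.exp(1j * t * lo) * normal_pdf(lo)
                   + cmath.exp(1j * t * hi) * normal_pdf(hi))
    for k in range(1, steps):
        x = lo + k * h
        total += cmath.exp(1j * t * x) * normal_pdf(x)
    return total * h


def second_moment_from_cf(eps=1e-2):
    """E(X^2) = -phi''(0), using a central second difference at t = 0."""
    d2 = (cf_numeric(eps) - 2.0 * cf_numeric(0.0) + cf_numeric(-eps)) / eps ** 2
    return (-d2).real


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 2.0):
        print(t, cf_numeric(t).real, math.exp(-0.5 * t * t))
    print("E(X^2) from the cf:", second_moment_from_cf())
```

The imaginary part of `cf_numeric(t)` vanishes up to rounding because the density is symmetric.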
Theorem
The distribution of $X \in \mathbb{R}^p$ is completely determined by the set of all (one-dimensional) distributions of $t^\top X$, $t \in \mathbb{R}^p$.
This theorem says that we can determine the distribution of $X$ in $\mathbb{R}^p$ by specifying all the one-dimensional distributions of the linear combinations
$$\sum_{j=1}^{p} t_j X_j = t^\top X, \quad t = (t_1, t_2, \dots, t_p)^\top.$$
Summary: Moments
The characteristic function (cf) of a random vector $X$ is $\varphi_X(t) = E(e^{i t^\top X})$.
The distribution of a $p$-dimensional random variable $X$ is completely determined by all one-dimensional distributions of $t^\top X$, $t \in \mathbb{R}^p$ (Theorem of Cramér-Wold).
Cumulants
For a random variable $X$ with density $f$ and finite moments of order $k$ the characteristic function $\varphi_X(t) = E(e^{itX})$ has a derivative
$$\frac{1}{i^j} \left[ \frac{\partial^j \log\{\varphi_X(t)\}}{\partial t^j} \right]_{t=0} = \kappa_j, \quad j = 1, \dots, k.$$
The values $\kappa_j$ are called cumulants or semi-invariants since $\kappa_j$ does not change (for $j > 1$) under a shift transformation $X \mapsto X + a$.
The cumulants are natural parameters for dimension reduction methods, in particular the Projection Pursuit method.
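The relation between the first $k$ moments $m_1, \dots, m_k$ and the cumulants is given by
$$\kappa_k = (-1)^{k-1} \begin{vmatrix}
m_1 & 1 & \dots & 0 \\
m_2 & \binom{1}{0} m_1 & \dots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
m_k & \binom{k-1}{0} m_{k-1} & \dots & \binom{k-1}{k-2} m_1
\end{vmatrix}$$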
Suppose that $k = 1$, then $\kappa_1 = m_1$.
For $k = 2$ we obtain
$$\kappa_2 = - \begin{vmatrix} m_1 & 1 \\ m_2 & \binom{1}{0} m_1 \end{vmatrix} = m_2 - m_1^2.$$
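For $k = 3$ we have to calculate
$$\kappa_3 = \begin{vmatrix} m_1 & 1 & 0 \\ m_2 & m_1 & 1 \\ m_3 & m_2 & 2m_1 \end{vmatrix}$$
Calculating this determinant we arrive at:
$$\begin{aligned}
\kappa_3 &= m_1 \begin{vmatrix} m_1 & 1 \\ m_2 & 2m_1 \end{vmatrix} - m_2 \begin{vmatrix} 1 & 0 \\ m_2 & 2m_1 \end{vmatrix} + m_3 \begin{vmatrix} 1 & 0 \\ m_1 & 1 \end{vmatrix} \\
&= m_1 (2m_1^2 - m_2) - m_2 (2m_1) + m_3 \\
&= m_3 - 3 m_1 m_2 + 2 m_1^3.
\end{aligned}$$
In a similar way one calculates
$$\kappa_4 = m_4 - 4 m_3 m_1 - 3 m_2^2 + 12 m_2 m_1^2 - 6 m_1^4.$$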
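In a similar fashion we find the moments from the cumulants:
$$\begin{aligned}
m_1 &= \kappa_1 \\
m_2 &= \kappa_2 + \kappa_1^2 \\
m_3 &= \kappa_3 + 3 \kappa_2 \kappa_1 + \kappa_1^3 \\
m_4 &= \kappa_4 + 4 \kappa_3 \kappa_1 + 3 \kappa_2^2 + 6 \kappa_2 \kappa_1^2 + \kappa_1^4
\end{aligned}$$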
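The two sets of formulas are inverse to each other, which is easy to confirm in code. The sketch below is illustrative only (the function names are chosen here, not taken from the slides). It converts the first four moments to cumulants and back, using the exponential distribution with rate 1 as a test case: its raw moments are $m_k = k!$ and its cumulants are $\kappa_k = (k-1)!$.

```python
def cumulants_from_moments(m1, m2, m3, m4):
    """First four cumulants from the first four raw moments."""
    k1 = m1
    k2 = m2 - m1 ** 2
    k3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    k4 = m4 - 4 * m3 * m1 - 3 * m2 ** 2 + 12 * m2 * m1 ** 2 - 6 * m1 ** 4
    return k1, k2, k3, k4


def moments_from_cumulants(k1, k2, k3, k4):
    """First four raw moments from the first four cumulants."""
    m1 = k1
    m2 = k2 + k1 ** 2
    m3 = k3 + 3 * k2 * k1 + k1 ** 3
    m4 = k4 + 4 * k3 * k1 + 3 * k2 ** 2 + 6 * k2 * k1 ** 2 + k1 ** 4
    return m1, m2, m3, m4


if __name__ == "__main__":
    # Exponential distribution with rate 1: raw moments 1, 2, 6, 24.
    kappa = cumulants_from_moments(1, 2, 6, 24)
    print(kappa)                           # (1, 1, 2, 6)
    print(moments_from_cumulants(*kappa))  # (1, 2, 6, 24)
```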
A very simple relationship can be observed between the semi-invariants and the central moments $\mu_k = E(X - \mu)^k$, where $\mu = m_1$ as defined before. We have, in fact, $\kappa_2 = \mu_2$, $\kappa_3 = \mu_3$, $\kappa_4 = \mu_4 - 3\mu_2^2$.
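Skewness $\gamma_3$ and kurtosis $\gamma_4$ are defined as:
$$\gamma_3 = E(X - \mu)^3 / \sigma^3, \qquad \gamma_4 = E(X - \mu)^4 / \sigma^4.$$
The skewness and kurtosis determine the shape of one-dimensional distributions. The skewness of a normal distribution is 0 and the kurtosis equals 3. Since $\sigma^2 = \mu_2 = \kappa_2$, $\mu_3 = \kappa_3$ and $\mu_4 = \kappa_4 + 3\kappa_2^2$, the relation of these parameters to the cumulants is given by:
$$\gamma_3 = \frac{\kappa_3}{\kappa_2^{3/2}}, \qquad \gamma_4 = \frac{\kappa_4}{\kappa_2^{2}} + 3.$$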
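For data, skewness and kurtosis are usually estimated by plugging empirical central moments into the definitions $\gamma_3 = E(X-\mu)^3/\sigma^3$ and $\gamma_4 = E(X-\mu)^4/\sigma^4$. The following sketch is one simple way to do this, assuming the divisor $n$ for all empirical moments; the function name is illustrative.

```python
def skewness_kurtosis(data):
    """Plug-in estimates of gamma_3 and gamma_4 from a sample.

    Empirical central moments with divisor n are an assumption of this
    sketch; other conventions (bias corrections) exist.
    """
    n = len(data)
    mean = sum(data) / n
    mu2 = sum((x - mean) ** 2 for x in data) / n
    mu3 = sum((x - mean) ** 3 for x in data) / n
    mu4 = sum((x - mean) ** 4 for x in data) / n
    return mu3 / mu2 ** 1.5, mu4 / mu2 ** 2


if __name__ == "__main__":
    # A symmetric sample has skewness 0.
    print(skewness_kurtosis([-1.0, 0.0, 1.0]))
```

With this definition a large normal sample gives a kurtosis close to 3, not 0; the value $\kappa_4/\kappa_2^2$ alone is the excess over the normal case.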
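Transformations
$X \sim f_X$: what is the pdf of $Y = 3X$?
$X = u(Y)$ with a one-to-one transformation $u: \mathbb{R}^p \to \mathbb{R}^p$
Jacobian:
$$\mathcal{J} = \left( \frac{\partial x_i}{\partial y_j} \right) = \left( \frac{\partial u_i(y)}{\partial y_j} \right)$$
$$f_Y(y) = \operatorname{abs}(|\mathcal{J}|)\, f_X\{u(y)\}$$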
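Example
$(x_1, \dots, x_p)^\top = u(y_1, \dots, y_p)$
$Y = 3X \;\Rightarrow\; X = \frac{1}{3} Y = u(Y)$
$$\mathcal{J} = \begin{pmatrix} \frac{1}{3} & & 0 \\ & \ddots & \\ 0 & & \frac{1}{3} \end{pmatrix}, \qquad \operatorname{abs}(|\mathcal{J}|) = \left(\frac{1}{3}\right)^p$$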
$Y = AX + b$, $A$ nonsingular
$X = A^{-1}(Y - b)$, $\mathcal{J} = A^{-1}$
$$f_Y(y) = \operatorname{abs}(|A|^{-1})\, f_X\{A^{-1}(y - b)\}$$
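$X = (X_1, X_2)^\top \in \mathbb{R}^2$ with density $f_X(x) = f_X(x_1, x_2)$,
$$A = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad b = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
$$Y = AX + b = \begin{pmatrix} X_1 + X_2 \\ X_1 - X_2 \end{pmatrix}$$
$$|A| = -2, \qquad \operatorname{abs}(|A|^{-1}) = \frac{1}{2}, \qquad A^{-1} = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$
$$f_Y(y) = \frac{1}{2}\, f_X\left\{ \frac{1}{2}(y_1 + y_2),\; \frac{1}{2}(y_1 - y_2) \right\}$$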
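The formula can be tried on a concrete density. As an assumption made only for this illustration, let $X_1, X_2$ be independent $N(0,1)$ variables. Then $Y = (X_1 + X_2, X_1 - X_2)^\top$ is known to be $N_2(0, 2 I_2)$, so the transformed density can be compared with the closed form. Function names are illustrative.

```python
import math


def f_x(x1, x2):
    """Assumed example density: two independent N(0,1) components."""
    return math.exp(-0.5 * (x1 * x1 + x2 * x2)) / (2.0 * math.pi)


def f_y(y1, y2):
    """pdf of Y = (X1 + X2, X1 - X2) via abs(|A|^-1) f_X{A^-1 y}."""
    return 0.5 * f_x(0.5 * (y1 + y2), 0.5 * (y1 - y2))


def f_y_direct(y1, y2):
    """Closed form for this choice of f_X: density of N_2(0, 2 I_2)."""
    return math.exp(-(y1 * y1 + y2 * y2) / 4.0) / (4.0 * math.pi)


if __name__ == "__main__":
    for point in [(0.0, 0.0), (0.3, -1.2), (2.0, 1.0)]:
        print(point, f_y(*point), f_y_direct(*point))
```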
Summary: Transformations
If $X$ has pdf $f_X(x)$ then a transformed random vector $Y$, $X = u(Y)$, has pdf
$$f_Y(y) = \operatorname{abs}(|\mathcal{J}|)\, f_X\{u(y)\},$$
where $\mathcal{J}$ denotes the Jacobian $\mathcal{J} = \left( \frac{\partial u_i(y)}{\partial y_j} \right)$.
In the case of a linear relation $Y = AX + b$ the pdf's of $X$ and $Y$ are related via
$$f_Y(y) = \operatorname{abs}(|A|^{-1})\, f_X\{A^{-1}(y - b)\}.$$
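Multinormal Distribution
The pdf of a multinormal is (assuming that $\Sigma$ has full rank):
$$f(x) = |2\pi\Sigma|^{-1/2} \exp\left\{ -\frac{1}{2} (x - \mu)^\top \Sigma^{-1} (x - \mu) \right\}.$$
$X \sim N_p(\mu, \Sigma)$
Expected value is $EX = \mu$.
Variance matrix of $X$ is $\operatorname{Var}(X) = \Sigma > 0$.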
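For $p = 2$ the density can be written out without any matrix library, since $|2\pi\Sigma|^{-1/2} = 1/(2\pi\sqrt{|\Sigma|})$. The sketch below is a minimal illustration under that restriction; the function name and the example parameters are assumptions, not values from the slides.

```python
import math


def binormal_pdf(x, mu, sigma):
    """Density of N_2(mu, sigma) at x for a full-rank 2x2 covariance matrix.

    x and mu are pairs, sigma is a nested pair ((s11, s12), (s21, s22)).
    """
    (a, b), (c, d) = sigma
    det = a * d - b * c
    if det <= 0:
        raise ValueError("sigma must be positive definite (full rank)")
    dx0, dx1 = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form (x - mu)^T sigma^{-1} (x - mu), inverse written explicitly.
    q = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))


if __name__ == "__main__":
    mu = (1.0, -1.0)
    sigma = ((2.0, 0.5), (0.5, 1.0))
    print(binormal_pdf(mu, mu, sigma))           # maximum, at the mean
    print(binormal_pdf((2.0, 0.0), mu, sigma))   # smaller value away from it
```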
Geometry of the $N_p(\mu, \Sigma)$ Distribution
Density of $N_p(\mu, \Sigma)$ is constant on ellipsoids of the form
$$(x - \mu)^\top \Sigma^{-1} (x - \mu) = d^2$$
If $X \sim N_p(\mu, \Sigma)$, then the variable $Y = (X - \mu)^\top \Sigma^{-1} (X - \mu)$ is $\chi^2_p$ distributed, since the Mahalanobis transformation $Z = \Sigma^{-1/2}(X - \mu) \sim N_p(0, I_p)$ and $Y = Z^\top Z = \sum_{j=1}^{p} Z_j^2$.
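A small simulation illustrates the $\chi^2_p$ statement for $p = 2$. The sketch assumes a closed-form square root that is valid only for symmetric positive definite $2 \times 2$ matrices, $\Sigma^{1/2} = (\Sigma + \sqrt{|\Sigma|}\, I) / \sqrt{\operatorname{tr}\Sigma + 2\sqrt{|\Sigma|}}$, and example values for $\mu$ and $\Sigma$ chosen here. Since $E(\chi^2_p) = p$, the average squared Mahalanobis distance should be close to 2.

```python
import math
import random


def sqrt_spd_2x2(s):
    """Symmetric square root of a symmetric positive definite 2x2 matrix."""
    (a, b), (_, d) = s
    root_det = math.sqrt(a * d - b * b)
    t = math.sqrt(a + d + 2.0 * root_det)
    return ((a + root_det) / t, b / t), (b / t, (d + root_det) / t)


def mahalanobis_sq(x, mu, s):
    """(x - mu)^T s^{-1} (x - mu) for a symmetric 2x2 matrix s."""
    (a, b), (_, d) = s
    det = a * d - b * b
    dx0, dx1 = x[0] - mu[0], x[1] - mu[1]
    return (d * dx0 * dx0 - 2.0 * b * dx0 * dx1 + a * dx1 * dx1) / det


def mean_sq_distance(mu, s, n=20000, seed=0):
    """Average squared Mahalanobis distance of n draws X = mu + s^{1/2} Z."""
    rng = random.Random(seed)
    r = sqrt_spd_2x2(s)
    total = 0.0
    for _ in range(n):
        z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = (mu[0] + r[0][0] * z0 + r[0][1] * z1,
             mu[1] + r[1][0] * z0 + r[1][1] * z1)
        total += mahalanobis_sq(x, mu, s)
    return total / n


if __name__ == "__main__":
    sigma = ((2.0, 0.5), (0.5, 1.0))   # illustrative values
    print(mean_sq_distance((1.0, -1.0), sigma))  # close to p = 2
```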
[Figure: two panels, "normal sample" and "contour ellipses", each with axes $X_1$ and $X_2$.]
Scatterplot of normal sample and contour ellipses for $\mu = (3, 2)^\top$ and $\Sigma = \begin{pmatrix} 1.0 & -1.5 \\ -1.5 & 4.0 \end{pmatrix}$.
MVAcontnorm.xpl
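Singular Normal Distribution
Definition of the normal distribution in the case that the matrix $\Sigma$ is singular: we use its nonzero eigenvalues $\lambda_i$ and the generalized inverse $\Sigma^-$.
$\operatorname{rank}(\Sigma) = k < p$,
$$f(x) = \frac{(2\pi)^{-k/2}}{(\lambda_1 \cdots \lambda_k)^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu)^\top \Sigma^- (x - \mu) \right\}$$
$\Sigma^-$ = G-inverse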
Summary: Multinormal Distribution
The pdf of a $p$-dimensional multinormal $X \sim N_p(\mu, \Sigma)$ is
$$f(x) = |2\pi\Sigma|^{-1/2} \exp\left\{ -\frac{1}{2} (x - \mu)^\top \Sigma^{-1} (x - \mu) \right\}.$$
The contour curves of a multinormal are ellipsoids with half-lengths proportional to $\sqrt{\lambda_i}$, where $\lambda_i$ denote the eigenvalues of $\Sigma$.
The Mahalanobis transformation transforms $X \sim N_p(\mu, \Sigma)$ to $Y = \Sigma^{-1/2}(X - \mu) \sim N_p(0, I_p)$. Vice versa, one can create $X \sim N_p(\mu, \Sigma)$ from $Y \sim N_p(0, I_p)$ via $X = \Sigma^{1/2} Y + \mu$.
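Summary: Multinormal Distribution
If the covariance matrix $\Sigma$ is singular (i.e., $\operatorname{rank}(\Sigma) < p$) then it defines a singular normal distribution. The density of a singular normal distribution is given by
$$\frac{(2\pi)^{-k/2}}{(\lambda_1 \cdots \lambda_k)^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu)^\top \Sigma^- (x - \mu) \right\},$$
where $\Sigma^-$ denotes the G-inverse of $\Sigma$.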
Limit Theorems
Central Limit Theorem describes the (asymptotic) behaviour of the sample mean:
$X_1, X_2, \dots, X_n$ i.i.d. with $X_i \sim (\mu, \Sigma)$
$$\sqrt{n}(\bar{x} - \mu) \xrightarrow{\mathcal{L}} N_p(0, \Sigma) \quad \text{for } n \to \infty$$
The CLT can be easily applied for testing. Normal distribution plays a central role in statistics.
[Figure: two panels, "Asymptotic Distribution, N=5" and "Asymptotic Distribution, N=35"; x-axis "1000 Random. Samples", y-axis "Estimated and Normal Density".]
The CLT for Bernoulli distributed random variables. Sample size $n = 5$ (left) and $n = 35$ (right).
MVAcltbern.xpl
The CLT in the two-dimensional case. Sample size $n = 5$ (left) and $n = 85$ (right).
MVAcltbern2.xpl
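The Bernoulli experiment behind these figures can be imitated with a short simulation. This is not the XploRe quantlet named above, only a rough sketch: the success probability 0.5, the seed and the function name are assumptions. It draws 1000 samples of size $n$, standardizes each sample mean as $\sqrt{n}(\bar{x} - p)/\sqrt{p(1-p)}$, and reports the mean and variance of the standardized values, which should approach 0 and 1.

```python
import math
import random


def standardized_bernoulli_means(n, prob=0.5, reps=1000, seed=1):
    """Standardized sample means of `reps` Bernoulli(prob) samples of size n."""
    rng = random.Random(seed)
    sd = math.sqrt(prob * (1.0 - prob))
    values = []
    for _ in range(reps):
        xbar = sum(1 if rng.random() < prob else 0 for _ in range(n)) / n
        values.append(math.sqrt(n) * (xbar - prob) / sd)
    return values


def mean_and_variance(values):
    """Empirical mean and variance (divisor n) of a list of numbers."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / len(values)
    return m, v


if __name__ == "__main__":
    for n in (5, 35):
        print(n, mean_and_variance(standardized_bernoulli_means(n)))
```

For $n = 5$ the standardized mean takes only six distinct values, so a density estimate looks lumpy; for $n = 35$ it is already close to the normal curve.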
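$\hat{\Sigma}$ a consistent estimator of $\Sigma$: $\hat{\Sigma} \xrightarrow{P} \Sigma$.
$\bar{x}$ is asymptotically normal:
$$\sqrt{n}\, \hat{\Sigma}^{-1/2} (\bar{x} - \mu) \xrightarrow{\mathcal{L}} N_p(0, I_p)$$
Confidence interval for (univariate) mean $\mu$
$X_i \sim N(\mu, \sigma^2)$:
$$\sqrt{n}\, \frac{\bar{x} - \mu}{\hat{\sigma}} \xrightarrow{\mathcal{L}} N(0, 1)$$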
1-28
Dene u1/2 as the 1 Then we get the
/2 quantile of the N (0, 1) following 1 condence interval: =

x
distribution.
C1 P (
u1/2 , x
+
for
n n
u1/2
C1 ) 1
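The interval $\bar{x} \pm \hat{\sigma} u_{1-\alpha/2} / \sqrt{n}$ takes a few lines with Python's `statistics` module. The sketch below assumes $\hat{\sigma}$ is the sample standard deviation with divisor $n - 1$ (the slides do not fix the divisor); the function name is illustrative.

```python
import math
from statistics import NormalDist, mean, stdev


def asymptotic_ci(data, alpha=0.05):
    """Asymptotic 1 - alpha confidence interval for a univariate mean."""
    n = len(data)
    xbar = mean(data)
    sigma_hat = stdev(data)                      # divisor n - 1 (assumption)
    u = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # N(0,1) quantile
    half_width = u * sigma_hat / math.sqrt(n)
    return xbar - half_width, xbar + half_width


if __name__ == "__main__":
    print(asymptotic_ci([1.0, 2.0, 3.0, 4.0, 5.0]))
```

The coverage statement is asymptotic, so for a sample as small as the one in the usage line the nominal level is only approximate.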
[Figure: "EDF and CDF, n=100"; x-axis x, y-axis edf(x), cdf(x).]
The standard normal cdf and the empirical distribution function for $n = 100$.
MVAedfnormal.xpl
[Figure: "EDF and CDF, n=1000"; x-axis x, y-axis edf(x), cdf(x).]
The standard normal cdf and the empirical distribution function for $n = 1000$.
MVAedfnormal.xpl
[Figure: "EDF and 2 bootstrap EDFs, n=100"; x-axis x, y-axis edfs{1..3}(x).]
The cdf $F_n$ and two bootstrap cdf's $F_n^*$.
MVAedfbootstrap.xpl
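Bootstrap confidence intervals
Empirical distribution function (edf):
$$F_n(x) = n^{-1} \sum_{i=1}^{n} I(x_i \le x)$$
$X_i \sim F$, $X_i^* \sim F_n$
$\bar{x}^*$ = mean of bootstrap sample
$$\sup_u \left| P^*\left( \frac{\sqrt{n}(\bar{x}^* - \bar{x})}{\hat{\sigma}^*} < u \right) - P\left( \frac{\sqrt{n}(\bar{x} - \mu)}{\hat{\sigma}} < u \right) \right| \xrightarrow{a.s.} 0$$
Construction of Confidence Intervals possible! The unknown distribution of $\bar{x}$ can be approximated by the known distribution of $\bar{x}^*$.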
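One common way to turn this approximation into an interval is the bootstrap-t construction: resample from $F_n$, compute $\sqrt{n}(\bar{x}^* - \bar{x})/\hat{\sigma}^*$ for each resample, and use the empirical quantiles of these values in place of the normal quantiles. The slides do not spell out a specific algorithm, so the sketch below is one possible reading; the function name, the number of resamples and the seed are assumptions.

```python
import math
import random
from statistics import mean, stdev


def bootstrap_t_ci(data, alpha=0.05, reps=1000, seed=0):
    """Bootstrap-t confidence interval for the mean (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(data)
    xbar, sigma_hat = mean(data), stdev(data)
    t_stats = []
    for _ in range(reps):
        # Draw X_1^*, ..., X_n^* from the empirical distribution F_n.
        sample = [rng.choice(data) for _ in range(n)]
        s_star = stdev(sample)
        if s_star == 0.0:
            continue  # degenerate resample, statistic undefined
        t_stats.append(math.sqrt(n) * (mean(sample) - xbar) / s_star)
    t_stats.sort()
    lo_q = t_stats[int((alpha / 2.0) * len(t_stats))]
    hi_q = t_stats[min(len(t_stats) - 1, int((1.0 - alpha / 2.0) * len(t_stats)))]
    scale = sigma_hat / math.sqrt(n)
    # Invert lo_q < sqrt(n)(xbar - mu)/sigma_hat < hi_q for mu.
    return xbar - hi_q * scale, xbar - lo_q * scale


if __name__ == "__main__":
    data = [float(i % 7) + 0.1 * i for i in range(30)]  # made-up sample
    print(bootstrap_t_ci(data))
```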
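Transformation of Statistics
If $\sqrt{n}(t - \mu) \xrightarrow{\mathcal{L}} N_p(0, \Sigma)$ and if $f = (f_1, \dots, f_q)^\top : \mathbb{R}^p \to \mathbb{R}^q$ are real valued functions which are differentiable at $\mu \in \mathbb{R}^p$, then $f(t)$ is asymptotically normal with mean $f(\mu)$ and covariance $\mathcal{D}^\top \Sigma \mathcal{D}$, i.e.,
$$\sqrt{n}\{f(t) - f(\mu)\} \xrightarrow{\mathcal{L}} N_q(0, \mathcal{D}^\top \Sigma \mathcal{D}) \quad \text{for } n \to \infty,$$
where
$$\mathcal{D} = \left. \left( \frac{\partial f_j}{\partial t_i} \right)(t) \right|_{t = \mu}$$
is the $(p \times q)$ matrix of all partial derivatives.
This theorem can be applied e.g. to find the variance stabilizing transformation.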
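Example
Suppose
$$\{X_i\}_{i=1}^{n} \sim (\mu, \Sigma); \quad \mu = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} 1 & 0.5 \\ 0.5 & 1 \end{pmatrix}, \quad p = 2.$$
We have by CLT for $n \to \infty$:
$$\sqrt{n}(\bar{x} - \mu) \xrightarrow{\mathcal{L}} N(0, \Sigma).$$
The distribution of $\begin{pmatrix} \bar{x}_1^2 - \bar{x}_2 \\ \bar{x}_1 + 3\bar{x}_2 \end{pmatrix}$?
This means to consider $f = (f_1, f_2)^\top$ with
$$f_1(x_1, x_2) = x_1^2 - x_2, \quad f_2(x_1, x_2) = x_1 + 3x_2, \quad q = 2.$$
Then $f(\mu) = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$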
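and
$$\mathcal{D} = (d_{ij}), \quad d_{ij} = \left. \left( \frac{\partial f_j}{\partial x_i} \right) \right|_{x = \mu} = \left. \begin{pmatrix} 2x_1 & 1 \\ -1 & 3 \end{pmatrix} \right|_{x = 0} = \begin{pmatrix} 0 & 1 \\ -1 & 3 \end{pmatrix}.$$
We have the covariance
$$\underbrace{\begin{pmatrix} 0 & -1 \\ 1 & 3 \end{pmatrix}}_{\mathcal{D}^\top} \underbrace{\begin{pmatrix} 1 & \frac{1}{2} \\ \frac{1}{2} & 1 \end{pmatrix}}_{\Sigma} \underbrace{\begin{pmatrix} 0 & 1 \\ -1 & 3 \end{pmatrix}}_{\mathcal{D}} = \underbrace{\begin{pmatrix} 1 & -\frac{7}{2} \\ -\frac{7}{2} & 13 \end{pmatrix}}_{\mathcal{D}^\top \Sigma \mathcal{D}}$$
This yields
$$\sqrt{n} \begin{pmatrix} \bar{x}_1^2 - \bar{x}_2 \\ \bar{x}_1 + 3\bar{x}_2 \end{pmatrix} \xrightarrow{\mathcal{L}} N_2\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & -\frac{7}{2} \\ -\frac{7}{2} & 13 \end{pmatrix} \right).$$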
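The matrix product in this example is quick to verify by machine. The sketch below uses plain nested lists; the helper names are illustrative. It takes $\mathcal{D}$, with entries $\partial f_j / \partial x_i$ evaluated at $x = 0$ for $f_1 = x_1^2 - x_2$ and $f_2 = x_1 + 3x_2$, together with $\Sigma$ with unit variances and covariance 0.5, and computes $\mathcal{D}^\top \Sigma \mathcal{D}$.

```python
def transpose(a):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def delta_method_cov(d, sigma):
    """Asymptotic covariance D^T Sigma D of sqrt(n){f(t) - f(mu)}."""
    return matmul(matmul(transpose(d), sigma), d)


# d[i][j] = partial f_j / partial x_i at x = (0, 0):
# f_1 = x_1^2 - x_2, f_2 = x_1 + 3 x_2.
D = [[0.0, 1.0],
     [-1.0, 3.0]]
SIGMA = [[1.0, 0.5],
         [0.5, 1.0]]

if __name__ == "__main__":
    print(delta_method_cov(D, SIGMA))  # [[1.0, -3.5], [-3.5, 13.0]]
```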
Summary: Limit Theorems
If $X_1, \dots, X_n$ are i.i.d. random vectors with $X_i \sim (\mu, \Sigma)$ then the distribution of $\sqrt{n}(\bar{x} - \mu)$ is asymptotically $N(0, \Sigma)$ (Central Limit Theorem).
If $X_1, \dots, X_n$ are i.i.d. random variables with $X_i \sim (\mu, \sigma^2)$ then an asymptotic confidence interval can be constructed by the CLT:
$$\bar{x} \pm \frac{\hat{\sigma}}{\sqrt{n}}\, u_{1-\alpha/2}.$$
Summary: Limit Theorems
For small sample sizes the Bootstrap improves the precision of this confidence interval. The Bootstrap estimates $\bar{x}^*$ have the same asymptotic limit.
If $t$ is a statistic that is asymptotically normal, i.e., $\sqrt{n}(t - \mu) \xrightarrow{\mathcal{L}} N_p(0, \Sigma)$, then this holds also for a function $f(t)$, i.e., $\sqrt{n}\{f(t) - f(\mu)\}$ is asymptotically normal.
