Professional Documents
Culture Documents
Multivariate Distributions
Paola Palmitesta Corrado Provasi
Department of Quantitative Methods Department of Statistical Sciences
University of Siena University of Padova
Abstract
The evolution of statistical inference in the last years has been induced also by the
development of new computational tools which have led both to the solution of classical
statistical problems and to the implementation of new methods of analysis. In this
framework, a relevant computational tool is given by simulation which leads to deal
with methodological problems not solvable analytically. Simulation is frequently used
for the generation of variates from known distributions with Monte Carlo methods for
the study of the behavior of statistics for which the sampling distribution is unknown.
Moreover, Monte Carlo algorithms have been developed which can be also used to ap-
proximate the stochastic integrals, as in bayesian statistical analysis and in optimization
problems. The application of such algorithms needs the application of advanced com-
putational tools and, particularly, computer algebra systems are surely one of the most
suitable tools for the simulation of complex stochastic models. In this paper we present
the methods used for the implementation of some procedures which use Monte Carlo
algorithms to obtain random vectors from multivariate continuous distributions in a
computer algebra system, specifically Mathematica.
Keywords: Random Vector Generators; Continuous Multivariate Distributions; Monte
Carlo Algorithms; Algebraic Software.
1 Introduction
The evolution of statistical inference in the last years has been induced also by the develop-
ment of new computational tools which have led both to the solution of classical statistical
problems and to the implementation of new methods of analysis. In literature many exam-
ples of the recent interrelation between numeric computation and statistics can be found, as
in Thisted (1988), Fishman (1996), Tanner (1996) and Gentle (1998). In this framework, a
relevant computational tool consists of simulation which leads to deal with methodological
problems not solvable analytically.
Simulation is frequently used for the generation of variates from known distributions
with Monte Carlo methods for the study of the behavior of statistics with unknown sam-
pling distribution. Moreover, Monte Carlo algorithms have been developed which can be
also used to approximate the stochastic integrals, as in bayesian statistical analysis and in
optimization problems (for a review see Ripley, 1987).
The application of such algorithms needs the application of advanced computational
tools and, particularly, computer algebraic systems are surely one of the most suitable tools
for the simulation of complex stochastic models. In statistics, computer algebraic systems
have been used in various contexts; in particular, among others, Heller (1991) shows how
1
computer algebraic systems can be used in a wide variety of statistical problems; Kendall
(1988 and 1990) provides some procedures for the symbolic computation of expressions
used in the analysis of the diffusion of euclidean forms; Silverman and Young (1987) use
symbolic computation in order to study some bootstrap distributions and Venables (1985)
in order to obtain maximum likelihood estimates in particular cases; Andrews and Stafford
(1993) provide some procedures for the symbolic computation of asymptotic expansions in
statistics; Provasi (1996) presents some symbolic procedures applicable to compute moments
of order statistics of some frequently used random variables (see also Varian, 1992 and 1996,
for applications in econometrics).
In this paper we consider the application of the algebraic software Mathematica (Wol-
fram, 1996) to obtain random vectors from continuous multivariate distributions with Monte
Carlo methods. In general, the generation of multivariate distributions is not easily imple-
mented, because the usual method based on the inverse of the cumulative distribution
function used with univariate distributions can not be applied. This problem is stated and
addressed in Section 2. The functions built in Mathematica to generate random vectors from
specific distributions are presented and discussed in Section 3, while in Section 4 we present
some computational algorithms to generate random vectors from multivariate distributions
with given marginals and correlation matrix. Section 5 contains concluding remarks.
fX1 ,...,Xp (x1 , . . . , xp ) = fX1 (x1 )fX2 |X1 (x2 |x1 ) · · · fXp |X1 ,...Xp−1 (xp |x1 , . . . , xp−1 ).
Parrish (1990) used this method to generate random vectors from Pearson multivariate
distributions.
When the computation of the conditional distribution of X is difficult, a transformation
of the vector can be considered. In this case, X is represented as a function of independent
univariate distributions. Notwithstanding the apparent easiness for applying this method,
the search of a particular transformation to generate X could be difficult when X is specified
by its probability density function fX1 ,...,Xp .
Other approaches for generating multivariate distributions are based either on the ex-
tension to the multivariate case of the rejection method or on iterative methods. Among
2
them, we cite importance sampling, in which sampling points are concentrated in the most
relevant regions for X rather then in the whole region, and methods based on Markov
chains. For the latter methods, the idea is to simulate independent realizations forming
an irreducible Markov chain, having as stationary distribution the distribution under study
(for a review see Boswell et al., 1993).
Referring mainly on the transformation to independent forms, in the next two sections
we present the methods we used in the construction of Mathematica functions, in order to
generate random vectors from many multivariate continuous distributions (some of these
methods can be found in extended form in Johnson (1987)).
where distribution is the name of one the distributions in Tab. 1 with the expected para-
meters and n is the number of random vectors to be generated. These functions have been
organized in the package MultiRand1 .
Multiuniform Distribution. The density function of the random vector U(p) = (U1 , . . . ,
Up )0 uniformly distributed on the unit sphere in Rp is given by
p
!−1 p
Γ(p/2) X X
fU1 ,...,Up (u1 , . . . , up ) = 1− u2i , u2i < 1, ui > 0, i = 1, . . . , p,
π p/2 i=1 i=1
where Γ(·) is the gamma function. A technique for the generation of random vectors from
U(p) is based on the transformation (Muller, 1959)
Xi
Ui = , i = 1, . . . , p,
(X1 + · · · + Xp )1/2
where X1 , . . . , Xp are independent standard normal variates (see Tashiro, 1977, for some
others methods for the generation of random vectors from U(p) ).
1
The package MultiRand.m can be downloaded as a zipped file at the following URL:
http://turing.dmq.unisi.it/multirand.zip.
In MultiRand.m the generation of variates from univariate distributions is done using the package Continu-
ousDistribution.m .
3
Table 1: Specific Multivariate Distributions in the Mathematica package MultiRand.m
MultiuniformDistribution[p] multivariate uniform distribution on the the unit p-
sphere.
DiricheletDistribution[alpha] Dirichelet distribution with parameter vector α.
MultinormalDistribution multivariate normal distribution with mean vector µ
[mu,Sigma] and Σ.
MultivariateChiSquare multivariate chi square distribution with m degrees of
Distribution[m,Sigma] freedom and scale matrix Σ.
MultivariateFDistribution multivariate F distribution with (m, k) degrees of free-
[m,k,Sigma] dom and scale matrix Σ.
MultivariateTDistribution multivariate t distribution with m degrees of freedom,
[m,mu,Sigma] mean vector µ and scale matrix Σ.
MultivariateJohnson Distribu- multivariate Johnson distribution with mean vector µ,
tion[mu,sigma,a,b,R,option] standard deviation vector σ, skewness indices vector
a, kurtosis indices vector b and with R that is either
a correlation matrix or a Spearman’s rank correlation
matrix or Kendall’s rank correlation matrix according
to the value given to the logical variable option.
MultivariateBetaDistribution multivariate beta distribution with parameter vectors
[alpha,beta] α and β.
MultivariateGammaDistribution multivariate gamma distribution with shape parameter
[theta,lambda,gamma] vector θ, scale parameter vector λ and location para-
meter vector γ.
MultivariateExtremeValue multivariate extreme values distribution with position
Distribution[mu,sigma,rho] parameter vector µ, scale parameter vector σ and de-
pendence parameter ρ.
MultivariateInverseGaussian multivariate inverse gaussian distribution with parame-
Distribution[chi,psi,alpha] ter vectors χ and α and scalar parameter φ.
MultivariateUniform p-dimensional uniform distribution with scalar parame-
Distribution[p,alpha] ter α.
MultivariateBurrDistribution multivariate Burr distribution with parameter vectors
[c,d,k] c and d and scalar parameter k.
MultivariateParetoDistribution multivariate Pareto distribution with location parame-
[theta,alpha] ter vector θ and scalar parameter α.
MultivariateLogisticDistribution p-dimensional logistic distribution with scalar parame-
[p,alpha] ter α.
MultivariatePearsonTypeII multivariate Pearson Type II distribution with shape
Distribution[v,mu,Sigma] parameter v, mean vector µ and scale matrix Σ.
MultivariatePearsonTypeVII multivariate Pearson Type VII distribution with shape
Distribution[v,mu,Sigma] parameter v, mean vector µ and scale matrix Σ.
MultivariatePowerExponential multivariate power exponential distribution with shape
Distribution[beta,mu,Sigma] parameter β, mean vector µ and scale matrix Σ.
4
Dirichlet Distribution. Let Y1 , . . . , Yp , Yp+1 be independent standard gamma random
variables with shape parameters αi > 0 for i = 1, . . . , p, p + 1, and let
Yi
Xi = , i = 1, . . . , p. (1)
Y1 + · · · + Yp + Yp+1
Then, the random vector X = (X1 , . . . , Xp )0 has a Dirichlet distribution with density func-
tion given by (cf. Fang, Kotz and Ng, 1990, p. 17)
p+1 p+1
−1
xαi i −1 ,
Y X
fX1 ,...;Xp (x1 , . . . , xp ) = (Bp (α) xi = 1, xi > 0, i = 1, . . . , p, p + 1,
i=1 i=1
Multinormal and Related Distributions. Let the distribution of the random vector
X = (X1 , . . . , Xp )0 be a multivariate normal Np (µ, Σ) with mean vector µ = (µ1 , . . . , µp )0
and covariance matrix Σ = (σij ). Moreover, let the lower triangular matrix A be obtained
by the Cholesky decomposition AA0 = Σ. Then, given the random vector Z =(Z1 , . . . , Zp )0
with independent standard normal components, the simulation of Np (µ, Σ) can be run using
the transformation X = µ + AZ (Scheuer and Stoller, 1962).
A multivariate distribution with chi-squared marginals can be obtained as a func-
tion of multinormal random vectors. In fact, let the independent random vectors Xi =
(X1j , . . . , Xpj )P0 , j = 1, 2, . . . , m, identically distributed as multivariate normal N (0, Σ)
p
m 2 /σ for i = 1, . . . , p. Then, the joint distribution of (Y , . . . , Y )
and let Yi = X
j=1 ij ii 1 p
is characterized by the scale matrix Σ and has chi-squared marginals with m degrees of
freedom (cf. Mardia, Kent and Bibby, 1979, p. 67). Moreover, let S 2 be a chi-squared
random variable with k degrees of freedom independent from Yi and let Fi = k Yi /(m S 2 )
for i = 1, . . . p. The joint distribution of (F1 , . . . , Fp ) is called multivariate F distribution
with (m, k) degrees of freedom and covariance matrix Σ of the accompanying multivariate
normal distribution (Schuurmann, Krishnaiah and Chattopadhyay, 1975). We can note
that the multivariate chi-squared distribution can be obtained by the main diagonal of the
Wishart matrix generated with the Bartlett algorithm (Bartlett, 1933).
Also a random vector distributed as a multivariate t distribution can be written as a
function of a multinormal random vector. As a matter of fact, if Z is a multivariate normal
Np (0, Σ) and S 2 is a chi-squared random variable with m degrees of freedom independent
√
from Z, then the joint distribution of (T1 , . . . , Tp ), where Ti = ( vZi /S) + µi for i =
1, . . . , p, has a multivariate t distribution with m degrees of freedom, mean vector µ =
(µ1 , . . . , µp )0 and scale matrix Σ. When Σ = I, where I is the identity matrix, and m = 1,
the multivariate Student’s t distribution is equal to the multivariate Cauchy distribution
(see Johnson and Kotz, 1972, Cap. 37, for details on the various types of multivariate
Student’s t distribution).
Z = γ + δ g[(X − ξ)/λ]
5
is true, where Z indicates the standard normal distribution, γ ∈ R and δ > 0 are shape
parameters, ξ ∈ R is a location parameter, λ > 0 is a scale parameter, and g(·) is one the
following transformations:
ln(y), for the SL (lognormal) family,
−1
sinh (y), for the SU (unbounded) family,
g(y) = (2)
ln[y/(1 − y)], for the SB (bounded) family,
y, for the SN (normal) family.
An important characteristic of the Johnson system, useful in Monte Carlo studies, is that
the family to which X belongs can be unambiguously identified by skewness and kurtosis,
as in Hill, Hill and Holder (1974). Moreover, we can obtain random variables from X at
first by the generation of variates from Z and then applying the inverse translation
X = ξ + λ g −1 [(Z − γ)/δ],
where z
e , for the SL (lognormal) family,
sinh(z), for the SU (unbounded) family,
g −1 (z) = (3)
1/(1 + e−z ),
for the SB (bounded) family,
z, for the SN (normal) family.
Johnson (1949b) has proposed a bivariate distribution based on the univariate transla-
tion system which can be easily extended to higher dimensions. Then, the random vector
X = (X1 , . . . , Xp )0 with p components has a Johnson multivariate distribution if the trans-
formation
Z = γ + δ g[λ−1 (X − ξ)] ∼ Np (0, Σ), (4)
is true, where Σ = R = (ρij ) is a correlation matrix, g[(y1 , . . . , yp )0 ] = [g1 (y1 ), . . . , gp (yp )]0
identifies the family to which the marginals of the translation belong, γ = (γ1 , . . . , γp )0 ,
δ =diag(δ1 , . . . , δp ), ξ = (ξ1 , . . . , ξp )0 and λ =diag(λ1 , . . . , λp ) are, respectively, the shape,
location and scale parameters of the components Xi for i = 1, . . . , p. Therefore, having
determined the families of the marginals of X on the basis of the first four moments,
the generation of the random vectors is done generating Z from a multivariate normal
distribution Np (0, Σ) and then applying the inverse translation
X = ξ + λ g−1 [δ −1 (Z − γ)],
using the vectors of parameters previously determined and the vector of the inverse transla-
tion functions g−1 [(z1 , . . . , zp )0 ] = [g1−1 (z1 ), . . . , gp−1 (zp )], where gi−1 (·) is defined by (3) for
i = 1, . . . , p.
This method takes to the generation of random vectors with the same moments of the
marginals, but not necessarily with the same correlation matrix, because the (4) is not
linear. This suggests to measure the association among the components of X by means
of the Spearman’s rank correlation matrix RS = (ρ̃ij ) or the Kendall’s rank correlation
matrix Rτ = (τij ) whose elements, as known, are not only invariant regarding to non linear
transformations but, in the normal case, are also connected to the correlation coefficient by
the relations
π
ρij = 2 sin ρ̃ij
6
and
6
hπ i
ρij = sin τij .
2
As a consequence, this method takes to the generation of random vectors from X with
the same moments of the marginals and given matrix RS or Rτ . In the next section we
show an extension of the multivariate Johnson distribution which leads to the generation
of random vectors with given correlation matrix.
Multivariate Beta and Gamma Distributions. Mathai and Moschopoulos (1991 and
1993) have introduced two multivariate distributions with beta and gamma marginals which
can be easily indirectly simulated. Let B1 , . . . , Bp be beta random variables with density
function
Γ(αi + βi ) αi −1
fBi (bi ) = b (1 − bi )βi −1 ,
Γ(αi )Γ(βi ) i
with bi ∈ (0, 1) parameters αi > 0 and βi > 0 for i = 1, . . . , p, and let U be a uniform random
variable with support on (0, 1]. Moreover, let U , B1 , . . . , Bp be mutually independent and
let
Xi = Bi U 1/(αi +βi ) , i = 1, . . . , p. (5)
Then, the random vector X = (X1 , . . . , Xp )0 has components Beta(αi , βi + 1) for i =
1, . . . , p and the following properties can be obtained directly from the definition :
αi
E(Xi ) = ,
(αi + βi + 1)
αi (βi + 1)
Var(Xi ) = ,
(αi + βi + 1)2 ((αi + βi + 2)
αi αj
Cov(Xi , Xj ) = , i 6= j.
(αi + βi + 1)(αj + βj + 1)[(αi + βi + 1)(αj + βj + 1) − 1]
Now, consider the independent gamma random variables V0 , V1 , . . . , Vp with density
function
1
fVi (vi ) = θ (vi − γi )θi −1 exp [−(vi − γi )/λi ] ,
i
λi Γ(θi )
with vi > γi and parameters θi > 0, λi > 0 and γi ∈ R for i = 0, 1, . . . , p, and let
λi
Yi = V 0 + Vi , i = 1, . . . , p. (6)
λ0
Then, the random vector Y = (Y1 , . . . , Yp )0 has components Gamma(θ0 +θi , λi , (γ0 /λ0 )λi
+γi ) for i = 1, . . . , p, and, directly from the definition or from the moment generating
function, the following properties can be obtained:
Therefore, the generation of random vectors from X or Y simply requires the generation
of independent variates from the beta and uniform or gamma random variables and then
the application of the transformations (5) or (6). Note that for both distributions the
correlation structure is positive.
7
Multivariate Extreme Value Distribution. Suppose that the cdf of the random vector
X = (X1 , . . . , Xp )0 follows the logistic model
( " p # ρ )
X xi − µi
FX1 ,...,Xp (x1 , . . . , xp ) = exp − exp − , (7)
ρ σi
i=1
where xi ∈ R and parameters µi ∈ R and σi > 0 for i = 1, . . . , p and ρ ∈ [0, 1] measures the
dependence among the marginals. The extremes ρ → 1 and ρ → 0 correspond, respectively,
to the independence and complete dependence. Then, the components of X have an extreme
value distribution (or Gumbel distribution) with cdf
xi − µi
FXi (xi ) = exp − exp − , i = 1, . . . , p.
σi
Shi (1995) has computed the density function of X and he has proposed a transformation
which allows its simulation. Let
xi − µi
yi = exp − , i = 1, ..., p,
σi
p
!ρ
1/ρ
X
z = yi .
i=1
where
p−1 ∂Qp−1 (z, ρ)
Qp (z, ρ) = − 1 + z Qp−1 (z, ρ) − z , Q1 (z, ρ) = 1. (8)
ρ ∂z
Let
2θ ,
T1 = cos
1
Qi−1 2
Ti = 2 i = 2, . . . , p − 1,
i=1 sin θj cos θi ,
p−1
π
2
Q
Tp = j=1 sin θj , θj ∈ 0, 2 , j = 1, . . . , p − 1.
yi = z Tiρ , i = 1, . . . , p,
or
xi = µi − σi (log z + ρ log Ti ), i = 1, . . . , p. (9)
8
Then, the density function of the random vector (Z, Θ1 , Θ2 , . . . , Θp−1 ) is given by (Shi,
1995)
p−1
Y
gZ,θ1 ,...θp−1 (z, θ1 , . . . , θp−1 ) = hz (z) gθj (θj ),
j=1
where
ρp−1
hZ (z) = Qp (z, ρ) e−z , z > 0,
(p − 1)!
h πi
gθj (θj ) = 2(p − j) cos θj (sin θj )2(p−j)−1 , θj ∈ 0, , j = 1, . . . , p − 1,
2
therefore the random variables Z, Θ1 , . . . , Θp−1 are independent among them.
Because Qp (z, ρ) is a p − 1-polynomial of z, if we write the density function of Z as
p
X
hZ (z) = qp,j ϕ(z, j),
j=1
where
1 k−1 −z
ϕ(z, k) = z e , z > 0,
Γ(k)
we note that Z has a mixture gamma distribution with shape parameter k and weights qp,j
depending on Qp (z, α). It is easy to obtain the following recurrence relations from the (8):
Γ(p − ρ)
qp,1 = ,
Γ(p)Γ(1 − ρ)
(p − 1)qp,j = (p − 1 − jρ)qp−1,j + ρ(j − 1)qp−1,j−1 , j = 2, . . . , p − 1,
p−1
qp,p = ρ .
Now, the random vectors can be easily generated from the distribution (7) with mar-
ginals distributed as an extreme value distribution. As a matter of fact, applying the inverse
transformation method, sin2 Θj , cos2 Θj can be obtained, respectively, from U 1/(p−j) and
1 − U 1/(p−j) , where U is the uniform distribution on (0, 1], while the variates from the ran-
dom variable Z with mixture gamma distribution can be generated using the composition
method. Therefore, using the (9), the simulated random vectors from X are obtained.
with parameters χ > 0 and ψ > 0. Now, let us suppose that the components of the random
vector X = (X1 , . . . , Xp )0 are defined using the additive random effects model
Xi = αi T + Vi , i = 1, . . . , p, (10)
9
where T ∼InverseGaussian(χ, ψ) and V1 , . . . , Vp , where Vi ∼InverseGaussian(χi , ψ/αi ), are
independent among them and α1 , . . . , αp are positive parameters. Then, X has InverseGaus-
√ √
sian( χαi + χi )2 , ψ/αi ), i = 1, . . . , p, marginals and from the moment generating function
the following properties can be obtained:
s √ √
( χαi + χi )2
E(Xi ) = ,
ψ/αi
s √ √
( χαi + χi )2
Var(Xi ) = ,
(ψ/αi )3
αi αj χ2
Cov(Xi , Xj ) = p , i 6= j.
(χψ)3
Therefore, the generation of random vectors from X simply requires at first the genera-
tion of independent variates from inverse gaussian random variables and then the application
of the transformation (10). In this case too, note that the model used to build the mul-
tivariate distribution leads to a positive correlation structure, because it depends only on
positive parameters.
Cook and Johnson (1981) have studied this family of distributions and have shown that
lim FU1 ,...,Up (u1 , . . . , up ) = min [u1 , . . . , up ]
α→0
and
p
Y
lim FU1 ,...,Up (u1 , . . . , up ) = ui ;
α→∞
i=1
therefore the correlation among the components of U approaches to its maximum when
α → 0 and approaches to zero when α → ∞.
A simulation of U can be run using the following algorithm: let Y1 , . . . , Yp be i.i.d.
standard exponential distributions and let the distribution of V be a standard gamma with
shape parameter α. Then, the cumulative distribution function of the random variables
Ui = [1 + Yi /V ]−α , i = 1, . . . , p,
have a cdf given by (11).
For a set of arbitrary marginal distributions, FX1 , . . . , FXp , we can define the joint cdf
as ( p )−α
X
FX1 ,...,Xp (x1 , . . . , xp ) = FXi (xi )−1/α − p + 1 . (12)
i=1
If we assume that the random vector X = (X1 , . . . , Xp )0 has a multivariate distribution
with cdf given by the equation (12), then a simulation of it can be easily implemented
inverting (U1 , . . . , Up ) with (FX−11 , . . . , FXp
−1
). In this paper we have implemented the (11)
also with the marginal distributions considered by Cook and Johnson (1981):
10
(i) Burr distribution:
Γ(p/2 + v + 1) −1/2 m
fX1 ,...,Xp (x1 , . . . , xp ) = p/2
|Σ| 1 − (x − µ)0 Σ−1 (x − µ) ,
Γ(v + 1)π
where x = (x1 , . . . , xp )0 and v > −1, p is the dimension of X. The support of the distribution
is restricted to the region (x − µ)0 Σ−1 (x − µ) ≤ 1.
The generation of random vectors from this distribution on the basis of the (13) is very
easy, because the distribution of the quadratic form R2 = (X − µ)0 Σ−1 (X − µ) is beta with
parameters p/2 and v + 1.
The random vector X = (X1 , . . . , Xp )0 has a multivariate Pearson Type VII distribution
if the density function
Γ(v) −1/2
0 −1
fX1 ,...,Xp (x1 , . . . , xp ) = |Σ| 1 + (x − µ) Σ (x − µ)
Γ(v − p/2)π p/2
11
In this case the distribution of T = R2 has density function
Γ(v)
fT (t) = tp/2−1 (1 + t)−v , t > 0.
Γ(p/2)Γ(v − p/2)
The generation of random vectors from this distribution on the basis of the (13) is very
easy, because the distribution of the random variable T = R2β is a gamma with shape
parameter p/(2β) and scale parameter 2.
12
vectors from multivariate distributions with marginals in the class of the random variables
closed respect to the sum and given correlation matrix. Sim (1993) too has proposed an
algorithm to generate Poisson and gamma random vectors with given marginals and covari-
ance matrix. In this paper, on the contrary, we have followed the approach of Prékopa and
Szántai (1978) to implement an algorithm using Mathematica which allows the generation
of random vectors from a gamma distribution with given mean vector and covariance ma-
trix. Differently from the method we followed, which always has a solution, the algorithms
by Park and Shin (1998) and Sim (1993) do not always converge.
Let X = (X1 , . . . , Xp )0 be a multivariate distribution with gamma marginals with density
function
1
fXi (xi ) = θ xiθi −1 e−xi /λi , xi > 0,
i
λi Γ(θi )
with parameters θi > 0 and λi > 0 for i = 1, . . . , p. The aim is to generate random vectors
from X when the mean vector µ = (µ1 , . . . , µp )0 and the covariance matrix Σ = (σij )
are given with positive elements. Obviously, in this case also the correlation matrix with
positive elements R = (ρij ) is defined and the parameters of Xi can be derived from
the corresponding mean and variance using the relations θi = µ2i /σii and λi = σii /µi for
i = 1, . . . , p.
Now, consider the standard multivariate Gamma distribution Y = (Y1 , . . . , Yp )0 obtained
using the transformation
Y = B−1 X,
where B =diag(λ1 , . . . , λp )0 . Then, we have that
E(Yi ) = ηi = θi ,
Var(Yi ) = δii = θi ,i = 1, . . . , p,
p
Cov(Yi , Yj ) = δij = ρij θi θj , ρij ≥ 0,
i, j = 1, . . . , p, i 6= j.
At this point we introduce the independent gamma random variables again in standard
form Z1 , . . . , Zk with shape parameters, respectively, ξ1 , . . . , ξk , k ≥ p, such that Y = TZ,
where Z = (Z1 , . . . , Zk )0 and T is an incidence matrix (0, 1) of dimension (p × k) and rank
p. Then,
E(Y) = η = Tξ,
Cov(Y) = ∆ = TΩT0 ,
with η = (η1 , . . . , ηp )0 , ∆ = (δ ij ), ξ = (ξ1 , . . . , ξk )0 and Ω =diag(ξ1 , . . . , ξk ). Therefore,
the generation of random vectors from Y to run stochastic simulations with Monte Carlo
methods is very easy, because it is sufficient to generate independent Gamma random vectors
with the usual methods and, then, multiply them by T. Therefore, the generation of random
vectors from X can be obtained by means of the transformation
X = BY.
Since µ and Σ are given, and consequently also η and ∆, the values assigned to ξ1 , . . . , ξk
should satisfy the following conditions:
Tξ = η,
T̃ξ = c,
ξ ≥ 0,
13
where c = {δ11 , δ12 , . . . , δ1p , δ22 , . . . , δpp }0 and T̃ is a matrix of dimension (p(p + 1)/2) × k
obtained multiplying the lines of T among them in the same order of the elements of c. As
a consequence of ηi = δii = θi for i = 1, . . . , p, these conditions are also satisfied solving the
following system:
T̃ξ = c,
subject to ξ ≥ 0.
The following remarks can be done.
(i) With the aim of finding a parameter vector which satisfies the constraints for a wide
domain, it is convenient to use an incidence matrix with a number of elements different
from zero at least equal to covariances. Prékopa and Szántai (1978) suggest to use
an incidence matrix with a number of column vectors equal to the number of all the
different combinations which can be chosen from p different elements (removing the
column vector composed of all 0, the columns are 2p − 1 ). Using a pattern based on
a recursive relation, Ronning (1977) has obtained three different incidence matrices
of dimension p × (k(k − 1)/2).
(ii) The given conditions can determine θ1 , . . . , θp unambiguously. If this is not the case,
i.e. there are at least two vectors ξ satisfying the conditions, there are infinite solu-
tions.
(iii) Since the solutions depend on T, it is not always possible to satisfy them. However,
it is possible to face the problem as a linear programming problem minimizing the
absolute difference between the corresponding elements of T̃ξ and c. Therefore, it is
necessary to solve the following system (Prékopa and Szántai, 1978):
Pp(p+1)/2 Pp(p+1)/2
Minimize
i=1 ui + i=1 vi ,
subject to u − v + T̃ξ = c,
u ≥ 0, v ≥ 0, ξ ≥ 0.
The function of MultiRand which allows the generation of n random vectors from a
multivariate gamma with this method and with approximately mean vector mu, covariance
matrix Sigma and incidence matrix of Prékopa and Szántai (1978) is
RandomArray[MultivariateGammaTypeDistribution[mu,Sigma],n]
14
Assume that Z = (Z1 , . . . , Zp )0 has a multivariate normal distribution with standard
marginals Zi ∼ N (0, 1) and correlation matrix R and indicate the joint cdf of Z with
GZ1 ,...,Zp (z1 , . . . , zp ). Then, the cdf
where Φ(·) indicates the cdf of the standard normal, defines a multivariate cdf known as
normal copula. Therefore, for any set of cdf of given marginals FX1 , . . . , FXp , the random
variables
X1 = FX−11 (Φ(z1 )), . . . , Xp = FX−1p (Φ(zp ))), (14)
have joint cdf
FX1 ,...Xp (x1 , . . . , xp ) = GZ1 ,...,Zp Φ−1 (FX1 (z1 )), . . . , Φ−1 (FXp (zp ) )
where µ ∈ R and σ > 0 are, respectively, position and scale parameters and ξ ∈ R is a
shape parameter. The case ξ < 0 corresponds to the Weibull distribution, ξ = 0 to the
Gumbel and ξ > 0 to the Fréchet distribution. Therefore, the function of MultiRand which
allows the generation of n random vectors from this distribution with parameter vectors of
the marginals mu, sigma and xi is given by
RandomArray[MultivariateGeneralizedExtremeValueDistribution[mu,beta,sigma,R,option],n]
15
4.3 Extension of the Multivariate Johnson Distribution
Stanfield et al. (1996) have obtained an extension of the multivariate Johnson distribution
which allows the generation of random vectors with given correlation matrix and marginals
with the first four moments given. Suppose that the random vector X =(X1 , . . . , Xp )0
has mean µX = (µX1 , . . . , µXp )0 , standard deviations matrix σ X =diag(σX1 , . . . , σXp ) and
correlation matrix RX . Therefore, let the lower triangular matrix LX = (lij ) be obtained
by the Cholesky decomposition LX L0X = RX ,
If the multivariate distribution Y = (Y1 , . . . , Yp )0 has independent standard Johnson
components, so that E(Yi ) = 0 and Var(Yi ) = 1 for i = 1, . . . , p, then
W = µX + σ X LX Y (15)
has the same mean vector and covariance matrix of X. Assume that the parameters γi , δi ,
λi and ξi of Yi are arranged in a way that the i-th component Wi of the random vector W
has the same skewness and kurtosis of Xi for i = 1, . . . , p.
Let aX and bX be vectors whose i-th element is, respectively, the skewness and kurtosis
of Xi and, in a similar way, let aY and bY be the vectors indicating the skewness and
kurtosis of the components of Y. Moreover, define the k-fold Hadamard product of LX as
(k) k ) for k = 3, 4, together with the auxiliary vector Ψ = (ψ , . . . , ψ )0 , where
LX = (lij X 1 p
p
X p
X
2 2
ψi = 6 lij lik , i = 1, . . . , p.
j=1 k=j+1
Therefore, the parameter vectors γ, δ, λ and ξ of the Johnson distribution which de-
termine the distribution of Y can be modified to satisfy the conditions
h i−1
aY = L(3) aX ,
X
h i−1
b = L(4) (bX − ΨX ).
Y X
Then, the transformed random vector W has in this case the same correlation matrix of X
and also skewness and kurtosis of Wi are same of those of Xi for i = 1, . . . , p.
The function of MultiRand which allows to obtain n random vectors from the extension of
a Johnson multivariate distribution with mean vector mu, standard deviation vector sigma,
skewness vector a and kurtosis vector b and correlation matrix R is
RandomArray[ExtensionMultivariateJohnsonDistribution[mu,sigma,a,b,R],n]
16
5 Concluding remarks
As stressed in the introduction, the study of the behavior of sampling statistics is usually
done by means of the simulation of known distributions. In this framework, it is necessary
that algorithms allowing the simulation of multivariate distribution are available and, to
this purpose, we have prepared a package in an algebraic software, specifically Mathematica,
which allows the generation of various continuous multivariate distributions.
The distributions taken into consideration can be divided into two groups. A first group
refers to specific distributions for which it is possible to write analytically the density func-
tion or the moment generating function and, as a consequence, to derive the properties
of the marginals. Such marginals, in general, are an extension to the multivariate case of
known random variables, which are used very often in statistical applications with unidi-
mensional samples. On the contrary, a second group refers to distributions with a given
dependence structure of the components and are defined in indirect form. We can say that
the high quantity of multivariate distributions, which can be generated using MultiRand,
could be sufficient to run very complex Monte Carlo experiments.
In the future we plan to implement in Mathematica other multivariate distributions,
too, for example, the multivariate Liouville distributions (cf. Fang, Kotz e Ng, 1990) and
the extension to the multivariate case of stable distributions (cf. Nolan, 1998).
6 References
Andrews, D.F. and Stafford, J.E. (1993). Tools for the symbolic computation of
asymptotic expansions. J. R. Statist. Soc. B., 55, 613-627.
Bartlett, M.S. (1933). On the theory of statistical regression, Proc. Royal Society of
Edinburg, 53, 260-283.
Boswell, M.T., Gore, S.D., Patil, G.P. and Taillie, C. (1993). The art of computer
generation of random variables, Handbook of Statistics 9, 661- 721.
Cook, R.D. and Johnson, M.E. (1981). A family of distributions for modelling non-
elliptically symmetric multivariate data, J. R. Statist. Soc. B, 43, 210-218.
Fang, Kai-Tai, Kotz S. and Ng, Kai-Wang (1990). Symmetric Multivariate and
Related Distributions, Chapman & Hall, London.
Fishman, G.S. (1996). Monte Carlo: Concepts, Algorithms, and Applications Springer,
New York.
Gentle, J.E. (1998). Random Number Generation and Monte Carlo Methods, Springer,
New York.
Gómez, E., Gómez-Villegas, M.A. and Marı́n, J.M. (1998). A multivariate gener-
alization of the power exponential family of distributions, Commun. Statist.-Theory
Meth., 27, 589-600.
17
Heller, B. (1991). Macsyma for Statistician, Wiley, New York.
Hill, I.D., Hill, R. and Holder, R.L. (1974). Fitting Johnson curves by moments,
Appl. Statist., 25, 180-189.
Johnson, N.L. (1949b). Bivariate distributions based on simple translation system, Bio-
metrika, 36, 207-304.
Kendall, W.S. (1988). Symbolic computation and the diffusion of shapes of triads, Adv.
Appl. Probab., 20, 775-797.
Kendall, W.S. (1990). Computer algebra and stochastic calculus, Notices Am. Math.
Soc., 37, 1254-1256.
Mardia, K.V., Kent, J.T. and Bibby, J.M. (1979). Multivariate Analysis, Academic
Press, London.
Mathai, A.M. and Moschopoulos, P.G. (1993). A multivariate beta model, Statistica,
53, 231-241.
Park Chul Gyu and Shin Dong Wan (1998). An algorithm for generating correlated
random variables in a class of infinitely divisible distributions, J. Statist. Comput.
Simul., 61, 127-139.
Parrish, R.S. (1990). Generating random deviates from multivariate Pearson distribu-
tions, Comp. Statist. Data Anal., 9, 283-295.
Prèkopa, A. and Szàntai, T. (1978). A new multivariate gamma distribution and its
fitting to empirical streamflow data, Water Resources Res., 14, 19-24
18
Provasi C. (1996). Using algebraic software to compute the moments of order statistics,
Computational Economics, 9, 199-213.
Scheuer, E.M. and Stoller, D.S. (1962). On the generation of normal random vectors,
Technometrics, 4, 278-281.
Schuurmann, F.J., Krishnaiah, P.R. and Chattopadhyay, A.K. (1975). Tables for
a multivariate F distribution, Sankhyā B, 37, 308-331.
Shi Daoji (1995). Multivariate extreme value distribution and its Fisher Information
Matrix, Acta Mathematica Applicata Sinica, 11, 421-428.
Silverman, B.W. and Young, A.E. (1987). The bootstrap: to smooth or not to
smooth?. Int. Statist. Rev., 74, 469-479.
Sim, C.H. (1993). Generation of Poisson and Gamma random vectors with given marginals
and covariance matrix, J. Statist. Comp. Simul., 47, 1-10.
Stanfield, P.M., Wilson, J.R., Mirka, G.A., Glasscock, N.F., Psihogios, J.P.
and Davis, J.R. (1996). Multivariate input modeling with Johnson distributions,
in Proceedings of the 1996 Winter Simulation Conference (eds. J.M. Charmes, D.J.
Morrice, D.T. Brunner and J.J. Swain), 1457-1464.
Tanner, M.A. (1996). Tools for Statistical Inference: Methods for the Exploration of
Posterior Distributions and Likelihood Functions, Springer, New York.
Tashiro, Y. (1977). On methods for generating uniform points on the surface of a sphere,
Ann. Inst. Stat. Math., 29, 295-300.
Thisted, R.A (1988). Elements of Statistical Computing, Chapman & Hall, New York.
Tiit, E.-M. (1984). Definitions of random vectors with given marginals distributions and
given correlation matrix, Tartu Riikliku Ulikooli Toimetised, 685, 21-36.
Varian, H.R. (ed.) (1992). Economic and Financial Modeling with Mathematica, Springer,
New York.
Varian, H.R (ed.) (1996). Computational Economics and Finance: Modeling and Analy-
sis with Mathematica, Springer, New York.
Venables, W.N. (1985). The multiple correlation coefficient and Fisher’s A statistic,
Austr. J. Statist., 27, 172-182.
Wolfram, S. (1996). The Mathematica Book (3rd ed.), Cambridge University Press,
Cambridge (UK).
19