
1. Lecture 2: Some Basic Distributions.

We start by extending the notions of skewness and kurtosis to random variables. Let X be a random variable with mean µ and variance σ². The skewness is defined as

E[((X − µ)/σ)³]

and the kurtosis as

E[((X − µ)/σ)⁴].

The following lemma will be useful. It tells us how to derive the pdf of a function of a random variable whose pdf is known.

Lemma 1. Let X be a continuous random variable with pdf f_X(x) and g a differentiable and strictly monotonic function. Then Y = g(X) has the pdf

f_Y(y) = f_X(g^{-1}(y)) |(d/dy) g^{-1}(y)|,

for y such that y = g(x) for some x; otherwise f_Y(y) = 0.

Proof. The proof is an easy application of the change of variables and is left as an exercise.
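
For a quick numerical sanity check of the lemma, here is a minimal sketch in Python (assuming numpy and scipy are available), taking X ∼ N(0, 1) and g(x) = e^x; it anticipates the lognormal distribution of Section 1.2.

```python
import numpy as np
from scipy import stats

# Lemma 1 with X ~ N(0, 1) and g(x) = exp(x), so g^{-1}(y) = log(y)
# and |d/dy g^{-1}(y)| = 1/y.
y = np.linspace(0.1, 5.0, 50)
f_Y = stats.norm.pdf(np.log(y)) / y   # f_X(g^{-1}(y)) |d/dy g^{-1}(y)|
# scipy's lognorm with s = 1 is exactly the law of exp(Z), Z ~ N(0, 1)
assert np.allclose(f_Y, stats.lognorm.pdf(y, s=1.0))
```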

The following is also a well-known fact.

Proposition 1. Let X_1, X_2, . . . , X_n be independent random variables. Then

Var(X_1 + · · · + X_n) = Var(X_1) + · · · + Var(X_n).

Some important notions are the characteristic function and the moment generating function of a random variable X. The characteristic function is defined as

φ(t) = E[e^{itX}], t ∈ R,

and the moment generating function, or Laplace transform, is defined as

M(t) = E[e^{tX}], t ∈ R.

The importance of these quantities is that they uniquely determine the corresponding distribution.


1.1. Normal Distribution.

Normal distributions are probably the most fundamental ones. The reason for this lies in the Central Limit Theorem, which states that if X_1, X_2, . . . are independent random variables with mean zero and variance one, then

(X_1 + · · · + X_n)/√n

converges in distribution to a standard normal distribution.

The standard normal distribution, often denoted by N(0, 1), is the distribution with probability density function (pdf)

f(x) = (1/√(2π)) e^{−x²/2}.

The normal distribution with mean µ and variance σ², often denoted by N(µ, σ²), has pdf

f(x) = (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)}.

Let

F(x) = (1/√(2π)) ∫_{−∞}^{x} e^{−y²/2} dy

be the cumulative distribution function of the N(0, 1) distribution. The q-quantile of the N(0, 1) distribution is F^{-1}(q). The (1 − α)-quantile of the N(0, 1) distribution is denoted by z_α. We will see later on that z_α is widely used for confidence intervals.

The characteristic function of a random variable X with N(µ, σ²) distribution is

E[e^{itX}] = ∫_{−∞}^{∞} e^{itx} (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)} dx = e^{iµt − σ²t²/2}.
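
This closed form is easy to check by simulation; the following is a minimal Monte Carlo sketch (assuming numpy; the values of µ, σ and t are illustrative).

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, t = 1.0, 2.0, 0.7
x = rng.normal(mu, sigma, size=1_000_000)
mc = np.mean(np.exp(1j * t * x))                   # E[e^{itX}] by simulation
exact = np.exp(1j * mu * t - sigma**2 * t**2 / 2)  # closed form above
print(abs(mc - exact))  # O(1/sqrt(N)), here roughly 1e-3
```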

1.2. Lognormal Distribution.

Consider a N(µ, σ²) random variable Z; then the random variable X = exp(Z) is said to have a lognormal distribution. In other words, X is lognormal if its logarithm log X has a normal distribution. It is easy to see that the pdf of the lognormal distribution associated to a N(µ, σ²) distribution is

f(x) = (1/(x√(2πσ²))) e^{−(log x−µ)²/(2σ²)}, x > 0.

The median of the above distribution is exp(µ), while its mean is exp(µ + σ²/2). The mean is larger than the median, which indicates that the lognormal distribution is right skewed. In fact, the larger the variance σ² of the associated normal distribution, the more skewed the lognormal distribution is.
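
These facts are easy to verify numerically; a minimal sketch with scipy follows (scipy parameterizes the lognormal by s = σ and scale = e^µ; the values of µ and σ are illustrative).

```python
import numpy as np
from scipy import stats

mu, sigma = 0.0, 1.0
X = stats.lognorm(s=sigma, scale=np.exp(mu))  # lognormal attached to N(mu, sigma^2)
print(X.median(), np.exp(mu))                 # both equal 1.0
print(X.mean(), np.exp(mu + sigma**2 / 2))    # both ~ 1.6487
print(X.stats(moments='s'))                   # positive skewness; grows with sigma
```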


Lognormal distributions are particularly important in mathematical finance, as they appear in the modelling of returns via geometric Brownian motion. We will try to sketch this relation below. For a more detailed discussion you can look at Ruppert, "Statistics and Finance: An Introduction", pp. 75-83.

The net return of an asset measures the change in the price of the asset expressed as a fraction of the initial price. For example, if P_t is the price of the asset at time t, then the net return at time t is defined as

R_t = P_t/P_{t−1} − 1 = (P_t − P_{t−1})/P_{t−1}.

The revenue from holding an asset is

revenue = initial investment × net return.

The simple gross return is

P_t/P_{t−1} = R_t + 1.

The gross return over a period of k units of time is

1 + R_t(k) = P_t/P_{t−k} = (P_t/P_{t−1}) · (P_{t−1}/P_{t−2}) · · · (P_{t−k+1}/P_{t−k}) = (1 + R_t)(1 + R_{t−1}) · · · (1 + R_{t−k+1}).

Often it is easier to work with log returns (also known as continuously compounded returns). This is

r_t = log(1 + R_t) = log(P_t/P_{t−1}).

By analogy with the above, the log return over a period of k units of time is

r_t(k) = r_t + · · · + r_{t−k+1}.
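
To make the definitions concrete, here is a short Python sketch computing net, gross and log returns from a hypothetical price series (the prices are made up for illustration).

```python
import numpy as np

p = np.array([100.0, 102.0, 101.0, 105.0])   # hypothetical prices P_0, ..., P_3
net = p[1:] / p[:-1] - 1                     # R_t
gross = 1.0 + net                            # P_t / P_{t-1}
logr = np.log(gross)                         # r_t = log(1 + R_t)
# the k-period log return is just the sum of the one-period log returns:
assert np.isclose(logr.sum(), np.log(p[-1] / p[0]))
```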


A very common assumption in finance is that the log returns at different times are independent and identically distributed.

By the definition of the return, the price of the asset at time t will be given by the formula

P_t = P_0 exp(r_t + · · · + r_1).

If the distribution of each r_i is N(µ, σ²), then the distribution of the sum in the above exponential will be N(tµ, tσ²). Therefore, the price of the asset at time t will have a lognormal distribution.

Later on, you will see that if the time increments are taken to be infinitesimal, the sum in the above exponential will approach a Brownian motion with drift, and then the price of the asset will follow exponential (geometric) Brownian motion.

1.3. Exponential, Laplace, Gamma. The exponential distribution with scale parameter θ > 0, often denoted by Exp(θ), has pdf

e^{−x/θ}/θ, x > 0,

mean θ and standard deviation θ. The Laplace distribution with mean µ and scale parameter θ has pdf

e^{−|x−µ|/θ}/(2θ), x ∈ R.

The standard deviation of the Laplace distribution is √2 θ.

The Gamma distribution with scale parameter θ and shape parameter α has pdf

(θ^{−α}/Γ(α)) x^{α−1} e^{−x/θ}, x > 0,

with the normalisation

Γ(α) = ∫_0^∞ u^{α−1} e^{−u} du, α > 0,

which is the so-called gamma function. Notice that when α = 1 one recovers the exponential distribution with scale parameter θ.
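
The α = 1 case is easy to confirm with scipy; a minimal sketch, with an arbitrary illustrative θ:

```python
import numpy as np
from scipy import stats

# For shape alpha = 1 the Gamma pdf reduces to the exponential pdf, as noted above.
theta = 2.0
x = np.linspace(0.1, 10.0, 50)
assert np.allclose(stats.gamma.pdf(x, a=1.0, scale=theta),
                   stats.expon.pdf(x, scale=theta))
```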


Proposition 2. Consider two independent random variables X_1, X_2, gamma distributed with shape parameters α_1, α_2 respectively and scale parameters equal to θ. Then the distribution of X_1 + X_2 is gamma with shape parameter α_1 + α_2 and scale parameter θ.

The proof uses the following lemma.

Lemma 2. If X_1, X_2 are independent random variables with continuous probability density functions f_{X_1}(x) and f_{X_2}(x), then the pdf of X_1 + X_2 is

f_{X_1+X_2}(x) = ∫ f_{X_1}(x − y) f_{X_2}(y) dy.

Proof. Formally, let f_{X_1+X_2}(x) = P(X_1 + X_2 = x). We know that, strictly speaking, the right-hand side is zero in the case of a continuous random variable. We think of it, though, as P(X_1 + X_2 ≈ x). We then have

P(X_1 + X_2 = x) = ∫ P(X_1 + X_2 = x, X_2 = y) dy
                 = ∫ P(X_1 = x − y, X_2 = y) dy
                 = ∫ P(X_1 = x − y) P(X_2 = y) dy
                 = ∫ f_{X_1}(x − y) f_{X_2}(y) dy,

where in the last step we used the independence of X_1, X_2.

We are now ready for the proof of Proposition 2.

Proof. For simplicity we will assume that θ = 1. The general case follows along the same lines. Based on the previous lemma we have

f_{X_1+X_2}(x) = ∫ f_{X_1}(x − y) f_{X_2}(y) dy
               = ∫_0^x ((x − y)^{α_1−1}/Γ(α_1)) e^{−(x−y)} (y^{α_2−1}/Γ(α_2)) e^{−y} dy
               = (x^{α_1+α_2−1}/(Γ(α_1)Γ(α_2))) e^{−x} ∫_0^1 (1 − y)^{α_1−1} y^{α_2−1} dy
               = (x^{α_1+α_2−1}/Γ(α_1 + α_2)) e^{−x}.

The third equality follows from an easy change of variables, while the last from the well-known property of Gamma functions that

∫_0^1 (1 − x)^{α_1−1} x^{α_2−1} dx = Γ(α_1)Γ(α_2)/Γ(α_1 + α_2).
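
Proposition 2 can also be checked by simulation; the following minimal sketch (assuming numpy and scipy; the parameter values are illustrative) compares the empirical distribution of the sum with the claimed Gamma law via a Kolmogorov-Smirnov test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a1, a2, theta = 1.5, 2.5, 3.0
# sum of independent Gamma(a1, theta) and Gamma(a2, theta) samples
s = rng.gamma(a1, theta, 100_000) + rng.gamma(a2, theta, 100_000)
# KS test against Gamma(a1 + a2, theta): a large p-value is expected
print(stats.kstest(s, stats.gamma(a=a1 + a2, scale=theta).cdf))
```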

1.4. χ² Distribution.

If X is a N(0, 1) random variable, then the distribution of X² is called the χ² distribution with 1 degree of freedom. Often we denote the χ² distribution with one degree of freedom by χ²_1.

It follows easily, using Lemma 1, that the χ²_1 distribution is actually a Gamma distribution with shape and scale parameters 1/2 and 2, respectively. (Check this!)

If now U_1, U_2, . . . , U_n are independent χ²_1 random variables, then the distribution of the sum U_1 + · · · + U_n is called the χ² distribution with n degrees of freedom and is denoted by χ²_n.

Since the χ²_1 is a Gamma(1/2, 2) distribution, we know from Proposition 2 that the χ²_n distribution is actually the Gamma distribution with shape parameter n/2 and scale parameter 2.

The χ² distribution is important since it is used to estimate the variance of a random variable, based on the sample variance as measured in a sampling process. To see the relevance, compare with the definition of the sample variance given in Definition 4 of Lecture 1.
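
The identification of χ²_n with a Gamma(n/2, 2) distribution is easy to confirm numerically; a minimal scipy sketch with an illustrative n:

```python
import numpy as np
from scipy import stats

# chi^2 with n degrees of freedom coincides with Gamma(shape n/2, scale 2)
n = 5
x = np.linspace(0.1, 20.0, 50)
assert np.allclose(stats.chi2.pdf(x, df=n),
                   stats.gamma.pdf(x, a=n / 2, scale=2.0))
```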

1.5. t-Distribution.

The t-distribution is important when we want to derive confidence intervals (we will study this later on) for certain parameters of interest, when the (population) variance of the underlying distribution is not known. At this stage we would need a sampling estimate of the (population) variance; thus the t-distribution is related to the χ² distribution.

Let's proceed with the definition of the t-distribution. If Z ∼ N(0, 1) and U ∼ χ²_n are independent, then the distribution of Z/√(U/n) is called the t-distribution with n degrees of freedom, often denoted by t_n.

The pdf of the t-distribution with n degrees of freedom is given by

(Γ((n + 1)/2)/(√(nπ) Γ(n/2))) (1 + t²/n)^{−(n+1)/2}.

To prove this we will need the following lemma.

Lemma 3. Let X, Y be two continuous random variables with joint pdf f_{X,Y}(x, y). Then the pdf of the quotient Z = Y/X is given by

f_Z(z) = ∫ |x| f_{X,Y}(x, xz) dx.

Proof.

P(Y/X < z) = ∫_0^∞ ∫_{−∞}^{zx} f_{X,Y}(x, y) dy dx + ∫_{−∞}^0 ∫_{zx}^∞ f_{X,Y}(x, y) dy dx.

Differentiating both sides with respect to z we get

f_{Y/X}(z) = ∫_0^∞ x f_{X,Y}(x, xz) dx − ∫_{−∞}^0 x f_{X,Y}(x, xz) dx

and the result follows.

The rest is left as an exercise.

As the degrees of freedom n tend to infinity, the t_n distribution approaches the standard normal distribution. To see this one needs to use the fact that

(1 + x²/n)^{(n+1)/2} ∼ e^{((n+1)/(2n)) x²} ∼ e^{x²/2}

and the asymptotics of the Gamma function

Γ(n + 1) ∼ √(2πn) n^n e^{−n}.

Recall that when n is an integer Γ(n + 1) = n!, so the above is just Stirling's formula, but it also holds in the general case where n is not an integer.
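
The convergence to the standard normal can also be observed numerically; a minimal sketch with scipy:

```python
import numpy as np
from scipy import stats

# the t_n density approaches the standard normal density as n grows
x = np.linspace(-4.0, 4.0, 81)
for n in (5, 50, 500):
    print(n, np.max(np.abs(stats.t.pdf(x, df=n) - stats.norm.pdf(x))))
# the printed maximum gap shrinks roughly like 1/n
```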


1.6. F-Distribution.

If U, V are independent and U ∼ χ²_{n_1}, V ∼ χ²_{n_2}, then the distribution of

W = (U/n_1)/(V/n_2)

is called the F-distribution with n_1, n_2 degrees of freedom. F-distributions are used in regression analysis. The pdf of the F-distribution is given by

(Γ((n_1 + n_2)/2)/(Γ(n_1/2)Γ(n_2/2))) (n_1/n_2)^{n_1/2} x^{n_1/2−1} (1 + (n_1/n_2)x)^{−(n_1+n_2)/2}, x > 0.

The proof is similar to the derivation of the pdf of the t-distribution and is therefore left as an exercise.
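
The defining construction W = (U/n_1)/(V/n_2) can be checked by simulation; a minimal sketch (assuming numpy and scipy; n_1, n_2 are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n1, n2 = 4, 7
# build W from independent chi^2 samples and compare with scipy's F distribution
w = (rng.chisquare(n1, 200_000) / n1) / (rng.chisquare(n2, 200_000) / n2)
print(stats.kstest(w, stats.f(dfn=n1, dfd=n2).cdf))  # large p-value expected
```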


1.7. Heavy-Tailed Distributions.

Distributions with high tail probabilities compared to a normal distribution with the same mean and variance are called heavy-tailed. In other words, a distribution F with mean zero and variance one is heavy (right) tailed if

(1 − F(x)) / (∫_x^∞ e^{−y²/2}/√(2π) dy) >> 1, x → +∞.

A similar statement holds for the left tail. A heavy-tailed distribution can also be detected from high kurtosis (why?).

A heavy-tailed distribution is more prone to extreme values, often called outliers. In finance applications one is especially concerned with heavy-tailed returns, since the possibility of an extreme negative value can deplete the capital reserves of a firm.

For example, the t-distribution is heavy-tailed, since its density is proportional to

1/(1 + x²/n)^{(n+1)/2} ∼ |x|^{−(n+1)} >> e^{−x²/2}

for large x.
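
The dominance of the polynomial tail over the Gaussian tail is easy to see numerically; a crude scipy sketch (note that t_3 has variance 3, so this comparison is not standardized, but the polynomial tail dominates regardless):

```python
from scipy import stats

# tail probabilities P(X > 6) for the standard normal and for t_3
x = 6.0
print(stats.norm.sf(x))      # ~ 1e-9
print(stats.t.sf(x, df=3))   # ~ 5e-3, more than six orders of magnitude larger
```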

A particular class of heavy-tailed distributions are the Pareto distributions, or simply power law distributions. These are distributions with pdf

L(x)/|x|^α.

Here L(x) is a slowly varying function, that is, a function with the property that, for any constant c,

L(cx)/L(x) → 1, x → ∞.

An example of a slowly varying function is log x, or exp((log x)^β) for β < 1. In the Pareto distribution α > 1, or α = 1 if L(x) decays sufficiently fast.

1.8. Multivariate Normal Distributions.

The random vector (X_1, X_2, . . . , X_n) ∈ R^n is said to have a multivariate normal distribution if for every constant vector (c_1, c_2, . . . , c_n) ∈ R^n, the distribution of c_1 X_1 + · · · + c_n X_n is normal.

Multivariate normal distributions facilitate the modelling of portfolios. A portfolio is a weighted average of assets, with weights that sum up to one. The weights specify what fraction of the total investment is allocated to each asset.

As with one-dimensional normal distributions, the multivariate distribution is determined by the mean vector

(µ_1, . . . , µ_n),

with µ_i = E[X_i], and the covariance matrix, that is, the matrix G = (G_{ij}) with entries

G_{ij} = E[X_i X_j] − µ_i µ_j.

For simplicity let's assume that µ_i = 0 for i = 1, 2, . . . , n. Then the multivariate normal density function is given by

(1/((2π)^{n/2} (det G)^{1/2})) exp(−(1/2) <x, G^{-1} x>), x ∈ R^n,

where G^{-1} is the inverse of G, det G the determinant of G, and <·, ·> denotes the inner product in R^n; that is, if x, y ∈ R^n, then <x, y> = Σ_{i=1}^n x_i y_i.
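
The density formula can be evaluated directly and compared with a library implementation; a minimal sketch (assuming numpy and scipy; the covariance matrix G is an arbitrary illustrative positive definite matrix):

```python
import numpy as np
from scipy import stats

G = np.array([[2.0, 0.5],
              [0.5, 1.0]])
x = np.array([0.3, -1.2])
n = len(x)
quad = x @ np.linalg.solve(G, x)   # <x, G^{-1} x> without forming the inverse
dens = np.exp(-quad / 2) / ((2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(G)))
assert np.isclose(dens, stats.multivariate_normal(mean=np.zeros(n), cov=G).pdf(x))
```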

1.9. Exercises.

1. (Lognormal Distributions) Compute the moments of a lognormal distribution X = e^Z, with Z a normal N(µ, σ²) random variable. In particular compute its mean, standard deviation, skewness and kurtosis.

2. (Exponentials and Poissons) Exponential distributions often arise in the study of arrivals, queueing etc., modeling the times between arrivals. Consider T to be the time of the first arrival in a system and suppose it has an exponential distribution with scale parameter 1. Compute P(T > t + s | T > s).

Suppose that the number of arrivals in a system is a Poisson process with parameter λ. That is,

Prob(#{arrivals before time t} = k) = e^{−λt} (λt)^k/k!,

and arrivals in disjoint time intervals are independent.

What is the distribution of the interarrival times?

3. Compute the moments of a gamma distribution with shape parameter α and scale parameter 1.

4. Prove Lemma 1.

5. Prove that the χ²_1 distribution is a Gamma distribution with shape parameter 1/2 and scale parameter 2.

6. Show that a heavy-tailed distribution has high kurtosis.

7. Derive the pdf of the t-distribution.

8. Compute the kurtosis of (a) N(0, 1), (b) an exponential with scale parameter one.

9. (Mixture models) Let X_1 ∼ N(0, σ²_1) and X_2 ∼ N(0, σ²_2) be two independent normal random variables. Let also Y be another independent random variable with a Bernoulli distribution, that is, P(Y = 1) = p and P(Y = 0) = 1 − p, for some 0 < p < 1.

A. What are the mean and the variance of Z = Y X_1 + (1 − Y)X_2?

B. Are the tails of its distribution heavier or lighter when compared to a normal distribution with the same mean and variance? If so, for what values of p? Give also an intuitive explanation of your mathematical derivation.


Use some statistical software to draw the distribution of the mixture model for some values of the parameter p, and compare it (especially the tails) with that of the corresponding normal; a possible starting point is sketched below.
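
The following Python sketch assumes numpy and scipy; the values of σ_1, σ_2 and p are illustrative choices, not fixed by the exercise.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
sigma1, sigma2, p = 1.0, 3.0, 0.9
N = 100_000
y = rng.random(N) < p                        # Bernoulli(p) indicators
z = np.where(y, rng.normal(0.0, sigma1, N),  # draw from X_1 when Y = 1
             rng.normal(0.0, sigma2, N))     # and from X_2 when Y = 0
var = p * sigma1**2 + (1 - p) * sigma2**2    # mixture variance (the mean is 0)
print(z.var(), var)
# empirical tail frequency beyond 4 standard deviations vs the normal benchmark
print((np.abs(z) > 4 * np.sqrt(var)).mean(), 2 * stats.norm.sf(4.0))
```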
