
Conditional Value At Risk And Conditional Draw-Down At Risk

For Portfolio Optimization With Alternative Investments

Stephan Jöhri∗

Supervisor: PD Dr. Diethelm Würtz

Professor: Dr. Kai Nagel

March 16, 2004

∗ Master's Thesis of Stephan Jöhri, written at the Department of Computer Science of the Swiss Federal Institute of Technology (ETH) Zurich.


Abstract

The aim of this Master's Thesis is to describe and assess different ways to optimize a portfolio. Special attention is paid to the influence of hedge funds, since their returns exhibit special statistical properties.

In the first part of this thesis modern portfolio theory is considered. The Markowitz approach is described and analyzed. It assumes that asset returns are independently and identically distributed according to the normal law. The CAPM and the APT are briefly reviewed.

In the second part we go beyond Markowitz and show that asset returns are in reality not normally distributed but have fat tails and asymmetries. This is especially true for the returns of hedge funds. These facts justify the investigation of alternative portfolio optimization techniques. We therefore describe and discuss alternative methods found in the literature. They use risk measures other than the standard deviation, such as Value at Risk or Draw-Down, and their derivations Conditional Value at Risk and Conditional Draw-Down at Risk. Based on these methods, the respective optimization problems are formulated and implemented.

In the third part we describe the numerical implementation and the data used. Finally, the weight allocations and efficient frontiers that summarize the results of these optimization problems are calculated, analyzed and compared. We focus on the questions of how optimal portfolios with and without hedge funds are constructed according to the different optimization methods, how useful these methods are in practice and how the results differ. The results are derived by analytical work and by simulations on historical and artificial data.


Acknowledgment

I would like to thank my supervisor PD Dr. Diethelm Würtz for directing this thesis and guiding me with many useful impulses. I am also thankful to Prof. Kai Nagel, who gave me the opportunity to work on this topic.

My gratitude also goes to the people at UBS Investment Research, Dr. Marcos López de Prado, Dr. Achim Peijan, Laurent Favre and Dr. Klaus Kränzlein, who gave me a lot of input during our discussions.


Contents

Part I: Modern Portfolio Theory

1 Markowitz Model
  1.1 Risk Return Framework And Utility Function
  1.2 Selecting Optimal Portfolios: The Efficient Frontier
2 Capital Asset Pricing Model (CAPM)
  2.1 Standard Capital Asset Pricing Model
  2.2 Arbitrage Pricing Theory (APT)

Part II: Beyond Markowitz

3 Stylized Facts Of Asset Returns
  3.1 Distribution Form Tests
  3.2 Dependencies Tests
  3.3 Results Of Statistical Tests Applied To Market Data
4 Portfolio Construction With Non-Normal Asset Returns
  4.1 Introduction To Risk In General
  4.2 Variance As Risk Measure
5 Value At Risk Measures
  5.1 Value At Risk
  5.2 Conditional Value At Risk, Expected Shortfall And Tail Conditional Expectation
  5.3 Mean-Conditional Value At Risk Efficient Portfolios
6 Draw-Down Measures
  6.1 Draw-Down And Time Under-The-Water
  6.2 Conditional Draw-Down At Risk And Conditional Time Under-The-Water At Risk
  6.3 Mean-Conditional Draw-Down At Risk Efficient Portfolios
7 Comparison Of The Risk Measures

Part III: Optimization With Alternative Investments

8 Numerical Implementation
9 Used Data
  9.1 Normal Vs. Logarithmic Data
  9.2 Empirical Vs. Simulated Data
10 Evaluation Of The Portfolios
  10.1 Evaluation With Historical Data
  10.2 Evaluation With Simulated Data
Summary and Outlook

Appendix

A Quadratic Utility Function Implies That Mean Variance Analysis Is Optimal
B Equivalence Of Different VaR Definitions And Notations
C Used R Functions
D Description Of The Portfolio Optimization System
E Description Of The Excel Optimizer
F Description Of Various Hedge Fund Styles
G References

Part I

Modern Portfolio Theory

1 Markowitz Model

In this first chapter the fundamentals of portfolio theory are introduced. This is done by showing some statistical properties, deriving a utility function and presenting the model that combines both for portfolio optimization. This model was developed in 1952/59 by Harry Markowitz and is still considered the standard approach for this task.

1.1 Risk Return Framework And Utility Function

Risk Return Framework

Assume we are given $N$ assets with returns $R_1, \ldots, R_N$. Our portfolio consists of these assets with fractions $w_1, \ldots, w_N$ invested in each asset. The expected returns of the individual assets are then $E[R_i] = \mu_i$ (where $E[\cdot]$ denotes the expected value), and the total return $\mu_P$ of the portfolio is

$$\mu_P = \sum_{i=1}^{N} w_i \mu_{R_i} \qquad (1)$$

Two properties of the mean value will become useful later:

$$\mu_{R_i + R_j} = \mu_{R_i} + \mu_{R_j}, \qquad \mu_{c R_i} = c\,\mu_{R_i}$$

The first property means that the mean of the sum of two return series $i$ and $j$ is the same as the mean of return series $i$ plus the mean of return series $j$. The second property states that the mean of a constant $c$ multiplied with a return series $i$ is equal to $c$ times the mean of the return series $i$.

The variance of the portfolio is

$$\sigma_P^2 = E[(R_P - \mu_P)^2] = \sum_{i=1}^{N} (w_i \sigma_i)^2 + 2 \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} w_i w_j \sigma_{ij} \qquad (2)$$

So in the case of three assets we get the following pattern:

$$\sigma_P^2 = (w_1 \sigma_1)^2 + (w_2 \sigma_2)^2 + (w_3 \sigma_3)^2 + 2 w_1 w_2 \sigma_{12} + 2 w_1 w_3 \sigma_{13} + 2 w_2 w_3 \sigma_{23}$$
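The expanded three-asset pattern is just the quadratic form $w^T C w$ written out. A minimal numpy sketch with illustrative, made-up numbers confirms that the two forms agree:

```python
import numpy as np

# Hypothetical weights and covariance matrix for three assets
# (sigma_1 = 0.2, sigma_2 = 0.3, sigma_3 = 0.4 on the diagonal)
w = np.array([0.5, 0.3, 0.2])
C = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])

# Portfolio variance as the quadratic form w^T C w
var_quadratic = w @ C @ w

# The same value from the expanded three-asset pattern
var_expanded = ((w[0] * 0.2) ** 2 + (w[1] * 0.3) ** 2 + (w[2] * 0.4) ** 2
                + 2 * w[0] * w[1] * 0.01
                + 2 * w[0] * w[2] * 0.00
                + 2 * w[1] * w[2] * 0.02)

print(var_quadratic, var_expanded)
```

Both expressions evaluate to the same portfolio variance; the compact matrix form is the one the optimization problems later in the thesis work with.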


These variances $\sigma_i^2 = E[(R_i - \mu_i)^2]$ and covariances $\sigma_{ij} = E[(R_i - \mu_i)(R_j - \mu_j)] = \sigma_{ji}$ are collected in a symmetric matrix called the covariance matrix:

$$C = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1N} \\ \sigma_{12} & \sigma_2^2 & \cdots & \sigma_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{1N} & \sigma_{2N} & \cdots & \sigma_N^2 \end{pmatrix} \qquad (3)$$

The correlation is defined as the standardized covariance:

$$\rho_{ij} = \frac{\sigma_{ij}}{\sigma_i \sigma_j} \qquad (4)$$

Comparing (1) and (2) we can see the effect of diversification: the return of a portfolio can never be smaller than the smallest return of its constituents, since it is the weighted average return of all constituents. In contrast, the variance of a portfolio can be smaller than the smallest variance of its individual assets because of the second term of (2), which can be negative in the case of a negative covariance between the asset returns. The aim of diversification is thus to choose the assets in a way that keeps the mean return high and lowers the variance by an appropriate selection and weighting of the assets.

Taking (2) with an equal amount invested in each of the $N$ assets, we get

$$\sigma_P^2 = \frac{1}{N}\,\bar{\sigma}_i^2 + \frac{N-1}{N}\,\bar{\sigma}_{ij} \qquad (5)$$

where $\bar{\sigma}_i^2$ denotes the average variance and $\bar{\sigma}_{ij}$ the average covariance of the assets,

whereby the first term is called the diversifiable or non-market risk and the second term the systematic or market risk. If we take a large number of different assets ($N$ approaching infinity), the portfolio risk reduces to the average covariance of the assets in the portfolio and the variances of the individual assets disappear:

$$\sigma_P^2 \xrightarrow{\;N \to \infty\;} \bar{\sigma}_{ij}$$

This effect is why the first term in (5) is called diversifiable risk: it can be reduced to zero by a good diversification of the assets. The risk represented by the first term of (5) has its origin in the risk of the single assets the portfolio contains, whereas the risk expressed in the second term comes from the market itself (which can be influenced by economic changes or events with a large impact) and cannot be reduced.
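Equation (5) can be evaluated directly to illustrate this limit; the average variance (0.04) and average covariance (0.01) below are illustrative assumptions:

```python
def equal_weight_variance(n_assets, avg_var=0.04, avg_cov=0.01):
    """Equal-weight portfolio variance, eq. (5):
    (1/N) * average variance + ((N-1)/N) * average covariance."""
    return avg_var / n_assets + (n_assets - 1) / n_assets * avg_cov

# As N grows, the diversifiable first term vanishes and only the
# average covariance (market risk) remains.
for n in (1, 10, 100, 10000):
    print(n, equal_weight_variance(n))
```

With $N = 1$ the portfolio carries the full single-asset variance 0.04; for large $N$ the value approaches the average covariance 0.01.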

This also means that the risk of a portfolio of assets with a low correlation can be reduced more than the risk of a portfolio consisting of highly correlated assets. In practice this results in the recommendation to choose the constituents of a portfolio from different geographic or industrial sectors, because assets of companies from the same country or business area tend to move together and hence have a higher correlation. Figure 1 shows an example of this effect for the case of securities from the UK and the US.

In a risk return framework a high risk usually gets compensated by a high expected return. This is called the risk premium: the extra return a particular asset has to provide over the rate of the market to compensate for market risk. The drawback of diversification is that the investor loses the risk premium that a certain asset might provide, since its contribution to the final portfolio return is very small. The advantage of a well diversified portfolio, however, is that one can expect a more moderate but steady return in the long run.


Figure 1: Portfolio risk versus the number of assets in the portfolio. A portfolio with few assets has a higher risk than a portfolio with many assets (effect of diversification). The dotted line represents a portfolio consisting of UK stocks, the solid line a portfolio of US stocks. Since the line for the UK portfolio is higher, we can conclude that the UK stocks have a higher average covariance and their risk can therefore be reduced less in a portfolio than that of a portfolio consisting of US stocks.

Utility Function Of An Investor

Bernoulli proposed in [9] that the value of an item should not be determined by the price somebody has to pay for it but by the utility that this item has for its owner. A classical example is that a glass of water has a much higher utility for somebody who is lost in the desert than for somebody in civilization. Although the glass of water, and therefore its price, might be exactly the same, two persons in the mentioned situations will perceive its value differently.

We will now discuss the properties that such a utility function should have and look at some typical economic utility functions. The structure of this section partially follows the one in Elton & Gruber [18].

The first property we want fulfilled is that an investor prefers more to less. Economists call this the non-satiation attribute. It expresses the fact that an option with a higher return always has a higher utility than an option with a lower return, assuming that both options are equally likely. Put more briefly, everybody prefers more wealth to less wealth. From this we can conclude that the first derivative of the utility function always has to be positive. Our first requirement for a utility function $U(\cdot)$ of a wealth parameter $W$ is therefore

$$U'(W) > 0$$

As a second attribute we want to include the investor's risk profile. Bernoulli uses a fair gamble to introduce this concept. A fair gamble is a game where the expected gain is equal to zero. This means that the probability of a gain times the value of the gain is equal to the probability of a loss times the loss in absolute terms. Tossing a coin would be a fair gamble if one player wins both stakes when one side is up and the other player wins both stakes when the other side is up. We will examine three types of risk profiles.


Figure 2: A logarithmic utility function (utility versus wealth) seems appropriate in the context of a risk averse investor. The increase of the utility for a certain increase in wealth is smaller if the investor is already at a high level of wealth.

Risk aversion is defined as rejecting a fair gamble. A risk averse investor would not play a game where he or she has an expected return of zero in the long run. Let's find out what the implications for a risk averse investor are: since he or she does not invest, we can conclude that the utility of keeping the current wealth is higher than the probability weighted utility of a gain and a loss. We can describe this risk profile for the case of a fair gamble as

$$U(W) > \tfrac{1}{2} U(W+G) + \tfrac{1}{2} U(W-G)$$

where $W$ is the current wealth and $G$ the symmetric gain/loss of the game. Multiplying by 2 and rearranging yields

$$U(W) - U(W-G) > U(W+G) - U(W)$$

and we can see that such an investor prefers the change from the current wealth minus the gain/loss to the current wealth over the change from the current wealth to the current wealth plus the gain/loss. Note that the absolute change in wealth is the same ($G$) in both cases. From this we see that a risk averse investor prefers to keep all of his/her fortune rather than to invest a part of it and lose or gain an equal part with 50% probability. Functions that satisfy this requirement have a second derivative smaller than zero:

$$U''(W) < 0$$

Figure 2 shows a logarithmic utility function that fulfills this property. We can see that for double the amount of wealth, the additional amount of utility is less than double. Formulated according to the example with the fair gamble, the figure expresses that for the same increase in utility the investor asks for a higher increase in wealth the higher the wealth already is.
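A small sketch with an assumed logarithmic utility and made-up wealth figures illustrates the rejection of the fair gamble:

```python
import math

# Illustrative wealth W and symmetric stake G of a fair gamble
W, G = 100.0, 50.0

u_keep = math.log(W)                                      # U(W)
u_gamble = 0.5 * math.log(W + G) + 0.5 * math.log(W - G)  # expected utility of playing

# A concave utility (U'' < 0) rejects the fair gamble
print(u_keep > u_gamble)  # → True
```

Even though the gamble has an expected monetary gain of zero, its expected utility is strictly lower than the utility of keeping the wealth, which is exactly the inequality above.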

The second risk profile we look at is the risk neutral investor. This is defined as an investor who is indifferent to a fair gamble. He or she will sometimes play and sometimes not.


Figure 3: Utility functions for a risk averse investor (solid), a risk neutral investor (dotted) and a risk seeking investor (dashed) in a wealth/utility framework.

For such a person the utility equation looks like

$$U(W) = \tfrac{1}{2} U(W+G) + \tfrac{1}{2} U(W-G)$$

We can rearrange this again and get

$$U(W) - U(W-G) = U(W+G) - U(W)$$

This means that such a person is indifferent between the change from the current wealth minus the gain/loss to the current wealth and the change from the current wealth to the current wealth plus the gain/loss. Hence risk neutrality causes the second derivative of the utility function to be zero:

$$U''(W) = 0$$

Risk seeking is the name of the third risk profile, and it is defined as accepting a fair gamble. This kind of investor agrees to the following formulations:

$$U(W) < \tfrac{1}{2} U(W+G) + \tfrac{1}{2} U(W-G)$$

$$U(W) - U(W-G) < U(W+G) - U(W)$$

We can assign them a utility function with a positive second derivative, since the wealthier they are the more they will appreciate an additional increment in their wealth:

$$U''(W) > 0$$

To conclude, figure 3 shows the utility functions of the three risk types in a wealth/utility framework.


We can also transform the utility function into the Mean-Variance framework. In [27] the following utility function is proposed for this purpose:

$$\mu_U = \mu_R - \lambda \sigma^2$$

where $\mu_U$ is the expected utility, $\mu_R$ the expected return, $\sigma$ the standard deviation of returns and $\lambda$ the risk-aversion coefficient.

With $\lambda$ the function can be adapted to the investor's aversion to risk. A positive coefficient indicates risk aversion, $\lambda = 0$ means risk neutrality and a negative coefficient defines a risk seeking investor. A typical level of risk aversion would be around 0.0075, as stated in [24].
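A minimal sketch of this utility function, comparing two hypothetical portfolios at the risk-aversion level $\lambda = 0.0075$ quoted above (returns and standard deviations in percent, all numbers illustrative):

```python
def mean_variance_utility(mu_R, sigma, lam):
    """Expected utility mu_U = mu_R - lambda * sigma**2."""
    return mu_R - lam * sigma ** 2

# Two made-up portfolios: a risky one and a safer one with lower mean
u_risky = mean_variance_utility(mu_R=10.0, sigma=20.0, lam=0.0075)
u_safe = mean_variance_utility(mu_R=8.0, sigma=5.0, lam=0.0075)

# The risk averse investor prefers the safer portfolio despite its lower mean
print(u_risky, u_safe)
```

The variance penalty makes the safer portfolio's utility (7.8125) exceed the risky one's (7.0), which is how the coefficient $\lambda$ encodes risk aversion.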

It is convenient in this context to calculate the iso-utility curves. These curves indicate the mean/risk combinations that seem equally pleasant to a certain investor because they yield the same value of the utility function. Our three risk profiles in a Mean-Variance framework are depicted in figure 4. It is possible to see how the three different types of investors get compensated: the risk averse investor (solid line) accepts a higher risk if he/she gets a higher return as compensation. The risk neutral investor (dotted line) wants a certain return and does not care about the respective risk. The risk seeking investor (dashed line) accepts a return/risk combination as long as either the return or the risk is high enough. Such a person compensates high risk with low return and vice versa. From this interpretation one can see that the risk neutral and risk seeking types of investors are not very common.

Figure 4: The iso-utility functions for a risk averse investor (solid), a risk neutral investor (dotted) and a risk seeking investor (dashed) in a mean/variance framework.

There is a third property of useful utility functions that we can use to determine their appearance: how the amount of wealth invested in risky assets changes once the size of the wealth has changed. Again, we have three types of investors:

• Decreasing absolute risk aversion: The investor increases the amount of wealth invested

in risky assets when the wealth increases.

• Constant absolute risk aversion: The investor keeps the amount of wealth invested in

risky assets constant when the wealth increases.


• Increasing absolute risk aversion: The investor decreases the amount of wealth invested

in risky assets when the wealth increases.

It can be shown that

$$A(W) = \frac{-U''(W)}{U'(W)}$$

measures the absolute risk aversion of an investor. As a consequence, we can classify the investor types according to $A'(W)$ as follows:

$A'(W) > 0$: increasing absolute risk aversion

$A'(W) = 0$: constant absolute risk aversion

$A'(W) < 0$: decreasing absolute risk aversion

It is also possible to use the change in the relative investment as a property. This is expressed by

$$R(W) = \frac{-W\,U''(W)}{U'(W)} = W A(W)$$

and interpreted as follows:

$R'(W) > 0$: increasing relative risk aversion

$R'(W) = 0$: constant relative risk aversion

$R'(W) < 0$: decreasing relative risk aversion

It is commonly accepted that most investors exhibit decreasing absolute risk aversion, but

there is no agreement concerning the relative risk aversion.

In [18] two common utility functions are presented. The most frequently used utility function in economics is the quadratic one. It is preferred because the assumption of a quadratic utility function implies that mean variance analysis is optimal (see Appendix A for a proof).

$$U(W) = aW - bW^2 \qquad (6)$$

This utility function has the following first and second derivatives:

$$U'(W) = a - 2bW, \qquad U''(W) = -2b$$

To make this utility function compliant with the requirements of a risk averse investor, we have to set the second derivative to be smaller than zero, i.e. $b$ positive. We have shown that an investor usually prefers more to less and therefore asks for the first derivative to be positive, i.e. $W < \frac{a}{2b}$. An analysis of the absolute and relative risk-aversion measures shows that the quadratic utility function has increasing absolute and relative risk aversion.

Since the quadratic utility function has some undesired properties, other utility functions are in use that also satisfy mean variance analysis, such as

$$U(W) = \ln W$$

with first and second derivatives

$$U'(W) = \frac{1}{W}, \qquad U''(W) = -\frac{1}{W^2}$$

It is clear that the first derivative is positive and the second derivative negative for all values of $W$. So the logarithmic utility function also meets the requirements of a risk averse investor who prefers more to less. Furthermore, this function exhibits decreasing absolute risk aversion and constant relative risk aversion.
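These two risk-aversion measures can be checked directly for the logarithmic utility, using the derivatives given above (a minimal sketch):

```python
def A(W):
    """Absolute risk aversion A(W) = -U''(W)/U'(W) for U(W) = ln W."""
    u_prime = 1.0 / W               # U'(W)
    u_double_prime = -1.0 / W ** 2  # U''(W)
    return -u_double_prime / u_prime

def R(W):
    """Relative risk aversion R(W) = W * A(W)."""
    return W * A(W)

# A(W) = 1/W decreases with wealth; R(W) = 1 stays constant
print([A(w) for w in (1.0, 10.0, 100.0)])
print([R(w) for w in (1.0, 10.0, 100.0)])
```

The output shows $A(W)$ falling as wealth grows (decreasing absolute risk aversion) while $R(W)$ equals 1 everywhere (constant relative risk aversion), as claimed.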


1.2 Selecting Optimal Portfolios: The Efficient Frontier

The basic set-up of the Markowitz [30] model is as follows:

$$w^T C w \to \min \qquad (7)$$

subject to

$$w^T \mu = \mu_P > 0 \qquad (8)$$

$$w^T e = 1$$

where $e = (1, 1, \ldots, 1)^T$, $C$ is the covariance matrix as defined in (3), $\mu$ is the expected return vector of the assets and $\mu_P$ is the desired expected return of the portfolio. The first line of the set-up defines that we want to minimize the variance, and therefore the risk, of the final portfolio. In the second expression we fix the expected return of the portfolio to a chosen value; it is evident that we are only interested in a return larger than zero. The last constraint sets the sum of the weights to one, since we want to be fully invested.

In a short sale a trader sells an asset that is not in his or her possession in order to buy it back later and balance the position. This practice makes sense in expectation of a decreasing price. Short sales are indicated by negative asset weights in a portfolio, since the owner of the portfolio has sold something that does not belong to him/her. If no short sales are allowed, which is usually the case, there is an additional constraint:

$$w_i \ge 0$$

We will formulate the solution of the system according to de Giorgi [15]. Equations (7) and (8) describe a quadratic objective function with linear constraints. If the covariance matrix $C$ is strictly positive definite, a portfolio solves the optimization problem iff

$$w(\mu_P) = \mu_P w_0 - w_1 \qquad (9)$$

where

$$w_0 = \frac{1}{S}\left(Q C^{-1} \mu - R C^{-1} e\right), \qquad w_1 = \frac{1}{S}\left(R C^{-1} \mu - P C^{-1} e\right)$$

with

$$P = \mu^T C^{-1} \mu, \qquad Q = e^T C^{-1} e, \qquad R = e^T C^{-1} \mu, \qquad S = PQ - R^2$$
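The closed-form solution above translates directly into a few lines of numpy. The expected returns and covariance matrix below are illustrative assumptions, not data from the thesis, and short sales are implicitly allowed, as in the unconstrained problem:

```python
import numpy as np

mu = np.array([0.08, 0.12, 0.10])   # hypothetical expected returns
C = np.array([[0.04, 0.01, 0.00],   # hypothetical positive definite
              [0.01, 0.09, 0.02],   # covariance matrix
              [0.00, 0.02, 0.16]])
e = np.ones(len(mu))

Cinv = np.linalg.inv(C)
P = mu @ Cinv @ mu
Q = e @ Cinv @ e
R = e @ Cinv @ mu
S = P * Q - R ** 2

w0 = (Q * Cinv @ mu - R * Cinv @ e) / S
w1 = (R * Cinv @ mu - P * Cinv @ e) / S

def optimal_weights(mu_P):
    """Minimum-variance weights for target return mu_P, eq. (9)."""
    return mu_P * w0 - w1

w = optimal_weights(0.10)
# The constraints of (7)/(8) hold by construction:
print(w.sum())   # weights sum to 1
print(w @ mu)    # portfolio return equals the target 0.10
```

Sweeping `mu_P` over a range of target returns and recording $w^T C w$ for each traces out the efficient frontier of equation (10).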

With (9) we can determine the optimal portfolio for a given expected portfolio return. This formula also sets the expected portfolio return $\mu_P$ in relation to the portfolio variance $\sigma_P^2$:

$$\frac{\sigma_P^2}{1/Q} - \frac{(\mu_P - R/Q)^2}{S/Q^2} = 1 \qquad (10)$$


A portfolio is called efficient if it offers the lowest possible risk/variance for a given expected return. The calculation of all of these optimal portfolios for different expected returns $\mu_P$ leads to a set of points called the efficient frontier, a hyperbola in the $\mu_P/\sigma_P^2$-plane as depicted in figure 5.

Figure 5: The efficient frontier (line) and some inefficient portfolios (points) in the variance/mean plane. The portfolios on the efficient frontier guarantee the highest expected return for a given variance.

An important portfolio on the efficient frontier is the global minimum risk portfolio. It is the one at the very left of the efficient frontier. From (10) we can derive the expected return of the minimum risk portfolio as

$$\mu_{\mathrm{minRisk}} = \frac{R}{Q}$$

and inserting this return into (9) gives the weights of the global minimum risk portfolio:

$$w_{\mathrm{minRisk}} = \frac{1}{Q}\,C^{-1} e$$

The minimum risk portfolio is the only unambiguous portfolio in the sense that there is only one possible expected return for its variance. In practice, however, nobody will choose a portfolio lying on the efficient frontier below the minimum risk portfolio, since the portfolios on the efficient frontier above the minimum risk portfolio offer a larger expected return for the same amount of risk.

With the efficient frontier we can determine the amount of risk an investor has to accept for a certain expected return he or she wants to achieve. Stated the other way around, an investor can determine how much return he or she can expect by accepting a certain risk threshold. To define the appropriate portfolio for an investor, we can use the iso-utility curves. Figure 6 shows the efficient frontier with some iso-utility curves. The optimal portfolio is located at the point of tangency between the efficient frontier and an indifference curve (indifference curve 2 in the example). This portfolio maximizes the utility over all portfolios on the efficient frontier. Portfolios on indifference curve 3 would have a higher utility, but with the given assets we cannot construct such a portfolio. Portfolios on indifference curve 1 are achievable but not optimal in the sense of the utility.


Figure 6: The efficient frontier and some indifference curves (IDC1–IDC3) in the standard deviation/mean plane. The optimal portfolio lies on the IDC2 line, where the efficient frontier acts as a tangent.

Schneeweiss shows in [39] that if one wants to apply the Mean-Variance principle as proposed by Markowitz, one has to assume that the utility function is quadratic or that the returns are normally distributed. Both requirements are critical: not every investor necessarily has a quadratic utility function, or even a utility function in terms of mean and variance, i.e. such that they choose a desired expected return and then choose the portfolio with this mean and the lowest variance. The requirement of normality of the return distribution will be discussed in chapter 3.

Let's follow the path of Markowitz [30] and have a closer look at the efficient frontier. In (2) we defined the variance of a portfolio as

$$\sigma_P^2 = \sum_{i=1}^{N} (w_i \sigma_i)^2 + \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \ne i}}^{N} w_i w_j \sigma_{ij}$$

Since (4) holds, we can substitute $\sigma_{ij}$ and get

$$\sigma_P^2 = \sum_{i=1}^{N} (w_i \sigma_i)^2 + \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \ne i}}^{N} w_i w_j \rho_{ij} \sigma_i \sigma_j$$

In the following we want to analyze the properties of the efficient frontier based on this formula for four scenarios: short sales allowed or not allowed, and risk-free lending and borrowing possible or not possible. For the sake of simplicity this is done for a portfolio of only two assets ($i = 1, 2$).

Short sales not allowed, no risk-free lending and borrowing

We start with the most common situation, where we are not allowed to sell assets short and no risk-free lending and borrowing is possible. Most instruments have these restrictions to avoid speculation and high risks. Three sub-cases are investigated, depending on the value of the correlation $\rho$ between the asset returns.

Perfect positive correlation ($\rho = 1$) With $w_2 = 1 - w_1$, mean and variance of the portfolio become

$$\mu_P = w_1 \mu_1 + (1 - w_1)\mu_2 \qquad (11)$$

$$\sigma_P^2 = \left(w_1 \sigma_1 + (1 - w_1)\sigma_2\right)^2 \qquad (12)$$

This shows that with totally correlated assets, the return and risk of a portfolio are just the weighted averages of the return and risk of its components. By solving (11) for $w_1$ and substituting $w_1$ into (12), one gets

$$\mu_P = \left(\mu_2 - \frac{\mu_1 - \mu_2}{\sigma_1 - \sigma_2}\,\sigma_2\right) + \frac{\mu_1 - \mu_2}{\sigma_1 - \sigma_2}\,\sigma_P$$

which is the equation of a straight line. So the efficient frontier for perfectly positively correlated assets is a linear combination of the given assets, as shown in figure 7.

Figure 7: The efficient frontier of two assets with perfect correlation is a straight line (variance/mean plane).

Perfect negative correlation ($\rho = -1$) In the case of perfect negative correlation, mean and variance of the portfolio become

$$\mu_P = w_1 \mu_1 + (1 - w_1)\mu_2$$

$$\sigma_P^2 = \left(w_1 \sigma_1 - (1 - w_1)\sigma_2\right)^2 = \left(-w_1 \sigma_1 + (1 - w_1)\sigma_2\right)^2 \qquad (13)$$

In the same way as in the case of positive correlation, we find that the efficient frontier consists of two straight lines (one for each sign of the square root of (13)), drawn in figure 8. If we have perfectly anti-correlated assets, it is always possible to find a combination of them with zero risk. The appropriate weight and return can be found by setting (13) equal to zero:

$$w_1 = \frac{\sigma_2}{\sigma_1 + \sigma_2}, \qquad \mu_P^* = \frac{\mu_1 \sigma_2 + \mu_2 \sigma_1}{\sigma_1 + \sigma_2}$$
Figure 8: The efficient frontier of two assets with perfect negative correlation. The upper line has the equation $\mu_P = a\sigma_P + \mu_P^*$ and the lower line $\mu_P = -a\sigma_P + \mu_P^*$, with $a$ a constant. The two lines intersect the y-axis at $\mu_P^*$.

No relationship between the returns of the assets ($\rho = 0$) For this scenario the variance of the portfolio simplifies to

$$\sigma_P^2 = (w_1 \sigma_1)^2 + \left((1 - w_1)\sigma_2\right)^2$$

To find the minimum risk portfolio, one sets $\frac{\partial \sigma_P}{\partial w_1} = 0$ and obtains for the case of two assets

$$w_1 = \frac{\sigma_2^2}{\sigma_1^2 + \sigma_2^2}$$

The eﬃcient frontier and the minimum risk portfolio are shown in ﬁgure 9.
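The closed-form minimum-risk weight can be verified against a brute-force grid search (illustrative volatilities, uncorrelated assets):

```python
import numpy as np

sigma1, sigma2 = 0.2, 0.3   # illustrative volatilities, rho = 0

# Closed-form minimum-risk weight
w1_star = sigma2 ** 2 / (sigma1 ** 2 + sigma2 ** 2)

# Brute-force check: scan a fine weight grid for the smallest variance
grid = np.linspace(0.0, 1.0, 100001)
variances = (grid * sigma1) ** 2 + ((1 - grid) * sigma2) ** 2
w1_grid = grid[np.argmin(variances)]

print(w1_star, w1_grid)
```

Both approaches land on the same weight (about 0.692 here), confirming the derivative condition.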

Intermediate correlation In general we can say that the efficient frontier will always lie to the left of the line connecting the two assets, since the portfolio can be constructed as a combination of them. Figure 10 shows that the efficient frontier moves to the left with decreasing correlation of the assets, which allows a higher diversification and therefore a lower risk.

In practice we will almost always find positive correlation between asset classes and only very rarely a negative correlation. This means that there are only very few periods where a certain asset class has a high profit and another asset class a negative profit. The reason lies in the factors that influence the returns of the asset classes: most factors influence all asset classes in a similar way and only a few factors influence only part of the asset classes. For this reason the behavior of the asset classes is often positively correlated.

Figure 9: The efficient frontier of two assets with no correlation is a hyperbola. The minimum variance portfolio is the portfolio at the very left.

Short sales allowed, no risk-free lending and borrowing

By doing a short sale, one takes a negative position in an asset. This may be useful when one expects the value of the asset to decrease, or it might even make sense when one expects a positive return, in order to get cash to invest in an asset with a better performance. In the mean variance environment the efficient frontier continues as a slightly concave curve to infinity. This means that one can construct a portfolio with a very high expected return by short selling a lot of assets with low expected return (see figure 11). Of course not only the expected return but also the risk of such a portfolio becomes huge.

Efficient frontier with risk-free lending and borrowing

Risk-free lending is an instrument where we get a fixed interest rate $\mu_{rf}$ by lending an amount to somebody (e.g. buying government bills). Similarly, we could also get cash from somebody and pay fixed interest for it (e.g. selling government bills short). In both cases the variance of the asset is zero ($\sigma_{rf} = 0$) because the interest rates are constant. Our two-asset portfolio, consisting of an asset 1 and a risk-free asset rf, has a variance equal to the weighted variance of asset 1:

$$\sigma_P^2 = (w_1 \sigma_1)^2$$

The weight of asset 1 for a given portfolio risk would then be

$$w_1 = \frac{\sigma_P}{\sigma_1}$$

As a formula for the eﬃcient frontier we get:

19

0 2 4 6 8 10 12 14

6

8

1

0

1

2

1

4

Variance

M

e

a

n

Asset 1

Asset 2

Figure 10: Comparison of the eﬃcient frontier of assets with diﬀerent correlation. The correlation of

between asset 1 and asset 2 is -1, 0, 0.5, 1 (from left to right).

µ

P

= (1 −w

1

)µ

rf

+w

1

µ

1

=

µ

1

−µ

rf

σ

1

σ

P

+µ

rf

From this expression for the expected return of the portfolio we can see that the efficient frontier is again a straight line, as in figure 12. The slope $\frac{\mu_1 - \mu_{rf}}{\sigma_1}$ of the function is called the leverage factor.
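A small sketch of this straight frontier with assumed numbers ($\mu_1 = 0.10$, $\sigma_1 = 0.2$, $\mu_{rf} = 0.03$, all illustrative):

```python
mu1, sigma1 = 0.10, 0.2   # illustrative risky asset
mu_rf = 0.03              # illustrative risk-free rate

def frontier_return(sigma_P):
    """Expected return on the straight frontier through (0, mu_rf) and asset 1."""
    slope = (mu1 - mu_rf) / sigma1   # the leverage factor
    return slope * sigma_P + mu_rf

print(frontier_return(0.0))      # fully risk-free lending
print(frontier_return(sigma1))   # fully invested in asset 1: recovers mu1
print(frontier_return(0.4))      # borrowing at mu_rf with w1 = 2
```

Choosing $\sigma_P$ below $\sigma_1$ corresponds to lending part of the wealth risk-free; choosing $\sigma_P$ above $\sigma_1$ corresponds to borrowing (a leverage factor $w_1 > 1$).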

To conclude, one can say that all portfolios constructed with risk-free lending and borrowing lie on one straight line through the point $(0, \mu_{rf})$ and the point representing a portfolio consisting only of the one available risky asset. By changing the leverage factor, one changes $\mu_P$ and $\sigma_P$ in a linear way.

As soon as risk-free lending and borrowing is possible, nobody will be interested anymore in the hyperbola (and its expansion through short sales) described in the section above, but only in the tangent to the hyperbola through $(0, \mu_{rf})$, since it offers a higher $\mu_P$ for a given $\sigma_P$.

In the case that the lending rate is not the same as the borrowing rate, we get an efficient frontier consisting of three parts: it starts on the line of the lending rate until it touches the envelope of all the portfolios built without lending and borrowing, follows that envelope, and finally continues on the line of the borrowing rate to infinity. Since short sales allow only a concave expansion of the efficient frontier to the right and the risk-free efficient frontier is a straight line, short sales are also in this case of no interest anymore. An illustration is given in figure 13.


Figure 11: Short sales allow the construction of portfolios with very large mean and variance because they enlarge the efficient frontier to the right.

Figure 12: Risk-free lending corresponds to the efficient frontier to the left of the asset (intersection at $\mu_{rf}$ with the y-axis) and risk-free borrowing corresponds to the efficient frontier to the right of asset 1.


Figure 13: The efficient frontier (solid line) for different borrowing and lending rates is constructed out of three parts: first it lies on the lending line until it arrives at the hyperbola of the efficient portfolios, which it follows until it reaches the tangent of the borrowing line, where it continues to infinity.


Techniques for calculating the eﬃcient frontier

In this chapter we will explain the techniques to determine the efficient frontier mathematically. Again, we will distinguish the four cases of short sales allowed or not allowed, and risk-free lending and borrowing possible or not possible.

Short sales allowed, risk-free lending and borrowing possible

We start with the simplest case. From the earlier chapter we already know that with short sales allowed and risk-free lending and borrowing possible, there will be one optimal portfolio on the tangent from the risk-free asset (on the y-axis) to the envelope of all the efficient portfolios. Risk-free lending and borrowing makes this tangent the efficient frontier. Our aim is therefore to maximize the slope of this tangent

\theta = \frac{\mu_P - \mu_{rf}}{\sigma_P} \qquad (14)

in order to maximize the return-to-risk ratio. There is a constraint to make sure that the weights add up to one:

\sum_{i=1}^{N} w_i = 1 \qquad (15)

With this setup we have a constrained maximization problem which could be solved with Lagrange multipliers. However, it is possible to turn it into an unconstrained maximization problem by combining the constraint (15) and the objective function (14). In order to do so, we start with:

\mu_{rf} = 1 \cdot \mu_{rf} = \Big(\sum_{i=1}^{N} w_i\Big)\mu_{rf} = \sum_{i=1}^{N} w_i \mu_{rf}

Substituting this and our deﬁnition of the variance of a portfolio (2) into (14), we get

\theta = \frac{\sum_{i=1}^{N} w_i (\mu_i - \mu_{rf})}{\Big[\sum_{i=1}^{N} w_i^2 \sigma_i^2 + \sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} w_i w_j \sigma_{ij}\Big]^{1/2}}

The maximization problem can be solved by setting

\frac{\partial \theta}{\partial w_i} = 0

This gives us a system of equations where we can apply the following substitution

z_i = \frac{\mu_P - \mu_{rf}}{\sigma_P^2}\, w_i

which leads to the following system of $N$ simultaneous equations for $N$ unknowns $z_1, \ldots, z_N$:

\mu_1 - \mu_{rf} = z_1 \sigma_1^2 + z_2 \sigma_{12} + \ldots + z_N \sigma_{1N}
\mu_2 - \mu_{rf} = z_1 \sigma_{12} + z_2 \sigma_2^2 + \ldots + z_N \sigma_{2N}
\vdots
\mu_N - \mu_{rf} = z_1 \sigma_{1N} + z_2 \sigma_{2N} + \ldots + z_N \sigma_N^2


The optimal weights $w_i$ can then be obtained by normalizing:

w_i = \frac{z_i}{\sum_{j=1}^{N} z_j}
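A minimal numerical sketch of this procedure: the system above is just the linear system $\Sigma z = \mu - \mu_{rf}$ in the covariance matrix $\Sigma$, which can be solved directly and normalized to obtain the tangency weights. The expected returns, covariances and risk-free rate below are hypothetical.

```python
import numpy as np

# Hypothetical inputs: expected returns, covariance matrix, risk-free rate
mu = np.array([0.08, 0.10, 0.12])
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])
mu_rf = 0.03

# Solve the system of N simultaneous equations  cov @ z = mu - mu_rf
z = np.linalg.solve(cov, mu - mu_rf)

# Normalize so that the weights add up to one
w = z / z.sum()
```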

Short sales allowed, risk-free lending and borrowing not possible

If there is no risk-free asset available, we can nevertheless assume that there is a fictive risk-free asset with a specified return. Then we are in the case discussed before and can compute the optimal portfolio corresponding to this situation. By changing the return of this fictive risk-free asset to other rates, we can trace out the efficient frontier as the set of optimal portfolios corresponding to the different rates, as shown in figure 14.

Figure 14: In the case of allowed short sales but no risk-free asset, one can determine the efficient frontier as the set of points corresponding to different (fictive) risk-free rates $\mu_{rf1}$, $\mu_{rf2}$, $\mu_{rf3}$.
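A sketch of this tracing procedure, using the same hypothetical inputs as before: for each fictive risk-free rate, solve the tangency system and record the resulting portfolio mean and standard deviation; the collected points trace out the efficient frontier.

```python
import numpy as np

# Hypothetical inputs
mu = np.array([0.08, 0.10, 0.12])
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])

frontier = []
for mu_rf in np.linspace(0.00, 0.06, 13):    # fictive risk-free rates
    z = np.linalg.solve(cov, mu - mu_rf)     # tangency system cov @ z = mu - mu_rf
    w = z / z.sum()                          # normalized weights
    mu_P = w @ mu                            # portfolio mean
    sigma_P = np.sqrt(w @ cov @ w)           # portfolio standard deviation
    frontier.append((sigma_P, mu_P))

frontier = np.array(frontier)
```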

Short sales not allowed, risk-free lending and borrowing possible

With the restriction of no short selling, we get an additional constraint and the optimization problem looks like

\theta = \frac{\mu_P - \mu_{rf}}{\sigma_P} \rightarrow \text{Max}

subject to the constraints

\sum_{i=1}^{N} w_i = 1, \qquad w_i \geq 0 \ \forall i

This last condition makes the problem hard to solve, since we have a quadratic programming problem and no longer an analytical solution. The quadratic aspect is hidden in the objective function: the $\sigma_P$ term contains squared terms in $w_i$. To solve this kind of problem, one can use a standard solver package.
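As one possible solver, a sketch using SciPy's general-purpose `minimize` (maximizing $\theta$ by minimizing its negative); the input data are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical inputs
mu = np.array([0.08, 0.10, 0.12])
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])
mu_rf = 0.03

def neg_theta(w):
    # negative of theta = (mu_P - mu_rf) / sigma_P
    return -(w @ mu - mu_rf) / np.sqrt(w @ cov @ w)

n = len(mu)
res = minimize(neg_theta,
               x0=np.full(n, 1.0 / n),                 # start from equal weights
               bounds=[(0.0, None)] * n,               # w_i >= 0: no short sales
               constraints=[{"type": "eq",
                             "fun": lambda w: w.sum() - 1.0}])
w_opt = res.x
```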


Short sales not allowed, risk-free lending and borrowing not possible

If the investor does not want to allow short sales and no risk-free asset is available, we can solve the following optimization problem for the investor's expected return $\mu_P$:

\sigma_P^2 = \sum_{i=1}^{N} (w_i \sigma_i)^2 + \sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} w_i w_j \sigma_{ij} \rightarrow \text{Min}

subject to

\sum_{i=1}^{N} w_i = 1, \qquad \sum_{i=1}^{N} w_i \mu_i = \mu_P, \qquad w_i \geq 0 \ \forall i

This is also a quadratic programming problem that should be solved with a computer package.
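A sketch of this minimum-variance problem with the same hypothetical inputs as above, again using SciPy's `minimize` with an additional target-return constraint.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical inputs and target return mu_P
mu = np.array([0.08, 0.10, 0.12])
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])
mu_target = 0.10

n = len(mu)
res = minimize(lambda w: w @ cov @ w,                  # portfolio variance
               x0=np.full(n, 1.0 / n),
               bounds=[(0.0, None)] * n,               # no short sales
               constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0},
                            {"type": "eq", "fun": lambda w: w @ mu - mu_target}])
w_opt = res.x
```

Sweeping `mu_target` over a grid of feasible returns yields the whole no-short-sale efficient frontier point by point.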


2 Capital Asset Pricing Model (CAPM)

This chapter presents two linear regression models to answer the question of how an efficient market behaves if every market participant follows the rules of Markowitz. The models will also be used to introduce some important concepts of finance.

2.1 Standard Capital Asset Pricing Model

The Capital Asset Pricing Model describes how a market, consisting of individual agents acting according to the model of Markowitz, behaves in equilibrium. The Capital Asset Pricing Model has several assumptions:

• Investors make decisions solely in terms of expected value, standard deviation and the correlation structure, with a one-period horizon.

• No single investor can affect prices by one action; prices are determined by the actions of all investors in total.

• Investors have identical expectations, and information flows perfectly.

• There are no transaction costs.

• Unlimited short sales are allowed.

• Unlimited lending and borrowing at the risk-free rate is possible.

• Assets are infinitely divisible.

As we have seen above, with allowed short sales but no risk-free lending and borrowing, we get an efficient frontier like the one from A to B in figure 15. The Separation Theorem says that, when we introduce risk-free lending and borrowing, the optimal portfolio can be identified without regard to the risk preference of the investor (optimal portfolio P in the figure). The investors satisfy their risk preferences by combining portfolio P with lending and borrowing, and so obtain a portfolio on the tangent through P.

According to our assumptions, all investors have homogeneous expectations and are offered the same lending and borrowing rate. In this case they will all have exactly the same diagram as figure 15. If all investors have the same diagram, they will also all calculate the same portfolio P (and variably weight it with the risk-free asset). This implies that portfolio P must be, in equilibrium, the market portfolio. The market portfolio consists of all available risky assets, weighted by their market capitalization.

We can summarize this as the Two Mutual Fund Theorem: in equilibrium, all investors will hold combinations of only two portfolios: the market portfolio and a risk-free security.

Figure 16 shows the market portfolio M and the same straight line as in figure 15. This line is called the Capital Market Line. The Capital Market Line defines the linear risk-return trade-off for all investment portfolios. It is the new efficient frontier that results from risk-free lending and borrowing. All investors will end up on it, since it contains all the efficient portfolios. The equation of this line, connecting the risk-free asset and the market portfolio M, is


Figure 15: The efficient frontier and its tangent at the optimal portfolio. By lending and borrowing, one moves on the tangent: portfolio P is without lending and borrowing. If one borrows additional capital from somebody, one gets a portfolio on the tangent to the right of P, and if one lends capital to somebody, one gets a portfolio on the tangent to the left of P.

Figure 16: The Capital Market Line describes the linear relation between risk and return for a portfolio. The market portfolio is depicted as M.


Figure 17: The Security Market Line describes the linear relation between risk and return for an individual asset.

\mu_P = \mu_{rf} + \Big(\frac{\mu_M - \mu_{rf}}{\sigma_M}\Big)\sigma_P

This can be interpreted as

expected return = reward for time + reward for risk × amount of risk.

Let's have a look at the individual assets: the relevant measure here is their covariance with the market portfolio ($\sigma_{iM}$). This is described by the Security Market Line, which defines the linear risk-return trade-off for individual stocks. Its formula is

\mu_i = \mu_{rf} + \Big(\frac{\mu_M - \mu_{rf}}{\sigma_M}\Big)\frac{\sigma_{iM}}{\sigma_M}

At this point we would like to introduce a factor called beta. It is a constant that measures the expected change in the return of an individual security $R_i$ given a change in the return of the market $R_M$. It can be estimated by

\beta_{iM} = \frac{\sigma_{iM}}{\sigma_M^2}

We can use this to substitute beta for the two variances:

\mu_i = \mu_{rf} + (\mu_M - \mu_{rf})\,\beta_i

Finally we derive a single index model that describes the relation between the return on individual securities and the overall market at a time point $t$:

R_{it} = \alpha_i + \beta_i R_{Mt} + \epsilon_i \qquad (16)

where
$\alpha_i$: the part of the return of security $i$ that is independent of the market's performance $R_{Mt}$,
$\beta_i$: the sensitivity of the return of security $i$ to the market's performance $R_{Mt}$,
$R_{Mt}$: the return of the market,
$\epsilon_i$: a random error term with mean equal to zero.

Beta measures how sensitive a stock's return is to the return of the market. A beta of two means that the return of the stock will be double the return of the market (no matter whether it is a loss or a gain). Similarly, a beta of 0.5 means that the stock will move only half as much as the market does. In other words, a stock with a high beta gets a high risk premium and a stock with a low beta gets a low risk premium.

The intention of splitting the return of a stock into a part that is related to the market ($\beta_i R_{Mt}$) and a part that is related to the individual stock ($\alpha_i$) comes from the observation that when the market goes up, most stocks follow this trend, and vice versa. A part of the stock return is therefore related to the market return. It is interesting to see in (16) that the return is only influenced by the market risk: investors don't receive a premium for holding additional diversifiable, non-market risk.
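The single index model (16) can be estimated by ordinary least squares. A sketch on simulated data follows; all numbers below (the market series, the true alpha and beta) are hypothetical and exist only to illustrate the estimation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated monthly market returns and a security generated from the
# single index model with known (hypothetical) alpha and beta
r_market = rng.normal(0.01, 0.04, size=120)
true_alpha, true_beta = 0.002, 1.3
r_stock = true_alpha + true_beta * r_market + rng.normal(0.0, 0.02, size=120)

# OLS estimates: beta_hat = cov(R_i, R_M) / var(R_M), alpha_hat from the means
beta_hat = np.cov(r_stock, r_market, ddof=1)[0, 1] / np.var(r_market, ddof=1)
alpha_hat = r_stock.mean() - beta_hat * r_market.mean()
```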

We can summarize that the Capital Asset Pricing Model is a theoretical model to identify the tangency portfolio. It uses some ideal assumptions about the economy to conclude that the capital-weighted world wealth portfolio is the tangency portfolio and that every investor will hold this portfolio.

2.2 Arbitrage Pricing Theory (APT)

The Arbitrage Pricing Theory is an alternative approach to determining asset prices. It was first introduced in [37] and is based on the idea that exactly the same instrument cannot be priced differently.

As we have seen, the Capital Asset Pricing Model has some quite restrictive assumptions. This gives space for the Arbitrage Pricing Theory. It asks for the following conditions to be fulfilled:

• Returns are generated according to a linear factor model.

• The number of assets N is close to infinite.

• Investors have homogeneous expectations (same as in CAPM).

• Capital markets are perfect (perfect competition, no transaction costs; same as CAPM).

The Arbitrage Pricing Theory states that returns of stocks are generated by a linear model consisting of $F$ factors $I_j$:

R_i = a_i + b_{i1} I_1 + b_{i2} I_2 + \ldots + b_{iF} I_F + e_i \qquad (17)

where
$a_i$: the expected return for stock $i$ if all factors have a value of zero,
$I_j$: the value of factor $j$ that impacts the return on stock $i$,
$b_{ij}$: the sensitivity of stock $i$'s return to factor $j$,
$e_i$: a random error term with mean equal to zero and variance equal to $\sigma_{e_i}^2$. This error is uncorrelated with the factors $I_j$ and the errors of the other assets (unsystematic risk).


If the assumptions hold, we can combine the assets to get a risk-free portfolio that requires zero net investment (e.g. by short selling certain assets and buying others with the revenue). The fundamental implication of the Arbitrage Pricing Theory is that such a free, risk-free portfolio (arbitrage portfolio) must have a zero return on average. This is intuitive, since a risk-free portfolio with a non-zero expected return is an arbitrage opportunity which would be exploited immediately by market participants and hence disappear.

Let's express this in a more mathematical way. Using (17) we can write the expected portfolio return as

\mu_P = \sum_{i=1}^{N} w_i a_i + \sum_{i=1}^{N} w_i b_{i1} I_1 + \ldots + \sum_{i=1}^{N} w_i b_{iF} I_F + \sum_{i=1}^{N} w_i e_i \qquad (18)

We have assumed that the number of stocks is close to infinite, so it is possible to find a portfolio that satisfies the following properties:

\sum_{i=1}^{N} w_i = 0

\sum_{i=1}^{N} w_i a_i = 0

\sum_{i=1}^{N} w_i b_{i1} = 0

\sum_{i=1}^{N} w_i b_{i2} = 0

\vdots

\sum_{i=1}^{N} w_i b_{iF} = 0

The first condition states that we have no net investment, since we want an arbitrage portfolio. The second condition requires the expected return of this portfolio to be zero if all factors are set to zero (no-arbitrage condition). The remaining conditions imply that the portfolio has no risk, since it has no exposure to any of the factors. These three types of conditions are called orthogonality constraints. Applying them to (18), we can see that it must produce an expected return of zero. Again, if this did not hold true, investors would have a free money generator.

It follows that the orthogonality constraints imply that the expected returns $\mu_{R_i}$ are a linear combination of the $b_{ij}$ and a constant. This means that there exists a set of factors $\lambda_0, \ldots, \lambda_F$ such that

\mu_{R_i} = \lambda_0 + \lambda_1 b_{i1} + \ldots + \lambda_F b_{iF}

The $b_{ij}$ can still be interpreted as the sensitivity of the assets to a change in an underlying factor $I_j$. In contrast, the $\lambda_j$ represent the risk premia of the respective factors.

We now determine the $\lambda_j$ by using the fact that an asset with exposure to exactly one factor and no exposure to the other factors has the same risk premium as this factor. For each $\lambda_j$, $j = 1 \ldots F$, we do the following: the respective $b_{ij}$ is set to 1 and all the others equal to 0.

With this procedure we find that

\mu_{R_i} = \lambda_0 + b_{i1}(\mu_{R_1} - \lambda_0) + \ldots + b_{iF}(\mu_{R_F} - \lambda_0)

We assume that for $i = 0$ we have the risk-free asset, since the risk-free asset does not depend on any of the factors ($b_{0j} = 0$, $j = 1 \ldots F$). For this special case of the APT model we get

\mu_{R_0} = \lambda_0 = \mu_{rf}

and therefore we can express the model as a formula for the excess return:

\mu_{R_i} - \mu_{rf} = b_{i1}(\mu_{R_1} - \mu_{rf}) + \ldots + b_{iF}(\mu_{R_F} - \mu_{rf})

The Capital Asset Pricing Model can be seen as a very special case of the Arbitrage Pricing Model with only one factor (single index model). This can be shown by setting $F = 1$. Then we are left with

R_i = a_i + b_{i1} I_1 + e_i

Now we can interpret $a_i$ as the return of the risk-free asset $\mu_{rf}$ and $b_{i1} I_1$ as the return of the market portfolio $R_M$ times the leverage factor:

R_i = \mu_{rf} + b_{i1} R_M + e_i

And this is the same expression as (16) for the CAPM.

Factor analysis is the principal methodology used to estimate the factors $I_j$ and factor loadings $b_{ij}$. Since it is not possible to calculate a perfect specification of the model described by (17), a factor analysis will derive a good approximation. The criterion for the goodness of the approximation is the covariance of residual returns, which should be minimal. To execute a factor analysis, one has to determine the number of desired factors in advance. By repeating this process for an increasing number of factors, one gets one solution for each number of factors. A criterion to stop increasing the number of factors would be that the probability that the next factor explains a statistically significant portion of the covariance drops below some level (e.g. 50%).

There are factor analysis methods that produce orthogonal factors (e.g. principal component analysis) and others that produce non-orthogonal factors. It may be a disadvantage to choose a method that creates orthogonal factors, since the factors it creates do not exist in the real world and can therefore not be interpreted. However, they can be used in a purely statistical model by assuming that the past data will remain valid for the next step and applying them to calculate one step into the future. The non-orthogonal model might not be as accurate, but as soon as one identifies the factors (like indices or interest rates) and their respective weights, one can apply the model in the future with new data from these factors.
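A sketch of extracting orthogonal factors via principal component analysis (the return matrix below is simulated, purely for illustration): the eigenvectors of the covariance matrix serve as factor loadings, and the eigenvalues measure how much variance each factor explains.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical return matrix: T observations of N assets
T, N = 500, 10
returns = rng.normal(0.0, 0.02, size=(T, N))

# Principal component analysis of the return covariance matrix
X = returns - returns.mean(axis=0)
cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]               # sort descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = 3                                           # number of desired factors
loadings = eigvecs[:, :k]                       # analogue of the b_ij
factors = X @ loadings                          # orthogonal factor time series
explained = eigvals[:k].sum() / eigvals.sum()   # share of variance explained
```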

To conclude, we can say that the Arbitrage Pricing Model has a number of benefits: it is not as restrictive as the Capital Asset Pricing Model in its requirements concerning the distribution of the returns and the investors' utility function. It also allows multiple sources of risk to explain the stock return movements. Further, it avoids using the concept of a market portfolio. This is an advantage because this concept is hard to observe in practice.

The flexibility is also the main disadvantage of the model: the investors have to decide which sources of risk they want to include and how to weight them. Further, the APT model might not be as intuitive as the CAPM.


Nevertheless, the Arbitrage Pricing Theory remains the newest and a promising explanation

of relative returns.


Part II

Beyond Markowitz

We have seen in the first part that the approach to optimize a portfolio as proposed by Markowitz asks for some strong assumptions like normally distributed returns. In this second part we will investigate whether it can be assumed that the returns of financial assets are produced by a normal distribution. As we will see, there are several aspects that indicate that this assumption does not hold. We will use this as justification for analyzing further portfolio optimization algorithms that do not place such a strong requirement on the underlying distribution function of the asset returns.

3 Stylized Facts Of Asset Returns

In this chapter we will present some statistical tests to investigate the characteristic properties of financial market data. The tests are chosen with respect to the properties that are especially important for financial time series. The tests for determining the form of the underlying distribution function that has created the returns are goodness of fit (Kolmogorov-Smirnov test), kurtosis and skewness (Jarque-Bera test) and quantile-quantile plots. Concerning the form of the distribution function, we especially test for the normal distribution. Further, we have selected two tests for detecting dependencies and long memory effects in the time series: the Runs test for randomness and the BDS test for dependencies.

The focus of the tests as a whole lies on the detection of fat tail behavior rather than dependencies. The tests are presented in their functionality and demonstrated on representative, artificial data. In part III of the thesis the tests are applied to real market data and the resulting conclusions are drawn.

Non normality in return distributions

A very important question in financial analysis is that of the distribution function of the asset returns. Since a lot of methods and theorems assume a certain distribution function, it is crucial to analyze the origin of the returns. There are two aspects of the distribution function that has created the asset returns that should be considered:

• Form: does the distribution have fat tails or skewness?

• Dependencies: do the returns depend on earlier return values?

The normal distribution was first mentioned by de Moivre in 1733 [31]. The advantages of this distribution are:

• It can be defined by only two variables: mean and variance.

• It describes random behavior in natural mechanisms.


For these reasons, and the fact that it is possible to fit it as a first approximation to asset returns, the normal distribution is used a lot in financial analysis and is still considered the standard assumption. However, in 1963 Mandelbrot [29] observed that financial returns might not be produced by a normal distribution.

3.1 Distribution Form Tests

Goodness of ﬁt test (Kolmogorov-Smirnov test)

We start with the Kolmogorov-Smirnov one-sample test, which can be used to answer the question whether a sample comes from a population with a specific distribution. The test is based on the empirical distribution function of the given samples and is restricted to testing against continuous distributions.

Assume we are given the samples $X_1, X_2, \ldots, X_N$. We can order them and calculate the empirical distribution function as

E_N(X_i) = \frac{n(i)}{N}

with $n(i)$ as the number of samples that are smaller than $X_i$. The Kolmogorov-Smirnov test determines the maximum distance between this empirical distribution function and the cumulative distribution function of the assumed underlying distribution. Figure 18 shows a chart with these two distribution functions.

Figure 18: The Kolmogorov-Smirnov test calculates the maximum difference between the empirical distribution function of the samples (dotted line) and the cumulative distribution function of the assumed underlying distribution (solid line).

The hypotheses of the test are defined as:

Null hypothesis: the data follows the assumed distribution.

Alternative hypothesis: the data does not follow the assumed distribution.

The precise test statistic is

D = \max_{1 \leq i \leq N} \Big| F(X_i) - \frac{i}{N} \Big|

with $F(X_i)$ as the assumed underlying distribution function. The null hypothesis is rejected if $\sqrt{N}\,D_N$, depending on the confidence level, is greater than the critical value derived from the Kolmogorov distribution.

There are two equivalent ways to handle the underlying distribution. In both, the mean $\hat\mu$ and standard deviation $\hat\sigma$ of the underlying distribution need to be estimated from the given samples. It is then possible to compare the samples to a normal distribution with the estimated mean $\hat\mu$ and standard deviation $\hat\sigma$. Alternatively, one can transform the given samples according to

\hat{X}_i = \frac{X_i - \hat\mu}{\hat\sigma} \qquad (19)

and compare the transformed samples to a standard normal distribution.
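A sketch of the second variant using SciPy's `kstest` on simulated fat-tailed returns (the data and parameters below are hypothetical): standardize via (19), then test against the standard normal distribution.

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(2)

# Hypothetical daily returns drawn from a Student-t distribution (fat tails)
returns = rng.standard_t(df=4, size=1000) * 0.01

# Standardize with the estimated mean and standard deviation, as in (19)
z = (returns - returns.mean()) / returns.std(ddof=1)

# Compare the transformed samples to the standard normal distribution
stat, p_value = kstest(z, "norm")
```

A small `p_value` would lead to rejecting the normality hypothesis at the chosen confidence level.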

Some points make the Kolmogorov-Smirnov test unsatisfactory for our purpose. First, since the test compares the absolute difference between the two cumulative distributions, it underweights differences in the tails and overweights differences near the mean of the distribution; however, we especially want to check whether our distribution has fat tails. The second disadvantage of the Kolmogorov-Smirnov test is that it is a very general method (it can also be used for comparison with distributions other than the normal) and thus takes only the mean and variance of a distribution into consideration.

Skewness and kurtosis (Jarque-Bera test)

For the Kolmogorov-Smirnov test we were looking at the first and second moments of the distribution:

\mu = \sum_i w_i x_i, \qquad \sigma^2 = \sum_i w_i (x_i - \mu)^2

In terms of the normal distribution, often the third and fourth moments become interesting. Skewness is the standardized third moment:

\varsigma = \frac{\sum_i w_i (x_i - \mu)^3}{\sigma^3}

Skewness can be interpreted as a measure of the asymmetry of a distribution function: a value of 0 indicates perfect symmetry (e.g. the normal distribution), a positive skewness means an increased probability at the higher quantiles (heavy right tail), and a negative skewness means an increased probability at the lower quantiles (heavy left tail). Figure 19 shows some examples of empirical distributions with skewness.

The standardized fourth moment is called kurtosis. Because the normal distribution has a kurtosis of 3, one often calculates the excess kurtosis, which is the kurtosis minus 3:

\kappa = \frac{\sum_i w_i (x_i - \mu)^4}{\sigma^4} - 3


Figure 19: The charts show skewed normal distributions (solid) in comparison with a normal distribution (dotted). The left chart is drawn from a standard normal distribution with a shape parameter of -3, while the right chart is drawn from a standard normal distribution with a shape parameter of 1.

The kurtosis of a distribution defines whether the distribution has fat tails in comparison with a normal distribution or not. The following holds true for most financial time series: a negative excess kurtosis indicates that both tails are less pronounced and the distribution is less peaked than a normal distribution (platykurtic). A distribution with an excess kurtosis of 0 is called mesokurtic. The opposite of platykurtic, a positive excess kurtosis, means fat tails and more peakedness than a normal distribution (leptokurtic). If there is excess kurtosis, the mid-range values on both sides of the mean have less weight than in a normal distribution. This means that distributions with a high kurtosis are appropriate when the returns are likely to be very small or very large, but not very likely to have values between these two extremes.

Figure 20: The charts show a Student-t distribution with an excess kurtosis of 6.7 (solid) in comparison with a normal distribution (dotted). The left chart uses a linear y-axis whereas the right chart uses a logarithmic y-axis to make the excess kurtosis more explicit. In the log chart the fat tails of the Student-t distribution appear as a line above the tails of the normal distribution.

With these definitions, the normal distribution has a skewness and an excess kurtosis of 0. The Jarque-Bera test calculates the skewness and excess kurtosis of a given sample to find out whether it comes from a normal distribution (with a value of 0 for both) or not. The test statistic is as follows:


If we assume normality for the underlying distribution, the standard errors for the estimated skewness $\hat\varsigma$ and excess kurtosis $\hat\kappa$ are approximately $\sqrt{6/N}$ and $\sqrt{24/N}$, with $N$ as the sample size. The Jarque-Bera test is defined as

JB = N \Big[ \frac{\hat\varsigma^2}{6} + \frac{\hat\kappa^2}{24} \Big] \qquad (20)

and is asymptotically chi-squared distributed with 2 degrees of freedom.
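A direct implementation of (20) as a sketch, with p-values from the chi-squared distribution; the two test samples below are simulated purely for illustration.

```python
import numpy as np
from scipy.stats import chi2

def jarque_bera(x):
    """Jarque-Bera statistic and p-value following (20)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    z = x - x.mean()
    s = z.std()                                  # standard deviation
    skew = np.mean(z**3) / s**3                  # estimated skewness
    exkurt = np.mean(z**4) / s**4 - 3.0          # estimated excess kurtosis
    jb = n * (skew**2 / 6.0 + exkurt**2 / 24.0)
    p = chi2.sf(jb, df=2)                        # asymptotically chi2, 2 dof
    return jb, p

rng = np.random.default_rng(3)
jb_norm, p_norm = jarque_bera(rng.normal(size=5000))       # normal sample
jb_t, p_t = jarque_bera(rng.standard_t(df=4, size=5000))   # fat-tailed sample
```

For the fat-tailed Student-t sample the statistic is far larger and its normality hypothesis is rejected, illustrating the test's sensitivity to excess kurtosis.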

Quantile-Quantile plot

In this section we would like to present a graphical method to assign some sample data to a possible distribution. An $\alpha$ quantile is defined as the $x$ such that

P[X < x] = \alpha

The quantile-quantile plot (QQ plot) is a scatter plot with the quantiles of the given empirical distribution on the vertical axis and the quantiles of the theoretical distribution on the horizontal axis. In order to calculate the quantiles of the empirical distribution, one first has to transform the empirical distribution according to the standard normal transformation (19). One can then draw the QQ plot as a scatter plot of the transformed empirical quantiles against the standard normal quantiles.

In [19] the main merits of a QQ plot are described as:

• If a random sample set is compared to its own distribution, the plot should look roughly linear.

• If there are a few outliers contained in the data, it is possible to identify them by looking at the scatter plot.

• If one distribution is transformed by a linear function, this transforms the QQ plot by the same linear transformation. The transformation can be estimated from the plot (slope and intercept).

• It is possible to deduce small differences between the participating distributions from the plot (e.g. fat tails imply curves at the left and right ends).

Figure 21 shows a QQ plot for a sample from a Student-t distribution with excess kurtosis. A distribution with excess kurtosis has a larger probability of events with very large or very small values in comparison to the normal distribution. From this we can conclude that fat tails will appear in a QQ plot as deviations from the diagonal at the extreme values. The deviation will be upwards for the high values and downwards for the low values.

3.2 Dependencies Tests

Runs test for randomness

The runs test can be used to decide if a data set comes from a random process. It uses the concept of a run, which is defined as a sequence of increasing values or a sequence of decreasing values. The length of a run is the number of values belonging to this run. The runs test is based on the binomial distribution, which defines the probability that the $i$-th value is larger or smaller than the $(i+1)$-th value.


Figure 21: The charts show QQ plots for a Student-t distribution with 4 degrees of freedom in comparison with the normal distribution. The left chart was created from a sample set of 1000 elements from the Student-t distribution and the right chart directly from the quantiles of the same normal and Student-t distributions. The right chart is therefore smoother. The fat tails of the Student-t distribution appear in both charts as deviations from the diagonal.

For the test we have to calculate the $n_i$'s, the number of runs of length $i$ for $1 \leq i \leq 10$. We can then normalize the $n_i$'s with the expected number of runs of length $i$ ($\mu_{n_i}$) and the standard deviation of the number of runs of length $i$ ($\sigma_{n_i}$). The values $\mu_{n_i}$ and $\sigma_{n_i}$ can be obtained from the binomial distribution. The final test value is the normalized $n_i$:

z

i

=

n

i

−µ

n

i

σ

n

i

which is compared to the two sided standard normal table. A z

i

value greater than the table en-

try indicates non-randomness. Figure 22 and 23 show some outcome of AR(1) and GARCH(1,1)

processes with the corresponding test results.
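A minimal sketch of the same idea (not the thesis's per-length implementation): a simplified runs-up-and-down test that normalizes the total number of runs, whose mean (2n − 1)/3 and variance (16n − 29)/90 under randomness are standard results:

```python
import math
import random

def runs_test(xs):
    # Simplified runs-up-and-down test: counts the total number of runs of
    # consecutive increases/decreases instead of the per-length counts n_i.
    signs = [1 if b > a else -1 for a, b in zip(xs, xs[1:]) if b != a]
    runs = 1 + sum(s != t for s, t in zip(signs, signs[1:]))
    n = len(xs)
    mu = (2 * n - 1) / 3                    # expected number of runs
    sigma = math.sqrt((16 * n - 29) / 90)   # its standard deviation
    return (runs - mu) / sigma              # compare with +-1.96 at 5%

random.seed(1)
iid = [random.gauss(0.0, 1.0) for _ in range(1000)]
trend = list(range(1000))   # a single run: clearly non-random
print(runs_test(iid), runs_test(trend))
```

For the i.i.d. sample the statistic stays within the ±1.96 band, while the monotone series produces a hugely negative value and is rejected.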

BDS test for dependencies

The BDS test is a non-parametric method of testing for nonlinear patterns in time series. It

was ﬁrst developed by Brock, Dechert and Scheinkman in 1987 (see [11]). The test has the null

hypothesis that the data in the time series is independently and identically distributed (iid)

and is in [8] deﬁned as

B_T = √(T − m + 1) · (C_T(m, ε) − C_T(1, ε)^m) / σ̂(m, ε)

where

• C_T(m, ε) is the correlation integral defined by C_T(m, ε) = ((T − m) choose 2)^{−1} · Σ_{s<t} I_ε(Y^m_t, Y^m_s)

• Y^m_t = (y_t, y_{t+1}, …, y_{t+m−1}) is the m-history of y_t


Figure 22: The charts show a sample set derived from an AR(1) process with a coefficient φ = 0.5. The process corresponding to the left picture had a standard normal distribution as innovation function and the process corresponding to the right picture had a student-t distribution with 4 degrees of freedom for the innovation function. The Runs test calculates a value n_1 = −0.70 for the left chart and a value n_1 = −0.55 for the right chart. The standard normal table shows at the 5% significance level a value of 1.96. Since −0.70 and −0.55 are contained within ±1.96, we can conclude that both underlying processes that have created the sample sets were random.

• I_ε(Y^m_t, Y^m_s) is the indicator function with I_ε(Y^m_t, Y^m_s) = 1 if ‖Y^m_t − Y^m_s‖ < ε, and I_ε(Y^m_t, Y^m_s) = 0 otherwise; ε is a positive constant

• ‖Y^m_t − Y^m_s‖ is the max-norm of Y^m_t − Y^m_s: ‖Y^m_t − Y^m_s‖ := max(|y_t − y_s|, |y_{t+1} − y_{s+1}|, …, |y_{t+m−1} − y_{s+m−1}|)

• σ̂²(m, ε) is a consistent estimator of the asymptotic variance of √(T − m + 1) · C_T(m, ε)

The underlying idea of the BDS test can be seen in the following: the random event {I_ε(Y^m_t, Y^m_s) = 1} is the same as

{‖Y^m_t − Y^m_s‖ < ε} = {|y_t − y_s| < ε} ∩ … ∩ {|y_{t+m−1} − y_{s+m−1}| < ε}

Let A_{t,s}(1, ε) = {|y_t − y_s| < ε}. The above relationship can then be expressed as

A_{t,s}(m, ε) = A_{t,s}(1, ε) ∩ … ∩ A_{t+m−1,s+m−1}(1, ε)

If {y_t} is an i.i.d. sequence, then the events A_{t,s}(1, ε), …, A_{t+m−1,s+m−1}(1, ε) will be independent, so

P[A_{t,s}(m, ε)] = P[A_{t,s}(1, ε)]^m

Since the correlation integral C_T(m, ε) converges in distribution to P[A_{t,s}(1, ε)]^m, the BDS test tests the null hypothesis of serial independence by checking whether C_T(m, ε) is sufficiently close to C_T(1, ε)^m.
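A minimal numerical illustration of this convergence (a sketch, not the thesis's implementation): for an i.i.d. series the correlation integral C_T(2, ε) should be close to C_T(1, ε)²:

```python
import random
from itertools import combinations

def corr_integral(y, m, eps):
    # C_T(m, eps): fraction of pairs of m-histories lying within eps
    # of each other in the max-norm.
    hist = [y[t:t + m] for t in range(len(y) - m + 1)]
    pairs = list(combinations(hist, 2))
    close = sum(max(abs(a - b) for a, b in zip(u, v)) < eps
                for u, v in pairs)
    return close / len(pairs)

random.seed(7)
y = [random.random() for _ in range(300)]   # i.i.d. uniform series
c1 = corr_integral(y, 1, 0.25)
c2 = corr_integral(y, 2, 0.25)
print(c2, c1 ** 2)   # nearly equal for an independent series
```

For a dependent series (e.g. an AR(1) path) the gap between C_T(2, ε) and C_T(1, ε)² is what the normalized B_T statistic detects.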

The BDS statistic is easy to compute; however, it has a disadvantage: the user has to define the two free parameters, the maximum embedding dimension m and the relative radius ε, ex ante.


Figure 23: The charts show the same calculations as in figure 22 with a GARCH process (as described in [10]) as underlying function. Again, the process corresponding to the left picture had a standard normal distribution as innovation function and the process corresponding to the right picture had a student-t distribution with 4 degrees of freedom for the innovation function. The Runs test calculates a value n_1 = −0.65 for the left chart and a value n_1 = −0.62 for the right chart. So we can again conclude that both underlying processes that have created the sample sets were random.

We will use the same AR(1) and GARCH(1,1) processes as described in figures 22 and 23 for the Runs test and apply the BDS test to them. The following part shows the detailed BDS analysis for the AR(1) process with normal innovation:

Embedding dimension = 2, 3

Epsilon for close points = 0.5836, 1.1672, 1.7508, 2.3344

Standard Normal =

[ 0.5836 ] [ 1.1672 ] [ 1.7508 ] [ 2.3344 ]

2 15.7714 16.6025 17.0971 18.2606

3 14.0135 14.8726 15.2300 16.4728

p-value =

[ 0.5836 ] [ 1.1672 ] [ 1.7508 ] [ 2.3344 ]

2 0 0 0 0

3 0 0 0 0

The test program has decided to use 0.58, 1.2, 1.8 and 2.3 as ε and to calculate the statistics for embedding dimensions 2 and 3. The first table shows the test results for each combination of embedding dimension and ε. Since all values lie above the threshold given by the standard normal distribution, we can (correctly) conclude that the series is not independent. The second table shows the p-values for the statistics. We can have great confidence in the results because of the very low p-values.

The following table summarizes the results of the BDS test applied to the four processes:

Process Innovation function ε used Range of test results
AR(1) Standard Normal 0.58 1.2 1.8 2.3 14 - 18
AR(1) Student-t 0.81 1.6 2.4 3.2 14 - 19
GARCH Standard Normal 0.0016 0.0032 0.0049 0.0065 2.8 - 4.9
GARCH Student-t 0.0038 0.0076 0.011 0.015 9.2 - 14

The results for the AR(1) process with a student-t distribution as innovation function lie between 14 and 19 for ε values of 0.81, 1.6, 2.4 and 3.2; this series is therefore also not produced by an independent process. In the case of the GARCH process with normal innovation function the results are less clear-cut: we get values between 2.8 and 4.9 for the test statistic, which is still larger than the corresponding value for the standard normal distribution, and therefore we can declare this time series as not independent as well. The reason for these small values might lie in the fact that the test program has chosen the relative radius ε very small: 0.0016, 0.0032, 0.0049 and 0.0065.

A BDS test for GARCH with student-t innovation function produces values between 9.2 and 14 as test statistics. The ε is chosen as 0.0038, 0.0076, 0.011 and 0.015. Therefore we can conclude that this time series is also not independent.

3.3 Results Of Statistical Tests Applied To Market Data

Kolmogorov-Smirnov test

First we apply the Kolmogorov-Smirnov test to the market data to get an impression of whether they are normally distributed. We have calculated the test results for all of the listed market time series. The Kolmogorov-Smirnov test value is determined for different data intervals. This means that the given daily data (D) was aggregated to bi-daily data (BD), weekly data (W), bi-weekly data (BW), monthly data (M) and quarterly data (Q). For each of these data sets and each mentioned index the test result is calculated. The values are listed in the following table. Each column represents an index, whereby E stands for 'Equity' and B for 'Bond'. Each row contains a time interval, abbreviated as explained above.

Interval E World E EU E US E FE E CH B World B EU B US B FE B CH

D 4.6 4.3 3.8 4.6 4.4 4.4 4.6 3.7 3.9 4.5

BD 4.3 4.3 3.6 3.5 3.9 4.0 3.5 3.4 3.5 3.7

W 3.8 3.2 3.2 3.8 3.8 3.2 3.8 3.2 3.6 3.7

BW 4.6 3.5 3.3 3.7 3.7 2.7 3.5 3.4 4.1 3.0

M 3.2 2.5 3.1 2.4 3.8 2.8 2.6 3.5 3.0 2.7

Q 3.3 2.0 2.4 2.7 2.2 3.1 2.4 3.1 4.0 2.4

According to the results, no time series can be assumed to be normally distributed. However, we can see that the lower the data frequency, the closer we get to the confidence value and therefore to normally distributed returns.
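A sketch of the aggregation-plus-test procedure on simulated fat-tailed data (stdlib only; the √n scaling of the printed statistic is an assumption about how the table's values were made comparable across sample sizes):

```python
import math
import random
from statistics import NormalDist, mean, stdev

def aggregate(returns, k):
    # aggregate (log-)returns to a k times lower frequency by summation
    return [sum(returns[i:i + k]) for i in range(0, len(returns) - k + 1, k)]

def ks_statistic(xs):
    # Kolmogorov-Smirnov distance between the empirical distribution
    # and a normal distribution fitted to the sample
    nd = NormalDist(mean(xs), stdev(xs))
    xs = sorted(xs)
    n = len(xs)
    return max(max((i + 1) / n - nd.cdf(x), nd.cdf(x) - i / n)
               for i, x in enumerate(xs))

def fat_tailed():
    # illustrative "daily return": student-t(4), a fat-tailed distribution
    z = random.gauss(0.0, 1.0)
    chi2 = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(4))
    return z / math.sqrt(chi2 / 4.0)

random.seed(3)
daily = [0.01 * fat_tailed() for _ in range(2000)]
for k, label in [(1, "D"), (5, "W"), (21, "M")]:
    agg = aggregate(daily, k)
    print(label, round(math.sqrt(len(agg)) * ks_statistic(agg), 2))
```

On real market series the same loop, run over the D/BD/W/BW/M/Q intervals, produces one row of the table per index.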

Skewness, Kurtosis and Jarque-Bera test

We have calculated the skewness and kurtosis for all of the listed market time series except the MSCI Europe and the Lehman Aggregated Euro Bond Index, since there is too little data available for these two indices. For all of the others we have taken the last 1953 sample points of the available data, i.e. all available data from the SBI Foreigner index and the last 1953 samples from some of the other used indices. Again we have aggregated the daily data to obtain lower-frequency data as well. The values for the skewness are listed in the following table.


Interval E World E US E FE E CH B World B US B FE B CH

D -0.11 -0.089 0.32 -0.094 0.17 -0.47 -0.46 0.32

BD -0.23 -0.046 0.30 -0.22 0.063 -0.47 -0.39 0.30

W -0.40 -0.35 0.59 -0.57 -0.073 -0.50 -0.061 0.20

BW -0.38 -0.50 0.94 -0.61 0.25 -0.48 -0.44 0.43

M -0.54 -0.27 0.60 -0.66 0.72 -0.34 -1.0 0.60

Q 0.29 0.050 0.030 0.10 0.50 -0.090 0.26 0.65

The next table shows the respective values for the kurtosis.

Interval E World E US E FE E CH B World B US B FE B CH

D 5.0 5.5 6.6 5.6 4.8 5.5 9.2 5.2

BD 4.3 3.9 5.4 6.6 3.6 4.7 7.5 3.9

W 3.6 4.5 5.0 4.7 3.4 4.1 4.9 3.4

BW 5.1 5.2 5.5 7.4 3.4 3.7 4.0 3.0

M 3.6 2.6 3.6 4.2 4.1 3.1 6.5 3.5

Q 2.6 2.4 2.3 3.1 2.5 2.4 3.7 3.2

The results for the kurtosis are also summarized in figure 24. It is visible that the value of the kurtosis tends, for longer data periods, towards the kurtosis of the normal distribution, which is 3. From this we can conclude that time series with a longer time interval, like monthly or quarterly data, can be better fitted to a normal distribution than data with a higher frequency, like intra-day or daily data, which exhibit excess kurtosis. In [5], page 287, we can also find the conclusion that in most liquid financial markets there is highly significant excess kurtosis in intra-day returns, which decreases with the sampling frequency.
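The aggregation effect on the kurtosis can be reproduced on simulated fat-tailed data (illustrative, not the market series themselves):

```python
import math
import random
from statistics import mean

def kurtosis(xs):
    # fourth standardized moment; equals 3 for a normal distribution
    m = mean(xs)
    v = mean([(x - m) ** 2 for x in xs])
    return mean([(x - m) ** 4 for x in xs]) / v ** 2

def fat_tailed():
    # student-t(4) as Z / sqrt(chi2_4 / 4): pronounced excess kurtosis
    z = random.gauss(0.0, 1.0)
    chi2 = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(4))
    return z / math.sqrt(chi2 / 4.0)

random.seed(8)
daily = [fat_tailed() for _ in range(4200)]
# aggregate 21 "daily" returns into one "monthly" return
monthly = [sum(daily[i:i + 21]) for i in range(0, len(daily), 21)]
print(round(kurtosis(daily), 2), round(kurtosis(monthly), 2))
```

The daily series shows heavy excess kurtosis, while the aggregated monthly series moves towards the normal value of 3, mirroring the downward drift of the lines in figure 24.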


Figure 24: The chart shows the evolution of the kurtosis for several market time series and data intervals. Each line depicts a certain market time series for increasing interval lengths. The length of an interval is encoded according to: D: daily, BD: bi-daily, W: weekly, BW: bi-weekly, M: monthly, Q: quarterly. We can see that, for longer data intervals, the values for the kurtosis approach the kurtosis of a normal distribution (dotted line).

According to (20) we can calculate the test statistic for the Jarque-Bera test from the skewness and kurtosis. The resulting table looks like

Interval E World E US E FE E CH B World B US B FE B CH

D 320 500 1100 540 262 570 3200 410

BD 74 33 250 550 18 150 850 48

W 17 46 87 66 2.9 37 58 4.7

BW 40 46 79 170 3.6 11 14 6.0

M 6.0 1.6 7.0 12 13 1.8 63 6.4

Q 0.57 0.54 0.67 0.075 1.6 0.52 0.90 2.2

These results are compared with a χ² distribution with two degrees of freedom. This distribution has its threshold for a 5% confidence level at 5.99. From this we can conclude that the data are normal on a quarterly basis and, for Equity World, Equity US and Bond US, also on a monthly basis. For any shorter time interval the normality assumption does not hold.
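A sketch of the computation behind the table, using the usual form JB = n/6 · (S² + (K − 3)²/4) compared against the 5.99 threshold (simulated data, not the market series):

```python
import math
import random
from statistics import mean

def jarque_bera(xs):
    # JB = n/6 * (S^2 + (K - 3)^2 / 4); asymptotically chi^2 with 2 d.o.f.
    n, m = len(xs), mean(xs)
    s2 = mean([(x - m) ** 2 for x in xs])
    S = mean([(x - m) ** 3 for x in xs]) / s2 ** 1.5   # skewness
    K = mean([(x - m) ** 4 for x in xs]) / s2 ** 2     # kurtosis
    return n / 6 * (S ** 2 + (K - 3) ** 2 / 4)

def fat_tailed():
    z = random.gauss(0.0, 1.0)
    chi2 = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(4))
    return z / math.sqrt(chi2 / 4.0)

random.seed(5)
normal = [random.gauss(0.0, 1.0) for _ in range(2000)]
fat = [fat_tailed() for _ in range(2000)]
# 5% threshold of the chi^2(2) distribution: 5.99
print(round(jarque_bera(normal), 2), round(jarque_bera(fat), 2))
```

The normal sample stays near the threshold while the fat-tailed sample exceeds it by orders of magnitude, which is the pattern seen in the daily column of the table.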

QQ Plot

On the following page we have depicted some QQ plots for the time series of Equities World, Equities US, Equities Switzerland, Bonds World and Bonds US. The QQ plots of the same time series are on the same horizontal line, ordered from daily data, bi-daily data and weekly data to bi-weekly data. It is visible that the fat tails disappear with lower data frequency and the empirical line approaches the linear line. Further, one can see that the bond returns are less fat-tailed than the equity returns. Another interesting phenomenon is that, especially on a weekly basis, the lower fat tails are more pronounced than the upper tails. The reason might be that a crash occurs over a shorter time interval than a euphoria.


Figure 25: QQ Plots for Equities World, Equities US, Equities Switzerland, Bonds World and Bonds

US time series and data intervals of daily data, bi-daily data, weekly data and bi-weekly data.


Runs test

In the following we show the results of the Runs test applied to the market data. The first table contains the results for the equity indices:

Interval Equities World Equities EU Equities US Equities FE Equities CH

D -0.92 -0.68 -0.60 -0.65 -0.67

BD -0.78 -0.64 -0.64 -0.64 -0.70

W -0.58 -0.49 -0.52 -0.60 -0.56

BW -0.75 -0.73 -0.75 -0.58 -0.66

M -0.76 -0.54 -0.61 -0.72 -0.68

Q -0.36 -0.79 -0.65 -1.6 -0.36

A two-sided standard normal distribution table gives us a value of 1.96 at the 5% significance level. Since all results are smaller in absolute value than this threshold, we have to conclude that they are all generated by a random process.

Interval Bonds World Bonds EU Bonds US Bonds FE Bonds CH

D -0.68 -0.74 -0.79 -0.64 -0.62

BD -0.67 -0.80 -0.70 -0.63 -0.55

W -0.56 -0.86 -0.78 -0.83 -0.53

BW -0.61 -1.07 -1.07 -0.85 -0.55

M -1.1 -1.2 -1.0 -1.4 -0.72

Q -1.1 -1.5 -6.6 -0.78 -0.78

The same holds true for the bond indices, which, with the exception of the quarterly Bonds US value of −6.6, all lie within the boundaries as well. We can find that the bond indices have smaller values and are therefore more likely to be randomly distributed.

BDS test

Finally, let's have a look at the results of the BDS test. We list the range of the test values for different values of ε and of the embedding dimension.

Interval Equities World Equities EU Equities US Equities FE Equities CH

D 6.9 - 13 8.9 - 44 2.8 - 8.9 2.1 - 6.2 7.6 - 11

BD 5.9 - 10 12 - 38 4.1 - 6.7 3.3 - 6.8 6.0 - 8.4

W 2.3 - 5.2 5.3 - 19 0.52 - 4.7 0.80 - 2.9 4.8 - 8.6

BW -0.34 - 2.6 1.1 - 14 -0.62 - 2.0 -0.64 - 0.95 2.8 - 6.8

M 0.61 - 3.0 1.5 - 8.9 2.1 - 3.9 -2.2 - -0.45 -0.70 - 0.52

Q -3.2 - 0.22 -5.3 - 0.33 -5.5 - 1.9 0.56 - 5.4 -1.8 - 1.7

The statement of the test (threshold 1.96) is that the market series are uncorrelated for monthly and quarterly data (except for the case of monthly data for Equities EU and US) and correlated for higher-frequency data. Please remember that the results for Equities EU and Bonds EU are obtained from a shorter time series than the others and are therefore not significant.

The next table lists the range of the results for the bond indices:


Interval Bonds World Bonds EU Bonds US Bonds FE Bonds CH

D 1.8 - 4.0 2.7 - 29 3.6 - 5.5 8.9 - 14 -0.27 - 1.44

BD 3.3 - 5.5 1.1 - 17 1.4 - 3.1 5.8 - 11 -1.1 - -0.51

W 2.1 - 3.8 0.021 - 5.7 2.8 - 3.7 3.7 - 10 -1.2 - 0.12

BW 0.26 - 2.6 -1.6 - 3.1 0.056 - 2.2 2.2 - 9.7 -1 - 1.0

M -0.89 - 0.76 0.028 - 6.4 -2.6 - 0.23 0.61 - 6.1 -1.4 - 0.88

Q -4.2 - 2.0 -1.2 - 8.3 -1.6 - 4.9 0.78 - 3.3 -3.7 - -1.3

We have more or less an acceptance of the hypothesis of uncorrelated returns for bi-weekly,

monthly and quarterly data (except for bi-weekly Bonds Far East).

Summary of test results

Let’s summarize the results of the applied tests:

• The Kolmogorov-Smirnov test has shown that the market time series are not normally

distributed, neither on short time frequency (daily data), nor on long time frequency

(quarterly data).

• The Jarque-Bera test confirms this statement by refusing normality except for quarterly data.

• The QQ plots for different sampling frequencies of the data show significant fat tails for daily up to bi-weekly data.

• From the Runs test we were able to conclude that the market series were produced by a random process.

• Finally, the BDS test showed us that the series are uncorrelated for monthly and quarterly data and correlated for higher-frequency data.

These results should be evidence enough that the normality assumption of Markowitz does not hold and that it is justified to look for other approaches that take non-normality into consideration. It was even proposed by Markowitz himself in his Nobel prize-winning work to also investigate alternatives to variance as risk measure. There are some arguments for the standard Markowitz method which we do not want to hide:

The Central Limit Theorem says: Let X_1, X_2, …, X_n be mutually independent random variables with a common distribution function F. Assume E[X] = 0 and Var(X) = 1. As n → ∞, the distribution of the normalized sum

S_n = (X_1 + X_2 + … + X_n) / √n

tends to the Gaussian distribution function. When we look at the tick-by-tick logarithmic return data of a stock exchange for a certain financial instrument, we can interpret each data point as the value of a random variable and the daily, weekly or monthly data of this instrument as the sum of the tick-by-tick returns, i.e. of the respective random variables. According to the Central Limit Theorem, the low-frequency data will be distributed like a Gaussian distribution function if the frequency is low enough and we have enough data points in a period. In the context of an index or fund, the Central Limit Theorem can be applied once more by arguing that an index or fund is the weighted sum of several random variables (the constituents of the index or fund) and therefore the returns will behave according to a normal distribution if the index or fund has enough constituents.
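This aggregation argument can be demonstrated numerically (a sketch with uniform "ticks", standardized so that E[X] = 0 and Var(X) = 1; the kurtosis of S_n moves from 1.8, the uniform value, towards 3, the Gaussian value):

```python
import math
import random
from statistics import mean

random.seed(9)

def tick():
    # uniform on [-sqrt(3), sqrt(3)]: E[X] = 0, Var(X) = 1
    return random.uniform(-math.sqrt(3), math.sqrt(3))

def kurtosis(xs):
    m = mean(xs)
    v = mean([(x - m) ** 2 for x in xs])
    return mean([(x - m) ** 4 for x in xs]) / v ** 2

kurt = {}
for n in (1, 30):
    # 5000 realizations of the normalized sum S_n = (X_1+...+X_n)/sqrt(n)
    sums = [sum(tick() for _ in range(n)) / math.sqrt(n)
            for _ in range(5000)]
    kurt[n] = kurtosis(sums)
    print(n, round(kurt[n], 2))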


4 Portfolio Construction With Non Normal Asset Returns

The concept of a mean-risk framework was explained in an earlier chapter. Markowitz uses this framework and has chosen the variance as risk measure. We will explore what general properties such a risk measure should have in order to be a substitute for the variance. In the second part of this chapter the suitability of variance as a risk measure is analyzed.

4.1 Introduction To Risk In General

In this section we will concentrate on the properties of ﬁnancial risk measures. Part of the basic

theory for this area was developed for the insurance sector and then adapted for the ﬁnancial

context.

We will use a variable X as a random variable representing the relative or absolute return of an asset (or the insured losses in the insurance context). Assume we have two alternatives A and B and their financial consequences X_A and X_B. Let the function R denote a risk measure which assigns a value to each alternative; the notation A ≻_R B ⇔ R(X_A) > R(X_B) indicates that alternative A is riskier than alternative B. Note that this is different from the utility function U presented in chapter 1, where A ≻ B ⇔ U(X_A) > U(X_B) means that A is preferred to B.

The difference between the concepts of the utility function and risk might become more apparent if one becomes aware that a utility function can be defined without a risk term (e.g. 'prefer more to less') or can include a risk term (e.g. the Markowitz approach, where we can find a trade-off function between expected return and risk).

In Albrecht [4] risk measures are categorized into two kinds. The two categories are:

1.) Risk as magnitude of deviation from target (risk of the ﬁrst kind)

2.) Risk as necessary capital respectively necessary premium (risk of the second kind)

For many common risk measures one kind can be transformed into the other: the addition of E[X] to a risk measure of the second kind will lead us to a risk measure of the first kind, and the subtraction of E[X] from a risk measure of the first kind will lead to a risk measure of the second kind.

We can ﬁnd a general approach to derive a risk measure for a given utility function. This

standard measure of risk is given in Huerlimann [26] by:

R(X) = −E[U(X −E[X])] (21)

Since the risk measure corresponds to the negative expected utility function of X - E[X], the

risk measure is location free. From (21) we can derive speciﬁc risk measures by using a speciﬁc

utility function. Using for instance the quadratic utility function (6), we obtain the variance

Var(X) = E[(X − E[X])²]

as the corresponding risk measure.
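Equation (21) can be checked numerically. As a sketch (the concrete quadratic utility U(x) = x − 0.5x² is an assumption for illustration, since (6) is stated elsewhere in the thesis), the derived risk measure comes out proportional to the variance:

```python
from statistics import mean

# R(X) = -E[U(X - E[X])], equation (21). With the (hypothetical) quadratic
# utility U(x) = x - 0.5*x**2 the linear term has zero mean, so
# R(X) = 0.5 * E[(X - E[X])**2] = 0.5 * Var(X).
def risk_from_utility(xs, U):
    m = mean(xs)
    return -mean([U(x - m) for x in xs])

returns = [0.05, -0.02, 0.01, 0.03, -0.04]   # illustrative scenario returns
U = lambda x: x - 0.5 * x ** 2

m = mean(returns)
variance = mean([(x - m) ** 2 for x in returns])
print(risk_from_utility(returns, U), 0.5 * variance)
```

Because (21) is evaluated on the centered variable X − E[X], the resulting measure is location free, as noted above.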

We will now introduce the definitions of stochastic and monotonic dominance because they are useful in the context of risk measures. Assume we are given two random variables X and Y.

Stochastic dominance of order 1 for a monotonic function R:

X ≺_{SD(1)} Y ⇔ E[R(X)] ≤ E[R(Y)]

Stochastic dominance of order 2 for a concave, monotonic function R:

X ≺_{SD(2)} Y ⇔ E[R(X)] ≤ E[R(Y)]

Monotonic dominance of order 2 for a concave function R:

X ≺_{MD(2)} Y ⇔ E[R(X)] ≤ E[R(Y)]

Next we will check some axiomatic systems for risk measures that were proposed in recent years.

Axiomatic system of Pedersen and Satchell

Pedersen and Satchell give in [32] the following set of axioms for a risk measure:

1.) Nonnegativity: R(X) ≥ 0
This requirement follows from the assumption of a risk measure of the first kind (deviation from a location measure).

2.) Positive homogeneity: R(c · X) = c · R(X) for all constants c ≥ 0
If an investment gets multiplied, then the risk gets multiplied as well.

3.) Subadditivity: R(X + Y) ≤ R(X) + R(Y)
The risk of two combined investments will not be larger than the sum of the risks of the individual investments (effect of diversification).

4.) Shift invariance: R(X + c) ≤ R(X) for all constants c
The measure is invariant to the addition of a constant (location free).

Axioms 2 and 3 combined lead to the statement that the risk of a constant random variable must be zero. Axioms 2 and 4 imply that a risk measure according to these criteria is convex. Since the risk measure is assumed to be location free, this system of axioms describes especially risk measures of the first kind.

Axiomatic system of Artzner, Delbaen, Eber and Heath

Artzner, Delbaen, Eber and Heath [7] have developed another set of axioms. Risk measures that fulfill their properties are called coherent. The classification was refined in [13] to introduce the term convex risk measure. Axioms 1 and 4 are also contained in the set of Pedersen and Satchell [32] in a similar way.

They call a mapping R a convex risk measure if for all X, Y ∈ L^∞:

1.) Subadditivity: R(X + Y) ≤ R(X) + R(Y)

2.) Monotonicity: X ≤ Y ⇒ R(X) ≥ R(Y)
A higher loss potential (statistical dominance) implies a higher risk.

3.) Translation invariance: R(X + a) = R(X) − a for all constant returns a
There is no additional risk for an investment without uncertainty.

A convex risk measure R is called a coherent risk measure if it satisfies the additional property:

4.) Positive homogeneity: R(c · X) = c · R(X) for all constants c ≥ 0

This set of risk axioms is well suited for risk measures of the second kind. In fact, every reasonable risk measure must be convex, because a risk measure that does not satisfy subadditivity penalizes diversification and would not assign risk in an intuitive way.
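The subadditivity and positive-homogeneity axioms can be probed numerically. For the standard deviation on joint scenarios subadditivity always holds (it is the triangle inequality in L²); a small sketch with illustrative data:

```python
import random
from statistics import pstdev

# Two correlated "assets" on 500 joint return scenarios.
random.seed(11)
X = [random.gauss(0.01, 0.05) for _ in range(500)]
Y = [0.5 * x + random.gauss(0.0, 0.03) for x in X]
XY = [a + b for a, b in zip(X, Y)]

# Subadditivity: sigma(X + Y) <= sigma(X) + sigma(Y)
print(pstdev(XY), pstdev(X) + pstdev(Y))
# Positive homogeneity: sigma(2X) = 2 * sigma(X)
print(pstdev([2 * a for a in X]), 2 * pstdev(X))
```

Note that the standard deviation satisfies R(X + a) = R(X) rather than R(X + a) = R(X) − a, which is consistent with it being a measure of the first kind rather than a coherent measure of the second kind.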

Axiomatic system of Wang, Young and Panjer

Another important set of risk axioms was introduced by Wang, Young and Panjer [45]. They deal with premia in an insurance context, which can however easily be transferred to the financial context. The two main tasks in insurance markets are the calculation of the risk premia and of the risk capital. The closed system of axioms for premia by Wang, Young and Panjer asks for some continuity properties and

1.) Monotonicity: X ≤ Y ⇒ R(Y) ≤ R(X)

2.) Comonotone additivity: X_1, X_2 comonotone ⇒ R(X_1 + X_2) = R(X_1) + R(X_2)
Comonotone: there exists a random variable Z and monotone functions f and g with X_1 = f(Z) and X_2 = g(Z)

A general risk measure

Stone [42] reports a general risk measure containing the three parameters c, k and z as:

R(X) = [ ∫_z^∞ |x − c|^k f(x) dx ]^(1/k)

The standard deviation and the semi-standard deviation are part of this general risk measure class. This class was extended in [32] to a five-parameter model:

R(X) = [ ∫_z^∞ |x − c|^a w[F(x)] f(x) dx ]^b

which also contains the variance, the semi-variance and some other risk measures.
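A numerical sketch of Stone's three-parameter measure via a midpoint rule (the truncation of the infinite integration domain at ±10 is an assumption for the illustration): with c = E[X], k = 2 and z = −∞ it reduces to the standard deviation, while raising z yields a one-sided partial moment.

```python
import math

def stone(f, c, k, z, hi=10.0, n=100_000):
    # R(X) = [ integral_z^hi |x - c|^k f(x) dx ]^(1/k), midpoint rule
    lo = max(-10.0, z)              # truncate the infinite domain
    h = (hi - lo) / n
    total = sum(abs(lo + (i + 0.5) * h - c) ** k * f(lo + (i + 0.5) * h)
                for i in range(n))
    return (total * h) ** (1.0 / k)

# standard normal density
phi = lambda x: math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

print(stone(phi, c=0.0, k=2, z=-math.inf))  # ~1.0: the standard deviation
print(stone(phi, c=0.0, k=2, z=0.0))        # ~0.707: a one-sided moment
```

Plugging in other (c, k, z) combinations reproduces the other members of the class mentioned above.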

4.2 Variance As Risk Measure

Variance was proposed as an appropriate measure for risk by Markowitz in his approach (7). The advantage of variance as a risk measure is that it is very convenient and intuitive. It is very common in statistics and therefore has well-known properties. However, it also has some properties that make it suboptimal as a risk measure for financial applications.

The risk of very rare events is not taken into account very well by variance. We have shown with the tests presented before that the returns of financial assets often have fat tails. This means that extreme events (very high returns or very high losses) are more likely than under a normal distribution. In the practice of portfolio optimization it is crucial to avoid very high losses because a lot of clients simply ask for the preservation of their wealth. It is true that the variance penalizes extreme events by calculating the squared distance to the mean; however, we should ask for something more specific. For this reason a risk measure that does not pay special attention to these kinds of events is not very qualified.

Another unpleasant property of variance is its symmetry. When we talk about risk, we think of the risk of a loss. However, variance also measures the "risk" of a gain, which is in fact something desired by an investor. This gives rise to asymmetric risk measures which only take care of losses.

We have already mentioned that it is shown in [39] that variance is only compatible with the concept of a utility function under the assumption of normally distributed returns or of a quadratic utility function, which is a very strong restriction.


5 Value At Risk Measures

In this chapter we will present a ﬁrst alternative risk measure to the standard deviation. It is

called Value at Risk and belongs to the quantile based risk measures.

There are efforts undertaken to introduce regulations to the financial industry to obtain better control of the risk that is taken by its participants and also to help the companies get a better overview of the risk they hold. This was also the topic of the Basel Committee on Banking Supervision where, as a conclusion, they recommend Value at Risk as an appropriate risk measure.

5.1 Value At Risk

We define Value at Risk as:

Let α ∈ (0, 1) be a given probability level and w the asset weights of a portfolio. The Value at Risk at level α for the return R_P is defined as

VaR_α(R_P) = sup{ x | P[R_P < x] ≤ α } = F⁻¹_{R_P}(α)    (22)

The function F⁻¹_{R_P}(α) is called the generalized inverse of the cumulative distribution function F_{R_P}(x) = P[R_P ≤ x] of R_P and gives the α-quantile of R_P.

.

VaR_α(R_P) can be interpreted as the loss of a portfolio that will be exceeded in only α·100 percent of all cases. Since α is usually chosen between 0.01 and 0.1, the Value at Risk is a lower boundary for the portfolio return, and the return of the portfolio will with very high probability (0.99 or 0.9 for the example α) not be smaller. It is the aim of portfolio construction to assemble a portfolio with a high Value at Risk in order to shift the return range for the 1 − α area as much to the positive side as possible. Sometimes α is chosen as 0.95 or 0.99 and VaR_{1−α} for a loss function is computed. The similarity of these two notations is shown in appendix B. Figure 26 shows the two areas α and 1 − α for the normal distribution.


Figure 26: The Value at Risk at level α for the return R is deﬁned as the return x where the probability

of having a return smaller than x is α.
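Empirically, VaR_α is just the α-quantile of the return sample. A minimal estimator (a sketch using the quantile form inf{x : F(x) ≥ α}, which agrees with (22) for continuous distributions):

```python
import math
import random
from statistics import NormalDist

def var_alpha(returns, alpha):
    # smallest return x whose empirical c.d.f. value reaches alpha
    xs = sorted(returns)
    return xs[max(0, math.ceil(alpha * len(xs)) - 1)]

# sanity check on standard normal returns: VaR_0.05 should be near the
# theoretical 5% quantile of N(0,1), about -1.645
random.seed(2)
sample = [random.gauss(0.0, 1.0) for _ in range(100_000)]
print(round(var_alpha(sample, 0.05), 3),
      round(NormalDist().inv_cdf(0.05), 3))
```

A portfolio optimizer in the VaR sense would evaluate this quantity on the scenario returns of each candidate weight vector w, which is where the non-convexity discussed below becomes a practical obstacle.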


The analytical properties of the Value at Risk model are not very pleasant: in the general case it is not possible to find a symbolic expression for the portfolio weights w optimized according to VaR and dependent on the multivariate return function of its constituents. Even the numerical application is difficult: according to Gaivoronski and Pflug [23], VaR is not a convex risk measure. This means that the VaR function contains many local maxima. To deal with these maxima, they have developed a smoothing algorithm, which allows them to calculate the optimal portfolios in the VaR sense with high accuracy and in reasonable time. Another approach to deal with the VaR optimization function is proposed in Embrechts et al. [20]. It treats each univariate distribution function for the assets individually and models the dependencies of the univariate distribution functions with a copula. The concept of the copula is a well-known way of modelling dependence in risk management.

Since Value at Risk is only concerned with the threshold that will be crossed with the small probability α, it does not take into consideration the distribution of the returns above the threshold. Dembo and Fuma [17] published an example that shows this disadvantage. Assume two distributions are given as declared in this table and depicted in figure 27.

Return   Probability in Portfolio A   Probability in Portfolio B
-10      0.01                         0.01
-7.5     0.04                         0.04
-5       0.05                         0.25
-2.5     0.1                          0.25
0        0.5                          0.3
2.5      0.15                         0.1
5        0.15                         0.05
µ        0.225                        -1.775
σ        3.140                        3.144
1% VaR   -10                          -10
5% VaR   -7.5                         -7.5

From the expected returns we can see that portfolio A has a higher expected return than portfolio B, so we have a clear preference for portfolio A. However, both risk measures, standard deviation and VaR, fail to capture this preference because both take the same values for portfolios A and B. The reason is that the standard deviation, as mentioned, does not discriminate between the risk of a loss (which should be avoided) and the risk of a gain (which is favorable), and VaR does not take the form of the distribution beyond the threshold into consideration at all. This example is not as artificial as it might look, since most distributions in finance differ especially around the median and are very similar around the tails. It is also clear that Value at Risk does not distinguish between very severe and merely small losses, as long as they are below the threshold.
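The numbers in the table are easy to verify; the following sketch (our own illustration, using the lower-α-quantile convention for VaR from figure 26) reproduces the expected returns and VaR values of the two portfolios:

```python
# Check of the example above; data copied from the table.
returns = [-10.0, -7.5, -5.0, -2.5, 0.0, 2.5, 5.0]
prob_a = [0.01, 0.04, 0.05, 0.10, 0.50, 0.15, 0.15]
prob_b = [0.01, 0.04, 0.25, 0.25, 0.30, 0.10, 0.05]

def mean(r, p):
    return sum(ri * pi for ri, pi in zip(r, p))

def var_alpha(r, p, alpha):
    # Smallest return x with P[R <= x] >= alpha (lower alpha-quantile).
    cum = 0.0
    for ri, pi in zip(r, p):  # returns are already sorted
        cum += pi
        if cum >= alpha - 1e-12:
            return ri

print(round(mean(returns, prob_a), 3),
      round(mean(returns, prob_b), 3))        # 0.225 -1.775
print(var_alpha(returns, prob_a, 0.05),
      var_alpha(returns, prob_b, 0.05))       # -7.5 -7.5
```

Both portfolios indeed share the same 1% and 5% VaR although their expected returns differ widely.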


Figure 27: The graphic shows the two distribution functions defined in the table. The distribution function of the returns of portfolio A is depicted as a solid line, whereas the distribution function of the returns of portfolio B is depicted as a dotted line. The functions coincide for returns between -10 and -7.5.

5.2 Conditional Value At Risk, Expected Shortfall And Tail Conditional Expectation

In this chapter we will discuss the concepts of Lower Partial Moments (LPM), Conditional Value at Risk (CVaR) and Expected Shortfall (ES). We cover them in the same chapter because they are very similar, and these risk concepts have become a totum revolutum in the last few years. The following part tries to unveil the relations between the mentioned risk measures.

At the beginning there was a first concept called the lower partial moment (LPM) as described in Fishburn [21]. The general lower partial moment risk measure for a random return variable R with probability density P(x) is given by

$$LPM_\beta(\tau; R) = E[(\tau - R)^\beta] = \int_{-\infty}^{\tau} P(x)\,(\tau - x)^\beta\,dx$$

An investor can determine a threshold τ below which he does not want the return R to fall. According to the choice of β, one gets a different lower partial moment:

$$\beta = 0:\ \text{Shortfall probability}\qquad LPM_0 = \int_{-\infty}^{\tau} f(x)\,dx$$

$$\beta = 1:\ \text{Mean Shortfall}\qquad LPM_1 = \int_{-\infty}^{\tau} f(x)\,(\tau - x)\,dx$$

$$\beta = 2:\ \text{Shortfall variance / Semi-variance}\qquad LPM_2 = \int_{-\infty}^{\tau} f(x)\,(\tau - x)^2\,dx$$
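As a sketch (our own illustration, not part of the thesis), the three lower partial moments can be estimated from an empirical return sample by replacing the integral with a sample average:

```python
import numpy as np

def lpm(returns, tau, beta):
    # Empirical lower partial moment: average of (tau - r)^beta over the
    # sample, counting only returns below the threshold tau.
    r = np.asarray(returns, dtype=float)
    if beta == 0:
        return float(np.mean(r < tau))           # shortfall probability
    shortfall = np.clip(tau - r, 0.0, None)      # (tau - r)_+, zero above tau
    return float(np.mean(shortfall ** beta))

r = [-0.08, -0.03, 0.01, 0.02, 0.05, 0.07]
print(lpm(r, 0.0, 0))   # shortfall probability: 2 of 6 returns below 0
print(lpm(r, 0.0, 1))   # mean shortfall
print(lpm(r, 0.0, 2))   # shortfall variance / semi-variance
```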

LPM_0 portfolio selection corresponds to Roy's safety-first rule presented in [38]. LPM_1, also called expected regret in [16], can be interpreted as the average portfolio underperformance compared to a fixed target or some benchmark τ.

The term conditional Value at Risk was first introduced in [35]. They use a slightly different definition and notation for VaR and CVaR than we will (refer to appendix B). For continuous distributions, conditional Value at Risk is defined as the conditional expected loss under the condition


that it exceeds the Value at Risk. There are two variants of CVaR:

$$CVaR^+_\alpha = E[R_P \mid R_P < VaR] \tag{23}$$

$$CVaR^-_\alpha = E[R_P \mid R_P \le VaR] \tag{24}$$

where VaR = VaR_α(R_P) as defined in formula (22) and E[x] denotes the expected value of x.

As mentioned before, conditional Value at Risk can be considered as the expected loss below the VaR. From this it is clear that CVaR_α ≤ VaR_α.

Conditional Value at Risk is also known as Mean Excess Loss (CVaR⁺), Mean Shortfall (LPM_1 with τ = VaR) (CVaR⁺) or Tail Value at Risk (CVaR⁻). Since the concept was developed for several fields of application (e.g. actuarial science, finance, economics) and by different researchers, it has many names and definitions. In Huerlimann [25], ten equivalent definitions of CVaR are presented. A general definition of CVaR, also applicable to discrete distributions, is written in Uryasev [44] as a weighted average of VaR and the returns strictly below VaR. After the conversion to our environment the equation is

$$CVaR_\alpha = \lambda\, VaR_\alpha + (1 - \lambda)\, CVaR^+_\alpha \tag{25}$$

with

$$\lambda = \frac{\alpha - P[R_P < VaR]}{\alpha}$$

The equation can be used for continuous and discrete distributions: in the case of a continuous distribution λ = 0 and therefore CVaR_α = CVaR⁺_α. If we have a discrete distribution, the calculated VaR (VaR_disc) will not be exactly the α-quantile as it would be for a continuous distribution (VaR_cont), but lie more on the negative side of the distribution. λ shifts CVaR⁺_α to the positive side of the distribution and extrapolates CVaR⁺_α from VaR_disc to VaR_cont. In other words, CVaR⁺_α and VaR get weighted proportionally to (VaR_disc − VaR_cont)/VaR_cont, and therefore

$$CVaR^+_\alpha \le CVaR_\alpha \le VaR_\alpha.$$
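For a discrete sample the weighting in (25) can be checked numerically. The following sketch (our own illustration; the sample is chosen so that α·M is integer, and VaR is taken as the empirical lower α-quantile) compares formula (25) with the direct average of the worst α·M outcomes:

```python
import numpy as np

def var_cvar(returns, alpha):
    r = np.sort(np.asarray(returns, dtype=float))
    M = len(r)
    k = int(np.ceil(alpha * M))        # number of outcomes in the alpha-tail
    var = r[k - 1]                     # empirical VaR (lower alpha-quantile)
    p_below = float(np.mean(r < var))  # P[R < VaR]
    lam = (alpha - p_below) / alpha
    cvar_plus = r[r < var].mean() if p_below > 0 else var
    cvar = lam * var + (1.0 - lam) * cvar_plus   # formula (25)
    return var, cvar, r[:k].mean()     # plus the direct tail average

r = [-9.0, -6.0, -4.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
v, c, direct = var_cvar(r, 0.20)
print(v, c, direct)                    # -6.0 -7.5 -7.5
```

Here λ = 0.5, so the CVaR is the average of the VaR (-6) and the mean of the returns strictly below it (-9), which coincides with the direct tail average.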

A similar concept to CVaR is called expected shortfall. It was introduced in [1] and redefined later to be consistent with CVaR:

$$ES_\alpha(R_P) = -\frac{1}{\alpha}\Big(E[R_P\, 1_{R_P \le VaR}] - (P[R_P \le VaR] - \alpha)\,VaR\Big) \tag{26}$$

Acerbi and Tasche show in [2] that it can also be expressed as

$$ES_\alpha(R_P) = -\frac{1}{\alpha}\int_0^\alpha \inf\{x \mid P[X \le x] \ge a\}\,da$$

In case we have a non-continuous distribution function, it may happen that P[R_P ≤ x] > α. In contrast, for a continuous distribution function P[R_P ≤ x] = α, and then it can be seen that (23) is equivalent to (26).

To conclude, we try to group the risk measures that share the same base concept. They all take the distribution function as input and extract from it a single number as a representative of the risk the distribution holds.


Calculate a threshold the returns     Calculate the expected return of the
should not fall below                 returns under a certain threshold

LPM_0                                 LPM_1
Value at Risk                         Conditional Value at Risk
Shortfall risk                        Expected shortfall
                                      Mean shortfall
                                      Expected regret
                                      Tail Value at Risk

It is shown in Testuri and Uryasev [43] that expected regret and CVaR are closely related. They also confirm the relation of CVaR to the other risk measures in the same group, at least for the case of continuous distribution functions.
The following table lists the properties of Value at Risk and conditional Value at Risk/expected shortfall. The statements were taken from [34].

Property                          VaR   CVaR
Translation equivariance          √     √
Positively homogeneous            √     √
Convexity                         x     √
Stochastic dominance of order 1   √     √
Stochastic dominance of order 2   x     √
Monotonic dominance of order 2    x     √
Coherence                         x     √

This comparison shows that conditional Value at Risk has much nicer properties than the standard Value at Risk. Since CVaR is convex with respect to the portfolio positions, it is much easier to optimize than VaR, which has a lot of local maxima. Coherence is a requirement for an intuitive risk measure (effect of diversification) and is also fulfilled only by CVaR.
Conditional Value at Risk is presented as an excellent tool for risk management and portfolio optimization because it can quantify risks beyond Value at Risk and is easier to optimize. In [35] it is also stated that the CVaR methodology is consistent with the Mean-Variance methodology under the normality assumption. This means that a CVaR-maximal portfolio is also variance-minimal for normal return distributions.

We will now focus on the optimization of the two risk measures following [34]. For the sake of consistency, we have again transformed the notation according to appendix B. R = (R_1, ..., R_N) indicates a vector of random returns of asset classes 1, ..., N. Let w = (w_1, ..., w_N) be the weights of the investments in these asset classes. We try to maximize the risk measure under the constraint that the expected return w^T R of the portfolio is equal to some predefined level µ. The VaR optimization problem can be stated as

VaR optimization problem can be stated as

Maximize (in w) V aR

α

(w

T

R)

s.t.

w

T

E[R] = µ

w

T

1 = 1

w ≥ 0

and the CVaR respectively as

56

Maximize (in w) CV aR

α

(w

T

R)

s.t.

w

T

E[R] = µ

w

T

1 = 1

w ≥ 0

Note that, since our optimizer is only capable of minimizing a function but not of maximizing one, in the implementation we minimize −VaR_α(w^T R) and −CVaR_α(w^T R).

The VaR optimization problem we will cover later. First, we transform the CVaR optimization problem into the following linear program with an auxiliary variable Z:

Maximize (in w and a)  a − (1/α) E[Z]
s.t.
  Z ≥ a − w^T R
  w^T E[R] = µ
  w^T 1 = 1
  Z ≥ 0
  w ≥ 0

Since we have only linear constraints, we can be sure that the solution set is a singleton, a convex polyhedron, or empty.

In practice, however, we mostly have discrete data (e.g. empirical samples). For this reason we formulate the portfolio optimization problems in a discrete way.

A vector R_i, i = 1, ..., M indicates the returns of all asset classes at time point i = 1, ..., M. We will use the notation S_{[1:k]}(u_1, ..., u_M) to denote the element among u_1, ..., u_M which is the k-th smallest. The new definitions for VaR and CVaR are

$$VaR_\alpha(w^T R) = S_{[1:\alpha M]}(w^T R_1, \ldots, w^T R_M)$$

$$CVaR_\alpha(w^T R) = \frac{1}{\alpha M} \sum_{w^T R_i \le VaR_\alpha} w^T R_i$$

The discrete portfolio optimization problem for the VaR is a nonlinear, nonconvex program:

Maximize (in w)  S_{[1:αM]}(w^T R_1, ..., w^T R_M)
s.t.
  w^T e = µ
  w^T 1 = 1
  w ≥ 0

where e = (1/M) Σ_{i=1}^M R_i denotes the expected return vector.

The discrete version of the CVaR is piecewise linear and may therefore be solved using an LP solver. We formulate the problem as follows:

Maximize (in w, a and z)  a − (1/(αM)) Σ_{i=1}^M z_i
s.t.
  z_i ≥ a − w^T R_i, 1 ≤ i ≤ M
  w^T e = µ
  w^T 1 = 1
  z_i ≥ 0, 1 ≤ i ≤ M
  w_i ≥ 0, 1 ≤ i ≤ N

For this setting, the optimal value of a is VaR_α(w^T R). We can see that the objective function together with the first and third inequality constraints expresses the weighted average of the Value at Risk and the mean of all returns below the Value at Risk.
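This linear program can be handed to any LP solver. A self-contained sketch with SciPy's `linprog` on simulated scenario returns (the data, α and the choice of µ are illustrative assumptions, not the setup of the thesis):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
M, N, alpha = 200, 4, 0.05
R = rng.normal([0.004, 0.006, 0.008, 0.010],
               [0.010, 0.020, 0.030, 0.040], size=(M, N))  # scenario returns
e = R.mean(axis=0)                 # expected return vector
mu = float(e.mean())               # required return (equal-weight level, feasible)

# Decision vector x = [w (N), z (M), a]; maximize a - (1/(alpha*M)) * sum(z),
# i.e. minimize the negated objective.
c = np.concatenate([np.zeros(N), np.full(M, 1.0 / (alpha * M)), [-1.0]])
A_ub = np.hstack([-R, -np.eye(M), np.ones((M, 1))])        # a - w'R_i - z_i <= 0
b_ub = np.zeros(M)
A_eq = np.vstack([np.concatenate([e, np.zeros(M), [0.0]]),          # w'e = mu
                  np.concatenate([np.ones(N), np.zeros(M), [0.0]])]) # sum w = 1
b_eq = [mu, 1.0]
bounds = [(0, None)] * (N + M) + [(None, None)]            # w, z >= 0, a free

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
w, a = res.x[:N], res.x[-1]
print("weights:", np.round(w, 3), "VaR:", round(a, 4), "CVaR:", round(-res.fun, 4))
```

At the optimum, a is the empirical VaR and the (negated) objective value is the CVaR of the optimal portfolio, so CVaR ≤ VaR holds by construction.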

5.3 Mean-Conditional Value At Risk Eﬃcient Portfolios

In this section we want to analyze what it means to optimize a portfolio regarding Value At

Risk/ Conditional Value At Risk.

We start with the case of normally distributed asset returns. Figure 28 shows two normal distributions with the same mean but different variances. The distribution with the larger variance (dotted line) has the smaller CVaR, and vice versa. It is intuitive to see that if we maximize the CVaR we also minimize the variance of the distribution: the only way to enlarge the CVaR (shifting the corresponding left tail of the distribution to the right) is to shrink the variance (make the peak higher). Of course this is only true if we have a sufficient amount of data coming from a truly normal distribution. Using small amounts of empirical data, there might be effects that prevent the equivalence of the two optimization techniques.

Figure 28: The graphic shows two normal distribution functions. For both distribution functions the Conditional Value at Risk and the variance are schematically depicted. It shows that minimizing the variance of a function is equivalent to maximizing its Conditional Value at Risk.


The case of distributions with skewness and excess kurtosis is more interesting. The occurrence of fat tails and asymmetry in the distribution function allows the mean-CVaR optimization to take the risk arising from these properties into account. As a consequence, such an optimization will assign the portfolio weights differently than the Mean-Variance approach. The optimization, in general, will prefer assets with positive skewness, small kurtosis and low variance for a given return.
To conclude, we expect the results of a Mean-CVaR and of a Mean-Variance optimization to be the same in the case of similar distribution functions (e.g. normal distribution functions) for the asset returns and a sufficient amount of data. Au contraire, the results are expected to differ for the two optimization techniques if the data comes from distribution functions with different higher moments or if the sample size is small.


6 Draw-Down Measures

In this section we will present two other approaches to measure the risk of a portfolio. They are called Draw-Down and Time Under-The-Water. Draw-Down was first presented in a portfolio context in [12]. In [33] Draw-Down is used together with Time Under-The-Water to measure the loss potential of hedge funds. We will describe Draw-Down as written in [12] and Time Under-The-Water according to the idea in [33]. Afterwards we will enhance Draw-Down to Conditional Draw-Down at Risk (CDaR) and Time Under-The-Water to Conditional Time Under-The-Water at Risk (CTaR). Finally we apply CDaR in a portfolio context.

6.1 Draw-Down And Time Under-The-Water

An advantage of the two concepts is that they are much more intuitive than other risk measures. They represent values every investor is interested in: Draw-Down measures the loss the investment might suffer (in absolute or relative terms) and Time Under-The-Water is the time period the investment might remain with a negative performance. Other possible applications for these measures could be: a portfolio manager might lose a client if the client's portfolio does not provide a gain over a long time, or a fund might not be allowed to lose more than a certain amount each month and therefore has to stop trading until the next month starts with a new budget.

We will work on logarithmic returns instead of geometric returns as stated in [12]. Assume we are given the (cumulated) return of the portfolio from time 0 until time t by a function r_c(w, t), with w as the vector of weights of the portfolio constituents. The Draw-Down function at time t is defined as the difference between the maximum of the function in the time period [0, t] (High-Water-Mark) and the value of the function at time t:

$$DD(w, t) = \max_{0 \le \tau \le t}[r_c(w, \tau)] - r_c(w, t) \tag{27}$$

Figure 29 shows a time series with the respective High-Water-Marks and Draw-Down.

Starting from the formula for Draw-Down, two risk functions are derived: the Maximum Draw-Down is calculated as the largest Draw-Down in the period,

$$MD(w, t) = \max_{0 \le \tau \le t}[DD(w, \tau)] \tag{28}$$

and the Average Draw-Down is defined as

$$AD(w) = \frac{1}{T}\int_0^T DD(w, t)\,dt \tag{29}$$

If, in a time-value framework, Draw-Down is measured on the y-axis, Time Under-The-Water is the corresponding period on the x-axis that represents the time the value of an investment may remain under its historic record mark. We define Time Under-The-Water as the time elapsed since the High-Water-Mark was last attained:

$$TUW(w, t) = t - \max\{\tau^* \le t \mid r_c(w, \tau^*) = \max_{0 \le \tau \le t} r_c(w, \tau)\}$$


Figure 29: The figure shows a time series with the respective High-Water-Mark (dashed line) and Draw-Down (dotted line) as defined. The Time Under-The-Water is just the part of the dashed line above the dotted line.

Similar to the Draw-Down concept, we now introduce the Maximum Time Under-The-Water MT(w) and the Average Time Under-The-Water AT(w) as

$$MT(w, t) = \max_{0 \le \tau \le t}[TUW(w, \tau)] \tag{30}$$

$$AT(w) = \frac{1}{T}\int_0^T TUW(w, t)\,dt \tag{31}$$
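Formulas (27)-(31) translate directly into code. A sketch on an artificial cumulative-return series (our own illustration):

```python
import numpy as np

def drawdown_stats(rc):
    # rc: cumulative (logarithmic) portfolio returns r_c(w, t), t = 0..T
    rc = np.asarray(rc, dtype=float)
    hwm = np.maximum.accumulate(rc)      # running High-Water-Mark
    dd = hwm - rc                        # Draw-Down, formula (27)
    md = dd.max()                        # Maximum Draw-Down, (28) at t = T
    ad = dd.mean()                       # Average Draw-Down, discrete (29)
    tuw = np.zeros(len(rc), dtype=int)   # Time Under-The-Water at each t
    for t in range(1, len(rc)):
        tuw[t] = tuw[t - 1] + 1 if dd[t] > 0 else 0
    return dd, md, ad, tuw

rc = [0.00, 0.02, 0.05, 0.03, 0.01, 0.04, 0.06, 0.02]
dd, md, ad, tuw = drawdown_stats(rc)
print(round(md, 4), tuw.max())   # 0.04 3 : worst Draw-Down and Maximum TUW
```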

6.2 Conditional Draw-Down At Risk And Conditional Time Under-The-Water At Risk

Analogous to the enhancement of Value at Risk to Conditional Value at Risk, we proceed with Draw-Down and Time Under-The-Water. Draw-Down at Risk can be defined similarly to (22) as

$$DaR_\alpha(MD) = \inf\{x \mid P[MD > x] \le \alpha\} \tag{32}$$

with MD as the Maximum Draw-Down, and Conditional Draw-Down at Risk corresponding to Conditional Value at Risk (25) as

$$CDaR_\alpha = \lambda\, DaR_\alpha + (1 - \lambda)\, CDaR^+_\alpha \tag{33}$$

with

$$\lambda = \frac{P[MD \ge DaR_\alpha] - \alpha}{\alpha}, \qquad CDaR^+_\alpha = E[MD \mid MD > DaR_\alpha] \tag{34}$$


We will now discuss the implementation of the concepts in detail. A first approach would be to calculate the Maximum Draw-Down for each new level of the High-Water-Mark. This means we scan the time series from the past to the present, and each time we find a new global maximum, we calculate the Maximum Draw-Down for the period between this global maximum and the point where the time series exceeds it for the first time. All of these Maximum Draw-Downs get stored to construct their distribution. The drawback of this method is that we will probably get very few Draw-Down values, for the following reasons:

• Since the Draw-Down gets calculated as the difference to the highest historical value (record), the concept of the Draw-Down comprises the effect of increasing time periods between new records: the expected waiting time for a random process to reach a new all-time high is not uniform but grows much faster over time (see for example [19]).

• In times of a baisse (falling market), we won't get any Draw-Downs at all. Only in times of a hausse (rising market) will there be a new High-Water-Mark and therefore new Draw-Downs.

In [12] it was proposed to introduce M sub-periods in the time interval [0, T] and to calculate the Draw-Down for each sub-period. This way one gets an empirical distribution for the Draw-Downs consisting of at most M sample points. Using this methodology, one should be aware of some points:

• The methodology adds a new variable M that does not improve the descriptive power of the concept. This variable is introduced for purely numerical reasons and has no economic or practical meaning.

• If M is chosen too small, the number of resulting Draw-Downs is too small to get a good approximation of the distribution. If M is chosen too large, the Draw-Downs that extend over several sub-periods get cut into several smaller Draw-Downs because the maximum possible Draw-Down is restricted to the length of the sub-period. This is especially undesirable since we are particularly interested in the large Draw-Downs to calculate the α-quantile.

• The effect of increasing time periods between new records cannot be avoided by resetting the all-time high at the beginning of each sub-period - it is just transferred to a smaller time scale.

We would like to put this method and a new method into the context of the information given by the client. The described method of fixed periods for calculating the Draw-Down can be used if the investment horizon of the client is known:
If the investment horizon is known, a rolling window with the length of the investment horizon can be applied to the available historical data. For each time window the Maximum Draw-Down gets calculated and the window is then shifted. If we have P historical data points, Q data points in the rolling window and shift the window by R data points each time we advance, we get with this method (P − Q)/R Maximum Draw-Down values. It would also be possible to use overlapping rolling windows; however, the data reused in this way would decrease the variance of the resulting Draw-Down values.

If the investment horizon is not known, we propose as a second method to calculate the Maximum Draw-Down for each possible entry point combined with each possible exit point. This gives us, for P data points, (P − 1)(P − 2)/2 Maximum Draw-Down values. The idea is to calculate the average Draw-Down an investor could face. The disadvantage of this method is that the Draw-Down values for a certain (unknown) investment horizon have very little influence on the final distribution: the total number of possible Draw-Downs grows quadratically, whereas the number of Draw-Downs for a certain investment period grows only linearly. It might therefore be questionable to compare Draw-Downs of different time periods.

We will assume that the investment horizon is known and therefore proceed with the ﬁrst of

the described methods to formulate the optimization problems.
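A sketch of this first method (our own illustration on simulated monthly log returns; for simplicity the CDaR is taken as the plain tail average E[MD | MD ≥ DaR] rather than the λ-weighted formula (33)):

```python
import numpy as np

def max_drawdown(cum):
    return float((np.maximum.accumulate(cum) - cum).max())

def dar_cdar(cum, window, step, alpha):
    # One Maximum Draw-Down per rolling window of length `window`,
    # shifted by `step` data points each time.
    mdds = np.array([max_drawdown(cum[s:s + window])
                     for s in range(0, len(cum) - window + 1, step)])
    dar = float(np.quantile(mdds, 1.0 - alpha))  # exceeded with prob. <= alpha
    cdar = float(mdds[mdds >= dar].mean())       # average of the worst MDDs
    return dar, cdar

rng = np.random.default_rng(1)
cum = np.cumsum(rng.normal(0.005, 0.03, size=180))  # ~15 years of monthly data
dar, cdar = dar_cdar(cum, window=36, step=3, alpha=0.10)
print(dar <= cdar)   # True: CDaR is at least as conservative as DaR
```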

It lies in the nature of the concept to change the structure of the optimization problem from "minimize the risk for a given expected return", as was the case for the Variance and CVaR optimizations, to "maximize the expected return for a given Draw-Down/Time Under-The-Water threshold". For an investor it is convenient to define the personal amount of wealth he or she is willing to risk, or the amount of time he or she gives the portfolio manager to remain with a negative performance. However, to be better able to compare the results of the different optimizations, we will stick to our old scheme of fixing an expected return and minimizing the respective risk measure.

To show the corresponding linear optimization problems, we introduce the following variables: let the vector of logarithmic cumulative asset returns up to time moment k be y_k, so we can calculate the cumulative portfolio return as r_c(w, t = k) = y_k^T w. With the expected return given by the investor as µ, we get the following linear programming problem for the Maximum Draw-Down:

Minimize (in w and u)  z
s.t.
  u_k − y_k^T w ≤ z, 1 ≤ k ≤ M
  u_k ≥ y_k^T w, 1 ≤ k ≤ M
  u_k ≥ u_{k−1}, 1 ≤ k ≤ M
  u_0 = 0
  (1/d) y_M^T w = µ
  w^T 1 = 1
  w_i ≥ 0, 1 ≤ i ≤ N

where u_k, 1 ≤ k ≤ M, and z are auxiliary variables and d is the investment period in years.

The optimization problem with a constraint on the Average Draw-Down can be written as follows:

Minimize (in w and u)  z
s.t.
  (1/M) Σ_{k=1}^M (u_k − y_k^T w) ≤ z
  u_k ≥ y_k^T w, 1 ≤ k ≤ M
  u_k ≥ u_{k−1}, 1 ≤ k ≤ M
  u_0 = 0
  (1/d) y_M^T w = µ
  w^T 1 = 1
  w_i ≥ 0, 1 ≤ i ≤ N

and the optimization problem with a constraint on CDaR may be formulated as

Minimize (in w, u, z and ζ)  z
s.t.
  ζ + (1/(αM)) Σ_{k=1}^M z_k ≤ z
  z_k ≥ u_k − y_k^T w − ζ, 1 ≤ k ≤ M
  z_k ≥ 0, 1 ≤ k ≤ M
  u_k ≥ y_k^T w, 1 ≤ k ≤ M
  u_k ≥ u_{k−1}, 1 ≤ k ≤ M
  u_0 = 0
  (1/d) y_M^T w = µ
  w^T 1 = 1
  w_i ≥ 0, 1 ≤ i ≤ N

The optimal solution of this problem gives the optimal threshold value in variable ζ.

The corresponding extension of the Time Under-The-Water to a risk measure Conditional Time Under-The-Water at Risk (CTaR) can be done similarly. Since the optimization problems are analogous to those for the Draw-Down, we give only the linear program for the Conditional Time Under-The-Water at Risk:

Minimize (in w, u, z and ϑ)  v
s.t.
  ϑ + (1/(αM)) Σ_{k=1}^M z_k ≤ v
  z_k ≥ u_k − y_k^T w − ϑ, 1 ≤ k ≤ M
  z_k ≥ 0, 1 ≤ k ≤ M
  u_k ≥ y_k^T w, 1 ≤ k ≤ M
  u_k ≥ u_{k−1}, 1 ≤ k ≤ M
  u_0 = 0
  (1/d) y_M^T w = µ
  w^T 1 = 1
  w_i ≥ 0, 1 ≤ i ≤ N

where u_k, z_k, 1 ≤ k ≤ M, and v are auxiliary variables.

A well-implementable setup for the portfolio optimization process (that we will not follow further) would be the following framework: optimize the expected portfolio return subject to the client's CDaR restriction ζ_C and CTaR restriction η_C:

Maximize (in w)  (1/d_C) y_M^T w
s.t.
  CDaR_α(w) ≤ ζ_C
  CTaR_α(w) ≤ η_C


This setup has the following advantages:

• Since it is very intuitive, the portfolio manager can talk to the client in exactly the same

terms and the client has a clear view about the risk he or she is taking.

• The framework is still a linear programming problem and can therefore be solved eﬃ-

ciently.

The two risk measures Draw-Down and Time Under-The-Water demand a lot of data to be meaningful. Both methods define at most one risk value per data window, and to get a good approximation of the distribution of the risk measures, many data windows are necessary. This is especially true when we are looking at small quantiles like α = 0.05. In empirical tests we have seen that Draw-Down and Time Under-The-Water as described so far are not very appropriate for hedge funds, where only monthly data for about the last 15 years is available and therefore only about 180 data points in total. The risk measures are in this case too discrete and it is not possible to get a reasonable optimization; e.g. it is often not possible to get a good estimate of the derivatives of the risk measure, which are needed in most optimization algorithms. We have come to this conclusion especially for the Time Under-The-Water measure, where the objective function to minimize is far too discrete to give any meaningful results.

An important difference of the Draw-Down approach in comparison to the Value at Risk approach is the fact that the Draw-Down takes the serial dependence in the time series into consideration, because it operates on the compounded historical returns and not on the return distribution function as Value at Risk does.

6.3 Mean-Conditional Draw-Down At Risk Eﬃcient Portfolios

In this section we want again to analyze what it means to optimize a portfolio regarding Draw-

Down/ Conditional Draw-Down At Risk.

Under the assumption of normally distributed returns, the logarithmic wealth of a portfolio can be approximated by a Brownian Motion with drift (i.e. the wealth itself follows a Geometric Brownian Motion), given by

$$X(t) = \sigma W(t) + \mu t$$

where W(t) is a standard Wiener process, µ is the drift and σ is the diffusion parameter. Now

it is possible to derive the expected (average) Maximum Draw-Down. Its asymptotic behavior is

$$E[AD] = \frac{2\sigma^2}{\mu}\, Q_{AD}(\alpha^2)$$

$$Q_{AD}(x) \to \begin{cases}
\mu < 0: & -\gamma\sqrt{2x} \quad (x \to 0^+), \qquad -x - \tfrac{1}{2} \quad (x \to \infty) \\
\mu = 0: & E[AD] = 2\gamma\sigma\sqrt{T} \\
\mu > 0: & \gamma\sqrt{2x} \quad (x \to 0^+), \qquad \tfrac{1}{4}\log x + 0.49088 \quad (x \to \infty)
\end{cases}$$

$$\alpha = \mu\sqrt{\frac{T}{2\sigma^2}}, \qquad \gamma = \sqrt{\frac{\pi}{8}}$$

with T as the investment horizon. This setup allows us to estimate the average Maximum Draw-Down of a time series from its mean and variance. Again, our experience is that in practice a lot of data is necessary to get a reasonable result. The reason might lie in the assumption of normality of the return distribution, which holds, if at all, only for very long time series. If normality does not hold, the portfolio or asset wealth cannot be modelled by a Geometric Brownian Motion, and the algebraic relation is therefore no longer valid.


7 Comparison Of The Risk Measures

Peijan and López de Prado state in [33] that there is also an algebraic relation between VaR, Draw-Down and Time Under-The-Water whenever normality and time-independence hold. In section 5.3 we have seen that the variance is closely related to CVaR, and in section 6.3 we have shown that there exists an algebraic correspondence between variance and Draw-Down in the case of normality.
This means that there is then an algebraic correspondence between Variance, Value at Risk, Draw-Down and Time Under-The-Water. For the context of portfolio optimization we can conclude that the three optimization techniques (minimizing the variance, minimizing CDaR and maximizing CVaR) will end up with the same results if the assumptions of normality and time-independence hold.
This relation between the different risk measures disappears when normality can no longer be assumed, and we then expect the optimization procedures to produce different results.


Part III

Optimization With Alternative Investments

This third part deals with the implementation of the discussed risk measures and the results achieved with different data sets. We first show the implemented optimization problems and some numerical specialities related to them. Then we briefly discuss the different kinds of data and present the data used. Afterwards the results of the calculations are shown and interpreted. Finally a summary and an outlook are given.

8 Numerical Implementation

The table below summarizes the considered optimization problems, whereby µ indicates the expected return given by the investor.

Min Variance   Max CVaR       Min CDaR
E[R] = µ       E[R] = µ       E[R] = µ
Σ w_i = 1      Σ w_i = 1      Σ w_i = 1
w_i ≥ 0        w_i ≥ 0        w_i ≥ 0

In the following we list some aspects of the implementation:

• Since our optimizer is only capable of minimizing a function but not of maximizing one, in the implementation we minimize −CVaR instead of maximizing CVaR.

• We do not minimize the variance but the standard deviation. Experiments have shown that this results in a better convergence of the solution. The reason might lie in the optimization algorithm, which seeks the lowest value of a function by following the steepest gradient. Since we are mostly dealing with variances smaller than 1, the standard deviation, as the square root of the variance, has a "broader minimum".

• In order to get the best results, the efficient frontier gets calculated twice: a first run starts at the corner solution with the lowest expected return and moves to the corner solution offering the highest expected return; afterwards a second run is executed in reverse order. The results of the first run are stored and compared with the results of the second run, whereby the better results (i.e. the portfolio weights leading to a smaller risk value) are chosen for the final output. While moving from one corner solution to the other, the optimal portfolios are calculated. The optimizer can be given an initial estimate of the weights of the optimal portfolio. These estimates are calculated by linear extrapolation of the last two optimal portfolio weight vectors, because the optimal weights often change approximately linearly as the expected return changes.
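The two-pass procedure with extrapolated starting weights can be sketched as follows (our own illustration using SciPy's SLSQP solver on simulated scenario returns; the thesis uses its own optimizer):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
R = rng.normal([0.004, 0.007, 0.010], [0.01, 0.03, 0.05], size=(165, 3))
e = R.mean(axis=0)          # expected returns of the assets
N = len(e)

def risk(w):                # minimize the standard deviation, not the variance
    return float((R @ w).std())

def solve(mu, w0):
    cons = [{"type": "eq", "fun": lambda w, mu=mu: float(w @ e - mu)},
            {"type": "eq", "fun": lambda w: float(w.sum() - 1.0)}]
    return minimize(risk, w0, method="SLSQP",
                    bounds=[(0.0, 1.0)] * N, constraints=cons).x

def run(mus):
    sols = []
    for mu in mus:
        if len(sols) >= 2:  # extrapolate the last two optimal weight vectors
            w0 = np.clip(2.0 * sols[-1] - sols[-2], 0.0, 1.0)
        else:
            w0 = np.full(N, 1.0 / N)
        sols.append(solve(mu, w0))
    return sols

mus = np.linspace(e.min() + 1e-4, e.max() - 1e-4, 11)
first = run(mus)                       # low to high expected return
second = run(mus[::-1])[::-1]          # second run in reverse order
best = [min(pair, key=risk) for pair in zip(first, second)]
```

For each target return the run with the smaller risk value wins, mirroring the two-pass comparison described above.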


9 Used Data

In this section we will explain the diﬀerence between normal and logarithmic data and argue

why we have decided to use logarithmic data. We also list the used historical market data and

show how we have simulated artiﬁcial data from it.

9.1 Normal Vs. Logarithmic Data

In finance there are two common ways to model returns: simple/geometric returns and logarithmic/continuously compounded returns. The following table gives an overview of geometric and logarithmic returns for the single- and multi-period case. P_t and P_{t+1} denote the absolute value of the asset at time points t and t+1 respectively.

Geometric Return (single-period):   R_{t,t+1} = P_{t+1}/P_t − 1
Geometric Return (multi-period):    R_{t,t+n} = [Π_{i=0}^{n−1} (1 + R_{t+i,t+i+1})] − 1
Logarithmic Return (single-period): r_{t,t+1} = log(1 + R_{t,t+1})
Logarithmic Return (multi-period):  r_{t,t+n} = log[Π_{i=0}^{n−1} (1 + R_{t+i,t+i+1})] = Σ_{i=0}^{n−1} r_{t+i,t+i+1}

With geometric returns, the new return gets calculated at the end of each period, and the increase or decrease therefore only takes effect in the next period. In contrast, with continuously compounded returns the change is calculated over an infinitesimally small time period, and a continuously compounded return therefore represents the actual value at every point in time.

We have decided to use logarithmic/continuously compounded returns for the analysis for

the following reasons:

• Because 0 ≥

P

t+n

P

t

≥ ∞, using simple returns, the eﬀective return can not be below -1

(full loss, for P

t+n

= 0) which is an restriction to the range of possible values. Using

logarithmic returns, the range gets stretched to [−∞, ∞]. This is especially important

for tail analysis, since the tail gets cut at -1 using simple returns and there would be a

probability assigned to value that do not appear.

• If single-period returns are assumed to be normal, then multi-period returns [\prod_i (1 + R_{t+i})] - 1 are not normal. This comes from the fact that a product of normally distributed variables is not normally distributed. With log-returns, multi-period returns are obtained by summing the single-period returns, and a sum of normally distributed variables is again normally distributed.

• The concepts of CVaR and CDaR calculate thresholds that may lie within a time period, whereas the data refer to the end of the period. In this case it is more precise to use logarithmic returns instead of the linear approximation implied by geometric returns.
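The relations in the table above can be checked numerically. The following Python sketch (the thesis itself used R; this is only an illustration on made-up prices) verifies that simple returns aggregate multiplicatively while log returns aggregate additively:

```python
import numpy as np

# Hypothetical price path for illustration
prices = np.array([100.0, 104.0, 101.0, 108.0])

simple = prices[1:] / prices[:-1] - 1        # R_{t,t+1} = P_{t+1}/P_t - 1
log_ret = np.log(prices[1:] / prices[:-1])   # r_{t,t+1} = log(1 + R_{t,t+1})

# Multi-period: simple returns compound multiplicatively,
# log returns simply add up; both recover the same total price change.
multi_simple = np.prod(1 + simple) - 1
multi_log = np.sum(log_ret)
```

Both quantities describe the same price change P_{t+n}/P_t, only on different scales.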


9.2 Empirical Vs. Simulated Data

Empirical Market Data

As real market data we have chosen 3 bond indices, 5 equity indices and a hedge fund index. For

equities and bonds there is a representative index for each of the following geographic categories:

the whole world, Europe and the United States. Additionally there are equity indices for the Far East and the Emerging Markets. As a proxy for alternative investments the Hedge Fund Research (HFR) Fund Weighted Composite Index is used. For a list of the various hedge fund styles included in this index and their descriptions the reader is referred to appendix F.

The data comes from DataStream, except for the HFR data, which comes directly from HFR. The data range covers almost the past 14 years (January 1990 until September 2003) on a monthly basis. This means that 165 data points per index are available. The hedge fund index acts as the bottleneck, because for all of the other indices more data from the past would be available. However, to make the results more comparable, we restrict the data range to the largest common range. It is not possible to obtain the full range for all indices; e.g. the two indices based on the euro are only available after the introduction of this currency in 1999. Indices not quoted in USD were converted to this currency. We are aware that, by converting all indices to USD, we have introduced currency risk into the time series. However, we think it makes much more sense to compare time series in the same currency than in different ones. In case a value of a time series was missing for a certain date (e.g. because of a holiday), we have taken the value from the day before. The tests are applied to the log-returns of the data series.

The following table lists the indices and the first four moments of their logarithmic monthly returns.

Asset Class                     Mean      Std. Dev.  Skewness  Excess Kurtosis
HFR Fund Weighted Composite     0.01140   0.0205     -0.775    3.24
MSCI World                      0.00455   0.0435     -0.539    0.502
MSCI Europe                     0.00603   0.0467     -0.566    0.847
MSCI North America              0.00866   0.0444     -0.569    0.600
MSCI Far East                  -0.00374   0.0661      0.141    0.615
MSCI Emerging Markets           0.00533   0.0697     -1.08     3.19
JPM Global                      0.00648   0.0191      0.505    0.972
JPM Europe                      0.00681   0.0282      0.0245   0.744
JPM USA                         0.00635   0.0131     -0.568    1.19

The Covariance matrix of the 9 asset classes is as follows:

HFR FWC MSCI WD MSCI EU MSCI US MSCI FE MSCI EM JPM WD JPM EU JPM US

HFR FWC 0.000418 0.000604 0.000576 0.000636 0.000577 0.00108 -0.0000111 -0.0000705 -0.00000448

MSCI WD 0.000604 0.00188 0.00178 0.00161 0.00218 0.00205 0.000149 0.000120 0.00000066

MSCI EU 0.000576 0.00178 0.00216 0.00143 0.00167 0.00193 0.000213 0.000316 -0.00000169

MSCI US 0.000636 0.00161 0.00143 0.00196 0.00121 0.00195 0.0000218 -0.0000999 0.0000102

MSCI FE 0.000577 0.00218 0.00167 0.00121 0.00435 0.00227 0.000328 0.000301 0.0000152

MSCI EM 0.00108 0.00205 0.00193 0.00195 0.00227 0.00483 -0.0000991 -0.000262 -0.000156

JPM WD -0.0000111 0.000149 0.000213 0.0000218 0.000328 -0.0000991 0.000362 0.000468 0.000164

JPM EU -0.0000705 0.000120 0.000316 -0.0000999 0.000301 -0.000262 0.000468 0.000790 0.000170

JPM US -0.00000448 0.00000066 -0.00000169 0.0000102 0.0000152 -0.000156 0.000164 0.000170 0.000172
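The four moments in the table above can be computed from a log-return series as in the following sketch (the thesis computed these statistics in R; this Python version is only indicative):

```python
import numpy as np

def moments(log_returns):
    """Mean, standard deviation, skewness and excess kurtosis of a return series."""
    x = np.asarray(log_returns, dtype=float)
    mu = x.mean()
    sigma = x.std(ddof=0)               # population standard deviation
    z = (x - mu) / sigma                # standardized returns
    skew = np.mean(z ** 3)
    excess_kurt = np.mean(z ** 4) - 3.0  # the normal distribution has kurtosis 3
    return mu, sigma, skew, excess_kurt

# A large normal sample should show skewness and excess kurtosis close to 0.
rng = np.random.default_rng(0)
mu, sigma, skew, kurt = moments(rng.normal(0.005, 0.02, size=100_000))
```

Fat tails and asymmetry, as for the hedge fund index, would show up as large excess kurtosis and nonzero skewness.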


Simulated data

We generate artificial data based on the historical data described above. For this purpose we first fit a multivariate skewed normal distribution and a multivariate skewed Student-t distribution to the historical data. The fitting procedure gives us a vector of regression coefficients, the covariance matrix, a vector of shape parameters and the degrees of freedom. In the case of fitting a skewed normal distribution, the shape parameters are all 0 and the degrees of freedom are infinite, as is well known for the normal distribution.

Based on these estimated distributions we can generate random samples. As always with Monte Carlo simulation, we have the advantage of full control over the underlying model because we can control and change the parameters. As a disadvantage we note that the Monte Carlo simulation ignores all dependencies over time in the time series and therefore slightly overstates the true value of diversification across asset classes in simulated portfolios.

Another unpleasant aspect is that only one value for the degrees of freedom is estimated for all asset classes and respected in the fitted function. This means that the time series do not each have an individual kurtosis but only a common one. However, it is a non-trivial task to generate multivariate correlated data with individual skewness and kurtosis, and doing so would be beyond the scope of this thesis. One approach would be to use copulas.
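The thesis generated the samples with the R package `sn` (`msn.fit`/`rmsn` and `mst.fit`/`rmst`). As a simplified illustration in Python, multivariate Student-t samples with a given location and scale structure can be drawn as a normal/chi-square mixture; this sketch omits the skewness component and is not the thesis's actual procedure:

```python
import numpy as np

def rmvt(mean, cov, df, n, rng):
    """Draw n samples from a multivariate Student-t distribution.

    A t-distributed vector is a normal vector divided by an independent
    sqrt(chi2/df) factor; for df -> infinity this reduces to the normal case.
    """
    d = len(mean)
    z = rng.multivariate_normal(np.zeros(d), cov, size=n)
    chi2 = rng.chisquare(df, size=n)
    return np.asarray(mean) + z / np.sqrt(chi2 / df)[:, None]

# Hypothetical monthly means and scale matrix for two asset classes
rng = np.random.default_rng(1)
mean = [0.005, 0.006]
cov = [[4e-4, 1e-4], [1e-4, 2e-4]]
samples = rmvt(mean, cov, df=6, n=200_000, rng=rng)
```

Note that the single common degrees-of-freedom parameter mentioned above shows up here as well: one `df` governs the tail heaviness of all components jointly.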


10 Evaluation Of The Portfolios

So far we have presented three methods for optimizing a portfolio, and in the last chapter we introduced some historical market data. In the following part we present the portfolios that were optimized based on this market data. In a first section we show the results of the portfolio optimization using only traditional asset classes. Afterwards we introduce a hedge fund and analyze how it changes the optimal portfolios. In the third part we generate artificial data with the same characteristics as the asset classes and optimize on this data.

For the portfolio optimization we fix the expected return of the investor and minimize the respective risk. This procedure is repeated for several expected returns to obtain the efficient frontier. For the calculations, the range of these expected returns is defined as the interval between the smallest and the largest expected return of the asset classes. Clearly, under the assumption of no short sales and no lending and borrowing it is not possible to reach an expected portfolio return outside this interval (see chapter 1.2). We are aware that the part of the efficient frontier below the minimum risk portfolio is not relevant in practice. This is especially true for expected target returns below 0. We will nevertheless show the whole range to give the complete picture of the optimization results and to allow comparison.
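The procedure just described, minimizing risk for a grid of target returns under a budget constraint and no short sales, can be sketched as follows. This is an illustrative Python/scipy version with made-up inputs; the thesis used R together with the DONLP2 optimizer written in Fortran77:

```python
import numpy as np
from scipy.optimize import minimize

def min_variance_weights(mu, cov, target):
    """Minimize portfolio variance for a given target return,
    with full investment (weights sum to 1) and no short sales."""
    n = len(mu)
    cons = [
        {"type": "eq", "fun": lambda w: np.sum(w) - 1.0},   # budget constraint
        {"type": "eq", "fun": lambda w: w @ mu - target},   # target return
    ]
    res = minimize(lambda w: w @ cov @ w, np.full(n, 1.0 / n),
                   method="SLSQP", bounds=[(0.0, 1.0)] * n, constraints=cons)
    return res.x

# Hypothetical monthly means and covariance matrix of three assets
mu = np.array([0.004, 0.006, 0.008])
cov = np.array([[4.0e-4, 1.0e-4, 0.5e-4],
                [1.0e-4, 2.0e-4, 0.8e-4],
                [0.5e-4, 0.8e-4, 6.0e-4]])

# Sweep target returns between the smallest and largest asset mean
frontier = [min_variance_weights(mu, cov, t)
            for t in np.linspace(mu.min(), mu.max(), 5)]
```

Replacing the quadratic objective with CVaR or CDaR yields the other two optimization problems; the constraint structure stays the same.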

For the charts we use the following color encoding:

Asset Class Color Style

HFR Fund Weighted Composite Black Solid

MSCI World Orange Solid

MSCI Europe Red Solid

MSCI North America Green Solid

MSCI Far East Blue Solid

MSCI Emerging Markets Pink Solid

JPM Global Orange Dashed

JPM Europe Red Dashed

JPM USA Green Dashed

The alpha value for the CVaR and CDaR optimization is chosen as 0.25, the size of the rolling window for the CDaR as 24 and the step size for the CDaR as 3. These are the values for which we obtained the most stable results. Since we have only 165 data points per time series, it was not possible to decrease the alpha value further towards 0.1, or to use non-overlapping windows for the calculation of CDaR, and still get reasonable results.
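For illustration, drawdowns and a conditional draw-down at risk can be computed from a cumulative log-return series roughly as in the following Python sketch. It is a simplified discrete-sample version (no rolling window, CDaR taken as the mean of the worst alpha-fraction of observed drawdowns), not the thesis's exact implementation:

```python
import numpy as np

def drawdowns(log_returns):
    """Drawdown at each time point: distance of the cumulative return
    from its running maximum (0 means the series is at a new high)."""
    cum = np.concatenate(([0.0], np.cumsum(log_returns)))
    running_max = np.maximum.accumulate(cum)
    return running_max - cum

def cdar(log_returns, alpha=0.25):
    """Conditional draw-down at risk: mean of the worst alpha-fraction
    of the observed drawdowns."""
    dd = np.sort(drawdowns(log_returns))[::-1]        # largest drawdowns first
    k = max(1, int(np.ceil(alpha * len(dd))))
    return dd[:k].mean()

r = np.array([0.02, -0.05, -0.03, 0.04, 0.01, -0.02])  # toy return series
```

By construction the CDaR lies between the average drawdown and the maximum drawdown, which is why it reacts strongly to outliers in small samples.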

For the portfolio optimization no additional constraints on the weights were introduced, in order to see the pure, uninfluenced results.

10.1 Evaluation With Historical Data

This section contains the results derived by using the original data series as presented before.

All the calculations are done for the 8 traditional asset classes and again for the 9 asset classes

including the hedge fund data.


Portfolios With Traditional Assets

Figure 30 shows the result of the optimization of the 8 chosen traditional asset classes. The two pictures in the same row belong to the same optimization technique (Mean-Variance, Mean-CVaR or Mean-CDaR). The pictures in the left column show the weights of the individual asset classes as a function of the expected target return chosen by the investor. The pictures in the right column show the efficient frontier resulting from the optimized weights.

[Plots omitted. Left column panels: Asset Weights After Mean Variance / Mean CVaR / Mean CDaR Optimization (x-axis: Target Return, y-axis: Asset Weight). Right column panels: Mean Variance / Mean CVaR / Mean CDaR Efficient Frontier (x-axis: Standard Deviation, Conditional Value At Risk and Conditional Draw-Down At Risk respectively, y-axis: Target Return).]

Figure 30: The weights and eﬃcient frontiers for traditional asset classes for various optimization

criteria.

At first sight we can see from the pictures that the optimization techniques produce very similar results. They all start by investing 100% of the available capital in MSCI Far East (blue solid line) if the investor asks for a very low return around -0.004. This is the only asset class that offers such a low mean return. As we increase the expected return, the contribution of JPM USA (green dashed line) increases until an expected return of 0.006, where the contribution of MSCI Far East drops to 0. In this area we can see differences in the asset allocation of the three techniques: The Mean Variance optimization pushes the JPM USA class to 100% to reach an expected return of 0.006, whereas Mean CDaR increases JPM USA to at most 0.8 and allocates the remaining part to JPM Global (orange dashed line). All three techniques agree in the range above 0.007 to invest in JPM Europe (red dashed line) and have a major allocation in MSCI North America when it comes to an expected return above 0.008.

The efficient frontiers also look very similar for all three optimization techniques. The minimum risk portfolio lies at an expected return of 0.0065 for all techniques. We can state that the efficient frontiers of Mean Variance and Mean CVaR resemble each other more closely than either resembles the efficient frontier of the Mean CDaR optimization.

The results do not correspond completely to portfolio theory, which says that in the area of lower expected returns we should mostly find bonds, because they usually offer a lower expected return and a low risk. In the higher region of expected returns we would expect equity indices from risky geographic regions such as the Emerging Markets.

We can explain the calculated results by the actual situation in the world markets: The table with the four moments of the indices shows that MSCI Far East is the only index with a negative first moment. The reason for this is the Asian crisis of 1997, which is contained in the data interval. The second moment shows why indices like MSCI Emerging Markets and MSCI Europe do not appear in the weights chart: Their standard deviation is too high, especially in comparison to the bonds, which offer a higher expected return for a lower standard deviation. Since all three optimizations are linked to the standard deviation, this holds true for all of them.

The results also show the effect of diversification very clearly: MSCI Far East (blue solid line) and JPM US (green dashed line), which dominate the lower part of the expected-return range, have a covariance of only 0.0000152 (see covariance matrix), and MSCI US (green solid line) together with JPM EU (red dashed line), which have a high allocation in the higher part of the expected-return range, have a covariance of -0.0000999. These are among the smallest entries in the covariance matrix. This shows that all optimization techniques try to combine the least correlated assets.

This might also be the reason why MSCI World equities are used so rarely to form the portfolios: MSCI World can be considered a linear combination of the other indices. Since the optimization looks for optimal diversification, the other indices, which have more extreme properties, are used instead.

The results of the Mean-Variance optimization are very smooth, whereas the efficient frontier of the Mean-CDaR optimization is much more peaked and unstable. This effect might come from the small data set and the fact that CDaR (and to a certain extent also CVaR) weights outliers heavily. As we will see, these artifacts disappear as soon as we increase the amount of data.


Portfolios With Traditional And Alternative Assets

In this section we show the results of the portfolio optimizations given that a hedge fund index is available. Figure 31 contains the six pictures, with the weight allocations for the portfolios in the left column and the efficient frontiers in the right column.

[Plots omitted. Left column panels: Asset Weights After Mean Variance / Mean CVaR / Mean CDaR Optimization (x-axis: Target Return, y-axis: Asset Weight). Right column panels: Mean Variance / Mean CVaR / Mean CDaR Efficient Frontier (x-axis: Standard Deviation, Conditional Value At Risk and Conditional Draw-Down At Risk respectively, y-axis: Target Return).]

Figure 31: The weights and eﬃcient frontiers for traditional and alternative asset classes for various

optimization criteria.

Again it is visible that the results of the optimization techniques are similar. Another interesting effect is that the results for expected returns in the range of -0.004 to 0.006 are the same for the cases with and without the hedge fund index. This means that the hedge fund index has no influence on the lower expected returns. As in the situation without the hedge fund index, MSCI Far East (blue solid line) and JPM USA (green dashed line) dominate the range between -0.004 and 0.006. Around an expected return of 0.005, JPM Global (orange dashed line) also plays a minor role in the Mean CDaR and Mean CVaR optimizations. The hedge fund index is taken into consideration when the expected return reaches a level of 0.006 and above. It attracts all the weight for expected returns above 0.010 because it is the only asset offering such a high return. It is remarkable that JPM Europe (red dashed line) is overweighted in the Mean CDaR optimization in comparison to the Mean Variance and Mean CVaR optimizations.

The covariance matrix shows that the hedge fund index is only weakly correlated with the other assets. The results of the optimization suggest combining the hedge fund index with JPM EU (red dashed line) and JPM US (green dashed line) to get a high expected portfolio return. The correlation of the hedge fund index with the bond indices is negative. Again, the effect of diversification is exploited by all of the optimization techniques.

The little peaks in the weight allocation charts show that the CDaR results are much more unstable than the Variance results.

10.2 Evaluation With Simulated Data

In this section we optimize portfolios based on simulated data. As described earlier, the data is obtained by fitting a distribution to the available monthly time series of the asset classes. We distinguish between fitting a skewed normal distribution and fitting a skewed Student-t distribution. Once we have the distribution, we can generate as much artificial data with the same properties as we need. For the following calculations we have generated 2000 samples for each asset class. This represents 2000 months, or about 167 years, of data.

Portfolios With Simulated Traditional Assets

Figure 32 shows that the results we get when fitting a multivariate skew normal distribution to the historical data and generating 2000 samples from this distribution are very similar to those of the original data. We can see that the instability in the CDaR and CVaR optimizations disappears and all three optimizations give the same results. Only a little peak of 10 percent allocation in MSCI Emerging Markets in the CDaR optimization distinguishes the results.

In Figure 33 the results of fitting a multivariate Student-t distribution to the same 8 data series are provided. The covered range of the expected return has shifted from the interval (-0.004, 0.008) to the interval (-0.001, 0.010), which is a result of the random generation of new data from the fitted distribution. Besides this shift there is another difference compared to fitting a skewed normal distribution: The allocation of JPM Europe (red dashed line) varies more across the three optimization techniques. It fluctuates from 20 percent for the Mean-Variance optimization up to almost 40 percent for the Mean-CDaR optimization. This effect might come from the fitted skewed Student-t distribution, which allows a closer adaptation to the original data than the skewed normal distribution.


[Plots omitted. Left column panels: Asset Weights After Mean Variance / Mean CVaR / Mean CDaR Optimization (x-axis: Target Return, y-axis: Asset Weight). Right column panels: Mean Variance / Mean CVaR / Mean CDaR Efficient Frontier (x-axis: Standard Deviation, Conditional Value At Risk and Conditional Draw-Down At Risk respectively, y-axis: Target Return).]

Figure 32: The weights and eﬃcient frontiers for traditional asset classes for various optimization

criteria. The used data has been simulated by a skewed normal distribution ﬁtted to the historical data.


[Plots omitted. Left column panels: Asset Weights After Mean Variance / Mean CVaR / Mean CDaR Optimization (x-axis: Target Return, y-axis: Asset Weight). Right column panels: Mean Variance / Mean CVaR / Mean CDaR Efficient Frontier (x-axis: Standard Deviation, Conditional Value At Risk and Conditional Draw-Down At Risk respectively, y-axis: Target Return).]

Figure 33: The weights and eﬃcient frontiers for traditional asset classes for various optimization

criteria. The used data has been simulated by a skewed student-t distribution ﬁtted to the historical

data.


Portfolios With Simulated Traditional Assets And Alternative Assets

Figure 34 and figure 35 show the results for the 9 asset classes, including the hedge fund index. The results of figure 34 are obtained by fitting a skewed normal distribution to the historical data, whereas the results of figure 35 are obtained by fitting a skewed Student-t distribution to the historical data.

[Plots omitted. Left column panels: Asset Weights After Mean Variance / Mean CVaR / Mean CDaR Optimization (x-axis: Target Return, y-axis: Asset Weight). Right column panels: Mean Variance / Mean CVaR / Mean CDaR Efficient Frontier (x-axis: Standard Deviation, Conditional Value At Risk and Conditional Draw-Down At Risk respectively, y-axis: Target Return).]

Figure 34: The weights and eﬃcient frontiers for traditional and alternative asset classes for various

optimization criteria. The used data has been simulated by a skewed normal distribution ﬁtted to the

historical data.

Comparing figure 34 and figure 35 we see the same effect as for the 8 asset classes: the outcomes of the three different optimizations differ more when we fit the historical data with a multivariate skewed Student-t distribution instead of the multivariate skewed normal distribution. Besides this, we can confirm that hedge funds offer a possibility for higher returns.


[Plots omitted. Left column panels: Asset Weights After Mean Variance / Mean CVaR / Mean CDaR Optimization (x-axis: Target Return, y-axis: Asset Weight). Right column panels: Mean Variance / Mean CVaR / Mean CDaR Efficient Frontier (x-axis: Standard Deviation, Conditional Value At Risk and Conditional Draw-Down At Risk respectively, y-axis: Target Return).]

Figure 35: The weights and eﬃcient frontiers for traditional and alternative asset classes for various

optimization criteria. The used data has been simulated by a skewed student-t distribution ﬁtted to the

historical data.


Summary and Outlook

Portfolio optimization has always been a key issue in finance. In recent years its complexity has increased because of the emergence of derivatives and alternative instruments. New alternative investment vehicles like hedge funds are very interesting in the context of portfolio optimization because they offer many unexplored investment opportunities. This thesis dealt with the question of how to integrate alternative investments like hedge funds into a portfolio.

In the first part we presented the standard portfolio optimization approach according to Markowitz by describing the risk-return framework and the relation to the utility function of an investor. It is important to state that the standard Mean-Variance optimization assumes normally distributed returns or a specific utility function for the investor. The analytical solutions for optimal portfolios were derived for the case of two assets.

The purpose of the second part was to show that the requirements of the Mean-Variance optimization as proposed by Markowitz are not completely fulfilled, and to present some alternative optimization processes. To show the violation of the requirements, we applied statistical tests measuring the stylized facts of asset returns. The numerical results showed that the returns are not normally distributed but have fat tails. The stylized facts appear especially strongly when we increase the data frequency (e.g. going from monthly to daily data). Afterwards we discussed the desirable properties of risk measures and presented several sets of properties proposed in the literature. As alternatives to the Mean-Variance optimization, Value at Risk, Draw-Down and Time Under-The-Water and their derivations Conditional Value at Risk and Conditional Draw-Down at Risk were introduced. They were analyzed and compared with the variance as risk measure. It was explained that portfolios optimized according to variance, Value at Risk or Draw-Down will be very similar in the case of normally distributed data.

The third part summarized the results achieved by applying the three optimization techniques Mean-Variance, Mean-Conditional Value at Risk and Mean-Conditional Draw-Down at Risk to data. For this purpose we implemented a software framework to test and compare the different optimization techniques. This software framework and also the data used are explained. We introduced historical hedge fund data because it is known that hedge fund returns exhibit special statistical properties like skewness and kurtosis, and it is therefore interesting to see how they influence the portfolio optimization results. The data were twofold: We used empirical data and simulated time series based on fitting multivariate skewed distribution functions to the empirical returns. For each setup of data and optimization technique we calculated the efficient frontier and the weight allocation of the efficient portfolios. The results of the three optimization techniques differed depending on the data used. As expected, the outcome of the different optimization techniques was less variable in the case of normal data and varied more when we used non-normal data. This supported the conclusion from the algebraic analysis of the risk measures that portfolio optimization techniques other than Mean-Variance optimization are preferable in the context of non-normal data. Therefore we propose to use risk measures like Conditional Value at Risk or Conditional Draw-Down at Risk especially in the case of alternative investments, because their returns deviate from the normal distribution. However, hedge funds also have good properties if one wants to continue with the Mean-Variance optimization: In the investigated period hedge funds showed a very good performance and therefore offer a very high return. Even if the performance decreases in the future (e.g. because of stricter regulations), hedge funds will still be a very good way to diversify a portfolio because of their low correlation with the traditional assets.


The implemented software features efficient algorithms and interfaces to other programming languages. It is modularly designed so that the code can easily be changed and the functionality enhanced. We think that risk measures can be comfortably explored and analyzed with this software. It would be interesting to use other kinds of data and to implement new risk measures.


Appendix

A Quadratic Utility Function Implies That Mean Variance Anal-

ysis Is Optimal

In this appendix we show that it is possible to express the expected quadratic utility function in terms of mean and variance, and that it is therefore optimal to apply a mean-variance analysis if one uses a quadratic utility function.

The variance of a random variable W is defined in (2) as

\sigma_W^2 = E[(W - E[W])^2] = E[W^2 - 2W E[W] + E[W]^2]

Because

E[\sum_{i=1}^N X_i] = \sum_{i=1}^N E[X_i]

holds, we get

\sigma_W^2 = E[W^2] - E[2W E[W]] + E[W]^2

and since

E[cX] = c E[X]

holds, we can rewrite the variance as

\sigma_W^2 = E[W^2] - 2 E[W] E[W] + E[W]^2 = E[W^2] - (E[W])^2

Rearranging yields

E[W^2] = \sigma_W^2 + (E[W])^2

The expected value of the quadratic utility function we want to optimize is

E[U(W)] = E[W] - b E[W^2]

Substituting the term derived above gives

E[U(W)] = E[W] - b (\sigma_W^2 + (E[W])^2)

By deriving this expression we have shown that, assuming a quadratic utility function, the expected utility depends only on the mean and the variance of W; hence a mean-variance analysis optimizes the expected utility.


B Equivalence Of Diﬀerent VaR Deﬁnitions And Notations

Definitions and notations used in this thesis:

VaR_\alpha = \sup\{ x \mid P[R_P < x] \le \alpha \}     (35)

where \alpha is expected to be in [0.01, 0.1] and R_P is the random portfolio return.

CVaR_\alpha = E[ R_P \mid R_P \le VaR_\alpha ]     (36)

This notation corresponds to the left graphic of figure 36.

In contrast we find in [35] and [36] the following definitions and notations:

VaR_{1-\alpha} = \inf\{ x \mid P[R_P \le x] \ge \alpha \}     (37)

where \alpha is expected to be in [0.9, 0.99] and R_P is the random portfolio loss.

CVaR_{1-\alpha} = E[ R_P \mid R_P \ge VaR_{1-\alpha} ]     (38)

This corresponds to the right graphic of figure 36.

[Plots omitted. Left panel: probability density over returns with α = 0.05, 1 − α = 0.95 and VaR = 1.64 marked. Right panel: probability density over losses with 1 − α = 0.05, α = 0.95 and VaR = 1.64 marked.]

Figure 36: The two graphics depict the situation for the two kinds of definitions of VaR and CVaR for the case of a standard normal distribution and a 5%/95% confidence level. The left graphic shows α = 0.05 and a return function. The right graphic depicts α = 0.95 and a loss function.

Formulas (35) and (36) are defined on return functions (a positive value means a high return, a negative value indicates a loss) and calculate the 5% quantile, whereas formulas (37) and (38) are defined on loss functions (a positive value means a loss, a negative value indicates a gain) and deal with the 95% quantile.

The transformations between the two conventions can be expressed as

VaR_\alpha(X) = VaR_{1-\alpha}(-X)     (39)

CVaR_\alpha(X) = CVaR_{1-\alpha}(-X)     (40)

In [1], [2] and [3] we can find a mixture of both notations, where the same definitions as in formulas (35) and (36) are used with a negative sign for both formulas in order to comply with the sign of formulas (37) and (38).
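To make the two conventions concrete, the following Python sketch computes VaR and CVaR on a sample of returns under both sign conventions and checks the transformation (39)/(40). It is illustrative only, using simple empirical quantiles rather than the sup/inf definitions above:

```python
import numpy as np

def var_cvar_return(returns, alpha=0.05):
    """Return convention (as in (35)/(36)): lower alpha-quantile of the
    returns and the mean return in that lower tail."""
    var = np.quantile(returns, alpha)
    cvar = returns[returns <= var].mean()
    return var, cvar

def var_cvar_loss(losses, alpha=0.05):
    """Loss convention (as in (37)/(38)): upper (1-alpha)-quantile of the
    losses and the mean loss beyond it."""
    var = np.quantile(losses, 1.0 - alpha)
    cvar = losses[losses >= var].mean()
    return var, cvar

rng = np.random.default_rng(0)
returns = rng.normal(0.0, 1.0, size=10_000)

v_r, c_r = var_cvar_return(returns, alpha=0.05)
v_l, c_l = var_cvar_loss(-returns, alpha=0.05)   # losses are negated returns
```

Both conventions identify the same tail; only the sign and the quantile level differ, exactly as formulas (39) and (40) state.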


C Used R Functions

The following functions of the R programming language and environment were used for the

implementation of the software system:

Function Package Description

apply base Returns a vector or array or list of values obtained by applying a

function to margins of an array

arima.sim ts Simulate from an ARIMA model

bds.test tseries Computes and prints the BDS test statistic for the null that ‘x’ is

a series of i.i.d. random variables

data.csv fBasics Loads speciﬁed data sets, or lists the available data sets

ﬂoor base Rounding of Numbers

garchSim fSeries Univariate GARCH time series modelling

length base Get or set the length of vectors (including lists)

lines base Add Connected Line Segments to a Plot

ksgofTest fBasics Performs a Kolmogorov-Smirnov Goodness-of-Fit Test
mean base Generic function for the (trimmed) arithmetic mean

msn.ﬁt sn Fits a multivariate skew-normal (MSN) distribution to data

mst.ﬁt sn Fits a multivariate skew-student-t (MST) distribution to data

plot base Generic function for plotting of R objects

qnorm base Quantile function generation for the normal distribution

qqPlot fExtremes Produces a Quantile-Quantile plot of two data sets

qt base Quantile function generation for the t distribution

rmsn sn Random number generation for the multivariate skew-normal

distribution

rmst sn Random number generation for the multivariate skew-student

distribution

rmvnorm mvtnorm Generates random deviates from the multivariate normal distribution

rmvt mvtnorm Generates random deviates from the multivariate student distribution

rnorm base Random generation for the normal distribution

rsn sn Random number generation for the skew-normal distribution

rst sn Random number generation for the skew-student-t distribution

rt base Quantile function generation for the t distribution

runif base Generates random deviates from the uniform distribution

runsTest fBasics Performs a Runs Test

sum base Returns the sum of all the values present in its arguments

var base Computes the variance


D Description Of The Portfolio Optimization System

It was our intention to perform all calculations on a common hardware/software system in order to make the analysis as useful for practical applications as possible and easy to extend in the future. This leads to the following system:

• The system runs on current personal computers (3 GHz clock rate, 1 GB memory). We do not assume the availability of a supercomputer or PC cluster.

• As software components we use R as the front-end application and for some small calculations, plus an optimizer module written in Fortran77.

We will now describe how we have designed the system for portfolio optimization. The optimizer is written in Fortran77 and can be executed directly from R. The full system works as follows (see figure 37): R calls the optimization routine DONLP2 and passes the required data (id of the optimization method, asset returns, expected return of the portfolio) as parameters to the optimizer. The optimizer itself calls several subroutines that define the objective function, the equality constraints, the inequality constraints and all of their gradients. Where it is not possible to define analytic gradient functions, we have implemented a numerical gradient function.
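The numerical gradient function is not reproduced here from the Fortran77 source; the following is a minimal sketch in Python, assuming a standard central-difference scheme, checked against the analytic gradient 2Cw of the portfolio-variance objective w'Cw (the covariance matrix is made up for the check and is not thesis data).

```python
import numpy as np

def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    g = np.zeros_like(x, dtype=float)
    for i in range(len(x)):
        step = np.zeros_like(x, dtype=float)
        step[i] = h
        g[i] = (f(x + step) - f(x - step)) / (2 * h)
    return g

# Check against the analytic gradient of the portfolio variance w'Cw,
# which is 2Cw for a symmetric C (illustrative data).
C = np.array([[0.04, 0.01],
              [0.01, 0.09]])
f = lambda w: w @ C @ w
w = np.array([0.3, 0.7])
assert np.allclose(numerical_gradient(f, w), 2 * C @ w, atol=1e-6)
```

Since the objective here is quadratic, the central difference is exact up to rounding error, which makes it a convenient self-test before plugging the routine into an optimizer.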

[Figure 37 shows the data flow between R and Fortran77: R passes the asset return data and the risk measure ID to the optimizer DONLP2; DONLP2 evaluates the objective function f(x), the equality constraint h1(x) and the inequality constraint g1(x) at trial points x, and returns the optimal weights to R.]

Figure 37: Schema of the dependencies of the optimization process.

Our intention is to develop a general-purpose system that can easily be installed and extended. For this reason we have chosen a general non-linear optimizer that can be applied to any kind of problem. We are aware that it could be more time-efficient to use specialized optimizers for each problem (e.g. a linear optimizer for the Conditional Value at Risk problem); however, we think that the overhead of a general optimizer is negligible in our context.


The optimizer DONLP2 can be downloaded free of charge from http://ftp.mathematik.tu-darmstadt.de/pub/department/software/opti/ where it is available as a Fortran or C implementation. The correct functionality of the optimizer was verified by cross-tests against the optimizer in the R package "quadprog" and the optimizer included in Microsoft Excel. In its documentation DONLP2 is described as follows:

Purpose:

Minimization of an (in general nonlinear) diﬀerentiable real function f subject to (in general

nonlinear) inequality and equality constraints g, h.

f(x) = min, x ∈ S

S = {x ∈ R^n : h(x) = 0, g(x) ≥ 0}

Here g and h are vector-valued functions.
Bound constraints are integrated into the inequality constraints g. These might be identified by a special indicator in order to simplify the calculation of their gradients and also in order to allow a special treatment, known as the gradient projection technique. Fixed variables might also be introduced via h in the same manner.

Method employed:

The method implemented is a sequential equality constrained quadratic programming method

(with an active set technique) with an alternative usage of a fully regularized mixed constrained

subproblem in case of nonregular constraints (i.e. linearly dependent gradients in the "working set"). It uses a slightly modified version of the Pantoja-Mayne update for the Hessian of the

Lagrangian, variable dual scaling and an improved Armijo-type stepsize algorithm. Bounds on

the variables are treated in a gradient-projection like fashion. Details can be found in [40] and

[41].
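At the core of the method described above are equality-constrained quadratic subproblems. As a minimal sketch of such a subproblem (pure numpy with made-up data; DONLP2's iterations, active-set handling and inequality treatment are not reproduced here), the minimum-variance portfolio under the budget and target-return equality constraints can be obtained by solving the KKT system directly:

```python
import numpy as np

# Illustrative inputs (not thesis data): covariance C, mean returns mu,
# target portfolio return. Problem: min w'Cw  s.t.  1'w = 1, mu'w = target.
C = np.array([[0.040, 0.006, 0.000],
              [0.006, 0.090, 0.012],
              [0.000, 0.012, 0.160]])
mu = np.array([0.05, 0.08, 0.12])
target = 0.09
n = len(mu)

# KKT system [2C A'; A 0][w; lambda] = [0; b] with A = [1'; mu'], b = [1; target].
A = np.vstack([np.ones(n), mu])
b = np.array([1.0, target])
kkt = np.block([[2 * C, A.T], [A, np.zeros((2, 2))]])
rhs = np.concatenate([np.zeros(n), b])
w = np.linalg.solve(kkt, rhs)[:n]

# Both equality constraints hold at the solution.
assert np.isclose(w.sum(), 1.0)
assert np.isclose(w @ mu, target)
```

An SQP method such as DONLP2 repeatedly solves subproblems of this shape, with the Hessian of the Lagrangian in place of 2C and with active inequality constraints appended to A.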


E Description Of The Excel Optimizer

Optimization in Microsoft Excel begins with an ordinary spreadsheet model. The spreadsheet's formula language serves as the algebraic language used to define the model. Through the Solver's GUI, the user specifies an objective and constraints by pointing and clicking with a mouse and filling in dialog boxes. The Solver then analyzes the complete optimization model and produces the matrix form required by the optimizers. The optimizers employ the simplex, generalized-reduced-gradient, and branch-and-bound methods to find an optimal solution and sensitivity information. The Solver uses the solution values to update the model spreadsheet and provides sensitivity and other summary information on additional report spreadsheets.

Detailed information about the methods applied in the optimizer included in Microsoft Excel is given by Fylstra et al. [22].


F Description Of Various Hedge Fund Styles

This section lists and explains some common hedge fund strategies. The strategies are taken

from [6] and the respective volatility classiﬁcation from the webpage www.magnum.com.

• Convertible Arbitrage. Expected Volatility: Low

Attempts to exploit anomalies in prices of corporate securities that are convertible into

common stocks (convertible bonds, warrants and convertible preferred stocks). Convertible bonds tend to be under-priced because of market segmentation; investors discount

securities that are likely to change types: if the issuer does well, the convertible bond

behaves like a stock; if the issuer does poorly, the convertible bond behaves like distressed

debt. Managers typically buy (or sometimes sell) these securities and then hedge part or

all of the associated risks by shorting the stock. Delta neutrality is often targeted. Over-

hedging is appropriate when there is concern about default as the excess short position

may partially hedge against a reduction in credit quality.

• Dedicated Short Bias. Expected Volatility: Very High

Sells securities short in anticipation of being able to re-buy them at a future date at a

lower price due to the manager's assessment of the overvaluation of the securities, or the

market, or in anticipation of earnings disappointments often due to accounting irregu-

larities, new competition, change of management, etc. Often used as a hedge to oﬀset

long-only portfolios and by those who feel the market is approaching a bearish cycle.

• Emerging Markets. Expected Volatility: Very High

Invests in equity or debt of emerging (less mature) markets that tend to have higher inﬂa-

tion and volatile growth. Short selling is not permitted in many emerging markets, and,

therefore, eﬀective hedging is often not available, although Brady debt can be partially

hedged via U.S. Treasury futures and currency markets.

• Long/Short Equity. Expected Volatility: Low

Invests both in long and short equity portfolios generally in the same sectors of the market.

Market risk is greatly reduced, but effective stock analysis and stock picking are essential to

obtaining meaningful results. Leverage may be used to enhance returns. Usually low or no

correlation to the market. Sometimes uses market index futures to hedge out systematic

(market) risk. Relative benchmark index is usually T-bills.

• Equity Market Neutral. Expected Volatility: Low

Hedge strategies that take long and short positions in such a way that the impact of the

overall market is minimized. Market neutral can imply dollar neutral, beta neutral or

both.

– Dollar neutral strategy has zero net investment (i.e., equal dollar amounts in long

and short positions).

– Beta neutral strategy targets a zero total portfolio beta (i.e., the beta of the long

side equals the beta of the short side). While dollar neutrality has the virtue of

simplicity, beta neutrality better deﬁnes a strategy uncorrelated with the market

return.

Many practitioners of market-neutral long/short equity trading balance their longs and

shorts in the same sector or industry. By being sector neutral, they avoid the risk of

market swings aﬀecting some industries or sectors diﬀerently than others.
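The difference between dollar neutrality and beta neutrality described above can be illustrated with a small sketch (hypothetical positions and betas): a book can have zero net dollar investment and still carry market exposure when the long side has a higher beta than the short side.

```python
# Positions as (dollar amount, beta); longs positive, shorts negative.
# Hypothetical numbers for illustration only.
positions = [(+1_000_000, 1.5),   # long side
             (-1_000_000, 0.5)]   # short side

net_dollars = sum(d for d, _ in positions)
net_beta_dollars = sum(d * b for d, b in positions)  # beta-weighted exposure

assert net_dollars == 0               # dollar neutral: zero net investment
assert net_beta_dollars == 1_000_000  # but not beta neutral: residual long exposure
```

A beta-neutral book would instead size the short side so that the beta-weighted exposures cancel, which is why beta neutrality better defines a strategy uncorrelated with the market return.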


• Event Driven. Expected Volatility: Moderate

Corporate transactions and special situations

– Deal Arbitrage (long/short equity securities of companies involved in corporate trans-

actions)

– Bankruptcy/Distressed (long undervalued securities of companies usually in ﬁnancial

distress)

– Multi-strategy (deals in both deal arbitrage and bankruptcy)

• Fixed Income Arbitrage. Expected Volatility: Low

Attempts to hedge out most interest rate risk by taking oﬀsetting positions. May also use

futures to hedge out interest rate risk.

• Global Macro. Expected Volatility: Very High

Aims to proﬁt from changes in global economies, typically brought about by shifts in

government policy that impact interest rates, in turn aﬀecting currency, stock, and bond

markets. Participates in all major markets (equities, bonds, currencies and commodities), though not always at the same time. Uses leverage and derivatives to accentuate the

impact of market moves. Utilizes hedging, but the leveraged directional investments tend

to have the largest impact on performance.

• Managed Futures.

Opportunistically long and short multiple financial and/or non-financial assets. Sub-indexes include Systematic (long or short markets based on trend-following or other quantitative analysis) and Discretionary (long or short markets based on qualitative/fundamental analysis, often with technical input).


G References


[1] Acerbi C., Nordio C., Sirtori C., 2001: Expected Shortfall as a Tool for Financial Risk

Management.

www.gloriamundi.com

[2] Acerbi C., Tasche D., 2001: Expected Shortfall: a natural coherent alternative to value at

risk

www.gloriamundi.com

[3] Acerbi C., Tasche D., 2001: On the Coherence of Expected Shortfall.

www.gloriamundi.com

[4] Albrecht P., 2003: Risk Measures

Contribution prepared for: Encyclopedia of Actuarial Science

John Wiley & Sons

[5] Alexander C., 2001: Market models

John Wiley & Sons

[6] Amenc N., Martellini L., 2002: The Brave New World of Hedge Fund Indexes

Working Paper

[7] Artzner P., Delbaen F., Eber J.-M., Heath D., 1999: Coherent measures of risk

Mathematical Finance 9, pp. 203-228

[8] Belaire-Franch J., Contreras-Bayarri D., 2002: The BDS Test: A Practitioner's Guide

Journal of Applied Econometrics 17, pp. 691-699

[9] Bernoulli D., 1738: Exposition of a new theory on the measurement of risk.

[10] Bollerslev T., 1986: Generalized Autoregressive Conditional Heteroscedasticity.

Journal of Econometrics

[11] Brock W., Dechert W.D., Scheinkman J., 1987: A Test for Independence Based on the

Correlation Dimension.

University of Wisconsin Working Paper No. 8702

[12] Cheklov A., Uryasev S., Zabarankin M., 2003: Portfolio Optimization With Drawdown

Constraints.

Working Paper

[13] Cheridito P., Delbaen F., Kupper M., 2003: Coherent and convex risk measures for bounded càdlàg processes.

[14] Conover W.J., 1999: Practical Nonparametric Statistics.

John Wiley & Sons, Inc

[15] De Giorgi E., 2002: A Note on Portfolio Selection under Various Risk Measures.

Working Paper Series ISSN 1424-0459


[16] Dembo R., King A., 1992: Tracking Models and the Optimal Regret Distribution in Asset

Allocation.

Applied Stochastic Models and Data Analysis 8, pp. 151-157

[17] Dembo R., Freeman A., 2001: The Rules of Risk.

John Wiley & Sons, Inc

[18] Elton E., Gruber M., 1995: Modern portfolio theory.

John Wiley & Sons, Inc

[19] Embrechts P., Klueppelberg C., Mikosch T., 2002: Modelling Extremal Events.

Springer, pp. 290-294

[20] Embrechts P., H¨oing A., Juri A., 2003: Using Copulae to bound the Value-at-Risk for

functions of dependent risks.

Finance & Stochastics 7, pp.145-167

[21] Fishburn P., 1977: Mean-risk analysis with risk associated with below-target returns.
The American Economic Review 67, pp. 116-126

[22] Fylstra D., Lasdon L., Watson J., Waren A., 1998: Design and Use of the Microsoft Excel

Solver

Interfaces 28, pp.29-55

[23] Gaivoronski A., Pﬂug G., 2000: Value at Risk in Portfolio Optimization

[24] Grinold R., Kahn R., 1999: Active portfolio management

McGraw-Hill, p. 99

[25] Huerlimann W., 2001: Conditional Value-at-Risk bounds for Compound Poisson Risks and Normal Approximation.

Working Paper, MPS: Applied mathematics/0201009

[26] Jia J., Dyer S., 1996: A Standard Measure of Risk and Risk-Value Models

Management Science 42, pp. 1691-1705

[27] Kritzman M., 1995: The portable ﬁnancial analyst.

The Financial Analysts Journal

[28] Magdon-Ismail M., Atiya A., Pratap A., Abu-Mostafa Y., 2003: On the Maximum Draw-

down of a Brownian Motion

Working Paper

[29] Mandelbrot B., 1963: The variation of certain speculative prices.

Journal of Business 36, pp. 394-419

[30] Markowitz H., 1959: Portfolio selection: Eﬃcient diversiﬁcation of investments.

John Wiley & Sons

[31] de Moivre A., 1733:

[32] Pedersen C.S., Satchell E., 1998: An extended family of ﬁnancial risk measures

Geneva Papers on Risk and Insurance Theory 23, pp. 89-117


[33] Peijan A., López de Prado M., 2003: Measuring Loss Potential Of Hedge Fund Strategies.

Working Paper UBS Wealth Management and Business Banking

[34] Pﬂug G., 2000: Some Remarks on the value-at-risk and the conditional value-at-risk.

Working Paper at Department of Statistics and Decision Support Systems, University of

Vienna

[35] Rockafellar R., Uryasev S., 1999: Optimization of Conditional value-at-risk

www.gloriamundi.com

[36] Rockafellar R., Uryasev S., 2002: Conditional value-at-risk for general loss distributions

Journal of Banking & Finance 26 www.gloriamundi.com

[37] Ross S., 1976: The arbitrage theory of capital asset pricing.

Journal of Economic Theory, 13, pp. 341-360

[38] Roy A., 1952: Safety First and the Holding of Assets.
Econometrica 20, pp. 431-449

[39] Schneeweiss H., 1967: Entscheidungskriterien bei Risiko.

Springer, pp. 113-117

[40] Spellucci P., 1998: An SQP method for general nonlinear programs using only equality

constrained subproblems

Math. Prog. 82, (1998), pp. 413-448 Physica Verlag, Heidelberg, Germany

[41] Spellucci P., 1998: A new technique for inconsistent problems in the SQP method

Math. Meth. of Oper. Res. 47, pp. 355-400 Physica Verlag, Heidelberg, Germany

[42] Stone B., 1973: A General Class of Three-Parameter Risk Measures.

Journal of Finance 28, pp.675-685

[43] Testuri C., Uryasev S., 2002: On Relation Between Expected Regret and Conditional Value-at-Risk.

Working Paper, University of Florida www.gloriamundi.com

[44] Uryasev S., 2000: Conditional Value-at-Risk (CVaR): Algorithms and Applications.

Working Paper, University of Florida www.ise.uﬂ.edu/uryasev

[45] Wang S., Young V., Panjer H., 1997: Axiomatic characterization of insurance prices.
Insurance: Mathematics and Economics 21, pp. 173-183



2

Acknowledgment I would like to thank my supervisor PD Dr. Diethelm W¨rtz for directing this thesis and u guiding me with a lot of useful impulses. I am also thankful to Prof. Kai Nagel who gave my the opportunity to work on this topic. My gratefulness belongs also to the people at UBS Investment Research Dr. Marcos L´pez de o Prado, Dr. Achim Peijan, Laurent Favre and Dr. Klaus Kr¨nzlein who gave me a lot of inputs a during our discussions.

3

4 .

. . . . . . . . . . . 38 3. . . . . 72 10. . . . . . . . . . . . . . . . . 69 9. . . 30 II Beyond Markowitz 34 3 Stylized Facts Of Asset Returns 34 3. . . . . . . . . . .2 Variance As Risk Measure .1 Draw-Down And Time Under-The-Water . . . . . 7 1.2 Conditional Draw-Down At Risk And Conditional Time Under-The-Water At Risk 6. . . .1 Introduction To Risk In General . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1 Evaluation With Historical Data . . . . . . . . . . .3 Results Of Statistical Tests Applied To Market Data . . . . . . . 76 Summary and Outlook 82 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6. . . . . . .2 Dependencies Tests . . . . . . . . .2 Selecting Optimal Portfolios: The Eﬃcient Frontier . . . .1 Value At Risk . . . . . .2 Arbitrage Pricing Theory (APT) . . . . . . . . . . . . 70 10 Evaluation Of The Portfolios 72 10. . . . . . . . . . . . . . . . . . . . . . 6 Draw-Down Measures 6. 48 4. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simulated Data .1 Standard Capital Asset Pricing Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Expected Shortfall And Tail Conditional Expectation 5. 27 2. .2 Empirical Vs. . . .2 Evaluation With Simulated Data . . . . . . . . . . . . . . . . . . . . . . . .1 Risk Return Framework And Utility Function .2 Conditional Value At Risk. . . . .3 Mean-Conditional Draw-Down At Risk Eﬃcient Portfolios . . . . . . . . . 7 Comparison Of The Risk Measures 52 52 54 58 60 60 61 65 67 III Optimization With Alternative Investments 68 68 8 Numerical Implementation 9 Used Data 69 9. . . . . . . . . . 42 4 Portfolio Construction With Non Normal Asset Returns 48 4.Contents I Modern Portfolio Theory 7 1 Markowitz Model 7 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 2 Capital Asset Pricing Model (CAPM) 27 2. . . . . . . . . . . . . . . . . . . Logarithmic Data . . . . . . . . 5. . 35 3. . . . . 
.1 Normal Vs.1 Distribution Form Tests . .3 Mean-Conditional Value At Risk Eﬃcient Portfolios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 5 Value At Risk Measures 5. . . . . .

Appendix 84 A Quadratic Utility Function Implies That Mean Variance Analysis Is Optimal 84 B Equivalence Of Diﬀerent VaR Deﬁnitions And Notations C Used R Functions D Description Of The Portfolio Optimization System E Description Of The Excel Optimizer F Description Of Various Hedge Fund Styles G References 85 86 87 89 90 92 6 .

1. Then the expected returns of the individual assets would be E[Ri ] = µi (where E[] indicates the expected value) and the total return µP of the portfolio N µP = i=1 wi µRi (1) Two properties of the mean value that will become useful later: µRi +Rj µcRi = µRi + µRj = cµRi The ﬁrst property means that the mean of the sum of two return series i and j are the same as the mean of return series i plus the mean of return series j. ..... .. Our portfolio consists of these assets with a fraction of w1 . RN respectively. deriving a utility function and presenting the model that combines both for portfolio optimization. This is done by showing some statistical properties. The variance of the portfolio will be N 2 σP N N −1 N = E[(RP − µP ) ] = i=1 2 wi (Ri − µP ) = i=1 2 (wi σi ) + 2 i=1 j=i+1 2 wi wj σij (2) So in the case of three assets we get the following pattern: 2 σP = (w1 σ1 )2 + (w2 σ2 )2 + (w3 σ3 )2 + 2w1 w2 σ12 + 2w1 w3 σ13 + 2w2 w3 σ23 7 .1 Risk Return Framework And Utility Function Risk Return Framework Assuming we are given N assets with their returns R1 . The second property states that the mean of a constant c multiplied with a return series is equal to c times the mean of the return series i.. This model was developed in 1952/59 by Harry Markowitz and is still considered as the standard approach for this task. wN invested in each asset.Part I Modern Portfolio Theory 1 Markowitz Model In this ﬁrst chapter the fundamentals of portfolio theory are introduced.

(3) . Taking (2) with equal amount of investments in each of the N assets we get 1 2 N −1 σi + σij (5) N N whereby the ﬁrst term is called diversiﬁable or non market risk and the second term systematic or market risk. is that one can expect a more moderate but constant return on the long run. the variance of a portfolio can be smaller than the smallest variance of its individual assets because of the second term of (2) which can be negative in case of a negative covariance between the asset returns. . because assets of companies from the same country or business areas tend to move together and have hence a higher correlation. . The drawback of diversiﬁcation is that the investor looses the risk premium that a certain asset might provide since its contribution on the ﬁnal portfolio return is very small. So the aim of diversiﬁcation is to chose the assets in a way to keep the mean return high and lower the variance by an appropriate selection and weighting of the assets. . σ1N σ2N ··· 2 σN The correlation is deﬁned as the standardized covariance: ρij = σij σi σj (4) Comparing (1) and (2) we can see the eﬀect of diversiﬁcation: The return of a portfolio can never be smaller than the smallest return of its constituents. This is called risk premium: The extra return a particular asset has to provide over the rate of the market to compensate for market risk. In a risk return framework a high risk gets usually compensated by a high expected return. If we take a large amount of diﬀerent assets (N approaching inﬁnity). since it is the weighted average return of all constituents. Figure 1 shows an example exhibiting this eﬀect for the case of securities from the UK and the US. . . In contrast.. whereas the risk expressed in the second term is coming from the market itself (which can be inﬂuenced by economic changes or events with a large impact) and can not be reduced. . 8 . 
The advantage of a well diversiﬁed portfolio however.2 These variances σi = E[(RP − µP )2 ] and covariances σij = E[(Ri − µi )(Rj − µj )] = σji are collected in symmetric matrix called covariance matrix: 2 σ1 σ12 · · · σ1N σ12 σ 2 σ2N 2 C= . In practice this results in the recommendation to choose the constituents of a portfolio from diﬀerent geographic or industrial sectors. The risk represented in the ﬁrst term of (5) has its origin in the risk of the single assets the portfolio contains. 2 σP = 2 σP − − → σij −− N →∞ This eﬀect shows us that the ﬁrst term in (5) is called diversiﬁable risk because it can be reduced to zero by a good diversiﬁcation of the assets. the portfolio risk gets reduced to the average covariance of the assets in the portfolio and all the variances of the assets disappear. This also means that the risk of a portfolio of assets with a low correlation can be more reduced than the risk of a portfolio existing of highly correlated assets.

8 1. two persons in the mentioned situations will perceive its value diﬀerently. We will now discuss the properties that such an utility function should have and look at some typical economic utility functions. A classical example would be that a glass of water has a much higher utility for somebody who is lost in the dessert than for somebody in the civilization. We can see that a portfolio with few assets has a higher risk than a portfolio with lots of assets (eﬀect of diversiﬁcation).2 0. Since the line for the UK portfolio is higher. The ﬁrst property we want to have fulﬁlled is that an investor prefers more to less. everybody prefers more wealth than less wealth. Our ﬁrst requirement for a utility function U () for a wealth parameter W is therefore U (W ) > 0 As a second attribute we want to include the investors risk proﬁle. Although the glass of water might be exactly the same and therefore its price.Risk 0. 9 . To toss a coin would be a fair gamble if one player wins both investments when one side is up and the other payer wins both investments when the other side is up. It expresses the fact that an option with a higher return has always a higher utility than an option with a lower return assuming that both options are equally likely.6 0. From this we can conclude that the ﬁrst derivative of the utility function always has to be positive. Utility Function Of An Investor Bernoulli proposed in [9] that the value of an item should not be determined by the price somebody has to pay for it but by the utility that this item has for the owner. A fair gamble is a game where the expected gain is equal to zero. Or as a shorter expression. This means that the probability of a gain times the value of the gain is equal to the probability of a loss times the loss in absolute terms. we can conclude that the stocks in UK have a higher average covariance and their risk can therefore less reduced in a portfolio as the risk of a portfolio consisting of US stocks. 
Economists call this the non-satiation attribute. The structure of this section will partially follow the one in Elton&Gruber [18]. The doted line represents a portfolio consisting of stocks from the UK whereas the solid line depicts a portfolio with US stocks. We will examine three types of risk proﬁles.0 5 10 15 Number of assets 20 25 30 Figure 1: This chart shows the risk of a portfolio versus the number of assets the portfolio contains.0 0 0.4 0. Bernoulli uses a fair gamble to introduce this concept.

The increase of the utility function for a certain increase in the wealth is smaller if the investor is already on a high level of wealth Risk aversion is deﬁned as rejecting a fair gamble. He or she will sometimes play and sometimes not. Formulated according to the example with the fair gamble the ﬁgure expresses that for the same amount of increase in the utility the investor asks for a higher increase in the wealth the higher the wealth already is. 10 . Let’s ﬁnd out what the implications for a risk averse investor are: Since he or she does not invest. From this we see that a risk averse investor prefers to keep all of his/her fortune rather than to invest a part of it and loss or gain with a 50% probability an equal part. the additional amount of utility is less then the double. U (W ) < 0 Figure 2 shows a logarithmic utility function that fulﬁlls this property. A risk averse investor would not play a game where he or she has an expected return of zero in the long run. This is deﬁned as an investor which is indiﬀerent to a fair gamble. As second risk proﬁle we have a look at the risk neutral investor. Functions that satisfy this requirement have the second derivative smaller than zero. We can see that for the double amount of wealth.Utility Wealth Figure 2: A logarithmic utility function seems to be appropriate in the context of a risk averse investor. U (W ) − U (W − G) > U (W + G) − U (W ) and we can see that such an investor prefers the change from the current wealth minus the gain/loss to the current wealth than the change from the current wealth to the current wealth plus the gain/loss. Note that the absolute change in wealth is in both cases the same (G). we can conclude that the utility for keeping the current wealth is higher than the probability weighted utility for a gain and loss. 
We can describe this risk proﬁle for the case of a fair gamble as 1 U (W ) > U (W + G) + 2 where W is the current wealth and G the symmetric Multiplying by 2 and rearranging yields to 1 U (W − G) 2 gain/loss of the game.

Hence risk neutrality causes the second derivative of the utility function to be zero. in ﬁgure 3 the utility functions are drawn in a wealth/utility framework for the three risk types.Utility Wealth Figure 3: Utility functions for a risk averse investor (solid). 11 . These kind of investors agree to the following formulations 1 1 U (W ) < U (W + G) + U (W − G) 2 2 U (W ) − U (W − G) < U (W + G) − U (W ) we can assign them a utility function with a positive second derivative since the wealthier they are the more they will appreciate an additional increment in their wealth. U (W ) > 0 To conclude. U (W ) = 0 Risk seeking is called the third risk proﬁle and it is deﬁned as accepting a fair gamble. risk neutral investor (doted) and a risk seeking investor (dashed) in a wealth/utility framework For such a person the utility equation looks like 1 1 U (W ) = U (W + G) + U (W − G) 2 2 We can rearrange this again and get U (W ) − U (W − G) = U (W + G) − U (W ) this means that such a person is indiﬀerent about the preference of the change from the current wealth minus the gain/loss to the current wealth than the change from the current wealth to the current wealth plus the gain/loss.

we have three types of investors: • Decreasing absolute risk aversion: The investor increases the amount of wealth invested in risky assets when the wealth increases. A positive coeﬃcient indicates risk aversion. Our three risk proﬁle in a Mean-Variance framework are depicted in ﬁgure 4. The risk neutral investor (doted line) wants a certain return and does not care about the respective risk. It is possible to see how the three diﬀerent types of investors get compensated: The risk averse investor (solid line) accepts a higher risk if he/she gets a higher return as compensation. as stated in [24]. λ = 0 means risk neutrality and a negative coeﬃcient deﬁnes a risk seeking investor. risk neutral investor (doted) and a risk seeking investor (dashed) in a mean/variance framework There is a third property about useful utility functions that we can use to determine their appearance. From this interpretation one can see that the types of risk neutral and risk seeking investors are not very common. 12 . The risk seeking investor (dashed line) accepts a return/risk combination as long as either the return or the risk is high enough. A typical level of risk aversion would be around 0. • Constant absolute risk aversion: The investor keeps the amount of wealth invested in risky assets constant when the wealth increases. Mean Variance Figure 4: The Iso-Utility functions for a risk averse investor (solid). In [27] the following utility function is proposed for this purpose µU = µR − λσ 2 where µU is the expected utility.We can also transform the utility function to the Mean-Variance framework. µR the expected return. It is how the size of the wealth invested in risky assets changes once the size of the wealth has changed. Such a person compensates high risk with low return and vise versa. σ the standard deviation of returns and λ the risk-aversion coeﬃcient. 
These curves indicate the mean/risk combinations that seem equally pleasant to a certain investor because they yield the same value for the utility function. Again. With λ the function can get adapted to the investors aversion to risk.0075. It is convenient in this context to calculate the iso-utility curves.

In [18] two common utility functions are presented. The most frequently used utility function in economics is the quadratic one,

U(W) = aW − bW²

with first and second derivatives

U'(W) = a − 2bW
U''(W) = −2b

To make this utility function compliant with the requirements of a risk-averse investor, we have to require the second derivative to be smaller than zero, i.e. b positive. We have shown that an investor usually prefers more to less and therefore requires the first derivative to be positive, which holds for W < a/(2b).

A(W) = −U''(W)/U'(W) measures the absolute risk aversion of an investor. We can define the investor types according to A'(W) as follows:

A'(W) > 0: increasing absolute risk aversion
A'(W) = 0: constant absolute risk aversion
A'(W) < 0: decreasing absolute risk aversion

Increasing absolute risk aversion means that the investor decreases the amount of wealth invested in risky assets when the wealth increases. It is also possible to use the change in the relative investment as a property. This is expressed by the relative risk aversion

R(W) = −W U''(W)/U'(W) = W A(W)

and interpreted as follows:

R'(W) > 0: increasing relative risk aversion
R'(W) = 0: constant relative risk aversion
R'(W) < 0: decreasing relative risk aversion

It is commonly accepted that most investors exhibit decreasing absolute risk aversion, but there is no agreement concerning the relative risk aversion. An analysis of the absolute and relative risk-aversion measures shows that the quadratic utility function has increasing absolute and relative risk aversion.

The quadratic utility function is preferred because its assumption implies that mean-variance analysis is optimal (see Appendix A for a proof). Since the quadratic utility function has some undesired properties, there are other utility functions in use that also satisfy mean-variance analysis, like

U(W) = ln W          (6)

with its first and second derivatives

U'(W) = 1/W
U''(W) = −1/W²

It is clear that the first derivative is positive and the second derivative negative for all values of W > 0. So the logarithmic utility function also meets the requirements of a risk-averse investor who prefers more to less. Further, this function exhibits decreasing absolute risk aversion and constant relative risk aversion.
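The risk-aversion measures above can be checked numerically. The following is a minimal sketch of A(W) = −U''(W)/U'(W) and R(W) = W·A(W) for the two utilities; the parameter values a and b are illustrative choices, not taken from the thesis.

```python
# A numeric sketch of the absolute and relative risk-aversion measures
# for the quadratic and logarithmic utilities discussed above.
# The parameters a and b are illustrative, not from the thesis.

def risk_aversion_quadratic(W, a=1.0, b=0.01):
    """U(W) = a*W - b*W^2, risk averse for b > 0 and W < a/(2b)."""
    U1 = a - 2.0 * b * W        # U'(W)
    U2 = -2.0 * b               # U''(W)
    A = -U2 / U1                # absolute risk aversion
    return A, W * A             # (A(W), R(W))

def risk_aversion_log(W):
    """U(W) = ln(W)."""
    U1 = 1.0 / W                # U'(W)
    U2 = -1.0 / W ** 2          # U''(W)
    A = -U2 / U1                # equals 1/W, decreasing in W
    return A, W * A             # R(W) = 1, constant

# quadratic utility: absolute risk aversion increases with wealth
print(risk_aversion_quadratic(10.0)[0] < risk_aversion_quadratic(20.0)[0])   # True

# log utility: absolute risk aversion decreases, relative risk aversion stays at 1
print(risk_aversion_log(20.0)[0] < risk_aversion_log(10.0)[0])               # True
print(abs(risk_aversion_log(10.0)[1] - 1.0) < 1e-12)                         # True
```

This reproduces the statements in the text: the quadratic utility shows increasing, the logarithmic utility decreasing absolute risk aversion and constant relative risk aversion.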

1.2 Selecting Optimal Portfolios: The Efficient Frontier

The basic set-up of the Markowitz [30] model is as follows:

wᵀCw → Min                (7)
s.t.  wᵀµ = µP > 0        (8)
      wᵀe = 1

where e = (1, ..., 1)ᵀ, C is the covariance matrix as defined in (3), µ is the expected return vector of the assets and µP is the desired expected return of the portfolio. The first line of the set-up defines that we want to minimize the variance and therefore the risk of the final portfolio. In the second expression we fix the expected return of the portfolio to a chosen value; it is evident that we are only interested in a return larger than zero. The last constraint sets the sum of the weights to one, since we want to be fully invested.

Short sales are indicated by negative asset weights in a portfolio. In a short sale a trader sells an asset that is not in his or her possession, to buy it back later and equalize the balance sheet. This practice makes sense in expectation of a decreasing price. If no short sales are allowed, which is usually the case, there is an additional constraint:

wᵢ ≥ 0

We formulate the solution of the system according to de Giorgi [15]. Equations (7) and (8) describe a quadratic objective function with linear constraints. If the covariance matrix C is strictly positive definite, a portfolio solves the optimization problem iff

w(µP) = µP w₀ − w₁        (9)

where

w₀ = (1/S)(Q C⁻¹µ − R C⁻¹e)
w₁ = (1/S)(R C⁻¹µ − P C⁻¹e)

with

P = µᵀC⁻¹µ
Q = eᵀC⁻¹e
R = eᵀC⁻¹µ
S = PQ − R²

With (9) we can determine the optimal portfolio for a given expected portfolio return. This formula also sets the expected portfolio return µP into relation with the portfolio variance σP²:

σP²/(1/Q) − (µP − R/Q)²/(S/Q²) = 1        (10)
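The closed-form solution (7)-(9) is straightforward to evaluate numerically. The following sketch computes w(µP) for an illustrative two-asset example (the covariance matrix and return vector are invented for demonstration, not data from the thesis); the two constraints hold by construction.

```python
import numpy as np

# A sketch of the closed-form Markowitz solution (7)-(9): given a strictly
# positive definite covariance matrix C, expected returns mu and a target
# portfolio return mu_P, the optimal weights are w(mu_P) = mu_P*w0 - w1.
# The two-asset data below is illustrative, not taken from the thesis.

def markowitz_weights(C, mu, mu_P):
    e = np.ones(len(mu))
    Cinv = np.linalg.inv(C)
    P = mu @ Cinv @ mu
    Q = e @ Cinv @ e
    R = e @ Cinv @ mu
    S = P * Q - R ** 2
    w0 = (Q * Cinv @ mu - R * Cinv @ e) / S
    w1 = (R * Cinv @ mu - P * Cinv @ e) / S
    return mu_P * w0 - w1

C = np.array([[0.04, 0.01],
              [0.01, 0.09]])
mu = np.array([0.08, 0.12])

w = markowitz_weights(C, mu, mu_P=0.10)
print(abs(w.sum() - 1.0) < 1e-12)     # True: fully invested
print(abs(w @ mu - 0.10) < 1e-12)     # True: the target return is met
```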

A portfolio is called efficient if it offers the lowest possible risk/variance for a given expected return. The portfolios on the efficient frontier guarantee the highest expected return for a given variance. The calculation of all of these optimal portfolios for different expected returns µP leads to a set of points which is called the efficient frontier, a hyperbola in the µP/σP-plane as depicted in figure 5. With the efficient frontier we can determine the amount of risk an investor has to accept for a certain expected return he or she wants to achieve. Stated the other way around, an investor can determine how much return he or she can expect by accepting a certain risk threshold.

Figure 5: The efficient frontier (line) and some inefficient portfolios (points).

An important portfolio on the efficient frontier is the global minimum risk portfolio; it is the one at the very left of the efficient frontier. From (10) we can derive the expected return of the minimum risk portfolio as

µminRisk = R/Q

and from (9) the weights of the global minimum risk portfolio,

wminRisk = (1/Q) C⁻¹e

The minimum risk portfolio is the only unambiguous portfolio in the sense that there is only one possible expected return for its variance. In practice, nobody will choose a portfolio lying on the efficient frontier below the minimum risk portfolio, since the portfolios on the efficient frontier above the minimum risk portfolio offer a larger expected return for the same amount of risk.

To define the appropriate portfolio for an investor, we can use the iso-utility curves. Figure 6 shows the efficient frontier with some iso-utility curves. The optimal portfolio is located at the point of tangency between the efficient frontier and an indifference curve (indifference curve 2 in the example). This portfolio maximizes the utility, taking all the portfolios on the efficient frontier into consideration. Portfolios on indifference curve 3 would have a higher utility, however with the given assets we cannot construct such a portfolio. Portfolios on indifference curve 1 are achievable, however not optimal in the sense of the utility.

Figure 6: The efficient frontier and some indifference curves. The optimal portfolio is on the IDC2 line, where the efficient frontier acts as a tangent.

Let's follow the path of Markowitz [30] and have a closer look at the efficient frontier. Schneeweiss shows in [39] that if one wants to apply the mean-variance principle as proposed by Markowitz, i.e. that investors choose a desired expected return and then pick the portfolio with this mean and the lowest variance, one has to assume that the utility function is quadratic or that the returns are normally distributed. Both requirements are critical: not every investor necessarily has a quadratic utility function, or even a utility function in terms of mean and variance, and the requirement about the normality of the return distribution will be discussed in chapter 3.

In (2) we have defined the variance of a portfolio as follows:

σP² = Σi (wi σi)² + Σi Σj≠i wi wj σij

Since (4) holds, we can substitute σij and get

σP² = Σi (wi σi)² + Σi Σj≠i wi wj ρij σi σj

In the following we want to analyze the properties of the efficient frontier based on this formula for the four scenarios: short sales allowed or not allowed, and risk-free lending and borrowing possible or not possible. For the sake of simplicity this is done for a portfolio of only two assets (i = 1, 2).

Short sales not allowed, no risk-free lending and borrowing. We start with the most common situation, where we are not allowed to sell assets short and no risk-free lending and borrowing is possible. Most instruments have these restrictions to avoid

speculations and high risks. Three sub-cases are investigated, dependent on the value of the correlation ρ between the asset returns.

Perfect positive correlation (ρ = 1). With w2 = 1 − w1, mean and variance of the portfolio become

µP = w1 µ1 + (1 − w1) µ2          (11)
σP² = (w1 σ1 + (1 − w1) σ2)²      (12)

It shows that with totally correlated assets, return and risk of a portfolio are just the weighted averages of the return and risk of its components. By solving (11) for w1 and substituting w1 into (12), one gets

µP = (µ2 − ((µ1 − µ2)/(σ1 − σ2)) σ2) + ((µ1 − µ2)/(σ1 − σ2)) σP

which is the equation of a straight line. So the efficient frontier for positively correlated assets is a linear combination of the given assets, as shown in figure 7.

Figure 7: The efficient frontier of two assets with perfect correlation is a straight line.

Perfect negative correlation (ρ = −1). In the case of a perfect negative correlation, mean and variance of the portfolio become

µP = w1 µ1 + (1 − w1) µ2
σP² = (w1 σ1 − (1 − w1) σ2)² = (−w1 σ1 + (1 − w1) σ2)²      (13)

In the same way as in the case of positive correlation, we find that the efficient frontier consists of two straight lines (one for each sign of the square root of (13)), drawn in figure 8. If
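The perfectly correlated case can be verified directly: every portfolio of the two assets lies on the straight line through them. The asset parameters below are illustrative values, not data from the thesis.

```python
import numpy as np

# A sketch of the perfectly correlated case (rho = 1): portfolio risk is the
# weighted average of the asset risks, so every portfolio lies on the straight
# line through the two assets. The asset parameters are illustrative.

mu1, sigma1 = 0.12, 0.20
mu2, sigma2 = 0.08, 0.10

w1 = np.linspace(0.0, 1.0, 11)
mu_P = w1 * mu1 + (1.0 - w1) * mu2             # equation (11)
sigma_P = w1 * sigma1 + (1.0 - w1) * sigma2    # square root of (12) for rho = 1

slope = (mu1 - mu2) / (sigma1 - sigma2)
line = mu2 + slope * (sigma_P - sigma2)        # the straight-line frontier
print(np.allclose(mu_P, line))                 # True
```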

we have perfectly anti-correlated assets, it is always possible to find a combination of them which has zero risk, since the portfolio can be constructed as a linear combination of them. The appropriate weight and return can be found by setting (13) equal to zero:

w1 = σ2/(σ1 + σ2)
µP* = (µ1 σ2 + µ2 σ1)/(σ1 + σ2)

The two lines intersect the y-axis at µP*. The upper line has the equation µP = aσP + µP*, whereby the lower line is µP = −aσP + µP*, with a a constant.

Figure 8: The efficient frontier of two assets with perfect negative correlation.

No relationship between the returns of the assets (ρ = 0). For this scenario the variance of the portfolio simplifies to

σP² = (w1 σ1)² + ((1 − w1) σ2)²

To find the minimum risk portfolio, one sets ∂σP²/∂w1 = 0 and receives for the case of two assets

w1 = σ2²/(σ1² + σ2²)

The efficient frontier and the minimum risk portfolio are shown in figure 9.

Intermediate correlation. In general we can say that the efficient frontier will always lie to the left of the straight line connecting the two assets. Figure 10 shows that the efficient frontier moves to the left with decreasing correlation of the assets and allows a higher diversification and therefore a lower risk.

In practice we will almost always find positive correlation between asset classes and very rarely a negative correlation. The reason lies in the factors that influence the returns of the asset classes. This means that there are only very few periods where a certain asset class has a high profit and another asset class a negative profit. Most factors influence all asset classes
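The two special weights just derived can be checked numerically; the volatilities below are illustrative values, not data from the thesis.

```python
# A sketch of the two special weights derived above: the zero-risk weight for
# rho = -1 and the minimum-variance weight for rho = 0. The volatilities are
# illustrative values.

sigma1, sigma2 = 0.20, 0.10

# rho = -1: w1 = sigma2/(sigma1 + sigma2) sets (13) to zero
w1_neg = sigma2 / (sigma1 + sigma2)
risk_neg = abs(w1_neg * sigma1 - (1.0 - w1_neg) * sigma2)
print(risk_neg < 1e-12)                  # True: the combination is riskless

# rho = 0: w1 = sigma2^2/(sigma1^2 + sigma2^2) minimizes the variance
w1_zero = sigma2 ** 2 / (sigma1 ** 2 + sigma2 ** 2)
var_min = (w1_zero * sigma1) ** 2 + ((1.0 - w1_zero) * sigma2) ** 2
for w in (w1_zero - 0.01, w1_zero + 0.01):
    # nudging the weight in either direction cannot lower the variance
    assert (w * sigma1) ** 2 + ((1.0 - w) * sigma2) ** 2 >= var_min
print(w1_zero)                           # close to 0.2 for these volatilities
```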

in a similar way, and only a few factors influence only a part of the asset classes. For this reason the behavior of the asset classes is often positively correlated.

Figure 9: The efficient frontier of two assets with no correlation is a hyperbola. The minimum variance portfolio is the portfolio at the very left.

Short sales allowed, no risk-free lending and borrowing. By doing a short sale, one takes a negative position in an asset. This may be useful in the case that one expects that the value of the asset will decrease, or it might even make sense when one expects a positive return, in order to get cash to invest in an asset with a better performance. This means that one can construct a portfolio with a very high expected return by short selling a lot of assets with low expected return (see figure 11). Of course not only the expected return but also the risk of such a portfolio gets huge. In the mean-variance environment the efficient frontier will continue as a slightly concave curve to infinity.

Efficient frontier with risk-free lending and borrowing. Risk-free lending is an instrument where we get a fixed interest rate µrf by lending an amount to somebody (e.g. buying government bills). Similarly, we could also get cash from somebody and pay fixed interest for it (e.g. sell government bills short). In both cases the variance of the asset is zero (σrf = 0) because the interest rates are constant. Our two-asset portfolio, consisting of an asset 1 and a risk-free asset rf, has a variance equal to the weighted variance of asset 1:

σP² = (w1 σ1)²

The weight of asset 1 would therefore be

w1 = σP/σ1

As a formula for the efficient frontier we get:

µP = (1 − w1) µrf + w1 µ1 = µrf + ((µ1 − µrf)/σ1) σP

From this term for the expected return of the portfolio we can see that the efficient frontier is again a straight line, as in figure 12. The term (µ1 − µrf)/σ1, the slope of the function, is called the leverage factor. By changing the leverage factor, one changes µP and σP in a linear way. To conclude, one can say that all portfolios constructed with risk-free lending and borrowing lie on one straight line through the point (µrf, 0) and the point representing a portfolio consisting only of the one available asset.

Figure 10: Comparison of the efficient frontier of assets with different correlation. The correlation between asset 1 and asset 2 is −1, 0, 0.5, 1 (from left to right).

As soon as risk-free lending and borrowing is possible, nobody will be interested anymore in the hyperbola (and its expansion through short sales) described in the section above, but only in the tangent to the hyperbola through (µrf, 0), since it offers a higher µP for a given σP. Since short sales allow only a concave expansion of the efficient frontier to the right and the risk-free lending efficient frontier is a straight line, short sales are also in this case of no interest anymore. In the case that the lending rate is not the same as the borrowing rate, we get an efficient frontier consisting of three parts: it starts with the line of the borrowing rate until it touches the envelope of all the portfolios built without lending and borrowing, and continues finally on the line of the lending rate to infinity. An illustration is given in figure 13.
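The straight-line property of the lending/borrowing frontier can be illustrated with a short sketch; the rates below are illustrative values, not data from the thesis.

```python
# A sketch of the frontier with one risky asset and a risk-free rate: all such
# portfolios lie on the line mu_P = mu_rf + ((mu1 - mu_rf)/sigma1)*sigma_P.
# The rates are illustrative values.

mu_rf, mu1, sigma1 = 0.03, 0.10, 0.20

def portfolio(w1):
    """w1 < 1 lends part of the wealth at mu_rf, w1 > 1 borrows (leverage)."""
    mu_P = (1.0 - w1) * mu_rf + w1 * mu1
    sigma_P = w1 * sigma1              # sigma_rf = 0
    return mu_P, sigma_P

slope = (mu1 - mu_rf) / sigma1         # the leverage factor, slope of the line
for w1 in (0.5, 1.0, 1.5):             # lending, fully invested, borrowing
    mu_P, sigma_P = portfolio(w1)
    assert abs(mu_P - (mu_rf + slope * sigma_P)) < 1e-12
print("all three portfolios lie on the line through (0, mu_rf)")
```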

Figure 12: Risk-free lending corresponds to the part of the efficient frontier to the left of asset 1 (intersecting the y-axis at µrf), and risk-free borrowing corresponds to the part of the efficient frontier to the right of asset 1.

Figure 13: The efficient frontier (solid line) for different borrowing and lending rates is constructed out of three parts: first it lies on the borrowing line until it arrives at the hyperbola of the efficient portfolios, which it follows until it reaches the tangent point of the lending line, where it continues to infinity.

Techniques for calculating the efficient frontier

In this chapter we will explain the techniques to determine the efficient frontier mathematically. Again, we will differentiate between the four cases of allowed and not allowed short sales, and possible and not possible risk-free lending and borrowing.

Short sales allowed, risk-free lending and borrowing possible. We start with the simplest case. From the earlier chapter we already know that with allowed short sales and risk-free lending and borrowing there will be one optimal portfolio on the tangent from the risk-free asset (on the y-axis) to the envelope of all the efficient portfolios. The enabled risk-free lending and borrowing makes this tangent the efficient frontier. Our aim is for this reason to maximize the slope of this tangent,

θ = (µP − µrf)/σP → Max        (14)

in order to maximize the return-to-risk ratio. There is a constraint to make sure that the weights add up to one:

Σi wi = 1        (15)

With this setup we have a constrained maximization problem, which could be solved with Lagrangian multipliers. However, it is possible to turn it into an unconstrained maximization problem by combining the constraint (15) and the objective function (14). In order to do so, we start with:

µrf = 1 · µrf = (Σi wi) µrf = Σi wi µrf

Substituting this and our definition of the variance of a portfolio (2) into (14), we get

θ = Σi wi (µi − µrf) / [Σi (wi σi)² + Σi Σj≠i wi wj σij]^(1/2)

The maximization problem can be solved by

∂θ/∂wi = 0

This gives us a system of equations where we can apply the substitution

zi = ((µP − µrf)/σP²) wi

which leads to the following system of N simultaneous equations for the N unknowns z1, ..., zN:

µ1 − µrf = z1 σ1² + z2 σ12 + ... + zN σ1N
µ2 − µrf = z1 σ12 + z2 σ2² + ... + zN σ2N
...
µN − µrf = z1 σ1N + z2 σ2N + ... + zN σN²
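In matrix form the system of simultaneous equations above is C z = µ − µrf, so it can be solved with one linear solve and the weights then rescaled to sum to one. The two-asset data below is illustrative, not taken from the thesis.

```python
import numpy as np

# A sketch of the tangency solution: the system of simultaneous equations above
# is C z = mu - mu_rf in matrix form; solving it and rescaling the z_i so that
# they sum to one gives the optimal weights. The data is illustrative.

C = np.array([[0.04, 0.01],
              [0.01, 0.09]])
mu = np.array([0.08, 0.12])
mu_rf = 0.03

z = np.linalg.solve(C, mu - mu_rf)    # the z_i of the N simultaneous equations
w = z / z.sum()                       # rescale so the weights sum to one
print(abs(w.sum() - 1.0) < 1e-12)     # True

# w maximizes theta = (mu_P - mu_rf)/sigma_P among fully invested portfolios
def theta(weights):
    return (weights @ mu - mu_rf) / np.sqrt(weights @ C @ weights)

for shift in (np.array([0.05, -0.05]), np.array([-0.05, 0.05])):
    assert theta(w) >= theta(w + shift)   # reallocations do not improve theta
```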

The optimal weights wi can then be obtained by rescaling:

wi = zi / Σj zj

Short sales allowed, risk-free lending and borrowing not possible. If there is no risk-free asset available, we can nevertheless assume that there is a fictive risk-free asset with a specified return. Then we are in the case discussed before and can compute the optimal portfolio corresponding to this situation. By changing the return of this fictive risk-free asset to other rates, we can calculate the efficient frontier as the set of the optimal portfolios corresponding to the different rates, as shown in figure 14.

Figure 14: In the case of allowed short sales but no risk-free asset, one can determine the efficient frontier as the set of points corresponding to different (fictive) risk-free rates µrf1, µrf2, µrf3.

Short sales not allowed, risk-free lending and borrowing possible. With the restriction of no short selling, we get an additional constraint and the optimization problem looks like

θ = (µP − µrf)/σP → Max

subject to the constraints

Σi wi = 1
wi ≥ 0  ∀i

This last condition makes the problem hard to solve, since we now have a quadratic programming problem and no longer an analytical solution. The quadratic aspect is hidden in the objective function: the σP term contains squared terms in wi. To solve these kinds of problems, one can use a standard solver package.

Short sales not allowed,

risk-free lending and borrowing not possible. If the investor does not want to allow short sales and no risk-free asset is available, we can solve the following optimization problem for the investor's expected return µP:

σP² = Σi (wi σi)² + Σi Σj≠i wi wj σij → Min

subject to

Σi wi = 1
Σi wi µi = µP
wi ≥ 0  ∀i

This is also a quadratic programming problem that should be solved with a computer package.
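As one possible "computer package" for this quadratic program, a general-purpose solver can be used. The sketch below uses SciPy's SLSQP method on an illustrative three-asset example (the data is invented for demonstration).

```python
import numpy as np
from scipy.optimize import minimize

# A sketch of the quadratic program above: minimize w'Cw subject to full
# investment, a target return mu_P and no short sales. A general-purpose SLSQP
# solver stands in for the "standard solver package" mentioned in the text;
# the three-asset data is illustrative.

C = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])
mu = np.array([0.06, 0.10, 0.14])
mu_P = 0.09

res = minimize(
    fun=lambda w: w @ C @ w,                   # portfolio variance
    x0=np.full(3, 1.0 / 3.0),
    method="SLSQP",
    bounds=[(0.0, None)] * 3,                  # w_i >= 0 (no short sales)
    constraints=[
        {"type": "eq", "fun": lambda w: w.sum() - 1.0},   # fully invested
        {"type": "eq", "fun": lambda w: w @ mu - mu_P},   # target return
    ],
)

w = res.x
print(res.success, abs(w.sum() - 1.0) < 1e-6, abs(w @ mu - mu_P) < 1e-6)
```

Sweeping µP over a grid of target returns and re-solving traces out the constrained efficient frontier.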


2 Capital Asset Pricing Model (CAPM)

This chapter presents two linear regression models to answer the question how an efficient market behaves if every market participant follows the rules of Markowitz. The models will also be used to introduce some important concepts of finance.

2.1 Standard Capital Asset Pricing Model

The Capital Asset Pricing Model describes how a market consisting of individual agents acting according to the model of Markowitz behaves in equilibrium. The Capital Asset Pricing Model makes several assumptions:

• Investors make decisions solely in terms of expected value, standard deviation and the correlation structure, with a one-period horizon.
• Unlimited short sales are allowed.
• Unlimited lending and borrowing at the risk-free rate is possible.
• Assets are infinitely divisible.
• There are no transaction costs.
• Investors have identical expectations and information flows perfectly.
• No single investor can affect prices by a single action; prices are determined by the actions of all investors in total.

As we have seen above, with allowed short sales but no risk-free lending and borrowing, we get an efficient frontier like the one from A to B in figure 15. The Separation Theorem says that, when we introduce risk-free lending and borrowing, the optimal portfolio can be identified without regard to the risk preference of the investor (optimal portfolio P in the figure). The investors satisfy their risk preferences by combining portfolio P with lending and borrowing, obtaining a portfolio on the tangent through P.

According to our assumptions, all investors have homogeneous expectations and are offered the same lending and borrowing rate. They will therefore all have exactly the same diagram as figure 15 and will all calculate the same portfolio P (and weight it variably with the risk-free asset). This implies that portfolio P must be the market portfolio. The market portfolio consists of all available risky assets, weighted with their market capitalization. We can summarize this in the Two Mutual Fund Theorem: in equilibrium, all investors hold combinations of only two portfolios, the market portfolio and a risk-free security.

Figure 16 shows the market portfolio M and the same straight line as in figure 15, connecting the risk-free asset and the market portfolio M. This line is called the Capital Market Line. It is the new efficient frontier that results from risk-free lending and borrowing; all investors will end up on it, since it contains all the efficient portfolios. The Capital Market Line defines the linear risk-return trade-off for all investment portfolios. The equation of this line is

Figure 15: The efficient frontier and its tangent at the optimal portfolio P. By lending and borrowing, one moves on the tangent: portfolio P is the portfolio without lending and borrowing. If one borrows additional capital from somebody, one gets a portfolio on the tangent to the right of P, and if one lends capital to somebody, one gets a portfolio on the tangent to the left of P.

Figure 16: The Capital Market Line describes the linear relation between risk and return for a portfolio. The market portfolio is depicted as M.

µP = µrf + ((µM − µrf)/σM) σP

This can be interpreted as

expected return = reward for time + reward for risk × amount of risk

Let's have a look at the individual assets. The relevant measure here is their covariance with the market portfolio (σiM). This is described by the Security Market Line: the Security Market Line defines the linear risk-return trade-off for individual stocks. Its formula is

µi = µrf + ((µM − µrf)/σM²) σiM

Figure 17: The Security Market Line describes the linear relation between risk and return for an individual asset.

At this point we would like to introduce a factor called beta. It is a constant that measures the expected change in the return of an individual security Ri given a change in the return of the market RM. It can be estimated by

βiM = σiM/σM²

We can use beta to substitute for the two variances:

µi = µrf + (µM − µrf) βi

Finally we derive a single index model that describes the relation between the return on an individual security and the overall market at a time point t:

Rit = αi + βi RMt + εit        (16)

where

αi: the part of the return of security Rit that is independent of the market's performance RMt,
βi: the sensitivity of the return of security Rit to the market's performance RMt,

RMt: the return of the market,
εit: a random error term with mean equal to zero.

The intention of splitting the return of a stock into a part that is related to the market (βi RMt) and a part that is related to the individual stock (αi) comes from the observation that when the market goes up, most stocks follow this trend, and vice versa. Therefore a part of the stock return is related to the market return. Beta measures how sensitive a stock's return is to the return of the market. A beta of two means that the return of the stock will be double the return of the market (no matter whether it is a loss or a gain); similarly, a beta of 0.5 means that the stock will move only half as much as the market does. In other words, a stock with a high beta gets a high risk premium and a stock with a low beta gets a low risk premium. It is interesting to see in (16) that the return is only influenced by the market risk; investors don't receive a premium for holding additional diversifiable/non-market risk.

We can summarize that the Capital Asset Pricing Model is a theoretical model to identify the tangency portfolio. It uses some ideal assumptions about the economy to conclude that the capital-weighted world wealth portfolio is the tangency portfolio and that every investor will hold this portfolio. As we have seen, the Capital Asset Pricing Model has some quite restrictive assumptions. This gives space for the Arbitrage Pricing Theory.

2.2 Arbitrage Pricing Theory (APT)

The Arbitrage Pricing Theory is an alternative approach to determining asset prices. It was first introduced in [37] and is based on the idea that exactly the same instrument cannot be priced differently. It asks for the following conditions to be fulfilled:

• Returns are generated according to a linear factor model.
• The number of assets N is close to infinite.
• Investors have homogeneous expectations (same as in the CAPM).
• Capital markets are perfect (perfect competition, no transaction costs; same as in the CAPM).

The Arbitrage Pricing Theory states that the returns of stocks are generated by a linear model consisting of F factors Ij:

Ri = ai + bi1 I1 + bi2 I2 + ... + biF IF + ei        (17)

where

ai: the expected return for stock i if all factors have a value of zero,
Ij: the value of factor j that impacts the return on stock i,
bij: the sensitivity of stock i's return to factor j,
ei: a random error term with mean equal to zero and variance σei². This error is uncorrelated with the factors Ij and with the errors of the other assets (unsystematic risk).
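The factor models (16) and (17) can be illustrated with a small simulation: generate returns from a one-factor linear model and recover the loading as the covariance with the factor divided by the factor variance. All parameter values and the simulated data are illustrative, not from the thesis.

```python
import numpy as np

# A sketch linking (16) and (17): simulate a one-factor linear model
# R_i = a_i + b_i1*I_1 + e_i and recover the loading as cov(R_i, I_1)/var(I_1).
# All parameter values and the simulated series are illustrative.

rng = np.random.default_rng(0)
T = 5000
I1 = rng.normal(0.005, 0.04, T)          # the single factor (e.g. a market return)
e = rng.normal(0.0, 0.02, T)             # idiosyncratic, mean-zero error
a_true, b_true = 0.002, 1.3
R = a_true + b_true * I1 + e             # returns generated by the factor model

b_hat = np.cov(R, I1, ddof=1)[0, 1] / np.var(I1, ddof=1)
a_hat = R.mean() - b_hat * I1.mean()
print(round(b_hat, 1), round(a_hat, 3))  # close to the true values 1.3 and 0.002
```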

If the assumptions hold, it is possible to find a portfolio that satisfies the following properties:

Σi wi = 0
Σi wi ai = 0
Σi wi bi1 = 0
Σi wi bi2 = 0
...
Σi wi biF = 0

The first condition defines that we have no net investment, since we want an arbitrage portfolio. The second condition asks the expected return of this portfolio to be zero if all factors are set to zero (non-arbitrage condition). The following conditions imply that the portfolio has no risk, since it has no exposure to any of the factors. These three types of conditions are called orthogonality constraints. We have assumed that the number of stocks is close to infinite, so we can combine the assets to get a risk-free portfolio that requires zero net investment (e.g. by short selling certain assets and buying others with the revenue).

The fundamental implication of the Arbitrage Pricing Theory is that such a costless, risk-free portfolio (arbitrage portfolio) must have a zero return on average. This is intuitive, since a risk-free portfolio with a non-zero expected return is an arbitrage opportunity, which would be exploited immediately by market participants and hence diminish. If this did not hold true, investors would have a free money generator.

Let's express this in a more mathematical way. Using (17) we can write the expected portfolio return as

µP = Σi wi ai + (Σi wi bi1) I1 + ... + (Σi wi biF) IF + Σi wi ei        (18)

Applying the orthogonality constraints to (18), we can see that such a portfolio must produce an expected return of zero. It can be shown that the orthogonality constraints imply that the expected returns µRi are a linear combination of the bij and a constant, i.e. there exists a set of factors λ0, ..., λF such that

µRi = λ0 + λ1 bi1 + ... + λF biF

The bij can still be interpreted as the sensitivity of the asset to a change in the underlying factor Ij, while the λj represent the risk premia of the respective factors. We now determine the λj by using the fact that an asset with single exposure to one factor and no exposure to the other factors has the same risk premium as this factor. For each
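The no-arbitrage argument can be made concrete with a small linear-algebra sketch: weights orthogonal to the vector of ones and to every loading column have zero net investment and zero factor exposure, and under exact linear pricing they earn exactly zero. The loadings and risk premia below are illustrative, not from the thesis.

```python
import numpy as np

# A sketch of the no-arbitrage argument: pick weights orthogonal to the vector
# of ones and to every column of factor loadings (zero net investment, zero
# factor exposure). If expected returns are exactly linear in the loadings,
# mu_i = lambda_0 + sum_j lambda_j*b_ij, such a portfolio must earn zero.
# The loadings and risk premia are illustrative values.

B = np.array([[1.0, 0.2],     # b_i1, b_i2 for four assets
              [0.5, 1.0],
              [1.2, 0.3],
              [0.8, 0.7]])
lam0 = 0.02
lam = np.array([0.04, 0.03])
mu = lam0 + B @ lam           # exact APT pricing

# the last right-singular vector of [e, B]^T spans its nullspace, i.e. the
# weights with w'e = 0 and w'B = 0 (an arbitrage portfolio)
A = np.column_stack([np.ones(4), B])
w = np.linalg.svd(A.T)[2][-1]

print(abs(w.sum()) < 1e-10, np.allclose(w @ B, 0.0))   # no investment, no risk
print(abs(w @ mu) < 1e-10)                             # zero expected return
```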

λj, j = 1 ... F, we do the following: the respective bij is set to 1 and all others equal to 0. With this procedure we find that

µRi = λ0 + bi1 (µR1 − λ0) + ... + biF (µRF − λ0)

We assume that for i = 0 we have the risk-free asset. Since the risk-free asset does not depend on any factors (b0j = 0, j = 1 ... F), we get for this special case of the APT model µR0 = λ0 = µrf, and therefore we can express the model as a formula for the excess return:

µRi − µrf = bi1 (µR1 − µrf) + ... + biF (µRF − µrf)

The Capital Asset Pricing Model can be seen as a very special case of the Arbitrage Pricing Model with only one factor (single index model). This can be shown if one sets F = 1. Then we are left with

Ri = ai + bi1 I1 + ei

Now we can interpret ai as the return of the risk-free asset µrf and bi1 I1 as the return of the market portfolio RM times the leverage factor:

Ri = µrf + b1 RM + ei

And this is the same expression as (16) for the CAPM.

Factor analysis is the principal methodology used to estimate the factors Ij and the factor loadings bij. Since it is not possible to calculate a perfect specification of the model described by (17), a factor analysis will derive a good approximation. The criterion for the goodness is the covariance of the residual returns, which should be minimal. To execute a factor analysis, one has to determine the number of desired factors in advance. By repeating this process for an increasing number of factors, one gets one solution for each number of factors. A criterion to stop increasing the number of factors would be if the probability that the next factor explains a statistically significant portion of the covariance drops below some level (e.g. 50%).

There are factor analysis methods that produce orthogonal factors (e.g. principal component analysis) and others that produce non-orthogonal factors. It may become a disadvantage to choose a method that creates orthogonal factors, since the factors it creates do not exist in the real world and can therefore not be interpreted. The non-orthogonal model might be less accurate. Estimated statistical factors can nevertheless be used in a purely statistical model by assuming that the past data will remain valid for the next step and applying them to calculate one step into the future. But as soon as one identifies the factors (like indices or interest rates) and their respective weights, one can apply the model in the future with new data from these factors.

To conclude, we can say that the Arbitrage Pricing Model has a number of benefits: it is not as restrictive as the Capital Asset Pricing Model in its requirements concerning the distribution of the returns and the investors' utility function. It also allows multiple sources of risk to explain the stock return movements. Further, it avoids using the concept of a market portfolio, which is an advantage because this concept is hard to observe in practice. The flexibility is also the main disadvantage of the model: the investors have to decide which sources of risk they want to include and how to weight them. Further, the APT model might not be as intuitive as the CAPM.

Nevertheless, the Arbitrage Pricing Theory remains the newest and a promising explanation of relative returns.

Part II
Beyond Markowitz

We have seen in the first part that the approach to optimizing a portfolio as proposed by Markowitz asks for some strong assumptions, like normally distributed returns. In this second part we will investigate whether it can be assumed that the returns of financial assets are produced by a normal distribution. As we will see, there are several aspects indicating that this assumption does not hold. We will use this as justification for analyzing further portfolio optimization algorithms that do not place such a strong requirement on the underlying distribution function of the asset returns.

3 Stylized Facts Of Asset Returns

In this chapter we will present some statistical tests to investigate the characteristic properties of financial market data. The tests are presented in their functionality and demonstrated on representative, artificial data. In part III of the thesis the tests are applied to real market data and the resulting conclusions are drawn.

Non-normality in return distributions

A very important question in financial analysis is the one for the distribution function of the asset returns. Since a lot of methods and theorems assume a certain distribution function, it is crucial to analyze the origin of the returns. There are two aspects of the distribution function that has created the asset returns that should be considered:

• Form: does the distribution have fat tails or skewness?
• Dependencies: do the returns depend on earlier return values?

The normal distribution was first mentioned by de Moivre in 1733 [31]. The advantages of this distribution are:

• It can be defined by only two variables: mean and variance.
• It describes random behavior in natural mechanisms.

The tests used are chosen with respect to the properties that are especially important for financial time series; the focus of the tests as a whole lies on the detection of fat-tail behavior rather than dependencies. Concerning the form of the distribution function, we especially test for the normal distribution. Further, we have selected two tests for detecting dependencies and long-memory effects in the time series: the Runs test for randomness and the BDS test for dependencies. The tests for determining the form of the underlying distribution function that has created the returns are goodness of fit (Kolmogorov-Smirnov test), kurtosis and skewness (Jarque-Bera test) and quantile-quantile plots.

For these reasons, and the fact that it is possible to fit it as a first approximation to asset returns, the normal distribution is used a lot in financial analysis and is still considered the standard assumption. However, in 1963 Mandelbrot [29] observed that financial returns might not be produced by a normal distribution.

3.1 Distribution Form Tests

Goodness of fit test (Kolmogorov-Smirnov test)

We start with the Kolmogorov-Smirnov one-sample test, which can be used to answer the question whether a sample comes from a population with a specific distribution. The test is based on the empirical distribution function of the given samples and is restricted to continuous distributions to test for. Assuming we are given the samples X1, X2, ..., XN, we can order them and calculate the empirical distribution function as

EN = n(i) / N

with n(i) as the number of samples that are smaller than Xi. The Kolmogorov-Smirnov test determines the maximum distance between this empirical distribution function and the cumulative distribution function of the assumed underlying function. Figure 18 shows a chart with these two distribution functions.

Figure 18: The Kolmogorov-Smirnov test calculates the maximum difference between the empirical distribution function of the samples (dotted line) and the cumulative distribution function of the assumed underlying function (solid line).

The hypotheses of the test are defined as:

Null hypothesis: The data follows the assumed distribution

Alternative hypothesis: The data does not follow the assumed distribution

The precise test statistic is

D = max(1≤i≤N) | F(Xi) − i/N |

with F(Xi) as the assumed underlying distribution function. The null hypothesis is rejected if √N · D is greater than the corresponding critical value, dependent on the confidence level.

There are two equivalent ways to handle the underlying distribution. It is possible to compare the samples to a normal distribution with the estimated mean µ̂ and variance σ̂². Otherwise one can transform the given samples according to

X̂i = (Xi − µ̂) / σ̂

and compare the new samples to a standard normal distribution. In both ways the mean µ and variance σ² of the underlying distribution need to be estimated out of the given samples:

µ = Σi wi xi        σ² = Σi wi (xi − µ)²        (19)

Some points classify the Kolmogorov-Smirnov test as unsatisfactory for our purpose: First, since the test compares the absolute difference between the two cumulative distributions, it underweights the differences in the tails and overweights the differences near the mean of the distribution. The second disadvantage of the Kolmogorov-Smirnov test is that it is a very general method (it can also be used for comparing with other distributions than just the normal) and is thus taking only the mean and variance of a distribution into consideration.

Skewness and kurtosis (Jarque-Bera test)

For the Kolmogorov-Smirnov test we were looking at the first and second moments of the distribution. However, we especially want to check whether our distribution has fat tails; for this purpose the third and fourth moments become interesting. Skewness is the standardized third moment

ς = Σi wi (xi − µ)³ / σ³

Skewness can be interpreted as a measure for the asymmetry of a distribution function, whereby a value of 0 indicates absolute symmetry (e.g. the normal distribution). In terms of the normal distribution, a positive skewness means an increased probability at the higher quantiles (heavy right tail) and a negative skewness says that we have an increased probability at the lower quantiles (heavy left tail). Figure 19 shows some examples of empirical distributions with skewness.

The standardized fourth moment is called kurtosis. Because the normal distribution has a kurtosis of 3, one often calculates the excess kurtosis, which is the kurtosis minus 3:

κ = Σi wi (xi − µ)⁴ / σ⁴ − 3
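These moment formulas translate directly into code. The following Python sketch (an illustration with equal weights wi = 1/N, not the implementation used in this thesis) computes skewness and excess kurtosis for a normal sample and for an artificial fat-tailed sample:

```python
import math
import random

def sample_moments(xs):
    """Equally weighted mean, variance, skewness and excess kurtosis."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    sigma = math.sqrt(var)
    skew = sum((x - mu) ** 3 for x in xs) / (n * sigma ** 3)
    excess_kurt = sum((x - mu) ** 4 for x in xs) / (n * sigma ** 4) - 3.0
    return mu, var, skew, excess_kurt

random.seed(1)
gauss = [random.gauss(0.0, 1.0) for _ in range(50_000)]
# A crude fat-tailed alternative: mostly calm observations, occasionally turbulent ones
mixed = [random.gauss(0.0, 3.0 if random.random() < 0.1 else 1.0) for _ in range(50_000)]

_, _, skew_g, kurt_g = sample_moments(gauss)
_, _, skew_m, kurt_m = sample_moments(mixed)
print(skew_g, kurt_g)  # both close to 0 for the normal sample
print(skew_m, kurt_m)  # clearly positive excess kurtosis for the fat-tailed sample
```

For the normal sample both statistics hover around zero, while the mixture exhibits the pronounced excess kurtosis typical of financial returns.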

Figure 19: The charts show skewed normal distributions (solid) in comparison with a normal distribution (dotted). The left chart is drawn from a standard normal distribution with a shape parameter of −3, while the right chart is drawn from a standard normal distribution with a shape parameter of 1.

With these definitions, the normal distribution has a skewness and a kurtosis of 0. The Jarque-Bera test calculates the skewness and kurtosis of a given distribution to find out whether it is a normal distribution (with a value of 0 for both) or not.

The kurtosis of a distribution defines whether the distribution has fat tails in comparison with a normal distribution or not. A distribution with a kurtosis of 0 is called mesokurtic. A negative kurtosis indicates that both tails are less pronounced and the distribution is less peaked than a normal distribution (platykurtic). The opposite of platykurtic, a positive kurtosis, means fat tails and more peakedness than a normal distribution (leptokurtic); the latter holds true for most financial time series. If there is excess kurtosis, the mid-range values on both sides of the mean have less weight than in a normal distribution. This means that distributions with a high kurtosis are appropriate when the returns are likely to be very small or likely to be very large, but are not very likely to have values between these two extremes.

Figure 20: The charts show a Student-t distribution with an excess kurtosis of 6.7 (solid) in comparison with a normal distribution (dotted). The left chart uses a linear y-axis whereby the right chart uses a logarithmic y-axis to make the excess kurtosis more explicit. In the log chart the fat tails of the Student-t distribution appear as a line above the tails of the normal distribution.

The test statistic is as follows:

If we assume normality for the underlying distribution, the standard errors for the estimated skewness ς̂ and kurtosis κ̂ are approximately √(6/N) and √(24/N), with N as the sample size. The Jarque-Bera test statistic is defined as

JB = N [ ς̂²/6 + κ̂²/24 ]        (20)

and is asymptotically chi-squared distributed with 2 degrees of freedom.

Quantile-Quantile plot

In this section we would like to present a graphical method to assign some sample data to a possible distribution. An α quantile is defined as the x such that

P[X < x] = α

The quantile-quantile plot (QQ plot) is a scatter plot with the quantiles of the given empirical distribution on the vertical axis and the quantiles of the theoretical distribution on the horizontal axis. In order to calculate the quantiles of the empirical distribution, one first has to transform the empirical distribution according to the standard normal transformation (19). Now one can draw the QQ plot as a scatter plot of the transformed empirical and the standard normal quantiles. In [19] the main merits of a QQ plot are described as:

• If a random sample set is compared to its own distribution, the plot should look roughly linear.
• If one distribution is transformed by a linear function, this transforms the QQ plot by the same linear transformation. The transformation can be estimated from the plot (slope and intercept).
• If there are a few outliers contained in the data, it is possible to identify them by looking at the scatter plot.
• It is possible to deduce small differences in the participating distributions from the plot (e.g. fat tails imply curves at the left and right end).

A distribution with excess kurtosis has a larger probability for events with very large or very small values in comparison to the normal distribution. From this we can conclude that fat tails will appear in a QQ plot as deviations from the diagonal at the extreme values. The deviation will be upwards for the high values and downwards for the low values. Figure 21 shows a QQ plot for a sample from a Student-t distribution with excess kurtosis.

3.2 Dependencies Tests

Runs test for randomness

The runs test can be used to decide if a data set is from a random process. It uses the concept of a run, which is defined as a sequence of increasing values or a sequence of decreasing values. The length of a run is defined as the number of values belonging to this run. The runs test is based on the binomial distribution, which defines the probability that the i-th value is larger or smaller than the (i+1)-th value.
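As a compact illustration of the Jarque-Bera statistic (20) — a sketch with equal weights, not the test program used later in this thesis — the statistic can be evaluated on a normal and on a fat-tailed sample and compared against the 5% threshold of the chi-squared distribution with 2 degrees of freedom, 5.99:

```python
import math
import random

def jarque_bera(xs):
    """JB = N * (skew^2/6 + excess_kurtosis^2/24); chi-squared(2) under normality."""
    n = len(xs)
    mu = sum(xs) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / n)
    skew = sum((x - mu) ** 3 for x in xs) / (n * sigma ** 3)
    excess_kurt = sum((x - mu) ** 4 for x in xs) / (n * sigma ** 4) - 3.0
    return n * (skew ** 2 / 6.0 + excess_kurt ** 2 / 24.0)

def student_t(df, rng):
    """Draw from a Student-t distribution: Z / sqrt(chi2_df / df)."""
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-squared with df degrees of freedom
    return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / df)

rng = random.Random(2)
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(2000)]
fat_sample = [student_t(4, rng) for _ in range(2000)]

print(jarque_bera(normal_sample))  # typically small, below the 5% threshold of 5.99
print(jarque_bera(fat_sample))     # far above the threshold: normality is rejected
```

The Student-t sample with 4 degrees of freedom mirrors the heavy-tailed examples used in the figures of this chapter.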

Figure 21: The charts show QQ plots for a Student-t distribution with 4 degrees of freedom in comparison with the normal distribution. The left chart was created from a sample set of 1000 elements from the Student-t distribution and the right chart directly from the quantiles of the same normal and Student-t distributions. The right chart is therefore smoother. The fat tails of the Student-t distribution appear in both charts as deviations from the diagonal.

For the test we have to calculate the ni's, the number of runs of length i for 1 ≤ i ≤ 10. We can then normalize the ni's with the expected number of runs of length i (µni) and the standard deviation of the number of runs of length i (σni). These values µni and σni can be obtained from the binomial distribution. The final test value is the normalized ni:

zi = (ni − µni) / σni

which is compared to the two-sided standard normal table. A zi value greater than the table entry indicates non-randomness. Figures 22 and 23 show some outcomes of AR(1) and GARCH(1,1) processes with the corresponding test results.

BDS test for dependencies

The BDS test is a non-parametric method of testing for nonlinear patterns in time series. It was first developed by Brock, Dechert and Scheinkman in 1987 (see [11]). The test has the null hypothesis that the data in the time series is independently and identically distributed (iid) and is in [8] defined as

BT = √(T − m + 1) · (CT(m, ε) − CT(1, ε)^m) / σ̂(m, ε)

where

• CT(m, ε) is the correlation integral defined by CT(m, ε) = (1 / ((T−m+1) choose 2)) Σ(s<t) I(Yt^m, Ys^m)
• Yt^m = (yt, yt+1, ..., yt+m−1) is the m-history of yt

• ε is a positive constant
• I(Yt^m, Ys^m) is the indicator function with I(Yt^m, Ys^m) = 1 if ||Yt^m − Ys^m|| < ε and I(Yt^m, Ys^m) = 0 otherwise
• ||Yt^m − Ys^m|| is the max-norm of Yt^m, Ys^m: ||Yt^m − Ys^m|| := max(|yt − ys|, |yt+1 − ys+1|, ..., |yt+m−1 − ys+m−1|)
• σ̂²(m, ε) is a consistent estimator of the asymptotic variance of √(T − m + 1) · CT(m, ε)

The underlying idea of the BDS test can be seen in the following: The random event {I(Yt^m, Ys^m) = 1} is the same as {||Yt^m − Ys^m|| < ε} = {|yt − ys| < ε} ∩ ... ∩ {|yt+m−1 − ys+m−1| < ε}. Let At,s(1, ε) = {|yt − ys| < ε}. The above relationship can then be expressed as

At,s(m, ε) = At,s(1, ε) ∩ ... ∩ At+m−1,s+m−1(1, ε)

If {yt} is an i.i.d. sequence, the events At,s(1, ε), ..., At+m−1,s+m−1(1, ε) will be independent, so

P[At,s(m, ε)] = P[At,s(1, ε)]^m

Since the correlation integral CT(m, ε) converges in distribution to P[At,s(m, ε)], the BDS test detects deviations from the null hypothesis of serial independence by comparing whether CT(m, ε) is sufficiently close to CT(1, ε)^m. The BDS statistic is easy to compute; however, it has a disadvantage: The user has to define the two free parameters, the maximum embedding dimension m and the relative radius ε, ex ante.

Figure 22: The charts show sample sets derived from an AR(1) process with a coefficient φ = 0.5. The process corresponding to the left picture had a standard normal distribution as innovation function and the process corresponding to the right picture had a Student-t distribution with 4 degrees of freedom for the innovation function. The Runs test calculates a value n1 = −0.70 for the left chart and a value n1 = −0.55 for the right chart. The standard normal table shows at the 5% significance level a value of 1.96. Since −0.70 and −0.55 are contained in ±1.96, we can conclude that both underlying processes that have created the sample sets were random.
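The run counting behind the n1 values reported in figures 22 and 23 can be sketched as follows. This is a simplified illustration, not the test program used in this thesis: run lengths are counted in successive up/down moves (one less than the number of values belonging to the run), ties are ignored, and the normalization constants µni and σni from the binomial distribution are omitted:

```python
def run_length_counts(xs):
    """Count n_i: the number of monotone (up or down) runs of each length.

    A run's length is measured here in successive moves, i.e. one less
    than the number of values belonging to it.
    """
    signs = [1 if b > a else -1 for a, b in zip(xs, xs[1:])]
    counts = {}
    length = 1
    for prev, cur in zip(signs, signs[1:]):
        if cur == prev:
            length += 1
        else:
            counts[length] = counts.get(length, 0) + 1
            length = 1
    counts[length] = counts.get(length, 0) + 1
    return counts

# 1, 3, 2, 4, 1 alternates up/down: four runs of length one
print(run_length_counts([1, 3, 2, 4, 1]))  # prints {1: 4}
```

Normalizing each count with its binomial expectation and standard deviation then yields the zi values compared against the standard normal table.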

Figure 23: The charts show the same calculations as in figure 22 with a GARCH process (as described in [10]) as underlying function. Again, the process corresponding to the left picture had a standard normal distribution as innovation function and the process corresponding to the right picture had a Student-t distribution with 4 degrees of freedom for the innovation function. The Runs test calculates a value n1 = −0.65 for the left chart and a value n1 = −0.62 for the right chart. So we can again conclude that both underlying processes that have created the sample sets were random.

We will use the same AR(1) and GARCH(1,1) processes as described in figures 22 and 23 for the Runs test and apply the BDS test to them. The test program has decided on four values of ε and calculates the statistics for embedding dimensions 2 and 3. The following part shows the detailed BDS analysis for the AR(1) process with normal innovation:

[Program output: BDS statistics for embedding dimensions 2 and 3 and the four values of ε, with all statistics between roughly 14 and 18 and all p-values equal to 0.]

The first table shows the test results for each combination of embedding dimension and ε. The second table shows the p-values for the statistics. Since all values lie above the threshold given by the standard normal distribution, we can (correctly) conclude that the series is not independent. We can have great confidence in the results because of the very low p-values.
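The correlation integral at the heart of the statistic can be sketched in a few lines of Python (an O(T²) illustration, not the test program used above); under independence, CT(2, ε) should be close to CT(1, ε)²:

```python
import random

def correlation_integral(y, m, eps):
    """C_T(m, eps): fraction of pairs of m-histories closer than eps in the max-norm."""
    histories = [y[t:t + m] for t in range(len(y) - m + 1)]
    close, pairs = 0, 0
    for s in range(len(histories)):
        for t in range(s + 1, len(histories)):
            pairs += 1
            if max(abs(a - b) for a, b in zip(histories[s], histories[t])) < eps:
                close += 1
    return close / pairs

rng = random.Random(3)
iid = [rng.gauss(0.0, 1.0) for _ in range(400)]
c1 = correlation_integral(iid, 1, 0.5)
c2 = correlation_integral(iid, 2, 0.5)
print(c2, c1 ** 2)  # nearly equal for an independent series
```

The BDS statistic scales the gap CT(m, ε) − CT(1, ε)^m by its estimated standard deviation, so dependent series such as the AR(1) and GARCH examples above produce large values.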

The results for the AR(1) process with Student-t distribution as innovation function lie between 14 and 19 for ε as 0.2 and 0.3, and therefore this time series is also declared not independent. In the case of the GARCH process with normal innovation function as underlying function the results are less unambiguous: we get values between 2.8 and 4.9 for the test statistics, which is still larger than the corresponding value for the standard normal distribution, and therefore we can also declare this time series as not independent. The reason for these small values might lie in the fact that the test program has chosen the relative radius ε very small: 0.0016, 0.0032, 0.0038, 0.0049, 0.0065, 0.0076, 0.011 and 0.015. A BDS test for GARCH with Student-t innovation function produces values between 9.2 and 14 as test statistics and is therefore also not produced by an independent process.

The following table summarizes the results of the BDS test applied to the four processes:

[Table: for each process (AR(1) and GARCH(1,1), each with standard normal and Student-t innovations), the ε values used and the range of the resulting test statistics.]

3.3 Results Of Statistical Tests Applied To Market Data

Kolmogorov-Smirnov test

First we apply the market data to the Kolmogorov-Smirnov test to get an impression about whether they are normally distributed. We have calculated the test results for all of the listed market time series, i.e. all available data from the SBI Foreigner index; for all of the others we have taken the last 1953 sample points of the available data. The Kolmogorov-Smirnov test value is determined for different data intervals. Again we have aggregated the daily data to also get lower frequency data. This means that the given daily data (D) was aggregated to bi-daily data (BD), weekly data (W), bi-weekly data (BW), monthly data (M) and quarterly data (Q). For each of these data sets and each mentioned index the test result is calculated. The values are listed in the following table. Each row contains a time interval, abbreviated as explained above. Each column represents an index, whereby E stands for 'Equity' and B for 'Bond'.

[Table: Kolmogorov-Smirnov test values for each index and each data interval; the values lie between roughly 2.1 and 4.8.]

According to the results, no time series is assumed to be normally distributed. However, we can see that the lower the data frequency, the closer we get to the confidence value and therefore to normally distributed returns.

Skewness, Kurtosis and Jarque-Bera test

We have calculated the skewness and kurtosis for all of the listed market time series except MSCI Europe and the Lehman Aggregated Euro Bond Index, since there is too little data available for these two indices. The values for the skewness are listed in the following table.

[Table: Skewness values for each index and each data interval.]

The next table shows the respective values for the kurtosis.

[Table: Kurtosis values for each index and each data interval.]

The results for the kurtosis are also summarized in figure 24. It is visible that the value for the kurtosis tends, for longer data intervals, towards the value of the kurtosis of the normal distribution, which is 3. From this we can conclude that time series with a longer time interval like monthly or quarterly data can be better fitted to a normal distribution than data with a higher frequency like intra-day or daily data, which exhibits excess kurtosis. In [5], page 287, we can also find the conclusion that in most liquid financial markets there is highly significant excess kurtosis in intra-day returns, which decreases with sampling frequency.

Figure 24: The chart shows the evolution of the kurtosis for several market time series and data intervals. Each line depicts a certain market time series for increasing interval lengths. The length of an interval is encoded according to: D: daily, BD: bi-daily, W: weekly, BW: bi-weekly, M: monthly, Q: quarterly. We can see that, for longer data intervals, the values for the kurtosis approach the kurtosis of a normal distribution (dotted line).
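The pattern in figure 24 can be reproduced with synthetic data: aggregating fat-tailed high-frequency returns into longer intervals pushes the sample kurtosis toward the normal value of 3. The sketch below uses an artificial volatility-mixture series, not the market data of this thesis:

```python
import random

def kurtosis(xs):
    """Plain (non-excess) sample kurtosis; 3 for a normal distribution."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return sum((x - mu) ** 4 for x in xs) / (n * var ** 2)

random.seed(4)
# Artificial "daily" returns: calm days with occasional turbulent ones (fat tails)
daily = [random.gauss(0.0, 3.0 if random.random() < 0.1 else 1.0) for _ in range(40_000)]
# Aggregate 20 consecutive "days" into one "monthly" return
monthly = [sum(daily[i:i + 20]) for i in range(0, len(daily), 20)]

print(kurtosis(daily))    # well above 3
print(kurtosis(monthly))  # much closer to 3
```

The aggregated series is visibly closer to mesokurtic, mirroring the decline of the curves in figure 24.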

According to (20) we can calculate the test statistic for the Jarque-Bera test out of the skewness and kurtosis. The resulting table looks like:

Interval   E World   E US   E FE   E CH   B World   B US   B FE   B CH
D          320       500    1100   540    262       570    3200   410
BD         74        33     250    550    18        150    850    48
[rows for W, BW, M and Q, with successively smaller values]

These results are compared with a χ² distribution with two degrees of freedom. This distribution has the threshold for a 5% confidence level at 5.99. From this we can conclude that the data are normal on a quarterly basis, and for Equity World, Equity US and Bond Far East also on a monthly basis. For any shorter time interval the normality assumption does not hold. Further, one can see that the bond returns are less fat-tailed than the equity returns.

QQ Plot

On the following page we have depicted some QQ plots for the time series of Equities World, Equities US, Equities Switzerland, Bonds World and Bonds US, for data intervals ordered from daily data, bi-daily data and weekly data to bi-weekly data. The QQ plots of the same time series are on the same horizontal line. It is visible that the fat tails disappear with lower data frequency and the empirical line approaches the linear line. Another interesting phenomenon is that, especially on a weekly basis, the lower fat tails are more strongly developed than the upper tails. The reason might be that a crash occurs in a shorter time interval than a euphoria.

Figure 25: QQ plots for the Equities World, Equities US, Equities Switzerland, Bonds World and Bonds US time series and data intervals of daily data, bi-daily data, weekly data and bi-weekly data.

Runs test

In the following we show the results of the Runs test applied to the market data.

[Table: Runs test values for the equity indices (Equities World, EU, US, FE and CH) for each data interval.]

A two-sided standard normal distribution table gives us a value of 1.96 for the 5% significance level. Since all results are smaller than this threshold, we have to conclude that they are all generated by a random process.

[Table: Runs test values for the bond indices (Bonds World, EU, US, FE and CH) for each data interval.]

The same holds true for the bond indices because they also all lie in between the boundaries. We can find that the bond indices have smaller values and are therefore more likely to be randomly distributed. Please remember that the results for Equities EU and Bonds EU are gained from a shorter time series than the others and are therefore not significant.

BDS test

Finally, let's have a look at the results of the BDS test. We list the range of the test values for different values of ε and embedding dimension. The first table contains the results of the equity indices:

[Table: BDS test value ranges for the equity indices for each data interval.]

The statement of the test (threshold 1.96) is that the market series are uncorrelated for monthly and quarterly data (except for the case of monthly data for Equities EU and US) and correlated for higher frequency data. The next table lists the range of the results for the bond indices:

We have more or less an acceptance of the hypothesis of uncorrelated returns for bi-weekly, monthly and quarterly data (except for bi-weekly Bonds Far East).

Summary of test results

Let's summarize the results of the applied tests:

• The Kolmogorov-Smirnov test has shown that the market time series are not normally distributed, neither on short time frequency (daily data) nor on long time frequency (quarterly data).
• The Jarque-Bera test confirms these statements by refusing normality except for quarterly data.
• The QQ plots for different sampling frequencies of the data show significant fat tails for daily up to bi-weekly data.
• From the Runs test we were able to conclude that the market series were produced by a random process.
• Finally, the BDS test showed us that the series are uncorrelated for monthly and quarterly data and correlated for higher frequency data.

These results should be evidence enough that the normality assumption of Markowitz does not hold and that it is justified to look out for other approaches that take non-normality into consideration, and also to investigate alternatives to variance as risk measure. It was even proposed by Markowitz himself in his Nobel prize winning work.

There are some arguments for the standard Markowitz method which we don't want to hide. The Central Limit Theorem says: Let X1, X2, ..., Xn be mutually independent random variables with a common distribution function F. Assume E[X] = 0 and Var(X) = 1. As n → ∞ the distribution of the normalized sum

Sn = (X1 + X2 + ... + Xn) / √n

tends to the Gaussian distribution function.

When we look at the tick-by-tick logarithmic return data of a stock exchange for a certain financial instrument, we can interpret each data point as the value of a random variable and the daily, weekly or monthly data of this instrument as the sum of the tick-by-tick returns of the respective random variables. According to the Central Limit Theorem, the low frequency data will be distributed like a Gaussian distribution function, if the frequency is low enough and we have enough data points in a period. In the context of an index or fund, the Central Limit Theorem can be applied once more by arguing that an index or fund is the weighted sum of several random variables (the constituents of the index or fund) and therefore the returns will behave according to a normal distribution if the index or fund has enough constituents.
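The aggregation argument can be illustrated with a small simulation (a sketch on synthetic data, not on the market series used in this thesis): fat-tailed "daily" returns are summed into "quarterly" returns, whose excess kurtosis is much closer to the Gaussian value of zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "daily" log returns with fat tails (Student-t, 5 degrees of freedom).
n_quarters, days_per_quarter = 4000, 63
daily = rng.standard_t(df=5, size=n_quarters * days_per_quarter)

def excess_kurtosis(x):
    """Sample excess kurtosis; approximately 0 for a Gaussian sample."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 4) - 3.0)

# A "quarterly" log return is the sum of the daily log returns of that quarter.
quarterly = daily.reshape(n_quarters, days_per_quarter).sum(axis=1)

print(excess_kurtosis(daily))      # clearly positive: fat tails
print(excess_kurtosis(quarterly))  # much closer to zero: aggregation normalizes
```

The effect mirrors the Central Limit Theorem argument above: the same data, sampled at a lower frequency, looks far more Gaussian.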

4 Portfolio Construction With Non Normal Asset Returns

The concept of a mean risk framework was explained in an earlier chapter. Markowitz uses this framework and has chosen the variance as risk measure. In this chapter we will explore what general properties such a risk measure should have in order to be a substitute for the variance. In the second part of this chapter the suitability of variance as risk measure gets analyzed.

4.1 Introduction To Risk In General

In this section we will concentrate on the properties of financial risk measures. Part of the basic theory for this area was developed for the insurance sector and then adapted for the financial context. We will use a variable X as a random variable representing the relative or absolute return of an asset (or the insured losses in the insurance context). Assume we are given two alternatives A and B and their financial consequences XA and XB. Let the function R denote a risk measure which assigns a value to each alternative; the notation A ≻R B ⇔ R(XA) > R(XB) indicates that the alternative A is riskier than alternative B. Note that this is different from the utility function U presented in chapter 1, where A ≻ B ⇔ U(XA) > U(XB) means that A is preferred to B. The difference between the concepts of utility function and risk might become more apparent if one becomes aware that a utility function can be defined without a risk term (e.g. 'prefer more to less') or can include a risk term (e.g. the Markowitz approach, where we can find a trade-off function between expected return and risk).

In Albrecht [4] risk measures are categorized into two kinds. The two categories are:

1.) Risk as magnitude of deviation from a target (risk of the first kind)
2.) Risk as necessary capital respectively necessary premium (risk of the second kind)

For many common risk measures one kind can get transformed into the other: The addition of E[X] to a risk measure of the second kind will guide us to a risk measure of the first kind, and the subtraction of E[X] from a risk measure of the first kind will lead to a risk measure of the second kind.

We can find a general approach to derive a risk measure from a given utility function. This standard measure of risk is given in Huerlimann [26] by

R(X) = -E[U(X - E[X])] (21)

Since the risk measure corresponds to the negative expected utility of X - E[X], the risk measure is location free. From (21) we can derive specific risk measures by using a specific utility function. Using for instance the quadratic utility function (6), we obtain the variance Var(X) = E[(X - E[X])^2] as the corresponding risk measure.
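A small numerical check of (21) on simulated returns (the normalization U(x) = x - x^2 is an assumption made here for convenience): the quadratic utility indeed reproduces the variance.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(0.005, 0.02, size=10_000)   # simulated asset returns

def U(x):
    return x - x ** 2     # quadratic utility (hypothetical normalization)

# Risk measure derived from the utility function, eq. (21).
risk = -np.mean(U(X - X.mean()))

print(np.isclose(risk, np.var(X)))   # True: the quadratic utility recovers the variance
```

Since E[X - E[X]] = 0, only the quadratic term of U survives, which is exactly the variance.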

We will now introduce the definitions for stochastic and monotonic dominance because they are useful in the context of risk measures:

Stochastic dominance of order 1: X SD(1) Y ⇔ E[R(X)] ≤ E[R(Y)] for every monotonic function R
Stochastic dominance of order 2: X SD(2) Y ⇔ E[R(X)] ≤ E[R(Y)] for every concave, monotonic function R
Monotonic dominance of order 2: X MD(2) Y ⇔ E[R(X)] ≤ E[R(Y)] for every concave function R

Next we will check some axiomatic systems for risk measures that were proposed in the last years.

Axiomatic system of Pedersen and Satchell

Pedersen and Satchell give in [32] the following set of axioms for a risk measure:

1.) Nonnegativity: R(X) ≥ 0
This requirement follows from the assumption of a risk measure of the first kind (deviation from a location measure).

2.) Positive homogeneity: R(c * X) = c * R(X), ∀ constants c
If an investment gets multiplied, then also the risk gets multiplied.

3.) Subadditivity: R(X + Y) ≤ R(X) + R(Y)
The risk of two combined investments will not be larger than the sum of the risks of the individual investments (effect of diversification).

4.) Shift invariance: R(X + c) ≤ R(X), ∀ constants c
The measure is invariant to the addition of a constant (location free).

Since the risk measure is assumed to be location free, this system of axioms describes especially risk measures of the first kind. Axioms 2 and 3 (positive homogeneity and subadditivity) imply that a risk measure according to these criteria is convex. Axioms 1, 2 and 4 combined lead to the statement that the risk of a constant random variable must be zero.

Axiomatic system of Artzner, Delbaen, Eber and Heath

Artzner, Delbaen, Eber and Heath [7] have developed another set of axioms. Risk measures that fulfill their properties are called coherent:

1.) Subadditivity: R(X + Y) ≤ R(X) + R(Y)

2.) Positive homogeneity: R(c * X) = c * R(X), ∀ constants c

3.) Monotonicity: X ≤ Y ⇒ R(Y) ≤ R(X)
A higher loss potential (statistical dominance) implies a higher risk.

4.) Translation invariance: R(X + a) = R(X) - a, ∀ constant returns a
There is no additional risk for an investment without uncertainty.

Axioms 1 and 4 are also contained in the set of Pedersen and Satchell [32] in a similar way. This set of risk axioms is well suited for risk measures of the second kind.

The classification was refined in [13] to introduce the term convex risk measure. They call a mapping R a convex risk measure if for all X, Y ∈ L∞ it is monotone, translation invariant and satisfies

Convexity: R(λX + (1 - λ)Y) ≤ λ R(X) + (1 - λ) R(Y), λ ∈ [0, 1]

A convex risk measure R is called a coherent risk measure if it satisfies the additional property of positive homogeneity. In fact every reasonable risk measure must be convex, because a risk measure that does not satisfy subadditivity penalizes diversification and would not assign risk in an intuitive way.

Axiomatic system of Wang, Young and Panjier

Another important set of risk axioms was introduced by Wang, Young and Panjier [45]. They are dealing with premia in an insurance context, which can however easily be transferred to the financial context. The two main tasks in insurance markets are the calculation of the risk premia and of the risk capital. The closed system of axioms for premia by Wang, Young and Panjier asks for some continuity properties and:

1.) Monotonicity: X ≤ Y ⇒ R(Y) ≤ R(X)

2.) Translation invariance: R(X + a) = R(X) - a, ∀ constant returns a

3.) Comonotone additivity: X, Y comonotone ⇒ R(X + Y) = R(X) + R(Y)
(Comonotone: ∃ a random variable Z and monotone functions f and g with X = f(Z) and Y = g(Z).)

A general risk measure

Stone [42] reports a general risk measure containing the three parameters c, k and z:

R(X) = [ ∫_{-∞}^{z} |x - c|^k f(x) dx ]^(1/k)

The standard deviation and the semi-standard deviation are part of this general risk measure class. This class was extended in [32] to a five parameter model

R(X) = [ ∫_{-∞}^{z} |x - c|^a w[F(x)] f(x) dx ]^b

which contains also the variance, the semi-variance and some other risk measures.

4.2 Variance As Risk Measure

Variance was proposed as an appropriate measure for risk by Markowitz in his approach (7). The advantage of variance as risk measure is that it is a very convenient and intuitive measure. It is very common in statistics and has for this reason well known properties. However it has also some properties that make it not optimal as risk measure for financial applications.

The risk of very rare events is not taken into account very well by variance. We have shown with the tests presented before that the returns of financial assets often have fat tails. This means that extreme events (very high returns or very high losses) are more likely than for a normal distribution.

For this reason a risk measure that does not pay special attention to these kinds of events is not very qualified. It is true that the variance penalizes extreme events by calculating the squared distance to the mean, however we should ask for something more specific.

Another unpleasant property of variance is its symmetry. When we talk about risk, we think of the risk of a loss. However variance measures also the "risk" of a gain, which is in fact something desired for an investor. In the practice of portfolio optimization it is crucial to avoid very high losses because a lot of clients just ask for a preservation of their wealth. This gives rise to asymmetrical risk measures which only take care of losses. We have already mentioned that it is shown in [39] that variance is only compatible with the concept of a utility function under the assumption of normally distributed returns or a quadratic utility function, which is a very strong restriction.

5 Value At Risk Measures

In this chapter we will present a first alternative risk measure to the standard deviation. It is called Value at Risk and belongs to the quantile based risk measures. There are efforts undertaken to introduce regulations to the financial industry to get a better control of the risk that is taken by its participants and also to help the companies to get a better overview of the risk they hold. This was also the topic of the Basel Committee on Banking Supervision where, as a conclusion, they recommend Value at Risk as an appropriate risk measure.

5.1 Value At Risk

We define Value at Risk as: Let α ∈ (0, 1) be a given probability level and w the asset weights of a portfolio. The Value at Risk at level α for the return RP is defined as

VaRα(RP) = sup{x | P[RP < x] ≤ α} = F_RP^(-1)(α) (22)

The function F_RP^(-1)(α) is called the generalized inverse of the cumulative distribution function F_RP(x) = P[RP ≤ x] of RP and gives the α-quantile of RP. Sometimes α is chosen as 0.95 or 0.99 and VaR_(1-α) for a loss function is computed. The similarity of these two notations is shown in appendix B.

Figure 26 shows the two areas α and 1 - α for the normal distribution. VaRα(RP) can be interpreted as the loss of a portfolio that will be exceeded in only α*100 percent of all cases. Since α is usually chosen between 0.01 and 0.1, the Value at Risk is a lower boundary for a portfolio return: the return of the portfolio will with a very high probability (0.9 for the example α = 0.1) not be smaller. It is the aim of portfolio construction to assemble a portfolio with a high Value at Risk in order to shift the return range for the 1 - α area as much to the positive side as possible.

Figure 26: The Value at Risk at level α for the return R is defined as the return x where the probability of having a return smaller than x is α.
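Empirically, VaR is just the α-quantile of the observed returns. A minimal sketch on simulated data (all numbers hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
returns = rng.normal(0.0, 0.02, size=5000)   # simulated portfolio returns R_P
alpha = 0.05

# Empirical alpha-quantile of the returns, cf. eq. (22).
var_alpha = np.quantile(returns, alpha)

# By construction, at most alpha*100 percent of all returns fall below VaR_alpha;
# the printed boolean is True.
print(round(var_alpha, 4), np.mean(returns < var_alpha) <= alpha)
```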

The analytical properties of the Value at Risk model are not very pleasant: It is in the general case not possible to find a symbolic expression for the portfolio weights w optimized according to VaR and dependent on the multivariate return function of its constituents. Even the numerical application is difficult: According to Gaivoronski and Pflug [23], VaR is not a convex risk measure. This means that the VaR function contains many local maxima. To deal with these maxima, they have developed a smoothing algorithm which allows them to calculate the optimal portfolios in the VaR sense with high accuracy and in reasonable time.

Another approach to deal with the VaR optimization function is proposed in Embrechts et al [20]. The concept of the copula is a well known way of modelling dependence in risk management. It treats each univariate distribution function for the assets individually and models the dependencies of the univariate distribution functions with a copula.

Another unpleasant property of Value at Risk is that it fails to be coherent, as stated in [3]. In the general case Value at Risk does not fulfill the sub-additivity axiom. This is especially unpleasant because it implies that a portfolio made out of smaller portfolios (and therefore with a higher diversification than the individual small portfolios) can have a higher amount of risk than the sum of the risks of the smaller portfolios. This would offset the effect of diversification.

It is also clear that Value at Risk does not distinguish between very severe losses and just small losses, as long as they are below the threshold. Since Value at Risk is only concerned about the threshold that will be crossed with the small probability α, it does not take into consideration the distribution of the returns beyond the threshold. Dembo and Fuma [17] published an example that shows this disadvantage. Assume two distributions are given as declared in this table and depicted in figure 27:

[Table: the return levels -10, -7.5, -5, -2.5, 0, 2.5 and 5 with their probabilities in portfolio A and portfolio B, together with µ, σ, 1% VaR and 5% VaR for both portfolios.]

From the expected returns of the portfolios A and B we can see that portfolio A has a higher expected return than portfolio B, while portfolio B has more probability mass on the negative side of the distribution. This means that we have a clear preference for portfolio A. However both risk measures, standard deviation and VaR, fail to capture this preference because they both get the same values for portfolio A and B. The reason is that the standard deviation, as mentioned, does not discriminate between the risk of a loss (which should be avoided) and the risk of a gain (which is favorable), and VaR does not take the distribution form beyond the threshold into consideration at all. This example is not as artificial as it might look, since most distributions in finance differ especially around the median and are very similar around the tails.
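The failure of sub-additivity is easy to reproduce with a toy example (the two-bond numbers below are hypothetical, not taken from [17]): each bond alone has a Value at Risk of zero at the 5% level, while the diversified combination of both does not.

```python
import numpy as np

def var_alpha(outcomes, probs, alpha):
    """VaR as the alpha-quantile sup{x | P[R < x] <= alpha}, eq. (22),
    evaluated exactly for this discrete example."""
    order = np.argsort(outcomes)
    outcomes, probs = np.asarray(outcomes)[order], np.asarray(probs)[order]
    cum_below = np.concatenate([[0.0], np.cumsum(probs)[:-1]])  # P[R < x] at each outcome
    return outcomes[cum_below <= alpha].max()

# Two independent bonds: return 0 with prob 0.96, -1 (default) with prob 0.04.
single_out, single_p = [-1.0, 0.0], [0.04, 0.96]
# Their sum: -2 w.p. 0.04^2, -1 w.p. 2*0.04*0.96, 0 w.p. 0.96^2.
both_out = [-2.0, -1.0, 0.0]
both_p = [0.04 ** 2, 2 * 0.04 * 0.96, 0.96 ** 2]

alpha = 0.05
v_single = var_alpha(single_out, single_p, alpha)   # 0.0: each bond alone looks riskless
v_both = var_alpha(both_out, both_p, alpha)         # -1.0: the combined portfolio shows risk
print(v_single, v_both)
```

The quantile of the sum (-1) is worse than the sum of the quantiles (0), so the risk of the diversified portfolio exceeds the sum of the individual risks.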

Figure 27: The graphic shows the two distribution functions defined in the table. The distribution function of the returns of portfolio A is depicted as a solid line, whereas the distribution function of the returns of portfolio B is depicted as a dotted line. The functions coincide for returns between -10 and -7.5.

5.2 Conditional Value At Risk, Expected Shortfall And Tail Conditional Expectation

In this chapter we will discuss the concepts of Lower Partial Moments (LPM), Conditional Value at Risk (CVaR) and Expected Shortfall (ES). We cover them in the same chapter because they are very similar and these risk concepts have become a totum revolutum in the last few years. The following part tries to unveil the relation between the mentioned risk measures.

At the beginning there was a first concept called lower partial moment (LPM) as described in Fishburn [21]. An investor can determine a threshold τ under which he does not want the return R to fall. The general lower partial moment risk measure for a random return variable R with probability density function f(x) is given by

LPMβ(τ, R) = E[max(τ - R, 0)^β] = ∫_{-∞}^{τ} f(x) (τ - x)^β dx

According to the choice of β, one gets a different lower partial moment:

β = 0: Shortfall probability LPM0 = ∫_{-∞}^{τ} f(x) dx
β = 1: Mean shortfall LPM1 = ∫_{-∞}^{τ} f(x) (τ - x) dx
β = 2: Shortfall variance / semi-variance LPM2 = ∫_{-∞}^{τ} f(x) (τ - x)^2 dx

LPM0 portfolio selection corresponds to Roy's safety first rule presented in [38]. LPM1, also called expected regret in [16], can be interpreted as the average portfolio underperformance compared to a fixed target or some benchmark τ.
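On sample data, the lower partial moments reduce to simple averages over the shortfalls. A sketch (threshold and distribution parameters are made up):

```python
import numpy as np

def lpm(returns, tau, beta):
    """Empirical lower partial moment LPM_beta(tau, R)."""
    shortfall = np.maximum(tau - np.asarray(returns), 0.0)
    if beta == 0:
        return float(np.mean(shortfall > 0))    # LPM0: shortfall probability
    return float(np.mean(shortfall ** beta))    # LPM1: mean shortfall, LPM2: semi-variance

rng = np.random.default_rng(3)
r = rng.normal(0.01, 0.05, size=10_000)   # simulated returns, threshold tau = 0
print(lpm(r, 0.0, 0))   # P[R < 0], roughly 0.42 for these parameters
print(lpm(r, 0.0, 1))   # average underperformance versus the target
print(lpm(r, 0.0, 2))   # shortfall variance / semi-variance
```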

The term conditional Value at Risk was first introduced in [35]. They use a slightly different definition and notation for VaR and CVaR than we do (refer to appendix B). Since the concept was developed for several application fields (e.g. actuarial science, finance, economics) and by different researchers, it has many names and definitions: Conditional Value at Risk is also known as Mean Excess Loss (CVaR+), Mean Shortfall (LPM1 with τ = VaR) (CVaR+) or Tail Value at Risk (CVaR−). In Huerlimann [25], ten equivalent definitions of CVaR are presented.

For continuous distributions conditional Value at Risk is defined as the conditional expected loss under the condition that it exceeds the Value at Risk. In other words, conditional Value at Risk can be considered as the expected amount of loss beyond the VaR, on the negative side of the distribution. There are two variants of CVaR:

CVaR+_α = E[RP | RP < VaR] (23)
CVaR−_α = E[RP | RP ≤ VaR] (24)

where VaR = VaRα(RP) as defined in formula (22) and E[x] denotes the expected value of x. From this it gets clear that CVaRα ≤ VaRα.

A general definition of CVaR, also applicable for discrete distributions, is written in Uryasev [44] as a weighted average of the VaR and the returns strictly below the VaR. After the conversion to our environment the equation is

CVaRα = λ VaRα + (1 − λ) CVaR+_α (25)

with

λ = (α − P[RP < VaR]) / α

The equation can be used for continuous and discrete distributions: In the case of a continuous distribution P[RP < VaR] = α, hence λ = 0 and therefore CVaRα = CVaR+_α. If we have a discrete distribution, the calculated VaR (VaRdisc) will not exactly be the α-quantile as it would be for a continuous distribution (VaRcont), and it might be that P[RP < VaR] < α. In this case λ > 0, and CVaR+_α and VaRα get weighted accordingly, so that

CVaR+_α ≤ CVaRα ≤ VaRα

A similar concept to CVaR is called expected shortfall. It was introduced in [1] and redefined later to be consistent with CVaR:

ESα(RP) = −(1/α) ( E[RP 1{RP ≤ VaR}] − (P[RP ≤ VaR] − α) VaR ) (26)

They show in Acerbi and Tasche [2] that it can also be expressed as

ESα(RP) = −(1/α) ∫_0^α inf{x | P[RP ≤ x] ≥ a} da

For a continuous distribution function P[RP ≤ VaR] = α, and then it can be seen that (26) is, up to the sign convention, equivalent to (23).
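For a finite sample, (22) and (26) can be evaluated directly. A sketch (here α·n is an integer, so the correction term makes the expected shortfall reduce exactly to the average of the α·n worst returns):

```python
import numpy as np

rng = np.random.default_rng(4)
r = np.sort(rng.normal(0.0, 0.02, size=1000))   # simulated portfolio returns, sorted
alpha = 0.05
k = int(alpha * len(r))        # number of observations strictly below the quantile

var = r[k]                     # VaR_alpha: alpha-quantile of the returns, eq. (22)

# Expected shortfall, eq. (26), with the correction term for discrete distributions.
es = -(np.mean(r * (r <= var)) - (np.mean(r <= var) - alpha) * var) / alpha

# For alpha*n integer this coincides with the average of the alpha*n worst returns.
print(np.isclose(es, -r[:k].mean()))   # True
```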

To conclude we try to group the risk measures that have the same base concept. They all take the distribution function as input and process it into a single number representing the risk the distribution function holds:

Calculate a threshold the returns should not fall below: LPM0, Value at Risk, Shortfall risk
Calculate the expected return of the returns under a certain threshold: LPM1, Conditional Value at Risk, Expected shortfall, Mean shortfall, Expected regret, Tail Value at Risk

It is shown in Testuri and Uryasev [43] that expected regret and CVaR are closely related. They also confirm the relation of CVaR with the other risk measures in the same row.

The following table lists the properties of Value at Risk and Conditional Value at Risk/expected shortfall, at least for the case of continuous distribution functions. The statements were taken from [34].

Property | VaR | CVaR
Translation equivariance | √ | √
Positively homogeneous | √ | √
Convexity | x | √
Stochastic dominance of order 1 | √ | √
Stochastic dominance of order 2 | x | √
Monotonic dominance of order 2 | x | √
Coherence | x | √

From this comparison it shows that Conditional Value at Risk has much nicer properties than the standard Value at Risk. Coherence is a requirement for an intuitive risk measure (effect of diversification) and is fulfilled only by CVaR. In [35] it is also stated that the CVaR methodology is consistent with the Mean-Variance methodology under the normality assumption. This means that a CVaR maximal portfolio is also variance minimal for normal return distributions. Conditional Value at Risk gets presented as an excellent tool for risk management and portfolio optimization because it can quantify risks beyond Value at Risk and, since CVaR is convex with respect to the portfolio positions, it is much easier to optimize than VaR, which has a lot of local maxima.

We will now focus on the optimization of the two risk measures following [34]. For the sake of consistency, we have again transformed the notations according to appendix B. R = (R1, ..., RN) indicates a vector of random returns of asset classes 1 to N and w = (w1, ..., wN) the weights of the investments in these asset classes. We try to maximize the risk measure under the constraint that the expected return w^T E[R] of the portfolio is equal to some predefined level µ. The VaR optimization problem can be stated as

Maximize (in w) VaRα(w^T R)
s.t. w^T E[R] = µ
     w^T 1 = 1
     w ≥ 0

and the CVaR problem respectively as

Maximize (in w) CVaRα(w^T R)
s.t. w^T E[R] = µ
     w^T 1 = 1
     w ≥ 0

Note that, since our optimizer is only capable of minimizing a function but not of maximizing one, we minimize in the implementation −VaRα(w^T R) and −CVaRα(w^T R).

In practice however we have mostly discrete variables (e.g. empirical data). For this reason we formulate the portfolio optimization problems in a discrete way. A vector Ri, i = 1, ..., M indicates the returns of all asset classes for a certain time point i. For the formulation we will use the notation S[1:k](u1, ..., uM) to denote the one element among u1, ..., uM which is the k-th smallest. The new definitions for VaR and CVaR are (assuming αM to be an integer)

VaRα(w^T R) = S[1:αM](w^T R1, ..., w^T RM)

CVaRα(w^T R) = (1 / (αM)) Σ_{i: w^T Ri ≤ VaRα} w^T Ri

The discrete portfolio optimization problem for the VaR is a nonlinear, nonconvex program:

Maximize (in w) S[1:αM](w^T R1, ..., w^T RM)
s.t. w^T e = µ
     w^T 1 = 1
     w ≥ 0

where e = (1/M) Σ_{i=1}^{M} Ri denotes the expected return vector. The VaR optimization problem we will cover later.

The discrete version of the CVaR is piecewise linear and may therefore be solved using an LP-solver. First, we transform the CVaR optimization problem into the following linear program with a dummy variable Z:

Maximize (in w and a) a − (1/α) E[Z]
s.t. Z ≥ a − w^T R
     Z ≥ 0
     w^T E[R] = µ
     w^T 1 = 1
     w ≥ 0

Since we have only linear constraints, we can be sure that the solution will be a singleton, a convex polyhedron, or that the solution does not exist. In the discrete formulation we state the problem as:

Maximize (in w, a and z) a − (1 / (αM)) Σ_{i=1}^{M} z_i
s.t. z_i ≥ a − w^T Ri, i = 1, ..., M
     z_i ≥ 0
     w^T e = µ
     w^T 1 = 1
     w_i ≥ 0
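A minimal numerical sketch of this linear program, written as the equivalent minimization of −(a − 1/(αM) Σ z_i) and solved with scipy's linprog (the scenario returns, means and volatilities are hypothetical):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(5)
M, N, alpha = 200, 3, 0.05
# Hypothetical scenario returns R_i for N asset classes at M time points.
R = rng.normal([0.002, 0.005, 0.008], [0.01, 0.02, 0.04], size=(M, N))
e = R.mean(axis=0)          # expected return vector
mu = e.mean()               # target portfolio return (reachable by equal weights)

# Variables x = [w_1..w_N, a, z_1..z_M]; maximizing a - 1/(alpha*M)*sum(z)
# is minimizing -a + 1/(alpha*M)*sum(z).
c = np.concatenate([np.zeros(N), [-1.0], np.full(M, 1.0 / (alpha * M))])
# z_i >= a - w.R_i  <=>  a - w.R_i - z_i <= 0
A_ub = np.hstack([-R, np.ones((M, 1)), -np.eye(M)])
b_ub = np.zeros(M)
A_eq = np.vstack([np.concatenate([e, [0.0], np.zeros(M)]),             # w.e = mu
                  np.concatenate([np.ones(N), [0.0], np.zeros(M)])])   # sum(w) = 1
b_eq = np.array([mu, 1.0])
bounds = [(0, None)] * N + [(None, None)] + [(0, None)] * M            # a is free

res = linprog(c, A_ub, b_ub, A_eq, b_eq, bounds=bounds)
w, a = res.x[:N], res.x[N]    # optimal weights; a approximates VaR_alpha
print(res.status, w.round(3), round(-res.fun, 5))   # status 0 = success; -res.fun is the CVaR
```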

For this setting, the optimal value of a is VaRα(w^T R). We can see that the objective function together with the inequality constraints on z expresses the weighted average of the Value at Risk and the mean of all returns below the Value at Risk.

5.3 Mean-Conditional Value At Risk Efficient Portfolios

In this section we want to analyze what it means to optimize a portfolio regarding Value at Risk/Conditional Value at Risk. We start with the case of normally distributed asset returns. Figure 28 shows two normal distributions with the same mean but different variances. For both distribution functions the Conditional Value at Risk and the variance are schematically depicted. It is intuitive to see that if we maximize the CVaR we also minimize the variance of the distribution: the only way to enlarge the CVaR (shifting the corresponding left tail of the distribution to the right) is to shorten the variance (make the peak larger). This shows that minimizing the variance of a distribution is equivalent to maximizing its Conditional Value at Risk. Of course this is only true if we have a sufficient amount of data coming from a pure normal distribution function. Using small amounts of empirical data, there might be effects that prevent the equivalence of the two optimization techniques.

Figure 28: The graphic shows two normal distribution functions. For both distribution functions the Conditional Value at Risk and the variance are schematically depicted. The distribution with the larger variance (dotted line) has the smaller CVaR and vice-versa.

The case of distributions with skewness and excess kurtosis is more interesting. The occurrence of fat tails and asymmetry in the distribution function allows the mean-CVaR optimization to take the risk evolving out of these properties into account. As a consequence, such an optimization will in general assign the portfolio weights differently than the Mean-Variance approach. The optimization will prefer assets with positive skewness, small kurtosis and low variance for a given return.

To conclude, we expect the results of a Mean-CVaR optimization and the results of a Mean-Variance optimization to be the same for the case of similar distribution functions (e.g. normal distribution functions) for the asset returns and a sufficient amount of data. Au contraire, the results are assumed to be different for the two optimization techniques if the data is coming from varying distribution functions with different higher moments or if the sample size is small.

6 Draw-Down Measures

In this section we will present two other approaches to measure the risk of a portfolio. They are called Draw-Down and Time Under-The-Water. An advantage of the two concepts is that they are much more intuitive than other risk measures. The concepts represent values every investor is interested in: Draw-Down measures the loss the investment might suffer (in absolute or relative terms) and Time Under-The-Water is the time period the investment might remain with a negative performance. Other possible applications for these measurements could be: A portfolio manager might lose a client if the client's portfolio does not provide a gain over a long time, or a fund might not be allowed to lose more than a certain amount each month and therefore has to stop trading until the next month starts and with it a new budget.

Draw-Down was first presented in a portfolio context in [12]. In [33] Draw-Down is used together with Time Under-The-Water to measure the loss potential of hedge funds. We will describe Draw-Down as written in [12] and Time Under-The-Water according to the idea in [33]. Afterwards we will enhance Draw-Down to Conditional Draw-Down at Risk (CDaR) and Time Under-The-Water to Conditional Time Under-The-Water at Risk (CTaR). Finally we apply CDaR in a portfolio context. We will work on the logarithmic returns instead of the geometric returns as stated in [12].

6.1 Draw-Down And Time Under-The-Water

Assume we are given the (cumulated) return of the portfolio from time 0 until time t by a function rc(w, t), with w as the vector of weights for the portfolio constituents. The Draw-Down function at time t is defined as the difference between the maximum of the function in the time period [0, t] (the High-Water-Mark) and the value of the function at time t:

DD(w, t) = max_{0≤τ≤t}[rc(w, τ)] − rc(w, t) (27)

Figure 29 shows a time series with the respective High-Water-Marks and Draw-Downs. In a time-value framework, Draw-Down is measured on the y-axis; Time Under-The-Water is the corresponding period on the x-axis that represents the time the value of an investment may remain under its historic record mark.

Starting with the formula for Draw-Down, two risk functions are derived: The Maximum Draw-Down is calculated as the maximum Draw-Down in the period,

MD(w) = max_{0≤t≤T}[DD(w, t)] (28)

and the Average Draw-Down is defined as

AD(w) = (1/T) ∫_0^T DD(w, t) dt (29)

With maxt = max_{0≤τ≤t} rc(w, τ) denoting the High-Water-Mark, we define Time Under-The-Water as the time elapsed since the High-Water-Mark was set:

TUW(w, t) = t − max{τ ≤ t | rc(w, τ) = maxt}
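On a discrete series of logarithmic returns these definitions become one-liners. A sketch on simulated data (all parameters hypothetical):

```python
import numpy as np

rng = np.random.default_rng(6)
r = rng.normal(0.0005, 0.01, size=500)   # simulated logarithmic returns
rc = np.cumsum(r)                        # cumulative return r_c(w, t)

hwm = np.maximum.accumulate(rc)          # High-Water-Mark max_{0<=tau<=t} r_c(tau)
dd = hwm - rc                            # Draw-Down DD(w, t), eq. (27)
md = dd.max()                            # Maximum Draw-Down MD, eq. (28)
ad = dd.mean()                           # Average Draw-Down AD, eq. (29)

# Time Under-The-Water: time elapsed since the last High-Water-Mark.
tuw = np.zeros(len(r), dtype=int)
for t in range(1, len(r)):
    tuw[t] = tuw[t - 1] + 1 if dd[t] > 0 else 0
mt = tuw.max()                           # Maximum Time Under-The-Water
at = tuw.mean()                          # Average Time Under-The-Water

print(round(md, 4), round(ad, 4), mt, round(at, 1))
```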

Figure 29: The figure shows a time series with the respective High-Water-Mark (dashed line) and Draw-Down (dotted line) as defined. The Time Under-The-Water is the part of the dashed line above the dotted line.

Similar to the Draw-Down concept, we will now introduce the Maximum Time Under-The-Water MT(w) and the Average Time Under-The-Water AT(w) as

MT(w) = max_{0≤t≤T}[TUW(w, t)] (30)

AT(w) = (1/T) ∫_0^T TUW(w, t) dt (31)

6.2 Conditional Draw-Down At Risk And Conditional Time Under-The-Water At Risk

Alike the enhancement of Value at Risk to Conditional Value at Risk, we will proceed with Draw-Down and Time Under-The-Water. Draw-Down at Risk can be defined similarly to (22) as

DaRα(MD) = inf{x | P[MD > x] ≤ α} (32)

with MD as the Maximum Draw-Down, and Conditional Draw-Down at Risk corresponding to Conditional Value at Risk (25) as

CDaRα = λ DaRα + (1 − λ) CDaR+_α (33)

with

λ = (P[MD ≥ DaRα] − α) / α (34)

CDaR+_α = E[MD | MD > DaRα]
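Given a sample of Maximum Draw-Downs (here a hypothetical lognormal sample standing in for Draw-Downs collected from sub-periods or rolling windows), DaR and CDaR follow directly from (32)-(34):

```python
import numpy as np

rng = np.random.default_rng(7)
# Hypothetical sample of Maximum Draw-Downs.
md = np.sort(rng.lognormal(mean=-3.0, sigma=0.5, size=1000))
alpha = 0.05
k = int(alpha * len(md))                 # number of Draw-Downs allowed above DaR

dar = md[-(k + 1)]                       # DaR_alpha = inf{x | P[MD > x] <= alpha}, eq. (32)
lam = (np.mean(md >= dar) - alpha) / alpha          # eq. (34)
cdar_plus = md[md > dar].mean()                     # E[MD | MD > DaR]
cdar = lam * dar + (1 - lam) * cdar_plus            # eq. (33)

print(round(dar, 4), round(cdar, 4))     # CDaR lies between DaR and the worst Draw-Downs
```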

We will now discuss the implementation of the concepts in detail. A first approach would be to calculate the Maximum Draw-Down for each new level of the High-Water-Mark. Using this methodology, we scan the time series from the past to the present and each time we find a new global maximum, we calculate the Maximum Draw-Down for the period between this global maximum and the point where the time series is higher than this global maximum for the first time. The drawback of this method is that we will probably get very few Draw-Down values, for the following reasons:

• Since the Draw-Down gets calculated as the difference to the highest historical value (record), the concept of the Draw-Down comprises the effect of increasing time periods for new records: The expected time period for a random variable to reach a new all-time-high is not uniformly distributed but increases much faster over time (see for example [19]).
• In times of a Baisse we won't get any Draw-Downs at all. Only in times of a Hausse will there be a new High-Water-Mark and therefore new Draw-Downs.

In [12] it was proposed to introduce M sub-periods in the time interval [0, T] and to calculate the Draw-Down for each sub-period. This way they get an empirical distribution for the Draw-Downs consisting of at most M sample points. Using this methodology, one should be aware of some points:

• The methodology adds a new variable M that does not improve the descriptive power of the concept. The reason for introducing this variable is just numerical and has no economical or practical meaning.
• If M is chosen too large, the Draw-Downs that extend over several sub-periods get cut into several smaller Draw-Downs because the maximum possible Draw-Down is restricted to the length of the sub-period. This is especially undesirable since we are particularly interested in the large Draw-Downs to calculate the α-quantile.
• If M is chosen too small, the number of resulting Draw-Downs is too small to get a good distribution approximation.
• The effect of increasing time periods for new records can not be avoided by resetting the all-time-high at the beginning of each sub-period: it is just transformed to a smaller time scale.

We would like to bring this method and a new method into the context of the information given by the client. The described method of fixed periods for calculating the Draw-Down could be used if the investment horizon of the client is known: If the investment horizon is known, a rolling window with the length of the investment horizon can be applied to the available historical data. For each time window the Maximum Draw-Down gets calculated and the window shifted for one period. If we have P historical data points, Q data points in the rolling window and shift the window by R data points each time we advance, we get with this method (P − Q)/R Maximum Draw-Down values. All of these Maximum Draw-Downs get stored to construct their distribution. It would also be possible to use overlapping rolling windows; however this would decrease the variance of the data reused in this way.

If the investment horizon is not known, we propose as a second method to calculate the Maximum Draw-Down for each possible entry point combined with each possible exit point. The idea is to calculate the average Draw-Down an investor could face.

us for P data points (P −1)(P −2) Maximum Draw-Down values. The idea is to calculate the 2 average Draw-Down an investor could face. The disadvantage of this method is that the DrawDown values for a certain (unknown) investment horizon have very few inﬂuence to the ﬁnal distribution. The reason for this is that the number of possible Draw-Downs grows quadratical, however the number of Draw-Downs for a certain investment period grows only linearly. It might therefore be questionable to compare Draw-Downs of diﬀerent time periods. We will assume that the investment horizon is known and therefore proceed with the ﬁrst of the described methods to formulate the optimization problems. It lies in the nature of the concept to change the structure of the optimization problem from ”minimize the risk for a given expected return”, as it was the case for Variance and CVaR optimization, to ”maximize the expected return for a given Draw-Down/Time Under-The-Water threshold”. For an investor it is convenient to deﬁne his/her personal amount of wealth he/she is willing to risk or the amount of time he/she gives to the portfolio manager to remain with a negative performance. However, to be better able to compare the results of the diﬀerent optimizations, we will stick to our old schema of ﬁxing an expected return and minimizing the respective risk measure. To show the corresponding linear optimization problems, we introduce the following variables: The vector of logarithmic cumulative asset returns up to time moment k be yk so we can calculate the cumulative portfolio return as rc (w, t = k) = yk ∗ w. With the expected return given by the investor as µ, we get the following linear programming problem for the Maximum Draw-Dawn Minimize (in w and u) z s.t. u k − yk ∗ w ≤ z 1≤k≤M uk ≥ yk ∗ w, 1≤k≤M uk ≥ uk−1 , 1≤k≤M u0 = 0 1 d yM ∗ w = µ wT 1 = 1 wi ≥ 0, 1≤i≤N
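The Maximum Draw-Down program above translates directly into a linear program in the stacked variable vector [w, u, z]. The following sketch is a hypothetical implementation using scipy.optimize.linprog, not the solver used in the thesis:

```python
import numpy as np
from scipy.optimize import linprog

def max_drawdown_portfolio(Y, mu, d=1.0):
    """Solve the Maximum Draw-Down LP.

    Y  : (M, N) matrix whose row k holds the cumulative log asset returns y_k
    mu : target expected return given by the investor
    d  : investment period in years
    Returns the optimal weights w and the minimal Maximum Draw-Down z.
    """
    M, N = Y.shape
    n = N + M + 1                        # variables: w (N), u (M), z (1)
    c = np.zeros(n); c[-1] = 1.0         # minimize z

    A_ub, b_ub = [], []
    for k in range(M):
        row = np.zeros(n)                # u_k - y_k.w - z <= 0
        row[:N] = -Y[k]; row[N + k] = 1.0; row[-1] = -1.0
        A_ub.append(row); b_ub.append(0.0)
        row = np.zeros(n)                # y_k.w - u_k <= 0
        row[:N] = Y[k]; row[N + k] = -1.0
        A_ub.append(row); b_ub.append(0.0)
        row = np.zeros(n)                # u_{k-1} - u_k <= 0, with u_0 = 0
        row[N + k] = -1.0
        if k > 0:
            row[N + k - 1] = 1.0
        A_ub.append(row); b_ub.append(0.0)

    A_eq = np.zeros((2, n))
    A_eq[0, :N] = Y[-1] / d              # (1/d) y_M.w = mu
    A_eq[1, :N] = 1.0                    # weights sum to one
    b_eq = [mu, 1.0]

    bounds = [(0, None)] * N + [(None, None)] * (M + 1)
    res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub,
                  A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:N], res.x[-1]
```

With a monotonically rising asset the optimal z is 0, since its running maximum u_k always equals the current cumulative return.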

where u_k, 1 ≤ k ≤ M, and z are auxiliary variables and d is the investment period in years. The optimization problem with a constraint on the average Draw-Down can be written as follows:

Minimize (in w and u)   z
s.t.
(1/M) Σ_{k=1}^{M} (u_k − y_k · w) ≤ z
u_k ≥ y_k · w,   1 ≤ k ≤ M
u_k ≥ u_{k−1},   1 ≤ k ≤ M
u_0 = 0
(1/d) y_M · w = µ
w^T 1 = 1
w_i ≥ 0,   1 ≤ i ≤ N

and the optimization problem with a constraint on CDaR may be formulated as:

Minimize (in w, u, z, ζ)   z
s.t.
ζ + (1/(αM)) Σ_{k=1}^{M} z_k ≤ z
z_k ≥ u_k − y_k · w − ζ,   1 ≤ k ≤ M
z_k ≥ 0,   1 ≤ k ≤ M
u_k ≥ y_k · w,   1 ≤ k ≤ M
u_k ≥ u_{k−1},   1 ≤ k ≤ M
u_0 = 0
(1/d) y_M · w = µ
w^T 1 = 1
w_i ≥ 0,   1 ≤ i ≤ N

The optimal solution of this problem gives the optimal threshold value in the variable ζ.

The corresponding extension of the Time Under-The-Water to the risk measure Conditional Time Under-The-Water at Risk (CTaR) can be done similarly. Since the optimization problems are analogous to the ones for the Draw-Down, we give only the linear program for the Conditional Time Under-The-Water at Risk:

Minimize (in w, u, z, ϑ)   v
s.t.
ϑ + (1/(αM)) Σ_{k=1}^{M} z_k ≤ v
z_k ≥ u_k − y_k · w − ϑ,   1 ≤ k ≤ M
z_k ≥ 0,   1 ≤ k ≤ M
u_k ≥ y_k · w,   1 ≤ k ≤ M
u_k ≥ u_{k−1},   1 ≤ k ≤ M
u_0 = 0
(1/d) y_M · w = µ
w^T 1 = 1
w_i ≥ 0,   1 ≤ i ≤ N

where u_k, z_k, 1 ≤ k ≤ M, and v are auxiliary variables.

A well implementable setup for the portfolio optimization process (that we will not follow further) would be the following framework: optimize the expected portfolio return subject to the client's CDaR restriction ζ_C and CTaR restriction η_C:

Maximize (in w)   (1/d) y_M · w
s.t.
CDaR_α(w) ≤ ζ_C
CTaR_α(w) ≤ η_C
w^T 1 = 1
w_i ≥ 0,   1 ≤ i ≤ N
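As a sketch of how the CDaR formulation translates into code, the following hypothetical scipy implementation minimizes ζ + 1/(αM) Σ z_k directly, which is equivalent to introducing the extra bound variable of the formulation above; it is an illustration, not the thesis software:

```python
import numpy as np
from scipy.optimize import linprog

def cdar_portfolio(Y, mu, alpha, d=1.0):
    """Solve the CDaR LP: minimize zeta + 1/(alpha*M) * sum_k z_k
    subject to z_k >= u_k - y_k.w - zeta, z_k >= 0, the running-maximum
    constraints on u, the target return and the budget constraint."""
    M, N = Y.shape
    n = N + 2 * M + 1                    # variables: w (N), u (M), z (M), zeta
    c = np.zeros(n)
    c[N + M:N + 2 * M] = 1.0 / (alpha * M)
    c[-1] = 1.0

    A_ub, b_ub = [], []
    for k in range(M):
        row = np.zeros(n)                # u_k - y_k.w - zeta - z_k <= 0
        row[:N] = -Y[k]; row[N + k] = 1.0
        row[N + M + k] = -1.0; row[-1] = -1.0
        A_ub.append(row); b_ub.append(0.0)
        row = np.zeros(n)                # y_k.w - u_k <= 0
        row[:N] = Y[k]; row[N + k] = -1.0
        A_ub.append(row); b_ub.append(0.0)
        row = np.zeros(n)                # u_{k-1} - u_k <= 0, with u_0 = 0
        row[N + k] = -1.0
        if k > 0:
            row[N + k - 1] = 1.0
        A_ub.append(row); b_ub.append(0.0)

    A_eq = np.zeros((2, n))
    A_eq[0, :N] = Y[-1] / d              # (1/d) y_M.w = mu
    A_eq[1, :N] = 1.0                    # weights sum to one
    b_eq = [mu, 1.0]

    bounds = ([(0, None)] * N + [(None, None)] * M
              + [(0, None)] * M + [(None, None)])
    res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub,
                  A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:N], res.fun            # weights and the optimal CDaR value
```

At the optimum the ζ component of the solution plays the role of the Draw-Down at Risk threshold.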

This setup has the following advantages:

• Since it is very intuitive, the portfolio manager can talk to the client in exactly the same terms, and the client has a clear view of the risk he or she is taking.

• The framework is still a linear programming problem and can therefore be solved efficiently.

The two risk measures Draw-Down and Time Under-The-Water demand a lot of data to be meaningful. Both methods define at most one risk value per data window, and to get a good approximation of the distribution of the risk measures, many data windows are necessary. This is especially true when we are looking for small quantiles like α = 0.05. In empirical tests we have seen that Draw-Down and Time Under-The-Water as described so far are not very appropriate for hedge funds, where only monthly data for about the last 15 years is available and therefore only 180 data points in total. The risk measures are in this case too discrete and it is not possible to get a reasonable optimization; e.g. it is often not possible to get a good estimation of the derivatives of the risk measure, which is needed in most optimization algorithms. We have come to this conclusion especially for the Time Under-The-Water measure, where the objective function to minimize is far too discrete to get any meaningful results.

An important difference of the Draw-Down approach in comparison to the Value at Risk approach is the fact that the Draw-Down takes the correlations implied in the time series into consideration, because it operates on the compounded historical returns and not on the return distribution function as Value at Risk does.

6.3 Mean-Conditional Draw-Down At Risk Efficient Portfolios

In this section we want again to analyze what it means to optimize a portfolio regarding Draw-Down/Conditional Draw-Down at Risk. Under the assumption of normally distributed returns, the wealth of a portfolio can be approximated by a Geometric Brownian Motion given by

X(t) = σW(t) + µt

where W(t) is a standard Wiener process, µ is the drift and σ is the diffusion parameter. Now it is possible to derive the average Maximum Draw-Down E[AD]. Its asymptotic behavior is

E[AD] = (2σ²/µ) Q_AD(α²)   for µ ≠ 0
E[AD] = 2γσ√T   for µ = 0

with

µ > 0:   Q_AD(x) → γ√(2x) as x → 0+,   Q_AD(x) → (1/4) log x + 0.49088 as x → ∞
µ < 0:   Q_AD(x) → −γ√(2x) as x → 0+,   Q_AD(x) → −x − 1/2 as x → ∞

and

α = µ √(T/(2σ²)),   γ = √(π/8)

with T as the investment horizon. This setup allows us to estimate the average Draw-Down of a time series by using its mean and variance. If normality does not hold, the portfolio or asset wealth can not be modelled by a Geometric Brownian Motion, and therefore the algebraic relation is not valid anymore. Again, we have made the experience that in practice a lot of data is necessary to get a reasonable result. The reason might lie in the assumption of normality of the return distribution, which holds, if ever, only for very long time series.
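The driftless case E[AD] = 2γσ√T can be checked with a quick Monte Carlo experiment. The sketch below uses arbitrary illustrative parameters (σ = 0.2, T = 1) and a discretized Brownian path:

```python
import numpy as np

# Monte Carlo check of E[MDD] = 2*gamma*sigma*sqrt(T) with gamma = sqrt(pi/8),
# i.e. sqrt(pi/2)*sigma*sqrt(T), for the driftless case X(t) = sigma*W(t).
rng = np.random.default_rng(0)
sigma, T, steps, paths = 0.2, 1.0, 2000, 4000
dt = T / steps

dW = rng.normal(0.0, np.sqrt(dt), size=(paths, steps))
X = np.cumsum(sigma * dW, axis=1)            # discretized Brownian paths
X = np.hstack([np.zeros((paths, 1)), X])     # every path starts at 0
hwm = np.maximum.accumulate(X, axis=1)       # running High-Water-Mark
mdd = (hwm - X).max(axis=1)                  # Maximum Draw-Down per path

gamma = np.sqrt(np.pi / 8.0)
theory = 2.0 * gamma * sigma * np.sqrt(T)
print(mdd.mean(), theory)
```

The simulated mean lies slightly below the theoretical value because the discretized path misses part of the continuous-time excursions; the gap shrinks as the number of steps grows.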

7 Comparison Of The Risk Measures

Peijan and López de Prado state in [33] that there is also an algebraic relation between VaR, Draw-Down and Time Under-The-Water. In section 5.3 we have seen that the variance is closely related to CVaR, and in section 6.3 we have shown that there exists an algebraic correspondence between variance and Draw-Down for the case of normality. This means that there is even an algebraic correspondence between Variance, Value at Risk, Draw-Down and Time Under-The-Water whenever normality and time-independence hold. For the context of portfolio optimization we can conclude that the three optimization techniques, minimizing the variance, minimizing CDaR and maximizing CVaR, will end up with the same results if the assumptions of normality and time-independence hold. This relation between the different risk measures disappears when normality can not be assumed anymore, and we then expect the optimization procedures to produce different results.

Part III
Optimization With Alternative Investments

This third part deals with the implementation of the discussed risk measures and the results achieved by using different data sets. We first show the implemented optimization problems and some numerical specialities related to them. Then we quickly discuss the different kinds of data and show the used data. Afterwards the optimal portfolios are calculated, and the results of the calculations are shown and interpreted. Finally a summary and an outlook are given.

8 Numerical Implementation

The table below summarizes the considered optimization problems, whereby µ indicates the expected return given by the investor.

Min Variance   s.t.   E[R] = µ,   Σ w_i = 1,   w_i ≥ 0
Max CVaR       s.t.   E[R] = µ,   Σ w_i = 1,   w_i ≥ 0
Min CDaR       s.t.   E[R] = µ,   Σ w_i = 1,   w_i ≥ 0

In the following we list some aspects of the implementation:

• Since our optimizer is only capable of minimizing a function but not of maximizing one, we minimize −CVaR in the implementation instead of maximizing CVaR.

• In order to get the best results, the efficient frontier gets calculated twice: a first run starts at the corner solution with the lowest expected return and moves to the corner solution offering the highest expected return; afterwards a second run is executed in reverse order. The results of the first run are stored and compared with the results of the second run, whereby the better results (i.e. the portfolio weights leading to a smaller risk value) are chosen for the final output. While moving from one corner solution to the other, the optimizer can be given an initial estimation for the weights of the optimal portfolio. These estimations are calculated as linear extrapolations of the last two optimal portfolio weights, because portfolio weights often change linearly while changing the expected return. Experiments have shown that this results in a better convergence of the solution.

• We do not minimize the variance but the standard deviation. Since we are mostly dealing with variances smaller than 1, the standard deviation, as the square root of the variance, has a "broader minimum". The reason might lie in the optimization algorithm, which seeks the lowest value of a function by following the steepest gradient.
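The warm-start extrapolation along the frontier can be sketched as follows; this is a hypothetical helper illustrating the idea, not the thesis code:

```python
import numpy as np

def warm_start(prev_weights):
    """Initial guess for the next optimization along the efficient frontier:
    linear extrapolation of the last two optimal weight vectors, clipped to
    the long-only region and renormalized to sum to one."""
    if len(prev_weights) < 2:
        return None                               # let the optimizer use its default
    w1, w2 = prev_weights[-2], prev_weights[-1]
    guess = np.clip(2.0 * w2 - w1, 0.0, None)     # extrapolate one step forward
    s = guess.sum()
    return guess / s if s > 0 else np.full_like(w2, 1.0 / len(w2))
```

If the last two optimal portfolios were (0.2, 0.8) and (0.3, 0.7), the extrapolated guess for the next target return is (0.4, 0.6).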

9 Used Data

In this section we will explain the difference between normal and logarithmic data and argue why we have decided to use logarithmic data. We also list the used historical market data and show how we have simulated artificial data from it.

9.1 Normal Vs. Logarithmic Data

In finance there are two common ways to model returns: simple/geometric returns and logarithmic/continuously compounded returns. With geometric returns, the new return gets calculated at the end of each period, and therefore an increase or decrease in the return becomes active for the next period. With logarithmic returns, the change in the returns gets calculated on an infinitesimally small time period, and therefore a continuously compounded return represents the actual value at every time point. P_t and P_{t+1} denote the absolute value of the asset at time points t and t+1, respectively. The following table gives an overview of geometric and logarithmic returns for single- and multi-period each.

Geometric return, single-period:   R_{t,t+1} = P_{t+1}/P_t − 1
Geometric return, multi-period:    R_{t,t+n} = [ Π_{i=0}^{n−1} (1 + R_{t+i,t+i+1}) ] − 1
Logarithmic return, single-period: r_{t,t+1} = log(1 + R_{t,t+1})
Logarithmic return, multi-period:  r_{t,t+n} = log[ Π_{i=0}^{n−1} (1 + R_{t+i,t+i+1}) ] = Σ_{i=0}^{n−1} r_{t+i,t+i+1}

We have decided to use logarithmic/continuously compounded returns for the analysis for the following reasons:

• Because 0 ≤ P_{t+n}/P_t < ∞, the effective simple return can not be below -1 (full loss, for P_{t+n} = 0), which is a restriction of the range of possible values. By taking log-returns, the range gets stretched to [−∞, ∞]. This is especially important for tail analysis, since with simple returns the tail gets cut at -1 and a probability would be assigned to values that can not appear.

• If single-period returns are assumed to be normal, then multi-period simple returns (Π_i (1 + R_{t+i})) − 1 are not normal. This comes from the fact that a product of normally distributed variables is not normally distributed. In contrast, using continuously compounded returns, multi-period returns are achieved by adding up the single-period returns, which results again in a normal distribution, since a sum of normally distributed variables is again normally distributed.

• The concepts of CVaR and CDaR calculate thresholds that may lie within a time period, whereas the data is for the end of the time period. In this case it is more precise to use logarithmic returns instead of the linear approximation done by geometric returns.
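A small numeric illustration of the two conventions, using hypothetical prices: simple returns compound multiplicatively, while log returns simply add up, and both describe the same total price movement.

```python
import numpy as np

prices = np.array([100.0, 105.0, 99.0, 110.0])     # hypothetical price path

simple = prices[1:] / prices[:-1] - 1.0            # geometric single-period returns
logret = np.log(prices[1:] / prices[:-1])          # continuously compounded returns

total_simple = np.prod(1.0 + simple) - 1.0         # multi-period geometric return
total_log = logret.sum()                           # multi-period log return

print(total_simple, np.expm1(total_log))           # the two totals agree
```

Here the total simple return is 110/100 − 1 = 0.1, and exp of the summed log returns minus one recovers exactly the same value.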

9.2 Empirical Vs. Simulated Data

Empirical Market Data

As real market data we have chosen 3 bond indices, 5 equity indices and a hedge fund index. For equities and bonds there is a representative index for each of the following geographic categories: the whole world, Europe and the United States. Additionally we have equity indices for the Far East and the Emerging Markets. As proxy for alternative investments the Hedge Fund Research (HFR) Fund Weighted Composite Index gets used. For a list of the various hedge fund styles included in this index and their descriptions you are referred to appendix F.

The data comes from DataStream, except the HFR data, which comes directly from HFR. The data range covers almost the past 14 years (January 1990 until September 2003) on a monthly basis. This means that there are 165 data points per index available. Not for all indices is it possible to get this much data; e.g. the two indices based on the euro are just available after the introduction of this currency in 1999. The hedge fund index acts as the bottleneck, because for all of the other indices more data into the past would be available. However, to make the results more comparable, we restrict the data range to the largest common range. In case a value of a time series was missing for a certain date (e.g. because of a holiday), we have taken the value from the day before. Those indices not quoted in USD were converted to this currency. We are aware that, by converting all indices to USD, we have introduced currency risk to the time series. However, we think that it makes much more sense to compare time series that are all in the same currency than in different ones.

The tests are applied to the log-returns of the data series. The following table lists the indices and the first four moments of their logarithmic monthly returns.

[Table: mean, standard deviation, skewness and excess kurtosis of the logarithmic monthly returns of the nine indices: HFR Fund Weighted Composite, MSCI World, MSCI Europe, MSCI North America, MSCI Far East, MSCI Emerging Markets, JPM Global, JPM Europe, JPM USA.]

[Table: covariance matrix of the 9 asset classes (HFR FWC, MSCI WD, MSCI EU, MSCI US, MSCI FE, MSCI EM, JPM WD, JPM EU, JPM US).]

Simulated Data

We generate artificial data based on the historical data described above. For this purpose we first fit a multivariate skewed normal distribution and a multivariate skewed Student-t distribution to the historical data. The fitting procedure gives us a vector of regression coefficients, the covariance matrix, a vector of shape parameters and the degree of freedom. In the case of fitting a skewed normal distribution, the shape parameters are all 0 and the degree of freedom is infinite, as is well known for the normal distribution. Based on these estimated distributions we can generate random samples.

As always with Monte Carlo simulations, we have the advantage of full control over the underlying model, because we can control and change the parameters. As a disadvantage we note that the Monte Carlo simulation ignores all dependencies over time in the time series and therefore slightly overstates the true value of diversification across asset classes in simulated portfolios. Another unpleasant aspect is that only one value for the degree of freedom is estimated for all asset classes and respected in the fitted function. This means that the time series do not each have an individual kurtosis, but only a common one. One approach to overcome this would be to use copulas. However, it is a non-trivial task to generate multivariate correlated data with skewness and kurtosis, and this would be beyond the scope of this thesis.
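A much simplified sketch of the simulation step is given below. It fits a plain multivariate normal distribution instead of the skewed normal / skewed Student-t fits used in the thesis, so skewness and excess kurtosis are not reproduced; only mean and covariance are:

```python
import numpy as np

def simulate_from_history(returns, n_samples, seed=0):
    """Simplified resampling sketch: fit a plain multivariate normal to the
    historical monthly log-returns (rows = months, columns = asset classes)
    and draw new correlated samples from it."""
    rng = np.random.default_rng(seed)
    mean = returns.mean(axis=0)                   # per-asset mean return
    cov = np.cov(returns, rowvar=False)           # cross-asset covariance
    return rng.multivariate_normal(mean, cov, size=n_samples)
```

Drawing 2000 samples from a distribution fitted to 165 monthly observations mirrors the setup used for the simulated-data experiments below.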

10 Evaluation Of The Portfolios

So far we have presented three methods for optimizing a portfolio, and in the last chapter we have introduced some historical market data. In the following part we will publish the portfolios that were optimized based on the market data. In a first section we show the results of the portfolio optimization if we use only traditional asset classes. Afterwards we introduce a hedge fund and analyze how it changes the optimal portfolios. In the third part we generate artificial data with the same characteristics as the asset classes and optimize this data. All the calculations are done for the 8 traditional asset classes and again for the 9 asset classes including the hedge fund data.

For the portfolio optimization we give the expected return of the investor and try to minimize the respective risk. This procedure is done for several expected returns to get the efficient frontier. For the calculations, the range of these expected returns is defined as the interval between the smallest and the largest expected return of the asset classes. Clearly, under the assumption of no short sales and no lending and borrowing, it is not possible to reach an expected portfolio return outside this interval (see chapter 1.2). We are aware that the part of the efficient frontier that is below the minimum risk portfolio is in practice not relevant; this is especially true for expected target returns below 0. We will nevertheless show the whole range to give the whole picture of the optimization results and to compare them.

For the portfolio optimization no constraints for the weights were introduced, in order to see the pure and uninfluenced results. The alpha value for the CVaR and CDaR optimizations is chosen as 0.25, the size of the rolling window for the CDaR as 24 and the step size for the CDaR as 3. These are the values for which we got the most stable results. Since we have only 165 data points per time series, it was not possible to decrease the alpha value further towards 0.1, or to use non-overlapping windows for the calculation of CDaR, and still get reasonable results.

10.1 Evaluation With Historical Data

This section contains the results derived by using the original data series as presented before. For the charts we use the following color encoding:

Asset Class                    Color    Style
HFR Fund Weighted Composite    Black    Solid
MSCI World                     Orange   Solid
MSCI Europe                    Red      Solid
MSCI North America             Green    Solid
MSCI Far East                  Blue     Solid
MSCI Emerging Markets          Pink     Solid
JPM Global                     Orange   Dashed
JPM Europe                     Red      Dashed
JPM USA                        Green    Dashed
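With the chosen settings (window length 24, step size 3, α = 0.25), the empirical CDaR of a single cumulative return series can be sketched as follows; this is an illustration, not the thesis code, and the tail is simply taken as the worst α-fraction of the window Draw-Downs:

```python
import numpy as np

def rolling_cdar(cum_returns, window=24, step=3, alpha=0.25):
    """Empirical CDaR estimate: Maximum Draw-Down per overlapping rolling
    window (length 24 months, shifted by 3), then the mean of the worst
    alpha-fraction of those Draw-Downs."""
    mdds = []
    for start in range(0, len(cum_returns) - window + 1, step):
        path = cum_returns[start:start + window]
        hwm = np.maximum.accumulate(path)         # running High-Water-Mark
        mdds.append((hwm - path).max())           # worst drop in this window
    mdds = np.sort(np.asarray(mdds))
    k = max(1, int(np.ceil(alpha * len(mdds))))   # size of the alpha-tail
    return mdds[-k:].mean()
```

With 165 monthly points this yields 48 overlapping windows, so the α = 0.25 tail consists of the 12 largest window Draw-Downs; a series that only rises has a CDaR of zero.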

Portfolios With Traditional Assets

Figure 30 shows the result of the optimization of the 8 chosen traditional asset classes. The pictures in the left column show the weights of the individual asset classes dependent on the expected target return chosen by the investor. The pictures in the right column show the efficient frontier resulting from the optimized weights. The two pictures in the same row belong to the same optimization technique (Mean-Variance, Mean-CVaR or Mean-CDaR).

Figure 30: The weights and efficient frontiers for traditional asset classes for various optimization criteria.

At first sight we can see from the pictures that the optimization techniques produced very similar results. They all start by investing 100% of the available capital in MSCI Far East (blue solid line) if the investor asks for a very low return around -0.004; this is the only asset class that offers such a low mean return. As we increase the expected return, the contribution from JPM USA (green dashed line) increases until an expected return of 0.006, where the contribution of MSCI Far East is decreased to a contribution of 0. In this area we can see differences in the asset allocation of the three techniques: the Mean-Variance optimization pushes the JPM USA class to 100%, whereas Mean CDaR increases JPM USA to 0.8 at the maximum and distributes the resulting part to JPM Global (turquoise dotted line). All three techniques agree in the range above 0.007 to invest in JPM Europe (red dotted line) and have a major allocation in MSCI North America when it comes to an expected return above 0.008.

The efficient frontiers also look very similar for all three optimization techniques. The minimum risk portfolio is at an expected return of 0.0065 for all techniques. We can state that the efficient frontiers of the Mean-Variance and Mean-CVaR optimizations are more similar to each other, whereas the efficient frontier of the Mean-CDaR optimization is much more peaked and unstable; the results of the Mean-Variance optimization are very smooth. This effect might be coming from the small data set and the fact that CDaR (and, to a certain extent, also CVaR) takes outliers heavily into consideration. As we will see, these artifacts will disappear as soon as we increase the amount of data.

We can explain the calculated results with the actual situation at the world markets: the table with the four moments of the indices shows that MSCI Far East is the only index with a negative first moment. The reason for this is the Asia crisis of 1997, which is contained in the data interval. The second moment shows us why indices like MSCI Emerging Markets and MSCI Europe don't appear in the weight charts: they have a too high standard deviation, especially in comparison to the bonds, which offer a higher expected return for a lower standard deviation. Since the three optimizations are linked together via the standard deviation, this holds true for all of them. The results do not correspond completely to portfolio theory, which says that in the area of lower expected returns we should find mostly bonds, because they usually offer a lower expected return and a low risk, while in the higher region of expected returns we could expect equity indices from risky geographic locations such as the Emerging Markets, which inherit more extreme properties.

The results also show the effect of diversification very clearly: MSCI Far East (blue solid line) and JPM US (green dashed line), which dominate the lower part of the expected returns, have a correlation of -0.0001 (see Covariance matrix), and MSCI US (green solid line) together with JPM EU (red dashed line), which have a high allocation in the higher part of the expected returns, have a correlation of -0.0000152. These are two of the smallest entries in the Covariance matrix. This shows that all optimization techniques try to combine the least correlated assets. This might also be the reason why MSCI World is used so rarely to form the portfolios: MSCI World can be considered as a linear combination of the other indices, and the optimization is looking for optimal diversification.

Portfolios With Traditional And Alternative Assets

In this section we show the results of portfolio optimizations given that a hedge fund index is available. Figure 31 contains the six pictures, with the weight allocations of the portfolios in the left column and the efficient frontiers in the right column.

Figure 31: The weights and efficient frontiers for traditional and alternative asset classes for various optimization criteria.

Again it becomes visible that the results of the optimization techniques are similar. As in the situation without the hedge fund index, MSCI Far East (blue solid line) and JPM USA (green dashed line) dominate the range between -0.004 and 0.006. Another interesting effect is that the results for expected returns in the range of -0.004 to 0.006 are the same for the cases with and without the hedge fund index. This means that the hedge fund index has no influence on the lower expected returns but is treated as independent. The hedge fund index gets taken into consideration when the expected return reaches a level of 0.006 and above. Around an expected return of 0.005 we also have, for the Mean CDaR and Mean CVaR optimizations, JPM Global (turquoise dashed line) playing a minor role.

The Covariance matrix shows that the hedge fund index is very little correlated with the other assets; its correlation with both bond indices is negative. The results of the optimization suggest combining the hedge fund index with JPM EU (red dashed line) and JPM US (green dashed line) to get a high expected portfolio return. Remarkable is that JPM Europe (red dashed line) gets overweighted in the Mean CDaR optimization in comparison to the Mean Variance and Mean CVaR optimizations: it fluctuates from 20 percent for the Mean-Variance optimization up to almost 40 percent for the Mean-CDaR optimization. The hedge fund index attracts all the weight for expected returns above 0.010 because it is the only asset offering such a high return. Again the effect of diversification got utilized by all of the optimization techniques.

10.2 Evaluation With Simulated Data

In this section we optimize portfolios based on simulated data. As described earlier, the data is gained by fitting a distribution to the available monthly time series of the asset classes. As soon as we have the distribution, we can generate as much artificial data with the same properties as we need. For the following calculations we have generated 2000 samples for each asset class, which represents 2000 months or 167 years of data. We distinguish between fitting a skewed normal distribution and fitting a skewed Student-t distribution.

Portfolios With Simulated Traditional Assets

Figure 32 shows that the results we get when fitting a multivariate skewed normal distribution to the historical data and generating 2000 samples from this distribution are pretty much similar to the ones of the original data. We can see that the instability in the CDaR and CVaR optimizations disappears and all three optimizations get the same results. Only a little peak of 10 percent allocation in MSCI Emerging Markets in the CDaR optimization distinguishes the results.

In figure 33 the results for fitting a multivariate skewed Student-t distribution to the same 8 data series are provided. The little peaks in the weight allocation charts show that the CDaR results are much more unstable compared to the Variance results. The covered range for the expected return has shifted to the interval (-0.001, 0.010), which is a result of the random generation of new data from the fitted distribution. Besides this shift there is another difference compared to fitting a skewed normal distribution: the allocation of JPM Europe (red dashed line) varies more across the three optimization techniques. This effect might be coming from the fitted skewed Student-t distribution, which allows a higher adaptation to the original data than the skewed normal distribution.

Figure 32: The weights and efficient frontiers for traditional asset classes for various optimization criteria. The used data has been simulated by a skewed normal distribution fitted to the historical data.

Figure 33: The weights and efficient frontiers for traditional asset classes for various optimization criteria. The used data has been simulated by a skewed Student-t distribution fitted to the historical data.

Portfolios With Simulated Traditional Assets And Alternative Assets

Figure 34 and figure 35 show the results for the 9 asset classes, including the hedge fund index. The results of figure 34 are retrieved by fitting a skewed normal distribution to the historical data, whereas the results of figure 35 are retrieved by fitting a skewed Student-t distribution. Comparing figure 34 and figure 35, we see again the same effect as we have seen for the 8 asset classes: the outcome of the three different optimizations differs more when we fit the historical data with a multivariate skewed Student-t distribution instead of the multivariate skewed normal distribution. Besides this we can confirm that hedge funds offer a possibility for higher returns.

Figure 34: The weights and efficient frontiers for traditional and alternative asset classes for various optimization criteria. The used data has been simulated by a skewed normal distribution fitted to the historical data.

[Figure omitted: asset weights and efficient frontiers for the Mean-Variance, Mean-CVaR and Mean-CDaR optimizations.]
Figure 35: The weights and efficient frontiers for traditional and alternative asset classes for various optimization criteria. The used data has been simulated by a skewed student-t distribution fitted to the historical data.
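The effect observed in these figures, namely that the three optimizations disagree more under a fat-tailed fit, can be made concrete with a small simulation. The sketch below is illustrative only: it is Python rather than the thesis' R code, it uses plain (unskewed) normal and Student-t distributions, and the degrees of freedom are an arbitrary choice. It draws two samples with identical variance and shows that only the tail-based measure separates them:

```python
import math
import random
import statistics

random.seed(1)
n, nu = 200_000, 5                  # nu: assumed degrees of freedom, not from the thesis
scale = math.sqrt(nu / (nu - 2))    # a t(nu) variable has variance nu/(nu-2); rescale to 1

def draw_t():
    # Student-t via the normal/chi-square mixture: Z / sqrt(W/nu), W ~ chi2(nu)
    z = random.gauss(0.0, 1.0)
    w = random.gammavariate(nu / 2, 2.0)    # chi-square with nu degrees of freedom
    return z / math.sqrt(w / nu) / scale

normal = [random.gauss(0.0, 1.0) for _ in range(n)]
student = [draw_t() for _ in range(n)]

def cvar(xs, alpha=0.01):
    # expected loss in the worst alpha fraction of outcomes (reported as a positive number)
    tail = sorted(xs)[: int(alpha * len(xs))]
    return -statistics.fmean(tail)

print(round(statistics.pstdev(normal), 2), round(statistics.pstdev(student), 2))
print(round(cvar(normal), 2), round(cvar(student), 2))
```

With matched standard deviations, the empirical 1% CVaR of the t sample is markedly larger than that of the normal sample, which is exactly the tail information the variance cannot see.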


Summary and Outlook

Portfolio optimization has always been a key issue of finance. In recent years its complexity increased because of the emergence of derivatives and alternative instruments. New alternative investment vehicles like hedge funds are very interesting in the context of portfolio optimization because they offer a lot of unexplored investment opportunities. This thesis dealt with the question of how to integrate alternative investments like hedge funds into a portfolio.

In the first part we presented the standard portfolio optimization approach according to Markowitz by describing the risk-return framework and the relation to the utility function of an investor. The analytical solutions for optimal portfolios were derived for the case of two assets. Important here is that the standard Mean-Variance optimization assumes normally distributed returns or a specific utility function for the investor.

The purpose of the second part was to show that the requirements of the Mean-Variance optimization as proposed by Markowitz are not completely fulfilled and to present some alternative optimization processes. To show the violation of the requirements, we applied some statistical tests for measuring the stylized facts of asset returns. We have introduced historical hedge fund data because it is known that hedge fund returns exhibit special statistical properties like skewness and kurtosis, and it is therefore interesting to see how they influence the portfolio optimization results. The numerical results showed that the returns are not normally distributed but have fat tails. The stylized facts appear especially strong when we increase the data frequency (e.g. going from monthly data to daily data). In order to propose alternatives to the Mean-Variance optimization, we then discussed the desirable properties of risk measures and presented several sets of properties as proposed in the literature. Risk measures like Value at Risk, Draw-Down and Time Under-The-Water and their derivations Conditional Value at Risk and Conditional Draw-Down at Risk were introduced. They were analyzed and compared with the variance as risk measure. It is explained that portfolios optimized according to variance, Value at Risk or Draw-Down will be very similar in the case of normally distributed data.

The third part summarized the results achieved by applying the three optimization techniques Mean-Variance, Mean-Conditional Value at Risk and Mean-Conditional Draw-Down at Risk to data. For this purpose we have implemented a software framework to test and compare the different optimization techniques. This software framework and also the used data are explained. The data were twofold: we used empirical data and simulated time series based on fitting multivariate skewed distribution functions to the empirical returns. For each setup of data and optimization technique we have calculated the efficient frontier and the weight allocation of the efficient portfolios. The results of the three optimization techniques differed depending on the used data. As expected, the outcome of the different optimization techniques was less variable in the case of normal data and varied more when we used non-normal data. This supported the conclusion from the algebraic analysis of the risk measures that portfolio optimization techniques other than the Mean-Variance optimization are preferable in the context of non-normal data. Therefore we propose to use risk measures like Conditional Value at Risk or Conditional Draw-Down at Risk especially in the case of alternative investments, because their returns deviate from the normal distribution. However, hedge funds also have good properties if one wants to continue with the Mean-Variance optimization: in the investigated period hedge funds had a very good performance and therefore offer a very high return. Even if the performance decreases in the future (e.g. because of stricter regulations), hedge funds will still be a very good way to diversify a portfolio because of the low correlation with the traditional assets.
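The statement that variance-based and tail-based optimization agree under normality can be checked directly: for a normal distribution the conditional tail expectation is an affine function of mean and standard deviation, E[R | R <= q_alpha] = mu - sigma * phi(z_alpha) / alpha, so at a fixed mean, ranking portfolios by CVaR is the same as ranking them by sigma. A small self-contained check (Python, standard library only; the mean and volatility are arbitrary illustration values, not thesis data):

```python
import random
import statistics
from statistics import NormalDist

random.seed(42)
alpha = 0.05
mu, sigma = 0.01, 0.04   # illustrative monthly mean and volatility (assumptions)
returns = sorted(random.gauss(mu, sigma) for _ in range(200_000))

# Empirical CVaR in the return convention: mean return below the 5%-quantile.
k = int(alpha * len(returns))
cvar_emp = statistics.fmean(returns[:k])

# Closed form for the normal case: E[R | R <= q_alpha] = mu - sigma * phi(z)/alpha,
# i.e. CVaR is affine in (mu, sigma).
z = NormalDist().inv_cdf(alpha)
cvar_theory = mu - sigma * NormalDist().pdf(z) / alpha

print(round(cvar_emp, 4), round(cvar_theory, 4))
```

Because the closed form depends on the distribution only through mu and sigma, minimizing CVaR at a given target return must select the same portfolio as minimizing the variance, which is the equivalence claimed above for normal data.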

The implemented software features efficient algorithms and interfaces to other programming languages. It is modularly designed so that the code can easily be changed and the functionality enhanced. We think that risk measures can be comfortably explored and analyzed with this software. It would be interesting to use other kinds of data and to implement new risk measures.

Appendix A  Quadratic Utility Function Implies That Mean Variance Analysis Is Optimal

In this appendix we want to show that it is possible to express the expected utility function in terms of mean and variance, and that it is therefore optimal to apply a mean variance analysis if one uses a quadratic utility function.

The variance of a random variable W is defined in (2) as

\sigma_W^2 = E[(W - E[W])^2] = E[W^2 - 2W \cdot E[W] + E[W]^2]

Because E[\sum_{i=1}^N X_i] = \sum_{i=1}^N E[X_i] holds, we get

\sigma_W^2 = E[W^2] - E[2W \cdot E[W]] + E[W]^2

and since E[c \cdot X] = c \cdot E[X] holds, we can rewrite the variance as

\sigma_W^2 = E[W^2] - 2 \cdot E[W] \cdot E[W] + E[W]^2 = E[W^2] - (E[W])^2

Rearranging yields

E[W^2] = \sigma_W^2 + (E[W])^2

The expected value of the quadratic utility function we want to optimize is

E[U(W)] = E[W] - b \cdot E[W^2]

Here we can substitute the term derived above and get

E[U(W)] = E[W] - b \cdot (\sigma_W^2 + (E[W])^2)

With this term we have proven that, assuming a quadratic utility function, a mean variance analysis optimizes the expected utility: the expected utility depends on the distribution of W only through its mean and variance.
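The identity E[U(W)] = E[W] - b(Var[W] + E[W]^2) can also be verified numerically. In the sketch below the risk-aversion parameter b and the wealth distribution are arbitrary choices for illustration; the two sides agree up to floating-point rounding because the identity holds exactly for sample moments as well:

```python
import random
import statistics

random.seed(7)
b = 0.05                                        # assumed risk-aversion parameter
wealth = [random.gauss(1.0, 0.2) for _ in range(100_000)]

# Left-hand side: E[U(W)] evaluated directly for U(W) = W - b*W^2.
lhs = statistics.fmean(w - b * w * w for w in wealth)

# Right-hand side: the mean-variance form E[W] - b*(Var[W] + E[W]^2).
mean_w = statistics.fmean(wealth)
var_w = statistics.pvariance(wealth)
rhs = mean_w - b * (var_w + mean_w ** 2)

print(lhs, rhs)
```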

B  Equivalence Of Different VaR Definitions And Notations

Definitions and notations used in this thesis:

VaR_\alpha = \sup\{x \mid P[R_P < x] \le \alpha\}    (35)

where \alpha is expected to be in [0.01, 0.1] and x is a value of the return function, and

CVaR_\alpha = E[R_P \mid R_P \le VaR]    (36)

This notation corresponds to the left graphic of figure 36. In contrast we find in [35] and [36] the following definitions and notations:

VaR_{1-\alpha} = \inf\{x \mid P[R_P \le x] \ge \alpha\}    (37)

where \alpha is expected to be in [0.9, 0.99] and x is a value of the loss function, and

CVaR_{1-\alpha} = E[R_P \mid R_P \ge VaR]    (38)

This corresponds to the right graphic of figure 36. The formulas (35) and (36) are defined on return functions (a positive value means a high return, a negative value indicates a loss) and calculate the 5% quantile, whereas formulas (37) and (38) are defined on loss functions (a positive value means a loss, a negative value indicates a gain) and deal with the 95% quantile. The transformations of the VaR and CVaR can be expressed as

VaR_\alpha(X) = VaR_{1-\alpha}(-X)    (39)
CVaR_\alpha(X) = CVaR_{1-\alpha}(-X)    (40)

In [1], [2], [3] we can find a mixture of both notations where the same definitions as in formulas (35) and (36) are used with a negative sign for both formulas in order to comply with the sign of formulas (37) and (38).

[Figure omitted: two standard normal densities with the 5% tail shaded and VaR = 1.64 marked.]
Figure 36: The two graphics depict the situation for the two kinds of definitions of VaR and CVaR for the case of a standard normal distribution and a 5%/95% confidence level. The left graphic shows alpha = 0.05 and a return function; the right graphic depicts alpha = 0.95 and a loss function.
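The sign transformations (39) and (40) between the return-based and loss-based conventions can be verified on a simulated sample; with L = -R, the left 5%-quantile of the returns is minus the right 95%-quantile of the losses, and the two conditional tail expectations are likewise negatives of each other. A sketch (illustration only, not thesis code):

```python
import random

random.seed(3)
alpha = 0.05
returns = sorted(random.gauss(0.0, 1.0) for _ in range(100_000))
losses = sorted(-r for r in returns)

k = int(alpha * len(returns))
var_return = returns[k]                         # return-convention VaR (left tail)
var_loss = losses[len(losses) - k - 1]          # loss-convention VaR (right tail)

cvar_return = sum(returns[:k]) / k              # E[R | R <= VaR], as in (36)
cvar_loss = sum(losses[len(losses) - k:]) / k   # E[L | L >= VaR], as in (38)

print(var_return, -var_loss)    # equal up to the discrete quantile convention
print(cvar_return, -cvar_loss)
```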

C

Used R Functions

The following functions of the R programming language and environment were used for the implementation of the software system:

apply (base): Returns a vector or array or list of values obtained by applying a function to margins of an array
arima.sim (ts): Simulate from an ARIMA model
bds.test (tseries): Computes and prints the BDS test statistic for the null that 'x' is a series of i.i.d. random variables
data (fBasics): Loads specified data sets, or lists the available data sets
floor (base): Rounding of numbers
garchSim (fSeries): Univariate GARCH time series modelling
length (base): Get or set the length of vectors (including lists)
lines (base): Add connected line segments to a plot
ksgofTest (fBasics): Performs a Kolmogorov-Smirnov goodness-of-fit test
mean (base): Generic function for the (trimmed) arithmetic mean
msn.fit (sn): Fits a multivariate skew-normal (MSN) distribution to data
mst.fit (sn): Fits a multivariate skew-student-t (MST) distribution to data
plot (base): Generic function for plotting of R objects
qnorm (base): Quantile function for the normal distribution
qqPlot (fExtremes): Produces a quantile-quantile plot of two data sets
qt (base): Quantile function for the t distribution
rmsn (sn): Random number generation for the multivariate skew-normal distribution
rmst (sn): Random number generation for the multivariate skew-student distribution
rmvnorm (mvtnorm): Generates random deviates from the multivariate normal distribution
rmvt (mvtnorm): Generates random deviates from the multivariate student distribution
rnorm (base): Random generation for the normal distribution
rsn (sn): Random number generation for the skew-normal distribution
rst (sn): Random number generation for the skew-student-t distribution
rt (base): Random generation for the t distribution
runif (base): Generates random deviates from the uniform distribution
runsTest (fBasics): Performs a runs test
sum (base): Returns the sum of all the values present in its arguments
var (base): Computes the variance


D

Description Of The Portfolio Optimization System

It was our intention to do all the calculations on a common hard-/software system in order to make the analysis as useful for practical applications as possible and easy for future extensions. This justifies the following system:

• The system runs on current personal computers (3 GHz clock frequency, 1 GB memory). We do not assume the availability of a supercomputer or PC cluster.
• As software components we use R as front-end application and for some small calculations, and an optimizer module written in Fortran77.

We will now describe how we have designed the system for portfolio optimization. The optimizer is written in Fortran77 and can be executed directly from R. The full system works as follows (see figure 37): R calls the optimization routine DONLP2 and passes the needed data (id of the optimization method, asset returns, expected return of the portfolio) as parameters to the optimizer. The optimizer itself calls several subroutines that define the objective function, the equality constraints and the inequality constraints and all of their gradients. In case it is not possible to define analytic gradient functions, we have implemented a numerical gradient function.
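A numerical gradient of the kind mentioned here is typically a central finite difference. The thesis implements this inside the Fortran77 callback routines; the following is only a Python sketch of the idea, applied to the portfolio variance of a hypothetical two-asset covariance matrix:

```python
def num_gradient(f, x, h=1e-6):
    """Approximate the gradient of f: R^n -> R at x by central differences."""
    grad = []
    for i in range(len(x)):
        xp = list(x); xp[i] += h
        xm = list(x); xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad

# Example objective: portfolio variance w' * Cov * w (illustrative numbers).
cov = [[0.04, 0.01], [0.01, 0.09]]
def variance(w):
    return sum(w[i] * cov[i][j] * w[j] for i in range(2) for j in range(2))

g = num_gradient(variance, [0.5, 0.5])
print(g)   # analytic gradient is 2*Cov*w = [0.05, 0.10]
```

For a quadratic objective the central difference is exact up to rounding, which makes it a convenient cross-check for hand-coded analytic gradients.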

[Figure omitted: schema showing R passing the asset return data, the risk measure ID and the target return to the Fortran77 optimizer DONLP2, which calls subroutines for the objective function f(x), the equality constraint h1(x) and the inequality constraint g1(x), and returns the optimal weights to R.]
Figure 37: Schema of the dependencies of the optimization process.

Our intention is to develop a general purpose system that can easily be installed and extended. For this reason we have chosen a general non-linear optimizer that can be applied to any kind of problem. We are aware that it could be more time efficient to use specialized optimizers for each problem (e.g. a linear optimizer for the Conditional Value at Risk problem); however, we think that the overhead of a general optimizer is negligible in our context.

The used optimizer DONLP2 can be downloaded for free from http://ftp.mathematik.tu-darmstadt.de/pub/department/software/opti/ where it is available as Fortran or C implementation. The correct functionality of the optimizer was tested with cross-tests against the optimizer in the R package "quadprog" and the optimizer included in Microsoft Excel. In the documentation DONLP2 is described as follows:

Purpose: Minimization of an (in general nonlinear) differentiable real function f subject to (in general nonlinear) inequality and equality constraints g, h:

\min_{x \in S} f(x), \quad S = \{x \in R^n : h(x) = 0,\ g(x) \ge 0\}

Here g and h are vector-valued functions. Bound constraints are integrated in the inequality constraints g. These might be identified by a special indicator in order to simplify calculation of their gradients and also in order to allow a special treatment, known as the gradient projection technique. Also fixed variables might be introduced via h in the same manner.

Method employed: The method implemented is a sequential equality constrained quadratic programming method (with an active set technique) with an alternative usage of a fully regularized mixed constrained subproblem in case of nonregular constraints (i.e. linearly dependent gradients in the "working set"). It uses a slightly modified version of the Pantoja-Mayne update for the Hessian of the Lagrangian, variable dual scaling and an improved Armijo-type stepsize algorithm. Bounds on the variables are treated in a gradient-projection like fashion. Details can be found in [40] and [41].
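The problem shape DONLP2 solves (minimize f subject to h(x) = 0 and g(x) >= 0) can be illustrated on the simplest portfolio case: two assets under the full-investment constraint w1 + w2 = 1, for which the thesis also derives the closed-form minimum-variance weight. The sketch below (an illustration, not DONLP2 itself; the variances and covariance are made up) substitutes the equality constraint and cross-checks a crude one-dimensional golden-section search against the known closed form w1* = (s2 - s12) / (s1 + s2 - 2*s12):

```python
s1, s2, s12 = 0.04, 0.09, 0.01   # assumed asset variances and covariance

def port_var(w1):
    w2 = 1.0 - w1                 # equality constraint h(w) = w1 + w2 - 1 = 0
    return w1 * w1 * s1 + w2 * w2 * s2 + 2 * w1 * w2 * s12

# Golden-section search on [0, 1] (the bounds play the role of g(x) >= 0).
lo, hi = 0.0, 1.0
ratio = (5 ** 0.5 - 1) / 2
for _ in range(100):
    a = hi - ratio * (hi - lo)
    b = lo + ratio * (hi - lo)
    if port_var(a) < port_var(b):
        hi = b
    else:
        lo = a
w_num = (lo + hi) / 2

w_closed = (s2 - s12) / (s1 + s2 - 2 * s12)   # analytic minimum-variance weight
print(w_num, w_closed)
```

For higher dimensions and for CVaR/CDaR objectives no such substitution is available, which is why a general constrained optimizer such as DONLP2 is used.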


E  Description Of The Excel Optimizer

Optimization in Microsoft Excel begins with an ordinary spreadsheet model. The spreadsheet's formula language functions as the algebraic language used to define the model. Through the Solver's GUI, the user specifies an objective and constraints by pointing and clicking with a mouse and filling in dialog boxes. The Solver then analyzes the complete optimization model and produces the matrix form required by the optimizers. The optimizers employ the simplex, generalized-reduced-gradient and branch-and-bound methods to find an optimal solution and sensitivity information. The Solver uses the solution values to update the model spreadsheet and provides sensitivity and other summary information on additional report spreadsheets. Detailed information about the methods applied in the optimizer included in Microsoft Excel is given by Fylstra et al. [22].
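The simplex method mentioned above rests on the fact that a linear program attains its optimum at a vertex of the feasible polytope. For a toy two-variable LP this principle can be demonstrated by simply enumerating the vertices (an illustration of the idea only, not of Excel's implementation):

```python
from itertools import combinations

# maximize 3x + 2y  subject to  x + y <= 4,  x <= 3,  y <= 2,  x >= 0,  y >= 0
constraints = [  # each row (a, b, c) encodes a*x + b*y <= c
    (1, 1, 4), (1, 0, 3), (0, 1, 2), (-1, 0, 0), (0, -1, 0),
]

def intersect(c1, c2):
    # Solve the 2x2 system where both constraints hold with equality.
    (a1, b1, r1), (a2, b2, r2) = c1, c2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        return None
    return ((r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det)

def feasible(p):
    return all(a * p[0] + b * p[1] <= c + 1e-9 for a, b, c in constraints)

vertices = [p for c1, c2 in combinations(constraints, 2)
            if (p := intersect(c1, c2)) is not None and feasible(p)]
best = max(vertices, key=lambda p: 3 * p[0] + 2 * p[1])
print(best)
```

The simplex algorithm walks from vertex to vertex along improving edges instead of enumerating them all, which is what makes it practical for large models.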

F  Description Of Various Hedge Fund Styles

This section lists and explains some common hedge fund strategies. The strategies are taken from [6] and the respective volatility classification from the webpage www.magnum.com.

• Convertible Arbitrage. Expected Volatility: Low
Attempts to exploit anomalies in prices of corporate securities that are convertible into common stocks (convertible bonds, warrants and convertible preferred stocks). Managers typically buy (or sometimes sell) these securities and then hedge part or all of the associated risks by shorting the stock. Delta neutrality is often targeted. Over-hedging is appropriate when there is concern about default, as the excess short position may partially hedge against a reduction in credit quality. Convertible bonds tend to be under-priced because of market segmentation: investors discount securities that are likely to change types. If the issuer does well, the convertible bond behaves like a stock; if the issuer does poorly, the convertible bond behaves like distressed debt.

• Dedicated Short Bias. Expected Volatility: Very High
Sells securities short in anticipation of being able to re-buy them at a future date at a lower price, due to the manager's assessment of the overvaluation of the securities or the market, or in anticipation of earnings disappointments often due to accounting irregularities, new competition, change of management, etc. Often used as a hedge to offset long-only portfolios and by those who feel the market is approaching a bearish cycle.

• Emerging Markets. Expected Volatility: Very High
Invests in equity or debt of emerging (less mature) markets that tend to have higher inflation and volatile growth. Short selling is not permitted in many emerging markets, and therefore effective hedging is often not available, although Brady debt can be partially hedged via U.S. Treasury futures and currency markets.

• Equity Market Neutral. Expected Volatility: Low
Hedge strategies that take long and short positions in such a way that the impact of the overall market is minimized. Market neutral can imply dollar neutral, beta neutral or both.
– A dollar neutral strategy has zero net investment (i.e. equal dollar amounts in long and short positions).
– A beta neutral strategy targets a zero total portfolio beta (i.e. the beta of the long side equals the beta of the short side).
While dollar neutrality has the virtue of simplicity, beta neutrality better defines a strategy uncorrelated with the market return. Many practitioners of market-neutral long/short equity trading balance their longs and shorts in the same sector or industry. By being sector neutral, they avoid the risk of market swings affecting some industries or sectors differently than others.

• Long/Short Equity. Expected Volatility: Low
Invests both in long and short equity portfolios, generally in the same sectors of the market. Market risk is greatly reduced, but effective stock analysis and stock picking is essential to obtaining meaningful results. Leverage may be used to enhance returns. Sometimes uses market index futures to hedge out systematic (market) risk. Usually low or no correlation to the market; the relative benchmark index is usually T-bills.
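The distinction between dollar neutrality and beta neutrality can be illustrated with a tiny numerical example (the positions and betas below are made up; dollar exposures are in millions, shorts negative):

```python
# (dollar exposure, beta) of each position in a hypothetical long/short book.
longs  = [(2.0, 1.2), (1.0, 0.9)]
shorts = [(-1.5, 1.1), (-1.5, 0.9)]

net_dollars = sum(d for d, _ in longs + shorts)       # dollar neutrality check
net_beta = sum(d * b for d, b in longs + shorts)      # beta neutrality check

print(net_dollars)   # zero net investment: the book is dollar neutral
print(net_beta)      # residual beta: the book is NOT beta neutral
```

The book is dollar neutral but carries a residual market beta, so its return would still co-move with the market; this is exactly why beta neutrality is the stricter and more meaningful notion of market neutrality.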

• Event Driven. Expected Volatility: Moderate
Corporate transactions and special situations:
– Deal Arbitrage (long/short equity securities of companies involved in corporate transactions)
– Bankruptcy/Distressed (long undervalued securities of companies usually in financial distress)
– Multi-strategy (deals in both deal arbitrage and bankruptcy)

• Fixed Income Arbitrage. Expected Volatility: Low
Attempts to hedge out most interest rate risk by taking offsetting positions. May also use futures to hedge out interest rate risk.

• Global Macro. Expected Volatility: Very High
Aims to profit from changes in global economies, typically brought about by shifts in government policy that impact interest rates, in turn affecting currency, stock and bond markets. Participates in all major markets (equities, bonds, currencies and commodities), though not always at the same time. Uses leverage and derivatives to accentuate the impact of market moves. Utilizes hedging, but the leveraged directional investments tend to have the largest impact on performance.

• Managed Futures.
Opportunistically long and short multiple financial and/or non-financial assets. Subindexes include Systematic (long or short markets based on trend-following or other quantitative analysis) and Discretionary (long or short markets based on qualitative/fundamental analysis, often with technical input).

G  References

[1] Acerbi C., Tasche D., 2001: Expected Shortfall: a natural coherent alternative to Value at Risk, www.gloriamundi.com
[2] Acerbi C., Nordio C., Sirtori C., 2001: Expected Shortfall as a Tool for Financial Risk Management, www.gloriamundi.com
[3] Acerbi C., Tasche D., 2001: On the Coherence of Expected Shortfall, www.gloriamundi.com
[4] Albrecht P., 2003: Risk Measures, contribution prepared for: Encyclopedia of Actuarial Science, John Wiley & Sons
[5] Alexander C., 2001: Market Models, John Wiley & Sons
[6] Amenc N., Martellini L., 2002: The Brave New World of Hedge Fund Indexes, Working Paper
[7] Artzner P., Delbaen F., Eber J.-M., Heath D., 1999: Coherent Measures of Risk, Mathematical Finance 9, pp. 203-228
[8] Belaire-Franch J., Contreras-Bayarri D., 2002: The BDS Test: A Practitioner's Guide, Journal of Applied Econometrics 17, pp. 691-699
[9] Bernoulli D., 1738: Exposition of a New Theory on the Measurement of Risk
[10] Bollerslev T., 1986: Generalized Autoregressive Conditional Heteroscedasticity, Journal of Econometrics
[11] Brock W., Dechert W., Scheinkman J., 1987: A Test for Independence Based on the Correlation Dimension, University of Wisconsin Working Paper No. 8702
[12] Cheklov A., Uryasev S., Zabarankin M., 2003: Portfolio Optimization With Drawdown Constraints, Working Paper
[13] Cheridito P., Delbaen F., Kupper M., 2003: Coherent and Convex Risk Measures for Bounded Càdlàg Processes
[14] Conover W., 1999: Practical Nonparametric Statistics, John Wiley & Sons, Inc.
[15] De Giorgi E., 2002: A Note on Portfolio Selection under Various Risk Measures, Working Paper Series ISSN 1424-0459
[16] Dembo R., King A., 1992: Tracking Models and the Optimal Regret Distribution in Asset Allocation, Applied Stochastic Models and Data Analysis 8, pp. 151-157
[17] Dembo R., Freeman A., 2001: The Rules of Risk, John Wiley & Sons, Inc.
[18] Elton E., Gruber M., 1995: Modern Portfolio Theory, John Wiley & Sons, Inc.
[19] Embrechts P., Klueppelberg C., Mikosch T., 2002: Modelling Extremal Events, Springer
[20] Embrechts P., Höing A., Juri A., 2003: Using Copulae to Bound the Value-at-Risk for Functions of Dependent Risks, Finance & Stochastics 7, pp. 145-167
[21] Fishburn P., 1977: Mean-Risk Analysis with Risk Associated with Below-Target Returns, The American Economic Review 67, pp. 116-126
[22] Fylstra D., Lasdon L., Watson J., Waren A., 1998: Design and Use of the Microsoft Excel Solver, Interfaces 28, pp. 29-55
[23] Gaivoronski A., Pflug G., 2000: Value at Risk in Portfolio Optimization
[24] Grinold R., Kahn R., 1999: Active Portfolio Management, McGraw-Hill
[25] Huerlimann W., 2001: Conditional Value-at-Risk Bounds for Compound Poisson Risks and a Normal Approximation, MPS: Applied Mathematics/0201009
[26] Jia J., Dyer J.S., 1996: A Standard Measure of Risk and Risk-Value Models, Management Science 42, pp. 1691-1705
[27] Kritzman M., 1995: The Portable Financial Analyst, The Financial Analysts Journal, John Wiley & Sons
[28] Magdon-Ismail M., Atiya A., Pratap A., Abu-Mostafa Y., 2003: On the Maximum Drawdown of a Brownian Motion, Working Paper
[29] Mandelbrot B., 1963: The Variation of Certain Speculative Prices, Journal of Business 36, pp. 394-419
[30] Markowitz H., 1959: Portfolio Selection: Efficient Diversification of Investments, John Wiley & Sons
[31] de Moivre A., 1733
[32] Pedersen C., Satchell S.E., 1998: An Extended Family of Financial Risk Measures, Geneva Papers on Risk and Insurance Theory 23, pp. 89-117
[33] Peijan A., López de Prado M., 2003: Measuring Loss Potential of Hedge Fund Strategies, Working Paper, UBS Wealth Management and Business Banking
[34] Pflug G., 2000: Some Remarks on the Value-at-Risk and the Conditional Value-at-Risk, Working Paper, Department of Statistics and Decision Support Systems, University of Vienna
[35] Rockafellar R., Uryasev S., 1999: Optimization of Conditional Value-at-Risk, www.gloriamundi.com
[36] Rockafellar R., Uryasev S., 2002: Conditional Value-at-Risk for General Loss Distributions, Journal of Banking & Finance 26, www.gloriamundi.com
[37] Ross S., 1976: The Arbitrage Theory of Capital Asset Pricing, Journal of Economic Theory 13, pp. 341-360
[38] Roy A., 1952: Safety First and the Holding of Assets, Econometrica 20, pp. 431-449
[39] Schneeweiss H., 1967: Entscheidungskriterien bei Risiko, Springer, Heidelberg, Germany
[40] Spellucci P., 1998: An SQP Method for General Nonlinear Programs Using Only Equality Constrained Subproblems, Math. Prog. 82, pp. 413-448
[41] Spellucci P., 1998: A New Technique for Inconsistent Problems in the SQP Method, Math. Meth. of Oper. Res. 47, pp. 355-400, Physica Verlag, Heidelberg, Germany
[42] Stone B., 1973: A General Class of Three-Parameter Risk Measures, Journal of Finance 28, pp. 675-685
[43] Testuri C., Uryasev S., 2002: On Relation Between Expected Regret and Conditional Value-at-Risk, Working Paper, University of Florida, www.gloriamundi.com
[44] Uryasev S., 2000: Conditional Value-at-Risk (CVaR): Algorithms and Applications, University of Florida, www.ise.ufl.edu/uryasev
[45] Wang S., Young V., Panjer H., 1997: Axiomatic Characterization of Insurance Prices, Insurance: Mathematics and Economics 21, pp. 173-183
