
10/23/12 Macro‑Investment Analysis: Overview

Macro-Investment Analysis: An Overview of the Work

This is an Electronic Work-in-Progress, a form of communication that is both new for and poorly
understood by its author. It is very much in progress. This means that it will grow over time and that the
material will undergo substantial revisions. Some material may even presume that the reader has
knowledge scheduled to be included in other, as-yet-unwritten sections. The reader's (browser's?)
indulgence is requested as the overall edifice is constructed.

The Subject
It is not an easy matter to describe the subject of a Work of this sort. This is especially true in the early
stages, when it is still under construction. However, a few paragraphs from an early section may prove
useful.

Some may regard the Work as a sort of electronic Investments text, but the focus is more specific. This is
reflected in the title, which is, at this writing, unique. All three components of the title are relevant. The
focus is on techniques of analysis that can lead to sensible top-level (macro-) investment decisions.

It is helpful to personalize some of the roles that concern us. An Investor must ultimately select positions
in various types of investment vehicles. In doing so, he or she can be assisted by an Analyst.
Importantly, many of the investment vehicles used by investors are themselves packages of more
fundamental positions -- packages provided by Investment Firms.

In essence, we concentrate on a multi-level approach to investing, with the Investor at the top level,
assisted by the Analyst, a set of Investment Firms at a second level, and the securities of corporations
and government agencies at yet lower levels.

Such a multi-level approach to investing is found among both individual and institutional investors.
Individuals utilize shares of publicly available investment funds, which in turn hold securities of
individual corporations. Large institutional investors such as pension funds use commingled funds or
separately managed accounts which serve much the same purpose. Both can also make use of many
types of derivative securities, the values of which reflect the performance of major sectors of various
capital markets.

Traditionally, investments textbooks have been directed to investors who plan to build portfolios of the
securities of corporations and public agencies. However, in the real world such investors are becoming
fewer and fewer. Increasingly, the construction of portfolios from individual securities is an activity
undertaken by Investment Firms, whose goal is to provide components that may usefully be employed
by investors. The ultimate investor then focuses on finding an appropriate combination of such
components, not on building an overall portfolio security by security.

This work is designed to provide the analytic skills needed to aid such investors. It is thus explicitly
directed to the Analyst. However, knowledge of its contents should help Investors utilize and evaluate
the services offered by Analysts. In addition, Investment Firms familiar with its contents should be able
to design products that will better meet the needs of Investors.

If you want to help an investor build and manage a portfolio of mutual funds or similar vehicles in
sophisticated ways, this Work is for you. If not, it still may have much to offer. Take what you will.
www.stanford.edu/~wfsharpe/mia/over/mia_over.htm 1/2

Organization
The Work is divided into three sets of material:

Principles and Techniques
Empirical Analyses
Computer Material

Principles and Techniques


This first body of material will eventually resemble an on-line textbook. Indeed, some or all of it might
eventually appear in a traditional printed form. Except for issues of organization and exposition, the
material should be subject to relatively few changes. To oversimplify: good theory is timeless. This
material is intended to stand on its own and (eventually) to be read in the sequence shown.

Empirical Analyses
The second body of material will include estimates of parameter values, tests of hypotheses, etc. As in
most empirical research, results are subject to change as more people process more data in more ways.
Each of the analyses will be described in sufficient detail so that the reader can at least get the sense of
the methods used and the key results. Those who desire a deeper understanding should read the
discussions provided under Principles and Techniques.

Computer Material
This material includes programs, procedures, functions and spreadsheets designed to illustrate and
implement key principles and techniques. Some of the material has also been used in the Empirical
Analyses. Coverage is partial, at best, and there is no guarantee that the routines are efficient, elegant or
always correct. The reader is invited to use them at his or her peril.


Introduction
Investment Approaches
Financial Economics
Models and Paradigms


Investment Approaches

Contents:
The Focus of this Work
Other Investment Approaches

The Focus of this Work


This work has some attributes of an investments text, but it is by no means a traditional one in either
form or substance. The difference in the former is obvious. The difference in the latter is reflected in the
title, which is, at this writing, unique. All three components of the title are relevant. The focus is on
techniques of analysis that can lead to sensible top-level (macro-) investment decisions.

It is helpful to personalize some of the roles that will be of concern throughout this work. An Investor
must ultimately select positions in various types of investment vehicles. In doing so, he or she can be
assisted by an Analyst. Importantly, many of the investment vehicles used by investors are themselves
packages of more fundamental positions -- packages provided by Investment Firms.

In practice, of course, these three roles are rarely divided neatly among individuals and organizations.
Some Investors provide at least part of the analysis required for appropriate decisions. Many investment
decisions are made by intermediaries, such as pension funds, endowment funds and the like, charged
with acting in the interests of those who will ultimately be the beneficiaries of the investments
undertaken. The staffs of such organizations often provide much of the analysis needed for appropriate
decisions. Some Investment Firms provide analytic services for investors. And so on. Nonetheless, it is
useful to consider as distinct the functions of Investor, Analyst, and Investment Firm.

In essence, we concentrate on a multi-level approach to investing, with the Investor at the top level,
assisted by the Analyst, a set of Investment Firms at a second level, and the securities of corporations at
yet lower levels, as portrayed in the figure below.

Such a multi-level approach to investing is found among both individual and institutional investors.
Individuals utilize shares of publicly available investment funds (such as those called mutual funds in the
United States, unit trusts in the United Kingdom, investment companies in Japan, etc.) which in turn
hold securities of individual corporations. Large institutional investors such as pension funds use
commingled funds or separately managed accounts (provided by banks, investment management firms,
insurance companies and the like) which serve much the same purpose. Both can also make use of
many types of derivative securities (provided by banks, exchanges, investment banking firms, etc.), the
values of which reflect the performance of major sectors of various capital markets.

Macro-Investment Analyses are performed for individual investors by financial advisors, who either
charge explicitly for such services ("for-fee Advisors") or are compensated implicitly via commissions
received from Investment Firms for the sale of the latter's products. Investment brokers often serve the
same function, either for a fee based on the assets managed (using so-called "wrap accounts") or for
commissions based on the investments undertaken. Similar services are provided by insurance agents,
private bankers, and others.

Macro-Investment Analyses are performed for institutional investors by investment consulting firms.
Those who work primarily with pension funds are often called pension consultants. Most such firms
charge fees that are independent of the particular investments undertaken.

Not surprisingly, many of the techniques described in this work were initially developed by consulting
firms for institutional investors with sufficiently large investments to justify the costs associated with
sophisticated analytic methods and extensive data collection and analysis. However, the accumulation of
substantial databases, advances in knowledge and decreases in computational and communications costs
are making such procedures economically accessible to a much larger set of investors.

At this writing, there is a world-wide trend to give individuals more control over the investment of funds
designed to cover retirement expenses. This makes it imperative that ways be found to provide the
advice that will make it possible for such investment decisions to be taken rationally. One way or
another, the skills of the Analyst must be applied widely.

Traditionally, investments texts have been directed to investors who plan to build portfolios of the
securities of traditional corporations. However, in the real world such investors are becoming fewer and
fewer. Increasingly, the construction of portfolios from corporate securities is an activity undertaken by
Investment Firms whose goal is to provide components that may usefully be employed by Investors.
The ultimate investor then focuses on finding an appropriate combination of such components, not on
building an overall portfolio security by security.

This work is designed to provide the analytic skills needed to aid such Investors. It is thus explicitly
directed to the Analyst. However, knowledge of its contents should help Investors utilize and evaluate
the services offered by Analysts. In addition, Investment Firms familiar with its contents should be able
to design products that will better meet the needs of Investors.

For emphasis, we will often capitalize Investor, Analyst and Investment Firm when used in the senses
described above. This follows the legal tradition of using such notation to refer to previously "defined
terms".

Other Investment Approaches


It is useful to contrast the multi-level approach to investments emphasized in this work with some
alternatives. We present them roughly in the order in which they were introduced historically.

In simple societies, there is little distinction between savings and investment. One saves by reducing
present consumption. One invests in the hope of increasing future consumption. Thus the woodsman
who spares a tree for another year reduces consumption in the present (a long warm night by the fire) in
the hope of increasing it in the future (warmer and longer nights by the fire next year). The tree is a
productive investment -- barring floods, lightning strikes, etc., there will be more wood next year than
there is this year.

In most societies, intermediary steps connect individuals' savings with productive investments. Such
investments are usually undertaken by firms, using resources generated by the savings of individuals. In
many cases governmental agencies perform roles similar to those of business firms.

In our terminology, the individual who saves (defers consumption) is an Investor. To avoid confusion
with Investment Firms, we will use the term Business to refer to a firm or governmental agency engaged
in productive investment.

Certainly the simplest (and no doubt earliest) form of investment is that in which each business is funded
by one investor (or perhaps one family of investors). The figure below illustrates this, with I representing
an Investor and B a Business.

Note that our woodsman conforms to this structure, although the Business (tree) is not separately
identified as such.

While this simple form of investment provides the maximum possible incentive for the Investor to ensure
that the Business is run efficiently, it "puts all the Investor's eggs in one basket." It did not take long for
Investors to realize that some sort of risk-sharing could benefit all concerned. For example, at one
time, each ship sent from London to bring back spices from the Orient was financed by one merchant. If
the ship happened to sink, the investor lost everything. But if several merchants pooled their resources,
with each taking partial interests in several ships, risk could be greatly reduced, with no diminution in
overall expected return. Such pooling could be accomplished in a number of ways. One of the simpler
procedures involved the issuance of "ownership shares" (not surprisingly), with each investor holding a
diversified portfolio of shares in several ships. Considering each such ship a business, this is
diagrammed below.

Note that such an arrangement has its drawbacks. Greater separation of ownership and control is
required, and monitoring of the managers of the Businesses by owners (Investors) is more complex.
Under such conditions it may be more difficult or costly to ensure that management will act in ways that
will best serve the interests of owners. In other words: a conflict may arise between efficient corporate
governance and risk-sharing via diversification.
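
The risk reduction from pooling described above can be checked with a small calculation. The sketch below is illustrative only: the sinking probability, the payoff per unit invested, and the assumption that ships sink independently of one another are all hypothetical and do not come from the text. It compares an Investor who puts everything in one ship with one who holds equal shares in four.

```python
from math import comb, sqrt

# Hypothetical inputs (assumptions, not from the text): each ship sinks with
# probability 0.2 and pays 2.0 per unit invested if it returns.
# Sinkings are assumed to be independent of one another.
P_SINK = 0.2
GROSS_RETURN = 2.0


def payoff_distribution(n_ships, p_sink=P_SINK, gross_return=GROSS_RETURN):
    """(payoff, probability) pairs for one unit of wealth split equally over n_ships."""
    dist = []
    for survivors in range(n_ships + 1):
        prob = (comb(n_ships, survivors)
                * (1 - p_sink) ** survivors
                * p_sink ** (n_ships - survivors))
        dist.append((gross_return * survivors / n_ships, prob))
    return dist


def mean_and_sd(dist):
    """Expected payoff and standard deviation of a discrete distribution."""
    mean = sum(x * p for x, p in dist)
    variance = sum(p * (x - mean) ** 2 for x, p in dist)
    return mean, sqrt(variance)


one_ship = mean_and_sd(payoff_distribution(1))
four_ships = mean_and_sd(payoff_distribution(4))
print(f"one ship:   expected {one_ship[0]:.2f}, standard deviation {one_ship[1]:.2f}")
print(f"four ships: expected {four_ships[0]:.2f}, standard deviation {four_ships[1]:.2f}")
```

Under these assumptions the expected payoff is the same in both cases, while the standard deviation with four ships is half that with one. If sinkings were positively related (for example, through a common storm), the reduction in risk would be smaller.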

In modern versions of such an arrangement, Businesses are often structured as limited liability
corporations, with ownership claims represented by shares of corporate stock.

While direct ownership of portfolios of shares of corporate stock can provide the benefits of
diversification, it is not the only way to accomplish this. One alternative utilizes an intermediary
Investment Firm. For example, such a firm can hold claims on Businesses, with Investors holding shares
in the Investment Firm. The traditional role of banks -- gathering deposits and lending money to
business -- conforms to this model, at least in part. A less complex example is provided by a mutual fund
that buys shares of stocks in corporations and issues shares representing proportional ownership of the
resulting portfolio. Diagrammatically:

In this case the problems associated with corporate governance are mitigated, since the Investment Firm
is in a position to serve as the exclusive monitor of each of the Businesses. On the other hand, the
division of ownership of the Investment Firm among many Investors may lessen incentives for the
management of the Investment Firm to act solely in the interests of the Investors.

Perhaps most important, the use of a financial intermediary can greatly reduce the required number of
contractual arrangements. For example, if there were N investors and M businesses, direct investment
with sufficient diversification might require N*M such arrangements. However, if one financial
intermediary could provide all the needed diversification, only N+M contracts would be required.
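
The arithmetic behind the comparison above is simple enough to sketch directly. The numbers of Investors and Businesses used here are hypothetical, chosen only to show the order of magnitude of the saving.

```python
def contracts_direct(n_investors, m_businesses):
    """Every Investor contracts with every Business: N*M arrangements."""
    return n_investors * m_businesses


def contracts_via_intermediary(n_investors, m_businesses):
    """Each Investor and each Business contracts once with a single intermediary: N+M."""
    return n_investors + m_businesses


# Hypothetical sizes (assumptions for illustration only).
N, M = 1000, 50
direct = contracts_direct(N, M)
intermediated = contracts_via_intermediary(N, M)
print(direct, intermediated)
```

With these illustrative values, 50,000 direct arrangements are replaced by 1,050.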

Schemas involving financial intermediaries lie behind traditional taxonomies for classifying Businesses,
Investment Firms, and Investors. Such entities differ primarily in their use of securities. The next figure
shows this in terms of representative balance sheets, with assets on the left and claims on assets on the
right.


Many cases are, of course, far more complex, involving other assets, liabilities, cross-holdings, etc.

In the traditional finance curriculum, corporate finance courses dealt with the problems and actions of
businesses, financial institutions courses with those of financial institutions, and investments courses
with those of Investors. However, in many economies, the lines are sufficiently blurred to call into
question the desirability of such compartmentalization. Nonetheless, the taxonomy can serve to help
differentiate functions, even though a given entity may provide more than one such function.

The approach portrayed earlier, in which the Investment Firm serves as an intermediary, allows
Investors to achieve the benefits of diversification, but does not allow them to fully adapt their
investments to suit differing preferences for risk vis-à-vis return, or differing attitudes towards the
desirability of receiving return in different circumstances. Such goals can be accomplished in a number
of ways. Most importantly, Businesses can issue claims with different priorities on underlying earnings.
For example, a Business might borrow from (issue debt to) an Investment Firm and obtain the rest of its
funds from holders of common stock, promising to give the latter whatever might be left over after
making required debt payments. Investors could then allocate their money (to taste) between shares
issued by the Investment Firm and shares (stock) issued by Businesses, as shown below.


This approach allows investors to more efficiently attempt to maximize utility by choosing appropriate
combinations of risk and return and the conditions under which returns are obtained. From a societal
view, this allows for the allocation of risk among investors in the most efficient manner -- a task that can
be greatly assisted by a well-developed set of financial instruments and institutions.
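
The idea of claims with different priorities on underlying earnings can be made concrete with a short sketch. The promised debt payment and the earnings scenarios below are hypothetical, and the sketch assumes limited liability and non-negative earnings; it is meant only to show how one stream of earnings is divided between debt and stock.

```python
def split_earnings(earnings, promised_debt_payment):
    """Divide earnings between debt holders (paid first) and stockholders (paid what is left)."""
    to_debt = min(earnings, promised_debt_payment)
    to_stock = max(earnings - promised_debt_payment, 0.0)
    return to_debt, to_stock


# Hypothetical values (assumptions for illustration only).
PROMISED = 80.0
scenarios = [50.0, 100.0, 200.0]
results = {e: split_earnings(e, PROMISED) for e in scenarios}
for earnings, (to_debt, to_stock) in results.items():
    print(f"earnings {earnings:6.1f}: debt receives {to_debt:6.1f}, stock receives {to_stock:6.1f}")
```

In these scenarios the payments to debt vary far less than the payments to stock, which is what allows Investors to choose different mixes of risk and return by varying the proportions they hold of each.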

In practice, of course, the situation is infinitely more complex than this. In advanced economies there are
many different types of Investment Firms, and a given firm may offer more than one Investment
Product. A typical product is likely to involve specialization in a particular domain of investments, with
diversification across the investments within that domain. This makes it possible for many Investors to
limit their holdings to such Investment Products. In effect, the Investor holds securities of operating
businesses indirectly rather than directly, as shown here:

This is the model in which the Investor needs the tools of Macro-Investment Analysis. In such a world
the investor must (1) select desirable Investment Products and (2) allocate funds among them. This can
be portrayed in traditional accounting terms as follows:


This type of Investor serves as the focal point for the analytic techniques described in this book. While
generality can be obtained by simply defining securities issued by individual operating corporations as
Investment Products, the key idea of the approach followed here is to change the focus from the
selection of securities issued by Businesses to the selection of Investment Products (such as bank
deposits, mutual fund shares, annuities, etc.) issued by Investment Firms (banks, mutual fund
companies, insurance companies, and the like). As will be seen, analyzing the characteristics of a
complex Investment Product requires specialized techniques not needed when investors follow the
approach assumed in the typical investments textbook.


Financial Economics

Macro-Investment Analysis is part of the field generally known as Financial Economics, which is, in
turn, a specialty within the broader field of Economics. To provide a context for what will follow, it is
useful to consider (if only briefly) the domain of the financial economist.

One of the fundamental aspects of economic activity is a trade in which one party provides another
party something, in return for which the second party provides the first something else. In many such
trades, or transactions, one or both parties are human beings. If Mr. A gives Ms. B an orange and Ms. B
gives Mr. A two apples, it is a trade between two people. In other cases, only one is a human being. If a
fisherman throws a fish back in the water to get more fish a year hence, it is a trade between a person
and nature. Often the first type of trade is called an exchange, while the second is called production.

Economists generally (but not always) concern themselves with exchanges in which one of the items
traded is money. To facilitate trade, most societies establish a convention in which a particular item
serves as numeraire. Thus if dollars serve as money, one typically trades oranges for apples by (1)
trading oranges for dollars ("selling oranges"), then (2) trading dollars for apples ("buying apples"). The
terms of the first trade (e.g. $1 for 1 orange) determine the price of an orange (e.g. $1); the terms of the
second trade (e.g. $0.50 for one apple) determine the price of an apple (e.g. $0.50). Together, these
prices determine the terms of trade for an exchange of oranges for apples (e.g. 1 orange for 2 apples).

The use of money greatly simplifies trading, thus lowering transactions costs. If a society produces 100
different goods, there are 4,950 different possible "good-for-good" trades ([100x100-100]/2). With
money, only 100 prices are needed to establish all possible trading ratios.
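
Both calculations above can be reproduced in a few lines. The sketch uses the prices from the example in the text ($1 per orange, $0.50 per apple); the function names are ours and purely illustrative.

```python
from fractions import Fraction

# Prices from the example in the text, in dollars per unit.
prices = {"orange": Fraction(1), "apple": Fraction(1, 2)}


def units_received(prices, give, receive):
    """Units of `receive` obtained per unit of `give` when both trade for money."""
    return prices[give] / prices[receive]


def barter_pairs(n_goods):
    """Number of distinct good-for-good pairs: (n*n - n) / 2."""
    return (n_goods * n_goods - n_goods) // 2


apples_per_orange = units_received(prices, "orange", "apple")
print("apples per orange:", apples_per_orange)
print("pairwise trading ratios for 100 goods:", barter_pairs(100))
```

The first result is 2 apples per orange; the second is the 4,950 pairs cited above, compared with 100 money prices.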

Traditional economics focuses on exchanges in which money is one, but only one, of the items traded.
Financial economics concentrates on exchanges in which money of one type or another is likely to
appear on both sides of a trade.

In a single society with only one form of money, there would be no role for financial economics were it
not for time and uncertainty. In fact, however, both of these aspects are crucial elements in the lives of
individuals and economies.

Many decisions involve trading money now for money in the future. Such trades, be they between
people or with nature, fall in the domain of financial economics.

In many such cases, the amount of money to be transferred in the future is uncertain. Financial
economists thus deal with both time and uncertainty. Often the latter is called risk.

In many situations, agreements allow one party to make decisions at later times that can affect
subsequent transfers of money. Thus financial economists deal with contracts involving options.

Often, information can reduce or possibly eliminate the uncertainty associated with future outcomes.
Thus financial economists study the impact of information on trades involving money.

In sum, the financial economist can be distinguished from more traditional economists by his or her
concentration on monetary activities in which time, uncertainty, options and/or information play roles.
Not surprisingly, Macro-Investment Analysis requires careful attention to all four of these key elements.


Models and Paradigms

Contents:
Models
Paradigms
Plan of the Work

Models
In this work, we develop models designed to enable the Analyst to better counsel an Investor. Before
embarking on this task, it is useful to consider the role of theory in practical affairs.

This work is concerned with theory. To understand a complex financial economy, one must begin with
a simple one. All economic theory uses abstraction. Key elements of a process are investigated in the
hope that resulting implications will help illuminate the issues being addressed. In effect, the financial
economist builds a simplified model to help answer one or more questions. Not surprisingly, different
questions may be addressed most effectively with different models.

Financial economics is concerned with the terms under which financial trades take place. Determining
such terms is often called valuation. Financial economics is concerned with both the trades that people
do make and those that they should make. Valuation and analysis of trades that are made fall in the
domain of positive financial economics (analysis of what is). Analysis of trades that people should make
falls in the domain of normative financial economics (analysis of what should be).

The ultimate test of a positive model is the consistency with reality of its implications concerning the
questions asked. The ultimate test of a normative model is its ability to provide better outcomes in the
areas for which it is intended.

Often financial economists assume that people make trades optimally, given certain assumed objectives.
The implications of such behavior are then examined. In such cases, normative financial economics
provides a base for positive financial economics.

It is useful to analyze situations in which (1) trades can be made without any cost and (2) a specific body
of information is known to all members of a society. In such a world, there are no transactions costs and
anyone can swap X for Y on the same terms as he or she can swap Y for X.

In the real world, this is rarely the case. Consider foreign exchange. One might be able to buy 1 franc
for $0.21 but sell 1 franc for only $0.19. A currency dealer covers costs by bidding only $0.19 for a
franc but asking $0.21. The bid-ask spread represents the cost of transacting. The average of the bid and
ask prices (e.g. $0.20) approximates the price that might prevail in an idealized (transactions cost-free)
world.
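
The foreign exchange example can be worked through as follows. The bid and ask prices are those given in the text; expressing half the spread as a fraction of the midpoint is our own illustrative choice of cost measure, not one the text prescribes.

```python
from decimal import Decimal

# Dealer quotes from the example in the text, in dollars per franc.
bid = Decimal("0.19")   # price at which the dealer buys a franc
ask = Decimal("0.21")   # price at which the dealer sells a franc

spread = ask - bid                    # lost on buying one franc and selling it back
midpoint = (bid + ask) / 2            # approximate price in a world without transactions costs
one_way_cost = (spread / 2) / midpoint  # illustrative measure: half-spread relative to midpoint

print("spread:", spread)
print("midpoint:", midpoint)
print("one-way cost as a fraction of the midpoint:", one_way_cost)
```

Decimal arithmetic is used here only so that the cents come out exactly; ordinary floating-point numbers would give the same values up to rounding.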

Financial institutions provide transactions services, broadly construed. To do business, such an
institution must find a way to arrange a set of trades more efficiently than can those with whom it does
business. To compete effectively, an institution must also do this at least as efficiently as other
institutions.


Were there no transactions costs, there would be no financial institutions. Advances in communications,
computation, and financial economics have greatly increased the competition among such institutions
and provided accompanying decreases in transactions costs.

Paradigms
It would be convenient if every investment problem could best be analyzed with a model derived from a
single overarching paradigm (pattern). Unfortunately, this is not the case. For some issues, one paradigm
will prove more practical; for other issues, another.

One useful classification of paradigms used in financial economics identifies major types, based on the
treatment of time and outcomes. For each of these elements, one may consider discrete alternatives or a
continuum of possibilities.

The most straightforward approach treats time and outcomes as discrete. For example, one identifies
time period 0 (today), time period 1 (e.g. next year), time period 2 (two years hence), etc. Each time
period is associated with a limited number of possible outcomes. For example, good weather and bad
weather; stocks rise 10%, rise 5%, fall 5%, fall 10%, etc. The discrete time, discrete outcome approach
is often termed the time-state paradigm or the Arrow-Debreu paradigm, after the two Nobel
prize-winning authors who developed its basic characteristics.
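
A discrete time, discrete outcome description of this kind can be enumerated directly. The sketch below takes the four stock outcomes mentioned above as the only possibilities in each period and assumes, purely for illustration, that the same four outcomes are available in every period; it then lists every two-period sequence and its cumulative return.

```python
from itertools import product

# The four per-period stock outcomes mentioned in the text.
# Assumption for illustration: the same outcomes are possible in every period.
OUTCOMES = [0.10, 0.05, -0.05, -0.10]


def paths(n_periods):
    """Every possible sequence of per-period outcomes over n_periods."""
    return list(product(OUTCOMES, repeat=n_periods))


def cumulative_return(path):
    """Total return from compounding the per-period returns along one path."""
    growth = 1.0
    for r in path:
        growth *= 1.0 + r
    return growth - 1.0


two_period_paths = paths(2)
best = max(cumulative_return(p) for p in two_period_paths)
worst = min(cumulative_return(p) for p in two_period_paths)
print(len(two_period_paths), "two-period paths")
print(f"best {best:.2%}, worst {worst:.2%}")
```

The number of paths grows quickly with the number of periods (four outcomes over two periods already give sixteen), which is one reason summary measures become attractive when the number of states is large.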

A second approach retains the notion of discrete time, but treats outcomes as continuous. For example,
at time 1 the stock market might be assumed to return any amount between -50% and +100%, with any
intermediate value (such as 10.34123%) possible. To make such an approach feasible, Analysts
generally characterize prospective outcomes using probability distributions. Thus the stock market might
be assumed to offer a return characterized by a normal (bell-shaped) distribution with an expected value
(mean) of 11% and a likely range (standard deviation) of 15%. Related to the standard deviation
measure of risk is its square, the variance. The discrete time, continuous outcome approach is often
termed the mean-variance paradigm or the Markowitz paradigm after the Nobel prize-winning author
who proposed it and showed how it could be used in the investment process.
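
The normal distribution in the example above can be explored with Python's standard library. This is a minimal sketch: the mean of 11% and standard deviation of 15% come from the text, while the particular probabilities computed are our own choice of illustration.

```python
from statistics import NormalDist

# Distribution from the example in the text: mean 11%, standard deviation 15%.
market = NormalDist(mu=0.11, sigma=0.15)

variance = market.variance                 # the square of the standard deviation
prob_loss = market.cdf(0.0)                # probability that the return is below zero
prob_within_one_sd = market.cdf(0.11 + 0.15) - market.cdf(0.11 - 0.15)

print(f"variance: {variance:.4f}")
print(f"probability of a negative return: {prob_loss:.3f}")
print(f"probability of a return between -4% and 26%: {prob_within_one_sd:.3f}")
```

Note that a normal distribution assigns some probability to every return, however extreme, so it is an approximation rather than a literal description of what a stock market can deliver.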

A combination involving continuous time and discrete outcomes makes little sense and hence is not
utilized. However, considerable work in financial economics has utilized models in which both time and
outcomes are assumed to be continuous. This is generally known as the continuous time paradigm.

In practice, the mean-variance approach is widely used in applications involving investment portfolios,
and the continuous time approach in applications involving derivative contracts. Procedures deriving
from the time-state approach are also widely used for the analyses of derivatives.

Continuous-time models are of substantial importance in financial economics, and especially so in
theoretical work. However, the mathematical sophistication required for a full understanding of this
approach is substantial. Fortunately, most of the key economic aspects of issues relevant to Investors can
be understood as well or better using one or both of the alternative paradigms. From the viewpoint of the
Analyst, the continuous time formulas of practical relevance can generally be regarded as limiting cases
of related time-state formulations in which the number of time periods becomes very large and the length
of each period very small; we generally present them as such, without detailed discussion of the limiting
process.

The time-state paradigm provides considerable generality, yet its understanding requires little in the way
of mathematical sophistication. We rely on it heavily when establishing principles concerning the key
economic relationships that should be fully understood by the Analyst. The mean-variance paradigm is
more limited in scope, but is well-suited for applications in which numeric values must be estimated in
order for investment decisions to be made appropriately. Accordingly, we will devote a great deal of
attention to it as well.

Philosophically, one may consider mean-variance approaches as special cases of time-state approaches
in which the number of possible states of the world is very large and can only be practically dealt with
using summary measures based on probabilities. Given this view, it is important to begin with time-state
formulations, then move to mean-variance approaches.
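
The link between the two paradigms can be illustrated by starting from a set of states and reducing it to a mean and a standard deviation. The states and their probabilities below are hypothetical, chosen only to keep the arithmetic visible.

```python
from math import sqrt

# Hypothetical states of the world (assumptions for illustration only):
# each entry is (return in that state, probability of that state).
states = [(0.10, 0.3), (0.05, 0.3), (-0.05, 0.2), (-0.10, 0.2)]


def summarize(states):
    """Reduce a list of (return, probability) states to mean, variance and standard deviation."""
    mean = sum(r * p for r, p in states)
    variance = sum(p * (r - mean) ** 2 for r, p in states)
    return mean, variance, sqrt(variance)


mean, variance, sd = summarize(states)
print(f"mean {mean:.4f}, variance {variance:.6f}, standard deviation {sd:.4f}")
```

With four states the full list is easy to inspect; with thousands, the summary measures are often all that can practically be communicated or estimated.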

Plan of the Work


Skillful Macro-Investment Analysis requires an understanding of many aspects of Financial Economics
and an ability to apply a wide range of techniques. Organization of the requisite body of material is not a
simple task.

We have chosen an arrangement built on major themes, with an understanding that the reader who fails
to proceed in sequence does so at his or her peril.

The title of each section is deceptively short -- an alternative that seems preferable to the usual long list
of subjects separated by the usual sets of colons and semicolons. Be forewarned, however, that subjects
often appear in unexpected places and that some issues are treated in increasing detail in two or more
places.


Matrices and Programming


Matrices
Matrix Operations
MATLAB
Excel
Asset Allocation with Investment Funds


Matrices

Contents:
Matrix Algebra
Vectors
Matrices

Matrix Algebra
Finance lends itself well to calculations that use matrix algebra. To oversimplify, this term refers to
computations that involve vectors (rows or columns of numbers) and matrices (tables of numbers), as
well as scalars (single numbers). In a great many cases, the simplest way to describe a set of
relationships uses matrix algebra. Moreover, key calculations that the Analyst should perform routinely
are best made with matrix operations.

For this exposition of matrix operations, we rely to a considerable extent on the specifications of
MATLAB -- a computer program designed to efficiently perform matrix calculations. This has the very
large added advantage that the computations can in fact be performed directly by anyone with access to
the MATLAB system. Since MATLAB provides such an ideal environment for our purposes, we
describe it in a later section in considerable detail. For those who wish to use a more familiar spreadsheet
environment for computations, we also describe ways in which matrix operations can be performed in
Microsoft Excel, although in considerably less detail.

Vectors
A vector is either a row or a column of numbers. In either case, its dimension is described by giving the
number of rows first, followed by the number of columns. For example, consider p, a row vector with
the prices of two assets (say a bond and a stock):
p=
54 21

Vector p is {1*2} (pronounced "1 by 2"), since it has one row and two columns. If prices are stated in
dollars, the vector's current values indicate that one bond costs $54 and one stock costs $21. Vector p
can be said to be a "two-element row vector".

Similarly, consider n, a column vector with the number of shares of each of the securities:
n=
1
2

Vector n is {2*1}. It indicates that the investor's portfolio contains one bond and two stocks. It can be
said to be a "two-element column vector".
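
In MATLAB notation (described in a later section), the two vectors above might be entered as shown below. This is only a minimal sketch; the size function is used to confirm that the number of rows is reported first, followed by the number of columns.

```matlab
p = [54 21];   % {1*2} row vector: prices of the bond and the stock
n = [1; 2];    % {2*1} column vector: number of each security held
size(p)        % displays 1 2 (rows first, then columns)
size(n)        % displays 2 1
```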

Matrices
www.stanford.edu/~wfsharpe/mia/mat/mia_mat1.htm 1/2

A matrix is a table of numbers. Within text passages, it is conventional to denote them with bold letters.
For example, consider D, a matrix of the prices of the securities on three days of the week:

D=
54 21
55 18
56 27

Matrix D is {3*2}. As before, the number of rows is given first, followed by the number of columns. D
shows that on the first day, the bond was worth $54 and the stock was worth $21. On the second day
the bond was worth $55 and the stock $18. On the third day the bond was worth $56 and the stock $27.

In many ways, the use of the term vector is redundant. One may view a row vector as simply a {1*c}
matrix, where c is the number of columns, and a column vector as simply an {r*1} matrix, where r is the
number of rows.

A very special case is that of a {1*1} matrix -- i.e. a single number. This is often termed a scalar.

In general, we will use the term matrix in its most general form, to include full matrices, vectors and
scalars. Matrix operations are generally defined to include cases in which some or all matrices are
vectors or even scalars. The terms vector and scalar are useful primarily for communicating information
about the dimensionality of certain matrices.

While matrices are generally composed solely of numeric values, it is often desirable to think of them as
the "insides" of tables which include identifying information in the borders. Thus, matrix D might
comprise the values from the following table:
Bond Stock
Mon 54 21
Tue 55 18
Wed 56 27

Occasionally we will use the term "Table X" to refer to matrix X with identifying information appended
on the left and top borders.

From time to time, we will use notation such as {days*assets} to identify not only the size of the matrix
(number of days by number of assets) but also the nature of the information. Thus each element of D
contains a price for the day given by its row and the asset given by its column. This "curly bracket"
notation is decidedly non-standard, but its use can serve as an aid to understanding. In many cases it can
also help avoid serious errors. In some cases we will append the description to the matrix name, as in:
D{days*assets}

Note, however, that programming systems such as MATLAB or Excel would either be confused or
complain if asked to process such a description. The added information is strictly for human use.
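
One way to keep the curly-bracket description next to the matrix without confusing the programming system is to place it in a comment. The sketch below assumes MATLAB, where text following a percent sign is ignored; the variable names days and assets are ours, chosen for illustration.

```matlab
% D{days*assets}: rows are days (Mon, Tue, Wed), columns are assets (Bond, Stock)
D = [54 21; 55 18; 56 27];
[days, assets] = size(D);   % days is 3, assets is 2
```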

Mathematicians generally use single letters to represent matrices, vectors and scalars. Moreover, they
often follow a convention that uses lower-case regular fonts for scalars, lower-case bold fonts for
vectors, and upper-case bold fonts for full matrices, as we have done in the text above and will
sometimes do in subsequent text passages. In other cases we will use descriptive names for matrices. In
program segments, only a regular (computer-like) font will be used, since programming languages do
not distinguish among fonts.

10/23/12 Matrix Operations

Matrix Operations

Contents:
Matrix Multiplication
Matrix Addition
Matrix Subtraction
Other Element-by-element Operations
Matrix Inversion
Solving Simultaneous Linear Equations
The Transpose of a Matrix
Multiple Operations

Matrix Multiplication
A key matrix operation is that of multiplication.

The product of two vectors


Consider the task of portfolio valuation. This requires the multiplication of the number of shares of each
security by the corresponding price per share, then the summation of the results. A simple matrix
operation can accomplish this easily. Suppose that:

price {1*assets} =
54 21

quantity {assets*1} =
1
2

Let value be the product of price and quantity:


value = price*quantity

In this case:

value = 96

To compute the value, one multiplies matrix (here, vector) price by matrix (here, vector) quantity.

To understand this process, it is useful to represent each number by a symbol:

price =
p1 p2

quantity =
n1
n2

www.stanford.edu/~wfsharpe/mia/mat/mia_mat2.htm 1/9

value =
p1*n1 + p2*n2

The first number in price is multiplied by the first number in quantity, then the second number in price is
multiplied by the second number in quantity. The process continues until the end is reached, at which
time all the products are summed.

Rather clearly, this cannot be done unless the number of columns in the first matrix equals the number of
rows in the second. Put somewhat differently, the inner dimensions of the two matrices must be the
same. This is always required in matrix multiplication and should be checked in advance. Here:

price {1*assets} *quantity {assets*1} ===> value {1*1}

Note that the information in the curly brackets verifies that the multiplication can take place, since the
inner dimensions are the same (assets). Such information also indicates the dimensions of the answer,
which is given by the outer dimensions (here: 1 by 1).

In general, the product obtained by multiplying two matrices will have the same number of rows as the
first matrix, and the same number of columns as the second. For example:
{2*3} times {3*5} ==> {2*5}
{3*2} times {2*4} ==> {3*4}
{1*2} times {2*1} ==> {1*1}

The last case is the one in the example. More generally, multiplying a row vector times a column vector
always produces a scalar.

To repeat, it is good practice (and often necessary) to think about the dimension of an answer before
performing any matrix multiplication. When doing so, one can also check to make certain that the inner
dimensions are the same, as is required. The general scheme is:
{a*b} times {b*c} will produce {a*c}
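
The portfolio valuation above can be sketched in MATLAB as follows. The explicit check of the inner dimensions is our addition, included only to illustrate the rule; MATLAB performs the same check itself when the product is formed.

```matlab
price    = [54 21];   % {1*assets}
quantity = [1; 2];    % {assets*1}

% Inner dimensions must agree: columns of the first, rows of the second
assert(size(price, 2) == size(quantity, 1));

value = price * quantity;   % {1*1}: 54*1 + 21*2 = 96
```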

The product of a matrix and a vector


When one or more of the matrices to be multiplied is a table, the process is simply one of repeated vector
multiplications. Consider, for example, the determination of the value of a portfolio on three different
days (Monday, Tuesday, Wednesday):

Here, there are three sets of prices. The Price Table is:
Bond Stock
Mon 54 21
Tue 55 18
Wed 56 27

while the Price Matrix is:


54 21
55 18
56 27

The dimensions of Price are {days*assets} -- in this case, {3*2}.

Now, consider multiplication of Price times quantity, to obtain value:


Price {days*assets} *quantity {assets*1} ===> value {days*1}

Given the quantity vector:


Bond 1
Stock 2

The result is the column vector:


96
91
110

The first number in the result value is obtained by multiplying the vector in the top row of matrix Price
by the column vector quantity, giving the same result as before. The second number in the result is
obtained by multiplying the vector in the second row of matrix Price by the column vector quantity,
and so on. Using symbols:
Price =
p11 p12
p21 p22
p31 p32

quantity =
n1
n2

value =
p11*n1 + p12*n2
p21*n1 + p22*n2
p31*n1 + p32*n2

Recall that value is {days*1}. Hence, the associated table is:


Mon 96
Tue 91
Wed 110

The value of the portfolio was $96 on Monday, $91 on Tuesday, and $110 on Wednesday.

The product of two matrices


When two tables are multiplied, the process is simply expanded, with each column of the result obtained
by using the corresponding column of the second matrix. For example, consider the task of finding the
values of two portfolios on each of three days. In this case, Quantity is itself a matrix. In table form:
PortA PortB
Bond 1 5
Stock 2 2

In matrix form:
1 5
2 2

The product is a matrix showing the value of each portfolio on each of the three days:
Price {days*assets} *Quantity {assets*portfolios}
===> Value {days*portfolios}

In table form:

PortA PortB
Mon 96 312
Tue 91 311
Wed 110 334
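
A brief MATLAB sketch of this calculation follows. It also extracts one column of Quantity (using indexing notation covered in the MATLAB section) to show that each column of the result is simply the Price matrix times the corresponding column of Quantity.

```matlab
Price    = [54 21; 55 18; 56 27];   % {days*assets}
Quantity = [1 5; 2 2];              % {assets*portfolios}

Value = Price * Quantity;           % {days*portfolios}

% Column 1 of Value is Price times column 1 of Quantity (PortA)
colA = Price * Quantity(:, 1);      % [96; 91; 110]
```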

Matrix Addition
The sum of two matrices
If two matrices have the same dimensions, they may be added together. The result is a new matrix with
the same dimensions in which each element is the sum of the corresponding elements of the previous
matrices. For example, consider the following tables:
portA:
Bond 1
Stock 2

portB:
Bond 5
Stock 2

To find the total amounts held in the two portfolios, simply add the corresponding matrices:
portAll = portA + portB

In table form:
portAll:
Bond 6
Stock 4

The sum of a matrix and a scalar


It is also possible to add a constant to every element in a matrix. For example:
portPlus = portAll + 5

Gives:

portPlus:
Bond 11
Stock 9

Matrix Subtraction
Matrix subtraction is like addition. Each element of one matrix is subtracted from the corresponding
element of the other. If a scalar is subtracted from a matrix, the former is subtracted from every element
of the latter. For example:
portA:
Bond 1
Stock 2

portB:
Bond 5
Stock 2

portB - portA:
Bond 4
Stock 0

portB - 1:
Bond 4
Stock 1
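
The addition and subtraction examples can be reproduced with a few MATLAB statements. This is a sketch only; the names diffBA and lessOne are ours and do not appear elsewhere in the text.

```matlab
portA = [1; 2];             % bond and stock holdings in portfolio A
portB = [5; 2];             % holdings in portfolio B

portAll  = portA + portB;   % [6; 4]
portPlus = portAll + 5;     % scalar added to every element: [11; 9]
diffBA   = portB - portA;   % [4; 0]
lessOne  = portB - 1;       % scalar subtracted from every element: [4; 1]
```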

Other Element-by-element Operations


Addition and subtraction of matrices operate on an element-by-element basis. In some cases it is
desirable to perform multiplication, division or exponentiation in the same manner. We follow the
MATLAB conventions, preceding the relevant operator with a dot (period) to indicate that such an
element-by-element operation is desired.

Element-by-element operations with matrices


Here are examples involving vectors:
portA:
Bond 1
Stock 2

portB:
Bond 5
Stock 2

portA .* portB:
Bond 5
Stock 4

portA ./ portB:
Bond 0.2
Stock 1.0

portB .^ portA:
Bond 5
Stock 4

Element-by-element operations with a matrix and a scalar

Element-by-element operations can also be performed with a matrix and a scalar. For example:
portA .* 5:
Bond 5
Stock 10

portA ./ 5:
Bond 0.2
Stock 0.4

portA .^ 3:
Bond 1
Stock 8
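
The element-by-element examples above translate directly into MATLAB. The sketch below uses result names of our own choosing.

```matlab
portA = [1; 2];
portB = [5; 2];

prod_ab  = portA .* portB;   % [5; 4]
ratio_ab = portA ./ portB;   % [0.2; 1]
pow_ba   = portB .^ portA;   % [5^1; 2^2] = [5; 4]
cubed    = portA .^ 3;       % [1; 8]

% By contrast, portA * portB (no dot) is an error: the inner dimensions
% of {2*1} times {2*1} do not agree.
```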

Matrix Inversion
Thus far, we have not discussed matrix division; only array division. There is a matrix construct similar
to that of division, and it is central to much of the work of the Analyst. The key ingredient is the use of
the inverse of a matrix, to which we now turn.

First, a few preliminaries.

A square matrix has the same number of rows and columns. An identity matrix is a square matrix with
ones on the diagonal from upper left to lower right and zeros elsewhere. For example:
I=
1 0 0
0 1 0
0 0 1

Such a matrix is often denoted I.

The product of an identity matrix (of the right size) and a column vector is the column vector, as can be
seen by applying the rules for matrix multiplication. Thus, if:

v=
3
4
5

I*v ==> v

(read: I times v gives v).

More generally, pre-multiplying any matrix M by an identity matrix that has as many columns as M has
rows will return the original matrix:

I*M ==> M

as can be seen by working through the operations involved in matrix multiplication.

The inverse of a square matrix is a matrix of the same size that, when multiplied by the matrix, gives an
identity matrix of the same size. The inverse of a matrix is sometimes written with a "-1" superscript. We
use instead the more computer-friendly MATLAB form:

inv(M)

where M is a square matrix.

By definition:

inv(M)*M = I

Note that only square matrices can have inverses (although not all do).

To see why matrix inversion is similar to division, consider a {1*1} matrix -- i.e. a scalar -- with a value
of 5. The identity matrix of the same size will also be a scalar, in this case the single value 1. From this it
follows that the inverse of the original matrix (scalar) will be the reciprocal of its value. Thus:

(1/5)*5 = 1

Multiplication by the inverse of a matrix is like dividing by the matrix, except this is strictly true only if
the matrix is {1*1}.
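
The definition can be checked numerically. In the sketch below the matrix M is a hypothetical example of ours; eye(n) is MATLAB's function for building an identity matrix with n rows and columns. Because the computer works with finite precision, the product may differ from the identity matrix by a very small rounding error.

```matlab
M    = [2 1; 1 3];     % a square matrix, chosen only for illustration
Minv = inv(M);         % its inverse
check = Minv * M;      % approximately the {2*2} identity matrix

I2  = eye(2);                         % {2*2} identity matrix
err = max(max(abs(check - I2)));      % largest deviation; tiny if all is well
```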

Solving Simultaneous Linear Equations


Matrix inversion is often used to solve a set of simultaneous linear equations. Consider a situation in
which there are two states of the world ("weather is good", "weather is bad") and two securities (Bond,
Stock). Matrix Payoff {states*assets} shows the payments made by each security in each state of the
world. Vector quantity {assets*1} shows the composition of a portfolio. Vector result {states*1}
shows the payments that will be received from the portfolio in each possible state of the world. Below,
we show all three in table form:

Payoff:
Bond Stock
good 60 40
bad 60 10

quantity:
Bond 1
Stock 2

result = Payoff*quantity:
good 140
bad 80

Thus the portfolio will provide $140 if the weather is good. If the weather is bad it will only provide
$80.

Now, assume that an investor would like to receive $240 if the weather is good and $150 if the weather
is bad. The problem is to determine the portfolio (quantity) that will produce the desired payment vector.

Consider the equation for the computation:


Payoff*quantity = result

Note that Payoff is square, so it is possible to compute its inverse, barring complications to be discussed
later. We multiply both sides of the equation by this inverse (a "legal" matrix operation):

inv(Payoff)*Payoff*quantity = inv(Payoff)*result

But the product of the inverse and the original matrix is the identity matrix, so:
I*quantity = inv(Payoff)*result

But the product of an identity matrix and a vector is the vector. Thus:
quantity = inv(Payoff)*result

This is precisely what we want -- an equation for a portfolio (quantity) that will provide the desired set
of cash flows (result)!

The three components are shown below, with the resulting values shown in bold:

result:
good 240
bad 150

inv(Payoff):
-0.0056 0.0222
0.0333 -0.0333

quantity:
Bond 2
Stock 3

Thus the desired result can be achieved with a portfolio of 2 bonds and 3 stocks.
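
The derivation can be sketched in MATLAB as follows. The second calculation uses MATLAB's left-division (backslash) operator, which solves the same equations without forming the inverse explicitly; the name quantity2 is ours, used only to keep the two answers apart.

```matlab
Payoff = [60 40; 60 10];   % {states*assets}
result = [240; 150];       % {states*1}: desired payments

quantity  = inv(Payoff) * result;   % {assets*1}: approximately [2; 3]
quantity2 = Payoff \ result;        % same answer via left division
```

Either result can be verified by computing Payoff*quantity and comparing it with result.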

Any set of simultaneous linear equations for which there is a solution can be solved in this manner. It
may seem that the requirement that the matrix of coefficients be square is overly restrictive. However, to
solve a set of such equations requires precisely as many equations as there are unknowns, so the matrix
of "left-hand sides" (here, Payoff) must have as many rows (equations) as it does columns (variables).

Unfortunately, sometimes this won't work. It is impossible to take the inverse of some matrices, even
though they are square. In such cases the matrix in question is said to be singular. In typical investment
applications this will occur when a strategy is not truly independent and can be provided with some
combination of other included strategies. When this occurs, the programming system being used is likely
to complain that it cannot take the needed inverse because the matrix in question is singular (or very
nearly so). This is a signal that the economics of the original problem formulation need to be re-
examined.
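
As an illustration, consider a hypothetical payoff matrix in which the second asset pays exactly twice as much as the first in every state, so that it offers nothing the first does not. One way to detect the problem before attempting an inverse is MATLAB's rank function, as in the sketch below; the variable names are ours.

```matlab
% Hypothetical payoffs: column 2 is exactly twice column 1
Payoff = [60 120; 30 60];   % {states*assets}

r = rank(Payoff);                    % number of independent columns
is_singular = r < size(Payoff, 1);   % true: fewer than needed for an inverse
```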

The Transpose of a Matrix


It is not unusual to find that a matrix is the "wrong way around" for a needed calculation. More
precisely, its rows should be columns and its columns should be rows. Happily, there is a standard
operation that "turns around" a matrix (or vector).

The transpose of a matrix is, in effect, the matrix rotated in this manner. For example, if M is:
1 2 3
4 5 6

then M' (read: M-prime or M-transpose) is:


1 4
2 5
3 6

This is sometimes denoted by appending a "T" as a superscript after M , but we will use the MATLAB
version M'.
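
A short MATLAB sketch follows. The second part is a hypothetical case of our own in which prices happen to be stored as a column vector and must be "turned around" before a portfolio can be valued.

```matlab
M  = [1 2 3; 4 5 6];   % {2*3}
Mt = M';               % {3*2}: [1 4; 2 5; 3 6]

p_col = [54; 21];      % prices stored as a column, {assets*1}
n     = [1; 2];        % quantities, {assets*1}
value = p_col' * n;    % {1*assets}*{assets*1} ==> 96
```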

Multiple Operations
To facilitate exposition, we have generally restricted our examples to one matrix or array operation.
Sometimes we have put the result on the left; and sometimes on the right. Moreover, we have used an
arrow when it appeared useful and an equality sign at other times. When writing commands to be
executed by a programming system, of course, rather strict rules of syntax must be followed. Generally,
the result must be written first, followed by an equality sign, followed by an expression indicating the
desired computations. Such expressions can include multiple matrix and/or array operations, if desired.
For example:
D = inv(A)*(b*c)

This would be perfectly legal if the dimensions of A, b and c were appropriate. The sense of the
equality sign is that of assignment. Thus the statement really says: "D should be assigned the result
obtained by multiplying the inverse of A times the product of b and c."

Statements such as this, which are designed to be operated on by a programming system, are generally
written without bold fonts, since such subtleties would be lost on the processor, even if they could be
presented to it.
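
To make the example concrete, the sketch below supplies hypothetical values for A, b and c whose dimensions are appropriate. The comment traces the dimensions of each intermediate result.

```matlab
A = [2 0; 0 4];   % {2*2}, invertible (values chosen for illustration)
b = [1; 2];       % {2*1}
c = [3 5];        % {1*2}

% {2*2} times ({2*1} times {1*2}) ==> {2*2} times {2*2} ==> {2*2}
D = inv(A) * (b * c);   % [1.5 2.5; 1.5 2.5]
```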

10/23/12 MATLAB

MATLAB

Contents:
Introduction
MATLAB Versions
Matrices as Fundamental Objects
Matrix Operations
Assignment Statements
Case Sensitivity
Immediate and Deferred Execution
Showing Values
Initializing Matrices
Making Matrices from Matrices
Using Portions of Matrices
Text Strings
Matrix and Array Operations
Using Functions
Logical and Relational Operations on Matrices
Sorting Matrices
Controlling Execution Flow
Writing Functions
Comments and Help
Data Input and Output
MATLAB Function Library

Introduction
MATLAB stands for Matrix Laboratory. According to The Mathworks, its producer, it is a "technical
computing environment". We will take the more mundane view that it is a programming language.

This section covers much of the language, but by no means all. We aspire at the least to promote a
reasonable proficiency in reading procedures that we will write in the language, but choose to address
this material to those who wish to use our procedures and write their own programs.

MATLAB Versions
Versions of MATLAB are available for almost all major computing platforms. Our material was
produced and tested on the version designed for the Microsoft Windows environment. The vast majority
of it should work with other versions, but no guarantees can be offered.

Of particular interest are the Student Versions of MATLAB. Prices are generally below $100. These
systems include most of the features of the language, but no matrix can have more than 8,192 elements,
with either the number of rows or columns limited to 32. For many applications this proves to be of no
consequence. At the very least, one can use a student version to experiment with the language.

www.stanford.edu/~wfsharpe/mia/mat/mia_mat3.htm 1/16

The Student Editions are sold as books with disks enclosed. They are published by Prentice-Hall and
can be ordered through bookstores.

In addition to the MATLAB system itself, Mathworks offers sets of Toolboxes, containing MATLAB
functions for solving a number of important types of problems. Of particular interest to us is the
optimization toolbox, which will be discussed in a later section.

Matrices as Fundamental Objects


MATLAB is one of a few languages in which each variable is a matrix (broadly construed) and
"knows" how big it is. Moreover, the fundamental operators (e.g. addition, multiplication) are
programmed to deal with matrices when required. And the MATLAB environment handles much of the
bothersome housekeeping that makes all this possible. Since so many of the procedures required for
Macro-Investment Analysis involve matrices, MATLAB proves to be an extremely efficient language
for both communication and implementation.

Matrix Operations
Consider the following MATLAB expression:
C=A+B

If both A and B are scalars (1 by 1 matrices), C will be a scalar equal to their sum. If A and B are row
vectors of identical length, C will be a row vector of the same length, with each element equal to the
sum of the corresponding elements of A and B. Finally, if A and B are, say, {3*4} matrices, so will C,
with each element equal to the sum of the corresponding elements of A and B.

In short the symbol "+" means "perform a matrix addition". But what if A and B are of incompatible
sizes? Not surprisingly, MATLAB will complain with a statement such as:
??? Error using ==> +
Matrix dimensions must agree.

So the symbol "+" means "perform a matrix addition if you can and let me know if you can't".

Assignment Statements
MATLAB uses a pattern common in many programming languages for assigning the value of an
expression to a variable. The variable name is placed on the left of an equal sign and the expression on
the right. The expression is evaluated and the result assigned to the variable name. In MATLAB, there
is no need to declare a variable before assigning a value to it. If a variable has previously been assigned
a value, the new value overrides the predecessor.

This may sound obvious, but consider that the term "value" now includes information concerning the
size of the matrix as well as its contents. Thus if A and B are of size {20*30} the statement:
C=A+B

creates a variable named C that is also {20*30} and fills it with the appropriate values. If C already
existed and was, say {20*15} it would be replaced with the required {20*30} matrix. In MATLAB,
unlike some languages, there is no need to "pre-dimension" or "re-dimension" variables. It all happens
without any explicit action on the part of the user.
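
The following sketch illustrates the point. The matrices are filled with placeholder values using the built-in functions ones and zeros; only their sizes matter here.

```matlab
A = ones(20, 30);       % placeholder {20*30} matrix of ones
B = 2 * ones(20, 30);   % placeholder {20*30} matrix of twos
C = zeros(20, 15);      % C starts out as {20*15}

C = A + B;              % C is replaced by a {20*30} matrix
size(C)                 % displays 20 30
```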

Case Sensitivity
MATLAB variable names are normally case-sensitive. Thus variable C is different from variable c. A
variable name can have up to 19 characters, including letters, numbers and underscores. While it is
tempting to use names such as FundReturns it is safer to choose instead fund_returns or to use the
convention from the C language of capitalizing only second and subsequent words, as in fundReturns.
In any event, adopt a simple set of naming conventions so that you won't write one version of a name
in one place and another later. If you do so, you may get lucky (e.g. the system will complain that you
have asked for the value of an undefined variable) or you may not (e.g. you will assign the new value to
a newly-created variable instead of the old one desired). In programming languages there are always
tradeoffs. You don't have to declare variables in advance in MATLAB. This avoids a great deal of
effort, but it allows nasty, difficult-to-detect errors to creep into your programs.
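
The unlucky case can be sketched as follows, using hypothetical return figures. The second statement was meant to update the first variable, but the change of case silently creates a new one, and no error is reported.

```matlab
fundReturns = [0.02 0.01];   % hypothetical returns
fundreturns = [0.05 0.03];   % intended as an update, but the lower-case "r"
                             % creates a second, separate variable

fundReturns                  % still shows the original values 0.02 0.01
```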

Immediate and Deferred Execution


When MATLAB is invoked, the user is presented with an interactive environment. Enter a statement,
press the carriage return ("ENTER") and the statement is immediately executed. Given the power that
can be packed into one MATLAB statement, this is no small accomplishment. However, for many
purposes it is desirable to store a set of MATLAB statements for use when needed.

The simplest form of this approach is the creation of a script file: a set of commands in a file with a name
ending in .m (e.g. do_it.m). Once such a file exists and is stored on disk in a directory that MATLAB
knows about (i.e. one on the "MATLAB path"), the user can simply type:
do_it

at the prompt in interactive mode. The statements will then be executed.

Even more powerful is the function file; this is also a file with an .m extension, but one that stores a
function. For example, assume that the file val_port.m, stored in an appropriate directory, contains a
function to produce the value of a portfolio, given a vector of holdings and a vector of prices. In
interactive mode, one can then simply type:
v = val_port(holdings, prices);

MATLAB will realize that it doesn't have a built-in function named val_port and search the relevant
directories for a file named val_port.m, then use the function contained in it.

Whenever possible, you should try to create "m-files" to do your work, since they can easily be re-used.

Showing Values
If at any time you wish to see the contents of a variable, just type its name. MATLAB will do its best,
although the result may take some space if the variable is a large matrix.

MATLAB likes to do this and will tell you what it has produced after an assignment statement unless
you request otherwise. Thus if you type:
C=A+B

MATLAB will show you the value of C. This may be a bit daunting if C is, say, a 20 by 30 matrix. To
suppress this, put a semicolon at the end of any assignment statement. For example:
C = A + B;

Initializing Matrices
If a matrix is small enough, one can provide initial values by simply typing them in. For example:
a = 3;
b = [ 1 2 3];
c = [ 4 ; 5 ; 6];
d = [ 1 2 3 ; 4 5 6];

Here, a is a scalar, b is a {1*3} row vector, c a {3*1} column vector, and d is a {2*3} matrix. Thus,
typing "d" produces:
d=
1 2 3
4 5 6

The system for indicating matrix contents is very simple. Values separated by spaces are to be on the
same row; those separated by semicolons are to be on separate rows. All values are enclosed in
square brackets.

Making Matrices from Matrices


The general scheme for initializing matrices can be extended to include matrices as components. For
example:
a = [1 2 3];
b = [4 5 6];
c = [a b];

gives:
c=
1 2 3 4 5 6

While:
d = [a ; b]

gives:
d=
1 2 3
4 5 6

Matrices can easily be "pasted" together in this manner -- a process that is both simple and easily
understood by anyone reading a procedure (including its author). Of course, the sizes of the matrices
must be compatible. If they are not, MATLAB will tell you.

Using Portions of Matrices


Frequently one wishes to reference only a portion of a matrix. MATLAB provides simple and powerful
ways to do so.

To reference a part of a matrix, give the matrix name followed by parentheses with expressions
indicating the portion desired. The simplest case arises when only one element is wanted. For example,
using d in the previous section:
d(1,2) equals 2
d(2,1) equals 4

In every case the first parenthesized expression indicates the row (or rows), while the second expression
indicates the column (or columns). If a matrix is, in fact, a vector, a single expression may be given to
indicate the desired element, but it is often wise to give both row and column information explicitly,
even in such cases.

MATLAB's real power comes into play when more than a single element of a matrix is wanted. To
indicate "all the rows" use a colon for the first expression. To indicate "all the columns", use a colon for
the second expression. Thus, with:
d=
1 2 3
4 5 6

d(1,:) equals
1 2 3

d(:,2) equals
2
5

In fact, you may use any expression in this manner as long as it evaluates to a vector of valid row or
column numbers. For example:
d(2,[2 3]) equals
5 6
d(2, [3 2]) equals
6 5

Variables may also be used as "subscripts". Thus:


if
z = [2 3]
then
d(2,z) equals
5 6

Particularly useful in this context (and others) is the construct that uses a colon to produce a string of
consecutive integers. For example:
the statement:
x = 3:5
produces
x=
3 4 5

Thus:
d(1, 1:2) equals
1 2
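
Applied to the price matrix used in earlier sections, these constructs pull out the prices for a given day, the prices of a given asset, or a block of days. The sketch below uses result names of our own choosing.

```matlab
D = [54 21; 55 18; 56 27];   % {days*assets}

tue     = D(2, :);     % both assets on the second day: [55 18]
stock   = D(:, 2);     % the stock on every day: [21; 18; 27]
lastTwo = D(2:3, :);   % the second and third days: [55 18; 56 27]
```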

Text Strings

MATLAB is wonderful with numbers. It deals with text but you can tell that its heart isn't in it.

A variable in MATLAB is one of two types: numeric or string. A string matrix is like any other, except
the elements in it are interpreted as ASCII numbers. Thus the number 32 represents a space, the number
65 a capital A, etc. To create a string variable, enclose a string of characters in "single" quotation marks
(actually, apostrophes), thus:
stg = 'This is a string';

Since a string variable is in fact a row vector of numbers, it is possible to create a list of strings by
creating a matrix in which each row is a separate string. As with all standard matrices, the rows must be
of the same length. Thus:
the statement
x = ['ab' ; 'cd']
produces:
x=
ab
cd

while
x = ['ab' 'cd']
produces:
x=
abcd
as always.
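
The numeric nature of strings can be seen with the conversion functions double and char, which we assume are available (they are in current MATLAB releases; the exact function names may differ in older versions). The sketch also shows a list of names padded with a trailing space so that every row has the same length.

```matlab
stg   = 'AB C';
codes = double(stg);   % character codes: 65 66 32 67
back  = char(codes);   % converts the codes back to 'AB C'

names = ['Bond '; 'Stock'];   % 'Bond ' is padded to 5 characters
size(names)                   % displays 2 5
```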

Matrix and Array Operations


The Mathworks uses the term matrix operation to refer to standard procedures such as matrix
multiplication. The term array operation is reserved for element-by-element computations.

Matrix Operations
Matrix transposition is as easy as adding a prime (apostrophe) to the name of the matrix. Thus:
if:
x=
1 2 3
then:
x' =
1
2
3

To add two matrices of the same size, use the plus (+) sign. To subtract one matrix from another of the
same size, use a minus (-) sign. If a matrix needs to be "turned around" to conform, use its transpose.
Thus, if A is {3*4} and B is {4*3}, the statement:

C=A+B

will get you the message:


??? Error using ==> +
Matrix dimensions must agree.

while:

C = A + B'

will get you a new matrix.

There is one case in which addition or subtraction works when the components are of different sizes. If
one is a scalar, it is added to or subtracted from all the elements in the other.

Matrix multiplication is indicated by an asterisk (*), commonly regarded in programming languages as a
"times sign". With one exception the usual rules apply: the inner dimensions of the two operands must
be the same. If they are not, you will be told so. The one allowed exception covers the case in which
one of the components is a scalar. In this instance, the scalar value is multiplied by every element in the
matrix, resulting in a new matrix of the same size.

MATLAB provides two notations for "matrix division" that provide rapid solutions to simultaneous
equation or linear regression problems. They are better discussed in the context of such problems.

Array Operations
To indicate an array (element-by-element) operation, precede a standard operator with a period (dot).
Thus:

if x =
1 2 3
and y =
4 5 6
then:
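x.*y =
4 10 18
the element-by-element ("dot") product of x and y. This is not the scalar inner product that is also
commonly called a dot product.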

You may divide all the elements in one matrix by the corresponding elements in another, producing a
matrix of the same size, as in:
C = A ./ B

In each case, one of the operands may be a scalar. This proves handy when you wish to raise all the
elements in a matrix to a power. For example:
if x =
1 2 3
then:
x.^2 =
1 4 9

MATLAB array operations include multiplication (.*), division (./) and exponentiation (.^). Array
addition and subtraction are not needed (and in fact are not allowed), since they would simply duplicate
the operations of matrix addition and subtraction.
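
The following sketch gathers the three array operators in one place and contrasts them with an ordinary matrix product, using the same vectors as the examples above:

```matlab
x = [1 2 3];
y = [4 5 6];

p = x.*y;    % element-by-element product:  4 10 18
q = x./y;    % element-by-element quotient
s = x.^2;    % every element squared:       1  4  9
d = x*y';    % matrix product of a row and a column: the scalar 32
```

Note that the array operations return vectors of the same size as x, while the matrix product of a {1*3} row and a {3*1} column collapses to a single number.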

Using Functions
MATLAB has a number of built-in functions -- many of which are very powerful. Some provide one
(matrix) answer; others provide two or more.

You may use any function in an expression. If it returns one answer, that answer will be used. The sum
function provides an example:
if x =
1
2
3
then the statement:
y = sum(x) + 10
will produce:
y=
16

Some functions, such as max, provide more than one answer. If such a function is included in an
expression, only the first answer will be used. For example:

if x =
1 4 3
the statement:
z = 10 + max(x)
will produce:
z=
14

To get all the answers from a function that provides more than one, use a multiple assignment statement
in which the variables that are to receive the answers are listed to the left of the equal sign, enclosed in
square brackets, and the function is on the right. For example:

if x =
1 4 3
the statement:
[y n] = max(x)
will produce:
y=
4
n=
2

In this case, y is the maximum value in x, and n indicates the position in which it was found.

Many of MATLAB's built-in functions, such as sum, min, max, and mean, have natural interpretations
when applied to a vector. If a matrix is given as an argument to such a function, its procedure is applied
separately to each column, and a row vector of results is returned. Thus:
if x =
1 2 3
4 5 6
then:
sum(x) =
5 7 9
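
A brief sketch of this column-by-column behavior, using the same matrix. Applying a function to the transpose is one simple way to obtain row results instead, and nesting two calls gives a grand total:

```matlab
x = [1 2 3; 4 5 6];

col_sums = sum(x);        % one sum per column:   5 7 9
col_avgs = mean(x);       % one mean per column:  2.5 3.5 4.5
[m, r]   = max(x);        % column maxima and the rows in which they occur
row_sums = sum(x');       % transpose first to sum across each row: 6 15
total    = sum(sum(x));   % sum of the column sums: 21
```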

Some functions provide no answers per se. For example, to plot a vector y against a vector x, simply use
the statement:
plot(x,y)

which will produce the desired cross-plot.

Note that in this case, two arguments (the items in the parentheses after the function name) were
provided as inputs to the function. Each function needs a specific number of inputs. However, some
have been programmed to react appropriately when fewer are given. For example, to plot y against
(1,2,3...), you can use the statement:

plot(y)

There are many built-in functions in MATLAB. Among them, the following are particularly useful for
Macro-Investment Analysis:

ones     : ones matrix
zeros    : zeros matrix
size     : size of a matrix
diag     : diagonal elements of a matrix
inv      : matrix inverse
rand     : uniformly distributed random numbers
randn    : normally distributed random numbers
cumprod  : cumulative product of elements
cumsum   : cumulative sum of elements
max      : largest component
min      : smallest component
sum      : sum of elements
mean     : average or mean value
median   : median value
std      : standard deviation
sort     : sort in ascending order
find     : find indices of nonzero entries
corrcoef : correlation coefficients
cov      : covariance matrix

Not listed, but of great use, are the many functions that provide plots of data in either two or three
dimensions, as well as a number of more specialized functions. However, this list should serve to whet
the Analyst's appetite. The full list of functions and information on each one can be obtained via
MATLAB's on-line help system.

Logical and Relational Operations on Matrices


MATLAB offers six relational operators:

< : less than
<= : less than or equal to
> : greater than
>= : greater than or equal to
== : equal
~= : not equal

Note carefully the difference between the double equality and the single equality. Thus A==B should be
read "A is equal to B", while A=B should be read "A should be assigned the value of B". The former is
a logical relation, the latter an assignment statement.

Whenever MATLAB encounters a relational operator, it produces a one if the expression is true and a
zero if the expression is false. Thus:
the statement:
x = 1 < 3 produces: x=1, while
x = 1 > 3 produces: x=0

Relational operators can be used on matrices, as long as they are of the same size. Operations are
performed element-by-element, resulting in a matrix with ones in positions for which the relation was true
and zeros in positions for which the relation was false. Thus:

if A =
1 2
3 4
and B =
3 1
2 2
the statement:
C=A>B
produces:
C=
0 1
1 1

One or both of the operands connected by a relational operator can be a scalar. Thus:
if A =
1 2
3 4
the statement:
C=A>2
produces:
C=
0 0
1 1

One may also use logical operators, of which there are three:

& : and
| : or
~ : not

Each works with matrices on an element-by-element basis and conforms to the ordinary rules of logic,
treating any non-zero element as true and any zero element as false.

Relational and logical operators are used frequently with If statements (described below) and scalar
variables, as in more mundane programming languages. But the ability to use them with matrices offers
major advantages in some Investment applications.
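
As one possible illustration of such an application, the sketch below screens a set of funds on two criteria at once. The return and risk figures and the cutoff values are hypothetical and serve only to show the mechanics.

```matlab
returns = [0.12 -0.03 0.07 0.01];   % hypothetical returns for four funds
risks   = [0.20  0.05 0.09 0.02];   % hypothetical standard deviations

good  = (returns > 0.05) & (risks < 0.15);   % 1 where both conditions hold
picks = find(good);                          % positions of the qualifying funds
n_pos = sum(returns > 0);                    % count of funds with positive returns
```

Because true is represented by one and false by zero, summing the result of a relational operation counts the elements that satisfy it.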

Sorting Matrices
To sort a matrix in ascending order, use the sort function. If the argument is a vector, the result will be a
new vector with the items in the desired order. If it is a matrix, the result will be a new matrix in which
each column will contain the contents of the corresponding column from the old matrix, in ascending
order. Note that in the latter case, each column is, in effect, sorted separately. Thus:
if x =
1 5
3 2
2 8
the statement:
y=sort(x)
will produce:
y=
1 2
2 5
3 8

To obtain a record of the rows from which each of the sorted elements came, use a multiple assignment
to get the second output of the function. For the case above:
the statement:
[y r] = sort(x)
would produce y as before and
r=
1 2
3 1
2 3

Thus the second item in the sorted list in column 1 came from row 3, etc.
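
The record of row positions can be used to rebuild a sorted column from the original matrix, as in this short sketch based on the example above:

```matlab
x = [1 5; 3 2; 2 8];
[y, r] = sort(x);        % y holds the sorted columns, r the source rows

col1 = x(r(:,1), 1);     % pick the rows of column 1 in sorted order
```

Here col1 matches the first column of y. The same indexing could be applied to a companion matrix (of names, for instance) to keep it in step with the sorted values.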

Controlling Execution Flow


It is possible to do a great deal in MATLAB by simply executing statements involving matrix
expressions, one after the other. However, there are cases in which one simply must substitute some
non-sequential order. To facilitate this, MATLAB provides three relatively standard methods for
controlling program flow: For Loops, While Loops, and If Statements.

For Loops
The most common use of a For Loop arises when a set of statements is to be repeated a fixed number of
times, as in:

for j= 1:n
.......
end

There are fancier ways to use For Loops, but for our purposes, the standard one suffices.

While Loops

A While Loop contains statements to be executed as long as a stated condition remains true, as in:
while x > 0.5
.......
end

It is, of course, crucial that at some point a statement will be executed that will cause the condition in the
While statement to be false. If this is not the case, you have created an infinite loop -- one that will go
merrily on until you pull the plug.

For readability, it is sometimes useful to create variables for TRUE and FALSE, then use them in a
While Loop. For example:

true = 1==1;
false = 1==0;
.....
done = false;
while ~done
........
end

Of course, somewhere in the While loop there should be a statement that will at some point set done
equal to true.
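
A complete, if contrived, sketch of this pattern follows. It counts the years needed for an investment to double at a given rate; the starting value and the rate are assumptions made up for the example.

```matlab
value = 100;     % hypothetical starting value
rate  = 0.10;    % hypothetical annual return
years = 0;

done = 0;                        % zero is treated as false
while ~done
    value = value*(1 + rate);    % grow the investment for one year
    years = years + 1;
    done  = value >= 200;        % becomes 1 (true) once the value has doubled
end
```

The last statement in the loop is the one that eventually makes the condition false, so the loop is certain to terminate.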

If Statements
An If Statement provides a method for executing certain statements if a condition is true and other
statements (or none) if the condition is false. For example:

if x > 0.5
........
else
.......
end

In this case, if x is greater than 0.5 the first set of statements will be executed; if not, the second set will
be executed.

A simpler version omits the "else section", as in:


if x > 0.5
........
end

Here, the statements will be executed if (but only if) x exceeds 0.5.

Nesting
All three of these structures allow nesting, in which one type of structure lies within another. For
example:

for j = 1:n
for k = 1:n
if x(j,k) > 0.5
x(j,k) = 1.5;
end
end
end

The indentation is for the reader's benefit, but highly recommended in this and other situations.
MATLAB will pair up end statements with preceding for, while, or if statements in a last-come-first-
served manner. It is up to the programmer to ensure that this will give the desired results. Indenting can
help, but hardly guarantees success on every occasion.

While it is tempting for those with experience in traditional programming languages to take the easy way
out, using For and While loops for mathematical operations, this temptation should be resisted
strenuously. For example, instead of:

port_val = 0;
for j = 1:n
port_val = port_val + ( holdings(j) * prices(j));
end

write:

port_val = holdings*prices;

The latter is more succinct, far clearer, and will run much faster. MATLAB performs matrix operations
at blinding speed, but can be downright glacial at times when loops are to be executed a great many
times, since it must do a certain amount of translation of each statement every time it is encountered.
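
The two approaches can be compared directly. In this sketch the holdings are assumed to form a {1*3} row vector and the prices a {3*1} column vector; the numbers are illustrative.

```matlab
holdings = [100 200 300];            % {1*3} row vector (illustrative)
prices   = [12.50; 37.875; 12.25];   % {3*1} column vector (illustrative)

% Loop version
port_val_loop = 0;
for j = 1:length(holdings)
    port_val_loop = port_val_loop + holdings(j)*prices(j);
end

% Matrix version
port_val = holdings*prices;
```

Both give the same portfolio value; the matrix version simply states the computation in one line.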

Writing Functions
The power of MATLAB really comes into play when you add your own functions to enhance the
language. Once a function m-file is written, debugged, and placed in an appropriate directory, it is for all
practical purposes part of your version of MATLAB.

A function file starts with a line declaring the function, its arguments and its outputs. There follow the
statements required to produce the outputs from the inputs (arguments). That's it.

Here is a simple example:


function y = port_val(holdings,prices)
y = holdings*prices;

Of course, this will only work if the holdings and prices vectors or matrices are compatible for matrix
multiplication. A more complex version could examine the sizes of these two matrices, then use
transposes, etc. as required.
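
One way such a version might look is sketched below. The name port_val2 is hypothetical, and the sketch handles only vectors: it reshapes both inputs rather than testing for each possible orientation.

```matlab
function y = port_val2(holdings, prices)
% PORT_VAL2  Hypothetical variant of port_val for vectors in either orientation.
holdings = holdings(:)';    % force holdings into a row vector
prices   = prices(:);       % force prices into a column vector
if length(holdings) ~= length(prices)
    error('holdings and prices must have the same number of elements');
end
y = holdings*prices;
```

With this version, port_val2(h,p) gives the same answer whether h and p are supplied as rows or as columns.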

It is important to note that the argument and output names used in a function file are strictly local
variables that exist only within the function itself. Thus in a program, one could write the statement:

v = port_val(h,p);

The first matrix in the argument list in this calling statement (here, h) would be assigned to the first
argument in the function (here, holdings) while the second matrix in the calling statement (p) would be
assigned to the second matrix in the function (prices). There is no need for the names to be the same in
any respect. Moreover, the function cannot change the original arguments in any way. It can only return
information via its output.

This function returns only one output, called y internally. However, the resultant matrix will be
substituted for the entire function "call" in any expression.

If a function is to return two or more outputs, simply assign them names in the declaration line, as in:

function [total_val, avg_val] = port_val(holdings,prices)
total_val = holdings*prices;
avg_val = total_val/size(holdings,2);

This can still be used as in the earlier case if only the total value is desired. To get both the total value
and the average value per position, a program could use a statement such as:

[tval aval] = port_val( h,p);

Note that as with inputs, the correspondence between outputs in the calling statement and the function
itself is strictly by order. When the function has finished its work, its output values are assigned to the
variables in the calling statement.

Variables other than inputs and outputs may be included in functions, as needed. They are strictly
local to the function and have no existence outside it. Indeed, a variable in a function may have the same
name as one in another place; the two will coexist with neither bothering the other. While MATLAB
provides for the use of "global variables", their use is widely discouraged and will not be treated here.

Comments and Help


It is an excellent idea to include comments throughout any m-file. To do so, use the percent (%) sign.
Everything after it up to the end of the line will be ignored by MATLAB.

The first several lines after each function header should provide a brief description of the function and its
use. Once the function has been placed in an appropriate directory, a user need only type help followed
by the function name to be shown all the initial comment lines (up to the first non-comment or totally
blank line). Thus if there is a function named port_val, the user can get this information by typing:

help port_val

To provide even more assistance, create a script file with nothing but comment lines, each giving the
name and a brief description of one of your functions or scripts. If this were named mia_fun, the user could
simply type:
help mia_fun

to get a list of your functions, then type help followed by a function name to get more details on any
specific function.
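
As a sketch of how the initial comment lines might be laid out, here is the port_val function with a help block. The wording of the comments is only a suggestion.

```matlab
function y = port_val(holdings, prices)
% PORT_VAL  Value of a portfolio.
%   y = port_val(holdings, prices) multiplies a {1*n} row vector of
%   holdings by an {n*1} column vector of prices.

% This comment follows a blank line, so help will not display it.
y = holdings*prices;
```

Typing help port_val would show the first three comment lines and stop at the blank line.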

Data Input and Output


There are many ways to get information into and out of the MATLAB environment. We will cover only
the simpler ones here.

Data Input
The most straightforward way to get information into MATLAB is to type it in "command mode". For
example:

prices = [ 12.50 37.875 12.25];
assets = ['cash  ';'bonds ' ; 'stocks'];   % rows padded with blanks to equal length

MATLAB even makes it easy to enter matrices in a more normal form by treating carriage returns as
semicolons within brackets. Thus:

holdings = [ 100 200
300 400
500 600 ]

will create a {3*2} matrix, as desired.

A second way to get data into MATLAB is to create a script file with the required statements, such as
the one above. This can be done with any text processor. Large matrices of data can even be "cut out"
of databases, spreadsheets, etc. then edited to include the desired variable names, square brackets and
the like. Once the file or files are saved with .m names they only have to be invoked to bring the data
into MATLAB.

Next up the chain of complexity is the use of a flat file which stores data for a matrix. Such a file should
have numeric ascii text characters, with each element in a row separated from its neighbor with a space
and each row on a separate line. Say, for example, that you have stored the elements of a matrix in a file
named test.txt in a directory on the MATLAB path. Then the statement:
load test.txt

will create a matrix named test containing the data.

Data Output
A simple way to output data is to display a matrix. This can be accomplished by giving its name
(without a semicolon) in interactive mode. Alternatively, you can use the disp function, which shows
values without the variable name, as in:
disp(test);

For prettier output, MATLAB has various functions for creating strings from numbers, formatting data,
etc. Function pmat can produce small tables with string identifiers on the borders.

If you want to save almost everything that appears on your screen, issue the command:

diary filename

where filename represents the name of a new file that will receive the subsequent output. When you are
through, issue the command:

diary off

Later, at your leisure, you may use a text editor to extract data, commands, etc. to data files, script or
function files, and so on.

There are, of course, other alternatives. If you are in an environment (such as a Windows system) that
allows material to be copied from one program and pasted into another, this may suffice.

To create a flat file containing the data from a matrix use the -ascii version of the save command. For
example:
save newdata.txt test -ascii

will save the matrix named test in the file named newdata.txt.
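
A round trip through a flat file can be sketched as follows, using the command forms shown above. The matrix values are illustrative, and the file is written to the current directory.

```matlab
test = [1 2 3; 4 5 6];

save newdata.txt test -ascii    % writes two lines of three numbers each
clear test                      % remove the original from the workspace
load newdata.txt                % creates a matrix named newdata
```

Note that the restored matrix takes its name from the file (newdata), not from the variable that was saved.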

Finally, you may save all or part of the material from a MATLAB session in MATLAB's own mat file
format. To save all the variables in a file named temp.mat, issue the command:

save temp

At some later session you may load all this information by simply issuing the command:

load temp

To save only one or more matrices in this manner, list their names after the file name. Thus:

save temp prices holdings portval

would save only these three matrices in file temp.mat. Subsequent use of the command:

load temp

would restore the three named matrices, with their values intact.

There are more sophisticated ways to move information into and out of MATLAB, but they can be left
to others.

MATLAB Function Library


We provide a number of MATLAB functions that may prove of value to the Macro-Investment Analyst
and, possibly, others. The user is advised to proceed with caution when using any of them.

Excel

Contents:
Introduction
Named Ranges
Matrix Operations in Excel

Introduction
Microsoft's Excel spreadsheet program provides an alternative environment for many of the
computations required for Macro-Investment Analysis. Its ubiquity and ease of use are among its more
attractive features. However, spreadsheets are notoriously dangerous, since the underlying logic of a set
of calculations is usually contained in formulas scattered around a sheet (or sheets). Worse yet, the
formulas are usually hidden from sight, behind the numbers representing the results of their calculations.
These disadvantages loom especially large when an environment is to be chosen primarily as a means of
communication. For our purposes, languages such as MATLAB are superior to a spreadsheet
environment -- Excel or any other.

The situation is not, however, as bleak as it once was. Since the introduction of version 5.0, Excel has
included a full programming language that allows for structured, documented, and readable sets of
commands. Formally, it is a version of Microsoft's Visual Basic for Applications, but we will use the
simpler form, Visual Basic, or to be even more succinct, VB.

In Excel, VB procedures are called Macros, but this is far too humble a term for perfectly respectable
programs and we will resist its use except when absolutely necessary.

We will not cover Visual Basic, since it is a complex programming language that requires an extensive
treatise. Suffice it to say that it provides an alternative to MATLAB and other languages for preparing
investment application programs.

Here we concentrate on a discussion of matrix operations in the standard Excel spreadsheet
environment. The treatment will be cursory at best, since Excel is far too complex to cover in any detail
in this exposition. Our goal is only to suggest ways in which it can be used by the Analyst for matrix
operations.

Named Ranges
Many Excel formulas require the specification of one or more ranges of cells as arguments. In many
cases the easiest way to indicate such a range is to select it using keystrokes and/or a mouse as the
formula is typed. For clarity, we adopt an alternative approach, using only named ranges in our
formulas and statements. Since names remain with the formulas and statements, it is easy to change the
physical range of cells to which a name applies whenever results are desired for a different range of
inputs. Perhaps more important, the use of appropriate range names can greatly improve the readability
of a set of formulas or statements.

The safest way to assign a name to a range of cells is to first select it, then choose Insert Name Define
from the menu, followed by the desired name. Be certain to avoid names that look like cell locations or
combinations of them (e.g. A22). In Excel, range names are not case sensitive. Thus Prices, prices and
PRICES are considered the same name.

To select a named range, choose Edit Go to (or the equivalent key), followed by the range name.
Alternatively, use the drop-down list of names located just above and to the left of the spreadsheet.
When a named range is selected, the name will appear in the window for this list. (In fact, you can name
ranges by selecting them, then typing the name in this box; however, this sometimes allows conflicts to
creep in and should be avoided).

Once you have named a range, you may use it in any formula that allows for a range as an argument. As
indicated earlier, we will always choose this alternative.

Matrix Operations in Excel


Unbeknownst to many users, Excel can do matrix operations very efficiently, either directly, or through
the use of matrix functions. Microsoft prefers to use the term "Array" to "Matrix", so most references in
their manuals and help system can be found under the former term.

Key to understanding the use of matrix operations in Excel is the concept of the Matrix (Array) formula.
Such a formula uses matrix operations and returns a result that can be a matrix, a vector, or a scalar,
depending on the computations involved. Whatever the result may be, an area on the spreadsheet of
precisely the correct size must be selected before the formula is typed in (otherwise you will either lose
some of the answer or get added and possibly confusing information).

After typing such a formula, you "enter" it with three keys pressed at once: CTRL, SHIFT and
ENTER. This indicates that a matrix (array) result really is desired. It also designates the entire selected
range as the desired location for the answer. To modify or delete the formula, select the entire region
beforehand.

When matrix computations are performed in this way, the "result areas" will be updated immediately
whenever any of the numbers in the "input areas" change (unless automatic recomputation has been
turned off). This can be a great help when one wishes to evaluate the effects of changes in assumptions,
initial conditions, etc. This feature, coupled with the ability to see matrices, complete with identification
of the rows and columns (i.e. in the form that we have termed tables), will often make the spreadsheet
environment the preferred choice for computation, if not for communication.

In Excel, some matrix operations are performed automatically, using standard operators (as in
MATLAB). Others require the use of matrix functions. We treat each below.

Matrix Addition
Assume that Holdings_1 and Holdings_2 are two ranges of the same size (say, {20*1}) containing the
holdings of mutual funds in two accounts. To create a vector with the total holdings of both accounts,
select an empty {20*1} range on the sheet, type in the formula:
= Holdings_1 + Holdings_2

then press CTRL-SHIFT-ENTER. As a matter of good practice, you might wish to name the resultant
range (e.g. Tot_Holdings) for future reference.

Any two matrices of the same size can be added in this manner, with the result placed in a range of the
same size.

Matrix Subtraction

Not surprisingly, a matrix can be subtracted from one of the same size in a manner analogous to that of
addition. For example, to find the holdings of account 2, you could use the formula:
= Tot_Holdings - Holdings_1

Using Matrices with Scalars


To add a constant to every element of a matrix, simply include it in a formula, as in:
= Tot_Holdings + 100

You can also subtract a constant from every element or multiply or divide every element by a constant.
For example:
= Prices * 1.10

Matrix Multiplication
To multiply two matrices, use the MMULT function. Thus, if prices and holdings are compatible for
multiplication, you could compute the value of a portfolio with the formula:
= MMULT(prices,holdings)

Transposition

If a matrix is not turned in the right direction, simply use the TRANSPOSE function. Thus if prices is a
{20*1} vector and holdings is also, you could use the formula:
= MMULT(TRANSPOSE(prices),holdings)

to produce the value of the portfolio.

As is often the case, there is another way to do the same thing in Excel. The (non-matrix) function
SUMPRODUCT produces the sum of the products of the elements in two vectors of equal dimensions.
Thus if prices and holdings are both {20*1}, you could compute the value of the portfolio with the
formula:
= SUMPRODUCT(prices,holdings)

Note that to enter this formula, only the ENTER key need be pressed.
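
Readers who wish to check that the two Excel formulas agree can do so with MATLAB counterparts. In this sketch the three-element vectors stand in for the {20*1} ranges, and the values are illustrative.

```matlab
prices   = [12.50; 37.875; 12.25];   % column vector (illustrative)
holdings = [100; 200; 300];          % column vector (illustrative)

v1 = prices'*holdings;        % counterpart of MMULT(TRANSPOSE(prices),holdings)
v2 = sum(prices.*holdings);   % counterpart of SUMPRODUCT(prices,holdings)
```

Both expressions yield the same portfolio value.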

The provision of alternative methods for accomplishing a given type of calculation endears Excel to
many users, especially those who grew up with prior versions. But it tends to frustrate those who yearn,
perhaps quixotically, for a simple, yet powerful computing environment.

Matrix Inversion
To produce the inverse of a matrix, use the MINVERSE function, as in:

= MINVERSE(lhs)

Of course, the matrix in the named range must be square and invertible.

Combining Matrix Operations

In Excel, as in MATLAB, you may combine matrix operations in a single formula. Remember,
however, that everything must conform, that the output range should be the correct size for the final
result, and that you must press CTRL-SHIFT-ENTER to enter the formula in the output range. As in
more mundane formulas, it never hurts to include sufficient parentheses to remove any possible
ambiguity concerning your desires.

An Example: Asset Allocation with Investment Funds

Contents:
Assets and Funds
The Current Portfolio
The Current Allocation
Obtaining a Desired Allocation
Finding Alternative Allocations
Finding Pure Asset Plays

Assets and Funds


To see the power of matrix operations, we consider an important example -- the choice of a portfolio of
investment funds designed to achieve a desired asset allocation.

Assume that an Analyst has identified three major asset classes:

DomBds: Domestic Bonds
DomStx: Domestic Stocks
ForStx: Foreign Stocks

Three Funds are available for investment:

FundA
FundB
FundC

After considerable work, the Analyst has estimated the exposures of each fund to each of the three asset
classes. For present purposes, think of these as the funds' allocations of money among the asset classes.
The results of the analysis are summarized in matrix A:
        FundA  FundB  FundC
DomBds   0.60   0.20   0.00
DomStx   0.40   0.50   0.30
ForStx   0.00   0.30   0.70

Thus FundA has 60% of its money in Domestic Bonds and 40% in Domestic Stocks; FundB has 20%
in Domestic Bonds, 50% in Domestic Stocks, and 30% in Foreign Stocks, etc.

The Current Portfolio


At the moment, the Investor has 20% of her money invested in FundA, 30% in FundB and 50% in
FundC. This is shown in vector x:
FundA 0.20
FundB 0.30
FundC 0.50

The Current Allocation


What is the Investor's current allocation among the three major asset classes? The answer can be found
by simply multiplying matrix A by vector x. The required MATLAB statement is:
b = A*x

This provides the result:


DomBds 0.18
DomStx 0.38
ForStx 0.44

Thus her portfolio has 18% in Domestic Bonds, 38% in Domestic Stocks and 44% in Foreign Stocks.

Note the dimensions associated with this calculation:


A {assets*funds} * x {funds*1} ===> b {assets*1}

The inner dimensions are the same, as they must be. And the dimensions of the result are, as usual, the
outer dimensions of the two operands.
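
The full calculation can be reproduced with the statements below, which enter the exposures and the current portfolio from the tables above before forming the product:

```matlab
A = [0.60 0.20 0.00      % rows: DomBds, DomStx, ForStx
     0.40 0.50 0.30      % columns: FundA, FundB, FundC
     0.00 0.30 0.70];
x = [0.20; 0.30; 0.50];  % current fund holdings {funds*1}

b = A*x;                 % asset allocation {assets*1}
```

Each column of A sums to one, as does x, so the elements of b also sum to one.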

Obtaining a Desired Allocation


What if the Analyst, after conferring with the Investor, decided that it would be better to allocate the
assets differently, with 15% in Domestic Bonds, 35% in Domestic Stocks and 50% in Foreign Stocks?
How should money be divided among the investment funds to achieve this goal?

To begin, create vector bb {assets*1}, with the desired allocation:


DomBds 0.15
DomStx 0.35
ForStx 0.50

We seek a new vector of fund investments that will provide this allocation. Let this vector be xx
{funds*1}. We want the following to hold:
A*xx = bb

To solve this set of equations requires only the MATLAB statement:


xx = inv(A)*bb

or the equivalent operation in Excel (or another system).

The required investments are given in xx. In table form:


FundA 0.20
FundB 0.15
FundC 0.65

The Investor should put 20% of her assets in FundA, 15% in FundB and 65% in FundC.
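
A sketch of the complete calculation follows. The left-division form A\bb is shown only as an alternative; it is one of MATLAB's "matrix division" notations and solves the same equations without forming the inverse explicitly.

```matlab
A = [0.60 0.20 0.00
     0.40 0.50 0.30
     0.00 0.30 0.70];
bb = [0.15; 0.35; 0.50];   % desired asset allocation

xx    = inv(A)*bb;   % the statement used in the text
xx2   = A\bb;        % alternative: left division
check = A*xx;        % multiplying back should reproduce bb
```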

Finding Alternative Allocations


What if the Analyst wished to present two different allocations among asset classes to the Investor? The
procedure described above could be repeated with different values in vector bb. But there is an even
simpler approach.

First, include all allocations of interest in a matrix, with one column per desired mix. In this case, BBB
{assets*mixes} is:
        Mix1   Mix2
DomBds  0.15   0.15
DomStx  0.35   0.40
ForStx  0.50   0.45

It is tempting to simply substitute this matrix for the vector bb used in the previous case. In fact, this is a
temptation to which one can and should succumb, for it will provide the desired answers.

The MATLAB statement:


XXX = inv(A)*BBB

will produce the following, in table form:


        Mix1   Mix2
FundA   0.20   0.10
FundB   0.15   0.45
FundC   0.65   0.45

Why does this work? Recall that matrix multiplication can be regarded as a series of multiplications of
the first matrix (here, inv(A)) by the adjoining column vectors in the second matrix (here, BBB). Not
surprisingly, each column in the result (XXX) is the solution to a simpler problem in which only the
corresponding column of BBB is utilized.

What about the dimensions? To answer this question we need to know the dimensions of inv(A). Recall
that A is {assets*funds}. It follows that its inverse is {funds*assets} (it is, after all, inverted). Thus:
inv(A) {funds*assets} * BBB {assets*mixes} ===> XXX {funds*mixes}

as characterized above.
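
The column-by-column interpretation can be checked with a short sketch that solves for both mixes at once and then for each mix separately:

```matlab
A = [0.60 0.20 0.00
     0.40 0.50 0.30
     0.00 0.30 0.70];
BBB = [0.15 0.15
       0.35 0.40
       0.50 0.45];          % one column per desired mix

XXX = inv(A)*BBB;           % both solutions at once {funds*mixes}
x1  = inv(A)*BBB(:,1);      % solution for Mix1 alone
x2  = inv(A)*BBB(:,2);      % solution for Mix2 alone
```

The columns of XXX agree with x1 and x2.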

Finding Pure Asset Plays


In this example the only vehicles available for direct investment are the three investment funds. If one
wishes to invest in asset classes, it must be done via such funds. We have shown how to find the set of
fund allocations required to achieve any desired asset allocation. One particularly interesting set of the
latter includes the three possible pure asset plays.

Consider the following set of mixes, contained in matrix BBBB:


        Mix1   Mix2   Mix3
DomBds  1.00   0.00   0.00
DomStx  0.00   1.00   0.00
ForStx  0.00   0.00   1.00

Mix1 represents allocation of all one's assets to Domestic Bonds, Mix2 to Domestic Stocks, and Mix3 to
Foreign Stocks. Each is a "pure asset play". For example, an Investor with Mix1 will be totally
unaffected by the performance of Domestic Stocks and Foreign Stocks -- only the returns from
Domestic Bonds will matter.

To find the allocations among investment funds required to achieve each of these mixes, we simply
repeat the procedure used in the previous case. Letting matrix XXXX represent the desired results:
XXXX = inv(A)*BBBB

which gives:
        DomBds  DomStx  ForStx
FundA     2.60   -1.40    0.60
FundB    -2.80    4.20   -1.80
FundC     1.20   -1.80    2.20

Thus one wishing to create a pure Domestic Bond play would place an amount equal to 260% of her
money in FundA and 120% in FundC. To help finance these investments, she would take a negative
position in FundB with an amount equal to 280% of her money.

Could this be done in practice? Possibly, if the investment funds' shares were traded and could be "sold
short". Some closed-end fund shares might be used in this manner, but in all likelihood such an extreme
strategy would be infeasible or at least costly. To deal with such real-world aspects requires more
complex problem formulations and solution procedures, which will be discussed in due course.

However simplistic, this example does illustrate an important point. Look again at matrix BBBB. It is, in
fact, a {3*3} identity matrix (which can be created in MATLAB with the expression eye(3)). Thus
XXXX is the product of the inverse of A times an identity matrix. But this must be the inverse of A!
Hence, each column in the inverse of A shows the allocation of money among funds that will provide a
pure asset play. As will be seen, this relationship can be applied in both normative and positive
applications.
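
The identity-matrix argument can be checked numerically. The sketch below is in Python rather than MATLAB and uses exact fractions. Matrix A is defined elsewhere in the Work, so as an assumption it is recovered here by inverting the table of fund allocations; the helper names (identity, matmul, inverse) are illustrative only.

```python
from fractions import Fraction

def identity(n):
    """Return an n-by-n identity matrix, the counterpart of MATLAB's eye(n)."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse(m):
    """Invert a square matrix by Gauss-Jordan elimination on exact fractions."""
    n = len(m)
    eye = identity(n)
    aug = [list(m[i]) + eye[i] for i in range(n)]  # augmented matrix [m | I]
    for col in range(n):
        # Swap in a row with a non-zero pivot (StopIteration if m is singular).
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Fund allocations copied from the table above (rows FundA, FundB, FundC).
X = [[Fraction(s) for s in row] for row in (
    ("2.60", "-1.40", "0.60"),
    ("-2.80", "4.20", "-1.80"),
    ("1.20", "-1.80", "2.20"),
)]

A = inverse(X)   # assumption: an A consistent with the table, not quoted from the text
B = identity(3)  # the three pure asset plays

# inv(A)*B with B an identity matrix reproduces inv(A), i.e. the table itself.
print(matmul(inverse(A), B) == X)  # True
```

Each column of X sums to one, so every pure asset play is fully invested even though individual fund positions may be negative.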

10/23/12 Macro‑Investment Analysis

Prices
Time-state Claims
Valuation
Multiple Commodities, States and Times
Interest Rates and Bond Yields
Forward Prices

www.stanford.edu/~wfsharpe/mia/prc/mia_prc0.htm 1/1
10/23/12 Time‑state Claims

Time-state Claims

Contents:
The Arrow-Debreu Paradigm
An Apple Tree
Time-state Claims
Prices of Atomic Time-state Claims
The Sufficiency of Atomic Security Prices
Arbitrage
Zero-investment Strategies

The Arrow-Debreu Paradigm


An important subfield of physics -- nuclear physics -- deals with the smallest particles of which matter is
composed. Constructs developed by Kenneth Arrow [Arrow 1964] and Gerard Debreu [Debreu 1959]
provided a similar foundation for financial economics. The resulting approach is often called the Arrow-
Debreu Paradigm. It characterizes promised future payments in terms of both the times at which
payments are to be made and the states of the world that must obtain for payments to be made. Hence
the often-used name: the time-state paradigm.

Since the approach represents securities and other types of financial instruments in terms of their most
elemental components, one could as well title it: Nuclear Financial Economics.

An Apple Tree
We start with the simplest possible example that involves both time and uncertainty. There are two time
periods:
0: today
1: a year from now

There are also two possible future states of the world:


G: the weather over the next year will be good
B: the weather over the next year will be bad

The states of the world are mutually exclusive (if one occurs, the other cannot) and exhaustive (one of
them must occur).

The economy is very simple indeed. The only commodity is the apple and there is no money per se. In
effect, the apple is the unit of currency.

The only type of productive investment in this economy is, not surprisingly, an apple tree. We focus
initially on a tree that will produce:

63 apples if the weather is good


48 apples if the weather is bad
www.stanford.edu/~wfsharpe/mia/prc/mia_prc1.htm 1/5

This is shown in the figure below. Time proceeds from left to right. The box on the left refers to present
value. The boxes on the right represent alternative states of the world. One and only one of the states
of the world in a vertical position will take place at the time in question. The names of the states are
indicated at the tops of the boxes. The numbers inside the boxes indicate the payoffs.

Time-state Claims
In our economy there are three elemental time-state claims:
One apple at time 0 (today)
One apple at time 1 if the weather is good
One apple at time 1 if the weather is bad

In keeping with our interpretation of the Arrow-Debreu approach as Nuclear Financial Economics, we
will call these atomic time-state claims. Note that the latter two descriptions include the item (apple), the
number of units (one), the time at which delivery is to be made (0 or 1), and the state of the world that
must obtain for delivery to be made (good or bad weather). In the case of the first description, no state is
given, since present values are not conditional on future states of the world. To simplify the exposition,
we will refer to these claims as:
One "present apple" (PA)
One "good weather apple" (GA)
One "bad weather apple" (BA)

with the understanding that these are simply shorthand descriptions.

In principle, any investment vehicle can be considered to be composed of such atomic claims. Thus the
output of our apple tree is equivalent to 63 GAs and 48 BAs.

Prices of Atomic Time-state Claims


Many of the fundamental concepts of Financial Economics are based on the assumption that markets
exist in which claims can be traded efficiently (at low cost). We begin with the assumption that dealers
stand ready to trade atomic claims and to do so without cost. As will be seen, these dealers are a bit of
an artifice. Later we consider more realistic assumptions about the world.

Assume that Dealer G "makes a market" in good weather apples. In particular, she is willing to trade
(swap):
0.285 present apples for 1.000 good weather apples
or
1.000 good weather apples for 0.285 present apples

or any multiple thereof (dividing apples into pieces, as required).

This can better be understood in standard financial terms. Assume that an owner of an apple tree has
issued a certificate of the following form:
I, __________ promise to deliver to the bearer of this certificate
one apple at the end of year ____ if (but only if) the weather
during the year has been good.

In standard parlance, this piece of paper (or its electronic equivalent) would be termed a security.
Assume that a credit-rating agency has examined the property of the apple grower (the apple tree) and
has established that no more than 63 of these securities have been issued and that there are no other
claims on the grower's assets in the event of good weather. As a result, the securities are rated AAA
("triple-A") and can be considered default-free.

Under these conditions, the security in question represents a property right in an atomic time-state claim.
In a sense, it is thus an atomic security or, given the origin of the concept, an Arrow-Debreu security.

The price of this security is 0.285 present apples, since the dealer stands ready to trade this number of
present apples for the security. More generally, the price of any security is the amount of the relevant
numeraire paid immediately for which the security can be traded. Note that the ability to make a trade is
central to the definition of a price.

In the real world, of course, dealers charge more to sell a security than they are willing to pay to buy it.
The spread between the ask (selling) and bid (buying) price provides compensation for the market-
making function. In our examples we assume (unrealistically) that there is no such spread and hence that
there is but one price. In practice, the average of the bid and ask prices is often used as a surrogate for
"the price". For detailed computations, of course, the specifics of a proposed transaction may need to be
taken into account and the relevant price (bid or ask) used.

This diversion completed, we return to our world of non-profit dealers.

In addition to Dealer G, we assume that another, Dealer B, is willing to trade (swap):

0.665 present apples for 1.000 bad weather apples


or
1.000 bad weather apples for 0.665 present apples

or any multiple thereof (again, dividing apples into pieces, as required).

The trading environment is shown in the figure below.

The Sufficiency of Atomic Security Prices



Thus far we have a world with three types of time-state claims (PA, GA and BA). Explicit markets exist
for trading (1) PA and GA and (2) PA and BA. Note that each such trade has the characteristic of an
investment -- today's goods are traded for the prospect of goods in the future. Thus one purchasing a GA
atomic security can be said to have invested 0.285 (present) apples to obtain 1.000 apples in the future if
the weather is good.

But what of the other possible type of trade in this world? What would it mean to trade good weather
apples for bad weather apples? How might one accomplish this? And what would be the terms of trade?

To answer these questions, consider the following agreement:


Party A promises to pay party B: 6 apples if the weather is good
Party B promises to pay party A: 3 apples if the weather is bad
Neither party pays the other anything today (on signing)

Such an agreement is called a swap in financial parlance. It represents the third possible type of trade in
our simple world: GA for BA.

Is this a fair deal? If one desires an answer based on ethical considerations, other disciplines will have to
be invoked. Financial Economics can only indicate whether or not one of the parties could get a better
deal elsewhere.

Assume that Party A comes to you with the proposal that you sign the agreement as Party B. You are
willing to give up 3 apples if the weather is bad in order to increase your consumption if the weather is
good. But is 6 apples the best that you can do?

To answer the question, consider the following alternative trades:


Go to dealer B, trade 3 BA for 3*0.665 = 1.995 PA
Go to dealer G, trade 1.995 PA for 1.995/0.285 = 7 GA

The net result is, of course, to trade 3 BA for 7 GA -- a better deal than offered by canny Party A, who
will have to search elsewhere for a counterparty foolish enough to take the deal.

Note that although explicit markets are made only for exchanges of future atomic time-state claims
against present apples, it is possible to "create" trades involving any present and future claims. This is a
perfectly general result. If one can trade each possible future atomic time-state claim for present units of
a numeraire, any desired trade can be accomplished. Thus a set of atomic security prices is sufficient for
accomplishing any desired trade.
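
The two-step trade through the dealers can be written as a small calculation. This is a minimal sketch in Python using the dealer quotes given above; the function name swap_via_dealers is an illustrative assumption.

```python
from fractions import Fraction

# Dealer quotes from the text: present apples per one future apple.
PRICE_GA = Fraction("0.285")  # one good weather apple
PRICE_BA = Fraction("0.665")  # one bad weather apple

def swap_via_dealers(quantity, price_sold, price_bought):
    """Sell `quantity` units of one claim for present apples, then spend
    the proceeds on another claim; return the units of the claim bought."""
    present_apples = quantity * price_sold
    return present_apples / price_bought

ga_for_3_ba = swap_via_dealers(3, PRICE_BA, PRICE_GA)
print(ga_for_3_ba)  # 7
```

The implied terms of trade between the two future claims are simply the ratio of their prices, 0.665/0.285, so Party A's offer of 6 GA for 3 BA falls short of what the dealers provide.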

Arbitrage
Consider two people sharing a pizza. To ensure an even division, it is wise to agree that one party should
cut it, and the other should choose his or her piece. Similarly, it is useful to require someone offering a
bet on a sporting event to be willing to take either side on the offered terms. With this in mind, we return
to Party A and Party B.

Assume that a securities firm is willing to serve as either Party A or Party B in the previously-described
swap (6 GAs for 3 BAs). Clearly, you have no interest in being Party B. But what about serving as
Party A? Consider the following set of trades:
Sign the Agreement as party A (pay 6 GA, get 3 BA)
Go to dealer B, trade 3 BA for 3*0.665 = 1.995 PA
Go to dealer G, trade 6*0.285 = 1.710 PA for 6 GA

It is useful to put all this information in a payment matrix with each row representing a time-state
combination and each column a transaction. Conventionally, we represent outflows with negative
numbers, inflows with positive numbers, and neither with zeros.


               Agreement  Dealer B  Dealer G
Present                0    +1.995    -1.710
Good weather        -6.0         0      +6.0
Bad weather         +3.0      -3.0         0

Of particular interest is the sum of the payments in each row, shown in the final column below:
               Agreement  Dealer B  Dealer G       Net
Present                0    +1.995    -1.710    +0.285
Good weather        -6.0         0      +6.0         0
Bad weather         +3.0      -3.0         0         0

Note what this set of transactions accomplishes -- getting something for nothing! Moreover, there is no
reason to settle for such a small gain. Double the sizes of all the transactions and the net gain is doubled.
Quadruple them and the gain is quadrupled.

Well and good, but what if one really wanted apples next year if the weather is good? Not to worry. Add
a final trade in which 0.285 present apples are traded for 1.000 good weather apples. Want bad weather
apples? Add a trade to convert the gains into the appropriate payment. No matter what a person's
preferences may be, it is desirable to exploit the foolishness of the firm offering this swap.

Too good to be true? Probably. This example constitutes an arbitrage -- every trader's dream. To
formalize:
An arbitrage provides a positive net payoff in at least one time
and state and no negative net payoff in any time and state.

An arbitrage is thus a money machine (or, as in this case, an apple machine). When an opportunity of
this type arises, traders will rush to exploit it, causing others to adjust their terms of trade until swap
terms involve no arbitrage.

A set of swap terms that does not permit arbitrage is arbitrage-free.
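
The definition can be applied mechanically to a payment matrix. The following Python sketch sums each row of the matrix shown earlier and tests the result against the definition; the function names net_payments and is_arbitrage are illustrative assumptions.

```python
from fractions import Fraction

# Rows: Present, Good weather, Bad weather.
# Columns: Agreement (signed as Party A), Dealer B, Dealer G.
# Outflows are negative, inflows positive.
payments = [
    [Fraction("0"), Fraction("1.995"), Fraction("-1.710")],
    [Fraction("-6"), Fraction("0"), Fraction("6")],
    [Fraction("3"), Fraction("-3"), Fraction("0")],
]

def net_payments(matrix):
    """Sum the transactions in each time-state row."""
    return [sum(row) for row in matrix]

def is_arbitrage(net):
    """True if some net payoff is positive and none is negative."""
    return any(v > 0 for v in net) and all(v >= 0 for v in net)

net = net_payments(payments)
print([float(v) for v in net])  # [0.285, 0.0, 0.0]
print(is_arbitrage(net))        # True
```

Replacing the 6 GA in the agreement with 7 GA (and the purchase from Dealer G with 1.995 present apples for 7 GA) makes every net payment zero, so those swap terms are arbitrage-free.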

Zero-investment Strategies
In an important sense, every security transaction can be considered a swap. The purchase of an atomic
security is a swap of present goods for conditional future goods. The sale of such a security is a swap of
conditional future goods for present goods. Such cases, when one "side" of the swap involves present
goods or services, are typically termed investments. Thus one invests present apples in the hope of
obtaining more apples in the future. But note that the swap of good weather apples for bad weather
apples is no different in kind, even though no goods or services are exchanged at the time of the
agreement.

To be explicit, we refer to swaps of this latter kind as zero-investment strategies. As with other
transactions, they are represented by cash flow vectors with positive and negative numbers, but with
zeros in the first (present) row.

10/23/12 Valuation

Valuation

Contents:
Present Value
Net Present Value
Productive Investment
Asymmetric Information
Riskless Securities
The Law of One Price
Financing Methods
The Principle of Value Additivity
Risky Debt
Re-allocating Value Among Groups of Claimants
Inferring Atomic Security Prices
Financial Engineering
Opportunity Sets
Consumption and Investment Decisions

Present Value
How much is an apple tree "worth" in our economy? Such a question is best interpreted as "how many
present units of the numeraire should be traded for the apple tree?" The answer is found by calculating
the cost of obtaining the same set of payoffs in another way. The result is the present value of the apple
tree. The process of determining the present value of a security or productive investment is termed
valuation.

In principle, the process is very simple. Recall that the tree will provide 63 apples if the weather is good
and 48 if it is bad. Dealer G will provide the former for 0.285*63=17.955 present apples. Dealer B will
provide the latter for 0.665*48=31.920 present apples. The total cost of obtaining the same results in this
other manner will thus be 17.955+31.920=49.875 present apples. This is the present value of the apple
tree, as shown in the figure below.

There is nothing metaphysical about this concept of value. It is based entirely on the cost of obtaining a
fully equivalent set of payments. If an apple tree is selling for less than this, one can obtain an arbitrage
by buying the tree and selling its production through the dealers. If a tree is selling for more, one can
offer the same outputs, sell them and use the proceeds to buy securities from the dealers to guarantee
delivery. In each case, there will be something left over for the arbitrageur.

In real markets, such opportunities are few and fleeting. In an arbitrage-free market, they are totally
absent -- there is one set of atomic security prices and every security will sell for a price equal to its
present value, computed using these atomic security prices. In such a market the present value of any set
of claims is computed by multiplying the quantity of each claim by its price, then summing.
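
A minimal Python sketch of the rule "multiply quantity by price, then sum" follows. The market prices of 49 and 52 apples used to illustrate a mispriced tree are hypothetical numbers chosen for this example, as are the function names.

```python
from fractions import Fraction

prices = {"good": Fraction("0.285"), "bad": Fraction("0.665")}
tree = {"good": 63, "bad": 48}

def present_value(payoffs, prices):
    """Cost in present apples of buying the same payoffs from the dealers."""
    return sum(prices[state] * amount for state, amount in payoffs.items())

def arbitrage_profit(market_price, payoffs, prices):
    """Present apples left over when an asset trades away from its present value.

    If the asset is cheap, buy it and sell its payoffs through the dealers.
    If it is dear, sell an equivalent claim and buy the payoffs from the dealers.
    """
    return abs(present_value(payoffs, prices) - market_price)

pv_tree = present_value(tree, prices)
print(float(pv_tree))                               # 49.875
print(float(arbitrage_profit(49, tree, prices)))    # 0.875 (hypothetical price of 49)
```

When the market price equals the present value the profit is zero, which is the arbitrage-free case.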

Net Present Value


A distinction is often made between the present value and the net present value of a set of claims. The
former is generally based on all future payments and thus determines the present value of those
payments. The latter is generally based on all payments, including any required payment in the present
("up front"). Thus:

net present value = present value - present payment

In matrix terms, either concept can be represented as equal to p*q where p is a {1*states} vector of
elemental security prices and q is a {states*1} vector of payments. In this case, if:

p:
good weather bad weather
0.285 0.665

and q:

good weather 63
bad weather 48

then pv = 49.875.

If the tree could be purchased for 49.875 apples:

p:
present good weather bad weather
1.000 0.285 0.665

and q:
present -49.875
good weather 63
bad weather 48

and npv = 0.0.

The net present value of an investment that is valued correctly ("fairly priced") is zero, as in this case.

The goal of an arbitrageur is to find a strategy with a positive net present value. As indicated earlier,
arbitrages are hard to find in well-developed capital markets.
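
The product p*q can be sketched in Python as an ordinary dot product. The helper name dot is an illustrative assumption; the vectors are those given above.

```python
from fractions import Fraction

def dot(p, q):
    """Row vector p times column vector q, i.e. p*q in the text's notation."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    return sum(pi * qi for pi, qi in zip(p, q))

# Present value: future payments only.
p_future = [Fraction("0.285"), Fraction("0.665")]  # good weather, bad weather
q_future = [63, 48]
pv = dot(p_future, q_future)

# Net present value: add a "present" element priced at one present apple.
p_all = [Fraction(1)] + p_future
q_all = [-pv] + q_future  # buying the tree for exactly its present value
npv = dot(p_all, q_all)

print(float(pv), float(npv))  # 49.875 0.0
```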

Productive Investment
www.stanford.edu/~wfsharpe/mia/prc/mia_prc2.htm 2/16

How can an entrepreneur evaluate a "trade with nature" (productive investment)? If the commodities
produced by such activity will not significantly alter either commodity prices or time-state-claim prices,
the rule is simple: engage in any such investment with positive net present value.

Assume that a scientist discovers how to plant 60 apples in a way that will produce 100 apples if the
weather is good and 50 apples if the weather is bad. Is this desirable? To find out, compute the net
present value:

p:
present good weather bad weather
1.000 0.285 0.665

q:

present -60
good weather 100
bad weather 50

and npv = 1.75. The NPV is positive, so the technology should be implemented.

The entrepreneur might sell stock in this apple firm. Its value would be 61.75 apples, which is the
present value of the possible future payments. Thus the scientist's investment of 60 apples could be
turned into 61.75 apples today. She could (1) use the profit of 1.75 apples today, (2) swap it for apples
next year if the weather is good, (3) swap it for apples next year if the weather is bad, or (4) select some
combination of the three alternatives.
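
The decision rule and the scientist's alternatives can be worked through in a few lines of Python. This is a sketch under the assumptions of the example (dealer prices unaffected by the project); the function name npv is illustrative.

```python
from fractions import Fraction

PRICE_GA = Fraction("0.285")  # present apples per good weather apple
PRICE_BA = Fraction("0.665")  # present apples per bad weather apple

def npv(outlay, good_payoff, bad_payoff):
    """Net present value, in present apples, of paying `outlay` today."""
    return -outlay + PRICE_GA * good_payoff + PRICE_BA * bad_payoff

project_npv = npv(60, 100, 50)
stock_value = project_npv + 60           # value of the future payments alone
should_invest = project_npv > 0          # the positive-NPV rule

# Alternatives (2) and (3): swap the whole profit for future apples.
profit_as_ga = project_npv / PRICE_GA    # good weather apples
profit_as_ba = project_npv / PRICE_BA    # bad weather apples

print(float(project_npv), float(stock_value))  # 1.75 61.75
```

Converted entirely into one state, the 1.75 present apples buy roughly 6.14 good weather apples or roughly 2.63 bad weather apples.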

Asymmetric Information
In practice, of course, it is not always easy for an insider to cash in on the future prospects of his or her
enterprise. Our example assumes that outside investors are able to confirm that the scientist's tree will in
fact produce 100 apples if the weather is good and 50 if the weather is bad. Security analysts, who study
publicly-traded securities, attempt to make the best possible forecasts of firms' future prospects and the
associated payments to the holders of their securities. Banks and other credit-granting agencies expend
substantial resources to assess the firms to which they may lend money. And, of course, the
managements of companies provide information about their progress and prospects. However, there is
the ever-present possibility that insiders may provide forecasts that are overly optimistic, either through
excessive enthusiasm or in an attempt to drive up the prices of securities that they may wish to sell.

The underlying problem is the fact that the parties to a proposed trade may have different sets of
information. The possibly negative effects of the resulting asymmetry can be mitigated in a number of
ways. Insiders may retain substantial positions in a firm to show their good faith. They may pledge their
personal assets to cover certain types of shortfalls. And so on. At the very least, however, additional
resources will almost certainly have to be expended to verify predictions, audit records, etc.

In our quest to establish first principles in a simple setting, we leave until later an extended discussion of
issues associated with differential information, the costs associated with attempts to improve participants'
information sets, etc. For now we assume that people agree on the payment vectors associated with
various investments and that the only source of uncertainty concerns the state of the world that will
actually obtain.

Riskless Securities
A riskless security pays the same amount at a given time, no matter what state of the world obtains.
Equivalently, it is a bundle of equal amounts of atomic claims for a time period. In our case, a riskless
security pays a fixed amount (say X apples) at time period 1, whether the weather has been good or bad.
Equivalently, it is a bundle of X good weather apples and X bad weather apples.

To value such a security one follows the general rule: multiply price times quantity. For example,
assume that a AAA-rated security promises to pay 20 apples at time period 1. Then:

p:
good weather bad weather
0.285 0.665

q:
good weather 20
bad weather 20

and pv = 19.00.

Note that this computation involves multiplying a fixed amount (20) by each of the atomic security
prices for claims for the associated date, then summing the results. One could as well have summed the
prices, then multiplied the result by the fixed payment. In this case:

(0.285 * 20) + (0.665 * 20)
= (0.285 + 0.665) * 20
= 0.95 * 20
= 19.00

The sum of the appropriate atomic security prices is termed the discount factor for the date in question. It
represents the present value of a payment of one unit to be made with certainty at the specified future
date. The process of computing a present value in this manner is called discounting. Thus one discounts
the (certain) future value to obtain the present value of a riskless security.

Note that this calculation, like any other valuation analysis, rests on the ability to obtain the same set of
payments in a different way. An apple a year from now has a present value of 0.95 apples because one
can obtain the same thing using other types of transactions. Absent functioning markets that provide
alternatives, the processes of valuation (in general) and discounting (in particular) lack a rigorous basis.
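
The discount factor can be computed directly from the atomic prices. The Python sketch below assumes, as in the example, a single future date with two states; the function names are illustrative.

```python
from fractions import Fraction

# Atomic security prices for the single future date (time period 1).
atomic_prices = {"good": Fraction("0.285"), "bad": Fraction("0.665")}

def discount_factor(prices_for_date):
    """Present value of one unit paid with certainty at the date."""
    return sum(prices_for_date.values())

def pv_riskless(amount, prices_for_date):
    """Discount a certain future payment to the present."""
    return discount_factor(prices_for_date) * amount

df = discount_factor(atomic_prices)
# The same value obtained state by state, as in any other valuation.
state_by_state = sum(price * 20 for price in atomic_prices.values())

print(float(df), float(pv_riskless(20, atomic_prices)))  # 0.95 19.0
```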

The Law of One Price


In an arbitrage-free economy with no transactions costs, any given time-state claim will sell for the same
price, no matter how obtained. This will also be true for any "package" of time-state claims. This
property is known as the law of one price.

It is easy to see why the law must hold in an economy of the sort we have posited. Assume that a given
set of time-state claims can be traded for cash today at either of two prices -- X or Y. Then an
arbitrageur could buy the set of claims for the lower of the two prices and sell it for the higher, pocketing
the difference. No matter what occurred in the future, he or she would receive from one counterparty
exactly the cash required to meet the promises made to the other.

In the real world where transactions costs are relevant, the lack of arbitrage opportunities only ensures
that prices for a given set of time-state claims will fall within a band narrow enough to preclude
transactions that can provide something for nothing net of transactions costs. Moreover, since different
traders face different transactions costs, it may be possible for some to make money from discrepancies
in prices, while others cannot.


Traders, financial institutions, and those who create new financial instruments attempt to exploit
discrepancies in prices that arise from transactions costs, broadly construed. In so doing, they may make
money for themselves. But they also tend to decrease the transactions costs that will be borne by others.
Thus the actions of arbitrageurs and other traders tend to bring markets for key time-state claims and
combinations of such claims closer and closer to the ideal of zero transactions costs and true conformity
with the law of one price.

Financing Methods
We return now to our "standard" apple tree, which will produce 63 apples if the weather is good and 48
if the weather is bad.

Consider an entrepreneur who wishes to set up a firm that will buy one such tree. As we have shown,
the present value of the tree is 49.875 (present) apples. To buy the tree, our entrepreneur thus needs
49.875 apples. How can he get them?

One alternative is simply to take them out of his bank account, borrow money on his house, etc. In such a
case, he will serve as both owner and manager.

To keep things simple, we assume that the firm has no costs. Hence its revenues will equal its profits
(revenues minus costs). Since our world ends at the end of the year, the firm can also be expected to
distribute these profits.

If the entrepreneur provides all the financing, the firm can provide him with a security that provides the
holder with the property right to receive all the earnings that it distributes. Such an instrument would be
termed an equity security or, more commonly, a common stock. In this case, the firm can be said to have
employed an all-equity financing strategy.

In practice, the firm might issue 100 stock certificates, each representing the right to receive 1/100th of
the firm's distributions. Each certificate would be called a share of the firm's stock. The holder of one
share would receive 0.63 apples if the weather is good and 0.48 apples if the weather is bad. Not
surprisingly, one share would be worth 0.49875 present apples -- 1/100th of the value of the set of all
outstanding shares.

But there are other ways to finance a firm. In fact, the possibilities are almost endless. We consider first a
simple example involving two classes of securities.

Assume that the firm issues a bond of the following form:


The Apple Tree Firm promises to pay the holder 20 apples
at the end of the year, no matter what the weather has been

and a stock of the form:


The Apple Tree Firm promises to pay the holder all the
apples left over after the bondholder has been paid.

The stock is a residual claim, while the bond is a prior claim -- it is senior in the firm's capital structure.

What is the bond worth? What is the stock worth?

First, the payment vectors.

qfirm:
good weather 63
bad weather 48

qbond:

good weather 20
bad weather 20

qstock = qfirm - qbond:

good weather 43
bad weather 28

It is straightforward to compute the values:

p*qfirm = 49.875
p*qbond = 19.000
p*qstock = 30.875
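
These three values can be reproduced with a short Python sketch. The vectors are those given above; the function name value is an illustrative assumption.

```python
from fractions import Fraction

p = [Fraction("0.285"), Fraction("0.665")]  # good weather, bad weather

q_firm = [63, 48]
q_bond = [20, 20]
# The stock is the residual claim: whatever is left after the bond is paid.
q_stock = [f - b for f, b in zip(q_firm, q_bond)]

def value(p, q):
    """Price times quantity, summed over states (p*q in the text)."""
    return sum(pi * qi for pi, qi in zip(p, q))

print(q_stock)                                                   # [43, 28]
print([float(value(p, q)) for q in (q_firm, q_bond, q_stock)])   # [49.875, 19.0, 30.875]
```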

The Principle of Value Additivity


In the previous example the sum of the values of the securities equaled the value of the firm. This is not
too surprising. The payments made by the firm are simply divided among the claimants:
qbond + qstock = qfirm

Thus it must be the case that:


p*(qbond + qstock) = p*qfirm

and:
p*qbond + p*qstock = p*qfirm

Spelled out:
value of bond + value of stock = value of firm

The result is perfectly general. No matter how the payments are divided among claimants, the sum of the
values will be the same. This is known as the principle of value additivity.

While it is entirely possible to help one set of claimants and hurt others by rearranging the allocation of
cash flows among them, the sum of the values should be unaffected by any such allocation. More
simply put, financial decisions that only redistribute claims should not affect total value. Some
characterize this as the principle of the conservation of value.

It is, of course, important to include all the claimants in such calculations. In the real world, governments
impose taxes on firms and/or those who receive income from firms. As a result, the government must be
included when considering claimants to a firm's cash flows. Moreover, lawyers, accountants, investment
bankers and others are likely to absorb more of a firm's proceeds under some financial arrangements
than under others. Thus while total value may be conserved, some financial legerdemain may divert
substantial amounts of value from prior claimants towards those who aid in a transformation.
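
Value additivity with more than two claimants can be illustrated in Python. The 10% tax on the firm's payments is a purely hypothetical figure introduced for this sketch; it does not appear in the example above.

```python
from fractions import Fraction

p = {"good": Fraction("0.285"), "bad": Fraction("0.665")}
q_firm = {"good": Fraction(63), "bad": Fraction(48)}

def value(q):
    """Present value of a payment vector at the atomic security prices."""
    return sum(p[s] * q[s] for s in p)

TAX_RATE = Fraction("0.10")  # hypothetical: government takes 10% of payments

q_tax = {s: TAX_RATE * q_firm[s] for s in q_firm}
q_bond = {s: Fraction(20) for s in q_firm}
q_stock = {s: q_firm[s] - q_tax[s] - q_bond[s] for s in q_firm}

total = value(q_tax) + value(q_bond) + value(q_stock)
print(total == value(q_firm))  # True
```

The stock is worth less than in the no-tax case, but once the government's claim is counted the values still sum to the value of the tree.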

The principle of value additivity assumes that prices of time-state claims are unaffected by any changes
in the financing of the firm. If a proposed change is large relative to the underlying set of associated
time-state claims in an economy, it may be necessary to take into account alterations of time-state claim
prices and any resulting increases or decreases in value.

If financial arrangements actually affect a firm's operations and hence its revenues and/or costs, value
may in fact be increased (or reduced) as well as re-distributed. Some have argued that greater ownership
by managers and/or greater debt burdens may increase managerial incentives to maximize profit and
hence increase overall firm value. The possibility of such effects is generally accounted for outside the
domains of standard valuation theory.

Risky Debt
A bond represents debt in which a borrower promises to pay a lender specified amounts in the future.
More precisely, the borrower promises to pay if he or she can. If the borrower is a corporation with
limited liability, the payment will be made in full and on time only if the borrower's cash inflows and
cash outflows associated with claims with greater priority permit. Otherwise, some or all of the promised
payment will be in default (i.e. not paid).

Consider an owner of one of our apple trees who issues a bond that promises to pay 60 apples in period
1 (if possible). If the weather turns out to be good, the payment will be made in full. If it turns out to be
bad, only 48 apples will in fact be paid.

The valuation of such a bond is shown below.


                               Present
State      Payment     Price     Value
good            60     0.285     17.10
bad             48     0.665     31.92
                              --------
                                 49.02

Note that the bond will not sell for as much as a similar bond that is riskless. The latter would sell for
60*0.95, or 57 present apples. Comparing the promised payment with the present value (price of the
bond) will favor the risky bond (60/49.02) over the riskless one (60/57). It is not surprising that risky
bonds offer higher promised yields than riskless ones. A more difficult question concerns their expected
yields, taking the possibility of default into account. We deal with these issues subsequently.

The presence of risky debt does nothing to affect the principle of value additivity. Consider the
prospects of the residual claimants (stockholders) in this case:
                               Present
State      Payment     Price     Value
good             3     0.285     0.855
bad              0     0.665     0
                              --------
                                 0.855

The sum of the values of the bond and stock claims will be 49.02 + 0.855, or 49.875, which equals the
value of the underlying assets (the apple tree).
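
The effect of limited liability on the two claims can be sketched in Python: the bond receives the smaller of the promised payment and the assets available, and the stock receives the rest. The function names are illustrative assumptions.

```python
from fractions import Fraction

PRICES = {"good": Fraction("0.285"), "bad": Fraction("0.665")}

def split_payoffs(assets, promised):
    """Divide the assets in each state between a bond and the residual stock."""
    bond = {s: min(promised, a) for s, a in assets.items()}
    stock = {s: a - bond[s] for s, a in assets.items()}
    return bond, stock

def value(payoffs):
    """Present value at the atomic security prices."""
    return sum(PRICES[s] * amount for s, amount in payoffs.items())

tree = {"good": 63, "bad": 48}
bond, stock = split_payoffs(tree, 60)
bond_value, stock_value = value(bond), value(stock)

# Promised payment per present apple invested, risky versus riskless.
risky_ratio = Fraction(60) / bond_value
riskless_ratio = Fraction(60) / (60 * sum(PRICES.values()))

print(float(bond_value), float(stock_value))  # 49.02 0.855
```

The risky ratio (60/49.02, about 1.224) exceeds the riskless ratio (60/57, about 1.053), in line with the comparison of promised yields above.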

Re-allocating Value Among Groups of Claimants


One of the problems associated with financing via two or more classes of claims is that of avoiding
decisions that may increase a firm's value but actually decrease the value of one or more sets of claims. A
simple example can illustrate the point.

Consider the apple tree firm that has bonds outstanding with promised payments of 60 apples. As shown
earlier, the value of the firm is 49.875, divided between the value of the bonds (49.02) and that of the
stocks (0.855). Now, assume that management has an opportunity to trade its present apple tree for one
that will produce 61 apples if the weather is good and 49 apples if the weather is bad. The value of this
new tree is shown below:

                               Present
State      Payment     Price     Value
good            61     0.285    17.385
bad             49     0.665    32.585
                              --------
                                49.970

Clearly the proposed trade is desirable, since it will increase value from 49.875 to 49.970. But this new
value will be distributed differently:

            Before     After
Bond        49.020    49.685
Stock        0.855     0.285
          --------  --------
            49.875    49.970

While the value of the firm has increased, the value of the stock has actually decreased! The increase in
the bond's value due to its greater security has swallowed more than the entire increase in the value of the
firm. While those holding bonds (or even proportional amounts of bonds and stocks) would endorse the
change, those holding only stock would be violently opposed.
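
The before-and-after comparison can be reproduced in Python. This sketch assumes the same bond (60 apples promised) and the same atomic prices as above; the function name claim_values is illustrative.

```python
from fractions import Fraction

PRICES = {"good": Fraction("0.285"), "bad": Fraction("0.665")}
PROMISED = 60  # apples promised to bondholders

def claim_values(assets):
    """Value the firm, its risky bond, and the residual stock."""
    firm = sum(PRICES[s] * a for s, a in assets.items())
    bond = sum(PRICES[s] * min(PROMISED, a) for s, a in assets.items())
    # By value additivity the stock is worth whatever the bond does not absorb.
    return {"bond": bond, "stock": firm - bond, "firm": firm}

before = claim_values({"good": 63, "bad": 48})  # the present tree
after = claim_values({"good": 61, "bad": 49})   # the proposed tree
change = {k: after[k] - before[k] for k in before}

for k in ("bond", "stock", "firm"):
    print(k, float(before[k]), float(after[k]), float(change[k]))
```

The firm gains 0.095 present apples, but the bond gains 0.665 and the stock therefore loses 0.570.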

In principle, stockholders and bondholders in such a situation could work out a re-arrangement of terms
to their mutual advantage so that such an opportunity could be exploited. However, this may require
time, bargaining, legal costs, etc., making the cost greater than the benefit.

In this case a change in the firm's business to one of lower risk advantaged holders of (formerly) risky
debt and disadvantaged holders of junior claims (here, stock). In the converse situation, an increase in
the risk of a firm's operations may lower the value of bonds and increase the value of stock. Such a
change may not enhance the firm's total value, but may still prove desirable for stockholders. To
minimize temptations on the part of management to engage in such tactics, bondholders typically require
covenants placing at least some restrictions on management prerogatives. The danger, of course, is that
profitable (value-enhancing) undertakings that may increase risk will be forgone as a result.

Inferring Atomic Security Prices


Thus far we have assumed that dealers stand ready to buy and sell atomic securities that pay off in one
and only one time and state of the world. This is not totally fanciful, for financial instruments with
similar characteristics do exist. For example, a term life insurance policy will provide payment only if
the state of the world is "insured is dead". Conversely, an annuity policy will provide payments only if
the state of the world is "insured is alive". Indeed, one could (at great expense) provide a riskless
security by purchasing both a life insurance policy and an annuity. Nonetheless, it is true that the typical
traded security is better characterized as a bundle of different types of time-state claims. Does this
obviate our approach? In principle, no.

Imagine a world in which only two securities are traded on a regular basis. One is the common stock of
the Apple Tree Firm described above. The other is its bond. It is convenient to represent their payments
in matrix form.

Let Q be:
Bond Stock
Good Weather 20 43
Bad Weather 20 28

Assume that the bond sells for 19 present apples and that the stock sells for 30.875 present apples. The
vector of security prices is thus:

ps:
Bond Stock
19 30.875

Consider now the payoffs obtained from a given combination of securities, for example, a portfolio that
includes 1 bond and 2 stocks. We represent this as a vector of the number of each type of security. Here,

n:

Bond 1
Stock 2

To determine the number of apples provided by this portfolio in each state of the world we multiply Q
by n to obtain c, a vector of payments. (We utilize c to indicate cash flows, even though "a" might be
more appropriate here, given that apples are involved.)

As always, it is useful to check to see that the dimensions are appropriate. Here:
Q {states*securities} * n {securities*1} ----> c {states*1}

This operation can be performed regardless of the number of securities. In this case we have as many
securities as states; however portfolios with fewer securities than states or with more securities than
states can be used in the calculations as long as the requisite information is contained in both Q and n.

In this case, the resulting set of state-contingent payments is:

c:
Good Weather 106
Bad Weather 76

Note that in these calculations we started with a portfolio, n, then computed the resulting contingent
cash flows (payments), c. We turn now to the reverse question.
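
The multiplication Q*n can be sketched in plain Python as follows. The helper name mat_vec is ours, and the lists simply transcribe the matrix and vector shown above.

```python
# Payment matrix Q: one row per state, one column per security (bond, stock).
Q = [[20, 43],   # good weather
     [20, 28]]   # bad weather
n = [1, 2]       # portfolio: 1 bond, 2 stocks

def mat_vec(matrix, vector):
    """Multiply a {rows*cols} matrix by a {cols*1} vector."""
    return [sum(q * x for q, x in zip(row, vector)) for row in matrix]

c = mat_vec(Q, n)   # state-contingent payments: good, bad
print(c)            # [106, 76]
```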

Assume that one wishes to obtain a set of state-contingent payments c. What portfolio n will provide
them? As before, the requisite equation is:
Q*n = c

If Q is square, it may be possible to take its inverse. If so:

n = inv(Q)*c

and this relationship can be used to determine the portfolio n that will provide a desired set of state-
contingent payments c.

For example, assume that one wishes to have 845 apples if the weather is good and 620 if the weather is
bad, i.e.

c:
   good weather   845
   bad weather    620

then: inv(Q)*c =
Bonds 10
Stocks 15

How much will this portfolio cost? To find out, we "price it out" by multiplying the vector of security
prices times the portfolio positions:
ps*n = 653.125

It will cost 653.125 present apples to provide the desired contingent payments (845 apples if the weather
is good and 620 if the weather is bad). One way to do this is to purchase 10 Apple Firm bonds and 15
Apple Firm Stocks.
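
As a rough check, the replicating portfolio and its cost can be computed in Python. This is only a sketch: it uses exact fractions and a hand-written 2x2 inverse (the helper names inv2 and mat_vec are ours) rather than the inv function of a matrix package.

```python
from fractions import Fraction

Q = [[20, 43], [20, 28]]                   # rows: good, bad; columns: bond, stock
ps = [Fraction("19"), Fraction("30.875")]  # security prices in present apples
c = [845, 620]                             # desired payments: good, bad

def inv2(m):
    """Invert a 2x2 matrix exactly; raises ZeroDivisionError if it is singular."""
    (a, b), (c_, d) = m
    det = Fraction(a * d - b * c_)
    return [[d / det, -b / det], [-c_ / det, a / det]]

def mat_vec(matrix, vector):
    """Multiply a {rows*cols} matrix by a {cols*1} vector."""
    return [sum(q * x for q, x in zip(row, vector)) for row in matrix]

n = mat_vec(inv2(Q), c)                    # n = inv(Q)*c
cost = sum(p * x for p, x in zip(ps, n))   # ps*n
print([float(x) for x in n], float(cost))  # [10.0, 15.0] 653.125
```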

Note that in the above calculations:


ps*n = ps*inv(Q)*c = [ps*inv(Q)]*c

The bracketed expression is of particular interest. In this case it is:


ps*inv(Q) =
0.285 0.665

This should look familiar. It is, in fact, the vector of atomic state prices with which we started. This is
not surprising, since the cost of any vector of state-contingent payments can be found by multiplying this
vector times the desired payment vector. In the special case in which c has a one in the first row and
zero in the second, the answer will be 0.285. But this is the cost of an atomic claim in state 1 (good
weather). Similarly, if c has a zero in the first row and a one in the second, the answer will be 0.665 --
the cost of an atomic claim in state 2 (bad weather). The result is quite general:

p = ps*inv(Q)
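
The same result can be reproduced with a short Python sketch. The helpers inv2 and vec_mat are illustrative names of ours; exact fractions are used so that the recovered prices match the original atomic prices without rounding.

```python
from fractions import Fraction

Q = [[20, 43], [20, 28]]                   # rows: good, bad; columns: bond, stock
ps = [Fraction("19"), Fraction("30.875")]  # security prices in present apples

def inv2(m):
    """Invert a 2x2 matrix exactly; raises ZeroDivisionError if it is singular."""
    (a, b), (c_, d) = m
    det = Fraction(a * d - b * c_)
    return [[d / det, -b / det], [-c_ / det, a / det]]

def vec_mat(vector, matrix):
    """Multiply a {1*rows} row vector by a {rows*cols} matrix."""
    return [sum(v * row[j] for v, row in zip(vector, matrix))
            for j in range(len(matrix[0]))]

p = vec_mat(ps, inv2(Q))                   # p = ps*inv(Q)
print([float(x) for x in p])               # [0.285, 0.665]

# Pricing the earlier payment vector directly with the atomic prices.
cost = sum(price * pay for price, pay in zip(p, [845, 620]))
print(float(cost))                         # 653.125
```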

Even in a market in which atomic securities are not traded explicitly, it is possible to create them
synthetically by combining positions in existing securities. Moreover, any desired set of payments can be
replicated with a suitably-chosen portfolio of existing securities. The cost of obtaining that set of
payments can be determined by computing the cost of the replicating portfolio. Equivalently, it can be
determined by pricing the contingent payments using the atomic security prices implicit in the prices of
existing securities.

Note that this bit of apparent legerdemain requires the inversion of Q -- the matrix that maps securities to
payments in states of the world. For this to even be possible, Q must be square -- there must be precisely
as many securities as states of the world. In addition, the securities must be sufficiently "different" that
an inverse can be computed. If both conditions are met, the securities represented in the matrix can be
said to span the space of time-state payments.

What if there are more securities in the world than there are states? Simple. If there are M states, select
M (different) securities for inclusion in matrix Q, then compute the implied atomic security prices. Next,
for each remaining security: compute the present value using the derived set of atomic prices and
compare the result with its traded price. If there are no discrepancies, the market is arbitrage-free and the
computed atomic prices can be used for all further calculations. If you find a discrepancy, it is possible
to get something for nothing via arbitrage with a set of trades involving the securities in Q and the
security for which the associated value differs from price. Stop everything and take advantage of this
information. Then, when you have helped bring markets back to an arbitrage-free status (and reaped
your reward for undertaking this socially valuable activity), proceed with the analysis as above.
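
The comparison of value with price for the remaining securities might be sketched as follows. Securities Y and Z, with their payments and traded prices, are hypothetical and serve only to illustrate the check; the atomic prices are those implied by the bond and stock.

```python
from fractions import Fraction

p = [Fraction("0.285"), Fraction("0.665")]   # implied atomic prices: good, bad

def present_value(payments):
    """Value a vector of state-contingent payments with the atomic prices."""
    return sum(price * pay for price, pay in zip(p, payments))

# Hypothetical extra securities: (state payments, traded price).
others = {
    "Security Y": ([30, 10], Fraction("15.2")),   # priced consistently
    "Security Z": ([30, 10], Fraction("15.5")),   # deliberately mispriced
}
for name, (payments, price) in others.items():
    gap = price - present_value(payments)
    verdict = "arbitrage-free" if gap == 0 else f"discrepancy of {float(gap)}"
    print(name, verdict)
```

A positive gap means the security is overpriced relative to its replicating portfolio of bonds and stocks, so one would sell it and buy the portfolio; a negative gap calls for the reverse.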

Financial Engineering

Imagine an investor who tells an investment banker that he would very much like to receive 100 apples
next year if the weather is good, and 130 if it is bad. His question: how much will the investment banker
charge to guarantee that her firm will provide such payments?

To find the answer, we set c:

   good weather   100
   bad weather    130

then compute:

n = inv(Q)*c:

Bonds 9.30
Stocks -2.00

and:
ps*n = 114.95

If the client will pay more than $114.95, the investment banker can provide the commitment and make a
guaranteed profit. Perhaps she will quote $120.00. If the client accepts, the banker can purchase 9.30
Apple Firm bonds and short 2.00 of its stocks. This will cost $114.95, leaving $5.05 in profit. Moreover,
the investment banker is perfectly hedged. No matter what the future state of the world may be, her
payments to the client will be exactly offset by the net receipts from the portfolio.
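
The banker's calculation, including a check that the hedge is exact in each state, can be sketched as follows. The solver solve2 (Cramer's rule) is our own illustrative helper, and the quote of 120 is the one suggested in the text.

```python
from fractions import Fraction

Q = [[20, 43], [20, 28]]                   # rows: good, bad; columns: bond, stock
ps = [Fraction("19"), Fraction("30.875")]  # security prices
c = [100, 130]                             # payments promised to the client
quote = Fraction("120")                    # price quoted to the client

def solve2(m, rhs):
    """Solve the 2x2 system m*n = rhs by Cramer's rule, using exact fractions."""
    (a, b), (c_, d) = m
    det = Fraction(a * d - b * c_)
    return [(rhs[0] * d - b * rhs[1]) / det, (a * rhs[1] - c_ * rhs[0]) / det]

n = solve2(Q, c)                            # 9.3 bonds, -2 stocks (a short position)
cost = sum(p * x for p, x in zip(ps, n))    # cost of the hedging portfolio
profit = quote - cost
# Net position in each state: portfolio receipts minus payments to the client.
residual = [sum(q * x for q, x in zip(row, n)) - pay for row, pay in zip(Q, c)]
print([float(x) for x in n], float(cost), float(profit), residual)
```

The residual is zero in both states, which is what "perfectly hedged" means here.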

Specialists who make such computations are termed financial engineers. Their task is to determine ways
to provide desired sets of payments under various contingencies with existing securities and/or new
arrangements that can partially or fully offset other commitments.

Opportunity Sets
Given an initial wealth and a set of market opportunities, an investor can attain a number of alternative
combinations of time-state claims. The set of all such opportunities is termed (rather unimaginatively),
the investor's opportunity set.

To separate the influence of wealth from that of market opportunities, one can consider the set of
opportunities available with a wealth of one unit of present value. A particular investor's opportunity set
will have the same form, scaled up as needed to account for his or her wealth.

In the present instance we can plot such a set as a three-dimensional diagram since there are only three
needed dimensions -- present apples, good weather apples and bad weather apples. We do so in a later
section. For now we consider an even simpler case that focuses on the opportunities for future apples
per present apple invested.

We plot four investment strategies. The first represents purchase of a pure "bad weather apple" security,
the second one of the Apple Tree Firm's bonds, the third one of the firm's stocks, and the last a pure
"good weather apple" security. The associated payment matrix is:

Q:

                 Good   Bond   Stock   Bad
Good weather      1      20     43      0
Bad weather       0      20     28      1

and the associated security price vector, ps is:

   Good     Bond      Stock     Bad
   0.285    19.000    30.875    0.665

We wish to determine the future apples per present apple invested for each of these securities. To do so
we divide each future value in Q by the corresponding price in ps:

Q ./ [ps;ps] =
                 Good     Bond     Stock    Bad
Good Weather     3.5088   1.0526   1.3927   0
Bad Weather      0        1.0526   0.9069   1.5038

The ratio of an ending value to the initial value is termed a value relative. Thus if the weather is good,
the value relative for a stock will be 1.3927. Subtracting 1.0 gives the return. If the weather is good, the
stock will return 0.3927, or 39.27 percent. Note that an atomic security will return -100% in all states
and times but the one for which it is designed.
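
The element-wise division and the resulting returns can be reproduced with the following Python sketch, which mirrors the MATLAB-style expression above using ordinary lists.

```python
Q = [[1, 20, 43, 0],    # good weather payments
     [0, 20, 28, 1]]    # bad weather payments
ps = [0.285, 19.0, 30.875, 0.665]   # prices of Good, Bond, Stock, Bad

# Divide each row of Q, element by element, by the price vector.
value_relatives = [[q / price for q, price in zip(row, ps)] for row in Q]
# A return is the value relative minus one.
returns = [[v - 1.0 for v in row] for row in value_relatives]

for state, row in zip(("Good weather", "Bad weather"), value_relatives):
    print(state, [round(v, 4) for v in row])
```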

All the returns are plotted in the figure below and connected with a straight line:

The point representing the bond lies on a line sloping upward at 45 degrees from the origin, since it
provides the same payment in each state of the world. The point representing the stock lies above it and
to the left. One can attain any combination lying on the straight line connecting these two points by
holding a portfolio of both securities, with the sum of the values of the two holdings equal to one present
apple. The larger the amount invested in the stock, the closer the point will be to the point representing
an all-stock portfolio. The larger the amount invested in the bond, the closer the point will be to the point
representing an all-bond portfolio. Let rb be the vector of value relatives for the bond (column 2 in the
matrix above) and rs the vector of value relatives for the stock (column 3 in the matrix above). Then the
vector of value relatives for a portfolio with proportion xs invested in the stock and (1-xs) invested in the
bond will be:

rp = xs*rs + (1-xs)*rb

For example, with xs = 0.6:


rp =
1.2567
0.9652

If only the bond and the stock can be traded, how can one attain points on the line outside the range
encompassed by these two securities? Simple. Apply the same formula, but with a negative value of xs
or (1-xs). Thus if xs = -0.5 and (1-xs) = 1.5:

rp=
0.8826
1.1255

Since the equation is one of a straight line, any point on the extension of the line through the points
representing the two traded securities is available by combining a long (positive) position in one with a
short (negative) position in the other. Our two atomic securities are, of course, extreme cases of this
general principle.
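
The formula for rp is easy to experiment with. The sketch below (the function name portfolio is ours) reproduces the two examples and then solves for the stock weight that removes all bad-weather payoff, which recovers the pure good weather security.

```python
rb = [20 / 19, 20 / 19]            # bond value relatives: good, bad
rs = [43 / 30.875, 28 / 30.875]    # stock value relatives: good, bad

def portfolio(xs):
    """Value relatives with proportion xs in the stock and 1-xs in the bond."""
    return [xs * s + (1 - xs) * b for s, b in zip(rs, rb)]

for xs in (0.6, -0.5):
    print(xs, [round(r, 4) for r in portfolio(xs)])

# Stock weight at which the bad-weather value relative is zero. The bond
# weight 1 - xs_good is then negative, i.e. the bond is sold short.
xs_good = rb[1] / (rb[1] - rs[1])
print(round(xs_good, 4), round(portfolio(xs_good)[0], 4))
```

The last line shows that a heavily levered stock position financed by shorting bonds pays about 3.5088 apples per present apple in good weather and nothing otherwise.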

This example shows graphically why any two sufficiently different securities can be used to "span a space" with two states
of the world. It also shows why any security not priced in accordance with the atomic prices implied by
two traded securities will present an opportunity for arbitrage. Assume that a third security (Z) exists that
plots above the line in the figure. Imagine a line drawn from it to the origin. Label as ZZ the point at
which this constructed line crosses the line in the figure. The portfolio of bonds and stocks that will
provide ZZ offers a smaller payoff in each state of the world than does Z. Thus one could take a short
position in portfolio ZZ and use the proceeds (one present apple) to purchase security Z. In each future
state of the world, security Z would provide more than enough apples to pay the counterparty or
counterparties to the short position in portfolio ZZ. Voila: something for nothing.

Once arbitrage opportunities of this sort have disappeared, all possible strategies will lie on a linear
opportunity set. In a two-dimensional case such as this, the set will plot as a line. In a three-dimensional
case it will plot as a plane. In higher dimensional cases the task of plotting would be arduous indeed, but
mathematicians would say that the points "plot" on a hyperplane.

If one wishes to be precise, both the linear frontier and all points under it should be considered members
of the opportunity set, since one can always throw apples away. The points on the frontier can be said to
constitute the set of efficient opportunities, since only individuals who could be satiated (get too much of
a good thing) might wish to consider interior points.

Note that all of this discussion depends on the assumption that one can take a negative position in a
security when and if desired. This can be done by simply signing a document of the form:
I promise to pay the holder whatever the Apple Tree Firm
pays its stockholders

Alternatively, one can engage in a short sale. This is implemented as follows. Assume that A wishes to
sell stock X short (equivalently, take a short position). He or she can borrow shares from B, who does
own them, then sell them to C (who can remain oblivious to the fact that the shares were never actually
owned by A). Upon the sale of the shares, A will receive an amount equal to the price of the shares --
exactly the reverse of the situation that would obtain if he or she had purchased them (in which case A
would have paid this amount). However, A will have promised to "make B whole", by paying to B
anything that B would have received had he or she retained the shares. In addition, B can usually
require that A return the shares on demand. One way or another, A will pay the amounts that someone
who had purchased the shares would have received. Here, too, the situation is reversed. In such
circumstances, a short sale will in fact be equivalent to a negative purchase.

In practice, things are not always this simple. B may worry that A will be unable to make some or all of
the required payments and/or fail to purchase the stock if and when B calls for its return. Hence B may
demand that A earmark some "good faith money" that can be acquired in the event of any such default.
Worse yet, B may require that A forgo some or all of the interest earned on such money, with such gains
going to B. Under such circumstances, a short sale is not precisely equivalent to a negative purchase.

Increasingly, institutional arrangements allow investors to take short positions that are very similar, if not
identical, to negative holdings. Any costs or impediments associated with such approaches can be
considered transactions costs. As usual, we will generally ignore them to avoid even more complexity.

Consumption and Investment Decisions


Consider the opportunities available to an individual with W apples to spend. He or she could spend the
entire W apples immediately, obtaining thereby W units of consumption today. In this rather extreme
case, he or she could look forward to no consumption in the future.

An alternative (and equally extreme) strategy would involve purchase of the maximum number of
claims for future consumption if the weather is good. Since the price of each such claim is 0.285 apples,
he or she can exchange W apples today for 3.5088*W apples in the future if the weather is good. Of
course, such a choice involves zero consumption today and zero consumption in the future if the
weather is bad.

The third possible extreme strategy involves the exchange of apples today for the maximum possible
amount of consumption in the future if the weather is bad. Since each such claim costs 0.665, a total of
1.5038*W apples can be consumed under these conditions, but only by sacrificing both consumption
today and in the future in the event that the weather is good.

These choices are extreme or pure consumption-investment strategies. One and only one type of time-
state claim is chosen, with all others rejected. Clearly, few would choose such strategies.

The figure below shows the opportunity set in this case. It is a plane, the borders of which are shown by
the three red lines connecting the extreme strategies.

The most interesting alternatives lie on the portion of the plane away from the corners. Such points
represent efficient combinations of present and contingent future consumption. Any point on the plane
can be obtained by an individual with a wealth (present value) of W apples by a judicious allocation of
that wealth among the three "pure" strategies and/or any other securities that are priced appropriately.
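
One way to see why these points form a plane is that every attainable plan costs exactly W present apples. The sketch below checks this for the three corner strategies and one mixed plan; the wealth of 100 apples and the particular mix are hypothetical choices made only for illustration.

```python
P_GOOD, P_BAD = 0.285, 0.665   # atomic prices from the text

def cost(now, good, bad):
    """Present value (in present apples) of a consumption plan."""
    return now + P_GOOD * good + P_BAD * bad

W = 100.0  # hypothetical wealth in present apples
plans = {
    "consume everything now": (W, 0.0, 0.0),
    "all good-weather claims": (0.0, W / P_GOOD, 0.0),
    "all bad-weather claims": (0.0, 0.0, W / P_BAD),
    # Half consumed now, the rest split evenly (by value) across the states.
    "mixed": (0.5 * W, 0.25 * W / P_GOOD, 0.25 * W / P_BAD),
}
for name, plan in plans.items():
    print(name, [round(x, 2) for x in plan], round(cost(*plan), 6))
```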

Our use of the term wealth in this example is not gratuitous. The wealth of an individual may be defined
as the maximum amount of present consumption that could be obtained if all his or her property were
sold (traded for present units of the numeraire). Thus an individual who owned an apple tree might
currently be located at a point near the middle of a plane such as that shown in the figure, but his or her
wealth would still be measured by the intercept on the consumption axis of the plane through that point.

Given market prices, individual opportunity sets will differ only in scale. If individual D is twice as
wealthy as individual C, her opportunity set will be parallel to that of C but twice as far from the origin.
An individual's opportunity set will thus be determined by his or her wealth and security characteristics
and prices.

In an economic sense, anyone who chooses a point on the opportunity set other than the one at the all-
consumption corner is an investor who sacrifices potential present consumption to obtain at least the
possibility of future consumption. The goal of the Analyst is to help the Investor understand the possible
trade-offs and then move efficiently from the point representing current opportunities to the point on the
frontier of the opportunity set that is most desirable for the Investor.

It is convenient to decompose a set of choices of this sort into a consumption/investment decision and an
investment decision. In a three-dimensional diagram, the former would concern the position chosen on
the Consumption Now axis, while the latter would concern the relative positions chosen on the Good
and Bad axes. While this dichotomy is useful, it is important to remember that investment opportunities
may influence one's consumption decision and that consumption opportunities may influence one's
investment decision.

10/23/12 Multiple Commodities, States and Times

Multiple Commodities, States and Times

Contents:
Expanding the World
Money
Real and Nominal Interest Rates
Multiple Commodities
Multiple States
Incomplete Markets
Hedging at Minimum Cost
Completing a Market
Multiple Times
Dynamic Strategies
Model Risk
Options
Derivatives

Expanding the World


The Apple economy has served us well, but it is time to consider more complex worlds. We begin with
situations in which there are other commodities and, most importantly, money. We then consider
circumstances in which more than two states of the world can occur in a single period. Finally, we
expand the analysis to cover situations in which there are more than two time periods.

Money
Standard definitions assign the term "money" to instruments that are legal tender within a political
jurisdiction. In principle, one must accept such instruments in the settlement of transactions. Currency
and deposits against which checks can be written are usually considered money. Assets that can quickly,
easily and cheaply be turned into money are often termed "near-money". For our purposes, money can
be thought of as a medium of exchange.

We will generally use dollars as our standard monetary unit in examples. One may think of these as
U.S. Dollars, Australian Dollars, Hong Kong Dollars, or any other such currency. For notation, we
follow the standard practice of preceding an amount with the identifying symbol. Thus $1.50 represents
1.50 dollars. For symmetry, we will do the same for apples: hence, A2.50 represents 2.50 apples.

In most economies, conditions of trade are stated in monetary units. Moreover, most trades involve a
swap in which at least one side involves money per se. Thus we trade money for apples (buy apples),
trade money today for money a year from now (e.g. buy a one-year Treasury bill), trade money today
for apples next year, etc. If one wishes to trade today's apples for next year's oranges, it may be most
efficient to sell the apples today (trade today's apples for today's money), invest the proceeds (trade
today's money for next year's money), and use the proceeds to purchase oranges next year (trade next
year's money for next year's oranges).

www.stanford.edu/~wfsharpe/mia/prc/mia_prc3.htm 1/18

The number of dollars for which a commodity can be traded at any given time is generally termed its
price. In some contexts it is desirable to use a more precise term: the spot price of a commodity is an
amount determined and paid contemporaneously with the delivery of that commodity.

The spot price of a commodity will depend on both the amount of the commodity and the amount of
money available at the time. Other things equal, the greater the amount of money relative to the amount
of a commodity, the higher will be its price. Inflation (rising prices) is often attributed to "too much
money chasing too few goods". Central banks attempt to control national money supplies to avoid
excessive inflation (and deflation). However, other objectives and outside influences may compromise
such good intentions. In practice, there is by no means a simple one-to-one relationship between the
money supply and prices. As economies become more intertwined it becomes more and more difficult
for a government or central bank to manage any element of a national economy, including its prices.
And, of course, even if the general level of prices in an economy remains constant, there will be changes
in relative prices, as dictated by changing demand and supply conditions.

To illustrate the behavior of a monetary economy, we begin with a world with only apples and money.
As before, there are two periods and two future states of the world (good and bad weather). Initially, we
assume that the monetary authorities are able to adjust the money supply so that there is more money
when there are more apples and less when there are fewer and that this adjustment will succeed in
keeping the price of an apple constant at $0.50. As in the earlier examples, we assume that the price of 1
good weather apple is 0.285 present apples and that the price of 1 bad weather apple is 0.665 present apples.

Consider first all the possible trades between the present and time 1 if the weather is good:
Time 0                      Time 1 (good)

A0.285  ----------------    A1
  |                          |
  | A1 = $0.50               | A1 = $0.50
  |                          |
$0.1425                     $0.50

If apples can be traded as shown in the top portion of the diagram, then it is possible to trade $0.1425
today for $0.50 next year if the weather is good. This follows from the fact that knowledge of the state
of the world resolves all uncertainty at any specific time. Thus at time zero we know that if the weather
turns out to be good, the price of an apple will be $0.50 at time 1. Hence $0.1425 can be converted to
0.285 apples today, those apples can be used to purchase a claim for 1 apple next year if the weather is
good, and we know in advance that that apple can be converted to $0.50 if that state of the world
obtains.

In practice, the economy is more likely to look like this:


Time 0                      Time 1 (good)

A0.285                      A1
  |                          |
  | A1 = $0.50               | A1 = $0.50
  |                          |
$0.1425 --------------      $0.50

Explicit markets will exist for buying and selling commodities at each time period, with transactions
involving time and/or uncertainty conducted in monetary units. Thus if one wants to swap today's
apples for next year's apples if the weather is good, one might sell A0.285, obtain $0.1425, use
this amount to purchase a claim for $0.50 if the weather is good, and plan to use that amount to purchase
1 apple when and if that state of the world obtains.

Well and good, but what if the price of a commodity in a future time and state is expected to differ from
that of today? Assume that if the weather is good, the price of an apple is expected to rise to $0.60 due
to an increase in the money supply, the velocity at which money circulates, or both. In this case we
would have:
Time 0                      Time 1 (good)

A0.285                      A1
  |                          |
  | A1 = $0.50               | A1 = $0.60
  |                          |
$0.1425 --------------      $0.60

Note that the current apple price of one good weather apple remains at A0.285. However, the current
dollar price of one good weather dollar is $0.1425/$0.60, or $0.2375. We term the former a real (apple)
exchange rate and the latter a nominal one. When future commodity prices differ from current prices,
there will be disparities of this sort. However, arbitrage will ensure that there is a close relationship
among commodity prices, real exchange rates and nominal exchange rates.

To complete this latter example, assume that if the weather is bad the price of apples will remain at
$0.50. Thus:
Time 0                      Time 1 (bad)

A0.665                      A1
  |                          |
  | A1 = $0.50               | A1 = $0.50
  |                          |
$0.3325 ---------------     $0.50

The dollar price of one dollar if the weather is bad is $0.3325/$0.50, or $0.665; in this case the apple
and dollar exchange rates are the same.
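
The link between the real (apple) and nominal (dollar) state prices can be written as a small calculation: the dollar cost today of a claim to one future apple, divided by the dollar price of an apple in the state concerned. The variable names in this sketch are ours.

```python
real_price = {"good": 0.285, "bad": 0.665}   # present apples per future apple
spot_now = 0.50                               # dollars per apple today
spot_future = {"good": 0.60, "bad": 0.50}     # dollars per apple at time 1

nominal_price = {}
for state in real_price:
    dollars_today = real_price[state] * spot_now          # cost today of the apple claim
    nominal_price[state] = dollars_today / spot_future[state]

print({state: round(v, 4) for state, v in nominal_price.items()})
```

The nominal price falls below the real price only in the state where the apple price is expected to rise.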

Real and Nominal Interest Rates


In the current example, there are two discount factors:
Real
  1 good weather apple  = 0.285  present apples
  1 bad weather apple   = 0.665  present apples
  --------------------    ---------------------
  1 future apple        = 0.950  present apples

Nominal
  1 good weather dollar = 0.2375 present dollars
  1 bad weather dollar  = 0.665  present dollars
  ---------------------   ----------------------
  1 future dollar       = 0.9025 present dollars

Thus the real discount factor is 0.950, while the nominal discount factor is 0.9025.

Closely related to the concept of a discount factor is that of the default-free interest rate. For a case
involving only the present and a future period, the rate may be expressed on a per-period basis and
calculated simply. If the discount factor is d, then one unit will "grow to" 1/d in one period. Thus, given
a real discount factor of 0.95, one apple will grow to 1/0.95, or 1.0526 (approximately) apples in one
year. We say that the associated interest rate is 0.0526, or 5.26 percent per year. Thus:
1+i = 1/d

or:

i = (1/d) - 1

equivalently:

d = 1/(1+i)

In our example, the real interest rate is 5.26%, while the nominal interest rate is approximately 10.80%:
(1/0.9025)-1. In a potentially inflationary environment, nominal interest rates will be higher than real
interest rates, and the difference will be larger the greater the likely degree of inflation.
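
Both rates follow directly from the state prices, as this short sketch shows (the function name interest_rate is ours):

```python
def interest_rate(d):
    """Per-period interest rate implied by a discount factor d: i = (1/d) - 1."""
    return 1.0 / d - 1.0

real_d = 0.285 + 0.665       # sum of the real (apple) state prices
nominal_d = 0.2375 + 0.665   # sum of the nominal (dollar) state prices

print(f"real: {interest_rate(real_d):.2%}, nominal: {interest_rate(nominal_d):.2%}")
```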

Multiple Commodities
In the real world there are, of course, many different types of commodities (there are also different types
of currencies, but we leave this complication for a later discussion). Formally, the time-state approach
assumes that once the state of the world is known, all uncertainty is resolved concerning
contemporaneous commodity prices. Thus markets need only (!) exist for each commodity and money at
a given time and for claims for money across time and possible states of the world.

With multiple commodities, the simple notion of a real exchange rate breaks down, and with it the
notions of real discount factors and real interest rates. For example, the real interest rate expressed in
oranges may differ from that expressed in apples, kumquats or whatever. Economists attempt to get
around this problem by using the price of a pre-defined basket of goods and services to compute a price
index. They then convert nominal exchange rates to real rates using the price of such a basket of goods.
Such undertakings are fraught with hazard. A basket is unlikely to be fully representative of the
purchasing habits of a given individual or institution. If the basket's composition is held fixed, the
change in cost will likely overstate the cost of obtaining a constant degree of satisfaction, since
adaptation to a new set of relative prices is not taken into account. Finally, it is difficult to fully take into
account changes in quality when attempting to determine a change in the "cost of living" (or producing)
at a given level of happiness (or efficiency).

Despite these problems, price indices are important for financial analysis. Accordingly, governmental
agencies compute and publish various versions designed to represent the costs faced by representative
consumers and producers. Most countries have established consumer price indices as well as producer
price indices. More general measures are those used for computing overall national statistics, in
particular gross domestic product deflators, employed to estimate changes in the real levels of domestic
production of economies.

Any price index can be used to estimate a real counterpart for a nominal value. In practice, Analysts
usually employ a consumer price index (CPI) designed to represent (as well as possible) the buying habits
of a typical member of an economy.

Multiple States
Thus far we have assumed that from one period to the next there are only two possible states of the
world (specifically, good weather and bad weather). For many applications this stretches credulity. Over
a year there can be good weather, fair weather, bad weather, plagues, pestilence, and so on. If an entire
economy is to be modeled, one may need to consider a multiplicity of possible outcomes.

Imagine a world in which there are two periods (now and a year from now) and three possible states of
the world (Good Weather, Fair Weather and Bad Weather). As before, there is a Bond and a Stock. All
values are stated in dollars. The payment matrix is given by Q:
               Bond   Stock
Good Weather    20     43
Fair Weather    20     35
Bad Weather     20     28

and the security prices by ps:


Bond Stock
19 30.875

What are the atomic prices?

Our general rule is, of course:


p = ps*inv(Q)

But it is impossible to take the inverse of Q, since it is not square. The number of rows is equal to the
number of possible states of the world. To be able to invert the matrix we need as many columns
(securities) as there are rows (states of the world). In this case we need one more security.

Assume some research turns up a convertible bond. Such an instrument is a bond with promised
payments that can, on the holder's demand, be converted to a common stock with equity interest. Since
one should only undertake such a conversion when the equity is worth more than the bond, we can
write the cash flows associated with the various states of the world assuming optimal exercise of the
option to convert. Assume that doing so gives the payment matrix Q:
Bond Stock Convertible
Good Weather 20 43 35
Fair Weather 20 35 25
Bad Weather 20 28 25

If the convertible is selling for $24, we have the security price vector ps:
Bond Stock Convertible
19 30.875 24

We can now compute the atomic prices p = ps*inv(Q). To four decimal places they are:

Good Weather Fair Weather Bad Weather
0.0250 0.5571 0.3679

The discount factor is, as always, sum(p). In this case:

sum(p) = 0.95

which is not surprising, given the presence of the same riskless bond as used in the earlier examples.
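
The calculation can be sketched outside MATLAB as well. The fragment below is a minimal illustration in Python, not a definitive implementation: the helper solve is our own hypothetical routine (it belongs to no library) and works with exact fractions so that rounding does not cloud the result. Since p = ps*inv(Q) is the solution of p*Q = ps, the sketch solves the transposed system.

```python
from fractions import Fraction

def solve(A, b):
    """Solve the square system A*x = b by Gauss-Jordan elimination (exact fractions)."""
    n = len(A)
    # Augmented matrix [A | b]; going through str() keeps decimals such as 30.875 exact.
    M = [[Fraction(str(v)) for v in row] + [Fraction(str(rhs))]
         for row, rhs in zip(A, b)]
    for col in range(n):
        # Pick a row with a nonzero entry in this column (this fails if A is singular).
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        scale = M[col][col]
        M[col] = [v / scale for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [row[n] for row in M]

# Rows are states (Good, Fair, Bad); columns are Bond, Stock, Convertible.
Q = [[20, 43, 35],
     [20, 35, 25],
     [20, 28, 25]]
ps = [19, 30.875, 24]

# p*Q = ps is the same as transpose(Q)*p' = ps'.
Qt = [list(column) for column in zip(*Q)]
p = solve(Qt, ps)

print([round(float(v), 4) for v in p])   # atomic prices for Good, Fair, Bad
print(float(sum(p)))                     # discount factor
```

Solving the system directly, rather than forming the inverse first, gives the same prices and is the more common practice in numerical work.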

With three states of the world, three securities are needed to "span the space". However, to do so, the
securities must be sufficiently different. Consider, for example, the following payment matrix Q:
Bond Stock Security X
Good Weather 20 43 21.5
Fair Weather 20 35 17.5
Bad Weather 20 28 14.0

If you were to try to take the inverse of this matrix, the results would be (at the very least) a warning
message. MATLAB would say something like:
Warning: Matrix is close to singular or badly scaled.
Results may be inaccurate. RCOND = 4.648137e-019

The problem is not difficult to discern -- every payoff from Security X is precisely half that from the
Stock. Security X therefore adds nothing to the other two securities: the three cannot be combined into
portfolios that replicate any desired set of payments across the three states. The securities are not
"different enough" -- they do not span the state space.

To see what the latter expression means, consider again our earliest example of the Apple Tree Bond
and Stock. The diagram showing the opportunity set for contingent payments per dollar invested is
repeated below, with two arrows added.

The point shown for the Stock depends on three values -- its payments in each of the two states of the
world and its current price. Given the payment vector, the price will determine the actual location of the
point, but it will lie on the vector shown by the arrow through the current point, no matter what the price
may be.

Similarly, the Bond will plot at some point along the vector shown by the arrow through its current
point. As long as the arrows are distinct ("different"), the securities will plot at two different points and
support an opportunity set with a boundary such as that shown by the line in the diagram.

Imagine the consequences if the securities fell on the same vector (arrow). If they were priced to plot at
the same point, it would clearly be impossible to use them to obtain any other combination of payments.
If they plotted at different points, one could take a short position in one and a long position in the other
and make a potentially infinite amount of money with no risk and no investment! The latter case is of
course implausible, and both fail our test.

In general, if there are S states of the world in a given time period, S securities that plot on different
"arrows" in the state space are required. Since the locations of the arrows depend only on the payment
vectors, it is the set of such vectors (our matrix Q) that must meet this condition. The test is simple: if Q
can be inverted, the securities are sufficiently different. If it cannot, they are not.
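
One way to apply this test without attempting the inversion is to check whether the determinant of Q is zero. The sketch below is only an illustration in Python; det3 is a hypothetical helper written for the three-state case, and exact fractions are used so that a singular matrix gives exactly zero rather than a very small number (which is why MATLAB reports a tiny RCOND instead).

```python
from fractions import Fraction as F

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Bond, Stock and Convertible: the securities are sufficiently different.
Q_convertible = [[20, 43, 35],
                 [20, 35, 25],
                 [20, 28, 25]]

# Bond, Stock and Security X: every payment of X is half that of the Stock.
Q_security_x = [[20, 43, F("21.5")],
                [20, 35, F("17.5")],
                [20, 28, F("14.0")]]

for name, Q in [("convertible", Q_convertible), ("security X", Q_security_x)]:
    d = det3(Q)
    print(name, "determinant:", d, "-> spans" if d != 0 else "-> does not span")
```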

If an available set of securities spans a state space, we say that the markets are complete.

Incomplete Markets
If states of the world are defined very narrowly, the number of possible states of the world is very large.
The number of securities (broadly construed) is also large, but almost certainly smaller than the number
of possible states. Thus markets are incomplete in a global sense. On the other hand, for many
applications it is sufficient to define states broadly. For example, assume that an Analyst wishes to value
and/or hedge an investment product that has payments tied to the level of a stock index. For such an
analysis, there is no need to differentiate between the state "Stock Index level = $500 and sailing
conditions are good" and the state "Stock Index level = $500 and sailing conditions are bad". The
broader state "Stock Index level = $500" suffices, for it resolves all the uncertainty that is relevant for
the issue at hand. While securities may not span a detailed space, they may do so very well for a more
aggregated one.

Consider the following payment matrix Q:

Bond Stock
Good Weather 20 43
Fair Weather 20 28
Bad Weather 20 28

and price vector ps:

Bond Stock
19 35

Each security provides the same payment in the state Fair Weather as in the state Bad Weather. If one is
interested only in payment patterns that also have this characteristic, the problem may be reduced to one
involving two states. Thus, we have Q:
Bond Stock
Good Weather 20 43
Fair or Bad Weather 20 28

and can compute the atomic prices p = ps*inv(Q). They are:

Good Weather Fair or Bad Weather
0.56 0.39

The latter price is, in effect, the sum of two prices: the price of $1 if the weather is fair and the price of
$1 if the weather is bad. We are able to measure the sum but have no way to know what the value of
each of the components might be. The markets are sufficiently complete to price or replicate any
payment pattern in which the same amount is to be paid in the two states (Fair and Bad Weather).
However, if someone asks for a payment pattern in which a different amount is to be received in Fair
Weather than in Bad Weather, it will be impossible to find a replicating strategy involving the Bond and
the Stock in question.

This example is of considerable practical importance. Payment patterns that can be replicated with
existing securities can be priced with considerable accuracy. Moreover, an Investment Firm can offer
products with such patterns by taking an offsetting position in the appropriate replicating strategy. When
this is done, the product is said to be fully hedged and there is in principle no risk associated with it other
than uncertainty about the future state of the world.

More interesting but more problematic are investment products that offer payment patterns not available
with combinations of existing securities. Such products cannot be fully hedged by the provider, nor can
their prices be established definitively using the prices of other securities. Consider an Investor who asks
an Investment Firm to create an investment product with the following payments:

Good Weather 40
Fair Weather 30
Bad Weather 20

What should the Investment Firm charge? And how might it hedge as much of the associated risk as
possible?

The problem is, of course, the fact that no matter what combination of the Bond and the Stock is chosen,
the net payments will be the same in Fair Weather and Bad Weather. To be absolutely certain that no net
outlay might be required, the firm would have to select a combination of securities that would pay 40 in
Good Weather and 30 in Fair or Bad Weather. In this case:
ps =
Bond Stock
19 35

Q=
Bond Stock
Good Weather 20 43
Fair or Bad Weather 20 28

c=
Good Weather 40
Fair or Bad Weather 30

n = inv(Q)*c:
Bond 0.5667
Stock 0.6667

price = ps*n:
34.10

Thus it would cost $34.10 (price) to purchase a portfolio (n) that would cover all outflows. Note,
however, that in the event that the weather is Bad, the portfolio will provide $30 while the firm would
be obligated to pay out only $20, leaving $10 as profit. Thus the Investment Firm would be delighted to
sell the product for a price of $34.10. Moreover, if it did so and undertook the hedge (n), the Investor
would be assured of receiving the promised payments under all circumstances.
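
The arithmetic of this hedge can be checked with a short sketch. It is written in Python purely for illustration; solve2 is a hypothetical helper that applies Cramer's rule to a two-by-two system, and the dictionaries of payments simply restate the numbers in the tables above.

```python
def solve2(A, b):
    """Solve the 2x2 system A*x = b by Cramer's rule."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    return [(b[0] * a22 - a12 * b[1]) / det,
            (a11 * b[1] - a21 * b[0]) / det]

ps = [19, 35]                 # prices of Bond and Stock
Q = [[20, 43],                # Good Weather
     [20, 28]]                # Fair or Bad Weather
c = [40, 30]                  # payments the hedge must cover

n = solve2(Q, c)              # n = inv(Q)*c
price = ps[0] * n[0] + ps[1] * n[1]

# Compare what the hedge pays with what was promised in the three original states.
promised = {"Good": 40, "Fair": 30, "Bad": 20}
stock_pays = {"Good": 43, "Fair": 28, "Bad": 28}
surplus = {s: 20 * n[0] + stock_pays[s] * n[1] - promised[s] for s in promised}

print("holdings (Bond, Stock):", [round(x, 4) for x in n])
print("cost of hedge:", round(price, 2))
print("surplus by state:", {s: round(v, 2) for s, v in surplus.items()})
```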

Hedging at Minimum Cost


The approach used in the previous example may be generalized by formulating the problem in a manner
that allows for the number of securities to be less than, equal to or greater than the number of states of
the world. The goal is to minimize the cost of meeting or exceeding the required payment in each state
of the world. Using our previous notation:

select: n

to minimize: ps*n

subject to: Q*n >= c



In this formulation, n {securities*1} is the vector of decision variables, ps {1*securities} contains the
coefficients of the linear objective function, or minimand, Q {states*securities} contains the coefficients
of the left-hand side of the constraint set, and c {states*1} contains the coefficients of the right-hand
side of the constraint set.

The matrix representation of the constraint set is straightforward. The left-hand side Q*n is a {states*1}
vector, as is the right-hand side c. The matrix inequality indicates that each element of the left-hand
vector (amount available to be paid) must be greater than or equal to the corresponding element of the
right-hand vector (amount required to be paid).

Since this problem involves a linear objective function and linear inequality constraints, it is a member of
the class of linear programming problems and can be solved using any of a number of algorithms
(procedures) designed for such tasks. MATLAB's optimization toolbox includes a function named lp for
this purpose. The simplest use is described in MATLAB as follows:

x = lp(f,A,b) solves the linear programming problem:

min f'x    subject to:   Ax <= b
 x

This is almost precisely what we need. However, the inequality is reversed. This is easily overcome,
for:

Q*n >= c

is the same as:

-Q*n <= -c (for example: 3 >= 2 and -3 <= -2)

Thus the problem can be solved with the following MATLAB expressions:

n = lp(ps,-Q,-c);
price = ps*n;

While our current interest is in the use of the linear programming formulation in an incomplete market
setting, it can also be used in a complete market. In such a case, the procedure can produce a portfolio
that will achieve the required set of payments exactly. If more securities are included than there are
states, there will typically be multiple ways of meeting the goals; however, if the markets are arbitrage-
free, all such portfolios will have the same cost.

Linear programming algorithms can provide a set of Lagrangian multipliers, each of which indicates the
change in the objective function (here, cost) per unit change in one of the right-hand side coefficients. In
this instance, each such multiplier for constraints that are binding is the atomic price for a state -- how
much it would cost to have one more dollar paid in that state.

Here is the MATLAB description of the procedure used to obtain these multipliers:

[x,LAMBDA]=lp(f,A,b) returns the set of Lagrangian multipliers,
LAMBDA, at the solution.

To obtain the set of atomic prices and the hedge portfolio, one could use the MATLAB expression:

[n p] = lp(ps,-Q,-c);
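
For readers without access to a linear programming routine, the small problem of the previous section can be solved by brute force. The sketch below, in Python, is not the algorithm used by lp; it substitutes a simple vertex search that is adequate only under the assumptions stated in the comments (two securities and a problem known to have a finite minimum). All names are our own.

```python
from fractions import Fraction
from itertools import combinations

ps = [19, 35]                      # prices of Bond and Stock
Q = [[20, 43],                     # Good Weather
     [20, 28],                     # Fair Weather
     [20, 28]]                     # Bad Weather
c = [40, 30, 20]                   # required payments

# Assumption: with two decision variables and a bounded problem, a minimum-cost
# solution lies where two constraints hold with equality. We try every such
# pair, keep the feasible intersections, and pick the cheapest.
best = None
for i, j in combinations(range(len(Q)), 2):
    (a11, a12), (a21, a22) = Q[i], Q[j]
    det = a11 * a22 - a12 * a21
    if det == 0:
        continue                   # parallel constraints do not intersect in a point
    n = [Fraction(c[i] * a22 - a12 * c[j], det),
         Fraction(a11 * c[j] - a21 * c[i], det)]
    feasible = all(sum(q * x for q, x in zip(row, n)) >= need
                   for row, need in zip(Q, c))
    if feasible:
        cost = sum(price * x for price, x in zip(ps, n))
        if best is None or cost < best[0]:
            best = (cost, n)

cost, n = best
print("holdings (Bond, Stock):", [float(x) for x in n])
print("minimum cost:", float(cost))
```

The search recovers the portfolio and cost found earlier. For problems of realistic size a proper linear programming algorithm is, of course, the appropriate tool.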

Completing a Market
We return now to the problem faced by the Investment Firm. A potential client wants a product that will
pay c:

Good Weather 40
Fair Weather 30
Bad Weather 20

But the only existing securities are a Bond and a Stock with payments Q:
Bond Stock
Good Weather 20 43
Fair Weather 20 28
Bad Weather 20 28

and prices ps:

Bond Stock
19 35

The firm has run its linear programming algorithm and found that for $34.10 it can meet its obligation in
every state of the world, but with $10 left over if the weather is Bad. Surely, it figures, this is worth
something (but at most 0.39*10 = $3.90, according to our previous calculations). Assume that after
some thought, it offers the product for $32.00. Moreover, it is willing to "make a market" and "take
either side", that is, buy or sell the product at that price.

We now have three securities, with payment matrix Q:

Bond Stock Product


Good Weather 20 43 40
Fair Weather 20 28 30
Bad Weather 20 28 20

and price vector ps:


Bond Stock Product
19 35 32

The market is now complete, with atomic prices p = ps*inv(Q):


Good Fair Bad
Weather Weather Weather
0.56 0.18 0.21

In effect, the Investment Firm has priced the Fair Weather and Bad Weather claims at $0.18 and $0.21,
respectively, resolving the indeterminacy of the split of the prior known value of $0.39 between the two
claims.
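
The way the Product's price determines this split can be made explicit. The Bond and the Stock fix the Good Weather price at 0.56 and the sum of the other two at 0.39, so the Product's price supplies the one missing equation. The Python sketch below is an illustration only; the function name split is hypothetical.

```python
from fractions import Fraction

P_GOOD = Fraction(56, 100)          # fixed by the Bond and the Stock
P_FAIR_OR_BAD = Fraction(39, 100)   # sum of the Fair and Bad Weather prices

def split(product_price):
    """Fair and Bad Weather prices implied by a price for the Product (40, 30, 20)."""
    # 40*p_good + 30*p_fair + 20*p_bad = price, with p_fair + p_bad = 0.39
    price = Fraction(str(product_price))
    p_fair = (price - 40 * P_GOOD - 20 * P_FAIR_OR_BAD) / 10
    return p_fair, P_FAIR_OR_BAD - p_fair

for price in (30.20, 32.00, 34.10):
    p_fair, p_bad = split(price)
    print(price, float(p_fair), float(p_bad))
```

At $32.00 the split is 0.18 and 0.21. At $34.10 the Bad Weather claim would be priced at zero, and at $30.20 the Fair Weather claim would be; both atomic prices are positive only for Product prices strictly between those two figures, which is consistent with the $3.90 bound noted above.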

In the real world, as in this example, an Investment Firm, by offering a new product with sufficient
guarantees of payment in all circumstances, can provide an important service by making the capital
markets more complete. In this case the availability of such a product fully completed the market, since
the Bond, the Stock and the new Product completely span the space of relevant states of the world. Any
desired set of payments across the three states can be replicated with some combination of these three
instruments.

Even if motivated only by greed and cupidity, Investment Firms can provide significant social services
and move markets closer and closer to the idealized ones described in works such as this.

Multiple Times
Practical applications of the time-state approach often dispense with the convenient fiction that the world
ends after one period. It is time for us to do so as well. Following common practice, we focus on cases in which
a variable of interest can move in one of two directions in each time period. To keep things as simple as
possible, we allow for two future time periods (times 1 and 2), in addition to the present (time 0).

The tree of possible states of the world now has seven nodes:

Instead of numbering the nodes sequentially we use letters for all but time zero. The number of letters
indicates the time period (thus state gg is at time period 2, since there are two letters). The sequence of
letters indicates the path taken to reach the node (thus gb indicates a good branch followed by a bad
one).

A diagram such as this is sometimes termed a lattice. Since only two branches emanate from each node,
the underlying relationship is often termed a binomial process.

A security or Investment Product is represented with a set of values or cash flows in such a tree. We
start with a two-period zero-coupon Bond that grows in value by 5% per period. Its initial value is
$1.00. The values of a Bond at the nodes are shown below:

Our second security is a Stock that pays no dividends. Its price increases 26% in good times but falls to
96% of its prior value in bad times. Its initial value is also $1.00. The values at the nodes are shown
below:


In this case S, the number of future states of the world, equals six. Our previous discussion indicated that
six different securities would be required to span this space and hence allow replication and valuation of
Investment Products. This remains true if the term "security" is expanded to include a planned
acquisition of a security in the future. Taking this view, six distinct elemental combinations of payments
can be provided using the two traded instruments. The associated purchases and sales are as follows:

B0: Buy a bond today; sell it at the end of period 1
S0: Buy a stock today; sell it at the end of period 1
Bg: At period 1, if the state is g, buy a bond; sell it at the end of period 2
Sg: At period 1, if the state is g, buy a stock; sell it at the end of period 2
Bb: At period 1, if the state is b, buy a bond; sell it at the end of period 2
Sb: At period 1, if the state is b, buy a stock; sell it at the end of period 2

Only the first two strategies involve an outlay at the present. The other four involve outlays (negative cash
flows) at future times under some circumstances, but none today. The associated payment matrix Q is:

B0 S0 Bg Sg Bb Sb
g 1.05 1.26 -1.00 -1.00 0 0
b 1.05 0.96 0 0 -1.00 -1.00
gg 0 0 1.05 1.26 0 0
gb 0 0 1.05 0.96 0 0
bg 0 0 0 0 1.05 1.26
bb 0 0 0 0 1.05 0.96

The price vector ps is:


B0 S0 Bg Sg Bb Sb
1.00 1.00 0 0 0 0

To find the atomic prices, we proceed as always to find p = ps*inv(Q):


g b gg gb bg bb
0.2857 0.6667 0.0816 0.1905 0.1905 0.4444
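
These six prices can be reproduced with the sketch below. It is a minimal Python illustration, not the author's code: solve is a hypothetical helper of our own that performs Gauss-Jordan elimination with exact fractions, and it is applied to the transposed system because p = ps*inv(Q) solves p*Q = ps.

```python
from fractions import Fraction

def solve(A, b):
    """Solve the square system A*x = b by Gauss-Jordan elimination (exact fractions)."""
    n = len(A)
    M = [[Fraction(str(v)) for v in row] + [Fraction(str(rhs))]
         for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        scale = M[col][col]
        M[col] = [v / scale for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [row[n] for row in M]

states = ["g", "b", "gg", "gb", "bg", "bb"]
#     B0    S0    Bg     Sg     Bb     Sb
Q = [[1.05, 1.26, -1.00, -1.00, 0,     0],      # g
     [1.05, 0.96, 0,     0,     -1.00, -1.00],  # b
     [0,    0,    1.05,  1.26,  0,     0],      # gg
     [0,    0,    1.05,  0.96,  0,     0],      # gb
     [0,    0,    0,     0,     1.05,  1.26],   # bg
     [0,    0,    0,     0,     1.05,  0.96]]   # bb
ps = [1.00, 1.00, 0, 0, 0, 0]

Qt = [list(column) for column in zip(*Q)]       # p*Q = ps  <=>  Q'*p' = ps'
p = solve(Qt, ps)

for state, price in zip(states, p):
    print(state, price, round(float(price), 4))
```

In exact terms the one-period prices are 2/7 and 2/3, and each two-period price is the product of the one-period prices along the path leading to it (for example, 2/7 times 2/3 for state gb).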

Dynamic Strategies
To see how a multiple-time approach can be used in a practical situation, consider the following
Investment Product:

At time 2, Investment Firm will pay the holder an amount equal to:
$1.50 if the Stock is worth more than $1.50
$1.00 if the Stock is worth less than $1.00
The value of the Stock otherwise

What is this collar around the price of the Stock worth? Is there some other way that an Investor could
obtain the same results?

To answer these questions, construct a vector (c) with the cash flows associated with the product:

g 0
b 0
gg 1.50
gb 1.2096
bg 1.2096
bb 1.00

Next, compute the value = p*c:

1.0277

and the replicating portfolio n = inv(Q)*c:

B0 0.2853
S0 0.7423
Bg 0.2670
Sg 0.9680
Bb 0.3136
Sb 0.6987

While we may term this a portfolio, it would usually be considered a dynamic strategy. It calls for an
initial purchase of $0.2853 of Bonds and $0.7423 of Stocks (for a total cost of $1.0277). If the weather
turns out to have been good at the end of the first period, the Bonds will have grown to 1.05*$0.2853, or
$0.2996, while the Stocks will have grown to 1.26*0.7423, or $0.9354, giving a total portfolio value of
$1.2350. The strategy calls for this portfolio to be sold and a portfolio with $0.2670 of Bonds and
$0.9680 of Stocks purchased. The cost will be precisely equal to the proceeds obtained from the sale of
the initial positions, as shown below:

Initial Revised Difference


Bond 0.2996 0.2670 -0.0326
Stock 0.9354 0.9680 0.0326
------- -------- --------
1.2350 1.2350 0.0000

In fact, of course, it would only be necessary to sell $0.0326 of Bonds and purchase $0.0326 of Stocks
to implement the needed change.

If the weather turns out to have been bad, the stock position at the end of the year will only be worth
0.96*$0.7423, or $0.7127. The situation would then be the following:

Initial Revised Difference


Bond 0.2996 0.3136 0.0140
Stock 0.7127 0.6987 -0.0140
------- -------- --------
1.0123 1.0123 0.0000

In this case, $0.0140 of Stocks would be sold and the proceeds used to purchase $0.0140 of Bonds.
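
The same strategy can be found by working backward through the tree one node at a time, rather than by inverting the full matrix. The Python sketch below is an illustration under that alternative approach; the helper names replicate and collar are our own. At each node it finds the dollar amounts of Bond and Stock that will grow to the values needed at the two following nodes.

```python
BOND = 1.05                 # growth of $1 of Bonds over one period
GOOD, BAD = 1.26, 0.96      # growth of $1 of Stocks in good and bad periods

def replicate(value_good, value_bad):
    """Dollars of Bond and Stock that grow to the given values one period later."""
    stock = (value_good - value_bad) / (GOOD - BAD)
    bond = (value_good - GOOD * stock) / BOND
    return bond, stock

def collar(stock_value):
    """Payment of the Investment Product: the Stock value held between $1.00 and $1.50."""
    return min(1.50, max(1.00, stock_value))

# Payments required at time 2.
c_gg, c_gb = collar(GOOD * GOOD), collar(GOOD * BAD)
c_bg, c_bb = collar(BAD * GOOD), collar(BAD * BAD)

# Time-1 holdings, then the time-0 holdings that finance them.
Bg, Sg = replicate(c_gg, c_gb)
Bb, Sb = replicate(c_bg, c_bb)
B0, S0 = replicate(Bg + Sg, Bb + Sb)

print("initial:", round(B0, 4), round(S0, 4), "cost", round(B0 + S0, 4))
print("after good:", round(Bg, 4), round(Sg, 4))
print("after bad: ", round(Bb, 4), round(Sb, 4))
```

By construction the value of the initial holdings at each time-1 node equals the cost of the revised holdings there, which is the self-financing property shown in the two tables above.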

Model Risk
In principle, any set of time and state-contingent cash flows can be replicated with any set of securities
that spans the relevant space of time-state claims. Moreover, if there are only two possible branches at
each node in the tree representing the underlying process, planned acquisition and sale of two different
securities at each intermediate future time period will suffice to span the entire space.

What can go wrong? Two things. First, a counterparty can default, in whole or in part, on an obligated
payment. Second, the model may be wrong. The possibility of the latter is known as model risk.

Two examples will illustrate the type of dangers lurking behind such arrangements.

Assume that the tree drawn in the previous example is in error in one respect. If Stocks do poorly in the
first year, they are likely to do somewhat worse than initially projected in the second year. Specifically,
the true payment matrix is QQ:

B0 S0 Bg Sg Bb Sb
g 1.05 1.26 -1.00 -1.00 0 0
b 1.05 0.96 0 0 -1.00 -1.00
gg 0 0 1.05 1.26 0 0
gb 0 0 1.05 0.96 0 0
bg 0 0 0 0 1.05 1.20
bb 0 0 0 0 1.05 0.90

The correct strategy would have been nn = inv(QQ)*c:


B0 0.4450
S0 0.6093
Bg 0.2670
Sg 0.9680
Bb 0.3535
Sb 0.6987

which would have cost $1.0543.

The strategy adopted, using the wrong model, cost less ($1.0277), but would in fact provide a different
set of payments (QQ*n) from that desired:

g 0.0000
b 0.0000
gg 1.5000
gb 1.2096
bg 1.1677
bb 0.9581

If the first year turns out to be Bad, problems lie ahead. The Investment Firm will think that it is fully
hedged but wake up to find that it either owes $1.2096 with only $1.1677 of assets (state bg) or that it
owes $1.0000 with only $0.9581 of assets (state bb).

The second example is different in kind but similar in outcome. Assume that the tree and payment
matrix are completely accurate, but that the market "moves too fast" to make any trades at the end of the
first period. Instead, the positions established at the outset must be held until the end of the second
period. This is not unlike the experience in a number of stock markets on the day in October, 1987
known as "Black Monday", when some of the participants found that their assumption that trades could
be made after relatively small price changes was in error.

In this case, the results would be as follows:


Bonds Stocks Total
gg 0.2853*1.05*1.05 0.7423*1.26*1.26 1.4931
gb 0.2853*1.05*1.05 0.7423*1.26*0.96 1.2125
bg 0.2853*1.05*1.05 0.7423*0.96*1.26 1.2125
bb 0.2853*1.05*1.05 0.7423*0.96*0.96 0.9987


In this case, the Investment Firm actually makes money if the Stock reverses its behavior (state gb or bg)
but is in trouble (may not be able to make its payments) if the stock price continues in the same direction
(state gg or bb).

Model risk is an important element whenever a dynamic strategy is adopted to provide a desired set of
cash flows. Both firms that plan to hedge obligations and those who are counterparties for such firms
must be keenly aware of the possibility that the underlying model is wrong in some sense or another. A
stress test, in which changes in the underlying model are examined to estimate the magnitudes of likely
deviation, can prove valuable in assessing the degree of the danger associated with this type of risk.
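
A very small stress test of this kind can be sketched for the two examples above. The Python fragment below is illustrative only and its names are our own: it builds the strategy from the assumed tree and then evaluates the assets available at time 2 under each of the two departures from the model.

```python
BOND, GOOD, BAD = 1.05, 1.26, 0.96

def replicate(value_good, value_bad):
    """Dollars of Bond and Stock that grow to the given values under the assumed tree."""
    stock = (value_good - value_bad) / (GOOD - BAD)
    bond = (value_good - GOOD * stock) / BOND
    return bond, stock

owed = {"gg": 1.50, "gb": 1.2096, "bg": 1.2096, "bb": 1.00}
Bg, Sg = replicate(owed["gg"], owed["gb"])
Bb, Sb = replicate(owed["bg"], owed["bb"])
B0, S0 = replicate(Bg + Sg, Bb + Sb)

# Stress 1: after a bad first year the Stock really grows by 1.20 or 0.90.
stress1 = {"bg": BOND * Bb + 1.20 * Sb,
           "bb": BOND * Bb + 0.90 * Sb}

# Stress 2: no trades are possible at time 1, so the initial holdings are kept.
move = {"g": GOOD, "b": BAD}
stress2 = {s: B0 * BOND ** 2 + S0 * move[s[0]] * move[s[1]] for s in owed}

for label, assets in (("wrong tree", stress1), ("no rebalancing", stress2)):
    for state, value in assets.items():
        print(label, state, "assets", round(value, 4),
              "owed", owed[state], "gap", round(value - owed[state], 4))
```

A negative gap marks a state in which the firm's assets fall short of its obligation.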

Options
Thus far, our examples have involved a specified set of cash flows at each of the nodes in the time-state
tree. Futures and forward contracts have such attributes, as do many swap agreements. However, a
great many financial arrangements involve one or more options: at certain times and under some or all
conditions, one or both parties may change the pattern of remaining cash flows.

Consider first a European Call Option which allows the option holder (buyer) to "call away" a security
or stock index for a pre-specified amount at a given date. To illustrate, we use the previous example in
which a Stock price can increase to 1.26 times its prior value or decrease to 0.96 times its prior value in
each of two periods while a Bond grows to 1.05 times its prior value in each period. We wish to analyze
an option to call away the stock for $1.10 at the end of the second period.

The value of the Stock at the end of that period will depend on the final state of the world, as follows:
gg $1.5876
gb $1.2096
bg $1.2096
bb $0.9216

Clearly, it would be foolish to pay $1.10 for something worth less. Hence optimal exercise involves
choosing to let the option expire in state bb and exercising it in every other state. The net value received
in each state will thus be:

gg $0.4876
gb $0.1096
bg $0.1096
bb $0

In terms of the full vector of possible cash flows, c:


g 0
b 0
gg 0.4876
gb 0.1096
bg 0.1096
bb 0

The cost of providing these flows with a dynamic strategy equals p*c:
$0.0816

The replicating portfolio (strategy) is n:

B0 -0.5220
S0 0.6036
Bg -1.0476
Sg 1.2600
Bb -0.3340
Sb 0.3653

The initial position involves the investment of $0.0816 of the investor's money plus $0.5220 of
borrowed funds to purchase $0.6036 of the Stock. Subsequently, the positions are adjusted, depending
on the state of the world, but in each case the strategy combines borrowing (a short position in the
Bond) with investment (a long position in the Stock).
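
The value of the call can be checked with the sketch below, written in Python for illustration. It relies on a shortcut that holds for this particular tree, in which the Bond and Stock behave the same way at every node: the price of a two-period claim is the product of the one-period atomic prices along its path. The variable names are our own.

```python
from itertools import product

BOND, GOOD, BAD = 1.05, 1.26, 0.96
STRIKE = 1.10

# One-period atomic prices from  BOND*(p_g + p_b) = 1  and  GOOD*p_g + BAD*p_b = 1.
p_g = (1 - BAD / BOND) / (GOOD - BAD)
p_b = 1 / BOND - p_g
step = {"g": (GOOD, p_g), "b": (BAD, p_b)}

value = 0.0
for path in product("gb", repeat=2):          # gg, gb, bg, bb
    stock, price = 1.0, 1.0
    for m in path:
        stock *= step[m][0]                   # Stock value along the path
        price *= step[m][1]                   # atomic price of reaching this node
    payoff = max(stock - STRIKE, 0.0)         # exercise only when it pays
    value += price * payoff
    print("".join(path), round(stock, 4), round(payoff, 4), round(price, 4))

print("value of call:", round(value, 4))
```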

A European option may be exercised only on its expiration date. An American option may be
exercised at any date up to and including its expiration date. Analysis of the latter is somewhat more
complex than that of the former.

Consider an American put option that allows the holder to "put" (sell) the Stock to the option writer
(seller) at a price of $1.20 at either time period 1 or time period 2. If the option is held until time period
2, it should be left to expire worthless in all but state bb. In this case, the option will be worth $0.2784
since it can be used to sell a stock worth only $0.9216 for $1.20.

The figure below shows the situation diagrammatically. The values in the boxes for time period 2 indicate
the cash flows if the option is held until time period 2 and then exercised optimally.

Should the option be exercised at the end of period 1? Consider first the situation if the first year is
Good. The Stock will be worth $1.26. If the option were exercised, the holder would sell something
worth $1.26 for $1.20, thereby losing $0.06. Moreover, the game would be over. It is immediately
apparent that it is better to continue (to get zero) than to exercise and obtain -$0.06.

The situation at the end of a Bad year 1 is not as clear. Since the Stock will be worth $0.96, immediate
exercise will net $0.24, as shown in the diagram. Is it better to take this amount or to continue to hold
the option in the hope of receiving either 0 (state bg) or 0.2784 (state bb) at the end of the next year?

The question can be posed in terms of alternative vectors of cash flows. Which is better? c1:

g 0
b 0
gg 0
gb 0
bg 0
bb 0.2784

or c2:

g 0
b 0.24
gg 0
gb 0
bg 0
bb 0

The answer is easily found by pricing the two alternatives:

p*c1 = 0.1237
p*c2 = 0.1600

Clearly, it is better to exercise the put at the end of year 1 if the stock price falls. Planning to do so
makes the option worth $0.1600 at the outset. Planning to not do so makes it worth only $0.1237.

The figure below shows the tree after it has been "pruned" to include only optimal paths.

The procedure for pruning is simple conceptually. First, the tree is priced, using the standard securities,
so that the value of a payment at each node is known. Then one starts at the end, working back one
node at a time. The present value of the cash flows associated with one decision (here, to exercise the
option) is compared with the present value of the cash flows associated with the alternative decision
(here, to not exercise the option). The better choice is retained and the poorer one discarded. The
process is performed first for all the nodes at the last time period. Then it is performed for all the nodes at
the penultimate (next-to-last) period, then for the period before that, and so on. To speed up the process,
each node can be assigned a present value based on optimal choices at subsequent nodes. The set of
such values for the nodes at time period t can then be used when evaluating choices for nodes at time
period t-1. The final result will be a set of rules for making optimal choices, a corresponding set of cash
flows, and the associated present value which will be the largest possible, given the alternatives.
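
For the put option of this example the pruning procedure can be sketched as a short recursion. The Python fragment below is an illustration only, with names of our own choosing. It compares values at each node rather than at time 0, which leads to the same choices since the two differ only by the atomic price of the node, and it follows the terms of the contract in allowing exercise at time periods 1 and 2 but not at the outset.

```python
BOND, GOOD, BAD = 1.05, 1.26, 0.96
STRIKE, LAST = 1.20, 2

# One-period atomic prices from  BOND*(p_g + p_b) = 1  and  GOOD*p_g + BAD*p_b = 1.
P_G = (1 - BAD / BOND) / (GOOD - BAD)
P_B = 1 / BOND - P_G

def put_value(stock, time, may_exercise_early=True):
    """Value at a node of the put, given optimal choices at all later nodes."""
    exercise = STRIKE - stock
    if time == LAST:
        return max(exercise, 0.0)             # exercise at expiration only if it pays
    hold = (P_G * put_value(stock * GOOD, time + 1, may_exercise_early)
            + P_B * put_value(stock * BAD, time + 1, may_exercise_early))
    if time == 0 or not may_exercise_early:
        return hold                           # exercise is not permitted here
    return max(exercise, hold)                # keep the better choice, discard the other

american = put_value(1.0, 0)
held_to_expiration = put_value(1.0, 0, may_exercise_early=False)
print("exercise when optimal:", round(american, 4))
print("hold until time 2:    ", round(held_to_expiration, 4))
```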

If a contract between party A and party B gives B one or more options, how should party A arrive at an
appropriate price? One might assume that party B will act optimally and hence follow the procedure
described above. However, in many cases option-holders do not do this. For example, homeowners
who borrow money via mortgages often retain an option to prepay their loan at fixed amounts,
regardless of the course of interest rates. Pools of such mortgages are frequently assembled and sold as
an Investment Product. The value of such a pool will depend critically on the nature of prepayments by
the individual borrowers. Consider a borrower who has a $100 loan at 8% with one year to run. She can
either pay $108 in a year or $100 today. If the current rate of interest is 7%, it is to her advantage to "pay
off" the loan for $100 with money borrowed at 7%, thus replacing an obligation to pay $108 with one to
pay $107. If, on the other hand, interest rates are 9%, it would be undesirable for her to pay off the loan.

In practice, a borrower may fail to prepay a loan when interest rates fall below the rate at which the
mortgage was issued, due to costs or inattention. Moreover, some will pay off loans when interest rates
are above the initial rate, due to a need to sell a house, etc. The prices of mortgage pool securities are
typically higher than they would be if borrowers always exercised optimally in a narrow sense. Those
who analyze such products incorporate a prepayment model in their calculations, based on observed
behavior of a class of borrowers. Profits can be made by analysts who utilize a model superior to that
reflected in market prices. On the other hand, losses can be incurred by those with inferior models. To a
major extent, the competition among active managers of funds that utilize mortgage instruments is a
competition among prepayment models.

When both parties to an agreement retain options, valuation requires assumptions about the behavior of
each one. Thus a convertible callable bond provides the issuer with an option to call the bond from the
holder under certain conditions and an option for the holder to convert the bond into the issuer's stock
under some conditions. If each party is assumed to exercise optimally, valuation can proceed using the
general procedure outlined above. Otherwise, more complex assumptions are required.

Whatever the model used to predict choices made when options are available, the goal is to reduce the
problem to one involving a vector of time and state-contingent cash flows. If markets for the associated
securities are sufficiently complete, the value of an Investment Product with options and a replicating
dynamic strategy can be determined.

Derivatives
The instruments that we have examined in the last few examples are all derivatives -- Investment
Products whose value depends on the values of one or more underlying securities. We have
considered only cases in which a derivative is tied to one security value. Such instances are well suited
to binomial models of the behavior of the value of the underlying security. More complex derivatives
may be based on the behavior of the prices of two or more securities or on values of non-investment
vehicles (e.g. the average temperature in July at a particular resort). The farther the underlying value
from that of a traded security, the less likely it is that a replicating portfolio can be determined and the
derivative's value established definitively. In such cases perfect hedging is impossible and the specter of
counterparty risk looms especially large.

It is an overstatement to say that an Investor can attain any desired pattern of time and state-contingent
cash flows via either Investment Products or dynamic strategies. However, the range of possibilities is
very large indeed and growing larger by the day. This leaves the Analyst with two key questions: what
set of payments is the most desirable and what is the best way to achieve the desired outcome?

If an Investment Product is utilized, there is the danger that the counterparty will fail to make all required
payments, at least in some circumstances. The more exotic the derivative, the greater such counterparty
risk is likely to be. On the other hand, if the Investor undertakes a dynamic strategy, he or she is directly
(rather than indirectly) subject to model risk. An Investor who chooses a derivative Investment Product
will need to examine the assets and other liabilities of the counterparty. One who chooses a dynamic
strategy will have to examine the credentials and methods utilized by his or her Analyst.

10/23/12 Interest Rates and Bond Yields

Interest Rates and Bond Yields

Contents:
Multi-period Discount Factors
Multi-period Interest Rates
Bond Yields
Duration
Forward Interest Rates

Multi-period Discount Factors


A nominal discount factor is the present value of one unit of currency to be paid with certainty at a
stated future time. This definition suffices, whatever the time period. In a multi-period setting there is
one discount factor for every time period. Thus df(1) could be the present value (at time 0) of $1 certain
at the end of time period 1, df(2) the present value at time 0 of $1 certain at the end of time period 2,
etc. The vector of such values df {1*periods} is known as the discount function. It can be used to value
any vector of cash flows known to be certain. If cf {periods*1} is such a vector, its present value is
simply:

pv = df*cf

In this equation, pv is termed the discounted present value of the cash flows.

The one-period example generalizes to a multi-period setting in another respect. The discount factor for
a given period will equal the sum of the atomic prices for that period. This follows because the purchase
of one unit of every time-state claim for a specified time will guarantee one unit of the currency at that
period. The cost of such a bundle is the cost of one unit of currency certain at that date, and hence
equals the associated discount factor.

In many countries, nominal discount factors are easily discovered. For example, in the United States,
financial publications report recent prices of U.S. Treasury Bills and "Strips", each of which promises a
fixed dollar payment at one specified date. Since the Treasury has the power to print dollars, payments
on such securities can be considered certain, absent revolution, etc. The reported prices on any given
day thus constitute the discount function at the time.

Real discount factors are another matter. In some countries the government issues bonds with payments
linked to a price index. Such bonds typically provide both coupon payments at periodic intervals and a
final principal payment at maturity. If there are enough issues with sufficiently different maturities, at
least some elements of the real discount function can be determined.

Consider a case in which there are three bonds. The one-year bond promises a payment of 103 real or
"constant" dollars (e.g. Apples) in a year. The two-year bond promises a payment of 4 constant dollars
in one year and 104 in two. The three-year bond promises a payment of 3 constant dollars in years 1 and
2 and 103 in year 3. The current prices are $100, $101 and $98, respectively. What are the real discount
factors (i.e. the present values of $1 of purchasing power in each of the next three years)?

To answer the question we construct a {periods*bonds} cash flow matrix Q:


www.stanford.edu/~wfsharpe/mia/prc/mia_prc4.htm 1/8

        Bond1   Bond2   Bond3
Yr1       103       4       3
Yr2         0     104       3
Yr3         0       0     103

and a price vector p {1*bonds}:


Bond1 Bond2 Bond3
100 101 98

The price of each bond should equal its discounted present value. Thus:
df*Q = p

where df {1*periods} is the discount function.

We wish to find df, given Q and p. Multiplying both sides of the equation by inv(Q) gives:
df = p*inv(Q)

In this case, df {1*periods} is:


Yr1 Yr2 Yr3
0.9709 0.9338 0.8960

Thus a claim for 1 real dollar in year 1 is worth $0.9709 now, a claim for 1 real dollar in year 2 is worth
$0.9338 now, and so on. Any desired set of real payments over the next three years can be valued using
this discount function.
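
The calculation df = p*inv(Q) can be sketched in Python as follows. This is only an illustration, not part of the original Work: the function name solve_discount_function is hypothetical, and the sketch assumes that Q is upper triangular (no bond pays anything after its maturity), so that forward substitution can stand in for a general matrix inverse.

```python
# Cash flow matrix Q {periods*bonds}: rows are years, columns are bonds.
Q = [
    [103.0,   4.0,   3.0],
    [  0.0, 104.0,   3.0],
    [  0.0,   0.0, 103.0],
]
p = [100.0, 101.0, 98.0]  # current bond prices {1*bonds}


def solve_discount_function(Q, p):
    """Solve df*Q = p for df, assuming Q is upper triangular (hypothetical helper)."""
    n = len(p)
    df = [0.0] * n
    for j in range(n):
        # Price of bond j = sum over years t <= j of df[t] * Q[t][j].
        known = sum(df[t] * Q[t][j] for t in range(j))
        df[j] = (p[j] - known) / Q[j][j]
    return df


df = solve_discount_function(Q, p)
print([round(x, 4) for x in df])
```

With a cash flow matrix that is not triangular, a general linear solver would be needed instead.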

To find the combination of such bonds that will replicate a desired set of cash flows we utilize the
formula:
Q*n = c

where n {bonds*1} is a portfolio of bonds and c {periods*1} is the desired set of payments. From this it
follows that n=inv(Q)*c. Thus if the desired set of payments is c:

Yr1 300
Yr2 200
Yr3 100

The replicating portfolio is n:


Bond1 2.8107
Bond2 1.8951
Bond3 0.9709
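
A minimal Python sketch of n = inv(Q)*c follows, again assuming an upper-triangular Q so that back substitution suffices. The helper name replicating_portfolio is hypothetical. The last lines compute the cost of the portfolio at the quoted bond prices, which should equal the present value of the desired payments.

```python
Q = [
    [103.0,   4.0,   3.0],
    [  0.0, 104.0,   3.0],
    [  0.0,   0.0, 103.0],
]
p = [100.0, 101.0, 98.0]    # bond prices
c = [300.0, 200.0, 100.0]   # desired payments in years 1, 2 and 3


def replicating_portfolio(Q, c):
    """Solve Q*n = c for n by back substitution (Q assumed upper triangular)."""
    m = len(c)
    n = [0.0] * m
    for i in range(m - 1, -1, -1):
        # Payments in year i already supplied by the longer-maturity bonds.
        later = sum(Q[i][j] * n[j] for j in range(i + 1, m))
        n[i] = (c[i] - later) / Q[i][i]
    return n


n = replicating_portfolio(Q, c)
cost = sum(price * units for price, units in zip(p, n))
print([round(x, 4) for x in n], round(cost, 2))
```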

Whether for real or nominal units of a currency, if a discount function can be determined from the values
and characteristics of default-free instruments, any corresponding vector of cash flows can be valued
and replicated. Moreover, any such vector can be "traded for" any other with the same present value.
The set of such combinations forms the default-free opportunity set available to the Investor. The
Analyst can help determine the set, but ultimately the Investor must select either one of its members or a
vector of cash flows that is not fully default-free.

Multi-period Interest Rates


While a discount factor provides a natural and direct measure of the present value of a certain future
cash flow, it is sometimes convenient to focus on a related and more familiar figure. If an investment
grows from a value of x to a value of x*(1+i) in one period, it can be said to have "earned interest" at
the rate i. The concept can be extended to multiple periods by assuming that interest compounds once
per period. Thus if an investment grows from V0 to V2 in two periods, the equivalent interest rate is
found by solving the equation:
(1+i)*(1+i) = V2/V0

or:
(1+i)^2 = V2/V0

The ratio of the ending value to the beginning value is termed the (t-period) value relative. For an
investment held t periods, the associated interest rate is computed from:
(1+i) = (Vt/V0)^(1/t)

Interest rates are generally used to describe securities for which payments are certain. In a one-period
setting, such securities can be termed riskless. In a multi-period setting it is preferable to describe them as
default-free since their values may fluctuate, making them risky if sold before the final payment has been
made.

There is a one-to-one relationship between a discount factor and the corresponding interest rate. If df(t)
is the discount factor for time t, one unit of the numeraire will grow to 1/df(t) units with certainty by time
t. Thus i(t), the default-free interest rate for time t is given by:

i(t) = ((1/df(t))^(1/t)) -1

With the value of the "t-period interest rate", one can discount any certain payment to be obtained at that
date. Let P(t) be an amount to be paid at t and i(t) the corresponding interest rate. Then the present value
pv is given by:
pv = P(t) / ( (1+i(t)) ^ t)

Since there is a one-to-one relationship between a discount factor and the associated interest rate, either
may be used to calculate a present value. Moreover, given one of them, the other can be determined with
little effort.

Consider the following discount function df:


Yr1 Yr2 Yr3
0.9400 0.8800 0.8200

The corresponding value relatives are given by vr = 1./df:


Yr1 Yr2 Yr3
1.0638 1.1364 1.2195

Using the MATLAB notation of [1:3] to generate the vector [1 2 3], the interest rates can be computed
as i = (vr.^(1./[1:3]))-1:
Yr1 Yr2 Yr3
0.0638 0.0660 0.0684

or:

Yr1 Yr2 Yr3


6.38% 6.60% 6.84%

These values, when plotted, give one version of the current yield curve or term structure of interest
rates. In this case it is upward-sloping, with long-term rates greater than short-term rates.
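
The MATLAB expressions above translate directly into Python. The sketch below is illustrative only; the payment P of 100 is an assumed figure used to confirm that discounting with df(t) and discounting with i(t) give the same present value.

```python
df = [0.94, 0.88, 0.82]            # discount function for years 1..3

# Value relatives: vr = 1./df
vr = [1.0 / d for d in df]

# Interest rates with compounding once per period: i = vr.^(1./[1:3]) - 1
i = [v ** (1.0 / t) - 1.0 for t, v in enumerate(vr, start=1)]

# Either figure values a certain payment P due at time t (P is an assumed amount).
P = 100.0
pv_from_df = [P * d for d in df]
pv_from_rate = [P / (1.0 + r) ** t for t, r in enumerate(i, start=1)]

print([round(v, 4) for v in vr])
print([round(r, 4) for r in i])
```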

In these calculations, we have computed interest rates assuming compounding once per period. One
could as easily use a definition based on compounding more than once per period; or not at all; or
continuously. When processing an interest rate, it is important to know which definition was used so that
errors do not creep into subsequent calculations. The possibility of alternative definitions makes the use
of discount factors a safer approach. Moreover, a case can be made for the thesis that a discount factor,
being a price, is a fundamental characteristic of an economy, while an interest rate is a derived construct.
This being said, interest rates are ubiquitous, helpful for comparisons of prices of payments at different
times, and necessary for communication with those used to more traditional characterizations of financial
markets.

Bond Yields
Many bonds, both traditional and index-linked, provide coupon payments periodically and a final
principal payment at maturity. Consider, for example, a bond that provides payments cf of:
Yr1 6
Yr2 6
Yr3 106

Given the previous discount function, such a bond has a present value of $97.84. Based on its initial par
value of $100, the coupon yield is 6% per year. However, given the fact that it is selling for $97.84, the effective
yield is greater. To reflect this, analysts often use a derived figure, the yield-to-maturity. This is a
constant interest rate that makes the present value of all the bond's payments equal its price. In this case,
we seek a value for y that will satisfy the equation:
6/(1+y) + 6/((1+y)^2) + 106/((1+y)^3) = 97.84

This can be done by trial and error, preferably using an intelligent algorithm to find the result (to a
desired degree of accuracy). In this case, y is approximately 6.82%.
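
One simple "intelligent" form of trial and error is bisection. The sketch below is an assumption about how such a search might be coded, not the method used in the original Work; the function names, the search bracket of 0% to 100% and the tolerance are all illustrative choices. It relies on the fact that, for positive cash flows, present value falls as the yield rises.

```python
cf = [6.0, 6.0, 106.0]   # payments in years 1, 2 and 3
price = 97.84


def present_value(y, cf):
    """Present value of cf when every payment is discounted at the single rate y."""
    return sum(c / (1.0 + y) ** t for t, c in enumerate(cf, start=1))


def yield_to_maturity(cf, price, lo=0.0, hi=1.0, tol=1e-10):
    """Bisection search; assumes the answer lies between lo and hi."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if present_value(mid, cf) > price:
            lo = mid      # value too high, so the yield must be higher
        else:
            hi = mid
    return (lo + hi) / 2.0


y = yield_to_maturity(cf, price)
print(round(y, 4))
```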

A set of yields-to-maturity for bonds with varying coupons and maturities will typically not plot on a
single curve. Nonetheless, some analysts crossplot yield-to-maturity and maturity date for a set of bonds,
then fit a "yield curve" through the resulting scatter of points. The result may be helpful, but should not
be used for valuation purposes.

Duration
The maturity of a bond provides important information for its valuation. The values of longer-term
bonds are generally affected more by changes in interest rates, especially longer-term rates. However,
for coupon bonds, maturity is a somewhat crude indicator of interest rate sensitivity. A high-coupon
bond will be exposed more to short and intermediate-term rates than will a low coupon bond with the
same maturity, while a zero-coupon bond will be exposed only to the interest rate associated with its
maturity. To provide a somewhat better measure than maturity, Analysts often compute the duration of a
set of cash flows.

Let df be a {1*periods} vector of discount factors and cf a {periods*1} vector of cash flows. The
duration of cf is a weighted average of the times at which payments are made, with each payment
weighted by its present value relative to that of the vector as a whole. In the previous example, the bond
has cash flows cf:
Yr1 6
Yr2 6
Yr3 106

The market discount function df is:


Yr1 Yr2 Yr3
0.9400 0.8800 0.8200

The present values of the cash flows are v = df.*cf':


Yr1 Yr2 Yr3
5.6400 5.2800 86.9200

To compute weights we divide by total value, w = v/(df*cf), giving:


Yr1 Yr2 Yr3
0.0576 0.0540 0.8884

In MATLAB, the expression [1:3]' produces the {periods*1} vector of time periods:


Yr1 1
Yr2 2
Yr3 3

The duration, given by d = w*([1:3]'), is 2.8307 years -- somewhat less than the maturity of 3 years.
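
The same steps can be written in Python. This is a sketch that mirrors the MATLAB expressions above using plain lists; the variable names follow the text.

```python
df = [0.94, 0.88, 0.82]     # market discount function
cf = [6.0, 6.0, 106.0]      # bond cash flows

# Present value of each payment: v = df.*cf'
v = [d * c for d, c in zip(df, cf)]

# Weights: each present value relative to the total, w = v/(df*cf)
total = sum(v)
w = [x / total for x in v]

# Duration: present-value-weighted average payment time, d = w*([1:3]')
d = sum(weight * t for t, weight in enumerate(w, start=1))

print(round(total, 2), [round(x, 4) for x in w], round(d, 4))
```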

Well and good, but what use can be made of duration? In some circumstances, quite a bit. In others,
somewhat less. We make the calculation to better understand the reaction of the value of a vector of cash
flows to a change in one or more interest rates. In practice, of course, many such rates along the term
structure may change at the same time. In general, if the discount function changes from df1 to df2, the
present value of cash flow vector cf will experience a change in value equal to:
dV = (df2 - df1)*cf

How can one number summarize the effect on value of a change in potentially many different interest
rates along the discount function?

Of necessity, a change in the yield-to-maturity of a bond will cause a predictable change in the value of
that bond or set of cash flows, since there is a one-to-one relationship between the two. The relationship
holds as well for most cash flow vectors. In such cases the term internal rate of return is utilized, instead
of yield-to-maturity. If there are sufficiently many positive and negative cash flows in a vector, the
internal rate of return may not be unique, causing potential mischief if one relies upon it. However, this
cannot happen if the vector consists of a series of negative (positive) flows, followed by a series of
positive (negative) flows -- that is, if there is only one reversal of sign.

In practice, a bond's duration is usually calculated with a discount function based on its own yield-to-
maturity, that is:
[ 1/(1+y) 1/((1+y)^2) 1/((1+y)^3) ]

Now, consider c(t), the cash flow for the t'th period. Using the bond's yield-to-maturity, its present value is:
v(t) = c(t)/((1+y)^t)

If there is a very small change dy in y, the change in v(t) will be:


dv(t) = (c(t)*(-t*(1+y)^(-t-1))) * dy

or

dv(t) = (v(t)*-t) * (dy/(1+y))

Summing all such terms we have the total change in value dv:
dv = sum(dv(t)) = - sum(v(t)*t) * (dy/(1+y))

Finally, the proportional change in value, dv/v is:


dv/v = sum(dv(t)/v) = - sum((v(t)/v)*t) * (dy/(1+y))

But the term inside the parentheses preceded with "sum" is the duration, calculated using the bond's
own yield-to-maturity. Thus we have:
dv/v = - d * (dy/(1+y))

Sometimes the duration is divided by (1+y) to give the modified duration. Letting md represent this, we
have:
dv/v = - md * dy

Thus the modified duration indicates the proportional decline in the value of the bond per unit increase
in its own yield-to-maturity (equivalently, the percentage decline per percentage-point increase). The
minus sign indicates that an increase (decrease) in a bond's yield-to-maturity is accompanied by a
decrease (increase) in its value.
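
The approximation dv/v = - md * dy can be checked numerically. The sketch below assumes the three-year bond of the earlier example with y taken as 6.82%, and compares the approximation with a full repricing after an assumed one-basis-point rise in yield. The names price_at, approx_change and actual_change are illustrative.

```python
cf = [6.0, 6.0, 106.0]   # bond cash flows
y = 0.0682               # approximate yield-to-maturity from the earlier example


def price_at(rate):
    """Value of cf when every payment is discounted at a single rate."""
    return sum(c / (1.0 + rate) ** t for t, c in enumerate(cf, start=1))


v = price_at(y)

# Duration computed with the bond's own yield, then modified duration.
d = sum(t * c / (1.0 + y) ** t for t, c in enumerate(cf, start=1)) / v
md = d / (1.0 + y)

dy = 0.0001                                  # assumed change: one basis point
approx_change = -md * dy                     # dv/v = - md * dy
actual_change = price_at(y + dy) / v - 1.0   # full repricing

print(round(d, 4), round(md, 4))
print(approx_change, actual_change)
```

The two figures agree closely for a change this small; for larger changes in yield the linear approximation becomes less accurate.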

Duration (modified or not) is of no interest unless one can establish a relationship between a bond's own
yield-to-maturity and some market rate of interest. For example, assume y = y20+.01, where y20 is the
interest rate on 20-year zero coupon government bonds. In this case:

dy = dy20

and:

dv/v = - md * dy20

which relates the percentage change in the bond's value to the change in a market rate of interest.

The concept of duration is especially relevant for Analysts who counsel the managers of
defined-benefit pension funds. Many such funds have obligations to pay future pensions that are fixed in nominal
(e.g. dollar) terms, at least formally. Moreover, the bulk of the cash flows must be paid at dates far into
the future. The present value of the liabilities of such a plan can be computed in the usual way and its
yield-to-maturity (internal rate of return) or discount rate, determined, using market rates of interest. In
many cases, the discount rate will be very close to a long-term rate of interest (e.g. that for 20-year
bonds). Since term structures of interest rates tend to be quite flat at the long end, any change in the
long-term rate of interest will be accompanied by a roughly equal change in the discount rate for a
typical pension plan of this type. Thus the duration of the plan's cash flows provides a good estimate of
the sensitivity of the present value of its liabilities to a change in long-term interest rates. Any imbalance
between the duration of the assets in a pension fund held to meet those liabilities and the duration of the
liabilities may well provide an indication of the extent to which the fund is taking on interest rate risk.

Forward Interest Rates


In our most recent example, the discount function df was:

Yr1 Yr2 Yr3


0.9400 0.8800 0.8200

with associated interest rates:


Yr1 Yr2 Yr3


0.0638 0.0660 0.0684

For example, $1 invested at a rate of 6.60% per year, compounded yearly, would grow to 1/0.88, or about
$1.1364, at the end of two years. This interest rate could be termed the 2-year spot rate to emphasize the
fact that it assumes an investment that begins immediately and lasts for two years.

A different type of interest rate involves an agreement made immediately for investment at a later date
and repayment at an even later date. For example, one might agree today to borrow $1 in a year and
repay $1 plus a stated amount of interest one year later (i.e. two years hence). The interest rate in
question is termed a forward interest rate to emphasize the fact that it covers an interval that begins at a
date forward (i.e. in the future).

Of particular interest are forward rates covering periods that last only one period. Such rates can be
denoted by their starting date. Hence the 1-year forward rate covers the period from the end of year 1 to
the end of year 2, but on terms negotiated today. Given the discount function, it is possible to arrange
today to borrow 1/df(1) dollars at the end of year 1 and repay 1/df(2) dollars at the end of year 2 for a
zero net investment, since each "side" will have a present value of $1. Hence, arbitrage decrees that any
forward contract covering the same period will have the same terms. This ensures that:

(1/df(1)) * (1+f(1)) = (1/df(2))

where f(1) is the forward rate for the period beginning at the end of year 1 and ending at the end of year
2.

Re-arranging the equation above gives the simpler form:


f(1) = (df(1)/df(2)) - 1

More generally:
f(t) = (df(t)/df(t+1)) - 1

In the special case in which t = 0, the "forward rate" will, in fact, be the spot rate for a one-year loan,
since df(0), the present value of $1 today, is $1.

To obtain the full vector of forward rates, we create a lagged vector dfl of all but the last discount factor,
preceded by the present value of $1 today:

dfl = [ 1 df(1:2)]

Yr1 Yr2 Yr3


1.00 0.94 0.88

Dividing each element of this lagged vector by the corresponding element of the original discount function, then
subtracting 1 gives the forward rate vector f:
f = (dfl ./ df) - 1

f(0) f(1) f(2)


0.0638 0.0682 0.0732

Thus one dollar grows to $1.0638*1.0682 in two years and $1.0638*1.0682*1.0732 in three years. Of
necessity, these calculations reach the same conclusion as do those based on the respective spot interest
rates. However, the latter use different rates for the same year (e.g. year 2), depending on the investment
being analyzed, while the former do not. Thus forward rates are closer to economic reality and can be
used with far less risk of error.
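
A short Python sketch of the forward-rate calculation follows. It mirrors the MATLAB expressions dfl = [1 df(1:2)] and f = (dfl ./ df) - 1, and then confirms that compounding the three forward rates reproduces the three-year value relative 1/df(3).

```python
df = [0.94, 0.88, 0.82]          # discount function for years 1..3

# Lagged vector: the present value of $1 today, then all but the last factor.
dfl = [1.0] + df[:-1]

# Forward rates f(0), f(1), f(2): f = (dfl ./ df) - 1
f = [lag / cur - 1.0 for lag, cur in zip(dfl, df)]

# Compounding the forward rates gives the growth of $1 over three years.
growth = 1.0
for rate in f:
    growth *= 1.0 + rate

print([round(r, 4) for r in f], round(growth, 4))
```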

Forward rates are especially useful when an Analyst is trying to predict future levels of inflation for
estimating liabilities of a pension plan with benefits tied to salary levels, which are, in turn, affected by
changes in the cost of living. A standard assumption holds that a forward interest rate is the sum of two
components: (1) a liquidity premium (sometimes called a term premium) and (2) an expectation
concerning the spot rate that will hold at the time. Thus the two-year forward rate in our example
(7.32%) might be considered to be the sum of a normal liquidity premium for such obligations of 1.0%
and a consensus expectation of market participants that the one-year spot rate will equal 6.32% for year
3. The spot rate, in turn, may be assumed to equal an expected one-year real return of, say, 1.5% plus
an expected level of inflation equal to 6.32%-1.5%, or 4.82%. Combining the two calculations gives:
Forward Rate
- Liquidity Premium
- Expected Short-term Real Return
----------------------------------
Expected Inflation

Here:

7.32
-1.00
-1.50
------
4.82
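
The same subtraction in Python, as a small sketch. The liquidity premium and the expected real return are the illustrative figures assumed in the text, not market data.

```python
forward_rate = 0.0732           # two-year forward rate f(2) from the example
liquidity_premium = 0.0100      # assumed in the text
expected_real_return = 0.0150   # assumed in the text

# Expected one-year spot rate for year 3, then expected inflation.
expected_spot = forward_rate - liquidity_premium
expected_inflation = expected_spot - expected_real_return

print(round(expected_spot, 4), round(expected_inflation, 4))
```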

A common set of assumptions holds that liquidity premia increase at a decreasing rate as maturity
increases and that expected short-term real returns are constant. This implies that the term structure of
forward rates will have the same shape as the liquidity premium function in periods in which inflation is
expected to remain constant. If the forward curve is steeper, inflation is presumably expected to increase.
If it is flatter or downward-sloping, inflation can be expected to decrease.

Procedures such as this applied to the set of forward interest rates allow an Analyst to estimate levels of
future inflation that are consistent with current market yields. As usual, the estimates are only as good as
the assumptions, but are likely to be better than the use of some average historic inflation level,
especially in periods in which term structures of interest rates are unusually steep, unusually flat, or
actually downward-sloping.

10/23/12 Forward Prices

Forward Prices

Contents:
Forward Prices
Atomic Forward Prices
Properties of Atomic Forward Prices
Valuation Using Atomic Forward Prices
Prices and Probabilities
Return Swaps

Forward Prices
We have used the term price to refer to an amount to be paid at the present time for a time-state claim or
bundle of such claims. Not surprisingly, the magnitude of such a value is specified at the present time.
Thus party A might agree to deliver a dollar next year if the weather is good, in return for which party B
delivers 0.285 dollars immediately.

It is, of course, possible to agree today to an exchange in which both receipts and payments will occur in
the future. Forward interest rates represent the terms of such an arrangement. Similar procedures can be
followed when contingencies are involved. For example, party A could agree to deliver 7 dollars next
year if the weather is good while party B agrees to deliver 3 dollars next year if the weather is bad. In
this case, both "sides" of the transaction are contingent and will take place (if at all) in the future.

Of particular interest are cases in which all payments are in the future, but one involves payments that
are not state-dependent. For example, party A might agree to deliver one dollar next year if the weather
is good while party B agrees to deliver 0.30 dollars next year no matter what the weather has been. In
this case we would say that the forward price of a good weather dollar is 0.30 dollars delivered next
year: party B bought one good weather dollar for 0.30 dollars to be delivered (with certainty) one year
hence.

A forward price involves a future payment date. Thus in a multi-period setting, one could have a one-
year forward price for a given set of time-state claims, a two-year forward price for the same set of
claims, etc. The first would indicate an amount that would have to be paid in one year to purchase the
set of claims. The second would indicate an amount that would have to be paid in two years to purchase
the set of claims, etc. In each case, the price would be determined at the present time and the
agreed-upon amount would have to be paid at the specified future time, regardless of the nature of ensuing
events.

Atomic Forward Prices


We have said that an atomic price is the present value of a "pure security" -- i.e. a claim that pays one
unit of a stated commodity or currency at a specified time and state of the world. For present purposes
we focus on claims that pay in currency terms. In particular, we assume that one unit of such a claim
pays $1 at the stated time if and only if the state of the world occurs.

www.stanford.edu/~wfsharpe/mia/prc/mia_prc5.htm 1/6

We define an atomic forward price as an amount that must be paid with certainty for such a claim, with
payment to be made at the same time as the possible payment in question. Thus in our example the
atomic forward price for one good-weather dollar is $0.30.

By extension, we may say that the forward price for a bundle of claims that share the same payment
date and differ only in the states of the world in which they are to be paid is the amount to be paid with
certainty at the common date for which the bundle can be obtained.

Note that at time 0 there will be a forward price for, say, a claim or combination of claims that will be
paid (if at all) at time 3. At time 1 there will be a potentially different forward price for the same set of
claims. However, contracts struck at time 0 will require payment of the initial amount at time 3, even
though contracts newly negotiated at time 1 will require a different amount. This means that a deal
negotiated at time 0 that had a net present value of zero at the time may well have a negative or positive
net present value at time 1, depending on the events that transpired in the first period. Realistic
accounting calls for both parties to adjust their books to reflect the new value, thereby marking to
market the positions involved.

The Relationship between Prices and Forward Prices


Arbitrage ensures that there is a very close relationship between prices and forward prices. Consider an
economy in which a security promising $1 in year 1 if the weather is good commands a price of $0.285.
Assume that an investment firm offers you such a security in return for a promise to pay $0.305 at the
end of the year. Is this a good deal?

To obtain the answer one must consider the rate at which present (certain) dollars can be exchanged for
certain future dollars. Assume that in this economy the one-period discount factor is 0.95. Thus one can
exchange one certain future dollar for 0.95 present dollars. Alternatively, one can exchange 1/0.95, or
1.052632 certain future dollars for one present dollar (the one-period interest rate is 5.2632 percent per
period). Assume that you wish to buy a good weather dollar security but pay for it at the end of the year.
You could agree to pay $0.305 at the time to the investment firm. Alternatively, you could borrow
$0.285 in order to buy such a security on the open market. You would, of course, have to repay the
loan, which would require the payment of $0.285/0.95, or $0.285*1.052632 at the end of the year. But
this is $0.300! Thus the investment firm was trying to make you agree to pay $0.305 in a year for
something "worth" (obtainable elsewhere for) a promised payment of $0.300. In a market populated by
astute Analysts, the investment firm would find that it had no takers for this product. If it were willing to
take the other side of the offer, clever analysts could make money with neither risk nor investment via
arbitrage between its terms and those available directly or indirectly in public markets.

Sooner or later, arbitrage will force equality between the present price of a set of time-state claims and
the discounted forward price, using the appropriate discount factor (or, equivalently, the default-free interest
rate). Thus, if fc(t) is the forward price to be paid at time t for the claim, df(t) is the discount factor for
time t, and pc is the present price of the claim:
pc = df(t) * fc(t)

Given a present price, one can determine the appropriate forward price, or vice-versa, using the
appropriate discount factor.
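
The good-weather example can be checked with a few lines of Python. This sketch applies pc = df(t) * fc(t) to the figures given above; the variable names are illustrative.

```python
df1 = 0.95              # one-period discount factor
present_price = 0.285   # price today of $1 if the weather is good
offered = 0.305         # forward price quoted by the investment firm

# Fair forward price: fc(1) = pc / df(1)
fair_forward = present_price / df1

# Amount by which the firm's quote exceeds the arbitrage-free forward price.
overcharge = offered - fair_forward

print(round(fair_forward, 3), round(overcharge, 3))
```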

Properties of Atomic Forward Prices


It is important to understand the precise meaning of an atomic forward price. Consider the forward price
of 0.300 for a dollar in year 1 if the weather has been good. What net payments would the forward
purchaser of such an atomic claim have to make? The answer depends on the weather.

If the weather is good:


Pay: $ 0.300
Receive: $ 1.000
-----
net $ 0.700

If the weather is bad:


Pay: $ 0.300
Receive: $ 0.000
-----
net - $ 0.300

In fact, the two parties could as well agree that if the weather is good, A will pay B $0.70, but if the
weather is bad, A will receive $0.30 from B.

Forward atomic prices can, of course, be computed directly from (present) atomic prices and the
discount factor. Assume that the present value of $1 if the weather is bad is $0.665. Then the atomic
forward prices are:

good weather dollars: 0.285/0.95 = 0.300


bad weather dollars: 0.665/0.95 = 0.700

Note that they sum to 1.000. This is hardly a surprise. Consider the effect of buying one unit of every
time-state claim for the date in question. Such a bundle of claims will guarantee the receipt of $1.00 no matter what state may
occur. It will also require the payment of an amount equal to the sum of all the corresponding forward
atomic prices. If this sum is less than 1.000, one can get something for nothing by buying a package of
equal amounts of all such claims. If it is more than 1.000, one can get something for nothing by selling
such a package. Arbitrage will thus ensure that the sum will be 1.000. Thus:

The sum of the forward atomic prices for a given date must be 1.000.
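
A brief Python sketch of this property, using the two atomic prices of the example. The discount factor is obtained as the sum of the atomic prices for the date, as described in the earlier section on discount factors; the dictionary layout is an illustrative choice.

```python
atomic = {"good": 0.285, "bad": 0.665}   # present atomic prices for year 1

# The discount factor equals the sum of the atomic prices for the date.
df1 = sum(atomic.values())

# Forward atomic price = present atomic price / discount factor.
forward = {state: price / df1 for state, price in atomic.items()}

print(round(df1, 2), {s: round(x, 3) for s, x in forward.items()})
print(round(sum(forward.values()), 6))
```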

Valuation Using Atomic Forward Prices


The relationship between prices and forward prices allows one to value a set of time-state claims in two
steps. First, all claims for a given time period are analyzed and their collective forward value determined.
This is repeated for each time period. Finally, the resultant values are discounted, using the discount
function, to obtain the overall present value.

Consider our previous example with two states of the world at time 1 (g and b) and four states at time 2
(gg, gb, bg, and bb). The prices were given by p:
g b gg gb bg bb
0.2857 0.6667 0.0816 0.1905 0.1905 0.4444

The associated discount function df is thus:

Yr1 0.9524
Yr2 0.9070

The forward prices for year 1 are:


g b
0.30 0.70

and those for year 2 are:


gg gb bg bb
0.0900 0.2100 0.2100 0.4900

Now, consider the task of valuing the following set of claims c:


g 5
b 3
gg 15
gb 12
bg 11
bb 5

Given the atomic prices p the result can be determined by simple matrix multiplication:
p*c = 11.2562

Here is the alternative.

First, compute the forward value of the time 1 claims:

State   Payment   Forward Price   Forward Value
g             5            0.30            1.50
b             3            0.70            2.10
                                        -------
                                           3.60

Next, the forward value of the time 2 claims:


State   Payment   Forward Price   Forward Value
gg           15            0.09            1.35
gb           12            0.21            2.52
bg           11            0.21            2.31
bb            5            0.49            2.45
                                        -------
                                           8.63

Finally, the discounted present value of both sets of claims:

Time   Future Value   Discount Factor   Present Value
1              3.60            0.9524          3.4286
2              8.63            0.9070          7.8277
                                             --------
                                              11.2562
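
The equivalence of the two valuation routes can be sketched in Python. The nested dictionaries are an illustrative layout, not the Work's notation. Because the atomic prices are entered as rounded in the text, the total comes out at about 11.256 and may differ from the tabulated 11.2562 in the fourth decimal place.

```python
# Atomic prices and claims, grouped by payment date.
prices = {
    1: {"g": 0.2857, "b": 0.6667},
    2: {"gg": 0.0816, "gb": 0.1905, "bg": 0.1905, "bb": 0.4444},
}
claims = {
    1: {"g": 5.0, "b": 3.0},
    2: {"gg": 15.0, "gb": 12.0, "bg": 11.0, "bb": 5.0},
}

# Route 1: multiply each claim by its atomic price and add (p*c).
direct = sum(prices[t][s] * claims[t][s] for t in prices for s in prices[t])

# Route 2: forward value for each date, then discount.
two_step = 0.0
for t in prices:
    df_t = sum(prices[t].values())                       # discount factor
    fwd = {s: prices[t][s] / df_t for s in prices[t]}    # atomic forward prices
    fwd_value = sum(fwd[s] * claims[t][s] for s in fwd)  # forward value
    two_step += df_t * fwd_value

print(round(direct, 3), round(two_step, 3))
```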

Prices and Probabilities


Thus far, nothing has been said about probabilities. This is just as well, for there is no "law of one
probability". Market participants can hold radically different opinions concerning the probabilities of
various states of the world. No matter -- the markets will still function. Prices will be set, valuation can
be performed, replicating strategies can be determined, packages of state-contingent claims can be
valued using atomic prices or the combination of atomic forward prices and discount factors, and so on.

Despite these facts, there is a great temptation to interpret atomic forward prices as probabilities. All
such prices for a given time sum to 1.0, as must any set of probabilities assigned rationally to the states
in question. The forward value of a set of claims for a given time period could be interpreted as the
expected value of the payments if only the atomic forward prices were probabilities. If so, one could
argue that to value a set of claims one only need discount the expected values, using riskless rates of
interest.

But there is no reason to expect that the atomic forward price of a time-state claim equals the probability
assigned to it by a single market participant or even a consensus of market participants. Quite the
contrary. Hence it is dangerous to equate an atomic forward price with any notion of the probability that
the associated state will occur. Nonetheless, many Analysts accept the danger inherent in such a
position, while recognizing the fact that prices and probabilities need not be the same. Commonly, they
may use the term risk-neutral probability instead of "atomic forward price", then argue that valuation
involves discounting (at riskless rates of interest) the (pseudo-) expected payments at each period, with
risk-neutral probabilities used to calculate expected values.
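
The mechanics of this nomenclature can be sketched numerically. The example below is a hedged illustration, assuming the atomic prices of 0.285 (good weather) and 0.665 (bad weather) used in this section's examples and, as the claim to be valued, the end-of-year value of $1 of the Stock. The discount factor is taken to be the sum of the atomic prices. All variable names are hypothetical.

```python
# Assumed atomic prices: present value of 1 unit paid in each state.
atomic_prices = {"good": 0.285, "bad": 0.665}

# Illustrative claim: end-of-year value of $1 of the Stock in each state.
payments = {"good": 1.26, "bad": 0.96}

# A claim paying 1 unit in every state costs the sum of the atomic prices.
discount = sum(atomic_prices.values())

# Atomic forward prices: atomic prices divided by the discount factor.
# These are the numbers sometimes relabeled "risk-neutral probabilities".
forward_prices = {s: p / discount for s, p in atomic_prices.items()}

# Valuation 1: multiply each payment by its atomic price.
pv_direct = sum(atomic_prices[s] * payments[s] for s in payments)

# Valuation 2: compute the forward value (the "pseudo-expected" payment),
# then discount it at the riskless rate.
forward_value = sum(forward_prices[s] * payments[s] for s in payments)
pv_via_forward = discount * forward_value

print(forward_prices, pv_direct, pv_via_forward)
```

The two valuations agree by construction. Nothing in the calculation requires that the forward prices equal anyone's beliefs about the likelihood of good or bad weather.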

The rationalization for this approach rests on two observations. As we will discuss subsequently, if
investors were all risk-neutral and agreed on the probabilities of the various states of the world, the
atomic forward price for a time-state claim would equal the agreed-upon probability of its occurrence.
But if there is anything known about investors it is that they are risk-averse, not risk-neutral. Moreover,
they do not all agree on probabilities. Hence atomic forward prices are not probabilities in any simple
sense.

Despite these objections, those who use this nomenclature can still get the right answers. However, the
economics of the situation are at best hidden from sight and may in many cases be overlooked entirely.
Most importantly, it is easy to slip over the line and equate prices ("risk-neutral probabilities") with real
probabilities. We attempt to avoid the confusion that such an approach can entail. Here, prices are prices
and probabilities are probabilities. The relationships among them are complex and need to be addressed
explicitly, which we do in other sections.

Return Swaps
We conclude this section with yet one more example that illustrates that prices alone can provide all the
needed answers for many important practical problems.

A very popular arrangement encouraged by financial engineers can be termed a return swap. Consider
the following case. Investor A promises to pay investor B the return on a notional value of $1 of a
Stock, while B promises to pay A the return on a value of $1 of a Bond. We utilize our Stock that can
increase by 26% or decrease by 4% in a year, and the Bond that will increase by 5% for certain. In a
one-period setting, the return on an asset is simply the value-relative minus 1. Thus the net cash flows
are:
State    Bond    Stock   A to B   B to A
good     0.05     0.26     0.21    -0.21
bad      0.05    -0.04    -0.09     0.09

The final columns of the table summarize the net payments between the counterparties to this swap in
each of the possible states of the world. In practice only one payment is made: $0.21 from A to B if the
weather is good or $0.09 from B to A if the weather is bad.

The obvious question: is this fair? The answer, shown below, is clearly yes. The present value of the
amounts paid by A to B is precisely zero! Needless to say, so is the present value of the amounts paid by
B to A.
                           Present
State   A to B    Price      Value
good     0.21     0.285     0.0598
bad     -0.09     0.665    -0.0598
                           -------
                            0.0000
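
The fairness calculation can be reproduced in a few lines. This is a minimal sketch using the returns and atomic prices from the tables above; the variable names are ours.

```python
# Atomic prices: present value of $1 paid in each state.
atomic_prices = {"good": 0.285, "bad": 0.665}

# Returns on $1 of notional value in each state.
stock_return = {"good": 0.26, "bad": -0.04}
bond_return = {"good": 0.05, "bad": 0.05}

# A pays the Stock return and receives the Bond return, so the net
# payment from A to B is the difference between the two.
net_a_to_b = {s: stock_return[s] - bond_return[s] for s in atomic_prices}

# Present value of the net payments from A to B.
pv_a_to_b = sum(atomic_prices[s] * net_a_to_b[s] for s in atomic_prices)

print(net_a_to_b, round(pv_a_to_b, 4))
```

The present value of the payments from B to A is simply the negative of `pv_a_to_b`, and hence is also zero.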

This is a very general result, and depends in no way on the simplified world analyzed here. As long as
both parties will make the required payments, any swap of the return on a marketable security for the
return on another one with equal value will be "fair" (i.e. have zero net present value at the time the deal
is struck).

To see this, consider how one could "manufacture" the return on a security from other instruments. For
simplicity, assume that the goal is to produce the dollar returns on the Stock in question: +$0.26 if the
weather is good and -$0.04 if the weather is bad. The trick is to purchase one unit of the stock, and take
out a loan that will require payment of $1.00 at the end of the year. The net results will then be:
         Stock     Loan
State    Value   Payment     Net
good      1.26    -1.00     0.26
bad       0.96    -1.00    -0.04

as desired.

The cost of this strategy is $1.00 (for the stock) less the amount borrowed, which is 1/1.05, or $0.9524.
But the latter is the discount factor for the time in question. Thus the present value of a promise to
receive the return on the stock at time period 1 is 1 - df(1), or $0.0476, which is, in turn, the discount on a
1-year zero-coupon bond.

While we reached this conclusion for the Stock in our example, the result would have been the same if
any other asset had been utilized, as a careful review of the argument will indicate. Moreover, with
suitable modification, it would hold for other time periods. To generalize:
The present value of a guarantee to pay the return on
an asset with a notional value of $X is the discount
on a riskless loan that requires payment of $X at the
end of the period over which the return is guaranteed.

Since a return swap involves exchange of two promises with the same present value, it is thus fair.
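
The replication argument can be checked with a short calculation. The sketch below assumes the 5% riskless rate and the Stock values used above; the function `pv_of_return_guarantee` is a hypothetical helper that restates the boxed rule, not part of any library.

```python
RISKLESS_RATE = 0.05

# End-of-year value of $1.00 of the Stock in each state.
stock_value = {"good": 1.26, "bad": 0.96}

stock_cost = 1.00       # paid today for one unit of the Stock
loan_repayment = 1.00   # owed at the end of the year in every state

# Amount that can be borrowed today against a certain repayment of $1.00.
# This is the one-period discount factor, df(1).
loan_proceeds = loan_repayment / (1.0 + RISKLESS_RATE)

# Net end-of-year position: Stock value less the loan repayment.
net_payoff = {s: v - loan_repayment for s, v in stock_value.items()}

# Net cost today of setting up the position.
net_cost = stock_cost - loan_proceeds


def pv_of_return_guarantee(notional, discount_factor):
    """Discount on a riskless loan requiring payment of `notional`."""
    return notional * (1.0 - discount_factor)


print(net_payoff, round(net_cost, 4))
print(round(pv_of_return_guarantee(1.00, loan_proceeds), 4))
```

Because the Stock's own payoffs never enter `pv_of_return_guarantee`, the same value obtains for the return on any marketable asset with the same notional value.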

In practice, there is often some uncertainty concerning the ability of one or both counterparties to make
all promised (contingent) payments. When this is the case, one or both present values must be decreased
(or promised payments increased) to account for credit risk. Careful evaluation of such risk is critical for
success in the swap business.

10/24/12 Macro‑Investment Analysis

Probabilities
Production, Consumption and Market Clearing
Risk Premia
Consumption and Investment Choices

www.stanford.edu/~wfsharpe/mia/prb/mia_prb0.htm 1/1
10/24/12 Production, Consumption and Market‑Clearing

Production, Consumption and Market-Clearing

Contents:
Production
Wealth
Production Possibilities
Optimal Production
Consumption
Optimal Consumption
The Societal Aggregate Product
The Market Portfolio

Production
In most of the prior examples, production was very simple. The economy consisted of one or more apple producers, each of
whom grew trees with the same output (63 apples if the weather was good, 48 apples if the weather was bad).

Consider now a slightly more complex firm: one that will produce 20 apples today, 63 apples in the future if the weather is
good and 48 apples in the future if the weather is bad. This production plan can be plotted as a point in a three-dimensional
diagram in which the axes plot apples today, apples next year if the weather is good, and apples next year if the weather is
bad, as shown below.

As before, assume that it is possible to swap any one of the three time-state claims for one or more of the other two. Any two
swap ratios can be utilized to summarize all possibilities. As before, we use the prices (present values) of the atomic securities,
each of which represents a payment for one and only one future state of the world.

www.stanford.edu/~wfsharpe/mia/prb/mia_prb1.htm

While a firm's production plan will plot as one point in this type of diagram, by utilizing swaps (including traditional market
exchanges), the firm can achieve any point on a plane that goes through that point. The plane is easily constructed, once the
production plan and market prices (swap terms) are known. If the firm issues more than one class of security, each class will
plot at a particular point on a lower plane, but the total value of the various classes will be the same. The set of decisions taken
by the firm to move to different points on one or more planes can be considered its financing plan.

Of course, the shareholders need not consume the mix corresponding to the point selected by the firm via the combination of
its production plan and its financing plan. They, too, can utilize swaps (market exchanges) to collectively achieve any desired
points on any desired planes of their own that collectively have the same value as the firm's production plan. Typically, each
shareholder will obtain one point on his or her own plane, and utilize market trades to transform it into another point. The total
of the consumption-investment combinations chosen by the shareholders will, of course, lie on the plane going through the
firm's production point and have the same value as the production plan.

In a world of full information and zero transactions costs, it is the distance from the origin of the plane on which a firm's
production plan plots that matters. The particular location chosen on the plane (or sub-planes) is, in such a setting, irrelevant to
shareholders. If the firm does not choose a point or points that they collectively prefer, they can utilize market exchanges to
move to such a point or points themselves.

Wealth
In our simple world, the measure of the value of a production plan is the distance from the origin of the plane that represents
combinations of time-state claims that can be obtained from it. Given market prices, all such planes are parallel to one another,
so one can measure the desirability of any such plane using virtually any chosen dimension. It is conventional to do this using
the "now" axis. In other words, the desirability of a production plan is measured in terms of the number of present goods for
which it could be traded. Equivalently, we measure the present value of the plan.
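
A brief numerical sketch may help. It assumes the atomic prices used in the valuation examples (0.285 present apples for an apple in good weather, 0.665 for an apple in bad weather) and the production plan of 20, 63 and 48 apples described earlier. The names are illustrative only.

```python
# Assumed atomic prices, in present apples per apple delivered.
PRICES = {"now": 1.0, "good": 0.285, "bad": 0.665}

# Production plan: apples today, next year if good, next year if bad.
plan = {"now": 20.0, "good": 63.0, "bad": 48.0}


def present_value(claims):
    """Number of present apples for which the claims could be traded."""
    return sum(PRICES[k] * claims[k] for k in PRICES)


# Swap 10 good-weather apples for bad-weather apples at market prices.
swapped = dict(plan)
swapped["good"] -= 10.0
swapped["bad"] += 10.0 * PRICES["good"] / PRICES["bad"]

print(present_value(plan), present_value(swapped))
```

The swap moves the firm to a different point on the same plane: the mix of claims changes, but the present value does not.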

The term wealth is often used to indicate a person's present value. For a firm financed entirely through equity, the goal of
management can be stated as the maximization of shareholder wealth. The analogous criterion for a firm with more complex
financing is the maximization of the wealth of those holding claims on the firm's production (the firm's stakeholders).

Given the ability of a firm's stakeholders to trade time-state claims on their own, the financing plan chosen by a firm should be
irrelevant. So should any swaps made by the firm, once its production plan is in place. The key issues concern the "trades
with nature" that result in the production plan. In the type of world we have described, management should concentrate on
choosing a plan that will produce the maximum possible present value for required investment. Aspects of financing via
issuance of different types of securities, risk-reduction via swaps, etc. are of no importance.

More generally, a firm should undertake new projects if and only if their value exceeds the cost of marketable alternatives
with the same time-state payments. The manner in which the needed resources are obtained is (at best) of secondary
importance.

While such statements are correct in the world under discussion, they are hardly likely to apply without qualification in the
world as it is. Transactions costs, asymmetric information, concentration of managers' human and other capital in the firms for
which they work, and a host of other issues make some corporate financial decisions more advantageous than others.
However, such aspects cannot be adequately analyzed until the essential framework for understanding economies without
such features has been built. Hence we continue to deal with our simple frictionless world in which information concerning
time-state payoffs is known to all.

Production Possibilities
Consider a firm with given resources. It will typically have to choose among a number of alternative production plans. Some
will be inefficient. In this context, a plan is inefficient if there is an alternative plan that provides more of at least one time-state
claim and no less of any time-state claims. Inefficient plans can be rejected out of hand, since each is dominated by at least
one other alternative. But this will typically still leave for consideration a number of efficient plans -- i.e. plans that are not
inefficient.

Efficient plans will generally plot on a "hill" in a diagram with time-state payments on the axes -- a hill that "bulges out" from
the origin of the diagram. As long as production technologies may be undertaken at any desired scale with proportionate
results, this surface is guaranteed never to "cave in". Consider two plans, X and Y. Point X might represent planting an entire
orchard with one type of apple tree, and point Y planting it with another type. Now consider a plan in which half the orchard
is planted with one type of tree and the other half with the other. If each tree has the same characteristics, no matter how many
are planted, the result will lie half-way between points X and Y in every dimension -- i.e. on the straight line connecting the
points in the diagram. If no better alternative can be found, at least this linear set of combinations will be available. Thus the
surface will not cave in. If a better alternative is in fact available, the surface will bulge out.

The set of efficient production possibilities can be termed the production possibility frontier. If it bulges out, the technology
evidences decreasing returns to scale -- a widely-observed phenomenon. A rather crude analogy holds that it looks
something like an upside-down mixing bowl (although perhaps a somewhat irregular one).

Optimal Production
Ensuring that production is efficient is, at base, a technical issue. This criterion can be met without reference to market prices.
However, the choice of the best (optimal) plan from among the efficient plans requires the use of market prices.

The rule is simple: for given resources, among such plans select the one with the largest present value. Graphically, this
involves selecting the point on the production possibility frontier at which a value plane (every point on which has the same
market value) is tangent to (touches but does not intersect) the frontier. This point represents the wealth-maximizing
production plan. Its selection is the only important decision made by management in our setting. In the real world, it is still
likely to be by far the most important decision for a firm -- considerably more important than financing decisions.

Again, a crude analogy: think of the valuation plane as a cookie sheet. Then the optimal production plan lies at the point at
which the cookie sheet touches a point on the upside-down mixing bowl.
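
The two-step procedure (discard inefficient plans, then select the efficient plan with the largest present value) can be sketched as follows. The four candidate plans are hypothetical and are assumed to require the same resources; the prices are the atomic prices assumed in our examples.

```python
# Assumed atomic prices for (now, future if bad, future if good).
PRICES = (1.0, 0.665, 0.285)

# Hypothetical plans, each a tuple of apples (now, if bad, if good).
plans = {
    "A": (20, 48, 63),
    "B": (20, 40, 60),  # no more than A in any claim, less in two
    "C": (25, 40, 70),
    "D": (10, 60, 55),
}


def dominates(p, q):
    """True if p offers no less of every claim than q and more of at least one."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


def present_value(plan):
    return sum(price * amount for price, amount in zip(PRICES, plan))


# Step 1 (technical, needs no prices): keep plans that no other plan dominates.
efficient = {
    name: p
    for name, p in plans.items()
    if not any(dominates(q, p) for q in plans.values())
}

# Step 2 (requires prices): choose the efficient plan with the largest present value.
best = max(efficient, key=lambda name: present_value(efficient[name]))

print(sorted(efficient), best, present_value(plans[best]))
```

With a different set of prices, step 1 would give the same result, but step 2 could well select a different plan.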

Consumption
Firms (producers) play a key role in an economy. Individuals (consumers) are the other key players. Ultimately, of course, all
resources (including both physical and human capital) are owned by individuals, so people function in both roles.

Given his or her resources, an individual will "own" a particular combination of time-state claims. For example, a worker may
expect to receive a salary of 100 apples today and a salary of 110 apples in the future if the weather is good or 90 apples if the
weather is bad. Such an endowment will plot as a point in the kind of diagram we have been using. However, given the
ability to trade in markets, the individual can choose to consume any point lying on the value plane that includes his or her
endowment point. This can be done by lending, borrowing, or any of a host of investment transactions.

An individual's wealth can be measured by the amount of present consumption that he or she could obtain by exchanging all
future prospects for present values, plus the value of present prospects already attained. Acting in one's role as producer, it is
desirable to choose a career, location, education, etc. to maximize this present value. We ignore here the importance of
intangibles that are not fully tradable, such as peace of mind, integrity, etc., (but do not wish to leave the impression that they
are unimportant). In any event, given such decisions, the individual as consumer faces a value plane representing the available
combinations of time-state claims. This is sometimes termed a budget plane or consumption opportunity set.
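
For the worker in the example, wealth and one move along the budget plane might be computed as below. This is a sketch under the assumption that the atomic prices are 0.285 (good weather) and 0.665 (bad weather); `lend` is a hypothetical helper that trades present apples for an equal, certain number of apples in each future state.

```python
# Assumed atomic prices, in present apples per apple delivered.
PRICES = {"now": 1.0, "good": 0.285, "bad": 0.665}

# The worker's endowment: salary today and in each future state.
endowment = {"now": 100.0, "good": 110.0, "bad": 90.0}


def wealth(claims):
    """Present consumption obtainable by exchanging all claims."""
    return sum(PRICES[k] * claims[k] for k in PRICES)


def lend(claims, amount):
    """Give up `amount` today for the same certain quantity in each future state."""
    riskless_price = PRICES["good"] + PRICES["bad"]  # cost today of 1 apple in every state
    future_units = amount / riskless_price
    return {
        "now": claims["now"] - amount,
        "good": claims["good"] + future_units,
        "bad": claims["bad"] + future_units,
    }


after_lending = lend(endowment, 10.0)
print(wealth(endowment), wealth(after_lending))
```

Lending changes the point chosen but leaves wealth unchanged, which is what it means for both points to lie on the same budget plane.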

Optimal Consumption
It is trivial, but nonetheless correct, to suggest that a consumer should select from among available consumption combinations
the one that he or she likes best. Ultimately, this depends on individual preferences. However, it is possible to argue that for
most people such preferences are likely to have certain characteristics.

It is useful to consider sets of time-state claims among which a consumer exhibits indifference. For example, if a given
consumer considers combinations X, Y and Z equally desirable, we say that he or she is indifferent among them. Graphically,
they lie on the same indifference surface. A given consumer's preferences can then be represented by a set of such surfaces.
Such surfaces will not intersect, since this would involve a contradiction. Assuming non-satiation of preferences, i.e. that each
time-state claim is a good (more is better), such surfaces will not have the appearance of the production possibility surface.
Rather, they are likely to "curve away" from the origin in a diagram with time-state payments on the axes. In most cases, they
will have the appearance of mixing bowls that are "right-side up". Of course, there will be many of them, each representing a
set of alternatives preferred to that on the surface below (closer to the origin). The appearance will thus be something like that
of a set of nested mixing bowls.

The figure below shows one "cut" of this relationship. The horizontal axis plots an amount consumed in the present, while the
vertical axis plots an amount consumed for certain in the future. In such a trade-off, only the timing of consumption is of
relevance, since there is no uncertainty.


Each curve in the figure shows combinations of these two consumption items among which the investor is indifferent. Only a
few such curves are shown, of the very many that could be drawn. The key assumption is that each curve becomes flatter as
one goes from the upper left portion to the lower right portion. Equivalently, the added amount of the good on the X axis that
the consumer will require to give up a unit of the good on the Y axis will be larger, the larger the amount of X and smaller the
amount of Y.

The figure below shows another type of trade-off. Here the amount consumed in the present is assumed to be fixed, with only
the amounts to be consumed under each of the two possible future states to be determined. The vertical axis plots the amount
consumed if the weather is good, while the horizontal axis plots the amount consumed if the weather is bad. Needless to say,
the diagram reflects preferences when these are still contingent claims -- i.e. before the actual weather pattern is known.

Here, too, the consumer is assumed to have indifference curves that get flatter as one moves from the upper-left to the
lower-right. In a sense, this is no different than in the first figure. However, given the nature of the goods in question, there are
further implications.

Consider combinations X and Y. Since they are on the same indifference curve, the consumer considers one as desirable as
the other. For convenience, we denote payments B and G in bad and good weather, respectively, as [B G]. In the figure, X is
thus [20 80] and Y is [50 40]. Now consider combination Z, which pays [35 60] and plots midway between X and Y.
Clearly, the individual would prefer it, since it lies on a higher (better) indifference curve.

Now assume that there are two consumers, each with the preferences shown in the figure. Assume, moreover, that one holds
securities providing combination X, while the other holds securities providing combination Y. Together, their portfolios will
pay [70 120].

Imagine that a clever entrepreneur sets up a mutual fund, suggesting that both consumers "invest" their shares in return for
half the payments received by the fund. Under this arrangement, each will obtain [35 60]. But this is combination Z. The first
consumer has traded X for Z, and is happier. The second consumer has traded Y for Z and is also happier. In fact, the
entrepreneur could take a bit of the action and still make both consumers (who are now investors) happier.

Needless to say, the increased happiness is an ex ante construct in this case. After the fact, one of the two investors will be
better off, and the other worse off, than had the change not been made. But this involves hindsight, which always is
characterized by 20/20 vision.

As the example illustrates, consumers with preferences of the assumed sort will find it desirable to diversify. In an important
sense, they are risk-averse.
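
The pooling arithmetic, and the sense in which both consumers gain, can be illustrated with a concrete preference function. The function below is purely hypothetical: it is one of many with indifference curves of the assumed shape, and its weights were chosen only so that X and Y fall on the same curve.

```python
import math

# Hypothetical preferences: u = a*ln(B) + b*ln(G). The weights are chosen
# so that X = [20 80] and Y = [50 40] lie on the same indifference curve.
WEIGHT_BAD = math.log(2.0)
WEIGHT_GOOD = math.log(2.5)


def utility(combo):
    """Desirability of a combination [B G] under the assumed preferences."""
    bad, good = combo
    return WEIGHT_BAD * math.log(bad) + WEIGHT_GOOD * math.log(good)


x = (20, 80)  # payments [B G] held by the first consumer
y = (50, 40)  # payments [B G] held by the second consumer

# The fund holds both portfolios and pays each investor half the total.
pooled = (x[0] + y[0], x[1] + y[1])
z = (pooled[0] / 2, pooled[1] / 2)

print(pooled, z)
print(utility(x), utility(y), utility(z))
```

Under these assumed preferences, each consumer is indifferent between X and Y but ranks Z above both, so each gains ex ante from pooling.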

The Societal Aggregate Product


Each producer in an economy of the sort we have analyzed will choose a production plan based on available resources,
technological possibilities and current market prices. Similarly, each consumer will choose a consumption plan based on
wealth, preferences and market prices. But what will assure that the total amount of each time-state claim provided by
producers will equal the total amount that consumers collectively wish to consume?

The answer, of course, lies in the role of prices in a market economy.

Given a set of prices, if consumers wish to consume more of one time-state claim than producers wish to produce, there will
be "buying pressure" (the quantity demanded will exceed the quantity supplied), and the price will rise. Conversely, if
consumers wish to consume less of a time-state claim than producers wish to produce, there will be "selling pressure" (the
quantity supplied will exceed the quantity demanded) and the price will fall. Such changes will continue until equilibrium is
achieved -- i.e. the market for each time-state claim clears (quantity demanded equals quantity supplied). The prices that
accomplish this are termed equilibrium prices.

There is no point debating the "causes" of such prices. Both supply (production opportunities) and demand (consumer
preferences) determine them. Equilibrium prices result from the interaction of both forces, as well as the initial distribution of
wealth (including human abilities).

When an equilibrium is achieved, the set of the amounts of time-state claims produced and the set of the amounts consumed
will be the same. This combination can be termed the societal aggregate product.

Assume that in a society the aggregate product in dollars or apples is qtotal:


Present 100
Future if Bad Weather 50
Future if Good Weather 150

Denote the three alternatives as N (now), B (future if weather is bad) and G (future if weather is good). Assume that, as
before, the equilibrium price of 1B is 0.665N, and the equilibrium price of 1G is 0.285N. Thus the price vector p is:

Future Future
Present if Bad if Good
1.000 0.665 0.285

The value of the aggregate product, p*qtotal, is $176.00.

Assume there are two people in this society, each with the same initial wealth ($88.00). If their preferences were the same,
each could select [50; 25; 75] of [N; B; G], and the markets would clear (equivalently, their aggregate preferred holdings
would equal the aggregate product). If their preferences and/or wealths differed, they would generally choose different mixes.
However, for the markets to clear, their aggregate preferred holdings must equal the aggregate product.
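
These figures are easily verified. The sketch below uses the price vector and aggregate product given above; the variable and function names are ours.

```python
# Equilibrium prices in units of N (present goods).
prices = {"N": 1.0, "B": 0.665, "G": 0.285}

# The societal aggregate product.
q_total = {"N": 100, "B": 50, "G": 150}


def value(holdings):
    """Value of a set of time-state claims at the equilibrium prices (p*q)."""
    return sum(prices[k] * holdings[k] for k in prices)


aggregate_value = value(q_total)

# Two people with identical wealth and preferences each hold half.
identical_choice = {"N": 50, "B": 25, "G": 75}
chosen = [identical_choice, identical_choice]

# Market clearing: the chosen holdings must sum to the aggregate product.
markets_clear = all(sum(h[k] for h in chosen) == q_total[k] for k in q_total)

print(aggregate_value, value(identical_choice), markets_clear)
```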

Equivalently, we might represent each individual's consumption choices in terms of the relative values of the holdings.
Assume that individual X's wealth is $59.00 and that she chooses [40; 20; 20]. In this context it seems appropriate to call her
investor X. Her consumption proportions will then be those shown below.
     Holding    Price     Value    Proportion
N       40      1.000     40.00      0.6780
B       20      0.665     13.30      0.2254
G       20      0.285      5.70      0.0966
                         ------      ------
                          59.00      1.0000

Thus Investor X has chosen to consume 67.8% of her wealth now, and to invest the remaining 32.2%. Of the amount
invested, 70% (0.2254/0.3220) will be used to purchase claims that pay off in state B and 30% (0.0966/0.3220) to purchase
claims that pay off in state G (or some combination of other securities that gives the same overall set of exposures).

In this situation, Investor Y will have a wealth of $117.00 and choose the following portfolio:
     Holding    Price     Value    Proportion
N       60      1.000     60.00      0.5128
B       30      0.665     19.95      0.1705
G      130      0.285     37.05      0.3167
                         ------      ------
                         117.00      1.0000

We can also characterize the aggregate product in proportional terms; thus:

     Holding    Price     Value    Proportion
N      100      1.000    100.00      0.5682
B       50      0.665     33.25      0.1889
G      150      0.285     42.75      0.2429
                         ------      ------
                         176.00      1.0000

In this world, Investor X has 33.52% of total wealth (59.00/176.00) while Investor Y has the remaining 66.48%. If we were
to compute a wealth-weighted average of their two consumption mixes (expressed in proportions), we would obtain the
aggregate mix (also expressed in proportions). In this sense, consumers collectively consume the aggregate product of current
and contingent goods.
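
The claim about the wealth-weighted average can be checked directly. The sketch below takes Investor X's holdings and the aggregate product from the tables above and derives Investor Y's holdings as the remainder; all names are illustrative.

```python
prices = {"N": 1.0, "B": 0.665, "G": 0.285}
q_total = {"N": 100, "B": 50, "G": 150}

holdings_x = {"N": 40, "B": 20, "G": 20}
# With only two investors, Y must hold whatever X does not.
holdings_y = {k: q_total[k] - holdings_x[k] for k in q_total}


def wealth(holdings):
    return sum(prices[k] * holdings[k] for k in prices)


def proportions(holdings):
    """Fraction of the holder's wealth placed in each time-state claim."""
    total = wealth(holdings)
    return {k: prices[k] * holdings[k] / total for k in prices}


prop_x = proportions(holdings_x)
prop_y = proportions(holdings_y)
prop_total = proportions(q_total)

# Each investor's share of total wealth.
share_x = wealth(holdings_x) / wealth(q_total)
share_y = wealth(holdings_y) / wealth(q_total)

# Wealth-weighted average of the two consumption mixes.
weighted_mix = {k: share_x * prop_x[k] + share_y * prop_y[k] for k in prices}

print({k: round(v, 4) for k, v in weighted_mix.items()})
print({k: round(v, 4) for k, v in prop_total.items()})
```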

The Market Portfolio


Now consider a related set of calculations in which only future claims are included. The resulting three investment portfolios
would then be characterized as follows:

Investor X
------------
     Holding    Price     Value    Proportion
B       20      0.665     13.30      0.70
G       20      0.285      5.70      0.30
                         ------      -----
                          19.00      1.00

Investor Y
-----------
     Holding    Price     Value    Proportion
B       30      0.665     19.95      0.35
G      130      0.285     37.05      0.65
                         ------      -----
                          57.00      1.00

Society
-------
     Holding    Price     Value    Proportion
B       50      0.665     33.25      0.4375
G      150      0.285     42.75      0.5625
                         ------      -------
                          76.00      1.0000

The aggregate portfolio is often termed the market portfolio.

In terms of invested wealth, Investor X has 25% (19.00/76.00) while Investor Y has 75% (57.00/76.00) of the total. Relative
to the market, Investor X is overweighted in B (70.0%, compared with 43.75%) and underweighted in G (30.0%, compared
with 56.25%), while Investor Y is underweighted in B (35.0%, compared with 43.75%) and overweighted in G (65.0%,
compared with 56.25%). However, their weighted average holdings will equal those of "the market" precisely. Not
surprisingly, if one or more investors chooses to hold less than market proportions of a security, other investors must choose
to hold more than market proportions, and the total value of the first group's underweighting must equal that of the second
group's overweighting.
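
The offsetting nature of the two investors' positions can be sketched as follows, using the holdings and prices from the tables above. The term "active value" (the dollar value of an over- or underweighting) is our own label for this illustration.

```python
prices = {"B": 0.665, "G": 0.285}

portfolios = {
    "X": {"B": 20, "G": 20},
    "Y": {"B": 30, "G": 130},
}

# The market portfolio is the sum of all investors' holdings.
market = {k: sum(p[k] for p in portfolios.values()) for k in prices}


def invested_wealth(holdings):
    return sum(prices[k] * holdings[k] for k in prices)


def weights(holdings):
    """Proportion of invested wealth in each security."""
    total = invested_wealth(holdings)
    return {k: prices[k] * holdings[k] / total for k in prices}


market_weights = weights(market)

# Value of each investor's over- (+) or underweighting (-) of each security.
active_value = {
    name: {
        k: (weights(h)[k] - market_weights[k]) * invested_wealth(h)
        for k in prices
    }
    for name, h in portfolios.items()
}

for name in portfolios:
    print(name, {k: round(v, 4) for k, v in active_value[name].items()})
```

For each security the active values sum to zero across investors: Investor X's overweighting of B is exactly matched in value by Investor Y's underweighting.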

In one sense, this is simply an accounting identity: that which is, must be held. But in a world in which prices have adjusted to
achieve equilibrium, no one holds more or less than desired. Thus investors who underweight or overweight relative to the
market portfolio do so voluntarily (for what must seem to them at the time good reasons).

An investor can choose to hold securities in market proportions. On average, taking wealth into account, investors must do so.
In this sense, investing in the market portfolio represents a default position. The Analyst must thus address two key questions:

Should the overall portfolio diverge from the market?

If so, which types of securities should be underweighted and which ones should be overweighted?

Practical and theoretical problems abound, but it is important to ask the right questions.

10/24/12 Risk Premia

Risk Premia

Contents:
Probabilities
Expected Returns
Risk Premia
The Market Risk Premium
Forward Prices and Probabilities
Atomic Risk Premia
Determinants of the Market Risk Premium
Atomic Risk Premia
Determinants of Atomic Risk Premia
Determinants of Security Risk Premia

Probabilities
Hopefully, the reader will agree that a rather substantial amount was accomplished in prior sections of
this work. Yet he or she may not have recognized a rather remarkable aspect of the analysis to this point:
probability played no direct role whatever! Instead, most of the results followed from the law of one
price.

The importance of this law should not be underestimated. Whenever it is violated, "the same thing" can
be obtained at two different prices. Better yet, it can be purchased for one price and sold at a higher
price. In some instances it may take a clever analyst to determine how to construct "the same thing"
synthetically via combinations of marketable instruments. Nonetheless, the stakes are high enough to
discover how to do so, for the result will be an arbitrage, with concomitant prospects of increased
wealth.

Modern markets populated by investors and investment professionals driven by greed and cupidity are
unlikely to be characterized by large and persistent violations of the law of one price. Thus results that
rely on it are likely to be quite robust. In the real world, characterized as it is by transactions costs,
differences in information, and so on, prices of things that are "almost the same" may not be equal, but
they are nonetheless likely to be close to one another.

Unhappily, there is no such thing as the "law of one probability". Thus Investor X may believe that
there is a 40% probability that the weather will be good (and hence a 60% chance that it will be bad),
while Investor Y believes that there is a 50% probability that it will be good (and hence a 50%
probability that it will be bad). Their beliefs will undoubtedly influence their attitudes towards the
associated time-state claims, and hence the equilibrium prices for such claims. But markets can clear
without investors reaching any agreement concerning such probabilities.

Despite this, much of modern financial theory is built around the notion that there is a single set of
probabilities for various outcomes. In some cases it is simply assumed that all investors agree
concerning such probabilities. In others, the probabilities utilized for calculations are assumed to be
those of a "consensus of well-informed investors". Often the latter characterization represents a rationale
for a model built on the foundations of the former (more extreme) assumption.
www.stanford.edu/~wfsharpe/mia/prb/mia_prb2.htm 1/9

To proceed, we join the theorists who make such heroic assumptions. In particular, we assume that all
individuals in our simple economy agree on the probabilities associated with future states of the world.
For our initial examples, we assume that they believe our two states are equally probable. Representing
probabilities by the vector prob:

Bad Good
Weather Weather
0.50 0.50

Not surprisingly, as long as the states of the world that have been enumerated are mutually exclusive and
exhaustive (i.e., one and only one will occur), the sum of such probabilities must equal 1.0. Other than
this, little can be said ex cathedra, since the concept applied here is that of subjective probability
(people's beliefs about the relative likelihoods of various events), not that of objective probability (with
its accompanying notion that there is a "true" set of such likelihoods).

Whatever the source of such probabilities, we assume for now that they exist and that all market
participants agree both about them and about the results of computations involving them -- to which we
now turn.

Expected Returns
The expected value of an uncertain variable is obtained by weighting every possible outcome by the
associated probability. Equivalently, it is a probability-weighted average of the possibilities.

The one-period return from a security is the change in its value plus any distributions received at the end
of the period, all divided by the initial value. Consider the Stock that we have been following that will
increase in value by 26% if the weather is good but decrease by 4% if the weather is bad. Its return
vector r is:

Good Weather    0.26
Bad Weather    -0.04

The expected return on the Stock will thus equal prob*r, or 0.11 (11.0 percent).

We can, of course, compute expected returns for a number of securities in one stage. Assume that a
Bond offers a guaranteed return of 5% and that the Market portfolio consists of 60% Stocks and 40%
Bonds. Consider the matrix R of returns for the Stock, the Market and the Bond shown below:

                Stock    Market    Bond
Bad Weather     -0.04    -0.004    0.05
Good Weather     0.26     0.176    0.05

As is so often the case in life, as one goes across the columns, good news (a better return in bad times)
accompanies bad news (a poorer return in good times).

The expected returns are e = prob*R:


Stock Market Bond
0.110 0.086 0.050
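
The products prob*r and prob*R can be checked with a short sketch. It is written in plain Python rather than the matrix notation used in the text; the helper name `expected` and the dictionary layout are illustrative assumptions, not part of the original.

```python
# States are ordered [Bad Weather, Good Weather], matching prob and the rows of R.
prob = [0.50, 0.50]

# Returns by state for the Stock and the Bond, taken from the text.
stock = [-0.04, 0.26]
bond = [0.05, 0.05]

# The Market portfolio holds 60% Stocks and 40% Bonds.
market = [0.6 * s + 0.4 * b for s, b in zip(stock, bond)]


def expected(prob, outcomes):
    """Probability-weighted average of the outcomes (the product prob*r)."""
    return sum(p * x for p, x in zip(prob, outcomes))


# One expected return per column of R.
e = {
    "Stock": expected(prob, stock),
    "Market": expected(prob, market),
    "Bond": expected(prob, bond),
}

for name, value in e.items():
    print(f"{name:<7}{value:.3f}")
```

The printed values should agree with the vector e shown above.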

Not surprisingly, the expected return of the riskless bond is its certain return. For each of the other
choices, the expected return is neither the highest possible value nor the lowest. Rather, it is a "middle
value". Note that for the Market and the Stock, under no circumstances can the actual return equal the
expected value. No matter what, the actual value will deviate from the expectation. Moreover, in this
case the magnitude of the deviation from the expected return is larger for the Stock than for the Market.
In this sense, the Stock is riskier than the Market, which is in turn riskier than the Bond.

Risk Premia
The difference between the expected return on a security or portfolio and the "riskless rate of interest"
(the certain return on a riskless security) is often termed its risk premium. Underlying the terminology is
the notion that there should be a premium (higher expected return) for bearing risk. As we will see,
however, there is no reason why such premia should be associated with all types of risk.

An equivalent definition of a risk premium is: the expected excess return on a security or portfolio,
where excess return is the difference between an actual return and that of a riskless security.

The Market Risk Premium


In our example, the expected return on the Market portfolio is 8.60%, and the market risk premium is
3.60% (8.60% - 5.00%). The associated risk can be measured by the likely divergence between the
actual return and the expected return. If the weather is Good, the difference will be 9.00% (17.6% -
8.6%). If the weather is bad, the difference will be -9.00% (-0.4% - 8.6%). Since the two outcomes are
equally likely, the absolute value of the divergence will be 9.00%.

The market portfolio provides an excess return of 12.60% (3.60 + 9.00) if the weather is good and an
excess return of -5.40% (3.60 - 9.00) if the weather is bad. The potential gain over the riskless rate
(12.60%) is thus 2.33 times as large as the potential loss relative to the riskless rate (5.40%). This may
seem particularly generous, given the assumption that the two situations are equally likely. However,
such a relationship is not uncommon in actual markets.
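
The arithmetic in the two paragraphs above can be reproduced with the following sketch. The variable names are illustrative assumptions; the numbers come from the example.

```python
riskless = 0.05
prob = {"Good": 0.50, "Bad": 0.50}
market = {"Good": 0.176, "Bad": -0.004}

# Expected return and risk premium of the Market portfolio.
expected_return = sum(prob[s] * market[s] for s in prob)
risk_premium = expected_return - riskless

# Divergence of the actual return from the expected return in each state.
divergence = {s: market[s] - expected_return for s in prob}

# Excess return over the riskless rate in each state.
excess = {s: market[s] - riskless for s in prob}

# Potential gain over the riskless rate relative to the potential loss.
gain_to_loss = excess["Good"] / -excess["Bad"]

print(f"expected return {expected_return:.3f}, risk premium {risk_premium:.3f}")
print(f"excess returns  {excess['Good']:.3f} (Good), {excess['Bad']:.3f} (Bad)")
print(f"gain/loss ratio {gain_to_loss:.2f}")
```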

Since Stocks and Bonds are issued by firms, the Market Portfolio represents a package of claims on all
the productive assets of such firms. Equivalently, it is the set of claims that would be held if only equity
financing had been utilized by firms.

Given the use of Bonds and Stocks as financing vehicles by a firm, an investor can create an "all-
equity" version of the firm synthetically, by holding a portfolio with its bonds and stocks in market value
proportions. Conversely, if a firm is financed solely by equity, an investor who would have preferred the
payment pattern associated with a "levered" stock could create the latter synthetically by borrowing
money, then using both the borrowed money and his or her own funds to buy stock in the unlevered
firm. These relationships form the basis for the original Modigliani-Miller theorem: in a world of the sort
we are analyzing, "home-made leverage" (borrowing by the investor) can serve as a substitute for
"firm-made leverage" (borrowing by the firm); hence, corporate financing decisions do not matter.

In our example, one can, of course, create pure securities synthetically. The security price vector ps and
payoff matrix Q can be written as:
ps:
Bond Stock
1.00 1.00

Q:
Bond Stock
Good 1.05 1.26
Bad 1.05 0.96

The state prices, given by p = ps*inv(Q) are:


Good Bad
0.2857 0.6667

giving value-relatives vr = 1 ./ p:

Good Bad
3.50 1.50

and returns (vr-1):


Good Bad
2.50 0.50

A security that promises to pay off only if the weather is good will provide a return equal to 250% of the
amount invested if the weather is good, and -100% if it is not. A security that promises to pay off only if
the weather is bad will return 50% if the weather is bad and -100% if it is not. The return matrix (again
labeled Q, although it now holds returns rather than payoffs) is thus:
Good Bad
Security Security
Good 2.50 -1.00
Bad -1.00 0.50

and the expected return vector e = prob*Q is:

Good Bad
Security Security
0.75 -0.25

The "Good Security" has an expected return of 75%, while the "Bad Security" has an expected return
of minus 25%.

Clearly, the Good Security has a very high risk premium: 75% - 5%, or 70%. This might not seem too
surprising, given the extreme risk involved. However, the Bad Security actually has a negative risk
premium (i.e. a "risk discount"): -25% - 5% = -30%. Yet it too is very risky.
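
The chain from security prices to pure-security risk premia can be retraced in a few lines. This is a minimal sketch in plain Python: the 2x2 system p*Q = ps is solved with the explicit inverse of a 2x2 matrix instead of calling inv(Q), and all variable names other than ps, Q, p, vr and prob are assumptions made for illustration.

```python
# Security prices (Bond, Stock) and payoffs; rows of Q are states (Good, Bad).
ps = [1.00, 1.00]
Q = [[1.05, 1.26],
     [1.05, 0.96]]

# Solve p*Q = ps for the state prices p = [p_good, p_bad].
det = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
p_good = (ps[0] * Q[1][1] - ps[1] * Q[1][0]) / det
p_bad = (ps[1] * Q[0][0] - ps[0] * Q[0][1]) / det
p = [p_good, p_bad]

# Value-relatives and returns of each pure security in its own state.
vr = [1 / price for price in p]
ret = [v - 1 for v in vr]

# Each pure security loses everything (-100%) in the other state.
prob = [0.50, 0.50]          # [Good, Bad]
good_security = [ret[0], -1.0]
bad_security = [-1.0, ret[1]]

e_good = sum(pr * r for pr, r in zip(prob, good_security))
e_bad = sum(pr * r for pr, r in zip(prob, bad_security))

riskless = 0.05
premium_good = e_good - riskless
premium_bad = e_bad - riskless

print(f"state prices     {p[0]:.4f} {p[1]:.4f}")
print(f"expected returns {e_good:.2f} {e_bad:.2f}")
print(f"risk premia      {premium_good:.2f} {premium_bad:.2f}")
```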

Obviously, this world is not one in which just any kind of risk is rewarded with a risk premium.

What is going on here?

Forward Prices and Probabilities


Part of the answer to the question can be found by comparing the forward prices of pure securities with
the probabilities that the associated states will occur. Recall that the forward price for a pure security is
the amount that one must agree today to pay at a future date. As we argued earlier, arbitrage will ensure
that this is simply the current price for the contingent claim plus interest for the period in question. But
this implies that the ratio of a current atomic price to the sum of such prices will be the same as the ratio
of the corresponding forward price to the sum of such prices.

As always, the sum of the forward prices is 1.0. As indicated earlier, this must be the case, since by
purchasing one unit of every time-state claim, one is guaranteed a payment of $1 at the future date.
Clearly the cost of obtaining this, if paid at the future date, must be $1. Thus forward prices, like
probabilities, sum to 1.0.

Since the sum of a set of forward prices for a given date must equal 1.0, it follows that the atomic
forward price vector f will equal p./sum(p). In this case:
Good Bad
0.30 0.70
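
As a quick check, the normalization f = p./sum(p) can be sketched in plain Python, starting from the rounded state prices shown earlier. The name `implied_rate` is an assumption; it relies on the arbitrage argument above, under which each forward price is the current price plus interest.

```python
# Rounded state prices for [Good, Bad] from the example.
p = [0.2857, 0.6667]

# Forward prices: each state price divided by the sum of the state prices.
total = sum(p)
f = [price / total for price in p]

# Because the forward prices sum to 1.0, the sum of the state prices is
# the present value of a certain $1, i.e. 1/(1+i).
implied_rate = 1 / total - 1

print([round(x, 2) for x in f])
print(round(implied_rate, 3))
```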

Now, compare the forward prices for the states with their probabilities:

Good Bad
Forward Price 0.30 0.70
Probability 0.50 0.50

The forward price for a state need not equal its probability. In our example, one such price (Good) is
lower, and the other (Bad) is higher than the probability that the state will actually occur. Note,
however, that since both sums must equal 1.0, if any price is below its associated probability, at least
one other must be above its associated probability.

In this example, equilibrium has been achieved when prices are such that a payment of $1 if the weather
is good is "cheap" -- it only costs $0.30 (forward) to obtain a payment with an expected value of $0.50
(0.50*$1). On the other hand, a payment of $1 if the weather is bad is "expensive" -- it costs $0.70
(forward) to obtain a payment with an expected value of $0.50 (0.50*$1).

Why are people willing to pay these prices? The answer is not too surprising. Other things equal, one
would prefer to have goods and services when there are not as many available. Thus payments under
bad conditions are more highly prized than those under good conditions. In this case, the society
produces less when the weather is Bad than when it is Good. Not all contingent claims are equal. The
fewer there are, the more valuable another one will be.

This aspect of our example captures an important feature of most economies. To see this, consider an
alternative scenario in which each of the forward prices of the securities is $0.50, and thus equal to the
probability of the associated state.

As before, the riskless rate of interest would be 5.0%. What would be the expected return on the Market
Portfolio? Assume that it still offers payments of $1.26 if the weather is Good and $0.96 if the weather
is bad. The state prices would each equal 0.50/1.05, or 0.4762. We thus have:
p:
Good Bad
0.4762 0.4762

q:
Good 1.26
Bad 0.96

price = p*q:
1.0571

prob:
Good Bad
0.50 0.50

expected value = prob*q:


1.11

expected return = (expected future value/price) - 1:


0.05

Thus the expected return on the market would equal 5% -- the riskless rate of interest. There would be
no risk premium at all!
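
The alternative scenario can be reproduced with the sketch below, assuming (as in the text) that each forward price equals the probability of its state. The helper `dot` and the variable names are illustrative.

```python
def dot(x, y):
    """Inner product of two equal-length lists (p*q in the text's notation)."""
    return sum(a * b for a, b in zip(x, y))


i = 0.05                      # riskless rate of interest
prob = [0.50, 0.50]           # [Good, Bad]
f = list(prob)                # forward prices set equal to the probabilities

# State prices are forward prices discounted at the riskless rate.
p = [fj / (1 + i) for fj in f]

q = [1.26, 0.96]              # payments assumed for the Market Portfolio

price = dot(p, q)
expected_value = dot(prob, q)
expected_return = expected_value / price - 1

print(f"price {price:.4f}, expected value {expected_value:.2f}, "
      f"expected return {expected_return:.2f}")
```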

In fact, in a world of this sort, there would be no risk premium on any security -- every single one would
have an expected return equal to the riskless rate, as we will show.

Determinants of the Market Risk Premium

Consider the vector f of forward prices. Given a cash flow vector c, we may calculate an associated
forward value fv:
fv = f*c

The forward value of a set of (contingent) cash flows is an amount agreed upon in the present that must
be paid for the set at a specified date in the future. In our case, there is only one future period, so the
payment date coincides with the date at which cash flows (if any) will be received.

The expected value of c is calculated by multiplying each possibility by its probability, then summing.
As before, assume that the probabilities of the states are included in vector prob. Then the expected
value ev will be given by:
ev = prob*c

We know that arbitrage will ensure that the present value pv of a set of claims will equal its forward
value discounted at the riskless rate of interest i. Thus:
pv = fv/(1+i)

The expected value relative evr for an investment is its expected value divided by its present value
(price):

evr = ev/pv

But this can be stated in terms of the forward value, using the arbitrage relationship between present and
forward values:

evr = (ev/fv)*(1+i)

The expected value relative for a riskless security is, of course, 1+i. Thus the expected value relative for
an investment will be greater than that for a riskless security if ev/fv is greater than one. In such cases the
investment will provide a risk premium. If ev/fv is less than one, the expected value relative for the
investment will be less than that for a riskless security, and the investment will "provide" a risk discount.
An investment for which the expected value is equal to the forward value will offer an expected value
relative equal to that of a riskless security and will provide neither a risk premium nor a risk discount.

This shows why the risk premium will be zero for every security or portfolio if the forward price for
each state equals the associated probability (i.e. f equals prob). In such a world, the expected value (ev)
will equal the forward value (fv) in every case, and every expected value relative will equal 1+i. Hence
the expected return on every security will equal the riskless rate of interest and there will be no risk
premia.
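
The relationship evr = (ev/fv)*(1+i) lends itself to a small function. The sketch below is illustrative only: the function name `risk_premium` and the choice of the Stock's payoffs as the cash flow vector are assumptions, while the forward prices and probabilities are those of the two-state example.

```python
def dot(x, y):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(x, y))


def risk_premium(prob, f, c, i):
    """Risk premium of cash flows c given probabilities, forward prices and rate i."""
    ev = dot(prob, c)              # expected value
    fv = dot(f, c)                 # forward value
    evr = (ev / fv) * (1 + i)      # expected value relative
    return evr - (1 + i)


i = 0.05
prob = [0.50, 0.50]                # [Good, Bad]
stock = [1.26, 0.96]               # Stock payoffs per $1 invested

# With the example's forward prices the Stock earns a premium ...
premium_example = risk_premium(prob, [0.30, 0.70], stock, i)

# ... but if f equals prob, ev equals fv and the premium vanishes.
premium_if_f_equals_prob = risk_premium(prob, prob, stock, i)

print(round(premium_example, 4), round(premium_if_f_equals_prob, 4))
```

The first value corresponds to the Stock's 11% expected return less the 5% riskless rate.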

For there to be a market risk premium, some atomic forward prices must differ from the probabilities of
the associated states.

Atomic Risk Premia


The expected return on an investment is:

(ev/pv) - 1

while its risk premium is:


((ev/pv) - 1) - i
or

(ev/pv) - (1+i)

where i is the riskless rate of interest.

The arbitrage relationship between a present value and an associated forward value ensures that the risk
premium for an investment will also equal:

(ev/fv)*(1+i) - (1+i)
or
((ev/fv) - 1) * (1+i)

As shown earlier, an investment will have a risk premium if ev/fv is greater than 1.0, a risk discount if
ev/fv is less than 1.0, and neither if ev/fv equals 1.0. Clearly, it is crucial to understand the determinants
of differences in ev/fv across atomic securities in order to understand the nature of risk premia.

For an atomic security, the expected value will equal the probability of the associated state. Thus the
vector of probabilities, prob, is the vector of expected values. Moreover, the vector of atomic forward
prices, f, is the vector of forward values. Thus the vector of ev/fv values can be computed directly via the
formula:

prob ./ f

Note that since both vectors must sum to 1.0, either all the ratios will equal 1.0, or some will be below
1.0 and others above it.
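
Applied to the two-state weather example, the ratio prob ./ f and the formula ((ev/fv) - 1) * (1+i) give the atomic risk premia directly. The sketch below uses plain Python lists in place of element-wise vector division; the names `ratios` and `premia` are assumptions.

```python
i = 0.05
prob = [0.50, 0.50]    # [Good, Bad]
f = [0.30, 0.70]       # atomic forward prices

# ev/fv for each atomic security (prob ./ f in the text's notation).
ratios = [pr / fj for pr, fj in zip(prob, f)]

# Risk premium of each atomic security: ((ev/fv) - 1) * (1 + i).
premia = [(r - 1) * (1 + i) for r in ratios]

print([round(r, 4) for r in ratios])
print([round(x, 2) for x in premia])
```

The premia should match the 70% premium and 30% discount found earlier for the Good and Bad securities.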

Determinants of Atomic Risk Premia


Why should one atomic risk premium differ from another? And if there are differences, what might
explain them? As in other realms of economics, we would expect market prices to adjust until there is
good news to go with every piece of bad news. This suggests that higher risk premia (good news)
should be associated with states in which additional goods and services are of less value (bad news). But
when are additional amounts of consumption of less value? When the amount available for consumption
is large. Hence, states of plenty should have high risk premia (ev/fv, or prob/f values) while states of
scarcity have low risk premia (ev/fv or prob/f values).

In a one-good economy, the notion of aggregate output is unambiguous. In such a case, if aggregate
output in state s1 exceeds that in s2, then the ratio of probability to forward price will be higher for state
s1 than for state s2. If the aggregate output in two states is the same, then the ratio of probability to
forward price should be the same for the two states. We deal later with cases involving multiple goods.

Consider the following example:


c prob' f' f'-prob'
40 0.25 0.35 0.10
50 0.25 0.30 0.05
60 0.25 0.20 -0.05
70 0.25 0.15 -0.10
----- ---- ----
1.00 1.00 0.00

Note that the transposes of the last three vectors have been shown for convenience.

The expected value for the market portfolio (c) is given by prob*c; it is 55.0. The forward value for the
market portfolio is given by f*c; it is 51.5. The market portfolio thus has an expected value relative
equal to (55.0/51.5)*(1+i). Assume that the interest rate is .06 (6%). Then the market portfolio has an
expected value relative of 1.132 and hence a risk premium of 0.132-0.06, or 0.072 (7.2%).
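
The four-state example can be verified with the following sketch. The variable names are illustrative; the data are the columns c, prob' and f' of the table.

```python
def dot(x, y):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(x, y))


c = [40, 50, 60, 70]                 # aggregate output by state
prob = [0.25, 0.25, 0.25, 0.25]
f = [0.35, 0.30, 0.20, 0.15]
i = 0.06

ev = dot(prob, c)                    # expected value of the market portfolio
fv = dot(f, c)                       # forward value of the market portfolio
evr = (ev / fv) * (1 + i)            # expected value relative
risk_premium = evr - (1 + i)

# prob/f ratios by state; they rise with aggregate output in this example.
ratios = [pr / fj for pr, fj in zip(prob, f)]

print(ev, fv, round(evr, 3), round(risk_premium, 3))
print([round(r, 3) for r in ratios])
```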

In this case, prob*c is 55.0 and f*c is 51.5. Thus (f-prob)*c must be -3.5. Moving from vector prob to
vector f as a multiplier of c lowered the value of the product, and hence implied the presence of a risk
premium.

Consider the results of multiplying a vector x times c, where x can equal vector prob, vector f, or
something in between. We wish to move from x=prob to x=f by a series of small steps. To see how this
might be done, consider the final vector in the table: f'-prob', each entry of which equals the sum of all
the steps to be taken. Since the entries in this vector must sum to zero, some will be positive and others
negative. Moreover, the sum of the positive numbers must equal the absolute value of the sum of the negative numbers.
Finally, given our assumptions about atomic risk premia and the ordering of states by increasing
aggregate consumption, the positive numbers will all precede the negative numbers.

Given these relationships, we can move from x=prob to x=f by a series of steps, each of which involves
adding a fixed amount (e.g. 0.05) to an entry in x for one state of the world and subtracting an equal
amount from one or more entries for states of the world in which aggregate output is larger. But each
such step must lower the value of x*c, since the amount chosen is first multiplied by one level of output,
and then by one or more larger levels of output, with the first amount added to the total product and the
second (larger) amount subtracted from it.
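
The stepwise argument can be traced numerically. The sketch below uses one possible sequence of steps of 0.05 each; the particular sequence, like the variable names, is an assumption chosen for illustration, and other sequences that respect the ordering of states would serve equally well.

```python
def dot(x, y):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(x, y))


c = [40, 50, 60, 70]                 # states ordered by increasing output
prob = [0.25, 0.25, 0.25, 0.25]
f = [0.35, 0.30, 0.20, 0.15]

# Each step adds `amount` to a low-output state and subtracts it from a
# higher-output state: (low_state_index, high_state_index, amount).
steps = [(0, 3, 0.05), (0, 3, 0.05), (1, 2, 0.05)]

x = list(prob)
products = [dot(x, c)]               # value of x*c before any step
for low, high, amount in steps:
    x[low] += amount
    x[high] -= amount
    products.append(dot(x, c))       # value of x*c after this step

print([round(v, 2) for v in products])
print([round(v, 2) for v in x])
```

Every entry in `products` should be smaller than the one before it, and the final x should coincide with f.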

Since each such step will lower the product of the two vectors, the total effect of all such steps must
lower the product. This gives the key result that:

If atomic risk premia increase with aggregate output,


(1) the expected value for the market portfolio will exceed its forward price, and
(2) the expected return on the market portfolio will exceed the riskless rate of interest.

In simpler terms:

If atomic risk premia increase with aggregate output, there will be a market risk premium.

While there may be other explanations for such a risk premium, the diminishing value of added
consumption as more consumption becomes available appears to be by far the most plausible cause.
Indeed, it is tempting to conclude that:

If there is a market risk premium, atomic risk premia increase with aggregate output.

We hereby yield to that temptation.

Over the long run, portfolios comprising large numbers of risky securities tend to provide higher returns
than do short-term riskless deposits. This is consistent with the existence of a market risk premium in the
sense that we have used the term. We consider this strong presumptive evidence that on average, people
consider additional consumption more valuable in states of scarcity than in states of plenty.

Determinants of Security Risk Premia


Not all atomic securities offer risk premia (ev/fv>1). Some offer risk discounts (ev/fv<1). But all are
risky. Thus there is not a simple relationship between risk and expected return. This is true not only for
atomic securities, but also for more traditional ones.

The key issue in determining the presence and magnitude of a risk premium is the distribution of the
value of a security's cash flows across various states. If more of its value comes from states with high
ev/fv (probability / forward price) ratios than from those with low ev/fv (probability / forward price)
ratios, the security will generally provide a risk premium. But since states with high ev/fv ratios are
generally associated with times of abundance, this is equivalent to saying that an investment which pays
more in good times and less in bad times will generally offer a risk premium. Indeed, the greater the
extent to which an investment is a "fair weather friend" (bad news), the greater will be its expected
return (good news).

This is an important corollary of the general theorem that in a competitive economic market, bad news is
likely to accompany good news. Investors demand higher expected returns from securities that are likely
to fail them when they most need help.

To summarize, some risky securities will provide a risk premium. However, the premium will be
associated with the risk of doing badly when times are bad. There is no reason to expect higher
returns to be associated with just any type of risk -- only with "bad times risk".

10/24/12 Consumption and Investment Choices

Consumption and Investment Choices

Contents:
Consumer Utility
Consumer Utility Functions
The Expected Utility Maxim
Risk Tolerance
Maximizing Expected Consumer Utility
Representative Investors
Market Efficiency
Betting and Tailoring
Active and Passive Management

Consumer Utility
Normative financial economics concerns optimal decisions made by individuals, firms and/or
institutions. In an important sense, much of the subject matter of investments deals with optimal choices
of investment and consumption.

Thus far we have assumed that the investor/consumer makes optimal choices from among alternative
combinations of present and contingent future consumption opportunities. Initially, we suggested that
the individual picks the combination that he or she likes best. This hardly offers much help. Imagine an
Analyst saying to a client: "do what's best".

Our second characterization of investor behavior utilized the concept of indifference curves or, more
generally, indifference surfaces. The conclusion was somewhat more elegant, although hardly more
useful: pick the combination from the opportunity line (or plane or hyperplane) on the highest (best)
indifference surface.

At this point, we have provided little help for the Analyst seeking to offer direction to individuals or
institutions seeking advice on either the optimal amount to be invested or the particular investments that
should be undertaken.

As we will see, an Analyst can provide useful advice concerning such decisions. Individuals may differ
in preferences, circumstances, constraints and predictions. A rather rich body of analytic methods can be
invoked to help take such differences into account. Such techniques provide a core set of normative
methods for investment management.

Here we will deal with three aspects that may lead informed individuals to adopt different strategies:
differences in preferences, differences in wealth and differences in predictions. We leave for later the
analysis of differences in constraints, other circumstances, and so on.

A formal construct that helps to highlight the differences among utility-based, wealth-based and
prediction-based investment decisions uses the concept of consumer utility and the assumption that the
goal of the consumer is to maximize the expected value of such utility. In this scheme, (1) consumer
utility summarizes an individual's preferences, (2) possible combinations of consumption are related to
wealth, and (3) the probabilities utilized to compute expected utility can be considered predictions. In
principle, one can thus determine the extent to which investment decisions differ due to differences in
predictions as opposed to differences in preferences or differences in wealth. In practice, such a neat
taxonomy is difficult to attain. Nonetheless, every investment decision should be scrutinized in an
attempt to determine (as best possible) the role that each such type of difference plays.

Consumer Utility Functions


Consider an individual trying to select a combination of apples today, apples in the future if the weather
is good, and apples in the future if the weather is bad. We represent a consumption plan as a vector c in
which the elements are the levels of consumption in every time and state; in this case: [consumption
now, consumption later if the weather is good, consumption later if the weather is bad].

Consider a particular consumption plan, for example, [80,100,50]. Note that the consumer will, in fact,
attain only one of two mutually exclusive sets of consumption:
if the weather is good:
Now: 80
Future: 100

if the weather is bad:
Now: 80
Future: 50

It is generally assumed that consumer utility functions are such that all types of consumption are goods
(i.e., more is preferred to less, other things equal). It is generally also assumed that in such functions,
marginal utility (the added utility from one added unit) decreases as the number of units increases (i.e.
there are decreasing returns to scale in consumption).

In some cases the utility associated with a given amount of future consumption will differ, depending on
the state of the world in which the consumption takes place; for example, 5 apples might give more
satisfaction on a rainy day than on a sunny one. In such cases, we say that the consumer has state-
dependent utility.

The utility associated with additional consumption in one time period may depend on the amounts
consumed in prior time periods. Consumers may fall into habits so that both the absolute amount of
consumption and any change from previous levels may be of importance.

In many cases Analysts will take neither of these possible complications into account. Instead, they
assume that utility is separable and additive -- that is, that there is a utility associated with each time
period and state of the world and that the total utility is simply the sum of these sources of utility.
Moreover, they assume that the utility associated with each time and state is of a particularly simple
form.

Vector c, which represents a consumption plan, includes entries for at least two time periods; in our
case: now and later. Let tp be a vector of the same length with coefficients indicating the
consumer/investor's time preference. For example:
now future future
(good weather) (bad weather)
1.0 0.95 0.95

This indicates that a given amount of consumption in the future provides 0.95 times as much utility as
the same amount of consumption now. If there were a third time period, the entries for that period in
vector tp would typically be smaller than those for the second time period, and so on. One possibility
would be to make the entries for the second period some constant d, those for the third period d^2, those
for the fourth period d^3, etc. This would allow the investor's time preference to be summarized with
one number (d). In any event, all entries in vector tp that refer to a given time period will be the same.

Regarding utility itself, Analysts often make another restrictive assumption. They assume that the utility
associated with a given time and state can be written as:
tp(ts)*u(c(ts))

where ts is the time and state, tp(ts) is the time preference parameter for the associated time, c(ts) is the
consumption planned for the time and state, and u(c(ts)) is the utility associated with that consumption,
not taking into account the time at which it is received. This rules out, for example, the possibility that an
Investor may have one attitude concerning risk in period 2 and a different attitude concerning risk in
period 3.

Only one task remains -- to specify the utility function u(). Possible forms are discussed in subsequent
sections. Suffice it to say that at the very least, utility should increase with consumption, but at a
decreasing rate. Equivalently, the marginal utility (change in utility per unit change in consumption)
should decrease as consumption increases.

The Expected Utility Maxim


We have associated an amount of consumer utility with each possible level of consumption. However,
in fact some of the levels will not be realized. To take this into account, we multiply each utility level by
the probability that the consumption in question will be attained. The sum of all such values is the
expected utility of the consumption plan. We assume that the consumer's objective is to select from
among all feasible plans, the one that provides the maximum expected utility. This principle is known as the
expected utility maxim.

Let prob be a vector of the same length as c with probabilities assigned to each time and state by the
Investor (consumer) and/or the Analyst. In this vector, the sum of all entries for a given time period will
equal one. For example:
now future future
(good weather) (bad weather)
1.0 0.50 0.50

If there were entries for a third period, the sum of those entries would also be 1.0, and so on.

We are now in a position to write a formula for the expected utility eu of a consumption plan c. It is:
eu = sum (prob.*tp.*u(c));

Given a vector of atomic security prices p, the optimal consumption/investment problem can then be
stated as:
Select: c
to Maximize: eu
Subject to: p*c <= W

where W is the consumer/investor's wealth.
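
To make the objective and the constraint concrete, the sketch below evaluates eu and p*c for the plan [80,100,50] discussed earlier. Several inputs are assumptions: the power utility u(c) = c^k with k = 0.5 is borrowed from the form introduced in later sections, the prices are those used in a later example, and the wealth W = 150 is purely hypothetical.

```python
def expected_utility(c, prob, tp, k):
    """eu = sum(prob .* tp .* u(c)), with the assumed utility u(c) = c**k."""
    return sum(pr * t * x ** k for pr, t, x in zip(prob, tp, c))


def cost(c, p):
    """Cost p*c of a consumption plan at atomic prices p."""
    return sum(price * x for price, x in zip(p, c))


# Entries are ordered [now, future (good weather), future (bad weather)].
c = [80, 100, 50]
prob = [1.0, 0.50, 0.50]
tp = [1.0, 0.95, 0.95]
p = [1.00, 0.285, 0.665]   # atomic prices (taken from a later example)
W = 150.0                  # hypothetical wealth
k = 0.5                    # hypothetical curvature parameter

eu = expected_utility(c, prob, tp, k)
feasible = cost(c, p) <= W

print(round(eu, 3), round(cost(c, p), 2), feasible)
```

A search procedure of the kind mentioned below would vary c, keep only feasible plans, and retain the one with the largest eu.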

The decision variables are the levels of planned consumption. The optimization problem is to select
values for these variables that maximize the objective function without violating the inequality
constraint. It can be solved either through a "search procedure" or, in some cases, by directly finding the
values that satisfy a set of conditions that must obtain when the optimum solution for such a problem is
found. Microsoft's Excel spreadsheet includes a solver procedure that employs an intelligent search
method to solve problems of this sort. MATLAB's optimization toolbox also provides functions that can
be used for the purpose.

Risk Tolerance
Given the expected utility maxim, we can see more directly the relationship between the curvature of the
utility function and an Investor's tolerance for risk. To do so, we utilize a simple function of the
following form:
u(c) = c.^k

where k is a positive constant between 0 and 1. The greater the value of k, the less curved will be the
function; if k were to equal 1.0, the curve would become a straight line.

Consider an investment that offers a probability of 0.50 that consumption will equal 80 apples (Good)
and a probability of 0.50 that it will equal 20 apples (Bad). The table below shows the utility associated
with each outcome for an investor with k=0.375. The expected utility -- the probability-weighted
average of these two utility values -- is shown as well.
k = 0.375

Consumption Utility
Good 80 5.172
Bad 20 3.075
--------
Expected Utility 4.124

Certainty Equivalent 43.73

The expected utility maxim assumes that an investor will be indifferent between two investments if they
offer the same expected utility. But the expected utility of a certain investment will equal its utility. The
certainty-equivalent for a risky investment can be defined as an amount to be received with certainty that
the investor would just be willing to accept instead of the risky investment. Here, we seek the value of c
for which:
c.^k = 4.124

The answer, also shown in the table above, is 43.73 apples. Thus although the investment offers an
expected consumption of 50 apples (0.50*20 + 0.50*80), this investor considers it only as desirable as
43.73 apples for certain.

Now consider the investor with a utility function for which k=0.500. The corresponding calculations are
shown in the table below. She considers the investment as desirable as 45.00 apples for certain. In a
sense she "likes it better" than the first investor. If one had to pay 44 apples to obtain the investment, the
first investor would pass the opportunity by, while the second one would seize it with pleasure.
k = 0.500

Consumption Utility
Good 80 8.944
Bad 20 4.472
--------
Expected Utility 6.708

Certainty Equivalent 45.00

The greater the value of k in a utility function of the type we have posited, the greater will be the
certainty equivalent for a given risky investment. Hence we can say that the greater the value of k, the
greater the investor's tolerance for risk and the smaller his or her aversion to risk. More generally, the
smaller the curvature of the utility function, the greater is an investor's tolerance for risk.
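
Both tables can be reproduced by inverting the utility function: if u(c) = c^k, the certainty equivalent is the expected utility raised to the power 1/k. The sketch below is illustrative; the function names are assumptions.

```python
def expected_utility(prob, consumption, k):
    """Probability-weighted average of u(c) = c**k over the outcomes."""
    return sum(p * c ** k for p, c in zip(prob, consumption))


def certainty_equivalent(prob, consumption, k):
    """Certain consumption whose utility equals the expected utility."""
    return expected_utility(prob, consumption, k) ** (1 / k)


prob = [0.50, 0.50]
consumption = [80, 20]     # apples in the Good and Bad outcomes

for k in (0.375, 0.500):
    eu = expected_utility(prob, consumption, k)
    ce = certainty_equivalent(prob, consumption, k)
    print(f"k = {k:.3f}: expected utility {eu:.3f}, certainty equivalent {ce:.2f}")
```

Raising k toward 1.0 moves the certainty equivalent toward the expected consumption of 50 apples.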

Maximizing Expected Consumer Utility


Given our simplifying assumptions, the expected utility of a consumption plan will depend on the
consumer's time-preference, risk tolerance, and assessment of the probabilities of the alternative states of
the world. In our example, there are only two such states. Since the probability of bad weather will
equal one minus the probability of good weather, we may focus on three parameters: two reflecting a
consumer's preferences (time preference and risk tolerance), and one reflecting his or her predictions.

Given a set of prices p and a level of wealth W, a consumer/investor will choose a consumption plan c
that maximizes expected utility, taking into account his or her time preference, risk tolerance and
probability assessments. Note that these three aspects are not directly observable, while prices and wealth
are, at least in principle. Investors with the same wealth facing the same set of prices can and often will
differ in their choices of planned consumption. In general, those with greater preference for present as
opposed to future consumption will consume more in the present and save less. Those with greater risk
tolerance will take greater risk in their investment portfolio. And, other things equal, those who attach
higher (lower) probabilities to certain events will invest more (less) in securities that pay off when those
events take place.

To illustrate, we consider an investor with the following expected utility function:


eu = cn^k + prg*d*(cg^k) + (1-prg)*d*(cb^k)

Here, the arguments of the function, cn, cg and cb, are consumption now, consumption in the future if
the weather is good, and consumption in the future if the weather is bad, respectively. The three
parameters of the function are k (a measure of risk tolerance), d (a measure of time preference), and prg
(the estimated probability of good weather). As in earlier examples, we assume that the prices are [1.00
0.285 0.665] for the three types of consumption (cn, cg and cb, respectively).

Investors whose preferences can be described with this type of utility function will react to increases in
wealth by adjusting their plans proportionately. Thus, compared with an Investor with a wealth of 100,
an investor with a wealth of 200 will consume twice as many apples today, and select a consumption
plan involving twice as many apples if the weather is good and twice as many apples if the weather is
bad. The savings rate and portfolio composition will be the same for any two investors that have (1) the
same risk tolerance (more precisely, the same value of k), (2) the same time-preference (more
precisely, the same value of d), and (3) the same probability assessment (more precisely, the same value
of prg). Such invariance with respect to wealth is not a generally observed relationship, indicating that
this form of expected utility function does not capture the preferences of all investors. However, it
allows us, for now, to avoid issues associated with the effects of differences in wealth.

Consider an Investor with a wealth of 100 for whom k=0.375, d=0.96, and prg=0.50. Her optimal
consumption plan c in units will be [48.76 112.27 28.94]. The values of the components (p.*c) will be:

     Now    Good Weather    Bad Weather
   48.76           32.00          19.24

She will spend 48.76% of her wealth on present consumption and invest 51.24%. Her investment
portfolio will consist of claims on apples if the weather is good with a value of 32.00 and claims on
apples if the weather is bad with a value of 19.24. Thus the proportion of the portfolio's value invested
in good weather apples is 32.00/51.24, or 62.45%.
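
For this particular utility function the optimal plan can be found without a numerical optimizer. The sketch below is one way to do so and is not necessarily the procedure used to produce the figures in this Work. It assumes 0 < k < 1 and an interior optimum; the names w, ratio, values and share_good are introduced here for illustration. Setting the marginal expected utility of each type of consumption proportional to its price gives each quantity as a multiple of cn, and the budget constraint then determines cn.

```matlab
% Parameters from the example in the text
p   = [1.00 0.285 0.665];   % prices of cn, cg and cb
W   = 100;                  % wealth
k   = 0.375;                % risk tolerance parameter
d   = 0.96;                 % time preference parameter
prg = 0.50;                 % estimated probability of good weather

% Weights on cn^k, cg^k and cb^k in the expected utility function
w = [1, prg*d, (1-prg)*d];

% First-order conditions: c(i)/cn = (w(i)/p(i))^(1/(1-k))
% (assumes 0 < k < 1; ratio(1) equals 1 by construction)
ratio = (w ./ p) .^ (1/(1-k));

% The budget constraint p*c' = W determines cn
cn = W / (p * ratio');
c  = cn * ratio;            % optimal plan in units
values = p .* c;            % value of each component

% Share of the investment portfolio held in good-weather claims
share_good = values(2) / (values(2) + values(3));
```

The resulting c and values agree with the figures above to within rounding. Changing W rescales c proportionately and leaves share_good unchanged, which is the wealth invariance described earlier.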

This example might seem extreme, since the investor spends only 48.76% of her wealth and invests the
remaining 51.24% -- a seemingly extremely high savings rate. However, recall that we are dealing here
with total wealth, including the present value of future income. It is important to recognize that an
individual's wealth prior to retirement includes the value of his or her human capital. It should be
included, along with financial and physical capital, when considering total wealth and when making
plans for savings and risk-taking. Among other things, this suggests that younger investors (for whom
human capital is likely to represent a majority of wealth) may choose to invest their physical and
financial capital rather differently than older investors (for whom human capital may represent a
minority of wealth). It also suggests that the nature of one's human capital should be taken into account
when determining the appropriate investment of the non-human capital that is not consumed.

The next table shows the relationship between k and the decisions of interest. Each row portrays the
optimal choice for a different investor. The first three columns indicate the parameters used in the
analysis. The final columns show the values of, respectively, the amount consumed in the present, the
amount planned to be consumed if the weather is good and the amount planned to be consumed if the
weather is bad. The eighth row contains the results obtained earlier. The other rows show results for
Investors that are alike with regard to time-preference and prediction but differ in risk tolerance.

prg k d consumed good bad


0.5000 0.2000 0.9600 50.2676 27.4901 22.2422
0.5000 0.2250 0.9600 50.1184 27.9926 21.8890
0.5000 0.2500 0.9600 49.9600 28.5286 21.5114
0.5000 0.2750 0.9600 49.7729 29.1149 21.1121
0.5000 0.3000 0.9600 49.4887 29.7447 20.7666
0.5000 0.3250 0.9600 49.3278 30.4329 20.2394
0.5000 0.3500 0.9600 49.0605 31.1812 19.7583
0.5000 0.3750 0.9600 48.7561 31.9981 19.2458
0.5000 0.4000 0.9600 48.4089 32.8932 18.6978
0.5000 0.4250 0.9600 48.0107 33.8783 18.1109
0.5000 0.4500 0.9600 47.5525 34.9661 17.4814
0.5000 0.4750 0.9600 47.0219 36.1728 16.8053
0.5000 0.5000 0.9600 46.4062 37.5155 16.0783
0.5000 0.5250 0.9600 45.6875 39.0175 15.2951
0.5000 0.5500 0.9600 44.8436 40.7054 14.4510
0.5000 0.5750 0.9600 43.8488 42.6102 13.5410
0.5000 0.6000 0.9600 42.6711 44.7683 12.5606

As the table shows, investors with greater tolerance for risk will invest a greater proportion of their
portfolios in the good pure security. Recall that it has a larger expected return but greater
underperformance in bad times. Thus investors whose utility decreases at a slower rate (higher k) with
decreases in wealth are more willing to take the risk of doing badly in bad times. Note that such
investors also devote a slightly smaller portion of wealth to present consumption, and hence a larger
portion of wealth to investment, since future prospects are somewhat more attractive to them than to
those who are more concerned with the risk such investments entail.

The next table provides the same type of analysis for a group of investors who differ in time-preference
but are, in other respects, like our original investor.
prg k d consumed good bad
0.5000 0.3750 0.9000 51.3375 30.3861 18.2765
0.5000 0.3750 0.9100 50.8964 30.6614 18.4422
0.5000 0.3750 0.9200 50.4588 30.9350 18.6062
0.5000 0.3750 0.9300 50.0265 31.2052 18.7684
0.5000 0.3750 0.9400 49.5981 31.4720 18.9299
0.5000 0.3750 0.9500 49.1757 31.7361 19.0883
0.5000 0.3750 0.9600 48.7561 31.9981 19.2458
0.5000 0.3750 0.9700 48.3429 32.2557 19.4014
0.5000 0.3750 0.9800 47.9336 32.5117 19.5547
0.5000 0.3750 0.9900 47.5279 32.7655 19.7066

As the table shows, investors with greater preference for future consumption will consume less and
invest more. While the absolute values of the claims for consumption in the good and bad states of the
world are affected, their relative values are not. Investors with this type of utility function will change
only their savings rate when their time-preference changes. The composition of their portfolios will not
be affected.
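
The reason can be seen from the first-order conditions for this utility function. The derivation below is a sketch that assumes 0 < k < 1 and an interior optimum; the symbols p_g and p_b (the prices of good- and bad-weather claims) and lambda (the multiplier on the budget constraint) are introduced here for illustration.

```latex
\begin{aligned}
\mathit{prg}\, d\, k\, cg^{\,k-1} &= \lambda\, p_g ,
\qquad
(1-\mathit{prg})\, d\, k\, cb^{\,k-1} = \lambda\, p_b \\[4pt]
\frac{cg}{cb} &= \left( \frac{\mathit{prg}/p_g}{(1-\mathit{prg})/p_b} \right)^{1/(1-k)}
\end{aligned}
```

Dividing the first condition by the second cancels d (and lambda), so the ratio of good-weather to bad-weather claims depends only on k, prg and the prices.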

The final table completes the analysis by showing a group of investors with different assessments of the
probabilities associated with the alternative future states of the world, but who are like our investor in
other respects.

prg k d consumed good bad


0.4000 0.3750 0.9600 50.3109 23.1037 26.5854
0.4200 0.3750 0.9600 50.0734 24.8632 25.0634
0.4400 0.3750 0.9600 49.7994 26.6357 23.5648
0.4600 0.3750 0.9600 49.4864 28.4204 22.0932
0.4800 0.3750 0.9600 49.1391 30.2092 20.6517
0.5000 0.3750 0.9600 48.7561 31.9981 19.2458
0.5200 0.3750 0.9600 48.3435 33.7809 17.8756
0.5400 0.3750 0.9600 47.8998 35.5540 16.5462
0.5600 0.3750 0.9600 47.4280 37.3134 15.2586
0.5800 0.3750 0.9600 46.9306 39.0545 14.0150
0.6000 0.3750 0.9600 46.4086 40.7727 12.8187

Note the dramatic effects of differences in predictions. Optimists, who assign a higher probability to
good weather, will invest considerably larger portions of their portfolios in good weather apples
(securities with higher expected returns and possibilities for greater underperformance). They invest
more of their wealth as well. Differences in opinions really do make horse races (as has been said).

Representative Investors
Recall that an Investor who assigned a probability of 0.50 to good weather, and had a utility function
with k=0.375 and d = 0.96 and wealth of 100 would choose a consumption plan (in units) of [48.76
112.27 28.94]. Now, assume that in the aggregate social product, the proportions of the three types of
consumption are precisely the same. If so, our candidate can be considered a representative Investor.
Why so? Because a society with aggregate consumption of [48.76*z 112.27*z 28.94*z] (where z is a
positive constant) and prices [1.00 0.285 0.665] could be populated entirely by a set of identical
investors, each of whom had this specific utility function. The existence of such preferences and
predictions on the part of every investor would be consistent with the attributes of equilibrium that
would be observed in such a society. Since preferences and predictions cannot generally be observed
by the outside analyst (financial or other), it is helpful to determine at least some possible attributes for
such elements from observed magnitudes.

By construction, a representative investor will find it optimal to (1) save at the societal savings rate, and
(2) hold the market portfolio. Hence, any investor who assigns the same probabilities to states of the
world, has the same risk tolerance, the same impatience and (in the general case) the same wealth as a
representative investor should also save at the societal savings rate and hold the market portfolio. The
concept of a representative investor thus provides a useful benchmark against which one can compare
oneself. Other things equal, if an Investor makes different probability assessments from the
representative investor, it will be optimal to "tilt" holdings towards securities that pay more in states that
the Investor feels are more likely than does the representative Investor. Other things equal, if an Investor
has greater (less) tolerance for risk than the representative Investor, he or she should hold a portfolio
with a higher (lower) expected return than the market portfolio. Other things equal, if an investor is more
(less) patient, he or she should save more (less) than is typical in the society.

This type of comparison is complicated by the fact that the representative Investor may not be unique.
For example, the world we have described could instead be populated by some other set of
representative investors. In one sense, this does not matter. Any one of the possible sets of representative
investors can be used as a benchmark with which one can compare oneself. However, most analyses of
optimal consumption and investment decisions go farther, as we will see.

Market Efficiency
We say that a securities market is efficient relative to a given set of information if the prices of securities
are the same as they would be if all participants had that information and processed it appropriately.
Note that this definition does not require the holdings to be the same as they would be if all the
participants had the information and processed it appropriately. Consider the situation in which Investor
A is overly optimistic about the prospects for a firm, while Investor B is overly pessimistic. Under
these conditions, the price of the firm's stock might be precisely the same as it would be if A and B had
each made informed predictions. If so, we would say that the market was efficient because the average
of Investors' opinions was, in a rather broad sense, correct. Note, however, that under the posited
conditions, Investor A would hold "too much" of the security, and investor B "too little", relative to the
amounts they would hold (and should have held) had they obtained the same information and processed
it appropriately.

Key to the notion of market efficiency is that of what we will call fully-informed probabilities. Such
probabilities would be assessed by a sophisticated Analyst with access to a defined set of information. In
effect, such probabilities "take the information into account" in an efficient manner.

Using this construct, we can say that a market is efficient relative to a given set of information if security
prices are the same as they would be if every Investor utilized fully-informed probabilities. Under these
conditions, it makes sense to concentrate on a representative Investor who uses fully-informed
probabilities. The risk tolerance of such an Investor is likely to be roughly (or exactly) equal to a wealth-
weighted average of the risk tolerances of the Investors in the society (the latter is sometimes termed the
societal risk tolerance). Similarly, the impatience of this representative Investor is likely to be roughly (or
exactly) equal to a wealth-weighted average of the degrees of impatience of the consumer/investors in
the society (sometimes termed the societal degree of impatience).

Betting and Tailoring


There are three possible reasons why Investors X and Y may wish to hold different portfolios. Two
relate to preferences and one to predictions.

First, Investors' attitudes toward risk and return may differ. An outside expert cannot argue that such
differences (if fully informed) are "wrong", any more than an outside expert can tell a consumer whether
he or she should prefer beer to wine. If Investors understand the characteristics of alternative investment
vehicles and agree on their prospects and probabilities, differences in holdings can be said to be utility-
based or consistent with market efficiency. Adjusting portfolio holdings (and/or consumption/investment
decisions) to suit differences in utility can be considered tailoring.

The second reason for differences in holdings is associated with differences in levels of wealth.
Wealthier individuals generally invest a greater absolute amount of money. Some may invest a greater
percentage of their wealth, others a smaller percentage, and yet others the same percentage. Some
wealthier individuals choose investment portfolios that are riskier, others portfolios with the same risk,
and yet others portfolios with less risk. Differences in holdings that arise due to differences in Investor
wealth are, like differences due to underlying preferences, consistent with efficient markets.
Implementing strategies designed to accommodate such differences can thus also be considered
tailoring.

The third reason is different. Investors (or their financial advisors) may disagree about the probabilities
of alternative future states of the world. (Here we assume that all agree about the possible states of the
world and the payments associated with various securities in each of the states; in this setting, the only
disagreements about the future relate to the probabilities associated with alternative states.) Such
disagreements can arise when investors utilize disparate sets of information and/or process a given set of
information differently (although at a more profound level, the latter can be considered a variant of the
former).

Despite this, the wealth-weighted average probabilities assessed by investors may be equivalent to fully-
informed probabilities, since the probabilities assessed by those with greater amounts invested (and
hence greater incentive to gather information and process it well) are weighted more heavily in
computing the averages. This notion underlies the often-made assumption that consensus probabilities
are equal to fully-informed probabilities.

Loosely speaking, we can term an investor's probability assessments deviant if they differ from
those of the consensus of investors. If security prices do not reflect efficient-market probabilities, at least
some investors whose holdings are based on deviant predictions may profit from such differences. On
the other hand, if markets are efficient, differences in holdings based on deviant predictions will
generally prove undesirable in the long run, leading only to added transactions costs and lack of
appropriate diversification. In either event, adjusting portfolio holdings (and/or the
consumption/investment decision) to suit differences from consensus estimates of the likelihoods of
alternative scenarios can be considered prediction-based, and inconsistent with market efficiency. In the
vernacular, such choices can simply be termed betting.

Active and Passive Management


Those who concentrate on tailoring holdings to suit an investor's preferences and/or circumstances
generally employ investment approaches that the investment industry classifies under the heading
passive management. Those who bet on differences in predictions generally employ approaches
classified under the heading active management. Tailoring decisions tend to be designed to implement a
strategy that involves only small and relatively infrequent changes over time, and hence can be
considered "passive".

Bets, however, are likely to change rather dramatically, as new information is revealed and investors
react differently to such information. Investors making decisions based on differential predictions are
likely to generate a significant number of trades, and are thus appropriately termed "active".

10/24/12 Macro‑Investment Analysis

Risk and Return


Mean, Variance and Distributions
Portfolio Choice
Multi-period Returns
Portfolio Characteristics
Two-asset Portfolios

www.stanford.edu/~wfsharpe/mia/rr/mia_rr0.htm 1/1
10/24/12 Mean, Variance and Distributions

Mean, Variance and Distributions

Contents:
The Mean-variance Paradigm
Expected Value
Probabilities
Standard Deviation
Continuous and Discrete Outcomes
Cumulative Distributions
Normal Distributions
Joint Normality
Shortfall Measures
Shortfall Probability
Measures of Likely Shortfall
Value at Risk
Shortfall and other Risk Measures

The Mean-variance Paradigm


The world is, unhappily, very complex. Before one can analyze, one must abstract. The time-state
paradigm provides a procedure for doing so. Its power lies in the straightforward way that it
accommodates time, risk, and options. But this power comes at a price. In general, one must assume a
relatively simple structure (e.g. two possible outcomes in each trading period) and the existence of
markets that are sufficiently complete to allow replication and valuation of desired patterns of payments
and/or consumption over time.

Despite these limitations, the time-state paradigm is eminently practical in a number of settings. Dynamic
strategies involving broad asset classes are frequently analyzed using it. It is also the paradigm of choice
when derivative securities are the focus of attention. However, when the goal is to consider many
possible combinations of many different financial instruments, use of the time-state approach poses a
number of problems. One must either assume a limited number of outcomes in each trading interval,
making most of the securities redundant, or many such outcomes, making the assumption of complete
markets unrealistic. Clearly, a Hobson's choice.

In 1952, Markowitz proposed a paradigm for dealing with issues concerning choices which involve
many possible financial instruments. Formally, it deals with only two discrete time periods (e.g. "now"
and "a year from now"), or, equivalently, one accounting period (e.g. "one year"). In this scheme, the
goal of an Investor is to select the portfolio of securities that will provide the best distribution of future
consumption, given his or her investment budget. Two measures of the prospects provided by such a
portfolio are assumed to be sufficient for evaluating its desirability: the expected or mean value at the
end of the accounting period and the standard deviation or its square, the variance, of that value. If the
initial investment budget is positive, there will be a one-to-one relationship between these end-of-period
measures and comparable measures relating to the percentage change in value, or return over the period.
Thus Markowitz' approach is often framed in terms of the expected return of a portfolio and its standard
deviation of return, with the latter serving as a measure of risk.

www.stanford.edu/~wfsharpe/mia/rr/mia_rr1.htm 1/11

The Markowitz paradigm is often characterized as dealing with portfolio risk and (expected) return or,
more simply, risk and return. More precisely, it can be termed the mean-variance paradigm.

Expected Value
Assume that a portfolio will have a future (end-of-period) value of v1 in state 1, v2 in state 2, etc. Let v
= [v1,v2,...,vm] be a {1*m} element vector, where m is the number of possible states of the world. To
compute the portfolio's expected future value, we need someone's estimate of the probabilities associated
with the states. Let pr = [pr1,pr2,...,prm] be such a vector. The expected value is, as usual, a weighted
average of the possible outcomes, with the probabilities of the outcomes used as weights:
ev = pr*v'

If the current value of the portfolio is p, we can compute a vector of value-relatives (future/present
values):
vr = v/p

And a vector of returns (proportional changes in value):

r = (v-p)/p

The portfolio's expected value-relative can be computed either directly or indirectly:


evr = (pr*v')/p = ev/p

Similarly, the portfolio's expected return will be:


er = (pr*(v-p)')/p = ((pr*v')-p)/p = (ev-p)/p
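
A small numerical case may help fix these definitions. The values, probabilities and current price below are hypothetical and chosen only for illustration; the formulas are those given above.

```matlab
% Hypothetical inputs (not from the text): three states of the world
v  = [90 105 130];        % end-of-period portfolio values, state by state
pr = [0.25 0.50 0.25];    % probability estimates, summing to one
p  = 100;                 % current value of the portfolio

ev  = pr*v';              % expected future value
vr  = v/p;                % vector of value-relatives
r   = (v-p)/p;            % vector of returns
evr = ev/p;               % expected value-relative, computed indirectly
er  = (ev-p)/p;           % expected return, computed indirectly

% The direct computations give the same answers
evr_direct = pr*vr';
er_direct  = pr*r';
```

Here ev is 107.5, so the expected value-relative is 1.075 and the expected return is 0.075, whichever route is taken.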

In the special case in which every outcome is equally likely, the expected value can be computed by
simply taking the (arithmetic) mean of the possible values. In MATLAB:
ev = mean(v)

Note that the mean function can be used with a matrix -- the result will be a row vector in which each
element is the mean of the corresponding column in the original. This can prove handy with a matrix in
which each column represents a different asset and each row a different state of the world, with the latter
assumed to be equally likely.
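
As a brief illustration of this use of mean, consider the following sketch. The return matrix R is hypothetical, with three equally likely states and two assets.

```matlab
% Hypothetical returns: rows are equally likely states, columns are assets
R = [ 0.10  0.02
     -0.05  0.03
      0.20  0.04];

er = mean(R);               % 1-by-2 row vector: expected return of each asset

% Equivalent computation with an explicit vector of equal probabilities
pr = ones(1,3)/3;
er_check = pr*R;
```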

Probabilities
Probability estimates are essential in the mean-variance approach. Unless all Investors agree about such
probabilities, one cannot talk about "the" expected value or expected return (or risk, for that matter) of a
portfolio, security, asset class or investment plan. Two different Analysts might well provide different
estimates of expected values for the same investment product. Indeed, one of the key functions that an
Analyst can perform for an Investor is the provision of informed estimates of the probabilities of various
outcomes and the associated risks and expected values of alternative investment strategies.

Normative applications of the mean-variance paradigm often accept the possibility of disagreement
among Investors and Analysts concerning probability estimates. Positive applications usually assume
either that there is agreement concerning such probabilities or that prices are set as if there were
agreement on a set of consensus probability estimates.

It is important to emphasize the fact that the mean-variance approach calls for the use of estimates of the
probabilities of alternative future possible events in the next period. Historic frequencies of such events
in past periods may prove helpful when forming such forward-looking estimates, but one should
consider taking into account any additional information that might prove helpful. The world changes,
and the future need not be like the past, even probabilistically. Issues concerning ways to implement the
mean-variance approach can and should be separated from issues concerning its structure, assumptions,
and implications.

Standard Deviation
If the future value of a portfolio will be vs in state s and the expected future value is ev, the deviation, or
surprise, in state s will equal (vs-ev). More generally, if v is the vector of possible future values, the
vector of deviations, state by state, will be:
d = v - ev

In this vector, a positive deviation represents a happy surprise, a negative deviation an unhappy surprise,
and a zero deviation no surprise at all. Roughly: the greater the "spread" of the possible deviations, the
greater the uncertainty about the actual outcome.

To measure risk in a fully useful manner we need to take into account not only the possible surprises,
but also the probabilities associated with them. Simply weighting each deviation by its probability won't
do, since the answer will always equal zero.

One alternative uses the expected or mean absolute deviation (mad):

mad = pr*abs(d)'

In practice, it is difficult to use mad measures when considering combinations of securities and
portfolios. Mean-variance theory thus utilizes the expected squared deviation, known as the variance:
var = pr*(d.^2)'

Variance is often the preferred measure for calculation, but for communication (e.g. between an Analyst
and an Investor), variance is usually inferior to its square root, the standard deviation:
sd = sqrt(var) = sqrt(pr*(d.^2)')

Standard deviation is measured in the same units as the original outcomes (e.g. future values or returns),
while variance is measured in such units squared (e.g. values squared or returns squared).
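
The measures above can be compared on a small example. The distribution below is hypothetical. The names mad_v, var_v and sd_v are used here, rather than mad, var and sd, only to avoid masking MATLAB functions of the same names.

```matlab
% Hypothetical discrete distribution (not from the text)
v  = [90 105 130];        % possible future values
pr = [0.25 0.50 0.25];    % associated probabilities

ev  = pr*v';              % expected value
d   = v - ev;             % deviations (surprises), state by state

chk   = pr*d';            % probability-weighted deviations: zero up to rounding
mad_v = pr*abs(d)';       % mean absolute deviation
var_v = pr*(d.^2)';       % variance (in squared units)
sd_v  = sqrt(var_v);      % standard deviation (same units as v)
```

In this case the deviations are [-17.5 -2.5 22.5], the mean absolute deviation is 11.25 and the variance is 206.25, giving a standard deviation of roughly 14.36.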

We again emphasize that standard deviation is used in this context as a forward-looking measure of risk,
since it is based on probabilities of future outcomes, however derived. One can assume that future risk is
similar to past variability, but this is neither required nor, in certain cases, desirable.

MATLAB provides a function for computing the standard deviation of a series of values, and one that
can be used to compute the variance of such values. In each case, the computations assume that the
outcomes are equally probable. In addition, it is assumed that the values are a sample
drawn from a larger population, and that the variance and standard deviation of the
population are to be estimated.

For reasons that we will not cover here, the best estimate of the population variance will equal the
sample variance times n/(n-1), where n is the number of sample values. Correspondingly, the best
estimate of the population standard deviation will equal the sample standard deviation times the square
root of n/(n-1). MATLAB's functions make this correction automatically, as do many functions included
with spreadsheet software. When estimates of this type are desired, one can use std(v) to find the
estimated population standard deviation where v is a vector of sample values. Alternatively, one can use
cov(v) to find the estimated population variance. Note that both functions are inherently designed to
process historic data in order to make predictions about future results and hence implicitly assume that
future "samples" will be drawn from the same "population" as were prior ones. In some cases this
assumption may be entirely justified; in others it may not.
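
The n/(n-1) correction can be checked directly. The sketch below uses four hypothetical sample values and compares the equal-probability variance, computed as in the earlier formula, with the results of cov and std.

```matlab
v = [4 8 6 2];                 % hypothetical sample values
n = length(v);

% Equal-probability ("sample") variance, using pr = 1/n for every value
pr = ones(1,n)/n;
d  = v - pr*v';
var_sample = pr*(d.^2)';

% cov and std apply the n/(n-1) correction automatically
var_est = cov(v);              % estimated population variance
sd_est  = std(v);              % estimated population standard deviation

% Applying the correction by hand to the sample variance
var_corrected = var_sample * n/(n-1);
```

With these values the sample variance is 5, while cov(v) returns 20/3, which is 5 times 4/3.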

Continuous and Discrete Outcomes


Thus far, we have dealt with a world in which a future value can take on one of a discrete set of
specified values, with a probability associated with each value. The mean-variance approach can be
utilized in such a setting, and we will do this from time to time for expository purposes. However, its
natural setting is in a world in which outcomes can lie at any point along a continuum of values.
Statisticians use the term random variable to denote a variable that can take on any of a number of such
values.

In a discrete setting, the actual value of a variable will be drawn from a vector (e.g. v) having a finite
number of possible outcomes, with the probability of drawing each value given by the corresponding
entry in an associated probability vector (e.g. pr). The set of values (v) and the associated probabilities
(pr) constitute a discrete probability distribution.

In a continuous setting, a value will be drawn from a continuous probability distribution, the parameters
and form of which indicate the range of outcomes and the associated probabilities.

Cumulative Distributions
The most informative way to portray a distribution utilizes a plot of the probability that the actual
outcome will be less than or equal to each of a set of possible values.

Let v be a vector of values, sorted in ascending order, and pr a vector of the probabilities associated with
each of the corresponding values. For example:
v= [ 10 20 30];

pr = [ 0.20 0.30 0.50];

The probability that the actual outcome will be less than or equal to 10 is 0.20. The probability that the
actual outcome will be less than or equal to 20 is (0.20+0.30), or 0.50, and the probability that the
outcome will be less than or equal to 30 is 1.00. To produce a vector of these probabilities we can use
the MATLAB cumsum function, which creates a new vector in which each element is the cumulative
sum of all the elements up to and including the comparable position in the original vector. In this case:
cumsum(pr) =

0.2000 0.5000 1.0000

The figure below shows the associated cumulative probability distribution. Note that it is a step function,
reflecting the discrete nature of the outcomes.

It is, of course, much simpler to plot the points, and let MATLAB connect them with straight
lines. Here are the required statements:
plot(v,cumsum(pr));

xlabel('outcome');

ylabel('Probability actual <= outcome');

In this case the result is:

The greater the number of points and the nearer together they are, the closer will be this type of plot to
the more accurate step function. In the case of a continuous distribution, there will be no difference at
all.
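
The difference between the two portrayals can be seen by evaluating each at a point between two possible outcomes. The following is a sketch; the helper names cdf_step and cdf_line are introduced here and are not part of the Work's function library.

```matlab
v  = [10 20 30];
pr = [0.20 0.30 0.50];

% Step function: add up the probabilities of all values not exceeding x
cdf_step = @(x) sum(pr(v <= x));

% Straight-line version: interpolate between the points that
% plot(v,cumsum(pr)) would connect
cdf_line = @(x) interp1(v, cumsum(pr), x);

p_step = cdf_step(15);    % true probability that the outcome is <= 15
p_line = cdf_line(15);    % value read from the straight-line plot
```

At an outcome of 15 the step function gives 0.20, since only the value 10 lies at or below 15, while the straight-line plot suggests 0.35. The two agree at each of the possible values themselves.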

Normal Distributions
A uniformly-distributed random variable can take on any value within a specified range (e.g., zero to
one) with equal probability. Most programming languages and spreadsheets provide functions that can
generate close approximations to such variables (purists would, however, call them pseudo-random
variables, since they are not completely random). In MATLAB, the function rand(r,c) generates an
{r*c} element matrix of such numbers.

Consider the process of generating 1000 sets of 1000 such numbers, then taking the mean (unweighted
average) of each set. In MATLAB:
z = mean(rand(1000,1000))

A histogram showing the frequency distribution of the mean values in each of 25 "bins" can be
obtained with the statement:
hist(z,25)

The figure below shows the results obtained in this manner in one experiment.

Note that the distribution is approximately "bell-shaped" and roughly symmetric. This is not surprising
since the central limit theorem holds that the distribution of the sum or average of a set of unrelated
variables will approach a particular form as the number of variables increases. The form is that of the
normal distribution, given by the equations:
nd = (x - ev)/sd;

p(x) = (1/sqrt(2*pi))*exp(-(nd^2)/2)

where p(x) is proportional to the probability that the actual value will equal x; ev and sd stand for the
expected value and standard deviation, respectively, of the distribution, and nd is the deviation of x from
ev in standard deviation units.
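
The experiment described above can also be checked numerically rather than visually. The sketch below relies on two standard facts not stated in the text: a variable uniformly distributed between zero and one has a mean of 0.5 and a variance of 1/12, so the average of n unrelated draws has a standard deviation of sqrt(1/(12*n)). The names ev_theory, sd_theory and frac_1sd are introduced here for illustration.

```matlab
n = 1000;                           % numbers averaged in each set
z = mean(rand(n,1000));             % 1000 means, one per column

ev_theory = 0.5;                    % mean of a uniform variable on [0,1]
sd_theory = sqrt(1/(12*n));         % sd of the average of n such variables

ev_sim = mean(z);                   % simulated counterparts
sd_sim = std(z);

% Fraction of the simulated means within one theoretical sd of 0.5;
% for a normal distribution this should be roughly two out of three
frac_1sd = mean(abs(z - ev_theory) <= sd_theory);
```

Results will differ somewhat from run to run, since the values are (pseudo-)random.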

The figure below plots p(x) for various values of nd.

More practical is the cumulative normal distribution. MATLAB does not provide such a function, but it
offers the next best thing. The expression erf(x/sqrt(2)) gives the probability that a normally-distributed
random variable will fall between -x and +x standard deviations of the mean. This forms the basis for
our function cnd(nd) where nd is a standardized deviation and cnd(nd) is the probability that the actual
outcome will be less than nd.
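
One way such a function could be written is sketched below. It is an assumption about how cnd might be built from erf, not a reproduction of the Work's own function. Because the normal distribution is symmetric, the probability of falling below nd is one half plus half the probability of falling within nd standard deviations of the mean; erf is an odd function, so the same expression also handles negative values of nd.

```matlab
% Cumulative standard normal distribution built from erf
% (a sketch; works element-by-element on vectors and matrices)
cnd = @(nd) 0.5*(1 + erf(nd/sqrt(2)));

within1 = cnd(1) - cnd(-1);     % probability of lying within one sd of the mean
within2 = cnd(2) - cnd(-2);     % probability of lying within two sd of the mean
```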

The figure below shows the values of cnd(nd) for nd from -3 to +3 (in steps of 0.1), using the
MATLAB statements:

nd = -3:0.1:3;

pr = cnd(nd);

plot(nd,pr);

grid;

xlabel('deviation');

ylabel('Probability actual <= outcome');

The cumulative normal distribution can be used to determine probabilities that a normally-distributed
outcome will lie within a given range. For example, the probability that an outcome will lie within one
standard deviation of the mean is:

cnd(1)-cnd(-1)

0.6827

Thus there are roughly two chances out of three that the outcome will lie within this range. Some
characterize an investment's prospects by giving its mean and standard deviation in the form: e +/- sd
(read as e plus or minus sd); thus an asset mix might be said to offer returns of 10+/-15. If the return can
be assumed to be normally-distributed, this means that there are roughly two chances out of three that
the actual return will lie between -5% (10-15) and 25% (10+15).

The probability that a normally-distributed return will be within two standard deviations of the mean is
given by:

cnd(2)-cnd(-2)

0.9545

Thus if a normally-distributed investment is characterized by 10+/-15, the chances are roughly 95% that
its actual return will lie between -20% (10 - 2*15) and 40% (10+2*15).
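The two ranges can be computed directly from the mean and standard deviation. The sketch below uses a one-line stand-in for cnd (an assumption; the Work's own function may differ) and a helper, prBetween, that is introduced only for illustration.

```matlab
cnd = @(nd) 0.5*(1 + erf(nd/sqrt(2)));   % assumed stand-in for the Work's cnd

ev = 10;  sd = 15;                       % the "10+/-15" asset mix

% Probability that the return lies between lo and hi (illustrative helper)
prBetween = @(lo,hi) cnd((hi-ev)/sd) - cnd((lo-ev)/sd);

p1 = prBetween(ev-sd,   ev+sd);          % range from -5% to 25%
p2 = prBetween(ev-2*sd, ev+2*sd);        % range from -20% to 40%
```

The same helper gives the probability for any other range of returns, for example prBetween(0,10).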

In MATLAB one can produce normally-distributed random variables with an expected value of zero
and a standard deviation of 1.0 directly using the function randn. Thus:

z = ev + randn(100,10)*sd

will produce a {100*10} matrix z of random numbers from a distribution with a mean of ev and a
standard deviation of sd.
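A quick check of such a sample is sketched below, reusing the 10+/-15 values from the earlier example as an assumption. The sample statistics will differ somewhat from ev and sd in any one experiment.

```matlab
ev = 10;  sd = 15;               % assumed values, borrowed from the 10+/-15 example
z = ev + randn(100,10)*sd;       % 100-by-10 matrix of normally-distributed draws

sampleMean = mean(z(:));         % should be near ev
sampleSd   = std(z(:));          % should be near sd
```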

Joint Normality
While the central limit theorem provides a powerful inducement to assume that investment returns and
values are normally distributed, it is not sufficient in its own right. While most investment results depend
on many events and most portfolios contain many securities, it is unlikely that the influences on overall
results are unrelated. If, for example, the health of an economy is not normally distributed, and if it
affects most securities to at least some extent, even the value of a diversified portfolio will have a non-
normal distribution.

To solve this problem at a formal level, Analysts often assume that the return or value of every
investment is normally distributed as is the value or return of any possible combination of investments.
Since knowledge of the expected value and standard deviation of a normal distribution is sufficient to
calculate the probability of every possible outcome, this very convenient assumption implies that the
expected value and standard deviation are sufficient statistics for investment choices in which an end-of-
period value or return is the sole source of an Investor's utility.

If the value or return of every possible investment and combination of investments is normally
distributed, we say that the set of such variables is jointly normally distributed. The mean-variance
approach is well suited for application in such an environment.

Shortfall Measures
Some argue that standard deviation is a flawed measure of risk since it takes into account both happy
and unhappy surprises, while most people associate the concept of risk with only the latter. Alternative
measures focus on "downside risk" or likely "shortfall". Each requires the specification of an additional
parameter -- the point from which shortfall is to be measured. This threshold may be zero, a riskless rate
of return, or some level below which the Investor's disappointment with the outcome is assumed to be
especially great.

Shortfall Probability
The simplest shortfall measure is the probability of a shortfall below a stated threshold. This can be read
directly from a graph of the associated cumulative distribution. For example, assume that the probability
that a return will be less than 10% is desired. In the figure below, find 10% on the horizontal axis. Go up
to the curve, then over to the vertical axis. The result is 0.5. Thus there is a 50% probability that the
return will fall below the selected threshold of 10%.
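The graph reading can be reproduced numerically. The sketch below assumes that the figure plots a normal distribution with the 10+/-15 parameters used earlier (an assumption, although it agrees with the reading of 0.5 at 10%), and it uses a one-line stand-in for cnd.

```matlab
cnd = @(nd) 0.5*(1 + erf(nd/sqrt(2)));   % assumed stand-in for the Work's cnd

ev = 10;  sd = 15;                       % assumed parameters of the plotted distribution
threshold = 10;                          % threshold return (%)

prShortfall = cnd((threshold - ev)/sd);  % 0.5, since the threshold equals the mean
prBelowZero = cnd((0 - ev)/sd);          % same calculation for a threshold of zero
```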

Measures of Likely Shortfall


More complex shortfall measures take into account all possible outcomes below the selected threshold
and their probabilities to obtain an estimate of the "likely" magnitude of the shortfall. Let r be a vector of
possible returns and pr a vector of the associated probabilities. For example:
r = [-10 0 10 20]

pr = [.1 .2 .3 .4]

Assume that the desired threshold is 10 (%):

threshold = 10

The positions in r which contain returns below the threshold can be found simply using the MATLAB
expression:

r<threshold

1 1 0 0

To produce a vector of shortfalls we subtract the threshold from each return, then multiply the resulting
vector, element by element, by the vector that contains zeros in all positions in which the return is not
below the threshold:
sf = (r-threshold).*(r<threshold)

-20 -10 0 0

To find the expected shortfall, multiply each of these values by the associated probability and sum the results:

pr*sf'

-4

An alternative is the semi-variance, which is the expected squared shortfall:

pr*(sf.^2)'

60

The square root of the semi-variance is termed the semi-standard deviation. In a sense, it is the
"downside" counterpart of the standard deviation. In the case at hand:

sqrt(pr*(sf.^2)')

7.7460

The expected shortfall, the semi-variance and the semi-standard deviation are all unconditional
measures. For example, the expected shortfall is the expected value of the shortfall, whether there is one
or not. All outcomes that exceed the threshold are treated equally (as zero shortfalls), no matter what
their magnitude. Alternative measures answer a somewhat different set of questions. For example, one
might wish to know the size of the expected shortfall if there is one. More directly: conditional on the
existence of a shortfall, how large is it likely to be?

To compute a conditional measure, only states of the world in which a shortfall occurs are considered.
The desired probabilities are those conditional on such a situation arising. In our example, only the first
two states of the world produce shortfalls. The associated unconditional probabilities are 0.1 and 0.2.
Thus the probability of a shortfall is 0.3. The conditional probabilities for the two states are 0.3333
(=0.1/0.3) and 0.6667 (=0.2/0.3). More generally, we divide each unconditional probability by the
probability of a shortfall. To find the latter we need a vector of the unconditional probabilities for states
in which there is a shortfall:

pr.*(r<threshold)

0.1000 0.2000 0 0

The sum of these values is the probability of a shortfall:


prsf = sum(pr.*(r<threshold))

0.3000

To find the conditional expected shortfall, we could divide each unconditional probability by this value,
then multiply by the shortfall vector. Equivalently, we could simply divide the unconditional expected
shortfall by the probability of a shortfall:
pr*sf'/prsf

-13.3333

Earlier we found that the expected shortfall is 4%. However, if there is a shortfall, the expected amount
is 13.33%.

Similarly, the conditional semi-variance equals the unconditional semi-variance divided by the
probability of a shortfall. From this it follows that the conditional semi-standard deviation equals the
unconditional semi-standard deviation divided by the square root of the probability of a shortfall.
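The relationships between the conditional and unconditional measures can be verified with the example data. The sketch below gathers the earlier statements into one script and computes each conditional measure directly from the conditional probabilities. The names prc, es, sv, ssd, ces, csv and cssd are introduced here for illustration.

```matlab
r  = [-10 0 10 20];                    % possible returns (%)
pr = [.1 .2 .3 .4];                    % associated probabilities
threshold = 10;

sf   = (r-threshold).*(r<threshold);   % shortfalls: [-20 -10 0 0]
prsf = sum(pr.*(r<threshold));         % probability of a shortfall: 0.3

es  = pr*sf';                          % expected shortfall: -4
sv  = pr*(sf.^2)';                     % semi-variance: 60
ssd = sqrt(sv);                        % semi-standard deviation

% Conditional probabilities (zero in states without a shortfall)
prc = pr.*(r<threshold)/prsf;

ces  = prc*sf';                        % conditional expected shortfall
csv  = prc*(sf.^2)';                   % conditional semi-variance
cssd = sqrt(csv);                      % conditional semi-standard deviation
```

Here ces equals es/prsf, csv equals sv/prsf, and cssd equals ssd/sqrt(prsf), as stated above.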

Value at Risk
Another measure of downside risk is based on a specified probability. In effect one asks the question:
what is the (almost) worst thing that can happen? A probability px is selected. The associated (almost)
worst thing that can happen is given by a return or future value x, such that there is only a
probability px that the actual outcome will be worse than x.

Assume, for example, that a bad outcome is specified as one that will not be underperformed more than
10% (px) of the time. In the case shown in the previous figure, this is easily determined. Locate 0.1
(10%) on the vertical axis. Then go over to the curve and down to the horizontal axis. The result is
-10%. Thus the (10%) worst case involves a return of -10%.

When the result of this kind of calculation involves a negative change in value, the change is often
termed the value at risk. Thus, in our example, if the current amount invested were $500,000, we would
say that the value at risk is $50,000.
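When the distribution is normal, the graph reading can be replaced by a calculation. The sketch below assumes that the figure plots a normal distribution with a mean of 10 and a standard deviation of 15 (our assumption), and inverts the cumulative normal distribution with MATLAB's erfinv function. The variable names are illustrative.

```matlab
ev = 10;  sd = 15;          % assumed parameters of the plotted distribution
px = 0.10;                  % chosen probability
invested = 500000;          % amount invested ($), from the example above

% Inverse of the cumulative normal distribution:
% 0.5*(1 + erf(nd/sqrt(2))) = px  implies  nd = sqrt(2)*erfinv(2*px - 1)
nd = sqrt(2)*erfinv(2*px - 1);
x  = ev + nd*sd;            % return with only a px chance of a worse outcome

valueAtRisk = -(x/100)*invested;   % dollar loss when x is negative
```

Under these assumptions x is somewhat closer to zero than the -10% read from the graph, so the computed value at risk is somewhat below $50,000.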

Value at risk is often calculated for short holding periods (e.g. a day or a week). In such cases the
expected return is often assumed to be zero. This allows the Analyst to concentrate on the shape of the
distribution of returns and its standard deviation, thereby lending at least a somewhat greater sense of
objectivity to the result.

Shortfall and other Risk Measures


In many cases it proves helpful to summarize the prospects of an investment strategy in terms of (1) its
expected outcome and (2) a measure of downside risk or likely shortfall, even though the analysis
leading to its choice utilized standard deviation as a measure of risk.

Among strategies with equal expected outcomes there is often a one-to-one correspondence between
standard deviation and each of several alternative risk measures, including downside ones. Since
calculations are far easier when standard deviation is utilized, we follow common practice by utilizing it
in much of what follows. When issues of communication are paramount, however, we will include
transformations to alternative measures that focus attention on bad outcomes rather than all outcomes.

10/24/12 Portfolio Choice

Portfolio Choice

Contents:
Efficient Portfolios
Roles of the Analyst and the Investor
Indifference Curves
Expected Utility
Approximating an Investor's Utility Function
Negative Exponential Utility Functions
Inferring Investor Risk Tolerance
Risk Tolerance and Risk
Properties of Portfolio Utility
Effects of Increases in Wealth
Risk-adjusted Expected Return

Efficient Portfolios
An Investor must choose between two portfolios. The end-of-period value of each one is normally
distributed. Portfolio A has an expected value of $10,000 and a standard deviation of $15,000. Portfolio
B has an expected value of $14,000 and a standard deviation of $15,000. Which will provide the
greatest expected utility?

The answer is not difficult to obtain. As long as the Investor's utility increases with wealth and does not
depend on the state of the world in which the wealth is obtained, portfolio B is better. This can be seen
in the plot of the cumulative distributions, shown below:

Take any possible outcome -- for example, $5,000. This is (5,000-10,000)/15,000 standard deviations
from the expected value of portfolio A. The probability that the actual outcome will fall short of this
amount is cnd((5000-10000)/15000) or 0.3694. On the other hand, this outcome is (5,000-
14,000)/15,000 standard deviations from the expected value of portfolio B. The probability that the
actual outcome will fall short of this amount is cnd((5000-14000)/15000) or 0.2743. Clearly, it is better
to have a smaller chance of a shortfall below $5,000; in this respect, B is preferred to A. But the result
will be the same for every possible outcome, as the figure shows. Portfolio B thus dominates portfolio A
for any Investor who prefers more wealth to less and who has a state-independent utility function.
Formally, this is termed a case of first-degree stochastic dominance. More simply put: mean-variance
theory assumes that among portfolios with the same standard deviation, the one with the greatest
expected value is the best.
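The dominance of B over A can be checked over a range of outcomes rather than at $5,000 alone. The sketch below uses a one-line stand-in for cnd and an arbitrary grid of outcomes; both are assumptions made for illustration.

```matlab
cnd = @(nd) 0.5*(1 + erf(nd/sqrt(2)));   % assumed stand-in for the Work's cnd

x = -40000:1000:60000;                   % grid of possible outcomes ($), chosen arbitrarily

cdfA = cnd((x-10000)/15000);             % probability that A falls short of each x
cdfB = cnd((x-14000)/15000);             % probability that B falls short of each x

bDominates = all(cdfB <= cdfA);          % true if B never has the larger shortfall chance

pA = cnd((5000-10000)/15000);            % 0.3694
pB = cnd((5000-14000)/15000);            % 0.2743
```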

Now examine the figure below in which each circle plots the expected value and standard deviation of a
different portfolio.

Consider the portfolios shown by the red circles in the figure that plot on curve XzzYZ. Each provides
the maximum expected value for a given level of standard deviation. If all the portfolio returns are
normally distributed, then any Investor for whom more wealth is better than less and for whom only
wealth matters should choose from among the portfolios on this curve.

What about the portfolios on the section of the curve from Y to Z? The one plotting at point Y provides
a greater expected value and a smaller standard deviation than any of the portfolios between Y and Z.
Moreover, for every portfolio on the section between Y and Z there are alternatives with the same
expected return but lower standard deviations. For example, portfolio zz offers the same expected return
as Z but a lower standard deviation (indeed, the lowest possible, in this case). The figure below plots the
cumulative distributions for these two portfolios.

Portfolio zz dominates Z over the lower half of the range of possible outcomes, but Z provides larger
chances of obtaining higher values.

To deal with such a case, Markowitz proposed that Investors be assumed to be risk-averse -- more
precisely, each Investor's marginal utility of wealth is assumed to decline with wealth.

In this case, an Investor with decreasing marginal utility of wealth will prefer zz to Z, since moving from
Z to zz will improve bad outcomes symmetrically with reductions in good outcomes, and the gain in
utility from each improvement will exceed the loss in utility from the corresponding
reduction. Formally, this is a case of second-degree stochastic dominance.

Mean-variance theory assumes that Investors prefer (1) higher expected returns for a given level of
standard deviation and (2) lower standard deviations for a given level of expected return. Portfolios
that provide the maximum expected return for a given standard deviation and the minimum standard
deviation for a given expected return are termed efficient portfolios. All others are inefficient.

In practice the curve plotting the maximum expected value for each level of risk will usually be upward-
sloping throughout the range of feasible values. Sections such as YZ are rare. Thus it generally suffices
to assume only that Investors prefer greater expected return for given risk, placing a considerably smaller
burden on the Analyst who advocates a focus on only efficient portfolios.

The figure below provides an illustration, with expected returns expressed in terms of excess returns
over and above a riskless rate of interest.

In this figure each point represents a portfolio. Given the Investor's budget and the joint distribution of
security and portfolio values, there are many such points, only a few of which are shown in the figure.
The set of all such points makes up a feasible region of mean-variance (or mean-standard deviation)
combinations. Efficient portfolios plot on the upper left-hand border of this region, shown as a red
curved line in this case. For obvious reasons this border is often termed the efficient frontier.
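For a finite list of portfolios, the efficient ones can be found by discarding every portfolio that some other portfolio beats on one count without losing on the other. The sketch below is a simplified, discrete illustration of this idea, not the procedure used to construct the figure. The six portfolios and all variable names are hypothetical.

```matlab
% Hypothetical portfolios: expected value e and standard deviation sd
e  = [4 6  7  8  8  5];
sd = [5 8 12 12 16 10];

n = numel(e);
efficient = true(1,n);
for i = 1:n
    for j = 1:n
        % j dominates i: at least as good on both counts, strictly better on one
        if e(j) >= e(i) && sd(j) <= sd(i) && (e(j) > e(i) || sd(j) < sd(i))
            efficient(i) = false;
        end
    end
end
```

In this example the third, fifth and sixth portfolios are dominated, leaving the first, second and fourth as the efficient ones.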

Roles of the Analyst and the Investor


By its nature, our delineation of the roles of the Analyst and the Investor suggests a division of labor.
The Analyst is presumed to be an expert on capital markets, while the Investor knows his or her
circumstances, tastes, obligations, future opportunities, etc. Their joint goal is to bring all this
information together to achieve the best possible plan for the Investor's savings and investment.

Central in this enterprise is the selection of an asset mix for a long-term investment policy. Such a policy
asset mix plays a key role in many investment plans. Its choice is often the first (and sometimes the only)
joint undertaking of the Analyst and the Investor.

In principle, four ingredients are needed for a complete analysis of this type. In discrete terms: (1)
estimates of possible payments made by various investment products at different times and in different
states of the world, (2) estimates of the probabilities associated with these times and states of the world,
(3) the Investor's utility function for consumption at different times and in different states of the world,
and (4) the Investor's current wealth, projected future income and required payments at various times in
various states of the world. Since the first two aspects are the same for every Investor, it pays to share
the cost of obtaining information about them among many Investors. Economies of scale thus provide
the primary justification for the existence of the Analyst, to whom this material is directed.

In mean-variance analyses, payments (item 1) and probabilities (item 2) are summarized in the mean-
variance feasible region. By assuming that all Investors are risk-averse, the Analyst can narrow the
range of "sensible" investment opportunities to those that lie on the efficient frontier. In some cases this
may suffice. The dialogue would be something like this:
Analyst: Here is the tradeoff between risk (standard deviation)
and (expected) return, using only efficient strategies. Which
point do you want?

Investor: I'll take that one (chooses a point).

Analyst: OK. The asset mix for you is (writes down a mix, which
indicates the percentage to be invested in each of several asset
classes).

In practice it is rarely this simple. While the mean-variance framework deals with only one period, actual
investment policies are designed to last for many periods. Few Investors can relate one-period mean and
variance to long-run outcomes. In most cases it falls to the Analyst to do so, taking into account
information supplied by the Investor -- information that is unique to his or her situation.

Take the example of a 55-year-old with savings of $500,000. She plans to save an additional $50,000
per year for the next ten years, then "cash in" and move to Southern France. She cares only about the
amount of money that she will have at that time.

In such a circumstance an Analyst would generally do an asset allocation study. For example, five asset
mixes might be selected from the efficient frontier, in order of increasing risk and expected return. For
each mix a Monte Carlo analysis would be performed to estimate the probability distribution of the size
of the Investor's retirement fund ten years hence. These would be shown to the Investor, who would
pick the one she preferred. The dialogue would be something like:
Analyst: Based on your current and planned future savings, I
have estimated the likely range of retirement funds for each of
five efficient investment strategies. The more conservative
strategies run less risk of a truly disappointing outcome, but
are likely to provide less under expected conditions. Here are
depictions that show what might happen with each policy. Which
do you prefer?

Investor: This is not an easy choice. All things considered,
this one (chooses a policy) seems best for me.

Analyst: OK, here's the asset mix for you ....
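A Monte Carlo analysis of this kind might be sketched as follows. Only the $500,000 of current savings, the $50,000 of annual saving and the ten-year horizon come from the example. The expected returns and standard deviations of the five mixes are hypothetical, as are the assumptions that annual returns are normal and independent from year to year and that each year's saving is added at year-end.

```matlab
% Hypothetical capital-market assumptions for five efficient mixes
eRet  = [4 5 6 7 8]/100;       % expected annual returns
sdRet = [3 6 9 12 15]/100;     % annual standard deviations

nSims  = 10000;                % number of simulated ten-year paths
nYears = 10;
w0     = 500000;               % current savings ($)
saving = 50000;                % added at the end of each year ($)

pctiles = zeros(5,3);          % 5th, 50th and 95th percentiles for each mix
for m = 1:5
    w = w0*ones(nSims,1);
    for y = 1:nYears
        ret = eRet(m) + sdRet(m)*randn(nSims,1);   % one year's simulated returns
        w = w.*(1 + ret) + saving;                 % grow the fund, then add saving
    end
    ws = sort(w);
    pctiles(m,:) = [ws(round(0.05*nSims)) ws(round(0.50*nSims)) ws(round(0.95*nSims))];
end
disp(pctiles)
```

Each row summarizes the distribution of the retirement fund for one mix. An Analyst would more likely show the Investor a picture of each distribution than a table of percentiles.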

Indifference Curves

Mean-variance theory provides a neat separation between Investor preferences and capital market
opportunities. The latter are summarized in the feasible mean-variance opportunity set and its efficient
frontier. The former can be shown with a set of Investor indifference curves.

The diagram below shows portions of a map of the preferences of a specific Investor.

Here are some answers obtained when this Investor was asked to choose between various pairs of
mean-variance combinations:

Choose between    Answer
--------------    ----------
W and Y           W
V and Y           Y
X and Y           don't care
W and Z           W
V and X           X
Z and Y           don't care

These responses can be written using algebraic notation, with > meaning "is preferred to" (more
properly: "would be chosen over"), < the converse, and = meaning "is equally desirable as" (more
properly, "would let someone else choose" or "flip a coin"). Thus:

W>Y
Y>V
X=Y
W>Z
X>V
Z=Y

If the Investor has transitive preferences, we can combine all these responses, using the rules of algebra:
W>X=Y=Z>V

Thus if we know that W is preferred to Y and Y is preferred to V, we assume that if asked to choose
between W and V, the Investor would pick W. This may seem obvious, but people often fail to make
choices that are "rational" in this sense. Worse yet, when the preferences represent the results of choices
made by a committee voting by majority rule, instances of intransitivity are common. In such cases the
order in which votes are presented can easily affect the outcome. Thus W might win over Y in a first
vote, and Y over V in a second vote, even though in an initial contest between W and V, the victory
might have gone to V.

The Analyst who works with Investment Committees must be aware of such possibilities: a difficult task
indeed. Such is the world of practice. In the world of theory no such dangers lurk. The Investor is
assumed to have transitive preferences which can, in principle, be graphed as a series of indifference
curves of the type shown in the figure. The Investor is indifferent among all combinations of expected
return and risk plotting on a single indifference curve (for example, X,Y and Z). He or she prefers any
combination on a curve that cuts the expected return axis at a higher point to any combination on a curve that cuts it
at a lower point. Thus W is preferred to X (or Y or Z or V), and X (or Y or Z) is preferred to V. The
joint task of the Investor and the Analyst is to put the former on the highest possible indifference curve.
This is shown below, with the red curve plotting the risk and return combinations available with
efficient portfolios.

In this case, point Y is optimal. Note that point E1 is inferior. Even though it represents an efficient
mean-variance combination, it puts the Investor on the lowest curve shown, points on which are inferior
to points on the middle curve, given his or her preferences. Of course, this Investor would prefer to be
on the highest curve shown, but this is impossible, given current resources and capital market
opportunities.

It would be convenient if each Investor would present an Analyst with a complete map of his or her
indifference curves. The Analyst could then recommend an asset mix virtually instantaneously. But the
task is never this simple. As we will see, indifference curves are a useful construct, but in practice the
Analyst generally focuses on only one portion of an Investor's entire indifference map -- which is just as
well.

Expected Utility
Mean-variance theory assumes that every Investor's utility function increases at a decreasing rate as
wealth increases and is independent of the state of the world in which wealth is received. Even for
Investors for whom this is the case, there are many possible relationships between utility and wealth. In
discrete terms, if v is a column vector of end-of-period values, pr a row vector of corresponding
probabilities, and u(v) the function relating the Investor's utility to end-of-period value (wealth), the
expected utility of the portfolio that provides v and pr is:

eu = pr*u(v)

In continuous terms, if pr(v) is a probability distribution over end-of-period value (wealth) and u(v) is
the Investor's utility function, the expected utility is the integral of u(v) weighted by pr(v).
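A small numerical illustration of the discrete case follows. The values, the probabilities and the square-root utility function are all hypothetical; the function was chosen only because it increases at a decreasing rate.

```matlab
v  = [8000; 10000; 12000; 14000];   % hypothetical end-of-period values (column vector)
pr = [0.1 0.2 0.3 0.4];             % hypothetical probabilities (row vector)
u  = @(w) sqrt(w);                  % hypothetical utility function

eu = pr*u(v);                       % expected utility of the portfolio
ev = pr*v;                          % expected value, for comparison
```

Because this utility function is concave, eu is smaller than u(ev): the Investor values the risky portfolio less than its expected value received for certain.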

As noted earlier, if all portfolio distributions are normal, it is possible to determine the expected utility of
each one, given an Investor's utility function. All mean-variance pairs that provide the same level of
expected utility will lie on a single indifference curve. The set of all such curves will form the Investor's
indifference map.

There will be a great many indifference curves in the map representing an Investor's preferences. None,
however, will intersect. To see why, consider a counterexample. Let points A and B lie on curve 1 and
points B and C on curve 2, which intersects curve 1 at point B. If curve 2 represents a higher level of
utility, then C is preferred to A. But B and C are equally desirable, as are B and A. The last two
statements imply that A and C are equally desirable, if the Investor's preferences are transitive. But this
contradicts the fact that C is preferred to A. Hence intersecting indifference curves make no sense.

The best investment policy will lie at a point on the efficient frontier at which an indifference curve is
tangent to (touches but does not intersect) the feasible region, as shown in the previous figure.

In this case, point Y is optimal. It provides the level of expected utility associated with indifference
curve XYZ. Note that the vertical intercept of this curve (point X) provides the same level of expected
utility. But it represents a certain (standard deviation = 0) outcome. Hence we can say that the optimal
combination is as desirable for this Investor as the amount X for certain. The latter is often termed the
certainty-equivalent of the selected point.

This interpretation provides a useful reformulation of the optimization problem:


Maximize the certainty-equivalent value for the Investor
in question.

Note that the value of the objective function in this formulation depends on both the characteristics of
the investment (its mean and variance) and the preferences of the Investor (his or her utility function).

Approximating an Investor's Utility Function


Some work in decision theory has attempted to elicit an individual's utility function via a series of
questions concerning choices under uncertainty. For example:

Would you rather have $10,000 for certain or a 50/50
chance of receiving $0 or $25,000?

What probability of receiving $10,000 is as good as
$5,000 for certain?

and so on.

Cognitive psychologists have shown that most individuals make choices in such situations that are
inconsistent with the hypothesis that they attempt to maximize the expected value of a utility function
that increases smoothly with wealth at a decreasing rate. Further, the choices presented to the individuals
in questions such as those shown above often involve outcomes far from those associated with likely
investment results. And in some cases "what if" situations may not be taken sufficiently seriously by the
respondent to elicit carefully considered choices.

An alternative approach concentrates on the Investor's preferences in the region in which the optimal
investment is likely to lie, then uses a specific form as a local approximation to his or her (potentially
more complex) preference function in that region.

The first step involves a kind of crude asset allocation study. A representative ("long run") risk-return
tradeoff is used to produce probability distributions of an outcome with meaning to the Investor. For
example, the Investor might be shown a separate probability distribution of projected real annual income
in retirement for each of five alternatives, say, A,B,C,D and E, -- each based on an efficient investment
strategy with a greater short-term risk and expected return than its predecessor. Each of the distributions
would be presented in its entirety or in part (with, for example, "likely", "poor" and "bad" outcomes
shown), depending on the best manner in which to communicate such information to the Investor in
question.

After studying the alternative distributions (however presented), the Investor picks one. If it lies at one
extreme (e.g. A or E), it may prove desirable to repeat the exercise with added alternatives extending the
set of strategies beyond the point chosen. Sooner or later, the Investor will select an "interior" alternative
-- say D. We do not know, of course, that this was the very best alternative, since some other strategy
between the two adjacent alternatives (e.g. C and E) could well have been preferred. In practice,
however, it is usually assumed that when an interior strategy is chosen, it was the best of all possibilities
-- an assumption that places considerable responsibility on the Analyst to provide an appropriate set of
alternatives.

If the chosen strategy is in fact the best, we know that one of the Investor's indifference curves is tangent
to the efficient frontier at the chosen point, as shown in the earlier diagram. Since we know the slope of
the frontier at the optimal point, we also know the slope of the indifference curve at that point. But we
do not know the slope of that indifference curve at other points nor the slopes of other indifference
curves at various points. To deal with this, we need to assume more about the Investor's preferences, at
least in the near neighborhood of the selected portfolio.

Negative Exponential Utility Functions


A particularly useful utility function for mean-variance analysis is the negative exponential. In
MATLAB notation:

u = 1 - exp(-(c*w))

where w is a measure of wealth and c is a positive parameter. In standard utility theory the argument (w)
is the absolute value of wealth at a future date. Some assume that such a function can be applied
repeatedly for one-period decisions on sequential dates. However, for purposes of portfolio theory it is
desirable to state utility in terms of return (the relative change in wealth over the future period):

u = 1 - exp(-(c*r))

While this is simply a linear transform of the wealth-based version for a single period, it implies different
behavior with respect to repeated one-period decisions, as we will see.

The figure below provides three examples of this function. We state return in percentage terms (e.g. 10.0
for an increase in wealth of 10%). As indicated, the utility associated with a return of zero is taken as
zero, although no change in behavior would be implied if a constant were added to each such function.
The flattest curve in the figure, shown in yellow, is based on a value of 0.04 for the parameter c. The
next flattest (red) curve is based on a c value of 0.05, and the steepest (green) curve a value of 0.06.

In each case utility increases at a decreasing rate, exhibiting Investor risk aversion. Moreover, the greater
the value of parameter c, the more curved the function and hence the more risk-averse the Investor in
question.

To see the effect of curvature (c) on risk aversion, we can compute the certainty-equivalent return for a
given distribution for our three Investors. Assume that Investment X offers a fifty percent chance of
obtaining a return of 10% and a fifty percent chance of breaking even (i.e. obtaining a return of 0%).
The expected utility of such a gamble will be:
eu = 0.5*(1-exp(-c*0)) + 0.5*(1-exp(-c*10))

We seek a return, rc, that will offer the same expected utility (which will, of course, be certain):
eu = 1-exp(-c*rc)

Combining the two equations produces the following relationship:


rc = -(1/c)*log(.5*exp(-c*0)+.5*exp(-c*10))

giving the following values for our three Investors:


c       rc
----    ------
0.04    4.5033
0.05    4.3814
0.06    4.2610

Even though the expected return from the investment is 5.0%, all three of these Investors will accept a
smaller amount for certain to give up the investment. However, the amounts differ. The Investor for
whom c=0.04 would be indifferent between the investment in question and 4.5033 percent for certain.
The second Investor would be willing to accept a lower certain return (4.3814 percent), reflecting
greater risk aversion. The third Investor, even more averse to risk, will accept even less (4.2610 percent)
in return for giving up the investment. The greater is parameter c, the greater is the Investor's risk
aversion.
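The certainty-equivalents in the table above can be reproduced with a few lines of code. What follows is a minimal sketch in Python, assuming the utility function u(r) = 1 - exp(-c*r) used above; the function name certainty_equivalent is introduced here for illustration only.

```python
import math

def certainty_equivalent(c, outcomes, probs):
    """Certain return with the same expected utility as the gamble,
    for the utility function u(r) = 1 - exp(-c*r)."""
    # Expected value of exp(-c*r) over the possible outcomes
    expected_exp = sum(p * math.exp(-c * r) for r, p in zip(outcomes, probs))
    # Invert 1 - exp(-c*rc) = 1 - expected_exp to solve for rc
    return -(1.0 / c) * math.log(expected_exp)

# Investment X: 50% chance of a 0% return, 50% chance of a 10% return
outcomes = [0.0, 10.0]
probs = [0.5, 0.5]

for c in (0.04, 0.05, 0.06):
    rc = certainty_equivalent(c, outcomes, probs)
    print(f"c = {c:.2f}   rc = {rc:.4f}")
```

Because the function accepts any list of outcomes and probabilities, the same sketch can be applied to gambles with more than two outcomes.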

The negative exponential utility function is especially convenient in a world of normally-distributed
outcomes. Recall that expected utility is the integral of the utility function using the probability
distribution as weights. If the former is negative exponential and the latter is normal, it will be the case
that expected utility will be a simple function of the mean and variance of the distribution:
eu = e - (v/t)

Here, e is the expected outcome, v is the variance of the outcome, and t equals (2/c), where c is the
parameter from the investor's utility function. For our three Investors:

c t
---- ----
0.04 50.0
0.05 40.0
0.06 33.3

Parameter t measures the Investor's risk tolerance. Not surprisingly, the greater an Investor's risk
aversion (c), the smaller is his or her risk tolerance (2/c).

If the probability distribution of returns is not normal, the expected utility of an investment for an
Investor with a negative exponential utility function is likely to differ somewhat from that given by the
simple mean-variance formula. For example, consider the investment with a 50/50 chance of returning
0% or 10%. It offers an expected return of 5%, a standard deviation of 5% and a variance of 25. The
distribution is far from normal. Nonetheless, the (e-v/t) formula provides good approximations even in
this case, as can be seen by comparing its values with the exact certainty-equivalents calculated earlier
for our three Investors:
t exact approx
---- ------ -----
50.0 4.5033 4.500
40.0 4.3814 4.375
33.3 4.2610 4.250

Although the formula provides only approximate values when the distribution of returns is non-normal,
it may nonetheless give very good approximations in such cases.
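The comparison in the table above can be checked numerically. The sketch below (Python, with variable names chosen here for illustration) computes the exact certainty-equivalent and the (e-v/t) approximation for the 50/50 investment returning 0% or 10%.

```python
import math

e = 5.0    # expected return (%): 0.5*0 + 0.5*10
v = 25.0   # variance: 0.5*(0-5)^2 + 0.5*(10-5)^2

rows = []
for c in (0.04, 0.05, 0.06):
    t = 2.0 / c                                   # risk tolerance
    # Exact certainty-equivalent under negative exponential utility
    exact = -(1.0 / c) * math.log(0.5 + 0.5 * math.exp(-c * 10.0))
    approx = e - v / t                            # mean-variance formula
    rows.append((t, exact, approx))
    print(f"t = {t:5.1f}   exact = {exact:.4f}   approx = {approx:.4f}")
```

In this example the gap between the two columns widens as risk tolerance falls, although it remains small for all three Investors.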

Of course these statements rely on the assumption that an Investor's utility function is in fact negative
exponential, which need not be the case. However, if we assume that in the region near the selected
initial point, the Investor's utility function can be adequately approximated by a negative exponential
function, we can continue to use (e-v/t) to measure the desirability of a portfolio for the Investor in
question. This could be termed a certainty-equivalent, as before. However, such a local approximation
to the Investor's utility function is unlikely to hold over a large enough region to make the true certainty-
equivalent equal this value. Hence we will term (e-v/t) the portfolio's expected utility or simply, its
portfolio utility:

pu = e - (v/t)

Inferring Investor Risk Tolerance


Portfolio utility depends on both portfolio characteristics and the risk tolerance of the Investor in
question. To emphasize this one could write:

pu(p,k) = e(p) - v(p)/t(k)

where e(p) is the expected value (or return) of portfolio p, v(p) is its variance, t(k) is Investor k's risk
tolerance, and pu(p,k) is the utility of portfolio p for investor k.

Portfolio utility is measured in the same units as e. Consider the set of all portfolios that provide a given
level of utility, say pux. All such portfolios must satisfy the equation:

pux = e(p) - v(p)/t(k)

or:

e(p) = pux + (1/t(k))*v(p)

In a diagram with mean (e) on the vertical axis and variance (v) on the horizontal axis, such an
indifference curve will plot as an upward-sloping straight line, the intercept of which will indicate the
associated portfolio utility. The slope of such a line indicates the rate at which the Investor is willing to
trade expected value (or return) for variance. But the slope is 1/t(k). Thus t(k), the reciprocal of this
slope, is the rate at which the Investor is willing to trade variance for expected return. Indeed, we can
define t(k) as Investor k's marginal rate of substitution of variance for expected value.

If an Investor's risk tolerance were the same at all points in a mean/variance diagram, his or her
indifference map would be a family of parallel lines, as shown below:

In fact it is unlikely that risk tolerance is constant over wide ranges of outcomes. However, we can infer
its level from the choice of a prototypical asset mix, then assume that it is constant in the near
neighborhood of the mean and variance of that mix.

Assume that the riskless rate of interest is 4% and that by investing in a diversified stock index portfolio
it is possible to obtain an expected excess return over the riskless investment of 6% per year, with a
standard deviation of 15% per year. If an amount x is invested in the stock index and an amount (1-x) in
the riskless asset the return will be:
R = x*Rs + (1-x)*rr

where Rs is the return on the stock index and rr is the return on the riskless asset.

The expected return of the combination will be:


e = x*Es + (1-x)*rr

where Es is the expected return on the stock index.

Since the riskless asset has no risk, its standard deviation of return is zero. Hence the standard deviation
of the combination will equal that of x*Rs. Examination of the formula for computing a standard
deviation shows that the standard deviation of a positive constant times a variable will equal the constant
times the standard deviation of that variable. Hence:

s = x*Ss

where Ss is the standard deviation of the return on the stock index.

The net result of these relationships is that a combination in which x is invested in the stock index and
(1-x) is invested in the riskless asset will lie at a point on the straight line that connects the points plotting
the two assets in mean/standard deviation space. If x = 0 the point will coincide with that of the riskless
asset. If x = 1, it will coincide with that of the stock index. If x is between 0 and 1 the point will plot on
the line between the two assets' points. If x is greater than one, it will plot on the extension of this line
above the point representing the risky asset. The figure below shows the relationship for the case in
question.

The portfolios shown in the diagram have the proportions:


Portfolio   x (%)   1-x (%)
---------   -----   -------
A               0       100
B              25        75
C              50        50
D              75        25
E             100         0

In this case the opportunity set (feasible region) is the straight line connecting the points. Since every
feasible combination is efficient, the efficient frontier is the same line. Presumably there are other
combinations of individual securities that lie below this line, but they have been excluded from the
analysis. In any event, the frontier in this case can be described by the equation:

e = 4 + (6/15)*s

Now consider the same relationship plotted in a mean/variance diagram:

e = rr + (6/15)*sqrt(v)

In terms of e and v, the equation is non-linear, as shown in the figure below:


A relationship that plots as a line in mean/standard deviation space will plot as a curve that increases at a
decreasing rate in a mean/variance diagram, as in this case.

Assume that Investor I has been presented with the implications of policies A,B,C,D and E for his or her
standard of living in retirement. After careful reflection, C was chosen. Assume moreover that a finer set
of choices (e.g. between B and D) was presented and C was again chosen (or that the Analyst is willing
to assume that such would have been the case had the further analysis been undertaken). In any event,
we know that one of Investor I's indifference curves is tangent to the frontier at point C and must
therefore have the same slope as the frontier at that point, as shown below:

In this case:
e = 4 + (6/15)*(v.^(1/2))

thus:

de/dv =(1/2)*(6/15)*(v.^(-1/2))

or:

de/dv = (1/2)*(6/15)*(1/s)

Since the chosen point has a standard deviation of 7.5, the slope of the frontier at that point is:

(1/2)*(6/15)*(1/7.5)

or

6/225

The reciprocal is, of course, the Investor's risk tolerance. We thus infer that:

t(k) = 225/6 = 37.5

Note that the riskless rate of interest does not appear in this calculation -- only the expected excess return
and standard deviation of the risky asset.
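The inference can be written as a small function. This is a sketch in Python under the assumptions of the example (a riskless asset combined with a single risky index); the name inferred_risk_tolerance is hypothetical.

```python
def inferred_risk_tolerance(excess_return, stock_sd, x):
    """Risk tolerance implied by choosing fraction x in the risky asset.

    The frontier is e = rr + b*sqrt(v) with b = excess_return/stock_sd,
    so de/dv = (1/2)*b/s and t = 1/(de/dv).
    """
    b = excess_return / stock_sd   # slope of the frontier in mean/sd space
    s = x * stock_sd               # standard deviation of the chosen mix
    slope = 0.5 * b / s            # de/dv at the chosen point
    return 1.0 / slope             # risk tolerance is the reciprocal

# Portfolio C: 50% in the stock index (excess return 6%, sd 15%)
t = inferred_risk_tolerance(6.0, 15.0, 0.50)
print(t)
```

The riskless rate is not an argument of the function, which mirrors the observation above that it plays no part in the calculation.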

For further analyses we may assume that risk tolerance is constant in the near neighborhood of this
point, as shown below:

While it is convenient to show the derivation of risk tolerance from a choice of investment in
mean/variance space, it is more intuitive to examine the situation in mean-standard deviation space, as in
the following figure.

In this more familiar setting the efficient frontier plots as a straight line and each indifference curve
increases at an increasing rate. Of course the point of tangency represents the same portfolio as before.
Here too, the indifference curve and efficient frontier have the same slope (in this case, de/ds) at the
chosen point.

In these diagrams we have shown only a limited range for the inferred indifference curve and for a few
others with the same risk tolerance -- all within the near neighborhood of the chosen efficient risk/return
combination. This was done to emphasize the fact that the Investor's actual indifference curves may
differ considerably in areas of the diagram that are far from the chosen point. However, for relatively
modest changes in the opportunity set (feasible region) it may be perfectly acceptable to assume that the
fitted indifference curves reflect the Investor's true preferences. In such instances one can simply search
the new opportunity set for a new point of tangency with the set of indifference curves with the same
risk tolerance as found in the initial study.

Risk tolerance and risk


If the efficient frontier is linear in mean/standard deviation space, there is a one-to-one mapping
between risk undertaken and risk tolerance, assuming efficient investment strategies are utilized. Let the
efficient frontier be:

e = a + b*s

then:

s = (e-a)/b

and

v = ((e-a)^2)/(b^2)

The rate at which v can be substituted for e along the frontier is thus:

dv/de = 2*(e-a)/(b^2)

For an investment strategy to be optimal, this must equal the investor's risk tolerance t:

t = 2*(e-a)/(b^2)
= (2*(e-a)/b)/b

But s=(e-a)/b. Thus:

t = (2*s)/b

and

s = (b/2)*t

It follows that, compared with an "average investor":


an Investor with twice the risk tolerance
should take twice the risk

an Investor with half the risk tolerance
should take half the risk

Similarly:

if Investor A optimally takes half the risk
taken optimally by Investor B, then A has
half as much tolerance for risk as B.

Note that this neat correspondence depends on the assumption that the efficient risk-return tradeoff is
linear. In practice this is often approximately true, so the relationship holds to at least a first
approximation.
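The proportionality between risk tolerance and optimal risk can be illustrated directly from s = (b/2)*t. The sketch below is in Python; the function name optimal_risk is introduced here for illustration, and the frontier slope is taken from the earlier example.

```python
def optimal_risk(t, b):
    """Optimal standard deviation on a linear frontier e = a + b*s
    for an Investor with risk tolerance t: s = (b/2)*t."""
    return 0.5 * b * t

b = 6.0 / 15.0                    # frontier slope from the earlier example
base = optimal_risk(37.5, b)      # the "average investor" of this sketch
double = optimal_risk(75.0, b)    # twice the risk tolerance
half = optimal_risk(18.75, b)     # half the risk tolerance
print(base, double, half)
```

Doubling or halving t doubles or halves the optimal standard deviation, whatever the value of the intercept a.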

Properties of Portfolio Utility


While portfolio utility indifference curves plot as parallel upward-sloping straight lines in mean/variance
space, they have a more familiar look in more intuitive mean/standard deviation ("return/risk") diagrams,
as we have seen.

The equation for all combinations of e and s that provide a given level of portfolio utility pux is:
e = pux + (s^2)/t

Note that e is a quadratic function of s, and that:

de/ds= 2*s/t

Hence such a curve increases at an increasing rate as s increases. Moreover, all such curves have the
same slope for a given value of s, as can be seen from the equation and inspection of the following
enlarged version of the area near the chosen point in the previous diagram:

In a later section we will show that efficient frontiers will always increase at a non-increasing rate in
mean/standard deviation space. More simply put, they will either be linear or increase at a decreasing
rate. Since our assumed indifference curves increase at an increasing rate, this assures that one and only
one efficient combination of risk and return will, in principle, be optimal in every case. It should,
however, be said that in practice Investors often find it difficult to make a single choice from among
alternative efficient combinations, indicating that preferences are not as neatly defined as our expected
utility theory would suggest.

Effects of Increases in Wealth


In a one-period analysis it seems natural to measure outcomes in terms of end-of-period portfolio value,
since this is a measure of wealth and, ultimately, consumption -- the source of utility. However,
mean/variance analysis is seldom used in a strictly one-period setting. Rather, one derives a one-period
portfolio utility function (more precisely, an Investor's one-period risk tolerance) from choices made in a
multi-period setting (for example, as part of an asset allocation study). Although the connection between
the formal one-period model and the selection of a multi-period strategy is somewhat inelegant, it is
important to recall the context in which mean/variance analyses are in fact performed.

For this reason it is generally preferable to cast mean/variance problems in terms of portfolio return, as
we have done. This makes it more likely that the Analyst can utilize the same risk tolerance from period
to period, at least until circumstances change in significant ways.

To see this, consider an Investor with $100 in year 1 who chooses an asset mix with 50% invested in a
riskless asset with a return of 4% and 50% invested in a stock index fund with an expected return of
10% and a standard deviation of 15%. Shown below are the opportunity set and the selected point in a
diagram based on ending value (wealth).

Now assume that in year 2 the Investor has a portfolio worth $110 and adds another $90 so that a total
of $200 is available for investment. The new opportunity set in terms of wealth is shown below. Note
that it has the same slope as the one for year 1.

We know that the optimal mix lies at a point at which one of the Investor's indifference curves has the
same slope as the opportunity set. But we have established that all indifference curves have the same
slope for a given standard deviation. Since the new opportunity set is parallel to the old one, the Investor
will choose the same standard deviation as before ($7.50). In this case, however, the associated mix will
have only 25% invested in the stock index fund, which has a dollar standard deviation of $30.0
($200*0.15). The dollar amount ($50) invested in the stock fund will, however, be the same as before.
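The arithmetic of this example can be sketched as follows (Python; the function name stock_position and the variable t_dollars are hypothetical). The dollar risk tolerance of 37.5 is the value implied by the year-1 choice, using t = 2*s/b with s = $7.50 and b = 0.06/0.15.

```python
def stock_position(wealth, t_dollars, excess_return, stock_sd):
    """Dollar amount and fraction held in stock when risk tolerance
    is constant in terms of end-of-period value.

    excess_return and stock_sd are per dollar invested (0.06 and 0.15).
    """
    b = excess_return / stock_sd        # frontier slope; independent of wealth
    dollar_sd = 0.5 * b * t_dollars     # optimal dollar standard deviation
    dollars = dollar_sd / stock_sd      # dollars in stock giving that risk
    return dollars, dollars / wealth

t_dollars = 37.5   # implied by choosing 50% stock with $100 in year 1
for wealth in (100.0, 200.0):
    dollars, fraction = stock_position(wealth, t_dollars, 0.06, 0.15)
    print(f"wealth = {wealth:.0f}  stock = ${dollars:.2f}  fraction = {fraction:.2%}")
```

The dollar amount in stock does not depend on wealth, so the fraction invested in stock falls as wealth rises.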

Investors with constant risk tolerance stated in terms of end-of-period value will exhibit constant
absolute risk aversion, keeping constant their absolute exposure to risky assets as wealth increases.
Since this results in a decrease in their relative exposure to such assets, they exhibit increasing relative
risk aversion. While some Investors might have preferences with such characteristics, most will have
greater risk tolerances, expressed in value terms, as their wealth increases.

Consider now a portrayal of Investor preferences in terms of portfolio return, with risk tolerance
indicating the Investor's willingness to trade variance of return for expected return. The figure below
shows the situation in year 1. It also shows the situation in year 2. In such a situation, an Investor with
constant risk tolerance expressed in terms of return would select the same relative mix of risky and
riskless assets, no matter what his or her wealth -- behavior consistent with constant relative risk
aversion.

The assumption of constant relative risk aversion seems much closer to the preferences of most investors
than does that of constant absolute risk aversion. Nonetheless, it is by no means guaranteed to reflect
every Investor's attitude. Some may wish to take on more risk (standard deviation of return) as their
wealth increases. Others may wish to take on less. Many Analysts counsel a decrease in such risk as one
ages. Some strategies are based on acceptance of more or less risk, based on economic conditions. And
so on.

For these and other reasons it is important to at least consider strategies in which an Investor's risk
tolerance (vis-a-vis one-period return) changes from time to time. However, such changes, if required at
all, will likely be far more gradual than those associated with a constant risk tolerance expressed in terms
of end-of-period value.

Henceforth, unless stated otherwise, when we use the term risk tolerance, we will mean the Investor's
marginal rate of substitution of the variance of one-period return for expected one-period return.
Similarly, we will assume that portfolio utility is based on the expected return, variance of return and
Investor risk tolerance based on one-period return. In cases involving zero-investment strategies we will
sometimes be forced to deal in value terms. However, in such situations we will derive a utility function
for end-of-period value from an assumed constant risk tolerance based on one-period return.

Risk-adjusted Expected Return



Some use the term risk-adjusted expected return or, more simply, risk-adjusted return, for the construct
we have termed portfolio utility. We illustrate with an example:

Expected return                  7.60   e
Risk (standard deviation)        9.00   sd
Variance                        81.00   v = sd^2
Risk Tolerance                  45.00   t
Risk Penalty                     1.80   v/t
Risk-adjusted Expected Return    5.80   e - v/t

The expected return is 7.60%, but the investment has a risk of 9%. The Investor in question thus
subtracts a risk penalty of 1.80% to obtain a risk-adjusted expected return of 5.80%. As stated earlier,
the risk penalty depends on both the portfolio's risk and on the Investor's tolerance for risk. Other things
equal, the greater the risk, the greater the risk penalty. And, other things equal, the greater the Investor's
risk tolerance, the smaller the risk penalty.

The Analyst's main task is to find the portfolio with the maximum risk-adjusted expected return
(portfolio utility) for the Investor in question. The figure below shows the relationship between (1)
portfolio utility for an Investor with a risk tolerance of 45 and (2) the percent invested in the stock index
fund in our previous example (a riskless return of 4% and a stock index fund with an expected return of
10% and a standard deviation of 15%). In this case it is optimal to invest 60% in the stock index fund
and 40% in the riskless asset. The characteristics of this solution are those shown in the preceding table.

Not surprisingly the "utility hill" plots as a quadratic function of the proportion invested in the risky
asset. The Analyst's goal is to place the Investor at the top of this hill.
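The "utility hill" can be traced numerically. The following Python sketch (function and variable names are illustrative only) evaluates portfolio utility on a grid of stock proportions and compares the best grid point with the closed-form maximum obtained by setting the derivative of pu with respect to x to zero.

```python
def portfolio_utility(x, rr, es, ss, t):
    """Portfolio utility pu = e - v/t for fraction x in the stock fund."""
    e = x * es + (1.0 - x) * rr     # expected return (%)
    v = (x * ss) ** 2               # variance of return
    return e - v / t

rr, es, ss, t = 4.0, 10.0, 15.0, 45.0

# Search the hill on a grid of 1% steps from 0% to 100% in stock
grid = [i / 100.0 for i in range(101)]
best_x = max(grid, key=lambda x: portfolio_utility(x, rr, es, ss, t))

# Closed form: d(pu)/dx = (es - rr) - 2*x*ss^2/t = 0
x_star = t * (es - rr) / (2.0 * ss ** 2)

print(best_x, x_star, portfolio_utility(x_star, rr, es, ss, t))
```

Both approaches place the top of the hill at 60% in the stock index fund, where portfolio utility is 5.80.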

10/24/12 Multi‑period Returns

Multi-period Returns

Contents:
Compounded Returns
Lognormal Distributions
Discounting Projected Values

Compounded Returns
While one-period returns may be normally distributed, this will generally not be the case for the value of
a portfolio many periods hence, due to the effects of compounding. If $1 is invested initially, the value
will be (1+r1) at the end of period 1, where r1 is the rate of return in the first period. If money is neither
withdrawn from nor added to the account so that (1+r1) is invested at the beginning of period 2, the
value at the end of period 2 (v2) will be (1+r1)*(1+r2), where r2 is the rate of return in the second
period:

v2 = (1+r1)*(1+r2)

Equivalently:

v2 = 1 + r1 + r2 + r1*r2

The final term reflects the effects of compounding. If r1 and r2 are normally distributed, r1*r2 will not
be, and hence neither will v2.

Assume that in each period there is a 0.50 probability that $1 will become (e+sd) and a 0.50 probability
that it will become (e-sd). For example, if e = 1.10 and sd=0.15:


Note that this is not a normal distribution, but it is symmetric, with an expected value relative of e and a
standard deviation of sd.

With compounding, the ending values after two periods for an initial investment of $1 would be those
shown in the following diagram:

A somewhat different representation shows the one-period value relatives at each node in the diagram,
rather than the cumulative values:


In this case the one-period value relative distribution is the same at all points in the diagram: neither the
ending values nor the probabilities associated with the branches change through time or depend on prior
outcomes. Such returns are said to be independent and identically distributed (iid, for short). Since the
distribution of possible one-period returns looks the same in such a situation, no matter what has
happened in the past, returns can be said to follow a random walk.

Note that this type of tree "folds back" on itself, so that there are only three distinct outcomes:
(e+sd)*(e+sd) : probability = 0.25
(e+sd)*(e-sd) : probability = 0.50
(e-sd)*(e-sd) : probability = 0.25

Expanding:
e^2 + 2*e*sd + sd^2 : probability = 0.25
e^2 - sd^2 : probability = 0.50
e^2 - 2*e*sd + sd^2 : probability = 0.25

The expected ending value is found by weighting each outcome by its probability. In this case it will be:

ev2 = e^2

Perhaps not surprisingly, the two-period expected value is simply the one-period expected value relative
squared.

There is more to be said, however. Consider the distribution of the ending values:


Note that the distribution is not symmetric, since the largest value is farther to the right of the most likely
value than the smallest value is to the left of the most likely value. The distribution is skewed to the right.
Note also that 1.1875 (=0.95*1.25), the most likely outcome, known also as the mode, is smaller than
the expected outcome (1.1^2=1.21).
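The two-period distribution can be enumerated directly. This is a minimal Python sketch using the example values e = 1.10 and sd = 0.15; the variable names are chosen here for illustration.

```python
from itertools import product

e, sd = 1.10, 0.15
moves = [(e + sd, 0.5), (e - sd, 0.5)]   # one-period value relatives

# Accumulate the probability of each distinct two-period ending value
dist = {}
for (v1, p1), (v2, p2) in product(moves, repeat=2):
    value = round(v1 * v2, 10)           # tidy floating-point noise in the keys
    dist[value] = dist.get(value, 0.0) + p1 * p2

expected = sum(value * prob for value, prob in dist.items())
mode = max(dist, key=dist.get)           # most likely ending value

for value in sorted(dist):
    print(f"{value:.4f}  probability = {dist[value]:.2f}")
print(f"expected = {expected:.4f}  mode = {mode:.4f}")
```

The up-down and down-up paths produce the same ending value, which is why the tree has only three distinct outcomes.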

In our two-period case the most likely outcome (e+sd)*(e-sd) is also the median outcome: the
probability of a smaller value is equal to the probability of a larger value. It is thus of considerable
interest. Its value is:
e^2 - sd^2

which is equal to the expected one-period value relative squared minus the one-period variance.

It is convenient to translate this ending value into a "what if" value called, in some contexts, the
geometric mean -- the return per period which, if obtained with no variance, would have produced the
same ending value. Here:

(1+g)^2 = e^2 - sd^2

or:
(1+g)^2 = (1+er)^2 - sd^2

where er is the one-period expected return. In this case:


(1+g)^2 = (1 + .10)^2 - 0.15^2 = 1.21 - .0225 = 1.1875

and
1 + g = sqrt(1.1875) = 1.0897

Thus g = 8.97%, which is less than 10.0%, the one-period expected return.

While this expression is perfectly usable, practitioners often adopt a simpler approximation. Expanding
the squared expressions gives:
1 + 2*g + g^2 = 1 + 2*er + er^2 - sd^2

or:

er - g = (g^2 - er^2)/2 + (sd^2)/2


Since er and g are generally significantly less than one (e.g. 0.10 and 0.09), both er^2 and g^2 will be
even smaller (e.g. 0.0100 and 0.0081). Moreover, half the difference between g^2 and er^2 will be even
smaller yet (e.g. -0.00095). Hence it will be approximately true that:
er - g = (sd^2)/2

or
g = er - (sd^2)/2

For example, if er = 0.10 and sd = 0.15, then:


g = 0.10 - 0.0225/2 = 0.10 - 0.01125 = 0.08875

or 8.875%, only slightly different from the more precise estimate of 8.97%.
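The exact and approximate expressions can be compared for any pair of inputs. The sketch below is in Python; the two function names are hypothetical.

```python
import math

def geometric_mean_exact(er, sd):
    """Solve (1+g)^2 = (1+er)^2 - sd^2 for g."""
    return math.sqrt((1.0 + er) ** 2 - sd ** 2) - 1.0

def geometric_mean_approx(er, sd):
    """Practitioners' approximation g = er - (sd^2)/2."""
    return er - sd ** 2 / 2.0

er, sd = 0.10, 0.15
g_exact = geometric_mean_exact(er, sd)
g_approx = geometric_mean_approx(er, sd)
print(f"exact = {g_exact:.4%}  approx = {g_approx:.4%}")
```

Trying other values of er and sd gives a sense of how the quality of the approximation varies with the size of the inputs.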

If the return on a diversified stock market portfolio is assumed to be iid with a standard deviation of 15%
per year, the median long-term return (g) will be approximately 1.125% ((0.15^2)/2) below the expected
one-period return (er). If the standard deviation of return were 20%, the difference would be 2.0%
((0.20^2)/2). And so on.

The geometric mean return will be less than the expected return (sometimes termed the arithmetic
mean), as long as there is some variation in returns. Moreover, the difference between the geometric and
arithmetic means will be greater, the greater the amount of such variance.

What about longer periods? Consider the ending value of a portfolio n periods hence, where n is an
even number. The most likely and median outcome will have n/2 "up moves" and n/2 "down moves".
Hence, the n-period median ending value (evn) will be:
evn = ((e+sd)^(n/2))*((e-sd)^(n/2))
    = ((e+sd)*(e-sd))^(n/2)
    = (e^2 - sd^2)^(n/2)

The geometric mean will be the value that satisfies:


(1+g)^n = (e^2 - sd^2)^(n/2)

or:
((1+g)^2)^(n/2) = (e^2 - sd^2)^(n/2)

or:
(1+g)^2 = e^2 - sd^2

which is precisely the relationship found earlier.

Lognormal Distributions
It is common in asset allocation studies to assume that returns are independent and identically
distributed. This has important implications for the distribution of long-term returns.

Let v1,v2,...,vn be the value relatives for a portfolio in periods 1,2,...,n, respectively. Assuming an initial
investment of $1 with periodic compounding and no withdrawals or additional investments, the ending
value in period n will be:

evn = v1*v2*...*vn

Now, take the logarithm of each side:


ln(evn) = ln(v1*v2*...*vn)

or:
ln(evn) = ln(v1) + ln(v2) + ... + ln(vn)

Ex ante, each of the variables on the right-hand side is unknown. Each will be drawn from a distribution
(that of ln(v)) and each draw will, by assumption, be independent of every other draw.

Recall the central limit theorem, which holds that the sum of a set of independent random variables will
have a distribution that will be closer and closer to normal, the greater the number of variables in the
sum. For a sufficiently large value of n, ln(evn) will be normally distributed, or nearly so.

We say that variable x has a lognormal distribution if the distribution of ln(x) is normal. Thus long-term
compounded values tend to be lognormally distributed if returns are independent. Note that this result
follows, no matter what the distributions of one-period returns may be, as long as the returns are
independent.

The figures below show distributions of ln(evn) and evn when evn is lognormally distributed.


Note that the distribution of evn is skewed to the right, due to the relationship between evn and ln(evn)
and the symmetry of the distribution of the latter. Note also that m -- the modal and median value of evn --
will be less than e, the expected value.

To fix ideas, consider the two-period example discussed earlier. The one-period value relatives are:

e+sd with probability = 0.50
e-sd with probability = 0.50

In logarithmic terms:

ln(e+sd) with probability = 0.50
ln(e-sd) with probability = 0.50

The expected logarithm is thus:

0.50*ln(e+sd) + 0.50*ln(e-sd)
= 0.50*(ln(e+sd)+ln(e-sd))
= 0.50*ln((e+sd)*(e-sd))
= 0.50*ln(e^2 - sd^2)

Let ln(1+g) represent this mean (for reasons that will become clear shortly). Then:
ln(1+g) = 0.50*ln(e^2 - sd^2)

and:
(1+g)^2 = e^2 - sd^2

which is the formula obtained earlier for the geometric mean.

We know that ln(1+g) is the mean of the distribution of ln(ev1). It follows that n*ln(1+g) is the mean of
the distribution of ln(evn). Since the modal (median) ending value will equal the exponential of the
mean value of the logarithm:
median(evn) = (1+g)^n

Thus the median outcome will equal the value obtained by compounding each period at the geometric
mean rate of return. There is a 50% chance that the actual value will exceed this amount and a 50%
chance that it will fall below it.
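For the two-outcome example used earlier, these statements can be checked exactly rather than by appeal to the central limit theorem. The Python sketch below builds the n-period binomial distribution of ending values; the function name is hypothetical and n = 10 is an arbitrary even number chosen for illustration.

```python
from math import comb

def ending_value_distribution(e, sd, n):
    """Exact distribution of the n-period ending value of $1 when each
    period's value relative is e+sd or e-sd with probability 0.5."""
    dist = []
    for ups in range(n + 1):
        value = (e + sd) ** ups * (e - sd) ** (n - ups)
        prob = comb(n, ups) * 0.5 ** n
        dist.append((value, prob))
    return dist          # values increase with the number of up moves

e, sd, n = 1.10, 0.15, 10      # n is assumed even
dist = ending_value_distribution(e, sd, n)

mean = sum(value * prob for value, prob in dist)
median = dist[n // 2][0]                       # n/2 up moves, n/2 down moves
below = sum(prob for value, prob in dist[: n // 2])
above = sum(prob for value, prob in dist[n // 2 + 1 :])

g = (e ** 2 - sd ** 2) ** 0.5 - 1.0            # geometric mean return
print(f"mean = {mean:.4f}  e^n = {e ** n:.4f}")
print(f"median = {median:.4f}  (1+g)^n = {(1.0 + g) ** n:.4f}")
print(f"P(below median) = {below:.4f}  P(above median) = {above:.4f}")
```

The expected ending value compounds at the arithmetic mean, the median compounds at the geometric mean, and the probabilities of ending below and above the median are equal.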

In some cases it is necessary to determine the moments of the distribution of the logarithm of a
lognormally-distributed value from those of the value itself or vice-versa. The formulas for doing so are
slightly complicated but easily computed. Assume that log(y) is normally distributed with mean el and
standard deviation sl. Then the mean (e), variance (v) and standard deviation (s) of y will equal:

e = exp ( el + ( ( sl^2 ) / 2 ));
v = exp ( 2*el + ( sl^2 ) ) * ( exp (sl^2) - 1);
s = sqrt ( v );

where:

exp ( z ) = the exponential of z (that is, e raised to the z'th power)

If the mean and variance of y are known, the moments for the distribution of log(y) can be found by
sequentially evaluating the equations below, where b is the standard deviation (sl) and a is the mean (el)
of log(y):

b = sqrt ( log ( ( v / (e^2) ) + 1) );
a = 0.5 * log ( (e^2) / exp(b^2) );

where:

log ( z ) = the natural logarithm of z
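The two sets of formulas are inverses of one another, which can be confirmed with a round trip. This is a sketch in Python; the function names and the input values (0.08 and 0.14) are illustrative and are not taken from the text.

```python
import math

def moments_from_log(el, sl):
    """Mean and standard deviation of y, given that log(y) is normal
    with mean el and standard deviation sl."""
    e = math.exp(el + sl ** 2 / 2.0)
    v = math.exp(2.0 * el + sl ** 2) * (math.exp(sl ** 2) - 1.0)
    return e, math.sqrt(v)

def log_moments(e, s):
    """Mean and standard deviation of log(y), given the mean e and
    standard deviation s of a lognormally distributed y."""
    sl = math.sqrt(math.log(s ** 2 / e ** 2 + 1.0))
    el = 0.5 * math.log(e ** 2 / math.exp(sl ** 2))
    return el, sl

# Round trip with illustrative values
el, sl = 0.08, 0.14
e, s = moments_from_log(el, sl)
el_back, sl_back = log_moments(e, s)
print(e, s, el_back, sl_back)
```

The standard deviation of log(y) must be computed first, since the expression for the mean of log(y) depends on it.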

Discounting Projected Values


In corporate finance and investment practice it is common to project a set of cash flows, then discount
them using an appropriate cost of capital or discount rate. If the resulting value is less than the cost of
the investment, it is rejected. If the value exceeds the cost, the investment is accepted. Key to the validity
of such a procedure is the choice of an appropriate cost of capital or discount rate.

We will not attempt a complete discussion of this topic, but it is useful to analyze the arguments for
using a geometric mean vis-a-vis an arithmetic mean for such purposes.

Consider our example in which a standard market investment produces a return of (e+sd) with
probability 0.50 and a return of (e-sd) with probability 0.50 in each period. The expected cost of capital
for such an investment is e, while the geometric mean is given by:
(1+g)^2 = e^2 - sd^2

Now consider a project that is expected to make a payment two periods hence of:
(e+sd)*(e+sd) with probability 0.25
(e+sd)*(e-sd) with probability 0.50
(e-sd)*(e-sd) with probability 0.25

We know that such a project is worth $1 since its payments can be replicated in the market for this
amount.

In practice those charged with assessing the project will be asked to produce a single set of cash flows
over time (in this case, one number for the ending cash flow). A discount rate will then be used to
compute the present value.

If the project's cash flow is implicitly or explicitly estimated by taking all possibilities into account as
well as the associated probabilities, the result will be equivalent to an expected value -- in this case, e^2.
Clearly, such an estimate should be discounted using the expected return (arithmetic mean). Here:
(e^2)/(e^2) = 1

which is the correct present value. This is the method advocated by many who have addressed the issue.
However, the argument for using the expected return as a discount rate assumes that the projection
process takes into account all possible future cash flows and the accompanying probabilities. In many
cases a much simpler approach is utilized. Imagine a situation in which only the most likely (or "50/50")
outcome was considered. In our example, the projected cash flow would then be:
(e+sd)*(e-sd) = (1+g)^2

If this were discounted using the expected cost of capital, the resultant value would be less than $1 --
clearly a wrong answer. The correct value would be obtained by discounting with the geometric mean:
((1+g)^2)/((1+g)^2) = 1
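The three cases can be compared numerically. The Python sketch below uses the example values e = 1.10 and sd = 0.15; the variable names are chosen here for illustration.

```python
e, sd = 1.10, 0.15

expected_flow = e ** 2                 # probability-weighted projection
median_flow = (e + sd) * (e - sd)      # "most likely" projection

arithmetic_factor = e ** 2             # two periods at the expected return
geometric_factor = e ** 2 - sd ** 2    # (1+g)^2: two periods at the geometric mean

pv_expected_arith = expected_flow / arithmetic_factor
pv_median_arith = median_flow / arithmetic_factor
pv_median_geo = median_flow / geometric_factor

print(f"expected flow, arithmetic rate: {pv_expected_arith:.4f}")
print(f"median flow, arithmetic rate:   {pv_median_arith:.4f}")
print(f"median flow, geometric rate:    {pv_median_geo:.4f}")
```

Only the pairings that match the projection to the discount rate recover the $1 value; discounting the median projection at the expected return understates it.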

In practice cash flows are projected for many different periods. Moreover, the assumptions utilized to
make such projections are often highly implicit. Those making projections may even adjust their
estimates to assure a particular outcome if the "hurdle rate" (cost of capital) is known beforehand. Thus
the nature of the overall process must be known before a "theoretically correct" procedure can be
determined. In some cases an expected return may be the appropriate discount rate, but in many instances a
geometric mean (median return) may provide more correct results.

10/24/12 Portfolio Characteristics

Portfolio Characteristics

Contents:
Portfolio Expected Return
Portfolio Risk
Covariance
Correlation
Interpreting Correlation Coefficients
Exponentially Weighted Covariances
Function wcov
Weighted Statistics Worksheet
Portfolio Covariances
Asset Covariances with a Portfolio
Marginal Risks

Portfolio Expected Return


Thus far we have dealt with portfolios of at most two assets, with only one involving any risk. It is time
to turn to the general relationship between the characteristics of a portfolio and the characteristics of its
components.

Let there be n assets and s states of the world, with R an {n*s} matrix in which the element in row i and
column j is the return (or value) of asset i in state of the world j. Here is an example with n=3 and s=4:

         Good   Fair   Poor    Bad
Asset1      5      5      5      5
Asset2     10      8      6     -5
Asset3     25     12      2    -20

Let x be an {n*1} vector of asset holdings in a portfolio. For example:

x
Asset1 0.20
Asset2 0.30
Asset3 0.50

What will be the return of the portfolio in each of the states? This is easily computed. The {1*s} vector
of portfolio returns in the states (rp) will be:
rp = x'*R

Here:
       Good   Fair   Poor     Bad
rp    16.50   9.40   3.80  -10.50

Now, let p be an {s*1} vector of the probabilities of the various states of the world. In this case:
p
Good 0.40
Fair 0.30
Poor 0.20
Bad 0.10

The expected return (or value) of the portfolio will be:


ep = rp*p

In this case:
ep = 9.13

It is useful to write the expression for expected return in terms of its fundamental components:

ep = x'*R*p

The product of the three terms can be computed in either of two ways. Above, we computed x'*R, then
multiplied the result by p. Alternatively, we could have multiplied x' by the result obtained by
multiplying R times p:
ep = x'*(R*p)

The parenthesized expression is an {n*1} vector in which each element is the expected return (or value)
of one of the n securities. Let e be this vector:
e = R*p

Here:
e
Asset1 5.00
Asset2 7.10
Asset3 12.00

Using these results we may write:


ep = x'*e

That is, the expected return (or value) of a portfolio is equal to the product of the vector of its asset
holdings and the vector of asset expected returns (or values). This is the case whether the returns are
discrete, as in this derivation, or continuous (that is, drawn from continuous distributions).
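
The computations in this section can be reproduced with a few statements. The sketch below uses the example values of R, x and p given above; the names ep1 and ep2 are introduced here only to compare the two orders of multiplication.

```matlab
% Returns: rows are assets, columns are states (Good, Fair, Poor, Bad).
R = [ 5   5   5    5;
     10   8   6   -5;
     25  12   2  -20];
x = [0.20; 0.30; 0.50];          % portfolio holdings, {n*1}
p = [0.40; 0.30; 0.20; 0.10];    % state probabilities, {s*1}

rp  = x'*R;        % portfolio return in each state, {1*s}
ep1 = (x'*R)*p;    % expected portfolio return, first grouping
e   = R*p;         % asset expected returns, {n*1}
ep2 = x'*e;        % expected portfolio return, second grouping

disp(rp);
disp(e');
fprintf('ep = %.2f (first grouping), %.2f (second grouping)\n', ep1, ep2);
```

Both groupings give the same expected return, as the text states.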

The units utilized for the values in vectors x and e will depend on the application. In some cases,
physical units (e.g. shares) may be appropriate for x; in others, values (e.g. dollars); and in yet others,
proportions of total value. Whatever the units selected, to find the end-of-period value of a portfolio, the
end-of-period values per unit of exposure should be placed in vector e and the number of units of each
asset held placed in vector x. To find the expected return (or value-relative) for a portfolio, multiply the
expected returns (or value-relatives) in vector e by the exposures to the assets in vector x.

Whatever the application, the relationship between the expected outcome of a portfolio and the expected
outcomes for its components is relatively simple and intuitive. For example, the expected return on a
portfolio is a weighted average of the expected returns on its components, with the proportionate values
used as weights. Since the relationship is linear, the marginal effect on portfolio expected return of a
small change in the exposure to a single component will equal its expected outcome:

d(ep)/d(x(j)) = e(j)

If the expected outcome were the only relevant characteristic of a portfolio, it would be easy to make
investment decisions. But risk is also relevant. And, as we will see, its determination presents a more
substantial challenge.

Portfolio Risk
For present purposes we will use as a measure of portfolio risk the standard deviation of the distribution
of its one-period return or the square of this value, the variance of returns.

By definition, the variance of a portfolio's return is the expected value of the squared deviation of the
actual return from the portfolio's expected return. It depends, in turn, on the possible asset returns (R),
the probability distribution across states of the world (p) and the portfolio's composition (x). The
relationship is, however, somewhat complex.

To begin, it is useful to create a matrix of deviations of security returns from their expectations. This can
be accomplished by subtracting from each security return the corresponding expectation:

d = R - e*ones(1,s)

The result (d) shows the deviation (surprise) associated with each security in each of the states of the
world. Here:
          Good    Fair    Poor     Bad
Asset1    0.00    0.00    0.00    0.00
Asset2    2.90    0.90   -1.10  -12.10
Asset3   13.00    0.00  -10.00  -32.00

The deviation (surprise) associated with the portfolio in each of the states of the world can be obtained
by multiplying the transpose of the composition vector times the asset deviation matrix:

dp = x'*d

In this case:
      Good   Fair    Poor     Bad
dp    7.37   0.27   -5.33  -19.63

To determine the variance of the portfolio, we wish to take a probability-weighted sum of the squared
deviations. A simple way to do so uses MATLAB's element-by-element ("dot") operators, in which elements
are treated one by one:
vp = sum(p'.*(dp.^2))

However, there is a more elegant and (as will be seen) far more useful way to do the computation. First,
we create an {s*s} matrix with the state probabilities on the main diagonal and zeros elsewhere. This can
be done in one statement:
P = diag(p);

In this case, P will be:


        Good   Fair   Poor    Bad
Good    0.40   0.00   0.00   0.00
Fair    0.00   0.30   0.00   0.00
Poor    0.00   0.00   0.20   0.00
Bad     0.00   0.00   0.00   0.10

The variance of the portfolio is then given by a more conventional matrix expression:

vp = dp*P*dp'

For our portfolio:


vp = 65.9641

and
sdp = sqrt(vp)
= 8.1218
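
As a check, the two ways of computing portfolio variance can be run side by side. This sketch repeats the example inputs so that it stands alone; vp1 and vp2 are names introduced here for the element-by-element and matrix versions respectively.

```matlab
R = [ 5   5   5    5;
     10   8   6   -5;
     25  12   2  -20];
x = [0.20; 0.30; 0.50];
p = [0.40; 0.30; 0.20; 0.10];
s = size(R, 2);                  % number of states

e  = R*p;                        % asset expected returns
d  = R - e*ones(1, s);           % asset deviations, {n*s}
dp = x'*d;                       % portfolio deviations, {1*s}

vp1 = sum(p'.*(dp.^2));          % element-by-element version
P   = diag(p);                   % probabilities on the main diagonal
vp2 = dp*P*dp';                  % matrix version
sdp = sqrt(vp2);

fprintf('vp = %.4f and %.4f, sdp = %.4f\n', vp1, vp2, sdp);
```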

To see why the latter procedure for computing variance is more useful, we substitute the vectors used to
compute dp:
vp = (x'*d)*P*(x'*d)'

There is an easier way to write the last portion. Remember that the transpose operation turns a matrix on
its side. From this it follows that:
(a*b)' = b'*a'

For example, let a be a {ra*c} matrix and b a {c*rb} matrix. Then (a*b) is {ra*rb} and (a*b)' is
{rb*ra}. Now consider the expression to the right of the equal sign. The first term (b') is of dimension
{rb*c}, while the second is of dimension {c*ra}. Their product will thus be of dimension {rb*ra}.
Since each element will represent the sum of the same set of products as in the result produced by the
expression on the left, the resulting matrices will in fact be the same.

We can use this result to note that:


(x'*d)' = d'*x''

But two transpose operations will simply turn a matrix on its side, then turn it back, giving the original
matrix. Therefore:
(x'*d)' = d'*x

And the expression for portfolio variance can be written as:


vp = (x'*d)*P*(d'*x)

Because matrix multiplication is associative, the products can be grouped in any desired way. For example:
vp = x'*(d*P*d')*x

The parenthesized term is of great importance in portfolio analysis -- enough to warrant its own section
in this exposition.

Covariance
The matrix described in the previous section is termed the covariance matrix for the assets in question.
Each of its elements is said to measure the covariance between the corresponding assets. Using C to
represent the covariance matrix:
C = d*P*d'

In this example:
         Asset1   Asset2   Asset3
Asset1     0.00     0.00     0.00
Asset2     0.00    18.49    56.00
Asset3     0.00    56.00   190.00

The variance of a portfolio depends on the portfolio's composition (x) and the covariance matrix for the
assets in question:

vp = x'*C*x

which of course gives the same value found earlier (65.9641).
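
The covariance matrix and the resulting portfolio variance can be verified directly. The sketch below again repeats the example inputs so that it can be run on its own.

```matlab
R = [ 5   5   5    5;
     10   8   6   -5;
     25  12   2  -20];
x = [0.20; 0.30; 0.50];
p = [0.40; 0.30; 0.20; 0.10];
s = size(R, 2);

e = R*p;                   % asset expected returns
d = R - e*ones(1, s);      % asset deviations
C = d*diag(p)*d';          % covariance matrix, {n*n}
vp = x'*C*x;               % portfolio variance

disp(C);
fprintf('vp = %.4f\n', vp);
```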

Well and good. But what do the covariance numbers mean? How are we to interpret the fact that the
covariance of Asset2 with Asset3 is 56.00, while that of Asset3 with itself is 190.00, and so on?

Examination of the matrices involved in the computation of C provides the answer. Recall that
C=d*P*d'. Consider the covariance of Asset2 and Asset3. It uses the information in row 2 of matrix d
and that in column 3 of matrix d' (the latter is, of course, also in row 3 of matrix d). It also uses the
vector of probabilities along the diagonal of matrix P. The net result, written in a slightly casual notation,
is that:
C(2,3) = sum(d(2,s)*p'(s)*d(3,s))

where the sum is taken over the states of the world.

As this expression shows, the covariance between two assets is a probability-weighted sum of the
product of their deviations. To verify this we can adapt the expression above to make it legal in
MATLAB:
c23 = sum(d(2,:).*p'.*d(3,:))

The answer is 56.00, precisely equal to the value in the second row and third column of the covariance
matrix.

Put in terms of prospective results: the covariance between two assets is the expected value of the
product of their deviations from their respective expected values. It immediately follows that the
covariance of asset i with asset j is the same as the covariance of asset j with asset i. Thus the matrix is
symmetric around its main diagonal -- note that the value in row 2, column 3 is the same as that in row
3, column 2. It also follows from the expression for covariance that the covariance of an asset with itself
is its variance. The asset variances thus lie on the main diagonal of the covariance matrix. In this case:
va = diag(C)

Here:
va
Asset1 0.00
Asset2 18.49
Asset3 190.00

The asset standard deviations are of course the square roots of these numbers:
sda = sqrt(diag(C))

In this case:
sda
Asset1 0.00
Asset2 4.30
Asset3 13.78

Note that the first asset's return is certain. Hence its variance and standard deviation are zero. The
second asset is risky, with a standard deviation of 4.30. The third is considerably more risky, with a
standard deviation of 13.78.
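
The element-by-element interpretation of covariance, the symmetry of the matrix and the extraction of variances and standard deviations can all be confirmed in a short sketch. The name asym is introduced here only to measure departure from symmetry.

```matlab
R = [ 5   5   5    5;
     10   8   6   -5;
     25  12   2  -20];
p = [0.40; 0.30; 0.20; 0.10];
s = size(R, 2);

d = R - (R*p)*ones(1, s);           % asset deviations
C = d*diag(p)*d';                   % covariance matrix

c23  = sum(d(2,:).*p'.*d(3,:));     % covariance of Asset2 with Asset3
asym = max(max(abs(C - C')));       % zero (to rounding) if C is symmetric
va   = diag(C);                     % asset variances
sda  = sqrt(va);                    % asset standard deviations

fprintf('c23 = %.2f, C(2,3) = %.2f, C(3,2) = %.2f\n', c23, C(2,3), C(3,2));
disp([va sda]);
```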

Since the covariance matrix includes asset variances along the main diagonal, the entire matrix is
sometimes termed a variance-covariance matrix. For brevity we will use the simpler term covariance
matrix, but it should be remembered that the diagonal elements are both covariances and variances.

For the special case in which the probability of each state is the same, it is possible to compute the
covariance matrix more simply using the standard MATLAB function cov. However, the function
assumes that the inputs represent a sample of observations drawn from a larger population and hence
adjusts the values in the matrix upwards to offset the bias associated with measuring deviations from a
fitted mean. In effect, each value produced by the MATLAB function cov will equal the one given by
our formulas (with equal probabilities) times (s/(s-1)), where s is the number of states (observations).

To use the cov function, simply provide the matrix of observations, with each row representing a
different observation (state) and each column a different asset class. For example, if the returns in our
{n*s} matrix R were historic observations and we were willing to assume that they were equally
probable we could compute:
C = cov(R')

which would give:

         Asset1   Asset2   Asset3
Asset1     0.00     0.00     0.00
Asset2     0.00    44.92   122.58
Asset3     0.00   122.58   360.92

These values are, of course, quite different from those found earlier, due to both the assumption of equal
probabilities and the correction for bias.
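
The s/(s-1) relationship can be confirmed by applying our formulas with equal probabilities and comparing the result with cov. In this sketch the names pEq, Cpop and Csample are introduced only for the comparison.

```matlab
R = [ 5   5   5    5;
     10   8   6   -5;
     25  12   2  -20];
s = size(R, 2);

pEq  = ones(s, 1)/s;                  % equal state probabilities
dEq  = R - (R*pEq)*ones(1, s);        % deviations from equal-weighted means
Cpop = dEq*diag(pEq)*dEq';            % our formula, no bias adjustment

Csample = cov(R');                    % rows = observations, columns = assets

disp(Csample);
disp(Cpop*s/(s-1));                   % should match Csample
```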

With this aside completed, we return to our forward-looking example.

Correlation
It is relatively easy to find a meaning for the elements on the main diagonal of the covariance matrix.
But what of the remaining ones? How can one interpret the fact that the covariance of Asset2 with
Asset3 is 56.00?

The solution is to scale each covariance by the product of the standard deviations of the associated
assets. The result is the correlation coefficient for the two assets, usually denoted by the Greek letter rho:
rho(i,j) = C(i,j)/(sda(i)*sda(j))

The matrix of correlation coefficients is termed (unimaginatively) the correlation matrix. We denote it
Corr. To compute it, we compute a matrix containing the products of the asset standard deviations:

sda*sda':

         Asset1   Asset2   Asset3
Asset1     0.00     0.00     0.00
Asset2     0.00    18.49    59.27
Asset3     0.00    59.27   190.00

We need to divide each element in the covariance matrix by the corresponding element in this matrix.
This can be done in one equation:
Corr = C./(sda*sda')

Giving:
         Asset1   Asset2   Asset3
Asset1      NaN      NaN      NaN
Asset2      NaN     1.00     0.94
Asset3      NaN     0.94     1.00

Notice that the elements associated with asset pairs in which one of the assets is riskless are NaN (not a
number), since they involve an attempt to divide zero (the covariance) by zero (the product of two
standard deviations, one of which is zero).

While the correlation of two assets, one of which is riskless, is not really a number, it sometimes proves
helpful to set it to zero. This can be accomplished by adjusting the matrix of the cross-products of the
standard deviations to have ones in the cells for which the true value is zero. A simple way to do this is
to add to the original matrix a matrix with 1.0 in such positions. Since "true" is represented in
MATLAB as 1.0, a single matrix expression does the job. Here is a set of statements that accomplishes
the objective:

z = sda*sda';
z = z + (z==0);
CC = C./z;

where CC is the desired correlation matrix. In this case:

         Asset1   Asset2   Asset3
Asset1     0.00     0.00     0.00
Asset2     0.00     1.00     0.94
Asset3     0.00     0.94     1.00

In most cases, the covariance matrix is known, and the correlation matrix derived from it as an aid in
interpretation. However, there are cases in which standard deviations and correlations are estimated first,
and the covariance matrix derived from those estimates. To do this, we simply reverse the terms in the
definition of correlation. For the element in row i, column j:

C(i,j) = rho(i,j)*sda(i)*sda(j)

And, for the entire matrix:

C = CC.*(sda*sda')

Note that the adjusted matrix CC was used in the latter computation to avoid NaN values in the cells
associated with the riskless asset.
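
The round trip from covariances to correlations and back can be sketched as follows, starting from the example covariance matrix. The name Cback is introduced here for the reconstructed matrix.

```matlab
C = [0    0      0;
     0   18.49  56;
     0   56    190];

sda = sqrt(diag(C));        % asset standard deviations
z   = sda*sda';             % products of standard deviations
z   = z + (z == 0);         % put 1.0 where the product is zero
CC  = C./z;                 % correlation matrix with zeros, not NaN

Cback = CC.*(sda*sda');     % covariance matrix rebuilt from CC and sda

disp(CC);
disp(Cback);
```

Because the adjusted matrix CC holds zeros for the riskless asset, the rebuilt matrix contains no NaN values.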

Interpreting Correlation Coefficients


Asset covariances are the main ingredients for computing portfolio risks. But we have shown that
standard deviations are much easier to interpret than are asset variances. Similarly, correlations often
prove more useful for communicating relationships than do covariances.

Correlation coefficients measure the extent of the association between two variables. Each such
coefficient must lie between -1 and +1, inclusive. A positive coefficient indicates a positive association:
a greater-than-expected outcome for one variable is likely to be associated with a greater-than-expected
outcome for the other while a smaller-than-expected outcome for one is likely to be associated with a
smaller-than-expected outcome for the other. A negative coefficient indicates a negative association: a
greater-than-expected outcome for one variable is likely to be associated with a smaller-than-expected
outcome for the other while a smaller-than-expected outcome for one is likely to be associated with a
greater-than-expected outcome for the other.

The figures below provide examples. In each case the probabilities of the points shown are assumed to
be equal.

In the above examples the variables are roughly jointly normally distributed with means of zero and
standard deviations of 1.0 -- roughly, because each of the 100 points is drawn from such a joint
distribution so the (sample) distribution of the actual results departs somewhat from the underlying
(population) distribution.

Note that in the case of perfect positive correlation (+1.0), the points fall precisely along an upward-
sloping straight line. In this case it has a slope of approximately 45 degrees due to the nature of the
variables. In general, the line may have a greater or smaller slope. Nonetheless, a necessary and
sufficient condition for perfect positive correlation is that all possible outcomes plot on an upward-
sloping straight line.

In the case of perfect negative correlation the plot has the opposite characteristic. All points will plot on
a downward-sloping straight line. Here too, the slope will depend on the magnitudes of the variables,
but the line will be downward-sloping in any event.

As the figures show, in the case of less-than-perfect positive correlation (between 0 and +1.0), the points
will tend to follow an upward-sloping line, but will deviate from it. The closer the correlation coefficient
is to zero, the greater will be such deviations and the more difficult it will be to see any positive
relationship. In the case of less-than-perfect negative correlation (between 0 and -1), the points will tend
to follow a downward-sloping line. Here too, the closer the correlation coefficient is to zero, the greater
will be the deviations and the more obscure the relationship.

If the correlation coefficient is zero, the best linear approximation of the relationship will be a flat line.
This does not preclude the possibility that there is a non-linear relationship between the variables. The
figure below shows a case in which the correlation coefficient is zero, but knowledge of the value of the
variable on the horizontal axis would help a great deal if one wished to predict the value of the variable
on the vertical axis. In this case the variables are uncorrelated, but they are not independent.
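
A small constructed case makes the distinction concrete. The data here are hypothetical and are not those plotted in the figure: five equally probable outcomes in which the second variable is exactly the square of the first.

```matlab
% Hypothetical data: yv is fully determined by xh, yet uncorrelated with it.
xh = [-2 -1 0 1 2];          % variable on the horizontal axis
yv = xh.^2;                  % variable on the vertical axis
p  = ones(1, 5)/5;           % equal probabilities

ex  = sum(p.*xh);                        % expected value of xh
ey  = sum(p.*yv);                        % expected value of yv
cxy = sum(p.*(xh - ex).*(yv - ey));      % covariance

fprintf('covariance = %.4f\n', cxy);
```

The covariance (and hence the correlation) is zero even though knowing xh determines yv exactly.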

In the special case in which probabilities are equal, one can use the MATLAB function corrcoef to
compute a correlation matrix directly. The function expects each row to represent a different state
(observation) and each column a different asset, so an {n*s} matrix of values of n assets in s states of
the world must be transposed first. For example:
corrcoef(R')

would give:

         Asset1   Asset2   Asset3
Asset1      NaN      NaN      NaN
Asset2      NaN     1.00     0.96
Asset3      NaN     0.96     1.00

In this case the only source of the differences from our forward-looking estimates is the use of equal
probabilities rather than the predicted probabilities. Since the correlation coefficient is the ratio of
estimated covariance to the product of two estimated standard deviations, any adjustment of the covariance
matrix for sample bias cancels out, leaving the correlation coefficients unaffected.
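
The cancellation can be illustrated by computing correlations from our unadjusted formulas with equal probabilities and comparing them with corrcoef. To keep NaN values out of the comparison, this sketch uses only the two risky assets; Rr, corrPop and corrSample are names introduced here.

```matlab
R  = [ 5   5   5    5;
      10   8   6   -5;
      25  12   2  -20];
Rr = R(2:3, :);                        % risky assets only
s  = size(Rr, 2);

pEq   = ones(s, 1)/s;                  % equal probabilities
dEq   = Rr - (Rr*pEq)*ones(1, s);      % deviations
Cpop  = dEq*diag(pEq)*dEq';            % covariances, no bias adjustment
sdPop = sqrt(diag(Cpop));
corrPop = Cpop./(sdPop*sdPop');        % correlations from our formulas

corrSample = corrcoef(Rr');            % MATLAB's sample-based result

disp(corrPop);
disp(corrSample);
```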

Exponentially Weighted Covariances


Analysts frequently utilize historic returns to estimate the covariances among future returns. If all returns,
past and future, were drawn from a stable joint distribution, it would be desirable to use as many
observations from the past as possible in order to maximize the accuracy of the resultant estimates of the
true underlying process that will generate future returns. However, if the parameters of the distribution
are likely to have changed over time, the situation is more difficult. One can utilize a great deal of data,
much of which may be of limited relevance for the future. Alternatively, a small amount of recent data
can be employed, with the attendant danger of substantial estimation errors. Which is better -- a great
deal of possibly irrelevant data or too little relevant data?

There is no easy answer to the question. The optimal procedure ultimately will depend on the manner in
which covariances evolve through time. Some Analysts approach the problem by limiting the historic
data to, say, 60 monthly observations, with each observation assigned the same weight (probability).
Others select only periods in which underlying conditions are assumed to have been similar to those
existing at the present time (e.g. periods following recessions if a recession has recently been
experienced). Yet others employ complex procedures in which covariances are assumed to be positively
correlated but with a tendency to eventually revert to a long-run mean value. Here we focus on a simple
procedure utilized in a number of asset allocation models that assumes that the future is more likely to be
like the recent past than the distant past.

In an exponential weighting scheme each historic observation is assigned a multiple of the weight
assigned to its predecessor. For example, observation t could be assigned a weight equal to 2^(t/h)
divided by a constant (k), with the latter set so that the sum of all the weights equaled 1.0. In such a
scheme h can be interpreted as an assumed half-life. To see why, consider the weights assigned to
months t and t-h:

w(t) = (2^(t/h))/k
w(t-h) = (2^((t-h)/h))/k

Thus:

w(t)/w(t-h) = (2^(t/h))/((2^(t/h))/2) = 2

Thus if month t is the most recent month and h=60, the observation 60 months ago will be assigned half
as much weight as the most recent month.

The weight assigned to any month relative to that assigned to its predecessor will be:

(2^(t/h))/(2^((t-1)/h))

which will equal 2^(1/h). Thus if a 60-month half-life is utilized, each month's observation will be given
a weight equal to 2^(1/60) or 1.0116 times that given the prior month (1.16% higher).

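It is relatively straightforward to compute a set of such weights using MATLAB. Assume that there are
T observations. The vector of dates (1,2,...T), stored as a column so that the resulting weights form a
{T*1} vector like the probability vector p used earlier, is given by:

d = (1:1:T)';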

The vector of 2^(t/h) values will be:

w = 2.^(d/h)

where h is the desired half-life.

The weights can easily be normalized so that they sum to 1.0:

p = w/sum(w)

We denote the result p since the weights will serve as probabilities. In a sense, the assumption is made
that the probability is p(t) that next month's returns will equal those that occurred in month t.

Having selected a set of probabilities, we proceed as before to estimate expected returns, deviations and
the covariance matrix:

e = R*p;
d = R - e*ones(1,T);
C = d*diag(p)*d';
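
The whole procedure can be exercised on a small data set. The return history and the half-life below are hypothetical, chosen only so that the weights are easy to inspect; the date vector is named t here to keep it distinct from the deviation matrix d.

```matlab
% Hypothetical history: n = 2 assets (rows), T = 6 periods (columns),
% with the oldest observation in the first column.
R = [ 1.0   2.0  -1.0   0.5   3.0   1.5;
      2.0  -1.0   0.0   1.0   4.0  -2.0];
T = size(R, 2);
h = 3;                       % assumed half-life, in periods

t = (1:T)';                  % dates as a column vector
w = 2.^(t/h);                % unnormalized weights
p = w/sum(w);                % weights that sum to 1.0

e = R*p;                     % weighted expected returns
d = R - e*ones(1, T);        % deviations
C = d*diag(p)*d';            % weighted covariance matrix

fprintf('p(T)/p(T-h) = %.4f\n', p(T)/p(T-h));
disp(p');
disp(C);
```

The most recent observation carries twice the weight of the one h periods earlier, as the half-life interpretation requires.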

Function wcov
The library function wcov obviates the need to remember all these formulas. It takes as inputs a matrix
of returns for n assets in s states (or from s historic time periods) and a half-life. For convenience, the return matrix can
have assets in the rows and states (observations) in the columns or vice-versa. The function assumes
(reasonably) that the number of states (observations) exceeds the number of assets and proceeds
accordingly.

To cover more cases, the half-life parameter can be specified as zero, in which event the states
(observations) are given equal weights.

The simplest way to utilize the function is as follows:

C = wcov(R,h)

where R is the matrix of returns, h is the half-life and C is the resultant covariance matrix.

If more information is desired, one or more additional variables may be indicated. For example:

[C,sda] = wcov(R,h)

will also return sda as the vector of standard deviations.

The statement:

[C,sda,CC] = wcov(R,h)

will also return CC as the matrix of correlation coefficients, following the convention that the correlation
coefficient is zero if the corresponding covariance is zero.

Finally, the statement:

[C,sda,CC,e] = wcov(R,h)

will also return the expected returns, based on the assumption that future probabilities equal the weights
computed from the assumed half-life.
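
The source of wcov is not reproduced here. The script below is only a sketch of the steps such a function might perform, based on the description above: it is not the library's implementation, and details such as whether a bias adjustment is applied are assumptions (none is applied here).

```matlab
% Illustrative inputs: states in rows this time, to exercise the
% orientation rule. h = 0 requests equal weights.
R = [ 5  10   25;
      5   8   12;
      5   6    2;
      5  -5  -20];
h = 0;

% Assume more states than assets: put assets in rows if necessary.
[nr, nc] = size(R);
if nr > nc
    R = R';
end
s = size(R, 2);

% Equal weights for h = 0, exponential weights otherwise.
if h == 0
    p = ones(s, 1)/s;
else
    w = 2.^((1:s)'/h);
    p = w/sum(w);
end

e   = R*p;                     % expected returns
d   = R - e*ones(1, s);        % deviations
C   = d*diag(p)*d';            % covariance matrix
sda = sqrt(diag(C));           % standard deviations
z   = sda*sda';
z   = z + (z == 0);            % avoid 0/0 for riskless assets
CC  = C./z;                    % correlations, zero where undefined

disp(C); disp(sda'); disp(CC); disp(e');
```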

Weighted Statistics Worksheet


The Weighted Statistics Worksheet allows you to compute both equal-weighted and exponentially-
weighted means, standard deviations and correlation coefficients. An example using returns for a set of
Vanguard mutual funds from 1991 through 1995 provides a chance for you to experiment with different
weighting schemes. Try a half-life of zero for equal weights, then compare the results with those
obtained with other values (for example, 12, 24, ..., 60). You might even wish to try a negative half-life to
weight earlier observations more heavily than later ones.

You may also wish to copy and paste other return series into the weighted statistics worksheet so that
you can calculate the resulting historic statistics.

Portfolio Covariances
It is remarkably easy to determine the covariance between two portfolios. Recall the formula for
computing the variance of portfolio x:

vp = x'*C*x

where x is the vector with the portfolio composition and C is the covariance matrix for asset returns.

Now, let xa represent one portfolio and xb another. For example:

xa
Asset1 0.10
Asset2 0.50
Asset3 0.40

xb
Asset1 0.40
Asset2 0.10
Asset3 0.50

Assume that the covariance matrix (C) is:

         Asset1   Asset2   Asset3
Asset1     0.00     0.00     0.00
Asset2     0.00    18.49    56.00
Asset3     0.00    56.00   190.00

The covariance between the two portfolios is given by:


cab = xa'*C*xb

Which in this case equals 55.16.

The relationship can be extended to cover a case in which there are multiple portfolios. Let X be an
{n*p} matrix containing information on the composition of p portfolios of n assets. For example:

           xa     xb
Asset1   0.10   0.40
Asset2   0.50   0.10
Asset3   0.40   0.50

Then the covariance matrix for the portfolios is given by:

Cp = X'*C*X

Which gives:

        xa      xb
xa   57.42   55.16
xb   55.16   53.28

Note that the elements on the main diagonal indicate the variances of the two portfolios, while the other
elements equal their covariance.
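
These portfolio covariances can be reproduced with the example values above:

```matlab
C  = [0    0      0;
      0   18.49  56;
      0   56    190];
xa = [0.10; 0.50; 0.40];
xb = [0.40; 0.10; 0.50];

cab = xa'*C*xb;        % covariance between the two portfolios

X  = [xa xb];          % {n*p} matrix of portfolio compositions
Cp = X'*C*X;           % {p*p} covariance matrix for the portfolios

fprintf('cab = %.2f\n', cab);
disp(Cp);
```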

Asset Covariances with a Portfolio


It is straightforward to compute the covariance of each asset with a given portfolio. Recall the statement
for the covariance of portfolio xa with portfolio xb:

cab = xa'*C*xb

This can be computed in two operations:

cab = xa'*(C*xb)

For example, with xb:

Asset1 0.40
Asset2 0.10
Asset3 0.50

and C:

         Asset1   Asset2   Asset3
Asset1     0.00     0.00     0.00
Asset2     0.00    18.49    56.00
Asset3     0.00    56.00   190.00

then

cab = xa'*cp

where cp = C*xb, or:

cp
Asset1 0.00
Asset2 29.85
Asset3 100.60

Now, assume that xa contains only the first asset:

xa
Asset1 1.00
Asset2 0.00
Asset3 0.00

Clearly, the covariance of xa with xb will equal the first value in vector cp (0.00).

If xa contained only the second asset, its covariance with xb would equal the second value in vector cp
(29.85). And so on.

The conclusion is not hard to reach. Vector cp contains the covariances of the asset classes with
portfolio xb.

More generally, if:

cp = C*x

then cp(i) is the covariance of asset i with portfolio x.

Note that the covariance of an asset with a portfolio will be a weighted average of its covariances with
all the assets (including itself), with the composition of the current portfolio used as weights.
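
The argument can be checked by comparing vector cp with the covariance computed for a portfolio holding a single asset. In this sketch u2 is a name introduced for a portfolio containing only Asset2.

```matlab
C  = [0    0      0;
      0   18.49  56;
      0   56    190];
xb = [0.40; 0.10; 0.50];

cp = C*xb;             % covariances of each asset with portfolio xb

u2 = [0; 1; 0];        % portfolio holding only Asset2
c2 = u2'*C*xb;         % its covariance with xb, equal to cp(2)

vb = xb'*cp;           % weighting cp by xb recovers the variance of xb

disp(cp');
fprintf('c2 = %.3f, vb = %.4f\n', c2, vb);
```

The last statement uses the fact that xb'*(C*xb) is the variance of xb, so the portfolio's own variance is the holdings-weighted sum of the asset covariances with it.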

Marginal Risks
The risk of a portfolio is not a linear function of the vector of its components. Rather, the variance of a
portfolio is a quadratic function of its composition. This thwarts the intuition of most Analysts and
Investors. Indeed, the nature of risk may be the single most important argument for the use of
quantitative analysis in investment management. Neither Investors nor Analysts can be blamed for this
fact. Nor can Harry Markowitz. Nature made risk a quadratic function. Markowitz only discovered it.

Given this central fact of investment life, it follows that the impact on the risk of a portfolio of a small
change in the amount invested in a particular asset is not simply a function of the risk of that asset. The
impact will depend on the covariances of the asset with all the other assets currently in the portfolio and
on the composition of the portfolio.

Consider a portfolio x and a "difference vector" d. We wish to determine the effect on portfolio variance
of a switch from portfolio x to portfolio x+d.

The variance of x will be:

vx = x'*C*x

While that of (x+d) will be:

vv = (x+d)'*C*(x+d)

The latter can be expanded by noting first that (x+d)'=x'+d', giving:

vv = (x'+d')*C*(x+d)

then by multiplying out all the terms:


vv = x'*C*x + x'*C*d + d'*C*x + d'*C*d

Since the first term of the latter expression equals the variance of x, the change in variance is given by
the sum of the last three terms:

dvp = x'*C*d + d'*C*x + d'*C*d

The first two terms are the same. This follows from the facts that: (1) the transpose of a scalar is the
same scalar, (2) the transpose of a product of matrices is the product of their transposes, taken in reverse
order and (3) the covariance matrix C is symmetric, so that its transpose equals the original matrix.
Given this, we may write:

dvp = 2*d'*C*x + d'*C*d

We are interested here in the effect on variance of a small change in the holding of one asset. Thus d
will contain only one small non-zero element. For example, if we wished to know the effect of a small
change in the holdings of asset 2 we could set:

d
Asset1 0.00
Asset2 0.01
Asset3 0.00

Since the elements in d will be either zero or very small (very much less than 1.0), the final term in the
earlier expression (d'*C*d) will be even smaller. Indeed, as d approaches zero, the one element in
d'*C*d will approach zero considerably faster, since it involves the square of the non-zero element in d.
For purposes of computing the marginal effect of a change we may ignore the final term, giving:
dvp = 2*d'*C*x

or

dvp = d'*2*C*x

From this it follows that d(vp)/d(x(j)) will equal the j'th element of 2*C*x. More generally, 2*C*x is the
vector of marginal risks of the asset classes:
mr = 2*C*x

with mr(j) indicating the change in portfolio variance per unit change in the amount invested in asset j.

Note finally, that C*x is the vector of the covariances of the assets with portfolio x, which we have
denoted cp. Thus:

mr = 2*cp

and two times the covariance of an asset with a portfolio indicates the marginal risk of that asset, given
the composition of the portfolio. In our case (using portfolio xb from the previous section):
mr
Asset1 0.00
Asset2 59.70
Asset3 201.20

Thus variance would not be affected by a small change in the holding of asset 1. It would increase at a
rate of 59.70 per unit change in Asset 2 and at a rate of 201.20 per unit change in Asset 3. Of course
these figures hold only approximately for finite changes in the assets, with the error greater, the larger
the underlying change. Moreover, they assume that only one element in the portfolio is changed. If the
assets represent zero investment strategies this may be feasible. If, however, they are true investments, at
least one holding will have to be decreased for another to be increased. We will take these aspects into
account in later discussions. For now it suffices to have determined the vector of derivatives of variance
with respect to asset holdings.
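
The quality of the approximation can be seen by comparing the exact change in variance with the marginal estimate for the small change in Asset2 used above. The sketch assumes the portfolio is xb from the earlier section; dvExact, dvApprox and err are names introduced here.

```matlab
C = [0    0      0;
     0   18.49  56;
     0   56    190];
x = [0.40; 0.10; 0.50];      % portfolio xb from the earlier section
d = [0; 0.01; 0];            % small change in the holding of Asset2

mr = 2*C*x;                                % vector of marginal risks

dvExact  = (x+d)'*C*(x+d) - x'*C*x;        % exact change in variance
dvApprox = d'*mr;                          % marginal (first-order) estimate
err      = dvExact - dvApprox;             % equals the ignored term d'*C*d

disp(mr');
fprintf('exact = %.6f, approx = %.6f, error = %.6f\n', dvExact, dvApprox, err);
```

The error equals d'*C*d, which shrinks with the square of the change.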

10/24/12 Portfolios of Two Assets

Portfolios of Two Assets

Contents:
Characteristics of a Two-asset Portfolio
Combining a Riskless Asset with a Risky Asset
Combining Two Risky Assets
Combining Two Perfectly Positively Correlated Risky Assets
Combining Imperfectly Correlated Risky Assets
Risk-return Tradeoffs in Mean-Variance Space
The Excess Return Sharpe Ratio

Characteristics of a Two-asset Portfolio


The formulas and MATLAB functions discussed previously are sufficient to compute the characteristics
of any portfolio. However, to better understand the economics of portfolio construction it is useful to
consider the effects of combining two assets to form a portfolio.

To economize on notation we omit parentheses. Thus x1 and x2 will be the proportions invested in
assets 1 and 2 respectively, and e1 and e2 will be their expected returns. The expected return of the
portfolio, ep, will then be:

ep = x1*e1 + x2*e2

Since the proportions will sum to 1.0 we may also write:

ep = e1 + x2*(e2-e1)

The variance of the portfolio, vp, will be a function of the proportions invested in the assets, their return
variances (v1 and v2), and the covariance between their returns (c12):

vp = ((x1^2)*v1) + ((x2^2)*v2) + 2*x1*x2*c12

Here too, we can substitute (1-x2) for x1 to obtain an expression relating the portfolio's variance to the
amount invested in asset 2:

vp = v1 + 2*x2*(c12-v1) + (x2^2)*(v1-2*c12+v2)

The standard deviation of return (sp) will, as always, be the square root of the variance:

sp = sqrt(vp)
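As a quick check on these formulas, the following Python sketch computes ep, vp and sp for a two-asset portfolio and confirms that the reduced-form expression in x2 gives the same variance as the full expression. The function names and the input values are assumptions made for illustration.

```python
import math


def two_asset_portfolio(x2, e1, e2, v1, v2, c12):
    """Return (ep, vp, sp) when x2 is held in asset 2 and x1 = 1 - x2 in asset 1."""
    x1 = 1.0 - x2
    ep = x1 * e1 + x2 * e2
    vp = (x1 ** 2) * v1 + (x2 ** 2) * v2 + 2 * x1 * x2 * c12
    return ep, vp, math.sqrt(vp)


def variance_reduced_form(x2, v1, v2, c12):
    """Variance written as a function of x2 alone."""
    return v1 + 2 * x2 * (c12 - v1) + (x2 ** 2) * (v1 - 2 * c12 + v2)


# Illustrative inputs (assumed for this sketch).
e1, e2 = 8.0, 10.0
s1, s2, r12 = 5.0, 15.0, 0.5
v1, v2, c12 = s1 ** 2, s2 ** 2, r12 * s1 * s2

ep, vp, sp = two_asset_portfolio(0.3, e1, e2, v1, v2, c12)
vp_check = variance_reduced_form(0.3, v1, v2, c12)

print(ep, vp, sp, vp_check)
```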

Throughout this section we will assume that asset 1 has less risk (v1<v2) and a smaller expected return
(e1<e2). We will be interested in the risk-return tradeoff associated with different combinations of the
two assets and, in particular, the shape of the curves in mean-variance and mean-standard deviation
space that result as more money is invested in the risky asset (that is, as x2 is increased and x1
decreased).

Combining a Riskless Asset with a Risky Asset


The figure below plots the locus of mean-standard deviation combinations for values of x2 between 0
and 1 when e1=6, e2=10, s1=0, s2=15 and c12=0.

In this case, as in every case involving a riskless and a risky asset, the relationship is linear. This is easily
seen. Recall that ep is always linear in x2 as shown earlier. If asset 1 is riskless, sp will also be linear in
x2, since both v1 and c12 will equal zero. In such a case:

vp = (x2^2)*v2
sp = sqrt((x2^2)*v2)
= abs(x2)*sqrt(v2)
= abs(x2)*s2

where abs(x2) denotes the absolute value of x2.

This result can be extended to cases in which it is possible to take short positions. First, assume one can
either go long the riskless asset ("lend") or take a short position in it ("borrow") at the same interest rate
(e1). The formulas above then apply directly. The figure below shows the results obtained by using
leverage in this manner. For example, the point marked 1.5 is associated with x2=1.5 and x1=-0.5. It
indicates that by "levering up" an investment in asset 2 by 50% an Investor can obtain a probability
distribution of return on initial capital with an expected value of 12% and a standard deviation of 22.5%.
The other points in the figure correspond to the indicated values of x2. Those above the original point
involve borrowing (x1<0) while those below it involve lending (x1>0).
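The points on this line can be reproduced with a few lines of Python. The sketch below applies ep = e1 + x2*(e2-e1) and sp = abs(x2)*s2 to the example values e1=6, e2=10 and s2=15; the function name and the particular list of x2 values are assumptions of the sketch.

```python
def levered_portfolio(x2, e1, e2, s2):
    """Expected return and risk when x2 is held in the risky asset and
    1 - x2 in the riskless asset (negative 1 - x2 means borrowing)."""
    ep = e1 + x2 * (e2 - e1)
    sp = abs(x2) * s2
    return ep, sp


e1, e2, s2 = 6.0, 10.0, 15.0
points = {x2: levered_portfolio(x2, e1, e2, s2) for x2 in (0.0, 0.5, 1.0, 1.5, 2.0)}

for x2, (ep, sp) in sorted(points.items()):
    print(f"x2={x2:4.1f}  x1={1 - x2:5.1f}  ep={ep:5.1f}  sp={sp:5.1f}")
```

For x2=1.5 this gives ep=12 and sp=22.5, the levered point discussed above.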

What if an Investor could short the risky security (x2<0) and invest the proceeds obtained from the short
sale in the riskless security? Here too, the standard formulas apply. However, note that the variance will
be positive, as will the standard deviation, since a negative number (x2) squared is always positive. The
figure below shows the effects of negative x2 values.

A qualification to this analysis is in order. A strategy involving short positions may require that
additional capital be pledged to cover possible shortfalls between ending asset and liability values.
Absent this, a higher rate will typically be charged for the short position so the income to the lender in
states of the world in which the borrower is solvent will be sufficiently high to compensate for the
shortfalls associated with the states of the world in which the borrower is insolvent.

The figure below shows a simple case of this sort in which funds may be borrowed, but at a higher rate
(8%) than the rate at which they may be lent (6%). Here the locus of the ep,sp combinations plots as two
lines, the first associated with the lower lending rate, the second with the higher borrowing rate. As
before, the risky asset offers an expected return of 10% and a risk of 15%. The efficient frontier is
shown by the solid lines. The dashed line indicates the options that would be available if the Investor
could lend at 8%. While points on it are infeasible, those plotting on its extension to the right of the point
representing the risky asset are both feasible and efficient.
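The kinked locus can be sketched in Python as follows. The code assumes that x2 is non-negative, that the lending rate applies whenever x2 is at most 1 and that the borrowing rate applies to any levered position; the function name is hypothetical.

```python
def two_rate_portfolio(x2, lend_rate, borrow_rate, e2, s2):
    """Expected return and risk with different lending and borrowing rates.

    Assumes x2 >= 0. For x2 <= 1 the remainder is lent at lend_rate;
    for x2 > 1 the shortfall is borrowed at borrow_rate.
    """
    rate = lend_rate if x2 <= 1.0 else borrow_rate
    ep = rate + x2 * (e2 - rate)
    sp = x2 * s2
    return ep, sp


lend_rate, borrow_rate, e2, s2 = 6.0, 8.0, 10.0, 15.0

half_lent = two_rate_portfolio(0.5, lend_rate, borrow_rate, e2, s2)
unlevered = two_rate_portfolio(1.0, lend_rate, borrow_rate, e2, s2)
levered = two_rate_portfolio(1.5, lend_rate, borrow_rate, e2, s2)

print(half_lent, unlevered, levered)
```

With borrowing at 8% the levered point x2=1.5 offers an expected return of 11% rather than the 12% available when borrowing costs 6%, while its risk is unchanged at 22.5%.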

In practice the rate charged for borrowing may increase with the amount borrowed. In such a case the
locus of ep,sp combinations will increase at a decreasing rate as risk (sp) increases beyond the amount
associated with a full unlevered investment in the risky asset (x2=1). Under these conditions there will
eventually be decreasing returns to risk-taking.

Combining Two Risky Assets


When a portfolio includes two risky assets, the Analyst needs to take into account expected returns,
variances and the covariance (or correlation) between the assets' returns. The differences from the earlier
case in which one asset is riskless occur in the formula for portfolio variance. In terms of risks and
correlations it is:

vp = ((x1^2)*(s1^2)) + (2*x1*x2*r12*s1*s2) + ((x2^2)*(s2^2))

where r12 is the correlation between the assets' returns.

Combining Two Perfectly Positively Correlated Risky Assets


To begin, consider the case in which two returns are perfectly positively correlated. Under these
conditions:

vp = ((x1^2)*(s1^2)) + (2*x1*x2*s1*s2) + ((x2^2)*(s2^2))

The term on the right can be factored to obtain:

vp = (x1*s1 + x2*s2)^2

from which it follows that:

sp = abs(x1*s1 + x2*s2)

where abs(..) denotes the absolute value of the enclosed expression. As long as x1 and x2 are both
non-negative the expression itself will be non-negative since neither s1 nor s2 can ever be negative.
However, if one of the two x values is sufficiently negative, the absolute value must be utilized
explicitly.

Consider combinations of long positions in the two assets (x1>=0, x2>=0). For any such combination:

sp = x1*s1 + x2*s2

And, as always:

ep = x1*e1 + x2*e2

In such a case both risk and return will be linear in x2:

sp = s1 + x2*(s2-s1)
ep = e1 + x2*(e2-e1)

and all such portfolios will plot on a straight line connecting the points representing the two assets. In the
figure below, e1=8,s1=5, e2=10, s2=15 and r12=1.0.

This relationship can be extended by allowing x2 to be either greater than one or less than zero. Of
particular interest is the combination that gives the smallest possible risk: the minimum-variance
portfolio. In this case it is possible to achieve a variance of zero! We seek a value x2 for which:

sp = s1 + x2*(s2-s1) = 0

This will be obtained when:

x2 = -s1/(s2-s1)

and

x1 = 1-x2
= 1 + (s1/(s2-s1))
= s2/(s2-s1)

In our example a riskless portfolio can be obtained by setting:

x2 = -5/(15-5) = -0.5
x1 = 1-x2 = 1.5

This can be accomplished by taking a short position in asset 2 equal to one-half the Investor's funds and
investing the proceeds as well as the original amount of money in asset 1. In practice this may require
the pledging of some other collateral to provide a sufficient guarantee to the lender of asset 2 that the
short position can be covered when needed.
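The arithmetic for this example can be verified with a short Python sketch. It assumes perfect positive correlation and s1 different from s2 (otherwise the denominator is zero); the function name is an assumption.

```python
def zero_risk_mix_perfect_positive(s1, s2):
    """Proportions (x1, x2) giving zero risk when r12 = 1 and s1 != s2."""
    x2 = -s1 / (s2 - s1)
    x1 = 1.0 - x2
    return x1, x2


e1, s1, e2, s2 = 8.0, 5.0, 10.0, 15.0

x1, x2 = zero_risk_mix_perfect_positive(s1, s2)
sp = abs(x1 * s1 + x2 * s2)   # risk under perfect positive correlation
ep = x1 * e1 + x2 * e2        # expected return of the riskless mix

print(x1, x2, sp, ep)
```

The result is x1=1.5 and x2=-0.5, with zero risk and an expected return of 7%.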

The ability to form a riskless portfolio by taking offsetting positions in two perfectly positively correlated
assets leads directly to a figure similar to that derived earlier when a riskless asset was combined with a
risky one. Let the expected return on the zero-variance portfolio be:

e0 = e1 + x2min*(e2-e1)

where:

x2min = -s1/(s2-s1)

In the current example e0 will equal 7% (8-0.5*(10-8)). The set of portfolio risks and returns can then be
derived by considering combinations of this riskless asset (portfolio) and either asset 1 or asset 2. Either
view will provide the familiar graph associated with a risky and a riskless asset. In this case:

While all the combinations shown are feasible, only those on the upper line are efficient -- a point
emphasized by the use of a dashed line for the dominated portion of the relationship.

Combining Imperfectly Correlated Risky Assets


While perfectly positively correlated risky assets do exist, they are the exception rather than the rule. In
most cases correlation coefficients are less than 1.0. The implications of this fact for risk are central to an
understanding of the effects of diversification.

Consider a portfolio with long positions in two risky assets (x1>0, x2>0). As shown earlier, its variance
will be:

vp = ((x1^2)*(s1^2)) + (2*x1*x2*r12*s1*s2) + ((x2^2)*(s2^2))

Now imagine two cases, similar in every respect (x1,x2,e1,e2,s1,s2) but correlation (r12). Let vp(1) be
the variance of one portfolio, for which r12=1 and vp(r) be the variance of the other, for which r12=r,
where r is less than 1. Only the middle term in the equation for portfolio variance will differ in the two
calculations. Since all the components of that term but r12 are positive, it follows that vp(r)<vp(1).
More generally, other things equal:

vp(r1) < vp(r2) if r1 < r2, x1>0, x2>0

Other things equal, the smaller the correlation between two assets, the smaller will be the risk of a
portfolio of long positions in the two assets.

The figure below shows combinations of risk and return for such portfolios when e1=8,s1=5, e2=10 and
s2=15. Each curve applies to a case with a different correlation between the two assets' returns. Not
surprisingly, the cases are coincident at the end-points (x1=1,x2=0 and x1=0,x2=1). For all interior
combinations, when the correlation coefficient is less than 1.0, risk is less than proportional to the risks
of the two assets, with the extent of risk reduction greater, the smaller the correlation coefficient. Thus
the yellow curve (r12=1.0) provides no risk reduction, only risk-averaging; the red curve (r12=0.5)
provides some risk reduction, the green curve (r12=0) more, and the blue curve (r12=-0.5) even more.
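To put numbers on the effect, the Python sketch below computes the standard deviation of an equally weighted portfolio (an assumed mix chosen for illustration) for each of the four correlations, using s1=5 and s2=15.

```python
import math


def portfolio_sd(x1, x2, s1, s2, r12):
    """Standard deviation of a two-asset portfolio given risks and correlation."""
    vp = (x1 ** 2) * (s1 ** 2) + 2 * x1 * x2 * r12 * s1 * s2 + (x2 ** 2) * (s2 ** 2)
    return math.sqrt(vp)


s1, s2 = 5.0, 15.0
correlations = (1.0, 0.5, 0.0, -0.5)
risks = [portfolio_sd(0.5, 0.5, s1, s2, r12) for r12 in correlations]

for r12, sp in zip(correlations, risks):
    print(f"r12={r12:5.2f}  sp={sp:7.4f}")
```

With r12=1.0 the risk is 10, the simple average of 5 and 15. Each lower correlation gives a lower figure.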

The most powerful case of diversification arises when r12=-1.0. In this instance:

vp = ((x1^2)*(s1^2)) - (2*x1*x2*s1*s2) + ((x2^2)*(s2^2))

which can be factored to obtain:

vp = (x1*s1 - x2*s2)^2

and

sp = abs(x1*s1 - x2*s2)

In such a case the minimum-variance portfolio will be riskless. To obtain it, we wish to set:

x1*s1 - x2*s2 = 0
given: x1+x2 = 1

The solution is:

x2 = s1/(s1+s2)

Thus if s1=5 and s2=15 and the two assets are perfectly negatively correlated, a riskless portfolio will be
obtained with x2=5/(5+15)=0.25 and x1=0.75. Its expected return will equal 0.75*e1+0.25*e2 (here,
8.5%). The next figure repeats the results of the prior case and adds this new one (in white).
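A short Python check of this case follows. It takes e1=8 and e2=10 from the running example and applies ep = x1*e1 + x2*e2; the function name is an assumption.

```python
def zero_risk_mix_perfect_negative(s1, s2):
    """Proportions (x1, x2) giving zero risk when r12 = -1."""
    x2 = s1 / (s1 + s2)
    x1 = 1.0 - x2
    return x1, x2


e1, s1, e2, s2 = 8.0, 5.0, 10.0, 15.0

x1, x2 = zero_risk_mix_perfect_negative(s1, s2)
sp = abs(x1 * s1 - x2 * s2)   # risk under perfect negative correlation
ep = x1 * e1 + x2 * e2

print(x1, x2, sp, ep)
```

The mix x1=0.75, x2=0.25 has zero risk and an expected return of 8.5%.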

Once again we have obtained the familiar diagram in which there is a riskless asset. This should not be a
surprise. Long positions in two perfectly negatively correlated assets are similar to (1) a long position in
one of two perfectly positively correlated assets and (2) a short position in the other.

In most cases asset correlations lie between -1.0 and +1.0. To cover all possibilities we need a general
formula for the minimum-variance portfolio. For this we start with a reduced-form equation in which vp
is expressed as a function of x2 (under the assumption that x1=1-x2):

vp = v1 + 2*x2*(c12-v1) + (x2^2)*(v1-2*c12+v2)

The derivative with respect to x2 is:

d(vp)/d(x2) = 2*(c12-v1) + 2*x2*(v1-2*c12+v2)

Setting this to zero gives:

x2min = (v1-c12)/(v1-2*c12+v2)

The minimum-variance portfolio may have a lower risk than either of its component assets. It may also
have a higher expected return than asset 1. Consider the point at which x2=0. Here:

d(ep)/d(x2) = e2-e1
d(vp)/d(x2) = 2*(c12-v1)

and:

d(ep)/d(vp) = (e2-e1)/(2*(c12-v1))

As usual, we assume that e2>e1. For the slope d(ep)/d(vp) to be negative we need:

c12-v1 <0

That is:

r12*s1*s2 < s1^2

or:
r12 < s1/s2

For example, if s1=5, s2=15, e2>e1 and r12<5/15, the minimum variance portfolio will dominate asset
1, offering both lower risk and higher expected return.

The figure below shows a case in which e1=8,s1=5, e2=10,s2=15 and r12=0.10. Here, the risk-return
plot "bends backward" so that x1=1 is an inefficient portfolio. The minimum-variance portfolio is
efficient, as are portfolios that combine it (in non-negative amounts) with asset 2.
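The minimum-variance mix can be computed for any correlation. Differentiating the reduced-form expression vp = v1 + 2*x2*(c12-v1) + (x2^2)*(v1-2*c12+v2) given at the start of this section and setting the result to zero yields x2min = (v1-c12)/(v1-2*c12+v2). The Python sketch below applies this to three correlations with e1=8, s1=5, e2=10 and s2=15, and spot-checks that nearby mixes have higher variance. The function names are assumptions, and the formula fails when v1-2*c12+v2 is zero (equal risks with r12=1).

```python
def min_variance_x2(s1, s2, r12):
    """Proportion in asset 2 that minimizes variance, with x1 = 1 - x2."""
    v1, v2, c12 = s1 ** 2, s2 ** 2, r12 * s1 * s2
    return (v1 - c12) / (v1 - 2 * c12 + v2)


def variance(x2, s1, s2, r12):
    """Portfolio variance as a function of x2 alone."""
    v1, v2, c12 = s1 ** 2, s2 ** 2, r12 * s1 * s2
    return v1 + 2 * x2 * (c12 - v1) + (x2 ** 2) * (v1 - 2 * c12 + v2)


e1, s1, e2, s2 = 8.0, 5.0, 10.0, 15.0

results = {}
for r12 in (-1.0, 0.10, 0.80):
    x2min = min_variance_x2(s1, s2, r12)
    ep = e1 + x2min * (e2 - e1)
    vp = variance(x2min, s1, s2, r12)
    results[r12] = (x2min, ep, vp)
    # Moving a little either way should not lower the variance.
    assert variance(x2min + 0.01, s1, s2, r12) >= vp
    assert variance(x2min - 0.01, s1, s2, r12) >= vp
    print(f"r12={r12:5.2f}  x2min={x2min:8.4f}  ep={ep:7.4f}  vp={vp:8.4f}")
```

With r12=0.10 the minimum-variance mix has a variance below 25 and an expected return above 8, so it dominates asset 1. With r12=0.80 the computed x2min is negative.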

If r12 exceeds s1/s2, the minimum variance portfolio will require a short position in asset 2. The figure
below shows a case in which e1=8,s1=5, e2=10,s2=15 and r12=0.80.

As before, all points above and to the right of the point representing the minimum-variance portfolio are
efficient. In this case, asset 1 is efficient, as are all combinations involving asset 2 in amounts exceeding
x2min. Note that the efficient frontier increases at a decreasing rate, exhibiting decreasing returns to risk-
bearing.

Risk-return Tradeoffs in Mean-Variance Space


We have seen that efficient combinations of two assets plot on a curve in mean-standard deviation
space that increases at either a constant rate or at a decreasing rate as standard deviation is increased.
What can be said about the shape of the frontier in mean-variance space?

First, note that:

vp = sp^2

Thus:

d(vp) = 2*sp*d(sp)

and:

d(sp) = d(vp)/(2*sp)

From this it follows that:

d(ep)/d(vp) = (d(ep)/d(sp))/(2*sp)

If d(ep)/d(sp) is constant as sp is increased, then d(ep)/d(vp) will be a decreasing function of sp. The
figures below provide illustrations of the efficient frontiers in each of the two spaces when e1=6, s1=0,
e2=10, s2=15 and the two assets are perfectly positively correlated.

If two risky assets are less than perfectly correlated, d(ep)/d(sp) will decrease with sp and d(ep)/d(vp)
will, in a sense, decrease at an even faster rate. The figures below provide illustrations of the frontier
representing non-negative combinations of the two assets in each of the two spaces when e1=6,s1=5,
e2=10, s2=15 and r12=0.5.

When two assets are combined to form portfolios, the efficient frontier will plot as a curve with a
decreasing slope in mean-variance space, no matter what the assets' characteristics. This implies that
there will be a unique point of tangency with an indifference curve (line) from a family exhibiting
constant risk tolerance (that is, for which utility = ep-vp/t).

The figures below provide illustrations using the assets in the most recent example and a risk tolerance
of 50. In each diagram the optimal portfolio lies at the point at which the green indifference curve is
tangent to the efficient frontier. Of course the optimal portfolio is the same in each diagram. In this case
the optimal mix has x1=0.5 and x2 =0.5. The expected return of the portfolio is 8.0%, its variance is
81.25, its standard deviation 9.0139 and it provides the Investor in question a utility of 6.375%. The
latter can be seen by inspecting the vertical intercept of the green indifference curve. The points on the
blue indifference curve provide a lower level of utility and thus are inappropriate for this Investor. Points
on the red indifference curve provide a higher level of utility, but none is feasible in this case.
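These figures can be reproduced in Python. The closed-form expression for the optimal x2 used below is not stated in the text; it is obtained by setting d(ep)/d(x2) - (1/t)*d(vp)/d(x2) to zero and should be read as an assumption of this sketch. It ignores any bounds on x2, so a simple grid search over 0 to 1 is included as a cross-check.

```python
import math


def utility(x2, e1, e2, s1, s2, r12, t):
    """Return (u, ep, vp) with u = ep - vp/t, x2 in asset 2 and 1 - x2 in asset 1."""
    v1, v2, c12 = s1 ** 2, s2 ** 2, r12 * s1 * s2
    ep = e1 + x2 * (e2 - e1)
    vp = v1 + 2 * x2 * (c12 - v1) + (x2 ** 2) * (v1 - 2 * c12 + v2)
    return ep - vp / t, ep, vp


def optimal_x2(e1, e2, s1, s2, r12, t):
    """Unconstrained maximizer of ep - vp/t (derived for this sketch)."""
    v1, v2, c12 = s1 ** 2, s2 ** 2, r12 * s1 * s2
    return (t * (e2 - e1) / 2.0 - (c12 - v1)) / (v1 - 2 * c12 + v2)


e1, s1, e2, s2, r12, t = 6.0, 5.0, 10.0, 15.0, 0.5, 50.0

x2_opt = optimal_x2(e1, e2, s1, s2, r12, t)
u_opt, ep_opt, vp_opt = utility(x2_opt, e1, e2, s1, s2, r12, t)
sp_opt = math.sqrt(vp_opt)

# Cross-check: best value of x2 on a grid from 0 to 1 in steps of 0.001.
grid_best = max(range(1001), key=lambda i: utility(i / 1000.0, e1, e2, s1, s2, r12, t)[0]) / 1000.0

print(x2_opt, ep_opt, vp_opt, sp_opt, u_opt, grid_best)
```

The sketch returns x2=0.5, an expected return of 8.0, a variance of 81.25, a standard deviation of about 9.0139 and a utility of 6.375.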

The Excess Return Sharpe Ratio


A widely-used (and sometimes misused) measure of investment performance is the Sharpe Ratio,
originally named the reward-to-variability ratio by its author, but now commonly given this
eponymous description. Broadly defined, it is the ratio of the expected value of a zero-investment
strategy to the standard deviation of that strategy. An important special case involves a zero-investment
strategy in which funds are borrowed at a fixed rate and invested in a risky asset.

Let x be invested in asset 2 and -x in asset 1. In our previous notation, this is equivalent to x2=x and
x1=-x. The expected return relative to the underlying or notional value x will be:

e = x*(e2-e1)

while the standard deviation will be:

s = x*s2

The Sharpe Ratio for the strategy will thus be:

e/s = (x*(e2-e1))/(x*s2)

or:

(e2-e1)/s2

Note that e2-e1 is the expected value of the difference between the return on asset 2 and the return on
asset 1. The difference between the return of a risky asset and that of a riskless one is termed the risky
asset's excess return:

xr2 = r2-r1

Since r1 is riskless, the standard deviation of xr2 will equal the standard deviation of r2. Thus the
Sharpe Ratio for our strategy will be:

e(xr2)/s(xr2)

that is, the expected excess return divided by the standard deviation of excess return. For specificity
we will call this asset 2's excess return Sharpe Ratio (xrsr), although it is often termed simply the
asset's Sharpe Ratio.

Note that x, the scale of the zero-investment strategy, does not appear in the formula -- all strategies
involving a given asset or portfolio have the same value of xrsr, no matter what their scale (assuming, of
course, that the rate of interest is unaffected by the amount borrowed). Under these conditions, xrsr is
scale-independent.
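Scale independence is easy to confirm numerically. The Python sketch below computes e/s for several positive scales x, using e1=6, e2=10 and s2=15 from the earlier example; the function name and the list of scales are assumptions.

```python
def zero_investment_sharpe(x, e1, e2, s2):
    """Sharpe Ratio of a zero-investment strategy of scale x > 0:
    x in the risky asset financed by -x in the riskless asset."""
    e = x * (e2 - e1)   # expected return per unit of notional value
    s = x * s2          # standard deviation at the same scale
    return e / s


ratios = [zero_investment_sharpe(x, 6.0, 10.0, 15.0) for x in (0.5, 1.0, 2.0, 10.0)]
print(ratios)
```

Every scale gives the same ratio, (10-6)/15, or about 0.267.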

To see one of the implications of this characteristic, consider a choice between two risky assets, a and b,
in which only one can be chosen but long or short positions can be taken in a riskless asset. Assume that
asset b has a higher excess return Sharpe Ratio. Which is better? The figure below provides an
illustration.

For any desired level of risk, a portfolio "based on" asset b will provide a higher expected return than
one based on asset a. In this context, the phrase "based on" connotes a combination of the asset in
question and the riskless asset with the amount of the latter positive, negative or zero as required to
obtain the desired level of risk. Thus if the level of risk offered by asset a is desired, the combination of
asset b and lending that provides the risk and return shown by point z in the diagram should be selected,
since it provides the same level of risk, but greater expected return. This will be true for any desired
level of risk.

Another way to see this is to consider an investment of x1 in asset 1 and x2 in asset 2 as (1) an
investment in asset 1 and (2) a decision to take a position of size x2 in the zero-investment strategy in
which asset 1 is shorted, with the proceeds from the short sale invested in asset 2. An institutional
arrangement for the latter is a swap in which the Investor (party A) agrees to pay a fixed rate (the return
on asset 1) to a counterparty (party B) and the latter agrees to pay a rate based on the return of a risky
asset (asset 2). Both rates are multiplied by a notional amount, then netted to determine the final value
transferred from one party ("the winner") to the other ("the loser").

In this situation the Investor will receive r1 on his or her direct investment and x*(r2-r1) on the swap,
where x is the ratio of the notional value of the swap to the value of the Investor's original funds.
Overall, the return will be:

r1 + x*(r2-r1)

or:

r1*(1-x) + x*r2

Thus x plays the role of x2 in our original formulation, while (1-x) plays the role of x1.
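The equivalence can be illustrated with a small Python sketch. The fixed rate, the swap ratio and the realized returns on asset 2 below are hypothetical values chosen only to exercise the two expressions.

```python
def direct_plus_swap(r1, r2, x):
    """Return from investing all funds in asset 1 plus a swap of relative size x
    that pays the fixed rate r1 and receives the risky return r2."""
    return r1 + x * (r2 - r1)


def portfolio_return(r1, r2, x):
    """Return from holding (1 - x) in asset 1 and x in asset 2."""
    return (1.0 - x) * r1 + x * r2


r1 = 6.0                              # fixed rate, percent
x = 1.5                               # notional value relative to initial funds
outcomes = (-20.0, 0.0, 10.0, 35.0)   # hypothetical realized returns on asset 2

pairs = [(direct_plus_swap(r1, r2, x), portfolio_return(r1, r2, x)) for r2 in outcomes]
for r2, (with_swap, direct) in zip(outcomes, pairs):
    print(f"r2={r2:6.1f}  swap route={with_swap:6.1f}  portfolio route={direct:6.1f}")
```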

While x represents a natural measure for use when contracting for a zero-investment strategy, in many
cases a more useful measure of scale is the standard deviation of ending value. In this case we normalize
it, dividing by the amount of the Investor's initial funds to obtain the resultant standard deviation of overall
return:

sp = x*s2

We are now ready to re-examine the choice between two mutually exclusive risky assets under the
assumption that it is possible to take long or short positions as desired, either directly or indirectly via
derivative strategies such as swaps. In the present view, one must select between two zero-investment
strategies. One provides e(xra)/s(xra) per unit of risk, while the other provides e(xrb)/s(xrb) per unit of
risk. Assume that a fixed amount of risk, sp, is desired. Then the expected returns of the two strategies
will be:

ea = e1 + (e(xra)/s(xra))*sp

and

eb = e1 + (e(xrb)/s(xrb))*sp

Clearly the strategy with the larger ratio of expected excess return to standard deviation of excess return
is better. But this ratio is the excess return Sharpe Ratio. In a situation of this sort one should pick the
alternative that provides the highest reward per unit of variability. More simply put: among mutually
exclusive risky portfolios, pick the one with the greatest excess return Sharpe Ratio.
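A numerical illustration may help. In the Python sketch below, assets a and b are hypothetical (a offers 10% with a risk of 15%, b offers 12% with a risk of 20%), the riskless rate is 6%, and the desired risk equals that of asset a. The function name is an assumption.

```python
def expected_return_at_risk(e1, e_asset, s_asset, sp):
    """Expected return of a mix of one risky asset and the riskless asset
    scaled to have risk sp. Returns (expected return, Sharpe Ratio, x)."""
    xrsr = (e_asset - e1) / s_asset   # excess return Sharpe Ratio
    x = sp / s_asset                  # proportion held in the risky asset
    return e1 + xrsr * sp, xrsr, x


e1 = 6.0
sp_target = 15.0

ea, xrsr_a, xa = expected_return_at_risk(e1, 10.0, 15.0, sp_target)
eb, xrsr_b, xb = expected_return_at_risk(e1, 12.0, 20.0, sp_target)

print(f"a: xrsr={xrsr_a:.4f}  x={xa:.2f}  expected return={ea:.2f}")
print(f"b: xrsr={xrsr_b:.4f}  x={xb:.2f}  expected return={eb:.2f}")
```

Asset b has the higher ratio (0.30 against about 0.27), so placing 75% of funds in b and lending the rest gives the same risk as asset a with an expected return of 10.5% instead of 10%.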

Practitioners often compute excess return Sharpe Ratios based on historic returns for mutual funds and
other investment products. The usefulness of such measures may be limited due to lack of conformance
with the assumptions that we have made in this section. First, the distribution of historic returns may be a
poor surrogate for the distribution of next period's return. Second, it may not be possible to borrow at the
same rate of interest used in the calculations of excess return. Finally, the Investor may have other assets
and/or liabilities, and the funds being compared may provide different degrees of correlation with them.
Since the Sharpe Ratio takes into account only expected return and risk, it may fail to lead to the best
investment if correlations with important assets and liabilities differ significantly among the alternatives.

Despite these caveats, the Sharpe Ratio is a useful measure that can combine aspects of both expected
return and risk in one number. The excess return Sharpe Ratio represents one application of the broader
concept. Later sections present other examples.

Optimization
The Gradient Method
Optimal Portfolios without Bounds on Holdings
The Critical Line Method


The Gradient Method

Contents:
Optimization Procedures
The Standard Asset Allocation Problem
A Three-Asset Example
The Utility Hill
Asset Marginal Utility
The Optimal Feasible Swap
The Optimal Feasible Amount to Swap
An Algorithm for the Standard Problem
Function GQP
The Optimization Worksheet

Optimization Procedures
The goal of the Analyst is to help the Investor "do what is best". Their joint objective should be to make a
set of investment decisions that will provide the maximum possible utility for the Investor. In some cases
this can be formalized as a problem involving the maximization of an objective function (such as the utility
of a portfolio for the Investor) subject to one or more constraints (such as those imposed by the Investor's
level of wealth). In the investment arena the process of solving such a problem is often termed
optimization. Procedures for efficiently determining optimal strategies are frequently called optimization
algorithms. Not surprisingly, computer programs that use such procedures are generally described as
optimizers.

The sections that follow deal with classes of optimization problems frequently encountered by Analysts
and methods that can be employed to solve such problems.

The Standard Asset Allocation Problem


We focus on the allocation of an Investor's assets among several asset classes so as to maximize the utility
of the resulting portfolio of assets for the Investor, taking into account the Investor's risk tolerance and
relevant constraints on asset holdings. Many Analysts approach such problems using one-period estimates
of asset risks, correlations and expected returns, assuming that the Investor's utility is a function of the
expected return and standard deviation of return of the selected portfolio. More precisely, the Investor's
utility is represented as a linear function of the mean and variance of the portfolio of assets:

u = ep - vp/rt

where:

u = the utility of the portfolio for the Investor


ep = the expected return of the portfolio
vp = the variance of the portfolio return
rt = the Investor's risk tolerance

Here, rt represents the investor's marginal rate of substitution of variance for expected value. Moreover, u,
the measure of portfolio utility, can be interpreted as a risk-adjusted expected return, since it is computed
by subtracting a risk penalty (vp/rt) from the expected return (ep).

The expected return of a portfolio will, of course, depend on its composition and on the expected returns
of its components. A portfolio is represented by a vector of holdings, expressed as proportionate values.
Let x be an {n*1} vector of such proportions and e be an {n*1} vector of asset expected returns. Then the
expected return on portfolio x will be:

ep = x'*e

The variance of a portfolio's return will depend on its composition and on the covariances among the
various asset classes. Let C be an {n*n} matrix of such covariances. Then the variance of return for
portfolio x will be:

vp = x'*C*x

The goal is to find the best portfolio -- here, the one with the maximum possible utility. The decision
variables are the asset holdings -- that is, the elements of vector x. As these are varied, the utility of the
associated portfolio will change. We wish to vary them until the maximum possible utility is attained.
However, there are typically constraints on the allowable combinations. In the standard problem the x
values represent proportionate holdings of assets. In such a case only values of x that sum to 1.0 may be
considered. We thus must obey a full-investment constraint:

sum(x) = 1

To generalize slightly in anticipation of more complex problems, we will require that:

sum(x) = k

where k is a constant.

Often there will be further constraints on asset holdings. In many situations short sales are precluded,
hence only non-negative values of the x's are allowed. Upper limits may also apply. The standard problem
includes both types of bounds. Let lb be an {n*1} vector of lower bounds and ub an {n*1} vector of upper
bounds. Then we require that each value of x(i) must be below or at its upper bound ub(i) and above or
equal to its lower bound lb(i). In vector terms:

x <= ub
x >= lb

More succinctly:

lb <= x <= ub

Cases involving only lower bounds can be treated by assigning values of plus infinity to upper bounds,
while those involving only upper bounds can be treated by assigning values of minus infinity to lower
bounds. If there are no bounds, both procedures can be invoked. This makes the standard problem
formulation more general than one might initially assume.

We will refer to this as the "standard asset allocation problem". To summarize, the goal is to:

Select:
x

to maximize:
u = ep - vp/rt

where:
ep = x'*e
vp = x'*C*x

subject to:
sum(x) = k
lb <= x <= ub

Note that this involves the maximization of a quadratic function of the decision variables, subject to a set
of linear constraints, some of which are inequalities. A problem with such characteristics is termed a
quadratic programming (QP) problem. It may be solved with a general quadratic programming algorithm
or with a procedure designed to deal only with problems that have similar structures. Here we introduce an
algorithm that can solve the standard asset allocation problem in a simple and intuitive way. While
somewhat limited in its range of application, it is easy to program and illustrates key economic principles
that apply to a very broad range of optimization problems in Macro-investment Analysis.
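The ingredients of the standard problem translate directly into code. The Python sketch below evaluates the objective u = ep - vp/rt and tests the constraints for a candidate portfolio; it does not solve the problem. The function names, the tolerance and the two-asset inputs (e=[6,10], risks of 5 and 15, correlation 0.5, rt=50) are assumptions made for illustration. Infinite bounds are represented by Python's float infinity.

```python
INF = float("inf")


def expected_return(x, e):
    """ep = x'*e."""
    return sum(x_i * e_i for x_i, e_i in zip(x, e))


def variance(x, C):
    """vp = x'*C*x."""
    n = len(x)
    return sum(x[i] * C[i][j] * x[j] for i in range(n) for j in range(n))


def utility(x, e, C, rt):
    """u = ep - vp/rt."""
    return expected_return(x, e) - variance(x, C) / rt


def is_feasible(x, k, lb, ub, tol=1e-9):
    """True if sum(x) = k and lb <= x <= ub, within a small tolerance."""
    sums_to_k = abs(sum(x) - k) <= tol
    within_bounds = all(l - tol <= x_i <= u + tol for x_i, l, u in zip(x, lb, ub))
    return sums_to_k and within_bounds


e = [6.0, 10.0]
C = [[25.0, 37.5],
     [37.5, 225.0]]
rt = 50.0

x = [0.5, 0.5]
u = utility(x, e, C, rt)
long_only_ok = is_feasible(x, 1.0, [0.0, 0.0], [1.0, 1.0])

levered = [-0.5, 1.5]
levered_bounded_ok = is_feasible(levered, 1.0, [0.0, 0.0], [1.0, 1.0])
levered_unbounded_ok = is_feasible(levered, 1.0, [-INF, -INF], [INF, INF])

print(u, long_only_ok, levered_bounded_ok, levered_unbounded_ok)
```

The levered portfolio violates the bounds of 0 and 1 but becomes feasible once the bounds are set to minus and plus infinity, which is how the unbounded case fits the standard formulation.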

A Three-Asset Example
To illustrate the steps in optimization procedures, we use a simple example involving three assets -- cash,
bonds and stocks. The standard deviations and correlations among their returns are similar to those of real
returns on diversified index mutual funds during the period from 1980 through 1995. Expected returns are
similar to mean real returns over that period. All monthly values are annualized. It is important to note that
the period in question involved very high average returns. Unbiased estimates of future expected returns
would in all likelihood be considerably lower.

It is convenient to include all the key asset-related information in one block. Here are the inputs for our
example formatted for use in the optimization worksheet provided for solving such problems:

         MIN   INIT   MAX   ExpRet  StdDev  c:cash  c:bonds  c:stocks
cash     0.00  1.00   1.00    2.80    1.00    1.00     0.40      0.15
bonds    0.00  0.00   1.00    6.30    7.40    0.40     1.00      0.35
stocks   0.00  0.00   1.00   10.80   15.40    0.15     0.35      1.00

The first column shows the lower bounds (all zero in this case) and the third column the upper bounds (all
1.00). The second column indicates the initial portfolio. In this case all of its assets are invested in cash.
The fourth and fifth columns show the expected returns and standard deviations of the assets, respectively,
stated in terms of percent return per year (for example, stocks are expected to return 10.80% per year). The
final columns provide estimates of the correlations among the asset classes.

In matrix terms, the problem inputs are:

e=
2.80
6.30
10.80

sd =
1.00
7.40
15.40

cc =
1.00 0.40 0.15
0.40 1.00 0.35
0.15 0.35 1.00

lbd =
0.00
0.00
0.00

ubd =
1.00
1.00
1.00

x0 =
1.00
0.00
0.00

For computational purposes we need the covariance matrix C:

C = (sd*sd').*cc

C=
1.000 2.960 2.310
2.960 54.760 39.886
2.310 39.886 237.160
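For readers working outside MATLAB, the element-by-element product (sd*sd').*cc can be written out explicitly. The Python sketch below builds C from the standard deviations and correlations of the example; the function name is an assumption.

```python
def covariance_matrix(sd, cc):
    """C(i,j) = sd(i)*sd(j)*cc(i,j), the equivalent of (sd*sd').*cc."""
    n = len(sd)
    return [[sd[i] * sd[j] * cc[i][j] for j in range(n)] for i in range(n)]


sd = [1.00, 7.40, 15.40]
cc = [[1.00, 0.40, 0.15],
      [0.40, 1.00, 0.35],
      [0.15, 0.35, 1.00]]

C = covariance_matrix(sd, cc)

for row in C:
    print("  ".join(f"{value:9.3f}" for value in row))
```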

Two other inputs are needed. The first is k, the required sum of the values in the x vector. Here we
compute it from the initial portfolio, so that the sum of the x values is required to be the same as it is currently:

k = sum(x0)

k=1

The other input is the Investor's risk tolerance. In this case we assume that it is 50 -- a value representing a
moderate attitude towards risk-taking.

rt = 50

The Utility Hill


Although our example involves three decision variables (cash, bonds and stocks), the full-investment
constraint restricts portfolios to combinations that sum to one. Thus we can characterize the problem as one
of choosing, say, proportions to be invested in bonds and stocks, with any remaining amount invested in
cash. This makes it possible to graph the relationship between two of the decision variables and the
measure of merit. The resulting surface will have some of the attributes of a hill. However, only a portion
of this "utility hill" is feasible. We must restrict our search to coordinates in which the sum of the amounts
invested in bonds and stocks is 1.0 or less.

The diagram below shows the feasible portion of the utility hill over which we can conduct our
exploration.

Note that the highest feasible point involves somewhat more investment in stocks than in bonds, and no
investment in cash. Note also that it provides the Investor with considerably more utility (over 6.0%) than
the initial all-cash portfolio shown at the bottom of the hill, which provides less than 3.0% in utility.

We have called this surface a utility hill for a reason. It resembles at least a portion of a hill or mountain. In
a sense, our job is to climb to the highest feasible point on the hill. We will do this in stages. We start with
a feasible portfolio. Then we find the feasible direction in which we can move upward at the greatest rate.
More specifically, we select the direction that will result in the greatest increase in altitude (utility) per step
(change in portfolio holdings) -- that is, the steepest gradient. Having selected a direction, we climb until
we either reach a peak or a boundary that we cannot cross. Then we determine the feasible direction of
steepest ascent again and repeat the process. When no feasible direction leads upward, we stop. Given the
nature of the terrain in a standard problem, this procedure will place us on the highest allowable point --
that is, provide the portfolio with the greatest possible utility.

Asset Marginal Utility


Consider the effect of a small change in the holding of one asset on the portfolio's utility. Recall that:

u = ep - vp / rt

Let mu(i) be the marginal utility of asset i -- the derivative of u with respect to x(i):

mu(i) = du/dx(i)

This will be related to the marginal expected return of asset i:

dep/dx(i)

and its marginal risk:


dvp/dx(i)

as follows:

mu(i) = dep/dx(i) - (1/rt)*dvp/dx(i)

But we know that:

dep/dx(i) = e(i)

and

dvp/dx(i) = the i'th row of 2*C*x

where C is the asset covariance matrix.

Given this, we can compute mu, the vector containing the marginal utilities of all the assets directly:

mu = e - (1/rt)*2*C*x

where x is the current portfolio.

For the example we begin with the current portfolio (x) equal to x0. The resulting marginal utilities are:

mu =
2.7600
6.1816
10.7076

Thus utility will change at a rate of 10.7076% per unit change in the amount invested in asset 3, as long as
the change in the latter is small. The rate of change will be 6.1816% for asset 2 and 2.7600% for asset 1.
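
The marginal utility figures above can be reproduced with a short Python sketch of mu = e - (1/rt)*2*C*x (the text itself uses MATLAB). The helper name marginal_utilities is an assumption made for illustration; the inputs are the example's own.

```python
# Inputs from the three-asset example (cash, bonds, stocks).
e = [2.80, 6.30, 10.80]
C = [
    [1.000, 2.960, 2.310],
    [2.960, 54.760, 39.886],
    [2.310, 39.886, 237.160],
]
rt = 50.0
x0 = [1.0, 0.0, 0.0]


def marginal_utilities(e, C, rt, x):
    """Return mu with mu[i] = e[i] - (2/rt) * (row i of C times x)."""
    n = len(x)
    return [
        e[i] - (2.0 / rt) * sum(C[i][j] * x[j] for j in range(n))
        for i in range(n)
    ]


mu = marginal_utilities(e, C, rt, x0)
print([round(value, 4) for value in mu])
```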

The Optimal Feasible Swap


Marginal utilities provide important information about a portfolio -- information that can be used to
improve it -- to alter its composition so as to increase its utility for the Investor in question.

In this case all the marginal utilities are positive, indicating that more of any asset would increase utility. It
would be lovely if we could increase the proportions invested in all three assets or, better yet, only the best
of the three. But such is not feasible. The only changes that can be considered are those that meet the
constraint that the sum of the proportions equal k. Since this is already the case for the current portfolio, we
must restrict our choices to changes in which the sum of the amounts by which we increase one or more
assets equals the sum of the amounts by which we decrease one or more of the others.

In this case the most attractive asset to increase is the third (stocks). Happily, it can be increased, since the
current amount (0.00) is below the upper bound (1.00). The least attractive asset to increase is the first
(cash). But it is also the most attractive asset to decrease. This suggests a two-asset swap in which asset 3
is increased and asset 1 decreased. Such a swap would increase utility at the rate of 10.7076-2.7600 or
7.9476% per unit amount of the swap, as long as the change in the latter is small. Fortunately, this
particular swap is feasible, since (1) the asset to be increased (stock) is currently below its upper bound and
(2) the current value of the asset to be decreased (cash) is 1.00, well above its lower bound of 0.00.

We conclude that if a small two-asset swap is to be undertaken, the best possibility involves a decrease in
the amount of cash, with an equivalent increase in the amount invested in stocks. If the current portfolio is
actually held, this requires the sale of cash securities, with the proceeds used to buy stocks. In the more
likely event that the calculations are being made to determine an optimal portfolio to be held, the "buys"
and "sells" would be hypothetical. For simplicity, however, we use the terms "sell", "buy" and "swap" to
describe both cases.

In a problem of this type, a swap is only feasible if the asset to be decreased is above its lower bound and
the asset to be increased is below its upper bound. Moreover, if the current portfolio satisfies the constraint
that the sum of the holdings equals k, so will any portfolio resulting from such a swap, as long as the
magnitude of the swap is not too large. Since (by definition) there are no additional constraints in a
standard problem, these conditions are both necessary and sufficient for a swap to be feasible. These
observations lead to the rules for finding the optimal feasible two-security swap:

For all securities i for which x(i)>lb(i), find the one with the smallest value of mu(i).

Let this asset be isell and its marginal utility mu(isell)

For all securities i for which x(i)<ub(i), find the one with the largest value of mu(i).

Let this asset be ibuy and its marginal utility mu(ibuy)

The optimal two-security swap is then to sell isell and buy ibuy

This will increase utility at the rate: mu(ibuy)-mu(isell)

Clearly, if mu(ibuy)-mu(isell) is positive, the portfolio's utility can be increased. Moreover, no other small
change can increase it by a larger amount. Why? Because (1) any other set of purchases will prove
inferior to the purchase of ibuy, since ibuy's marginal utility is greater than that of any other potential
feasible purchase or set of purchases and (2) any other set of sales will prove inferior to the sale of isell, since isell's
marginal utility is smaller than that of any other potential feasible sale or set of sales. The optimal two-
security swap is thus the optimal swap in a standard problem.
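
The selection rules above can be sketched in Python as follows (the text itself uses MATLAB). The function name best_swap is hypothetical, indices are 0-based rather than MATLAB's 1-based, and ties are broken by taking the first candidate, which is one of the arbitrary choices the rules allow.

```python
def best_swap(mu, x, lb, ub):
    """Return (ibuy, isell) as 0-based indices.

    ibuy is the asset below its upper bound with the largest mu;
    isell is the asset above its lower bound with the smallest mu.
    Either entry is None when no asset qualifies on that side.
    """
    n = len(x)
    buy_candidates = [i for i in range(n) if x[i] < ub[i]]
    sell_candidates = [i for i in range(n) if x[i] > lb[i]]
    ibuy = max(buy_candidates, key=lambda i: mu[i]) if buy_candidates else None
    isell = min(sell_candidates, key=lambda i: mu[i]) if sell_candidates else None
    return ibuy, isell


# Marginal utilities and holdings for the all-cash starting portfolio.
mu = [2.7600, 6.1816, 10.7076]
x = [1.0, 0.0, 0.0]
lb = [0.0, 0.0, 0.0]
ub = [1.0, 1.0, 1.0]

ibuy, isell = best_swap(mu, x, lb, ub)
gain_rate = mu[ibuy] - mu[isell]
print(ibuy, isell, round(gain_rate, 4))
```

Here index 2 (stocks) is bought and index 0 (cash) is sold, matching the swap identified in the text.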

Special cases require slightly more general rules. If two or more assets have the same marginal utility, the
choice of isell or ibuy may not be unique and an arbitrary choice may be made. If all securities are at their
lower bounds or at their upper bounds, no improvement is possible. This is also the case if mu(ibuy)
equals mu(isell) -- a condition that proves central for both optimization and an understanding of
equilibrium in financial markets.

The Optimal Feasible Amount to Swap


The optimal feasible swap will increase utility at the greatest possible rate per unit swapped. But the rate of
increase will change as the size of the swap increases. At some point, utility will reach its peak, then
decline. Moreover, the feasible amount of a swap will be limited by the upper bound on the asset being
purchased and by the lower bound on the asset being sold. All these factors need to be taken into account
in order to find the optimal feasible size for any desired swap.

For generality, we represent a swap by a vector s of changes in asset holdings, where the sum of the
elements is zero. In our example, the optimal feasible swap is:

s=
-1
0
1

Now, let a represent the amount swapped, so that the net effect is to change the portfolio by an amount
equal to s*a.

For example, if a = 0.10:

s*a =
-0.10
0.00
0.10

Let cx denote this set of changes:

cx = s*a

Then if such changes are made to portfolio x, the result will be a new portfolio xx, given by:

xx = x + cx

or

xx = x + s*a

Now, consider the utilities of portfolios x and xx:

u(x) = x'*e - (1/rt)* x'*C*x

u(xx) = (x' + cx')*e - (1/rt)*(x'+cx')*C*(x+cx)

We are interested in the change in utility cu=u(xx)-u(x). Expanding the second formula and subtracting the
first gives:

cu = cx'*e - (1/rt)*(2*x'*C*cx + cx'*C*cx)

Substituting s*a for cx and simplifying gives:

cu = [s'*e]*a - (1/rt) * ( [ 2*x'*C*s]*a + [s'*C*s]*(a^2))

Rearranging terms to express cu as a function of the amount of the swap, we obtain:

cu = [ s'*e - (1/rt)*2*s'*C*x]*a - [(s'*C*s)/rt]*(a^2)

or

cu = k0*a - k1*(a^2)

where:

k0 = s'*(e - (1/rt)*2*C*x)

k1 = (s'*C*s)/rt

In our example:

k0 = 7.9476

k1 = 4.6708
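
One way to check the algebra is to compare the quadratic cu = k0*a - k1*(a^2) with the utility difference u(xx) - u(x) computed directly. The Python sketch below (the text uses MATLAB) does this for a = 0.10; the helper names dot, matvec and utility are illustrative assumptions.

```python
e = [2.80, 6.30, 10.80]
C = [
    [1.000, 2.960, 2.310],
    [2.960, 54.760, 39.886],
    [2.310, 39.886, 237.160],
]
rt = 50.0
x = [1.0, 0.0, 0.0]    # current portfolio (all cash)
s = [-1.0, 0.0, 1.0]   # swap vector: sell cash, buy stocks


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def utility(x):
    """u = ep - vp/rt, with ep = x'*e and vp = x'*C*x."""
    return dot(x, e) - dot(x, matvec(C, x)) / rt


# Coefficients of the quadratic in the swap amount a.
Cx = matvec(C, x)
k0 = dot(s, [e[i] - (2.0 / rt) * Cx[i] for i in range(len(x))])
k1 = dot(s, matvec(C, s)) / rt

a = 0.10
xx = [x[i] + s[i] * a for i in range(len(x))]
cu_direct = utility(xx) - utility(x)
cu_formula = k0 * a - k1 * a ** 2
print(round(k0, 4), round(k1, 4), round(cu_direct, 6), round(cu_formula, 6))
```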

The figure below plots cu as a function of a for this case.

Note that the change in utility increases at a decreasing rate. This is not surprising, since the function is
quadratic with a negative quadratic term (-k1). This is a characteristic of all changes in portfolio
composition. Given the nature of covariance matrices, the term s'*C*s will be positive for any set of changes
represented by vector s. As long as the investor's risk tolerance is positive, k1 will be also, so that -k1 will
be negative. We thus conclude that:

Portfolio revision is subject to decreasing returns to scale. The greater the magnitude of a
revision, the smaller will be the rate of further increase in portfolio utility -- that is, utility will
increase at a decreasing rate.

Note that k0, the coefficient of the first term in this expression, is equal to the net marginal utility of the swap. This is not
surprising, since the marginal utility measures the effect on utility of an infinitesimal change in holdings.

In this case, the maximum possible change in utility is obtained by swapping a large amount of the
portfolio. The actual amount may be calculated directly. We seek the value of a at which cu is maximized.
Since cu must have a positive slope at the origin and the slope must decrease with a, we need only set the
derivative equal to zero:

dcu/da = k0 - 2*k1*a = 0

or

a = k0 / ( 2*k1 )

In this case:

a = 7.9476 / (2 * 4.6708) = 0.8508

Thus the optimal amount to swap is 0.8508.

Of course, this calculation does not take into account the upper and lower bounds on the holdings. To
remain feasible, we cannot buy an amount of ibuy that will make x(ibuy) exceed ub(ibuy). Nor can we sell
an amount of isell that will make x(isell) fall below lb(isell). The optimal feasible amount to swap (a) is
thus:

a = min( [ k0 / (2*k1), ub(ibuy) - x(ibuy), x(isell) - lb(isell) ] )

In our case:

a = 0.8508

cu = k0*a - k1*(a^2) = 3.3808

So the optimal amount of the swap will add 3.3808 to the utility of the portfolio.
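
The calculation of the optimal feasible amount can be sketched in Python (the text uses MATLAB). The function name optimal_amount is an assumption, and the second call uses a hypothetical upper bound of 0.50 on stocks, which is not part of the example, simply to show a bound becoming binding.

```python
def optimal_amount(k0, k1, room_to_buy, room_to_sell):
    """Return min(k0/(2*k1), room_to_buy, room_to_sell).

    room_to_buy is ub(ibuy) - x(ibuy); room_to_sell is x(isell) - lb(isell).
    This sketch assumes k1 > 0, as in the text.
    """
    return min(k0 / (2.0 * k1), room_to_buy, room_to_sell)


k0 = 7.9476
k1 = 4.6708

# Example from the text: stocks may rise by 1.00 and cash may fall by 1.00.
a = optimal_amount(k0, k1, 1.0 - 0.0, 1.0 - 0.0)
cu = k0 * a - k1 * a ** 2
print(round(a, 4), round(cu, 4))

# Hypothetical variation: an upper bound of 0.50 on stocks limits the swap.
a_capped = optimal_amount(k0, k1, 0.5 - 0.0, 1.0 - 0.0)
cu_capped = k0 * a_capped - k1 * a_capped ** 2
print(a_capped, round(cu_capped, 4))
```

In the hypothetical capped case the gain in utility is smaller than in the uncapped case, since the swap stops short of the peak of the quadratic.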

An Algorithm for the Standard Problem


We now have all the ingredients needed to solve a standard problem. Starting with any feasible portfolio,
we can find the optimal feasible swap and the optimal amount of that swap. After making a swap in the
appropriate amount we have a new (and improved) feasible portfolio. But this may also be improved,
using the same technique. By repeating the process until no further improvement is possible, we reach the
goal of maximizing utility without violating the stated constraints.

The central part of the algorithm uses a loop that continues until a terminal condition is fulfilled. This can
be written somewhat inelegantly in pseudo-MATLAB as:

while 1==1
    [ do computations ]
    if [ finished ]
        return
    end;
end;

To begin, of course, an initial feasible portfolio is required. Assuming that x0 meets this requirement, we
precede the loop with:
% set initial mix and number of assets
x = x0;
n = length(x);

This also sets n as the number of assets for later use.

Inside the loop, the first computations are those that compute the marginal utilities and find the best asset to
buy and the best to sell. For expository purposes we do this in a somewhat inefficient manner, as follows:

% compute marginal utilities
mu = e - (1/rt)*2*C*x;
% find best assets to buy and sell
ibuy = 0;
mubuy = -1E200;
isell = 0;
musell = 1E200;
for i = 1:n
    if x(i) < ub(i) % possible buy
        if mu(i) > mubuy
            mubuy = mu(i);
            ibuy = i;
        end;
    end;
    if x(i) > lb(i) % possible sell
        if mu(i) < musell
            musell = mu(i);
            isell = i;
        end;
    end;
end;

This results in a large negative value of mubuy if no assets may be bought and a large positive value of
musell if no assets may be sold. Otherwise, mubuy and musell are the marginal utilities of ibuy and isell,
respectively, the best assets to buy and sell.

At this point it is time to check to see if the procedure should be terminated. If the net marginal utility
of the optimal feasible swap (mubuy - musell) is zero or negative, no more improvement is
possible. Given the nature of computed values, it is desirable to stop when this amount is less than some
minimum threshold. For example:

% terminate if change in mu is less than threshold value
if (mubuy - musell) <= 0.0001
    return
end

In the optimization worksheet this constant is termed the marginal utility cutoff. If speed is more important
than accuracy, it should be set to a relatively large value (for example, .001). If accuracy is more important
than speed, it should be set to a relatively small value (e.g. 0.00001).

Note the procedure for termination also covers cases in which no assets remain to be purchased and/or
sold, since we have set the values of musell and mubuy to be very large and very small, respectively, in
such cases. A bit devious, but effective.

If the termination condition has not been met, it is time to make a swap. First, we set up the vector
describing the optimal swap:

% set up swap vector
s = zeros(n,1);
s(ibuy) = 1;
s(isell) = -1;

Then we compute the optimal amount to swap without considering the effects of asset bounds:
% compute optimal amount of swap without regard to asset bounds
k0 = s'*(e - (1/rt)*2*C*x);
k1 = (s'*C*s)/rt;
a = k0/(2*k1);

This may or may not be feasible. If necessary, the amount to be swapped is reduced to avoid violating the
upper bound of the asset to be bought or the lower bound of the asset to be sold:
% reduce amount if required to keep ibuy from exceeding its upper bound
if a > (ub(ibuy) - x(ibuy))
    a = ub(ibuy) - x(ibuy);
end;
% reduce amount if required to keep isell from falling below its lower bound
if a > (x(isell) - lb(isell))
    a = x(isell) - lb(isell);
end;

To avoid the possibility of an infinite loop, it is wise to check for a zero amount and terminate if such is
encountered:

% terminate if amount is zero
if a == 0
    return;
end

Finally (!) it is time to revise the portfolio to improve it as much as possible and be in a position to repeat
the process all over again:
% change mix
x = x + ( s*a) ;

Here is the entire algorithm set up as a MATLAB function that takes rt, e, C, lb, ub and x0 as inputs and
returns the optimal portfolio x as an output.
function x = gmqp(rt,e,C,lb,ub,x0);
% determines solution to a standard optimization problem
% usage:
%    x = gmqp(rt,e,C,lb,ub,x0);
%    rt = Investor risk tolerance
%    e  = {n*1} vector of asset expected returns
%    C  = {n*n} return covariance matrix
%    lb = {n*1} vector of asset lower bounds
%    ub = {n*1} vector of asset upper bounds
%    x0 = {n*1} vector of initial feasible asset mix

% set initial mix and number of assets
x = x0;
n = length(x);
while 1==1;
    % compute marginal utilities
    mu = e - (1/rt)*2*C*x;
    % find best assets to buy and sell
    ibuy = 0;
    mubuy = -1E200;
    isell = 0;
    musell = 1E200;
    for i = 1:n
        if x(i) < ub(i) % possible buy
            if mu(i) > mubuy
                mubuy = mu(i);
                ibuy = i;
            end;
        end;
        if x(i) > lb(i) % possible sell
            if mu(i) < musell
                musell = mu(i);
                isell = i;
            end;
        end;
    end;
    % terminate if change in mu is less than threshold value
    if (mubuy - musell) <= 0.0001
        return
    end
    % set up swap vector
    s = zeros(n,1);
    s(ibuy) = 1;
    s(isell) = -1;
    % compute optimal amount of swap without regard to asset bounds
    k0 = s'*(e - (1/rt)*2*C*x);
    k1 = (s'*C*s)/rt;
    a = k0 / (2*k1);
    % reduce amount if required to keep ibuy from exceeding its upper bound
    if a > (ub(ibuy) - x(ibuy))
        a = ub(ibuy) - x(ibuy);
    end;
    % reduce amount if required to keep isell from falling below its lower bound
    if a > (x(isell) - lb(isell))
        a = x(isell) - lb(isell);
    end;
    % terminate if amount is zero
    if a == 0
        return;
    end
    % change mix
    x = x + ( s*a) ;
end;

With this function in a directory in the MATLAB path under the name gmqp.m, you could obtain the
solution to a standard problem by simply giving the following command at the command line or in a
program:
x = gmqp(rt,e,C,lb,ub,x0)

For our problem:


x=
0
0.3996
0.6004

Thus the optimal asset mix (portfolio) contains no cash and has roughly 40% invested in bonds and 60% in
stocks.
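
For readers without MATLAB, the same procedure can be sketched in plain Python. This is an illustrative port, not the author's code: it uses 0-based indices, and the max_iter guard and the check that k1 is positive are additions assumed here for safety. The default cutoff matches the 0.0001 used in the listing above.

```python
def gmqp(rt, e, C, lb, ub, x0, cutoff=0.0001, max_iter=10000):
    """Gradient method for the standard problem, following function gmqp."""
    x = list(x0)
    n = len(x)
    for _ in range(max_iter):  # max_iter is a safety guard added in this sketch
        # marginal utilities: mu = e - (1/rt)*2*C*x
        mu = [e[i] - (2.0 / rt) * sum(C[i][j] * x[j] for j in range(n))
              for i in range(n)]
        # best asset to buy (below upper bound) and to sell (above lower bound)
        buy = [i for i in range(n) if x[i] < ub[i]]
        sell = [i for i in range(n) if x[i] > lb[i]]
        if not buy or not sell:
            break
        ibuy = max(buy, key=lambda i: mu[i])
        isell = min(sell, key=lambda i: mu[i])
        # net marginal utility of the swap (this equals k0 in the text)
        k0 = mu[ibuy] - mu[isell]
        if k0 <= cutoff:
            break
        # k1 = s'*C*s/rt for s = +1 at ibuy and -1 at isell
        k1 = (C[ibuy][ibuy] + C[isell][isell] - 2.0 * C[ibuy][isell]) / rt
        # largest amount allowed by the two bounds
        a = min(ub[ibuy] - x[ibuy], x[isell] - lb[isell])
        if k1 > 0:  # guard added in this sketch; the text assumes k1 > 0
            a = min(a, k0 / (2.0 * k1))
        if a <= 0:
            break
        x[ibuy] += a
        x[isell] -= a
    return x


e = [2.80, 6.30, 10.80]
C = [
    [1.000, 2.960, 2.310],
    [2.960, 54.760, 39.886],
    [2.310, 39.886, 237.160],
]
x = gmqp(50.0, e, C, [0.0] * 3, [1.0] * 3, [1.0, 0.0, 0.0])
print([round(value, 4) for value in x])
```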

Function GQP
While function gmqp suffices for most standard problems, it cannot handle cases in which rt=0, which
would result in attempts to divide portfolio variance by zero. However, economically meaningful
problems arise in which the goal is to minimize variance, as would an Investor with zero tolerance for risk.
Such cases can be handled by changing the units in which utility is measured. Instead of:

u = ep - vp/rt

We can use:

uv = rt*ep - vp

The more commonly-used version (u) divides portfolio variance by rt, the marginal rate of substitution of
variance for expected return, to convert vp to an expected-return equivalent. The latter is then subtracted
from portfolio expected return to give u, a measure of portfolio utility in expected return equivalent terms.

The second measure (uv) multiplies portfolio expected return by rt, the marginal rate of substitution of
variance for expected return, to convert ep to a variance equivalent. The portfolio variance is then
subtracted to obtain a measure of portfolio utility stated in variance-equivalent terms. When rt=0,
maximizing uv is equivalent to minimizing portfolio variance, as desired. When rt is greater than zero,
maximizing uv will give the same mix (x) as will maximizing u.
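
The effect of the change of units can be seen in the marginal values. Differentiating uv = rt*ep - vp gives rt*e - 2*C*x, which involves no division by rt. The Python sketch below (the text uses MATLAB) evaluates this for the all-cash portfolio; the function name marginal_uv is an assumption, and the source of function gqp is not reproduced here.

```python
e = [2.80, 6.30, 10.80]
C = [
    [1.000, 2.960, 2.310],
    [2.960, 54.760, 39.886],
    [2.310, 39.886, 237.160],
]
x0 = [1.0, 0.0, 0.0]


def marginal_uv(e, C, rt, x):
    """Gradient of uv = rt*ep - vp with respect to x: rt*e - 2*C*x."""
    n = len(x)
    return [rt * e[i] - 2.0 * sum(C[i][j] * x[j] for j in range(n))
            for i in range(n)]


muv_zero = marginal_uv(e, C, 0.0, x0)    # well defined even though rt = 0
muv_fifty = marginal_uv(e, C, 50.0, x0)  # equals 50 times the earlier mu vector
print(muv_zero)
print([round(value, 2) for value in muv_fifty])
```

With rt = 0 and the bounds of the example, cash is the only asset that can be sold, and moving out of it into stocks or bonds has a negative net marginal value (-4.62 - (-2.00) and -5.92 - (-2.00) respectively), so no swap would be made from the all-cash portfolio.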

This alteration is incorporated in MATLAB function gqp, which can be used instead of gmqp. Function gqp
has a number of additional features. It uses more efficient vector methods to find the optimal swap and the
appropriate amount of that swap and is thus faster than gmqp. It also computes the expected return and
variance of the portfolio and returns them as additional outputs if requested to do so.

With function gqp in a directory in the MATLAB path under the name gqp.m, you could obtain the
solution to a standard problem by simply giving the following command at the command line or in a
program:

[x,ep,vp] = gqp(rt,e,C,lb,ub,x0)

The Optimization Worksheet


For those without access to MATLAB, all is not lost. The optimization worksheet is a JavaScript
implementation of the gradient algorithm. The format for inputs follows that given in the section above. In
addition, the Investor's risk tolerance and the marginal utility cutoff must be specified. The outputs
obtained from the worksheet using the inputs shown earlier for an Investor with a risk tolerance of 50 are:
PORTFOLIOS:
          Initial   Optimal    Change
cash        1.000     0.000    -1.000
bonds       0.000     0.400     0.400
stocks      0.000     0.600     0.600

CHARACTERISTICS:
          Initial   Optimal    Change
ExpRet      2.800     9.002     6.202
StdDev      1.000    10.648     9.648
Utility     2.780     6.734     3.954

Not surprisingly, the optimal portfolio composition is that obtained earlier (to three decimal places). Also
shown are the expected returns, standard deviations and utilities of the initial and optimal portfolios as well
as the changes in all the variables.
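
The characteristics reported by the worksheet can be approximated with a short Python sketch (the worksheet itself is JavaScript and the text uses MATLAB). The function name characteristics is an assumption, and the optimal mix is entered to four decimal places as reported by gmqp, so the last digit of each result may differ slightly from the table.

```python
import math

e = [2.80, 6.30, 10.80]
C = [
    [1.000, 2.960, 2.310],
    [2.960, 54.760, 39.886],
    [2.310, 39.886, 237.160],
]
rt = 50.0


def characteristics(x):
    """Return (expected return, standard deviation, utility) of portfolio x."""
    n = len(x)
    ep = sum(x[i] * e[i] for i in range(n))
    vp = sum(x[i] * C[i][j] * x[j] for i in range(n) for j in range(n))
    return ep, math.sqrt(vp), ep - vp / rt


initial = characteristics([1.0, 0.0, 0.0])
optimal = characteristics([0.0, 0.3996, 0.6004])
for label, values in (("Initial", initial), ("Optimal", optimal)):
    print(label, [round(value, 3) for value in values])
```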

10/24/12 Optimal Portfolios without Bounds on Holdings

Optimal Portfolios without Bounds on Holdings

Contents:
Characteristics of Optimal Portfolios
Optimal Portfolio Composition without Bounds on Holdings
A Three-Asset Example
Characteristics of Optimal Portfolios without Bounds on Holdings
Additional Linear Equality Constraints
Additional Linear Objectives

Characteristics of Optimal Portfolios


Assume that a standard asset allocation problem has been solved and an optimal portfolio obtained. Each asset will
be in one of three states, depending on the amount invested relative to its required upper and lower bounds. We
can term these "down", "in" and "up", as follows:

down: x(i) = lb(i)
in:   lb(i) < x(i) < ub(i)
up:   x(i) = ub(i)

Since the portfolio is optimal, it must be the case that the marginal utilities of all the "in-variables" are the same. If
this were not the case, it would be possible to improve utility by reallocating money from an in-variable to another
with a higher marginal utility. Thus:

for all in-variables: mu(i) = mup

where mup is a constant.

It must also be the case that the marginal utility of every down-variable is less than or equal to this amount. If this
were not the case, it would be possible to increase the amount of money allocated to such a down-variable and
reduce the amount allocated to an in-variable, thereby increasing utility. Thus:
for all down-variables: mu(i) <= mup

Finally, the marginal utility of every up-variable must be greater than or equal to that of every in-variable --
otherwise utility could be increased by reducing the amount invested in an up-variable and increasing the amount
invested in an in-variable. Thus:

for all up-variables: mu(i) >= mup
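
These three conditions can be checked mechanically. The Python sketch below (the text uses MATLAB) classifies each asset and tests the conditions for the bounded three-asset problem with a risk tolerance of 50. The function names, the tolerance tol, and the comparison portfolio x_other are assumptions made for illustration, and the check assumes at least one in-variable so that mup is defined.

```python
e = [2.80, 6.30, 10.80]
C = [
    [1.000, 2.960, 2.310],
    [2.960, 54.760, 39.886],
    [2.310, 39.886, 237.160],
]
rt = 50.0
lb = [0.0, 0.0, 0.0]
ub = [1.0, 1.0, 1.0]


def marginal_utilities(x):
    n = len(x)
    return [e[i] - (2.0 / rt) * sum(C[i][j] * x[j] for j in range(n))
            for i in range(n)]


def classify(x):
    """Label each asset 'down', 'in' or 'up' relative to its bounds."""
    states = []
    for xi, lo, hi in zip(x, lb, ub):
        if xi <= lo:
            states.append("down")
        elif xi >= hi:
            states.append("up")
        else:
            states.append("in")
    return states


def satisfies_conditions(x, tol=1e-3):
    """Check the three marginal-utility conditions within tolerance tol."""
    mu = marginal_utilities(x)
    states = classify(x)
    in_mu = [m for m, s in zip(mu, states) if s == "in"]
    if not in_mu:
        raise ValueError("this sketch assumes at least one in-variable")
    mup = sum(in_mu) / len(in_mu)
    for m, s in zip(mu, states):
        if s == "in" and abs(m - mup) > tol:
            return False
        if s == "down" and m > mup + tol:
            return False
        if s == "up" and m < mup - tol:
            return False
    return True


x_opt = [0.0, 0.3996, 0.6004]   # mix reported by the gradient method
x_other = [0.2, 0.4, 0.4]       # an arbitrary feasible mix for comparison
print(classify(x_opt), satisfies_conditions(x_opt))
print(classify(x_other), satisfies_conditions(x_other))
```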

We will exploit the implications of all these characteristics when deriving the critical line method for solving
general asset allocation problems. First, however, we will focus on cases in which upper and lower bounds on
asset holdings are either absent or non-binding. In such instances it is possible to determine optimal portfolio
composition analytically -- that is, without using iterative procedures such as those required in the gradient and
critical line methods.

Optimal Portfolio Composition without Bounds on Holdings


If there are no bounds on portfolio holdings, one can use the gradient method to determine an optimal portfolio by

www.stanford.edu/~wfsharpe/mia/opt/mia_opt2.htm 1/14
setting all upper bounds to plus infinity and all lower bounds to minus infinity. In the resultant optimal portfolio, of
course, all variables will be in (that is, between their bounds). As a result, all will have the same marginal utility.
Thus we know that for the optimal portfolio in such a setting:

mu(i) = mup : for all i

Now, recall the formula for calculating the marginal utilities of a set of assets:
mu = e - (1/rt) * 2*C*x

For asset i:
mu(i) = e(i) - (1/rt) * [ 2*C(i,1)*x(1) + 2*C(i,2)*x(2) + ... + 2*C(i,n)*x(n) ]

We seek the composition of the optimal portfolio (x(1), x(2), ... x(n)) and the common marginal utility of all the
assets in that portfolio (mup). This requires that all the marginal utilities be the same, that is:
e(i) - (1/rt) * [ 2*C(i,1)*x(1) + 2*C(i,2)*x(2) + ... + 2*C(i,n)*x(n) ] = mup

for assets 1,2,...n.

Note that each of these equations is linear and that there are n such equations. But there is one more requirement
for a standard portfolio problem:
sum(x) = k

In the absence of liabilities, etc. this takes the familiar form:


sum(x) = 1

This is a linear equation in the x-variables, so we now have n+1 linear equations in n+1 unknowns. Barring
degeneracy, this can be solved by simple matrix inversion and the optimal portfolio obtained directly.

It is useful to state the problem using a straightforward matrix equation. This will ease the task of providing a
solution, facilitate extensions, and most importantly, make evident a number of characteristics of solutions to
particular classes of problems.

To begin, we rewrite the requirement for asset i as:

2*C(i,1)*x(1) + 2*C(i,2)*x(2) + ... + 2*C(i,n)*x(n) + tmup = rt*e(i)

where:

tmup = rt*mup

We will treat tmup as an unknown, recognizing that mup can always be computed after the fact by dividing tmup
by rt. In practice, the values of tmup and mup may be only of passing interest, since the primary goal is to
determine the composition of the optimal portfolio.

Note that all the unknowns are now on the left-hand side. Let y be an {(n+1)*1} element vector that includes all
the variables. Here:
y = [ x(1)
x(2)
...
x(n)
tmup ]

To memorialize the fact that this vector contains the unknown x-variables, we name it y, which is one letter from
x, a convention dating back to the computer in the film "2001" which was named HAL, one letter removed from
IBM (although in the other direction).

Now, let D be an {(n+1)*(n+1)} element matrix that includes information about the asset covariances and the
constraint that the sum of the holdings equals a constant:
D = [ 2*C(1,1)  2*C(1,2)  ...  2*C(1,n)  1
      2*C(2,1)  2*C(2,2)  ...  2*C(2,n)  1
      ...
      2*C(n,1)  2*C(n,2)  ...  2*C(n,n)  1
      1         1         ...  1         0 ]

Note that this matrix is formed by bordering two times the covariance matrix with the coefficients from the left-hand side
of the constraint, hence the name D (one letter after C).

For the remainder of the equation we need two more vectors. The first contains the right-hand side for the full-
investment constraint in row (n+1) and zeros elsewhere. At the risk of some temporary confusion we will call this
k. For the case in which the sum of the x values must equal 1:

k = [ 0
      0
      ...
      0
      1 ]

Note that k contains the constant (hence "k") from the right-hand side of the constraint.

The last vector contains the asset expected returns in the first n rows and zero in the n+1'st row:

f = [ e(1)
e(2)
...
e(n)
0 ]

Since this contains the expected returns (e), we use the next letter (f) for its name.

We can now write a single matrix equation that contains all the conditions required for an optimal portfolio. It is
both simple and elegant:

D*y = k + rt*f

Of course we seek the portfolio that makes this equation hold. Since vector y contains the portfolio, we need to
solve for y. This is simply done by multiplying both sides of the equation by the inverse of D. The result is:

y = inv(D)*k + rt*inv(D)*f

For purposes of interpretation (and in some instances, implementation) it is useful to write this as:
y = mvp + rt*z

where:
mvp = inv(D)*k

and:
z = inv(D)*f

A Three-Asset Example
To illustrate the use of the formulas for optimal portfolio composition without upper and lower bounds we return
to the simple three-asset (cash, bonds and stocks) case used earlier. Expected real returns, risks and correlations
are:

e =[ 2.80
6.30
10.80 ]

sd =[ 1.00
7.40
15.40 ]

cc = [ 1.00 0.40 0.15
       0.40 1.00 0.35
       0.15 0.35 1.00 ]

The corresponding covariance matrix is:


C = (sd*sd').*cc
  = [ 1.000   2.960   2.310
      2.960  54.760  39.886
      2.310  39.886 237.160 ]

We assume the Investor has a risk tolerance (rt) equal to 25 and require that the sum of the holdings equals 1.

This gives the following matrices:

D = [ 2*C ones(3,1)
ones(1,3) 0 ]

D = [ 2.0000   5.9200   4.6200  1.0000
      5.9200 109.5200  79.7720  1.0000
      4.6200  79.7720 474.3200  1.0000
      1.0000   1.0000   1.0000  0 ]

k = [ zeros(3,1)
1 ]

k=[0
0
0
1 ]

f=[e
0]

f = [ 2.80
6.30
10.80
0 ]

We can now find the components of the solution (mvp and z) and the solution (y):
mvp = inv(D)*k

mvp = [ 1.0392
-0.0396
0.0004
-1.8458 ]

z = inv(D)*f

z = [ -0.0389
0.0257
0.0132
2.6648 ]

y = mvp + rt*z

y = [ 0.0671
0.6021
0.3308
64.7731 ]

The first three elements of y contain the optimal portfolio holdings. In this case the best combination involves
6.71% in cash, 60.21% in bonds and 33.08% in stocks. Since every holding is positive, this would also be the
optimal combination for an Investor unable to take short positions in assets. However, this need not always be the
case. Consider, for example, an Investor with a risk tolerance of 50. For such an Investor:
y = mvp + 50*z

y = [ -0.9050
1.2439
0.6611
131.3920 ]

Here the optimal investment involves 124.39% in bonds, 66.11% in stocks and borrowing (a negative cash
position) an amount equal to 90.50% of the initial value in order to finance the holdings in bonds and stocks.
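
The whole three-asset calculation can be reproduced without MATLAB. The Python sketch below builds D, k and f and solves D*y = k and D*y = f by Gaussian elimination with partial pivoting in place of inv(D); the solve helper is an assumption of this sketch, not something used in the text.

```python
def solve(A, b):
    """Solve A*y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly so")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * y[c] for c in range(r + 1, n))
        y[r] = (M[r][n] - tail) / M[r][r]
    return y


e = [2.80, 6.30, 10.80]
C = [
    [1.000, 2.960, 2.310],
    [2.960, 54.760, 39.886],
    [2.310, 39.886, 237.160],
]
n = len(e)

# D borders 2*C with a column and a row of ones; k and f are as in the text.
D = [[2.0 * C[i][j] for j in range(n)] + [1.0] for i in range(n)]
D.append([1.0] * n + [0.0])
k = [0.0] * n + [1.0]
f = e + [0.0]

mvp = solve(D, k)
z = solve(D, f)
y25 = [m + 25.0 * zi for m, zi in zip(mvp, z)]
y50 = [m + 50.0 * zi for m, zi in zip(mvp, z)]

for name, vec in (("mvp", mvp), ("z", z), ("y25", y25), ("y50", y50)):
    print(name, [round(value, 4) for value in vec])
```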

Characteristics of Optimal Portfolios without Bounds on Holdings


As we have shown, the solution to the problem of portfolio choice without bounds on holdings can be written as
an equation involving three vectors (y, mvp, z) and a constant (rt):

y = mvp + rt*z

We turn now to the properties of vectors mvp and z, which tell us a great deal about optimal portfolio holdings.

The Minimum Variance Portfolio


Consider an Investor who wishes to minimize risk, no matter how much expected return is sacrificed in the
process. Such a person will have a risk tolerance of zero. The optimal portfolio will, in turn, be given by
y = mvp + 0*z
= mvp

Thus mvp is the minimum variance portfolio (hence its name).

In our simple example:

mvp = [ 1.0392
-0.0396
0.0004
-1.8458 ]

It is easy to verify the fact that this is indeed a portfolio, since the sum of the x-values (elements 1 through 3)
equals 1.0.

To see why mvp must be a portfolio in this sense, it is useful to consider the problem for which it is a solution.
Start with the original problem formulation:
D*y = k + rt*f

In this case:
D*y = k + 0*f

or:

D*y = k

for which the solution is:


y = inv(D)*k

Recall the components of the matrices and vectors in this case:

[ 2*C(1,1)  2*C(1,2)  2*C(1,3)  1 ]   [ x(1) ]   [ 0 ]
[ 2*C(2,1)  2*C(2,2)  2*C(2,3)  1 ] * [ x(2) ] = [ 0 ]
[ 2*C(3,1)  2*C(3,2)  2*C(3,3)  1 ]   [ x(3) ]   [ 0 ]
[ 1         1         1         0 ]   [ tmup ]   [ 1 ]

The last row requires that:


1*x(1) + 1*x(2) + 1*x(3) + 0*tmup = 1

Hence the solution must be a portfolio, that is, the sum of the holdings must equal 1.

In this example, the minimum-variance portfolio involves borrowing an amount equal to 3.96% of the Investor's
funds by issuing (shorting) bonds, then investing the proceeds plus all the Investor's original funds in a
combination of cash and a minuscule amount of stocks. The portfolio proper is:
x = y(1:3,1)

and its variance (vp) is:

vp = x'*C*x
= 0.9229

giving a standard deviation of:


sdp = sqrt(vp)
= 0.9607

Note that this is less than the standard deviation of cash, which equals 1.0. Such is the power of diversification.
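
The variance and standard deviation quoted above can be checked with a few lines of Python (the text uses MATLAB). The holdings are entered as the rounded four-decimal values printed for mvp, so the results agree with the text only to about three decimal places.

```python
import math

C = [
    [1.000, 2.960, 2.310],
    [2.960, 54.760, 39.886],
    [2.310, 39.886, 237.160],
]

# Asset holdings of the minimum variance portfolio (first three rows of mvp),
# rounded as printed in the text.
x = [1.0392, -0.0396, 0.0004]

n = len(x)
vp = sum(x[i] * C[i][j] * x[j] for i in range(n) for j in range(n))
sdp = math.sqrt(vp)
cash_sd = math.sqrt(C[0][0])
print(round(vp, 4), round(sdp, 4), sdp < cash_sd)
```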

In this example, cash is not riskless, since returns are in real (inflation-adjusted) terms. Had nominal returns been
used, cash could have been considered riskless (if the holding period and the investment period for the cash asset
were the same). Consider, for example a case in which the variance of cash and its covariance with every other
asset is zero. Retaining the original assumptions for bond and stock risks and correlations:

D = [ 0        0         0        1.0000
      0      109.5200   79.7720   1.0000
      0       79.7720  474.3200   1.0000
      1.0000   1.0000    1.0000   0 ]

and

mvp = inv(D)*k

= 1
0
0
0

Not surprisingly, if there is a riskless asset, the minimum variance portfolio is invested exclusively in it.

The Optimal Swap


We turn next to vector z.

Recall that the optimal portfolio for an Investor with a risk tolerance of rt is:
y = mvp + rt*z

Now consider two Investors. One, with risk tolerance of zero, should hold portfolio y0, given by:

y0 = mvp + 0*z
= mvp

The other, with risk tolerance of 1.0, should hold portfolio y1, given by:

y1 = mvp + 1*z
= mvp + z

The differences between their portfolios are contained in the vector:

y1 - y0 = (mvp + z) - mvp = z

Thus z is a vector of differences in holdings between two portfolios. The sum of the asset holdings will thus equal
zero. In our earlier terminology, it is a zero-investment strategy. Hence the name (z).

In the original version of the three-asset example:

z=
-0.0389
0.0257
0.0132
2.6648

Thus an Investor with a risk tolerance of 1.0 should hold 3.89% less in cash, 2.57% more in bonds and 1.32%
more in stocks than an investor with the same amount of money but no tolerance for risk at all. Not surprisingly,
the sum of the asset proportions equals zero, as it must if y is to be a portfolio.

Since the asset proportions in z sum to zero, it can be considered a recipe for a swap. One unit of the swap (for
example, $1) calls for the holder to pay (1) an amount equal to the return on 0.0389 units ($0.0389) invested in
cash and to receive an amount equal to the sum of (2) the return on 0.0257 ($0.0257) invested in bonds and (3) the
return on 0.0132 ($0.0132) invested in stocks.
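
As a hedged illustration, the swap vector can be recomputed from the asset risks, correlations and expected returns used throughout this example. The sketch assumes that D has the same layout as the matrix printed earlier for the riskless case (a last column of -1.0) and that f is the vector of expected returns followed by a zero.

```matlab
% Three-asset example: cash, bonds, stocks.
sd = [1.0; 7.4; 15.4];                 % standard deviations
cc = [1.00 0.40 0.15
      0.40 1.00 0.35
      0.15 0.35 1.00];                 % correlations
C  = cc .* (sd*sd');                   % covariance matrix
e  = [2.8; 6.3; 10.8];                 % expected returns
n  = 3;

% Assumed layout of D and f (sign of the last column follows the D shown earlier).
D = [2*C, -ones(n,1); ones(1,n), 0];
f = [e; 0];

z    = D \ f;                          % same as inv(D)*f
swap = z(1:n);                         % asset positions in the swap

fprintf('cash %.4f  bonds %.4f  stocks %.4f\n', swap);
fprintf('sum of swap positions: %.2e\n', sum(swap));
```

The printed sum should be zero up to rounding error, which is what makes z a zero-investment strategy.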

Of course, z is not just any swap. It is the optimal swap. An Investor with a positive tolerance for risk should, in
effect, begin with the minimum variance portfolio, then take an appropriately large position in the optimal swap. In
dollar terms, the swap position should equal the investor's initial fund times rt, since:
y = mvp + rt*z

Thus an Investor with a risk tolerance of 50 should take twice as large a position in the optimal swap z as should
an Investor with the same wealth and a risk tolerance of 25.

One need not actually take a position in a swap contract to achieve the desired result. In most cases, the Investor
would simply determine the optimal portfolio y and invest in it directly. However, recognition that this is
equivalent to the results obtained by starting with mvp and then making a standard swap z in an appropriate
magnitude proves useful in understanding differences among Investors' optimal portfolio holdings.

Two Fund Separation


The recipe for an optimal portfolio is clearly linear. In vector form:

y = mvp + rt*z

Row i will be of the form:

y(i) = mvp(i) + rt*z(i)

The first n rows correspond to the asset positions, so that:

x(i) = mvp(i) + rt*z(i) : for i= 1,..,n

Now consider two portfolios, each optimal for a given risk tolerance. Let a represent the smaller of the two risk
tolerances and b the larger. Then the portfolios are respectively:

ya = mvp + a*z
yb = mvp + b*z

Assume that ya and yb represent portfolios offered by two mutual funds (mfa and mfb, respectively). How might
an investor with a risk tolerance equal to rt use such funds optimally? The answer is simple: place a proportion xa
of wealth in fund a and a proportion 1-xa in fund b, using the following formula to compute xa:
xa = (b-rt)/(b-a)

To see why this works, note that:

y = xa*ya + (1-xa)*yb
= xa*(mvp+a*z)+ (1-xa)*(mvp+b*z)
= mvp + [xa*a+(1-xa)*b]*z

But, given our recipe for choosing xa:

[xa*a+(1-xa)*b] = rt

So the two-fund portfolio is in fact the optimal portfolio for the Investor in question.
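
The argument can also be verified numerically. In the sketch below the fund risk tolerances (a = 10, b = 50) and the Investor's risk tolerance (rt = 25) are hypothetical values chosen only for illustration, and D, k and f are assumed to have the layout used in the three-asset example.

```matlab
% Three-asset example inputs.
sd = [1.0; 7.4; 15.4];
cc = [1.00 0.40 0.15; 0.40 1.00 0.35; 0.15 0.35 1.00];
C  = cc .* (sd*sd');
e  = [2.8; 6.3; 10.8];
n  = 3;

% Assumed layout of D, k and f.
D = [2*C, -ones(n,1); ones(1,n), 0];
k = [zeros(n,1); 1];
f = [e; 0];

mvp = D \ k;
z   = D \ f;

% Hypothetical risk tolerances: two funds and one Investor.
a = 10;  b = 50;  rt = 25;

ya = mvp + a*z;                  % portfolio of fund a
yb = mvp + b*z;                  % portfolio of fund b

xa      = (b - rt)/(b - a);      % proportion placed in fund a
ymix    = xa*ya + (1 - xa)*yb;   % two-fund combination
ydirect = mvp + rt*z;            % optimal portfolio computed directly

fprintf('xa = %.4f, largest difference = %.2e\n', xa, max(abs(ymix - ydirect)));
```

The largest difference between the two vectors should be zero up to rounding error.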

No matter how many assets are used to form the two mutual funds, an Investor can achieve a completely optimal
portfolio by allocating his or her funds between the two mutual funds, as long as each of the funds is optimal for a
particular risk tolerance and the two funds are designed for different risk tolerances.

As a practical matter, of course, an Investor would have to take a short position in one of the funds if his or her
risk tolerance fell outside the range covered by the funds. For this reason, it might be useful to utilize (1) a fund
designed for an extremely small level of risk tolerance and (2) one designed for a very large level of risk tolerance.

If several mutual funds are available, an Investor could achieve an optimal portfolio by combining any two funds,
as long as each is optimal for a different level of risk tolerance and the proportions invested in the funds are chosen
appropriately (that is, using the formula given above for xa).

This result is of sufficient importance to deserve a relatively grand name. It is sometimes termed "Tobin's
separation theorem", since its introduction in Tobin 1958, but we will call the present result the two-fund
separation theorem to differentiate it from more complex results that arise in different settings. Why the name?
Because in this situation it is possible to separate the investment decision into two stages. In the first stage, two
optimal mutual funds are formed. In the second, investors allocate their assets between the two funds. Moreover,
two well-constructed funds are sufficient to span the set of desirable investment alternatives for all investors.

To be sure, these very strong results flow from very strong assumptions. Investors are assumed to agree on
probabilistic forecasts (asset means, standard deviations and correlations) and to consider portfolio mean and
variance to be sufficient statistics for selecting portfolios. Moreover, short positions are assumed to be feasible and
costless, as are other transactions. Later we will consider the effects of dropping one or more of these assumptions.
In the meantime, it is appropriate to pause to reflect on the simplicity and tranquillity of a world in which these
conditions would hold.

Additional Linear Equality Constraints


It is a relatively simple matter to extend the analysis of the last few sections to cover cases with two or more linear
equality constraints. Write the set of m such constraints as:

A*x = b

where A is an {m*n} matrix of "left-hand sides" and b is an {m*1} vector of "right-hand sides". For example, in

addition to the standard full-investment constraint, assume (1) that it is desired to select a portfolio with an income
yield equal to 5.5%, and (2) that the yields of cash, bonds and stocks are, respectively, 5%, 7% and 3%. Then:

A=
1 1 1
5 7 3

b=
1
5.5

Lagrange Multipliers
To formulate this problem so it can be solved efficiently, we need to utilize explicitly a mathematical procedure
that we have employed implicitly already: the method of Lagrange multipliers.

The goal is to maximize portfolio utility, stated in expected return equivalent terms:
up = ep - vp/rt

As before, we choose instead to maximize utility stated in variance-equivalent terms since the optimal portfolio
will be the same. In this metric, the objective function is:

vup = rt*ep - vp

Of course, we are not free to choose any asset holdings we might desire. Instead, we must meet one or more linear
equality constraints:

A*x = b

For a solution to be feasible (that is, satisfy these constraints), we require that:

b - A*x = zeros(m,1)

Now the trick. We form a Lagrangean function by appending each linear constraint times an associated Lagrange
multiplier to the original objective function. With two linear constraints:

L = rt*ep - vp + g1*[b(1)-A(1,:)*x] + g2*[b(2)-A(2,:)*x]

For portfolios that satisfy the two linear constraints, each of the terms in the square brackets will equal zero and the
Lagrangean function L will equal the original objective function! Thus maximizing L will give the same answer as
maximizing the original objective function, as long as only feasible portfolios are considered.

Since we wish to maximize L, the goal is to get to the top of a hill where the height is given by the value of L and
the coordinates of the terrain are given by the values of the variables (the x values plus g1, g2 and any additional
Lagrange multipliers). Since the terrain is smooth, it is flat at the top of this particular hill. Moreover, the top is the
only place at which it is flat. Thus it is both necessary and sufficient for an optimal solution that the first
derivatives of L with respect to each of the variables be set to zero. Note, however, that there are now n+m
variables -- the n asset holdings (here, x1, x2 and x3) and the m Lagrange multipliers (here, g1 and g2). The
derivatives with respect to the asset holdings are:

rt*e(1) - 2*C(1,1)*x(1) - 2*C(1,2)*x(2) - 2*C(1,3)*x(3) - g1*A(1,1) - g2*A(2,1)
rt*e(2) - 2*C(2,1)*x(1) - 2*C(2,2)*x(2) - 2*C(2,3)*x(3) - g1*A(1,2) - g2*A(2,2)
rt*e(3) - 2*C(3,1)*x(1) - 2*C(3,2)*x(2) - 2*C(3,3)*x(3) - g1*A(1,3) - g2*A(2,3)

Setting these to zero and rearranging gives equations:


2*C(1,1)*x(1) + 2*C(1,2)*x(2) + 2*C(1,3)*x(3) + g1*A(1,1) + g2*A(2,1) = 0 + rt*e(1)
2*C(2,1)*x(1) + 2*C(2,2)*x(2) + 2*C(2,3)*x(3) + g1*A(1,2) + g2*A(2,2) = 0 + rt*e(2)
2*C(3,1)*x(1) + 2*C(3,2)*x(2) + 2*C(3,3)*x(3) + g1*A(1,3) + g2*A(2,3) = 0 + rt*e(3)

The derivatives with respect to the Lagrange multipliers are:


b(1) - A(1,1)*x(1) - A(1,2)*x(2) - A(1,3)*x(3)
b(2) - A(2,1)*x(1) - A(2,2)*x(2) - A(2,3)*x(3)

Setting these to zero gives the original constraint equations:

b(1) - A(1,1)*x(1) - A(1,2)*x(2) - A(1,3)*x(3) = 0
b(2) - A(2,1)*x(1) - A(2,2)*x(2) - A(2,3)*x(3) = 0

Rearranging:

A(1,1)*x(1) + A(1,2)*x(2) + A(1,3)*x(3) = b(1)
A(2,1)*x(1) + A(2,2)*x(2) + A(2,3)*x(3) = b(2)

We now have five linear equations in five unknowns. They may be written succinctly (and somewhat familiarly)
as:
D*y = k + rt*f

where:
D = [ 2*C A'
A zeros(m,m) ]

k = [ zeros(n,1)
b ]

f=[e
zeros(m,1) ]

Note that our previous example represents a special case of this formula, with m=1 and each of the coefficients in
A and b equal to 1.0.

Economic Interpretation of Lagrange Multipliers


Recall the form of the Lagrangean function that has been maximized when the solution is obtained:

L = rt*ep - vp + g1*[b(1)-A(1,:)*x] + g2*[b(2)-A(2,:)*x]

Consider the derivative of this function relative to, say, b(1). It will be:

d L/d b(1) = g1

Since the Lagrangean function will equal the original objective function for all feasible portfolios, we can interpret
this derivative as the change in utility per unit change in the right-hand side of constraint number 1. Of course, the
objective function is in variance-equivalent terms. To state the derivative in the standard expected return
equivalent terms (up) we must divide by rt. Thus:

d up / d b(1) = g1 / rt

And similarly for any additional constraints.

In the case of the standard full investment constraint, the Lagrangean multiplier reflects the marginal utility in
variance equivalent terms of additional money to invest (for example, allowing the sum of the asset holdings to
equal 1.0001 instead of 1.0000). Dividing by the investor's risk tolerance gives the marginal utility of additional
funds in expected return equivalent terms. The latter is, in effect, the marginal utility of the portfolio (mup).
Correspondingly, the Lagrangean multiplier is rt times this. All of which explains why we assigned the multiplier
the name tmup and the result obtained by dividing it by rt the name mup in our earlier example.

The result is quite general. Each Lagrangean multiplier indicates the marginal utility in variance-equivalent terms
of a small change in the right-hand side of the corresponding constraint. In the case of portfolio yield, the
multiplier would indicate the added utility per unit of change in the required yield. To state this in expected return
equivalent terms, the Lagrangean multiplier would be divided by the investor's risk tolerance.

Lagrangean multipliers are often useful for evaluating the extent to which a given constraint limits achievement of
an overall objective. The greater the multiplier (assuming that it is positive), the more costly the constraint. Of
course, the values apply for only small changes, since the objective function is quadratic, but they are useful for
evaluating the desirability of at least small changes in various constraints.
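
The interpretation of a multiplier as a marginal utility can be checked by brute force. The sketch below uses the yield-constrained problem introduced above (A, b, the three-asset inputs and rt = 25), solves it for slightly higher and slightly lower required yields, and compares the resulting change in variance-equivalent utility with the multiplier g2. The helper names solveFor and vupOf, and the step size h, are illustrative choices rather than part of the original analysis.

```matlab
% Three-asset example inputs.
sd = [1.0; 7.4; 15.4];
cc = [1.00 0.40 0.15; 0.40 1.00 0.35; 0.15 0.35 1.00];
C  = cc .* (sd*sd');
e  = [2.8; 6.3; 10.8];

% Full-investment and yield constraints from the text.
A  = [1 1 1; 5 7 3];
b  = [1; 5.5];
rt = 25;
n  = 3;  m = 2;

D = [2*C, A'; A, zeros(m,m)];
f = [e; zeros(m,1)];

% Hypothetical helpers: solve for a given right-hand side, and evaluate vup.
solveFor = @(bb) D \ ([zeros(n,1); bb] + rt*f);
vupOf    = @(x) rt*(x'*e) - x'*C*x;

y  = solveFor(b);
g2 = y(n+2);                         % multiplier on the yield constraint

% Central difference in the required yield (step size is an arbitrary choice).
h     = 1e-3;
yhi   = solveFor(b + [0; h]);
ylo   = solveFor(b - [0; h]);
slope = (vupOf(yhi(1:n)) - vupOf(ylo(1:n))) / (2*h);

fprintf('g2 = %.4f, finite-difference slope = %.4f, g2/rt = %.4f\n', g2, slope, g2/rt);
```

Because the objective is quadratic, the central difference and the multiplier should agree up to rounding error.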

Optimal Portfolio Holdings with an Added Constraint


Finally, we are ready to solve the problem posed with a constraint on portfolio yield. Using the formulas derived
above, we obtain:
D = [ 2*C A'
A zeros(m,m) ]
=
2.0000 5.9200 4.6200 1.0000 5.0000
5.9200 109.5200 79.7720 1.0000 7.0000
4.6200 79.7720 474.3200 1.0000 3.0000
1.0000 1.0000 1.0000 0 0
5.0000 7.0000 3.0000 0 0

k = [ zeros(n,1)
b]
=
0
0
0
1.0000
5.5000

f=[e
zeros(m,1) ]
=
2.8000
6.3000
10.8000
0
0

mvp = inv(D)*k
=
0.8889
0.1805
-0.0695
39.8923
-8.4836

z = inv(D)*f
=
-0.0324
0.0162
0.0162
0.8723
0.3643

For rt = 25:

y = mvp + 25*z
=
0.0782
0.5859
0.3359
61.6987
0.6249
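
For readers who wish to reproduce these figures, the following sketch builds D, k and f from the inputs given in the text and solves for mvp, z and y. It uses the backslash operator in place of inv(D); the results should match the values shown above to the precision printed.

```matlab
% Three-asset example inputs.
sd = [1.0; 7.4; 15.4];
cc = [1.00 0.40 0.15; 0.40 1.00 0.35; 0.15 0.35 1.00];
C  = cc .* (sd*sd');
e  = [2.8; 6.3; 10.8];

% Constraints: full investment and a required yield of 5.5.
A = [1 1 1; 5 7 3];
b = [1; 5.5];
[m, n] = size(A);

D = [2*C, A'; A, zeros(m,m)];
k = [zeros(n,1); b];
f = [e; zeros(m,1)];

mvp = D \ k;            % minimum variance portfolio and its multipliers
z   = D \ f;            % optimal swap and the change in multipliers per unit of rt

rt = 25;
y  = mvp + rt*z;

disp([mvp z y])         % columns: mvp, z, y
```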

To see the effect on portfolio utility (up) of a small change in the yield constraint, we divide the corresponding
multiplier by risk tolerance to obtain muy, the marginal utility of the yield constraint:

muy = 0.6249/25
= 0.0250

Requiring a higher yield would allow for a greater optimal portfolio utility, since this value is positive (had it been
negative, a higher yield requirement would have lowered optimal portfolio utility). Note, however, that the value
is not large -- optimal portfolio utility would increase at a rate of 0.0250 (2.5 basis points in expected return terms)
per unit change (100 basis points) in yield. Of course, this is a rate of change for a small difference in required
yield. To find the effect of a substantial change (e.g. from 5.50% to 6.50%) the optimization would have to be
performed with both values and the difference in optimal utility calculated directly.

Additional Linear Objectives


Before concluding the examination of cases in which there are no bounds on holdings we treat one further
possible complication that has important implications for both portfolio construction and understanding the
possible workings of capital markets. In particular, we consider an investor whose utility function has three or
more arguments -- one quadratic and the others linear in the decision variables. To illustrate we use an example in
which an investor associates a disutility with income yield due to its unfavorable tax treatment relative to capital
gains. Letting yp be the yield of the portfolio, utility is now:

up = ep + uy*yp - vp/rt

where uy (utility from yield) is a constant indicating the investor's attitude towards yield. For concreteness we
assume a negative value equal to -0.2 for uy. Thus a dollar received in the form of income (yield) will be 80% (1-
0.2) as desirable as a dollar received in the form of a capital gain.

Having run out of obvious letters, we let q represent the vector of asset yields. In our example:
q=[5
7
3]

and

yp = x'*q

As before, we can convert the utility function to variance-equivalent terms by multiplying all terms by rt, giving:

vup = rt*ep +(uy*rt)*yp - vp

For our example with three assets and two constraints, the derivative of the Lagrangean function for asset i
becomes:

rt*e(i)+(uy*rt)*q(i)-2*C(i,1)*x(1)-2*C(i,2)*x(2)-2*C(i,3)*x(3)-g1*A(1,i)-g2*A(2,i)

Setting this to zero and rearranging gives:

2*C(i,1)*x(1)+2*C(i,2)*x(2)+2*C(i,3)*x(3)+g1*A(1,i)+g2*A(2,i) = 0+rt*e(i)+(uy*rt)*q(i)

Putting these n equations together with the m equations for the derivatives taken with respect to the Lagrangean
multipliers associated with the constraints gives a system of (n+m) linear equations of the form:

D*y = k + rt*f + (uy*rt)*r

where D, y and f are defined as before and

r=[q
zeros(m,1) ]

Once again, the optimal portfolio can be determined simply by multiplying each term by the inverse of D. Thus:

y = inv(D)*k + rt*inv(D)*f + (uy*rt)*inv(D)*r



Note that the first two terms on the right-hand side are unchanged from the earlier incarnation. Thus we may write:

y = mvp + rt*z + (uy*rt)*zy

The new vector is zy. Clearly, it is a swap, or zero-investment strategy. An investor who derives neither utility
(positive uy) nor disutility (negative uy) from yield will be uninterested in this swap. An investor with a preference
for yield will wish to take long positions in it, while an investor for whom yield provides disutility will wish to
take short positions in it.
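
A small numerical sketch may help fix ideas. To keep the yield swap free to vary, the sketch assumes that the only constraint is the standard full-investment constraint (m = 1), which is a simplification of the two-constraint setting used above; the asset yields q are those given in the text.

```matlab
% Three-asset example inputs.
sd = [1.0; 7.4; 15.4];
cc = [1.00 0.40 0.15; 0.40 1.00 0.35; 0.15 0.35 1.00];
C  = cc .* (sd*sd');
q  = [5; 7; 3];                    % asset yields
n  = 3;

% Assumption: only the full-investment constraint is imposed.
A = ones(1,n);
D = [2*C, A'; A, 0];
r = [q; 0];

zy = D \ r;                        % same as inv(D)*r

fprintf('zy asset positions: %.4f %.4f %.4f (sum %.1e)\n', zy(1:n), sum(zy(1:n)));
```

The asset positions in zy should sum to zero up to rounding error, confirming that it is a zero-investment strategy.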

A-fund Separation
In the current example, optimal asset holdings are linear in two variables -- rt and (uy*rt). To keep the notation
simple, let:

ur = uy*rt

Then:

y = mvp + rt*z + ur*zy

Now, assume that three mutual funds (a,b and c) have been formed. Fund a holds a portfolio that is optimal for an
Investor with a risk tolerance of rta and a value of ur equal to ura. Fund b holds a portfolio that is optimal for an
Investor with preferences given by rtb and urb, and fund c holds a portfolio optimal for an Investor with
preferences rtc and urc. Thus:

ya = mvp + rta*z + ura*zy
yb = mvp + rtb*z + urb*zy
yc = mvp + rtc*z + urc*zy

Consider an Investor who places proportions xa, xb and xc of his or her wealth in the three mutual funds, with:

xa + xb + xc = 1

The resulting portfolio will be:

y = xa*ya + xb*yb + xc*yc

Or:

y = mvp + [xa*rta+xb*rtb+xc*rtc]*z + [xa*ura+xb*urb+xc*urc]*zy

The goal is to make the first bracketed expression equal to the Investor's risk tolerance (rt) and the second equal to
his or her value of ur while keeping the sum of the proportions allocated to the funds equal to 1. This is a system
of three linear equations in three unknowns (xa, xb, and xc):

xa*rta + xb*rtb + xc*rtc = rt
xa*ura + xb*urb + xc*urc = ur
xa + xb + xc = 1

Barring degeneracy due to lack of differences among the mutual funds, it can be easily solved, providing the
appropriate combination of mutual funds for the Investor to achieve his or her objectives.
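
The three equations can be solved in one line. In the sketch below the characteristics of the three funds and of the Investor are hypothetical values chosen only to illustrate the calculation.

```matlab
% Hypothetical fund characteristics (not from the text).
rta = 10;  ura =   0;
rtb = 50;  urb =   0;
rtc = 30;  urc = -15;

% Hypothetical Investor.
rt = 30;
uy = -0.2;
ur = uy*rt;

% Rows: risk tolerance, ur, and the requirement that proportions sum to 1.
M   = [rta rtb rtc
       ura urb urc
       1   1   1  ];
rhs = [rt; ur; 1];

w = M \ rhs;                       % proportions xa, xb, xc
fprintf('xa = %.4f  xb = %.4f  xc = %.4f\n', w);
```

If two of the funds had identical characteristics, M would be singular and the system could not be solved, which is the degeneracy mentioned above.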

We have thus shown that in this setting, three mutual funds can provide Investors with sufficient alternatives to
achieve optimal portfolios. Each such mutual fund must be optimal for a particular combination of rt and uy, and
the three funds must be designed for Investors with different degrees of both rt and ur. Thus three (different) funds
suffice to span the space of optimal portfolios when there are three arguments in Investors' utility functions
(variance plus 2 linear terms). Examination of the procedures used to obtain this result (and the earlier two-fund
theorem) shows that the natural generalization of this result is in fact correct. If there are A possible arguments in
Investors' utility functions (variance plus A-1 linear terms), A different funds are sufficient to span the space of
optimal portfolios. This may be called the A-fund separation theorem.


A world in which Investors care about more than expected return and risk requires more investment products and
more care when selecting a combination of mutual funds for a particular Investor. But the magnitude of the
Investor's task is still small, requiring only consideration of A (here, 3) mutual funds rather than n assets (e.g.
potentially thousands of securities). Of course, this assumes away nasty realities such as transactions costs and
bounds on holdings. It also assumes that the managers of the selected mutual funds do their jobs correctly (that is,
construct optimal portfolios) and that Investors or their advisors know the preferences for which each mutual fund
is optimal. To the extent that the real world falls short on one or more of these fronts, adjustments will have to be
made before giving practical investment advice.

10/24/12 The Critical Line Method

The Critical Line Method

Contents:
The General Asset Allocation Problem
Adding Linear Inequalities
Parametric and Non-Parametric General Asset Allocation Problems
Yet Another Three-Asset Problem
Finding the Portfolio with the Maximum Expected Return
Finding an Optimal Portfolio Given the Status of Each Variable
Computing the Derivatives of the Lagrangean Function with Respect to Bounded Variables
Finding Optimal Portfolios for a Range of Risk Tolerances
Finding the Next Value of rt at Which a Variable Must Change Status
Corner Portfolios
The Algorithm
C Fund Separation

The General Asset Allocation Problem


The gradient method works well for solving the type of problem that we have termed the standard asset
allocation problem:

Select:
x

to maximize:
u = ep - vp/rt

where:
ep = x'*e
vp = x'*C*x

subject to:
sum(x) = k
lb <= x <= ub

If there are no bounds on holdings, cases with additional linear constraints can be solved directly, as we have
shown in the previous sections. But thus far we have no procedure for solving cases in which there are both
bounds on holdings and linear constraints in addition to the standard full-investment constraint.

Fortunately, Markowitz developed a general procedure in Markowitz 1956 that can handle additional linear
constraints and upper and lower bounds on holdings. Moreover, the approach provides a method for
determining the entire set of efficient portfolios. And, (as if this were not enough) it also leads to conclusions
about the properties of the efficient set and a new separation theorem.

Markowitz called his procedure the critical line method, as shall we.

www.stanford.edu/~wfsharpe/mia/opt/mia_opt3.htm 1/13

Adding Linear Inequalities


Recall that the standard problem involves two sets of constraints. The first requires that the sum of the
proportions invested equal a constant:

sum(x) = k

The second constraints require that the proportions remain within specified bounds:

lb <= x <= ub

As in the analysis of problems without bounds, we can generalize the first constraint to allow for any desired
number of constraints as long as they are linear in the variables (x values). Given m such constraints, we
require that:

A*x = b

where A is an {m*n} matrix with the left-hand side coefficients of the constraints, and b is an {m*1} vector of
the right-hand sides of the constraints. As indicated earlier, a problem with only the full investment constraint
is a special case in which m=1 and all the coefficients in A and b are equal to 1.0.

Linear equality constraints may be of some interest in their own right, but in most practical cases they are only
a means to an end. By combining a linear equality with bounds on a variable one can constrain a linear
function of the asset holdings to be within desired bounds.

To illustrate, consider a three-asset case in which it is desired to limit the amount invested in cash plus bonds to
40% of the overall portfolio. To do so, we introduce a new variable (number 4) to represent the sum of the
amounts invested in assets 1 plus 2. To make this variable (x(4)) equal to the sum of x(1) and x(2), we add a
second equation to the constraint set, giving:

A=[1 1 1 0
1 1 0 -1 ]

b=[1
0]

Note that in the first (full-investment) constraint, variable 4 has a coefficient of zero, since it is not an
investment, per se.

To restrict the amount invested in the sum of the first two asset classes to be no more than 40% of the portfolio, we
need only to assign the appropriate values for the bounds on variable 4. In this case:
lb = [ 0
0
0
0]

ub = [ 1
1
1
0.4 ]
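
To see how the added variable does its work, the sketch below tests two candidate solutions against the constraints just defined. The helper isfeasible, the tolerance and the two candidate vectors are hypothetical and serve only as an illustration.

```matlab
% Constraints from the text: full investment, and x(4) = x(1) + x(2).
A  = [1 1 1  0
      1 1 0 -1];
b  = [1; 0];
lb = [0; 0; 0; 0];
ub = [1; 1; 1; 0.4];

% Hypothetical helper: true if x satisfies the equalities and the bounds.
tol = 1e-9;
isfeasible = @(x) all(abs(A*x - b) <= tol) && all(x >= lb - tol) && all(x <= ub + tol);

x_ok  = [0.1; 0.3; 0.6; 0.4];      % cash plus bonds = 40%
x_bad = [0.3; 0.3; 0.4; 0.6];      % cash plus bonds = 60%

disp(isfeasible(x_ok))
disp(isfeasible(x_bad))
```

The second candidate satisfies both equations but violates the upper bound on variable 4, so it is rejected.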

Parametric and Non-Parametric General Asset Allocation Problems
We are now ready to formally define two versions of the general asset allocation problem. The non-
parametric general asset allocation problem has the form:

Select:
x

to maximize:
u = ep - vp/rt

where:
ep = x'*e
vp = x'*C*x

subject to:
A*x = b
lb <= x <= ub

The parametric general asset allocation problem has the form:

For all positive values of rt:

Select:
x

to maximize:
u = ep - vp/rt

where:
ep = x'*e
vp = x'*C*x

subject to:
A*x = b
lb <= x <= ub

The solution to the parametric version will be a matrix of portfolios rather than a single portfolio. One might
imagine that this would be huge -- containing, for example, a different portfolio for every possible level of risk
tolerance. Not so. We will show that in fact, a remarkably parsimonious matrix of portfolios can be used to
determine the solution to the non-parametric version of the problem for any desired magnitude of rt. This
rather opaque statement will (hopefully) be clear as the analysis unfolds.

Yet Another Three-Asset Problem


To illustrate some of the key concepts that form the basis for the critical line method, we use a variation on the
by-now familiar three-asset problem. The assets are cash, bonds and stocks, with expected returns, risks and
correlations:

e=
2.8000
6.3000
10.8000

sd =
1.0000
7.4000
15.4000

cc =
1.0000 0.4000 0.1500
0.4000 1.0000 0.3500
0.1500 0.3500 1.0000
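
The covariance matrix C used in the calculations that follow can be obtained from these inputs, since each covariance equals the correlation times the product of the two standard deviations. A minimal sketch:

```matlab
sd = [1.0; 7.4; 15.4];             % standard deviations
cc = [1.00 0.40 0.15
      0.40 1.00 0.35
      0.15 0.35 1.00];             % correlations

% C(i,j) = cc(i,j)*sd(i)*sd(j)
C = cc .* (sd*sd');
disp(C)
```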

To make the problem interesting, we assume that for some reason, it is required that at least 20% of the
portfolio be invested in each of the assets and that no more than 50% of the portfolio be invested in any asset.
Thus:

lb =
0.2000
0.2000
0.2000

ub =
0.5000
0.5000
0.5000

Our goal is to find all the portfolios which are efficient. In effect, we wish to maximize ep-vp/rt for every non-
negative value of rt.

Finding the Portfolio with the Maximum Expected Return


To start, we take a somewhat easier problem:

Maximize
ep - vp/rt

when
rt = infinity

Of course, this is equivalent to:

Maximize
ep

where:
ep = x'*e
vp = x'*C*x

subject to:
A*x = b
lb <= x <= ub

This is a linear programming problem, since all the constraints are linear, some are inequalities, and the
objective function is linear. Many algorithms exist for solving such problems. The MATLAB optimization
toolbox contains a function called lp that can handle any such problem. Since it is designed to minimize a
linear function f'*x, the signs of the expected returns must be reversed to achieve maximization of ep. Since
the function can also handle somewhat more general problems, two additional arguments need to be included.
For our purposes these can be written as functions of e and A. The maximum-expected return portfolio can
then be found using the statement:

xinf = lp(-e,A,b,lb,ub,ones(size(e)),size(A,1));

where the notation xinf serves to indicate that this is the composition of a portfolio that is optimal for an
Investor with infinite risk tolerance.
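
In more recent releases of the Optimization Toolbox the corresponding function is linprog. The following sketch assumes that the toolbox is installed and uses the calling form linprog(f,A,b,Aeq,beq,lb,ub), with the inequality arguments left empty because the only linear constraint here is an equality.

```matlab
% Requires the MATLAB Optimization Toolbox (an assumption of this sketch).
e  = [2.8; 6.3; 10.8];             % expected returns
A  = ones(1,3);                    % full-investment constraint
b  = 1;
lb = 0.2*ones(3,1);
ub = 0.5*ones(3,1);

% linprog minimizes, so the signs of the expected returns are reversed.
xinf = linprog(-e, [], [], A, b, lb, ub);
disp(xinf')
```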

While problems with two or more linear equalities are best solved using general linear programming
algorithms, cases such as the present one, in which the only equation is the full investment constraint can be
solved more simply using the following algorithm:

1. set all variables at their lower bounds
2. rank the variables in order of decreasing e(i)
3. increase the amount invested in the highest-e(i) variable to the smaller of
a. the lower bound plus the remaining funds
b. the upper bound
4. repeat step 3 with the next highest e(i) variable until no funds remain

In MATLAB:

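% given e, lb, ub, finds maxe-portfolio xinf where sum(xinf)=1
% (assumes sum(lb) <= 1 <= sum(ub))
xinf = lb;                          % step 1: start at the lower bounds
[zz,ii] = sort(-e);                 % step 2: ii ranks assets by decreasing e(i)
amtleft = 1 - sum(xinf);            % funds still to be allocated
ix = 1;
while amtleft>0 && ix<=length(e)    % stop when funds or assets run out
   i = ii(ix);
   chg = min((ub(i)-lb(i)),amtleft);   % step 3: smaller of room left and funds left
   xinf(i) = xinf(i) + chg;
   amtleft = amtleft-chg;
   ix = ix+1;                       % step 4: move to the next asset
end;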

In this case:

xinf =
0.2000
0.3000
0.5000

Now, recall the classification of a variable's status in a solution:

down: lb(i) = x(i)
in: lb(i) < x(i) < ub(i)
up: x(i) = ub(i)

In this case, cash is down (at its lower bound), bonds are in (the solution), and stocks are up (at the upper
bound).

It is convenient to represent the states of the variables in a vector (s) using the following conventions:

i down : s(i) = -1
i in : s(i) = 0
i up : s(i) = 1

Here:

s=
-1
0
1

Finding an Optimal Portfolio Given the Status of Each Variable


We have found the optimal portfolio for an Investor with infinite risk tolerance. What about one with a risk
tolerance of, say, 45?

To find an answer we might guess that the status of each variable in such a case would be the same as in the
solution for rt=infinity. There is no reason to believe that this is so (except a suspicion that the author may
know it to be the case). But for now, assume that it is true.

Given this information, could you easily determine the magnitudes of the variables? Yes indeed.


Recall the solution equation for the case in which no bounds are binding:

D*y = k + rt*f

In this case:

[ 2*C(1,1)  2*C(1,2)  2*C(1,3)  1 ]   [ x(1) ]   [ 0 ]        [ e(1) ]
[ 2*C(2,1)  2*C(2,2)  2*C(2,3)  1 ] * [ x(2) ] = [ 0 ] + rt * [ e(2) ]
[ 2*C(3,1)  2*C(3,2)  2*C(3,3)  1 ]   [ x(3) ]   [ 0 ]        [ e(3) ]
[ 1         1         1         0 ]   [ tmup ]   [ 1 ]        [ 0    ]

The first equation is derived by setting the derivative of the Lagrangean function with respect to the first
variable equal to zero. The second equation is derived by doing so with respect to the second variable. And so
on, for the first n equations. These first-order condition equations remain appropriate for the variables that are
in the solution, for their bounds are not binding and hence could have been omitted entirely (at least for the
risk tolerance being examined). Note, however, that the corresponding equations will not generally hold for
variables that are down or up. On the other hand, it is easy to write an equation for any such variable, since it
must be at the corresponding bound. In the case at hand we need to replace the first equation with one that
states:
x(1) = lb(1)

and the third equation with one that states:


x(3) = ub(3)

This is easily done by modifying D, k and f to give:

DD*y = kk + rt*ff

where the new matrices and vectors are:

[ 1         0         0         0 ]   [ x(1) ]   [ 0.20 ]        [ 0    ]
[ 2*C(2,1)  2*C(2,2)  2*C(2,3)  1 ] * [ x(2) ] = [ 0    ] + rt * [ e(2) ]
[ 0         0         1         0 ]   [ x(3) ]   [ 0.50 ]        [ 0    ]
[ 1         1         1         0 ]   [ tmup ]   [ 1    ]        [ 0    ]

We can now proceed to solve the system of equations. The solution is given by:

y = inv(DD)*kk + rt*inv(DD)*ff

Here:

y=
0.2000
0.3000
0.5000
209.5740

If everything else is correct, the optimal portfolio is the same for someone with a risk tolerance of 45 as for
someone who doesn't care at all about risk!
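
The modification of D, k and f can be automated. The sketch below is one possible way to do it, given the status vector s: for each variable that is down or up, the corresponding row of D is replaced by a row that pins the variable to its bound.

```matlab
% Three-asset example inputs.
sd = [1.0; 7.4; 15.4];
cc = [1.00 0.40 0.15; 0.40 1.00 0.35; 0.15 0.35 1.00];
C  = cc .* (sd*sd');
e  = [2.8; 6.3; 10.8];
lb = [0.2; 0.2; 0.2];
ub = [0.5; 0.5; 0.5];
n  = 3;

D = [2*C, ones(n,1); ones(1,n), 0];
k = [zeros(n,1); 1];
f = [e; 0];

s = [-1; 0; 1];                    % cash down, bonds in, stocks up

% Replace the first-order condition of each bounded variable with x(i) = bound.
DD = D;  kk = k;  ff = f;
for i = 1:n
    if s(i) ~= 0
        DD(i,:) = 0;
        DD(i,i) = 1;
        ff(i)   = 0;
        if s(i) < 0, kk(i) = lb(i); else, kk(i) = ub(i); end
    end
end

rt = 45;
y  = DD \ (kk + rt*ff);            % same as inv(DD)*kk + rt*inv(DD)*ff
disp(y')
```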

The Kuhn-Tucker Conditions


In all the asset allocation problems we have analyzed the goal is to maximize a Lagrangean function. In this
case, it is:
L = rt*ep - vp + g1*[b(1)-A(1,:)*x]

The objective is to reach the highest point in a terrain in which altitude is measured by L. However, in this
setting we may not be on a smooth terrain. The presence of inequality constraints may lead to places in which
there is an abrupt change in slope or a move in one or more directions is infeasible (in a sense, there is a
fence). Nonetheless, there are some characteristics of the optimal position that can be exploited when
designing a solution algorithm.

Consider dL/dx(i) in the x(i) direction when at the optimal point. If x(i) is an in-variable, this derivative (slope)
must be zero or else we would not in fact be at the top of the feasible hill. Thus we have:

dL/dx(i) = 0 : for all in variables

Now consider dL/dx(i) at the optimal point for a down-variable. If this were positive, we could not be at the
top of the feasible hill since an increase in x(i) would improve the solution. However, it could (and in most
cases would) be negative, indicating that a further decrease would lead to a higher value of L, but such a
decrease is not allowed. Thus:
dL/dx(i) <=0 : for all down variables

Finally, it must be the case that dL/dx(i) is zero or positive for all up variables. If such a slope were negative, it
would pay to decrease the value of the variable, since by so doing one could reach a higher position. Thus:

dL/dx(i) >= 0: for all up variables

These three characteristics of an optimal solution are collectively known as the Kuhn-Tucker conditions. If
they are not met, a portfolio is not optimal.

Computing the Derivatives of the Lagrangean Function with Respect to Bounded Variables
To make certain that all is well with a purported solution, it is a simple matter to compute the derivative of the
Lagrangean with respect to each variable. In this case the Lagrangean is:

L = rt*ep - vp + g1*[b(1)-A(1,:)*x]

The derivative of L with respect to x(i) is:

dL/dx(i) = rt*e(i) - 2*C(i,:)*x -g1

and the full set of the derivatives with respect to the assets can be written in matrix form as:
dL = rt*e - 2*C*x -g

where g is the vector of Lagrange multipliers.

Since vector y contains the asset holdings (x) plus the Lagrange multipliers (g), we may take advantage of the
structures of vector f and matrix D to write:

dL = rt*f - D*y

where the first n elements of dL are the derivatives of L with respect to the asset holdings.

For the case at hand:

dL =
-88.0600
0
14.4104
-1.0000

Note that all the Kuhn-Tucker conditions for an optimal solution have been met. The derivative is negative for
the down variable (cash), zero for the in variable (bonds), and positive for the up variable (stocks).
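
The sign check just described can be automated. The following is a minimal Python sketch, not part of the original MATLAB work; the function name, the status labels and the tolerance are assumptions made for illustration. The derivative values are the first three elements of dL shown above.

```python
# Status labels (an illustrative convention): "down" = at lower bound,
# "in" = strictly between bounds, "up" = at upper bound.

def kuhn_tucker_ok(dL, status, tol=1e-6):
    """Return True if every derivative has the sign its status requires."""
    for slope, s in zip(dL, status):
        if s == "in" and abs(slope) > tol:
            return False   # in variable: slope must be zero
        if s == "down" and slope > tol:
            return False   # down variable: slope must be <= 0
        if s == "up" and slope < -tol:
            return False   # up variable: slope must be >= 0
    return True

# Asset derivatives from the text for rt = 45 (cash, bonds, stocks).
dL_assets = [-88.0600, 0.0, 14.4104]
status = ["down", "in", "up"]
print(kuhn_tucker_ok(dL_assets, status))   # True
```

A small tolerance is used because derivatives computed numerically are rarely exactly zero.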

Finding Optimal Portfolios for a Range of Risk Tolerances


By good fortune and hard work we have found the solution to the portfolio problem when rt = 45. We have
also verified that it is indeed the solution by checking the derivatives of the Lagrangean function to see that
they satisfy the Kuhn-Tucker conditions for optimality. But what about a case in which rt = 44? If in fact all
the variables have the same status as in the previous solution we can use the same formulas. The portfolio is
given by:

y = inv(DD)*kk + rt*inv(DD)*ff

and the derivatives by:


dL = rt*f - D*y

For rt = 44 we obtain:

y =
0.2000
0.3000
0.5000
203.2740

dL =
-84.5600
0
9.9104
-1.0000

It worked! Each asset in vector y is within its allowable bounds and the derivatives satisfy the Kuhn-Tucker
conditions.

We could, if desired, go on like this, trying values for rt of 43, 42, 41, etc. until the procedure "did not work".
There are two ways that a purported solution could fail. First, one or more of the variables in y could be
outside the permissible bounds. Second, the required Kuhn-Tucker conditions for the derivatives could be
violated. In the first instance the ostensible solution would be infeasible. In the second instance, it would be
suboptimal.

But we need not proceed by trial and error. Instead we can determine the value of rt at which one or both of
these conditions will first be violated as the value of rt is decreased.

Finding the Next Value of rt at Which a Variable Must Change Status
To keep the notation reasonably simple, it is useful to rewrite:

y = inv(DD)*kk + rt*inv(DD)*ff

as:

y = ya + yb*rt

where:

ya = inv(DD)*kk
yb = inv(DD)*ff

In this case:
ya =
0.2000
0.3000
0.5000
-73.9260

yb =
0
0
0
6.3000

Now, recall that:


dL = rt*f - D*y

Substituting the equation for y gives:


dL = rt*f - D*(ya + yb*rt)

Simplifying:

dL = dla + dlb*rt

where:

dla = -D*ya
dlb = f - D*yb

In this case:

dla =
69.4400
0.0000
-188.0896
-1.0000

dlb =
-3.5000
0
4.5000
0
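
Because both y and dL are linear in rt, a purported solution for any rt in the current range can be evaluated without re-solving the equations. The Python sketch below (an illustration added here, with a hypothetical helper name) uses the ya, yb, dla and dlb values printed above and reproduces the derivative values reported earlier for rt = 45 and the y and dL values for rt = 44.

```python
# Vectors copied from the text (three assets followed by the multiplier).
ya = [0.2, 0.3, 0.5, -73.9260]
yb = [0.0, 0.0, 0.0, 6.3000]
dla = [69.4400, 0.0, -188.0896, -1.0]
dlb = [-3.5, 0.0, 4.5, 0.0]

def at_rt(a, b, rt):
    """Evaluate the element-wise linear function a + b*rt."""
    return [ai + bi * rt for ai, bi in zip(a, b)]

for rt in (45, 44):
    y = at_rt(ya, yb, rt)      # y = ya + yb*rt
    dL = at_rt(dla, dlb, rt)   # dL = dla + dlb*rt
    print(rt, [round(v, 4) for v in y], [round(v, 4) for v in dL])
```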

Now, note that a variable may change status in one of four ways as rt falls.

1) An in variable may go up. This can only happen if yb(i) is negative. The critical value of rt
(crt) is reached when y(i)=ub(i), that is, when:

ub(i) = ya(i)+yb(i)*crt
or:
crt(i) = (ub(i) - ya(i))/yb(i)

2) An in variable may go down. This can only happen if yb(i) is positive. The critical value of rt
is reached when y(i)=lb(i), that is, when:

lb(i) = ya(i)+yb(i)*crt
or:
crt(i) = (lb(i) - ya(i))/yb(i)

3) A down variable may come in. This can only happen if dlb(i) is negative. The critical value of
rt is reached when dL(i)=0, that is, when:

0 = dla(i)+dlb(i)*crt
or:
crt(i) = -dla(i)/dlb(i)

4) An up variable may come in. This can only happen if dlb(i) is positive. The critical value of rt
is reached when dL(i)=0, that is, when:

0 = dla(i)+dlb(i)*crt
or:
crt(i) = -dla(i)/dlb(i)

Applying these rules to the case at hand gives the following critical values of rt for our three variables:

crt =
19.8400
0
41.7977

The greatest value of crt is the next value at which a variable must change status. Letting nrt represent the next
critical value of rt:

nrt = max(crt);

In MATLAB we can find both the next critical value of rt (stored here as ncp) and the variable to change status in one expression:

[ncp,ichg] = max(crt);

where ichg is the index of the variable for which crt is largest. In this case:

ncp =
41.7977

ichg =
3

So stocks (variable 3) are to change status. Since s(3)=1, they want to go from up (s=1) to in (s=0).
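
The four rules can be collected into one routine. What follows is a Python sketch under stated assumptions rather than the original MATLAB code: the status code -1 for a down variable is assumed (the text gives only s=1 for up and s=0 for in), a critical value of 0 is recorded when no status change is possible (as the crt vector above does for bonds), and the bounds for bonds are hypothetical placeholders, which do not matter here because yb is zero for that variable.

```python
def critical_rts(status, ya, yb, dla, dlb, lb, ub):
    """Critical rt for each asset as rt falls; 0.0 if no change is possible."""
    crt = []
    for i, s in enumerate(status):
        c = 0.0
        if s == 0:                        # in variable
            if yb[i] < 0:                 # rule 1: rises to its upper bound
                c = (ub[i] - ya[i]) / yb[i]
            elif yb[i] > 0:               # rule 2: falls to its lower bound
                c = (lb[i] - ya[i]) / yb[i]
        elif s == -1 and dlb[i] < 0:      # rule 3: down variable comes in
            c = -dla[i] / dlb[i]
        elif s == 1 and dlb[i] > 0:       # rule 4: up variable comes in
            c = -dla[i] / dlb[i]
        crt.append(c)
    return crt

status = [-1, 0, 1]                 # cash down, bonds in, stocks up
ya = [0.2, 0.3, 0.5]
yb = [0.0, 0.0, 0.0]
dla = [69.4400, 0.0, -188.0896]
dlb = [-3.5, 0.0, 4.5]
lb = [0.2, 0.2, 0.2]                # bonds and stocks lower bounds are hypothetical
ub = [0.5, 0.5, 0.5]                # cash and bonds upper bounds are hypothetical

crt = critical_rts(status, ya, yb, dla, dlb, lb, ub)
ncp = max(crt)
ichg = crt.index(ncp) + 1           # 1-based, to match the text
print([round(c, 4) for c in crt], round(ncp, 4), ichg)
# [19.84, 0.0, 41.7977] 41.7977 3
```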

Corner Portfolios
Before proceeding further, it is important to draw some implications from our experience to date. Imagine that
we know that for a range of risk tolerances from rta to rtb the same bounds are binding. In other words, within
this range, as risk tolerance is changed each member (if any) of one set of variables will remain at its upper
bound, each member (if any) of another set of variables will remain at its lower bound, and each of the other
variables will move within its designated bounds. For this range of risk tolerances one could find the optimal
set of portfolios by solving the standard set of linear equations using the same DD matrix and kk and ff
vectors. Therefore, within this range every asset holding is a linear function of risk tolerance. This in turn
implies that for any risk tolerance rt between rta and rtb, the optimal portfolio can be constructed by simply
taking a weighted average of the portfolios that are optimal for rta and rtb. The weight on the rtb portfolio is
(rt - rta)/(rtb - rta) and the weight on the rta portfolio is (rtb - rt)/(rtb - rta).

This observation provides the motivation for assigning a name to each portfolio that is optimal for a risk
tolerance at which a variable changes status. We call such a portfolio a corner portfolio because in a graph that
plots holdings against risk tolerance, two or more variables "turn a corner". As we will see, corner portfolios
play a central role in the critical line algorithm. They also are attractive candidates for mutual funds.

The Algorithm
We now have the ingredients to complete the algorithm for the parametric general asset allocation problem.
Here it is in outline form:
1. Find the portfolio that gives the maximum expected return.
2. Determine the status of each variable in the max-e portfolio.
3. Record the composition of the portfolio.
4. Compute DD, kk and ff given the current status of each variable.
5. Find the equations for the optimal holdings and the derivatives of the Lagrangean function.
6. Determine the next critical value of rt and the variable to change status.
7. Determine the optimal portfolio for the next critical value of rt and record it and the associated value of rt.
8. Repeat steps 4 through 7 until the last critical value of rt is zero or negative.

The end result will be a matrix of portfolios and a vector of the associated risk tolerances. With the exception
of the first, each of the portfolios in the matrix will be a corner portfolio. Moreover, this information is all that
is needed to determine the optimal portfolio for any degree of risk tolerance! Given an Investor's risk tolerance,
the Analyst needs only to do a table lookup to find the two corner portfolios for risk tolerances on either side
of the desired value, then perform a simple computation involving a weighted average of the compositions of
the portfolios in question.

The figure below shows the composition of all optimal portfolios, given our inputs, for risk tolerances from 0
to 50 (the blue line represents Cash, the red line Stocks, and the green line Bonds).

The corner portfolios are evident in the figure. Their compositions are shown in the table below:

rt          % Cash    % Bonds    % Stocks
<= 13.73     50.00      30.00       20.00
   15.10     45.19      34.81       20.00
   21.02     22.18      50.00       27.82
   22.30     20.00      50.00       30.00
   22.94     20.00      50.00       30.00
>= 41.80     20.00      30.00       50.00

Note that over the range of risk tolerances from 22.30 to 22.94 the optimal composition remains the same. This
is not a rounding error. Since the efficient frontier is piecewise quadratic, there is always the possibility that
there is a kink at the point corresponding to a specific level of risk and return. In such a case indifference
curves with different slopes (risk tolerances) can be tangent to the efficient frontier at the same point, giving
the same optimal portfolio.
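
The table lookup and weighted average described earlier can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, and because the corner risk tolerances in the table are rounded to two decimals, interpolated holdings are approximate.

```python
# (rt, % cash, % bonds, % stocks) for each corner portfolio, from the table.
corners = [
    (13.73, 50.00, 30.00, 20.00),
    (15.10, 45.19, 34.81, 20.00),
    (21.02, 22.18, 50.00, 27.82),
    (22.30, 20.00, 50.00, 30.00),
    (22.94, 20.00, 50.00, 30.00),
    (41.80, 20.00, 30.00, 50.00),
]

def optimal_mix(rt):
    """Interpolate linearly between the two corner portfolios bracketing rt."""
    if rt <= corners[0][0]:
        return list(corners[0][1:])
    if rt >= corners[-1][0]:
        return list(corners[-1][1:])
    for lo, hi in zip(corners, corners[1:]):
        if lo[0] <= rt <= hi[0]:
            w_hi = (rt - lo[0]) / (hi[0] - lo[0])   # weight on the higher-rt corner
            return [(1 - w_hi) * a + w_hi * b for a, b in zip(lo[1:], hi[1:])]

# Halfway between the corners at 22.94 and 41.80.
print([round(v, 2) for v in optimal_mix(32.37)])
```

For risk tolerances at or beyond the end corners, the sketch simply returns the end portfolios, as the "<=" and ">=" rows of the table indicate.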

C-Fund Separation
A world with inequality constraints is more complex than one without them. In such a setting, it is rarely
possible to obtain an optimal investment strategy by determining a single optimal combination of risky
securities, then mixing it with borrowing or lending to suit the risk tolerance of a given Investor. However, all
is not lost. It is still possible to construct a limited number of mutual funds, each holding a portfolio of risky
securities that is optimal for a particular risk tolerance, then allocate the assets of each Investor between two of
the resulting mutual funds. This of course cannot be done haphazardly. To minimize the number of mutual
funds, each should hold a different corner portfolio from the optimization analysis. If there are C different such
corner portfolios, the minimum number of mutual funds required to service all possible Investors will equal C.
Each fund will be ideal (if the optimization inputs were correct) for an Investor with a specific risk tolerance.
Given this set of funds, each Investor should allocate his or her money between the two funds with objectives
(risk tolerances) closest to his or her own.

In a world of inequality constraints, there may need to be as many funds as there are different corner portfolios
in the efficient set of portfolios. Given C such funds, the investment decision can be separated into (1) the
formation of C funds and (2) the choice of one or two among them for each Investor. We memorialize this
process by calling it C-fund Separation.

If additional linear attributes are important, a larger number of efficient funds will be required, and the Investor will typically
have to choose a combination of A funds (where A is the total number of relevant attributes). But separation of
the process into two phases (construction of a set of mutual funds and Investor choice among those funds) is
still possible.

The results associated with the critical line algorithm have important implications for practical people. We
finish with a case in point.

Assume that there are four efficient funds, advertised as conservative (C), moderate (M), aggressive (A) and
very aggressive (V), differing only in levels of risk (that is, no additional attributes are relevant). An Investor
with preferences lying between "moderate" and "aggressive" should allocate assets between funds M and A.
While he or she might achieve the same amount of risk by choosing a combination of C and V, this would be
inefficient, giving lower expected return. A slightly better choice, although still an inefficient one, might
involve investing in all four mutual funds. But the best choice of all involves only the two funds with
objectives nearest those of the Investor in question. This suggests that the desire to diversify across many
mutual funds, each of which is itself relatively diversified, may be counterproductive. It is entirely possible to
have too many mutual funds!

www.stanford.edu/~wfsharpe/mia/opt/mia_opt3.htm 13/13
10/24/12 Factor Models

Factor Models
The Need for Factor Models
Linear Factor Models
Factor-based Expected Returns, Risks and Correlations

www.stanford.edu/~wfsharpe/mia/fac/mia_fac0.htm 1/1
10/24/12 The Need for Factor Models

The Need for Factor Models

Contents:
The Number of Estimates Needed for Mean/Variance Analyses
The Use of Historic Data

The Number of Estimates Needed for Mean/Variance Analyses


The problem with securities is that there are too many of them. This is also true for pools of securities
such as mutual funds. Worldwide, there are hundreds of thousands of securities and tens of thousands
of mutual funds.

In the United States alone there are roughly ten thousand mutual funds. To perform a mean/variance
analysis of portfolios that could contain any of them would require estimates for the future values of:

10,000 expected returns,
10,000 standard deviations, and
100,000,000 (10,000*10,000) correlation coefficients

To be sure, this overstates the magnitude of the problem. We know that 10,000 of the correlation
coefficients will equal 1.0, since each fund will be perfectly correlated with itself. Moreover, for each
entry below the main diagonal of the correlation matrix there is a corresponding entry above it (that is,
cc(i,j)=cc(j,i)). Thus the number of potentially different correlation coefficients to be estimated will be
only (!) (10,000*10,000 - 10,000)/2, or 49,995,000.

More generally, with N different assets, we require:

N expected returns
N standard deviations
(N^2 - N)/2 correlation coefficients

for a grand total of (N^2 + 3*N)/2 different estimates.
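
The counts are easy to verify. The short Python sketch below (the function name is an illustrative assumption) tallies the three kinds of estimates and checks the total against the closed form for the ten-thousand-fund example.

```python
def estimates_needed(n):
    """Counts of inputs for a mean/variance analysis of n assets."""
    expected_returns = n
    standard_deviations = n
    correlations = (n * n - n) // 2        # distinct off-diagonal pairs
    total = expected_returns + standard_deviations + correlations
    assert total == (n * n + 3 * n) // 2   # closed form given in the text
    return expected_returns, standard_deviations, correlations, total

print(estimates_needed(10_000))   # (10000, 10000, 49995000, 50015000)
```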

There are two consequences of the fact that problems involving large numbers of assets require a great
many estimates. The first concerns the sheer computational requirements for optimization or even the
determination of the risk and return of a given portfolio. Fortunately, ever-declining computer costs can
ameliorate the pain caused on this front. But a second problem remains -- it is simply too difficult to
estimate each of the required values explicitly.

The Use of Historic Data


At first glance it might seem that the estimation problem could also be solved simply by unleashing a
sufficient amount of computer power. Why not obtain a set of historic returns for the N assets and
compute historic mean returns, standard deviations of return and correlations among the returns? Even
for large values of N this could be done in reasonable time and for reasonable cost, although storage of
each of the resulting estimates would use up a considerable amount of computer space.

Issues of cost and time aside, such an approach would not provide a good solution. A set of historic data
provides only a sample of possible outcomes. The statistics we desire are those that describe the entire
underlying "return-generating" process. But the statistics from a sample are likely to differ in potentially
significant ways from those that are appropriate for tasks such as risk estimation and portfolio
optimization. In statistician's terms, the numbers obtained from historic data are "subject to error". More
simply put, they include noise. In some cases this may be reasonably benign. For example, if some
values are overstated and others understated, a simple average of historic values may provide a quite
accurate estimate of the expected value of the true process. This suggests that the use of historic data for
estimating the expected returns and risks of pre-specified portfolios might be an acceptable practice.
However, the use of optimization to find the best portfolio for a given investor will be fraught with
hazard if historic data are used, since optimization programs look for unusual values, and such values
are far more likely to include error than those that are not unusual. The same danger lurks when
evaluating portfolios chosen in simpler ways, but with knowledge of the behavior of the assets over the
historic period. In either case, the purported risks and returns for the portfolios will be biased toward
favorable estimates (higher expected returns and/or lower risks), and the portfolios will almost certainly
be inefficient in prospective terms. More precisely, portfolios that appear on the efficient frontier using
unadjusted historic data will almost certainly plot below the true efficient frontier that could be
constructed if the correct future risks, expected returns and correlations were known. Unhappily, of
course, we can never know the location of the true efficient frontier, since we can never know precisely
the correct future risks, expected returns and correlations.

The problem with using historic data when estimates are required for a large number of assets can be
seen by comparing the required number of estimates with the data available for the estimation. Assume
that returns are available for N assets for T periods (e.g. months). In all, N*T numbers are available in
the empirical database. But we need to estimate (N^2 + 3*N)/2 different numbers (expected returns,
standard deviations and correlation coefficients). Taking the ratio of the former to the latter gives the
ratio of numbers available per number to be estimated. It is 2*T/(N+3). The table below shows the
ratio of the numbers available to the numbers being estimated for selected values of N and T.

N         T      available/estimated
10        60       9.23
100       60       1.17
1,000     60       0.12
10        120     18.46
100       120      2.33
1,000     120      0.24
10        840    129.23
100       840     16.31
1,000     840      1.68
10,000    840      0.17

For the common case in which monthly returns are used for estimation, the three groups of rows
(T = 60, 120 and 840) correspond to 5, 10 and 70 years of data (the last being approximately the number
of years in longer-term databases).
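
The ratio can be computed for any combination of N and T. The following Python sketch (an illustration; the function name is hypothetical) evaluates N*T divided by (N^2 + 3*N)/2, which simplifies to 2*T/(N+3), for the five- and ten-year cases.

```python
def data_per_estimate(n, t):
    """Observations available (n*t) per estimate required ((n^2 + 3n)/2)."""
    return (n * t) / ((n * n + 3 * n) / 2)   # equals 2*t/(n + 3)

for t in (60, 120):
    for n in (10, 100, 1000):
        print(n, t, round(data_per_estimate(n, t), 2))
```

Any value below 1.0 signals that fewer numbers are available than are to be estimated.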

Cases in which fewer numbers are available than are to be estimated are clearly beyond the pale. Yet
such combinations can easily arise in practice. This is often encountered in scenario analyses, when
judgmental forecasts of asset returns in a limited set of possible future situations are used as the
foundation for portfolio construction. But it is also not uncommon in empirical analyses of historic data.

One might assume that the problem of insufficient data can be mitigated sufficiently by simply using
more data. Unfortunately this usually requires going farther back in history, and the longer the historic
period covered, the less plausible is the maintained hypothesis that the underlying joint probability
distribution generating the returns has remained the same.

While these dilemmas cannot totally be resolved, there are ways to mitigate the problem. Needed are
procedures that can produce estimates of risks, returns and correlations closer to the desired future values
than those obtained by simply using historic statistics. Two ingredients are required.

First, historic data must be "smoothed" to try to focus on underlying relationships that are more
likely to be true in the future and to ignore deviations from those relationships that are more likely
to be due to random noise or errors. The tools used most often to accomplish this are factor
models -- the subject of this chapter.
Second, good financial economic theory must be utilized to adjust estimates of risks, expected
returns and correlations until they bear some reasonable relationship with one another. This
involves the use of concepts and models associated with equilibrium in efficient markets -- a
subject that is treated at length in a later chapter.

www.stanford.edu/~wfsharpe/mia/fac/mia_fac1.htm 3/3
10/24/12 Linear Factor Models

Linear Factor Models

Contents:
A Generic Linear Factor Model
Terminology
Decomposing Returns
Matrix Representations of Factor Models

A Generic Linear Factor Model


The Equation

A linear factor model relates the return on an asset (be it a stock, bond, mutual fund or something else)
to the values of a limited number of factors, with the relationship described by a linear equation. In its
most generic form, such a model can be written as:

ri = bi1*f1 + bi2*f2 + .... + bim*fm + ei

where:

ri = the return on asset i
bi1 = the change in the return on asset i per unit change in factor 1
f1 = the value of factor 1
bi2 = the change in the return on asset i per unit change in factor 2
f2 = the value of factor 2
... = terms of the form bij*fj with j going from 3 to m-1
bim = the change in the return on asset i per unit change in factor m
fm = the value of factor m
m = the number of factors
ei = the portion of the return on asset i not related to the m factors

For emphasis, the equation is sometimes written so that variables that are assumed to be known before
the fact are differentiated from those the value of which is generally not known until after the fact. For
example:

ri~ = bi1*f1~ + bi2*f2~ + .... + bim*fm~ + ei~

In this version, a tilde after a variable indicates that its value is not generally known in advance. The
values of such stochastic variables are uncertain. Thus we do not know what the return on the asset
(ri~) will be, since we do not know the values that the factors (f1~, f2~, ...., fm~) will take on, nor do
we know the amount of the asset's return that will come from other sources (ei~). On the other hand,
we do know (or at least assume that we know) the sensitivities of the return on the asset to each of the
factors (bi1, bi2, ...., bim) -- these are deterministic (not subject to uncertainty). Somewhat differently
put, they are parameters in the model.

Purists will note that it is unusual to place tildes after stochastic variables rather than over them. The
latter is indeed the convention in media that are not typographically challenged. Our approach is simply
a pragmatic response to the limitations of standard browser formats.

The Key Assumption


The factor model equation may appear to make a significant statement about the relationship between an
asset's return and the values of the enumerated factors, but this is not so. For example, one could choose
any arbitrary set of bij's and fj's, then simply define the residual as:

ei~ = ri~ - [bi1*f1~ + bi2*f2~ + .... + bim*fm~]

The factor equation would then hold precisely, but could have no economic content at all. To make the
equation have meaning, two assumptions are made. One is relatively innocuous. The other is not.

First, the residual return (ei~) is assumed to be uncorrelated with each of the factors:

corr(ei~, fj~) = 0 : for every j from 1 to m

This is not as restrictive as it may seem. Consider, for example, a case in which the residual return is
correlated with factor 1. By adjusting the factor exposure (bi1) appropriately, the correlation of the
residual with the factor can be made to equal zero. Moreover, this can be done for every factor. In fact,
in simple settings using historic data, multiple regression procedures can be used to find a set of factor
exposures (bij's) that will give residual returns that are uncorrelated with each of the factors. Why?
Because standard linear multiple regression methods select slope coefficients (here, the bij's) that
minimize the variance of the residual (here, ei). But this will ensure that the residual is uncorrelated with
each of the independent variables (here, the fj's), since the removal of any such correlation by changing
one or more bij's will reduce the variance of the residual.
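
The regression argument can be checked numerically for the simplest case of a single factor. The Python sketch below is an illustration with made-up data: the factor values and returns are hypothetical, and the helper names are assumptions. It computes the least-squares slope, confirms that the resulting residual has zero covariance with the factor, and confirms that moving the slope away from that value raises the residual variance.

```python
# Hypothetical factor values and asset returns (percent), for illustration only.
f = [4.0, 3.0, -2.0, 7.0, 1.0, 5.0]
r = [6.1, 2.9, -1.5, 9.8, 2.2, 5.0]

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Population covariance of two equal-length series."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

b = cov(f, r) / cov(f, f)                    # least-squares slope (exposure)
e = [ri - b * fi for ri, fi in zip(r, f)]    # residual returns

def resid_var(slope):
    """Variance of the residual when the exposure is set to slope."""
    res = [ri - slope * fi for ri, fi in zip(r, f)]
    return cov(res, res)

print(abs(cov(e, f)) < 1e-9)                 # True: residual uncorrelated with factor
print(resid_var(b) < resid_var(b + 0.1))     # True: any other slope adds variance
```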

Thus the assumption that the residual is uncorrelated with each of the factors is convenient, but does not
give the linear factor model much power. However, the second assumption does.

The key assumption of a linear factor model is that the residual for one asset's return is uncorrelated with
that of any other:

corr(ei~, ej~) = 0 : for every i not equal to j, with i and j running from 1 to N, the number of assets

This means that the only sources of correlations among asset total returns are those that arise from their
exposures to the factors and the covariances among the factors. The residual component of an asset's
return is assumed to be unrelated to that of any other asset, and hence totally specific to that asset. In
other words, the risk associated with the residual return is idiosyncratic to the asset in question.
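
One consequence of the two assumptions can be written down directly: for two different assets, the covariance of their returns depends only on their factor exposures and the covariances among the factors. The Python sketch below illustrates this under stated assumptions. The function name is hypothetical, the factor covariance matrix CF is made up, and the two exposure vectors are chosen arbitrarily for illustration.

```python
def factor_covariance(bi, bj, CF):
    """cov(ri, rj) for i != j when residuals are uncorrelated: bi * CF * bj'."""
    m = len(bi)
    return sum(bi[k] * CF[k][l] * bj[l] for k in range(m) for l in range(m))

# Hypothetical exposures for two assets and a hypothetical factor covariance matrix.
b1 = [0.1, 0.3, 0.6]
b2 = [0.2, 0.8, 0.0]
CF = [[4.0, 1.0, 0.5],
      [1.0, 9.0, 2.0],
      [0.5, 2.0, 25.0]]

print(round(factor_covariance(b1, b2, CF), 4))
```

No residual term appears because corr(ei~, ej~) is assumed to be zero for different assets and each residual is assumed to be uncorrelated with every factor.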

This assumption makes a linear factor model powerful in the sense that it rules out many possible
combinations of outcomes. But greater power comes at a cost. The more restrictive a model, the greater
the chance that it may be inconsistent with reality. For this reason it is incumbent on the Analyst to try
to capture the most important sources of correlations among asset returns by including a sufficient
number of factors and attempting to focus on the most important ones. This being said, as in the
construction of any model, parsimony is a virtue, since the goal is to include "signals" and avoid
"noise".

Non-linear Relationships
We have termed the standard factor model linear which, strictly speaking, it is. However, this is far less
restrictive than might first seem. There are no restrictions on correlations among the enumerated factors,
so it is perfectly possible to include some that are correlated with others or are transforms of others. For
example, assume that the desired relationship is a quadratic one in which ri is related to two factors, fa
and fb, as follows:

ri = bi1*fa + bi2*fb + bi3*fa^2 + bi4*(fa*fb) + ei

To put this in our standard format, define:

f1 = fa
f2 = fb
f3 = fa^2
f4 = fa*fb

Then the relationship can be written as a linear function of these new variables:

ri = bi1*f1 + bi2*f2 + bi3*f3 + bi4*f4 + ei

In cases of this sort it may be difficult to estimate the values of the sensitivities (bij's) from historic data
because the new factors are highly correlated with each other, but there is no reason why such a format
cannot be employed if good estimates can be obtained.

To avoid needless carping on the need to define factors to allow a linear format for the overall
relationship, we henceforth will use the shorter term: factor models.
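
The redefinition of factors can be illustrated with a short Python sketch. The exposures and factor values below are hypothetical, as is the helper name; the point is only that the quadratic form and the linear form in the transformed factors give the same factor-related return.

```python
def transformed_factors(fa, fb):
    """Map the two underlying factors to the four factors of the linear form."""
    return [fa, fb, fa ** 2, fa * fb]

# Hypothetical exposures (bi1..bi4) and factor values, for illustration only.
b = [0.5, 0.2, 0.05, -0.1]
fa, fb = 2.0, 3.0

f = transformed_factors(fa, fb)
linear = sum(bj * fj for bj, fj in zip(b, f))
quadratic = b[0] * fa + b[1] * fb + b[2] * fa ** 2 + b[3] * (fa * fb)

print(abs(linear - quadratic) < 1e-12)   # True: the two forms agree
```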

Expected Residual Returns

Thus far we have imposed no restrictions on the expected returns of the factors or on the assets' residual
returns (ei's). In general, we will not do so. This allows the expected return of ei to be positive,
negative, or zero for any asset. However, in some applications it is useful to divide the non-factor
return into two components -- a known expected value and an unknown residual component with
an expected value of zero. As typically written, the equation becomes:

ri~ = bi1*f1~ + bi2*f2~ + .... + bim*fm~ + (ai + ei~)

where the expected value of ei~ = 0.

As the choice of letter suggests, the equation is often written with the ai term first:

ri~ = ai + bi1*f1~ + bi2*f2~ + .... + bim*fm~ + ei~

In some cases the first term is called the asset's alpha value, but at this point we use the more humble
notation of "a".

Terminology
Factor models are used in many domains in the field of investments, so it should not be surprising that
different factors are used and different terms employed to describe the key components.

Factors (the fj's) may be:

macro-economic variables,
returns on pre-specified portfolios,
returns on zero-investment strategies (long and short positions of equal value) giving maximum
exposure to a fundamental or macro-economic factor,
returns on benchmark portfolios representing asset classes,
or something else.

The bij coefficients may be called:

factor exposures,
factor sensitivities,
factor loadings,
factor betas,
asset exposures,
style,
or something else.

The ei term may be called:

idiosyncratic return,
security-specific return,
non-factor return,
residual return,
selection return,
or something else.

Different problems require different factors and emphasize different economic relationships. The job of
the Analyst is to either construct and apply an appropriate factor model for the task at hand or to at least
understand the underlying structures and economic meanings of models constructed by others.

Decomposing Returns
A factor model is especially useful when analyzing historic asset returns, since such a model allows the
Analyst to separate components of the overall return of the asset. For such purposes it is useful to write
the underlying model as:

rit = bi1*f1t + bi2*f2t + .... + bim*fmt + eit

where:

rit = the return on asset i in period t
bi1 = the change in the return on asset i per unit change in factor 1
f1t = the value of factor 1 in period t
bi2 = the change in the return on asset i per unit change in factor 2
f2t = the value of factor 2 in period t
... = terms of the form bij*fjt with j going from 3 to m-1
bim = the change in the return on asset i per unit change in factor m
fmt = the value of factor m in period t
m = the number of factors
eit = the residual return on asset i in period t

While the subscript t and the term period suggest the traditional application in which each period
represents a different historic realization (for example, a different month in the past), the concepts can be
used as well in ex ante analyses, in which each period (t) represents a different possible scenario or
realization that could occur in the next (future) period. To emphasize the context, we will sometimes
use the subscript s (for scenario) instead of t (for time period). In the former case, there are S scenarios.
In the latter, there are T time periods.

Note that in this representation the bij terms are not given a t (period) or s (scenario) subscript. This is
innocuous in the latter case, since every scenario involves the same future period. However, in the
former case, the assumption is quite restrictive, since it indicates that the asset's exposures to the factors
were the same in every period. In some cases involving ex post returns, different exposures will be
estimated for different time periods, with bij values replaced with bijt values.

Matrix Representations of Factor Models


To simplify notation and to facilitate computation it is useful to switch from a subscripted notation to a
matrix representation. As usual, we utilize MATLAB conventions. We consider several cases in turn,
focusing on decomposition of returns, be they over time or over scenarios. For simplicity we cast our
examples in terms of historic returns over different time periods, but the interpretations can easily be
adapted to cases involving different possible scenarios over a single future time period.

One Asset, One Realization


First consider the case in which there is one asset and one time period (looking backward) or scenario
(looking forward). Let b be a {1*m} vector of the asset's factor exposures, let f be an {m*1} vector of
actual factor values, r a scalar representing the asset's return and e a scalar representing its residual
return. The factor model equation can then be written as:

r = b*f + e

For example, let the asset's exposures to the factors be:


b = [ 0.1 0.3 0.6 ]


Assume that the realized values of the factors in a given year were:
f = [  4
       7
      20 ]

If the total return on the asset (r) was 16.0 percent, then:

e = r - b*f
= 16.0 - 14.5
= 1.5

Thus, in the year in question, the asset's residual (non-factor related) return was 1.5%, while its factor-
related return was 14.5%.
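
The same decomposition can be reproduced outside MATLAB. The Python sketch below is a minimal illustration using the numbers from the example; the product b*f is written as an explicit sum of products.

```python
b = [0.1, 0.3, 0.6]      # factor exposures, a {1*m} vector
f = [4.0, 7.0, 20.0]     # realized factor values, an {m*1} vector
r = 16.0                 # total return on the asset, in percent

factor_return = sum(bj * fj for bj, fj in zip(b, f))   # b*f
e = r - factor_return                                  # residual return

print(round(factor_return, 2), round(e, 2))   # 14.5 1.5
```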

One Asset, Multiple Realizations


Next, consider a case in which there are many historic periods (looking backward) or scenarios (looking
forward). Let there be T such alternatives. For each one, there will be a return on the asset, so that the
scalar r will be replaced by T values, which can be written as a {1*T} (row) vector. Similarly, for every
alternative there will be a set of factor values, so that f will be replaced by an {m*T} matrix, F. This will
give a residual return for each of the periods or cases, so that the scalar e will be replaced by a {1*T}
vector. We assume that the asset's factor exposures will be the same in each case, so that b will remain a
{1*m} vector.

The relationships among these variables can then be written with the following succinct equation:
r = b*F + e

Given r, b and F, the residual returns can be found by performing the operation:

e = r - b*F

For example, assume that in the last two years the realized returns for the asset were:

r = [ 16 4 ]

while the factor values were:

F = [  4   3
       7   2
      20  10 ]

This implies that the factor-related returns were:

b*F = [ 14.5 6.9 ]

and the residual returns were:

e = r - b*F = [ 1.5 -2.9 ]

In each case the first column corresponds to the previous one-period example, which is a special case
(with T=1) of this present version.

Multiple Assets, Multiple Realizations


An even more general case can subsume both of the prior ones as special cases. Assume that there are
N assets and T realizations. Let:

R = an {N*T} matrix, where R(i,t) is the return on asset i in realization t

B = an {N*m} matrix, where B(i,j) is the exposure of asset i to factor j

F = an {m*T} matrix, where F(j,t) is the value of factor j in realization t

E = an {N*T} matrix, where E(i,t) is the residual return on asset i in realization t

The factor model then becomes:

R = B*F + E

and the matrix of residual returns can be found by computing:


E = R - B*F

As an example, assume that we have four assets, with exposures to three factors given by:

B = [ 0.1  0.3  0.6
      0.2  0.8  0
      0    0.7  0.3
      0    0    1.0 ]

If the factor values (F) were those of the previous example and the assets' returns in the two years were:

R = [ 16  4
       7  1
       8  6
      22  7 ]

then the residual returns were:

E = [  1.5  -2.9
       0.6  -1.2
      -2.9   1.6
       2.0  -3.0 ]

Not surprisingly, the first security is the one used in the prior case.
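
The four-asset example can be reproduced with a short script. The sketch below is written in plain Python rather than Matlab (an assumption made for illustration only); matmul is a hypothetical helper that stands in for Matlab's * operator on matrices stored as lists of rows.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

B = [[0.1, 0.3, 0.6],
     [0.2, 0.8, 0.0],
     [0.0, 0.7, 0.3],
     [0.0, 0.0, 1.0]]       # {N*m} factor exposures
F = [[4.0, 3.0],
     [7.0, 2.0],
     [20.0, 10.0]]          # {m*T} factor values
R = [[16.0, 4.0],
     [7.0, 1.0],
     [8.0, 6.0],
     [22.0, 7.0]]           # {N*T} asset returns

BF = matmul(B, F)           # factor-related returns, B*F

# Residual returns, E = R - B*F (rounded to suppress floating-point noise)
E = [[round(r - bf, 4) for r, bf in zip(r_row, bf_row)]
     for r_row, bf_row in zip(R, BF)]

for row in E:
    print(row)
```

Each printed row holds one security's residual returns in the two periods.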

Note that three different dimensions are involved here (two periods, three factors, and four assets). The
matrices, with row and column labels, are as follows:

Returns (R):

              period 1   period 2
security 1        16.0        4.0
security 2         7.0        1.0
security 3         8.0        6.0
security 4        22.0        7.0

Asset Exposures (B):

              factor 1   factor 2   factor 3
security 1         0.1        0.3        0.6
security 2         0.2        0.8        0.0
security 3         0.0        0.7        0.3
security 4         0.0        0.0        1.0

Factor realizations (F):

              period 1   period 2
factor 1           4.0        3.0
factor 2           7.0        2.0
factor 3          20.0       10.0

Factor-related Returns (B*F):

              period 1   period 2
security 1        14.5        6.9
security 2         6.4        2.2
security 3        10.9        4.4
security 4        20.0       10.0

Residual returns (E = R - B*F):

              period 1   period 2
security 1         1.5       -2.9
security 2         0.6       -1.2
security 3        -2.9        1.6
security 4         2.0       -3.0

Factor-based Expected Returns, Risks and Correlations

Contents:
Factor-based Asset Expected Returns
Factor-based Asset Covariances and Variances
Factor-based Portfolio Expected Returns and Risks

Factor-based Asset Expected Returns


What is the expected return for a single asset whose return is generated by a factor model? The answer
conforms nicely with intuition -- each uncertain term in the factor model equation can simply be
replaced with its expected value. Thus, if:
$$\tilde{r}_i = b_{i1}\tilde{f}_1 + b_{i2}\tilde{f}_2 + \dots + b_{im}\tilde{f}_m + \tilde{e}_i$$

It will be the case that:

ev(ri) = bi1*ev(f1) + bi2*ev(f2) + .... + bim*ev(fm) + ev(ei)

where ev(x) denotes the expected value of x

To see why this is the case, recall that for a given possible future scenario s:

ris = bi1*f1s + bi2*f2s + .... + bim*fms + eis

Now, multiply each term by prs, the probability that the scenario will occur:

prs*ris = prs*bi1*f1s + prs*bi2*f2s + .... + prs*bim*fms + prs*eis

This equation will continue to hold (the left side value must equal the right side value) and there will be
one such equation for each possible scenario.

Next, add together the equations for all S possible scenarios. Letting sums( ) denote the sum for
s=1...S, we have:

sums(prs*ris) = sums(prs*bi1*f1s) + sums(prs*bi2*f2s) + .... + sums(prs*bim*fms)
                + sums(prs*eis)

Collecting terms that have scenario subscripts gives:

sums(prs*ris) = bi1*sums(prs*f1s) + bi2*sums(prs*f2s) + .... + bim*sums(prs*fms)
                + sums(prs*eis)

By definition, the first sum is the asset's expected return, the next m sums are the expected returns of the
factors, and the last term is the expected residual return. Thus:

ev(ri) = bi1*ev(f1) + bi2*ev(f2) + .... + bim*ev(fm) + ev(ei)

as asserted.

If the expected residual return is represented by ai, the equation can be written as:

ev(ri) = ai + bi1*ev(f1) + bi2*ev(f2) + .... + bim*ev(fm)

or, in matrix terms:

e(i) = b*ef + a(i)

where:

e(i) = r*pr'
ef = F*pr'
a(i) = e*pr'

Here, pr represents a {1*S} vector of probabilities, r a {1*S} vector of asset returns, F an {m*S} matrix
of factor values, and e a {1*S} vector of residual returns. As previously, b represents a {1*m} vector
indicating the asset's sensitivities to the factors. The computed values are: e(i), a scalar representing the
asset's expected return; ef, an {m*1} vector of factor expected values; and a(i), a scalar representing the
asset's expected residual return.
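
The matrix form can be verified numerically. The sketch below, in plain Python rather than Matlab, reuses the two-realization data of the earlier example and treats the two columns as equally likely scenarios; the probabilities in pr are an assumption introduced purely for illustration, and dot is a hypothetical helper for the inner product.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

pr = [0.5, 0.5]             # hypothetical scenario probabilities, {1*S}
b = [0.1, 0.3, 0.6]         # factor exposures, {1*m}
r = [16.0, 4.0]             # asset returns by scenario, {1*S}
F = [[4.0, 3.0],
     [7.0, 2.0],
     [20.0, 10.0]]          # factor values by scenario, {m*S}

# Residual return in each scenario: e = r - b*F
e = [r_s - dot(b, [row[s] for row in F]) for s, r_s in enumerate(r)]

e_i = dot(r, pr)                    # e(i) = r*pr'
ef = [dot(row, pr) for row in F]    # ef = F*pr'
a_i = dot(e, pr)                    # a(i) = e*pr'

# The two sides of e(i) = b*ef + a(i) should agree
print(round(e_i, 4), round(dot(b, ef) + a_i, 4))
```

Both printed numbers should be equal, which is the linearity property established by the derivation.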

This rather tortuous proof suffices for other situations involving linear functions. The expected value of
a variable that is a linear function of other variables will itself be a linear function of the expected values
of the variables in question, using the same constants (here, bij values).

The equation for an asset's expected return could be used to compute the expected return on each asset,
one at a time. However, it is far more efficient to generalize it so that the entire vector of asset expected
returns can be computed in one operation. This is straightforward. Let:

e = an {N*1} vector of asset expected returns

B = an {N*m} matrix of factor exposures, where B(i,j) is the exposure of asset i to factor j

ef = an {m*1} vector of factor expected values

a = an {N*1} vector of the expected residual returns

Then:

e = B*ef + a

This is a Matlab expression which requires one operation to do the entire job.
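
As a rough illustration, the same computation can be written out in plain Python (used here instead of Matlab only for the sake of the sketch). The exposures are those of the earlier four-asset example; the values of ef and a are hypothetical and serve only to show the mechanics.

```python
B = [[0.1, 0.3, 0.6],
     [0.2, 0.8, 0.0],
     [0.0, 0.7, 0.3],
     [0.0, 0.0, 1.0]]           # {N*m} factor exposures
ef = [3.5, 4.5, 15.0]           # hypothetical factor expected values, {m*1}
a = [-0.7, 0.0, 0.5, -0.5]      # hypothetical expected residual returns, {N*1}

# e = B*ef + a, computed one row of B at a time
e = [sum(bij * efj for bij, efj in zip(row, ef)) + ai
     for row, ai in zip(B, a)]

print([round(v, 4) for v in e])
```

Each element of e is the expected return of one asset.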

With respect to expected returns, it would appear that the use of a factor model has actually increased
the number of required estimates. In this approach, for N assets the Analyst needs N estimates of a(i)
plus estimates of the expected values of the m factors. While this is true, there are at least some cases in
which it is reasonable to assume that each asset has the same expected residual return, or that each such
expected residual return is related in a simple way to the asset's factor sensitivities. In such cases, the
number of estimates required to specify asset expected returns may be considerably smaller than N.
Even when this is not so, a small increase in the size of the task of estimating expected values is a
reasonable price to pay for the substantial decreases in the magnitude of the task of estimating risks, as the
next section shows.

Factor-based Asset Covariances and Variances


To determine the relationship between factor characteristics and those of an asset it is useful to re-
examine the nature of covariance. Put in future terms, the covariance of asset i with asset j is the
expected value of the product of (1) the deviation of asset i's return from its mean and (2) the deviation
of asset j's return from its mean:

cov(ri,rj) = ev( (ri-ei)*(rj-ej))

From this it follows that if k is a constant:

cov(ri,k*rj) = ev( (ri-ei)*(k*rj-k*ej) )
             = ev( (ri-ei)*k*(rj-ej) )
             = k*ev( (ri-ei)*(rj-ej) )
             = k*cov(ri,rj)

In words: the covariance of a variable with a constant times another variable equals the constant times
the covariance of the two variables.

The definition of covariance also implies that if ri, rj and rk are returns:

cov(ri,rj+rk) = ev( (ri-ei)*((rj+rk) - (ej+ek)) )
              = ev( (ri-ei)*((rj-ej) + (rk-ek)) )
              = ev( (ri-ei)*(rj-ej) ) + ev( (ri-ei)*(rk-ek) )
              = cov(ri,rj) + cov(ri,rk)

In words: the covariance of a variable with the sum of two variables equals the sum of its covariances
with the two variables. Clearly, a similar statement holds for the relationship between the covariance of a
variable with the difference between two other variables.

Now, consider the covariance between two assets (i and j), where the returns of each are determined by
a factor model. To keep notation to a minimum, let there be two factors. The relationships are thus:
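$$\tilde{r}_i = b_{i1}\tilde{f}_1 + b_{i2}\tilde{f}_2 + \tilde{e}_i$$

$$\tilde{r}_j = b_{j1}\tilde{f}_1 + b_{j2}\tilde{f}_2 + \tilde{e}_j$$

The goal is to determine the covariance between $\tilde{r}_i$ and $\tilde{r}_j$. Substituting the right-hand sides of the
equations, we have:

$$\mathrm{cov}(\tilde{r}_i, \tilde{r}_j) = \mathrm{cov}\big(b_{i1}\tilde{f}_1 + b_{i2}\tilde{f}_2 + \tilde{e}_i,\; b_{j1}\tilde{f}_1 + b_{j2}\tilde{f}_2 + \tilde{e}_j\big)$$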

Using the relationships derived earlier, the right-hand side of this equation can be re-written as the sum
of nine covariances, since there are three terms in each component. However, some of these will equal
zero. The maintained assumptions of the factor model are that each residual is uncorrelated with that of
any other variable, and that each residual is uncorrelated with each of the factors. However, since a
variable's residual return will be correlated with itself, the corresponding term should be included to
cover the case in which i=j. Including only the terms that could be non-zero, and dropping the tildes
gives:

cov(ri,rj) = bi1*bj1*cov(f1,f1) + bi1*bj2*cov(f1,f2) + bi2*bj1*cov(f2,f1)
             + bi2*bj2*cov(f2,f2) + cov(ei,ej)

This can be written far more succinctly using matrix notation:

Cij = bi*CF*bj' + rvij

where:

Cij = the covariance between the returns on assets i and j

bi = a {1*m} vector of asset i's exposures to the m factors

CF = an {m*m} matrix of the factor covariances

bj = a {1*m} vector of asset j's exposures to the m factors

rvij = the covariance between the residuals on assets i and j

Note that rvij will equal zero if i and j are different, but will equal the variance of the asset's residual if
i=j.

A small amount of reflection on this derivation will indicate that the matrix version of the formula is as
applicable if there are more than two factors as it is if there are two.

The formula can be used to compute the variance of an asset's return since var(ri) = Cii. Thus:

var(ri) = bi*CF*bi' + rvii

More impressively, the formula can be generalized to compute the entire covariance matrix for asset
returns. As before, let:

B = an {N*m} matrix of factor exposures, where B(i,j) is the exposure of asset i to factor j

and

rv = an {N*1} vector, where rv(i) is the residual variance for asset i (that is, the variance of
ei)

Then in Matlab notation:

C = B*CF*B' + diag(rv)

where:

diag(rv) = a matrix with the elements of rv on the main diagonal and zeros elsewhere.
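
A small numerical sketch may help make the formula concrete. It is written in plain Python rather than Matlab; matmul and transpose are hypothetical helpers standing in for Matlab's * and ' operators. The exposures B come from the earlier four-asset example, while the factor covariance matrix CF and the residual variances rv are invented solely for illustration.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

B = [[0.1, 0.3, 0.6],
     [0.2, 0.8, 0.0],
     [0.0, 0.7, 0.3],
     [0.0, 0.0, 1.0]]           # {N*m} factor exposures
CF = [[4.0, 1.0, 2.0],
      [1.0, 9.0, 3.0],
      [2.0, 3.0, 25.0]]         # hypothetical {m*m} factor covariances
rv = [2.0, 1.0, 3.0, 4.0]       # hypothetical residual variances, {N*1}

# C = B*CF*B' + diag(rv)
C = matmul(matmul(B, CF), transpose(B))
for i, v in enumerate(rv):
    C[i][i] += v                # residual variance affects only the diagonal

for row in C:
    print([round(c, 3) for c in row])
```

Off-diagonal elements depend only on B and CF; the residual variances enter only on the diagonal, as the text notes for the case i=j.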

For problems involving many securities (N) and relatively few factors (m) the number of potentially
different estimated variables (those on the right-hand side of the equation) can be very much smaller
than the number of asset covariances (on the left-hand side of the equation). For each of the covariance
matrices (C and CF) we count only the elements on and below the diagonal, since once those are
known, the remainder can be filled in. Taking this into account, the numbers of values to be determined
for each component are:

C: (N^2 + N)/2

CF: (m^2 + m)/2

B: N*m

rv: N

The table below shows a few examples.

    N     m            C      CF         B      rv   CF+B+rv
  100     3        5,050       6       300     100       406
1,000     3      500,500       6     3,000   1,000     4,006
9,000    15   40,504,500     120   135,000   9,000   144,120
7,000    60   24,503,500   1,830   420,000   7,000   428,830

The first two rows are included for illustrative purposes. The third is representative of a typical model
used for U.S. mutual funds, with 15 broad asset class returns used for factors to explain the returns of
the many thousand mutual funds in the country. The fourth row is representative of some commercial
models that use a number of factors to explain the returns of the thousands of common stocks in the
United States.

As the table vividly illustrates, the number of potentially different asset covariances (in C) can be huge.
While the number of estimates required if a factor model is used (shown in the last column) can be large,
the use of such a model reduces a virtually intractable problem to one that can be manageable.

Finally, we can compare the total number of estimates required with and without a factor model for each
of the four cases. In addition to the estimates required for covariances, those needed for expected returns
(e), factor expected returns (ef) and residual expected returns (a) need to be taken into account. The table
below shows the results:

                                                                             without      with
    N    m            C       e      CF         B      rv    ef       a        model     model
  100    3        5,050     100       6       300     100     3     100        5,150       509
1,000    3      500,500   1,000       6     3,000   1,000     3   1,000      501,500     5,009
9,000   15   40,504,500   9,000     120   135,000   9,000    15   9,000   40,513,500   153,135
7,000   60   24,503,500   7,000   1,830   420,000   7,000    60   7,000   24,510,500   435,890

The conclusion is the same. A factor model is a necessity for the estimation of risks and returns if
problems of any size are to be analyzed.
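
The counts in the two tables follow directly from the formulas given earlier, as the following sketch (in plain Python, with hypothetical function names tri and estimate_counts) is intended to show.

```python
def tri(n):
    """Elements on and below the diagonal of an n-by-n symmetric matrix."""
    return (n * n + n) // 2

def estimate_counts(N, m):
    """Return (estimates without a factor model, estimates with one)."""
    without_model = tri(N) + N                 # C plus e
    with_model = tri(m) + N * m + N + m + N    # CF + B + rv + ef + a
    return without_model, with_model

for N, m in [(100, 3), (1000, 3), (9000, 15), (7000, 60)]:
    without_model, with_model = estimate_counts(N, m)
    print(N, m, without_model, with_model)
```

The last two values on each printed line correspond to the "without model" and "with model" columns of the second table.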

Factor-based Portfolio Expected Returns and Risks


All the work performed in the previous sections can be summarized with the two equations for asset
expected returns (e) and covariances (C):

e = B*ef + a

C = B*CF*B' + diag(rv)

It is now time to consider portfolios of assets.

The custom is to represent a portfolio by an {N*1} vector x in which each element is the proportion (by
value) invested in an asset and the sum of the x's equals 1. As usual, the portfolio's expected return is
given by:

ep = x'*e

and its variance by:

vp = x'*C*x

One could use the factor model-based equations to compute e and C, then the equations for portfolio
expected return and variance to compute the portfolio's characteristics. However, this would require the
creation of a possibly very large asset covariance matrix (C). A far more attractive alternative
combines the equations, simplifies, and then solves for ep and vp.

First, take the equations for expected return. Combine:

e = B*ef + a

and

ep = x'*e

to get:

ep = x'*B*ef + x'*a

Grouping terms slightly gives:

ep = (x'*B)*ef + x'*a

The parenthesized expression involves multiplying a {1*N} vector by an {N*m} matrix. The result is a
{1*m} vector that we will call bp:

bp = x'*B

The terminology is not without a purpose, for each element of bp indicates the exposure of the portfolio
to a factor. In somewhat casual notation:

bpj = sumi(xi*bij)

More succinctly:

bp = a {1*m} vector of the portfolio's exposures to the m factors

where each exposure is a weighted average of the asset exposures to the factor in question, with
proportionate portfolio holdings used as weights.

The expected return of a portfolio can thus be obtained by multiplying each of its bpj values by the
expected return of the associated factor and adding the weighted average of the asset residual expected
returns, using the portions invested in the assets as weights:

ep = bp*ef + x'*a

A similar approach can be used for computing a portfolio's variance, with very rewarding reductions in
computational requirements. Combine the equations for variance and covariance:

C = B*CF*B' + diag(rv)

and

vp = x'*C*x

to get:

vp = x'*B*CF*B'*x + x'*diag(rv)*x

This rather menacing equation can be simplified by grouping:

vp = (x'*B)*CF*(B'*x) + x'*diag(rv)*x

and then replacing the two parenthesized expressions with the vector of portfolio bpj values,
bp = x'*B, and its transpose, bp' = B'*x:

vp = bp*CF*bp' + x'*diag(rv)*x

A further simplification is possible. The net result of the computations in the final term is simply to
multiply each residual variance by the square of the asset's proportionate holdings, then add the results.
Far better to do this directly, that is:

x'*diag(rv)*x = (x.^2)'*rv

so that:

vp = bp*CF*bp' + (x.^2)'*rv

This set of transformations allows the computation of portfolio variance without ever computing the
covariances of the component assets! Once bp has been calculated, the portfolio's variance can be
determined with one set of operations involving an {m*m} covariance matrix (CF) and another
requiring the computation of only N products of two terms. A result well worth the matrix algebra
involved in the derivations.
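
The equivalence of the two routes can be checked numerically. The sketch below is in plain Python rather than Matlab, with matmul and transpose as hypothetical stand-ins for * and '. The exposures B are those of the earlier four-asset example; CF, rv, ef, a and the holdings x are all invented for illustration. The full covariance matrix is built here only to confirm that the shortcut gives the same variance.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

B = [[0.1, 0.3, 0.6],
     [0.2, 0.8, 0.0],
     [0.0, 0.7, 0.3],
     [0.0, 0.0, 1.0]]           # {N*m} factor exposures
# Hypothetical inputs, chosen only to illustrate the mechanics
CF = [[4.0, 1.0, 2.0],
      [1.0, 9.0, 3.0],
      [2.0, 3.0, 25.0]]         # {m*m} factor covariances
rv = [2.0, 1.0, 3.0, 4.0]       # residual variances
ef = [3.5, 4.5, 15.0]           # factor expected values
a = [-0.7, 0.0, 0.5, -0.5]      # expected residual returns
x = [0.4, 0.3, 0.2, 0.1]        # portfolio proportions, summing to 1

# Factor route: bp = x'*B, then ep and vp without ever forming C
bp = matmul([x], B)[0]
ep = sum(p * f for p, f in zip(bp, ef)) + sum(xi * ai for xi, ai in zip(x, a))
vp = (matmul(matmul([bp], CF), transpose([bp]))[0][0]
      + sum(xi ** 2 * v for xi, v in zip(x, rv)))   # (x.^2)'*rv

# Full route, for comparison only: C = B*CF*B' + diag(rv), vp = x'*C*x
C = matmul(matmul(B, CF), transpose(B))
for i, v in enumerate(rv):
    C[i][i] += v
vp_full = matmul(matmul([x], C), transpose([x]))[0][0]

print(round(ep, 4), round(vp, 4), round(vp_full, 4))
```

The last two printed values should coincide. With only four assets the saving is negligible, but the factor route never touches an {N*N} matrix, which is the point of the derivation.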

Style Analysis
Asset allocation: Management style and performance measurement
Setting the Record Straight on Style Analysis

ASSET ALLOCATION: MANAGEMENT STYLE AND PERFORMANCE MEASUREMENT
An Asset class factor model can help make order out of chaos

William F. Sharpe*

Reprinted from the Journal of Portfolio Management, Winter 1992, pp. 7-19.

This copyrighted material has been reprinted with permission from The Journal of Portfolio Management.
Copyright © Institutional Investor, Inc., 488 Madison Avenue, New York, N.Y. 10022,
a Capital Cities/ABC, Inc. Company. Phone (212) 224-3599.

It is widely agreed that asset allocation accounts for a large part of the variability in the return on a typical investor's
portfolio. This is especially true if the overall portfolio is invested in multiple funds, each including a number of
securities.

Asset allocation is generally defined as the allocation of an investor's portfolio among a number of "major" asset
classes. Clearly such a generalization cannot be made operational without defining such classes.

Once a set of asset classes has been defined, it is important to determine the exposures of each component of an
investor's overall portfolio to movements in their returns. Such information can be aggregated to determine the
investor's overall effective asset mix. If it does not conform to the desired mix, appropriate alterations can then be
made.

Once a procedure for measuring exposures to variations in returns of major asset classes is in place, it is possible to
determine how effectively individual fund managers have performed their functions and the extent (if any) to which
value has been added through active management. Finally, the effectiveness of the investor's overall asset allocation
can be compared with that of one or more benchmark asset mixes.

An effective way to accomplish all these tasks is to use an asset class factor model. After describing the characteristics
of such a model, we illustrate applications of a model with twelve asset classes to analyze the performance of a set of
open-end mutual funds between 1985 and 1989.

ASSET CLASS FACTOR MODELS

Factor models are common in investment analysis. Equation (1) is a generic representation:

$$\tilde{R}_i = [\,b_{i1}\tilde{F}_1 + b_{i2}\tilde{F}_2 + \dots + b_{in}\tilde{F}_n\,] + \tilde{e}_i \qquad (1)$$

Ri represents the return on asset i, F1 represents the value of factor 1, F2 the value of factor 2, Fn the value of the n'th
(last) factor and ei the "non-factor" component of the return on i. All these values are (potentially) unknown
before-the-fact, as indicated by the tildes. The remaining values (bi1 through bin) represent the sensitivities of Ri to
factors F1 through Fn.

A key assumption makes a model of this sort more than simply an exercise in data description: The non-factor return
for one asset (ei) is assumed to be uncorrelated with that of every other (e.g. ej). In effect, the factors are the only
sources of correlation among returns.

An asset class factor model can be considered a special case of the generic type. In such a model each factor
represents the return on an asset class and the sensitivities (bij values) are required to sum to 1 (100%). In effect, the
return on an asset i is represented as the return on a portfolio (shown by the sum of the terms in the bracketed
expression) invested in the n asset classes plus a residual component (ei). For expository convenience, the sum of the
terms in the brackets can be termed the return attributable to style and the residual component (ei) the return due to
selection. Indeed, a key contribution of this approach is the separation of return into these two main components.

EVALUATING ASSET CLASS FACTOR MODELS

The usefulness of an asset class factor model depends on the asset classes chosen for its implementation. While not
strictly necessary, it is desirable that such asset classes be 1) mutually exclusive, 2) exhaustive and 3) have returns that
"differ". Pragmatically, each should represent a market-capitalization weighted portfolio of securities; no security
should be included in more than one asset class; as many securities as possible should be included in the chosen asset
classes; and the asset class returns should either have low correlations with one another or, in cases in which
correlations are high, different standard deviations.

While the appropriate measure of the efficacy of any specific implementation depends on the uses to which the model
is to be put, factor models are typically evaluated on the basis of their ability to explain the returns of the assets in
question (i.e. the Ris). A useful metric is the proportion of variance "explained" by the selected asset classes. Using
the traditional definition, for asset i:

$$R^2 = 1 - \frac{\mathrm{Var}(\tilde{e}_i)}{\mathrm{Var}(\tilde{R}_i)} \qquad (2)$$

The right-hand side of equation (2) equals 1 minus the proportion of variance "unexplained". The resulting R-squared
value thus indicates the proportion of the variance of Ri "explained" by the n asset classes1 .
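
As a minimal sketch of this measure, the Python fragment below computes 1 minus the ratio of residual variance to total variance. The two return series are hypothetical and far too short for any real analysis, and the use of the population variance (dividing by the number of observations) is an assumption; the ratio is unaffected as long as the same convention is applied to both series.

```python
def variance(values):
    """Population variance (divides by the number of observations)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

R_i = [16.0, 4.0]    # hypothetical returns on asset i
e_i = [1.5, -2.9]    # hypothetical non-factor (residual) returns

# Proportion of variance "explained" = 1 - proportion "unexplained"
r_squared = 1.0 - variance(e_i) / variance(R_i)

print(round(r_squared, 4))
```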

It is important to recognize that this measure indicates only the extent to which a specific model fits the data at hand. A
better test of the usefulness of any implementation is its ability to explain performance out-of-sample. For this reason it
is important to consider not only the ability of a model to explain a given set of data but also its parsimony. Other
things equal (e.g. R-squared values), the fewer the asset classes, the more likely is the model to represent continuing
fundamental relationships with predictive content2 .

To evaluate the exposures of funds to changes in the returns of key asset classes, the appropriate measure is the
collective ability of a set of such classes to explain the time-series variability in the returns on a typical fund (e.g.
mutual fund or separately-managed institutional account). Note that this criterion differs from that often applied in
evaluating factor models designed to describe specific portions of the overall capital market.

For example, when constructing an equity factor model, one might consider the ability of the selected factors to
explain the time-series variation in the returns of a typical stock. Most stock market models include factors
representing returns on industry groups and/or economic sectors -- factors that account for much of the typical
security's return. If most managers diversify across industries and economic sectors, however, inclusion of factors
related to differences in industry and sector returns will add little if any explanatory power to a model designed to
explain fund returns.

A TWELVE ASSET CLASS MODEL

The model we use has twelve asset classes. The return of each is represented by a market capitalization weighted
index of the returns on a large number of securities. For reasons that will become clear, it is important to note that each
index represents a strategy that could be followed at low cost using an index fund. The composition of each index is
specified in sufficient detail by its provider to enable an investor to track the returns with little error through a passive
(index-like) investment strategy.

Table 1 describes the twelve asset classes and the indices used for the associated return series. Most are widely used
indexes that require no further description. The four less well-known are those employed to represent U.S. equity
classes.

TABLE 1
Asset Classes

Bills
Cash-equivalents with less than 3 months to maturity
Index: Salomon Brothers' 90-day Treasury bill index

Intermediate-term Government Bonds
Government bonds with less than 10 years to maturity
Index: Lehman Brothers' Intermediate-term Government Bond Index

Long-term Government Bonds
Government bonds with more than 10 years to maturity
Index: Lehman Brothers' Long-term Government Bond Index

Corporate Bonds
Corporate bonds with ratings of at least Baa by Moody's or BBB by Standard & Poor's
Index: Lehman Brothers' Corporate Bond Index

Mortgage-Related Securities
Mortgage-backed and related securities
Index: Lehman Brothers' Mortgage-Backed Securities Index

Large-Capitalization Value Stocks
Stocks in Standard and Poor's 500-stock index with high book-to-price ratios
Index: Sharpe/BARRA Value Stock Index

Large-Capitalization Growth Stocks
Stocks in Standard and Poor's 500-stock index with low book-to-price ratios
Index: Sharpe/BARRA Growth Stock Index

Medium-Capitalization Stocks
Stocks in the top 80% of capitalization in the U.S. equity universe after the exclusion
of stocks in Standard and Poor's 500 stock index
Index: Sharpe/BARRA Medium Capitalization Stock Index

Small-Capitalization Stocks
Stocks in the bottom 20% of capitalization in the U.S. equity universe after the exclusion
of stocks in Standard and Poor's 500 stock index
Index: Sharpe/BARRA Small Capitalization Stock Index

Non-U.S. Bonds
Bonds outside the U.S. and Canada
Index: Salomon Brothers' Non-U.S. Government Bond Index

European Stocks
European and non-Japanese Pacific Basin stocks
Index: FTA Euro-Pacific Ex Japan Index

Japanese Stocks
Japanese Stocks
Index: FTA Japan Index

In effect, the institutional universe of U.S. equities has been divided into four mutually exclusive and exhaustive
groups3 . The first two represent a partition of the stocks in Standard and Poor's 500 stock index. Every six months the
S&P500 stocks are ranked according to the ratio of the most recently published book value per share to a previous
month-end price per share. A dividing line is drawn so that approximately half the total value of the 500 stocks is
placed on either side. Stocks with high book-to-price ratios are placed in the "value" stock index; the remainder are in
the "growth" stock index.

A similar procedure is followed in constructing the medium capitalization and small capitalization stock indexes. Non-
S&P500 stocks are ranked on the basis of total outstanding market capitalization every six months, and a dividing line
drawn so that approximately 80% of the total value is above the line and 20% below it. Most of the stocks in the first
group are placed in the medium capitalization index and most of the remaining stocks in the small capitalization index.
To avoid excessive turnover in the composition of these indexes of relatively illiquid stocks (and an associated high
cost for index tracking), any stock that has recently "crossed over the line" a relatively small distance is allowed to
remain in its former index4 .

Many of the differences in returns of U.S. equity mutual funds can be attributed to differences in their exposures to
these four asset classes. In effect, there appear to be two key dimensions along which such funds differ. One may
loosely be termed "value/growth"; the other "small/large". Figure 1 illustrates the composition of the four domestic
equity asset classes. Each index can be considered a capitalization-weighted "center of gravity" of the securities in its
associated class, as the dots in Figure 1 indicate. Note that any combination of the four indices with non-negative
holdings can be represented by a point in the area defined by the index locations (in this case, a triangle)5 .

FIGURE 1

COMPOSITION OF FOUR DOMESTIC EQUITY CLASSES

Much has been written about both the small-stock and the value/growth phenomena. While the terms "value" and
"growth" reflect common usage in the investment profession, they serve only as convenient names for stocks that tend
to be similar in several respects. As is well known, across securities there is significant positive correlation among:
book/price, earnings/price, low earnings growth, dividend yield and low return on equity. Moreover, the industry
compositions of the value and growth groups differ (e.g. companies with high research budgets tend to have low book
values relative to their stock prices).

Those concerned with these distinctions have focused most of their research on long-run average return differences;
that is, they have asked whether small stocks or value stocks "do better than they should" in the long-run. Less
attention has been paid to likely sources of short-run variability in returns among such groups. For present purposes it
suffices that such variability is substantial.

Figure 2 provides relevant evidence: The variability in returns across the four asset classes from year-to-year is far
greater than would be encountered if groups with similar numbers of securities had been formed randomly. Fund
exposures across these dimensions vary greatly. As a result, much of the variation in fund returns in any given period
can be attributed to the combined effects of their exposures to these asset classes and the realized returns on those
classes.

DETERMINING FUND EXPOSURES

The traditional view of asset allocation assumes that an investor allocates assets among (potentially many) funds, each
of which holds (potentially many) securities. Ultimately one is interested in the investor's exposures to the key asset
classes. These are a function of 1) the amounts of the investor's portfolio invested in the various funds and 2) the
exposures of each such fund to the asset classes. The exposures of a fund to the various asset classes are, in turn, a
function of 1) the amounts that the fund has invested in various securities and 2) the exposures of the securities to the
asset classes.

While it is possible to attempt to determine a fund's exposures from a detailed analysis of the securities held by the
fund, a simpler approach typically provides more than enough information for purposes of asset allocation. Such a
method uses only realized fund returns to infer the typical exposures of the fund to the asset classes. Since only easily
obtained information is required, the approach may be considered "external", in comparison with methods that rely on
information that may be available only from sources internal to the fund.

Inspection of equation (1) immediately suggests a procedure that might be used in this connection. Given, say, sixty
monthly returns on a fund, along with comparable returns for a selected set of asset classes, one could simply employ
a multiple regression analysis with fund returns as the dependent variable and asset class returns as the independent
variables. The resulting slope coefficients could then be interpreted as the fund's historic exposures to the asset class
returns.
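
The regression just described can be sketched in a few lines. The example below is only an illustration, not the computation behind Table 2: the returns are synthetic, the function names (solve, ols_exposures) are hypothetical, and the inclusion of an intercept is an assumption.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols_exposures(fund, classes):
    """Regress fund returns on asset class returns via the normal equations.

    fund: list of T returns; classes: list of T rows, one return per asset class.
    Returns (intercept, slopes); the intercept is an assumption of this sketch.
    """
    rows = [[1.0] + list(r) for r in classes]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, fund)) for i in range(k)]
    coef = solve(xtx, xty)
    return coef[0], coef[1:]

# Synthetic data: sixty months, three hypothetical asset classes.
random.seed(0)
classes = [[random.gauss(0.01, 0.04) for _ in range(3)] for _ in range(60)]
fund = [0.7 * row[0] + 0.3 * row[1] for row in classes]  # a fund with a known mix
intercept, slopes = ols_exposures(fund, classes)
print([round(b, 4) for b in slopes])
```

Nothing in this procedure keeps the slopes between 0 and 100% or forces them to sum to 100%; with real fund returns they generally will not.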

Table 2 provides an example for Trustees' Commingled U.S. Portfolio (an open-end mutual fund offered by the
Vanguard Group). Monthly returns from January 1985 through December 1989 are used for the dependent variable,
with the corresponding returns for the twelve asset classes serving as independent variables.

The column entitled "Unconstrained Regression" shows results obtained applying Equation (1). The first twelve rows
show the resulting slope coefficients (bij values), expressed as percentages. The coefficient total is shown next,
followed by the R-squared value (also expressed as a percentage). A substantial portion (95.20%) of the monthly
variance in the fund's returns is explained by this equation. The coefficients, however, do not sum to 100%. What is
more important, several are vastly inconsistent with the fund's actual policy (to invest in common stocks with no short
positions).

Table 2
Regression and Quadratic Programming Results
Trustees' Commingled Fund - U.S. Portfolio
January 1985 through December 1989
(coefficients and R-squared in percent)

                      Unconstrained    Constrained      Quadratic
                         Regression     Regression    Programming
Bills                         14.69          42.65              0
Intermediate Bonds           -69.51         -68.64              0
Long-term Bonds               -2.54          -2.38              0
Corporate Bonds               16.57          15.29              0
Mortgages                      5.19           4.58              0
Value Stocks                 109.52         110.35          69.81
Growth Stocks                 -7.86          -8.02              0
Medium Stocks                -41.83         -43.62              0
Small Stocks                  45.65          47.17          30.04
Foreign Bonds                 -1.85          -1.38              0
European Stocks                6.15           5.77           0.15
Japanese Stocks               -1.46          -1.79              0
Total                         72.71         100.00         100.00
R-squared                     95.20          95.16          92.22

The column in Table 2 titled "Constrained Regression" reports the results of a multiple regression analysis similar to
the first, with one added constraint: The coefficients were required to sum to 100%. The reduction in R-squared was
slight (from 95.20% to 95.16%), but the inconsistency between the coefficients and the fund's investment policy
remains.

The last column in Table 2 reports the results of an analysis where each coefficient is constrained to lie between 0 and
100% and the sum is again required to be 100%. As in the previous cases, the objective of the analysis was to select a
set of coefficients that minimizes the "unexplained" variation in returns (i.e., the variance of ei), subject to the stated
constraints. An equivalent goal was to maximize the associated value of R-squared, subject to the stated constraints.
For this analysis, the presence of inequality constraints (0 <= bij <= 100% for each i) required the use of a quadratic
programming algorithm.
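
The constrained problem can be illustrated with a small sketch. It uses a generic projected-gradient iteration, chosen here only because it fits in a few lines; it should not be read as the algorithm used to produce Table 2. The data are synthetic and the names project_to_simplex and style_weights are hypothetical.

```python
import random

def project_to_simplex(v):
    """Euclidean projection of v onto {b : b_j >= 0, sum of b_j = 1}."""
    u = sorted(v, reverse=True)
    running, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        running += ui
        t = (running - 1.0) / i
        if ui - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]

def style_weights(fund, classes, iterations=5000):
    """Minimize the variance of (fund return - style return) subject to
    0 <= b_j <= 1 and sum of b_j = 1, by projected gradient descent."""
    t, k = len(fund), len(classes[0])
    mean_r = sum(fund) / t
    mean_f = [sum(row[j] for row in classes) / t for j in range(k)]
    r = [x - mean_r for x in fund]
    f = [[row[j] - mean_f[j] for j in range(k)] for row in classes]
    # Covariances among asset classes, and between each class and the fund.
    cov_ff = [[sum(x[i] * x[j] for x in f) / t for j in range(k)] for i in range(k)]
    cov_fr = [sum(x[j] * y for x, y in zip(f, r)) / t for j in range(k)]
    # Step no larger than 1 / (Lipschitz constant of the gradient).
    step = 1.0 / (2.0 * sum(cov_ff[j][j] for j in range(k)))
    b = [1.0 / k] * k
    for _ in range(iterations):
        grad = [2.0 * (sum(cov_ff[j][i] * b[i] for i in range(k)) - cov_fr[j])
                for j in range(k)]
        b = project_to_simplex([b[j] - step * grad[j] for j in range(k)])
    return b

# Synthetic data: sixty months, three hypothetical asset classes.
random.seed(1)
classes = [[random.gauss(0.01, 0.04) for _ in range(3)] for _ in range(60)]
fund = [0.7 * row[0] + 0.3 * row[1] for row in classes]
weights = style_weights(fund, classes)
print([round(w, 3) for w in weights])
```

Because the objective is the variance of the difference, a constant added to every fund return leaves the estimated weights unchanged.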

The addition of constraints reflecting the fund's actual investment policy causes a slight reduction in the fit of the
resulting equation to the data at hand (i.e., a decrease in R-squared to 92.22%). Now, however, the coefficients
conform far more closely to the reality of the fund's investment style, making the resulting characterization more likely
to provide meaningful results with out-of-sample data.

As Table 2 shows, the analysis suggests that the fund traditionally invests so as to obtain returns similar to those
achievable with a portfolio with roughly 70% invested in a market-representative portfolio of value stocks and 30% in
a market-representative portfolio of small stocks. During the period investigated, over 92% of the month-to-month
variation in the return on the fund could be explained by the concurrent variation in the return on this particular mix of
value and small stocks.

STYLE ANALYSIS

The use of quadratic programming for the purpose of determining a fund's exposures to changes in the returns of
major asset classes is termed style analysis (see Sharpe [1988]). The goal is to find the "best" set of asset class
exposures (bij values) that totals 100% and conforms with rudimentary information concerning the fund's policies
(typically, no net short positions in any asset class; for funds known to employ short positions, other bounds may be
invoked). In this context, the best such set of exposures is the one for which the variance of ei is the least.

Rearranging Equation (1):

    $e_i = R_i - [b_{i1} F_1 + b_{i2} F_2 + \dots + b_{in} F_n]$    (3)

The term on the left can be interpreted as the difference between the return on the fund (the first term on the right) and
that of a passive portfolio with the same style (shown by the sum of the terms in the brackets). The goal of style analysis is
to select the style (set of asset class exposures) that minimizes the variance of this difference. Such a difference can be
termed the fund's "tracking error" and its variance the fund's "tracking variance".

Note that the objective of such an analysis is not to minimize either the average value of this difference or the sum of
the squared differences. Thus the method is not designed to find a style that "makes the fund look bad" (or good).
Rather, the goal is to infer as much as possible about the fund's exposures to variations in the returns of the asset
classes during the period studied.
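
The distinction can be made concrete with a short sketch. It assumes that Equation (2) defines R-squared as one minus the ratio of the tracking variance to the variance of the fund's return; the numbers and the function names are hypothetical.

```python
from statistics import mean, pvariance

def tracking_errors(fund, classes, style):
    """Fund return minus the return on a passive mix with the given style."""
    return [r - sum(b * f for b, f in zip(style, row))
            for r, row in zip(fund, classes)]

def style_r_squared(fund, classes, style):
    """Assumed form of R-squared: 1 - Var(tracking error) / Var(fund return)."""
    errors = tracking_errors(fund, classes, style)
    return 1.0 - pvariance(errors) / pvariance(fund)

# Four months of made-up returns on two asset classes and one fund.
classes = [[0.02, 0.01], [-0.01, 0.03], [0.04, -0.02], [0.00, 0.01]]
style = [0.7, 0.3]
fund = [0.018, 0.000, 0.025, 0.002]

errors = tracking_errors(fund, classes, style)
r_squared = style_r_squared(fund, classes, style)

# Adding one percentage point to every fund return raises the average
# difference but leaves the tracking variance untouched.
shifted = tracking_errors([r + 0.01 for r in fund], classes, style)
print(mean(errors), mean(shifted))
print(pvariance(errors), pvariance(shifted))
```

A fund that beats its style by a constant amount each month therefore has the same estimated style as one that trails it by a constant amount.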

We have noted that a quadratic programming algorithm must be used for such an analysis. For an exact solution, one
can implement Markowitz' critical line method (see Markowitz [1987]). This study uses the simpler gradient method
described in Sharpe [1987]. Although in principle the latter produces only an approximate solution, differences
between the results obtained with the two methods prove to be of no practical importance in this application.

MUTUAL FUND STYLES

Figure 3 provides a graphical summary of the results shown in the final column of Table 2. The bar chart indicates the
estimated style of the fund, and the pie chart the associated R-squared value. In the latter the R-squared value is
identified as attributable to the fund's style and the remainder (1 - R-squared) to selection.

It is important to note that the style identified in such an analysis is, in a sense, an average of potentially changing
styles over the period covered. Month-to-month deviations of the fund's return from that of the style itself can arise from
selection of specific securities within one or more asset classes, rotation among asset classes, or both security selection
and asset class rotation. For the sake of simplicity, we use the term selection to cover all such sources of tracking
differences.

It is sometimes helpful to examine the behavior of a manager's average exposures to asset classes over time. To do so,
one can perform a series of style analyses, using a fixed number of months for each analysis through time. Figure 4
shows the results from such a study for Trustees' Commingled U.S. Fund.

The point at the far right of the diagram represents the style described when the sixty months ending in December
1989 are analyzed. This corresponds to the style shown in Figure 3. Every other point represents the results of an
analysis using a different set of sixty monthly returns (note, however, that each such set has fifty-nine months in
common with its predecessor). As Figure 4 shows, this fund's style appeared to remain quite constant throughout the
period analyzed.

Figures 5 and 6 show the results obtained when the same type of analysis was applied to the returns of Fidelity
Magellan Fund -- a highly popular open-end common stock fund. As Figure 5 shows, its style differed considerably
from that of Trustees' Commingled U.S. Fund, with emphasis on growth rather than value stocks and exposure to
medium-capitalization stocks in addition to smaller ones.

The pie chart in Figure 5 shows that the fund is considerably more diversified (and/or engaged in less rotation) than
Trustees' Commingled U.S. Fund. During the period covered, over 97.3% of the monthly variation in Magellan
returns could be attributed to the concurrent return on a passive portfolio with the style shown in the bar chart in
Figure 5.

Figure 6 suggests that the Magellan Fund progressively increased its emphasis on large growth stocks and decreased
its exposure to small capitalization stocks during the 1980s. This is not surprising, as the fund grew to approximately
$14 billion by the end of the period, making substantial investment in very small stocks increasingly difficult.

MUTUAL FUND TYPES

Figures 3 through 6 show results for two particular mutual funds. Here we provide a more representative view of the
efficacy of the procedure, with style analysis performed for each of 395 funds using returns from January 1985
through December 1989. Averages are taken for both the styles and R-squared values of all funds classified as being
of the same "type" by Jaye C. Jarrett & Company, Inc., the providers of the data used for this study. In all, seven such
types are represented. The results are shown in Figures 7 through 13.

Each figure should be interpreted as representative of the style (bar chart) and variance due to style (pie chart) of a
"typical" fund of the type. A portfolio invested in all the funds of a particular type would typically have a much higher
R-squared (percent of variance attributable to style) than is shown in the figure in question. Moreover, there is
typically considerable variation in both style and R-squared values among the funds within each type group. Given
these caveats, the analyses provide useful illustrations of some of the features of the style analysis method.

Utility Stock Fund

Figure 7 shows the results for a typical utility stock fund. Such funds (atypically) concentrate their holdings in one
industry. As a result, style accounts for an unusually small part (although still 59.3%) of the variance in return.
Although such funds hold common stocks, their returns behave more like a passive portfolio invested in both stocks
and bonds. That is, utility revenues are "sticky" because of the regulatory process, causing shares of such companies
to have features that are both stock-like and bond-like.

The utility fund example emphasizes the fact that style analysis provides measures that reflect how returns act, rather
than a simplistic concept of what the portfolios include. Note, finally, that all equity exposure is to value stocks, reflecting
the high dividend yields typical of utility shares.

Growth Equity Fund

Figure 8 portrays a typical growth equity fund. Here the most prominent exposure is, as expected, to growth stocks,
although the typical fund of this type also responds to movements in the returns of other asset classes. Note the
exposure to Bills, which probably results from the actual cash holdings that many such funds maintain to meet liquidity
needs.

Overall, the results illustrate the fact that few funds are "pure" in the sense of responding only to movements in returns
of one asset class. The style analysis procedure can detect some of the subtleties that exist in practice, instead of
classifying each fund by a single (pure) style. Finally, note that almost 90% of the monthly variation in return of the
typical growth equity fund can be attributed to its style -- a result typical of common stock funds.

Growth and Income Equity Fund

Figure 9 shows the characteristics of a typical growth and income equity fund. Here too, style accounts for
approximately 90% of the monthly variation in returns. The effects of a liquidity reserve are probably at least partly
responsible for the exposure to Bills, although choices of stocks with lower beta values than those in the asset class
indexes could also play a role.

Note the almost perfect balance between value and growth stocks, reflecting an "SP500-like" stance with respect to
large-capitalization stocks. The exposures to small and medium stocks may reflect actual investment in such stocks
and/or a preference for equal weighting rather than capitalization weighting within the large stock sector. In an
important sense, the source of a set of exposures may not even need to be identified, as long as the exposures are
representative of likely future results.

Small Stock Fund

Figure 10 indicates that small stock funds do indeed buy small stocks (as defined by the asset class used for this
study). However, they also appear to buy somewhat larger ones. Moreover, there tends to be an emphasis on growth
rather than value. This may reflect the actual purchase of large-capitalization growth stocks by some funds. It may also
indicate a preference for medium-capitalization stocks with growth characteristics. As Figure 1 suggests, a point lying
to the right of the point representing the medium stock index can be represented by a combination of the large growth
stock index, the small stock index and the medium stock index.

As before, the goal is to represent the behavior of the fund, not its precise composition. Finally, note that the R-
squared value is slightly lower (87.6%) than for the other diversified funds -- perhaps reflecting the lower liquidity of
this sector of the equity market.

Balanced Fund

Figure 11 shows that balanced funds are precisely that. While any single fund may diverge substantially from the style
shown in the figure, collectively balanced funds provide results similar to those obtained by holding all U.S. asset
classes and small amounts of foreign ones. As with other diversified funds, style accounts for roughly 90% of the
monthly variation in the returns for the typical fund of this type.

High-Quality Bond Fund

Figure 12 shows that the method works well for bond funds as well as for stock and balanced ones. The typical high-
quality bond fund provides exposure to corporate bonds, government bonds and mortgage-related securities, with
style accounting for slightly over 88% of monthly variance in return. In any given case, a mix of, e.g., intermediate
government bonds and corporate bonds might reflect actual holdings or the average quality of the corporate bond
portfolio.

Thus a portfolio with a higher average quality than that reflected in the Corporate Bond Index typically acts more like
a mixture of corporate bonds (defined by the index) and intermediate government bonds. Similarly, a portfolio of
corporate bonds with a longer duration than that of the Corporate Bond Index will "track" more closely with a mix of
corporate bonds (defined by the index) and long-term Government bonds. As always, behavior, not nomenclature, is
relevant.

Convertible Bond Fund

Figure 13 shows a case where an asset class not explicitly represented in a model can be represented well by the
classes that are included. As shown, 88.8% of the monthly variation in returns of a typical convertible bond fund can
be attributed to the concurrent variation in the returns of a mix of stocks, bills, and bonds. This is not too surprising. A
convertible bond has characteristics of both bonds and stocks.

Of course, as bond and stock markets diverge, the relative sensitivities of any given convertible bond to the two
markets will change, giving such an instrument its distinctive non-linear characteristic. Interestingly, managers of
convertible bond funds appear to have preferred habitats, causing them to buy and sell convertible bonds so as to
maintain fairly consistent exposures to asset classes of the type utilized in this study.

Fund Type Summary

As these examples show, a remarkable amount of information can be revealed from an analysis of the returns
provided by the manager of an investment fund. This is especially gratifying since in the final analysis return is the
product the investor buys from such a manager.

THE INVESTOR'S EFFECTIVE ASSET MIX

Once the styles of an investor's funds have been estimated, it is a simple matter to determine the associated overall
effective asset mix. Letting Wi represent the proportion of the investor's portfolio invested in fund i, overall portfolio
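return (Rp) will be:

    $R_p = \sum_i W_i R_i$    (4)

As both equations (1) and (4) are linear, substitution of the former in the latter will provide another linear equation:

    $R_p = \left[\sum_i W_i b_{i1}\right] F_1 + \left[\sum_i W_i b_{i2}\right] F_2 + \dots + \left[\sum_i W_i b_{in}\right] F_n + \sum_i W_i e_i$    (5)

or:

    $R_p = [b_{p1} F_1 + b_{p2} F_2 + \dots + b_{pn} F_n] + e_p$    (6)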

where the bpj values are the portfolio's exposures to the asset classes. As can be seen by comparing Equations (5) and
(6), each bpj is simply a value-weighted average of the exposures of the component funds to the asset class in
question, with the relative amounts invested in the funds used as weights.
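
The weighted average can be sketched directly. In the example below the first fund's style is the quadratic programming column of Table 2; the second fund, the 60/40 split, and the function name are hypothetical.

```python
def effective_asset_mix(weights, exposures):
    """Weighted average of fund exposures.

    weights:   {fund name: proportion of the portfolio invested in the fund}
    exposures: {fund name: {asset class: exposure}}
    Returns {asset class: portfolio exposure}.
    """
    mix = {}
    for fund, w in weights.items():
        for asset_class, b in exposures[fund].items():
            mix[asset_class] = mix.get(asset_class, 0.0) + w * b
    return mix

exposures = {
    # Style estimated for Trustees' Commingled U.S. Portfolio (Table 2).
    "Trustees' Commingled U.S.": {
        "Value Stocks": 0.6981, "Small Stocks": 0.3004, "European Stocks": 0.0015,
    },
    # A made-up bond fund, for illustration only.
    "Hypothetical Bond Fund": {
        "Bills": 0.20, "Intermediate Bonds": 0.80,
    },
}
weights = {"Trustees' Commingled U.S.": 0.60, "Hypothetical Bond Fund": 0.40}

mix = effective_asset_mix(weights, exposures)
for asset_class, b in mix.items():
    print(f"{asset_class:20s} {100 * b:6.2f}%")
```

If each fund's exposures sum to 100% and the proportions invested sum to 100%, the effective asset mix sums to 100% as well.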

The resultant effective asset mix (specified by the values bp1, bp2, ..., bpn) will account for a large portion of the
month-to-month variation in returns for the typical portfolio invested in many funds. Under the assumption that the
residual ei terms are uncorrelated, diversification across funds will greatly reduce the variance of the final (non-factor)
component and thus increase the portion of variance attributable to asset allocation. Even if some of the residuals are
correlated, the use of multiple funds will typically lead to substantial reduction in selection risk.

The effective asset mix represents the style of the investor's overall portfolio. For a multiple-managed portfolio, style is
even more important than for an individual fund.

PERFORMANCE MEASUREMENT

In a sense, a passive fund manager provides an investor with an investment style, while an active manager provides
both style and selection. These statements can be used to define the terms "active" and "passive" management. Note
that in this taxonomy, the precise implementation of an asset class factor model plays a crucial role. This suggests that
one may wish to select a set of asset classes so that only superior performance relative to a static mix of the chosen
classes warrants the higher fees usually associated with "active" as opposed to "passive" management. This is the
approach we take, focusing on a fund's selection return, defined as the difference between the fund's return and that of
a passive mix with the same style.

There are several desiderata associated with the selection of a benchmark for performance measurement. A
benchmark portfolio should be 1) a viable alternative, 2) not easily beaten, 3) low in cost, and 4) identifiable before
the fact.

Style analysis provides a natural method for constructing benchmarks meeting these requirements. The return obtained
by a fund in each month can be compared with the return on a mix of asset classes with the same estimated style,
where the style is estimated prior to the month in question. Note that this differs from the use of the ei values obtained
as byproducts of a style analysis, since the latter are in-sample, not out-of-sample values.

To illustrate the approach for Trustees' Commingled U.S. and Fidelity's Magellan Funds, for each month t:

1. The fund's style is estimated, using returns from months t-60 through t-1 (see note 6).

2. The return on the resultant style is calculated for month t.

3. The difference between the fund's return in month t and that of the style benchmark determined in
steps 1) and 2) is computed. This difference is defined as the fund's Selection Return for month t.
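
The three steps can be sketched as a loop over months. The style estimator is passed in as a function so that any fitting procedure can be used; the fixed 70/30 estimator and the return series below are hypothetical stand-ins, not data from the study.

```python
from itertools import accumulate

def selection_returns(fund, classes, estimate_style, window=60):
    """Out-of-sample selection return for each month t >= window.

    fund:           list of monthly fund returns
    classes:        list of monthly rows of asset class returns
    estimate_style: function(past_fund, past_classes) -> list of exposures
    """
    results = []
    for t in range(window, len(fund)):
        # Step 1: estimate the style from months t-window through t-1 only.
        style = estimate_style(fund[t - window:t], classes[t - window:t])
        # Step 2: return on that style in month t.
        benchmark = sum(b * f for b, f in zip(style, classes[t]))
        # Step 3: the fund's return minus the benchmark's return.
        results.append(fund[t] - benchmark)
    return results

def fixed_style(past_fund, past_classes):
    """Placeholder estimator that ignores the data (illustration only)."""
    return [0.7, 0.3]

# Sixty-two months of made-up returns on two asset classes; the fund earns
# its style return plus 0.1% every month.
classes = [[0.01 * ((m % 5) - 2), 0.02 * ((m % 3) - 1)] for m in range(62)]
fund = [0.7 * v + 0.3 * s + 0.001 for v, s in classes]

monthly = selection_returns(fund, classes, fixed_style)
cumulative = list(accumulate(monthly))  # summed, not compounded
print(monthly, cumulative)
```

Month t's own return never enters the estimate used for month t, which is what makes the comparison out-of-sample.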

Figure 14 shows the cumulative sum of the monthly selection returns from January 1986 through December 1989 for
Trustees' Commingled U.S. Fund (see note 7). In such a graph, increases result from positive selection returns and decreases from
negative ones.

Figure 14 summary statistics (selection return):
Average   -0.06% per month
Std Dev    1.69% per month
T(Avg)    -0.25

The table below the graph in Figure 14 summarizes the results in a different manner. On average, the fund
underperformed its style benchmarks by 0.06% (6 basis points) per month, with a standard deviation of 1.69% (169
basis points) per month. The t-statistic associated with the mean difference was, however, small in absolute value,
suggesting that the average difference was not statistically significantly different from zero (see note 8).
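
The t-statistic can be checked from the reported average and standard deviation. The sketch below assumes 48 monthly observations (January 1986 through December 1989), the count that reproduces the values reported with Figures 14 through 17; the function name is hypothetical.

```python
from math import sqrt

def t_statistic(average, std_dev, n_months):
    """Average difference divided by the standard error of the mean."""
    return average / (std_dev / sqrt(n_months))

N_MONTHS = 48  # assumption: January 1986 through December 1989

# (average, standard deviation), both in percent per month, as reported.
reported = {
    "Trustees' vs. style benchmark": (-0.06, 1.69),
    "Trustees' vs. S&P 500": (-0.20, 2.13),
    "Magellan vs. S&P 500": (0.18, 1.48),
    "Magellan vs. style benchmark": (0.57, 1.05),
}

t_values = {name: round(t_statistic(avg, sd, N_MONTHS), 2)
            for name, (avg, sd) in reported.items()}
for name, t in t_values.items():
    print(f"{name:32s} t = {t:5.2f}")
```

Only the last comparison yields a t-statistic large enough to be considered statistically significant by conventional standards.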

Figure 15 emphasizes the advantages to be gained by analyzing performance the way we have described. It compares
the return on Trustees' Commingled U.S. Fund with that of Standard & Poor's 500 stock index (commonly used to
evaluate mutual fund performance). The fund's performance, so measured, was over three times as bad as that shown
previously: the cumulative difference was -10% and the average difference -20 basis points per month.

But such a comparison includes results due to both style and selection. During the period in question the fund's style
underperformed that of the S&P500 (primarily because of its exposure to small stocks). Indeed, this accounts for
approximately two-thirds of the fund's underperformance relative to the S&P500.

An investor choosing Trustees' Commingled U.S. Fund could and should have known that its style favored value
stocks and small stocks. The choice to expose some of the portfolio to these asset classes should be attributed to the
investor. Results (good or bad) associated with the choice of a style should be attributed to the investor, not to the
manager of a fund following that style.

Figure 15 summary statistics (difference from S&P 500 return):
Average   -0.20% per month
Std Dev    2.13% per month
T(Avg)    -0.65

Figures 16 and 17 show the results of similar analyses for Fidelity Magellan Fund. As Figure 16 shows, the fund
provided a positive but statistically insignificant outperformance when compared with the S&P500 over the period.

But Figure 17 shows that such a comparison masked Magellan's truly outstanding selection performance. During this
period, the fund outperformed its style benchmarks by a cumulative amount of over 25%. Outperformance averaged
57 basis points per month with a standard deviation of 105 basis points. The t-statistic of 3.76 shows that such
differences were highly significant statistically. Two aspects account for the large t-value: the relatively large average
return difference and the relatively small variation in the difference from month to month (see note 9).

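Figure 16 summary statistics (difference from S&P 500 return):
Average    0.18% per month
Std Dev    1.48% per month
T(Avg)     0.84

Figure 17 summary statistics (selection return):
Average    0.57% per month
Std Dev    1.05% per month
T(Avg)     3.76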

MUTUAL FUND PERFORMANCE

Fidelity Magellan's performance from 1985 through 1989 is far from typical. While only out-of-sample results can
provide a definitive test of the collective performance of mutual funds, the average ei values obtained as by-products
from fund style analyses can provide at least some evidence on the matter.

Figure 18 shows the distribution of the average tracking errors obtained from the style analyses of 636 stock, bond
and balanced funds. Each value is the average ei value obtained from a style analysis using returns for one fund
covering the period from January 1985 through December 1989. Note that the distribution is roughly normal, with a
mean of -0.074% per month (-7.4 basis points). This is roughly consistent with the hypothesis that the average mutual
fund cannot "beat the market" before costs, because such funds constitute a large (and presumably representative) part
of the market. Annualized, the mean underperformance is approximately 0.89% per year -- an amount that, if
anything, may be slightly less than the non-transaction costs incurred by a typical mutual fund.

Figure 18 summary: Average = -0.074% per month

MEASURING AN INVESTOR'S PERFORMANCE

In the paradigm utilized in this article, an investor makes decisions that result in an effective asset mix and a set of
selection returns. In a sense, the investor selects a set of (passive or active) managers and a specific allocation of funds
among such managers. Given the managers' styles, this determines the investor's effective asset mix.

The procedures described earlier can be applied directly to measure the efficacy with which the investor performs his
or her functions. The performance of each month's effective asset mix can be compared with that of a predetermined
benchmark asset mix to assess the value added or lost due to asset allocation decisions (advertent or inadvertent). The
remainder of the investor's return is attributable to the joint effects of 1) the fund managers' selection returns, and 2)
the investor's allocation of money among the managers. The investor selection return (Sp) is simply:

    $S_p = \sum_i W_i S_i$

where the Si values are determined out-of-sample, using procedures such as those described earlier.

CONCLUSION

An asset class factor model can help make order out of the chaos that often attends the investment process. It can
provide a consistent view of investment decisions investors make to economize on information flows and exploit
comparative advantages. The style analysis procedure described in this article allows such a model to be implemented
economically. At the very least it can serve as a valuable supplement to other methods designed to help investors
achieve their goals in cost-effective ways.

*This article is adapted from the O'Neil Abbot Distinguished Lecture given at the Darden School of the University of
Virginia on October 2, 1990. The author thanks Robert O'Neil, Mark Eaker and the faculty of the Darden School for
making the presentation possible. He also thanks Mark Friebel, Sharon Kitajima, Diana Lieberman, Anita Nanda, and
Kathryn Sharpe, colleagues at William F. Sharpe Associates for their contributions.

1. When an equation such as (1) is fit by regression analysis, the same value will be obtained whether equation (2) is
used or the proportion of variance explained is computed directly. This follows from the fact that the residual returns
obtained from a regression analysis are guaranteed to be uncorrelated with each of the asset class returns. When
quadratic programming is employed, it is possible that the residual returns will be correlated with the returns on one or
more asset classes; hence the two procedures for computing R-squared may yield slightly different results. In practice
such differences are typically insignificant. In the remainder of the paper equation (2) is utilized for all calculations.

2. If regression analysis is used to fit equation (1), conventional tests of statistical significance can also be invoked to
evaluate the likely performance of the model out-of-sample. When quadratic programming is employed, the
assumptions that lie behind such tests are violated, making true out-of-sample tests the only reliable means of
evaluating the efficacy of the approach.

3. More precisely, BARRA's all-U.S. universe.

4. That is, a security already placed in one class is not permitted to migrate to the other class unless it passes the
capitalization requirement by 20% of the boundary value.

5. More generally, the convex hull of the points.

6. For a more operational procedure, returns for a period ending with month t-2 might be used.

7. No compounding is employed in Figure 14 (and the subsequent ones using similar portrayals). This makes it
possible to compare vertical distances directly at any point in the figure. Because selection return represents the
difference between the returns of two portfolios, compounding would not provide the difference in the cumulative
values of the fund and the benchmark. This being said, it should be pointed out that since the logarithm of a monthly
value relative will generally differ little from the corresponding monthly return, a graph in which compounded
selection returns are shown on a logarithmic scale appears similar to one in which returns are summed and shown on
an arithmetic scale, as in this article.

8. The t-statistic is computed by dividing the average return difference by the standard error of the mean (here, the
standard deviation of the return difference divided by the square root of 48, the number of monthly differences).

9. The t-statistic is closely related to the reward-to-variability ratio (sometimes termed the Sharpe Ratio) for the
"active" component of a fund's return, which is simply the mean value divided by the standard deviation.

REFERENCES

Markowitz, Harry M. Mean-Variance Analysis in Portfolio Choice and Capital Markets. Oxford: Basil Blackwell,
Inc. 1987.

Sharpe, William F. "An Algorithm for Portfolio Improvement," in Kenneth D. Lawrence, John B. Guerard, Jr. and
Gary D. Reeves, eds. Advances in Mathematical Programming and Financial Planning, Vol. 1, pp. 155-170.
Greenwich: JAI Press, Inc., 1987.

---------. "Determining a Fund's Effective Asset Mix," Investment Management Review, December 1988, pp. 59-69.

10/24/12 Sharpe Interview on Style Analysis

Setting the Record Straight on Style Analysis


A Newsmaker Interview by Barry Vinocur
Reprinted with permission from Dow-Jones Fee Advisor

Few subjects are more timely—or controversial—than mutual fund style analysis. And no person figures
more centrally in any discussion of style analysis than Stanford University's William F. Sharpe.

Though best known for his work on the capital asset pricing model—for which he along with Harry
Markowitz and Merton Miller were awarded the Nobel Prize in Economics in 1990—Sharpe is also
credited with developing mutual fund style analysis. His 1988 paper, "Determining a Fund's Effective
Asset Mix" is considered must reading for anyone with even a passing interest in style analysis.

Unlike some of his technique's critics, however, Sharpe refuses to make style analysis into a "holy war."
As he told us recently, style analysis generally, and returns-based style analysis (the technique Sharpe
uses), specifically, is a very powerful and useful tool. However, Sharpe stresses he wouldn't suggest
investing in a fund based solely on style analysis. "Before I'd invest in a fund I would want to have read
the Morningstar material and whatever else I could find, including the prospectus."

What role should "style analysis" play in the selection (and monitoring) of investment managers? How
should you use style analysis? What are the technique's strengths and limitations? For the answers to
those, as well as a long list of other questions, we again went straight to the source (see Investment
Advisor, October 1994, page 83) and spoke with Sharpe.

FA: Most discussions of "style analysis" begin with someone mentioning the Brinson, Singer and
Beebower study. As I recall, they analyzed the performance of 82 large, multiasset U.S. pension
fund portfolios from 1977 to 1987. Simply stated, they found asset allocation decisions accounted
for 91.5% of the portfolios' performance.
www.stanford.edu/~wfsharpe/art/fa/fa.htm 1/6

Sharpe: What that study said is that if you put enough securities (in this case mutual funds or investment
managers) together, you're going to get something that's responsive to basic market factors. In other
words, there isn't a lot that's going to impact the portfolio but what's going on in those basic factors.

In my practice with pension funds, using returns-based style analysis for every single manager in a very
mechanistic way, asset allocation plays an even more central role than it did in the Brinson and
Beebower study. Brinson's and Beebower's factors were just really stocks, bonds and cash. But when I
break it down and bring in value and growth stocks, asset allocation accounts for 98% or more of the
return. Those are really profound numbers. Of course, every now and then there may be one that's 97%.

FA: A lot of people use the term factor analysis interchangeably with style analysis. You make a
distinction between the two.

Sharpe: Factor analysis is a bad term because factor analysis is a particular way of estimating a factor
model. And it's a way that most people don't use in this domain. Factor analysis is just one way of
estimating a factor model and it isn't the method I use.

So, let's replace factor analysis with a factor model. What is a factor model? A factor model says the
fund's return is related— typically linearly—to the return on this factor and the return on that factor and
the return on the other factor, etc. And there are a lot of those models. The kind we're talking about here
— and again everybody is talking about the same thing up to this point—is what I call an asset class
factor model, where the factors are the returns on asset classes.

FA: Sounds pretty tame so far. Why all the fireworks?

Sharpe: All of these methods are methods of estimating or writing numbers down for a factor model.
In such a procedure you have to do two things: first, figure out what the factors
are, and then figure out how sensitive each fund is to each of the factors. And we're all
doing that. Where you start parting company is the way you do that.

As far as picking the asset classes, that's an art form and we all do it. Some of us use the same asset
classes but different methods. Where the methods part company is once you have picked the asset
classes figuring out how sensitive a fund is to moves in each of those asset classes.

One method, which I call just plain style analysis but which people are calling, to differentiate it,
returns-based style analysis, looks at the way the fund's returns moved in the past with the returns of the asset
classes and uses that along with some minimal prior information—such as the fund didn't hold any short
positions—to estimate the fund's effective asset mix or style.

All the style is is exposure. If I say your style is 60% growth and 40% value that means you'll move 0.6
times whatever happens to growth stocks plus 0.4 times whatever happens to value stocks. Reduced to
its bare essentials that's returns-based style analysis.
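
As a small illustration of the arithmetic Sharpe describes, the sketch below computes the return implied by a 60% growth / 40% value style. The function name and the sample monthly returns are hypothetical, chosen only for illustration; they do not come from the interview.

```python
def style_implied_return(weights, class_returns):
    """Return implied by a style: the weighted sum of asset-class returns."""
    return sum(weights[name] * class_returns[name] for name in weights)

# Style from the interview: 60% growth, 40% value.
style = {"growth": 0.6, "value": 0.4}

# Hypothetical asset-class returns for one month, in percent.
month = {"growth": 2.0, "value": -1.0}

implied = style_implied_return(style, month)
print(round(implied, 2))  # 0.6 * 2.0 + 0.4 * (-1.0)
```

Whatever the fund returns beyond this implied figure in a given month is the part that its style does not explain.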

FA: The view from the opposing camp would be?

Sharpe: Everybody has to have some kind of a model. The major part of the opposing camp says it's
not a good idea to look at the bottom line numbers the way returns-based style analysis does. That camp
says you should look under the hood and see what's in the portfolio.

That's fine. But if you do that you have to have a way of estimating the exposure of everything in the
portfolio to the asset classes so you can add them up. So you're not absolved from having to estimate a
factor model. You now have to estimate a factor model for every single security so that you can
aggregate and get the portfolio exposure.

You can do that the way BARRA does with very complex factor models, in which each security can
be exposed to many factors, or you can do it much more simplistically by assuming that each security
is exposed to one and only one factor. So, its sensitivity to growth is 1.0 and its exposure to value is zero
and everything else is zero. That's a very simple kind of model at the security level and, of course,
makes it easy to add up.

There are two problems with that approach. One is that it's very hard to estimate at the security level if
you're going to deal with any subtleties because there's so much noise in what happens to a particular
security. Whereas with a portfolio a lot of that noise is canceled out and you get a better view of the
aggregate.

FA: And the other problem...?

Sharpe: The other issue is that if you limit yourself to sort of a zero vs. one view of the world (each
security is in one and only one asset class) you may just throw out a lot of reality. As a practical matter
most of the security-by-security models do not cut across asset classes in the large. So, for example, you
typically don't capture the sensitivity of a utility stock to interest rates because you say it's a stock. Or
you use a stock model and the stock model doesn't have interest rates in it.

FA: What about the argument that returns-based style analysis is always backward looking?

Sharpe: That's another issue. One which shouldn't cause you to choose between one method or the
other. However you perform style analysis—that is whether you pursue returns-based style analysis or
what I'd call portfolio-based or composition-based, style analysis, you have to decide whether you want
to know what the sensitivity of the portfolio is as it exists this very day to the various asset classes. Or,
do you want to have some notion of what it was on average over some historic period.

For some questions you can answer one way and for other questions you can answer the other way. If
what you're trying to do is benchmark a manager for the next five years presumably you want some sort
of an estimate of where he'll be on average over the next five years and that may be better reflected in
the five year historic average of where he was than where he is this very day. For example, he may have
rotated opportunistically and may very likely rotate out tomorrow. On the other hand, if he has made a
once and for all shift—he's found religion and he's decided that value stocks are it and he'll never own a
growth stock again even though he used to—then, obviously, you're going to want yesterday's portfolio.

FA: Is one method "better" than the other?

Sharpe: If you look at the portfolio's composition and you look at it every month for some poor style
rotator you're going to track him really well. But that's not the way you want to benchmark him.
Because a benchmark shouldn't rely on information that the manager gives you. Rather the benchmark
should reflect how you would do if you didn't have the manager.

So you have to ask what am I doing this for? By and large what you're doing it for is performance
analysis. You're doing it to figure out what benchmark you want to set. You're doing it after the fact to
see if the benchmark you set was a reasonable benchmark. So, decide what question you're asking.
Then, you can figure out how you want to approach it and what measure you want to use.

FA: Do you use both methods?

Sharpe: No, because I don't believe in simple taxonomy: a stock is a stock is a stock. I think that's too
crude. And I don't have detail-rich models (security-by-security data). So, if I wanted to use a
composition-based method, I couldn't. But I don't feel any need to use it for what I do.

FA: Are there circumstances in which people should prefer composition-based style analysis?

Sharpe: It's only as good as the security model that you're using. You have to have a model at the
security level if you're going to use that. Often times, people have a model that's so implicit they don't
realize it is a model.

The Morningstar model is you're in this box, that box or another box. But that's a model. It's a model in
which each security gets assigned to one of nine boxes—if you include bonds and stocks they actually
have 18 boxes. Each security gets assigned a one for one box as exposure and a zero for the other 17.
That's a model. It's a factor model. And they use it appropriately, I'm sure. Is it a good model? Is there a
better model? That's an empirical issue.

FA: It seems a limitation of that approach is that it doesn't leave room for shades of gray. A stock
is either large cap growth or large cap value, or whatever.

Sharpe: That's right. Plus, you can really get fooled if you follow that approach. Let's take a fund that I
know well because I'm on the board of directors: Smith Breeden Market Tracking Fund. On the
surface, it looks as if the fund has a lot of bonds. But there's this "funny" little swap in there. "But, hey,
it's only worth three percent. And when the fund bought it, it was worth nothing." You might conclude
it's a bond fund. That wouldn't be correct, however.

FA: Morningstar's Don Phillips says he isn't worried about you, but about the popularizers of
the technique who refer to "style analysis" with such catchy phrases as x-raying a portfolio.

Sharpe: There are powerful commercial forces at work here. And almost everybody who has a
commercial interest in one or the other technique is going to overstate the merits of his technique and
underestimate the merits of the opposing technique. In a world of infinite resources, I'd say do it all. You
cannot help but be better off with more information rather than less. But there's a huge disparity in the
cost. Composition-based analysis is just a lot more costly.

What often gets lost, however, is that the composition-based, or portfolio-based, methods are only as
good as their security models. Every security has to be assigned a sensitivity to the asset class. It's
basically a noise issue. So much impacts the price of a security that it's very hard to tell whether a
security is reacting to the growth stocks or the value stocks or something else.

However, if you put 20 securities in a portfolio that noise tends to average out and you get a much
clearer view of the portfolio's sensitivity to the factors.

FA: Which technique is better as an early-warning sign of a shift in style by a portfolio manager?

Sharpe: Returns-based analysis using historic returns clearly is going to take a while to pick up major
changes. There's no question about that. So, if there's a major change, you'll see it a lot quicker if you're
looking at the portfolio.

The next issue is whether it's a permanent change. To determine that today, you'll have to talk to that
portfolio manager. But in cases in which you're in touch with the manager and you have the resources
and it's important enough, yes look at the portfolio. But even then, I might well prefer rather than using a
very crude securities model to take that portfolio—and I've done that a couple of times—and compute
the portfolio's historic returns as it now exists. That may give you a better view of what that portfolio
really is.

FA: When you do returns-based style analysis you use monthly returns. Is there a case to be
made for looking at returns daily or weekly?

Sharpe: The problem with daily returns is you're getting more and more noise. There's a problem here
with noise and the more noise you get the poorer your estimates are. I use monthly data because it's
what's most readily available and it works so well with most of the kinds of managers and funds that I
look at. But I have talked to people who have had good luck with weekly data.

FA: Critics of returns-based analysis love to tell stories about how Bill Sharpe analyzes a
portfolio and says it contains this or that and then when you lift the hood, lo and behold, there
aren't any bonds, or whatever.

Sharpe: What you're talking about here is predominantly risk. It's not so much a matter of how the
manager did on average over the seven years but how he did in months when there was a disparity in
returns.

If you have a security that says on it stock right up there at the top in nice Gothic lettering but it seems to
go down whenever bonds go down, there's good reason to suspect there's something about the
economics of the company or the way the instrument is written that makes it sensitive to interest rates.
And if you happen to not want any more sensitivity to interest rates in terms of the risk you're taking,
you better not buy that security.

FA: What are the limitations of returns-based style analysis?

Sharpe: If you run into people with very concentrated portfolios, it's very difficult to figure out what the
core is. Say it's a sector fund, which won't work very well at all except utilities which happen to be
fairly homogeneous and the analysis picks up the interest-rate sensitivity. But, if you have a chemical
fund or something of that sort, you're going to get so much noise that if something good happens to that
sector in a month when U.S. stocks are flat and Japanese stocks soar, the analysis is going to say you
have a portfolio with some Japanese stocks in it. You can get that kind of spurious behavior. But, in
practice, it's amazing how little of it you see.

FA: How do you see "style analysis" changing the role of financial advisors?

Sharpe: It has already had a pretty big impact on traditional institutional consulting. Traditional
consultants charge a lot of money for saying: "I know those guys down at this or that firm, they
manage bonds. And they manage long bonds and they do it well." Well, you don't have to pay a
consultant $250,000 per year to do that for you.

What we're talking about is figuring out which investment products — mutual funds and such — will
implement a particularly good asset allocation for the client.

To do that, the financial advisor has to talk to the client and educate the client — understand the client
— to figure out what asset allocation makes sense, including changes through time. Then the advisor
has to implement it.

What this technique does is give you a very efficient tool to help you in that process. It's not the only
tool. And if you have other tools, you should use them. But this is a very efficient tool for performance
analysis and reporting. So, I see this technique as increasing the efficiency and lowering the cost of an
important part of what a financial advisor does. But it's by no means the only thing an advisor does.

FA: What role should "style analysis" play in choosing managers or retaining managers?

Sharpe: I would never suggest investing in a mutual fund based solely on style analysis. Before I'd
invest in a fund I would want to have read the Morningstar material and whatever else I could find,
including the prospectus.

I would never say that style analysis is enough. Maybe it's enough to help you pick some funds that you
want to look at more carefully. But once you look at style analysis and you decide that a fund is
interesting to you, then you go further.

FA: How should financial advisors make use of this technique?

Sharpe: What I tell my students is that you work, to begin with, on the asset allocation. Then you have
to figure out a suite of investment products that give you that allocation and then if you're so inclined
will hopefully add some value through active management. To do that you have to know which product
gives you exposure to which asset classes and how much. You have to estimate it as best you can. The
simple way of saying "hey he's a value manager or a growth manager" is a little bit too crude. Even if
he's a value manager maybe he's a value manager with cash or without cash. You need to know that.
You need to have some sort of system that adds it all up.

FA: Product vendors like to mention your name, though I have to say that no one I have run into
comes right out and says: "Sharpe endorses our product." But you have already acknowledged
that there are powerful commercial forces at work in this arena. So, whose product do you use?

Sharpe: I have been very careful. I have spoken at the BARRA conference, the Ibbotson conference
and the Zephyr conference. I'm an equal opportunity speaker and those are the three major vendors at
this point as I see it. I have no financial stake in any of them. They all give me software and/or databases
for my research for which I am very grateful. For all of my regular work, however, I use various pieces
of my own software.

FA: Some folks, including Morningstar's Don Phillips, say they're concerned that "style analysis"
may be misused.

Sharpe: It's amazing how people can misuse even the simplest tool. On the other hand, that shouldn't be
taken as an argument against using a technique that has a great deal of value.

FA: Thanks, Bill.

Dow-Jones Fee Advisor


Editorial Director and Publisher
BARRY VINOCUR
(908) 389-8700 ext. 114
CompuServe 75054,1777
Internet: bvinocur@ix.netcom.com
America Online: BVinocur

10/24/12 Macro‑Investment Analysis

Equilibrium
Equilibrium -- preliminary

www.stanford.edu/~wfsharpe/mia/eq/mia_eq0.htm 1/1
10/24/12 Equilibrium (preliminary)

Equilibrium
(preliminary)

Note: This page provides formulas and some implications associated with mean-variance
equilibrium. It is highly preliminary and may contain errors. In time, any errors will be corrected
and the explanations expanded, especially those concerning the economics of the results and their
practical implications.

The Optimality Condition with No Bounds


As before, we use the three-asset example for simplicity.

correlations (cc):

cash bonds stocks


cash 1.00 0.40 0.15
bonds 0.40 1.00 0.35
stocks 0.15 0.35 1.00

standard deviations (sd):

sd
cash 1.00
bonds 7.40
stocks 15.40

covariances (C):

cash bonds stocks


cash 1.000 2.960 2.310
bonds 2.960 54.760 39.886
stocks 2.310 39.886 237.160
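
The covariance matrix follows from the correlations and standard deviations listed above, since each covariance is the correlation times the two standard deviations. The sketch below is a plain-Python illustration (not the author's software) that should reproduce the table; the variable names mirror the labels cc, sd and C used in the text.

```python
names = ["cash", "bonds", "stocks"]

# Correlations (cc) and standard deviations (sd) as listed in the text.
cc = [[1.00, 0.40, 0.15],
      [0.40, 1.00, 0.35],
      [0.15, 0.35, 1.00]]
sd = [1.00, 7.40, 15.40]

# Covariance of assets i and j = correlation(i, j) * sd(i) * sd(j).
n = len(sd)
C = [[cc[i][j] * sd[i] * sd[j] for j in range(n)] for i in range(n)]

for name, row in zip(names, C):
    print(name, [round(v, 3) for v in row])
```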

expected returns (e):

e
cash 2.80
bonds 6.30
stocks 10.80

Assume that portfolio x is optimal for risk tolerance t. Moreover, assume that each position in x is within
any relevant bounds (that is, that no bounds are binding). For example, here the optimal portfolio (x) for
a risk tolerance (t) of 25 is:

portfolio (x):

x
cash 0.0671
bonds 0.6021
stocks 0.3308

www.stanford.edu/~wfsharpe/mia/eq/mia_eq1.htm 1/13

We know from the conditions for optimality that for this portfolio, the marginal utility of each asset must
be the same. Let z represent this common marginal utility. For reasons that will become clear, we
express utility and marginal utility in variance-equivalent terms:

u = t*ep - vp

mu(i) = t* dep/dx(i) - dvp/dx(i)

We know that:
mu(i) = t*e(i) - 2*C(i,:)*x

But optimality requires that this equal z. Hence for each asset i:
t*e(i) - 2*C(i,:)*x = z

Moreover, for portfolio x to be a portfolio, the sum of its components must equal 1:
sum(x) = 1

or:
x'*ones(n,1) = 1

where n is the number of assets.

We can combine all these conditions in one matrix equation, which will prove useful in a number of
contexts. First, we write all n of the first order conditions as:

t*e - 2*C*x = z*ones(n,1)

and re-arrange to give:

2*C*x + ones(n,1)*z = t*e

To this we will add the full investment constraint:

x'*ones(n,1) = 1

Define an {(n+1)*1} vector that includes the portfolio composition (x(1), x(2), ... x(n)) and z:

xx = [x ; z]

This allows us to "stack" all the conditions to give:

CC*xx = kk + t*ee

where:

CC = [ 2*C ones(n,1) ; ones(1,n) 0]

kk = [zeros(n,1); 1]

ee = [e ; 0]


In our case:

CC:
2.0000 5.9200 4.6200 1.0000
5.9200 109.5200 79.7720 1.0000
4.6200 79.7720 474.3200 1.0000
1.0000 1.0000 1.0000 0

kk:

0
0
0
1

ee:

2.80
6.30
10.80
0

Finding an optimal portfolio with no bounds


It is straightforward to find an optimal portfolio if no upper and lower bounds are present or if any such
bounds can be assumed to not be binding. Recall the conditions for optimality:
CC*xx = kk + t*ee

Multiplying both sides by the inverse of CC gives:


inv(CC)*CC*xx = inv(CC)*kk + t*inv(CC)*ee

or:
xx = inv(CC)*kk + t*inv(CC)*ee

Now, let:

zp = inv(CC)*kk

and
rtswap = inv(CC)*ee

Then:
xx = zp + t*rtswap

In our example:
zp:

zp
cash 1.0392
bonds -0.0396
stocks 0.0004
z -1.8458

rtswap:

rtswap
cash -0.0389
bonds 0.0257
stocks 0.0132
z 2.6648

Note that the asset positions in zp sum to 1.0. This must be the case since the final element in kk is 1.0
and the final row in CC has ones in every asset position. The effect is to require that the sum of the
resulting holdings equals 1.0. Thus zp will always represent a true portfolio -- that is, holdings that sum
to 1.0.

In contrast, note that the asset positions in rtswap sum to 0.0. This must be the case since the final
element in ee is 0.0. Since the final row in CC contains ones in every asset position, the effect is to
require that the sum of the resulting holdings equals 0.0. Thus rtswap will always represent a zero-
investment strategy, or swap.

Consider now the optimal portfolio for an investor with zero tolerance for risk:
xx = zp + 0*rtswap

Thus zp is the optimal portfolio for an investor with no tolerance for risk. It must therefore be the
minimum-variance portfolio.

Now consider the investor with a risk tolerance of 25. The optimum portfolio is:

xx = zp + 25*rtswap

or:
             xx         zp      25*rtswap
cash       0.0671  =   1.0392  +  -0.9721
bonds      0.6021  =  -0.0396  +   0.6418
stocks     0.3308  =   0.0004  +   0.3303
z         64.7731  =  -1.8458  +  66.6189

In effect, the investor takes 25 units of the rtswap. This, added to the initial portfolio, gives net holdings
with relatively little cash, a substantial bond position and a smaller but still significant stock position.
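
The stacked system can be checked numerically. The following is a minimal sketch in plain Python, not the author's own software; the helper `solve` is a hypothetical name for a small Gaussian-elimination routine used here in place of the explicit inverse inv(CC). Under those assumptions it should reproduce zp, rtswap and the portfolio for t = 25 to within rounding.

```python
def solve(A, b):
    """Solve A*y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * y[c] for c in range(r + 1, n))
        y[r] = (M[r][n] - s) / M[r][r]
    return y

C = [[1.000, 2.960, 2.310],
     [2.960, 54.760, 39.886],
     [2.310, 39.886, 237.160]]
e = [2.80, 6.30, 10.80]
n = len(e)

# CC = [2*C ones(n,1); ones(1,n) 0], kk = [zeros(n,1); 1], ee = [e; 0]
CC = [[2 * C[i][j] for j in range(n)] + [1.0] for i in range(n)]
CC.append([1.0] * n + [0.0])
kk = [0.0] * n + [1.0]
ee = e + [0.0]

zp = solve(CC, kk)       # minimum-variance portfolio, followed by its z
rtswap = solve(CC, ee)   # zero-investment swap, followed by its z

t = 25
xx = [zp[i] + t * rtswap[i] for i in range(n + 1)]
print([round(v, 4) for v in zp])
print([round(v, 4) for v in rtswap])
print([round(v, 4) for v in xx])
```

Solving the linear system directly gives the same vectors as multiplying by inv(CC) and is usually the more stable choice numerically.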

Two-fund separation

Returning to the equation for an optimal portfolio:


xx = zp + t*rtswap

It is clear that every investor can be accommodated if there are two instruments in the world: a portfolio
(zp) and a swap (rtswap). Moreover, the magnitude of the latter that an investor should take is
proportional to his or her risk tolerance. Optimal holdings are thus linear functions of risk tolerance.

In fact, all investors can be accommodated by any two optimal portfolios. For example, consider xxa
and xxb, which are optimal for risk tolerances of ta and tb, respectively:


xxa = zp + ta*rtswap

xxb = zp + tb*rtswap

Now, consider an investor with risk tolerance t. Assume that she invests pa in portfolio xxa and pb (=1-
pa) in portfolio xxb. The resulting portfolio xx will be:
xx = pa*xxa + (1-pa)*xxb = zp + (pa*ta + (1-pa)*tb) * rtswap

This will in fact be optimal if:


t = pa*ta + (1-pa)*tb

or:
pa = (t - tb) / (ta - tb)

As long as short positions are allowed (as is assumed here), any investor can choose any mix of
portfolios xxa and xxb and be assured of optimality. Moreover, this will be the case as long as xxa and
xxb are optimal for some levels of risk tolerance (i.e. are on the efficient frontier).

This relationship is termed the two-fund separation theorem since it allows the choice problem to be
separated into (1) the selection of two efficient funds and (2) the selection of the appropriate
combination of such funds for each investor.
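
A short numerical check of this result, offered as a sketch only: it uses the rounded asset positions of zp and rtswap reported above, takes the funds for risk tolerances 10 and 25 as the two efficient funds, and considers a hypothetical investor with t = 20. The function name `optimal` is an illustrative choice.

```python
# Rounded values reported above, in the order [cash, bonds, stocks].
zp = [1.0392, -0.0396, 0.0004]
rtswap = [-0.0389, 0.0257, 0.0132]

def optimal(t):
    """Optimal asset holdings for risk tolerance t: zp + t*rtswap."""
    return [z + t * s for z, s in zip(zp, rtswap)]

ta, tb = 10, 25            # risk tolerances of the two efficient funds
xa, xb = optimal(ta), optimal(tb)

t = 20                     # hypothetical investor
pa = (t - tb) / (ta - tb)  # proportion invested in fund a
mix = [pa * a + (1 - pa) * b for a, b in zip(xa, xb)]

print(round(pa, 4))
print([round(v, 4) for v in mix])
print([round(v, 4) for v in optimal(t)])
```

The mix and the directly computed optimum agree. A risk tolerance outside the range spanned by the two funds, such as t = 40, gives pa = -1, that is, a short position in fund a.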

Expected Return and Beta Values


The first-order conditions for the optimality of portfolio x for an investor with risk tolerance t require
that the following hold for every asset i:
t*e(i) - 2*C(i,:)*x = z

The product C(i,:)*x in the second term is the covariance of asset i with portfolio x (Cix). We may thus write:
t*e(i) - 2*Cix = z

Re-arranging:
e(i) = z/t + 2*Cix/t

Note that this means that the values of e(i) will plot on a straight line in a graph in which e(i) is plotted
on the vertical axis and Cix is plotted on the horizontal axis. Moreover, the vertical intercept will equal
z/t and the slope will equal 2/t (and the latter will be positive, so the line will be upward-sloping). Thus
the higher the covariance with the optimal portfolio, the higher will be the expected return.

Since this is true for every asset i, it must also be true for every combination of assets, including portfolio
x. Thus:

ex = z/t + 2*Cxx/t

But Cxx is the variance of x (Vx), so:


ex = z/t + 2*Vx/t

We may combine this with the equation for a single asset to obtain:


( e(i) - z/t ) / ( ex - z/t) = ( 2*Cix/t ) / ( 2*Vx/t)

Simplifying gives:
e(i) = z/t + ( ex - z/t ) * (Cix / Vx)

From regression theory, we can interpret Cix/Vx (the scaled covariance of asset i with the optimal
portfolio x) as the beta value of i relative to x. Thus:
e(i) = z/t + ( ex - z/t) * Bix

This implies that there is a linear relationship between asset expected returns and their beta values with
respect to the optimal portfolio. Note that this is a direct implication of the assumption that x is an
optimal portfolio (and the lack of binding upper or lower bound constraints).
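
The linear relationship can be verified with the numbers from the three-asset example. The sketch below assumes the rounded optimal portfolio and the value of z reported earlier for t = 25, so the implied expected returns match the inputs only to within rounding.

```python
names = ["cash", "bonds", "stocks"]
C = [[1.000, 2.960, 2.310],
     [2.960, 54.760, 39.886],
     [2.310, 39.886, 237.160]]
e = [2.80, 6.30, 10.80]
x = [0.0671, 0.6021, 0.3308]   # optimal portfolio for t = 25 (rounded)
t, z = 25, 64.7731             # z as reported for that portfolio

n = len(x)
Cix = [sum(C[i][j] * x[j] for j in range(n)) for i in range(n)]  # cov(i, x)
Vx = sum(x[i] * Cix[i] for i in range(n))                        # variance of x
ex = sum(x[i] * e[i] for i in range(n))                          # expected return of x
beta = [c / Vx for c in Cix]                                     # Bix = Cix / Vx

# e(i) = z/t + (ex - z/t) * Bix
intercept = z / t
implied = [intercept + (ex - intercept) * b for b in beta]

for name, b, imp, given in zip(names, beta, implied, e):
    print(name, round(b, 4), round(imp, 2), given)
```

The portfolio's own beta, the holdings-weighted average of the asset betas, equals 1 by construction.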

The Market Portfolio and the Security Market Line


Now consider a world in which all investors solve the problem formulated here but differ in both risk
tolerance and wealth. Let there be m investors, with w(j) the relative wealth of investor j. Let ww be the
{m*1} vector of these relative wealths (summing to 1.0). Similarly, let t(j) be investor j's risk tolerance
and tt the {m*1} vector of all such risk tolerances. For our example let there be 2 investors (Dick and
Jane) with:

tt:
tt
Dick 10
Jane 25

ww:

ww
Dick 0.30
Jane 0.70

We may represent all the optimal portfolios in an {(n+1)*m} matrix XX in which the j'th column contains
the optimal portfolio (and the associated value of z) for investor j. Here:
XX = zp*ones(1,m) + rtswap*tt'

Dick Jane
cash 0.6504 0.0671
bonds 0.2171 0.6021
stocks 0.1326 0.3308
z 24.8018 64.7731

Now consider xxm, the wealth-weighted average of the investors' portfolios:

xxm = XX*ww

This can be termed the market portfolio, since it contains all assets, in market-weighted proportions.
Here:

xxm
Cash 0.2421
Bonds 0.4866
Stocks 0.2713
z 52.7817

Note that the last two equations can be combined to give:


xxm = zp*ones(1,m)*ww + rtswap*tt'*ww

Rearranging:
xxm = zp*[ones(1,m)*ww] + rtswap*[tt'*ww]

The first expression in square brackets equals 1.0. And the second expression is simply a wealth-
weighted average of the investors' risk tolerances. Define the latter as:

tm = tt'*ww

Then:
xxm = zp + tm*rtswap

This indicates that the market portfolio is optimal for an investor with a risk tolerance equal to the
wealth-weighted average investor risk tolerance (perhaps better termed the "societal risk tolerance").
Given this optimality it immediately follows that:
e(i) = z/tm + (em - z/tm) * Bim

where z is the value of z for an investor with risk tolerance tm.

If there is a riskless asset, z/tm will equal its return. We thus obtain the formula for the original CAPM:
e(i) = rf + (em - rf) * Bim

In the absence of a riskless asset, z/tm can be interpreted as the expected return on a zero-beta asset.
Note that the slope of the resulting Security Market Line will equal the market risk premium and will
generally be positive.

It is useful to compare an Investor's optimal strategy with that of an Investor who is average in every
respect (here, t = tm). We know that the latter will hold the market portfolio. Thus:
xxm = zp + tm*rtswap

xx = zp + t*rtswap

xx - xxm = (t - tm) * rtswap

or:

xx = xxm + (t - tm) * rtswap

Thus an Investor's optimal portfolio will differ from the market portfolio by rtswap times the difference
between the Investor's risk tolerance and that of the market (society). We can think of rtswap as the "tilt"
away from the market per unit of difference between the investor's risk tolerance and that of the market.
In this view, the two vectors that provide two-fund separation are the market portfolio and the swap
given by vector rtswap.
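
The aggregation and the tilt can both be illustrated with the Dick and Jane example. This is a sketch using the rounded asset positions of zp and rtswap reported earlier, so the market portfolio it prints agrees with the table above only to about three decimals; the names `optimal`, `xm` and `tilted` are illustrative.

```python
# Rounded values reported earlier, in the order [cash, bonds, stocks].
zp = [1.0392, -0.0396, 0.0004]
rtswap = [-0.0389, 0.0257, 0.0132]

tt = {"Dick": 10, "Jane": 25}      # risk tolerances
ww = {"Dick": 0.30, "Jane": 0.70}  # relative wealths

def optimal(t):
    """Optimal asset holdings for risk tolerance t: zp + t*rtswap."""
    return [z + t * s for z, s in zip(zp, rtswap)]

XX = {name: optimal(t) for name, t in tt.items()}

# Market portfolio: wealth-weighted average of the investors' holdings.
xm = [sum(ww[name] * XX[name][i] for name in tt) for i in range(3)]

# Societal risk tolerance: wealth-weighted average risk tolerance.
tm = sum(ww[name] * tt[name] for name in tt)

# Each investor holds the market portfolio plus (t - tm) units of rtswap.
tilted = {name: [xm[i] + (tt[name] - tm) * rtswap[i] for i in range(3)]
          for name in tt}

print(round(tm, 4))
print([round(v, 4) for v in xm])
print({name: [round(v, 4) for v in tilted[name]] for name in tt})
```

Dick, whose risk tolerance is below the market's, takes a negative amount of the swap and so holds more cash than the market; Jane does the opposite.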

Reverse Optimization

Recall the conditions for optimality of a portfolio x for an investor with risk tolerance t if no bounds are
relevant. For each asset i:

t*e(i) - 2*C(i,:)*x = z

Now, assume that the values of all covariances are known and that x can be assumed to be optimal for
some investor, but that the risk tolerance of that investor is not known. Moreover, assume that the
expected returns of two of the assets have been predicted, but the others have not. In our example, assume
that the expected real return on Cash (asset 1) is 3.0% and the expected real return on Stocks (asset 3) is
8.0%. Then:

t*e(1) - 2*C(1,:)*x = z

t*e(3) - 2*C(3,:)*x = z

where t and z are unknown variables and all other values are specified. This is a system of two linear
equations with two unknowns and hence can be solved simply. For example:

t*e(1) - 2*C(1,:)*x = t*e(3) - 2*C(3,:)*x

or:

t = ( 2*(C(1,:) - C(3,:))*x ) / ( e(1) - e(3))

In this case:

t=

40.0038

Substituting in the first of the two equations:

z = t*e(1) - 2*C(1,:)*x

Here

z=

114.7844

We now know the values of t and z for which portfolio x is optimal. But from the conditions of
optimality we also know what all the remaining asset expected returns must be. Recall that:

t*e(i) - 2*C(i,:)*x = z

or:

e(i) = ( z + 2*C(i,:)*x ) / t

In this case there is only one remaining expected return, that of Bonds (asset 2). Its expected return will
be:

e(2) = ( z + 2*C(2,:)*x ) / t

Here:

e(2) =

5.1873


We are thus able to infer the vector of expected returns from the assumptions that (1) covariances are
known, (2) an efficient portfolio is known, and (3) two expected returns are known. This process is
sometimes termed reverse optimization. Given the difficulty associated with predicting expected returns,
it can be a powerful tool to ensure that optimizations give sensible answers. If, for example, the market
portfolio is used in this procedure, the results will be those implied by the capital asset pricing model.

The reverse optimization worksheet uses this approach. Inputs may be pasted in directly from the output
of the weighted covariance worksheet. The output of the reverse optimization worksheet may, in turn,
be pasted directly into the optimization worksheet.
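
The reverse optimization steps can be sketched in a few lines of Python. The sketch assumes that x is the portfolio found earlier for a risk tolerance of 25, an assumption that appears consistent with the values of t, z and e(2) reported above; it is an illustration, not the worksheet itself.

```python
C = [[1.000, 2.960, 2.310],
     [2.960, 54.760, 39.886],
     [2.310, 39.886, 237.160]]
x = [0.0671, 0.6021, 0.3308]   # portfolio assumed to be optimal (rounded)

e_cash, e_stocks = 3.0, 8.0    # the two predicted expected returns

def cov_with_x(i):
    """Covariance of asset i with portfolio x: C(i,:)*x."""
    return sum(C[i][j] * x[j] for j in range(len(x)))

# t = ( 2*(C(1,:) - C(3,:))*x ) / ( e(1) - e(3) )
t = 2 * (cov_with_x(0) - cov_with_x(2)) / (e_cash - e_stocks)

# z = t*e(1) - 2*C(1,:)*x
z = t * e_cash - 2 * cov_with_x(0)

# e(2) = ( z + 2*C(2,:)*x ) / t
e_bonds = (z + 2 * cov_with_x(1)) / t

print(round(t, 4), round(z, 4), round(e_bonds, 4))
```

Because the same t and z must satisfy the condition for every asset, substituting them back into the stocks equation recovers the 8.0% input.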

Additional Sources of Utility


Thus far we have assumed that Investor utility functions included only expected return and variance,
that is:
u = ep - vp / t

But at least some Investors may care about other attributes of a portfolio. Of particular interest are
attributes for which the portfolio attribute is a linear combination of the attributes of individual assets.
For our example, assume that the liquidity of each asset can be specified and that a portfolio's liquidity is
a value-weighted average of the liquidities of the individual assets. For example, assume that the asset
expected returns are:
e
Cash 3.00
Bonds 5.00
Stocks 8.00

and the liquidity measures (denoted a for "attribute") are:


a
Cash 1.00
Bonds 0.50
Stocks 0.20

Then the liquidity of portfolio x will be:


ap = x'*a

As with e, we can make a new vector with these attributes plus a zero:

aa = [a ; 0]

An Investor's utility will be assumed to depend on three characteristics of the portfolio. In particular, let
at be an Investor's liquidity preference -- his or her marginal rate of substitution of variance for liquidity.
The more an Investor prefers to have liquidity, the greater the amount of added variance he or she will
accept to obtain added liquidity. Thus for an Investor unconcerned with liquidity, at=0. For one very
concerned with liquidity, at will be large, and so on.

We can now write the Investor's utility function as:

u = t*ep + at*ap - vp

The first-order conditions for portfolio optimality then become:


mu(i) = t*e(i) + at*a(i) - 2*C(i,:)*x = z (for all i)

Combining all these first-order conditions gives:

t*e + at*a - 2*C*x = z*ones(n,1)

Re-arranging:

2*C*x + ones(n,1)*z = t*e + at*a

Finally, adding the full investment constraint gives:

CC*xx = kk + at*aa + t*ee

where:

CC = [ 2*C ones(n,1) ; ones(1,n) 0]

kk = [zeros(n,1); 1]

aa = [a; 0]

ee = [e ; 0]

To find the optimal portfolio for an Investor with liquidity preference at and risk tolerance t, we can
simply solve:

xx = inv(CC)*kk + at*inv(CC)*aa + t*inv(CC)*ee

or:

xx = zp + at*atswap + t*rtswap

where zp and rtswap are as defined earlier, and:

atswap = inv(CC)*aa

Here:
atswap
Cash 0.0053
Bonds -0.0043
Stocks -0.0011
z 1.0195

We call this vector a swap because the asset positions must sum to zero due to the zero in the last
position of aa. Moreover, the amount of adjustment in the overall portfolio should equal this vector times
the Investor's liquidity preference. Hence the name atswap (here - liquidity preference swap). Note that
the more an Investor prefers liquidity, the larger the cash position and the smaller the bond and stock
positions.
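
The decomposition xx = zp + at*atswap + t*rtswap can be sketched in a few lines. The expected returns and liquidity measures below are those given above, but the covariance matrix is a hypothetical stand-in, so the output will not match the tabulated atswap values. The helper names solve and optimal are introduced only for this example.

```python
# Sketch of the three-component solution xx = zp + at*atswap + t*rtswap.
# The covariance matrix is a hypothetical stand-in, not the one behind
# the tables in the text.

def solve(A, b):
    """Solve A*y = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [v - f * p for v, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

C = [[1.0, 0.5, 0.5],        # assumed covariances: Cash, Bonds, Stocks
     [0.5, 64.0, 30.0],
     [0.5, 30.0, 225.0]]
e = [3.00, 5.00, 8.00]       # expected returns from the text
a = [1.00, 0.50, 0.20]       # liquidity measures from the text
n = len(e)

# CC = [2*C ones(n,1); ones(1,n) 0]
CC = [[2 * C[i][j] for j in range(n)] + [1.0] for i in range(n)]
CC.append([1.0] * n + [0.0])

zp = solve(CC, [0.0] * n + [1.0])      # inv(CC)*kk
atswap = solve(CC, a + [0.0])          # inv(CC)*aa
rtswap = solve(CC, e + [0.0])          # inv(CC)*ee

def optimal(at, t):
    """Asset weights plus z for liquidity preference at and risk tolerance t."""
    return [p + at * s + t * r for p, s, r in zip(zp, atswap, rtswap)]

print([round(v, 4) for v in optimal(0.20, 50.0)[:n]])
```

Whatever covariances are supplied, the asset positions in atswap and rtswap sum to zero and those in zp sum to one, which is why the first two are called swaps.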

In this case, zp is now the optimal portfolio for an Investor with zero risk tolerance and zero liquidity
preference. Let att be the vector of Investor liquidity preferences. Here:
at
Dick 0.00
Jane 0.20

By summing over all Investors as before we have:


xm = zp + atm*atswap + tm*rtswap

where atm is the weighted average liquidity preference:


atm = ww'*att

Here:
atm =

0.1400

In this case the optimal portfolios for the investors are:

XX = zp*ones(1,m) + atswap*att' + rtswap*tt'

or:
Dick Jane
Cash 0.6504 0.0682
Bonds 0.2171 0.6013
Stocks 0.1326 0.3305
z 24.8018 64.9770

and:

xm = XX*ww

or:

xm
Cash 0.2428
Bonds 0.4860
Stocks 0.2711
z 52.9244
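
The aggregation xm = XX*ww can be checked against the tabulated portfolios. The wealth shares ww are not restated in this passage; the values 0.3 (Dick) and 0.7 (Jane) used below are an assumption, chosen because they are the shares implied by atm = 0.14 with liquidity preferences of 0.00 and 0.20.

```python
# Check xm = XX*ww using the tabulated optimal portfolios.
# ww is an assumption: shares of 0.3 (Dick) and 0.7 (Jane) are the values
# implied by atm = ww'*att = 0.14 with att = [0.00, 0.20].

XX = {"Cash":   (0.6504, 0.0682),   # (Dick, Jane)
      "Bonds":  (0.2171, 0.6013),
      "Stocks": (0.1326, 0.3305)}
ww = (0.3, 0.7)
att = (0.00, 0.20)

atm = sum(w * a for w, a in zip(ww, att))
xm = {asset: sum(w * x for w, x in zip(ww, row)) for asset, row in XX.items()}

print(round(atm, 4))
for asset, weight in xm.items():
    print(asset, round(weight, 4))
```

Up to rounding in the fourth decimal place, the weighted averages agree with the market portfolio shown above.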

As before, it is useful to compare an Investor's optimal strategy with that of an Investor who is average
in every respect (here, t = tm and at = atm). We know that the latter will hold the market portfolio. Thus:
xm = zp + atm*atswap + tm*rtswap

x = zp + at*atswap + t*rtswap

x - xm = (at-atm)*atswap + (t-tm)*rtswap

or:

x = xm + (at-atm)*atswap + (t-tm)*rtswap

Thus an Investor's optimal portfolio will differ from the market portfolio by atswap times the difference
between the Investor's liquidity preference and that of the market (society), plus rtswap times the
difference between the Investor's risk tolerance and that of the market (society). As before, we can think
of rtswap as the "tilt" away from the market per unit of difference between the Investor's risk tolerance
and that of the market. And here, atswap is the "tilt" away from the market per unit of difference
between the Investor's liquidity preference and that of the market.

Note that in this case, three funds (and/or swaps) will be required to span the space of optimal portfolios
(thus -- three-fund separation). One alternative would be to employ these three vectors: the market
portfolio, the swap given by atswap, and the swap given by rtswap.


Liability Hedging
Finally, consider Investors who are concerned not with asset risk and return but with the risk and return
of the difference between asset return and that of some liability (for example, pension obligations). For
example, let the Investor's utility function depend on the expected surplus and standard deviation of
surplus, where surplus is defined as:

s = ra - h*rl

where ra is the return on assets, rl is the return on liabilities, and h is a constant.

The expected surplus es will be:

es = ea - h*erl

where ea is the expected return on the assets and erl is the expected return on the liabilities.

The variance of the surplus vs will be:

vs = va - 2*h*cal + (h^2)*vl

where va is the variance of the asset return, cal is the covariance of the asset and liability returns, and vl
is the variance of the liability return.

In variance-equivalent terms, the utility function can be written as:

u = t*es - vs

or:

u = t*(ea - h*erl) - (va - 2*h*cal + (h^2)*vl)

But some of these terms will not be affected by the decision variables in the optimization (that is, the
allocation of the fund among the assets). In particular, erl and vl are characteristics of the liability return
distribution that cannot be changed via asset allocation. Dropping the terms in erl and vl from the
expanded utility function, we can simplify the problem to one of maximizing:

u = t*ea - va + 2*h*cal

The first two terms in this objective function are precisely the same as in the general asset allocation
problem. Only the third differs.

Consider the covariance of the asset returns with the liability. This will be a weighted average of the
covariances of the individual assets with the liability, with the relative portfolio values as weights:

cal = cl'*x

where cl is an {n*1} vector of the covariances of the assets with the liability. But note that this makes the
third term a linear function of the portfolio weights. Re-writing:

u = t*ea + [2*h*cl']*x - va

But this is equivalent to the general formulation with a linear attribute, so all the conclusions reached
earlier can be applied to this case. In particular, (1) one additional fund or swap will be needed to
provide all possible optimal portfolios, (2) the market portfolio will be optimal for an Investor with the
market average desire to hedge (here, h), and (3) a given Investor should tilt away from the market
portfolio based, among other things, on the difference between his or her desire to hedge and that of the
weighted average Investor.
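
The reduction of the surplus objective to a problem with a linear attribute can be checked numerically. The sketch below expands u = t*(ea - h*erl) - (va - 2*h*cal + (h^2)*vl) and compares it with the version that keeps only the terms depending on x. Every number, and the function names full_utility and reduced_utility, are assumptions made for the illustration.

```python
# Surplus-variance sketch; every number here is a hypothetical assumption.

C = [[1.0, 0.5, 0.5],        # asset covariances: Cash, Bonds, Stocks
     [0.5, 64.0, 30.0],
     [0.5, 30.0, 225.0]]
e = [3.0, 5.0, 8.0]          # asset expected returns
cl = [0.2, 50.0, 40.0]       # covariances of each asset with the liability
erl, vl = 5.0, 81.0          # liability expected return and variance
h, t = 1.0, 50.0             # hedge constant and risk tolerance

def full_utility(x):
    """u = t*es - vs, with es = ea - h*erl and vs = va - 2*h*cal + h^2*vl."""
    n = len(x)
    ea = sum(e[i] * x[i] for i in range(n))
    va = sum(x[i] * C[i][j] * x[j] for i in range(n) for j in range(n))
    cal = sum(cl[i] * x[i] for i in range(n))
    return t * (ea - h * erl) - (va - 2 * h * cal + h ** 2 * vl)

def reduced_utility(x):
    """Same objective with the terms that do not depend on x dropped."""
    n = len(x)
    ea = sum(e[i] * x[i] for i in range(n))
    va = sum(x[i] * C[i][j] * x[j] for i in range(n) for j in range(n))
    linear = sum(2 * h * cl[i] * x[i] for i in range(n))
    return t * ea - va + linear

x1 = [0.2, 0.5, 0.3]
x2 = [0.1, 0.3, 0.6]
# The two objectives differ by the same constant for every portfolio.
gap1 = full_utility(x1) - reduced_utility(x1)
gap2 = full_utility(x2) - reduced_utility(x2)
print(round(gap1, 6), round(gap2, 6))
```

Because the gap is the same for any portfolio, ranking portfolios by the reduced objective gives the same choice as ranking them by the full surplus utility.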


Performance Measurement
Mutual Fund Performance Measurement
The Sharpe Ratio
Morningstar's Risk-adjusted Ratings

www.stanford.edu/~wfsharpe/mia/pm/mia_pm.htm 1/1
10/24/12 Mutual Fund Performance Measures

Mutual Fund Performance Measures, Factor Models, and Fund Style and Selection

William F. Sharpe

www-sharpe.stanford.edu

www-leland.stanford.edu/~wfsharpe

Mutual Fund Performance Measures

Use statistics from:

historic frequency distribution

many periods

Example: combination of mean and standard deviation for past 36 months

To predict statistics for:

future probability distribution

one period

Example: combination of mean and standard deviation for next month

Decisions

One Fund

One Fund plus borrowing or lending

www.stanford.edu/~wfsharpe/art/mfpm/mfpm.htm 1/28

One fund from a given asset class or category

A portfolio of potentially many funds

Portfolio Theory

Hierarchic Taxonomic Procedures


Statistics: M

Ex Ante:

Expected Return
Expected geometric return
etc.

Ex Post:

Arithmetic average return


Geometric average return
Compounded total return over period
etc.

Statistics: S

Ex Ante:

Standard Deviation of Return


Variance of Return
Expected loss
etc.

Ex Post:

Standard deviation of return


Variance of Return
Average loss
etc.

Performance Measures

Return

Utility-based

M-k*S

Scale-independent

M/S

Variables

Total Return

Fund Return

Excess Return

Fund Return - Return on a risk-free instrument

Differential Return

Fund Return - Return on an appropriate benchmark portfolio

Absolute and Relative Measures

Absolute

Use statistics as computed for all funds

Relative

Each fund assigned to a peer group


Performance of funds ranked within each peer group
Comparisons based on:
Differences
Ratios
Rankings
Stars
5 stars: top 10%
4 stars: next 22.5%
etc.

Frequently-used Measures

Relative

(columns: Total Return, Excess Return, Differential Return)

Return: Lipper
Utility-based: Morningstar (form)
Scale-independent: Micropal; Morningstar (subst.)

Absolute

(columns: Total Return, Excess Return, Differential Return)

Return: selection mean (alpha)
Utility-based:
Scale-independent: Sharpe ratio; selection Sharpe ratio

Scale-independent Measures

Variable = Return on A minus return on B

Strategy requires zero investment

long position in A
short position in B

Change in value can be doubled by doubling sizes of positions

For scale k:

Mk = k* M1
SDk = k* SD1
Mk / SDk = M1 / SD1

Therefore, ratio is scale-independent
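
The scaling argument can be illustrated with a short calculation. The differential returns below are hypothetical, and mean_sd_ratio is a name introduced only for this sketch.

```python
import statistics

# Hypothetical monthly differential returns (percent) for a long-A, short-B position.
d = [1.2, -0.8, 0.5, 2.1, -1.4, 0.9, 0.3, -0.2]

def mean_sd_ratio(returns, k=1.0):
    """Return (Mk, SDk, Mk/SDk) for positions scaled by k."""
    scaled = [k * r for r in returns]
    m = statistics.mean(scaled)
    sd = statistics.stdev(scaled)
    return m, sd, m / sd

m1, sd1, ratio1 = mean_sd_ratio(d, k=1.0)
m2, sd2, ratio2 = mean_sd_ratio(d, k=2.0)
print(m2 / m1, sd2 / sd1, ratio2 / ratio1)
```

Doubling the positions doubles both the mean and the standard deviation, so the ratio is unchanged.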

Scale-independent Measures with Positive Expected Returns


Scale-independent Measures with Negative Average Returns

Inappropriateness of Total Return M/S Measures


Morningstar Peer Groups

Peer Groups

Asset classes
Categories

Asset Classes

Domestic equity
International equity
Taxable bond
Municipal bond

Domestic equity categories

Diversified (9)
Specialty (9)
Hybrid
Convertible


Morningstar Diversified Equity Categories

Based on portfolio composition

price/earnings, price/book
market capitalization

Averaged over past three years

Style Boxes

Large Value     Large Blend     Large Growth
Medium Value    Medium Blend    Medium Growth
Small Value     Small Blend     Small Growth

Morningstar Ratings

Stars:

Rank within asset class (e.g. equity)


3-year, 5-year, 10-year and weighted average of 3, 5, and 10 year
Net of load charges

Category Ratings:

Rank within asset category (e.g. Large Growth equity)


3-year
Load charges not taken into account

Percentages:

Stars:         1 (worst)    2        3      4        5 (best)
Percentage:    10%          22.5%    35%    22.5%    10%


Morningstar Statistics, 3-year Ratings

Compounded return on fund - compounded return on Treasury bills

Loss

if fund return > Treasury bill return, loss = 0

if fund return < Treasury bill return, loss = - (fund return - bill return)

Average Monthly Loss

sum ( monthly loss ) / 36
takes all 36 months into account
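
A minimal sketch of the loss calculation outlined above follows. The return series are hypothetical and shorter than 36 months, and the function names are assumptions; months with no loss still count in the divisor.

```python
# Average monthly loss as outlined above; the return series are hypothetical.

def monthly_loss(fund_return, bill_return):
    """Loss is zero when the fund beats bills, else the shortfall (a positive number)."""
    return max(0.0, bill_return - fund_return)

def average_monthly_loss(fund_returns, bill_returns):
    """Sum of monthly losses divided by the number of months, zero-loss months included."""
    losses = [monthly_loss(f, b) for f, b in zip(fund_returns, bill_returns)]
    return sum(losses) / len(losses)

fund = [2.0, -1.5, 0.3, 0.9, -0.4, 1.8]   # percent per month
bills = [0.4, 0.4, 0.4, 0.4, 0.4, 0.4]    # percent per month

aml = average_monthly_loss(fund, bills)
print(round(aml, 4))
```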

Average Monthly Loss versus Standard Deviation of Monthly Returns, Morningstar Diversified Equity Funds, 1994-1996

Average Monthly Loss versus function of Monthly Mean and Std. Deviation, Morningstar Diversified Equity Funds, 1994-1996

Morningstar Risk-adjusted Rating

RARf = Mf / Mbar - Sf / Sbar

Mbar:

if mean ( Mf ) >= compound return on Treasury bills: mean ( Mf )

if mean ( Mf ) < compound return on Treasury bills: compound return on Treasury bills

Sbar:

mean ( AMLf )

Morningstar Risk-adjusted Ratings as Utility-based Measures

RARf = Mf / Mbar - Sf / Sbar
     = ( 1/Mbar ) * [ Mf - ( Mbar / Sbar ) * Sf ]

Rankings unaffected by initial constant ( 1/Mbar )

Rankings depend on:

Mf - k * Sf
where:
k = Mbar / Sbar
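
The equivalence of the two rankings can be checked with made-up numbers. In the sketch below the fund names and the (Mf, Sf) pairs are hypothetical, and the average excess return is assumed to be positive and above the Treasury bill return, so the divisor for Mf is simply the mean of Mf.

```python
# Hypothetical (Mf, Sf) pairs: excess return and average monthly loss per fund.
funds = {"A": (9.0, 1.2), "B": (6.0, 0.5), "C": (12.0, 2.4), "D": (4.0, 0.9)}

m_bar = sum(m for m, _ in funds.values()) / len(funds)   # assumed >= T-bill return
s_bar = sum(s for _, s in funds.values()) / len(funds)
k = m_bar / s_bar

rar = {f: m / m_bar - s / s_bar for f, (m, s) in funds.items()}
utility = {f: m - k * s for f, (m, s) in funds.items()}

rank_by_rar = sorted(funds, key=rar.get, reverse=True)
rank_by_utility = sorted(funds, key=utility.get, reverse=True)
print(rank_by_rar, rank_by_utility)
```

Since the utility-based measure is just the risk-adjusted rating multiplied by a positive constant, both orderings coincide.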


A bi-linear VnM Utility Function with threshold = 4% and utility ratio = 2.5

Optimal Leverage when Utility = Return - k*Risk

Optimal Leverage when Utility = Return - k*Risk^2

Indifference Curves and Iso-M/S lines: k = Mbar / Sbar

Indifference Curves and Iso-M/S lines: k > Mbar / Sbar

Sharpe Ratio Ranks versus Category Rankings, Morningstar Diversified Equity Funds, 1994-1996

Three-year Star Ratings and Mean-variance combinations, Morningstar Diversified Equity Funds, 1994-1996

An Asset Class Factor Model

R~f = [ b1f*F~1 + b2f*F~2 + ... + bnf*F~n ] + e~f

R~f : Fund return
F~1, ..., F~n : Asset class returns
b1f, ..., bnf : Fund asset class exposures (style); sum = 1
[ ... ] : Fund style return
e~f : Fund selection return; e~f for fund i uncorrelated with e~f for fund j

Benchmark Portfolios and Asset Exposures

R~f = [ b1f*F~1 + b2f*F~2 + ... + bnf*F~n ] + e~f

R~f : Fund return
F~1, ..., F~n : Asset class returns
b1f, ..., bnf : Benchmark portfolio composition
[ ... ] : Benchmark portfolio return
e~f : Fund differential return

Methods for Selecting a Benchmark

                  Historic Average    Current                  Projected
Composition       MStar Category      MStar Style
Regression        Actual Returns      Retrospective Returns
Style Analysis    Actual Returns      Retrospective Returns
Projection                                                     FER Proposal


Taxonomic Factor Models

All conditions for a general asset class factor model hold

plus

For any given fund f

One bif = 1

All other bif's = 0

Fund expected return = asset class expected return + fund alpha

Fund Variance = asset class variance + fund selection variance

Overall Portfolio Return

R~p = [ b1p*F~1 + b2p*F~2 + ... + bnp*F~n ] + e~p
where:
bjp = X1*bj1 + X2*bj2 + ... + Xm*bjm
e~p = X1*e~1 + X2*e~2 + ... + Xm*e~m

[ ... ] = (style) return on assets ( R~A )

e~p = selection return

Selection Return Statistics

Ex post

mean ( e~f ) : Average selection return ( alpha )
stddev ( e~f ) : Selection return variability

Ex ante

expected ( e~f ) : Expected selection return ( alpha )
stddev ( e~f ) : Selection return risk

Factor-model Based Analysis


Factor-model Based Analysis: Optimization Inputs

Asset Classes

Expected Returns
Standard Deviations
Correlations

Funds

Styles ( Benchmark portfolios)


Expected selection returns (alphas)
Selection risks

Investor

Risk tolerance: t
other constraints, assets, liabilities, etc

Optimization with Unlimited Short Positions in Assets

Creating a hedge fund


Long: fund
Short: fund's benchmark asset mix

Zero investment required

Return is scale-independent

Asset allocation unaffected by scale of investment

Select Xi to maximize:

Xi * expected ( ei ) - ( Xi^2 * Var ( ei ) ) / t

Optimal Position in a Fund with Unlimited Short Positions in Assets

Xi = [ expected ( ei ) / Var ( ei ) ] * ( t / 2 )

Amount of risk taken:

Xi * stdev ( ei )

= [ expected ( ei ) / stdev ( ei ) ] * ( t / 2 )

= [ selection Sharpe ratio ] * ( t / 2 )

Relative values independent of investor preferences
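
The position-sizing rule can be sketched as follows. The two funds and their selection statistics are hypothetical, and optimal_position is a name introduced for the example. The point is that the ratio of the risks taken in the two funds equals the ratio of their selection Sharpe ratios, whatever the investor's risk tolerance.

```python
import math

def optimal_position(alpha, selection_var, t):
    """Xi maximizing Xi*alpha - Xi^2*selection_var/t, i.e. (alpha/var)*(t/2)."""
    return (alpha / selection_var) * (t / 2.0)

# Hypothetical funds: (expected selection return, selection variance).
funds = {"F1": (1.0, 16.0), "F2": (0.6, 4.0)}

ratios = {}
for t in (25.0, 50.0):
    risk = {}
    for name, (alpha, var) in funds.items():
        x = optimal_position(alpha, var, t)
        risk[name] = x * math.sqrt(var)   # equals selection Sharpe ratio * t/2
    ratios[t] = risk["F1"] / risk["F2"]
    print(t, risk, ratios[t])
```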

Choosing a Fund for an Asset Class Position with a Taxonomic Factor Model

Assume asset allocation is fixed

Then:

Ep = EA + X1 * expected ( e1 ) + ... + Xm * expected ( em )

Vp = VA + X1^2 * variance ( e1 ) + ... + Xm^2 * variance ( em )

Utility:

[ EA - VA / t ] +

[ X1 * expected ( e1 ) - X1^2 * variance ( e1 ) / t ] +

... +

[ Xm * expected ( em ) - Xm^2 * variance ( em ) / t ]

The Optimal Fund for an Asset Class with a Taxonomic Factor Model

Xj is a given constant

From the funds in the asset class, select the fund for which

[ Xj * expected ( ef ) - Xj^2 * variance ( ef ) / t ] is the largest

Equivalently, select the fund with the largest value of:

expected ( ef ) - ( Xj / t ) * variance ( ef )

A utility-based differential return measure with k a function of:

the amount to be invested in the asset class ( Xj )

the investor's risk tolerance (t)

The Optimal Fund for a Small Portion of a Portfolio

The preferred fund for an investment of Xj in asset class j has maximum:

z = expected ( ef ) - ( Xj / t ) * variance ( ef )

If Xj is small:

( Xj / t ) * variance ( ef ) is small

z is approximately equal to expected ( ef ) = alpha

Hence best fund is the one with the largest alpha relative to
an appropriate benchmark
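
The selection rule can be illustrated with two hypothetical funds. In the sketch below the alphas, selection variances, allocations, and risk tolerance are all assumed values, and fund_score and best_fund are names introduced for the example.

```python
def fund_score(alpha, selection_var, xj, t):
    """z = expected(ef) - (Xj/t)*variance(ef) for a fund in asset class j."""
    return alpha - (xj / t) * selection_var

def best_fund(funds, xj, t):
    """Name of the fund with the largest z among candidates in the class."""
    return max(funds, key=lambda name: fund_score(*funds[name], xj, t))

# Hypothetical candidates: (alpha, selection variance).
funds = {"HighAlpha": (2.0, 36.0), "LowRisk": (1.2, 4.0)}
t = 20.0

small = best_fund(funds, xj=0.05, t=t)   # small allocation to the class
large = best_fund(funds, xj=0.80, t=t)   # large allocation to the class
print(small, large)
```

With a small allocation the variance penalty is negligible and the fund with the larger alpha wins; with a large allocation the penalty can reverse the choice.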

Correlations of Percentiles within Categories

                          SR       Cat.     Star     Alpha    SSR
Sharpe Ratio              1.000    0.986    0.945    0.831    0.744
Category Rating           0.986    1.000    0.957    0.829    0.735
Star Rating               0.945    0.957    1.000    0.790    0.694
Selection Mean (Alpha)    0.831    0.829    0.790    1.000    0.940
Selection Sharpe Ratio    0.744    0.735    0.694    0.940    1.000

Style Analysis Alpha Ranks versus Category Rankings, Morningstar Diversified Equity Funds, 1994-1996

Style Analysis Selection Sharpe Ratios versus Category Rankings, Morningstar Diversified Equity Funds, 1994-1996

Conclusions (1)

Hierarchic taxonomic approaches will generally be suboptimal

lower-level characteristics not taken into account when making decisions

asset category characteristics not taken into account when allocating among asset classes

fund characteristics not taken into account when allocating among asset classes and categories

No universal single measure can provide a sufficient statistic for choosing

one fund in each category, or

multiple funds in each category



Conclusions (2)

Need good estimates of:

future asset exposures

appropriate benchmark portfolio (fund style)

future fund selection risk

future fund selection expected return

This information should be combined optimally with estimates of

future asset risks, expected returns and correlations

investor risk tolerance and other characteristics

All useful predictors of future performance should be taken into account

include fund expense ratios, turnover, etc.

10/24/12 The Sharpe Ratio

The Sharpe Ratio


William F. Sharpe
Stanford University
Reprinted from The Journal of Portfolio Management, Fall 1994
This copyrighted material has been reprinted with permission from The Journal of Portfolio Management.
Copyright © Institutional Investor, Inc., 488 Madison Avenue, New York, N.Y. 10022,
a Capital Cities/ABC, Inc. Company. Phone (212) 224-3599.

Over 25 years ago, in Sharpe [1966], I introduced a measure for the performance of mutual funds and
proposed the term reward-to-variability ratio to describe it (the measure is also described in Sharpe
[1975]). While the measure has gained considerable popularity, the name has not. Other authors have
termed the original version the Sharpe Index (Radcliff [1990, p. 286] and Haugen [1993, p. 315]), the
Sharpe Measure (Bodie, Kane and Marcus [1993, p. 804], Elton and Gruber [1991, p. 652], and Reilly
[1989, p. 803]), or the Sharpe Ratio (Morningstar [1993, p. 24]). Generalized versions have also
appeared under various names (see, for example, BARRA [1992, p. 21] and Capaul, Rowley and
Sharpe [1993, p. 33]).

Bowing to increasingly common usage, this article refers to both the original measure and more
generalized versions as the Sharpe Ratio. My goal here is to go well beyond the discussion of the
original measure in Sharpe [1966] and Sharpe [1975], providing more generality and covering a broader
range of applications.

THE RATIO
Most performance measures are computed using historic data but justified on the basis of predicted
relationships. Practical implementations use ex post results while theoretical discussions focus on ex ante
values. Implicitly or explicitly, it is assumed that historic results have at least some predictive ability.

For some applications, it suffices for future values of a measure to be related monotonically to past
values -- that is, if fund X had a higher historic measure than fund Y, it is assumed that it will have a
higher future measure. For other applications the relationship must be proportional -- that is, it is
assumed that the future measure will equal some constant (typically less than 1.0) times the historic
measure.

To avoid ambiguity, we define here both ex ante and ex post versions of the Sharpe Ratio, beginning
with the former. With the exception of this section, however, we focus on the use of the ratio for making
decisions, and hence are concerned with the ex ante version. The important issues associated with the
relationships (if any) between historic Sharpe Ratios and unbiased forecasts of the ratio are left for other
expositions.

Throughout, we build on Markowitz' mean-variance paradigm, which assumes that the mean and
standard deviation of the distribution of one-period return are sufficient statistics for evaluating the
prospects of an investment portfolio. Clearly, comparisons based on the first two moments of a
distribution do not take into account possible differences among portfolios in other moments or in
distributions of outcomes across states of nature that may be associated with different levels of investor
utility.

www.stanford.edu/~wfsharpe/art/sr/sr.htm 1/14

When such considerations are especially important, return mean and variance may not suffice, requiring
the use of additional or substitute measures. Such situations are, however, beyond the scope of this
article. Our goal is simply to examine the situations in which two measures (mean and variance) can
usefully be summarized with one (the Sharpe Ratio).

The Ex Ante Sharpe Ratio

Let RF represent the return on fund F in the forthcoming period and RB the return on a benchmark
portfolio or security. In the equations, the tildes over the variables indicate that the exact values may not
be known in advance. Define d, the differential return, as:

$$\tilde{d} \equiv \tilde{R}_F - \tilde{R}_B \qquad (1)$$

Let d-bar be the expected value of d and sigmad be the predicted standard deviation of d. The ex ante
Sharpe Ratio (S) is:

$$S \equiv \frac{\bar{d}}{\sigma_d} \qquad (2)$$

In this version, the ratio indicates the expected differential return per unit of risk associated with the
differential return.

The Ex Post Sharpe Ratio

Let RFt be the return on the fund in period t, RBt the return on the benchmark portfolio or security in
period t, and Dt the differential return in period t:

$$D_t \equiv R_{Ft} - R_{Bt} \qquad (3)$$

Let D-bar be the average value of Dt over the historic period from t=1 through T:

$$\bar{D} \equiv \frac{1}{T} \sum_{t=1}^{T} D_t \qquad (4)$$

and sigmaD be the standard deviation over the period (see footnote 1):

$$\sigma_D \equiv \sqrt{\frac{\sum_{t=1}^{T} (D_t - \bar{D})^2}{T - 1}} \qquad (5)$$

The ex post, or historic Sharpe Ratio (Sh) is:

$$S_h \equiv \frac{\bar{D}}{\sigma_D} \qquad (6)$$

In this version, the ratio indicates the historic average differential return per unit of historic variability of
the differential return.

It is a simple matter to compute an ex post Sharpe Ratio using a spreadsheet program. The returns on a
fund are listed in one column and those of the desired benchmark in the next column. The differences
are computed in a third column. Standard functions are then utilized to compute the components of the
ratio. For example, if the differential returns were in cells C1 through C60, a formula would provide the
Sharpe Ratio using Microsoft's Excel spreadsheet program:

AVERAGE(C1:C60)/STDEV(C1:C60)

The historic Sharpe Ratio is closely related to the t-statistic for measuring the statistical significance of
the mean differential return. The t-statistic will equal the Sharpe Ratio times the square root of T (the
number of returns used for the calculation). If historic Sharpe Ratios for a set of funds are computed
using the same number of observations, the Sharpe Ratios will thus be proportional to the t-statistics of
the means.
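
The spreadsheet calculation has a direct counterpart in most programming languages. The sketch below is one possible Python version, using hypothetical monthly returns; ex_post_sharpe is a name chosen for the example. It also shows the stated relationship between the historic Sharpe Ratio and the t-statistic.

```python
import math
import statistics

def ex_post_sharpe(fund_returns, benchmark_returns):
    """Average differential return divided by its sample standard deviation.

    statistics.stdev uses the T-1 divisor, as Excel's STDEV does.
    """
    d = [f - b for f, b in zip(fund_returns, benchmark_returns)]
    return statistics.mean(d) / statistics.stdev(d)

# Hypothetical monthly returns (percent).
fund = [1.8, -0.6, 2.4, 0.9, -1.1, 1.5, 0.2, 2.0]
bench = [1.0, -0.2, 1.9, 1.1, -0.8, 0.7, 0.4, 1.2]

sh = ex_post_sharpe(fund, bench)
t_stat = sh * math.sqrt(len(fund))   # t-statistic of the mean differential return
print(round(sh, 4), round(t_stat, 4))
```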

Time Dependence

The Sharpe Ratio is not independent of the time period over which it is measured. This is true for both
ex ante and ex post measures.

Consider the simplest possible case. The one-period mean and standard deviation of the differential
return are, respectively, d-bar1 and sigmad1. Assume that the differential return over T periods is
measured by simply summing the one-period differential returns and that the latter have zero serial
correlation. Denote the mean and standard deviation of the resulting T-period return, respectively,
d-barT and sigmadT. Under the assumed conditions:

$$\bar{d}_T = T \, \bar{d}_1 \qquad (7)$$

and:

$$\sigma_{d_T} = \sqrt{T} \, \sigma_{d_1} \qquad (8)$$

Letting S1 and ST denote the Sharpe Ratios for 1 and T periods, respectively, it follows that:

$$S_T = \sqrt{T} \, S_1 \qquad (9)$$

In practice, the situation is likely to be more complex. Multiperiod returns are usually computed taking
compounding into account, which makes the relationship more complicated. Moreover, underlying
differential returns may be serially correlated. Even if the underlying process does not involve serial
correlation, a specific ex post sample may.

It is common practice to "annualize" data that apply to periods other than one year, using equations (7)
and (8). Doing so before computing a Sharpe Ratio can provide at least reasonably meaningful
comparisons among strategies, even if predictions are initially stated in terms of different measurement
periods.

To maximize information content, it is usually desirable to measure risks and returns using fairly short
(e.g. monthly) periods. For purposes of standardization it is then desirable to annualize the results.

To provide perspective, consider investment in a broad stock market index, financed by borrowing.
Typical estimates of the annual excess return on the stock market in a developed country might include a
mean of 6% per year and a standard deviation of 15%. The resulting excess return Sharpe Ratio of "the
stock market", stated in annual terms would then be 0.40.
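
Under the simple assumptions used above (summed returns and zero serial correlation), the mean scales with the number of periods and the standard deviation with its square root. The sketch below applies this to monthly figures chosen, as an assumption, to be consistent with the 6% and 15% annual values in the example; annualize is a name introduced for the illustration.

```python
import math

def annualize(mean_1, sd_1, periods_per_year):
    """Scale one-period mean and standard deviation to a year (no serial correlation)."""
    return periods_per_year * mean_1, math.sqrt(periods_per_year) * sd_1

# Monthly figures chosen (as an assumption) to match 6% and 15% per year.
monthly_mean = 6.0 / 12
monthly_sd = 15.0 / math.sqrt(12)

annual_mean, annual_sd = annualize(monthly_mean, monthly_sd, 12)
s_monthly = monthly_mean / monthly_sd
s_annual = annual_mean / annual_sd
print(round(s_monthly, 4), round(s_annual, 4))
```

The monthly ratio is much smaller than the annual one, which is why ratios should be compared only after being stated for the same period length.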

Correlations

The ex ante Sharpe Ratio takes into account both the expected differential return and the associated risk,
while the ex post version takes into account both the average differential return and the associated
variability. Neither incorporates information about the correlation of a fund or strategy with other assets,
liabilities, or previous realizations of its own return. For this reason, the ratio may need to be
supplemented in certain applications. Such considerations are discussed in later sections.

Related Measures

The literature surrounding the Sharpe Ratio has, unfortunately, led to a certain amount of confusion. To
provide clarification, two related measures are described here. The first uses a different term to cover
cases that include the construct that we call the Sharpe Ratio. The second uses the same term to describe
a different but related construct.

Whether measured ex ante or ex post, it is essential that the Sharpe Ratio be computed using the mean
and standard deviation of a differential return (or, more broadly, the return on what will be termed a
zero investment strategy). Otherwise it loses its raison d'etre. Clearly, the Sharpe Ratio can be
considered a special case of the more general construct of the ratio of the mean of any distribution to its
standard deviation.

In the investment arena, a number of authors associated with BARRA (a major supplier of analytic tools
and databases) have used the term information ratio to describe such a general measure. In some
publications, the ratio is defined to apply only to differential returns and is thus equivalent to the
measure that we call the Sharpe Ratio (see, for example, Rudd and Clasing [1982, p. 513] and Grinold
[1989, p. 31]). In others, it also encompasses the ratio of the mean to the standard deviation of the
distribution of the return on a single investment, such as a fund or a benchmark (see, for example,
BARRA [1993, p. 22]). While such a "return information ratio" may be useful as a descriptive statistic,
it lacks a number of the key properties of what might be termed a "differential return information ratio"
and may in some instances lead to wrong decisions.

For example, consider the choice of a strategy involving cash and one of two funds, X and Y. X has an
expected return of 5% and a standard deviation of 10%. Y has an expected return of 8% and a standard
deviation of 20%. The riskless rate of interest is 3%. According to the ratio of expected return to
standard deviation, X (5/10, or 0.50) is superior to Y (8/20, or 0.40). According to the Sharpe Ratios
using excess return, X (2/10, or 0.20) is inferior to Y (5/20, or 0.25).

Now, consider an investor who wishes to attain a standard deviation of 10%. This can be achieved with
fund X, which will provide an expected return of 5.0%. It can also be achieved with an investment of
50% of the investor's funds in Y and 50% in the riskless asset. The latter will provide an expected return
of 5.5% -- clearly the superior alternative.

Thus the Sharpe Ratio provides the correct answer (a strategy using Y is preferred to one using X),
while the "return information ratio" provides the wrong one.
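
The arithmetic of this example can be reproduced in a few lines. The figures are those given above; the variable names are chosen only for the illustration.

```python
riskless = 3.0                                  # percent
funds = {"X": (5.0, 10.0), "Y": (8.0, 20.0)}    # (expected return, std. deviation)
target_sd = 10.0

results = {}
for name, (er, sd) in funds.items():
    w = target_sd / sd                          # fraction in the fund, rest riskless
    expected = w * er + (1 - w) * riskless      # expected return at the target risk
    results[name] = (er / sd, (er - riskless) / sd, expected)
    print(name, results[name])
```

Each tuple holds the ratio of expected return to standard deviation, the excess return Sharpe Ratio, and the expected return obtained at a 10% standard deviation. Only the Sharpe Ratio ranks the funds in the same order as the last column.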

In their seminal work, Treynor and Black [1973] defined the term "Sharpe Ratio" as the square of the
measure that we describe. Others, such as Rudd and Clasing [1982, p. 518] and Grinold [1989, p. 31],
also use such a definition.

While interesting in certain contexts, this construct has the curious property that all values are positive --
even those for which the mean differential return is negative. It thus obscures important information
concerning performance. We prefer to follow more common practice and thus refer to the Treynor-Black
measure as the Sharpe Ratio squared (SR^2) (see footnote 2).

We focus here on the Sharpe Ratio, which takes into account both risk and return without reference to a
market index. Sharpe [1966, 1975] discusses both the Sharpe Ratio and measures based on market
indices, such as Jensen's alpha and Treynor's average excess return to beta ratio.

Scale Independence
Originally, the benchmark for the Sharpe Ratio was taken to be a riskless security. In such a case the
differential return is equal to the excess return of the fund over a one-period riskless rate of interest.
Many of the descriptions of the ratio in Sharpe [1966, 1975] focus on this case.

More recent applications have utilized benchmark portfolios designed to have a set of "factor loadings"
or an "investment style" similar to that of the fund being evaluated. In such cases the differential return
represents the difference between the return on the fund and the return that would have been obtained
from a "similar" passive alternative. The difference between the two returns may be termed an "active
return" or "selection return", depending on the underlying procedure utilized to select the benchmark.

Treynor and Black [1973] cover the case in which the benchmark portfolio is, in effect, a combination
of riskless securities and the "market portfolio". Rudd and Clasing [1982] describe the use of
benchmarks based on factor loadings from a multifactor model. Sharpe [1992] uses a procedure termed
style analysis to select a mix of asset class index funds that have a "style" similar to that of the fund.
When such a mix is used as a benchmark, the differential return is termed the fund's selection return.
The Sharpe Ratio of the selection return can then serve as a measure of the fund's performance over and
above that due to its investment style (note 3).

Central to the usefulness of the Sharpe Ratio is the fact that a differential return represents the result of a
zero-investment strategy. This can be defined as any strategy that involves a zero outlay of money in the
present and returns either a positive, negative or zero amount in the future, depending on circumstances.
A differential return clearly falls in this class, since it can be obtained by taking a long position in one
asset (the fund) and a short position in another (the benchmark), with the funds from the latter used to
finance the purchase of the former.

In the original applications of the ratio, where the benchmark is taken to be a one-period riskless asset,
the differential return represents the payoff from a unit investment in the fund, financed by borrowing (note 4).
More generally, the differential return corresponds to the payoff obtained from a unit investment in the
fund, financed by a short position in the benchmark. For example, a fund's selection return can be
considered to be the payoff from a unit investment in the fund, financed by short positions in a mix of
asset class index funds with the same style.

A differential return can be obtained explicitly by entering into an agreement in which a party and a
counterparty agree to swap the return on the benchmark for the return on the fund and vice-versa. A
forward contract provides a similar result. Arbitrage will ensure that the return on such a contract will be
very close to the excess return on the underlying asset for the period ending on the delivery date (note 5). A
similar relationship holds approximately for traded contracts such as stock index futures, which clearly
represent zero-investment strategies (note 6).

To compute the return for a zero-investment strategy the payoff is divided by a notional value. For
example, the dollar payoff for a swap is often set to equal the difference between the dollar return on an
investment of $X in one asset and that on an investment of $X in another. The net difference can then
be expressed as a proportion of $X, which serves as the notional value. Returns on futures positions are
often computed in a similar manner, using the initial value of the underlying asset as a base. In effect, the
same approach is utilized when the difference between two returns is computed.

Since there is zero net investment in any such strategy, the percent return can be made as large or small
as desired by simply changing the notional value used in such a computation. The scale of the return
thus depends on the more-or-less arbitrary choice of the notional value utilized for its computation (note 7).

Changes in the notional value clearly affect the mean and the standard deviation of the distribution of
return, but both are scaled by the same factor, leaving the Sharpe Ratio unaffected. The ratio is thus
scale independent (note 8).
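
A small numerical sketch may make the point concrete. The payoffs and notional values below are invented purely for illustration, and the helper names are hypothetical. The sketch uses the population form of the standard deviation; any consistent choice would give the same conclusion.

```python
import statistics

def sharpe_ratio(values):
    """Mean divided by the (population) standard deviation."""
    return statistics.mean(values) / statistics.pstdev(values)

def returns_for_notional(payoffs, notional):
    """Express dollar payoffs as proportions of a chosen notional value."""
    return [payoff / notional for payoff in payoffs]

# Hypothetical dollar payoffs of a zero-investment strategy (illustrative only).
payoffs = [3.0, -1.0, 2.0, 0.5, -0.5]

# The same payoffs expressed as returns on two different notional values.
sr_small = sharpe_ratio(returns_for_notional(payoffs, 100.0))
sr_large = sharpe_ratio(returns_for_notional(payoffs, 250.0))

# The ratio computed from the dollar payoffs directly, with no notional value.
sr_payoff = sharpe_ratio(payoffs)

print(sr_small, sr_large, sr_payoff)
```

The mean and standard deviation of return both change with the notional value, but the three ratios printed agree up to floating-point rounding.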

The Influence of a Zero Investment Strategy on Asset Risk and Return


Scale independence is more than a mathematical artifact. It is key to understanding why the Sharpe
Ratio can provide an efficient summary statistic for a zero-investment strategy. To show this, we
consider the case of an investor with a pre-existing portfolio who is considering the choice of a zero
investment strategy to augment current investments.

The Relative Position in a Zero Investment Strategy

Assume that the investor has $A in assets and has placed this money in an investment portfolio with a
return of \(R_I\). She is considering investment in a zero-investment strategy that will provide a return of \(d\)
per unit of notional value. Denote the notional value chosen as \(V\) (e.g. investment of \(V\) in a fund
financed by a short position of \(V\) in a benchmark). Define the relative position, \(p\), as the ratio of the
notional value to the investor's assets:

\[ p = \frac{V}{A} \qquad (11) \]

The end-of-period payoff will be:

\[ A (1 + R_I) + V d \qquad (12) \]

Let \(R_A\) denote the total return on the investor's initial assets. Then:

\[ R_A = R_I + p \, d \qquad (13) \]

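If \(\bar{R}_A\) denotes the expected return on assets, \(\bar{R}_I\) the expected return on the investment and \(\bar{d}\) the expected return on the zero-investment strategy:

\[ \bar{R}_A = \bar{R}_I + p \, \bar{d} \qquad (14) \]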

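Now, let \(\sigma_A\), \(\sigma_I\) and \(\sigma_d\) denote the standard deviations of the returns on assets, the
investment and the zero-investment strategy, respectively, and \(\rho_{Id}\) the correlation between the return
on the investment and the return on the zero-investment strategy. Then:

\[ \sigma_A^2 = \sigma_I^2 + 2 p \rho_{Id} \sigma_I \sigma_d + p^2 \sigma_d^2 \qquad (15) \]

or, rewriting slightly:

\[ \sigma_A^2 = \sigma_I^2 + 2 \rho_{Id} \sigma_I (p \sigma_d) + (p \sigma_d)^2 \qquad (16) \]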

The Risk Position in a Zero Investment Strategy

The parenthesized expression \((p \sigma_d)\) is of particular interest. It indicates the risk of the position in
the zero-investment strategy relative to the investor's overall assets. Let \(k\) denote this risk position:

\[ k = p \sigma_d \qquad (17) \]

For many purposes it is desirable to consider k as the relevant decision variable. Doing so states the
magnitude of a zero-investment strategy in terms of its risk relative to the investor's overall assets. In
effect, one first determines k, the level of risk of the zero-investment strategy. Having answered this
fundamental question, the relative (p) and absolute (V) amounts of notional value for the strategy can
readily be determined, using equations (17) and (11) (note 9).

Asset Risk and Expected Return

It is straightforward to determine the manner in which asset risk and expected return are related to the
risk position of the zero investment strategy, its correlation with the investment, and its Sharpe Ratio.

Substituting k in equation (16) gives the relationship between 1) asset risk and 2) the risk position and
the correlation of the strategy with the investment:

\[ \sigma_A^2 = \sigma_I^2 + 2 k \rho_{Id} \sigma_I + k^2 \qquad (18) \]

To see the relationship between asset expected return and the characteristics of the zero investment
strategy, note that the Sharpe Ratio, \(S\), is the ratio of \(\bar{d}\) to \(\sigma_d\). It follows that

\[ \bar{d} = S \sigma_d \qquad (19) \]

Substituting equation (19) in equation (14) gives:

\[ \bar{R}_A = \bar{R}_I + p S \sigma_d \qquad (20) \]

or:

\[ \bar{R}_A = \bar{R}_I + k S \qquad (21) \]

which shows that the expected return on assets is related directly to the product of the risk position times
the Sharpe Ratio of the strategy.

By selecting an appropriate scale, any zero investment strategy can be used to achieve a desired level (k)
of relative risk. This level, plus the strategy's Sharpe Ratio, will determine asset expected return, as
shown by equation (21). Asset risk, however, will depend on both the relative risk (k) and the
correlation of the strategy with the other investment (\(\rho_{Id}\)). In general, the Sharpe Ratio, which does
not take that correlation into account, will not by itself provide sufficient information to determine a set
of decisions that will produce an optimal combination of asset risk and return, given an investor's
tolerance of risk.
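
The two relationships just described can be sketched in a few lines. The sketch assumes that asset variance equals sigma_I^2 + 2 k rho sigma_I + k^2 and that expected asset return equals the investment's expected return plus k times the Sharpe Ratio; the function names and the numerical inputs are hypothetical.

```python
import math

def asset_risk(sigma_i, k, rho):
    """Standard deviation of total asset return.

    sigma_i: risk of the pre-existing investment
    k:       risk position taken in the zero-investment strategy
    rho:     correlation between the investment and the strategy
    """
    return math.sqrt(sigma_i ** 2 + 2.0 * k * rho * sigma_i + k ** 2)

def asset_expected_return(mean_i, k, sharpe):
    """Expected asset return: investment mean plus risk position times Sharpe Ratio."""
    return mean_i + k * sharpe

# Illustrative inputs: same risk position and Sharpe Ratio, different correlations.
for rho in (0.0, 0.5, 1.0):
    print(rho, asset_risk(0.10, 0.05, rho), asset_expected_return(0.06, 0.05, 0.30))
```

Expected return is the same in every row, while asset risk rises with the correlation, which is why the Sharpe Ratio alone is not in general sufficient.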

Adding a Zero-Investment Strategy to an Existing Portfolio


Fortunately, there are important special cases in which the Sharpe Ratio will provide sufficient
information for decisions on the optimal risk/return combination: one in which the pre-existing portfolio
is riskless, the other in which it is risky.

Adding a Strategy to a Riskless Portfolio

Suppose first that an investor plans to allocate money between a riskless asset and a single risky fund
(e.g. a "balanced" fund). This is, in effect, the case analyzed in Sharpe [1966, 1975].

We assume that there is a pre-existing portfolio invested solely in a riskless security, to which is to be
added a zero investment strategy involving a long position in a fund, financed by a short position in a
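riskless asset (i.e., borrowing). Letting \(R_c\) denote the return on such a "cash equivalent" and \(R_F\) the
return on the fund, equations (1) and (13) can be written as:

\[ d = R_F - R_c \qquad (22) \]

and

\[ R_A = R_c + p \, d \qquad (23) \]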

Since the investment is riskless, its standard deviation of return is zero, so both the first and second terms
on the right-hand side of equation (18) become zero, giving:

\[ \sigma_A = k \qquad (24) \]

The investor's total risk will thus be equal to that of the position taken in the zero investment strategy,
which will in turn equal the risk of the position in the fund.

Letting \(S_F\) represent the Sharpe Ratio of fund F, equation (21) can be written:

\[ \bar{R}_A = R_c + k S_F \qquad (25) \]

It is clear from equations (24) and (25) that the investor should choose the desired level of risk (k), then
obtain that level of risk by using the fund (F) with the greatest excess return Sharpe Ratio. Correlation
does not play a role since the remaining holdings are riskless.

This is illustrated in the Exhibit. Points X and Y represent two (mutually exclusive) strategies. The
desired level of risk is given by k. It can be obtained with strategy X using a relative position of pX
(shown in the figure at point PxX) or with strategy Y using a relative position of pY (shown in the figure
at point PyY). An appropriately-scaled version of strategy X clearly provides a higher mean return
(shown at point MRx) than an appropriately-scaled version of strategy Y (shown at point MRy).
Strategy X is hence to be preferred.

The Exhibit shows that the mean return associated with any desired risk position will be greater if
strategy X is adopted instead of strategy Y. But the slope of the line relating mean return to risk position for a strategy is its Sharpe Ratio. Hence, as
long as only the mean return and the risk position of the zero-investment strategy are relevant, the
optimal solution involves maximization of the Sharpe Ratio of the zero-investment strategy.

Consider, for example, a choice between fund XX, with a risk of 10% and an excess return Sharpe
Ratio of 0.20 and fund YY with a risk of 20% and an excess return Sharpe Ratio of 0.25. Assume the
investor has $100 to invest and desires a level of risk (here, k) equal to 15%.

The optimal strategy involves investment of $100 in the riskless asset plus a zero-investment strategy
based on fund YY. To make the risk of the latter equal to 15%, a relative position (p) of 0.75 should be
taken. This, in turn, requires an investment of $75 in the fund, financed by $75 of borrowing (i.e. a short
position in the riskless asset). The net position in the riskless asset will thus be $25 ($100 - $75), with
$75 invested in Fund YY.
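
The arithmetic of this example can be checked with a short sketch. The fund data are those given in the text; the function and variable names are hypothetical, and the expected excess return on assets is taken to be the risk position times the fund's Sharpe Ratio.

```python
def position_for_risk(k, fund_risk, wealth):
    """Return (relative position, dollars in fund, net dollars in riskless asset)."""
    p = k / fund_risk              # relative position needed to reach risk k
    fund_dollars = p * wealth      # long position in the fund
    return p, fund_dollars, wealth - fund_dollars

funds = {
    "XX": {"risk": 0.10, "sharpe": 0.20},
    "YY": {"risk": 0.20, "sharpe": 0.25},
}
k = 0.15        # desired level of asset risk
wealth = 100.0  # dollars available

# Choose the fund with the greatest Sharpe Ratio, then scale it to risk k.
best = max(funds, key=lambda name: funds[name]["sharpe"])
p, fund_dollars, riskless_dollars = position_for_risk(k, funds[best]["risk"], wealth)

# Expected excess return on assets at risk k for each candidate fund.
expected_excess = {name: k * f["sharpe"] for name, f in funds.items()}

print(best, p, fund_dollars, riskless_dollars, expected_excess)
```

At the same 15% risk, fund YY offers an expected excess return of about 3.75% against about 3.0% for fund XX, which would require a relative position of 1.5 and hence net borrowing of $50.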

In this case the investor's tasks include the selection of the fund with the greatest Sharpe Ratio and the
allocation of wealth between this fund and borrowing or lending, as required to obtain the desired level
of asset risk.

Adding a Strategy to a Risky Portfolio

Consider now the case in which a single fund is to be selected to complement a pre-existing group of
risky investments. For example, an investor might have $100, with $80 already committed (e.g. to a
group of bond and stock funds). The goal is to allocate the remaining $20 between a riskless asset
("cash") and a single risky fund (e.g. a "growth stock fund"), accepting the possibility that the amount
allocated to cash might be positive, zero or negative, depending on the desired risk and the risk of the
chosen fund.

In this case the investment should be taken as the pre-existing investment plus a riskless asset (in the
example, $80 in the initial investments plus $20 in cash equivalents). The return on this total portfolio
will be \(R_I\). The zero-investment strategy will again involve a long position in a risky fund and a short
position in the riskless asset.

As stated earlier, in such a case it will not necessarily be optimal to select the fund with the largest
possible Sharpe Ratio. While the ratio takes into account two key attributes of the predicted performance
of a zero-investment strategy (its expected return and its risk), it does not include information about the
correlation of its return with that of the investor's other holdings (\(\rho_{Id}\)). It is entirely possible that a fund
with a smaller Sharpe Ratio could have a sufficiently smaller correlation with the investor's other assets
that it would provide a higher expected return on assets for any given level of overall asset risk.

However, if the alternative funds being analyzed have similar correlations with the investor's other
assets, it will still be optimal to select the fund with the greatest Sharpe Ratio. To see this, note that with
\(\rho_{Id}\) taken as given, equation (18) shows that there is a one-to-one correspondence between \(\sigma_A\)
and k. Thus, for any desired level of asset risk, the investor chooses the corresponding risk position k
given by equation (18), regardless of the fund to be employed.

But, as before, the expected return on assets will be (with \(S\) the Sharpe Ratio of the chosen fund):

\[ \bar{R}_A = \bar{R}_I + k S \qquad (26) \]

which can be maximized by selecting the fund with the largest Sharpe Ratio.

The practical implication is clear. When choosing one from among a group of funds of a particular type
for inclusion in a larger set of holdings, the one with the largest predicted excess return Sharpe Ratio
may reasonably be chosen, if it can be assumed that all the funds in the set have similar correlations with
the other holdings. If this condition is not met, some account should be taken of the differential levels of
such correlations.
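
The role of correlation can be illustrated numerically. The sketch below assumes that asset variance equals sigma_I^2 + 2 k rho sigma_I + k^2, solves for the non-negative risk position k that delivers a target level of asset risk, and reports the added expected return k times the Sharpe Ratio. All inputs and function names are hypothetical.

```python
import math

def risk_position_for_target(sigma_a, sigma_i, rho):
    """Non-negative k solving sigma_a^2 = sigma_i^2 + 2*k*rho*sigma_i + k^2."""
    disc = (rho * sigma_i) ** 2 + sigma_a ** 2 - sigma_i ** 2
    if disc < 0:
        raise ValueError("target asset risk is not attainable")
    k = -rho * sigma_i + math.sqrt(disc)
    if k < 0:
        raise ValueError("target requires a negative risk position")
    return k

def added_expected_return(sigma_a, sigma_i, rho, sharpe):
    """Expected return added to the investment at the target asset risk."""
    return risk_position_for_target(sigma_a, sigma_i, rho) * sharpe

sigma_i, target = 0.10, 0.12  # existing risk and desired asset risk

# Similar correlations: the fund with the larger Sharpe Ratio adds more.
same_rho_high = added_expected_return(target, sigma_i, 0.5, 0.30)
same_rho_low = added_expected_return(target, sigma_i, 0.5, 0.25)

# Different correlations: a smaller Sharpe Ratio can add more.
high_corr = added_expected_return(target, sigma_i, 0.8, 0.30)
low_corr = added_expected_return(target, sigma_i, 0.2, 0.25)

print(same_rho_high, same_rho_low, high_corr, low_corr)
```

With these inputs the fund with a Sharpe Ratio of 0.25 and a correlation of 0.2 adds roughly 1.2% of expected return, against roughly 0.7% for the fund with a ratio of 0.30 and a correlation of 0.8.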

The Choice of a Set of Uncorrelated Strategies

Suppose finally that an investor has a pre-existing set of investments and is considering taking positions
in one or more zero-investment strategies, each of which is uncorrelated both with the existing
investments and with each of the other such strategies. Such lack of correlation is generally assumed for
residual returns from an assumed factor model and hence applies to strategies in which long and short
positions are combined to obtain zero exposures to all underlying factors in such a model.

In particular, this is assumed to hold for the "non-market returns" which are the residual returns in one-
factor "market models" of the type employed in Treynor-Black [1973]. It is also assumed to hold for the
"active returns" that constitute the residual returns in a model of the type used by BARRA (described,
for example, in Grinold [1989]).

Most germane, perhaps, for selecting funds, this is assumed to hold for the "selection returns" that
constitute the residuals from the asset class factor model used in the style analysis procedure described in
Sharpe [1992] (note 10).

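Under the assumed conditions, the counterpart to equation (13) is:

\[ R_A = R_I + \sum_i p_i d_i \qquad (27) \]

where \(p_i\) represents the relative position taken in strategy i and \(d_i\) represents its return.

Letting \(\sigma_{d_i}\) represent the risk of position i, asset risk is given by:

\[ \sigma_A^2 = \sigma_I^2 + \sum_i p_i^2 \sigma_{d_i}^2 \qquad (28) \]

and expected asset return by:

\[ \bar{R}_A = \bar{R}_I + \sum_i p_i \bar{d}_i \qquad (29) \]

Adding subscripts to equations (21) and (18), and substituting the results gives:

\[ \sigma_A^2 = \sigma_I^2 + \sum_i k_i^2 \qquad (30) \]

and

\[ \bar{R}_A = \bar{R}_I + \sum_i k_i S_i \qquad (31) \]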

Now, assume that the investor's goal is to maximize a standard risk-adjusted expected return of the
form:

\[ U = \bar{R}_A - \frac{\sigma_A^2}{\tau} \qquad (32) \]

where \(\tau\) represents risk tolerance (the marginal rate of substitution of variance for expected return).
Substituting equations (30) and (31) in (32) gives:

\[ U = \bar{R}_I + \sum_i k_i S_i - \frac{\sigma_I^2 + \sum_i k_i^2}{\tau} \qquad (33) \]

Since the terms involving the initial investment will be unaffected by the decisions (the \(k_i\)'s) concerning the
zero investment strategies, it suffices to maximize:

\[ \sum_i k_i S_i - \frac{1}{\tau} \sum_i k_i^2 \qquad (34) \]

To do so, the partial derivative with respect to each decision variable (\(k_i\)) should be set to zero:

\[ S_i - \frac{2 k_i}{\tau} = 0 \qquad (35) \]

The optimal risk position in strategy i is thus:

\[ k_i = \frac{\tau}{2} S_i \qquad (36) \]

Hence the risk levels of the strategies should be proportional to their Sharpe Ratios. Strategies with zero
predicted Sharpe Ratios should be ignored. Those with positive ratios should be "held long", and those
with negative ratios "held short". If strategy X has a positive Sharpe Ratio that is twice as large as that of
strategy Y, twice as much risk should be taken with X as with Y. The overall scale of all the positions
should, in turn, be proportional to the investor's risk tolerance.
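
The proportionality rule can be sketched as follows. The sketch assumes an objective equal to expected asset return less asset variance divided by risk tolerance, with mutually uncorrelated strategies, so that each optimal risk position is half the risk tolerance times the strategy's Sharpe Ratio. The Sharpe Ratios, the risk tolerance and the function names are hypothetical.

```python
def optimal_risk_positions(sharpe_ratios, tau):
    """Risk position for each uncorrelated strategy: (tau / 2) * Sharpe Ratio."""
    return [0.5 * tau * s for s in sharpe_ratios]

def objective(ks, sharpe_ratios, tau):
    """Part of risk-adjusted expected return affected by the risk positions."""
    added_return = sum(k * s for k, s in zip(ks, sharpe_ratios))
    added_variance = sum(k * k for k in ks)
    return added_return - added_variance / tau

sharpe_ratios = [0.4, 0.2, 0.0, -0.1]  # illustrative predicted ratios
tau = 0.5                               # illustrative risk tolerance

ks = optimal_risk_positions(sharpe_ratios, tau)
print(ks)

# Moving any single position away from its optimum lowers the objective.
perturbed = list(ks)
perturbed[0] += 0.01
print(objective(ks, sharpe_ratios, tau), objective(perturbed, sharpe_ratios, tau))
```

The first strategy receives twice the risk of the second, the third is ignored, and the fourth is held short, as the text describes.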

An interesting application occurs when long and short positions can be taken (e.g. via financial futures)
in the asset classes that underlie a style analysis model of the type described in Sharpe [1992]. In
principle, funds should be selected based only on their selection returns, with the respective amounts of
selection risk set in proportion to the funds' selection return Sharpe Ratios. The net exposures to asset
classes required to implement this mixture of zero investment strategies can then be compared with the
investor's desired passive asset mix to determine needed net positions.

Summary
The Sharpe Ratio is designed to measure the expected return per unit of risk for a zero investment
strategy. The difference between the returns on two investment assets represents the results of such a
strategy. The Sharpe Ratio does not cover cases in which only one investment return is involved.

Clearly, any measure that attempts to summarize even an unbiased prediction of performance with a
single number requires a substantial set of assumptions for justification. In practice, such assumptions
are, at best, likely to hold only approximately. Certainly, the use of unadjusted historic (ex post) Sharpe
Ratios as surrogates for unbiased predictions of ex ante ratios is subject to serious question. Despite such
caveats, there is much to recommend a measure that at least takes into account both risk and expected
return over any alternative that focuses only on the latter.

For a number of investment decisions, ex ante Sharpe Ratios can provide important inputs. When
choosing one from among a set of funds to provide representation in a particular market sector, it makes
sense to favor the one with the greatest predicted Sharpe Ratio, as long as the correlations of the funds
with other relevant asset classes are reasonably similar. When allocating money among several such
funds, it makes sense to do so in such a way that the selection (residual) risk levels are proportional to the
predicted Sharpe Ratios for the selection (residual) returns. If some of the implied net positions are
infeasible or involve excessive transactions costs, of course, the decision rules must be modified.
Nonetheless, Sharpe Ratios may still provide useful guidance.

Whatever the application, it is essential to remember that the Sharpe Ratio does not take correlations into
account. When a choice may affect important correlations with other assets in an investor's portfolio,
such information should be used to supplement comparisons based on Sharpe Ratios.

All the same, the ratio of expected added return per unit of added risk provides a convenient summary of
two important aspects of any strategy involving the difference between the return of a fund and that of a
relevant benchmark. The Sharpe Ratio is designed to provide such a measure. Properly used, it can
improve the process of managing investments.

Endnotes
1. We use the formula for the standard deviation of a population, taking the observations as a sample.
For applications in which the value of T is the same for all the funds being measured, the standard
deviation of the historic data (in which the denominator is T rather than T-1) can generally be used
instead, since the relative magnitudes of the resulting measures would be the same.

2. Treynor and Black showed that if resources are allocated optimally, the SR^2 of a portfolio will equal
the sum of the SR^2 values for its components. This follows from the fact that the optimal holding of a
component will be proportional to the ratio of its mean differential return to the square of the standard
deviation of its differential return. Thus, for example, components with negative means should be held
in negative amounts. In this context, the product of the mean return and the optimal holding will always
be positive. For completeness, it should be noted that Treynor and Black used the term appraisal ratio
to refer to what we term here the SR^2 of a component and the term Sharpe Ratio to refer to the SR^2 of
the portfolio, although other authors have used the latter term for both the portfolio and its components.

3. This type of application is described in BARRA [1992, p. 21].

4. In this context, maximization of the Sharpe Ratio is the normative equivalent to the separation
theorem first put forth in Tobin [1958] in a positive context.

5. To see this, note that by borrowing money to purchase the underlying asset, one can obtain precisely
the same asset at the delivery date. The ending value of such a strategy will be perfectly correlated with
the value of the forward contract and neither will require any outlay. If the payoffs at the end of the
period differ, one could take a long position in one combination (e.g. the forward contract or the
asset/borrowing combination) and a short position in the other and obtain a guaranteed payment at the
end of the period with no outlay at any other time. This is unlikely to be the case in a market populated
by astute investors. In practice, transactions costs will limit the precision of the relationship.

6. Futures contracts are often not protected against changes in value due to (for example) dividend
payments. They also generally require daily marking to market. For these reasons they differ from
forward contracts with dividend protection, for which the arbitrage relationship will hold within the
bounds of transactions costs. Futures contracts generally require that margin be posted. However, this is
not an investment in the underlying asset.

7. Despite this drawback, once a notional value has been selected, the actual rate of return can be used
for comparison purposes.

8. Indeed, a Sharpe Ratio can be computed without regard to notional value by simply using the mean
and standard deviation of the distribution of the final payoff.

9. To see the advantages of concentration on the risk position of a strategy, consider two funds. One (X)
invests directly, the other (Y) borrows money at the riskless rate and invests in X, with a leverage ratio
of 2 to 1. Let kx be the optimal position in fund X. Clearly the optimal position in fund Y will be half as
large. However, the standard deviation of return on fund Y will be twice that of fund X. Thus the
optimal risk position in Y will be the same as that in X.

10. In fact, the basic relationship on which this section builds was first obtained by Treynor and Black
[1973].

References
BARRA Newsletter, September/October 1992, May/June 1993, BARRA, Berkeley, Ca.

Bodie, Zvi, Alex Kane and Alan J. Marcus. Investments, 2d edition. Homewood, IL: Richard D. Irwin,
1993.

Capaul, Carlo, Ian Rowley, and William F. Sharpe. "International Value and Growth Stock Returns,"
Financial Analysts Journal, January/February 1993, pp. 27-36.

Elton, Edwin J., and Martin J. Gruber. Modern Portfolio Theory and Investment Analysis, 4th edition.
New York: John Wiley & Sons, 1991.

Grinold, Richard C. "The Fundamental Law of Active Management," Journal of Portfolio
Management, Spring 1989, pp. 30-37.

Haugen, Robert A. Modern Investment Theory, 3d edition. Englewood Cliffs, NJ: Prentice-Hall, 1993.

"Morningstar Mutual Funds User's Guide." Chicago: Morningstar Inc., 1993.

Radcliff, Robert C. Investment Concepts, Analysis, Strategy, 3d edition. New York: HarperCollins,
1990.

Reilly, Frank K. Investment Analysis and Portfolio Management, 3d edition. Chicago: The Dryden
Press, 1989.

Rudd, Andrew, and Henry K. Clasing. Modern Portfolio Theory, The Principles of Investment
Management. Homewood, IL: Dow-Jones Irwin, 1982.

Sharpe, William F. "Mutual Fund Performance." Journal of Business, January 1966, pp. 119-138.

-----. "Adjusting for Risk in Portfolio Performance Measurement." Journal of Portfolio Management,
Winter 1975, pp. 29-34.

-----. "Asset Allocation: Management Style and Performance Measurement," Journal of Portfolio
Management, Winter 1992, pp. 7-19.

Tobin, James. "Liquidity Preference as Behavior Toward Risk." Review of Economic Studies, February
1958, pp. 65-86.

Treynor, Jack L., and Fischer Black. "How to Use Security Analysis to Improve Portfolio Selection."
Journal of Business, January 1973, pp. 66-85.

10/24/12 Morningstar's Risk‑adjusted Ratings

Morningstar's Risk-adjusted Ratings


William F. Sharpe*
Stanford University
January, 1998

Summary
The last decade has seen the rapid growth of investment via mutual funds across the globe. This has led
to a demand for simple measures of the performance of such funds. In the United States, the most
popular is the "risk-adjusted rating" (RAR) produced by Morningstar, Incorporated. This measure
differs significantly from more traditional ones such as various forms of the Sharpe ratio. This paper
investigates the properties of Morningstar's measure. We show that the RAR measure has characteristics
similar to those of an expected utility function based on an underlying bilinear utility function. This is of
some concern, since strict adherence to a goal of maximizing expected utility with such a function could
lead to extreme investment strategies. Next, we show that in practice, Morningstar varies one of the
parameters of this function in a manner that frequently leads to results similar to those that would be
obtained with the more traditional excess return Sharpe Ratio. Finally, we argue that neither
Morningstar's measure nor the excess return Sharpe Ratio is an efficient tool for choosing mutual funds
within peer groups when constructing a multi-fund portfolio -- the ostensible purpose for which
Morningstar's rankings are produced.

Introduction

This paper analyzes the characteristics of the "risk-adjusted ratings" on which Morningstar, Incorporated
bases its well-known "star ratings" and somewhat less well-known "category ratings", then compares
these measures with more traditional mean/variance measures such as the excess return Sharpe ratio.

It is common for a mutual fund family to proudly advertise that one of its funds or possibly several funds
have "received 5 stars from Morningstar". One study (note 1) found that as much as 90% of new money
invested in stock funds in 1995 went to funds with 4-star or 5-star ratings. While this may or may not be
the correct figure today, few if any advertisements announce that a fund has received 1 star. For better or
worse, Morningstar's risk-adjusted measures greatly influence U.S. investor behavior. Since they differ
significantly from traditional risk-adjusted performance measures such as various forms of the Sharpe
ratio, it is important to understand their strengths and limitations.

Ex Ante and Ex Post Performance Measurement


Mutual fund performance measures are typically based on one or more summary statistics of past
performance. Measures that attempt to take risk into account incorporate both a measure of historic
return and a measure of historic variability or loss. Since investment decisions only affect the future, the
use of historic results involves an implicit assumption that the statistics derived from past performance
have at least some predictive content for future performance. For example, a measure of average or
cumulative return over some historic period may be assumed to provide information concerning
expected return over some future period. Correspondingly, a measure of past variability or average
magnitude of loss may be assumed to provide information about future risk or the likely loss over some
future period.

While measures of historic variability can be useful for predicting future levels of risk, there is ample
evidence that measures of average or cumulative return are at best highly imperfect predictors of
expected future return. We leave questions of predictability for other papers. Our goal is to examine the
properties of Morningstar's and other measures under the heroic assumption that statistics from historic
frequency distributions are reliable predictors of corresponding statistics from a probability distribution
of future returns. In particular, we seek to relate alternative performance measures to likely investment
decisions on the grounds that one should attempt to select a performance measure that aligns well with
the decision to be undertaken, even if the relationship between the past and the future is subject to a
great deal of noise. Ultimately, of course, the goal is to use all relevant information to make unbiased
forecasts of expected returns, risks, and any other relevant characteristics of future fund performance,
then use such estimates to determine an optimal combination of investments in appropriate funds.

Our analysis of the Morningstar measures focuses on their key properties. The reader interested in
empirical analyses of these and more traditional measures as well as the similarities and differences
among them in practice will find a relatively extensive treatment in Sharpe [1997].

We begin with a description of the computations used by Morningstar.

Morningstar's Risk-adjusted Ratings

The Risk-adjusted Rating

The Risk-adjusted Rating (RAR) for a fund is calculated by subtracting a measure of the fund's relative
risk (RRisk) from a measure of its relative return (RRet):

\[ \mathrm{RAR}_i = \mathrm{RRet}_i - \mathrm{RRisk}_i \]

Relative Returns and Relative Risks

Each of the relative measures for a fund is computed by dividing the corresponding measure for the fund
by a denominator that is used for all the funds in a specified peer group. Letting g(i) represent the peer
group to which fund i is assigned:

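\[ \mathrm{RRet}_i = \frac{\mathrm{Ret}_i}{\mathrm{BRet}_{g(i)}} \]

\[ \mathrm{RRisk}_i = \frac{\mathrm{Risk}_i}{\mathrm{BRisk}_{g(i)}} \]

where \(\mathrm{BRet}_{g(i)}\) and \(\mathrm{BRisk}_{g(i)}\) denote the bases used for the relative return and relative risk of all funds in
the group in question.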

Star and Category Risk-adjusted Ratings

Morningstar calculates RAR values taking load charges into account for purposes of determining its
"star ratings". However, their newer "category ratings" omit load charges. The time periods utilized also
differ. Four sets of star ratings are computed. The first three cover the last 3, 5 and 10 years, while the
most popular (overall) measure is based on a combination of the 3-, 5- and 10-year results. In contrast, the
category ratings cover only the last 3 years (36 months).

For simplicity, we describe only the calculations for the RAR values used for the category ratings.
Sharpe [1997] provides considerable detail about the broader set of measures as well as a host of
empirical analyses of their similarities and differences.

Return

Morningstar's measure of a fund's return is the difference between the cumulative value obtained by
investing $1 in the fund over the period and the cumulative value obtained by investing $1 in Treasury
bills:

\[ \mathrm{Ret}_i = VR_i - VR_b \]

Thus if $1 invested in the fund would have grown to $1.50 in 36 months, assuming reinvestment of all
distributions, while $1 invested in Treasury bills would, with reinvestment, have grown to $1.20:

\[ \mathrm{Ret}_i = 1.50 - 1.20 = 0.30 \text{, or } 30\% \]

The Relative Return Base

Two steps are required to calculate the base to be used to calculate the relative returns for all the funds in
a group. First, the returns for all the funds in the group are averaged. If the result is greater than the
increase in value that would have been obtained with Treasury bills, the group average is used.
Otherwise, the growth in value for Treasury bills is used. Thus:

BRetg(i) = max ( mean i in g(i) [Reti], VRb - 1)

Note that for the average value of Reti to be used, the funds must do at least twice as well as Treasury
bills -- that is:

mean i in g(i) (VRi - 1) >= 2*(VRb - 1)
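
The return measure and the two-step return base described above can be sketched in Python. This is only an illustration under stated assumptions, not Morningstar's own code: the function names are hypothetical, and the inputs are assumed to be value relatives (the ending value of $1 invested over the period).

```python
from statistics import mean


def cumulative_value(monthly_returns):
    """Ending value of $1 invested with reinvestment (the value relative VR)."""
    value = 1.0
    for r in monthly_returns:
        value *= 1.0 + r
    return value


def morningstar_return(vr_fund, vr_bills):
    """Ret_i = VR_i - VR_b."""
    return vr_fund - vr_bills


def relative_return_base(fund_value_relatives, vr_bills):
    """BRet_g = max(mean of Ret_i over the group, VR_b - 1)."""
    group_mean = mean(morningstar_return(vr, vr_bills) for vr in fund_value_relatives)
    return max(group_mean, vr_bills - 1.0)
```

With the figures used in the text, morningstar_return(1.50, 1.20) is approximately 0.30. The group average is used as the base only when it exceeds VRb - 1; otherwise the Treasury bill figure is used.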

As we will show, the fact that BRetg(i) may have one of two distinct values makes it difficult to
characterize the RAR measure in general terms.

Risk

To measure a fund's risk, Morningstar first computes the fund's excess return (ER) for each month by
subtracting the return on a short-term Treasury bill from the fund's return. Next, all the positive monthly
excess returns are converted to zeros. Finally, a simple mean is taken of the resulting "monthly losses"
and the sign reversed to give a positive number (see footnote 2). Thus:

Riski = - meant ( min [ERit , 0] )

The result is defined as a measure of the fund's "average monthly loss". More strictly, it is a measure of
opportunity loss, where the foregone opportunity is investment in Treasury bills, and months in which
there was an opportunity gain are counted as periods of zero opportunity loss.
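
A minimal Python sketch of this risk measure follows. The function name is hypothetical, and the inputs are assumed to be equal-length lists of monthly fund and Treasury bill returns expressed as decimals.

```python
def average_monthly_loss(fund_returns, bill_returns):
    """Risk_i = -mean over months of min(ER_it, 0), where ER_it = R_it - R_bt."""
    excess = [rf - rb for rf, rb in zip(fund_returns, bill_returns)]
    # max(-er, 0) equals -min(er, 0): positive excess returns count as zero loss.
    losses = [max(-er, 0.0) for er in excess]
    return sum(losses) / len(losses)
```

Months with an opportunity gain contribute zero, so the result is never negative.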

The Relative Risk Base

The base used to calculate the relative risks for all the funds in a group is simply the average of all the
risk measures for the funds in that group:

BRiskg(i) = meani in g(i) [Riski]

Peer Groups

For purposes of calculating RARs, each fund is assigned to one (and only one) peer group. For its star
ratings, Morningstar uses four such groups: domestic equity, international equity, taxable bond, and
municipal bond. For its category ratings, peer groups are defined more narrowly. In mid-1997, for
example, there were 20 domestic equity categories, 9 international equity categories, 10 taxable bond
categories, and 5 municipal bond categories.

Stars and Category Ratings

While Morningstar reports relative returns, relative risks and risk-adjusted ratings, most attention is
focused on the "stars" and "category ratings" derived from the RAR values. To assign these measures,
the RARs for all the funds in a peer group are ranked; funds falling in the top 10% of the resulting
distribution are given 5 stars (or a category rating of 5), those in the next 22.5% get 4, those in the next
35% get 3, those in the next 22.5% get 2, and those in the bottom 10% get 1.
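
The percentile scheme above can be sketched as follows. The treatment of ties and of funds falling exactly on a boundary is not specified in the text, so the rule used here (rating by the fraction of funds ranked strictly above) is an assumption, as is the function name.

```python
def assign_ratings(rars):
    """Map {fund: RAR} to {fund: rating from 1 to 5} within one peer group."""
    # Cumulative cutoffs from the top: 10%, 32.5%, 67.5%, 90%.
    cutoffs = [(0.10, 5), (0.325, 4), (0.675, 3), (0.90, 2)]
    ranked = sorted(rars, key=rars.get, reverse=True)
    n = len(ranked)
    ratings = {}
    for position, fund in enumerate(ranked):
        fraction_above = position / n  # share of funds ranked strictly higher
        ratings[fund] = next(
            (stars for limit, stars in cutoffs if fraction_above < limit), 1
        )
    return ratings
```

Because only the ordering of RAR values enters this step, the ratings discard the magnitudes of the underlying measures.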

Mean-Variance Measures

Expected Utility

Most academic treatments of risk and return are based on the mean-variance approach developed in
Markowitz [1952]. Markowitz argued that the desirability of a probability distribution of portfolio
returns should be summarized using the first two moments: the expected return and the standard
deviation of return (or its square, the variance of return). The ex post counterparts are the arithmetic
mean return, which we will denote Mi for fund i and the standard deviation of historic returns, which we
will denote Si.

For an investor who chooses only one mutual fund, the fund's return will equal his or her overall
portfolio return. In this very special case, if the investor follows Markowitz' prescriptions, the expected
utility of a portfolio invested solely in fund i can be written as:

EUi = Mi - rk * Si^2

where rk is a measure of investor k's risk aversion -- that is, his or her marginal rate of substitution of
mean return for variance of return. The goal of such an investor is to select the one fund for which this
measure is the greatest, under the maintained assumption that historic returns are appropriate predictors
of future returns.

While this type of expected utility function is widely used for optimization analyses, it is rarely chosen
for ex post performance measurement. In part this is due to the fact that it only applies strictly when all
an investor's funds are to be allocated to one single risky investment. Even more limiting, however, is
the fact that in principle no universal measure of this type can be used by all investors. Rather, each
investor must evaluate performance using a measure designed for his or her degree of risk aversion (rk ).

The Excess Return Sharpe Ratio

In an important contribution to investment theory, Tobin [1958] showed that combining a riskless
investment with a risky one provides an opportunity set in which expected excess return is proportional
to return standard deviation. This implies that an investor able to borrow or lend at a given rate and who
is planning to hold only one mutual fund plus borrowing or lending should select the fund for which the
ratio of expected excess return to standard deviation is the highest. This ratio is generally termed the
Sharpe ratio, based on its introduction in Sharpe [1966]. As shown in Sharpe [1994], the key properties
of the original measure apply more broadly to any "zero-investment strategy" such as that given by the
difference between the returns on any two investments. To avoid confusion, we refer to the measure
based on excess returns as the excess return Sharpe ratio (ERSR). Letting Rbt represent the return on a
riskless security, the excess return Sharpe Ratio for fund i is:

ERSRi = meant (Rit - Rbt) / stdevt (Rit - Rbt)

Ex ante, Rb is a fixed constant, so that:

ERSRi = (Mi - Rb ) / Si

Ex post, the more complete formula is typically employed to account for any variation in Rb .
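
The ex post formula can be sketched in Python as below. The text does not say whether the standard deviation uses T or T-1 in the denominator; the sample version (T-1) from the standard library is used here as an assumption, and the function name is hypothetical.

```python
from statistics import mean, stdev


def excess_return_sharpe_ratio(fund_returns, bill_returns):
    """ERSR_i = mean_t(R_it - R_bt) / stdev_t(R_it - R_bt)."""
    excess = [rf - rb for rf, rb in zip(fund_returns, bill_returns)]
    # stdev() is the sample standard deviation (divides by T - 1).
    return mean(excess) / stdev(excess)
```

Both statistics are computed from the same series of monthly differential returns, so the ratio is expressed in monthly terms.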

The goal of an investor able to borrow or lend at a fixed rate but planning to hold only one risky mutual
fund is to select the fund with the greatest ex ante ERSRi since a strategy employing it with the
appropriate amount of leverage can provide the greatest possible expected return for any desired level of
risk. As with other measures, of course, selection of a fund with the highest ex post excess return Sharpe
ratio is only appropriate under the maintained assumption that the historic return distribution is a good
predictor of the future probability distribution.

Excess return Sharpe ratios are often used as measures of mutual fund performance, partly because they
are less limited in applicability than mean variance expected utility measures. Importantly, under the
assumptions on which the argument is based, the fund with the greatest Sharpe ratio is the best for any
investor, regardless of his or her degree of risk aversion. In this sense, the measure is universal. Strictly,
of course, the ratio is suitable only for cases in which an investor plans to invest funds in a single risky
asset plus (possibly) borrowing or lending. Thus it is slightly more general (two investments rather than
one), but still potentially inappropriate for a more typical portfolio involving multiple risky funds.

RAR as an Expected Utility Function

Expected Utility

As shown, a fund's RAR is the difference between two relative measures:

RARi = [ Reti / BRetg(i) ] - [ Riski / BRiskg(i) ]

Rearranging slightly gives:

RARi = (1 / BRetg(i) ) * [ Reti - ( BRetg(i) / BRiskg(i) ) Riski ]

Note that both the first and second parenthesized expressions are the same for all the funds in a given
group. Since the first term must be positive, both the rankings of funds within a group and the relative
magnitudes of their ratings will be unaffected if this term is omitted. Denoting the second parenthesized
expression as kg(i) gives a re-scaled RAR of the form:

RRARi = Reti - kg(i) * Riski
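
The rearrangement can be checked numerically. In this sketch the function names and the sample figures are hypothetical; the bases are passed in directly and the return base is assumed to be positive, as in the text.

```python
def rar(ret, risk, base_ret, base_risk):
    """RAR_i = Ret_i / BRet_g - Risk_i / BRisk_g."""
    return ret / base_ret - risk / base_risk


def rescaled_rar(ret, risk, base_ret, base_risk):
    """RRAR_i = Ret_i - k_g * Risk_i, with k_g = BRet_g / BRisk_g."""
    k = base_ret / base_risk
    return ret - k * risk


# Hypothetical (Ret, Risk) pairs for three funds in one peer group.
funds = {"A": (0.30, 0.010), "B": (0.20, 0.004), "C": (0.10, 0.016)}
base_ret, base_risk = 0.20, 0.010

by_rar = sorted(funds, key=lambda f: rar(*funds[f], base_ret, base_risk))
by_rrar = sorted(funds, key=lambda f: rescaled_rar(*funds[f], base_ret, base_risk))
```

Since RRAR equals RAR multiplied by the positive constant BRet, the two orderings coincide.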

It is tempting to interpret this modified function as a measure of the expected utility of fund i for an
investor with a risk aversion of kg(i), where risk aversion is a measure of the investor's marginal rate of
substitution of Reti for Riski. Under this interpretation, kg(i) would represent the risk aversion of all
investors who select funds in the group in question. We address the relevance of such an assumption
later. For now we take RRAR as a measure of expected utility.

Periodicity

Sharpe ratios use standard statistics from a frequency distribution of differential returns. For example, the
first two moments of the probability distribution of next month's excess return might be assumed to be
similar to the same moments from the frequency distribution of the last 36 months' excess returns.
Importantly, the same time period (e.g. monthly) is used for both statistics.

Morningstar's risk measure has a similar character. Each monthly loss is given the same weight, with the
average value presumably used as a surrogate for the expected value of next month's loss. However, the
measure of return is the difference between two cumulative values taken over the complete historic
period. The properties of such a statistic are complex, since it represents the difference between two
value relatives, each of which can be considered to equal the result obtained by raising [1 plus the
geometric mean return] to the T'th power, where T is the number of months in the overall period. Since
the geometric mean of a series of returns is a function of both the arithmetic mean and the variance of
the series, the resulting return measure includes aspects of both return and risk.

Among other things, this makes the statistical properties of Morningstar's measure highly complex,
seriously compromising the analyst's ability to estimate likely ranges of future performance, given
historic results. This contrasts with the Sharpe ratio, which is a simple transform of the standard t-statistic
for measuring the statistical significance of the difference between a realized mean value and zero and
hence easily used in this manner.

We explore further implications of Morningstar's calculation in greater detail below. For now, we
consider a modification that would make the RAR measure internally consistent. In particular, we use as
a measure of return the difference between the fund's arithmetic mean monthly return and the arithmetic
mean return on Treasury bills; we also modify the procedure used to calculate the relative return base
accordingly:

MRARi = MReti - mkg(i) * Riski

where :

MReti = meant (Rit - Rbt)

In this measure, mkg(i) is the marginal rate of substitution of mean monthly excess return for mean loss,
given by:

mkg(i) = MBRetg(i) / BRiskg(i)

where:

MBRetg(i) = max ( meani in g(i) [MReti], meant [Rbt] )

Except in extreme cases, the relative MRARi values for the funds within a peer group will be similar to
those obtained using Morningstar's actual procedures (that is, the corresponding RRARi or RARi
values). In the following analysis, we assume that MReti, BRetg(i) and kg(i) are computed using
arithmetic monthly mean values. This allows us to obtain precise analytic results. Fortunately, the main
qualitative conclusions apply as well to the more complex measures utilized by Morningstar.

The Bilinear Utility Function

Consider an investor with a von Neumann-Morgenstern utility function of the form:

U = a* (Ri - Rb ) if Ri <= Rb , and

U = (Ri - Rb ) if Ri > Rb

where Ri is the return on fund i, Rb is the return on Treasury bills, and a is a constant greater than one.

An example of such a function in which Rb = 5% and a = 3 is shown in Figure 1. As can be seen, it is
composed of two linear segments, with a greater slope to the left of Rb than to the right. Such a function
exhibits risk-aversion in the large, since the loss in utility associated with a return below Rb is greater
than the gain in utility associated with a return equally far above Rb . However, within return ranges that
lie wholly above or wholly below Rb , the function is linear and thus reflects risk-neutrality.

Figure 1: A Bilinear Utility Function

A bilinear function of this sort captures one of the three salient features of the prospect theory of
decision-making under uncertainty derived by Kahneman and Tversky [1979] from observation of
choices made by subjects in experimental settings. An individual with such a function experiences loss-
aversion, where loss is measured from a reference point determined by the current riskless rate of return
Rb . More precisely, the function can be said to reflect opportunity loss aversion, with the value of the
parameter a providing a measure of the degree of such aversion and the riskless rate acting as the
reference point or alternative investment opportunity.

Maximizing Expected Utility with a Bilinear Utility Function


Now consider an investor with a bilinear utility function who wishes to determine the expected utility of
a given mutual fund over a future period.

To begin, we rewrite the formula for the utility function as:

U = Ri - Rb + [(a - 1) *(Ri - Rb ) if Ri <= Rb and 0 otherwise]

The expected value of U will thus be:

E(U) = E( Ri - Rb ) + (a - 1)* E ( Li)

where:

Li = Ri - Rb if Ri <= Rb , and

Li = 0 if Ri > Rb

Note that Li is exactly equal to Morningstar's monthly loss figure.

Let there be T possible future returns, each equally likely to be realized. Then the expected values are
simply arithmetic means, and:

E(U) = mean ( Ri - Rb ) + (a - 1)* mean( Li)

Substituting historic excess returns for future returns gives a measure that would be precisely equal to
Morningstar's RAR if the latter used arithmetic mean monthly excess returns for its return calculations.
Since the differences due to this disparity are likely to be small, Morningstar's RAR measure is, in form,
highly similar, if not identical, to the measure that would be chosen by an investor who wishes to maximize
the expected value of a bilinear utility function but has decided to invest in only one mutual fund.

Loss-aversion

Compare the equation for expected utility with our modified version of Morningstar's RAR measure:

MRARi = Reti - kg(i) * Riski

Thus it is approximately true that:

a = 1 + kg(i)

Since kg(i) is positive, the investor will exhibit opportunity loss aversion, with the magnitude of aversion
greater, the larger is kg(i).
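
The correspondence a = 1 + kg(i) can be verified numerically. The sketch below is illustrative only: the function names are hypothetical, each historic month is treated as an equally likely outcome, and returns are given as decimals.

```python
from statistics import mean


def bilinear_utility(r, rb, a):
    """U = a*(R - Rb) if R <= Rb, else (R - Rb)."""
    excess = r - rb
    return a * excess if excess <= 0 else excess


def expected_bilinear_utility(fund_returns, bill_returns, a):
    """Mean utility over equally likely outcomes."""
    return mean(bilinear_utility(r, rb, a) for r, rb in zip(fund_returns, bill_returns))


def modified_rar(fund_returns, bill_returns, k):
    """Mean monthly excess return minus k times the average monthly loss."""
    excess = [r - rb for r, rb in zip(fund_returns, bill_returns)]
    risk = mean(max(-er, 0.0) for er in excess)
    return mean(excess) - k * risk
```

For any return history, expected_bilinear_utility(..., a) and modified_rar(..., k) agree when a = 1 + k, up to floating-point rounding.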

Optimal Investment Choice for an Investor with a Bilinear Utility Function

While the bilinear utility function has at least one attractive property, on closer examination it implies
extreme investment choices under plausible circumstances, as we now show.

Consider a strategy in which a proportion of an investor's wealth equal to x is placed in risky fund i and
a proportion equal to (1-x) is placed in a riskless asset. The mean and standard deviation of the strategy's excess
return will be given by x*(Mi - Rb) and x*Si, respectively. Since both measures are linear in scale, their ratio is
scale-independent. Thus the excess return Sharpe ratio for the strategy will equal that of the fund itself.
Indeed, it is the fact that Sharpe ratios are scale-independent that makes them attractive as measures of
performance.

For such strategies, both of Morningstar's measures are also proportional to scale. Recall that:

Reti = VRi - VRb

Letting TRi and TRb represent the total compound return for fund i and bills, respectively, over the
period covered:

Reti = ( 1 + TRi ) - ( 1 + TRb ) = TRi - TRb

For a strategy in which x is invested in fund i and (1-x) in Treasury bills:

Retx = [ x*(1 + TRi ) + ( 1-x)*( 1 + TRb )] - (1 + TRb ) = x*(TRi - TRb ) = x*Reti

A similar relationship holds for the average loss measure. In months for which Ri <= Rb :

L = x*(Ri - Rb )

while for months for which Ri > Rb :

L = 0 = x*0

Hence for the strategy in which x is invested in fund i and (1-x) in Treasury bills:

Riskx = x * Riski
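
Both proportionality results can be checked with a short sketch. The function names are hypothetical, and x is assumed to be non-negative (x greater than 1 corresponds to borrowing); for a short position the months counted as losses would change and the risk relationship would not hold.

```python
def strategy_ret(x, tr_fund, tr_bills):
    """Ret_x for x in the fund and (1 - x) in bills, using total period returns."""
    ending_value = x * (1.0 + tr_fund) + (1.0 - x) * (1.0 + tr_bills)
    return ending_value - (1.0 + tr_bills)


def strategy_risk(x, fund_returns, bill_returns):
    """Average monthly loss of the mixed strategy; its monthly excess return is x*(Ri - Rb)."""
    excess = [x * (r - rb) for r, rb in zip(fund_returns, bill_returns)]
    return sum(max(-er, 0.0) for er in excess) / len(excess)
```

For example, strategy_ret(0.5, 0.50, 0.20) is approximately 0.15, half of the fund's own Ret of 0.30.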

The fact that both Morningstar's measures are proportional to scale implies that by combining a risky
fund with borrowing or lending, an investor can attain any point on a linear opportunity set in Retx -
Riskx space. Faced with such a tradeoff, what choice will be made by an investor with a bilinear utility
function?

Figure 2 shows three possible outcomes. In each case, the opportunity set is shown by the red line. The
green lines are representative iso-expected utility lines. All combinations of risk and return along any
such line provide the same expected utility, with higher lines representing greater expected utility than
lower lines. Each investor's objective is to find the feasible point (on the red line) with the highest
expected utility (on the highest attainable green line). The three figures represent investors with different
degrees of risk aversion. The investor in the left-hand panel is the most risk averse; the investor in the
right-hand panel is the least risk averse; the investor in the middle panel has an intermediate degree of
risk aversion.

Figure 2: Investment Choice for Three Investors with Bilinear Utility Functions

Note that for two out of the three investors the optimal choice is an extreme one. The conservative
investor invests solely in Treasury bills, while the aggressive investor puts as much as possible in the
mutual fund, borrowing to the maximum allowable limit. Only for an investor with risk aversion
precisely equal to the available risk-return tradeoff is any interior strategy optimal, and any such investor
is totally indifferent to the degree of leverage involved.
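
The corner solutions follow because the strategy's expected utility, x*Reti - k*(x*Riski), is linear in x. The grid search below is a hypothetical illustration; the figures and the borrowing limit x_max are assumptions.

```python
def best_allocation(ret, risk, k, x_max, steps=100):
    """Grid-search the fund weight x in [0, x_max] that maximizes x*ret - k*(x*risk)."""
    grid = [x_max * j / steps for j in range(steps + 1)]
    return max(grid, key=lambda x: x * ret - k * (x * risk))


# Hypothetical fund offering a return/risk tradeoff of 0.06 / 0.02 = 3.
conservative = best_allocation(ret=0.06, risk=0.02, k=5.0, x_max=1.5)
aggressive = best_allocation(ret=0.06, risk=0.02, k=1.0, x_max=1.5)
```

An investor with k above the available tradeoff holds no fund at all, while one with k below it borrows to the limit. Only when k equals the tradeoff is every x equally good.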

Such choices are clearly inconsistent with the observed behavior of the vast majority of investors, calling
into serious question the assumption that investors have utility functions as simple as that of the bilinear
form. The problem is mitigated slightly in settings in which many investment options are available and
multiple funds may be selected. However, even in such cases, the efficient opportunity set is likely to be
close to linear, leading to very similar results.

Note that these objections apply as well to a function in which expected utility is a linear function of
mean (Mi) and standard deviation (Si). The problem does not arise, however, using the Markowitz
formulation in which expected utility is a linear function of mean and variance, since the implied iso-
expected utility curves increase at an increasing rate in mean/standard deviation space. As shown in
Figure 3, such preferences lead to interior investment choices, even when the efficient portion of the
opportunity set is linear.
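
For contrast, a sketch of the mean-variance case: with x in the fund and (1 - x) in the riskless asset, expected utility (dropping the constant Rb) is x*e - r*(x*s)^2, where e = Mi - Rb and s = Si. Setting the derivative to zero gives x = e / (2*r*s^2). The function names and figures are hypothetical.

```python
def mean_variance_utility(x, mean_excess, sd, risk_aversion):
    """x*e - r*(x*s)^2, the part of expected utility that depends on x."""
    return x * mean_excess - risk_aversion * (x * sd) ** 2


def optimal_mean_variance_allocation(mean_excess, sd, risk_aversion):
    """Interior optimum from the first-order condition e - 2*r*x*s^2 = 0."""
    return mean_excess / (2.0 * risk_aversion * sd ** 2)


x_star = optimal_mean_variance_allocation(mean_excess=0.05, sd=0.15, risk_aversion=2.0)
```

Because the penalty grows with the square of x, the optimum varies smoothly with risk aversion instead of jumping between the two extremes.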

Figure 3: Investment Choice for an Investor with a Mean-Variance Utility Function

RAR as a Function of Mean and Variance


While Morningstar's RAR measure differs considerably from a utility function based on a fund's mean
and variance of return, it is likely to be well approximated by a function of these more traditional
measures.

Morningstar Return as a function of mean and variance

To begin, consider Reti. It is the difference between the value relative for the fund and that for Treasury
bills. But the value relative over T periods will equal one plus the geometric mean return (G) to the T'th
power. Thus:

Reti = (1 + Gi)^T - (1 + Gb)^T

A close approximation for the geometric mean of a series is given by subtracting one-half the variance
from the arithmetic mean. Thus:

Reti = (1 + Mi - Si^2 / 2)^T - (1 + Mb - Sb^2 / 2)^T

As can be seen, Morningstar's return measure incorporates aspects of both mean return and risk
(standard deviation of return), with Reti increasing in Mi and decreasing in Si. Given knowledge of Mi
and Si, one can clearly obtain a good estimate of Reti.
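
The quality of the geometric mean approximation can be examined with a short sketch. The return series is hypothetical, and the population variance is used as an assumption, since the text does not specify which estimator is intended.

```python
import math
from statistics import mean, pvariance


def geometric_mean_return(returns):
    """G such that (1 + G)^T equals the value relative over T periods."""
    return math.prod(1.0 + r for r in returns) ** (1.0 / len(returns)) - 1.0


def approx_geometric_mean(returns):
    """Arithmetic mean minus one-half the variance."""
    return mean(returns) - pvariance(returns) / 2.0


monthly = [0.04, -0.02, 0.03, 0.01, -0.01, 0.02]  # hypothetical monthly returns
exact = geometric_mean_return(monthly)
approx = approx_geometric_mean(monthly)
```

For this series the two values differ by far less than one basis point per month, and both lie below the arithmetic mean.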

Morningstar Risk as a function of mean and variance

The situation is not as clear-cut for Riski. In general it will depend on both the shape of the return
distribution and its moments. Letting prx be the probability of state of the world x and ERix the excess
return on fund i in state x, the expected loss (Riski) for fund i is defined as:

Riski = - sumx [ prx * min (ERix , 0) ]


Consider now the situation in which the mean and variance of the distribution of excess returns are
sufficient statistics to identify the entire distribution. This is the case, for example, if returns are normally
distributed. Under this assumption:

Riski = f [ Mi-Rb , Si ]

since Mi-Rb is the mean of the excess return distribution and Si is its standard deviation (assuming that
Rb is known).

Using a relationship given in Triantis and Hodder [1990], it can be shown (see footnote 3) that for a normal
distribution:

Riski = f [ Mi-Rb , Si ] = Si * n(-z) - (Mi - Rb ) * N(-z)

where:

z = ( Mi-Rb ) / Si

Here, n(z) denotes the standard normal density function while N(z) denotes the standard cumulative
normal.

Empirical evidence given in Sharpe [1997] indicates that monthly return distributions for diversified
mutual funds may be sufficiently close to normal to make this approximation quite accurate.
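
The closed-form expression for a normal distribution can be compared with a direct numerical integration of the expected loss. This is a sketch under stated assumptions: the function names are hypothetical, and the integration simply applies the midpoint rule over the region of negative excess returns.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_loss_normal(mean_excess, sd):
    """Risk = S*n(-z) - (M - Rb)*N(-z), with z = (M - Rb) / S."""
    z = mean_excess / sd
    return sd * normal_pdf(-z) - mean_excess * normal_cdf(-z)


def expected_loss_numeric(mean_excess, sd, steps=100_000):
    """Midpoint-rule integral of -x * density(x) over negative excess returns x."""
    lower = min(mean_excess - 10.0 * sd, -10.0 * sd)
    width = (0.0 - lower) / steps
    total = 0.0
    for j in range(steps):
        x = lower + (j + 0.5) * width
        density = normal_pdf((x - mean_excess) / sd) / sd
        total += -x * density * width
    return total
```

With a zero mean excess return the formula reduces to S times n(0), roughly 0.4 times the standard deviation.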

Morningstar RAR as a function of mean and variance

If both Riski and Reti are well approximated as functions of Mi and Si, then RARi will be also.

Figure 4 shows the relationship between RAR and various combinations of e (expected annual excess
return) and sd (standard deviation of annual excess return) using the approximations given above for a
case in which the riskless rate of interest is 5% per year, the holding period is 3 years, and the peer
group has an average excess return of 5% and a standard deviation of 15%. As can be seen, the
relationship is monotonic and very close to linear in the region shown, which includes likely
combinations for popular investment strategies.

Figure 4: RAR as a Function of Expected Excess Return and Standard Deviation

The high degree of linearity of the relationship in Figure 4 can be seen more clearly in Figure 5, which
shows a few of the associated iso-RAR curves. Clearly an investor who wishes to maximize RAR is
likely to select an extreme solution unless the opportunity set is highly non-linear.

Figure 5: Iso-RAR Curves

Recall that a portfolio is said to be mean-variance efficient if it provides the maximum possible mean for
a given level of variance and the minimum possible variance for a given level of mean. Equivalently,
fund A is said to be inefficient if there exists another fund B with (1) the same expected return but less
risk, (2) the same risk but more expected return, or (3) less risk and more expected return. With
functions such as those shown in Figures 4 and 5, in each such case, fund B would also have a higher
RAR value than fund A if the approximations held. Thus it would be appropriate to exclude from
consideration portfolios that are inefficient using the mean-variance criterion even if the ultimate goal
were to select a portfolio with the largest possible RAR value.

These relationships imply that the key differences between Morningstar's measures and those used in
more traditional mean-variance analyses concern (1) the use of a linear combination of a return measure
and a risk measure, rather than a ratio of the two and/or (2) the use of risk per se rather than risk-squared
in the linear measure. The use of a multi-period value relative and a measure of average loss is thus of
secondary importance in terms of implications for fund selection.

These results provide an illustration of our earlier assertion that Morningstar's actual RAR calculations
give implications for investment choice very similar to those obtained using the simpler modified
(MRAR) measure. Moreover, they suggest that if monthly returns are close to normally distributed, a
choice based on a RAR measure will differ from one based on the use of a traditional mean-variance
approach only in the selection of an extreme point on the mean-variance efficient frontier rather than an
interior point on that same frontier. This is unfortunate since a preference for extreme risk-return
combinations is inconsistent with investor behavior. In effect, the RAR measure assumes that an
investor's marginal rate of substitution of expected return for risk is the same, no matter what the level of
his or her portfolio's return or risk. This is inconsistent with observed behavior -- both in this context and
in more general cases involving choices among competing alternatives.

RARs and Excess Return Sharpe Ratios


Clearly, there are conceptual differences between rankings of funds based on RAR values and excess
return Sharpe Ratios. This can be seen in Figure 6, which shows selected iso-excess return Sharpe Ratio
lines (iso-SR for short) in red and selected mean-variance approximations of iso-RAR curves in green.

Figure 6: Iso-Excess Return Sharpe Ratio Lines and Approximate Iso-RAR Curves

To assess the likely magnitudes of such differences, consider a selected mutual fund, X, and the iso-
RAR and iso-SR lines on which it lies. Figure 7 shows a case in which fund X has an expected return
of 10% and a standard deviation of 15%.

Now consider the set of all funds that are better than X based on the RAR criterion. They will lie above
the green line in Figure 7. Similarly, the set of all funds that are worse than X based on the RAR criterion
will lie below the green line. On the other hand, funds that are better than X based on the ERSR
criterion will lie above the red line and those that are worse will lie below the red line.

Figure 7: The Iso-SR and Iso-RAR Lines for a Single Fund

Obviously, the sets of funds rated better or worse than X may be different, depending on the criterion
used. However, the differences may be relatively few. Figure 8 shows the regions in which the criteria
give different results. Any fund plotting in the blue area will have a higher RAR than fund X but a
lower ERSR. Any fund plotting in the yellow area will have a lower RAR than fund X but a higher
ERSR. However, for all funds that plot above both lines or below both lines, the criteria will lead to the
same conclusion. In general, the closer the slopes of the two lines, the fewer will be the disparities in
rankings between the two criteria.

Figure 8: Regions in Which the SR and RAR Criteria Conflict

Now, recall the procedures used to compute Morningstar's RAR measures. As we have shown, the
slope of the iso-RAR curve is given by the ratio of the return base to the risk base. If the period used for
the computation has been one in which the average return for the funds in the relevant peer group has
been sufficiently high (greater than two times the return on Treasury bills), the return base will equal the
mean excess return for the funds in the peer group. In every case the risk base is the mean risk for the
funds in the peer group. Let a fund (A) have a mean excess return and standard deviation of return
equal, respectively, to the corresponding average value for all the funds in its peer group. This implies
that under such conditions, by construction, the mean-variance approximation to the iso-RAR line for
fund A will be coincident with the iso-SR line for the fund.

In such circumstances, the sets of funds that are better and worse than fund A will be the same, no
matter which criterion is used. The same can be said about any fund that plots on fund A's iso-SR (and
iso-RAR) line -- that is, any fund with the same ERSR as a fund with the average risk and return for the
peer group. In practice, funds are likely to cluster reasonably closely around this line. Hence we might
well expect that for peer groups with good average historic performance, rankings based on
Morningstar's RAR measure might be relatively similar to those based on the more traditional excess
return Sharpe Ratio.

Figure 9, taken from Sharpe [1997], shows that this can indeed be the case. Each point represents the
ranking of one of 1,286 diversified equity funds within its category peer group, based on performance
from 1994 through 1996. The correlation coefficient was 0.986, showing that despite substantial
differences in computational procedures, Morningstar's approach and the simpler excess return Sharpe
Ratio do indeed give similar results in times such as the 1994-1996 period of relatively high returns for
U.S. equity funds.

Figure 9: Rankings Based on Morningstar's Category RARs and Excess Return Sharpe Ratios

While these results are quite striking, it is important to note that they apply to a situation in which returns
were high and Morningstar's procedure therefore utilized the mean returns of the peer groups for the
return bases in the calculations. Since ex post returns are used for the performance measures, there can
be situations in which the average return for a peer group is small or even negative. In such cases,
Morningstar sets the return base at the level obtained by Treasury bills. This may well lead to a greater
disparity in rankings based on the Morningstar and Sharpe Ratio measures.

Figure 10 shows an extreme version of such a situation. Here, both funds X and Y have performed
poorly. However, fund Y had a better (algebraically greater, or less negative) excess return Sharpe Ratio
than fund X, as shown by the fact that it lies on a higher iso-SR (red) line. On the other hand,
Morningstar's RAR measure assigns a better rating to fund X than to fund Y, since X provided a better
average return and a lower risk, leading the fund to plot on a higher iso-RAR (green) line.

Figure 10: Performance of Two Funds in Bad Times

This example makes very clear the differences in the questions that the two measures attempt to answer.
We have argued that the RAR measure is best seen as an attempt to determine the best single fund on
the assumption that only one fund is to be held in the investor's portfolio. In this context, X was certainly
better (here, less bad) than Y. Moreover, this would be true for any (positive) degree of investor risk-
aversion (slope of the iso-RAR lines). However, this is not the setting for which the excess return
Sharpe Ratio was developed. It is intended for situations in which an investor can use borrowing or
lending to achieve his or her desired level of risk. In this context, the excess return Sharpe Ratio gives
the more appropriate answer. An investor who desired a level of risk of, say, 10% would have held
either fund X or a 50/50 combination of fund Y and lending at the riskless rate (here, 5%). The latter
strategy, shown by point Y' in Figure 10, was clearly better than investment in fund X, as shown by its
greater excess return Sharpe Ratio.
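
The comparison can be made concrete with a small sketch. Only the 5% riskless rate, the 10% risk level, and the 50/50 mix come from the text; the mean returns assigned to X and Y below are hypothetical values chosen to reproduce the situation described, and the function names are likewise assumptions.

```python
def sharpe(mean_return, sd, riskless):
    """Excess return Sharpe ratio from summary statistics."""
    return (mean_return - riskless) / sd


def mix_with_lending(x, mean_return, sd, riskless):
    """Mean and standard deviation of x in the fund and (1 - x) lent at the riskless rate."""
    return x * mean_return + (1.0 - x) * riskless, x * sd


riskless = 0.05
fund_x = (0.00, 0.10)    # hypothetical: mean return 0%, standard deviation 10%
fund_y = (-0.01, 0.20)   # hypothetical: mean return -1%, standard deviation 20%
y_prime = mix_with_lending(0.5, *fund_y, riskless)
```

Under these assumed figures Y' has the same 10% risk as X but a mean return of about 2% rather than 0%, and it inherits Y's less negative Sharpe ratio.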

Multi-Fund Portfolios
Morningstar's measure is best suited to answer questions posed by an investor who places all his or her
money in one fund. The excess return Sharpe Ratio is best suited to answer questions posed by an
investor who allocates money between one fund and borrowing or lending. Neither type of investor
should be interested in ranking funds within peer groups -- indeed such rankings conceal information
about the relative magnitudes of the underlying variables that is crucial for such an investor.

Why then does Morningstar present its risk-adjusted ratings in terms of rankings of funds within peer
groups? The only plausible answer is that investors are assumed to have some other basis for allocating
funds across peer groups and plan to use Morningstar's rankings as at least an important input when
deciding which fund or funds to choose from each peer group. In such a situation, neither Morningstar's
measure nor the excess return Sharpe Ratio is an appropriate performance measure. The reason is
simple. When evaluating the desirability of a fund in a multi-fund portfolio, the relevant measure of risk
is its contribution to the total risk of the portfolio. This will depend on the fund's total risk and, more
importantly, in most cases, on its correlation with the funds in the remainder of the portfolio. Neither the
Morningstar RAR measure nor the excess return Sharpe Ratio incorporates any information about such
correlation. Excessive reliance on either measure in such a decision process could seriously diminish the
effectiveness of the resulting multi-fund portfolio.
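
To illustrate why correlation matters, the following sketch adds a 20% position in a candidate fund to an existing portfolio. All figures (a 15% risk for the rest of the portfolio, a 20% risk for the candidate, and the two correlations) are hypothetical. Two candidates with identical total risk, and hence indistinguishable on that dimension by either single-fund measure, have very different effects on the risk of the whole.

```python
import math


def combined_risk(weight_new, risk_rest, risk_new, correlation):
    """Standard deviation of a portfolio holding weight_new in a new fund
    and (1 - weight_new) in the rest of the portfolio."""
    weight_rest = 1.0 - weight_new
    variance = (
        (weight_rest * risk_rest) ** 2
        + (weight_new * risk_new) ** 2
        + 2.0 * weight_rest * weight_new * risk_rest * risk_new * correlation
    )
    return math.sqrt(variance)


# Hypothetical inputs: both candidates have the same total risk (20%).
risk_high = combined_risk(0.2, 15.0, 20.0, correlation=0.9)
risk_low = combined_risk(0.2, 15.0, 20.0, correlation=0.1)

print(f"portfolio risk with highly correlated fund:  {risk_high:.2f}%")
print(f"portfolio risk with weakly correlated fund:  {risk_low:.2f}%")
```

Under these assumptions the weakly correlated fund lowers total portfolio risk below the original 15%, even though it is riskier than the rest of the portfolio when held alone.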

There are some very special cases in which a different single measure of fund performance may be
useful when constructing an optimal multi-fund portfolio. For example, Sharpe [1994] shows that the
Selection Sharpe Ratio, based on the difference between a fund's return and that of an appropriate asset
class benchmark, may be used if long and short positions in asset classes can be taken as needed.
However, the preconditions for this special case may not be met in many cases, and even if they are,
there can be significant differences between rankings based on excess return Sharpe Ratios and
Selection Sharpe Ratios. Given the relationships between RARs and excess return Sharpe Ratios,
rankings based on Selection Sharpe Ratios will also differ considerably from those based on RARs.
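
A minimal sketch of how the two ratios can rank funds differently follows. It assumes, as a simplification, that each ratio is computed historically as the mean of a differential return divided by its standard deviation, with the riskless rate as the reference for the excess return version and an asset class benchmark as the reference for the selection version. The return series, benchmarks and riskless rate are all hypothetical.

```python
from statistics import mean, stdev


def sharpe_ratio_of_differences(returns, reference_returns):
    """Mean of (return - reference) divided by its sample standard deviation."""
    diffs = [r - b for r, b in zip(returns, reference_returns)]
    return mean(diffs) / stdev(diffs)


# Hypothetical monthly returns, in percent.
riskless = [0.4] * 6
stock_benchmark = [2.0, -1.0, 3.0, -2.0, 2.5, 1.5]
bond_benchmark = [0.9, 0.7, 1.0, 0.6, 0.9, 0.7]
fund_a = [2.3, -0.9, 3.3, -1.9, 2.8, 1.6]   # stock fund, beats its benchmark
fund_b = [0.9, 0.5, 1.0, 0.4, 0.9, 0.5]     # bond fund, lags its benchmark

excess_a = sharpe_ratio_of_differences(fund_a, riskless)
excess_b = sharpe_ratio_of_differences(fund_b, riskless)
selection_a = sharpe_ratio_of_differences(fund_a, stock_benchmark)
selection_b = sharpe_ratio_of_differences(fund_b, bond_benchmark)

print(f"excess return Sharpe Ratio: A={excess_a:.2f}  B={excess_b:.2f}")
print(f"Selection Sharpe Ratio:     A={selection_a:.2f}  B={selection_b:.2f}")
```

In this constructed example the bond fund ranks first on the excess return Sharpe Ratio, while the stock fund ranks first on the Selection Sharpe Ratio.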

In many if not most cases, the use of any procedure for ranking funds within peer groups, followed by
selection of one or more funds from each of several peer groups based on such rankings, is likely to be
suboptimal, and possibly highly suboptimal.

Conclusions
We have shown that Morningstar's RAR measure has a number of drawbacks. It is complex, with poor
statistical qualities. More importantly, it fails to capture an important aspect of investor preferences --
increasing aversion to risk -- and the resulting desire for portfolios that are neither the least nor the most risky
available. Fortunately, the inherent disadvantages are mitigated to a considerable extent by Morningstar's
practice of adjusting the risk-aversion implicit in the measure to equal the ratio of return to risk for each
peer group over the specific period covered, although this adjustment is made only in part if the peer
group performance has been modest or poor. While this procedure makes the measure even more time- and
sample-dependent, it has the advantage of aligning rankings rather well with those that would be
obtained using the more familiar, less complex and statistically more straightforward excess return
Sharpe Ratio.

Given a choice between Morningstar's RAR measure and the excess return Sharpe Ratio, the evidence
would seem to favor the latter. However, a more appropriate choice would involve either a different
performance measure or none at all. If it is possible to costlessly separate fund selection from asset
allocation by taking long and short positions in index funds representing "pure asset plays", funds may
usefully be evaluated based on their projected Selection Sharpe Ratios. Such measures take into account
only a fund's non-asset-related expected return and risk. Typically, rankings based on Selection Sharpe
Ratios will differ considerably from those based on Morningstar's measures or excess return Sharpe
Ratios. So, of course, will the resulting preferred portfolios.

While it is tempting to conclude that investors constructing multi-fund portfolios should shift their focus
from performance measures based on total or excess return to those based on differential or relative-to-
benchmark return, such is not our ultimate counsel. The conditions under which the Selection Sharpe
Ratio is appropriate are stringent and unlikely to hold for a typical investor. Rather than continue the
search for the ideal universal performance measure, it is preferable to return to basics. Markowitz taught
us that portfolios should be constructed taking into account the best possible estimates of all relevant
future risks and returns. This is as true for portfolios of mutual funds as it is for portfolios of individual
securities. Asset allocation exercises, followed by selection of funds within peer groups based on simple
rankings, are easy but may lead to inefficient overall portfolios. A better approach takes into account the
complexity involved in such decisions. The key information an investor needs to evaluate a mutual fund
includes (1) its likely future exposures to movements in major asset classes, (2) the likely added (or
subtracted) return over and above a benchmark with similar exposures, and (3) the likely risk vis-a-vis
that benchmark. Efforts should be devoted to obtaining the best possible estimates for future values of
these key ingredients, then using them optimally to determine efficient portfolios.
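
The three ingredients can be illustrated with a deliberately simplified historical decomposition. The sketch below assumes only two asset classes, exposures that sum to one, and exposures chosen to minimize the variance of the return difference between the fund and its benchmark. It does not constrain exposures to be non-negative, and it describes the past rather than providing the forward-looking estimates called for above. The function name and all return series are hypothetical.

```python
from statistics import mean, stdev


def two_class_decomposition(fund, class_1, class_2):
    """Split fund returns into a two-asset-class benchmark plus selection return.

    Solves fund - class_2 = b1 * (class_1 - class_2) + e for the b1 that
    minimizes the variance of e, so that the exposures are (b1, 1 - b1).
    """
    y = [f - c2 for f, c2 in zip(fund, class_2)]
    x = [c1 - c2 for c1, c2 in zip(class_1, class_2)]
    x_bar, y_bar = mean(x), mean(y)
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = cov_xy / var_x

    benchmark = [b1 * c1 + (1.0 - b1) * c2 for c1, c2 in zip(class_1, class_2)]
    selection = [f - b for f, b in zip(fund, benchmark)]
    return {
        "exposures": (b1, 1.0 - b1),            # ingredient (1)
        "selection_return": mean(selection),    # ingredient (2)
        "selection_risk": stdev(selection),     # ingredient (3)
    }


# Hypothetical monthly returns, in percent.
stocks = [2.0, -1.0, 3.0, -2.0, 2.5, 1.5]
bonds = [0.9, 0.7, 1.0, 0.6, 0.9, 0.7]
fund = [1.76, -0.32, 2.30, -0.76, 1.86, 1.28]

result = two_class_decomposition(fund, stocks, bonds)
print(result)
```

Estimates of this kind for each candidate fund, together with estimates of asset class risks, returns and correlations, are the inputs a portfolio optimization would then combine.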


Footnotes

*. The author would like to thank John Watson of Financial Engines, Inc. for suggestions and comments
on an earlier draft.

1. Described in Damato [1996].

2. For the calculations used by Morningstar, it makes no difference whether the sign is reversed, due to
the subsequent division by the risk base, which is an average of all the risk numbers. However, for ease
of interpretation, we reverse the sign so that a smaller absolute value of risk will be considered more
desirable than a larger absolute value (as with standard deviation).

3. Function f was obtained by integrating over negative values of the excess return, taking into account
the relationship shown in equation (A1) in Triantis and Hodder [1990].

References

Damato, Karen, "Morningstar Edges Toward One-Year Ratings," The Wall Street Journal, April 5,
1996, p. C1.

Markowitz, Harry, "Portfolio Selection," Journal of Finance, March 1952, pp. 77-91.

Sharpe, William F., "Mutual Fund Performance," Journal of Business, January 1966, pp. 119-138.

Sharpe, William F., "The Sharpe Ratio," Journal of Portfolio Management, Fall 1994.

Sharpe, William F., Morningstar's Performance Measures, 1997.

Kahneman, Daniel and Amos Tversky, "Prospect Theory: An Analysis of Decision Under Risk,"
Econometrica, XLVII (1979): pp. 263-291.

Tobin, James, "Liquidity Preference as Behavior Towards Risk," Review of Economic Studies,
February 1958, pp. 65-86.

Triantis, Alexander J. and James E. Hodder, "Valuing Flexibility as a Complex Option," The Journal of
Finance, Vol. XLV No. 2, June 1990, pp. 549-564.
