
Behavioural Finance and Market Psychology

Mark Salmon

May 1, 2001

Part I

Introduction
1 Are Financial Markets Efficient?

(based on Shleifer Ch1.)

Fama (1970) defined an efficient market as one in which asset prices always fully reflect the available information.

This rules out trading systems based only on currently available information that earn expected profits or returns in excess of the equilibrium expected profits or returns. Investors cannot consistently beat the market.

In the 1960s the EMH generated substantial theoretical and empirical support, but the key forces that are supposed to create efficiency, such as arbitrage, have since appeared much weaker and more limited than initially supposed.

Now: considerable evidence that efficiency is limited and that significant deviations from efficiency can persist for long periods of time.

WHY? ⇒ Behavioural Finance

Theoretical Foundations of the EMH

3 arguments:

i) Investors are rational and value assets rationally.
ii) To the extent that some investors are not rational, their trades are random and so cancel each other out without affecting prices.
iii) To the extent that investors are irrational in similar ways, their influence on prices is eliminated by their interaction with rational traders.

i) Samuelson (1965) and Mandelbrot (1968): in competitive markets with risk-neutral investors ⇒ returns are unpredictable ⇒ prices follow random walks.

Since then, efficient asset prices have been derived with risk-averse investors ⇒ asset prices no longer follow random walks, but rationality still implies the impossibility of earning superior risk-adjusted returns.

ii) But the EMH doesn't absolutely require rationality. One possible model: noise traders whose actions are random and hence uncorrelated, so their trades are likely to cancel each other out ⇒ substantial volume but prices always close to fundamental value. The absence of correlation is dubious, but even this is not required.

iii) The EMH then depends on ARBITRAGE (Friedman (1953), Fama (1965)):

The simultaneous purchase and sale of the same, or essentially similar, security in two different markets at advantageously different prices.

Smart investors notice mispricing (induced by noise traders) and trade to push prices back towards fundamental value.

So prices should stay close to fundamentals even if noise traders' trades are correlated.

Empirical Foundations of EMH

1960's-70's strong empirical support for EMH;

i) Quick and accurate response to news: no under-reaction or over-reaction, no price trends.
ii) No price changes when there is no news regarding fundamentals.

⇒ No value in stale news. It is difficult to define making money on news, i.e. earning a superior return after adjustment for risk.

Measuring risk is difficult: it needs a model of risk and return. CAPM? But that is not the only possibility, so tests of predictability are joint tests of efficiency and of the model.

Stale information: weak form, semi-strong form and strong form.

Initial tests (event studies) corroborated the weak and semi-strong forms empirically.

Takeover news: immediate and final adjustment to news, Keown and Pinkerton (1981).

Non-reaction to non-information: Scholes (1972), block sales by large stakeholders. This addresses the central issue in the arbitrage argument for the EMH: the existence of close substitutes for individual securities.

Does there exist an asset or portfolio with identical cash flows in all states of the world?

Hence similar risk characteristics: the question of complete markets is critical for arbitrage arguments to work.

Sales of large blocks should not have a large effect on the stock price, because a stock's value is determined by its value relative to its close substitutes rather than by supply. Others should be willing to take up the sale with minimal adjustment in price as long as they can easily sell off holdings of close substitutes.

Scholes finds relatively small reactions to block sales ⇒ stock prices do not respond to non-information.

Theoretical Challenges to EMH


Difficult to sustain the case that investors are fully rational; people react to irrelevant information and trade on noise rather than information. What do investors actually do? They follow the advice of financial gurus, fail to diversify, trade actively, sell winning stocks, hold on to losing stocks, follow patterns etc. Deviations from the maxims of economic rationality turn out to be highly pervasive and systematic (Kahneman and Riepe (1998)), in three areas:

1. attitudes towards risk
2. non-Bayesian expectation formation
3. sensitivity of decision making to the framing of problems

1. Investors do not follow the axioms of von Neumann-Morgenstern rationality. In particular, people do not look at the levels of final wealth but at gains and losses relative to some reference point, which may vary from situation to situation, and they display loss aversion, i.e. a value function that is steeper for losses than for gains. PROSPECT THEORY. Investors don't like selling losing stocks. Aversion to holding stocks at all: the equity premium puzzle.

2. Systematic violation of Bayes' rule and other axioms of probability theory. People take a short history of data, unrepresentative of the population, when predicting. Heuristics help identify patterns, but are these real features of the population? People extrapolate short histories of rapid earnings growth: glamorous companies, dot.coms. Over-reaction. Overconfident in their own opinions.

3. People make different choices depending on how equivalent decisions are presented to them: framing influences choices (psychological influences, or investor sentiment).

These findings raise fundamental problems for arbitrage arguments.

Noise trader models argue that the random effects of noise traders cancel each other out statistically.

However, Kahneman and Tversky's work shows that there are systematic and consistent deviations from rationality.

Most people deviate from rationality in the same way! Buying and selling are highly correlated across investors.

Herd effects: managers tend to adopt safe strategies that other managers tend to adopt. Window dressing.

Finally:

Arbitrage is RISKY and therefore LIMITED; without exact substitutes there is no riskless hedge for the arbitrageur.

Also, given mispricing, the future resale price is unpredictable even if exact substitutes exist. Noise traders implicitly take on more risk: they need not die out and may make profits.

Empirical Challenges to EMH.


Shiller (1981), LeRoy (1981) ⇒ excess volatility. Stock prices are far more volatile than could be justified by a simple model in which prices equal the expected net present value of future dividends.

The anomalies: predictability of returns beyond what risk adjustment can explain.

Part II

Investor Psychology
6 Expected Utility Theory

⇒ Handwritten notes

Heuristics and Biases

Kahneman and Riepe (1998):

Cognitive illusions are similar to visual illusions: cases where intuition fails.

Decision theory identifies: BELIEFS and PREFERENCES

Choice between gambles: described by a range of possible outcomes and the probabilities associated with these outcomes.

People make judgements about these probabilities and assign values (utilities) to the outcomes.

Combine these beliefs and values in forming preferences about risky options.

Judgements can be systematically wrong in a number of ways: biases.

Errors of preference arise either from mistakes in assigning values to future outcomes or from improper combinations of values and probabilities.

Biases of Judgement

Complex environments preclude fixed rules of decision making ⇒ intuition. Biases imply unrecognised risks taken ⇒ unexpected outcomes, unjustified trading.

8.1

Overconfidence

What is your best estimate of the FTSE tomorrow? It is now. What is the (low) value you are 99% sure it will not fall below? What is the (high) value you are 99% sure it will not exceed? NB confidence intervals rather than point estimates. ⇒ 1% high surprises and 1% low surprises; 98% of the time you would expect the value to lie between the two.

Unfortunately people are not well calibrated: there is a well-documented and systematic bias in subjective confidence intervals. Too many surprises ⇒ confidence intervals are too tight. Typically 15-20% surprises, not 2%, even when it is in the person's best interest to be accurate. So if someone says they are 99% sure, it is sensible to treat it as 85%. This applies to your own intuition too. Bookies at the racetrack are accurate: repeated experience with the same problem and explicit probability statements.

⇒ Monitor your performance and adjust.
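One way the advice to monitor your performance might be put into practice is sketched below. It assumes you keep a log of past 98% intervals together with the realised values; the function name `surprise_rate` and the numbers in `log` are hypothetical and purely illustrative.

```python
def surprise_rate(forecasts):
    """Fraction of realised values that fall outside the stated interval.

    forecasts: list of (low, high, realised) tuples, where [low, high]
    was stated as a 98% confidence interval.
    """
    if not forecasts:
        raise ValueError("need at least one forecast")
    surprises = sum(
        1 for low, high, realised in forecasts
        if realised < low or realised > high
    )
    return surprises / len(forecasts)


# Hypothetical log of interval forecasts (illustrative numbers only).
log = [
    (6100, 6300, 6250),
    (6150, 6350, 6400),  # high surprise
    (6200, 6400, 6310),
    (6250, 6450, 6180),  # low surprise
    (6000, 6200, 6120),
]

rate = surprise_rate(log)
# A well-calibrated forecaster expects about 2% surprises on 98% intervals.
print(f"surprise rate: {rate:.0%} (target: 2%)")
```

A surprise rate well above 2% over a reasonably long log suggests the intervals should be widened.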

8.2

Optimism

How good a driver are you compared with the average?

Most people are biased towards optimistic outcomes. They exaggerate their talents and underestimate the likelihood of bad outcomes.

Are you less likely to develop cancer or have a heart attack than the person next to you?

People misperceive chance outcomes as skill and exaggerate their ability to control events.

8.3

Hindsight

Recall the last action of the Monetary Policy Committee: what was your prior estimate of the probability that it would leave interest rates unchanged?

Psychological evidence indicates that people can rarely reconstruct accurately, after the fact, what they thought about an event before it happened. They exaggerate their earlier accuracy. Lots of advice is given after the event, but if the outcome had been that predictable people would have changed what they did. Conditional and unconditional probabilities. Hindsight promotes overconfidence: the world seems more predictable than it actually is. Hindsight is an important element of regret.

8.4

Over-reaction to chance events

Which of the following is more likely to occur when a coin is tossed: HHHTTT or HTHTTH? Both are equally likely, although one appears systematic and the other more random. Most people are too quick to perceive causal regularity in random sequences. The "hot hand" fallacy: are people sometimes systematically above and sometimes below their long-term average? No: Gilovich, Vallone and Tversky (1985). The human mind is a pattern-seeking device ⇒ spurious causality; investors perceive trends where none exist. Odean (1998): stocks sold outperform their replacements by 3.4% in the first year ⇒ overtrading.
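The coin-toss claim can be checked by enumeration. The sketch below assumes a fair coin and independent tosses, so that every specific sequence of six tosses has the same probability.

```python
from itertools import product

# All 2**6 possible sequences of six tosses of a fair coin.
sequences = ["".join(s) for s in product("HT", repeat=6)]

# Under independence each specific sequence has probability (1/2)**6.
prob = {s: 0.5 ** 6 for s in sequences}

print(len(sequences))                    # 64
print(prob["HHHTTT"], prob["HTHTTH"])    # 0.015625 0.015625
```

The apparently "systematic" sequence and the apparently "random" one are each just one of 64 equally likely outcomes.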

Errors of Preference

Having considered how people may get probabilities wrong, now consider how people use probabilities to evaluate risky projects, assign values to outcomes and combine values and probabilities into preferences.

9.1

Non-Linear Weighting of Probabilities

You are given a chance to gain $20,000 but you do not know the exact probability. Consider three pairs of possible probabilities:

A: probability is either 0% or 1%
B: probability is either 41% or 42%
C: probability is either 99% or 100%

Are the three differences A, B, C equally significant to the decision maker? Can you order them by their impact on preferences? The theory of rational choice tells us that uncertain prospects should be evaluated by a weighted average of the utilities of possible outcomes, weighted by their probabilities. So an outcome that has a probability of 1% should have a weight ten times that of an outcome with a probability of 0.1%. Similarly, an increment of 1% should have the same effect on the weighting of outcomes whether the initial probability is 0%, 41% or 99%. However, for most people this is not true. People will pay more to raise the probability from 0% to 1% or from 99% to 100% than to increase it from 41% to 42%. People deviate from the principle of weighting by probability in highly systematic ways: they overweight low probabilities and underweight high probabilities.

This explains why people like long shots more than other gambles that have the same expected value: long shots are preferred because low probabilities are greatly overweighted. Would you prefer $10 or a 1% chance of $1,000? Likewise, people who have a 99% chance to win $1,000 will pay much more than $10 to eliminate the possibility of losing.

9.2

People value Changes not States

Imagine you are $20,000 richer than you are today, and you face a choice between two options:
A: get $5,000, or
B: a 50% chance to win $10,000 and a 50% chance to win nothing.

Now imagine you are $30,000 richer than you are today, and you have to choose between two options:
C: lose $5,000, or
D: a 50% chance to lose $10,000 and a 50% chance to lose nothing.

If you are like most others then:
i) you probably paid very little attention to the initial statement about being richer;
ii) you feel the two problems are quite different;
iii) you chose the gamble in the second question and the sure thing in the first.

This behaviour violates an important rule in rational decision making.

The two problems are identical when formulated in terms of states of wealth. In both problems you have a choice between being $25,000 richer than you are today and taking a gamble in which you could end up richer by $20,000 or by $30,000 with equal probabilities.
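The equivalence of the two problems can be verified mechanically by translating each option from changes into final states of wealth. This is only a sketch; the helper name `final_wealth` and the list-of-pairs representation of a prospect are assumptions made for the illustration.

```python
def final_wealth(endowment, prospect):
    """Map a prospect over changes, [(change, prob), ...], to final wealth states."""
    return sorted((endowment + change, prob) for change, prob in prospect)


# Problem 1: you are $20,000 richer than today.
A = final_wealth(20_000, [(5_000, 1.0)])
B = final_wealth(20_000, [(10_000, 0.5), (0, 0.5)])

# Problem 2: you are $30,000 richer than today.
C = final_wealth(30_000, [(-5_000, 1.0)])
D = final_wealth(30_000, [(-10_000, 0.5), (0, 0.5)])

print(A == C)  # True: both sure things leave you $25,000 richer
print(B == D)  # True: both gambles leave you $20,000 or $30,000 richer
```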

What matters to a rational decision maker is where he gets to in the end, not the gains and losses along the way, so he would choose either the gamble or the sure thing in both questions instead of flipping preferences as most people do. Most people are influenced by irrelevant emotions associated with gains and losses instead of maximising the utility of wealth.

People simplify decision problems and ignore the initial detail and concentrate on the gains and losses dependent on the choices made.

Implies people will make inconsistent choices in alternative formulations of the same problem.

⇒ It is always possible to frame the same decision problem in broader terms (wealth) or in narrower terms (gains and losses); broad and narrow frames often lead to different preferences.
⇒ Rationality is most likely with the broadest frame, focusing on states (such as wealth) rather than changes (gains and losses), although narrow framing is easier, more natural and more common.

9.3

Value Function

Two characteristics:-

1. steeper for losses than for gains: loss aversion

2. near proportionality of risk attitudes.

Consider a bet based on the toss of a coin. If you lose, you pay $100. What is the minimal gain that would make this gamble acceptable?

The usual response is $200-$250. This very high gain-to-loss ratio reflects a sharp asymmetry between gains and losses: loss aversion, which is found frequently. Benartzi and Thaler (1995), pricing of stocks and bonds: the historical annual real return on stocks has been 7% whereas that on bonds has been approximately 1%, despite the two markets appearing to be in long-run equilibrium. Equity Premium Puzzle. People consider the distributions of both assets and weight possible losses 2.5 times more heavily than possible gains. The probability of loss is obviously higher for stocks, and the extra weighting on negative outcomes requires compensation with higher returns to match the attractiveness of a safer asset.
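The link between the coin-toss answers and the loss weighting can be made explicit with a small sketch. It assumes a piecewise-linear value function in which losses count `loss_weight` times as much as equal-sized gains; that functional form and the function name are assumptions for illustration, not something the notes specify.

```python
def minimal_acceptable_gain(loss, loss_weight):
    """Smallest gain G that makes a 50/50 bet (win G, lose `loss`) acceptable.

    With losses weighted `loss_weight` times as heavily as gains:
        0.5 * G - 0.5 * loss_weight * loss >= 0   =>   G >= loss_weight * loss
    """
    return loss_weight * loss


for w in (2.0, 2.5):
    print(w, minimal_acceptable_gain(100, w))
# 2.0 200.0
# 2.5 250.0
```

Under this assumption, the usual $200-$250 response corresponds to loss weights of roughly 2 to 2.5.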

What sure gain is just as attractive as the risky prospect of a 50% chance to win $1,000 and a 50% chance to gain nothing?

The cash equivalent or certainty equivalent is the sure amount of gain or loss that is as attractive as the random prospect. Most people set a cash equivalent of less than $400 for the gamble above. Now assume the amount that could be gained is $5,000 and then $20,000. The cash equivalent will normally grow almost proportionately with the size of the prize (although probably slightly more slowly): near proportionality of risk attitudes.

Kahneman and Thaler: exercise with financial advisors. A wealthy family seeks advice; half the advisors were told the family had assets of $30 million and yearly expenditure of $200,000. The other half were given an identical description except that assets were $6 million and yearly expenditure $120,000. The proportion the advisors suggested should be placed in equities was almost exactly the same: 66% for the wealthier family and 65% for the less wealthy. When the same advisors were asked to consider both scenarios and to determine whether they would recommend the same proportion of equities to both, a substantial majority thought they would recommend more equity holding for the wealthier family.

9.4

Shape and Attractiveness of Gambles

Consider the following 8 gambles: are they ordered in descending order of attractiveness?

Gamble   Payoff 1   Prob 1   Payoff 2   Prob 2
A           5,000      .95    105,000      .05
B           5,000      .50     15,000      .50
C           1,000      .10     11,000      .90
D           1,000      .90     91,000      .10
E           2,000      .50     18,000      .50
F               0      .50     20,000      .50
G         (2,000)      .90    118,000      .10
H         (5,000)      .50     25,000      .50

All have the same expected value of $10,000 but they differ in "shape": the two outcomes can have equal or very unequal probabilities, and the low-probability outcome can be either better or worse than the more probable outcome. Some involve possible losses, others do not.
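The claim that all eight gambles share the same expected value can be checked directly. The sketch below reads the table as (payoff 1, prob 1, payoff 2, prob 2), treats bracketed payoffs as losses, and assumes that gamble C's first probability is .10 (so that its probabilities sum to one); the dictionary layout is an assumption of the illustration.

```python
# (payoff_1, prob_1, payoff_2, prob_2); bracketed payoffs are entered as losses.
gambles = {
    "A": (5_000, 0.95, 105_000, 0.05),
    "B": (5_000, 0.50, 15_000, 0.50),
    "C": (1_000, 0.10, 11_000, 0.90),  # prob_1 assumed to be .10
    "D": (1_000, 0.90, 91_000, 0.10),
    "E": (2_000, 0.50, 18_000, 0.50),
    "F": (0, 0.50, 20_000, 0.50),
    "G": (-2_000, 0.90, 118_000, 0.10),
    "H": (-5_000, 0.50, 25_000, 0.50),
}


def expected_value(gamble):
    """Probability-weighted average of the two payoffs."""
    x1, p1, x2, p2 = gamble
    return x1 * p1 + x2 * p2


for name, gamble in gambles.items():
    print(name, round(expected_value(gamble)))
```

Each line prints 10000, so the ranking by attractiveness must come from the shape of the gambles rather than from their expected values.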

The order of the gambles was determined by a group of financial analysts; individual deviations were small. It is evident that the ideal gamble is one that combines a high probability of a moderate gain with a small probability of a very large gain. Individuals like gambles that involve a high level of security with some upside potential: much hope and little fear. ⇒ Use derivatives to hedge, and thereby limit the downside risk while retaining some upside potential.

9.5

Purchase Price as a Reference Point

Investor A owns a block of stock which he originally bought at $100 a share. Investor B owns a block of the same stock that he bought for $200 a share. The value of the stock was $160 yesterday and it dropped to $150 today. Who is more upset? Everyone would agree that investor B is more upset than A. The reason: investor A treats the bad news as a reduction in gain, while B views the same news as an increased loss. Because the value function is steeper for losses than for gains, the difference of $10 in the share price is more significant for B than for A. Investors use the purchase price as a reference point, which then determines whether selling the stock yields a loss or a profit and hence determines in part the decision to sell or hold the stock. This psychological effect is known as the disposition effect: a marked reluctance of investors to realise their losses. Given two stocks, one of which has gone up and the other gone down, Odean (1998) shows that investors are much more likely to sell the one that has gone up.
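The asymmetry between investors A and B can be illustrated numerically. The sketch assumes a piecewise-linear value function with losses weighted 2.5 times as heavily as gains (the weight quoted earlier from Benartzi and Thaler); the linear form and the function names are illustrative assumptions.

```python
def value(change, loss_weight=2.5):
    """Assumed value of a gain or loss relative to the purchase price."""
    return change if change >= 0 else loss_weight * change


def reaction(purchase_price, old_price, new_price):
    """Change in value per share when the price moves from old to new."""
    return value(new_price - purchase_price) - value(old_price - purchase_price)


print(reaction(100, 160, 150))  # investor A: -10 (a smaller gain)
print(reaction(200, 160, 150))  # investor B: -25.0 (a larger loss)
```

The same $10 price fall costs B two and a half times as much value as it costs A, because B's reference point puts the move in the loss domain.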

9.6

Narrow Framing

Imagine that you face a pair of concurrent decisions. Examine both decisions and then indicate which you prefer.

Decision 1: Choose between
A: a sure gain of $2,400
B: a 25% chance to gain $10,000 and a 75% chance to gain nothing.

Decision 2: Choose between
C: a sure loss of $7,500
D: a 75% chance to lose $10,000 and a 25% chance to lose nothing.

A large majority of people choose A in 1 and D in 2. In 1 the sure thing seems more attractive. In 2 the sure loss is repellent and the chance to lose nothing induces a preference for the gamble. What could be wrong with following your preferences on each problem? Consider another decision problem:

A: a 25% chance to win $2,400 and a 75% chance to lose $7,600
B: a 25% chance to win $2,500 and a 75% chance to lose $7,500

Easy: everyone correctly prefers option B to option A. However, looking back at the previous problem you will see that the inferior option A here is obtained by choosing A and D in the first problem, which is what you probably chose. The dominating option B in this latter decision is obtained by combining the two options most people reject in the earlier question!
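The combination of the two concurrent decisions can be computed explicitly. In this sketch a prospect is a list of (outcome, probability) pairs and `combine` is a hypothetical helper; since each pair combined here contains one sure outcome, no assumption about dependence between the gambles is actually needed.

```python
from itertools import product


def combine(prospect_1, prospect_2):
    """Combine two prospects [(outcome, prob), ...] into one joint prospect."""
    combined = {}
    for (x1, q1), (x2, q2) in product(prospect_1, prospect_2):
        combined[x1 + x2] = combined.get(x1 + x2, 0.0) + q1 * q2
    return sorted(combined.items())


A = [(2_400, 1.0)]
B = [(10_000, 0.25), (0, 0.75)]
C = [(-7_500, 1.0)]
D = [(-10_000, 0.75), (0, 0.25)]

print(combine(A, D))  # [(-7600, 0.75), (2400, 0.25)]  popular choices
print(combine(B, C))  # [(-7500, 0.75), (2500, 0.25)]  rejected choices
```

The popular pair (A, D) is worse by $100 in every state than the rejected pair (B, C).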

A fully rational decision maker would adopt a broad frame and incorporate the combined decision into an even broader context. An inclusive view would allow the decision maker to avoid the dominated option in the first pair of problems. Do most humans behave like this? There are many examples of investors considering decision problems one at a time instead of adopting a broader frame: simple mistakes, failure to diversify, hedge or self-insure. Narrow framing can arise from separate mental accounts: separate budgets for different actions, with attitudes towards risk differing across accounts.

9.7

Repeated Gambles and Risk Profiles

What is your cash equivalent for one play of a gamble: 50% chance to win $1,000 or 50% chance of zero?

What is your cash equivalent for 5 plays of the same gamble? Most would say the cash equivalent for the second option is more than 5 times higher than the first. The second proposition is less risky. What about the following: you are offered one play of the gamble; you may be offered more plays but do not know this for certain, nor how many. What is your cash equivalent for this opportunity?

A narrow frame would not distinguish between the first and the last options, which is potentially costly given the risk reduction potential. A decision maker who fails to consider future risky opportunities always acts as if his current decision is the last he will ever make. Most decision makers adopt narrow frames, consider their decisions one at a time and are overly influenced by current opportunities. Decisions based on narrow frames tend to exhibit near proportionality of risk taking; this normally means too little tolerance for risk with small gambles and too much risk taking with large ones.
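Why five plays are "less risky" than one can be seen from the distribution of total winnings. The sketch below assumes independent plays of the 50% chance to win $1,000 and uses the ratio of standard deviation to mean as one possible (assumed) measure of relative risk.

```python
from math import comb, sqrt


def total_distribution(n, prize=1_000, p=0.5):
    """Distribution of total winnings over n independent plays."""
    return {k * prize: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}


def mean_and_sd(dist):
    """Mean and standard deviation of a discrete distribution {outcome: prob}."""
    mean = sum(x * q for x, q in dist.items())
    var = sum((x - mean) ** 2 * q for x, q in dist.items())
    return mean, sqrt(var)


for n in (1, 5):
    dist = total_distribution(n)
    mean, sd = mean_and_sd(dist)
    # plays, expected total, relative risk, probability of winning nothing
    print(n, mean, round(sd / mean, 3), dist[0])
```

With one play the chance of ending with nothing is 50%; with five plays it is 1 in 32, and the standard deviation relative to the mean falls by more than half.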

9.8

Short and Long Views

In what percentage of months during the last 71 years did stocks make money? What was the ratio of the average loss to the average gain? Answer the same questions for consecutive 5-year periods. The critical questions relate to the relevant horizon of commitment; the unit of time probably varies greatly across individuals.

Consider the frequency with which an investor checks how well he has done. Some nervous investors check very frequently; others are less concerned with short-term fluctuations. This may determine their preferences for risk. Consider an investor with a monthly horizon, who allocates based on the experience of the past month and expectations of the immediate future. Stocks made money in 62% of the months over the last 71 years; the average loss was 97% as large as the average gain. A loss-averse investor with a one-month horizon will not like this gamble and will keep all his money in a safer asset, one month at a time, forever! Consider now a five-year horizon. Look at the same 71-year history. It now looks much better: stocks made money in 90% of the 5-year periods and the average loss was only 63% of the average gain. So even a loss-averse investor will invest in stocks, but only if he is willing to adopt a long-term view.
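The effect of the horizon on a loss-averse investor can be sketched with the figures quoted above. The calculation assumes that each horizon reduces to a two-outcome gamble (average gain or average loss), that value is piecewise linear, and that losses are weighted 2.5 times as heavily as gains; all three are simplifying assumptions for illustration.

```python
def loss_averse_value(p_gain, loss_to_gain_ratio, loss_weight=2.5):
    """Value of holding stocks for one period, in units of the average gain."""
    p_loss = 1 - p_gain
    return p_gain * 1.0 - p_loss * loss_to_gain_ratio * loss_weight


monthly = loss_averse_value(0.62, 0.97)    # one-month horizon
five_year = loss_averse_value(0.90, 0.63)  # five-year horizon

print(round(monthly, 4))    # -0.3015: stocks look unattractive
print(round(five_year, 4))  # 0.7425: stocks look attractive
```

The same loss weighting rejects stocks at a monthly horizon and accepts them at a five-year horizon.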

Benartzi and Thaler (1995) assume investors are myopically loss-averse and use the observed differences in returns between stocks and bonds to derive the investment horizon at which investors will find the two forms of investment equally attractive. It turns out to be about a year. So an investor who adopts a longer horizon will be willing to take risks that a more myopic investor will reject, even if the underlying risk aversion is the same.

10

Living with the Consequences

Investment decisions have both emotional and financial consequences. Pride, elation, regret, guilt. The financially optimal decision may not be emotionally optimal.

10.1

Regrets of Omission and Commission

Fred owns shares in company A. Over the last year he considered switching to company B but decided against it. He now finds he would have been $20,000 better off if he had switched. Bert owned shares in B but in the last year switched to A, and he now finds out he would have been $20,000 better off if he had kept his shares in B. Who is more upset?

Most agree that Bert would be more upset, although in economic terms their outcomes are the same. The essential difference between them is that Bert suffers from a regret of commission (he regrets something he did) whereas Fred suffers a weaker regret of omission (he regrets failing to do something that would have made him better off).

The difference between the two is related to the well-documented difference between losses (which people feel acutely) and opportunity costs (failures to gain), which seem to cause relatively little pain.

People ruminate on unusual aspects of the events that lead to bad outcomes

10.2

Regret and Risk Taking

Think of a bad financial decision that you made which you now regret. Was it a decision to do something or to refrain from doing something? How did chance influence the decision?

Most people feel regret about things they did rather than things they did not do, though there are exceptions. People who tend to regret the opportunities they missed tend to take more risks than people who regret attempts that failed. Another characteristic of risk takers concerns investors' views about the role of luck: little role is attributed to luck, either in decisions that induce regret or in those of which people are proud. Illusion of control.

Part III

Prospect Theory
Kahneman and Tversky (Econometrica, 1979)

Preferences are shown to violate systematically the axioms of expected utility theory.

A PROSPECT $(x_1, p_1; \ldots; x_n, p_n)$ yields outcome $x_i$ with probability $p_i$, where $\sum_i p_i = 1$. Null outcomes are omitted, so $(x, p)$ denotes $(x, p; 0, 1-p)$. Riskless prospect: $(x)$.

Expected Utility Theory:

1. Expectation: $U(x_1, p_1; \ldots; x_n, p_n) = p_1 u(x_1) + \cdots + p_n u(x_n)$.

2. Asset integration: a prospect is acceptable at asset position $w$ iff $U(w + x_1, p_1; \ldots; w + x_n, p_n) > u(w)$, i.e. the prospect is only worth taking if it adds utility. The domain of the utility function is final states rather than gains or losses.

3. Risk aversion: $u$ is concave ($u'' < 0$).

N.B. In EUT utilities are weighted by probabilities, but people overweight outcomes that are considered certain.

CERTAINTY EFFECT (Allais paradox) ⇒ choices inconsistent with expected utility theory. In particular the substitution axiom of EUT, that if B is preferred to A then any mixture $(B, p)$ must be preferred to the mixture $(A, p)$, is rejected.

REFLECTION EFFECT: what happens when gains are replaced by losses? ⇒ Risk aversion in the positive domain is replaced by risk seeking in the negative domain. A simple translation of outcomes induces a shift from risk aversion to risk seeking, i.e. people accept risk instead of accepting a sure loss. Preferences between negative prospects violate the expectation principle. Certainty increases the aversiveness of losses as well as the desirability of gains, so it is not true that certainty is generally desirable.

ISOLATION EFFECT: in order to simplify, people will often disregard components that the alternatives share and focus on the components that distinguish them. ⇒ This may produce inconsistent preferences, since a pair of prospects can be decomposed into common and distinctive components in more than one way, and different decompositions can lead to different preferences. Different representations of the same probabilities induce different choices.

11

Prospect Theory

Distinguishes two phases of the choice process:

1. an early phase of editing: a preliminary analysis, often leading to a simpler representation of prospects.

2. a phase of evaluation: the edited prospects are evaluated and the one with the highest value is chosen.

Editing:

codify: outcomes are coded as gains or losses rather than final states of wealth, defined with respect to a reference point.

combination: simplify by combining the probabilities associated with identical outcomes, e.g. $(200, 0.25; 200, 0.25) \to (200, 0.50)$.

segregation: separate the riskless component from the risky component, i.e. $(300, 0.8; 200, 0.2)$ can be naturally decomposed into a sure gain of 200 and a risky prospect of $(100, 0.80)$. Similarly the prospect $(-400, 0.4; -100, 0.6)$ consists of a sure loss of 100 and the prospect $(-300, 0.4)$.

cancellation: applies to two or more prospects. Isolation effect: discards components shared by the prospects, e.g. ignoring a first stage that is common, a common bonus, or other common constituents. For example, the choice between $(200, 0.2; 100, 0.5; -50, 0.3)$ and $(200, 0.2; 150, 0.5; -100, 0.3)$ can be reduced to a choice between $(100, 0.5; -50, 0.3)$ and $(150, 0.5; -100, 0.3)$.

rounding: $(101, 0.49)$ is likely to be seen as an even chance to win 100.

dominance: dominated alternatives are detected and rejected.

The final edited prospects may depend on the sequence of editing. Many preference anomalies arise from the editing process.

Evaluation: the value of an edited prospect, $V$, is expressed in terms of two scales, $\pi$ and $v$.

1. $\pi$ associates with each probability $p$ a decision weight $\pi(p)$, which reflects the impact of $p$ on the overall value of the prospect. $\pi$ is not a probability measure, and $\pi(p) + \pi(1-p)$ will typically be less than unity.

2. $v$ assigns to each outcome $x$ a number $v(x)$ which reflects the subjective value of that outcome. Outcomes are defined relative to a reference point, which is the zero point of the value scale. Hence $v$ measures the value of deviations from the reference point, i.e. gains and losses.

The present formulation is concerned with simple prospects $(x, p; y, q)$ with at most two non-zero outcomes: $x$ with probability $p$, $y$ with probability $q$ and nothing with probability $1 - p - q$, where $p + q \le 1$. A prospect is strictly positive if all outcomes are positive, $x, y > 0$ and $p + q = 1$ (similarly strictly negative). Otherwise it is a regular prospect.

The basic equation of the theory describes the manner in which $\pi$ and $v$ are combined to determine the overall value of regular prospects.

If $(x, p; y, q)$ is a regular prospect (i.e. either $p + q < 1$, or $x \ge 0 \ge y$, or $x \le 0 \le y$) then

$V(x, p; y, q) = \pi(p) v(x) + \pi(q) v(y)$   (1)

where $v(0) = 0$, $\pi(0) = 0$ and $\pi(1) = 1$. As in utility theory, $V$ is defined on prospects while $v$ is defined on outcomes. The two scales coincide for sure prospects, where $V(x, 1) = V(x) = v(x)$. Equation (1) generalises expected utility theory by relaxing the expectation principle. Kahneman and Tversky provide an axiomatic analysis which ensures existence and uniqueness of the two scales. The evaluation of strictly positive and strictly negative prospects follows a different rule. In the editing phase such prospects are separated into two components:

1. the riskless component: the minimum gain or loss which is certain to be attained.

2. the risky component: the additional gain or loss which is actually at stake.

Such prospects are evaluated by means of the following rule: if $p + q = 1$ and either $x > y > 0$ or $x < y < 0$ then

$V(x, p; y, q) = v(y) + \pi(p)[v(x) - v(y)]$   (2)

So the value of a strictly positive or strictly negative prospect equals the value of the riskless component plus the value difference between the outcomes multiplied by the weight associated with the more extreme outcome, e.g. $V(400, 0.25; 100, 0.75) = v(100) + \pi(0.25)[v(400) - v(100)]$.

The essential point is that a decision weight is applied to the value difference $v(x) - v(y)$, the risky component, but not to the riskless component $v(y)$. Notice that the right-hand side of (2) equals $\pi(p) v(x) + [1 - \pi(p)] v(y)$. Hence (2) reduces to (1) if $\pi(p) + \pi(1-p) = 1$. But this condition is not generally satisfied. So prospect theory is defined on gains and losses rather than final asset positions and replaces probabilities with more general weights. These departures from expected utility theory must give rise to inconsistencies, intransitivities and violations of dominance. As with the arbitrage process, the question arises as to whether the decision maker is able to realise that his preferences are inconsistent and whether he has the opportunity or desire to adjust them. If not, then the anomalies implied by prospect theory should be expected to occur and remain.
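Equations (1) and (2) can be sketched in code once functional forms are chosen for $v$ and $\pi$. The notes do not specify any, so the power value function, the inverse-S weighting function and all parameter values below (0.88, 2.25, 0.61) are assumptions made purely for illustration, as are the function names.

```python
def value(x, alpha=0.88, loss_weight=2.25):
    """Assumed value function v: concave for gains, convex and steeper for losses."""
    return x ** alpha if x >= 0 else -loss_weight * (-x) ** alpha


def weight(p, gamma=0.61):
    """Assumed decision-weight function pi, with pi(0) = 0 and pi(1) = 1."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)


def prospect_value(x, p, y, q):
    """Value V of the simple prospect (x, p; y, q)."""
    same_sign = (x > 0 and y > 0) or (x < 0 and y < 0)
    if same_sign and abs(p + q - 1.0) < 1e-12:
        # Strictly positive or strictly negative prospect: equation (2).
        # Make (x, p) the more extreme outcome, as the equation requires.
        if abs(x) < abs(y):
            x, p, y, q = y, q, x, p
        return value(y) + weight(p) * (value(x) - value(y))
    # Regular prospect: equation (1).
    return weight(p) * value(x) + weight(q) * value(y)


v2 = prospect_value(400, 0.25, 100, 0.75)                   # rule (2)
v1 = weight(0.25) * value(400) + weight(0.75) * value(100)  # rule (1), for comparison
print(round(v2, 2), round(v1, 2))
print(weight(0.25) + weight(0.75) < 1)  # True: weights of complementary events sum to less than 1
```

Under these assumed forms rule (2) gives a higher value than rule (1) would for the same strictly positive prospect, because the riskless component $v(100)$ is not discounted by a decision weight.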

11.1

The Value Function

The essential carriers of value are changes in wealth rather than final states. Our perceptual apparatus is attuned to detect changes rather than absolutes: we respond to changes in brightness, loudness, temperature, time. The context defines the reference point: hot or cold depending on the temperature to which you have become adapted. The same applies to health, wealth etc. A gain of $10,000? Although we value changes, the value is not independent of the initial position. So value is a function of (the reference point, the change).

Many sensory and perceptual dimensions share the property that the psychological response is a concave function of the magnitude of the physical change.

It is easier to discriminate between a change of 3 and a change of 6 than between a change of 93 and a change of 96. The same holds for monetary gains: the difference in value between a gain of $100 and a gain of $200 appears greater than the difference between a gain of $1,100 and a gain of $1,200. Similarly the difference between a loss of $100 and a loss of $200 appears greater than the difference between a loss of $1,100 and a loss of $1,200. So hypothesise that the normal value function for changes in wealth will be concave above the reference point ($v''(x) < 0$ for $x > 0$) and convex below it ($v''(x) > 0$ for $x < 0$). This will also apply in a risky context. Large losses may induce non-normal behaviour. Losses loom larger than gains: the pain one feels on losing an amount of money is greater than the pleasure of gaining the same amount. So the proposed value function is:

1. defined on deviations from the reference point
2. generally concave for gains and convex for losses
3. steeper for losses than for gains (Figure 1).

Actual scaling is more complicated than with utility theory because of the introduction of decision weights.

Decision weights could produce risk aversion and risk seeking behaviour even with a linear value function.
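The three properties of the value function can be checked numerically with a minimal sketch. The power form and the parameter values below are assumptions borrowed from later prospect-theory work, not something specified in these notes; any function that is concave for gains, convex for losses and steeper for losses would serve equally well.

```python
# Illustrative value function over changes in wealth (gains and losses).
# ASSUMPTION: power form with these parameter values; not taken from the notes.
ALPHA = 0.88        # curvature for gains
BETA = 0.88         # curvature for losses
LOSS_AVERSION = 2.25


def value(x: float) -> float:
    """Value of a change in wealth x measured from the reference point."""
    if x >= 0:
        return x ** ALPHA
    return -LOSS_AVERSION * (-x) ** BETA


# Diminishing sensitivity for gains: 100 -> 200 matters more than 1100 -> 1200.
gain_step_small = value(200) - value(100)
gain_step_large = value(1200) - value(1100)

# Diminishing sensitivity for losses: -100 -> -200 hurts more than -1100 -> -1200.
loss_step_small = value(-100) - value(-200)
loss_step_large = value(-1100) - value(-1200)

print("gain steps:", gain_step_small, gain_step_large)
print("loss steps:", loss_step_small, loss_step_large)
print("loss of 100 vs gain of 100:", value(-100), value(100))
```

With this form the asymmetry between gains and losses comes entirely from the single multiplier `LOSS_AVERSION`, which makes the "losses loom larger" property easy to isolate.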

11.2 The Weighting Function

The decision weights are not probabilities and they do not obey the axioms of probability; they should not be interpreted as measures of belief. In tossing a coin for $1000 most would agree that the probability is 0.5, but the decision weight $\pi(0.5)$ derived from choices is likely to be less than 0.5. Decision weights measure the impact of events on the desirability of the prospects and not merely the perceived likelihood of the events. The two scales coincide ($\pi(p) = p$) if the expectation principle holds but not otherwise.

What are the properties of the weighting function, which relates decision weights to probabilities? Naturally $\pi$ is an increasing function of $p$ with $\pi(0) = 0$ and $\pi(1) = 1$. Outcomes contingent on impossible events are ignored and the scale is normalised so that $\pi(p)$ is the ratio of the weight associated with the probability $p$ to the weight associated with the certain event. Consider properties of the weighting function for small probabilities. Evidence suggests that it is a sub-additive function of $p$, ie. $\pi(rp) > r\pi(p)$ for $0 < r < 1$. For example $(6000, 0.001)$ is usually preferred to $(3000, 0.002)$, hence

$$\frac{\pi(0.001)}{\pi(0.002)} > \frac{v(3000)}{v(6000)} > \frac{1}{2}$$

by concavity of $v$. But this need not hold for large values of $p$. Furthermore low probabilities are generally over-weighted, ie. $\pi(p) > p$ for small $p$.

Low probabilities are frequently over-weighted. Such over-weighting is a property of the decision weights and is distinct from the boundedly rational problem of overestimating the probability of rare events. Both act to increase the impact of rare events. Although $\pi(p) > p$ for low probabilities there is evidence that for all $0 < p < 1$, $\pi(p) + \pi(1 - p) < 1$: SUBCERTAINTY. Preferences expressed in the Allais paradox imply subcertainty. The slope of $\pi$ in the interval $(0, 1)$ can be viewed as a measure of the sensitivity of preferences to changes in probability. Subcertainty ensures that $\pi$ is regressive with respect to $p$, ie. that preferences are generally less sensitive to variations of probability than the expectation principle would dictate. So subcertainty captures an essential element of people's attitudes to uncertain events, in that the sum of the weights associated with complementary events is typically less than the weight associated with the certain event.

For a fixed ratio of probabilities, the ratio of the corresponding decision weights is closer to unity when the probabilities are low than when they are high.
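The properties listed for the weighting function (over-weighting of small probabilities, sub-additivity for small $p$, subcertainty, and weight ratios closer to unity at low probabilities) can be illustrated with one candidate functional form. The inverse-S form and the parameter 0.61 below are assumptions taken from later prospect-theory work; the notes themselves do not commit to a functional form.

```python
# Illustrative decision-weight function pi(p).
# ASSUMPTION: inverse-S form with curvature 0.61; not specified in the notes.
CURVATURE = 0.61


def weight(p: float) -> float:
    """Decision weight attached to a stated probability p in [0, 1]."""
    num = p ** CURVATURE
    return num / (num + (1 - p) ** CURVATURE) ** (1 / CURVATURE)


# End points: impossible events get weight 0, the certain event weight 1.
print("pi(0), pi(1):", weight(0.0), weight(1.0))

# Over-weighting of a low probability: pi(p) > p.
print("pi(0.01):", weight(0.01))

# Sub-additivity for small p: pi(r p) > r pi(p), here r = 1/2, p = 0.002.
print("pi(0.001) vs 0.5*pi(0.002):", weight(0.001), 0.5 * weight(0.002))

# Subcertainty: complementary weights sum to less than one.
print("pi(0.5) + pi(0.5):", 2 * weight(0.5))

# Same probability ratio (1:2) at low and at high probabilities.
print("ratio low, ratio high:",
      weight(0.001) / weight(0.002), weight(0.4) / weight(0.8))
```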
Figure 2: the weighting function (decision weight $\pi(p)$ plotted against stated probability $p$, both axes running from 0 to 1.0).

$\pi$ is not well behaved at the end points. People may discard or overweight low probability events and treat high probability events as if they were certain. $\pi$ is nonlinear. Suppose you are playing Russian Roulette but are given the opportunity to purchase the removal of one bullet. Would you pay the same to reduce the number of bullets from 1 to 0 as from 4 to 3? There are a number of issues raised here about this function and consistent choice between dominated strategies. These are excluded in Prospect theory, as dominated choices are detected and eliminated prior to evaluation. Risk attitudes in Prospect theory are determined jointly by $v$ and $\pi$ and not solely by the utility function. A final comment is that the reference point has been taken as current wealth but it may in fact correspond to the asset position that you expect to attain. A change of reference point alters the preference order for prospects.

Part IV

Noise Trader Risk


Arbitrage ⇒ simultaneous purchase and sale of the same security in two different markets for an advantageous price differential. In theory it requires no capital and entails no risk. Is this arbitrage force that induces efficiency realistic? Basic risks associated with arbitrage: transactions costs, lack of exactly similar assets that can act as perfect substitutes, and mispricing deepening in the short run. This latter risk is the most important, as it is in the relatively short run that traders engage in arbitrage against noise traders. The risk is that noise traders' beliefs can become even more extreme before reverting to the mean.

If noise traders are pessimistic today about an asset and have driven down its price, then an arbitrageur buying the asset has to recognise that in the near horizon noise traders may drive the price down even more. If he needs to liquidate before the price rises he suffers a loss. Fear of loss will limit his original arbitrage position. Similarly an arbitrageur who sells an asset short when bullish noise traders have driven the price up needs to consider that the noise traders may drive the price even higher. He has to account for this possibility when he has to buy back the asset. This risk of a further change in noise traders' opinion away from the mean is NOISE TRADER RISK. It limits the willingness to trade against noise traders.

Assumption of short horizon essential to limit arbitrage when we have perfect substitutes. Frequency of regular inspection of managers (arbitrageurs) by investors. Question of period of arbitrage and evaluation. Mispricings that take longer than the evaluation horizon to correct do not increase the arbitrageur's return, and the deepening of such mispricing in fact decreases it.

Moreover many arbitrageurs borrow funds to trade and have to pay interest and face liquidation if the prices move against them and the value of the collateral falls.

Difficult to believe that identical assets would not trade at the same price because of arbitrage, but noise trader risk shows why this may not be the case.

Royal Dutch and Shell merged on a 60:40 basis and trade as identical claims but under different names on different exchanges, so we would expect the RD price to be 1.5 times the Shell price, but the figure below shows this is not the case. Enormous deviations from parity. No structural explanation for these differences appears to exist. Such differences cannot exist in a market where arbitrageurs have an infinite horizon and face no transactions costs. No difference in fundamental risk. There are other examples in stock and bond markets of persistent mispricing. Closed End Funds trade at different prices to the market values of their portfolios, although in this case there are structural explanations. Question of time and how long an arbitrageur can survive a position taking losses. Noise trader risk. About four years! For about 7% a year. Also costs of leverage.

Mispricing can take a while to correct and enormous inefficiencies can be sustained without arbitrage activity correcting them. Mere unpredictability of investor sentiment provides the explanation. Future noise trader demand for assets: an important source of risk and a deterrent to arbitrage.

A Model of Noise Trader Risk

Noise traders and arbitrageurs: Noise traders, $n$, hold erroneous beliefs about the future distribution of returns on the risky asset (behavioural biases, irrationality). They select portfolios on the basis of these beliefs. Measure $\mu$ in the market. Arbitrageurs, $a$, hold rational expectations and exploit noise traders' misperceptions, pushing prices towards fundamentals but not all the way. Measure $1 - \mu$ in the market.

Basic overlapping generations model with two agents for two periods. No first period consumption, no labour supply decision and no bequest. Resources agents have to invest are exogenous and the only decision is to choose a portfolio when young.

Two assets that pay identical dividends. One is safe, $s$, and pays a fixed real dividend $r$ per period. Asset $s$ is in perfectly elastic supply: a perfect substitute for consumption in any period. With consumption as the numeraire the price of the safe asset is fixed to be 1 in every period. The dividend $r$ is therefore the riskless rate.

The second asset, the unsafe asset $u$, always pays the same fixed real dividend as $s$, but it is not in elastic supply. There is a fixed and unchangeable quantity normalised to one unit. The price of $u$ in period $t$ is denoted by $p_t$. If the price of each asset were determined by the net present value of its future dividends then assets $u$ and $s$ would be perfect substitutes and would sell for the same price of 1 in all periods, but this is not how the price of $u$ is determined in the presence of noise traders.

Analysis relies on the limited risk bearing capacity of arbitrageurs as a whole. Justified by either of two assumptions. First, noise trader risk being market wide rather than idiosyncratic: risk affecting the stock market as a whole or the majority of shares. RD shares move with the Dutch market whereas Shell shares move with the British market. Exposure to two different price movements. Alternatively, the market requires specialist arbitrage resources which are limited. Eg. arbitrage trades in emerging markets are limited by access to these markets. Cost of learning about specific areas: costs of entry. Without one of these assumptions there would be too many opportunities for arbitrageurs to diversify.

Both agents choose their portfolio when young to maximise perceived expected utility given their own beliefs about the ex ante mean of the distribution of the price of $u$ at $t + 1$. The young arbitrageur holds the correct distribution of returns from the risky asset and the representative young noise trader misperceives the expected value of the risky asset by an independent and identically distributed normal random term $\rho_t$. The mean misperception $\rho^*$ is a measure of the average bullishness of the noise traders and $\sigma_\rho^2$ captures the variance of the noise traders' misperceptions of the expected return of the risky asset. Unpredictability of future investor sentiment critical. So noise traders maximise their own expectation of utility given the next period's dividend, the one period variance of $p_{t+1}$ and their false belief that the distribution of the price of $u$ in the next period has mean $\rho_t$ above its true value:

$$\rho_t \sim N(\rho^*, \sigma_\rho^2)$$

Each agent's utility is a constant absolute risk aversion function of wealth when old,

$$U = -e^{-(2\gamma)w}$$

where $\gamma$ is the coefficient of absolute risk aversion and $w$ is wealth when old. Given their beliefs the young divide their portfolios between $u$ and $s$. When old they convert their holdings of $s$ into consumption, sell their holdings of $u$ for price $p_{t+1}$ to the new young and consume all their wealth. Given normally distributed returns to holding the risky asset, maximising the expected value of the utility stream generates demand functions for the risky asset proportional to the perceived expected returns and inversely proportional to the perceived variance of the returns. Denote by $\lambda_t^a$ the amount of the risky asset $u$ held by an arbitrageur and by $\lambda_t^n$ the amount held by a noise trader, and define ${}_t p_{t+1}$ to be the rationally expected price for $u$ at time $t$ for period $t + 1$ and

$${}_t\sigma^2_{p_{t+1}} = E_t\{[p_{t+1} - E_t(p_{t+1})]^2\}$$

to be the one period ahead variance of $p_{t+1}$. The demand functions are given by

$$\lambda_t^a = \frac{r + {}_t p_{t+1} - (1 + r)p_t}{2\gamma({}_t\sigma^2_{p_{t+1}})}$$

and

$$\lambda_t^n = \frac{r + {}_t p_{t+1} - (1 + r)p_t}{2\gamma({}_t\sigma^2_{p_{t+1}})} + \frac{\rho_t}{2\gamma({}_t\sigma^2_{p_{t+1}})}$$

both possibly negative so that short positions can be taken. The fact that returns are unbounded means that each investor could have negative final wealth.

Demands for risky asset are proportional to its perceived excess return and inversely proportional to its variance. The additional term in the noise trader's demand comes from their misperception of the expected return. When they overestimate the expected return they demand more of the risky asset, when they underestimate they demand less.
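A minimal sketch of the two demand functions, written directly from the expressions above. The function name, argument names and the numbers in the example are hypothetical and chosen only for illustration.

```python
def demands(p_t, exp_p_next, var_p_next, rho_t, r, gamma):
    """Holdings of the risky asset u by an arbitrageur and by a noise trader.

    p_t        : current price of u
    exp_p_next : rationally expected price of u next period
    var_p_next : one period ahead variance of the price of u
    rho_t      : noise traders' misperception this period
    r          : dividend (riskless rate)
    gamma      : coefficient of absolute risk aversion
    """
    excess_return = r + exp_p_next - (1 + r) * p_t
    risk_scale = 2 * gamma * var_p_next
    lam_a = excess_return / risk_scale
    lam_n = excess_return / risk_scale + rho_t / risk_scale
    return lam_a, lam_n


# Hypothetical numbers: asset slightly cheap, noise traders bullish.
lam_a, lam_n = demands(p_t=0.98, exp_p_next=1.0, var_p_next=0.01,
                       rho_t=0.02, r=0.05, gamma=1.0)
print("arbitrageur:", lam_a, "noise trader:", lam_n)

# Bearish noise traders hold less than arbitrageurs and may go short.
print(demands(p_t=0.98, exp_p_next=1.0, var_p_next=0.01,
              rho_t=-0.05, r=0.05, gamma=1.0))
```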

Arbitrageurs offset the volatile positions of the noise traders and hence exert a stabilising influence on the market.

The variance of prices in the demand functions is derived solely from noise trader risk. Both types of trader limit their demand for the uncertain asset $u$ because the price at which they can sell the asset when old is uncertain: it depends on the uncertain beliefs of next period's young noise traders. This uncertainty affects all traders regardless of their own beliefs and so limits their willingness to take positions against each other. If there were certainty about the price in the next period then the noise traders and arbitrageurs would hold different beliefs about expected returns with certainty and would therefore take infinite positions against each other, and an equilibrium wouldn't exist.

Noise trader risk limits all investors' positions and in particular keeps arbitrageurs from driving prices all the way to fundamental values.

The pricing function: To calculate equilibrium prices: the old sell their holdings and so the demands of the young must sum to unity in equilibrium, $(1 - \mu)\lambda_t^a + \mu\lambda_t^n = 1$. The equations for $\lambda_t^a$ and $\lambda_t^n$ above imply that

$$p_t = \frac{1}{1 + r}\left[r + {}_t p_{t+1} - 2\gamma({}_t\sigma^2_{p_{t+1}}) + \mu\rho_t\right]$$

This says that the price of the risky asset in period $t$ is a function of period $t$'s misperception by noise traders ($\rho_t$), of the technological ($r$) and behavioural ($\gamma$) parameters of the model and of the moments of the one period ahead distribution of $p_{t+1}$. The model considers only steady state equilibria by imposing the requirement that the unconditional distribution of $p_{t+1}$ be identical to the distribution of $p_t$. The endogenous one period ahead distribution of the price of asset $u$ can then be eliminated from the equation by solving recursively:

$$p_t = 1 + \frac{\mu(\rho_t - \rho^*)}{1 + r} + \frac{\mu\rho^*}{r} - \frac{2\gamma}{r}({}_t\sigma^2_{p_{t+1}})$$

Looking at this expression we can see that only the second term is variable; $\gamma$, $\rho^*$ and $r$ are all constants and the one step ahead variance of $p_t$ is a simple unchanging function of the constant variance of a generation of noise traders' misperceptions $\rho_t$:

$${}_t\sigma^2_{p_{t+1}} = \sigma^2_{p_{t+1}} = \frac{\mu^2\sigma_\rho^2}{(1 + r)^2}$$

The final form for the price of $u$, which depends only on exogenous parameters of the model and on public information about present and future misperception by the noise traders, is

$$p_t = 1 + \frac{\mu(\rho_t - \rho^*)}{1 + r} + \frac{\mu\rho^*}{r} - \frac{(2\gamma)\mu^2\sigma_\rho^2}{r(1 + r)^2}$$

Interpretation: The last three terms in this expression show the impact of noise traders on the price of asset $u$. As the distribution of $\rho_t$ converges to a point mass at zero, the equilibrium pricing function converges to its fundamental value of one.

The second term captures the fluctuations in the price of the risky asset $u$ due to variation in the noise traders' misperceptions (beliefs): when noise traders are bullish they bid up the price, and when they are bearish they bid down the price. When they hold their average misperception, $\rho_t = \rho^*$, the term is zero. The more numerous noise traders are relative to arbitrageurs the more volatile are asset prices. The third term captures the deviations of $p_t$ from its fundamental value given that the average misperception by noise traders is not zero. If noise traders are bullish on average then this price pressure makes the price of the risky asset higher than it would be otherwise. Optimistic noise traders bear a greater risk than they otherwise would do. Since arbitrageurs bear a smaller share of the price risk when $\rho^*$ is higher, they require a lower expected excess return and so are willing to pay a higher price for the risky asset $u$. The final term is the most important: arbitrageurs would not hold the risky asset unless compensated for carrying the risk that noise traders become bearish and the price of the risky asset falls. Both noise traders and sophisticated traders believe in period $t$ that the risky asset is mispriced, but because $p_{t+1}$ is uncertain neither group is willing to bet too much on its value. At the margin the return from enlarging one's position in an asset that is mispriced (though different traders think the mispricing is to a different degree) has to be offset by the additional price risk that is borne. Thus noise traders create their own space: the uncertainty about what the next period's noise traders will believe makes the otherwise riskless asset $u$ risky and drives its price down and its return up.
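As a consistency check on the pricing function, the sketch below evaluates the closed-form price and confirms that, at that price, the demands of the young add up to the unit supply of $u$. Parameter values and function names are hypothetical; only the formulas come from the model described above.

```python
def price_variance(mu, sigma2_rho, r):
    """One period ahead variance of the price of u in the steady state."""
    return mu ** 2 * sigma2_rho / (1 + r) ** 2


def price(rho_t, mu, rho_star, sigma2_rho, r, gamma):
    """Steady-state price of u given this period's misperception rho_t."""
    var_p = price_variance(mu, sigma2_rho, r)
    return (1
            + mu * (rho_t - rho_star) / (1 + r)   # fluctuation term
            + mu * rho_star / r                   # average price pressure
            - 2 * gamma * var_p / r)              # compensation for noise trader risk


# Hypothetical parameter values.
mu, rho_star, sigma2_rho, r, gamma = 0.3, 0.02, 0.01, 0.05, 1.0
rho_t = 0.05

p_t = price(rho_t, mu, rho_star, sigma2_rho, r, gamma)
# Rational expectation of next period's price: rho_{t+1} has mean rho_star.
exp_p_next = price(rho_star, mu, rho_star, sigma2_rho, r, gamma)
var_p = price_variance(mu, sigma2_rho, r)

# Demands of the young at this price.
lam_a = (r + exp_p_next - (1 + r) * p_t) / (2 * gamma * var_p)
lam_n = lam_a + rho_t / (2 * gamma * var_p)
total_demand = (1 - mu) * lam_a + mu * lam_n

print("price:", p_t)
print("arbitrageur and noise trader holdings:", lam_a, lam_n)
print("total demand (should equal the unit supply):", total_demand)
```

In this parameterisation the bullish noise traders are long and the arbitrageurs are short, yet the price stays away from the fundamental value of one, which is the sense in which arbitrage is limited.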

These results rest on the three assumptions in the model: the overlapping generations structure, the fixed supply of the unsafe asset and the systematic nature of noise trader risk.

The overlapping generations model has no last period and an equilibrium always exists as long as returns to holding the risky asset are uncertain. It also ensures the horizon is short. No agent has an opportunity to wait until the price of the risky asset recovers before selling. Emphasises short term rather than long term performance. If the horizons of arbitrageurs are long relative to the duration of the optimism or pessimism of noise traders with respect to the risky asset, then they can buy low, confident of being able to sell high later as prices revert to the mean. Noise trader risk is an important deterrent to arbitrage only when the duration of noise traders' misperceptions is of the same order of magnitude as, or longer than, the horizon of arbitrageurs.

In general the longer the horizons of arbitrageurs the more aggressively they trade and the more efficient are the markets.

The assumption of a fixed supply of the risky asset prevents such opportunities as converting the safe asset into the unsafe asset when it is overpriced and vice versa when it is underpriced. If this assumption were removed completely then the opportunities for arbitrage would expand so as to restore market efficiency. The fixed-supply assumption is entirely realistic in many situations. Royal Dutch and Shell shares exist in fixed proportions and an arbitrageur could not convert the relatively cheap Royal Dutch shares into the relatively expensive Shell shares. In other cases arbitrageurs may actually increase the supply of the expensive unsafe asset. Arbitrageurs create companies in overpriced industries and take them public: high tech stocks in the 1980's, internet stocks in the 1990's, and closed end mutual funds. Increases in the supply of securities during price bubbles have been important historically.

Companies also issue securities when they believe their stock is over priced. Such creation of perfect or close substitutes in response to overpricing of an unsafe asset is exactly what the arbitrageurs in the model would like to do. Companies tend to issue stock when equities are overpriced as a whole.

But such activity is limited and costly (not everyone has started an internet company) and hence arbitrage has its limits. Profitable while it lasts.

Consistent with this is that new issues of equity by new and old companies tend to be notoriously over priced.

If noise trader risk is systematic it affects the market as a whole or the entire relevant section of the market. If it were idiosyncratic then it would not be priced in equilibrium. The necessity of this systematic influence of noise trader risk for it to be priced implies that the securities affected by noise trader sentiment must have correlated returns, even if fundamentally uncorrelated.

So fundamentally unrelated securities subject to the same noise trader sentiment must move together.

So the common observation that particular groups of stocks move together, usually attributed to common fundamental risk, may simply be evidence of common risk exposure that need not be fundamental.

The co-movement of securities which are fundamentally unrelated is strong evidence of the influence of investor sentiment.

Relative Returns of Noise Traders and Arbitrageurs

So we can see that noise traders can affect prices even though there is no uncertainty about fundamentals. Friedman and Fama argue that noise traders earn lower returns so economic selection would work to eliminate them from the market.

However in this model it need not be the case; Noise traders collectively shift opinion and increase the riskiness of the returns to assets. If noise traders' portfolios are concentrated in assets subject to noise trader risk then noise traders can earn a higher average rate of return on their portfolios than arbitrageurs.

$$E(\Delta R_{n-a}) = \rho^* - \frac{(1 + r)^2(\rho^*)^2 + (1 + r)^2\sigma_\rho^2}{(2\gamma)\mu\sigma_\rho^2}$$

This equation makes it clear that for noise traders to earn higher expected returns the mean misperception $\rho^*$ of the returns on the risky asset must be positive. If they tend to hold more of the risky asset ($\rho^* > 0$) and create their own risk, they raise their returns when bullish. There are conflicting forces in general, but there are clear cases in which noise traders are not driven out of the market. Although they may get a higher average return, noise traders will always get a lower average utility. So although noise traders may be influential in the market they will not be so happy on average: not just wealth but volatility matters. A positive $\rho^*$ corresponds to the optimism bias discussed earlier.
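The expected return difference can be evaluated directly to see the conflicting forces at work. The sketch below uses hypothetical parameter values (including a deliberately high risk aversion) purely to exhibit a case where moderately bullish noise traders out-earn arbitrageurs while neutral, bearish or extremely bullish ones do not.

```python
def expected_return_gap(rho_star, sigma2_rho, mu, r, gamma):
    """Expected return of noise traders minus that of arbitrageurs."""
    penalty = ((1 + r) ** 2 * rho_star ** 2 + (1 + r) ** 2 * sigma2_rho) / (
        2 * gamma * mu * sigma2_rho)
    return rho_star - penalty


# Hypothetical parameters; gamma is set high so that risk is well rewarded.
params = dict(sigma2_rho=0.04, mu=0.5, r=0.05, gamma=20.0)

for rho_star in (-0.3, 0.0, 0.3, 3.0):
    gap = expected_return_gap(rho_star, **params)
    print(f"rho* = {rho_star:5.2f}  ->  E(dR) = {gap:8.4f}")
```

Only the intermediate case (average bullishness of 0.3 here) gives a positive gap: the "hold more" effect is linear in $\rho^*$ while the price pressure effect grows with its square.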

The key difference from Friedman's analysis is that the demand curve of arbitrageurs shifts in response to the addition of noise traders and the resulting increase in risk. Because of this shift, arbitrageurs' expected returns may fall relative to the returns of noise traders even though their expected utility rises relative to that of noise traders. Since noise traders' wealth can increase faster than arbitrageurs', it is not possible to make any definite statement that noise traders lose money and eventually become unimportant.

Also expected returns are not the same thing as long run survival. The greater variance of noise traders' returns might give them in the long run a high probability of having low wealth and a low probability of having very high wealth. Even if expected value of wealth is high.

Many other issues affect survival arguments: if noise traders are bullish then they might attract imitators. HERD EFFECTS.

The above arguments have assumed perfect substitutes for arbitrage (same dividends in all states of the world) but in reality this is not the case. There are costs of short selling, and very little is known about investor sentiment: arbitrageurs do not know the mood of the market exactly, and in the model arbitrageurs faced no transactions costs. Bottom line: noise traders are compensated for bearing the risk that they themselves create and so can earn higher returns than arbitrageurs even though they distort prices. May not be driven out of the market. Even in the text book model arbitrage does not work fully; in the more complicated environment of practical markets arbitrage is even more limited. Market efficiency arguments based on the existence of arbitrage seem to be very weak.

Part V

Knightian Uncertainty
The problem with expected utility theory and the modifications imposed by Prospect Theory is that they presume that we can formulate valid probability distributions so that we can construct expected values, expected present values etc.

In many problems we may instead regard the uncertainty that we face as a one-off event rather than arising as a result of repeated trials of some experiment.

In this latter case, which underlies the relative frequency notion of probability, we can adopt standard stochastic formulations of risk. Knightian uncertainty relates to the situation where we are uncertain about the form of uncertainty, if you like, and cannot presume that we can exploit known probability distributions in order to make our investment plans. Many risk factors such as political events are almost impossible to predict with the standard Savage subjective probability distribution structure. ⇒ Use an alternative decision theoretic model: min-max. Epstein and Wang (1994), Intertemporal asset pricing under Knightian Uncertainty (on the reading list), show that under uncertainty there may be indeterminate equilibria, ie. there may be a continuum of equilibria for any given set of fundamentals. That leaves the determination of a particular equilibrium asset price process to "animal spirits" and sizable volatility may result.

Dow and da Costa Werlang, Econometrica (1992), show that under Knightian uncertainty there is an interval of prices within which the agent neither buys nor sells, ie. does not take a position. At prices below the lower limit he is willing to buy the asset and at prices above the upper limit he is willing to sell, but within the interval he will do neither. A further limit to arbitrage.

Dow and da Costa Werlang (1992) use this Knightian approach to look at the question of volatility in financial markets. We considered earlier the variance bounds inequality derived by LeRoy and Porter (1981) and Shiller (1981). Knightian uncertainty can also be used to explain excess volatility.

Systematic violations of the variance bounds inequality exist.

Assume agents are risk neutral, so an asset's price should equal the expectation of the discounted value of the future dividends, $V$. The current price $P$ depends on the agent's current information set $I$.

So $P = E[V \mid I]$. Given that $E(V) = E[E[V \mid I]]$ we have that

$$var(V) = var[E(V \mid I)] + E[var(V \mid I)]$$

Since variances are non-negative and the price equals the conditional expected value we have immediately

$$var(V) \ge var(P)$$
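The variance decomposition behind the bound can be verified exactly on a small discrete example. The three states, their (additive) probabilities and the information partition below are hypothetical illustration values; exact rational arithmetic is used so that the identity holds without rounding.

```python
from fractions import Fraction as F

# Hypothetical additive probabilities, asset values and information partition.
prob = {1: F(1, 2), 2: F(1, 4), 3: F(1, 4)}
value = {1: F(1), 2: F(1, 2), 3: F(0)}
partition = [{1}, {2, 3}]


def mean(dist, x):
    return sum(dist[s] * x[s] for s in dist)


def variance(dist, x):
    m = mean(dist, x)
    return sum(dist[s] * (x[s] - m) ** 2 for s in dist)


price = {}                      # P = E[V | I], state by state
expected_cond_var = F(0)        # E[var(V | I)]
for cell in partition:
    mass = sum(prob[s] for s in cell)
    conditional = {s: prob[s] / mass for s in cell}
    cell_price = mean(conditional, value)
    for s in cell:
        price[s] = cell_price
    expected_cond_var += mass * variance(conditional, value)

var_V = variance(prob, value)
var_P = variance(prob, price)
print("var(V) =", var_V, " var(P) =", var_P, " E[var(V|I)] =", expected_cond_var)
```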

How do these variance bounds apply to a situation of Knightian uncertainty, when the agent doesn't know the true probability distribution of the value $V$?

Gilboa and Schmeidler (1982), (1989) have developed an axiomatic model of rational decision making under Knightian uncertainty. This predicts that agents' behaviour will be represented by a utility function and a (subjective) non-additive probability distribution. A non-additive probability $p$ reflecting aversion to uncertainty satisfies the condition

$$p(A) + p(B) \le p(A \cup B) + p(A \cap B) \qquad (4)$$

rather than the stronger condition satisfied by (additive) probabilities

$$p(A) + p(B) = p(A \cup B) + p(A \cap B) \qquad (5)$$

In particular $p(A) + p(A^c)$ may be less than unity.

The difference could be thought of as a measure of the uncertainty attached to the event $A$. The agent now maximises expected utility under the non-additive distribution, where the expectation of a non-negative random variable $X$ is defined by

$$E[X] = \int_{\mathbb{R}_+} p(X \ge x)\,dx$$

Associated with a non-additive probability $p$ is a set of additive probabilities called the core of $p$, defined as the set of additive probability measures $\pi$ such that $\pi(A) \ge p(A)$ for all events $A$. If the non-additive probability satisfies the inequality (4) (reflecting uncertainty aversion) the core is non-empty. A closely related model of behaviour under uncertainty is for agents to maximise the minimum value of expected utility over the elements in the core.

Assume observed realisations drawn from one element of the core.

When agents satisfy the standard Bayesian assumptions of stochastic decision theory it is natural to expect that they update their beliefs according to Bayes' rule (although this is not implied).

With non-additive probabilities the choice of learning rule that agents might adopt is less clear. The Dempster-Shafer rule is a natural generalisation of Bayes' rule and can be written as follows

$$p(A \mid B) = \frac{p(A \cup B^c) - p(B^c)}{1 - p(B^c)}$$

whereas the usual Bayes rule can be written

$$p(A \mid B) = \frac{p(A \cap B)}{p(B)} = \frac{p(B \mid A)p(A)}{p(B)}$$
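That the Dempster-Shafer rule generalises Bayes' rule can be checked on a small state space: when $p$ is additive the two formulas give the same conditional probability for every pair of events. The three-state space and the weights below are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations

STATES = frozenset({1, 2, 3})


def additive_measure(weights):
    """Build an additive probability on events from state probabilities."""
    return lambda event: sum((weights[s] for s in event), F(0))


def dempster_shafer(p, A, B):
    """p(A | B) = [p(A u B^c) - p(B^c)] / [1 - p(B^c)]."""
    Bc = STATES - B
    return (p(A | Bc) - p(Bc)) / (1 - p(Bc))


def bayes(p, A, B):
    """p(A | B) = p(A n B) / p(B)."""
    return p(A & B) / p(B)


def nonempty_events():
    for size in (1, 2, 3):
        for combo in combinations(sorted(STATES), size):
            yield frozenset(combo)


p = additive_measure({1: F(1, 2), 2: F(1, 3), 3: F(1, 6)})  # hypothetical weights
agree = all(dempster_shafer(p, A, B) == bayes(p, A, B)
            for A in nonempty_events() for B in nonempty_events())
print("Dempster-Shafer equals Bayes for an additive measure:", agree)
```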

An Example of Equilibrium in a Stock Market with Knightian Uncertainty

3 periods, $t = 0, 1, 2$: one risky asset in net positive supply and one safe asset (cash).

The riskless rate of return is zero ⇒ all values are present discounted values.

The asset pays a liquidating dividend of $V$ at time $t = 2$.

3 states of nature, 1, 2 and 3. The value of $V$ is different in each state of nature and is respectively $1$, $1/2$ or $0$.

All agents are identical and risk neutral. In period $t = 1$ agents receive public information about the value of the asset represented by a partition $I$ of the set of states, ie. $I = \{\{1\}, \{2, 3\}\}$; in other words agents are told whether or not state 1 has occurred. Agents' beliefs and attitudes towards uncertainty are represented by the following non-additive probability measure:

$$p_1 = p_2 = p_3 = \frac{1}{4} \quad \text{and} \quad p_{12} = p_{23} = p_{13} = \frac{1}{2}$$

where $p_{ij}$ is the (non-additive) probability of state $i$ or $j$. Because equation (5) doesn't hold it is necessary to specify these in addition to the (non-additive) probability of each state.

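The CORE, $\Pi$, of this measure is the convex hull of the following 3 (additive) probability measures:

$$\pi_1 = \tfrac{1}{2},\ \pi_2 = \tfrac{1}{4},\ \pi_3 = \tfrac{1}{4}$$
$$\pi_1 = \tfrac{1}{4},\ \pi_2 = \tfrac{1}{2},\ \pi_3 = \tfrac{1}{4}$$
$$\pi_1 = \tfrac{1}{4},\ \pi_2 = \tfrac{1}{4},\ \pi_3 = \tfrac{1}{2}$$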

If we let $P_t$ denote the price at time $t$, then since the asset is in positive net supply it follows that $P_0 = E(V)$, $P_1 = E(V \mid I)$ and $P_2 = V$. So in the initial period

$$P_0 = E(V) = 0 + \left(\tfrac{1}{2} - 0\right)p_{12} + \left(1 - \tfrac{1}{2}\right)p_1 = \frac{3}{8}$$

If agents receive good news (state 1) then $P_1 = 1$ (implied by the Dempster-Shafer rule). If they receive bad news (states 2 or 3) we need first to calculate the conditional probability of state 2. By the Dempster-Shafer rule this is given by

$$\frac{p_{12} - p_1}{1 - p_1} = \frac{\tfrac{1}{2} - \tfrac{1}{4}}{1 - \tfrac{1}{4}} = \frac{1}{3}$$

and so the price is $P_1 = \tfrac{1}{2} \cdot \tfrac{1}{3} = \tfrac{1}{6}$. So to summarise:

state   P0    P1    P2
1       3/8   1     1
2       3/8   1/6   1/2
3       3/8   1/6   0
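The prices in the summary table can be reproduced mechanically: $P_0$ is the expectation of $V$ under the non-additive measure (computed as the integral of $p(V \ge x)$), and $P_1$ is the same expectation after a Dempster-Shafer update on the announced cell of the partition. The sketch below is a minimal illustration; the function names are hypothetical.

```python
from fractions import Fraction as F

STATES = frozenset({1, 2, 3})
V = {1: F(1), 2: F(1, 2), 3: F(0)}   # liquidating dividend in each state


def capacity(event):
    """Non-additive probability of the example: 1/4 per state, 1/2 per pair."""
    return {0: F(0), 1: F(1, 4), 2: F(1, 2), 3: F(1)}[len(event)]


def dempster_shafer(p, B):
    """Return the updated measure p(. | B)."""
    Bc = STATES - B
    return lambda A: (p(A | Bc) - p(Bc)) / (1 - p(Bc))


def expectation(p, payoff, states):
    """Expectation of a non-negative payoff: integral of p(payoff >= x) dx."""
    total, previous = F(0), F(0)
    for level in sorted({payoff[s] for s in states}):
        upper_set = frozenset(s for s in states if payoff[s] >= level)
        total += (level - previous) * p(upper_set)
        previous = level
    return total


good_news, bad_news = frozenset({1}), frozenset({2, 3})
P0 = expectation(capacity, V, STATES)
P1_good = expectation(dempster_shafer(capacity, good_news), V, good_news)
P1_bad = expectation(dempster_shafer(capacity, bad_news), V, bad_news)
print(P0, P1_good, P1_bad)   # prints: 3/8 1 1/6
```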

If the actual distribution of prices is $\pi$ then the variances are

$$var(P_1) = \pi_1(1 - \pi_1) + \frac{\pi_{23}(1 - \pi_{23})}{36} - \frac{\pi_1\pi_{23}}{3}$$

$$var(V) = \pi_1(1 - \pi_1) + \frac{\pi_2(1 - \pi_2)}{4} - \pi_1\pi_2$$

where $\pi_{23} = \pi_2 + \pi_3$ since $\pi$ is an additive probability distribution. It is not immediately clear what relationship $\pi$ should bear to the subjective non-additive distribution $p$. If the agent were not uncertainty averse $p$ would be an additive distribution. Assume $\pi \in \Pi$, the core of $p$. For $\pi \in \Pi$, the variance of $P_1$ ranges over the interval

$$\left[\tfrac{25}{192}, \tfrac{100}{576}\right] = [0.1302, 0.1736]$$

while the variance of $V$ ranges over the interval

$$\left[\tfrac{8}{64}, \tfrac{11}{64}\right] = [0.1250, 0.1719]$$

Both end points of possible variances are larger for the price than for the value. This is a stronger property than having an element of $\Pi$ for which the value has lower variance than the price.

For the three probability distributions listed above we have:

In the case of $\pi_1 = \tfrac{1}{2}, \pi_2 = \tfrac{1}{4}, \pi_3 = \tfrac{1}{4}$: $var(P_1) = 0.1736$, $var(V) = 0.1719$, which violates the variance bound.

In the case of $\pi_1 = \tfrac{1}{4}, \pi_2 = \tfrac{1}{2}, \pi_3 = \tfrac{1}{4}$: $var(P_1) = 0.1302$, $var(V) = 0.1250$, which violates the variance bound.

In the case of $\pi_1 = \tfrac{1}{4}, \pi_2 = \tfrac{1}{4}, \pi_3 = \tfrac{1}{2}$: $var(P_1) = 0.1302$, $var(V) = 0.1719$, which does not violate the variance bound.
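The three cases can be recomputed from first principles by taking the variance of $P_1$ and of $V$ under each vertex of the core, using the prices $P_1 = 1$ in state 1 and $P_1 = 1/6$ in states 2 and 3. This is a minimal check with hypothetical variable names.

```python
from fractions import Fraction as F

P1 = {1: F(1), 2: F(1, 6), 3: F(1, 6)}    # period-1 price in each state
V = {1: F(1), 2: F(1, 2), 3: F(0)}        # liquidating value in each state

CORE_VERTICES = [
    {1: F(1, 2), 2: F(1, 4), 3: F(1, 4)},
    {1: F(1, 4), 2: F(1, 2), 3: F(1, 4)},
    {1: F(1, 4), 2: F(1, 4), 3: F(1, 2)},
]


def variance(pi, x):
    """Variance of the state-contingent quantity x under additive measure pi."""
    mean = sum(pi[s] * x[s] for s in pi)
    return sum(pi[s] * (x[s] - mean) ** 2 for s in pi)


results = []
for pi in CORE_VERTICES:
    var_price, var_value = variance(pi, P1), variance(pi, V)
    results.append((var_price, var_value))
    print(f"var(P1) = {float(var_price):.4f}  var(V) = {float(var_value):.4f}  "
          f"bound violated: {var_price > var_value}")
```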

So agents who are uncertainty averse in the sense of Gilboa and Schmeidler may violate the variance bound. An alternative approach to decision making under Knightian uncertainty is provided by min-max decision theory. Again Gilboa and Schmeidler have been instrumental in this development, although the general decision theory dates back to Wald in the 1950's. We will consider the situation as described in the paper by Donald Lien on the reading list: Production and Hedging under Knightian Uncertainty.

We consider a firm that has imprecise information about the pdf of future output prices or futures prices.

In this case it adopts a max-min rule, which Lien shows leads to inertia in hedging (under Knightian uncertainty). This implies that in the forward market there is a region for the current forward price within which a full hedge is the optimal hedge. Under the standard probabilistic approach a full hedge is optimal only when the current forward price equals the expected future spot price. May explain why a 1:1 hedge ratio is often seen in practice. Inertia increases as the ambiguity with respect to the pdf increases. Ambiguity aversion underlies Knightian uncertainty.

Basic Framework

1 period model, time 0 to time 1. At time 0 the producer (hedger) has to decide how much to produce and how much to trade forward. Production is non-stochastic and output becomes available only at $t = 1$. A forward contract provides a given price $f_0$ for delivery at time $t = 1$. Let $Q$ denote the output level and $x$ the amount sold on forward contracts. The producer's profits at time $t = 1$ are then

$$\pi = p_1(Q - x) + f_0 x - c(Q)$$

where $p_1$ is the price prevailing at time $t = 1$ and $c(Q)$ is the cost function incurred at $t = 0$ (increasing and convex).

The producer is assumed to have a utility function u(:) which is increasing and strictly concave.

The optimal production level and forward sales are chosen so as to maximise the expected utility

$$E(u(\pi)) = E[u(p_1(Q - x) + f_0 x - c(Q))]$$

To introduce the Knightian uncertainty, assume that the pdf of $p_1$ cannot be precisely specified. So with probability $\vartheta$ the pdf is $g(\cdot)$ with support $[p_L, p_H]$ and with probability $1 - \vartheta$ the pdf can take any form with the same support.

Knightian Uncertainty in this case assumes the min- max principle; ie. when the pdf is unknown the producer assumes the worst possible outcome.

The optimal action is then chosen to maximise the resulting expected utility.

Let m(:) be an arbitrary pdf with bounded support,[pL; p The objective function of the producer / hedger is then given by
fQ;xg fmg

max min[#Eg (u( )) + (1 #)Em(u( ))]
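The hedging inertia can be illustrated numerically. The sketch below is not Lien's derivation: it assumes a fixed output Q = 1 (so the cost c(Q) is a constant and is dropped), CARA utility, a reference pdf g that is uniform on a price grid, and a grid search over the forward position x. The function name `maxmin_hedge`, the argument `theta` (the probability weight on g) and all parameter values are illustrative assumptions. Because profit is linear in the price, the worst case over all pdfs m on [p_L, p_H] puts all its mass on the single worst price, so the inner minimum is simply the lowest utility on the grid.

```python
import math

def maxmin_hedge(f0, theta, Q=1.0, p_low=0.5, p_high=1.5, risk_aversion=2.0):
    """Forward sale x maximising theta*E_g[u] + (1-theta)*min_m E_m[u] on a grid."""
    # Assumed reference pdf g: uniform over an evenly spaced price grid.
    prices = [p_low + (p_high - p_low) * k / 200 for k in range(201)]
    # Candidate forward positions between 0 and 2Q (Q itself is on the grid).
    x_grid = [2.0 * Q * k / 400 for k in range(401)]

    def utility(wealth):
        # Assumed CARA utility: increasing and strictly concave.
        return -math.exp(-risk_aversion * wealth)

    def objective(x):
        # Production cost is omitted because Q is held fixed in this sketch.
        utils = [utility(p * (Q - x) + f0 * x) for p in prices]
        expected = sum(utils) / len(utils)   # E_g[u(profit)]
        worst = min(utils)                   # worst case over all pdfs m
        return theta * expected + (1.0 - theta) * worst

    return max(x_grid, key=objective)

for f0 in (0.95, 1.00, 1.05, 1.30):
    print(f0, maxmin_hedge(f0, theta=0.8), maxmin_hedge(f0, theta=1.0))
```

With theta = 0.8 the full hedge x = Q is selected for a band of forward prices around the expected spot price of 1, whereas with theta = 1 (no ambiguity) the hedge moves away from Q as soon as f0 differs from the expected spot price.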

This sort of strategy has become dominant in control theory in recent years and is known as $H_\infty$ theory.

There is a direct link from this form of decision rule to extreme risk aversion.

In practice we may not be interested in extreme risk-sensitive decision rules all the time: risk aversion may be state dependent, as we saw last week with the use of Prospect theory and the state dependence of the adjustment to the utility function.

In fact $H_\infty$ theory is entirely deterministic and treats all disturbances as one-off shocks that do not follow any stochastic distribution at all. We assume that nature will throw the worst-case disturbances at us and then adopt the best rule in that case. It is a fascinating fact that the deterministic $H_\infty$ decision rule can be shown to coincide with an extreme risk aversion case of a stochastic decision rule.
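A simple static analogue of this link (not the control-theoretic result itself) is the certainty equivalent under exponential utility, -(1/a) log E[exp(-a W)], which tends to the worst possible payoff as the risk aversion parameter a grows. The sketch below assumes a three-point payoff distribution chosen purely for illustration; the function name and numbers are not from the lecture.

```python
import math

def exponential_certainty_equivalent(payoffs, probs, a):
    """Return -(1/a) * log E[exp(-a * W)], computed in a numerically stable way."""
    exponents = [-a * w for w in payoffs]
    shift = max(exponents)  # factor out the largest term to avoid overflow
    total = sum(q * math.exp(e - shift) for q, e in zip(probs, exponents))
    return -(shift + math.log(total)) / a

payoffs = [-1.0, 0.0, 2.0]   # illustrative payoffs; the worst case is -1
probs = [0.1, 0.6, 0.3]      # illustrative probabilities; the mean payoff is 0.5

for a in (0.001, 1.0, 10.0, 1000.0):
    print(a, exponential_certainty_equivalent(payoffs, probs, a))
```

For a close to zero the certainty equivalent is close to the mean payoff; for very large a it is close to the minimum payoff, which is what a deterministic worst-case rule would use.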

Part VI

Herd Effects

What we are now interested in is what happens when each decision maker looks at the decisions taken by previous decision makers before taking their own decision.

This is rational since others may have private information that is not known to you. We can then show that the decision rules chosen by optimising individuals will be characterised by herd behaviour, i.e. people will do what others are doing rather than what is optimal given their own information.

The resulting equilibrium is inefficient, e.g. the beauty contest of Keynes, new technology adoption, etc.

Consider the following situation: we have to choose between two restaurants, both of which are more or less unknown to us. Say there is a population of 100 people facing the same decision. The 2 restaurants are next to each other and it is known that the prior probabilities are 51% that restaurant A is better and 49% that restaurant B is better. People arrive at the restaurants in sequence, observe the choices made by people before them and decide which one to go to. Apart from the prior probabilities, each person also gets a signal which says that either A or B is better. The signal could be wrong of course.

Assume each person's signal is of the same quality. Suppose that of the 100 people, 99 have received signals that B is better, but one person, whose signal favours A, arrives first. So the first person goes to A. The second person now knows that the first person has gone to A and had a signal that favoured A, while her own signal favoured B. Since the signals are of equal quality they effectively cancel each other out, and the rational choice is to follow the prior probabilities, which is to go to A. So the second person chooses A regardless of her own private information (signal). Her choice thus provides no new information to the next person in the line. The third person's situation is now exactly the same as the second's and she should make the same choice, and so on.

So everyone ends up making the same choice to go to A even if, given the aggregate information, it is practically certain that B is better.

If the second person had followed her own information, the third person would have known that her signal favoured B, and the third person would also have chosen B, and so on.

The second person's decision to ignore her own private information and join the herd inflicts a negative externality on the rest of the population: the HERD EXTERNALITY.
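The restaurant example can be traced step by step. The sketch below assumes, as in the text, that signals are of equal quality so that an A signal and a B signal cancel, and that ties are broken by the prior (which favours A). It also assumes the signals are accurate enough that a net lead of one signal outweighs the small prior advantage. The function name and the bookkeeping variable `lead` are illustrative.

```python
def restaurant_choices(signals):
    """Sequential choices between restaurants 'A' and 'B'.

    signals: list of private signals, each 'A' or 'B', in order of arrival.
    """
    weight = {"A": 1, "B": -1}
    lead = 0        # inferred A-signals minus inferred B-signals so far
    choices = []
    for own in signals:
        def best(signal):
            # Ties go to A because the prior favours A (51% against 49%).
            return "A" if lead + weight[signal] >= 0 else "B"

        choices.append(best(own))
        # A choice reveals the private signal only if it depends on that signal.
        if best("A") != best("B"):
            lead += weight[own]
    return choices

herd = restaurant_choices(["A"] + ["B"] * 99)
print(herd.count("A"), herd.count("B"))   # prints: 100 0
```

Only the first person's choice is informative here; from the second person onwards the choice is A whatever the private signal, so the 99 signals favouring B are never revealed.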

There are a number of different explanations/models for herding in which rational agents all act alike:

- direct payoff externalities (negative externalities in bank runs, positive externalities in the generation of trading liquidity or information acquisition): payoffs increase as more people adopt the same action
- agency issues (based on managerial desire to protect or signal reputation): in order to preserve or gain reputation when the markets are imperfectly informed, managers may prefer to hide in the herd so as not to be distinct, or ride the herd to prove quality; micro incentives
- information learning (cascades): later agents, inferring information from the actions of prior agents, optimally decide to ignore their own information and act alike.

Imitation and mimicry are basic instincts: fashion, fads, wherever decisions are affected by others.

HERDING: behaviour patterns that are correlated across individuals, although such correlation could also be due to correlated information arrival in independently acting investors. Systematically erroneous, i.e. suboptimal, decision making by an entire population. Herding is closely linked to imperfect expectations, fickle changes without new information, bubbles, fads, frenzies.

Herding does require a coordination mechanism: either a widespread rule to coordinate based on some signal (e.g. a price movement) or a direct ability to observe other decision makers (observing investment trends).

Two views of herding: rational or non-rational.

Non-rational view, based on investor psychology: people behave like lemmings, following each other blindly and forgoing rational analysis.

Rational view centres on externalities: optimal decision making distorted by information difficulties or incentive issues. Intermediate view, near rational: economising on information processing or information costs by using "heuristics".

Grossman and Stiglitz (1976), Information and Competitive Price Systems, American Economic Review, 66, 2, 246-253, and Diamond and Verrecchia (1981), Information Aggregation in a Noisy Rational Expectations Economy, Journal of Financial Economics, 9, 3, 221-236: the market price is the only efficient coordination device for investors, each investor's valuation is an interior combination only of private and public information, and local information is quickly dispersed into and dominated by the market price. Instead we shall look at models in which incentives to follow the herd increase with the size of the herd.

Herding and the Efficient Markets Hypothesis

The EMH was so successful because it replaced the previously dominant notion of irrational markets driven by herds. Keynes' famous adage was that the stock market was mostly a beauty contest in which judges picked who they thought other judges would pick, rather than who they thought was the most beautiful. There exists a range of examples of bubble-like phenomena in financial markets which are difficult to explain in standard efficient market models. Many financial markets display waves and fragility, e.g. mergers and IPOs come in waves that are more amplified than possible waves in fundamentals. Consensus among market participants seems to be low, not based on private information and yet still localised (cf. Shiller et al., Why did the Nikkei crash?, Review of Economics and Statistics, 1995), indicating that independent decision making across all market participants is a fiction. Many market participants emphasise that their decisions are highly influenced by other market participants.

A Theory of Fads, Fashion, Custom and Cultural Change as Informational Cascades. Bikhchandani, Hirshleifer and Welch, JPE, 1992, vol. 100, no. 5

Let them alone: they be blind leaders of the blind. And if the blind lead the blind, both shall fall into the ditch. [Matthew 15:14]

Localised conformity exists everywhere- Americans are Americans, Germans are Germans, Italians are Italians etc

Four primary mechanisms have been suggested for uniform social behaviour:

i) sanctions on deviants ii) positive payoff externalities (driving on the right, Betamax vs VHS) iii) conformity preference iv) communication

The model of information cascades explains conformity but also rapid and short-lived fluctuations such as fads and fashions, booms and crashes. Slight shocks can lead to big shifts in mass behaviour if close to the border between choices.

An information cascade occurs when it is optimal for an individual, having observed the actions ahead of him, to follow the behaviour of the preceding individuals without regard to his own information.

In a sequential choice problem, at some stage a decision maker will ignore private information and act only on the basis of previous decisions (e.g. journal submission), and at this stage the result of the decision becomes uninformative to others.

A model of Information Cascades

Assume there is a sequence of individuals, each deciding whether or not to adopt some behaviour. Each individual observes the choices of all those before him. The ordering is exogenous. All have the same cost of adopting, $C$, which is set to $1/2$.

The gain from adopting, $V$, is also the same for all and is equal to either 0 or 1, each with probability $1/2$.

Each individual privately observes a conditionally independent signal about value. Individual $i$'s signal $X_i$ is either $H$ or $L$; $H$ is observed with probability $p_i > 1/2$ if the true value is 1 and with probability $1 - p_i$ if the true value is 0.

Binary signal probabilities:

          Pr(X_i = H | V)    Pr(X_i = L | V)
V = 1     p_i                1 - p_i
V = 0     1 - p_i            p_i

All signals are identically distributed: $p_i = p$ for all $i$.

The expected value of adoption is just $E[V] = \gamma \cdot 1 + (1 - \gamma) \cdot 0 = \gamma$, where $\gamma$ is the posterior probability that the true value is one.
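The posterior probability that the true value is one follows from Bayes' rule. A minimal sketch, assuming the individual knows (or has inferred) how many H and L signals have been observed; the function names are illustrative, and the prior of 1/2 on each value cancels out of the ratio.

```python
def posterior_value_is_one(num_high, num_low, p):
    """Posterior Pr(V = 1) after num_high H signals and num_low L signals."""
    likelihood_one = p ** num_high * (1 - p) ** num_low    # Pr(signals | V = 1)
    likelihood_zero = (1 - p) ** num_high * p ** num_low   # Pr(signals | V = 0)
    return likelihood_one / (likelihood_one + likelihood_zero)

def strictly_prefers_adoption(num_high, num_low, p, cost=0.5):
    """True when the expected value of adoption strictly exceeds the cost C = 1/2."""
    return posterior_value_is_one(num_high, num_low, p) > cost

print(posterior_value_is_one(1, 0, 0.7))   # one H signal: posterior equals p
print(posterior_value_is_one(1, 1, 0.7))   # one H and one L cancel: posterior 1/2
```

With one H and one L signal the posterior is exactly 1/2, equal to the cost, so the individual is indifferent between adopting and rejecting.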

Thus the first individual adopts if his signal is $H$ and rejects if it is $L$. The second individual can infer the first individual's signal from his decision. If the first individual adopted, the second individual adopts if his signal is also $H$. However, if his signal is $L$, the second individual computes the expected value of adoption (given one $H$ and one $L$ signal) to be $1/2$. Being indifferent, he adopts with probability $1/2$. Similarly, if the first individual had rejected, the second individual rejects if his signal is also $L$ and adopts with probability $1/2$ if his signal is $H$. The third individual is faced with one of three situations: (1) both predecessors have adopted (in which case even an $L$ signal induces him to adopt, which creates an UP CASCADE), (2) both have rejected (in which case even an $H$ signal induces him to reject, which creates a DOWN CASCADE), or (3) one has adopted and the other rejected.

In the last case, the third individual is in the same situation as the first: his expected value of adoption, based only on his predecessors' actions, is $1/2$, and therefore his signal determines his choice. If this happens then a similar analysis shows that the fourth individual would be in the same situation as the second individual, the fifth as the third, and so on. With this decision rule, we can derive the unconditional ex ante probabilities of an up cascade, no cascade and a down cascade after two individuals,

$$\frac{1 - p + p^2}{2}, \qquad p - p^2, \qquad \frac{1 - p + p^2}{2},$$

and after an even number $n$ of individuals,

$$\frac{1 - (p - p^2)^{n/2}}{2}, \qquad (p - p^2)^{n/2}, \qquad \frac{1 - (p - p^2)^{n/2}}{2}. \qquad (6)$$
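These probabilities follow directly from the decision rule. As a check, write A for adopt and R for reject and condition on the true value being one (the case of a true value of zero is symmetric, with H and L interchanged):

```latex
\begin{align*}
\Pr(\text{no cascade} \mid V = 1)
  &= \Pr(AR \mid V = 1) + \Pr(RA \mid V = 1)
   = p(1 - p)\tfrac{1}{2} + (1 - p)\,p\,\tfrac{1}{2} = p - p^2, \\
\Pr(\text{up} \mid V = 1)
  &= p\left[p + \tfrac{1}{2}(1 - p)\right] = \frac{p(p + 1)}{2}, \\
\Pr(\text{up} \mid V = 0)
  &= (1 - p)\left[(1 - p) + \tfrac{1}{2}p\right] = \frac{(1 - p)(2 - p)}{2}, \\
\Pr(\text{up})
  &= \tfrac{1}{2}\left[\frac{p(p + 1)}{2} + \frac{(1 - p)(2 - p)}{2}\right]
   = \frac{1 - p + p^2}{2}.
\end{align*}
```

No cascade after n individuals requires each of the n/2 successive pairs to end without a cascade, which has probability $(p - p^2)^{n/2}$; by symmetry the remaining probability is split equally between an up and a down cascade.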

Equation (6) shows that the closer $p$ is to $1/2$, the later a cascade is likely to start. A reduction of $p$ towards $1/2$ is equivalent to adding noise to the signal: at $p = 1/2$ the signal is uninformative. In other words, cascades tend to start sooner when individuals have more precise signals of the value of the investment.

Moreover, according to (6), the probability of not being in a cascade falls exponentially with the number of individuals. Even for a very noisy signal, as when $p = 1/2 + \varepsilon$ with $\varepsilon$ arbitrarily small, this probability after only 10 individuals is less than 0.1 percent!
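The figures quoted from (6) are easy to reproduce. A minimal sketch; the function name is illustrative and n is assumed to be even, as in the text.

```python
def cascade_probabilities(p, n):
    """Unconditional (up, none, down) probabilities after an even number n of individuals."""
    if n % 2:
        raise ValueError("n must be even")
    none = (p - p * p) ** (n // 2)   # every one of the n/2 pairs ends without a cascade
    return (1 - none) / 2, none, (1 - none) / 2

for p in (0.5, 0.6, 0.8, 0.95):
    print(p, cascade_probabilities(p, 10)[1])
```

At p = 1/2 the probability of no cascade after 10 individuals is (1/4)^5, a little under 0.1 percent, and it is smaller still for any more informative signal.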

We can also derive the probability of ending up in the correct cascade.

The probabilities of an UP cascade, no cascade and a DOWN cascade after two individuals, given that the true value is one, are

$$\frac{p(p + 1)}{2}, \qquad p(1 - p), \qquad \frac{(p - 2)(p - 1)}{2}, \qquad (7)$$

and after an even number $n$ of individuals are

$$\frac{p(p + 1)\left[1 - (p - p^2)^{n/2}\right]}{2(1 - p + p^2)}, \qquad (p - p^2)^{n/2}, \qquad \frac{(p - 2)(p - 1)\left[1 - (p - p^2)^{n/2}\right]}{2(1 - p + p^2)}.$$

The first is the expression for the probability of a correct cascade. It is increasing in $p$ and $n$. See plot below. Even for very informative signals (where $p$ is far from $1/2$) the probability of the wrong cascade is remarkably high.

Figure 3: correct cascade. [Plot of the probability (PROB) of the correct cascade and of the incorrect cascade against signal accuracy.]
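The quantities plotted in Figure 3 can be tabulated from the expressions for a true value of one, and checked against a direct simulation of the decision rule. This is a sketch under the model's assumptions (identical signal accuracy p, a coin flip when indifferent); the function names, the choice n = 20 and the number of simulated runs are illustrative.

```python
import random

def correct_cascade_prob(p, n):
    """Probability of an UP cascade after n individuals when the true value is one."""
    none = (p - p * p) ** (n // 2)
    return p * (p + 1) * (1 - none) / (2 * (1 - p + p * p))

def wrong_cascade_prob(p, n):
    """Probability of a DOWN cascade after n individuals when the true value is one."""
    none = (p - p * p) ** (n // 2)
    return (p - 2) * (p - 1) * (1 - none) / (2 * (1 - p + p * p))

def simulate_run(p, n, rng):
    """One run of the decision rule with true value one: 'up', 'down' or 'none'."""
    for _ in range(n // 2):
        first_high = rng.random() < p        # H signal with probability p when V = 1
        second_high = rng.random() < p
        first_adopts = first_high            # outside a cascade the first follows his signal
        if second_high == first_high:
            second_adopts = first_adopts
        else:
            second_adopts = rng.random() < 0.5   # indifferent: adopt with probability 1/2
        if first_adopts and second_adopts:
            return "up"
        if not first_adopts and not second_adopts:
            return "down"
    return "none"

rng = random.Random(1)
for p in (0.55, 0.65, 0.75, 0.85, 0.95):
    runs = [simulate_run(p, 20, rng) for _ in range(20000)]
    print(p, round(correct_cascade_prob(p, 20), 3),
          round(wrong_cascade_prob(p, 20), 3),
          round(runs.count("up") / len(runs), 3))
```

Even at p = 0.9 the formula gives a probability of the wrong cascade of roughly 6 percent, which is the sense in which the wrong cascade remains remarkably likely.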

The problem with cascades is that they prevent information aggregation across individuals. Ideally, if the information of numerous individuals is aggregated then (consistent with laws of large numbers) we should expect later individuals to converge to the correct action. However, once a cascade has started, actions convey no information about private signals, so an individual's action does not improve later decisions.

This basic model can be modified in many ways, for instance by allowing different individuals to have signals of different accuracy. There is also a literature on the fragility of information cascades. Actions by early individuals are important in a cascade. The depth of the cascade need not increase with the number of adopters: once the cascade has started, further adopters are uninformative. Conformity can be brittle; the arrival of a shock or the mere possibility of a value change can shatter the information cascade,

e.g. release of public information (smoking and cancer) effectively changes the decision of predecessors.

Huge literature on social learning and conventions, e.g. driving on the right/left. Reputation adds complicated nonlinear dynamics and volatility.

Part VII

Bubbles and Irrational Exuberance


In the earlier discussion of efficient markets we saw that there is a problem with interpreting the standard value relationship as the only rational pricing solution. In fact there can be an infinite number of "bubble" solutions to the first-order (Euler) conditions for efficient asset pricing, all of which are therefore considered to be rational. These solutions, which do not satisfy a transversality condition, can therefore be called rational bubble solutions.

The lack of a transversality condition arises from the forward solution of the (rational expectations) forward-looking equation, which implies that today's fair value is given by the expected discounted future dividend stream.

The bubble was any term $b_t$ that satisfied the expectational condition in the Euler equation.
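As a reminder of where the bubble term comes from, here is a sketch under standard assumptions that are not restated in these notes: risk neutrality, a constant discount factor $\delta = 1/(1 + r)$, price $P_t$ and dividend $D_t$. All of this notation is assumed here rather than taken from the earlier lecture.

```latex
\begin{align*}
P_t &= \delta\, E_t\left[P_{t+1} + D_{t+1}\right]
  && \text{(Euler condition)} \\
P_t^{*} &= \sum_{j=1}^{\infty} \delta^{j} E_t\left[D_{t+j}\right]
  && \text{(forward solution, imposing } \lim_{T \to \infty} \delta^{T} E_t\left[P_{t+T}\right] = 0 \text{)} \\
P_t &= P_t^{*} + b_t, \qquad b_t = \delta\, E_t\left[b_{t+1}\right]
  && \text{(general solution)}
\end{align*}
```

Any $b_t$ that satisfies the last condition leaves the Euler condition intact, but since $E_t[b_{t+1}] = (1 + r) b_t$ the discounted bubble term does not vanish, so the transversality condition fails whenever $b_t \neq 0$.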

We also showed how Frankel and Froot demonstrated that this term could in fact be a nonlinear function of the fundamentals.

Related literature of sunspots or multiple equilibria in nonlinear dynamic models.

We now turn to consider how bubble phenomena can be explained by the herding behaviour discussed in last week's lecture. Before we do this we will consider several historical bubble episodes. A good reference to this literature is the special issue of the Journal of Economic Perspectives, Spring 1990.

An Historical Overview

Tulip mania, bubble, chain letter, Ponzi scheme, panic, crash, speculative attack, financial crisis: these terms invoke ideas of frenzied and potentially irrational behaviour, but is it? Self-fulfilling behaviour?

Perception of an increased probability of potentially large returns. While some of these perceptions may in the end prove erroneous, are asset price movements based on them fundamentals based? Dutch Tulip Mania (1634-37), Mississippi Bubble (1719-20) and South Sea Bubble (1720). Outbursts of irrationality?

The Fundamentals of Tulipmania

The Netherlands became a centre of cultivation and development of new tulip varieties. Professional growers and flower fanciers created a market for rare varieties in which bulbs sold for high prices, e.g. a Semper Augustus bulb sold for 2000 guilders in 1625, about $16,000. Ordinary varieties sold very cheaply. By 1636 rapid price rises attracted speculators, and prices surged from November 1636 through January 1637. In February 1637 prices suddenly collapsed (rapid rise for two weeks) and bulbs could then not be sold for 10% of their previous peak value. By 1739 all the most prized bulbs of the tulip mania had fallen to no more than 0.1 guilder, 1/200 of 1% of the peak price. No serious economic distress in the Netherlands for years afterwards.

What should the fundamental price of tulip bulbs have been?

There is a standard dynamic in new/rare bulbs even today until supply expands. Then prices fall, but how fast? Did prices decline faster than expected? Fashion.

By 1707 the tulip had been replaced as the most fashionable flower by the hyacinth. The average annual rate of price decline was about 28.8% after 1707 and 32% immediately after the mania, 1637-43. The crash itself in fact accounted for only about a 16% decline: large, but hardly as dramatic as legends make it.

The Mississippi and South Sea Bubbles

Financial dynamics of these episodes seem remarkably similar. Both involve a company that sought a rapid expansion of its balance sheet through corporate takeovers or acquisition of government debt, financed by successive issues of shares.

The new waves of shares that were marketed were offered at higher and higher prices. Prospective future dividends. Profits from later entrants paid to earlier entrants.

The purchasers of the last shares took the greatest losses when the stock price fell but the initial buyers generally gained.

Critical question of intrinsic value compared to market value. Ponzi scheme.

Structure not uncommon in the early days of all successful companies, as long as future earnings materialise.

Sequences of stock issues at increasing prices are not uncommon; regulators.

Buyers may in fact realise that large future dividends are not reasonable but that a sequence of buyers at increasing prices is likely to materialise.

Buy in on a gamble that you are not in the last wave.

South Sea Bubble: British debt in 1720 amounted to approximately 50 million pounds. 18.3 million was held by the three largest corporations: 3.4 million by the Bank of England, 3.2 million by the East India Company, and 11.7 million by the South Sea Company. Redeemable government bonds held privately amounted to 16.5 million; about 15 million was in irredeemable long and short annuities. In 1720 the assets of the South Sea Company consisted of its monopoly rights on British trade with the South Seas (the Spanish colonies of South America) and its holdings of government debt. British trade with South America was blocked by the Spanish fleet, so only the holdings of government debt are relevant to the story. After competitive bidding between the Bank of England and the South Sea Company, the bill to allow the South Sea Company to refund the debt passed its first passage through Parliament on March 21st 1720. To acquire this right the South Sea Company had to pay the government up to 7.5 million pounds if it managed to acquire the 31 million pounds of debt in non-corporate hands.

To finance the debt acquisition, the company was permitted to expand the number of its shares, each of which had a par value of £100. For each £100 per year of the long and short annuities acquired, the company could increase the par value of its shares outstanding by £2000 and £1400 respectively. For each £100 par value of the redeemables acquired, it could increase its stock issue by £100. The interest to be paid by the government on the debt thus acquired by the company was 5% until 1727 and 4% thereafter, implying a substantial reduction in the annual debt service of the government. Conditional on the passage of the refunding act, the South Sea Company paid bribes to leading members of Parliament and favourites of the king totalling £1.3 million. Numerous members of Parliament and the government also participated in the sequence of stock subscriptions through August 1720, and most received large cash loans from the company on their shares: 128 MPs in the first subscription, 190 in the second, 352 in the third. The total par value of the shares acquired by them was £548,000. Prior to the refunding operation the par value of South Sea shares outstanding was 11.7 million pounds, and after the speculation this was increased to 22.8 million. So people in powerful positions took 17% of the additional shares created. Parliament and the government supported the refunding so enthusiastically that it signalled official cooperation in the South Sea Company's future ventures. The South Sea Company was then leveraged so as to undertake commercial projects that would drive the economy forward.

The movements in the South Sea share price were as follows. Starting from about £120 per £100 share value in January 1720, prices moved upward as the refunding proposal was negotiated, and with the passing of the refunding act prices jumped from about £200 to £300. To finance the bribes and loans the company offered share subscriptions (April 14th and 29th). In the first the company offered 22,500 shares issued at a price of £300 and in the second 15,000 shares at a price of £400, realising £2,000,000 to pay its bribe commitments. The first debt conversion, aimed at convincing the holders of irredeemable government annuities to agree to exchange them for South Sea shares, began on April 28th, with one week to make a decision. The market value of an annuity was about £1600, and the conversion offer, with the value of South Sea stock at about £400 a share, was worth £3375; since annuity holders would not lose unless the value of South Sea stock fell beneath £146, the offer was highly attractive.

All government creditors who subscribed assented in advance to the company's terms, so the company absorbed 64% of long annuities and 52% of short annuities. As it became clear that the company would get most of the outstanding debt, its share price rose to £700. To get sufficient cash to continue with the loans to shareholders, the company undertook a third cash subscription (June 17, 1720) in which it sold a par value of £5,000,000 (50,000 shares) for a market price of £1000 per share, 1/10 down and the rest in stages. Prices immediately jumped from 745 to 950. Finally the company held 80% of the public's holdings of irredeemables and 85% of redeemables.

The Price Collapse

South Sea prices collapsed from about £775 on August 31st to about £290 on October 1, 1720. Shares outstanding to be issued to the public after the subscriptions amounted to 212,012, so the market value of all shares on August 31st was 164 million pounds, and about 103 million pounds evaporated in one month, more than twice the value of the original government debt. The reasons for the speed and the magnitude of the decline are vague, though accounts generally attribute blame to the appearance of a liquidity crisis. The surge in the price of the South Sea Company triggered a rise in the prices of other companies and the creation of numerous other bubble companies. The passage of money into these companies alarmed the directors of the South Sea Company, who had paid a large amount to buy Parliament and did not want to see their potential profits dissipated by the entry of these companies, so the Bubble Act was passed by Parliament to ban the formation of unauthorised companies! When the act was enforced some downward pressure was placed on the market; selling hit the whole market, including the South Sea Company, in a scramble for liquidity. Eventually the South Sea Company had to sell back its debt to the Bank of England; many lost millions.
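The market value figures quoted for the collapse can be checked from the share count and the two prices. A minimal sketch; the function name is illustrative and the inputs are the figures given in the text (212,012 shares, prices of 775 and 290 pounds, original government debt of about 50 million pounds).

```python
def market_value_millions(shares, price):
    """Market value in millions of pounds for a given share count and price."""
    return shares * price / 1e6

shares = 212_012
value_aug31 = market_value_millions(shares, 775)   # value on August 31st
value_oct1 = market_value_millions(shares, 290)    # value on October 1st
loss = value_aug31 - value_oct1

print(round(value_aug31, 1), round(loss, 1), round(loss / 50, 2))
```

This gives a market value of about 164 million pounds and a loss of about 103 million, a little over twice the original debt of 50 million.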

The fundamentals: at the beginning of September 1720 the market value of the South Sea Company was 164 million. The visible asset supporting this was a flow of revenue from the company's claim on government debt of 1.9 million a year until 1727 and 1.5 million thereafter. At a 4% long-term discount rate this asset had a value of about 40 million. Against this the company had agreed to pay 7.1 million for the conversion privilege and owed 6 million in bonds and bills, for a net asset value of 26.1 million. In addition it had 11 million due on loans and 70 million eventually due from cash subscribers. Thus the share values exceeded the asset values by more than 60 million. Given the dubious value of the cash claims, a ratio of five times or more. Intangible assets? Maybe not unreasonable: could buy Parliament.

Not necessarily the case that these experiments were irrational or that "market psychology" forces were at work.
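The valuation of the debt claim at about 40 million can be reproduced with a simple present value calculation. The timing is an assumption made for this sketch: seven year-end payments of 1.9 million (1721 to 1727) followed by a perpetuity of 1.5 million, all discounted at 4%. The function name and parameter names are illustrative.

```python
def debt_claim_value(rate=0.04, high=1.9, low=1.5, years_high=7):
    """Present value (millions of pounds) of `high` a year for `years_high`
    years followed by `low` a year for ever, with year-end payments."""
    annuity = sum(high / (1 + rate) ** t for t in range(1, years_high + 1))
    # Perpetuity of `low` starting one year after the last high payment,
    # discounted back to the valuation date.
    perpetuity = (low / rate) / (1 + rate) ** years_high
    return annuity + perpetuity

print(round(debt_claim_value(), 1))
```

Under these timing assumptions the claim is worth just under 40 million pounds, in line with the figure in the text.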
