

Tehranchi

Contents

Chapter 1. One-period models
  1. The set-up
  2. Arbitrage and the first fundamental theorem of asset pricing
  3. Contingent claims and no-arbitrage pricing
  4. Pricing and hedging in an incomplete market by minimizing hedging error
  5. Change of numéraire and equivalent martingale measures
  6. Discounted prices

Chapter 2. Multi-period models
  1. The set-up
  2. The first fundamental theorem of asset pricing
  3. European and American contingent claims
  4. Locally equivalent measures

Chapter 3. Brownian motion and stochastic calculus
  1. Brownian motion
  2. Itô stochastic integration
  3. Itô's formula
  4. Girsanov's theorem
  5. Martingale representation theorems

Chapter 4. Black–Scholes model and generalizations
  1. The set-up
  2. Admissible strategies
  3. Arbitrage and equivalent martingale measures
  4. Contingent claims and market completeness
  5. The set-up revisited
  6. Markovian markets
  7. The Black–Scholes model, PDE, and formula
  8. Local and stochastic volatility models

Chapter 5. Interest rate models
  1. Bond prices and interest rates
  2. Short rate models
  3. Factor models
  4. The Heath–Jarrow–Morton framework

Chapter 6. Crash course on probability theory
  1. Measures
  2. Random variables
  3. Expectations and variances
  4. Special distributions
  5. Conditional probability and expectation, independence
  6. Probability inequalities
  7. Characteristic functions
  8. Fundamental probability results

Index

This course is about models of financial markets, with an emphasis on the pricing and hedging of contingent claims within such models. Our starting point is the self-evident observation: The future is uncertain. Indeed, anyone with even a passing acquaintance with finance knows that it is impossible to predict with absolute certainty how the price of an asset will fluctuate. Therefore, the proper language to formulate the models that we will study is the language of probability theory.

An attempt is made to keep this course self-contained, but you should be familiar with the basics of the theory, including knowing the definition and key properties of the following concepts: random variable, expected value, variance, conditional probability/expectation, independence, Gaussian (normal) distribution, etc. Familiarity with measure theoretical probability is helpful, though a crash course on probability theory is given in an appendix.

Please send all comments (including small typos and major blunders) to the author at m.tehranchi@statslab.cam.ac.uk.


CHAPTER 1
One-period models

The models we will encounter will be of the form $(S_t^0, \ldots, S_t^d)_{t \in T}$ where $S_t^i$ will model the price of a financial asset (stock, bond, etc.) at time $t \in T$. In this course, the index set $T$ will be one of the three sets
• $\{0, 1\}$ for single period models,
• $\mathbb{Z}_+ = \{0, 1, 2, \ldots\}$ when time is discrete, and
• $\mathbb{R}_+ = [0, \infty)$ when time is continuous.
As simple as it seems, much of the financial aspects of this course already appear in one-period models where $T = \{0, 1\}$. It is, therefore, appropriate to devote a significant portion of the course to this important special case.

1. The set-up

In this section we describe our first model. As we shall see later, this set-up captures most of the essential features of the general discrete-time model.

The market consists of $d + 1$ assets, labelled $0, 1, \ldots, d$. We model the price of asset $i$ at time $t \in \{0, 1\}$ by the random variable $S_t^i$. We will denote the collection of prices by the $(d+1)$-dimensional column vector $\bar S_t = (S_t^0, \ldots, S_t^d)$.

Our first assumption on the stochastic process $\bar S = (\bar S_t)_{t \in \{0,1\}}$ is that the time-0 prices of the assets are known at time 0. Formally, we have the following assumption (which will be generalized in the multi-period models):

Assumption. The random variables $S_0^0, \ldots, S_0^d$ are constants, that is, not random.

We assume now, and we will leave it as a standing assumption throughout the course, that an investor in this market can buy or sell any number (even fractional) of shares of each asset without affecting the price. We also assume that there is no bid/ask spread: the buying price is the same as the selling price. Of course, in the real world, investors face constraints when short-selling assets, and large buy or sell orders tend to move the market prices, but we ignore these issues.

Remark. You might wonder at this stage why we have chosen the notation such that there are $d + 1$ assets, rather than simply $d$ assets. The reason is that in the sequel, one of the assets, usually asset 0, plays a distinguished role. Then the market consists of one distinguished asset and $d$ other assets, for a total of $d + 1$. Often, but not always, we take asset 0 to be ordinary cash, so that $S_0^0 = S_1^0 = 1$. Or, we can model the cost of borrowing cash by introducing an interest rate $r \ge 0$ so that $S_0^0 = 1$ and $S_1^0 = 1 + r$. Also, the overbar in the notation $\bar S_t$ will become apparent as this chapter proceeds.

Now given this market model, we introduce an investor. Since there is only one period, the investor only chooses his investment portfolio once, at time 0. We use the following notation:
• $\pi^i \in \mathbb{R}$ denotes the (non-random) number of shares of asset $i$, for $i = 0, \ldots, d$, and we let $\bar\pi = (\pi^0, \ldots, \pi^d)$ denote the vector of portfolio weights.
• $X_t(\bar\pi)$ denotes the investor's wealth at time $t \in \{0, 1\}$.

Remark. We will allow $\pi^i$ to be either positive, negative, or zero with the interpretation that if $\pi^i > 0$ the investor is 'long' asset $i$ and if $\pi^i < 0$ the investor is 'short' the asset. In particular, if $\pi^i < 0$, then the investor has borrowed $|\pi^i|$ units of asset $i$ to pay back at time 1. Again notice that we do not demand that the $\pi^i$ are integers.

We assume that the investor's wealth and portfolio are connected by the following relationships:
\[ X_0(\bar\pi) = \bar\pi \cdot \bar S_0 \quad \text{(the budget constraint)} \]
\[ X_1(\bar\pi) = \bar\pi \cdot \bar S_1 \quad \text{(the self-financing condition)} \]
where $a \cdot b = \sum_{i=0}^d a^i b^i$ is the usual Euclidean inner (or dot) product in $\mathbb{R}^{d+1}$.

Remark. The budget constraint simply says that the investor's initial wealth is the time-0 total value of his holdings, and the self-financing condition says that changes in his wealth between time 0 and 1 are due only to changes in the asset prices: he does not consume or have any other source of income.

2. Arbitrage and the first fundamental theorem of asset pricing

Now that we have our market model and we've introduced an investor into this market, our first challenge is to find out how to invest optimally. An investor's dream is to find a portfolio $\bar\pi$ that costs nothing to buy at time 0, but that always has non-negative (and sometimes positive) value at time 1.

Example. Consider a market with two assets with prices given by
\[ (S_0^0, S_0^1) = (3, 4), \qquad (S_1^0, S_1^1) = \begin{cases} (4, 7) & \text{with probability } 1/2, \\ (2, 3) & \text{with probability } 1/2. \end{cases} \]
(The above display should be read $S_0^0 = 3$, $P(S_1^1 = 7) = 1/2$, etc.) Consider the portfolio $\pi^0 = -4$, $\pi^1 = 3$. It costs zero to buy at time 0, but the time-1 wealth is strictly positive in all states of the world:
\[ X_0 = 0, \qquad X_1 = \begin{cases} 5 & \text{with probability } 1/2, \\ 1 & \text{with probability } 1/2. \end{cases} \]

The example above leads us to our first definition:

Definition. An arbitrage is a portfolio $\bar\pi \in \mathbb{R}^{d+1}$ such that
• $X_0(\bar\pi) \le 0$,
• $X_1(\bar\pi) \ge 0$ almost surely, and
• either $X_0 < 0$ or $P(X_1 > 0) > 0$ (or both).

At this point, you should check that the portfolio $\bar\pi = (-3, 2)$ is also an arbitrage in the example above.

Markets with many arbitrage opportunities would be nice: we all would be a lot richer. But for the sake of building realistic models, we usually assume that markets are free of arbitrages. Notice that a market has no arbitrage if and only if
\[ \bar\pi \cdot \bar S_0 \le 0 \text{ and } \bar\pi \cdot \bar S_1 \ge 0 \text{ a.s.} \;\Rightarrow\; \bar\pi \cdot \bar S_0 = 0 \text{ and } \bar\pi \cdot \bar S_1 = 0 \text{ a.s.} \]
In this section we find a mathematical classification of such market models. Before we begin, we need some vocabulary.

Definition. A pricing kernel (or state price density) for a market model $\bar S$ is a positive random variable $\rho$ such that
\[ E(\rho |S_1^i|) < \infty \quad \text{and} \quad E(\rho S_1^i) = S_0^i \]
for all $i = 0, \ldots, d$.

Now we come to the first theorem of the course, and one of the most important theorems in financial mathematics. It is no surprise that it is often called the first fundamental theorem of asset pricing.

Theorem (First fundamental theorem of asset pricing). A market model has no arbitrage if and only if there exists a pricing kernel.

The following proof is from Doug Kennedy's lecture notes.

Proof. First we prove that if there exists a pricing kernel, then there is no arbitrage. Letting $X_t = \bar\pi \cdot \bar S_t$, we have by the definition of pricing kernel and the linearity of expectations
\[ E(\rho X_1) = \bar\pi \cdot E(\rho \bar S_1) = \bar\pi \cdot \bar S_0 = X_0. \]
Now suppose $X_0 \le 0$ and $X_1 \ge 0$ almost surely. Since $\rho > 0$, we have $\rho X_1 \ge 0$ and hence
\[ 0 \ge X_0 = E(\rho X_1) \ge 0 \]

so that $X_0 = 0 = E(\rho X_1)$. We also see that $\rho X_1 = 0$ a.s. (Recall the pigeonhole principle: if $Y \ge 0$ a.s. and $E(Y) = 0$ then $Y = 0$ a.s.) Again, since $\rho > 0$ a.s., we conclude that $X_1 = 0$ a.s., and hence there is no arbitrage.

Now suppose that there is no arbitrage. We must prove there exists a pricing kernel. Consider the set $C \subseteq \mathbb{R}^{d+1}$ defined by
\[ C = \{ E(\rho \bar S_1) : \rho > 0 \text{ a.s., and } E(\rho |\bar S_1|) < \infty \}. \]
The set $C$ is not empty, since $\rho_0$ defined by $\rho_0 = e^{-|\bar S_1|}$ certainly satisfies $\rho_0 > 0$ a.s. and $E(\rho_0 |\bar S_1|) < \infty$. Also, it is easy to see that $C$ is convex.

If $\bar S_0$ is contained in $C$, we would be done. So, suppose, for the sake of finding a contradiction, that $\bar S_0$ is not an element of $C$. By the separating hyperplane theorem (stated and proved below) there exists a vector $\bar\pi$ such that
\[ \bar\pi \cdot (y - \bar S_0) \ge 0 \]
for all $y \in C$, and such that the above inequality is strict for at least one $y \in C$. Letting $\bar\pi \cdot \bar S_t = X_t$ the above inequality translates to
\[ X_0 \le E(\rho X_1) \]
for all feasible $\rho$, with strict inequality for at least one $\rho$.

First, by letting $\rho = \varepsilon \rho_0$ with $\varepsilon > 0$ in the above inequality, we have
\[ X_0 \le \varepsilon E(\rho_0 X_1) \to 0 \quad \text{as } \varepsilon \downarrow 0, \]
so we can conclude $X_0 \le 0$.

Next, let $\rho = \rho_0 (M 1_{\{X_1 < 0\}} + 1)$ so that
\[ X_0 \le M E(\rho_0 X_1 1_{\{X_1 < 0\}}) + E(\rho_0 X_1). \]
Notice that the right-hand side of the above inequality tends to $-\infty$ as $M \uparrow \infty$ if the event $\{X_1 < 0\}$ has positive probability. But since $X_0$ is finite, we must conclude $X_1 \ge 0$ a.s.

Since we have assumed that there is no arbitrage, and we have just shown $X_0 \le 0$ and $X_1 \ge 0$ a.s., we must conclude that $X_0 = 0$ and $X_1 = 0$ a.s. But the separating hyperplane theorem says that there exists at least one $\rho$ such that $X_0 < E(\rho X_1)$, our desired contradiction.

2.1. A separating hyperplane theorem*. In this optional subsection, a version of the separating hyperplane theorem is stated and proved. There are many versions of this theorem, but we give only the version needed to prove the first fundamental theorem of asset pricing in one period.

Theorem (Separating/supporting hyperplane theorem). Let $C \subset \mathbb{R}^N$ be convex and $x \in \mathbb{R}^N$ not contained in $C$. Then there exists a $\lambda \in \mathbb{R}^N$ such that $\lambda \cdot (y - x) \ge 0$ for all $y \in C$, where the inequality is strict for at least one $y \in C$.

[Figure: Case 1. Separating hyperplane]

Proof. Case 1: Separating hyperplane. The point $x$ is not in the closure $\bar C$ of $C$. Let $y^*$ be the point in $\bar C$ closest to $x$. Assuming for the moment the existence of this point, let $\lambda = y^* - x$. Fix a point $y \in C$ and $0 < \theta < 1$, and note that the point $(1 - \theta) y^* + \theta y$ is in $\bar C$ since $\bar C$ is convex. Then
\begin{align*}
0 &= |y^* - x|^2 - |\lambda|^2 \\
  &\le |(1 - \theta) y^* + \theta y - x|^2 - |\lambda|^2 \\
  &= |\theta (y - y^*) + \lambda|^2 - |\lambda|^2 \\
  &= \theta^2 |y - y^*|^2 + 2 \theta (y - y^*) \cdot \lambda.
\end{align*}
By first dividing by $\theta$ and then taking the limit as $\theta \downarrow 0$ in the above inequality, we conclude $(y - y^*) \cdot \lambda \ge 0$. Hence
\[ (y - x) \cdot \lambda = (y - y^*) \cdot \lambda + |\lambda|^2 > 0 \]
as desired.

Now, we establish the existence of $y^*$: Let $d = \inf_{y \in C} |y - x| > 0$ and let $y_n$ be a sequence in $C$ such that $|y_n - x| \to d$. Applying the parallelogram law $|a + b|^2 + |a - b|^2 = 2|a|^2 + 2|b|^2$ we have
\begin{align*}
|y_m - y_n|^2 &= 2 |y_m - x|^2 + 2 |y_n - x|^2 - 4 \left| \tfrac{1}{2}(y_m + y_n) - x \right|^2 \\
              &\le 2 |y_m - x|^2 + 2 |y_n - x|^2 - 4 d^2 \to 0
\end{align*}
as $m, n \to \infty$, where we have used the convexity of $C$ to assert that $\tfrac{1}{2}(y_m + y_n) \in C$ and hence $\left| \tfrac{1}{2}(y_m + y_n) - x \right| \ge d$. We have established that the sequence $(y_n)_n$ is Cauchy, and hence converges to some point $y^* \in \bar C$ as claimed.

Case 2: Supporting hyperplane. The point $x$ is in the closure $\bar C$ of $C$. Define a subspace $S$ of $\mathbb{R}^N$ by $S = \mathrm{span}\{y - x : y \in C\}$.

Let $(x_n)_n$ be a sequence in the complement of $\bar C$ such that $x_n - x$ is in $S$ and $x_n \to x$. As in case 1, we can find the point $y_n^* \in \bar C$ closest to $x_n$. Let
\[ \lambda_n = \frac{y_n^* - x_n}{|y_n^* - x_n|}. \]
Since $(\lambda_n)_n$ is bounded, there exists a convergent subsequence, still denoted by $(\lambda_n)_n$, with limit $\lambda \in S$. Then
\begin{align*}
\lambda \cdot (y - x) &= \lim_n \lambda_n \cdot (y - x) \\
                      &= \lim_n \left[ \lambda_n \cdot (y - x_n) + \lambda_n \cdot (x_n - x) \right] \\
                      &\ge 0,
\end{align*}
as desired. Finally, since $\lambda \in S$ and $\lambda \ne 0$, we can conclude that there exists $y \in C$ such that $\lambda \cdot (y - x) > 0$.

[Figure: Case 2. Supporting hyperplane]

3. Contingent claims and no-arbitrage pricing

A contingent claim is any cash payment at time 1 where the size of the payment is contingent on the realized prices of other assets. In other words, for us, the payout of a contingent claim is modelled by a random variable on our probability space $(\Omega, \mathcal{F}, P)$. One of the major triumphs of modern finance is the no-arbitrage pricing of contingent claims.

Example (Call option). A call option gives the owner of the option the right, but not the obligation, to buy a given stock at time 1 at some fixed price $K$, called the strike of the option. Let $S_1$ denote the price of the stock at time 1. There are two cases: If $K \ge S_1$, then the option is worthless to the owner since there is no point paying a price above the market price for the underlying stock. On the other hand, if $K < S_1$, then the owner of the option can buy the stock for the price $K$ from the counterparty and immediately sell the stock for the price $S_1$ to the market, realizing a profit of $S_1 - K$. Hence, the payout of the call option is $(S_1 - K)^+$, where $a^+ = \max\{a, 0\}$ as usual. The graph of the function $g(x) = (x - K)^+$ has the shape of a 'hockey stick'.

We now study what the no-arbitrage principle has to say about the pricing of contingent claims. We start off with a given market model of $d + 1$ assets with prices $\bar S$, and suppose this market is free of arbitrage. Introduce a contingent claim with time-1 payout of $\xi_1$. Our goal is to find the time-0 price $\xi_0$ such that the augmented market with prices $(\bar S, \xi)$ is still free of arbitrage.

There is a useful class of contingent claims for which no-arbitrage gives a unique price.

Definition. A contingent claim with payout $\xi_1$ is an attainable (or replicable) claim if and only if there exists a portfolio $\bar\pi \in \mathbb{R}^{d+1}$ such that
\[ \xi_1 = \bar\pi \cdot \bar S_1. \]

This set of attainable claims can be characterized:

Theorem (Characterization of attainable claims). Given an arbitrage free market model $\bar S$, the following are equivalent:
(1) A contingent claim with payout $\xi_1$ is attainable.
(2) There exists a unique initial price $\xi_0$ such that the augmented market $(\bar S, \xi)$ has no arbitrage.
(3) There is a constant $\xi_0$ such that $\xi_0 = E(\rho \xi_1)$ for all pricing kernels $\rho$ such that $E(\rho |\xi_1|) < \infty$.

Remark. The conclusion of the theorem is very important, so it's worth rephrasing it for emphasis. The theorem says that a contingent claim has a unique no-arbitrage price if and only if its payout can be replicated by trading in the original assets $0, \ldots, d$. Furthermore, the time-0 price of such a claim, which is just the time-0 price of this replicating portfolio, can be found by computing the expected value of the payout times a pricing kernel.

Proof. The following proof is a bit redundant but may help with building intuition.

(1) ⇒ (2) Suppose the claim is attainable, so that $\xi_1 = \bar\pi \cdot \bar S_1$ for some portfolio $\bar\pi$. We will show that if the augmented market $(\bar S, \xi)$ has no arbitrage, then $\xi_0 = \bar\pi \cdot \bar S_0$. So, by symmetry, it's enough to show that if $\xi_0 < \bar\pi \cdot \bar S_0$ then there exists an arbitrage.

Consider a portfolio $\varphi = (\varphi^0, \ldots, \varphi^d, \varphi^{d+1})$ where the first $d + 1$ entries correspond to the number of shares of the assets in the underlying market, and the $(d+2)$-th entry is for the contingent claim. Let $\varphi^i = -\pi^i$, for $i = 0, \ldots, d$, and $\varphi^{d+1} = 1$.

Then
\[ X_0(\varphi) = -\bar\pi \cdot \bar S_0 + \xi_0 < 0 \]
and
\[ X_1(\varphi) = -\bar\pi \cdot \bar S_1 + \xi_1 = 0 \text{ a.s.} \]
Hence $\varphi$ is an arbitrage.

(1) ⇒ (3) If $\xi_1$ is attainable, then $\xi_1 = \bar\pi \cdot \bar S_1$ for some portfolio $\bar\pi$. Then
\[ E(\rho \xi_1) = \bar\pi \cdot \bar S_0 \]
for every pricing kernel $\rho$, so that $\xi_0 = \bar\pi \cdot \bar S_0$.

(3) ⇔ (2) By the first fundamental theorem of asset pricing, the augmented market $(\bar S, \xi)$ has no arbitrage if and only if there exists a positive random variable $\rho$ such that
\[ E(\rho \bar S_1) = \bar S_0 \quad \text{and} \quad E(\rho \xi_1) = \xi_0. \]
The first equation above says that $\rho$ is a pricing kernel for the underlying market $\bar S$. The second equation says that there is a unique initial price $\xi_0$ compatible with no-arbitrage if and only if $E(\rho \xi_1) = \xi_0$ for all pricing kernels $\rho$.

(3) ⇒ (1)* This is the hard direction. Suppose that the sample space $\Omega$ has $n + 1$ elements, with $P\{\omega\} > 0$ for all $\omega \in \Omega$. (The statement is true in full generality, but the proof is left as an exercise on the example sheet.) We can identify a random variable $Y$ with the vector $(Y(\omega_0), \ldots, Y(\omega_n)) \in \mathbb{R}^{n+1}$.

Let $\rho_0$ be a pricing kernel for $\bar S$ and consider the set
\[ K = \{ \bar\pi \cdot \bar S_1 : \bar\pi \in \mathbb{R}^{d+1} \} \]
of attainable claims. Viewed as a vector subspace of $\mathbb{R}^{n+1}$ of dimension at most $d + 1$, it has an orthogonal complement
\[ K^\perp = \{ \mu : E(\mu X_1) = 0 \text{ for all } X_1 \in K \} \]
of dimension at least $n - d$. (Of course, $K^\perp$ is allowed to be the singleton $\{0\}$.)

Pick a vector $\mu \in K^\perp$ and let $\rho_\theta = \rho_0 + \theta \mu$. Since $\rho_0(\omega) > 0$ for all $\omega \in \Omega$, one can choose a non-zero $\theta$ small enough that $\rho_\theta(\omega) > 0$ for all $\omega \in \Omega$. (This is where we have used the assumption that $\Omega$ is finite.) Furthermore,
\[ E(\rho_\theta S_1^i) = E(\rho_0 S_1^i) + \theta E(\mu S_1^i) = S_0^i, \]
and hence $\rho_\theta$ is a pricing kernel as well. Now by assumption
\[ E(\rho_\theta \xi_1) = E(\rho_0 \xi_1) + \theta E(\mu \xi_1) = \xi_0. \]
Since $E(\rho_0 \xi_1) = \xi_0$ and $\theta \ne 0$, the vector $\xi_1$ is orthogonal to $\mu$. But $\mu \in K^\perp$ was arbitrary, so we conclude that $\xi_1 \in K^{\perp\perp} = K$ as desired.

Example. (Put-call parity formula) Suppose we start with a market with three assets with prices $(B_t, S_t, C_t)_{t \in \{0,1\}}$. We assume that asset 0 is a riskless bond with prices $B_0 = 1$ and $B_1 = 1 + r$, that asset 1 is a stock, and that asset 2 is a call option on that stock with strike $K$. Suppose that this market is free of arbitrage. Recall that the payout of the option is $C_1 = (S_1 - K)^+$.

Now we introduce another claim, called a put option. A put option gives the owner of the option the right, but not the obligation, to sell the stock for a fixed strike price $K'$ at time 1. Using a similar argument as we used for the call option, the payout of a put option is $P_1 = (K' - S_1)^+$. In the special case when $K = K'$, we have the identity
\begin{align*}
P_1 = (K - S_1)^+ &= K - S_1 + (S_1 - K)^+ \\
                  &= \frac{K}{1+r} B_1 - S_1 + C_1,
\end{align*}
so the replicating portfolio is $(\pi^0, \pi^1, \pi^2) = \left( \frac{K}{1+r}, -1, 1 \right)$, and the put option is attainable in the market $(B, S, C)$. Since the time-0 price of an attainable claim is just the price of the replicating portfolio, we have just derived the famous put-call parity formula
\[ C_0 - P_0 = S_0 - \frac{K}{1+r}. \]

Since finding the time-0 price of an attainable claim is easy, we single out the markets for which every claim is attainable:

Definition. An arbitrage-free market is complete if and only if every random variable is attainable. A market is incomplete otherwise.

We can characterize complete markets:

Theorem (Second Fundamental Theorem of Asset Pricing). A market model is complete if and only if there exists a unique pricing kernel.

Proof. First, suppose the market is complete. Let $\rho$ and $\rho'$ be two pricing kernels, and $\xi_1$ be an arbitrary random variable. Since $\xi_1$ is attainable by assumption, we have $E(\rho |\xi_1|) < \infty$ and $E(\rho' |\xi_1|) < \infty$ and
\[ E(\rho \xi_1) = \xi_0 = E(\rho' \xi_1) \]
so that
\[ E[(\rho - \rho') \xi_1] = 0. \]
Since the market is complete, every random variable is attainable, so we can let $\xi_1 = \rho - \rho'$ in the above equation. Since the integrand is non-negative, we must conclude $\rho = \rho'$ almost surely, completing the proof of the "only-if" direction.

Now suppose that the pricing kernel $\rho$ is unique, and let $\xi_1$ be an arbitrary random variable. Assuming $E(\rho |\xi_1|) < \infty$, we see that $\xi_0 = E(\rho \xi_1)$ satisfies our characterization of attainable claims, and hence $\xi_1$ is attainable. We will be done once we show that every $\xi_1$ has the desired integrability property. All we need do is replace the set $C$ in our proof of the first fundamental theorem with
\[ C = \{ E(\rho \bar S_1) : \rho > 0 \text{ a.s., } E(\rho |\xi_1|) < \infty, \text{ and } E(\rho |\bar S_1|) < \infty \}. \]
Following the argument, a miracle occurs: we see that the no-arbitrage assumption implies the existence of a pricing kernel $\rho$ with $E(\rho |\xi_1|) < \infty$. But there is only one pricing kernel, so every random variable satisfies the integrability property, concluding the proof of the "if" direction.
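The put-call parity argument above can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the function names (`call_payout`, `put_payout`, `replicating_value`, `put_price_from_parity`) and the sample numbers are illustrative assumptions. It verifies state by state that the portfolio $(K/(1+r), -1, 1)$ in bond, stock and call reproduces the put payout, and then reads $P_0$ off the parity formula.

```python
from fractions import Fraction


def call_payout(s1, k):
    """Payout (S_1 - K)^+ of a call with strike k."""
    return max(s1 - k, 0)


def put_payout(s1, k):
    """Payout (K - S_1)^+ of a put with strike k."""
    return max(k - s1, 0)


def replicating_value(s1, k, r):
    """Time-1 value of the portfolio (K/(1+r), -1, 1) in (bond, stock, call),
    where the bond has prices B_0 = 1 and B_1 = 1 + r."""
    b1 = 1 + r
    return Fraction(k) / (1 + r) * b1 - s1 + call_payout(s1, k)


def put_price_from_parity(c0, s0, k, r):
    """Solve C_0 - P_0 = S_0 - K/(1+r) for P_0."""
    return c0 - s0 + Fraction(k) / (1 + r)


# The replicating portfolio matches the put payout in every scenario tried.
rate = Fraction(1, 20)
strike = 10
for stock_price in [0, 5, 10, 12, 25]:
    assert replicating_value(stock_price, strike, rate) == put_payout(stock_price, strike)
```

Exact rational arithmetic (`Fraction`) is used so that the identity can be tested with `==` rather than up to floating-point tolerance.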

Remark. Suppose that our sample space $\Omega = \{\omega_0, \ldots, \omega_n\}$ has $n + 1$ points (informally, $n$ sources of randomness). If $\rho$ is a pricing kernel, it must satisfy the $d + 1$ equations
\begin{align*}
S_0^0 &= \rho(\omega_0) S_1^0(\omega_0) P\{\omega_0\} + \ldots + \rho(\omega_n) S_1^0(\omega_n) P\{\omega_n\} = E(\rho S_1^0) \\
      &\;\;\vdots \\
S_0^d &= \rho(\omega_0) S_1^d(\omega_0) P\{\omega_0\} + \ldots + \rho(\omega_n) S_1^d(\omega_n) P\{\omega_n\} = E(\rho S_1^d).
\end{align*}
Roughly speaking, one should expect to find a solution to these equations if there are at least as many unknowns as equations, i.e. if $n \ge d$. Otherwise, if $n < d$ then there are fewer unknowns than equations, and hence one should not expect to find a $\rho$ satisfying the system. Since the existence of the pricing kernel is equivalent to the lack of arbitrage, one has the following rule-of-thumb:

1FTAP: No arbitrage ⇔ Existence of pricing kernel "⇔ $n \ge d$"

Now, when is the market complete? Heuristically, we should expect there to be a unique pricing kernel $\rho$ only if the number of unknowns equals the number of equations, i.e. $n = d$. Or looking the other way, to replicate a contingent claim $\xi_1$ we need to find a $\bar\pi$ such that $\bar\pi \cdot \bar S_1 = \xi_1$. That is, we must solve the $n + 1$ equations
\begin{align*}
\xi_1(\omega_0) &= \pi^0 S_1^0(\omega_0) + \ldots + \pi^d S_1^d(\omega_0) \\
                &\;\;\vdots \\
\xi_1(\omega_n) &= \pi^0 S_1^0(\omega_n) + \ldots + \pi^d S_1^d(\omega_n)
\end{align*}
in the $d + 1$ unknowns $\pi^0, \ldots, \pi^d$. There usually exists a solution only if $n \le d$. Hence, we have the rule-of-thumb below:

2FTAP: Market completeness ⇔ Uniqueness of pricing kernel "⇔ $n = d$"

What does the no-arbitrage principle say about the prices of contingent claims in incomplete markets?

Theorem (No-arbitrage price bounds). Let $\bar S$ be a no-arbitrage market model. There is no arbitrage in the augmented market $(\bar S, \xi)$ if
\[ \inf_\rho E(\rho \xi_1) \le \xi_0 \le \sup_\rho E(\rho \xi_1) \]
where the infimum and supremum are taken over pricing kernels $\rho$ such that $E(\rho |\xi_1|) < \infty$.

That is, the set of no-arbitrage prices is an interval, but unless the claim is attainable, the interval does not collapse to a single point.

4. Pricing and hedging in an incomplete market by minimizing hedging error

In complete markets, all contingent claims can be perfectly replicated, so their time-0 prices are just the time-0 prices of the corresponding replicating portfolio. Indeed, in a complete market, we know how to compute the time-0 price of any contingent claim: just compute the expected value of the claim times the unique pricing kernel. Otherwise, the claim may not be replicable. Then what to do? If you are the seller of a claim, what portfolio should you buy to minimize your exposure to the unhedgeable risk?

Of course, there are many, many answers to this question. One possible solution is to minimize expected square hedging error.

Consider a market $\bar S$ and introduce a claim with payout $\xi_1$ at time 1. In this section, we suppose that the prices are square integrable
\[ E(|\bar S_1|^2) < \infty \quad \text{and} \quad E(\xi_1^2) < \infty. \]
We are trying to find the attainable claim $X_1^* = \bar\pi^* \cdot \bar S_1$ which minimizes the functional
\[ X_1 \mapsto E[(\xi_1 - X_1)^2]. \]
Equivalently, we can think of $X_1^*$ as the projection of $\xi_1$ on the subspace of attainable claims. That is, we find the wealth $X_1^*$, which can be attained by trading in the market, which is closest (in the least-squares sense) to the target claim $\xi_1$.

We can find the minimizer of the function
\[ F(\bar\pi) = E[(\xi_1 - \bar\pi \cdot \bar S_1)^2] \]
by the usual means of calculus:
\[ \nabla F(\bar\pi^*) = -2 E[\bar S_1 (\xi_1 - \bar\pi^* \cdot \bar S_1)] = 0 \]
yielding
\[ \bar\pi^* = E(V^{-1} \bar S_1 \xi_1), \qquad X_0^* = E[\bar S_0^T V^{-1} \bar S_1 \xi_1], \]
where $V = E[\bar S_1 \bar S_1^T]$ is a $(d+1) \times (d+1)$ matrix, assumed positive definite. In particular, we have the following interpretation:
• $\bar\pi^*$ is the optimal vector of portfolio weights, and
• $\bar\pi^* \cdot \bar S_0 = X_0^*$ is the time-0 price of this portfolio.
Hence if $X_1^* = \bar\pi^* \cdot \bar S_1$ is optimal, the seller of the claim should charge at least $X_0^*$ and invest the proceeds into the portfolio $\bar\pi^*$.

What can we say about this solution? Note that we can express the optimal initial wealth as
\[ X_0^* = E(\rho^* \xi_1) \]
where the random variable $\rho^*$ is defined by
\[ \rho^* = \bar S_0^T V^{-1} \bar S_1 \]
so that the pricing rule $\xi_1 \mapsto X_0^*$ is linear, just as in the case of a complete market. The most important property of $\rho^*$ is found in the following:

Theorem. The random variable $\rho^* = \bar S_0^T V^{-1} \bar S_1$ minimizes the functional $\rho \mapsto E(\rho^2)$ among all random variables $\rho$ such that $E(\rho \bar S_1) = \bar S_0$.

Proof. First we show that $E(\bar S_1 \rho^*) = \bar S_0$. We have the computation
\[ E(\bar S_1 \rho^*) = E[\bar S_1 \bar S_1^T V^{-1} \bar S_0] = \bar S_0. \]

It remains to show that $\rho^*$ is minimal. Let $\rho$ be another random variable such that $E(\bar S_1 \rho) = \bar S_0$. Then $\Delta\rho = \rho - \rho^*$ satisfies $E(\bar S_1 \Delta\rho) = 0$. Writing $\rho = \rho^* + \Delta\rho$, we have the computation
\begin{align*}
E(\rho^2) &= E[(\rho^* + \Delta\rho)^2] \\
          &= E[(\rho^*)^2] + 2 E[\bar S_0^T V^{-1} \bar S_1 \Delta\rho] + E[(\Delta\rho)^2] \\
          &\ge E[(\rho^*)^2],
\end{align*}
since the middle term vanishes, completing the proof.

Note that if $\rho^*$ is a positive random variable, then $\rho^*$ would be the pricing kernel with the smallest $L^2$ norm. In particular, if the market is complete, we have discovered an explicit formula for the unique pricing kernel. However, for a general market $\rho^*$ may well have the property $P(\rho^* \le 0) > 0$. In this case, $\rho^*$ is not a pricing kernel. Indeed if $\rho^*$ is not strictly positive and if one prices the claim by $\xi_0 = X_0^* = E(\rho^* \xi_1)$, then it is possible that the augmented market has an arbitrage!

5. Change of numéraire and equivalent martingale measures

In this section, we introduce the important concepts of numéraire assets and equivalent martingale measures. We need a definition:

Definition. Let $(\Omega, \mathcal{F})$ be a measurable space and let $P$ and $Q$ be two probability measures on $(\Omega, \mathcal{F})$. The measures $P$ and $Q$ are equivalent, written $P \sim Q$, if and only if
\[ P(A) = 1 \Leftrightarrow Q(A) = 1. \]

Example. Consider the sample space $\Omega = \{1, 2, 3\}$ with the set $\mathcal{F}$ of events all subsets of $\Omega$. Consider probability measures $P$ and $Q$ defined by
• $P\{1\} = 1/2$, $P\{2\} = 1/2$, and $P\{3\} = 0$
• $Q\{1\} = 1/1000$, $Q\{2\} = 999/1000$, and $Q\{3\} = 0$.
Then $P$ and $Q$ are equivalent.

It turns out that equivalent measures can be characterized by the following theorem. When there is more than one probability measure floating around, we use the notation $E^P$ to denote expected value with respect to $P$, etc.

Theorem (Radon–Nikodym theorem). The probability measure $Q$ is equivalent to the probability measure $P$ if and only if there exists a positive random variable $Z$ such that
\[ Q(A) = E^P(Z 1_A) \]
for each $A \in \mathcal{F}$.

Note that $E^P(Z) = 1$ by putting $A = \Omega$ in the conclusion of the theorem. Also, by the usual rules of integration theory, if $\xi$ is a non-negative random variable then
\[ E^Q(\xi) = E^P(Z \xi). \]
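For the three-point example above, the Radon–Nikodym statements can be verified directly. The following is a minimal Python sketch under stated assumptions: the helper names (`equivalent`, `density`, `expect`) and the test variable `xi` are hypothetical, and on a finite sample space "equivalent" is checked as "same outcomes of probability zero". On the $P$-null outcome the density may take any positive value; the sketch arbitrarily picks 1.

```python
from fractions import Fraction as F

# The measures from the example: both put zero mass on outcome 3.
P = {1: F(1, 2), 2: F(1, 2), 3: F(0)}
Q = {1: F(1, 1000), 2: F(999, 1000), 3: F(0)}


def equivalent(p, q):
    """On a finite space, P ~ Q iff P and Q have the same null outcomes."""
    return all((p[w] == 0) == (q[w] == 0) for w in p)


def density(p, q):
    """A version of Z = dQ/dP. On P-null outcomes any positive value works;
    the value 1 is an arbitrary choice."""
    return {w: (q[w] / p[w] if p[w] > 0 else F(1)) for w in p}


def expect(measure, xi):
    """Expected value of the random variable xi under the given measure."""
    return sum(measure[w] * xi[w] for w in measure)


Z = density(P, Q)
xi = {1: F(10), 2: F(20), 3: F(30)}      # an arbitrary non-negative test variable
Z_xi = {w: Z[w] * xi[w] for w in xi}

assert equivalent(P, Q)
assert all(Z[w] > 0 for w in Z)          # Z is a positive random variable
assert expect(P, Z) == 1                 # E^P(Z) = 1
assert expect(Q, xi) == expect(P, Z_xi)  # E^Q(xi) = E^P(Z xi)
```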

If $Q \sim P$, then the random variable $Z$ is called the density, or the Radon–Nikodym derivative, of $Q$ with respect to $P$, and is often denoted
\[ Z = \frac{dQ}{dP}. \]
In fact, $P$ also has a density with respect to $Q$ given by
\[ \frac{dP}{dQ} = \frac{1}{Z}. \]

Now let's return to our financial model. To start off, we need the following definition:

Definition. A numéraire is an asset with a strictly positive price at all times.

The idea is that a numéraire can be used to count money. That is, we can speak in terms of prices relative to the numéraire.

Let $\bar S$ be a market model defined on a probability space $(\Omega, \mathcal{F}, P)$, let $\rho$ be a pricing kernel, and suppose asset $i$ is a numéraire. Then, we have
\[ \frac{S_0^j}{S_0^i} = E\left( Z_i \frac{S_1^j}{S_1^i} \right) \quad \text{where} \quad Z_i = \rho \frac{S_1^i}{S_0^i}. \]
Notice that $Z_i > 0$ almost surely and $E(Z_i) = 1$. Hence, we define an equivalent probability measure $Q^i$ by
\[ Q^i(A) = E(Z_i 1_A) \]
for all $A \in \mathcal{F}$. The measure $Q^i$ is called an equivalent martingale measure:

Definition. An equivalent martingale measure relative to the numéraire asset $i$ is any probability measure $Q$ equivalent to $P$ such that $E^Q(|S_1^j| / S_1^i) < \infty$ and
\[ E^Q\left( \frac{S_1^j}{S_1^i} \right) = \frac{S_0^j}{S_0^i} \]
for all $j \in \{0, \ldots, d\}$. The measure $P$ is called the objective (or historical or statistical) measure for the model.

Remark. As a preview of what's to come, the term equivalent martingale measure is appropriate since the stochastic processes $(S_t^j / S_t^i)_{t \in \{0,1\}}$ are martingales for $Q^i$ with respect to the filtration $(\mathcal{F}_t)_{t \in \{0,1\}}$ where $\mathcal{F}_0 = \{\emptyset, \Omega\}$ and $\mathcal{F}_1 = \mathcal{F}$. We will elaborate on this in the multi-period case.

Unlike the notion of a pricing kernel, the notion of an equivalent martingale measure is 'numéraire-dependent'. In particular, given a pricing kernel $\rho$ we can construct an equivalent martingale measure corresponding to a different numéraire. That is, for every numéraire asset $i$, we can define a measure $Q^i$ by
\[ \frac{dQ^i}{dP} = \rho \frac{S_1^i}{S_0^i}. \]
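The construction of $Q^i$ from a pricing kernel can be illustrated on a small market. The following is a minimal Python sketch under stated assumptions: the two-state market (cash with price 1 at both times, a stock worth 4 at time 0 and 6 or 3 at time 1, each with probability 1/2), the kernel values, and all function names are hypothetical and chosen only so that the arithmetic is exact.

```python
from fractions import Fraction as F

# Hypothetical two-state market: asset 0 is cash, asset 1 is a stock.
P = {"u": F(1, 2), "d": F(1, 2)}
S0 = [F(1), F(4)]                              # time-0 prices of assets 0 and 1
S1 = {"u": [F(1), F(6)], "d": [F(1), F(3)]}    # time-1 prices in each state
rho = {"u": F(2, 3), "d": F(4, 3)}             # candidate pricing kernel


def expect(measure, f):
    """Expected value of w -> f(w) under the given measure."""
    return sum(measure[w] * f(w) for w in measure)


def is_pricing_kernel(kernel):
    """Check kernel > 0 and E(kernel * S_1^j) = S_0^j for every asset j."""
    positive = all(kernel[w] > 0 for w in kernel)
    prices_ok = all(
        expect(P, lambda w: kernel[w] * S1[w][j]) == S0[j] for j in range(len(S0))
    )
    return positive and prices_ok


def emm(i):
    """Q^i{w} = Z_i(w) P{w} with Z_i = rho * S_1^i / S_0^i."""
    return {w: rho[w] * S1[w][i] / S0[i] * P[w] for w in P}


def is_martingale_measure(Q, i):
    """Check E^Q(S_1^j / S_1^i) = S_0^j / S_0^i for every asset j."""
    return all(
        expect(Q, lambda w: S1[w][j] / S1[w][i]) == S0[j] / S0[i]
        for j in range(len(S0))
    )


Q0, Q1 = emm(0), emm(1)
assert is_pricing_kernel(rho)
assert is_martingale_measure(Q0, 0) and is_martingale_measure(Q1, 1)
```

In this toy market the same kernel gives $Q^0 = (1/3, 2/3)$ with cash as numéraire but $Q^1 = (1/2, 1/2)$ with the stock as numéraire, which illustrates the numéraire-dependence noted above.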

In general. so that X0 = φB0 + π · S0 . To make the notation easier. This brings us to a standing assumption: e Assumption. in which case B0 = B1 = 1 or a bank (or money market) account in which case B0 = 1 and B1 = 1 + r. and we wouldn’t need the new notation. Asset 0 is a num´raire. . we count money in units of asset 0: Deﬁne the new quantities for both t ∈ {0. B0 B0 Plugging this into the self-ﬁnancing condition yields after some manipulation B1 B1 X1 = X0 + π · S1 − S0 B0 B0 To clean things up. we distinguish one asset and treat it diﬀerently than the d others. As mentioned earlier. . By convention. The actual identity of asset 0 is irrelevant for much of our analysis. and St = Bt Bt denoting the wealth and the risky asset prices discounted by the num´raire asset 0. but this choice is arbitrary. 1} Xt St ˜ ˜ Xt = . . π d ).) In this new notation. B0 > 0 and B1 > 0 almost surely. e If the investor’s initial wealth X0 is ﬁxed. 20 . . . . (Of e course. The choice of num´raire is usually arbitrary. A num´raire is risk-free if its time-1 price is not random. 6. where r is the interest rate. X1 = φB1 + π · S1 . their riskiness being relative to the num´raire asset 0. . . we choose asset 0. e The remaining d assets are often called risky assets. Std ) from now on. and π = (π 1 . Since we have assumed that asset 0 is a num´raire. e but there is one case that should be mentioned. Definition. Discounted prices From now on. as long as it is a num´raire. An equivalent e martingale measure relative to a risk-free asset is called risk-neutral. i. the budget constraint and self-ﬁnancing condition neatly combine to yield ˜ ˜ ˜ ˜ X1 = X0 + π · (S1 − S0 ) . asset 0 is often cash. if asset 0 were cash then B0 = B1 = 1 almost surely. and St = (S1 . yielding e X0 S0 φ= −π· . we let 1 Bt = St0 .e. the measures Qi constructed above are diﬀerent for diﬀerent i. even though they all correspond to the same pricing kernel ρ. We write φ = π 0 . 
the budget constraint means that we cannot freely choose the investor’s portfolio. we can solve for φ.
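The manipulation above is easy to confirm in a single scenario. The following is a minimal Python sketch, with hypothetical function names (`dot`, `wealth_both_ways`) and made-up sample prices: it computes $X_1$ from the budget constraint and the self-financing condition, computes $\tilde X_1$ from the combined discounted identity, and checks that the two agree after dividing by $B_1$.

```python
from fractions import Fraction as F


def dot(a, b):
    """Euclidean inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def wealth_both_ways(x0, pi, B0, B1, S0, S1):
    """Return (X_1, X~_1) in one scenario, where
    X_1  comes from the budget constraint and self-financing condition, and
    X~_1 comes from the identity X~_1 = X~_0 + pi . (S~_1 - S~_0)."""
    phi = x0 / B0 - dot(pi, S0) / B0           # budget constraint solved for phi
    x1 = phi * B1 + dot(pi, S1)                # self-financing condition
    s0_tilde = [s / B0 for s in S0]            # discounted time-0 prices
    s1_tilde = [s / B1 for s in S1]            # discounted time-1 prices
    gains = [a - b for a, b in zip(s1_tilde, s0_tilde)]
    x1_tilde = x0 / B0 + dot(pi, gains)
    return x1, x1_tilde


# Illustrative numbers: bank account with r = 10%, two risky assets.
B0, B1 = F(1), F(11, 10)
x1, x1_tilde = wealth_both_ways(
    x0=F(1000), pi=[F(2), F(-1)], B0=B0, B1=B1,
    S0=[F(100), F(50)], S1=[F(120), F(45)],
)
assert x1 / B1 == x1_tilde
```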

Now it is time to consider the notion of arbitrage in this new notation.

Definition. An arbitrage (relative to the numéraire asset 0) is a portfolio $\pi \in \mathbb{R}^d$ of risky assets such that
\[ P[\pi \cdot (\tilde S_1 - \tilde S_0) \ge 0] = 1 \quad \text{and} \quad P[\pi \cdot (\tilde S_1 - \tilde S_0) > 0] > 0. \]

Remark. Suppose $\pi \in \mathbb{R}^d$ is an arbitrage relative to asset 0. Then $\bar\pi = (\varphi, \pi)$ is an arbitrage according to our old definition, once we let $\varphi = -\pi \cdot \tilde S_0$. Indeed, in this case, the initial wealth is $X_0 = \varphi B_0 + \pi \cdot S_0 = 0$ and $X_1 = B_1 \pi \cdot (\tilde S_1 - \tilde S_0)$, so $X_1 \ge 0$ a.s. and $P(X_1 > 0) > 0$.

Conversely, if the portfolio $\bar\pi = (\varphi, \pi) \in \mathbb{R}^{d+1}$ is an arbitrage by our old definition of the word, then $\tilde X_1 \ge 0 \ge \tilde X_0$ a.s. and $P(\tilde X_1 > \tilde X_0) > 0$. But $\tilde X_1 - \tilde X_0 = \pi \cdot (\tilde S_1 - \tilde S_0)$, so $\pi$ is an arbitrage relative to asset 0.

Of course, we are actually interested in the lack of arbitrage: the market model has no arbitrage if and only if
\[ \pi \cdot (\tilde S_1 - \tilde S_0) \ge 0 \text{ a.s. implies } \pi \cdot (\tilde S_1 - \tilde S_0) = 0 \text{ a.s.} \]

Note that an equivalent martingale measure $Q$ is simply a probability measure $Q \sim P$ such that
\[ E^Q(\tilde S_1) = \tilde S_0. \]
We can now rewrite the first fundamental theorem:

Theorem (First Fundamental Theorem of Asset Pricing). The market model has no arbitrage if and only if there exists an equivalent martingale measure (relative to asset 0).

We can also redefine the term attainable:

Definition. A claim with payout $\xi_1$ is attainable if there is a real number $x$ and portfolio $\pi \in \mathbb{R}^d$ of risky assets such that
\[ \tilde\xi_1 = x + \pi \cdot (\tilde S_1 - \tilde S_0) \]
where $\tilde\xi_1 = \xi_1 / B_1$.

Attainable claims can be characterized in terms of equivalent martingale measures:

Theorem (Characterization of attainable claims). A claim with payout $\xi_1$ is attainable if and only if there exists a constant $x$ such that
\[ E^Q(\tilde\xi_1) = x \]
for all equivalent martingale measures $Q$ such that $E^Q(|\tilde\xi_1|) < \infty$, where $\tilde\xi_1 = \xi_1 / B_1$.

Finally, the second fundamental theorem can be rewritten:

Theorem (Second Fundamental Theorem of Asset Pricing). A market model is complete if and only if the equivalent martingale measure (relative to asset 0) is unique.

These are the formulations that we will use for the remainder of these notes.


CHAPTER 2

Multi-period models

Now that the financial foundation has been laid in the one-period models, we can proceed briskly into the natural generalization of multi-period discrete time. Essentially, this entails replacing the time index set $\{0, 1\}$ with the non-negative integers $\mathbb{Z}_+$ and keeping track of the resulting complications.

1. The set-up

So we consider a market with $d + 1$ assets. The prices of these assets are modelled by the stochastic process $(B, S) = (B_t, S_t)_{t \in \mathbb{Z}_+}$ defined on some background probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Our first goal is to find the proper generalization to this setting of the assumption that the time-0 prices $B_0$ and $S_0$ are constant. We need to formalize the concept of information being revealed as time marches forward. The correct notions are that of measurability and of a filtration.

Example. Consider the experiment of tossing a coin two times. We can model this experiment on the sample space $\Omega = \{HH, HT, TH, TT\}$. Pick an outcome $\omega \in \Omega$ at random.

At time 0, the coin has not been tossed, so there are only two questions you can answer. Is $\omega$ in $\emptyset$? No. Is $\omega$ in $\Omega$? Yes. But is $\omega$ in $\{HH, HT\}$? I can assign a probability $\mathbb{P}\{HH, HT\} = 1/2$, of course, but I can't answer the question with certainty.

At time 1, the coin has been tossed once. Then you can always answer these four questions. Is $\omega$ in $\emptyset$? No. Is $\omega$ in $\Omega$? Yes. Is $\omega$ in $\{HH, HT\}$? Yes, if the first toss is heads, no otherwise. Is $\omega$ in $\{TH, TT\}$? Yes, if the first toss is tails, no otherwise. But is $\omega$ in $\{HH, HT, TH\}$? Yes, if the first toss is heads, but what if the first toss comes up tails? So, in general I can't answer the question with certainty after observing the first flip.

After the second toss, you can answer every question. So, the flow of information is modelled by the following sigma-fields
• $\mathcal{F}_0 = \{\emptyset, \Omega\}$,
• $\mathcal{F}_1 = \{\emptyset, \{HH, HT\}, \{TH, TT\}, \Omega\}$,
• $\mathcal{F}_2$ = the set of all sixteen subsets of $\Omega$.

Now consider a stochastic process $(Y_t)_{t \in \{0,1,2\}}$ that has the property that the value of the random variable $Y_t$ is known after $t$ tosses of the coin. In particular, since there is no information before the experiment, $Y_0$ must be a constant, that is, $Y_0(\omega) = a$ for all $\omega \in \Omega$. On the other hand, the random variable $Y_1$ must be of the form
$$Y_1(\omega) = \begin{cases} a & \text{if } \omega \in \{HH, HT\} \\ b & \text{if } \omega \in \{TH, TT\} \end{cases}$$
since the only information known at time 1 is whether or not the first coin came up heads. Finally, $Y_2$ can be any function on $\Omega$, that is, of the form
$$Y_2(\omega) = \begin{cases} a & \text{if } \omega = HH \\ b & \text{if } \omega = HT \\ c & \text{if } \omega = TH \\ d & \text{if } \omega = TT. \end{cases}$$
Notice that for all $t \in \{0, 1, 2\}$ the event $\{Y_t \le x\}$ is in $\mathcal{F}_t$ for every $x \in \mathbb{R}$.

With this motivation, we have the following definitions:

Definition. Let $\mathcal{G} \subseteq \mathcal{F}$ be a sigma-field. A random variable $\xi : \Omega \to \mathbb{R}$ is measurable with respect to $\mathcal{G}$ (or briefly, $\mathcal{G}$-measurable) if and only if the event $\{\xi \le x\}$ is an element of $\mathcal{G}$ for all $x \in \mathbb{R}$. If $\xi$ is a random vector, then $\xi = (\xi_1, \ldots, \xi_n)$ is $\mathcal{G}$-measurable if and only if $\xi_i$ is $\mathcal{G}$-measurable for each $i \in \{1, \ldots, n\}$.

Definition. Let $\mathbb{T} \subseteq \mathbb{R}_+$ be an index set. A filtration $(\mathcal{F}_t)_{t \in \mathbb{T}}$ is an increasing collection of sigma-fields on $\Omega$, that is, $\mathcal{F}_s \subseteq \mathcal{F}_t$ if $s \le t$.

Remark. For this chapter, we have $\mathbb{T} = \mathbb{Z}_+$, the non-negative integers, but some of the following definitions are stated in more generality than we need here to avoid repetition in later chapters in which $\mathbb{T} = \mathbb{R}_+$.

The sigma-field $\mathcal{F}_t$ is our model of what information is known to the market participants at time $t$. When dealing with filtrations, we will nearly always assume that $\mathcal{F}_0$ is trivial: $A \in \mathcal{F}_0 \Leftrightarrow \mathbb{P}(A) = 0$ or $\mathbb{P}(A) = 1$. In particular, all $\mathcal{F}_0$-measurable random variables are almost surely constant.

Definition. A stochastic process $(Y_t)_{t \in \mathbb{T}}$ is adapted to a filtration $(\mathcal{F}_t)_{t \in \mathbb{T}}$ iff $Y_t$ is $\mathcal{F}_t$-measurable for each $t \in \mathbb{T}$.

Now we can state the proper generalization of the assumption in one period models that the time-0 asset prices are constant:

Assumption. We equip the background probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with a filtration $(\mathcal{F}_t)_{t \in \mathbb{Z}_+}$. The stochastic process $(B, S) = (B_t, S_t)_{t \in \mathbb{Z}_+}$ is assumed to be adapted to $(\mathcal{F}_t)_{t \in \mathbb{Z}_+}$.
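The coin-toss example can be checked mechanically. The following is a minimal sketch, assuming a finite sample space on which each sigma-field is generated by a partition into atoms, so that a random variable is measurable exactly when it is constant on every atom. The names `ATOMS`, `is_measurable`, `is_adapted`, `heads` and `peek` are illustrative and do not come from the notes.

```python
from itertools import product

# Sample space for two coin tosses: ['HH', 'HT', 'TH', 'TT'].
OMEGA = ["".join(p) for p in product("HT", repeat=2)]

# Each sigma-field F_t is represented by the partition (atoms) generating it.
ATOMS = {
    0: [set(OMEGA)],                    # F_0: no information
    1: [{"HH", "HT"}, {"TH", "TT"}],    # F_1: first toss known
    2: [{w} for w in OMEGA],            # F_2: both tosses known
}

def is_measurable(Y, atoms):
    """Y maps outcomes to values; measurable iff constant on every atom."""
    return all(len({Y[w] for w in atom}) == 1 for atom in atoms)

def is_adapted(process):
    """process[t] maps outcomes to values; adapted iff Y_t is F_t-measurable."""
    return all(is_measurable(process[t], ATOMS[t]) for t in process)

# Number of heads seen after t tosses: known at time t, hence adapted.
heads = {t: {w: w[:t].count("H") for w in OMEGA} for t in range(3)}

# A process that looks at the second toss already at time 1: not adapted.
peek = {0: {w: 0 for w in OMEGA}, 1: {w: int(w[1] == "H") for w in OMEGA}}

print(is_adapted(heads), is_adapted(peek))
```

The second process fails the check because its time-1 value differs between HH and HT, which lie in the same atom of the time-1 sigma-field.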

Remark. Since the only random variables measurable with respect to the trivial sigma-field are constants, we see that $B_0$ and $S_0$ are constants as before.

Assumption. Asset 0 is a numéraire so that $B_t > 0$ almost surely for all $t \in \mathbb{Z}_+$.

Remark. Our second assumption is natural, generalizes the assumption made in one period models, and doesn't really need further comment.

Just as we did in the one-period case, given this market model, we introduce an investor. Now there are multiple periods, so we need to be careful about when an investor chooses his investment portfolio.
• $X_t$ denotes the investor's wealth at the beginning of period $t$.
• $\varphi_t$ denotes the number of shares of asset 0 held between periods $t - 1$ and $t$.
• $\pi_t$ denotes the portfolio of risky assets held between periods $t - 1$ and $t$.

The investor's wealth and portfolio are connected by the following relationships:
$$X_{t-1} = \varphi_t B_{t-1} + \pi_t \cdot S_{t-1} \quad \text{(the budget constraint)}$$
$$X_t = \varphi_t B_t + \pi_t \cdot S_t \quad \text{(the self-financing condition)}$$

Note that we should assume that our investor is not clairvoyant. Hence we henceforth only consider portfolios $(\varphi_t, \pi_t)$ that are $\mathcal{F}_{t-1}$-measurable. Such processes have a name:

Definition. A stochastic process $(Y_t)_{t \in \mathbb{N}}$ is predictable if $Y_t$ is $\mathcal{F}_{t-1}$-measurable for each $t \in \mathbb{N}$.

Remark. Note that the time index set for a predictable process $(Y_t)_{t \in \mathbb{N}}$ is (usually) $\mathbb{N} = \{1, 2, \ldots\}$, not $\mathbb{Z}_+$. Hence $Y_0$ is not necessarily defined.

For a fixed $t \in \mathbb{N}$ we refer to the $d$-dimensional random vector $\pi_t$ as the investor's portfolio; the $d$-dimensional stochastic process $(\pi_t)_{t \in \mathbb{N}}$ is called the investor's trading strategy.

As before, we can solve for $\varphi_t$:
$$\varphi_t = \frac{X_{t-1}}{B_{t-1}} - \pi_t \cdot \frac{S_{t-1}}{B_{t-1}}.$$
Now count money in units of the numéraire asset 0:
$$\tilde X_t = \frac{B_0}{B_t} X_t \quad\text{and}\quad \tilde S_t = \frac{B_0}{B_t} S_t$$
so that the budget constraint and self-financing condition combine to yield
$$\tilde X_t = \tilde X_{t-1} + \pi_t \cdot (\tilde S_t - \tilde S_{t-1}).$$
The final formula to remember is then
$$\tilde X_t = \tilde X_0 + \sum_{s=1}^{t} \pi_s \cdot (\tilde S_s - \tilde S_{s-1}).$$
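As a sanity check on the discounting step, the sketch below computes the wealth path in two ways: once directly from the budget constraint and the self-financing condition, and once from the discounted sum formula. The price paths and portfolios are made-up illustrative numbers, and the function names are hypothetical.

```python
def wealth_path(x0, B, S, pi):
    """Wealth X_0..X_T from the budget constraint and self-financing condition.

    B[t] is the numeraire price, S[t] the list of d risky prices, and pi[t-1]
    the risky portfolio held between times t-1 and t (chosen at time t-1).
    """
    X = [x0]
    for t in range(1, len(B)):
        holdings = pi[t - 1]
        risky_cost = sum(p * s for p, s in zip(holdings, S[t - 1]))
        phi = (X[-1] - risky_cost) / B[t - 1]   # solve the budget constraint
        X.append(phi * B[t] + sum(p * s for p, s in zip(holdings, S[t])))
    return X

def discounted_wealth(x0, B, S, pi):
    """Discounted wealth: X~_0 plus the sum of pi_s . (S~_s - S~_{s-1})."""
    S_tilde = [[B[0] * s / b for s in row] for row, b in zip(S, B)]
    X = [x0]
    for t in range(1, len(B)):
        gain = sum(p * (s1 - s0)
                   for p, s1, s0 in zip(pi[t - 1], S_tilde[t], S_tilde[t - 1]))
        X.append(X[-1] + gain)
    return X

# Illustrative data: two risky assets over two periods.
B = [1.0, 1.05, 1.10]
S = [[100.0, 50.0], [110.0, 45.0], [105.0, 55.0]]
pi = [[1.0, -2.0], [0.5, 3.0]]

X = wealth_path(10.0, B, S, pi)
X_tilde = discounted_wealth(10.0, B, S, pi)
for t in range(len(B)):
    print(t, B[0] * X[t] / B[t], X_tilde[t])   # the two columns should agree
```

The agreement of the two columns reflects that the holding of asset 0 has been eliminated: only the risky portfolio enters the discounted formula.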

2. The first fundamental theorem of asset pricing

Following the discussion from before we can define an arbitrage.

Definition. An arbitrage is a strategy $(\pi_t)_{t \in \mathbb{N}}$ with the property that there exists a (non-random) time $T \in \mathbb{N}$ such that
$$\mathbb{P}\left[\sum_{s=1}^{T} \pi_s \cdot (\tilde S_s - \tilde S_{s-1}) \ge 0\right] = 1 \quad\text{and}\quad \mathbb{P}\left[\sum_{s=1}^{T} \pi_s \cdot (\tilde S_s - \tilde S_{s-1}) > 0\right] > 0.$$

Before we can state the first fundamental theorem, we have to recall some results and definitions about conditional expectations and martingale theory.

Theorem (Existence and uniqueness of conditional expectations). Let $X$ be an integrable random variable defined on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$, and let $\mathcal{G} \subseteq \mathcal{F}$ be a sub-sigma-field of $\mathcal{F}$. Then there exists an integrable $\mathcal{G}$-measurable random variable $Y$ such that
$$\mathbb{E}(1_G Y) = \mathbb{E}(1_G X) \quad\text{for all } G \in \mathcal{G}.$$
Furthermore, if there exists another $\mathcal{G}$-measurable random variable $Y'$ such that $\mathbb{E}(1_G Y') = \mathbb{E}(1_G X)$ for all $G \in \mathcal{G}$, then $Y = Y'$ almost surely.

Definition. Let $X$ be an integrable random variable and let $\mathcal{G} \subset \mathcal{F}$ be a sigma-field. The conditional expectation of $X$ given $\mathcal{G}$, written $\mathbb{E}(X|\mathcal{G})$, is a $\mathcal{G}$-measurable random variable with the property that
$$\mathbb{E}[1_G \mathbb{E}(X|\mathcal{G})] = \mathbb{E}(1_G X) \quad\text{for all } G \in \mathcal{G}.$$

Example. (Sigma-field generated by a countable partition) Let $X$ be a non-negative random variable defined on $(\Omega, \mathcal{F}, \mathbb{P})$. Let $G_1, G_2, \ldots$ be a sequence of disjoint events with $\mathbb{P}(G_n) > 0$ for all $n \in \mathbb{N}$ and $\bigcup_{n \in \mathbb{N}} G_n = \Omega$. Let $\mathcal{G}$ be the smallest sigma-field containing $\{G_1, G_2, \ldots\}$. Then
$$\mathbb{E}(X|\mathcal{G})(\omega) = \frac{\mathbb{E}(X 1_{G_n})}{\mathbb{P}(G_n)} \quad\text{if } \omega \in G_n.$$
This example relates the notion of conditional expectation given a sigma-field and that of conditional expectation given an event, since we usually write
$$\frac{\mathbb{E}(X 1_G)}{\mathbb{P}(G)} = \mathbb{E}(X|G).$$
More concretely, suppose $\Omega = \{HH, HT, TH, TT\}$ consists of two tosses of a coin, and let $\mathcal{G} = \{\emptyset, \{HH, HT\}, \{TH, TT\}, \Omega\}$ be the sigma-field containing the information revealed by the first toss. Suppose the coin is fair, so that each outcome is equally likely. Consider the random variable
$$\xi(\omega) = \begin{cases} a & \text{if } \omega = HH \\ b & \text{if } \omega = HT \\ c & \text{if } \omega = TH \\ d & \text{if } \omega = TT. \end{cases}$$
Then
$$\mathbb{E}(\xi|\mathcal{G})(\omega) = \begin{cases} (a + b)/2 & \text{if } \omega \in \{HH, HT\} \\ (c + d)/2 & \text{if } \omega \in \{TH, TT\}. \end{cases}$$

The important properties of conditional expectations are collected below:

Theorem. Let all random variables appearing below be such that the relevant conditional expectations are defined, and let $\mathcal{G}$ be a sub-sigma-field of the sigma-field $\mathcal{F}$ of all events.
• linearity: $\mathbb{E}(aX + bY|\mathcal{G}) = a\mathbb{E}(X|\mathcal{G}) + b\mathbb{E}(Y|\mathcal{G})$ for all constants $a$ and $b$.
• positivity: If $X \ge 0$ almost surely, then $\mathbb{E}(X|\mathcal{G}) \ge 0$ almost surely, with almost sure equality if and only if $X = 0$ almost surely.
• Jensen's inequality: If $f$ is convex, then $\mathbb{E}[f(X)|\mathcal{G}] \ge f[\mathbb{E}(X|\mathcal{G})]$.
• monotone convergence theorem: If $0 \le X_n \uparrow X$ a.s., then $\mathbb{E}(X_n|\mathcal{G}) \uparrow \mathbb{E}(X|\mathcal{G})$ a.s.
• Fatou's lemma: If $X_n \ge 0$ a.s. for all $n$, then $\mathbb{E}(\liminf_n X_n|\mathcal{G}) \le \liminf_n \mathbb{E}(X_n|\mathcal{G})$.
• dominated convergence theorem: If $\sup_n |X_n|$ is integrable and $X_n \to X$ a.s., then $\mathbb{E}(X_n|\mathcal{G}) \to \mathbb{E}(X|\mathcal{G})$ a.s.
• If $X$ is independent of $\mathcal{G}$ (the events $\{X \le x\}$ and $G$ are independent for each $x \in \mathbb{R}$ and $G \in \mathcal{G}$) then $\mathbb{E}(X|\mathcal{G}) = \mathbb{E}(X)$. In particular, $\mathbb{E}(X|\mathcal{G}) = \mathbb{E}(X)$ if $\mathcal{G}$ is trivial.
• 'take-out-what's-known': If $X$ is $\mathcal{G}$-measurable, then $\mathbb{E}(XY|\mathcal{G}) = X\mathbb{E}(Y|\mathcal{G})$. In particular, if $X$ is $\mathcal{G}$-measurable, then $\mathbb{E}(X|\mathcal{G}) = X$.
• tower property or law of iterated expectations: If $\mathcal{H} \subseteq \mathcal{G}$ then $\mathbb{E}[\mathbb{E}(X|\mathcal{G})|\mathcal{H}] = \mathbb{E}[\mathbb{E}(X|\mathcal{H})|\mathcal{G}] = \mathbb{E}(X|\mathcal{H})$.

As hinted at in Chapter 1, the most important concept in financial mathematics is that of a martingale. A martingale is simply an adapted stochastic process that is constant on average in the following sense:

Definition. A martingale relative to a filtration $(\mathcal{F}_t)_{t \in \mathbb{T}}$ is a stochastic process $M = (M_t)_{t \in \mathbb{T}}$ with the following properties:
• $\mathbb{E}(|M_t|) < \infty$ for all $t \in \mathbb{T}$
• $\mathbb{E}(M_t|\mathcal{F}_s) = M_s$ for all $0 \le s \le t$.

Remark. If $\mathbb{T} = \mathbb{Z}_+$, it is an exercise to show that an integrable process $M$ is a martingale if and only if $\mathbb{E}(M_{t+1}|\mathcal{F}_t) = M_t$ for all $t \ge 0$. That is, it is sufficient to verify the conditional expectations of the process one period ahead.

Below are some examples of martingales. Before listing them, it is convenient to introduce a definition:

Definition. Given a stochastic process $Y = (Y_t)_{t \in \mathbb{T}}$, let $\mathcal{F}_t$ be the smallest sigma-field for which the random variables $Y_s$ are measurable for all $0 \le s \le t$. The natural filtration of $Y$ is the filtration $(\mathcal{F}_t)_{t \in \mathbb{T}}$, that is, the smallest filtration for which $Y$ is adapted.

In what follows, if a stochastic process is given but a filtration is not explicitly mentioned, then we are implicitly working with the natural filtration of the process.
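The partition formula for conditional expectation is easy to compute on the two-toss space. The sketch below is only an illustration: it assumes a fair coin, uses exact rational arithmetic, and picks arbitrary numerical values for a, b, c, d. The helper name `cond_exp` is hypothetical.

```python
from fractions import Fraction

OMEGA = ["HH", "HT", "TH", "TT"]
P = {w: Fraction(1, 4) for w in OMEGA}       # fair coin: equally likely outcomes
FIRST_TOSS = [{"HH", "HT"}, {"TH", "TT"}]    # atoms generating G
TRIVIAL = [set(OMEGA)]                       # atoms generating the trivial sigma-field

def cond_exp(X, atoms, prob=P):
    """E(X|G)(w) = E(X 1_{G_n}) / P(G_n) for w in the atom G_n."""
    out = {}
    for atom in atoms:
        mass = sum(prob[w] for w in atom)
        average = sum(prob[w] * X[w] for w in atom) / mass
        for w in atom:
            out[w] = average
    return out

xi = {"HH": 1, "HT": 3, "TH": 10, "TT": 20}  # illustrative values of a, b, c, d

given_first = cond_exp(xi, FIRST_TOSS)       # (a+b)/2 on heads, (c+d)/2 on tails
tower = cond_exp(given_first, TRIVIAL)       # conditioning again on less information
direct = cond_exp(xi, TRIVIAL)               # the plain expectation E(xi)
print(given_first["HH"], given_first["TT"], tower["HH"], direct["HH"])
```

The last two printed values coincide, which is the tower property in the special case where the smaller sigma-field is trivial.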

Example. Let $X_1, X_2, X_3, \ldots$ be independent integrable random variables such that $\mathbb{E}(X_i) = 0$ for all $i \in \mathbb{N}$. The process $(S_t)_{t \in \mathbb{Z}_+}$ given by $S_0 = 0$ and
$$S_t = X_1 + \ldots + X_t$$
is a martingale relative to its natural filtration. First, the random variable $S_t$ is integrable since
$$\mathbb{E}(|S_t|) \le \mathbb{E}(|X_1|) + \ldots + \mathbb{E}(|X_t|)$$
by the triangle inequality. Also,
$$\mathbb{E}(S_{t+1}|\mathcal{F}_t) = \mathbb{E}(S_t + X_{t+1}|\mathcal{F}_t) = \mathbb{E}(S_t|\mathcal{F}_t) + \mathbb{E}(X_{t+1}|\mathcal{F}_t) = S_t + \mathbb{E}(X_{t+1}) = S_t.$$

The following theorem shows how to take one martingale and build another one.

Theorem. Let $M$ be a martingale and let $H$ be a bounded predictable process. Then the process $N$ defined by
$$N_t = \sum_{s=1}^{t} H_s (M_s - M_{s-1})$$
is a martingale.

Proof. By assumption, we have $\mathbb{E}(|M_t|) < \infty$ and there exists a constant $C > 0$ such that $|H_t| < C$ almost surely for all $t \in \mathbb{N}$. Hence
$$\mathbb{E}(|N_t|) \le \sum_{s=1}^{t} \mathbb{E}(|H_s||M_s - M_{s-1}|) \le \sum_{s=1}^{t} C[\mathbb{E}(|M_s|) + \mathbb{E}(|M_{s-1}|)] < \infty.$$
Since
$$\mathbb{E}(N_{t+1} - N_t|\mathcal{F}_t) = \mathbb{E}(H_{t+1}(M_{t+1} - M_t)|\mathcal{F}_t) = H_{t+1}\mathbb{E}(M_{t+1} - M_t|\mathcal{F}_t) = 0$$
we're done.

Remark. The process $N$ in the theorem is often called a martingale transform or a discrete time stochastic integral.

We now construct one of the most important examples of a martingale.

Example. Let $X$ be an integrable random variable, and let $M_t = \mathbb{E}(X|\mathcal{F}_t)$. Then $M = (M_t)_{t \in \mathbb{T}}$ is a martingale. Indeed,
$$\mathbb{E}(|M_t|) = \mathbb{E}\{|\mathbb{E}(X|\mathcal{F}_t)|\} \le \mathbb{E}\{\mathbb{E}(|X||\mathcal{F}_t)\} = \mathbb{E}(|X|) < \infty.$$
Now,
$$\mathbb{E}(M_t|\mathcal{F}_s) = \mathbb{E}[\mathbb{E}(X|\mathcal{F}_t)|\mathcal{F}_s] = \mathbb{E}(X|\mathcal{F}_s) = M_s$$
by the tower property.

We now are ready to consider our model $\bar S = (B_t, S_t)_{t \in \{0,\ldots,T\}}$ defined on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and adapted to a filtration $(\mathcal{F}_t)_{t \in \{0,\ldots,T\}}$. Recall the notation $\tilde S_t = \frac{B_0}{B_t} S_t$ for the discounted risky asset prices.

Definition. An equivalent martingale measure is a measure $\mathbb{Q}$ equivalent to $\mathbb{P}$ such that $\tilde S$ is a martingale for $\mathbb{Q}$.

Theorem (First Fundamental Theorem of Asset Pricing). The market model has no arbitrage if and only if there exists an equivalent martingale measure.

Proof. We only consider the easier direction, that the existence of an equivalent martingale measure implies the lack of arbitrage, in the case where $T = 2$. The idea is the same as in the $T = 1$ case, but one has to be a bit more careful. Let $\pi$ be such that $\tilde X_t = \sum_{s=1}^{t} \pi_s \cdot \Delta\tilde S_s$, where $\Delta\tilde S_s = \tilde S_s - \tilde S_{s-1}$, satisfies $\tilde X_2 \ge 0$ a.s. Our aim is to show $\tilde X_2 = 0$ a.s.

Let $\mathbb{Q}$ be an equivalent martingale measure. Note $\mathbb{Q}(\tilde X_2 \ge 0) = 1$ because $\mathbb{P} \sim \mathbb{Q}$. We would be done if we could show $\mathbb{E}^{\mathbb{Q}}(\tilde X_2) = 0$, because this would imply that $\mathbb{Q}(\tilde X_2 = 0) = 1$ and hence $\mathbb{P}(\tilde X_2 = 0) = 1$, again by equivalence of measures.

To do this, for each $n$, let $A_n = \{|\tilde X_1| \le n, |\pi_2| \le n\}$. Note $A_n$ is in $\mathcal{F}_1$. Furthermore, note that
$$1_{A_n} \tilde X_2 = 1_{A_n} \tilde X_1 + 1_{A_n} \pi_2 \cdot \Delta\tilde S_2$$
is $\mathbb{Q}$-integrable, since the first term is bounded, and the second term is bounded by $n|\Delta\tilde S_2|$, which is $\mathbb{Q}$-integrable since $\tilde S$ is a $\mathbb{Q}$-martingale. Now
$$0 \le \mathbb{E}^{\mathbb{Q}}[1_{A_n} \tilde X_2|\mathcal{F}_1] = 1_{A_n} \tilde X_1 + 1_{A_n} \pi_2 \cdot \mathbb{E}^{\mathbb{Q}}(\Delta\tilde S_2|\mathcal{F}_1) = 1_{A_n} \tilde X_1$$
and letting $n \to \infty$, we have $\tilde X_1 \ge 0$ a.s. Now, since $\pi_1$ is a constant, we can compute
$$\mathbb{E}^{\mathbb{Q}}(\tilde X_1) = \pi_1 \cdot \mathbb{E}^{\mathbb{Q}}(\Delta\tilde S_1) = 0$$
to conclude $\tilde X_1 = 0$ a.s. Finally, since
$$\mathbb{E}^{\mathbb{Q}}[1_{A_n} \tilde X_2|\mathcal{F}_1] = 1_{A_n} \tilde X_1 = 0$$
we conclude $1_{A_n} \tilde X_2 = 0$ a.s. Letting $n \to \infty$ shows $\tilde X_2 = 0$ a.s., as desired.

3. European and American contingent claims

Now given our multi-period market model, we can introduce a contingent claim. Unlike the one-period case, there are now essentially two types of contingent claims: those, called European, that mature at a fixed date and those, called American contingent claims, that can be exercised whenever the holder of the claim wants.
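The martingale transform theorem can be verified by brute force on a small example. The sketch below assumes M is a symmetric random walk with steps of plus or minus one over three periods and enumerates all eight equally likely paths. The two betting rules are made up for illustration: one looks only at the past (predictable), the other peeks at the next step.

```python
from itertools import product
from fractions import Fraction

T = 3  # number of periods

def transform(steps, H):
    """N_T = sum over s of H_s * (M_s - M_{s-1}) along one path of +/-1 steps."""
    N = 0
    for s, x in enumerate(steps):
        # H receives the past steps and the upcoming step; a predictable
        # rule must ignore the upcoming step.
        N += H(steps[:s], x) * x
    return N

def expectation(H):
    """Exact expectation of N_T under the fair-coin measure."""
    paths = list(product([-1, 1], repeat=T))
    return sum(Fraction(transform(p, H), len(paths)) for p in paths)

# Bounded and predictable: double the stake whenever the walk is below zero.
predictable = lambda past, nxt: 2 if sum(past) < 0 else 1

# Not predictable: the stake depends on the step about to happen.
clairvoyant = lambda past, nxt: nxt

print(expectation(predictable), expectation(clairvoyant))
```

The predictable rule has zero expected gain, consistent with N being a martingale started at zero, while the clairvoyant rule wins one unit per period.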

We concentrate on the European claims first.

3.1. European claims. The following theory should seem very familiar since there really is nothing new. We now introduce a (European) contingent claim which matures at time $T \in \mathbb{N}$. We model the payout of the claim by an $\mathcal{F}_T$-measurable random variable $\xi_T$. Think of the example of a call option on a stock maturing at time $T$ with strike $K$. In this case $\xi_T = (S_T - K)^+$.

Theorem. Let $(B, S) = (B_t, S_t)_{t \in \mathbb{Z}_+}$ be an arbitrage-free market. The augmented market $(B_t, S_t, \xi_t)_{t \in \{0,\ldots,T\}}$ is free of arbitrage if and only if there exists an equivalent martingale measure $\mathbb{Q}$ such that
$$\xi_t = \mathbb{E}^{\mathbb{Q}}\left[\frac{B_t}{B_T} \xi_T \,\Big|\, \mathcal{F}_t\right]$$
for all $t \in \{0, \ldots, T\}$.

Proof. By the first fundamental theorem of asset pricing, there is no arbitrage if and only if there exists an equivalent measure $\mathbb{Q}$ such that $(\tilde S_t, \tilde\xi_t)_{t \in \{0,\ldots,T\}}$ is a martingale for $\mathbb{Q}$, where $\tilde\xi_t = \frac{B_0}{B_t} \xi_t$. But $(\tilde\xi_t)_{t \in \{0,\ldots,T\}}$ is a martingale if and only if $\tilde\xi_t = \mathbb{E}^{\mathbb{Q}}(\tilde\xi_T|\mathcal{F}_t)$ and we're done.

The above theorem gives a lot of flexibility in pricing contingent claims unless the claim is attainable.

Definition. A claim maturing at time $T \in \mathbb{N}$ is attainable if and only if its payout $\xi_T$ is an $\mathcal{F}_T$-measurable random variable such that
$$\tilde\xi_T = x + \sum_{s=1}^{T} \pi_s \cdot (\tilde S_s - \tilde S_{s-1})$$
for some $x \in \mathbb{R}$ and predictable strategy $(\pi_t)_{t \in \mathbb{N}}$, where $\tilde\xi_T = \frac{B_0}{B_T} \xi_T$. The market is complete if and only if every claim is attainable.

The following should now come as no surprise.

Theorem (Characterization of attainable claims). Given an arbitrage free market model $\bar S$, a contingent claim with payout $\xi_T$ is attainable if and only if there is a constant $x$ such that
$$x = \mathbb{E}^{\mathbb{Q}}(\tilde\xi_T)$$
for all equivalent martingale measures $\mathbb{Q}$ such that $\mathbb{E}^{\mathbb{Q}}(|\tilde\xi_T|) < \infty$.

Proof. We just consider the case $T = 2$, and suppose that the claim is attainable, so that there exists a constant $x$ and strategy $(\pi_t)_{t \in \{1,2\}}$ such that $\tilde X_2 = \tilde\xi_2$ a.s., where
$$\tilde X_t = x + \sum_{s=1}^{t} \pi_s \cdot (\tilde S_s - \tilde S_{s-1}).$$
We need to show $x = \mathbb{E}^{\mathbb{Q}}(\tilde\xi_2)$ for all $\mathbb{Q}$ such that $\tilde\xi_2$ is integrable. (It needs to be proven that there is at least one $\mathbb{Q}$ with this property, but we do not do so here.)

Now $\tilde X_1 = x + \pi_1 \cdot (\tilde S_1 - \tilde S_0)$ is integrable and has mean $x$, since $\pi_1$ is a constant. As before, let $A_n = \{|\tilde X_1| \le n, |\pi_2| \le n\}$ and hence
$$\mathbb{E}^{\mathbb{Q}}[1_{A_n} \tilde X_2|\mathcal{F}_1] = 1_{A_n} \tilde X_1.$$
Now, since $|1_{A_n} \tilde X_2| \le |\tilde X_2|$ and this is integrable by assumption, we can use the conditional dominated convergence theorem on the left side to let $n \to \infty$ to get
$$\mathbb{E}^{\mathbb{Q}}[\tilde X_2|\mathcal{F}_1] = \tilde X_1.$$
Applying the tower law yields
$$\mathbb{E}^{\mathbb{Q}}(\tilde X_2) = \mathbb{E}^{\mathbb{Q}}[\mathbb{E}^{\mathbb{Q}}(\tilde X_2|\mathcal{F}_1)] = \mathbb{E}^{\mathbb{Q}}(\tilde X_1) = x.$$

Finally, we state the characterization of complete markets.

Theorem (Second Fundamental Theorem of Asset Pricing). The market is complete if and only if there exists a unique equivalent martingale measure, that is, the set $\mathcal{Q}$ of equivalent martingale measures contains just one element.

3.2. American claims. We now discuss American claims. The canonical example of an American claim is the American put option: a contract which gives the buyer the right (but not the obligation) to sell the underlying stock at a fixed strike price $K > 0$ at any time between time 0 and a fixed maturity date $T \in \mathbb{N}$.

Here, things are quite different. Unlike the European claim, the holder of an American claim can choose to exercise the option at any time $\tau$ before or at maturity. Hence, the payout of an American claim is specified by two ingredients:
• a maturity date $T \in \mathbb{N}$,
• an adapted process $(\xi_t)_{t \in \{0,\ldots,T\}}$.

For instance, in the case of an American put, we may take $\xi_t = (K - S_t)^+$. Indeed, the payout of the option is $(K - S_\tau)^+$ where $\tau \in \{0, \ldots, T\}$ is a time chosen by the holder of the put to exercise the option. However, to rule out clairvoyance, we insist that $\tau$ is a stopping time:

Definition. A stopping time for a filtration $(\mathcal{F}_t)_{t \in \mathbb{T}}$ is a random variable $\tau$ taking values in $\mathbb{T} \cup \{\infty\}$ such that the event $\{\tau \le t\}$ is $\mathcal{F}_t$-measurable for all $t \in \mathbb{T}$.

Example. (First passage time) Here is a typical example of a stopping time. Let $(Y_t)_{t \in \mathbb{Z}_+}$ be an adapted process and fix $a \in \mathbb{R}$. Then the random variable
$$\tau = \inf\{t \in \mathbb{Z}_+ : Y_t > a\}$$
corresponding to the first time the process crosses the level $a$ is a stopping time. Indeed,
$$\{\tau > t\} = \{Y_s \le a \text{ for all } s = 0, \ldots, t\}$$
is $\mathcal{F}_t$-measurable, and hence so is $\{\tau > t\}^c = \{\tau \le t\}$.

Hence, if an American claim matures at $T \in \mathbb{N}$ and is specified by the payout process $(\xi_t)_{t \in \{0,\ldots,T\}}$, then the actual payout of the claim is modelled by the random variable $\xi_\tau$, where $\tau$ is any stopping time for the filtration taking values in $\{0, \ldots, T\}$.

We can think of the American claim then as a family, indexed by the stopping time $\tau$, of European claims with payouts $\xi_\tau$. Intuitively, the seller of such a claim should at time 0 charge at least the amount
$$\sup_{\tau \in \mathcal{T}} \mathbb{E}^{\mathbb{Q}}\left[\frac{B_0}{B_\tau} \xi_\tau\right]$$
to be sure that he can hedge the option. Indeed, this is the case. To simplify matters, we make the following assumption in this subsection: The market model $(B, S) = (B_t, S_t)_{t \in \mathbb{Z}_+}$ is complete. The unique equivalent martingale measure is denoted $\mathbb{Q}$.

Theorem. Consider a complete market $(B, S)$, and suppose that the adapted process $(\xi_t)_{t \in \{0,\ldots,T\}}$ specifies the payout of an American claim maturing at $T \in \mathbb{N}$. There exists a trading strategy $(\pi_t)_{t \in \{1,\ldots,T\}}$ with corresponding discounted wealth process
$$\tilde X_t = \tilde X_0 + \sum_{s=1}^{t} \pi_s \cdot (\tilde S_s - \tilde S_{s-1})$$
such that $X_t \ge \xi_t$ almost surely for all $t \in \{0, \ldots, T\}$ if and only if
$$X_0 \ge \sup_{\tau \in \mathcal{T}} \mathbb{E}^{\mathbb{Q}}\left[\frac{B_0}{B_\tau} \xi_\tau\right],$$
where the supremum is taken over the set $\mathcal{T}$ of stopping times smaller than or equal to $T$ and where $\mathbb{Q}$ is the unique martingale measure.

Remark. The theorem says that if the initial wealth is sufficiently large, the investor can super-replicate the payout of the American claim.

The rest of this subsection is dedicated to proving this theorem. We need some new vocabulary:

Definition. A supermartingale relative to a filtration $(\mathcal{F}_t)_{t \in \mathbb{T}}$ is an adapted stochastic process $(U_t)_{t \in \mathbb{T}}$ with the following properties:
• $\mathbb{E}(|U_t|) < \infty$ for all $t \in \mathbb{T}$
• $\mathbb{E}(U_t|\mathcal{F}_s) \le U_s$ for all $0 \le s \le t$.
A submartingale is an adapted process $(V_t)_{t \in \mathbb{T}}$ with the following properties:
• $\mathbb{E}(|V_t|) < \infty$ for all $t \in \mathbb{T}$
• $\mathbb{E}(V_t|\mathcal{F}_s) \ge V_s$ for all $0 \le s \le t$.

Remark. Hence a supermartingale decreases on average, while a submartingale increases on average. A martingale is a stochastic process that is both a supermartingale and a submartingale.

Now we need a result of general interest:

Theorem (Doob decomposition theorem). Let $U$ be a supermartingale indexed by $\mathbb{Z}_+$. Then there is a unique decomposition
$$U_t = M_t - A_t$$
where $M$ is a martingale and $A$ is a predictable non-decreasing process with $A_0 = 0$.

Proof. Let $M_0 = U_0$ and define
$$M_{t+1} - M_t = U_{t+1} - \mathbb{E}(U_{t+1}|\mathcal{F}_t)$$
for $t \in \mathbb{Z}_+$. Then $M$ is a martingale. Now let $A_0 = 0$ and
$$A_{t+1} - A_t = U_t - \mathbb{E}(U_{t+1}|\mathcal{F}_t)$$
for $t \in \mathbb{Z}_+$. Clearly $A_{t+1}$ is $\mathcal{F}_t$-measurable and the process $A$ is non-decreasing since $U$ is a supermartingale. Summing up,
$$M_t - A_t = M_0 - A_0 + \sum_{s=1}^{t} (M_s - M_{s-1} - A_s + A_{s-1}) = U_0 + \sum_{s=1}^{t} (U_s - U_{s-1}) = U_t.$$
To show uniqueness, assume that $U_t = M_t - A_t = M'_t - A'_t$. Then $M - M' = A - A'$ is a predictable martingale, that is, a constant.

Definition. Let $(Y_t)_{t \in \{0,\ldots,T\}}$ be a given integrable adapted process. Define an adapted process $(U_t)_{t \in \{0,\ldots,T\}}$ by
$$U_T = Y_T, \qquad U_t = \max\{Y_t, \mathbb{E}(U_{t+1}|\mathcal{F}_t)\} \text{ for } t \in \{0, \ldots, T - 1\}.$$
The process $(U_t)_{t \in \{0,\ldots,T\}}$ is called the Snell envelope of $(Y_t)_{t \in \{0,\ldots,T\}}$.

Remark. In our application $(Y_t)_{t \in \{0,\ldots,T\}}$ will be the process specifying the discounted payout of the given American claim, $Y_t = \tilde\xi_t$. The Snell envelope clearly satisfies both $U_t \ge Y_t$ and $U_t \ge \mathbb{E}(U_{t+1}|\mathcal{F}_t)$ almost surely. Thus, another way to describe the Snell envelope of a process is to say it is the smallest supermartingale dominating that process.

Theorem. Let $(Y_t)_{t \in \{0,\ldots,T\}}$ be an adapted process, and let $(U_t)_{t \in \{0,\ldots,T\}}$ be its Snell envelope with Doob decomposition $U_t = M_t - A_t$. Let
$$\tau^* = \min\{t \in \{0, \ldots, T - 1\} : A_{t+1} > 0\}$$
with the convention $\tau^* = T$ on $\{A_t = 0 \text{ for all } t\}$. Then $\tau^*$ is a stopping time and $Y_{\tau^*} = M_{\tau^*}$.

Proof. That $\tau^*$ is a stopping time follows from the fact that the non-decreasing process $(A_t)_{t \in \{0,\ldots,T\}}$ is predictable. Consider the event $\{\tau^* = t\}$. On this event we have $A_t = 0$ and hence $M_t = U_t$. However, $A_{t+1} > 0$ so that
$$\mathbb{E}(U_{t+1}|\mathcal{F}_t) = \mathbb{E}(M_{t+1} - A_{t+1}|\mathcal{F}_t) = M_t - A_{t+1} < M_t = U_t.$$
Finally, recalling the definition of a Snell envelope, we have
$$U_t = \max\{Y_t, \mathbb{E}(U_{t+1}|\mathcal{F}_t)\} = Y_t,$$
and we're done.

Definition. A stopping time $\tau$ such that $\mathbb{E}(Y_\tau) = U_0$ is called an optimal stopping time.

Theorem. Let $Y$ be an adapted process indexed by $\{0, \ldots, T\}$, and let $U$ be its Snell envelope. Then
$$U_0 = \sup_{\tau \in \mathcal{T}} \mathbb{E}(Y_\tau).$$

Proof. Since $U$ is a supermartingale dominating $Y$, we have $U_0 \ge \mathbb{E}(U_\tau) \ge \mathbb{E}(Y_\tau)$ by the optional sampling theorem. (See example sheet 2.) But letting $\tau^* = \min\{t \in \{0, \ldots, T - 1\} : A_{t+1} > 0\}$ where $U = M - A$ is the Doob decomposition of $U$, we have
$$U_0 = M_0 = \mathbb{E}(M_{\tau^*}) = \mathbb{E}(Y_{\tau^*}).$$

Returning to finance, let $(\xi_t)_{t \in \{0,\ldots,T\}}$ be the process specifying the payout of an American option, and let $(U_t)_{t \in \{0,\ldots,T\}}$ be the Snell envelope of the discounted payout $\tilde\xi$ with Doob decomposition $U_t = M_t - A_t$. We now will use the assumption that the market is complete: consider an investor with initial discounted wealth
$$x = U_0 = M_0 = \sup_{\tau \in \mathcal{T}} \mathbb{E}^{\mathbb{Q}}(\tilde\xi_\tau).$$
By the complete market assumption, we can find a predictable trading strategy $(\pi_t)_{t \in \{1,\ldots,T\}}$ so that
$$x + \sum_{s=1}^{T} \pi_s \cdot (\tilde S_s - \tilde S_{s-1}) = M_T.$$
Since $(\tilde S_t)_{t \in \{0,\ldots,T\}}$ is a martingale for $\mathbb{Q}$, [and since $(\pi_t)_{t \in \{1,\ldots,T\}}$ is bounded because we're in a complete market and hence any $\mathcal{F}_t$-measurable random variable can only take a finite number of values] the investor's discounted wealth $\tilde X$ is a martingale, where
$$\tilde X_t = x + \sum_{s=1}^{t} \pi_s \cdot (\tilde S_s - \tilde S_{s-1}).$$
Hence $\tilde X_t = M_t$ for all $t \in \{0, \ldots, T\}$ and the trading strategy $(\pi_t)_{t \in \{1,\ldots,T\}}$ super-replicates the payout of the American claim, since
$$\tilde\xi_t \le U_t \le M_t = \tilde X_t$$
almost surely for all $t \in \{0, \ldots, T\}$.
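The Snell envelope recursion is a backward induction and can be run directly in a small model. The sketch below assumes a recombining binomial tree, which is not part of the text above: the stock moves by a factor u or d each period, the numéraire grows by 1 + r, and the risk-neutral up-probability is q = (1 + r - d)/(u - d). All function names and parameter values are illustrative. It computes the time-0 Snell envelope value of the discounted American put payout and, for comparison, the European put price.

```python
from math import comb

def american_put_snell(S0, K, u, d, r, T):
    """U_0 for the discounted payout Y_t = (K - S_t)^+ / (1 + r)^t."""
    q = (1 + r - d) / (u - d)                 # risk-neutral up-probability
    assert 0 < q < 1, "need d < 1 + r < u for no arbitrage"

    def Y(t, j):                              # j = number of up-moves by time t
        return max(K - S0 * u**j * d**(t - j), 0.0) / (1 + r) ** t

    U = [Y(T, j) for j in range(T + 1)]       # terminal condition U_T = Y_T
    for t in range(T - 1, -1, -1):
        # U_t = max(Y_t, E(U_{t+1} | F_t)), node by node
        U = [max(Y(t, j), q * U[j + 1] + (1 - q) * U[j]) for j in range(t + 1)]
    return U[0]

def european_put(S0, K, u, d, r, T):
    """Discounted risk-neutral expectation of (K - S_T)^+."""
    q = (1 + r - d) / (u - d)
    payoff = sum(comb(T, j) * q**j * (1 - q) ** (T - j)
                 * max(K - S0 * u**j * d ** (T - j), 0.0) for j in range(T + 1))
    return payoff / (1 + r) ** T

american = american_put_snell(100.0, 100.0, 1.2, 0.8, 0.05, 2)
european = european_put(100.0, 100.0, 1.2, 0.8, 0.05, 2)
print(american, european)
```

In this example the American value exceeds the European one because early exercise is optimal after a down-move; with r = 0 the two values coincide.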

We have shown that if the seller of the claim has $x = U_0 = M_0$ initial discounted wealth, he can super-replicate the payout of the American claim. Furthermore, he cannot do so with a smaller endowment, as there exists a stopping time $\tau^*$ such that $\tilde X_{\tau^*} = \tilde\xi_{\tau^*}$.

4. Locally equivalent measures

In all the examples in this course, we will deal with finite horizon problems. However, it is sometimes more convenient not to explicitly mention the horizon. To conclude this chapter, we will briefly discuss a weakening of the notion of equivalence of measures that is suitable for our purposes. In this section, let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space equipped with a filtration $(\mathcal{F}_t)_{t \in \mathbb{T}}$.

To see where we're going, suppose that the measure $\mathbb{Q}$ is equivalent to $\mathbb{P}$. By the Radon–Nikodym theorem, there exists a positive random variable $Z = \frac{d\mathbb{Q}}{d\mathbb{P}}$ such that $\mathbb{Q}(A) = \mathbb{E}^{\mathbb{P}}(Z 1_A)$. We can associate with any equivalent measure a positive martingale given by $Z_t = \mathbb{E}^{\mathbb{P}}(Z|\mathcal{F}_t)$. Notice that if $A \in \mathcal{F}_t$ then
$$\mathbb{Q}(A) = \mathbb{E}^{\mathbb{P}}(1_A Z) = \mathbb{E}^{\mathbb{P}}[\mathbb{E}^{\mathbb{P}}(Z 1_A|\mathcal{F}_t)] = \mathbb{E}^{\mathbb{P}}[1_A \mathbb{E}^{\mathbb{P}}(Z|\mathcal{F}_t)] = \mathbb{E}^{\mathbb{P}}[1_A Z_t].$$
What the above calculation shows is that the density of the restriction $\mathbb{Q}|_{\mathcal{F}_t}$ of the measure $\mathbb{Q}$ to the sub-sigma-field $\mathcal{F}_t \subseteq \mathcal{F}$ with respect to $\mathbb{P}|_{\mathcal{F}_t}$ is the random variable $Z_t$. Now we are in a position to generalize the notion of equivalent measures.

Definition. The measure $\mathbb{Q}$ is locally equivalent to $\mathbb{P}$ if and only if the restriction $\mathbb{Q}|_{\mathcal{F}_t}$ of the measure $\mathbb{Q}$ to the sigma-field $\mathcal{F}_t$ is equivalent to $\mathbb{P}|_{\mathcal{F}_t}$ for each $t \in \mathbb{T}$.

Theorem (Radon–Nikodym, local version). The measures $\mathbb{P}$ and $\mathbb{Q}$ are locally equivalent if and only if there exists a positive martingale $Z$ with $Z_0 = 1$ such that
$$\frac{d\mathbb{Q}|_{\mathcal{F}_t}}{d\mathbb{P}|_{\mathcal{F}_t}} = Z_t.$$
We will call the martingale $Z$ the density process for $\mathbb{Q}$ with respect to $\mathbb{P}$.

In particular, let us consider a financial model $(B, S) = (B_t, S_t)_{t \in \mathbb{Z}_+}$. If no final time horizon $T > 0$ is explicitly mentioned, then we will extend the definition of equivalent martingale measure to include measures $\mathbb{Q}$ locally equivalent to $\mathbb{P}$ such that $\tilde S = S/B$ is a $\mathbb{Q}$-martingale.


CHAPTER 3

Brownian motion and stochastic calculus

In the lectures up to now, we have considered investors whose discounted wealth process is typically of the form
$$\tilde X_t = \tilde X_0 + \sum_{s=1}^{t} \pi_s \cdot (\tilde S_s - \tilde S_{s-1}).$$
In this chapter, we consider the limit as trading becomes more and more frequent, and hence we need to understand the continuous time generalization
$$\tilde X_t = \tilde X_0 + \int_0^t \pi_s \cdot d\tilde S_s.$$
Now recall that the first fundamental theorem of asset pricing tells us, in the discrete-time world, that if there is no arbitrage, then there is an equivalent measure $\mathbb{Q}$ such that $(\tilde S_t)_{t \in \mathbb{Z}_+}$ is a martingale. As a preview of what's to come, we will see that essentially all continuous martingales $M = (M_t)_{t \in \mathbb{R}_+}$ are of the form
$$M_t = M_0 + \int_0^t \alpha_s \cdot dW_s$$
where $W = (W_t)_{t \in \mathbb{R}_+}$ is a Brownian motion.

If the sample paths of Brownian motion were differentiable, we would be able to define the stochastic integral $\int_0^t \alpha_s \cdot dW_s$ by $\int_0^t \alpha_s \cdot \frac{dW_s}{ds}\, ds$ and the story would be over. Unfortunately, the sample paths of Brownian motion are not differentiable, and so we will have to do more work to make sense of the situation.

We will see that the above stochastic integral can, in fact, be defined. However, a word of warning is in order: the rules of stochastic calculus are not the same as those of ordinary calculus, although the stochastic integral behaves in some ways like a Riemann–Stieltjes integral of calculus.

The following chapter will provide an extremely brief introduction to this theory. Our goals will be to define a Brownian motion, to construct the Brownian stochastic integral, and to learn the rules of the resulting calculus.

1. Brownian motion

In this section, we introduce one of the most fundamental continuous-time stochastic processes, Brownian motion. As hinted above, our primary interest in this process is that it will be the building block for all of the continuous-time market models studied in these lectures.

Definition. A Brownian motion $W = (W_t)_{t \in \mathbb{R}_+}$ is a collection of random variables such that
• $W_0(\omega) = 0$ for all $\omega \in \Omega$,
• for all $0 \le t_0 < t_1 < \ldots < t_n$ the increments $W_{t_{i+1}} - W_{t_i}$ are independent, and the distribution of $W_t - W_s$ is $N(0, |t - s|)$,
• the sample path $t \mapsto W_t(\omega)$ is continuous for all $\omega \in \Omega$.

It is not clear that Brownian motion exists. That is, does there exist a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ on which the uncountable collection of random variables $(W_t)_{t \in \mathbb{R}_+}$ can be simultaneously defined in such a way that the above definition holds? The answer, of course, is yes, and the proof of this fact is due to Wiener in 1930. Therefore, the Brownian motion is also often called the Wiener process, especially in the U.S.

Below is a computer simulation of a one-dimensional Brownian motion:

[Figure: Sample path of Brownian motion, with W plotted against t for t between 0 and 10.]

Although the sample paths of Brownian motion are continuous, they are very irregular. The following is a very important result, due to Lévy, which can quantify the irregularity of the Brownian sample path.

In this chapter, we will use the phrase 'a sequence of partitions of $[0, t]$ with vanishing norm' to mean a collection of points
$$0 = t_0^{(N)} \le t_1^{(N)} \le \ldots \le t_N^{(N)} = t$$
such that $\max_{n \in \{1,\ldots,N\}} |t_n^{(N)} - t_{n-1}^{(N)}| \to 0$ as $N \to \infty$. Useful properties of such sequences are
$$\sum_{n=1}^{N} (t_n^{(N)} - t_{n-1}^{(N)}) = t$$
and
$$\sum_{n=1}^{N} (t_n^{(N)} - t_{n-1}^{(N)})^2 \le \max_{n \in \{1,\ldots,N\}} |t_n^{(N)} - t_{n-1}^{(N)}| \sum_{n=1}^{N} (t_n^{(N)} - t_{n-1}^{(N)}) \to 0.$$

Theorem. Let $W$ and $W^\perp$ be independent one-dimensional Brownian motions. For every sequence of partitions of $[0, t]$ with vanishing norm we have
$$\sum_{n=1}^{N} (W_{t_n^{(N)}} - W_{t_{n-1}^{(N)}})^2 \to t$$
in probability and
$$\sum_{n=1}^{N} (W_{t_n^{(N)}} - W_{t_{n-1}^{(N)}})(W^\perp_{t_n^{(N)}} - W^\perp_{t_{n-1}^{(N)}}) \to 0$$
in probability.

Remark. For comparison, consider a continuously differentiable function $f : \mathbb{R}_+ \to \mathbb{R}$. Then, we have
$$\sum_{n=1}^{N} [f(t_n^{(N)}) - f(t_{n-1}^{(N)})]^2 = \sum_{n=1}^{N} f'(s_n^{(N)})^2 (t_n^{(N)} - t_{n-1}^{(N)})^2 \le \max_{s \in [0,t]} f'(s)^2 \sum_{n=1}^{N} (t_n^{(N)} - t_{n-1}^{(N)})^2 \to 0$$
where $t_{n-1}^{(N)} < s_n^{(N)} < t_n^{(N)}$ by the mean value theorem. We can conclude from the above theorem that a typical Brownian sample path is not a continuously differentiable function of time.

Proof. By definition, the increments of Brownian motion are Gaussian random variables so that
$$\mathbb{E}[(W_t - W_s)^2] = t - s \quad\text{and}\quad \mathrm{Var}[(W_t - W_s)^2] = 2(t - s)^2$$
for every $0 \le s \le t$. Hence
$$\mathbb{E}\left[\sum_{n=1}^{N} (W_{t_n^{(N)}} - W_{t_{n-1}^{(N)}})^2\right] = \sum_{n=1}^{N} (t_n^{(N)} - t_{n-1}^{(N)}) = t$$
and, by the independence of the increments of Brownian motion,
$$\mathrm{Var}\left[\sum_{n=1}^{N} (W_{t_n^{(N)}} - W_{t_{n-1}^{(N)}})^2\right] = 2 \sum_{n=1}^{N} (t_n^{(N)} - t_{n-1}^{(N)})^2 \to 0$$
and the first conclusion follows from Chebychev's inequality.
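The convergence of the sums of squared increments can be observed in simulation. The following is a minimal sketch, assuming uniform partitions of [0, t] and independent Gaussian increments drawn with Python's standard `random` module. It illustrates the statement for particular sample paths and is not a proof; the function name and the seed are arbitrary.

```python
import math
import random

def realized_sums(t, N, rng):
    """Return the sum of squared increments of W and the sum of cross
    products of increments of W and an independent W_perp, on a uniform
    partition of [0, t] into N intervals."""
    sd = math.sqrt(t / N)          # standard deviation of each increment
    qv = cross = 0.0
    for _ in range(N):
        dW = rng.gauss(0.0, sd)
        dW_perp = rng.gauss(0.0, sd)
        qv += dW * dW
        cross += dW * dW_perp
    return qv, cross

rng = random.Random(0)             # fixed seed, chosen arbitrarily
for N in (10, 1000, 100000):
    print(N, realized_sums(2.0, N, rng))
```

As N grows, the first number should settle near t = 2 and the second near 0, with fluctuations of order t times the square root of 2/N and t divided by the square root of N respectively.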

Since $E[(W_t - W_s)(W_t^{\perp} - W_s^{\perp})] = 0$ and $\operatorname{Var}[(W_t - W_s)(W_t^{\perp} - W_s^{\perp})] = (t-s)^2$, the second conclusion follows similarly.

2. Itô stochastic integration

We now have sufficient motivation to construct a stochastic integral with respect to a Wiener process. The Brownian motion is too rough to apply the Riemann–Stieltjes integration theory. Fortunately, there is an integration theory that does the job. It is based on the fact that Brownian motion is a martingale.

Theorem. Let $W$ be a scalar Brownian motion, and let $(\mathcal F_t)_{t \in \mathbb R_+}$ be its natural filtration. Then $W$ is a martingale.

Proof. Testing integrability is easy since Gaussian random variables have finite exponential moments. Now fix $0 \le s < t$. Then
$$E(W_t \mid \mathcal F_s) = E(W_s + W_t - W_s \mid \mathcal F_s) = W_s + E(W_t - W_s) = W_s.$$
Note that the fact that $W_t - W_s$ is independent of $\mathcal F_s$ is used to pass from a conditional expectation to an unconditional expectation.

What follows is the briefest of sketches of the theory. There are now plenty of places to turn for a proper treatment of the subject. For instance, please consult one of the following references:
• L.C.G. Rogers and D. Williams. Diffusions, Markov Processes, and Martingales: Volume 2.
• I. Karatzas and S.E. Shreve. Brownian Motion and Stochastic Calculus.

To get things started, let $W$ be a scalar Brownian motion. We will assume that $W$ is adapted to a filtration $(\mathcal F_t)_{t \in \mathbb R_+}$. Of this filtration we will assume that it satisfies what are called the usual conditions of right-continuity
$$\mathcal F_t = \bigcap_{\varepsilon > 0} \mathcal F_{t+\varepsilon}$$
and that $\mathcal F_0$ contains all $P$-null events. We also will assume that for each $0 \le s < t$ the increment $W_t - W_s$ is independent of $\mathcal F_s$.

2.1. The $L^2$ theory. The first building blocks of the theory are the simple predictable integrands.

Definition. A simple predictable integrand is an adapted process $\alpha = (\alpha_t)_{t \in \mathbb R_+}$ of the form
$$\alpha_t(\omega) = \sum_{n=1}^{N} 1_{(t_{n-1}, t_n]}(t)\, a_n(\omega)$$
where $a_n$ is bounded and $\mathcal F_{t_{n-1}}$-measurable for some $0 \le t_0 < t_1 < \ldots < t_N < \infty$.

For simple predictable integrands we define the stochastic integral by the formula
$$\int_0^\infty \alpha_s\, dW_s = \sum_{n=1}^{N} a_n (W_{t_n} - W_{t_{n-1}}).$$
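The defining sum for a simple predictable integrand is easy to evaluate once the coefficients $a_n$ and the values of $W$ at the partition points are known. The following is a minimal sketch; the function name `simple_stochastic_integral` and the sample numbers are illustrative assumptions, not part of the notes.

```python
def simple_stochastic_integral(coeffs, w_values):
    """Return sum_n a_n * (W_{t_n} - W_{t_{n-1}}).

    coeffs   -- a_1, ..., a_N (hypothetical values; a_n must be known at t_{n-1})
    w_values -- W_{t_0}, ..., W_{t_N}, so one more entry than coeffs
    """
    if len(w_values) != len(coeffs) + 1:
        raise ValueError("need exactly one more path value than coefficients")
    return sum(a_n * (w_values[n + 1] - w_values[n])
               for n, a_n in enumerate(coeffs))


# Illustrative numbers only: W at t_0, t_1, t_2 and coefficients a_1, a_2.
w_path = [0.0, 1.0, 3.0]
coeffs = [1.0, 2.0]
integral = simple_stochastic_integral(coeffs, w_path)
print(integral)  # 1*(1-0) + 2*(3-1) = 5.0
```

Each coefficient multiplies the increment over the interval that follows it, which is how the requirement that $a_n$ be $\mathcal F_{t_{n-1}}$-measurable shows up in the computation.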

Theorem (Itô's isometry). For a simple predictable integrand $\alpha$, we have
$$E\left[\left(\int_0^\infty \alpha_s\, dW_s\right)^2\right] = E\left[\int_0^\infty \alpha_s^2\, ds\right].$$

Proof. We have
$$E\left[\left(\int_0^\infty \alpha_s\, dW_s\right)^2\right] = E\sum_{m,n} a_m a_n (W_{t_m} - W_{t_{m-1}})(W_{t_n} - W_{t_{n-1}}).$$
Consider the terms in the sum when $m \le n$. Applying the tower property, we see
$$E\left[a_m a_n (W_{t_m} - W_{t_{m-1}})(W_{t_n} - W_{t_{n-1}})\right] = E\left[E\left(a_m a_n (W_{t_m} - W_{t_{m-1}})(W_{t_n} - W_{t_{n-1}}) \mid \mathcal F_{t_{n-1}}\right)\right] = E\left[a_m a_n E\left((W_{t_m} - W_{t_{m-1}})(W_{t_n} - W_{t_{n-1}}) \mid \mathcal F_{t_{n-1}}\right)\right]$$
since $a_m$ and $a_n$ are $\mathcal F_{t_{n-1}}$-measurable. Now, if $m < n$,
$$E\left[(W_{t_m} - W_{t_{m-1}})(W_{t_n} - W_{t_{n-1}}) \mid \mathcal F_{t_{n-1}}\right] = (W_{t_m} - W_{t_{m-1}})\, E[W_{t_n} - W_{t_{n-1}}] = 0$$
and if $m = n$
$$E\left[(W_{t_n} - W_{t_{n-1}})^2 \mid \mathcal F_{t_{n-1}}\right] = E\left[(W_{t_n} - W_{t_{n-1}})^2\right] = t_n - t_{n-1}$$
since the increment $W_{t_n} - W_{t_{n-1}}$ is independent of $\mathcal F_{t_{n-1}}$. In particular, the off-diagonal terms cancel, and we have
$$E\sum_{m,n} a_m a_n (W_{t_m} - W_{t_{m-1}})(W_{t_n} - W_{t_{n-1}}) = E\sum_{n} a_n^2 (t_n - t_{n-1}) = E\int_0^\infty \alpha_s^2\, ds$$
as desired.

Equivalently, the map defined by $I(\alpha) = \int_0^\infty \alpha_s\, dW_s$ is an isometry from the space of simple predictable integrands to the space $L^2$ of square-integrable random variables. The fact that $L^2$ is complete is the key observation which allows us to build the stochastic integral of more general integrands.

Definition. The predictable sigma-field $\mathcal P$ is the sigma-field on the product space $\mathbb R_+ \times \Omega$ generated by sets of the form $(s, t] \times A$ where $0 \le s < t$ and $A$ is $\mathcal F_s$-measurable. Equivalently, the predictable sigma-field is that generated by the simple, predictable integrands. A predictable process $\alpha$ is a map $\alpha : \mathbb R_+ \times \Omega \to \mathbb R$ that is $\mathcal P$-measurable. Equivalently, predictable processes are limits of simple, predictable integrands.

Remark. Every left-continuous, adapted process is predictable. These are the examples to keep in mind, since they are the ones that come up most in application.

Now, suppose $(\alpha^{(k)})_{k \in \mathbb N}$ is a sequence of simple predictable integrands converging to a predictable process $\alpha$ in the sense that
$$E\int_0^\infty (\alpha_s^{(k)} - \alpha_s)^2\, ds \to 0$$

as $k \to \infty$. By Itô's isometry the sequence $I(\alpha^{(k)})$ is a Cauchy sequence, which by the completeness of $L^2$, converges to some random variable. This is what we take as the definition.

Definition. If $\alpha$ is predictable and
$$E\int_0^\infty \alpha_s^2\, ds < \infty$$
then
$$\int_0^\infty \alpha_s\, dW_s = L^2\text{-}\lim_{k} \int_0^\infty \alpha_s^{(k)}\, dW_s$$
where $\alpha^{(k)}$ is any sequence of simple, predictable integrands converging in $L^2$ to $\alpha$.

But please note: The stochastic integral is not defined as the almost sure limit of a sequence of Riemann–Stieltjes integrals!!

Of course, we are not really interested in integrals over the whole interval $[0, \infty)$ but rather finite intervals $[0, t]$. This is easily handled.

Definition.
$$\int_0^t \alpha_s\, dW_s = \int_0^\infty \alpha_s 1_{\{0 < s \le t\}}\, dW_s$$
whenever the right-hand side is well-defined.

What can we say, within the $L^2$-theory, about a process defined by a stochastic integral?

Theorem. Let
$$M_t = \int_0^t \alpha_s\, dW_s$$
where $\alpha$ is predictable and
$$E\int_0^t \alpha_s^2\, ds < \infty$$
for each $t \ge 0$. Then $M$ is a continuous martingale.

2.2. Localization. In this section, we show how to extend the definition of stochastic integral to predictable processes $\alpha$ such that
$$\int_0^t \alpha_s^2\, ds < \infty \quad \text{almost surely}$$
for all $t \ge 0$. The technique is called localization. Define the stopping times
$$\tau_n = \inf\left\{ t \ge 0 : \int_0^t \alpha_s^2\, ds = n \right\}$$
for each $n \in \mathbb N$, where $\inf \emptyset = \infty$ as usual, and let $\alpha_t^{(n)} = \alpha_t 1_{\{t \le \tau_n\}}$.

Note that since
$$E\int_0^t (\alpha_s^{(n)})^2\, ds \le n,$$
the process $\left(\int_0^t \alpha_s^{(n)}\, dW_s\right)_{t \in \mathbb R_+}$ is well-defined by the $L^2$ theory. Now fix $t > 0$ and define the increasing sequence of events $A_n = \{\omega \in \Omega : \tau_n \ge t\}$. Since $\int_0^t \alpha_s^2\, ds < \infty$ almost surely for all $t \ge 0$, we have $P\left(\bigcup_{n \in \mathbb N} A_n\right) = 1$. Hence we can define the stochastic integral by the formula
$$\int_0^t \alpha_s\, dW_s = \lim_{n \to \infty} \int_0^t \alpha_s^{(n)}\, dW_s$$
where the limit is in probability.

The process $L = \left(\int_0^t \alpha_s\, dW_s\right)_{t \in \mathbb R_+}$ defined in this way is still continuous, but in general there is no guarantee that $L$ is a martingale. Indeed, the stochastic integral defined by the localized version of the $L^2$ stochastic integration theory may not be a martingale, but by construction it is what is called a local martingale:

Definition. An adapted process $(L_t)_{t \in \mathbb R_+}$ is called a local martingale if and only if there exists an increasing sequence of stopping times $(\tau_n)_{n \in \mathbb N}$ with $\tau_n \to \infty$ almost surely such that the stopped process $(L_{t \wedge \tau_n})_{t \in \mathbb R_+}$ is a martingale for all $n \in \mathbb N$.

Remark. If you're ambitious, you will now try to build a stochastic integration theory starting with a general continuous local martingale $L$, rather than Brownian motion. The steps are the same. First you localize $L$, so you can assume $L$ is a square-integrable martingale. Then, you can do the $L^2$ theory as before, but Itô's isometry becomes
$$E\left[\left(\int_0^\infty \alpha_s\, dL_s\right)^2\right] = E\left[\int_0^\infty \alpha_s^2\, d\langle L \rangle_s\right]$$
where
$$\langle L \rangle_t = \lim_{N \to \infty} \sum_{n=1}^{N} \left(L_{t_n^{(N)}} - L_{t_{n-1}^{(N)}}\right)^2.$$
Then you can define the integral over finite intervals, and finally, extend the integrands by localization. Once you're done, you will have built a stochastic integration theory that has the very nice property that the integral $I_t = \int_0^t \alpha_s\, dL_s$ is defined when $\int_0^t \alpha_s^2\, d\langle L \rangle_s < \infty$ a.s., and the process $I$ is another continuous local martingale.

To summarize: If $\alpha$ is an adapted, left-continuous process such that $\int_0^t \alpha_s^2\, ds < \infty$ almost surely for each $t \ge 0$ then $M_t = \int_0^t \alpha_s\, dW_s$ is defined. In all cases, $M$ is a continuous local martingale. If in addition we have $E\int_0^t \alpha_s^2\, ds < \infty$ then $M$ is a true martingale (as opposed to being a strictly local martingale).

3. Itô's formula

In the last section, we sketched the construction of a stochastic integral with respect to a Wiener process. What makes the Itô stochastic integral useful is that there is a corresponding stochastic calculus. The basic building block of this calculus is the chain rule, called Itô's formula.

3.1. The scalar version. Let $(W_t)_{t \in \mathbb R_+}$ be a scalar Brownian motion adapted to a filtration $(\mathcal F_t)_{t \ge 0}$ satisfying our usual conditions. Let $(\alpha_t)_{t \in \mathbb R_+}$ and $(k_t)_{t \in \mathbb R_+}$ be predictable real-valued processes such that
$$\int_0^t \alpha_s^2\, ds < \infty \quad \text{and} \quad \int_0^t |k_s|\, ds < \infty$$
almost surely for all $t \ge 0$. Fix a real number $X_0$ and let
$$X_t = X_0 + \int_0^t \alpha_s\, dW_s + \int_0^t k_s\, ds.$$
Both integrals appearing in the above equation can now be interpreted, the first as a stochastic integral and the second as a pathwise Lebesgue integral. A process $(X_t)_{t \in \mathbb R_+}$ of the above form is often called an Itô process.

We are now ready for the first version of Itô's formula:

Theorem (Itô's formula, scalar version). Let $f : \mathbb R \to \mathbb R$ be twice continuously differentiable. Then
$$f(X_t) = f(X_0) + \int_0^t f'(X_s)\alpha_s\, dW_s + \int_0^t \left(f'(X_s)k_s + \tfrac{1}{2} f''(X_s)\alpha_s^2\right) ds.$$

Let us highlight a difference between Itô and ordinary calculus, by noting the mysterious appearance of the $f''$ term in Itô's formula. This term would not appear in the chain rule of ordinary calculus. But consider the example $f(x) = x^2$ so that
$$W_t^2 = 2\int_0^t W_s\, dW_s + t.$$
Note that since $E\int_0^t W_s^2\, ds = t^2/2 < \infty$, the stochastic integral $\left(\int_0^t W_s\, dW_s\right)_{t \in \mathbb R_+}$ is a true martingale. Can you verify, directly from the definition of Brownian motion, that the process $(W_t^2 - t)_{t \in \mathbb R_+}$ is a martingale?

We now introduce some notions which help with computations involving Itô's formula.

Definition. A semimartingale is a process $X$ of the form $X_t = A_t + L_t$ where $L$ is a local martingale and $A$ is a process of bounded variation. (For instance, an Itô process is a semimartingale.) For a semimartingale $X$, the quadratic variation process $\langle X \rangle = (\langle X \rangle_t)_{t \in \mathbb R_+}$, if it exists, is defined by the limit in probability
$$\langle X \rangle_t = \lim_{N \to \infty} \sum_{n=1}^{N} \left(X_{t_n^{(N)}} - X_{t_{n-1}^{(N)}}\right)^2$$
over a sequence of partitions with vanishing norm. If $(Y_t)_{t \in \mathbb R_+}$ is another semimartingale, then the quadratic covariation process $\langle X, Y \rangle = (\langle X, Y \rangle_t)_{t \in \mathbb R_+}$ is defined by the limit
$$\langle X, Y \rangle_t = \lim_{N \to \infty} \sum_{n=1}^{N} \left(X_{t_n^{(N)}} - X_{t_{n-1}^{(N)}}\right)\left(Y_{t_n^{(N)}} - Y_{t_{n-1}^{(N)}}\right).$$
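The quadratic variation sum and the identity $W_t^2 = 2\int_0^t W_s\, dW_s + t$ can be explored on a simulated path. The sketch below is only an illustration under stated assumptions: the Brownian path is approximated by a Gaussian random walk on a uniform grid, the stochastic integral by a left-endpoint sum, and the helper names (`brownian_path`, `quadratic_variation`, `left_point_sum`) are hypothetical.

```python
import math
import random


def brownian_path(t, n_steps, rng):
    """Random-walk approximation of W on a uniform grid of [0, t]."""
    dt = t / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path


def quadratic_variation(path):
    """Sum of squared increments along the grid."""
    return sum((path[n] - path[n - 1]) ** 2 for n in range(1, len(path)))


def left_point_sum(path):
    """Left-endpoint sum approximating the integral of W dW."""
    return sum(path[n - 1] * (path[n] - path[n - 1]) for n in range(1, len(path)))


rng = random.Random(0)  # fixed seed so the run is reproducible
t = 1.0
w = brownian_path(t, 100_000, rng)
qv = quadratic_variation(w)   # should be close to t on a fine grid
ito_sum = left_point_sum(w)
# On any grid, W_t^2 - W_0^2 = 2 * (left-endpoint sum) + (sum of squared increments).
gap = w[-1] ** 2 - (2.0 * ito_sum + qv)
print(qv, gap)
```

The discrete identity holds exactly up to rounding, so the extra term $t$ in the continuous formula is precisely the limit of the squared increments.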

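Theorem. The map $(X, Y) \mapsto \langle X, Y \rangle$ is bilinear, and the following table summarizes its possible values:
$$\left\langle \int_0^\cdot a_s\, ds, \int_0^\cdot b_s\, ds \right\rangle_t = 0, \qquad \left\langle \int_0^\cdot a_s\, ds, \int_0^\cdot b_s\, dW_s \right\rangle_t = 0,$$
$$\left\langle \int_0^\cdot a_s\, dW_s, \int_0^\cdot b_s\, dW_s \right\rangle_t = \int_0^t a_s b_s\, ds, \qquad \left\langle \int_0^\cdot a_s\, dW_s, \int_0^\cdot b_s\, dW_s^{\perp} \right\rangle_t = 0,$$
where $(a_t)_{t \in \mathbb R_+}$ and $(b_t)_{t \in \mathbb R_+}$ are processes such that the relevant integrals are defined, and where $(W_t)_{t \in \mathbb R_+}$ and $(W_t^{\perp})_{t \in \mathbb R_+}$ are independent Brownian motions.

Example. For instance, suppose we have two Itô processes
$$X_t = X_0 + \int_0^t \alpha_s^{(1)}\, dW_s^{(1)} + \int_0^t \alpha_s^{(2)}\, dW_s^{(2)} + \int_0^t h_s\, ds$$
$$Y_t = Y_0 + \int_0^t \beta_s^{(1)}\, dW_s^{(1)} + \int_0^t \beta_s^{(2)}\, dW_s^{(2)} + \int_0^t k_s\, ds$$
for independent Brownian motions $W^{(1)}$ and $W^{(2)}$. Then the quadratic covariation can be computed by bilinearity:
$$\langle X, Y \rangle_t = \int_0^t \left(\alpha_s^{(1)}\beta_s^{(1)} + \alpha_s^{(2)}\beta_s^{(2)}\right) ds.$$

Armed with the notion of quadratic variation, we are ready to see an indication for why Itô's formula may be true. Let
$$dX_t = \alpha_t\, dW_t + k_t\, dt$$
where the differential notation is shorthand for the corresponding integrals. Notice that in this differential notation, the symbol $d\langle X \rangle_t$ means
$$d\langle X \rangle_t = \alpha_t^2\, dt.$$
Fix a partition of $[0, t]$ and consider the following second order Taylor approximation:
$$f(X_t) - f(X_0) = \sum_{n=1}^{N} \left[f(X_{t_n}) - f(X_{t_{n-1}})\right] \approx \sum_{n=1}^{N} \left[f'(X_{t_{n-1}})(X_{t_n} - X_{t_{n-1}}) + \tfrac{1}{2} f''(X_{t_{n-1}})(X_{t_n} - X_{t_{n-1}})^2\right] \approx \int_0^t f'(X_s)\, dX_s + \frac{1}{2}\int_0^t f''(X_s)\, d\langle X \rangle_s.$$
In fact, it is customary to write out Itô's formula as
$$df(X_t) = f'(X_t)\, dX_t + \tfrac{1}{2} f''(X_t)\, d\langle X \rangle_t$$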

where the differential notation should really be interpreted as the corresponding integral equation.

Example. Consider the Itô process given by
$$X_t = X_0 + aW_t + bt$$
for some constants $a, b \in \mathbb R$. Letting $Y_t = e^{X_t}$, we would like to show that the process $(Y_t)_{t \in \mathbb R_+}$ is an Itô process, and write down its decomposition in terms of ordinary and stochastic integrals. Let $f(x) = e^x$. Then $f'(x) = e^x$ and $f''(x) = e^x$. Also,
$$dX_t = a\, dW_t + b\, dt \quad \text{and} \quad d\langle X \rangle_t = a^2\, dt.$$
So Itô's formula says:
$$df(X_t) = f'(X_t)\, dX_t + \tfrac{1}{2} f''(X_t)\, d\langle X \rangle_t \;\Rightarrow\; dY_t = Y_t\left[(b + a^2/2)\, dt + a\, dW_t\right].$$

Example. Suppose we have the Itô process given implicitly as the solution of the stochastic differential equation
$$dY_t = Y_t[\beta\, dt + \alpha\, dW_t]$$
for some constants $\alpha, \beta \in \mathbb R$. Letting $Z_t = \log(Y_t)$, we know from the previous example that
$$Z_t = Z_0 + aW_t + bt,$$
where $a = \alpha$ and $b = \beta - \alpha^2/2$. But to be safe, let's check. Let $g(y) = \log(y)$. Then $g'(y) = \frac{1}{y}$ and $g''(y) = -\frac{1}{y^2}$, and also $d\langle Y \rangle_t = \alpha^2 Y_t^2\, dt$. Again, Itô's formula says:
$$dg(Y_t) = g'(Y_t)\, dY_t + \tfrac{1}{2} g''(Y_t)\, d\langle Y \rangle_t \;\Rightarrow\; dZ_t = \frac{1}{Y_t} Y_t[\beta\, dt + \alpha\, dW_t] + \frac{1}{2}\left(-\frac{1}{Y_t^2}\right)\alpha^2 Y_t^2\, dt = (\beta - \alpha^2/2)\, dt + \alpha\, dW_t.$$

3.2. The vector version. We now introduce the vector version of Itô's formula. It is basically the same as before, but with worse notation. Let $(W_t)_{t \ge 0}$ be a $d$-dimensional Wiener process, let $(\alpha_t)_{t \ge 0}$ be an adapted process valued in the space of $n \times d$ matrices, and let $(k_t)_{t \ge 0}$ be an adapted process valued in $\mathbb R^n$. We insist that
$$\int_0^t \sum_{i=1}^{n}\sum_{j=1}^{d} (\alpha_s^{(i,j)})^2\, ds < \infty \quad \text{and} \quad \int_0^t \sum_{i=1}^{n} |k_s^{(i)}|\, ds < \infty$$

almost surely for all $t \ge 0$. Now consider the $n$-dimensional Itô process $(X_t)_{t \ge 0}$ defined by
$$X_t = X_0 + \int_0^t \alpha_s\, dW_s + \int_0^t k_s\, ds,$$
interpreted component-wise as
$$X_t^{(i)} = X_0^{(i)} + \sum_{j=1}^{d} \int_0^t \alpha_s^{(i,j)}\, dW_s^{(j)} + \int_0^t k_s^{(i)}\, ds.$$
Now we are ready for the statement of the theorem:

Theorem (Itô's formula, vector version). Let $f : \mathbb R_+ \times \mathbb R^n \to \mathbb R$ where $(t, x) \mapsto f(t, x)$ be continuously differentiable in the $t$ variable and twice-continuously differentiable in the $x$ variable. Then
$$df(t, X_t) = \frac{\partial f}{\partial t}(t, X_t)\, dt + \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(t, X_t)\, dX_t^{(i)} + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \frac{\partial^2 f}{\partial x_i \partial x_j}(t, X_t)\, d\langle X^{(i)}, X^{(j)} \rangle_t.$$

4. Girsanov's theorem

As we have seen in discrete time, the economic notion of an arbitrage-free market model is tied to the existence of a locally equivalent measure for which the discounted asset prices are martingales. Motivated by the above discussion, we aim to understand how martingales arise within the context of the Itô stochastic integration theory.

Recall that locally equivalent measures are related to positive martingales via the Radon–Nikodym theorem. Indeed, let $(\Omega, \mathcal F, P)$ be our probability space equipped with a filtration $(\mathcal F_t)_{t \in \mathbb R_+}$, and let $Q$ be locally equivalent to $P$ in the sense that the restrictions $P|_t$ and $Q|_t$ of $P$ and $Q$ to $\mathcal F_t$ are equivalent for each $t \ge 0$. Then, by the Radon–Nikodym theorem there exists a density
$$Z_t = \frac{dQ|_t}{dP|_t}$$
such that $(Z_t)_{t \in \mathbb R_+}$ is a strictly positive martingale on the probability space $(\Omega, \mathcal F, P)$. Conversely, if $(Z_t)_{t \in \mathbb R_+}$ is a positive martingale we can define a locally equivalent measure $Q$.

Consider the stochastic process $(Z_t)_{t \in \mathbb R_+}$ given by
$$Z_t = e^{-\frac{1}{2}\int_0^t |\alpha_s|^2\, ds + \int_0^t \alpha_s \cdot dW_s}$$
where $(W_t)_{t \in \mathbb R_+}$ is a $d$-dimensional Brownian motion and $(\alpha_t)_{t \in \mathbb R_+}$ is a $d$-dimensional adapted process with $\int_0^t |\alpha_s|^2\, ds < \infty$. This process is clearly positive. Furthermore, notice that by Itô's formula we have
$$dZ_t = Z_t\, \alpha_t \cdot dW_t$$
so that $(Z_t)_{t \in \mathbb R_+}$ is a local martingale, as it is a stochastic integral with respect to a Brownian motion. What if $(Z_t)_{t \in \mathbb R_+}$ is a true martingale, and hence the density of a change of measure? What happens to the Brownian motion?
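In the simplest case the process $Z$ can be sampled directly. The sketch below assumes a scalar Brownian motion and a constant integrand $\alpha$ (an assumption made only for this illustration), so that $Z_t = e^{\alpha W_t - \alpha^2 t/2}$ and, by the Gaussian moment generating function, $E(Z_t) = 1$. The function name and parameter values are hypothetical.

```python
import math
import random


def exponential_martingale_mean(alpha, t, n_samples, seed=0):
    """Monte Carlo estimate of E[exp(alpha * W_t - alpha^2 * t / 2)]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        w_t = rng.gauss(0.0, math.sqrt(t))  # W_t is N(0, t) under P
        total += math.exp(alpha * w_t - 0.5 * alpha ** 2 * t)
    return total / n_samples


# Illustrative parameters only.
estimate = exponential_martingale_mean(alpha=0.5, t=1.0, n_samples=50_000)
print(estimate)  # should be close to 1, up to Monte Carlo error
```

A sample mean near 1 is consistent with $Z$ being a true martingale started at $Z_0 = 1$ in this constant-coefficient case; it says nothing about general adapted $\alpha$.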

Theorem (Cameron–Martin–Girsanov Theorem). Let $(\Omega, \mathcal F, P)$ be a probability space on which a $d$-dimensional Brownian motion $(W_t)_{t \in \mathbb R_+}$ is defined, and let $(\mathcal F_t)_{t \in \mathbb R_+}$ be a filtration satisfying the usual conditions. Let
$$Z_t = e^{-\frac{1}{2}\int_0^t |\alpha_s|^2\, ds + \int_0^t \alpha_s \cdot dW_s}$$
and suppose $(Z_t)_{t \in \mathbb R_+}$ is a martingale. Define the locally equivalent measure $Q$ on $(\Omega, \mathcal F)$ by the density process
$$\frac{dQ|_t}{dP|_t} = Z_t.$$
Then the $d$-dimensional process $(\hat W_t)_{t \in \mathbb R_+}$ defined by
$$\hat W_t = W_t - \int_0^t \alpha_s\, ds$$
is a Brownian motion on $(\Omega, \mathcal F, Q)$.

Now, you may be asking yourself: When is the process $(Z_t)_{t \in \mathbb R_+}$ not just a local martingale, but a true martingale? An easy-to-check sufficient condition is given by:

Theorem (Novikov's criterion). If
$$E\left(e^{\frac{1}{2}\int_0^t |\alpha_s|^2\, ds}\right) < \infty$$
for all $t \ge 0$ then the process $(Z_t)_{t \in \mathbb R_+}$ defined by $Z_t = e^{-\frac{1}{2}\int_0^t |\alpha_s|^2\, ds + \int_0^t \alpha_s \cdot dW_s}$ is a true martingale.

5. Martingale representation theorems

In the introduction to this chapter, it was claimed that all continuous martingales are essentially stochastic integrals with respect to Brownian motion. In this section, we hopefully clear up the situation.

Theorem (Martingale Representation Theorem). Let $(\Omega, \mathcal F, P)$ be a probability space on which a $d$-dimensional Brownian motion $W = (W_t)_{t \in \mathbb R_+}$ is defined, and let the filtration $(\mathcal F_t)_{t \in \mathbb R_+}$ be the filtration generated by $W$. Let $L = (L_t)_{t \in \mathbb R_+}$ be a local martingale. Then there exists a unique adapted $d$-dimensional process $(\alpha_t)_{t \in \mathbb R_+}$ such that $\int_0^t |\alpha_s|^2\, ds < \infty$ almost surely for all $t \ge 0$ and
$$L_t = L_0 + \int_0^t \alpha_s \cdot dW_s.$$
In particular, $L$ is continuous.

Up to now, we have assumed that the filtration is generated by a Brownian motion. Even when it's not, we can say a lot about continuous local martingales:

Theorem (Lévy's Characterization of Brownian Motion). Let $(L_t)_{t \in \mathbb R_+}$ be a continuous $d$-dimensional local martingale such that
$$\langle L^{(i)}, L^{(j)} \rangle_t = \begin{cases} t & \text{if } i = j \\ 0 & \text{if } i \ne j. \end{cases}$$
Then $(L_t)_{t \in \mathbb R_+}$ is a standard $d$-dimensional Brownian motion.

Proof. Fix a constant vector $\theta \in \mathbb R^d$ and let $i = \sqrt{-1}$. Consider
$$M_t = e^{i\,\theta \cdot L_t + |\theta|^2 t/2}.$$
By Itô's formula,
$$dM_t = M_t\left(i\,\theta \cdot dL_t + \frac{1}{2}|\theta|^2\, dt\right) - \frac{1}{2} M_t \sum_{m=1}^{d}\sum_{n=1}^{d} \theta^{(m)}\theta^{(n)}\, d\langle L^{(m)}, L^{(n)} \rangle_t = i\, M_t\, \theta \cdot dL_t$$
and so $(M_t)_{t \in \mathbb R_+}$ is a continuous local martingale, as it is the stochastic integral with respect to a continuous local martingale. On the other hand, since $|M_t| = e^{|\theta|^2 t/2}$ and hence $E(\sup_{s \in [0,t]} |M_s|) < \infty$, the process $(M_t)_{t \in \mathbb R_+}$ is a true martingale. Thus for all $0 \le s \le t$ we have
$$E(M_t \mid \mathcal F_s) = M_s$$
which implies
$$E\left(e^{i\,\theta \cdot (L_t - L_s)} \mid \mathcal F_s\right) = e^{-|\theta|^2 (t-s)/2}.$$
The above equation implies that the increment $L_t - L_s$ has the $N_d(0, (t-s)I)$ distribution and is independent of $\mathcal F_s$.

Example. Let $(L_t)_{t \in \mathbb R_+}$ be a continuous real-valued local martingale, and suppose there is an adapted process $(\alpha_t)_{t \in \mathbb R_+}$ such that $\langle L \rangle_t = \int_0^t \alpha_s^2\, ds$ where $\alpha_t > 0$ almost surely. Let
$$W_t = \int_0^t \frac{dL_s}{\alpha_s}.$$
Note that $(W_t)_{t \in \mathbb R_+}$ is a Brownian motion by Lévy's characterization since
$$\langle W \rangle_t = \int_0^t \frac{d\langle L \rangle_s}{\alpha_s^2} = t.$$
Hence, $(W_t)_{t \in \mathbb R_+}$ is a Brownian motion such that $L_t = L_0 + \int_0^t \alpha_s\, dW_s$ for all $t \ge 0$.

As another application of Lévy's characterization of Brownian motion, we can attempt a proof of Girsanov's theorem:

Proof of Girsanov's theorem. Let
$$Z_t = e^{-\frac{1}{2}\int_0^t |\alpha_s|^2\, ds + \int_0^t \alpha_s \cdot dW_s}$$
for a $d$-dimensional adapted process $(\alpha_t)_{t \in \mathbb R_+}$ and a $d$-dimensional Brownian motion $(W_t)_{t \in \mathbb R_+}$, and suppose $(Z_t)_{t \in \mathbb R_+}$ is a true martingale. Let
$$\hat W_t = W_t - \int_0^t \alpha_s\, ds.$$

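Note that $(Z_t \hat W_t)_{t \in \mathbb R_+}$ is a local martingale for $P$, since by Itô's formula:
$$d(Z_t \hat W_t^{(i)}) = \hat W_t^{(i)}\, dZ_t + Z_t\, d\hat W_t^{(i)} + d\langle Z, \hat W^{(i)} \rangle_t = Z_t \hat W_t^{(i)}\, \alpha_t \cdot dW_t + Z_t\, dW_t^{(i)}.$$
It now follows that $(\hat W_t)_{t \in \mathbb R_+}$ is a local martingale for $Q$, where
$$\frac{dQ|_t}{dP|_t} = Z_t.$$
But since
$$\langle \hat W^{(i)}, \hat W^{(j)} \rangle_t = \begin{cases} t & \text{if } i = j \\ 0 & \text{if } i \ne j, \end{cases}$$
$(\hat W_t)_{t \in \mathbb R_+}$ is a Brownian motion for $Q$.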
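The change of measure can be checked numerically in a toy case. The sketch below assumes a scalar Brownian motion and a constant $\alpha$ (assumptions made only for this illustration). Expectations under $Q$ of an $\mathcal F_t$-measurable quantity are computed as $P$-expectations weighted by $Z_t$, and the first two moments of $\hat W_t = W_t - \alpha t$ are compared with those of a Brownian motion at time $t$, namely $0$ and $t$. The function name is hypothetical.

```python
import math
import random


def girsanov_moments(alpha, t, n_samples, seed=0):
    """Estimate E^Q[W_hat_t] and E^Q[W_hat_t^2] by weighting P-samples with Z_t."""
    rng = random.Random(seed)
    first = 0.0
    second = 0.0
    for _ in range(n_samples):
        w_t = rng.gauss(0.0, math.sqrt(t))                    # W_t sampled under P
        z_t = math.exp(alpha * w_t - 0.5 * alpha ** 2 * t)    # density dQ|t / dP|t
        w_hat = w_t - alpha * t                               # candidate Q-Brownian motion
        first += z_t * w_hat
        second += z_t * w_hat ** 2
    return first / n_samples, second / n_samples


# Illustrative parameters only.
mean_q, second_moment_q = girsanov_moments(alpha=0.3, t=1.0, n_samples=100_000)
print(mean_q, second_moment_q)  # should be near 0 and near t = 1
```

Matching two moments at a single time is of course much weaker than the theorem, which identifies the whole law of $\hat W$ under $Q$.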

CHAPTER 4

Black–Scholes model and generalizations

We now return to the main theme of these lectures, models of financial markets. We now have the tools to discuss the continuous time case, at least when the asset prices are continuous processes.

1. The set-up

As before, our market model consists of a $(d+1)$-dimensional stochastic process $(B, S) = (B_t, S_t)_{t \in \mathbb R_+}$ representing the asset prices. This process will be defined on a probability space $(\Omega, \mathcal F, P)$ with a filtration $(\mathcal F_t)_{t \in \mathbb R_+}$ satisfying the usual conditions. We will make the following now-familiar assumptions.

Assumption. The stochastic process $(B, S)$ is assumed to be an Itô process adapted to $(\mathcal F_t)_{t \in \mathbb R_+}$.

Assumption. Asset 0 is a numéraire so that $B_t > 0$ almost surely for all $t \in \mathbb R_+$.

As before, the investor's controls consist of the $(d+1)$-dimensional process $(\phi_t, \pi_t)_{t \in \mathbb R_+}$, where $\phi_t$ corresponds to the number of shares of asset 0 held at time $t$, while $\pi_t^{(i)}$ corresponds to the number of shares of asset $i \in \{1, \ldots, d\}$ held at time $t$. We will use the notation $\pi_t = (\pi_t^{(1)}, \ldots, \pi_t^{(d)})$. The wealth $X_t$ at time $t$ then satisfies the pair of equations
$$X_t = \phi_t B_t + \pi_t \cdot S_t \qquad \text{(the budget constraint)}$$
$$dX_t = \phi_t\, dB_t + \pi_t \cdot dS_t \qquad \text{(the self-financing condition)}$$

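As usual, we can use the budget constraint to solve for $\phi_t$, and plug this expression into the self-financing condition to get
$$dX_t = (X_t - \pi_t \cdot S_t)\frac{dB_t}{B_t} + \pi_t \cdot dS_t.$$
By Itô's formula, we have
$$d\left(\frac{X_t}{B_t}\right) = X_t\left(-\frac{dB_t}{B_t^2} + \frac{d\langle B \rangle_t}{B_t^3}\right) + \frac{dX_t}{B_t} - \frac{d\langle X, B \rangle_t}{B_t^2}$$
$$= \sum_{i=1}^{d} \pi_t^{(i)}\left[S_t^{(i)}\left(-\frac{dB_t}{B_t^2} + \frac{d\langle B \rangle_t}{B_t^3}\right) + \frac{dS_t^{(i)}}{B_t} - \frac{d\langle S^{(i)}, B \rangle_t}{B_t^2}\right]$$
$$= \pi_t \cdot d\left(\frac{S_t}{B_t}\right).$$
As usual, we express everything in terms of asset 0 prices
$$\tilde X_t = \frac{X_t}{B_t} \quad \text{and} \quad \tilde S_t = \frac{S_t}{B_t}$$
so that
$$\tilde X_t = \tilde X_0 + \int_0^t \pi_s \cdot d\tilde S_s.$$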
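The discrete-time analogue of this identity can be checked by hand or by machine. The sketch below assumes a single risky asset and trading at a finite list of dates, with holdings chosen at each date from the budget constraint and carried to the next date by the self-financing rule; the price numbers and the function name are made up for illustration.

```python
def discounted_wealth_path(x0, bank, stock, shares):
    """Discounted wealth of a self-financing strategy in discrete time.

    bank[k], stock[k] -- prices of asset 0 and of the risky asset at date k
    shares[k]         -- risky holding pi_k chosen at date k, held until date k+1
    """
    wealth = [x0]
    for k, pi in enumerate(shares):
        phi = (wealth[k] - pi * stock[k]) / bank[k]              # budget constraint
        wealth.append(phi * bank[k + 1] + pi * stock[k + 1])     # self-financing
    return [x / b for x, b in zip(wealth, bank)]


# Hypothetical prices and holdings.
bank = [1.0, 1.1, 1.21]
stock = [10.0, 12.1, 9.68]
shares = [2.0, -1.0]

disc_wealth = discounted_wealth_path(5.0, bank, stock, shares)
disc_stock = [s / b for s, b in zip(stock, bank)]

# Discounted wealth should equal initial wealth plus the sum of pi_k * (increment of discounted stock).
gains = disc_wealth[0] + sum(
    shares[k] * (disc_stock[k + 1] - disc_stock[k]) for k in range(len(shares))
)
print(disc_wealth[-1], gains)  # both are 10 up to rounding
```

In discrete time the identity is exact algebra; the Itô computation above is what replaces that algebra when trading is continuous.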

2. Admissible strategies

In order to make sense of the stochastic integral defining the discounted wealth, we need to impose the integrability condition
$$\int_0^t \sum_{i=1}^{d} (\pi_s^{(i)})^2\, d\langle \tilde S^{(i)} \rangle_s < \infty$$

almost surely for all $t \ge 0$. However, in moving from discrete to continuous time, we have to be careful. We will now see that this condition isn't strong enough to make our economic analysis interesting.

Example. (A doubling strategy.) This example is intended to provide motivation for restricting the class of trading strategies that we will consider in these lectures. The problem with continuous time is that events that will happen eventually can be made to happen at any finite time by speeding up the clock. In particular, we will now construct a real-valued adapted process $(\alpha_t)_{t \in [0,T]}$ such that $\int_0^T \alpha_s^2\, ds < \infty$ almost surely, but
$$\int_0^T \alpha_s\, dW_s = K$$
almost surely, where $(W_t)_{t \in [0,T]}$ is a standard scalar Brownian motion, and $T > 0$ and $K > 0$ are real constants.

Let $f : [0, T) \to \mathbb R_+$ be a strictly increasing, continuous function such that $f(0) = 0$ and $\lim_{t \to T} f(t) = \infty$. Note in particular that we assume that $f'(t) > 0$ for $t \in [0, T)$ and there exists an inverse function $f^{-1} : \mathbb R_+ \to [0, T)$ such that $f \circ f^{-1}(u) = u$. For instance, to be explicit, we may take $f(t) = \frac{t}{T - t}$ and $f^{-1}(u) = \frac{Tu}{1 + u}$. Now define a process $(Z_u)_{u \in \mathbb R_+}$ by
$$Z_u = \int_0^{f^{-1}(u)} (f'(s))^{1/2}\, dW_s.$$
Note that the quadratic variation is
$$\langle Z \rangle_u = \int_0^{f^{-1}(u)} f'(s)\, ds = f(f^{-1}(u)) - f(0) = u$$
so by Lévy's characterization $(Z_u)_{u \in \mathbb R_+}$ is a Brownian motion. Define the stopping time $\tau$ by
$$\tau = \inf\{u \ge 0 : Z_u = K\}.$$
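The explicit time change is easy to sanity-check numerically. The sketch below uses the choice $f(t) = t/(T-t)$ from the text, whose derivative is $f'(t) = T/(T-t)^2$, and approximates $\int_0^{f^{-1}(u)} f'(s)\, ds$ by a midpoint rule; the function names, the step count and the values $T = 1$, $u = 3$ are illustrative assumptions.

```python
def f(t, horizon):
    """Time change f(t) = t / (T - t) on [0, T)."""
    return t / (horizon - t)


def f_inverse(u, horizon):
    """Inverse time change f^{-1}(u) = T u / (1 + u)."""
    return horizon * u / (1.0 + u)


def f_prime(t, horizon):
    """Derivative f'(t) = T / (T - t)^2."""
    return horizon / (horizon - t) ** 2


def quadratic_variation_of_z(u, horizon, n_steps=100_000):
    """Midpoint-rule approximation of the integral of f' over [0, f^{-1}(u)]."""
    upper = f_inverse(u, horizon)
    h = upper / n_steps
    return h * sum(f_prime((k + 0.5) * h, horizon) for k in range(n_steps))


horizon = 1.0   # T, illustrative
u = 3.0         # illustrative time on the new clock
round_trip = f(f_inverse(u, horizon), horizon)
qv = quadratic_variation_of_z(u, horizon)
print(round_trip, qv)  # both should be close to u = 3
```

The clock $f$ runs through all of $\mathbb R_+$ before time $T$, which is what lets the hitting time of $K$ for $Z$ be squeezed into the finite interval $[0, T)$.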

Since $(Z_u)_{u \in \mathbb R_+}$ is a Brownian motion, we have $\tau < \infty$ almost surely.

(To see why, let $Z_u^{(n)} = \sqrt{n}\, Z_{u/n}$. One can check that the process $(Z_u^{(n)})_{u \in \mathbb R_+}$ is also a standard Brownian motion. Hence
$$P\left(\sup_{u \ge 0} Z_u > K\right) = P\left(\sup_{u \ge 0} Z_u^{(n)} > K\right) = P\left(\sup_{u \ge 0} \sqrt{n}\, Z_{u/n} > K\right) = P\left(\sup_{u \ge 0} Z_u > K/\sqrt{n}\right) \to P\left(\sup_{u \ge 0} Z_u > 0\right).$$
But it is easy to see that $P(\sup_{u \ge 0} Z_u > 0) = 1$.)

Now let
$$\alpha_s = (f'(s))^{1/2}\, 1_{\{s \le f^{-1}(\tau)\}}$$
and
$$M_t = \int_0^t \alpha_s\, dW_s.$$
Note that since
$$\int_0^T \alpha_s^2\, ds = \int_0^{f^{-1}(\tau)} f'(s)\, ds = \tau < \infty$$
the stochastic integral is well-defined. The strange fact is that $(M_t)_{t \in [0,T]}$ is a local martingale with $M_0 = 0$, but $M_T = Z_\tau = K$ almost surely.

We see that the integrand $(\alpha_s)_{s \in [0,T]}$ corresponds to a gambler starting at noon with £0, employing a doubling strategy (with borrowed money) at a quicker and quicker pace, until finally he gains £K almost surely before the clock strikes one o'clock. This situation is rather unrealistic, particularly since the gambler must go arbitrarily far into debt in order to secure the £K winning. Indeed, if such strategies were a good model for investor behaviour, we all could be much richer by just spending some time trading over the internet.

The above discussion shows that the natural integrability condition
$$\int_0^t \sum_{i=1}^{d} (\pi_s^{(i)})^2\, d\langle \tilde S^{(i)} \rangle_s < \infty \quad \text{almost surely}$$
is not really sufficient for our needs. We might choose to impose the stronger condition
$$E\left[\int_0^t \sum_{i=1}^{d} (\pi_s^{(i)})^2\, d\langle \tilde S^{(i)} \rangle_s\right] < \infty$$
for all $t \ge 0$. In this case, we would be able to use the $L^2$ theory, and in particular, the stochastic integrals against Brownian motions would be martingales. However, we usually do not impose this extra integrability condition because in finance we are often computing expectations under an equivalent measure. That is, even if $P$ and $Q$ are equivalent and $Y$ is a positive random variable such that $E^P(Y) < \infty$, it doesn't necessarily follow that $E^Q(Y) < \infty$. (Can you find an example?) So instead, we usually insist that the investor cannot go arbitrarily far into debt.

Definition. Let $(B, S)$ be a market model. A strategy $(\pi_t)_{t \in \mathbb R_+}$ is admissible if and only if there is a constant $C > 0$ such that
$$\int_0^t \pi_s \cdot d\tilde S_s > -C \quad \text{almost surely}$$
for all $t \ge 0$, where $\tilde S_t = \frac{S_t}{B_t}$.

Note that the doubling strategy is not admissible, since the investor now has only a finite credit line. However, a suicide strategy, that is, a doubling strategy in which the object is to lose a fixed amount $K$ by time $T$, is admissible.

3. Arbitrage and equivalent martingale measures

To see that our restriction to admissible strategies is reasonable, let's now consider continuous-time arbitrage theory.

Definition. An admissible strategy $(\pi_t)_{t \in \mathbb R_+}$ is an arbitrage if there is a (non-random) time $T > 0$ such that
$$P\left(\int_0^T \pi_s \cdot d\tilde S_s \ge 0\right) = 1 \quad \text{and} \quad P\left(\int_0^T \pi_s \cdot d\tilde S_s > 0\right) > 0.$$

Definition. A probability measure $Q$, locally equivalent to $P$, such that $\tilde S = (\tilde S_t)_{t \in \mathbb R_+}$ is a local martingale is called an equivalent martingale measure.

The following theorem will serve the role of the first fundamental theorem of asset pricing in continuous time.

Theorem. If there exists an equivalent martingale measure, then the market model has no arbitrage.

Remark. Note that the theorem doesn't say that no-arbitrage implies the existence of an equivalent martingale measure. Indeed, our notion of arbitrage is too strong. The 'correct' notion of arbitrage is called 'free-lunch-with-vanishing-risk,' but it is outside the scope of these lectures. See the recent book of Delbaen and Schachermayer, The Mathematics of Arbitrage, for an account of the modern theory.

Proof. Suppose $(\pi_t)_{t \in \mathbb R_+}$ is admissible and let
$$\tilde X_t = \int_0^t \pi_s \cdot d\tilde S_s.$$
By the definition, there exists a constant $C > 0$ such that $\tilde X_t > -C$ $P$-almost surely. Since $P$ and $Q$ are locally equivalent, $\tilde X_t > -C$ $Q$-almost surely as well. But under $Q$, the process $(\tilde S_t)_{t \in \mathbb R_+}$ is a local martingale, and hence so is the stochastic integral $(\tilde X_t)_{t \in \mathbb R_+}$. But a local martingale that is bounded from below is necessarily a supermartingale. (See Problem 3.5.) In particular, we have the following inequality
$$E^Q(\tilde X_t) \le \tilde X_0 = 0$$
for all $t \ge 0$.

Now suppose that there is a $T > 0$ such that $\tilde X_T \ge 0$ $P$-almost surely. Then $\tilde X_T \ge 0$ $Q$-almost surely. But since $E^Q(\tilde X_T) \le 0$, we conclude that $\tilde X_T = 0$ $Q$-almost surely, which implies $\tilde X_T = 0$ $P$-almost surely.

4. Contingent claims and market completeness

As before, given a market model $(B, S)$, we can introduce a contingent claim. Recall that a European contingent claim maturing at a time $T > 0$ is modelled as a random variable $\xi$ that is $\mathcal F_T$-measurable. Let $\mathcal Q$ be the set of equivalent martingale measures, which we shall assume is not empty. Hence, there is no arbitrage in the market.

Theorem. Let $(\xi_t)_{t \in [0,T]}$ be a process such that $\xi_T = \xi$. The augmented market $(B_t, S_t, \xi_t)_{t \in [0,T]}$ is free of arbitrage if there exists an equivalent martingale measure $Q \in \mathcal Q$ such that
$$\xi_t = E^Q\left(\frac{B_t}{B_T}\,\xi \,\Big|\, \mathcal F_t\right)$$
for all $t \in [0, T]$.

Proof. There is no arbitrage if there exists an equivalent measure $Q$ such that $(\tilde S_t, \tilde \xi_t)_{t \in [0,T]}$ is a local martingale for $Q$, where $\tilde \xi_t = \frac{\xi_t}{B_t}$. But $(\tilde \xi_t)_{t \in [0,T]}$ is a martingale by construction.

And just as before, we are interested in replicating contingent claims by trading in the market.

Definition. A (European) contingent claim with payout $\xi_T$ maturing at time $T > 0$ is attainable if and only if there exists a constant $x \in \mathbb R$ and an admissible strategy $(\pi_t)_{t \in [0,T]}$ such that
$$\tilde \xi_T = x + \int_0^T \pi_s \cdot d\tilde S_s$$
where $\tilde \xi_T = \frac{\xi_T}{B_T}$.

A market is complete if for every bounded $\mathcal F_T$-measurable random variable $\tilde \xi_T$, there exist $x$ and $\pi$ such that
$$\tilde X_t = x + \int_0^t \pi_s \cdot d\tilde S_s$$
defines a bounded discounted wealth process $\tilde X$ such that $\tilde X_T = \tilde \xi_T$.

The following is a version of the second fundamental theorem of asset pricing for this continuous time setting.

Theorem. Suppose that there exists an equivalent martingale measure $Q$ and the market is complete. Then $Q$ is the unique equivalent martingale measure.

Proof. Let $Q'$ be another equivalent martingale measure and let $\tilde \xi_T$ be an arbitrary bounded $\mathcal F_T$-measurable random variable. By assumption $\tilde \xi_T$ is attainable by a wealth process $\tilde X$, and since $\tilde X$ is bounded, it is a martingale for both $Q$ and $Q'$. Hence
$$E^Q(\tilde \xi_T) = E^Q(\tilde X_T) = x = E^{Q'}(\tilde \xi_T).$$
Hence, for all bounded random variables $\tilde \xi_T$, the inequality $E^P[(Z_T - Z'_T)\tilde \xi_T] \ge 0$ holds, where $Z_T = \frac{dQ_T}{dP_T}$ and $Z'_T = \frac{dQ'_T}{dP_T}$. Letting $\tilde \xi_T = 1_{\{Z_T < Z'_T\}}$ shows that $P(Z_T \ge Z'_T) = 1$. Symmetry completes the argument and shows $Z_T = Z'_T$ a.s.

At this stage we're a bit too general to say anything interesting. Hence, it is wise to introduce more notation to flesh out the story...

5. The set-up revisited

In this section, we revisit the set-up of the continuous time market model $(B, S)$. We will make one extra assumption, which is standard, but not really necessary:

Assumption. Asset 0 is risk-free in the sense that $\langle B \rangle_t = 0$ almost surely.

The point of this section is to set notation that will be used for the remaining lectures, and to interpret the existence of an equivalent martingale measure via Girsanov's theorem.

Asset 0, which we now assume is risk-free, will be interpreted as a bank account. Its dynamics are given by
$$dB_t = r_t B_t\, dt$$
for an adapted process $(r_t)_{t \ge 0}$ which can be interpreted as the spot interest rate. The above random ordinary differential equation has the solution
$$B_t = B_0\, e^{\int_0^t r_s\, ds}.$$
Assets $1, \ldots, d$ are assumed to have dynamics given by
$$dS_t^{(i)} = S_t^{(i)}\left(\mu_t^{(i)}\, dt + \sum_{j=1}^{n} \sigma_t^{(i,j)}\, dW_t^{(j)}\right)$$
for some adapted processes $(\mu_t^{(i)})_{t \in \mathbb R_+}$ and $(\sigma_t^{(i,j)})_{t \in \mathbb R_+}$, and independent Brownian motions $(W_t^{(j)})_{t \in \mathbb R_+}$. The random variable $\mu_t^{(i)}$ can be considered the instantaneous drift of the return on stock $i$, while the random variable
$$\left(\sum_{j=1}^{n} (\sigma_t^{(i,j)})^2\right)^{1/2}$$
is its instantaneous volatility. In vector notation, these equations can be written as
$$dS_t = \operatorname{diag}(S_t)(\mu_t\, dt + \sigma_t\, dW_t)$$
where $S_t = (S_t^{(1)}, \ldots, S_t^{(d)})$, and the $\mathbb R^d$-valued random variable $\mu_t$, the $\mathbb R^n$-valued random variable $W_t$, and the $d \times n$ matrix-valued random variable $\sigma_t$ are defined similarly. Here we are using the notation
$$\operatorname{diag}(s_1, \ldots, s_d) = \begin{pmatrix} s_1 & 0 & \cdots & 0 \\ 0 & s_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & s_d \end{pmatrix}.$$
The dynamics of the discounted stock price are given by
$$d\tilde S_t = \operatorname{diag}(\tilde S_t)((\mu_t - r_t \mathbf 1)\, dt + \sigma_t\, dW_t)$$
where $\mathbf 1 = (1, \ldots, 1) \in \mathbb R^d$.

Now we come to the two theorems of this section: The first is a reasonably easy-to-check sufficient condition to ensure no-arbitrage.

Theorem. Suppose there exists a predictable process $\lambda$ such that
$$\sigma_t \lambda_t = \mu_t - r_t \mathbf 1$$
almost surely for all $t \ge 0$. If the process $\lambda$ is such that
$$Z_t = e^{-\frac{1}{2}\int_0^t |\lambda_s|^2\, ds - \int_0^t \lambda_s \cdot dW_s}$$
defines a martingale (for instance if Novikov's criterion
$$E^P\left(e^{\frac{1}{2}\int_0^t |\lambda_s|^2\, ds}\right) < \infty$$
holds for all $t \ge 0$), then the market has no arbitrage.

Remark. The process $\lambda = (\lambda_t)_{t \in \mathbb R_+}$ is often called the market price of risk. The $n$-dimensional random vector $\lambda_t$ is a generalization of the Sharpe ratio, since it measures in some sense the excess return of the stocks per unit of volatility.

Proof. By Girsanov's theorem, the locally equivalent measure $Q$ whose density process is $Z$ is such that the process $\hat W$ given by
$$\hat W_t = W_t + \int_0^t \lambda_s\, ds$$
is a Brownian motion for $Q$. Notice then that the dynamics of the discounted stock price are given by
$$d\tilde S_t = \operatorname{diag}(\tilde S_t)\sigma_t\, d\hat W_t,$$
so that $\tilde S$ is a local martingale for $Q$. Hence $Q$ is an equivalent martingale measure and there is no arbitrage in this market.

The following theorem is a sufficient condition for completeness:

Theorem. Take as given the hypotheses of the previous theorem, and further suppose $n = d$ and $\sigma_t(\omega)$ is invertible for all $(t, \omega)$, so that $\lambda_t = \sigma_t^{-1}(\mu_t - r_t \mathbf 1)$. Let $d\hat W_t = dW_t + \lambda_t\, dt$. If the process $\hat W$ generates the filtration, then the market is complete.

Proof. Let $Q$ be the equivalent martingale measure in the proof of the previous theorem. Fix a bounded $\mathcal F_T$-measurable random variable $\tilde \xi_T$, and let
$$\tilde \xi_t = E^Q(\tilde \xi_T \mid \mathcal F_t),$$
so that $(\tilde \xi_t)_{t \in [0,T]}$ is a bounded $Q$-martingale. Since $\hat W$ is a Brownian motion under $Q$, and generates the filtration, the martingale representation theorem asserts the existence of an adapted process $(\alpha_t)_{t \in [0,T]}$ such that
$$\tilde \xi_t = E^Q(\tilde \xi_T) + \int_0^t \alpha_s \cdot d\hat W_s.$$
On the other hand, we can write the discounted stock dynamics as
$$d\tilde S_t = \operatorname{diag}(\tilde S_t)\sigma_t\, d\hat W_t.$$
By taking $x = E^Q(\tilde \xi_T)$ and $\pi_t = \operatorname{diag}(\tilde S_t)^{-1}(\sigma_t^T)^{-1}\alpha_t$, we see that
$$\tilde \xi_t = x + \int_0^t \pi_s \cdot d\tilde S_s,$$

and hence the claim with payout $\xi_T = X_T$ is attainable, and the discounted wealth process $\tilde X = \tilde \xi$ is bounded. Hence, the market is complete.

If we consider the equation
$$\sigma_t \lambda_t = \mu_t - r_t \mathbf 1$$
where $\sigma_t$ is a $d \times n$ matrix, one expects from the rules of linear algebra for there to be no solution if $n < d$, exactly one solution if $n = d$, and many solutions if $n > d$. Financially, the rule of thumb becomes:
n < d '⇒' The market has arbitrage.
n = d '⇒' The market has no arbitrage and is complete.
n > d '⇒' The market has no arbitrage and is incomplete.
Of course, this is not a theorem, just a rule of thumb.

We now have a sufficient condition that the market model is complete. However, at this stage we can only assert the existence of a replicating strategy for a given claim, but we do not yet know how to actually compute it. This problem is the subject of the next section.

6. Markovian markets

Now that we have our two main structural theorems in the context of a market with continuous asset prices, we are still left with the question: How do you price and hedge contingent claims? The first step is to pose a model for the asset prices $(B_t, S_t)_{t \in \mathbb R_+}$. A good model should give a reasonable statistical fit to the actual market data. See figure 1 below.

Figure 1. Graph of the Standard & Poor's 500 stock index 1950-2008. Data taken from http://finance.yahoo.com

Furthermore, a useful model is one in which the prices and hedges of contingent claims can be computed reasonably easily. In this section, we will study models in which the asset

prices are Markov processes. These models are useful in the above sense, though there seems to be some controversy over how well they fit actual market data.

Now suppose that the $d + 1$ assets have Itô dynamics which can be expressed as
$$dB_t = B_t\, r(t, S_t)\, dt$$
$$dS_t = \operatorname{diag}(S_t)(\mu(t, S_t)\, dt + \sigma(t, S_t)\, dW_t)$$
where the nonrandom functions $r : \mathbb R_+ \times \mathbb R^d \to \mathbb R$, $\mu : \mathbb R_+ \times \mathbb R^d \to \mathbb R^d$ and $\sigma : \mathbb R_+ \times \mathbb R^d \to \mathbb R^{d \times n}$ are given. Notice that this is a special case of the set-up of the last section, as now $r_t(\omega) = r(t, S_t(\omega))$, $\mu_t(\omega) = \mu(t, S_t(\omega))$, and $\sigma_t(\omega) = \sigma(t, S_t(\omega))$. In this special situation, the asset prices $(S_t)_{t \in \mathbb R_+}$ are a $d$-dimensional Markov process. Assume there exists an equivalent martingale measure.

The next theorem says how to find the no-arbitrage price and the replicating strategy for a contingent claim maturing at time $T$ with payout $\xi = g(S_T)$ for some non-random function $g : \mathbb R^d \to \mathbb R$.

Theorem. Suppose the function $V : [0, T] \times \mathbb R^d \to \mathbb R$ satisfies the partial differential equation
$$\frac{\partial V}{\partial t} + \sum_{i=1}^{d} r S^i \frac{\partial V}{\partial S^i} + \frac{1}{2}\sum_{i=1}^{d}\sum_{j=1}^{d} a^{i,j} S^i S^j \frac{\partial^2 V}{\partial S^i \partial S^j} = rV, \qquad (t, S) \in [0, T) \times \mathbb R^d$$
$$V(T, S) = g(S)$$
where $a = \sigma\sigma^T$, and where all functions in the PDE are evaluated at the same point $(t, S)$. If $\xi_t = V(t, S_t)$ then there is no arbitrage in the augmented market $(B_t, S_t, \xi_t)_{t \in [0,T]}$. Furthermore, if $V$ is non-negative, then the claim with payout $\xi = g(S_T)$ is attainable with initial capital $\xi_0 = V(0, S_0)$ and the replicating strategy
$$\pi_t = \operatorname{grad} V(t, S_t) = \left(\frac{\partial V}{\partial S^1}(t, S_t), \ldots, \frac{\partial V}{\partial S^d}(t, S_t)\right).$$

Remark. The above theorem says that if the market model is Markovian, the price of a claim contingent on the future risky asset prices can be written as a deterministic function $V$ of the current market prices. Furthermore, the pricing function $V$ can be found by solving a certain linear parabolic partial differential equation (sometimes called the Feynman–Kac PDE; if $r = 0$, the PDE reduces to the (backward) Kolmogorov equation) with terminal data to match the payout of the claim. Solving this equation may be difficult to do by hand, but it can usually be done by computer if the dimension $d$ is reasonably small. And most importantly for the banker selling such a contingent claim: the replicating portfolio $\pi_t$ can be calculated as the

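gradient of the pricing function $V$ with respect to the spatial variables, evaluated at time $t$ and current price $S_t$.

Proof. Let $\mathbb{Q}$ be an equivalent martingale measure, i.e. a measure under which $\tilde S$ is a local martingale. By Itô's formula, we have
$$\begin{aligned}
d\tilde \xi_t &= d\left( \frac{V(t, S_t)}{B_t} \right) \\
&= \frac{1}{B_t} \left[ \frac{\partial V}{\partial t}\, dt + \sum_{i=1}^d \frac{\partial V}{\partial S^i}\, dS_t^i + \frac{1}{2} \sum_{i=1}^d \sum_{j=1}^d \frac{\partial^2 V}{\partial S^i \partial S^j}\, d\langle S^i, S^j \rangle_t \right] - \frac{V(t, S_t)}{B_t^2}\, dB_t \\
&= \frac{1}{B_t} \left[ \frac{\partial V}{\partial t} + \sum_{i=1}^d \frac{\partial V}{\partial S^i} S_t^i \mu^i + \frac{1}{2} \sum_{i=1}^d \sum_{j=1}^d \sum_{k=1}^n \sigma^{ik} \sigma^{jk} S_t^i S_t^j \frac{\partial^2 V}{\partial S^i \partial S^j} - rV \right] dt + \frac{1}{B_t} \sum_{i=1}^d \sum_{j=1}^n \frac{\partial V}{\partial S^i} S_t^i \sigma^{ij}\, dW_t^j \\
&= \sum_{i=1}^d \frac{\partial V}{\partial S^i} \frac{S_t^i}{B_t} \left[ (\mu^i - r)\, dt + \sum_{j=1}^n \sigma^{ij}\, dW_t^j \right] \\
&= \mathrm{grad}\, V \cdot d\tilde S_t,
\end{aligned}$$
where the PDE was used to pass from the third line to the fourth. Hence $(\tilde \xi_t)_{t \in [0,T]}$ is a local martingale for $\mathbb{Q}$. Furthermore, if $V$ is non-negative, $\xi$ can be attained by the announced admissible trading strategy $\pi_t = \mathrm{grad}\, V(t, S_t)$.

7. The Black–Scholes model, PDE, and formula

In this section, we will consider the simplest possible Markovian model of the type studied in section 6. Consider the case of a market with two assets. We will assume that all coefficients are constant, so the price dynamics are given by the pair of equations
$$dB_t = B_t\, r\, dt$$
$$dS_t = S_t (\mu\, dt + \sigma\, dW_t)$$
for real constants $r, \mu, \sigma$ where $\sigma > 0$. This is often called the Black–Scholes model.

We've seen before that since $\lambda = (\mu - r)/\sigma$ is constant, the market has no arbitrage. In particular, there exists a locally equivalent measure $\mathbb{Q}$ such that the process defined by $\hat W_t = W_t + \lambda t$ is a Brownian motion. Furthermore, if the filtration $(\mathcal{F}_t)_{t \in \mathbb{R}_+}$ is the natural filtration of the asset prices $(B_t, S_t)_{t \in \mathbb{R}_+}$, the model is complete.

We are interested in pricing and hedging a European contingent claim with payout $\xi_T = g(S_T)$. As we've seen, there are two ways of doing this.

7.1. Pricing by expectations. From Section 4, there is no arbitrage if
$$\xi_t = \mathbb{E}^{\mathbb{Q}}\left[ \frac{B_t}{B_T}\, g(S_T) \,\Big|\, \mathcal{F}_t \right]$$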

assuming the expectation exists. In this simple case, the prices of traded assets can be written explicitly:
$$B_t = B_0 e^{rt} \quad \text{and} \quad S_t = S_0 e^{(\mu - \sigma^2/2)t + \sigma W_t} = S_0 e^{(r - \sigma^2/2)t + \sigma \hat W_t}$$
and hence
$$\begin{aligned}
\xi_t &= \mathbb{E}^{\mathbb{Q}}\left[ e^{-r(T-t)}\, g\big( S_0 e^{(r - \sigma^2/2)T + \sigma \hat W_T} \big) \,\big|\, \mathcal{F}_t \right] \\
&= \mathbb{E}^{\mathbb{Q}}\left[ e^{-r(T-t)}\, g\big( S_t e^{(r - \sigma^2/2)(T-t) + \sigma(\hat W_T - \hat W_t)} \big) \,\big|\, \mathcal{F}_t \right] \\
&= e^{-r(T-t)} \int_{-\infty}^{\infty} g\big( S_t e^{(r - \sigma^2/2)(T-t) + \sigma \sqrt{T-t}\, z} \big) \frac{e^{-z^2/2}}{\sqrt{2\pi}}\, dz.
\end{aligned}$$
In section 5, we argued that, assuming the filtration is generated by $S$, the martingale representation theorem asserts the existence of a process $\pi$ such that
$$e^{-rT} g(S_T) = \mathbb{E}^{\mathbb{Q}}[e^{-rT} g(S_T)] + \int_0^T \pi_s\, d\tilde S_s.$$
Unfortunately, we do not know how to compute $\pi$.

7.2. Pricing and hedging by PDE. From the general results of section 6, we can solve the Black–Scholes PDE
$$\frac{\partial V}{\partial t} + rS \frac{\partial V}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} = rV$$
$$V(T, S) = g(S).$$
If $\xi_t = V(t, S_t)$ then $\xi_T = g(S_T)$ and $(\tilde \xi_t)_{t \in [0,T]}$ is a local martingale, hence $\xi_t$ is a no-arbitrage price for the claim. Furthermore, assuming $V$ is bounded from below, we see that $\pi_t = \frac{\partial V}{\partial S}(t, S_t)$ is an admissible replicating portfolio:
$$e^{-rT} g(S_T) = V(0, S_0) + \int_0^T \frac{\partial V}{\partial S}(s, S_s)\, d\tilde S_s.$$
Because of its central importance in the theory, the quantity $\frac{\partial V}{\partial S}$ is given a special name, the delta, of the claim (see footnote 3).

Footnote 3. The name delta is inspired by the notation of the original Black–Scholes paper. In finance, there is a whole list of quantities, delta, gamma, theta, vega, etc., which describe the sensitivities of a pricing formula with respect to some of its parameters. These quantities are collectively known as the greeks.

Remark. In nearly all cases of interest, the two approaches yield the same answer. Here is a sufficient condition: Suppose there exist positive constants $C$ and $p$ such that
$$|V(t, S)| \le C(1 + S^p) \quad \text{for all } (t, S) \in [0, T] \times \mathbb{R}_+,$$
then
$$V(t, S_t) = \mathbb{E}^{\mathbb{Q}}[e^{-r(T-t)} g(S_T) \,|\, \mathcal{F}_t].$$
It is worth pointing out, however, that the above sufficient condition is specific to the Black–Scholes model.
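The integral representation of section 7.1 can be evaluated numerically for a given payout. The following is a minimal sketch, not part of the original notes: it applies the trapezoidal rule on a truncated range of $z$, and the function name `price_by_expectation`, the truncation level `z_max`, and the parameter values are all illustrative assumptions.

```python
import math

def price_by_expectation(g, s, r, sigma, tau, n_steps=4000, z_max=8.0):
    """Approximate e^{-r tau} E[g(s exp((r - sigma^2/2) tau + sigma sqrt(tau) Z))]
    for Z ~ N(0, 1), using the trapezoidal rule on [-z_max, z_max].

    s is the current stock price S_t and tau = T - t is the time to maturity.
    The truncation at z_max is an assumption: it is harmless only when the
    payout g grows at most polynomially.
    """
    h = 2.0 * z_max / n_steps
    drift = (r - 0.5 * sigma ** 2) * tau
    vol = sigma * math.sqrt(tau)
    total = 0.0
    for i in range(n_steps + 1):
        z = -z_max + i * h
        weight = 0.5 if i in (0, n_steps) else 1.0  # trapezoid end-point weights
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * g(s * math.exp(drift + vol * z)) * density
    return math.exp(-r * tau) * total * h

# Hypothetical example: a call struck at 100 on a stock currently worth 100.
call_payout = lambda x: max(x - 100.0, 0.0)
call_price = price_by_expectation(call_payout, s=100.0, r=0.05, sigma=0.2, tau=1.0)
```

Taking `g` to be the identity should return the current stock price, which is a quick sanity check that the discounted stock is a martingale under the pricing measure.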

We can easily apply this result to a specific payout function $g$. For instance, introduce a European call option maturing at $T$ with strike $K$. The payout is $(S_T - K)^+$. There is no arbitrage if the time $t$ price $C_t(T, K)$ of the call is given by
$$C_t(T, K) = \mathbb{E}^{\mathbb{Q}}[e^{-r(T-t)} (S_T - K)^+ \,|\, \mathcal{F}_t]$$
yielding the Nobel-prize-winning Black–Scholes formula:
$$C_t(T, K) = S_t\, \Phi\left( \frac{\log(S_t/K)}{\sigma \sqrt{T-t}} + (r/\sigma + \sigma/2) \sqrt{T-t} \right) - e^{-r(T-t)} K\, \Phi\left( \frac{\log(S_t/K)}{\sigma \sqrt{T-t}} + (r/\sigma - \sigma/2) \sqrt{T-t} \right)$$
where $\Phi(x) = \int_{-\infty}^x \frac{1}{\sqrt{2\pi}} e^{-y^2/2}\, dy$ is the standard normal distribution function. Notice that in this case, the hedging portfolio, that is, the delta, is given by
$$\pi_t = \Phi\left( \frac{\log(S_t/K)}{\sigma \sqrt{T-t}} + (r/\sigma + \sigma/2) \sqrt{T-t} \right).$$

What made this formula so popular after its publication in 1973 is the fact that the right-hand side depends only on six quantities: the current calendar time $t$, the underlying stock's price $S_t$ at time $t$, the option's maturity time $T$, the option's strike $K$, the spot interest rate $r$, and a volatility parameter $\sigma$. Of these six numbers, only the volatility parameter is neither specified by the option contract nor quoted in the market.

7.3. Volatility estimation. To use the Black–Scholes formula to find the price of real call options, one must first estimate the volatility $\sigma$. One possibility is to collect $n + 1$ historical price observations of $S$ at times $t_0 < \ldots < t_n$. Since the stock price is of the form $S_t = S_0 e^{\nu t + \sigma W_t}$, the random variables $Y_i = \log \frac{S_{t_i}}{S_{t_{i-1}}}$ are independent with distribution
$$Y_i = (\mu - \sigma^2/2)(t_i - t_{i-1}) + \sigma(W_{t_i} - W_{t_{i-1}}) \sim N(\nu \Delta t_i, \sigma^2 \Delta t_i)$$
where $\nu = \mu - \sigma^2/2$ and $\Delta t_i = t_i - t_{i-1}$. The maximum likelihood estimators are then
$$\hat \nu = \frac{1}{t_n - t_0} \sum_{i=1}^n Y_i \quad \text{and} \quad \hat \sigma^2 = \frac{1}{n} \sum_{i=1}^n \frac{(Y_i - \hat \nu \Delta t_i)^2}{\Delta t_i}.$$
If one was to truly believe that the stock price was a geometric Brownian motion, then one could insert the value $\hat \sigma^2$ into the Black–Scholes formula to obtain the price of a call option. Notice that we have done the statistics under the objective measure $\mathbb{P}$, not the equivalent martingale measure $\mathbb{Q}$.
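As an illustration of sections 7.2 and 7.3, here is a short sketch of the Black–Scholes call price and delta together with the maximum likelihood estimators. The function names `bs_call` and `mle_drift_vol` and the sample inputs are assumptions made for this example only.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function Phi, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s, k, r, sigma, tau):
    """Black-Scholes call price and delta for stock price s, strike k,
    interest rate r, volatility sigma > 0 and time to maturity tau = T - t > 0."""
    root = math.sqrt(tau)
    d_plus = math.log(s / k) / (sigma * root) + (r / sigma + sigma / 2.0) * root
    d_minus = math.log(s / k) / (sigma * root) + (r / sigma - sigma / 2.0) * root
    price = s * norm_cdf(d_plus) - math.exp(-r * tau) * k * norm_cdf(d_minus)
    delta = norm_cdf(d_plus)
    return price, delta

def mle_drift_vol(times, prices):
    """Maximum likelihood estimates (nu_hat, sigma2_hat) from the n + 1
    observations prices[i] = S_{t_i} made at the increasing times[i] = t_i."""
    n = len(prices) - 1
    ys = [math.log(prices[i] / prices[i - 1]) for i in range(1, n + 1)]
    dts = [times[i] - times[i - 1] for i in range(1, n + 1)]
    nu_hat = sum(ys) / (times[-1] - times[0])
    sigma2_hat = sum((y - nu_hat * dt) ** 2 / dt for y, dt in zip(ys, dts)) / n
    return nu_hat, sigma2_hat

# Hypothetical at-the-money example: the price is about 10.45 and the delta about 0.64.
price, delta = bs_call(s=100.0, k=100.0, r=0.05, sigma=0.2, tau=1.0)
```

Note that `mle_drift_vol` returns an estimate of $\sigma^2$, so its square root is what would be passed to `bs_call` as `sigma`.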

7.4. Implied volatility. A completely different approach to find the volatility parameter is to observe the price $C_t(T, K)$ from the market, and then try to work out which $\sigma$ to put into the Black–Scholes formula to get the right price.

Let the nonrandom function $BS : \mathbb{R}_+ \times \mathbb{R}_+ \to [0, 1)$ be defined by
$$BS(v, m) = \begin{cases} \Phi\left( -\dfrac{\log m}{\sqrt{v}} + \dfrac{\sqrt{v}}{2} \right) - m\, \Phi\left( -\dfrac{\log m}{\sqrt{v}} - \dfrac{\sqrt{v}}{2} \right) & \text{if } v > 0 \\ (1 - m)^+ & \text{if } v = 0. \end{cases}$$
Then the Black–Scholes formula says that in the context of a Black–Scholes model the call price is given by
$$C_t(T, K) = S_t\, BS\left( (T - t)\sigma^2,\ \frac{K e^{-r(T-t)}}{S_t} \right).$$
The quantity $\frac{K e^{-r(T-t)}}{S_t}$ appearing in the above formula is often called the moneyness of the option.

Now notice that $v \mapsto BS(v, m)$ is strictly increasing and continuous since
$$\frac{\partial BS}{\partial v}(v, m) = \frac{1}{2\sqrt{v}}\, \phi\left( -\frac{\log m}{\sqrt{v}} + \frac{\sqrt{v}}{2} \right) > 0$$
where $\phi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ is the standard normal density. Hence for fixed $m$, the function $v \mapsto BS(v, m)$ can be inverted. So, given the market price $C_t(T, K)$ of the option, we can find a number $\Sigma_t(T, K)$, called the implied volatility of the option, such that
$$C_t(T, K) = S_t\, BS\left( (T - t)\Sigma_t(T, K)^2,\ \frac{K e^{-r(T-t)}}{S_t} \right).$$

If the market was still pricing call options by the Black–Scholes formula, then there would exist one parameter $\sigma$ such that $\Sigma_t(T, K) = \sigma$ for all $0 \le t < T$ and $K > 0$. However, in real markets, it is usually the case that the implied volatility surface $(T, K) \mapsto \Sigma_t(T, K)$ is not flat. One could either conclude that the Black–Scholes model is the true model of the stock price and that the market is mispricing options, or that the Black–Scholes model does not quite match reality. The second approach is more prudent.

So, why even consider implied volatility? As Rebonato famously put it: Implied volatility is the wrong number to put into the wrong formula to obtain the correct price. However, thanks to the enormous influence of the Black–Scholes theory, the implied volatility is now used as a common language to quote option prices.

8. Local and stochastic volatility models

In section 7, we have considered the Black–Scholes model: a two asset market model in which the risky asset price is a geometric Brownian motion. The Black–Scholes formula gives an explicit representation of the prices $C_t(T, K)$ of call options in this model in terms of the calendar time $t$, the current stock price $S_t$, the option maturity $T$ and strike $K$, spot interest rate $r$, and a volatility parameter $\sigma$.

However, since the implied volatility surface $\Sigma_t(T, K)$ of real-world option prices is usually not flat, practitioners and researchers have proposed various generalizations of the Black–Scholes model to better match the observed implied volatility surface.

We now consider another Markovian model which can match a given implied volatility surface exactly. That is, the idea is to replace the constant volatility parameter in the Black–Scholes model with a local volatility function $\sigma : \mathbb{R}_+ \times \mathbb{R}_+ \to \mathbb{R}_+$. We consider a model given by
$$dB_t = B_t\, r\, dt$$
$$dS_t = S_t (\mu\, dt + \sigma(t, S_t)\, dW_t).$$
We will assume that $\sigma$ is smooth and bounded from below and above. As always, let $\mathbb{Q}$ be the equivalent martingale measure with density process defined by
$$dZ_t = -Z_t \lambda_t\, dW_t$$
where $\lambda_t = (\mu - r)/\sigma(t, S_t)$.

The next theorem in the present context is usually attributed to Dupire's 1994 paper. However, the existence of a Markovian martingale with a given marginal distribution was proven in 1972 by Kellerer.

Theorem. Suppose that
$$C_0(T, K) = \mathbb{E}^{\mathbb{Q}}[e^{-rT} (S_T - K)^+].$$
Then
$$\frac{\partial C_0}{\partial T}(T, K) = -rK \frac{\partial C_0}{\partial K}(T, K) + \frac{\sigma(T, K)^2}{2} K^2 \frac{\partial^2 C_0}{\partial K^2}(T, K).$$

Remark. The point of the above theorem is this: Suppose that today's call price surface $\{C_0(T, K) : T > 0, K > 0\}$ is observed from the market. If one chooses the local volatility function $\sigma$ by Dupire's formula
$$\sigma(T, K) = \left( \frac{2\left[ \frac{\partial C_0}{\partial T}(T, K) + rK \frac{\partial C_0}{\partial K}(T, K) \right]}{K^2 \frac{\partial^2 C_0}{\partial K^2}(T, K)} \right)^{1/2}$$
then the no-arbitrage prices of call options in this model exactly match the observed prices. In general, the local volatility surface need not be flat. Of course, if $C_0(T, K) = S_0\, BS(\sigma_0^2 T, K e^{-rT}/S_0)$ for some constant $\sigma_0$ then
$$\left( \frac{2\left[ \frac{\partial C_0}{\partial T}(T, K) + rK \frac{\partial C_0}{\partial K}(T, K) \right]}{K^2 \frac{\partial^2 C_0}{\partial K^2}(T, K)} \right)^{1/2} = \sigma_0.$$

Proof. It can be shown that for all $t > 0$, the random variable $S_t$ has a continuous density with respect to Lebesgue measure, that is, there exists a continuous function $f_{S_t} : \mathbb{R}_+ \to \mathbb{R}_+$ such that
$$\mathbb{Q}(S_t \le x) = \int_0^x f_{S_t}(y)\, dy.$$
Hence,
$$C_0(T, K) = e^{-rT} \int_0^\infty f_{S_T}(y)(y - K)^+\, dy = e^{-rT} \int_K^\infty f_{S_T}(y)(y - K)\, dy.$$

The fundamental theorem of calculus then implies
$$\frac{\partial C_0}{\partial K}(T, K) = -e^{-rT} \int_K^\infty f_{S_T}(y)\, dy$$
$$\frac{\partial^2 C_0}{\partial K^2}(T, K) = e^{-rT} f_{S_T}(K).$$
[The second equality shows that the risk-neutral density of the stock price at time $T$ can be recovered from the prices of the calls of maturity $T$. This result was proven by Breeden and Litzenberger in 1978.]

To outline the argument, we proceed formally:
$$\begin{aligned}
(S_T - K)^+ &= (S_0 - K)^+ + \int_0^T \mathbf{1}_{\{S_t \ge K\}}\, dS_t + \frac{1}{2} \int_0^T \delta_K(S_t)\, d\langle S \rangle_t \\
&= (S_0 - K)^+ + \int_0^T \left( \mathbf{1}_{\{S_t \ge K\}} S_t\, r + \frac{1}{2} \delta_K(S_t) S_t^2 \sigma(t, S_t)^2 \right) dt + \int_0^T \mathbf{1}_{\{S_t \ge K\}} S_t \sigma(t, S_t)\, d\hat W_t
\end{aligned}$$
where we have appealed to Itô's formula (see footnote 4) with $g(x) = (x - K)^+$, $g'(x) = \mathbf{1}_{[K, \infty)}(x)$, and $g''(x) = \delta_K(x)$, the Dirac delta 'function'.

Footnote 4. A version of Itô's formula for non-smooth convex functions, called Tanaka's formula, can actually be rigorously stated in terms of a quantity called local time.

Now computing expected values of both sides
$$e^{rT} C_0(T, K) = (S_0 - K)^+ + \int_0^T \int_K^\infty f_{S_t}(y)\, y\, r\, dy\, dt + \frac{1}{2} \int_0^T f_{S_t}(K) K^2 \sigma(t, K)^2\, dt \qquad (1)$$
and then differentiating both sides with respect to $T$ yields
$$e^{rT} \left[ \frac{\partial C_0}{\partial T}(T, K) + r C_0(T, K) \right] = \int_K^\infty f_{S_T}(y)\, y\, r\, dy + \frac{1}{2} f_{S_T}(K) K^2 \sigma(T, K)^2$$
and the result follows from noting
$$\int_K^\infty f_{S_T}(y)\, y\, dy = \int_0^\infty f_{S_T}(y)(y - K)^+\, dy + K \int_K^\infty f_{S_T}(y)\, dy$$
and applying the appropriate identities.

Local volatility models are not the end of the story. It turns out that although a local volatility model can be made to exactly match all European call option prices, generally the model fails to correctly price path-dependent options. Therefore, other models have been proposed. These so-called stochastic volatility models are of the form
$$dB_t = B_t\, r\, dt$$
$$dS_t = S_t (r\, dt + \sigma_t\, d\hat W_t)$$
where $(\sigma_t)_{t \in \mathbb{R}_+}$ is a given stochastic process. Here are a few popular ones:
• Local volatility model, $\sigma_t = \sigma(t, S_t)$
• CEV model, $\sigma_t = \gamma S_t^{\beta - 1}$
• Heston model, $d\sigma_t^2 = \lambda(\bar \sigma^2 - \sigma_t^2)\, dt + \gamma \sigma_t\, dZ_t$

• GARCH model, $d\sigma_t^2 = \lambda(\bar \sigma^2 - \sigma_t^2)\, dt + \gamma \sigma_t^2\, dZ_t$
• SABR model, $\sigma_t = \gamma_t S_t^{\beta - 1}$ and $d\gamma_t = \alpha \gamma_t\, dZ_t$
where $Z_t = \rho \hat W_t + \sqrt{1 - \rho^2}\, W_t^\perp$ for an independent Brownian motion $(W_t^\perp)_{t \in \mathbb{R}_+}$ for $\mathbb{Q}$.

As the economic notion of no-arbitrage is too weak to pin down the precise functional form of a stochastic volatility model, a practitioner's choice of model must be made on a combination of issues: how well the model fits data, how easy the model is to calibrate, how quickly a computer can calculate exotic option prices with the model, etc. Notice, however, that aside from the local volatility model (including CEV), stochastic volatility models are incomplete since there are more Brownian motions than risky assets.
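The two inversions discussed in sections 7.4 and 8 can be sketched numerically. The code below is an illustration only: it inverts the Black–Scholes formula by bisection, and evaluates Dupire's formula with central finite differences on a call price surface. The function names, the bisection bracket `hi`, and the step sizes `dt` and `dk` are assumptions chosen for this example, not prescriptions from the notes.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_normalised(v, m):
    """The function BS(v, m) for total variance v >= 0 and moneyness m > 0."""
    if v == 0.0:
        return max(1.0 - m, 0.0)
    root = math.sqrt(v)
    return (norm_cdf(-math.log(m) / root + root / 2.0)
            - m * norm_cdf(-math.log(m) / root - root / 2.0))

def call_price(s0, k, r, sigma, t):
    """Time-0 Black-Scholes price of a call with maturity t and strike k."""
    return s0 * bs_normalised(sigma ** 2 * t, k * math.exp(-r * t) / s0)

def implied_vol(price, s0, k, r, t, hi=5.0, tol=1e-10):
    """Invert sigma -> call_price by bisection, using strict monotonicity.
    Assumes the implied volatility lies in [0, hi]."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if call_price(s0, k, r, mid, t) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def dupire_local_vol(c0, t, k, r, dt=1e-3, dk=1e-1):
    """Dupire's formula applied to a call surface c0(T, K), with the partial
    derivatives replaced by central finite differences."""
    c_t = (c0(t + dt, k) - c0(t - dt, k)) / (2.0 * dt)
    c_k = (c0(t, k + dk) - c0(t, k - dk)) / (2.0 * dk)
    c_kk = (c0(t, k + dk) - 2.0 * c0(t, k) + c0(t, k - dk)) / dk ** 2
    return math.sqrt(2.0 * (c_t + r * k * c_k) / (k ** 2 * c_kk))

# Hypothetical flat surface generated by a Black-Scholes model with sigma_0 = 0.2.
flat_surface = lambda t, k: call_price(100.0, k, 0.05, 0.2, t)
sigma_implied = implied_vol(flat_surface(1.0, 110.0), 100.0, 110.0, 0.05, 1.0)
sigma_local = dupire_local_vol(flat_surface, 1.0, 110.0, 0.05)
```

On the flat surface both numbers should be close to the constant 0.2, in line with the remark following Dupire's formula. With real market data the finite differences would be applied to an interpolated price surface, which is a delicate step that this sketch does not address.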

CHAPTER 5
Interest rate models

1. Bond prices and interest rates

In this last chapter, we explore models for the interest rate term structure. The basic financial instruments in this setting are the zero-coupon bonds.

Definition. A (zero-coupon) bond with maturity $T$ is a European contingent claim that pays exactly (see footnote 1) one unit of currency at time $T$. We denote by $P(t, T)$ the price at time $t \in [0, T]$ of the bond.

Footnote 1. We assume that the bond issuer is absolutely credit worthy, and there is zero probability of default. In particular, we are not discussing corporate bonds, mortgage-backed securities, etc. However, since bonds issued by the U.S. Treasury are backed by the 'full faith and credit' of the U.S. government, they are virtually riskless and will serve as a convenient example.

We will work in a market model where there are bonds of all maturities $T$ available to trade. Therefore, the market model will be specified by a family of processes $\{(P(t, T))_{t \in [0,T]} : T > 0\}$. Of course, there are only a finite number of maturities of bonds traded on the fixed income market. But since this number is very large, it is common practice to represent the zero-coupon bond prices as a continuous curve, rather than a discrete set of points.

We also assume that there is a risk-free numéraire process $(B_t)_{t \in \mathbb{R}_+}$, which we will think of as a bank or money market account. Recall that we can write the dynamics of the bank account as
$$dB_t = B_t r_t\, dt$$
where the adapted process $(r_t)_{t \in \mathbb{R}_+}$ is called the spot interest rate or the short interest rate. Of course, the above differential equation has the solution
$$B_t = B_0 e^{\int_0^t r_s\, ds}.$$

Now, we formulate a condition so that for any collection of maturities $T_1 < \ldots < T_d$, the market $(B_t, P(t, T_1), \ldots, P(t, T_d))_{t \in [0, T_1]}$ has no arbitrage.

Theorem. There is no arbitrage if there exists a locally equivalent measure $\mathbb{Q}$ such that the discounted bond price process $(\tilde P(t, T))_{t \in [0,T]}$ is a local martingale for all $T > 0$, where $\tilde P(t, T) = P(t, T)/B_t$. In particular, there is no arbitrage if
$$P(t, T) = \mathbb{E}^{\mathbb{Q}}\left( e^{-\int_t^T r_s\, ds} \,\Big|\, \mathcal{F}_t \right)$$
for all $0 \le t \le T$.

To get a feel for how we should model the bond prices, consider the graph of U.S. Treasury bond prices in Figure 1. Note that on 22 November 2008, the map $T \mapsto P_t(T)$ was decreasing. This is the typical situation.

Figure 1. Graph of the U.S. Treasury zero-coupon bond price curve on 22 November 2008. Data taken from http://www.treasury.gov/

From common experience, it seems that we should like to model the interest rate $(r_t)_{t \in \mathbb{R}_+}$ as a non-negative process. Indeed, if $r_t \ge 0$ almost surely for all $t \ge 0$ then the map $T \mapsto P_t(T)$ is decreasing almost surely. However, in actual practice, the interest rate is often modelled by a Gaussian process which might become negative with positive probability.

Notice that if $P(t, T) = \mathbb{E}^{\mathbb{Q}}(e^{-\int_t^T r_s\, ds} \,|\, \mathcal{F}_t)$, and $r$ is suitably well behaved, we can differentiate the bond price with respect to maturity to recover the spot rate. Indeed, if the spot rate process is bounded and continuous, then
$$-\frac{\partial}{\partial T} P_t(T) \Big|_{T = t} = \lim_{T \downarrow t} \mathbb{E}^{\mathbb{Q}}\left[ \frac{1 - e^{-\int_t^T r_s\, ds}}{T - t} \,\Big|\, \mathcal{F}_t \right] = r_t$$
by the dominated convergence theorem.

Rather than speak of bond prices, it is often easier to speak of interest rates. We have already defined the short interest rate. A popular interest rate is the yield $y(t, T)$ at time $t$ of a bond maturing at time $T$ defined by the formula
$$y(t, T) = -\frac{1}{T - t} \log P(t, T).$$
Figure 2 is a graph of the U.S. Treasury yield curve on 22 November 2008. The yield curve and the bond price curve contain the same information, since
$$P(t, T) = e^{-(T - t)\, y(t, T)}.$$
Note that the spot interest rate is just the left hand end point of the yield curve, $r_t = y(t, t)$.

Figure 2. Graph of the U.S. Treasury yield curve on 22 November 2008. Data taken from http://www.treasury.gov/

For us, a more useful interest rate is the forward rate $f(t, T)$ at time $t$ for maturity $T$, defined by
$$f(t, T) = -\frac{\partial}{\partial T} \log P(t, T).$$
Again, note that the forward rates contain the same information as the bond prices since
$$P(t, T) = e^{-\int_t^T f(t, s)\, ds}.$$
Again, notice that the spot rate is the left hand end point of the forward rate curve, $r_t = f(t, t)$. Note that if $(r_t)_{t \in \mathbb{R}_+}$ is suitably regular, say bounded and continuous, then
$$f(t, T) = \frac{\mathbb{E}^{\mathbb{Q}}\left( r_T\, e^{-\int_t^T r_s\, ds} \,|\, \mathcal{F}_t \right)}{\mathbb{E}^{\mathbb{Q}}\left( e^{-\int_t^T r_s\, ds} \,|\, \mathcal{F}_t \right)}$$
so that the forward rate can be interpreted as the $\mathcal{F}_t$-measurable random variable $f(t, T)$ such that the no-arbitrage price at time $t$ of the claim that pays $r_T - f(t, T)$ at time $T$ is zero.

The term structure of interest rates refers to the function $T \mapsto P(t, T)$, or equivalently, the price data encoded in either of the functions $T \mapsto y(t, T)$ or $T \mapsto f(t, T)$.
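To make the relationship between the three descriptions of the term structure concrete, here is a small sketch that computes yields and forward rates from a bond price function. The bond curve `example_bond_price` is a made-up example (not market data), and the forward rate is approximated by a central difference with an assumed step `h`.

```python
import math

def yield_rate(p, t, T):
    """Yield y(t, T) = -log P(t, T) / (T - t) for a bond price function p."""
    return -math.log(p(t, T)) / (T - t)

def forward_rate(p, t, T, h=1e-5):
    """Forward rate f(t, T) = -d/dT log P(t, T), by a central difference."""
    return -(math.log(p(t, T + h)) - math.log(p(t, T - h))) / (2.0 * h)

def example_bond_price(t, T, a=0.02, b=0.01):
    """Hypothetical curve with forward rates f(t, T) = a + b (T - t)."""
    x = T - t
    return math.exp(-(a * x + 0.5 * b * x * x))

y = yield_rate(example_bond_price, 0.0, 2.0)    # average of the forward rates over [0, 2]
f = forward_rate(example_bond_price, 0.0, 2.0)  # forward rate at maturity 2
```

For this curve the yield is the average of the forward rates over the life of the bond, which is why an upward sloping forward curve lies above the corresponding yield curve.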

2. Short rate models

We begin with a market that has just the bank account $B$, and a Brownian motion $(W_t)_{t \in \mathbb{R}_+}$ for $\mathbb{P}$. We will consider an Itô process short interest rate model of the form
$$dr_t = a_t\, dt + \beta_t\, dW_t$$
for adapted processes $(a_t)_{t \in \mathbb{R}_+}$ and $(\beta_t)_{t \in \mathbb{R}_+}$.

As we have learned, we know that there is no arbitrage if the market somehow picks an equivalent martingale measure $\mathbb{Q}$ to price the bonds. Note that while in a complete stock market model there was only one way to switch to an equivalent martingale measure, no such choice is possible since the short rate is not traded. We will assume that the market price of risk is given by the process $(\lambda_t)_{t \in \mathbb{R}_+}$ so that
$$dr_t = \alpha_t\, dt + \beta_t\, d\hat W_t$$
where $d\hat W_t = dW_t + \lambda_t\, dt$ defines a Brownian motion for the measure $\mathbb{Q}$ whose density process is given by $dZ_t = -Z_t \lambda_t\, dW_t$, and where $\alpha_t = a_t - \beta_t \lambda_t$ defines the risk-neutral drift. Since we are interested in pricing and hedging, there is no need to model the processes $(a_t)_{t \in \mathbb{R}_+}$ and $(\lambda_t)_{t \in \mathbb{R}_+}$ separately. However, we must be careful to realize that it is impossible to estimate the distribution of the random variable $\alpha_t$ directly from a time series $r_{t_1}, \ldots, r_{t_n}$.

We now study the case when the short rate is Markovian. Assume that
$$dr_t = \alpha(t, r_t)\, dt + \beta(t, r_t)\, d\hat W_t$$
for some non-random functions $\alpha : \mathbb{R}_+ \times \mathbb{R} \to \mathbb{R}$ and $\beta : \mathbb{R}_+ \times \mathbb{R} \to \mathbb{R}$. As we have learned for Markovian stock models, the price of contingent claims can be expressed in terms of the solution of a PDE:

Theorem. Fix $T > 0$ and suppose $V : [0, T] \times \mathbb{R} \to \mathbb{R}$ satisfies the PDE
$$\frac{\partial V}{\partial t}(t, r) + \alpha(t, r) \frac{\partial V}{\partial r}(t, r) + \frac{1}{2} \beta(t, r)^2 \frac{\partial^2 V}{\partial r^2}(t, r) = r V(t, r)$$
$$V(T, r) = 1.$$
If $P(t, T) = V(t, r_t)$ then there is no arbitrage.

Proof. Itô's formula implies
$$\begin{aligned}
d\left( e^{-\int_0^t r_s\, ds}\, V(t, r_t) \right) &= -r_t e^{-\int_0^t r_s\, ds}\, V(t, r_t)\, dt + e^{-\int_0^t r_s\, ds} \left[ \frac{\partial V}{\partial t}(t, r_t)\, dt + \frac{\partial V}{\partial r}(t, r_t)\, dr_t + \frac{1}{2} \frac{\partial^2 V}{\partial r^2}(t, r_t)\, d\langle r \rangle_t \right] \\
&= e^{-\int_0^t r_s\, ds} \left[ \frac{\partial V}{\partial t}(t, r_t) + \alpha(t, r_t) \frac{\partial V}{\partial r}(t, r_t) + \frac{1}{2} \beta(t, r_t)^2 \frac{\partial^2 V}{\partial r^2}(t, r_t) - r_t V(t, r_t) \right] dt \\
&\quad + e^{-\int_0^t r_s\, ds}\, \beta(t, r_t) \frac{\partial V}{\partial r}(t, r_t)\, d\hat W_t.
\end{aligned}$$
The drift vanishes by assumption, so $(\tilde P(t, T))_{t \in [0,T]}$ is a local martingale.

2.1. Vasicek model. In 1977, Vasicek proposed the following model for the short rate:
$$dr_t = \lambda(\bar r - r_t)\, dt + \sigma\, d\hat W_t$$
for a parameter $\bar r > 0$ interpreted as a mean short rate, a mean-reversion parameter $\lambda > 0$, and a volatility parameter $\sigma > 0$. This stochastic differential equation can be solved explicitly to yield
$$r_t = e^{-\lambda t} r_0 + (1 - e^{-\lambda t}) \bar r + \int_0^t e^{-\lambda(t - s)} \sigma\, d\hat W_s.$$
Note that for each $t \ge 0$ the random variable $r_t$ is Gaussian (see footnote 2) under the measure $\mathbb{Q}$ with
$$\mathbb{E}^{\mathbb{Q}}(r_t) = e^{-\lambda t} r_0 + (1 - e^{-\lambda t}) \bar r \quad \text{and} \quad \mathrm{Var}^{\mathbb{Q}}(r_t) = \int_0^t e^{-2\lambda(t - s)} \sigma^2\, ds = \frac{\sigma^2}{2\lambda}(1 - e^{-2\lambda t}).$$
Moreover, one can show that the process is ergodic and converges to the invariant distribution $N\left( \bar r, \frac{\sigma^2}{2\lambda} \right)$. In particular, we have
$$\frac{1}{T} \int_0^T r_s\, ds \to \bar r \quad \mathbb{Q}\text{-almost surely}.$$
Please note, however, that in the present framework we can say absolutely nothing about the distribution of $r_t$ for the objective measure $\mathbb{P}$, unless we have a model for the market price of risk.

Footnote 2. A continuous Gaussian Markov process is often called an Ornstein–Uhlenbeck process.

Since the short rate $r_t$ is Gaussian, the advantage of this type of model is that it is relatively easy to compute prices, for instance of bonds, explicitly. A disadvantage of this model is that there is a chance that $r_t < 0$ for some time $t > 0$. Recall that a normal random variable can take any real value, both positive and negative. However, for sensible parameter values, the $\mathbb{Q}$-probability of the event $\{r_t < 0\}$ is pretty small.

We can also use the above theorem to compute bond prices. Indeed, fix $T > 0$ and consider the PDE
$$\frac{\partial V}{\partial t}(t, r) + \lambda(\bar r - r) \frac{\partial V}{\partial r}(t, r) + \frac{1}{2} \sigma^2 \frac{\partial^2 V}{\partial r^2}(t, r) = r V(t, r)$$
$$V(T, r) = 1.$$
We can make the ansatz
$$V(t, r) = e^{-r A(t) - B(t)}$$
for some functions $A$ and $B$ which satisfy the boundary conditions $A(T) = B(T) = 0$. Substituting this into the PDE yields
$$(-A'(t) r - B'(t)) V(t, r) - \lambda(\bar r - r) A(t) V(t, r) + \frac{\sigma^2}{2} A(t)^2 V(t, r) = r V(t, r).$$
Since this is supposed to be an identity for all $(t, r)$ we have
$$A'(t) = \lambda A(t) - 1$$
$$B'(t) = -\lambda \bar r A(t) + \frac{\sigma^2}{2} A(t)^2.$$
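The pair of ordinary differential equations for $A$ and $B$ can also be integrated numerically, which is the approach one would fall back on for models without a closed-form solution. Below is a minimal sketch using the classical fourth-order Runge–Kutta method, stepping backwards from the terminal condition $A(T) = B(T) = 0$. The function names, the step count, and the parameter values are illustrative assumptions.

```python
import math

def vasicek_ab(lam, r_bar, sigma, t, T, n_steps=1000):
    """Integrate A' = lam A - 1 and B' = -lam r_bar A + sigma^2 A^2 / 2
    backwards from A(T) = B(T) = 0 to time t <= T with classical RK4."""
    def rhs(a):
        # Both right-hand sides depend on A only, not on B or on time.
        return lam * a - 1.0, -lam * r_bar * a + 0.5 * sigma ** 2 * a ** 2

    h = (t - T) / n_steps  # negative step: we move from T down to t
    a, b = 0.0, 0.0
    for _ in range(n_steps):
        k1a, k1b = rhs(a)
        k2a, k2b = rhs(a + 0.5 * h * k1a)
        k3a, k3b = rhs(a + 0.5 * h * k2a)
        k4a, k4b = rhs(a + h * k3a)
        a += h * (k1a + 2.0 * k2a + 2.0 * k3a + k4a) / 6.0
        b += h * (k1b + 2.0 * k2b + 2.0 * k3b + k4b) / 6.0
    return a, b

def vasicek_bond_price(r, lam, r_bar, sigma, t, T):
    """Bond price V(t, r) = exp(-r A(t) - B(t)) from the exponential ansatz."""
    a, b = vasicek_ab(lam, r_bar, sigma, t, T)
    return math.exp(-r * a - b)

# Hypothetical parameters: a two-year bond when the short rate is 3%.
bond_price = vasicek_bond_price(r=0.03, lam=0.5, r_bar=0.05, sigma=0.01, t=0.0, T=2.0)
```

The same integrator applies to any model in which the ansatz reduces the pricing PDE to ordinary differential equations in $t$, after changing the function `rhs`.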

The solution to this pair of equations is
$$A(t, T) = \frac{1}{\lambda}(1 - e^{-\lambda(T - t)})$$
$$B(t, T) = \int_t^T \left( \lambda \bar r A(s) - \frac{\sigma^2}{2} A(s)^2 \right) ds$$
so that the bond price is given by
$$P(t, T) = \exp\left( -\frac{r_t}{\lambda}(1 - e^{-\lambda(T - t)}) - \bar r \int_0^{T - t} (1 - e^{-\lambda u})\, du + \frac{\sigma^2}{2\lambda^2} \int_0^{T - t} (1 - e^{-\lambda u})^2\, du \right).$$
This is a mess. However, the forward rates are more manageable:
$$f(t, t + x) = r_t e^{-\lambda x} + \bar r (1 - e^{-\lambda x}) - \frac{\sigma^2}{2\lambda^2} (1 - e^{-\lambda x})^2.$$
This formula says that for the Vasicek model, the forward rates at time $t$ are an affine function of the short rate at time $t$. (An affine function is of the form $g(x) = ax + b$, that is, its graph is a line.)

2.2. Cox–Ingersoll–Ross model. In 1985, Cox, Ingersoll, and Ross proposed the following model for the short rate:
$$dr_t = \lambda(\bar r - r_t)\, dt + \sigma \sqrt{r_t}\, d\hat W_t$$
for a parameter $\bar r > 0$ interpreted as a mean short rate, a mean-reversion parameter $\lambda > 0$, and a volatility parameter $\sigma > 0$. An advantage of this model over the Vasicek model is that the short rate $r_t$ is non-negative for all $t \ge 0$.

The process $(r_t)_{t \in \mathbb{R}_+}$ satisfying the above stochastic differential equation is often called a square-root diffusion or CIR process, though this stochastic process was studied as early as 1951 by Feller. Although the stochastic differential equation cannot be solved explicitly, one can say quite a lot about this process. For instance, one can show that the process is ergodic and its invariant distribution is a gamma distribution with mean $\bar r$.

Furthermore, explicit formulas are still available for the bond prices. Indeed, the CIR model is another example of an affine term structure model: Again, fix $T > 0$ and consider the PDE
$$\frac{\partial V}{\partial t}(t, r) + \lambda(\bar r - r) \frac{\partial V}{\partial r}(t, r) + \frac{1}{2} \sigma^2 r \frac{\partial^2 V}{\partial r^2}(t, r) = r V(t, r)$$
$$V(T, r) = 1.$$
As before we can make the ansatz
$$V(t, r) = e^{-r A(t) - B(t)}$$
for some functions $A$ and $B$ which satisfy the boundary conditions $A(T) = B(T) = 0$. Substituting this into the PDE yields
$$(-A'(t) r - B'(t)) V(t, r) - \lambda(\bar r - r) A(t) V(t, r) + \frac{\sigma^2}{2} r A(t)^2 V(t, r) = r V(t, r).$$

This time we have
$$A'(t) = \lambda A(t) + \frac{\sigma^2}{2} A(t)^2 - 1$$
$$B'(t) = -\lambda \bar r A(t).$$
The equation for $A$ is a Riccati equation, whose solution is
$$A(t, T) = \frac{2(e^{\gamma(T - t)} - 1)}{(\gamma + \lambda) e^{\gamma(T - t)} + (\gamma - \lambda)}$$
$$B(t, T) = \int_t^T \lambda \bar r A(s)\, ds$$
where $\gamma = \sqrt{\lambda^2 + 2\sigma^2}$. The bond prices are too messy to write down, but the forward rates are given by
$$f(t, t + x) = \frac{4\gamma^2 e^{\gamma x}}{[(\gamma + \lambda) e^{\gamma x} + (\gamma - \lambda)]^2}\, r_t + \frac{2\lambda \bar r (e^{\gamma x} - 1)}{(\gamma + \lambda) e^{\gamma x} + (\gamma - \lambda)}.$$
In particular, the forward rates for the CIR model are again given by an affine function of the short rate.

3. Factor models

The Markovian short rate models are popular in practice, especially the Vasicek and CIR models in which formulas for the bond prices are available in closed form. However, a possible shortcoming of these models is that they predict a very rigid term structure. Indeed, since there exists a deterministic function $g : D \times \mathbb{R} \to \mathbb{R}$ such that
$$f(t, T) = g(t, T, r_t)$$
where $D = \{(t, T) : 0 \le t \le T\}$, there is very little flexibility in the possible shapes of the forward rate curve. In particular, in the Vasicek and CIR models, the function $r \mapsto g(t, T, r)$ is affine, so that the correlation coefficient $\rho(r_t, f(t, T)) = 1$ for all $0 < t \le T$.

In this section, we consider more general factor models, of which the short rate models are only a special case. The idea is to assume that there are $d$ underlying economic 'factors' in the market. We model these factors as the solution $(Z_t)_{t \in \mathbb{R}_+}$ of a stochastic differential equation
$$dZ_t = a(t, Z_t)\, dt + b(t, Z_t)\, d\hat W_t$$
where $a : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}^d$ and $b : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}^{d \times d}$ are given functions and $(\hat W_t)_{t \in \mathbb{R}_+}$ is a $d$-dimensional Brownian motion for the equivalent martingale measure $\mathbb{Q}$. We then assume that the short rate is given by a function $r_t = R(t, Z_t)$. The short rate models considered in the last section have $d = 1$, $Z_t = r_t$, and $R(t, r) = r$. The main theorem is below.

Theorem. Fix $T > 0$. Suppose $V : [0, T] \times \mathbb{R}^d \to \mathbb{R}_+$ satisfies the partial differential equation
$$\frac{\partial V}{\partial t} + \sum_{i=1}^d a_i \frac{\partial V}{\partial Z_i} + \frac{1}{2} \sum_{i=1}^d \sum_{j=1}^d B_{ij} \frac{\partial^2 V}{\partial Z_i \partial Z_j} = RV$$
$$V(T, Z) = 1$$
where $B(t, z) = b(t, z)\, b(t, z)^T$. If $P(t, T) = V(t, Z_t)$ then the market consisting of the bank account and the bond of maturity $T$ has no arbitrage.

We now consider a special case of the above theorem, first studied by Duffie and Kan in 1996. Suppose the short rate is given by
$$r_t = Z_t^{(1)} + \ldots + Z_t^{(d)}.$$
We let
$$dZ_t = (\alpha + \beta Z_t)\, dt + \begin{pmatrix} \sqrt{\gamma_1 + \delta_1 \cdot Z_t} & 0 & \cdots & 0 \\ 0 & \sqrt{\gamma_2 + \delta_2 \cdot Z_t} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & \sqrt{\gamma_d + \delta_d \cdot Z_t} \end{pmatrix} d\hat W_t.$$
Here, $\gamma_1, \ldots, \gamma_d$ are $d$ positive real constants, $\alpha$ and $\delta_1, \ldots, \delta_d$ are $d$-dimensional constant vectors, and $\beta$ is a $d \times d$ constant matrix.

Remark. Existence and uniqueness of solutions of such stochastic differential equations is not guaranteed. The analysis of such a stochastic differential equation is very simple when all the $\delta_i$'s are zero. In this case, there is existence and uniqueness of a solution, and in fact, the solution is a $d$-dimensional Ornstein–Uhlenbeck process. However, the situation is much more delicate when some of the $\delta_i$'s are non-zero, and analyzing the properties of this non-Gaussian diffusion is not easy because of the random volatility created by the non-zero $\delta_i$'s.

As in the case of the Vasicek and CIR models, we can make the ansatz that the bond prices can be written in the exponential affine form
$$P(t, T) = e^{-A(t, T) \cdot Z_t - B(t, T)},$$
for deterministic functions $A : D \to \mathbb{R}^d$ and $B : D \to \mathbb{R}$. As before, the functions $A$ and $B$ can be found by solving a system of $d + 1$ coupled Riccati equations. These equations can be solved numerically, for instance. Writing $\delta_i^{(k)}$ for the $k$-th component of the vector $\delta_i$, the system is
$$\frac{\partial A_k}{\partial t} = -\sum_{i=1}^d \beta_{ik} A_i + \frac{1}{2} \sum_{i=1}^d \delta_i^{(k)} A_i^2 - 1$$
$$\frac{\partial B}{\partial t} = -\sum_{i=1}^d \alpha_i A_i + \frac{1}{2} \sum_{i=1}^d \gamma_i A_i^2,$$
subject to the boundary conditions
$$A_k(T, T) = 0 = B(T, T)$$

for all $k \in \{1, \ldots, d\}$.

Given the exponential affine bond prices, note that
$$f(t, T) = \frac{\partial A}{\partial T}(t, T) \cdot Z_t + \frac{\partial B}{\partial T}(t, T).$$
Fix $d$ dates $T_1, \ldots, T_d$ and consider the $d$ benchmark rates $f(t, T_1), \ldots, f(t, T_d)$. If the matrix
$$\Gamma_t = \left( \frac{\partial A_j}{\partial T}(t, T_i) \right)_{i, j \in \{1, \ldots, d\}}$$
is invertible, then we can recover the factors as linear combinations of the benchmark forward rates:
$$Z_t = \Gamma_t^{-1} \left( f(t, T_i) - \frac{\partial B}{\partial T}(t, T_i) \right)_{i \in \{1, \ldots, d\}}.$$

4. The Heath–Jarrow–Morton framework

Starting from a factor model, the derived bond prices are necessarily Itô processes. There is no arbitrage in a factor model since, by construction, there exists an equivalent martingale measure $\mathbb{Q}$ such that all discounted bond prices $(\tilde P(t, T))_{t \in [0,T]}$ are local martingales, where the discounted bond price at time $t$ for maturity $T$ is given by
$$\tilde P(t, T) = e^{-\int_0^t r_s\, ds} P_t(T).$$
The insight of Heath, Jarrow, and Morton in 1992 was that we can change perspectives by modelling the bond prices directly.

Motivation. Indeed, suppose we start out with just the bond market, but without the bank account. We can construct the bank account by considering an investor holding his wealth in just-maturing bonds. More concretely, suppose at time 0 the investor has $B_0$ units of wealth. Fix a sequence $0 \le t_0 < t_1 < \ldots$ of times and suppose that during the interval $(t_{i-1}, t_i]$ the investor holds all of his wealth in the bond which matures at time $t_i$. If the investor's wealth at time $t$ is denoted by $B_t$, and the number of shares of the just-maturing bond by $\pi_t$, the budget constraint is
$$B_{t_{i-1}} = \pi_{t_i} P(t_{i-1}, t_i)$$
and the self-financing condition is
$$B_{t_i} = \pi_{t_i}$$
since $P(t, t) = 1$ for all $t$. Hence, the rate of change of the wealth is given by
$$\frac{B_{t_i} - B_{t_{i-1}}}{t_i - t_{i-1}} = B_{t_{i-1}} \frac{1 - P(t_{i-1}, t_i)}{P_{t_{i-1}}(t_i)\, (t_i - t_{i-1})}.$$
By taking the limit as $t_i - t_{i-1} \to 0$, we can define the spot rate by
$$r_t = -\frac{\partial}{\partial T} P(t, T) \Big|_{T = t}$$
so that
$$dB_t = B_t r_t\, dt$$
as before.

The usual formulation of the HJM idea is in terms of the forward rates. As usual, we put ourselves in the context of a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ on which we can define a $d$-dimensional Brownian motion $(W_t)_{t \in \mathbb{R}_+}$. Suppose for each $T$, the forward rate process $(f(t, T))_{t \in [0,T]}$ has dynamics
$$df(t, T) = a(t, T)\, dt + \sum_{i=1}^d \sigma^{(i)}(t, T)\, dW_t^{(i)} = a(t, T)\, dt + \sigma(t, T) \cdot dW_t$$
for some suitably regular adapted processes $(a(t, T))_{t \in [0,T]}$ and $(\sigma^{(i)}(t, T))_{t \in [0,T]}$. Let the short rate be given by $r_t = f(t, t)$ and the bank account dynamics by $dB_t = B_t r_t\, dt$. Finally, let the bond prices be given by
$$P(t, T) = e^{-\int_t^T f(t, s)\, ds}.$$

Theorem. If there exists a $d$-dimensional bounded adapted process $(\lambda_t)_{t \in \mathbb{R}_+}$ such that
$$a(t, T) = \sum_{i=1}^d \sigma^{(i)}(t, T) \left( \lambda_t^{(i)} + \int_t^T \sigma^{(i)}(t, s)\, ds \right) = \sigma(t, T) \cdot \left( \lambda_t + \int_t^T \sigma(t, s)\, ds \right),$$
then, for any set of $d$ maturities $0 < T_1 < \ldots < T_d$, the market model with prices $(B_t, P(t, T_1), \ldots, P(t, T_d))_{t \in [0, T_1]}$ has no arbitrage.

Remark. The upshot of the HJM result is that the drift and the volatility of the forward rate dynamics cannot be prescribed independently. Indeed, they must be related by the famous formula above, usually called the HJM drift condition. Notice that this drift/volatility constraint is not present in models in which only the dynamics of the short rate are specified, as in Section 2. The difference with the short rate models is that we are now trying to model the dynamics of the whole term structure. Indeed, in the HJM framework, we can initialize the model with any initial forward rate curve $T \mapsto f(0, T)$. Nevertheless, note that any of the short rate or factor models can be put into the HJM framework, just by choosing the initial forward rate curve to match the one predicted by the model.

Proof. Define a locally equivalent measure $\mathbb{Q}$ by the density process $dZ_t = -Z_t \lambda_t \cdot dW_t$. For this measure, the process defined by $d\hat W_t = dW_t + \lambda_t\, dt$ is a Brownian motion. We can rewrite the forward rate dynamics as
$$df(t, T) = \sigma(t, T) \cdot \int_t^T \sigma(t, s)\, ds\, dt + \sigma(t, T) \cdot d\hat W_t.$$

It is enough to show that for each $T > 0$, the discounted bond price process
$$\tilde P(t, T) = e^{-\int_0^t r_s\, ds - \int_t^T f(t, s)\, ds}$$
is a local martingale. Now applying some formal manipulations (see footnote 3)
$$\begin{aligned}
d\left( \int_0^t r_s\, ds + \int_t^T f(t, s)\, ds \right) &= (r_t - f(t, t))\, dt + \int_t^T df(t, s)\, ds \\
&= \frac{1}{2} \left| \int_t^T \sigma(t, s)\, ds \right|^2 dt + \int_t^T \sigma(t, s)\, ds \cdot d\hat W_t.
\end{aligned}$$
Hence, by Itô's formula, we have
$$d\tilde P(t, T) = -\tilde P(t, T) \int_t^T \sigma(t, s)\, ds \cdot d\hat W_t$$
and we're done.

Footnote 3. We would like to appeal to a stochastic Fubini theorem in order to exchange the order of integration in the double integral. The question is: if we fix $S, T \ge 0$, when do we have the equality
$$\int_0^T \int_0^S g(s, t)\, ds\, dW_t = \int_0^S \int_0^T g(s, t)\, dW_t\, ds?$$
First note that the equality holds for $g \in \mathcal{S}$ where $\mathcal{S}$ is the set
$$\mathcal{S} = \left\{ \sum_{i=1}^n k_i \mathbf{1}_{(s_{i-1}, s_i] \times (t_{i-1}, t_i]} : 0 \le s_{i-1} < s_i \le S,\ 0 \le t_{i-1} < t_i \le T,\ k_i \text{ is bounded and } \mathcal{F}_{t_{i-1}}\text{-measurable} \right\}.$$
Now suppose there exists a sequence $(g_n)_{n \in \mathbb{N}}$ in $\mathcal{S}$ such that
$$\mathbb{E} \int_0^T \int_0^S [g(s, t) - g_n(s, t)]^2\, ds\, dt = \mathbb{E} \int_0^S \int_0^T [g(s, t) - g_n(s, t)]^2\, dt\, ds \to 0.$$
Then $g$ satisfies the exchange of order of integration equality, since
$$\mathbb{E} \left( \int_0^T \int_0^S g(s, t)\, ds\, dW_t - \int_0^T \int_0^S g_n(s, t)\, ds\, dW_t \right)^2 = \mathbb{E} \int_0^T \left( \int_0^S (g(s, t) - g_n(s, t))\, ds \right)^2 dt \le S\, \mathbb{E} \int_0^T \int_0^S (g(s, t) - g_n(s, t))^2\, ds\, dt \to 0$$
by Itô's isometry and the Cauchy–Schwarz inequality, and
$$\mathbb{E} \left( \int_0^S \int_0^T g(s, t)\, dW_t\, ds - \int_0^S \int_0^T g_n(s, t)\, dW_t\, ds \right)^2 \le S\, \mathbb{E} \int_0^S \left( \int_0^T (g(s, t) - g_n(s, t))\, dW_t \right)^2 ds = S\, \mathbb{E} \int_0^S \int_0^T (g(s, t) - g_n(s, t))^2\, dt\, ds \to 0$$
by the Cauchy–Schwarz inequality, Fubini's theorem, and Itô's isometry. Finally, by localization, we can replace the condition
$$\mathbb{E} \int_0^T \int_0^S g(s, t)^2\, ds\, dt < \infty \quad \text{with} \quad \int_0^T \int_0^S g(s, t)^2\, ds\, dt < \infty \text{ a.s.}$$
An example which will occur frequently is when $g$ is not random and continuous (or at least Riemann integrable) on $[0, S] \times [0, T]$.

We conclude this section with some examples. In these examples, the forward rates are Gaussian under the measure $\mathbb{Q}$, and hence are vulnerable to the criticism that there is a positive probability that the rates become negative.

4.1. Ho–Lee. (1986) This model is the simplest possible HJM model. Let $d = 1$ and $\sigma(t,T) = \sigma_0$ be constant. Then
\[ df(t,T) = \sigma_0^2 (T - t)\,dt + \sigma_0\,d\hat W_t, \]
or
\[ f(t,T) = f(0,T) + \sigma_0^2 (Tt - t^2/2) + \sigma_0 \hat W_t. \]
The short rate is then given by
\[ r_t = f(0,t) + \sigma_0^2 t^2/2 + \sigma_0 \hat W_t. \]
Hence the Ho–Lee model corresponds to the following short rate model, writing $f_0(t) = f(0,t)$ for the initial forward curve:
\[ dr_t = (f_0'(t) + \sigma_0^2 t)\,dt + \sigma_0\,d\hat W_t. \]
Here is an unusual feature of this model: if the initial forward rate curve $T \mapsto f(0,T)$ is bounded from below, then for positive times $t$ the forward rates $f(t,T) \to \infty$ as $T \to \infty$.

4.2. Vasicek–Hull–White. (1990) Again let $d = 1$ but now $\sigma(t,T) = \sigma_0 e^{-\lambda(T-t)}$ for positive constants $\sigma_0$ and $\lambda$. Then
\[ df(t,T) = \frac{\sigma_0^2}{\lambda} e^{-\lambda(T-t)} (1 - e^{-\lambda(T-t)})\,dt + \sigma_0 e^{-\lambda(T-t)}\,d\hat W_t. \]
The short rates are given by
\[ r_t = f(0,t) + \int_0^t \frac{\sigma_0^2}{\lambda} e^{-\lambda(t-s)} (1 - e^{-\lambda(t-s)})\,ds + \int_0^t \sigma_0 e^{-\lambda(t-s)}\,d\hat W_s = f(0,t) + \frac{\sigma_0^2}{2\lambda^2} (1 - e^{-\lambda t})^2 + \int_0^t \sigma_0 e^{-\lambda(t-s)}\,d\hat W_s. \]
The short rate dynamics are given by
\[ dr_t = \Big( f_0'(t) + \frac{\sigma_0^2}{\lambda} e^{-\lambda t} (1 - e^{-\lambda t}) \Big) dt + \sigma_0\,d\hat W_t - \lambda \int_0^t \sigma_0 e^{-\lambda(t-s)}\,d\hat W_s\,dt = \Big( f_0'(t) + \lambda f_0(t) + \frac{\sigma_0^2}{2\lambda} (1 - e^{-2\lambda t}) - \lambda r_t \Big) dt + \sigma_0\,d\hat W_t. \]
Hence, the Hull–White extension of the Vasicek model essentially replaces the mean interest rate $\bar r$ with a time-varying, but non-random, mean rate $\bar r(t)$.

4.3. Kennedy. (1994) Note that for the HJM models discussed above, the forward rates are given by
\[ f(t,T) = f(0,T) + \int_0^t \sigma(u,T) \cdot \int_u^T \sigma(u,s)\,ds\,du + \int_0^t \sigma(u,T) \cdot d\hat W_u. \]
If $\sigma$ is not random, then the distribution of $f(t,T)$ under the risk-neutral measure $\mathbb{Q}$ is Gaussian with mean
\[ \mathbb{E}^{\mathbb{Q}}[f(t,T)] = f(0,T) + \int_0^t \sigma(u,T) \cdot \int_u^T \sigma(u,s)\,ds\,du \]
and covariance
\[ \mathrm{Cov}^{\mathbb{Q}}[f(s,S), f(t,T)] = \int_0^{s \wedge t} \sigma(u,S) \cdot \sigma(u,T)\,du. \]
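The Gaussian mean and covariance integrals above can be checked numerically for a concrete volatility. The sketch below assumes the Vasicek–Hull–White choice $\sigma(u,T) = \sigma_0 e^{-\lambda(T-u)}$ with $d = 1$, and evaluates both integrals with a plain midpoint rule. The function names and parameter values are illustrative and do not come from the notes.

```python
import math


def midpoint_integral(func, lower, upper, steps=20000):
    """Approximate the integral of func over [lower, upper] by the midpoint rule."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (k + 0.5) * width) for k in range(steps))


def vasicek_vol(u, T, sigma0, lam):
    """Forward rate volatility sigma(u, T) = sigma0 * exp(-lam * (T - u))."""
    return sigma0 * math.exp(-lam * (T - u))


def mean_adjustment(t, T, sigma0, lam):
    """Integral over u in [0, t] of sigma(u, T) * (integral of sigma(u, s) over s in [u, T])."""
    def integrand(u):
        # The inner integral is available in closed form for this volatility.
        inner = sigma0 * (1.0 - math.exp(-lam * (T - u))) / lam
        return vasicek_vol(u, T, sigma0, lam) * inner
    return midpoint_integral(integrand, 0.0, t)


def forward_covariance(s, S, t, T, sigma0, lam):
    """Cov[f(s, S), f(t, T)]: integral of sigma(u, S) * sigma(u, T) over u in [0, min(s, t)]."""
    def integrand(u):
        return vasicek_vol(u, S, sigma0, lam) * vasicek_vol(u, T, sigma0, lam)
    return midpoint_integral(integrand, 0.0, min(s, t))


# Illustrative parameters (assumptions, not calibrated values).
sigma0, lam, t = 0.01, 0.5, 2.0

# At T = t the mean adjustment should match the short rate formula in 4.2.
short_rate_adjustment = mean_adjustment(t, t, sigma0, lam)
closed_form_adjustment = sigma0**2 / (2 * lam**2) * (1 - math.exp(-lam * t)) ** 2

# Closed form of the covariance integral for the exponential volatility.
cov_numeric = forward_covariance(1.0, 3.0, 2.0, 5.0, sigma0, lam)
cov_closed = sigma0**2 * math.exp(-lam * (3.0 + 5.0)) * (math.exp(2 * lam * 1.0) - 1) / (2 * lam)
```

Setting $T = t$ in the mean recovers the deterministic part of the short rate in 4.2, which is a quick consistency check between the forward rate and short rate descriptions of the same model.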

Kennedy reversed this logic, and considered a Gaussian random field $\{ f(t,T) : 0 \le t \le T \}$ with mean $\mu(t,T)$ and covariance $C(s,S,t,T)$. Suppose that the covariance has the special form
\[ C(s,S,t,T) = c_{s \wedge t}(S,T) \]
so that, for each fixed $T > 0$, the increments of $(f(t,T))_{t \in [0,T]}$ are independent. Then there is no arbitrage if the mean is given by
\[ \mu(t,T) = f(0,T) + \int_0^T c_{t \wedge s}(s,T)\,ds. \]
An advantage of this formulation of the Gaussian HJM model is that one is no longer restricted to finite dimensional Brownian motions, and, therefore, there is much more flexibility to specify the correlation of the increments. For instance, one choice is to have the correlation of the increments decay exponentially in the difference of the maturities: '$\rho(df_t(S), df_t(T)) = e^{-\beta|T-S|}$.' This can be achieved by taking $c$ to be
\[ c_t(S,T) = e^{-\beta|T-S|} \int_0^t \alpha_u(S) \alpha_u(T)\,du \]
for real valued functions $\alpha_u$.
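As a rough illustration of the Kennedy specification above, the sketch below takes the simplest possible choice $\alpha_u(T) = \alpha$ constant (an assumption made here only to keep the example short), so that $c_t(S,T) = e^{-\beta|T-S|} \alpha^2 t$. It evaluates the no-arbitrage mean by a midpoint rule and confirms that the correlation between $f(t,S)$ and $f(t,T)$ decays exponentially in $|T - S|$. All names and parameter values are hypothetical.

```python
import math


def kennedy_cov(t, S, T, alpha, beta):
    """c_t(S, T) = exp(-beta * |T - S|) * alpha^2 * t, assuming alpha_u is the constant alpha."""
    return math.exp(-beta * abs(T - S)) * alpha**2 * t


def kennedy_mean(t, T, initial_forward, alpha, beta, steps=20000):
    """mu(t, T) = f(0, T) + integral over s in [0, T] of c_{min(t, s)}(s, T), by the midpoint rule."""
    width = T / steps
    integral = 0.0
    for k in range(steps):
        s = (k + 0.5) * width
        integral += kennedy_cov(min(t, s), s, T, alpha, beta)
    return initial_forward + width * integral


def forward_correlation(t, S, T, alpha, beta):
    """Correlation of f(t, S) and f(t, T) implied by the covariance c_t."""
    variance_product = kennedy_cov(t, S, S, alpha, beta) * kennedy_cov(t, T, T, alpha, beta)
    return kennedy_cov(t, S, T, alpha, beta) / math.sqrt(variance_product)


# Illustrative parameters (assumptions): 3% flat initial curve, alpha = 1%, beta = 0.3.
alpha, beta, initial_forward = 0.01, 0.3, 0.03
mean_today = kennedy_mean(0.0, 5.0, initial_forward, alpha, beta)
mean_in_one_year = kennedy_mean(1.0, 5.0, initial_forward, alpha, beta)
correlation = forward_correlation(1.0, 2.0, 5.0, alpha, beta)
```

With these inputs the mean of the five-year forward rate one year ahead sits slightly above the initial curve, which is the same convexity adjustment that appears as the drift term in the finite dimensional HJM examples.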


CHAPTER 6
Crashcourse on probability theory

These notes are a list of many of the definitions and results of probability theory needed to follow the Advanced Financial Models course. Since they are free from any motivating exposition or examples, and since no proofs are given for any of the theorems, these notes should be used only as a reference. A table of notation is in the appendix.

1. Measures

Definition. Let $\Omega$ be a set. A sigma-field on $\Omega$ is a non-empty set $\mathcal{F}$ of subsets of $\Omega$ such that
(1) if $A \in \mathcal{F}$ then $A^c \in \mathcal{F}$,
(2) if $A_1, A_2, \ldots \in \mathcal{F}$ then $\bigcup_{i=1}^\infty A_i \in \mathcal{F}$.
The terms sigma-field and sigma-algebra are interchangeable. The Borel sigma-field $\mathcal{B}$ on $\mathbb{R}$ is the smallest sigma-field containing every open interval. More generally, if $\Omega$ is a topological space, for instance $\mathbb{R}^n$, the Borel sigma-field on $\Omega$ is the smallest sigma-field containing every open set.

Definition. Let $\Omega$ be a set and let $\mathcal{F}$ be a sigma-field on $\Omega$. A measure $\mu$ on the measurable space $(\Omega, \mathcal{F})$ is a function $\mu : \mathcal{F} \to [0, \infty]$ such that
(1) $\mu(\emptyset) = 0$,
(2) if $A_1, A_2, \ldots \in \mathcal{F}$ are disjoint then $\mu(\bigcup_{i=1}^\infty A_i) = \sum_{i=1}^\infty \mu(A_i)$.
A probability measure $\mathbb{P}$ on $(\Omega, \mathcal{F})$ is a measure such that $\mathbb{P}(\Omega) = 1$.

Theorem. There exists a unique measure Leb on $(\mathbb{R}, \mathcal{B})$ such that $\mathrm{Leb}(a, b] = b - a$ for every $b > a$. This measure is called Lebesgue measure.

Definition. Let $\Omega$ be a set, $\mathcal{F}$ a sigma-field on $\Omega$, and $\mathbb{P}$ a probability measure on $(\Omega, \mathcal{F})$. The triple $(\Omega, \mathcal{F}, \mathbb{P})$ is called a probability space. The set $\Omega$ is called the sample space, and an element of $\Omega$ is called an outcome. A subset of $\Omega$ which is an element of $\mathcal{F}$ is called an event. Let $A \in \mathcal{F}$ be an event. If $\mathbb{P}(A) = 1$ then $A$ is called an almost sure event, and if $\mathbb{P}(A) = 0$ then $A$ is called a null event. The phrase 'almost surely' is often abbreviated a.s. A sigma-field is called trivial if each of its elements is either almost sure or null.

2. Random variables

Definition. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. A random variable is a function $X : \Omega \to \mathbb{R}$ such that the set $\{\omega \in \Omega : X(\omega) \le t\}$ is an element of $\mathcal{F}$ for all $t \in \mathbb{R}$.

is Cov(X. The variance of an integrable random variable X. .Let A be a subset of R. . written ρ(X. We also use the term random variable to refer to measurable functions X from Ω to more general spaces. 1} deﬁned by 1A (ω) = for all ω ∈ Ω. E(X) = sup{E(Y ) : Y simple and 0 ≤ Y ≤ Xa. . is ρ(X. written Var(X).} Note that the expected value of a non-negative random variable may take the value ∞. . . then their correlation. n E(X) = i=1 xi P(X = xi ). is Var(X) = E{[X − E(X)]2 } = E(X 2 ) − E(X)2 . F. . The distribution function of X is the function FX : R → [0. Y ). expectation. Y ) . 1] deﬁned by FX (t) = P(X ≤ t) for all t ∈ R. For instance. Let X be a random variable on (Ω. 1 0 if ω ∈ A if ω ∈ Ac 3. . Var(X)1/2 Var(Y )1/2 82 . • X ≥ 0 almost surely. The covariance of square-integrable random variable X and Y . Y ) = E{[X − E(X)][Y − E(Y )]} = E(XY ) − E(X)E(Y ). E(X) = E(X + ) − E(X − ) • X is vector valued and E(|X|) < ∞. written Cov(X. . Xn (ω)) and Xi is a random variable for each i ∈ {1. . Y ) = Cov(X. Let A be an event in Ω. and let X be a random variable. n}. E[(X1 . Xd )] = (E[X1 ]. Expectations and variances Definition. E[Xd ]) A random variable X is integrable iﬀ E(|X|) < ∞ and is square-integrable iﬀ E(X 2 ) < ∞. • Either E(X + ) or E(X − ) is ﬁnite. We use the notation {X ∈ A} to denote the set {ω ∈ Ω : X(ω) ∈ A}. . . . The terms expected value. xn . . . The expected value of X is denoted by E(X) and is deﬁned as follows • X is simple. The indicator function of the event A is the random variable 1A : Ω → {0. P). . and mean are interchangeable. . the event {X ≤ t} denotes {ω ∈ Ω : X(ω) ≤ t}. takes only a ﬁnite number of values x1 . . Y ). . Definition. . If neither X or Y is almost surely constant.s. we call a function X : Ω → Rn a random variable or random vector if X(ω) = (X1 (ω). i.e. . In particular.

Random variables $X$ and $Y$ are called uncorrelated if $\mathrm{Cov}(X, Y) = 0$.

Theorem. Let $X$ and $Y$ be integrable random variables.
• linearity: $\mathbb{E}(aX + bY) = a\mathbb{E}(X) + b\mathbb{E}(Y)$
• positivity: Suppose $X \ge 0$ almost surely. Then $\mathbb{E}(X) \ge 0$ with equality if and only if $X = 0$ almost surely.

Definition. For $p \ge 1$, the space $L^p$ is the collection of random variables such that $\mathbb{E}(|X|^p) < \infty$. The space $L^\infty$ is the collection of random variables which are bounded almost surely.

Theorem (Jensen's inequality). Let $X$ be a random variable and $g : \mathbb{R} \to \mathbb{R}$ be a convex function. Then
\[ \mathbb{E}[g(X)] \ge g(\mathbb{E}[X]) \]
whenever the expectations exist. If $g$ is strictly convex, the above inequality is strict unless $X$ is constant.

Theorem (Hölder's inequality). Let $X$ and $Y$ be random variables and let $p, q > 1$ with $\frac{1}{p} + \frac{1}{q} = 1$. If $X \in L^p$ and $Y \in L^q$ then
\[ \mathbb{E}(XY) \le \mathbb{E}(|X|^p)^{1/p}\, \mathbb{E}(|Y|^q)^{1/q} \]
with equality if and only if $X = 0$ almost surely or $|Y| = a|X|^{p-1}$ almost surely for some constant $a \ge 0$. The case when $p = q = 2$ is called the Cauchy–Schwarz inequality.

Definition. A random variable $X$ is called discrete if $X$ takes values in a countable set; i.e. there is a countable set $S$ such that $X \in S$ almost surely. If $X$ is discrete, the function $p_X : \mathbb{R} \to [0, 1]$ defined by $p_X(t) = \mathbb{P}(X = t)$ is called the mass function of $X$.
The random variable $X$ is absolutely continuous (with respect to Lebesgue measure) if and only if there exists a function $f_X : \mathbb{R} \to [0, \infty)$ such that
\[ \mathbb{P}(X \le t) = \int_{-\infty}^t f_X(x)\,dx \quad \text{for all } t \in \mathbb{R}, \]
in which case the function $f_X$ is called the density function of $X$.
If $X$ is a random vector taking values in $\mathbb{R}^n$, then the density of $X$, if it exists, is the function $f_X : \mathbb{R}^n \to [0, \infty)$ such that
\[ \mathbb{P}(X \in A) = \int_A f_X(x)\,dx \quad \text{for all Borel subsets } A \subseteq \mathbb{R}^n. \]

Theorem. Let the function $g : \mathbb{R} \to \mathbb{R}$ be such that $g(X)$ is integrable. If $X$ is a discrete random variable with probability mass function $p_X$ taking values in a countable set $S$ then
\[ \mathbb{E}(g(X)) = \sum_{t \in S} g(t)\, p_X(t). \]
If $X$ is an absolutely continuous random variable with density function $f_X$ then
\[ \mathbb{E}(g(X)) = \int_{-\infty}^\infty g(x)\, f_X(x)\,dx. \]

More generally, if $X$ is a random vector valued in $\mathbb{R}^n$ with density $f_X$ and $g : \mathbb{R}^n \to \mathbb{R}$ then
\[ \mathbb{E}(g(X)) = \int_{\mathbb{R}^n} g(x)\, f_X(x)\,dx. \]

4. Special distributions

Definition. Let $X$ be a discrete random variable taking values in $\mathbb{Z}_+$ with mass function $p_X$. The random variable $X$ is called
• Bernoulli with parameter $p$ if $p_X(0) = 1 - p$ and $p_X(1) = p$, where $0 < p < 1$. Then $\mathbb{E}(X) = p$ and $\mathrm{Var}(X) = p(1 - p)$.
• binomial with parameters $n$ and $p$, written $X \sim \mathrm{bin}(n, p)$, if $p_X(k) = \binom{n}{k} p^k (1 - p)^{n-k}$ for all $k \in \{0, 1, \ldots, n\}$, where $n \in \mathbb{N}$ and $0 < p < 1$. Then $\mathbb{E}(X) = np$ and $\mathrm{Var}(X) = np(1 - p)$.
• Poisson with parameter $\lambda$ if $p_X(k) = \frac{\lambda^k}{k!} e^{-\lambda}$ for all $k = 0, 1, 2, \ldots$, where $\lambda > 0$. Then $\mathbb{E}(X) = \lambda$.
• geometric with parameter $p$ if $p_X(k) = p(1 - p)^{k-1}$ for all $k = 1, 2, 3, \ldots$, where $0 < p < 1$. Then $\mathbb{E}(X) = 1/p$.

Definition. Let $X$ be a continuous random variable with density function $f_X$. The random variable $X$ is called
• uniform on the interval $(a, b)$, written $X \sim \mathrm{unif}(a, b)$, if $f_X(t) = \frac{1}{b - a}$ for all $a < t < b$, for some $a < b$. Then $\mathbb{E}(X) = \frac{a + b}{2}$.
• normal or Gaussian with mean $\mu$ and variance $\sigma^2$, written $X \sim N(\mu, \sigma^2)$, if $f_X(t) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\Big( -\frac{(t - \mu)^2}{2\sigma^2} \Big)$ for all $t \in \mathbb{R}$, for some $\mu \in \mathbb{R}$ and $\sigma^2 > 0$. Then $\mathbb{E}(X) = \mu$ and $\mathrm{Var}(X) = \sigma^2$.
• exponential with rate $\lambda$, if $f_X(t) = \lambda e^{-\lambda t}$ for all $t \ge 0$, for some $\lambda > 0$. Then $\mathbb{E}(X) = 1/\lambda$.
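The stated moments of the discrete distributions above follow from the formula $\mathbb{E}(g(X)) = \sum_{t \in S} g(t)\, p_X(t)$. As a small sanity check, the sketch below sums the binomial mass function directly and compares the result with $np$ and $np(1-p)$. The parameter values are arbitrary and the helper names are illustrative.

```python
import math


def binomial_pmf(k, n, p):
    """Mass function of bin(n, p) at k."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def expect(g, pmf, support):
    """E[g(X)] for a discrete X: sum of g(t) * p_X(t) over the support."""
    return sum(g(k) * pmf(k) for k in support)


n, p = 10, 0.3  # arbitrary illustrative parameters
support = range(n + 1)


def pmf(k):
    return binomial_pmf(k, n, p)


total_mass = expect(lambda k: 1.0, pmf, support)
mean = expect(lambda k: k, pmf, support)
variance = expect(lambda k: (k - mean) ** 2, pmf, support)
```

The same `expect` helper works for any distribution with finite support. For the Poisson or geometric distributions the support is infinite, so the sum would have to be truncated and the check would only hold approximately.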

If $X$ is a random vector valued in $\mathbb{R}^n$ with density
\[ f_X(x) = (2\pi)^{-n/2} \det(V)^{-1/2} \exp\Big( -\frac{1}{2} (x - \mu) \cdot V^{-1} (x - \mu) \Big) \]
for a positive definite $n \times n$ matrix $V$ and vector $\mu \in \mathbb{R}^n$, then $X$ is said to have the $n$-dimensional normal (or Gaussian) distribution with mean $\mu$ and variance $V$, written $X \sim N_n(\mu, V)$. Then $\mathbb{E}(X_i) = \mu_i$ and $\mathrm{Cov}(X_i, X_j) = V_{ij}$.

5. Conditional probability and expectation; independence

Definition. Let $B$ be an event with $\mathbb{P}(B) > 0$. The conditional probability of an event $A$ given $B$, written $\mathbb{P}(A|B)$, is
\[ \mathbb{P}(A|B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}. \]
The conditional expectation of $X$ given $B$, written $\mathbb{E}(X|B)$, is
\[ \mathbb{E}(X|B) = \frac{\mathbb{E}(X 1_B)}{\mathbb{P}(B)}. \]

Theorem (The law of total probability). Let $B_1, B_2, \ldots$ be disjoint, non-null events such that $\bigcup_{i=1}^\infty B_i = \Omega$. Then
\[ \mathbb{P}(A) = \sum_{i=1}^\infty \mathbb{P}(A|B_i)\, \mathbb{P}(B_i) \]
for all events $A$.

Definition. Let $A_1, A_2, \ldots$ be events. If
\[ \mathbb{P}\Big( \bigcap_{i \in I} A_i \Big) = \prod_{i \in I} \mathbb{P}(A_i) \]
for every finite subset $I \subset \mathbb{N}$ then the events are said to be independent. Random variables $X_1, X_2, \ldots$ are called independent if the events $\{X_1 \le t_1\}, \{X_2 \le t_2\}, \ldots$ are independent. The phrase 'independent and identically distributed' is often abbreviated i.i.d.

Theorem. If $X$ and $Y$ are independent and integrable, then $\mathbb{E}(XY) = \mathbb{E}(X)\mathbb{E}(Y)$.

6. Probability inequalities

Theorem (Markov's inequality). Let $X$ be a positive random variable. Then
\[ \mathbb{P}(X \ge \epsilon) \le \frac{\mathbb{E}(X)}{\epsilon} \]
for all $\epsilon > 0$.

Corollary (Chebychev's inequality). Let $X$ be a random variable with $\mathbb{E}(X) = \mu$ and $\mathrm{Var}(X) = \sigma^2$. Then
\[ \mathbb{P}(|X - \mu| \ge \epsilon) \le \frac{\sigma^2}{\epsilon^2} \]
for all $\epsilon > 0$.
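The conditional probability formula, the law of total probability and the two inequalities above can all be checked on a single finite example. The sketch below uses a fair six-sided die, an example chosen here for illustration and not taken from the notes, with exact rational arithmetic.

```python
from fractions import Fraction

# Sample space of a fair die; each outcome has probability 1/6.
outcomes = range(1, 7)
prob = {w: Fraction(1, 6) for w in outcomes}


def probability(event):
    """P(event), where event is a predicate on outcomes."""
    return sum(prob[w] for w in outcomes if event(w))


def expectation(x):
    """E(X) for a random variable X given as a function of the outcome."""
    return sum(x(w) * prob[w] for w in outcomes)


def conditional_probability(a, b):
    """P(A | B) = P(A and B) / P(B), for P(B) > 0."""
    return probability(lambda w: a(w) and b(w)) / probability(b)


def is_even(w):
    return w % 2 == 0


def is_odd(w):
    return w % 2 == 1


def at_least_five(w):
    return w >= 5


# Law of total probability with the partition {even, odd}.
total_probability = (
    conditional_probability(at_least_five, is_even) * probability(is_even)
    + conditional_probability(at_least_five, is_odd) * probability(is_odd)
)

mean = expectation(lambda w: w)
variance = expectation(lambda w: (w - mean) ** 2)

# Markov with epsilon = 5 and Chebychev with epsilon = 2.
markov_bound = mean / 5
chebychev_bound = variance / Fraction(2) ** 2
tail_probability = probability(lambda w: abs(w - mean) >= 2)
```

Both bounds hold but are far from tight here, which is typical: they use only the mean or the variance and ignore the rest of the distribution.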

7. Characteristic functions

Definition. The characteristic function of a real-valued random variable $X$ is the function $\phi_X : \mathbb{R} \to \mathbb{C}$ defined by
\[ \phi_X(t) = \mathbb{E}(e^{itX}) \]
for all $t \in \mathbb{R}$, where $i = \sqrt{-1}$. More generally, if $X$ is a random vector valued in $\mathbb{R}^n$ then $\phi_X : \mathbb{R}^n \to \mathbb{C}$ defined by
\[ \phi_X(t) = \mathbb{E}(e^{it \cdot X}) \]
is the characteristic function of $X$.

Theorem (Uniqueness of characteristic functions). Let $X$ and $Y$ be real-valued random variables with distribution functions $F_X$ and $F_Y$. Let $\phi_X$ and $\phi_Y$ be the characteristic functions of $X$ and $Y$. Then $\phi_X(t) = \phi_Y(t)$ for all $t \in \mathbb{R}$ if and only if $F_X(t) = F_Y(t)$ for all $t \in \mathbb{R}$.

8. Fundamental probability results

Definition (Modes of convergence). Let $X_1, X_2, \ldots$ and $X$ be random variables.
• $X_n \to X$ almost surely if $\mathbb{P}(X_n \to X) = 1$
• $X_n \to X$ in $L^p$, for $p \ge 1$, if $\mathbb{E}|X|^p < \infty$ and $\mathbb{E}|X_n - X|^p \to 0$
• $X_n \to X$ in probability if $\mathbb{P}(|X_n - X| > \epsilon) \to 0$ for all $\epsilon > 0$
• $X_n \to X$ in distribution if $F_{X_n}(t) \to F_X(t)$ for all points $t \in \mathbb{R}$ of continuity of $F_X$

Theorem. The following implications hold:
\[ \left. \begin{array}{l} X_n \to X \text{ almost surely} \\ \text{or} \\ X_n \to X \text{ in } L^p,\ p \ge 1 \end{array} \right\} \Rightarrow X_n \to X \text{ in probability} \Rightarrow X_n \to X \text{ in distribution.} \]
Furthermore, if $r \ge p \ge 1$ then $X_n \to X$ in $L^r$ $\Rightarrow$ $X_n \to X$ in $L^p$.

Definition. Let $A_1, A_2, \ldots$ be events. The term eventually is defined by
\[ \{A_n \text{ eventually}\} = \bigcup_{N \in \mathbb{N}} \bigcap_{n \ge N} A_n \]
and infinitely often by
\[ \{A_n \text{ infinitely often}\} = \bigcap_{N \in \mathbb{N}} \bigcup_{n \ge N} A_n. \]
[The phrase 'infinitely often' is often abbreviated i.o.]

Theorem (The first Borel–Cantelli lemma). Let $A_1, A_2, \ldots$ be a sequence of events. If
\[ \sum_{n=1}^\infty \mathbb{P}(A_n) < \infty \]
then $\mathbb{P}(A_n \text{ infinitely often}) = 0$.
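Returning to the characteristic function of Section 7, a concrete case may help. The sketch below computes $\phi_X(t) = \mathbb{E}(e^{itX})$ for a binomial random variable directly from its mass function and compares it with the standard closed form $(1 - p + p e^{it})^n$. The closed form is a well-known fact that these notes do not state, and the function names and parameters are illustrative.

```python
import cmath
import math


def char_fn_from_pmf(t, pmf):
    """phi_X(t) = E(exp(i t X)) for a discrete X given as {value: probability}."""
    return sum(prob * cmath.exp(1j * t * value) for value, prob in pmf.items())


n, p = 5, 0.4  # arbitrary illustrative parameters
binomial_pmf = {k: math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}


def binomial_char_fn(t):
    """Closed form of the bin(n, p) characteristic function."""
    return (1 - p + p * cmath.exp(1j * t)) ** n


# Largest discrepancy between the two expressions on a few test points.
test_points = [-2.0, 0.0, 0.5, 3.0]
max_gap = max(abs(char_fn_from_pmf(t, binomial_pmf) - binomial_char_fn(t)) for t in test_points)
```

The closed form is the $n$-th power of the Bernoulli characteristic function, which reflects the fact that a binomial variable is a sum of $n$ independent Bernoulli variables and that expectations of products of independent variables factorize.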

Theorem (The second Borel–Cantelli lemma). Let $A_1, A_2, \ldots$ be a sequence of independent events. If
\[ \sum_{n=1}^\infty \mathbb{P}(A_n) = \infty \]
then $\mathbb{P}(A_n \text{ infinitely often}) = 1$.

Theorem (Monotone convergence theorem). Let $X_1, X_2, \ldots$ be positive random variables with $X_n \le X_{n+1}$ almost surely for all $n \ge 1$, and let $X = \sup_{n \in \mathbb{N}} X_n$. Then $X_n \to X$ almost surely and $\mathbb{E}(X_n) \to \mathbb{E}(X)$.

Theorem (Fatou's lemma). Let $X_1, X_2, \ldots$ be positive random variables. Then
\[ \mathbb{E}(\liminf_{n \uparrow \infty} X_n) \le \liminf_{n \uparrow \infty} \mathbb{E}(X_n). \]

Theorem (Dominated convergence theorem). Let $X_1, X_2, \ldots$ and $X$ be random variables such that $X_n \to X$ almost surely. If $\mathbb{E}(\sup_{n \ge 1} |X_n|) < \infty$ then $\mathbb{E}(X_n) \to \mathbb{E}(X)$.

Theorem (A strong law of large numbers). Let $X_1, X_2, \ldots$ be independent and identically distributed integrable random variables with common mean $\mathbb{E}(X_i) = \mu$. Then
\[ \frac{X_1 + \ldots + X_n}{n} \to \mu \quad \text{almost surely.} \]

Theorem (Central limit theorem). Let $X_1, X_2, \ldots$ be independent and identically distributed with $\mathbb{E}(X_i) = \mu$ and $\mathrm{Var}(X_i) = \sigma^2$ for each $i = 1, 2, \ldots$, and let
\[ Z_n = \frac{X_1 + \ldots + X_n - n\mu}{\sigma \sqrt{n}}. \]
Then $Z_n \to Z$ in distribution, where $Z \sim N(0, 1)$.
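The strong law and the central limit theorem above can be illustrated, though of course not proved, by simulation. The sketch below draws i.i.d. unif(0, 1) variables, for which $\mu = 1/2$ and $\sigma^2 = 1/12$, and compares the empirical distribution of $Z_n$ at one point with the standard normal distribution function. The seed, the sample sizes and the helper names are arbitrary choices made for this example.

```python
import math
import random


def standardized_sum(rng, n, mu, sigma):
    """One draw of Z_n = (X_1 + ... + X_n - n * mu) / (sigma * sqrt(n)) for unif(0, 1) summands."""
    total = sum(rng.random() for _ in range(n))
    return (total - n * mu) / (sigma * math.sqrt(n))


def standard_normal_cdf(x):
    """Distribution function of N(0, 1), written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


rng = random.Random(0)  # fixed seed so the run is reproducible
mu, sigma = 0.5, math.sqrt(1.0 / 12.0)

# Law of large numbers: the sample mean of many draws should be close to mu.
num_draws = 100_000
sample_mean = sum(rng.random() for _ in range(num_draws)) / num_draws

# Central limit theorem: the empirical P(Z_n <= 1) should be close to the normal value.
z_samples = [standardized_sum(rng, 50, mu, sigma) for _ in range(4000)]
empirical_cdf_at_one = sum(z <= 1.0 for z in z_samples) / len(z_samples)
normal_cdf_at_one = standard_normal_cdf(1.0)
```

Comparing distribution functions at a continuity point is exactly the notion of convergence in distribution defined earlier. A fuller check would compare the two functions over a grid of points.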

Table 1. Notation

$\mathbb{R}$ : the set of real numbers
$\mathbb{R}_+$ : the set of non-negative real numbers $[0, \infty)$
$\mathbb{N}$ : the set of natural numbers $\{1, 2, \ldots\}$
$\mathbb{C}$ : the set of complex numbers
$\mathbb{Z}$ : the set of integers $\{\ldots, -2, -1, 0, 1, 2, \ldots\}$
$\mathbb{Z}_+$ : the set of non-negative integers $\{0, 1, 2, \ldots\}$
$A^c$ : the complement of a set $A$, $A^c = \{\omega \in \Omega, \omega \notin A\}$
$F_X$ : the distribution function of a random variable $X$
$p_X$ : the mass function of a discrete random variable $X$
$f_X$ : the density function of an absolutely continuous random variable $X$
$\phi_X$ : the characteristic function of $X$
$\mathbb{E}(X)$ : the expected value of the random variable $X$
$\mathrm{Var}(X)$ : the variance of $X$
$\mathrm{Cov}(X, Y)$ : the covariance of $X$ and $Y$
$\mathbb{E}(X|B)$ : the conditional expectation of $X$ given the event $B$
$a \wedge b$ : $\min\{a, b\}$
$a \vee b$ : $\max\{a, b\}$
$a^+$ : $\max\{a, 0\}$
$\limsup_{n \uparrow \infty} x_n$ : the limit superior of the sequence $x_1, x_2, \ldots$
$\liminf_{n \uparrow \infty} x_n$ : the limit inferior of the sequence $x_1, x_2, \ldots$
$a \cdot b$ : Euclidean inner (or dot) product in $\mathbb{R}^n$, $a \cdot b = \sum_{i=1}^n a_i b_i$
$|a|$ : Euclidean norm in $\mathbb{R}^n$, $|a| = (a \cdot a)^{1/2}$
$X \sim \nu$ : the random variable $X$ is distributed as the probability measure $\nu$
$1_A$ : the indicator function of the event $A$
$N(\mu, \sigma^2)$ : the normal distribution with mean $\mu$ and variance $\sigma^2$
$N_n(\mu, V)$ : the $n$-dimensional normal distribution with mean $\mu \in \mathbb{R}^n$ and variance $V \in \mathbb{R}^{n \times n}$
$\mathrm{bin}(n, p)$ : the binomial distribution with parameters $n$ and $p$
$\mathrm{unif}(a, b)$ : the uniform distribution on the interval $(a, b)$
$L^p$ : the set of random variables $X$ with $\mathbb{E}|X|^p < \infty$

