Seppo Pynnonen
Professor of Statistics,
Department of Mathematics and Statistics,
University of Vaasa
email: sjp@uwasa.
url: www.uwasa./sjp/
1. Martingales
A discrete time introduction
Model of the market with the following submodels:

(i) T + 1 trading dates, t = 0, 1, . . . , T.

(ii) Finite sample space: Ω = {ω_1, . . . , ω_K}.

(iii) Probability measure: P : Ω → IR, with P(ω) > 0 for all ω ∈ Ω.

(iv) Filtration: IF = {F_t : t = 0, 1, . . . , T}, a submodel describing how information about the security prices is revealed to the investors, such that F_0 ⊂ F_1 ⊂ · · · ⊂ F_T.

(v) Bank account process: B = {B_t ; t = 0, 1, . . . , T}, where B_t is a stochastic process with B_0 = 1 and B_t(ω) > 0 for all t and ω ∈ Ω.
Usually B_t is nondecreasing, so that the time interval (t − 1, t) return is nonnegative, i.e.,

    r_t = (B_t − B_{t−1}) / B_{t−1} ≥ 0,  t = 1, . . . , T.
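As a quick numerical illustration, the per-period returns can be computed directly from a path of B_t. This is only a sketch: the bank account values below are made-up numbers, chosen to include one flat step.

```python
# Hypothetical nondecreasing bank account path B_0, ..., B_4 (values assumed).
B = [1.0, 1.02, 1.05, 1.05, 1.10]

# r_t = (B_t - B_{t-1}) / B_{t-1}, t = 1, ..., T
r = [(B[t] - B[t - 1]) / B[t - 1] for t in range(1, len(B))]
```

Because B_t never decreases, every r_t comes out nonnegative; the flat step gives a zero return.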
(vi) N risky security processes

    S_n = {S_n(t); t = 0, 1, . . . , T},

where S_n is a nonnegative stochastic process for all n = 1, . . . , N. S_n(t) = S_n(t, ω) is the time-t price of the risky security (at state ω).

Note: In the following we usually drop the subindex n and denote the process for short as S.

Important model concepts:

(a) The information structure IF.
(b) The stochastic process S_n.
(a) Information Structures

How to model the information revealed to the investor?

At t = 0 every state ω ∈ Ω is possible. Nothing is ruled out.

At t = T the investors learn the true state ω.

At 0 < t < T information accumulates, enabling investors to rule out certain states as impossible and to concentrate on those among which the final state will be.

Thus, we can model information accumulation as partitioning Ω into finer and finer subsets. Let P_t denote the time-t partition of Ω.

{F_1, . . . , F_m} is a partition of Ω if Ω = ∪_{i=1}^m F_i and F_i ∩ F_j = ∅ for i ≠ j.
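A minimal sketch of this definition in code (the helper name `is_partition` and the toy sets are my own):

```python
def is_partition(subsets, omega):
    """True iff the union of `subsets` is `omega` and the members
    are pairwise disjoint, i.e. they form a partition of omega."""
    union = set().union(*subsets)
    disjoint = all(a.isdisjoint(b)
                   for i, a in enumerate(subsets)
                   for b in subsets[i + 1:])
    return union == set(omega) and disjoint

omega = {1, 2, 3, 4, 5, 6}
good = [{1, 2}, {3, 4, 5}, {6}]    # covers omega, pairwise disjoint
bad = [{1, 2}, {2, 3}, {4, 5, 6}]  # cells overlap in state 2
```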
Example 1.1: Coin tossing. Let S_t (= S(t)) be such that S_t = S_{t−1} + $1 if head (H) and S_t = S_{t−1} − $1 if tail (T), with S_0 the initial amount of money. Three tosses. Then the sample space is

    {HHH} = ω_1    {HTT} = ω_5
    {HHT} = ω_2    {THT} = ω_6
    {HTH} = ω_3    {TTH} = ω_7
    {THH} = ω_4    {TTT} = ω_8,

so K = 8 and Ω = {ω_1, . . . , ω_8}.

S_t is a stochastic process. Thus, for example, if S_0 = 0 then S_t(ω_2) = {0, 1, 2, 1} is the sample path corresponding to state ω_2.
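The sample paths of all eight states can be generated mechanically. This short sketch (function names are mine) reproduces the path S_t(ω_2) = {0, 1, 2, 1} quoted above:

```python
# States in the slides' numbering: omega_1 = HHH, ..., omega_8 = TTT.
outcomes = ["HHH", "HHT", "HTH", "THH", "HTT", "THT", "TTH", "TTT"]

def sample_path(omega, s0=0):
    """S_0, S_1, S_2, S_3 for a toss sequence: +$1 per head, -$1 per tail."""
    path = [s0]
    for toss in omega:
        path.append(path[-1] + (1 if toss == "H" else -1))
    return path

paths = {w: sample_path(w) for w in outcomes}
```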
The information accumulation process as time goes on is, for example in the case of ω_5, as follows (if we are only interested in the final state):

    t   Outcome   Subset of the possible outcomes
    0     —       {ω_1, ω_2, ω_3, ω_4, ω_5, ω_6, ω_7, ω_8}
    1     H       {ω_1, ω_2, ω_3, ω_5}
    2     T       {ω_3, ω_5}
    3     T       {ω_5}
At each step we observe either H or T, so the incremental partitioning becomes

    t   Outcome   Partition
    0     —       P_0 = {{ω_1, . . . , ω_8}}
    1   H or T    P_1 = {{ω_1, ω_2, ω_3, ω_5}, {ω_4, ω_6, ω_7, ω_8}}
    2   H or T    P_2 = {{ω_3, ω_5}, {ω_1, ω_2}, {ω_4, ω_6}, {ω_7, ω_8}}
    3   H or T    P_3 = {{ω_1}, {ω_2}, {ω_3}, {ω_4}, {ω_5}, {ω_6}, {ω_7}, {ω_8}}

Note that e.g. P_2 = {{ω_3, ω_6}, . . .} is not possible.
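The partitions P_0, . . . , P_3 can be generated by grouping states on their observed toss prefix; a sketch (helper names are mine):

```python
from collections import defaultdict

outcomes = ["HHH", "HHT", "HTH", "THH", "HTT", "THT", "TTH", "TTT"]

def partition_at(t):
    """P_t: states sharing the first t tosses are still indistinguishable,
    so they sit in the same cell of the partition."""
    cells = defaultdict(set)
    for w in outcomes:
        cells[w[:t]].add(w)
    return {frozenset(c) for c in cells.values()}
```

For instance, `partition_at(2)` contains the cell {HTH, HTT}, i.e. {ω_3, ω_5}, and can never produce a cell like {ω_3, ω_6}, whose states already differ in the second toss.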
In the above example we observe that the next-step partition P_{t+1} is obtained from P_t by partitioning its sets. Schematically this corresponds to the path or network diagram below (time running left to right from t = 0 to t = 3).

    {ω_1, . . . , ω_8}
    ├─ H: {ω_1, ω_2, ω_3, ω_5}
    │    ├─ H: {ω_1, ω_2}  ├─ H: {ω_1}
    │    │                 └─ T: {ω_2}
    │    └─ T: {ω_3, ω_5}  ├─ H: {ω_3}
    │                      └─ T: {ω_5}
    └─ T: {ω_4, ω_6, ω_7, ω_8}
         ├─ H: {ω_4, ω_6}  ├─ H: {ω_4}
         │                 └─ T: {ω_6}
         └─ T: {ω_7, ω_8}  ├─ H: {ω_7}
                           └─ T: {ω_8}

Figure 1.1: Information tree for the coin toss example.
Mathematically the information structure can be conveniently modeled with a structure called an algebra.

Definition 1.1: A collection F of subsets of Ω is called an algebra on Ω if

(a) Ω ∈ F,
(b) if F ∈ F then F^c ∈ F,
(c) if F, G ∈ F then F ∪ G ∈ F.
Remark 1.1: (i) ∅ ∈ F, where ∅ is the empty set; (ii) if F, G ∈ F then F ∩ G ∈ F.

Remark 1.2: In probability theory (sub)sets are called events.

Remark 1.3: An algebra is a family of subsets which is closed under set operations. That is, if a set D is obtained from sets in F by some set operations, then D ∈ F.

Remark 1.4: If {A_1, . . . , A_m} is a partition of Ω then there is a unique algebra F_A corresponding to this partition. Conversely, for each algebra F there is a unique partition of Ω.
Because algebras are more convenient in probability theory, the information structure will be modeled in terms of a sequence of algebras F_t rather than partitions.

For this purpose we denote the sequence as

    IF = {F_t : t = 0, 1, . . . , T},

where F_t ⊂ F_{t+1}, that is, if F ∈ F_t then F ∈ F_{t+1}. IF is called a filtration.

In particular, we have

    F_0 = {∅, Ω}

and

    F_T = {A : A ⊂ Ω} = P(Ω),

the power set of Ω, i.e., the set of all possible subsets of Ω.
Example 1.2: Coin tossing example continued.

    t = 0:  P_0 = {Ω}
            F_0 = {∅, Ω}

    t = 1:  P_1 = {{ω_1, ω_2, ω_3, ω_5}, {ω_4, ω_6, ω_7, ω_8}}
            F_1 = {∅, Ω, {ω_1, ω_2, ω_3, ω_5}, {ω_4, ω_6, ω_7, ω_8}}

    t = 2:  P_2 = {{ω_1, ω_2}, {ω_3, ω_5}, {ω_4, ω_6}, {ω_7, ω_8}}
            F_2 = {∅, Ω, {ω_1, ω_2}, {ω_3, ω_5}, {ω_4, ω_6}, {ω_7, ω_8},
                   {ω_1, ω_2, ω_3, ω_5}, {ω_1, ω_2, ω_4, ω_6}, {ω_1, ω_2, ω_7, ω_8},
                   {ω_3, ω_4, ω_5, ω_6}, {ω_3, ω_5, ω_7, ω_8}, {ω_4, ω_6, ω_7, ω_8},
                   {ω_1, ω_2, ω_3, ω_4, ω_5, ω_6}, {ω_1, ω_2, ω_3, ω_5, ω_7, ω_8},
                   {ω_1, ω_2, ω_4, ω_6, ω_7, ω_8}, {ω_3, ω_4, ω_5, ω_6, ω_7, ω_8}}

    t = 3:  P_3 = {{ω_1}, {ω_2}, {ω_3}, {ω_4}, {ω_5}, {ω_6}, {ω_7}, {ω_8}}
            F_3 = P(Ω) = {A : A ⊂ Ω}

Thus F_0 ⊂ F_1 ⊂ F_2 ⊂ F_3, so IF = {F_t ; t = 0, 1, 2, 3} is a filtration.

The partitions P_t consist of the smallest nonempty sets of the corresponding algebra F_t, i.e. P_t = {A ∈ F_t : A ≠ ∅, and A ∩ B = A or ∅ for all B ∈ F_t} (when the sample space is finite).
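The correspondence of Remark 1.4 is easy to check by brute force: the algebra generated by a partition consists of all unions of its cells. A sketch (the function name is mine), applied to the t = 2 partition above:

```python
from itertools import combinations

def algebra_from_partition(cells):
    """All unions of subcollections of partition cells: 2^m sets,
    including the empty union (the empty set) and the full union (Omega)."""
    cells = list(cells)
    out = set()
    for r in range(len(cells) + 1):
        for combo in combinations(cells, r):
            out.add(frozenset().union(*combo))
    return out

# Integers 1..8 stand in for omega_1, ..., omega_8.
P2 = [frozenset({1, 2}), frozenset({3, 5}), frozenset({4, 6}), frozenset({7, 8})]
F2 = algebra_from_partition(P2)  # 2^4 = 16 sets, as listed above
```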
(b) Stochastic Process

Definition 1.2: A (discrete time) stochastic process S is a real-valued function S(t, ω) of both t and ω. That is,

    S : {0, 1, . . . , T} × Ω → IR.                    (1)

For each fixed ω ∈ Ω the function t ↦ S(t, ω) is called the sample path.

Remark 1.5: We usually denote the process for short as S(t) = S(t, ω).
Example 1.4: In our previous example, with ω = ω_3 = HTH the sample path is S(t, ω_3) = {0, 1, 0, 1}, because S(0, ω_3) = 0, S(1, ω_3) = 1, S(2, ω_3) = 0 and S(3, ω_3) = 1.

For fixed t, the function ω ↦ S(t, ω) is a random variable (see Definition 1.4 below).

Example 1.5: For t = 2 the possible values of the random variable S(2, ω) are −2, 0 or 2.
In order to make the information structure consistent with the random variable S of security prices, we need to link the values of S to our information structure. This is done via the measurability concept.

Definition 1.3: We say that a real-valued function X : Ω → IR is measurable with respect to the algebra F (for short, F-measurable) if

    {ω ∈ Ω : X(ω) = x} ∈ F  for all x ∈ IR.            (2)

Definition 1.4: X : Ω → IR is a random variable on an algebra F if it is F-measurable.
Remark 1.6: (i) Definition 1.3 implies that if A ∈ P, the partition generating F, then X is constant for all ω ∈ A.
(ii) For short we denote simply X ∈ F if X is F-measurable.
(iii) All the following are equivalent: X is F-measurable ⟺ for all x ∈ IR, {ω : X(ω) = x} ∈ F ⟺ {ω : X(ω) ≤ x} ∈ F ⟺ {ω : X(ω) < x} ∈ F.
Example 1.6: With

    F_1 = {∅, Ω, {ω_1, ω_2, ω_3, ω_4}, {ω_5, ω_6, ω_7, ω_8}}

let

    X(ω) = 6 if ω = ω_1, ω_2, ω_3, or ω_4;   X(ω) = 8 if ω = ω_5, ω_6, ω_7, or ω_8,

and

    Y(ω) = 1 if ω = ω_1, ω_3, ω_5, or ω_7;   Y(ω) = 0 if ω = ω_2, ω_4, ω_6, or ω_8.

Then X is F_1-measurable, but Y is not.
Example 1.7: Coin tossing continued.

Is S(2, ω) F_2-measurable?

    {S(2, ω) = −2} = {ω_7, ω_8} ∈ F_2
    {S(2, ω) = 0}  = {ω_3, ω_4, ω_5, ω_6} ∈ F_2
    {S(2, ω) = 2}  = {ω_1, ω_2} ∈ F_2
    {S(2, ω) = x, x ∉ {−2, 0, 2}} = ∅ ∈ F_2.

Thus S(2, ω) is F_2-measurable. However, note that S(2, ω) is obviously not F_1-measurable, but S(1, ω) is. Furthermore, note that S(1, ω) is F_2-measurable, and both S(1, ω) and S(2, ω) are F_3-measurable.
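On a finite sample space, F-measurability reduces to being constant on the cells of the generating partition (Remark 1.6), which makes it checkable in a few lines (helper names are mine):

```python
def is_measurable(X, partition):
    """X (a dict state -> value) is measurable w.r.t. the algebra
    generated by `partition` iff X is constant on each cell."""
    return all(len({X[w] for w in cell}) == 1 for cell in partition)

# S(2, .) from the coin-toss example: heads minus tails over the first two tosses.
S2 = {"HHH": 2, "HHT": 2, "HTH": 0, "HTT": 0,
      "THH": 0, "THT": 0, "TTH": -2, "TTT": -2}
P1 = [{"HHH", "HHT", "HTH", "HTT"}, {"THH", "THT", "TTH", "TTT"}]
P2 = [{"HHH", "HHT"}, {"HTH", "HTT"}, {"THH", "THT"}, {"TTH", "TTT"}]
```

As in the example, S(2, ω) is constant on each cell of P_2 but not on those of P_1.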
Remark 1.7: A variable with a constant value is always measurable on any algebra.

To make the observed values of S up to time point t part of the information set, we define:

Definition 1.5: A stochastic process

    S = {S(t); t = 0, 1, . . . , T}

is said to be adapted to the filtration

    IF = {F_t ; t = 0, 1, . . . , T}

if the random variable S(t) is F_t-measurable for every t = 0, 1, . . . , T.

Example 1.8: In the coin tossing case, the stochastic process S is adapted.
Remark 1.8: If S is adapted, then S(u) is F_t-measurable for all u ≤ t. This guarantees that all historical information is available to the investors at the current time point t.

Given the current F_t we can deduce the basic partition P_t. So if we know the subset where the true state ω belongs, we can deduce the current price of the security (because it is constant for all states of the subset). Going backwards, we can deduce the price history.
Example 1.9: (Continued) S(2) ∈ {−2, 0, 2}. If we know ω ∈ {ω_3, ω_5}, then S(2, ω) = 0, and because P_1 = {{ω_1, ω_2, ω_3, ω_5}, {ω_4, ω_6, ω_7, ω_8}}, we know that at t = 1 the subset was {ω_1, ω_2, ω_3, ω_5}, for which S(1, ω) = 1 for all ω ∈ {ω_1, ω_2, ω_3, ω_5}. Thus the price process up to time t = 2 has been {0, 1, 0}.
Filtrations to which the security prices are adapted may be interpreted as learning processes of the investors about security prices. Usually, however, it is possible to construct several filtrations such that the price processes are adapted, some of which may be unacceptable (against financial theory or practical intuition).
Example 1.10: (Continued) If we define

    P*_1 = {{ω_1, ω_2}, {ω_3, ω_5}, {ω_4, ω_6}, {ω_7, ω_8}}

and

    P*_2 = {{ω_1, ω_2}, {ω_3, ω_5}, {ω_4, ω_6}, {ω_7}, {ω_8}},

and let F*_1 and F*_2 be the corresponding algebras (construct them as an exercise), then IF* = {F_0, F*_1, F*_2, F_3} (where F_0 = {∅, Ω} and F_3 = {A : A ⊂ Ω} is the power set) is a filtration and S(t) is adapted to IF*.

Then of course S(2, ω) is F*_2-measurable, but it is also F*_1-measurable!

Consequences?

Suppose that at time t = 1 we learn ω ∈ {ω_3, ω_5}. Then we know S(1, ω) = 1, but because S(2, ω_3) = S(2, ω_5) = 0, we know exactly what the price will be at the next period t = 2!

That is, the price is fully predictable on the basis of the information at t = 1.

The same is true in all other cases whenever we know the subset where ω is at time t = 1.
There always exists one filtration that corresponds to learning about the prices as time goes on, but learning nothing more. This corresponds at each time point to the coarsest partition refining the previous partitions that is consistent with the stock price process.

An acceptable filtration is thus constructed from the partitions generated by the stochastic process itself. The result is a filtration that is the coarsest possible (i.e., the generating partitions have the fewest possible subsets) such that the price process is adapted.
Example 1.11: (Continued) Suppose that instead of the coin tosses the investor observes the price process

    ω_k    S_0   S_1   S_2   S_3
    ω_1     0     1     2     3
    ω_2     0     1     2     1
    ω_3     0     1     0     1
    ω_4     0    −1     0     1
    ω_5     0     1     0    −1
    ω_6     0    −1     0    −1
    ω_7     0    −1    −2    −1
    ω_8     0    −1    −2    −3

This gives exactly the same partitions and algebras as earlier. Here the implied filtration is acceptable.
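The partitions implied by observing only prices can be computed by grouping states whose price history agrees. A sketch with the table above as data (the labels w1, . . . , w8 stand for ω_1, . . . , ω_8; helper names are mine):

```python
from collections import defaultdict

# Price paths (S_0, S_1, S_2, S_3) per state, as in the table above.
paths = {
    "w1": (0, 1, 2, 3),    "w2": (0, 1, 2, 1),
    "w3": (0, 1, 0, 1),    "w4": (0, -1, 0, 1),
    "w5": (0, 1, 0, -1),   "w6": (0, -1, 0, -1),
    "w7": (0, -1, -2, -1), "w8": (0, -1, -2, -3),
}

def price_partition(t):
    """States whose prices agree up to time t fall in the same cell."""
    cells = defaultdict(set)
    for w, path in paths.items():
        cells[path[:t + 1]].add(w)
    return {frozenset(c) for c in cells.values()}
```

The cells match the coin-toss partitions: e.g. `price_partition(1)` splits Ω into {ω_1, ω_2, ω_3, ω_5} and {ω_4, ω_6, ω_7, ω_8}.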
Example 1.12: (Pliska 1997, Example 3.3) Suppose N = 1, K = 4, and T = 2, and the markets are going to evolve as follows:

    ω_k    S_0   S_1   S_2
    ω_1     5     8     9
    ω_2     5     8     6
    ω_3     5     4     6
    ω_4     5     4     3

At time t = 0 investors observe S_0 = 5 for all ω ∈ Ω, so P_0 = {Ω} and

    F_0 = {∅, Ω}.

At time t = 1 investors observe either S_1 = 8 or S_1 = 4, giving rise to P_1 = {{ω_1, ω_2}, {ω_3, ω_4}}, so that

    F_1 = {∅, Ω, {ω_1, ω_2}, {ω_3, ω_4}}.

At time t = 2 investors observe S_2 and thereby deduce ω, and hence the partition is {{ω_1}, {ω_2}, {ω_3}, {ω_4}} and

    F_2 = {A : A ⊂ Ω}
        = {∅, Ω, {ω_1}, {ω_2}, {ω_3}, {ω_4}, {ω_1, ω_2}, {ω_1, ω_3}, {ω_1, ω_4},
           {ω_2, ω_3}, {ω_2, ω_4}, {ω_3, ω_4},
           {ω_1, ω_2, ω_3}, {ω_1, ω_2, ω_4}, {ω_1, ω_3, ω_4}, {ω_2, ω_3, ω_4}}.

F_0 ⊂ F_1 ⊂ F_2 is a filtration, and S_t is adapted to it.
As an information tree the structure is

    S_0 = 5
    ├─ S_1 = 8  ├─ S_2 = 9   (ω_1)
    │           └─ S_2 = 6   (ω_2)
    └─ S_1 = 4  ├─ S_2 = 6   (ω_3)
                └─ S_2 = 3   (ω_4)

Figure 1.2: Information structure and risky securities for the above example.
Definition 1.9a: The stochastic process S_t, t = 0, 1, 2, . . . is called a (discrete-time) martingale with respect to the filtration IF = {F_t : t = 0, 1, . . .}, if

(1) S_t is adapted to IF,
(2) E|S_t| < ∞ for all t,
(3) E[S_t | F_s] = S_s for all 0 ≤ s < t.

Remark: Using the law of iterated expectations, condition (3) is equivalent to

    E[S_{t+1} | F_t] = S_t for all t.
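For the fair coin toss (p = 1/2, each state having probability 1/8), condition (3) can be verified atom by atom, since a conditional expectation given F_s is just a probability-weighted average over each atom. A sketch (helper names are mine):

```python
from itertools import product

outcomes = ["".join(t) for t in product("HT", repeat=3)]
prob = {w: 1 / 8 for w in outcomes}  # fair coin: all states equally likely

def S(t, w):
    """Price after t tosses: +1 per head, -1 per tail."""
    return sum(1 if toss == "H" else -1 for toss in w[:t])

def cond_exp(t, atom):
    """E[S_t | A] for an atom A: weighted average over the atom."""
    pa = sum(prob[w] for w in atom)
    return sum(prob[w] * S(t, w) for w in atom) / pa

# Atoms of F_1 (states grouped by the first toss); the martingale
# condition E[S_2 | F_1] = S_1 must hold on each atom.
atoms_F1 = [{w for w in outcomes if w[0] == c} for c in "HT"]
```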
Continuous time martingales

Ω is now a general continuous space (e.g. a Euclidean space).

Definition 1.6: A collection F of subsets of Ω is called a sigma-algebra (also called a sigma-field) on Ω if

(a) Ω ∈ F,
(b) if F ∈ F then F^c ∈ F,
(c) if F_1, F_2, . . . ∈ F then ∪_{i=1}^∞ F_i ∈ F.

Definition 1.7: A filtration is an increasing family of sub-sigma-fields F_t ⊂ F.

Definition 1.8: Let (Ω, F, P) be a probability space. A stochastic process {S_t, t ∈ [0, ∞)} is a measurable function S : [0, ∞) × Ω → IR.
For a fixed ω ∈ Ω, S(·, ω) : [0, ∞) → IR (also denoted S_t or S(t)) gives the sample path, or a realization of the process, associated with ω. For a fixed t, S(t, ·) : Ω → IR is a random variable.

Remark: Let X : Ω → IR be a random variable on a sigma-algebra F. The sigma-algebra constructed from the sets of the form

    {ω : a < X(ω) ≤ b},

for all a, b ∈ IR, is called the sigma-algebra generated by the random variable X, and is denoted σ(X).

In the following, denote the information sets F_t in the associated filtration IF = {F_t} as I_t.

Remark: The information sets generated by a stochastic process S_t, I_t = σ(S_u ; u ≤ t), form a filtration, with respect to which S_t is automatically adapted (i.e., S_t is I_t-measurable for all t).
Using the past information I_t, the best prediction (best with respect to the mean square criterion) of a future value S_{t+u}, u > 0, is the conditional expectation E[S_{t+u} | I_t], usually abbreviated as E_t[S_{t+u}].

Definition 1.9: A stochastic process {S_t, t ∈ [0, ∞)} is a martingale with respect to the family of information sets I_t and probability measure P if for all t > 0

(1) S_t is I_t-adapted (i.e., known, given I_t),
(2) E|S_t| < ∞ (unconditional forecasts are finite),
(3) E_t[S_{t+u}] = S_t for all u > 0.
Remark 1.9: All the expectations here are w.r.t. the probability measure P.

Remark 1.10: If S_t is a martingale then the future changes S_{t+u} − S_t are unpredictable, because E_t[S_{t+u} − S_t] = E_t[S_{t+u}] − E_t[S_t] = S_t − S_t = 0.

Remark 1.11: S_t is a submartingale if E_t[S_{t+u}] ≥ S_t, and a supermartingale if E_t[S_{t+u}] ≤ S_t, for all u > 0.
Example 1.13: S_t = S_{t−1} + e_t is a martingale if E_t[e_{t+u}] = 0 for all u > 0. S_t = μ + S_{t−1} + e_t is a submartingale if μ > 0 and a supermartingale if μ < 0.

Example 1.14: Suppose Y_t is an adapted process with respect to the filtration I_1 ⊂ I_2 ⊂ . . . ⊂ I_T. Define

    M_t = E[Y_T | I_t].

Using the law of iterated expectations it is easily seen that M_t is a martingale.
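A numerical check of Example 1.14 on the coin-toss space: take a terminal payoff Y_T (here Y_T = S_3 squared, a choice made purely for illustration) and verify E[M_{t+1} | F_t] = M_t on every atom:

```python
from itertools import product

outcomes = ["".join(t) for t in product("HT", repeat=3)]
prob = {w: 1 / 8 for w in outcomes}

def S(t, w):
    return sum(1 if toss == "H" else -1 for toss in w[:t])

Y = {w: S(3, w) ** 2 for w in outcomes}  # terminal payoff (illustrative choice)

def M(t, w):
    """M_t = E[Y_T | F_t], evaluated on the atom of states sharing
    the first t tosses with w."""
    atom = [v for v in outcomes if v[:t] == w[:t]]
    return sum(prob[v] * Y[v] for v in atom) / sum(prob[v] for v in atom)

def exp_next_M(t, w):
    """E[M_{t+1} | F_t] on the atom containing w."""
    atom = [v for v in outcomes if v[:t] == w[:t]]
    return sum(prob[v] * M(t + 1, v) for v in atom) / sum(prob[v] for v in atom)
```

Even though S_t squared is not itself a martingale, M_t is, exactly as the law of iterated expectations predicts.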
We say that a martingale S_t is right-continuous if its sample paths (trajectories) are right-continuous. Technically,

    plim_{u→0+} |S_{t+u} − S_t| = 0,  i.e.,  lim_{u→0+} P(|S_{t+u} − S_t| > ε) = 0

for all ε > 0.

A martingale is continuous if its sample paths are continuous, that is, plim |S_{t+u} − S_t| = 0 as u → 0.

If S_t is a continuous martingale and E[S_t^2] < ∞, S_t is called a continuous square integrable martingale.
Doob-Meyer Decomposition

An important task in financial applications is to convert submartingales to martingales. This can be done basically in two ways. One is the Doob-Meyer decomposition.

As an example, suppose the probability of an uptick (by one unit, +1) at any time point is p, and that of a downtick (by one unit, −1) is 1 − p. Denote the price process as S_t, with independent increments, observed at time points t_i, i = 0, 1, . . . , k. Then

    E^p[S_{t_k} | S_{t_0}, S_{t_1}, . . . , S_{t_{k−1}}] = S_{t_{k−1}} + (2p − 1).

If p = 1/2, {S_{t_i}} is a martingale; if p > 1/2, S_{t_i} is a submartingale; and if p < 1/2, S_{t_i} is a supermartingale.
Defining

    Z_{t_k} = S_{t_k} + (1 − 2p)(k + 1),

we find that Z_{t_k} is a martingale. We thus have the decomposition of S_{t_k}:

    S_{t_k} = (2p − 1)(k + 1) + Z_{t_k},

where Z_{t_k} is a martingale and, for p > 1/2, the term (2p − 1)(k + 1) is increasing in k. This is a special case of the following result.

Theorem 1.1 (Doob-Meyer): Let X_t be a right-continuous submartingale w.r.t. the filtration {I_t} with E|X_t| < ∞ for all t. Then X_t admits the decomposition

    X_t = M_t + A_t,

where M_t is a right-continuous martingale and A_t is an increasing I_t-measurable process.
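A quick check of the compensated process for the biased walk above: the one-step conditional expectations can be written out exactly, so no simulation is needed (the function names are mine):

```python
p = 0.7  # uptick probability; p > 1/2 makes S a submartingale

def exp_next_S(s):
    """E[S_{t_k} | S_{t_{k-1}} = s]: up +1 w.p. p, down -1 w.p. 1 - p."""
    return p * (s + 1) + (1 - p) * (s - 1)  # = s + (2p - 1)

def Z(s, k):
    """Compensated process Z_{t_k} = S_{t_k} + (1 - 2p)(k + 1)."""
    return s + (1 - 2 * p) * (k + 1)

def exp_next_Z(s, k):
    """E[Z_{t_{k+1}} | S_{t_k} = s]; equals Z(s, k) for any s and k."""
    return p * Z(s + 1, k + 1) + (1 - p) * Z(s - 1, k + 1)
```

The drift 2p − 1 added by each step of S is exactly cancelled by the increment of the compensator term, which is the content of the decomposition here.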
Another, more convenient, method to transform a submartingale into a martingale is to transform the probability measure. This will be considered later.