
Contents

Articles
Mathematical finance 1
Asymptotic analysis 4
Calculus 7
Copula (statistics) 20
Differential equation 25
Expected value 31
Ergodic theory 38
Feynman–Kac formula 44
Fourier transform 46
Girsanov's theorem 66
Itô's lemma 68
Martingale representation theorem 72
Mathematical model 73
Monte Carlo method 79
Numerical analysis 90
Real analysis 100
Partial differential equation 102
Probability 114
Probability distribution 120
Binomial distribution 125
Log-normal distribution 132
Heat equation 137
Radon–Nikodym derivative 149
Risk-neutral measure 153
Stochastic calculus 155
Wiener process 157
Lévy process 164
Stochastic differential equations 167
Stochastic volatility 171
Numerical partial differential equations 174
Crank–Nicolson method 175
Finite difference 180
Value at risk 186
Volatility (finance) 194
Autoregressive conditional heteroskedasticity 198
Brownian Model of Financial Markets 202
Rational pricing 207
Arbitrage 214
Futures contract 223
Put–call parity 235
Intrinsic value (finance) 237
Option time value 239
Moneyness 240
Black–Scholes 242
Black model 254
Binomial options pricing model 255
Monte Carlo option model 260
Volatility smile 263
Implied volatility 267
SABR Volatility Model 270
Markov Switching Multifractal 272
Greeks (finance) 275
Finite difference methods for option pricing 284
Trinomial tree 285
Optimal stopping 287
Interest rate derivative 288
Short rate model 291
Hull–White model 293
Cox–Ingersoll–Ross model 296
Chen model 297
LIBOR Market Model 299
Heath–Jarrow–Morton framework 300
References
Article Sources and Contributors 303
Image Sources, Licenses and Contributors 308
Article Licenses
License 309
Mathematical finance
Mathematical finance is a field of applied mathematics concerned with financial markets. The subject has a close relationship
with the discipline of financial economics, which is concerned with much of the underlying theory. Generally,
mathematical finance will derive, and extend, the mathematical or numerical models suggested by financial
economics. Thus, for example, while a financial economist might study the structural reasons why a company may
have a certain share price, a financial mathematician may take the share price as a given, and attempt to use
stochastic calculus to obtain the fair value of derivatives of the stock (see: Valuation of options).
In terms of practice, mathematical finance also overlaps heavily with the field of computational finance (also known
as financial engineering). Arguably, these are largely synonymous, although the latter focuses on application, while
the former focuses on modeling and derivation (see: Quantitative analyst). The fundamental theorem of
arbitrage-free pricing is one of the key theorems in mathematical finance. Many universities around the world now
offer degree and research programs in mathematical finance; see Master of Mathematical Finance.
History
The history of mathematical finance starts with The Theory of Speculation (published 1900) by Louis Bachelier,
which discussed the use of Brownian motion to evaluate stock options. However, it hardly caught any attention
outside academia.
The first influential work of mathematical finance is the theory of portfolio optimization by Harry Markowitz on
using mean-variance estimates of portfolios to judge investment strategies, causing a shift away from the concept of
trying to identify the best individual stock for investment. Using a linear regression strategy to understand and
quantify the risk (i.e. variance) and return (i.e. mean) of an entire portfolio of stocks and bonds, an optimization
strategy was used to choose a portfolio with largest mean return subject to acceptable levels of variance in the return.
Simultaneously, William Sharpe developed the mathematics of determining the correlation between each stock and
the market. For their pioneering work, Markowitz and Sharpe, along with Merton Miller, shared the 1990 Nobel
Memorial Prize in Economic Sciences, for the first time ever awarded for a work in finance.
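The selection rule described above (largest mean return subject to an acceptable level of variance) can be sketched for a two-asset portfolio. All figures below, the returns, volatilities, correlation, and risk cap, are invented for illustration:

```python
# Toy mean-variance selection for a two-asset portfolio (illustrative
# figures only). We scan portfolio weights and keep the portfolio with
# the largest mean return whose variance stays under a risk cap.
mu = (0.08, 0.12)        # expected returns of asset 1 and asset 2 (assumed)
sigma = (0.10, 0.25)     # return standard deviations (assumed)
rho = 0.2                # correlation between the two assets (assumed)
var_cap = 0.02           # largest acceptable portfolio variance (assumed)

best = None
for i in range(101):
    w = i / 100          # weight in asset 1; (1 - w) goes to asset 2
    mean = w * mu[0] + (1 - w) * mu[1]
    var = (w * sigma[0])**2 + ((1 - w) * sigma[1])**2 \
        + 2 * w * (1 - w) * rho * sigma[0] * sigma[1]
    if var <= var_cap and (best is None or mean > best[0]):
        best = (mean, var, w)

mean, var, w = best
print(f"weight in asset 1: {w:.2f}, mean {mean:.4f}, variance {var:.4f}")
```

A full Markowitz treatment would optimize over many assets with a covariance matrix; the grid search above only illustrates the trade-off between mean and variance.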
The portfolio-selection work of Markowitz and Sharpe introduced mathematics to the black art of investment
management. With time, the mathematics has become more sophisticated. Thanks to Robert Merton and Paul
Samuelson, one-period models were replaced by continuous time, Brownian-motion models, and the quadratic utility
function implicit in mean–variance optimization was replaced by more general increasing, concave utility functions.[1]
The next major revolution in mathematical finance came with the work of Fischer Black and Myron Scholes along
with fundamental contributions by Robert C. Merton, by modeling financial markets with stochastic models. For this
M. Scholes and R. Merton were awarded the 1997 Nobel Memorial Prize in Economic Sciences. Black was
ineligible for the prize because of his death in 1995.
More sophisticated mathematical models and derivative pricing strategies were then developed but their credibility
was damaged by the financial crisis of 2007–2010. Bodies such as the Institute for New Economic Thinking are now
attempting to establish more effective theories and methods.[2]
Mathematical finance articles
Mathematical tools
Asymptotic analysis
Calculus
Copulas
Differential equations
Expected value
Ergodic theory
Feynman–Kac formula
Fourier transform
Gaussian copulas
Girsanov's theorem
Itô's lemma
Martingale representation theorem
Mathematical models
Monte Carlo method
Numerical analysis
Real analysis
Partial differential equations
Probability
Probability distributions
Binomial distribution
Log-normal distribution
Quantile functions
Heat equation
Radon–Nikodym derivative
Risk-neutral measure
Stochastic calculus
Brownian motion
Lévy process
Stochastic differential equations
Stochastic volatility
Numerical partial differential equations
Crank–Nicolson method
Finite difference method
Value at risk
Volatility
ARCH model
GARCH model
Derivatives pricing
The Brownian Motion Model of Financial Markets
Rational pricing assumptions
Risk neutral valuation
Arbitrage-free pricing
Futures contract pricing
Options
Put–call parity (Arbitrage relationships for options)
Intrinsic value, Time value
Moneyness
Pricing models
Black–Scholes model
Black model
Binomial options model
Monte Carlo option model
Implied volatility, Volatility smile
SABR Volatility Model
Markov Switching Multifractal
The Greeks
Finite difference methods for option pricing
Trinomial tree
Optimal stopping (Pricing of American options)
Interest rate derivatives
Short rate model
Hull–White model
Cox–Ingersoll–Ross model
Chen model
LIBOR Market Model
Heath–Jarrow–Morton framework
See also
Computational finance
Quantitative Behavioral Finance
Derivative (finance), list of derivatives topics
Modeling and analysis of financial markets
International Swaps and Derivatives Association
Fundamental financial concepts - topics
Model (economics)
List of finance topics
List of economics topics, List of economists
List of accounting topics
Statistical Finance
Brownian model of financial markets
Master of Mathematical Finance
Notes
[1] Karatzas, I., Methods of Mathematical Finance, Secaucus, NJ, USA: Springer-Verlag New York, Incorporated, 1998
[2] Gillian Tett (April 15, 2010), Mathematicians must get out of their ivory towers (http://www.ft.com/cms/s/0/cfb9c43a-48b7-11df-8af4-00144feab49a.html), Financial Times.
References
Harry Markowitz, Portfolio Selection, Journal of Finance, 7, 1952, pp. 77–91
William Sharpe, Investments, Prentice-Hall, 1985
Asymptotic analysis
In mathematical analysis, asymptotic analysis is a method of describing limiting behavior. The methodology has
applications across science. Examples include the analysis of algorithms in computer science, where one considers the
performance of algorithms applied to very large input datasets, and the behavior of physical systems when they are
very large.
The simplest example is the following: when considering a function f(n), there is a need to describe its properties
when n becomes very large. Thus, if f(n) = n² + 3n, the term 3n becomes insignificant compared to n² when n is very
large. The function f(n) is said to be "asymptotically equivalent to n² as n → ∞", and this is written symbolically as
f(n) ~ n².
Definition
Formally, given complex-valued functions f and g of a natural number variable n, one writes

f ~ g (as n → ∞)

to express the fact that

lim_{n→∞} f(n)/g(n) = 1,

and f and g are called asymptotically equivalent as n → ∞. This defines an equivalence relation on the set of
functions being nonzero for all n large enough. Alternatively, a more general definition is that

f ~ g if and only if f = g (1 + o(1)),

using little-o notation, which defines an equivalence relation on all functions. In each case, the equivalence class of f
informally consists of all functions g which "behave like" f, in the limit. Here, o(1) stands for some function of n
whose value tends to 0 as n → ∞; in general, o(h(n)) stands for some function k(n) such that k(n)/h(n) tends to 0 as
n → ∞.
Big O notation (also known as Landau notation or asymptotic notation) has been developed to provide a convenient
language for the handling of statements about order of growth and is now ubiquitous in the analysis of algorithms.
The asymptotic point of view is basic in computer science, where the question is typically how to describe the
resource implication of scaling-up the size of a computational problem.
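The introductory example f(n) = n² + 3n ~ n² can be checked numerically, since the ratio f(n)/n² must tend to 1 as n grows. A minimal sketch:

```python
# Numerical check that f(n) = n^2 + 3n is asymptotically equivalent
# to n^2: the ratio f(n)/n^2 approaches 1 as n grows.
def f(n):
    return n**2 + 3*n

ratios = [f(n) / n**2 for n in (10, 1000, 100000)]
print(ratios)  # each ratio equals 1 + 3/n, shrinking toward 1
```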
Asymptotic expansion
An asymptotic expansion of a function f(x) is in practice an expression of that function in terms of a series, the
partial sums of which do not necessarily converge, but such that taking any initial partial sum provides an asymptotic
formula for f. The idea is that successive terms provide a more and more accurate description of the order of growth
of f. An example is Stirling's approximation.
In symbols, it means we have

f ~ g_1

but also

f − g_1 ~ g_2

and

f − g_1 − ··· − g_(k−1) ~ g_k

for each fixed k, while some limit is taken, usually with the requirement that g_(k+1) = o(g_k), which means the (g_k) form
an asymptotic scale. The requirement that the successive sums improve the approximation may then be expressed as

f − (g_1 + ··· + g_k) = o(g_k).

In case the asymptotic expansion does not converge, for any particular value of the argument there will be a
particular partial sum which provides the best approximation and adding additional terms will decrease the accuracy.
However, this optimal partial sum will usually have more terms as the argument approaches the limit value.
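A concrete instance of such an optimal partial sum is Stirling's asymptotic series for ln Γ(z), whose correction terms B_{2n}/(2n(2n−1) z^{2n−1}) first improve the approximation and then spoil it. A sketch, assuming the standard coefficient values and using Python's math.lgamma as ground truth:

```python
import math

# Stirling's asymptotic series for ln Gamma(z): a base term plus
# corrections B_{2n} / (2n(2n-1) z^{2n-1}). The series diverges, but
# a well-chosen truncation gives an excellent approximation.
COEFFS = [1/12, -1/360, 1/1260, -1/1680, 1/1188, -691/360360]

def stirling(z, terms):
    total = (z - 0.5) * math.log(z) - z + 0.5 * math.log(2 * math.pi)
    for n, c in enumerate(COEFFS[:terms], start=1):
        total += c / z**(2 * n - 1)
    return total

z = 1.0  # a small argument, where the divergence shows up quickly
errors = [abs(stirling(z, k) - math.lgamma(z)) for k in range(len(COEFFS) + 1)]
print(errors)  # errors shrink, reach a minimum, then grow again
```

With a larger argument the optimal truncation point moves further out, matching the statement that the optimal partial sum has more terms as the argument approaches the limit value.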
Asymptotic expansions typically arise in the approximation of certain integrals (Laplace's method, saddle-point
method, method of steepest descent) or in the approximation of probability distributions (Edgeworth series). The
famous Feynman graphs in quantum field theory are another example of asymptotic expansions which often do not
converge.
Use in applied mathematics
Asymptotic analysis is a key tool for exploring the ordinary and partial differential equations which arise in the
mathematical modelling of real-world phenomena.[1] An illustrative example is the derivation of the boundary layer
equations from the full Navier-Stokes equations governing fluid flow. In many cases, the asymptotic expansion is in
powers of a small parameter, ε: in the boundary layer case, this is the nondimensional ratio of the boundary layer
thickness to a typical lengthscale of the problem. Indeed, applications of asymptotic analysis in mathematical
modelling often[1] centre around a nondimensional parameter which has been shown, or assumed, to be small
through a consideration of the scales of the problem at hand.
Method of dominant balance
The method of dominant balance is used to determine the asymptotic behavior of solutions to an ODE without
solving it. The process is iterative in that the result obtained by performing the method once can be used as input
when the method is repeated, to obtain as many terms in the asymptotic expansion as desired.
The process is as follows:
1. Assume that the asymptotic behavior has the form y(x) ~ e^(S(x)).
2. Make a clever guess as to which terms in the ODE may be negligible in the limit we are interested in.
3. Drop those terms and solve the resulting ODE.
4. Check that the solution is consistent with step 2. If this is the case, then we have the controlling factor of the
asymptotic behavior. Otherwise, we need to try dropping different terms in step 2.
5. Repeat the process using our result as the first term of the solution.
Example
Consider this second order ODE:

x y'' + (c − x) y' − a y = 0,

where c and a are arbitrary constants.
This differential equation cannot be solved exactly. However, it may be useful to know how the solutions behave for
large x.
We start by assuming y ~ e^(S(x)) as x → ∞. We do this with the benefit of hindsight, to make things quicker.
Since we only care about the behavior of y in the large x limit, we set y equal to e^(S(x)), and re-express the ODE in
terms of S(x):

x S'' + x (S')^2 + (c − x) S' − a = 0,

or

S'' + (S')^2 + (c/x − 1) S' − a/x = 0,

where we have used the product rule and chain rule to find the derivatives of y.
Now let us suppose that a solution to this new ODE satisfies

S'' ≪ (S')^2 as x → ∞,

(c/x) S' and a/x ≪ (S')^2 and S' as x → ∞.

We get the dominant asymptotic behaviour by setting

(S_0')^2 − S_0' = 0, i.e. S_0' = 1 and S_0 = x.

If S_0 = x satisfies the above asymptotic conditions, then everything is consistent. The terms we dropped will indeed
have been negligible with respect to the ones we kept. S_0 = x is not a solution to the ODE for S, but it represents the
dominant asymptotic behaviour, which is what we are interested in. Let us check that this choice for S_0 is
consistent:

S_0' = 1, (S_0')^2 = 1, S_0'' = 0 ≪ 1, (c/x) S_0' = c/x → 0, a/x → 0.

Everything is indeed consistent. Thus we find the dominant asymptotic behaviour of a solution to our ODE:

y ~ e^x.
By convention, the asymptotic series is written as

y ~ x^p e^x (1 + u_1/x + u_2/x^2 + ···),

so to get at least the first term of this series we have to do another step to see if there is a power of x out the front.
We proceed by making an ansatz that we can write

S(x) = x + C(x),

and then attempt to find asymptotic solutions for C(x). Substituting into the ODE for S(x) we find

C'' + (C')^2 + C' + (c/x) C' + (c − a)/x = 0.
Repeating the same process as before, we keep C' and (c − a)/x and find that

C(x) = (a − c) ln x.

The leading asymptotic behaviour is therefore

y ~ x^(a − c) e^x.
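As a numerical sanity check on this kind of result, one can compare an exact solution against the dominant-balance prediction. The sketch below assumes the Kummer-type equation x y'' + (c − x) y' − a y = 0 with a = 1, c = 2, for which y(x) = (e^x − 1)/x is an exact solution and the predicted leading behaviour is x^(a−c) e^x:

```python
import math

# For the ODE x y'' + (c - x) y' - a y = 0 with a = 1, c = 2, an exact
# solution is y(x) = (e^x - 1)/x (a confluent hypergeometric function
# M(1, 2, x)). The dominant-balance prediction y ~ x^(a-c) e^x = e^x / x
# should match it ever more closely as x grows.
a, c = 1, 2

def y(x):
    return math.expm1(x) / x   # expm1 computes e^x - 1 accurately

def predicted(x):
    return x**(a - c) * math.exp(x)

ratios = [y(x) / predicted(x) for x in (5.0, 10.0, 20.0)]
print(ratios)  # the ratio y / prediction approaches 1 as x grows
```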
See also
Asymptotic computational complexity
Asymptotic theory
References
[1] S. Howison, Practical Applied Mathematics, Cambridge University Press, Cambridge, 2005. ISBN 0-521-60369-2
J. P. Boyd, "The Devil's Invention: asymptotic, superasymptotic and hyperasymptotic series", Acta Applicandae
Mathematicae, 56: 1–98 (1999). Preprint (http://www-personal.umich.edu/~jpboyd/boydactaapplicreview.pdf).
A. Erdélyi, Asymptotic Expansions. New York: Dover, 1987.
Calculus
Calculus (Latin, calculus, a small stone used for counting) is a branch of mathematics focused on limits, functions,
derivatives, integrals, and infinite series. This subject constitutes a major part of modern mathematics education. It
has two major branches, differential calculus and integral calculus, which are related by the fundamental theorem
of calculus. Calculus is the study of change,
[1]
in the same way that geometry is the study of shape and algebra is the
study of operations and their application to solving equations. A course in calculus is a gateway to other, more
advanced courses in mathematics devoted to the study of functions and limits, broadly called mathematical analysis.
Calculus has widespread applications in science, economics, and engineering and can solve many problems for
which algebra alone is insufficient.
Historically, calculus was called "the calculus of infinitesimals", or "infinitesimal calculus". More generally, calculus
(plural calculi) may refer to any method or system of calculation guided by the symbolic manipulation of
expressions. Some examples of other well-known calculi are propositional calculus, variational calculus, lambda
calculus, pi calculus, and join calculus.
History
Ancient
[Image: Isaac Newton is one of the most famous contributors to the development of calculus, with, among other things, the use of calculus in his laws of motion and gravitation.]

The ancient period introduced some of the ideas that led to integral calculus, but does not seem to have developed
these ideas in a rigorous or systematic way. Calculations of volumes and areas, one goal of integral calculus, can be
found in the Egyptian Moscow papyrus (c. 1820 BC), but the formulas are mere instructions, with no indication as to
method, and some of them are wrong. Some, including Morris Kline in Mathematical thought from ancient to
modern times, Vol. I, suggest trial and error.[2] From the age of Greek mathematics, Eudoxus (c. 408–355 BC) used
the method of exhaustion, which prefigures the concept of the limit, to calculate areas and volumes, while
Archimedes (c. 287–212 BC) developed this idea further, inventing heuristics which resemble the methods of
integral calculus.[3] The method of exhaustion was later reinvented in China by Liu Hui in the 3rd century AD in
order to find the area of a circle.[4] In the 5th century AD, Zu Chongzhi established a method which would later be
called Cavalieri's principle to find the volume of a sphere.[5]
Medieval
Around AD 1000, the mathematician Ibn al-Haytham (Alhacen) was the first to derive the formula for the sum of the
fourth powers of an arithmetic progression, using a method that is readily generalizable to finding the formula for
the sum of any higher integer powers.[6] In the 11th century, the Chinese polymath Shen Kuo developed 'packing'
equations that prefigure integration. In the 12th century, the Indian mathematician Bhāskara II developed an early
method using infinitesimal change, a precursor of the derivative, and he stated a form of Rolle's theorem.[7] Also in
the 12th century, the Persian mathematician Sharaf al-Dīn al-Tūsī used a method similar to taking the derivative of
cubic polynomials.[8] In the 14th century, Indian mathematician Madhava of Sangamagrama, along with other
mathematician-astronomers of the Kerala school of astronomy and mathematics, described special cases of Taylor
series,[9] which are treated in the text Yuktibhasa.[10][11][12]
Modern
In Europe, the foundational work was a treatise due to Bonaventura Cavalieri, who argued that volumes and areas
should be computed as the sums of the volumes and areas of infinitesimal thin cross-sections. The ideas were similar
to Archimedes' in The Method, but this treatise was lost until the early part of the twentieth century. Cavalieri's work
was not well respected since his methods can lead to erroneous results, and the infinitesimal quantities he introduced
were disreputable at first.
The formal study of calculus combined Cavalieri's infinitesimals with the calculus of finite differences developed in
Europe at around the same time. The combination was achieved by John Wallis, Isaac Barrow, and James Gregory,
the latter two proving the second fundamental theorem of calculus around 1675.
The product rule and chain rule, the notion of higher derivatives, Taylor series, and analytical functions were
introduced by Isaac Newton in an idiosyncratic notation which he used to solve problems of mathematical physics.
In his publications, Newton rephrased his ideas to suit the mathematical idiom of the time, replacing calculations
with infinitesimals by equivalent geometrical arguments which were considered beyond reproach. He used the
methods of calculus to solve the problem of planetary motion, the shape of the surface of a rotating fluid, the
oblateness of the earth, the motion of a weight sliding on a cycloid, and many other problems discussed in his
Principia Mathematica. In other work, he developed series expansions for functions, including fractional and
irrational powers, and it was clear that he understood the principles of the Taylor series. He did not publish all these
discoveries, and at this time infinitesimal methods were still considered disreputable.
[Image: Gottfried Wilhelm Leibniz was originally accused of plagiarizing Sir Isaac Newton's unpublished work (only in Britain, not in continental Europe), but is now regarded as an independent inventor of and contributor to calculus.]

These ideas were systematized into a true calculus of infinitesimals by Gottfried Wilhelm Leibniz, who was
originally accused of plagiarism by Newton.[13] He is now regarded as an independent inventor of and contributor
to calculus. His contribution was to provide a clear set of rules for manipulating infinitesimal quantities, allowing the
computation of second and higher derivatives, and providing the product rule and chain rule, in their differential and
integral forms. Unlike Newton, Leibniz paid a lot of attention to the formalism; he often spent days determining
appropriate symbols for concepts.
Leibniz and Newton are usually both credited with the invention
of calculus. Newton was the first to apply calculus to general
physics and Leibniz developed much of the notation used in
calculus today. The basic insights that both Newton and Leibniz
provided were the laws of differentiation and integration, second
and higher derivatives, and the notion of an approximating
polynomial series. By Newton's time, the fundamental theorem of
calculus was known.
When Newton and Leibniz first published their results, there was
great controversy over which mathematician (and therefore which
country) deserved credit. Newton derived his results first, but Leibniz published first. Newton claimed Leibniz stole
ideas from his unpublished notes, which Newton had shared with a few members of the Royal Society. This
controversy divided English-speaking mathematicians from continental mathematicians for many years, to the
detriment of English mathematics. A careful examination of the papers of Leibniz and Newton shows that they
arrived at their results independently, with Leibniz starting first with integration and Newton with differentiation.
Today, both Newton and Leibniz are given credit for developing calculus independently. It is Leibniz, however, who
gave the new discipline its name. Newton called his calculus "the science of fluxions".
Since the time of Leibniz and Newton, many mathematicians have contributed to the continuing development of
calculus. In the 19th century, calculus was put on a much more rigorous footing by mathematicians such as Cauchy,
Riemann, and Weierstrass (see (ε, δ)-definition of limit). It was also during this period that the ideas of calculus were
generalized to Euclidean space and the complex plane. Lebesgue generalized the notion of the integral so that
virtually any function has an integral, while Laurent Schwartz extended differentiation in much the same way.
Calculus is a ubiquitous topic in most modern high schools and universities around the world.[14]
Significance
While some of the ideas of calculus were developed earlier in Egypt, Greece, China, India, Iraq, Persia, and Japan,
the modern use of calculus began in Europe, during the 17th century, when Isaac Newton and Gottfried Wilhelm
Leibniz built on the work of earlier mathematicians to introduce its basic principles. The development of calculus
was built on earlier concepts of instantaneous motion and area underneath curves.
Applications of differential calculus include computations involving velocity and acceleration, the slope of a curve,
and optimization. Applications of integral calculus include computations involving area, volume, arc length, center
of mass, work, and pressure. More advanced applications include power series and Fourier series. Calculus can be
used to compute the trajectory of a shuttle docking at a space station or the amount of snow in a driveway.
Calculus is also used to gain a more precise understanding of the nature of space, time, and motion. For centuries,
mathematicians and philosophers wrestled with paradoxes involving division by zero or sums of infinitely many
numbers. These questions arise in the study of motion and area. The ancient Greek philosopher Zeno gave several
famous examples of such paradoxes. Calculus provides tools, especially the limit and the infinite series, which
resolve the paradoxes.
Foundations
In mathematics, foundations refers to the rigorous development of a subject from precise axioms and definitions.
Working out a rigorous foundation for calculus occupied mathematicians for much of the century following Newton
and Leibniz and is still to some extent an active area of research today.
There is more than one rigorous approach to the foundation of calculus. The usual one today is via the concept of
limits defined on the continuum of real numbers. An alternative is nonstandard analysis, in which the real number
system is augmented with infinitesimal and infinite numbers, as in the original Newton-Leibniz conception. The
foundations of calculus are included in the field of real analysis, which contains full definitions and proofs of the
theorems of calculus as well as generalizations such as measure theory and distribution theory.
Principles
Limits and infinitesimals
Calculus is usually developed by manipulating very small quantities. Historically, the first method of doing so was
by infinitesimals. These are objects which can be treated like numbers but which are, in some sense, "infinitely
small". An infinitesimal number dx could be greater than 0, but less than any number in the sequence 1, 1/2, 1/3, ...
and less than any positive real number. Any integer multiple of an infinitesimal is still infinitely small, i.e.,
infinitesimals do not satisfy the Archimedean property. From this point of view, calculus is a collection of techniques
for manipulating infinitesimals. This approach fell out of favor in the 19th century because it was difficult to make
the notion of an infinitesimal precise. However, the concept was revived in the 20th century with the introduction of
non-standard analysis and smooth infinitesimal analysis, which provided solid foundations for the manipulation of
infinitesimals.
In the 19th century, infinitesimals were replaced by limits. Limits describe the value of a function at a certain input
in terms of its values at nearby input. They capture small-scale behavior, just like infinitesimals, but use the ordinary
real number system. In this treatment, calculus is a collection of techniques for manipulating certain limits.
Infinitesimals get replaced by very small numbers, and the infinitely small behavior of the function is found by
taking the limiting behavior for smaller and smaller numbers. Limits are the easiest way to provide rigorous
foundations for calculus, and for this reason they are the standard approach.
Differential calculus
[Image: Tangent line at (x, f(x)). The derivative f′(x) of a curve at a point is the slope (rise over run) of the line tangent to that curve at that point.]
Differential calculus is the study of the
definition, properties, and applications of
the derivative of a function. The process of
finding the derivative is called
differentiation. Given a function and a point
in the domain, the derivative at that point is
a way of encoding the small-scale behavior
of the function near that point. By finding
the derivative of a function at every point in
its domain, it is possible to produce a new
function, called the derivative function or
just the derivative of the original function.
In mathematical jargon, the derivative is a
linear operator which inputs a function and
outputs a second function. This is more
abstract than many of the processes studied
in elementary algebra, where functions usually input a number and output another number. For example, if the
doubling function is given the input three, then it outputs six, and if the squaring function is given the input three,
then it outputs nine. The derivative, however, can take the squaring function as an input. This means that the
derivative takes all the information of the squaring function, such as that two is sent to four, three is sent to nine,
four is sent to sixteen, and so on, and uses this information to produce another function. (The function it produces
turns out to be the doubling function.)
The most common symbol for a derivative is an apostrophe-like mark called prime. Thus, the derivative of the
function f is f′, pronounced "f prime." For instance, if f(x) = x² is the squaring function, then f′(x) = 2x is its
derivative, the doubling function.
If the input of the function represents time, then the derivative represents change with respect to time. For example,
if f is a function that takes a time as input and gives the position of a ball at that time as output, then the derivative of
f is how the position is changing in time, that is, it is the velocity of the ball.
If a function is linear (that is, if the graph of the function is a straight line), then the function can be written y = mx +
b, where b is the y-intercept and m is the slope:

m = (change in y)/(change in x) = Δy/Δx.
This gives an exact value for the slope of a straight line. If the graph of the function is not a straight line, however,
then the change in y divided by the change in x varies. Derivatives give an exact meaning to the notion of change in
output with respect to change in input. To be concrete, let f be a function, and fix a point a in the domain of f. (a,
f(a)) is a point on the graph of the function. If h is a number close to zero, then a + h is a number close to a.
Therefore (a + h, f(a + h)) is close to (a, f(a)). The slope between these two points is

m = (f(a + h) − f(a)) / ((a + h) − a) = (f(a + h) − f(a)) / h.

This expression is called a difference quotient. A line through two points on a curve is called a secant line, so m is
the slope of the secant line between (a, f(a)) and (a + h, f(a + h)). The secant line is only an approximation to the
behavior of the function at the point a because it does not account for what happens between a and a + h. It is not
possible to discover the behavior at a by setting h to zero because this would require dividing by zero, which is
impossible. The derivative is defined by taking the limit as h tends to zero, meaning that it considers the behavior of f
for all small values of h and extracts a consistent value for the case when h equals zero:

f′(a) = lim_{h→0} (f(a + h) − f(a)) / h.
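The limit just defined can be imitated numerically: for the squaring function at a = 3, the difference quotients approach the derivative value 6 as h shrinks. A minimal sketch:

```python
# Difference quotients (f(a+h) - f(a)) / h for f(x) = x^2 at a = 3.
# Algebraically each quotient equals 6 + h, so shrinking h drives the
# value toward the derivative f'(3) = 6.
def f(x):
    return x * x

a = 3.0
quotients = [(f(a + h) - f(a)) / h for h in (0.1, 0.001, 1e-6)]
print(quotients)  # roughly 6.1, 6.001, then very close to 6
```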
Geometrically, the derivative is the slope of the tangent line to the graph of f at a. The tangent line is a limit of secant
lines just as the derivative is a limit of difference quotients. For this reason, the derivative is sometimes called the
slope of the function f.
Here is a particular example, the derivative of the squaring function at the input 3. Let f(x) = x² be the squaring
function. Then

f′(3) = lim_{h→0} ((3 + h)² − 3²)/h = lim_{h→0} (6h + h²)/h = lim_{h→0} (6 + h) = 6.
[Image: The derivative f′(x) of a curve at a point is the slope of the line tangent to that curve at that point. This slope is determined by considering the limiting value of the slopes of secant lines. Here the function involved (drawn in red) is f(x) = x³ − x. The tangent line (in green), which passes through the point (−3/2, −15/8), has a slope of 23/4. Note that the vertical and horizontal scales in this image are different.]
The slope of the tangent line to the squaring function at the point (3, 9) is 6, that is to say, it is going up six times as fast
as it is going to the right. The limit process just described can be performed for any point in the domain of the
squaring function. This defines the derivative function of the squaring function, or just the derivative of the squaring
function for short. A similar computation to the one above shows that the derivative of the squaring function is the
doubling function.
Leibniz notation
A common notation, introduced by Leibniz, for the derivative in the example above is

y = x²
dy/dx = 2x.

In an approach based on limits, the symbol dy/dx is to be interpreted not as the quotient of two numbers but as a
shorthand for the limit computed above. Leibniz, however, did intend it to represent the quotient of two
infinitesimally small numbers, dy being the infinitesimally small change in y caused by an infinitesimally small
change dx applied to x. We can also think of d/dx as a differentiation operator, which takes a function as an input and
gives another function, the derivative, as the output. For example:

d/dx (x²) = 2x.
In this usage, the dx in the denominator is read as "with respect to x". Even when calculus is developed using limits
rather than infinitesimals, it is common to manipulate symbols like dx and dy as if they were real numbers; although
it is possible to avoid such manipulations, they are sometimes notationally convenient in expressing operations such
as the total derivative.
Integral calculus
Integral calculus is the study of the definitions, properties, and applications of two related concepts, the indefinite
integral and the definite integral. The process of finding the value of an integral is called integration. In technical
language, integral calculus studies two related linear operators.
The indefinite integral is the antiderivative, the inverse operation to the derivative. F is an indefinite integral of f
when f is a derivative of F. (This use of upper- and lower-case letters for a function and its indefinite integral is
common in calculus.)
The definite integral inputs a function and outputs a number, which gives the area between the graph of the input
and the x-axis. The technical definition of the definite integral is the limit of a sum of areas of rectangles, called a
Riemann sum.
A motivating example is the distance traveled in a given time.
If the speed is constant, only multiplication is needed, but if the speed changes, then we need a more powerful
method of finding the distance. One such method is to approximate the distance traveled by breaking up the time into
many short intervals of time, then multiplying the time elapsed in each interval by one of the speeds in that interval,
and then taking the sum (a Riemann sum) of the approximate distance traveled in each interval. The basic idea is that
if only a short time elapses, then the speed will stay more or less the same. However, a Riemann sum only gives an
approximation of the distance traveled. We must take the limit of all such Riemann sums to find the exact distance
traveled.
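The procedure just described can be sketched directly in code (the speed function and interval are our own example): break the time into n short intervals, multiply the time elapsed in each by one of the speeds in that interval, and add the products.

```python
def riemann_distance(speed, a, b, n):
    """Left Riemann sum: split [a, b] into n short intervals, multiply the
    time elapsed in each by the speed at its start, and add them up."""
    dt = (b - a) / n
    return sum(speed(a + i * dt) * dt for i in range(n))

# speed v(t) = 3t^2 over [0, 2]; the exact distance traveled is 2^3 = 8
for n in (10, 100, 100000):
    print(n, riemann_distance(lambda t: 3 * t * t, 0.0, 2.0, n))
```

As n grows, the sums approach 8, the exact distance; this is the limit of Riemann sums described above.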
Integration can be thought of as measuring the area under a curve, defined by
f(x), between two points (here a and b).
If f(x) in the diagram on the left represents speed
as it varies over time, the distance traveled
(between the times represented by a and b) is the
area of the shaded region s.
To approximate that area, an intuitive method would be to divide up the distance between a and b into a number of equal segments, the length of each segment represented by the symbol Δx. For each small segment, we can choose one value of the function f(x). Call that value h. Then the area of the rectangle with base Δx and height h gives the distance (time Δx multiplied by speed h) traveled in that segment. Associated with each segment is the average value of the function above it, f(x) = h. The sum of all such rectangles gives an approximation of the area between the axis and the curve, which is an approximation of the total distance traveled. A smaller value for Δx will give more rectangles and in most cases a better approximation, but for an exact answer we need to take a limit as Δx approaches zero.
The symbol of integration is ∫, an elongated S (the S stands for "sum"). The definite integral is written as:

    ∫_a^b f(x) dx

and is read "the integral from a to b of f-of-x with respect to x." The Leibniz notation dx is intended to suggest dividing the area under the curve into an infinite number of rectangles, so that their width Δx becomes the infinitesimally small dx. In a formulation of the calculus based on limits, the notation

    ∫_a^b … dx

is to be understood as an operator that takes a function as an input and gives a number, the area, as an output; dx is not a number, and is not being multiplied by f(x).
The indefinite integral, or antiderivative, is written:

    ∫ f(x) dx.

Functions differing by only a constant have the same derivative, and therefore the antiderivative of a given function is actually a family of functions differing only by a constant. Since the derivative of the function y = x² + C, where C is any constant, is y′ = 2x, the antiderivative of the latter is given by:

    ∫ 2x dx = x² + C.

An undetermined constant like C in the antiderivative is known as a constant of integration.
Fundamental theorem
The fundamental theorem of calculus states that differentiation and integration are inverse operations. More
precisely, it relates the values of antiderivatives to definite integrals. Because it is usually easier to compute an
antiderivative than to apply the definition of a definite integral, the Fundamental Theorem of Calculus provides a
practical way of computing definite integrals. It can also be interpreted as a precise statement of the fact that
differentiation is the inverse of integration.
The Fundamental Theorem of Calculus states: If a function f is continuous on the interval [a, b] and if F is a function whose derivative is f on the interval (a, b), then

    ∫_a^b f(x) dx = F(b) − F(a).

Furthermore, for every x in the interval (a, b),

    d/dx ∫_a^x f(t) dt = f(x).

This realization, made by both Newton and Leibniz, who based their results on earlier work by Isaac Barrow, was key to the massive proliferation of analytic results after their work became known. The fundamental theorem provides an algebraic method of computing many definite integrals, without performing limit processes, by finding formulas for antiderivatives. It is also a prototype solution of a differential equation. Differential equations relate an unknown function to its derivatives, and are ubiquitous in the sciences.
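The theorem is easy to check numerically (the example function is ours): a Riemann-style sum of f(x) = 2x over [1, 4] should agree with F(b) − F(a) for the antiderivative F(x) = x².

```python
def midpoint_sum(f, a, b, n=100000):
    # definite integral approximated as a Riemann sum (midpoint rule)
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) * dx for i in range(n))

# f(x) = 2x is continuous on [1, 4] and F(x) = x^2 satisfies F' = f,
# so the fundamental theorem predicts the integral equals F(4) - F(1) = 15
F = lambda x: x * x
print(midpoint_sum(lambda x: 2 * x, 1.0, 4.0), F(4.0) - F(1.0))
```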
Applications
The logarithmic spiral of the Nautilus shell is a
classical image used to depict the growth and
change related to calculus
Calculus is used in every branch of the physical sciences, actuarial
science, computer science, statistics, engineering, economics, business,
medicine, demography, and in other fields wherever a problem can be
mathematically modeled and an optimal solution is desired. It allows
one to go from (non-constant) rates of change to the total change or
vice versa, and many times in studying a problem we know one and are
trying to find the other.
Physics makes particular use of calculus; all concepts in classical
mechanics and electromagnetism are interrelated through calculus. The
mass of an object of known density, the moment of inertia of objects,
as well as the total energy of an object within a conservative field can
be found by the use of calculus. An example of the use of calculus in
mechanics is Newton's second law of motion: historically stated, it expressly uses the term "rate of change", which refers to the derivative, saying: "The rate of change of momentum of a body is equal to the resultant force acting on the body and is in the same direction." Commonly expressed today as Force = Mass × acceleration, it involves differential calculus because acceleration is the time derivative of velocity, or the second time derivative of trajectory or spatial position. Starting from knowing how an object is accelerating, we use calculus to derive its path.
Maxwell's theory of electromagnetism and Einstein's theory of general relativity are also expressed in the language
of differential calculus. Chemistry also uses calculus in determining reaction rates and radioactive decay. In biology,
population dynamics starts with reproduction and death rates to model population changes.
Calculus can be used in conjunction with other mathematical disciplines. For example, it can be used with linear
algebra to find the "best fit" linear approximation for a set of points in a domain. Or it can be used in probability
theory to determine the probability of a continuous random variable from an assumed density function. In analytic
geometry, the study of graphs of functions, calculus is used to find high points and low points (maxima and minima),
slope, concavity and inflection points.
Green's Theorem, which gives the relationship between a line integral around a simple closed curve C and a double
integral over the plane region D bounded by C, is applied in an instrument known as a planimeter which is used to
calculate the area of a flat surface on a drawing. For example, it can be used to calculate the amount of area taken up
by an irregularly shaped flower bed or swimming pool when designing the layout of a piece of property.
In the realm of medicine, calculus can be used to find the optimal branching angle of a blood vessel so as to
maximize flow. From the decay laws for a particular drug's elimination from the body, it's used to derive dosing
laws. In nuclear medicine, it's used to build models of radiation transport in targeted tumor therapies.
In economics, calculus allows for the determination of maximal profit by providing a way to easily calculate both
marginal cost and marginal revenue.
Calculus is also used to find approximate solutions to equations; in practice it's the standard way to solve differential
equations and do root finding in most applications. Examples are methods such as Newton's method, fixed point
iteration, and linear approximation. For instance, spacecraft use a variation of the Euler method to approximate
curved courses within zero gravity environments.
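A minimal sketch of one such method, Newton's method, is below (the code, starting guess and tolerances are illustrative, not from the text): each step follows the tangent line at the current point down to its root.

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton's method: repeat x_{k+1} = x_k - f(x_k) / f'(x_k)
    until f(x) is close enough to zero."""
    x = x0
    for _ in range(max_iter):
        x -= f(x) / fprime(x)
        if abs(f(x)) < tol:
            break
    return x

# approximate sqrt(2) as the positive root of x^2 - 2
root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0)
print(root)
```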
See also
Lists
List of differentiation identities
List of calculus topics
Publications in calculus
Table of integrals
Related topics
Calculus of finite differences
Calculus with polynomials
Complex analysis
Differential equation
Differential geometry
Elementary calculus
Fourier series
Integral equation
Mathematical analysis
Mathematics
Multivariable calculus
Non-classical analysis
Non-standard analysis
Non-standard calculus
Precalculus (mathematical education)
Product integral
Stochastic calculus
Taylor series
References
Books
Larson, Ron, Bruce H. Edwards (2010). "Calculus", 9th ed., Brooks Cole Cengage Learning. ISBN
9780547167022
McQuarrie, Donald A. (2003). Mathematical Methods for Scientists and Engineers, University Science Books.
ISBN 9781891389245
Stewart, James (2008). Calculus: Early Transcendentals, 6th ed., Brooks Cole Cengage Learning. ISBN
9780495011668
Thomas, George B., Maurice D. Weir, Joel Hass, Frank R. Giordano (2008), "Calculus", 11th ed.,
Addison-Wesley. ISBN 0-321-48987-X
Other resources
Further reading
Courant, Richard ISBN 978-3540650584 Introduction to calculus and analysis 1.
Edmund Landau. ISBN 0-8218-2830-4 Differential and Integral Calculus, American Mathematical Society.
Robert A. Adams. (1999). ISBN 978-0-201-39607-2 Calculus: A complete course.
Albers, Donald J.; Richard D. Anderson and Don O. Loftsgaarden, ed. (1986) Undergraduate Programs in the
Mathematics and Computer Sciences: The 1985-1986 Survey, Mathematical Association of America No. 7.
John Lane Bell: A Primer of Infinitesimal Analysis, Cambridge University Press, 1998. ISBN 978-0-521-62401-5.
Uses synthetic differential geometry and nilpotent infinitesimals.
Florian Cajori, "The History of Notations of the Calculus." Annals of Mathematics, 2nd Ser., Vol. 25, No. 1 (Sep.,
1923), pp. 1–46.
Leonid P. Lebedev and Michael J. Cloud: "Approximating Perfection: a Mathematician's Journey into the World
of Mechanics, Ch. 1: The Tools of Calculus", Princeton Univ. Press, 2004.
Cliff Pickover. (2003). ISBN 978-0-471-26987-8 Calculus and Pizza: A Math Cookbook for the Hungry Mind.
Michael Spivak. (September 1994). ISBN 978-0-914098-89-8 Calculus. Publish or Perish publishing.
Tom M. Apostol. (1967). ISBN 9780471000051 Calculus, Volume 1, One-Variable Calculus with an Introduction
to Linear Algebra. Wiley.
Tom M. Apostol. (1969). ISBN 9780471000075 Calculus, Volume 2, Multi-Variable Calculus and Linear
Algebra with Applications. Wiley.
Silvanus P. Thompson and Martin Gardner. (1998). ISBN 978-0-312-18548-0 Calculus Made Easy.
Mathematical Association of America. (1988). Calculus for a New Century; A Pump, Not a Filter, The
Association, Stony Brook, NY. ED 300 252.
Thomas/Finney. (1996). ISBN 978-0-201-53174-9 Calculus and Analytic geometry 9th, Addison Wesley.
Weisstein, Eric W. "Second Fundamental Theorem of Calculus." (http://mathworld.wolfram.com/SecondFundamentalTheoremofCalculus.html) From MathWorld, A Wolfram Web Resource.
Online books
Crowell, B. (2003). "Calculus" Light and Matter, Fullerton. Retrieved 6 May 2007 from http:/ / www.
lightandmatter. com/ calc/ calc. pdf (http:/ / www. lightandmatter. com/ calc/ calc. pdf)
Garrett, P. (2006). "Notes on first year calculus" University of Minnesota. Retrieved 6 May 2007 from
http://www.math.umn.edu/~garrett/calculus/first_year/notes.pdf (http:/ / www. math. umn. edu/ ~garrett/ calculus/
first_year/ notes. pdf)
Faraz, H. (2006). "Understanding Calculus" Retrieved 6 May 2007 from Understanding Calculus, URL http:/ /
www. understandingcalculus. com/ (http:/ / www. understandingcalculus. com/ ) (HTML only)
Keisler, H. J. (2000). "Elementary Calculus: An Approach Using Infinitesimals" Retrieved 29 August 2010 from
http://www.math.wisc.edu/~keisler/calc.html (http:/ / www. math. wisc. edu/ ~keisler/ calc. html)
Mauch, S. (2004). "Sean's Applied Math Book" California Institute of Technology. Retrieved 6 May 2007 from
http://www.cacr.caltech.edu/~sean/applied_math.pdf (http:/ / www. cacr. caltech. edu/ ~sean/ applied_math. pdf)
Sloughter, Dan (2000). "Difference Equations to Differential Equations: An introduction to calculus". Retrieved
17 March 2009 from http:/ / synechism. org/ drupal/ de2de/ (http:/ / synechism. org/ drupal/ de2de/ )
Stroyan, K.D. (2004). "A brief introduction to infinitesimal calculus" University of Iowa. Retrieved 6 May 2007
from http:/ / www. math. uiowa. edu/ ~stroyan/ InfsmlCalculus/ InfsmlCalc. htm (http:/ / www. math. uiowa. edu/
~stroyan/ InfsmlCalculus/ InfsmlCalc. htm) (HTML only)
Strang, G. (1991). "Calculus" Massachusetts Institute of Technology. Retrieved 6 May 2007 from http:/ / ocw.
mit. edu/ ans7870/ resources/ Strang/ strangtext. htm (http:/ / ocw. mit. edu/ ans7870/ resources/ Strang/
strangtext. htm)
Smith, William V. (2001). "The Calculus" Retrieved 4 July 2008 (http:/ / www. math. byu. edu/ ~smithw/
Calculus/ ) (HTML only).
External links
Weisstein, Eric W., " Calculus (http:/ / mathworld. wolfram. com/ Calculus. html)" from MathWorld.
Topics on Calculus (http:/ / planetmath. org/ encyclopedia/ TopicsOnCalculus. html) at PlanetMath.
Calculus Made Easy (1914) by Silvanus P. Thompson (http:/ / djm. cc/ library/ Calculus_Made_Easy_Thompson.
pdf) Full text in PDF
Calculus (http:/ / www. bbc. co. uk/ programmes/ b00mrfwq) on In Our Time at the BBC. ( listen now (http:/ /
www. bbc. co. uk/ iplayer/ console/ b00mrfwq/ In_Our_Time_Calculus))
Calculus.org: The Calculus page (http:/ / www. calculus. org) at University of California, Davis contains
resources and links to other sites
COW: Calculus on the Web (http:/ / cow. math. temple. edu/ ) at Temple University contains resources ranging
from pre-calculus and associated algebra
Earliest Known Uses of Some of the Words of Mathematics: Calculus & Analysis (http:/ / www. economics.
soton. ac. uk/ staff/ aldrich/ Calculus and Analysis Earliest Uses. htm)
Online Integrator (WebMathematica) (http:/ / integrals. wolfram. com/ ) from Wolfram Research
The Role of Calculus in College Mathematics (http:/ / www. ericdigests. org/ pre-9217/ calculus. htm) from
ERICDigests.org
OpenCourseWare Calculus (http:/ / ocw. mit. edu/ OcwWeb/ Mathematics/ index. htm) from the Massachusetts
Institute of Technology
Infinitesimal Calculus (http:/ / eom. springer. de/ I/ i050950. htm) an article on its historical development, in
Encyclopaedia of Mathematics, Michiel Hazewinkel ed. .
Elements of Calculus I (http:/ / ocw. nd. edu/ mathematics/ elements-of-calculus-i) and Calculus II for Business
(http:/ / ocw. nd. edu/ mathematics/ calculus-ii-for-business), OpenCourseWare from the University of Notre
Dame with activities, exams and interactive applets.
Calculus for Beginners and Artists (http:/ / math. mit. edu/ ~djk/ calculus_beginners/ ) by Daniel Kleitman, MIT
Calculus Problems and Solutions (http:/ / www. math. ucdavis. edu/ ~kouba/ ProblemsList. html) by D. A. Kouba
Solved problems in calculus (http:/ / calculus. solved-problems. com/ )
Copula (statistics)
20
Copula (statistics)
In statistics, a copula is used as a general way of formulating a multivariate distribution in such a way that various general types of dependence can be represented.[1] The approach to formulating a multivariate distribution using a copula is based on the idea that a simple transformation can be made of each marginal variable in such a way that each transformed marginal variable has a uniform distribution. Once this is done, the dependence structure can be expressed as a multivariate distribution on the obtained uniforms, and a copula is precisely a multivariate distribution on marginally uniform random variables. When applied in a practical context, the above transformations might be fitted as an initial step for each marginal distribution, or the parameters of the transformations might be fitted jointly with those of the copula.
There are many families of copulas which differ in the detail of the dependence they represent. A family will
typically have several parameters which relate to the strength and form of the dependence. Some families of copulas
are outlined below. A typical use for copulas is to choose one such family and use it to define the multivariate
distribution to be used, typically in fitting a distribution to a sample of data. However, it is possible to derive the
copula corresponding to any given multivariate distribution.
The basic idea
Consider two random variables X and Y, with continuous cumulative distribution functions F_X and F_Y. The probability integral transform can be applied separately to the two random variables to define X′ = F_X(X) and Y′ = F_Y(Y). It follows that X′ and Y′ both have uniform distributions but are, in general, dependent if X and Y were already dependent (of course, if X and Y were independent, X′ and Y′ remain independent). Since the transforms are invertible, specifying the dependence between X and Y is, in a way, the same as specifying dependence between X′ and Y′. With X′ and Y′ being uniform random variables, the problem reduces to specifying a bivariate distribution between two uniforms, that is, a copula. So the idea is to simplify the problem by removing consideration of many different marginal distributions by transforming the marginal variates to uniforms, and then specifying dependence as a multivariate distribution on the uniforms.
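The probability integral transform can be sketched numerically. Assuming only Python's standard library (statistics.NormalDist; the correlation value and sample size are illustrative), the snippet below draws dependent normal pairs, applies each variable's own CDF, and checks that the transformed variables are marginally uniform while remaining dependent:

```python
import random
from statistics import NormalDist

nd = NormalDist()
rho = 0.8
random.seed(0)
us, vs = [], []
for _ in range(20000):
    z1, z2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    x = z1                                    # X ~ N(0, 1)
    y = rho * z1 + (1 - rho**2) ** 0.5 * z2   # Y ~ N(0, 1), corr(X, Y) = rho
    us.append(nd.cdf(x))                      # X' = F_X(X), uniform on [0, 1]
    vs.append(nd.cdf(y))                      # Y' = F_Y(Y), uniform on [0, 1]

print(sum(us) / len(us))  # near 0.5, as expected for a uniform marginal
```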
Definition
A copula is a multivariate joint distribution defined on the n-dimensional unit cube [0,1]^n such that every marginal distribution is uniform on the interval [0,1].
Specifically, C: [0,1]^n → [0,1] is an n-dimensional copula (briefly, n-copula) if:
C(u) = 0 whenever u ∈ [0,1]^n has at least one component equal to 0;
C(u) = u_i whenever u ∈ [0,1]^n has all the components equal to 1 except the ith one, which is equal to u_i;
C is n-increasing, i.e., for each hyperrectangle B = [x_1, y_1] × … × [x_n, y_n] ⊆ [0,1]^n,

    V_C(B) = Σ_z (−1)^{N(z)} C(z) ≥ 0,

where the sum runs over the 2^n vertices z of B (each z_i is either x_i or y_i), N(z) is the number of indices i with z_i = x_i, and V_C(B) is the so-called C-volume of B.
Sklar's theorem
The theorem proposed by Sklar[2] underlies most applications of the copula. Sklar's theorem states that given a joint distribution function for variables, and respective marginal distribution functions, there exists a copula such that the copula binds the margins to give the joint distribution.
For the bivariate case, Sklar's theorem can be stated as follows. For any bivariate distribution function H(x, y), let F(x) = H(x, +∞) and G(y) = H(+∞, y) be the univariate marginal probability distribution functions. Then there exists a copula C such that

    H(x, y) = C(F(x), G(y))

(where the same symbol C is used for the copula and for its cumulative distribution function). Moreover, if the marginal distributions F(x) and G(y) are continuous, the copula function C is unique. Otherwise, the copula C is unique on the range of values of the marginal distributions.
To understand the density function of the coupled random variables, note that when all the distributions involved are differentiable the joint density factors as

    h(x, y) = c(F(x), G(y)) f(x) g(y),

where c is the copula density and f, g, h are the densities of F, G, H. The expectation of a function q of the random variables can then be written in the following ways:

    E[q(X, Y)] = ∫∫ q(x, y) dH(x, y) = ∫∫ q(F^{-1}(u), G^{-1}(v)) dC(u, v).
Fréchet-Hoeffding copula boundaries
Graphs of the Fréchet-Hoeffding copula limits and of the independence copula (in the middle).
Minimum (antimonotone) copula: This is the lower bound for all copulas. In the bivariate case only, it represents perfect negative dependence between variates:

    W(u, v) = max(u + v − 1, 0).

For n-variate copulas, the lower bound is given by

    W(u_1, …, u_n) = max(u_1 + … + u_n − n + 1, 0).

Maximum (comonotone) copula: This is the upper bound for all copulas. It represents perfect positive dependence between variates:

    M(u, v) = min(u, v).

For n-variate copulas, the upper bound is given by

    M(u_1, …, u_n) = min(u_1, …, u_n).

Conclusion: For all copulas C(u, v),

    W(u, v) ≤ C(u, v) ≤ M(u, v).

In the multivariate case, the corresponding inequality is

    W(u_1, …, u_n) ≤ C(u_1, …, u_n) ≤ M(u_1, …, u_n).
Families of copulas
Gaussian copula
Cumulative distribution and probability density functions of the Gaussian copula with ρ = 0.4.
One example of a copula often used for modelling in finance, as introduced by David X. Li in 2000, is the Gaussian copula,[3] which is constructed from the bivariate normal distribution via Sklar's theorem. With Φ_ρ being the standard bivariate normal cumulative distribution function with correlation ρ, the Gaussian copula function is

    C_ρ(u, v) = Φ_ρ(Φ^{-1}(u), Φ^{-1}(v)),

where u, v ∈ [0, 1], Φ^{-1} is the inverse (quantile function) of the standard normal cumulative distribution function, and Φ denotes the standard normal cumulative distribution function.
Differentiating C yields the copula density function:

    c_ρ(u, v) = φ_ρ(Φ^{-1}(u), Φ^{-1}(v)) / [φ(Φ^{-1}(u)) φ(Φ^{-1}(v))],

where φ_ρ is the density function for the standard bivariate Gaussian with Pearson's product moment correlation coefficient ρ and φ is the standard normal density.
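The ratio of the bivariate normal density to the product of the standard normal marginal densities simplifies to a closed form, which can be sketched with the standard library's NormalDist for the normal quantile function (the sample arguments are illustrative):

```python
import math
from statistics import NormalDist

inv_cdf = NormalDist().inv_cdf  # quantile function of the standard normal

def gaussian_copula_density(u, v, rho):
    """Gaussian copula density c_rho(u, v): the bivariate normal density over
    the product of standard normal marginal densities, evaluated at the
    normal quantiles of u and v; the ratio simplifies to the form below."""
    x, y = inv_cdf(u), inv_cdf(v)
    r2 = 1.0 - rho * rho
    expo = (2 * rho * x * y - rho * rho * (x * x + y * y)) / (2 * r2)
    return math.exp(expo) / math.sqrt(r2)

print(gaussian_copula_density(0.3, 0.7, 0.0))  # rho = 0 gives the flat density 1
print(gaussian_copula_density(0.9, 0.9, 0.4))  # rho > 0 raises mass near the diagonal
```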
Archimedean copulas
Archimedean copulas are an important family of copulas, which have a simple form with properties such as associativity and have a variety of dependence structures. Unlike elliptical copulas (e.g. the Gaussian), most of the Archimedean copulas have closed-form solutions and are not derived from the multivariate distribution functions using Sklar's theorem.
One particularly simple form of an n-dimensional copula is

    C(u_1, …, u_n) = ψ^{-1}(ψ(u_1) + … + ψ(u_n)),

where ψ is known as a generator function. Such copulas are known as Archimedean. Any generator function which satisfies the properties below is the basis for a valid copula:

    ψ(1) = 0; ψ is continuous, strictly decreasing and convex on (0, 1].

Product copula: Also called the independence copula, this copula has no dependence between variates. Its density function is unity everywhere:

    Π(u, v) = uv, with generator ψ(x) = −ln x.

Where the generator function is indexed by a parameter, a whole family of copulas may be Archimedean. For example:
Clayton copula:

    C_θ(u, v) = max((u^{−θ} + v^{−θ} − 1)^{−1/θ}, 0), θ ∈ [−1, ∞) \ {0}, with generator ψ(x) = (x^{−θ} − 1)/θ.

In the limit θ → 0 of the Clayton copula, the random variables are statistically independent. The generator function approach can be extended to create multivariate copulas, by simply including more additive terms.
Gumbel copula:

    C_θ(u, v) = exp(−[(−ln u)^θ + (−ln v)^θ]^{1/θ}), θ ≥ 1, with generator ψ(x) = (−ln x)^θ.

Frank copula:

    C_θ(u, v) = −(1/θ) ln(1 + (e^{−θu} − 1)(e^{−θv} − 1)/(e^{−θ} − 1)), θ ≠ 0.
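As an illustrative sketch (the parameter values are ours), the Clayton formula for θ > 0 can be evaluated directly and compared against the independence copula in the θ → 0 limit:

```python
def clayton(u, v, theta):
    """Bivariate Clayton copula for theta > 0:
    C(u, v) = (u**-theta + v**-theta - 1)**(-1/theta)."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

print(clayton(0.5, 0.5, 2.0))   # lower-tail dependence pushes this above u*v = 0.25
print(clayton(0.5, 0.5, 1e-8))  # near the independence limit theta -> 0, so ~0.25
```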
Periodic copula
In 2005 Aurélien Alfonsi and Damiano Brigo introduced new families of copulas based on periodic functions.[4] They noticed that if h is a 1-periodic non-negative function that integrates to 1 over [0,1] and H is a double primitive of h, then two copula functions can be derived from H, the second one not necessarily exchangeable. This may be a tool to introduce asymmetric dependence, which is absent in most known copula functions.
Empirical copulas
When analysing data with an unknown underlying distribution, one can transform the empirical data distribution into an "empirical copula" by warping such that the marginal distributions become uniform.[1] Mathematically the empirical copula frequency function is calculated by

    C(i/n, j/n) = (number of pairs (x, y) in the sample with x ≤ x_(i) and y ≤ y_(j)) / n,

where x_(i) and y_(j), 1 ≤ i, j ≤ n, denote the ith and jth order statistics of x and y.
Less formally, simply replace the data along each dimension with the data ranks divided by n.
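A minimal rank-based implementation of this idea (the function name and the tiny sample are ours; it assumes no tied values):

```python
def empirical_copula(xs, ys, u, v):
    """Empirical copula frequency: the fraction of sample pairs whose
    ranks, divided by n, fall at or below (u, v)."""
    n = len(xs)
    rank_x = {x: r for r, x in enumerate(sorted(xs), start=1)}
    rank_y = {y: r for r, y in enumerate(sorted(ys), start=1)}
    return sum(1 for x, y in zip(xs, ys)
               if rank_x[x] / n <= u and rank_y[y] / n <= v) / n

xs = [2.0, 9.0, 4.0, 7.0]
ys = [1.0, 8.0, 3.0, 6.0]  # perfectly comonotone with xs
print(empirical_copula(xs, ys, 0.5, 0.5))  # 0.5 = min(u, v), the comonotone bound
```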
Applications
Dependence modelling with copula functions is widely used in applications of financial risk assessment and actuarial analysis, for example in the pricing of collateralized debt obligations (CDOs).[5] Some believe the methodology of applying the Gaussian copula to credit derivatives to be one of the reasons behind the global financial crisis of 2008-2009.[6][7] Despite this perception, there are documented attempts of the financial industry, occurring before the crisis, to address the limitations of the Gaussian copula and of copula functions more generally, specifically the lack of dependence dynamics and the poor representation of extreme events.[8] The volume "Credit Correlation: Life After Copulas", published in 2007 by World Scientific, summarizes a 2006 conference held by Merrill Lynch in London where several practitioners attempted to propose models rectifying some of the copula limitations. See also the article by Donnelly and Embrechts[9] and the book by Brigo, Pallavicini and Torresetti.[10]
Whilst the application of copulas in credit has gone through popularity as well as misfortune during the global financial crisis of 2008-2009,[3][11] it is arguably an industry standard model for pricing CDOs. Less arguably, copulas have also been applied to other asset classes as a flexible tool in analyzing multi-asset derivative products. The first such application outside credit was to use a copula to construct an implied basket volatility surface,[12] taking into account the volatility smile of basket components. Copulas have since gained popularity in pricing and risk management[13] of options on multi-assets in the presence of volatility smile/skew, in equity, foreign exchange and fixed income derivative business. Some typical examples of application of copulas are listed below:
Analyzing and pricing volatility smile/skew of exotic baskets, e.g. best/worst of;
Analyzing and pricing volatility smile/skew of less liquid FX crosses, which are effectively baskets: C = S_1/S_2 or C = S_1·S_2;
Analyzing and pricing spread options, in particular in fixed income constant maturity swap (CMS) spread options.
Recently, copula functions have been successfully applied to the database formulation for the reliability analysis of highway bridges, to the analysis of spike counts in neuroscience[14] and to various multivariate simulation studies in civil, mechanical and offshore engineering.
See also
Joint probability distribution
References
Notes
[1] Nelsen, Roger B. (1999), An Introduction to Copulas, New York: Springer, ISBN0387986235.
[2] Sklar, A. (1959), "Fonctions de répartition à n dimensions et leurs marges", Publ. Inst. Statist. Univ. Paris 8: 229–231
[3] Li, David X. (2000), "On Default Correlation: A Copula Function Approach" (http:/ / www. defaultrisk. com/ pp_corr_05. htm), Journal of
Fixed Income 9: 4354,
[4] Alfonsi, A. & Brigo, D. (2005), "New families of Copulas based on periodic functions", Communications in Statistics - Theory and Methods
34 (7): 14371447, doi:10.1081/STA-200063351
[5] Meneguzzo, David; Walter Vecchiato (Nov 2003), "Copula sensitivity in collateralized debt obligations and basket default swaps", Journal of
Futures Markets 24 (1): 3770, doi:10.1002/fut.10110
[6] Recipe for Disaster: The Formula That Killed Wall Street (http:/ / www. wired. com/ techbiz/ it/ magazine/ 17-03/
wp_quant?currentPage=all) Wired, 2/23/2009
[7] MacKenzie, Donald (2008), "End-of-the-World Trade" (http:/ / www. lrb. co. uk/ v30/ n09/ mack01_. html), London Review of Books,
2008-05-08, , retrieved 2009-07-27
[8] Lipton, A., and A. Rennie, (Editors) (2007), Credit Correlation: Life after Copulas (http:/ / www. worldscibooks. com/ economics/ 6559.
html), World Scientific,
[9] Donnelly, C, Embrechts, P, (2010), The devil is in the tails: actuarial mathematics and the subprime mortgage crisis, ASTIN Bulletin 40(1),
1-33
[10] Brigo, D, Pallavicini, A, and Torresetti, R, (2010), Credit Models and the Crisis: A Journey into CDOs, Copulas, Correlations and dynamic
Models, Wiley and Sons
[11] Jones, Sam (April 24, 2009), "The formula that felled Wall St" (http:/ / www. ft. com/ cms/ s/ 2/ 912d85e8-2d75-11de-9eba-00144feabdc0.
html), Financial Times,
[12] Qu, Dong, (2001), Basket Implied Volatility Surface, Derivatives Week, 4 June.
[13] Qu, Dong, (2005), Pricing Basket Options With Skew, Wilmott Magazine, July.
[14] Onken, A; Grünewälder, S; Munk, MH; Obermayer, K (2009), "Analyzing Short-Term Noise Dependencies of Spike-Counts in Macaque Prefrontal Cortex Using Copulas and the Flashlight Transformation" (http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000577), PLoS Computational Biology 5 (11): e1000577, doi:10.1371/journal.pcbi.1000577, PMID 19956759, PMC 2776173
General
David G. Clayton (1978), "A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence", Biometrika 65, 141–151. JSTOR (subscription) (http://links.jstor.org/sici?sici=0006-3444(197804)65:1<141:AMFAIB>2.0.CO;2-Y)
Frees, E.W., Valdez, E.A. (1998), "Understanding Relationships Using Copulas", North American Actuarial Journal 2, 1–25. Link to NAAJ copy (http://www.soa.org/library/journals/north-american-actuarial-journal/1998/january/naaj9801_1.pdf)
Roger B. Nelsen (1999), An Introduction to Copulas. ISBN 0-387-98623-5.
S. Rachev, C. Menn, F. Fabozzi (2005), Fat-Tailed and Skewed Asset Return Distributions. ISBN 0-471-71886-6.
A. Sklar (1959), "Fonctions de répartition à n dimensions et leurs marges", Publications de l'Institut de Statistique de l'Université de Paris 8, 229–231.
C. Schölzel, P. Friederichs (2008), "Multivariate non-normally distributed random variables in climate research – introduction to the copula approach", Nonlinear Processes in Geophysics, 15, 761–772. Copernicus (open access) (http://www.nonlin-processes-geophys.net/15/761/2008/npg-15-761-2008.html)
W.T. Shaw, K.T.A. Lee (2006), "Copula Methods vs Canonical Multivariate Distributions: The Multivariate Student T Distribution with General Degrees of Freedom". PDF (http://www.mth.kcl.ac.uk/~shaww/web_page/papers/MultiStudentc.pdf)
Srinivas Sriramula, Devdas Menon and A. Meher Prasad (2006), "Multivariate Simulation and Multimodal Dependence Modeling of Vehicle Axle Weights with Copulas", ASCE Journal of Transportation Engineering 132 (12), 945–955. (doi 10.1061/(ASCE)0733-947X(2006)132:12(945)) ASCE (subscription) (http://cedb.asce.org/cgi/WWWdisplay.cgi?0613154)
Genest, C.; MacKay, R.J. (1986), "The Joy of Copulas: Bivariate Distributions with Uniform Marginals" (http://jstor.org/stable/2684602), The American Statistician (American Statistical Association) 40 (4): 280–283, doi:10.2307/2684602
External links
Eric W. Weisstein, "Sklar's Theorem", from MathWorld, A Wolfram Web Resource (http://mathworld.wolfram.com/SklarsTheorem.html)
Copula Wiki: community portal for researchers with interest in copulas (http://sites.google.com/site/copulawiki/)
A collection of Copula simulation and estimation codes (http://www.mathfinance.cn/tags/copula)
Recipe for Disaster: The Formula That Killed Wall Street (http://www.wired.com/techbiz/it/magazine/17-03/wp_quant), by Felix Salmon, Wired News
Did math formula cause financial crisis? (http://marketplace.publicradio.org/display/web/2009/02/24/pm_stock_formula_q/), by Felix Salmon and Kai Ryssdal, Marketplace, American Public Media
Several short articles on copulas (http://www.aghakouchak.com/resources/copulas)
An introduction and some examples to modeling with copulas in Excel (http://www.vosesoftware.com/ModelRiskHelp/Modeling_correlation/Copulas.htm)
Copula Functions and their Application in Pricing and Risk Managing Multi-name Credit Derivative Products (http://www.defaultrisk.com/pp_crdrv_41.htm)
Differential equation
Visualization of heat transfer in a pump casing, by solving the heat equation. Heat is
being generated internally in the casing and being cooled at the boundary, providing a
steady state temperature distribution.
A differential equation is a
mathematical equation for an unknown
function of one or several variables
that relates the values of the function
itself and its derivatives of various
orders. Differential equations play a
prominent role in engineering, physics,
economics, and other disciplines.
Differential equations arise in many
areas of science and technology,
specifically whenever a deterministic
relation involving some continuously
varying quantities (modeled by
functions) and their rates of change in
space and/or time (expressed as
derivatives) is known or postulated.
This is illustrated in classical
mechanics, where the motion of a body
is described by its position and velocity as the time varies. Newton's laws allow one to relate the position, velocity,
acceleration and various forces acting on the body and state this relation as a differential equation for the unknown
position of the body as a function of time. In some cases, this differential equation (called an equation of motion)
may be solved explicitly.
An example of modelling a real world problem using differential equations is determination of the velocity of a ball
falling through the air, considering only gravity and air resistance. The ball's acceleration towards the ground is the
acceleration due to gravity minus the deceleration due to air resistance. Gravity is constant but air resistance may be
modelled as proportional to the ball's velocity. This means the ball's acceleration, which is the derivative of its
velocity, depends on the velocity. Finding the velocity as a function of time involves solving a differential equation.
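The falling-ball model can be made concrete numerically. Below is a minimal sketch using Euler's method to integrate dv/dt = g − kv (the drag coefficient k and step size are illustrative choices, not values from the text), compared against the exact solution of this linear equation:

```python
import math

def falling_velocity(g=9.81, k=0.5, t_end=10.0, dt=0.001):
    """Approximate v(t_end) for dv/dt = g - k*v, v(0) = 0, by Euler steps."""
    v = 0.0
    t = 0.0
    while t < t_end:
        v += (g - k * v) * dt   # acceleration = gravity minus drag
        t += dt
    return v

v_numeric = falling_velocity()
# Exact solution of this linear ODE: v(t) = (g/k) * (1 - exp(-k*t)),
# which approaches the terminal velocity g/k.
v_exact = (9.81 / 0.5) * (1 - math.exp(-0.5 * 10.0))
```

With this small step size the numerical and exact velocities agree to a few parts in ten thousand, illustrating how a solution can be approximated when no closed form is wanted or available.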
Differential equations are mathematically studied from several different perspectives, mostly concerned with their
solutions, the set of functions that satisfy the equation. Only the simplest differential equations admit solutions
given by explicit formulas; however, some properties of solutions of a given differential equation may be determined
without finding their exact form. If a self-contained formula for the solution is not available, the solution may be
numerically approximated using computers. The theory of dynamical systems puts emphasis on qualitative analysis
of systems described by differential equations, while many numerical methods have been developed to determine
solutions with a given degree of accuracy.
Directions of study
The study of differential equations is a wide field in pure and applied mathematics, physics, meteorology, and
engineering. All of these disciplines are concerned with the properties of differential equations of various types. Pure
mathematics focuses on the existence and uniqueness of solutions, while applied mathematics emphasizes the
rigorous justification of the methods for approximating solutions. Differential equations play an important role in
modelling virtually every physical, technical, or biological process, from celestial motion, to bridge design, to
interactions between neurons. Differential equations such as those used to solve real-life problems may not
necessarily be directly solvable, i.e. do not have closed form solutions. Instead, solutions can be approximated using
numerical methods.
Mathematicians also study weak solutions (relying on weak derivatives), which are types of solutions that do not
have to be differentiable everywhere. This extension is often necessary for solutions to exist, and it also results in
more physically reasonable properties of solutions, such as possible presence of shocks for equations of hyperbolic
type.
The study of the stability of solutions of differential equations is known as stability theory.
Nomenclature
The theory of differential equations is quite developed and the methods used to study them vary significantly with
the type of the equation.
An ordinary differential equation (ODE) is a differential equation in which the unknown function (also known as
the dependent variable) is a function of a single independent variable. In the simplest form, the unknown
function is a real or complex valued function, but more generally, it may be vector-valued or matrix-valued: this
corresponds to considering a system of ordinary differential equations for a single function. Ordinary differential
equations are further classified according to the order of the highest derivative of the dependent variable with
respect to the independent variable appearing in the equation. The most important cases for applications are
first-order and second-order differential equations. In the classical literature, a distinction is also made between differential equations explicitly solved with respect to the highest derivative and differential equations in an implicit form.
A partial differential equation (PDE) is a differential equation in which the unknown function is a function of
multiple independent variables and the equation involves its partial derivatives. The order is defined similarly to
the case of ordinary differential equations, but further classification into elliptic, hyperbolic, and parabolic
equations, especially for second-order linear equations, is of utmost importance. Some partial differential
equations do not fall into any of these categories over the whole domain of the independent variables and they are
said to be of mixed type.
Both ordinary and partial differential equations are broadly classified as linear and nonlinear. A differential
equation is linear if the unknown function and its derivatives appear to the power 1 (products are not allowed) and
nonlinear otherwise. The characteristic property of linear equations is that their solutions form an affine subspace of
an appropriate function space, which results in much more developed theory of linear differential equations.
Homogeneous linear differential equations are a further subclass for which the space of solutions is a linear
subspace i.e. the sum of any set of solutions or multiples of solutions is also a solution. The coefficients of the
unknown function and its derivatives in a linear differential equation are allowed to be (known) functions of the
independent variable or variables; if these coefficients are constants then one speaks of a constant coefficient linear
differential equation.
There are very few methods of explicitly solving nonlinear differential equations; those that are known typically
depend on the equation having particular symmetries. Nonlinear differential equations can exhibit very complicated
behavior over extended time intervals, characteristic of chaos. Even the fundamental questions of existence,
uniqueness, and extendability of solutions for nonlinear differential equations, and well-posedness of initial and
boundary value problems for nonlinear PDEs are hard problems and their resolution in special cases is considered to
be a significant advance in the mathematical theory (cf. NavierStokes existence and smoothness).
Linear differential equations frequently appear as approximations to nonlinear equations. These approximations are
only valid under restricted conditions. For example, the harmonic oscillator equation is an approximation to the
nonlinear pendulum equation that is valid for small amplitude oscillations (see below).
Examples
In the first group of examples, let u be an unknown function of x, and let c and \omega be known constants.
Inhomogeneous first-order linear constant coefficient ordinary differential equation:
\frac{du}{dx} = cu + x^2.
Homogeneous second-order linear ordinary differential equation:
\frac{d^2u}{dx^2} - x\frac{du}{dx} + u = 0.
Homogeneous second-order linear constant coefficient ordinary differential equation describing the harmonic oscillator:
\frac{d^2u}{dx^2} + \omega^2 u = 0.
First-order nonlinear ordinary differential equation:
\frac{du}{dx} = u^2 + 4.
Second-order nonlinear ordinary differential equation describing the motion of a pendulum of length L:
L\frac{d^2u}{dx^2} + g\sin u = 0.
In the next group of examples, the unknown function u depends on two variables x and t or x and y.
Homogeneous first-order linear partial differential equation:
\frac{\partial u}{\partial t} + t\frac{\partial u}{\partial x} = 0.
Homogeneous second-order linear constant coefficient partial differential equation of elliptic type, the Laplace equation:
\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0.
Third-order nonlinear partial differential equation, the Korteweg–de Vries equation:
\frac{\partial u}{\partial t} = 6u\frac{\partial u}{\partial x} - \frac{\partial^3 u}{\partial x^3}.
Related concepts
A delay differential equation (DDE) is an equation for a function of a single variable, usually called time, in
which the derivative of the function at a certain time is given in terms of the values of the function at earlier
times.
A stochastic differential equation (SDE) is an equation in which the unknown quantity is a stochastic process and
the equation involves some known stochastic processes, for example, the Wiener process in the case of diffusion
equations.
A differential algebraic equation (DAE) is a differential equation comprising differential and algebraic terms,
given in implicit form.
Connection to difference equations
The theory of differential equations is closely related to the theory of difference equations, in which the coordinates
assume only discrete values, and the relationship involves values of the unknown function or functions and values at
nearby coordinates. Many methods to compute numerical solutions of differential equations or study the properties
of differential equations involve approximation of the solution of a differential equation by the solution of a
corresponding difference equation.
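As a small sketch of this correspondence, the forward-Euler discretization replaces the ODE u'(t) = -u(t) by a difference equation on a time grid (the step size here is an illustrative choice):

```python
import math

# Forward Euler turns the ODE u'(t) = -u(t) into the difference equation
# u[n+1] = (1 - dt) * u[n] on the grid t_n = n * dt.
dt = 0.001
u = 1.0                      # initial condition u(0) = 1
for _ in range(1000):        # advance to t = 1
    u = (1.0 - dt) * u

exact = math.exp(-1.0)       # the ODE's exact solution at t = 1
```

Solving the difference equation step by step approximates the solution of the differential equation; the error shrinks as dt does.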
Universality of mathematical description
Many fundamental laws of physics and chemistry can be formulated as differential equations. In biology and
economics differential equations are used to model the behavior of complex systems. The mathematical theory of
differential equations first developed, together with the sciences, where the equations had originated and where the
results found application. However, diverse problems, sometimes originating in quite distinct scientific fields, may
give rise to identical differential equations. Whenever this happens, mathematical theory behind the equations can be
viewed as a unifying principle behind diverse phenomena. As an example, consider propagation of light and sound in
the atmosphere, and of waves on the surface of a pond. All of them may be described by the same second-order
partial differential equation, the wave equation, which allows us to think of light and sound as forms of waves, much
like familiar waves in the water. Conduction of heat, the theory of which was developed by Joseph Fourier, is
governed by another second-order partial differential equation, the heat equation. It turned out that many diffusion
processes, while seemingly different, are described by the same equation; the Black–Scholes equation in finance, for instance, is related to the heat equation.
Notable differential equations
Newton's Second Law in dynamics (mechanics)
Hamilton's equations in classical mechanics
Radioactive decay in nuclear physics
Newton's law of cooling in thermodynamics
The wave equation
Maxwell's equations in electromagnetism
The heat equation in thermodynamics
Laplace's equation, which defines harmonic functions
Poisson's equation
Einstein's field equation in general relativity
The Schrdinger equation in quantum mechanics
The geodesic equation
The NavierStokes equations in fluid dynamics
The CauchyRiemann equations in complex analysis
The PoissonBoltzmann equation in molecular dynamics
The shallow water equations
Universal differential equation
The Lorenz equations, whose solutions exhibit chaotic flow.
Biology
Verhulst equation biological population growth
von Bertalanffy model biological individual growth
LotkaVolterra equations biological population dynamics
Replicator dynamics may be found in theoretical biology
Economics
The BlackScholes PDE
Exogenous growth model
Malthusian growth model
The Vidale-Wolfe advertising model
See also
Complex differential equation
Exact differential equation
Integral equations
Linear differential equation
PicardLindelf theorem on existence and uniqueness of solutions
References
D. Zwillinger, Handbook of Differential Equations (3rd edition), Academic Press, Boston, 1997.
A. D. Polyanin and V. F. Zaitsev, Handbook of Exact Solutions for Ordinary Differential Equations (2nd edition),
Chapman & Hall/CRC Press, Boca Raton, 2003. ISBN 1-58488-297-2.
W. Johnson, A Treatise on Ordinary and Partial Differential Equations [1], John Wiley and Sons, 1913, in University of Michigan Historical Math Collection [2]
E.L. Ince, Ordinary Differential Equations, Dover Publications, 1956
E.A. Coddington and N. Levinson, Theory of Ordinary Differential Equations, McGraw-Hill, 1955
P. Blanchard, R.L. Devaney, G.R. Hall, Differential Equations, Thompson, 2006
External links
Lectures on Differential Equations [3] – MIT OpenCourseWare video
Online Notes / Differential Equations [4] – Paul Dawkins, Lamar University
Differential Equations [5] – S.O.S. Mathematics
Introduction to modeling via differential equations [6] – introduction to modeling by means of differential equations, with critical remarks
Differential Equation Solver [7] – Java applet tool used to solve differential equations
Mathematical Assistant on Web [8] – symbolic ODE tool, using Maxima
Exact Solutions of Ordinary Differential Equations [9]
Collection of ODE and DAE models of physical systems [10] – MATLAB models
Notes on Diffy Qs: Differential Equations for Engineers [11] – an introductory textbook on differential equations by Jiri Lebl of UIUC
References
[1] http://www.hti.umich.edu/cgi/b/bib/bibperm?q1=abv5010.0001.001
[2] http://hti.umich.edu/u/umhistmath/
[3] http://ocw.mit.edu/OcwWeb/Mathematics/18-03Spring-2006/VideoLectures/index.htm
[4] http://tutorial.math.lamar.edu/classes/de/de.aspx
[5] http://www.sosmath.com/diffeq/diffeq.html
[6] http://www.diptem.unige.it/patrone/differential_equations_intro.htm
[7] http://publicliterature.org/tools/differential_equation_solver/
[8] http://user.mendelu.cz/marik/maw/index.php?lang=en&form=ode
[9] http://eqworld.ipmnet.ru/en/solutions/ode.htm
[10] http://www.hedengren.net/research/models.htm
[11] http://www.jirka.org/diffyqs/
Expected value
In probability theory and statistics, the expected value (or expectation value, or mathematical expectation, or
mean, or first moment) of a random variable is the integral of the random variable with respect to its probability measure.[1][2] Intuitively, expectation is the long-run average: if a test could be repeated many times, expectation is the mean of all the results.
For discrete random variables this is equivalent to the probability-weighted sum of the possible values.
For continuous random variables with a density function it is the probability density-weighted integral of the
possible values.
The term "expected value" can be misleading. It must not be confused with the "most probable value." The expected
value is in general not a typical value that the random variable can take on. It is often helpful to interpret the
expected value of a random variable as the long-run average value of the variable over many independent repetitions
of an experiment.
The expected value may be intuitively understood by the law of large numbers: The expected value, when it exists, is
almost surely the limit of the sample mean as the sample size grows to infinity. The value may not be expected in the general sense; the "expected value" itself may be unlikely or even impossible (such as having 2.5 children), just like the sample mean.
The expected value does not exist for some distributions with heavy tails, such as the Cauchy distribution.[3]
It is possible to construct an expected value equal to the probability of an event by taking the expectation of an
indicator function that is one if the event has occurred and zero otherwise. This relationship can be used to translate
properties of expected values into properties of probabilities, e.g. using the law of large numbers to justify estimating
probabilities by frequencies.
History
The idea of the expected value originated in the middle of the 17th century from the study of the so-called problem of points, posed by the French nobleman the Chevalier de Méré. The problem was that of two players who want to finish a game early and, given the current circumstances of the game, want to divide the stakes fairly, based on the chance each has of winning the game from that point. This problem was solved in 1654 by Blaise Pascal in his private correspondence with Pierre de Fermat; however, the idea was not communicated to the broad scientific community.
Three years later, in 1657, the Dutch mathematician Christiaan Huygens published a treatise (see Huygens (1657)) De ratiociniis in ludo aleae on probability theory, which not only laid down the foundations of the theory of probability, but also considered the problem of points, presenting a solution essentially the same as Pascal's.[4]
Neither Pascal nor Huygens used the term "expectation" in its modern sense. In particular, Huygens writes: "That my Chance or Expectation to win any thing is worth just such a Sum, as wou'd procure me in the same Chance and Expectation at a fair Lay. ... If I expect a or b, and have an equal Chance of gaining them, my Expectation is worth (a + b)/2." More than a hundred years later, in 1814, Pierre-Simon Laplace published his tract Théorie analytique des probabilités, where the concept of expected value was defined explicitly:
This advantage in the theory of chance is the product of the sum hoped for by the probability of obtaining it; it is the partial sum which
ought to result when we do not wish to run the risks of the event in supposing that the division is made proportional to the probabilities. This
division is the only equitable one when all strange circumstances are eliminated; because an equal degree of probability gives an equal right
for the sum hoped for. We will call this advantage mathematical hope.
The use of the letter E to denote expected value goes back to W. A. Whitworth (1901), Choice and Chance. The symbol has become popular since, for English writers, it meant "Expectation", for Germans "Erwartungswert", and for the French "Espérance mathématique".[5]
Examples
The expected outcome from one roll of an ordinary (that is, fair) six-sided die is
E[X] = 1\cdot\tfrac{1}{6} + 2\cdot\tfrac{1}{6} + 3\cdot\tfrac{1}{6} + 4\cdot\tfrac{1}{6} + 5\cdot\tfrac{1}{6} + 6\cdot\tfrac{1}{6} = 3.5,
which is not among the possible outcomes.[6]
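The same probability-weighted sum can be checked directly with a short, illustrative script using exact rational arithmetic:

```python
from fractions import Fraction

# Expected value of one roll of a fair six-sided die:
# the probability-weighted sum of the outcomes 1..6.
outcomes = range(1, 7)
expectation = sum(Fraction(1, 6) * x for x in outcomes)
print(expectation)  # 7/2, i.e. 3.5 -- not itself a possible outcome
```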
A common application of expected value is gambling. For example, an American roulette wheel has 38 places where
the ball may land, all equally likely. A winning bet on a single number pays 35-to-1, meaning that the original stake
is not lost, and 35 times that amount is won, so you receive 36 times what you've bet. Considering all 38 possible
outcomes, the expected value of the profit resulting from a dollar bet on a single number is the sum of the potential net loss times the probability of losing and the potential net gain times the probability of winning, that is,
E[\text{profit}] = -1 \cdot \tfrac{37}{38} + 35 \cdot \tfrac{1}{38} = -\tfrac{2}{38} \approx -\$0.0526.
The net change in your financial holdings is -$1 when you lose, and $35 when you win. Thus one may expect, on average, to lose about five cents for every dollar bet, and the expected value of a one-dollar bet is about $0.947368421 (that is, 36/38 of the stake). In
gambling, an event of which the expected value equals the stake (i.e. the bettor's expected profit, or net gain, is zero)
is called a fair game.
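The roulette computation above can be reproduced exactly with rational arithmetic (a small illustrative sketch):

```python
from fractions import Fraction

# Expected profit of a $1 bet on a single number in American roulette:
# lose $1 with probability 37/38, win $35 net with probability 1/38.
expected_profit = Fraction(-1) * Fraction(37, 38) + Fraction(35) * Fraction(1, 38)
print(expected_profit)                        # -1/19, about -$0.0526 per dollar
expected_value_of_bet = 1 + expected_profit   # about $0.947 returned on average
```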
Mathematical definition
In general, if X is a random variable defined on a probability space (\Omega, \Sigma, P), then the expected value of X, denoted by E(X), E[X], or \langle X \rangle, is defined as
E[X] = \int_{\Omega} X \, dP.
When this integral converges absolutely, it is called the expectation of X. The absolute convergence is necessary because conditional convergence means that a different order of addition gives a different result, which is against the nature of the expected value. Here the Lebesgue integral is employed. Note that not all random variables have an expected value, since the integral may not converge absolutely (e.g., the Cauchy distribution). Two variables with the same probability distribution will have the same expected value, if it is defined.
If X is a discrete random variable with probability mass function p(x), then the expected value becomes
E[X] = \sum_{i} x_i \, p(x_i),
as in the gambling example mentioned above.
If the probability distribution of X admits a probability density function f(x), then the expected value can be computed as
E[X] = \int_{-\infty}^{\infty} x f(x) \, dx.
It follows directly from the discrete case definition that if X is a constant random variable, i.e. X = b for some fixed real number b, then the expected value of X is also b.
The expected value of an arbitrary function of X, g(X), with respect to the probability density function f(x) is given by the inner product of f and g:
E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x) \, dx.
This is sometimes called the law of the unconscious statistician. Using the representation as a Riemann–Stieltjes integral and integration by parts, the formula can be restated as
E[X] = a + \int_{a}^{\infty} \big(1 - F(x)\big) \, dx \quad \text{if } P[X \ge a] = 1,
E[X] = b - \int_{-\infty}^{b} F(x) \, dx \quad \text{if } P[X \le b] = 1.
As a special case, let \alpha denote a positive real number; then
E\big[|X|^{\alpha}\big] = \alpha \int_{0}^{\infty} t^{\alpha - 1} P(|X| > t) \, dt.
In particular, for \alpha = 1, this reduces to
E[X] = \int_{0}^{\infty} \big(1 - F(t)\big) \, dt \quad \text{if } P[X \ge 0] = 1,
where F is the cumulative distribution function of X.
Conventional terminology
When one speaks of the "expected price", "expected height", etc. one means the expected value of a random
variable that is a price, a height, etc.
When one speaks of the "expected number of attempts needed to get one successful attempt," one might
conservatively approximate it as the reciprocal of the probability of success for such an attempt. Cf. expected
value of the geometric distribution.
Properties
Constants
The expected value of a constant is equal to the constant itself; i.e., if c is a constant, then E[c] = c.
Monotonicity
If X and Y are random variables such that X \le Y almost surely, then E[X] \le E[Y].
Linearity
The expected value operator (or expectation operator) E is linear in the sense that
E[X + c] = E[X] + c,
E[X + Y] = E[X] + E[Y],
E[aX] = a \, E[X].
Note that the second result is valid even if X is not statistically independent of Y. Combining the results from the previous three equations, we can see that
E[aX + bY] = a \, E[X] + b \, E[Y]
for any two random variables X and Y (which need to be defined on the same probability space) and any real numbers a and b.
Iterated expectation
Iterated expectation for discrete random variables
For any two discrete random variables X, Y one may define the conditional expectation:[7]
E[X \mid Y = y] = \sum_{x} x \cdot P(X = x \mid Y = y),
which means that E[X \mid Y] is a function of Y.
Then the expectation of X satisfies
E\big[E[X \mid Y]\big] = \sum_{y} E[X \mid Y = y] \cdot P(Y = y) = E[X].
Hence, the following equation holds:[8]
E[X] = E\big[E[X \mid Y]\big].
The right hand side of this equation is referred to as the iterated expectation and is also sometimes called the tower
rule. This proposition is treated in law of total expectation.
Iterated expectation for continuous random variables
In the continuous case, the results are completely analogous. The definition of conditional expectation would use
inequalities, density functions, and integrals to replace equalities, mass functions, and summations, respectively.
However, the main result still holds: E[X] = E\big[E[X \mid Y]\big].
Inequality
If a random variable X is always less than or equal to another random variable Y, the expectation of X is less than or
equal to that of Y:
If X \le Y, then E[X] \le E[Y].
In particular, since X \le |X| and -X \le |X|, the absolute value of the expectation of a random variable is less than or equal to the expectation of its absolute value:
|E[X]| \le E[|X|].
Non-multiplicativity
If one considers the joint PDF of X and Y, say j(x, y), then the expectation of XY is
E[XY] = \iint xy \, j(x, y) \, dx \, dy.
In general, the expected value operator is not multiplicative, i.e. E[XY] is not necessarily equal to E[X] \, E[Y]. In fact, the amount by which multiplicativity fails is called the covariance:
\operatorname{Cov}(X, Y) = E[XY] - E[X] \, E[Y].
Thus multiplicativity holds precisely when \operatorname{Cov}(X, Y) = 0, in which case X and Y are said to be uncorrelated (independent variables are a notable case of uncorrelated variables).
Now if X and Y are independent, then by definition j(x,y)=f(x)g(y) where f and g are the marginal PDFs for X and Y.
Then
E[XY] = \iint xy \, j(x, y) \, dx \, dy = \iint xy \, f(x) g(y) \, dy \, dx = \Big[\int x f(x) \, dx\Big] \Big[\int y g(y) \, dy\Big] = E[X] \, E[Y],
and \operatorname{Cov}(X, Y) = 0.
Observe that independence of X and Y is required only to write j(x,y)=f(x)g(y), and this is required to establish the
second equality above. The third equality follows from a basic application of the Fubini-Tonelli theorem.
Functional non-invariance
In general, the expectation operator and functions of random variables do not commute; that is, E[g(X)] \ne g(E[X]) in general.
A notable inequality concerning this topic is Jensen's inequality, involving expected values of convex (or concave)
functions.
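A two-point example makes the non-commutation concrete; for the convex function g(x) = x^2 (the distribution chosen here is illustrative), Jensen's inequality guarantees E[g(X)] >= g(E[X]):

```python
# For X uniform on {0, 2} and the convex function g(x) = x**2,
# E[g(X)] and g(E[X]) differ, and E[g(X)] >= g(E[X]) holds (Jensen).
values = [0.0, 2.0]
p = 1.0 / len(values)

e_x = sum(p * x for x in values)     # E[X]     = 1.0
e_g = sum(p * x**2 for x in values)  # E[X^2]   = 2.0
g_e = e_x**2                         # (E[X])^2 = 1.0
# The gap e_g - g_e is exactly Var(X), here 1.0.
```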
Uses and applications
The expected values of the powers of X are called the moments of X; the moments about the mean of X are expected values of powers of X - E[X]. The moments of some random variables can be used to specify their distributions, via their moment generating functions.
To empirically estimate the expected value of a random variable, one repeatedly measures observations of the
variable and computes the arithmetic mean of the results. If the expected value exists, this procedure estimates the
true expected value in an unbiased manner and has the property of minimizing the sum of the squares of the residuals
(the sum of the squared differences between the observations and the estimate). The law of large numbers
demonstrates (under fairly mild conditions) that, as the size of the sample gets larger, the variance of this estimate
gets smaller.
This property is often exploited in a wide variety of applications, including general problems of statistical estimation
and machine learning, to estimate (probabilistic) quantities of interest via Monte Carlo methods, since most
quantities of interest can be written in terms of expectation, e.g. P(X \in A) = E[1_A(X)], where 1_A is the indicator function for the set A, i.e. 1_A(x) = 1 if x \in A and 1_A(x) = 0 otherwise.
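A minimal Monte Carlo sketch of this identity, estimating P(X > 1) for a standard normal X as the sample mean of the indicator (the threshold, sample size, and seed are illustrative choices):

```python
import random

random.seed(0)

# Monte Carlo: estimate P(X in A) as the sample mean of the indicator 1_A(X),
# here A = (1, infinity) for a standard normal X (true value about 0.1587).
n = 200_000
hits = sum(1 for _ in range(n) if random.gauss(0.0, 1.0) > 1.0)
estimate = hits / n
```

By the law of large numbers, the estimate concentrates around the true probability as n grows; the standard error here is under 0.001.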
In classical mechanics, the center of mass is an analogous concept to expectation. For example, suppose X is a discrete random variable with values x_i and corresponding probabilities p_i. Now consider a weightless rod on which are placed weights, at locations x_i along the rod and having masses p_i (whose sum is one). The point at which the rod balances is E[X].
Expected values can also be used to compute the variance, by means of the computational formula for the variance:
\operatorname{Var}(X) = E[X^2] - \big(E[X]\big)^2.
A very important application of the expectation value is in the field of quantum mechanics. The expectation value of a quantum mechanical operator \hat{A} operating on a quantum state vector |\psi\rangle is written as \langle \hat{A} \rangle = \langle \psi | \hat{A} | \psi \rangle. The uncertainty in \hat{A} can be calculated using the formula (\Delta A)^2 = \langle \hat{A}^2 \rangle - \langle \hat{A} \rangle^2.
Expectation of matrices
If X is an m \times n matrix, then the expected value of the matrix is defined as the matrix of expected values:
E[X] = \big( E[x_{ij}] \big)_{i = 1, \ldots, m;\ j = 1, \ldots, n}.
This is utilized in covariance matrices.
Formulas for special cases
Discrete distribution taking only non-negative integer values
When a random variable X takes only values in \{0, 1, 2, 3, \ldots\} we can use the following formula for computing its expectation:
E[X] = \sum_{i=1}^{\infty} P(X \ge i).
Proof:
\sum_{i=1}^{\infty} P(X \ge i) = \sum_{i=1}^{\infty} \sum_{j=i}^{\infty} P(X = j);
interchanging the order of summation, we have
\sum_{j=1}^{\infty} \sum_{i=1}^{j} P(X = j) = \sum_{j=1}^{\infty} j \, P(X = j) = E[X],
as claimed. This result can be a useful computational shortcut. For example, suppose we toss a coin where the probability of heads is p. How many tosses can we expect until the first heads (not including the heads itself)? Let X be this number. Note that we are counting only the tails and not the heads which ends the experiment; in particular, we can have X = 0. The expectation of X may be computed by
E[X] = \sum_{i=1}^{\infty} P(X \ge i) = \sum_{i=1}^{\infty} (1 - p)^i = \frac{1 - p}{p}.
This is because the number of tosses is at least i exactly when the first i tosses yielded tails. This matches the expectation of a random variable with a geometric distribution. We used the formula for a geometric progression:
\sum_{i=1}^{\infty} r^i = \frac{r}{1 - r}.
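A quick numerical check of the tail-sum formula for this coin example (p = 0.25 is an arbitrary choice; the series is truncated where its terms are negligible):

```python
# Check the tail-sum formula E[X] = sum_{i>=1} P(X >= i) for the number of
# tails before the first head with a biased coin (heads probability p).
p = 0.25
# P(X >= i) = (1 - p)**i : the first i tosses were all tails.
tail_sum = sum((1 - p) ** i for i in range(1, 200))
direct = (1 - p) / p           # closed form for the geometric case, here 3.0
```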
Continuous distribution taking non-negative values
Analogously with the discrete case above, when a continuous random variable X takes only non-negative values, we
can use the following formula for computing its expectation:
E[X] = \int_{0}^{\infty} P(X \ge x) \, dx.
Proof: It is first assumed that X has a density f_X(x). Then
\int_{0}^{\infty} P(X \ge x) \, dx = \int_{0}^{\infty} \int_{x}^{\infty} f_X(t) \, dt \, dx;
interchanging the order of integration, we have
\int_{0}^{\infty} \int_{0}^{t} f_X(t) \, dx \, dt = \int_{0}^{\infty} t \, f_X(t) \, dt = E[X],
as claimed. In case no density exists, it is seen that
E[X] = \int_{0}^{\infty} \big(1 - F(x)\big) \, dx.
See also
Conditional expectation
An inequality on location and scale parameters
Expected value is also a key concept in economics, finance, and many other subjects
The general term expectation
Moment (mathematics)
Expectation value (quantum mechanics)
Wald's equation for calculating the expected value of a random number of random variables
Notes
[1] Sheldon M Ross (2007). "2.4 Expectation of a random variable" (http://books.google.com/books?id=12Pk5zZFirEC&pg=PA38). Introduction to Probability Models (9th ed.). Academic Press. p. 38 ff. ISBN 0125980620.
[2] Richard W Hamming (1991). "2.5 Random variables, mean and the expected value" (http://books.google.com/books?id=jX_F-77TA3gC&pg=PA64). The Art of Probability for Scientists and Engineers. Addison-Wesley. p. 64 ff. ISBN 0201406861.
[3] For a discussion of the Cauchy distribution, see Richard W Hamming (1991). "Example 8.7-1 The Cauchy distribution" (http://books.google.com/books?id=jX_F-77TA3gC&printsec=frontcover&dq=isbn:0201406861&cd=1#v=onepage&q=Cauchy&f=false). The Art of Probability for Scientists and Engineers. Addison-Wesley. p. 290 ff. ISBN 0201406861. "Sampling from the Cauchy distribution and averaging gets you nowhere; one sample has the same distribution as the average of 1000 samples!"
[4] In the foreword to his book, Huygens writes: "It should be said, also, that for some time some of the best mathematicians of France have occupied themselves with this kind of calculus so that no one should attribute to me the honour of the first invention. This does not belong to me. But these savants, although they put each other to the test by proposing to each other many questions difficult to solve, have hidden their methods. I have had therefore to examine and go deeply for myself into this matter by beginning with the elements, and it is impossible for me for this reason to affirm that I have even started from the same principle. But finally I have found that my answers in many cases do not differ from theirs." (cited in Edwards (2002)). Thus, Huygens learned about de Méré's problem in 1655 during his visit to France; later on in 1656, from his correspondence with Carcavi, he learned that his method was essentially the same as Pascal's; so that before his book went to press in 1657 he knew about Pascal's priority in this subject.
[5] "Earliest uses of symbols in probability and statistics" (http:/ / jeff560. tripod. com/ stat. html). .
[6] Sheldon M Ross. "Example 2.15" (http:/ / books. google. com/ books?id=12Pk5zZFirEC& pg=PA39). cited work. p.39. ISBN0125980620. .
[7] Sheldon M Ross. "Chapter 3: Conditional probability and conditional expectation" (http:/ / books. google. com/ books?id=12Pk5zZFirEC&
pg=PA97). cited work. p.97 ff. ISBN0125980620. .
[8] Sheldon M Ross. "3.4: Computing expectations by conditioning" (http:/ / books. google. com/ books?id=12Pk5zZFirEC& pg=PA105). cited
work. p.105 ff. ISBN0125980620. .
Historical background
Edwards, A. W. F. (2002). Pascal's Arithmetical Triangle: The Story of a Mathematical Idea (2nd ed.). JHU Press. ISBN 0-8018-6946-3.
Huygens, Christiaan (1657). De ratiociniis in ludo aleae (English translation, published in 1714: http://www.york.ac.uk/depts/maths/histstat/huygens.pdf).
External links
An 8-foot-tall (2.4 m) Probability Machine (named Sir Francis) comparing stock market returns to the randomness of the beans dropping through the quincunx pattern (http://www.youtube.com/watch?v=AUSKTk9ENzg), from Index Funds Advisors IFA.com (http://www.ifa.com), youtube.com
Expectation (http://planetmath.org/?op=getobj&from=objects&id=505) on PlanetMath
Ergodic theory
Ergodic theory is a branch of mathematics that studies dynamical systems with an invariant measure and related
problems. Its initial development was motivated by problems of statistical physics.
A central aspect of ergodic theory is the behavior of a dynamical system when it is allowed to run for a long time.
The first result in this direction is the Poincaré recurrence theorem, which claims that almost all points in any subset
of the phase space eventually revisit the set. More precise information is provided by various ergodic theorems
which assert that, under certain conditions, the time average of a function along the trajectories exists almost
everywhere and is related to the space average. Two of the most important examples are ergodic theorems of
Birkhoff and von Neumann. For the special class of ergodic systems, the time average is the same for almost all
initial points: statistically speaking, the system that evolves for a long time "forgets" its initial state. Stronger
properties, such as mixing and equidistribution, have also been extensively studied.
The problem of metric classification of systems is another important part of the abstract ergodic theory. An
outstanding role in ergodic theory and its applications to stochastic processes is played by the various notions of
entropy for dynamical systems.
Applications of ergodic theory to other parts of mathematics usually involve establishing ergodicity properties for
systems of a special kind. In geometry, methods of ergodic theory have been used to study the geodesic flow on
Riemannian manifolds, starting with the results of Eberhard Hopf for Riemann surfaces of negative curvature.
Markov chains form a common context for applications in probability theory. Ergodic theory has fruitful connections
with harmonic analysis, Lie theory (representation theory, lattices in algebraic groups), and number theory (the
theory of diophantine approximations, L-functions).
Ergodic transformations
Ergodic theory is often concerned with ergodic transformations.
Let T : X → X be a measure-preserving transformation on a measure space (X, Σ, μ), usually assumed to have finite
measure. A measure-preserving transformation T as above is ergodic if for every E ∈ Σ with T⁻¹(E) = E,
either μ(E) = 0 or μ(E) = μ(X).
Examples
An irrational rotation of the circle R/Z, T : x → x + θ, where θ is irrational, is ergodic. This transformation has even
stronger properties of unique ergodicity, minimality, and equidistribution. By contrast, if θ = p/q is rational (in
lowest terms) then T is periodic, with period q, and thus cannot be ergodic: for any interval I of length a, 0 < a <
1/q, its orbit under T is a T-invariant mod 0 set that is a union of q intervals of length a, hence it has measure qa
strictly between 0 and 1.
Let G be a compact abelian group, μ the normalized Haar measure, and T a group automorphism of G. Let G* be
the Pontryagin dual group, consisting of the continuous characters of G, and T* be the corresponding adjoint
automorphism of G*. The automorphism T is ergodic if and only if the equality (T*)ⁿ(χ) = χ is possible only when n
= 0 or χ is the trivial character of G. In particular, if G is the n-dimensional torus and the automorphism T is
represented by an integral matrix A then T is ergodic if and only if no eigenvalue of A is a root of unity.
A Bernoulli shift is ergodic. More generally, ergodicity of the shift transformation associated with a sequence of
i.i.d. random variables and some more general stationary processes follows from Kolmogorov's zero-one law.
Ergodicity of a continuous dynamical system means that its trajectories "spread around" the phase space. A
system with a compact phase space which has a non-constant first integral cannot be ergodic. This applies, in
particular, to Hamiltonian systems with a first integral I functionally independent from the Hamilton function H
and a compact level set X = {(p,q): H(p,q)=E} of constant energy. Liouville's theorem implies the existence of a
finite invariant measure on X, but the dynamics of the system is constrained to the level sets of I on X, hence the
system possesses invariant sets of positive but less than full measure. A property of continuous dynamical
systems that is the opposite of ergodicity is complete integrability.
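The contrast between the irrational and the rational rotation in the first example above is easy to check numerically. The sketch below uses the arbitrarily chosen irrational angle √2 − 1 and estimates the fraction of an orbit falling in a fixed interval:

```python
import numpy as np

def orbit(theta, x0=0.0, n=100_000):
    """First n points of the circle rotation T: x -> x + theta (mod 1)."""
    return (x0 + theta * np.arange(n)) % 1.0

# Irrational angle (sqrt(2) - 1 is an arbitrary choice): the orbit
# equidistributes, so the fraction of time spent in [0, 0.3) tends to 0.3.
frac_irr = np.mean(orbit(np.sqrt(2) - 1) < 0.3)

# Rational angle 1/4: the orbit is periodic with period 4 and visits only
# 4 points, so it cannot equidistribute.
n_points = len(np.unique(orbit(0.25)))

print(frac_irr)   # close to 0.3
print(n_points)   # 4
```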
Ergodic theorems
Let T : X → X be a measure-preserving transformation on a measure space (X, Σ, μ). One may then consider
the "time average" of a μ-integrable function f, i.e. f ∈ L¹(μ). The "time average" is defined as the average (if
it exists) over iterations of T starting from some initial point x:

f̂(x) = lim_{n→∞} (1/n) Σ_{k=0}^{n−1} f(T^k x).

If μ(X) is finite and nonzero, we can consider the "space average" or "phase average" of f, defined as

f̄ = (1/μ(X)) ∫ f dμ.
In general the time average and space average may be different. But if the transformation is ergodic, and the measure
is invariant, then the time average is equal to the space average almost everywhere. This is the celebrated ergodic
theorem, in an abstract form due to George David Birkhoff. (Actually, Birkhoff's paper considers not the abstract
general case but only the case of dynamical systems arising from differential equations on a smooth manifold.) The
equidistribution theorem is a special case of the ergodic theorem, dealing specifically with the distribution of
probabilities on the unit interval.
More precisely, the pointwise or strong ergodic theorem states that the limit in the definition of the time average of
f exists for almost every x and that the (almost everywhere defined) limit function f̂ is integrable:

‖f̂‖₁ ≤ ‖f‖₁.

Furthermore, f̂ is T-invariant, that is to say

f̂ ∘ T = f̂

holds almost everywhere, and if μ(X) is finite, then the normalization is the same:

∫ f̂ dμ = ∫ f dμ.

In particular, if T is ergodic, then f̂ must be a constant (almost everywhere), and so one has that

f̂ = (1/μ(X)) ∫ f dμ

almost everywhere. Joining the first to the last claim and assuming that μ(X) is finite and nonzero, one has that

lim_{n→∞} (1/n) Σ_{k=0}^{n−1} f(T^k x) = (1/μ(X)) ∫ f dμ

for almost all x, i.e., for all x except for a set of measure zero.
For an ergodic transformation, the time average equals the space average almost surely.
As an example, assume that the measure space (X, Σ, μ) models the particles of a gas as above, and let f(x)
denote the velocity of the particle at position x. Then the pointwise ergodic theorem says that the average velocity
of all particles at some given time is equal to the average velocity of one particle over time.
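For the irrational rotation of the circle, the equality of time average and space average can be tested directly; the angle, the starting point, and the test function below are all arbitrary choices:

```python
import numpy as np

# Time average versus space average for the ergodic rotation
# T: x -> x + theta (mod 1) with irrational theta.
theta = np.sqrt(2) - 1
x = (0.1 + theta * np.arange(200_000)) % 1.0   # orbit T^k(0.1), k = 0..n-1

time_avg = np.mean(x**2)    # (1/n) * sum of f(T^k x) with f(x) = x^2
space_avg = 1.0 / 3.0       # integral of x^2 dx over [0, 1]

print(time_avg, space_avg)  # nearly equal
```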
Probabilistic formulation: Birkhoff–Khinchin theorem
Birkhoff–Khinchin theorem. Let f be measurable, E(|f|) < ∞, and T be a measure-preserving map. Then with probability 1:

lim_{n→∞} (1/n) Σ_{k=0}^{n−1} f(T^k x) = E(f | C)(x),

where E(f | C) is the conditional expectation given the σ-algebra C of invariant sets of T.
Corollary (Pointwise ergodic theorem): In particular, if T is also ergodic, then C is the trivial σ-algebra, and thus with probability 1:

lim_{n→∞} (1/n) Σ_{k=0}^{n−1} f(T^k x) = E(f).
Mean ergodic theorem
Another form of the ergodic theorem, von Neumann's mean ergodic theorem, holds in Hilbert spaces.[1]
Let U be a unitary operator on a Hilbert space H; more generally, an isometric linear operator (that is, a not
necessarily surjective linear operator satisfying ‖Ux‖ = ‖x‖ for all x ∈ H, or equivalently, satisfying U*U = I,
but not necessarily UU* = I). Let P be the orthogonal projection onto {ψ ∈ H : Uψ = ψ} = ker(I − U).
Then, for any x in H, we have:

lim_{N→∞} (1/N) Σ_{n=0}^{N−1} Uⁿ x = P x,

where the limit is with respect to the norm on H. In other words, the sequence of averages

(1/N) Σ_{n=0}^{N−1} Uⁿ

converges to P in the strong operator topology.
This theorem specializes to the case in which the Hilbert space H consists of L² functions on a measure space and U
is an operator of the form

(Uf)(x) = f(Tx),

where T is a measure-preserving endomorphism of X, thought of in applications as representing a time-step of a
discrete dynamical system.[2] The ergodic theorem then asserts that the average behavior of a function f over
sufficiently large time-scales is approximated by the orthogonal component of f which is time-invariant.
In another form of the mean ergodic theorem, let U_t be a strongly continuous one-parameter group of unitary
operators on H. Then the operator

(1/T) ∫₀^T U_t dt

converges in the strong operator topology as T → ∞. In fact, this result also extends to the case of a strongly
continuous one-parameter semigroup of contractive operators on a reflexive space.
Remark: Some intuition for the mean ergodic theorem can be developed by considering the case where complex
numbers of unit length are regarded as unitary transformations on the complex plane (by left multiplication). If we
pick a single complex number of unit length λ ≠ 1 (which we think of as U), it is intuitive that its powers will fill up the
circle. Since the circle is symmetric around 0, it makes sense that the averages of the powers of λ will converge to
0. Also, 0 is the only fixed point of λ, and so the projection onto the space of fixed points must be the zero
operator (which agrees with the limit just described).
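This remark can be checked with a few lines of arithmetic; λ = e^(2πi·0.123) below is an arbitrary unit complex number other than 1:

```python
import cmath

# Averaging the powers of a unit complex number lam != 1: the averages
# (1/N) * sum_{n<N} lam^n converge to 0, the projection onto the
# fixed-point space of the map z -> lam * z.
lam = cmath.exp(2j * cmath.pi * 0.123)
N = 100_000
avg = sum(lam ** n for n in range(N)) / N

print(abs(avg))   # near 0
```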
Convergence of the ergodic means in the L^p norms
Let (X, Σ, μ) be as above a probability space with a measure-preserving transformation T, and let 1 ≤ p ≤ ∞.
The conditional expectation with respect to the sub-σ-algebra Σ_T of the T-invariant sets is a linear projector E_T of
norm 1 of the Banach space L^p(X, Σ, μ) onto its closed subspace L^p(X, Σ_T, μ). The latter may also be
characterized as the space of all T-invariant L^p-functions on X. The ergodic means, as linear operators on
L^p(X, Σ, μ), also have unit operator norm; and, as a simple consequence of the Birkhoff–Khinchin theorem,
converge to the projector E_T in the strong operator topology of L^p if 1 ≤ p < ∞, and in the weak operator
topology if p = ∞. More is true: if 1 < p ≤ ∞ then the Wiener–Yoshida–Kakutani ergodic dominated
convergence theorem states that the ergodic means of f ∈ L^p are dominated in L^p; however, if f ∈ L¹, the
ergodic means may fail to be equidominated in L¹. Finally, if f is assumed to be in the Zygmund class, that is
|f| log⁺(|f|) is integrable, then the ergodic means are even dominated in L¹.
Sojourn time
Let (X, Σ, μ) be a measure space such that μ(X) is finite and nonzero. The time spent in a measurable set A is
called the sojourn time. An immediate consequence of the ergodic theorem is that, in an ergodic system, the relative
measure of A is equal to the mean sojourn time:

μ(A)/μ(X) = lim_{n→∞} (1/n) Σ_{k=0}^{n−1} χ_A(T^k x)

for all x except for a set of measure zero, where χ_A is the indicator function of A.
Let the occurrence times of a measurable set A be defined as the set k₁, k₂, k₃, ..., of times k such that T^k(x) is in A,
sorted in increasing order. The differences between consecutive occurrence times R_i = k_i − k_{i−1} are called the
recurrence times of A. Another consequence of the ergodic theorem is that the average recurrence time of A is
inversely proportional to the measure of A, assuming that the initial point x is in A, so that k₀ = 0:

lim_{n→∞} (R₁ + ... + R_n)/n = μ(X)/μ(A) (almost surely).
(See almost surely.) That is, the smaller A is, the longer it takes to return to it.
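Both consequences, the mean sojourn time and the mean recurrence time, can be illustrated with the circle rotation again; the set A = [0, 0.1), the angle, and the starting point (chosen inside A, so that k₀ = 0) are arbitrary:

```python
import numpy as np

# Sojourn and recurrence times for the ergodic rotation
# T: x -> x + theta (mod 1) and the set A = [0, 0.1). The sojourn fraction
# approaches mu(A) = 0.1 and the mean recurrence time approaches 1/mu(A) = 10.
theta = np.sqrt(2) - 1
orbit = (0.05 + theta * np.arange(500_000)) % 1.0

visits = np.flatnonzero(orbit < 0.1)        # occurrence times k_1 < k_2 < ...
sojourn_frac = visits.size / orbit.size
mean_recurrence = np.diff(visits).mean()    # average of R_i = k_i - k_{i-1}

print(sojourn_frac)      # ~ 0.1
print(mean_recurrence)   # ~ 10
```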
Ergodic flows on manifolds
The ergodicity of the geodesic flow on compact Riemann surfaces of variable negative curvature and on compact
manifolds of constant negative curvature of any dimension was proved by Eberhard Hopf in 1939, although special
cases had been studied earlier: see for example, Hadamard's billiards (1898) and Artin billiard (1924). The relation
between geodesic flows on Riemann surfaces and one-parameter subgroups on SL(2,R) was described in 1952 by S.
V. Fomin and I. M. Gelfand. The article on Anosov flows provides an example of ergodic flows on SL(2,R) and on
Riemann surfaces of negative curvature. Much of the development described there generalizes to hyperbolic
manifolds, since they can be viewed as quotients of the hyperbolic space by the action of a lattice in the semisimple
Lie group SO(n,1). Ergodicity of the geodesic flow on Riemannian symmetric spaces was demonstrated by F. I.
Mautner in 1957. In 1967 D. V. Anosov and Ya. G. Sinai proved ergodicity of the geodesic flow on compact
manifolds of variable negative sectional curvature. A simple criterion for the ergodicity of a homogeneous flow on a
homogeneous space of a semisimple Lie group was given by C. C. Moore in 1966. Many of the theorems and results
from this area of study are typical of rigidity theory.
In the 1930s G. A. Hedlund proved that the horocycle flow on a compact hyperbolic surface is minimal and ergodic.
Unique ergodicity of the flow was established by Hillel Furstenberg in 1972. Ratner's theorems provide a major
generalization of ergodicity for unipotent flows on the homogeneous spaces of the form Γ\G, where G is a Lie group
and Γ is a lattice in G.
In the last 20 years, there have been many works trying to find a measure-classification theorem similar to Ratner's
theorems but for diagonalizable actions, motivated by conjectures of Furstenberg and Margulis. An important partial
result (solving those conjectures with an extra assumption of positive entropy) was proved by Elon Lindenstrauss,
and he was awarded the Fields medal in 2010 for this result.
See also
Chaos theory
Ergodic hypothesis
Ergodic process
Maximal ergodic theorem
Statistical mechanics
References
[1] Michael Reed, Barry Simon, Functional Analysis (Methods of Modern Mathematical Physics, Volume 1), Academic Press; revised edition (1980)
[2] (Walters 1982)
Historical references
Birkhoff, George David (1931), "Proof of the ergodic theorem" (http://www.pnas.org/cgi/reprint/17/12/656), Proc Natl Acad Sci USA 17 (12): 656–660, doi:10.1073/pnas.17.12.656, PMID 16577406, PMC 1076138.
Birkhoff, George David (1942), "What is the ergodic theorem?" (http://www.jstor.org/stable/2303229), American Mathematical Monthly 49 (4): 222–226, doi:10.2307/2303229.
von Neumann, John (1932), "Proof of the Quasi-ergodic Hypothesis" (http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=1076162), Proc Natl Acad Sci USA 18 (1): 70–82, doi:10.1073/pnas.18.1.70, PMID 16577432, PMC 1076162.
von Neumann, John (1932), "Physical Applications of the Ergodic Hypothesis" (http://www.jstor.org/stable/86260), Proc Natl Acad Sci USA 18 (3): 263–266, doi:10.1073/pnas.18.3.263, PMID 16587674, PMC 1076204.
Hopf, Eberhard (1939), "Statistik der geodätischen Linien in Mannigfaltigkeiten negativer Krümmung", Leipzig Ber. Verhandl. Sächs. Akad. Wiss. 91: 261–304.
Fomin, Sergei V.; Gelfand, I. M. (1952), "Geodesic flows on manifolds of constant negative curvature", Uspehi Mat. Nauk 7 (1): 118–137.
Mautner, F. I. (1957), "Geodesic flows on symmetric Riemann spaces" (http://jstor.org/stable/1970054), Ann. of Math. 65 (3): 416–431, doi:10.2307/1970054.
Moore, C. C. (1966), "Ergodicity of flows on homogeneous spaces" (http://jstor.org/stable/2373052), Amer. J. Math. 88 (1): 154–178, doi:10.2307/2373052.
Modern references
D. V. Anosov (2001), "Ergodic theory" (http://eom.springer.de/e/e036150.htm), in Hazewinkel, Michiel, Encyclopaedia of Mathematics, Springer, ISBN 978-1556080104
This article incorporates material from ergodic theorem on PlanetMath, which is licensed under the Creative Commons Attribution/Share-Alike License.
Vladimir Igorevich Arnol'd and André Avez, Ergodic Problems of Classical Mechanics. New York: W. A. Benjamin. 1968.
Leo Breiman, Probability. Original edition published by Addison-Wesley, 1968; reprinted by Society for Industrial and Applied Mathematics, 1992. ISBN 0-89871-296-3. (See Chapter 6.)
Peter Walters, An Introduction to Ergodic Theory, Springer, New York, 1982, ISBN 0-387-95152-0.
Tim Bedford, Michael Keane and Caroline Series, eds. (1991), Ergodic Theory, Symbolic Dynamics and Hyperbolic Spaces, Oxford University Press, ISBN 0-19-853390-X (A survey of topics in ergodic theory; with exercises.)
Karl Petersen. Ergodic Theory (Cambridge Studies in Advanced Mathematics). Cambridge: Cambridge University Press. 1990.
Joseph M. Rosenblatt and Máté Wierdl, Pointwise ergodic theorems via harmonic analysis, (1993) appearing in Ergodic Theory and its Connections with Harmonic Analysis, Proceedings of the 1993 Alexandria Conference, (1995) Karl E. Petersen and Ibrahim A. Salama, eds., Cambridge University Press, Cambridge, ISBN 0-521-45999-0. (An extensive survey of the ergodic properties of generalizations of the equidistribution theorem of shift maps on the unit interval. Focuses on methods developed by Bourgain.)
A. N. Shiryaev, Probability, 2nd ed., Springer 1996, Sec. V.3. ISBN 0-387-94549-0.
External links
Ergodic Theory (29 October 2007) (http://www.cscs.umich.edu/~crshalizi/notebooks/ergodic-theory.html), notes by Cosma Rohilla Shalizi
Feynman–Kac formula
The Feynman–Kac formula, named after Richard Feynman and Mark Kac, establishes a link between parabolic
partial differential equations (PDEs) and stochastic processes. It offers a method of solving certain PDEs by
simulating random paths of a stochastic process. Conversely, an important class of expectations of random processes
can be computed by deterministic methods. Consider the PDE

∂u/∂t + μ(x, t) ∂u/∂x + (1/2) σ²(x, t) ∂²u/∂x² = 0,

defined for all x in R and t in the interval [0, T], subject to the terminal condition

u(x, T) = ψ(x),

where μ, σ, ψ are known functions, T is a parameter and u : R × [0, T] → R is the unknown. Then the Feynman–Kac
formula tells us that the solution can be written as an expectation:

u(x, t) = E[ψ(X_T) | X_t = x],

where X is an Itô process driven by the equation

dX = μ(X, t) dt + σ(X, t) dW,

with W a Wiener process (also called Brownian motion) and the initial condition for X is X(t) = x.
This expectation can then be approximated using Monte Carlo or quasi-Monte Carlo methods.
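As a minimal sketch of this Monte Carlo approach, consider the special case of zero drift and a constant (hypothetically chosen) σ, with terminal condition ψ(x) = x², for which the PDE has a closed-form solution to compare against:

```python
import numpy as np

# Monte Carlo sketch of the Feynman-Kac representation, with hypothetical
# constant coefficients mu = 0, sigma = 0.4 and psi(x) = x^2. For this case
#   du/dt + (1/2) sigma^2 d^2u/dx^2 = 0,  u(x, T) = x^2
# has the closed-form solution u(x, t) = x^2 + sigma^2 * (T - t).
rng = np.random.default_rng(0)
sigma, x0, t, T = 0.4, 1.0, 0.0, 2.0
psi = lambda x: x**2

# With constant coefficients X_T can be sampled exactly (no Euler
# discretization needed): X_T = x0 + sigma * W_{T-t}.
X_T = x0 + sigma * np.sqrt(T - t) * rng.standard_normal(1_000_000)
u_mc = psi(X_T).mean()                 # estimate of E[psi(X_T) | X_t = x0]
u_exact = x0**2 + sigma**2 * (T - t)

print(u_mc, u_exact)   # both close to 1.32
```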
Proof
NOTE: The proof presented below is essentially that of [1], albeit with more detail. Let u(x, t) be the solution to the
above PDE. Applying Itô's lemma to the process Y_s = u(X_s, s) one gets

dY = (∂u/∂t) ds + (∂u/∂x) dX + (1/2) (∂²u/∂x²) (dX)².

Since (dX)² = μ² ds² + 2μσ ds dW + σ² (dW)² = σ² ds + o(ds), the third term is (1/2) σ² (∂²u/∂x²) ds, and the
higher-order terms in ds² and ds·dW can be dropped. Substituting dX = μ ds + σ dW and collecting terms, it follows that

dY = (∂u/∂t + μ ∂u/∂x + (1/2) σ² ∂²u/∂x²) ds + σ (∂u/∂x) dW.

The first term contains, in parentheses, the above PDE and is therefore zero. What remains is

dY = σ(X, s) (∂u/∂x) dW.

Integrating this equation from t to T, one concludes that

u(X_T, T) − u(X_t, t) = ∫_t^T σ(X, s) (∂u/∂x) dW.

Upon taking expectations, conditioned on X_t = x, and observing that the right side is an Itô integral, which has
expectation zero, it follows that u(x, t) = E[u(X_T, T) | X_t = x]. The desired result is
obtained by observing that u(X_T, T) = ψ(X_T), and thus u(x, t) = E[ψ(X_T) | X_t = x].
Remarks
When originally published by Kac in 1949,[2] the Feynman–Kac formula was presented as a formula for determining
the distribution of certain Wiener functionals. Suppose we wish to find the expected value of the function

e^(−u ∫₀ᵗ V(x(τ)) dτ)

in the case where x(τ) is some realization of a diffusion process starting at x(0) = 0. The Feynman–Kac
formula says that this expectation is equivalent to the integral of a solution to a diffusion equation. Specifically,
under the conditions that u V(x) ≥ 0,

E[e^(−u ∫₀ᵗ V(x(τ)) dτ)] = ∫_{−∞}^{∞} w(x, t) dx,
where w(x, 0) = δ(x) and ∂w/∂t = (1/2) ∂²w/∂x² − u V(x) w.
The Feynman–Kac formula can also be interpreted as a method for evaluating functional integrals of a certain form.
If

I = ∫ f(x(0)) e^(−u ∫₀ᵗ V(x(t)) dt) g(x(t)) Dx,

where the integral is taken over all random walks, then

I = ∫ w(x, t) g(x) dx,

where w(x, t) is a solution to the parabolic partial differential equation

∂w/∂t = (1/2) ∂²w/∂x² − u V(x) w

with initial condition w(x, 0) = f(x).
See also
Itô's lemma
Kunita–Watanabe theorem
Girsanov theorem
Kolmogorov forward equation (also known as the Fokker–Planck equation)
References
Simon, Barry (1979). Functional Integration and Quantum Physics. Academic Press.
[1] http://www.math.nyu.edu/faculty/kohn/pde_finance.html
[2] Kac, Mark (1949). "On Distributions of Certain Wiener Functionals" (http://www.jstor.org/stable/1990512). Transactions of the American Mathematical Society 65 (1): 1–13. doi:10.2307/1990512. Retrieved 2008-05-30.
Fourier transform
In mathematics, the Fourier transform (often abbreviated FT) is an operation that transforms one complex-valued
function of a real variable into another. In such applications as signal processing, the domain of the original function
is typically time and is accordingly called the time domain. The domain of the new function is typically called the
frequency domain, and the new function itself is called the frequency domain representation of the original function.
It describes which frequencies are present in the original function. This is analogous to describing a musical chord in
terms of the individual notes being played. In effect, the Fourier transform decomposes a function into oscillatory
functions. The term Fourier transform refers both to the frequency domain representation of a function, and to the
process or formula that "transforms" one function into the other.
The Fourier transform and its generalizations are the subject of Fourier analysis. In this specific case, both the time
and frequency domains are unbounded linear continua. It is possible to define the Fourier transform of a function of
several variables, which is important for instance in the physical study of wave motion and optics. It is also possible
to generalize the Fourier transform on discrete structures such as finite groups. The efficient computation of such
structures, by fast Fourier transform, is essential for high-speed computing.
Definition
There are several common conventions for defining the Fourier transform of an integrable function f : R → C
(Kaiser 1994). This article will use the definition:

f̂(ξ) = ∫_{−∞}^{∞} f(x) e^(−2πixξ) dx,   for every real number ξ.

When the independent variable x represents time (with SI unit of seconds), the transform variable ξ represents
frequency (in hertz). Under suitable conditions, f can be reconstructed from f̂ by the inverse transform:

f(x) = ∫_{−∞}^{∞} f̂(ξ) e^(2πixξ) dξ,   for every real number x.

For other common conventions and notations, including using the angular frequency ω instead of the frequency ξ,
see Other conventions and Other notations below. The Fourier transform on Euclidean space is treated separately, in
which the variable x often represents position and ξ momentum.
Introduction
The motivation for the Fourier transform comes from the study of Fourier series. In the study of Fourier series,
complicated periodic functions are written as the sum of simple waves mathematically represented by sines and
cosines. Due to the properties of sine and cosine it is possible to recover the amount of each wave in the sum by an
integral. In many cases it is desirable to use Euler's formula, which states that e^(2πiθ) = cos(2πθ) + i sin(2πθ), to write
Fourier series in terms of the basic waves e^(2πiθ). This has the advantage of simplifying many of the formulas involved
and providing a formulation for Fourier series that more closely resembles the definition followed in this article. This
passage from sines and cosines to complex exponentials makes it necessary for the Fourier coefficients to be
complex valued. The usual interpretation of this complex number is that it gives both the amplitude (or size) of the
wave present in the function and the phase (or the initial angle) of the wave. This passage also introduces the need
for negative "frequencies". If θ were measured in seconds then the waves e^(2πiθ) and e^(−2πiθ) would both complete one
cycle per second, but they represent different frequencies in the Fourier transform. Hence, frequency no longer
measures the number of cycles per unit time, but is closely related.
We may use Fourier series to motivate the Fourier transform as follows. Suppose that f is a function which is zero
outside of some interval [−L/2, L/2]. Then for any T ≥ L we may expand f in a Fourier series on the interval
[−T/2, T/2], where the "amount" (denoted by c_n) of the wave e^(2πinx/T) in the Fourier series of f is given by

c_n = (1/T) ∫_{−T/2}^{T/2} f(x) e^(−2πinx/T) dx = (1/T) f̂(n/T),

and f should be given by the formula

f(x) = Σ_{n=−∞}^{∞} f̂(n/T) e^(2πinx/T) (1/T).

If we let ξ_n = n/T, and we let Δξ = (n+1)/T − n/T = 1/T, then this last sum becomes the Riemann sum

f(x) = Σ_{n=−∞}^{∞} f̂(ξ_n) e^(2πixξ_n) Δξ.

By letting T → ∞ this Riemann sum converges to the integral for the inverse Fourier transform given in the
Definition section. Under suitable conditions this argument may be made precise (Stein & Shakarchi 2003). Hence,
as in the case of Fourier series, the Fourier transform can be thought of as a function that measures how much of
each individual frequency is present in our function, and we can recombine these waves by using an integral (or
"continuous sum") to reproduce the original function.
The following images provide a visual illustration of how the Fourier transform measures whether a frequency is
present in a particular function. The function depicted, f(t) = cos(6πt) e^(−πt²), oscillates at 3 hertz (if t measures
seconds) and tends quickly to 0. This function was specially chosen to have a real Fourier transform which can easily
be plotted. The first image contains its graph. In order to calculate f̂(3) we must integrate e^(−2πi(3t)) f(t). The second
image shows the plot of the real and imaginary parts of this function. The real part of the integrand is almost always
positive; this is because when f(t) is negative, then the real part of e^(−2πi(3t)) is negative as well. Because they oscillate
at the same rate, when f(t) is positive, so is the real part of e^(−2πi(3t)). The result is that when you integrate the real part
of the integrand you get a relatively large number (in this case 0.5). On the other hand, when you try to measure a
frequency that is not present, as in the case when we look at f̂(5), the integrand oscillates enough so that the
integral is very small. The general situation may be a bit more complicated than this, but this in spirit is how the
Fourier transform measures how much of an individual frequency is present in a function f(t).
[Figure: Original function, showing oscillation at 3 hertz.]
[Figure: Real and imaginary parts of the integrand for the Fourier transform at 3 hertz.]
[Figure: Real and imaginary parts of the integrand for the Fourier transform at 5 hertz.]
[Figure: The Fourier transform, with 3 and 5 hertz labeled.]
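The numbers quoted above can be reproduced by discretizing the defining integral; the integration grid below is an arbitrary but sufficiently fine choice:

```python
import numpy as np

# Recomputing the example: f(t) = cos(6*pi*t) * exp(-pi*t^2) oscillates at
# 3 hertz, so its transform is about 0.5 at xi = 3 and nearly zero at xi = 5.
t = np.linspace(-8.0, 8.0, 200_001)
dt = t[1] - t[0]
f = np.cos(6 * np.pi * t) * np.exp(-np.pi * t**2)

def ft(xi):
    """Riemann-sum approximation of the integral of f(t) e^{-2 pi i t xi} dt."""
    return (f * np.exp(-2j * np.pi * xi * t)).sum() * dt

print(ft(3).real)   # ~ 0.5
print(ft(5).real)   # ~ 0
```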
Properties of the Fourier transform
An integrable function is a function f on the real line that is Lebesgue-measurable and satisfies

∫_{−∞}^{∞} |f(x)| dx < ∞.
Basic properties
Given integrable functions f(x), g(x), and h(x) denote their Fourier transforms by f̂(ξ), ĝ(ξ), and ĥ(ξ)
respectively. The Fourier transform has the following basic properties (Pinsky 2002).
Linearity
For any complex numbers a and b, if h(x) = a f(x) + b g(x), then ĥ(ξ) = a f̂(ξ) + b ĝ(ξ).
Translation
For any real number x₀, if h(x) = f(x − x₀), then ĥ(ξ) = e^(−2πix₀ξ) f̂(ξ).
Modulation
For any real number ξ₀, if h(x) = e^(2πixξ₀) f(x), then ĥ(ξ) = f̂(ξ − ξ₀).
Scaling
For a non-zero real number a, if h(x) = f(ax), then ĥ(ξ) = (1/|a|) f̂(ξ/a). The case a = −1 leads to the
time-reversal property, which states: if h(x) = f(−x), then ĥ(ξ) = f̂(−ξ).
Conjugation
If h(x) is the complex conjugate of f(x), then ĥ(ξ) is the complex conjugate of f̂(−ξ).
In particular, if f is real, then one has the reality condition: f̂(−ξ) is the complex conjugate of f̂(ξ).
And if f is purely imaginary, then f̂(−ξ) is the negative of the complex conjugate of f̂(ξ).
Convolution
If h(x) = (f ∗ g)(x), then ĥ(ξ) = f̂(ξ) ĝ(ξ).
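Any of these identities can be spot-checked by numerical quadrature; the sketch below tests the translation property on a Gaussian, with arbitrarily chosen shift x₀ and frequency ξ:

```python
import numpy as np

# Quadrature spot-check of the translation property: if h(x) = f(x - x0)
# then hat h(xi) = e^{-2 pi i x0 xi} * hat f(xi), here with the Gaussian
# f(x) = exp(-pi x^2) and arbitrary x0, xi.
x = np.linspace(-12.0, 12.0, 100_001)
dx = x[1] - x[0]

def ft(values, xi):
    """Riemann-sum approximation of the Fourier transform at frequency xi."""
    return (values * np.exp(-2j * np.pi * xi * x)).sum() * dx

x0, xi = 1.5, 0.7
f = np.exp(-np.pi * x**2)
h = np.exp(-np.pi * (x - x0)**2)   # f translated by x0

lhs = ft(h, xi)
rhs = np.exp(-2j * np.pi * x0 * xi) * ft(f, xi)
print(abs(lhs - rhs))   # ~ 0
```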
Uniform continuity and the Riemann–Lebesgue lemma
[Figure: The rectangular function is Lebesgue integrable.]
[Figure: The sinc function, the Fourier transform of the rectangular function, is bounded and continuous, but not Lebesgue integrable.]
The Fourier transform of an integrable function is bounded and continuous, but need not be integrable; for
example, the Fourier transform of the rectangular function, which is a step function (and hence integrable), is the sinc
function, which is not Lebesgue integrable, though it does have an improper integral: one has an analog to the
alternating harmonic series, which is a convergent sum but not absolutely convergent.
It is not possible in general to write the inverse transform as a Lebesgue integral. However, when both f and f̂ are
integrable, the following inverse equality holds true for almost every x:

f(x) = ∫_{−∞}^{∞} f̂(ξ) e^(2πixξ) dξ.

Almost everywhere, f is equal to the continuous function given by the right-hand side. If f is given as a continuous
function on the line, then equality holds for every x.
A consequence of the preceding result is that the Fourier transform is injective on L¹(R).
The Plancherel theorem and Parseval's theorem
Let f(x) and g(x) be integrable, and let f̂(ξ) and ĝ(ξ) be their Fourier transforms. If f(x) and g(x) are also
square-integrable, then we have Parseval's theorem (Rudin 1987, p. 187):

∫_{−∞}^{∞} f(x) g(x)* dx = ∫_{−∞}^{∞} f̂(ξ) ĝ(ξ)* dξ,

where the asterisk denotes complex conjugation.
The Plancherel theorem, which is equivalent to Parseval's theorem, states (Rudin 1987, p. 186):

∫_{−∞}^{∞} |f(x)|² dx = ∫_{−∞}^{∞} |f̂(ξ)|² dξ.

The Plancherel theorem makes it possible to define the Fourier transform for functions in L²(R), as described in
Generalizations below. The Plancherel theorem has the interpretation in the sciences that the Fourier transform
preserves the energy of the original quantity. Depending on the author, either of these theorems
might be referred to as the Plancherel theorem or as Parseval's theorem.
See Pontryagin duality for a general formulation of this concept in the context of locally compact abelian groups.
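A discrete analogue of the Plancherel identity holds for the DFT and is easy to verify with numpy, whose unnormalized forward transform places the 1/N factor in the inverse (hence the 1/N below):

```python
import numpy as np

# Discrete analogue of the Plancherel theorem for the DFT: with numpy's
# convention, sum |f|^2 = (1/N) * sum |F|^2.
rng = np.random.default_rng(1)
f = rng.standard_normal(256) + 1j * rng.standard_normal(256)
F = np.fft.fft(f)

energy_time = np.sum(np.abs(f) ** 2)
energy_freq = np.sum(np.abs(F) ** 2) / f.size

print(energy_time, energy_freq)   # equal up to rounding
```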
Poisson summation formula
The Poisson summation formula provides a link between the study of Fourier transforms and Fourier series. Given
an integrable function f we can consider the periodic summation of f given by:

φ(x) = Σ_{k=−∞}^{∞} f(x + k),

where the summation is taken over the set of all integers k. The Poisson summation formula relates the Fourier series
of φ to the Fourier transform of f. Specifically, it states that the Fourier series of φ is given by:

φ(x) ~ Σ_{k=−∞}^{∞} f̂(k) e^(2πikx).
Convolution theorem
The Fourier transform translates between convolution and multiplication of functions. If f(x) and g(x) are integrable
functions with Fourier transforms f̂(ξ) and ĝ(ξ) respectively, then the Fourier transform of the convolution is
given by the product of the Fourier transforms f̂(ξ) and ĝ(ξ) (under other conventions for the definition of the
Fourier transform a constant factor may appear).
This means that if:

h(x) = (f ∗ g)(x) = ∫_{−∞}^{∞} f(y) g(x − y) dy,

where ∗ denotes the convolution operation, then:

ĥ(ξ) = f̂(ξ) ĝ(ξ).

In linear time invariant (LTI) system theory, it is common to interpret g(x) as the impulse response of an LTI system
with input f(x) and output h(x), since substituting the unit impulse for f(x) yields h(x) = g(x). In this case, ĝ(ξ)
represents the frequency response of the system.
Conversely, if f(x) can be decomposed as the product of two square integrable functions p(x) and q(x), then the
Fourier transform of f(x) is given by the convolution of the respective Fourier transforms p̂(ξ) and q̂(ξ).
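The discrete counterpart of the convolution theorem can be verified directly with the FFT; zero-padding both signals to the full output length makes the circular convolution computed in the frequency domain agree with the linear one:

```python
import numpy as np

# Discrete check of the convolution theorem: the DFT of a circular
# convolution is the pointwise product of the DFTs. Padding to length n
# makes numpy.convolve's linear convolution coincide with the circular one.
rng = np.random.default_rng(2)
f = rng.standard_normal(64)
g = rng.standard_normal(64)

n = f.size + g.size - 1
conv_direct = np.convolve(f, g)                              # in the x domain
conv_fft = np.fft.ifft(np.fft.fft(f, n) * np.fft.fft(g, n)).real  # via products

err = np.max(np.abs(conv_direct - conv_fft))
print(err)   # ~ 1e-14
```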
Cross-correlation theorem
In an analogous manner, it can be shown that if h(x) is the cross-correlation of f(x) and g(x):

h(x) = (f ⋆ g)(x) = ∫_{−∞}^{∞} f(y)* g(x + y) dy,

then the Fourier transform of h(x) is:

ĥ(ξ) = f̂(ξ)* ĝ(ξ),

where the asterisk denotes complex conjugation. As a special case, the autocorrelation of the function f(x) is:

h(x) = (f ⋆ f)(x) = ∫_{−∞}^{∞} f(y)* f(x + y) dy,

for which

ĥ(ξ) = f̂(ξ)* f̂(ξ) = |f̂(ξ)|².
Eigenfunctions
One important choice of an orthonormal basis for L
2
(R) is given by the Hermite functions
where are the "probabilist's" Hermite polynomials, defined by H
n
(x)= (1)
n
exp(x
2
/2)D
n
exp(x
2
/2). Under
this convention for the Fourier transform, we have that
In other words, the Hermite functions form a complete orthonormal system of eigenfunctions for the Fourier
transform on L²(R) (Pinsky 2002). However, this choice of eigenfunctions is not unique. There are only four
different eigenvalues of the Fourier transform (±1 and ±i) and any linear combination of eigenfunctions with the
same eigenvalue gives another eigenfunction. As a consequence of this, it is possible to decompose L²(R) as a direct
sum of four spaces H₀, H₁, H₂, and H₃ where the Fourier transform acts on H_k simply by multiplication by i^k. This
approach to define the Fourier transform is due to N. Wiener (Duoandikoetxea 2001). The choice of Hermite
functions is convenient because they are exponentially localized in both frequency and time domains, and thus give
rise to the fractional Fourier transform used in time-frequency analysis (Boashash 2003).
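Under the unitary, ordinary-frequency convention used in this article, the eigenfunction property of the first two (unnormalized) Hermite functions can be checked by direct quadrature; the grid and test frequencies below are arbitrary choices:

```python
import numpy as np

x = np.linspace(-10, 10, 40001)

def ft(h, xi):
    # Fourier transform under the e^{-2 pi i x xi} convention, by quadrature
    return np.trapz(h(x) * np.exp(-2j * np.pi * xi * x), x)

psi0 = lambda t: np.exp(-np.pi * t**2)        # eigenvalue (-i)^0 = 1
psi1 = lambda t: t * np.exp(-np.pi * t**2)    # eigenvalue (-i)^1 = -i

for xi in (0.2, 1.0):
    assert abs(ft(psi0, xi) - psi0(xi)) < 1e-8
    assert abs(ft(psi1, xi) - (-1j) * psi1(xi)) < 1e-8
```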
Fourier transform on Euclidean space
The Fourier transform can be defined in any number of dimensions n. As with the one-dimensional case, there are
many conventions; for an integrable function f(x) this article takes the definition:
f̂(ξ) = ∫ f(x) e^{−2πi x·ξ} dx,
where the integral is taken over all of Rⁿ, x and ξ are n-dimensional vectors, and x·ξ is the dot product of the
vectors. The dot product is sometimes written as ⟨x, ξ⟩.
All of the basic properties listed above hold for the n-dimensional Fourier transform, as do Plancherel's and
Parseval's theorems. When the function is integrable, the Fourier transform is still uniformly continuous and the
Riemann–Lebesgue lemma holds (Stein & Weiss 1971).
Uncertainty principle
Generally speaking, the more concentrated f(x) is, the more spread out its Fourier transform f̂(ξ) must be. In
particular, the scaling property of the Fourier transform may be seen as saying: if we "squeeze" a function in x, its
Fourier transform "stretches out" in ξ. It is not possible to arbitrarily concentrate both a function and its Fourier
transform.
The trade-off between the compaction of a function and its Fourier transform can be formalized in the form of an
Uncertainty Principle by viewing a function and its Fourier transform as conjugate variables with respect to the
symplectic form on the time–frequency domain: from the point of view of the linear canonical transformation, the
Fourier transform is rotation by 90° in the time–frequency domain, and preserves the symplectic form.
Suppose f(x) is an integrable and square-integrable function. Without loss of generality, assume that f(x) is
normalized:
∫ |f(x)|² dx = 1.
It follows from the Plancherel theorem that f̂(ξ) is also normalized.
The spread around x = 0 may be measured by the dispersion about zero (Pinsky 2002), defined by
D₀(f) = ∫ x² |f(x)|² dx.
In probability terms, this is the second moment of |f(x)|² about zero.
The Uncertainty principle states that, if f(x) is absolutely continuous and the functions x·f(x) and f′(x) are square
integrable, then
D₀(f) D₀(f̂) ≥ 1/(16π²)
(Pinsky 2002).
The equality is attained only in the case f(x) = C₁ e^{−πσx²} (hence f̂(ξ) = σ^{−1/2} C₁ e^{−πξ²/σ}), where σ > 0
is arbitrary and C₁ is such that f is L²-normalized (Pinsky 2002). In other words, f is a (normalized) Gaussian
function, centered at zero.
In fact, this inequality implies that:
(∫ (x − x₀)² |f(x)|² dx) (∫ (ξ − ξ₀)² |f̂(ξ)|² dξ) ≥ 1/(16π²)
for any x₀, ξ₀ in R (Stein & Shakarchi 2003).
In quantum mechanics, the momentum and position wave functions are Fourier transform pairs, to within a factor of
Planck's constant. With this constant properly taken into account, the inequality above becomes the statement of the
Heisenberg uncertainty principle (Stein & Shakarchi 2003).
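The equality case can be illustrated numerically: under the convention of this article, the normalized Gaussian 2^{1/4}·e^{−πx²} equals its own transform, so its dispersion product attains 1/(16π²) exactly. The grid parameters below are arbitrary:

```python
import numpy as np

x = np.linspace(-10, 10, 200001)
f = 2 ** 0.25 * np.exp(-np.pi * x**2)   # L^2-normalized Gaussian

norm = np.trapz(f**2, x)                # should be 1
d0 = np.trapz(x**2 * f**2, x)           # dispersion about zero, D0(f)

assert abs(norm - 1.0) < 1e-9
# f is its own transform here, so D0(f) * D0(f_hat) = d0**2 = 1/(16 pi^2)
assert abs(d0**2 - 1.0 / (16 * np.pi**2)) < 1e-10
```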
Spherical harmonics
Let the set of homogeneous harmonic polynomials of degree k on Rⁿ be denoted by A_k. The set A_k consists of the
solid spherical harmonics of degree k. The solid spherical harmonics play a similar role in higher dimensions to the
Hermite polynomials in dimension one. Specifically, if f(x) = e^{−π|x|²} P(x) for some P(x) in A_k, then
f̂(ξ) = i^{−k} f(ξ). Let the set H_k be the closure in L²(Rⁿ) of linear combinations of functions of the form f(|x|)P(x)
where P(x) is in A_k. The space L²(Rⁿ) is then a direct sum of the spaces H_k, and the Fourier transform maps each
space H_k to itself; it is possible to characterize the action of the Fourier transform on each space H_k (Stein & Weiss
1971). Let f(x) = f₀(|x|)P(x) (with P(x) in A_k); then f̂(ξ) = F₀(|ξ|)P(ξ), where
F₀(r) = 2π i^{−k} r^{−(n+2k−2)/2} ∫₀^∞ f₀(s) J_{(n+2k−2)/2}(2πrs) s^{(n+2k)/2} ds.
Here J_{(n+2k−2)/2} denotes the Bessel function of the first kind with order (n+2k−2)/2. When k = 0 this gives a
useful formula for the Fourier transform of a radial function (Grafakos 2004).
Restriction problems
In higher dimensions it becomes interesting to study restriction problems for the Fourier transform. The Fourier
transform of an integrable function is continuous and the restriction of this function to any set is defined. But for a
square-integrable function the Fourier transform could be a general class of square integrable functions. As such, the
restriction of the Fourier transform of an L²(Rⁿ) function cannot be defined on sets of measure 0. It is still an active
area of study to understand restriction problems in L^p for 1 < p < 2. Surprisingly, it is possible in some cases to
define the restriction of a Fourier transform to a set S, provided S has non-zero curvature. The case when S is the unit
sphere in Rⁿ is of particular interest. In this case the Tomas–Stein restriction theorem states that the restriction of the
Fourier transform to the unit sphere in Rⁿ is a bounded operator on L^p provided 1 ≤ p ≤ (2n + 2)/(n + 3).
One notable difference between the Fourier transform in 1 dimension versus higher dimensions concerns the partial
sum operator. Consider an increasing collection of measurable sets E_R indexed by R ∈ (0, ∞), such as balls of radius
R centered at the origin, or cubes of side 2R. For a given integrable function f, consider the function f_R defined by:
f_R(x) = ∫_{E_R} f̂(ξ) e^{2πi x·ξ} dξ, for x in Rⁿ.
Suppose in addition that f is in L^p(Rⁿ). For n = 1 and 1 < p < ∞, if one takes E_R = (−R, R), then f_R converges to f in
L^p as R tends to infinity, by the boundedness of the Hilbert transform. Naively one may hope the same holds true for
n > 1. In the case that E_R is taken to be a cube with side length R, then convergence still holds. Another natural
candidate is the Euclidean ball E_R = {ξ : |ξ| < R}. In order for this partial sum operator to converge, it is necessary
that the multiplier for the unit ball be bounded in L^p(Rⁿ). For n ≥ 2 it is a celebrated theorem of Charles Fefferman
that the multiplier for the unit ball is never bounded unless p = 2 (Duoandikoetxea 2001). In fact, when p ≠ 2, this
shows that not only may f_R fail to converge to f in L^p, but for some functions f ∈ L^p(Rⁿ), f_R is not even an element of
L^p.
Generalizations
Fourier transform on other function spaces
It is possible to extend the definition of the Fourier transform to other spaces of functions. Since compactly
supported smooth functions are integrable and dense in L²(R), the Plancherel theorem allows us to extend the
definition of the Fourier transform to general functions in L²(R) by continuity arguments. Further, the map
F : L²(R) → L²(R) is a unitary operator (Stein & Weiss 1971, Thm. 2.3). Many of the properties remain the same
for the Fourier transform. The Hausdorff–Young inequality can be used to extend the definition of the Fourier
transform to include functions in L^p(R) for 1 ≤ p ≤ 2. Unfortunately, further extensions become more technical. The
Fourier transform of functions in L^p for the range 2 < p < ∞ requires the study of distributions (Katznelson 1976).
In fact, it can be shown that there are functions in L^p with p > 2 so that the Fourier transform is not defined as a
function (Stein & Weiss 1971).
FourierStieltjes transform
The Fourier transform of a finite Borel measure μ on Rⁿ is given by (Pinsky 2002):
μ̂(ξ) = ∫ e^{−2πi x·ξ} dμ(x).
This transform continues to enjoy many of the properties of the Fourier transform of integrable functions. One
notable difference is that the Riemann–Lebesgue lemma fails for measures (Katznelson 1976). In the case that
dμ = f(x) dx, the formula above reduces to the usual definition for the Fourier transform of f. In the case that μ is
the probability distribution associated to a random variable X, the Fourier–Stieltjes transform is closely related to the
characteristic function, but the typical conventions in probability theory take e^{iξx} instead of e^{−2πiξx} (Pinsky 2002). In
the case when the distribution has a probability density function this definition reduces to the Fourier transform
applied to the probability density function, again with a different choice of constants.
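For instance, with the probability convention e^{iξx}, the characteristic function of a standard normal distribution works out to e^{−ξ²/2}; a quick quadrature check (the grid and test points are arbitrary):

```python
import numpy as np

x = np.linspace(-12, 12, 40001)
density = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # standard normal pdf

for xi in (0.0, 0.5, 2.0):
    # Characteristic function E[e^{i xi X}] by quadrature against the density
    phi = np.trapz(np.exp(1j * xi * x) * density, x)
    assert abs(phi - np.exp(-xi**2 / 2)) < 1e-8
```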
The Fourier transform may be used to give a characterization of continuous measures. Bochner's theorem
characterizes which functions may arise as the FourierStieltjes transform of a measure (Katznelson 1976).
Furthermore, the Dirac delta function is not a function but it is a finite Borel measure. Its Fourier transform is a
constant function (whose specific value depends upon the form of the Fourier transform used).
Tempered distributions
The Fourier transform maps the space of Schwartz functions to itself, and gives a homeomorphism of the space to
itself (Stein & Weiss 1971). Because of this it is possible to define the Fourier transform of tempered distributions.
These include all the integrable functions mentioned above and have the added advantage that the Fourier transform
of any tempered distribution is again a tempered distribution.
The following two facts provide some motivation for the definition of the Fourier transform of a distribution. First let
f and g be integrable functions, and let f̂ and ĝ be their Fourier transforms respectively. Then the Fourier transform
obeys the following multiplication formula (Stein & Weiss 1971):
∫ f̂(x) g(x) dx = ∫ f(x) ĝ(x) dx.
Secondly, every integrable function f defines a distribution T_f by the relation
T_f(φ) = ∫ f(x) φ(x) dx for all Schwartz functions φ.
In fact, given a distribution T, we define the Fourier transform T̂ by the relation
T̂(φ) = T(φ̂) for all Schwartz functions φ.
It follows that
T̂_f = T_{f̂}.
Distributions can be differentiated, and the above-mentioned compatibility of the Fourier transform with
differentiation and convolution remains true for tempered distributions.
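The multiplication formula above can be verified numerically for two Gaussians, whose transforms are explicit; the grid and test functions below are illustrative choices:

```python
import numpy as np

x = np.linspace(-6, 6, 1201)
f = np.exp(-np.pi * x**2)
g = np.exp(-2 * np.pi * x**2)

# Transforms evaluated on the same grid via the kernel e^{-2 pi i x xi}
kernel = np.exp(-2j * np.pi * np.outer(x, x))
fhat = np.trapz(f * kernel, x, axis=1)
ghat = np.trapz(g * kernel, x, axis=1)

lhs = np.trapz(fhat * g, x)   # integral of fhat * g
rhs = np.trapz(f * ghat, x)   # integral of f * ghat

assert abs(lhs - rhs) < 1e-8
assert abs(lhs - 1 / np.sqrt(3)) < 1e-6   # analytic value of both sides
```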
Locally compact abelian groups
The Fourier transform may be generalized to any locally compact abelian group. A locally compact abelian group is
an abelian group which is at the same time a locally compact Hausdorff topological space so that the group
operations are continuous. If G is a locally compact abelian group, it has a translation invariant measure μ, called
Haar measure. For a locally compact abelian group G it is possible to place a topology on the set of characters Ĝ so
that Ĝ is also a locally compact abelian group. For a function f in L¹(G) it is possible to define the Fourier
transform by (Katznelson 1976):
f̂(ξ) = ∫_G f(x) ξ(x)* dμ(x) for ξ in Ĝ,
where the asterisk denotes the complex conjugate.
Locally compact Hausdorff space
The Fourier transform may be generalized to any locally compact Hausdorff space, which recovers the topology but
loses the group structure.
Given a locally compact Hausdorff topological space X, the space A=C
0
(X) of continuous complex-valued functions
on X which vanish at infinity is in a natural way a commutative C*-algebra, via pointwise addition, multiplication,
complex conjugation, and with norm the uniform norm. Conversely, the characters of this algebra A, denoted Φ_A,
are naturally a topological space and can be identified with evaluation at a point of X, and one has an isometric
isomorphism C₀(X) → C₀(Φ_A). In the case where X = R is the real line, this is exactly the Fourier transform.
Non-abelian groups
The Fourier transform can also be defined for functions on a non-abelian group, provided that the group is compact.
Unlike the Fourier transform on an abelian group, which is scalar-valued, the Fourier transform on a non-abelian
group is operator-valued (Hewitt & Ross 1971, Chapter 8). The Fourier transform on compact groups is a major tool
in representation theory (Knapp 2001) and non-commutative harmonic analysis.
Let G be a compact Hausdorff topological group. Let Σ denote the collection of all isomorphism classes of
finite-dimensional irreducible unitary representations, along with a definite choice of representation U^{(σ)} on the
Hilbert space H_σ of finite dimension d_σ for each σ ∈ Σ. If μ is a finite Borel measure on G, then the Fourier–Stieltjes
transform of μ is the operator on H_σ defined by
⟨μ̂ ξ, η⟩ = ∫_G ⟨Ū^{(σ)}_g ξ, η⟩ dμ(g),
where Ū^{(σ)} is the complex-conjugate representation of U^{(σ)} acting on H_σ. As in the abelian case, if μ is absolutely
continuous with respect to the left-invariant probability measure λ on G, then it is represented as
dμ = f dλ
for some f ∈ L¹(λ). In this case, one identifies the Fourier transform of f with the Fourier–Stieltjes transform of μ.
The mapping μ ↦ μ̂ defines an isomorphism between the Banach space M(G) of finite Borel measures (see rca
space) and a closed subspace of the Banach space C_∞(Σ) consisting of all sequences E = (E_σ) indexed by Σ of
(bounded) linear operators E_σ : H_σ → H_σ for which the norm
‖E‖ = sup_{σ∈Σ} ‖E_σ‖
is finite. The "convolution theorem" asserts that, furthermore, this isomorphism of Banach spaces is in fact an
isomorphism of C* algebras into a subspace of C_∞(Σ), in which M(G) is equipped with the product given by
convolution of measures and C_∞(Σ) the product given by multiplication of operators in each index σ.
The Peter–Weyl theorem holds, and a version of the Fourier inversion formula (Plancherel's theorem) follows: if
f ∈ L²(G), then
f(g) = ∑_{σ∈Σ} d_σ tr(f̂(σ) U^{(σ)}_g),
where the summation is understood as convergent in the L² sense.
The generalization of the Fourier transform to the noncommutative situation has also in part contributed to the
development of noncommutative geometry. In this context, a categorical generalization of the Fourier transform to
noncommutative groups is Tannaka-Krein duality, which replaces the group of characters with the category of
representations. However, this loses the connection with harmonic functions.
Alternatives
In signal processing terms, a function (of time) is a representation of a signal with perfect time resolution, but no
frequency information, while the Fourier transform has perfect frequency resolution, but no time information: the
magnitude of the Fourier transform at a point is how much frequency content there is, but location is only given by
phase (argument of the Fourier transform at a point), and standing waves are not localized in time a sine wave
continues out to infinity, without decaying. This limits the usefulness of the Fourier transform for analyzing signals
that are localized in time, notably transients, or any signal of finite extent.
As alternatives to the Fourier transform, in time-frequency analysis, one uses time-frequency transforms or
time-frequency distributions to represent signals in a form that has some time information and some frequency
information; by the uncertainty principle, there is a trade-off between these. These can be generalizations of the
Fourier transform, such as the short-time Fourier transform or fractional Fourier transform, or can use different
functions to represent signals, as in wavelet transforms and chirplet transforms, with the wavelet analog of the
(continuous) Fourier transform being the continuous wavelet transform (Boashash 2003).
Applications
Analysis of differential equations
Fourier transforms and the closely related Laplace transforms are widely used in solving differential equations. The
Fourier transform is compatible with differentiation in the following sense: if f(x) is a differentiable function with
Fourier transform f̂(ξ), then the Fourier transform of its derivative is given by
2πiξ f̂(ξ).
This can be used to transform differential equations into algebraic equations. Note that this technique only applies to
problems whose domain is the whole set of real numbers. By extending the Fourier transform to functions of several
variables, partial differential equations with domain Rⁿ can also be translated into algebraic equations.
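This differentiation rule is the basis of spectral differentiation; a discrete sketch (the periodic domain and sample count are chosen for illustration):

```python
import numpy as np

N = 128
x = np.linspace(0, 2 * np.pi, N, endpoint=False)
f = np.sin(x)

# Integer wavenumbers 0, 1, ..., N/2-1, -N/2, ..., -1; multiplying the
# DFT by i*k is the discrete analogue of multiplying f_hat by 2*pi*i*xi.
k = np.fft.fftfreq(N, d=1.0 / N)
df = np.fft.ifft(1j * k * np.fft.fft(f)).real

assert np.allclose(df, np.cos(x), atol=1e-10)
```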
Fourier transform spectroscopy
The Fourier transform is also used in nuclear magnetic resonance (NMR) and in other kinds of spectroscopy, e.g.
infrared (FTIR). In NMR an exponentially-shaped free induction decay (FID) signal is acquired in the time domain
and Fourier-transformed to a Lorentzian line-shape in the frequency domain. The Fourier transform is also used in
magnetic resonance imaging (MRI) and mass spectrometry.
Domain and range of the Fourier transform
It is often desirable to have a domain for the Fourier transform that is as general as possible. The definition of the
Fourier transform as an integral naturally restricts the domain to the space of integrable functions. Unfortunately,
there is no simple characterization of which functions are Fourier transforms of integrable functions (Stein & Weiss
1971). It is possible to extend the domain of the Fourier transform in various ways, as discussed in generalizations
above. The following list details some of the more common domains and ranges on which the Fourier transform is
defined.
The space of Schwartz functions is closed under the Fourier transform. Schwartz functions are rapidly decaying
functions and do not include all functions which are relevant for the Fourier transform. More details may be found
in (Stein & Weiss 1971).
The space L^p maps into the space L^q, where 1/p + 1/q = 1 and 1 ≤ p ≤ 2 (Hausdorff–Young inequality).
In particular, the space L² is closed under the Fourier transform, but here the Fourier transform is no longer
defined by integration.
The space L¹ of Lebesgue integrable functions maps into C₀, the space of continuous functions that tend to zero at
infinity, not just into the space of bounded functions (the Riemann–Lebesgue lemma).
The set of tempered distributions is closed under the Fourier transform. Tempered distributions are also a form of
generalization of functions. It is in this generality that one can define the Fourier transform of objects like the
Dirac comb.
Other notations
Other common notations for f̂(ξ) include f̃(ξ) and F(ξ).
Though less commonly, other notations are used. Denoting the Fourier transform by a capital letter corresponding to
the letter of the function being transformed (such as f(x) and F(ξ)) is especially common in the sciences and
engineering. In electronics, omega (ω) is often used instead of ξ due to its interpretation as angular frequency;
sometimes it is written as F(jω), where j is the imaginary unit, to indicate its relationship with the Laplace transform,
and sometimes it is written informally as F(2πf) in order to use ordinary frequency.
The interpretation of the complex function f̂(ξ) may be aided by expressing it in polar coordinate form:
f̂(ξ) = A(ξ) e^{iφ(ξ)}
in terms of the two real functions A(ξ) and φ(ξ), where:
A(ξ) = |f̂(ξ)|
is the amplitude and
φ(ξ) = arg(f̂(ξ))
is the phase (see arg function).
Then the inverse transform can be written:
f(x) = ∫ A(ξ) e^{i(2πξx + φ(ξ))} dξ,
which is a recombination of all the frequency components of f(x). Each component is a complex sinusoid of the
form e^{2πixξ} whose amplitude is A(ξ) and whose initial phase angle (at x = 0) is φ(ξ).
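The same amplitude-and-phase decomposition applies verbatim to the discrete transform; the snippet below (the signal is an arbitrary illustration) splits a DFT into amplitude and phase and reassembles the signal:

```python
import numpy as np

n = np.arange(64)
f = np.cos(2 * np.pi * 5 * n / 64 + 0.7)   # a pure tone with a phase offset

F = np.fft.fft(f)
A = np.abs(F)       # amplitude of each frequency component
phi = np.angle(F)   # phase of each frequency component

# Recombining the components A * e^{i phi} recovers the signal exactly.
recon = np.fft.ifft(A * np.exp(1j * phi)).real
assert np.allclose(recon, f)
```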
The Fourier transform may be thought of as a mapping on function spaces. This mapping is here denoted 𝓕, and
𝓕(f) is used to denote the Fourier transform of the function f. This mapping is linear, which means that 𝓕 can
also be seen as a linear transformation on the function space, and implies that the standard notation in linear algebra
of applying a linear transformation to a vector (here the function f) can be used to write 𝓕f instead of 𝓕(f).
Since the result of applying the Fourier transform is again a function, we can be interested in the value of this
function evaluated at the value ξ for its variable, and this is denoted either as 𝓕f(ξ) or as (𝓕f)(ξ). Notice that
in the former case, it is implicitly understood that 𝓕 is applied first to f and then the resulting function is evaluated
at ξ, not the other way around.
In mathematics and various applied sciences, it is often necessary to distinguish between a function f and the value of
f when its variable equals x, denoted f(x). This means that a notation like 𝓕(f(x)) formally can be interpreted as
the Fourier transform of the values of f at x. Despite this flaw, the previous notation appears frequently, often when a
particular function or a function of a particular variable is to be transformed. For example,
𝓕(rect(x)) = sinc(ξ)
is sometimes used to express that the Fourier transform of a rectangular function is a sinc function, or
𝓕(f(x + x₀)) = 𝓕(f(x)) e^{2πiξx₀}
is used to express the shift property of the Fourier transform. Notice that the last example is only correct under the
assumption that the transformed function is a function of x, not of x₀.
Other conventions
The Fourier transform can also be written in terms of angular frequency, ω = 2πξ, whose units are radians per
second.
The substitution ξ = ω/(2π) into the formulas above produces this convention:
f̂(ω) = ∫ f(x) e^{−iω·x} dx.
Under this convention, the inverse transform becomes:
f(x) = (2π)^{−n} ∫ f̂(ω) e^{iω·x} dω.
Unlike the convention followed in this article, when the Fourier transform is defined this way, it is no longer a
unitary transformation on L²(Rⁿ). There is also less symmetry between the formulas for the Fourier transform and its
inverse.
Another convention is to split the factor of (2π)ⁿ evenly between the Fourier transform and its inverse, which leads
to the definitions:
f̂(ω) = (2π)^{−n/2} ∫ f(x) e^{−iω·x} dx,
f(x) = (2π)^{−n/2} ∫ f̂(ω) e^{iω·x} dω.
Under this convention, the Fourier transform is again a unitary transformation on L²(Rⁿ). It also restores the
symmetry between the Fourier transform and its inverse.
Variations of all three conventions can be created by conjugating the complex-exponential kernel of both the forward
and the reverse transform. The signs must be opposites. Other than that, the choice is (again) a matter of convention.
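The relation between the conventions can be checked numerically on a Gaussian, whose transform is explicit in every convention; the grid and test frequencies below are arbitrary:

```python
import numpy as np

x = np.linspace(-10, 10, 40001)
g = np.exp(-np.pi * x**2)

def ft_ordinary(nu):        # unitary, ordinary frequency: kernel e^{-2 pi i x nu}
    return np.trapz(g * np.exp(-2j * np.pi * nu * x), x)

def ft_angular(omega):      # non-unitary, angular frequency: kernel e^{-i omega x}
    return np.trapz(g * np.exp(-1j * omega * x), x)

for nu in (0.0, 0.3, 1.0):
    # Known closed form under the ordinary-frequency convention
    assert abs(ft_ordinary(nu) - np.exp(-np.pi * nu**2)) < 1e-8
    # Substituting omega = 2 pi nu relates the two conventions
    assert abs(ft_angular(2 * np.pi * nu) - ft_ordinary(nu)) < 1e-8
```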
Summary of popular forms of the Fourier transform
ordinary frequency ξ (hertz), unitary: f̂(ξ) = ∫ f(x) e^{−2πi x·ξ} dx
angular frequency ω (rad/s), non-unitary: f̂(ω) = ∫ f(x) e^{−iω·x} dx
angular frequency ω (rad/s), unitary: f̂(ω) = (2π)^{−n/2} ∫ f(x) e^{−iω·x} dx
The ordinary-frequency convention (which is used in this article) is the one most often found in the mathematics
literature. In the physics literature, the two angular-frequency conventions are more commonly used.
As discussed above, the characteristic function of a random variable is the same as the Fourier–Stieltjes transform of
its distribution measure, but in this context it is typical to take a different convention for the constants. Typically the
characteristic function is defined as E(e^{itX}). As in the case of the "non-unitary angular frequency" convention above,
there is no factor of 2π appearing in either the integral or the exponential. Unlike any of the conventions appearing
above, this convention takes the opposite sign in the exponential.
Tables of important Fourier transforms
The following tables record some closed-form Fourier transforms. For functions f(x), g(x) and h(x), denote their
Fourier transforms by f̂, ĝ, and ĥ respectively. Only the three most common conventions are included. It is
sometimes useful to notice that entry 105 gives a relationship between the Fourier transform of a function and the
original function, which can be seen as relating the Fourier transform and its inverse.
Functional relationships
The Fourier transforms in this table may be found in (Erdélyi 1954) or the appendix of (Kammler 2000).
Function Fourier transform
unitary, ordinary frequency
Fourier transform
unitary, angular frequency
Fourier transform
non-unitary, angular
frequency
Remarks

Definition
101: Linearity.
102: Shift in time domain.
103: Shift in frequency domain, dual of 102.
104: Scaling in the time domain. If |a| is large, then f(ax) is concentrated around 0 and (1/|a|) f̂(ξ/a) spreads out
and flattens.
105: Duality. Here f̂ needs to be calculated using the same method as in the Fourier transform column. Results from
swapping the "dummy" variables of x and ξ (or ω or ν).
106.
107: This is the dual of 106.
108: The notation f ∗ g denotes the convolution of f and g; this rule is the convolution theorem.
109: This is the dual of 108.
110: For a purely real f(x), Hermitian symmetry holds: f̂(−ξ) = f̂(ξ)*, where the asterisk indicates the complex
conjugate.
111: For a purely real even function f(x), f̂(ω) and f̂(ν) are purely real even functions.
112: For a purely real odd function f(x), f̂(ω) and f̂(ν) are purely imaginary odd functions.
Fourier transform
60
Square-integrable functions
The Fourier transforms in this table may be found in (Campbell & Foster 1948), (Erdélyi 1954), or the appendix of
(Kammler 2000).
Function Fourier transform
unitary, ordinary frequency
Fourier transform
unitary, angular frequency
Fourier transform
non-unitary, angular
frequency
Remarks

201: The rectangular pulse and the normalized sinc function, here defined as sinc(x) = sin(πx)/(πx).
202: Dual of rule 201. The rectangular function is an ideal low-pass filter, and the sinc function is the non-causal
impulse response of such a filter.
203: The function tri(x) is the triangular function.
204: Dual of rule 203.
205: The function u(x) is the Heaviside unit step function and a > 0.
206: This shows that, for the unitary Fourier transforms, the Gaussian function exp(−αx²) is its own Fourier
transform for some choice of α. For this to be integrable we must have Re(α) > 0.
207: For a > 0. That is, the Fourier transform of a decaying exponential function is a Lorentzian function.
208: The functions Jₙ(x) are the n-th order Bessel functions of the first kind. The functions Uₙ(x) are the Chebyshev
polynomials of the second kind. See 315 and 316 below.
209: Hyperbolic secant is its own Fourier transform.
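Entry 201 above (rectangular pulse to normalized sinc, under the unitary ordinary-frequency convention) can be spot-checked by quadrature; the test frequencies are arbitrary:

```python
import numpy as np

x = np.linspace(-0.5, 0.5, 20001)   # support of the rectangular pulse

for xi in (0.25, 1.5, 3.3):
    val = np.trapz(np.exp(-2j * np.pi * xi * x), x)
    # Normalized sinc: sin(pi xi) / (pi xi)
    assert abs(val - np.sin(np.pi * xi) / (np.pi * xi)) < 1e-6
```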
Distributions
The Fourier transforms in this table may be found in (Erdélyi 1954) or the appendix of (Kammler 2000).
Function Fourier transform
unitary, ordinary frequency
Fourier transform
unitary, angular frequency
Fourier transform
non-unitary, angular frequency
Remarks

301: The distribution δ(ξ) denotes the Dirac delta function.
302: Dual of rule 301.
303: This follows from 103 and 301.
304: This follows from rules 101 and 303 using Euler's formula.
305: This follows from 101 and 303.
306.
307.
308: Here, n is a natural number and δ⁽ⁿ⁾(ξ) is the n-th distribution derivative of the Dirac delta function. This rule
follows from rules 107 and 301. Combining this rule with 101, we can transform all polynomials.
309: Here sgn(ξ) is the sign function. Note that 1/x is not a distribution. It is necessary to use the Cauchy principal
value when testing against Schwartz functions. This rule is useful in studying the Hilbert transform.
310: 1/xⁿ is the homogeneous distribution defined by the distributional derivative.
311: If Re α > −1, then |x|^α is a locally integrable function, and so a tempered distribution. The function α ↦ |x|^α is
a holomorphic function from the right half-plane to the space of tempered distributions. It admits a unique
meromorphic extension to a tempered distribution, also denoted |x|^α, for α ≠ −2, −4, ... (See homogeneous
distribution.)
312: The dual of rule 309. This time the Fourier transforms need to be considered as a Cauchy principal value.
313: The function u(x) is the Heaviside unit step function; this follows from rules 101, 301, and 312.
314: This function is known as the Dirac comb function. This result can be derived from 302 and 102, together with
the fact that ∑ₙ e^{2πinx} = ∑ₖ δ(x − k) as distributions.
315: The function J₀(x) is the zeroth order Bessel function of first kind.
316: This is a generalization of 315. The function Jₙ(x) is the n-th order Bessel function of first kind. The function
Tₙ(x) is the Chebyshev polynomial of the first kind.
Two-dimensional functions
Functions (400
to 402)
Fourier transform
unitary, ordinary frequency
Fourier transform
unitary, angular frequency
Fourier transform
non-unitary, angular frequency

Remarks
To 400: The variables ξ_x, ξ_y, ω_x, ω_y, ν_x and ν_y are real numbers. The integrals are taken over the entire plane.
To 401: Both functions are Gaussians, which may not have unit volume.
To 402: The function is defined by circ(r) = 1 for 0 ≤ r ≤ 1, and is 0 otherwise. This is the Airy distribution, and is
expressed using J₁ (the order-1 Bessel function of the first kind). (Stein & Weiss 1971, Thm. IV.3.3)
Formulas for general n-dimensional functions
Function Fourier transform
unitary, ordinary frequency
Fourier transform
unitary, angular frequency
Fourier transform
non-unitary, angular
frequency
500

501
502
Remarks
To 501: The function χ_{[0,1]} is the indicator function of the interval [0,1]. The function Γ(x) is the gamma function.
The function J_{n/2+δ} is a Bessel function of the first kind, with order n/2 + δ. Taking n = 2 and δ = 0 produces 402.
(Stein & Weiss 1971, Thm. 4.13)
To 502: See Riesz potential. The formula also holds for all α ≠ n, n + 1, ... by analytic continuation, but then the
function and its Fourier transforms need to be understood as suitably regularized tempered distributions. See
homogeneous distribution.
See also
Fourier series
Fast Fourier transform
Laplace transform
Discrete Fourier transform
DFT matrix
Discrete-time Fourier transform
FourierDeligne transform
Fractional Fourier transform
Linear canonical transform
Fourier sine transform
Short-time Fourier transform
Fourier inversion theorem
Analog signal processing
Transform (mathematics)
Integral transform
Hartley transform
Hankel transform
Symbolic integration
References
Boashash, B., ed. (2003), Time-Frequency Signal Analysis and Processing: A Comprehensive Reference, Oxford:
Elsevier Science, ISBN0080443354
Bochner S., Chandrasekharan K. (1949), Fourier Transforms, Princeton University Press
Bracewell, R. N. (2000), The Fourier Transform and Its Applications (3rd ed.), Boston: McGraw-Hill,
ISBN0071160434.
Campbell, George; Foster, Ronald (1948), Fourier Integrals for Practical Applications, New York: D. Van
Nostrand Company, Inc..
Duoandikoetxea, Javier (2001), Fourier Analysis, American Mathematical Society, ISBN0-8218-2172-5.
Dym, H; McKean, H (1985), Fourier Series and Integrals, Academic Press, ISBN978-0122264511.
Erdélyi, Arthur, ed. (1954), Tables of Integral Transforms, 1, New York: McGraw-Hill
Fourier, J. B. Joseph (1822), Théorie Analytique de la Chaleur
[1]
, Paris
Grafakos, Loukas (2004), Classical and Modern Fourier Analysis, Prentice-Hall, ISBN0-13-035399-X.
Hewitt, Edwin; Ross, Kenneth A. (1970), Abstract harmonic analysis. Vol. II: Structure and analysis for compact
groups. Analysis on locally compact Abelian groups, Die Grundlehren der mathematischen Wissenschaften, Band
152, Berlin, New York: Springer-Verlag, MR0262773.
Hörmander, L. (1976), Linear Partial Differential Operators, Volume 1, Springer-Verlag, ISBN 978-3540006626.
James, J.F. (2002), A Student's Guide to Fourier Transforms (2nd ed.), New York: Cambridge University Press,
ISBN0-521-00428-4.
Kaiser, Gerald (1994), A Friendly Guide to Wavelets, Birkhäuser, ISBN 0-8176-3711-7
Kammler, David (2000), A First Course in Fourier Analysis, Prentice Hall, ISBN0-13-578782-3
Katznelson, Yitzhak (1976), An introduction to Harmonic Analysis, Dover, ISBN0-486-63331-4
Knapp, Anthony W. (2001), Representation Theory of Semisimple Groups: An Overview Based on Examples
[2]
,
Princeton University Press, ISBN978-0-691-09089-4
Pinsky, Mark (2002), Introduction to Fourier Analysis and Wavelets, Brooks/Cole, ISBN0-534-37660-6
Polyanin, A. D.; Manzhirov, A. V. (1998), Handbook of Integral Equations, Boca Raton: CRC Press,
ISBN0-8493-2876-4.
Rudin, Walter (1987), Real and Complex Analysis (Third ed.), Singapore: McGraw Hill, ISBN0-07-100276-6.
Stein, Elias; Shakarchi, Rami (2003), Fourier Analysis: An introduction, Princeton University Press,
ISBN0-691-11384-X.
Stein, Elias; Weiss, Guido (1971), Introduction to Fourier Analysis on Euclidean Spaces, Princeton, N.J.:
Princeton University Press, ISBN978-0-691-08078-9.
Wilson, R. G. (1995), Fourier Series and Optical Transform Techniques in Contemporary Optics, New York:
Wiley, ISBN0471303577.
Yosida, K. (1968), Functional Analysis, Springer-Verlag, ISBN3-540-58654-7.
External links
Fourier Transform Tutorial
[3]
Fourier Series Applet
[4]
(Tip: drag magnitude or phase dots up or down to change the wave form).
Stephan Bernsee's FFTlab
[5]
(Java Applet)
Stanford Video Course on the Fourier Transform
[6]
Tables of Integral Transforms
[7]
at EqWorld: The World of Mathematical Equations.
Weisstein, Eric W., "Fourier Transform
[8]
" from MathWorld.
Fourier Transform Module by John H. Mathews
[9]
The DFT Pied: Mastering The Fourier Transform in One Day
[10]
at The DSP Dimension
An Interactive Flash Tutorial for the Fourier Transform
[11]
References
[1] http://books.google.com/?id=TDQJAAAAIAAJ&printsec=frontcover&dq=Th%C3%A9orie+analytique+de+la+chaleur&q
[2] http://books.google.com/?id=QCcW1h835pwC
[3] http://www.thefouriertransform.com
[4] http://www.westga.edu/~jhasbun/osp/Fourier.htm
[5] http://www.dspdimension.com/fftlab/
[6] http://www.academicearth.com/courses/the-fourier-transform-and-its-applications
[7] http://eqworld.ipmnet.ru/en/auxiliary/aux-inttrans.htm
[8] http://mathworld.wolfram.com/FourierTransform.html
[9] http://math.fullerton.edu/mathews/c2003/FourierTransformMod.html
[10] http://www.dspdimension.com/admin/dft-a-pied/
[11] http://www.fourier-series.com/f-transform/index.html
Girsanov's theorem
Visualisation of the Girsanov theorem: the left side shows a Wiener process with negative drift under a canonical measure P; on the right side each path of the process is coloured according to its likelihood under the martingale measure Q. The density transformation from P to Q is given by the Girsanov theorem.
In probability theory, the Girsanov theorem (named after Igor Vladimirovich Girsanov) tells how stochastic processes change under changes in measure. The theorem is especially important in the theory of financial mathematics, as it tells how to convert from the physical measure, which describes the probability that an underlying instrument (such as a share price or interest rate) will take a particular value or values, to the risk-neutral measure, which is a very useful tool for evaluating the value of derivatives on the underlying.
History
Results of this type were first proved by Cameron and Martin in the 1940s and by Girsanov in 1960. They have subsequently been extended to more general classes of process, culminating in the general form of Lenglart (1977).
Significance
Girsanov's theorem is important in the general theory of stochastic processes since it enables the key result that if Q
is a measure absolutely continuous with respect to P then every P-semimartingale is a Q-semimartingale.
Statement of theorem
We state the theorem first for the special case when the underlying stochastic process is a Wiener process. This
special case is sufficient for risk-neutral pricing in the Black-Scholes model and in many other models (e.g. all
continuous models).
Let \(\{W_t\}\) be a Wiener process on the Wiener probability space \((\Omega, \mathcal{F}, P)\). Let \(X_t\) be a measurable process adapted to the natural filtration of the Wiener process \(\{\mathcal{F}^W_t\}\).

Given an adapted process \(X_t\) with \(X_0 = 0\), define

Z_t = \mathcal{E}(X)_t,

where \(\mathcal{E}(X)\) is the stochastic exponential (or Doléans exponential) of X with respect to W, i.e.

\mathcal{E}(X)_t = \exp\left(X_t - \tfrac{1}{2}[X]_t\right).

Thus \(Z_t\) is a strictly positive local martingale, and a probability measure Q can be defined on \((\Omega, \mathcal{F})\) such that we have the Radon–Nikodym derivative

\frac{dQ}{dP}\Big|_{\mathcal{F}_t} = Z_t = \mathcal{E}(X)_t.

Then for each t the measure Q restricted to the unaugmented sigma fields \(\mathcal{F}^o_t\) is equivalent to P restricted to \(\mathcal{F}^o_t\).
Furthermore, if Y is a local martingale under P, then the process

\tilde{Y}_t = Y_t - [Y, X]_t

is a Q local martingale on the filtered probability space \((\Omega, \mathcal{F}, Q, \{\mathcal{F}^W_t\})\).
Corollary
If X is a continuous process and W is Brownian motion under measure P, then

\tilde{W}_t = W_t - [W, X]_t

is Brownian motion under Q.

The fact that \(\tilde{W}_t\) is continuous is trivial; by Girsanov's theorem it is a Q local martingale, and by computing the quadratic variation

[\tilde{W}]_t = [W]_t = t,

it follows by Lévy's characterization of Brownian motion that it is a Q Brownian motion.
Comments
In many common applications, the process X is defined by

X_t = \int_0^t Y_s\, dW_s.

For X of this form, a sufficient condition for \(\mathcal{E}(X)\) to be a martingale is Novikov's condition, which requires that

E_P\left[\exp\left(\tfrac{1}{2}\int_0^T Y_s^2\, ds\right)\right] < \infty.

The stochastic exponential \(\mathcal{E}(X)\) is the process Z which solves the stochastic differential equation

dZ_t = Z_t\, dX_t.

The measure Q constructed above is not equivalent to P on \(\mathcal{F}_\infty\), as this would only be the case if the Radon–Nikodym derivative were a uniformly integrable martingale, which the exponential martingale described above is not in general.
Application to finance
This theorem can be used to show that in the Black-Scholes model the unique risk-neutral measure, i.e. the measure in which the fair value of a derivative is the discounted expected value, Q, is specified by

\frac{dQ}{dP} = \mathcal{E}\left(-\int_0^{\cdot} \frac{\mu - r}{\sigma}\, dW_s\right)_T,

where \(\mu\) is the drift of the stock, r the risk-free rate, and \(\sigma\) the volatility.
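As an illustrative numeric sketch (not part of the original article; the values of theta, T and the sample size are arbitrary), the change of measure can be checked by simulation: reweighting P-samples of W_T by the Radon–Nikodym density Z_T = exp(θW_T − θ²T/2) reproduces the mean θT that W_T has under Q, where W_t − θt is a Q Brownian motion.

```python
import math
import random

def girsanov_demo(theta=0.5, T=1.0, n=200_000, seed=42):
    """Monte Carlo check of the change of measure dQ/dP = exp(theta*W_T - theta^2*T/2).

    Under P, W_T ~ N(0, T). Reweighting P-samples by the Radon-Nikodym
    density Z_T should reproduce the Q-mean E_Q[W_T] = theta*T.
    """
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n):
        w = rng.gauss(0.0, math.sqrt(T))               # sample W_T under P
        z = math.exp(theta * w - 0.5 * theta**2 * T)   # density dQ/dP on this sample
        acc += z * w
    return acc / n                                      # estimate of E_Q[W_T]

est = girsanov_demo()  # should be close to theta*T = 0.5
```

The same reweighting trick underlies risk-neutral pricing by simulation: samples drawn under the physical measure can be repriced under Q without re-simulating.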
See also
Cameron–Martin theorem
References
C. Dellacherie and P.-A. Meyer, "Probabilités et potentiel – Théorie des martingales", Chapitre VII, Hermann 1980
E. Lenglart, "Transformation des martingales locales par changement absolument continu de probabilités", Zeitschrift für Wahrscheinlichkeitstheorie 39 (1977) pp. 65–70.
External links
Notes on Stochastic Calculus [1], which contains a simple outline proof of Girsanov's theorem.
References
[1] http://www.chiark.greenend.org.uk/~alanb/stoc-calc.pdf
Itô's lemma
In mathematics, Itô's lemma is used in Itô stochastic calculus to find the differential of a function of a particular type of stochastic process. It is named after its discoverer, Kiyoshi Itô. It is the stochastic calculus counterpart of the chain rule in ordinary calculus and is best memorized using the Taylor series expansion and retaining the second-order term related to the stochastic component change. The lemma is widely employed in mathematical finance, and its best known application is in the derivation of the Black–Scholes equation used to value options. Itô's lemma is also referred to currently as the Itô–Doeblin theorem in recognition of the recently discovered work of Wolfgang Doeblin.[1]
Itô drift-diffusion processes
In its simplest form, Itô's lemma states the following: for an Itô drift-diffusion process

dX_t = \mu_t\, dt + \sigma_t\, dB_t

and any twice differentiable function f(t, x) of two real variables t and x,

df(t, X_t) = \left(\frac{\partial f}{\partial t} + \mu_t \frac{\partial f}{\partial x} + \frac{\sigma_t^2}{2}\frac{\partial^2 f}{\partial x^2}\right) dt + \sigma_t \frac{\partial f}{\partial x}\, dB_t.

This immediately implies that f(t, X_t) is itself an Itô drift-diffusion process.

In higher dimensions, Itô's lemma states

df(t, X_t) = \frac{\partial f}{\partial t}\, dt + \left(\nabla_X f\right)^T dX_t + \tfrac{1}{2}\left(dX_t\right)^T \left(H_X f\right) dX_t,

where \(X_t\) is a vector of Itô processes, \(\partial f/\partial t\) is the partial derivative w.r.t. t, \(\nabla_X f\) is the gradient of f w.r.t. X, and \(H_X f\) is the Hessian matrix of f w.r.t. X.
Continuous semimartingales
More generally, the above formula also holds for any continuous d-dimensional semimartingale X = (X^1, X^2, ..., X^d) and twice continuously differentiable, real-valued function f on R^d. Some people prefer to present the formula in another form, with the cross variation shown explicitly as follows: f(X) is a semimartingale satisfying

df(X_t) = \sum_{i=1}^{d} f_{,i}(X_t)\, dX^i_t + \tfrac{1}{2}\sum_{i,j=1}^{d} f_{,ij}(X_t)\, d[X^i, X^j]_t.

In this expression, the term f_{,i} represents the partial derivative of f(x) with respect to x_i, and [X^i, X^j] is the quadratic covariation process of X^i and X^j.
Poisson jump processes
We may also define functions on discontinuous stochastic processes.

Let h be the jump intensity. The Poisson process model for jumps is that the probability of one jump in the interval \([t, t + \Delta t]\) is \(h\,\Delta t\) plus higher order terms. h could be a constant, a deterministic function of time, or a stochastic process. The survival probability \(p_s(t)\) is the probability that no jump has occurred in the interval \([0, t]\). The change in the survival probability is

dp_s(t) = -p_s(t)\, h(t)\, dt.

So

p_s(t) = \exp\left(-\int_0^t h(u)\, du\right).

Let S(t) be a discontinuous stochastic process. Write \(S(t^-)\) for the value of S as we approach t from the left. Write \(d_j S(t)\) for the non-infinitesimal change in S(t) as a result of a jump. Then

d_j S(t) = S(t) - S(t^-).

Let z be the magnitude of the jump and let \(\eta(S(t^-), z)\) be the distribution of z. The expected magnitude of the jump is

E[d_j S(t)] = h(S(t^-))\, dt \int_z z\, \eta(S(t^-), z)\, dz.

Define \(dJ_S(t)\), a compensated process and martingale, as

dJ_S(t) = d_j S(t) - E[d_j S(t)] = S(t) - S(t^-) - \left(h(S(t^-)) \int_z z\, \eta(S(t^-), z)\, dz\right) dt.

Then

d_j S(t) = E[d_j S(t)] + dJ_S(t).

Consider a function \(g(S(t), t)\) of the jump process \(dS(t)\). If S(t) jumps by \(\Delta s\) then g(t) jumps by \(\Delta g\). \(\Delta g\) is drawn from distribution \(\eta_g(\cdot)\), which may depend on \(g(t^-)\), dg and \(S(t^-)\). Itô's lemma for g is then

dg(t) = \left(\frac{\partial g}{\partial t} + h \int_{\Delta g} \Delta g\, \eta_g(\cdot)\, d\Delta g\right) dt + dJ_g(t).

Itô's lemma for a process which is the sum of a drift-diffusion process and a jump process is just the sum of the Itô's lemma for the individual parts.
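As an illustrative numeric sketch (not from the original article; the intensity, horizon and path count are arbitrary), the compensated process M_t = N_t − ht built from a constant-intensity Poisson process is a martingale, so its sample mean should stay near zero:

```python
import random

def compensated_jump_mean(h=2.0, T=1.0, n_paths=200_000, seed=9):
    """Average of M_T = N_T - h*T over many paths, where N is a Poisson
    process with constant intensity h, simulated via exponential
    interarrival times; the compensated process has mean 0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        t = rng.expovariate(h)   # time of first jump
        n_jumps = 0
        while t <= T:            # count jumps up to the horizon T
            n_jumps += 1
            t += rng.expovariate(h)
        total += n_jumps - h * T # compensated value M_T for this path
    return total / n_paths

mean_mt = compensated_jump_mean()  # close to 0
```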
Non-continuous semimartingales
Itô's lemma can also be applied to general d-dimensional semimartingales, which need not be continuous. In general, a semimartingale is a càdlàg process, and an additional term needs to be added to the formula to ensure that the jumps of the process are correctly given by Itô's lemma. For any càdlàg process \(Y_t\), the left limit in t is denoted by \(Y_{t-}\), which is a left-continuous process. The jumps are written as \(\Delta Y_t = Y_t - Y_{t-}\). Then, Itô's lemma states that if X = (X^1, X^2, ..., X^d) is a d-dimensional semimartingale and f is a twice continuously differentiable real-valued function on R^d, then f(X) is a semimartingale, and

f(X_t) = f(X_0) + \sum_{i=1}^{d} \int_0^t f_{,i}(X_{s-})\, dX^i_s + \tfrac{1}{2}\sum_{i,j=1}^{d} \int_0^t f_{,ij}(X_{s-})\, d[X^i, X^j]_s + \sum_{s \le t}\left(\Delta f(X_s) - \sum_{i=1}^{d} f_{,i}(X_{s-})\,\Delta X^i_s - \tfrac{1}{2}\sum_{i,j=1}^{d} f_{,ij}(X_{s-})\,\Delta X^i_s\,\Delta X^j_s\right).

This differs from the formula for continuous semimartingales by the additional term summing over the jumps of X, which ensures that the jump of the right-hand side at time t is \(\Delta f(X_t)\).
Informal derivation
A formal proof of the lemma requires us to take the limit of a sequence of random variables, which is not done here. Instead, we can derive Itô's lemma by expanding a Taylor series and applying the rules of stochastic calculus.

Assume the Itô process is in the form of

dx = a\, dt + b\, dB.

Expanding f(x, t) in a Taylor series in x and t we have

df = \frac{\partial f}{\partial x}\, dx + \frac{\partial f}{\partial t}\, dt + \tfrac{1}{2}\frac{\partial^2 f}{\partial x^2}\, dx^2 + \cdots

and substituting \(a\, dt + b\, dB\) for dx gives

df = \frac{\partial f}{\partial x}(a\, dt + b\, dB) + \frac{\partial f}{\partial t}\, dt + \tfrac{1}{2}\frac{\partial^2 f}{\partial x^2}\left(a^2\, dt^2 + 2ab\, dt\, dB + b^2\, dB^2\right) + \cdots

In the limit as dt tends to 0, the \(dt^2\) and \(dt\, dB\) terms disappear but the \(dB^2\) term tends to dt. The latter can be shown if we prove that

dB^2 \to E[dB^2],

since \(E[dB^2] = dt\). The proof of this statistical property is however beyond the scope of this article.

Deleting the \(dt^2\) and \(dt\, dB\) terms, substituting dt for \(dB^2\), and collecting the dt and dB terms, we obtain

df = \left(\frac{\partial f}{\partial t} + a\frac{\partial f}{\partial x} + \frac{b^2}{2}\frac{\partial^2 f}{\partial x^2}\right) dt + b\frac{\partial f}{\partial x}\, dB

as required.

The formal proof is beyond the scope of this article.
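The key heuristic above, that the summed dB² terms behave like dt, can be checked numerically (an illustrative sketch, not part of the original article; step count and seed are arbitrary): the sum of squared Brownian increments over [0, T] concentrates around T as the partition is refined.

```python
import math
import random

def quadratic_variation(T=1.0, n_steps=100_000, seed=3):
    """Sum the squares of Brownian increments dB over a fine partition of
    [0, T]; this 'sum of dB^2 terms' converges to T (not to 0), which is
    what justifies the substitution dB^2 -> dt in the derivation."""
    rng = random.Random(seed)
    dt = T / n_steps
    return sum(rng.gauss(0.0, math.sqrt(dt)) ** 2 for _ in range(n_steps))

qv = quadratic_variation()  # close to T = 1.0
```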
Examples
Geometric Brownian motion
A process S is said to follow a geometric Brownian motion with volatility \(\sigma\) and drift \(\mu\) if it satisfies the stochastic differential equation \(dS = S(\sigma\, dB + \mu\, dt)\), for a Brownian motion B. Applying Itô's lemma with \(f(S) = \log(S)\) gives

d\log(S) = \frac{1}{S}\, dS - \frac{1}{2S^2}\, dS^2 = \sigma\, dB + \left(\mu - \tfrac{\sigma^2}{2}\right) dt.

It follows that \(\log(S_t) = \log(S_0) + \sigma B_t + (\mu - \sigma^2/2)t\), and exponentiating gives the expression for S,

S_t = S_0 \exp\left(\sigma B_t + (\mu - \sigma^2/2)t\right).
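The closed form can be checked numerically (an illustrative sketch, not part of the original article; all parameter values are arbitrary): a simple Euler discretization of the SDE should track the exact solution built from the same Brownian increments.

```python
import math
import random

def gbm_paths(s0=1.0, mu=0.05, sigma=0.2, T=1.0, n_steps=2000, seed=7):
    """Integrate dS = S*(sigma dB + mu dt) step by step (Euler) and compare
    with the closed form S_t = S_0 * exp(sigma*B_t + (mu - sigma^2/2)*t)
    given by Ito's lemma, driving both with the same Brownian increments."""
    rng = random.Random(seed)
    dt = T / n_steps
    s_euler, b = s0, 0.0
    for _ in range(n_steps):
        db = rng.gauss(0.0, math.sqrt(dt))            # Brownian increment
        s_euler += s_euler * (mu * dt + sigma * db)   # Euler step of the SDE
        b += db                                       # accumulate B_t
    s_exact = s0 * math.exp(sigma * b + (mu - 0.5 * sigma**2) * T)
    return s_euler, s_exact

s_euler, s_exact = gbm_paths()  # the two endpoints should nearly agree
```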
The Doléans exponential
The Doléans exponential (or stochastic exponential) of a continuous semimartingale X can be defined as the solution to the SDE \(dY = Y\, dX\) with initial condition \(Y_0 = 1\). It is sometimes denoted by \(\mathcal{E}(X)\). Applying Itô's lemma with \(f(Y) = \log(Y)\) gives

d\log(Y) = \frac{1}{Y}\, dY - \frac{1}{2Y^2}\, d[Y] = dX - \tfrac{1}{2}\, d[X].

Exponentiating gives the solution

Y_t = \exp\left(X_t - X_0 - \tfrac{1}{2}[X]_t\right).
Black–Scholes formula
Itô's lemma can be used to derive the Black–Scholes formula for an option. Suppose a stock price follows a geometric Brownian motion given by the stochastic differential equation \(dS = S(\sigma\, dB + \mu\, dt)\). Then, if the value of an option at time t is \(f(t, S_t)\), Itô's lemma gives

df(t, S_t) = \left(\frac{\partial f}{\partial t} + \tfrac{1}{2}\sigma^2 S^2 \frac{\partial^2 f}{\partial S^2}\right) dt + \frac{\partial f}{\partial S}\, dS.

The term \((\partial f/\partial S)\, dS\) represents the change in value in time dt of the trading strategy consisting of holding an amount \(\partial f/\partial S\) of the stock. If this trading strategy is followed, and any cash held is assumed to grow at the risk-free rate r, then the total value V of this portfolio satisfies the SDE

dV_t = r\left(V_t - \frac{\partial f}{\partial S} S_t\right) dt + \frac{\partial f}{\partial S}\, dS_t.

This strategy replicates the option if V = f(t, S). Combining these equations gives the celebrated Black–Scholes equation

\frac{\partial f}{\partial t} + \tfrac{1}{2}\sigma^2 S^2 \frac{\partial^2 f}{\partial S^2} + rS\frac{\partial f}{\partial S} - rf = 0.
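As a hedged illustration (not from the original article; the function and parameter names s, k, r, sigma, T are chosen for this sketch), the closed-form solution of this equation for a European call can be cross-checked against a risk-neutral Monte Carlo price:

```python
import math
import random

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s, k, r, sigma, T):
    """Black-Scholes European call value (a solution of the PDE above)."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return s * norm_cdf(d1) - k * math.exp(-r * T) * norm_cdf(d2)

def mc_call(s, k, r, sigma, T, n=400_000, seed=1):
    """Risk-neutral Monte Carlo price: discounted mean payoff of a GBM
    terminal value with drift r (the risk-free rate)."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        st = s * math.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * z)
        acc += max(st - k, 0.0)
    return math.exp(-r * T) * acc / n

analytic = bs_call(100, 100, 0.05, 0.2, 1.0)
mc = mc_call(100, 100, 0.05, 0.2, 1.0)  # should agree with analytic to MC error
```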
See also
Wiener process
Itô calculus
Feynman–Kac formula
References
Kiyoshi Itô (1951). On stochastic differential equations. Memoirs, American Mathematical Society 4, 1–51.
Hagen Kleinert (2004). Path Integrals in Quantum Mechanics, Statistics, Polymer Physics, and Financial Markets, 4th edition, World Scientific (Singapore); Paperback ISBN 981-238-107-4. Also available online: PDF-files [2]. This textbook also derives generalizations of Itô's lemma for non-Wiener (non-Gaussian) processes.
Bernt Øksendal (2000). Stochastic Differential Equations. An Introduction with Applications, 5th edition, corrected 2nd printing. Springer. ISBN 3-540-63720-6. Sections 4.1 and 4.2.
Domingo Tavella [3] (2002). Quantitative Methods in Derivatives Pricing: An Introduction to Computational Finance [4], John Wiley and Sons. ISBN 9780471274797. Pages 36–39.
Notes
[1] "Stochastic Calculus :: Itô–Döblin formula", Michael Stastny (http://mahalanobis.twoday.net/stories/756201/)
[2] http://www.physik.fu-berlin.de/~kleinert/b5
[3] http://www.wilmottwiki.com/wiki/index.php/Tavella,_Domingo
[4] http://books.google.com/books?id=wjs5hENNjL4C&pg=PA36&dq=jump+ito+lemma#PPA38,M1
External links
Derivation (http://www2.sjsu.edu/faculty/watkins/ito.htm), Prof. Thayer Watkins
Discussion (http://www.quantnotes.com/fundamentals/backgroundmaths/ito.htm), quantnotes.com
Informal proof (http://www.ftsmodules.com/public/texts/optiontutor/chap6.8.htm), optiontutor
Martingale representation theorem
In probability theory, the martingale representation theorem states that a random variable which is measurable with respect to the filtration generated by a Brownian motion can be written in terms of an Itô integral with respect to this Brownian motion.
The theorem only asserts the existence of the representation and does not help to find it explicitly; it is possible in
many cases to determine the form of the representation using Malliavin calculus.
Similar theorems also exist for martingales on filtrations induced by jump processes, for example, by Markov chains.
Statement of the theorem
Let \(B_t\) be a Brownian motion on a standard filtered probability space \((\Omega, \mathcal{F}, \mathcal{F}_t, P)\) and let \(\mathcal{G}_t\) be the augmentation of the filtration generated by B. If X is a square integrable random variable measurable with respect to \(\mathcal{G}_\infty\), then there exists a predictable process C which is adapted with respect to \(\mathcal{G}_t\), such that

X = E[X] + \int_0^\infty C_s\, dB_s.

Consequently,

E[X \mid \mathcal{G}_t] = E[X] + \int_0^t C_s\, dB_s.
Application in finance
The martingale representation theorem can be used to establish the existence of a hedging strategy. Suppose that \((M_t)_{0 \le t \le T}\) is a Q-martingale process whose volatility \(\sigma_t\) is always non-zero. Then, if \((N_t)_{0 \le t \le T}\) is any other Q-martingale, there exists an F-previsible process \(\varphi\), unique up to sets of measure 0, such that \(\int_0^T \varphi_t^2 \sigma_t^2\, dt < \infty\) with probability one, and N can be written as:

N_t = N_0 + \int_0^t \varphi_s\, dM_s.

The replicating strategy is defined to be:
hold \(\varphi_t\) units of the stock at the time t, and
hold \(\psi_t = N_t - \varphi_t M_t\) units of the bond.

At the expiration day T, the value of the portfolio is:

V_T = \varphi_T M_T + \psi_T = N_T,

and it is easy to check that the strategy is self-financing: the change in the value of the portfolio only depends on the change of the asset prices \((dM_t)\).
References
Montin, Benoît. "Stochastic Processes Applied in Finance", 2002
Elliott, Robert, "Stochastic Integrals for Martingales of a Jump Process with Partially Accessible Jump Times", Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 36, pp. 213–226, 1976
Mathematical model
Note: The term model has a different meaning in model theory, a branch of mathematical logic. An artifact
which is used to illustrate a mathematical idea may also be called a mathematical model, and this usage is the
reverse of the sense explained below.
A mathematical model is a description of a system using mathematical language. The process of developing a
mathematical model is termed mathematical modelling (also written modeling). Mathematical models are used not
only in the natural sciences (such as physics, biology, earth science, meteorology) and engineering disciplines, but
also in the social sciences (such as economics, psychology, sociology and political science); physicists, engineers,
statisticians, operations research analysts and economists use mathematical models most extensively.
Mathematical models can take many forms, including but not limited to dynamical systems, statistical models,
differential equations, or game theoretic models. These and other types of models can overlap, with a given model
involving a variety of abstract structures.
Examples of mathematical models
Population growth. A simple (though approximate) model of population growth is the Malthusian growth model. A slightly more realistic and widely used population growth model is the logistic function, and its extensions.
Model of a particle in a potential field. In this model we consider a particle as a point of mass m which describes a trajectory in space, modeled by a function x : R → R3 giving its coordinates as a function of time. The potential field is given by a function V : R3 → R and the trajectory is a solution of the differential equation

m\,\ddot{x}(t) = -\nabla V(x(t)).

Note this model assumes the particle is a point mass, which is certainly known to be false in many cases in which we use this model; for example, as a model of planetary motion.
Model of rational behavior for a consumer. In this model we assume a consumer faces a choice of n commodities labeled 1, 2, ..., n, each with a market price p_1, p_2, ..., p_n. The consumer is assumed to have a cardinal utility function U (cardinal in the sense that it assigns numerical values to utilities), depending on the amounts of commodities x_1, x_2, ..., x_n consumed. The model further assumes that the consumer has a budget M which is used to purchase a vector x_1, x_2, ..., x_n in such a way as to maximize U(x_1, x_2, ..., x_n). The problem of rational behavior in this model then becomes an optimization problem, that is:

\max U(x_1, x_2, \ldots, x_n)

subject to:

\sum_{i=1}^{n} p_i x_i \le M, \qquad x_i \ge 0 \text{ for all } i.

This model has been used in general equilibrium theory, particularly to show existence and Pareto efficiency of economic equilibria. The fact that this particular formulation assigns numerical values to levels of satisfaction is a source of criticism (and even ridicule); however, it is not an essential ingredient of the theory, and again this is an idealization.
The neighbour-sensing model explains mushroom formation from the initially chaotic fungal network.
Modelling requires selecting and identifying relevant aspects of a situation in the real world.
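The population-growth examples above can be sketched in a few lines of code (an illustrative toy, not part of the original article; all parameter values are arbitrary), using simple Euler integration of the logistic equation:

```python
def grow(p0, rate, capacity, steps, dt=0.1):
    """Euler-integrate the logistic equation dP/dt = rate*P*(1 - P/capacity).
    As capacity -> infinity this reduces to the Malthusian model dP/dt = rate*P."""
    p = p0
    for _ in range(steps):
        p += rate * p * (1.0 - p / capacity) * dt
    return p

# The logistic population saturates near the carrying capacity,
# unlike the Malthusian model, which grows without bound.
p_final = grow(p0=10.0, rate=0.5, capacity=1000.0, steps=400)
```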
Background
Often when engineers analyze a system to be controlled or optimized, they use a mathematical model. In analysis,
engineers can build a descriptive model of the system as a hypothesis of how the system could work, or try to
estimate how an unforeseeable event could affect the system. Similarly, in control of a system, engineers can try out
different control approaches in simulations.
A mathematical model usually describes a system by a set of variables and a set of equations that establish
relationships between the variables. The values of the variables can be practically anything; real or integer numbers,
boolean values or strings, for example. The variables represent some properties of the system, for example, measured
system outputs often in the form of signals, timing data, counters, and event occurrence (yes/no). The actual model is
the set of functions that describe the relations between the different variables.
Building blocks
There are six basic groups of variables: decision variables, input variables, state variables, exogenous variables,
random variables, and output variables. Since there can be many variables of each type, the variables are generally
represented by vectors.
Decision variables are sometimes known as independent variables. Exogenous variables are sometimes known as
parameters or constants. The variables are not independent of each other as the state variables are dependent on the
decision, input, random, and exogenous variables. Furthermore, the output variables are dependent on the state of the
system (represented by the state variables).
Objectives and constraints of the system and its users can be represented as functions of the output variables or state
variables. The objective functions will depend on the perspective of the model's user. Depending on the context, an
objective function is also known as an index of performance, as it is some measure of interest to the user. Although
there is no limit to the number of objective functions and constraints a model can have, using or optimizing the
model becomes more involved (computationally) as the number increases.
Classifying mathematical models
Many mathematical models can be classified in some of the following ways:
1. Linear vs. nonlinear: Mathematical models are usually composed of variables, which are abstractions of quantities of interest in the described systems, and operators that act on these variables, which can be algebraic operators, functions, differential operators, etc. If all the operators in a mathematical model exhibit linearity, the resulting mathematical model is defined as linear. A model is considered to be nonlinear otherwise.
The question of linearity and nonlinearity is dependent on context, and linear models may have nonlinear
expressions in them. For example, in a statistical linear model, it is assumed that a relationship is linear in the
parameters, but it may be nonlinear in the predictor variables. Similarly, a differential equation is said to be linear
if it can be written with linear differential operators, but it can still have nonlinear expressions in it. In a
mathematical programming model, if the objective functions and constraints are represented entirely by linear
equations, then the model is regarded as a linear model. If one or more of the objective functions or constraints
are represented with a nonlinear equation, then the model is known as a nonlinear model.
Nonlinearity, even in fairly simple systems, is often associated with phenomena such as chaos and irreversibility.
Although there are exceptions, nonlinear systems and models tend to be more difficult to study than linear ones. A
common approach to nonlinear problems is linearization, but this can be problematic if one is trying to study
aspects such as irreversibility, which are strongly tied to nonlinearity.
2. Deterministic vs. probabilistic (stochastic): A deterministic model is one in which every set of variable states is
uniquely determined by parameters in the model and by sets of previous states of these variables. Therefore,
deterministic models perform the same way for a given set of initial conditions. Conversely, in a stochastic model,
randomness is present, and variable states are not described by unique values, but rather by probability
distributions.
3. Static vs. dynamic: A static model does not account for the element of time, while a dynamic model does.
Dynamic models typically are represented with difference equations or differential equations.
A priori information
Mathematical modelling problems are often classified into black box or white box models, according to how much a
priori information is available of the system. A black-box model is a system of which there is no a priori information
available. A white-box model (also called glass box or clear box) is a system where all necessary information is
available. Practically all systems are somewhere between the black-box and white-box models, so this concept is
useful only as an intuitive guide for deciding which approach to take.
Usually it is preferable to use as much a priori information as possible to make the model more accurate. Therefore
the white-box models are usually considered easier, because if you have used the information correctly, then the
model will behave correctly. Often the a priori information comes in forms of knowing the type of functions relating
different variables. For example, if we make a model of how a medicine works in a human system, we know that
usually the amount of medicine in the blood is an exponentially decaying function. But we are still left with several
unknown parameters; how rapidly does the medicine amount decay, and what is the initial amount of medicine in
blood? This example is therefore not a completely white-box model. These parameters have to be estimated through
some means before one can use the model.
In black-box models one tries to estimate both the functional form of relations between variables and the numerical
parameters in those functions. Using a priori information we could end up, for example, with a set of functions that
probably could describe the system adequately. If there is no a priori information we would try to use functions as
general as possible to cover all different models. An often used approach for black-box models are neural networks
which usually do not make assumptions about incoming data. The problem with using a large set of functions to
describe a system is that estimating the parameters becomes increasingly difficult when the amount of parameters
(and different types of functions) increases.
Subjective information
Sometimes it is useful to incorporate subjective information into a mathematical model. This can be done based on
intuition, experience, or expert opinion, or based on convenience of mathematical form. Bayesian statistics provides
a theoretical framework for incorporating such subjectivity into a rigorous analysis: one specifies a prior probability
distribution (which can be subjective) and then updates this distribution based on empirical data. An example of
when such approach would be necessary is a situation in which an experimenter bends a coin slightly and tosses it
once, recording whether it comes up heads, and is then given the task of predicting the probability that the next flip
comes up heads. After bending the coin, the true probability that the coin will come up heads is unknown, so the
experimenter would need to make an arbitrary decision (perhaps by looking at the shape of the coin) about what
prior distribution to use. Incorporation of the subjective information is necessary in this case to get an accurate
prediction of the probability, since otherwise one would guess 1 or 0 as the probability of the next flip being heads,
which would be almost certainly wrong.
[1]
Complexity
In general, model complexity involves a trade-off between simplicity and accuracy of the model. Occam's Razor is a
principle particularly relevant to modelling; the essential idea being that among models with roughly equal predictive
power, the simplest one is the most desirable. While added complexity usually improves the realism of a model, it
can make the model difficult to understand and analyze, and can also pose computational problems, including
numerical instability. Thomas Kuhn argues that as science progresses, explanations tend to become more complex
before a Paradigm shift offers radical simplification.
For example, when modelling the flight of an aircraft, we could embed each mechanical part of the aircraft into our
model and would thus acquire an almost white-box model of the system. However, the computational cost of adding
such a huge amount of detail would effectively inhibit the usage of such a model. Additionally, the uncertainty
would increase due to an overly complex system, because each separate part induces some amount of variance into
the model. It is therefore usually appropriate to make some approximations to reduce the model to a sensible size.
Engineers often can accept some approximations in order to get a more robust and simple model. For example
Newton's classical mechanics is an approximated model of the real world. Still, Newton's model is quite sufficient
for most ordinary-life situations, that is, as long as particle speeds are well below the speed of light, and we study
macro-particles only.
Training
Any model which is not pure white-box contains some parameters that can be used to fit the model to the system it is
intended to describe. If the modelling is done by a neural network, the optimization of parameters is called training.
In more conventional modelling through explicitly given mathematical functions, parameters are determined by
curve fitting.
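A minimal sketch of parameter determination by curve fitting (illustrative; the data below are synthetic): ordinary least squares for a straight-line model y = a*x + b.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b: the 'curve fitting' step
    that pins down a model's free parameters from observed data."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx          # slope minimizing squared residuals
    b = my - a * mx        # intercept from the means
    return a, b

a, b = fit_line([0, 1, 2, 3], [1.0, 3.0, 5.0, 7.0])  # data lie on y = 2x + 1
```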
Model evaluation
A crucial part of the modelling process is the evaluation of whether or not a given mathematical model describes a
system accurately. This question can be difficult to answer as it involves several different types of evaluation.
Fit to empirical data
Usually the easiest part of model evaluation is checking whether a model fits experimental measurements or other
empirical data. In models with parameters, a common approach to test this fit is to split the data into two disjoint
subsets: training data and verification data. The training data are used to estimate the model parameters. An accurate
model will closely match the verification data even though this data was not used to set the model's parameters. This
practice is referred to as cross-validation in statistics.
Defining a metric to measure distances between observed and predicted data is a useful tool of assessing model fit.
In statistics, decision theory, and some economic models, a loss function plays a similar role.
While it is rather straightforward to test the appropriateness of parameters, it can be more difficult to test the validity
of the general mathematical form of a model. In general, more mathematical tools have been developed to test the fit
of statistical models than models involving Differential equations. Tools from nonparametric statistics can
sometimes be used to evaluate how well data fits a known distribution or to come up with a general model that
makes only minimal assumptions about the model's mathematical form.
Scope of the model
Assessing the scope of a model, that is, determining what situations the model is applicable to, can be less
straightforward. If the model was constructed based on a set of data, one must determine for which systems or
situations the known data is a "typical" set of data.
The question of whether the model describes well the properties of the system between data points is called
interpolation, and the same question for events or data points outside the observed data is called extrapolation.
As an example of the typical limitations of the scope of a model, in evaluating Newtonian classical mechanics, we
can note that Newton made his measurements without advanced equipment, so he could not measure properties of
particles travelling at speeds close to the speed of light. Likewise, he did not measure the movements of molecules
and other small particles, but macro particles only. It is then not surprising that his model does not extrapolate well
into these domains, even though his model is quite sufficient for ordinary life physics.
Philosophical considerations
Many types of modelling implicitly involve claims about causality. This is usually (but not always) true of models
involving differential equations. As the purpose of modelling is to increase our understanding of the world, the
validity of a model rests not only on its fit to empirical observations, but also on its ability to extrapolate to situations
or data beyond those originally described in the model. One can argue that a model is worthless unless it provides
some insight which goes beyond what is already known from direct investigation of the phenomenon being studied.
An example of such criticism is the argument that the mathematical models of optimal foraging theory do not offer insight that goes beyond the common-sense conclusions of evolution and other basic principles of ecology.[2]
See also
Agent-based model
Biologically inspired computing
Cliodynamics
Computer simulation
Decision engineering
Differential equations
Dynamical systems
Model
Model (economics)
Mathematical biology
Mathematical models in physics
Mathematical diagram
Mathematical psychology
Mathematical sociology
Simulation
Statistical model
References
[1] MacKay, D.J. Information Theory, Inference, and Learning Algorithms, Cambridge, (2003–2004). ISBN 0521642981
[2] Optimal Foraging Theory: A Critical Review – Annual Review of Ecology and Systematics, 15(1):523 – First Page Image (http://arjournals.annualreviews.org/doi/abs/10.1146/annurev.es.15.110184.002515?journalCode=ecolsys)
Further reading
Books
Aris, Rutherford [1978] (1994). Mathematical Modelling Techniques, New York: Dover. ISBN 0-486-68131-9
Bender, E.A. [1978] (2000). An Introduction to Mathematical Modelling, New York: Dover. ISBN 0-486-41180-X
Lin, C.C. & Segel, L.A. (1988). Mathematics Applied to Deterministic Problems in the Natural Sciences, Philadelphia: SIAM. ISBN 0-89871-229-7
Gershenfeld, N., The Nature of Mathematical Modelling, Cambridge University Press, (1998). ISBN 0521570956
Yang, X.-S., Mathematical Modelling for Earth Sciences, Dunedin Academic, (2008). ISBN 1903765927
Specific applications
Peierls, Rudolf. Model-making in physics (http://www.informaworld.com/smpp/content~content=a752582770~db=all~order=page), Contemporary Physics, Volume 21 (1), January 1980, 3–17
Korotayev A., Malkov A., Khaltourina D. (2006). Introduction to Social Macrodynamics: Compact Macromodels of the World System Growth (http://cliodynamics.ru/index.php?option=com_content&task=view&id=124&Itemid=70). Moscow: Editorial URSS (http://urss.ru/cgi-bin/db.pl?cp=&lang=en&blang=en&list=14&page=Book&id=34250). ISBN 5-484-00414-4
External links
General reference material
McLaughlin, Michael P. (1999) 'A Tutorial on Mathematical Modelling' (http://www.causascientia.org/math_stat/Tutorial.pdf) PDF (264 KiB)
Patrone, F. Introduction to modeling via differential equations (http://www.diptem.unige.it/patrone/differential_equations_intro.htm), with critical remarks.
Plus teacher and student package: Mathematical Modelling (http://plus.maths.org/issue44/package/index.html). Brings together all articles on mathematical modelling from Plus, the online mathematics magazine produced by the Millennium Mathematics Project at the University of Cambridge.
Software
List of computer simulation software
Philosophical background
Frigg, R. and S. Hartmann, Models in Science (http://plato.stanford.edu/entries/models-science/), in: The Stanford Encyclopedia of Philosophy (Spring 2006 Edition)
Monte Carlo method
Monte Carlo methods (or Monte Carlo experiments) are a class of computational algorithms that rely on repeated
random sampling to compute their results. Monte Carlo methods are often used in simulating physical and
mathematical systems. Because of their reliance on repeated computation of random or pseudo-random numbers,
these methods are most suited to calculation by a computer and tend to be used when it is infeasible or impossible to compute an exact result with a deterministic algorithm.[1]
Monte Carlo simulation methods are especially useful in studying systems with a large number of coupled degrees of
freedom, such as fluids, disordered materials, strongly coupled solids, and cellular structures (see cellular Potts
model). More broadly, Monte Carlo methods are useful for modeling phenomena with significant uncertainty in
inputs, such as the calculation of risk in business. These methods are also widely used in mathematics: a classic use
is for the evaluation of definite integrals, particularly multidimensional integrals with complicated boundary
conditions. It is a widely successful method in risk analysis when compared with alternative methods or human
intuition. When Monte Carlo simulations have been applied in space exploration and oil exploration, actual
observations of failures, cost overruns and schedule overruns are routinely better predicted by the simulations than
by human intuition or alternative "soft" methods.[2]
The term "Monte Carlo method" was coined in the 1940s by physicists working on nuclear weapon projects at the Los Alamos National Laboratory.[3]
Overview
The Monte Carlo method can be illustrated as a game of Battleship. First a player makes some random shots. Next the player applies algorithms (e.g., a battleship is four dots in the vertical or horizontal direction). Finally, based on the outcome of the random sampling and the algorithm, the player can determine the likely locations of the other player's ships.
There is no single Monte Carlo method; instead, the term describes a
large and widely-used class of approaches. However, these approaches
tend to follow a particular pattern:
1. Define a domain of possible inputs.
2. Generate inputs randomly from the domain using a certain specified
probability distribution.
3. Perform a deterministic computation using the inputs.
4. Aggregate the results of the individual computations into the final
result.
For example, the value of π can be approximated using a Monte Carlo method:
1. Draw a square on the ground, then inscribe a circle within it. From plane geometry, the ratio of the area of the inscribed circle to that of the surrounding square is π/4.
2. Uniformly scatter some objects of uniform size throughout the square, for example grains of rice or sand.
3. Since the two areas are in the ratio π/4, the objects should fall in the areas in approximately that ratio. Thus, counting the number of objects in the circle and dividing by the total number of objects in the square will yield an approximation for π/4.
4. Multiplying the result by 4 will then yield an approximation for π itself.
Notice how the approximation follows the general pattern of Monte Carlo algorithms. First, we define a domain of inputs: in this case, it is the square which circumscribes our circle. Next, we generate inputs randomly (scatter individual grains within the square), then perform a computation on each input (test whether it falls within the circle). At the end, we aggregate the results into our final result, the approximation of π. Note, also, two other common properties of Monte Carlo methods: the computation's reliance on good random numbers, and its slow convergence to a better approximation as more data points are sampled. If grains are purposefully dropped into only, for example, the center of the circle, they will not be uniformly distributed, and so our approximation will be poor. An approximation will also be poor if only a few grains are randomly dropped into the whole square. Thus, the approximation of π will become more accurate both as the grains are dropped more uniformly and as more are dropped.
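The four steps above can be sketched in a few lines of Python (the function name and sample count are arbitrary choices for illustration):

```python
import random

def estimate_pi(n_points, seed=0):
    """Estimate pi by uniformly scattering points in the unit square and
    counting the fraction that lands inside the inscribed quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()
        # The quarter circle of radius 1 centred at the origin has area pi/4.
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_points

print(estimate_pi(100_000))  # close to 3.14159
```

Accuracy improves only slowly: quadrupling the number of samples roughly halves the error.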
History
Enrico Fermi in the 1930s and Stanisław Ulam in 1946 independently had the idea; Ulam later contacted John von Neumann to work on it.[4]
Physicists at Los Alamos Scientific Laboratory were investigating radiation shielding and the distance that neutrons
would likely travel through various materials. Despite having most of the necessary data, such as the average
distance a neutron would travel in a substance before it collided with an atomic nucleus or how much energy the
neutron was likely to give off following a collision, the problem could not be solved with analytical calculations.
John von Neumann and Stanislaw Ulam suggested that the problem be solved by modeling the experiment on a
computer using chance. Being secret, their work required a code name. Von Neumann chose the name "Monte
Carlo". The name is a reference to the Monte Carlo Casino in Monaco where Ulam's uncle would borrow money to
gamble.[1][5][6]
Random methods of computation and experimentation (generally considered forms of stochastic simulation) can be
arguably traced back to the earliest pioneers of probability theory (see, e.g., Buffon's needle, and the work on small
samples by William Sealy Gosset), but are more specifically traced to the pre-electronic computing era. What usually distinguishes a Monte Carlo form of simulation is that it systematically "inverts" the typical mode of simulation, treating deterministic problems by first finding a probabilistic analog (see simulated annealing).
Previous methods of simulation and statistical sampling generally did the opposite: using simulation to test a
previously understood deterministic problem. Though examples of an "inverted" approach do exist historically, they
were not considered a general method until the popularity of the Monte Carlo method spread.
Monte Carlo methods were central to the simulations required for the Manhattan Project, though they were severely limited by the computational tools of the time. Therefore, it was only after electronic computers were first built (from
1945 on) that Monte Carlo methods began to be studied in depth. In the 1950s they were used at Los Alamos for
early work relating to the development of the hydrogen bomb, and became popularized in the fields of physics,
physical chemistry, and operations research. The Rand Corporation and the U.S. Air Force were two of the major
organizations responsible for funding and disseminating information on Monte Carlo methods during this time, and
they began to find a wide application in many different fields.
Uses of Monte Carlo methods require large amounts of random numbers, and it was their use that spurred the
development of pseudorandom number generators, which were far quicker to use than the tables of random numbers
which had been previously used for statistical sampling.
Applications
As mentioned, Monte Carlo simulation methods are especially useful for modeling phenomena with significant
uncertainty in inputs and in studying systems with a large number of coupled degrees of freedom. Specific areas of
application include:
Physical sciences
Monte Carlo methods are very important in computational physics, physical chemistry, and related applied fields,
and have diverse applications from complicated quantum chromodynamics calculations to designing heat shields and
aerodynamic forms. The Monte Carlo method is widely used in statistical physics, particularly Monte Carlo
molecular modeling as an alternative for computational molecular dynamics as well as to compute statistical field
theories of simple particle and polymer models;[7] see Monte Carlo method in statistical physics. In experimental
particle physics, these methods are used for designing detectors, understanding their behavior and comparing
experimental data to theory, or on the vastly larger scale of galaxy modelling.[8]
Monte Carlo methods are also used in the ensemble models that form the basis of modern weather forecasting
operations.
Engineering
Monte Carlo methods are widely used in engineering for sensitivity analysis and quantitative probabilistic analysis in
process design. The need arises from the interactive, co-linear and non-linear behaviour of typical process
simulations. For example,
in microelectronics engineering, Monte Carlo methods are applied to analyze correlated and uncorrelated
variations in analog and digital integrated circuits. This enables designers to estimate realistic 3 sigma corners and
effectively optimise circuit yields.
in geostatistics and geometallurgy, Monte Carlo methods underpin the design of mineral processing flowsheets
and contribute to quantitative risk analysis.
Applied statistics
Monte Carlo methods are generally used for two purposes in applied statistics. One purpose is to provide a
methodology to compare and contrast competing statistics for small sample, realistic data conditions. The Type I
error and power properties of statistics are obtainable for data drawn from classical theoretical distributions (e.g.,
normal curve, Cauchy distribution) for asymptotic conditions (i.e., infinite sample size and infinitesimally small treatment effect), but such results often have little bearing on statistics' properties under realistic conditions.[9]
The second purpose of Monte Carlo methods, frequently offered as an alternative to asymptotic or exact tests in statistics software, is to provide a more efficacious approach to data analysis than the time-consuming (and often impossible to compute) full permutation methodology. The Monte Carlo option is more accurate than relying on hypothesis tests' asymptotically derived critical values, yet not as time-consuming to obtain as exact tests such as permutation tests. For example, in SPSS version 18 with the Exact module installed, a two-independent-samples Wilcoxon Rank Sum / Mann-Whitney U test can be conducted using asymptotic critical values, via a Monte Carlo option by specifying the number of samples, or via exact methods by specifying the time limit to be allotted to the analysis.
Monte Carlo methods are also a compromise between approximate randomization and permutation tests. An
approximate randomization test is based on a specified subset of all permutations (which entails potentially
enormous housekeeping of which permutations have been considered). The Monte Carlo approach is based on a
specified number of randomly drawn permutations (exchanging a minor loss in precision if a permutation is drawn
twice - or more frequently - for the efficiency of not having to track which permutations have already been selected).
It is important to differentiate between a simulation, a Monte Carlo study, and a Monte Carlo simulation. A simulation is a fictitious representation of reality. A Monte Carlo study is a technique that can be used to solve a mathematical or statistical problem. A Monte Carlo simulation uses repeated sampling to determine the properties of some phenomenon. Examples:
Drawing a pseudo-random uniform variate from the interval [0,1] can be used to simulate the tossing of a coin: If
the value is less than or equal to 0.50 designate the outcome as heads, but if the value is greater than 0.50
designate the outcome as tails. This is a simulation, but not a Monte Carlo simulation.
The area of an irregular figure inscribed in a unit square can be determined by throwing darts at the square and
computing the ratio of hits within the irregular figure to the total number of darts thrown. This is a Monte Carlo
method of determining area, but not a simulation.
Drawing a large number of pseudo-random uniform variates from the interval [0,1], and assigning values less than
or equal to 0.50 as heads and greater than 0.50 as tails, is a Monte Carlo simulation of the behavior of repeatedly
tossing a coin.
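The third example above can be sketched directly (function names and the sample count are arbitrary illustrations):

```python
import random

def toss(rng):
    """Simulate one coin toss from a uniform variate on [0, 1]."""
    return "heads" if rng.random() <= 0.5 else "tails"

def monte_carlo_coin(n_tosses, seed=0):
    """Monte Carlo simulation: repeat the simulated toss many times and
    estimate the probability of heads from the observed frequency."""
    rng = random.Random(seed)
    heads = sum(1 for _ in range(n_tosses) if toss(rng) == "heads")
    return heads / n_tosses

print(monte_carlo_coin(100_000))  # close to 0.5
```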
Sawilowsky listed the characteristics of a high quality Monte Carlo simulation:
the pseudo-random number generator has certain characteristics (e.g., a long period before repeating values)
the pseudo-random number generator produces values that pass tests for randomness
the number of repetitions of the experiment is sufficiently large to ensure accuracy of results
the proper sampling technique is used
the algorithm used is valid for what is being modeled
the study simulates the phenomenon in question.[10]
Design and visuals
Monte Carlo methods have also proven efficient in solving coupled integral differential equations of radiation fields
and energy transport, and thus these methods have been used in global illumination computations which produce
photorealistic images of virtual 3D models, with applications in video games, architecture, design, computer
generated films, and cinematic special effects.
Finance and business
Monte Carlo methods in finance are often used to calculate the value of companies, to evaluate investments in
projects at a business unit or corporate level, or to evaluate financial derivatives. Monte Carlo methods used in these
cases allow the construction of stochastic or probabilistic financial models as opposed to the traditional static and
deterministic models, thereby enhancing the treatment of uncertainty in the calculation. For use in the insurance
industry, see stochastic modelling.
Telecommunications
When planning a wireless network, the design must be proven to work for a wide variety of scenarios that depend mainly on the number of users, their locations and the services they want to use. Monte Carlo methods are typically
used to generate these users and their states. The network performance is then evaluated and, if results are not
satisfactory, the network design goes through an optimization process.
Games
Monte Carlo methods have recently been applied in game-playing-related artificial intelligence theory. Most notably, the games of Go and Battleship have seen remarkably successful Monte Carlo-based computer players. One
of the main problems that this approach has in game playing is that it sometimes misses an isolated, very good move.
These approaches are often strong strategically but weak tactically, as tactical decisions tend to rely on a small
number of crucial moves which are easily missed by the randomly searching Monte Carlo algorithm.
Monte Carlo simulation versus "what if" scenarios
The opposite of Monte Carlo simulation might be considered deterministic modeling using single-point estimates.
Each uncertain variable within a model is assigned a best guess estimate. Various combinations of each input
variable are manually chosen (such as best case, worst case, and most likely case), and the results recorded for each so-called "what if" scenario.[11]
By contrast, Monte Carlo simulation considers random sampling of probability distribution functions as model inputs
to produce hundreds or thousands of possible outcomes instead of a few discrete scenarios. The results provide probabilities of different outcomes occurring.[12]
For example, a comparison of a spreadsheet cost construction model run using traditional "what if" scenarios, and then run again with Monte Carlo simulation and triangular probability distributions, shows that the Monte Carlo analysis has a narrower range than the "what if" analysis. This is because the "what if" analysis gives equal weight to all scenarios.[13]
For further discussion, see quantifying uncertainty under corporate finance.
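The comparison can be sketched in Python; the cost figures and triangular parameters below are made up purely for illustration:

```python
import random

def what_if_totals(costs):
    """Three-point 'what if' analysis: sum the best, most likely and
    worst cases across all cost items."""
    return {case: sum(item[i] for item in costs)
            for i, case in enumerate(("best", "likely", "worst"))}

def monte_carlo_totals(costs, n=20_000, seed=0):
    """Monte Carlo analysis: draw each cost item from a triangular
    distribution and sum, repeated n times."""
    rng = random.Random(seed)
    return [sum(rng.triangular(lo, hi, mode) for lo, mode, hi in costs)
            for _ in range(n)]

costs = [(90, 100, 120), (40, 50, 70)]   # (best, most likely, worst) per item
totals = monte_carlo_totals(costs)
print(what_if_totals(costs))             # {'best': 130, 'likely': 150, 'worst': 190}
print(min(totals), max(totals))          # observed range is narrower than 130-190
```

Because the simulated totals rarely combine every best case or every worst case at once, the Monte Carlo range is narrower than the "what if" extremes, as the text describes.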
Use in mathematics
In general, Monte Carlo methods are used in mathematics to solve various problems by generating suitable random
numbers and observing that fraction of the numbers which obeys some property or properties. The method is useful
for obtaining numerical solutions to problems which are too complicated to solve analytically. The most common
application of the Monte Carlo method is Monte Carlo integration.
Integration
Deterministic methods of numerical integration usually operate by taking a number of evenly spaced samples from a
function. In general, this works very well for functions of one variable. However, for functions of vectors,
deterministic quadrature methods can be very inefficient. To numerically integrate a function of a two-dimensional
vector, equally spaced grid points over a two-dimensional surface are required. For instance, a 10x10 grid requires 100 points. If the vector has 100 dimensions, the same spacing on the grid would require 10^100 points, far too many to be computed. 100 dimensions is by no means unusual, since in many physical problems a "dimension" is equivalent to a degree of freedom. (See Curse of dimensionality.)
Monte Carlo methods provide a way out of this exponential time-increase. As long as the function in question is
reasonably well-behaved, it can be estimated by randomly selecting points in 100-dimensional space, and taking
some kind of average of the function values at these points. By the law of large numbers, this method will display 1/√N convergence, i.e. quadrupling the number of sampled points will halve the error, regardless of the number of dimensions.
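A sketch of this in Python, integrating a 100-dimensional function whose exact integral is known (the function choice and sample count are arbitrary):

```python
import random

def mc_integrate(f, dim, n_samples, seed=0):
    """Estimate the integral of f over the unit hypercube [0, 1]^dim by
    averaging f at uniformly random points; the error shrinks like
    1/sqrt(n_samples) regardless of dim."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        point = [rng.random() for _ in range(dim)]
        total += f(point)
    return total / n_samples

# The integral of sum(x_i^2) over [0,1]^100 is exactly 100/3 = 33.33...
f = lambda x: sum(xi * xi for xi in x)
print(mc_integrate(f, dim=100, n_samples=10_000))  # close to 33.3
```

A deterministic grid with even two points per dimension would already need 2^100 evaluations; the Monte Carlo average gets a usable answer from ten thousand.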
A refinement of this method is to somehow make the points random, but more likely to come from regions of high
contribution to the integral than from regions of low contribution. In other words, the points should be drawn from a
distribution similar in form to the integrand. Understandably, doing this precisely is just as difficult as solving the
integral in the first place, but there are approximate methods available: from simply making up an integrable
function thought to be similar, to one of the adaptive routines discussed in the topics listed below.
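A tiny worked example of this importance-sampling idea (the integrand and proposal density are arbitrary choices): to integrate f(x) = x^3 on [0, 1], draw x from the density p(x) = 2x, which is similar in shape to the integrand, and average the weighted values f(x)/p(x).

```python
import random

def importance_sample(n, seed=0):
    """Estimate the integral of f(x) = x^3 on [0, 1] (exact value 0.25) by
    drawing x from the density p(x) = 2x and averaging the weighted
    values f(x) / p(x) = x^2 / 2."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.random() ** 0.5     # inverse-CDF sampling from p(x) = 2x
        total += x * x / 2.0        # f(x) / p(x), simplified
    return total / n

print(importance_sample(50_000))  # close to 0.25
```

Because the weighted integrand x^2/2 varies less over [0, 1] than x^3 does, this estimator has lower variance than uniform sampling with the same number of points.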
A similar approach involves using low-discrepancy sequences instead: the quasi-Monte Carlo method.
Quasi-Monte Carlo methods can often be more efficient at numerical integration because the sequence "fills" the
area better in a sense and samples more of the most important points that can make the simulation converge to the
desired solution more quickly.
Integration methods
Direct sampling methods
Importance sampling
Stratified sampling
Recursive stratified sampling
VEGAS algorithm
Random walk Monte Carlo including Markov chains
Metropolis-Hastings algorithm
Gibbs sampling
Optimization
Most Monte Carlo optimization methods are based on random walks. Essentially, the program moves a marker around in multi-dimensional space, tending to move in directions which lead to lower function values, but sometimes moving against the gradient.
Another powerful and very popular application for random numbers in numerical simulation is in numerical
optimization. These problems use functions of some often large-dimensional vector that are to be minimized (or
maximized). Many problems can be phrased in this way: for example a computer chess program could be seen as
trying to find the optimal set of, say, 10 moves which produces the best evaluation function at the end. The traveling
salesman problem is another optimization problem. There are also applications to engineering design, such as
multidisciplinary design optimization.
Optimization methods
Evolution strategy
Genetic algorithms
Parallel tempering
Simulated annealing
Stochastic optimization
Stochastic tunneling
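One of the listed methods, simulated annealing, can be sketched as exactly such a random walk that occasionally accepts uphill moves; the test function, cooling schedule and parameters below are arbitrary illustrations:

```python
import math
import random

def simulated_annealing(f, x0, n_steps=20_000, step=0.5, t0=5.0, seed=0):
    """Random-walk minimisation: propose a nearby point; accept it if it is
    better, or with probability exp(-delta / T) if worse, where the
    temperature T is slowly lowered (the cooling schedule)."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, fbest = x, fx
    t = t0
    for _ in range(n_steps):
        t *= 0.9995                        # geometric cooling
        y = x + rng.uniform(-step, step)
        fy = f(y)
        if fy < fx or rng.random() < math.exp(-(fy - fx) / t):
            x, fx = y, fy
            if fx < fbest:
                best, fbest = x, fx
    return best, fbest

# A wiggly test function with many local minima.
f = lambda x: x * x + 4.0 * math.sin(5.0 * x)
print(simulated_annealing(f, x0=4.0))  # (location, value) of the best point found
```

The occasional uphill acceptances are what let the walker escape local minima that a pure downhill search would get stuck in.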
Inverse problems
Probabilistic formulation of inverse problems leads to the definition of a probability distribution in the model space.
This probability distribution combines a priori information with new information obtained by measuring some
observable parameters (data). As, in the general case, the theory linking data with model parameters is nonlinear, the
a posteriori probability in the model space may not be easy to describe (it may be multimodal, some moments may
not be defined, etc.).
When analyzing an inverse problem, obtaining a maximum likelihood model is usually not sufficient, as we
normally also wish to have information on the resolution power of the data. In the general case we may have a large
number of model parameters, and an inspection of the marginal probability densities of interest may be impractical,
or even useless. But it is possible to pseudorandomly generate a large collection of models according to the posterior
probability distribution and to analyze and display the models in such a way that information on the relative
likelihoods of model properties is conveyed to the spectator. This can be accomplished by means of an efficient
Monte Carlo method, even in cases where no explicit formula for the a priori distribution is available.
The best-known importance sampling method, the Metropolis algorithm, can be generalized, and this gives a method
that allows analysis of (possibly highly nonlinear) inverse problems with complex a priori information and data with
an arbitrary noise distribution. For details, see Mosegaard and Tarantola (1995),
[14]
or Tarantola (2005).
[15]
Computational mathematics
Monte Carlo methods are useful in many areas of computational mathematics, where a lucky choice can find the
correct result. A classic example is Rabin's algorithm for primality testing: for any n which is not prime, a random x
has at least a 75% chance of proving that n is not prime. Hence, if n is not prime, but x says that it might be, we have
observed at most a 1-in-4 event. If 10 different random x say that "n is probably prime" when it is not, we have
observed a one-in-a-million event. In general, a Monte Carlo algorithm of this kind produces one answer with a guarantee (n is composite, and x proves it so) and one without a guarantee, but with a bound on how often the unguaranteed answer is given wrongly (in this case at most 25% of the time). See also Las Vegas algorithm for a related, but different, idea.
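Rabin's test (the Miller-Rabin test) can be sketched as follows; each round picks a random base x and checks a condition that every prime must satisfy:

```python
import random

def is_probably_prime(n, rounds=10, seed=0):
    """Miller-Rabin primality test: a random base x exposes a composite n
    with probability at least 75%, so `rounds` passing rounds leave an
    error chance of at most 4**-rounds."""
    if n < 2:
        return False
    for p in (2, 3):
        if n % p == 0:
            return n == p
    # Write n - 1 as 2^s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    rng = random.Random(seed)
    for _ in range(rounds):
        x = rng.randrange(2, n - 1)
        y = pow(x, d, n)
        if y == 1 or y == n - 1:
            continue
        for _ in range(s - 1):
            y = pow(y, 2, n)
            if y == n - 1:
                break
        else:
            return False   # x is a witness: n is certainly composite
    return True            # no witness found: n is probably prime

print(is_probably_prime(2**61 - 1))  # True: a (Mersenne) prime
print(is_probably_prime(91))         # False: 91 = 7 * 13
```

A "composite" verdict is guaranteed correct; only the "probably prime" verdict carries the (bounded) chance of error described above.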
Monte Carlo and random numbers
Interestingly, Monte Carlo simulation methods do not always require truly random numbers to be useful, although for some applications, such as primality testing, unpredictability is vital (see Davenport (1995)).[16] Many of the most
useful techniques use deterministic, pseudo-random sequences, making it easy to test and re-run simulations. The
only quality usually necessary to make good simulations is for the pseudo-random sequence to appear "random
enough" in a certain sense.
What this means depends on the application, but typically they should pass a series of statistical tests. Testing that
the numbers are uniformly distributed or follow another desired distribution when a large enough number of elements of the sequence is considered is one of the simplest and most common such tests.
See also
General
Auxiliary field Monte Carlo
Bootstrapping (statistics)
Demon algorithm
Evolutionary computation
FERMIAC
Markov chain
Molecular dynamics
Monte Carlo option model
Monte Carlo integration
Quasi-Monte Carlo method
Random number generator
Randomness
Resampling (statistics)
Application areas
Graphics, particularly for ray tracing; a version of the Metropolis-Hastings algorithm is also used for ray tracing
where it is known as Metropolis light transport
Modeling light transport in biological tissue
Monte Carlo methods in finance
Reliability engineering
In simulated annealing for protein structure prediction
In semiconductor device research, to model the transport of current carriers
Environmental science, dealing with contaminant behavior
In geophysics, to invert seismic refraction data.
[17]
Search And Rescue and Counter-Pollution. Models used to predict the drift of a life raft or movement of an oil
slick at sea.
In probabilistic design for simulating and understanding the effects of variability
In physical chemistry, particularly for simulations involving atomic clusters
In biomolecular simulations
In polymer physics
Bond fluctuation model
In computer science
Monte Carlo algorithm
Las Vegas algorithm
LURCH
Computer go
General Game Playing
Modeling the movement of impurity atoms (or ions) in plasmas in existing and planned tokamaks (e.g. DIVIMP).
Nuclear and particle physics codes using the Monte Carlo method:
GEANT - CERN's simulation of high energy particles interacting with a detector.
FLUKA - INFN and CERN's simulation package for the interaction and transport of particles and nuclei in matter
SRIM, a code to calculate the penetration and energy deposition of ions in matter.
CompHEP, PYTHIA - Monte Carlo generators of particle collisions
MCNP(X) - LANL's radiation transport codes
MCU: universal computer code for simulation of particle transport (neutrons, photons, electrons) in three-dimensional systems by means of the Monte Carlo method
EGS - Stanford's simulation code for coupled transport of electrons and photons
PEREGRINE: LLNL's Monte Carlo tool for radiation therapy dose calculations
BEAMnrc - Monte Carlo code system for modeling radiotherapy sources (LINACs)
PENELOPE - Monte Carlo for coupled transport of photons and electrons, with applications in radiotherapy
MONK - Serco Assurance's code for the calculation of k-effective of nuclear systems
Modelling of foam and cellular structures
Modeling of tissue morphogenesis
Computation of holograms
Phylogenetic analysis, i.e. Bayesian inference, Markov chain Monte Carlo
Other methods employing Monte Carlo
Assorted random models, e.g. self-organized criticality
Direct simulation Monte Carlo
Dynamic Monte Carlo method
Kinetic Monte Carlo
Quantum Monte Carlo
Quasi-Monte Carlo method using low-discrepancy sequences and self avoiding walks
Semiconductor charge transport and the like
Electron microscopy beam-sample interactions
Stochastic optimization
Cellular Potts model
Markov chain Monte Carlo
Cross-entropy method
Applied information economics
Monte Carlo localization
Evidence-based Scheduling
Binary collision approximation
List of software for Monte Carlo molecular modeling
Notes
[1] Douglas Hubbard "How to Measure Anything: Finding the Value of Intangibles in Business" pg. 46, John Wiley & Sons, 2007
[2] Douglas Hubbard "The Failure of Risk Management: Why It's Broken and How to Fix It", John Wiley & Sons, 2009
[3] Nicholas Metropolis (1987). "The beginning of the Monte Carlo method" (http://library.lanl.gov/la-pubs/00326866.pdf). Los Alamos Science (1987 Special Issue dedicated to Stanislaw Ulam): 125-130.
[4] http://people.cs.ubc.ca/~nando/papers/mlintro.pdf
[5] Charles Grinstead & J. Laurie Snell "Introduction to Probability" pp. 10-11, American Mathematical Society, 1997
[6] H.L. Anderson, "Metropolis, Monte Carlo and the MANIAC" (http://library.lanl.gov/cgi-bin/getfile?00326886.pdf), Los Alamos Science, no. 14, pp. 96-108, 1986.
[7] Stephan A. Baeurle (2009). "Multiscale modeling of polymer materials using field-theoretic methodologies: a survey about recent developments" (http://www.springerlink.com/content/xl057580272w8703/). Journal of Mathematical Chemistry 46 (2): 363-426. doi:10.1007/s10910-008-9467-3.
[8] H. T. MacGillivray, R. J. Dodd, "Monte-Carlo simulations of galaxy systems", Astrophysics and Space Science, Volume 86, Number 2, September 1982, Springer Netherlands (http://www.springerlink.com/content/rp3g1q05j176r108/fulltext.pdf)
[9] Sawilowsky, Shlomo S.; Fahoome, Gail C. (2003). Statistics via Monte Carlo Simulation with Fortran. Rochester Hills, MI: JMASM.
ISBN 0-9740236-0-4.
[10] Sawilowsky, S. (2003). You think you've got trivials? Journal of Modern Applied Statistical Methods, 2(1), 218-225.
[11] David Vose: Risk Analysis, A Quantitative Guide, Second Edition, p. 13, John Wiley & Sons, 2000.
[12] Ibid, p. 16
[13] Ibid, p. 17, showing graph
[14] http://www.ipgp.jussieu.fr/~tarantola/Files/Professional/Papers_PDF/MonteCarlo_latex.pdf
[15] http://www.ipgp.jussieu.fr/~tarantola/Files/Professional/SIAM/index.html
[16] Davenport, J. H. "Primality testing revisited" (http://doi.acm.org/10.1145/143242.143290). Retrieved 2007-08-19.
[17] Desman Geophysics - seismic refraction inversion users manual. http://www.desmangeophysics.com/wb/pages/home/product/users-manual.php
References
Metropolis, N.; Ulam, S. (1949). "The Monte Carlo Method" (http://jstor.org/stable/2280232). Journal of the American Statistical Association 44 (247): 335-341. doi:10.2307/2280232. PMID 18139350.
Metropolis, Nicholas; Rosenbluth, Arianna W.; Rosenbluth, Marshall N.; Teller, Augusta H.; Teller, Edward
(1953). "Equation of State Calculations by Fast Computing Machines". Journal of Chemical Physics 21 (6): 1087.
doi:10.1063/1.1699114.
Hammersley, J. M.; Handscomb, D. C. (1975). Monte Carlo Methods. London: Methuen. ISBN 0416523404.
Kahneman, D.; Tversky, A. (1982). Judgement under Uncertainty: Heuristics and Biases. Cambridge University
Press.
Gould, Harvey; Tobochnik, Jan (1988). An Introduction to Computer Simulation Methods, Part 2, Applications to
Physical Systems. Reading: Addison-Wesley. ISBN 020116504X.
Binder, Kurt (1995). The Monte Carlo Method in Condensed Matter Physics. New York: Springer.
ISBN 0387543694.
Berg, Bernd A. (2004). Markov Chain Monte Carlo Simulations and Their Statistical Analysis (With Web-Based
Fortran Code). Hackensack, NJ: World Scientific. ISBN 9812389350.
Caflisch, R. E. (1998). Monte Carlo and quasi-Monte Carlo methods. Acta Numerica 7. Cambridge University Press. pp. 1-49.
Doucet, Arnaud; Freitas, Nando de; Gordon, Neil (2001). Sequential Monte Carlo methods in practice. New
York: Springer. ISBN 0387951466.
Fishman, G. S. (1995). Monte Carlo: Concepts, Algorithms, and Applications. New York: Springer.
ISBN 038794527X.
MacKeown, P. Kevin (1997). Stochastic Simulation in Physics. New York: Springer. ISBN 9813083263.
Robert, C. P.; Casella, G. (2004). Monte Carlo Statistical Methods (2nd ed.). New York: Springer.
ISBN 0387212396.
Rubinstein, R. Y.; Kroese, D. P. (2007). Simulation and the Monte Carlo Method (2nd ed.). New York: John Wiley & Sons. ISBN 9780470177938.
Mosegaard, Klaus; Tarantola, Albert (1995). "Monte Carlo sampling of solutions to inverse problems". J. Geophys. Res. 100 (B7): 12431-12447. doi:10.1029/94JB03097.
Tarantola, Albert (2005). Inverse Problem Theory (http://www.ipgp.jussieu.fr/~tarantola/Files/Professional/SIAM/index.html). Philadelphia: Society for Industrial and Applied Mathematics. ISBN 0898715725.
External links
Overview and reference list (http://mathworld.wolfram.com/MonteCarloMethod.html), MathWorld
Introduction to Monte Carlo Methods (http://www.phy.ornl.gov/csep/CSEP/MC/MC.html), Computational Science Education Project
The Basics of Monte Carlo Simulations (http://www.chem.unl.edu/zeng/joy/mclab/mcintro.html), University of Nebraska-Lincoln
Introduction to Monte Carlo simulation (http://office.microsoft.com/en-us/excel-help/introduction-to-monte-carlo-simulation-HA010282777.aspx) (for Microsoft Excel), Wayne L. Winston
Monte Carlo Methods - Overview and Concept (http://www.brighton-webs.co.uk/montecarlo/concept.asp), brighton-webs.co.uk
Molecular Monte Carlo Intro (http://www.cooper.edu/engineering/chemechem/monte.html), Cooper Union
Monte Carlo techniques applied in physics (http://www.princeton.edu/~achremos/Applet1-page.htm)
Risk Analysis in Investment Appraisal (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=265905), The Application of Monte Carlo Methodology in Project Appraisal, Savvakis C. Savvides
Monte Carlo Method Example (http://waqqasfarooq.com/waqqasfarooq/index.php?option=com_content&view=article&id=47:monte-carlo&catid=34:statistics&Itemid=53), a step-by-step guide to creating a Monte Carlo Excel spreadsheet
Pricing using Monte Carlo simulation (http://knol.google.com/k/giancarlo-vercellino/pricing-using-monte-carlo-simulation/11d5i2rgd9gn5/3#), a practical example, Prof. Giancarlo Vercellino
Approximate And Double Check Probability Problems Using Monte Carlo method (http://orcik.net/programming/approximate-and-double-check-probability-problems-using-monte-carlo-method/) at Orcik Dot Net
Numerical analysis
Babylonian clay tablet YBC 7289 (c. 1800–1600 BC) with annotations. The approximation of the square root of 2 is four sexagesimal figures, which is about six decimal figures: 1 + 24/60 + 51/60² + 10/60³ = 1.41421296...[1] Image by Bill Casselman.[2]
Numerical analysis is the study of algorithms that use
numerical approximation (as opposed to general
symbolic manipulations) for the problems of
continuous mathematics (as distinguished from discrete
mathematics).
One of the earliest mathematical writings is the
Babylonian tablet YBC 7289, which gives a sexagesimal
numerical approximation of √2, the length of the
diagonal in a unit square. Being able to compute the
sides of a triangle (and hence, being able to compute
square roots) is extremely important, for instance, in
carpentry and construction.
[3]
Numerical analysis continues this long tradition of
practical mathematical calculations. Much like the
Babylonian approximation of √2, modern numerical
analysis does not seek exact answers, because exact
answers are often impossible to obtain in practice.
Instead, much of numerical analysis is concerned with
obtaining approximate solutions while maintaining reasonable bounds on errors.
Numerical analysis naturally finds applications in all fields of engineering and the physical sciences, but in the
21st century, the life sciences and even the arts have adopted elements of scientific computation. Ordinary
differential equations appear in the movement of heavenly bodies (planets, stars and galaxies); optimization occurs
in portfolio management; numerical linear algebra is important for data analysis; stochastic differential equations and
Markov chains are essential in simulating living cells for medicine and biology.
Before the advent of modern computers, numerical methods often depended on hand interpolation in large printed
tables. Since the mid 20th century, computers calculate the required functions instead. The interpolation algorithms
nevertheless may be used as part of the software for solving differential equations.
General introduction
The overall goal of the field of numerical analysis is the design and analysis of techniques to give approximate but
accurate solutions to hard problems, the variety of which is suggested by the following.
Advanced numerical methods are essential in making numerical weather prediction feasible.
Computing the trajectory of a spacecraft requires the accurate numerical solution of a system of ordinary
differential equations.
Car companies can improve the crash safety of their vehicles by using computer simulations of car crashes. Such
simulations essentially consist of solving partial differential equations numerically.
Hedge funds (private investment funds) use tools from all fields of numerical analysis to calculate the value of
stocks and derivatives more precisely than other market participants.
Airlines use sophisticated optimization algorithms to decide ticket prices, airplane and crew assignments and fuel
needs. This field is also called operations research.
Insurance companies use numerical programs for actuarial analysis.
The rest of this section outlines several important themes of numerical analysis.
History
The field of numerical analysis predates the invention of modern computers by many centuries. Linear interpolation
was already in use more than 2000 years ago. Many great mathematicians of the past were preoccupied by numerical
analysis, as is obvious from the names of important algorithms like Newton's method, Lagrange interpolation
polynomial, Gaussian elimination, or Euler's method.
To facilitate computations by hand, large books were produced with formulas and tables of data such as interpolation
points and function coefficients. Using these tables, often calculated out to 16 decimal places or more for some
functions, one could look up values to plug into the formulas given and achieve very good numerical estimates of
some functions. The canonical work in the field is the NIST publication edited by Abramowitz and Stegun, a
1000-plus page book of a very large number of commonly used formulas and functions and their values at many
points. The function values are no longer very useful when a computer is available, but the large listing of formulas
can still be very handy.
The mechanical calculator was also developed as a tool for hand computation. These calculators evolved into
electronic computers in the 1940s, and it was then found that these computers were also useful for administrative
purposes. But the invention of the computer also influenced the field of numerical analysis, since now longer and
more complicated calculations could be done.
Direct and iterative methods
Direct vs iterative methods

Consider the problem of solving

3x³ + 4 = 28

for the unknown quantity x.

Direct method:
3x³ + 4 = 28.
Subtract 4:      3x³ = 24.
Divide by 3:     x³ = 8.
Take cube roots: x = 2.

For the iterative method, apply the bisection method to f(x) = 3x³ − 24. The initial values are a = 0, b = 3, f(a) = −24, f(b) = 57.

  a       b       mid      f(mid)
  0       3       1.5      −13.875
  1.5     3       2.25     10.17...
  1.5     2.25    1.875    −4.22...
  1.875   2.25    2.0625   2.32...

We conclude from this table that the solution is between 1.875 and 2.0625. The algorithm might return any number in that range with an error less than 0.2.
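The bisection iteration tabulated above can be sketched in a few lines (a minimal illustration in Python, using the same f(x) = 3x³ − 24 and starting interval [0, 3]; the function name `bisect` is our own):

```python
def bisect(f, a, b, tol):
    """Bisection: repeatedly halve [a, b], keeping the half on which f changes sign."""
    while b - a > tol:
        mid = (a + b) / 2
        if f(a) * f(mid) <= 0:   # the root lies in [a, mid]
            b = mid
        else:                    # the root lies in [mid, b]
            a = mid
    return (a + b) / 2

f = lambda x: 3 * x**3 - 24      # a root of 3x^3 + 4 = 28 is x = 2
root = bisect(f, 0.0, 3.0, 1e-10)
```

Each pass shrinks the bracketing interval by half, exactly as in the table; running until the interval is shorter than the tolerance yields the root to that accuracy.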
Numerical analysis
92
Discretization and numerical integration

In a two hour race, we have measured the speed of the car at three instants and recorded them in the following table.

Time   0:20  1:00  1:40
km/h   140   150   180

A discretization would be to say that the speed of the car was constant from 0:00 to 0:40, then from 0:40 to 1:20 and finally from 1:20 to 2:00. For instance, the total distance traveled in the first 40 minutes is approximately (2/3 h × 140 km/h) = 93.3 km. This would allow us to estimate the total distance traveled as 93.3 km + 100 km + 120 km = 313.3 km, which is an example of numerical integration (see below) using a Riemann sum, because displacement is the integral of velocity.
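The distance estimate above is just a Riemann sum over three equal subintervals; a minimal sketch:

```python
# Speeds measured at 0:20, 1:00 and 1:40; each is treated as constant on one
# 40-minute (2/3 h) subinterval of the two-hour race.
speeds_kmh = [140, 150, 180]
dt_h = 2 / 3
# Riemann sum approximating the integral of velocity over time.
distance_km = sum(v * dt_h for v in speeds_kmh)
```

With finer measurements (shorter subintervals) the same sum converges to the true distance, which is the defining idea of numerical integration.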
Ill-posed problem: Take the function f(x) = 1/(x − 1). Note that f(1.1) = 10 and f(1.001) = 1000: a change in x of less than 0.1 turns into a change in f(x) of nearly 1000. Evaluating f(x) near x = 1 is an ill-conditioned problem.

Well-posed problem: By contrast, the function is continuous and so evaluating it is well-posed, at least for x being not close to zero.
Direct methods compute the solution to a problem in a finite number of steps. These methods would give the precise
answer if they were performed in infinite precision arithmetic. Examples include Gaussian elimination, the QR
factorization method for solving systems of linear equations, and the simplex method of linear programming. In
practice, finite precision is used and the result is an approximation of the true solution (assuming stability).
In contrast to direct methods, iterative methods are not expected to terminate in a number of steps. Starting from an
initial guess, iterative methods form successive approximations that converge to the exact solution only in the limit.
A convergence test is specified in order to decide when a sufficiently accurate solution has (hopefully) been found.
Even using infinite precision arithmetic these methods would not reach the solution within a finite number of steps
(in general). Examples include Newton's method, the bisection method, and Jacobi iteration. In computational matrix
algebra, iterative methods are generally needed for large problems.
Iterative methods are more common than direct methods in numerical analysis. Some methods are direct in principle
but are usually used as though they were not, e.g. GMRES and the conjugate gradient method. For these methods the
number of steps needed to obtain the exact solution is so large that an approximation is accepted in the same manner
as for an iterative method.
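As a concrete sketch of an iterative method for linear systems, the Jacobi iteration below solves a small diagonally dominant system (the matrix and right-hand side are illustrative choices, not taken from the text):

```python
def jacobi(A, b, iters=50):
    """Jacobi iteration: solve equation i for unknown i, using the
    previous iterate for all the other unknowns."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iters):
        x = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
             for i in range(n)]
    return x

# A small diagonally dominant system; its exact solution is x = [2, 1].
A = [[4.0, 1.0],
     [2.0, 5.0]]
b = [9.0, 9.0]
x = jacobi(A, b)
```

Diagonal dominance guarantees convergence here; after a few dozen sweeps the iterate agrees with the exact solution to many digits, even though no finite number of sweeps reaches it exactly.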
Discretization
Furthermore, continuous problems must sometimes be replaced by a discrete problem whose solution is known to
approximate that of the continuous problem; this process is called discretization. For example, the solution of a
differential equation is a function. This function must be represented by a finite amount of data, for instance by its
value at a finite number of points in its domain, even though this domain is a continuum.
The generation and propagation of errors
The study of errors forms an important part of numerical analysis. There are several ways in which error can be
introduced in the solution of the problem.
Round-off
Round-off errors arise because it is impossible to represent all real numbers exactly on a machine with finite memory
(which is what all practical digital computers are).
Truncation and discretization error
Truncation errors are committed when an iterative method is terminated or a mathematical procedure is
approximated, and the approximate solution differs from the exact solution. Similarly, discretization induces a
discretization error because the solution of the discrete problem does not coincide with the solution of the continuous
problem. For instance, in the iteration in the sidebar to compute the solution of 3x³ + 4 = 28, after 10 or so
iterations, we conclude that the root is roughly 1.99 (for example). We therefore have a truncation error of 0.01.
Once an error is generated, it will generally propagate through the calculation. For instance, we have already noted
that the operation + on a calculator (or a computer) is inexact. It follows that a calculation of the type a+b+c+d+e is
even more inexact.
What does it mean when we say that the truncation error is created when we approximate a mathematical procedure?
We know that to integrate a function exactly requires one to find the sum of infinitely many trapezoids. But numerically one
can find the sum of only finitely many trapezoids, and hence the approximation of the mathematical procedure. Similarly, to
differentiate a function, the differential element approaches zero, but numerically we can only choose a finite value
of the differential element.
Numerical stability and well-posed problems
Numerical stability is an important notion in numerical analysis. An algorithm is called numerically stable if an
error, whatever its cause, does not grow to be much larger during the calculation. This happens if the problem is
well-conditioned, meaning that the solution changes by only a small amount if the problem data are changed by a
small amount. To the contrary, if a problem is ill-conditioned, then any small error in the data will grow to be a large
error.
Both the original problem and the algorithm used to solve that problem can be well-conditioned and/or
ill-conditioned, and any combination is possible.
So an algorithm that solves a well-conditioned problem may be either numerically stable or numerically unstable. An art of numerical analysis is to find a stable algorithm for solving a well-posed mathematical problem. For instance, computing the square root of 2 (which is roughly 1.41421) is a well-posed problem. Many algorithms solve this problem by starting with an initial approximation x_1 to √2, for instance x_1 = 1.4, and then computing improved guesses x_2, x_3, etc. One such method is the famous Babylonian method, which is given by x_{k+1} = x_k/2 + 1/x_k. Another iteration, which we will call Method X, is given by x_{k+1} = (x_k² − 2)² + x_k.[4] We have calculated a few iterations of each scheme in table form below, with initial guesses x_1 = 1.4 and x_1 = 1.42.

  Babylonian (x_1 = 1.4)   Babylonian (x_1 = 1.42)     Method X (x_1 = 1.4)       Method X (x_1 = 1.42)
  x_2 = 1.4142857...       x_2 = 1.41422535...         x_2 = 1.4016               x_2 = 1.42026896
  x_3 = 1.414213564...     x_3 = 1.41421356242...      x_3 = 1.4028614...         x_3 = 1.42056...
  ...                      ...                         x_1000000 = 1.41421...     x_28 = 7280.2284...
Observe that the Babylonian method converges fast regardless of the initial guess, whereas Method X converges
extremely slowly with initial guess 1.4 and diverges for initial guess 1.42. Hence, the Babylonian method is
numerically stable, while Method X is numerically unstable.
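The two schemes can be reproduced directly (a short Python sketch; a handful of steps is enough to see the contrast described above):

```python
import math

def babylonian(x, steps):
    """x_{k+1} = x_k/2 + 1/x_k, which converges rapidly to sqrt(2)."""
    for _ in range(steps):
        x = x / 2 + 1 / x
    return x

def method_x(x, steps):
    """x_{k+1} = (x_k^2 - 2)^2 + x_k: a fixed-point iteration whose
    behaviour depends delicately on the starting value."""
    for _ in range(steps):
        x = (x * x - 2) ** 2 + x
    return x
```

A few calls confirm the table: `babylonian(1.4, 3)` already agrees with √2 to about ten digits, `method_x(1.4, 100)` is still creeping upward below √2, and `method_x(1.42, 27)` has blown up into the thousands.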
Numerical stability is also affected by the number of significant digits the machine keeps. Suppose we use a machine that keeps only the first four floating-point digits. A good example of loss of significance is given by these two equivalent functions

f(x) = x(√(x + 1) − √x)  and  g(x) = x / (√(x + 1) + √x).

If we compare the results of

f(500) = 500(√501 − √500) = 500(22.38 − 22.36) = 500(0.02) = 10

and

g(500) = 500/(√501 + √500) = 500/(22.38 + 22.36) = 500/44.74 = 11.17,

we realize that loss of significance (also called subtractive cancellation) has a huge effect on the results, even though both functions are equivalent. To show that they are equivalent, start with f(x) and multiply by (√(x + 1) + √x)/(√(x + 1) + √x):

f(x) = x(√(x + 1) − √x)(√(x + 1) + √x)/(√(x + 1) + √x) = x((x + 1) − x)/(√(x + 1) + √x) = g(x).

The true value for the result is 11.174755..., which is exactly g(500) = 11.1748 after rounding the result to 4 decimal digits. Now imagine that tens of terms like these functions are used in a program; the error will increase as the program proceeds, unless the suitable formula of the two functions is used each time f(x) or g(x) is evaluated, the choice being dependent on the parity of x.
The example is taken from Mathews, Numerical Methods Using MATLAB, 3rd ed.
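The cancellation can be reproduced by rounding every intermediate result to four significant digits. This sketch assumes the two equivalent expressions are f(x) = x(√(x+1) − √x) and g(x) = x/(√(x+1) + √x), consistent with the true value 11.174755... quoted above; the rounding helper `r4` mimics the four-digit machine:

```python
import math

def r4(x):
    """Round x to four significant digits, mimicking the 4-digit machine."""
    if x == 0:
        return 0.0
    return round(x, 3 - math.floor(math.log10(abs(x))))

def f(x):  # x(sqrt(x+1) - sqrt(x)): suffers subtractive cancellation
    return r4(r4(x) * r4(r4(math.sqrt(x + 1)) - r4(math.sqrt(x))))

def g(x):  # x/(sqrt(x+1) + sqrt(x)): algebraically equivalent, no cancellation
    return r4(r4(x) / r4(r4(math.sqrt(x + 1)) + r4(math.sqrt(x))))
```

On the four-digit machine, f(500) collapses to 10 while g(500) stays within a fraction of a percent of the true value 11.174755..., which is exactly the effect described above.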
Areas of study
The field of numerical analysis is divided into different disciplines according to the problem that is to be solved.
Computing values of functions
Interpolation: We have observed the temperature to vary from 20 degrees Celsius at 1:00 to 14 degrees at 3:00. A linear interpolation of this data
would conclude that it was 17 degrees at 2:00 and 18.5 degrees at 1:30.
Extrapolation: If the gross domestic product of a country has been growing an average of 5% per year and was 100 billion dollars last year, we
might extrapolate that it will be 105 billion dollars this year.
Regression: In linear regression, given n points, we compute a line that passes as close as possible to those n points.
Optimization: Say you sell lemonade at a lemonade stand, and notice that at $1, you can sell 197 glasses of lemonade per day, and that for each
increase of $0.01, you will sell one less lemonade per day. If you could charge $1.485, you would maximize your profit, but due to the constraint of
having to charge a whole cent amount, charging $1.49 per glass will yield the maximum income of $220.52 per day.
Differential equation: If you set up 100 fans to blow air from one end of the room to the other and then you drop a feather into the wind, what
happens? The feather will follow the air currents, which may be very complex. One approximation is to measure the speed at which the air is
blowing near the feather every second, and advance the simulated feather as if it were moving in a straight line at that same speed for one second,
before measuring the wind speed again. This is called the Euler method for solving an ordinary differential equation.
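The Euler step described above can be sketched directly. Here a toy problem y′ = −y with y(0) = 1, whose exact solution at t = 1 is e⁻¹, stands in for the wind field (the problem choice is illustrative, not from the text):

```python
import math

def euler(deriv, y0, t_end, n_steps):
    """Euler's method: repeatedly step in a straight line at the current slope."""
    h = t_end / n_steps
    y = y0
    for _ in range(n_steps):
        y = y + h * deriv(y)
    return y

# y' = -y, y(0) = 1; exact value at t = 1 is e^{-1} = 0.36788...
approx = euler(lambda y: -y, y0=1.0, t_end=1.0, n_steps=1000)
```

Halving the step size roughly halves the error, reflecting the first-order accuracy of Euler's method.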
One of the simplest problems is the evaluation of a function at a given point. The most straightforward approach, of
just plugging the number into the formula, is sometimes not very efficient. For polynomials, a better approach is
using the Horner scheme, since it reduces the necessary number of multiplications and additions. Generally, it is
important to estimate and control round-off errors arising from the use of floating point arithmetic.
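A minimal sketch of the Horner scheme: each step folds in one coefficient with a single multiplication and addition.

```python
def horner(coeffs, x):
    """Evaluate a_n x^n + ... + a_0, given [a_n, ..., a_0], with one
    multiplication and one addition per coefficient."""
    result = 0.0
    for a in coeffs:
        result = result * x + a
    return result

# 3x^3 + 0x^2 + 0x + 4 at x = 2, matching the earlier equation 3x^3 + 4 = 28.
value = horner([3, 0, 0, 4], 2)
```

For a degree-n polynomial this uses n multiplications, versus roughly 2n for naive term-by-term evaluation.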
Interpolation, extrapolation, and regression
Interpolation solves the following problem: given the value of some unknown function at a number of points, what
value does that function have at some other point between the given points?
Extrapolation is very similar to interpolation, except that now we want to find the value of the unknown function at a
point which is outside the given points.
Regression is also similar, but it takes into account that the data is imprecise. Given some points, and a measurement
of the value of some function at these points (with an error), we want to determine the unknown function. The
least-squares method is one popular way to achieve this.
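For a straight-line fit, the least-squares method has a closed form; a small sketch with illustrative data (the points are our own, scattered around y = 2x + 1):

```python
def fit_line(xs, ys):
    """Least-squares fit of y = m*x + c via the closed-form normal equations."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    m = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    c = my - m * mx
    return m, c

# Noisy points roughly on y = 2x + 1.
m, c = fit_line([0.0, 1.0, 2.0, 3.0], [1.1, 2.9, 5.2, 6.8])
```

The fitted slope and intercept minimize the sum of squared vertical distances from the points to the line.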
Solving equations and systems of equations
Another fundamental problem is computing the solution of some given equation. Two cases are commonly
distinguished, depending on whether the equation is linear or not. For instance, the equation is linear
while is not.
Much effort has been put in the development of methods for solving systems of linear equations. Standard direct
methods, i.e., methods that use some matrix decomposition are Gaussian elimination, LU decomposition, Cholesky
decomposition for symmetric (or Hermitian) positive-definite matrices, and QR decomposition for non-square
matrices. Iterative methods such as the Jacobi method, Gauss–Seidel method, successive over-relaxation and
conjugate gradient method are usually preferred for large systems.
Root-finding algorithms are used to solve nonlinear equations (they are so named since a root of a function is an
argument for which the function yields zero). If the function is differentiable and the derivative is known, then
Newton's method is a popular choice. Linearization is another technique for solving nonlinear equations.
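A sketch of Newton's method applied to the earlier equation 3x³ + 4 = 28, i.e. f(x) = 3x³ − 24 with derivative f′(x) = 9x²:

```python
def newton(f, fprime, x, steps=20):
    """Newton's method: x_{k+1} = x_k - f(x_k)/f'(x_k)."""
    for _ in range(steps):
        x = x - f(x) / fprime(x)
    return x

root = newton(lambda x: 3 * x**3 - 24, lambda x: 9 * x**2, x=3.0)
```

Starting from x = 3, the iterates converge quadratically to the root x = 2, reaching machine precision in a handful of steps, far faster than the bisection table shown earlier.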
Solving eigenvalue or singular value problems
Several important problems can be phrased in terms of eigenvalue decompositions or singular value decompositions. For instance, the spectral image compression algorithm[5] is based on the singular value decomposition. The corresponding tool in statistics is called principal component analysis.
Optimization
Optimization problems ask for the point at which a given function is maximized (or minimized). Often, the point
also has to satisfy some constraints.
The field of optimization is further split in several subfields, depending on the form of the objective function and the
constraint. For instance, linear programming deals with the case that both the objective function and the constraints
are linear. A famous method in linear programming is the simplex method.
The method of Lagrange multipliers can be used to reduce optimization problems with constraints to unconstrained
optimization problems.
Evaluating integrals
Numerical integration, in some instances also known as numerical quadrature, asks for the value of a definite
integral. Popular methods use one of the Newton–Cotes formulas (like the midpoint rule or Simpson's rule) or
Gaussian quadrature. These methods rely on a "divide and conquer" strategy, whereby an integral on a relatively
large set is broken down into integrals on smaller sets. In higher dimensions, where these methods become
prohibitively expensive in terms of computational effort, one may use Monte Carlo or quasi-Monte Carlo methods
(see Monte Carlo integration), or, in modestly large dimensions, the method of sparse grids.
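Monte Carlo integration estimates an integral as the average of the integrand at random points; a one-dimensional sketch for ∫₀¹ x² dx = 1/3 (the same idea carries over unchanged to high dimensions):

```python
import random

def mc_integrate(f, n, seed=0):
    """Estimate the integral of f over [0, 1] as the average of f at n
    uniformly random points; the error decreases like 1/sqrt(n)."""
    rng = random.Random(seed)
    return sum(f(rng.random()) for _ in range(n)) / n

estimate = mc_integrate(lambda x: x * x, 100_000)   # true value: 1/3
```

The 1/√n error rate is independent of dimension, which is precisely why Monte Carlo methods remain usable where Newton–Cotes grids become prohibitively expensive.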
Differential equations
Numerical analysis is also concerned with computing (in an approximate way) the solution of differential equations,
both ordinary differential equations and partial differential equations.
Partial differential equations are solved by first discretizing the equation, bringing it into a finite-dimensional
subspace. This can be done by a finite element method, a finite difference method, or (particularly in engineering) a
finite volume method. The theoretical justification of these methods often involves theorems from functional
analysis. This reduces the problem to the solution of an algebraic equation.
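A minimal sketch of this reduction: the boundary value problem u″ = −π² sin(πx), u(0) = u(1) = 0, whose exact solution is sin(πx), becomes a tridiagonal algebraic system under finite differences, solved here directly by the Thomas algorithm (the problem and helper names are illustrative choices):

```python
import math

def solve_bvp(n):
    """Finite differences for u'' = -pi^2 sin(pi x), u(0) = u(1) = 0,
    on n interior points: the discretized second derivative gives the
    tridiagonal system  -u_{i-1} + 2u_i - u_{i+1} = h^2 pi^2 sin(pi x_i),
    solved by forward elimination and back substitution (Thomas algorithm)."""
    h = 1.0 / (n + 1)
    xs = [(i + 1) * h for i in range(n)]
    d = [h * h * math.pi ** 2 * math.sin(math.pi * x) for x in xs]
    cp = [0.0] * n   # modified superdiagonal entries
    dp = [0.0] * n   # modified right-hand side
    cp[0] = -1.0 / 2.0
    dp[0] = d[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + cp[i - 1]              # 2 - (-1) * cp[i-1]
        cp[i] = -1.0 / denom
        dp[i] = (d[i] + dp[i - 1]) / denom   # (d_i - (-1) * dp[i-1]) / denom
    u = [0.0] * n
    u[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        u[i] = dp[i] - cp[i] * u[i + 1]
    return xs, u

xs, u = solve_bvp(99)   # 99 interior points, h = 0.01
max_err = max(abs(ui - math.sin(math.pi * xi)) for xi, ui in zip(xs, u))
```

With h = 0.01 the computed solution matches sin(πx) to about 10⁻⁴ everywhere, reflecting the O(h²) accuracy of the central difference approximation.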
Software
Since the late twentieth century, most algorithms are implemented in a variety of programming languages. The
Netlib repository contains various collections of software routines for numerical problems, mostly in Fortran and C.
Commercial products implementing many different numerical algorithms include the IMSL and NAG libraries; a
free alternative is the GNU Scientific Library.
There are several popular numerical computing applications such as MATLAB, S-PLUS, LabVIEW, and IDL as
well as free and open source alternatives such as FreeMat, Scilab, GNU Octave (similar to Matlab), IT++ (a C++
library), R (similar to S-PLUS) and certain variants of Python. Performance varies widely: while vector and matrix
operations are usually fast, scalar loops may vary in speed by more than an order of magnitude.[6][7]
Many computer algebra systems such as Mathematica also benefit from the availability of arbitrary precision
arithmetic which can provide more accurate results.
Also, any spreadsheet software can be used to solve simple problems relating to numerical analysis.
See also
Scientific computing
List of numerical analysis software
List of numerical analysis topics
Gram-Schmidt process
Numerical differentiation
Symbolic-numeric computation
Analysis of algorithms
Numerical Recipes
Notes
[1] Photograph, illustration, and description of the root(2) tablet from the Yale Babylonian Collection (http://it.stlawu.edu/~dmelvill/mesomath/tablets/YBC7289.html)
[2] YBC 7289, Bill Casselman (http://www.math.ubc.ca/~cass/Euclid/ybc/ybc.html)
[3] The New Zealand Qualification authority specifically mentions this skill in document 13004 version 2, dated 17 October 2003, titled CARPENTRY THEORY: Demonstrate knowledge of setting out a building (http://www.nzqa.govt.nz/nqfdocs/units/pdf/13004.pdf)
[4] This is a fixed point iteration for the equation x = (x² − 2)² + x, whose solutions include √2. The iterates always move to the right since (x² − 2)² ≥ 0. Hence x_1 = 1.4 converges and x_1 = 1.42 diverges.
[5] The Singular Value Decomposition and Its Applications in Image Compression (http://online.redwoods.cc.ca.us/instruct/darnold/maw/single.htm)
[6] Speed comparison of various number crunching packages (http://www.sciviews.org/benchmark/)
[7] Comparison of mathematical programs for data analysis (http://www.scientificweb.com/ncrunch/ncrunch5.pdf) Stefan Steinhaus, ScientificWeb.com
References
Gilat, Amos (2004). MATLAB: An Introduction with Applications (2nd ed.). John Wiley & Sons. ISBN 0-471-69420-7.
Hildebrand, F. B. (1974). Introduction to Numerical Analysis (2nd ed.). McGraw-Hill. ISBN 0-070-28761-9.
Leader, Jeffery J. (2004). Numerical Analysis and Scientific Computation. Addison Wesley. ISBN 0-201-73499-0.
Trefethen, Lloyd N. (2006). "Numerical analysis" (http://web.comlab.ox.ac.uk/oucl/work/nick.trefethen/NAessay.pdf), 20 pages. In: Timothy Gowers and June Barrow-Green (editors), The Princeton Companion to Mathematics, Princeton University Press.
External links
Journals
Numerische Mathematik (http://www-gdz.sub.uni-goettingen.de/cgi-bin/digbib.cgi?PPN362160546), volumes 1-66, Springer, 1959-1994 (searchable; pages are images). (English) (German)
Numerische Mathematik at SpringerLink (http://www.springerlink.com/content/0029-599X), volumes 1-112, Springer, 1959–2009
SIAM Journal on Numerical Analysis (http://siamdl.aip.org/dbt/dbt.jsp?KEY=SJNAAM), volumes 1-47, SIAM, 1964–2009
Software and Code
Lists of free software for scientific computing and numerical analysis (http://norma.mas.ecp.fr/wikimas/ScientificComputingSoftware) (English) (French)
Numerical methods for Fortran programmers (http://people.sc.fsu.edu/~tomek/Fortran/num_meth.html)
Java Number Cruncher (http://www.apropos-logic.com/nc/) features free, downloadable code samples that graphically illustrate common numerical algorithms
Excel Implementations (http://www.ifh.uni-karlsruhe.de/people/fenton/Lectures.html)
Several Numerical Mathematical Utilities (in Javascript) (http://www.akiti.ca/Mathfxns.html)
Online Texts
Numerical Recipes (http://www.nr.com/oldverswitcher.html), William H. Press (free, downloadable previous editions)
First Steps in Numerical Analysis (http://kr.cs.ait.ac.th/~radok/math/mat7/stepsa.htm#Numerical Analysis), R. J. Hosking, S. Joe, D. C. Joyce, and J. C. Turner
Numerical Analysis for Engineering (http://ece.uwaterloo.ca/~dwharder/NumericalAnalysis/), D. W. Harder
CSEP (Computational Science Education Project) (http://www.phy.ornl.gov/csep/CSEP/TEXTOC.html), U.S. Department of Energy
Online Course Material
Numerical Methods (http://www.damtp.cam.ac.uk/user/fdl/people/sd103/lectures/nummeth98/index.htm#L_1_Title_Page), Stuart Dalziel, University of Cambridge
Lectures on Numerical Analysis (http://www.math.upenn.edu/~wilf/DeturckWilf.pdf), Dennis Deturck and Herbert S. Wilf, University of Pennsylvania
Numerical methods (http://www.ifh.uni-karlsruhe.de/people/fenton/LectureNotes/Numerical-Methods.pdf), John D. Fenton, University of Karlsruhe
Numerical Methods for Science, Technology, Engineering and Mathematics (http://numericalmethods.eng.usf.edu/), Autar Kaw, University of South Florida
Numerical Analysis Project (http://math.fullerton.edu/mathews/numerical.html), John H. Mathews, California State University, Fullerton
Numerical Methods - Online Course (http://www.math.jct.ac.il/~naiman/nm/), Aaron Naiman, Jerusalem College of Technology
Numerical Methods for Physicists (http://www-teaching.physics.ox.ac.uk/computing/NumericalMethods/NMfP.pdf), Anthony O'Hare, Oxford University
Lectures in Numerical Analysis (http://kr.cs.ait.ac.th/~radok/math/mat7/stepsa.htm#Numerical Analysis), R. Radok, Mahidol University
Introduction to Numerical Analysis for Engineering (http://ocw.mit.edu/OcwWeb/Mechanical-Engineering/2-993JSpring-2005/CourseHome/), Henrik Schmidt, Massachusetts Institute of Technology
Real analysis
Real analysis, or the theory of functions of a real variable, is a branch of mathematical analysis dealing with the set of
real numbers. In particular, it deals with the analytic properties of real functions and sequences, including
convergence and limits of sequences of real numbers, the calculus of the real numbers, and continuity, smoothness
and related properties of real-valued functions.
Scope
Real analysis is an area of analysis which studies concepts such as sequences and their limits, continuity,
differentiation, integration and sequences of functions. By definition, real analysis focuses on the real numbers, often
including positive or negative infinity.
Order properties of the real numbers
The real numbers have several important lattice-theoretic properties that are absent in the complex numbers. Most
importantly, the real numbers form an ordered field, in which addition and multiplication preserve positivity.
Moreover, the ordering of the real numbers is total, and the real numbers have the least upper bound property. These
order-theoretic properties lead to a number of important results in real analysis, such as the monotone convergence
theorem, the intermediate value theorem and the mean value theorem.
However, while the results in real analysis are stated for real numbers, many of these results can be generalized to
other mathematical objects. In particular, many ideas in functional analysis and operator theory generalize properties
of the real numbers; such generalizations include the theories of Riesz spaces and positive operators. Also,
mathematicians consider real and imaginary parts of complex sequences, or pointwise evaluation of operator
sequences.
Relation to complex analysis
Real analysis is closely related to complex analysis, which studies broadly the same properties of complex numbers.
In complex analysis, it is natural to define differentiation via holomorphic functions, which have a number of useful
properties, such as repeated differentiability, expressability as power series, and satisfying the Cauchy integral
formula.
However, in real analysis, it is usually more natural to consider differentiable, smooth, or harmonic functions, which
are more widely applicable, but may lack some more powerful properties of holomorphic functions. Also, results
such as the fundamental theorem of algebra are simpler when expressed in terms of complex numbers.
Techniques from the theory of analytic functions of a complex variable are often used in real analysis, such as the
evaluation of real integrals by residue calculus.
Key concepts
The foundation of real analysis is the construction of the real numbers from the rational numbers, usually either by
Dedekind cuts, or by completion of Cauchy sequences. Key concepts in real analysis are real sequences and their
limits, continuity, differentiation, and integration. Real analysis is also used as a starting point for other areas of
analysis, such as complex analysis, functional analysis, and harmonic analysis, as well as motivating the
development of topology, and as a tool in other areas, such as applied mathematics.
Important results include the Bolzano-Weierstrass and Heine-Borel theorems, the intermediate value theorem and
mean value theorem, the fundamental theorem of calculus, and the monotone convergence theorem.
Various ideas from real analysis can be generalized from real space to general metric spaces, as well as to measure
spaces, Banach spaces, and Hilbert spaces.
See also
List of real analysis topics
Complex analysis
Real analysis on time scales - a unification of real analysis with calculus of finite differences
Bibliography
Aliprantis, Charalambos D.; Burkinshaw, Owen (1998). Principles of Real Analysis (3rd ed.). Academic. ISBN 0-12-050257-7.
Browder, Andrew (1996). Mathematical Analysis: An Introduction. Undergraduate Texts in Mathematics. New York: Springer-Verlag. ISBN 0-387-94614-4.
Bartle, Robert G.; Sherbert, Donald R. (2000). Introduction to Real Analysis (3rd ed.). New York: John Wiley and Sons. ISBN 0-471-32148-6.
Abbott, Stephen (2001). Understanding Analysis. Undergraduate Texts in Mathematics. New York: Springer-Verlag. ISBN 0-387-95060-5.
Rudin, Walter. Principles of Mathematical Analysis. Walter Rudin Student Series in Advanced Mathematics (3rd ed.). McGraw-Hill. ISBN 978-0070542358.
Dangello, Frank; Seyfried, Michael (1999). Introductory Real Analysis. Brooks Cole. ISBN 978-0395959336.
Bressoud, David (2007). A Radical Approach to Real Analysis. MAA. ISBN 0-88385-747-2.
External links
Analysis WebNotes [1] by John Lindsay Orr
Interactive Real Analysis [2] by Bert G. Wachsmuth
A First Analysis Course [3] by John O'Connor
Mathematical Analysis I [4] by Elias Zakon
Mathematical Analysis II [5] by Elias Zakon
Trench, William F. (2003). Introduction to Real Analysis [6]. Prentice Hall. ISBN 978-0-13-045786-8
Earliest Known Uses of Some of the Words of Mathematics: Calculus & Analysis [7]
Basic Analysis: Introduction to Real Analysis [8] by Jiri Lebl
References
[1] http://www.math.unl.edu/~webnotes/contents/chapters.htm
[2] http://www.mathcs.org/analysis/reals/index.html
[3] http://www-groups.mcs.st-andrews.ac.uk/~john/analysis/index.html
[4] http://www.trillia.com/zakon-analysisI.html
[5] http://www.trillia.com/zakon-analysisII.html
[6] http://ramanujan.math.trinity.edu/wtrench/texts/TRENCH_REAL_ANALYSIS.PDF
[7] http://www.economics.soton.ac.uk/staff/aldrich/Calculus%20and%20Analysis%20Earliest%20Uses.htm
[8] http://www.jirka.org/ra/
Partial differential equation
A visualisation of a solution to the heat equation
on a two-dimensional plane
In mathematics, partial differential equations (PDE) are a type of
differential equation, i.e., a relation involving an unknown function (or
functions) of several independent variables and their partial derivatives
with respect to those variables. Partial differential equations are used to
formulate, and thus aid the solution of, problems involving functions of
several variables; such as the propagation of sound or heat,
electrostatics, electrodynamics, fluid flow, and elasticity. Seemingly
distinct physical phenomena may have identical mathematical
formulations, and thus be governed by the same underlying dynamic.
They find their generalization in Stochastic partial differential
equations. Just as ordinary differential equations often model
dynamical systems, partial differential equations often model multidimensional systems.
Introduction
A partial differential equation (PDE) for the function u(x₁, …, xₙ) is of the form
F(x₁, …, xₙ; u, u_{x₁}, …, u_{xₙ}; u_{x₁x₁}, …) = 0.
F is a linear function of u and its derivatives if, by replacing u with v + w, F can be written as F[v] + F[w], and
if, by replacing u with ku, F can be written as kF[u].
If F is a linear function of u and its derivatives, then the PDE is linear. Common examples of linear PDEs include the
heat equation, the wave equation and Laplace's equation.
A relatively simple PDE is
∂u/∂x (x, y) = 0.
This relation implies that the function u(x,y) is independent of x. Hence the general solution of this equation is
u(x, y) = f(y),
where f is an arbitrary function of y. The analogous ordinary differential equation is
du/dx (x) = 0,
which has the solution
u(x) = c,
where c is any constant value (independent of x). These two examples illustrate that general solutions of ordinary
differential equations (ODEs) involve arbitrary constants, but solutions of PDEs involve arbitrary functions. A
solution of a PDE is generally not unique; additional conditions must generally be specified on the boundary of the
region where the solution is defined. For instance, in the simple example above, the function u(x, y) can be
determined if u is specified on the line x = 0.
Existence and uniqueness
Although the issue of the existence and uniqueness of solutions of ordinary differential equations has a very
satisfactory answer with the Picard–Lindelöf theorem, that is far from the case for partial differential equations.
There is a general theorem (the Cauchy–Kowalevski theorem) that states that the Cauchy problem for any partial
differential equation that is analytic in the unknown function and its derivatives has a unique analytic solution.
Although this result might appear to settle the existence and uniqueness of solutions, there are examples of linear
partial differential equations whose coefficients have derivatives of all orders (which are nevertheless not analytic)
but which have no solutions at all: see Lewy (1957). Even if the solution of a partial differential equation exists and
is unique, it may nevertheless have undesirable properties. The mathematical study of these questions is usually in
the more powerful context of weak solutions.
An example of pathological behavior is the sequence of Cauchy problems (depending upon n) for the Laplace
equation
∂²u/∂x² + ∂²u/∂y² = 0,
with initial conditions
u(x, 0) = 0,  ∂u/∂y (x, 0) = sin(nx)/n,
where n is an integer. The derivative of u with respect to y approaches 0 uniformly in x as n increases, but the
solution is
u(x, y) = sinh(ny) sin(nx) / n².
This solution approaches infinity if nx is not an integer multiple of π for any non-zero value of y. The Cauchy
problem for the Laplace equation is called ill-posed or not well posed, since the solution does not depend
continuously upon the data of the problem. Such ill-posed problems are not usually satisfactory for physical
applications.
Notation
In PDEs, it is common to denote partial derivatives using subscripts. That is:
u_x = ∂u/∂x,  u_xx = ∂²u/∂x²,  u_xy = ∂²u/(∂y ∂x).
Especially in (mathematical) physics, one often prefers the use of del (which in cartesian coordinates is written
∇ = (∂/∂x, ∂/∂y, ∂/∂z)) for spatial derivatives and a dot for time derivatives. For example, the wave equation
(described below) can be written as
ü = c²∇²u (physics notation),
or
u_tt = c²Δu (math notation), where Δ is the Laplace operator. This often leads to misunderstandings
regarding the Δ (delta) operator.
Examples
Heat equation in one space dimension
The equation for conduction of heat in one dimension for a homogeneous body has the form
u_t = α u_xx,
where u(t,x) is temperature, and α is a positive constant that describes the rate of diffusion. The Cauchy problem for
this equation consists in specifying u(0, x) = f(x), where f(x) is an arbitrary function.
General solutions of the heat equation can be found by the method of separation of variables. Some examples appear
in the heat equation article. They are examples of Fourier series for periodic f and Fourier transforms for
non-periodic f. Using the Fourier transform, a general solution of the heat equation has the form
u(t, x) = (1/√(2π)) ∫ F(ξ) exp(iξx − αξ²t) dξ,
where F is an arbitrary function. To satisfy the initial condition, F is given by the Fourier transform of f, that is
F(ξ) = (1/√(2π)) ∫ f(x) exp(−iξx) dx.
If f represents a very small but intense source of heat, then the preceding integral can be approximated by the delta
distribution, multiplied by the strength of the source. For a source whose strength is normalized to 1, the result is
F(ξ) = 1/√(2π),
and the resulting solution of the heat equation is
u(t, x) = (1/2π) ∫ exp(iξx − αξ²t) dξ.
This is a Gaussian integral. It may be evaluated to obtain
u(t, x) = (1/√(4παt)) exp(−x²/(4αt)).
This result corresponds to a normal probability density for x with mean 0 and variance 2αt. The heat equation and
similar diffusion equations are useful tools to study random phenomena.
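As an illustrative sketch (not part of the original article), the Gaussian kernel above can be checked numerically: the helper below (our own function names, assuming the diffusion constant α as written above) evaluates the kernel and confirms by finite differences that it satisfies u_t = α u_xx.

```python
import math

def heat_kernel(t, x, alpha=1.0):
    """Fundamental solution of u_t = alpha * u_xx: a Gaussian with mean 0, variance 2*alpha*t."""
    return math.exp(-x * x / (4.0 * alpha * t)) / math.sqrt(4.0 * math.pi * alpha * t)

def residual(t, x, alpha=1.0, h=1e-3):
    """Centered finite-difference estimate of u_t - alpha*u_xx at (t, x); should be ~0."""
    u_t = (heat_kernel(t + h, x, alpha) - heat_kernel(t - h, x, alpha)) / (2.0 * h)
    u_xx = (heat_kernel(t, x + h, alpha) - 2.0 * heat_kernel(t, x, alpha)
            + heat_kernel(t, x - h, alpha)) / (h * h)
    return u_t - alpha * u_xx
```

For α = 1 and t = 0.5 the kernel is exactly the standard normal density, as the variance formula 2αt predicts.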
Wave equation in one spatial dimension
The wave equation is an equation for an unknown function u(t, x) of the form
u_tt = c² u_xx.
Here u might describe the displacement of a stretched string from equilibrium, or the difference in air pressure in a
tube, or the magnitude of an electromagnetic field in a tube, and c is a number that corresponds to the velocity of the
wave. The Cauchy problem for this equation consists in prescribing the initial displacement and velocity of a string
or other medium:
u(0, x) = f(x),  u_t(0, x) = g(x),
where f and g are arbitrary given functions. The solution of this problem is given by d'Alembert's formula:
u(t, x) = ½[f(x − ct) + f(x + ct)] + (1/2c) ∫ from x−ct to x+ct of g(y) dy.
This formula implies that the solution at (t,x) depends only upon the data on the segment of the initial line that is cut
out by the characteristic curves
x − ct = constant,  x + ct = constant,
that are drawn backwards from that point. These curves correspond to signals that propagate with velocity c forward
and backward. Conversely, the influence of the data at any given point on the initial line propagates with the finite
velocity c: there is no effect outside a triangle through that point whose sides are characteristic curves. This behavior
is very different from the solution for the heat equation, where the effect of a point source appears (with small
amplitude) instantaneously at every point in space. The solution given above is also valid if t is negative, and the
explicit formula shows that the solution depends smoothly upon the data: both the forward and backward Cauchy
problems for the wave equation are well-posed.
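A direct transcription of d'Alembert's formula (a sketch, not from the original article; the function names are ours and the integral is handled by a simple midpoint rule):

```python
import math

def dalembert(f, g, t, x, c=1.0, n=1000):
    """u(t,x) = (f(x-ct) + f(x+ct))/2 + (1/2c) * integral of g over [x-ct, x+ct]."""
    a, b = x - c * t, x + c * t
    h = (b - a) / n
    integral = sum(g(a + (i + 0.5) * h) for i in range(n)) * h  # midpoint rule
    return 0.5 * (f(a) + f(b)) + integral / (2.0 * c)
```

With f = sin and g = 0, the formula reduces to ½[sin(x − ct) + sin(x + ct)] = sin(x) cos(ct), a standing wave.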
Spherical waves
Spherical waves are waves whose amplitude depends only upon the radial distance r from a central point source. For
such waves, the three-dimensional wave equation takes the form
u_tt = c² [u_rr + (2/r) u_r].
This is equivalent to
(ru)_tt = c² (ru)_rr,
and hence the quantity ru satisfies the one-dimensional wave equation. Therefore a general solution for spherical
waves has the form
u(t, r) = (1/r) [F(r − ct) + G(r + ct)],
where F and G are completely arbitrary functions. Radiation from an antenna corresponds to the case where G is
identically zero. Thus the wave form transmitted from an antenna has no distortion in time: the only distorting factor
is 1/r. This feature of undistorted propagation of waves is not present if there are two spatial dimensions.
Laplace equation in two dimensions
The Laplace equation for an unknown function u of two variables has the form
u_xx + u_yy = 0.
Solutions of Laplace's equation are called harmonic functions.
Connection with holomorphic functions
Solutions of the Laplace equation in two dimensions are intimately connected with analytic functions of a complex
variable (a.k.a. holomorphic functions): the real and imaginary parts of any analytic function are conjugate
harmonic functions: they both satisfy the Laplace equation, and their gradients are orthogonal. If f = u + iv, then the
Cauchy–Riemann equations state that
u_x = v_y,  v_x = −u_y,
and it follows that
u_xx + u_yy = 0,  v_xx + v_yy = 0.
Conversely, given any harmonic function in two dimensions, it is the real part of an analytic function, at least locally.
Details are given in Laplace equation.
A typical boundary value problem
A typical problem for Laplace's equation is to find a solution that satisfies arbitrary values on the boundary of a
domain. For example, we may seek a harmonic function that takes on the values u(θ) on a circle of radius one. The
solution was given by Poisson:
φ(r, θ) = (1/2π) ∫ from 0 to 2π of [(1 − r²) / (1 + r² − 2r cos(θ − θ′))] u(θ′) dθ′.
Petrovsky (1967, p. 248) shows how this formula can be obtained by summing a Fourier series for φ. If r < 1, the
derivatives of φ may be computed by differentiating under the integral sign, and one can verify that φ is analytic,
even if u is continuous but not necessarily differentiable. This behavior is typical for solutions of elliptic partial
differential equations: the solutions may be much more smooth than the boundary data. This is in contrast to
solutions of the wave equation, and more general hyperbolic partial differential equations, which typically have no
more derivatives than the data.
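The Poisson integral above can be evaluated numerically; for boundary data u(θ) = cos θ the harmonic extension into the disk is r cos θ, which the following sketch (our own function names; a plain Riemann sum over the boundary) reproduces:

```python
import math

def poisson_disk(boundary, r, theta, n=2000):
    """Harmonic extension into the unit disk via the Poisson kernel (valid for r < 1)."""
    total = 0.0
    for i in range(n):
        phi = 2.0 * math.pi * i / n
        kernel = (1.0 - r * r) / (1.0 + r * r - 2.0 * r * math.cos(theta - phi))
        total += kernel * boundary(phi)
    return total / n  # equals (1/2π) ∫ kernel * u dφ with dφ = 2π/n
```

Since the kernel integrates to 1, constant boundary data is reproduced exactly, a quick sanity check on the formula.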
Euler–Tricomi equation
The Euler–Tricomi equation, which is used in the investigation of transonic flow, is
u_xx = x u_yy.
Advection equation
The advection equation describes the transport of a conserved scalar ψ in a velocity field u. It is:
ψ_t + ∇·(ψu) = 0.
If the velocity field is solenoidal (that is, ∇·u = 0), then the equation may be simplified to
ψ_t + u·∇ψ = 0.
In the one-dimensional case where u is not constant and is equal to ψ, the equation is referred to as Burgers'
equation.
Ginzburg–Landau equation
The Ginzburg–Landau equation is used in modelling superconductivity. It is
i u_t + p u_xx + q |u|² u = iγu,
where p and q are constants and i is the imaginary unit.
The Dym equation
The Dym equation is named for Harry Dym and occurs in the study of solitons. It is
u_t = u³ u_xxx.
Initial-boundary value problems
Many problems of mathematical physics are formulated as initial-boundary value problems.
Vibrating string
If the string is stretched between two points where x=0 and x=L and u denotes the amplitude of the displacement of
the string, then u satisfies the one-dimensional wave equation in the region where 0<x<L and t is unlimited. Since the
string is tied down at the ends, u must also satisfy the boundary conditions
u(t, 0) = 0,  u(t, L) = 0,
as well as the initial conditions
u(0, x) = f(x),  u_t(0, x) = g(x).
The method of separation of variables for the wave equation
u_tt = c² u_xx
leads to solutions of the form
u(t, x) = T(t) X(x),
where
T″ + k²c² T = 0,  X″ + k² X = 0,
where the constant k must be determined. The boundary conditions then imply that X is a multiple of sin kx, and k
must have the form
k = nπ / L,
where n is an integer. Each term in the sum corresponds to a mode of vibration of the string. The mode with n=1 is
called the fundamental mode, and the frequencies of the other modes are all multiples of this frequency. They form
the overtone series of the string, and they are the basis for musical acoustics. The initial conditions may then be
satisfied by representing f and g as infinite sums of these modes. Wind instruments typically correspond to vibrations
of an air column with one end open and one end closed. The corresponding boundary conditions are
u(t, 0) = 0,  u_x(t, L) = 0.
The method of separation of variables can also be applied in this case, and it leads to a series of odd overtones.
The general problem of this type is solved in Sturm–Liouville theory.
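The modal expansion just described can be sketched in code (not from the original article; our own helper names, assuming zero initial velocity, so each mode oscillates as cos(nπct/L)):

```python
import math

def sine_coeffs(f, L, n_modes, n_grid=2000):
    """Fourier sine coefficients b_n = (2/L) * integral of f(x) sin(n*pi*x/L) dx (midpoint rule)."""
    h = L / n_grid
    coeffs = []
    for n in range(1, n_modes + 1):
        s = sum(f((i + 0.5) * h) * math.sin(n * math.pi * (i + 0.5) * h / L)
                for i in range(n_grid))
        coeffs.append(2.0 * s * h / L)
    return coeffs

def string_solution(coeffs, L, c, t, x):
    """Superposition of standing-wave modes: sum of b_n sin(n*pi*x/L) cos(n*pi*c*t/L)."""
    return sum(b * math.sin(n * math.pi * x / L) * math.cos(n * math.pi * c * t / L)
               for n, b in enumerate(coeffs, start=1))
```

For the initial shape f(x) = sin(πx) on [0, 1], only the fundamental mode survives and the motion is sin(πx) cos(πct).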
Vibrating membrane
If a membrane is stretched over a curve C that forms the boundary of a domain D in the plane, its vibrations are
governed by the wave equation
u_tt = c² (u_xx + u_yy)
if t > 0 and (x,y) is in D. The boundary condition is u(t, x, y) = 0 if (x, y) is on C. The method of separation of
variables leads to the form
u(t, x, y) = T(t) v(x, y),
which in turn must satisfy
v_xx + v_yy + k² v = 0.
The latter equation is called the Helmholtz Equation. The constant k must be determined to allow a non-trivial v to
satisfy the boundary condition on C. Such values of k² are called the eigenvalues of the Laplacian in D, and the
associated solutions are the eigenfunctions of the Laplacian in D. The Sturm–Liouville theory may be extended to
this elliptic eigenvalue problem (Jost, 2002).
Other examples
The Schrödinger equation is a PDE at the heart of non-relativistic quantum mechanics. In the WKB approximation it
is the Hamilton–Jacobi equation.
Except for the Dym equation and the Ginzburg–Landau equation, the above equations are linear in the sense that
they can be written in the form Au = f for a given linear operator A and a given function f. Other important non-linear
equations include the Navier–Stokes equations describing the flow of fluids, and Einstein's field equations of general
relativity.
Also see the list of non-linear partial differential equations.
Classification
Some linear, second-order partial differential equations can be classified as parabolic, hyperbolic or elliptic. Others
such as the Euler–Tricomi equation have different types in different regions. The classification provides a guide to
appropriate initial and boundary conditions, and to smoothness of the solutions.
Equations of second order
Assuming u_xy = u_yx, the general second-order PDE in two independent variables has the form
A u_xx + 2B u_xy + C u_yy + (lower-order terms) = 0,
where the coefficients A, B, C etc. may depend upon x and y. If A² + B² + C² > 0 over a region of the xy plane,
the PDE is second-order in that region. This form is analogous to the equation for a conic section:
A x² + 2B xy + C y² + ··· = 0.
More precisely, replacing ∂/∂x by X, and likewise for other variables (formally this is done by a Fourier transform),
converts a constant-coefficient PDE into a polynomial of the same degree, with the top degree (a homogeneous
polynomial, here a quadratic form) being most significant for the classification.
Just as one classifies conic sections and quadratic forms into parabolic, hyperbolic, and elliptic based on the
discriminant B² − 4AC, the same can be done for a second-order PDE at a given point. However, the
discriminant in a PDE is given by B² − AC due to the convention of the xy term being 2B rather than B;
formally, the discriminant (of the associated quadratic form) is (2B)² − 4AC = 4(B² − AC), with the factor
of 4 dropped for simplicity.
1. B² − AC < 0: solutions of elliptic PDEs are as smooth as the coefficients allow, within the interior of the
region where the equation and solutions are defined. For example, solutions of Laplace's equation are analytic
within the domain where they are defined, but solutions may assume boundary values that are not smooth. The
motion of a fluid at subsonic speeds can be approximated with elliptic PDEs, and the Euler–Tricomi equation is
elliptic where x<0.
2. B² − AC = 0: equations that are parabolic at every point can be transformed into a form analogous to the
heat equation by a change of independent variables. Solutions smooth out as the transformed time variable
increases. The Euler–Tricomi equation has parabolic type on the line where x=0.
3. B² − AC > 0: hyperbolic equations retain any discontinuities of functions or derivatives in the initial data.
An example is the wave equation. The motion of a fluid at supersonic speeds can be approximated with
hyperbolic PDEs, and the Euler–Tricomi equation is hyperbolic where x>0.
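The discriminant test can be written as a small helper (a sketch, not from the original article; the coefficients follow the A·u_xx + 2B·u_xy + C·u_yy convention used above):

```python
def classify_2nd_order(A, B, C):
    """Classify A*u_xx + 2*B*u_xy + C*u_yy + ... = 0 at a point via the discriminant B^2 - A*C."""
    disc = B * B - A * C
    if disc < 0:
        return "elliptic"
    if disc == 0:
        return "parabolic"
    return "hyperbolic"
```

For example, Laplace's equation has (A, B, C) = (1, 0, 1), the heat equation (with t as the second variable) has (1, 0, 0), the wave equation has (1, 0, −c²), and the Euler–Tricomi equation has (1, 0, −x), so its type changes with the sign of x.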
If there are n independent variables x₁, x₂, …, xₙ, a general linear partial differential equation of second order has the
form
Lu = Σᵢ Σⱼ a_{i,j} ∂²u/(∂xᵢ ∂xⱼ) + (lower-order terms) = 0.
The classification depends upon the signature of the eigenvalues of the coefficient matrix.
1. Elliptic: The eigenvalues are all positive or all negative.
2. Parabolic : The eigenvalues are all positive or all negative, save one that is zero.
3. Hyperbolic: There is only one negative eigenvalue and all the rest are positive, or there is only one positive
eigenvalue and all the rest are negative.
4. Ultrahyperbolic: There is more than one positive eigenvalue and more than one negative eigenvalue, and there are
no zero eigenvalues. There is only limited theory for ultrahyperbolic equations (Courant and Hilbert, 1962).
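The signature test above can be sketched as a helper that takes the eigenvalues of the symmetric coefficient matrix (our own naming; the eigenvalues would typically be computed with a routine such as numpy.linalg.eigvalsh):

```python
def classify_by_signature(eigenvalues, tol=1e-12):
    """Classify sum a_ij * u_{x_i x_j} by the signs of the coefficient matrix's eigenvalues."""
    pos = sum(1 for e in eigenvalues if e > tol)
    neg = sum(1 for e in eigenvalues if e < -tol)
    zero = len(eigenvalues) - pos - neg
    n = len(eigenvalues)
    if zero == 0 and (pos == n or neg == n):
        return "elliptic"
    if zero == 1 and (pos == n - 1 or neg == n - 1):
        return "parabolic"
    if zero == 0 and (pos == 1 or neg == 1):
        return "hyperbolic"
    if zero == 0 and pos > 1 and neg > 1:
        return "ultrahyperbolic"
    return "degenerate/other"
```

For instance, the 3-D Laplacian has eigenvalues (1, 1, 1), the heat operator (1, 1, 0), the wave operator (1, −1, −1), and the simplest ultrahyperbolic operator (1, 1, −1, −1).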
Systems of first-order equations and characteristic surfaces
The classification of partial differential equations can be extended to systems of first-order equations, where the
unknown u is now a vector with m components, and the coefficient matrices A_ν are m by m matrices for
ν = 1, …, n. The partial differential equation takes the form
Lu = Σ_{ν=1}^{n} A_ν ∂u/∂x_ν + B = 0,
where the coefficient matrices A_ν and the vector B may depend upon x and u. If a hypersurface S is given in the
implicit form
φ(x₁, x₂, …, xₙ) = 0,
where φ has a non-zero gradient, then S is a characteristic surface for the operator L at a given point if the
characteristic form vanishes:
Q(φ_{x₁}, …, φ_{xₙ}) = det[ Σ_{ν=1}^{n} A_ν ∂φ/∂x_ν ] = 0.
The geometric interpretation of this condition is as follows: if data for u are prescribed on the surface S, then it may
be possible to determine the normal derivative of u on S from the differential equation. If the data on S and the
differential equation determine the normal derivative of u on S, then S is non-characteristic. If the data on S and the
differential equation do not determine the normal derivative of u on S, then the surface is characteristic, and the
differential equation restricts the data on S: the differential equation is internal to S.
1. A first-order system Lu=0 is elliptic if no surface is characteristic for L: the values of u on S and the differential
equation always determine the normal derivative of u on S.
2. A first-order system is hyperbolic at a point if there is a space-like surface S with normal ξ at that point. This
means that, given any non-trivial vector η orthogonal to ξ, and a scalar multiplier λ, the equation
Q(λξ + η) = 0
has m real roots λ₁, λ₂, …, λ_m. The system is strictly hyperbolic if these roots are always distinct. The geometrical
interpretation of this condition is as follows: the characteristic form Q(ζ) = 0 defines a cone (the normal cone) with
homogeneous coordinates ζ. In the hyperbolic case, this cone has m sheets, and the axis ζ = λξ runs inside these
sheets: it does not intersect any of them. But when displaced from the origin by η, this axis intersects every sheet. In
the elliptic case, the normal cone has no real sheets.
Equations of mixed type
If a PDE has coefficients that are not constant, it is possible that it will not belong to any of these categories but
rather be of mixed type. A simple but important example is the Euler–Tricomi equation
u_xx = x u_yy,
which is called elliptic-hyperbolic because it is elliptic in the region x < 0, hyperbolic in the region x > 0, and
degenerate parabolic on the line x = 0.
Infinite-order PDEs in quantum mechanics
Weyl quantization in phase space leads to quantum Hamilton's equations for trajectories of quantum particles. Those
equations are infinite-order PDEs. However, in the semiclassical expansion one has a finite system of ODEs at any
fixed order of ℏ. The equation of evolution of the Wigner function is also an infinite-order PDE. The quantum
trajectories are quantum characteristics with the use of which one can calculate the evolution of the Wigner function.
Analytical methods to solve PDEs
Separation of variables
In the method of separation of variables, one reduces a PDE to a PDE in fewer variables, which is an ODE if it is in
one variable; these are in turn easier to solve.
This is possible for simple PDEs, which are called separable partial differential equations, and the domain is
generally a rectangle (a product of intervals). Separable PDEs correspond to diagonal matrices: thinking of "the
value for fixed x" as a coordinate, each coordinate can be understood separately.
This generalizes to the method of characteristics, and is also used in integral transforms.
Method of characteristics
In special cases, one can find characteristic curves on which the equation reduces to an ODE; changing coordinates
in the domain to straighten these curves allows separation of variables, and is called the method of characteristics.
More generally, one may find characteristic surfaces.
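For the constant-coefficient advection equation ψ_t + c ψ_x = 0, the characteristics are the lines x − ct = constant, and the solution is simply the initial profile carried along them. A minimal sketch (our own names, not from the original article):

```python
import math

def advect(psi0, c, t, x):
    """Solve psi_t + c * psi_x = 0 by tracing the characteristic through (t, x)
    back to the initial line: psi(t, x) = psi0(x - c*t)."""
    return psi0(x - c * t)
```

The initial shape is transported rigidly with speed c, which is exactly the statement that the solution is constant along each characteristic.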
Integral transform
An integral transform may transform the PDE to a simpler one, in particular a separable PDE. This corresponds to
diagonalizing an operator.
An important example of this is Fourier analysis, which diagonalizes the heat equation using the eigenbasis of
sinusoidal waves.
If the domain is finite or periodic, an infinite sum of solutions such as a Fourier series is appropriate, but an integral
of solutions such as a Fourier integral is generally required for infinite domains. The solution for a point source for
the heat equation given above is an example for use of a Fourier integral.
Change of variables
Often a PDE can be reduced to a simpler form with a known solution by a suitable change of variables. For example
the Black–Scholes PDE
∂V/∂t + ½σ²S² ∂²V/∂S² + rS ∂V/∂S − rV = 0
is reducible to the heat equation
∂u/∂τ = ∂²u/∂x²
by a change of variables (for complete details see Solution of the Black Scholes Equation
[1]
)
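The substitution itself is not spelled out here; one common route (a sketch using the usual intermediate constant k = 2r/σ², not taken from this article) is:

```latex
S = K e^{x}, \qquad t = T - \frac{2\tau}{\sigma^{2}}, \qquad V = K\,v(x,\tau)
\;\Longrightarrow\;
\frac{\partial v}{\partial \tau}
  = \frac{\partial^{2} v}{\partial x^{2}} + (k-1)\frac{\partial v}{\partial x} - k v,
\qquad k = \frac{2r}{\sigma^{2}},
```
```latex
v = e^{-\frac{1}{2}(k-1)x - \frac{1}{4}(k+1)^{2}\tau}\, u(x,\tau)
\;\Longrightarrow\;
\frac{\partial u}{\partial \tau} = \frac{\partial^{2} u}{\partial x^{2}}.
```

The exponential prefactor is chosen precisely to cancel the first-order and zeroth-order terms, leaving the pure heat equation.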
Fundamental solution
Inhomogeneous equations can often be solved (and for constant-coefficient PDEs, always solved) by finding the
fundamental solution (the solution for a point source), then taking the convolution with the boundary conditions to
get the solution.
This is analogous in signal processing to understanding a filter by its impulse response.
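In code, the convolution looks like this for the one-dimensional heat equation (a sketch with our own names, using the Gaussian fundamental solution and a discretized convolution with initial data f):

```python
import math

def heat_kernel(t, x, alpha=1.0):
    """Fundamental solution of u_t = alpha * u_xx."""
    return math.exp(-x * x / (4.0 * alpha * t)) / math.sqrt(4.0 * math.pi * alpha * t)

def solve_by_convolution(f, t, x, alpha=1.0, lo=-20.0, hi=20.0, n=4000):
    """u(t, x) = integral of heat_kernel(t, x - y) * f(y) dy, by a midpoint Riemann sum."""
    h = (hi - lo) / n
    return sum(heat_kernel(t, x - (lo + (i + 0.5) * h), alpha) * f(lo + (i + 0.5) * h)
               for i in range(n)) * h
```

A useful check: convolving a Gaussian initial profile with the Gaussian kernel gives another Gaussian whose variance is the sum of the two, mirroring the impulse-response analogy.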
Superposition principle
Because any superposition of solutions of a linear, homogeneous PDE is again a solution, the particular solutions
may then be combined to obtain more general solutions.
Methods for non-linear equations
See also the list of nonlinear partial differential equations.
There are no generally applicable methods to solve non-linear PDEs. Still, existence and uniqueness results (such as
the Cauchy–Kowalevski theorem) are often possible, as are proofs of important qualitative and quantitative
properties of solutions (getting these results is a major part of analysis). Computational methods for nonlinear
PDEs, such as the split-step method, exist for specific equations like the nonlinear Schrödinger equation.
Nevertheless, some techniques can be used for several types of equations. The h-principle is the most powerful
method to solve underdetermined equations. The Riquier–Janet theory is an effective method for obtaining
information about many analytic overdetermined systems.
The method of characteristics (Similarity Transformation method) can be used in some very special cases to solve
partial differential equations.
In some cases, a PDE can be solved via perturbation analysis in which the solution is considered to be a correction to
an equation with a known solution. Alternatives are numerical analysis techniques from simple finite difference
schemes to the more mature multigrid and finite element methods. Many interesting problems in science and
engineering are solved in this way using computers, sometimes high performance supercomputers.
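As a minimal example of the finite-difference approach (a sketch, not from the original article: an explicit forward-time, centered-space scheme for u_t = u_xx on [0, 1] with zero boundary values; the names are ours, and stability requires dt ≤ dx²/2):

```python
import math

def ftcs_heat(u0, t_final, nx=50, dt=1e-4):
    """Explicit FTCS scheme for u_t = u_xx on [0, 1] with u = 0 at both ends."""
    dx = 1.0 / nx
    assert dt <= dx * dx / 2.0, "explicit scheme is unstable for this dt"
    u = [u0(i * dx) for i in range(nx + 1)]
    for _ in range(int(round(t_final / dt))):
        # interior update; boundary values stay pinned at zero
        u = [0.0] + [u[i] + dt * (u[i + 1] - 2.0 * u[i] + u[i - 1]) / (dx * dx)
                     for i in range(1, nx)] + [0.0]
    return u
```

For the initial profile sin(πx), the exact solution decays as exp(−π²t), and the scheme reproduces that decay up to its O(dx²) discretization error.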
Lie Group Methods
From 1870 Sophus Lie's work put the theory of differential equations on a more satisfactory foundation. He showed
that the integration theories of the older mathematicians can, by the introduction of what are now called Lie groups,
be referred to a common source; and that ordinary differential equations which admit the same infinitesimal
transformations present comparable difficulties of integration. He also emphasized the subject of transformations of
contact.
A general approach to solving PDEs uses the symmetry property of differential equations, the continuous
infinitesimal transformations of solutions to solutions (Lie theory). Continuous group theory, Lie algebras and
differential geometry are used to understand the structure of linear and nonlinear partial differential equations for
generating integrable equations, to find their Lax pairs, recursion operators, Bäcklund transforms, and finally to find
exact analytic solutions to the PDE.
Symmetry methods have been recognized to study differential equations arising in mathematics, physics,
engineering, and many other disciplines.
Numerical methods to solve PDEs
The three most widely used numerical methods to solve PDEs are the finite element method (FEM), finite volume
methods (FVM) and finite difference methods (FDM). The FEM has a prominent position among these methods and
especially its exceptionally efficient higher-order version hp-FEM. Other versions of FEM include the generalized
finite element method (GFEM), extended finite element method (XFEM), spectral finite element method (SFEM),
meshfree finite element method, discontinuous Galerkin finite element method (DGFEM), etc.
Almost-solution of PDE
Almost-solution of a PDE is a concept introduced by the Russian mathematician Vladimir Miklyukov in connection
with research on solutions with nonremovable singularities.
See also
Boundary value problem
Difference equation
Laplace transform applied to differential equations
List of dynamical systems and differential equations topics
Matrix differential equation
Ordinary differential equation
Separation of variables
Stochastic partial differential equations
References
Courant, R. & Hilbert, D. (1962), Methods of Mathematical Physics, II, New York: Wiley-Interscience.
Evans, L. C. (1998), Partial Differential Equations, Providence: American Mathematical Society,
ISBN 0-8218-0772-2.
Ibragimov, Nail H. (1993), CRC Handbook of Lie Group Analysis of Differential Equations Vol. 1–3, Providence:
CRC-Press, ISBN 0-8493-4488-3.
John, F. (1982), Partial Differential Equations (4th ed.), New York: Springer-Verlag, ISBN 0-387-90609-6.
Jost, J. (2002), Partial Differential Equations, New York: Springer-Verlag, ISBN 0-387-95428-7.
Lewy, Hans (1957), "An example of a smooth linear partial differential equation without solution", Annals of
Mathematics, 2nd Series 66 (1): 155–158.
Petrovskii, I. G. (1967), Partial Differential Equations, Philadelphia: W. B. Saunders Co.
Pinchover, Y. & Rubinstein, J. (2005), An Introduction to Partial Differential Equations, New York: Cambridge
University Press, ISBN 0-521-84886-5.
Polyanin, A. D. (2002), Handbook of Linear Partial Differential Equations for Engineers and Scientists, Boca
Raton: Chapman & Hall/CRC Press, ISBN 1-58488-299-9.
Polyanin, A. D. & Zaitsev, V. F. (2004), Handbook of Nonlinear Partial Differential Equations, Boca Raton:
Chapman & Hall/CRC Press, ISBN 1-58488-355-3.
Polyanin, A. D.; Zaitsev, V. F. & Moussiaux, A. (2002), Handbook of First Order Partial Differential Equations,
London: Taylor & Francis, ISBN 0-415-27267-X.
Solin, P. (2005), Partial Differential Equations and the Finite Element Method, Hoboken, NJ: J. Wiley & Sons,
ISBN 0-471-72070-4.
Solin, P.; Segeth, K. & Dolezel, I. (2003), Higher-Order Finite Element Methods, Boca Raton: Chapman &
Hall/CRC Press, ISBN 1-58488-438-X.
Zwillinger, D. (1997), Handbook of Differential Equations (3rd ed.), Boston: Academic Press, ISBN 0-12-784395-7.
External links
Partial Differential Equations: Exact Solutions
[2]
at EqWorld: The World of Mathematical Equations.
Partial Differential Equations: Index
[3]
at EqWorld: The World of Mathematical Equations.
Partial Differential Equations: Methods
[4]
at EqWorld: The World of Mathematical Equations.
Example problems with solutions
[5]
at exampleproblems.com
Partial Differential Equations
[6]
at mathworld.wolfram.com
Dispersive PDE Wiki
[7]
NEQwiki, the nonlinear equations encyclopedia
[8]
References
[1] http://web.archive.org/web/20080411030405/http://www.math.unl.edu/~sdunbar1/Teaching/MathematicalFinance/Lessons/BlackScholes/Solution/solution.shtml
[2] http://eqworld.ipmnet.ru/en/pde-en.htm
[3] http://eqworld.ipmnet.ru/en/solutions/eqindex/eqindex-pde.htm
[4] http://eqworld.ipmnet.ru/en/methods/meth-pde.htm
[5] http://www.exampleproblems.com/wiki/index.php?title=Partial_Differential_Equations
[6] http://mathworld.wolfram.com/PartialDifferentialEquation.html
[7] http://tosio.math.toronto.edu/wiki/index.php/Main_Page
[8] http://www.primat.mephi.ru/wiki/
Probability
114
Probability
Probability is a way of expressing knowledge or belief that an event will occur or has occurred. The concept has
been given an exact mathematical meaning in probability theory, which is used extensively in such areas of study as
mathematics, statistics, finance, gambling, science, and philosophy to draw conclusions about the likelihood of
potential events and the underlying mechanics of complex systems.
Interpretations
The word probability does not have a consistent direct definition. In fact, there are two broad categories of
probability interpretations, whose adherents possess different (and sometimes conflicting) views about the
fundamental nature of probability:
1. Frequentists talk about probabilities only when dealing with experiments that are random and well-defined. The
probability of a random event denotes the relative frequency of occurrence of an experiment's outcome, when
repeating the experiment. Frequentists consider probability to be the relative frequency "in the long run" of
outcomes.[1]
2. Bayesians, however, assign probabilities to any statement whatsoever, even when no random process is involved.
Probability, for a Bayesian, is a way to represent an individual's degree of belief in a statement, or an objective
degree of rational belief, given the evidence.
Etymology
The word probability derives from the Latin probabilitas, which can also mean probity, a measure of the authority of
a witness in a legal case in Europe, and often correlated with the witness's nobility. In a sense, this differs much from
the modern meaning of probability, which, in contrast, is used as a measure of the weight of empirical evidence, and
is arrived at from inductive reasoning and statistical inference.[2][3]
History
The scientific study of probability is a modern development. Gambling shows that there has been an interest in
quantifying the ideas of probability for millennia, but exact mathematical descriptions of use in those problems only
arose much later.
According to Richard Jeffrey, "Before the middle of the seventeenth century, the term 'probable' (Latin probabilis)
meant approvable, and was applied in that sense, univocally, to opinion and to action. A probable action or opinion
was one such as sensible people would undertake or hold, in the circumstances."[4] However, in legal contexts
especially, 'probable' could also apply to propositions for which there was good evidence.[5]
Aside from some elementary considerations made by Girolamo Cardano in the 16th century, the doctrine of
probabilities dates to the correspondence of Pierre de Fermat and Blaise Pascal (1654). Christiaan Huygens (1657)
gave the earliest known scientific treatment of the subject. Jakob Bernoulli's Ars Conjectandi (posthumous, 1713)
and Abraham de Moivre's Doctrine of Chances (1718) treated the subject as a branch of mathematics. See Ian
Hacking's The Emergence of Probability and James Franklin's The Science of Conjecture for histories of the early
development of the very concept of mathematical probability.
The theory of errors may be traced back to Roger Cotes's Opera Miscellanea (posthumous, 1722), but a memoir
prepared by Thomas Simpson in 1755 (printed 1756) first applied the theory to the discussion of errors of
observation. The reprint (1757) of this memoir lays down the axioms that positive and negative errors are equally
probable, and that there are certain assignable limits within which all errors may be supposed to fall; continuous
errors are discussed and a probability curve is given.
Pierre-Simon Laplace (1774) made the first attempt to deduce a rule for the combination of observations from the
principles of the theory of probabilities. He represented the law of probability of errors by a curve y = φ(x),
x being any error and y its probability, and laid down three properties of this curve:
1. it is symmetric as to the y-axis;
2. the x-axis is an asymptote, the probability of the error ∞ being 0;
3. the area enclosed is 1, it being certain that an error exists.
He also gave (1781) a formula for the law of facility of error (a term due to Lagrange, 1774), but one which led to
unmanageable equations. Daniel Bernoulli (1778) introduced the principle of the maximum product of the
probabilities of a system of concurrent errors.
The method of least squares is due to Adrien-Marie Legendre (1805), who introduced it in his Nouvelles méthodes
pour la détermination des orbites des comètes (New Methods for Determining the Orbits of Comets). In ignorance of
Legendre's contribution, an Irish-American writer, Robert Adrain, editor of "The Analyst" (1808), first deduced the
law of facility of error,
φ(x) = c e^(−h²x²),
h being a constant depending on precision of observation, and c a scale factor ensuring that the area under the
curve equals 1. He gave two proofs, the second being essentially the same as John Herschel's (1850). Gauss gave the
first proof which seems to have been known in Europe (the third after Adrain's) in 1809. Further proofs were given
by Laplace (1810, 1812), Gauss (1823), James Ivory (1825, 1826), Hagen (1837), Friedrich Bessel (1838), W. F.
Donkin (1844, 1856), and Morgan Crofton (1870). Other contributors were Ellis (1844), De Morgan (1864),
Glaisher (1872), and Giovanni Schiaparelli (1875). Peters's (1856) formula for r, the probable error of a single
observation, is well known.
In the nineteenth century authors on the general theory included Laplace, Sylvestre Lacroix (1816), Littrow (1833),
Adolphe Quetelet (1853), Richard Dedekind (1860), Helmert (1872), Hermann Laurent (1873), Liagre, Didion, and
Karl Pearson. Augustus De Morgan and George Boole improved the exposition of the theory.
Andrey Markov introduced the notion of Markov chains (1906), which play an important role in the theory of stochastic
processes and its applications.
The modern theory of probability, based on measure theory, was developed by Andrey Kolmogorov (1931).
On the geometric side (see integral geometry) contributors to The Educational Times were influential (Miller,
Crofton, McColl, Wolstenholme, Watson, and Artemas Martin).
Mathematical treatment
In mathematics, a probability of an event A is represented by a real number in the range from 0 to 1 and written as
P(A), p(A) or Pr(A).
[6]
An impossible event has a probability of 0, and a certain event has a probability of 1.
However, the converses are not always true: probability 0 events are not always impossible, nor probability 1 events
certain. The rather subtle distinction between "certain" and "probability 1" is treated at greater length in the article on
"almost surely".
The opposite or complement of an event A is the event [not A] (that is, the event of A not occurring); its probability is
given by P(not A) = 1 - P(A).
[7]
As an example, the chance of not rolling a six on a six-sided die is 1 − (chance of rolling a six) = 1 − 1/6 = 5/6. See Complementary event for a more complete treatment.
If both the events A and B occur on a single performance of an experiment, this is called the intersection or joint probability of A and B, denoted as P(A ∩ B). If two events, A and B, are independent then the joint probability is
P(A and B) = P(A ∩ B) = P(A) P(B);
for example, if two coins are flipped the chance of both being heads is 1/2 × 1/2 = 1/4.
[8]
If either event A or event B or both events occur on a single performance of an experiment, this is called the union of the events A and B, denoted as P(A ∪ B). If two events are mutually exclusive then the probability of either occurring is
P(A ∪ B) = P(A) + P(B).
For example, the chance of rolling a 1 or 2 on a six-sided die is P(1 or 2) = P(1) + P(2) = 1/6 + 1/6 = 1/3.
If the events are not mutually exclusive then
P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
For example, when drawing a single card at random from a regular deck of cards, the chance of getting a heart or a face card (J, Q, K) (or one that is both) is 13/52 + 12/52 − 3/52 = 22/52 = 11/26, because of the 52 cards of a deck 13 are hearts, 12 are face cards, and 3 are both: here the possibilities included in the "3 that are both" are included in each of the "13 hearts" and the "12 face cards" but should only be counted once.
Conditional probability is the probability of some event A, given the occurrence of some other event B. Conditional probability is written P(A|B), and is read "the probability of A, given B". It is defined by
P(A|B) = P(A ∩ B) / P(B).
[9]
If P(B) = 0 then P(A|B) is undefined.
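These elementary rules can be checked with exact rational arithmetic. A minimal sketch in Python using the standard library's fractions module; the numbers mirror the die, coin, and card examples above:

```python
from fractions import Fraction

# Complement: chance of NOT rolling a six on a fair die
p_not_six = 1 - Fraction(1, 6)                    # 5/6

# Independent intersection: two fair coins both landing heads
p_both_heads = Fraction(1, 2) * Fraction(1, 2)    # 1/4

# Mutually exclusive union: rolling a 1 or a 2
p_one_or_two = Fraction(1, 6) + Fraction(1, 6)    # 1/3

# Non-exclusive union: heart or face card from a 52-card deck
hearts, faces, both = Fraction(13, 52), Fraction(12, 52), Fraction(3, 52)
p_heart_or_face = hearts + faces - both           # 22/52 = 11/26

# Conditional probability: P(heart | face card) = P(heart and face) / P(face)
p_heart_given_face = both / faces                 # 1/4

print(p_not_six, p_both_heads, p_one_or_two, p_heart_or_face, p_heart_given_face)
```

Fraction arithmetic avoids any floating-point rounding, so each result is exactly the value derived in the text.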
Summary of probabilities
Event        Probability
A            P(A) ∈ [0, 1]
not A        P(not A) = 1 − P(A)
A or B       P(A ∪ B) = P(A) + P(B) − P(A ∩ B); = P(A) + P(B) if A, B are mutually exclusive
A and B      P(A ∩ B) = P(A|B) P(B); = P(A) P(B) if A, B are independent
A given B    P(A|B) = P(A ∩ B) / P(B)
Theory
Like other theories, the theory of probability is a representation of probabilistic concepts in formal terms, that is, in
terms that can be considered separately from their meaning. These formal terms are manipulated by the rules of
mathematics and logic, and any results are then interpreted or translated back into the problem domain.
There have been at least two successful attempts to formalize probability, namely the Kolmogorov formulation and
the Cox formulation. In Kolmogorov's formulation (see probability space), sets are interpreted as events and
probability itself as a measure on a class of sets. In Cox's theorem, probability is taken as a primitive (that is, not
further analyzed) and the emphasis is on constructing a consistent assignment of probability values to propositions.
In both cases, the laws of probability are the same, except for technical details.
There are other methods for quantifying uncertainty, such as the Dempster-Shafer theory or possibility theory, but
those are essentially different and not compatible with the laws of probability as they are usually understood.
Applications
Two major applications of probability theory in everyday life are in risk assessment and in trade on commodity
markets. Governments typically apply probabilistic methods in environmental regulation where it is called "pathway
analysis", often measuring well-being using methods that are stochastic in nature, and choosing projects to undertake
based on statistical analyses of their probable effect on the population as a whole.
A good example is the effect of the perceived probability of any widespread Middle East conflict on oil prices, which have ripple effects in the economy as a whole. An assessment by a commodity trader that a war is more likely
vs. less likely sends prices up or down, and signals other traders of that opinion. Accordingly, the probabilities are
not assessed independently nor necessarily very rationally. The theory of behavioral finance emerged to describe the
effect of such groupthink on pricing, on policy, and on peace and conflict.
It can reasonably be said that the discovery of rigorous methods to assess and combine probability assessments has
had a profound effect on modern society. Accordingly, it may be of some importance to most citizens to understand
how odds and probability assessments are made, and how they contribute to reputations and to decisions, especially
in a democracy.
Another significant application of probability theory in everyday life is reliability. Many consumer products, such as
automobiles and consumer electronics, utilize reliability theory in the design of the product in order to reduce the
probability of failure. The probability of failure may be closely associated with the product's warranty.
Relation to randomness
In a deterministic universe, based on Newtonian concepts, there is no probability if all conditions are known. In the
case of a roulette wheel, if the force of the hand and the period of that force are known, then the number on which
the ball will stop would be a certainty. Of course, this also assumes knowledge of inertia and friction of the wheel,
weight, smoothness and roundness of the ball, variations in hand speed during the turning and so forth. A
probabilistic description can thus be more useful than Newtonian mechanics for analyzing the pattern of outcomes of
repeated rolls of roulette wheel. Physicists face the same situation in kinetic theory of gases, where the system, while
deterministic in principle, is so complex (with the number of molecules typically of the order of magnitude of the Avogadro constant, 6.02×10²³) that only a statistical description of its properties is feasible.
A revolutionary discovery of 20th century physics was the random character of all physical processes that occur at
sub-atomic scales and are governed by the laws of quantum mechanics. The wave function itself evolves
deterministically as long as no observation is made, but, according to the prevailing Copenhagen interpretation, the
randomness caused by the wave function collapsing when an observation is made, is fundamental. This means that
probability theory is required to describe nature. Others never came to terms with the loss of determinism. Albert
Einstein famously remarked in a letter to Max Born: Jedenfalls bin ich überzeugt, daß der Alte nicht würfelt. (I am
convinced that God does not play dice). Although alternative viewpoints exist, such as that of quantum decoherence
being the cause of an apparent random collapse, at present there is a firm consensus among physicists that
probability theory is necessary to describe quantum phenomena.
See also
Black Swan theory
Calculus of predispositions
Chance (disambiguation)
Class membership probabilities
Decision theory
Equiprobable
Fuzzy measure theory
Game theory
Gaming mathematics
Information theory
Important publications in probability
Measure theory
Negative probability
Probabilistic argumentation
Probabilistic logic
Random fields
Random variable
List of scientific journals in probability
List of statistical topics
Stochastic process
Wiener process
Notes
[1] The Logic of Statistical Inference, Ian Hacking, 1965
[2] The Emergence of Probability: A Philosophical Study of Early Ideas about Probability, Induction and Statistical Inference, Ian Hacking,
Cambridge University Press, 2006, ISBN 0521685575, 9780521685573
[3] The Cambridge History of Seventeenth-century Philosophy, Daniel Garber, 2003
[4] Jeffrey, R.C., Probability and the Art of Judgment, Cambridge University Press. (1992). pp. 54-55 . ISBN 0-521-39459-7
[5] Franklin, J., The Science of Conjecture: Evidence and Probability Before Pascal, Johns Hopkins University Press. (2001). pp. 22, 113, 127
[6] Olofsson, Peter. (2005) Page 8.
[7] Olofsson, page 9
[8] Olofsson, page 35.
[9] Olofsson, page 29.
References
Kallenberg, O. (2005) Probabilistic Symmetries and Invariance Principles. Springer-Verlag, New York. 510 pp. ISBN 0-387-25115-4
Kallenberg, O. (2002) Foundations of Modern Probability, 2nd ed. Springer Series in Statistics. 650 pp. ISBN 0-387-95313-2
Olofsson, Peter (2005) Probability, Statistics, and Stochastic Processes, Wiley-Interscience. 504 pp. ISBN 0-471-67969-0
Quotations
Damon Runyon, "It may be that the race is not always to the swift, nor the battle to the strong - but that is the way
to bet."
Pierre-Simon Laplace: "It is remarkable that a science which began with the consideration of games of chance
should have become the most important object of human knowledge." Théorie Analytique des Probabilités, 1812.
Richard von Mises "The unlimited extension of the validity of the exact sciences was a characteristic feature of
the exaggerated rationalism of the eighteenth century" (in reference to Laplace). Probability, Statistics, and Truth,
p 9. Dover edition, 1981 (republication of second English edition, 1957).
External links
Probability and Statistics EBook (http://wiki.stat.ucla.edu/socr/index.php/EBook)
Edwin Thompson Jaynes. Probability Theory: The Logic of Science. Preprint: Washington University, (1996). HTML index with links to PostScript files (http://omega.albany.edu:8008/JaynesBook.html) and PDF (http://bayes.wustl.edu/etj/prob/book.pdf) (first three chapters)
People from the History of Probability and Statistics (Univ. of Southampton) (http://www.economics.soton.ac.uk/staff/aldrich/Figures.htm)
Probability and Statistics on the Earliest Uses Pages (Univ. of Southampton) (http://www.economics.soton.ac.uk/staff/aldrich/Probability Earliest Uses.htm)
Earliest Uses of Symbols in Probability and Statistics (http://jeff560.tripod.com/stat.html) on Earliest Uses of Various Mathematical Symbols (http://jeff560.tripod.com/mathsym.html)
A tutorial on probability and Bayes theorem devised for first-year Oxford University students (http://www.celiagreen.com/charlesmccreery/statistics/bayestutorial.pdf)
pdf file of An Anthology of Chance Operations (1963) (http://ubu.com/historical/young/index.html) at UbuWeb
Probability Theory Guide for Non-Mathematicians (http://probability.infarom.ro)
Understanding Risk and Probability (http://www.bbc.co.uk/raw/money/express_unit_risk/) with BBC raw
Introduction to Probability - eBook (http://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/book.html), by Charles Grinstead, Laurie Snell Source (http://bitbucket.org/shabbychef/numas_text/) (GNU Free Documentation License)
Probability distribution
In probability theory and statistics, a probability distribution identifies either the probability of each value of a
random variable (when the variable is discrete), or the probability of the value falling within a particular interval
(when the variable is continuous).
[1]
The probability distribution describes the range of possible values that a random
variable can attain and the probability that the value of the random variable is within any (measurable) subset of that
range.
[Figure: The normal distribution, often called the "bell curve".]
When the random variable takes values in the set of real numbers, the probability distribution is completely described by the cumulative distribution function, whose value at each real x is the probability that the random variable is smaller than or equal to x.
The concept of the probability
distribution and the random variables
which they describe underlies the
mathematical discipline of probability
theory, and the science of statistics. There is spread or variability in almost any value that can be measured in a
population (e.g. height of people, durability of a metal, sales growth, traffic flow, etc.); almost all measurements are
made with some intrinsic error; in physics many processes are described probabilistically, from the kinetic properties
of gases to the quantum mechanical description of fundamental particles. For these and many other reasons, simple
numbers are often inadequate for describing a quantity, while probability distributions are often more appropriate.
There are various probability distributions that show up in various different applications. Two of the most important
ones are the normal distribution and the categorical distribution. The normal distribution, also known as the Gaussian
distribution, has a familiar "bell curve" shape and approximates many different naturally occurring distributions over
real numbers. The categorical distribution describes the result of an experiment with a fixed, finite number of
outcomes. For example, the toss of a fair coin is a categorical distribution, where the possible outcomes are heads
and tails, each with probability 1/2.
Formal definition
In the measure-theoretic formalization of probability theory, a random variable is defined as a measurable function X from a probability space (Ω, F, P) to a measurable space (X, A). A probability distribution is the pushforward measure X∗P = P X⁻¹ on (X, A).
Probability distributions of real-valued random variables
Because a probability distribution Pr on the real line is determined by the probability of a real-valued random variable X being in a half-open interval (−∞, x], the probability distribution is completely characterized by its cumulative distribution function:
F(x) = Pr[X ≤ x] for all x ∈ R.
Probability distribution
121
Discrete probability distribution
A probability distribution is called discrete if its cumulative distribution function only increases in jumps. More
precisely, a probability distribution is discrete if there is a finite or countable set whose probability is 1.
For many familiar discrete distributions, the set of possible values is topologically discrete in the sense that all its
points are isolated points. But, there are discrete distributions for which this countable set is dense on the real line.
Discrete distributions are characterized by a probability mass function, p, such that
F(x) = Pr[X ≤ x] = Σ_{xᵢ ≤ x} p(xᵢ).
Continuous probability distribution
By one convention, a probability distribution is called continuous if its cumulative distribution function F(x) = Pr[X ≤ x] is continuous and, therefore, the probability measure of singletons Pr[X = x] = 0 for all x.
Another convention reserves the term continuous probability distribution for absolutely continuous distributions. These distributions can be characterized by a probability density function: a non-negative Lebesgue integrable function f defined on the real numbers such that
F(x) = Pr[X ≤ x] = ∫₋∞ˣ f(t) dt.
Discrete distributions and some continuous distributions (like the Cantor distribution) do not admit such a density.
Terminology
The support of a distribution is the smallest closed interval/set whose complement has probability zero. It may be
understood as the points or elements that are actual members of the distribution.
A discrete random variable is a random variable whose probability distribution is discrete. Similarly, a continuous
random variable is a random variable whose probability distribution is continuous.
Some properties
The probability density function of the sum of two independent random variables is the convolution of each of
their density functions.
The probability density function of the difference of two independent random variables is the cross-correlation
of their density functions.
Probability distributions are not a vector space (they are not closed under linear combinations, as these do not preserve non-negativity or total integral 1), but they are closed under convex combination, thus forming a convex subset of the space of functions (or measures).
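The convolution property has a direct discrete analogue: the probability mass function of a sum of two independent discrete variables is the discrete convolution of their pmfs. A minimal sketch (pure Python; the two fair dice are an illustrative choice):

```python
from fractions import Fraction
from collections import defaultdict

def convolve(pmf_a, pmf_b):
    """pmf of X + Y for independent X, Y given as {value: probability} dicts."""
    out = defaultdict(Fraction)
    for x, px in pmf_a.items():
        for y, py in pmf_b.items():
            out[x + y] += px * py
    return dict(out)

die = {face: Fraction(1, 6) for face in range(1, 7)}
two_dice = convolve(die, die)

print(two_dice[7])             # 1/6, the most likely total of two dice
print(sum(two_dice.values()))  # 1, so the result is again a distribution
```

The same convolution idea, with an integral in place of the double sum, gives the density of a sum of continuous variables.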
Common probability distributions
The following is a list of some of the most common probability distributions, grouped by the type of process that
they are related to. For a more complete list, see list of probability distributions, which groups by the nature of the
outcome being considered (discrete, continuous, multivariate, etc.)
Note also that all of the univariate distributions below are singly-peaked; that is, it is assumed that the values cluster
around a single point. In practice, actually-observed quantities may cluster around multiple values. Such quantities
can be modeled using a mixture distribution.
Related to real-valued quantities that grow linearly (e.g. errors, offsets)
Normal distribution (aka Gaussian distribution), for a single such quantity; the most common continuous
distribution
Multivariate normal distribution (aka multivariate Gaussian distribution), for vectors of correlated outcomes that
are individually Gaussian-distributed
Related to positive real-valued quantities that grow exponentially (e.g. prices, incomes,
populations)
Log-normal distribution, for a single such quantity whose log is normally distributed
Pareto distribution, for a single such quantity whose log is exponentially distributed; the prototypical power law
distribution
Related to real-valued quantities that are assumed to be uniformly distributed over a
(possibly unknown) region
Discrete uniform distribution, for a finite set of values (e.g. the outcome of a fair die)
Continuous uniform distribution, for continuously-distributed values
Related to Bernoulli trials (yes/no events, with a given probability)
Basic distributions
Bernoulli distribution, for the outcome of a single Bernoulli trial (e.g. success/failure, yes/no)
Binomial distribution, for the number of "positive occurrences" (e.g. successes, yes votes, etc.) given a fixed total
number of independent occurrences
Negative binomial distribution, for binomial-type observations but where the quantity of interest is the number of
failures before a given number of successes occurs
Geometric distribution, for binomial-type observations but where the quantity of interest is the number of failures
before the first success; a special case of the negative binomial distribution
Related to sampling schemes over a finite population
Binomial distribution, for the number of "positive occurrences" (e.g. successes, yes votes, etc.) given a fixed
number of total occurrences, using sampling with replacement
Hypergeometric distribution, for the number of "positive occurrences" (e.g. successes, yes votes, etc.) given a
fixed number of total occurrences, using sampling without replacement
Beta-binomial distribution, for the number of "positive occurrences" (e.g. successes, yes votes, etc.) given a fixed
number of total occurrences, sampling using a Polya urn scheme (in some sense, the "opposite" of sampling
without replacement)
Related to categorical outcomes (events with K possible outcomes, with a given probability
for each outcome)
Categorical distribution, for a single categorical outcome (e.g. yes/no/maybe in a survey); a generalization of the
Bernoulli distribution
Multinomial distribution, for the number of each type of categorical outcome, given a fixed number of total
outcomes; a generalization of the binomial distribution
Multivariate hypergeometric distribution, similar to the multinomial distribution, but using sampling without
replacement; a generalization of the hypergeometric distribution
Related to events in a Poisson process (events that occur independently with a given rate)
Poisson distribution, for the number of occurrences of a Poisson-type event in a given period of time
Exponential distribution, for the time before the next Poisson-type event occurs
Useful for hypothesis testing related to normally-distributed outcomes
Chi-square distribution, the distribution of a sum of squared standard normal variables; useful e.g. for inference
regarding the sample variance of normally-distributed samples (see chi-square test)
Student's t distribution, the distribution of the ratio of a standard normal variable and the square root of a scaled
chi squared variable; useful for inference regarding the mean of normally-distributed samples with unknown
variance (see Student's t-test)
F-distribution, the distribution of the ratio of two scaled chi squared variables; useful e.g. for inferences that
involve comparing variances or involving R-squared (the squared correlation coefficient)
Useful as conjugate prior distributions in Bayesian inference
Beta distribution, for a single probability (real number between 0 and 1); conjugate to the Bernoulli distribution
and binomial distribution
Gamma distribution, for a non-negative scaling parameter; conjugate to the rate parameter of a Poisson
distribution or exponential distribution, the precision (inverse variance) of a normal distribution, etc.
Dirichlet distribution, for a vector of probabilities that must sum to 1; conjugate to the categorical distribution and
multinomial distribution; generalization of the beta distribution
Wishart distribution, for a symmetric non-negative definite matrix; conjugate to the inverse of the covariance
matrix of a multivariate normal distribution; generalization of the gamma distribution
See also
Copula (statistics)
Cumulative distribution function
Histogram
Inverse transform sampling
Likelihood function
List of statistical topics
Probability density function
Random variable
RiemannStieltjes integral application to probability theory
Notes
[1] Everitt, B.S. (2006) The Cambridge Dictionary of Statistics, Third Edition. pp. 313314. Cambridge University Press, Cambridge. ISBN
0521690277
External links
An 8-foot-tall (2.4 m) Probability Machine (named Sir Francis) comparing stock market returns to the randomness of the beans dropping through the quincunx pattern. (http://www.youtube.com/watch?v=AUSKTk9ENzg) from Index Funds Advisors IFA.com (http://www.ifa.com), youtube.com
Interactive Discrete and Continuous Probability Distributions (http://www.socr.ucla.edu/htmls/SOCR_Distributions.html), socr.ucla.edu
A Compendium of Common Probability Distributions (http://www.causascientia.org/math_stat/Dists/Compendium.pdf)
A Compendium of Distributions (http://www.vosesoftware.com/content/ebook.pdf), vosesoftware.com
Statistical Distributions - Overview (http://www.xycoon.com/contdistroverview.htm), xycoon.com
Probability Distributions (http://www.sitmo.com/eqcat/8) in Quant Equation Archive, sitmo.com
A Probability Distribution Calculator (http://www.covariable.com/continuous.html), covariable.com
Sourceforge.net (http://sourceforge.net/projects/distexplorer/), Distribution Explorer: a mixed C++ and C# Windows application that allows you to explore the properties of 20+ statistical distributions, and calculate CDF, PDF & quantiles. Written using open-source C++ from the Boost.org (http://www.boost.org) Math Toolkit library.
Explore different probability distributions and fit your own dataset online - interactive tool (http://www.xjtek.com/anylogic/demo_models/111/), xjtek.com
Binomial distribution
[Plots: probability mass function (top) and cumulative distribution function (bottom).]
notation: B(n, p)
parameters: n ∈ N₀, the number of trials; p ∈ [0, 1], the success probability in each trial
support: k ∈ {0, 1, …, n}
pmf: C(n, k) p^k (1−p)^(n−k)
cdf: I_(1−p)(n−k, k+1), the regularized incomplete beta function
mean: np
median: ⌊np⌋ or ⌈np⌉
mode: ⌊(n+1)p⌋ or ⌊(n+1)p⌋ − 1
variance: np(1−p)
skewness: (1−2p) / √(np(1−p))
ex. kurtosis: (1−6p(1−p)) / (np(1−p))
entropy: (1/2) ln(2πe·np(1−p)) + O(1/n)
mgf: (1−p+pe^t)^n
cf: (1−p+pe^(it))^n
pgf: (1−p+pz)^n
In probability theory and statistics, the binomial distribution is the discrete probability distribution of the number of
successes in a sequence of n independent yes/no experiments, each of which yields success with probability p. Such
a success/failure experiment is also called a Bernoulli experiment or Bernoulli trial. In fact, when n = 1, the binomial
distribution is a Bernoulli distribution. The binomial distribution is the basis for the popular binomial test of
statistical significance.
It is frequently used to model the number of successes in a sample of size n from a population of size N. Since the
samples are not independent (this is sampling without replacement), the resulting distribution is a hypergeometric
distribution, not a binomial one. However, for N much larger than n, the binomial distribution is a good
approximation, and widely used.
Examples
An elementary example is this: roll a standard die ten times and count the number of fours. The distribution of this
random number is a binomial distribution with n=10 and p=1/6.
As another example, flip a coin three times and count the number of heads. The distribution of this random number
is a binomial distribution with n=3 and p=1/2.
Specification
Probability mass function
In general, if the random variable K follows the binomial distribution with parameters n and p, we write K ~ B(n, p). The probability of getting exactly k successes in n trials is given by the probability mass function:
f(k; n, p) = Pr(K = k) = C(n, k) p^k (1−p)^(n−k)
for k = 0, 1, 2, ..., n, where
C(n, k) = n! / (k! (n−k)!)
is the binomial coefficient (hence the name of the distribution) "n choose k", also denoted nCk or ⁿCₖ. The formula can be understood as follows: we want k successes (p^k) and n−k failures ((1−p)^(n−k)). However, the k successes can occur anywhere among the n trials, and there are C(n, k) different ways of distributing k successes in a sequence of n trials.
In creating reference tables for binomial distribution probability, usually the table is filled in up to n/2 values. This is because for k > n/2, the probability can be calculated by its complement as
f(k; n, p) = f(n−k; n, 1−p).
So, one must look to a different k and a different p (the binomial is not symmetrical in general). However, its behavior is not arbitrary. There is always an integer m that satisfies
(n+1)p − 1 ≤ m < (n+1)p.
As a function of k, the expression f(k; n, p) is monotone increasing for k < m and monotone decreasing for k > m, with the exception of the case where (n+1)p is an integer. In this case, there are two maximum values, at m = (n+1)p and m − 1. m is known as the most probable (most likely) outcome of the Bernoulli trials. Note that the probability of it occurring can be fairly small.
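The pmf and the complement identity above can be checked directly; a minimal sketch using Python's standard library (the values n = 10, p = 1/6 mirror the die-rolling example):

```python
from math import comb

def binom_pmf(k, n, p):
    """Pr(K = k) for K ~ B(n, p): C(n, k) p^k (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 1 / 6                      # ten rolls of a die, counting fours
probs = [binom_pmf(k, n, p) for k in range(n + 1)]

# The pmf sums to 1 over its support k = 0..n
assert abs(sum(probs) - 1) < 1e-12

# Complement identity used when tabulating: f(k; n, p) = f(n-k; n, 1-p)
assert abs(binom_pmf(7, n, p) - binom_pmf(3, n, 1 - p)) < 1e-15
```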
Cumulative distribution function
The cumulative distribution function can be expressed as:
F(k; n, p) = Pr(X ≤ k) = Σᵢ₌₀^⌊k⌋ C(n, i) pⁱ (1−p)ⁿ⁻ⁱ,
where ⌊k⌋ is the "floor" under k, i.e. the greatest integer less than or equal to k.
It can also be represented in terms of the regularized incomplete beta function, as follows:
F(k; n, p) = Pr(X ≤ k) = I_(1−p)(n−k, k+1).
For k ≤ np, upper bounds for the lower tail of the distribution function can be derived. In particular, Hoeffding's inequality yields the bound
F(k; n, p) ≤ exp(−2n (p − k/n)²),
and Chernoff's inequality can be used to derive the bound
F(k; n, p) ≤ exp(−n D(k/n ‖ p)),
where D(a ‖ p) = a ln(a/p) + (1−a) ln((1−a)/(1−p)) is the relative entropy between Bernoulli(a) and Bernoulli(p). Moreover, these bounds are reasonably tight when p = 1/2, since the following expression holds for all k ≥ 3n/8:
F(k; n, 1/2) ≥ (1/15) exp(−16n (1/2 − k/n)²).
[1]
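The Hoeffding bound is easy to compare against the exact lower-tail probability; a short sketch (the values n = 100, p = 0.5, k = 35 are an illustrative choice with k ≤ np):

```python
from math import comb, exp

def binom_cdf(k, n, p):
    """F(k; n, p) = Pr(X <= k) by direct summation of the pmf."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, p, k = 100, 0.5, 35
exact = binom_cdf(k, n, p)
hoeffding = exp(-2 * n * (p - k / n) ** 2)   # exp(-2n(p - k/n)^2)

# The bound must dominate the exact tail probability
assert exact <= hoeffding
print(exact, hoeffding)
```

As the text notes, such bounds are convenient but not tight; here the exact tail is roughly an order of magnitude below the bound.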
Mean and variance
If X ~ B(n, p) (that is, X is a binomially distributed random variable), then the expected value of X is
E[X] = np
and the variance is
Var(X) = np(1−p).
This fact is easily proven as follows. Suppose first that we have a single Bernoulli trial. There are two possible outcomes: 1 and 0, the first occurring with probability p and the second having probability 1−p. The expected value in this trial will be equal to μ = 1·p + 0·(1−p) = p. The variance in this trial is calculated similarly: σ² = (1−p)²·p + (0−p)²·(1−p) = p(1−p).
The generic binomial distribution is a sum of n independent Bernoulli trials. The mean and the variance of such distributions are equal to the sums of the means and variances of each individual trial:
μₙ = np,  σₙ² = np(1−p).
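The formulas E[X] = np and Var(X) = np(1−p) can be verified by summing directly over the pmf; a minimal sketch (n = 12, p = 0.3 are illustrative values):

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 12, 0.3
mean = sum(k * binom_pmf(k, n, p) for k in range(n + 1))
var = sum((k - mean) ** 2 * binom_pmf(k, n, p) for k in range(n + 1))

assert abs(mean - n * p) < 1e-10            # E[X] = np = 3.6
assert abs(var - n * p * (1 - p)) < 1e-10   # Var(X) = np(1-p) = 2.52
```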
Mode and median
Usually the mode of a binomial B(n, p) distribution is equal to ⌊(n+1)p⌋, where ⌊·⌋ is the floor function. However, when (n+1)p is an integer and p is neither 0 nor 1, then the distribution has two modes: (n+1)p and (n+1)p − 1. When p is equal to 0 or 1, the mode will be 0 and n respectively. These cases can be summarized as follows: the mode is ⌊(n+1)p⌋ if (n+1)p is 0 or a non-integer; both (n+1)p and (n+1)p − 1 if (n+1)p ∈ {1, …, n}; and n if (n+1)p = n+1.
In general, there is no single formula to find the median for a binomial distribution, and it may even be non-unique. However several special results have been established:
If np is an integer, then the mean, median, and mode coincide and equal np.
[2]
Any median m must lie within the interval ⌊np⌋ ≤ m ≤ ⌈np⌉.
[3]
A median m cannot lie too far away from the mean: |m − np| ≤ min{ ln 2, max{p, 1−p} }.
[4]
The median is unique and equal to m = round(np) in cases when either p ≤ 1 − ln 2 or p ≥ ln 2 or |m − np| ≤ min{p, 1−p} (except for the case when p = 1/2 and n is odd).
[3]
[4]
When p = 1/2 and n is odd, any number m in the interval (n−1)/2 ≤ m ≤ (n+1)/2 is a median of the binomial distribution. If p = 1/2 and n is even, then m = n/2 is the unique median.
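The mode and median formulas can be checked numerically against brute-force computation from the pmf; a minimal sketch (n = 10, p = 0.3 are illustrative values):

```python
from math import comb, floor, ceil

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def mode(n, p):
    """Argmax of the pmf over the support k = 0..n."""
    return max(range(n + 1), key=lambda k: binom_pmf(k, n, p))

def median(n, p):
    """Smallest m with F(m; n, p) >= 1/2."""
    total = 0.0
    for k in range(n + 1):
        total += binom_pmf(k, n, p)
        if total >= 0.5:
            return k

n, p = 10, 0.3
assert mode(n, p) == floor((n + 1) * p)               # floor((n+1)p) = 3
assert floor(n * p) <= median(n, p) <= ceil(n * p)    # median lies in [floor(np), ceil(np)]
```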
Covariance between two binomials
If two binomially distributed random variables X and Y are observed together, estimating their covariance can be useful. Using the definition of covariance, in the case n = 1 we have
Cov(X, Y) = E(XY) − μ_X μ_Y.
The first term is non-zero only when both X and Y are one, and μ_X and μ_Y are equal to the two probabilities. Defining p_B as the probability of both happening at the same time, this gives
Cov(X, Y) = p_B − p_X p_Y,
and for n such trials again due to independence
Cov(X, Y) = n (p_B − p_X p_Y).
If X and Y are the same variable, this reduces to the variance formula given above.
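The single-trial identity Cov(X, Y) = p_B − p_X p_Y can be verified by enumerating the joint distribution of one Bernoulli pair; the marginals and p_B below are illustrative values (any consistent choice works):

```python
# Single trial (n = 1): joint pmf over (X, Y) in {0,1}^2 is fixed by the
# marginals p_X, p_Y and the joint success probability p_B.
p_X, p_Y, p_B = 0.5, 0.4, 0.3   # illustrative; requires p_B <= min(p_X, p_Y)

joint = {
    (1, 1): p_B,
    (1, 0): p_X - p_B,
    (0, 1): p_Y - p_B,
    (0, 0): 1 - p_X - p_Y + p_B,
}
e_xy = sum(x * y * pr for (x, y), pr in joint.items())  # E(XY) = p_B
cov = e_xy - p_X * p_Y

assert abs(cov - (p_B - p_X * p_Y)) < 1e-12   # 0.3 - 0.2 = 0.1
```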
Relationship to other distributions
Sums of binomials
If X ~ B(n, p) and Y ~ B(m, p) are independent binomial variables, then X + Y is again a binomial variable; its distribution is
X + Y ~ B(n + m, p).
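This closure property can be checked by convolving the two pmfs and comparing against B(n + m, p) directly; a minimal sketch (n = 4, m = 6, p = 0.35 are illustrative values):

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, m, p = 4, 6, 0.35

# pmf of X + Y by discrete convolution of the two binomial pmfs
conv = [sum(binom_pmf(i, n, p) * binom_pmf(s - i, m, p)
            for i in range(max(0, s - m), min(n, s) + 1))
        for s in range(n + m + 1)]

# pmf of B(n + m, p) computed directly
direct = [binom_pmf(s, n + m, p) for s in range(n + m + 1)]

assert all(abs(a - b) < 1e-12 for a, b in zip(conv, direct))
```

Note that the shared success probability p is essential; the sum of binomials with different p values is not binomial (it is Poisson binomial).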
Bernoulli distribution
The Bernoulli distribution is a special case of the binomial distribution, where n=1. Symbolically, X~B(1,p) has
the same meaning as X~Bern(p). Conversely, any binomial distribution, B(n,p), is the sum of n independent
Bernoulli trials, Bern(p), each with the same probability p.
Poisson binomial distribution
The binomial distribution is a special case of the Poisson binomial distribution, which is a sum of n independent non-identical Bernoulli trials Bern(pᵢ). If X has the Poisson binomial distribution with p₁ = … = pₙ = p, then X ~ B(n, p).
Normal approximation
Binomial PDF and normal approximation for n=6 and p=0.5
If n is large enough, then the skew of the distribution is
not too great. In this case, if a suitable continuity
correction is used, then an excellent approximation to
B(n,p) is given by the normal distribution
The approximation generally improves as n increases (at least 20) and is better when p is not near to 0 or 1.
[5]
Various rules of thumb may be used to decide whether n is large enough, and p is far enough from the extremes of
zero or unity:
One rule is that both x=np and n(1p) must be greater than5. However, the specific number varies from source
to source, and depends on how good an approximation one wants; some sources give 10 which gives virtually the
same results as the following rule for large n until n is very large (ex: x=11, n=7752).
That rule
[5]
is that for n > 5 the normal approximation is adequate if
Another commonly used rule holds that the normal approximation is appropriate only if everything within 3 standard deviations of its mean is within the range of possible values, that is, only if

μ ± 3σ = np ± 3√(np(1 − p)) ∈ (0, n).
Also, as the approximation generally improves, it can be shown that the inflection points of the approximating density occur at x = μ ± σ.
The following is an example of applying a continuity correction: Suppose one wishes to calculate Pr(X ≤ 8) for a binomial random variable X. If Y has a distribution given by the normal approximation, then Pr(X ≤ 8) is approximated by Pr(Y ≤ 8.5). The addition of 0.5 is the continuity correction; the uncorrected normal approximation gives considerably less accurate results.
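A quick numerical check of the continuity correction (the values n = 20, p = 0.5 are an illustrative choice, not from the text):

```python
import math

def binom_cdf(k, n, p):
    # Exact Pr(X <= k) for X ~ B(n, p)
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def norm_cdf(x, mu, sigma):
    # Normal CDF via the error function
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

n, p, k = 20, 0.5, 8
mu, sigma = n * p, math.sqrt(n * p * (1 - p))
exact = binom_cdf(k, n, p)
corrected = norm_cdf(k + 0.5, mu, sigma)    # with continuity correction
uncorrected = norm_cdf(k, mu, sigma)        # without
```

The corrected value lands within about a thousandth of the exact probability, while the uncorrected one misses by several percentage points.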
This approximation, known as the de Moivre–Laplace theorem, is a huge time-saver (exact calculations with large n are very onerous); historically, it was the first use of the normal distribution, introduced in Abraham de Moivre's book
The Doctrine of Chances in 1738. Nowadays, it can be seen as a consequence of the central limit theorem, since B(n, p) is a sum of n independent, identically distributed Bernoulli variables with parameter p. This fact is the basis of a hypothesis test, a "proportion z-test", for the value of p using x/n, the sample proportion and estimator of p, in a common test statistic.[6]
For example, suppose you randomly sample n people out of a large population and ask them whether they agree with a certain statement. The proportion of people who agree will of course depend on the sample. If you sampled groups of n people repeatedly and truly randomly, the proportions would follow an approximate normal distribution with mean equal to the true proportion p of agreement in the population and with standard deviation σ = √(p(1 − p)/n). Large sample sizes n are good because the standard deviation, as a proportion of the expected value, gets smaller, which allows a more precise estimate of the unknown parameter p.
Poisson approximation
The binomial distribution converges towards the Poisson distribution as the number of trials goes to infinity while the product np remains fixed. Therefore the Poisson distribution with parameter λ = np can be used as an approximation to B(n, p) if n is sufficiently large and p is sufficiently small. According to two rules of thumb, this approximation is good if n ≥ 20 and p ≤ 0.05, or if n ≥ 100 and np ≤ 10.
[7]
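A sketch comparing the two probability mass functions at a parameter point satisfying the rule of thumb (n = 100, p = 0.05 are illustrative values):

```python
import math

def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

n, p = 100, 0.05          # n >= 100 and np = 5 <= 10, within the rule of thumb
lam = n * p
# Largest pointwise discrepancy over the bulk of the support
max_err = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(20))
```

For these values the worst pointwise error is below 0.01, even though the individual probabilities near the mode are around 0.18.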
Limits
As n approaches ∞ and p approaches 0 while np remains fixed at λ > 0, or at least while np approaches λ > 0, the Binomial(n, p) distribution approaches the Poisson distribution with expected value λ.
As n approaches ∞ while p remains fixed, the distribution of

(X − np) / √(np(1 − p))

approaches the normal distribution with expected value 0 and variance 1. This result is sometimes loosely stated by saying that the distribution of X approaches the normal distribution with expected value np and variance np(1 − p). That loose statement cannot be taken literally, because the thing asserted to be approached actually depends on the value of n, and n is approaching infinity. This result is a specific case of the central limit theorem.
Generating binomial random variates
Luc Devroye, Non-Uniform Random Variate Generation, New York: Springer-Verlag, 1986. See especially Chapter X, Discrete Univariate Distributions.[8]
Kachitvichyanukul, V.; Schmeiser, B. W. (1988). "Binomial random variate generation". Communications of the ACM 31: 216–222. doi:10.1145/42372.42381.
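The simplest (if slow) generator follows directly from the definition, counting successes in n Bernoulli(p) trials; this O(n) method is a baseline sketch, not the optimized algorithms of the references above (parameter values are illustrative):

```python
import random

def binomial_variate(n, p, rng=random):
    # Naive O(n) generator: count successes in n Bernoulli(p) trials.
    return sum(1 for _ in range(n) if rng.random() < p)

random.seed(0)
samples = [binomial_variate(50, 0.3) for _ in range(10000)]
mean = sum(samples) / len(samples)   # should be close to np = 15
```

The references above describe algorithms whose cost does not grow with n, which matters when n is large.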
See also
Bean machine / Galton box
Binomial proportion confidence interval
Logistic regression
Sample_size#Estimating_proportions
References
[1] Matousek, J.; Vondrak, J.: The Probabilistic Method (lecture notes) (http://kam.mff.cuni.cz/~matousek/prob-ln.ps.gz).
[2] Neumann, P. (1966). "Über den Median der Binomial- und Poissonverteilung" (in German). Wissenschaftliche Zeitschrift der Technischen Universität Dresden 19: 29–33.
[3] Kaas, R.; Buhrman, J.M. (1980). "Mean, Median and Mode in Binomial Distributions". Statistica Neerlandica 34 (1): 13–18. doi:10.1111/j.1467-9574.1980.tb00681.x.
[4] Hamza, K. (1995). "The smallest uniform upper bound on the distance between the mean and the median of the binomial and Poisson distributions". Statistics & Probability Letters 23: 21–25. doi:10.1016/0167-7152(94)00090-U.
[5] Box, Hunter and Hunter (1978). Statistics for Experimenters. Wiley. p. 130.
[6] NIST/SEMATECH, "7.2.4. Does the proportion of defectives meet requirements?" (http://www.itl.nist.gov/div898/handbook/prc/section2/prc24.htm), e-Handbook of Statistical Methods.
[7] NIST/SEMATECH, "6.3.3.1. Counts Control Charts" (http://www.itl.nist.gov/div898/handbook/pmc/section3/pmc331.htm), e-Handbook of Statistical Methods.
[8] http://cg.scs.carleton.ca/~luc/chapter_ten.pdf
External links
Web Based Binomial Distribution Calculator (with arbitrary precision) (http://www.adsciengineering.com/bpdcalc/)
Binomial Distribution Web App (http://limfinity.com/apps/binomial.php)
Binomial Probabilities Simple Explanation (http://faculty.vassar.edu/lowry/binomialX.html)
SOCR Binomial Distribution Applet (http://www.socr.ucla.edu/htmls/SOCR_Distributions.html)
CAUSEweb.org (http://www.causeweb.org) Many resources for teaching statistics, including the binomial distribution
"Binomial Distribution" (http://demonstrations.wolfram.com/BinomialDistribution/) by Chris Boucher, Wolfram Demonstrations Project, 2007.
Binomial Distribution (http://www.cut-the-knot.org/Curriculum/Probability/BinomialDistribution.shtml) Properties and Java simulation from cut-the-knot
Statistics Tutorial: Binomial Distribution (http://stattrek.com/Lesson2/Binomial.aspx)
Log-normal distribution
Log-normal
notation: ln N(μ, σ²)
parameters: σ² > 0 — squared scale (real); μ ∈ R — location
support: x ∈ (0, +∞)
pdf: (1/(xσ√(2π))) exp(−(ln x − μ)²/(2σ²))
cdf: ½ + ½ erf[(ln x − μ)/(σ√2)]
mean: e^{μ + σ²/2}
median: e^{μ}
mode: e^{μ − σ²}
variance: (e^{σ²} − 1) e^{2μ + σ²}
skewness: (e^{σ²} + 2) √(e^{σ²} − 1)
ex. kurtosis: e^{4σ²} + 2e^{3σ²} + 3e^{2σ²} − 6
entropy: ½ + ½ ln(2πσ²) + μ
mgf: (defined only on the negative half-axis, see text)
cf: representation is asymptotically divergent but sufficient for numerical purposes (see text)
In probability theory, a log-normal distribution is a probability distribution of a random variable whose logarithm is normally distributed. If Y is a random variable with a normal distribution, then X = exp(Y) has a log-normal distribution; likewise, if X is log-normally distributed, then Y = log(X) is normally distributed. (This is true regardless of the base of the logarithmic function: if log_a(Y) is normally distributed, then so is log_b(Y), for any two positive numbers a, b ≠ 1.)
Log-normal is also written log normal or lognormal. It is occasionally referred to as the Galton distribution or Galton's distribution, after Francis Galton.
A variable might be modeled as log-normal if it can be thought of as the multiplicative product of many independent
random variables each of which is positive. For example, in finance, a long-term discount factor can be derived from
the product of short-term discount factors. In wireless communication, the attenuation caused by shadowing or slow
fading from random objects is often assumed to be log-normally distributed. See log-distance path loss model.
Characterization
Probability density function
The probability density function of a log-normal distribution is:

f_X(x; μ, σ) = (1/(xσ√(2π))) exp(−(ln x − μ)²/(2σ²)),   x > 0,

where μ and σ are the mean and standard deviation of the variable's natural logarithm (by definition, the variable's logarithm is normally distributed).
Cumulative distribution function
F_X(x; μ, σ) = ½ erfc(−(ln x − μ)/(σ√2)) = Φ((ln x − μ)/σ),

where erfc is the complementary error function, and Φ is the standard normal cdf.
Mean and standard deviation
If X is a log-normally distributed variable, its expected value (mean), variance, and standard deviation are

E[X] = e^{μ + σ²/2},
Var[X] = (e^{σ²} − 1) e^{2μ + σ²},
s.d.[X] = √Var[X] = e^{μ + σ²/2} √(e^{σ²} − 1).

Equivalently, the parameters μ and σ can be obtained if the values of the mean m and the variance v are known:

σ² = ln(1 + v/m²),
μ = ln(m) − ½ ln(1 + v/m²).
The geometric mean of the log-normal distribution is e^{μ}, and the geometric standard deviation is equal to e^{σ}.
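The two parameterizations above invert each other exactly; a quick numerical round-trip check (parameter values are illustrative):

```python
import math

def lognormal_mean_var(mu, sigma):
    # (mean, variance) of ln N(mu, sigma^2)
    m = math.exp(mu + sigma**2 / 2)
    v = (math.exp(sigma**2) - 1) * math.exp(2 * mu + sigma**2)
    return m, v

def lognormal_params(m, v):
    # Recover (mu, sigma) from the mean and variance
    sigma2 = math.log(1 + v / m**2)
    mu = math.log(m) - sigma2 / 2
    return mu, math.sqrt(sigma2)

mu, sigma = 0.5, 0.8
m, v = lognormal_mean_var(mu, sigma)
mu2, sigma2 = lognormal_params(m, v)
```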
Mode and median
The mode is the point of global maximum of the probability density function. In particular, it solves the equation (ln f)′ = 0:

mode[X] = e^{μ − σ²}.

The median is the point where F_X = 1/2:

med[X] = e^{μ}.
Confidence interval
If X is distributed log-normally with parameters μ and σ, then the (1 − α)-confidence interval for X will be

[e^{μ − q*σ}, e^{μ + q*σ}],

where q* is the (1 − α/2)-quantile of the standard normal distribution: q* = Φ^{−1}(1 − α/2).
Moments
For any real or complex number s, the s-th moment of a log-normal X is given by

E[X^s] = e^{sμ + s²σ²/2}.
A log-normal distribution is not uniquely determined by its moments E[X^k] for k ≥ 1; that is, there exists some other distribution with the same moments for all k. In fact, there is a whole family of distributions with the same moments as the log-normal distribution.
Characteristic function and moment generating function
The characteristic function E[e^{itX}] has a number of representations. The integral itself converges for Im(t) ≤ 0. The simplest representation is obtained by Taylor expanding e^{itX} and using the formula for moments above:

φ(t) = Σ_{n≥0} ((it)^n / n!) e^{nμ + n²σ²/2}.

This series representation is divergent for Re(σ²) > 0; however, it is sufficient for numerically evaluating the characteristic function at positive σ as long as the upper limit in the sum above is kept bounded, n ≤ N, with σ² < 0.1. To bring the numerical values of the parameters μ, σ into the domain where the strong inequality holds true, one could use the fact that if X is log-normally distributed then X^m is also log-normally distributed, with parameters mμ, mσ. Since (mσ)² → 0 as m → 0, the inequality σ² < 0.1 can be satisfied for sufficiently small m. The sum of the series first converges to the value of φ(t) with arbitrarily high accuracy if m is small enough and the left part of the strong inequality is satisfied. If a considerably larger number of terms is taken into account, the sum eventually diverges when the right part of the strong inequality is no longer valid.
Another useful representation was derived by Roy Leipnik (see references by this author and by Daniel Dufresne below) by means of a double Taylor expansion of e^{(ln x − μ)²/(2σ²)}.
The moment-generating function for the log-normal distribution does not exist on the whole real line R, but only exists on the half-interval (−∞, 0].
Partial expectation
The partial expectation of a random variable X with respect to a threshold k is defined as g(k) = E[X | X > k] P[X > k]. For a log-normal random variable the partial expectation is given by

g(k) = e^{μ + σ²/2} Φ((μ + σ² − ln k)/σ).

This formula has applications in insurance and economics; it is used in solving the partial differential equation leading to the Black–Scholes formula.
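The partial expectation g(k) = e^{μ+σ²/2} Φ((μ + σ² − ln k)/σ) can be verified by Monte Carlo; sample size and parameter values below are illustrative:

```python
import math, random

def norm_cdf(x):
    # Standard normal CDF
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def partial_expectation(k, mu, sigma):
    # g(k) = E[X; X > k] for X ~ ln N(mu, sigma^2)
    return math.exp(mu + sigma**2 / 2) * norm_cdf((mu + sigma**2 - math.log(k)) / sigma)

random.seed(1)
mu, sigma, k = 0.0, 0.5, 1.5
draws = [math.exp(random.gauss(mu, sigma)) for _ in range(200000)]
mc = sum(x for x in draws if x > k) / len(draws)   # Monte Carlo estimate of g(k)
analytic = partial_expectation(k, mu, sigma)
```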
Properties
Data that arise from the log-normal distribution have a symmetric Lorenz curve (see also Lorenz asymmetry coefficient).[1]
Maximum likelihood estimation of parameters
For determining the maximum likelihood estimators of the log-normal distribution parameters μ and σ, we can use the same procedure as for the normal distribution. To avoid repetition, we observe that

f_L(x; μ, σ) = (1/x) f_N(ln x; μ, σ),

where by f_L we denote the probability density function of the log-normal distribution and by f_N that of the normal distribution. Therefore, using the same indices to denote distributions, we can write the log-likelihood function thus:

ℓ_L(μ, σ | x_1, …, x_n) = −Σ_k ln x_k + ℓ_N(μ, σ | ln x_1, …, ln x_n).

Since the first term is constant with regard to μ and σ, both logarithmic likelihood functions, ℓ_L and ℓ_N, reach their maximum with the same μ and σ. Hence, using the formulas for the normal distribution maximum likelihood parameter estimators and the equality above, we deduce that for the log-normal distribution it holds that

μ̂ = (1/n) Σ_k ln x_k,   σ̂² = (1/n) Σ_k (ln x_k − μ̂)².
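The closed-form estimators (the mean and standard deviation of the log-transformed data) are easy to check on simulated data; a sketch with illustrative parameters:

```python
import math, random

def lognormal_mle(data):
    # MLE for (mu, sigma): sample mean and (biased, 1/n) standard
    # deviation of the log-transformed data.
    logs = [math.log(x) for x in data]
    n = len(logs)
    mu_hat = sum(logs) / n
    sigma_hat = math.sqrt(sum((v - mu_hat)**2 for v in logs) / n)
    return mu_hat, sigma_hat

random.seed(3)
true_mu, true_sigma = 0.4, 0.9
data = [math.exp(random.gauss(true_mu, true_sigma)) for _ in range(50000)]
mu_hat, sigma_hat = lognormal_mle(data)
```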
Generating log-normally-distributed random variates
Given a random variate N drawn from the normal distribution with 0 mean and 1 standard deviation, the variate

X = e^{μ + σN}

has a log-normal distribution with parameters μ and σ.
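A sketch of this generator; the sample median should approach the theoretical median e^{μ} (parameter values and sample size are illustrative):

```python
import math, random

random.seed(2)
mu, sigma = 1.0, 0.6
# X = exp(mu + sigma * N) with N standard normal
samples = sorted(math.exp(mu + sigma * random.gauss(0, 1)) for _ in range(100001))
sample_median = samples[50000]       # middle order statistic
theoretical_median = math.exp(mu)    # e^mu for the log-normal
```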
Related distributions
If X ~ N(μ, σ²) is a normal random variable, then e^X ~ ln N(μ, σ²).
If X ~ ln N(μ, σ²) is distributed log-normally, then ln X ~ N(μ, σ²) is a normal random variable.
If X_j ~ ln N(μ_j, σ_j²), j = 1, …, n, are independent log-normally distributed variables, and Y = Π_j X_j, then Y is also distributed log-normally:

Y ~ ln N(Σ_j μ_j, Σ_j σ_j²).

Let X_j ~ ln N(μ_j, σ_j²) be independent log-normally distributed variables with possibly varying σ and μ parameters, and Y = Σ_j X_j. The distribution of Y has no closed-form expression, but can be reasonably approximated by another log-normal distribution Z at the right tail. Its probability density function in the neighborhood of 0 is characterized in (Gao et al., 2009), and it does not resemble any log-normal distribution. A commonly used approximation (due to Fenton and Wilkinson) is obtained by matching the mean and variance:

σ_Z² = ln[ Σ_j e^{2μ_j + σ_j²}(e^{σ_j²} − 1) / (Σ_j e^{μ_j + σ_j²/2})² + 1 ],
μ_Z = ln[ Σ_j e^{μ_j + σ_j²/2} ] − σ_Z²/2.

In the case that all X_j have the same variance parameter σ_j = σ, these formulas simplify to

σ_Z² = ln[ (e^{σ²} − 1) Σ_j e^{2μ_j} / (Σ_j e^{μ_j})² + 1 ],
μ_Z = ln[ Σ_j e^{μ_j} ] + σ²/2 − σ_Z²/2.

If X ~ ln N(μ, σ²), then X + c is said to have a shifted log-normal distribution with support x ∈ (c, +∞); E[X + c] = E[X] + c, Var[X + c] = Var[X].
If X ~ ln N(μ, σ²) and a > 0, then Y = aX is also log-normal, Y ~ ln N(μ + ln a, σ²).
If X ~ ln N(μ, σ²), then Y = 1/X is also log-normal, Y ~ ln N(−μ, σ²).
If X ~ ln N(μ, σ²) and a ≠ 0, then Y = X^a is also log-normal, Y ~ ln N(aμ, a²σ²).
Similar distributions
A substitute for the log-normal, whose integral can be expressed in terms of more elementary functions (Swamee, 2002), can be obtained by basing the CDF on the logistic distribution applied to ln x; the result is a log-logistic distribution.
An ex-Gaussian distribution is the distribution of the sum of a normally distributed random variable and an exponentially distributed random variable. This has a similar long tail, and has been used as a model for reaction times.
Further reading
Robert Brooks, Jon Corson, and J. Donal Wales. "The Pricing of Index Options When the Underlying Assets All Follow a Lognormal Diffusion" (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=5735), in Advances in Futures and Options Research, volume 7, 1994.
References
[1] Damgaard, Christian; Jacob Weiner (2000). "Describing inequality in plant size or fecundity". Ecology 81 (4): 1139–1142. doi:10.1890/0012-9658(2000)081[1139:DIIPSO]2.0.CO;2.
[2] http://papers.ssrn.com/sol3/papers.cfm?abstract_id=5735
Citations
The Lognormal Distribution, Aitchison, J. and Brown, J.A.C. (1957)
Log-normal Distributions across the Sciences: Keys and Clues (http://stat.ethz.ch/~stahel/lognormal/bioscience.pdf), E. Limpert, W. Stahel and M. Abbt. BioScience, 51 (5), p. 341–352 (2001).
Eric W. Weisstein et al. Log Normal Distribution (http://mathworld.wolfram.com/LogNormalDistribution.html) at MathWorld. Electronic document, retrieved October 26, 2006.
Swamee, P.K. (2002). Near Lognormal Distribution (http://scitation.aip.org/getabs/servlet/GetabsServlet?prog=normal&id=JHYEFF000007000006000441000001&idtype=cvips&gifs=yes), Journal of Hydrologic Engineering 7 (6): 441–444.
Roy B. Leipnik (1991), On Lognormal Random Variables: I – The Characteristic Function (http://anziamj.austms.org.au/V32/part3/Leipnik.html), Journal of the Australian Mathematical Society Series B, vol. 32, pp. 327–347.
Gao et al. (2009), Asymptotic Behaviors of Tail Density for Sum of Correlated Lognormal Variables (http://www.hindawi.com/journals/ijmms/2009/630857.html), International Journal of Mathematics and Mathematical Sciences.
Daniel Dufresne (2009), Sums of Lognormals (http://www.soa.org/library/proceedings/arch/2009/arch-2009-iss1-dufresne.pdf), Centre for Actuarial Studies, University of Melbourne.
See also
Normal distribution
Geometric mean
Geometric standard deviation
Error function
Log-distance path loss model
Slow fading
Stochastic volatility
Heat equation
The heat equation predicts that if a hot body is
placed in a box of cold water, the temperature of
the body will decrease, and eventually (after
infinite time, and subject to no external heat
sources) the temperature in the box will equalize.
The heat equation is an important partial differential equation which describes the distribution of heat (or variation in temperature) in a given region over time. For a function u(x, y, z, t) of three spatial variables (x, y, z) and the time variable t, the heat equation is

∂u/∂t − α(∂²u/∂x² + ∂²u/∂y² + ∂²u/∂z²) = 0,

also written

u_t − α∇²u = 0, or sometimes u_t − αΔu = 0,

where α is a positive constant and Δ or ∇² denotes the Laplacian operator. For the mathematical treatment it is sufficient to consider the case α = 1. In the case of variation of temperature, u(x, y, z, t) is the temperature and α is the thermal diffusivity.
The heat equation is of fundamental importance in diverse scientific fields. In mathematics, it is the prototypical
parabolic partial differential equation. In probability theory, the heat equation is connected with the study of
Brownian motion via the Fokker–Planck equation. In financial mathematics it is used to solve the Black–Scholes
partial differential equation. The diffusion equation, a more general version of the heat equation, arises in connection
with the study of chemical diffusion and other related processes.
General description
Graphical representation of the solution to a 1D heat equation PDE.
Suppose one has a function u which describes the temperature at a
given location (x, y, z). This function will change over time as heat
spreads throughout space. The heat equation is used to determine the
change in the function u over time. The image to the right is animated
and describes the way heat changes in time along a metal bar. One of
the interesting properties of the heat equation is the maximum principle
which says that the maximum value of u is either earlier in time than
the region of concern or on the edge of the region of concern. This is
essentially saying that temperature comes either from some source or
from earlier in time because heat permeates but is not created from
nothingness. This is a property of parabolic partial differential
equations and is not difficult to prove mathematically (see below).
Another interesting property is that even if u has a discontinuity at an initial time t = t₀, the temperature becomes smooth as soon as t > t₀. For example, if one bar of metal has temperature 0 and another has temperature 100, and they are stuck together end to end, then very quickly the temperature at the point of connection is 50 and the graph of the temperature runs smoothly from 0 to 100.
The heat equation is used in probability and describes random walks. It is also applied in financial mathematics for
this reason.
It is also important in Riemannian geometry and thus topology: it was adapted by Richard Hamilton when he defined
the Ricci flow that was later used by Grigori Perelman to solve the topological Poincaré conjecture.
The physical problem and the equation
Derivation in one dimension
The heat equation is derived from Fourier's law and conservation of energy (Cannon 1984). By Fourier's law, the flow rate of heat energy q through a surface is proportional to the negative temperature gradient across the surface,

q = −k ∇u,

where k is the thermal conductivity and u is the temperature. In one dimension, the gradient is an ordinary spatial derivative, and so Fourier's law is

q = −k u_x,

where u_x is ∂u/∂x. In the absence of work done, a change in internal energy per unit volume in the material, ΔQ, is proportional to the change in temperature, Δu. That is,

ΔQ = c_p ρ Δu,

where c_p is the specific heat capacity and ρ is the mass density of the material. (In this section only, Δ is the ordinary difference operator, not the Laplacian.) Choosing zero energy at absolute zero temperature, this can be rewritten as

Q = c_p ρ u.
The increase in internal energy in a small spatial region of the material

x − Δx ≤ ξ ≤ x + Δx

over the time period

t − Δt ≤ τ ≤ t + Δt

is given by[1]

c_p ρ ∫_{x−Δx}^{x+Δx} [u(ξ, t + Δt) − u(ξ, t − Δt)] dξ = c_p ρ ∫_{t−Δt}^{t+Δt} ∫_{x−Δx}^{x+Δx} ∂u/∂τ dξ dτ,

where the fundamental theorem of calculus was used. Additionally, with no work done and absent any heat sources or sinks, the change in internal energy in the interval [x − Δx, x + Δx] is accounted for entirely by the flux of heat across the boundaries. By Fourier's law, this is

k ∫_{t−Δt}^{t+Δt} [u_x(x + Δx, τ) − u_x(x − Δx, τ)] dτ = k ∫_{t−Δt}^{t+Δt} ∫_{x−Δx}^{x+Δx} ∂²u/∂ξ² dξ dτ,

again by the fundamental theorem of calculus.[2] By conservation of energy,

∫_{t−Δt}^{t+Δt} ∫_{x−Δx}^{x+Δx} [c_p ρ u_τ − k u_ξξ] dξ dτ = 0.

This is true for any rectangle [t − Δt, t + Δt] × [x − Δx, x + Δx]. Consequently, the integrand must vanish identically:

c_p ρ u_t − k u_xx = 0,

which can be rewritten as

u_t = (k / (c_p ρ)) u_xx,

or

u_t = α u_xx,

which is the heat equation. The coefficient k/(c_p ρ) is called thermal diffusivity and is often denoted α.
Three-dimensional problem
In the special case of heat propagation in an isotropic and homogeneous medium in a 3-dimensional space, this equation is

u_t = α(u_xx + u_yy + u_zz),

where:
u = u(x, y, z, t) is temperature as a function of space and time;
u_t = ∂u/∂t is the rate of change of temperature at a point over time;
u_xx, u_yy, and u_zz are the second spatial derivatives (thermal conductions) of temperature in the x, y, and z directions, respectively;
α = k/(c_p ρ) is the thermal diffusivity, a material-specific quantity depending on the thermal conductivity k, the mass density ρ, and the specific heat capacity c_p.
The heat equation is a consequence of Fourier's law of cooling (see heat conduction).
If the medium is not the whole space, then in order to solve the heat equation uniquely we also need to specify boundary conditions for u. To determine uniqueness of solutions in the whole space it is necessary to assume an exponential bound on the growth of solutions; this assumption is consistent with observed experiments.
Solutions of the heat equation are characterized by a gradual smoothing of the initial temperature distribution by the
flow of heat from warmer to colder areas of an object. Generally, many different states and starting conditions will
tend toward the same stable equilibrium. As a consequence, to reverse the solution and conclude something about
earlier times or initial conditions from the present heat distribution is very inaccurate except over the shortest of time
periods.
The heat equation is the prototypical example of a parabolic partial differential equation.
Using the Laplace operator, the heat equation can be simplified, and generalized to similar equations over spaces of an arbitrary number of dimensions, as

u_t = α Δu,

where the Laplace operator, Δ or ∇², the divergence of the gradient, is taken in the spatial variables.
The heat equation governs heat diffusion, as well as other diffusive processes, such as particle diffusion or the
propagation of action potential in nerve cells. Although they are not diffusive in nature, some quantum mechanics
problems are also governed by a mathematical analog of the heat equation (see below). It also can be used to model
some phenomena arising in finance, like the Black–Scholes or the Ornstein–Uhlenbeck processes. The equation, and
various non-linear analogues, has also been used in image analysis.
The heat equation is, technically, in violation of special relativity, because its solutions involve instantaneous
propagation of a disturbance. The part of the disturbance outside the forward light cone can usually be safely
neglected, but if a reasonable speed for the transmission of heat must be maintained, a hyperbolic problem should be considered instead, such as a partial differential equation involving a second-order time derivative.
Internal heat generation
The function u above represents temperature of a body. Alternatively, it is sometimes convenient to change units and
represent u as the heat density of a medium. Since heat density is proportional to temperature in a homogeneous
medium, the heat equation is still obeyed in the new units.
Suppose that a body obeys the heat equation and, in addition, generates its own heat per unit volume (e.g., in watts/L) at a rate given by a known function q varying in space and time.[3] Then the heat per unit volume u satisfies an equation

∂u/∂t = α ∇²u + q.

For example, a tungsten light bulb filament generates heat, so it would have a positive nonzero value for q when turned on. While the light is turned off, the value of q for the tungsten filament would be zero.
Solving the heat equation using Fourier series
Idealized physical setting for heat conduction in a rod with homogeneous boundary
conditions.
The following solution technique for the heat equation was proposed by Joseph Fourier in his treatise Théorie analytique de la chaleur, published in 1822. Let us consider the heat equation for one space variable. This could be used to model heat conduction in a rod. The equation is

u_t = α u_xx,   (1)

where u = u(x, t) is a function of two variables x and t. Here
x is the space variable, so x ∈ [0, L], where L is the length of the rod;
t is the time variable, so t ≥ 0.
We assume the initial condition

u(x, 0) = f(x)   for all x ∈ [0, L],   (2)

where the function f is given, and the boundary conditions

u(0, t) = u(L, t) = 0   for all t > 0.   (3)
Let us attempt to find a solution of (1) which is not identically zero, satisfying the boundary conditions (3) but with the following property: u is a product in which the dependence of u on x, t is separated, that is:

u(x, t) = X(x) T(t).   (4)

This solution technique is called separation of variables. Substituting u back into equation (1),

T′(t) / (α T(t)) = X″(x) / X(x).

Since the right hand side depends only on x and the left hand side only on t, both sides are equal to some constant value −λ. Thus:

T′(t) = −λ α T(t)   (5)

and

X″(x) = −λ X(x).   (6)

We will now show that solutions of (6) for values of λ ≤ 0 cannot occur:
1. Suppose that λ < 0. Then there exist real numbers B, C such that

X(x) = B e^{√(−λ) x} + C e^{−√(−λ) x}.

From (3) we get X(0) = 0 = X(L), and therefore B = 0 = C, which implies u is identically 0.
2. Suppose that λ = 0. Then there exist real numbers B, C such that

X(x) = Bx + C.

From equation (3) we conclude in the same manner as in 1 that u is identically 0.
3. Therefore, it must be the case that λ > 0. Then there exist real numbers A, B, C such that

T(t) = A e^{−λ α t}

and

X(x) = B sin(√λ x) + C cos(√λ x).

From (3) we get C = 0 and that for some positive integer n,

√λ = nπ / L.

This solves the heat equation in the special case that the dependence of u has the special form (4). In general, the sum of solutions to (1) which satisfy the boundary conditions (3) also satisfies (1) and (3). We can show that the solution to (1), (2) and (3) is given by

u(x, t) = Σ_{n=1}^{∞} b_n sin(nπx/L) exp(−(nπ/L)² α t),

where

b_n = (2/L) ∫_0^L f(x) sin(nπx/L) dx.
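The Fourier series solution can be evaluated numerically; a minimal sketch, computing the coefficients b_n by the trapezoid rule. The single-mode initial data f(x) = sin(πx/L) is an illustrative choice, for which the exact solution is sin(πx/L) e^{−(π/L)²αt}:

```python
import math

def heat_series(f, L, alpha, x, t, n_terms=50):
    # u(x,t) = sum_n b_n sin(n pi x / L) exp(-alpha (n pi / L)^2 t),
    # with b_n = (2/L) * integral of f(x) sin(n pi x / L), by the trapezoid rule.
    u, m = 0.0, 2000
    xs = [L * j / m for j in range(m + 1)]
    for n in range(1, n_terms + 1):
        vals = [f(xx) * math.sin(n * math.pi * xx / L) for xx in xs]
        integral = (L / m) * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
        b_n = (2.0 / L) * integral
        u += b_n * math.sin(n * math.pi * x / L) * math.exp(-alpha * (n * math.pi / L)**2 * t)
    return u

L, alpha = 1.0, 1.0
f = lambda x: math.sin(math.pi * x / L)       # single Fourier mode as initial data
u = heat_series(f, L, alpha, x=0.3, t=0.1)
exact = math.sin(math.pi * 0.3) * math.exp(-math.pi**2 * 0.1)
```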
Generalizing the solution technique
The solution technique used above can be greatly extended to many other types of equations. The idea is that the operator u_xx with the zero boundary conditions can be represented in terms of its eigenvectors. This leads naturally to one of the basic ideas of the spectral theory of linear self-adjoint operators.
Consider the linear operator Δu = u_xx. The infinite sequence of functions

e_n(x) = √(2/L) sin(nπx/L)

for n ≥ 1 are eigenvectors of Δ. Indeed,

Δe_n = −(nπ/L)² e_n.

Moreover, any eigenvector f of Δ with the boundary conditions f(0) = f(L) = 0 is of the form e_n for some n ≥ 1. The functions e_n for n ≥ 1 form an orthonormal sequence with respect to a certain inner product on the space of real-valued functions on [0, L]. This means

⟨e_n, e_m⟩ = ∫_0^L e_n(x) e_m(x) dx = δ_{nm}.

Finally, the sequence {e_n}_{n ∈ N} spans a dense linear subspace of L²(0, L). This shows that in effect we have diagonalized the operator Δ.
Heat conduction in non-homogeneous anisotropic media
In general, the study of heat conduction is based on several principles. Heat flow is a form of energy flow, and as such it is meaningful to speak of the time rate of flow of heat into a region of space.
The time rate of heat flow into a region V is given by a time-dependent quantity q_t(V). We assume q has a density Q(x, t), so that

q_t(V) = ∫_V Q(x, t) dx.

Heat flow is a time-dependent vector function H(x) characterized as follows: the time rate of heat flowing through an infinitesimal surface element with area dS and with unit normal vector n is

H(x) · n(x) dS.

Thus the rate of heat flow into V is also given by the surface integral

q_t(V) = −∮_{∂V} H(x) · n(x) dS,

where n(x) is the outward pointing normal vector at x.
The Fourier law states that heat energy flow has the following linear dependence on the temperature gradient:

H(x) = −A(x) ∇u(x),

where A(x) is a 3 × 3 real matrix that is symmetric and positive definite.
By Green's theorem, the previous surface integral for heat flow into V can be transformed into the volume integral

q_t(V) = ∫_V ∇ · (A(x) ∇u(x)) dx.
The time rate of temperature change at x is proportional to the heat flowing into an infinitesimal volume element, where the constant of proportionality is dependent on a constant κ:

∂_t u(x, t) = κ(x) Q(x, t).

Putting these equations together gives the general equation of heat flow:

∂_t u(x, t) = κ(x) ∇ · (A(x) ∇u(x, t)).

Remarks.
The coefficient κ(x) is the inverse of the specific heat of the substance at x times the density of the substance at x.
In the case of an isotropic medium, the matrix A is a scalar matrix equal to the thermal conductivity.
In the anisotropic case where the coefficient matrix A is not scalar, and/or if it depends on x, then an explicit formula for the solution of the heat equation can seldom be written down. Though, it is usually possible to consider the associated abstract Cauchy problem and show that it is a well-posed problem and/or to show some qualitative properties (like preservation of positive initial data, infinite speed of propagation, convergence toward an equilibrium, smoothing properties). This is usually done by one-parameter semigroups theory: for instance, if A is a symmetric matrix, then the elliptic operator defined by

Au(x) := ∇ · (A(x) ∇u(x))

is self-adjoint and dissipative, thus by the spectral theorem it generates a one-parameter semigroup.
Fundamental solutions
A fundamental solution, also called a heat kernel, is a solution of the heat equation corresponding to the initial
condition of an initial point source of heat at a known position. These can be used to find a general solution of the
heat equation over certain domains; see, for instance, (Evans 1998) for an introductory treatment.
In one variable, the Green's function is a solution of the initial value problem

u_t = k u_xx,   −∞ < x < ∞, 0 < t < ∞,
u(x, 0) = δ(x),

where δ is the Dirac delta function. The solution to this problem is the fundamental solution

Φ(x, t) = (1/√(4πkt)) exp(−x²/(4kt)).

One can obtain the general solution of the one variable heat equation with initial condition u(x, 0) = g(x) for −∞ < x < ∞ and 0 < t < ∞ by applying a convolution:

u(x, t) = ∫ Φ(x − y, t) g(y) dy.

In several spatial variables, the fundamental solution solves the analogous problem

u_t = k(u_{x₁x₁} + … + u_{xₙxₙ}),   u(x, 0) = δ(x),

in −∞ < x_i < ∞, i = 1, …, n, and 0 < t < ∞. The n-variable fundamental solution is the product of the fundamental solutions in each variable; i.e.,

Φ(x, t) = Φ(x₁, t) Φ(x₂, t) ⋯ Φ(xₙ, t) = (4πkt)^{−n/2} exp(−|x|²/(4kt)).

The general solution of the heat equation on Rⁿ is then obtained by a convolution, so that to solve the initial value problem with u(x, 0) = g(x), one has

u(x, t) = ∫_{Rⁿ} Φ(x − y, t) g(y) dy.
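A small numerical sketch of the convolution formula, with a Gaussian initial datum chosen for illustration because the exact answer is known in closed form (the solution stays Gaussian, with variance grown by 2kt):

```python
import math

def heat_kernel(x, t, k=1.0):
    # Fundamental solution Phi(x, t) of u_t = k u_xx
    return math.exp(-x**2 / (4 * k * t)) / math.sqrt(4 * math.pi * k * t)

def solve(g, x, t, k=1.0, width=20.0, m=4000):
    # u(x, t) = integral of Phi(x - y, t) g(y) dy, by a Riemann sum
    h = 2 * width / m
    return sum(heat_kernel(x - y, t, k) * g(y) * h
               for y in (-width + h * j for j in range(m + 1)))

# Standard normal initial data: the solution at time t is an N(0, 1 + 2kt) density
g = lambda y: math.exp(-y**2 / 2) / math.sqrt(2 * math.pi)
u = solve(g, x=0.0, t=0.5)
exact = 1.0 / math.sqrt(2 * math.pi * (1 + 2 * 0.5))
```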
The general problem on a domain Ω in Rⁿ is

u_t = k Δu   in Ω × (0, ∞),
u(x, 0) = g(x)   in Ω,
with either Dirichlet or Neumann boundary data. A Green's function always exists, but unless the domain can be
readily decomposed into one-variable problems (see below), it may not be possible to write it down explicitly. The
method of images provides one additional technique for obtaining Green's functions for non-trivial domains.
Some Green's function solutions in 1D
A variety of elementary Green's function solutions in one dimension are recorded here. In some of these, the spatial domain is the entire real line (−∞, ∞). In others, it is the semi-infinite interval (0, ∞) with either Neumann or Dirichlet boundary conditions. One further variation is that some of these solve the inhomogeneous equation

u_t = k u_xx + f(x, t),

where f is some given function of x and t.
Homogeneous heat equation
Initial value problem on (−∞, ∞)

u(x, t) = (1/√(4πkt)) ∫_{−∞}^{∞} exp(−(x − y)²/(4kt)) g(y) dy.
Comment. This solution is the convolution with respect to the variable x of the fundamental solution Φ(·, t) and the function g. Therefore, according to the general properties of the convolution with respect to differentiation, u = g ∗ Φ(·, t) is a solution of the same heat equation for t > 0. Moreover, Φ(·, t) → δ as t → 0, so that, by general facts about approximation to the identity, u(·, t) → g as t → 0, in various senses according to the specific assumptions on g. For instance, if g is assumed bounded and continuous on R, then u(·, t) converges uniformly to g as t → 0, meaning that u is continuous on R × [0, ∞) with u(x, 0) = g(x).
Initial value problem on (0, ∞) with homogeneous Dirichlet boundary conditions

u(x, t) = (1/√(4πkt)) ∫_0^{∞} [exp(−(x − y)²/(4kt)) − exp(−(x + y)²/(4kt))] g(y) dy.

Comment. This solution is obtained from the preceding formula as applied to the data g(x), suitably extended to R so as to be an odd function, that is, letting g(−x) := −g(x) for all x. Correspondingly, the solution of the initial value problem on (0, ∞) is an odd function with respect to the variable x for all values of t, and in particular it satisfies the homogeneous Dirichlet boundary conditions u(0, t) = 0.
Initial value problem on (0, ∞) with homogeneous Neumann boundary conditions

u(x, t) = (1/√(4πkt)) ∫_0^{∞} [exp(−(x − y)²/(4kt)) + exp(−(x + y)²/(4kt))] g(y) dy.

Comment. This solution is obtained from the first solution formula as applied to the data g(x), suitably extended to R so as to be an even function, that is, letting g(−x) := g(x) for all x. Correspondingly, the solution of the initial value problem on (0, ∞) is an even function with respect to the variable x for all values of t > 0, and in particular, being smooth, it satisfies the homogeneous Neumann boundary conditions u_x(0, t) = 0.
Problem on (0, ∞) with homogeneous initial conditions and non-homogeneous Dirichlet boundary conditions

u(x, t) = ∫_0^t (x / √(4πk(t − s)³)) exp(−x²/(4k(t − s))) h(s) ds,   u(0, t) = h(t), u(x, 0) = 0.

Comment. This solution is the convolution with respect to the variable t of

θ(x, t) := (x / √(4πkt³)) exp(−x²/(4kt))

and the function h. Since Φ(x, t) is the fundamental solution of u_t = k u_xx, the function θ = −2k ∂_x Φ is also a solution of the same heat equation, and so is u := θ ∗ h, thanks to general properties of the convolution with respect to differentiation. Moreover, θ(x, ·) → δ as x → 0, so that, by general facts about approximation to the identity, u(x, ·) → h as x → 0, in various senses according to the specific assumptions on h. For instance, if h is assumed continuous on R with support in [0, ∞), then u(x, ·) converges uniformly on compacta to h as x → 0, meaning that u is continuous on [0, ∞) × [0, ∞) with u(0, t) = h(t).
Inhomogeneous heat equation
Problem on (−∞,∞) with homogeneous initial conditions
Comment. This solution is the convolution in ℝ², that is, with respect to both the variables x and t, of the fundamental solution Φ and the function f(x, t), both meant as defined on the whole of ℝ² and identically 0 for all t < 0. One verifies that
$$(\partial_t - k\,\partial_x^2)\,u = f,$$
which is expressed in the language of distributions as
$$(\partial_t - k\,\partial_x^2)\,\Phi = \delta,$$
where the distribution δ is Dirac's delta function, that is, the evaluation at 0.
Problem on (0,∞) with homogeneous Dirichlet boundary conditions and initial conditions
Comment. This solution is obtained from the preceding formula as applied to the data f(x, t), suitably extended to ℝ × [0,∞) so as to be an odd function of the variable x, that is, letting f(−x, t) := −f(x, t) for all x and t. Correspondingly, the solution of the inhomogeneous problem on (0,∞) is an odd function with respect to the variable x for all values of t, and in particular it satisfies the homogeneous Dirichlet boundary conditions u(0, t) = 0.
Problem on (0,∞) with homogeneous Neumann boundary conditions and initial conditions
Comment. This solution is obtained from the first formula as applied to the data f(x, t), suitably extended to ℝ × [0,∞) so as to be an even function of the variable x, that is, letting f(−x, t) := f(x, t) for all x and t. Correspondingly, the solution of the inhomogeneous problem on (0,∞) is an even function with respect to the variable x for all values of t, and in particular, being a smooth function, it satisfies the homogeneous Neumann boundary conditions ∂ₓu(0, t) = 0.
Examples
Since the heat equation is linear, solutions of other combinations of boundary conditions, inhomogeneous term, and
initial conditions can be found by taking an appropriate linear combination of the above Green's function solutions.
For example, to solve
let
where u and v solve the problems
Similarly, to solve
let
where w, v, and r solve the problems
Mean-value property for the heat equation
Solutions of the heat equation satisfy a mean-value property analogous to the mean-value properties of harmonic functions (solutions of Δu = 0), though a bit more complicated. Precisely, if u solves
$$u_t = \Delta u$$
and E(x, t; r) ⊂ dom(u), then
$$u(x,t) = \frac{1}{4r^n}\iint_{E(x,t;r)} u(y,s)\,\frac{|x-y|^2}{(t-s)^2}\, dy\, ds,$$
where E(x, t; r) is a "heat-ball", that is, a super-level set of the fundamental solution of the heat equation:
$$E(x,t;r) := \Big\{(y,s) : s \le t,\ \Phi(x-y, t-s) \ge r^{-n}\Big\}.$$
Notice that diam E(x, t; r) → 0 as r → 0, so the above formula holds for any (x, t) in the (open) domain of u for r small enough. Conversely, any function u satisfying the above mean-value property on an open domain of ℝⁿ × ℝ is a solution of the heat equation. This can be shown by an argument similar to the analogous one for harmonic functions.
Applications
Particle diffusion
One can model particle diffusion by an equation involving either:
the volumetric concentration of particles, denoted c, in the case of collective diffusion of a large number of
particles, or
the probability density function associated with the position of a single particle, denoted P.
In either case, one uses the heat equation
$$\frac{\partial c}{\partial t} = D\,\nabla^2 c \qquad \text{or} \qquad \frac{\partial P}{\partial t} = D\,\nabla^2 P.$$
Both c and P are functions of position and time. D is the diffusion coefficient that controls the speed of the diffusive
process, and is typically expressed in square meters per second. If the diffusion coefficient D is not constant, but
depends on the concentration c (or P in the second case), then one gets the nonlinear diffusion equation.
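To make the collective-diffusion case concrete, here is a minimal sketch (grid size, D, and time step are illustrative choices of mine, NumPy assumed) that advances the one-dimensional equation c_t = D c_xx with the simplest explicit finite-difference scheme:

```python
import numpy as np

def diffuse_explicit(c0, D, dx, dt, steps):
    """Explicit (FTCS) scheme for c_t = D * c_xx with fixed (Dirichlet) ends.

    Stable only when r = D*dt/dx**2 <= 1/2; each step adds r times the
    discrete second difference to every interior value.
    """
    r = D * dt / dx**2
    assert r <= 0.5, "explicit scheme unstable for r > 1/2"
    c = c0.astype(float).copy()
    for _ in range(steps):
        c[1:-1] += r * (c[2:] - 2.0 * c[1:-1] + c[:-2])
    return c

# A concentration spike in the middle of [0, 1] spreads out symmetrically.
x = np.linspace(0.0, 1.0, 101)
c0 = np.zeros_like(x)
c0[50] = 1.0
c = diffuse_explicit(c0, D=1.0, dx=x[1] - x[0], dt=4e-5, steps=100)
```

Total mass is (approximately) conserved and the profile stays symmetric, as diffusion requires; the stability restriction r ≤ 1/2 is what motivates the implicit Crank–Nicolson method discussed further below.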
Brownian motion
The random trajectory of a single particle subject to the particle diffusion equation (or heat equation) is a Brownian
motion. If a particle is placed at a point R₀ at time t = 0, then the probability density function associated with the position vector R of the particle at time t will be
$$P(R, t) = \frac{1}{(4\pi D t)^{n/2}}\, \exp\!\Big({-\frac{|R - R_0|^2}{4Dt}}\Big) \qquad (n \text{ space dimensions}),$$
which is a (multivariate) normal distribution evolving in time.
Schrdinger equation for a free particle
With a simple division, the Schrödinger equation for a single particle of mass m in the absence of any applied force field can be rewritten in the following way:
$$\frac{\partial \psi}{\partial t} = \frac{i\hbar}{2m}\,\nabla^2 \psi,$$
where i is the imaginary unit, ℏ is Planck's constant divided by 2π, and ψ is the wavefunction of the particle.
This equation is formally similar to the particle diffusion equation, which one obtains through the transformation
$$c(R, t) \to \psi(R, t), \qquad D \to \frac{i\hbar}{2m}.$$
Applying this transformation to the expressions of the Green functions determined in the case of particle diffusion yields the Green functions of the Schrödinger equation, which in turn can be used to obtain the wavefunction at any time t through an integral on the wavefunction at t = 0:
$$\psi(R, t) = \int G(R - R_0, t)\, \psi(R_0, 0)\, dR_0,$$
with G obtained from the diffusion kernel by the same substitution D → iℏ/(2m).
Remark: this analogy between quantum mechanics and diffusion is a purely formal one. Physically, the evolution of the wavefunction satisfying Schrödinger's equation might have an origin other than diffusion.
Thermal diffusivity in polymers
A direct practical application of the heat equation, in conjunction with Fourier theory, in spherical coordinates, is the
measurement of the thermal diffusivity in polymers (Unsworth and Duarte). The dual theoretical-experimental
method demonstrated by these authors is applicable to rubber and various other materials of practical interest.
Further applications
The heat equation arises in the modeling of a number of phenomena and is often used in financial mathematics in the
modeling of options. The famous Black–Scholes option pricing model's differential equation can be transformed into
the heat equation allowing relatively easy solutions from a familiar body of mathematics. Many of the extensions to
the simple option models do not have closed form solutions and thus must be solved numerically to obtain a modeled
option price. The heat equation is also widely used in image analysis (Perona & Malik 1990) and in
machine-learning as the driving theory behind graph Laplacian methods. The heat equation can be efficiently solved
numerically using the Crank–Nicolson method (Crank & Nicolson 1947). This method can be extended to many of the models with no closed-form solution; see for instance (Wilmott, Howison & Dewynne 1995).
An abstract form of the heat equation on manifolds provides a major approach to the Atiyah–Singer index theorem, and has led to much further work on heat equations in Riemannian geometry.
Notes
[1] Here we are assuming that the material has constant mass density and heat capacity through space as well as time, although generalizations
are given below.
[2] In higher dimensions, the divergence theorem is used instead.
[3] Note that the units of u must be selected in a manner compatible with those of q. Thus instead of being temperature (K), units of u should be
J/L.
References
Cannon, John (1984), The One-Dimensional Heat Equation, Encyclopedia of Mathematics and its Applications, Addison-Wesley, ISBN 0-521-30243-9
Crank, J.; Nicolson, P.; Hartree, D. R. (1947), "A Practical Method for Numerical Evaluation of Solutions of Partial Differential Equations of the Heat-Conduction Type", Proceedings of the Cambridge Philosophical Society 43: 50–67, doi:10.1017/S0305004100023197
Einstein, Albert (1905), "Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen", Ann. Phys. (Leipzig) 322: 549–560, doi:10.1002/andp.19053220806
Evans, L. C. (1998), Partial Differential Equations, American Mathematical Society, ISBN 0-8218-0772-2
John, Fritz (1991), Partial Differential Equations (4th ed.), Springer, ISBN 978-0387906096
Wilmott, P.; Howison, S.; Dewynne, J. (1995), The Mathematics of Financial Derivatives: A Student Introduction, Cambridge University Press
Carslaw, H. S.; Jaeger, J. C. (1973), Conduction of Heat in Solids (2nd ed.), Oxford University Press, ISBN 978-0198533689
Perona, P.; Malik, J. (1990), "Scale-Space and Edge Detection Using Anisotropic Diffusion", IEEE Transactions on Pattern Analysis and Machine Intelligence 12 (7): 629–639
Unsworth, J.; Duarte, F. J. (1979), "Heat diffusion in a solid sphere and Fourier Theory", Am. J. Phys. 47: 891–893, doi:10.1119/1.11601
External links
Derivation of the heat equation (http://www.mathphysics.com/pde/HEderiv.html)
Linear heat equations (http://eqworld.ipmnet.ru/en/solutions/lpde/heat-toc.pdf): Particular solutions and boundary value problems, from EqWorld
Radon–Nikodym derivative
In mathematics, the Radon–Nikodym theorem is a result in functional analysis that states that, given a measurable space (X, Σ), if a σ-finite measure ν on (X, Σ) is absolutely continuous with respect to a σ-finite measure μ on (X, Σ), then there is a measurable function f on X, taking values in [0, ∞), such that
$$\nu(A) = \int_A f \, d\mu$$
for any measurable set A.
The theorem is named after Johann Radon, who proved the theorem for the special case where the underlying space is R^N in 1913, and after Otton Nikodym, who proved the general case in 1930.[1]
Above, the function f was assumed to be complex-valued (or real-valued). If Y is a Banach space and the generalization of the Radon–Nikodym theorem also holds for functions with values in Y (mutatis mutandis), then Y is said to have the Radon–Nikodym property. All Hilbert spaces have the Radon–Nikodym property.
Radon–Nikodym derivative
The function f satisfying the above equality is uniquely defined up to a μ-null set, that is, if g is another function which satisfies the same property, then f = g μ-almost everywhere. f is commonly written dν/dμ and is called the Radon–Nikodym derivative. The choice of notation and the name of the function reflect the fact that the function is analogous to a derivative in calculus in the sense that it describes the rate of change of density of one measure with respect to another (the way the Jacobian determinant is used in multivariable integration). A similar theorem can be proven for signed and complex measures: namely, that if μ is a nonnegative σ-finite measure, and ν is a finite-valued signed or complex measure such that ν ≪ μ, there is a μ-integrable real- or complex-valued function g on X such that
$$\nu(A) = \int_A g \, d\mu$$
for any measurable set A.
Applications
The theorem is very important in extending the ideas of probability theory from probability masses and probability
densities defined over real numbers to probability measures defined over arbitrary sets. It tells if and how it is
possible to change from one probability measure to another. Specifically, the probability density function of a
random variable is the Radon–Nikodym derivative of the induced measure with respect to some base measure
(usually the Lebesgue measure for continuous random variables).
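A toy illustration of this density reading of the theorem, on a hypothetical three-point space (my own choice of masses) where integrals reduce to finite sums:

```python
from fractions import Fraction

# Two measures on a three-point space; mu charges every point, so any
# measure nu on this space is absolutely continuous with respect to mu.
X = ["a", "b", "c"]
mu = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
nu = {"a": Fraction(1, 6), "b": Fraction(1, 3), "c": Fraction(1, 2)}

# On a countable space the Radon-Nikodym derivative f = d(nu)/d(mu) is
# just the ratio of point masses.
f = {x: nu[x] / mu[x] for x in X}

def nu_from_density(A):
    """Recover nu(A) as the mu-integral of f over the set A."""
    return sum(f[x] * mu[x] for x in A)
```

By construction, integrating f against μ over any set A reproduces ν(A) exactly, which is the content of the theorem in this trivial setting.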
For example, it can be used to prove the existence of conditional expectation for probability measures. The latter
itself is a key concept in probability theory, as conditional probability is just a special case of it.
Amongst other fields, financial mathematics uses the theorem extensively. Such changes of probability measure are
the cornerstone of the rational pricing of derivative securities and are used for converting actual probabilities into risk-neutral probabilities.
Properties
Let ν, μ, and λ be σ-finite measures on the same measure space.
If ν ≪ λ and μ ≪ λ (ν and μ are both absolutely continuous with respect to λ), then
$$\frac{d(\nu+\mu)}{d\lambda} = \frac{d\nu}{d\lambda} + \frac{d\mu}{d\lambda} \quad \lambda\text{-almost everywhere.}$$
If ν ≪ μ ≪ λ, then
$$\frac{d\nu}{d\lambda} = \frac{d\nu}{d\mu}\cdot\frac{d\mu}{d\lambda} \quad \lambda\text{-almost everywhere.}$$
If μ ≪ λ and g is a μ-integrable function, then
$$\int_X g \, d\mu = \int_X g\,\frac{d\mu}{d\lambda} \, d\lambda.$$
If μ ≪ ν and ν ≪ μ, then
$$\frac{d\mu}{d\nu} = \left(\frac{d\nu}{d\mu}\right)^{-1}.$$
If ν is a finite signed or complex measure, then
$$\frac{d|\nu|}{d\mu} = \left|\frac{d\nu}{d\mu}\right|.$$
Further applications
Information divergences
If μ and ν are measures over X with μ ≪ ν, then:
The Kullback–Leibler divergence from μ to ν is defined to be
$$D_{\mathrm{KL}}(\mu \| \nu) = \int_X \log\!\left(\frac{d\mu}{d\nu}\right) d\mu.$$
For α > 0, α ≠ 1, the Rényi divergence of order α from μ to ν is defined to be
$$D_\alpha(\mu \| \nu) = \frac{1}{\alpha - 1}\,\log\!\left(\int_X \left(\frac{d\mu}{d\nu}\right)^{\alpha-1} d\mu\right).$$
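On a finite set the Radon–Nikodym derivative is a pointwise ratio of probabilities, so both divergences reduce to sums; a small sketch with made-up distributions:

```python
import math

# Two strictly positive distributions on the same finite set, so the
# density ratio dP/dQ is just the pointwise quotient of probabilities.
P = [0.5, 0.3, 0.2]
Q = [0.25, 0.25, 0.5]

# Kullback-Leibler divergence: the P-expectation of log(dP/dQ).
kl = sum(p * math.log(p / q) for p, q in zip(P, Q))

def renyi(P, Q, alpha):
    """Renyi divergence of order alpha (alpha > 0, alpha != 1); it tends
    to the Kullback-Leibler divergence as alpha -> 1."""
    return math.log(sum(q * (p / q) ** alpha
                        for p, q in zip(P, Q))) / (alpha - 1.0)
```

Both quantities are nonnegative and vanish exactly when the two distributions coincide.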
The assumption of σ-finiteness
The Radon–Nikodym theorem makes the assumption that the measure μ with respect to which one computes the rate of change of ν is σ-finite. Here is an example where μ is not σ-finite and the Radon–Nikodym theorem fails to hold.
Consider the Borel σ-algebra on the real line. Let the counting measure μ of a Borel set A be defined as the number of elements of A if A is finite, and +∞ otherwise. One can check that μ is indeed a measure. It is not σ-finite, as not every Borel set is at most a countable union of finite sets. Let ν be the usual Lebesgue measure on this Borel algebra. Then ν is absolutely continuous with respect to μ, since for a set A one has μ(A) = 0 only if A is the empty set, and then ν(A) is also zero.
Assume that the Radon–Nikodym theorem holds, that is, for some measurable function f one has
$$\nu(A) = \int_A f \, d\mu$$
for all Borel sets. Taking A to be a singleton set, A = {a}, and using the above equality, one finds
$$0 = f(a)$$
for all real numbers a. This implies that the function f, and therefore the Lebesgue measure ν, is zero, which is a contradiction.
Proof
This section gives a measure-theoretic proof of the theorem. There is also a functional-analytic proof, using Hilbert space methods, that was first given by von Neumann.
For finite measures μ and ν, the idea is to consider functions f ≥ 0 with ∫_A f dμ ≤ ν(A) for every measurable set A. The supremum of all such functions, along with the monotone convergence theorem, then furnishes the Radon–Nikodym derivative. The fact that the remaining part of ν is singular with respect to μ follows from a technical fact about finite measures. Once the result is established for finite measures, extending to σ-finite, signed, and complex measures can be done naturally. The details are given below.
For finite measures
First, suppose μ and ν are both finite-valued nonnegative measures. Let F be the set of those measurable functions f : X → [0, ∞] satisfying
$$\int_A f \, d\mu \le \nu(A) \quad \text{for every } A \in \Sigma$$
(this set is not empty, for it contains at least the zero function). Let f₁, f₂ ∈ F; let A be an arbitrary measurable set, and define A₁ = {x ∈ A : f₁(x) > f₂(x)} and A₂ = {x ∈ A : f₂(x) ≥ f₁(x)}. Then one has
$$\int_A \max\{f_1, f_2\} \, d\mu = \int_{A_1} f_1 \, d\mu + \int_{A_2} f_2 \, d\mu \le \nu(A_1) + \nu(A_2) = \nu(A),$$
and therefore max{f₁, f₂} ∈ F.
Now, let {fₙ}ₙ be a sequence of functions in F such that
$$\lim_{n\to\infty} \int_X f_n \, d\mu = \sup_{f \in F} \int_X f \, d\mu.$$
By replacing fₙ with the maximum of the first n functions, one can assume that the sequence {fₙ} is increasing. Let g be the function defined as
$$g(x) := \lim_{n\to\infty} f_n(x).$$
By Lebesgue's monotone convergence theorem, one has
$$\int_A g \, d\mu = \lim_{n\to\infty} \int_A f_n \, d\mu \le \nu(A)$$
for each A ∈ Σ, and hence g ∈ F. Also, by the construction of g,
$$\int_X g \, d\mu = \sup_{f \in F} \int_X f \, d\mu.$$
Now, since g ∈ F,
$$\nu_0(A) := \nu(A) - \int_A g \, d\mu$$
defines a nonnegative measure on Σ. Suppose ν₀ ≠ 0; then, since μ is finite, there is an ε > 0 such that ν₀(X) > ε μ(X). Let (P, N) be a Hahn decomposition for the signed measure ν₀ − ε μ. Note that for every A ∈ Σ one has ν₀(A ∩ P) ≥ ε μ(A ∩ P), and hence
$$\nu(A) \ge \int_A g \, d\mu + \nu_0(A \cap P) \ge \int_A g \, d\mu + \varepsilon\,\mu(A \cap P) = \int_A (g + \varepsilon\,1_P) \, d\mu.$$
Also, note that μ(P) > 0; for if μ(P) = 0, then (since ν is absolutely continuous in relation to μ) ν₀(P) ≤ ν(P) = 0, so ν₀(P) = 0 and
$$\nu_0(X) - \varepsilon\,\mu(X) = (\nu_0 - \varepsilon\,\mu)(N) \le 0,$$
contradicting the fact that ν₀(X) > ε μ(X). Then, since
$$\nu(A) \ge \int_A (g + \varepsilon\,1_P) \, d\mu \quad \text{for every } A \in \Sigma,$$
g + ε 1_P belongs to F and satisfies
$$\int_X (g + \varepsilon\,1_P) \, d\mu = \int_X g \, d\mu + \varepsilon\,\mu(P) > \sup_{f \in F} \int_X f \, d\mu.$$
This is impossible; therefore, the initial assumption that ν₀ ≠ 0 must be false. So ν₀ = 0, as desired.
Now, since g is μ-integrable, the set {x ∈ X : g(x) = +∞} is μ-null. Therefore, if f is defined as
$$f(x) = \begin{cases} g(x) & \text{if } g(x) < \infty, \\ 0 & \text{otherwise,} \end{cases}$$
then f has the desired properties.
As for the uniqueness, let f, g : X → [0, ∞) be measurable functions satisfying
$$\int_A f \, d\mu = \int_A g \, d\mu = \nu(A)$$
for every measurable set A. Then, g − f is μ-integrable, and
$$\int_A (g - f) \, d\mu = 0.$$
In particular, this holds for A = {x ∈ X : f(x) > g(x)} or {x ∈ X : f(x) < g(x)}. It follows that
$$\int_X (g - f)^+ \, d\mu = 0,$$
and so (g − f)⁺ = 0 μ-almost everywhere; the same is true for (g − f)⁻, and thus f = g μ-almost everywhere, as desired.
For σ-finite positive measures
If μ and ν are σ-finite, then X can be written as the union of a sequence {Bₙ}ₙ of disjoint sets in Σ, each of which has finite measure under both μ and ν. For each n, there is a Σ-measurable function fₙ : Bₙ → [0, ∞) such that
$$\nu(A) = \int_A f_n \, d\mu$$
for each Σ-measurable subset A of Bₙ. The union f of those functions is then the required function.
As for the uniqueness, since each of the fₙ is μ-almost everywhere unique, then so is f.
For signed and complex measures
If ν is a σ-finite signed measure, then it can be Hahn–Jordan decomposed as ν = ν⁺ − ν⁻, where one of the two measures is finite. Applying the previous result to those two measures, one obtains two functions, g, h : X → [0, ∞), satisfying the Radon–Nikodym theorem for ν⁺ and ν⁻ respectively, at least one of which is μ-integrable (i.e., its integral with respect to μ is finite). It is clear then that f = g − h satisfies the required properties, including uniqueness, since both g and h are unique up to μ-almost everywhere equality.
If ν is a complex measure, it can be decomposed as ν = ν₁ + iν₂, where both ν₁ and ν₂ are finite-valued signed measures. Applying the above argument, one obtains two functions, g, h, satisfying the required properties for ν₁ and ν₂, respectively. Clearly, f = g + ih is the required function.
Notes
[1] Nikodym, O. (1930). "Sur une généralisation des intégrales de M. J. Radon" (http://matwbn.icm.edu.pl/ksiazki/fm/fm15/fm15114.pdf) (in French). Fundamenta Mathematicae 15: 131–179. JFM 56.0922.02. Retrieved 2009-05-11.
References
Shilov, G. E., and Gurevich, B. L., 1978. Integral, Measure, and Derivative: A Unified Approach, Richard A.
Silverman, trans. Dover Publications. ISBN 0486635198.
This article incorporates material from Radon-Nikodym theorem on PlanetMath, which is licensed under the
Creative Commons Attribution/Share-Alike License.
Risk-neutral measure
In mathematical finance, a risk-neutral measure (also called an equivalent martingale measure or Q-measure) is a probability measure that results when one assumes that the current value of all financial assets is equal to the expected future payoff of the asset discounted at the risk-free rate. The concept is used in the pricing of derivatives.
Idea
In an actual economy, prices of assets depend crucially on their risk. Investors typically demand payment for bearing
uncertainty. Therefore, today's price of a claim on a risky amount realised tomorrow will generally differ from its
expected value. Most commonly,[1] investors are risk-averse and today's price is below the expectation, remunerating those who bear the risk.
To price assets, consequently, the calculated expected values need to be adjusted for the risk involved (see also
Sharpe ratio).
It turns out that, under certain weak conditions (absence of arbitrage), there is an alternative way to do this calculation:
Instead of first taking the expectation and then adjusting for risk, one can first adjust the probabilities of future
outcomes such that they incorporate the effects of risk, and then take the expectation under those different
probabilities. Those adjusted, 'virtual' probabilities are called risk-neutral probabilities; together they constitute the risk-neutral measure.
The probabilities over asset outcomes in the real world cannot be impacted; the constructed probabilities are
counterfactual. They are only computed because the second way of pricing, called risk-neutral pricing, is often much
simpler to calculate than the first.
The main benefit stems from the fact that once the risk-neutral probabilities are found, every asset can be priced by
simply taking its expected payoff (i.e. calculating as if investors were risk neutral). If we used the real-world,
physical probabilities, every security would require a different adjustment (as they differ in riskiness).
Note that under the risk-neutral measure all assets have the same expected rate of return, the risk-free rate (or short
rate). This does not imply the assumption that investors are risk-neutral. On the contrary, the point is to price given
exactly the risk aversion we observe in the physical world. Towards that aim, we hypothesize about parallel
universes where everybody is risk neutral. The risk-neutral measure is the probability measure of that parallel
universe where all claims have exactly the prices they have in our real world.
Mathematically, adjusting the probabilities is a measure transformation to an equivalent martingale measure; it is
possible if there are no arbitrage opportunities. If the markets are complete, the risk-neutral measure is unique.
Often, the physical measure is called P, and the risk-neutral one Q. The term physical measure is sometimes abused to denote the Lebesgue measure or, occasionally, the measure induced by the corresponding normal density with respect to the Lebesgue measure.
Usage
Risk-neutral measures make it easy to express the value of a derivative in a formula. Suppose at a future time T a derivative (e.g., a call option on a stock) pays H_T units, where H_T is a random variable on the probability space describing the market. Further suppose that the discount factor from now (time zero) until time T is DF(0, T). Then today's fair value of the derivative is
$$H_0 = DF(0,T)\,E_Q[H_T],$$
where the risk-neutral measure is denoted by Q. This can be re-stated in terms of the physical measure P as
$$H_0 = DF(0,T)\,E_P\!\left[\frac{dQ}{dP}\,H_T\right],$$
where dQ/dP is the Radon–Nikodym derivative of Q with respect to P.
Another name for the risk-neutral measure is the equivalent martingale measure. If in a financial market there is just
one risk-neutral measure, then there is a unique arbitrage-free price for each asset in the market. This is the
fundamental theorem of arbitrage-free pricing. If there are more such measures, then in an interval of prices no
arbitrage is possible. If no equivalent martingale measure exists, arbitrage opportunities do.
Example 1: Binomial model of stock prices
Given a probability space (Ω, F, P), consider a single-period binomial model. A probability measure Q is called risk-neutral if the current price of every traded asset equals its discounted expected value under Q, i.e. S₀ = E_Q[S₁/(1+r)]. Suppose we have a two-state economy: the initial stock price S₀ can go either up to S₀u or down to S₀d. If the interest rate is r > 0, and d < 1 + r < u, then the risk-neutral probability of an upward stock movement is given by the number
$$\pi = \frac{(1+r) - d}{u - d}.$$
Given a derivative with payoff X_u when the stock price moves up and X_d when it goes down, we can price the derivative via
$$X = \frac{\pi X_u + (1-\pi) X_d}{1+r}.$$
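The one-period recipe above can be checked with a short sketch (the numbers u = 1.2, d = 0.9, r = 5% are hypothetical):

```python
def one_period_price(S0, u, d, r, payoff):
    """Price a derivative in the one-period binomial model.

    No-arbitrage requires d < 1 + r < u; the risk-neutral up-probability
    p = (1 + r - d)/(u - d) makes the discounted stock a martingale.
    """
    assert d < 1 + r < u, "need d < 1 + r < u to rule out arbitrage"
    p = (1.0 + r - d) / (u - d)
    return (p * payoff(S0 * u) + (1.0 - p) * payoff(S0 * d)) / (1.0 + r)

# A call struck at 100 on a stock worth 100 today.
call = one_period_price(100.0, 1.2, 0.9, 0.05,
                        lambda s: max(s - 100.0, 0.0))

# Sanity check: pricing the stock itself must return today's price.
stock = one_period_price(100.0, 1.2, 0.9, 0.05, lambda s: s)
```

The second call illustrates the defining property of the measure: under the risk-neutral probabilities the stock's own discounted expected payoff is exactly its current price.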
Example 2: Brownian motion model of stock prices
Suppose our economy consists of two assets, a stock and a risk-free bond, and that we use the Black–Scholes model. In the model the evolution of the stock price can be described by geometric Brownian motion:
$$dS_t = \mu S_t\, dt + \sigma S_t\, dW_t,$$
where W_t is a standard Brownian motion with respect to the physical measure. If we define
$$\tilde W_t = W_t + \frac{\mu - r}{\sigma}\, t,$$
Girsanov's theorem states that there exists a measure Q under which W̃_t is a Brownian motion. The quantity (μ − r)/σ is known as the market price of risk. Differentiating and rearranging yields:
$$dW_t = d\tilde W_t - \frac{\mu - r}{\sigma}\, dt.$$
Put this back in the original equation:
$$dS_t = r S_t\, dt + \sigma S_t\, d\tilde W_t.$$
Q is the unique risk-neutral measure for the model. The discounted payoff process of a derivative on the stock, H̃_t = E_Q[e^{−rT} H_T | F_t], is a martingale under Q. Since the discounted stock price e^{−rt}S_t and H̃ are Q-martingales, we can invoke the martingale representation theorem to find a replicating strategy: a holding of stocks and bonds that pays off H_t at all times t ≤ T.
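A quick Monte Carlo sanity check of the martingale property of discounted prices under Q (all parameters are illustrative, NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(0)

# Terminal stock price simulated under the risk-neutral measure Q, where
# dS = r*S*dt + sigma*S*dW~; the discounted price exp(-r*T)*S_T should
# then have Q-expectation S0, since discounted prices are Q-martingales.
S0, r, sigma, T, n = 100.0, 0.05, 0.2, 1.0, 200_000
Z = rng.standard_normal(n)
ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)

discounted_mean = np.exp(-r * T) * ST.mean()   # close to S0 = 100
```

Note the drift is r, not the physical drift μ: that substitution is exactly the measure change derived in Example 2.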
Notes
[1] At least in large financial markets. Examples of risk-seeking markets are casinos and lotteries.
See also
Mathematical finance
Forward measure
Fundamental theorem of arbitrage-free pricing
Law of one price
Rational pricing
Brownian model of financial markets
Martingale (probability theory)
External links
Gisiger, Nicolas: Risk-Neutral Probabilities Explained (http://ssrn.com/abstract=1395390)
Tham, Joseph: Risk-neutral Valuation: A Gentle Introduction (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=290044), Part II (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=292724)
Stochastic calculus
Stochastic calculus is a branch of mathematics that operates on stochastic processes. It allows a consistent theory of
integration to be defined for integrals of stochastic processes with respect to stochastic processes. It is used to model
systems that behave randomly.
The best-known stochastic process to which stochastic calculus is applied is the Wiener process (named in honor of
Norbert Wiener), which is used for modeling Brownian motion as described by Albert Einstein and other physical
diffusion processes in space of particles subject to random forces. Since the 1970s, the Wiener process has been
widely applied in financial mathematics and economics to model the evolution in time of stock prices and bond
interest rates.
The main flavours of stochastic calculus are the Itô calculus and its variational relative, the Malliavin calculus. For technical reasons the Itô integral is the most useful for general classes of processes, but the related Stratonovich integral is frequently useful in problem formulation (particularly in engineering disciplines). The Stratonovich integral can readily be expressed in terms of the Itô integral. The main benefit of the Stratonovich integral is that it obeys the usual chain rule and therefore does not require Itô's lemma. This enables problems to be expressed in a coordinate-system-invariant form, which is invaluable when developing stochastic calculus on manifolds other than Rⁿ. The dominated convergence theorem does not hold for the Stratonovich integral; consequently it is very difficult to prove results without re-expressing the integrals in Itô form.
Itô integral
The Itô integral is central to the study of stochastic calculus. The integral
$$\int_0^t H_s \, dX_s$$
is defined for a semimartingale X and a locally bounded predictable process H.
Stratonovich integral
The Stratonovich integral of a semimartingale X against another semimartingale Y can be defined in terms of the Itô integral as
$$\int_0^t X_{s-} \circ dY_s := \int_0^t X_{s-} \, dY_s + \frac{1}{2}\,[X, Y]^c_t,$$
where [X, Y]ᶜ_t denotes the quadratic covariation of the continuous parts of X and Y. The alternative notation
$$\int_0^t X_s \, \partial Y_s$$
is also used to denote the Stratonovich integral.
Applications
A very important application of stochastic calculus is in quantitative finance, in which asset prices are often assumed
to follow geometric Brownian motion.
External links
Notes on Stochastic Calculus [1]: a short elementary description of the basic Itô integral.
T. Szabados and B. Szekely, Stochastic integration based on simple, symmetric random walks [1]: a new approach which the authors hope is more transparent and technically less demanding.
References
[1] http://arXiv.org/abs/0712.3908/
Wiener process
Wiener process
A single realization of a one-dimensional Wiener process
A single realization of a three-dimensional Wiener process
In mathematics, the Wiener process is a
continuous-time stochastic process named in
honor of Norbert Wiener. It is often called
Brownian motion, after Robert Brown. It is
one of the best known Lévy processes (càdlàg stochastic processes with stationary independent increments) and occurs
frequently in pure and applied mathematics,
economics and physics.
The Wiener process plays an important role
both in pure and applied mathematics. In
pure mathematics, the Wiener process gave
rise to the study of continuous time
martingales. It is a key process in terms of
which more complicated stochastic
processes can be described. As such, it plays
a vital role in stochastic calculus, diffusion
processes and even potential theory. It is the
driving process of Schramm-Loewner
evolution. In applied mathematics, the
Wiener process is used to represent the
integral of a Gaussian white noise process,
and so is useful as a model of noise in
electronics engineering, instrument errors in
filtering theory and unknown forces in
control theory.
The Wiener process has applications
throughout the mathematical sciences. In
physics it is used to study Brownian motion,
the diffusion of minute particles suspended
in fluid, and other types of diffusion via the
Fokker-Planck and Langevin equations. It
also forms the basis for the rigorous path
integral formulation of quantum mechanics
(by the Feynman-Kac formula, a solution to
the Schrödinger equation can be represented
in terms of the Wiener process) and the study of eternal inflation in physical cosmology. It is also prominent in the
mathematical theory of finance, in particular the BlackScholes option pricing model.
Characterizations of the Wiener process
The Wiener process W_t is characterized by three properties:[1]
1. W₀ = 0
2. W_t is almost surely continuous
3. W_t has independent increments with W_t − W_s ~ N(0, t − s) (for 0 ≤ s < t).
N(μ, σ²) denotes the normal distribution with expected value μ and variance σ². The condition that it has independent increments means that if 0 ≤ s₁ ≤ t₁ ≤ s₂ ≤ t₂ then W_{t₁} − W_{s₁} and W_{t₂} − W_{s₂} are independent random variables, and the similar condition holds for n increments.
An alternative characterization of the Wiener process is the so-called Lévy characterization, which says that the Wiener process is an almost surely continuous martingale with W₀ = 0 and quadratic variation [W_t, W_t] = t (which means that W_t² − t is also a martingale).
A third characterization is that the Wiener process has a spectral representation as a sine series whose coefficients are independent N(0, 1) random variables. This representation can be obtained using the Karhunen–Loève theorem.
The Wiener process can be constructed as the scaling limit of a random walk, or of other discrete-time stochastic processes with stationary independent increments. This is known as Donsker's theorem. Like the random walk, the Wiener process is recurrent in one or two dimensions (meaning that it returns almost surely to any fixed neighborhood of the origin infinitely often), whereas it is not recurrent in dimensions three and higher. Unlike the random walk, it is scale invariant, meaning that
$$\alpha^{-1} W_{\alpha^2 t}$$
is a Wiener process for any nonzero constant α. The Wiener measure is the probability law on the space of continuous functions g, with g(0) = 0, induced by the Wiener process. An integral based on Wiener measure may be called a Wiener integral.
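Donsker's scaling limit can be visualised numerically; the sketch below (sizes are illustrative, NumPy assumed) rescales a ±1 random walk and checks that its time-1 value is approximately standard normal:

```python
import numpy as np

rng = np.random.default_rng(1)

# Donsker-style rescaling of a simple random walk: S_{floor(nt)} / sqrt(n)
# approaches a Wiener process as the number of steps n grows.
n_steps, n_paths = 1_000, 20_000
steps = rng.choice([-1.0, 1.0], size=(n_paths, n_steps))
walk = steps.cumsum(axis=1) / np.sqrt(n_steps)

end = walk[:, -1]   # values at t = 1; approximately N(0, 1)
```

The empirical mean and variance of the endpoints match the N(0, 1) limit up to Monte Carlo error, which is the one-dimensional shadow of the full functional convergence.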
Properties of a one-dimensional Wiener process
Basic properties
The unconditional probability density function at a fixed time t:
$$f_{W_t}(x) = \frac{1}{\sqrt{2\pi t}}\, e^{-x^2/(2t)}.$$
The expectation is zero:
$$E[W_t] = 0.$$
The variance is t:
$$\operatorname{Var}(W_t) = E[W_t^2] - E[W_t]^2 = t.$$
The covariance and correlation:
$$\operatorname{cov}(W_s, W_t) = \min(s, t), \qquad \operatorname{corr}(W_s, W_t) = \frac{\min(s,t)}{\sqrt{st}}.$$
The results for the expectation and variance follow immediately from the definition that increments have a normal distribution, centered at zero. Thus
$$W_t = W_t - W_0 \sim N(0, t).$$
The results for the covariance and correlation follow from the definition that non-overlapping increments are independent, of which only the property that they are uncorrelated is used. Suppose that t₁ < t₂. Then
$$\operatorname{cov}(W_{t_1}, W_{t_2}) = E[W_{t_1} W_{t_2}].$$
Substitute the simple identity W_{t₂} = (W_{t₂} − W_{t₁}) + W_{t₁}:
$$E[W_{t_1} W_{t_2}] = E\big[W_{t_1}\big((W_{t_2} - W_{t_1}) + W_{t_1}\big)\big] = E[W_{t_1}(W_{t_2} - W_{t_1})] + E[W_{t_1}^2].$$
Since W_{t₁} = W_{t₁} − W_{t₀} and W_{t₂} − W_{t₁} are independent,
$$E[W_{t_1}(W_{t_2} - W_{t_1})] = E[W_{t_1}]\, E[W_{t_2} - W_{t_1}] = 0.$$
Thus
$$\operatorname{cov}(W_{t_1}, W_{t_2}) = E[W_{t_1} W_{t_2}] = t_1 = \min(t_1, t_2).$$
Self-similarity
Brownian scaling
For every c > 0 the process V_t = c^{−1/2} W_{ct} is another Wiener process.
A demonstration of Brownian scaling, showing V_t = c^{−1/2} W_{ct} for decreasing c. Note that the average features of the function do not change while zooming in, and note that it zooms in quadratically faster horizontally than vertically.
Time reversal
The process V_t = W₁ − W_{1−t} for 0 ≤ t ≤ 1 is distributed like W_t for 0 ≤ t ≤ 1.
Time inversion
The process V_t = t W_{1/t} is another Wiener process.
A class of Brownian martingales
If a polynomial p(x, t) satisfies the PDE
$$\left(\frac{\partial}{\partial t} + \frac{1}{2}\,\frac{\partial^2}{\partial x^2}\right) p(x, t) = 0,$$
then the stochastic process
$$M_t = p(W_t, t)$$
is a martingale.
Example: W_t² − t is a martingale, which shows that the quadratic variation of W on [0, t] is equal to t. It follows that the expected time of first exit of W from (−c, c) is equal to c².
More generally, for every polynomial p(x, t) the following stochastic process is a martingale:
$$M_t = p(W_t, t) - \int_0^t a(W_s, s)\, ds,$$
where a is the polynomial
$$a = \left(\frac{\partial}{\partial t} + \frac{1}{2}\,\frac{\partial^2}{\partial x^2}\right) p.$$
Example: the process
$$(W_t^2 - t)^2 - 4\int_0^t W_s^2\, ds$$
is a martingale, which shows that the quadratic variation of the martingale W_t² − t on [0, t] is equal to
$$4\int_0^t W_s^2\, ds.$$
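The first-exit claim for the interval (−c, c) can be checked by a crude Euler simulation (step size and path count are illustrative; discrete monitoring slightly overestimates the exit time, since the path can cross the barrier between grid points):

```python
import numpy as np

rng = np.random.default_rng(3)

# Monte Carlo check that the expected first-exit time of W from (-c, c)
# is c**2 (here c = 1), discretising the path with step dt.
c, dt = 1.0, 2e-3
n_paths, n_steps = 2_000, 6_000            # horizon 12, far beyond E[tau] = 1
incr = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
W = np.cumsum(incr, axis=1)                # one Wiener path per row
hit = np.abs(W) >= c
exit_step = np.where(hit.any(axis=1), hit.argmax(axis=1) + 1, n_steps)
mean_exit = (exit_step * dt).mean()        # should be close to c**2 = 1
```

The sample mean lands near 1, matching the value predicted by the optional stopping argument applied to the martingale W_t² − t.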
About functions p(x,t) more general than polynomials, see local martingales.
Some properties of sample paths
The set of all functions w with these properties is of full Wiener measure. That is, a path (sample function) of the
Wiener process has all these properties almost surely.
Qualitative properties
For every ε > 0, the function w takes both (strictly) positive and (strictly) negative values on (0, ε).
The function w is continuous everywhere but differentiable nowhere (like the Weierstrass function).
Points of local maximum of the function w are a dense countable set; the maximum values are pairwise different; each local maximum is sharp in the following sense: if w has a local maximum at t then
$$\frac{w(s) - w(t)}{|s - t|} \to -\infty$$
as s tends to t. The same holds for local minima.
The function w has no points of local increase, that is, no t > 0 satisfies the following for some ε in (0, t): first, w(s) ≤ w(t) for all s in (t − ε, t), and second, w(s) ≥ w(t) for all s in (t, t + ε). (Local increase is a weaker condition than that w is increasing on (t − ε, t + ε).) The same holds for local decrease.
The function w is of unbounded variation on every interval.
Zeros of the function w are a nowhere dense perfect set of Lebesgue measure 0 and Hausdorff dimension 1/2 (therefore, uncountable).
Quantitative properties
Law of the iterated logarithm
$$\limsup_{t\to\infty} \frac{|W_t|}{\sqrt{2t\,\log\log t}} = 1, \quad \text{almost surely.}$$
Modulus of continuity
Local modulus of continuity:
$$\limsup_{\varepsilon\to 0^+} \frac{|W_\varepsilon|}{\sqrt{2\varepsilon\,\log\log(1/\varepsilon)}} = 1, \quad \text{almost surely.}$$
Global modulus of continuity (Lévy):
$$\limsup_{\varepsilon\to 0^+}\ \sup_{0\le s<t\le 1,\; t-s\le\varepsilon} \frac{|W_t - W_s|}{\sqrt{2\varepsilon\,\log(1/\varepsilon)}} = 1, \quad \text{almost surely.}$$
Local time
The image of the Lebesgue measure on [0, t] under the map w (the pushforward measure) has a density L_t(·). Thus,
$$\int_0^t f(w(s))\, ds = \int_{-\infty}^{+\infty} f(x)\, L_t(x)\, dx$$
for a wide class of functions f (namely: all continuous functions; all locally integrable functions; all non-negative measurable functions). The density L_t is (more exactly, can and will be chosen to be) continuous. The number L_t(x) is called the local time at x of w on [0, t]. It is strictly positive for all x of the interval (a, b) where a and b are the least and the greatest value of w on [0, t], respectively. (For x outside this interval the local time evidently vanishes.) Treated as a function of two variables x and t, the local time is still continuous. Treated as a function of t (while x is fixed), the local time is a singular function corresponding to a nonatomic measure on the set of zeros of w.
These continuity properties are fairly non-trivial. Consider that the local time can also be defined (as the density of
the pushforward measure) for a smooth function. Then, however, the density is discontinuous, unless the given
function is monotone. In other words, there is a conflict between good behavior of a function and good behavior of
its local time. In this sense, the continuity of the local time of the Wiener process is another manifestation of
non-smoothness of the trajectory.
Related processes
(Figure caption: The generator of a Brownian motion is ½ times the Laplace–Beltrami operator. The image shows Brownian motion on a special manifold: the surface of a sphere.)
The stochastic process defined by $X_t = \mu t + \sigma W_t$ is called a Wiener process with drift μ and infinitesimal variance $\sigma^2$. These processes exhaust continuous Lévy processes.
Two random processes on the time interval [0, 1] appear, roughly speaking, when conditioning the Wiener process to
vanish on both ends of [0, 1]. With no further conditioning, the process takes both positive and negative values on
[0, 1] and is called the Brownian bridge. Conditioned also to stay positive on (0, 1), the process is called the Brownian
excursion.[2] In both cases a rigorous treatment involves a limiting procedure, since the formula
$P(A \mid B) = P(A \cap B) / P(B)$ does not work when $P(B) = 0$.
A geometric Brownian motion can be written
$e^{\mu t - \frac{\sigma^2 t}{2} + \sigma W_t}.$
It is a stochastic process which is used to model processes that can never take on negative values, such as the value
of stocks.
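The positivity of geometric Brownian motion is easy to see in simulation. The sketch below (all parameter values are illustrative assumptions) builds GBM paths from Gaussian log-increments, so every path is an exponential of a Gaussian and stays strictly positive:

```python
import numpy as np

# Simulate geometric Brownian motion S_t = S0 * exp((mu - sigma^2/2) t + sigma W_t)
# by cumulatively summing Gaussian log-increments.
rng = np.random.default_rng(1)
s0, mu, sigma, T, n_steps, n_paths = 100.0, 0.05, 0.2, 1.0, 252, 10000
dt = T / n_steps
z = rng.standard_normal((n_paths, n_steps))
log_increments = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
s = s0 * np.exp(np.cumsum(log_increments, axis=1))
print(s.min() > 0)       # GBM never becomes negative
print(s[:, -1].mean())   # close to s0 * exp(mu * T)
```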
The stochastic process $X_t = e^{-t} W(e^{2t})$ is distributed like the Ornstein–Uhlenbeck process.
The time of hitting a single point x > 0 by the Wiener process is a random variable with the Lévy distribution. The
family of these random variables (indexed by all positive numbers x) is a left-continuous modification of a Lévy
process. The right-continuous modification of this process is given by times of first exit from closed intervals [0, x].
The local time $L_t(0)$, treated as a random function of t, is a random process distributed like the process $M_t = \max_{0 \le s \le t} W_s$.
The local time $L_t(x)$, treated as a random function of x (while t is constant), is a random process described by the
Ray–Knight theorems in terms of Bessel processes.
Brownian martingales
Let A be an event related to the Wiener process (more formally: a set, measurable with respect to the Wiener
measure, in the space of functions), and $X_t$ the conditional probability of A given the Wiener process on the time
interval [0, t] (more formally: the Wiener measure of the set of trajectories whose concatenation with the given
partial trajectory on [0, t] belongs to A). Then the process $X_t$ is a continuous martingale. Its martingale property
follows immediately from the definitions, but its continuity is a very special fact: a special case of a general
theorem stating that all Brownian martingales are continuous. A Brownian martingale is, by definition, a martingale
adapted to the Brownian filtration; and the Brownian filtration is, by definition, the filtration generated by the
Wiener process.
Time change
Every continuous martingale (starting at the origin) is a time-changed Wiener process.
Example: $2 W_t = V(4t)$, where V is another Wiener process (different from W but distributed like W).
Example: $W_t^2 - t = V(A_t)$, where $A_t = 4 \int_0^t W_s^2\,ds$ and V is another Wiener process.
In general, if M is a continuous martingale, then $M_t - M_0 = V(A_t)$, where $A_t$ is the quadratic variation of M on
[0, t] and V is a Wiener process.
Corollary. (See also Doob's martingale convergence theorems.) Let $M_t$ be a continuous martingale, and
$M^-_\infty = \liminf_{t \to \infty} M_t, \qquad M^+_\infty = \limsup_{t \to \infty} M_t.$
Then only the following two cases are possible:
$-\infty < M^-_\infty = M^+_\infty < +\infty,$
$-\infty = M^-_\infty, \quad M^+_\infty = +\infty;$
other cases (such as $M^-_\infty = M^+_\infty = +\infty$, or $-\infty < M^-_\infty < M^+_\infty < +\infty$, etc.) are of probability 0.
Especially, a nonnegative continuous martingale has a finite limit (as $t \to \infty$) almost surely.
Everything stated in this subsection for martingales holds also for local martingales.
Change of measure
A wide class of continuous semimartingales (especially, of diffusion processes) is related to the Wiener process via a
combination of time change and change of measure.
Using this fact, the qualitative properties stated above for the Wiener process can be generalized to a wide class of
continuous semimartingales.
Complex-valued Wiener process
The complex-valued Wiener process may be defined as a complex-valued random process of the form
$Z_t = X_t + i Y_t$, where $X_t$ and $Y_t$ are independent (real-valued) Wiener processes.[3]
Self-similarity
Brownian scaling, time reversal, time inversion: the same as in the real-valued case.
Rotation invariance: for every complex number c such that |c| = 1, the process $c Z_t$ is another complex-valued Wiener
process.
Time change
If f is an entire function then the process $f(Z_t) - f(0)$ is a time-changed complex-valued Wiener process.
Example: $Z_t^2 = U(A_t)$, where $A_t = 4 \int_0^t |Z_s|^2\,ds$ and U is another complex-valued Wiener process.
In contrast to the real-valued case, a complex-valued martingale is generally not a time-changed complex-valued
Wiener process. For example, the martingale $2 X_t + i Y_t$ is not (here $X_t$ and $Y_t$ are independent Wiener processes, as
before).
See also
Wiener sausage
Abstract Wiener space
Classical Wiener space
Chernoff's distribution
Notes
[1] Durrett 1996, Sect. 7.1
[2] Vervaat, W. (1979). A relation between Brownian bridge and Brownian excursion. Ann. Prob. 7, 143–149.
[3] Navarro-Moreno, J.; Estudillo-Martinez, M.D.; Fernandez-Alcala, R.M.; Ruiz-Molina, J.C., "Estimation of Improper Complex-Valued Random
Signals in Colored Noise by Using the Hilbert Space Theory" (http://ieeexplore.ieee.org/Xplore/login.jsp?url=http://ieeexplore.ieee.org/iel5/18/4957623/04957648.pdf?arnumber=4957648&authDecision=-203), IEEE Transactions on Information Theory 55 (6): 2859–2867,
doi:10.1109/TIT.2009.2018329, retrieved 2010-03-30.
References
Kleinert, Hagen, Path Integrals in Quantum Mechanics, Statistics, Polymer Physics, and Financial Markets, 4th
edition, World Scientific (Singapore, 2004); Paperback ISBN 981-238-107-4 (also available online as PDF files at
http://www.physik.fu-berlin.de/~kleinert/b5)
Stark, Henry; Woods, John W., Probability and Random Processes with Applications to Signal Processing, 3rd
edition, Prentice Hall (New Jersey, 2002); Textbook ISBN 0-13-020071-9
Durrett, R. (2000). Probability: Theory and Examples, 4th edition. Cambridge University Press, ISBN 0521765390
Revuz, Daniel; Yor, Marc, Continuous Martingales and Brownian Motion, second edition, Springer-Verlag,
1994.
Lévy process
In probability theory, a Lévy process, named after the French mathematician Paul Lévy, is any continuous-time
stochastic process that starts at 0, admits a càdlàg modification and has "stationary independent increments"; this
phrase will be explained below. Lévy processes are the stochastic analog of independent and identically distributed random
variables, and the most well-known examples are the Wiener process and the Poisson process.
Definition
A stochastic process $X = (X_t)_{t \ge 0}$ is said to be a Lévy process if:
1. $X_0 = 0$ almost surely;
2. Independent increments: for any $0 \le t_1 < t_2 < \cdots < t_n < \infty$, the increments
$X_{t_2} - X_{t_1}, X_{t_3} - X_{t_2}, \dots, X_{t_n} - X_{t_{n-1}}$ are independent;
3. Stationary increments: for any $s < t$, $X_t - X_s$ is equal in distribution to $X_{t-s}$;
4. $t \mapsto X_t$ is almost surely right continuous with left limits.
Properties
Independent increments
A continuous-time stochastic process assigns a random variable $X_t$ to each point t ≥ 0 in time. In effect it is a random
function of t. The increments of such a process are the differences $X_s - X_t$ between its values at different times t < s.
To call the increments of a process independent means that increments $X_s - X_t$ and $X_u - X_v$ are independent random
variables whenever the two time intervals do not overlap and, more generally, any finite number of increments
assigned to pairwise non-overlapping time intervals are mutually (not just pairwise) independent.
Stationary increments
To call the increments stationary means that the probability distribution of any increment $X_s - X_t$ depends only on
the length s − t of the time interval; increments with equally long time intervals are identically distributed.
In the Wiener process, the probability distribution of $X_s - X_t$ is normal with expected value 0 and variance s − t.
In the (homogeneous) Poisson process, the probability distribution of $X_s - X_t$ is a Poisson distribution with expected
value λ(s − t), where λ > 0 is the "intensity" or "rate" of the process.
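The stationarity of Poisson increments can be illustrated by simulating the process from its exponential inter-arrival times and comparing counts over two equally long windows (the rate and window choices below are illustrative assumptions):

```python
import numpy as np

# Simulate a homogeneous Poisson process from i.i.d. exponential inter-arrival
# times and check stationarity of increments: the count in any window of
# length 4 has mean lambda * 4, wherever the window sits.
rng = np.random.default_rng(2)
lam, n_paths = 3.0, 3000
means = []
for a, b in [(0.0, 4.0), (5.0, 9.0)]:        # two windows of equal length
    counts = np.empty(n_paths)
    for i in range(n_paths):
        t, k = 0.0, 0
        while t < b:
            t += rng.exponential(1.0 / lam)  # time of the next arrival
            if a <= t < b:
                k += 1
        counts[i] = k
    means.append(counts.mean())
print(means)  # both close to lam * 4 = 12
```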
Divisibility
Lévy processes correspond to infinitely divisible probability distributions:
The probability distributions of the increments of any Lévy process are infinitely divisible, since the increment of
length t is the sum of n increments of length t/n, which are i.i.d. by assumption (independent increments and
stationarity).
Conversely, there is a Lévy process for each infinitely divisible probability distribution: given such a distribution
D, multiples and dividing define a stochastic process for positive rational time, defining it as a Dirac delta
distribution for time 0 defines it for time 0, and taking limits defines it for real time. Independent increments and
stationarity follow by assumption of divisibility, though one must check continuity and that taking limits gives a
well-defined function for irrational time.
Moments
In any Lévy process with finite moments, the nth moment $\mu_n(t) = E[X_t^n]$ is a polynomial function of t; these
functions satisfy a binomial identity:
$\mu_n(t+s) = \sum_{k=0}^{n} \binom{n}{k} \mu_k(t)\, \mu_{n-k}(s).$
Lévy–Khintchine representation
It is possible to characterise all Lévy processes by looking at their characteristic function. This leads to the
Lévy–Khintchine representation. If $X = (X_t)_{t \ge 0}$ is a Lévy process, then its characteristic function satisfies the following
relation:
$E\big[e^{i\theta X_t}\big] = \exp\!\Big( t \Big( i a \theta - \tfrac{1}{2}\sigma^2\theta^2 + \int_{\mathbb{R}\setminus\{0\}} \big( e^{i\theta x} - 1 - i\theta x \,\mathbf{1}_{|x|<1} \big)\, W(dx) \Big) \Big),$
where $a \in \mathbb{R}$, $\sigma^2 \ge 0$, and $\mathbf{1}$ is the indicator function. The Lévy measure W must be such that
$\int_{\mathbb{R}\setminus\{0\}} \min(x^2, 1)\, W(dx) < \infty.$
A Lévy process can be seen as comprising three components: a drift, a diffusion component and a jump
component. These three components, and thus the Lévy–Khintchine representation of the process, are fully
determined by the Lévy–Khintchine triplet $(a, \sigma^2, W)$. So one can see that a purely continuous Lévy process is a
Brownian motion with drift.
Lévy–Itô decomposition
We can also construct a Lévy process from any given characteristic function of the form given in the
Lévy–Khintchine representation. This expression corresponds to the decomposition of a measure in Lebesgue's
decomposition theorem: the drift and diffusion are the absolutely continuous part, while the measure W is the
singular measure.
Given a Lévy triplet $(a, \sigma^2, W)$ there exist three independent Lévy processes, which lie in the same probability
space, $X^{(1)}$, $X^{(2)}$, $X^{(3)}$, such that:
$X^{(1)}$ is a Brownian motion with drift, corresponding to the absolutely continuous part of a measure and
capturing the drift a and diffusion $\sigma^2$;
$X^{(2)}$ is a compound Poisson process, corresponding to the pure point part of the singular measure W;
$X^{(3)}$ is a square integrable pure jump martingale that almost surely has a countable number of jumps on a finite
interval, corresponding to the singular continuous part of the singular measure W.
The process defined by $X_t = X^{(1)}_t + X^{(2)}_t + X^{(3)}_t$ is a Lévy process with triplet $(a, \sigma^2, W)$.
Constructing a stochastic probability measure
Consider a random process with independent increments, where the random values occur in, say,
a second countable locally compact abelian group.
Let for denote the (Borel regular) probability measures on the initial
position and the increments. Now for let
. These define bona fide probability measures which,
by the properties of the process, compute appropriate probabilities for properties of paths depending on only finitely
many times.
Now correspond to these measures continuous linear operators in the
obvious way. Then, for any countable set of times (for ease consider the rationals ) define a linear functional,
as follows: if depends only on finitely many times, say
where without loss of generality ; let . It is straightforward to see that this is well-defined and linear. Moreover it is
clearly a positive, bounded operator with since . By Stone–Weierstrass, it extends (uniquely) to a (linear) continuous
(positive) operator (with norm 1) on its domain. By the Riesz representation theorem, this in turn gives rise to a
(unique) (Borel regular) probability measure, . Precisely, this measure is the unique one satisfying that for any .
Whereas initially we knew the probability distributions of a path at given times / over time increments, and thus
could talk about local properties of the paths in the stochastic process; the constructed measure above allows us to
attach a probability distribution to (almost) the full path space, and thus enables us to talk about global properties.
Roughly, we are justified in (and compelled to) thinking of the measure as though it calculates "the probability"
that a path occurs in (when projected onto the times ).
As an example of our new ability to talk about global properties, we have that "almost every path has left limits / is
left continuous", if and only if: for every countable sequence of times , letting ,
we have -almost-everywhere then converges / converges to . (This makes sense, as it can be
shown that (i) has left limits/is left continuous if and only if has limits/is continuous under the topology on generated by
; and (ii) if is second countable then has limits/is continuous if and only if converges / converges
to whenever .)
Verifying how global properties of paths over the real line can be translated into properties considering only
countably many times can be a little tricky. There is no escaping this. Fortunately, the problem of having to change
the countable set of times over which the measure is based can be prevented. If we consider a countable dense
subset of the reals (e.g. the rationals), we may apply knowledge of the distribution on the increments together
with the stochastic measure to check these global properties. E.g. in the case of the Wiener process, we are able
to check that almost every path is (i) everywhere continuous; (ii) has continuity modulus (Lévy); and thus (iii)
is nowhere differentiable.
See also
Independent and identically-distributed random variables
External links
Applebaum, David (December 2004), "Lévy Processes: From Probability to Finance and Quantum Groups"
[1]
(PDF), Notices of the American Mathematical Society (Providence, RI: American Mathematical Society) 51 (11):
1336–1347, ISSN 1088-9477
References
[1] http://www.ams.org/notices/200411/fea-applebaum.pdf
Stochastic differential equations
A stochastic differential equation (SDE) is a differential equation in which one or more of the terms is a stochastic
process, thus resulting in a solution which is itself a stochastic process. SDEs are used to model diverse phenomena
such as fluctuating stock prices or physical systems subject to thermal fluctuations. Typically, SDEs incorporate white
noise, which can be thought of as the derivative of Brownian motion (or the Wiener process); however, it should be
mentioned that other types of random fluctuations are possible, such as jump processes.
Background
The earliest work on SDEs was done to describe Brownian motion in Einstein's famous paper, and at the same time
by Smoluchowski. However, one of the earlier works related to Brownian motion is credited to Bachelier (1900) in
his thesis 'Theory of Speculation'. This work was followed up by Langevin. Later Itô and Stratonovich put SDEs
on more solid mathematical footing.
Terminology
In physical science, SDEs are usually written as Langevin equations. These are sometimes confusingly called "the
Langevin equation" even though there are many possible forms. These consist of an ordinary differential equation
containing a deterministic part and an additional random white noise term. A second form is the Fokker-Planck
equation. The Fokker-Planck equation is a partial differential equation that describes the time evolution of the
probability distribution function. The third form is the stochastic differential equation that is used most frequently in
mathematics and quantitative finance (see below). This is similar to the Langevin form, but it is usually written in
differential form. SDEs come in two varieties, corresponding to two versions of stochastic calculus.
Stochastic Calculus
Brownian motion or the Wiener process was discovered to be exceptionally complex mathematically. The Wiener
process is non-differentiable; thus, it requires its own rules of calculus. There are two dominating versions of
stochastic calculus, the Itô stochastic calculus and the Stratonovich stochastic calculus. Each of the two has
advantages and disadvantages, and newcomers are often confused whether one is more appropriate than the other
in a given situation. Guidelines exist (e.g. Øksendal, 2003) and, conveniently, one can readily convert an Itô SDE to
an equivalent Stratonovich SDE and back again. Still, one must be careful which calculus to use when the SDE is
initially written down.
Numerical Solutions
Numerical solution of stochastic differential equations, and especially stochastic partial differential equations, is a
relatively young field. Almost all algorithms that are used for the solution of ordinary differential equations
will work very poorly for SDEs, exhibiting very poor numerical convergence. A textbook describing many different
algorithms is Kloeden & Platen (1995).
Methods include the Euler–Maruyama method, the Milstein method and the Runge–Kutta method (SDE).
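As a minimal sketch of the simplest of these, the Euler–Maruyama method, the following (applied to an illustrative Ornstein–Uhlenbeck equation with made-up parameter values) advances the solution by the drift over each step plus a normally distributed noise increment:

```python
import numpy as np

# Euler-Maruyama scheme for the Ornstein-Uhlenbeck SDE
#   dX_t = theta * (mu - X_t) dt + sigma dW_t.
# Each step adds the drift over dt plus a N(0, sigma^2 * dt) noise increment.
def euler_maruyama(x0, theta, mu, sigma, T, n_steps, n_paths, rng):
    dt = T / n_steps
    x = np.full(n_paths, x0)
    for _ in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)  # Brownian increments
        x = x + theta * (mu - x) * dt + sigma * dw
    return x

rng = np.random.default_rng(3)
x_T = euler_maruyama(x0=5.0, theta=2.0, mu=1.0, sigma=0.3, T=5.0,
                     n_steps=500, n_paths=2000, rng=rng)
print(x_T.mean())  # the OU process mean-reverts toward mu = 1.0
```

The same loop structure carries over to the Milstein method, which adds a correction term involving the derivative of the diffusion coefficient.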
Use in Physics
In physics, SDEs are typically written in the Langevin form and referred to as "the Langevin equation." For example,
a general coupled set of first-order SDEs is often written in the form:
where is the set of unknowns, the and are arbitrary functions and the are
random functions of time, often referred to as "noise terms". This form is usually usable because there are standard
techniques for transforming higher-order equations into several coupled first-order equations by introducing new
unknowns. If the are constants, the system is said to be subject to additive noise, otherwise it is said to be subject
to multiplicative noise. This term is somewhat misleading as it has come to mean the general case even though it
appears to imply the limited case where : . Additive noise is the simpler of the two cases; in that
situation the correct solution can often be found using ordinary calculus and in particular the ordinary chain rule of
calculus. However, in the case of multiplicative noise, the Langevin equation is not a well-defined entity on its own,
and it must be specified whether the Langevin equation should be interpreted as an Ito SDE or a Stratonovich SDE.
In physics, the main method of solution is to find the probability distribution function as a function of time using the
equivalent Fokker-Planck equation (FPE). The Fokker-Planck equation is a deterministic partial differential
equation. It tells how the probability distribution function evolves in time, similarly to how the Schrödinger equation
gives the time evolution of the quantum wave function or the diffusion equation gives the time evolution of chemical
concentration. Alternatively, numerical solutions can be obtained by Monte Carlo simulation. Other techniques
include path integration, which draws on the analogy between statistical physics and quantum mechanics (for
example, the Fokker-Planck equation can be transformed into the Schrödinger equation by rescaling a few variables),
or writing down ordinary differential equations for the statistical moments of the probability distribution function.
Note on "the Langevin equation"
The "the" in "the Langevin equation" is somewhat ungrammatical nomenclature. Each individual physical model has
its own Langevin equation. Perhaps, "a Langevin equation" or "the associated Langevin equation" would conform
better with common English usage.
Use in probability and mathematical finance
The notation used in probability theory (and in many applications of probability theory, for instance mathematical
finance) is slightly different. This notation makes the exotic nature of the random function of time in the physics
formulation more explicit. It is also the notation used in publications on numerical methods for solving stochastic
differential equations. In strict mathematical terms, the white-noise term cannot be chosen as a usual function, but only as a
generalized function. The mathematical formulation treats this complication with less ambiguity than the physics
formulation.
A typical equation is of the form
$dX_t = \mu(X_t, t)\,dt + \sigma(X_t, t)\,dB_t,$
where B denotes a Wiener process (standard Brownian motion). This equation should be interpreted as an informal
way of expressing the corresponding integral equation
$X_{t+s} - X_t = \int_t^{t+s} \mu(X_u, u)\,du + \int_t^{t+s} \sigma(X_u, u)\,dB_u.$
The equation above characterizes the behavior of the continuous time stochastic process $X_t$ as the sum of an ordinary
Lebesgue integral and an Itô integral. A heuristic (but very helpful) interpretation of the stochastic differential
equation is that in a small time interval of length $\delta$ the stochastic process $X_t$ changes its value by an amount that is
normally distributed with expectation $\mu(X_t, t)\,\delta$ and variance $\sigma(X_t, t)^2\,\delta$ and is independent of the past behavior of the
process. This is so because the increments of a Wiener process are independent and normally distributed. The
function $\mu$ is referred to as the drift coefficient, while $\sigma$ is called the diffusion coefficient. The stochastic process $X_t$ is
called a diffusion process, and is usually a Markov process.
The formal interpretation of an SDE is given in terms of what constitutes a solution to the SDE. There are two main
definitions of a solution to an SDE, a strong solution and a weak solution. Both require the existence of a process $X_t$
that solves the integral equation version of the SDE. The difference between the two lies in the underlying
probability space $(\Omega, F, \Pr)$. A weak solution consists of a probability space and a process that satisfies the integral
equation, while a strong solution is a process that satisfies the equation and is defined on a given probability space.
An important example is the equation for geometric Brownian motion
$dX_t = \mu X_t\,dt + \sigma X_t\,dB_t,$
which is the equation for the dynamics of the price of a stock in the Black–Scholes options pricing model of financial
mathematics.
There are also more general stochastic differential equations where the coefficients $\mu$ and $\sigma$ depend not only on the
present value of the process $X_t$, but also on previous values of the process and possibly on present or previous values
of other processes too. In that case the solution process, X, is not a Markov process, and it is called an Itô process and
not a diffusion process. When the coefficients depend only on present and past values of X, the defining equation is
called a stochastic delay differential equation.
Existence and uniqueness of solutions
As with deterministic ordinary and partial differential equations, it is important to know whether a given SDE has a
solution, and whether or not it is unique. The following is a typical existence and uniqueness theorem for Itô SDEs
taking values in n-dimensional Euclidean space $\mathbb{R}^n$ and driven by an m-dimensional Brownian motion B; the proof
may be found in Øksendal (2003, §5.2).
Let T > 0, and let
$\mu : \mathbb{R}^n \times [0, T] \to \mathbb{R}^n, \qquad \sigma : \mathbb{R}^n \times [0, T] \to \mathbb{R}^{n \times m}$
be measurable functions for which there exist constants C and D such that
$|\mu(x, t)| + |\sigma(x, t)| \le C (1 + |x|),$
$|\mu(x, t) - \mu(y, t)| + |\sigma(x, t) - \sigma(y, t)| \le D |x - y|$
for all t ∈ [0, T] and all x and y ∈ $\mathbb{R}^n$, where
$|\sigma|^2 = \sum_{i,j} |\sigma_{ij}|^2.$
Let Z be a random variable that is independent of the σ-algebra generated by $B_s$, s ≥ 0, and with finite second
moment:
$E\big[|Z|^2\big] < +\infty.$
Then the stochastic differential equation/initial value problem
$dX_t = \mu(X_t, t)\,dt + \sigma(X_t, t)\,dB_t \quad \text{for } t \in [0, T], \qquad X_0 = Z,$
has a Pr-almost surely unique t-continuous solution $(t, \omega) \mapsto X_t(\omega)$ such that X is adapted to the filtration $F_t^Z$
generated by Z and $B_s$, s ≤ t, and
$E\left[ \int_0^T |X_t|^2\,dt \right] < +\infty.$
See also
Langevin dynamics
Local volatility
Stochastic volatility
Sethi advertising model
Stochastic partial differential equations
References
Adomian, George (1983). Stochastic systems. Mathematics in Science and Engineering (169). Orlando, FL:
Academic Press Inc..
Adomian, George (1986). Nonlinear stochastic operator equations. Orlando, FL: Academic Press Inc..
Adomian, George (1989). Nonlinear stochastic systems theory and applications to physics. Mathematics and its
Applications (46). Dordrecht: Kluwer Academic Publishers Group.
Øksendal, Bernt K. (2003). Stochastic Differential Equations: An Introduction with Applications. Berlin:
Springer. ISBN 3-540-04758-1.
Teugels, J. and Sundt, B. (eds.) (2004). Encyclopedia of Actuarial Science. Chichester: Wiley. pp. 523–527.
C. W. Gardiner (2004). Handbook of Stochastic Methods: for Physics, Chemistry and the Natural Sciences.
Springer. p.415.
Thomas Mikosch (1998). Elementary Stochastic Calculus: with Finance in View. Singapore: World Scientific
Publishing. p.212. ISBN981-02-3543-7.
Bachelier, L. (1900). Théorie de la spéculation (in French), PhD Thesis. NUMDAM: http://www.numdam.org/item?id=ASENS_1900_3_17__21_0. In English in the 1971 book 'The Random Character of the Stock
Market', ed. P.H. Cootner.
P.E. Kloeden and E. Platen (1995). Numerical Solution of Stochastic Differential Equations. Springer.
Stochastic volatility
Stochastic volatility models are used in the field of mathematical finance to evaluate derivative securities, such as
options. The name derives from the models' treatment of the underlying security's volatility as a random process,
governed by state variables such as the price level of the underlying security, the tendency of volatility to revert to
some long-run mean value, and the variance of the volatility process itself, among others.
Stochastic volatility models are one approach to resolving a shortcoming of the Black–Scholes model. In particular,
the Black–Scholes model assumes that the underlying volatility is constant over the life of the derivative, and unaffected by
changes in the price level of the underlying security. Constant-volatility models, however, cannot explain long-observed features
of the implied volatility surface such as volatility smile and skew, which indicate that implied volatility does tend to
vary with respect to strike price and expiration. By assuming that the volatility of the underlying price is a stochastic
process rather than a constant, it becomes possible to model derivatives more accurately.
Basic model
Starting from a constant volatility approach, assume that the derivative's underlying price follows a standard model
for geometric Brownian motion:
$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t,$
where $\mu$ is the constant drift (i.e. expected return) of the security price $S_t$, $\sigma$ is the constant volatility, and $dW_t$
is a standard Gaussian with zero mean and unit standard deviation. The explicit solution of this stochastic differential
equation is
$S_t = S_0\, e^{\left(\mu - \frac{1}{2}\sigma^2\right) t + \sigma W_t}.$
The maximum likelihood estimator to estimate the constant volatility $\sigma$ for given stock prices $S_{t_0}, S_{t_1}, \dots, S_{t_n}$ at different times $t_0 < t_1 < \dots < t_n$
is
$\hat{\sigma}^2 = \left( \frac{1}{n} \sum_{i=1}^{n} \frac{(\ln S_{t_i} - \ln S_{t_{i-1}})^2}{t_i - t_{i-1}} \right) - \frac{1}{n}\, \frac{(\ln S_{t_n} - \ln S_{t_0})^2}{t_n - t_0};$
its expectation value is $E\big[\hat{\sigma}^2\big] = \frac{n-1}{n}\sigma^2$.
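The estimator can be sanity-checked on simulated data. The sketch below applies it to Gaussian log-returns observed on an equally spaced grid, the special case where both of its terms simplify; all parameter values are illustrative assumptions:

```python
import numpy as np

# Check the constant-volatility maximum likelihood estimator on simulated GBM
# log-returns observed at equal spacing dt: each log-return is
# N((mu - sigma^2/2) * dt, sigma^2 * dt).
rng = np.random.default_rng(4)
mu, sigma, dt, n = 0.1, 0.25, 1.0 / 252, 100000
log_returns = rng.normal((mu - 0.5 * sigma**2) * dt, sigma * np.sqrt(dt), size=n)
# First term: mean squared log-return over dt; second term: drift correction.
sigma2_hat = (log_returns**2).mean() / dt - log_returns.sum()**2 / (n * n * dt)
sigma_hat = np.sqrt(sigma2_hat)
print(sigma_hat)  # close to the true sigma = 0.25
```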
This basic model with constant volatility $\sigma$ is the starting point for non-stochastic volatility models such as
Black–Scholes and Cox–Ross–Rubinstein.
For a stochastic volatility model, replace the constant volatility $\sigma$ with a function $\nu_t$ that models the variance of
$S_t$. This variance function is also modeled as Brownian motion, and the form of $\nu_t$ depends on the particular SV
model under study:
$dS_t = \mu S_t\,dt + \sqrt{\nu_t}\, S_t\,dW_t,$
$d\nu_t = \alpha_{S,t}\,dt + \beta_{S,t}\,dB_t,$
where $\alpha_{S,t}$ and $\beta_{S,t}$ are some functions of $\nu_t$, and $dB_t$ is another standard Gaussian that is correlated with
$dW_t$ with constant correlation factor $\rho$.
Heston model
The popular Heston model is a commonly used SV model, in which the randomness of the variance process varies as
the square root of variance. In this case, the differential equation for variance takes the form:
$d\nu_t = \theta(\omega - \nu_t)\,dt + \xi \sqrt{\nu_t}\,dB_t,$
where $\omega$ is the mean long-term volatility, $\theta$ is the rate at which the volatility reverts toward its long-term mean,
$\xi$ is the volatility of the volatility process, and $dB_t$ is, like $dW_t$, a Gaussian with zero mean and unit standard
deviation. However, $dW_t$ and $dB_t$ are correlated with the constant correlation value $\rho$.
In other words, the Heston SV model assumes that volatility is a random process that
1. exhibits a tendency to revert towards a long-term mean volatility $\omega$ at a rate $\theta$,
2. exhibits its own (constant) volatility, $\xi$,
3. and whose source of randomness is correlated (with correlation $\rho$) with the randomness of the underlying's price
processes.
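A common way to simulate the Heston variance process is a full-truncation Euler scheme, sketched below with ω for the long-run level, θ for the reversion rate and ξ for the vol-of-vol; the parameter values are illustrative, and the clipping at zero is a standard numerical device rather than part of the model itself:

```python
import numpy as np

# Full-truncation Euler scheme for the Heston variance process
#   d(nu) = theta * (omega - nu) dt + xi * sqrt(nu) dB.
# Clipping nu at zero inside the drift and the square root keeps the
# discretized scheme real-valued even when a step dips below zero.
rng = np.random.default_rng(5)
omega, theta, xi = 0.04, 3.0, 0.3
T, n_steps, n_paths = 2.0, 1000, 4000
dt = T / n_steps
nu = np.full(n_paths, 0.09)            # start above the long-run level omega
for _ in range(n_steps):
    dB = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    nu_pos = np.maximum(nu, 0.0)
    nu = nu + theta * (omega - nu_pos) * dt + xi * np.sqrt(nu_pos) * dB
mean_var = nu.mean()
print(mean_var)  # reverts toward the long-run level omega = 0.04
```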
SABR volatility model
The SABR model (Stochastic Alpha, Beta, Rho) describes a single forward F (related to any asset, e.g. an index,
interest rate, bond, currency or equity) under stochastic volatility $\sigma$:
$dF_t = \sigma_t F_t^{\beta}\,dW_t,$
$d\sigma_t = \alpha \sigma_t\,dZ_t.$
The initial values $F_0$ and $\sigma_0$ are the current forward price and volatility, whereas $W_t$ and $Z_t$ are two correlated
Wiener processes (i.e. Brownian motions) with correlation coefficient $-1 < \rho < 1$. The constant parameters $\beta, \alpha$
are such that $0 \le \beta \le 1$ and $\alpha \ge 0$.
The main feature of the SABR model is that it is able to reproduce the smile effect of the volatility smile.
GARCH model
The Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model is another popular model for
estimating stochastic volatility. It assumes that the randomness of the variance process varies with the variance, as
opposed to the square root of the variance as in the Heston model. The standard GARCH(1,1) model has the
following form for the variance differential:
$d\nu_t = \theta(\omega - \nu_t)\,dt + \xi \nu_t\,dB_t.$
The GARCH model has been extended via numerous variants, including the NGARCH, LGARCH, EGARCH,
GJR-GARCH, etc.
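The mean reversion of GARCH variance is easiest to see in the discrete-time GARCH(1,1) recursion, sketched below with illustrative parameter values:

```python
import numpy as np

# Discrete-time GARCH(1,1) recursion:
#   sigma2[t] = omega + alpha * r[t-1]^2 + beta * sigma2[t-1].
# With alpha + beta < 1 the conditional variance mean-reverts to the
# unconditional level omega / (1 - alpha - beta).
rng = np.random.default_rng(6)
omega, alpha, beta = 2e-5, 0.08, 0.90
n = 200000
z = rng.standard_normal(n)                 # i.i.d. standard normal shocks
sigma2 = np.empty(n)
r = np.empty(n)
sigma2[0] = omega / (1.0 - alpha - beta)   # start at the unconditional variance
r[0] = np.sqrt(sigma2[0]) * z[0]
for t in range(1, n):
    sigma2[t] = omega + alpha * r[t - 1]**2 + beta * sigma2[t - 1]
    r[t] = np.sqrt(sigma2[t]) * z[t]
long_run = omega / (1.0 - alpha - beta)    # = 0.001
print(sigma2.mean(), long_run)
```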
3/2 model
The 3/2 model is similar to the Heston model, but assumes that the randomness of the variance process varies with
$\nu_t^{3/2}$. The form of the variance differential is:
$d\nu_t = \nu_t(\omega - \nu_t)\,dt + \xi \nu_t^{3/2}\,dB_t.$
Chen model
In interest rate modeling, Lin Chen in 1994 developed the first stochastic mean and stochastic volatility model, the
Chen model. Specifically, the dynamics of the instantaneous interest rate are given by the following stochastic
differential equations:
$dr_t = \kappa(\theta_t - r_t)\,dt + \sqrt{r_t}\,\sigma_t\,dW_t,$
$d\theta_t = \nu(\zeta - \theta_t)\,dt + \alpha\sqrt{\theta_t}\,dW_t,$
$d\sigma_t = \mu(\beta - \sigma_t)\,dt + \eta\sqrt{\sigma_t}\,dW_t.$
Calibration
Once a particular SV model is chosen, it must be calibrated against existing market data. Calibration is the process of
identifying the set of model parameters that are most likely given the observed data. This process is called maximum
likelihood estimation (MLE). For instance, in the Heston model, the set of model parameters $\Psi$
can be estimated by applying an MLE algorithm such as the Powell Directed Set method [1] to observations of historic
underlying security prices.
In this case, one starts with an estimate for $\Psi$, computes the residual errors when applying the historic price data to
the resulting model, and then adjusts $\Psi$ to try to minimize these errors. Once the calibration has been performed, it is
standard practice to re-calibrate the model over time.
See also
Chen model
Heston model
Local volatility
Risk-neutral measure
SABR Volatility Model
Volatility
Volatility, uncertainty, complexity and ambiguity
Black–Scholes
References
Stochastic Volatility and Mean-variance Analysis [2], Hyungsok Ahn, Paul Wilmott, (2006).
A closed-form solution for options with stochastic volatility [3], SL Heston, (1993).
Inside Volatility Arbitrage [4], Alireza Javaheri, (2005).
Accelerating the Calibration of Stochastic Volatility Models [5], Kilin, Fiodar (2006).
Lin Chen (1996). Stochastic Mean and Stochastic Volatility: A Three-Factor Model of the Term Structure of
Interest Rates and Its Application to the Pricing of Interest Rate Derivatives. Blackwell Publishers.
References
[1] http://www.library.cornell.edu/nr/bookcpdf.html
[2] http://www.wilmott.com/detail.cfm?articleID=245
[3] http://www.javaquant.net/papers/Heston-original.pdf
[4] http://www.amazon.com/s?platform=gurupa&url=index%3Dblended&keywords=inside+volatility+arbitrage
[5] http://ssrn.com/abstract=982221
Numerical partial differential equations
Numerical partial differential equations is the branch of numerical analysis that studies the numerical solution of
partial differential equations (PDEs).
Numerical techniques for solving PDEs include the following:
The finite difference method, in which functions are represented by their values at certain grid points and
derivatives are approximated through differences in these values.
The method of lines, where all but one variable is discretized. The result is a system of ODEs in the remaining
continuous variable.
The finite element method, where functions are represented in terms of basis functions and the PDE is solved in
its integral (weak) form.
The finite volume method, which divides space into regions or volumes and computes the change within each
volume by considering the flux (flow rate) across the surfaces of the volume.
The spectral method, which represents functions as a sum of particular basis functions, for example using a
Fourier series.
Meshfree methods don't need a grid to work and so may be better suited for some problems. However, the
computational effort is usually higher.
Domain decomposition methods solve boundary value problems by splitting them into smaller boundary value
problems on subdomains and iterating to coordinate the solution between the subdomains.
Multigrid methods solve differential equations using a hierarchy of discretizations.
The finite difference method is often regarded as the simplest method to learn and use. The finite element and finite
volume methods are widely used in engineering and in computational fluid dynamics, and are well suited to
problems in complicated geometries. Spectral methods are generally the most accurate, provided that the solutions
are sufficiently smooth.
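As a small illustration of one of the techniques listed above, the method of lines can be sketched in Python: space is discretized with central differences and the resulting ODE system is handed to a standard integrator. The grid size, time horizon, and test problem (the heat equation with homogeneous Dirichlet boundaries) are arbitrary choices made for this example.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Method of lines for u_t = u_xx on [0, 1] with u(0, t) = u(1, t) = 0
# and initial condition u(x, 0) = sin(pi x).
n = 50
x = np.linspace(0.0, 1.0, n + 1)
dx = x[1] - x[0]
u0 = np.sin(np.pi * x[1:-1])              # interior nodes only

def rhs(t, u):
    # Central-difference approximation of u_xx with zero boundary values.
    u_pad = np.concatenate(([0.0], u, [0.0]))
    return (u_pad[2:] - 2.0 * u_pad[1:-1] + u_pad[:-2]) / dx**2

sol = solve_ivp(rhs, (0.0, 0.1), u0, rtol=1e-8, atol=1e-10)

# The exact solution of the continuous problem decays as exp(-pi^2 t).
exact = np.exp(-np.pi**2 * 0.1) * np.sin(np.pi * x[1:-1])
err = np.max(np.abs(sol.y[:, -1] - exact))
```

Once space is discretized, any ODE integrator can be swapped in, which is the main appeal of the approach.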
See also
List of numerical analysis topics#Numerical partial differential equations
Numerical ordinary differential equations
External links
Numerical Methods for Partial Differential Equations [1], course at MIT OpenCourseWare.
IMS [2], the Open Source IMTEK Mathematica Supplement (IMS)
References
[1] http://ocw.mit.edu/courses/aeronautics-and-astronautics/16-920j-numerical-methods-for-partial-differential-equations-sma-5212-spring-2003/
[2] http://www.imtek.uni-freiburg.de/simulation/mathematica/IMSweb/
Crank–Nicolson method
In numerical analysis, the Crank–Nicolson method is a finite difference method used for numerically solving the
heat equation and similar partial differential equations.[1] It is a second-order method in time, implicit in time, and is
numerically stable. The method was developed by John Crank and Phyllis Nicolson in the mid-20th century.[2]
For diffusion equations (and many other equations), it can be shown the Crank–Nicolson method is unconditionally
stable.[3] However, the approximate solutions can still contain (decaying) spurious oscillations if the ratio of the time
step to the square of the space step is large (typically larger than 1/2). For this reason, whenever large time steps or high
spatial resolution are necessary, the less accurate backward Euler method is often used, which is both stable and
immune to oscillations.
The method
The Crank–Nicolson stencil for a 1D problem.
The Crank–Nicolson method is based on the central difference in
space, and the trapezoidal rule in time, giving second-order
convergence in time. For example, in one dimension, if the partial
differential equation is
∂u/∂t = F(u, x, t, ∂u/∂x, ∂²u/∂x²)
then, letting u(i Δx, n Δt) = u_i^n, the equation for the Crank–Nicolson method is the average of the forward Euler
method at n and the backward Euler method at n + 1 (note, however, that the method itself is not simply the
average of those two methods, as the equation has an implicit dependence on the solution):
(u_i^{n+1} − u_i^n)/Δt = ½ [F_i^{n+1}(u, x, t, ∂u/∂x, ∂²u/∂x²) + F_i^n(u, x, t, ∂u/∂x, ∂²u/∂x²)]
The function F must be discretized spatially with a central difference.
Note that this is an implicit method: to get the "next" value of u in time, a system of algebraic equations must be
solved. If the partial differential equation is nonlinear, the discretization will also be nonlinear, so that advancing in
time will involve the solution of a system of nonlinear algebraic equations, though linearizations are possible. In
many problems, especially linear diffusion, the algebraic problem is tridiagonal and may be efficiently solved with
the tridiagonal matrix algorithm, which gives a fast O(n) direct solution, as opposed to the usual O(n³) for a full
matrix.
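The tridiagonal matrix algorithm mentioned here (also called the Thomas algorithm) is short enough to sketch directly. This version assumes the system is well conditioned, e.g. diagonally dominant as the Crank–Nicolson matrices are, and performs no pivoting.

```python
import numpy as np

def thomas(a, b, c, d):
    """Solve a tridiagonal system in O(n) operations.

    a: sub-diagonal (a[0] unused), b: main diagonal,
    c: super-diagonal (c[-1] unused), d: right-hand side.
    """
    n = len(d)
    cp = np.empty(n)
    dp = np.empty(n)
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]              # forward elimination
        cp[i] = c[i] / m if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):               # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```

For a diagonally dominant system this reproduces the answer of a dense solver at a fraction of the cost.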
Example: 1D diffusion
The Crank–Nicolson method is often applied to diffusion problems. As an example, for linear diffusion,
∂u/∂t = a ∂²u/∂x²
whose Crank–Nicolson discretization is then:
(u_i^{n+1} − u_i^n)/Δt = (a/(2 (Δx)²)) [(u_{i+1}^{n+1} − 2 u_i^{n+1} + u_{i−1}^{n+1}) + (u_{i+1}^n − 2 u_i^n + u_{i−1}^n)]
or, letting r = a Δt/(2 (Δx)²):
−r u_{i+1}^{n+1} + (1 + 2r) u_i^{n+1} − r u_{i−1}^{n+1} = r u_{i+1}^n + (1 − 2r) u_i^n + r u_{i−1}^n
which is a tridiagonal problem, so that u_i^{n+1} may be efficiently solved for by using the tridiagonal matrix algorithm in
favor of a much more costly matrix inversion.
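A minimal Crank–Nicolson time-stepper for this linear diffusion problem can be written as follows. For brevity it builds dense matrices and calls a general solver; a production code would exploit the tridiagonal structure. The test problem (a = 1, u(x, 0) = sin(πx), zero Dirichlet boundaries) is chosen so that the exact solution sin(πx)·exp(−π²t) is known.

```python
import numpy as np

a = 1.0                      # diffusion coefficient
nx, nt, T = 50, 100, 0.1     # space steps, time steps, final time
dx, dt = 1.0 / nx, T / nt
r = a * dt / (2.0 * dx**2)

x = np.linspace(0.0, 1.0, nx + 1)
u = np.sin(np.pi * x[1:-1])              # interior unknowns
m = nx - 1

# (1 + 2r) u^{n+1} - r (neighbors^{n+1}) = (1 - 2r) u^n + r (neighbors^n)
eye = np.eye(m)
off = np.diag(np.ones(m - 1), 1) + np.diag(np.ones(m - 1), -1)
A = (1.0 + 2.0 * r) * eye - r * off      # implicit (left-hand) matrix
B = (1.0 - 2.0 * r) * eye + r * off      # explicit (right-hand) matrix

for _ in range(nt):
    u = np.linalg.solve(A, B @ u)        # one Crank-Nicolson step

exact = np.exp(-np.pi**2 * T) * np.sin(np.pi * x[1:-1])
err = np.max(np.abs(u - exact))
```

Note that the scheme remains accurate here even though r = 1.25 exceeds the explicit-method stability limit.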
A quasilinear equation, such as (this is a minimalistic example and not general)
∂u/∂t = a(u) ∂²u/∂x²
would lead to a nonlinear system of algebraic equations, which could not be easily solved as above; however, it is
possible in some cases to linearize the problem by using the old value for a, that is, a(u_i^n) instead of a(u_i^{n+1}).
Other times, it may be possible to estimate a(u_i^{n+1}) using an explicit method and maintain stability.
Example: 1D diffusion with advection for steady flow, with multiple channel
connections
This is a solution usually employed for many purposes when there is a contamination problem in streams or rivers
under steady flow conditions, but information is given in one dimension only. Often the problem can be simplified
into a one-dimensional problem and still yield useful information.
Here we model the concentration of a solute contaminant in water. This problem is composed of three parts: the
known diffusion equation (with the diffusion coefficient chosen as constant), an advective component (which means the
system is evolving in space due to a velocity field), which we choose to be a constant U_x, and a lateral interaction
between longitudinal channels (k),
where C is the concentration of the contaminant and the subscripts N and M correspond to the previous and next channel.
The Crank–Nicolson method (where i represents position and j time) transforms each component of the PDE into the
following:
Now we create the following constants to simplify the algebra:
and substitute <1> through <6>, together with the constants above, into <0>. We then put the new-time terms on the left (j + 1)
and the present-time terms on the right (j) to get:
To model the first channel, we note that it can only be in contact with the following channel (M), so the expression
is simplified to:
In the same way, to model the last channel, we note that it can only be in contact with the previous channel (N), so
the expression is simplified to:
To solve this linear system of equations, boundary conditions must first be given at the
beginning of the channels:
: initial condition for the channel at the present time step
: initial condition for the channel at the next time step
: initial condition for the previous channel to the one analyzed, at the present time step
: initial condition for the next channel to the one analyzed, at the present time step
For the last cell of the channels (z), the most convenient condition becomes an adiabatic one, so
This condition is satisfied if and only if (regardless of a null value)
Let us solve this problem (in matrix form) for the case of 3 channels and 5 nodes (including the initial boundary
condition). We express this as a linear system problem:
where
and
Now we must realize that AA and BB should be arrays made of four different subarrays (remember that only three
channels are considered for this example but it covers the main part discussed above).
and
where the elements mentioned above correspond to the next arrays, along with an additional 4×4 array full of zeros. Please note
that the sizes of AA and BB are 12×12:
,
,
,
&
The d vector here is used to hold the boundary conditions. In this example it is a 12×1 vector:
To find the concentration at any time, one must iterate the following equation:
Example: 2D diffusion
When extending into two dimensions on a uniform Cartesian grid, the derivation is similar and the results may lead
to a system of band-diagonal equations rather than tridiagonal ones. The two-dimensional heat equation
∂u/∂t = a (∂²u/∂x² + ∂²u/∂y²)
can be solved with the Crank–Nicolson discretization of
u_{i,j}^{n+1} = u_{i,j}^n + (a Δt/(2 (Δx)²)) [(u_{i+1,j}^{n+1} + u_{i−1,j}^{n+1} + u_{i,j+1}^{n+1} + u_{i,j−1}^{n+1} − 4 u_{i,j}^{n+1}) + (u_{i+1,j}^n + u_{i−1,j}^n + u_{i,j+1}^n + u_{i,j−1}^n − 4 u_{i,j}^n)]
assuming that a square grid is used, so that Δx = Δy. This equation can be simplified somewhat by rearranging
terms and using the CFL number
μ = a Δt/(Δx)²
For the Crank–Nicolson numerical scheme, a low CFL number is not required for stability; however, it is required for
numerical accuracy. We can now write the scheme as:
(1 + 2μ) u_{i,j}^{n+1} − (μ/2) (u_{i+1,j}^{n+1} + u_{i−1,j}^{n+1} + u_{i,j+1}^{n+1} + u_{i,j−1}^{n+1}) = (1 − 2μ) u_{i,j}^n + (μ/2) (u_{i+1,j}^n + u_{i−1,j}^n + u_{i,j+1}^n + u_{i,j−1}^n)
Application in financial mathematics
Because a number of other phenomena can be modeled with the heat equation (often called the diffusion equation in
financial mathematics), the Crank–Nicolson method has been applied to those areas as well.[4] In particular, the
Black–Scholes option pricing model's differential equation can be transformed into the heat equation, and thus
numerical solutions for option pricing can be obtained with the Crank–Nicolson method.
The importance of this for finance is that option pricing problems, when extended beyond the standard assumptions
(e.g. incorporating changing dividends), cannot be solved in closed form, but can be solved using this method. Note,
however, that for non-smooth final conditions (which occur for most financial instruments), the Crank–Nicolson
method is not satisfactory, as numerical oscillations are not damped. For vanilla options, this results in oscillation in
the gamma value around the strike price. Therefore, special damping initialization steps are necessary (e.g., the fully
implicit finite difference method).
See also
Financial mathematics
Partial differential equations
References
[1] Tuncer Cebeci (2002). Convective Heat Transfer (http://books.google.com/?id=xfkgT9Fd4t4C&pg=PA257&dq="Crank-Nicolson+method"). Springer. ISBN 0966846141.
[2] Crank, J.; Nicolson, P. (1947). "A practical method for numerical evaluation of solutions of partial differential equations of the heat
conduction type". Proc. Camb. Phil. Soc. 43: 50–67. doi:10.1007/BF02127704.
[3] Thomas, J. W. (1995). Numerical Partial Differential Equations: Finite Difference Methods. Texts in Applied Mathematics. 22. Berlin, New
York: Springer-Verlag. ISBN 978-0-387-97999-1. Example 3.3.2 shows that Crank–Nicolson is unconditionally stable when applied to
the diffusion equation.
[4] Wilmott, P.; Howison, S.; Dewynne, J. (1995). The Mathematics of Financial Derivatives: A Student Introduction (http://books.google.co.in/books?hl=en&q=The+Mathematics+of+Financial+Derivatives+Wilmott&um=1&ie=UTF-8&sa=N&tab=wp). Cambridge Univ. Press.
ISBN 0521497892.
External links
Module for Parabolic P.D.E.'s (http://math.fullerton.edu/mathews/n2003/CrankNicolsonMod.html)
Finite difference
A finite difference is a mathematical expression of the form f(x + b) − f(x + a). If a finite difference is divided by
b − a, one gets a difference quotient. The approximation of derivatives by finite differences plays a central role in
finite difference methods for the numerical solution of differential equations, especially boundary value problems.
Recurrence relations can be written as difference equations by replacing iteration notation with finite differences.
Forward, backward, and central differences
Only three forms are commonly considered: forward, backward, and central differences.
A forward difference is an expression of the form
Δ_h[f](x) = f(x + h) − f(x).
Depending on the application, the spacing h may be variable or constant.
A backward difference uses the function values at x and x − h, instead of the values at x + h and x:
∇_h[f](x) = f(x) − f(x − h).
Finally, the central difference is given by
δ_h[f](x) = f(x + h/2) − f(x − h/2).
Relation with derivatives
The derivative of a function f at a point x is defined by the limit
f′(x) = lim_{h→0} (f(x + h) − f(x))/h.
If h has a fixed (non-zero) value, instead of approaching zero, then the right-hand side is
(f(x + h) − f(x))/h = Δ_h[f](x)/h.
Hence, the forward difference divided by h approximates the derivative when h is small. The error in this
approximation can be derived from Taylor's theorem. Assuming that f is continuously differentiable, the error is
Δ_h[f](x)/h − f′(x) = O(h)  (h → 0).
The same formula holds for the backward difference:
∇_h[f](x)/h − f′(x) = O(h).
However, the central difference yields a more accurate approximation. Its error is proportional to the square of the
spacing (if f is twice continuously differentiable):
δ_h[f](x)/h − f′(x) = O(h²).
The main problem with the central difference method, however, is that oscillating functions can yield zero
derivative. If f(nh) = 1 for n odd, and f(nh) = 2 for n even, then f′(nh) = 0 if it is calculated with the central difference
scheme. This is particularly troublesome if the domain of f is discrete.
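The first-order behavior of the forward difference and the second-order behavior of the central difference are easy to observe numerically. The test function f = exp with f′(0) = 1 is an arbitrary choice for this sketch.

```python
import numpy as np

f, x0 = np.exp, 0.0          # f'(0) = 1 exactly
errors = {}
for h in (0.1, 0.01):
    forward = (f(x0 + h) - f(x0)) / h            # error O(h)
    central = (f(x0 + h) - f(x0 - h)) / (2 * h)  # error O(h^2)
    errors[h] = (abs(forward - 1.0), abs(central - 1.0))

# Shrinking h tenfold shrinks the forward error about 10x and the
# central error about 100x, as the error orders predict.
fwd_ratio = errors[0.01][0] / errors[0.1][0]
cen_ratio = errors[0.01][1] / errors[0.1][1]
```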
Higher-order differences
In an analogous way one can obtain finite difference approximations to higher-order derivatives and differential
operators. For example, by using the above central difference formula for f′(x + h/2) and f′(x − h/2) and
applying a central difference formula for the derivative of f′ at x, we obtain the central difference approximation of
the second derivative of f:
δ_h²[f](x)/h² = (f(x + h) − 2 f(x) + f(x − h))/h².
More generally, the n-th order forward, backward, and central differences are respectively given by:
Δ_h^n[f](x) = Σ_{i=0}^{n} (−1)^i C(n, i) f(x + (n − i) h),
∇_h^n[f](x) = Σ_{i=0}^{n} (−1)^i C(n, i) f(x − i h),
δ_h^n[f](x) = Σ_{i=0}^{n} (−1)^i C(n, i) f(x + (n/2 − i) h).
Note that the central difference will, for odd n, have h multiplied by non-integers. This is often a problem because
it amounts to changing the interval of discretization. The problem may be remedied by taking the average of
δ^n[f](x − h/2) and δ^n[f](x + h/2).
The relationship of these higher-order differences with the respective derivatives is very straightforward:
d^n f/dx^n (x) = Δ_h^n[f](x)/h^n + O(h) = ∇_h^n[f](x)/h^n + O(h) = δ_h^n[f](x)/h^n + O(h²).
Higher-order differences can also be used to construct better approximations. As mentioned above, the first-order
difference approximates the first-order derivative up to a term of order h. However, the combination
(Δ_h[f](x) − ½ Δ_h²[f](x))/h = −(f(x + 2h) − 4 f(x + h) + 3 f(x))/(2h)
approximates f′(x) up to a term of order h². This can be proven by expanding the above expression in Taylor series,
or by using the calculus of finite differences, explained below.
If necessary, the finite difference can be centered about any point by mixing forward, backward, and central
differences.
Arbitrarily sized kernels
Using a little linear algebra, one can fairly easily construct approximations, which sample an arbitrary number of
points to the left and a (possibly different) number of points to the right of the center point, for any order of
derivative. This involves solving a linear system such that the Taylor expansion of the sum of those points, around
the center point, well approximates the Taylor expansion of the desired derivative.
This is useful for differentiating a function on a grid, where, as one approaches the edge of the grid, one must sample
fewer and fewer points on one side.
The details are outlined in these notes [1].
Properties
For all positive k and n
Leibniz rule:
Finite difference methods
An important application of finite differences is in numerical analysis, especially in numerical differential equations,
which aim at the numerical solution of ordinary and partial differential equations respectively. The idea is to replace
the derivatives appearing in the differential equation by finite differences that approximate them. The resulting
methods are called finite difference methods.
Common applications of the finite difference method are in computational science and engineering disciplines, such
as thermal engineering, fluid mechanics, etc.
Calculus of finite differences
The forward difference can be considered as a difference operator, which maps the function f to Δ_h[f]. This operator
satisfies
Δ_h = T_h − I,
where T_h is the shift operator with step h, defined by T_h[f](x) = f(x + h), and I is an identity operator.
Finite differences of higher orders can be defined in a recursive manner as Δ_h^n ≡ Δ_h(Δ_h^{n−1}) or, in
operator notation, Δ_h^n = (T_h − I)^n. Another possible (and equivalent) definition is
Δ_h^n[f](x) = Σ_{k=0}^{n} (−1)^{n−k} C(n, k) f(x + k h).
The difference operator Δ_h is linear and satisfies the Leibniz rule. Similar statements hold for the backward and central
differences.
Formally applying the Taylor series with respect to h gives the formula
Δ_h = h D + h² D²/2! + h³ D³/3! + ··· = e^{h D} − I,
where D denotes the derivative operator, mapping f to its derivative f′. The expansion is valid when both sides act on
analytic functions, for sufficiently small h. Formally inverting the exponential suggests that
h D = ln(I + Δ_h) = Δ_h − Δ_h²/2 + Δ_h³/3 − ···.
This formula holds in the sense that both operators give the same result when applied to a polynomial. Even for
analytic functions, the series on the right is not guaranteed to converge; it may be an asymptotic series. However, it
can be used to obtain more accurate approximations for the derivative. For instance, retaining the first two terms of
the series yields the second-order approximation to f′(x) mentioned at the end of the section Higher-order
differences.
The analogous formulas for the backward and central difference operators are
h D = −ln(I − ∇_h)  and  h D = 2 arcsinh(δ_h/2).
The calculus of finite differences is related to the umbral calculus in combinatorics.
The inverse operator of the forward difference operator is the indefinite sum.
In mathematics, a difference operator maps a function, f(x), to another function, f(x + b) − f(x + a).
The forward difference operator
Δ[f](x) = f(x + 1) − f(x)
occurs frequently in the calculus of finite differences, where it plays a role formally similar to that of the derivative,
but used in discrete circumstances. Difference equations can often be solved with techniques very similar to those for
solving differential equations. This similarity led to the development of time scale calculus. Analogously we can
have the backward difference operator
∇[f](x) = f(x) − f(x − 1).
When restricted to polynomial functions f, the forward difference operator is a delta operator, i.e., a shift-equivariant
linear operator on polynomials that reduces degree by 1.
n-th difference
The nth forward difference of a function f(x) is given by
Δ^n[f](x) = Σ_{k=0}^{n} (−1)^{n−k} C(n, k) f(x + k),
where C(n, k) is the binomial coefficient. Forward differences applied to a sequence are sometimes called the
binomial transform of the sequence, and have a number of interesting combinatorial properties.
Forward differences may be evaluated using the Nörlund–Rice integral. The integral representation for these types of
series is interesting, because the integral can often be evaluated using asymptotic expansion or saddle-point
techniques; by contrast, the forward difference series can be extremely hard to evaluate numerically, because the
binomial coefficients grow rapidly for large n.
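The binomial-coefficient formula for the nth forward difference translates directly into code. The polynomial check below relies on a standard fact: with unit steps, the nth difference of a degree-n polynomial is constant (n! times the leading coefficient), and any higher difference vanishes.

```python
from math import comb

def nth_forward_difference(f, x, n, h=1.0):
    # Delta^n f(x) = sum_k (-1)^(n-k) C(n, k) f(x + k h)
    return sum((-1) ** (n - k) * comb(n, k) * f(x + k * h)
               for k in range(n + 1))

cube = lambda x: x**3
third = nth_forward_difference(cube, 0.0, 3)    # 3! * 1 = 6
fourth = nth_forward_difference(cube, 0.0, 4)   # degree exceeded -> 0
```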
Newton series
The Newton series consists of the terms of the Newton forward difference equation, named after Isaac Newton;
in essence, it is the Newton interpolation formula, first published in his Principia Mathematica in 1687 [2], namely the
relationship
f(x) = Σ_{k=0}^{∞} (Δ^k[f](a)/k!) (x − a)_k = Σ_{k=0}^{∞} C(x − a, k) Δ^k[f](a),
which holds for any polynomial function f and for some, but not all, analytic functions. Here, the expression
C(x, k) = (x)_k/k!
is the binomial coefficient, as
(x)_k = x(x − 1)(x − 2)···(x − k + 1)
is the "falling factorial" or "lower factorial" and the empty product (x)_0 is defined to be 1. In this particular case there is
an assumption of unit steps for the changes in the values of x. Note also the formal similarity of this result to Taylor's
theorem; this is one of the observations that led to the idea of umbral calculus.
To illustrate how one might use Newton's formula in actual practice, consider the first few terms of the Fibonacci
sequence f = 2, 2, 4, ... One can find a polynomial that reproduces these values by first computing a difference table
and then substituting the differences which correspond to x_0 (underlined) into the formula as follows,
For the case of nonuniform steps in the values of x Newton computes the divided differences,
the series of products,
and the resulting polynomial is the scalar product,
.
In analysis with p-adic numbers, Mahler's theorem states that the assumption that f is a polynomial function can be
weakened all the way to the assumption that f is merely continuous.
Carlson's theorem provides necessary and sufficient conditions for a Newton series to be unique, if it exists.
However, a Newton series will not, in general, exist.
The Newton series, together with the Stirling series and the Selberg series, is a special case of the general difference
series, all of which are defined in terms of scaled forward differences.
Rules for calculus of finite difference operators
Analogous to rules for finding the derivative, we have:
Constant rule: If c is a constant, then
Δc = 0.
Linearity: if a and b are constants,
Δ(a f + b g) = a Δf + b Δg.
All of the above rules apply equally well to any difference operator, including ∇ as well as Δ.
Product rule:
Δ(f g) = f Δg + g Δf + Δf Δg,
∇(f g) = f ∇g + g ∇f − ∇f ∇g.
Quotient rule:
∇(f/g) = (g ∇f − f ∇g)/(g · (g − ∇g))
or
∇(f/g) = (g ∇f − f ∇g)/(g² − g ∇g).
Summation rules:
Σ_{n=a}^{b} Δf(n) = f(b + 1) − f(a),
Σ_{n=a}^{b} ∇f(n) = f(b) − f(a − 1).
Indefinite sum
The inverse operator of the forward difference operator is the indefinite sum.
Generalizations
A generalized finite difference is usually defined as
Δ_h^μ[f](x) = Σ_{k=0}^{N} μ_k f(x + k h),
where μ = (μ_0, …, μ_N) is its coefficient vector. An infinite difference is a further generalization, where the
finite sum above is replaced by an infinite series. Another way of generalization is making the coefficients μ_k depend
on the point x: μ_k = μ_k(x), thus considering a weighted finite difference. Also one may make the step h depend on the
point x: h = h(x). Such generalizations are useful for constructing different moduli of continuity.
The difference operator generalizes to Möbius inversion over a partially ordered set.
As a convolution operator: Via the formalism of incidence algebras, difference operators and other Möbius
inversion can be represented by convolution with a function on the poset, called the Möbius function μ; for the
difference operator, μ is the sequence (1, −1, 0, 0, 0, ...).
Finite difference in several variables
Finite differences can be considered in more than one variable. They are analogous to partial derivatives in several
variables.
Some partial derivative approximations are:
f_x(x, y) ≈ (f(x + h, y) − f(x − h, y))/(2h),
f_y(x, y) ≈ (f(x, y + k) − f(x, y − k))/(2k),
f_xx(x, y) ≈ (f(x + h, y) − 2 f(x, y) + f(x − h, y))/h²,
f_yy(x, y) ≈ (f(x, y + k) − 2 f(x, y) + f(x, y − k))/k²,
f_xy(x, y) ≈ (f(x + h, y + k) − f(x + h, y − k) − f(x − h, y + k) + f(x − h, y − k))/(4 h k).
See also
Finite difference coefficients
Taylor series
Numerical differentiation
Five-point stencil
Divided differences
Modulus of continuity
Time scale calculus
Summation by parts
Newton polynomial
Table of Newtonian series
Lagrange polynomial
Gilbreath's conjecture
References
[1] http://commons.wikimedia.org/wiki/File:FDnotes.djvu
[2] See Newton, Isaac, Principia, Book III, Lemma V, Case 1 (http://books.google.com/books?id=KaAIAAAAIAAJ&dq=sir+isaac+newton+principia+mathematica&as_brr=1&pg=PA466#v=onepage&q&f=false)
William F. Ames, Numerical Methods for Partial Differential Equations, Section 1.6. Academic Press, New
York, 1977. ISBN 0-12-056760-1.
Francis B. Hildebrand, Finite-Difference Equations and Simulations, Section 2.2, Prentice-Hall, Englewood
Cliffs, New Jersey, 1968.
Boole, George, A Treatise On The Calculus of Finite Differences, 2nd ed., Macmillan and Company, 1872. [See
also: Dover edition 1960].
Levy, H.; Lessman, F. (1992). Finite Difference Equations. Dover. ISBN 0-486-67260-3.
Robert D. Richtmyer and K. W. Morton, Difference Methods for Initial Value Problems, 2nd ed., Wiley, New
York, 1967.
Flajolet, Philippe; Sedgewick, Robert (1995), "Mellin transforms and asymptotics: Finite differences and Rice's
integrals" (http://www-rocq.inria.fr/algo/flajolet/Publications/mellin-rice.ps.gz), Theoretical Computer
Science 144 (1–2): 101–124, doi:10.1016/0304-3975(94)00281-M.
External links
Table of useful finite difference formula generated using Mathematica (http://reference.wolfram.com/mathematica/tutorial/NDSolvePDE.html#c:4)
Finite Calculus: A Tutorial for Solving Nasty Sums (http://www.stanford.edu/~dgleich/publications/finite-calculus.pdf)
Value at risk
In financial mathematics and financial risk management, Value at Risk (VaR) is a widely used measure of the
risk of loss on a specific portfolio of financial assets. For a given portfolio, probability and time horizon, VaR is
defined as a threshold value such that the probability that the mark-to-market loss on the portfolio over the given
time horizon exceeds this value (assuming normal markets and no trading in the portfolio) is the given probability
level.[1]
For example, if a portfolio of stocks has a one-day 5% VaR of $1 million, there is a 0.05 probability that the
portfolio will fall in value by more than $1 million over a one-day period, assuming markets are normal and there is
no trading. Informally, a loss of $1 million or more on this portfolio is expected on 1 day in 20. A loss which
exceeds the VaR threshold is termed a VaR break.[2]
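The worked example corresponds to a short computation under the historical-simulation approach: take the observed daily profit-and-loss values and read off the chosen quantile. The P&L series below is randomly generated purely for illustration; a real computation would use the portfolio's actual mark-to-market history.

```python
import numpy as np

# Simulated daily P&L history in dollars (illustrative only).
rng = np.random.default_rng(seed=42)
pnl = rng.normal(loc=0.0, scale=1_000_000.0, size=2000)

# One-day 5% VaR: the loss exceeded on only 5% of days, reported as a
# positive number by convention.
var_5pct = -np.quantile(pnl, 0.05)

# Fraction of days breaching the VaR threshold ("VaR breaks").
breaks = np.mean(pnl < -var_5pct)
```

By construction, roughly one day in twenty is a VaR break, matching the "1 day in 20" reading in the text.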
The 5% Value at Risk of a hypothetical profit-and-loss probability density function
VaR has five main uses in finance: risk management, risk measurement, financial control, financial reporting
and computing regulatory capital. VaR is sometimes used in non-financial applications as well.[3]
Important related ideas are economic capital, backtesting, stress testing and expected shortfall.[4]
Details
Common parameters for VaR are 1% and 5% probabilities and one-day and two-week horizons, although other
combinations are in use.[5]
The reason for assuming normal markets and no trading, and for restricting loss to things measured in daily accounts,
is to make the loss observable. In some extreme financial events it can be impossible to determine losses, either
because market prices are unavailable or because the loss-bearing institution breaks up. Some longer-term
consequences of disasters, such as lawsuits, loss of market confidence and employee morale, and impairment of
brand names, can take a long time to play out, and may be hard to allocate among specific prior decisions. VaR marks
the boundary between normal days and extreme events. Institutions can lose far more than the VaR amount; all that
can be said is that they will not do so very often.[6]
The probability level is about equally often specified as one minus the probability of a VaR break, so that the VaR in
the example above would be called a one-day 95% VaR instead of a one-day 5% VaR. This generally does not lead to
confusion because the probability of VaR breaks is almost always small, certainly less than 0.5.[1]
Although it virtually always represents a loss, VaR is conventionally reported as a positive number. A negative VaR
would imply the portfolio has a high probability of making a profit; for example, a one-day 5% VaR of negative $1
million implies the portfolio has a 95% chance of making more than $1 million over the next day.[7]
Another inconsistency is that VaR is sometimes taken to refer to profit-and-loss at the end of the period, and sometimes
as the maximum loss at any point during the period. The original definition was the latter, but in the early 1990s,
when VaR was aggregated across trading desks and time zones, end-of-day valuation was the only reliable number,
so the former became the de facto definition. As people began using multiday VaRs in the second half of the 1990s,
they almost always estimated the distribution at the end of the period only. It is also easier theoretically to deal with
a point-in-time estimate versus a maximum over an interval. Therefore, the end-of-period definition is the most
common both in theory and practice today.[8]
Varieties of VaR
The definition of VaR is nonconstructive; it specifies a property VaR must have, but not how to compute VaR.
Moreover, there is wide scope for interpretation in the definition.[9] This has led to two broad types of VaR, one used
primarily in risk management and the other primarily for risk measurement. The distinction is not sharp, however,
and hybrid versions are typically used in financial control, financial reporting and computing regulatory capital.[10]
To a risk manager, VaR is a system, not a number. The system is run periodically (usually daily) and the published
number is compared to the computed price movement in opening positions over the time horizon. There is never any
subsequent adjustment to the published VaR, and there is no distinction between VaR breaks caused by input errors
(including information technology breakdowns, fraud and rogue trading), computation errors (including failure to
produce a VaR on time) and market movements.[11]
A frequentist claim is made, that the long-term frequency of VaR breaks will equal the specified probability, within
the limits of sampling error, and that the VaR breaks will be independent in time and independent of the level of
VaR. This claim is validated by a backtest, a comparison of published VaRs to actual price movements. In this
interpretation, many different systems could produce VaRs with equally good backtests, but wide disagreements on
daily VaR values.[1]
For risk measurement a number is needed, not a system. A Bayesian probability claim is made, that given the
information and beliefs at the time, the subjective probability of a VaR break was the specified level. VaR is adjusted
after the fact to correct errors in inputs and computation, but not to incorporate information unavailable at the time of
computation.[7] In this context, backtest has a different meaning. Rather than comparing published VaRs to actual
market movements over the period of time the system has been in operation, VaR is retroactively computed on
scrubbed data over as long a period as data are available and deemed relevant. The same position data and pricing
models are used for computing the VaR as for determining the price movements.[2]
Although some of the sources listed here treat only one kind of VaR as legitimate, most of the recent ones seem to
agree that risk management VaR is superior for making short-term and tactical decisions today, while risk
measurement VaR should be used for understanding the past, and making medium-term and strategic decisions for
the future. When VaR is used for financial control or financial reporting it should incorporate elements of both. For
example, if a trading desk is held to a VaR limit, that is both a risk-management rule for deciding what risks to allow
today, and an input into the risk measurement computation of the desk's risk-adjusted return at the end of the
reporting period.[4]
VAR in Governance
An interesting takeoff on VaR is its application in governance for endowments, trusts, and pension plans. Essentially,
trustees adopt portfolio Value-at-Risk metrics for the entire pooled account and the diversified parts individually
managed. Instead of probability estimates they simply define maximum levels of acceptable loss for each. Doing so
provides an easy metric for oversight and adds accountability, as managers are then directed to manage, but with the
additional constraint to avoid losses within a defined risk parameter. VaR utilized in this manner adds relevance as
well as an easy-to-monitor risk measurement control far more intuitive than standard deviation of return. Use of
VaR in this context, as well as a worthwhile critique of board governance practices as it relates to investment
management oversight in general, can be found in Best Practices in Governance.[12]
Risk measure and risk metric
The term VaR is used both for a risk measure and a risk metric. This sometimes leads to confusion. Sources earlier
than 1995 usually emphasize the risk measure; later sources are more likely to emphasize the metric.
The VaR risk measure defines risk as mark-to-market loss on a fixed portfolio over a fixed time horizon, assuming
normal markets. There are many alternative risk measures in finance. Instead of mark-to-market, which uses market
prices to define loss, loss is often defined as change in fundamental value. For example, if an institution holds a loan
that declines in market price because interest rates go up, but has no change in cash flows or credit quality, some
systems do not recognize a loss. Or we could try to incorporate the economic cost of things not measured in daily
financial statements, such as loss of market confidence or employee morale, impairment of brand names or
lawsuits.[4]
Rather than assuming a fixed portfolio over a fixed time horizon, some risk measures incorporate the effect of
expected trading (such as a stop loss order) and consider the expected holding period of positions. Finally, some risk
measures adjust for the possible effects of abnormal markets, rather than excluding them from the computation.[4]
The VaR risk metric summarizes the distribution of possible losses by a quantile, a point with a specified probability
of greater losses. Common alternative metrics are standard deviation, mean absolute deviation, expected shortfall
and downside risk.[1]
VaR risk management
Supporters of VaR-based risk management claim the first and possibly greatest benefit of VaR is the improvement in
systems and modeling it forces on an institution. In 1997, Philippe Jorion wrote:
[13] [14]
[T]he greatest benefit of VAR lies in the imposition of a structured methodology for critically thinking
about risk. Institutions that go through the process of computing their VAR are forced to confront their
exposure to financial risks and to set up a proper risk management function. Thus the process of getting
to VAR may be as important as the number itself.
Publishing a daily number, on-time and with specified statistical properties holds every part of a trading organization
to a high objective standard. Robust backup systems and default assumptions must be implemented. Positions that
are reported, modeled or priced incorrectly stand out, as do data feeds that are inaccurate or late and systems that are
too-frequently down. Anything that affects profit and loss that is left out of other reports will show up either in
inflated VaR or excessive VaR breaks. A risk-taking institution that does not compute VaR might escape disaster,
but an institution that cannot compute VaR will not.
[15]
The second claimed benefit of VaR is that it separates risk into two regimes. Inside the VaR limit, conventional
statistical methods are reliable. Relatively short-term and specific data can be used for analysis. Probability estimates
are meaningful, because there are enough data to test them. In a sense, there is no true risk because you have a sum
of many independent observations with a left bound on the outcome. A casino doesn't worry about whether red or
black will come up on the next roulette spin. Risk managers encourage productive risk-taking in this regime, because
there is little true cost. People tend to worry too much about these risks, because they happen frequently, and not
enough about what might happen on the worst days.
[16]
Outside the VaR limit, all bets are off. Risk should be analyzed with stress testing based on long-term and broad
market data.
[17]
Probability statements are no longer meaningful.
[18]
Knowing the distribution of losses beyond the
VaR point is both impossible and useless. The risk manager should concentrate instead on making sure good plans
are in place to limit the loss if possible, and to survive the loss if not.
[1]
One specific system uses three regimes.
[19]
1. One to three times VaR are normal occurrences. You expect periodic VaR breaks. The loss distribution typically
has fat tails, and you might get more than one break in a short period of time. Moreover, markets may be
abnormal and trading may exacerbate losses, and you may take losses not measured in daily marks such as
lawsuits, loss of employee morale and market confidence and impairment of brand names. So an institution that
can't deal with three times VaR losses as routine events probably won't survive long enough to put a VaR system
in place.
2. Three to ten times VaR is the range for stress testing. Institutions should be confident they have examined all the
foreseeable events that will cause losses in this range, and are prepared to survive them. These events are too rare
to estimate probabilities reliably, so risk/return calculations are useless.
3. Foreseeable events should not cause losses beyond ten times VaR. If they do they should be hedged or insured, or
the business plan should be changed to avoid them, or VaR should be increased. It's hard to run a business if
foreseeable losses are orders of magnitude larger than very large everyday losses. It's hard to plan for these
events, because they are out of scale with daily experience. Of course there will be unforeseeable losses more than
ten times VaR, but it's pointless to anticipate them: you can't know much about them, and it results in needless
worrying. Better to hope that the discipline of preparing for all foreseeable three-to-ten times VaR losses will
improve chances for surviving the unforeseen and larger losses that inevitably occur.
"A risk manager has two jobs: make people take more risk the 99% of the time it is safe to do so, and survive the
other 1% of the time. VaR is the border."
[15]
VaR risk measurement
The VaR risk measure is a popular way to aggregate risk across an institution. Individual business units have risk
measures such as duration for a fixed income portfolio or beta for an equity business. These cannot be combined in a
meaningful way.
[1]
It is also difficult to aggregate results available at different times, such as positions marked in
different time zones, or a high frequency trading desk with a business holding relatively illiquid positions. But since
every business contributes to profit and loss in an additive fashion, and many financial businesses mark-to-market
daily, it is natural to define firm-wide risk using the distribution of possible losses at a fixed point in the future.
[4]
In risk measurement, VaR is usually reported alongside other risk metrics such as standard deviation, expected
shortfall and greeks (partial derivatives of portfolio value with respect to market factors). VaR is a
distribution-free metric, that is, it does not depend on assumptions about the probability distribution of future
gains and losses.
[15]
The
probability level is chosen deep enough in the left tail of the loss distribution to be relevant for risk decisions, but not
so deep as to be difficult to estimate with accuracy.
[20]
Risk measurement VaR is sometimes called parametric VaR. This usage can be confusing, however, because it can
be estimated either parametrically (for example, variance-covariance VaR or delta-gamma VaR) or
nonparametrically (for example, historical simulation VaR or resampled VaR). The inverse usage makes more
logical sense, because risk management VaR is fundamentally nonparametric, but it is seldom referred to as
nonparametric VaR.
[4] [6]
History of VaR
The problem of risk measurement is an old one in statistics, economics and finance. Financial risk management has
been a concern of regulators and financial executives for a long time as well. Retrospective analysis has found some
VaR-like concepts in this history. But VaR did not emerge as a distinct concept until the late 1980s. The triggering
event was the stock market crash of 1987. This was the first major financial crisis in which a lot of
academically-trained quants were in high enough positions to worry about firm-wide survival.
[1]
The crash was so unlikely given standard statistical models, that it called the entire basis of quant finance into
question. A reconsideration of history led some quants to decide there were recurring crises, about one or two per
decade, that overwhelmed the statistical assumptions embedded in models used for trading, investment management
and derivative pricing. These affected many markets at once, including ones that were usually not correlated, and
seldom had discernible economic cause or warning (although after-the-fact explanations were plentiful).
[18]
Much
later, they were named "Black Swans" by Nassim Taleb and the concept extended far beyond finance.
[21]
If these events were included in quantitative analysis they dominated results and led to strategies that did not work
day to day. If these events were excluded, the profits made in between "Black Swans" could be much smaller than
the losses suffered in the crisis. Institutions could fail as a result.
[15] [18] [21]
VaR was developed as a systematic way to segregate extreme events, which are studied qualitatively over long-term
history and broad market events, from everyday price movements, which are studied quantitatively using short-term
data in specific markets. It was hoped that "Black Swans" would be preceded by increases in estimated VaR or
increased frequency of VaR breaks, in at least some markets. The extent to which this has proven to be true is
controversial.
[18]
Abnormal markets and trading were excluded from the VaR estimate in order to make it observable.
[16]
It is not
always possible to define loss if, for example, markets are closed as after 9/11, or severely illiquid, as happened
several times in 2008.
[15]
Losses can also be hard to define if the risk-bearing institution fails or breaks up.
[16]
A
measure that depends on traders taking certain actions, and avoiding other actions, can lead to self reference.
[1]
This is risk management VaR. It was well-established in quantitative trading groups at several financial institutions,
notably Bankers Trust, before 1990, although neither the name nor the definition had been standardized. There was
no effort to aggregate VaRs across trading desks.
[18]
The financial events of the early 1990s found many firms in trouble because the same underlying bet had been made
at many places in the firm, in non-obvious ways. Since many trading desks already computed risk management VaR,
and it was the only common risk measure that could be both defined for all businesses and aggregated without strong
assumptions, it was the natural choice for reporting firmwide risk. J. P. Morgan CEO Dennis Weatherstone famously
called for a 4:15 report that combined all firm risk on one page, available within 15 minutes of the market close.
[9]
Risk measurement VaR was developed for this purpose. Development was most extensive at J. P. Morgan, which
published the methodology and gave free access to estimates of the necessary underlying parameters in 1994. This
was the first time VaR had been exposed beyond a relatively small group of quants. Two years later, the
methodology was spun off into an independent for-profit business now part of RiskMetrics Group
[22]
.
[9]
In 1997, the U.S. Securities and Exchange Commission ruled that public corporations must disclose quantitative
information about their derivatives activity. Major banks and dealers chose to implement the rule by including VaR
information in the notes to their financial statements.
[1]
Worldwide adoption of the Basel II Accord, beginning in 1999 and nearing completion today, gave further impetus
to the use of VaR. VaR is the preferred measure of market risk, and concepts similar to VaR are used in other parts
of the accord.
[1]
Mathematics
"Given some confidence level α ∈ (0, 1), the VaR of the portfolio at the confidence level α is given by the
smallest number l such that the probability that the loss L exceeds l is not larger than (1 − α)":

    VaR_α(L) = inf{ l ∈ ℝ : P(L > l) ≤ 1 − α } = inf{ l ∈ ℝ : F_L(l) ≥ α }
[3]
The left equality is a definition of VaR. The right equality assumes an underlying probability distribution, which
makes it true only for parametric VaR. Risk managers typically assume that some fraction of the bad events will
have undefined losses, either because markets are closed or illiquid, or because the entity bearing the loss breaks
apart or loses the ability to compute accounts. Therefore, they do not accept results based on the assumption of a
well-defined probability distribution.
[6]
Nassim Taleb has labeled this assumption, "charlatanism."
[23]
On the other
hand, many academics prefer to assume a well-defined distribution, albeit usually one with fat tails.
[1]
This point has
probably caused more contention among VaR theorists than any other.
[9]
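As a concrete illustration of the quantile definition, a historical-simulation estimate of one-day VaR takes the empirical quantile of past losses. The sketch below is illustrative only: the loss series is simulated, not market data.

```python
import numpy as np

def historical_var(losses, alpha=0.99):
    """Historical-simulation VaR: the smallest observed loss level l
    such that the fraction of losses exceeding l is at most 1 - alpha."""
    # method="higher" returns an actual observation, matching the infimum
    # in the quantile definition applied to an empirical distribution.
    return np.quantile(np.asarray(losses, dtype=float), alpha, method="higher")

# Hypothetical daily mark-to-market losses (positive = loss), in millions.
rng = np.random.default_rng(0)
losses = rng.normal(loc=0.0, scale=1.0, size=1000)

var_99 = historical_var(losses, alpha=0.99)
print(f"one-day 99% VaR: {var_99:.2f} million")
```

Note that this nonparametric estimate makes no distributional assumption, in line with the risk-management usage described above.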
Criticism
VaR has been controversial since it moved from trading desks into the public eye in 1994. A famous 1997 debate
[13]
between Nassim Taleb and Philippe Jorion set out some of the major points of contention. Taleb claimed VaR:
[24]
1. Ignored 2,500 years of experience in favor of untested models built by non-traders
2. Was charlatanism because it claimed to estimate the risks of rare events, which is impossible
3. Gave false confidence
4. Would be exploited by traders
More recently, David Einhorn and Aaron Brown debated VaR in the Global Association of Risk Professionals Review
[25][15] [26]
Einhorn compared VaR to an airbag that works all the time, except when you have a car accident. He
further charged that VaR:
1. Led to excessive risk-taking and leverage at financial institutions
2. Focused on the manageable risks near the center of the distribution and ignored the tails
3. Created an incentive to take excessive but remote risks
4. Was potentially catastrophic when its use creates a false sense of security among senior executives and
watchdogs.
New York Times reporter Joe Nocera wrote an extensive piece Risk Mismanagement
[27][28]
on January 4, 2009
discussing the role VaR played in the Financial crisis of 2007-2008. After interviewing risk managers (including
several of the ones cited above) the article suggests that VaR was very useful to risk experts, but nevertheless
exacerbated the crisis by giving false security to bank executives and regulators. A powerful tool for professional
risk managers, VaR is portrayed as both easy to misunderstand, and dangerous when misunderstood.
A common complaint among academics is that VaR is not subadditive.
[4]
That means the VaR of a combined
portfolio can be larger than the sum of the VaRs of its components. To a practicing risk manager this makes sense.
For example, the average bank branch in the United States is robbed about once every ten years. A single-branch
bank has about 0.004% chance of being robbed on a specific day, so the risk of robbery would not figure into
one-day 1% VaR. It would not even be within an order of magnitude of that, so it is in the range where the institution
should not worry about it; it should insure against it and take advice from insurers on precautions. The whole point
of insurance is to aggregate risks that are beyond individual VaR limits, and bring them into a large enough portfolio
to get statistical predictability. It does not pay for a one-branch bank to have a security expert on staff.
As institutions get more branches, the risk of a robbery on a specific day rises to within an order of magnitude of
VaR. At that point it makes sense for the institution to run internal stress tests and analyze the risk itself. It will
spend less on insurance and more on in-house expertise. For a very large banking institution, robberies are a routine
daily occurrence. Losses are part of the daily VaR calculation, and tracked statistically rather than case-by-case. A
sizable in-house security department is in charge of prevention and control; the general risk manager just tracks the
loss like any other cost of doing business.
As portfolios or institutions get larger, specific risks change from low-probability/low-predictability/high-impact to
statistically predictable losses of low individual impact. That means they move from the range of far outside VaR, to
be insured, to near outside VaR, to be analyzed case-by-case, to inside VaR, to be treated statistically.
[15]
Even VaR supporters generally agree there are common abuses of VaR:
[6] [9]
1. Referring to VaR as a "worst-case" or "maximum tolerable" loss. In fact, you expect two or three losses per year
that exceed one-day 1% VaR.
2. Making VaR control or VaR reduction the central concern of risk management. It is far more important to worry
about what happens when losses exceed VaR.
3. Assuming plausible losses will be less than some multiple, often three, of VaR. The entire point of VaR is that
losses can be extremely large, and sometimes impossible to define, once you get beyond the VaR point. To a risk
manager, VaR is the level of losses at which you stop trying to guess what will happen next, and start preparing
for anything.
4. Reporting a VaR that has not passed a backtest. Regardless of how VaR is computed, it should have produced the
correct number of breaks (within sampling error) in the past. A common specific violation of this is to report a
VaR based on the unverified assumption that everything follows a multivariate normal distribution.
References
[1] Philippe Jorion, Value at Risk: The New Benchmark for Managing Financial Risk, 3rd ed. McGraw-Hill (2006). ISBN 978-0071464956
[2] Glyn Holton, Value-at-Risk: Theory and Practice, Academic Press (2003). ISBN 978-0123540102.
[3] Alexander McNeil, Rüdiger Frey and Paul Embrechts, Quantitative Risk Management: Concepts, Techniques and Tools, Princeton University
Press (2005). ISBN 978-0691122557
[4] Kevin Dowd, Measuring Market Risk. John Wiley & Sons (2005) ISBN 978-0470013038.
[5] Neil Pearson, Risk Budgeting: Portfolio Problem Solving with Value-at-Risk. John Wiley & Sons (2002). ISBN 978-0471405566.
[6] Aaron Brown, The Unbearable Lightness of Cross-Market Risk, Wilmott Magazine, March 2004.
[7] Michel Crouhy, Dan Galai and Robert Mark, The Essentials of Risk Management. McGraw-Hill (2001) ISBN 978-0071429665
[8] Jose A. Lopez, Regulatory Evaluation of Value-at-Risk Models. Wharton Financial Institutions Center Working Paper 96-51, September
1996.
[9] Joe Kolman, Michael Onak, Philippe Jorion, Nassim Taleb, Emanuel Derman, Blu Putnam, Richard Sandor, Stan Jonas, Ron Dembo, George
Holt, Richard Tanenbaum, William Margrabe, Dan Mudge, James Lam and Jim Rozsypal, Roundtable: The Limits of VaR. Derivatives
Strategy, April 1998.
[10] Aaron Brown, The Next Ten VaR Disasters. Derivatives Strategy, March 1997.
[11] Paul Wilmott, Paul Wilmott Introduces Quantitative Finance. Wiley (2007). ISBN 978-0470319581
[12] Best Practices in Governance, Lawrence York, 2009
[13] http://www.derivativesstrategy.com/magazine/archive/1997/0497fea2.asp
[14] Philippe Jorion in Nassim Taleb and Philippe Jorion, The Jorion/Taleb Debate. Derivatives Strategy, April 1997.
[15] Aaron Brown, in David Einhorn and Aaron Brown, Private Profits and Socialized Risk. GARP Risk Review (June/July 2008).
[16] Espen Haug, Derivative Models on Models. John Wiley & Sons (2007). ISBN 978-0470013229
[17] Ezra Zask, Taking the Stress Out of Stress Testing. Derivative Strategy, February 1999.
[18] Joe Kolman, Michael Onak, Philippe Jorion, Nassim Taleb, Emanuel Derman, Blu Putnam, Richard Sandor, Stan Jonas, Ron Dembo,
George Holt, Richard Tanenbaum, William Margrabe, Dan Mudge, James Lam and Jim Rozsypal, Roundtable: The Limits of Models.
Derivatives Strategy, April 1998.
[19] Aaron Brown, On Stressing the Right Size. GARP Risk Review, December 2007.
[20] Paul Glasserman, Monte Carlo Methods in Financial Engineering. Springer (2004). ISBN 978-0387004518.
[21] Taleb, Nassim Nicholas (2007). The Black Swan: The Impact of the Highly Improbable. New York: Random House.
ISBN 978-1-4000-6351-2.
[22] http://www.riskmetrics.com/
[23] Nassim Taleb, The World According to Nassim Taleb. Derivatives Strategy, December 1996/January 1997.
[24] Nassim Taleb in Philippe Jorion in Nassim Taleb and Philippe Jorion, The Jorion/Taleb Debate. Derivatives Strategy, April 1997.
[25] http://www.garpdigitallibrary.org/download/GRR/2012.pdf
[26] David Einhorn in David Einhorn and Aaron Brown, Private Profits and Socialized Risk. GARP Risk Review (June/July 2008).
[27] http://www.nytimes.com/2009/01/04/magazine/04risk-t.html?pagewanted=1&_r=1
[28] Joe Nocera, Risk Mismanagement, The New York Times Magazine (January 4, 2009)
External links
Discussion
Perfect Storms - Beautiful & True Lies In Risk Management (http://www.wilmott.com/blogs/satyajitdas/enclosures/perfectstorms(may2007)1.pdf), Satyajit Das
Gloria Mundi - All About Value at Risk (http://www.gloriamundi.org/), Barry Schachter
Risk Management (http://www.nytimes.com/2009/01/04/magazine/04risk-t.html?dlbk=&pagewanted=all), Joe Nocera NYTimes article.
Tools
Online real-time VaR calculator (http://www.cba.ua.edu/~rpascala/VaR/VaRForm.php), Razvan Pascalau, University of Alabama
Value-at-Risk (VaR) (http://finance.wharton.upenn.edu/~benninga/mma/MiER74.pdf), Simon Benninga and Zvi Wiener. (Mathematica in Education and Research Vol. 7 No. 4 1998.)
Volatility (finance)
In finance, volatility most frequently refers to the standard deviation of the continuously compounded returns of a
financial instrument within a specific time horizon. It is used to quantify the risk of the financial instrument over the
specified time period. Volatility is normally expressed in annualized terms, and it may either be an absolute number
($5) or a fraction of the mean (5%).
Volatility terminology
Volatility as described here refers to the actual current volatility of a financial instrument for a specified period (for
example 30 days or 90 days). It is the volatility of a financial instrument based on historical prices over the specified
period with the last observation the most recent price. This phrase is used particularly when it is wished to
distinguish between the actual current volatility of an instrument and
actual historical volatility which refers to the volatility of a financial instrument over a specified period but with
the last observation on a date in the past
actual future volatility which refers to the volatility of a financial instrument over a specified period starting at
the current time and ending at a future date (normally the expiry date of an option)
historical implied volatility which refers to the implied volatility observed from historical prices of the financial
instrument (normally options)
current implied volatility which refers to the implied volatility observed from current prices of the financial
instrument
future implied volatility which refers to the implied volatility observed from future prices of the financial
instrument
For a financial instrument whose price follows a Gaussian random walk, or Wiener process, the width of the
distribution increases as time increases. This is because there is an increasing probability that the instrument's price
will be farther away from the initial price as time increases. However, rather than increasing linearly, the width
increases with the square root of time, because some fluctuations are expected to cancel each other
out, so the most likely deviation after twice the time will not be twice the distance from zero.
Since observed price changes do not follow Gaussian distributions, others such as the Lévy distribution are often
used.
[1]
These can capture attributes such as "fat tails".
Volatility for market players
When investing directly in a security, volatility is often viewed as a negative in that it represents uncertainty and risk.
However, with other investing strategies, volatility is often desirable. For example, if an investor is short on the
peaks, and long on the lows of a security, the profit will be greatest when volatility is highest.
In today's markets, it is also possible to trade volatility directly, through the use of derivative securities such as
options and variance swaps. See Volatility arbitrage.
Volatility versus direction
Volatility does not measure the direction of price changes, merely their dispersion. This is because when calculating
standard deviation (or variance), all differences are squared, so that negative and positive differences are combined
into one quantity. Two instruments with different volatilities may have the same expected return, but the instrument
with higher volatility will have larger swings in values over a given period of time.
For example, a lower volatility stock may have an expected (average) return of 7%, with annual volatility of 5%.
This would indicate returns from approximately -3% to 17% most of the time (19 times out of 20, or 95%). A higher
volatility stock, with the same expected return of 7% but with annual volatility of 20%, would indicate returns from
approximately -33% to 47% most of the time (19 times out of 20, or 95%). These estimates assume a normal
distribution; in reality stocks are found to be leptokurtic.
Volatility is a poor measure of risk, as explained by Peter Carr, "it is only a good measure of risk if you feel that
being rich then being poor is the same as being poor then rich".
Volatility over time
Although the Black-Scholes equation assumes predictable constant volatility, this is not observed in real
markets. Among the models that relax this assumption are Bruno Dupire's local volatility, Poisson processes in
which volatility jumps to new levels with a predictable frequency, and the increasingly popular Heston model of
stochastic volatility.
[2]
It is common knowledge that many types of assets experience periods of high and low volatility. That is, during some
periods prices go up and down quickly, while during other times they might not seem to move at all.
Periods when prices fall quickly (a crash) are often followed by prices going down even more, or going up by an
unusual amount. Also, a time when prices rise quickly (a bubble) may often be followed by prices going up even
more, or going down by an unusual amount.
The converse behavior, 'doldrums', can last for a long time as well.
Most typically, extreme movements do not appear 'out of nowhere'; they're presaged by larger movements than
usual. This is termed autoregressive conditional heteroskedasticity. Of course, whether such large movements have
the same direction, or the opposite, is more difficult to say. And an increase in volatility does not always presage a
further increase; the volatility may simply go back down again.
Mathematical definition
The annualized volatility σ is the standard deviation of the instrument's yearly logarithmic returns.
The generalized volatility σ_T for time horizon T in years is expressed as:

    σ_T = σ √T

Therefore, if the daily logarithmic returns of a stock have a standard deviation of σ_SD and the time period of
returns is P, the annualized volatility is

    σ = σ_SD / √P

A common assumption is that P = 1/252 (there are 252 trading days in any given year). Then, if σ_SD = 0.01, the
annualized volatility is

    σ = 0.01 / √(1/252) = 0.01 √252 ≈ 0.1587

The monthly volatility (i.e., T = 1/12 of a year) would be

    σ_month = 0.1587 √(1/12) ≈ 0.0458
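The daily-to-annual conversion can be checked numerically. The sketch below uses a simulated price path (not market data) with a true daily log-return volatility of 1%, and recovers approximately 0.01 √252:

```python
import numpy as np

def annualized_volatility(prices, periods_per_year=252):
    """Sample standard deviation of log returns, scaled by sqrt(252)."""
    log_returns = np.diff(np.log(prices))
    return log_returns.std(ddof=1) * np.sqrt(periods_per_year)

# Hypothetical daily price path with a true daily volatility of 1%.
rng = np.random.default_rng(1)
prices = 100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, size=2000)))

print(f"annualized volatility: {annualized_volatility(prices):.1%}")
# close to 0.01 * sqrt(252), i.e. roughly 15.9%
```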
The formulas used above to convert returns or volatility measures from one time period to another assume a particular
underlying model or process. These formulas are accurate extrapolations of a random walk, or Wiener process,
whose steps have finite variance. However, more generally, for natural stochastic processes, the precise relationship
between volatility measures for different time periods is more complicated. Some use the Lévy stability exponent α
to extrapolate natural processes:

    σ_T = T^(1/α) σ

If α = 2 you get the Wiener process scaling relation, but some people believe α < 2 for financial activities such as
stocks, indexes and so on. This was discovered by Benoît Mandelbrot, who looked at cotton prices and found that
they followed a Lévy alpha-stable distribution with α = 1.7. (See New Scientist, 19 April 1997.) Mandelbrot's
conclusion is, however, not accepted by mainstream financial econometricians.
Crude volatility estimation
Using a simplification of the formulas above it is possible to estimate annualized volatility based solely on
approximate observations. Suppose you notice that a market price index, which has a current value near 10,000, has
moved about 100 points a day, on average, for many days. This would constitute a 1% daily movement, up or down.
To annualize this, you can use the "rule of 16", that is, multiply by 16 to get 16% as the annual volatility. The
rationale for this is that 16 is the square root of 256, which is approximately the number of trading days in a year
(252). This also uses the fact that the standard deviation of the sum of n independent variables (with equal standard
deviations) is √n times the standard deviation of the individual variables.
Of course, the average magnitude of the observations is merely an approximation of the standard deviation of the
market index. Assuming that the market index daily changes are normally distributed with mean zero and standard
deviation σ, the expected value of the magnitude of the observations is σ√(2/π) ≈ 0.798σ. The net effect is that this
crude approach underestimates the true volatility by about 20%.
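The factor √(2/π) ≈ 0.798 for the expected magnitude of a mean-zero normal variable can be confirmed with a quick Monte Carlo check:

```python
import math
import random

# Monte Carlo estimate of E|X| for X ~ N(0, 1); the theoretical value
# sqrt(2/pi) ~ 0.798 is the factor behind the bias of the crude estimate.
random.seed(0)
mean_abs = sum(abs(random.gauss(0.0, 1.0)) for _ in range(200_000)) / 200_000

print(round(mean_abs, 3), round(math.sqrt(2 / math.pi), 3))
```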
Estimate of compound annual growth rate (CAGR)
Consider the Taylor series:

    log(1 + y) = y − y²/2 + y³/3 − ⋯

Taking only the first two terms one has:

    CAGR ≈ AR − σ²/2

Realistically, most financial assets have negative skewness and leptokurtosis, so this formula tends to be
over-optimistic. Some people use the formula:

    CAGR ≈ AR − (k/2) σ²

for a rough estimate, where k is an empirical factor (typically five to ten).
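As a numeric illustration of the two-term estimate CAGR ≈ AR − σ²/2 (the numbers here are illustrative, not data):

```python
# Two-term approximation CAGR ~ AR - sigma**2 / 2 with illustrative
# inputs: a 7% arithmetic return and 20% annualized volatility.
arithmetic_return = 0.07
sigma = 0.20

cagr_estimate = arithmetic_return - sigma**2 / 2
print(f"{cagr_estimate:.1%}")  # 5.0%
```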
See also
Beta (finance)
Derivative (finance)
Financial economics
Implied volatility
IVX
Risk
Standard deviation
Stochastic volatility
Volatility arbitrage
Volatility smile
References
[1] http://www.wilmottwiki.com/wiki/index.php/Levy_distribution
[2] http://www.wilmottwiki.com/wiki/index.php/Volatility#Definitions
Lin Chen (1996). Stochastic Mean and Stochastic Volatility: A Three-Factor Model of the Term Structure of
Interest Rates and Its Application to the Pricing of Interest Rate Derivatives. Blackwell Publishers.
External links
Complex Options (http://www.optionistics.com/f/strategy_calculator) Multi-Leg Option Strategy Calculator
An introduction to volatility and how it can be calculated in excel, by Dr A. A. Kotz (http://quantonline.co.za/Articles/article_volatility.htm)
Interactive Java Applet "What is Historic Volatility? (http://www.frog-numerics.com/ifs/ifs_LevelA/HistVolaBasic.html)"
Diebold, Francis X.; Hickman, Andrew; Inoue, Atsushi & Schuermann, Til (1996) "Converting 1-Day Volatility to h-Day Volatility: Scaling by sqrt(h) is Worse than You Think" (http://citeseer.ist.psu.edu/244698.html)
A short introduction to alternative mathematical concepts of volatility (http://staff.science.uva.nl/~marvisse/volatility.html)
Autoregressive conditional heteroskedasticity
In econometrics, AutoRegressive Conditional Heteroskedasticity (ARCH) models are used to characterize and
model observed time series. They are used whenever there's reason to believe that, at any point in a series, the terms
will have a characteristic size, or variance. In particular ARCH models assume the variance of the current error term
or innovation to be a function of the actual sizes of the previous time periods' error terms: often the variance is
related to the squares of the previous innovations.
Such models are often called ARCH models (Engle, 1982), although a variety of other acronyms is applied to
particular structures of model which have a similar basis. ARCH models are employed commonly in modeling
financial time series that exhibit time-varying volatility clustering, i.e. periods of swings followed by periods of
relative calm.
ARCH(q) model specification
Suppose one wishes to model a time series using an ARCH process. Let ε_t denote the error terms (return residuals,
with respect to a mean process), i.e. the series terms. These ε_t are split into a stochastic piece z_t and a
time-dependent standard deviation σ_t characterizing the typical size of the terms, so that

    ε_t = σ_t z_t

where z_t is a random variable drawn from a Gaussian distribution centered at 0 with standard deviation equal to 1
(i.e. z_t ~ N(0, 1)), and where the series σ_t² is modeled by

    σ_t² = α_0 + α_1 ε²_{t-1} + ⋯ + α_q ε²_{t-q}

and where α_0 > 0 and α_i ≥ 0 for i > 0.
An ARCH(q) model can be estimated using ordinary least squares. A methodology to test for the lag length of
ARCH errors using the Lagrange multiplier test was proposed by Engle (1982). This procedure is as follows:
1. Estimate the best fitting AR(q) model

    y_t = a_0 + a_1 y_{t-1} + ⋯ + a_q y_{t-q} + ε_t.

2. Obtain the squares of the error ε̂ and regress them on a constant and q lagged values:

    ε̂²_t = α̂_0 + α̂_1 ε̂²_{t-1} + ⋯ + α̂_q ε̂²_{t-q}

where q is the length of ARCH lags.
3. The null hypothesis is that, in the absence of ARCH components, we have α_i = 0 for all i = 1, …, q. The
alternative hypothesis is that, in the presence of ARCH components, at least one of the estimated α_i coefficients
must be significant. In a sample of T residuals under the null hypothesis of no ARCH errors, the test statistic T·R²
follows a χ² distribution with q degrees of freedom, where R² is the coefficient of determination of the auxiliary
regression. If T·R² is greater than the χ² table value, we reject the null hypothesis and conclude there is an ARCH
effect in the ARMA model. If T·R² is smaller than the χ² table value, we do not reject the null hypothesis.
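The procedure above can be sketched on simulated data. The parameters and lag length q = 1 below are illustrative, and the auxiliary regression is done with plain least squares rather than an econometrics package:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate an ARCH(1) process: sigma_t^2 = a0 + a1 * eps_{t-1}^2
# (parameter values chosen for illustration).
a0, a1, T = 0.2, 0.5, 3000
eps = np.zeros(T)
for t in range(1, T):
    eps[t] = np.sqrt(a0 + a1 * eps[t - 1] ** 2) * rng.standard_normal()

# Engle's LM test with q = 1: regress eps_t^2 on a constant and eps_{t-1}^2,
# then compare T * R^2 with the chi-square(1) critical value (3.84 at 5%).
y = eps[1:] ** 2
X = np.column_stack([np.ones(T - 1), eps[:-1] ** 2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
r_squared = 1.0 - ((y - X @ beta) ** 2).sum() / ((y - y.mean()) ** 2).sum()
lm_stat = (T - 1) * r_squared

print(lm_stat > 3.84)  # True: the simulated ARCH effect is detected
```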
GARCH
If an autoregressive moving average model (ARMA model) is assumed for the error variance, the model is a
generalized autoregressive conditional heteroskedasticity (GARCH; Bollerslev (1986)) model.
In that case, the GARCH(p, q) model (where p is the order of the GARCH terms σ² and q is the order of the ARCH
terms ε²) is given by
σ²_t = α_0 + Σ_{i=1}^{q} α_i ε²_{t−i} + Σ_{i=1}^{p} β_i σ²_{t−i}.
Generally, when testing for heteroskedasticity in econometric models, the best test is the White test. However, when
dealing with time series data, this means testing for ARCH errors (as described above) and GARCH errors (below).
Prior to GARCH, the exponentially weighted moving average (EWMA) was used; it has now been largely superseded
by GARCH, although some practitioners utilise both.
GARCH(p, q) model specification
The lag length p of a GARCH(p, q) process is established in three steps:
1. Estimate the best fitting AR(q) model
y_t = a_0 + a_1 y_{t−1} + … + a_q y_{t−q} + ε_t.
2. Compute and plot the autocorrelations of ε̂²_t by
ρ_i = ( Σ_{t=i+1}^{T} (ε̂²_t − m)(ε̂²_{t−i} − m) ) / ( Σ_{t=1}^{T} (ε̂²_t − m)² ), where m is the sample mean of the squared residuals.
3. The asymptotic, that is for large samples, standard deviation of ρ_i is 1/√T. Individual values that are larger
than this indicate GARCH errors. To estimate the total number of lags, use the Ljung–Box test until these values
are no longer significant at, say, the 10% level. The Ljung–Box Q-statistic follows a χ² distribution with n degrees of
freedom if the squared residuals ε̂²_t are uncorrelated. It is recommended to consider up to T/4 values of n. The
null hypothesis states that there are no ARCH or GARCH errors. Rejecting the null thus means that such errors
exist in the conditional variance.
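The GARCH(1,1) variance recursion can be simulated in a few lines; this is an illustrative sketch with standard notation (ω, α, β, requiring α + β < 1), and the function name is hypothetical:

```python
import numpy as np

def simulate_garch11(omega, alpha, beta, n, seed=0):
    """Simulate returns from a GARCH(1,1) process:
        eps_t    = sigma_t * z_t,   z_t ~ N(0, 1)
        sigma2_t = omega + alpha * eps_{t-1}^2 + beta * sigma2_{t-1}
    Requires alpha + beta < 1 so the unconditional variance exists."""
    rng = np.random.default_rng(seed)
    eps = np.empty(n)
    sigma2 = np.empty(n)
    sigma2[0] = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    eps[0] = np.sqrt(sigma2[0]) * rng.standard_normal()
    for t in range(1, n):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
        eps[t] = np.sqrt(sigma2[t]) * rng.standard_normal()
    return eps, sigma2
```

Setting α = 0 recovers an exponential smoothing of variance, and setting β = 0 recovers a pure ARCH(1) process.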
Nonlinear GARCH (NGARCH)
Nonlinear GARCH (NGARCH), also known as Nonlinear Asymmetric GARCH(1,1) (NAGARCH), was introduced
by Engle and Ng in 1993:[1]
σ²_t = ω + α (ε_{t−1} − θ σ_{t−1})² + β σ²_{t−1}
with ω, α, β ≥ 0.
For stock returns, the parameter θ is usually estimated to be positive; in this case, it reflects the leverage effect,
signifying that negative returns increase future volatility by a larger amount than positive returns of the same
magnitude.[2]
This model should not be confused with the NARCH model, together with the NGARCH extension, introduced by
Higgins and Bera in 1992.
IGARCH
Integrated generalized autoregressive conditional heteroskedasticity (IGARCH) is a restricted version of the
GARCH model, where the persistent parameters sum up to one, and therefore there is a unit root in the GARCH
process. The condition for this is
Σ_{i=1}^{p} β_i + Σ_{i=1}^{q} α_i = 1.
EGARCH
The exponential general autoregressive conditional heteroskedastic (EGARCH) model by Nelson (1991) is
another form of the GARCH model. Formally, an EGARCH(p, q) is
log σ²_t = ω + Σ_{k=1}^{q} β_k g(Z_{t−k}) + Σ_{k=1}^{p} α_k log σ²_{t−k}
where g(Z_t) = θ Z_t + λ (|Z_t| − E(|Z_t|)), σ²_t is the conditional variance, and ω, β_k, α_k, θ and λ are
coefficients. Z_t may be a standard normal variable or come from a generalized error distribution. The
formulation for g(Z_t) allows the sign and the magnitude of Z_t to have separate effects on the volatility. This is
particularly useful in an asset pricing context.[3]
Since log σ²_t may be negative, there are no (fewer) restrictions on the parameters.
GARCH-M
The GARCH-in-mean (GARCH-M) model adds a heteroskedasticity term into the mean equation. It has the
specification
y_t = β x_t + λ σ_t + ε_t.
The residual ε_t is defined as
ε_t = σ_t z_t, with z_t as above.
QGARCH
The Quadratic GARCH (QGARCH) model by Sentana (1995) is used to model asymmetric effects of positive and
negative shocks.
In the example of a GARCH(1,1) model, the residual process is
ε_t = σ_t z_t
where z_t is i.i.d. and
σ²_t = K + α ε²_{t−1} + β σ²_{t−1} + φ ε_{t−1}.
GJR-GARCH
Similar to QGARCH, the Glosten–Jagannathan–Runkle GARCH (GJR-GARCH) model by Glosten, Jagannathan
and Runkle (1993) also models asymmetry in the GARCH process. The suggestion is to model ε_t = σ_t z_t, where
z_t is i.i.d., and
σ²_t = K + δ σ²_{t−1} + α ε²_{t−1} + φ ε²_{t−1} I_{t−1}
where I_{t−1} = 0 if ε_{t−1} ≥ 0, and I_{t−1} = 1 if ε_{t−1} < 0.
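One step of this asymmetric recursion can be sketched as follows (parameter names K, δ, α, φ as above; the function name is illustrative):

```python
def gjr_garch_update(sigma2_prev, eps_prev, K, delta, alpha, phi):
    """One step of the GJR-GARCH(1,1) variance recursion: the indicator
    I_{t-1} switches on the extra phi term only for negative shocks."""
    indicator = 1.0 if eps_prev < 0 else 0.0
    return K + delta * sigma2_prev + alpha * eps_prev ** 2 + phi * eps_prev ** 2 * indicator
```

With φ > 0, a negative shock of a given size raises next-period variance more than a positive shock of the same size.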
TGARCH model
The Threshold GARCH (TGARCH) model by Zakoian (1994) is similar to GJR-GARCH, but the specification is
on the conditional standard deviation instead of the conditional variance:
σ_t = K + δ σ_{t−1} + α₁⁺ ε⁺_{t−1} + α₁⁻ ε⁻_{t−1}
where ε⁺_{t−1} = ε_{t−1} if ε_{t−1} > 0, and ε⁺_{t−1} = 0 if ε_{t−1} ≤ 0. Likewise, ε⁻_{t−1} = ε_{t−1} if ε_{t−1} ≤ 0, and
ε⁻_{t−1} = 0 if ε_{t−1} > 0.
fGARCH
Hentschel's fGARCH model,[4] also known as Family GARCH, is an omnibus model that nests a variety of other
popular symmetric and asymmetric GARCH models, including APARCH, GJR, AVGARCH, NGARCH, etc.
References
[1] Engle, R.F.; Ng, V.K. "Measuring and testing the impact of news on volatility" (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=262096). Journal of Finance 48 (5): 1749–1778.
[2] Posedel, Petra (2006). "Analysis of the Exchange Rate and Pricing Foreign Currency Options on the Croatian Market: The NGARCH Model as an Alternative to the Black–Scholes Model" (http://www.ijf.hr/eng/FTP/2006/4/posedel.pdf). Financial Theory and Practice 30 (4): 347–368.
[3] St. Pierre, Eilleen F. (1998). "Estimating EGARCH-M Models: Science or Art", The Quarterly Review of Economics and Finance, Vol. 38, No. 2, pp. 167–180 (http://dx.doi.org/10.1016/S1062-9769(99)80110-0)
[4] Hentschel, Ludger (1995). "All in the family: Nesting symmetric and asymmetric GARCH models" (http://www.personal.anderson.ucla.edu/rossen.valkanov/hentschel_1995.pdf), Journal of Financial Economics, Volume 39, Issue 1, pp. 71–104
Tim Bollerslev. "Generalized Autoregressive Conditional Heteroskedasticity", Journal of Econometrics, 31:307–327, 1986.
Enders, W. Applied Econometric Time Series, John Wiley & Sons, 139–149, 1995
Robert F. Engle. "Autoregressive Conditional Heteroscedasticity with Estimates of Variance of United Kingdom Inflation", Econometrica 50:987–1008, 1982. (the paper which sparked the general interest in ARCH models)
Robert F. Engle. "GARCH 101: The Use of ARCH/GARCH Models in Applied Econometrics", Journal of Economic Perspectives 15(4):157–168, 2001. (a short, readable introduction) (http://pages.stern.nyu.edu/~rengle/Garch101.doc)
Engle, R.F. (1995). ARCH: Selected Readings. Oxford University Press. ISBN 0-19-877432-X
Gujarati, D. N. Basic Econometrics, 856–862, 2003
Nelson, D. B. (1991). "Conditional heteroskedasticity in asset returns: A new approach", Econometrica 59: 347–370.
Bollerslev, Tim (2008). Glossary to ARCH (GARCH) (ftp://ftp.econ.au.dk/creates/rp/08/rp08_49.pdf), working paper
Hacker, R. S. and Hatemi-J, A. (2005). "A Test for Multivariate ARCH Effects" (http://ideas.repec.org/a/taf/apeclt/v12y2005i7p411-417.html), Applied Economics Letters, Vol. 12(7), pp. 411–417.
Brownian Model of Financial Markets
The Brownian motion models for financial markets are based on the work of Robert C. Merton and Paul A.
Samuelson, as extensions to the one-period market models of Harold Markowitz and William Sharpe, and are
concerned with defining the concepts of financial assets and markets, portfolios, gains and wealth in terms of
continuous-time stochastic processes.
Under this model, these assets have continuous prices evolving continuously in time and are driven by Brownian
motion processes. This model requires an assumption of perfectly divisible assets and that no transaction costs occur
either for buying or selling (i.e. a frictionless market). Another assumption is that asset prices have no jumps; that is,
there are no surprises in the market.
Financial market processes
Consider a financial market consisting of N + 1 financial assets, where one of these assets, called a bond or
money market, is risk free, while the remaining N assets, called stocks, are risky.
Definition
A financial market M = (r, b, δ, σ, A, S(0)) is defined as:
1. A probability space (Ω, F, P)
2. A time interval [0, T]
3. A D-dimensional Brownian process W(t) = (W_1(t), …, W_D(t)), 0 ≤ t ≤ T, adapted to the augmented
filtration {F(t)}
4. A measurable risk-free money market rate process r(t)
5. A measurable mean rate of return process b(t) = (b_1(t), …, b_N(t))
6. A measurable dividend rate of return process δ(t) = (δ_1(t), …, δ_N(t))
7. A measurable volatility process σ(t) = (σ_{n,d}(t))_{N×D} such that Σ_{d=1}^{D} ∫_0^T σ²_{n,d}(s) ds < ∞ almost surely, for each n
8. A measurable, finite variation, singularly continuous stochastic process A(t)
9. The initial conditions given by S(0) = (S_0(0), …, S_N(0))
The augmented filtration
Let (Ω, F, P) be a probability space, and W(t) = (W_1(t), …, W_D(t)), 0 ≤ t ≤ T, be a D-dimensional
Brownian motion stochastic process, with the natural filtration
F^W(t) = σ( W(s); 0 ≤ s ≤ t ), 0 ≤ t ≤ T.
If 𝒩 denotes the measure-0 (i.e. null under measure P) subsets of F^W(T), then define the augmented filtration
F(t) = σ( F^W(t) ∪ 𝒩 ), 0 ≤ t ≤ T.
The difference between {F^W(t)} and {F(t)} is that the latter is both
left-continuous, in the sense that
F(t) = σ( ⋃_{0 ≤ s < t} F(s) ),
and right-continuous, such that
F(t) = ⋂_{t < s ≤ T} F(s),
while the former is only left-continuous.[1]
Bond
A share of a bond (money market) has price S_0(t) > 0 at time t, with S_0(0) = 1; S_0(t) is continuous,
{F(t)}-adapted, and has finite variation. Because it has finite variation, it can be decomposed into
an absolutely continuous part S_0^a(t) and a singularly continuous part S_0^s(t), by Lebesgue's decomposition
theorem. Define
r(t) := (d S_0^a(t) / dt) / S_0(t) and A(t) := ∫_0^t dS_0^s(s) / S_0(s),
resulting in the SDE
dS_0(t) = S_0(t) [ r(t) dt + dA(t) ],
which gives
S_0(t) = exp( ∫_0^t r(s) ds + A(t) ).
Thus, it can be easily seen that if S_0(t) is absolutely continuous (i.e. A(·) ≡ 0), then the price of the bond evolves
like the value of a risk-free savings account with instantaneous interest rate r(t), which is random,
time-dependent and F(t)-measurable.
Stocks
Stock prices are modeled as being similar to that of bonds, except with a randomly fluctuating component (called its
volatility). As a premium for the risk originating from these random fluctuations, the mean rate of return of a stock is
higher than that of a bond.
Let S_1(t), …, S_N(t) be the strictly positive prices per share of the N stocks, which are continuous stochastic
processes satisfying
dS_n(t) = S_n(t) [ b_n(t) dt + dA(t) + Σ_{d=1}^{D} σ_{n,d}(t) dW_d(t) ], 1 ≤ n ≤ N.
Here, σ_{n,d}(t) gives the volatility of the n-th stock, while b_n(t) is its mean rate of return.
In order for an arbitrage-free pricing scenario, A(t) must be as defined above. The solution to this SDE is
S_n(t) = S_n(0) exp( ∫_0^t [ b_n(s) − ½ Σ_{d=1}^{D} σ²_{n,d}(s) ] ds + A(t) + Σ_{d=1}^{D} ∫_0^t σ_{n,d}(s) dW_d(s) ),
and the discounted stock prices are
S_n(t) / S_0(t) = S_n(0) exp( ∫_0^t [ b_n(s) − r(s) − ½ Σ_{d=1}^{D} σ²_{n,d}(s) ] ds + Σ_{d=1}^{D} ∫_0^t σ_{n,d}(s) dW_d(s) ).
Note that the contribution due to the discontinuities in the bond price, A(t), does not appear in this equation.
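For intuition, a single stock with constant scalar coefficients, one driving Brownian motion and A(t) = 0 can be simulated from the closed-form solution of the stock-price SDE. This is an illustrative sketch (the function name and parameters are hypothetical):

```python
import numpy as np

def simulate_stock_price(S0, b, sigma, T, n_steps, seed=0):
    """Exact simulation of a single stock price path under constant scalar
    coefficients (one driving Brownian motion, A(t) = 0, no dividends):
        S(t) = S(0) * exp((b - sigma^2 / 2) * t + sigma * W(t))"""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    dW = np.sqrt(dt) * rng.standard_normal(n_steps)
    W = np.concatenate(([0.0], np.cumsum(dW)))  # Brownian path, W(0) = 0
    t = np.linspace(0.0, T, n_steps + 1)
    return S0 * np.exp((b - 0.5 * sigma ** 2) * t + sigma * W)
```

Because the price is an exponential, every simulated value stays strictly positive, consistent with the requirement above.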
Dividend rate
Each stock may have an associated dividend rate process δ_n(t) giving the rate of dividend payment per unit price of
the stock at time t. Accounting for this in the model gives the yield process Y_n(t):
dY_n(t) = dS_n(t) + δ_n(t) S_n(t) dt.
Portfolio and gain processes
Definition
Consider a financial market .
A portfolio process for this market is an measurable, valued process such that:
, almost surely,
, almost surely, and
, almost surely.
The gains process for this portfolio is:
We say that the portfolio is self-financed if:
.
It turns out that for a self-financed portfolio, the appropriate value of π_0(t) is determined from (π_1(t), …, π_N(t)),
and therefore the latter is sometimes referred to as the portfolio process. Also, π_0(t) < 0 implies borrowing money
from the money market, while π_n(t) < 0 implies taking a short position on the n-th stock.
The term b_n(t) + δ_n(t) − r(t) in the SDE of the gains process is the risk premium process, and it is the compensation
received in return for investing in the n-th stock.
Motivation
Consider time intervals [t_m, t_{m+1}), m = 0, …, M − 1, and let ν_n(t_m) be the number of shares of asset
n = 0, …, N, held in the portfolio during the time interval [t_m, t_{m+1}). To avoid the case
of insider trading (i.e. foreknowledge of the future), it is required that ν_n(t_m) is F(t_m)-measurable.
Therefore, the incremental gains at each trading interval from such a portfolio are
G(t_{m+1}) − G(t_m) = Σ_{n=0}^{N} ν_n(t_m) [ S_n(t_{m+1}) − S_n(t_m) ],
G(t_M) is the total gain over time [0, t_M], and the total value of the portfolio is Σ_{n=0}^{N} ν_n(t_m) S_n(t_m).
Define π_n(t) := ν_n(t) S_n(t), let the time partition go to zero, and substitute for dS_n(t) as defined earlier, to obtain the
corresponding SDE for the gains process G(t). Here π_n(t) denotes the dollar amount invested in asset n at time t, not
the number of shares held.
Income and wealth processes
Definition
Given a financial market M, a cumulative income process Γ(t), 0 ≤ t ≤ T, is a semimartingale and
represents the income accumulated over time [0, t], due to sources other than the investments in the N + 1 assets
of the financial market.
A wealth process X(t) is then defined as
X(t) := G(t) + Γ(t)
and represents the total wealth of an investor at time 0 ≤ t ≤ T. The portfolio is said to be Γ(t)-financed if
X(t) = Σ_{n=0}^{N} π_n(t).
The corresponding SDE for the wealth process, through appropriate substitutions, becomes
dX(t) = dΓ(t) + X(t) [ r(t) dt + dA(t) ] + Σ_{n=1}^{N} π_n(t) [ b_n(t) + δ_n(t) − r(t) ] dt + Σ_{d=1}^{D} Σ_{n=1}^{N} π_n(t) σ_{n,d}(t) dW_d(t).
Note that, again in this case, the value of π_0(t) can be determined from (π_1(t), …, π_N(t)).
Viable markets
The standard theory of mathematical finance is restricted to viable financial markets, i.e. those in which there are no
opportunities for arbitrage. If such opportunities exist, this implies the possibility of making an arbitrarily large
risk-free profit.
Definition
In a financial market M, a self-financed portfolio process π(t) is said to be an arbitrage opportunity if the
associated gains process satisfies G(T) ≥ 0 almost surely and P[G(T) > 0] > 0 strictly. A market M in which no
such portfolio exists is said to be viable.
Implications
In a viable market M, there exists an {F(t)}-adapted process θ(t) = (θ_1(t), …, θ_D(t)) such that for almost every
t ∈ [0, T]:
b_n(t) + δ_n(t) − r(t) = Σ_{d=1}^{D} σ_{n,d}(t) θ_d(t), 1 ≤ n ≤ N.
This θ is called the market price of risk and relates the premium for the n-th stock with its volatility σ_{n,·}.
Conversely, if there exists a D-dimensional process θ(t) such that it satisfies the above requirement, and
∫_0^T Σ_{d=1}^{D} θ²_d(t) dt < ∞ almost surely, and the process Z(t) = exp( −∫_0^t θ(s)′ dW(s) − ½ ∫_0^t ||θ(s)||² ds ) is a martingale,
then the market is viable.
Also, a viable market M can have only one money market (bond) and hence only one risk-free rate. Therefore, if
the n-th stock entails no risk (i.e. σ_{n,d}(t) = 0, d = 1, …, D) and pays no dividend (i.e. δ_n(t) = 0), then its rate
of return is equal to the money market rate (i.e. b_n(t) = r(t)) and its price tracks that of the bond (i.e.
S_n(t) = S_n(0) S_0(t)).
Standard financial market
Definition
A financial market M is said to be standard if:
(i) It is viable.
(ii) The number of stocks N is not greater than the dimension D of the underlying Brownian motion process.
(iii) The market price of risk process θ(t) satisfies
∫_0^T Σ_{d=1}^{D} θ²_d(t) dt < ∞, almost surely.
(iv) The positive process Z(t) = exp( −∫_0^t θ(s)′ dW(s) − ½ ∫_0^t ||θ(s)||² ds ) is a
martingale.
Comments
In case the number of stocks N is greater than the dimension D, in violation of point (ii), it can be seen from linear
algebra that there are N − D stocks whose volatilities (given by the vectors σ_n(t)) are linear
combinations of the volatilities of D other stocks (because the rank of σ(t) is D). Therefore, the N stocks can be
replaced by D equivalent mutual funds.
The standard martingale measure P_0 on F(T) for the standard market is defined as
P_0(A) := E[ Z(T) 1_A ], A ∈ F(T),
where Z(t) = exp( −∫_0^t θ(s)′ dW(s) − ½ ∫_0^t ||θ(s)||² ds ).
Note that P and P_0 are absolutely continuous with respect to each other, i.e. they are equivalent. Also, according to
Girsanov's theorem,
W_0(t) := W(t) + ∫_0^t θ(s) ds
is a D-dimensional Brownian motion process on the filtration {F(t)} with respect to P_0.
Complete financial markets
A complete financial market is one that allows effective hedging of the risk inherent in any investment strategy.
Definition
Let M be a standard financial market, and B be an F(T)-measurable random variable, such that
P[ B / S_0(T) > −∞ ] = 1 and E[ Z(T) B / S_0(T) ] < ∞.
The market M is said to be complete if every such B is financeable, i.e. if there is an x-financed portfolio
process (π_1(t), …, π_N(t)), for some initial capital x, such that its associated wealth process X(t) satisfies
X(T) = B, almost surely.
Motivation
If a particular investment strategy calls for a payment B at time T, the amount of which is unknown at time
t = 0, then a conservative strategy would be to set aside an amount sufficient to cover B in every state of the
world. However, in a complete market it is possible to set aside less capital (viz. x) and invest it so that at time
T it has grown to match the size of B.
Corollary
A standard financial market M is complete if and only if N = D, and the N × D volatility process σ(t) is
non-singular for almost every t ∈ [0, T], with respect to the Lebesgue measure.
Notes
[1] Karatzas, Ioannis; Shreve, Steven E. (1991). Brownian Motion and Stochastic Calculus. New York: Springer-Verlag. ISBN 0387976558.
See also
Mathematical finance
Monte Carlo method
Martingale (probability theory)
References
Karatzas, Ioannis; Shreve, Steven E. (1998). Methods of Mathematical Finance. New York: Springer. ISBN 0387948392.
Korn, Ralf; Korn, Elke (2001). Option Pricing and Portfolio Optimization: Modern Methods of Financial Mathematics. Providence, R.I.: American Mathematical Society. ISBN 0821821237.
Merton, R. C. (1 August 1969). "Lifetime Portfolio Selection under Uncertainty: the Continuous-Time Case" (http://jstor.org/stable/1926560). The Review of Economics and Statistics 51 (3): 247–257. doi:10.2307/1926560. ISSN 0034-6535.
Merton, R.C. (1970). "Optimum consumption and portfolio rules in a continuous-time model" (http://www.math.uwaterloo.ca/~mboudalh/Merton1971.pdf). Journal of Economic Theory 3. Retrieved 2009-05-29.
Rational pricing
Rational pricing is the assumption in financial economics that asset prices (and hence asset pricing models) will
reflect the arbitrage-free price of the asset as any deviation from this price will be "arbitraged away". This
assumption is useful in pricing fixed income securities, particularly bonds, and is fundamental to the pricing of
derivative instruments.
Arbitrage mechanics
Arbitrage is the practice of taking advantage of a state of imbalance between two (or possibly more) markets. Where
this mismatch can be exploited (i.e. after transaction costs, storage costs, transport costs, dividends etc.) the
arbitrageur "locks in" a risk free profit without investing any of his own money.
In general, arbitrage ensures that "the law of one price" will hold; arbitrage also equalises the prices of assets with
identical cash flows, and sets the price of assets with known future cash flows.
The law of one price
The same asset must trade at the same price on all markets ("the law of one price"). Where this is not true, the
arbitrageur will:
1. buy the asset on the market where it has the lower price, and simultaneously sell it (short) on the second market at
the higher price
2. deliver the asset to the buyer and receive that higher price
3. pay the seller on the cheaper market with the proceeds and pocket the difference.
Assets with identical cash flows
Two assets with identical cash flows must trade at the same price. Where this is not true, the arbitrageur will:
1. sell the asset with the higher price (short sell) and simultaneously buy the asset with the lower price
2. fund his purchase of the cheaper asset with the proceeds from the sale of the expensive asset and pocket the
difference
3. deliver on his obligations to the buyer of the expensive asset, using the cash flows from the cheaper asset.
An asset with a known future-price
An asset with a known price in the future, must today trade at that price discounted at the risk free rate.
Note that this condition can be viewed as an application of the above, where the two assets in question are the asset
to be delivered and the risk free asset.
(a) where the discounted future price is higher than today's price:
1. The arbitrageur agrees to deliver the asset on the future date (i.e. sells forward) and simultaneously buys it today
with borrowed money.
2. On the delivery date, the arbitrageur hands over the underlying, and receives the agreed price.
3. He then repays the lender the borrowed amount plus interest.
4. The difference between the agreed price and the amount owed is the arbitrage profit.
(b) where the discounted future price is lower than today's price:
1. The arbitrageur agrees to pay for the asset on the future date (i.e. buys forward) and simultaneously sells (short)
the underlying today; he invests the proceeds.
2. On the delivery date, he cashes in the matured investment, which has appreciated at the risk free rate.
3. He then takes delivery of the underlying and pays the agreed price using the matured investment.
4. The difference between the maturity value and the agreed price is the arbitrage profit.
It will be noted that (b) is only possible for those holding the asset but not needing it until the future date. There may
be few such parties if short-term demand exceeds supply, leading to backwardation.
Fixed income securities
Rational pricing is one approach used in pricing fixed rate bonds. Here, each cash flow can be matched by trading in
(a) some multiple of a zero-coupon bond corresponding to the coupon date, and of equivalent credit worthiness (if
possible, from the same issuer as the bond being valued) with the corresponding maturity, or (b) in a corresponding
strip and ZCB.
Given that the cash flows can be replicated, the price of the bond must today equal the sum of each of its cash flows
discounted at the same rate as each ZCB, as above. Were this not the case, arbitrage would be possible and would
bring the price back into line with the price based on ZCBs; see Bond valuation: Arbitrage-free pricing approach
The pricing formula is as below, where each cash flow C_t is discounted at the rate r_t that matches the coupon date:
Price = Σ_{t=1}^{T} C_t / (1 + r_t)^t
Often, the formula is expressed as Price = Σ_{t=1}^{T} C(t) × P(t), using ZCB prices P(t) instead of rates, as prices are more
readily available.
See also Fixed income arbitrage; Bond credit rating.
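The replication argument can be sketched in a few lines; this assumes annual cash flows and known ZCB prices (the function name and example numbers are illustrative):

```python
def bond_price(cash_flows, zcb_prices):
    """Arbitrage-free bond price: each cash flow C(t) multiplied by the
    matching zero-coupon bond price P(t), then summed."""
    return sum(c * p for c, p in zip(cash_flows, zcb_prices))

# A 2-year bond, face 100, 5% annual coupon, with hypothetical ZCB prices
price = bond_price([5.0, 105.0], [0.96, 0.91])
```

If the traded bond price deviated from this sum, buying the cheaper side and selling the dearer side would lock in a riskless profit, as described above.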
Pricing derivatives
A derivative is an instrument that allows for buying and selling of the same asset on two markets: the spot market
and the derivatives market. Mathematical finance assumes that any imbalance between the two markets will be
arbitraged away. Thus, in a correctly priced derivative contract, the derivative price, the strike price (or reference
rate), and the spot price will be related such that arbitrage is not possible.
see: Fundamental theorem of arbitrage-free pricing
Futures
In a futures contract, for no arbitrage to be possible, the price paid on delivery (the forward price) must be the same
as the cost (including interest) of buying and storing the asset. In other words, the rational forward price represents
the expected future value of the underlying discounted at the risk free rate (the "asset with a known future-price", as
above). Thus, for a simple, non-dividend paying asset, the value of the future/forward, F(t), will be found by
accumulating the present value S(t) at time t to maturity T by the rate of risk-free return r:
F(t) = S(t) × (1 + r)^(T − t)
This relationship may be modified for storage costs, dividends, dividend yields, and convenience yields; see futures
contract pricing.
Any deviation from this equality allows for arbitrage as follows.
In the case where the forward price is higher:
1. The arbitrageur sells the futures contract and buys the underlying today (on the spot market) with borrowed
money.
2. On the delivery date, the arbitrageur hands over the underlying, and receives the agreed forward price.
3. He then repays the lender the borrowed amount plus interest.
4. The difference between the two amounts is the arbitrage profit.
In the case where the forward price is lower:
1. The arbitrageur buys the futures contract and sells the underlying today (on the spot market); he invests the
proceeds.
2. On the delivery date, he cashes in the matured investment, which has appreciated at the risk free rate.
3. He then receives the underlying and pays the agreed forward price using the matured investment. [If he was short
the underlying, he returns it now.]
4. The difference between the two amounts is the arbitrage profit.
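The cash-and-carry logic above can be sketched as follows, assuming simple compounding over whole periods and no storage costs or dividends (function names are illustrative):

```python
def rational_forward_price(spot, r, periods):
    """Forward price implied by no-arbitrage: the spot price accumulated
    at the risk-free rate to maturity."""
    return spot * (1.0 + r) ** periods

def cash_and_carry_profit(quoted_forward, spot, r, periods):
    """Signed arbitrage profit per unit: positive means sell the forward and
    carry the spot (first case above); negative means the reverse trade."""
    return quoted_forward - rational_forward_price(spot, r, periods)
```

A quoted forward of 110 on a spot of 100 with r = 5% over one period, for instance, leaves a riskless 5 per unit to the cash-and-carry arbitrageur.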
Options
As above, where the value of an asset in the future is known (or expected), this value can be used to determine the
asset's rational price today. In an option contract, however, exercise is dependent on the price of the underlying, and
hence payment is uncertain. Option pricing models therefore include logic that either "locks in" or "infers" this future
value; both approaches deliver identical results. Methods that lock-in future cash flows assume arbitrage free
pricing, and those that infer expected value assume risk neutral valuation.
To do this, (in their simplest, though widely used form) both approaches assume a Binomial model for the behavior
of the underlying instrument, which allows for only two states: up or down. If S is the current price, then in the next
period the price will either be S_up or S_down. Here, the value of the share in the up-state is S × u, and in the
down-state is S × d (where u and d are multipliers with d < 1 < u and assuming d < 1 + r < u; see the binomial options
model). Then, given these two states, the "arbitrage free" approach creates a position that has an identical value in
either state; the cash flow in one period is therefore known, and arbitrage pricing is applicable. The risk neutral
approach infers expected option value from the intrinsic values at the later two nodes.
Although this logic appears far removed from the Black-Scholes formula and the lattice approach in the Binomial
options model, it in fact underlies both models; see The Black-Scholes PDE. The assumption of binomial behaviour
in the underlying price is defensible as the number of time steps between today (valuation) and exercise increases,
and the period per time-step is increasingly short. The Binomial options model allows for a high number of very
short time-steps (if coded correctly), while Black-Scholes, in fact, models a continuous process.
The examples below have shares as the underlying, but may be generalised to other instruments. The value of a put
option can be derived as below, or may be found from the value of the call using put-call parity.
Arbitrage free pricing
Here, the future payoff is "locked in" using either "delta hedging" or the "replicating portfolio" approach. As above,
this payoff is then discounted, and the result is used in the valuation of the option today.
Delta hedging
It is possible to create a position consisting of Δ shares and 1 call sold, such that the position's value will be identical
in the S_up and S_down states, and hence known with certainty (see Delta hedging). This certain value corresponds to
the forward price above ("An asset with a known future price"), and as above, for no arbitrage to be possible, the
present value of the position must be its expected future value discounted at the risk free rate, r. The value of a call is
then found by equating the two.
1) Solve for Δ such that:
value of position in one period = Δ × S_up − (S_up − strike price) = Δ × S_down − (S_down −
strike price)
2) Solve for the value of the call, using Δ, where:
value of position today = value of position in one period ÷ (1 + r) = Δ × S_current − value of call
The replicating portfolio
It is possible to create a position consisting of Δ shares and $B borrowed at the risk free rate, which will produce
identical cash flows to one option on the underlying share. The position created is known as a "replicating portfolio"
since its cash flows replicate those of the option. As shown above ("Assets with identical cash flows"), in the
absence of arbitrage opportunities, since the cash flows produced are identical, the price of the option today must be
the same as the value of the position today.
1) Solve simultaneously for Δ and B such that:
i) Δ × S_up − B × (1 + r) = max( 0, S_up − strike price )
ii) Δ × S_down − B × (1 + r) = max( 0, S_down − strike price )
2) Solve for the value of the call, using Δ and B, where:
call = Δ × S_current − B
Note that here there is no discounting; the interest rate appears only as part of the construction. This approach is
therefore used in preference to others where it is not clear whether the risk free rate may be applied as the discount
rate at each decision point, or whether, instead, a premium over risk free would be required. The best example of this
would be under Real options analysis, where management's actions actually change the risk characteristics of the
project in question, and hence the Required rate of return could differ in the up- and down-states. Here, in the above
formulae, we then have: "Δ × S_up − B × (1 + r_up)..." and "Δ × S_down − B × (1 + r_down)...".
Risk neutral valuation
Here the value of the option is calculated using the risk neutrality assumption. Under this assumption, the expected
value (as opposed to the "locked in" value) is discounted. The expected value is calculated using the intrinsic values
from the later two nodes, Option_up and Option_down, with u and d as price multipliers as above. These are then
weighted by their respective probabilities: the probability p of an up move in the underlying, and the probability
(1 − p) of a down move. The expected value is then discounted at r, the risk free rate.
1) Solve for p
For no arbitrage to be possible in the share, today's price must represent its expected value discounted at the
risk free rate (i.e., the share price is a martingale):
S = [ p × (up value) + (1 − p) × (down value) ] ÷ (1 + r) = [ p × S × u + (1 − p) × S × d ] ÷ (1 + r)
then, p = [ (1 + r) − d ] ÷ [ u − d ]
2) Solve for call value, using p
For no arbitrage to be possible in the call, today's price must represent its expected value discounted at the risk
free rate:
Option value = [ p × Option_up + (1 − p) × Option_down ] ÷ (1 + r)
= [ p × max(S × u − strike price, 0) + (1 − p) × max(S × d − strike price, 0) ] ÷ (1 + r)
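The two approaches can be sketched side by side for a one-period call; as the text notes, they deliver identical results (function names are illustrative):

```python
def call_risk_neutral(S, K, u, d, r):
    """One-period call value via risk-neutral valuation:
    p = ((1 + r) - d) / (u - d), expected payoff discounted at r."""
    p = ((1.0 + r) - d) / (u - d)
    payoff_up, payoff_down = max(S * u - K, 0.0), max(S * d - K, 0.0)
    return (p * payoff_up + (1.0 - p) * payoff_down) / (1.0 + r)

def call_replicating(S, K, u, d, r):
    """Same value via the replicating portfolio: delta shares plus B borrowed,
    chosen to match the option payoff in both states."""
    payoff_up, payoff_down = max(S * u - K, 0.0), max(S * d - K, 0.0)
    delta = (payoff_up - payoff_down) / (S * u - S * d)
    B = (delta * S * d - payoff_down) / (1.0 + r)
    return delta * S - B
```

For example, with S = K = 100, u = 1.2, d = 0.8 and r = 5%, both functions return the same call value, illustrating the equivalence discussed below.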
The risk neutrality assumption
Note that above, the risk neutral formula does not refer to the volatility of the underlying: p, as solved, relates to the
risk-neutral measure as opposed to the actual probability distribution of prices. Nevertheless, both arbitrage free
pricing and risk neutral valuation deliver identical results. In fact, it can be shown that Delta hedging and Risk
neutral valuation use identical formulae expressed differently. Given this equivalence, it is valid to assume risk
neutrality when pricing derivatives. See Fundamental theorem of arbitrage-free pricing.
Swaps
Rational pricing underpins the logic of swap valuation. Here, two counterparties "swap" obligations, effectively
exchanging cash flow streams calculated against a notional principal amount, and the value of the swap is the present
value (PV) of both sets of future cash flows "netted off" against each other.
Valuation at initiation
To be arbitrage free, the terms of a swap contract are such that, initially, the Net present value of these future cash
flows is equal to zero; see swap valuation. For example, consider the valuation of a fixed-to-floating Interest rate
swap where Party A pays a fixed rate, and Party B pays a floating rate. Here, the fixed rate would be such that the
present value of future fixed rate payments by Party A is equal to the present value of the expected future floating
rate payments (i.e. the NPV is zero). Were this not the case, an Arbitrageur, C, could:
1. Assume the position with the lower present value of payments, and borrow funds equal to this present value
2. Meet the cash flow obligations on the position by using the borrowed funds, and receive the corresponding
payments, which have a higher present value
3. Use the received payments to repay the debt on the borrowed funds
4. Pocket the difference - where the difference between the present value of the loan and the present value of the
inflows is the arbitrage profit
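Under the simplifying assumptions of annual payments, unit notional, and a floating leg worth 1 − P(T_n) at initiation, the fixed rate that makes the NPV zero follows from the standard replication argument (illustrative sketch; discount factors are hypothetical inputs):

```python
def par_swap_rate(discount_factors):
    """Fixed rate making a plain-vanilla swap's NPV zero at initiation
    (annual payments, unit notional): the floating leg is worth
    1 - P(T_n), so the fixed rate is (1 - P(T_n)) / sum of the P(T_i)."""
    return (1.0 - discount_factors[-1]) / sum(discount_factors)
```

At this rate, the present value of the fixed leg exactly matches that of the floating leg, so neither counterparty can be arbitraged at inception.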
Subsequent valuation
Once traded, swaps can also be priced using rational pricing. For example, the Floating leg of an interest rate swap
can be "decomposed" into a series of Forward rate agreements. Here, since the swap has identical payments to the
FRA, arbitrage free pricing must apply as above - i.e. the value of this leg is equal to the value of the corresponding
FRAs. Similarly, the "receive-fixed" leg of a swap, can be valued by comparison to a Bond with the same schedule
of payments. (Relatedly, given that their underlyings have the same cash flows, bond options and swaptions are
equatable.)
Pricing shares
The Arbitrage pricing theory (APT), a general theory of asset pricing, has become influential in the pricing of shares.
APT holds that the expected return of a financial asset can be modelled as a linear function of various
macro-economic factors, where sensitivity to changes in each factor is represented by a factor-specific beta
coefficient:
E(r_j) = r_f + b_{j1} F_1 + b_{j2} F_2 + … + b_{jn} F_n + ε_j
where
E(r_j) is the risky asset's expected return,
r_f is the risk free rate,
F_k is the macroeconomic factor,
b_{jk} is the sensitivity of the asset to factor k,
and ε_j is the risky asset's idiosyncratic random shock with mean zero.
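The linear factor model above can be sketched as follows (illustrative function name; the betas and factor premia in the example are hypothetical):

```python
def apt_expected_return(risk_free, betas, factor_premia):
    """APT expected return: the risk-free rate plus each factor's risk
    premium weighted by the asset's sensitivity (beta) to that factor."""
    return risk_free + sum(b * f for b, f in zip(betas, factor_premia))
```

An asset whose model-implied return diverges from its market-implied return is the mispriced asset targeted by the arbitrage trades described below.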
The model derived rate of return will then be used to price the asset correctly - the asset price should equal the
expected end of period price discounted at the rate implied by model. If the price diverges, arbitrage should bring it
back into line. Here, to perform the arbitrage, the investor creates a correctly priced asset (a synthetic asset), a
portfolio with the same net-exposure to each of the macroeconomic factors as the mispriced asset but a different
expected return. See the APT article for detail on the construction of the portfolio. The arbitrageur is then in a
position to make a risk free profit as follows:
Where the asset price is too low, the portfolio should have appreciated at the rate implied by the APT, whereas the
mispriced asset would have appreciated at more than this rate. The arbitrageur could therefore:
1. Today: short sell the portfolio and buy the mispriced-asset with the proceeds.
2. At the end of the period: sell the mispriced asset, use the proceeds to buy back the portfolio, and pocket the
difference.
Where the asset price is too high, the portfolio should have appreciated at the rate implied by the APT, whereas
the mispriced asset would have appreciated at less than this rate. The arbitrageur could therefore:
1. Today: short sell the mispriced-asset and buy the portfolio with the proceeds.
2. At the end of the period: sell the portfolio, use the proceeds to buy back the mispriced-asset, and pocket the
difference.
Note that under "true arbitrage", the investor locks-in a guaranteed payoff, whereas under APT arbitrage, the
investor locks-in a positive expected payoff. The APT thus assumes "arbitrage in expectations" i.e. that arbitrage
by investors will bring asset prices back into line with the returns expected by the model.
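The APT relation and the "arbitrage in expectations" logic above can be sketched numerically. All of the numbers below (betas, factor premia, prices) are hypothetical, chosen only to illustrate the mechanics:

```python
# Illustrative sketch of the APT relation and "arbitrage in expectations".
# All numbers are hypothetical, purely for demonstration.
risk_free = 0.03                 # r_f
betas = [1.2, 0.5]               # sensitivities b_j1, b_j2
factors = [0.04, 0.02]           # factor risk premia F_1, F_2

# APT: E(r_j) = r_f + sum_k b_jk * F_k
expected_return = risk_free + sum(b * f for b, f in zip(betas, factors))
print(f"APT expected return: {expected_return:.4f}")

# The asset should trade at its expected end-of-period price
# discounted at the APT-implied rate.
expected_end_price = 110.0
fair_price = expected_end_price / (1 + expected_return)
market_price = 98.0
print(f"fair {fair_price:.2f} vs market {market_price:.2f}")
if market_price < fair_price:
    print("asset looks cheap: buy it, short the factor-matched portfolio")
```

If the market price sits below the model-implied fair price, the arbitrageur buys the asset and shorts the synthetic portfolio with matching factor exposures, as described above.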
The Capital asset pricing model (CAPM) is an earlier and more influential theory on asset pricing. Although based on
different assumptions, the CAPM can, in some ways, be considered a "special case" of the APT; specifically, the
CAPM's Securities market line represents a single-factor model of the asset price, where Beta is exposure to changes
in value of the Market.
Rational pricing
213
See also
Efficient market hypothesis
Fair value
Fundamental theorem of arbitrage-free pricing
Homo economicus
List of valuation topics
Rational choice theory
Rationality
Risk-neutral measure
Volatility arbitrage
External links
Arbitrage free pricing
Pricing by Arbitrage [1], The History of Economic Thought Website
The Idea Behind Arbitrage Pricing [2], Samy Mohammed, Quantnotes
"The Fundamental Theorem" of Finance [3]; part II [4], Prof. Mark Rubinstein, Haas School of Business
Elementary Asset Pricing Theory [5], Prof. K. C. Border, California Institute of Technology
The Notion of Arbitrage and Free Lunch in Mathematical Finance [6], Prof. Walter Schachermayer
Risk Neutral Pricing in Discrete Time [7] (PDF), Prof. Don M. Chance
No Arbitrage in Continuous Time [8], Prof. Tyler Shumway
Risk neutrality and arbitrage free pricing
Risk-Neutral Probabilities Explained [9], Nicolas Gisiger
Risk-neutral Valuation: A Gentle Introduction [10], Part II [11], Joseph Tham, Duke University
Application to derivatives
Option Valuation in the Binomial Model [12], Prof. Ernst Maug
Pricing Futures and Forwards by Arbitrage Argument [13], Quantnotes
The relationship between futures and spot prices [14], Investment Analysts Society of Southern Africa
The illusions of dynamic replication [15], Emanuel Derman and Nassim Taleb
Swaptions and Options [16], Prof. Don M. Chance
References
[1] http://cepa.newschool.edu/het/essays/sequence/arbitpricing.htm
[2] http://www.quantnotes.com/fundamentals/basics/arbitragepricing.htm
[3] http://www.in-the-money.com/artandpap/IV%20Fundamental%20Theorem%20-%20Part%20I.doc
[4] http://www.in-the-money.com/artandpap/IV%20Fundamental%20Theorem%20-%20Part%20II.doc
[5] http://www.hss.caltech.edu/~kcb/Notes/Arbitrage.pdf
[6] http://www.fam.tuwien.ac.at/~wschach/pubs/preprnts/prpr0118a.pdf
[7] http://www.bus.lsu.edu/academics/finance/faculty/dchance/Instructional/TN96-02.pdf
[8] http://www-personal.umich.edu/~shumway/courses.dir/f872.dir/noarb.pdf
[9] http://ssrn.com/abstract=1395390
[10] http://papers.ssrn.com/sol3/papers.cfm?abstract_id=290044
[11] http://papers.ssrn.com/sol3/papers.cfm?abstract_id=292724
[12] http://www.rpi.edu/~olivaa2/binomial.pdf
[13] http://www.quantnotes.com/fundamentals/futures/futureforwardpricing.htm
[14] http://www.iassa.co.za/images/file/indexmain.htm
[15] http://www.ederman.com/new/docs/qf-Illusions-dynamic.pdf
[16] http://papers.ssrn.com/sol3/papers.cfm?abstract_id=291988
Arbitrage
In economics and finance, arbitrage (IPA: /ˈɑːrbɪtrɑːʒ/) is the practice of taking advantage of a price difference
between two or more markets: striking a combination of matching deals that capitalize upon the imbalance, the profit
being the difference between the market prices. When used by academics, an arbitrage is a transaction that involves
no negative cash flow at any probabilistic or temporal state and a positive cash flow in at least one state; in simple
terms, it is the possibility of a risk-free profit at zero cost.
In principle and in academic use, an arbitrage is risk-free; in common use, as in statistical arbitrage, it may refer to
expected profit, though losses may occur, and in practice, there are always risks in arbitrage, some minor (such as
fluctuation of prices decreasing profit margins), some major (such as devaluation of a currency or derivative). In
academic use, an arbitrage involves taking advantage of differences in price of a single asset or identical cash-flows;
in common use, it is also used to refer to differences between similar assets (relative value or convergence trades), as
in merger arbitrage.
People who engage in arbitrage are called arbitrageurs (IPA: /ˌɑːrbɪtrɑːˈʒɜːr/), such as a bank or brokerage firm. The
term is mainly applied to trading in financial instruments, such as bonds, stocks, derivatives, commodities and
currencies.
Arbitrage-free
If the market prices do not allow for profitable arbitrage, the prices are said to constitute an arbitrage equilibrium
or arbitrage-free market. An arbitrage equilibrium is a precondition for a general economic equilibrium. The
assumption that there is no arbitrage is used in quantitative finance to calculate a unique risk neutral price for
derivatives.
Conditions for arbitrage
Arbitrage is possible when one of three conditions is met:
1. The same asset does not trade at the same price on all markets ("the law of one price").
2. Two assets with identical cash flows do not trade at the same price.
3. An asset with a known price in the future does not today trade at its future price discounted at the risk-free
interest rate (or, the asset does not have negligible costs of storage; as such, for example, this condition holds for
grain but not for securities).
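The third condition can be sketched with a one-period cash-and-carry argument; the rate, horizon and prices below are illustrative, and storage costs are ignored (so, as noted above, this fits securities rather than grain):

```python
import math

# Sketch of condition 3: an asset whose price at time T is known (here K)
# should trade today at K discounted at the risk-free rate.
# All numbers are illustrative; storage costs are ignored.
r = 0.05        # risk-free rate, continuously compounded
T = 1.0         # years until the known price applies
K = 105.0       # known price at time T
spot = 98.0     # price today

fair_today = K * math.exp(-r * T)
print(f"fair price today: {fair_today:.2f}")

if spot < fair_today:
    # borrow `spot` at r, buy the asset now, deliver at T for K
    profit_at_T = K - spot * math.exp(r * T)
    print(f"riskless profit at T: {profit_at_T:.2f}")
```

When the spot price sits below the discounted future price, the financed purchase locks in the difference with no market exposure.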
Arbitrage is not simply the act of buying a product in one market and selling it in another for a higher price at some
later time. The transactions must occur simultaneously to avoid exposure to market risk, or the risk that prices may
change on one market before both transactions are complete. In practical terms, this is generally only possible with
securities and financial products which can be traded electronically, and even then, when each leg of the trade is
executed the prices in the market may have moved. Missing one of the legs of the trade (and subsequently having to
trade it soon after at a worse price) is called 'execution risk' or more specifically 'leg risk'.[1]
In the simplest example, any good sold in one market should sell for the same price in another. Traders may, for
example, find that the price of wheat is lower in agricultural regions than in cities, purchase the good, and transport it
to another region to sell at a higher price. This type of price arbitrage is the most common, but this simple example
ignores the cost of transport, storage, risk, and other factors. "True" arbitrage requires that there be no market risk
involved. Where securities are traded on more than one exchange, arbitrage occurs by simultaneously buying in one
and selling on the other.
See rational pricing, particularly arbitrage mechanics, for further discussion.
Mathematically it is defined as follows:
P(V_T ≥ 0) = 1 and P(V_T > 0) > 0, with V_0 = 0,
where V_t means the value of a portfolio at time t.
Examples
Suppose that the exchange rates (after taking out the fees for making the exchange) in London are £5 = $10 =
¥1000 and the exchange rates in Tokyo are ¥1000 = $12 = £6. Converting ¥1000 to $12 in Tokyo and converting
that $12 into ¥1200 in London, for a profit of ¥200, would be arbitrage. In reality, this "triangle arbitrage" is so
simple that it almost never occurs. But more complicated foreign exchange arbitrages, such as the spot-forward
arbitrage (see interest rate parity) are much more common.
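The triangle arbitrage above can be checked with a few lines of arithmetic, assuming the currency symbols stripped from the quoted rates were £, $ and ¥:

```python
# Sketch of the triangle arbitrage above, assuming the stripped currency
# symbols were £, $ and ¥:
#   London: £5 = $10 = ¥1000   (so $1 buys ¥100 in London)
#   Tokyo:  ¥1000 = $12 = £6
yen = 1000
dollars = yen * 12 / 1000        # sell ¥1000 for $12 in Tokyo
yen_back = dollars * 1000 / 10   # sell $12 for ¥1200 in London
profit = yen_back - yen
print(f"risk-free profit: ¥{profit:.0f}")   # ¥200
```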
One example of arbitrage involves the New York Stock Exchange and the Chicago Mercantile Exchange. When
the price of a stock on the NYSE and its corresponding futures contract on the CME are out of sync, one can buy
the less expensive one and sell it to the more expensive market. Because the differences between the prices are
likely to be small (and not to last very long), this can only be done profitably with computers examining a large
number of prices and automatically exercising a trade when the prices are far enough out of balance. The activity
of other arbitrageurs can make this risky. Those with the fastest computers and the most expertise take advantage
of series of small differences that would not be profitable if taken individually.
Economists use the term "global labor arbitrage" to refer to the tendency of manufacturing jobs to flow towards
whichever country has the lowest wages per unit output at present and has reached the minimum requisite level of
political and economic development to support industrialization. At present, many such jobs appear to be flowing
towards China, though some which require command of English are going to India and the Philippines. In popular
terms, this is referred to as offshoring. (Note that "offshoring" is not synonymous with "outsourcing", which
means "to subcontract from an outside supplier or source", such as when a business outsources its bookkeeping to
an accounting firm. Unlike offshoring, outsourcing always involves subcontracting jobs to a different company,
and that company can be in the same country as the outsourcing company.)
Sports arbitrage: numerous internet bookmakers offer odds on the outcome of the same event. Any given
bookmaker will weight their odds so that no one customer can cover all outcomes at a profit against their books.
However, in order to remain competitive their margins are usually quite low. Different bookmakers may offer
different odds on the same outcome of a given event; by taking the best odds offered by each bookmaker, a
customer can under some circumstances cover all possible outcomes of the event and lock in a small risk-free profit,
known as a Dutch book. This profit would typically be between 1% and 5% but can be much higher. One problem
with sports arbitrage is that bookmakers sometimes make mistakes and this can lead to an invocation of the
'palpable error' rule, which most bookmakers invoke when they have made a mistake by offering or posting
incorrect odds. As bookmakers become more proficient, the odds of making an 'arb' usually last for less than an
hour and typically only a few minutes. Furthermore, huge bets on one side of the market also alert the bookies to
correct the market.
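The Dutch book condition can be checked by summing implied probabilities: taking the best available decimal odds per outcome, a risk-free book exists when the reciprocals sum to less than one. The odds below are hypothetical:

```python
# Dutch-book check with hypothetical best odds across bookmakers.
# A risk-free book exists when the implied probabilities (1/odds)
# sum to less than 1.
best_odds = {"home": 2.20, "draw": 3.60, "away": 4.00}

overround = sum(1 / o for o in best_odds.values())
print(f"sum of implied probabilities: {overround:.4f}")

if overround < 1:
    stake_total = 100.0
    # stake each outcome in proportion to its implied probability,
    # so every outcome pays the same amount
    stakes = {k: stake_total * (1 / o) / overround
              for k, o in best_odds.items()}
    payout = stake_total / overround
    print(f"guaranteed profit: {payout - stake_total:.2f} "
          f"on {stake_total:.0f} staked")
```

With these odds the locked-in profit is about 1.8% of the stake, consistent with the typical 1%-5% range mentioned above.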
Exchange-traded fund arbitrage: Exchange Traded Funds allow authorized participants to exchange back and
forth between shares in underlying securities held by the fund and shares in the fund itself, rather than allowing
the buying and selling of shares in the ETF directly with the fund sponsor. ETFs trade in the open market, with
prices set by market demand. An ETF may trade at a premium or discount to the value of the underlying assets.
When a significant enough premium appears, an arbitrageur will buy the underlying securities, convert them to
shares in the ETF, and sell them in the open market. When a discount appears, an arbitrageur will do the reverse.
In this way, the arbitrageur makes a low-risk profit, while fulfilling a useful function in the ETF marketplace by
keeping ETF prices in line with their underlying value.
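The create-or-redeem decision above reduces to comparing the ETF's market price with the value of its underlying basket, net of costs. The prices and cost threshold below are hypothetical:

```python
# Sketch of the ETF creation/redemption decision. The prices and the
# all-in cost threshold are hypothetical.
etf_price = 101.50
nav = 100.00        # per-share value of the underlying basket
costs = 0.30        # creation/redemption plus trading costs per share

premium = etf_price - nav
if premium > costs:
    action = "buy basket, create ETF shares, sell them"     # ETF too rich
elif premium < -costs:
    action = "buy ETF, redeem for basket, sell the basket"  # ETF too cheap
else:
    action = "no arbitrage after costs"
print(action)
```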
Some types of hedge funds make use of a modified form of arbitrage to profit. Rather than exploiting price
differences between identical assets, they will purchase and sell securities, assets and derivatives with similar
characteristics, and hedge any significant differences between the two assets. Any difference between the hedged
positions represents any remaining risk (such as basis risk) plus profit; the belief is that there remains some
difference which, even after hedging most risk, represents pure profit. For example, a fund may see that there is a
substantial difference between U.S. dollar debt and local currency debt of a foreign country, and enter into a
series of matching trades (including currency swaps) to arbitrage the difference, while simultaneously entering
into credit default swaps to protect against country risk and other types of specific risk.
Price convergence
Arbitrage has the effect of causing prices in different markets to converge. As a result of arbitrage, the currency
exchange rates, the price of commodities, and the price of securities in different markets tend to converge to the
same prices, in all markets, in each category. The speed at which prices converge is a measure of market efficiency.
Arbitrage tends to reduce price discrimination by encouraging people to buy an item where the price is low and resell
it where the price is high, as long as the buyers are not prohibited from reselling and the transaction costs of buying,
holding and reselling are small relative to the difference in prices in the different markets.
Arbitrage moves different currencies toward purchasing power parity. As an example, assume that a car purchased in
the United States is cheaper than the same car in Canada. Canadians would buy their cars across the border to exploit
the arbitrage condition. At the same time, Americans would buy US cars, transport them across the border, and sell
them in Canada. Canadians would have to buy American Dollars to buy the cars, and Americans would have to sell
the Canadian dollars they received in exchange for the exported cars. Both actions would increase demand for US
Dollars, and supply of Canadian Dollars, and as a result, there would be an appreciation of the US Dollar.
Eventually, if unchecked, this would make US cars more expensive for all buyers, and Canadian cars cheaper, until
there is no longer an incentive to buy cars in the US and sell them in Canada. More generally, international arbitrage
opportunities in commodities, goods, securities and currencies, on a grand scale, tend to change exchange rates until
the purchasing power is equal.
In reality, of course, one must consider taxes and the costs of travelling back and forth between the US and Canada.
Also, the features built into the cars sold in the US are not exactly the same as the features built into the cars for sale
in Canada, due, among other things, to the different emissions and other auto regulations in the two countries. In
addition, our example assumes that no duties have to be paid on importing or exporting cars from the USA to
Canada. Similarly, most assets exhibit (small) differences between countries, transaction costs, taxes, and other costs
provide an impediment to this kind of arbitrage.
Similarly, arbitrage affects the difference in interest rates paid on government bonds, issued by the various countries,
given the expected depreciations in the currencies, relative to each other (see interest rate parity).
Risks
Arbitrage transactions in modern securities markets involve fairly low day-to-day risks, but can face extremely high
risk in rare situations, particularly financial crises, and can lead to bankruptcy. Formally, arbitrage transactions have
negative skew: prices can get a small amount closer (but often no closer than 0), while they can get very far apart.
The day-to-day risks are generally small because the transactions involve small differences in price, so an execution
failure will generally cause a small loss (unless the trade is very big or the price moves rapidly). The rare case risks
are extremely high because these small price differences are converted to large profits via leverage (borrowed
money), and in the rare event of a large price move, this may yield a large loss.
The main day-to-day risk is that part of the transaction fails: execution risk. The main rare risks are counterparty
risk and liquidity risk: that a counterparty to a large transaction or many transactions fails to pay, or that one is
required to post margin and does not have the money to do so.
In the academic literature, the idea that seemingly very low risk arbitrage trades might not be fully exploited because
of these risk factors and other considerations is often referred to as limits to arbitrage.[2]
Execution risk
Generally it is impossible to close two or three transactions at the same instant; therefore, there is the possibility that
when one part of the deal is closed, a quick shift in prices makes it impossible to close the other at a profitable price.
Competition in the marketplace can also create risks during arbitrage transactions. As an example, if one were trying
to profit from a price discrepancy between IBM on the NYSE and IBM on the London Stock Exchange, one might
purchase a large number of shares on the NYSE and find that they cannot simultaneously be sold on the LSE. This
leaves the arbitrageur in an unhedged risk position.
In the 1980s, risk arbitrage was common. In this form of speculation, one trades a security that is clearly undervalued
or overvalued, when it is seen that the wrong valuation is about to be corrected by events. The standard example is
the stock of a company, undervalued in the stock market, which is about to be the object of a takeover bid; the price
of the takeover will more truly reflect the value of the company, giving a large profit to those who bought at the
current price, if the merger goes through as predicted. Traditionally, arbitrage transactions in the securities markets
involve high speed and low risk. At some moment a price difference exists, and the problem is to execute two or
three balancing transactions while the difference persists (that is, before the other arbitrageurs act). When the
transaction involves a delay of weeks or months, as above, it may entail considerable risk if borrowed money is used
to magnify the reward through leverage. One way of reducing the risk is through the illegal use of inside
information, and in fact risk arbitrage with regard to leveraged buyouts was associated with some of the famous
financial scandals of the 1980s such as those involving Michael Milken and Ivan Boesky.
Mismatch
Another risk occurs if the items being bought and sold are not identical and the arbitrage is conducted under the
assumption that the prices of the items are correlated or predictable; this is more narrowly referred to as a
convergence trade. In the extreme case this is merger arbitrage, described below. In comparison to the classical quick
arbitrage transaction, such an operation can produce disastrous losses.
Counterparty risk
As arbitrages generally involve future movements of cash, they are subject to counterparty risk: if a counterparty
fails to fulfill their side of a transaction. This is a serious problem if one has either a single trade or many related
trades with a single counterparty, whose failure thus poses a threat, or in the event of a financial crisis when many
counterparties fail. This hazard is serious because of the large quantities one must trade in order to make a profit on
small price differences.
For example, if one purchases many risky bonds, then hedges them with CDSes, profiting from the difference
between the bond spread and the CDS premium, in a financial crisis the bonds may default and the CDS writer/seller
may itself fail, due to the stress of the crisis, causing the arbitrageur to face steep losses.
Liquidity risk
"The market can stay irrational longer than you can stay solvent." (John Maynard Keynes)
Arbitrage trades are necessarily synthetic, leveraged trades, as they involve a short position. If the assets used are not
identical (so a price divergence makes the trade temporarily lose money), or the margin treatment is not identical,
and the trader is accordingly required to post margin (faces a margin call), the trader may run out of capital (if they
run out of cash and cannot borrow more) and go bankrupt even though the trades may be expected to ultimately
make money. In effect, arbitrage traders synthesize a put option on their ability to finance themselves.[3]
Prices may diverge during a financial crisis, often termed a "flight to quality"; these are precisely the times when it is
hardest for leveraged investors to raise capital (due to overall capital constraints), and thus they will lack capital
precisely when they need it most.[3]
Types of arbitrage
Merger arbitrage
Also called risk arbitrage, merger arbitrage generally consists of buying the stock of a company that is the target of a
takeover while shorting the stock of the acquiring company.
Usually the market price of the target company is less than the price offered by the acquiring company. The spread
between these two prices depends mainly on the probability and the timing of the takeover being completed as well
as the prevailing level of interest rates.
The bet in a merger arbitrage is that such a spread will eventually be zero, if and when the takeover is completed.
The risk is that the deal "breaks" and the spread massively widens.
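The bet on the spread can be sketched as a simple expected-value calculation; the offer, break price and completion probability below are hypothetical:

```python
# Expected-value sketch of the merger-arbitrage spread.
# The offer, break price and completion probability are hypothetical.
offer_price = 50.0    # acquirer's cash offer per share
market_price = 47.0   # target's current market price
break_price = 38.0    # estimated price if the deal breaks
p_complete = 0.90     # estimated probability the deal closes

expected_price = p_complete * offer_price + (1 - p_complete) * break_price
edge = expected_price - market_price
print(f"expected price {expected_price:.2f}, edge {edge:+.2f} per share")
```

The 3.00 spread to the offer is attractive only because the estimated completion probability is high; the same arithmetic with a lower probability turns the edge negative, which is the "deal breaks" risk described above.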
Municipal bond arbitrage
Also called municipal bond relative value arbitrage, municipal arbitrage, or just muni arb, this hedge fund strategy
involves one of two approaches.
Generally, managers seek relative value opportunities by being both long and short municipal bonds with a
duration-neutral book. The relative value trades may be between different issuers, different bonds issued by the same
entity, or capital structure trades referencing the same asset (in the case of revenue bonds). Managers aim to capture
the inefficiencies arising from the heavy participation of non-economic investors (i.e., high income "buy and hold"
investors seeking tax-exempt income) as well as the "crossover buying" arising from corporations' or individuals'
changing income tax situations (i.e., insurers switching their munis for corporates after a large loss as they can
capture a higher after-tax yield by offsetting the taxable corporate income with underwriting losses). There are
additional inefficiencies arising from the highly fragmented nature of the municipal bond market which has two
million outstanding issues and 50,000 issuers in contrast to the Treasury market which has 400 issues and a single
issuer.
Second, managers construct leveraged portfolios of AAA- or AA-rated tax-exempt municipal bonds with the
duration risk hedged by shorting the appropriate ratio of taxable corporate bonds. These corporate equivalents are
typically interest rate swaps referencing Libor or SIFMA.[4][5] The arbitrage manifests itself in the form of a
relatively cheap longer maturity municipal bond, which is a municipal bond that yields significantly more than 65%
of a corresponding taxable corporate bond. The steeper slope of the municipal yield curve allows participants to
collect more after-tax income from the municipal bond portfolio than is spent on the interest rate swap; the carry is
greater than the hedge expense. Positive, tax-free carry from muni arb can reach into the double digits. The bet in
this municipal bond arbitrage is that, over a longer period of time, two similar instruments, municipal bonds and
interest rate swaps, will correlate with each other; they are both very high quality credits, have the same maturity
and are denominated in U.S. dollars. Credit risk and duration risk are largely eliminated in this strategy. However,
basis risk arises from use of an imperfect hedge, which results in significant, but range-bound principal volatility.
The end goal is to limit this principal volatility, eliminating its relevance over time as the high, consistent, tax-free
cash flow accumulates. Since the inefficiency is related to government tax policy, and hence is structural in nature, it
has not been arbitraged away.
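The roughly-65% yield-ratio screen described above can be sketched in a few lines; the yields are hypothetical:

```python
# Sketch of the ~65% yield-ratio screen for muni arb.
# Both yields are hypothetical.
corp_yield = 0.060    # taxable corporate bond yield
muni_yield = 0.045    # tax-exempt municipal bond yield

ratio = muni_yield / corp_yield
print(f"muni/corporate yield ratio: {ratio:.0%}")   # 75%
if ratio > 0.65:
    print("candidate: long the muni, hedge duration with taxable swaps")
```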
Convertible bond arbitrage
A convertible bond is a bond that an investor can return to the issuing company in exchange for a predetermined
number of shares in the company.
A convertible bond can be thought of as a corporate bond with a stock call option attached to it.
The price of a convertible bond is sensitive to three major factors:
Interest rate: when rates move higher, the bond part of a convertible bond tends to move lower, but the call
option part of a convertible bond moves higher (and the aggregate tends to move lower).
Stock price: when the price of the stock the bond is convertible into moves higher, the price of the bond tends to
rise.
Credit spread: if the creditworthiness of the issuer deteriorates (e.g. rating downgrade) and its credit spread
widens, the bond price tends to move lower, but, in many cases, the call option part of the convertible bond moves
higher (since credit spread correlates with volatility).
Given the complexity of the calculations involved and the convoluted structure that a convertible bond can have, an
arbitrageur often relies on sophisticated quantitative models in order to identify bonds that are trading cheap versus
their theoretical value.
Convertible arbitrage consists of buying a convertible bond and hedging two of the three factors in order to gain
exposure to the third factor at a very attractive price.
For instance an arbitrageur would first buy a convertible bond, then sell fixed income securities or interest rate
futures (to hedge the interest rate exposure) and buy some credit protection (to hedge the risk of credit deterioration).
Eventually what he'd be left with is something similar to a call option on the underlying stock, acquired at a very low
price. He could then make money either selling some of the more expensive options that are openly traded in the
market or delta hedging his exposure to the underlying shares.
Depository receipts
A depository receipt is a security that is offered as a "tracking stock" on another foreign market. For instance a
Chinese company wishing to raise more money may issue a depository receipt on the New York Stock Exchange, as
the amount of capital on the local exchanges is limited. These securities, known as ADRs (American Depositary
Receipt) or GDRs (Global Depositary Receipt) depending on where they are issued, are typically considered
"foreign" and therefore trade at a lower value when first released. However, they are exchangeable into the original
security (known as fungibility) and actually have the same value. In this case there is a spread between the perceived
value and real value, which can be extracted. Since the ADR is trading at a value lower than what it is worth, one can
purchase the ADR and expect to make money as its value converges on the original. However, there is a chance that
the original stock will fall in value too; by shorting it, one can hedge that risk.
Dual-listed companies
A dual-listed company (DLC) structure involves two companies incorporated in different countries contractually
agreeing to operate their businesses as if they were a single enterprise, while retaining their separate legal identity
and existing stock exchange listings. In integrated and efficient financial markets, stock prices of the twin pair should
move in lockstep. In practice, DLC share prices exhibit large deviations from theoretical parity. Arbitrage positions
in DLCs can be set-up by obtaining a long position in the relatively underpriced part of the DLC and a short position
in the relatively overpriced part. Such arbitrage strategies start paying off as soon as the relative prices of the two
DLC stocks converge toward theoretical parity. However, since there is no identifiable date at which DLC prices will
converge, arbitrage positions sometimes have to be kept open for considerable periods of time. In the meantime, the
price gap might widen. In these situations, arbitrageurs may receive margin calls, after which they would most likely
be forced to liquidate part of the position at a highly unfavorable moment and suffer a loss. Arbitrage in DLCs may
be profitable, but is also very risky.[6] Background material is available at [7].
A good illustration of the risk of DLC arbitrage is the position in Royal Dutch Shell, which had a DLC structure
until 2005, by the hedge fund Long-Term Capital Management (LTCM, see also the discussion below). Lowenstein
(2000)[8] describes that LTCM established an arbitrage position in Royal Dutch Shell in the summer of 1997, when
Royal Dutch traded at an 8 to 10 percent premium. In total $2.3 billion was invested, half of which long in Shell and
the other half short in Royal Dutch (Lowenstein, p.99). In the autumn of 1998 large defaults on Russian debt created
significant losses for the hedge fund and LTCM had to unwind several positions. Lowenstein reports that the
premium of Royal Dutch had increased to about 22 percent and LTCM had to close the position and incur a loss.
According to Lowenstein (p.234), LTCM lost $286 million in equity pairs trading and more than half of this loss is
accounted for by the Royal Dutch Shell trade.
Private to public equities
The market prices for privately held companies are typically viewed from a return on investment perspective (such
as 25%), whilst publicly held and or exchange listed companies trade on a Price to Earnings multiple (such as a P/E
of 10, which equates to a 10% ROI). Thus, if a publicly traded company specialises in the acquisition of privately
held companies, from a per-share perspective there is a gain with every acquisition that falls within these guidelines.
For example, Berkshire Hathaway. A hedge fund that is an example of this type of arbitrage is Greenridge Capital,
which acts as an angel investor retaining equity in private companies which are in the process of becoming publicly
traded, buying in the private market and later selling in the public market. Private to public equities arbitrage is a
term which can arguably be applied to investment banking in general. Private markets to public markets differences
may also help explain the overnight windfall gains enjoyed by principals of companies that just did an initial public
offering.
Regulatory arbitrage
Regulatory arbitrage is where a regulated institution takes advantage of the difference between its real (or economic)
risk and the regulatory position. For example, if a bank, operating under the Basel I accord, has to hold 8% capital
against default risk, but the real risk of default is lower, it is profitable to securitise the loan, removing the low risk
loan from its portfolio. On the other hand, if the real risk is higher than the regulatory risk then it is profitable to
make that loan and hold on to it, provided it is priced appropriately.
This process can increase the overall riskiness of institutions under a risk insensitive regulatory regime, as described
by Alan Greenspan in his October 1998 speech on The Role of Capital in Optimal Banking Supervision and
Regulation.[9]
The term "regulatory arbitrage" was used for the first time in 2005 when it was applied by Scott V. Simpson, a partner at law
firm Skadden, Arps, to refer to a new defence tactic in hostile mergers and acquisitions where differing takeover
regimes in deals involving multi-jurisdictions are exploited to the advantage of a target company under threat.
In economics, regulatory arbitrage (sometimes, tax arbitrage) may be used to refer to situations when a company can
choose a nominal place of business with a regulatory, legal or tax regime with lower costs. For example, an
insurance company may choose to locate in Bermuda due to preferential tax rates and policies for insurance
companies. This can occur particularly where the business transaction has no obvious physical location: in the case
of many financial products, it may be unclear "where" the transaction occurs.
Regulatory arbitrage can include restructuring a bank by outsourcing services such as IT. The outsourcing company
takes over the installations, buying out the bank's assets and charges a periodic service fee back to the bank. This
frees up cashflow usable for new lending by the bank. The bank will have higher IT costs, but counts on the
multiplier effect of money creation and the interest rate spread to make it a profitable exercise.
Example: Suppose the bank sells its IT installations for 40 million USD. With a reserve ratio of 10%, the bank can
create 400 million USD in additional loans (there is a time lag, and the bank has to expect to recover the loaned
money back into its books). The bank can often lend (and securitize the loan) to the IT services company to cover the
acquisition cost of the IT installations. This can be at preferential rates, as the sole client using the IT installation is
the bank. If the bank can generate 5% interest margin on the 400 million of new loans, the bank will increase interest
revenues by 20 million. The IT services company is free to leverage their balance sheet as aggressively as they and
their banker agree to. This is the reason behind the trend towards outsourcing in the financial sector. Without this
money creation benefit, it is actually more expensive to outsource the IT operations as the outsourcing adds a layer
of management and increases overhead.
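The money-creation arithmetic of this example can be sketched in a few lines of Python (all figures are the article's hypothetical numbers; the function name is illustrative):

```python
def new_lending_capacity(cash_freed: float, reserve_ratio: float) -> float:
    """Additional loans supportable by freed-up cash under a fractional reserve."""
    return cash_freed / reserve_ratio

cash_from_it_sale = 40_000_000   # USD received for the IT installations
reserve_ratio = 0.10             # 10% reserve requirement
interest_margin = 0.05           # 5% interest margin on new loans

loans = new_lending_capacity(cash_from_it_sale, reserve_ratio)
extra_interest = loans * interest_margin
print(loans, extra_interest)  # 400 million in new loans, 20 million in interest
```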
Telecom arbitrage
Telecom arbitrage companies allow phone users to make international calls for free through certain access numbers.
Such services are offered in the United Kingdom; the telecommunication arbitrage companies get paid an
interconnect charge by the UK mobile networks and then buy international routes at a lower cost. The calls are seen
as free by the UK contract mobile phone customers since they are using up their allocated monthly minutes rather
than paying for additional calls.
Such services were previously offered in the United States by companies such as FuturePhone.com.[10] These
services would operate in rural telephone exchanges, primarily in small towns in the state of Iowa. In these areas, the
local telephone carriers are allowed to charge a high "termination fee" to the caller's carrier in order to fund the cost
of providing service to the small and sparsely-populated areas that they serve. However, FuturePhone (as well as
other similar services) ceased operations upon legal challenges from AT&T and other service providers.[11]
Statistical arbitrage
Statistical arbitrage is an imbalance in expected nominal values. A casino has a statistical arbitrage in every game of
chance that it offers, referred to as the house advantage, house edge, vigorish or house vigorish.
The debacle of Long-Term Capital Management
Long-Term Capital Management (LTCM) lost 4.6 billion U.S. dollars in fixed income arbitrage in September 1998.
LTCM had attempted to make money on the price difference between different bonds. For example, it would sell
U.S. Treasury securities and buy Italian bond futures. The concept was that because Italian bond futures had a less
liquid market, in the short term Italian bond futures would have a higher return than U.S. bonds, but in the long term,
the prices would converge. Because the difference was small, a large amount of money had to be borrowed to make
the buying and selling profitable.
The downfall in this system began on August 17, 1998, when Russia defaulted on its ruble debt and domestic dollar
debt. Because the markets were already nervous due to the Asian financial crisis, investors began selling non-U.S.
treasury debt and buying U.S. treasuries, which were considered a safe investment. As a result the price on US
treasuries began to increase and the return began decreasing because there were many buyers, and the return (yield)
on other bonds began to increase because there were many sellers (i.e. the price of those bonds fell). This caused the
difference between the prices of U.S. treasuries and other bonds to increase, rather than to decrease as LTCM was
expecting. Eventually this caused LTCM to fold, and their creditors had to arrange a bail-out. More controversially,
officials of the Federal Reserve assisted in the negotiations that led to this bail-out, on the grounds that so many
companies and deals were intertwined with LTCM that if LTCM actually failed, they would as well, causing a
collapse in confidence in the economic system. Thus LTCM failed as a fixed income arbitrage fund, although it is
unclear what sort of profit was realized by the banks that bailed LTCM out.
Etymology
"Arbitrage" is a French word and denotes a decision by an arbitrator or arbitration tribunal. (In modern French, "arbitre" usually means referee or umpire.) In the sense used here it was first defined in 1704 by Mathieu de la Porte in his treatise "La science des négocians et teneurs de livres" as a consideration of different exchange rates to recognize the most profitable places of issuance and settlement for a bill of exchange ("L'arbitrage est une combinaison que l'on fait de plusieurs changes, pour connoitre [connaître, in modern spelling] quelle place est plus avantageuse pour tirer et remettre".)[12]
See also
Types of financial arbitrage
Arbitrage betting
Covered interest arbitrage
Fixed income arbitrage
Political arbitrage
Risk arbitrage
Statistical arbitrage
Triangular arbitrage
Uncovered interest arbitrage
Volatility arbitrage
Related concepts
Algorithmic trading
Arbitrage pricing theory
Coherence (philosophical gambling strategy), analogous concept in Bayesian probability
Efficient market hypothesis
Immunization (finance)
Interest rate parity
Intermediation
TANSTAAFL
Value investing
Notes
[1] As an arbitrage consists of at least two trades, the metaphor is of putting on a pair of pants, one leg (trade) at a time. The risk that one trade
(leg) fails to execute is thus 'leg risk'.
[2] See e.g. Shleifer, Andrei, and Robert Vishny, 1997, The limits of arbitrage, Journal of Finance 52, 35-55.
Xiong, Wei, 2001, Convergence trading with wealth effects, Journal of Financial Economics 62, 247-292.
Kondor, Peter, 2009. Risk in Dynamic Arbitrage: Price Effects of Convergence Trading Journal of Finance 64(2),638-658,
[3] The Basis Monster That Ate Wall Street (http://www.purearb.com/purearb/wp-content/uploads/2009/03/desco_market_insights_vol_1_no_1_20090313.pdf), D. E. Shaw & Co.
[4] http://www.investinginbonds.com/story.asp?id=351
[5] http://www.bondmarkets.com/story.asp?id=1157
[6] de Jong, A., L. Rosenthal and M.A. van Dijk, 2008, The Risk and Return of Arbitrage in Dual-Listed Companies, June 2008. (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=525282)
[7] http://mathijsavandijk.com/dual-listed-companies/
[8] Lowenstein, R., 2000, When genius failed: The rise and fall of Long-Term Capital Management, Random House.
[9] http://www.ny.frb.org/research/epr/98v04n3/9810gree.pdf
[10] Ned Potter (2006-10-13). "Free International Calls! Just Dial ... Iowa" (http://abcnews.go.com/Technology/story?id=2560255). Retrieved 2008-12-23.
[11] Mike Masnick (2007-02-07). "Phone Call Arbitrage Is All Fun And Games (And Profit) Until AT&T Hits You With A $2 Million Lawsuit" (http://techdirt.com/articles/20070207/123022.shtml). Retrieved 2008-12-23.
[12] See "Arbitrage" in Trésor de la Langue Française (http://www.cnrtl.fr/lexicographie/arbitrage).
References
Greider, William (1997). One World, Ready or Not. Penguin Press. ISBN 0-7139-9211-5.
Special Situation Investing: Hedging, Arbitrage, and Liquidation, Brian J. Stark, Dow-Jones Publishers. New
York, NY 1983. ISBN 0870943847; ISBN 9780870943843
External links
What is Arbitrage? (About.com) (http://economics.about.com/cs/finance/a/arbitrage.htm)
ArbitrageView.com (http://www.arbitrageview.com/riskarb.htm) - Arbitrage opportunities in pending merger deals in the U.S. market
Information on arbitrage in dual-listed companies on the website of Mathijs A. van Dijk. (http://mathijsavandijk.com/dual-listed-companies)
What is Regulatory Arbitrage. Regulatory Arbitrage after the Basel II framework and the 8th Company Law Directive of the European Union. (http://www.regulatory-arbitrage.com)
Institute for Arbitrage. (http://www.rmjinstitute.com)
Futures contract
In finance, a futures contract is a standardized contract between two parties to buy or sell a specified asset
(e.g. oranges, oil, gold) of standardized quantity and quality at a specified future date at a price agreed today (the
futures price). The contracts are traded on a futures exchange. Futures contracts are not "direct" securities like stocks, bonds, rights or warrants; they are still securities, but of a derivative type. The
party agreeing to buy the underlying asset in the future assumes a long position, and the party agreeing to sell the
asset in the future assumes a short position.
The price is determined by the instantaneous equilibrium between the forces of supply and demand among
competing buy and sell orders on the exchange at the time of the purchase or sale of the contract.
In many cases, the underlying asset to a futures contract may not be traditional "commodities" at all; that is, for
financial futures, the underlying asset or item can be currencies, securities or financial instruments and intangible
assets or referenced items such as stock indexes and interest rates.
The future date is called the delivery date or final settlement date. The official price of the futures contract at the end
of a day's trading session on the exchange is called the settlement price for that day of business on the exchange.[1]
A closely related contract is a forward contract. Futures contracts are very similar to forward contracts, except that they are exchange-traded and defined on standardized assets.[2] Unlike forwards, futures
typically have interim partial settlements or "true-ups" in margin requirements. For typical forwards, the net gain or
loss accrued over the life of the contract is realized on the delivery date.
A futures contract gives the holder the obligation to make or take delivery under the terms of the contract, whereas
an option grants the buyer the right, but not the obligation, to establish a position previously held by the seller of the
option. In other words, the owner of an options contract may exercise the contract, but both parties of a "futures
contract" must fulfill the contract on the settlement date. The seller delivers the underlying asset to the buyer, or, if it
is a cash-settled futures contract, then cash is transferred from the futures trader who sustained a loss to the one who
made a profit. To exit the commitment prior to the settlement date, the holder of a futures position has to offset
his/her position by either selling a long position or buying back (covering) a short position, effectively closing out
the futures position and its contract obligations.
Futures contracts, or simply futures (but not "future" or "future contract"), are exchange-traded derivatives. The
exchange's clearing house acts as counterparty on all contracts, sets margin requirements, and crucially also provides
a mechanism for settlement.[3]
Origin
Aristotle described the story of Thales, a poor philosopher from Miletus who developed a "financial device, which
involves a principle of universal application". Thales used his skill in forecasting and predicted that the olive harvest
would be exceptionally good the next autumn. Confident in his prediction, he made agreements with local olive press
owners to deposit his money with them to guarantee him exclusive use of their olive presses when the harvest was
ready. Thales successfully negotiated low prices because the harvest was in the future and no one knew whether the
harvest would be plentiful or poor and because the olive press owners were willing to hedge against the possibility of
a poor yield. When the harvest time came, and many presses were wanted concurrently and suddenly, he let them out
at any rate he pleased, and made a large quantity of money.[4]
The first futures exchange market was the Dōjima Rice Exchange in Japan in the 1730s, created to meet the needs of samurai who, being paid in rice, needed a stable conversion to coin after a series of bad harvests.[5]
The Chicago Board of Trade (CBOT) listed the first-ever standardized 'exchange traded' forward contracts in 1864, which were called futures contracts. These contracts were based on grain trading, and started a trend that saw contracts created on a number of different commodities, as well as a number of futures exchanges set up in countries around the world.[6] By 1875 cotton futures were being traded in Mumbai in India, and within a few years this had expanded to futures on the edible oilseeds complex, raw jute and jute goods, and bullion.[7]
Standardization
Futures contracts ensure their liquidity by being highly standardized, usually by specifying:
The underlying asset or instrument. This could be anything from a barrel of crude oil to a short term interest rate.
The type of settlement, either cash settlement or physical settlement.
The amount and units of the underlying asset per contract. This can be the notional amount of bonds, a fixed
number of barrels of oil, units of foreign currency, the notional amount of the deposit over which the short term
interest rate is traded, etc.
The currency in which the futures contract is quoted.
The grade of the deliverable. In the case of bonds, this specifies which bonds can be delivered. In the case of
physical commodities, this specifies not only the quality of the underlying goods but also the manner and location
of delivery. For example, the NYMEX Light Sweet Crude Oil contract specifies the acceptable sulphur content
and API specific gravity, as well as the pricing point -- the location where delivery must be made.
The delivery month.
The last trading date.
Other details such as the commodity tick, the minimum permissible price fluctuation.
Margin
To minimize credit risk to the exchange, traders must
post a margin or a performance bond, typically
5%-15% of the contract's value.
To minimize counterparty risk to traders, trades
executed on regulated futures exchanges are
guaranteed by a clearing house. The clearing house
becomes the buyer to each seller, and the seller to
each buyer, so that in the event of a counterparty
default the clearer assumes the risk of loss. This
enables traders to transact without performing due
diligence on their counterparty.
Margin requirements are waived or reduced in some
cases for hedgers who have physical ownership of
the covered commodity or spread traders who have
offsetting contracts balancing the position.
Clearing margins are financial safeguards to ensure
that companies or corporations perform on their
customers' open futures and options contracts.
Clearing margins are distinct from customer margins
that individual buyers and sellers of futures and
options contracts are required to deposit with
brokers.
Customer margin: Within the futures industry, financial guarantees required of both buyers and sellers of futures
contracts and sellers of options contracts to ensure fulfillment of contract obligations. Futures Commission
Merchants are responsible for overseeing customer margin accounts. Margins are determined on the basis of market
risk and contract value. Also referred to as performance bond margin.
Initial margin is the equity required to initiate a futures position. This is a type of performance bond. The maximum exposure is not limited to the amount of the initial margin; however, the initial margin requirement is calculated based on the maximum estimated change in contract value within a trading day. For exchange-traded products, the amount or percentage of initial margin is set by the exchange concerned.
In case of loss, or if the value of the initial margin is being eroded, the broker will make a margin call in order to restore the amount of initial margin available. Often referred to as variation margin, margin called for this reason is usually collected on a daily basis; however, in times of high volatility a broker can make a margin call or calls intra-day. Calls for margin are usually expected to be paid and received on the same day. If not, the broker has the right to close sufficient positions to meet the amount called by way of margin. After the position is closed out, the client is liable for any resulting deficit in the client's account.
Some U.S. exchanges also use the term maintenance margin, which in effect defines by how much the value of the
initial margin can reduce before a margin call is made. However, most non-US brokers only use the term initial
margin and variation margin.
The Initial Margin requirement is established by the futures exchange, in contrast to other securities' Initial Margin (which is set by the Federal Reserve in the U.S. markets).
A futures account is marked to market daily. If the margin drops below the margin maintenance requirement
established by the exchange listing the futures, a margin call will be issued to bring the account back up to the
required level.
Maintenance margin: A set minimum margin per outstanding futures contract that a customer must maintain in his
margin account.
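The daily mark-to-market and margin-call mechanics described above can be sketched as follows (a minimal model with hypothetical dollar figures; real brokers apply more elaborate rules):

```python
def mark_to_market(equity, maintenance_margin, initial_margin, daily_pnl):
    """Apply daily profit/loss to a margin account; whenever equity falls
    below the maintenance level, a margin call restores it to the initial
    margin level."""
    calls = []
    for pnl in daily_pnl:
        equity += pnl
        if equity < maintenance_margin:
            calls.append(initial_margin - equity)  # variation margin demanded
            equity = initial_margin                # client tops the account up
    return equity, calls

# Hypothetical figures: $5,000 initial margin, $4,000 maintenance margin.
final_equity, margin_calls = mark_to_market(5_000, 4_000, 5_000, [-800, -400, 300])
print(final_equity, margin_calls)  # 5300 [1200]
```

Here the second day's loss pushes equity to $3,800, below the $4,000 maintenance level, so a $1,200 call brings the account back to the initial margin.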
Margin-equity ratio is a term used by speculators, representing the amount of their trading capital that is being held
as margin at any particular time. The low margin requirements of futures result in substantial leverage of the
investment. However, the exchanges require a minimum amount that varies depending on the contract and the trader.
The broker may set the requirement higher, but may not set it lower. A trader, of course, can set it above that, if he
doesn't want to be subject to margin calls.
Performance bond margin: The amount of money deposited by both a buyer and seller of a futures contract or an
options seller to ensure performance of the term of the contract. Margin in commodities is not a payment of equity or
down payment on the commodity itself, but rather it is a security deposit.
Return on margin (ROM) is often used to judge performance because it represents the gain or loss compared to the exchange's perceived risk as reflected in required margin. ROM may be calculated as (realized return) / (initial margin). The annualized ROM is equal to (ROM + 1)^(year/trade_duration) − 1. For example, if a trader earns 10% on margin in two months, that would be about 77% annualized.
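The annualization can be checked numerically; this sketch reproduces the 10%-in-two-months example (the function name is illustrative):

```python
def annualized_rom(rom, trade_duration_years):
    """Annualized return on margin: (ROM + 1) ** (year / trade_duration) - 1."""
    return (1 + rom) ** (1 / trade_duration_years) - 1

# 10% on margin over two months compounds to about 77% annualized.
print(round(annualized_rom(0.10, 2 / 12), 4))  # 0.7716
```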
Settlement - physical versus cash-settled futures
Settlement is the act of consummating the contract, and can be done in one of two ways, as specified per type of
futures contract:
Physical delivery - the amount specified of the underlying asset of the contract is delivered by the seller of the
contract to the exchange, and by the exchange to the buyers of the contract. Physical delivery is common with
commodities and bonds. In practice, it occurs only on a minority of contracts. Most are cancelled out by
purchasing a covering position - that is, buying a contract to cancel out an earlier sale (covering a short), or selling
a contract to liquidate an earlier purchase (covering a long). The NYMEX crude futures contract uses this method of settlement upon expiration.
Cash settlement - a cash payment is made based on the underlying reference rate, such as a short term interest
rate index such as Euribor, or the closing value of a stock market index. The parties settle by paying/receiving the
loss/gain related to the contract in cash when the contract expires.[8] Cash-settled futures are those that, as a practical matter, could not be settled by delivery of the referenced item; i.e., how would one deliver an index? A futures contract might also opt to settle against an index based on trade in a related spot market. ICE Brent futures use this method.
Expiry (or Expiration in the U.S.) is the time and the day that a particular delivery month of a futures contract stops
trading, as well as the final settlement price for that contract. For many equity index and interest rate futures
contracts (as well as for most equity options), this happens on the third Friday of certain trading months. On this day
the t+1 futures contract becomes the t futures contract. For example, for most CME and CBOT contracts, at the
expiration of the December contract, the March futures become the nearest contract. This is an exciting time for
arbitrage desks, which try to make quick profits during the short period (perhaps 30 minutes) during which the
underlying cash price and the futures price sometimes struggle to converge. At this moment the futures and the
underlying assets are extremely liquid and any disparity between an index and an underlying asset is quickly traded
by arbitrageurs. At this moment also, the increase in volume is caused by traders rolling over positions to the next
contract or, in the case of equity index futures, purchasing underlying components of those indexes to hedge against
current index positions. On the expiry date, a European equity arbitrage trading desk in London or Frankfurt will see
positions expire in as many as eight major markets almost every half an hour.
Pricing
When the deliverable asset exists in plentiful supply, or may be freely created, then the price of a futures contract is
determined via arbitrage arguments. This is typical for stock index futures, treasury bond futures, and futures on
physical commodities when they are in supply (e.g. agricultural crops after the harvest). However, when the
deliverable commodity is not in plentiful supply or when it does not yet exist - for example on crops before the
harvest or on Eurodollar Futures or Federal funds rate futures (in which the supposed underlying instrument is to be
created upon the delivery date) - the futures price cannot be fixed by arbitrage. In this scenario there is only one force
setting the price, which is simple supply and demand for the asset in the future, as expressed by supply and demand
for the futures contract.
Arbitrage arguments
Arbitrage arguments ("Rational pricing") apply when the deliverable asset exists in plentiful supply, or may be freely created. Here, the forward price represents the expected future value of the underlying discounted at the risk-free rate, as any deviation from the theoretical price will afford investors a riskless profit opportunity and should be arbitraged away.
Thus, for a simple, non-dividend-paying asset, the value of the future/forward, F(t), will be found by compounding the present value S(t) at time t to maturity T by the rate of risk-free return r:
F(t) = S(t)(1 + r)^(T−t)
or, with continuous compounding:
F(t) = S(t)e^(r(T−t))
This relationship may be modified for storage costs, dividends, dividend yields, and convenience yields.
In a perfect market the relationship between futures and spot prices depends only on the above variables; in practice
there are various market imperfections (transaction costs, differential borrowing and lending rates, restrictions on
short selling) that prevent complete arbitrage. Thus, the futures price in fact varies within arbitrage boundaries
around the theoretical price.
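The cost-of-carry relationship for a simple, non-dividend-paying asset can be sketched as follows (hypothetical inputs; function names are illustrative):

```python
import math

def futures_price_discrete(spot, r, years):
    """F = S * (1 + r) ** T, annual compounding, no dividends or storage costs."""
    return spot * (1 + r) ** years

def futures_price_continuous(spot, r, years):
    """F = S * exp(r * T), continuous compounding."""
    return spot * math.exp(r * years)

# Hypothetical inputs: spot 100, risk-free rate 5%, one year to maturity.
print(round(futures_price_discrete(100, 0.05, 1.0), 2))    # 105.0
print(round(futures_price_continuous(100, 0.05, 1.0), 2))  # 105.13
```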
Pricing via expectation
When the deliverable commodity is not in plentiful supply (or when it does not yet exist) rational pricing cannot be
applied, as the arbitrage mechanism is not applicable. Here the price of the futures is determined by today's supply
and demand for the underlying asset in the futures.
In a deep and liquid market, supply and demand would be expected to balance out at a price which represents an unbiased expectation of the future price of the actual asset, and so be given by the simple relationship:
F(t) = E_t[S(T)]
By contrast, in a shallow and illiquid market, or in a market in which large quantities of the deliverable asset have
been deliberately withheld from market participants (an illegal action known as cornering the market), the market
clearing price for the futures may still represent the balance between supply and demand but the relationship between
this price and the expected future price of the asset can break down.
Relationship between arbitrage arguments and expectation
The expectation-based relationship will also hold in a no-arbitrage setting when we take expectations with respect to the risk-neutral probability. In other words: a futures price is a martingale with respect to the risk-neutral probability.
With this pricing rule, a speculator is expected to break even when the futures market fairly prices the deliverable
commodity.
Contango and backwardation
The situation where the price of a commodity for future delivery is higher than the spot price, or where a far future
delivery price is higher than a nearer future delivery, is known as contango. The reverse, where the price of a
commodity for future delivery is lower than the spot price, or where a far future delivery price is lower than a nearer
future delivery, is known as backwardation.
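A minimal illustration of these two terms (hypothetical prices; the function name is illustrative):

```python
def term_structure(spot, future):
    """Label a single spot/futures price pair as contango or backwardation."""
    if future > spot:
        return "contango"
    if future < spot:
        return "backwardation"
    return "flat"

print(term_structure(100.0, 103.5))  # contango
print(term_structure(100.0, 97.0))   # backwardation
```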
Futures contracts and exchanges
Contracts
There are many different kinds of futures contracts, reflecting the many different kinds of "tradable" assets about
which the contract may be based such as commodities, securities (such as single-stock futures), currencies or
intangibles such as interest rates and indexes. For information on futures markets in specific underlying commodity
markets, follow the links. For a list of tradable commodities futures contracts, see List of traded commodities. See
also the futures exchange article.
Foreign exchange market
Money market
Bond market
Equity market
Soft Commodities market
Trading on commodities began in Japan in the 18th century with the trading of rice and silk, and similarly in Holland
with tulip bulbs. Trading in the US began in the mid 19th century, when central grain markets were established and a
marketplace was created for farmers to bring their commodities and sell them either for immediate delivery (also
called spot or cash market) or for forward delivery. These forward contracts were private contracts between buyers
and sellers and became the forerunner to today's exchange-traded futures contracts. Although contract trading began
with traditional commodities such as grains, meat and livestock, exchange trading has expanded to include metals,
energy, currency and currency indexes, equities and equity indexes, government interest rates and private interest
rates.
Exchanges
Contracts on financial instruments were introduced in the 1970s by the Chicago Mercantile Exchange (CME) and
these instruments became hugely successful and quickly overtook commodities futures in terms of trading volume
and global accessibility to the markets. This innovation led to the introduction of many new futures exchanges
worldwide, such as the London International Financial Futures Exchange in 1982 (now Euronext.liffe), Deutsche Terminbörse (now Eurex) and the Tokyo Commodity Exchange (TOCOM). Today, there are more than 90 futures and futures options exchanges worldwide, including:[9]
CME Group (formerly CBOT and CME) -- Currencies, Various Interest Rate derivatives (including US Bonds);
Agricultural (Corn, Soybeans, Soy Products, Wheat, Pork, Cattle, Butter, Milk); Index (Dow Jones Industrial
Average); Metals (Gold, Silver), Index (NASDAQ, S&P, etc.)
IntercontinentalExchange (ICE Futures Europe) - formerly the International Petroleum Exchange trades energy
including crude oil, heating oil, natural gas and unleaded gas
NYSE Euronext - which absorbed Euronext, into which the London International Financial Futures and Options Exchange or LIFFE (pronounced 'LIFE') had been merged (LIFFE had taken over the London Commodities Exchange ("LCE") in 1996) - softs: grains and meats. Inactive market in Baltic Exchange shipping. Index futures include EURIBOR, FTSE 100, CAC 40, AEX index.
South African Futures Exchange - SAFEX
Sydney Futures Exchange
Tokyo Stock Exchange TSE (JGB Futures, TOPIX Futures)
Tokyo Commodity Exchange TOCOM
Tokyo Financial Exchange - TFX - (Euroyen Futures, OverNight CallRate Futures, SpotNext RepoRate Futures)[10]
Osaka Securities Exchange OSE (Nikkei Futures, RNP Futures)
London Metal Exchange - metals: copper, aluminium, lead, zinc, nickel, tin and steel
IntercontinentalExchange (ICE Futures U.S.) - formerly New York Board of Trade - softs: cocoa, coffee, cotton,
orange juice, sugar
New York Mercantile Exchange CME Group- energy and metals: crude oil, gasoline, heating oil, natural gas,
coal, propane, gold, silver, platinum, copper, aluminum and palladium
Dubai Mercantile Exchange
Korea Exchange - KRX
Singapore Exchange - SGX - into which merged Singapore International Monetary Exchange (SIMEX)
ROFEX - Rosario (Argentina) Futures Exchange
Codes
Most futures contract codes are four characters. The first two characters identify the contract type, the third character identifies the month, and the last character is the last digit of the year.
The third-character month codes are:
January = F
February = G
March = H
April = J
May = K
June = M
July = N
August = Q
September = U
October = V
November = X
December = Z
Example: CLX0 is a Crude Oil (CL), November (X) 2010 (0) contract.
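The code scheme above can be sketched as a small parser (the decade base is an assumption the caller must supply, since a single year digit is ambiguous across decades):

```python
MONTH_CODES = {
    "F": "January", "G": "February", "H": "March", "J": "April",
    "K": "May", "M": "June", "N": "July", "Q": "August",
    "U": "September", "V": "October", "X": "November", "Z": "December",
}

def parse_futures_code(code, decade_base=2010):
    """Split a four-character futures code into (product, month, year)."""
    product = code[:2]                      # e.g. "CL" for Crude Oil
    month = MONTH_CODES[code[2]]            # third character is the month
    year = decade_base + int(code[3])       # last character is the year digit
    return product, month, year

print(parse_futures_code("CLX0"))  # ('CL', 'November', 2010)
```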
Who trades futures?
Futures traders are traditionally placed in one of two groups: hedgers, who have an interest in the underlying asset
(which could include an intangible such as an index or interest rate) and are seeking to hedge out the risk of price
changes; and speculators, who seek to make a profit by predicting market moves and opening a derivative contract
related to the asset "on paper", while they have no practical use for or intent to actually take or make delivery of the
underlying asset. In other words, the investor is seeking exposure to the asset in a long futures or the opposite effect
via a short futures contract.
Hedgers typically include producers and consumers of a commodity or the owner of an asset or assets subject to
certain influences such as an interest rate.
For example, in traditional commodity markets, farmers often sell futures contracts for the crops and livestock they
produce to guarantee a certain price, making it easier for them to plan. Similarly, livestock producers often purchase
futures to cover their feed costs, so that they can plan on a fixed cost for feed. In modern (financial) markets,
"producers" of interest rate swaps or equity derivative products will use financial futures or equity index futures to
reduce or remove the risk on the swap.
An example that has both hedge and speculative notions involves a mutual fund or separately managed account
whose investment objective is to track the performance of a stock index such as the S&P 500 stock index. The
Portfolio manager often "equitizes" cash inflows in an easy and cost effective manner by investing in (opening long)
S&P 500 stock index futures. This gains the portfolio exposure to the index which is consistent with the fund or
account investment objective without having to buy an appropriate proportion of each of the individual 500 stocks
just yet. This also preserves balanced diversification, maintains a higher degree of the percent of assets invested in
the market and helps reduce tracking error in the performance of the fund/account. When it is economically feasible
(an efficient amount of shares of every individual position within the fund or account can be purchased), the
portfolio manager can close the contract and make purchases of each individual stock.
The social utility of futures markets is considered to be mainly in the transfer of risk, and increased liquidity between
traders with different risk and time preferences, from a hedger to a speculator, for example.
Options on futures
In many cases, options are traded on futures, sometimes called simply "futures options". A put is the option to sell a
futures contract, and a call is the option to buy a futures contract. For both, the option strike price is the specified
futures price at which the future is traded if the option is exercised. See the Black-Scholes model, which is the most
popular method for pricing these option contracts. Futures are often used since they are delta one instruments.
Futures contract regulations
All futures transactions in the United States are regulated by the Commodity Futures Trading Commission (CFTC),
an independent agency of the United States government. The Commission has the right to hand out fines and other
punishments for an individual or company who breaks any rules. Although by law the commission regulates all
transactions, each exchange can have its own rules, and under contract can fine companies for different things or
extend the fine that the CFTC hands out.
The CFTC publishes weekly reports containing details of the open interest of market participants for each
market-segment that has more than 20 participants. These reports are released every Friday (including data from the
previous Tuesday) and contain data on open interest split by reportable and non-reportable open interest as well as
commercial and non-commercial open interest. This type of report is referred to as the 'Commitments of Traders
Report', COT-Report or simply COTR.
Definition of futures contract
Following Björk,[11] we give a definition of a futures contract. We describe a futures contract with delivery of item J
at the time T:
There exists in the market a quoted price F(t,T), which is known as the futures price at time t for delivery of J at
time T.
At time T, the holder pays F(T,T) and is entitled to receive J.
During any time interval (s,t], the holder receives the amount F(t,T) − F(s,T) (this reflects instantaneous marking to market).
The spot price of obtaining the futures contract is equal to zero, for all times t such that t ≤ T.
Nonconvergence
Some exchanges tolerate 'nonconvergence', the failure of futures contracts and the value of the physical commodities
they represent to reach the same value on 'contract settlement' day at the designated delivery points. An example of
this is the CBOT (Chicago Board of Trade) Soft Red Winter wheat (SRW) futures. SRW futures have settled more
than 20¢ apart on settlement day and with as much as a $1.00 difference between settlement days. Only a few participants
holding CBOT SRW futures contracts are qualified by the CBOT to make or receive delivery of commodities to
settle futures contracts. Therefore, it's impossible for almost any individual producer to 'hedge' efficiently when
relying on the final settlement of a futures contract for SRW. The trend is for the CBOT to continue to restrict those
entities that can actually participate in settling commodities contracts to those that can ship or receive large quantities
of railroad cars and multiple barges at a few selected sites. The Commodity Futures Trading Commission, which has
oversight of the futures market in the United States, has made no comment as to why this trend is allowed to
continue since economic theory and CBOT publications maintain that convergence of contracts with the price of the
underlying commodity they represent is the basis of integrity for a futures market. It follows that the function of
'price discovery', the ability of the markets to discern the appropriate value of a commodity reflecting current
conditions, is degraded in relation to the discrepancy in price and the inability of producers to enforce contracts with
the commodities they represent.
[12]
Futures versus forwards
While futures and forward contracts are both contracts to deliver an asset on a future date at a prearranged price, they
are different in two main respects:
Futures are exchange-traded, while forwards are traded over-the-counter.
Thus futures are standardized and face an exchange, while forwards are customized and face a non-exchange
counterparty.
Futures are margined, while forwards are not.
Thus futures have significantly less credit risk, and have different funding.
Exchange versus OTC
Futures are always traded on an exchange, whereas forwards always trade over-the-counter, or can simply be a
signed contract between two parties.
Thus:
Futures are highly standardized, being exchange-traded, whereas forwards can be unique, being over-the-counter.
In the case of physical delivery, the forward contract specifies to whom to make the delivery. The counterparty
for delivery on a futures contract is chosen by the clearing house.
Margining
Futures are margined daily to the daily spot price of a forward with the same agreed-upon delivery price and
underlying asset (based on mark to market).
Forwards do not have such a standard; they may transact only on the settlement date. More typical would be for the
parties to agree to true up, for example, every quarter. The fact that forwards are not margined daily means that, due
to movements in the price of the underlying asset, a large differential can build up between the forward's delivery
price and the settlement price, and in any event, an unrealized gain (loss) can build up.
Again, this differs from futures which get 'trued-up' typically daily by a comparison of the market value of the future
to the collateral securing the contract to keep it in line with the brokerage margin requirements. This true-ing up
occurs by the "loss" party providing additional collateral; so if the buyer of the contract incurs a drop in value, the
shortfall or variation margin would typically be shored up by the investor wiring or depositing additional cash in the
brokerage account.
In a forward, though, the spread in exchange rates is not trued up regularly but, rather, builds up as unrealized gain
(loss) depending on which side of the trade is being discussed. This means that the entire unrealized gain (loss) becomes
realized at the time of delivery (or, as typically occurs, the time the contract is closed prior to expiration),
assuming the parties must transact at the underlying currency's spot price to facilitate receipt/delivery.
The result is that forwards have higher credit risk than futures, and that funding is charged differently.
In most cases involving institutional investors, the daily variation margin settlement guidelines for futures call for
actual money movement only above some insignificant amount to avoid wiring back and forth small sums of cash.
The threshold amount for daily futures variation margin for institutional investors is often $1,000.
The situation for forwards, however, where no daily true-up takes place in turn creates credit risk for forwards, but
not so much for futures. Simply put, the risk of a forward contract is that the supplier will be unable to deliver the
referenced asset, or that the buyer will be unable to pay for it on the delivery date or the date at which the opening
party closes the contract.
The margining of futures eliminates much of this credit risk by forcing the holders to update daily to the price of an
equivalent forward purchased that day. This means that there will usually be very little additional money due on the
final day to settle the futures contract: only the final day's gain or loss, not the gain or loss over the life of the
contract.
In addition, the daily futures-settlement failure risk is borne by an exchange, rather than an individual party, further
limiting credit risk in futures.
Example: Consider a futures contract with a $100 price: Let's say that on day 50, a futures contract with a $100
delivery price (on the same underlying asset as the future) costs $88. On day 51, that futures contract costs $90. This
means that the "mark-to-market" calculation would require the holder of one side of the future to pay $2 on day 51 to
track the changes of the forward price ("post $2 of margin"). This money goes, via margin accounts, to the holder of
the other side of the future. That is, the loss party wires cash to the other party.
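The daily true-up described above can be sketched numerically; the following minimal illustration continues the $88/$90 example with a hypothetical price path:

```python
# Sketch of daily true-up: a long futures position realizes gains and losses
# daily via variation margin, while an otherwise identical forward realizes
# the whole gain (loss) only at settlement. The price path is hypothetical.

def daily_margin_flows(prices):
    """Variation margin received (paid, if negative) each day by the long holder."""
    return [today - yesterday for yesterday, today in zip(prices, prices[1:])]

path = [88.0, 90.0, 91.0, 89.5]       # futures prices from day 50 onward
flows = daily_margin_flows(path)
print(flows)                          # [2.0, 1.0, -1.5], realized day by day
forward_gain = path[-1] - path[0]     # realized only at settlement
assert abs(sum(flows) - forward_gain) < 1e-9  # same total, different timing
```

The telescoping sum makes the point in the text concrete: the cumulative margin flows equal the forward's total gain, but the futures holder receives them in daily increments.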
A forward-holder, however, may pay nothing until settlement on the final day, potentially building up a large
balance; this may be reflected in the mark by an allowance for credit risk. So, except for tiny effects of convexity
bias (due to earning or paying interest on margin), futures and forwards with equal delivery prices result in the same
total loss or gain, but holders of futures experience that loss/gain in daily increments which track the forward's daily
price changes, while the forward's spot price converges to the settlement price. Thus, while under mark to market
accounting, for both assets the gain or loss accrues over the holding period; for a futures this gain or loss is realized
daily, while for a forward contract the gain or loss remains unrealized until expiry.
Note that, due to the path dependence of funding, a futures contract is not, strictly speaking, a European derivative:
the total gain or loss of the trade depends not only on the value of the underlying asset at expiry, but also on the path
of prices on the way. This difference is generally quite small though.
With an exchange-traded future, the clearing house interposes itself on every trade. Thus there is no risk of
counterparty default. The only risk is that the clearing house defaults (e.g. becomes bankrupt), which is considered
very unlikely.
See also
List of finance topics
Freight derivatives
Fuel price risk management
List of traded commodities
Seasonal spread trading
Prediction market
1256 Contract
Oil-storage trade
Onion Futures Act
Grain Futures Act
Commodity Exchange Act
London Metal Exchange
Notes
[1] Sullivan, Arthur; Steven M. Sheffrin (2003). Economics: Principles in action (http:/ / www. pearsonschool. com/ index.
cfm?locator=PSZ3R9& PMDbSiteId=2781& PMDbSolutionId=6724& PMDbCategoryId=& PMDbProgramId=12881& level=4). Upper
Saddle River, New Jersey 07458: Pearson Prentice Hall. pp.288. ISBN0-13-063085-3.
[2] Forward Contract on Wikinvest
[3] Hull, John C. (2005). Options, Futures and Other Derivatives (excerpt by Fan Zhang) (http:/ / fan. zhang. gl/ ecref/ futures) (6th ed.).
Prentice-Hall. ISBN 0-13-149908-4.
[4] Aristotle, Politics, trans. Benjamin Jowett, vol. 2, The Great Books of the Western World, book 1, chap. 11, p. 453.
[5] Schaede, Ulrike (September 1989). "Forwards and futures in Tokugawa-period Japan: A new perspective on the Dōjima rice market". Journal of
Banking & Finance 13 (4–5): 487–513. doi:10.1016/0378-4266(89)90028-9
[6] "timeline-of-achievements" (http:/ / www.cmegroup.com/ company/ history/ timeline-of-achievements. html). CME Group. .
[7] Inter-Ministerial task force (chaired by Wajahat Habibullah) (May 2003). "Convergence of Securities and Commodity Markets report" (http:/
/ www.fmc.gov. in/ htmldocs/ reports/ rep03.htm). Forward Markets Commission (India). .
[8] Cash settlement on Wikinvest
[9] Futures & Options Factbook (http:/ / www.theIFM.org/ gfb). Institute for Financial Markets.
[10] http:/ / www.tfx.co. jp/ en/
[11] Björk, Tomas: Arbitrage Theory in Continuous Time, Cambridge University Press, 2004
[12] Henriques, D Mysterious discrepancies in grain prices baffle experts (http:/ / www. iht. com/ articles/ 2008/ 03/ 27/ business/ commod. php),
International Herald Tribune, March 23, 2008. Accessed April 12, 2008
References
The Institute for Financial Markets (http:/ / www. theifm. org) (2003). Futures & Options (http:/ / www. theifm.
org/ index. cfm?inc=education/ focourse. inc). Washington, DC: The IFM. p.237.
Redhead, Keith (1997). Financial Derivatives: An Introduction to Futures, Forwards, Options and Swaps.
London: Prentice-Hall. ISBN013241399X.
Lioui, Abraham; Poncet, Patrice (2005). Dynamic Asset Allocation with Forwards and Futures. New York:
Springer. ISBN0387241078.
Valdez, Steven (2000). An Introduction To Global Financial Markets (3rd ed.). Basingstoke, Hampshire:
Macmillan Press. ISBN0333764471.
Arditti, Fred D. (1996). Derivatives: A Comprehensive Resource for Options, Futures, Interest Rate Swaps, and
Mortgage Securities. Boston: Harvard Business School Press. ISBN0875845606.
The Institute for Financial Markets' Futures & Options Factbook (http:/ / www. theifm. org/ gfb)
U.S. Futures exchanges and regulators
Chicago Board of Trade, now part of CME Group
Chicago Mercantile Exchange, now part of CME Group
Commodity Futures Trading Commission
National Futures Association
Kansas City Board of Trade
New York Board of Trade now ICE
New York Mercantile Exchange, now part of CME Group
Minneapolis Grain Exchange
External links
BBC Oil Futures Investigation (http:/ / news. bbc. co. uk/ 1/ hi/ magazine/ 7559032. stm)
Energy Futures Databrowser (http:/ / mazamascience. com/ Energy/ NYMEXFutures/ ) Current and historical
charts of NYMEX energy futures chains.
CME Group futures contracts product codes (http:/ / www. cmegroup. com/ product-codes-listing/ )
Put–call parity
In financial mathematics, put–call parity defines a relationship between the price of a call option and a put
option, both with the identical strike price and expiry. To derive the put–call parity relationship, the assumption is
that the options are not exercised before expiration day, which necessarily applies to European options. Put–call
parity can be derived in a manner that is largely model-independent.
Derivation
Relationships between the value of a call and the value of a put on the same underlying may be determined by considering
different portfolio bundles, and by relying on the background assumption that the market is sufficiently ideal to prevent
arbitrage.
The following example uses stock options with no dividends. Let S be an underlying financial instrument. Let
T be the expiry of the call and put options on S, let K be their exercise (strike) price, and let S(t), C(t), P(t)
be the share, call, and put prices of S respectively. Now consider two portfolios:
A: one put option + one share of S. The value at expiry is
P(T) + S(T) = max(K − S(T), 0) + S(T) = max(K, S(T)).
B: one call option + K bonds, each paying 1 at time T. The value at expiry is
C(T) + K = max(S(T) − K, 0) + K = max(K, S(T)).
So the value of the two portfolios at expiry is the same. We claim that, since the market always prevents arbitrage
(risk-less profit), their prices at all times before expiry must be equal. Otherwise, if at some time before expiry they
differed, someone could profit by purchasing the cheaper portfolio and immediately selling the more expensive
one (since they have the same value at expiry). Hence it follows that for all times t ≤ T,
C(t) + K·B(t,T) = P(t) + S(t).
Here B(t,T) is the value at time t of a bond that matures (paying 1) at time T. This example demonstrates how put–call relationships may
be determined in other situations.
Note: (1) If a stock pays dividends, they should be included in B(t,T), because option prices are typically not adjusted for ordinary
dividends. If the bond interest rate r (or force of interest, compounded continuously) is constant, then
B(t,T) = e^(−r(T−t)).
(2) At first glance, there is a confusion about who sets these prices: do people go about constantly setting the prices of financial instruments, or
do these prices merely fall into place? Metaphysically, the theory only admits that the cause of the relation is market forces; those who set
prices, etc., are merely agents (efficient means) of this cause.
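The parity relation derived above can be checked numerically. The sketch below prices the call and put with the standard Black–Scholes formulas (consistent with the model's no-arbitrage assumptions); the parameter values are illustrative, not taken from the text:

```python
# Numerical check of put-call parity, C(t) + K*B(t,T) = P(t) + S(t),
# using Black-Scholes prices for the two options (no dividends).
# All parameter values below are illustrative.
import math

def _N(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_put(S, K, r, sigma, tau):
    """European call and put prices under Black-Scholes with time to expiry tau."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    call = S * _N(d1) - K * math.exp(-r * tau) * _N(d2)
    put = K * math.exp(-r * tau) * _N(-d2) - S * _N(-d1)
    return call, put

S, K, r, sigma, tau = 100.0, 95.0, 0.05, 0.2, 0.5
C, P = bs_call_put(S, K, r, sigma, tau)
B = math.exp(-r * tau)                     # zero-coupon bond price B(t, T)
assert abs((C + K * B) - (P + S)) < 1e-9   # parity holds to rounding error
```

Because the Black–Scholes call and put formulas are both arbitrage-free prices, parity holds for them exactly, which makes this a convenient self-check for any implementation.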
History
Nelson, an option arbitrage trader in New York, published a book: "The A.B.C. of Options and Arbitrage" in 1904
that describes the put-call parity in detail. His book was re-discovered by Espen Gaarder Haug in the early 2000s and
many references from Nelson's book are given in Haug's book "Derivatives Models on Models".
Henry Deutsch describes the put-call parity in 1910 in his book "Arbitrage in Bullion, Coins, Bills, Stocks, Shares
and Options, 2nd Edition". London: Engham Wilson but in less detail than Nelson (1904).
Mathematics professor Vinzenz Bronzin also derives the put-call parity in 1908 and uses it as part of his arbitrage
argument to develop a series of mathematical option models under a series of different distributions. The work of
professor Bronzin was just recently rediscovered by professor Wolfgang Hafner and professor Heinz Zimmermann.
The original work of Bronzin is a book written in German and is now translated and published in English in an
edited work by Hafner and Zimmermann (Vinzenz Bronzin's option pricing models, Springer Verlag).
Michael Knoll, in The Ancient Roots of Modern Financial Innovation: The Early History of Regulatory Arbitrage,
describes the important role that put–call parity played in developing the equity of redemption, the defining
characteristic of a modern mortgage, in Medieval England.
Russell Sage used put-call parity to create synthetic loans, which had higher interest rates than the usury laws of the
time would have normally allowed.
Its first description in the "modern" literature appears to be Hans Stoll's paper, The Relation Between Put and Call
Prices, from 1969.
Implications
Put–call parity implies:
Equivalence of calls and puts: Parity implies that a call and a put can be used interchangeably in any delta-neutral
portfolio. If D is the call's delta, then buying a call and selling D shares of stock is the same as buying a put and
buying 1 − D shares of stock. Equivalence of calls and puts is very important when trading options.
Parity of implied volatility: In the absence of dividends or other costs of carry (such as when a stock is difficult to
borrow or sell short), the implied volatility of calls and puts must be identical.
[1]
Other arbitrage relationships
Note that there are several other (theoretical) properties of option prices which may be derived via arbitrage
considerations. These properties define price limits, the relationship between price, dividends and the risk free rate,
the appropriateness of early exercise, and the relationship between the prices of various types of options. See links
below.
Put–call parity and American options
For American options, where one has the right to exercise before expiration, early exercise affects the B(t, T) term in the
above equation. Put–call parity only holds for European options, or for American options if they are not exercised early.
The left side of the equation is called the "fiduciary call".
The right side of the equation is called the "protective put".
References
[1] Hull, John C. (2002). Options, Futures and Other Derivatives (5th ed.). Prentice Hall. pp. 330–331. ISBN 0-13-009056-5.
External links
Put-Call parity
Put-Call Parity Relationship (http:/ / www. quantnotes. com/ fundamentals/ options/ putcallparity. htm),
quantnotes.com
Put-Call Parity and Arbitrage Opportunity (http:/ / www. investopedia. com/ articles/ optioninvestor/ 05/
011905. asp), investopedia.com
The Ancient Roots of Modern Financial Innovation: The Early History of Regulatory Arbitrage (http:/ / lsr.
nellco. org/ upenn/ wps/ papers/ 49/ ), Michael Knoll's history of Put-Call Parity
Other arbitrage relationships
Arbitrage Relationships for Options (http:/ / www. sjsu. edu/ faculty/ watkins/ arb. htm), Prof. Thayer Watkins
Rational Rules and Boundary Conditions for Option Pricing (http:/ / www. bus. lsu. edu/ academics/ finance/
faculty/ dchance/ Instructional/ TN99-05. pdf) (PDF), Prof. Don M. Chance
No-Arbitrage Bounds on Options (http:/ / faculty. chicagobooth. edu/ robert. novy-marx/ teaching/ 35100/
Lectures/ lec03. pdf), Prof. Robert Novy-Marx
Tools
Option Arbitrage Relations (http:/ / www. duke. edu/ ~charvey/ Classes/ ba350/ optval/ arbitrage/ arbitrage.
htm), Prof. Campbell R. Harvey
Intrinsic value (finance)
In finance, intrinsic value refers to the value of a security which is intrinsic to or contained in the security itself. It is
also frequently called fundamental value. It is ordinarily calculated by summing the future income generated by the
asset, and discounting it to the present value.
Options
An option is said to have intrinsic value if the option is in-the-money. When out-of-the-money, its intrinsic value is
zero.
The intrinsic value for an in-the-money option is calculated as the difference between the
current price (S) of the underlying and the strike price (K) of the option, floored at zero.
For a call option: max(S − K, 0),
while for a put option: max(K − S, 0).
For example, if the strike price for a call option is USD 1.00 and the price of the underlying is USD 1.20, then the
option has an intrinsic value of USD 0.20.
The total value of an option is the sum of its intrinsic value and its time value.
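The intrinsic-value calculation above, including the worked $1.20/$1.00 example, can be expressed directly:

```python
# Intrinsic value of an option, floored at zero.

def call_intrinsic(S, K):
    """Intrinsic value of a call: max(S - K, 0)."""
    return max(S - K, 0.0)

def put_intrinsic(S, K):
    """Intrinsic value of a put: max(K - S, 0)."""
    return max(K - S, 0.0)

# Worked example from the text: strike USD 1.00, underlying at USD 1.20.
print(round(call_intrinsic(1.20, 1.00), 2))  # 0.2
print(put_intrinsic(1.20, 1.00))             # 0.0, the put is out-of-the-money
```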
Equity
In valuing equity, securities analysts may use fundamental analysis as opposed to technical analysis to
estimate the intrinsic value of a company. Here the "intrinsic" characteristic considered is the expected cash flow
production of the company in question. Intrinsic value is therefore defined to be the present value of all expected
future net cash flows to the company; it is calculated via discounted cash flow valuation.
An alternative, though related approach, is to view intrinsic value as the value of a business' ongoing operations, as
opposed to its accounting based book value, or break-up value. Warren Buffett is known for his ability to calculate
the intrinsic value of a business, and then buy that business when its price is at a discount to its intrinsic value.
Real Estate
In valuing real estate, a similar approach may be used. The "intrinsic value" of real estate is therefore defined as the
net present value of all future net cash flows which are foregone by buying a piece of real estate instead of renting it
in perpetuity. These cash flows would include rent, inflation, maintenance and property taxes. This calculation can
be done using the Gordon model.
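As a sketch of the Gordon-model calculation mentioned above, with purely hypothetical inputs (next year's net cash flow, discount rate r, growth rate g):

```python
# Gordon growth model: present value of a perpetuity of net cash flows
# growing at rate g, discounted at rate r. All inputs are hypothetical.

def gordon_value(cash_flow_next_year, r, g):
    """PV = CF1 / (r - g); requires r > g for the perpetuity to converge."""
    if r <= g:
        raise ValueError("discount rate must exceed growth rate")
    return cash_flow_next_year / (r - g)

# e.g. 12,000 per year in net rent avoided, 7% discount rate, 2% growth:
print(round(gordon_value(12_000, 0.07, 0.02)))  # 240000
```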
See also
Net Realizable Value
Option time value
Option (finance)
Expected value
External links
Investopedia
[1]
References
[1] http:/ / www. investopedia. com/ terms/ i/ intrinsicvalue. asp
Option time value
In finance, the value of an option consists of two components, its intrinsic value and its time value. Time value is
simply the difference between option value and intrinsic value. Time value is also known as extrinsic value, or
instrumental value.
Intrinsic value
The intrinsic value of an option is the value of exercising it now. If the option has a positive monetary value, it is
referred to as being in-the-money, otherwise it is referred to as being out-of-the-money. If an option is
out-of-the-money at expiration, its holder will simply abandon the option and it will expire worthless. For this reason
we assume that the owner of the option will never choose to lose money by exercising, thus an option can never have
a negative value.
[1]
Value of a call option: max(S − K, 0), or (S − K)+
Value of a put option: max(K − S, 0), or (K − S)+
As seen on the graph, the intrinsic value of a call option is positive when the underlying asset's spot price S exceeds
the option's strike price K.
Option value
Option Value
Option value (i.e. price) is found via a predictive formula such as
Black–Scholes or using a numerical method such as the binomial
model. This price will reflect the "likelihood" of the option finishing
"in-the-money". For an out-of-the-money option, the further in the future
the expiration date (i.e. the longer the time to exercise) the higher the
chance of this occurring, and thus the higher the option price; for an
in-the-money option the chance of finishing in the money decreases; however, the
fact that the option cannot have negative value also works in the
owner's favor. The sensitivity of the option value to the amount of time
to expiry is known as the option's "theta"; see The Greeks. The option
value will never be lower than its intrinsic value.
As seen on the graph, the full call option value (intrinsic and time
value) is the red line.
Time value
Time value is, as above, the difference between option value and intrinsic value, i.e.
Time Value = Option Value − Intrinsic Value.
More specifically, an option's time value reflects the probability that the option will gain in intrinsic value or become
profitable to exercise before it expires.[2] An important factor is the option's volatility. Volatile prices of the
underlying instrument can stimulate option demand, enhancing the value. Numerically, this value depends on the
time until the expiration date and the volatility of the underlying instrument's price. The time value of an option is
not negative (because the option value is never lower than the intrinsic value), and converges towards zero with time.
At expiration, where the option value is simply its intrinsic value, time value is zero. Prior to expiration, the change
in time value with time is non-linear, being a function of the option price.[3]
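The definition above (time value = option value − intrinsic value) can be illustrated with a sketch that uses Black–Scholes call prices for the option value; the parameter values are illustrative:

```python
# Time value = option value - intrinsic value, illustrated for an
# at-the-money call priced with Black-Scholes. Illustrative parameters.
import math

def _N(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, tau):
    """European call price under Black-Scholes with time to expiry tau."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * _N(d1) - K * math.exp(-r * tau) * _N(d2)

S, K, r, sigma = 100.0, 100.0, 0.05, 0.2
for tau in (1.0, 0.5, 0.1, 0.01):
    time_value = bs_call(S, K, r, sigma, tau) - max(S - K, 0.0)
    print(tau, round(time_value, 4))
# The printed time values are positive and shrink toward zero as expiry nears.
```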
See also
Intrinsic value
Naked call
Time value of money
References
[1] Understanding Option Pricing (http:/ / www. investopedia. com/ articles/ optioninvestor/ 07/ options_beat_market. asp) Hans Wagner
[2] Option premium valuation (http:/ / www.oxfordfutures. com/ futures-education/ option-premium-valuation. htm) 22 August 2007
[3] Options: Time Value (http:/ / demonstrations.wolfram. com/ OptionsTimeValue/ ), wolfram.com
External links and references
Basic Options Concepts: Intrinsic Value and Time Value (http:/ / biz. yahoo. com/ opt/ basics5. html),
biz.yahoo.com
Moneyness
"In the money" redirects here; for the poker term, see In the money (poker).
In finance, moneyness is a measure of the degree to which a derivative is likely to have positive monetary value at
its expiration, in the risk-neutral measure. It can be measured in percentage probability, or in standard deviations.
Intrinsic value and time value
The intrinsic value (or "monetary value") of an option is the value of exercising it now. Thus if the current (spot)
price of the underlying security is above the agreed (strike) price, a call has positive intrinsic value (and is called "in
the money"), while a put has zero intrinsic value.
The time value of an option is the option value less the intrinsic value. It equates to uncertainty in the
form of investor hope. It is also viewed as the value of not exercising the option immediately. In the case of a
European option, you cannot exercise it before the expiry date, so the time value can be negative; for an American
option, if the time value is ever negative, you exercise it: this yields a boundary condition.
ATM: At-the-money
An option is at-the-money if the strike price is the same as the spot price of the underlying security on which the
option is written. An at-the-money option has no intrinsic value, only time value.
ITM: In-the-money
An in-the-money option has positive intrinsic value as well as time value. A call option is in-the-money when the
strike price is below the spot price. A put option is in-the-money when the strike price is above the spot price.
OTM: Out-of-the-money
An out-of-the-money option has no intrinsic value. A call option is out-of-the-money when the strike price is above
the spot price of the underlying security. A put option is out-of-the-money when the strike price is below the spot
price.
Spot versus forward
Assets can have a forward price (a price for delivery in future) as well as a spot price. One can also talk about
moneyness with respect to the forward price: thus one talks about ATMF, "ATM Forward", and so forth. For
instance, if the spot price for USD/JPY is 120, and the forward price one year hence is 110, then a call struck at 110
is ATMF but not ATM.
Which are used?
Buying an ITM option is effectively lending money in the amount of the intrinsic value. Further, an ITM call can be
replicated by entering a forward and buying an OTM put (and conversely). Consequently, ATM and OTM options
are the main traded ones.
Example
Suppose the current stock price of IBM is $100. A call or put option with a strike of $100 is at-the-money. A call
option with a strike of $80 is in-the-money (100 − 80 = 20 > 0). A put option with a strike at $80 is out-of-the-money
(80 − 100 = −20 < 0). Conversely, a call option with a $120 strike is out-of-the-money and a put option with a $120
strike is in-the-money.
When one uses the Black-Scholes model to value the option, one may define moneyness quantitatively. If we define
the moneyness (of a call) as
m = (d1 + d2) / 2,
where d1 and d2 are the standard Black-Scholes parameters, then
m = (ln(S/K) + rT) / (σ√T),
where T is the time to expiry.
In other words, it is the number of standard deviations the current price is above the ATMF price.
This choice of parameterisation means that the moneyness is zero when the forward price of the underlying
equals the strike price. Such an option is often referred to as at-the-money-forward.
Moneyness is measured in standard deviations from this point, with a positive value meaning an in-the-money call
option and a negative value meaning an out-of-the-money call option (with signs reversed for a put option).
One can also measure it as a percent, via N(m), where N is the standard normal cumulative distribution function;
thus a moneyness of 0 yields a 50% probability of expiring ITM, while a moneyness of 1 yields an approximately
84% probability of expiring ITM.
Beware that (percentage) moneyness is close to but different from Delta: instead of N(d1), it is
N(m) = N((d1 + d2)/2), for a call (conversely for a put).
Thus a 25-Delta call option has approximately (but not exactly) 25% moneyness.
Note that r here is the risk-free rate, not the expected return on the underlying.
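The standardized and percentage moneyness described above can be sketched as follows; the identity m = (d1 + d2)/2 = (ln(S/K) + rT)/(σ√T) is assumed, and the parameter values are illustrative:

```python
# Standardized moneyness of a call, m = (ln(S/K) + r*T) / (sigma*sqrt(T)),
# and its percentage form N(m). Parameter values are illustrative.
import math

def moneyness(S, K, r, sigma, T):
    """Moneyness in standard deviations; positive means in-the-money call."""
    return (math.log(S / K) + r * T) / (sigma * math.sqrt(T))

def percent_moneyness(S, K, r, sigma, T):
    """N(m): approximate risk-neutral probability of expiring ITM."""
    m = moneyness(S, K, r, sigma, T)
    return 0.5 * (1.0 + math.erf(m / math.sqrt(2.0)))

# When the strike equals the forward price S*exp(r*T), moneyness is zero
# (the at-the-money-forward point) and percentage moneyness is 50%.
S, r, sigma, T = 100.0, 0.05, 0.2, 1.0
K_atmf = S * math.exp(r * T)
print(abs(moneyness(S, K_atmf, r, sigma, T)) < 1e-9)        # True
print(round(percent_moneyness(S, K_atmf, r, sigma, T), 2))  # 0.5
```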
References
McMillan, Lawrence G. (2002). Options as a Strategic Investment (4th ed.). New York: New York Institute
of Finance. ISBN 0-7352-0197-8.
Black–Scholes
The Black–Scholes model is a mathematical description of financial markets and derivative investment instruments.
The model develops partial differential equations whose solution, the Black–Scholes formula, is widely used in the
pricing of European-style options.
The model was first articulated by Fischer Black and Myron Scholes in their 1973 paper, "The Pricing of Options
and Corporate Liabilities." The foundation for their research relied on work developed by scholars such as Jack L.
Treynor, Paul Samuelson, A. James Boness, Sheen T. Kassouf, and Edward O. Thorp.
[1]
The fundamental insight of
Black–Scholes is that the option is implicitly priced if the stock is traded. Robert C. Merton was the first to publish a
paper expanding the mathematical understanding of the options pricing model, and coined the term "Black–Scholes
options pricing model".
Merton and Scholes received the 1997 Nobel Prize in Economics (The Sveriges Riksbank Prize in Economic
Sciences in Memory of Alfred Nobel) for their work. Though ineligible for the prize because of his death in 1995,
Black was mentioned as a contributor by the Swedish academy.
[2]
Model assumptions
The Black–Scholes model of the market for a particular equity makes the following explicit assumptions:
It is possible to borrow and lend cash at a known constant risk-free interest rate. This restriction has been removed
in later extensions of the model.
The price follows a geometric Brownian motion with constant drift and volatility. It follows from this that the
return has a normal distribution (and hence that the price of the underlying has a log-normal distribution). This often
implies the validity of the efficient-market hypothesis.
There are no transaction costs or taxes.
The underlying does not pay a dividend (see below for extensions to handle dividend payments).
All securities are infinitely divisible (i.e., it is possible to buy any fraction of a share).
There are no restrictions on short selling.
There is no arbitrage opportunity.
Options use the European exercise terms, which dictate that options may only be exercised on the day of
expiration.
From these conditions in the market for an equity (and for an option on the equity), the authors show that "it is
possible to create a hedged position, consisting of a long position in the stock and a short position in [calls on the
same stock], whose value will not depend on the price of the stock."
[3]
Several of these assumptions of the original model have been removed in subsequent extensions of the model.
Modern versions account for changing interest rates (Merton, 1976), transaction costs and taxes (Ingersoll, 1976),
and dividend payout (Merton, 1973).
Notation
Let
S be the price of the underlying asset (e.g. a stock), as a function of time.
Under the assumptions this price is governed by a geometric Brownian motion. Since almost surely every
such path is continuous, we may assume S is continuous. In particular
dS = μS dt + σS dW,
where μ is the drift rate; σ the volatility of the stock's returns; and W is a Wiener process.
V(S, t) be the price of a derivative based on S, where this derivative is either a call or a put option; it depends only on the price of the asset and time.
(Later we shall specify the option as a (European) call or put, and replace V by C and P
respectively.)
K, the exercise (strike) price of the option.
t denote time; we generally let T denote the expiry of the option.
r, the annualized risk-free interest rate, continuously compounded.
Mathematical model
[Figure: simulated geometric Brownian motions with parameters from market data]
The quantity to be determined is the price of the derivative, V(S, t), viz. how V
evolves over time. To do this, we rely on the assumptions on the
underlying asset, as well as market equilibrium.
(1) Some minor calculations.
As per the assumptions, the price of the underlying asset is a geometric Brownian motion,
dS = μS dt + σS dW,
where W is a Wiener process. So W accounts for all volatility/risk in the price history of the stock. Then by
Itô's lemma for two variables we have, for all times t,
dV = (∂V/∂t + μS ∂V/∂S + ½σ²S² ∂²V/∂S²) dt + σS (∂V/∂S) dW
(provided we assume V is twice continuously differentiable with respect to S and once with respect to t).
(2) Consider the following trading strategy (delta-hedging): during the lifetime of the derivative (i.e. the option), continuously hold one option and a short position of ∂V/∂S
shares of the underlying financial instrument.
The price of this portfolio at any time t is:
Π = V − (∂V/∂S) S.
Also the instantaneous profit at any time t, i.e. the payoff of purchasing this portfolio and then selling it
at the end of an infinitesimal interval dt, is:
dΠ = dV − (∂V/∂S) dS = (∂V/∂t + ½σ²S² ∂²V/∂S²) dt.
Note that the dW terms cancel, so this portfolio is instantaneously riskless. Now, to prevent arbitrage against this delta-hedged portfolio, it must be that at any time t,
dΠ = rΠ dt,
where r is the risk-free interest rate. (Otherwise either (i) one could borrow money at some time, purchase the
portfolio, then at the next instant sell the portfolio and return the money with interest; or (ii) one could
short-sell the portfolio, earn interest on the money, then at the next instant buy the original portfolio back at the
updated price. Clearly one of these will allow for a risk-free profit if equality does not hold.)
Putting this together readily determines a governing equation for the price of the derivative V. For all times t,
∂V/∂t + ½σ²S² ∂²V/∂S² + rS ∂V/∂S − rV = 0,
which is called the Black–Scholes (second-order) partial differential equation. The boundary conditions are
determined by what a put/call option is worth at expiry.
Note: Black and Scholes reasoned that market forces will enforce this relation via delta-hedging. This applies above since, as
dt → 0, fluctuations in the underlying variables will not give rise
to [significant] changes in Π.
Other derivations
Above we used the method of arbitrage-free pricing ("delta-hedging") to derive a PDE governing option prices
given the Black–Scholes model. It is also possible to use a risk-neutrality argument. This latter method gives the
price as the expectation of the option payoff under a particular probability measure, called the risk-neutral measure,
which differs from the real world measure.
Black–Scholes formula
[Figure: Black–Scholes European call option pricing surface]
The Black–Scholes formula calculates the price of European put and call options. It can be obtained by solving the
Black–Scholes partial differential equation. If the derivative V is a call option, then we have the boundary
condition C(S, T) = max(S − K, 0); if a put option, the boundary condition is
P(S, T) = max(K − S, 0).
The value of a call option for a non-dividend-paying underlying stock in terms of the Black–Scholes parameters is:
C(S, t) = N(d1) S − N(d2) K e^{−r(T−t)},
where
d1 = [ln(S/K) + (r + σ²/2)(T − t)] / (σ √(T − t)),
d2 = d1 − σ √(T − t),
T − t is the time till expiry, and N(·) is the cumulative distribution function of the standard normal distribution.
In turn, the price of a corresponding put option based on put–call parity is:
P(S, t) = K e^{−r(T−t)} − S + C(S, t) = N(−d2) K e^{−r(T−t)} − N(−d1) S.
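The closed-form prices above are straightforward to evaluate numerically. The following is a minimal sketch (function and parameter names are illustrative, not part of the original paper); the put is obtained from the call via put–call parity:

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_price(S, K, T, r, sigma, kind="call"):
    """Black-Scholes price of a European option on a non-dividend-paying stock.

    T is the time to expiry (T - t in the formulas above)."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    call = norm_cdf(d1) * S - norm_cdf(d2) * K * exp(-r * T)
    if kind == "call":
        return call
    # Put via put-call parity: P = C - S + K e^{-rT}
    return call - S + K * exp(-r * T)

# Example: S = 100, K = 100, T = 1 year, r = 5%, sigma = 20%
print(round(bs_price(100, 100, 1.0, 0.05, 0.20, "call"), 4))  # ~10.4506
print(round(bs_price(100, 100, 1.0, 0.05, 0.20, "put"), 4))   # ~5.5735
```

Computing the put from the call (rather than re-deriving it) guarantees the two prices satisfy parity exactly.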
Interpretation
N(d1) and N(d2) are the probabilities of the option expiring in-the-money under the equivalent exponential
martingale probability measure (numéraire = stock) and the equivalent martingale probability measure (numéraire =
risk-free asset), respectively. The equivalent martingale probability measure is also called the risk-neutral probability
measure. Note that both of these are probabilities in a measure-theoretic sense, and neither of these is the true
probability of expiring in-the-money under the real probability measure. To calculate the probability under the real
("physical") probability measure, additional information is required, viz. the drift term in the physical measure, or
equivalently, the market price of risk.
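The risk-neutral interpretation of N(d2) can be illustrated by simulation: under the risk-neutral dynamics the terminal price is S_T = S exp((r − σ²/2)T + σ√T Z) with Z standard normal, and the fraction of simulated paths finishing in-the-money should approach N(d2). This is a quick sanity-check sketch (parameter values are illustrative):

```python
import random
from math import log, sqrt, exp, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

S, K, T, r, sigma = 100.0, 100.0, 1.0, 0.05, 0.20
d2 = (log(S / K) + (r - 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))

rng = random.Random(42)
n = 200_000
# Simulate S_T under the risk-neutral measure and count in-the-money paths.
itm = sum(
    1 for _ in range(n)
    if S * exp((r - 0.5 * sigma ** 2) * T + sigma * sqrt(T) * rng.gauss(0, 1)) > K
)
print(itm / n, norm_cdf(d2))  # both close to 0.56
```

Repeating the experiment with the physical drift μ in place of r would give the real-world probability instead, which is why extra information (the drift, or the market price of risk) is needed for it.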
Derivation
We now show how to get from the general Black–Scholes PDE to a specific valuation for an option. Consider as an
example the Black–Scholes price of a call option C(S, t), for which the PDE above has boundary conditions
C(0, t) = 0 for all t,
C(S, t) → S as S → ∞,
C(S, T) = max(S − K, 0) for all S.
The last condition gives the value of the option at the time that the option matures. The solution of the PDE gives the
value of the option at any earlier time.
To actually solve the PDE, transform the equation into a diffusion (heat) equation, which may be solved using
standard methods. To this end we apply the change of variables
τ = T − t, x = ln(S/K) + (r − σ²/2) τ, u(x, τ) = C e^{rτ},
with x ranging over (−∞, ∞) and τ over [0, T].
Then the Black–Scholes PDE becomes a heat equation
∂u/∂τ = (σ²/2) ∂²u/∂x²,
with initial condition
u(x, 0) = K max(e^x − 1, 0),
the transformed call payoff (the lateral boundary conditions transform accordingly).
Using the standard heat-kernel (fundamental solution) method we have
u(x, τ) = (1 / (σ√(2πτ))) ∫ u(y, 0) e^{−(x−y)²/(2σ²τ)} dy,
which yields
where d1 and d2 are as above. Reverting the transformations yields the solution above.
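The transformed solution can be checked numerically: convolve the transformed payoff u(y, 0) = K max(e^y − 1, 0) with the heat kernel of variance σ²τ and undo the substitutions. The sketch below (grid sizes and parameter values are illustrative) uses a simple trapezoidal rule and recovers the closed-form call price:

```python
from math import log, sqrt, exp, pi

S, K, T, r, sigma = 100.0, 100.0, 1.0, 0.05, 0.20
tau = T                       # time to expiry, tau = T - t with t = 0
x = log(S / K) + (r - 0.5 * sigma ** 2) * tau
var = sigma ** 2 * tau        # variance of the heat kernel at time tau

# u(x, tau) = integral of u(y, 0) * heat_kernel(x - y) dy over y > 0,
# with initial data u(y, 0) = K * max(e^y - 1, 0); trapezoidal rule.
dy, y_max = 1e-3, x + 10 * sqrt(var)
n = int(y_max / dy)
u = 0.0
for i in range(n + 1):
    y = i * dy
    w = 0.5 if i in (0, n) else 1.0   # trapezoid end-point weights
    payoff = K * (exp(y) - 1.0)
    kernel = exp(-(x - y) ** 2 / (2 * var)) / sqrt(2 * pi * var)
    u += w * payoff * kernel * dy

call = exp(-r * tau) * u      # undo the substitution u = C e^{r tau}
print(round(call, 4))         # ~10.4506, matching the closed form
```

Evaluating the two pieces of this integral analytically is exactly how the N(d1) and N(d2) terms arise.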
Greeks
The Greeks under the Black–Scholes model are given in closed form, below (φ denotes the standard normal density and τ = T − t):
Delta: call N(d1); put N(d1) − 1
Gamma: φ(d1) / (S σ √τ) (same for calls and puts)
Vega: S φ(d1) √τ (same for calls and puts)
Theta: call −S φ(d1) σ / (2√τ) − r K e^{−rτ} N(d2); put −S φ(d1) σ / (2√τ) + r K e^{−rτ} N(−d2)
Rho: call K τ e^{−rτ} N(d2); put −K τ e^{−rτ} N(−d2)
Note that the gamma and vega formulas are the same for calls and puts, which can be seen directly from put–call
parity.
In practice, some sensitivities are usually quoted in scaled-down terms, to match the scale of likely changes in the
parameters. For example, rho is often reported divided by 10,000 (a 1 bp rate change), vega by 100 (a 1 vol point change),
and theta by 365 or 252 (1 day of decay based on either calendar days or trading days per year).
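A closed-form Greek can always be cross-checked against a finite-difference "bump" of the pricing formula. The sketch below (illustrative parameters; a minimal self-contained pricer) compares delta = N(d1) with a central difference of the call price:

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return norm_cdf(d1) * S - norm_cdf(d2) * K * exp(-r * T)

S, K, T, r, sigma = 100.0, 100.0, 1.0, 0.05, 0.20
d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))

closed_form_delta = norm_cdf(d1)                       # N(d1)
h = 1e-4
bumped_delta = (bs_call(S + h, K, T, r, sigma) -
                bs_call(S - h, K, T, r, sigma)) / (2 * h)
print(closed_form_delta, bumped_delta)  # both ~0.6368
```

The same bump-and-reprice idea is how Greeks are obtained for models without closed forms.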
Extensions of the model
The above model can be extended for variable (but deterministic) rates and volatilities. The model may also be used
to value European options on instruments paying dividends. In this case, closed-form solutions are available if the
dividend is a known proportion of the stock price. American options and options on stocks paying a known cash
dividend (in the short term, more realistic than a proportional dividend) are more difficult to value, and a choice of
solution techniques is available (for example lattices and grids).
Instruments paying continuous yield dividends
For options on indices, it is reasonable to make the simplifying assumption that dividends are paid continuously, and
that the dividend amount is proportional to the level of the index.
The dividend payment paid over the time period [t, t + dt) is then modelled as
q S_t dt
for some constant q (the dividend yield).
Under this formulation the arbitrage-free price implied by the Black–Scholes model can be shown to be
C(S, t) = e^{−r(T−t)} [F N(d1) − K N(d2)],
where now
F = S e^{(r−q)(T−t)}
is the modified forward price that occurs in the terms d1 and d2:
d1 = [ln(F/K) + (σ²/2)(T − t)] / (σ √(T − t)), d2 = d1 − σ √(T − t).
Exactly the same formula is used to price options on foreign exchange rates, except that now q plays the role of the
foreign risk-free interest rate and S is the spot exchange rate. This is the Garman-Kohlhagen model (1983).
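In code, the dividend-yield (or Garman–Kohlhagen) variant differs from the standard formula only in the forward price used inside d1 and d2. A minimal sketch (names and parameters are illustrative):

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call_yield(S, K, T, r, q, sigma):
    """European call with continuous dividend yield q (or, for FX options,
    foreign risk-free rate q and spot exchange rate S)."""
    F = S * exp((r - q) * T)                       # modified forward price
    d1 = (log(F / K) + 0.5 * sigma ** 2 * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return exp(-r * T) * (F * norm_cdf(d1) - K * norm_cdf(d2))

# With q = 0 this collapses to the standard Black-Scholes call.
print(round(bs_call_yield(100, 100, 1.0, 0.05, 0.00, 0.20), 4))  # ~10.4506
print(round(bs_call_yield(100, 100, 1.0, 0.05, 0.03, 0.20), 4))  # lower: dividends leak value
```

Raising q lowers the forward, and hence the call price, which matches the intuition that the call holder does not receive the dividends.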
Instruments paying discrete proportional dividends
It is also possible to extend the Black–Scholes framework to options on instruments paying discrete proportional
dividends. This is useful when the option is struck on a single stock.
A typical model is to assume that a proportion δ of the stock price is paid out at pre-determined times t₁, t₂, .... The
price of the stock is then modelled as a geometric Brownian motion multiplied by the factor (1 − δ)^{n(t)},
where n(t) is the number of dividends that have been paid by time t.
The price of a call option on such a stock is again
C = e^{−r(T−t)} [F N(d1) − K N(d2)],
where now
F = S₀ (1 − δ)^{n(T)} e^{r(T−t)}
is the forward price for the dividend-paying stock.
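The discrete-proportional-dividend adjustment again only changes the forward. A short illustrative sketch (helper name, dividend proportion, and dates are assumptions for the example):

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def call_discrete_prop_divs(S0, K, T, r, sigma, delta, div_times):
    """European call when a proportion `delta` of the stock price is paid
    out at each dividend date before expiry (illustrative sketch)."""
    n = sum(1 for t in div_times if t <= T)        # dividends paid by time T
    F = S0 * (1 - delta) ** n * exp(r * T)         # forward for the dividend payer
    d1 = (log(F / K) + 0.5 * sigma ** 2 * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return exp(-r * T) * (F * norm_cdf(d1) - K * norm_cdf(d2))

with_divs = call_discrete_prop_divs(100, 100, 1.0, 0.05, 0.20, 0.01, [0.25, 0.5, 0.75])
no_divs = call_discrete_prop_divs(100, 100, 1.0, 0.05, 0.20, 0.00, [])
print(no_divs, with_divs)  # dividends lower the call value
```

With δ = 0 the function reduces to the standard non-dividend formula.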
Black–Scholes in practice
[Figure: the normality assumption of the Black–Scholes model does not capture extreme movements such as stock market crashes.]
The Black–Scholes model disagrees with reality in a number of ways,
some significant. It is widely employed as a useful approximation, but
proper application requires understanding its limitations: blindly
following the model exposes the user to unexpected risk.
Among the most significant limitations are:
the underestimation of extreme moves, yielding tail risk, which can
be hedged with out-of-the-money options;
the assumption of instant, cost-less trading, yielding liquidity risk,
which is difficult to hedge;
the assumption of a stationary process, yielding volatility risk,
which can be hedged with volatility hedging;
the assumption of continuous time and continuous trading, yielding
gap risk, which can be hedged with Gamma hedging.
In short, while in the Black–Scholes model one can perfectly hedge
options by simply Delta hedging, in practice there are many other
sources of risk.
Results using the Black–Scholes model differ from real-world prices
because of the simplifying assumptions of the model. One significant
limitation is that in reality security prices do not follow a strict stationary log-normal process, nor is the risk-free
interest rate actually known (or constant over time). The variance has been observed to be non-constant, leading to
models such as GARCH to model volatility changes. Pricing discrepancies between empirical prices and the Black–Scholes
model have long been observed in options that are far out-of-the-money, corresponding to extreme price changes;
such events would be very rare if returns were log-normally distributed, but are observed much more often in
practice.
Nevertheless, Black–Scholes pricing is widely used in practice,[4] for it is easy to calculate and explicitly models the
relationship of all the variables. It is a useful approximation, particularly when analyzing the direction in which
prices move when crossing critical points. It is used both as a quoting convention and as a basis for more refined
models. Although volatility is not constant, results from the model are often useful in practice and helpful in setting
up hedges in the correct proportions to minimize risk. Even when the results are not completely accurate, they serve
as a first approximation to which adjustments can be made.
One reason for the popularity of the Black–Scholes model is that it is robust, in that it can be adjusted to deal with
some of its failures. Rather than considering some parameters (such as volatility or interest rates) as constant, one
considers them as variables, and thus added sources of risk. This is reflected in the Greeks (the change in option
value for a change in these parameters, or equivalently the partial derivatives with respect to these variables), and
hedging these Greeks mitigates the risk caused by the non-constant nature of these parameters. Other defects cannot
be mitigated by modifying the model, however, notably tail risk and liquidity risk, and these are instead managed
outside the model, chiefly by minimizing these risks and by stress testing.
Additionally, rather than assuming a volatility a priori and computing prices from it, one can use the model to solve
for volatility, which gives the implied volatility of an option at given prices, durations and exercise prices. Solving
for volatility over a given set of durations and strike prices one can construct an implied volatility surface. In this
application of the Black–Scholes model, a coordinate transformation from the price domain to the volatility domain
is obtained. Rather than quoting option prices in terms of dollars per unit (which are hard to compare across strikes
and tenors), option prices can thus be quoted in terms of implied volatility, which leads to trading of volatility in
option markets.
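Because the option price is monotonically increasing in volatility, implied volatility can be recovered by a simple root search. A bisection sketch (tolerances and bracket are illustrative choices):

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return norm_cdf(d1) * S - norm_cdf(d2) * K * exp(-r * T)

def implied_vol(price, S, K, T, r, lo=1e-6, hi=5.0, tol=1e-8):
    """Invert the Black-Scholes formula for sigma by bisection.

    Works because the call price is monotonically increasing in volatility."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

price = bs_call(100, 100, 1.0, 0.05, 0.20)
print(round(implied_vol(price, 100, 100, 1.0, 0.05), 6))  # recovers ~0.20
```

Repeating this inversion across a grid of strikes and maturities is exactly how an implied volatility surface is built from quoted prices.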
The volatility smile
One of the attractive features of the Black–Scholes model is that the parameters in the model (other than the
volatility) – the time to maturity, the strike, and the current underlying price – are unequivocally observable. All
other things being equal, an option's theoretical value is a monotonically increasing function of implied volatility. By
computing the implied volatility for traded options with different strikes and maturities, the Black–Scholes model
can be tested. If the Black–Scholes model held, then the implied volatility for a particular stock would be the same
for all strikes and maturities. In practice, the volatility surface (the three-dimensional graph of implied volatility
against strike and maturity) is not flat. The typical shape of the implied volatility curve for a given maturity depends
on the underlying instrument. Equities tend to have skewed curves: compared to at-the-money, implied volatility is
substantially higher for low strikes, and slightly lower for high strikes. Currencies tend to have more symmetrical
curves, with implied volatility lowest at-the-money, and higher volatilities in both wings. Commodities often have
the reverse behaviour to equities, with higher implied volatility for higher strikes.
Despite the existence of the volatility smile (and the violation of all the other assumptions of the Black–Scholes
model), the Black–Scholes PDE and Black–Scholes formula are still used extensively in practice. A typical approach
is to regard the volatility surface as a fact about the market, and use an implied volatility from it in a Black–Scholes
valuation model. This has been described as using "the wrong number in the wrong formula to get the right price."[5]
This approach also gives usable values for the hedge ratios (the Greeks).
Even when more advanced models are used, traders prefer to think in terms of volatility as it allows them to evaluate
and compare options of different maturities, strikes, and so on.
Valuing bond options
Black–Scholes cannot be applied directly to bond securities because of pull-to-par. As the bond reaches its maturity
date, all of the prices involved with the bond become known, thereby decreasing its volatility, and the simple
Black–Scholes model does not reflect this process. A large number of extensions to Black–Scholes, beginning with
the Black model, have been used to deal with this phenomenon.
Interest rate curve
In practice, interest rates are not constant – they vary by tenor, giving an interest rate curve which may be
interpolated to pick an appropriate rate to use in the Black–Scholes formula. Another consideration is that interest
rates vary over time. This volatility may make a significant contribution to the price, especially of long-dated
options, much as interest rate changes affect bond prices through their inverse relationship.
Short stock rate
It is not free to take a short stock position. Similarly, it may be possible to lend out a long stock position for a small
fee. In either case, this can be treated as a continuous dividend for the purposes of a Black–Scholes valuation.
Alternative formula derivation
Let S
0
be the current price of the underlying stock and S the price when the option matures at time T. Then S
0
is
known, but S is a random variable. Assume that
is a normal random variable with mean and variance . It follows that the mean of S is
for some constant q (independent of T). Now a simple no-arbitrage argument shows that the theoretical future value
of a derivative paying one share of the stock at time T, and so with payoff S, is
where r is the risk-free interest rate. This suggests making the identification q = r for the purpose of pricing
derivatives. Define the theoretical value of a derivative as the present value of the expected payoff in this sense. For
a call option with exercise price K this discounted expectation (using risk-neutral probabilities) is
The derivation of the formula for C is facilitated by the following lemma: Let Z be a standard normal random
variable and let b be an extended real number. Define
If a is a positive real number, then
where is the standard normal cumulative distribution function. In the special case b = , we have
Now let
and use the corollary to the lemma to verify the statement above about the mean of S. Define
BlackScholes
250
and observe that
for some b. Define
and observe that
The rest of the calculation is straightforward.
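The lemma can be checked numerically by integrating e^{az} against the standard normal density over z > b. A small sketch (a, b, and the quadrature settings are arbitrary illustrative choices):

```python
from math import sqrt, exp, pi, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

a, b = 1.0, 0.3
# Left-hand side: integrate e^{a z} * phi(z) over z > b (trapezoidal rule);
# the integrand is negligible well before z = b + 12.
dz, z_max = 1e-4, b + 12.0
n = int((z_max - b) / dz)
lhs = 0.0
for i in range(n + 1):
    z = b + i * dz
    w = 0.5 if i in (0, n) else 1.0
    lhs += w * exp(a * z) * exp(-z * z / 2) / sqrt(2 * pi) * dz

# Right-hand side: the closed form from the lemma.
rhs = exp(a * a / 2) * norm_cdf(a - b)
print(lhs, rhs)  # the two agree
```

The identity follows from completing the square in the exponent, which is also the step that produces the shift from d2 to d1 in the call formula.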
Although the "elementary" derivation leads to the correct result, it is incomplete, as it cannot explain why the
formula refers to the risk-free interest rate while a higher rate of return is expected from risky investments. This
limitation can be overcome using the risk-neutral probability measure, but the concept of risk-neutrality and the
related theory is far from elementary. In elementary terms, the value of the option today is not the expectation of the
value of the option at expiry, discounted with the risk-free rate. (So the basic capital asset pricing model (CAPM)
results are not violated.) The value is instead computed as the expectation, under another probability distribution, of the
value of the option at expiry, discounted with the risk-free rate. This other probability distribution is called the
"risk-neutral" probability.
Remarks on notation
The reader is warned of the inconsistent notation that appears in this article. Thus the letter S is used as:
(1) a constant denoting the current price of the stock
(2) a real variable denoting the price at an arbitrary time
(3) a random variable denoting the price at maturity
(4) a stochastic process denoting the price at an arbitrary time
It is also used in the meaning of (4) with a subscript denoting time, but here the subscript is merely a mnemonic.
In the partial derivatives, the letters in the numerators and denominators are, of course, real variables, and the partial
derivatives themselves are, initially, real functions of real variables. But after the substitution of a stochastic process
for one of the arguments they become stochastic processes.
The Black–Scholes PDE is, initially, a statement about the stochastic process S, but when S is reinterpreted as a real
variable, it becomes an ordinary PDE. It is only then that we can ask about its solution.
The parameter u that appears in the discrete-dividend model and the elementary derivation is not the same as the
parameter μ that appears elsewhere in the article. For the relationship between them see Geometric Brownian
motion.
Criticism
Espen Gaarder Haug and Nassim Nicholas Taleb argue that the Black–Scholes model merely recast existing widely
used models in terms of practically impossible "dynamic hedging" rather than "risk," to make them more compatible
with mainstream neoclassical economic theory.[6]
Jean-Philippe Bouchaud argues: 'Reliance on models based on incorrect axioms has clear and large effects. The
Black–Scholes model,[7] for example, which was invented in 1973 to price options, is still used extensively. But it
assumes that the probability of extreme price changes is negligible, when in reality, stock prices are much jerkier
than this. Twenty years ago, unwarranted use of the model spiralled into the worldwide October 1987 crash; the Dow
Jones index dropped 23% in a single day, dwarfing recent market hiccups.'
See also
Black model, a variant of the Black–Scholes option pricing model.
Binomial options model, which is a discrete numerical method for calculating option prices.
Monte Carlo option model, using simulation in the valuation of options with complicated features.
Financial mathematics, which contains a list of related articles.
Heat equation, to which the Black–Scholes PDE can be transformed.
Real options analysis
Black Shoals, a financial art piece
Stochastic volatility
Notes
[1] Nassim Taleb and Espen Gaarder Haug: "Why We Have Never Used the Black-Scholes-Merton Option Pricing Formula"
[2] Nobel prize foundation, 1997 Press release (http://nobelprize.org/nobel_prizes/economics/laureates/1997/press.html)
[3] Black, Fischer; Scholes, Myron. "The Pricing of Options and Corporate Liabilities". Journal of Political Economy 81 (3): 637–654.
[4] http://www.wilmott.com/blogs/paul/index.cfm/2008/4/29/Science-in-Finance-IX-In-defence-of-Black-Scholes-and-Merton
[5] R Rebonato: Volatility and correlation in the pricing of equity, FX and interest-rate options (1999)
[6] http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1012075
[7] Jean-Philippe Bouchaud (Capital Fund Management; physics professor at École Polytechnique): Economics needs a scientific revolution, Nature, Vol. 455, 30 Oct 2008, opinion essay, p. 1181
References
Primary references
Black, Fischer; Myron Scholes (1973). "The Pricing of Options and Corporate Liabilities". Journal of Political Economy 81 (3): 637–654. doi:10.1086/260062. (http://links.jstor.org/sici?sici=0022-3808(197305/06)81:3<637:TPOOAC>2.0.CO;2-P) (Black and Scholes' original paper.)
Merton, Robert C. (1973). "Theory of Rational Option Pricing" (http://jstor.org/stable/3003143). Bell Journal of Economics and Management Science (The RAND Corporation) 4 (1): 141–183. doi:10.2307/3003143. (http://links.jstor.org/sici?sici=0005-8556(197321)4:1<141:TOROP>2.0.CO;2-0&origin=repec)
Hull, John C. (1997). Options, Futures, and Other Derivatives. Prentice Hall. ISBN 0-13-601589-1.
Historical and sociological aspects
Bernstein, Peter (1992). Capital Ideas: The Improbable Origins of Modern Wall Street. The Free Press. ISBN 0-02-903012-9.
MacKenzie, Donald (2003). "An Equation and its Worlds: Bricolage, Exemplars, Disunity and Performativity in Financial Economics". Social Studies of Science 33 (6): 831–868. doi:10.1177/0306312703336002. (http://sss.sagepub.com/cgi/content/abstract/33/6/831)
MacKenzie, Donald; Yuval Millo (2003). "Constructing a Market, Performing Theory: The Historical Sociology of a Financial Derivatives Exchange". American Journal of Sociology 109 (1): 107–145. doi:10.1086/374404. (http://www.journals.uchicago.edu/AJS/journal/issues/v109n1/060259/brief/060259.abstract.html)
MacKenzie, Donald (2006). An Engine, not a Camera: How Financial Models Shape Markets. MIT Press. ISBN 0-262-13460-8.
Further reading
Haug, E. G. (2007). "Option Pricing and Hedging from Theory to Practice". Derivatives: Models on Models. Wiley. ISBN 978-0-470-01322-9. The book gives a series of historical references supporting the theory that option traders use much more robust hedging and pricing principles than the Black, Scholes and Merton model.
Triana, Pablo (2009). Lecturing Birds on Flying: Can Mathematical Theories Destroy the Financial Markets? Wiley. ISBN 978-0-470-40675-5. The book takes a critical look at the Black, Scholes and Merton model.
External links
Discussion of the model
Ajay Shah. Black, Merton and Scholes: Their work and its consequences. Economic and Political Weekly, XXXII(52):3337-3342, December 1997 (http://www.mayin.org/ajayshah/PDFDOCS/Shah1997_bms.pdf)
Inside Wall Street's Black Hole (http://www.portfolio.com/news-markets/national-news/portfolio/2008/02/19/Black-Scholes-Pricing-Model?print=true) by Michael Lewis, March 2008 issue of portfolio.com
Whither Black-Scholes? (http://www.forbes.com/opinions/2008/04/07/black-scholes-options-oped-cx_ptp_0408black.html) by Pablo Triana, April 2008 issue of Forbes.com
Derivation and solution
Proving the Black-Scholes Formula (http://knol.google.com/k/the-black-scholes-formula#)
The risk neutrality derivation of the Black-Scholes Equation (http://www.quantnotes.com/fundamentals/options/riskneutrality.htm), quantnotes.com
Arbitrage-free pricing derivation of the Black-Scholes Equation (http://www.quantnotes.com/fundamentals/options/black-scholes.htm), quantnotes.com, or an alternative treatment (http://www.sjsu.edu/faculty/watkins/blacksch.htm), Prof. Thayer Watkins
Solving the Black-Scholes Equation (http://www.quantnotes.com/fundamentals/options/solvingbs.htm), quantnotes.com
Solution of the Black–Scholes Equation Using the Green's Function (http://www.physics.uci.edu/~silverma/bseqn/bs/bs.html), Prof. Dennis Silverman
Solution via risk neutral pricing or via the PDE approach using Fourier transforms (http://homepages.nyu.edu/~sl1544/KnownClosedForms.pdf) (includes discussion of other option types), Simon Leger
Step-by-step solution of the Black-Scholes PDE (http://planetmath.org/encyclopedia/AnalyticSolutionOfBlackScholesPDE.html), planetmath.org
On the Black-Scholes Equation: Various Derivations (http://www.stanford.edu/~japrimbs/Publications/OnBlackScholesEq.pdf), Manabu Kishimoto
The Black-Scholes Equation (http://terrytao.wordpress.com/2008/07/01/the-black-scholes-equation/) Expository article by mathematician Terence Tao
Revisiting the model
Why We Have Never Used the Black-Scholes-Merton Option Pricing Formula (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1012075), Nassim Taleb and Espen Gaarder Haug
The illusions of dynamic replication (http://www.ederman.com/new/docs/qf-Illusions-dynamic.pdf), Emanuel Derman and Nassim Taleb
When You Cannot Hedge Continuously: The Corrections to Black-Scholes (http://www.ederman.com/new/docs/risk-non_continuous_hedge.pdf), Emanuel Derman
In defence of Black Scholes and Merton (http://www.wilmott.com/blogs/paul/index.cfm/2008/4/29/Science-in-Finance-IX-In-defence-of-Black-Scholes-and-Merton), Paul Wilmott
Computer implementations
Sourcecode
Black–Scholes in Multiple Languages (http://www.espenhaug.com/black_scholes.html), espenhaug.com
VBA sourcecode for Black Scholes and Greeks (http://www.global-derivatives.com/code/vba/BSEuro-Greeks.txt), global-derivatives.com
Real Time
Black-Scholes tutorial based on graphic simulations (http://www.optionanimation.com), Stephen Marlowe
Real-time calculator of Call and Put Option prices when the underlying follows a Mean-Reverting Geometric Brownian Motion (http://www.cba.ua.edu/~rpascala/revertGBM/BSOPMRForm.php)
Price and Hedge Simulation based on Black Scholes Option Price (http://www.exotic-options-and-hybrids.com/price-hedge-simulation.html)
Historical
Trillion Dollar Bet (http://www.pbs.org/wgbh/nova/stockmarket/) – Companion Web site to a Nova episode originally broadcast on February 8, 2000. "The film tells the fascinating story of the invention of the Black-Scholes Formula, a mathematical Holy Grail that forever altered the world of finance and earned its creators the 1997 Nobel Prize in Economics."
BBC Horizon (http://www.bbc.co.uk/science/horizon/1999/midas.shtml) A TV programme on the so-called Midas formula and the bankruptcy of Long-Term Capital Management (LTCM)
Black model
The Black model (sometimes known as the Black-76 model) is a variant of the Black–Scholes option pricing model.
Its primary applications are for pricing bond options, interest rate caps/floors, and swaptions. It was first presented
in a paper written by Fischer Black in 1976.
Black's model can be generalized into a class of models known as log-normal forward models, also referred to as
the LIBOR market model.
The Black formula
The Black formula is similar to the Black–Scholes formula for valuing stock options, except that the spot price of the
underlying is replaced by the forward price F.
The Black formula for a European call option on an underlying struck at K, expiring T years in the future, is
c = e^{−rT} [F N(d1) − K N(d2)].
The put price is
p = e^{−rT} [K N(−d2) − F N(−d1)],
where
d1 = [ln(F/K) + (σ²/2) T] / (σ √T), d2 = d1 − σ √T,
r is the risk-free discount rate, continuously compounded,
and N(·) is the cumulative normal distribution function.
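A minimal sketch of the Black-76 formula in code (function and parameter names are illustrative); note that when the forward equals the strike, the call and put have the same value:

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black76(F, K, T, r, sigma, kind="call"):
    """Black (1976) price of a European option on a forward price F."""
    d1 = (log(F / K) + 0.5 * sigma ** 2 * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    if kind == "call":
        return exp(-r * T) * (F * norm_cdf(d1) - K * norm_cdf(d2))
    return exp(-r * T) * (K * norm_cdf(-d2) - F * norm_cdf(-d1))

c = black76(100, 100, 1.0, 0.05, 0.20, "call")
p = black76(100, 100, 1.0, 0.05, 0.20, "put")
print(round(c, 4), round(p, 4))  # at-the-money forward: call == put
```

The call–put symmetry at F = K follows from the parity relation c − p = e^{−rT}(F − K).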
Derivation and assumptions
The derivation of the pricing formulas in the model follows that of the Black-Scholes model almost exactly. The
assumption that the spot price follows a log-normal process is replaced by the assumption that the forward price at
maturity of the option is log-normally distributed. From there the derivation is identical and so the final formula is
the same except that the spot price is replaced by the forward - the forward price represents the undiscounted
expected future value.
See also
Financial mathematics
Black-Scholes
External links
'Greeks' Calculator using the Black model [1], Razvan Pascalau, Univ. of Alabama
References
Black, Fischer (1976). The pricing of commodity contracts, Journal of Financial Economics, 3, 167-179.
Garman, Mark B. and Steven W. Kohlhagen (1983). Foreign currency option values, Journal of International
Money and Finance, 2, 231-237.
Miltersen, K., Sandmann, K. and Sondermann, D. (1997). "Closed Form Solutions for Term Structure Derivatives with Log-Normal Interest Rates", Journal of Finance, 52(1), 409–430.
References
[1] http://www.cba.ua.edu/~rpascala/greeks2/GFOPMForm.php
Binomial options pricing model
In finance, the binomial options pricing model (BOPM) provides a generalizable numerical method for the
valuation of options. The binomial model was first proposed by Cox, Ross and Rubinstein (1979). Essentially, the
model uses a "discrete-time" (lattice based) model of the varying price over time of the underlying financial
instrument.
Use of the model
The Binomial options pricing model approach is widely used as it is able to handle a variety of conditions for which
other models cannot easily be applied. This is largely because the BOPM is based on the description of an
underlying instrument over a period of time rather than a single point. As a consequence, it is used to value
American options that are exercisable at any time in a given interval as well as Bermudan options that are
exercisable at specific instances of time. Being relatively simple, the model is readily implementable in computer
software (including a spreadsheet).
Although computationally slower than the Black-Scholes formula, it is more accurate, particularly for longer-dated
options on securities with dividend payments. For these reasons, various versions of the binomial model are widely
used by practitioners in the options markets.
For options with several sources of uncertainty (e.g., real options) and for options with complicated features (e.g.,
Asian options), binomial methods are less practical due to several difficulties, and Monte Carlo option models are
commonly used instead. Monte Carlo simulation is computationally time-consuming, however (cf. Monte Carlo
methods in finance).
Methodology
The binomial pricing model traces the evolution of the option's key underlying variables in discrete-time. This is
done by means of a binomial lattice (tree), for a number of time steps between the valuation and expiration dates.
Each node in the lattice represents a possible price of the underlying at a given point in time.
Valuation is performed iteratively, starting at each of the final nodes (those that may be reached at the time of
expiration), and then working backwards through the tree towards the first node (valuation date). The value
computed at each stage is the value of the option at that point in time.
Option valuation using this method is, as described, a three-step process:
1. price tree generation,
2. calculation of option value at each final node,
3. sequential calculation of the option value at each preceding node.
STEP 1: Create the binomial price tree
The tree of prices is produced by working forward from valuation date to expiration.
At each step, it is assumed that the underlying instrument will move up or down by a specific factor (u or d) per
step of the tree (where, by definition, u ≥ 1 and 0 < d ≤ 1). So, if S is the current price, then in the next
period the price will either be S·u or S·d.
The up and down factors are calculated using the underlying volatility, σ, and the time duration of a step, Δt,
measured in years (using the day count convention of the underlying instrument). From the condition that the
variance of the log of the price is σ²Δt, we have:
u = e^{σ√Δt}, d = e^{−σ√Δt} = 1/u.
The above is the original Cox, Ross, & Rubinstein (CRR) method; there are other techniques for generating the
lattice, such as "the equal probabilities" tree. The Trinomial tree is a similar model, allowing for an up, down or
stable path.
The CRR method ensures that the tree is recombinant, i.e. if the underlying asset moves up and then down (u,d), the
price will be the same as if it had moved down and then up (d,u); here the two paths merge or recombine. This
property reduces the number of tree nodes, and thus accelerates the computation of the option price.
This property also allows the value of the underlying asset at each node to be calculated directly via formula, without requiring that the tree be built first. The node-value will be:
S_n = S_0 × u^(Nu − Nd)
where:
Nu is the number of up ticks, and
Nd is the number of down ticks.
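As a concrete sketch of the two formulas above (the Python function names are illustrative, not from any standard library):

```python
import math

def crr_factors(sigma, dt):
    """CRR up/down factors: u = exp(sigma*sqrt(dt)), d = 1/u."""
    u = math.exp(sigma * math.sqrt(dt))
    return u, 1.0 / u

def node_price(s0, u, n_up, n_down):
    """Price at a node reached via n_up up ticks and n_down down ticks.
    Because d = 1/u, only the net number of up moves matters."""
    return s0 * u ** (n_up - n_down)

u, d = crr_factors(0.2, 1.0 / 12)    # 20% volatility, monthly steps
print(round(u * d, 10))              # → 1.0 (the tree recombines)
print(round(node_price(100.0, u, 3, 1), 4))  # ≈ 112.24
```

Because u·d = 1, a node's price depends only on Nu − Nd, which is exactly what makes the lattice recombine.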
STEP 2: Find Option value at each final node
At each final node of the tree, i.e. at expiration of the option, the option value is simply its intrinsic, or exercise, value:
Max[(S − K), 0], for a call option
Max[(K − S), 0], for a put option
where K is the strike price and S is the spot price of the underlying asset.
STEP 3: Find Option value at earlier nodes
Once the above step is complete, the option value is then found for each node, starting at the penultimate time step,
and working back to the first node of the tree (the valuation date) where the calculated result is the value of the
option.
In overview: the "binomial value" is found at each node, using the risk neutrality assumption; see Risk neutral
valuation. If exercise is permitted at the node, then the model takes the greater of binomial and exercise value at the
node.
The steps are as follows:
1) Under the risk neutrality assumption, today's fair price of a derivative is equal to the expected value of its future
payoff discounted by the risk free rate. Therefore, expected value is calculated using the option values from the later
two nodes (Option up and Option down) weighted by their respective probabilities -- "probability" p of an up move
in the underlying, and "probability" (1-p) of a down move. The expected value is then discounted at r, the risk free
rate corresponding to the life of the option.
The following formula to compute the expectation value is applied at each node:
Binomial Value = [p × Option up + (1 − p) × Option down] × exp(−r × Δt), or
C(t−Δt, i) = e^(−r Δt) × [p × C(t, i+1) + (1 − p) × C(t, i)]
where
C(t, i) is the option's value for the i-th node at time t,
p = (e^((r−q) Δt) − d) / (u − d) is chosen such that the related Binomial distribution simulates the geometric Brownian motion of the underlying stock with parameters r and σ, and
q is the dividend yield of the underlying corresponding to the life of the option. It follows that in a risk-neutral world the futures price should have an expected growth rate of zero and therefore we can consider q = r for futures.
(Note that the alternative valuation approach, arbitrage-free pricing, yields identical results; see
"delta-hedging".)
2) This result is the "Binomial Value". It represents the fair price of the derivative at a particular point in time (i.e. at
each node), given the evolution in the price of the underlying to that point. It is the value of the option if it were to be
held as opposed to exercised at that point.
3) Depending on the style of the option, evaluate the possibility of early exercise at each node: if (1) the option can
be exercised, and (2) the exercise value exceeds the Binomial Value, then (3) the value at the node is the exercise
value.
For a European option, there is no option of early exercise, and the binomial value applies at all nodes.
For an American option, since the option may either be held or exercised prior to expiry, the value at each node
is: Max (Binomial Value, Exercise Value).
For a Bermudan option, the value at nodes where early exercise is allowed is: Max (Binomial Value, Exercise
Value); at nodes where early exercise is not allowed, only the binomial value applies.
In calculating the value at the next time step, i.e. one step closer to the valuation date, the model must use the value selected here for "Option up" or "Option down", as appropriate, in the formula at that node.
The following algorithm demonstrates the approach, computing the price of an American put option, although it is easily generalised for calls and for European and Bermudan options:
function americanPut(T, S, K, r, sigma, q, n) {
    ' T...     expiration time
    ' S...     stock price
    ' K...     strike price
    ' r...     risk-free interest rate
    ' sigma... volatility of the underlying
    ' q...     dividend yield
    ' n...     height of the binomial tree
    deltaT := T / n;
    up := exp(sigma * sqrt(deltaT));
    p0 := (up * exp(-r * deltaT) - exp(-q * deltaT)) * up / (up^2 - 1);  ' discounted probability of a down move
    p1 := exp(-r * deltaT) - p0;                                        ' discounted probability of an up move
    for i := 0 to n {                        ' initial values at time T
        p(i) := K - S * up^(2*i - n);
        if p(i) < 0 then p(i) := 0;
    }
    for j := n-1 to 0 step -1 {              ' move to earlier times
        for i := 0 to j {
            p(i) := p0 * p(i) + p1 * p(i+1); ' binomial value
            exercise := K - S * up^(2*i - j); ' exercise value
            if p(i) < exercise then p(i) := exercise;
        }
    }
    return americanPut := p(0);
}
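Translated into Python, the pseudocode above can be sketched as follows (the function name and the numerical example are mine; p0 and p1 are the discounted probabilities of a down move and an up move, respectively):

```python
import math

def american_put(T, S, K, r, sigma, q, n):
    """American put on an n-step CRR binomial tree, following the
    pseudocode above: q is a continuous dividend yield; p0/p1 are the
    discounted down-/up-move probabilities."""
    dt = T / n
    up = math.exp(sigma * math.sqrt(dt))
    p0 = (up * math.exp(-r * dt) - math.exp(-q * dt)) * up / (up ** 2 - 1)
    p1 = math.exp(-r * dt) - p0
    # exercise values at expiration
    p = [max(K - S * up ** (2 * i - n), 0.0) for i in range(n + 1)]
    # walk backwards through the tree
    for j in range(n - 1, -1, -1):
        for i in range(j + 1):
            binomial = p0 * p[i] + p1 * p[i + 1]   # hold (binomial) value
            exercise = K - S * up ** (2 * i - j)   # early-exercise value
            p[i] = max(binomial, exercise)
    return p[0]

print(round(american_put(1.0, 100.0, 100.0, 0.05, 0.2, 0.0, 200), 4))
```

For an at-the-money put this gives a value slightly above the corresponding European (Black-Scholes) put, reflecting the early-exercise premium.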
Discrete dividends
In practice, the use of continuous dividend yield, , in the formula above can lead to significant mis-pricing of the
option near an ex-dividend date. Instead, it is common to model dividends as discrete payments on the anticipated
future ex-dividend dates.
To model discrete dividend payments in the binomial model, apply the following rule: at each time step i, calculate D_i, the present value of all dividends with ex-dividend dates after that step, and subtract this value from the security price at each node at that step.
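As an illustrative sketch of that rule (the function and the single-dividend numbers are assumptions, not from the article), the present-value adjustment for one node might look like:

```python
import math

def dividend_adjusted_price(node_price, r, step_time, dividends):
    """Subtract the present value, as of step_time, of all dividends whose
    ex-dividend dates fall after step_time.
    dividends: list of (ex_date_in_years, cash_amount) pairs."""
    pv = sum(amount * math.exp(-r * (t - step_time))
             for t, amount in dividends if t > step_time)
    return node_price - pv

# a $2 dividend going ex at t = 0.5y, seen from a node at t = 0.25y, r = 4%
print(round(dividend_adjusted_price(100.0, 0.04, 0.25, [(0.5, 2.0)]), 4))  # → 98.0199
```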
Relationship with Black-Scholes
Similar assumptions underpin both the binomial model and the Black-Scholes model, and the binomial model thus
provides a discrete time approximation to the continuous process underlying the Black-Scholes model. In fact, for
European options without dividends, the binomial model value converges on the Black-Scholes formula value as the
number of time steps increases. The binomial model assumes that movements in the price follow a binomial
distribution; for many trials, this binomial distribution approaches the normal distribution assumed by
Black-Scholes. In addition, when analyzed as a numerical procedure, the CRR binomial method can be viewed as a special case of the explicit finite difference method for the Black-Scholes PDE.
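This convergence is easy to check numerically. The sketch below (an illustration, not from the article) prices a European call on a CRR tree and compares it with the closed-form Black-Scholes value; the gap shrinks as the number of steps grows:

```python
import math

def bs_call(S, K, T, r, sigma):
    """Closed-form Black-Scholes price of a European call (no dividends)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))  # standard normal CDF
    return S * N(d1) - K * math.exp(-r * T) * N(d2)

def crr_call(S, K, T, r, sigma, n):
    """European call on an n-step CRR tree (backward induction, no early exercise)."""
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    p = (math.exp(r * dt) - d) / (u - d)   # risk-neutral up probability
    disc = math.exp(-r * dt)
    v = [max(S * u ** (2 * i - n) - K, 0.0) for i in range(n + 1)]
    for j in range(n - 1, -1, -1):
        v = [disc * ((1 - p) * v[i] + p * v[i + 1]) for i in range(j + 1)]
    return v[0]

bs = bs_call(100, 100, 1.0, 0.05, 0.2)   # about 10.45
for n in (10, 100, 1000):
    print(n, round(abs(crr_call(100, 100, 1.0, 0.05, 0.2, n) - bs), 4))
```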
See also
Trinomial tree - a similar model with three possible paths per node.
Black-Scholes: binomial lattices are able to handle a variety of conditions for which Black-Scholes cannot be
applied.
Monte Carlo option model, used in the valuation of options with complicated features that make them difficult to
value through other methods.
Real options analysis - where the BOPM is widely used.
Mathematical finance, which has a list of related articles.
Stochastic Volatility
References
Cox, John C., Stephen A. Ross, and Mark Rubinstein. 1979. "Option Pricing: A Simplified Approach." Journal of
Financial Economics 7: 229-263.[1]
Richard J. Rendleman, Jr. and Brit J. Bartter. 1979. "Two-State Option Pricing". Journal of Finance 24:
1093-1110. doi:10.2307/2327237
External links
Discussion
The Binomial Model for Pricing Options [2], Prof. Thayer Watkins
Using The Binomial Model to Price Derivatives [3], Quantnotes
Binomial Method (Cox, Ross, Rubinstein) [4], global-derivatives.com
Binomial Option Pricing [5] (PDF), Prof. Robert M. Conroy
The Binomial Option Pricing Model [6], Simon Benninga and Zvi Wiener
Options pricing using a binomial lattice [7], The Investment Analysts Society of Southern Africa
Convergence of the Binomial to the Black-Scholes Model [8] (PDF, 143KB), Prof. Don M. Chance
Some notes on the Cox-Ross-Rubinstein binomial model for pricing an option [9], Prof. Rob Thompson
Binomial Option Pricing Model [10], by Fiona Maclachlan, The Wolfram Demonstrations Project
Variations
American and Bermudan options
American Options and Lattice Model Pricing [11], Quantnotes
Pricing Bermudan Options [12], umanitoba.ca
Option Pricing: Extending the Basic Binomial Model [13], Rich Tanenbaum
Other tree structures
A Synthesis of Binomial Option Pricing Models for Lognormally Distributed Assets [14], Don M. Chance
Binomial and Trinomial Trees - overview [15], The Quant Equation Archive, sitmo.com
Fixed income derivatives
Binomial Pricing of Interest Rate Derivatives [16] (PDF, 76.3KB), Don M. Chance
Binomial Models for Fixed Income Analytics [17], David Backus
Binomial Term Structure Models [18], Simon Benninga and Zvi Wiener
Computer Implementations
Spreadsheets
Binomial Options Pricing Spreadsheet [19], Peter Ekman
American Options - Binomial Method [20], global-derivatives.com
European Options - Binomial Method [21], global-derivatives.com
Online
European and American Option Trees [22], Jan-Petter Janssen
Programming Languages
C [23]
Fortran [24]
Mathematica [6]
S-Plus [24]
References
[1] http://www.in-the-money.com/artandpap/Option%20Pricing%20-%20A%20Simplified%20Approach.doc
[2] http://www.sjsu.edu/faculty/watkins/binomial.htm
[3] http://www.quantnotes.com/fundamentals/options/binomial.htm
[4] http://www.global-derivatives.com/options/european-options.php#binomial
[5] http://faculty.darden.virginia.edu/conroyb/derivatives/Binomial%20Option%20Pricing%20_f-0943_.pdf
[6] http://finance.wharton.upenn.edu/~benninga/mma/MiER63.pdf
[7] http://www.iassa.co.za/articles/048_1998_05.pdf
[8] http://www.bus.lsu.edu/academics/finance/faculty/dchance/Instructional/TN00-08.pdf
[9] http://math.hunter.cuny.edu/thompson/math778/CRRmodel/
[10] http://demonstrations.wolfram.com/BinomialOptionPricingModel/
[11] http://www.quantnotes.com/fundamentals/options/americanoptions.htm
[12] http://www.cs.umanitoba.ca/~tulsi/Final/May27/XiaoLiuFinal.ps.pdf
[13] http://www.savvysoft.com/display_whitepaper.cgi?class=whitepaper&doc=advanced.htm
[14] http://www.bus.lsu.edu/academics/finance/faculty/dchance/Research/ASynthesisofBinomialOptionPricingModels.pdf
[15] http://www.sitmo.com/eqcat/14
[16] http://www.bus.lsu.edu/academics/finance/faculty/dchance/Instructional/TN97-14.pdf
[17] http://pages.stern.nyu.edu/~dbackus/3176/adlec3.pdf
[18] http://finance.wharton.upenn.edu/~benninga/mma/MiER73.pdf
[19] http://www.disklectures.com/freebies.php
[20] http://www.global-derivatives.com/xls/American-Binomial.xls
[21] http://www.global-derivatives.com/xls/European-Binomial.xls
[22] http://developingtrader.com/binomial.php
[23] http://home.comcast.net/~laghan3/options.htm
[24] http://www.stat.rice.edu/~dobelman/download/bsopm.html
Monte Carlo option model
In mathematical finance, a Monte Carlo option model uses Monte Carlo methods to calculate the value of an option with multiple sources of uncertainty or with complicated features.[1]
The term 'Monte Carlo method' was coined by Stanislaw Ulam in the 1940s. The first application to option pricing was by Phelim Boyle in 1977 (for European options). In 1996, M. Broadie and P. Glasserman showed how to price Asian options by Monte Carlo. In 2001 F. A. Longstaff and E. S. Schwartz developed a practical Monte Carlo method for pricing American-style options.
Methodology
In terms of theory, Monte Carlo valuation relies on risk neutral valuation.[1] Here the price of the option is its discounted expected value; see risk neutrality and Rational pricing: Risk Neutral Valuation. The technique applied then, is (1) to generate several thousand possible (but random) price paths for the underlying (or underlyings) via simulation, and (2) to then calculate the associated exercise value (i.e. "payoff") of the option for each path. (3) These payoffs are then averaged and (4) discounted to today. This result is the value of the option.[2]
This approach, although relatively straightforward, allows for increasing complexity:
An option on equity may be modelled with one source of uncertainty: the price of the underlying stock in question.[2] Here the price of the underlying instrument is usually modelled such that it follows a geometric Brownian motion with constant drift μ and volatility σ. So: ΔS = S × (μ Δt + σ ε √Δt), where ε is found via a random sampling from a normal distribution; see further under Black-Scholes. (Since the underlying random process is the same, for enough price paths, the value of a European option here should be the same as under Black-Scholes.)
In other cases, the source of uncertainty may be at a remove. For example, for bond options[3] the underlying is a bond, but the source of uncertainty is the annualized interest rate (i.e. the short rate). Here, for each randomly generated yield curve we observe a different resultant bond price on the option's exercise date; this bond price is then the input for the determination of the option's payoff. The same approach is used in valuing swaptions,[4] where the value of the underlying swap is also a function of the evolving interest rate. For the models used to simulate the interest rate see further under Short-rate model.
Monte Carlo Methods allow for a compounding in the uncertainty.[5] For example, where the underlying is denominated in a foreign currency, an additional source of uncertainty will be the exchange rate: the underlying price and the exchange rate must be separately simulated and then combined to determine the value of the underlying in the local currency. In all such models, correlation between the underlying sources of risk is also incorporated; see Cholesky decomposition: Monte Carlo simulation. Further complications, such as the impact of commodity prices or inflation on the underlying, can also be introduced. Since simulation can accommodate complex problems of this sort, it is often used in analysing real options,[1] where management's decision at any point is a function of multiple underlying variables.
Simulation can be used to value options where the payoff depends on the value of multiple underlying assets,[6] such as a Basket option or Rainbow option. Here, correlation between assets is similarly incorporated.
As required, the stochastic process of the underlying(s) may be specified so as to exhibit jumps or mean reversion or both.[7] Further, some models even allow for (randomly) varying statistical (and other) parameters of the sources of uncertainty. For example, in models incorporating stochastic volatility, the volatility of the underlying changes with time; see Heston model.
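For the simplest case above (one equity underlying, no path dependence), the whole technique can be sketched in a few lines. This is an illustration under the stated GBM assumptions, not production code; for a European payoff only the terminal price matters, so each "path" collapses to a single lognormal draw:

```python
import math
import random

def mc_european_call(S, K, T, r, sigma, n_paths, seed=42):
    """Monte Carlo price of a European call under risk-neutral GBM:
    simulate terminal prices, average the payoffs, discount to today."""
    random.seed(seed)
    total = 0.0
    for _ in range(n_paths):
        z = random.gauss(0.0, 1.0)   # random sample from a normal distribution
        ST = S * math.exp((r - 0.5 * sigma ** 2) * T + sigma * math.sqrt(T) * z)
        total += max(ST - K, 0.0)
    return math.exp(-r * T) * total / n_paths

print(round(mc_european_call(100, 100, 1.0, 0.05, 0.2, 100_000), 2))
```

With enough paths the result lands close to the Black-Scholes value (roughly 10.45 for these inputs), consistent with the remark above about European options.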
Application
As can be seen, Monte Carlo Methods are particularly useful in the valuation of options with multiple sources of uncertainty or with complicated features which would make them difficult to value through a straightforward Black-Scholes style computation. The technique is thus widely used in valuing Asian options[8] and in real options analysis.[1][5]
Conversely, however, if an analytical technique for valuing the option exists - or even a numeric technique, such as a (modified) pricing tree[8] - Monte Carlo methods will usually be too slow to be competitive. They are, in a sense, a method of last resort;[8] see further under Monte Carlo methods in finance. With faster computing capability this computational constraint is less of a concern.
References
Notes
[1] Marco Dias: Real Options with Monte Carlo Simulation (http://www.puc-rio.br/marco.ind/faq4.html)
[2] Don Chance: Teaching Note 96-03: Monte Carlo Simulation (http://www.bus.lsu.edu/academics/finance/faculty/dchance/Instructional/TN96-03.pdf)
[3] Peter Carr and Guang Yang: Simulating American Bond Options in an HJM Framework (http://www.math.nyu.edu/research/carrp/papers/pdf/hjm.pdf)
[4] Carlos Blanco, Josh Gray and Marc Hazzard: Alternative Valuation Methods for Swaptions: The Devil is in the Details (http://www.fea.com/resources/pdf/swaptions.pdf)
[5] Gonzalo Cortazar, Miguel Gravet and Jorge Urzua: The valuation of multidimensional American real options using the LSM simulation method (http://www.realoptions.org/papers2005/Cortazar_GU052RealOptionsParis.pdf)
[6] global-derivatives.com: Basket Options - Simulation (http://www.global-derivatives.com/index.php?option=com_content&task=view&id=26#MCS)
[7] Les Clewlow, Chris Strickland and Vince Kaminski: Extending mean-reversion jump diffusion (http://www.erasmusenergy.com/downloadattachment.php?aId=4b0d2207d4169ee155591c70efa19c63&articleId=139)
[8] Rich Tanenbaum: Battle of the Pricing Models: Trees vs Monte Carlo (http://www.savvysoft.com/treevsmontecarlo.htm)
Articles
Boyle, Phelim P., Options: A Monte Carlo Approach (http://ideas.repec.org/a/eee/jfinec/v4y1977i3p323-338.html). Journal of Financial Economics 4, (1977) 323-338
Broadie, M. and P. Glasserman, Estimating Security Price Derivatives Using Simulation (http://www.columbia.edu/~mnb2/broadie/Assets/bg_ms_1996.pdf), Management Science, 42, (1996) 269-285
Longstaff F.A. and E.S. Schwartz, Valuing American options by simulation: a simple least squares approach (http://repositories.cdlib.org/anderson/fin/1-01/), Review of Financial Studies 14 (2001), 113-148
Resources
Books
Don L. McLeish, Monte Carlo Simulation & Finance (2005) ISBN 0-471-67778-7
Christian P. Robert, George Casella, Monte Carlo Statistical Methods (2005) ISBN 0-387-21239-6
Software
Fairmat (freeware) modeling and pricing complex options
MG Soft (http://www.mgsoft.ru/en/products_options_calculator.aspx) (freeware) valuation and Greeks of vanilla and exotic options
External links
Monte Carlo Simulation (http://www.bus.lsu.edu/academics/finance/faculty/dchance/Instructional/TN96-03.pdf), Prof. Don M. Chance, Louisiana State University
Pricing complex options using a simple Monte Carlo Simulation (http://www.quantnotes.com/publications/papers/Fink-montecarlo.pdf), Peter Fink (reprint at quantnotes.com)
MonteCarlo Simulation in Finance (http://www.global-derivatives.com/maths/k-o.php), global-derivatives.com
Monte Carlo Derivative valuation (http://spears.okstate.edu/home/tlk/legacy/fin5883/notes6_s05.doc), contd. (http://spears.okstate.edu/home/tlk/legacy/fin5883/notes7_s05.doc), Timothy L. Krehbiel, Oklahoma State University-Stillwater
Applications of Monte Carlo Methods in Finance: Option Pricing (http://www.smartquant.com/references/MonteCarlo/mc6.pdf), Y. Lai and J. Spanier, Claremont Graduate University
Option pricing by simulation (http://finance-old.bi.no/~bernt/gcc_prog/recipes/recipes/node12.html), Bernt Arne Ødegaard, Norwegian School of Management
The Longstaff-Schwartz algorithm for American options (http://www.princeton.edu/~rluss/orf557/MC_American_Opt_Pricing.pptx), Astrid Prajogo, Princeton University
Monte Carlo Method (http://www.riskglossary.com/link/monte_carlo_method.htm), riskglossary.com
Volatility smile
In finance, the volatility smile is a long-observed pattern in which at-the-money options tend to have lower implied volatilities than in- or out-of-the-money options. The pattern displays different characteristics for different markets and results from the probability of extreme moves. Equity options traded in American markets did not show a volatility smile before the Crash of 1987 but began showing one afterwards.[1]
Modelling the volatility smile is an active area of research in quantitative finance. Typically, a quantitative analyst
will calculate the implied volatility from liquid vanilla options and use models of the smile to calculate the price of
more complex exotic options.
A closely related concept is that of term structure of volatility, which refers to how implied volatility differs for
related options with different maturities. An implied volatility surface is a 3-D plot that combines volatility smile
and term structure of volatility into a consolidated view of all options for an underlier.
Volatility smiles and implied volatility
In the Black-Scholes model, the theoretical value of a vanilla option is a monotonic increasing function of the
Black-Scholes volatility. Furthermore, except in the case of American options with dividends whose early exercise
could be optimal, the price is a strictly increasing function of volatility. This means it is usually possible to compute
a unique implied volatility from a given market price for an option. This implied volatility is best regarded as a
rescaling of option prices which makes comparisons between different strikes, expirations, and underlyings easier
and more intuitive.
When implied volatility is plotted against strike price, the resulting graph is typically downward sloping for equity
markets, or valley-shaped for currency markets. For markets where the graph is downward sloping, such as for
equity options, the term "volatility skew" is often used. For other markets, such as FX options or equity index
options, where the typical graph turns up at either end, the more familiar term "volatility smile" is used. For
example, the implied volatility for upside (i.e. high strike) equity options is typically lower than for at-the-money
equity options. However, the implied volatilities of options on foreign exchange contracts tend to rise in both the
downside and upside directions. In equity markets, a small tilted smile is often observed near the money as a kink in
the general downward sloping implicit volatility graph. Sometimes the term "smirk" is used to describe a skewed
smile.
Market practitioners use the term implied volatility to indicate the volatility parameter for an ATM (at-the-money) option. Adjustments to this value are undertaken by incorporating the values of risk reversals and flys (skews) to determine the actual volatility measure that may be used for options with a delta which is not 50:
Call_x = ATM_x + 0.5 RR_x + Fly_x
Put_x = ATM_x - 0.5 RR_x + Fly_x
A risk reversal is generally quoted as an X% delta risk reversal and is essentially long an X% delta call and short an X% delta put. A butterfly, on the other hand, is a Y% delta fly, which means long a Y% delta call, long a Y% delta put, and short an ATM call.
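That quoting convention can be sketched as follows (the numbers are purely illustrative; atm, rr and fly stand for the quoted ATM volatility, risk reversal and fly for one delta level x):

```python
def smile_vols(atm, rr, fly):
    """Recover call- and put-wing implied vols for one delta level x from
    ATM, risk-reversal and fly quotes:
    call = ATM + 0.5*RR + Fly,  put = ATM - 0.5*RR + Fly."""
    return atm + 0.5 * rr + fly, atm - 0.5 * rr + fly

# e.g. 25-delta quotes: ATM 10%, RR -1.5% (puts over calls), Fly 0.4%
call_vol, put_vol = smile_vols(0.10, -0.015, 0.004)
print(round(call_vol, 4), round(put_vol, 4))  # → 0.0965 0.1115
```

A negative risk reversal, as is typical in equity markets, lowers the call wing and raises the put wing, producing the downward-sloping skew described earlier.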
Implied volatility and historical volatility
It is helpful to note that implied volatility is related to historical volatility, but the two are distinct. Historical volatility is a direct measure of the movement of the underlier's price (realized volatility) over recent history (e.g. a trailing 21-day period). Implied volatility, in contrast, is set by the market price of the derivative contract itself, and not the underlier. Therefore, different derivative contracts on the same underlier have different implied volatilities. For instance, the IBM call option, struck at $100 and expiring in 6 months, may have an implied volatility of 18%, while the put option struck at $105 and expiring in 1 month may have an implied volatility of 21%. At the same time, the historical volatility for IBM for the previous 21-day period might be 17% (all volatilities are expressed in annualized percentage moves).
Term structure of volatility
For options of different maturities, we also see characteristic differences in implied volatility. However, in this case,
the dominant effect is related to the market's implied impact of upcoming events. For instance, it is well-observed
that realized volatility for stock prices rises significantly on the day that a company reports its earnings.
Correspondingly, we see that implied volatility for options will rise during the period prior to the earnings
announcement, and then fall again as soon as the stock price absorbs the new information. Options that mature
earlier exhibit a larger swing in implied volatility than options with longer maturities.
Other option markets show other behavior. For instance, options on commodity futures typically show increased
implied volatility just prior to the announcement of harvest forecasts. Options on US Treasury Bill futures show
increased implied volatility just prior to meetings of the Federal Reserve Board (when changes in short-term interest
rates are announced).
The market incorporates many other types of events into the term structure of volatility. For instance, the impact of
upcoming results of a drug trial can cause implied volatility swings for pharmaceutical stocks. The anticipated
resolution date of patent litigation can impact technology stocks, etc.
Volatility term structures list the relationship between implied volatilities and time to expiration. The term structures
provide another method for traders to gauge cheap or expensive options.
Implied volatility surface
It is often useful to plot implied volatility as a function of both strike price and time to maturity. The result is a curved surface whereby the current market implied volatility (Z-axis) for all options on the underlier is plotted against strike price and time to maturity (X and Y axes).
The implied volatility surface simultaneously shows both volatility smile and term structure of volatility. Option
traders use an implied volatility plot to quickly determine the shape of the implied volatility surface, and to identify
any areas where the slope of the plot (and therefore relative implied volatilities) seems out of line.
The graph shows an implied volatility surface for all the call options on a particular underlying stock price. The
Z-axis represents implied volatility in percent, and X and Y axes represent the option delta, and the days to maturity.
Note that to maintain put-call parity, a 20 delta put must have the same implied volatility as an 80 delta call. For this
surface, we can see that the underlying symbol has both volatility skew (a tilt along the delta axis), as well as a
volatility term structure indicating an anticipated event in the near future.
Evolution: Sticky
An implied volatility surface is static: it describes the implied volatilities at a given moment in time. How the surface
changes over time (especially as spot changes) is called the evolution of the implied volatility surface.
Common heuristics include:
"sticky strike" (or "sticky-by-strike", or "stick-to-strike"): if spot changes, the implied volatility of an option with
a given absolute strike does not change.
"sticky moneyness" (aka, "sticky delta"; see moneyness for why these are equivalent terms): if spot changes, the
implied volatility of an option with a given moneyness does not change.
So if spot moves from $100 to $120, sticky strike would predict that the implied volatility of a $120 strike option
would be whatever it was before the move (though it has moved from being OTM to ATM), while sticky delta would
predict that the implied volatility of the $120 strike option would be whatever the $100 strike option's implied
volatility was before the move (as these are both ATM at the time).
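The two heuristics can be made concrete with a toy skew (the linear form and its numbers are entirely illustrative assumptions):

```python
def smile(strike, spot, atm_vol=0.20, skew=-0.001):
    """Toy downward-sloping skew: implied vol as a function of strike,
    anchored at the at-the-money level for the given spot."""
    return atm_vol + skew * (strike - spot)

spot_before, strike = 100.0, 120.0   # spot then moves from $100 to $120

# sticky strike: the $120-strike option keeps the vol it had at the old spot
sticky_strike_vol = smile(strike, spot_before)
# sticky delta: the now-ATM $120 strike inherits the old ATM ($100-strike) vol
sticky_delta_vol = smile(spot_before, spot_before)

print(round(sticky_strike_vol, 4), round(sticky_delta_vol, 4))  # → 0.18 0.2
```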
Modeling volatility
Methods of modelling the volatility smile include stochastic volatility models and local volatility models.
See also
Volatility (finance)
Stochastic volatility
Bruno Dupire
SABR Volatility Model
Jim Gatheral
Steven Heston
Emanuel Derman
Damiano Brigo
Fabio Mercurio
External links
Emanuel Derman, The Volatility Smile and Its Implied Tree (RISK, 7-2 Feb. 1994, pp. 139-145, pp. 32-39) [2] (PDF)
Mark Rubinstein, Implied Binomial Trees [3] (PDF)
Optionistics, Daily Skew Charts [4]
Damiano Brigo, Fabio Mercurio, Francesco Rapisarda and Giulio Sartorelli, Volatility Smile Modeling with Mixture Stochastic Differential Equations [5] (PDF)
References
[1] John C. Hull, Options, Futures and Other Derivatives, 5th edition, page 335
[2] http://www.ederman.com/new/docs/gs-volatility_smile.pdf
[3] http://www.haas.berkeley.edu/finance/WP/rpf232.pdf
[4] http://www.optionistics.com/i/volatility_skew
[5] http://www.damianobrigo.it/tokyo2002smile.pdf
Implied volatility
In financial mathematics, the implied volatility of an option contract is the volatility implied by the market price of
the option based on an option pricing model. In other words, it is the volatility that, when used in a particular pricing
model, yields a theoretical value for the option equal to the current market price of that option. Non-option financial
instruments that have embedded optionality, such as an interest rate cap, can also have an implied volatility. Implied
volatility, a forward-looking measure, differs from historical volatility because the latter is calculated from known
past returns of a security.
Motivation
An ordinary option pricing model, such as Black-Scholes, uses a variety of inputs to derive a theoretical value for an option. Inputs to pricing models vary depending on the type of option being priced and the pricing model used. However, in general, the value of an option depends on an estimate of the future realized volatility, σ, of the underlying. Or, mathematically:
C = f(σ, ·)
where C is the theoretical value of an option, and f is a pricing model that depends on σ, along with other inputs.
The function f is monotonically increasing in σ, meaning that a higher value for volatility results in a higher theoretical value of the option. Conversely, by the inverse function theorem, there can be at most one value for σ that, when applied as an input to f, will result in a particular value for C.
Put in other terms, assume that there is some inverse function g = f^(−1), such that
σ̄ = g(C̄)
where C̄ is the market price for an option. The value σ̄ is the volatility implied by the market price C̄, or the implied volatility.
Example
Consider a European call option, C, on 100 shares of non-dividend-paying XYZ Corp. The option is struck at $50 and expires in 32 days. The risk-free interest rate is 5%. XYZ stock is currently trading at $51.25 and the current market price of C is $2.00. Using a standard Black-Scholes pricing model, the volatility implied by the market price is 18.7%, or:
σ̄ = g(C̄) = 18.7%
To verify, we apply the implied volatility back into the pricing model, f, and we generate a theoretical value of $2.0004:
C = f(σ̄, ·) = $2.0004
which confirms our computation of the market implied volatility.
Solving the inverse pricing model function
In general, a pricing model function, f, does not have a closed-form solution for its inverse, g. Instead, a root-finding technique is used to solve the equation:
f(σ̄) − C̄ = 0
While there are many techniques for finding roots, two of the most commonly used are Newton's method and Brent's method. Because option prices can move very quickly, it is often important to use the most efficient method when calculating implied volatilities.
Newton's method provides rapid convergence; however, it requires the first partial derivative of the option's theoretical value with respect to volatility, i.e. ∂C/∂σ, which is also known as vega (see The Greeks). If the pricing model function yields a closed-form solution for vega, which is the case for the Black-Scholes model, then Newton's method can be more efficient. However, for most practical pricing models, such as a binomial model, this is not the case and vega must be derived numerically. When forced to solve for vega numerically, it usually turns out that Brent's method is more efficient as a root-finding technique.
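A minimal sketch of the inversion (illustrative only; it uses plain bisection rather than Newton's or Brent's method, trading speed for simplicity, and relies on the price being monotonic in volatility):

```python
import math

def bs_call(S, K, T, r, sigma):
    """Black-Scholes value of a European call; monotonically increasing in sigma."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))  # standard normal CDF
    return S * N(d1) - K * math.exp(-r * T) * N(d2)

def implied_vol(price, S, K, T, r, lo=1e-6, hi=5.0, tol=1e-8):
    """Solve f(sigma) = price for sigma by bisection on [lo, hi]."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# round-trip: price an option at 25% vol, then recover the vol from its price
price = bs_call(100, 105, 0.5, 0.03, 0.25)
print(round(implied_vol(price, 100, 105, 0.5, 0.03), 6))  # → 0.25
```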
Implied volatility as measure of relative value
Often, the implied volatility of an option is a more useful measure of the option's relative value than its price. This is
because the price of an option depends most directly on the price of its underlying security. If an option is held as
part of a delta neutral portfolio, that is, a portfolio that is hedged against small moves in the underlier's price, then the
next most important factor in determining the value of the option will be its implied volatility.
Implied volatility is so important that options are often quoted in terms of volatility rather than price, particularly
between professional traders.
Example
A call option is trading at $1.50 with the underlying trading at $42.05. The implied volatility of the option is
determined to be 18.0%. A short time later, the option is trading at $2.10 with the underlying at $43.34, yielding an
implied volatility of 17.2%. Even though the option's price is higher at the second measurement, it is still considered
cheaper on a volatility basis. This is because the underlying needed to hedge the call option can be sold for a higher
price.
Implied volatility as a price
Another way to look at implied volatility is to think of it as a price, not as a measure of future stock moves. In this
view it simply is a more convenient way to communicate option prices than currency. Prices are different in nature
from statistical quantities: We can estimate volatility of future underlying returns using any of a large number of
estimation methods, however the number we get is not a price. A price requires two counterparties, a buyer and a
seller. Prices are determined by supply and demand. Statistical estimates depend on the time-series and the
mathematical structure of the model used. It is a mistake to confuse a price, which implies a transaction, with the
result of a statistical estimation, which is merely what comes out of a calculation. Implied volatilities are prices:
They have been derived from actual transactions. Seen in this light, it should not be surprising that implied
volatilities might not conform to what a particular statistical model would predict.
Non-constant implied volatility
In general, options based on the same underlier but with different strike values and expiration times will yield different implied volatilities. This is generally viewed as evidence that an underlier's volatility is not constant but instead depends on factors such as the price level of the underlier, the underlier's recent variance, and the passage of time. See stochastic volatility and volatility smile for more information.
Volatility instruments
Volatility instruments are financial instruments that track the value of implied volatility of other derivative securities.
For instance, the CBOE Volatility Index (VIX) is calculated from a weighted average of implied volatilities of
various options on the S&P 500 Index. There are also other commonly referenced volatility indices such as the VXN
index (Nasdaq 100 index futures volatility measure), the QQV (QQQQ volatility measure), IVX - Implied Volatility
Index (an expected stock volatility over a future period for any of US securities and exchange traded instruments), as
well as options and futures derivatives based directly on these volatility indices themselves.
Computer implementations
Real-time calculator of implied volatilities when the underlying follows a Mean-Reverting Geometric Brownian Motion [1], by Razvan Pascalau, Univ. of Alabama
References
Beckers, S. (1981), "Standard deviations implied in option prices as predictors of future stock price variability" [2], Journal of Banking and Finance 5 (3): 363–381, doi:10.1016/0378-4266(81)90032-7, retrieved 2009-07-07
Mayhew, S. (1995), "Implied volatility", Financial Analysts Journal 51 (4): 8–20, doi:10.2469/faj.v51.n4.1916
Corrado, C.J.; Su, T. (1997), "Implied volatility skews and stock index skewness and kurtosis implied by S&P 500 index option prices" [3], The Journal of Derivatives (Summer 1997), retrieved 2009-07-07
External links
Interactive Java Applet "Implied Volatility vs. Historic Volatility" [4]
"Option Volatility - A Powerful Indicator in Trading" [5]
References
[1] http:/ / www. cba.ua. edu/ ~rpascala/ impliedvol/ BSOPMSForm. php
[2] http:/ / ideas.repec. org/ a/ eee/ jbfina/ v5y1981i3p363-381. html
[3] http:/ / www. smartquant.com/ references/ Volatility/ vol17. pdf
[4] http:/ / www. frog-numerics. com/ ifs/ ifs_LevelA/ HistVolaVsVDAX. html
[5] http:/ / optiontradingfortune.com/ option-volatility
SABR Volatility Model
In mathematical finance, the SABR model is a stochastic volatility model, which attempts to capture the volatility
smile in derivatives markets. The name stands for "Stochastic Alpha, Beta, Rho", referring to the parameters of the
model. The SABR model is widely used by practitioners in the financial industry, especially in the interest rates
derivatives markets. It was developed by Patrick Hagan, Deep Kumar, Andrew Lesniewski, and Diana Woodward.
Dynamics
The SABR model describes a single forward F, such as a LIBOR forward rate, a forward swap rate, or a forward stock price. The volatility of the forward is described by a parameter σ. SABR is a dynamic model in which both F and σ are represented by stochastic state variables whose time evolution is given by the following system of stochastic differential equations:
dF_t = σ_t F_t^β dW_t,
dσ_t = α σ_t dZ_t,
with the prescribed time zero (currently observed) values F_0 and σ_0. Here, W_t and Z_t are two correlated Wiener processes with correlation coefficient ρ: dW_t dZ_t = ρ dt. The constant parameters satisfy the conditions 0 ≤ β ≤ 1, α ≥ 0, −1 < ρ < 1.
The above dynamics is a stochastic version of the CEV model with the skewness parameter β: in fact, it reduces to the CEV model if α = 0. The parameter α is often referred to as the volvol, and its meaning is that of the lognormal volatility of the volatility parameter σ.
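A minimal Euler–Maruyama simulation of these dynamics can be sketched as follows (illustrative only; the floor at zero is a crude guard against negative forwards, not part of the model):

```python
import math
import random

def simulate_sabr(F0, sigma0, alpha, beta, rho, T, n_steps, seed=0):
    """Simulate one path of dF = sigma * F^beta dW, dsigma = alpha * sigma dZ,
    where dW dZ = rho dt. Returns the terminal (F, sigma)."""
    rng = random.Random(seed)
    dt = T / n_steps
    F, sigma = F0, sigma0
    for _ in range(n_steps):
        z1 = rng.gauss(0.0, 1.0)
        # Build a second Gaussian with correlation rho to the first
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        F = max(F + sigma * (F ** beta) * math.sqrt(dt) * z1, 1e-12)
        # sigma is exactly lognormal, so its step has no discretization error
        sigma *= math.exp(alpha * math.sqrt(dt) * z2 - 0.5 * alpha ** 2 * dt)
    return F, sigma
```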
Asymptotic solution
We consider a European option (say, a call) on the forward F struck at K, which expires T years from now. The value of this option is equal to the suitably discounted expected value of the payoff max(F_T − K, 0) under the probability distribution of the process F_t.
Except for the special cases of β = 0 and β = 1, no closed-form expression for this probability distribution is known. The general case can be solved approximately by means of an asymptotic expansion in the parameter ε = T α². Under typical market conditions, this parameter is small and the approximate solution is actually quite accurate. Also significantly, this solution has a rather simple functional form, is very easy to implement in computer code, and lends itself well to risk management of large portfolios of options in real time.
It is convenient to express the solution in terms of the implied volatility of the option. Namely, we force the SABR model price of the option into the form of the Black model valuation formula. Then the implied volatility, which is the value of the lognormal volatility parameter in Black's model that forces it to match the SABR price, is given approximately by the asymptotic expansion of Hagan et al. In that expansion, F_mid denotes a conveniently chosen midpoint between F_0 and K (such as the geometric average √(F_0 K) or the arithmetic average (F_0 + K)/2).
The auxiliary function entering the expansion is given in the original reference. Alternatively, one can express the SABR price in terms of the normal Black's model; the implied normal volatility can then be computed asymptotically by an analogous expression. It is worth noting that the normal SABR implied volatility is generally somewhat more accurate than the lognormal implied volatility.
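For reference, the lognormal implied-volatility approximation of Hagan et al. (2002) can be sketched as follows. Note that it is written in Hagan's notation, in which alpha is the initial volatility and nu is the volvol (corresponding to σ₀ and α in the dynamics above):

```python
import math

def sabr_implied_vol(F, K, T, alpha, beta, rho, nu):
    """Hagan et al. (2002) lognormal implied-volatility approximation."""
    one_beta = 1.0 - beta
    if abs(F - K) < 1e-12:
        # At-the-money limit of the general formula
        term = (one_beta ** 2 / 24.0 * alpha ** 2 / F ** (2.0 * one_beta)
                + rho * beta * nu * alpha / (4.0 * F ** one_beta)
                + (2.0 - 3.0 * rho ** 2) / 24.0 * nu ** 2)
        return alpha / F ** one_beta * (1.0 + term * T)
    logFK = math.log(F / K)
    FK = (F * K) ** (one_beta / 2.0)
    z = (nu / alpha) * FK * logFK
    x = math.log((math.sqrt(1.0 - 2.0 * rho * z + z ** 2) + z - rho) / (1.0 - rho))
    denom = FK * (1.0 + one_beta ** 2 / 24.0 * logFK ** 2
                  + one_beta ** 4 / 1920.0 * logFK ** 4)
    term = (one_beta ** 2 / 24.0 * alpha ** 2 / (F * K) ** one_beta
            + rho * beta * nu * alpha / (4.0 * FK)
            + (2.0 - 3.0 * rho ** 2) / 24.0 * nu ** 2)
    return alpha / denom * (z / x) * (1.0 + term * T)
```

A useful sanity check: with β = 1 and ν → 0 the formula collapses to Black's lognormal volatility alpha.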
See also
Volatility (finance)
Stochastic Volatility
Risk-neutral measure
External links
Managing Smile Risk, P. Hagan et al. [1] - The original paper introducing the SABR model.
Hedging under SABR Model, B. Bartlett [2] - Refined risk management under the SABR model.
Fine Tune Your Smile - Correction to Hagan et al. [3]
A Summary of the Approaches to the SABR Model for Equity Derivative Smiles [4]
Unifying the BGM and SABR Models: A Short Ride in Hyperbolic Geometry, Pierre Henry-Labordère [5]
References
[1] http://www.math.columbia.edu/~lrb/sabrAll.pdf
[2] http://www.lesniewski.us/papers/published/HedgingUnderSABRModel.pdf
[3] http://arxiv.org/abs/0708.0998v3
[4] http://www.riskworx.com/insights/sabr/sabr.html
[5] http://arxiv.org/pdf/physics/0602102v1
Markov Switching Multifractal
In financial econometrics, the Markov-switching multifractal (MSM) is a model of asset returns that incorporates stochastic volatility components of heterogeneous durations [1] [2]. MSM captures the outliers, long-memory-like volatility persistence and power variation of financial returns. In currency and equity series, MSM compares favorably with standard volatility models such as GARCH(1,1) and FIGARCH both in- and out-of-sample. MSM is used by practitioners in the financial industry to forecast volatility, compute value-at-risk, and price derivatives.
MSM specification
The MSM model can be specified in both discrete time and continuous time.
Discrete time
Let P_t denote the price of a financial asset, and let r_t = ln(P_t / P_{t−1}) denote the return over two consecutive periods. In MSM, returns are specified as
r_t = μ + σ̄ (M_{1,t} M_{2,t} ⋯ M_{k̄,t})^{1/2} ε_t,
where μ and σ̄ are constants and {ε_t} are independent standard Gaussians. Volatility is driven by the first-order latent Markov state vector:
M_t = (M_{1,t}; M_{2,t}; …; M_{k̄,t}).
Given the volatility state M_t, the next-period multiplier M_{k,t+1} is drawn from a fixed distribution M with probability γ_k, and is otherwise left unchanged:
M_{k,t+1} drawn from distribution M with probability γ_k,
M_{k,t+1} = M_{k,t} with probability 1 − γ_k.
The transition probabilities are specified by
γ_k = 1 − (1 − γ_1)^(b^(k−1)), with γ_1 ∈ (0, 1) and b ∈ (1, ∞).
The sequence γ_k is approximately geometric, γ_k ≈ γ_1 b^(k−1), at low frequency. The marginal distribution M has a unit mean, has positive support, and is independent of k.
Binomial MSM
In empirical applications, the distribution M is often a discrete distribution that can take the values m_0 or 2 − m_0 with equal probability. The return process is then specified by the parameters (m_0, μ, σ̄, b, γ_1). Note that the number of parameters is the same for all k̄.
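A simulation sketch of binomial MSM under these conventions (parameter names are illustrative; the switching probabilities follow the γ_k = 1 − (1 − γ₁)^(b^(k−1)) specification above):

```python
import math
import random

def simulate_binomial_msm(T, kbar, m0, sigma_bar, b, gamma1, seed=0):
    """Simulate T returns r_t = sigma_bar * sqrt(M_1 * ... * M_kbar) * eps_t,
    where each multiplier M_k independently redraws from {m0, 2 - m0}
    with probability gamma_k = 1 - (1 - gamma1) ** (b ** (k - 1))."""
    rng = random.Random(seed)
    gammas = [1.0 - (1.0 - gamma1) ** (b ** k) for k in range(kbar)]
    draw = lambda: m0 if rng.random() < 0.5 else 2.0 - m0
    M = [draw() for _ in range(kbar)]
    returns = []
    for _ in range(T):
        for k in range(kbar):
            if rng.random() < gammas[k]:
                M[k] = draw()  # redraw may land on the same value
        vol = sigma_bar * math.sqrt(math.prod(M))
        returns.append(vol * rng.gauss(0.0, 1.0))
    return returns
```

Because each multiplier has unit mean, the long-run average variance is σ̄², while the low-frequency components generate the persistent volatility clusters mentioned above.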
Continuous time
MSM is similarly defined in continuous time. The price process follows the diffusion:
dP_t / P_t = μ dt + σ(M_t) dW_t,
where σ(M_t) = σ̄ (M_{1,t} ⋯ M_{k̄,t})^{1/2}, W_t is a standard Brownian motion, and μ and σ̄ are constants. Each component follows the dynamics:
M_{k,t+dt} drawn from distribution M with probability γ_k dt,
M_{k,t+dt} = M_{k,t} with probability 1 − γ_k dt.
The intensities vary geometrically with k: γ_k = γ_1 b^(k−1). When the number of components k̄ goes to infinity, continuous-time MSM converges to a multifractal diffusion, whose sample paths take a continuum of local Hölder exponents on any finite time interval.
Inference and closed-form likelihood
When M has a discrete distribution, the Markov state vector M_t takes finitely many values m^1, …, m^d. For instance, there are d = 2^{k̄} possible states in binomial MSM. The Markov dynamics are characterized by the transition matrix A = (a_{i,j}) with components a_{i,j} = P(M_{t+1} = m^j | M_t = m^i). Conditional on the volatility state, the return r_t has Gaussian density with mean μ and variance σ̄² (m^i_1 ⋯ m^i_{k̄}).
Conditional distribution
We do not directly observe the latent state vector M_t. Given past returns, we can define the conditional probabilities:
Π_t^j = P(M_t = m^j | r_1, …, r_t).
The vector Π_t = (Π_t^1, …, Π_t^d) is computed recursively:
Π_t = [ω(r_t) ∘ (Π_{t−1} A)] / {[ω(r_t) ∘ (Π_{t−1} A)] · ι},
where ι = (1, …, 1), x ∘ y denotes the component-wise (Hadamard) product for any x, y, and ω(r_t) is the vector of conditional Gaussian densities of r_t in each state.
The initial vector Π_0 is set equal to the ergodic distribution of M_t. For binomial MSM, Π_0^j = 1/d for all j.
Closed-form likelihood
The log-likelihood function has the following analytical expression:
ln L(r_1, …, r_T) = Σ_{t=1}^{T} ln[ω(r_t) · (Π_{t−1} A)].
Maximum likelihood provides reasonably precise estimates in finite samples [3].
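The filtering recursion and the closed-form log-likelihood above can be sketched for binomial MSM (d = 2^k̄ states; a didactic O(d²) implementation, not an optimized one):

```python
import math
from itertools import product

def msm_loglik(returns, kbar, m0, sigma_bar, b, gamma1, mu=0.0):
    """Exact log-likelihood of binomial MSM via the recursion
    Pi_t proportional to omega(r_t) * (Pi_{t-1} A)."""
    gammas = [1.0 - (1.0 - gamma1) ** (b ** k) for k in range(kbar)]
    states = list(product([m0, 2.0 - m0], repeat=kbar))
    d = len(states)
    sigmas = [sigma_bar * math.sqrt(math.prod(s)) for s in states]

    def trans(i, j):
        # Components switch independently; a "stay" occurs either by not
        # switching or by redrawing the same value (each draw is 50/50).
        p = 1.0
        for k in range(kbar):
            if states[i][k] == states[j][k]:
                p *= (1.0 - gammas[k]) + 0.5 * gammas[k]
            else:
                p *= 0.5 * gammas[k]
        return p

    A = [[trans(i, j) for j in range(d)] for i in range(d)]
    Pi = [1.0 / d] * d  # ergodic (uniform) initial distribution for binomial MSM
    ll = 0.0
    for r in returns:
        pred = [sum(Pi[i] * A[i][j] for i in range(d)) for j in range(d)]
        omega = [math.exp(-0.5 * ((r - mu) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
                 for s in sigmas]
        joint = [omega[j] * pred[j] for j in range(d)]
        norm = sum(joint)               # equals omega(r_t) . (Pi_{t-1} A)
        ll += math.log(norm)
        Pi = [x / norm for x in joint]  # Bayes update of the state probabilities
    return ll
```

Maximizing this function over (m_0, μ, σ̄, b, γ₁) with any standard optimizer gives the maximum-likelihood estimates mentioned in the text.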
Other estimation methods
When M has a continuous distribution, estimation can proceed by the simulated method of moments [4] [5], or by simulated likelihood via a particle filter [6].
Forecasting
Given r_1, …, r_t, the conditional distribution of the latent state vector at date t + n is given by Π_t A^n.
MSM often provides better volatility forecasts than some of the best traditional models both in and out of sample. Calvet and Fisher [7] report considerable gains in exchange rate volatility forecasts at horizons of 10 to 50 days as compared with GARCH(1,1), Markov-switching GARCH [8] [9], and Fractionally Integrated GARCH [10]. Lux [11] obtains similar results using linear predictions.
Applications
Multiple assets and value-at-risk
Extensions of MSM to multiple assets provide reliable estimates of the value-at-risk in a portfolio of securities [12].
Asset pricing
In financial economics, MSM has been used to analyze the pricing implications of multifrequency risk. The models have had some success in explaining the excess volatility of stock returns compared to fundamentals and the negative skewness of equity returns. They have also been used to generate multifractal jump-diffusions [13].
Related approaches
MSM is a stochastic volatility model [14] [15] with arbitrarily many frequencies. MSM builds on the convenience of regime-switching models, which were advanced in economics and finance by James D. Hamilton [16] [17]. MSM is closely related to the MMAR [18]. MSM improves on the MMAR's combinatorial construction by randomizing arrival times, guaranteeing a strictly stationary process. MSM provides a pure regime-switching formulation of multifractal measures, which were pioneered by Benoit Mandelbrot [19] [20] [21].
See also
Brownian motion
Markov chain
Multifractal model of asset returns
Multifractal
Stochastic volatility
References
[1] Calvet, Laurent; Adlai Fisher (2001). "Forecasting multifractal volatility". Journal of Econometrics 105: 27–58. doi:10.1016/S0304-4076(01)00069-0.
[2] Calvet, Laurent; Adlai Fisher (2004). "How to forecast long-run volatility: regime-switching and the estimation of multifractal processes". Journal of Financial Econometrics 2: 49–83. doi:10.1093/jjfinec/nbh003.
[3] Calvet, Laurent; Adlai Fisher (2004). "How to forecast long-run volatility: regime-switching and the estimation of multifractal processes". Journal of Financial Econometrics 2: 49–83. doi:10.1093/jjfinec/nbh003.
[4] Calvet, Laurent; Adlai Fisher (2002). Regime-switching and the estimation of multifractal processes (http://www.cirano.qc.ca/realisations/grandes_conferences/risques_financiers/25-10-02/Calvet-Fisher.pdf).
[5] Lux, Thomas (2008). "The Markov-switching multifractal model of asset returns: GMM estimation and linear forecasting of volatility". Journal of Business and Economic Statistics 26: 194–210.
[6] Calvet, Laurent; Adlai Fisher; Samuel Thompson (2006). "Volatility comovement: a multifrequency approach". Journal of Econometrics 131: 179–215. doi:10.1016/j.jeconom.2005.01.008.
[7] Calvet, Laurent; Adlai Fisher (2004). "How to forecast long-run volatility: regime-switching and the estimation of multifractal processes". Journal of Financial Econometrics 2: 49–83. doi:10.1093/jjfinec/nbh003.
[8] Gray, Stephen (1996). "Modeling the conditional distribution of interest rates as a regime-switching process". Journal of Financial Economics 42: 27–62. doi:10.1016/0304-405X(96)00875-6.
[9] Klaassen, Franc (2002). "Improving GARCH volatility forecasts with regime-switching GARCH". Empirical Economics 27: 363–394. doi:10.1007/s001810100100.
[10] Bollerslev, Tim; H. O. Mikkelsen (1996). "Modeling and pricing long memory in stock market volatility". Journal of Econometrics 73: 151–184. doi:10.1016/0304-4076(95)01736-4.
[11] Lux, Thomas (2008). "The Markov-switching multifractal model of asset returns: GMM estimation and linear forecasting of volatility". Journal of Business and Economic Statistics 26: 194–210.
[12] Calvet, Laurent; Adlai Fisher; Samuel Thompson (2006). "Volatility comovement: a multifrequency approach". Journal of Econometrics 131: 179–215. doi:10.1016/j.jeconom.2005.01.008.
[13] Calvet, Laurent; Adlai Fisher (2008). Multifractal Volatility: Theory, Forecasting and Pricing. Elsevier - Academic Press.
[14] Taylor, Stephen (1986). Modelling Financial Time Series. New York: Wiley.
[15] Wiggins, James (1987). "Option values under stochastic volatility: theory and empirical estimates". Journal of Financial Economics 19: 351–372. doi:10.1016/0304-405X(87)90009-2.
[16] Hamilton, James (1989). "A new approach to the economic analysis of nonstationary time series and the business cycle" (http://jstor.org/stable/1912559). Econometrica (The Econometric Society) 57 (2): 357–84. doi:10.2307/1912559.
[17] Hamilton, James (2008). "Regime-Switching Models". New Palgrave Dictionary of Economics, 2nd edition, Palgrave McMillan Ltd.
[18] Calvet, Laurent; Adlai Fisher; Benoit Mandelbrot (1997). "A multifractal model of asset returns". Discussion Papers 1164–1166, Cowles Foundation, Yale University.
[19] Mandelbrot, B. (1974). "Intermittent turbulence in self-similar cascades: divergence of high moments and dimension of the carrier". Journal of Fluid Mechanics 62: 331–58. doi:10.1017/S0022112074000711.
[20] Mandelbrot, B. (1982). The Fractal Geometry of Nature. New York: Freeman.
[21] Mandelbrot, B. (1999). Multifractals and 1/f Noise: Wild Self-Affinity in Physics. Springer.
Greeks (finance)
In mathematical finance, the Greeks are the quantities representing the sensitivities of the price of derivatives such as options to a change in underlying parameters on which the value of an instrument or portfolio of financial instruments is dependent. The name is used because the most common of these sensitivities are often denoted by Greek letters. Collectively these have also been called the risk sensitivities [1], risk measures [2] or hedge parameters [3].
Use of the Greeks

              Spot Price (S)   Volatility (σ)   Time to Expiry (τ)   Risk-Free Rate (r)
Value (V)     Delta            Vega             Theta                Rho
Delta (Δ)     Gamma            Vanna            Charm
Vega (ν)      Vanna            Vomma            DvegaDtime
Gamma (Γ)     Speed            Zomma            Color
Vomma                          Ultima

The table shows the relationship of the more common sensitivities to the four primary inputs into the BlackScholes model (namely, the spot price of the underlying security, the volatility of that price, the time remaining until the option expires, and the rate of return of a risk-free investment) and to the option's value, delta, vega, gamma and vomma. Delta, vega, theta and rho are first-order derivatives; gamma, vanna, charm, vomma and DvegaDtime are second-order; speed, zomma, color and ultima are third-order. Note that vanna appears, intentionally, in two places, as the two sensitivities are mathematically equivalent.
The Greeks are vital tools in risk management. Each Greek measures the sensitivity of the value of a portfolio to a
small change in a given underlying parameter, so that component risks may be treated in isolation, and the portfolio
rebalanced accordingly to achieve a desired exposure; see for example delta hedging.
The Greeks in the BlackScholes model are relatively easy to calculate, a desirable property of financial models, and are very useful for derivatives traders, especially those who seek to hedge their portfolios from adverse changes in market conditions. For this reason, those Greeks which are particularly useful for hedging (delta, theta and vega) are well-defined for measuring changes in price, time and volatility. Although rho is a primary input into the BlackScholes model, the overall impact on the value of an option corresponding to changes in the risk-free interest rate is generally insignificant and therefore higher-order derivatives involving the risk-free interest rate are not common.
The most common of the Greeks are the first order derivatives: Delta, Vega, Theta and Rho as well as Gamma, a
second-order derivative of the value function. The remaining sensitivities in this list are common enough that they
have common names, but this list is by no means exhaustive.
First-order Greeks
Delta
Delta [5], Δ, measures the rate of change of option value with respect to changes in the underlying asset's price. Delta is the first derivative of the value V of the option with respect to the underlying instrument's price S: Δ = ∂V/∂S.
Practical use
Even though delta will be a number between 0.0 and 1.0 for a long call (and/or short put) and 0.0 and −1.0 for a long put (and/or short call), these numbers are commonly presented as a percentage of the total number of shares represented by the option contract(s). This is convenient because the option will (instantaneously) behave like the number of shares indicated by the delta. For example, if an American call option on XYZ has a delta of 0.25, it will gain or lose value just like 25% of 100 shares, or 25 shares, of XYZ as the price changes for small price movements. Delta is always positive for long calls and short puts and negative for long puts and short calls. The total delta of a complex portfolio of positions on the same underlying asset can be calculated by simply taking the sum of the deltas for each individual position. Since the delta of the underlying asset is always 1.0, the trader could delta-hedge his entire position in the underlying by buying or shorting the number of shares indicated by the total delta. For example, if the delta of a portfolio of options in XYZ (expressed as shares of the underlying) is +2.75, the trader would be able to delta-hedge the portfolio by selling short 2.75 shares of the underlying. This portfolio will then retain its total value regardless of which direction the price of XYZ moves (albeit only for small movements of the underlying, over a short amount of time, and notwithstanding changes in other market conditions such as volatility and the rate of return for a risk-free investment).
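The hedge arithmetic just described can be sketched under Black–Scholes with no dividends (the position and market inputs are hypothetical):

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_delta(S, K, T, r, sigma):
    """Black-Scholes delta of a European call (no dividends): N(d1)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    return norm_cdf(d1)

# Hypothetical book: long 3 calls and long 1 put on the same strike/expiry,
# each expressed per share of the underlying.
S, K, T, r, sigma = 42.0, 40.0, 0.5, 0.03, 0.25
delta_call = bs_call_delta(S, K, T, r, sigma)
delta_put = delta_call - 1.0             # put delta via put-call parity
portfolio_delta = 3.0 * delta_call + 1.0 * delta_put
hedge_shares = -portfolio_delta          # negative: sell short this many shares
```

Adding `hedge_shares` of stock makes the total position delta zero, so its value is (instantaneously) insensitive to small moves in S.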
As a proxy for probability
Some option traders also use the absolute value of delta as the probability that the option will expire in-the-money (if the market moves under Brownian motion) [6]. For example, if an out-of-the-money call option has a delta of 0.15, the trader might estimate that the option has approximately a 15% chance of expiring in-the-money. Similarly, if a put contract has a delta of −0.25, the trader might expect the option to have a 25% probability of expiring in-the-money. At-the-money puts and calls have a delta of approximately −0.5 and 0.5 respectively [7] (however, this approximation rapidly goes out the window when looking at a term of just a few years, with the ATM call commonly having a delta over 0.60 or 0.70), or each will have a 50% chance of expiring in-the-money. The correct, exact calculation for the probability of an option finishing in the money is its dual delta, which is the first derivative of option price with respect to strike.
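The gap between delta and the exact in-the-money probability is easy to see numerically (illustrative Black–Scholes inputs, no dividends; the risk-neutral probability of finishing in the money is N(d2), while the call delta is N(d1)):

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

S, K, T, r, sigma = 100.0, 100.0, 1.0, 0.05, 0.20
d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
d2 = d1 - sigma * math.sqrt(T)
call_delta = norm_cdf(d1)  # the common "probability" proxy
p_itm = norm_cdf(d2)       # exact risk-neutral P(S_T > K); the dual delta
                           # of the call is -exp(-r*T) * N(d2)
```

Since d1 > d2, the call delta always overstates the risk-neutral in-the-money probability, consistent with the ATM bias described in the footnote.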
Relationship between call and put delta
Given a European call and put option for the same underlying, strike price and time to maturity, and with no dividend yield, the sum of the absolute values of the delta of each option will be 1.00.
If the value of delta for an option is known, one can compute the delta of the option with the same strike price, underlying and maturity but opposite right by subtracting 1 from a known call delta or adding 1 to a known put delta. For example, if the delta of a call is 0.42, the delta of the corresponding put at the same strike price is 0.42 − 1 = −0.58; conversely, if the delta of a put is −0.58, the delta of the corresponding call is −0.58 + 1 = 0.42.
Vega
Vega [8], ν, measures sensitivity to volatility. Vega is the derivative of the option value with respect to the volatility of the underlying: ν = ∂V/∂σ. (Despite its use alongside the Greeks, vega is not actually the name of any Greek letter; the symbol used is the Greek letter nu, which resembles a Latin "v".)
The term kappa, κ, is sometimes used (by academics) instead of vega, as is tau, τ, though this is rare.
Practical use
Vega is typically expressed as the amount of money, per underlying share the option's value will gain or lose as
volatility rises or falls by 1%.
Vega can be an important Greek to monitor for an option trader, especially in volatile markets, since the value of some option strategies can be particularly sensitive to changes in volatility. The value of an option straddle, for example, is extremely dependent on changes to volatility.
Theta
Theta [9], Θ, measures the sensitivity of the value of the derivative to the passage of time (see Option time value): the "time decay."
Practical use
The mathematical result of the formula for theta (see below) is expressed in value/year. By convention, it is useful to
divide the result by the number of days per year to arrive at the amount of money, per share of the underlying that
the option loses in one day. Theta is always negative for long calls and puts and positive for short (or written) calls
and puts. The total theta for a portfolio of options can be determined by simply taking the sum of the thetas for each
individual position.
The value of an option is made up of two parts: the intrinsic value and the time value. The intrinsic value is the
amount of money you would gain if you exercised the option immediately, so a call with strike $50 on a stock with
price $60 would have intrinsic value of $10, whereas the corresponding put would have zero intrinsic value. The
time value is the worth of having the option of waiting longer when deciding to exercise. Even a deeply out of the
money put will be worth something as there is some chance the stock price will fall below the strike. However, as
time approaches maturity, there is less chance of this happening, so the time value of an option is decreasing with
time. Thus if you are long an option you are short theta: your portfolio will lose value with the passage of time (all
other factors held constant).
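The intrinsic/time-value decomposition in this paragraph can be written directly (the $11.20 premium below is a hypothetical market quote):

```python
def call_intrinsic(S, K):
    """Intrinsic value of a call: what immediate exercise would pay."""
    return max(S - K, 0.0)

def time_value(option_price, S, K):
    """Time value: the option premium in excess of intrinsic value."""
    return option_price - call_intrinsic(S, K)

# The $60 stock / $50-strike call from the text has intrinsic value $10;
# if that call trades at $11.20, the remaining $1.20 is time value,
# and theta measures how that $1.20 erodes as expiration approaches.
```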
Rho
Rho [10], ρ, measures sensitivity to the applicable interest rate. Rho is the derivative of the option value with respect to the risk-free rate. Except under extreme circumstances, the value of an option is least sensitive to changes in the risk-free interest rate. For this reason, rho is the least used of the first-order Greeks.
Practical use
Rho is typically expressed as the amount of money, per share, that the value of the option will gain or lose as the rate
of return of a risk-free investment rises or falls by 1.0%.
Fugit
The fugit is the optimal date to exercise an American or Bermudan option. It is useful to compute for hedging purposes; for example, one can represent the flows of an American swaption as the flows of a swap starting at the fugit multiplied by delta, and then use these to compute sensitivities.
Higher-order Greeks
Charm
Charm [11], or delta decay, measures the instantaneous rate of change of delta over the passage of time. Charm has also been called DdeltaDtime [12]. Charm can be an important Greek to measure/monitor when delta-hedging a position over a weekend. Charm is a second-order derivative of the option value, once to price and once to time. It is also then the (negative) derivative of theta with respect to the underlying's price.
Practical use
The mathematical result of the formula for charm (see below) is expressed in delta/year. It is often useful to divide
this by the number of days per year to arrive at the delta decay per day. This use is fairly accurate when the number
of days remaining until option expiration is large. When an option nears expiration, charm itself may change quickly,
rendering full day estimates of delta decay inaccurate.
Color
Color [13], gamma decay or DgammaDtime [14], measures the rate of change of gamma over the passage of time. Color is a third-order derivative of the option value, twice to underlying asset price and once to time. Color can be an important sensitivity to monitor when maintaining a gamma-hedged portfolio as it can help the trader to anticipate the effectiveness of the hedge as time passes.
Practical use
The mathematical result of the formula for color (see below) is expressed in gamma/year. It is often useful to divide
this by the number of days per year to arrive at the change in gamma per day. This use is fairly accurate when the
number of days remaining until option expiration is large. When an option nears expiration, color itself may change
quickly, rendering full day estimates of gamma change inaccurate.
DvegaDtime
DvegaDtime [15] measures the rate of change in the vega with respect to the passage of time. DvegaDtime is the second derivative of the value function, once to volatility and once to time.
Practical use
It is common practice to divide the mathematical result of DvegaDtime by 100 times the number of days per year to
reduce the value to the percentage change in vega per one day.
Gamma
Gamma [16], Γ, measures the rate of change in the delta with respect to changes in the underlying price. Gamma is the second derivative of the value function with respect to the underlying price. Gamma is important because it corrects for the convexity of value.
When a trader seeks to establish an effective delta-hedge for a portfolio, the trader may also seek to neutralize the
portfolio's gamma, as this will ensure that the hedge will be effective over a wider range of underlying price
movements.
Of course, in neutralizing the gamma of a portfolio, alpha (the return in excess of the risk-free rate) is reduced.
Lambda
Lambda, λ, omega, Ω, or elasticity [17] is the percentage change in option value per percentage change in the underlying price, a measure of leverage, sometimes called gearing.
Speed
Speed [18] measures the rate of change in gamma with respect to changes in the underlying price. This is also sometimes referred to as the gamma of the gamma [19] or DgammaDspot [20]. Speed is the third derivative of the value function with respect to the underlying spot price. Speed can be important to monitor when delta-hedging or gamma-hedging a portfolio.
Ultima
Ultima [21] measures the sensitivity of the option vomma with respect to change in volatility. Ultima has also been referred to as DvommaDvol [22]. Ultima is a third-order derivative of the option value to volatility.
Vanna
Vanna [23], also referred to as DvegaDspot and DdeltaDvol [24], is a second-order derivative of the option value, once to the underlying spot price and once to volatility. It is mathematically equivalent to DdeltaDvol [25], the sensitivity of the option delta with respect to change in volatility; or alternatively, the partial of vega with respect to the underlying instrument's price. Vanna can be a useful sensitivity to monitor when maintaining a delta- or vega-hedged portfolio as vanna will help the trader to anticipate changes to the effectiveness of a delta-hedge as volatility changes or the effectiveness of a vega-hedge against change in the underlying spot price.
Vomma
Vomma [26], volga [27], vega convexity [28], vega gamma or dTau/dVol measures second-order sensitivity to volatility. Vomma is the second derivative of the option value with respect to the volatility or, stated another way, vomma measures the rate of change of vega as volatility changes. With positive vomma, a position will become long vega as implied volatility increases and short vega as it decreases, which can be scalped in a way analogous to long gamma. An initially vega-neutral, long-vomma position can be constructed from ratios of options at different strikes. Vomma is positive for options away from the money, and initially increases with distance from the money (but drops off as vega drops off). (Specifically, vomma is positive where the usual d1 and d2 terms are of the same sign, which is true when d2 > 0 or d1 < 0.)
Zomma
Zomma [29] measures the rate of change of gamma with respect to changes in volatility. Zomma has also been referred to as DgammaDvol [30]. Zomma is the third derivative of the option value, twice to underlying asset price and once to volatility. Zomma can be a useful sensitivity to monitor when maintaining a gamma-hedged portfolio as zomma will help the trader to anticipate changes to the effectiveness of the hedge as volatility changes.
Black-Scholes
The Greeks under the Black-Scholes model are calculated as follows, where φ (phi) is the standard normal probability density function and Φ is the standard normal cumulative distribution function. Note that the gamma and vega formulas are the same for calls and puts.
For a given: stock price S, strike price K, risk-free rate r, annual dividend yield q, time to maturity τ, and volatility σ...
Calls Puts
value
delta
vega
theta
rho
gamma
vanna
charm
speed
zomma
color
DvegaDtime
vomma
ultima
dual delta
dual gamma
where d1 = [ln(S/K) + (r − q + σ²/2)τ] / (σ√τ) and d2 = d1 − σ√τ.
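Since the formula table did not survive extraction, a sketch of the main closed-form Greeks in this notation may be useful (first-order Greeks plus gamma; sign conventions follow the definitions above, and the function name is our own):

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_greeks(S, K, r, q, tau, sigma, call=True):
    """Closed-form Black-Scholes Greeks for a European option
    with continuous dividend yield q."""
    sq = sigma * math.sqrt(tau)
    d1 = (math.log(S / K) + (r - q + 0.5 * sigma ** 2) * tau) / sq
    d2 = d1 - sq
    sign = 1.0 if call else -1.0  # flips N(d) arguments for puts
    delta = sign * math.exp(-q * tau) * norm_cdf(sign * d1)
    gamma = math.exp(-q * tau) * norm_pdf(d1) / (S * sq)       # same for calls/puts
    vega = S * math.exp(-q * tau) * norm_pdf(d1) * math.sqrt(tau)  # same for calls/puts
    theta = (-math.exp(-q * tau) * S * norm_pdf(d1) * sigma / (2.0 * math.sqrt(tau))
             - sign * r * K * math.exp(-r * tau) * norm_cdf(sign * d2)
             + sign * q * S * math.exp(-q * tau) * norm_cdf(sign * d1))
    rho = sign * K * tau * math.exp(-r * tau) * norm_cdf(sign * d2)
    return {"delta": delta, "gamma": gamma, "vega": vega,
            "theta": theta, "rho": rho}
```

A quick consistency check: with q = 0, the call and put deltas differ by exactly 1, and gamma and vega coincide for calls and puts, as the text states.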
See also
Alpha (finance)
Beta coefficient
Delta neutral
Greek letters used in mathematics
References
[1] Banks, Erik; Siegel, Paul (2006). The options applications handbook: hedging and speculating techniques for professional investors. McGraw-Hill Professional. p. 263. ISBN 0071453156, 9780071453158.
[2] Macmillan, Lawrence G. (1993). Options as a Strategic Investment (3rd ed.). New York Institute of Finance. p. 742. ISBN 0-13-636002-5, 0-13-099661-0.
[3] Chriss, Neil (1996). Black–Scholes and beyond: option pricing models. McGraw-Hill Professional. p. 308. ISBN 0786310251, 9780786310258.
[4] Chriss, Neil (1996). Black–Scholes and beyond: option pricing models. McGraw-Hill Professional. p. 308. ISBN 0786310251, 9780786310258.
[5] Haug, Espen Gaarder (2007). The Complete Guide to Option Pricing Formulas. McGraw-Hill Professional. ISBN 0071389970, 9780071389976.
[6] Suma, John. "Options Greeks: Delta Risk and Reward" (http://www.investopedia.com/university/option-greeks/greeks2.asp). Retrieved 7 Jan 2010.
[7] There is a slight bias for a greater probability that a call will expire in-the-money than a put at the same strike when the underlying is also exactly at the strike. This bias is due to the much larger range of prices that the underlying could be within at expiration for calls (Strike...+inf) than for puts (0...Strike). However, with large strike and underlying values, this asymmetry can be effectively eliminated. Yet the "bias" to the call remains (ATM delta > 0.50) due to the expected value of the lognormal distribution (namely, the (1/2)σ² term). Also, in markets that exhibit contango forward prices (positive basis), the effect of interest rates on forward prices will also cause the call delta to increase.
[8] Haug, Espen Gaarder (2007). The Complete Guide to Option Pricing Formulas. McGraw-Hill Professional. ISBN 0071389970, 9780071389976.
[9] Haug, Espen Gaarder (2007). The Complete Guide to Option Pricing Formulas. McGraw-Hill Professional. ISBN 0071389970, 9780071389976.
[10] Haug, Espen Gaarder (2007). The Complete Guide to Option Pricing Formulas. McGraw-Hill Professional. ISBN 0071389970, 9780071389976.
[11] Haug, Espen Gaarder (2007). The Complete Guide to Option Pricing Formulas. McGraw-Hill Professional. ISBN 0071389970, 9780071389976.
[12] Espen Gaarder Haug, "Know Your Weapon, Part 1", Wilmott Magazine, May 2003, p. 51
[13] This author has only seen this referred to in the English spelling "Colour", but has written it here in the US spelling to match the style of the
existing article.
[14] Espen Gaarder Haug, "Know Your Weapon, Part 1", Wilmott Magazine, May 2003, p. 56
[15] Espen Gaarder Haug, "Know Your Weapon, Part 2", Wilmott Magazine, July/August 2003, p. 53
[16] Haug, Espen Gaarder (2007). The Complete Guide to Option Pricing Formulas. McGraw-Hill Professional. ISBN 0071389970, 9780071389976.
[17] Haug, Espen Gaarder (2007). The Complete Guide to Option Pricing Formulas. McGraw-Hill Professional. ISBN 0071389970, 9780071389976.
[18] Haug, Espen Gaarder (2007). The Complete Guide to Option Pricing Formulas. McGraw-Hill Professional. ISBN 0071389970, 9780071389976.
[19] Macmillan, Lawrence G. (1993). Options as a Strategic Investment (3rd ed.). New York Institute of Finance. p. 799. ISBN 0-13-636002-5.
[20] Espen Gaarder Haug, "Know Your Weapon, Part 1", Wilmott Magazine, May 2003, p. 55
[21] Haug, Espen Gaarder (2007). The Complete Guide to Option Pricing Formulas. McGraw-Hill Professional. ISBN 0071389970, 9780071389976.
[22] Haug, Espen Gaarder (2007). The Complete Guide to Option Pricing Formulas. McGraw-Hill Professional. ISBN 0071389970, 9780071389976.
[23] Haug, Espen Gaarder (2007). The Complete Guide to Option Pricing Formulas. McGraw-Hill Professional. ISBN 0071389970, 9780071389976.
[24] Espen Gaarder Haug, "Know Your Weapon, Part 1", Wilmott Magazine, May 2003, p. 51
[25] Espen Gaarder Haug, "Know Your Weapon, Part 1", Wilmott Magazine, May 2003, p. 51
[26] Espen Gaarder Haug, "Know Your Weapon, Part 2", Wilmott Magazine, July/August 2003, p. 52
[27] Espen Gaarder Haug, "Know Your Weapon, Part 2", Wilmott Magazine, July/August 2003, p. 52
[28] Espen Gaarder Haug, "Know Your Weapon, Part 2", Wilmott Magazine, July/August 2003, p. 52
[29] Haug, Espen Gaarder (2007). The Complete Guide to Option Pricing Formulas. McGraw-Hill Professional. ISBN 0071389970, 9780071389976.
[30] Espen Gaarder Haug, "Know Your Weapon, Part 1", Wilmott Magazine, May 2003, p. 55
External links
Discussion
Why We Have Never Used the Black-Scholes-Merton Option Pricing Formula (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1012075), Nassim Taleb and Espen Gaarder Haug
Theory
Delta: quantnotes.com (http://www.quantnotes.com/fundamentals/options/thegreeks-delta.htm)
Delta, Gamma, GammaP, Gamma symmetry, Vanna, Speed, Charm, Saddle Gamma: Vanilla Options - Espen Haug (http://www.espenhaug.com/KnowYourWeapon.pdf)
Volga, Vanna, Speed, Charm, Color: Vanilla Options - Uwe Wystup (http://www.mathfinance.de/FXRiskBook/chap-1.pdf), Vanilla Options - Uwe Wystup (http://www.institute.mathfinance.de/PraktikumFinanzmathematik/library/vanilla_fxoptions.pdf)
Online tools
Surface Plots of Black-Scholes Greeks (http://cdmurray80.googlepages.com/optiongreeks), Chris Murray
Online real-time option prices and Greeks calculator when the underlying is normally distributed (http://www.cba.ua.edu/~rpascala/greeks/NBOPMForm.php), Razvan Pascalau, Univ. of Alabama
Finite difference methods for option pricing
Finite difference methods for option pricing are numerical methods used in mathematical finance for the valuation of options.[1] Finite difference methods were first applied to option pricing by Eduardo Schwartz in 1977.[2]
Finite difference methods can solve derivative pricing problems that have, in general, the same level of complexity as those problems solved by tree approaches,[1] and are therefore usually employed only when other approaches are inappropriate. At the same time, like tree-based methods, this approach is limited in terms of the number of underlying variables, and for problems with multiple dimensions, Monte Carlo methods for option pricing are usually preferred.
The approach exploits the fact that the evolution of the option value can be modelled via a partial differential equation (PDE), as a function of (at least) time and the price of the underlying; see for example the Black–Scholes PDE. Once in this form, a finite difference model can be derived, and the valuation obtained.[2] Here, essentially, the PDE is expressed in a discretized form, using finite differences, and the evolution in the option price is then modelled using a lattice with corresponding dimensions: time runs from 0 to maturity, and price runs from 0 to a "high" value, such that the option is deeply in or out of the money.
The option is valued as follows:[3]
Maturity values are simply the difference between the exercise price of the option and the value of the underlying
at each point.
Values at the boundary prices are set based on moneyness or arbitrage bounds on option prices.
Values at other lattice points are calculated recursively, starting at the time step preceding maturity and ending at
time = 0. Here, using a technique such as CrankNicolson or the explicit method:
1. the PDE is discretized per the technique chosen, such that the value at each lattice point is specified as a function
of the value at later and adjacent points; see Stencil (numerical analysis);
2. the value at each point is then found using the technique in question.
The value of the option today, where the underlying is at its spot price, (or at any time/price combination,) is then
found by interpolation.
As above, these methods and tree-based methods are able to handle problems which are equivalent in complexity. In fact, when standard assumptions are applied, it can be shown that the explicit technique encompasses the binomial and trinomial tree methods.[4] Tree-based methods, then, suitably parameterized, are a special case of the explicit finite difference method.[5]
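The valuation procedure described above can be sketched in code. The following is an illustrative explicit-scheme implementation for a European put under Black–Scholes assumptions; the grid sizes and the S_max = 3K boundary are arbitrary choices, and the explicit method additionally requires a small enough time step for stability:

```python
import math

def explicit_fd_put(S0, K, r, sigma, T, M=200, N=2000):
    """European put via the explicit finite-difference method applied to
    the Black-Scholes PDE. M price steps on [0, S_max], N time steps,
    working backwards from maturity to time 0."""
    S_max = 3 * K                      # "high" boundary: option deep out of the money
    dS = S_max / M
    dt = T / N                         # must be small enough for stability
    V = [max(K - i * dS, 0.0) for i in range(M + 1)]   # payoff at maturity
    for n in range(1, N + 1):
        V_new = [0.0] * (M + 1)
        for i in range(1, M):
            # standard explicit discretization coefficients at node S = i*dS
            a = 0.5 * dt * (sigma**2 * i**2 - r * i)
            b = 1.0 - dt * (sigma**2 * i**2 + r)
            c = 0.5 * dt * (sigma**2 * i**2 + r * i)
            V_new[i] = a * V[i - 1] + b * V[i] + c * V[i + 1]
        V_new[0] = K * math.exp(-r * n * dt)   # S = 0: sure payoff K, discounted
        V_new[M] = 0.0                         # S = S_max: put is worthless
        V = V_new
    # today's value at the spot price, by linear interpolation on the lattice
    i = int(S0 / dS)
    w = (S0 - i * dS) / dS
    return (1 - w) * V[i] + w * V[i + 1]
```

For an at-the-money put with S = K = 100, r = 5%, σ = 20% and one year to maturity, the result should sit close to the analytical Black–Scholes value.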
References
[1] Hull, John C. (2002). Options, Futures and Other Derivatives (5th ed.). Prentice Hall. ISBN 0-13-009056-5.
[2] Schwartz, E. (January 1977). "The Valuation of Warrants: Implementing a New Approach" (http://ideas.repec.org/a/eee/jfinec/v4y1977i1p79-93.html). Journal of Financial Economics 4: 79-93. doi:10.1016/0304-405X(77)90037-X.
[3] Wilmott, P.; Howison, S.; Dewynne, J. (1995). The Mathematics of Financial Derivatives: A Student Introduction. Cambridge University Press. ISBN 0521497892.
[4] Brennan, M.; Schwartz, E. (September 1978). "Finite Difference Methods and Jump Processes Arising in the Pricing of Contingent Claims: A Synthesis" (http://www.jstor.org/pss/2330152). Journal of Financial and Quantitative Analysis (University of Washington School of Business Administration) 13 (3): 461-474. doi:10.2307/2330152.
[5] Rubinstein, M. (2000). "On the Relation Between Binomial and Trinomial Option Pricing Models" (http://web.archive.org/web/20070622150346/www.in-the-money.com/pages/author.htm). Journal of Derivatives 8 (2): 47-50. doi:10.3905/jod.2000.319149.
External links
Option Pricing Using Finite Difference Methods (http://www.bus.lsu.edu/academics/finance/faculty/dchance/Instructional/TN97-02.pdf), Prof. Don M. Chance, Louisiana State University
Finite Difference Approach to Option Pricing (http://www.cs.cornell.edu/Info/Courses/Spring-98/CS522/content/lab4.pdf) (includes Matlab code); Numerical Solution of Black-Scholes Equation (http://www.cs.cornell.edu/Info/Courses/Spring-98/CS522/content/lecture2.math.pdf), Tom Coleman, Cornell University
Numerically Solving PDEs: Crank-Nicholson Algorithm (http://www.sfu.ca/~rjones/bus864/notes/notes2.pdf), Prof. R. Jones, Simon Fraser University
Numerical Methods for Option Pricing: Binomial and Finite-difference Approximations (http://www.stat.columbia.edu/~kerman/optionpricing.pdf), Jouni Kerman, Courant Institute of Mathematical Sciences, New York University
Introduction to the Numerical Solution of Partial Differential Equations in Finance (http://mit.econ.au.dk/vip_htm/cmunk/noter/PDENOTE.pdf), Claus Munk, University of Aarhus
Numerical Methods for the Valuation of Financial Derivatives (http://users.aims.ac.za/~bundi/Thesis.pdf), D.B. Ntwiga, University of the Western Cape
The Finite Difference Method (http://www.puc-rio.br/marco.ind/katia-num.html#finite-differences), Katia Rocha, Instituto de Pesquisa Econômica Aplicada
Trinomial tree
The trinomial tree is a lattice-based computational model used in financial mathematics to price options. It was developed by Phelim Boyle in 1986. It is an extension of the binomial options pricing model, and is conceptually similar.[1] It can also be shown that the approach is equivalent to the explicit finite difference method for option pricing.[2]
Formula
Under the trinomial method, the underlying stock price is modeled as a recombining tree, where, at each node, the price has three possible paths: an up, down and stable (middle) path.[3] These values are found by multiplying the value at the current node by the appropriate factor u, d or m, where
\( u = e^{\sigma\sqrt{2\Delta t}}, \qquad d = e^{-\sigma\sqrt{2\Delta t}} = \frac{1}{u} \) (the structure is recombining), \( \qquad m = 1, \)
and the corresponding probabilities are:
\( p_u = \left( \frac{e^{(r-q)\Delta t/2} - e^{-\sigma\sqrt{\Delta t/2}}}{e^{\sigma\sqrt{\Delta t/2}} - e^{-\sigma\sqrt{\Delta t/2}}} \right)^{\!2}, \qquad p_d = \left( \frac{e^{\sigma\sqrt{\Delta t/2}} - e^{(r-q)\Delta t/2}}{e^{\sigma\sqrt{\Delta t/2}} - e^{-\sigma\sqrt{\Delta t/2}}} \right)^{\!2}, \qquad p_m = 1 - p_u - p_d. \)
In the above formulae: \(\Delta t\) is the length of time per step in the tree and is simply time to maturity divided by the number of time steps; r is the risk-free interest rate; σ is the volatility of the underlying; q is its dividend yield.
Trinomial tree
286
As with the binomial model, these factors and probabilities are specified so as to ensure that the price of the underlying evolves as a martingale, while the moments are matched approximately[4] (and with increasing accuracy for smaller time-steps).
Once the tree of prices has been calculated, the option price is found at each node largely as for the binomial model, by working backwards from the final nodes to today. The difference is that the option value at each non-final node is determined based on the three (as opposed to two) later nodes and their corresponding probabilities. The model is best understood visually - see, for example, Trinomial Tree Option Calculator[5] (Peter Hoadley).
Application
The trinomial model is considered[6] to produce more accurate results than the binomial model when fewer time steps are modelled, and is therefore used when computational speed or resources may be an issue. For vanilla options, as the number of steps increases, the results rapidly converge, and the binomial model is then preferred due to its simpler implementation. For exotic options the trinomial model (or adaptations) is sometimes more stable and accurate, regardless of step-size.
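The backward-induction procedure described above can be sketched as follows. This is an illustrative European-call pricer on a Boyle-style recombining trinomial tree, using the common parameterization with u = e^{σ√(2Δt)} and zero dividend yield; all names are illustrative:

```python
import math

def trinomial_call(S0, K, r, sigma, T, N=300):
    """European call via a recombining trinomial tree (Boyle-style).
    Up/down factors u = exp(sigma*sqrt(2*dt)), d = 1/u, middle m = 1."""
    dt = T / N
    u = math.exp(sigma * math.sqrt(2 * dt))
    # moment-matching probabilities under the risk-neutral drift r
    a = math.exp(r * dt / 2)
    b = math.exp(sigma * math.sqrt(dt / 2))
    pu = ((a - 1 / b) / (b - 1 / b)) ** 2
    pd = ((b - a) / (b - 1 / b)) ** 2
    pm = 1.0 - pu - pd
    disc = math.exp(-r * dt)
    # terminal payoffs on the 2N+1 recombining nodes S0 * u**j, j = -N..N
    values = [max(S0 * u**j - K, 0.0) for j in range(-N, N + 1)]
    # backward induction: each non-final node uses its three successors
    for step in range(N, 0, -1):
        values = [disc * (pd * values[i] + pm * values[i + 1] + pu * values[i + 2])
                  for i in range(2 * (step - 1) + 1)]
    return values[0]
```

With enough steps the result converges to the Black–Scholes value, consistent with the equivalence to the explicit finite difference method noted above.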
See also
Binomial options pricing model
Valuation of options
Option: Model implementation
References
[1] Trinomial Method (Boyle) 1986 (http://www.global-derivatives.com/index.php/options-database-topmenu/13-options-database/15-european-options#Trinomial)
[2] Mark Rubinstein (http://web.archive.org/web/20070622150346/www.in-the-money.com/pages/author.htm)
[3] Trinomial Tree, geometric Brownian motion (http://www.sitmo.com/eq/441)
[4] Pricing Options Using Trinomial Trees (http://www2.warwick.ac.uk/fac/sci/maths/people/staff/oleg_zaboronski/fm/trinomial_tree_2008.pdf)
[5] http://www.hoadley.net/options/binomialtree.aspx?tree=T
[6] On-Line Options Pricing & Probability Calculators (http://www.hoadley.net/options/calculators.htm)
External links
Phelim Boyle, 1986. "Option Valuation Using a Three-Jump Process", International Options Journal 3, 7-12.
Optimal stopping
In mathematics, the theory of optimal stopping is concerned with the problem of choosing a time to take a particular
action, in order to maximise an expected reward or minimise an expected cost. Optimal stopping problems can be
found in areas of statistics, economics, and mathematical finance (related to the pricing of American options). A key
example of an optimal stopping problem is the secretary problem. Optimal stopping problems can often be written in
the form of a Bellman equation, and are therefore often solved using dynamic programming.
Definition
Stopping rule problems are associated with two objects:
1. A sequence of random variables X_1, X_2, ..., whose joint distribution is assumed to be known
2. A sequence of 'reward' functions (y_i) which depend on the observed values of the random variables in 1: y_i = y_i(x_1, ..., x_i)
Given those objects, the problem is as follows:
You are observing the sequence of random variables, and at each step i, you can choose to either stop observing or continue
If you stop observing at step i, you will receive reward y_i
You want to choose a stopping rule to maximise your expected reward (or equivalently, minimise your expected loss)
Examples
Coin tossing (y_i converges)
You have a fair coin and are repeatedly tossing it. Each time, before it is tossed, you can choose to stop tossing it and get paid (in dollars, say) the average number of heads observed.
You wish to maximise the amount you get paid by choosing a stopping rule. If X_i (for i ≥ 1) forms a sequence of independent, identically distributed random variables with distribution
\( P(X_i = 1) = P(X_i = 0) = \tfrac{1}{2}, \)
and if
\( y_i = \frac{1}{i} \sum_{k=1}^{i} X_k, \)
then the sequences (X_i) and (y_i) are the objects associated with this problem.
House selling (y_i does not necessarily converge)
You have a house and wish to sell it. Each day you are offered X_n for your house, and pay k to continue advertising it. If you sell your house on day n, you will earn y_n, where y_n = X_n − nk.
You wish to maximise the amount you earn by choosing a stopping rule.
In this example, the sequence (X_i) is the sequence of offers for your house, and the sequence of reward functions (y_i) is how much you will earn.
Secretary problem ((X_i) is a finite sequence)
You are observing a sequence of objects which can be ranked from best to worst. You wish to choose a stopping rule which maximises your chance of picking the best object.
Here, if R_1, ..., R_n (n is some large number, perhaps) are the ranks of the objects, and y_i is the chance you pick the best object if you stop intentionally rejecting objects at step i, then (R_i) and (y_i) are the sequences associated with this problem. This problem was solved in the early 1960s by several people. An elegant solution to the secretary
problem and several modifications of this problem is provided by the more recent odds algorithm of optimal
stopping (Bruss algorithm).
Search theory
Economists have studied a number of optimal stopping problems similar to the 'secretary problem', and typically call this type of analysis 'search theory'. Search theory has especially focused on a worker's search for a high-wage job, or a consumer's search for a low-priced good.
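The classical solution to the secretary problem (reject roughly the first n/e objects, then take the first one better than everything seen so far) can be checked by simulation. A sketch, with all names and parameter choices illustrative:

```python
import random

def secretary_success_rate(n=100, skip=None, trials=20000, seed=7):
    """Estimate the probability that the 'skip about n/e, then take the
    first record' rule picks the single best of n candidates."""
    if skip is None:
        skip = int(round(n / 2.718281828459045))   # roughly n/e
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        ranks = list(range(n))            # 0 is the best candidate
        rng.shuffle(ranks)
        best_seen = min(ranks[:skip]) if skip else n
        chosen = ranks[-1]                # forced to take the last if no record appears
        for rank in ranks[skip:]:
            if rank < best_seen:          # first candidate beating the rejected sample
                chosen = rank
                break
        wins += (chosen == 0)
    return wins / trials
```

The estimated success probability should sit close to 1/e ≈ 0.368 for moderately large n, in line with the classical result.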
References
T. P. Hill. "Knowing When to Stop" [1]. American Scientist, Vol. 97, 126-133 (2009). (For a French translation, see the cover story [2] in the July 2009 issue of Pour la Science.)
Optimal Stopping and Applications [3], retrieved on 21 June 2007
Thomas S. Ferguson. "Who solved the secretary problem?" Statistical Science, Vol. 4, 282-296 (1989)
F. Thomas Bruss. "Sum the odds to one and stop." Annals of Probability, Vol. 28, 1384-1391 (2000)
F. Thomas Bruss. "The art of a right decision: Why decision makers want to know the odds-algorithm." Newsletter of the European Mathematical Society, Issue 62, 14-20 (2006)
R. Rogerson, R. Shimer, and R. Wright (2005), 'Search-theoretic models of the labor market: a survey'. Journal of Economic Literature 43, pp. 959-988.
References
[1] http://www.americanscientist.org/issues/feature/2009/2/knowing-when-to-stop/1
[2] http://www.pourlascience.fr/ewb_pages/f/fiche-article-savoir-quand-s-arreter-22670.php
[3] http://www.math.ucla.edu/~tom/Stopping/Contents.html
Interest rate derivative
An interest rate derivative is a derivative where the underlying asset is the right to pay or receive a notional amount
of money at a given interest rate. These structures are popular for investors with customized cashflow needs or
specific views on the interest rate movements (such as volatility movements or simple directional movements) and
are therefore usually traded OTC; see financial engineering.
The interest rate derivatives market is the largest derivatives market in the world. The Bank for International Settlements estimates that the notional amounts outstanding in June 2009[1] were US$437 trillion for OTC interest rate contracts, and US$342 trillion for OTC interest rate swaps. According to the International Swaps and Derivatives Association, 80% of the world's top 500 companies as of April 2003 used interest rate derivatives to control their cashflows. This compares with 75% for foreign exchange options, 25% for commodity options and 10% for stock options.
Modeling of interest rate derivatives is usually done on a time-dependent multi-dimensional Lattice ("tree") built for
the underlying risk drivers, usually domestic or foreign short rates and Forex rates; see Short-rate model. Specialised
simulation models are also often used.
Types
Vanilla
The basic building blocks for most interest rate derivatives can be described as "vanilla" (simple, basic derivative
structures, usually most liquid):
Interest rate swap (fixed-for-floating)
Interest rate cap or interest rate floor
Interest rate swaption
Bond option
Forward rate agreement
Interest rate future
Money market instruments
Cross currency swap (see Forex swap)
Quasi-vanilla
The next intermediate level is a quasi-vanilla class of (fairly liquid) derivatives, examples of which are:
Range accrual swaps/notes/bonds
In-arrears swap
Constant maturity swap (CMS) or constant treasury swap (CTS) derivatives (swaps, caps, floors)
Interest rate swap based upon two floating interest rates
Exotic derivatives
Building off these structures are the "exotic" interest rate derivatives (least liquid, traded over the counter), such as:
Power Reverse Dual Currency note (PRDC or Turbo)
Target redemption note (TARN)
CMS steepener [2]
Snowball [3]
Inverse floater
Strips of Collateralized mortgage obligation
Ratchet caps and floors
Bermudan swaptions
Cross currency swaptions
Most of the exotic interest rate derivatives can be classified as having two payment legs: a funding leg and an exotic
coupon leg.
A funding leg usually consists of series of fixed coupons or floating coupons (LIBOR) plus fixed spread.
An exotic coupon leg typically consists of a functional dependence on the past and current underlying indices
(LIBOR, CMS rate, FX rate) and sometimes on its own past levels, as in Snowballs and TARNs. The payer of the
exotic coupon leg usually has a right to cancel the deal on any of the coupon payment dates, resulting in the
so-called Bermudan exercise feature. There may also be some range-accrual and knock-out features inherent in
the exotic coupon definition.
Interest rate derivative
290
Example of interest rate derivatives
Interest rate cap
An interest rate cap is designed to hedge a companys maximum exposure to upward interest rate movements. It
establishes a maximum total dollar interest amount the hedger will pay out over the life of the cap. The interest rate
cap is actually a series of individual interest rate caplets, each being an individual option on the underlying interest
rate index. The interest rate cap is paid for upfront, and then the purchaser realizes the benefit of the cap over the life
of the instrument.
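Individual caplets are commonly valued with Black's (1976) formula applied to the forward rate for each period. The sketch below assumes a known forward rate, discount factor and flat caplet volatility; the function and parameter names are illustrative, not from the article:

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_caplet(notional, F, K, sigma, T, accrual, df):
    """Value of a single caplet under Black's (1976) model.
    F: forward rate for the period, K: cap rate, T: fixing time in years,
    accrual: length of the period in years, df: discount factor to payment."""
    d1 = (math.log(F / K) + 0.5 * sigma**2 * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return notional * accrual * df * (F * norm_cdf(d1) - K * norm_cdf(d2))
```

As a sanity check, when volatility goes to zero the caplet value collapses to its discounted intrinsic value, and positive volatility can only add to it.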
Range accrual note
Suppose a manager wished to take a view that volatility of interest rates will be low. He or she may gain extra yield
over a regular bond by buying a range accrual note instead. This note pays interest only if the floating interest rate (i.e. the London Interbank Offered Rate) stays within a pre-determined band. This note effectively contains an embedded
option which, in this case, the buyer of the note has sold to the issuer. This option adds to the yield of the note. In
this way, if volatility remains low, the bond yields more than a standard bond.
Bermudan swaption
Suppose a fixed-coupon callable bond was brought to the market by a company. The issuer, however, entered into an interest rate swap to convert the fixed coupon payments to floating payments (perhaps based on LIBOR). Since it is callable, the issuer may redeem the bond back from investors at certain dates during the life of the bond. If called, this would still leave the issuer with the interest rate swap. Therefore, the issuer also enters into a Bermudan swaption when the bond is brought to market, with exercise dates equal to the callable dates for the bond. If the bond is
called, the swaption is exercised, effectively canceling the swap leaving no more interest rate exposure for the issuer.
See also
mathematical finance
financial modeling
References
[1] Bank for International Settlements, "Semiannual OTC derivatives statistics" (http://www.bis.org/statistics/otcder/dt1920a.csv) at end-June 2009. Retrieved 31 January 2010.
[2] http://www.risk.net/asia-risk/feature/1496874/rate-steepeners-rise
[3] http://www.fincad.com/derivatives-resources/wiki/snowballs.aspx
Further reading
Hull, John C. (2005). Options, Futures and Other Derivatives, Sixth Edition. Prentice Hall. ISBN 0131499084
Marshall, John F. (2000). Dictionary of Financial Engineering. Wiley. ISBN 0471242918
Damiano Brigo, Fabio Mercurio (2001). Interest Rate Models - Theory and Practice with Smile, Inflation and Credit (2nd ed. 2006). Springer Verlag. ISBN 978-3-540-22149-4.
External links
Basic Fixed Income Derivative Hedging (http://www.financial-edu.com/basic-fixed-income-derivative-hedging.php) - Article on Financial-edu.com.
Short rate model
In the context of interest rate derivatives, a short-rate model is a mathematical model that describes the future
evolution of interest rates by describing the future evolution of the short rate.
The short rate
The short rate, usually written \(r_t\), is the (annualized) interest rate at which an entity can borrow money for an infinitesimally short period of time from time t. Specifying the current short rate does not specify the entire yield curve. However, no-arbitrage arguments show that, under some fairly relaxed technical conditions, if we model the evolution of \(r_t\) as a stochastic process under a risk-neutral measure Q, then the price at time t of a zero-coupon bond maturing at time T is given by
\( P(t,T) = \mathbb{E}^{Q}\!\left[ e^{-\int_t^T r_s\,ds} \,\middle|\, \mathcal{F}_t \right], \)
where \(\mathcal{F}_t\) is the natural filtration for the process. Thus specifying a model for the short rate specifies future bond prices. This means that instantaneous forward rates are also specified by the usual formula
\( f(t,T) = -\frac{\partial}{\partial T} \ln P(t,T). \)
Particular short-rate models
Throughout this section represents a standard Brownian motion under a risk-neutral probability measure and
its differential. Other than RendlemanBartter and HoLee, which do not capture the mean reversion of
interest rates, these models can be thought of as special cases of OrnsteinUhlenbeck processes.
1. The RendlemanBartter model models the short rate as
2. The Vasicek model models the short rate as
3. The HoLee model models the short rate as
4. The HullWhite model (also called the extended Vasicek model) posits . In
many presentations one or more of the parameters and are not time-dependent.
5. The CoxIngersollRoss model supposes
6. In the BlackKarasinski model a variable X
t
is assumed to follow an OrnsteinUhlenbeck process and r
t
is
assumed to follow .
7. The BlackDermanToy model has for time-dependent short rate
volatility and otherwise.
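The mean-reverting behaviour of these models can be illustrated with a simple Euler–Maruyama simulation of the Vasicek dynamics dr = (θ − αr)dt + σ dW. This is a hedged sketch; the discretization, parameters and names are illustrative choices, not part of the article:

```python
import math, random

def simulate_vasicek(r0, theta, alpha, sigma, T, steps, seed=42):
    """Euler-Maruyama sample path of the Vasicek short rate
    dr = (theta - alpha * r) dt + sigma dW on [0, T]."""
    rng = random.Random(seed)
    dt = T / steps
    r = r0
    path = [r]
    for _ in range(steps):
        dW = rng.gauss(0.0, math.sqrt(dt))     # Brownian increment
        r += (theta - alpha * r) * dt + sigma * dW
        path.append(r)
    return path
```

Averaging terminal values over many paths should reproduce the known mean \( r_0 e^{-\alpha T} + \frac{\theta}{\alpha}(1 - e^{-\alpha T}) \), which tends to the long-run level θ/α.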
Multi-factor short-rate models
Besides the above one-factor models, there are also multi-factor models of the short rate, among them the best known being the Longstaff–Schwartz two-factor model and the Chen three-factor model (also called the "stochastic mean and stochastic volatility model"):
1. The Longstaff–Schwartz model supposes that the short rate dynamics are given by two stochastic factors, with the short rate defined as a function of both.
2. The Chen model, which has a stochastic mean and a stochastic volatility of the short rate, is given by three coupled stochastic differential equations for the short rate, its mean and its volatility.
Other interest rate models
The other major framework for interest rate modelling is the HeathJarrowMorton framework (HJM). Unlike the
short rate models described above, this class of models is generally non-Markovian. This makes general HJM
models computationally intractable for most purposes. The great advantage of HJM models is that they give an
analytical description of the entire yield curve, rather than just the short rate. For some purposes (e.g., valuation of mortgage-backed securities), this can be a big simplification. The Cox–Ingersoll–Ross and Hull–White models in
one or more dimensions can both be straightforwardly expressed in the HJM framework. Other short rate models do
not have any simple dual HJM representation.
The HJM framework with multiple sources of randomness, including as it does the BraceGatarekMusiela model
and market models, is often preferred for models of higher dimension.
References
Martin Baxter and Andrew Rennie (1996). Financial Calculus. Cambridge University Press. ISBN 978-0-521-55289-9.
Damiano Brigo, Fabio Mercurio (2001). Interest Rate Models - Theory and Practice with Smile, Inflation and Credit (2nd ed. 2006). Springer Verlag. ISBN 978-3-540-22149-4.
Lin Chen (1996). Interest Rate Dynamics, Derivatives Pricing, and Risk Management. Springer. ISBN 3-540-60814-1.
Jessica James and Nick Webber (2000). Interest Rate Modelling. Wiley Finance. ISBN 0-471-97523-0.
Rajna Gibson, François-Serge Lhabitant and Denis Talay (1999). "Modeling the Term Structure of Interest Rates: An Overview". The Journal of Risk, 1(3): 37-62.
Riccardo Rebonato (2002). Modern Pricing of Interest-Rate Derivatives. Princeton University Press. ISBN 0-691-08973-6.
Andrew J. G. Cairns (2004). Interest Rate Models - An Introduction. Princeton University Press. ISBN 0-691-11894-9.
HullWhite model
In financial mathematics, the Hull–White model is a model of future interest rates. In its most generic formulation, it belongs to the class of no-arbitrage models that are able to fit today's term structure of interest rates. It is relatively straightforward to translate the mathematical description of the evolution of future interest rates onto a tree or lattice, and so interest rate derivatives such as Bermudan swaptions can be valued in the model.
The first Hull–White model was described by John C. Hull and Alan White in 1990. The model is still popular in the market today.
The model
One-factor model
The model is a short-rate model. In general, it has dynamics
\( dr(t) = \left[\theta(t) - \alpha(t)\, r(t)\right] dt + \sigma(t)\, dW(t). \)
There is a degree of ambiguity amongst practitioners about exactly which parameters in the model are time-dependent or what name to apply to the model in each case. The most commonly accepted hierarchy has
θ, α and σ constant - the Vasicek model
θ has t dependence - the Hull-White model
θ and α also time-dependent - the extended Vasicek model
Two-factor model
The two-factor Hull–White model (Hull 2006:657-658) contains an additional disturbance term u(t) whose mean reverts to zero, and is of the form
\( d\,f(r(t)) = \left[\theta(t) + u(t) - \alpha\, f(r(t))\right] dt + \sigma_1\, dW_1(t), \)
where u has an initial value of 0 and follows the process
\( du = -b\, u\, dt + \sigma_2\, dW_2(t). \)
Analysis of the one-factor model
For the rest of this article we assume only θ has t-dependence. Neglecting the stochastic term for a moment, notice that the change in r is negative if r is currently "large" (greater than θ(t)/α) and positive if the current value is small. That is, the stochastic process is a mean-reverting Ornstein-Uhlenbeck process.
θ is calculated from the initial yield curve describing the current term structure of interest rates. Typically α is left as a user input (for example it may be estimated from historical data). σ is determined via calibration to a set of caplets and swaptions readily tradeable in the market.
When θ, α and σ are constant, Itô's lemma can be used to prove that
\( r(t) = e^{-\alpha t}\, r(0) + \frac{\theta}{\alpha}\left(1 - e^{-\alpha t}\right) + \sigma e^{-\alpha t} \int_0^t e^{\alpha u}\, dW(u), \)
which has distribution
\( r(t) \sim \mathcal{N}\!\left( e^{-\alpha t}\, r(0) + \frac{\theta}{\alpha}\left(1 - e^{-\alpha t}\right),\; \frac{\sigma^2}{2\alpha}\left(1 - e^{-2\alpha t}\right) \right), \)
where \(\mathcal{N}(\mu, \sigma^2)\) is the normal distribution with mean μ and variance σ².
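In the constant-parameter (Vasicek) special case, the zero-coupon bond price has a well-known closed form of the affine type P = A e^{−Br}. The following sketch uses the parameterization dr = a(b − r)dt + σ dW; the function and parameter names are illustrative, not from the article:

```python
import math

def vasicek_bond_price(r0, a, b, sigma, tau):
    """Zero-coupon bond price P(0, tau) in the Vasicek model
    dr = a*(b - r) dt + sigma dW with constant parameters."""
    B = (1.0 - math.exp(-a * tau)) / a
    A = math.exp((b - sigma**2 / (2 * a**2)) * (B - tau)
                 - sigma**2 * B**2 / (4 * a))
    return A * math.exp(-B * r0)
```

Two quick checks: with σ = 0 and r(0) = b the short rate stays at b, so the price reduces to e^{−b·τ} exactly; and positive volatility raises the price, by Jensen's inequality applied to the discount factor.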
Bond pricing using the Hull-White model
It turns out that the time-S value of the T-maturity discount bond has distribution (note the affine term structure here!)
\( P(S,T) = A(S,T)\, e^{-B(S,T)\, r(S)}, \)
where A(S,T) and B(S,T) are deterministic functions of the model parameters.
Note that the terminal distribution for P(S,T) is log-normal.
Derivative pricing
By selecting as numeraire the time-S bond (which corresponds to switching to the S-forward measure), we have from
the fundamental theorem of arbitrage-free pricing, the value at time 0 of a derivative which has payoff at time S.
Here, \(\mathbb{E}_S\) is the expectation taken with respect to the S-forward measure. Moreover, standard arbitrage arguments show that the time-T forward price \(F_V(t,T)\) for a payoff at time T given by V(T) must satisfy \(F_V(t,T) = V(t)/P(t,T)\), thus
it is possible to value many derivatives V dependent solely on a single bond P(S,T) analytically when working in the Hull–White model. For example, in the case of a bond put:
Because P(S,T) is lognormally distributed, the general calculation used for Black–Scholes shows that
where
and
Thus today's value (with the P(0,S) multiplied back in) is:
Here \(\sigma_P\) is the standard deviation of the log-normal distribution for P(S,T). A fairly substantial amount of algebra shows that it is related to the original parameters via
Note that this expectation was done in the S-bond measure, whereas we did not specify a measure at all for the original Hull–White process. This does not matter: the volatility is all that matters, and it is measure-independent.
Because interest rate caps/floors are equivalent to bond puts and calls respectively, the above analysis shows that caps and floors can be priced analytically in the Hull–White model. Jamshidian's trick applies to Hull–White (as today's value of a swaption in HW is a monotonic function of today's short rate). Thus knowing how to price caps is also sufficient for pricing swaptions.
Trees and lattices
However, valuing vanilla instruments such as caps and swaptions is useful primarily for calibration. The real use of the model is to value somewhat more exotic derivatives such as Bermudan swaptions on a lattice, or other derivatives in a multi-currency context such as quanto constant maturity swaps, as explained for example in Brigo and Mercurio (2001).
See also
Vasicek model
Cox-Ingersoll-Ross model
References
Notes
Hull, John C. (2006). "Interest Rate Derivatives: Models of the Short Rate". Options, Futures, and Other
Derivatives (6th ed.). Upper Saddle River, NJ: Prentice Hall. pp. 657–658. LCCN 2005-047692.
ISBN 0131499084. OCLC 60321487.
Articles
Damiano Brigo, Fabio Mercurio (2001). Interest Rate Models – Theory and Practice with Smile, Inflation and
Credit (2nd ed. 2006). Springer Verlag. ISBN 978-3-540-22149-4.
John Hull and Alan White, "Using Hull-White interest rate trees," Journal of Derivatives, Vol. 3, No. 3 (Spring
1996), pp. 26–36.
John Hull and Alan White, "Numerical procedures for implementing term structure models I," Journal of
Derivatives, Fall 1994, pp. 7–16.
John Hull and Alan White, "Numerical procedures for implementing term structure models II," Journal of
Derivatives, Winter 1994, pp. 37–48.
John Hull and Alan White, "The pricing of options on interest rate caps and floors using the Hull-White model" in
Advanced Strategies in Financial Risk Management, Chapter 4, pp. 59–67.
John Hull and Alan White, "One factor interest rate models and the valuation of interest rate derivative
securities," Journal of Financial and Quantitative Analysis, Vol. 28, No. 2 (June 1993), pp. 235–254.
John Hull and Alan White, "Pricing interest-rate derivative securities," The Review of Financial Studies, Vol. 3,
No. 4 (1990), pp. 573–592.
Eugen Puschkarski, "Implementation of Hull-White's No-Arbitrage Term Structure Model" [1], Diploma Thesis,
Center for Central European Financial Markets
References
[1] http://web.archive.org/web/*/www.angelfire.com/ny/financeinfo/Diplomnew.ppt
Cox–Ingersoll–Ross model
In mathematical finance, the Cox–Ingersoll–Ross model (or CIR model) describes the evolution of interest rates. It
is a type of "one-factor model" (short rate model), as it describes interest rate movements as driven by only one
source of market risk. The model can be used in the valuation of interest rate derivatives. It was introduced in 1985
by John C. Cox, Jonathan E. Ingersoll and Stephen A. Ross as an extension of the Vasicek model.
The model
The CIR model specifies that the instantaneous interest rate follows the stochastic differential equation, also named
the CIR process:

dr_t = a(b - r_t)\,dt + \sigma\sqrt{r_t}\,dW_t

where W_t is a Wiener process modelling the random market risk factor.

The drift factor, a(b − r_t), is exactly the same as in the Vasicek model. It ensures mean reversion of the interest rate
towards the long-run value b, with speed of adjustment governed by the strictly positive parameter a.

The standard deviation factor, \sigma\sqrt{r_t}, avoids the possibility of negative interest rates for all non-negative values of a
and b. An interest rate of zero is also precluded if the condition

2ab \ge \sigma^2

is met. More generally, when the rate is at a low level (close to zero), the standard deviation also becomes close to
zero, which dampens the effect of the random shock on the rate. Consequently, when the rate gets close to zero, its
evolution becomes dominated by the drift factor, which pushes the rate upwards (towards equilibrium).
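As an illustration of these dynamics, a simple Monte Carlo discretization can be used. The "full-truncation" Euler scheme below is one common way to keep the discretized rate non-negative inside the square root; all names and parameter values are illustrative, not from the source:

```python
import math
import random

def simulate_cir(r0, a, b, sigma, T, n_steps, seed=42):
    """Full-truncation Euler scheme for dr = a(b - r)dt + sigma*sqrt(r)dW."""
    random.seed(seed)
    dt = T / n_steps
    r = r0
    path = [r0]
    for _ in range(n_steps):
        # Truncate negative excursions at zero inside drift and diffusion terms
        r_pos = max(r, 0.0)
        dW = random.gauss(0.0, math.sqrt(dt))
        r = r + a * (b - r_pos) * dt + sigma * math.sqrt(r_pos) * dW
        path.append(r)
    return path

# Parameters satisfy the Feller condition: 2ab = 0.05 >= sigma^2 = 0.01
path = simulate_cir(r0=0.03, a=0.5, b=0.05, sigma=0.1, T=10.0, n_steps=1000)
```

With these parameters the simulated rate hovers around the long-run level b, illustrating the mean reversion described above.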
Bond pricing
Under the no-arbitrage assumption, a bond may be priced using this interest rate process. The bond price is
exponential affine in the interest rate:

P(t,T) = A(t,T)\,e^{-B(t,T)\,r_t}
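Writing the bond price as P(t,T) = A(t,T)·exp(−B(t,T)·r_t), the functions A and B have well-known closed forms involving γ = sqrt(a² + 2σ²). A minimal sketch (the function name is mine):

```python
import math

def cir_bond_price(r, tau, a, b, sigma):
    """Zero-coupon bond price P(t, t+tau) = A(tau) * exp(-B(tau) * r) under CIR."""
    gamma = math.sqrt(a * a + 2.0 * sigma * sigma)
    e = math.exp(gamma * tau) - 1.0
    denom = (gamma + a) * e + 2.0 * gamma
    B = 2.0 * e / denom
    A = (2.0 * gamma * math.exp((a + gamma) * tau / 2.0) / denom) ** (2.0 * a * b / sigma**2)
    return A * math.exp(-B * r)

p5 = cir_bond_price(r=0.05, tau=5.0, a=0.5, b=0.05, sigma=0.1)
```

As τ → 0 the price tends to 1, and for positive rates it decreases with maturity, as a discount factor should.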
Extensions
Time-varying functions replacing the coefficients can be introduced in the model in order to make it consistent with a
pre-assigned term structure of interest rates and possibly volatilities. The most general approach is in Maghsoodi
(1996). A more tractable approach is in Brigo and Mercurio (2001b), where an external time-dependent shift is added
to the model for consistency with an input term structure of rates. A significant extension of the CIR model to the
case of stochastic mean and stochastic volatility is given by Lin Chen (1996) and is known as the Chen model.
See also
Hull-White model
Vasicek model
Chen model
CIR process
References
Hull, John C. (2003). Options, Futures and Other Derivatives. Upper Saddle River, NJ: Prentice Hall.
ISBN 0-13-009056-5.
Cox, J.C., J.E. Ingersoll and S.A. Ross (1985). "A Theory of the Term Structure of Interest Rates". Econometrica
53: 385–407. doi:10.2307/1911242.
Maghsoodi, Y. (1996). "Solution of the extended CIR Term Structure and Bond Option Valuation". Mathematical
Finance (6): 89–109.
Damiano Brigo, Fabio Mercurio (2001). Interest Rate Models – Theory and Practice with Smile, Inflation and
Credit (2nd ed. 2006). Springer Verlag. ISBN 978-3-540-22149-4.
Brigo, Damiano and Fabio Mercurio (2001b). "A deterministic-shift extension of analytically tractable and
time-homogeneous short rate models". Finance & Stochastics 5 (3): 369–388.
Chen model
In finance, the Chen model is a mathematical model describing the evolution of interest rates. It is a type of
"three-factor model" (short rate model) as it describes interest rate movements as driven by three sources of market
risk. It was the first stochastic mean and stochastic volatility model and it was published in 1994 by the economist
Lin Chen.
The dynamics of the instantaneous interest rate are specified by the stochastic differential equations:

dr_t = \kappa(\theta_t - r_t)\,dt + \sqrt{r_t}\,\sqrt{\sigma_t}\,dW_t

d\theta_t = \nu(\zeta - \theta_t)\,dt + \alpha\sqrt{\theta_t}\,dW_t

d\sigma_t = \mu(\beta - \sigma_t)\,dt + \eta\sqrt{\sigma_t}\,dW_t
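A rough Euler simulation sketch of a Chen-type three-factor short rate: the short rate r mean-reverts to a stochastic mean θ, with a stochastic volatility σ. The specific square-root form, the parameter names, and the values below are assumptions for illustration, not taken verbatim from the source:

```python
import math
import random

def simulate_chen(r0, th0, s0, kappa, nu, zeta, alpha, mu, beta, eta, T, n, seed=1):
    """Euler scheme for a Chen-style three-factor model: short rate r with
    stochastic mean theta and stochastic volatility sigma (all square-root diffusions)."""
    random.seed(seed)
    dt = T / n
    sq = math.sqrt(dt)
    r, th, s = r0, th0, s0
    out = [(r, th, s)]
    for _ in range(n):
        # Truncate each factor at zero inside the square roots
        rp, thp, sp = max(r, 0.0), max(th, 0.0), max(s, 0.0)
        r += kappa * (thp - rp) * dt + math.sqrt(rp * sp) * random.gauss(0.0, sq)
        th += nu * (zeta - thp) * dt + alpha * math.sqrt(thp) * random.gauss(0.0, sq)
        s += mu * (beta - sp) * dt + eta * math.sqrt(sp) * random.gauss(0.0, sq)
        out.append((r, th, s))
    return out

path = simulate_chen(0.03, 0.05, 0.02, kappa=0.5, nu=0.3, zeta=0.05,
                     alpha=0.05, mu=0.4, beta=0.02, eta=0.05, T=5.0, n=500)
```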
In an authoritative review of modern finance (Continuous-Time Methods in Finance: A Review and an Assessment),
the Chen model is listed along with the models of Robert C. Merton, Oldrich Vasicek, John C. Cox, Stephen A. Ross,
Darrell Duffie, John Hull, Robert A. Jarrow, and Emanuel Derman as a major term structure model.
Different variants of the Chen model are still being used in financial institutions worldwide. James and Webber devote a
section to the Chen model in their book; Gibson et al. devote a section to it in their review
article. Andersen et al. devote a paper to studying and extending it. Gallant et al. devote a paper to testing the Chen
model and other models; Wibowo and Cai devote their PhD dissertations to testing the Chen model against other
competing models.
See also
Lin Chen in Harvard PhD Event
References
Lin Chen (1996). "Stochastic Mean and Stochastic Volatility – A Three-Factor Model of the Term Structure of
Interest Rates and Its Application to the Pricing of Interest Rate Derivatives". Financial Markets, Institutions, and
Instruments 5: 1–88.
Lin Chen (1996). Interest Rate Dynamics, Derivatives Pricing, and Risk Management. Lecture Notes in
Economics and Mathematical Systems, 435. Springer. ISBN 978-3540608141.
Jessica James and Nick Webber (2000). Interest Rate Modelling. Wiley Finance.
Rajna Gibson, François-Serge Lhabitant and Denis Talay (2001). Modeling the Term Structure of Interest Rates: A
Review of the Literature. RiskLab, ETH.
Frank J. Fabozzi and Moorad Choudhry (2007). The Handbook of European Fixed Income Securities. Wiley
Finance.
Sanjay K. Nawalkha, Gloria M. Soto, Natalia A. Beliaeva (2007). Dynamic Term Structure Modeling: The Fixed
Income Valuation Course. Wiley Finance.
Sundaresan, Suresh M. (2000). "Continuous-Time Methods in Finance: A Review and an Assessment". The
Journal of Finance 55 (4): 1569–1622. doi:10.1111/0022-1082.00261.
Andersen, T.G., L. Benzoni, and J. Lund (2004). Stochastic Volatility, Mean Drift, and Jumps in the Short-Term
Interest Rate. Working Paper, Northwestern University.
Gallant, A.R., and G. Tauchen (1997). Estimation of Continuous Time Models for Stock Returns and Interest
Rates. Macroeconomic Dynamics 1, 135–168.
Cai, L. (2008). Specification Testing for Multifactor Diffusion Processes: An Empirical and Methodological
Analysis of Model Stability Across Different Historical Episodes [1]. Rutgers University.
Wibowo A. (2006). Continuous-time identification of exponential-affine term structure models [2]. Twente
University.
References
[1] http://econweb.rutgers.edu/lcai/lili_files/JobMarketPaper.pdf
[2] http://eprints.eemcs.utwente.nl
LIBOR Market Model
The LIBOR market model, also known as the BGM model (Brace–Gatarek–Musiela model, in reference to the
names of some of its inventors), is a financial model of interest rates. It is used for pricing interest rate derivatives,
especially exotic derivatives like Bermudan swaptions, ratchet caps and floors, target redemption notes, autocaps,
zero-coupon swaptions, constant maturity swaps and spread options, among many others. The quantities that are
modeled, rather than the short rate or instantaneous forward rates (as in the Heath–Jarrow–Morton framework), are a
set of forward rates (also called forward LIBORs), which have the advantage of being directly observable in the market, and
whose volatilities are naturally linked to traded contracts. Each forward rate is modeled by a lognormal process
under its forward measure, i.e. a Black model, leading to a Black formula for interest rate caps. This formula is the
market standard for quoting cap prices in terms of implied volatilities, hence the term "market model". The LIBOR
market model may be interpreted as a collection of forward LIBOR dynamics for different forward rates with
spanning tenors and maturities, each forward rate being consistent with a Black interest rate caplet formula for its
canonical maturity. One can write the different rates' dynamics under a common pricing measure, for example the
forward measure for a preferred single maturity, in which case forward rates will not in general be lognormal under the
unique measure, leading to the need for numerical methods such as Monte Carlo simulation or
approximations like the frozen drift assumption.
Model dynamics
The LIBOR market model models a set of forward rates L_j(t), j = 1, \dots, n, as lognormal processes:

dL_j(t) = \sigma_j(t)\,L_j(t)\,dW^{T_{j+1}}_j(t)

Here, L_j(t) denotes the forward rate for the period [T_j, T_{j+1}], and W^{T_{j+1}}_j is a Brownian motion under
the T_{j+1}-forward measure. For each single forward rate the model corresponds to
the Black model. The novelty is that, in contrast to the Black model, the LIBOR market model describes the dynamics
of a whole family of forward rates under a common measure.
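Since each forward rate is a driftless lognormal process under its own forward measure, a caplet priced by Monte Carlo must reproduce the Black caplet formula. The self-check below is an illustrative sketch; the function names, the discount factor P_pay, and all parameter values are assumptions:

```python
import math
import random
from statistics import NormalDist

def black_caplet(L0, K, sigma, T, tau, P_pay):
    """Black formula for a caplet on forward rate L0 over [T, T+tau], paid at T+tau."""
    d1 = (math.log(L0 / K) + 0.5 * sigma**2 * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = NormalDist().cdf
    return P_pay * tau * (L0 * N(d1) - K * N(d2))

def mc_caplet(L0, K, sigma, T, tau, P_pay, n_paths=200_000, seed=7):
    """Monte Carlo under the forward measure: L(T) = L0*exp(-sigma^2 T/2 + sigma sqrt(T) Z)."""
    random.seed(seed)
    total = 0.0
    for _ in range(n_paths):
        z = random.gauss(0.0, 1.0)
        LT = L0 * math.exp(-0.5 * sigma**2 * T + sigma * math.sqrt(T) * z)
        total += max(LT - K, 0.0)
    return P_pay * tau * total / n_paths

exact = black_caplet(L0=0.05, K=0.05, sigma=0.2, T=1.0, tau=0.5, P_pay=0.93)
mc = mc_caplet(L0=0.05, K=0.05, sigma=0.2, T=1.0, tau=0.5, P_pay=0.93)
```

The two prices agree up to Monte Carlo error, which is the sense in which the model is "consistent with a Black caplet formula".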
External links
Java applets for pricing under a LIBOR market model and Monte-Carlo methods
[1]
Sample chapters of the book "Mathematical Finance" (ISBN 0470047224)
[2]
, with, e.g., a derivation of the
LIBOR market model drift.
Damiano Brigo's lecture notes on the LIBOR market model for the Bocconi University fixed income course
[3]
References
Brace, A., Gatarek, D. and Musiela, M. (1997): The Market Model of Interest Rate Dynamics, Mathematical
Finance, 7(2), 127–154.
Miltersen, K., Sandmann, K. and Sondermann, D. (1997): Closed Form Solutions for Term Structure Derivatives
with Log-Normal Interest Rates, Journal of Finance, 52(1), 409–430.
References
[1] http://www.christian-fries.de/finmath/applets/
[2] http://www.christian-fries.de/finmath/book/
[3] http://www.damianobrigo.it/bocconi.html
Heath–Jarrow–Morton framework
The Heath–Jarrow–Morton (HJM) framework is a general framework to model the evolution of the interest rate
curve, the instantaneous forward rate curve in particular (as opposed to simple forward rates). For direct modeling of
simple forward rates please see the Brace–Gatarek–Musiela model as an example.
The HJM framework originates from the work of David Heath, Robert A. Jarrow and Andrew Morton in the late
1980s, especially Bond pricing and the term structure of interest rates: a new methodology (1987), working paper,
Cornell University, and Bond pricing and the term structure of interest rates: a new methodology (1989), working
paper (revised ed.), Cornell University.
Framework
The key to these techniques is the recognition that the drifts of the no-arbitrage evolution of certain variables can be
expressed as functions of their volatilities and the correlations among themselves. In other words, no drift estimation
is needed.
Models developed according to the HJM framework are different from the so called short-rate models in the sense
that HJM-type models capture the full dynamics of the entire forward rate curve, while the short-rate models only
capture the dynamics of a point on the curve (the short rate).
However, models developed according to the general HJM framework are often non-Markovian and can even have
infinite dimensions. A number of researchers have made great contributions to tackling this problem. They show that if
the volatility structure of the forward rates satisfies certain conditions, then an HJM model can be expressed entirely
by a finite-state Markovian system, making it computationally feasible. Examples include a one-factor, two-state
model (O. Cheyette, "Term Structure Dynamics and Mortgage Valuation", Journal of Fixed Income, 1, 1992; P.
Ritchken and L. Sankarasubramanian, "Volatility Structures of Forward Rates and the Dynamics of the Term
Structure", Mathematical Finance, 5, No. 1, Jan 1995), and later multi-factor versions.
Mathematical formulation
The class of models developed by Heath, Jarrow and Morton (1992) is based on modeling the forward rates, yet it
does not capture all of the complexities of an evolving term structure.
The instantaneous forward rate f(t,T) is the continuous compounding rate available at time T as seen
from time t. It is defined by:

f(t,T) = -\frac{\partial \ln P(t,T)}{\partial T} \qquad (1)

The basic relation between the rates and the bond prices is given by:

P(t,T) = \exp\left(-\int_t^T f(t,s)\,ds\right) \qquad (2)

Consequently, the bank account grows according to:

\beta(t) = \exp\left(\int_0^t f(s,s)\,ds\right) \qquad (3)

since the spot rate at time t is r(t) = f(t,t).
The assumption of the HJM model is that the forward rates satisfy, for any T:

df(t,T) = \alpha(t,T)\,dt + \sigma(t,T)\,dW_t \qquad (4)

where the processes \alpha(\cdot,T) and \sigma(\cdot,T) are continuous and adapted.
For this assumption to be compatible with the assumption of the existence of martingale measures, we need the
following relation to hold under the martingale measure:

\frac{dP(t,T)}{P(t,T)} = r(t)\,dt + \zeta(t,T)\,dW_t \qquad (5)

for some adapted volatility process \zeta(t,T). We find the return on the bond in the HJM model and compare it with (5) to obtain models that do not allow for arbitrage.
Let

Y(t) = \ln P(t,T) = -\int_t^T f(t,s)\,ds \qquad (6)

Then

dY(t) = f(t,t)\,dt - \int_t^T df(t,s)\,ds \qquad (7)

Using Leibniz's rule for differentiating under the integral sign we have that:

dY(t) = r(t)\,dt - \alpha^*(t,T)\,dt - \sigma^*(t,T)\,dW_t \qquad (8)

where

\alpha^*(t,T) = \int_t^T \alpha(t,s)\,ds, \qquad \sigma^*(t,T) = \int_t^T \sigma(t,s)\,ds

By Itô's lemma,

\frac{dP(t,T)}{P(t,T)} = dY(t) + \frac{1}{2}\,\sigma^*(t,T)^2\,dt \qquad (9)

It follows from (5) and (9), we must have that

\zeta(t,T) = -\sigma^*(t,T) \qquad (10)

r(t) = r(t) - \alpha^*(t,T) + \frac{1}{2}\,\sigma^*(t,T)^2 \qquad (11)

Rearranging the terms we get that

\int_t^T \alpha(t,s)\,ds = \frac{1}{2}\left(\int_t^T \sigma(t,s)\,ds\right)^2 \qquad (12)

Differentiating both sides by T, we have that

\alpha(t,T) = \sigma(t,T)\int_t^T \sigma(t,s)\,ds \qquad (13)

Equation (13) is known as the no-arbitrage condition in the HJM model. Under the martingale probability measure
the equation for the forward rates becomes:

df(t,T) = \left(\sigma(t,T)\int_t^T \sigma(t,s)\,ds\right)dt + \sigma(t,T)\,dW_t \qquad (14)

This equation is used in the pricing of bonds and their derivatives.
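The no-arbitrage condition (13) fixes the drift once the volatility is chosen. The sketch below (all names are illustrative) computes the drift numerically for an arbitrary volatility function and checks two classical special cases: constant volatility σ, where α(t,T) = σ²(T − t) (the Ho-Lee case), and exponentially damped volatility σe^{−a(T−t)}, which recovers the Hull-White drift:

```python
import math

def hjm_drift(sigma_fn, t, T, n=10_000):
    """HJM no-arbitrage drift alpha(t,T) = sigma(t,T) * integral_t^T sigma(t,s) ds,
    with the integral approximated by the trapezoidal rule."""
    h = (T - t) / n
    xs = [t + i * h for i in range(n + 1)]
    vals = [sigma_fn(t, s) for s in xs]
    integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    return sigma_fn(t, T) * integral

sigma, a = 0.01, 0.1
# Constant vol: drift should equal sigma^2 * (T - t)
alpha_const = hjm_drift(lambda t, s: sigma, t=0.0, T=2.0)
# Exponentially damped vol: drift should equal (sigma^2/a) e^{-a(T-t)} (1 - e^{-a(T-t)})
alpha_hw = hjm_drift(lambda t, s: sigma * math.exp(-a * (s - t)), t=0.0, T=2.0)
```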
See also
Ho–Lee model
Hull–White model
Black–Derman–Toy model
Chen model
Brace–Gatarek–Musiela model
External links and references
Bond Pricing and the Term Structure of Interest Rates: A New Methodology for Contingent Claims Valuation.
David Heath, Robert A. Jarrow and Andrew Morton, Econometrica, 1992, vol. 60, issue 1, pages 77–105.
doi:10.2307/2951677
Heath–Jarrow–Morton model and its application
[1]
, Vladimir I Pozdynyakov, University of Pennsylvania
An Empirical Study of the Convergence Properties of the Non-recombining HJM Forward Rate Tree in Pricing
Interest Rate Derivatives
[2]
, A.R. Radhakrishnan, New York University
References
[1] http://repository.upenn.edu/dissertations/AAI3015358/
[2] http://papers.ssrn.com/sol3/papers.cfm?abstract_id=123170
Article Sources and Contributors
Mathematical finance Source: http://en.wikipedia.org/w/index.php?oldid=391412401 Contributors: A.j.g.cairns, Acroterion, Ahd2007, Albertod4, Allemandtando, Amckern, Angelachou,
Arthur Rubin, Author007, Avraham, Ayonbd2000, Baoura, Beetstra, Billolik, Btyner, Burakg, Burlywood, CapitalR, Cfries, Christofurio, Ciphers, Colonel Warden, DMCer, Drootopula,
DuncanHill, Eric Kvaalen, FF2010, Financestudent, Fintor, Flowanda, Gabbe, Gary King, Gene Nygaard, Giftlite, Giganut, HGB, Halliron, Hannibal19, Headbomb, Hroulf, Hu12, Hgsippe
Cormier, JBellis, Jackol, Jamesfranklingresham, Jimmaths, Jmnbatista, JonHarder, JonMcLoone, Jonhol, Jrtayloriv, Kaslanidi, Kaypoh, Kimys, Kolmogorov Complexity, Kuru, Langostas,
MER-C, MM21, Michael Hardy, Michaltomek, MrOllie, Niuer, Nparikh, Oleg Alexandrov, Onyxxman, Optakeover, Pcb21, PhotoBox, Pnm, Portutusd, Punanimal, Quantchina, Quantnet,
Ralphpukei, Rhobite, Riskbooks, Ronnotel, SUPER-QUANT-HERO, Sardanaphalus, Sentriclecub, Silly rabbit, SkyWalker, Smesh, Stanislav87, Tassedethe, Tigergb, Timorrill, Uxejn,
Vabramov, Vasquezomlin, WebScientist, Willsmith, Woohookitty, Xiaobajie, YUL89YYZ, Yunli, 165 anonymous edits
Asymptotic analysis Source: http://en.wikipedia.org/w/index.php?oldid=391201448 Contributors: Alastair Carnegie, AlexUT, Altenmann, Ashwin, Bdforbes, Bluemaster, Charles Matthews,
Constructive editor, Delirium, Drbreznjev, Dyaa, Giftlite, Jitse Niesen, Landroni, MFH, Marc van Leeuwen, MarkSweep, Melcombe, Mets501, Mhym, Michael Hardy, NeoUrfahraner, Ninly,
Nv8200p, OMERZEN, Oleg Alexandrov, Plw, Porcher, Rintrah, RobinK, Sligocki, Sverdrup, TheSeven, Twisted86, Wilke, Yhkhoo, Zero0000, 23 anonymous edits
Calculus Source: http://en.wikipedia.org/w/index.php?oldid=394948006 Contributors: 01001, 07fan, 129.132.2.xxx, 14chuck7, 1exec1, 207.77.174.xxx, 24.44.206.xxx, 4.21.52.xxx,
4twenty42o, 64.252.67.xxx, 6birc, ABCD, APH, Aaronbrick, Abcdwxyzsh, Abmax, Abrech, AbsolutDan, Accident4, Ace Frahm, Acepectif, Acroterion, Adamantios, Ahoerstemeier, Ahy1,
Akrabbim, Aktsu, Alansohn, AlexiusHoratius, Ali, Allen Moore, Allen3, Allen4names, Alpha Beta Epsilon, Alpha Omicron, AltGrendel, AmeliaElizabeth, AnOddName, AndHab, Andonic,
Andorphin, Andre Engels, Andrewlp1991, Andrewpmk, AndyZ, Angela, Angr, Animum, Antandrus, Antonio Lopez, Ap, Appropo, Arcfrk, Arno, Arthena, Arthur Rubin, Arthursimms, Asjafari,
Astropithicus, Asyndeton, Atallcostsky, Aurumvorax, AustinKnight, Avenue, Avoided, Awh, AxelBoldt, B, BOARshevik, Badagnani, Ballz4kidz, Barneca, Baronnet, Batmanand, Bazookacidal,
Bcherkas, Bcrowell, Beerad34, Bellas118, BenB4, Berek, Berndt, Bethnim, Bethpage89, Bevo, Bfesser, Bgpaulus, BiT, Biggerdude, Billymac00, Binary TSO, Bingbong555, Bkell, Bkessler23,
Black Falcon, Black Kite, Blahdeeblaj, Blainster, BlueDevil, Bmk, Bobblewik, Bobo192, Bogey97, Bonadea, Bongwarrior, Bookmaker, Bookmarks4life, Boznia, Brian Everlasting, Brianga,
Brion VIBBER, BryanHolland, Bsroiaadn, Buckner 1986, Buillon sexycat800, Burris, C S, C quest000, CART fan, CBM, CDutcher, CIreland, CL8, CSTAR, Cabalamat, Cabhan, Caesar1313,
Calculuschicken, Callmebill, Calqulus, Caltas, Calton, Calvin 1998, Camw, Can't sleep, clown will eat me, CanadianLinuxUser, Cap'n Refsmmat, Capricorn42, Carasora, CardinalDan,
CarlBBoyer, Carso32, Castro92, Catgut, CathySc, Cdthedude, Cenarium, Cessator, Cfarsi3, Charles Matthews, Cheeser1, Chibitoaster, Choster, Christofurio, Chriszim, Chun-hian, Ckatz,
Cmarkides, Cmwslw, Coldsquid, Commander Keane, CommonModeNoise, Comrademikhail, Conversion script, Courcelles, Courtneylynn45, CptCutLess, Cronholm144, Crotalus horridus,
Css2002, Cthompson, Cymon, DARTH SIDIOUS 2, DHN, DMacks, DVdm, Da Gingerbread Man, Damian Yerrick, Damicatz, Daniel Arteaga, Daniel Hughes 88, Daniel J. Leivick, Daniel
Quinlan, Daniel5127, DanielDeibler, Daniele.tampieri, Dannery4, Danski14, Darth Panda, Daryl Williams, Daven200520, Davewild, David Newton, DavidCBryant, Daxfire, Db099221, Dbach,
DeadEyeArrow, Debator of mathematics, Deeptrivia, Dekisugi, Delbert Grady, DerHexer, Dferg, Diddlefart, Diginity, Diletante, Dimimimon7, Dionyziz, Discospinster, Diverman, Dmharvey,
Doctormatt, Dominus, Domthedude001, Dontwerryaboutit, DopefishJustin, DragonflySixtyseven, Drdonzi, DreamGuy, Drilnoth, Drywallandoswald, Dtgm, Dullfig, Dyknowsore, Dysepsion,
Dysprosia, EJF, EMessina92, EdH, Edcar11, Edcolins, Edmoil, Eduardoporcher, Edward, Edward321, Egil, Egmontaz, Einsteins37, Eisnel, Ekotkie, El C, Elementaro, Eliyak, Elkman,
Eloquence, Email4mobile, Emann74, Emily Jensen, Emmett, Empty Buffer, Epbr123, Erutuon, Escape Orbit, Espressobongo, Estel, Everyking, Evil saltine, Excirial, Existentialistcowboy,
Eyu100, Faithlessthewonderboy, Falcorian, Farquaadhnchmn, Favonian, Feezo, Feinstein, Fephisto, Fetchcomms, Fiedorow, FilipeS, Filippowiki, Finell, Fintler, Fixthatspelling, Flex, Flutefreek,
Foobar333, Footballfan190, Four Dog Night, Fowler&fowler, Foxtrotman, Frazzydee, Freakinadrian, Fredrik, FrozenPurpleCube, Frymaster, Furrykef, Fuzzform, G.W., G026r, GT5162, Gabriel
Kielland, Gabrielleitao, Gadfium, Gaelen S., Gaff, Gaius Cornelius, Gaopeng, Gene Ward Smith, Genius101, Geoking66, Geometry guy, Gesslein, Giftlite, Gilliam, Glane23, Gnat,
Goeagles4321, Gofeel, Gogo Dodo, Golezan, Goocanfly, Goodwisher, Googl, Gop 24fan, Gracenotes, Graham87, Grokmoo, Groovybill, Groundling, Gscshoyru, Guanaco, Guiltyspark,
Guoguo12, Gurchzilla, Guy M, Gwernol, Gwguffey, Habhab38, Hadal, Hajhouse, Hannes Eder, Hanse, Haonhien, Harryboyles, Hawkhkylax26, Hawthorn, Hdt83, Headbomb, Headhold,
Hebrides, Heimstern, Helios Entity 2, Helix84, Helvetius, Heron, Hesacon, Hexagon70, Hgetnet, High Elf, Hike395, Hippasus, HolIgor, Homestarmy, Hotstreets, Htim, Hut 8.5, Hydrogen Iodide,
IDX, II MusLiM HyBRiD II, Icrosson, Ictlogist, Idealitem, Ideyal, Ieremias, If I Am Blocked I Will Cry, Igiffin, Ike9898, Ikiroid, Ilikepie2221, Imjustmatthew, Infinity0, Infrogmation,
Inquisitus, Insanity Incarnate, Interrobang, Ioscius, Iosef, Irish Souffle, IronGargoyle, Ironman104, IslandHopper973, Izzy007, J.Wolfe@unsw.edu.au, J.delanoy, JDPhD, JForget, JFreeman, JJL,
JTB01, JWillFiji, JaGa, Jacek Kendysz, Jackbaird, Jacob Nathaniel Richardson, Jacobolus, Jagged 85, JaimenKing, Jak86, Jake Wartenberg, James, James086, Jan1nad, Jandjabel, Jason Lynn
Carter, Jasongallagher, Jay.perna, Jclemens, Jeff3000, JeffPla, Jengirl1988, JensenDied, Jenssss, Jersey Devil, Jfiling, Jfilloy, JimR, JimVC3, Jimothy 46, Jimp, JinJian, Jitse Niesen, Jj137,
Jjacobsmeyer, Jman9 91, John Kershaw, John254, Johnnybfat, Johnuniq, Joodeak, Joseph Solis in Australia, Joshuac333, Jpo, Junglecat, Justinep, Jwpurple, Jxg, Jyril, Jna runn, KRS, Kai
Hillmann, Kamrama, Karl Dickman, Katanaofdoom, Katzmik, Kbdank71, Kemiv, Ken Kuniyuki, Kesac, Ketsuekigata, Killdevil, Killfire72, Koavf, Kocher2006, Koyos, Kragen, KrakatoaKatie,
Krich, Kristinadam, Kubigula, Kukooo, Kuru, L Kensington, L33tweasley, LLcopp, Lambiam, Le coq d'or, LeaveSleaves, Leszek Jaczuk, Lethe, Lifung, Lightdarkness, Likebox, Lindmere, Lir,
LittleDan, LittleOldMe, Littleyoda20, Loelin, Lollerskates, Lradrama, Luka666, Luna Santin, Lupo, M.hayek, M1ss1ontomars2k4, MER-C, MONGO, MacGyverMagic, Madchester,
Madmath789, Magioladitis, Malatesta, Mani1, Manuel Trujillo Berges, MapsMan, Marcushdaniel, Mariewoestman, MarkMarek, Markus Krtzsch, Mashford, Math.geek3.1415926, Matthias
Heiler, Mauler90, Maurice Carbonaro, Maurreen, Mav, Maxis ftw, Maxstr9, Mayumashu, Meisterkoch, Melos Antropon, Mentifisto, Merube 89, Mets501, Mgmei, Mgummess, Michael Hardy,
Michaelh09, Mike2vil, Miked2009, Minestrone Soup, Miskin, MithrandirAgain, Mjpurses, Mlm42, Modernage, Modulatum, Moink, Mokakeiche, Mr Stephen, MrOllie, MrSomeone, Mrbond69,
Mrhurtin, Ms2ger, Mspraveen, Musicman69123, Mygerardromance, N.j.hansen, Nahum Reduta, Nandesuka, Narcissus, Natural Philosopher, NawlinWiki, Nbarth, Ndkl, NeilN, Neokamek,
Newton890, Nick Garvey, Nigel5, Nikai, NinSmartSpasms, Ninly, Nixeagle, Nnedass, Nneonneo, Nohup.in, Nolefan3000, NuclearWarfare, NuclearWinner, Nucleusboy, Nufy8, Nuttyskin,
OSJ1961, Obey, Obradovic Goran, Oceras, Oleg Alexandrov, Oliver202, Olop4444, Omicronpersei8, Oreo Priest, Orlady, Orphic, Otheus, OverlordQ, OwenX, Owlgorithm, Ozob, P Carn, Pabix,
Pakula, Pascal.Tesson, Pattymayonaise, Paul August, Pcap, Peere, Penguinpwrdbox, Peruvianllama, Peter Grey, Petter Strandmark, Pgunnels, Phil Bastian, Philip Trueman, PhotoBox, PhySusie,
Physprob, Piano non troppo, Pieburningbateater, PierreAbbat, Pilif12p, PinchasC, Pinethicket, Pizza Puzzle, Pluppy, Pmanderson, Pmeisel, Pnzrfaust, Poison ivy boy, Pokemon1989, Pomakis,
Pramboy, Pranathi, Professor Fiendish, Proffie, Programmar, Puchiko, Puck is awesome., PurpleRain, Pvjthomas, Pyrospirit, Qertis, Quangbao, Quantumobserver, Quintote, Qxz, RHaworth,
RJHall, Ragesoss, Ral315, Ramblagir, Ramin325, Razimantv, Razorflame, Rdsmith4, Reach Out to the Truth, Reaperman, Recentchanges, Recognizance, Reconsider the static, Red Winged
Duck, RedWolf, Reepnorp, RekishiEJ, Renato Caniatti, Rettetast, Revolver, Rich Farmbrough, Rick Norwood, Rjwilmsi, Rl, Roastytoast, RobHar, Robertgreer, RodC, Rokfaith, Rorro, Rossami,
Rotem Dan, Routeh, Roy Brumback, Royboycrashfan, Roylee, Rpchase, Rpg9000, Rrenner, Rtyq2, Rustysrfbrds99, Rxc, Ryan Postlethwaite, Ryulong, SFC9394, Salix alba, Saupreissen,
Savidan, Schneelocke, ScienceApologist, Sciurin, Scottydude, Sdornan, SeoMac, Sephiroth BCR, Sfngan, Shanel, Sheeana, SheepNotGoats, Shinjiman, Shizhao, Shunpiker, Silly rabbit,
Simetrical, Sjakkalle, Sjforman, Skal, Skater, Skiasaurus, Skydot, Smashville, Smeira, SmilesALot, Smithbcs, Smithpith, Smoken Flames, Snotchstar!, SoSaysChappy, Soltras, Someones life,
Sp00n, SpK, SpLoT, Spartan-James, Specs112, Spkid64, Splash, Spreadthechaos, SpuriousQ, Sr1111, Srkris, Stammer, StaticGull, Stephenb, Stevenmattern, Stevertigo, Stickee, Stizz, Storeye,
Stumps, Stwalkerster, Suyashmanjul, Swegei, Symane, TBadger, TakuyaMurata, Tangent747, Tanweer Morshed, Tarek j, Tarret, Tawker, Taxman, Tbonnie, Tbsmith, Tcncv, TedE, Tedjn,
Telempe, Template namespace initialisation script, Terence, Terminaterjohn, Tetracube, Tfeeney65, That1dude35, Thatguyflint, The Anome, The Thing That Should Not Be, The Transhumanist,
The Transhumanist (AWB), The Utahraptor, The wub, TheMidnighters, Themfromspace, Thenub314, Thomasmeeks, Thunderboltz, ThuranX, Tide rolls, Tiga-Now, Tikiwont, Timo Honkasalo,
Timwi, Tkuvho, Tobby72, Tomayres, Tony Fox, Torvik, Tosha, Tothebarricades.tk, Travisc, Trd89, Tribaal, TrigWorks, Trovatore, Trusilver, Truth100, Tualha, TutterMouse, Tzf, Ukexpat, V10,
VDWI, VMS Mosaic, Variable, VasilievVV, Viriditas, Visualerror, WPIsFlawed, Wa2ise, Wapcaplet, Watsonksuplayer, Wayward, Welsh, Widdma, Wik, Wiki alf, WikiZorro, Wikiklrsc,
Wikilibrarian, WikipedianMarlith, Willardsimmons, William felton, Wimt, Wknight94, Wolfrock, Worrydoes, Wowz00r, Wraithdart, Wwoods, X!, Xantharius, Xharze, Xnuala, Xod, Xornok,
Xrchz, Y Strings 9 The Bar, Yacht, Yamamoto Ichiro, Yazaq, YellowMonkey, Yerpo, Yongtze28, Yosri, Youandme, YourBrain, Yute, Zachorious, Zaraki, Zchenyu, Zenohockey, ,
2, 1599 anonymous edits
Copula (statistics) Source: http://en.wikipedia.org/w/index.php?oldid=393675452 Contributors: A. Pichler, Alanb, Albmont, Alg543, Amir9a, AndrewDressel, Andrewpmack, AnonMoos,
Aptperson, ArsniureDeGallium, Asjogren, Asnelt, Avraham, BenFrantzDale, Bender235, Benjaminveal, Caco21, Charles Matthews, Christofurio, CopulaTomograph, Crasshopper, Den fjttrade
ankan, Derex, Diegotorquemada, Fabrizio.durante, Feraudyh, GUONing, Gabeornelas, Gbohus, Gene Nygaard, Giftlite, Helgus, Hu12, Ikiwdq55, Inference, JA(000)Davidson, Jeffq, Jitse Niesen,
KHamsun, Kimys, Kmanoj, Magicmike, Martinp, Martyndorey, Mcld, Melcombe, Michael Hardy, Mwarren us, Nelson50, Nutcracker, Oleg Alexandrov, Ossiemanners, Paul H., PeterSarkoci,
Philtime, Piloter, Qwfp, Rajah9, Roadrunner, SHINIGAMI-HIKARU, Shuetrim, Skbkekas, Srini121, Stigin, Sumple, Tomixdf, Waldir, Woohookitty, Woutersmet, Zasf, Zsolt Tulassay, Zundark,
122 anonymous edits
Differential equation Source: http://en.wikipedia.org/w/index.php?oldid=392867439 Contributors: 17Drew, After Midnight, Ahoerstemeier, Alarius, Alfred Centauri, Andrei Polyanin, Andres,
AndrewHowse, Andycjp, Andytalk, AngryPhillip, Anonymous Dissident, Antonius Block, Apmonitor, Arcfrk, Asdf39, Asyndeton, Attilios, Babayagagypsies, Baccala@freesoft.org, Baccyak4H,
Bejohns6, Bento00, Berland, Bidabadi, Bigusbry, Brandon, Bryanmcdonald, Btyner, Bygeorge2512, Callumds, Charles Matthews, Chtito, Cmprince, ConMan, Cxz111, Cybercobra, DAJF,
Danski14, Dbroadwell, Ddxc, Delaszk, DerHexer, Difu Wu, Djordjes, DominiqueNC, Donludwig, Dpv, Dr sarah madden, Drmies, DroEsperanto, Dysprosia, EconoPhysicist, Elwikipedista,
EricBright, Estudiarme, Fintor, Fioravante Patrone en, Gabrielleitao, Gandalf61, Gauss, Geni, Giftlite, Gombang, Grenavitar, Haham hanuka, Hamiltondaniel, Haruth, Haseeb Jamal, Heikki m,
Ilya Voyager, Iquseruniv, Iulianu, Izodman2012, J arino, J.delanoy, Ja 62, Jak86, Jao, Jayden54, Jersey Devil, Jim.belk, Jim.henderson, Jitse Niesen, JohnOwens, JorisvS, K-UNIT, Kayvan45622,
KeithJonsn, Kensaii, Kr5t, LOL, Lambiam, Lavateraguy, Lethe, LibLord, Linas, Lumos3, Madmath789, MarSch, Martynas Patasius, Math.geek3.1415926, Matqkks, Mattmnelson, Maurice
Carbonaro, Maxis ftw, Mazi, McVities, Mduench, Mets501, Mh, Michael Hardy, Mindspillage, MisterSheik, Mohan1986, Mossaiby, Mpatel, MrOllie, Mtness, Mysidia, Nik-renshaw,
Nkayesmith, Oleg Alexandrov, Opelio, Pahio, Paul August, Paul Matthews, Paul Richter, Pgk, Phoebe, PseudoSudo, Qzd800, Rama's Arrow, Reallybored999, RexNL, Robin S, Romansanders,
Rosasco, Ruakh, SDC, SFC9394, Salix alba, Sam Staton, Senoreuchrestud, Silly rabbit, Siroxo, Skypher, SmartPatrol, Snowjeep, Spirits in the Material, Starwiz, Sverdrup, Symane, TVBZ28,
TYelliot, Tbsmith, TexasAndroid, The Hybrid, The Thing That Should Not Be, Timelesseyes, Tranum1234567890, User A1, Vanished User 0001, Waltpohl, Wclxlus, Wihenao, Willtron,
Winterheart, XJamRastafire, Yafujifide, Zepterfd, 254 anonymous edits
Expected value Source: http://en.wikipedia.org/w/index.php?oldid=394113300 Contributors: 65.197.2.xxx, A. Pichler, Aaronchall, Adamdad, Albmont, Almwi, AxelBoldt, B7582, Banus,
Bdesham, BenFrantzDale, Bjcairns, Brews ohare, Brockert, Bth, Btyner, CKCortez, Caesura, Calbaer, Caramdir, Carbuncle, Cburnett, Centrx, Charles Matthews, Chris the speller, Cloudguitar,
Coffee2theorems, Conversion script, Cretog8, Dartelaar, Daryl Williams, DavidCBryant, Dpv, Draco flavus, Drpaule, El C, Elliotreed, Fibonacci, FilipeS, Fintor, Fresheneesz, Funandtrvl,
Gala.martin, Gary King, Giftlite, Glass Sword, GraemeL, Grafen, Grapetonix, Greghm, Grubber, Guanaco, H2g2bob, HenningThielemann, Hyperbola, INic, Iakov, Idunno271828, Ikelos,
Jabowery, Jancikotuc, Jcmo, Jitse Niesen, Jj137, Jordsan, Jrincayc, Jsondow, Jt68, KMcD, Karol Langner, Katzmik, Kazabubu, Kurykh, LALess, LOL, Lee Daniel Crocker, Leighliu, Levineps,
Lponeil, MHoerich, MarSch, Markhebner, Mccready, Melchoir, Melcombe, Mgreenbe, Michael Hardy, Mindbuilder, Minimac, MrOllie, Netheril96, NinjaCharlie, O18, Obradovic Goran, Oleg
Alexandrov, Openlander, Ossiemanners, PAR, Patrick, Percy Snoodle, Pgreenfinch, Phdb, PierreAbbat, Pol098, Poor Yorick, Populus, Puckly, Q4444q, Qwfp, R3m0t, Reetep, Reric, Rjwilmsi,
RobHar, Robinh, Romanempire, Ronald King, Rray, Ryguasu, Saebjorn, Salix alba, Schmock, SebastianHelm, Shredderyin, Shreevatsa, Skarl the Drummer, Steve Kroon, Steven J. Anderson,
Stpasha, Tarotcards, Tarquin, Taxman, TedPavlic, Tejastheory, The Bad Boy 3584, TheObtuseAngleOfDoom, Tide rolls, Tobi Kellner, Tomi, Troy112233, Tsirel, Unfree, Unyoyega, Varuag
doos, Viesta, Werner.van.belle, Wmahan, Yesitsapril, Zero0000, ZeroOne, Zojj, ZomBGolth, Zvika, 212 anonymous edits
Ergodic theory Source: http://en.wikipedia.org/w/index.php?oldid=388432227 Contributors: AdamSiska, Arcfrk, Armando82, Bdmy, Benbest, Bowenthebeard, CBM, CRGreathouse, Catquas,
Charles Matthews, D6, Dcljr, Den fjttrade ankan, Diligent, DirkOliverTheis, Dysprosia, Fredrik, Giftlite, Headbomb, Hmasoom, Huon, Jackzhp, Jheald, Jim.belk, Jmath666, Joseph Grcar,
K-UNIT, Klaus scheicher, LamaO, Lemur235, Linas, Mamluk, Mani1, Mdd, Melcombe, Mhym, Michael Hardy, Mild Bill Hiccup, Msuzen, Negrello, Neithan Agarwaen, NoEdward,
OO0000OO, Ojigiri, PMajer, Phils, Pokipsy76, Rjwilmsi, RobHar, Rs2, Rspanton, SerialJaywalker, Silly rabbit, Spmeyn, Stotr, Sawomir Biay, Takwan, That Guy, From That Show!, The
Anome, Tiled, Tinton5, Tobacman, Tong, Torsten Nielsen, Vegasprof, Wile E. Heresiarch, XaosBits, Zvika, 54 anonymous edits
FeynmanKac formula Source: http://en.wikipedia.org/w/index.php?oldid=373075977 Contributors: Bedros815, Charles Matthews, Cherkash, Constructive editor, DavidCBryant,
Davidjfischer, Delirium, Eweinber, Genie05, Giftlite, GuidoGer, J.delanoy, Jmnbatista, Laser.Y, LeYaYa, Maximus Rex, Michael Hardy, Michael Slone, Nbarth, Ott2, Serhiy.kozak,
Shadowjams, Synchronism, Taxman, TensorProduct, Tobias Bergemann, Vovchyck, 31 anonymous edits
Fourier transform Source: http://en.wikipedia.org/w/index.php?oldid=395007877 Contributors: A Doon, Abecedare, Admartch, Adoniscik, Ahoerstemeier, Akbg, Alejo2083, Alipson,
Amaher, AnAj, Andrei Polyanin, Andres, Angalla, Anna Lincoln, Ap, Army1987, Arondals, Avicennasis, AxelBoldt, Barak Sh, Bci2, Bdmy, BenFrantzDale, BigJohnHenry, Bo Jacoby, Bob K,
Bobblewik, Bobo192, Bugnot, Burhem, Butala, CSTAR, Caio2112, Cassandra B, Catslash, Cburnett, Ch mad, Charles Matthews, Chris the speller, ClickRick, Cmghim925, Complexica,
Compsonheir, Coppertwig, CrisKatz, Crisfilax, DX-MON, Da nuke, DabMachine, DavidCBryant, Demosta, Discospinster, DmitTrix, Dmmaus, Dougweller, Dr Dec, DrBob, Drew335, Drilnoth,
Dysprosia, EconoPhysicist, Ed g2s, Eliyak, Elkman, Enochlau, Feline Hymnic, Feraudyh, Fizyxnrd, Fr33kman, Fred Bradstadt, Fropuff, Futurebird, Gaius Cornelius, Gareth Owen, Giftlite,
Glenn, GuidoGer, GyroMagician, H2g2bob, HappyCamper, Heimstern, HenningThielemann, Hesam7, HirsuteSimia, Hrafeiro, Ht686rg90, I am a mushroom, Igny, Ivan Shmakov, Iwfyita,
Jaakobou, Jdorje, Jhealy, Joerite, JohnQPedia, Joriki, Justwantedtofixonething, KHamsun, KYN, Keenan Pepper, Kevmitch, Kostmo, Kunaporn, Larsobrien, Linas, Looxix, Lovibond,
Luciopaiva, Lupin, M1ss1ontomars2k4, Manik762007, MathKnight, Maxim, Mckee, Metacomet, Michael Hardy, Mikeblas, Moxfyre, Mr. PIM, NTUDISP, Naddy, Nbarth, Nihil, Nishantjr,
Njerseyguy, Nmnogueira, NokMok, Od Mishehu, Oleg Alexandrov, Oli Filth, Omegatron, Ouzel Ring, PAR, Pak21, Papa November, Paul August, Pedrito, Pete463251, Petergans, Phasmatisnox,
Phils, PhotoBox, PigFlu Oink, Poincarecon, PtDw832, Publichealthguru, Quintote, Qwfp, R.e.b., Rainwarrior, Rbj, Red Winged Duck, Riesz, Rifleman 82, Rijkbenik, Rjwilmsi, RobertHannah89,
Rror, Rs2, Rurz2007, SKvalen, Safenner1, Sai2020, Sandb, Sbyrnes321, SebastianHelm, Sepia tone, Sgreddin, Shreevatsa, Silly rabbit, Slawekb, SlimDeli, Snigbrook, Snoyes, Sohale,
SpaceFlight89, Stevenj, Stpasha, StradivariusTV, Sunev, Sverdrup, Sylvestersteele, Sławomir Biały, THEN WHO WAS PHONE?, TakuyaMurata, TarryWorst, Tetracube, The Thing That Should
Not Be, Thenub314, Thermochap, Thinking of England, Tim Goodwyn, Tim Starling, Tinos, Tobias Bergemann, TranceThrust, Ujjalpatra, User A1, Vadik wiki, Vasi, Verdy p, VeryNewToThis,
VictorAnyakin, Vidalian Tears, WLior, WikiDao, Wile E. Heresiarch, Writer130, Wwheaton, Ybhatti, YouRang?, Zoz, Zvika, 426 anonymous edits
Girsanov's theorem Source: http://en.wikipedia.org/w/index.php?oldid=35943342 Contributors: Alanb, Bender235, BeteNoir, Brian Tvedt, Charles Matthews, Crasshopper, DocendoDiscimus,
Elb2000, Giftlite, Hillgentleman, Kaslanidi, Melcombe, Michael Hardy, MrOllie, Omnipaedista, Pcb21, Pearle, Petrelharp, Petrus, Populus, Rich Farmbrough, Rjwilmsi, Roboquant, Sbossone,
Stigin, Taxman, Vincent kraeutler, Willsmith, 25 anonymous edits
Itô's lemma Source: http://en.wikipedia.org/w/index.php?oldid=19326193 Contributors: Arjen Dijksman, Bdmy, Berland, Blight 88, Btyner, CanadianLinuxUser, Charles Matthews,
Christofurio, Crasshopper, Dan131m, Ehrenkater, Erxnmedia, FF2010, Farooqaziz86, Fintor, Flyingspuds, Fredrik, Freeinfo47, Gauge, Giftlite, Jackzhp, Jersey Devil, Jmath666, Jmc200,
Jojhutton, Julien Tuerlinckx, Kintetsubuffalo, Laser.Y, Marsden, Md25, Melcombe, Michael Hardy, Mr Ape, Nbrouard, OdedSchramm, Ohmygodbees, Oleg Alexandrov, OliAtlason, Orzetto,
Ott2, Pcb21, Pfortuny, Roadrunner, Roboquant, Rorro, Ruud Koot, Slaniel, Smaines, The Anome, Urbansuperstar, Vovchyck, Waltpohl, WhisperToMe, Zophar, 78 anonymous edits
Martingale representation theorem Source: http://en.wikipedia.org/w/index.php?oldid=393220493 Contributors: Alanb, Btyner, Charles Matthews, Elf, Eug, Gala.martin, Giftlite, HyDeckar,
Kylu, LeYaYa, M1ss1ontomars2k4, Mailer diablo, Meekohi, Melcombe, Michael Hardy, ReiVaX, Stefan Pantiru, 11 anonymous edits
Mathematical model Source: http://en.wikipedia.org/w/index.php?oldid=393942382 Contributors: 130.159.254.xxx, 213.253.39.xxx, Abdull, Abdullais4u, Abune, Agor153, Al Lemos, Aliotra,
Altenmann, AndrewDressel, Arvindn, Atlant, Audriusa, Awickert, AxelBoldt, Azuris, BD2412, BenBaker, Binarypower, Brianjd, CarlHewitt, Carlodn6, Cazort, Conversion script, Cureden,
Cybercobra, DARTH SIDIOUS 2, Dakart, Derek Ross, Dori, Dterp, Dvsphanindra, Dysprosia, Elapsed, EmersonLowry, Emperorbma, Eurosong, EyeSerene, Farazgiki, Filippof, Fioravante
Patrone en, Frau Holle, G.de.Lange, GiantSloth, Giftlite, Globalsolidarity, Gogo Dodo, Goodralph, Graham87, Gregbard, Grubber, Hans Adler, Hede2000, Helix84, Henrygb, Hesperian,
HisashiKobayashi, HolgerK, Howard.noble323, Hxxvxxy, Hydrargyrum, JRSpriggs, JWSchmidt, Jarash, Jdpipe, John Deas, JohnOwens, Jonnat, Josephfrodo, Kai.velten, Karnesky, Kbdank71,
Kingturtle, Kwaku, Lambiam, Lexor, MCrawford, Maedin, Magister Mathematicae, Maksim-e, Manop, Marek69, Martin451, MathMartin, Matqkks, Mayooranathan, Mdd, Meemee878, Michael
Hardy, Millancad, MrOllie, Msh210, Neilc, Nikai, Nshackles, Obradovic Goran, Oleg Alexandrov, Oliver Lineham, Olivier, Omicronpersei8, PMDrive1061, Paolo.dL, Passw0rd, Pdenapo,
Phanerozoic, Philip Trueman, Piano non troppo, Porcher, Prazan, Ranilb5, Rbakker99, Rich Farmbrough, RoyBoy, Sam Douglas, Sangwinc, Seaphoto, Silverhelm, Sina92, Sj, Skbkekas,
Smmurphy, Soy ivan, Spacemika, Spiritia, Spradlig, Squidonius, Stephanhartmannde, Stricklandjs59, T.ogar, Tasudrty, Tbackstr, Tbsmith, The Anome, The enemies of god, Tilin, TittoAssini,
Tom Fararo, Tomas e, Tparameter, Trebor, Trovatore, Vina, Vlado0605, Wavehunter, Wavelength, Wesley, WikiTome, Winhunter, Wsiegmund, Yotaloop, 170 anonymous edits
Monte Carlo method Source: http://en.wikipedia.org/w/index.php?oldid=393203937 Contributors: *drew, A.Cython, ABCD, Aardvark92, Adfred123, Aferistas, Agilemolecule, Akanksh,
Alanbly, Albmont, AlexBIOSS, AlexandreCam, AlfredR, Alliance09, Altenmann, Andrea Parri, Andreas Kaufmann, Angelbo, Aniu, Apanag, Aspuru, Atlant, Avalcarce, Avicennasis, Aznrocket,
BAxelrod, BConleyEEPE, Banano03, Banus, Bduke, Beatnik8983, BenFrantzDale, BenTrotsky, Bender235, Bensaccount, BillGosset, Bkell, Blotwell, Bmaddy, Bobo192, Boffob, Boredzo,
Broquaint, Btyner, CRGreathouse, Caiaffa, Charles Matthews, ChicagoActuary, Cibergili, Cm the p, Colonies Chris, Coneslayer, Cretog8, Criter, Cybercobra, Cython1, DMG413, Damistmu,
Davnor, Ddcampayo, Ddxc, Denis.arnaud, Dhatfield, Digemedi, Drewnoakes, Drsquirlz, Ds53, Duck ears, Duncharris, Dylanwhs, ERosa, Edstat, EldKatt, Elpincha, Elwikipedista, Eudaemonic3,
Ezrakilty, Fastfission, Fintor, Flammifer, Frozen fish, Furrykef, G716, Giftlite, Gilliam, Goudzovski, GraemeL, GrayCalhoun, Greenyoda, Grestrepo, Gtrmp, Gkhan, Hanksname, Hawaiian717,
Hokanomono, Hu12, Hubbardaie, ILikeThings, IanOsgood, Inrad, Ironboy11, Itub, Jackal irl, Jacobleonardking, Janpedia, JavaManAz, Jeffq, Jitse Niesen, Joey0084, John, John Vandenberg,
JohnOwens, Jorgenumata, Jsarratt, Jugander, Jrme, K.lee, KSmrq, KaHa242, Karol Langner, Kenmckinley, Kimys, Knordlun, Kroese, Kummi, Kuru, Lambyte, LeoTrottier, Lerdthenerd,
Levin, Lexor, LizardJr8, LoveMonkey, M-le-mot-dit, Malatesta, Male1979, ManchotPi, Marcofalcioni, Mark Foskey, Martinp, Masatran, Mathcount, MaxHD, Maxentrope, Maylene, Mbryantuk,
Melcombe, Michael Hardy, Mikael V, Misha Stepanov, Mlpearc, Mnath, Moink, MrOllie, Mtford, Nagasaka, Nanshu, Narayanese, Nasarouf, Nelson50, Nosophorus, Nsaa, Nuno Tavares,
Nvartaniucsd, Ohnoitsjamie, Oli Filth, Oneboy, Orderud, OrgasGirl, Ott2, P99am, Paul August, PaulxSA, Pbroks13, Pcb21, Pete.Hurd, PeterBoun, Pgreenfinch, Philopp, Pibwl, Pinguin.tk,
PlantTrees, Pne, Popsracer, Poupoune5, Qadro, Quantumelfmage, Quentar, Qwfp, Qxz, RWillwerth, Ramin Nakisa, Redgolpe, Renesis, Rich Farmbrough, Richie Rocks, Rinconsoleao, Rjmccall,
Ronnotel, Rs2, SKelly1313, Sam Korn, Samratvishaljain, Sergio.ballestrero, Shacharg, Shreevatsa, Snegtrail, Snoyes, Somewherepurple, Spellmaster, Splash6, Spotsaurian, SpuriousQ, Stefanez,
Stefanomione, StewartMH, Stimpy, Storm Rider, Superninja, Sweetestbilly, Tarantola, Taxman, Tayste, Tesi1700, Theron110, Thirteenity, ThomasNichols, Thr4wn, Tiger Khan, Tim Starling,
Tom harrison, TomFitzhenry, Tooksteps, Trebor, Twooars, UBJ 43X, Urdutext, Uwmad, Vipuser, VoseSoftware, Wile E. Heresiarch, William Avery, Yoderj, Zarniwoot, Zoicon5, Zr40,
Zuidervled, 377 anonymous edits
Numerical analysis Source: http://en.wikipedia.org/w/index.php?oldid=395111295 Contributors: A bit iffy, APH, Aither, Aitter, AmadeusKlocker, Arnero, Arthena, Asyndeton, Audiosmurf,
Beliavsky, BenFrantzDale, Berland, Bethnim, BigJohnHenry, Blainster, Bocianski, Bookandcoffee, Brianboonstra, Bubba73, Bugg, CBM, CES1596, CMaes, CRGreathouse, Carrionluggage,
Cat2020, Chaos, Charles Matthews, Charvest, Chrismacgr, Christina Silverman, Closedmouth, Conversion script, Crust, DSP-user, Darklilac, Darthhappyface, David Binner, David Haslam,
Decrease789, Decrypt3, Didimos, Dominus, Dontaskme, Drunken Pirate, Dungodung, Dysprosia, EXTER7, EagleFan, Eijkhout, Ekojekoj, Elwikipedista, ExamplePuzzle, FatBastardInk, Fintor,
Fleminra, Gadfium, Gene Nygaard, Gesslein, Giftlite, Ginkgo100, Gombang, Google Child, Graham87, Greenleaf, Gtpjg, Guanaco, Hede2000, Hlangeveld, Hongooi, HussainAbbas,
IronGargoyle, JJL, JLaTondre, Jagged 85, Jaredwf, JesseW, Jitse Niesen, Jmath666, JonMcLoone, Jtir, Jurgen, Justin W Smith, KSmrq, Kate, Kawautar, Kingpin13, Koavf, KymFarnik,
Lambiam, Ldo, Lethe, Lightmouse, Linas, Loisel, Loupeter, Lunch, MER-C, Manuel Trujillo Berges, Matt Crypto, Mattisse, Meaghan, Michael Hardy, MisterSheik, Mlpkr, Msh210,
MystRivenExile, NevilleDNZ, Nikolas Karalis, Ninly, Nixdorf, Nk, Nonphixion, Nostraticispeak, Ntsimp, Oleg Alexandrov, Oliver202, Oyz, Paul Matthews, PedroPVZ, Peter L, PhotoBox, Pred,
Quantufinity, Ramiamro, Redgecko, Rege, Requestion, RexNL, Robertgreer, Robinh, Rogper, Saforrest, Salih, SamShearman, Sina2, SlackerMom, Sligocki, Stephen B Streater, Stephenkirkup,
Stevan White, Svick, Systemlover, Tarquin, Taw, Taxman, TeemuN, The Anome, The undertow, Timothy Clemans, Tobias Hoevekamp, Tom harrison, Tomchen, Tomtzigt, Topbanana,
Troworld, Turidoth, Unmerklich, Urdutext, VictorAnyakin, WikipedianMarlith, Windharp, Wolfrock, Wordsoup, Wygk, Xask Linus, Xiaojeng, Yuval madar, Yvwv, Zeno Gantner, Zowie,
Zzuuzz, , 188 anonymous edits
Real analysis Source: http://en.wikipedia.org/w/index.php?oldid=394367634 Contributors: Alaz, Algebraist, Anastas5425, AugPi, AxelBoldt, Charles Matthews, Charvest, Conversion script,
D6, Damian Yerrick, Damien Karras, Dominus, Dysprosia, Enochlau, Giftlite, Gioto, Hans Adler, Hayabusa future, Helopticor, Ipatrol, Isnow, James pic, Jleedev, Joriki, Josh Cherry, Juan
Marquez, Justin W Smith, Kiefer.Wolfowitz, Lappado, Lllrahman, Marquez, Matthew Auger, Mets501, Michael Hardy, Mrh30, Msh210, N8chz, Northumbrian, Obradovic Goran, Oden, Oleg
Alexandrov, Ozob, Paul August, Peter Stalin, Pomte, Reetep, Remote009, Salix alba, Sam Hocevar, SilenceSoLoud, Skal, Sligocki, Smithpith, Specs112, Tarquin, TheDrizzle, Toby Bartels,
TomS TDotO, Tristanreid, Wafulz, Wideofthemark, Yoctobarryc, Zyskowskrob, 50 anonymous edits
Partial differential equation Source: http://en.wikipedia.org/w/index.php?oldid=394357420 Contributors: Afluent Rider, Ahoerstemeier, Aliotra, Andrei Polyanin, AndrewHowse, Arnero,
ArnoldReinhold, Arthena, AxelBoldt, Belovedfreak, Bemoeial, Ben pcc, BenFrantzDale, Bender235, Bertik, Bjorn.sjodin, Borgx, Brian Tvedt, CYD, Cbm, Charles Matthews, Chbarts, Chris in
denmark, Cj67, Ckatz, Crust, CyrilB, D.328, DStoykov, David Crawshaw, Dharma6662000, Dicklyon, Dirkbb, Djordjes, DominiqueNC, Donludwig, DrHok, Dysprosia, Egriffin, Eienmaru,
Eigenlambda, El C, EmmetCaulfield, Epbr123, Erxnmedia, Filemon, Fintor, Foober, Frosted14, Gaj0129, Gerasime, Germandemat, Giese, Giftlite, GraemeL, Gseryakov, Gurch, Gvozdet,
Hongooi, Hut 8.5, Isnow, Iwfyita, Ixfd64, JNW, JaGa, Jitse Niesen, Jmath666, Jon Cates, JonMcLoone, Jonathanstray, Jss214322, Jyril, Kbolino, Kwiki, L-H, Linas, MFH, Magister
Mathematicae, Mandolinface, Manticore, MathMartin, Mathsfreak, Maurice Carbonaro, Mazi, Mhaitham.shammaa, Mhym, Michael Devore, Michael Hardy, Moink, Mpatel, Msh210,
MuthuKutty, NSiDms, Nbarth, Nneonneo, Ojcit, Oleg Alexandrov, Oliver Pereira, OrgasGirl, Oscarjquintana, PL290, Pacaro, Patrick, Paul August, Paul Matthews, PeR, PhotoBox,
Pranagailu1436, Prime Entelechy, Pt, Quibik, R'n'B, R.e.b., Richard77, Rjwilmsi, Rnt20, Roadrunner, Robinh, Roesser, Rpchase, Sbarnard, SobakaKachalova, Spartan-James, Srleffler, Stevenj,
Stizz, Super Cleverly, Sławomir Biały, THEN WHO WAS PHONE?, Tarquin, Tbsmith, The Anome, The Transhumanist, Thenub314, Tiddly Tom, Timwi, Topbanana, Tosha, Ub3rm4th,
Unigfjkl, User A1, Waltpohl, Wavesmikey, Winston365, Wolfrock, Wsulli74, Wtt, Yaje, Yhkhoo, Zhou Yu, Zzuuzz, 277 anonymous edits
Probability Source: http://en.wikipedia.org/w/index.php?oldid=393774599 Contributors: 21655, APH, Abby, Abby1019, AbsolutDan, Acerperi, Acroterion, Aitias, Aka042, Alansohn,
Alberg15, Alexjohnc3, Aliyah4499, Altenmann, Amalthea, Andeggs, AndrewHowse, Antandrus, Antonwalter, Ap, Arakunem, Arcfrk, Arenarax, Arjun01, ArnoLagrange, Avenue, BRUTE,
Badgernet, Beaumont, Bfinn, Bhound89, Bjcairns, Bobblewik, Bobo192, Braddodson, Brendo4, Brianjd, Brumski, Bryan Derksen, Btball, Buttonius, CBM, CO, CSTAR, Cactus.man, Caltas,
CanisRufus, Capitalist, Capitan Obvio, Capricorn42, Captmog, Carricko, Ceannaideachd, Cenarium, Centrx, Charles Matthews, CharlotteWebb, Chas zzz brown, Chetan.Panchal, Ciphers,
Classical geographer, Clausen, Clovis Sangrail, Connormah, Conversion script, Coppertwig, Craphouse, CrazyChemGuy, Cremepuff222, Cyclone49, D, DEMcAdams, DJ Clayworth, Dabomb87,
Danno12345, DarkFalls, DaveBrondsema, David Martland, David from Downunder, Dbtfz, Debator of mathematics, Dekisugi, Demicx, Demnevanni, Desteg, Dhammapal, Dirtytedd,
Discospinster, Disneycat, DopefishJustin, Doug Bell, Drestros power, Drivi86, Drmies, Dysprosia, ESkog, Ebsith, Edgar181, Ehheh, El Caro, Eliotwiki, Enchanter, Eog1916, Epbr123, Ettrig,
Evercat, Excirial, Fangz, Fantastic4boy, Fastilysock, Favonian, Fetchcomms, FishSpeaker, Flammifer, Footballfan190, FrF, FrankSanMiguel, Fred Bauder, Free Software Knight, FreplySpang,
G716, Gail, Garion96, Giftlite, Giggy, GoldenPi, Googie man, Graham87, Grstain, Guess Who, Gwernol, Hadal, Haduong, Hagedis, Happy-melon, Hasanbay, Hasihfiadhfoiahsio, Henrygb,
Herebo, Heron, Hirak 99, Hoomank, Hu12, Hut 8.5, II MusLiM HyBRiD II, INic, Ideyal, Ignacio Icke, Infarom, Instinct, Ixfd64, J.delanoy, JJL, JTN, Ja 62, Jacek Kendysz, Jackollie, Jake
Wartenberg, JamesTeterenko, Jaysweet, Jeff G., Jeffw57, Jheald, Jimmaths, Jitse Niesen, Jj137, Jmlk17, Jni, John Vandenberg, Johnleemk, Johnuniq, Jonik, JosephCampisi, Jpbowen, Jung
dalglish, Jwpurple, KG6YKN, Kaisershatner, Kaksag, Kbodouhi, Kevmus, King Mir, Kingpin13, Klapper, Koyaanis Qatsi, Krantz2, Kurtan, Kushalneo, Kzollman, Lambiam, Larklight,
Learnhead, Lee J Haywood, Lenoxus, Levineps, LiDaobing, Liang9993, Lifung, Lipedia, Lit-sci, Localhost00, Looxix, LoveMonkey, Lugnuts, MER-C, Mabsjenbu123, Mac Davis,
Mario777Zelda, MarkSweep, Markjoseph125, Marquez, MathMartin, Matthew Auger, Mattisse, Maximaximax, McSly, Mebden, Melcombe, Menthaxpiperita, Metagraph, Mets501, Michael
Hardy, Mikemoral, Mild Bill Hiccup, Mindmatrix, Minesweeper, MisterSheik, Mlpkr, Mortein, MrOllie, Msh210, Myasuda, Mycroft80, NYKevin, NatusRoma, NawlinWiki, Ncmvocalist,
NewEnglandYankee, Nick Number, Nigholith, Nijdam, Noctibus, NoisyJinx, Nsaa, Ogai, Omicronpersei8, Onore Baka Sama, OwenX, Oxymoron83, Packersfannn101, Paine Ellsworth,
PaperTruths, Patrick, Paul August, Paulcd2000, Pax:Vobiscum, Pd THOR, Pdn, Peter.C, Peterjhlee, PhilKnight, Philip Trueman, Philippe, Philtr, Pinethicket, Pointless.FF59F5C9, Progicnet,
Psyche825, Puchiko, Putgeminmouth, QmunkE, Qwertyus, Qwfp, RVS, RabinZhao, Randomblue, RandorXeus, Ranger2006, RattleMan, Razorflame, Readro, Recentchanges, Reddi, Reedy,
Regancy42, Requestion, RexNL, Richard001, Richardajohns, Riotrocket8676, Rmosler2100, Rogimoto, Ronhjones, Ronz, Rtc, RuM, Sagittarian Milky Way, Salix alba, Santa Sangre, Scfencer,
SchfiftyThree, Schwnj, Sengkang, Sevilledade, ShawnAGaddy, Shoeofdeath, Sina2, SiobhanHansa, Sluzzelin, Snoyes, Solipsist, Someguy1221, SonOfNothing, Srinivasasha, Stephen Compall,
Stevenmitchell, Stux, Suicidalhamster, Suisui, SusanLesch, Swpb, Sycthos, Symane, Takeda, Tarheel95, Tautologist, Taxisfolder, Tayste, The Thing That Should Not Be, The Transhumanist,
TheGreenCarrot, Thesoxlost, Thingg, Tide rolls, TigerShark, Tintenfischlein, Treisijs, Trovatore, Twisted86, Uncle Dick, UnitedStatesian, Valodim, Vandal B, Vanished User 1004, Varnesavant,
VasilievVV, Velho, Vericuester, Vicarious, Virgilian, Vivacissamamente, Voyagerfan5761, Wafulz, Wapcaplet, Wetman, Wikistudent 1, Wile E. Heresiarch, William915, Wimt, Wmahan,
Wordsmith, Wormdoggy, Wxlfsr, Wyatts, XKL, Yamakiri, Ybbor, Yerpo, Youandme, YourEyesOnly, Zach1994, Zalle, Zundark, , 704 anonymous edits
Probability distribution Source: http://en.wikipedia.org/w/index.php?oldid=393957667 Contributors: (:Julien:), 198.144.199.xxx, 3mta3, A.M.R., A5, Abhinav316, AbsolutDan, Adrokin,
Alansohn, Alexius08, Ap, Applepiein, Avenue, AxelBoldt, BD2412, Baccyak4H, Benwing, Bfigura's puppy, Bhoola Pakistani, Bkkbrad, Bryan Derksen, Btyner, Calvin 1998, Caramdir,
Cburnett, Chirlu, Chris the speller, Classical geographer, Closedmouth, Conversion script, Courcelles, Damian Yerrick, Davhorn, David Eppstein, David Vose, DavidCBryant, Dcljr, Delldot, Den
fjättrade ankan, Dick Beldin, Digisus, Dino, Domminico, Dysprosia, Eliezg, Emijrp, Epbr123, Eric Kvaalen, Fintor, Firelog, Fnielsen, G716, Gaius Cornelius, Gala.martin, Gandalf61,
Gate2quality, Giftlite, Gjnyasa, GoodDamon, Graham87, Hu12, ImperfectlyInformed, It Is Me Here, Iwaterpolo, J.delanoy, JRSpriggs, Jan eissfeldt, JayJasper, Jclemens, Jipumarino, Jitse
Niesen, Jon Awbrey, Josuechan, Jsd115, Jsnx, Jtkiefer, Knutux, Larryisgood, LiDaobing, Lilac Soul, Lollerskates, Lotje, Loupeter, MGriebe, MarkSweep, Markhebner, Marner, Megaloxantha,
Melcombe, Mental Blank, Michael Hardy, Miguel, MisterSheik, Morton.lin, MrOllie, Napzilla, Nbarth, Noodle snacks, NuclearWarfare, O18, OdedSchramm, Ojigiri, OverInsured, Oxymoron83,
PAR, Pabristow, Patrick, Paul August, Pax:Vobiscum, Pgan002, Phys, Ponnu, Poor Yorick, Populus, Ptrf, Quietbritishjim, Qwfp, Riceplaytexas, Rich Farmbrough, Richard D. LeCour,
Rinconsoleao, Roger.simmons, Rursus, Salgueiro, Salix alba, Samois98, Sandym, Schmock, Seglea, Serguei S. Dukachev, ShaunES, Shizhao, Silly rabbit, SiobhanHansa, Sky Attacker, Statlearn,
Stpasha, TNARasslin, TakuyaMurata, Tarotcards, Tayste, Techman224, Thamelry, The Anome, The Thing That Should Not Be, TheCoffee, Tomi, Topology Expert, Tordek ar, Tsirel, Ttony21,
Unyoyega, Uvainio, VictorAnyakin, Whosasking, Whosyourjudas, X-Bert, Zundark, 218 anonymous edits
Binomial distribution Source: http://en.wikipedia.org/w/index.php?oldid=394780048 Contributors: -- April, Aarond10, AchatesAVC, AdamRetchless, Ahoerstemeier, Ajs072,
Alexb@cut-the-knot.com, Alexius08, Atemperman, Atlant, AxelBoldt, Ayla, BPets, Baccyak4H, BenFrantzDale, Bill Malloy, Blue520, Br43402, Bryan Derksen, Btyner, Can't sleep, clown will
eat me, Cburnett, Cdang, Cflm001, Charles Matthews, Conversion script, Coppertwig, Crackerbelly, Cuttlefishy, David Martland, DavidFHoughton, Daytona2, Deville, Dick Beldin, Eesnyder,
Elipongo, Eric Kvaalen, Falk Lieder, Fisherjs, G716, Gary King, Gauravm1312, Gauss, Gerald Tros, Giftlite, GorillaWarfare, Gperjim, Graham87, Hede2000, Henrygb, Hirak 99, Ian.Shannon,
Ilmari Karonen, Intelligentsium, Iwaterpolo, J04n, JB82, JEH, Janlo, Johnstjohn, Kakofonous, Kmassey, Knutux, Koczy, LOL, Larry_Sanger, LiDaobing, Linas, Logan, MER-C, ML5, MSGJ,
Madkaugh, MarkSweep, Marvinrulesmars, Materialscientist, Mboverload, McKay, Meisterkoch, Melcombe, Michael Hardy, MichaelGensheimer, Miguel, MisterSheik, Mmustafa,
Moseschinyama, Mr Ape, MrOllie, Musiphil, N6ne, NatusRoma, Nbarth, Neshatian, New Thought, Nguyenngaviet, Nschuma, Oleg Alexandrov, PAR, Paul August, Ph.eyes, PhotoBox, Phr,
Pleasantville, Postrach, PsyberS, Pt, Pufferfish101, Qonnec, Quietbritishjim, Qwertyus, Qwfp, Redtryfan77, Rgclegg, Rich Farmbrough, Rjmorris, Rlendog, Ruber chiken, Seglea, Smachet,
SoSaysChappy, Spellcast, Stebulus, Steven J. Anderson, Stigin, Stpasha, Supergroupiejoy, TakuyaMurata, Tayste, The Thing That Should Not Be, Tim1357, Timwi, Tomi, VectorPosse, Wikid77,
WillKitch, Xiao Fei, Youandme, ZantTrang, Zmoboros, 330 anonymous edits
Log-normal distribution Source: http://en.wikipedia.org/w/index.php?oldid=394770945 Contributors: 2D, A. Pichler, Acct4, Albmont, Alue, Autopilot, AxelBoldt, Baccyak4H, BenB4,
Berland, Biochem67, Bryan Derksen, Btyner, Cburnett, Christian Damgaard, Ciberelm, Ciemo, Cleared as filed, ColinGillespie, Constructive editor, Encyclops, Evil Monkey, Floklk, Fredrik,
Gausseliminering, Giftlite, Humanengr, Hxu, IanOsgood, Iwaterpolo, Jackzhp, Jeff3000, Jitse Niesen, Khukri, Letsgoexploring, Lojikl, Lunch, Mange01, Martinp23, Melcombe, Michael Hardy,
MisterSheik, NonDucor, Ocatecir, Occawen, Osbornd, Oxymoron83, PAR, PBH, Paul Pogonyshev, Philip Trueman, Philtime, Phoxhat, Pichote, Pontus, Qwfp, R.J.Oosterbaan, Rgbcmy,
Rhowell77, Ricardogpn, Rlendog, Rmaus, Safdarmarwat, Sairvinexx, Schutz, Seriousme, Skunkboy74, SqueakBox, Sterrys, Stigin, Stpasha, Ta bu shi da yu, Techman224, The Siktath, Till
Riffert, Tkinias, Tomi, Umpi, Unyoyega, Urhixidur, User A1, Weialawaga, Wikomidia, Wile E. Heresiarch, ZeroOne, ^demon, , 169 anonymous edits
Heat equation Source: http://en.wikipedia.org/w/index.php?oldid=395101670 Contributors: AMR, Alai, Angusmclellan, Anterior1, Army1987, ArnoldReinhold, Arthena, Avmich, Bergonzc,
Berland, Betacommand, Bingul, Bryan Derksen, CSTAR, Carrionluggage, Cbjorland, Cfweise, Charles Matthews, ChristophDemmer, Chromaticity, Chubby Chicken, Constructive editor,
Corrigendas, DJ Clayworth, Damorbel, Delio-mug, Dicklyon, Dogonsi, Dr Dec, Francisco Quiumento, Germandemat, Giftlite, Glmory, Gstone, HEL, Hadal, Headbomb, Heron, Igny, Jayme,
Jbergquist, Jdpipe, Jeff3000, Jitse Niesen, Johnsarelli, Karol Langner, Kupirijo, Linas, Lunch, MC10, Marek69, MathMartin, Mathieu Perrin, Michael Hardy, Mietchen, Mikiemike, Mstftsm,
Nanog, Oleg Alexandrov, PAR, PMajer, Paintman, Peak, Peterlin, PhotoBox, Ppntori, Quibik, Random3f, RexNL, Richard Giuly, Rjw62, Rmz, Salgueiro, Silly rabbit, Sormani, Sławomir Biały,
Tabletop, Tarquin, Taxman, ThorinMuglindir, TonyW, User A1, VictorAnyakin, Voorlandt, Waltpohl, WriterHound, Wtt, Yill577, Zoicon5, , 144 anonymous edits
Radon–Nikodym derivative Source: http://en.wikipedia.org/w/index.php?oldid=169241696 Contributors: 3mta3, Aftermath1983, Bdmy, BeteNoir, Charles Matthews, Dogah, Drizzd, Edward,
Enchanter, Fibonacci, Giftlite, HyDeckar, Jheald, Kiefer.Wolfowitz, King Bee, Madmath789, Mct mht, Michael Hardy, Nbarth, Ohjaek33, Oleg Alexandrov, Parodi, Pcb21, Pfortuny, Populus,
Prumpf, Psychonaut, PurpleHz, Roadrunner, RuudVisser, Silly rabbit, Sullivan.t.j, TakuyaMurata, Thecmad, Tkboon, Winterfors, Zundark, Zvika, 36 anonymous edits
Risk-neutral measure Source: http://en.wikipedia.org/w/index.php?oldid=394290657 Contributors: AJR 1978, Arcenciel, Arvinder.virk, Brian Tvedt, Btyner, Charles Matthews, Crasshopper,
DocendoDiscimus, Elf, Feco, Felix Forgeron, Financestudent, Fintor, Forwardmeasure, Gabbe, Gauge, Henrygb, Hut 8.5, Ixfd64, Kaslanidi, Landroni, Laser.Y, Lowellian, MementoVivere,
Michael Hardy, MrOllie, Nobellaureatesphotographer, Pcb21, PigFlu Oink, Plastikspork, Pontus, Quest for Truth, Roadrunner, Tiger888, Tomhosking, Unara, WebScientist, Worldbeater2002, 41
anonymous edits
Stochastic calculus Source: http://en.wikipedia.org/w/index.php?oldid=385640477 Contributors: Akihabara, Alanb, Brian Tvedt, CesarB, Charles Matthews, Cosmic Latte, Elf, Encyclops,
FilipeS, Giftlite, Gruntler, Hairer, Jackzhp, Jfr26, Jhausauer, Jpbowen, Kaslanidi, Ma'ame Michu, Melcombe, Michael Hardy, MrOllie, Mrwojo, Oleg Alexandrov, Piil, Rachelm1978, Rgclegg,
Roboquant, Ronnotel, Rotem Dan, SUPER-QUANT-HERO, Schmock, Sullivan.t.j, The Anome, Tsirel, Useight, Voltagedrop, Vovchyck, Ziki, Ziyuang, 41 anonymous edits
Wiener process Source: http://en.wikipedia.org/w/index.php?oldid=390914880 Contributors: Aanderson@amherst.edu, Alexf, Ambrose.chongo, Andrewpmack, Awaterl, Berland, Bmju,
C45207, CBM, CSTAR, Charles Matthews, Cyp, DaniScoreKeeper, David Underdown, Delius, Deodar, Digfarenough, Dylan Lake, Elf, Forwardmeasure, Gala.martin, Giftlite, Jackzhp, James T
Curran, JeffreyBenjaminBrown, Jmnbatista, Joke137, JonAWellner, Jujutacular, Lambiam, Mbell, Melcombe, Michael Hardy, MisterSheik, Oleg Alexandrov, Oli Filth, PeR, Pokedork7876, Ptrf,
R.e.b., Ru elio, Sandym, Soap, Speedplane, Spinningspark, Stephen B Streater, Sullivan.t.j, Tristanreid, Tsirel, Warbler271, Wikomidia, Wysinwygaa, Zvika, 51 anonymous edits
Lévy process Source: http://en.wikipedia.org/w/index.php?oldid=392378504 Contributors: Aav, Adler.fa, Albmont, Amir Aliev, Bearas, Charles Matthews, Docu, Drusus 0, Eug, Evil Monkey,
Fabee, Giftlite, Hyperbola, Lyst, Mactuary, Maksim-e, Melcombe, Michael Hardy, Myasuda, Nbarth, Noldo22, Ptrf, Rainwarrior, Rjwilmsi, Schomerus, Sigita.zitkeviciute, Sullivan.t.j, 26
anonymous edits
Stochastic differential equations Source: http://en.wikipedia.org/w/index.php?oldid=19592160 Contributors: Agmpinia, BenFrantzDale, Bender235, Benjamin.friedrich, Bistromathic, Btyner,
Cradel, FilipeS, Firsfron, Fox, Fuhghettaboutit, Gaius Cornelius, Giftlite, Hectorthebat, Innerproduct, Kiefer.Wolfowitz, KimYungTae, Kupirijo, LARS, LachlanA, Ladnunwala, Lhfriedman,
Lkinkade, Mathsfreak, McLowery, Melcombe, Michael Hardy, Moskvax, Nilanjansaha27, OliAtlason, Phys0111, Piloter, Rikimaru, RockMagnetist, Ronnotel, Sandym, SgtThroat,
Shawn@garbett.org, Shawnc, Siroxo, Sullivan.t.j, The Man in Question, UffeHThygesen, Vovchyck, Voyagerfan5761, 43 anonymous edits
Stochastic volatility Source: http://en.wikipedia.org/w/index.php?oldid=377298768 Contributors: A. Pichler, Asperal, Benna, Bluegrass, Btyner, Chiinners, DominicConnor, Enchanter,
Finnancier, Firsfron, Froufrou07, Hopefully acceptable username, Leifern, Mevalabadie, Michael Hardy, Mu, Roadrunner, Ronnotel, Seanhunter, Ulner, Wavelength, Woohookitty, 33
anonymous edits
Numerical partial differential equations Source: http://en.wikipedia.org/w/index.php?oldid=364889473 Contributors: Ben pcc, CBM, Charvest, Jitse Niesen, Neutiquam, Ocwjoe, Oleg
Alexandrov, Paul Matthews, Pentti71, Waltpohl, 4 anonymous edits
Crank–Nicolson method Source: http://en.wikipedia.org/w/index.php?oldid=388827024 Contributors: Albertod4, Argoreham, Axemanstan, Ben pcc, Berland, Chadernook, Cyktsui, Dicklyon,
Fintor, JJL, Jitse Niesen, JohnCD, Keenan Pepper, Koavf, Mariocastrogama, Michael Hardy, Mild Bill Hiccup, Natalya, Oleg Alexandrov, Peak, Roadrunner, Roundhouse0, Salih, Taxman,
Zylorian, 46 anonymous edits
Finite difference Source: http://en.wikipedia.org/w/index.php?oldid=394620955 Contributors: Abhinav.in, Alex Bakharev, Arthur Rubin, Beetstra, Ben pcc, BenFrantzDale, Berland, Bethnim,
BigJohnHenry, Bobo192, Charvest, CheckURmath, Commander Keane, DanielDeibler, Delaszk, Dfeldmann, Eijkhout, Feuer, Fintor, Foxjwill, Giftlite, Griffgruff, Iamrndy, Iridescent, J.delanoy,
JJL, Jbergquist, Jitse Niesen, Jloo, Johnpseudo, Kku, KrisK, Kymacpherson, Linas, MathFacts, Michael Hardy, Mir76, Namaxwell, Nk, Oleg Alexandrov, Ozob, Pentti71, Physchem, Salih,
Salsia, Sligocki, Superale85, Sławomir Biały, Taxman, Tekhnofiend, Tide rolls, Tobias Hoevekamp, Toby, Volemak, Voorlandt, Voxpuppet, Wallpaperdesktop, William M. Connolley, 120
anonymous edits
Value at risk Source: http://en.wikipedia.org/w/index.php?oldid=389302066 Contributors: 360Portfolios, AaCBrown, AdamSmithee, Adler.fa, Aimsoft, Alansohn, Alch0, Aldanor,
ArnoldReinhold, Ashley Y, Avraham, Barcturus, Bfinn, Bobo192, CSWarren, CarolGray, Charles Matthews, Christian List, Cmathio, Cmdrjameson, DMCer, Daniel Dickman, DenisHowe,
Didickman, DomCleal, Doulos Christos, Dpbsmith, EcoMan, Effco, Elwikipedista, Euchiasmus, Euphrosyne, Fintor, Forlornturtle, Fortdj33, Gandalf61, Geomatster, Giftlite, Gurch, Htournyol,
Hu12, HyDeckar, IvanLanin, Jafcbs, Jcsellak, Jeanine Leuckel, Jgonion, Jimmyshek, Juro, KelleyCook, Knowledge2, Kuru, Lamro, LeaveSleaves, Leibniz, Letournp, Lprideux, Luizabpr,
Marcika, Markhurd, MementoVivere, Michael Hardy, MithrandirAgain, MrOllie, Mydogategodshat, N5iln, Nbarth, Nshuks7, Nutcracker, Oleg Alexandrov, Oli Filth, Outriggr, PamD, Pelotas,
Pete5, Pgreenfinch, Plasticup, Pnm, Profitip, R'n'B, RBecks, RDSeabrook, Ramin Nakisa, Random user, Riskbooks, Ruziklan, SJP, Sadads, Schmiteye, Scientiae Defensor, ShelfSkewed,
Smallbones, Snehalkanakia, Sydal, TakuyaMurata, TheEmanem, Tigergb, Tsancoso, Uaqeel, Ukexpat, Willking1979, Woohookitty, X96lee15, Xp54321, Zain Ebrahim111, Zurich Risk, 216
anonymous edits
Volatility (finance) Source: http://en.wikipedia.org/w/index.php?oldid=395038275 Contributors: A quant, Andrewpmack, Arthena, Baeksu, BrokenSegue, Brw12, Btyner, Calair, Canadaduane,
Catofgrey, Charles Matthews, Christofurio, Cleared as filed, Complexfinance, D15724C710N, DocendoDiscimus, DominicConnor, Eeekster, Ehdr, Emperorbma, Enchanter, Eric Kvaalen,
Favonian, Finnancier, Fredbauder, Fvasconcellos, GraemeL, Helon, Hobsonlane, Honza Zruba, Hu12, Infinity0, Innohead, Jerryseinfeld, Jewzip, Jimmy Pitt, John Fader, Jphillips, KKramer,
Karol Langner, Kingpin13, KnightRider, Kyng, LARS, Lamro, Marishik88, Martinp, Merube 89, Michael Hardy, Nburden, Orzetto, Pcb21, PeterM, Pgreenfinch, Phil Boswell, Pt johnston,
Quaeler, Ronnotel, Ryguasu, S.rvarr.S, SKL, ShaunMacPherson, Swerfvalk, Taral, Tassedethe, Tedder, That Guy, From That Show!, Time9, UnitedStatesian, UweD, Volatility Professor,
Walkerma, Wongm, Wushi-En, Yurik, Zhenqinli, 169 anonymous edits
Autoregressive conditional heteroskedasticity Source: http://en.wikipedia.org/w/index.php?oldid=391371537 Contributors: 4l3xlk, Abeliavsky, Albmont, Bobo192, Bondegezou, Brenoneri,
Btyner, Charles Matthews, Christiaan.Brown, Christopher Connor, Cje, Cp111, Cronholm144, Davezes, DeadEyeArrow, Den fjättrade ankan, Erxnmedia, Finnancier, Fyyer, Gaius Cornelius,
GeorgeBoole, Hascott, Inferno, Lord of Penguins, Irbisgreif, JavOs, Jni, John Quiggin, Kevinhsun, Kizor, Kwertii, LDLee, Landroni, Loodog, Magicmike, Melcombe, Merube 89, Michael Hardy,
Nelson50, Nutcracker, Personline, Philip Trueman, Pitchfork4, Pontus, Protonk, Qwfp, Rgclegg, Rich Farmbrough, Ronnotel, Ryanrs, Sigmundur, Whisky brewer, Wile E. Heresiarch,
Wtmitchell, Xian.ch, Xieyihui, Zootm, 151 anonymous edits
Brownian Model of Financial Markets Source: http://en.wikipedia.org/w/index.php?oldid=328998024 Contributors: Chris the speller, Ensign beedrill, Financestudent, Giganut, GoingBatty,
LilHelpa, Rich Farmbrough, Rjwilmsi, SpacemanSpiff, Tassedethe, Woohookitty, 20 anonymous edits
Rational pricing Source: http://en.wikipedia.org/w/index.php?oldid=394439336 Contributors: Brianga, Btyner, Cburnett, Cherkash, Dan100, DocendoDiscimus, Drusus 0, Edward, Ehn,
Enchanter, Faradayplank, Feco, Fenice, Fintor, GraemeL, Guy M, Humanengr, John Quiggin, Lightdarkness, Maurreen, Michael Hardy, Mushroom, Pgreenfinch, Phil Boswell, Reinoutr, That
Guy, From That Show!, Timwiut, Ulner, Who, Woohookitty, 55 anonymous edits
Arbitrage Source: http://en.wikipedia.org/w/index.php?oldid=393344825 Contributors: 62.253.64.xxx, A. B., ACEngert, AdRock, Alast0r, Alcatrank, Alexasmith, Altenmann, Andrew
Reynolds, Angelic editor, AnnaFrance, Anwar saadat, Anwhite, Arthena, Assbackward, Athaenara, Avriette, Bender235, Benja, Boc236, Bombastus, Bookandcoffee, Brentdax, Bryan Derksen,
Bugloaf, Burntsauce, Calcuscribe, Chenyu, Chochopk, Chris Edgemon, Christofurio, Ciphers, Cje, Colonies Chris, Color probe, ColoradoZ, Coneslayer, Conversion script, Coolcaesar,
Crazycomputers, Cutter, Cyanide2600, Cyanoa Crylate, D6, Dandrake, Darkildor, Davidcam, Dcamp314, Ddantas, Deconstructhis, DerHexer, Dickpenn, DocendoDiscimus, Drmies, Edward,
Ehn, Fastfission, Finnancier, Fintor, Flockmeal, FreplySpang, Furrykef, Gede, Giganut, Gparker, GraemeL, Gregalton, Gurubrahma, Hadal, Hairy Dude, Henrygb, HotBridge, Hu12,
Hypersphere, I do not exist, Idleguy, Information Arbitrage, Insightfullysaid, InvestmentMogul, Isis, JHG, JLaTondre, JRR Trollkien, Jackmass, JamesMLane, Jburt1, Jerryseinfeld, Jitse Niesen,
Jkeene, Jmarkuso, Johnleemk, Jonathan Drain, Jose Ramos, Joy, Jtkiefer, Kdpinsf, Kjm, Kopeti5, Kungfuadam, Kwamikagami, LOL, Lambiam, LilHelpa, Liveandplay, Mark Meeker,
Marudubshinki, Matt Raines, Maurreen, Michael Hardy, Mitchan, Mojtaba-taheri-1983, MrVoluntarist, Msubronson, Mtg1977, Nbarth, Nlu, Noldo22, Novacatz, Olejasz, Oren0, OwenX, Phgao,
Phineas Ravenscroft, Pmj, Pseudomonas, Qazzt, REDyellowGreenBLUE, Random user, Reedy, Rich Farmbrough, Road Wizard, Roadrunner, RocketPaul77, Ronnotel, Rose Garden, SU
Linguist, Salahx, Sam Hocevar, Saturnight, Scootey, Simishag, Sjakkalle, Sjc, Skomorokh, Slicing, Smallbones, Snapperman2, Sosobra, SpuriousQ, Starkodder, SubSeven, Supasheep,
Sydbarrett74, Tarquin, TastyPoutine, Taxman, The Anome, Thiseye, Thomas Arelatensis, Tide rolls, Tinyforebrain, Trade2tradewell, Umalee, Umofomia, Urocyon, Vegaswikian, Vgranucci,
Ward3001, Westmorlandia, Whimsics, WikiPier, Wikidemon, Will Beback, Wissons, Wk muriithi, Wnissen, WordyGirl90, Wyattmj, Xiaoyanggu, Yahya Abdal-Aziz, Zaphod Beeblebrox,
ZenerV, 296 anonymous edits
Futures contract Source: http://en.wikipedia.org/w/index.php?oldid=394280506 Contributors: "alyosha", -oo0(GoldTrader)0oo-, 4twenty42o, A Softer Answer, ALLurGroceries, Aaron
Brenneman, Ac101, Advancedfutures, Aleator, Alesander, Allstar784, Altenmann, Amartya ray2001, Andycjp, Arthena, Artman772000, AtomikWeasel, Atrick, Avenged Eightfold, BadSeed,
Beetstra, Beganlocal, Bender235, Benjai, Bennoro, Bissinger, Blanchardb, Bobblewik, Bobknowitall, Bogdanb, Bomac, CRGreathouse, CRoetzer, Capricorn42, Chepurko, Chrylis, CliffC,
Cllectbook, Coder Dan, Commander Keane, Conant Webb, Cpl Syx, Craig t moore, Cyde, Cyktsui, Czalex, Daniel5127, Darkwing7, David Shay, Dc3m, Desolidirized, Discospinster, Dkeditor,
Doc9871, DocendoDiscimus, Donreed, Duesentrieb, Dvavasour, Dzordzm, Edgar181, Edward, Efutures, Egopaint, EntmootsOfTrolls, Ergative rlt, Espoo, Excirial, Expofutures, Farmhouse121,
Feco, Fenice, Fergusdog, Fintor, Frank Lofaro Jr., GB fan, Gandalf013, Gauge, Gavin.collins, Gene Nygaard, GeneralBob, Georgez (usurped), Gfk, GraemeL, Grazfather, Guy M, Gzornenplatz,
Hairy Dude, HappyInGeneral, Hede2000, Hedgefundconcepts, Heheman3000, Heman, Henrygb, Hu12, Ian Pitchford, Informationisacommodity, Int21h, Islander, JHP, Jayanta Sen, Jbaphna,
Jensp, Jeremiahmurray, Jerryseinfeld, Jfeckstein, Jnmclarty, John Comeau, John Laxson, JohnOwens, Jonathan Callahan, Jorunn, Josh Parris, Joshuaali, Jsm0711, Juxo, K12345wiki, Kat,
Kozuch, Kujo275, Kwertii, LaidOff, Lamro, Laudaka, Llywelyn, MER-C, Mattis, Mauri.carrasco, Mebits, Michael Hardy, Mikie yorkie, Msankowski, Mulad, Mydogategodshat, NEARER,
Nbarth, NeuronExMachina, Neutrality, Ninly, Notmyrealname, Oblonej, OwenX, PCock, Paine Ellsworth, Palouser1, Pauly04, Pcb21, Pcxtrader, Pekinensis, Pgreenfinch, Philip Trueman, Piet
Delport, Pilotguy, PizzaMargherita, Plinkit, Polly Ticker, Praet123, Psb777, Random user, Rangek, RayBirks, RedWolf, Redthoreau, Renamed user 4, Rhobite, Rich Farmbrough, Risce, Rmaus,
Rmhermen, Ronnotel, Ryguillian, SDC, Sargdub, Satori Son, Sharik, ShaunMacPherson, SimonP, Smallman12q, Solarapex, Spencer195, Stifle, Stirfutures, SunCreator, Swerfvalk, Taxman,
Tesseran, The Thing That Should Not Be, Tickenest, Tiger888, Timtx01, Toby Bartels, Tsuchan, UberScienceNerd, Ughh, Ulner, V35322, Veinor, Versageek, VerySmartNiceGuy, Vina, Vsmith,
Wavelength, Wcspaulding, When Muffins Attack, Wikomidia, Wongm, Woohookitty, Wooyi, Wordsmith, Xavid, Yone Fernandes, ZackDude, Zippymobile, Zven, 516 anonymous edits
Put–call parity Source: http://en.wikipedia.org/w/index.php?oldid=392613880 Contributors: Alexxandros, Alias Flood, Cburnett, Chimpex, Drusus 0, Feco, Fenice, Finnancier, Fintor,
Flyhighplato, Furrykef, Gaius Cornelius, Gerhard Schroeder, Geschichte, Hu12, Ian Pitchford, Jlang, Justin73, Lamro, Marudubshinki, Mike40033, Nearaj, Nposs, Pcb21, RJN, Roadrunner,
Ronnotel, Sam Korn, Sgcook, Smallbones, The Anome, Ulner, Waltke, Wcspaulding, 66 anonymous edits
Intrinsic value (finance) Source: http://en.wikipedia.org/w/index.php?oldid=382341931 Contributors: ElectricRay, Enchanter, Feco, Fintor, General agent, Gurch, Hu12, Investor123, JHP, JIP,
JoePonzio, Julesd, Larry Lawrence, LearnK, Mattis, Michael Hardy, MrOllie, Pgreenfinch, Pohick2, Radagast83, Rduinker, Rinconsoleao, RoyBoy, Rumpelstiltskin223, Sgcook, Shawnc, Sijo
Ripa, Srleffler, Vald, Vikramsidhu, Yabanc, 17 anonymous edits
Option time value Source: http://en.wikipedia.org/w/index.php?oldid=375972039 Contributors: Aaron Brenneman, Acad Ronin, Bkessler, Brad22dir, Cedjacket, Enchanter, Feco, Fenice,
Finnancier, Fintor, GeneralBob, GraemeL, Jay, Jphillips, Leifern, Maximus Rex, Michael Hardy, Ramnarasimhan, Sardanaphalus, Sgcook, Smallbones, Who, Wik, Woohookitty, 35 anonymous
edits
Moneyness Source: http://en.wikipedia.org/w/index.php?oldid=394887362 Contributors: A. B., A5, Acad Ronin, Angela, Bbx, Bhadani, Camiflower, Chris 73, DiggyG, Dmsar,
DocendoDiscimus, Doug Bell, Enchanter, Erudecorp, EvanCarroll, Famspear, Fenice, Fijimf, Finnancier, GeneralBob, GraemeL, Graft, Hu12, JMSwtlk, Jphillips, KnightRider, Korhantoker1,
Macrobbie, Michael Hardy, Mydogategodshat, Nbarth, Pcb21, Pikiwyn, Shawnc, Shell Kinney, Taral, Tomeasy, Xcalibus, 42 anonymous edits
Black–Scholes Source: http://en.wikipedia.org/w/index.php?oldid=394821703 Contributors: -Ozone-, A Train, A bit iffy, A. Pichler, AdrianTM, Akrouglo, Aldanor, Altenmann,
AndrewHowse, Antoniekotze, Arkachatterjea, Avikar1, Beetstra, Betterfinance, Bobo192, Btyner, CSWarren, Calltech, Charles Matthews, Chrisvls, Cibergili, Coder Dan, CoolGuy, Cretog8,
Crocefisso, Csl77, Css, Cyclist, Daida, Dan131m, Danfuzz, DavidRF, Dcsohl, Dino, Domino, Dpwkbw, Dr.007, Drusus 0, Dudegalea, EconomistBR, EdwardLockhart, Elkman, EmmetCaulfield,
Enchanter, Ercolev, Ernie shoemaker, Etrigan, Favonian, Feco, Fenice, FilipeS, Finnancier, Fintor, Fish147, Fsiler, GRBerry, Galizur, Gaschroeder, Gauge, Geschichte, Giftlite, Gnixon,
GoldenPi, Goliat 43, Goodnightmush, Gretchen, Guy M, HLwiKi, Hadal, Hadlock, Hu12, HyDeckar, Ikelos, IronGargoyle, Islandbay, IstvanWolf, JaGa, Jaccard, JahJah, JanSuchy, Jerzy,
Jigneshkerai89, Jitse Niesen, Jmnbatista, JustAGal, Justin73, JzG, Kaslanidi, Kbrose, Kcordina, Khagansama, Kimbly, Kungfuadam, Kwertii, Landroni, Lehalle, Leifern, Leyanese, Lxtko2,
Makreel, Marudubshinki, Mathiastck, Melcombe, Mets501, Mewbutton, Michael Hardy, Michael Slone, Mikc75, Mr Ape, MrOllie, Mydogategodshat, Naddy, Nbarth, Nicolas1981, Nixdorf,
Nkojuharov, Nohat, Notary137, Oleg Alexandrov, Oli Filth, Olivier, Pcb21, PeterM, Petrus, Pgreenfinch, Pontus, Pps, Prasantapalwiki, Pretzelpaws, Protonk, RJN, Rajnr, Razorflame,
Roadrunner, Roberto.croce, Ronnotel, RussNelson, S2000magician, SDC, Saibod, Scrutchfield, Sebculture, Sgcook, Shimgray, Silly rabbit, Smaines, Smallbones, Smallbones11, Spliffy, Stephen
B Streater, Stevenmitchell, Stochastic Financial Model, Tangotango, Tarotcards, Tawker, The Anome, TheObtuseAngleOfDoom, Tiles, Tomeasy, Tristanreid, Typewritten, Ulner, Viz,
Vonfraginoff, Vvarkey, Waagh, WallStGolfer31, WebScientist, Wile E. Heresiarch, William Avery, Williamv1138, Willsmith, Wlmsears, Wurmli, Yill577, Zophar, Zzyzx11, 602 anonymous
edits
Black model Source: http://en.wikipedia.org/w/index.php?oldid=388951552 Contributors: Captain Disdain, Cfries, DanielCD, Dori, Feco, Felix Forgeron, Finnancier, Fintor, GeneralBob,
Hu12, Jimmycorp, Materialscientist, Michael Hardy, Oleg Alexandrov, Oli Filth, Pcb21, PtolemyIV, Renamed user 4, Roadrunner, Samcol1492, Sgcook, Ulner, , 67 anonymous edits
Binomial options pricing model Source: http://en.wikipedia.org/w/index.php?oldid=388574261 Contributors: A Train, A. Pichler, Aav, CSWarren, Cburnett, Cdang, Charles Matthews, Choas,
Cibergili, Cpsdcann, Dan Guan, Davejagoda, Desx2501, DocendoDiscimus, Feco, Finnancier, Fintor, Gabbe, Garrethe, Geremy78, Hu12, Hwansokcho, Islandbay, Kevininspace, Kwertii, Kwiki,
Leifern, MarkSweep, Mayerwin, Michael Hardy, Nkojuharov, Oleg Alexandrov, PP Jewel, Pleasantville, Roadrunner, Ronnotel, Sbratu, Sgcook, Sivamoturi, Smallbones, Stebulus, SteinbDJ,
Swatiquantie, That Guy, From That Show!, Wmahan, Yahya Abdal-Aziz, Yamamoto Ichiro, 128 anonymous edits
Monte Carlo option model Source: http://en.wikipedia.org/w/index.php?oldid=259734429 Contributors: Charles Matthews, Christofurio, Collect, Dthomsen8, Ekotkie, Enchanter, Encyclops,
Finnancier, Fintor, Hu12, Iridescent, Islandbay, JaGa, Kalbasa, Khalid hassani, Nshuks7, Rich Farmbrough, Ronnotel, Ulner, Woohookitty, 15 anonymous edits
Volatility smile Source: http://en.wikipedia.org/w/index.php?oldid=376271493 Contributors: Anticipation of a New Lover's Arrival, The, Autoreplay, Brian Merz, Brianegge, Christofurio,
Dori, Duja, EdJohnston, EdwardLockhart, Elroch, Enchanter, Finnancier, Fintor, GregorB, Hu12, Ian Pitchford, Jerryseinfeld, JohnH, Kasprowicz, Mactuary, Magister Mathematicae, Michael
Hardy, Nbarth, Open2universe, Pcb21, Piloter, ProfessorTarantoga, Pt johnston, Roadrunner, Ronnotel, Rorro, Sgcook, SimonP, SteloKim, Stern, Vald, Vespristiano, Wknight94, Zanimum, 62
anonymous edits
Implied volatility Source: http://en.wikipedia.org/w/index.php?oldid=376435480 Contributors: Behshour, Bill37212, Ceyockey, Crasshopper, CyberSco, DanMS, Davidryan168, DerHias,
DocendoDiscimus, Favonian, Feco, Finnancier, Freedom2live, GraemeL, Henry Delforn, Highgamma, Hu12, Jadair10, Jerryseinfeld, Jphillips, Lars67, Leifern, Nikai, Oli Filth, Pcb21, Pt
johnston, RandomProcess, Raotel, Ronnotel, ST47, Shawnc, Ulner, Ustaudinger, UweD, 71 anonymous edits
SABR Volatility Model Source: http://en.wikipedia.org/w/index.php?oldid=359965470 Contributors: BetFut, Fintor, Hltommy2, Ilmari Karonen, JordanSamuels, LeYaYa, Michael Hardy,
Ppablo1812, ProfessorTarantoga, Rich Farmbrough, Ronnotel, 29 anonymous edits
Markov Switching Multifractal Source: http://en.wikipedia.org/w/index.php?oldid=295932443 Contributors: Hannibal19, Michael Hardy, RHaworth, Skittleys, 4 anonymous edits
Greeks (finance) Source: http://en.wikipedia.org/w/index.php?oldid=394801846 Contributors: 334a, Alfanje, Arisofalaska, BD2412, BeIsKr, Bender235, Bssc81, Crasshopper, Dealer01,
DocendoDiscimus, Edward, Enchanter, Ernie shoemaker, FF2010, FinancialRisk, Finnancier, Fintor, Firsfron, Fsiler, Gaius Cornelius, Geschichte, Gizbic, Glane23, GraemeL, Gxti, Harryemeric,
Hbent, Hu12, Ichase1, J heisenberg, Jackycwong, Jdthood, Jeff G., Jitse Niesen, JordanSamuels, Josephw, Joy, Justin73, Lamro, Laug, Lpele, Macrakis, Michael Hardy, Mkoistinen,
Moneyneversleeps, Mtpruitt, Mydogategodshat, Nbarth, Northp, Octopus-Hands, Oli Filth, Pcb21, Peralmq, Pgreenfinch, Philip Maton, Psemper, Qatter, Qxz, Razorflame, Roadrunner, Ronnotel,
Rr2419, Rshiller, Selmo, Slakr, Smallbones, The Thing That Should Not Be, Thoreaulylazy, Tomeasy, Ulner, Velocidex, Vercalos, Vivacissamamente, WallStGolfer31, WinterSpw,
Woohookitty, ZimZalaBim, Zophar, 276 anonymous edits
Finite difference methods for option pricing Source: http://en.wikipedia.org/w/index.php?oldid=376810911 Contributors: Calliopejen1, Fintor, Islandbay, Michael Hardy, Ronnotel, Ulner, 6
anonymous edits
Trinomial tree Source: http://en.wikipedia.org/w/index.php?oldid=391804946 Contributors: Biscuittin, EoGuy, Fabrictramp, Fintor, Garnett9, Javi.sabio, Jclemens, Katharineamy, Lamro,
Leszek Jańczuk, PigFlu Oink, Tgies, 3 anonymous edits
Optimal stopping Source: http://en.wikipedia.org/w/index.php?oldid=387350048 Contributors: Cactusthorn, David Eppstein, Erechtheus, Giftlite, Humanengr, Melcombe, Michael Hardy,
Mordecki, Oleg Alexandrov, Qwfp, Rich Farmbrough, Rinconsoleao, WebScientist, Yesitsapril, Yvswan, 14 anonymous edits
Interest rate derivative Source: http://en.wikipedia.org/w/index.php?oldid=392319830 Contributors: 478jjjz, Amit1law, Anwar saadat, Arthur Rubin, Bluemoose, Charles Matthews, CliffC,
Danielfranciscook, DocendoDiscimus, Edward, Ex-Nintendo Employee, Feeeshboy, Fenice, Finnancier, Fintor, Joshfinnie, Lancastle, Lfchuang, LilHelpa, Lost-theory, Malin Tokyo, Meinertsen,
Michael Hardy, NYArtsnWords, Pcb21, Pearle, Piloter, Ratesquant, SWAdair, Smallbones, Stuarthill, Sumeetakewar, Woohookitty, Yamamoto Ichiro, 46 anonymous edits
Short rate model Source: http://en.wikipedia.org/w/index.php?oldid=356011722 Contributors: AdamSmithee, Archimed, Asperal, Bankert, CanisRufus, Charles Matthews, Christofurio, Fintor,
Gaius Cornelius, It's-is-not-a-genitive, Lamro, LilHelpa, Michael Hardy, Morven, Napolun, Olejasz, Pcb21, Piloter, Rich Farmbrough, Sam Hocevar, Samcol1492, That Guy, From That Show!,
Wuser10, 28 anonymous edits
Hull–White model Source: http://en.wikipedia.org/w/index.php?oldid=380344332 Contributors: AdamSmithee, Alanb, Avraham, Charles Matthews, DocendoDiscimus, Dysprosia, Feco,
Finnancier, Fintor, Forwardmeasure, Gadget850, Good Olfactory, Hadlock, Linas, Michael Hardy, Ohnoitsjamie, Pcb21, Piloter, Pontus, Radicalsubversive, RandomProcess, Rdikeman,
Samcol1492, Tassedethe, Wragge, 34 anonymous edits
Cox–Ingersoll–Ross model Source: http://en.wikipedia.org/w/index.php?oldid=382511890 Contributors: A. Pichler, AdamSmithee, Amakuha, Bankert, Finnancier, Fintor, Forwardmeasure,
Gred925, Hwansokcho, JonathanIwiki, Lamro, Melcombe, Michael Hardy, Piloter, Sgmu, 14 anonymous edits
Chen model Source: http://en.wikipedia.org/w/index.php?oldid=391537923 Contributors: 82101ycb, AdamSmithee, Ahhf, Bankert, Dostanden, Dthomsen8, Finnancier, Gena1982, Giftlite,
Gogo Dodo, Ioannes Pragensis, Ivanptg, Jxadl, Napolun, Ohconfucius, PhiLiP, Prezbo, Qwfp, Reuben.Harris, Rich Farmbrough, Ricknmg, Undpgh, Zzssui, , 12 anonymous edits
LIBOR Market Model Source: http://en.wikipedia.org/w/index.php?oldid=257223711 Contributors: Centrx, Cfries, Db597, Dthomsen8, Felix Forgeron, Kaslanidi, Lpele, Michael Hardy,
MrOllie, PigFlu Oink, Piloter, Ronnotel, Stat789, 28 anonymous edits
Heath–Jarrow–Morton framework Source: http://en.wikipedia.org/w/index.php?oldid=386865757 Contributors: Afelton, Bonjeroo, Christofurio, DARTH SIDIOUS 2, DocendoDiscimus,
Ernie shoemaker, Finnancier, Fintor, FosterBoondoggle, Gabbe, Giftlite, Karsten11, MagnusPI, MatthieuL, Michael Hardy, Ms2ger, Pontus, Quantyz, RandomProcess, Tide rolls, Whtsmith, 27
anonymous edits
Image Sources, Licenses and Contributors
File:GodfreyKneller-IsaacNewton-1689.jpg Source: http://en.wikipedia.org/w/index.php?title=File:GodfreyKneller-IsaacNewton-1689.jpg License: Public Domain Contributors: Algorithme,
Beyond My Ken, Bjankuloski06en, Grenavitar, Infrogmation, Kelson, Kilom691, Porao, Saperaud, Semnoz, Siebrand, Sparkit, Thomas Gun, Wknight94, Wst, Zaphod, 4 anonymous edits
File:Gottfried Wilhelm von Leibniz.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Gottfried_Wilhelm_von_Leibniz.jpg License: Public Domain Contributors: Beyond My Ken,
Davidlud, Eusebius, Factumquintus, Gabor, Luestling, Mattes, Schaengel89, Svencb, Tomisti, 4 anonymous edits
File:Tangent derivative calculusdia.svg Source: http://en.wikipedia.org/w/index.php?title=File:Tangent_derivative_calculusdia.svg License: GNU Free Documentation License Contributors:
Minestrone Soup
File:Sec2tan.gif Source: http://en.wikipedia.org/w/index.php?title=File:Sec2tan.gif License: GNU Free Documentation License Contributors: User:OSJ1961
File:Integral as region under curve.svg Source: http://en.wikipedia.org/w/index.php?title=File:Integral_as_region_under_curve.svg License: Creative Commons Attribution-Sharealike 2.5
Contributors: 4C
File:NautilusCutawayLogarithmicSpiral.jpg Source: http://en.wikipedia.org/w/index.php?title=File:NautilusCutawayLogarithmicSpiral.jpg License: Attribution Contributors: User:Chris 73
Image:Copule ord.png Source: http://en.wikipedia.org/w/index.php?title=File:Copule_ord.png License: Creative Commons Attribution-Sharealike 2.5 Contributors: Matteo Zandi
Image:Gaussian copula.png Source: http://en.wikipedia.org/w/index.php?title=File:Gaussian_copula.png License: Attribution Contributors: User:Zasf
File:Elmer-pump-heatequation.png Source: http://en.wikipedia.org/w/index.php?title=File:Elmer-pump-heatequation.png License: GNU Free Documentation License Contributors: User A1,
1 anonymous edits
Image:Function ocsillating at 3 hertz.svg Source: http://en.wikipedia.org/w/index.php?title=File:Function_ocsillating_at_3_hertz.svg License: Creative Commons Attribution-Sharealike 3.0
Contributors: User:Thenub314
Image:Onfreq.svg Source: http://en.wikipedia.org/w/index.php?title=File:Onfreq.svg License: Creative Commons Attribution-Sharealike 3.0 Contributors: Original: Nicholas Longo, SVG
conversion: DX-MON (Richard Mant)
Image:Offfreq.svg Source: http://en.wikipedia.org/w/index.php?title=File:Offfreq.svg License: Creative Commons Attribution-Sharealike 3.0 Contributors: User:Thenub314
Image:Fourier transform of oscillating function.svg Source: http://en.wikipedia.org/w/index.php?title=File:Fourier_transform_of_oscillating_function.svg License: Creative Commons
Attribution-Sharealike 3.0 Contributors: User:Thenub314
File:Rectangular function.svg Source: http://en.wikipedia.org/w/index.php?title=File:Rectangular_function.svg License: unknown Contributors: Axxgreazz, Bender235, Darapti, Omegatron
File:Sinc function (normalized).svg Source: http://en.wikipedia.org/w/index.php?title=File:Sinc_function_(normalized).svg License: unknown Contributors: Bender235, Juiced lemon,
Omegatron, Pieter Kuiper
Image:Girsanov.png Source: http://en.wikipedia.org/w/index.php?title=File:Girsanov.png License: GNU Free Documentation License Contributors: Martin Keller-Ressel (uploaded by
Thomas Steiner)
Image:Monte carlo method.svg Source: http://en.wikipedia.org/w/index.php?title=File:Monte_carlo_method.svg License: Public Domain Contributors: Pbroks13 (talk). Original uploader was Pbroks13 at en.wikipedia
Image:Ybc7289-bw.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Ybc7289-bw.jpg License: Creative Commons Attribution 2.5 Contributors: Billcasselman, Jtir, 1 anonymous
edits
Image:Schumacher (Ferrari) in practice at USGP 2005.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Schumacher_(Ferrari)_in_practice_at_USGP_2005.jpg License: Creative
Commons Attribution-Sharealike 2.0 Contributors: User:Rdsmith4
Image:Linear-regression.svg Source: http://en.wikipedia.org/w/index.php?title=File:Linear-regression.svg License: Public Domain Contributors: User:Qef
Image:LemonadeJuly2006.JPG Source: http://en.wikipedia.org/w/index.php?title=File:LemonadeJuly2006.JPG License: GNU Free Documentation License Contributors: Evrik, Gveret
Tered, Opponent, Polylerus
Image:Wind-particle.png Source: http://en.wikipedia.org/w/index.php?title=File:Wind-particle.png License: Public Domain Contributors: BenFrantzDale, Loisel, 2 anonymous edits
File:Heat_eqn.gif Source: http://en.wikipedia.org/w/index.php?title=File:Heat_eqn.gif License: Public Domain Contributors: User:Oleg Alexandrov
File:Standard deviation diagram.svg Source: http://en.wikipedia.org/w/index.php?title=File:Standard_deviation_diagram.svg License: Public Domain Contributors: Chesnok, Juiced lemon,
Krinkle, Manuelt15, Mwtoews, Petter Strandmark, Revolus, Tom.Reding, Wknight94, 17 anonymous edits
Image:Binomial distribution pmf.svg Source: http://en.wikipedia.org/w/index.php?title=File:Binomial_distribution_pmf.svg License: Public Domain Contributors: User:Tayste
Image:Binomial distribution cdf.svg Source: http://en.wikipedia.org/w/index.php?title=File:Binomial_distribution_cdf.svg License: Public Domain Contributors: User:Tayste
Image:Binomial Distribution.svg Source: http://en.wikipedia.org/w/index.php?title=File:Binomial_Distribution.svg License: Creative Commons Attribution-Sharealike 3.0 Contributors:
User:Cflm001
Image:Lognormal distribution PDF.svg Source: http://en.wikipedia.org/w/index.php?title=File:Lognormal_distribution_PDF.svg License: GNU Free Documentation License Contributors:
User:Autopilot, User:Par
Image:Lognormal distribution CDF.svg Source: http://en.wikipedia.org/w/index.php?title=File:Lognormal_distribution_CDF.svg License: GNU Free Documentation License Contributors:
User:Autopilot, User:PAR
Image:Heat eqn.gif Source: http://en.wikipedia.org/w/index.php?title=File:Heat_eqn.gif License: Public Domain Contributors: User:Oleg Alexandrov
Image:Heatequation exampleB frames.svg Source: http://en.wikipedia.org/w/index.php?title=File:Heatequation_exampleB_frames.svg License: Creative Commons Attribution-Sharealike 2.5
Contributors: User:Wtt
Image:Temp Rod homobc.svg Source: http://en.wikipedia.org/w/index.php?title=File:Temp_Rod_homobc.svg License: Creative Commons Attribution-Sharealike 2.5 Contributors: Wtt
Image:wiener_process_zoom.png Source: http://en.wikipedia.org/w/index.php?title=File:Wiener_process_zoom.png License: GNU Free Documentation License Contributors: User:PeR
Image:Wiener_process_3d.png Source: http://en.wikipedia.org/w/index.php?title=File:Wiener_process_3d.png License: GNU Free Documentation License Contributors: Original uploader
was Sullivan.t.j at English Wikipedia.
Image:Wiener process animated.gif Source: http://en.wikipedia.org/w/index.php?title=File:Wiener_process_animated.gif License: Creative Commons Attribution-Sharealike 3.0 Contributors:
User:Cyp
Image:BMonSphere.jpg Source: http://en.wikipedia.org/w/index.php?title=File:BMonSphere.jpg License: Creative Commons Attribution-Sharealike 2.5 Contributors: Thomas Steiner
Image:Crank-Nicolson-stencil.svg Source: http://en.wikipedia.org/w/index.php?title=File:Crank-Nicolson-stencil.svg License: Public Domain Contributors: Original uploader was Berland at
en.wikipedia
Image:VaR diagram.JPG Source: http://en.wikipedia.org/w/index.php?title=File:VaR_diagram.JPG License: Public Domain Contributors: User:AaCBrown
File:Futures Trading Composition.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Futures_Trading_Composition.jpg License: Creative Commons Attribution 3.0 Contributors:
User:TonyWikrent
Image:Option value.gif Source: http://en.wikipedia.org/w/index.php?title=File:Option_value.gif License: GNU Free Documentation License Contributors: Feco, Grenavitar
Image:Stockpricesimulation.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Stockpricesimulation.jpg License: Public Domain Contributors: Roberto Croce
Image:optionpricesurface.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Optionpricesurface.jpg License: Public Domain Contributors: Roberto Croce
File:Crowd outside nyse.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Crowd_outside_nyse.jpg License: Public Domain Contributors: AnRo0002, Echtner, Fnfd, Gribeco,
Gryffindor, Infrogmation, J 1982, Romary, Skeezix1000, Yerpo, 3 anonymous edits
Image:volatility smile.svg Source: http://en.wikipedia.org/w/index.php?title=File:Volatility_smile.svg License: Public Domain Contributors: Brianegge
Image:Ivsrf.gif Source: http://en.wikipedia.org/w/index.php?title=File:Ivsrf.gif License: GNU Free Documentation License Contributors: GregorB, Ronnotel, Stan Shebs
License
Creative Commons Attribution-Share Alike 3.0 Unported
http://creativecommons.org/licenses/by-sa/3.0/