Jun 10, 2019

A reader recently asked an important question, one which often puzzles those new to quantitative finance (especially those coming from technical analysis, which relies upon price pattern analysis):

Why use the logarithm of returns, rather than price or raw returns?

The answer is severalfold, and the importance of each reason varies by problem domain.

Returns, as opposed to prices, provide normalization: they measure all variables in a comparable metric, thus enabling evaluation of analytic relationships amongst two or more variables despite their originating from price series of unequal values. This is a requirement for many multidimensional statistical analysis and machine learning techniques. For example, interpreting an equity covariance matrix is made sane when all of the variables are measured in percentage terms.
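To make the normalization point concrete, here is a minimal sketch in Python. The two price lists are hypothetical values chosen for illustration, and `simple_returns` is a helper name introduced here, not something from the original post. It assumes the usual one-period return, (p_i - p_{i-1}) / p_{i-1}.

```python
# Hypothetical prices for two assets trading at very different price levels.
prices_a = [10.0, 10.1, 10.0, 10.2]
prices_b = [1000.0, 1010.0, 1000.0, 1020.0]


def simple_returns(prices):
    """Return the one-period raw returns (p_i - p_{i-1}) / p_{i-1}."""
    return [(cur - prev) / prev for prev, cur in zip(prices, prices[1:])]


returns_a = simple_returns(prices_a)
returns_b = simple_returns(prices_b)

# The price levels differ by a factor of 100, yet the returns line up.
for ra, rb in zip(returns_a, returns_b):
    print(f"asset A: {ra:+.4%}   asset B: {rb:+.4%}")
```

Although the raw prices are not directly comparable, the two return series are (up to floating-point rounding) the same, which is what makes statistics such as a covariance matrix interpretable across assets.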

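First, log-normality: if we assume that prices are distributed log normally (which, in practice, may or may not be true for any given price series), then $\log(1 + r_i)$ is conveniently normally distributed, where $r_i = (p_i - p_{i-1}) / p_{i-1}$ is the return at time $i$ and $p_i$ is the price, because:

$$1 + r_i = \frac{p_i}{p_{i-1}} = \exp\left(\log\frac{p_i}{p_{i-1}}\right)$$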

Second, approximate raw-log equality: when returns are very small (common for trades with short holding durations), the following approximation ensures that log returns are close in value to raw returns:

$$\log(1 + r) \approx r, \quad |r| \ll 1$$
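As a rough numerical check of the approximation log(1 + r) ≈ r for small r, the sketch below compares raw and log returns over a range of magnitudes. The sample returns are arbitrary illustrative values, and `log_return` is a helper name introduced here.

```python
import math


def log_return(r):
    """Convert a raw return r into the log return log(1 + r)."""
    # math.log1p is accurate for small r, where log(1 + r) loses precision.
    return math.log1p(r)


for r in (0.0001, 0.001, 0.01, 0.1, 0.5):
    lr = log_return(r)
    print(f"raw={r:<7} log={lr:.6f} abs_error={abs(r - lr):.2e}")
```

The error grows roughly like r²/2, so the two measures are nearly interchangeable for small intraday-sized returns but diverge noticeably for large moves.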

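Third, time-additivity: consider an ordered sequence of n trades. A statistic frequently calculated from this sequence is the compounding return, which is the running return of this sequence of trades over time:

$$(1 + r_1)(1 + r_2) \cdots (1 + r_n) = \prod_{i=1}^{n} (1 + r_i)$$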

This formula is fairly unpleasant, as probability theory reminds us that the product of normally distributed variables is not normal. Instead, the sum of normally distributed variables is normal (important technicality: this is guaranteed only when the variables are jointly normal, for example when they are independent), which is useful when we recall the following logarithmic identity:

$$\log(1 + r_i) = \log\frac{p_i}{p_{i-1}} = \log p_i - \log p_{i-1}$$

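Thus, the log of the compounding return is normally distributed. Finally, this identity leads us to a pleasant algorithmic benefit, namely a simple formula for calculating compound returns:

$$\sum_{i=1}^{n} \log(1 + r_i) = \log(1 + r_1) + \log(1 + r_2) + \cdots + \log(1 + r_n) = \log p_n - \log p_0$$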

Thus, the compound return over n periods is merely the difference in log price between the initial and final periods. In terms of algorithmic complexity, this simplification reduces O(n) multiplications to O(1) additions. This is a huge win for moderate to large n. Further, this sum is useful for cases in which returns diverge from normal, as the central limit theorem reminds us that the sample average of this sum will converge to normality (presuming finite first and second moments).
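As a rough check of the claim that the compound return over n periods reduces to a difference of logs, the following sketch computes it three ways. The price series is hypothetical, and the variable names are introduced here for illustration only.

```python
import math

# Hypothetical price series p_0 .. p_n.
prices = [100.0, 102.0, 99.0, 101.0, 105.0]

# One-period raw returns r_i = (p_i - p_{i-1}) / p_{i-1}.
raw_returns = [(cur - prev) / prev for prev, cur in zip(prices, prices[1:])]

# 1. Compound return as a product of (1 + r_i): n multiplications.
compound_product = math.prod(1.0 + r for r in raw_returns)

# 2. The same quantity in log space: a sum of log returns.
log_sum = sum(math.log1p(r) for r in raw_returns)

# 3. The shortcut: difference of the final and initial log prices.
log_shortcut = math.log(prices[-1]) - math.log(prices[0])

print(compound_product, math.exp(log_sum), math.exp(log_shortcut))
```

All three agree up to floating-point rounding. The shortcut needs only the first and last prices, which is where the constant-time benefit comes from.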

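Fourth, mathematical ease: from calculus, we are reminded (ignoring the constant of integration):

$$e^x = \int e^x \, dx = \frac{d}{dx} e^x$$

This identity is useful, as much of financial mathematics is built upon continuous-time stochastic processes which rely heavily upon integration and differentiation.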

Fifth, numerical stability: addition of small numbers is numerically safe, while multiplying small numbers is not, as it is subject to arithmetic underflow. For many interesting problems, this is a serious potential problem. To solve this, either the algorithm must be modified to be numerically robust or it can be transformed into a numerically safe summation via logs.
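The underflow problem is easy to reproduce. The sketch below is an illustration under assumed inputs: the list of 1000 factors of 1e-5 is hypothetical (think of a long product of small likelihood-style terms) and is not taken from the original post.

```python
import math

# Hypothetical: 1000 small positive factors, each equal to 1e-5.
factors = [1e-5] * 1000

# Naive product: the true value, 1e-5000, is far below the smallest
# positive double, so the running product underflows to exactly 0.0.
naive_product = 1.0
for f in factors:
    naive_product *= f

# Log-space sum: the same information, kept as a well-scaled number.
log_sum = sum(math.log(f) for f in factors)

print(naive_product)  # 0.0
print(log_sum)        # roughly -11512.9
```

Once the product has collapsed to zero, the information is gone, whereas the log-space sum can still be compared, differenced, or shifted before exponentiating.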

As suggested by John Hall, there are downsides to using log returns. Here are two recent papers to consider (along with their references):

Comparing Security Returns is Harder than You Think: Problems with Logarithmic Returns, by Hudson (2010)

Quant Nugget 2: Linear vs. Compounded Returns – Common Pitfalls in Portfolio Management, by Meucci (2010)

